Portmanteau (Spliterator)

For the most part, classes in the standard Java libraries consist of fairly normal words (e.g. a ScheduledExecutorService is an ExecutorService – a service that manages the execution of tasks – which allows for scheduling). There are exceptions (and I don’t mean the Java ones), of course, such as the OMGVMCID.

Java 8, specifically JSR-335, brought about a bunch of new classes, including some of my Guava favourites (the function package, with Supplier and friends, and Optional) and Streams. Most of these, again, have fairly normal-sounding names; there are a few that might sound a bit hacked together like BiConsumer, but I think the meaning is generally intuitive (a consumer of two arguments). There is one that was initially fairly puzzling, though – “Spliterator” (rendered as one word). The obvious reading of that is as a portmanteau of “split” and “iterator”, but I wasn’t sure at first how useful it would be. Of course I have written code to partition collections or other iterables, but this was usually done on an ad-hoc basis.

The API seems designed around easily scheduling parallel computation. The key method here is trySplit(). To quote the Java API docs:

If this spliterator can be partitioned, returns a Spliterator covering elements, that will, upon return from this method, not be covered by this Spliterator.

I’ll go over this in a bit more detail.

  • If the spliterator can currently be split, then trySplit() returns a new spliterator, and this “takes over” some of the elements in our existing spliterator. Our existing spliterator, if traversed directly after this method returns (e.g. with tryAdvance()), will not have these elements.
  • Otherwise, trySplit() returns null. Our existing spliterator retains its elements.

For example, this is one way of doing a parallel file deletion, which starts from one spliterator and splits it as necessary.

import java.io.File;
import java.util.Spliterator;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

// Assume there exists an ExecutorService, EXECUTOR.

class DeleteTask implements Runnable {
  private final Spliterator<File> filesToDelete;

  DeleteTask(Spliterator<File> filesToDelete) {
    this.filesToDelete = filesToDelete;
  }

  @Override
  public void run() {
    // Split off as much work as possible to other threads...
    Spliterator<File> filesToDelegate = filesToDelete.trySplit();
    Future<?> delegateFuture = null;
    while (filesToDelegate != null) {
      delegateFuture = EXECUTOR.submit(new DeleteTask(filesToDelegate));
      filesToDelegate = filesToDelete.trySplit();
    }
    // ...then handle whatever remains in this one.
    filesToDelete.forEachRemaining(File::delete);
    if (delegateFuture != null) {
      try {
        delegateFuture.get(); // we can use a fancier way of managing completions if needed
      } catch (InterruptedException | ExecutionException e) {
        throw new RuntimeException(e);
      }
    }
  }
}

Viewing this in relation to parallel programming, there are obvious similarities to fork-join based workflows, especially in cases where the work involves a map or filter, or doing some reduction that’s commutative and associative. A traditional way of writing the above wouldn’t be too different; you might write some logic that splits the files into batches (which now lives in the spliterator), and then similarly dispatch deletion jobs to an ExecutorService.

From the above, it seems that one notable difference is where responsibility for deciding how the stream should be split lives. The Spliterator has some responsibility for deciding whether it should split itself. To some extent this makes sense; in cases where we want batches to have rather specific sizes (e.g. if performance is highly sensitive to elements being processed in specific batch sizes), we can guard trySplit() with suitable calls to the estimateSize() or getExactSizeIfKnown() methods. This can also be useful for allowing us to avoid fiddly edge cases (where our size bounds pass, but we must work with less than a full batch of elements, for instance).

Spliterators are also useful because they include characteristics that dictate, among other things, how the elements are to be traversed. For example (though this doesn’t quite apply to the file deletion example), we may be able to use a certain optimisation if we can assume that we will see the data in sorted order; spliterators do carry this information (whereas if we used a standard Iterator we might not be able to do this easily without some additional bookkeeping). The same might apply for cases where we can do better if we know elements are unique. The collections API also generates spliterators with the correct characteristics (e.g. a NavigableSet would give you a spliterator that already reports the SORTED and DISTINCT characteristics).
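As a quick illustration (a minimal sketch – the class name and values are mine), these characteristics can be queried directly:

import java.util.Arrays;
import java.util.Spliterator;
import java.util.TreeSet;

class CharacteristicsDemo {
  public static void main(String[] args) {
    TreeSet<Integer> set = new TreeSet<>(Arrays.asList(3, 1, 2));
    Spliterator<Integer> split = set.spliterator();
    // A TreeSet is a NavigableSet, so its spliterator reports both of these.
    System.out.println(split.hasCharacteristics(Spliterator.SORTED));   // true
    System.out.println(split.hasCharacteristics(Spliterator.DISTINCT)); // true
  }
}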

I’m still not sure why this name was decided upon, as opposed to, say, SplittableIterator. Perhaps this was done in the interest of terseness and because the decomposition is fairly obvious, though the name still really doesn’t look like a proper word to me. It does seem that spliterators have use cases beyond standard iterators as well, even when parallel computation is out of the picture – the characteristics, and the ability to dynamically switch on them, seem potentially useful. That said, I still have not used a Spliterator directly in my professional work, and have no intention of doing so unless it fits the task I’m doing!

Volatile Variables (2017 Q2 Review)

It feels like Q2 went by very quickly, perhaps because I’m settling in somewhat into my team at Palantir, and many things have been relatively stable this quarter.

Looking at GitHub statistics, I merged in 29 pull requests over the past three months (one up from Q1). I think one thing that certainly has improved is the size of my pull requests, however; I’ve made a conscious effort to split up features into several reasonably-sized chunks; this includes chains of four or five PRs at times (and some of these can run into the mid to high hundreds of lines each – suggesting that we would easily be looking at a 3000-line monster again if we didn’t do this split). I didn’t check the data, but it’s quite likely I committed less code in Q2 than in Q1; some of this stems from being more involved in reviewing others’ pull requests, as well as dicey debugging (where days can be spent to figure out a very small tweak to make things work). That’s certainly not a problem though – one of the PRs I’m happiest with involved a net reduction in the size of the codebase.

Other professional goals have been going acceptably well. I’ve continued to keep on keeping on as far as interviewing is concerned. Generally, technical development has chugged on, though at a bit of a slower pace; my paper reading groups have slowed down a fair bit, for various reasons (friends were travelling or otherwise unavailable), but hopefully that should pick back up in Q3. The academic side of things is a bit of a brighter spot for me as well; a paper on symbolic model checking LDLK was accepted at IJCAI 2017, and I also cleaned up version 1.3.0 of MCMAS, which includes support for CTL*K.

I think in Q3 I would aim to merge at least thirty pull requests (I know, it’s a blunt metric). The paper reading groups should continue. I also want to continue on my “Imperial Computing Comprehensives” project; that is, an extra-difficult exam paper that crosses disciplines between all compulsory first- and second-year modules at Imperial. (I think my average module mark at Imperial was probably just over 90, but I’m trying to set the paper so that I would score a low first, around a 75.)

Financially, my portfolio increased by 0.3 percent from Q1 to Q2 (would actually have decreased if not for reinvested dividends). I think I’ve actually been decently successful in not worrying excessively about the vicissitudes of the market; in early April as well as at the very end of June my portfolio took beatings of more than a thousand pounds, but I wasn’t too concerned about that. I’m somehow skeptical that the bull market will keep going, but I’m not complaining.

Focusing more on things within my sphere of influence here, notable changes in the Q2 budget involved a drop-off in transport expenditure (as I started really taking walking to work more seriously) and… not much else. There was one big luxury purchase in Q1 (laptop), as well as in Q2 (IJCAI travels). Generally, things were fairly tight though not as tight as in Q1, especially since I didn’t do any specific budget challenges (these usually take place in February). I’m planning on loosening this a little in Q3, for various reasons, including anxiety about valuations in the market, possible travel and a desire to have some better food as well. I can cost the ingredients of my lunch today at 39p!

I’ve started taking up competitive puzzle-solving in my spare time as well. These involve solving logical puzzles against the clock (both well-known ones like Sudoku, Kakuro or Slitherlink, as well as some that are written specifically for a competition – like the Odd Sums Sudoku I discussed a while back). I’m still pretty bad at these, but we’ll see how things go. Generally I find that my logical skills are decent, but my spatial reasoning is not great; I struggle a lot with loop puzzles, or toroidal sudoku (basically like normal sudoku, except regions are irregularly shaped and can wrap around). I’ve continued to practice my singing as well – those A4s are still stable; B4s and C5s are still a bit flaky.

I travelled a couple of times in Q2 – Aarhus, Palo Alto, and then Zagreb (wow, literally from A to Z – I can think of Zurich which is lexicographically later than Zagreb, though can’t think of anywhere before Aarhus). I enjoyed each trip for various reasons, though both Aarhus and Zagreb were whirlwind trips in a sense (flew in in the morning, and flew out the next morning). I’d be happy to revisit all of these places, and I think I will with quite high probability actually. All of these trips were for work though; I didn’t travel anywhere for personal recreation.

In terms of music I’ve been listening to, this quarter seemed to take on a relatively darker mood (though my Q1 review featured James Arthur’s Back from the Edge which isn’t exactly very bright either). There are parts of Lukas Graham’s 7 Years that I appreciate, including fair sections of the lyrical content. However I’ve always found the tone of the lead vocalist a little strange, and the production is awkward (there’s a part where the band is cheered, which doesn’t really make sense to me; I see the song more as one of reflection and acknowledgement of uncertainty, rather than of triumph). However, there was an acoustic piano cover of this that I enjoyed; Schuller’s voice is clean and pleasant, and I find the delivery more appropriate.

There is of course this part of the song that I identify with to some extent, though… it’s a little complicated.

I only see my goals, I don’t believe in failure

I like the ambition and drive, but it can be dangerous if said goals are poorly set. Resilience to failure is of course very important and something I’m still trying to get better at, but it’s important to respect failed attempts for what they are – and learn from them, where appropriate. I think the main lesson here is that it’s important to set good goals, especially if one intends to double down on them. (The second part is less contentious, as long as not believing is interpreted in the right way.)

Let’s push on in Q3.

The University Gatekeepers (Hotseat: MAT 2013)

Background

I didn’t have to take the Mathematics Admissions Test (MAT) when I applied to Imperial (though the Cambridge STEP was a part of the conditions of my offer, probably because I didn’t take FM). The MAT was added to Imperial’s admissions process in 2013, though seemingly only for students aiming to read Mathematics. Oxford has been running this test for quite a long time, and uses it both for Mathematics and for Computer Science. Unlike STEP, candidates don’t get a choice of questions, but instead are prescribed questions based on the programme they’re applying to.

The paper has seven questions and candidates need to answer five. All candidates have to answer question 1, which is actually ten multiple-choice questions worth 4 marks each (no penalties for guessing). There are then six 15-mark questions that follow. Candidates in Computer Science generally have two questions that the broader Mathematics population also takes – in 2013 this featured an algebra question on functional equations, and a discrete mathematics/combinatorics question. There are two further questions, one of which tends to incorporate a significant element of mathematical logic (2013’s problem revolved around the notion of common knowledge, much like the infamous Cheryl’s birthday question), and another that deals with formal languages and reasoning about similar structures (the 2013 problem focused on an inductively generated language).

I did the Computer Science stream of the paper, and managed to put together a score of 95 (40 on the first problem; 14 on Q2, 15 on Q5, 15 on Q6 and 11 on Q7) owing to a lucky (or, depending on how you see it, educated) guess on one of the multiple-choice problems. The “Mathematics-only” questions Q3 and Q4 didn’t seem any harder, though; I cleared both of them without much difficulty (whereas Q2 was somewhat nastier, in general). I lost the points on Q7 owing to a careless error; specifically, that if an expression Bw is reversed, the resultant expression is certainly not wB (well, unless w is palindromic!)

The specific paper I did is here.

Selected Problems in Depth

Problem 1: The Gauntlet

I don’t find multiple choice questions to play very well with mathematics, because it’s very often possible to reverse engineer a solution or otherwise arrive at the correct answer via an unexpected solution path. For example, consider problem G:

p_n(x) = (x - 1) + (x - 2) + \ldots + (x - n)

What is the remainder when p_n(x) is divided by p_{n-1}(x)?

Presumably, the intended solution path was to express p_n(x) in a more convenient form (noting the arithmetic progression in the constant terms) and then actually do the division. However, given the options, it sufficed to simply test it for p_3(x) and p_2(x), noting that the result of the division was negative, and that only one of the options was negative.
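For completeness, a sketch of that intended path (my own working, not the official solution): writing the sum in closed form,

p_n(x) = nx - \dfrac{n(n+1)}{2}, \qquad p_{n-1}(x) = (n-1)x - \dfrac{n(n-1)}{2}

Both are linear, so the quotient is the constant \frac{n}{n-1}, and the remainder is

p_n(x) - \dfrac{n}{n-1} p_{n-1}(x) = -\dfrac{n(n+1)}{2} + \dfrac{n^2}{2} = -\dfrac{n}{2}

which is indeed negative.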

Nonetheless, there were certainly ample opportunities for trick questions (this paper had E, where there was a trap: an algebraic expression could easily be assumed to have degree 7, though it turned out that it actually didn’t).

I got through 9 of the problems fairly smoothly, though I struggled with H (failing to realise that one of the equations was actually that of a semicircle, and ploughing into aggressive trigonometry-based integration). In the end I ran out of time towards the end of the integration, but I guessed the only option that featured the subtraction of the area under one of the lines, which turned out to be correct.

Problem 2: Functional Equations

This was generally fairly straightforward till the end, where the task was to find some function f that satisfied f(t) - f(1-t) = (2t - 1)^3. In hindsight it seems fairly obvious that a good way of solving this would be to treat f as a cubic polynomial, though I struggled for a while substituting random values of t into the functional equation to see if anything dropped out. I then went through some messy reasoning along the lines of “the difference between two points increases cubically as the (x-)distance between said points increases linearly, so the derivative is probably quadratic, thus the function is probably cubic”. I then established f(t) = At^3 + Bt^2 + Ct + D and was on my way after a modest algebraic bash.
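(For what it’s worth, there’s a tidier route I only saw afterwards: substituting t \mapsto 1 - t flips the sign of (2t-1)^3, so half the right-hand side works directly:

f(t) = \dfrac{(2t-1)^3}{2} \implies f(t) - f(1-t) = \dfrac{(2t-1)^3}{2} - \dfrac{(1-2t)^3}{2} = (2t-1)^3

though of course any cubic satisfying the equation would do.)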

Problem 6: Reasoning about Knowledge

This question introduces Alice, Bob and Charlie; Charlie writes a (different) number on Alice and Bob’s foreheads. Thus Alice and Bob can see the other’s number, and Charlie can see both. The question begins with a little warm-up to prompt candidates into thinking about epistemic reasoning (if Alice knows her number right off the bat, it must be 2 – because she can see Bob’s 1; for any other number she wouldn’t be able to know). Things heat up a little later on:

  • Charlie: Your numbers are between 1 and 10 inclusive.
  • Alice: I don’t know my number. Is my number a square number?
  • Charlie: If I told you, you would know your number.
  • Bob: I don’t know my number.

What is Alice’s number?

We start by noting that the squares are 1, 4 and 9. Since Alice can see Bob’s number, she already knows that her number is one of two possibilities, spaced two apart. For Charlie’s statement to make sense, exactly one of the possibilities has to be a square. This tells us that Bob’s number is 2, 3, 5 or 8. (It couldn’t be 10, because if it was Alice would know that her number is 9.)

Next, Bob can see Alice’s number, but doesn’t know his. Since his number is one away from Alice’s, her number A must be one for which both A - 1 and A + 1 are still candidates for Bob (that is, both lie in {2, 3, 5, 8}). The only such pair spaced two apart is 3 and 5, which gives us the answer: her number has to be 4.
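This kind of statement-by-statement filtering is quite mechanical, so it can be checked by brute force; here is a small sketch (my own verification code – the class and names are hypothetical) that prunes the possible (Alice, Bob) pairs after each statement:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ForeheadNumbers {
  public static void main(String[] args) {
    Set<Integer> squares = new HashSet<>(Arrays.asList(1, 4, 9));
    // All (Alice, Bob) pairs: numbers between 1 and 10, differing by one.
    List<int[]> pairs = new ArrayList<>();
    for (int a = 1; a <= 10; a++) {
      for (int b = 1; b <= 10; b++) {
        if (Math.abs(a - b) == 1) {
          pairs.add(new int[] {a, b});
        }
      }
    }
    // Alice doesn't know her number: both b - 1 and b + 1 must be in range.
    pairs.removeIf(p -> p[1] - 1 < 1 || p[1] + 1 > 10);
    // Charlie: telling Alice whether her number is square would settle it,
    // so exactly one of b - 1 and b + 1 is a square.
    pairs.removeIf(p -> squares.contains(p[1] - 1) == squares.contains(p[1] + 1));
    // Bob doesn't know his number: both a - 1 and a + 1 must still be live
    // candidates for Bob, given everything said so far.
    List<int[]> beforeBob = new ArrayList<>(pairs);
    pairs.removeIf(p -> !survives(beforeBob, p[0], p[0] - 1)
        || !survives(beforeBob, p[0], p[0] + 1));
    // Prints (4, 3) and (4, 5): Alice must have 4, and Bob still doesn't know!
    pairs.forEach(p -> System.out.println("Alice: " + p[0] + ", Bob: " + p[1]));
  }

  private static boolean survives(List<int[]> pairs, int a, int b) {
    return pairs.stream().anyMatch(p -> p[0] == a && p[1] == b);
  }
}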

This one, in my opinion, stopped at a fairly early level; things could develop a fair bit more. However, assuming proportional time weighting (and bearing in mind that I did write my thesis on epistemic logic), finishing it within the 22.5 minutes candidates were given seems reasonably challenging.

Synthesis

I found STEP to be somewhat more demanding, perhaps because it had questions that focused more on depth rather than breadth, and also because I took the computer science stream of the paper while STEP just focuses on mathematics.

The average score of all candidates on questions 1 through 5 was 44.8, with 60.6 for successful applicants. Nonetheless, I subsequently worked through problems 3 and 4 and actually found them considerably easier than 2. Interestingly, there was a candidate who scored between 85 and 89 but was nonetheless rejected; also, no one scored 100, which seems surprising (this would certainly have been possible if not for the careless slips I made on problems 2 and 7 – or maybe the marking for some aspects, such as the rigour demanded in proofs, was stricter than my own marking).

This was quite enjoyable, actually. I’m not sure how I would have fared had I taken this exam in the past! I’d think that my knowledge of logic, as well as the more computer-science oriented topics (discrete mathematics and formal languages) would have improved, though knowledge of say calculus or analysis probably would be slightly weaker than when I finished high school.

Playing with Blocks (Hotseat: British Informatics Olympiad R1, 2015)

Second in a series. I apologise for the temporary hiatus; I had a couple of weeks of travelling and then intense sessions at work (part of this was my choice, but still).

Background

I was not in the UK for high school, and studying computing/informatics wasn’t common in my high school in Singapore either. I never actually participated in the British Informatics Olympiad. This is the UK’s national high-school competition in computing, and unlike normal programming contests there’s an explicit portion involving reasoning and discrete mathematics which I like. This can get intense; problems especially in the second round can feature fairly complex reasoning – such as 2009’s second round which required understanding models of computation in FRACTRAN (and eventually deciphering a “program” that computed the Fibonacci series). For quite a few contest problems, intuitively determining a solution, or even being uncertain but firing it off anyway in hopes of getting it accepted is much easier than trying to prove that the solution is correct.

The first round is somewhat less intense, but still has plenty of opportunities for interesting problems. There are typically three tasks; the first is gentle and meant to be a warm-up, the second tests implementation (perhaps more like what I might expect on a programming exam in university) and the third tests algorithms (in a sense, more like a standard contest problem). Each question has most of the points (about 75%) allocated to implementation and the remainder to several “reasoning” questions. Partial scores are awarded for solutions that aren’t optimal or fail on some cases (much like in the IOI, though the information is not known to the students).

This was one of the easier papers I’ve done; I worked my way through the tasks very quickly, and didn’t have too much of a problem (other than a careless error in problem 2). Most of the time I score above 90 points on these, though sometimes if I don’t “get” problem 3 in an optimal way or make subtle errors on problem 2 the score will be a fair bit lower.

The specific paper I did is here.

Problem 1: Block Palindromes

Given a string, find the number of ways of decomposing it into at least two blocks such that when the blocks are reversed (but not the letters within the blocks), the arrangement of blocks looks the same. For example, BARBAR is a block palindrome, since (BAR)(BAR) reads the same forwards and backwards. Of course, every ordinary palindrome is also a block palindrome, since taking one-letter blocks is allowed.

Not too much here; the problem is readily amenable to a simple recursive implementation (try taking each block length from 1 to half the string length and see if the string ends with that prefix – if so, recurse and add that result to our total). It would be possible to use memoisation to improve performance here, since there can be repeated subproblems e.g. with AABCBAA, you would be solving for the number of solutions to BCB twice. However, given the small constraints (strings of up to 10 characters) this wasn’t needed.
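A minimal sketch of that recursion (a reconstruction, not my contest code – and including the memoisation that wasn’t strictly needed):

import java.util.HashMap;
import java.util.Map;

class BlockPalindromes {
  private static final Map<String, Long> CACHE = new HashMap<>();

  // Number of ways to split s into at least two blocks whose arrangement
  // reads the same when the order of the blocks is reversed.
  static long count(String s) {
    Long cached = CACHE.get(s);
    if (cached != null) {
      return cached;
    }
    long total = 0;
    for (int k = 1; 2 * k <= s.length(); k++) {
      String prefix = s.substring(0, k);
      if (s.endsWith(prefix)) {
        String middle = s.substring(k, s.length() - k);
        // The middle can stay as a single block (or be empty), or be
        // further decomposed as a block palindrome in its own right.
        total += middle.isEmpty() ? 1 : 1 + count(middle);
      }
    }
    CACHE.put(s, total);
    return total;
  }

  public static void main(String[] args) {
    System.out.println(count("BARBAR")); // 1, namely (BAR)(BAR)
    System.out.println(count("AABAA"));  // 3
  }
}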

This problem probably had one of the trickier reasoning questions though – if all groupings of a block palindrome contain an even number of blocks, how many groupings can there be (and why)?

I attacked this one with a proof by contradiction – under the hypothesis, it is not possible to have any grouping with 2n blocks,

S = B_1 B_2 \ldots B_{n-1} B_n B_n B_{n-1} \ldots B_2 B_1

for n \geq 2. This can be shown by construction:

S' = (B_1) (B_2 \ldots B_{n-1} B_n B_n B_{n-1} \ldots B_2) (B_1)

is a valid decomposition, which has three blocks – an odd number, contradicting the premise that all groupings have an even number of blocks. (The middle block is nonempty precisely because n \geq 2, and of course the empty string isn’t a valid block.) Since odd groupings are ruled out directly by the premise, we thus require S = B_1 B_1, and the number of groupings has to be 1. We can probably formulate an even stronger condition (for instance, that apart from B_1 itself, no prefix of B_1 is also a suffix of B_1). While the reasoning seemed sound to me, I wasn’t too confident I had the right answer here.

(Note: The question text implied that a block palindrome must decompose into blocks in at least one way, so while a string with zero groupings vacuously satisfies “all groupings contain an even number of blocks”, it wouldn’t be a block palindrome, and 0 isn’t a valid answer.)

Problem 2: Battleships

Given a funky (deterministic) algorithm for generating random numbers, implement it, and use it to fill a grid with ships of certain sizes via some kind of rejection sampling. The key constraints are that the grid has a finite size, and more interestingly ships are not allowed to touch (even diagonally).

Implementing the random number generator is pretty much following instructions from the spec. However, there are several ways to implement the latter constraint. Given that optimality is generally not important in problem 2, I went for a rather nasty solution that, say, for a k-ship placed horizontally would iterate through a 3 \times (k + 2) bounding rectangle around the ship and mark all those squares as unusable. Better asymptotic complexity can probably be obtained using an R-tree (though that is probably not reasonable for contests – especially without a reference!).

There was a rather nasty “reasoning” problem about the number of ways to place ships of size 4, 3, 2 and 1 on a 5 by 5 grid – this was also where I lost the five points. I hacked my implementation to, instead of using the random number generator, simply use backtracking to count the number of possible placements. However, I forgot that my bounding rectangle system was based on booleans (in other words, a square that was unusable because of two different ships would suddenly become usable after just one was removed), and thus came up with the impressive total of 427,840 possibilities (correct answer: 1,424)!

I subsequently fixed the aforementioned bug (using integers instead) and confirmed that the answer was indeed 1,424. During the test I did try smoke-tests that verified that a 1-ship could be placed in 25 ways and a 5-ship in 10; I attempted to do two 1-ships, but decided that I would have had too much pain working through the combinatorics for that.
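To illustrate the fix (a hypothetical reconstruction – the names are mine): keep a count per square of how many ships currently block it, so that removing one ship during backtracking only frees squares that no other ship still touches:

class Grid {
  private final int[][] blockedBy = new int[5][5]; // ships blocking each square

  // Marks or unmarks the 3 x (k + 2) bounding rectangle around a horizontal
  // k-ship whose leftmost cell is at (row, col); delta is +1 on placement
  // and -1 on removal during backtracking.
  void markHorizontal(int row, int col, int k, int delta) {
    for (int r = row - 1; r <= row + 1; r++) {
      for (int c = col - 1; c <= col + k; c++) {
        if (r >= 0 && r < 5 && c >= 0 && c < 5) {
          blockedBy[r][c] += delta;
        }
      }
    }
  }

  boolean isFree(int row, int col) {
    return blockedBy[row][col] == 0;
  }
}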

Problem 3: Modern Art

This problem revolved around (reasonably) efficiently determining the ith permutation of a string in terms of lexicographic ordering, possibly involving duplicate characters. For example, in a string which consists of five As, Bs, Cs and Ds – what is the 11,223,344,556th permutation?

Of course, for smaller strings and smaller numbers it would suffice to implement some kind of next-permutation generator. This is a pretty standard algorithm. We can then step through the permutations until we get what we want – and there are some test cases that would be solvable with this; you would get (I think) 11 of the 25 points available.
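For reference, a sketch of that standard algorithm (mirroring C++'s std::next_permutation), operating on a character array:

// Advances a to its next lexicographic permutation in place;
// returns false if a was already the last permutation.
static boolean nextPermutation(char[] a) {
  int i = a.length - 2;
  while (i >= 0 && a[i] >= a[i + 1]) {
    i--; // find the rightmost ascent
  }
  if (i < 0) {
    return false; // entirely non-increasing: no next permutation
  }
  int j = a.length - 1;
  while (a[j] <= a[i]) {
    j--; // rightmost element exceeding a[i]
  }
  char tmp = a[i]; a[i] = a[j]; a[j] = tmp;
  for (int l = i + 1, r = a.length - 1; l < r; l++, r--) {
    tmp = a[l]; a[l] = a[r]; a[r] = tmp; // reverse the suffix
  }
  return true;
}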

The key idea is that we don’t want to explore all of the permutations. From combinatorics, we can quickly calculate the number of permutations of five As, Bs, Cs and Ds:

\displaystyle {20 \choose 5} \times {15 \choose 5} \times {10 \choose 5} \times {5 \choose 5} = 11,732,745,024

Intuitively, we are close to the “end”; we can thus be pretty sure that our 11,223,344,556th permutation should start with a D. We can then write a recursive (or iterative) algorithm that identifies each character in turn based on this observation.

I began by implementing a function that computes the number of ways of permuting a given number of As, Bs, Cs and Ds. This was done in a bit of a hacky way and has some (probably unnecessary) memoisation.

import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import com.google.common.collect.Lists; // Guava

private static final Map<List<Long>, Long> cache = new HashMap<>();

private static long numPerms(long as, long bs, long cs, long ds) {
  List<Long> longs = Lists.newArrayList(as, bs, cs, ds);
  Collections.sort(longs);
  if (cache.containsKey(longs)) {
    return cache.get(longs);
  }
  long totalChars = as + bs + cs + ds;
  long perms = 1L;

  // Computes (totalChars C l) then updates totalChars
  for (Long l : longs) {
    long num = 1;
    long denom = 1;
    for (int i = 0; i < l; i++) {
      num *= totalChars - i;
      denom *= i + 1;
    }
    perms *= num / denom;
    totalChars -= l;
  }

  cache.put(longs, perms);
  return perms;
}

Interestingly, it turns out it would actually have been okay to build the entire numerator and denominator and then divide the two; perhaps 20 was chosen to avoid complications with overflow (since 21! will exceed the max value of a 64 bit integer). Anyway, the highest value the numerator could have taken on here was 1,860,480 (20 to 16 multiplied together) so I was sure I was safe this way.

We then go through the characters in lexicographic order, and for each one, consider the number of permutations that would be feasible. The general approach is something like this:

private static String getPermutation(long as, long bs, long cs, long ds, long offset) {
  if (as == 0 && bs == 0 && cs == 0 && ds == 0) {
    return ""; // we've built the entire thing
  }

  long permsCovered = as == 0 ? 0 : numPerms(as - 1, bs, cs, ds);
  if (offset < permsCovered) {
    return "A" + getPermutation(as - 1, bs, cs, ds, offset);
  }
  long permsPassed = permsCovered;
  permsCovered += bs == 0 ? 0 : numPerms(as, bs - 1, cs, ds);
  if (offset < permsCovered) {
    return "B" + getPermutation(as, bs - 1, cs, ds, offset - permsPassed);
  }
  permsPassed = permsCovered;
  permsCovered += cs == 0 ? 0 : numPerms(as, bs, cs - 1, ds);
  if (offset < permsCovered) {
    return "C" + getPermutation(as, bs, cs - 1, ds, offset - permsPassed);
  }
  // and the same for D: by this point only D-initial permutations remain.
  permsPassed = permsCovered;
  return "D" + getPermutation(as, bs, cs, ds - 1, offset - permsPassed);
}

I actually finished the implementation fairly quickly, but decided to write a couple of tests; I stepped through the 12 permutations of “ABBC” as given in the spec, and also wrote a test for “AAABBBCCCDDD” that all the permutations were generated in the correct order (using numPerms to give me the range). I also of course tried a couple of edge-ish cases such as each of “A”, “B”, “C” and “D”, and a test to verify we could handle the expected scale (five of each character, retrieving the 1st as well as 11732745024th permutation).
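The first of those tests is cheap to write, since the spec enumerates the expected order; a sketch (hypothetical – my test name, JUnit-style):

import static org.junit.Assert.assertEquals;
import org.junit.Test;

@Test
public void generatesPermutationsOfAbbcInLexicographicOrder() {
  // The 12 permutations of one A, two Bs and one C, in lexicographic order.
  String[] expected = {
      "ABBC", "ABCB", "ACBB", "BABC", "BACB", "BBAC",
      "BBCA", "BCAB", "BCBA", "CABB", "CBAB", "CBBA"};
  for (int i = 0; i < expected.length; i++) {
    assertEquals(expected[i], getPermutation(1, 2, 1, 0, i));
  }
}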

The “reasoning” problems on this one were very easy compared to the other two questions.

Conclusion

Unfortunately meta-analysis is somewhat limited here as unlike the AIME results, the ones for the BIO aren’t public. I’m pretty sure 95 is a solid score as the contest isn’t that easy, but I won’t be surprised if there are multiple 100s (with a bit of caution I could have gotten one)!

Style during programming contests is an interesting issue. I find that many contestants, especially those using C++, like to use rather bespoke preprocessor routines (e.g. cin as f, cout as g). I’m actually used to some of them as well, e.g. ll for long long, vpll for vector<pair<long long, long long>>. If one understands these it’s probably better for speed, though when debugging I find having not-unreasonable names is preferable (even if my code is written more for speed than style).

Odd Sums Sudoku

The puzzle above is from the WPF Sudoku GP, Round 4 2016 #13. To solve it, each row, column and marked 3×3 box must contain the digits 1 to 9 once each, and in each marked row or column, the maximal odd substrings must sum to the given totals, in the correct order.

I’ve always enjoyed a good puzzle, though I’ve never considered myself especially good at them. That said, I’ve done my fair share of sudoku, including several books where puzzles were labelled as very difficult; yet, fairly often identifying triples, or the odd X-wing, XY-wing or unique rectangle, would be sufficient to blow through the puzzles. At least, I thought that clearing some “very difficult” puzzles in 8 or 10 minutes was blowing through them; then I found the World Puzzle Federation’s Sudoku GP, where some of these had benchmark times along the lines of 3-4 minutes!

Furthermore, in addition to your standard “classic” puzzles which feature the standard rules (i.e. each row, column and marked 3×3 box must contain the digits from 1 to 9 once each), there are a couple of interesting variants that require understanding new rulesets and figuring out how to deal with them. After a while, as mentioned, the classic puzzles largely tend to be solved with standard techniques, and the challenge there is dispatching them with extreme speed; the variants, on the other hand, tend to maintain some degree of freshness (at least as far as I’m concerned).

The above puzzle was an example of this; several of the rows (and columns) are marked with additional clues, and for these, the maximal odd substrings must sum to the values indicated by the clues, in order. For example, if a row was 178356924, the clue associated with it would be 8, 8, 9. Note that this clue is also generated by several other valid sequences, such as 532174689 or 174283569.
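To make the rule concrete, here is a little sketch (my own illustration, not part of the puzzle materials) that computes the clue for a row:

import java.util.ArrayList;
import java.util.List;

class OddSums {
  // Sums of maximal runs of odd digits, in left-to-right order.
  static List<Integer> clue(String row) {
    List<Integer> sums = new ArrayList<>();
    int run = 0;
    for (char c : row.toCharArray()) {
      int digit = c - '0';
      if (digit % 2 == 1) {
        run += digit; // extend the current run of odd digits
      } else if (run > 0) {
        sums.add(run); // an even digit terminates a maximal run
        run = 0;
      }
    }
    if (run > 0) {
      sums.add(run); // the row may end mid-run
    }
    return sums;
  }

  public static void main(String[] args) {
    System.out.println(clue("178356924")); // [8, 8, 9]
    System.out.println(clue("532174689")); // [8, 8, 9]
  }
}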

For the above puzzle, the 9, 5, 3, 8 clue is a good starting point, and indicates that we have 9, 5, 3 and (17 or 71) as substrings of the solution to that row in some order; given the position of the 6, we can conclude that the 9 and the 3 must occur before the 6 (though we don’t know exactly where). We thus have 9 in the bottom left hand corner, an even number (either 2 or 4), and then a 3. Now, reading the first column clue 1, 12, 3, 9: we have 1, (57 or 75), 3 and 9 as substrings. The 9 is at the end, and 3 cannot occur before the 4 as there’s no space to fit it. Thus 3 occurs after the 4, but it has to be separated from the 9 by an even number. More simply, the second column clue 10, 9, 6 indicates that our 5 has to be preceded by an even number and then a 1. There’s only one spot left for a 7 in this lower box, so we can insert it.

We thus reach this position:

Moving on, we can consider the middle column. 3 can only be satisfied by a singular 3, so we know that the 3 in the bottom row cannot be in the middle column. It thus must be in the sixth column, with 1 and 7 taking up the eighth and ninth columns (in some order).

Now consider the rightmost column. It is 3, 10, 12; 3 can only be satisfied by a singular 3, so the odd substrings must be 3, (19 or 91) and (57 or 75). Given the position of the 2 (fourth row), it must be the case that the column has 1 and 9 in the fifth and sixth rows, an even number in the seventh, and 5 and 7 in the eighth and ninth. We thus clearly have 7 in the corner, and 5 above it.

Finally, we can actually decide on the ordering of 1 and 9 as well; consider the hint for the middle row, which ends with 8; a 9 would automatically exceed this. Thus the middle row must contain the 1 and the sixth row the 9. The hint gives us for free that we have a 7 to the left of the 1.

There are more interesting deductions we can make as well. Consider the seventh column. The clue is 16, 9 and this can be satisfied in two ways: {1357} in some order and then a singleton 9, or {79} and {135} in some order. However, we can prove that the latter construction is impossible; the middle row is an even number, and also we have already used the 1 and the 5 in the bottom right hand box so we can’t place those numbers there. Thus, the first four numbers must be {1357} in some order and there is a 9 somewhere in the bottom right box.

The clue in the first row requires a 5 at the end. Given the 10, 10, 5 clue and the 4 already placed in the sixth column, the only odd substring in the last box must be the 5. We thus must have the 5 in the first row, and now only the 3 can fit in the fourth row. We also gather that the last two columns of the first row must be even, and we have enough clues to place these (using standard Sudoku rules).

Finally, we can place the 4 in the middle right box (now that the 3 is there, there’s only one space left where it can fit). Similarly, we can place the 5 (knowing that it can’t be in the seventh column any longer).

From this point on, solving the rest of the puzzle should be reasonably straightforward. The next step would probably involve placing 1, 9, 6, 7 and 3 in the first row (in that order – using the column hints and the knowledge that we have two tens)!

For these contests, I’d say the challenge lies largely in figuring out how the ruleset can be used to help solve the puzzle (and, well, speed; you’re thrown 13-15 puzzles to solve in the space of 90 minutes and I can only usually manage about seven or eight, and for that matter the ones I do finish are the ones on the easier end of the scale!).

On the Road Again

I travelled to Denmark for a recruiting event last week. The nature of the travel (2-hour flights plus an hour or so of ground transportation on each end) was not something I was used to; admittedly most of the flights I can remember taking are pretty long-haul ones (London-Singapore, London-San Francisco are by far the most common; London-New York exists too, though is less common). I had a great time, in any case – although being awake for about 24 consecutive hours especially when slightly ill was not too enjoyable, it was nice to meet up with students and flex some speed coding and debugging muscles. I think I still coded well even when really tired, though it did lead to some confusing moments as I tried to understand what exactly “PAY THE PRICE” meant in terms of a discount (it means no discount, but given that there was already “HALF PRICE” and a sliding reduction scheme, this did confuse me quite a bit).

This is just the second trip so far this year, though I anticipate there’ll be quite a few more. The two trips I’ve made so far were for business reasons; I’ve been hammering away at a major project at work that should be drawing to a close soon so there should be scope for personal time off. There are also likely to be more work trips in the pipeline – within the scope of my major project, for company events at large as well as for recruiting purposes. Of course, there would be even more trips if I was a Forward Deployed Engineer as opposed to a dev.

I remember reading an article in the BA inflight magazine en route to Billund, and noted that in some survey, 30 percent of people would accept a lower paying job if it meant that they could travel more for work. I think I subsequently found the survey they were referencing. I certainly would be in the 70 percent of people that would not, though it’s hard to draw an accurate conclusion given the way the question is framed (what’s the baseline here? I’d think people who already travel 50 percent compared to people who don’t travel at all might respond differently to being asked if they would do this to travel “more”). The survey was also commissioned by Booking.com, which might have a vested interest in portraying people as more willing to travel (so that companies would engage travel provider services more).

Of course, in business terms there are the obvious benefits of having in-person interaction between team members and/or getting the people who know what they’re doing on the ground, to see what’s going on. They can be really useful to get things done fast; communicating over instant messaging or even via a video conference doesn’t tend to be as effective.

I would say the upside of travel as far as an individual is concerned includes the opportunity to experience new places and see new sights, though I’m not sure how much of that you could achieve on a business trip. I find that many of the past trips I’ve had have been high-octane, fire-filled affairs (though there have been exceptions – and not the programming kind). Other benefits involve meeting up with existing friends and acquaintances in the area (again, the tight schedules of many of my past trips preclude this). The article does make reasonable points concerning extending one’s stay, though – I actually did that two years ago when I stayed on in California for an additional week after one such trip – and fluidity of plans leading to last-minute bookings.

One of the things that makes me wary of travelling too much is actually long, unpredictable delays through security and customs – this might partially be a result of the routes I typically fly on (which go through major hubs with lots of traffic). I do have strategies to mitigate this (typically, I book an aisle seat near the front, and walk at a very brisk pace once disembarking from the plane; I also tend to choose flights that arrive at relatively unpopular times). This wasn’t a problem at LCY nor in Billund Airport in Denmark; queues were very short and handled very quickly.

To some extent this can be mitigated by Registered Traveller in the UK, the ability to use the electronic terminals in the US and the biometric gates in Singapore; it’s especially useful for me since I don’t typically travel with checked baggage, making the whole process of leaving the airport and getting where I need to go much faster. I did some computations to check whether the UK Registered Traveller scheme was worthwhile, and in the end decided it was reasonable. I used an assumed number of 12 trips per year with an estimated time savings of 25 minutes per trip (a crude estimate, but there are 20 minutes between the UK Border Force’s benchmarks for non-EU and EU passport queues; going by benchmarks ePassports are probably even faster hence the + 5) to figure out that the net cost was 14 pounds per hour saved post-tax (or 24 pre-tax). It could be argued that those specific hours might have unusually high value (e.g. stress at the end of a long flight), perhaps boosting the case for it. I’ll probably travel even more with Registered Traveller in place, actually.
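For concreteness, the arithmetic there works out as follows (and, working backwards from my quoted figures, implies an annual fee of about £70):

12 \text{ trips} \times 25 \text{ minutes} = 5 \text{ hours saved per year}, \qquad \pounds 14/\text{hour} \times 5 \text{ hours} = \pounds 70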

I guess the other thing that has recently dampened my appetite for travel (at least on the personal front) would be the depreciation in the pound; relative to many people I know my proportion of overseas expenses is actually already fairly high (because of investments; 90% of my equities and 50% of my bonds are based overseas – and I hold much larger quantities of equities than bonds). To some extent I overreacted to Brexit (my savings, even in US dollar terms, have increased from the “Remain” case model I had) and there’s certainly still scope for travel around the EU where the pound has fallen more like 10 as opposed to 16 percent, though the length and luxury of such trips is likely to be reduced.

I’ll be travelling again soon, though it’ll be another of those long flights instead. Nonetheless, at some point in the future I definitely want to revisit Denmark. This trip was focused rather closely on executing the recruiting events in question (well, there isn’t really much you can do with one day anyway; I was in Denmark for less than 24 hours) – while I did get to see a little of the city, there was certainly much more to do.

Revisiting Mathematics Competitions (Hotseat: AIME II, 2015)

I remember I had a pretty good time doing the Haskell January Test earlier this year, and thought it might be both enjoyable and useful to revisit some of these past mental exercises. Hotseat is intended to be a series that discusses my thoughts, ideas and reflections as I take or re-take examinations or competitions; many of these will end up being related to mathematics or computer science in some way, though I’ll try to also include a few that go against that grain.

Background

I didn’t actually participate in the American Invitational Mathematics Examination (AIME) as I don’t think my high school offered it (the only maths competitions I remember taking myself are the Singapore Math Olympiad and the Australian Mathematics Competition). This is typically the “second round” of the main mathematical competition in the US (the first being the AMC, and the third being the USAMO), and features 15 problems, all of which have integer answers from 0 to 999. The point of this is to facilitate machine scoring. While the questions might indeed have integer answers, they can be very difficult especially towards the end.

2015 II was supposed to be an easy paper, though I didn’t find it to be very much easier than what I recall these to be. You can find the paper here.

I managed to fight my way through 12 of the 15 questions, leaving out two hard problems towards the end and a not-so-difficult one in the middle (mainly because I struggle somewhat with geometry in three dimensions). I’m also not used to doing these as quickly as I would have in the past; problem 14, for instance, would probably have taken under 15 minutes in the past, while it now took 32. Spending 8 minutes on problem 1 is rather bad, as well.

The values below indicate the time spent on each problem; here green means the problem was answered correctly while grey means no answer was given. (In practice, in a contest you would take the 1-in-1000 shot and guess, since there are no penalties for guessing!). I’ve also outlined the subject areas – you can see that I tend to be stronger with combinatorics and probability, and struggle somewhat with geometry!

I think I remember mentioning in a post a very long time ago that when I look at data, I tend to initially look for things that seem unusual. Bearing in mind that the difficulty of these papers tends to rise, we could say the 32 minutes spent on question 14, or the 15 spent on question 7, are perhaps not too far from expectations (the target “pace” here is 12 minutes per question, if you’re trying to finish). Bearing that in mind, we can look at a couple of outliers:

  • Problems 1, 6 and 11 took unduly large amounts of time.
  • Problems 2, 3, 8 and 12 were done well above pace.

Some of these were actually pretty interesting, which is one reason I tend to enjoy these competitions as well.

Selected Problems in Depth

Problem 1: Let N be the least positive integer that is both 22 percent less than one integer and 16 percent greater than another integer. Find the remainder when N is divided by 1000.

For this question you were actually supposed to find N; the “divided by 1000” shenanigans was so that the answer would be an integer from 0 to 999. The challenge here was really formulating a requisite equation from the problem statement, which would be

\dfrac{39}{50} k_1 = N = \dfrac{29}{25} k_2

and then reasoning that N must be divisible by both 29 and 39, such that k_1 and k_2 were both integers. Since these are relatively prime, the least such N = 29 \times 39 = 1131, and the answer was 131. I guess I hadn’t really warmed up yet, and wanted to be very careful to set up the expression correctly, so this took rather long.

6 was a bash with the Vieta formulas, though my brain temporarily switched off when I somehow thought a cubic equation should have four roots. 11, on the other hand, was a rather messy geometry question which I initially found tough to visualise. The solution on Art of Problem Solving involves constructing perpendiculars from the centre of the circle to the two lines, but I didn’t think of that (and for that matter wouldn’t have known that was a property of a circumcentre), and instead had a rather painful trigonometry bash to find the answer.

On the brighter side, 2 and 3 were rather simple questions (an elementary probability question, and a rather quick modulo-arithmetic one) and I cleared them off very quickly. I dealt with 8 using some rather quick-and-dirty casework that yielded the desired result, though it wasn’t too interesting. I’d say 12 was a bit more of a fun one.

Problem 12: There are 2^{10} = 1024 possible 10-letter strings in which each letter is either an A or a B. Find the number of such strings that do not have more than 3 adjacent letters that are identical.

I thought in terms of recurrences and dynamic programming, and realised that valid strings must end in BA, BAA or BAAA, or their flipped versions (alternatively, for the shortest strings, with a blank in place of the leading B). To end a string with BAA, it must previously have ended with BA, and so on for BAAA. Thus, the following equations drop out:

 BA(n) = AB(n - 1) + ABB(n - 1) + ABBB(n - 1)

 BAA(n) = BA(n - 1); BAAA(n) = BAA(n - 1)

Since the AB and BA cases are symmetric, we can simplify the first equation above to

 BA(n) = BA(n - 1) + BAA(n - 1) + BAAA(n - 1)

And we have of course the base cases;  BA(1) = 1, BAA(1) = 0, BAAA(1) = 0. Computing  2 \left( BA(10) + BAA(10) + BAAA(10) \right) is then not difficult, and yields 548.
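This is quick to verify mechanically; a small sketch (my own check, not contest material) of the recurrence:

// Counts length-10 A/B strings with no more than 3 identical adjacent letters.
class AdjacentLetters {
  public static void main(String[] args) {
    long ba = 1, baa = 0, baaa = 0; // base cases at n = 1
    for (int n = 2; n <= 10; n++) {
      long nextBa = ba + baa + baaa; // append A after a run of 1-3 Bs (by symmetry)
      baaa = baa; // ...BAA becomes ...BAAA
      baa = ba;   // ...BA becomes ...BAA
      ba = nextBa;
    }
    System.out.println(2 * (ba + baa + baaa)); // 548
  }
}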

In mathematical terms I would say problem 14 required some insight, though for seasoned competitors it would probably still not have been too difficult.

Problem 14: Let x and y be real numbers satisfying x^4y^5 + y^4x^5 = 810 and x^3y^6 + y^3x^6 = 945. Evaluate 2x^3 + (xy)^3 + 2y^3.

I initially tried squaring the resultant expression, and combining the two given results in various ways. This went on for about 20 minutes with little meaningful progress. I then took a step back, attempted and finished up with problem 11, and came back to this. I suddenly had the idea of combining the two givens to force a cubic factorisation:

 x^3y^6 + 3x^4y^5 + 3x^5y^4 + x^6y^3 = 810 \times 3 + 945 

And that simplifies down to

 x^3y^3 (x+y)^3 = 3375 \leadsto xy(x+y) = 15 

We then proceed by factoring the first result:

 x^3y^3 (xy(x+y)) = 810 \leadsto x^3y^3 = 54 

We can get 17.5 for x^3 + y^3 by factoring the second given to x^3y^3(x^3 + y^3). The final result is then just 35 + 54 = 89.

I subsequently figured out solutions for 13 (my brief intuition about complex numbers was kind of on the right track, but I didn’t have time) and then 9 (not good with 3D geometry), and had a look at the rather painful solution for 15.

Meta-Analysis

It’s worth taking a look at how this performance was relative to the actual score distribution, which is available on the AMC statistics page. A score of 12 would be in the 96.09th percentile, which is interesting; I did the 2015 AIME I a few weeks ago and scored 11, but on that contest that was the 97.49th percentile!

There’s also a report on item difficulty (i.e. the proportion of correct responses to each problem). I’ve graphed this below (items I answered correctly are in green, while those I did not are in red):

It’s interesting that problem 12, a problem I found to be one of the easier ones (assuming you sort by success and then time, I found it to be the 8th hardest) turned out to be the second “hardest” problem on the paper! I’d attribute some of this to a programming background; if I took this contest when I was in high school I think I would probably struggle with 12.

13 and 15 were hard as expected; the next hardest problem in general after that was 10 (just a notch above 12 in difficulty for me) though again it relies on tricky combinatorics. We then have 6, 11 and 8 before 14; I found 6 and 11 relatively harder as well, as mentioned above, though 8 was fairly trivial for me. It turns out that my geometry weakness is fairly real, as 9 was completed by more than half of the participants! We can also observe problem 1 being challenging relative to its location in the paper.

Conclusion

This was a good mental workout, and it felt nice to be able to hack away at the various problems as well. I think a 12 is pretty solid (the 96th percentile is that of AIME qualifiers, which itself is something like the top 5 percent of high school students taking the AMC, so that would be the 99.8th percentile – and you could argue that considering only students taking the AMC already introduces a selection bias towards those with greater aptitude). I’d probably have been able to solve 9 and maybe 15 if I was doing this back in high school, though I’m not sure I would have managed 12.

I can certainly feel a loss of speed, though I think my problem-solving ability is for the most part intact, which is good.

On Hints in Puzzles

I played a game of Paperback with a couple of friends from Imperial yesterday. The game involves players making words from letter cards in their hands, and using them to buy additional, more powerful letter cards and/or cards worth victory points. It’s pretty much a deckbuilder crossed with a word game; while anagramming skills are indeed important, they aren’t always necessary (for instance, I managed to form the word WORK to buy the second highest tier victory point card) and certainly aren’t sufficient (keeping one’s deck efficient is certainly important).

In any case, the game reached a point where one of my classmates seemed to be agonising over what was presumably a tricky hand. The game does have a number of cards that use digraphs such as ES, TH or ED, which can make searching for words a little trickier. The common vowel (a vowel that players are all allowed to use in their words in addition to the cards in their hand) was an O, and the hand in question was [IT], [R], [B], [U] and two wilds. I immediately saw a solution when he sought help (BURrITO or BURrITOs; none of the cards had abilities dependent on word length) and let him know that a ‘perfect’ solution for his hand existed, though we decided to let him figure it out for himself. This started to drag on, though, and giving him the answer directly seemed excessive; I thus decided to drop hints that would hopefully lead him to the solution.

Technically, the search space that he would have had at this point would have been 5! + 6! \times 26 + {7 \choose 2} \times 5! \times 26^2 which is about 1.7 million possibilities, though of course one does not simply iterate through them; combinations like OqRzBITU would probably not even be considered directly, especially since the distribution of letters in English certainly isn’t memoryless (the Q being a good example; it is rarely followed by any character other than U). I gave a few hints, in this order:

  1. The word is a common word that a person in middle school would likely know.
  2. The word can be made with just one of the blanks.
  3. The blank you use matches one of the letters already on the table.
  4. You can eat it.

We could argue that the point of a hint could be to reduce the search space being considered by disqualifying some solutions, and/or serve as a way to encourage the algorithm to explore cases more likely to be correct first. Interestingly, it seems that clue 1 was neither of the above, though. It didn’t really cut down the search space much, and as far as I know it’s not easy to use this information to organise one’s search. Its point was mainly to reassure my classmate that the solution wasn’t too esoteric and he would know it was a valid solution upon considering it. I didn’t think I had played any particularly uncommon words earlier in the game (the rarest might have just been TENSOR or something), though I think I might have played Anagrams with him before, where I know I’ve called out some more… interesting words.

The next two clues do actually substantially reduce the search space in a sense though; the second clue slashes it to just 6! \times 26 = 18720 and the third further cuts it down to 6! \times 6 = 4320. I would think that after the third hint, the search should be pretty fast; I would probably try doubling up on the R and T first as they’re common.

Nonetheless, hint number 4 seemed to be the hint he responded best to, and it falls into the second category (encouraging the search to explore cases more likely to be correct). I’m not sure how humans solve anagrams, especially when blanks are involved; I personally seem to look at letters, intuitively construct a word that looks like it incorporates the key letters present, and then verify that the word is constructible (in a sense, it is a generate-and-test approach). Given the point values of the tiles, I saw including the B, U and IT cards as being the most important; thinking of words that use these patterns resulted in BITUMEN and BURRITO fairly quickly, the latter of which was feasible.

I noticed that I tend to bias towards using the constraint letters near the beginning of the word, though; this might be a cognitive issue with trying to satisfy the constraints quickly – perhaps an unusual form of the human brain being weak at long-term planning. This wasn’t really the case with the burrito example, but another friend had a hand of ten cards (I think) at some point with an [M], [IS], [T], [R], [D] and a couple of blanks. With the common O and knowing that the M, IS and D were higher-value, I saw MISDOeR and struggled to incorporate the T, finding it difficult to back out of starting the word with a MIS- prefix. It turns out it might have been okay, in that I could have found MISsORTeD, though there were easier solutions such as aMORTISeD or MODeRnIST.

In hindsight, though, my hinting might not have been ideal. This is because BURrITO and its plural form weren’t the only solutions; there was OBITUaRy, which some people might find easier to see (though I didn’t actually see this during the game). I think clue 1 would probably already be invalid; I knew the word ‘obituary’ in primary school, since it was a section in the newspapers that I would browse through in the mornings, but I’m not sure it would be fair to expect middle school students in general to be familiar with the term. Clue 2 would certainly be incorrect.

The Boy Who Would’ve Reached for the Stars (Q1 Review)

The past few weeks have been incredibly intense; I had been root-causing and then fixing a fairly complicated issue on AtlasDB that involved unearthing rather messy relationships between various components. I thus took three days off this week to cool off a little, and also to collect some of my thoughts on the quarter gone by.

Looking at my public GitHub statistics, I merged twenty-eight pull requests over the last three months, ranging in size from one-liners to an almost 3000-line monster that should probably never have been. That’s about two per week, which is a reasonably healthy rate, especially bearing in mind that much of the last three weeks was spent debugging the big issue. (I acknowledge this metric is pretty coarse, but it does at least reflect positively in terms of the pace at which development work is getting reviewed and merged to develop.)

It seems I’ve done quite a fair bit of pull-request reviewing as well; recently this has to some extent shifted towards benchmarking and performance, which is something I find to be personally interesting. Concepts that a few months ago were merely abstract ideas I was vaguely aware of, such as volatile variables, blackholes and false sharing now actually do mean a fair bit more to me. I’d say there’s progress here.

Other goals in terms of computing such as learning more about distributed systems, contributing to interviewing and sharpening my knowledge of Java have certainly been trundling along acceptably well. My work with MCMAS development also continues, and on a bright note my submission (together with Prof. Alessio Lomuscio) of a paper on symbolic model checking CTL*K got accepted at AAMAS 2017. I’m also in the process of setting up a more modern version control infrastructure for MCMAS, and merging my own CTL*K-related changes from the project down into MCMAS’s master branch. This adds an end-to-end test framework and also fixes some old bugs in MCMAS, including deadlocks in counterexample generation.

Financially, the market continued its march upwards and my portfolio gained 4.8 percent (inclusive of reinvested dividends). Interestingly, I had thought it was largely flat across this January–March period. I was of course aware of the Trump reflation trade, but thought it was already largely priced in by the end of last year. Looking at things within my control, I continued my monthly regular investment scheme, and successfully completed an admittedly ill-conceived challenge in February to limit discretionary expenditures for the month to a hundred and fifty pounds (clocking in at £148.16). I decided to replace my laptop in March, which was a significant budget bump, but not an unreasonable one given that my previous machine had suffered an SSD failure and had served me well for about two years.

Amidst all this, it can be easy at times for me to simply lose myself in the rhythm of getting things done, to paraphrase a certain rapper, without considering why said things are being done. I had to make a trip to New York to do some important work, and on the flight back to London I noticed X Factor winner James Arthur’s new album, “Back from the Edge”, on the inflight entertainment system. I liked “Impossible” and “Say You Won’t Let Go”, and I was aware he had a great belting voice, so I decided to give it a spin.

The title track, among other things, flags a desire for the speaker to return to his beginnings. It’s delivered with a lot of vocal power (as I would expect), which works for me at least (viewed in the sense of announcing or declaring one’s return):

Back from the edge, back from the dead
Back from the tears that were too easily shed
Back to the start, back to my heart
Back to the boy — who would’ve reached for the stars

Thinking back, things have changed quite a fair bit for me over the last few years as I moved from Singapore to London. I’d say the most obvious changes were in terms of academic/professional expectations (they’ve always been fairly high, but skyrocketed after placing 1st in first year), work ethic (I do remember being pretty lazy) and personal finance (I’m now much more frugal and also invest in stocks and funds). In many cases, I’m relatively satisfied that these changes have taken place; I wouldn’t say that, reflecting on these four to five years, I’m now seeking to come “back from the edge” or “back from the dead”. However, I’m sure there are some less obvious changes that are less positive, such as reduced interest in and less time spent on learning outside of engineering and finance; these are often exacerbated by high expectations, which lead to more time and effort being spent on making sure those expectations are met. There are definitely things I used to like about my past self that aren’t as present in my present self.

I’ll be finalising Q2 targets soon; apart from the usual engineering/finance/friendships (EFF) targets, which are certainly very important, I will include one or two other things as well. I’ve generally found it difficult to identify what I want to accomplish beyond EFF, which are often my primary foci, and other things tend to be deprioritised very quickly if specific objectives for them aren’t established.

On Paying Oneself First

The expression “pay yourself first” is one very frequently put forth in personal finance circles. In practice, this often involves automatically rerouting some amount of money into a separate savings or investment account whenever a paycheck comes in. Besides the mathematical benefits (compounding; to some extent, risk mitigation via cost averaging if one is investing), I can see how the approach could be useful psychologically, in that it reflects a shift in one’s mindset concerning money, as well as a conscious prioritization of savings and capital growth. I’ve personally been following this, investing the money into a portfolio of equity trusts and index funds (though I’d recommend building an emergency pot to sustain a few months’ expenses first).
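
To put a rough number on the compounding benefit, here is a small sketch; the contribution amount, rate of return and horizon below are entirely made-up figures, and fees, taxes and inflation are ignored.

class PayYourselfFirst {
  // Future value of a fixed monthly contribution made at the start of each
  // month, assuming a constant annual return compounded monthly.
  // Purely illustrative; real returns are anything but constant.
  static double futureValue(double monthlyContribution, double annualReturn, int months) {
    double monthlyReturn = Math.pow(1 + annualReturn, 1.0 / 12) - 1;
    double balance = 0;
    for (int m = 0; m < months; m++) {
      balance = (balance + monthlyContribution) * (1 + monthlyReturn);
    }
    return balance;
  }
}

On these hypothetical numbers, futureValue(500, 0.05, 120) – £500 a month at 5 percent a year for ten years – comes to roughly £77,500 against £60,000 actually contributed, and the gap only widens with longer horizons.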

I see no reason why this can’t be generalized beyond money, though, especially since we do have to manage far more important scarce resources. In particular, I’m looking at time. I’ve received feedback that I tend to slant towards being outcome-oriented, and this does mean that if an important project is lagging but I see a way to recover, I’ll tend to vigorously pursue it – it’s thus not too difficult for me to end up spending 70 or even more hours a week on said project (or even 90-plus, as I did in second year at Imperial). I’ve learned that this is unsustainable, but for me it’s still not the easiest decision to drop something (to the point where a friend commended me for actually using some of my leave during the industrial placement!).

If we look at time in the same way, we get 24 hours a day, or 168 a week; things like sleep are of course important, but I tend to see them more as bills or taxes that need to be paid (not paying them tends to lead to interest!). So paying myself first would involve reserving some time for something else; I’d propose personal learning and development as a good candidate for this.

This is perhaps unsurprising; I suspect that if I polled people as to what “investing in oneself” entails, many answers concerning education would be forthcoming. Like bonds and equities (hopefully), developing one’s skills can lead to future payoffs. I do tend to partition this time into roughly three different domains:

  1. Technical development – e.g. paper reading, programming contests, code katas. These are likely to feed back into my software engineering work. I’d probably consider these similar to equity income mutual funds; they (ideally) grow in value reasonably steadily and generate nice payoffs along the way too.
  2. General professional development – e.g. writing, finance, tax. Useful for both software engineering work (from what I can recall, I’ve written a lot of docs) and also for managing my own professional matters. Again, these are generally useful; perhaps they have smaller immediate payoffs than technical development, though I tend to think of them as also very important in the long run. Perhaps these would be more similar to a growth-focused fund, then? Or even BRK-B (or BRK-A; we’ll get to that, but I don’t have a quarter of a million quite yet!).
  3. Random development – singing, photography, etc. These are generally quite fun (I do also enjoy software engineering, writing and finance, but they, like all other domains, tend to have diminishing marginal returns), and might suddenly explode into usefulness or value given the right conditions. Perhaps these are like emerging market equity funds that are focused on a specific country. There’s certainly a fair bit of variance, but the returns can sometimes be impressive. (If one wishes to take the metaphor even further, deeply out-of-the-money options could be more accurate; they certainly add a bit of fun to a portfolio, too! That said, I have no direct experience with options trading.)

Of course, money and finances are important things to manage, and I believe that paying oneself first is a good strategy there. However, it does feel to me that time is even more important. My admittedly equity-heavy portfolio lost around 6 percent during the US election jitters, and this is a portfolio which I’ve built up over the course of about a year and a half now – yet I didn’t feel much (I was expecting a further fall after the Trump win, though in the end the markets rallied). I’m the kind of person who can get annoyed if I find that a day was wasted – let’s not even get started on my reaction to losing 6% of 18 months (just over a month; 32.87 days). Of course, we have to be careful about jumping to conclusions about what constitutes waste or loss, but I think the point that I find loss of time more painful than loss of money (at least at a small-percentage scale) still holds.