Sets of Sets

If asked to think of a real-life example of a set of sets, the first one I came up with was job lots of items being sold. Interestingly, it actually took a while to get there – the ideas I came up with before that, such as the companies I have investments in (note: mutual funds and ETFs own sets of companies) or messages to my friends, were really unions of multiple sets.

How would we model a set of sets in Java? Well, that’s not too hard, since a Set is a type after all:
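
A minimal sketch of what this might look like (the Item class, names and exact statements are my assumptions; the line numbers referenced in the discussion below are marked in comments):

    import java.util.HashSet;
    import java.util.Objects;
    import java.util.Set;

    public class JobLots {
        // Assumed: a simple value class whose equals() and hashCode() are based on its name.
        static class Item {
            final String name;
            Item(String name) { this.name = name; }
            @Override public boolean equals(Object o) {
                return o instanceof Item && ((Item) o).name.equals(name);
            }
            @Override public int hashCode() { return Objects.hash(name); }
        }

        public static void main(String[] args) {
            Set<Set<Item>> jobLots = new HashSet<>();     // line 1
            Set<Item> thingsBeingSold = new HashSet<>();  // line 2
            thingsBeingSold.add(new Item("lamp"));        // line 3
            thingsBeingSold.add(new Item("desk"));        // line 4
            jobLots.add(thingsBeingSold);                 // line 5
        }
    }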

That said, we’ve got to be careful. If we continue the above with this:
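
Continuing inside main() – again a sketch reconstructed from the discussion that follows, not the original code:

    thingsBeingSold.add(new Item("chair"));                  // line 6
    System.out.println(jobLots.contains(thingsBeingSold));   // line 7
    System.out.println(jobLots.stream()
            .anyMatch(s -> s.equals(thingsBeingSold)));      // line 8
    System.out.println(jobLots.stream()
            .anyMatch(s -> s == thingsBeingSold));           // line 9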

What should the output of lines 7, 8 and 9 be? Are they equivalent, even? Well, there are some possible arguments in each direction:

  • They should return true, because the reference to the set thingsBeingSold is still there.
  • They should return false, because the value of thingsBeingSold isn’t the same as the value inserted.

I would argue that the first point is stronger, and that would have been my answer when I first learnt Java. However, the actual answer is that lines 8 and 9 should return true, while line 7 generally returns false – though there are a few special cases where it can return true:

  • It will return true if thingsBeingSold.hashCode() evaluates to the same value before and after line 6, i.e. the mutation happens not to change the hash (a collision between the old and new states of the set), or
  • if the Item created in line 6 equals either of the Items created in lines 3 and 4 (unlikely).

The operation of line 8 is relatively simple: we check each element in the set to see if it is equal, and assuming our set’s equals() method does what it’s supposed to do, we will always find the match. We could potentially parallelise this with parallelStream(). Line 9 does a reference-equality check on the elements instead, and would probably be faster as we don’t have to actually inspect the set contents; in this case it’s also going to be true. However, these are linear-time operations, at least. The point of using hash-based structures here, among other things, is to knock the time complexity of lookups down to (expected) constant.

In a sense, we find the object only if we fortuitously (!) happen to be looking for the same hash code. A Java HashSet is effectively a key-set view of a HashMap, and when we look up objects we actually investigate them by their hashes first, only bothering to check the values if the hashes match. This allows for some fairly interesting optimisations, actually; I previously thought that HashMaps use buckets and store an array of linked lists of mappings + information, resizing appropriately when the ratio of elements to buckets exceeds a load factor. This was correct up to Java 7, but in Java 8 there’s a fun new optimisation that converts a bucket into a red-black tree sorted by hash codes (while the objects themselves might not be Comparable, their hashes are ints, which certainly are) if that bucket gets too large, in an attempt to cut the worst-case lookup cost within a bucket from linear to logarithmic. Still, when we inspect nodes, we look at their hashes first.

Note that this isn’t just a problem with HashSets; you can get into similar debacles with TreeSets, because the underlying red-black trees aren’t going to reorder themselves. In general, the Javadoc for Set does warn you that mutable objects shouldn’t be changed in ways that would affect their equals() method after they’ve been put into a set. HashSet and TreeSet are allowed to behave the way they do because of the documented contracts between equals() and hashCode(), and comparison-based sets are supposed to use comparators that are consistent with equals(), i.e. a.equals(b) iff a.compareTo(b) == 0.

One way to get round this is of course to ensure that the types used in these sets are immutable, or at least that the parts used in equals() and related methods are immutable. If that’s infeasible, a good solution could be to remove the element that is about to change from the set, and add it back after the change; add deduplicates, and so the set will (correctly) stay a set. Wiring this up seems rather messy, though; it looks like we need to listen for state changes (so some kind of publish/subscribe mechanism), where the objects being stored publish both their old and new state so that we can modify the set accordingly.
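
A simpler variant routes mutations through the owning set rather than wiring up a full publish/subscribe mechanism; the SafeHashSet name and API below are assumptions for illustration.

    import java.util.HashSet;
    import java.util.function.Consumer;

    // Sketch: remove the element (under its current hash), mutate it, then re-insert it.
    class SafeHashSet<E> extends HashSet<E> {
        public void mutate(E element, Consumer<E> mutation) {
            boolean present = remove(element); // uses the element's current, pre-mutation hash
            mutation.accept(element);
            if (present) {
                add(element); // re-added under the new hash; a no-op if an equal element is already present
            }
        }
    }

With the earlier (reconstructed) example, if jobLots were a SafeHashSet<Set<Item>>, line 6 would become jobLots.mutate(thingsBeingSold, s -> s.add(new Item("chair"))) instead of mutating thingsBeingSold directly.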

A Software Engineer’s Perspective on Shenzhen I/O

I had the opportunity to take a bit of a break over the Christmas and New Year period, and James recommended this title to me about a month or so ago, so I thought I’d give it a shot. It’s essentially a kind of competitive hardware programming game – you’re given a task to complete, and the challenge is to build a circuit exhibiting the specified behaviour. Of course, in practice this is wrapped in a pretty fun if frivolous story. To do this, you have to arrange various components on a circuit board, such as microcontrollers (which allow some form of assembly programming), bridges, logic gates, etc. The verification interface (pictured in the screenshot) definitely reminds me of Quartus II.

I did some x86-64 assembly programming during my first year at Imperial, and thus initially hit a bit of a snag (I kept typing commas between operands, wondered for a bit why cmp, jne or shl weren’t valid commands, and also tried to perform arithmetic operations with both source and destination registers – you have to use an accumulator). It didn’t take too long to figure things out, though.

I’d say the first puzzle which got tricky was this one (there was an earlier one on using a greedy algorithm for coin change which might have been a bit tricky too):

  1. You will receive a signal on a non-blocking IO channel that consists of three packets. The first indicates the location to which you must move a robot using a motor, the second indicates the number of pulses to clean the plants at that location (?) and the third indicates the number of pulses to spray them with nutrition.
  2. To control the robot, you have to use its motor. Fire a signal of 100 for one pulse and then pull back to 50 to move the robot forward (increasing numbers); fire a signal of 0 for one pulse and then pull back to 50 to move it backward (decreasing).
  3. The clean and feed steps must be done for precisely the correct number of cycles.
  4. (Implicit) You may assume that there’s no overlap in signals (i.e. the robot will always have finished its feed steps before it gets another packet).

To add to the challenge, the microcontrollers have limited amounts of memory, and thus our programs have to be pretty short. As tasks grow more complex, we need to break them down into smaller pieces; we then end up needing to define protocols between the components. For example, for the above problem, the workflow is as follows:

  1. The “main” (first) module, upon receiving a packet, dispatches the target location to the second module and then blocks until said second module has sent a signal back.
  2. The second module compares the robot’s current and target locations, and submits commands to a third module (the one in the top right hand corner) to drive the robot to the target.
  3. The third module simply drives the motor to move the robot one step forward on a 0 input, and back on any other input.
  4. Once the robot is in the right location, the second module signals back to the first, unblocking it.
  5. The first module reads the rest of the packet and executes the clean and feed steps appropriately.

While this is of course different from what most software engineers will be doing in practice, I think it is not unreasonable to use this as part of teaching computer architecture and/or assembly (hello first-year Imperial students!). I think the game does teach a couple of useful ideas.

  1. Abstraction. This becomes critical for some of the larger problems. Quite a number of the tasks I’ve completed so far have involved developing protocols that use message passing between two or three modules.
  2. Optimisation can be painful. Most of the time the cleanest factoring is not going to give you the best score, to say the least; I think most users will experience some pain rewriting assembly to fit in fewer lines and/or run with fewer instructions.
  3. Complex conditional structures and gotos. You can use jumps, but those tend to be expensive; it thus becomes pretty common to use nested tests (teq, for example, checks whether its two operands have equal values). Tests set a three-valued flag; instructions can be preceded by “+” or “-” to be run only if the flag is set to true or false respectively. Rather nastily, the flag can be set to neither as well (there is a tcp instruction that indeed writes a three-valued output to the flag!).
  4. A reminder to consider edge cases. For the Coin Change circuit, the test cases did include a case with exact change, as well as a case where only the largest coins were needed. I’m not sure if the robot test cases included anything too sneaky beyond no movement being required.

Highly recommended; in addition to all the fiddly circuitry there’s also a pretty cool card game that’s a strange hybrid of FreeCell and Spider Solitaire. I’d give this a 9 out of 10, easily. That said, I think people with no prior programming experience would find the learning curve very steep. The game does pretty well at introducing some useful constructs via suitably contrived tasks (there was one on the ROM module, one on the I/O expander, and one about using the digits of an integer as flags) to help mitigate that, though.

An Unexpected Economic Computation

Here’s a calculation that I found pretty surprising. Have a guess as to its meaning:

 \dfrac{\left( \dfrac{60}{(24-15)} \times 2.40 \right)}{1 - (0.4 + 0.02)} \approx 27.59

If you guessed something relating to savings, you’d be on the right track; the denominator would probably have clued you in, since the 0.4 + 0.02 is the marginal tax rate that higher-rate earners in the UK would face (as of the 2016/17 tax year). You might then have figured out that the fraction on top probably refers to some form of time saving expressed in minutes. The 2.40 is probably a bit more puzzling, especially if you’re not from London; from the previous observations, it’s the cost of some convenience that saves nine minutes. That’s right; you might recognise it as the price of a single-trip Tube fare.

Putting that together, the computation is actually the effective hourly gross rate I’m being paid to walk to my office as opposed to taking the Tube. Specifically, I’m comparing (a) taking the Tube, which takes about 15 minutes plus a single-trip fare of 2.40, and (b) walking, which takes 24 minutes. We’re thus looking at a rate of 2.40 per 9 minutes saved. Of course, I pay for the Tube using post-tax money, so we need to factor that in, and to get an hourly rate we scale linearly.

Now of course this doesn’t mean I’ll never take the Tube to work again, or even that I’ll take it much less often – the 9 minute difference can be important (say there’s a high priority issue going on, or I’m about to be late for a meeting), and walking can be pretty unpleasant (if I’m very tired, it’s late, or it’s too cold; that said, I’ve done this in the current ~0 degree weather and found it fine, at least). Probably, doing this in the mornings would generally be more reasonable and sustainable as I’m pretty tired by the end of the day. Saving one trip even occasionally is enough of a win, I’d say, and getting some time to clear my head in the mornings is probably not too bad as well.

A question could be why I don’t use a Travelcard for the computations instead. The weekly one for zones 1-2 costs 33, and since I largely stay within zones 1 and 2 we can assume that’s all I’ll need (I’ll ignore the odd trip to Heathrow). I think 16 or 17 is probably about the number of rides I’d take in a week if I used it aggressively (two a day, maybe three or four on the weekends). We can re-run the numbers with \frac{33}{16.5}, which means a trip costs 2.00. Our final answer is 22.99, which is still pretty solid (if you work 42 hours a week, that’s a gross annual income of just over 50,000). Anyway, as it turns out, because I use contactless, if I happen to have a week where I do travel a lot I’ll effectively be using one anyway, thanks to weekly capping.
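
Explicitly, re-running the original computation with the 2.00 per-trip figure:

 \dfrac{\left( \dfrac{60}{(24-15)} \times 2.00 \right)}{1 - (0.4 + 0.02)} \approx 22.99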

Now let’s have a look at the monthly or annual tickets. These cost 126.80 for monthly, or 1,320 for annual. Assuming that we make 2.357 trips a day on average (taking the middle of the estimates above), and a month as 365.25/12 days, with the monthly card the cost of a trip falls to 1.767 and with the annual card it falls to 1.533. The “gross wage” calculations become 20.31 and 17.62, which while not as high are still solid amounts. 17.62 an hour corresponds to about just over 38,000 in gross income assuming a 42 hour week, which would put you in about the 78th percentile of earners according to the ONS data for 2013-2014. I guess this will probably be a bit lower now, with inflation/wage growth, but still decent.
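
For reference, the per-trip costs and the corresponding hourly rates work out as follows:

 \dfrac{126.80}{2.357 \times (365.25 / 12)} \approx 1.767 \qquad \dfrac{1320}{2.357 \times 365.25} \approx 1.533

 \dfrac{(60/9) \times 1.767}{1 - (0.4 + 0.02)} \approx 20.31 \qquad \dfrac{(60/9) \times 1.533}{1 - (0.4 + 0.02)} \approx 17.62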

Assuming usage is high, it seems that the annual card might be the way to go. However, an issue is that I sometimes need to travel overseas for work, potentially for a fair chunk of the year (let’s say personal plus work travel adds up to two months a year), so the annual one is quite definitely a non-starter. Multiply the price of a trip by 6/5 to factor that in, and you end up with a per-trip figure that’s higher than the monthly card’s. I did get the monthly card in the past, especially when I was a student and it only cost 86.50, partly because walking to Imperial was going to take a very long time (that route yields time savings more on the order of 30 minutes). Note that while losing a card is a legitimate concern, you can usually recover the travel passes using the TfL website.

(N.B. This idea was conceived independently of the currently ongoing Tube strike, though in a sense it’s a bit of a bonus that things aren’t affected.)

Jeremy’s Best of 2016

This was inspired by a YouGov article that I read. It can be interesting to look at some of the polling data (not just for elections/political things – YouGov does lots of interesting research). I’m largely in agreement with Surowiecki’s “wisdom of the crowds” for the scenarios he has identified – but these are actually quite rare (wise crowds should arrive at their decisions independently, and some members should be able to draw on some form of local knowledge); when I think of decisions made by a crowd I often think of asset bubbles (which fail the independence criterion).

Nonetheless, this post is intended to finish the year on a relatively lighter note. I find myself agreeing with the crowd that 2016 has been painful, for the world and for both of the countries I find some degree of attachment to, though personally it has actually been pretty decent. For 2017 I’m cautiously bullish on myself, partly out of pragmatism (I don’t see the value in being too bearish about things within my sphere of influence). I’d say I’m pretty ambivalent, if slightly bearish, as far as the rest of the world is concerned – I do think Brexit and President Trump have possible upsides, even if I don’t think I would have opted for the additional volatility introduced by each. There is news that’s clearly negative to me, such as the events in Syria. Nonetheless, my investing habits would say I’m an optimist – I continued buying through the drawdown in February, piled in after the Brexit vote and would have after Trump had the markets stayed down.

Unfortunately I couldn’t really answer many of the categories (e.g. favourite film/actor/actress). I’ll add a few other categories which I think I would find more interesting to talk about.


Best music act / artist of 2016

YouGov poll vote: Adele.
JK concurs, though weakly.
Honorable mention: Nil.

Huh. 25 was released last year, though I guess it’s fair game; if you asked me to assign this last year I’d probably have said Macklemore, because of Starting Over, Otherside and Can’t Hold Us (all 2013). It’s hard for me to pick a clear winner this year; I have been listening to Daughtry (It’s Not Over and Home have really nice acoustic versions, though they’re from 2006!), James Arthur (Impossible from 2012, mainly) and, strangely, the Backstreet Boys (Show Me The Meaning, 1999 and Shape Of My Heart, 2000) a fair bit. Hello is powerful, though I’d probably opt for Million Years Ago as my winner here. I normally find numerical hyperbole annoying, but it worked for me here.

I guess I’ll define “of 2016” as being something I experienced or discovered in 2016, independent of when it was actually produced. That does hold for Adele’s 25, certainly.

Best sporting event of 2016

YouGov poll vote: Rio 2016.
JK concurs.
Honorable mention: Cubs win the World Series.

I’d think for many countries this is a completely unsurprising result.

I don’t follow baseball (or sport, for that matter) closely at all, but I know the Cubs win was very well received at work and I guess by proxy enjoyed the moment. I was a little surprised to not see it on the board, but then I remembered I work in a US based company.

Best event of 2016

YouGov poll vote: Rio 2016.
JK: 100% on MCMAS-Dynamic. (Something personal – 3rd on the poll.)
Honorable mention: Graduation at Imperial. (Another personal thing – 3rd on the poll.)

I don’t follow sports closely. From the rest of the post you can figure out that I’m not a supporter of Brexit, so the top two are off the board. My grades were always pretty solid, but I guess this was a form of validation of performance across multiple cohorts (as debatable as those last few points can get when grading projects…). The cherry on the cake was balancing it with part-time contracting work at Palantir.

Graduation was fun, in spite of my complaints that a lot of it was symbolic. For me, I’d say its value centered around reminiscing with some of the professors who’ve played an active role in my studies (Ally, Duncan and Tony, most notably), having a good talk with two close friends who aren’t in the UK any more (we still talk, but face-to-face meetings are more enjoyable), and catching up with some of my other close friends and coursemates in general.

Worst event of 2016

YouGov poll vote: Brexit vote.
JK: Implosion of the opposition in the UK. (8th on the poll.)
Honorable mention: Syrian conflicts. (3rd on the poll.)

It’s really tough to pin this on a single event. I lean conservative, but a credible opposition is a good thing to have.

Interestingly, “something personal” was not on the board at all, though I guess people might be a bit more reluctant to share such negative events. That said, surely “something personal” (without further clarification) as a worst answer is valid?

Biggest impact on the world in 2016

YouGov poll vote: Donald Trump.
JK emphatically concurs.
Honorable mention: David Cameron. (4th on the poll.)

I don’t have any other candidates for this; I’ve picked the main enablers of the two massive news events for me this year. If Clinton had won and the vote had been to Remain, I really wouldn’t know. Obama, perhaps?

New word/phrase from 2016 that sums up the year

YouGov poll vote: Brexit.
JK: Post-truth. (2nd on the poll.)
Honorable mention: Brexit. (1st on the poll.)

As far as I’m concerned Brexit is small potatoes compared to the Trump election, though I think I can see why some respondents went with Brexit anyway – Trump himself described his election as being “Brexit-plus-plus-plus”, and I agree. We then had the Article 50 court case, and while I do see issues with not implementing Brexit (since that was what was voted for), the court case wasn’t exactly about that, and branding the judges “Enemies of the People” was unacceptable. (Though to be fair that’s also related to Brexit.)

The profane options (3rd and 5th) are a fine response, I’d say. I had to look up “lit” (4th; intoxicated?) and “dab” (6th; a kind of dance) – the former makes sense; I’m not so sure about the latter. Perhaps it bears some kind of connotation that I’m not aware of?

I’ve left out some of the more entertainment-related categories featured earlier in the article, but added two more that are relevant to me:

Favourite book/paper/article of 2016

JK: “Stock Series”, Jim Collins (jlcollinsnh)
Honorable mention: “Symbolic Implementation of Alternating Automata”, Bloem et al.

I’ve been passive investing for slightly over a year and a half now, as far as I can tell. I’ve always been pretty frugal. Nonetheless, I’d say this was a very well-presented introduction to private investment, and it was useful for corroborating some of the existing knowledge I’d picked up along the way reading Monevator and other personal finance blogs.

Of course, past performance is no indicator of future returns. My own portfolio doesn’t fully coincide with his growth approach (VTSAX/VTI + cash), in that I hold a small slug of REITs, go international and attempt tilting (value, small-caps, EM). I think all-domestic is debatable in the US and probably unreasonable elsewhere, though I’m not doing, say, pure VNRT or HSBC’s American Index Fund + cash either. For a one-fund portfolio, VTSAX/VTI seem like good ideas; I might prefer the world trackers VWRL (not cheap at 0.25% OCF, though), HMWO (0.15%) or MBWA (0.20%), though they come at a cost, so I can see an argument for sticking with VTSAX (0.05%).

The second paper helped me understand the “breakpoint construction”, a technique for converting an alternating automaton (think finite-state machine where one travels infinite paths and can be in multiple states at the same time) to an equivalent nondeterministic Büchi automaton (finite-state machine where one travels infinite paths, but is only in one state at a time), with potentially many more states. It’s a huge mess to explain, and I think the paper helped quite a fair bit in understanding how to utilise this construction.

Favourite game of 2016

JK: “Keep Talking and Nobody Explodes”, Steel Crate Games
Honorable mention: “Fallout 4”, Bethesda Studios

Generally I don’t play video games too much, though having other players (especially friends) certainly helps. After a while, we knew to asynchronously dispatch blocking modules quickly (Passwords, Complicated Wires), got better at reading Morse Code, and I remembered how to solve Simon Says with no strikes and parts of the Large Button tasks without needing the manual. (“Double Trouble” was as far as we got, I think… “I Am Hardcore” not yet, unfortunately.)

As a bonus, I decided I wanted to mod some elements of the game (mainly to add additional modules that tested anagram skills, certain properties concerning loop termination, and relationships in finite-length sequences of numbers). I thus meddled around with Unity a little, and also revisited C#, which I hadn’t used for two years. That’s actually the challenge you see in the picture at the top: the game picks an 8-letter word which has no one-word anagrams apart from itself. It shuffles the letters and then encodes each using a shift cipher (it actually uses 8 shift ciphers, one for each button) based on the button index and certain characteristics of the bomb (such as its serial number, batteries and ports). Solving the module is thus simply a matter of reversing the ciphers and decoding the anagram, but doing that under pressure can be tricky.
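
The cipher itself boils down to something like the sketch below; the exact way the shift is derived from the bomb’s characteristics is the mod’s own logic, so deriveShift here is a stand-in assumption for illustration (letters assumed to be uppercase A-Z).

    public class AnagramCipher {
        // Stand-in: mix the bomb's characteristics with the button index to get a shift in [0, 26).
        static int deriveShift(int buttonIndex, String serial, int batteries, int ports) {
            return Math.floorMod(serial.hashCode() + 7 * buttonIndex + 3 * batteries + ports, 26);
        }

        // Encode one shuffled letter with the shift cipher belonging to a particular button.
        static char encode(char plain, int buttonIndex, String serial, int batteries, int ports) {
            int shift = deriveShift(buttonIndex, serial, batteries, ports);
            return (char) ('A' + Math.floorMod((plain - 'A') + shift, 26));
        }

        // Defusing reverses the process: undo the shift, then un-scramble the anagram.
        static char decode(char cipher, int buttonIndex, String serial, int batteries, int ports) {
            int shift = deriveShift(buttonIndex, serial, batteries, ports);
            return (char) ('A' + Math.floorMod((cipher - 'A') - shift, 26));
        }
    }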

I haven’t played the game for a while given that most of the bomb defusal squad (if you can call it that) is no longer based in London, but I really enjoyed the times we did play.

Fallout 4 was pretty fun the first time round; after that not so much. After finishing the main storyline (going with the Minutemen ending) I started optimising things a bit more aggressively and the game got considerably easier, even on Very Hard (I’m terrible at FPS games, but certain combinations of perks synergised pretty well). The VATS system was even the basis for a dynamic programming loop induction proof I gave my tutees.

Navigating Tube Fares

A bit of an additional post for the week, as I’ve had a little bit more spare time! This post is a more fully-fleshed out response to a question my friend Andrea had, about the value of an annual travelcard.

I’ve started doing my preliminary accounts for 2016, and one of the things I examined was my transport expenditure. I typically try to use what’s known as zero-based budgeting (that is, each category and the value assigned to it is justified from fresh assumptions, rather than say raising the previous year’s data by RPI and calling it a day). Of course there’s some flexibility (I’m not going to pass up a social gathering just because of finances, unless it’s insanely expensive – which is unlikely given the background of my friends, or at least the activities we take part in together).

There’s a column of 86.50s, corresponding to a string of monthly zone 1-2 Travelcards purchased at the student discount. We then have a crash to two low months when I was in the US and Singapore respectively, a figure just over 100 for November, and December looks to be closing at around 50; I didn’t purchase any Travelcards after August. At the time, I made these decisions because I was unsure whether going for the annual Travelcard was a reasonable idea, especially given that I would frequently not be in London owing to international travel, both for work and for personal affairs. The total cost for the category for the year was 894.68; this is lower than normal because I didn’t purchase any flights this year. I’ve been a bit cautious, having been deployed internationally on quite a few occasions; I didn’t realise at the time that you can refund the remaining value of a Travelcard!

This would have been 924 if I had bought an annual zone 1-2 Travelcard (sadly, I’d now need 1,320 as I’m no longer a student); that said, with one I might have travelled more as well. Also, I was out for two months and started occasionally walking to the office in December. You can get refunds on the remaining value of a Travelcard – that said, I’m not sure repeatedly cancelling and then repurchasing annual Travelcards is permissible, and it seems like it would certainly be inconvenient. Loss shouldn’t be too major a concern, as Oyster cards can be registered to an online account which one can use to transfer a season pass away from a lost card. (I’ve done this before, though with a monthly pass.)

I think a question would then be as follows: exactly how frequently (in terms of number of days) do I need to use the Tube to make pay-as-you-go (PAYG)/monthly/annual Travelcards the best choice? We can examine that under a few assumptions:

  • The traveller is an adult.
  • All journeys are within Zone 1.
  • PAYG is implemented through contactless, so weekly caps apply.
  • The year begins on a Monday (this matters for weekly capping computations).
  • 16/7 trips per day (that’s reasonably realistic for me).
  • (Somewhat cheeky) If one travels for N days one travels for the first N days of the year.
  • Journeys on day D are made between 0430 of day D and 0430 of day D + 1.
  • The “greedy monthly flexible” (GMF) strategy works as follows:
    • It buys monthly travelcards as long as there are full months remaining.
    • For the partial month (if one exists), it uses the cheaper of:
      • a monthly travelcard
      • PAYG (with weekly capping)

Obviously GMF dominates a pure PAYG strategy, because for full months a monthly travelcard always beats PAYG (consider February), and for partial months GMF considers PAYG, so it does at least as well as PAYG. If I’m not wrong GMF is optimal under these contrived conditions: it intuitively seems difficult to recover from burning through February, the shortest month, without buying the monthly travelcard as you’d need four weekly ones. However, in the general case GMF is certainly not optimal (consider the period February 28 – March 31; you can buy the Travelcard on February 28, which expires March 27, and then pay for four days of fares, or pay February 28 and buy the Travelcard on March 1; the optimal strategy saves three days of fares).

The fare if one has to travel for N days is reflected in the graph below; unsurprisingly, the flexible methods are superior for small N but inferior for large N. Our model has a break-even point at about 314-315 days.
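
For what it’s worth, the break-even figure can be reproduced with a rough sketch like the one below (the class name is mine; the weekly cap is assumed equal to the 33 weekly Travelcard price, and weekly capping for the leftover partial month is approximated as restarting at the start of that period). Under these assumptions the crossover lands at the same 314-315 day mark.

    public class TravelcardComparison {
        static final double SINGLE_FARE = 2.40;    // zone 1-2 PAYG fare, from the post
        static final double WEEKLY_CAP = 33.00;    // assumed equal to the weekly Travelcard
        static final double MONTHLY_CARD = 126.80;
        static final double ANNUAL_CARD = 1320.00;
        static final double TRIPS_PER_DAY = 16.0 / 7.0;

        // PAYG with weekly capping over d consecutive days, weeks counted from the first day.
        static double payg(int d) {
            double cost = 0;
            for (int start = 0; start < d; start += 7) {
                int daysThisWeek = Math.min(7, d - start);
                cost += Math.min(WEEKLY_CAP, daysThisWeek * TRIPS_PER_DAY * SINGLE_FARE);
            }
            return cost;
        }

        // "Greedy monthly flexible": monthly cards for full months, then the cheaper of a
        // monthly card or capped PAYG for the leftover partial month.
        static double gmf(int d) {
            int[] monthLengths = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31}; // non-leap year
            double cost = 0;
            int remaining = d;
            for (int m = 0; m < 12 && remaining > 0; m++) {
                if (remaining >= monthLengths[m]) {
                    cost += MONTHLY_CARD;
                    remaining -= monthLengths[m];
                } else {
                    cost += Math.min(MONTHLY_CARD, payg(remaining));
                    remaining = 0;
                }
            }
            return cost;
        }

        public static void main(String[] args) {
            for (int d = 305; d <= 325; d++) {
                System.out.printf("%3d days: GMF %.2f vs annual %.2f%n", d, gmf(d), ANNUAL_CARD);
            }
        }
    }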

The final decision, unsurprisingly, boils down to the level of certainty you can have about your travels. If you don’t expect to be spending more than around 50 days outside of the UK, the annual travelcard seems like an idea worthy of consideration especially if you know when said days lie. That said, we have made two key assumptions, one of which favours the monthly strategy and one of which favours the annual one:

  • An upfront lump-sum payment is needed if you’re using the annual scheme. Our analysis did not account for the time value of money (you would need to discount the monthly payments to today to get a fairer comparison of the two).
  • However with the monthly strategy we’ve assumed that plans are known well in advance (at least a month) and implementation is done perfectly. In practice, there are likely to be some minor errors or plans not aligning neatly on month boundaries that will result in slightly higher fares.

I personally don’t expect to travel more than that, but I won’t be getting an annual card next year, for other reasons. (In particular, that “16/7 trips per day” assumption is unlikely to be valid, but that’s a subject for another post.)

Jeremy’s Annual Review 2016

As mentioned in a previous post, I’ve been using the system of objectives and key results (OKRs) to keep track of how I’ve been faring, looking at various metrics. I think OKRs do tell a significant part of the story, but they certainly don’t cover everything; a year is certainly long enough that things can change direction massively, and there may be unexpected things that have come along, for better or worse.

Why am I doing this? Reviews in general help me to reflect on the direction I’ve been taking, and allow me to consider how I should orient myself in the next period under consideration. I also find that being held publicly accountable is useful in spurring myself to perform better.

I’ll split this into three main sections: what went well, what did not go well and what major lessons were learnt. There’ll be another post to come concerning targets for the next year, though I’m sure some of the information here will feed back into that.

Things that went well:

  1. Academics at Imperial. There is still certainly room for improvement, but scoring 100 percent in the project and a 94.5 average for the final year blew even my best expectations away (95 and 92, respectively). I wasn’t too happy with some of the scores for the modules I took, but I think all things considered this was a very solid finish to four arduous years.
  2. Contracting at Palantir. I think working part-time as a software engineer during the term was great, though I realise the long hours (mainly because context switching became super expensive) had nontrivial costs. The main drivers behind this involved maintaining the currency of my skills as a developer as well as keeping in touch with the team at Palantir, and I think both ends were achieved here.
  3. Friendships. I was expecting the worst after we moved on from Imperial, but I’ve been keeping in touch with a fair few people. It hasn’t always been the easiest thing to do and it’s certainly caused me a lot of stress (unfortunately, it might have also had that effect on said friends), but I find maintaining good communication very worthwhile and I’m happy to have friends who also (at least seem to) believe this is important.
  4. Travels. I don’t mind a small amount of travel, but for various reasons was faced with a London – San Francisco – Palo Alto – Washington DC – New York – London – Singapore – London trip which ordinarily is more than I would like. Nonetheless I broke this down into small chunks and handled each leg of the journey smoothly. Besides the aforementioned trip I also had another visit to Palo Alto earlier in the year as well.
  5. Writing. I think, at least over the past three months or so since I started writing here, I’ve been a lot more disciplined about sticking to a schedule and forcing myself to express my thoughts.

Things that didn’t go so well:

  1. Acquaintances. I’d have liked to keep in touch with more people from uni. I know this goes against the third point above, though that focuses more on the closest friends I had at Imperial whilst this focuses more on quantity.
  2. Working hours. Subsequent to a previous post where I explained how I tracked time at Imperial, there have been spikes above 80 hours and even approaching 90, which is something I need to learn to control. I think I tend to have a very strong bias towards action, plus I do know that I can be very determined; this can lead to such situations.
  3. Shifting standards / “the problem of never settling”. I find that I’m usually able to identify weaknesses and then address them (good), but there will always be more weaknesses to find, and a high quality bar very often has pretty heavy costs too. I’ll spend some time over the next two weeks to clear my head on this. As always there’s the flip side to this which probably leads to my current attitude, that settling for less-than-great performance isn’t ideal as well.
  4. Investments. I’m largely a defensive investor, as Benjamin Graham would call it. I invest every month when my paycheck comes in, and diversify into a portfolio of passive and active funds (the latter is needed in some cases, such as for access to small-value). Yet there were many occasions this year where the markets kept me up at night or had me worried, when they really shouldn’t have. I have a tendency to be very aggressive with drops (I bought in after Brexit; not so much after Trump, since there wasn’t really a significant drop once the markets opened).

Key lessons I learned (largely chronological):

  1. “It’s supposed to be hard. If it wasn’t hard, everyone would do it. It’s the hard that makes it great.” (You might recognise this as a quote from Tom Hanks’s character in A League Of Their Own.) I think I’ve generally liked to put myself through my paces, but this came out of the depths of working with MCMAS-Dynamic. In a way, finishing that project gave me some further validation that the 90%+ grades and the strong performance reviews I had were warranted. Sure, it can be painful, but the sense of accomplishment I often get when successfully pulling off something really difficult is great. And even at the end (before the grades), I was still concerned the project was not hard enough!
  2. Sometimes I need to take the first step. I’d say I learned this most clearly in what I call the post-Imperial crash period, when I started realising that a number of my friends would no longer be in London. I was somewhat concerned about how things would go from then on, and this caused me a fair bit of stress. I became considerably more proactive with many of them, and it seems things have worked out better than what I expected.
  3. “If you’re made to wait, it’s for your own good.” Performing high-quality work takes time. During MCMAS-Dynamic I found that after 8 hours or so per day I had difficulty maintaining my concentration; I thus switched from the 11- or 12-hour days I would fire off during term (when the work required comparatively less abstract reasoning) to an 8-hours-a-day, 7-days-a-week model. I was still pretty drained by the end. I’d be inclined to think that the pure technical difficulty of MCMAS-Dynamic is higher than anything I’ve had to face so far at Palantir, but I’ve similarly run into seeming brick walls with some of my work (though I generally have not been implementing the 7-days-a-week thing). The quote above was introduced to me by a teammate who was concerned by the hours I’d been working; I think it was intended in an outgoing direction (that is, I might be making others wait for their own good), but it could also apply to some extent the other way (that is, doubling down on effort in an attempt to rush something out is not always a good idea; being forced to slow down might be better).

The Problem of Never Settling

There’s still a little bit more of the year left (I’d say we’re about 95 percent through). However, we’re sufficiently far along that I think it’s time to take stock of what has happened so far and evaluate it.

Of course, this has been a year of shifting gears, at least in professional terms. At the turn of the year I was struggling with the Immerman-Szelepcsenyi theorem and how to write static analyzers; in the middle I tackled probably some of the most intellectually challenging work I have ever faced, dealt with leaving Imperial (in spite of the aforementioned challenges – there were other things), and set off around the globe. And now at the end I’m left thinking about how to best manage my time and energy across a seemingly unbounded spectrum of professional and personal responsibilities.

The continual pursuit of improvement, even if incremental, can be a great thing. It’s of course difficult to assign suitable metrics, but the mathematics of compounding makes a 0.5% daily improvement more than 5x if sustained over a year. Familiarity with, say, the Parable of the Talents (on being a good steward of what one has) drives the point home further – I’m thankful for the opportunities that I’ve been given so far, but that inevitably comes with some sense of responsibility, that said opportunities are suitably managed and capitalised upon.

An issue with that, then, is distinguishing what constitutes responsibility and what constitutes going beyond (the technical term, especially in a moral or ethical context, would be supererogation). For me at least, the standards I set are somewhat dynamic, and they’ll tend to rise further on strong performances or fall back on weaker ones (most obviously academic targets, also a couple of personal ones). I seek to make these standards difficult but achievable, for I find this spurs growth; yet, it’s way too easy to forget about that and then perceive missing a challenging target as a complete failure (when, actually, pursuing said target probably had a positive effect on the overall outcome, just that it set an anchor at an even higher point).

I tend to push hard to meet what I’ve set for myself. I have a practice of doing weekly reviews, and I think this increases my awareness of how I’m performing; I thus tend to put in a bit (or a lot) of extra effort if something doesn’t look like it’s tracking its target. However, as I’ve mentioned in previous posts, these standards can be and often are blunt instruments especially because they frequently are not entirely under my control.

There are two fairly well-known aphorisms that I find applicable to this – one that’s pretty general (perfect is the enemy of the good) and one more contemporary and specific to software engineering (launch and iterate). Indeed, these challenging targets can sometimes appear overwhelming, though generally I find myself able to focus on what I can do and proceed.

It’s important to keep a clear head through all of this, and taking a bit of time out over last weekend has helped. I think what I’m getting at here is that there can be a dark side to the pursuit of continual improvement, possibly because it leads to greater monitoring and self-accountability, and with some success that can lead to expectations being rapidly revised upwards (at least in my experience). In spite of that, I still believe it’s worth doing.

Stepping Back

I met up with a couple of my closest friends from uni over the weekend. At the end of what was a rather tough week (plugging away at debugging several issues regarding distributed systems) where everything seemed to center around getting a bunch of servers to behave as I wanted them to, it felt very different and somewhat refreshing to take a step back. I’m usually quite capable of focusing aggressively on a task at hand, which usually is good as far as said task is concerned but may not be optimal in general terms.

We played Loaded Questions, and one of the questions that came up was what we didn’t like about our professions. Being a bunch of Computing alumni, the set of answers was probably not too surprising:

  1. The salary.
  2. It’s difficult to measure performance.
  3. The algorithms are too complex.

I contributed #2 (though to be fair, thinking that I submitted either of the other answers is not entirely unreasonable – from an earlier post you can see that I investigated what looked like a 1 basis point shortfall in interest payments, and I just said that I had a tough week with debugging). Though I’ve tried to consider frameworks for evaluating this, I frequently find myself using outcomes and deliverables as primary measures of performance. This leads to me assessing myself based on my ability to get things done regardless of the costs or means required, and there are many factors beyond my control that can influence these.

I think this is a bigger issue when dealing with areas where I tend to believe I’m largely in control of the volume and quality of “output”, whatever that means. For example, as far as my investment portfolio is concerned, I tend to prefer to set goals along the lines of “set aside at least N for investment, incurring fees less than F” (largely controllable), rather than “portfolio net worth to be at least N” (I don’t believe I have the edge or insight to be able to be in control of such goals). I like to think software is a domain where I do exercise quite a bit of control over what I write, so it certainly is an issue in that sense.

In any case, I was happy to have had the opportunity to cool off a little. I think it’s a good thing that I can and do take my work pretty seriously (it has led to reasonably solid results in the past), even if it comes with the occasional tradeoff of getting excessively absorbed in it. On balance I think it’s a positive trait to have, but it’s important to remember that there are many things beyond whether an invariant holds in a bunch of servers.

Interest on the Interest


I don’t remember the early years of my education that well. I do remember that maths was consistently my favourite subject back in primary school, though I wasn’t particularly good at it.

Anyway, it was around year 4 (so I was about 10 years old) when I started to take a bit more interest in personal finance. I’m not sure why this happened (I don’t remember young Jeremy being very interested in material things, and although the dot-com crash was in 2001 I’m not sure I knew about it at all back then!). I think at the time I viewed the stock market as very speculative (clearly hadn’t heard of mutual funds or ETFs); the childish me probably saw it as an “adult thing” to do as well (to be fair, if manually picking stocks that’s probably a reasonable view). I was thus more focused on what the older me would recognise as fixed-income investments.

However, in any case, I had saved some money from birthdays and the Lunar New Year, and given that I wasn’t going to be using it immediately I thought it would be good to put it to work. I was vaguely aware of how banks worked, at least as far as retail banking was concerned (i.e. the bank takes your deposit at rate x and lends out your money at y > x; the delta is for the service of matching depositors and borrowers). I was also aware of other schemes such as fixed deposits and other types of savings accounts. Interest rates at the time were about 2 to 3 percent, and knowing little else I thought that was not too bad for a start; my account at the time had an annual equivalent rate of 3%.

I remember looking through my bank statements then, and noticing that interest was paid twice a year, at the end of June and December. It didn’t take long for me to figure out that the December figure was bigger, at least partially because it was calculated including the interest from June. I then started wondering what would happen if the interest payments were made monthly, daily … or even billions of times per second. With some research I learned about continuous compounding; even if you were able to do this compounding at 3% infinitely often, you’d still “only” get a rate of e^{0.03} - 1 \approx 0.0305 for your efforts.

However, the figures didn’t tally up with my calculations for a long while. I remember initially wondering why I wasn’t paid exactly 1.5% on each payment. Nevertheless, by then I had some familiarity with exponents, and I realised that 1.015^2 = 1.030225 > 1.03, and that really we should be expecting \sqrt{1.03} - 1 \approx 0.01489 each time, rather than 1.5 percent. Still, this didn’t square up with the figures (it was getting down to cents, but still). I let the matter rest at the time, since it was broadly correct. Also, I noticed that the June payments tended to be a little too small, the December payments a little too big – so I thought it averaged out in the end (which it did – that’s the point of an AER!).

Anyway, 15 years later I found the reason why, as part of prep work for a reading group I’m doing with Stan. I’m surprised I didn’t think about it back then especially given the observation about June and December payments (at the time, I made the oversimplifying abstraction that the payments were made “every six months”). The key is that the interest was calculated using what is known as an act/365 daycount which factors in the actual number of days for the period you were earning interest, and the first “half” of the year is shorter than the second “half”! Consider that in a non-leap year:

  • From 1 January to 30 June you have 3 \times 31 + 2 \times 30 + 28 = 181 days, but
  • From 1 July to 31 December you have 365 - 181 = 184 days!

With this, we can calculate how much should actually be paid each time. We need to solve

 \dfrac{181}{365} r + \dfrac{184}{365} r \left( 1 + \dfrac{181}{365} r \right) = 0.03 \leadsto r \approx 0.0297783

And so for the January-June period, on a $1 investment you would expect interest of

 \dfrac{181}{365} r \approx 0.0147668

which is notably less than the 0.01489 figure that we get from treating the two halves of the year as equal in length.
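
For completeness, the corresponding July-December payment on the same $1 investment is

 \dfrac{184}{365} r \left( 1 + \dfrac{181}{365} r \right) \approx 0.0152332

which is indeed the larger of the two (the two payments sum to 0.03), matching the June/December pattern in the statements.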

Note that a wide variety of daycount conventions are used, depending on which financial instruments are concerned! There is the 30/360 daycount, where every month is treated as 30 days and the year as having 360 days, which makes month-level abstractions valid but becomes unpleasant when you go below that; you also have act/360, which like act/365 seems computationally nice. There’s also act/act (used for US Treasury debt, notably), which guarantees identical value per day within a period at the expense of dealing annoyingly with leap years and/or the fact that the number of days in a year is odd. There are many further variants of what I’ve discussed so far as well, including a few particularly nasty ones that scale on business days as opposed to calendar days.

ScheduledExecutorServices

(Diagram: ScheduledExecutorService and quota timing semantics)

I briefly touched on this in my first post, but I find that there are quite a few things which I aim to carry out (at least / at most) N times per time interval T. This is described by John Sonmez in Soft Skills as a quota. You may have been able to infer this from the frequency of posts here – I’ve been aiming to write at least 1 blog post per week.

On another note, scheduled executor services are a part of the Java concurrency libraries (introduced in Java 5). They allow clients to submit one-shot tasks wrapped in callables (possibly with some delay before execution), as well as tasks with a fixed rate (more similar to a quota, though not quite – mainly because quotas are concerned with getting things done, while executor services are concerned with enabling tasks to run; also because a scheduled executor service won’t start concurrent tasks). Clients can also specify tasks with a fixed delay; this differs from a fixed rate in that the countdown to the next execution starts after the current task has completed.
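
The three scheduling modes mentioned above look roughly like this (a minimal sketch; the weekly blog-post example is just for illustration):

    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    public class SchedulingModes {
        public static void main(String[] args) {
            ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);

            // One-shot task: runs once, 10 seconds from now.
            scheduler.schedule(() -> System.out.println("one-shot"), 10, TimeUnit.SECONDS);

            // Fixed rate: runs are scheduled 7 days apart, measured from the *start* of the
            // previous run (though a run that overruns delays, rather than overlaps, the next).
            scheduler.scheduleAtFixedRate(
                    () -> System.out.println("weekly post (fixed rate)"), 0, 7, TimeUnit.DAYS);

            // Fixed delay: the 7-day countdown to the next run starts only once the previous
            // run has *finished*.
            scheduler.scheduleWithFixedDelay(
                    () -> System.out.println("weekly post (fixed delay)"), 0, 7, TimeUnit.DAYS);
        }
    }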

If one assumes that the tasks complete relatively quickly, then quotas are, in a way, less restrictive than scheduled executor services; they give flexibility as to when in the time period T the N events occur. This is especially important for tasks that require 2-way synchronization (for me, that largely involves spending time with friends) – it would be even more so for barriers involving multiple people though I haven’t actually planned any such arrangements.

A downside of this is that it delays the decision of when in each period T the events should be scheduled; it’s arguably simpler to schedule them at the same point in each period. Also, if one follows the letter of the quota, this can lead to very uneven intervals between occurrences – for syncing up with friends, blog posts and quite a few other things, while this certainly isn’t bad, it’s also less than ideal (imagine if I had a “weekly sync” with a friend, but actually spoke to them once every 2 weeks at 23.35 on Sunday for 20 minutes, and then again at 00.05 on Monday for 20 minutes). I find that a good way around this is to normally target the same point, but allow for flexibility; I’m not sure you can readily do this with a ScheduledExecutorService (you’d have to cancel the old task and reschedule a new one with the correct delay, I think).
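
One way to approximate “target the same point, but allow for flexibility” is to have each run schedule its own successor, so the gap can be nudged per occurrence. A minimal sketch (the class name and the ±24 hours of slack are illustrative assumptions):

    import java.util.Random;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    public class FlexibleWeeklyTask {
        private final ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();
        private final Random random = new Random();
        private final Runnable task;

        public FlexibleWeeklyTask(Runnable task) {
            this.task = task;
        }

        public void start() {
            scheduler.schedule(this::runAndReschedule, 7, TimeUnit.DAYS);
        }

        private void runAndReschedule() {
            task.run();
            // Aim for roughly a week until the next occurrence, with up to a day of slack either way.
            long slackHours = random.nextInt(49) - 24;
            scheduler.schedule(this::runAndReschedule, 7 * 24 + slackHours, TimeUnit.HOURS);
        }
    }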

The diagram above more succinctly illustrates the difference between the timing semantics for the various things I’ve described. More often than not, at least for meeting up and/or syncing with people, the pattern is closer to the second or fourth above (i.e. with random variation, plus the occasional additional occurrence; perhaps the occasional missed occurrence as well, though I didn’t include that in the diagram).

Another way I’ve modeled this in the past is on the concept of a failure detector in distributed systems. Servers/subsystems can arrange periodic heartbeats, suspecting one another of failure if a heartbeat is not received (and “un-suspecting” them if one is subsequently received). Though, because of the aforementioned flexibility, a pattern that conforms to a quota could result in a heartbeat interval of 2T. I guess the idea I had previously was that I didn’t want to lose contact with friends, bearing in mind that I was still at Imperial at that time and thus would quite naturally and easily meet many of my friends in lectures or in labs – on the other hand, I’m the only person from my class to go to Palantir (at least at this point). I find a system based on quotas much easier for me to manage, as well.