Limitations of the Bus Factor

Over lunch today, a friend and I discussed what we were doing in response to COVID-19. I mentioned I was working from home, and that the purpose of working from home was presumably to avoid the offices becoming a coronavirus cluster. This isn’t the same as quarantine or self-isolation in that I can and do still leave the house, attend smaller-scale social events and meet friends. The probability of my becoming sick seems not too high, but the idea is that even if I do, we don’t want this to spread to others on the team or at the company.

The bus factor is a simple measurement of how concentrated key personnel are in a team. It is defined as the minimum number of people whose removal from the team would cause the project to stall. For example, if a shop has five employees, where everyone can restock shelves but only two people are authorised to operate the cash register, then the shop team has a bus factor of 2, since if those two people are removed the shop will not function.

Generally, a higher number is better, as it means the team can tolerate more people being unavailable. This is often used to inform risk planning, such as by prioritising context-sharing, developer documentation or other means of spreading knowledge.

However, there is quite a lot of variance within a single measurement. For example, suppose we have a team A which has two leads, and we say that the team will stall if both of them are unavailable (but only them – any other group of people becoming unavailable, including everyone but one of the leads, is okay). Also consider team B, where the absence of any two people will cause the team to stall (though no individual absence would). Both teams have a bus factor of 2. However, A is strictly safer than B, and by what seems to be quite a large margin. For that matter, I’m not sure B is even safer than a team C, where there is one absolutely critical lead (bus factor 1).

This leads to a concrete problem. Suppose each person on a team independently has a 1 percent probability of becoming unavailable for some reason. What is the smallest team size at which a team with a bus factor of 2 could actually be riskier than a team with a bus factor of 1?


Finding the Extremes

To answer this question, we want to find the extremes of risk for a given bus factor BF, because we want to compare the riskiest bus factor 2 team against the safest bus factor 1 team. This drops out quite naturally: the safest team is one where there is a single specific set of BF people whose unavailability will cause operations to stall, and no other combination that is not a superset of that set is critical. Conversely, the riskiest team is one where every combination of BF people is critical. As defined above, team B is the riskiest possible bus factor 2 team, and team C is the safest possible bus factor 1 team.

Without loss of generality, let the people in the team be numbered 1, 2, \ldots, n and the lead have number 1 (or 1 and 2 in the case of team A). Let the probability that an entity E becomes unavailable be P(E_\times). Then, the probability that each team’s work stalls is as follows:

\displaystyle P(A_\times) = P(1_\times \wedge 2_\times)

\displaystyle P(B_\times) = P \left( \bigvee_{i=1}^{n} \bigvee_{j=i+1}^{n} i_\times \wedge j_\times \right)

\displaystyle P(C_\times) = P(1_\times)

As written, these terms aren’t directly comparable – how they relate depends on the underlying probability distributions.

Attempting to Compare Terms

I’ve used logical formulas above to avoid making claims about the underlying probability distributions. However, if we assume that the probabilities are independently and identically distributed, and we say that over some fixed period of time the probability that an individual becomes unavailable is p, then we have

\displaystyle P(A_\times) = p^2

\displaystyle P(C_\times) = p

P(B_\times) is marginally trickier. This is the probability there are at least two people who become unavailable. It is probably easier to work out the complement P(B_\times'): that is the probability zero people or one person is unavailable. If we define X as the number of people (out of n) who become unavailable, then P(B_\times') = P(X=0) + P(X=1). It’s clear that P(X=0) = (1-p)^n. P(X=1) = np(1-p)^{n-1}: the probability a specific person is unavailable and no others are is p(1-p)^{n-1}, but there are n valid versions of this. More generally, X follows what’s called a binomial distribution. Thus,

\displaystyle P(B_\times') = (1-p)^n + np(1-p)^{n-1} = (1-p)^{n-1} \left( 1 - p + np \right)

and thus

P(B_\times) = 1 - (1-p)^{n-1} \left( 1 - p + np \right)

Notice that, as expected, this is dependent on n, while the other teams A and C have probabilities that don’t depend on n. Furthermore, as n increases, P(B_\times) increases – intuitively, as you consider more people, every way in which the team was already broken ignoring the new people remains broken, but some previously unbroken states can also break. (A more mathematical argument could look at the ratios of successive terms.)

Synthesis

The combination of exponential and polynomial (well, linear) terms in P(B_\times) makes direct comparison with P(C_\times) difficult. We can, however, fix some risk probability p and find out at which value of n team B (no one is individually critical, but every pair of people is critical) has a greater probability of stalling than team C (there is a single critical lead). To answer our initial question, we choose p=0.01. Since we have a closed form for the probability of team B having issues, we can plot a graph of P(B_\times) against n and see where it crosses p.
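Alternatively, we can pin down the crossover with a small numeric search – a minimal sketch (the class name is just illustrative), using the closed form derived above:

public class BusFactorCrossover {
    public static void main(String[] args) {
        double p = 0.01; // probability that an individual becomes unavailable
        // Smallest n at which team B (every pair critical) is riskier than
        // team C (one critical lead): 1 - (1-p)^(n-1) * (1 - p + n*p) > p.
        int n = 2; // team B needs at least two people
        while (1 - Math.pow(1 - p, n - 1) * (1 - p + n * p) <= p) {
            n++;
        }
        System.out.println(n); // 16
    }
}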

It turns out the answer here, n=16, is actually relatively small in computational terms, even if it is large as far as teams are concerned. That said, the conditions for a team being the safest possible for a given bus factor are extremely strict – the sheer under-resourcing from removing people, even if they aren’t information silos and don’t hold unique knowledge, would still severely impair team function.

Learning German 5: Do Not Give The Children Cigarettes

I had my Goethe-Zertifikat A1: Start Deutsch 1 exam about 3 weeks ago. The exam has four parts: listening, reading, writing and speaking. I was most nervous about the oral exam, but once it started, I felt more comfortable. The third part of the oral exam is about making and responding to requests. One of the other candidates took a card with a “no smoking” sign on it, but I think he forgot the word “smoking”. He was smart, though, and said, “Please do not give the children cigarettes”. I think my speaking skills were good enough, though unfortunately not as humorous.

Vor drei Woche habe ich meine Goethe-Zertifikat A1 Prüfung gemacht. Die Prüfung hat vier Teilen, nämlich Lesen, Hören, Schreiben und Sprechen. Ich war am nervösesten über die mundliche Prüfung, aber als sie angefangen hat, fühle ich mich besser. Das dritte Teil der mundlichen Prüfung ist über Bitten formulieren und reagieren. Ein anderer Kandidat hat eine Karte mit einem “Rauchen verboten” Schild genommen. Ich habe gedacht, dass er hat das Wort “rauchen” vergessen. Er war intelligent, und hat gesagt “Bitte geben Sie meinen Kindern keine Zigaretten”. Ich denke, dass ich habe gut genug gesprochen, aber ich war leider nicht so lustig.


Taboo is a game where one needs to communicate a word to one’s partner, without saying a list of “taboo” words. For example, for the word “Twister” one might have Tornado, Game, Colours, Legs and Arms as taboo words. If I still wanted to go for the party game approach (which I think is easier than the tornado approach as it’s very precise if done correctly), I might say something like “Something played at a party where a wheel is spun and players contort their limbs to place them on red, green, blue or yellow circles”.

I’ve found some similarities between playing Taboo and speaking in German (and even in Mandarin), in that I can’t always use the most direct approach of communicating something. The reason for this is different: in Taboo, it is the rules of the game, while with second and third languages it is typically because I lack vocabulary required to use the most direct approach. For example, if I wanted to communicate “I want a pair of chopsticks”, I could do this easily in English or Chinese (我要筷子), but I don’t know the word for that in German (Essstäbchen, it turns out). I might thus need to work around it via description, with something like “Könnten Sie mir chinesisches Besteck geben?”. There are a number of English words that have been adopted into German, so I could just take a shot in the dark with “Geben Sie mir ein Paar Chopsticks, bitte.” which would probably get the point across especially to a German speaker, though it’s wrong.

I’ve also tried playing Taboo myself in German. Even sticking to simple sentences (to avoid struggling with stringing together subclauses or neighbouring sentences on the fly), not having a broad vocabulary makes it tricky. For example, I saw a card with the goal word Biene or bee (banned: gelb or yellow, Stachel or sting, Summen or buzzing, Honig or honey, Blütenstaub or pollen). In English, this isn’t very hard: “there’s a queen, they make hexagonal structures, produce golden liquid good for coughs… you can say you have this in your bonnet, or if you’re in a rush you’re making a this-line for something”. I found this tricky in German. Even trying to translate what I’d say in English, I might begin “Es hat Königin”, but it goes downhill from there (it turns out hexagon is just das Hexagon, and I could improvise liquid by just saying Wasser). I wouldn’t be able to give the idioms, and I’m not sure they even work in German. Effectively, all of the missing vocabulary items are permanent Taboo words.

I think one of my strengths in English communication is being able to make statements fairly precisely (for example, “I don’t have a particularly strong opinion; while I agree with (specific parts of X), I’m not confident X is correct because I am not in a position to assess (other part of X)”, or “X is correct in principle but I’m not sure it’s appropriate to implement”). I’m not there in Chinese, and definitely not in German. This can be an issue with using possibly circumlocutory descriptions – if they accidentally indicate a preference that isn’t really there. For example, for the “chinesisches Besteck” example above, one can’t fault a server that brings a soup spoon (and depending on one’s point of view, the server could be justifiably annoyed if one rejects the soup spoon and continues to make this request without qualifying it further).

Is It Someone’s Birthday Everyday? Solving the Inverse Birthday Problem

The birthday paradox is a fairly well-known but unintuitive observation that although there are normally 365 days in a year (for simplicity, ignoring leap years and assuming a uniform distribution of birthdays), with as few as 23 people it is more likely than not that two of them will share a birthday. One of my teammates proposed an inverse of this problem: what is the minimum number of people needed so that it is more likely than not that it is someone’s birthday every day?

Modelling A Specific Instance

There’s a lot to take in here, so let’s try to solve a simpler problem first. We could attempt to solve: given a specific number of people, how would we calculate the probability that it is someone’s birthday every day? If we have this method, we can reconstruct the answer to the original problem via a search algorithm.

The more general problem here would be: suppose we have N \geq 1 people. Each person draws one of K \geq 1 events with replacement – these are equally probable and independent. What is the probability that all events occur at least once? Our original problem has K = 365, but of course that’s tricky to work with right off the bat, so let’s examine a couple of the smaller cases.

Algorithmic Ideation

Base Case Analysis

If K = 1, since we said that we always have at least one person, the one possible event will happen and the probability is 1. Moving on to K = 2, N = 1 is a special case with probability 0. For larger N, there are two specific situations we want to avoid: everyone draws the first event, or everyone draws the second. Each of these happens with probability 2^{-N}, so the probability of everything occurring at least once is 1 - 2 \times 2^{-N} = 1 - 2^{-N + 1}.

We can now analyze K = 3 – if N \leq 2 we definitively have probability 0. For N = 3, the first person will receive a novel outcome with certainty, but the second and third person also need novel outcomes, and this happens with probability \frac{2}{3} \times \frac{1}{3} = \frac{2}{9}. It gets a little trickier with N = 4:

  • The first person to draw an event will get a novel event. This leaves 3 people to draw events, and 2 out of 3 events undrawn. We can call this state (3, 2, 3).
  • The second person to draw an event will get a novel event with probability 2/3, putting us in the state (2, 1, 3). Alternatively, he/she will draw the same event as the first person with probability 1/3, getting us in the state (2, 2, 3).
  • If we’re in (2, 1, 3), there’s a 1/3 probability that the third person draws the last event (in which case we are done), and a 2/3 probability he doesn’t, putting us in (1, 1, 3).
  • If we’re in (1, 1, 3) we have one person who needs to draw the last event with probability 1/3.
  • If we’re in (2, 2, 3), the third person will get a novel event with probability 2/3, putting us in (1, 1, 3). He might draw the same event as the first two people, in which case we definitely failed.

More generally, we can model the problem using a state machine (a technique which I discussed two and a half years ago). The states of this machine can be written as 3-tuples (p, q, r), meaning that p people still have to draw events, and that of the r events, q still need to be drawn at least once. Notice that q \leq r, and that for our problem we have r = K; we generally want to find the probability of success for the state (N, K, K).

For K = 3, N = 4 we can draw this as follows.

Recurrence Relations

We can define V(p, q, r) as the probability that transitions from the state (p, q, r) will eventually end up in a SUCCESS state, defined as (p, 0, r) for all values of p and r (as that means all events were drawn). We make some observations on simple cases:

  • V(p, 0, r) = 1. We are already in a success state by definition.
  • V(0, q \geq 1, r) = 0. If everyone has drawn an event and there are still events that haven’t occurred, there is no chance for us to draw those events.

That’s all we need, and we can look at the recursive case. In the state (p, q, r), there is a \frac{q}{r} probability that we draw a new outcome, putting us in the state (p-1, q-1, r). There is also a 1 - \frac{q}{r} probability that that doesn’t happen – in that case we’re in the state (p-1, q, r). It thus follows that

V(p, q, r) = \dfrac{q}{r} \times V(p-1, q-1, r) + \left( 1 - \dfrac{q}{r} \right) \times V(p-1, q, r)

Notice that these states are well-formed, since we handled the p=0 and q=0 base cases earlier. It is possible to optimise this by adding some more sophisticated base cases, but this still gives us a clear enough method for working out the probability. For example,

  • V(2, 1, 3) = \frac{1}{3} \times V(1, 0, 3) + \frac{2}{3} \times V(1, 1, 3) = \frac{1}{3} \times 1 + \frac{2}{3} \times \frac{1}{3} = \frac{1}{3} + \frac{2}{9} = \frac{5}{9}
  • V(2, 2, 3) = \frac{2}{3} \times V(1, 1, 3) + \frac{1}{3} \times V(1, 2, 3) = \frac{2}{3} \times \frac{1}{3} + \frac{1}{3} \times 0 = \frac{2}{9}
  • V(4, 3, 3) = V(3, 2, 3) = \frac{2}{3} \times V(2, 1, 3) + \frac{1}{3} \times V(2, 2, 3) = \frac{2}{3} \times \frac{5}{9} + \frac{1}{3} \times \frac{2}{9} = \frac{4}{9}

 

Solving The Specific Instance

The answer to our original problem, for a given number of people N is then V(N, 365, 365). Working this out by hand is likely to be painful, so we should avoid that. It’s easy to code something simple up that is based on the recurrence relation above:

private static double findSuccessProbability(long remainingPeople, long remainingEvents, long totalEvents) {
    if (remainingEvents == 0) {
        // All events have been drawn at least once: success.
        return 1.0;
    }
    if (remainingPeople == 0) {
        // No one is left to draw, but some events are still outstanding: failure.
        return 0.0;
    }

    double probabilityOfNewEvent = ((double) remainingEvents) / totalEvents;
    double successProbabilityOnNewEvent = findSuccessProbability(remainingPeople - 1, remainingEvents - 1, totalEvents);
    double successProbabilityOnOldEvent = findSuccessProbability(remainingPeople - 1, remainingEvents, totalEvents);

    return probabilityOfNewEvent * successProbabilityOnNewEvent
            + (1 - probabilityOfNewEvent) * successProbabilityOnOldEvent;
}

We can test this and verify that indeed V(4, 3, 3) = \frac{4}{9}. However, if we try to call this with, say (500, 365, 365) the program doesn’t seem to give the answer quickly.

One solution here is to use memoisation. The naive version written above ends up computing the answer for the same state potentially exponentially many times, but we can cache the results of previous computations and use them. This could be done with a map with the 3-tuple as a cache key; alternatively, a bottom-up dynamic programming solution can be written, since there’s a clear order in which we process the subproblems. In any case, these methods should get the runtime down from exponential to O(NK).
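To make this concrete, here is a minimal memoised sketch (the class and field names are just illustrative, not part of the code above); it caches results keyed on the (people, events) pair, since the total number of events is fixed for a given run:

import java.util.HashMap;
import java.util.Map;

public class EveryDayBirthdayProbability {
    private final long totalEvents;
    private final Map<Long, Double> cache = new HashMap<>();

    public EveryDayBirthdayProbability(long totalEvents) {
        this.totalEvents = totalEvents;
    }

    public double findSuccessProbability(long remainingPeople, long remainingEvents) {
        if (remainingEvents == 0) {
            return 1.0; // all events have been drawn at least once
        }
        if (remainingPeople == 0) {
            return 0.0; // nobody left to draw the outstanding events
        }
        // Each state is uniquely identified by (remainingPeople, remainingEvents),
        // since remainingEvents never exceeds totalEvents.
        long key = remainingPeople * (totalEvents + 1) + remainingEvents;
        Double cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        double probabilityOfNewEvent = ((double) remainingEvents) / totalEvents;
        double result = probabilityOfNewEvent
                * findSuccessProbability(remainingPeople - 1, remainingEvents - 1)
                + (1 - probabilityOfNewEvent)
                * findSuccessProbability(remainingPeople - 1, remainingEvents);
        cache.put(key, result);
        return result;
    }
}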

Reconstructing The General Solution

We want to find the smallest N such that V(N, 365, 365) > 0.5. We could start from N = 365 and then keep trying larger values of N: since K is constant this is actually not that bad as we can reuse the results of computations from previous attempts when trying new values of N.

public static void main(String[] args) {
    long numPeople = 365;
    double successProbability = findSuccessProbability(numPeople, 365, 365);
    while (successProbability <= 0.5) {
        numPeople++;
        successProbability = findSuccessProbability(numPeople, 365, 365);
    }
    System.out.println(numPeople);
}

This gives us our answer: 2287. We have V(2286, 365, 365) = 0.499414 \ldots while V(2287, 365, 365) = 0.500371 \ldots.

This approach looks a bit primitive – and normally we’d use a modified binary search where we first search for an upper bound via something that grows exponentially, and then binary search up to our bound. This works, since V(p, q, r) increases as p increases for constant q, r.

However, for this specific problem – and more generally for recurrences where parameters decrease by 1 – evaluating V for a large N still requires working through almost all of the smaller subproblems that the search skips over, so isolating the bound won’t actually save us that much here.
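For reference, a minimal sketch of that bound-then-binary-search (assuming a memoised findSuccessProbability with the same signature as above) might look as follows:

public static void main(String[] args) {
    long lower = 365; // with fewer than 365 people, full coverage is impossible
    long upper = 365;
    // Grow the upper bound exponentially until the success probability exceeds 0.5.
    while (findSuccessProbability(upper, 365, 365) <= 0.5) {
        lower = upper;
        upper *= 2;
    }
    // Invariant: lower has probability <= 0.5, upper has probability > 0.5.
    while (lower + 1 < upper) {
        long mid = (lower + upper) / 2;
        if (findSuccessProbability(mid, 365, 365) <= 0.5) {
            lower = mid;
        } else {
            upper = mid;
        }
    }
    System.out.println(upper); // 2287, as before
}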

Usually when working with recurrences it is nice to find a closed-form solution for our expression (i.e. we have a general expression for V(p, q, r) that doesn’t itself reference V), though I didn’t really see how to make progress with this one.

Risks in Playing the Long Game

Monevator is one of the personal finance sites that I read fairly frequently. They have a fairly good weekly column that includes a number of interesting links, both about personal finance as well as on other topics like psychology or recent news events. The article I read yesterday begins as follows:

Unlike every other personal finance blogger around, I’ve started 2020 wondering whether it’s time I spent more money.

I would not call myself a personal finance blogger, but I had thoughts along similar lines when planning my budget for 2020. Eventually, I decided yes, and expanded the budget at the end of the planning session (and this is already after a 50% increase over 2018). My reasoning was not too different from the author’s (and the logic behind having a need for such an argument was similar as well – saving, which is a form of delaying gratification, seems to come naturally as a default to me).

In theory at least, if I make some assumptions, I can work out how much £1 spent today would cost me in retirement at 65. There are 37 years to go until then, and if we use a 4.5% real (that is, accounting for inflation) return, that becomes about £5.10. In other words, I could buy roughly five times the number of goods I could today, if I was willing to wait 37 years.
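Concretely, the arithmetic behind that figure is just compound growth over the 37 years:

\displaystyle \pounds 1 \times 1.045^{37} \approx \pounds 5.10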

However, a large problem with excessively delaying gratification (both for money and in general) is that the expected benefit may not materialise – this is known as an interruption risk. These risks can typically take two forms:

  1. Collection risk, where the projected reward doesn’t actually become available. In context this means that the £1 I invested doesn’t become the inflation-adjusted equivalent of £5.10 37 years down the road.
  2. Termination risk, where the person delaying gratification does not get to benefit from the projected reward, even though it did become available. In context this means that although the £1 I invested does become £5.10 (or more), I don’t survive 37 more years or I do but my ability to use it is limited or hampered for some reason.

The former could arise for various reasons, as my projection depends on a 4.5% real return assumption. This is in line with the past 119 years – in fact the historical figure was slightly higher, at 5%, though I would use 4.5% to account for platform and fund fees (see Credit Suisse’s 2019 Global Investment Returns Yearbook). This pattern may not continue going forward if there are large, value-destroying events (e.g. faster than expected climate change effects, nuclear war). Even if the pattern continues, I might not be able to get this return because of more local concerns (e.g. aggressive investment taxes as part of left-wing policy, or my fund of choice turning out to be poorly managed).

The latter is also very relevant. The author mentions that he’s “within striking distance of 47” and is concerned about planning 40 years ahead when using a compound interest calculator; I’m probably between fifteen and twenty years younger, and the pessimist (or realist?) in me also gets uneasy about putting 40 in that box, let alone the analogous 55-60 (e.g. see James 4:14 – whether you’re a Christian or not, I think the sentiment is relatable). Even if I do make it there, it’s possible that I’d be in less of a position to enjoy the money, for time or health reasons. If I had my current financial state at the end of National Service in Singapore, for example, that 9-month period before starting university would have looked quite different, and I would have done more things, some of which I’m not in a position to do right now because of work (e.g. I’d probably have done a two-month round-the-world trip).

I still intend to remain mostly frugal and save a fair amount. Travel will likely be my largest source of expenses – flights and hotels are not cheap. I also spent quite a lot of money on clothing last year, though I’m trying to avoid that as I’m running up against (reasonable) storage limitations and there are quite a number of clothes that I haven’t worn for a while. Thus my plans go nowhere near as far as the author’s claim that he’ll make little effort to grow his wealth. I don’t think I’m in a position where that would be responsible, in that my current savings/portfolio are nowhere near sufficient for retirement.

Burning Lights and Falling Stars (2019 Review)

When I look back on 2019, what first comes to mind are a number of protracted struggles. These spanned a broad range of issues, from work to philosophy and from finance to logic. This was probably either the toughest or the second toughest year I’ve had since I moved to London. It’s comparable to 2016 (with MCMAS and Palantir work in parallel, and then TimeLock): in both cases I think that I’ve grown and learnt quite a fair bit over the course of the year, probably more than in other years, but also that this generally isn’t a long-term sustainable pattern.

Software Engineering

The early part of the year involved finishing up work done last year on Transactions2. This project was generally successful, though more painful than expected. There were a lot of weird bugs that took quite a lot of iteration to fix, and since the previous dev lead with whom I worked on this project left the team, and these bugs involved pretty deep technical knowledge, it would have been very costly to ramp anyone else up. Delivery could have been improved with a bit more careful project management/planning, but in terms of technical execution I think I performed at or slightly above my personal bar here.

I became a tech lead in March and then a more general team lead later on. A lot of the year involved me trying to get comfortable with the roles; it was a conscious decision to focus more on that as opposed to further refining technical execution of individual projects.

The former focuses on making architectural decisions, prioritising feature and support work, being a representative of the team at higher-level meetings, and being responsible for communicating thoughts and ideas with other teams. The first two, I think, were expected to go mostly smoothly, and did: I already contributed quite a bit to these while “strictly” an IC, and I definitely consulted the leads at the time and then jumped in on things I thought were more important than the work I was originally planning on doing. I maintain a lot of context on what’s going on on the team and product, and am aware that my input on these is valued. The latter two also turned out to work better than I expected, in that feedback has generally been positive with regard to communication and argumentation. That said, I don’t think I’m where I’d like to be with regard to coming up with ideas for things to do (as opposed to, given a list of features, filtering them for feasibility and benefit and then prioritising them), though there are of course support structures for that.

The latter role seems to also involve project management, and developing other members of the team into stronger engineers. This hasn’t gone as well. I think it’s not so much a question of not wanting to do it or invest in it, but of being inefficient: I made an invalid assumption that others would follow the same growth path as me (which is basically mostly by absorption and by figuring things out as they come up). That assumption is maybe reasonable as a starting point (I think it worked very well in my case), but I took a long time to course-correct. It is a bit ironic that the reason I took so long to course-correct was that my bandwidth was strained by more pressing things I felt I needed to handle directly, when the antidote to that is scaling myself by investing in team members’ growth.

I continued to work on some technical projects throughout the year, though not as intensely as in 2017 or 2018, probably by design. I’d say the highlights were the two hackweek projects that I did. The Summer project is going into production, and involved careful reasoning through the AtlasDB development stack. The Winter project came with a bit of a pleasant surprise: my efforts on being thoughtful about interviews landed a runner-up prize for use of tech, which I really wasn’t expecting; I did the project more because I saw a relevant need.

Recreation and Personal Development

A fairly common theme here is skill development – there’s some evidence of this in terms of Sudoku and German, both of which are things I spend a fair bit of my free time on.

Sudoku and Logic Puzzles

I set a goal to achieve a global rank in the top 100 of the World Puzzle Federation Sudoku and Puzzle GPs. This was achieved in both cases: I finished with rank 66 in Sudoku and 92 in Puzzles. The contests run monthly (I think it’s actually every 4 weeks) from January to August.

Each Sudoku contest is basically an exam paper marked out of 600, though additional points can be obtained by submitting a fully correct paper before time is up. There are 8 contests, and the overall GP ranking is based on the sum of each competitor’s six highest scores. I was at around the 350/600 mark at the end of last year, and so was expecting a final score of about 2200 or so (allowing for discarding bad results), but I probably got slightly better over the year and managed to finish with 2504. Puzzles went less smoothly: I missed the first round because I was on holiday, and had a really bad round 6 where I chose very tough puzzles to attack, but I still squeaked over the top-100 line.

Deutsch

I started learning German this year. I think I began with Lingvist and Duolingo around May or so, and then started formal lessons in June. I have lessons once a week for 1.5 hours. I was originally hoping to take the A1 exam by the end of this year – this is currently planned for January, though my teacher’s quite confident that I won’t have issues. Nonetheless I do think I’ve made substantial progress.

I hoped to push on towards A2 actually. It’s unclear if I am at A2: while the CEFR descriptors don’t look too hard at that level (I feel I can satisfy almost all of the criteria), I’m aware there’s quite a lot of grammatical knowledge expected at A2 that I don’t know or am not confident about (e.g. irregular verbs in the simple past, passive voice). I’ve tried the Hören, Lesen and Schreiben (listening, reading and writing) sections of various exams aimed at the A2 level, like the telc A2 and UK GCSEs (I’m saving the Goethe papers for when I actually do that exam) and generally have been able to do quite well on them (i.e. above 80% on telc A2, Grade 9s on the GCSE components – maybe not writing), though admittedly Sprechen (speaking) is the part I’m most concerned about.

Travel and Exploration

Travel this year included four trips to Singapore (two were based around weddings). I also visited Zürich multiple times, Japan, Brussels, Boston, Stockholm and Palo Alto (though mostly for work). I did not travel as much as I wanted to outside of work, possibly because of prioritising other goals (I mainly worked on logic puzzles and German on the weekends) though actually they aren’t that incompatible (e.g. a flight is a good amount of time to work on learning German).

I really enjoyed the Japan trip (highlights included the Sankei-en Garden, Shin-Yokohama Raumen Museum and Ginza Tamai), and though the wedding trips were short (just 3-4 days in Singapore each time) I don’t regret them at all. Travelling is interesting: I don’t always look forward to the trips (especially the work ones) but almost always enjoy them. It’s possible there’s some sunk-cost fallacy here, though that’s mitigated by me not directly paying for the work trips in terms of money (still in terms of time, of course).

Financials

I generally looked at my portfolio quite a bit less after Q2, perhaps because I invested more resources in my new role at work and also because there generally was a fair bit of non-actionable stress.

Savings Rate

There are several ways to slice this: Savings Rate #4 from this post is what I normally use. In a UK context, define S_1 as workplace pension contributions, S_2 as individual pension contributions (e.g. SIPPs), S_3 as other savings (ISAs, taxable accounts), C as consumption and D as tax charges derived from taxable benefits. Furthermore, define \tau as the proportion one expects to keep, after tax, when withdrawing from one’s pensions. Then

SR = \dfrac{\tau(S_1 + S_2) + S_3}{\tau(S_1 + S_2) + S_3 + C + D}
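As a purely hypothetical worked example (the numbers here are invented for illustration, not my actual figures): with \tau = 0.8, S_1 + S_2 = 20, S_3 = 10, C = 15 and D = 1 (in arbitrary units),

\displaystyle SR = \dfrac{0.8 \times 20 + 10}{0.8 \times 20 + 10 + 15 + 1} = \dfrac{26}{42} \approx 62\%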

My actual figure for this year came out to roughly 55%, up two percentage points from last year (looking at it, my expenses did increase quite a lot from last year, but not by as much as the raise I received, so this makes sense). This is in a pretty reasonable spot.

Discipline in Spending

I’ll paste an extract from the essay I wrote on my birthday this year, that captures the major spending increases here:

  • Groceries has had an increasing trend year-on-year, going up by 50% from 2017 to 2018, and another 26% this year. Some of this is because I patronise M&S nowadays; I actually find the food tastes better. The “three meat or fish items for £10” deals were something I used to scorn in first year at university, calling them absurdly expensive; I now actually use them on occasion.
  • Gifts has increased to 2x over last year. I think this is natural, seeing as my financial position is a bit more stable now. I also definitely recognise that I don’t have infinite time to use this.
  • Travel has increased to 2.5x over last year, as part of a generally increasing trend. This is fine as long as it doesn’t increase exponentially from here. Some of this is also probably because I started thinking about flying slightly more premium cabins (premium economy or even business), and where this is not the case booking extra legroom seats. It seems I also can spend quite a bit when making “bleisure” trips, such as to Boston and Palo Alto. I think that’s okay, though – especially if I’m spending it on interesting experiences (and not tat).
  • Clothing has increased to 2x over last year. I have lumped shoes into clothing and that is certainly a part of the cost, but that definitely doesn’t explain the whole delta. I think it tends to be a large number of items as opposed to individually expensive ones – perhaps it’s a form of retail therapy? I think something actionable here might be to implement a one-in-one-out policy in 2020. £28.74 per week for clothes is concerning. This does cluster (e.g. I spent £250 in the Black Friday and Cyber Monday sales, including on a pair of Nudie raw denim and two pairs of Converse that should hopefully last) but it’s still more than I would expect. I’d view most articles of clothing apart from shoes to generally be luxuries at £28.74, and I certainly don’t feel like I’m buying one such treat per week.

There’s still some discipline, in that my savings rate has increased in spite of these trends, and in most cases (apart from the clothing one) the increases are justified and were made (mostly) thoughtfully. I have to deduct points for the clothing-as-retail-therapy thing though.

Learning German 4: German GCSEs – Lesen/Reading

By now, I’ve been learning German more seriously for half a year. It’s still difficult, especially speaking and writing. I tried the German GCSE exam for reading, and managed to get a grade 9 (the highest possible grade at GCSE) with 49 points out of 60. However, I’m still not confident. I find building sentences problematic if I want to talk about more complex things. Reading is slightly easier, but I won’t know if a sentence is wrong!

Mittlerweile lerne ich seit ein halb Jahr Deutsch. Es ist noch schwierig, besonders sprechen und schreiben. Ich hatte die Deutsch GCSE-Prüfüngen für Lesen probiert. Ich habe im Prüfüngen einen Neun (das ist die höchstmögliche Note) mit 49/60 Punkten erzielt, aber ich habe noch kein Konfidenz. Ich finde Satzbau problematisch, wenn ich komplexe Dinge sagen will. Lesen ist ein bisschen einfacher, aber ich werde nicht wissen ob ein Satz falsch ist. 


The paper I tried out specifically was the June 2018 exam for AQA GCSE German. This qualification is assessed over four papers, which attempt to test the four common language skills – Paper 1 tests listening, Paper 2 speaking, Paper 3 reading and Paper 4 writing. I found it interesting to see how an attempt is made to segment performance by limiting the extent to which the skills not being assessed are required. For example, in Papers 1 and 3 many questions ask for answers in English, and for the sections where answers are required in German full sentences generally aren’t needed, and linguistic errors generally aren’t penalised unless they affect clarity.

I tried out Paper 3 first, largely for practical reasons – I will probably do Paper 1 at some point, as the sound files are also available online. I can’t quite administer Paper 2 to myself, and for Paper 4 I’d need someone to evaluate my own writing, as I’m certainly not in a position to do that yet. I attempted the Higher paper, which is aimed at students seeking to get a grade between 4 and 9 (recall that passing grades range from 1 to 9). I wasn’t confident of getting a super high grade as I’ve only studied German on-and-off for around seven months or so; GCSE courses typically run for two years. Nonetheless, I managed to scrape it (the grade boundary for 9 was at 48 of 60, and I scored 49 – note that 48 is good enough, so I had one point of breathing room).

The paper is divided into three sections; the first involves reading texts and responding in English, the second is similar to the first but with responses in German, and the third involves translating a German passage into English. Generally, difficulty increases within each section (and resets at the start of the next), which led to a rather non-monotonic difficulty curve. Q8 at the end of Section A was probably a question aimed at top students, while Q9, the first question of Section B, was a crossover with the Foundation Tier, so aimed towards the lower end of the Higher Tier range.

I also found the translation task fairly straightforward – I knew all of the words apart from Bauernhof, and based on context (working with animals, starting the day early, hard work) along with guessing from its parts (Bau– for construction: it turns out Bauer means farmer, but I didn’t know that either; and –hof for yard, e.g. from Bahnhof – train station – or Friedhof – cemetery) I guessed it correctly as farm. My natural tendency to aim for precision seemed to work very well on this question.

Most of the marks lost came from holes in my German vocabulary. There were questions that asked for specific features of the text to be described in English, and these often effectively reduced to questions about the definitions of specific words. For example, there were questions that effectively asked what Dieb (thief), entspannen (relax), vermeiden (avoid) or lügen (lie, as in telling an untruth) meant – none of which I knew. I was able to figure out from context that the word in question was the desired answer, but had to guess when translating. Interestingly, if I had been asked to write the answer in German (being allowed to lift from the text) I would have scored these marks!

There were also a few marks which were lost because of carelessness. I’m aware that trying to pick out keywords and translating them is often a trap, yet I still ended up making some of these mistakes. For example, on the first question, which involved reading people’s descriptions of what they did to help the environment, I accidentally described someone who said Meine Eltern haben immer ihren Müll getrennt, aber … habe ich nichts gemacht as saying they separated their trash. I probably got too excited about knowing what getrennt meant – it’s often used when asking if a bill is to be split in restaurants – and wrote down “separating trash”. The correct answer is, of course, nothing – the separation was done by their parents, not them!

I’m not entirely sure this grade 9 is secure, being just two points away from an 8. That said, the 7-8 boundary is at 41, so I’m quite confident of at least an 8. This also means that if I were to take the entire GCSE qualification, an overall 9 probably wouldn’t happen, as I’d see reading as the language skill I’m currently most confident in in German. Students’ overall grades are calculated based on the sum of their marks in each of the four papers – while a candidate doesn’t need to get a 9 in each component, any marks below the 9 boundary in one paper must generally be offset by marks in other papers. I’m not sure what standard is expected, but I won’t be too surprised if I struggle to get a 5 in Schreiben (writing) or Sprechen (speaking).

Advents

December is a rather unusual month for me. My birthday is in December, as is Christmas and New Year’s Eve. By many metrics it’s also the end of often discretely-viewed periods of time (Q4, H2, a year – this year, a decade as well). It thus tends to lend itself particularly well to both introspection as well as frenzied rushes to complete things before the end of the relevant period.

The Biblical season of Advent begins four Sundays before Christmas and focuses on preparation for and awaiting the celebration of Jesus’s birth. This means that the day on which Advent begins varies depending on which day of the week Christmas falls on (if Christmas is itself a Monday, then four Sundays before that would be the 3rd of December; conversely if Christmas is itself a Sunday, four Sundays before that would be the 27th of November). I was aware of the season in terms of its observance in church, though I don’t think it manifested much outside that.

The name originates from the Latin adventus; more generally there is the word advent, which can be used to describe the arrival of something significant (e.g. “with the advent of refrigeration, food could be kept fresh for longer” reads well, while I’d find “the 20th of September marks my advent in the UK” strange unless the speaker is someone who made significant changes to the UK). The naming of the season is apt from a Christian perspective (for obvious reasons), and probably even without one (though that’s a separate discussion).

I first came across the concept of an advent calendar, which as the name suggests counts down the days to Christmas during Advent, on a virtual pets site called Neopets. On Neopets, this offered a gift and a small amount of the site’s currency every day during December (though, differently from most advent calendars, this included the days from December 26 to 31 as well). However, apart from the Neopets one I wasn’t aware of this being a tradition in Singapore (or elsewhere, for that matter).

The idea of gifts isn’t inherent to advent calendars (initial versions served very much as mechanisms to track the days to Christmas as well), though it is common, especially in commercial contexts. I saw many more of these when I came to the UK – perhaps this makes sense, as Neopets was started by British developers. Many commercial advent calendars feature small items in 24 or 25 sealed and opaque, but individually openable compartments. The idea here is that one tracks the days to Christmas by opening each compartment only when the relevant day arrives: on December 1, the door marked 1 is opened, and so on until the last day. There’s technically nothing stopping one from opening the later compartments early, but I guess one would be cheating oneself of the theorised anticipation and excitement in the build-up to Christmas.

I tend to associate these primarily with chocolate (I probably first encountered these in Sainsbury’s in my first year), though many other variations (beauty products, alcohol, toys etc.) exist. There are also purpose-built empty containers (presumably intended for people to buy for their kids or spouses) with 24 or 25 small drawers or pockets.

There is even a fairly popular advent calendar for programming problems (Advent of Code), which I’ve found useful to get a bit of algorithm/data structure practice. I’ll be doing that this year in Haskell (maintaining the option to switch to Java or Python if things get too difficult or I get too busy, especially later on in the series).

There’s probably something to be said about what constitutes a good countdown to Christmas. I’ll admit that, on first reaction, I find a number of the commercial ones out there a little awkward. However, the definition of ‘good’ is likely to be highly dependent on what one views the Christmas period to be about. For example, among other things I want to be able to evaluate and introspect on the year gone by, and also to be present when spending time with family and friends – and I find that, say, an alcohol-based calendar helps with neither end. However, such a calendar could be appropriate for someone who simply enjoys it, or who finds it gives them confidence to interact (or interact better) with family and friends.

Detecting Progress in Sequences

I often try to evaluate whether something that is difficult to measure directly and clearly has improved. For example, I might like to evaluate how my German or logic puzzle skills have changed over time. I could try German past exams or look at logic contest results – however, one problem with these is that there is a lot of noise. For example, for the logic contest results, a score could be artificially low because I had a bad day or the contest was unexpectedly hard; it could also be high because I made multiple lucky guesses on difficult puzzles. Thus, a single measurement is unlikely to be sufficiently reliable.

One solution is then to use one’s average score, or take other statistical summaries of multiple data points. However, we probably don’t want to consider all of the data points we have as equally important. For example, I ranked 155th in a Sudoku GP contest in late 2017 – if I’m trying to evaluate my skill level at the end of 2019, that’s probably not relevant.

We could pick a cut-off point (for example, the beginning of 2019, or the last ten contests), discard all of the data from before that, and then treat the remaining data points as equally important. This is often the basis of sliding window algorithms: if we say that we’re interested in one’s average score over the last ten contests, we can track this metric over time by considering, at each point, only the most recent ten data points. There are methods for calculating these metrics efficiently (taking time linear in the length of the data stream).
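As a minimal sketch (class and method names are just illustrative), such a sliding-window average can be maintained in constant time per new observation by keeping a running sum:

import java.util.ArrayDeque;
import java.util.Deque;

public class SlidingWindowAverage {
    private final int windowSize;
    private final Deque<Double> window = new ArrayDeque<>();
    private double sum = 0.0;

    public SlidingWindowAverage(int windowSize) {
        this.windowSize = windowSize;
    }

    // Records a new observation and returns the mean of the most recent
    // windowSize observations seen so far.
    public double record(double value) {
        window.addLast(value);
        sum += value;
        if (window.size() > windowSize) {
            sum -= window.removeFirst(); // the oldest observation falls out of the window
        }
        return sum / window.size();
    }
}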

Unfortunately, choosing a suitable window can be difficult – small windows can be vulnerable to noise, while large ones may fail to account for trends present within an individual window. As far as I know, this selection is more of an art than a science.

We can use more complicated approaches as well. Instead of picking a hard cut-off, where data from before the cut-off is irrelevant, we can instead treat data points as becoming gradually less relevant over time. A method that’s often used is exponential weighting: the most recent observation is given a weight of \alpha (for some 0 < \alpha < 1), the second most recent a weight of \alpha (1 - \alpha), the third \alpha (1 - \alpha)^2 and so on. As \alpha approaches 0, we approach a simple historical average; as \alpha approaches 1, we approach remembering just the most recent element. I’m not sure if the underlying assumption that events become exponentially less relevant over time is appropriate.

In spite of possibly sounding complex, this method does have computationally favourable properties. If we’re keeping track of a stream of data, we don’t actually need more than constant additional memory. It’s enough to keep just the previous reported average, because incorporating a fresh data point D into our metric can be done by S_{new} = \alpha D + (1 - \alpha) S_{old}.
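A minimal sketch of this (again, the names are just illustrative), bootstrapping with the first observation as discussed below:

public class ExponentiallyWeightedMovingAverage {
    private final double alpha;
    private Double current = null; // no observations recorded yet

    public ExponentiallyWeightedMovingAverage(double alpha) {
        this.alpha = alpha;
    }

    // S_new = alpha * D + (1 - alpha) * S_old; only the previous average is kept.
    public double record(double value) {
        current = (current == null) ? value : alpha * value + (1 - alpha) * current;
        return current;
    }
}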

There are some dangers here as well. The first challenge is bootstrapping; how should one pick the initial value of S? One could use the first observation, or perhaps an average of the first few observations if short-term divergence from reality is unacceptable.

I think there’s also a risk with massive outliers massively skewing the average (e.g. an API call which usually takes nanoseconds exceptionally taking an hour because of a system outage). This exists with any statistical technique, but if \alpha is small, our estimate will be “corrupted” by the exceptional data even after quite a few additional measurements. With the sliding window method, once the window has expired, the exceptional data point drops out.

In general, the methods we’ve covered assign weighting functions to the data points – the simple average just assigns the same weight to everything, the sliding window assigns the same weight to everything in the window and 0 to things outside the window, while the exponentially weighted moving average (EWMA) weights each point differently based on how recent it is.

As an extension, there are techniques for maintaining constant-size reservoirs of values that can be used to approximate more general summaries like order statistics, standard deviations or skewness. These often rely on holding a subset of the values being observed in memory. The selection mechanism for which values should be kept can be written to bias towards more recent measurements. In some ways, the calculation of our standard sliding-window based moving average can be implemented as a special case of this, where new entries are always included, and the oldest entry at each insertion is evicted. That said, we would probably not do this for just an average, as we can do that with constant memory (just remember the current average).
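As a minimal sketch, here is the uniform variant of such a reservoir (often called Algorithm R; the names here are just illustrative). A recency-biased scheme would change the acceptance probability below, and the stored sample can then be used to approximate quantiles or other summaries:

import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class UniformReservoir {
    private final int capacity;
    private final List<Double> values = new ArrayList<>();
    private final Random random = new Random();
    private long seen = 0;

    public UniformReservoir(int capacity) {
        this.capacity = capacity;
    }

    public void record(double value) {
        seen++;
        if (values.size() < capacity) {
            values.add(value);
            return;
        }
        // Keep the new value with probability capacity / seen, evicting a uniformly
        // random existing entry, so that each observation so far is retained with
        // equal probability.
        long slot = (long) (random.nextDouble() * seen);
        if (slot < capacity) {
            values.set((int) slot, value);
        }
    }

    public List<Double> snapshot() {
        return new ArrayList<>(values);
    }
}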

It’s not a particularly scientific or deterministic method, but in practice I find it useful to consider graphs with various transforms on top of them and draw conclusions based on that. I don’t have the sufficient statistical background or intuition to decide beforehand what would work well, unfortunately.

Lessons from a Purple Dragon: Exploring Game Search Spaces

I think I first played the first Spyro the Dragon game when I was about seven years old. I’m not sure what attracted me to the game. It might have been the idea of collecting bright, shiny gems, the rather gentle learning curve (unlike the Crash Bandicoot, Contra or Metal Slug games that I remember having access to on the PS1 at that time) or perhaps the exploration required to complete a 100% clear of the game.

More recently, a collection of all three games in remastered form was released on PC. I never actually got round to playing Spyro 3, for some reason, and the first two games were pretty enjoyable, so I purchased it. It was available on Steam for £34.99, though I searched elsewhere and managed to procure it for just £13.10.

In Spyro 1, the titular purple dragon has flame breath, is able to charge at enemies with his horns, and can jump and glide around (though he cannot fly). Levels in Spyro 1 generally feature a mostly linear path from a level’s starting point to its exit, with enemies and obstacles along the path; figuring out how to traverse this path can be interesting, in that one usually needs to understand how enemy attacks work, and/or how Spyro can navigate relevant obstacles, typically through judicious platforming.

However, reaching each level’s exit is rarely the end goal. Typically, levels contain a number of gems (and sometimes dragon eggs as well), and collecting these requires exploration beyond the main route. Although I certainly wasn’t familiar with the abstract concept at that time, when I first played these levels as a seven-year-old I would usually perform a depth-first search. I would head down an interesting path until it reached a dead end or looped back to somewhere I’d already explored, and then continue to a new path that hadn’t been searched yet.

Depth first search is naturally a reasonable way to explore a game world. Since players can’t teleport freely, breadth first search is generally not an option. Iterative deepening-style approaches could make sense if paths are uni-directional or if there are frequently very long routes that don’t need to be explored on the way from one area of interest to another, as it is usually possible to exit and re-enter levels from the beginning. However, this isn’t generally true in Spyro level designs. Thus, for Spyro I don’t think this core process has changed very much. What keeps Spyro interesting for me tends to not be about overcoming the obstacles on a given path (with some exceptions), but instead about identifying legitimate paths that can be plausibly explored. A simple example is in one of the areas in World 1, Town Square; there is an optional area accessed by gliding down a staircase and around a wall. This eluded me for a long time when I was seven, as clearing most of the levels (in the sense of reaching the exit, not 100% completion) can be done with gliding in mostly straight lines.

Typically, the game will give you hints that such hidden areas exist. Most obviously, each level has a number of gems to collect, and this number is known to the player – so shortfalls typically indicate that there are still hidden areas to be found (provided the player has meticulously collected everything in the areas they have already explored). It is also fairly common for some gems to be visible from one of the main paths, even if it is not made obvious how to reach them. With the Town Square example, one of the dragons in the area does actually give you a big hint on what to do, though I wasn’t aware of this as I skipped cutscenes then.

I usually had to resort to walkthroughs or online guides to find these answers in the past. I’m pretty confident it was quite different from the way I’d approach these problems now. I think I mostly just tried a ton of different approaches and saw what worked. However, I now use deductive reasoning more heavily – perhaps experience with participating in logic puzzle contests has helped. Instead of simply exploring paths that come to light and trying things that look encouraging, I tend to apply a perhaps more principled approach to identifying suitable routes to explore once the obvious methods have been tried. I would consider the usual platforming techniques that may be appropriate and test them against the level setup (e.g. Is there a high point I could glide from? Could there be some non-obvious platforms that are actually safe to land on? If there is/are supercharge ramps that increase Spyro’s running and jumping speed, can I use these? Can I combine them?). I’ve been able to solve quite a number of levels that I wasn’t able to in the past.

Some of this reminds me of a book I read earlier this year, Problem Solving 101: A Simple Book For Smart People by Ken Watanabe. The techniques introduced there for making decisions on how to approach problems are applicable to clearing a Spyro level – for various root causes/problems what one should do varies, and it is probably more expedient to think through what needs to be considered in a principled way than scurry off aggressively pursuing whatever solution comes to mind. I don’t think I’ve ever been faced with a level sufficiently difficult to need drawing a full-blown logic tree out, but even mapping out such a tree in my mind helps a lot.

Anatidaephobia: Ducks, Ponds and Probability

We discussed another interesting question at work, this time over Slack. This one seemed more mathematical than programming-based, though.

Four small ducks are in a large circular pond. They can be at any point in the circle, with equal probability. What is the probability that a diameter can be drawn so that all four ducks are in the same semicircle in the pond?

Naturally, there is a straightforward generalisation:

N small ducks are in a large circular pond. They can be at any point in the circle, with equal probability. What is the probability that we can fence off some sector of the pond which subtends an angle P, so that all N ducks are enclosed in the fenced area?

If I had to do this as part of an engineering problem, my first reaction would be to implement a Monte Carlo simulation. This would quickly reveal an answer of \frac{1}{2} for the first part, but in the second part things might become less obvious.
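A minimal sketch of such a simulation (the names are just illustrative): generate uniform angles for the ducks, and note that they all fit inside an arc covering a fraction f of the circumference exactly when the largest gap between adjacent ducks (taken cyclically) is at least 1 - f:

import java.util.Arrays;
import java.util.Random;

public class DuckSimulation {
    public static double estimate(int numDucks, double arcFraction, int trials, Random random) {
        int successes = 0;
        for (int t = 0; t < trials; t++) {
            double[] positions = new double[numDucks];
            for (int i = 0; i < numDucks; i++) {
                positions[i] = random.nextDouble(); // angle as a fraction of a full turn
            }
            Arrays.sort(positions);
            // Start with the wrap-around gap, then take the largest adjacent gap.
            double largestGap = 1.0 - positions[numDucks - 1] + positions[0];
            for (int i = 1; i < numDucks; i++) {
                largestGap = Math.max(largestGap, positions[i] - positions[i - 1]);
            }
            if (largestGap >= 1.0 - arcFraction) {
                successes++;
            }
        }
        return (double) successes / trials;
    }

    public static void main(String[] args) {
        // Four ducks, semicircle: should print a value close to 0.5.
        System.out.println(estimate(4, 0.5, 1_000_000, new Random()));
    }
}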

Usually for this kind of problem I tend to start by varying some of the parameters and trying to solve a simpler version of the problem. With one duck, drawing a suitable diameter is trivial; with two, drawing a diameter through one of the ducks is sufficient (since the second duck is on one side or the other – ‘small’ here means that we don’t consider the case where a duck lies exactly on the diameter). Going up to three, things get a little complicated; depending on the position of the first two ducks, the third duck can either be placed anywhere (if the first two ducks are at the same location, for example) or perhaps might be quite restricted (if the ducks are on almost opposite sides of the pond).

I then looked at other possible ways of simplifying the problem. For example, we don’t really care how far each duck is from the centre of the pond; the angles of the ducks around the centre suffice to determine whether a suitable diameter can be drawn. We can thus consider the ducks as uniformly random points on a looping one-dimensional continuum, such as the interval [0, 1).

Returning to three ducks, we can try to formalise the intuition that the range of positions allowed for the third duck varies depending on the positions of the first two ducks. Define the span of n ducks, S_n, to be the length of the smallest arc of the continuum that contains all of them. For base cases, S_1 = 0, and S_2 is just the smaller of the two distances between the first and second duck, so it is uniformly distributed between 0 and 0.5.

If we fix the value of S_2 = x, we can attempt to find the range of allowable third-duck positionings such that S_3 \leq n. Without loss of generality, suppose the two ducks are sitting at points 0 and x. Then, the lowest possible point the third duck can sit at would be x - n, and the highest possible point n (assuming of course n \geq x; if this is not the case then there are clearly no possible positions). The range of this interval is n - (x - n) = 2n - x, and the probability that a duck lands in that range is 2n - x. This of course makes the assumption that 2n - x is less than 1; if it is more than 1, then the probability would be 1; however, in our specific case since x > 0 and n = \frac{1}{2} we don’t have to worry about this.

This then reduces to a problem about conditional probabilities. We want to weight each possible value of S_2 based on how likely it is; the relative likelihood of each value is given by the probability density function. For S_2, we have

\displaystyle f_{S_2}(x) = \left \{ \begin{array}{lr} 2 & 0 \leq x \leq 0.5 \\ 0 & \text{otherwise} \end{array} \right.

Then, weighting based on this density function, we have:

\displaystyle P(S_3 \leq n) = \int_{x} P(S_3 \leq n | S_2 = x) f_{S_2}(x) \text{d}x = \int_{0}^{n} (2n-x) f_{S_2}(x) \text{d}x

If n \leq 0.5 then that will be equal to

\displaystyle \int_{0}^{n} (2n-x) (2) \text{d}x = \left[ 4nx - x^2 \right]_{0}^{n} = 3n^2

Thus we can find a diameter for \dfrac{3}{4} = 0.75 of cases with three ducks. If n > 0.5, then we need to make several refinements – most obviously, the upper bound of the integral stops at 0.5, as there’s no span with two ducks larger than that. Furthermore, there may be values of x where 2n - x > 1 in which case we need to clamp it down to 1. For simplicity, we focus on the case where n \leq 0.5 which is sufficient for our purposes.

Moving on, to find P(S_4 \leq n), we can similarly formulate

\displaystyle P(S_4 \leq n) = \int_{x} P(S_4 \leq n | S_3 = x) f_{S_3}(x) \text{d}x

f_{S_3}(x) may not seem immediately obvious, but we can rely on the CDF of S_3 which was calculated earlier – the probability density function is simply its derivative. Thus f_{S_3}(x) = 6x, and the conditional probability can be handled with the same logic that we used to go from S_2 to S_3, bearing in mind that the middle duck(s) aren’t important. Hence

\displaystyle P(S_4 \leq n) = \int_{0}^{n} (2n-x) (6x) \text{d}x = 6 \int_{0}^{n} 2nx - x^2 \text{d}x = 6 \left[ nx^2 - \frac{x^3}{3} \right]_{0}^{n} = 4n^3

and we have the answer to the original question, by substituting n = \frac{1}{2} we obtain an answer of \frac{1}{2}.

Interestingly, the CDFs of the various S_k seem to take on the form kn^{k-1}. We can prove this result by induction on k. This clearly holds for k \leq 4, based on our work so far – though note that k \leq 2 is enough for our proof. So we now take some arbitrary integer k, and assume that P(S_k \leq n) = kn^{k-1}. We need to show that P(S_{k+1} \leq n) = (k+1)n^k. The way we do this is very similar to the concrete steps above.

\displaystyle P(S_{k+1} \leq n) = \int_{x} P(S_{k+1} \leq n | S_k = x) f_{S_k}(x) \text{d}x

Since we’re assuming that P(S_k \leq n) = kn^{k-1}, f_{S_k}(x) = k(k-1)x^{k-2}. We can simplify out the conditional probability term, based on a similar argument on the conditional probability as before. Thus

\displaystyle P(S_{k+1} \leq n) = \int_{0}^{n} (2n-x) k(k-1)x^{k-2} \text{d}x = k(k-1) \int_{0}^{n} (2n-x) x^{k-2} \text{d}x

What follows is a bit of careful rewriting:

\displaystyle P(S_{k+1} \leq n) = k(k-1) \int_{0}^{n} 2n x^{k-2}- x^{k-1} \text{d}x = k(k-1) \left[ \frac{2n x^{k-1}}{k-1} - \frac{x^k}{k} \right]_{0}^{n}

And we can simplify this to:

\displaystyle P(S_{k+1} \leq n) = \left[ 2nk x^{k-1} - (k-1)x^k \right]_{0}^{n} = 2kn^k - (k-1)n^k = (k+1)n^k

which was what we expected. This also allows us to draw more specific conclusions – for example, for n = \frac{1}{2} the probability is simply \frac{k}{2^{k-1}} for k ducks.