# That One Thing (2018 Q3 Review)

Monomania refers to a mental disorder where one pathologically focuses on a single thing (naturally, at the expense of others, as one’s attention is limited). To some extent, I may have experienced this in my second year at Imperial, where for some reason I became fixated on my academic performance. I remember studying almost 100 hours per week when preparing for my exams then (typically achieved by a very strict 9am-11pm schedule every day for about five to six weeks). I’m not sure how I managed at the time. In terms of grades, I think one should aim at 60 + epsilon, 70 + epsilon or 100, and I chose the last of these.

Things have changed since; I now evaluate progress along a variety of facets. Even professionally, the metrics are a bit more multi-dimensional now. I’m not sure if it was the Palantir internship or something else, but in fourth year I made some decisions that clearly weren’t optimal as far as academic performance was concerned (working 15-20 hours a week). At that time, my main goals were to get an average of 90% and submit twenty pull requests as a part-time software engineer. After I graduated the metrics became writing papers and pull requests; the latter quickly changed to patents as, all other things held equal, I find developing two or three innovative systems far more interesting and probably more meaningful than fifty rudimentary changes.

In terms of development, I’ve certainly found my recent work to be challenging (which is a good thing!). I’ve been continuing my focus on performance work, though now that many easy wins have been taken the features are getting more complex. I’m also starting to have more clarity on the kind of work that motivates me (and discussing this with my lead and others); having such discussions feels relevant and good.

I presented the LDLK on finite traces paper at AAMAS’18 (and it seems I now have an h-index of 2). It’s unlikely there’ll be another conference paper for a while though, as I’m currently diverting my computer-science energies towards writing a journal paper which summarises and extends the three conference papers to some extent.

A highlight of the quarter was receiving the VCLA award for an outstanding Master’s thesis as well (Imperial article). In terms of raw computer science difficulty, MCMAS-Dynamic is probably the most complicated project I’ve delivered (there are a couple I’ve done since that I’m also proud of, but I’m pretty sure MCMAS still wins). Some recognition is always nice; too much ego-stroking can be unhealthy, but the odd bit is enjoyable.

In terms of finance, I didn’t really pay much attention to things this quarter. I was aware of the Turkish and Argentinian currency troubles as well as the US-China trade spat, but things on my end are still largely business-as-usual. In fact I thought this quarter was not going to be good, but looking at things it’s largely an effect of recency bias; while performance in September was indeed weak, July and August were actually very good.

The high expenses of Q2 have actually snowballed further in Q3, though most of this is explainable by six-month periodic expenses and booking travel for the end of the year. Discretionary expenses have actually fallen a fair amount, which is unsurprising as Q2 expenses included Stockholm travel as well as renewing my web server. Savings rate is down, though still on track to be in a good place this year.

Work on dealing with logical puzzles has continued, and I think I’ve managed to find a source of metrics; Logic Masters India has a good selection of past contests along with a rating system. Based on a few of these, my Puzzle skills are currently around the low to middle 600s, while my Sudoku skills are in the high 700s. Individual contests may vary in difficulty, so normalisation is used. Interestingly, this is done on two axes, comparing both one’s score relative to the top scorer and one’s ranking relative to all participants.

I’ve also started to take an interest in cryptic crosswords, though I’m certainly nowhere near as competitive at these; my general knowledge is not sufficiently broad. “Logical puzzles” are designed with the intention of being culture-neutral; cryptic crosswords, by contrast, often demand a wide vocabulary (which I think I have) and broad general knowledge (less so). It helps that there is a group at Palantir that attempts these recreationally; I’m not aware of a similar group for general logic puzzles.

When one is investigating a search problem, one typically needs to balance exploration (searching prospectively for additional solutions) and exploitation (locally refining good known solutions). When managing my own decisions, I’ve largely been pushing exploitation for quite some time. I first discovered an interest in mathematics when I was around six; while this isn’t quite computer science or software engineering, it was nearby. I started programming at 15, and when choosing my degree looked at CS, which I felt balanced my interests, skills and prospects well. When I was at university, I had the aforementioned monomaniacal episode; since then I’ve branched out a little.

I’ve been behind the times (the song was released in 2017, apparently), but I’ve been listening to Meant to Be (Bebe Rexha and Florida Georgia Line) a fair bit this quarter. I don’t particularly enjoy the vocals in the original, but it’s been covered well; this one is better polished, while this one has the approach I go for when I try to sing it!

It’s enjoyable for me to (try to) sing along to, including the high vocal line in the chorus; it quickly becomes a question of hitting the tenor Bb4 repeatedly. Listening to the song certainly does make me think of search space exploration problems, specifically genetic/evolutionary algorithms. The lyrics in the chorus sound very much like evaluating a fitness function on a candidate solution and then deciding whether to accept it:

So won’t you ride with me, ride with me?
See where this thing goes?
If it’s meant to be, it’ll be, it’ll be; baby if it’s meant to be.

The first verse has elements of not being too pressured with exploitation as well, and thematically echoes bits of Good Things Come To Those Who Wait which I looked at last quarter. The second verse reflects a degree of caution after being burned by poor approaches in the past. As I plan for what I might want to try to achieve (God willing) on a timescale of months or years, it’s important to not index too heavily on past experiences or initial reads of how a given path may turn out.

# Algorithmic Modelling – Alphabear 2

Alphabear 2 is a word game where players are given a two-dimensional grid of tiles. Some of these tiles show letters, along with timers; others are concealed and must first be exposed by using an adjacent letter in a word. Players use letter tiles (anywhere on the grid) to form a word. When a player makes a word, he/she scores points equivalent to the sum of the timers on the tiles that are used. The timers on all tiles that aren’t used then count down by 1; tiles that reach zero become unusable stone blocks.

Usually, this means that you want to form long words that incorporate tiles that have shorter timers; for example, there is a ten-letter word on the above board (PEDESTALED), but it doesn’t use the F. After playing this word, the timers on the remaining letters will count down by 1; the P will now have a timer of 2, the first E 10, and so on.

However, the game does have various mechanics that mean this isn’t always the best strategy. For example, one needs to ensure that there are actually enough letter tiles to work with, and that the distribution of letter tiles allows reasonable words to be formed. The game also has players choose bears for each mission which have special abilities such as awarding additional points for certain words. Actually, in hindsight FEASTED might have been a better choice than DEADLIFTS, as I was using a bear that gives a point bonus for words ending in ED.

In addition to scoring points by making words, players also receive bonus points for the largest fully cleared rectangular area. The game identifies this for you. I’m not sure how this is actually implemented, though there are large enough boards (15 by 15?) that it’s probably not the naive algorithm of generating and testing all rectangles; where $N$ is the width of the board, that would be $O(N^6)$ (there are $O(N^4)$ candidate rectangles, each taking $O(N^2)$ to verify). There is a DP-based approach where one keeps track of prefix-sum counts of stone blocks, making each check $O(1)$ and slashing this to $O(N^4)$, which would probably be enough. With more clever approaches it’s possible to knock the time complexity down to $O(N^2)$ – and that’s probably optimal, as any less would involve not actually reading some parts of the grid.
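A sketch of how the $O(N^2)$ bound can be achieved (a generic technique, not necessarily what the game actually does): treat each row as the base of a histogram of consecutive cleared cells, then find the largest rectangle under each histogram with a monotonic stack.

```python
# Largest fully cleared rectangle via the row-histogram technique.
# True = cleared cell, False = stone. Each row's histogram records how
# many consecutive cleared cells end at that row; the largest rectangle
# under a histogram is found with a monotonic stack in O(N), giving
# O(N^2) overall for an N x N board.

def largest_rectangle_in_histogram(heights):
    """Largest axis-aligned rectangle fitting under the histogram."""
    stack = []  # (start index, height), heights strictly increasing
    best = 0
    for i, h in enumerate(heights + [0]):  # trailing 0 flushes the stack
        start = i
        while stack and stack[-1][1] > h:
            idx, height = stack.pop()
            best = max(best, height * (i - idx))
            start = idx  # a shorter bar can extend back to here
        stack.append((start, h))
    return best

def largest_cleared_rectangle(grid):
    """Area of the largest rectangle of cleared (True) cells."""
    if not grid:
        return 0
    heights = [0] * len(grid[0])
    best = 0
    for row in grid:
        heights = [h + 1 if clear else 0 for h, clear in zip(heights, row)]
        best = max(best, largest_rectangle_in_histogram(heights))
    return best
```

The $O(N^4)$ DP mentioned above would instead precompute prefix sums of stone counts, so each candidate rectangle can be tested in constant time.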

Most of the time, this isn’t something to worry about too much. I’m able to clear most levels with no stones at all. However, if we need to take a stone, ideally it should be as far away from the center of the board as possible (note that only the largest rectangular area counts).

Defining optimal play in general is difficult, owing to the spatial reasoning required by the largest clear area bonus, uncertainty of tiles that are yet to be revealed and sheer breadth of options when faced with many open tiles.

Before starting a round, the player has to select bears to use. Optimal decisions here largely depend on the specific challenge being played. A bear that scores bonus points for six-letter words isn’t particularly useful on a tight board where one has only a few open tiles at a time; some bears add time to timed boards, which isn’t useful if the board has no time limit. Furthermore, players select up to three bears each round; these bears can have useful synergies. In particular, I like a combination of “summon three Zs” and “score triple points for words including JQXZ (not stacking, unfortunately)”; “summon ING” or “summon ED” goes quite well with these two as well. I’ve used these to complete a mission deemed “impossible” given the strength of my bears at the time.

We can perhaps begin by focusing more narrowly on optimal play of an Alphabear endgame (where all tiles have been revealed), ignoring special abilities bears may have for now and assuming that the board has a solution. Here is an example:

From this point on, there is no more uncertainty as to what letters we have to use. Interestingly, this board shows that a greedy algorithm where one tries to play the longest possible words doesn’t work. One could begin WORKSHOPPED, leaving RAII, but this is actually an unwinnable state. There are several two-turn solutions, like ROADWORKS + HIPPIE, or WORSHIPPED + KORAI. (Obviously, this is very high-level play; I think I played SKIPPER with this board. That leaves RODIAHWO, for which I’d probably have played something like RADIO + HOW, though HAIRDO + WO is better.)

Given a multiset of pairs of letters and timers $L = (l,t)^+$, we want to return a suitably ordered list of words $S = \left[ l^+ \right]$, that satisfies the following properties:

• legitimacy: every word $s \in S$ is included in the game’s dictionary.
• resourcing: words do not use tiles that aren’t in $L$.
• timing: for each pair $(l, t) \in L$, $l$ is used in a word before or on position $t$.

The above set of properties is not correctly specified without a refinement to deal with duplicate letters. There are several ways to deal with this; assuming low cardinality, one approach could be to recognise all duplicate instances of letters as separate letters, and suitably expand the dictionary to allow this. For example, given a board with an A on a timer of 3 plus two Bs and an E each on a timer of 1, our list is then $L = \left[ (A,3), (B_1,1), (B_2,1), (E, 1) \right]$, and the dictionary recognises both $B_1 E$ and $B_2 E$.

I don’t have a good way of finding an optimal solution. I suspect the problem is NP-complete; it feels like SAT or 3SAT may be reducible to it, though I haven’t worked through the details. Nonetheless, there are several search heuristics we can try. Clearly, on a given turn we must use everything which now has a timer of 1. Also, although this board is an example of why you don’t want to be greedy, in general it probably makes sense to do some kind of ordered backtracking, where one expands the most promising nodes first. I can imagine an A* search could actually work well, though finding an admissible heuristic could be hard.

There’s a fourth property we want as well:

• optimality: the score produced by the words is maximal.

Going back to our sample board, WORSHIPPED and KORAI will score one more point than ROADWORKS and HIPPIE, because there is one fewer timer tick. In general, we want to minimise the number of timer ticks by playing longer words upfront where possible, as this reduces the number of timers that are still ticking. Once one has devised a feasible solution, it may make sense to perform some iterative optimisation by seeing if it’s possible to bring letters forward. Of course, that approach may get us stuck in a local optimum. I could see some kind of iterative hill-climbing algorithm from the top-n admissible solutions (as found by our ordered backtracking) yielding decent results.
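Putting the properties together, the endgame can be brute-forced for small boards. The sketch below assumes a toy dictionary and board (both made up for illustration); it relies on the observation that the total score depends only on which turn each tile is played, so among identical letters it is always safe to spend the most urgent (smallest-timer) copy first.

```python
# Exhaustive search over the endgame model: tiles are (letter, timer)
# pairs; a played word scores the sum of the timers of the tiles it
# uses, and every unused tile's timer then counts down by one. A tile
# reaching timer zero turns to stone, which we treat as a dead end.

def try_use(tiles, word):
    """Form `word` from `tiles`, spending the smallest-timer copy of
    each letter; return (points, leftover tiles), or None if impossible."""
    pool = list(tiles)
    score = 0
    for letter in word:
        candidates = [tile for tile in pool if tile[0] == letter]
        if not candidates:
            return None
        tile = min(candidates, key=lambda t: t[1])
        pool.remove(tile)
        score += tile[1]
    return score, pool

def best_play(tiles, dictionary):
    """Maximum score and word order that clears every tile, or None."""
    if not tiles:
        return 0, []
    best = None
    for word in dictionary:
        outcome = try_use(tiles, word)
        if outcome is None:
            continue
        gained, left = outcome
        if any(timer == 1 for _, timer in left):
            continue  # an unused tile would turn to stone
        rest = best_play([(l, t - 1) for l, t in left], dictionary)
        if rest is not None and (best is None or gained + rest[0] > best[0]):
            best = (gained + rest[0], [word] + rest[1])
    return best
```

On the duplicate-letter example (an A on timer 3 plus two Bs and an E on timer 1), a toy dictionary containing EBB forces EBB to be played first; any shorter word strands a timer-1 tile.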

Unfortunately my vocabulary and anagram skills are far from good enough to handle the required complexity, especially under time pressure. Most of the time, successfully playing every letter on the board is enough to complete the mission’s objectives. Interestingly, it looks like the game wasn’t really programmed to deal properly with using every letter but still failing the mission: when it identifies the board as clear, it claims there’ll be an additional reward on the reward screen, yet no reward screen appears because the mission was failed.

# Rebooting Ourselves

It was a fairly rough week. Thus, over this weekend I thought about re-examining and resetting various procedures and things I do, as opposed to actively filling my time with activities. This reminded me of a Humans of New York post I came across several years ago:

“I’m rebooting my life entirely, again. It’s time for Andrew 5.0.”

In computer science, semantic versioning is a system for identifying different versions of software products in a standardised way. Under this system, a product’s version is an ordered triple of nonnegative integers written $major.minor.patch$. This is typically used for software, though the definition does not seem to require it. The system discusses changes in terms of a public application programming interface (API), which specifies what functionality the product offers.

In terms of software, a SQL database’s API could include the types of queries that may be processed. For MCMAS-Dynamic, the public API would include the details of the modelling language it is able to verify properties for. A non-software example could include a simple kettle; the public API could include how one adds or removes liquid, how one turns it on or off, and possibly alarms or other feedback mechanisms for when the liquid has boiled.

When a new version of a product is released, the version number is increased (in lexicographic terms). How this increase is done depends on the types of changes since the previous version:

• If the public API is ‘broken’, meaning that previously valid ways of using the API are no longer valid or accomplish different things, then the change requires a major version bump. To do this, the major version is incremented, and the minor and patch versions are reset to 0 (e.g. $7.5.1 \leadsto 8.0.0$). For example, if the kettle used to play an alarm when the liquid was boiled and this was a part of the public API, then the major version should be bumped if this functionality is removed. (In particular, if the API did not specify that the kettle had to play an alarm, this change might not warrant a major version bump.)
• If new features are added without ‘breaking’ the API or there are non-trivial internal improvements, the change leads to a minor version bump. The minor version is incremented and the patch version is reset to 0 (e.g. $7.5.1 \leadsto 7.6.0$). For example, if the new version of the kettle is substantially more energy-efficient, then that could be a minor version bump.
• If something was broken and has been fixed (without changing the public API), then the patch version should be incremented (e.g. $7.5.1 \leadsto 7.5.2$). For example, if the kettle previously rang an alarm twice when the liquid was boiled even though the kettle’s API specifies it should only ring once, then a change that makes the alarm only ring once could be part of a patch version bump.
• Multiple changes in aggregate should be evaluated in aggregate. In most cases, the largest magnitude of the constituent changes applies, though this is not a strict rule (consider one bugfix plus two changes, one of which breaks the API and another that reverts that break – in aggregate, that is a patch bump, not a major bump).
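The bump rules above are mechanical enough to capture in a few lines (a minimal sketch; real semantic-versioning tooling also handles pre-release and build-metadata identifiers, which are ignored here):

```python
# Apply a semantic-versioning bump to a (major, minor, patch) triple.

def bump(version, change):
    major, minor, patch = version
    if change == "major":  # breaking API change
        return (major + 1, 0, 0)
    if change == "minor":  # backward-compatible feature or improvement
        return (major, minor + 1, 0)
    if change == "patch":  # bugfix only
        return (major, minor, patch + 1)
    raise ValueError(f"unknown change type: {change}")
```

For instance, `bump((7, 5, 1), "major")` yields `(8, 0, 0)`, matching the examples above.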

Generally, making a more aggressive version bump than would be required for one’s change is acceptable, though it can confuse users. In particular, I tend to expect backward-incompatible changes when facing a major version bump; not finding any can be surprising and confusing.

The sentiment of the quote sounded like it was a major version bump. Defining an API for one’s life is obviously very difficult; even if one tries to use a lot of abstraction, I find that there are just too many facets. Rather loosely, our API might be split into a bunch of micro-services. We can treat physical needs and bodily functions like breathing and digestion as infrastructural. These services might then focus on the range of activities we involve ourselves in, or range of activities we could involve ourselves in. For me personally, this could include software engineering, getting along with other people, finance and budgeting, computer science, writing, puzzle solving and so on.

I would hope that we chew through a lot of patch versions as we continue to improve skills. Today’s release notes could include “Jeremy knows a little bit more about thread pools” (I read a chapter of Java Performance Tuning today over a post-lunch coffee). Minor versions would also be relatively common; this wouldn’t be from today specifically, but “Jeremy can vaguely attempt a Balance Loop puzzle” is probably pretty recent, extending the sudoku and other puzzle-solving features.

Depending on how we define the API, major version bumps could be very common. It is typically important to be relatively disciplined with breaking changes in an API in a software engineering context, as clients may often depend on one’s product in non-obvious ways. While others’ dependencies on us can indeed be non-obvious, I think one factor that changes things is that our systems seem to be ephemeral whilst program code is not. A codebase left as it is over years or centuries retains its capabilities (admittedly, finding suitable infrastructure to run the product might be an issue).

On the other hand, there is some evidence that we lose skills that are underutilised with time. I used to play Dance Dance Revolution quite a lot and could probably pass an arbitrary level 15 song as well as some 17s; I doubt I can do that today as I haven’t played in a few years. The ways we interact with others or manage our finances may change as our personal environments change as well; for example, if I moved away from the UK, I would not be able to allocate my investments the way I do now, because I would lose the ability to use ISAs (and probably most other forms of UK-specific tax-free savings). This may even happen without action (for example, if the UK government changes how ISAs or tax-free savings work) – though you could argue that declaring the use of specific vehicles in one’s API might be too specific and implementation-dependent (“I will use tax-advantaged accounts that are valid in my location appropriately” is maybe better).

In light of the above, I would be a bit laxer with what constitutes a ‘breaking change’, which pulls things back toward the subjectivity that I think semantic versioning was trying to avoid. I might regard myself as having major version 2 right now; I could consider everything up to and including my second year at Imperial as version 0, which is typically used in development to refer to a pre-release period of rapid iteration. Although National Service and/or moving to the UK for studies did bring about nontrivial changes, I really didn’t know what I wanted to do at that time (not that I know now, but there is at least a vague direction).

The Google internship was probably the turning point for version 1; that also coincided with several major changes with regard to finance, investment, philosophy and priorities. I’d place the second major change at graduating from Imperial and starting at Palantir; even then, I’d regard the first set of changes as more fundamental. The re-examination I did over the weekend is actually probably a patch release (or maybe a minor that improves several non-functional characteristics); it certainly doesn’t warrant a major version bump.

# Algorithmic Modelling – Codenames

Codenames is a board game designed by Vlaada Chvatil which tests communication. The game is played with two teams; each team splits itself into clue-givers and guessers.

There is a board of 25 cards, each with one (or more) words on them. Of these cards, 9 are owned by the first team, 8 owned by the second team, 7 are neutral and one is the ‘assassin’.

The clue-givers attempt to communicate to the guessers which cards are owned by their team. This is done by giving a one-word clue followed by the number of cards N corresponding to the clue. The rules establish some bounds on allowed associations (for example, ‘sounds like’ clues are not allowed).

I don’t know how much time went into selecting the words to appear on individual cards, but there are certainly many words in the deck that can be interpreted in many ways, which makes the game fun. For example, I can think of quite a number of ways to clue BOND (COVALENT for unambiguity; DURATION, ASSET or DEBT for the financial instrument; JAMES, AGENT or SEVEN or even at a stretch something like CASINO for the character; FRIEND or FRIENDSHIP; STREET or TUBE; CONTRACT, PROMISE or WORD for the idea of a promise; ADHERE or CONNECT, among others). Which one I might pick would depend on the other words on the board.

Guessers then identify up to N+1 cards they think their team owns. This can be based on the entire history of clues given, not just the previous clue. These cards are guessed one at a time; a team is only allowed to make further guesses if it correctly guesses one of its own cards. Teams may refrain from making all guesses.

The game ends either when all cards owned by a team are revealed (that team wins), or if a team reveals the assassin (that team loses).

In practice, clue-givers will need to consider all cards, not just the ones their team owns. The penalty for ‘failure’ varies; the assassin is an instant loss, and revealing a card owned by the other team is also bad. For example, with the cards APPLE, BANANA and GRAPE it would be very tempting to declare (FRUIT, 3) as a clue; yet, if KIWI is owned by the other team (or worse, is the assassin) it might be dangerous.

However, if the other team has already revealed that KIWI is theirs (e.g. with another clue, maybe SOUTH or BIRD) then the FRUIT clue becomes safe. Thus, pre-computing a strategy at the beginning of the game (e.g. by partitioning the eight or nine clues owned into several logical groups while avoiding other words) may not be optimal.

I tend to consider making an effort to win to be a key part of games, and thus what moves may be considered good also often depends on the state of the game. For example, if the opposing team has just one card left, I will give a broader clue that may have more tenuous links in a ‘do or die’ effort to finish on this turn. A less extreme example would be attempting weaker links if behind, or perhaps playing slightly more conservatively if ahead.

Mathematically modelling Codenames can be tough. We can try modelling the game state as a tuple $(O, E, N, x, h, f_O, f_E)$ where $O, E, N$ are sets of the remaining clue words our own team has, words the enemy team has, and the neutral words, $x$ is the assassin, $h = (t,w,n)^+$ is the history of the game, and $f_O$ and $f_E$ are preference functions of the form $h, w, n, O, E, N, x \rightarrow P$, returning an ordered list over $O \cup E \cup N \cup \lbrace x \rbrace$. Each history entry $(t, w, n)$ records that team $t$ gave the clue word $w$ and hinted that $n$ cards were relevant. This already abstracts two difficult parts away – ordering the clues, and determining how many of the top preferences to pick.

I think the preference functions need to take into account previous clues from both teams; if a previous clue could clearly have corresponded to two words and I picked the wrong one, it might be fairly high on the list even if unrelated. Similarly if this scenario happens to the opposing team, I would probably avoid the word that would blatantly be theirs.

The notion of degree of confidence also isn’t captured well by our ordering; going back to the fruit example, having a clear ordering would imply that clues for fewer than 4 items could reliably result in correct guesses (if we knew that APPLE was by a small margin the best pick, we could safely and confidently give (FRUIT, 1) to clue it, which seems wrong). One could imagine modelling these with selection probabilities, though things would be even more complex.

The above representation still seems computationally difficult to work with. The evolution of the preference function as one moves from one state to another is unclear (so in that representation it is fully general), making lookahead difficult. It doesn’t seem like a greedy choice is always best; for example, given eight clues that are reasonably divisible into four pairs, a clue for 4 words taking one element from each pair might be a bad idea if the other words can’t easily be linked.

We can proceed by simplifying the preference functions; a simple approach could be that for each $w, n$ each team has a persistent preference function that returns an ordered list of $O_0 \cup E_0 \cup N_0 \cup \lbrace x \rbrace$. The preference functions during the game then return the subsequence of that list that contains strictly the words still left in the game. This of course doesn’t take into account past information or clues from the other team.

With this, we can attempt to solve the game using standard techniques; assuming that the vocabulary of clues is bounded (let’s say it must be from the Linux dictionary), a game state is winning for the current team if there exists some word for which the preference function returns everything in $O$ as a prefix. A state is losing if all moves in that state produce a state which is winning for the opposing team.
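The winning condition of this simplified model can be sketched directly (the clue vocabulary and preference lists below are toy stand-ins, not derived from the real game):

```python
# A state is immediately winning for the team to move if some clue word's
# persistent preference list, restricted to the words still on the board,
# begins with exactly the words that team owns.

def is_immediately_winning(own, remaining, preferences):
    """own: our remaining words; remaining: all words still on the board;
    preferences: clue word -> ordered list over the original board.
    Returns a winning clue word, or None."""
    for clue, ordering in preferences.items():
        live = [w for w in ordering if w in remaining]
        if len(live) >= len(own) and set(live[:len(own)]) == set(own):
            return clue  # giving (clue, len(own)) clears us out this turn
    return None
```

This also captures the KIWI example from earlier: a FRUIT preference list headed by KIWI only becomes a winning clue once KIWI has left the board.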

We can generalise this to a simple probabilistic model as well; the preference ‘functions’ instead return a discrete random variable that indicates either guessing some word or passing. A simplified model could then, at the start of the game, assign weights to each candidate indicating the relative probability that candidate would be selected. These can be normalised to account for a pass probability. As words are removed from the board, the probability of the remaining words being selected scales (we can imagine a simple rejection-sampling where we discard words that aren’t actually on the board any more).
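Sampling a guess under this model is then straightforward (the weights below are hypothetical); renormalising over the words still on the board is equivalent to rejection-sampling and discarding off-board words:

```python
import random

# Draw one guess from the persistent weights, restricted to the words
# still on the board; the pass option keeps its original weight.

def sample_guess(weights, pass_weight, remaining, rng=random):
    """weights: word -> relative weight over the original board.
    Returns a word from `remaining`, or None for a pass."""
    live = {word: w for word, w in weights.items() if word in remaining}
    total = sum(live.values()) + pass_weight
    r = rng.uniform(0, total)
    for word, w in live.items():
        if r < w:
            return word
        r -= w
    return None  # pass
```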

The algorithm for the probability that we get a win from a given state is then slightly more complex (though I think still reasonably covered by standard techniques).

# Likelihood Estimation

One of my team-mates introduced another interesting question over lunch. Working through it reminded me of some of the statistics problems I struggled with at Imperial, specifically during Intelligent Data and Probabilistic Inference. It reinforced that in spite of scoring 90 for that course I’m still not confident I knew what I was doing then (or now).

Suppose you have a symmetric triangular distribution of unknown width and mean. Given that the distribution has yielded three independent samples of 2, 4 and 5, what is the expectation of the mean?

The triangular distribution can be used as an estimate for something which is known to be bounded both above and below, that also takes into account a value known to be the most probable. Some argue that it is the simplest distribution satisfying these (though one could argue that a cosine or some kind of truncated normal might be more applicable).

The instinctive answer I have is simply the mean of the samples, or $\frac{11}{3}$, though I was suspicious as probability and statistics often yield non-intuitive results.

The distribution is named as such because it has a triangular probability density function; because the laws of probability require the area under the density to be 1, specifying the minimum, maximum and mode is enough to uniquely identify it (in the symmetric case we consider here, the mode and the mean coincide). Supposing we have a minimum $a$, a maximum $b$ and a mode $c$, this yields a pdf of:

$f(x) =\begin{cases} \dfrac{2(x-a)}{(b-a)(c-a)} & a \leq x \leq c \\ \dfrac{2(b-x)}{(b-a)(b-c)} & c \leq x \leq b \\ 0 & \text{otherwise} \end{cases}$

We are only dealing with a symmetric case, so we can set $c = \frac{a+b}{2}$ which cleans things up a little:

$f(x) =\begin{cases} \dfrac{4(x-a)}{(b-a)^2} & a \leq x \leq c \\ \dfrac{4(b-x)}{(b-a)^2} & c \leq x \leq b \\ 0 & \text{otherwise} \end{cases}$
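As a quick numerical sanity check of the symmetric pdf (a generic sketch using only the formula above): it should integrate to 1 and peak at $c$ with height $2/(b-a)$.

```python
# Symmetric triangular pdf on [a, b], peaking at c = (a + b) / 2.

def triangular_pdf(x, a, b):
    c = (a + b) / 2
    if a <= x <= c:
        return 4 * (x - a) / (b - a) ** 2
    if c < x <= b:
        return 4 * (b - x) / (b - a) ** 2
    return 0.0

def integrate(f, lo, hi, n=100_000):
    """Midpoint rule; plenty for a piecewise-linear density."""
    width = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * width) for i in range(n)) * width
```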

Based on our observations that we have three samples of 2, 4 and 5, we can construct the likelihood that a given triangular distribution gave rise to this result. While the probability that a given distribution yields precisely these values is infinitesimally small, we can still compare distributions in relative terms using their density functions. We can write this as

$P(2,4,5;a,b) = f(2)f(4)f(5)$

Expanding this term will depend on where exactly 2, 4 and 5 fall in our triangle. Let’s work out the most promising case (where 2 falls on the left of $c$ while 4 and 5 fall on its right); the rest are left as an exercise to the reader. In this case, we have

$P(2,4,5;a,b) = \dfrac{4(2-a)}{(b-a)^2} \times \dfrac{4(b-4)}{(b-a)^2} \times \dfrac{4(b-5)}{(b-a)^2} = \dfrac{64(2-a)(b-4)(b-5)}{(b-a)^6}$

At this point, we notice the original question needs a bit more specification: we aren’t given the distribution of possible values of $a$ and $b$. One way of getting around this is just to pick a uniform distribution; however, a uniform distribution isn’t well-defined over the whole real line. We can for now simply find the maximum likelihood estimate for $a$ and $b$.

Alternatively, if we give prior probability distributions for $a$ and $b$, we could also use the samples as information to get a posterior distribution. Usually we would pick a conjugate prior distribution that wouldn’t fundamentally change even when accounting for the sample; I didn’t find one for the triangular distribution, though.

If we want to find the most likely distribution, we seek to find an extreme point; this can be done by taking partial derivatives (and this expression actually lines up quite well with the quotient rule). There is a fairly standard ‘trick’ for handling these, though; since the logarithm is a strictly increasing function, we compute the log-likelihood instead. The maximum of the logarithm will also be the maximum of the original function. Using the laws of logarithms, we get something a lot more tractable:

$\log P(2,4,5;a,b) = \log 64 - 6 \log(b-a) + \log(2-a) + \log(b-4) + \log(b-5)$

Computing the partial derivatives is then straightforward. We then set them to zero, and solve the resulting equations simultaneously; this yields

$a, b = \left( \dfrac{1}{3} \left( 4 + \sqrt{\frac{2}{5}} \right), \dfrac{1}{3} \left( 16 - \sqrt{10} \right) \right), \left( \dfrac{1}{3} \left( 4 - \sqrt{\frac{2}{5}} \right), \dfrac{1}{3} \left( 16 + \sqrt{10} \right) \right)$

We’ll want the second pair of solutions; the first actually has $b \approx 4.279$ which is no good (we need $b > 5$). Interestingly, the mean of that triangular distribution is then $\dfrac{2}{15} \left(25 + \sqrt{10} \right) \approx 3.75497$ which is not quite $11/3$.

The log-likelihood we get with these values of $a$ and $b$ is about $-4.74053$. By comparison, if we look at the family of symmetric distributions with $a = \frac{11}{3} - \alpha$ and $b = \frac{11}{3} + \alpha$ (centred on the sample mean), the best we can achieve is about $-4.7473$.
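These numbers can be sanity-checked with a few lines of Python – a quick sketch, hard-coding the samples 2, 4 and 5 and the symmetric triangular density from above (the function and variable names here are my own):

```python
import math

def log_likelihood(a, b, samples=(2, 4, 5)):
    """Log-likelihood of the samples under a symmetric triangular
    distribution on [a, b] with mode c = (a + b) / 2."""
    c = (a + b) / 2
    total = 0.0
    for x in samples:
        if a < x <= c:
            total += math.log(4 * (x - a) / (b - a) ** 2)
        elif c < x < b:
            total += math.log(4 * (b - x) / (b - a) ** 2)
        else:
            return float("-inf")  # sample outside the support
    return total

# The second pair of critical points from the derivation.
a_hat = (4 - math.sqrt(2 / 5)) / 3
b_hat = (16 + math.sqrt(10)) / 3
print(round(log_likelihood(a_hat, b_hat), 5))  # -4.74053

# Best member of the symmetric family centred on the sample mean 11/3,
# found by a coarse grid search over the half-width alpha.
best = max(log_likelihood(11 / 3 - t, 11 / 3 + t)
           for t in (x / 10000 for x in range(13400, 40000)))
print(round(best, 4))  # -4.7473
```

The grid search is crude but sufficient here, since the log-likelihood is smooth in $\alpha$ and we only need four decimal places.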

# On Deckbuilding Games and Slay the Spire

Deckbuilding is a genre of games where players start with a (usually) standard deck of cards and then improve it over the course of the game. This can involve acquiring new cards that are usually more powerful than the initial set of cards and/or synergise in better ways, upgrading existing cards, or even removing cards that are weaker or less relevant so that one is more likely to draw the (better) remaining cards.

These games typically offer some replayability because the sequence of decisions one may have on how one can improve one’s deck differs from game to game; what it means for a deck to be successful can also vary depending on what the other players are doing.

Slay the Spire is a player-versus-environment RPG where combat is driven by a deckbuilding mechanic; players start with simple offensive (Strike) and defensive (Defend) moves plus possibly a few class-specific cards. The combat is turn-based. On a player’s turn, the player draws 5 cards. Cards have an energy cost, and players initially have three energy per turn to play as many cards from their hand as they want. Of course, there are cards and other mechanics that can be leveraged to obtain additional card draws or energy. Players are also shown their opponents’ next moves, which allows for some degree of planning; a simple example of this could be deciding to eliminate a specific monster in a group, because it is about to attack.

In terms of gameplay, the start of the game does feel relatively rudimentary as the player always has the same or a similar deck. However, each time a player defeats a group of monsters, he is given the option (but not the obligation) to add one of three cards to his deck.

The options that come up in the first few rounds will typically shape how the deck develops. For example, while most cards can be upgraded once only, there is a specific card called Searing Blow that may be upgraded multiple times; eventually, it can develop into a card that costs 2 energy yet deals far more damage than Bludgeon, a 3-cost card that deals pure single-target damage. It is a viable strategy to build a deck around this, eliminating other cards from the deck so that one draws it on every turn; there are cards like Double Tap (which, for 1 cost, doubles another attack) that can further enhance this type of deck.

Conversely, I frequently play in a style which relies on combining cards that give a constant strength bonus to all attacks along with cards that attack multiple times for small amounts of energy. That said, some degree of flexibility is important, as the choices given to the player will be different on different plays; being wedded to a single strategy can be dangerous, as specific key cards may not be available.

Players can also acquire relics which further influence game mechanics; these can simply provide boosts (such as “every 10th attack does double damage”), make certain styles of gameplay more attractive (“the third Attack card you play each turn boosts your Dexterity by 1 for the rest of the combat”) or even significantly change game mechanics (“retain your hand rather than discarding it” or “if you empty your hand, draw 1 card”). These are random (and apart from at the shop / rewards from winning boss battles, players aren’t really given options here) so, as before, planning on specific relics becoming available is also typically not a good solution.

Winning the game unlocks a new-game-plus mode called ascension; players start at Ascension 1, which increases the rate at which elite enemies spawn. Winning the game on Ascension 1 unlocks Ascension 2, which applies the restrictions of Ascension 1 and makes normal enemies do more damage. This increases incrementally up to Ascension 20. I’ve got up to 16 on one of the characters. At high levels there’s not very much room for error; I have had a 10-game win streak without Ascensions, but struggle with high Ascensions quite a bit. I think I tend to force my deck to fit a certain mould perhaps too early on; this can get scuppered if I am subsequently unlucky.

Another challenge a player can take on, as with many other games, is speed-running, or attempting to win quickly. Playing fast (as shown in the screenshot above) limits one’s time for decision making. For example, I normally plan my entire route through the map for each level at the beginning of the level, but that of course takes a bit of time. A normal, more relaxed game can easily take 30-40 minutes, especially if I’m playing with a relatively less aggressive build.

I’ve played the game for about thirty hours or so. It has been enjoyable, and the Defect character (an automaton that manipulates a queue of orbs) is particularly fun, if difficult to use. In terms of educational value I think the game demands a decent intuition of probability, as well as understanding how cards may work together. There are some synergies that are explicit or clear, but others are often less obvious and require some thought/experimentation.

# Travelling Songs

For a recent Palantir event, I was asked to state a song that I enjoy listening to whilst travelling. I had a few candidates in mind, but went with This Town by Niall Horan. I didn’t have to justify my choice – but I had several reasons in mind.

Most of my travel for work is done solo. This makes sense for going on-site, but even for larger company events and conventions I find that I end up travelling on my own fairly often. Also, quite a lot of my travel requires flying; even when it is not long-haul, the overhead of going through airports and ensuring that I am on time adds quite a bit of travel time. I thus find that I have a fair chunk of time to think and reflect on how things have been going, and so I picked a quieter, more downbeat song which gives me enough space to maintain a thread of reflective thought. Interestingly, the other songs I had in mind were unlikely to be as conducive to that!

I should count myself fortunate to have travelled a fair amount for work (that said, I’m sure there must be people who are sick of excessive travel). One “class” of songs that might be relevant for some would be those evoking a sense of adventure and anticipation; for me personally, on work trips that is a bit less relevant. Many trips are to places I’ve already been to (I guess the company only has a few major offices) – in the past three years I think I have visited Palo Alto about eight times. Of course, that may change with university on-site recruiting picking up; I first visited both Denmark and Croatia on these trips, and I did enjoy those travels a fair bit.

I think of myself as fairly ambitious. There are often more things I would like to do than I have the time to execute on (and hence, metaphorically, “words I never got to say the first time around”). In some ways, this is especially true of my travels to Singapore, as I’m there only about three to four weeks a year out of 52. Separately, work trips often involve very intense iteration (that tends to be what motivates the trip in the first place), leaving little time to explore the place I’m in.

Thinking about it, the song does also mention the act of travelling (“drive highways and byways to be there with you”). I didn’t pick the song for that reason, though – identifying work as the thing one is always chasing is dangerous.

The other songs I had in mind included Fury of the Storm by DragonForce; that follows the “sense of adventure” theme. That song also brings back memories of my first internship at Palantir, which involved a decent amount of travel. I’m not sure I could imagine listening to metal for ten straight hours or so, though. I also considered James Arthur’s Back from the Edge which I featured in a review about a year and a half ago; travel certainly doesn’t provide a clean or blank slate, but it does provide a temporal boundary for how I categorise things, I think.

During the event, nine other engineers volunteered their choices. There was a fairly broad range of picks, ranging from rather bubbly dance pop (Elastic Heart by Sia) to harder rock picks, and a couple of alternative choices I was completely unfamiliar with. It was nice to see this variety, though.

# Algorithmic Problem-Solving: On Pigs and Poison

A teammate asked me a computer science / algorithm question over lunch. It took me somewhere between 30 and 40 minutes of thought and discussion with teammates to solve the problem, inclusive of writing about 20 lines of Java; I can see it being a kind of ‘brain teaser’ interview question (and thus generally best avoided). That said, genuinely not having heard this problem before, I think it was interesting to see the thought process I went through to get to the final answer.

You have 1,000 buckets of water, 1 of which is poisoned. It is known that the poison if ingested will kill a pig within 15 minutes. You have one hour to determine which bucket contains the poison. What is the minimum number of pigs that you need, and how would you do it? (Eventually, of course, you want to generalise your answer.)

After some initial “lateral thinking”-style questions (does the poison only kill pigs? is the death timing exact?), I determined that it was intended to be a genuine algorithms problem. As a computer scientist, I then thought of binary encodings. This gives a quick solution to the case where you have just 15 minutes to determine which bucket is poisoned – number the buckets from zero and convert each number to a 10-bit binary string (so bucket 0 is 0000000000, bucket 1 is 0000000001, and so on), then give bucket 0 to no pigs, bucket 1 to pig number 10, bucket 2 to pig number 9, bucket 3 to pig numbers 9 and 10, and so on; the state of the pigs 15 minutes later allows you to determine the poisoned bucket. If we used this method and only pigs 5 and 7 died, then the poisoned bucket has binary representation 0000101000, which corresponds to bucket number 40. In any case, we know the answer is bounded above by 10.
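The encoding scheme is short enough to sketch directly – here in Python rather than the Java I originally used, with buckets numbered from zero and pig 10 covering the least significant bit:

```python
def pigs_for_bucket(bucket: int, num_pigs: int = 10) -> set[int]:
    """Which pigs drink from this bucket? Pig `num_pigs` is bit 0,
    pig `num_pigs - 1` is bit 1, and so on."""
    return {num_pigs - bit for bit in range(num_pigs)
            if bucket & (1 << bit)}

def poisoned_bucket(dead_pigs: set[int], num_pigs: int = 10) -> int:
    """Recover the poisoned bucket's number from the set of dead pigs."""
    return sum(1 << (num_pigs - pig) for pig in dead_pigs)

print(poisoned_bucket({5, 7}))  # 40, matching binary 0000101000
```

The two functions are inverses of each other, which is exactly what makes the single-round scheme work: distinct buckets induce distinct sets of dead pigs.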

I then considered ways of extending this solution given that we can run four trials instead of one. I did consider partial experiments, but didn’t see a way this would be helpful. One quick possibility we came up with was to linearly split the buckets among a number of pigs to cut down the search space – though this risks losing one pig at each iteration. Still, we determined that 6 pigs would work:

• On trial 1, you have 1000 buckets and 6 pigs – divide into 7 groups of at most 143.
• On trial 2, you have 143 buckets and possibly 5 pigs – divide into 6 groups of at most 24.
• On trial 3, you have 24 buckets and possibly 4 pigs – divide into 5 groups of at most 5.
• On trial 4, you have 5 buckets and possibly 3 pigs – the binary method can distinguish 8 buckets.
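The schedule above can be sketched as a short loop – each round we form one group per remaining pig plus a ‘no pig’ group, and in the worst case we lose one pig per round (variable names are my own):

```python
import math

buckets, pigs = 1000, 6
for trial in range(1, 4):
    # One group per pig, plus one group given to no pigs.
    size = math.ceil(buckets / (pigs + 1))
    print(f"Trial {trial}: {buckets} buckets, {pigs} pigs "
          f"-> groups of at most {size}")
    # Worst case: the poisoned group belonged to a pig, so we lose it.
    buckets, pigs = size, pigs - 1

# Final trial: 5 buckets and at least 3 pigs remain; the binary
# method distinguishes up to 2**3 = 8 buckets, so we are done.
print(buckets, 2 ** pigs)  # 5 8
```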

Yet, this same method wouldn’t work with 5 pigs. We were ready to settle on 6 as the answer, but the question asker revealed that it was possible to do better than 6. The idea of imbalanced groups then came up – the ‘no risk’ group of buckets should be larger, because in that case we could guarantee that we would still have all our pigs if we needed to investigate that group.

I guess I haven’t done algorithm interviews for a good while, and it took me until this point to think of setting up a recurrence relation. Define $W(n, p)$ as the maximum number of buckets distinguishable with $n$ trials and $p$ pigs remaining. I then set up this relation:

$W(n, p) = \begin{cases} 2^p & n = 1 \\ 1 & p = 0 \\ W(n-1, p) + pW(n-1,p-1) & \text{otherwise} \end{cases}$

The last case comes about because we can safely allocate a group of $W(n-1, p)$ buckets to be given to no pigs, since with one fewer trial we can distinguish between them. For each of the pigs present, we need to prepare to continue the investigation with one fewer trial and one fewer pig, hence $pW(n-1,p-1)$.
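A memoised implementation of this recurrence takes only a few lines – a sketch in Python rather than the Java I used at the time:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def w_linear(n: int, p: int) -> int:
    """Buckets distinguishable with n trials and p pigs, risking
    at most one pig per round: a no-pig group of W(n-1, p) buckets,
    plus one group per pig."""
    if n == 1:
        return 2 ** p       # one trial left: fall back to binary coding
    if p == 0:
        return 1            # no pigs: only one bucket can remain suspect
    return w_linear(n - 1, p) + p * w_linear(n - 1, p - 1)

print(w_linear(4, 5))  # falls short of the 1,000 buckets we need
print(w_linear(4, 6))  # comfortably covers 1,000, matching the 6-pig schedule
```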

This is more efficient, but doesn’t quite get us there yet. I noted that $W(4, 5)$ came out to 992, just short of 1,000, which established that strategies based on risking one pig in every round apart from the last would be insufficient. Thinking a little more about it, I realised that in the extreme case, you could always have a special bucket each round that you gave to all pigs – if they all died that round, we would know that bucket was poisoned. However, you couldn’t have more than one of these per round – if the pigs all died, you wouldn’t know which of those buckets was the cause.

More generally, you could split the groups of buckets into groups that are given to zero, one, two… up to $p$ pigs, where $p$ is the number of pigs still left. The groups given to smaller numbers of pigs could likely be larger, to account for the additional resources available in later rounds. The following recurrence relation then drops out:

$W(n, p) = \begin{cases} 2^p & n = 1 \\ 1 & p = 0 \\ \sum_{i=0}^p \binom{p}{i} W(n-1, p-i) & \text{otherwise} \end{cases}$

This actually has a closed-form solution. I didn’t notice it when looking at the recurrence, though it became apparent when I coded up a program to calculate the required $W(4, 5) = 3125$. I was surprised to see how much additional discriminatory power five pigs had, though four wouldn’t be sufficient. Interestingly, five pigs and three periods were sufficient, since $W(3, 5) = 4^5 = 1024 \geq 1000$. The general closed-form solution is simply $W(n, p) = (n + 1)^p$, and our argument provides a method for actually identifying the relevant bucket.
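The full recurrence is just as easy to memoise, and a quick sweep confirms the closed form (again a Python sketch of the program described above):

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def w(n: int, p: int) -> int:
    """Maximum number of buckets distinguishable with n trials
    and p pigs remaining."""
    if n == 1:
        return 2 ** p
    if p == 0:
        return 1
    # Split the buckets into groups given to 0, 1, ..., p pigs each;
    # choosing which i pigs drink gives the binomial coefficient.
    return sum(comb(p, i) * w(n - 1, p - i) for i in range(p + 1))

print(w(4, 5))                          # 3125
print(all(w(n, p) == (n + 1) ** p       # matches the closed form (n+1)^p
          for n in range(1, 8) for p in range(8)))
```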

Finally, an interesting challenge here was to establish that it wasn’t possible to achieve this with four pigs (so that 5 would be the minimum). One way to do this was to reason in terms of information theory – a pig either survives all the way or dies after one of the $n$ rounds, and once dead it stays dead. That gives $n + 1$ possible outcomes per pig, and with $p$ pigs at most $(n+1)^p$ distinguishable histories, so we can’t do better than $(n+1)^p$ buckets. Clearly $5^4 = 625 < 1000$, so four pigs isn’t enough; this bound also matches the value obtained by our recurrence-based method, suggesting that that is indeed optimal.

# Tactics and Strategy (2018 Q2 Review)

I generally interpret strategy to refer to thinking about broader goals and general approaches for achieving these goals; conversely I think of tactics as finer-grained methods for effectively completing smaller tasks (that hopefully contribute to fulfilling one’s strategic goals). As a software engineer who works with databases, strategic concerns could be “the database should be able to serve requests quickly”, or “you shouldn’t be able to shoot yourself in the foot too easily”; tactics could involve using B-trees or clever data structures, or being careful about what APIs one exposes respectively.

For individual endeavours and also at work, I often need to play both a strategist and a tactician. I’m responsible both for determining what to seek at a high level and for implementing the required steps. There’s some tension in devoting resources towards getting better at either of these; I’d say the skills are certainly distinct (though not independent; good strategic thinking requires knowledge of what may be tactically plausible, and one should implement suitable tactics to optimise for outcomes that one strategically favours).

Strategically, things have been a bit foggy this quarter. To some extent there has been progress in terms of general quality as a developer, at least by some metrics. Getting better technically is good, though it seems to have been more of a focus than I remember intending it to be.

I have to some extent been struggling to find satisfaction in things, as well (there was a very relevant series of sermons on contentment in church – but practical application is often much harder than theory). I think some of this stemmed from a shift away from working on ‘shiny’ features that led to patent applications or other forms of explicit external recognition. I even forgot about some features that were clearly valued by other teams or people until they reminded me that I wrote them when discussing what I’d been working on in general!

Outside of work, also, I need to find strategic direction (there hasn’t been very much). With regard to computer science, I need to think where I go from here. AAMAS’18 in Stockholm is coming up and I’ll be presenting there about model checking LDLK on finite traces. This paper was very challenging to write; differently from the first two, I actually had to develop a substantial amount of novel content. I found myself guilty of excessive handwaving in the original proof presented in the thesis – there was a detail which I claimed to be trivial that turned out to require a full column of argument! I also have one more paper to write. I’ve also not been doing as much independent study of both computer science and software engineering as I would like.

In terms of finance, one of the good things about index investing is that there isn’t that much to do once the system is set up. There’s even less if you’ve automated some of the workflows (as I have)! After that, much return is subject to the random walk of the markets – with the (I’d say justified) belief, of course, that that walk trends upwards. The main ‘action’ required then is to stay the course, which requires patience and persistence.

Markets have been a bit of a mixed bag. Equities have gone up a lot apart from emerging markets, but a good chunk of this is likely to be currency effects as the pound fell hard. The usual table follows:

(Disclaimer: I hold the Fidelity World Index Acc, Vanguard LS80, iShares EM index, iShares property index, BTC, GBP and USD in various quantities, among other things.)

Spending has exploded (well, relatively) this quarter. Some of this is due to periodic expenses with a period greater than three months falling due this quarter – renewing the servers and domain name for this website, and brokerage fees. There is also some (I’d say permissible) lifestyle inflation – frugality or cheapness to the detriment of daily happiness is usually not something worth doing unless one is forced into that position.

My budget also has a category called ‘Learning and Development’, which includes books for reading and fees for various educational activities (such as conferences, lessons, exams and the like). This swelled in Q2. I haven’t been reading substantially more. Most of this was the fees for AAMAS’18, FLoC’18 (more logic!) and music lessons.

I’ve made a conscious decision to spend more time outside of work on various recreational pursuits (even if the precise details of these aren’t clearly directed). I’ve gotten faster and better at solving logic puzzles and Sudoku; it’s difficult to come up with clear performance metrics, but I’m usually able to score in the 70th percentile or so in online contests, and can occasionally squeak into the 80s in broader contests (where the average skill level is slightly lower). I know that I was in the 20s or 30s when I first started trying these. I ranked 42 of 157 in the UK Sudoku Championship 2018, and less impressively 50 of 141 in the UK Puzzle Championship 2018.

I’ve continued writing here – a bit less frequently than I would like, admittedly. I also had my first vocal lesson in about three years or so – it’s good to be back. To some extent I strive for quality in what I do, even in recreation (I told my teacher “I enjoy sounding good” – I didn’t go as far as “I only enjoy it if I’m sounding good”, though that’s probably more true than I like to admit…)

I traditionally feature a song I’ve listened to a lot over the quarter in these reviews; I’ve done this for about five years now, and it can be interesting to see what I was listening to several years ago. I liked most of the songs then, and still do now; I tend to select quite heavily for a meaningful message or idea, and I like to think I’m quite resistant to faddish qualities in songs.

I first heard this song on a plane (like many others!) when flying from London to Singapore for a one-week vacation in May. I have listened to a couple of songs by the British boyband The Wanted before. One of their heavier pieces, Warzone (about leaving an abusive relationship) featured in my 2014 Q2 review, where I decided the 90- and 100-hour weeks I worked then were a bit too much of a cost. I also quite like Chasing the Sun, and a couple of their album-only tracks as well.

Unfortunately, the band dissolved a few years ago. They weren’t quite as successful as One Direction, so as far as I know only one member, Nathan Sykes, has continued with a solo career. I didn’t actually know this until I came across his new album Unfinished Business (yes, quite apt). The opening track is titled Good Things Come To Those Who Wait. It reminded me a fair bit of James Arthur’s Back From The Edge both aurally and thematically.

The second verse is relevant to some issues I’ve been thinking about this quarter:

Some people run and fall
They don’t even walk at all
There’s nothing to prove, just to feel you exist

It’s important to balance aggressive implementation (“running”) with careful evaluation of whether what one’s implementing is relevant to what one wants to do – especially if the path ahead is unclear. In some cases, it can be better to move more slowly and carefully.

The last line sounds odd, but in context it is a reason why the speaker takes his time with things. Conversely, feeling a need to prove oneself to others is understandable; I do fall into that trap sometimes, though I know it can be a dangerous thing to do.

More generally, the idea that good things come to those who wait can be contentious. I think it depends on what kind of waiting is involved – waiting to strike when opportunity arises is great (which I think is the point of the song – “I prefer to stick when the others would twist”), but idle, muddled waiting might not work so well.

As far as the album is concerned, I also found I Can’t Be Mad to be excellent. There were also a number of other good songs; I think it was a reasonably strong offering.

# Scoring Points

I think it would be reasonable for me to describe myself as competitive. Through my years at university I had a goal of scoring a GPA of 90 percent (where a first-class degree is awarded at 70 percent); I added a clause of working a part-time job later on. I’ve enjoyed participating in a variety of contests from a young age, both in terms of work (Singapore Math Olympiad, ICPC) and play (Sims 4, music exams, puzzles).

Speaking of puzzle contests, I participated in the UK Sudoku Championship 2018 at the beginning of this week. Participants were given 2 hours to solve 16 puzzles of varying difficulty. To keep things interesting, the puzzles aren’t just standard or “classic” sudoku, but also feature additional constraints – for example, the highest-scoring puzzle had an additional constraint where both major diagonals had to have the digits 1 to 9 once each as well. I solved 13 puzzles, which led to a placement of 42 out of 136 (or 157, depending on whether one considers zero scores). I finished the remaining puzzles after the time ran out and would have needed about 2:20 overall; the fastest solver cleared everything in 1:09, just under half that!

I think one of my weaknesses can be a tendency to over-index on pursuing things that I think are worth doing. This may not work out so well if my thoughts turn out to be incorrect or misguided, or if the pursuit causes unpleasant side-effects. Writing this reminds me of the 90 to 100-hour weeks I used to pull in second year when studying for the exams then.

Anyway, one of the things I’ve been looking at recently has been building up a healthy portfolio. I had a conversation with my parents today, and one of the things that came up as part of the discussion was something I used to do in high school.

In Singapore, most food courts and hawker centers have an economy rice stall. The stalls offer steamed white rice accompanied by various cooked dishes. These include meat-based dishes like fried chicken or Chinese char siu, vegetable dishes and tofu, among others. Ordering from an economy rice stall is typically cheap (as the name suggests); thinking about it, it also bears similarity to the low-cost model for airlines in that one chooses and pays specifically for what one wants, and has flexibility in choosing what is included.

My high-school canteen had an economy rice stall, and I remember budgeting S$1 (now 56p; then probably more like 45p or so) for lunch every day. This would typically result in fairly unhealthy meals, admittedly, with rice plus one hot-dog and a small portion of nuts and anchovies. For comparison, many meals in the canteen ranged from S$2 to S$3. Note that cooked food in Singapore is generally much cheaper than in London, and food in school canteens is typically subsidised.

Some of this frugality continued when I entered university. I had similar ‘adventures’ in my first year at Imperial, where I somehow maintained a grocery budget of £7 per week (£1 per day) for a full academic term. Things relaxed over time, especially as I convinced myself that I did have reasonable earning power. To be fair, there was increasing evidence of that as I did several internships and part-time jobs while I was at Imperial.

Clearly, if one is trying to build a portfolio quickly, one factor that can be optimised on would be maximising fresh inflows to the portfolio. Minimising one’s expenses and thus increasing one’s savings rate helps with that, of course.

However, aggressive expense minimisation often leads to other appreciable costs in terms of health, happiness and stress. Apart from direct opportunity costs (for example, from foregoing higher quality or safer food), there are secondary effects as well (affecting socialisation, for instance). The mental overhead of needing to evaluate many small financial decisions every day can be significant as well; I’ve found it to be the case in my personal experience.

Having a sizable portfolio can be useful; I don’t think I have to be convinced of the value of financial independence even though I don’t necessarily have early retirement plans at this stage. However, as an end in and of itself it is not particularly satisfying. I’m fortunate in that I haven’t taken on large debt obligations, and right now (though I may regret saying this later) bumping the numbers in some brokerage account or on my spreadsheets is nice, but brings very little marginal happiness.

It may well be obvious that, to quote a blog post I read this week, life is more than compounding money. In general, reducing it to an arbitrary single end is difficult. Nonetheless, when pursuing a goal I sometimes lose some sight of broader things, and it’s thus important to remind myself of this.

It has been said that “what gets measured gets managed”, and for me at least most goals tend to be fairly quantitative and measurable in nature. I often place these in some kind of OKR framework, which isn’t as friendly to softer, more qualitative tasks. That probably explains to some extent “losing sight of broader things”. Some may have a very clear and specific view of what they want to do, but I’m not there yet.