Browse Month

September 2017

Readability and Performance

I find myself faced with tradeoffs between readability and performance fairly often. This happens both in terms of programming (where, apart from in some specific situations, the former usually isn’t too bad) and in English.

In programming, code that is readable is usually desirable. This often makes it easier to maintain, especially for other developers who subsequently need to understand it. For example, let’s suppose we have a Java hash map, and we need to return an Optional indicating the value with a key that matches a given Predicate, if it exists. However, if there are multiple keys matching the Predicate, we want to throw an exception (even if the values are the same).

I would probably write something like this:

However, in terms of performance we could clearly improve. Using limit does allow us to shortcircuit (so we don’t look at more elements than we need to). However, creating stream objects could still add significant overhead, especially if the predicate is simple.

The code below should accomplish the same function, but more quickly.

This is still fairly readable, in my opinion, though there is a bit more state to keep track of. However, if the predicate is costly, we could do even better by taking our first approach and parallelising it, essentially framing the problem as a map-reduction. (The end result is likely to be slightly less readable. It might be substantially less so for people unfamiliar with concurrency).

Of course, it is not always the case that readability and performance are negatively correlated. Optimising compilers implement many techniques that transform a program with relatively readable source code into one that is efficient for a machine. This is often true, though, when an entirely different (and usually more complex) algorithm and/or implementation is selected because of performance.

It goes without saying that readability is also subjective (tabs versus spaces is a starting point). Readability does not always imply maintainability as well. For example, deeply nested structures where each layer focuses on one specific functionality may be readable, but may prove less maintainable given that they may produce longer, more convoluted stack-traces when exceptions occur.

Defining “performance” in natural languages is tricky. We could consider performance to be efficiency of communication, assuming a perfect ability to comprehend. Well-written compilers implement the source language specification faithfully. They should thus be able to understand valid source, even if it’s not in a human readable form. This happens in practice; processes like JavaScript minification are commonplace.

Of course, English does not quite have a specification. Nonetheless, I think efficiency under perfect comprehension is probably a reasonably close analogue to software performance. For example, if I was asked to explain the logical argument rule of modus tollens, I would probably say something like:

Suppose we have if A, then B. Suppose not B. Then, we have not A (since if we had A, then we would have B).

That explanation would probably make sense to readers vaguely familiar with formal logic and/or philosophy. Again, I could conceivably further increase “efficiency”, by removing the bracketed part. I could even boil things down further:

If A, then B. Not B. Thus not A.

Of course, if the person I was talking to was unfamiliar with logic and, let’s say, a friend, I would proceed very differently. This would, in a sense, be prioritising readability:

It’s a way of making an argument in logic. The idea is that if we know that if A is true, then B is true, and we know B is false, then we can say that A must also be false. Let’s use a concrete example. Let’s say that if it rains, I’ll wear a hoodie, and I’m not wearing a hoodie. Deducing that it must not be raining would be an example of modus tollens.

That said, there is also plenty of subjectivity in terms of readability for natural languages. There have been many attempts to compute estimates or measures of readability. These often use formulae that accept word length and syllable distributions and/or lookup tables. However, many of these fail to account for, among other things, the level of abstraction in text. Going back to our three explanations of modus tollens, they have approximate Flesch-Kincaid grade levels of 0.4, -2.6 and 5.5 respectively. However, I’m fairly confident that most three or four year olds (grade -2.6) would not quite understand the concept of modus tollens after reading “if A then B, not B, thus not A”. Even if 11 to 12 year olds (grade 5.5) were split into groups and given the three explanations, I suspect that the third would still be the most likely to help the students understand the concept.

I actually find the second explanation the most readable, owing to domain knowledge. In just nine words, it gives me a precise and accurate description of what the logical rule is. It might be appropriate for a general audience. It probably would not be for an audience familiar with neither logic nor philosophy.

The first is good in that it explains why not A is reasonable. I could see myself appreciating it if the subject matter was somewhat more complex (e.g. results in modal and temporal logic).

The third features a concrete example, which is useful when learning. It also explicitly sets out some assumptions common in logic, such as “suppose X” meaning “suppose X is true“. However, I find it to be less readable, as I need to read a longer text to extract the main conclusion of the paragraph.

Earlier, I highlighted that readability and maintainability of software and source code are not one and the same. Let’s consider the notion of maintainability. If spoken (especially extemporaneously), I’m not sure this is necessary at all. If written, that could refer to the difficulty of proofreading or returning to the work when trying to revise it, perhaps?

In the software case, there are times when my hand has been forced and I’ve thus opted for a complicated, less readable implementation. Outside of word games like Taboo and exams or application processes with strict word limits, I don’t think I’ve had to for English. Thus, in general I would consider my target audience and aim to present it in what I expect would be the most efficient way possible, while still retaining the required readability.

What Does “Likely” Mean? (Estimative Probability)

I typically take some time out on Saturday mornings to do some reading. This often begins with developments in personal finance or computer science, though there’s a tendency to branch out from that; I generally don’t restrict myself from going off on tangents.

For some reason, this week I came across a Masters thesis discussing communication of probabilities, specifically in the intelligence domain. It seems that I found it via A Wealth of Common Sense, a blog concerning personal finance that I read occasionally; there was a link there to a Library of Economics blogpost discussing issues with mapping qualitative descriptions to quantitative probabilities. For example, if I say that something is highly likely to happen, I would be implying that I believe it would happen with probability at least N; however, the numerical value of N is uncertain, and differs among different people.

For me at least, N would be around 80 percent (and, incidentally, the author of that post agrees); that said, I can certainly envisage people assigning values of N as low as 0.6 or as high as 0.95. Note that the problem can also be two-tailed (e.g. about evenmightpossiblenot inconceivable). The LoE author’s son proposes a reasonable scheme, which is to have authors outline their own mapping in a glossary. This is probably (well, I mean P >=0.7) a good first step, though there are implementation challenges in terms of length as well as completeness.

It turns out that the concept of words of estimative probability is treated very seriously in the intelligence domain. It is understandably important, as briefs are typically prepared in natural language, and often need to be communicated to audiences that may not be entirely comfortable with mathematical notation. To quote a CIA officer:

Most consumers of intelligence aren’t particularly sophisticated when it comes to probabilistic analysis. They like words and pictures, too. My experience is that [they] prefer briefings that don’t center on numerical calculation. That’s not to say we can’t do it, but there’s really not that much demand for it.

Furthermore, deriving highly precise (though possibly not highly accurate) estimates for probabilities is almost certainly (*cough* I mean P <= 0.03) pointless, and is likely (P >= 0.7) damaging in that it tends to (P >= 0.6) give a false sense of security and accuracy when that does not actually exist.

The original proposal divides probabilities onto a seven-point scale (arguably five, as the ends are meant to reflect absolute certainties): certain, almost certain, probable, chances about even, probably not, almost certainly not, impossible. I think most people would agree that the above ordering is in decreasing order of probabilities. Of course, strictly adhering to the above labels would impart a degree of awkwardness to writing, and a group of variants for each level (such as highly doubtful for almost certainly not) soon developed.

Interestingly, the author also gives possible a fairly specific meaning; he defines it to mean “greater than zero and less than one” (which makes sense; of course, something always happening is certainly possible – but it seems pointless to not use the more precise word), but also adds the restriction that “no numerical odds (can) be assigned”. This seems like a useful construct, especially in the early stages of knowing things when uncertainty tends to be high; the other descriptive terms were conceived with uncertainty ranges of about 10% on each side.

The author of the Masters thesis also considers how words of estimative probability are used in other fields. I found the comparison to weather forecasting particularly interesting, as the author rightly points out that that is one field in which the general public is given numeric estimates (of the probability of precipitation). Weather forecasters typically add their own prose when reporting as well, which allowed some comparison. That said, a major difference is that in forecasting, these estimates can be derived with fair precision (and, as it turns out, accuracy) as they can (and, in the UK, do) originate from predictive models of atmospheric conditions. There seem to be standardised terms as far as the US National Weather Service is concerned, though I wasn’t able to find comparable guidance from the UK Met Office.

The clarity required from words of estimative probability depends on the consequences of miscommunication, as well. This is of course important in intelligence, with some claiming that there was miscommunication regarding warnings related to the September 11 terrorist attacks. Incorrectly reporting a weather forecast is almost certainly (ugh, P >= 0.95) less consequential, though people may make bad decisions concerning taking umbrellas when going out or hanging clothes out to dry. I can imagine contexts where this would also be very important (such as experimental trials in medicine or financial planning), though it seems for the most part some degree of ambiguity or even unknown disagreement is probably okay.

A Tour of Two Islands (IJCAI-17 Reflections)

Over the past two weeks, I took some time out from work and visited two islands, one substantially larger in land area than the other (and, incidentally, I am returning to another island). I’ve been to both at some point in the past; the larger one for vacation, and the smaller for study, work, visiting family and friends and for vacation as well. The latter is not too difficult to guess – it’s Singapore; the former is Australia.

Academic Program

I was in Melbourne for the twenty-sixth International Joint Conference on Artificial Intelligence (IJCAI-17). Of course, the main highlight for me was presenting “Model Checking Multi-Agent Systems against LDLK Specifications“, a paper which built on parts of my Masters’ thesis on MCMAS-Dynamic. This was my first time presenting at an academic conference, though it’s not my first publication (that’s in AAMAS 17 on CTL*K); it was a little nerve-wracking, though I practiced the presentation a number of times beforehand and I think that helped a lot. Prof. Michael Wooldridge (the author of a commonly used textbook on multi-agent systems) even gave me a compliment that it was confidently presented. I’m not so sure about that, but I guess I have some private local state, and in any case deontic logic doesn’t typically have box p implies p as an axiom.

I also enjoyed the rest of the academic program as well. Apart from Alessio, I had the opportunity to meet up with several other academics whose names, at least, were very familiar. The keynote was delivered by Prof. Stuart Russell (one of the co-authors of the well-known Introduction to AI textbook), and discussed the concept of provably beneficial AI. Essentially, humans tend to be poor at specifying things (e.g. King Midas); many naive specifications of goals or objective functions naturally lead to agents enforcing self-preservation; more informally, “you can’t fetch the coffee if you’re dead”. The talk then discussed a way of formulating a problem such that it would always be a dominant strategy for an AI agent to allow itself to be switched off (as compared to not allowing itself to be switched off); see [Hadfield-Menell et al. 2016] (The Off-Switch Game) for more detail. The other invited speakers also had fairly interesting topics, discussing applications of AI in teaching, commerce and Texas hold-em poker.

I made it a point to attend as many paper presentation sessions as feasible. Expectedly, at a top conference like IJCAI there were quite a number of talks, especially in areas further removed from my work, that proved largely inscrutable. I think my presentation would also have been tricky to follow if one was not reasonably acquainted with temporal logic and/or multi-agent systems. Nonetheless, I gave my best effort in trying to follow the presentations, and there were certainly many interesting talks.

There was a session devoted rather specifically to the discussion of AI applied in the context of computer games. I found [Dann et al. 2017] (Real-Time Navigation in Classical Platform Games via Skill Reuse) to be particularly interesting; the key idea was having the agent acquire skills for small local movements, such as making a single jump to jump on top of a block three cells above the ground. The agent learns success probabilities for these skills, and uses a variant of Dijkstra with log-likelihood to find the “most likely” way to reach the goal safely (though there are some complications with re-planning if a step fails). The authors also demonstrated that the model performs better than the state of the art algorithm, which is based on A*, for maze-like levels. Other talks in the session also discussed level generation and physics engine inference (for platformers), RTS combat planning, narrative control in an educational RPG… and strategy derivation in Hex (this felt a bit out of place, but I guess they must have had exactly five papers that fit the theme well; sessions had either four or six presentations).

In terms of tutorials, I was unfortunately rather sleepy when I landed in Melbourne on Sunday. I did attend the tutorials on Monday, though; there was a very gently paced introduction to argumentation theory (the course Argumentation and Multi-Agent Systems which Prof. Francesca Toni teaches at Imperial would likely have covered this material, though she was on sabbatical when I was in year 4), and a less gently paced overview of heterogeneous learning (machine learning where some differences in the data sources can be exploited to yield better results).

Non-Academic Program

Stepping back from the academic side of IJCAI, I also enjoyed the social events at the conference, and I think they were run fairly well; that said, I acknowledge I don’t have a benchmark at all. The reception at the Melbourne Cricket Ground was very generous, especially in terms of dessert (chocolate tarts!), and the conference in the Docklands was certainly done to a high standard. It also introduced me to the “alternating drop” protocol, which is apparently common (but peculiar to) Australia. Essentially, given two choices of main course X and Y, instead of having guests indicate their preference, they are served XYXYXY… around the table (of course, those with dietary restrictions are excluded). I can see how this is logistically advantageous (you know beforehand that it will suffice to prepare just over N/2 of both options, given N guests), but I’m not sure I like the probability of not getting my preference…

I stayed at the “recommended” Crown Metropol which was certainly luxurious. The hotel is located south of the Yarra river in Melbourne, and is part of the Crown Entertainment Complex – among other things, this features a fairly large casino, several restaurants and luxury shopping (way out of my budget range). Nonetheless, it was a good base from which I could easily explore the city, especially on the few days I stayed on after IJCAI.

Post-Conference

I made it a point to leave a few days after the conference itself was over to explore the city, as I knew the conference days themselves were going to be fairly packed (8 am – 6.30 pm usually, running till 11 or so on the night of the Reception and Banquet). Like many work trips, the “official business” part of the trip was fairly draining – I perhaps should have seen this coming. I did manage to have a look around, though; I took a river cruise, walked through a few parts of the city and indulged in a bit of outlet shopping. I also sought out a local fast-food chain (Red Rooster, specialising in grilled chicken).

Given that I’d flown this far across the globe when going from London to Melbourne (it’s roughly 20 hours in the sky) I also decided to spend a few days in Singapore on the way back. For the most part I focused on spending time with family, though I also managed to find some time to flip through some of the IJCAI proceedings.

General Thoughts


It’s surprisingly difficult to leave work behind…

This is certainly the first time I’ve had a two-week break since I started full-time at Palantir. (I had some things I decided I would clear out last year around the Christmas and New Year period, so no long break for me then.) That said, rather unsurprisingly in hindsight attending and presenting at IJCAI was easily just as intellectually stimulating and demanding as my regular work, so it felt more like a one-week break. Still, it was quite a dramatic slowdown.

I think the break worked well. It came at a time just after I launched some fairly hefty deliverables; knowing that I was leaving for two weeks, I had been pushing somewhat hard to get things out before I left. I hadn’t been back home for 11 months. I was initially a little apprehensive about going off for two consecutive weeks, though the logistics of course made sense, and I don’t think I have major regrets, at least not yet!