Browse Category

# Quantitative Challenges

I spent about two to three hours studying and then working through mathematical problems on quantitative finance today. Specifically, these were questions from Blyth’s An Introduction to Quantitative Finance and dealt with interest rate swaps. These are contracts where one party typically pays fixed payments (e.g. 5%) and receives floating payments, which are dependent on market rates (e.g. LIBOR + 1.5%), though float-float swaps exist too (e.g. across currencies). Swaps can be used to mitigate interest rate risk (or gain exposure!); they can also be mutually beneficial depending on companies’ borrowing characteristics.

If I had to classify the mathematics involved in the problems I did, beyond “applied” I’m not sure how else I could label it. For many of these problems, the first steps involved figuring out how to mathematically model the financial products involved. There then followed some elementary algebra, along with proofs that required some intuition to pick the right approach (for lack of a better word). The exercises I did this time around relied less on probability than normal; much of this probably stemmed from a result that forward interest rates (i.e. the interest rate you’d get from future time T1 to later future time T2) could be valued independent of the distribution of possible values.

I’ve struggled quite a fair bit with the book, both in terms of the reading material as well as the exercises. Today’s chapter was relatively easier, though that might have been because I was reading through the chapter for the second time. It was my first time doing many of the exercises, though it seems like they went relatively smoothly today.

Some of this might be because I work through the chapters at a very relaxed clip of about one per month. Like many other mathematical domains, there tend to be many dependencies on previous topics. The earlier result I mentioned on forward interest rates, for example, was from the previous chapter; yet, it was instrumental in computing the valuation of a swap. There are certain fundamental ideas that I learned way back in 422 (Computational Finance at Imperial). I also think my mathematical knowledge and logical intuition have (hopefully) mostly stayed with me. Furthermore, I like to think that I remember the concepts at a high level. However, many proofs require recognizing that expressions are in certain forms and can thus be rewritten; I’m still yet to develop that level of familiarity – or shall I say intimacy – with the content.

This might also be partially self-created, especially where the reading material is concerned. When I see theorems, I tend to try my hand at proving them on my own first. These often prove to be rather tricky endeavors; the aforementioned lack of keen familiarity with the material certainly doesn’t help. Typically, I can understand the proofs fairly easily when reading them. However, I usually expect myself to figure out the intuition behind the proof (including reproducing it, at least at a high level), which isn’t always so forthcoming. That actually reminds me of what I used to do at Imperial for certain modules, especially the (in my opinion) two hardest of the course: 438 Complexity and 493 Intelligent Data and Probabilistic Inference. I would make the effort to understand why many of the proofs in those courses worked. I’d also try to figure out how the author might have come up with the proof, or at least what the core intuitions might have been. This included relatively nasty ones (e.g. SAT being NP-complete via direct argument, or ELBO results in variational inference), and I think it paid off in terms of understanding.

It could be argued that finding the material difficult is expected, because the subject matter is itself complex. I tried to obtain a popular estimate of the complexity of the material covered by the book, but didn’t find much data; there were only a handful of reviews on Amazon, which offered a wide spectrum of views (from “[o]ne of the best introductory treatise (sic)” to “I would hardly call it an “Introduction” to quantitative finance”). I’m not sure how to start more simply, though; there is a fair bit of assumed mathematical knowledge, but this is at least partly spelled out in the introduction.

While the book has proved challenging at times, I’ve not found it too hard to follow. Discussing the problems with a friend has also helped a fair bit, especially since the book doesn’t have solutions!

# What Does “Likely” Mean? (Estimative Probability)

I typically take some time out on Saturday mornings to do some reading. This often begins with developments in personal finance or computer science, though there’s a tendency to branch out from that; I generally don’t restrict myself from going off on tangents.

For some reason, this week I came across a Masters thesis discussing communication of probabilities, specifically in the intelligence domain. It seems that I found it via A Wealth of Common Sense, a blog concerning personal finance that I read occasionally; there was a link there to a Library of Economics blogpost discussing issues with mapping qualitative descriptions to quantitative probabilities. For example, if I say that something is highly likely to happen, I would be implying that I believe it would happen with probability at least N; however, the numerical value of N is uncertain, and differs among different people.

For me at least, N would be around 80 percent (and, incidentally, the author of that post agrees); that said, I can certainly envisage people assigning values of N as low as 0.6 or as high as 0.95. Note that the problem can also be two-tailed (e.g. about evenmightpossiblenot inconceivable). The LoE author’s son proposes a reasonable scheme, which is to have authors outline their own mapping in a glossary. This is probably (well, I mean P >=0.7) a good first step, though there are implementation challenges in terms of length as well as completeness.

It turns out that the concept of words of estimative probability is treated very seriously in the intelligence domain. It is understandably important, as briefs are typically prepared in natural language, and often need to be communicated to audiences that may not be entirely comfortable with mathematical notation. To quote a CIA officer:

Most consumers of intelligence aren’t particularly sophisticated when it comes to probabilistic analysis. They like words and pictures, too. My experience is that [they] prefer briefings that don’t center on numerical calculation. That’s not to say we can’t do it, but there’s really not that much demand for it.

Furthermore, deriving highly precise (though possibly not highly accurate) estimates for probabilities is almost certainly (*cough* I mean P <= 0.03) pointless, and is likely (P >= 0.7) damaging in that it tends to (P >= 0.6) give a false sense of security and accuracy when that does not actually exist.

The original proposal divides probabilities onto a seven-point scale (arguably five, as the ends are meant to reflect absolute certainties): certain, almost certain, probable, chances about even, probably not, almost certainly not, impossible. I think most people would agree that the above ordering is in decreasing order of probabilities. Of course, strictly adhering to the above labels would impart a degree of awkwardness to writing, and a group of variants for each level (such as highly doubtful for almost certainly not) soon developed.

Interestingly, the author also gives possible a fairly specific meaning; he defines it to mean “greater than zero and less than one” (which makes sense; of course, something always happening is certainly possible – but it seems pointless to not use the more precise word), but also adds the restriction that “no numerical odds (can) be assigned”. This seems like a useful construct, especially in the early stages of knowing things when uncertainty tends to be high; the other descriptive terms were conceived with uncertainty ranges of about 10% on each side.

The author of the Masters thesis also considers how words of estimative probability are used in other fields. I found the comparison to weather forecasting particularly interesting, as the author rightly points out that that is one field in which the general public is given numeric estimates (of the probability of precipitation). Weather forecasters typically add their own prose when reporting as well, which allowed some comparison. That said, a major difference is that in forecasting, these estimates can be derived with fair precision (and, as it turns out, accuracy) as they can (and, in the UK, do) originate from predictive models of atmospheric conditions. There seem to be standardised terms as far as the US National Weather Service is concerned, though I wasn’t able to find comparable guidance from the UK Met Office.

The clarity required from words of estimative probability depends on the consequences of miscommunication, as well. This is of course important in intelligence, with some claiming that there was miscommunication regarding warnings related to the September 11 terrorist attacks. Incorrectly reporting a weather forecast is almost certainly (ugh, P >= 0.95) less consequential, though people may make bad decisions concerning taking umbrellas when going out or hanging clothes out to dry. I can imagine contexts where this would also be very important (such as experimental trials in medicine or financial planning), though it seems for the most part some degree of ambiguity or even unknown disagreement is probably okay.

# Elevation (Hotseat: STEP I 2015)

#### Background

STEP (short for Sixth Term Examination Paper) is a somewhat difficult Maths exam that is used by several UK universities (including Cambridge, Warwick and Imperial) as part of a conditional offer for courses in Mathematics and Computer Science. A part of my conditional offer for Imperial included a 2 in any STEP paper of my choice, though it was paired with a relatively awkward 35 points for IB – perhaps because the rest of my portfolio was pretty strong.

There are three papers – I, II and III; III includes A Level Further Mathematics content, while I and II remain within the A Level Mathematics syllabus. I is also somewhat easier than II; that said, I think both papers exist because Cambridge does sometimes want students who didn’t take Further Mathematics to get a pair of grades in these exams. Nonetheless, STEP I itself is certainly no pushover. Students are graded on a scale of S, 1, 2, 3, U; the 2015 STEP I paper had 73.1 percent of students scoring at least ‘3’ (the lowest pass grade), and just 42.6 percent scoring at least ‘2’ (the lowest grade many universities would consider). This may be compared with A Level mathematics in 2015, where the analogous metrics of A*-E and A*-C respectively are 98.7 and 80.8 percent; and this is even before we factor in selection bias.

Each paper consists of 13 questions, but candidates are only required to pick six of them; their highest-scoring six questions will be used to determine their final score. Questions have equal weight (and each is marked with integers out of 20, which seems suspiciously similar to how I’ve seen this done at many universities!). Eight of the 13 questions are classified as “pure mathematics” and include tasks testing concepts like calculus, trigonometry, algebraic manipulation, series and even number theory. Three are classified as “mechanics”, typically requiring calculations on Newtonian mechanics, and two as “probability and statistics”. I almost always do 4/0/2 or 3/1/2. Note that it is possible to attempt seven or even more questions as a form of “insurance”, though given the strict time constraints this is likely to be difficult.

#### Performance

I had a fairly decent run, picking up 114 out of 120 points; mainly losing these to minor slips/cases where an important statement was not explicitly asserted, and a good chunk in question 7 with not clearly handling a case which was expected to be shown to bear no fruit (I thought it was rather obvious that it was not needed, and dismissed it in one line).

The last row indicates the order in which I attempted the problems; it seems this was generally consistent with how long I actually took on them (problems 2 and 13 were fast; 1 and 8 were somewhat in-between, and I struggled for a bit with 12 and messed up 7 while using up a fair bit of the time). Note that the “break-even” time if one wants to answer all questions would be 30 minutes per question.

#### Selected Problems in Depth

##### Problem 8: Series Division

First, prove that $1 + 2 + \ldots + n = \frac{n(n+1)}{2}$ and $(N-m)^k + m^k$ is divisible by $N$. Then, consider

$S = 1^k + 2^k + 3^k + \ldots + n^k$

Show that if n is a positive odd integer, then S is divisible by n, and if n is even then S is divisible by n/2. Show further, that S is divisible by $1 + 2 + 3 + \ldots + n$.

The two lead ins were simple. Induction does work but in both cases there were much better methods that could be used (write the series twice and pair terms up, and then pair terms in the binomial expansion). Later parts involved pairing S with a zero term, but the general theme of “pairing terms” persisted throughout the question. I think the toughest part of this one was, knowing that one had to show divisibility by $\frac{n(n+1)}{2}$ at the very end, figuring out that it was safe to split this into two terms and show them separately. This works because the highest common factor of $n$ and $n + 1$ is 1. My number theory was a bit rusty so I wasn’t sure if that was correct, and proved it during the paper.

##### Problem 12: On Fish Distributions

The number X of casualties arriving at a hospital each day follows a Poisson distribution with mean 8. Casualties require surgery with probability 1/4. The number of casualties arriving on each day are independent of the number arriving on any other day, as are the casualties’ requirements for surgery. (After some initial work) Prove that the number requiring surgery each day also follows a Poisson distribution and state its mean. Given that in a particular random chosen week 12 casualties require surgery on Monday and Tuesday, find the probability that 8 casualties require surgery on Monday (as a fraction, in its lowest terms).

This one really wasn’t too bad, though it involved a lot of algebraic manipulation and it seems I took quite a long time on it when doing the paper. Essentially, the independence condition should hint that if we have X casualties, the probability of S needing surgery is clearly binomially distributed. X itself is a random variable, but that’s fine; the law of conditional expectation gives us

$P(S = s) = \displaystyle \sum_{t = s}^{\infty} P(S = s | X = t) P (X = t)$

and a suitable substitution yields this:

$P(S = s) = \displaystyle \sum_{t = s}^{\infty} \left( \frac{t!}{s! (t - s)!} \times \left( \frac{1}{4} \right)^s \times \left( \frac{3}{4} \right)^{t-s} \times \frac{e^{-8} 8^t}{t!}\right)$

After some fairly involved algebraic manipulation, one can indeed recover a Poisson form for the pdf of S. Using this, the last part is relatively simple actually; we want $P(S_1 = 8 | S_1 + S_2 = 12)$ and relying on the fact that a sum of independent Poissons is Poisson itself (so the means of $S_1$ and $S_2$ are 2 each gives us that the mean of $S_1 + S_2$ is Poisson, and with mean 4).

##### Problem 13: Probability and State Machines

Already covered in a previous post. I dispatched this one very quickly, though I was already familiar with the Markov process model that I used here.

#### Meta-Analysis

The main datasource we have available here is an Examiner’s Report that discusses to some extent what happened (though we don’t have full details). The grade boundary for an S is 96, so a 114 is comfortably in that range; 3.5 percent of candidates scored that. Question-level data isn’t published beyond the comments in the Examiner’s Report.

Focusing on the questions that were attempted, my opener Q1 which was a test of curve-sketching was also the highest-scoring question on the paper, with students scoring an average mark of over 13.5 (with caveats that some students were worried about curve sketching and thus avoided it). This was the second “easiest” question as far as I was concerned.

The other pure maths questions I attempted (2, 7 and 8) were also popular, and attempted with varying degrees of success (questions 2 and 7 were the second and third highest scoring questions). Probability and Statistics for some reason seems to always cause issues for students attempting these papers, with mean scores in the 2-3 range, though having done Olympiads in the past (and specialising in combinatorics and probability) I understandably found these easier.

Generally, STEP I for me tends to be fairly tough but certainly comfortable, II is solidly difficult, and doing III often makes me feel like a dog behind a chemistry set (“I have no idea what I’m doing”), especially since I didn’t take Further Maths in high school.

# Probability and State Machines

Suppose you have a fair six-sided die, and repeatedly toss it. What is the probability that you throw exactly one 4 before your first 6, or you throw exactly one 5 before your first 6 (inclusive or)?
(STEP I Q13, 2015)

There are several ways to approach this problem. There’s a “conventional” method that involves careful use of binomial series (and that was the approach taken in the mark scheme). However, the approach I used for this was somewhat different; some of it may be attributable to my computer science (or perhaps even logic/verification) background.

We can model this event using a finite state machine, where each state has a transition distribution that indicates the probability of transitioning to each other state in the state machine. For example, we can consider a simpler problem: what is the probability that you throw exactly one 4 before your first 6?

Well, we can be in multiple “states”:

• We could not have rolled anything of significance yet. In this case, we have a 4 in 6 probability of staying in this state (1, 2, 3 or 5); a 1 in 6 probability of immediate failure (rolling a 6), and a 1 in 6 probability of registering a 4.
• We could have already registered a 4. In this case, we again have a 4 in 6 probability of staying in this state (1, 2, 3 or 5), a 1 in 6 probability of success (rolling a 6), and a 1 in 6 probability of failure (rolling a 4).
• We could already have succeeded or failed. Additional rolls won’t change anything in these cases.

This can be modelled as the following state machine:

We can then compute a success probability for each state – that is, “given I’m in this state, what’s the probability I succeed”?

• The success probability PS of the state Pass is 1, and that of Fail is 0; we write this as PS(Pass) = 1 and PS(Fail) = 0.
• PS(4) is a bit more interesting; it is $\frac{4}{6}PS(4) + \frac{1}{6}PS(Pass) + \frac{1}{6}PS(Fail)$ and solving this yields half. This is intuitively expected; at this point, we’re counting on rolling a 6 before rolling a 4, and on a fair die the probability of that is half.
• We can now determine PS(Start); this is $\frac{4}{6}PS(Start) + \frac{1}{6}PS(4) + \frac{1}{6}PS(Fail)$. Solving the equation yields the answer of $\frac{1}{4}$.

Note that we could only use this approach because, with the way we’ve defined the states, the process is memoryless; that is, the transition probabilities depend only on the current state. We could try to directly construct a state machine for the original question (either exactly one 4 or exactly one 5 before the first 6, or both), though it seems that the requirement of memorylessness makes things somewhat complex; we would need states tracking whether we experienced zero, one, or two-plus of each number.

We can use a well-known probability axiom here, though; $P(A \cup B) = P(A) + P(B) - P(A \cap B)$ – defining rolling exactly one 4 before one 6 as event A and respectively with a 5 for event B. Furthermore, our initial exploration already yielded values for $P(A)$ as a quarter, and $P(B)$ would also be a quarter by symmetry. We thus, instead, construct a state machine for the intersection, where both need to be satisfied.

• Throughout, a 1, 2 or 3 does nothing and simply puts us back in the state we were in.
• As we start, we can also roll a 4 or 5, which mean we’ve got the one 4 or 5 we want; or we can roll a 6, which is an immediate failure.
• Once we have a 4 registered, we want to roll a 5 which gets us into the (4, 5) state. 6 remains an immediate failure, but 4 also now becomes one (remember that we want to have exactly one 4 and exactly one 5 before our 6).
• The above logic also holds for 5s, except with 4 and 5 swapped.
• Finally, in the (4, 5) state, rolling a 6 yields success, but either a 4 or a 5 would become immediate failures.

Here’s the result:

We could use the same method of success probabilities used earlier; I’ll take a little shortcut.

• PS(4,5) must be $\frac{1}{3}$ as we’re counting on a 6 being rolled before either a 4 or a 5, and with a fair die each of these has equal probability of being the first of the set to be rolled.
• PS(5) is $\frac{1}{6}PS(4,5) + \frac{2}{6}PS(Fail) + \frac{3}{6}PS(5)$, which gives us $PS(5) = \frac{1}{9}$. By symmetry, we can be sure that $PS(4) = \frac{1}{9}$ too.
• PS(Start) is $\frac{3}{6}PS(Start) + \frac{1}{6}PS(4) + \frac{1}{6}PS(5) + \frac{1}{6}PS(Fail)$. Solving that yields $PS(Start) = \frac{2}{27}$.

We’re not done yet, though, as this isn’t what we wanted. We wanted $P(A \cup B)$; but now we can apply the identity and get

$P(A \cup B) = P(A) + P(B) - P(A \cap B) = \dfrac{1}{4} + \dfrac{1}{4} - \dfrac{2}{27} = \dfrac{23}{54}$

which is the same answer obtained by the STEP problem-setters, though they used binomial series instead.

The above approach can of course be generalised to handle expected values (on the basis that there can be multiple distinct terminal states, where each state is assigned a value). For example, if a player won a consolation prize of $21 if he ‘bust out’ (rolled a 6 before rolling either a 4 or a 5), and a full$60 if he won, and we wanted to determine the value of the game, we could draw the graph with three terminals instead, perhaps something like this:

We would then perform our computations on the expected value (EV) of each state, with terminals being exactly their value.

Also, though for our examples we didn’t have cycles in our graphs, it’s possible that in general there could be cycles. Of course, these can be handled by solving the multiple PS or EV-equations simultaneously.

# A Metric for Speedrunning Exams

In gaming, a speedrun is a playthrough of a game where one attempts to complete it as quickly as possible, perhaps subject to some constraints (such as completing a game with 100% completion of all sidequests). I wouldn’t say I’m skilled/fast enough for aggressively focusing on speed in many of the tests I’ve covered; I didn’t finish the AIME, and while I did pretty much “complete” the BMO and MAT within their time limits it wasn’t with much time to spare at all.

Typically when taking exams where one expects to finish comfortably (and even in cases when one does not) I find it’s rarely a good idea to go as fast as possible; mistakes are likely to be made and one typically wants to secure the marks for questions one is able to answer. I’d almost always use up almost all of the time as well; there’s typically no reason not to go back and check one’s past work, and/or improve the quality of one’s existing answers. Generally up to and including the IB exams I took at the end of high school, for most subjects there was time left over at the end; once at Imperial most exams tended to be very fast-paced – I would either finish them with typically fewer than ten minutes on the clock, or not finish them at all. (There were exceptions, such as first-year Logic, second-year Concurrency and fourth-year Modal Logic.)

A speedrun of an exam would involve completing the questions as quickly as possible; perhaps using much less time than is given. In constructing a metric for this, it becomes apparent that incorrect answers need to be penalised in some way (it would otherwise be feasible to leave all questions blank and immediately stop the clock). However, ideally the penalties would not be too harsh; for example, invalidating performances with any answers incorrect would address our first concern, but would not be enjoyable at all. It also seems awkward to invalidate an entire run on the basis of not knowing, say, a single fact or definition.

There are two obvious raw metrics for performance of a speedrun of an exam:

• the proportion of total marks scored, $M \in [0, 1]$ where higher is better;
• the proportion of time taken, $T \in [0, 1]$ where lower is better.

Combining those metrics, I think the following metric for an overall performance $P$ is fairly intuitive.

$P_0 = M/T$

In a sense we’re measuring efficiency with respect to time, against a benchmark student who uses all of the time and scores 100% of the marks. However, I don’t think this is a good metric, because it can readily be abused; a strategy that quickly determines the easiest mark(s) to score on the paper and then attempts only those marks will do very well (note: not necessarily questions; for example, “find the roots of $f(x) = 36x^4 - 23x^3 - 31x^2 - 102x$” is a rather nasty question, but there’s one obvious answer that drops out in a second or two).

Of course, a way around this could be that scripts must be checked at the end, to ensure that there’s a reasonable bona fide attempt to each and every problem. However, that requires manual verification. An alternative option could be to introduce a minimum mark threshold $m$; attempts that have $M < m$ are invalidated.

$P_1 = \begin{cases} 0 & M < m \\ M/T & M \geq m \end{cases}$

This metric is a decent improvement, though it still has some issues:

• A strategy that seeks to identify the $m$ easiest marks to obtain and then attempts only those would perform well. This can be mitigated if $m$ is fairly high; for example, for A level mathematics papers I would set something like $m = 0.93$ or so.
• Also, if $m$ is set too high (e.g. $m = 1$ is the aforementioned “invalidate all non-perfect runs” strategy), too many runs, including runs that fail because of unknown facts/definitions may be invalidated.

We can also consider continuous penalty functions based on $M$ that increase perhaps more harshly than $M$ itself, as $M$ decreases from 1. For example,

$P_2 = \max \left\lbrace 0, \dfrac{1 - 2(1 - M)}{T} \right\rbrace$

Thus, a run with $M = 0.7$ for a given time has its score reduced to 40 percent of another run with the same time but $M = 1$. The max from 0 could be left out, if one wished to give negative scores to runs with $M < 0.5$ though I think that’s fairly harsh.

There’s also no reason we should restrict ourselves to linear functions. Consider

$P_3 = M^\alpha / T, \alpha > 1$

Higher values of $\alpha$ will penalise flaws more heavily; consider two runs with the same time, but one having $M = 0.9$ and the other $M = 1$; with $\alpha = 2$ the imperfect run scores 81 percent the score of the perfect run, but with $\alpha = 10$ the imperfect run scores a mere 34.9 percent the score of the perfect run! Also, observe that as $\alpha \rightarrow \infty$ we approach $P_1$ with a 100 percent threshold.

Of course, we also have the exponential option:

$P_4 = e^{-\lambda (1 - M)} / T, \lambda > 0$

In approaches 3 and 4, each additional mistake is penalised more. I think this makes sense for papers where candidates who understand the concepts can be expected to lose most or all of the points they lose to careless errors.

One or two slips should, in my opinion, result in a notable performance hit relative to perfect runs, but the run should still be salvageable; more than a few would seem like grounds for invalidating the run. It could be possible to blend the threshold idea (approach 1) with either approach 3 or 4, though we could argue that the heavy penalties involved would already destroy runs with “more than a few” errors.

# The University Gatekeepers (Hotseat: MAT 2013)

#### Background

I didn’t have to take the Mathematics Admission Test when I applied to Imperial (though the Cambridge STEP was a part of the conditions of my offer, probably because I didn’t take FM). This was added as a part of the admission process in 2013, though seemingly only for students aiming to read Mathematics. Oxford has been running this test for fairly long, and they use it both for Mathematics as well as Computer Science. Unlike STEP, candidates don’t get a choice of questions, but instead are prescribed questions based on the programme they’re applying to.

The paper has seven questions and candidates need to answer five. All candidates have to answer question 1, which is actually ten multiple-choice questions worth 4 marks each (no penalties for guessing). There are then six 15-mark questions that follow. Candidates in Computer Science generally have two questions that the broader Mathematics population also takes – in 2013 this featured an algebra question on functional equations, and a discrete mathematics/combinatorics question. There are two further questions, one of which tends to incorporate a significant element of mathematical logic (2013’s problem revolved around the notion of common knowledge, much like the infamous Cheryl’s birthday question), and another that deals with formal languages and reasoning about similar structures (the 2013 problem focused on an inductively generated language).

I did the Computer Science stream of the paper, and managed to put together a score of 95 (40 on the first problem; 14 on Q2, 15 on Q5, 15 on Q6 and 11 on Q7) owing to a lucky (or, depending on how you see it, educated) guess on one of the multiple-choice problems. The “Mathematics-only” questions Q3 and Q4 didn’t seem any harder, though; I cleared both of them without much difficulty (whereas Q2 was somewhat nastier, in general). I lost the points on Q7 owing to a careless error; specifically, that if an expression $Bw$ is reversed, the resultant expression is certainly not $wB$ (well, unless $w$ is palindromic!)

The specific paper I did is here.

#### Selected Problems in Depth

##### Problem 1: The Gauntlet

I don’t find multiple choice questions to play very well with mathematics, because it’s very often possible to reverse engineer a solution or otherwise arrive at the correct answer via an unexpected solution path. For example, consider problem G:

$p_n(x) = (x - 1) + (x - 2) + \ldots + (x - n)$

What is the remainder when $p_n(x)$ is divided by $p_{n-1}(x)$?

Presumably, the intended solution path was to express $p_n(x)$ in a more convenient form (noting the arithmetic progression in the constant terms) and then actually do the division. However, given the options, it sufficed to simply test it for $p_3(x)$ and $p_2(x)$, noting that the result of the division was negative, and that only one of the options was negative.

Nonetheless, there were certainly ample opportunities for trick questions (this paper had E, where there was a trap with an algebraic expression easily being bounded by degree 7, though it wasn’t the case that it actually was degree 7).

I got through 9 of the problems fairly smoothly, though struggled with H (failing to realise that one of the equations was actually that of a semicircle, and plowing into aggressive trigonometry-based integration). In the end I ran out of time towards the end of the integration, but I guessed the only option that featured the subtraction of the area under one of the lines, which turned out to be correct.

##### Problem 2: Functional Equations

This was generally fairly straightforward till the end, where the task was to find some function $f$ that satisfied $f(t) - f(1-t) = (2t - 1)^3$. In hindsight it seemed fairly obvious to conclude that a good way of solving this would be to treat $f$ as a cubic polynomial, though I struggled for a while substituting random values of $t$ in to the functional equation to see if anything dropped out. I then went through some messy reasoning along the lines of “the difference between two points increases cubically as the (x-)distance between said points increases linearly, so the derivative is probably quadratic, thus the function is probably cubic”. I then established $f(t) = At^3 + Bt^2 + Ct + D$ and was on my way after a modest algebraic bash.

##### Problem 6: Reasoning about Knowledge

This question introduces Alice, Bob and Charlie; Charlie writes a (different) number on Alice and Bob’s foreheads. Thus Alice and Bob can see the other’s number, and Charlie can see both. The question begins with a little warm-up to prompt candidates into thinking about epistemic reasoning (if Alice knows her number right off the bat, it must be 2 – because she can see Bob’s 1; for any other number she wouldn’t be able to know). Things heat up a little later on:

• Charlie: Your numbers are between 1 and 10 inclusive.
• Alice: I don’t know my number. Is my number a square number?
• Charlie: If I told you, you would know your number.
• Bob: I don’t know my number.

What is Alice’s number?

We start by noting that the squares are 1, 4 and 9. Since Alice can see Bob’s number, she already knows that her number is one of two possibilities, spaced two apart. For Charlie’s statement to make sense, exactly one of the possibilities has to be a square. This tells us that Bob’s number is 2, 3, 5 or 8. (It couldn’t be 10, because if it was Alice would know that her number is 9.)

Next, Bob can see Alice’s number, but doesn’t know his. Since he can see Alice’s number, Alice’s number $A$ must be one for which both $A + 1$ and $A - 1$ are candidates. This gives us the answer: her number has to be 4.

This one, in my opinion, stopped at a fairly early level; things could develop a fair bit more. However, assuming proportional time weighting (and bearing in mind that I did write my thesis on epistemic logic) the 22.5 minutes candidates were given for this is reasonably challenging.

#### Synthesis

I found STEP to be somewhat more demanding, perhaps because it had questions that focused more on depth rather than breadth, and also because I took the computer science stream of the paper while STEP just focuses on mathematics.

The average score of all candidates on questions 1 through 5 was 44.8, with 60.6 for successful applicants. Nonetheless, I subsequently worked through problems 3 and 4 and actually found them considerably easier than 2. Interestingly, there was a candidate who scored between 85 and 89 but was nonetheless rejected; also, no one scored 100, which seems surprising (this would certainly have been possible if not for the careless slip I made on problem 2 – or maybe the marking for some aspects such as the rigor demanded in proofs was stricter than my own marking).

This was quite enjoyable, actually. I’m not sure how I would have fared had I taken this exam in the past! I’d think that my knowledge of logic, as well as the more computer-science oriented topics (discrete mathematics and formal languages) would have improved, though knowledge of say calculus or analysis probably would be slightly weaker than when I finished high school.

# Odd Sums Sudoku

The puzzle above is from the WPF Sudoku GP, Round 4 2016 #13. To solve it, each row, column and marked 3×3 box must contain the digits 1 to 9 once each, and the solution must have, in the marked rows, maximal odd substrings summing to the given totals in the correct order.

I’ve always enjoyed a good puzzle, though I’ve never considered myself especially good at them. That said, I’ve done my fair share of sudoku, including several books where puzzles were labelled as very difficult; yet, fairly often identifying triples, or the odd X-wing, XY-wing or unique rectangle would be sufficient to blow through the puzzles. At least, I thought that clearing some “very difficult” puzzles in 8 or 10 minutes was blowing through them; I found the World Puzzle Federation’s Sudoku GP, where some of these had benchmark times along the lines of 3-4 minutes!

Furthermore, in addition to your standard “classic” puzzles which feature the standard rules (i.e. each row, column and marked 3×3 box must contain the digits from 1 to 9 once each), there are a couple of interesting variants that require understanding new rulesets and figuring out how to deal with them. After a while, as mentioned the classic puzzles largely tend to be solved with standard techniques, and the challenge there is dispatching them with extreme speed; the variants, on the other hand, tend to maintain some degree of freshness (at least as far as I’m concerned).

The above puzzle was an example of this; several of the rows are marked with additional clues, and for these rows, the largest odd substrings in the row must sum to the values indicated by the clues. For example, if a row was 178356924, the clue associated with it would be 8, 8, 9. Note that this is also generated by several other valid sequences, such as 532174689 or 174283569.

For the above puzzle, the 9, 5, 3, 8 clue is a good starting point, and indicates that we have 9, 5, 3 and (17 or 71) as substrings of the solution to that row in some order; given the position of 6, we can conclude that the 9 and the 3 must occur before the 6 (though we don’t know). We thus have 9 in the bottom left hand corner, an even number (either 2 or 4), and then a 3. Now, reading the first column clue 1, 12, 3, 9; we have 1, (57 or 75), 3 and 9 as substrings. The 9 is at the end, and 3 cannot occur before the 4 as there’s no space to fit it. Thus 3 occurs after the 4, but it has to be separated from the 9 by an even number. More simply, the second column clue 10, 9, 6 indicates that our 5 has to be preceded by an even number and then a 1. There’s only one spot left for a 7 in this lower box, so we can insert it.

We thus reach this position:

Moving on, we can consider the middle column. 3 can only be satisfied by a singular 3, so we know that the 3 in the bottom row cannot be in the middle column. It thus must be in the sixth column, with 1 and 7 taking up the eighth and ninth columns (in some order).

Now consider the rightmost column. It is 3, 10, 12; 3 can only be satisfied by a singular 3, so the odd substrings must be 3, (19 or 91) and (57 or 75). Given the position of the 2 (fourth row), it must be the case that the column has 1 and 9 in the fifth and sixth rows, an even number in the seventh, and 5 and 7 in the eighth and ninth. We thus clearly have 7 in the corner, and 5 above it.

Finally, we can actually decide on the ordering of 1 and 9 as well; consider the hint for the middle row, which ends with 8; a 9 would automatically exceed this. Thus the middle row must contain the 1 and the sixth row the 9. The hint gives us for free that we have a 7 to the left of the 1.

There are more interesting deductions we can make as well. Consider the seventh column. The clue is 16, 9 and this can be satisfied in two ways: {1357} in some order and then a singleton 9, or {79} and {135} in some order. However, we can prove that the latter construction is impossible; the middle row is an even number, and also we have already used the 1 and the 5 in the bottom right hand box so we can’t place those numbers there. Thus, the first four numbers must be {1357} in some order and there is a 9 somewhere in the bottom right box.

The clue in the first row requires a 5 at the end. Given the 10, 10, 5 clue and the 4 already placed in the sixth column, the only odd substring in the last box must be the 5. We thus must have the 5 in the first row, and now only the 3 can fit in the fourth row. We also gather that the last two columns of the first row must be even, and we have enough clues to place these (using standard Sudoku rules).

Finally, we can place the 4 in the middle right box (now that the 3 is there, there’s only one space left where it can fit). Similarly, we can place the 5 (knowing that it can’t be in the seventh column any longer).

From this point on, solving the rest of the puzzle should be reasonably straightforward. The next step would probably involve placing 1, 9, 6, 7 and 3 in the first row (in that order – using the column hints and the knowledge that we have two tens)!

For these contests, I’d say the challenge lies largely in figuring out how the ruleset can be used to help solve the puzzle (and, well, speed; you’re thrown 13-15 puzzles to solve in the space of 90 minutes and I can only usually manage about seven or eight, and for that matter the ones I do finish are the ones on the easier end of the scale!).

# Revisiting Mathematics Competitions (Hotseat: AIME II, 2015)

I remember I had a pretty good time doing the Haskell January Test earlier this year, and thought it might be both enjoyable and useful to revisit some of these past mental exercises. Hotseat is intended to be a series that discusses my thoughts, ideas and reflections as I take or re-take examinations or competitions; many of these will end up being related to mathematics or computer science in some way, though I’ll try to also include a few that go against that grain.

#### Background

I didn’t actually participate in the American Invitational Mathematics Examination (AIME) as I don’t think my high school offered it (the only maths competitions I remember taking myself are the Singapore Math Olympiad and the Australian Mathematics Competition). This is typically the “second round” of the main mathematical competition in the US (the first being the AMC, and the third being the USAMO), and features 15 problems, all of which have integer answers from 0 to 999. The point of this is to facilitate machine scoring. While the questions might indeed have integer answers, they can be very difficult especially towards the end.

2015 II was supposed to be an easy paper, though I didn’t find it to be very much easier than what I recall these to be. You can find the paper here.

I managed to fight my way through 12 out of 15 questions, leaving out two hard problems towards the end and a not-so-difficult one in the middle (mainly because I struggle somewhat with geometry in three dimensions). I think I’m not used to doing these as quickly as I would have in the past as well; problem 14, for instance, would probably have taken under 15 minutes in the past while it now took 32. 8 minutes on problem 1 is rather bad, as well.

The values below indicate the time spent on each problem; here green means the problem was answered correctly while grey means no answer was given. (In practice, in a contest you would take the 1-in-1000 shot and guess, since there are no penalties for guessing!). I’ve also outlined the subject areas – you can see that I tend to be stronger with combinatorics and probability, and struggle somewhat with geometry!

I think I remember mentioning in a post a very long time ago that when I look at data, I tend to initially look for things that seem unusual. Bearing in mind that the difficulty of these papers tends to rise, we could say the 32 minutes spent on question 14 or 15 on question 7 is perhaps not too far from expectations (the target “pace” here is 12 minutes per question, if you’re trying to finish). Bearing that in mind, we can look at a couple of outliers:

• Problems 1, 6 and 11 took unduly large amounts of time.
• Problems 2, 3, 8 and 12 were done well above pace.

Some of these were actually pretty interesting, which is one reason I tend to enjoy these competitions as well.

#### Selected Problems in Depth

Problem 1: Let $N$ be the least positive integer that is both 22 percent less than one integer and 16 percent greater than another integer. Find the remainder when $N$ is divided by 1000.

For this question you were actually supposed to find $N$; the “divided by 1000” shenanigans was so that the answer would be an integer from 0 to 999. The challenge here was really formulating a requisite equation from the problem statement, which would be

$\dfrac{39}{50} k_1 = N = \dfrac{29}{25} k_2$

and then reasoning that $N$ must be divisible by both 29 and 39, such that $k_1$ and $k_2$ were both integers. Since these are relatively prime $N = 29 \times 39 = 1131$ and the answer was 131. I guess I hadn’t really warmed up yet, and wanted to be very careful to set up the expression correctly, so this took rather long.

6 was a bash with the Vieta formulas, though my brain temporarily switched off when I somehow thought a cubic equation should have four roots. 11, on the other hand, was a rather messy geometry question which I initially found tough to visualise. The solution on Art of Problem Solving involves constructing perpendiculars from the centre of the circle to the two lines, but I didn’t think of that (and for that matter wouldn’t have known that was a property of a circumcentre), and instead had a rather painful trigonometry bash to find the answer.

On the brighter side, 2 and 3 were rather simple questions (an elementary probability question, and a rather quick modulo-arithmetic one) and I cleared them off very quickly. I dealt with 8 using some rather quick-and-dirty casework that yielded the desired result, though it wasn’t too interesting. I’d say 12 was a bit more of a fun one.

Problem 12There are $2^{10} = 1024$ possible 10-letter strings in which each letter is either an A or a B. Find the number of such strings that do not have more than 3 adjacent letters that are identical.

I thought in terms of recurrences and dynamic programming, and realised that valid strings (in the long case) must end in BA, BAA, BAAA or their flipped versions (alternatively, with a blank space in front). To end a string with BAA, it must previously have ended with BA, and so on for BAAA. Thus, the following equations drop out:

$BA(n) = AB(n - 1) + ABB(n - 1) + ABBB(n - 1)$

$BAA(n) = BA(n - 1); BAAA(n) = BAA(n - 1)$

Since the AB and BA cases are symmetric, we can simplify the first equation above to

$BA(n) = BA(n - 1) + BAA(n - 1) + BAAA(n - 1)$

And we have of course the base cases; $BA(1) = 1, BAA(1) = 0, BAAA(1) = 0$. Computing $2 \left( BA(10) + BAA(10) + BAAA(10) \right)$ is then not difficult, and yields $548$.

In mathematical terms I would say problem 14 required some insight, though for seasoned competitors it would probably still not have been too difficult.

Problem 14Let $x$ and $y$ be real numbers satisfying $x^4y^5 + y^4x^5 = 810$ and $x^3y^6 + y^3x^6 = 945$. Evaluate $2x^3 + (xy)^3 + 2y^3$.

I initially tried squaring the resultant expression, and combining the two given results in various ways. This went on for about 20 minutes with little meaningful progress. I then took a step back, attempted and finished up with problem 11, and came back to this. I suddenly had the idea of combining the two givens to force a cubic factorisation:

$x^3y^6 + 3x^4y^5 + 3x^5y^4 + x^6y^3 = 810 \times 3 + 945$

And that simplifies down to

$x^3y^3 (x+y)^3 = 3375 \leadsto xy(x+y) = 15$

We then proceed by factoring the first result:

$x^3y^3 (xy(x+y)) = 810 \leadsto x^3y^3 = 54$

We can get $17.5$ for $x^3 + y^3$ by factoring the second given to $x^3y^3(x^3 + y^3)$. The final result is then just $35 + 54 = 89$.

I subsequently figured out solutions for 13 (my brief intuition about complex numbers was kind of on the right track, but I didn’t have time) and then 9 (not good with 3D geometry), and had a look at the rather painful solution for 15.

#### Meta-Analysis

It’s worth taking a look at how this performance was relative to the actual score distribution, which is available on the AMC statistics page. A score of 12 would be in the 96.09th percentile, which is interesting; I did the 2015 AIME I a few weeks ago and scored 11, but on that contest that was the 97.49th percentile!

There’s also a report on item difficulty (i.e. the proportion of correct responses to each problem). I’ve graphed this below (items I answered correctly are in green, while those I did not are in red):

It’s interesting that problem 12, a problem I found to be one of the easier ones (assuming you sort by success and then time, I found it to be the 8th hardest) turned out to be the second “hardest” problem on the paper! I’d attribute some of this to a programming background; if I took this contest when I was in high school I think I would probably struggle with 12.

13 and 15 were hard as expected; the next hardest problem in general after that was 10 (just a notch above 12 in difficulty for me) though again it relies on tricky combinatorics. We then have 6, 11 and 8 before 14; I found 6 and 11 relatively harder as well, as mentioned above, though 8 was fairly trivial for me. It turns out that my geometry weakness is fairly real, as 9 was completed by more than half of the participants! We can also observe problem 1 being challenging relative to its location in the paper.

#### Conclusion

This was a good mental workout, and it felt nice to be able to hack away at the various problems as well. I think a 12 is pretty solid (the 96th percentile is that of AIME qualifiers, which itself is something like the top 5 percent of high school students taking the AMC, so that would be the 99.8th percentile – and you could argue that considering only students taking the AMC already introduces a selection bias towards those with greater aptitude). I’d probably have been able to solve 9 and maybe 15 if I was doing this back in high school, though I’m not sure I would have managed 12.

I can certainly feel a loss of speed, though I think my problem-solving ability is for the most part intact, which is good.

• 1
• 2