The latest iteration of the January Haskell Tests is out (see Dr Tony Field’s page, and my post on the 2017 test). I think the last time I wrote any Haskell was eleven months ago when taking the 2017 test for a spin; nonetheless I was able to pick things up quickly. I guess the syntax (outside of monads) is pretty intuitive.
This test in some ways felt a fair bit easier than the last, at least in terms of the assessed portion (the last time round, it took me up to just past the two-hour mark to clear that; this time round, I was done with the questions that were actually worth marks by 1:15). However, the ‘bonus’ unassessed Part 4 seemed hard; finishing it in time looked to be a pretty tall order. Part 4 itself could be decomposed into three fairly chunky parts, and could even be a reasonable basis for a separate test!
The test dealt with optimising code that was already in static single assignment (SSA) form. I’d covered this topic before in a course on software reliability (within the context of static program verification), though I don’t remember covering it within the context of (possibly optimising) compilers. SSA form is fairly restrictive:
- Each variable can only be assigned to once.
- Operator arguments can only comprise constants and direct variable references.
For example, the assignment is clearly not in SSA form; but it can be rewritten as
where each statement is. In most programming languages variables can be assigned to multiple times; as part of conversion, we tend to add suffixes to variables for versioning. A more involved example follows:
long notSsa(x, y) { a = x + 100; b = 2 * y + 3; // compound expression b = b + x; // can't reassign! return a + b; } long ssaVersion(x, y) { a0 = x + 100; k0 = 2 * y; b0 = k0 + 3; b1 = b0 + x; return a0 + b1; }
SSA also allows for the concept of a phi function, which has two expressions, and non-deterministically returns one or the other. This is useful for performing optimisations regarding control flow. For example, consider this function:
long veracity(x): z = 1; y = 5; t = 0; if (x > 0) { z = 2; t = z + y; } else { z = 3; t = (2 * y) - z; } return x + t; }
We can rewrite this using a phi function; after an if-block, the final value of a variable could be the value at the end of either branch.
long veracity(x): z0 = 1; y0 = 5; t0 = 0; if (x0 > 0) { z1 = 2; t1 = z1 + y0; } else { z2 = 3; k0 = 2 * y1; t2 = k0 - z2; } z3 = PHI(z1, z2); t3 = PHI(t1, t2); return x0 + t3; }
Although one had to work with phi functions as early as Part 2 (the function written at the end of part 2 should be able to simplify the above function significantly), thinking about the purpose of phi functions certainly wasn’t important until Part 4.
Part 1 involved writing an interpreter to execute statements in a (very) simple subset of C, where code has already been parsed into tokens. I think this was intended to set a baseline for students who were struggling with the material; I knocked through the 14 marks worth (which is enough for a pass; that’s 47%) in just 17 minutes.
Things got somewhat tougher in part 2. The opening was easy, with two 3-mark questions inviting candidates to implement simple constant folding optimisations (in particular, that and
, along with calculating out results) and then determining the value of an expression given knowledge of some variable values.
The tricky part then began, with a task to implement constant propagation and folding to actual code blocks in SSA form. Conceptually, this is not difficult – once the value of a variable is known, it can be substituted into all subsequent occurrences. A fix-point approach fits naturally, since knowing something about some variables can lead to knowledge about more variables. Nonetheless, I struggled for a while with implementing the suggested worklist algorithm; the aforementioned fix-point approach was suggested too, and would probably have been more elegant, if harder to implement. I finished this part at 1:01.
Part 3 involved removing phi functions if it was not possible to clean them up as part of the SSA optimisation, by pushing the variable assignments back into code. It seems a decision was made to state the rules for these explicitly in the specification, for some reason (to be fair, as Tony notes it is straightforward), which seemed to make for a pretty tame last assessed part. There was some tricky thinking required here, nonetheless; I think one has to be very careful with handling multiple phis!
Finally, Part 4 tasked students with translating a function to SSA form. This itself is weighty enough that I think another test could easily be written on it. This part can be decomposed into three (maybe four) separate challenges:
- Translate basic blocks to SSA form correctly.
- Maintain correct information about versioning of variables.
- Decompose compound statements (like
).
- Add support for conditionals.
- Add support for loops (do-while was a part of the simple C subset used).
Squeezing all of that into the remainder of the time after finishing the first three parts would have been tough; I finished the first two full tasks (basic blocks at 2:07 and conditionals at 2:46) and squeezed out a comment on what to do for the last sub-task. The last part of task 2 was also written rather hackily as I was racing the clock. For a quick preview:
lookupWithFailure :: Id -> UsageMap -> Int lookupWithFailure id map = fromMaybe (0) (lookup id map) - 1 -- TODO hack
To be fair, I could probably have churned all three main and three bonus parts out in Java within the time-limit; it’s nonetheless useful to engage a different set of programming skills.
Last year, I wrote that I enjoyed revisiting Haskell; I enjoyed this year’s task too. I think I actually enjoyed this installment more, probably because of the outsize difficulty in Part 4! Although I haven’t written professional Haskell, I often find myself writing functional-style code in Java, and I’d still give it an enthusiastic thumbs-up for a choice of first language to teach at university. There are some influences from professional development coming in here, too; a year ago I probably wouldn’t have batted an eye at using types like a Haskell [(Str, [Int])] or Java Map<String, List<Integer>>, while I’ll tend to give structures types more quickly. My tolerance for long functions (without good reason, in non performance-critical scenarios) has also certainly decreased.
This post is Part 1… I’ll take a more in depth look at Part 4 of the exam in Part 2.