Constant Propagation (January Haskell Tests 2018 – Part 1)

The latest iteration of the January Haskell Tests is out (see Dr Tony Field’s page, and my post on the 2017 test). I think the last time I wrote any Haskell was eleven months ago when taking the 2017 test for a spin; nonetheless I was able to pick things up quickly. I guess the syntax (outside of monads) is pretty intuitive.

This test in some ways felt a fair bit easier than the last, at least in terms of the assessed portion (the last time round, it took me up to just past the two-hour mark to clear that; this time round, I was done with the questions that were actually worth marks by 1:15). However, the ‘bonus’ unassessed Part 4 seemed hard; finishing it in time looked to be a pretty tall order. Part 4 itself could be decomposed into three fairly chunky parts, and could even be a reasonable basis for a separate test!

The test dealt with optimising code that was already in static single assignment (SSA) form. I’d covered this topic before in a course on software reliability (within the context of static program verification), though I don’t remember covering it within the context of (possibly optimising) compilers. SSA form is fairly restrictive:

  • Each variable can only be assigned to once.
  • Operator arguments can only comprise constants and direct variable references.

For example, the assignment  x = 4 * y + 7 * z is clearly not in SSA form; but it can be rewritten as  a = 4*y; b = 7*z; x = a + b where each statement is. In most programming languages variables can be assigned to multiple times; as part of conversion, we tend to add suffixes to variables for versioning. A more involved example follows:

long notSsa(x, y) {
  a = x + 100;
  b = 2 * y + 3; // compound expression
  b = b + x; // can't reassign!
  return a + b;
}

long ssaVersion(x, y) {
  a0 = x + 100;
  k0 = 2 * y;
  b0 = k0 + 3;
  b1 = b0 + x;
  return a0 + b1;
}

SSA also allows for the concept of a phi function, which has two expressions, and non-deterministically returns one or the other. This is useful for performing optimisations regarding control flow. For example, consider this function:

long veracity(x):
  z = 1;
  y = 5;
  t = 0;
  if (x > 0) {
    z = 2;
    t = z + y;
  } else {
    z = 3;
    t = (2 * y) - z;
  }
  return x + t;
}

We can rewrite this using a phi function; after an if-block, the final value of a variable could be the value at the end of either branch.

long veracity(x):
  z0 = 1;
  y0 = 5;
  t0 = 0;
  if (x0 > 0) {
    z1 = 2;
    t1 = z1 + y0;
  } else {
    z2 = 3;
    k0 = 2 * y1;
    t2 = k0 - z2;
  }
  z3 = PHI(z1, z2);
  t3 = PHI(t1, t2);
  return x0 + t3;
}

Although one had to work with phi functions as early as Part 2 (the function written at the end of part 2 should be able to simplify the above function significantly), thinking about the purpose of phi functions certainly wasn’t important until Part 4.

Part 1 involved writing an interpreter to execute statements in a (very) simple subset of C, where code has already been parsed into tokens. I think this was intended to set a baseline for students who were struggling with the material; I knocked through the 14 marks worth (which is enough for a pass; that’s 47%) in just 17 minutes.

Things got somewhat tougher in part 2. The opening was easy, with two 3-mark questions inviting candidates to implement simple constant folding optimisations (in particular, that  0 + x = x and \phi(x, x) = x , along with calculating out results) and then determining the value of an expression given knowledge of some variable values.

The tricky part then began, with a task to implement constant propagation and folding to actual code blocks in SSA form. Conceptually, this is not difficult – once the value of a variable is known, it can be substituted into all subsequent occurrences. A fix-point approach fits naturally, since knowing something about some variables can lead to knowledge about more variables. Nonetheless, I struggled for a while with implementing the suggested worklist algorithm; the aforementioned fix-point approach was suggested too, and would probably have been more elegant, if harder to implement. I finished this part at 1:01.

Part 3 involved removing phi functions if it was not possible to clean them up as part of the SSA optimisation, by pushing the variable assignments back into code. It seems a decision was made to state the rules for these explicitly in the specification, for some reason (to be fair, as Tony notes it is straightforward), which seemed to make for a pretty tame last assessed part. There was some tricky thinking required here, nonetheless; I think one has to be very careful with handling multiple phis!

Finally, Part 4 tasked students with translating a function to SSA form. This itself is weighty enough that I think another test could easily be written on it. This part can be decomposed into three (maybe four) separate challenges:

  1. Translate basic blocks to SSA form correctly.
    1. Maintain correct information about versioning of variables.
    2. Decompose compound statements (like x = 3 * y + 5 * z).
  2. Add support for conditionals.
  3. Add support for loops (do-while was a part of the simple C subset used).

Squeezing all of that into the remainder of the time after finishing the first three parts would have been tough; I finished the first two full tasks (basic blocks at 2:07 and conditionals at 2:46) and squeezed out a comment on what to do for the last sub-task. The last part of task 2 was also written rather hackily as I was racing the clock. For a quick preview:

lookupWithFailure :: Id -> UsageMap -> Int
lookupWithFailure id map
  = fromMaybe (0) (lookup id map) - 1 -- TODO hack

To be fair, I could probably have churned all three main and three bonus parts out in Java within the time-limit; it’s nonetheless useful to engage a different set of programming skills.

Last year, I wrote that I enjoyed revisiting Haskell; I enjoyed this year’s task too. I think I actually enjoyed this installment more, probably because of the outsize difficulty in Part 4! Although I haven’t written professional Haskell, I often find myself writing functional-style code in Java, and I’d still give it an enthusiastic thumbs-up for a choice of first language to teach at university. There are some influences from professional development coming in here, too; a year ago I probably wouldn’t have batted an eye at using types like a Haskell [(Str, [Int])] or Java Map<String, List<Integer>>, while I’ll tend to give structures types more quickly. My tolerance for long functions (without good reason, in non performance-critical scenarios) has also certainly decreased.

This post is Part 1… I’ll take a more in depth look at Part 4 of the exam in Part 2.


Leave a Reply