# Constant Propagation (January Haskell Tests 2018 – Part 1)

The latest iteration of the January Haskell Tests is out (see Dr Tony Field’s page, and my post on the 2017 test). I think the last time I wrote any Haskell was eleven months ago when taking the 2017 test for a spin; nonetheless I was able to pick things up quickly. I guess the syntax (outside of monads) is pretty intuitive.

This test in some ways felt a fair bit easier than the last, at least in terms of the assessed portion (the last time round, it took me up to just past the two-hour mark to clear that; this time round, I was done with the questions that were actually worth marks by 1:15). However, the ‘bonus’ unassessed Part 4 seemed hard; finishing it in time looked to be a pretty tall order. Part 4 itself could be decomposed into three fairly chunky parts, and could even be a reasonable basis for a separate test!

The test dealt with optimising code that was already in static single assignment (SSA) form. I’d covered this topic before in a course on software reliability (within the context of static program verification), though I don’t remember covering it within the context of (possibly optimising) compilers. SSA form is fairly restrictive:

• Each variable can only be assigned to once.
• Operator arguments can only comprise constants and direct variable references.

For example, the assignment $x = 4 * y + 7 * z$ is clearly not in SSA form; but it can be rewritten as $a = 4*y; b = 7*z; x = a + b$ where each statement is. In most programming languages variables can be assigned to multiple times; as part of conversion, we tend to add suffixes to variables for versioning. A more involved example follows:

long notSsa(x, y) {
a = x + 100;
b = 2 * y + 3; // compound expression
b = b + x; // can't reassign!
return a + b;
}

long ssaVersion(x, y) {
a0 = x + 100;
k0 = 2 * y;
b0 = k0 + 3;
b1 = b0 + x;
return a0 + b1;
}

SSA also allows for the concept of a phi function, which has two expressions, and non-deterministically returns one or the other. This is useful for performing optimisations regarding control flow. For example, consider this function:

long veracity(x):
z = 1;
y = 5;
t = 0;
if (x > 0) {
z = 2;
t = z + y;
} else {
z = 3;
t = (2 * y) - z;
}
return x + t;
}

We can rewrite this using a phi function; after an if-block, the final value of a variable could be the value at the end of either branch.

long veracity(x):
z0 = 1;
y0 = 5;
t0 = 0;
if (x0 > 0) {
z1 = 2;
t1 = z1 + y0;
} else {
z2 = 3;
k0 = 2 * y1;
t2 = k0 - z2;
}
z3 = PHI(z1, z2);
t3 = PHI(t1, t2);
return x0 + t3;
}

Although one had to work with phi functions as early as Part 2 (the function written at the end of part 2 should be able to simplify the above function significantly), thinking about the purpose of phi functions certainly wasn’t important until Part 4.

Part 1 involved writing an interpreter to execute statements in a (very) simple subset of C, where code has already been parsed into tokens. I think this was intended to set a baseline for students who were struggling with the material; I knocked through the 14 marks worth (which is enough for a pass; that’s 47%) in just 17 minutes.

Things got somewhat tougher in part 2. The opening was easy, with two 3-mark questions inviting candidates to implement simple constant folding optimisations (in particular, that $0 + x = x$ and $\phi(x, x) = x$, along with calculating out results) and then determining the value of an expression given knowledge of some variable values.

The tricky part then began, with a task to implement constant propagation and folding to actual code blocks in SSA form. Conceptually, this is not difficult – once the value of a variable is known, it can be substituted into all subsequent occurrences. A fix-point approach fits naturally, since knowing something about some variables can lead to knowledge about more variables. Nonetheless, I struggled for a while with implementing the suggested worklist algorithm; the aforementioned fix-point approach was suggested too, and would probably have been more elegant, if harder to implement. I finished this part at 1:01.

Part 3 involved removing phi functions if it was not possible to clean them up as part of the SSA optimisation, by pushing the variable assignments back into code. It seems a decision was made to state the rules for these explicitly in the specification, for some reason (to be fair, as Tony notes it is straightforward), which seemed to make for a pretty tame last assessed part. There was some tricky thinking required here, nonetheless; I think one has to be very careful with handling multiple phis!

Finally, Part 4 tasked students with translating a function to SSA form. This itself is weighty enough that I think another test could easily be written on it. This part can be decomposed into three (maybe four) separate challenges:

1. Translate basic blocks to SSA form correctly.
1. Maintain correct information about versioning of variables.
2. Decompose compound statements (like $x = 3 * y + 5 * z$).
lookupWithFailure :: Id -> UsageMap -> Int
= fromMaybe (0) (lookup id map) - 1 -- TODO hack