Playing with Blocks (Hotseat: British Informatics Olympiad R1, 2015)

Second in a series. I apologise for the temporary hiatus, had a couple of weeks travelling and then intense sessions at work (part of this was my choice, but still).

Background

I was not in the UK for high school, and studying computing/informatics wasn’t common in my high school in Singapore either. I never actually participated in the British Informatics Olympiad. This is the UK’s national high-school competition in computing, and unlike normal programming contests there’s an explicit portion involving reasoning and discrete mathematics which I like. This can get intense; problems especially in the second round can feature fairly complex reasoning – such as 2009’s second round which required understanding models of computation in FRACTRAN (and eventually deciphering a “program” that computed the Fibonacci series). For quite a few contest problems, intuitively determining a solution, or even being uncertain but firing it off anyway in hopes of getting it accepted is much easier than trying to prove that the solution is correct.

The first round is somewhat less intense, but still has plenty of opportunities for interesting problems. There are typically three tasks; the first is gentle and meant to be a warm-up, the second tests implementation (perhaps more like what I might expect on a programming exam in university) and the third tests algorithms (in a sense, more like a standard contest problem). Each question has most of the points (about 75%) allocated to implementation and the remainder to several “reasoning” questions. Partial scores are awarded for solutions that aren’t optimal or fail on some cases (much like in the IOI, though the information is not known to the students).

This was one of the easier papers I’ve done; I worked my way through the tasks very quickly, and didn’t have too much of a problem (other than a careless error in problem 2). Most of the time I score above 90 points on these, though sometimes if I don’t “get” problem 3 in an optimal way or make subtle errors on problem 2 the score will be a fair bit lower.

The specific paper I did is here.

Problem 1: Block Palindromes

Given a string, find the number of ways of decomposing it into at least two blocks such that when the blocks are reversed (but not the letters within the blocks), the arrangement of blocks looks the same. For example, BARBAR is a block palindrome, since (BAR)(BAR) reads the same forwards and backwards. Of course, every ordinary palindrome is also a block palindrome, since taking one-letter blocks is allowed.

Not too much here; the problem is readily amenable to a simple recursive implementation (try taking each block length from 1 to half the string length and see if the string ends with that prefix – if so, recurse and add that result to our total). It would be possible to use memoisation to improve performance here, since there can be repeated subproblems e.g. with AABCBAA, you would be solving for the number of solutions to BCB twice. However, given the small constraints (strings of up to 10 characters) this wasn’t needed.

This problem probably had one of the trickier reasoning questions though – if all groupings of a block palindrome contain an even number of blocks, how many groupings can there be (and why)?

I attacked this one with a proof by contradiction – that it is not possible to have any groupings

S = B_1 B_2 \ldots B_{n-1} B_n B_{n-1} \ldots B_2 B_1

for n \geq 2. This can be shown by construction:

S' = (B_1) (B_2 \ldots B_{n-1} B_n B_{n-1} \ldots B_2) (B_1)

is a valid decomposition, which has three blocks. Of course, the empty string isn’t a valid block, so we require S = B_1 B_1 and the number of groupings has to be 1. We can probably formulate an even stronger condition (for instance, that apart from B_1 itself, no prefix of B_1 is also a suffix of B_1). While the reasoning seemed sound to me, I wasn’t too confident I had the right answer here.

(Note: The question text implied that a block palindrome could be decomposed into blocks in some way, so while 0 satisfies “all groupings” it wouldn’t be a block palindrome.)

Problem 2: Battleships

Given a funky (deterministic) algorithm for generating random numbers, implement it, and use it to fill a grid with ships of certain sizes via some kind of rejection sampling. The key constraints are that the grid has a finite size, and more interestingly ships are not allowed to touch (even diagonally).

Implementing the random-generator is pretty much following instructions from the spec. However, there are several ways to implement the latter constraint. Given that optimality is generally not important in problem 2, I went for a rather nasty solution that, say, for a k-ship placed horizontally would iterate through a 3 \times (k + 2) bounding rectangle around the ship and mark all those squares as unusable. Better asymptotic complexity can probably be obtained using an R-tree (though that is probably not reasonable for contests – especially without a reference!).

There was a rather nasty “reasoning” problem about the number of ways to place ships of size 4, 3, 2 and 1 on a 5 by 5 grid – this was also where I lost the five points. I hacked my implementation to, instead of using the random number generator, simply use backtracking to count the number of possible placements. However, I forgot that my bounding rectangle system was based on booleans (in other words, a square that was unusable because of two different ships would suddenly become usable after just one was removed), and thus came up with the impressive total of 427,840 possibilities (correct answer: 1,424)!

I subsequently fixed the aforementioned bug (using integers instead) and confirmed that the answer was indeed 1,424. During the test I did try smoke-tests that verified that a 1-ship could be placed in 25 ways and a 5-ship in 10; I attempted to do two 1-ships, but decided that I would have had too much pain working through the combinatorics for that.

Problem 3: Modern Art

This problem revolved around (reasonably) efficiently determining the ith permutation of a string in terms of lexicographic ordering, possibly involving duplicate characters. For example, in a string which consists of five As, Bs, Cs and Ds – what is the 11,223,344,556th permutation?

Of course, for smaller strings and smaller numbers it would suffice to implement some kind of next-permutation generator. This is a pretty standard algorithm. We can then step through the permutations until we get what we want – and there are some test cases that would be solvable with this; you would get (I think) 11 of the 25 points available.

The key idea is that we don’t want to explore all of the permutations. From combinatorics, we can quickly calculate the number of permutations of five As, Bs, Cs and Ds:

\displaystyle {20 \choose 5} \times {15 \choose 5} \times {10 \choose 5} \times {5 \choose 5} = 11,732,745,024

Intuitively, we are close to the “end”; we can thus be pretty sure that our 11-millionth permutation should start with a D. We can thus write a recursive (or iterative) algorithm that identifies the first character based on this search.

I began by implementing a function that computes the number of ways of permuting a given number of As, Bs, Cs and Ds. This was done in a bit of a hacky way and has some (probably unnecessary) memoisation.

private static long numPerms(long as, long bs, long cs, long ds) {
  List<Long> longs = Lists.newArrayList(as, bs, cs, ds);
  Collections.sort(longs);
  if (cache.contains(longs)) {
    return cache.get(longs);
  }
  long totalChars = as + bs + cs + ds;
  long perms = 1L;

  // Computes (total C l) then updates total
  for (Long l : longs) {
    long num = 1;
    long denom = 1;
    for (int i = 0; i < l; i++) {
      num *= totalChars - i;
      denom *= i + 1;
    }
    perms *= num / denom;
    totalChars -= l;
  }

  cache.put(longs, perms);
  return perms;
}

Interestingly, it turns out it would actually have been okay to build the entire numerator and denominator and then divide the two; perhaps 20 was chosen to avoid complications with overflow (since 21! will exceed the max value of a 64 bit integer). Anyway, the highest value the numerator could have taken on here was 1,860,480 (20 to 16 multiplied together) so I was sure I was safe this way.

We then go through the characters in lexicographic order, and for each one, consider the number of permutations that would be feasible. The general approach is something like this:

private static String getPermutation(long as, long bs, long cs, long ds, long offset) {
  if (as == 0 && bs == 0 && cs == 0 && ds == 0) {
    return ""; // we've built the entire thing
  }

  long permsCovered = as == 0 ? 0 : numPerms(as - 1, bs, cs, ds);
  if (offset < permsCovered) {
    return "A" + getPermutation(as - 1, bs, cs, ds, offset);
  }
  long permsPassed = permsCovered;
  permsCovered += bs == 0 ? 0 : numPerms(as, bs - 1, cs, ds);
  if (offset < permsCovered) {
    return "B" + getPermutation(as, bs - 1, cs, ds, offset - permsPassed);
  }
  permsPassed = permsCovered;
  permsCovered += cs == 0 ? 0 : numParams(as, bs, cs - 1, ds);
  if (offset < permsCovered) {
    return "C" + getPermutation(as, bs, cs - 1, ds, offset - permsPassed);
  }
  // and the same for D.
}

I actually finished the implementation fairly quickly, but decided to write a couple of tests; I stepped through the 12 permutations of “ABBC” as given in the spec, and also wrote a test for “AAABBBCCCDDD” that all the permutations were generated in the correct order (using numPerms to give me the range). I also of course tried a couple of edge-ish cases such as each of “A”, “B”, “C” and “D”, and a test to verify we could handle the expected scale (five of each character, retrieving the 1st as well as 11732745024th permutation).

The “reasoning” problems on this one were very easy compared to the other two questions.

Conclusion

Unfortunately meta-analysis is somewhat limited here as unlike the AIME results, the ones for the BIO aren’t public. I’m pretty sure 95 is a solid score as the contest isn’t that easy, but I won’t be surprised if there are multiple 100s (with a bit of caution I could have gotten one)!

Style during programming contests is an interesting issue. I find that many contestants especially with C++ like to use rather bespoke preprocessor routines (e.g. cin as f, cout as g). I’m actually used to some of them as well e.g. ll for long long, vpll for vector<pair<long, long>>. If one understands these it’s probably better for speed, though when debugging I find having not-unreasonable names is preferable (even if my code is written more for speed than style).


Leave a Reply