A teammate asked me a computer science / algorithm question over lunch. It took me somewhere between 30 to 40 minutes of thought and discussion with teammates to solve the problem, inclusive of writing about 20 lines of Java; I can see it being a kind of ‘brain teaser’ interview question (and thus generally best avoided). That said, genuinely not having heard this problem before, I think it was interesting to see the thought process I went through to get to the final answer.
You have 1,000 buckets of water, 1 of which is poisoned. It is known that the poison if ingested will kill a pig within 15 minutes. You have one hour to determine which bucket contains the poison. What is the minimum number of pigs that you need, and how would you do it? (Eventually, of course, you want to generalise your answer.)
After some initial “lateral thinking”-style questions (does the poison only kill pigs? is the death timing exact?), I determined that it was intended to be a genuine algorithms problem. As a computer scientist, I then thought of binary encodings. This gives a quick solution to the case where you have just 15 minutes to determine which bucket is poisoned – if you convert each number to a 10-bit binary number (so the first bucket is 0000000000, the second is 0000000001, and so on) and give the first bucket to no pigs, the second to pig number 10, the third to pig number 9, the fourth to pig numbers 9 and 10 and so on, the final state of the pigs 15 minutes later allows you to determine the poisoned bucket. If we used this method and only pigs 5 and 7 died, then the poisoned bucket has binary representation 0000101000, which corresponds to bucket number 40. In any case, we know the answer is bounded above by 10.
I then considered ways of extending this solution given that we can run four trials instead of one. I did consider partial experiments, but didn’t see a way this would be helpful. One quick possibility we came up with was to linearly split the buckets among a number of pigs to cut down the search space – though this risks losing one pig at each iteration. Still, we determined that 6 pigs would work:
- On trial 1, you have 1000 buckets and 6 pigs – divide into 7 groups of at most 143.
- On trial 2, you have 143 buckets and possibly 5 pigs – divide into 6 groups of at most 24.
- On trial 3, you have 24 buckets and possibly 4 pigs – divide into 5 groups of at most 5.
- On trial 4, you have 5 buckets and possibly 3 pigs – the binary method can distinguish 8 buckets.
Yet, this same method wouldn’t work with 5 pigs. We were ready to settle on 6 as the answer, but the question asker revealed that it was possible to do better than 6. The idea of imbalanced groups then came up – the ‘no risk’ group of buckets should be larger, because in that case we could guarantee that we would still have all our pigs if we needed to investigate that group.
I guess I haven’t done algorithm interviews for a good while, and it thus took me to this point before I thought of setting up a recurrence relation. Define as the maximum number of buckets distinguishable with
trials and
pigs remaining. I then set up this relation:
The last case comes about because we can safely allocate a group of buckets to be given to no pigs, since with one fewer trial we can distinguish between them. For each of the pigs present, we need to prepare to continue the investigation with one fewer trial and one fewer pig, hence
.
This is more efficient, but doesn’t quite get us there yet. I noted that was somewhere in the 800s, which established that strategies based on risking one pig in every round apart from the last would be insufficient. Thinking a little more about it, I realised that in the extreme case, you could always have a special bucket each round that you gave to all pigs – if they all died that round, we would know that bucket was poisoned. However, you couldn’t have more than one of these per round – if they all died, you wouldn’t know which bucket was the cause.
More generally, you could split the groups of buckets into groups that are given to zero, one, two… up to pigs, where
is the number of pigs still left. The groups given to smaller numbers of pigs could likely be larger, to account for the additional resources available in later rounds. The following recurrence relation then drops out:
This actually has a closed form solution. I didn’t notice it when looking at the recurrence, though it became apparent when I coded up a program to calculate the required . I was surprised to see how much additional discriminatory power five pigs had, though four wouldn’t be sufficient. Interestingly, five pigs and three periods were sufficient. The general closed form solution is simply
and our argument provides a method for actually identifying the relevant bucket.
Finally, an interesting challenge here was to establish that it wasn’t possible to achieve this with four pigs (so that 5 would be the minimum). One way to do this was to reason in terms of information theory – a pig either survives all the way or dies after a given round, and then it stays dead. This is many possibilities, and there are
pigs, so it seems that we can’t do better than
. Clearly
so four pigs isn’t enough; also, this expression matches the value obtained by our recurrence-based method, suggesting that that is indeed optimal.