# Algorithmic Modelling – Codenames

Codenames is a board game designed by Vlaada Chvátil that tests communication. The game is played with two teams; each team splits itself into clue-givers and guessers.

There is a board of 25 cards, each with one (or more) words on it. Of these cards, 9 are owned by the first team, 8 by the second team, 7 are neutral and one is the ‘assassin’.

The clue-givers attempt to communicate to the guessers which cards are owned by their team. This is done by giving a one-word clue followed by the number of cards N corresponding to the clue. The rules establish some bounds on allowed associations (for example, ‘sounds like’ clues are not allowed).

I don’t know how much time went into selecting the words to appear on individual cards, but there are certainly many words in the deck that can be interpreted in many ways, which makes the game fun. For example, I can think of quite a number of ways to clue BOND (COVALENT for unambiguity; DURATION, ASSET or DEBT for the financial instrument; JAMES, AGENT or SEVEN or even at a stretch something like CASINO for the character; FRIEND or FRIENDSHIP; STREET or TUBE; CONTRACT, PROMISE or WORD for the idea of a promise; ADHERE or CONNECT, among others). Which one I might pick would depend on the other words on the board.

Guessers then identify up to N+1 cards they think their team owns. This can be based on the entire history of clues given, not just the previous clue. These cards are guessed one at a time; a team may continue guessing only if its previous guess correctly identified one of its own cards. Teams may also stop before using all of their guesses.
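To make the turn structure concrete, here is a minimal sketch of one team’s guessing phase; the function name and the `owner_of` mapping are hypothetical names of mine, and guessers are assumed to submit their candidates best-first.

```python
def guessing_phase(candidates, n, owner_of, team):
    """Simulate one team's guesses after a clue (w, n).

    candidates: cards the guessers intend to try, best first; they
                may submit fewer than n + 1 to stop early
    owner_of:   hypothetical mapping from each card to its owner
                ('us', 'them', 'neutral' or 'assassin')
    team:       the label in owner_of for our own cards
    Returns the cards actually revealed this turn.
    """
    revealed = []
    for card in candidates[:n + 1]:    # at most N + 1 guesses allowed
        revealed.append(card)
        if owner_of[card] != team:     # a wrong guess ends the turn
            break
    return revealed
```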

The game ends either when all cards owned by a team are revealed (that team wins), or when a team reveals the assassin (that team loses).

In practice, clue-givers will need to consider all cards, not just the ones their team owns. The penalty for ‘failure’ varies; the assassin is an instant loss, and revealing a card owned by the other team is also bad. For example, with the cards APPLE, BANANA and GRAPE it would be very tempting to declare (FRUIT, 3) as a clue; yet if KIWI is owned by the other team (or worse, is the assassin) the clue could be dangerous.

However, if the other team has already revealed that KIWI is theirs (e.g. in response to another clue, perhaps SOUTH or BIRD) then the FRUIT clue becomes safe. Thus, pre-computing a strategy at the beginning of the game (e.g. by partitioning the eight or nine cards one’s team owns into several logical groups while avoiding the other words) may not be optimal.

I tend to consider making an effort to win to be a key part of games, and thus what moves may be considered good also often depends on the state of the game. For example, if the opposing team has just one card left, I will give a broader clue that may have more tenuous links in a ‘do or die’ effort to finish on this turn. A less extreme example would be attempting weaker links if behind, or perhaps playing slightly more conservatively if ahead.

Mathematically modelling Codenames can be tough. We can try modelling the game state as a tuple $(O, E, N, x, h, f_O, f_E)$ where $O$, $E$ and $N$ are the sets of remaining words our own team owns, the enemy team owns and the neutral words respectively, $x$ is the assassin, $h = (t, w, n)^+$ is the history of the game, where each entry $(t, w, n)$ records that team $t$ gave clue word $w$ and hinted that $n$ cards were relevant, and $f_O$ and $f_E$ are preference functions of the form $(h, w, n, O, E, N, x) \rightarrow P$, where $P$ is an ordering of $O \cup E \cup N \cup \lbrace x \rbrace$. This already abstracts two difficult parts away – ordering the guesses, and determining how many of the top preferences to pick.
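To pin down the shape of this state, here is a minimal sketch in Python; all of the names are mine, and the preference functions are left as opaque callables since nothing about their internals is fixed yet.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Tuple

# One history entry (t, w, n): team t gave clue word w for n cards.
Clue = Tuple[str, str, int]

# A preference function f: (h, w, n, O, E, N, x) -> P, where P is an
# ordering of all remaining words plus the assassin.
Preference = Callable[
    [List[Clue], str, int,
     FrozenSet[str], FrozenSet[str], FrozenSet[str], str],
    List[str],
]

@dataclass
class GameState:
    own: FrozenSet[str]      # O: remaining words our team owns
    enemy: FrozenSet[str]    # E: remaining words the enemy team owns
    neutral: FrozenSet[str]  # N: remaining neutral words
    assassin: str            # x
    history: List[Clue]      # h = (t, w, n)^+
    pref_own: Preference     # f_O
    pref_enemy: Preference   # f_E
```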

I think the preference functions need to take into account previous clues from both teams; if a previous clue could clearly have corresponded to two words and I picked the wrong one, the other word might remain fairly high on my list even if unrelated to the current clue. Similarly, if this scenario happens to the opposing team, I would probably avoid the word that would blatantly be theirs.

The notion of degree of confidence also isn’t captured well by our ordering; going back to the fruit example, having a clear ordering would imply that clues for fewer than 4 items could reliably result in correct guesses (if we knew that APPLE was by a small margin the best pick, we could safely and confidently give (FRUIT, 1) to clue it, which seems wrong). One could imagine modelling this with selection probabilities, though things would become even more complex.

The above representation still seems computationally difficult to work with. The evolution of the preference functions as one moves from one state to another is unclear (in this representation they are left fully general), making lookahead difficult. It doesn’t seem like a greedy choice is always best; for example, given eight cards that divide naturally into four pairs, a clue for 4 words taking one element from each pair might be a bad idea if the remaining words can’t easily be linked.

We can proceed by simplifying the preference functions; a simple approach could be that for each $(w, n)$, each team has a persistent preference function that returns an ordered list of $O_0 \cup E_0 \cup N_0 \cup \lbrace x \rbrace$, the initial sets of words. The preference functions during the game then return the subsequence of that list containing exactly the words still left in the game. This of course doesn’t take into account past information or clues from the other team.
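A sketch of this restriction, assuming each team’s persistent preferences are stored as a dict from $(w, n)$ to a full ordering of the initial board:

```python
def restricted_preference(persistent, w, n, remaining):
    """Restrict a persistent preference list to the current board.

    persistent: dict from (w, n) to an ordered list over the initial
                board O_0 | E_0 | N_0 | {x}, fixed before the game
    remaining:  the set of words still in play
    Returns the subsequence containing exactly the remaining words,
    preserving their original relative order.
    """
    return [word for word in persistent[(w, n)] if word in remaining]
```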

With this, we can attempt to solve the game using standard techniques; assuming that the vocabulary of clues is bounded (let’s say it must be from the Linux dictionary), a game state is winning for the current team if there exists some word for which the preference function returns everything in $O$ as a prefix (an immediate win), or more generally if some move leads to a state that is losing for the opposing team. A state is losing if all moves in that state produce a state which is winning for the opposing team.
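Here is a sketch of the resulting backward induction, under some extra simplifying assumptions of mine: both teams share one deterministic preference function, guessers always take cards from the top of the restricted list until a guess is wrong or $n + 1$ guesses are used, and passing is ignored (which guarantees termination, since every turn then reveals at least one card).

```python
from functools import lru_cache

def can_force_win(own, enemy, neutral, assassin, pref, vocab):
    """True iff the team to move can force a win in this model.

    own, enemy, neutral: frozensets of words still on the board
    pref(w, n, remaining): deterministic restricted preference list
    vocab: the bounded clue vocabulary (e.g. the Linux dictionary)
    """
    @lru_cache(maxsize=None)
    def solve(own, enemy, neutral):
        remaining = own | enemy | neutral | frozenset({assassin})
        for w in vocab:
            for n in range(1, len(own) + 1):
                # Guessers reveal top preferences until a card isn't
                # theirs or n + 1 guesses have been used.
                revealed = []
                for card in pref(w, n, remaining)[:n + 1]:
                    revealed.append(card)
                    if card not in own:
                        break
                revealed = frozenset(revealed)
                if assassin in revealed:
                    continue        # clue would hit the assassin
                if own <= revealed:
                    return True     # all our cards revealed: a win
                if enemy <= revealed:
                    continue        # we revealed their last card
                # Otherwise the opposing team moves; we win if they
                # cannot force a win from the resulting state.
                if not solve(enemy - revealed, own - revealed,
                             neutral - revealed):
                    return True
        return False

    return solve(own, enemy, neutral)
```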

We can generalise this to a simple probabilistic model as well; the preference ‘functions’ instead return a discrete random variable that indicates either guessing some word or passing. A simplified model could then, at the start of the game, assign weights to each candidate indicating the relative probability that the candidate would be selected, normalised to account for a pass probability. As words are removed from the board, the probability of the remaining words being selected scales up proportionally (we can imagine a simple rejection sampling where we discard draws of words that are no longer on the board).
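A sketch of that rescaling, assuming per-clue weights fixed at the start of the game, with a hypothetical sentinel "PASS" entry carrying the pass probability:

```python
import random

def sample_guess(weights, remaining):
    """Sample one guess (or "PASS") from persistent weights.

    weights:   dict from each initial candidate word (plus the
               sentinel "PASS") to a non-negative weight
    remaining: the words still on the board
    """
    # Dropping candidates that have left the board and letting
    # random.choices renormalise the surviving weights is equivalent
    # to the rejection sampling described above.
    options = [w for w in weights if w == "PASS" or w in remaining]
    return random.choices(options,
                          weights=[weights[w] for w in options])[0]
```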

The algorithm for computing the probability that we win from a given state is then slightly more complex (though I think it is still reasonably covered by standard techniques).
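One way to write the recursion down (my formulation, using the same turn structure as above): let $v(s)$ be the probability that the team to move in state $s$ eventually wins, and let $\Pr[s' \mid s, m]$ be the distribution over successor states induced by clue $m$ and the random guesses it triggers. Then

$$v(s) = \max_{m} \sum_{s'} \Pr[s' \mid s, m] \cdot \begin{cases} 1 & \text{if } s' \text{ is a win for us} \\ 0 & \text{if } s' \text{ is a loss for us} \\ 1 - v(s') & \text{otherwise (the opponent moves next).} \end{cases}$$

Memoising $v$ over states gives the same dynamic-programming flavour as the deterministic solver, with the maximisation over clues playing the role of the existential quantifier.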