Preliminaries

First, you may want to read a short introduction to the problem space such as this one.

Traditionally, the Nim game is introduced prior to deriving the Sprague-Grundy theorem. In this post, we will instead show how Nim naturally arises from a greedy algorithm to categorize games. Specifically, we will first motivate the usage of XOR, then the usage of MEX.

Theoretically, to solve any impartial game, we need only to determine a single bit of information about every possible position: whether the next player (about to make a move) wins, or loses. If the next player wins, the state is called an N-position; otherwise, if the previous player wins, the state is a P-position. Together, N-positions and P-positions comprise the two possible outcome classes of any impartial game.

An example of a simple game with positions labeled N or P. Recall that all impartial games can be represented as DAGs (directed acyclic graphs).

The positions are labelled according to the following rules:

  1. All terminal states (i.e. states with no out-degree) are P-positions.
  2. If a state has an edge to any P-position, it is an N-position.
  3. Otherwise, it is a P-position.

Sums of games

We define the sum of two games 𝐺+𝐻 as a new game in which the games 𝐺 and 𝐻 are played in parallel: on each player’s turn, they can choose to move in exactly one of the two games, and leave the other unchanged (a classic example of this would be a two-pile Nim game).

Note

Considering the positions as DAGs, this + operation corresponds precisely to a Cartesian product: A simple Cartesian product, illustrated. Each edge corresponds to a move on exactly one of the two parallel positions. Note that the + operation as defined here is both commutative and associative. It is also closed over the space of impartial games, since the Cartesian product of two DAGs is still a DAG.

Importantly, under our current definition of equivalence by outcome classes (i.e. all N-positions are equivalent, as are all P-positions), equivalent positions might not produce equivalent results when combined with other positions. Specifically, two N-positions combined could produce either an N-position or a P-position.

Two N-positions combined don’t always yield equivalent results.

This motivates a new definition of equivalence:

Definition: position equivalence

𝐺≑𝐻 iff for all positions 𝑋, 𝑋+𝐺 and 𝑋+𝐻 belong to the same outcome class.

To assist with the following inductive proofs, we introduce the notion of the size of a position |𝐺| as the total number of reachable positions from 𝐺 after one more more moves. Importantly, this means that if position 𝐺 can move to position 𝐻, |𝐻|<|𝐺|.

We will also refer to the previous player and next player as Players P and N, respectively.

We can now prove our first theorem:

Theorem: P-positions

For any P-position 𝐺, πΊβ‰‘βˆ… (βˆ… denotes the empty game, which is also a P-position).

Proof. By definition, we must show that for all 𝑋, 𝑋+𝐺 and 𝑋 have the same winner. We split this into two cases:

  1. 𝑋 is a P-position, so we want to show 𝑋+𝐺 is also a P-position.
    • If player N moves 𝑋→𝑋′, player P must be able to make the move 𝑋′→𝑋″ where 𝑋″ is a P-position. By induction on |𝑋|, 𝑋″+𝐺 is also a P-position.
    • Symmetric reasoning for if player N moves 𝐺→𝐺′ instead.
    • Therefore, no matter how player N moves in 𝑋+𝐺, player P can always return it to a P-position.
  2. 𝑋 is an N-position, so we want to show that 𝑋+𝐺 is also an N-position.
    • If 𝑋 is an N-position, there must exists a move 𝑋→𝑋′ such that 𝑋′ is a P-position. From the first case, 𝑋′+𝐺 is also a P-position, therefore in both games player N can move to a P-position, as desired.

Importantly, this means that all P-positions are still equivalent to one another.


Corollary

For all games 𝐺, 𝐺+πΊβ‰‘βˆ….

Proof. We just need to show 𝐺+𝐺 is a P-position, which is true because player P can always just mirror the moves of player N.


Now that we’ve now discovered the inverse of 𝐺 (specifically, βˆ’πΊ=𝐺), note that our notions of equivalence and sum define a group:

  1. The operation + is both associative and commutative.
  2. The identity element is βˆ….
  3. The inverse of an element 𝐺 is 𝐺 itself.

Corollary

𝐺≑𝐻 iff 𝐺+𝐻 is a P-position, i.e. 𝐺+π»β‰‘βˆ….

Proof. By elementary group properties.

XOR?

The fundamental reason that XOR shows up is the fact that 𝐺+𝐺=βˆ…. In particular, consider a set of positions {𝑔𝑖} and the following two sums:

  • 𝐺=𝑔1+𝑔2+𝑔4; we encode this as 𝑓(𝐺)=1101.
  • 𝐻=𝑔2+𝑔3; we encode this as 𝑓(𝐻)=0110.

Then we have that 𝐺+𝐻=𝑔1+𝑔3+𝑔4⇒𝑓(𝐺+𝐻)=1011=𝑓(𝐺)βŠ•π‘“(𝐻).

Note that this XOR property holds for any choice of {𝑔𝑖}. However, we’d also like an additional condition:

Any non-empty sum of distinct elements of {𝑔𝑖} should not be equivalent to βˆ….

or, alternatively:

Every position 𝐺 should have a unique representation as a sum of distinct {𝑔𝑖}.

In linear algebra terms, we want {𝑔𝑖} to be an independent basis of all possible positions. If this holds, then we have the nice property that πΊβ‰‘βˆ… iff 𝑓(𝐺)=0.

Therefore, it remains to solve the following question: how do we actually find such a basis?

MEX!

Let me first describe a simple algorithm to find a basis:

  1. Topologically sort all game states.
  2. Initialize our basis 𝑔 as an empty list, [] (because order matters!).
  3. In reverse topological order, check if our current game 𝐺 is independent of 𝑔. If yes, append 𝐺 to the end of 𝑔.

And that’s it!

Wait, what? That’s it? How is this even remotely related to MEX? And how do we check whether a game is independent of 𝐺? The key idea is to show that because of this specific greedy approach, the 𝑖th basis element 𝑔𝑖 has the property that it can transition to an equivalent state of every subset of the first 𝑖 basis elements. That is, for every 𝑧<2𝑖, 𝑔𝑖 can transition to a state 𝑍 such that 𝑓(𝑍)=𝑧. This will enable us to implement a much faster check for independence.

In fact, together with the properties of XOR, this claim generalizes to the following:

Theorem

For every state π‘Œ and every 𝑧<𝑓(π‘Œ), π‘Œ can transition to some state 𝑍 such that 𝑓(𝑍)=𝑧.

Do you see where mex could come in now?

Proof: We induct. Let’s say this claim is true for every state we’ve encountered so far. Now, let our current state be 𝐺, and the set of possible next states as 𝑇𝑖. Let π‘š be the MEX of 𝑓(𝑇𝑖) over all 𝑇𝑖. We wish to show that 𝑓(𝐺)=π‘š.

To do this, consider some game 𝑀 in equivalence class π‘š and spanned by our current basis 𝑋. If no such game exists, 𝐺 is independent of 𝑋 and thus will be added as a new element to the basis. Note that this also means we can transition to every game currently spanned by 𝑋, so our inductive hypothesis still holds.

Otherwise, we then need to prove is that 𝐺≑𝑀⇒𝐺+π‘€β‰‘βˆ…. We can show this by splitting into two cases:

  1. The first player moves either 𝐺 or 𝑀 to a state 𝑍 with 𝑓𝑋(𝑍)<π‘š. The second player can then match this value by moving on the other game. Both states will have value 𝑓𝑋(𝑍) and will thus form a P-position.
  2. The first player moves either 𝐺 or 𝑀 to a state 𝑍 with 𝑓𝑋(𝑍)>π‘š. The second player can then move this state back to one with value π‘š. Both games will now have value π‘š and will thus form a P-position.

Since the second player can always force the first player to begin their next turn on a P-position, it follows that 𝐺+𝑀 must be a P-position itself. Therefore, 𝐺≑𝑀, as desired.


OK, that was a lot. But the primary takeaway is that the entire reason mex is used is because of the (surprisingly elegant) greedy strategy we used in order to find a valid basis. In fact, (don’t quote me on this) this may be one of the only efficient strategies to compute a valid basis because of the issue of checking independence. If 𝑋 was just some arbitrary basis, this could have taken forever to check, and maybe even have been impossible. Thankfully, due to the nice structure our greedy algorithm imposes on 𝑋, we instead end up with a very quick and elegant way of evaluating independence.

The best part? We derived all these beautiful results without referencing the game of Nim at all! The XOR relation arose naturally from the β€œmirror” nature of combining identical impartial games, and mex showed up as a direct result of the simple greedy strategy we defined to find a valid basis. This, I believe, is truly the best way to understand the fundamental nature of the Sprague-Grundy theorem.