Day 24: Planet of Discord
Day 24 is a variant of Conway’s game of life. My solution runs in 2.6 ms and gains its performance from efficient data structures, including a custom bit field representation of Boolean matrices similar to the key sets in day 18.
The basic principle for this challenge is simple: propagate the world state forward in time, and detect when any state occurs a second time. Part 2 introduces new connectivity rules, but otherwise remains the same and asks us to simulate 200 time steps.
A lot of the work in this challenge consists of looking up and counting neighbors. Each cell of the game can only be in one of two states, so an easy solution is to represent the state as a 2-dimensional vector of Boolean values, but a more efficient representation is to use a bit field.
The state of each level is 5x5 cells, so we could fit all 25 cells into a 32-bit integer value. But it’s going to be useful to have one cell of padding on each side, meaning our states will be 7x7 cells, so we’ll need to use 64-bit integers. We slice the integer into 7-bit rows, in order from least to most significant: the first 7 bits are always zero, the first and last bit of each group of 7 are always zero, and the middle 5 bits of the middle 5 groups are the cells of each row, as shown in figure 14.
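As a sketch of this packing (the function name and the `'#'`/`'.'` input format are my assumptions, not necessarily how the actual solution parses its input), converting a 5x5 grid into the padded 7x7 bit field could look like:

```rust
/// Pack a 5x5 grid, given as 5 lines of '#' (alive) and '.' (dead),
/// into the padded 7x7 layout: bit 7 * (y + 1) + (x + 1) holds cell (x, y).
fn pack(grid: &str) -> u64 {
    let mut state = 0u64;
    for (y, line) in grid.lines().enumerate() {
        for (x, c) in line.chars().enumerate() {
            if c == '#' {
                state |= 1u64 << (7 * (y + 1) + (x + 1));
            }
        }
    }
    state
}
```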
With this representation, we can easily count all the neighbors of a given cell by using the number 0b000001000001010000010 as a bit mask:

0100000 1010000 0100000 0000000 0000000 0000000 0000000

(The mask is shown here split into its 7-bit rows, least significant bit first; the set bits, at positions 1, 7, 9, and 15, are the four orthogonal neighbors of the top-left cell, which sits at index 8.) This is why the padding is useful: we don’t need to spend cycles worrying about the edges. All we need to do is shift this pattern left by the index of the cell we want the neighbors of, minus 8, then AND the shifted pattern with the bit field, and count the number of ones in the result. Incidentally, Rust integers have a method .count_ones() which does exactly that.
The index for the cell at coordinates (x, y) in the original 5x5 state is 7 * (y + 1) + (x + 1).
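Putting the index formula and the mask together, the neighbor count can be sketched like this (names are illustrative, not necessarily those of the actual solution):

```rust
/// Neighbor mask anchored at bit index 8 (the top-left cell): bits 1, 7,
/// 9, and 15 are that cell's four orthogonal neighbors.
const NEIGHBOR_MASK: u64 = 0b000001000001010000010;

/// Bit index of cell (x, y) from the original 5x5 grid.
fn index(x: u32, y: u32) -> u32 {
    7 * (y + 1) + (x + 1)
}

/// Count the live neighbors of cell (x, y): shift the mask into place,
/// AND it with the state, and count the ones.
fn neighbors(state: u64, x: u32, y: u32) -> u32 {
    (state & (NEIGHBOR_MASK << (index(x, y) - 8))).count_ones()
}
```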
This compact representation is also suitable for fast lookups in a hash set for the state repetition check in part 1. Unfortunately it isn’t quite as helpful for part 2 due to the inter-level connections, but it does still approximately halve the runtime compared to storing the state as a 2-dimensional vector of Booleans.
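The part 1 detection loop then reduces to inserting each u64 state into a hash set until an insertion fails. A minimal sketch, with `step` standing in for the actual (assumed) update function:

```rust
use std::collections::HashSet;

/// Advance the simulation until a state occurs a second time,
/// and return that repeated state.
fn first_repeat(mut state: u64, step: impl Fn(u64) -> u64) -> u64 {
    let mut seen = HashSet::new();
    // HashSet::insert returns false once the state has been seen before.
    while seen.insert(state) {
        state = step(state);
    }
    state
}
```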
For part 2, we also need a way to store the levels. Since levels extend in two directions, a natural numbering scheme is to start at zero and count into both positive and negative numbers, but this means you can’t simply use the level as the index of a vector. You can use a hash map instead, but that’s slow, and there is a simple trick we can use to store items at “negative” indices in a vector. The trick is to interleave the positive and negative indices:
Level: 0 -1 1 -2 2 -3 3 -4 4 -5 ...
Index: 0  1 2  3 4  5 6  7 8  9 ...
That is, level n is stored at index n * 2 if n is nonnegative, and at index |n| * 2 - 1 if n is negative.
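The mapping is a one-liner; this sketch uses a hypothetical name for it:

```rust
/// Map a level number (..., -2, -1, 0, 1, 2, ...) to a vector index by
/// interleaving: 0, -1, 1, -2, 2, ... map to 0, 1, 2, 3, 4, ...
fn level_index(n: i64) -> usize {
    if n >= 0 {
        n as usize * 2
    } else {
        n.unsigned_abs() as usize * 2 - 1
    }
}
```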
This turns out to be much more efficient than a hash map, cutting runtime by no less than 75%. Also, making the effort to only create new levels when one is actually going to be populated (which is at most every second time step, due to how the problem is set up) saves an additional 30%.
Going even further beyond
I wrote above that part 2 spoils the fun of the bit field a bit, but we can in fact make it work for part 2 also. We can do this by embedding the relevant parts of the neighbor levels, and using special neighbor masks for the middle 4 cells. We can encode the state as shown in figure 15:
This encoding is chosen to minimize the number of operations needed to embed the neighbor states: Each edge of cells from the outer neighbor can be masked in with just one bitwise OR operation, and the cells from the inner neighbor can also be extracted and moved into place with just 3 bitwise logic operations per set. Also important is that all the masks can be precomputed, i.e. hard-coded.
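The exact masks and shift amounts depend on the encoding in figure 15, which isn’t reproduced here, but each extract-and-place step has this general shape (an illustrative sketch with made-up parameters, not the actual hard-coded constants):

```rust
/// Select the bits of `src` covered by `mask`, shift them into position,
/// and OR them into `dst`: three bitwise operations per set of cells.
/// In the real solution, `mask` and `shift` would be precomputed constants.
fn embed(dst: u64, src: u64, mask: u64, shift: u32) -> u64 {
    dst | ((src & mask) << shift)
}
```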
This way, we get the neighbors from the outer level for free with the basic neighbor mask, and the neighbors from the inner level can be accounted for by using separate neighbor masks specifically for the middle 4 cells. This again cuts the runtime in half, bringing the total runtime down to 2.6 ms.