After I posted this series to the Advent of Code subreddit, Reddit user askalski - who has previously posted optimized solutions in C and C++, and was one of my inspirations for doing the same - suggested a better solution for day 3: processing the wires as line segments instead of point sets. This reduces runtime by 99.9%, down to 28 μs.
This idea had crossed my mind, but I had rejected it thinking it wouldn’t improve performance enough to offset the increased complexity per operation - the algorithm I had was linear in complexity, after all. Let’s just say I was hilariously wrong.
The new algorithm is in a way simpler than the hash set method: parse each wire into a vector of line segments, then iterate through each pair of line segments and check if they intersect. This can be done quite quickly since we know all line segments are straight lines parallel with one of the coordinate axes.
It is easy to see from figure 1 that an intersection exists if and only if:
(x10 <= x00 <= x11 AND y00 <= y10 <= y01) OR (x00 <= x10 <= x01 AND y10 <= y00 <= y11)
The drawback of this approach is that overlapping lines become more cumbersome to support. Detecting the overlap is easy enough, but for part 2 we also need to track the path length along each line. I didn’t want to deal with this, so I decided to simply crash on overlapping lines.
Using the above rules, we can simply iterate over all pairs of line segments to find the set of intersections. Although this has time complexity O(N2) and the hash set method has complexity O(N), my puzzle input has 301 + 301 line segments spanning 155,678 + 149,517 points. It turns out 301 * 301 = 90,601, which is significantly smaller than both 155,678 and 149,517. Not only does this mean we need to make fewer tests, it also means we don’t need to hash each point several times for the hash set operations.
For part 2, we can extend our line segment representation to not only store the two end points, but also the first point along the path from the origin as well as the total path length. When we compute an intersection, we can use these values to compute the total path length to that intersection. With that, we simply need to visit each intersection once to compute both solutions we need. With these optimizations, both parts take ~130 μs to solve.
There’s one more trick we can use to go even faster, though: we don’t need to test every pair of line segments. Instead, we can split the first wire into the line segments parallel with the x axis, and the ones parallel with the y axis. We can then sort each vector by y and x coordinate, respectively. This allows us to binary search for just the range of segments that have a chance to intersect the other wire, further reducing runtime for this problem to 28 μs and the total runtime to 110 ms.