I carefully made a surgical change that dramatically improved the performance of the entire game.
Or did I? Perhaps I instead opened up a can of worms, and one of those worms was a big ugly boy with "Nondeterminism" writ large upon his cursed frame, and then said worm proceeded to slowly bludgeon me to death?
Questions, always more questions... and to answer them, we need to start at the beginning, don't we?
An Innocuous Beginning
I caveated last week's performance optimization wins by explaining that I was testing in a game level with not much happening.
So I created a performance testing level with some things happening:
And, like a good little programmer, I recorded the initial performance characteristics and made a quite a few nice performance improvements, each time comparing the new performance with the initial recording.
Until, one time, I noticed something odd:
Some slightly different things were happening? Uh oh.
Determinism
If you take a simulation, and run it twice, and each time you get exactly the same result, then we can say that simulation is "deterministic".
If you get a different result, even a little different, then the simulation is "non-deterministic":
People often say that deterministic simulations are ideal because they're easier to fix bugs in,1 but it goes deeper than that.
If you have a deterministic simulation, then the simulation needs to calculate the same things each time - meaning it needs to do the same work each time you run it.
But if you have a non-deterministic simulation, then maybe this time you run the game:
- That stray fire atom doesn't make a barrel explode...
- so that barrel doesn't cause a chain reaction to explode those other barrels...
- so that particular bit of terrain there doesn't get destroyed...
- so that water above it doesn't flow through the new hole to extinguish the fire below...
- (etc - you get the idea.)
Even a tiny difference in execution will cause a large divergence quickly, and, (surprise!) large divergences in what happens between two runs of the same game will cause game performance to differ significantly too.
Which is all to say: some of the "performance improvements" I found were due to the game doing less work because less things were happening. Woops.
But it gets worse!
I also need the game simulation to be 100% deterministic in order for my online multiplayer fever dream to work. 2
So this was a Big Problem™.
Fixing Nondeterminism
Making a non-deterministic simulation deterministic is simple in theory; you just gotta:
- Provide the exact same inputs..
- to every single calculation..
- in the exact same order.. 3 4
- in the entire simulation..
- for the entire time the simulation runs.
But interactive simulations like games have a lot of inputs. 5
I had, err, bent the rules in quite a few cases, so first I had to tediously pay back that "technical debt" by explicitly storing each bit of input state. 6
Then, I had to build a new debug option: because the differences start out small and magnify over time (and are different each run), it's well nigh impossible to just watch a video and figure out "which calculations are using inconsistent inputs and happened to be the root cause of the big differences 4 seconds later".
So you gotta detect tiny differences as soon as they occur, with a debug option to (for example):
- Run each tick (single step of the game simulation) twice: once on each of two separate copies of (initially identical) simulation state.
- Verify that simulation states are still identical afterwards, by comparing7 them (or their checksums8).
If simulation states after a tick differ, you tediously inspect the two states to find all the differences9, guess10 where they might have been introduced, and track the bit of input you weren't tracking.
With the debug option in place, I found and fixed 4 different non-determinism bugs, including 2 that only manifest when loading a saved game. 11
The test level now plays out the exact same sequence of events each time, so hopefully my performance metrics will no longer lie to me - and I can return to making that test level actually run faster.
Playable web build
For your amusement I have set the default level to that chaotic test level, so see if you can survive:
Known issues:
- Objects near explosions get stained too darkly, probably due to a color storage performance optimization I made.12
- Lightning is no longer spawning electric arcs. No idea why yet!
If they go wrong, you can run them more slowly and they'll still go wrong the exact same way.
So you can observe them going wrong and compare what's happening to the code you've written, examine state with a debugger, etc.
Actually, the multiplayer dream requires not just "determinism", but also "cross-platform determinism", as in, same results on Intel vs AMD vs (maybe, hopefully) Apple processors, regardless of whether you're running Windows, Linux or (maybe, hopefully) Apple. That's an even stronger constraint, and in theory I've taken steps to accomplish it, but we'll see when we get there.
So for example, if you iterate through a list of things to calculate something, the list must always iterate front to back.
That sounds straightforward, but in practice several common programming data structures (e.g. HashMaps) often have unspecified iteration order - so you can't use them or have to use ordered equivalents.
Or at least, inputs must be provided in the exact same order to the extent that the difference can be observed in the output. If you iterate through a list and perform a commutative operation on it (such as addition), the order isn't going to matter - but if you e.g. find the element in a list which is largest and break ties by the first entry found, you'll have a problem.
This has significant implications for multithreading a simulation, which we haven't got to yet - but we will!
- Explicit player keyboard, mouse and gamepad inputs, like buttons pressed.
- Level editor inputs, such as drawing or erasing atoms or spawning explosive barrels.
- Random number generator(s) shared with non-simulation logic.
- Assets; for example the shape and color of an explosive barrel determines the atoms it comprises of - and asset hot reload means they can change at runtime during development.
- Debug options I can turn on and off.
Well, I decided not to store asset state, as assets use a comparatively large amount of memory and only change during development anyway.
Deep comparing states is actually more tricky than it sounds, because the simulation state is composed of many different types of things, and some of those things may have notions of equality that differ from what you normally want in your game.
For example, say your game contains a book; most of your game may consider books to be equal if they have the same name and author, but when checking your game state hasn't diverged you would also want to check that all books are open to the same page and have sustained damage to the same pages.
Approximately, a checksum is a number you calculate from some arbitrary data in a deterministic way (or see overly detailed wikipedia article). If two pieces of data have different checksums, you know for sure they're different.
(On the other hand, if two pieces of data have the same checksum, they're probably the same, but it's not guaranteed. That doesn't matter for our purposes here though - we just want to use checksums to detect differences.)
So far I've been saving the two game states to disk and using an external comparison tool to find differences, which is low-tech but has worked.. okay-ish, I guess?
The most annoying thing is that I haven't found a human-readable serde-compatible serialization format that can also represent all data structures that exist in my game state (JSON can't handle non-string keys, RON can't handle round-tripping 128 bit numbers, etc).
So sometimes I end up using (the excellent) ImHex to diff binary (Postcard format) saves of my game state, but figuring out which bytes correspond to what is painful.
I'm aware that there is a better technique involving rr, the reverse debugger. But I haven't tried it yet: once I know what the difference is, I've been pretty quick at guessing why it appeared.
I'm not sure the game would even need to support saving and loading an in-progress level, but being able to save and restore the game state from disk seems useful for both fixing crashes - I could write a crash handler that saves the last known good game state to disk.
Also, I actually had set up my nondeterminism-hunting-tooling to save the game, load the game, save the game again and finally check that the two save files were identical - so these bugs turned up and I wasn't going to just leave them unfixed.
Colors used to be represented as 4 bytes (one for each of the red, green, blue and transparency components). For optimization reasons (reduce in-memory size of each atom and thereby improve CPU cache hit rate for all operations involving reading atoms), atom colors are now 2 bytes, with a nibble (4 bits) for each component. Probably this means my logic for adjusting colors of atoms as they get stained is now wrong, but I haven't gotten around to investigating it yet.