There was a thread on the ODE mailing list recently, where I wrote some notes about exceptions and the problems that come with them:
- Bigger exe size, by simply turning exceptions on.
- Performance hit (often tiny, sometimes significant), by simply turning exceptions on.
- Uncontrollable cost, which depends on the compiler. Even if you check the assembly on PC / MSVC, there's no guarantee the cost is the same on PC / GCC or on PS3. Not using them at all provides peace of mind here.
- Dubious support on consoles. If you have to drop exceptions on platform X, you probably need a workaround/alternative for it. So why not use the alternative all the time? Or do you drop all error messages / checks on platform X? Or do you maintain two codepaths, one with exceptions, one with the alternative?
- Writing “proper” exception code is not trivial. The more people on your team, the higher the chances that one of them is going to screw up.
A lot of people don’t see why they should pay this price for something that is not even supposed to happen under normal circumstances.
And then I suddenly realized I never really profiled the cost of exceptions in ICE. There’s a simple reason for this: I have a check preventing me from compiling with exceptions enabled. It has been there for years, to make sure I never forget to turn them off. Removing them was more of a hunch at the time, and a good practice back in 1999 (exceptions were not supported on Dreamcast, and we got bitten once when porting a PC engine to this console).
Anyway, out of curiosity I removed the check and compiled Konoko Payne with exceptions turned on (using VC7). As expected the size of DLLs became bigger (numbers are sizes without & with exceptions):
IceCharacter.dll 94208 98304
IceCore.dll 286720 299008
IceDX9Renderer.dll 180224 184320
IceImageWork.dll 102400 114688
IceMaths.dll 225280 245760
IcePhysics.dll 217088 249856
IcePrelighter.dll 37376 43008
IceRenderer.dll 327680 344064
IceRenderManager.dll 937984 1028096
IceTerrain.dll 37376 39936
KonokoPayne.dll 589824 684032
Meshmerizer.dll 307200 364544
Opcode.dll 454656 499712
SystemPlugs.dll 139264 163840
Total 3937280 4358656
So the difference is 421376 bytes, or 9.66% of the exception-enabled total. Almost 10%! That’s huge.
But wait. There’s more. The biggest surprise was the performance impact. I just ran the game and checked the framerate immediately when it starts, not moving the camera or anything.
FPS without exceptions enabled: ~85
FPS with exceptions enabled: ~67
Oops. That’s a drop of 18 FPS?! I admit I didn’t expect this. I had never seen exceptions have such a big impact on framerate. Last time I checked, in the NovodeX SDK, the difference between the builds with and without exceptions was only around 5 FPS, IIRC. But then again, I was only recompiling the physics engine, not a complete game… which might explain why I never recorded such a huge performance impact.
This sort of confirms to me something I’ve been suspecting for a long time: optimizations matter. All of them. Conventional wisdom says that “premature optimization” is bad, and that you should only optimize your bottlenecks until you get a “flat profile”. Fair enough. But what if you had optimized all your code, all the time, instead? What if you hadn’t skipped “small” optimizations that could have been performed everywhere? Disabling exceptions for a single function doesn’t save much, but disabling them everywhere in the code adds up to an 18 FPS gain! What if the same were true of other things…? What if you could gain the same or more by writing optimized code “all the time”? This is something I’ve suspected for a while, and seen in practice a few times.
I remember when I first joined NovodeX and started optimizing the SDK. There was no clear bottleneck; it was usually a fifty-fifty story between collision detection and the solver. However, the code was literally full of small nasty things that could have been written in a better way. I spent weeks and weeks rewriting those:
- removing templated arrays from critical areas,
- inlining (or actually __forceinlining) things instead of trusting the compiler to do it,
- replacing divisions with multiplications by the inverse,
- moving code around so that called functions sit closer to their callers (ideally in the same compilation unit - closer in the CPP usually means closer in memory after compilation),
- replacing for loops with while loops,
- replacing qsort with a radix sort,
- removing conditionals in the solver,
- replacing some virtual calls with C-style function pointers,
- saving a few bytes here and there in data structures,
- ordering data members in classes by size to minimize losses to padding,
- etc, etc, etc.
The core algorithms never changed: just the way the code was written. Just details. It was all about details. But everywhere. Did it pay off? I sure think it did. I will always remember a comment made by Dennis from Meqon, visiting us in Zürich when we were still friendly competitors. He said something like: “I noticed a big speed increase between the previous release and the one you worked on. Did you rewrite things using SSE code?”. No, I hadn’t been playing with SSE. I just micro-optimized the damn thing absolutely everywhere, for weeks. I’m sorry, but details still matter. The devil lies there, making your code slow.
In that respect, the optimization process is not much different from what it was on the Atari ST, where we “saved one NOP” at a time in inner loops. (We counted time in NOPs, not in cycles.) One NOP didn’t matter much! But saving one NOP every day for two weeks certainly did.