Exceptions: just say no.

There was a thread on the ODE mailing list recently, where I wrote some notes about exceptions and the problems that come with them:

  • Bigger exe size, by simply turning exceptions on.
  • Performance hit (often tiny, sometimes significant), by simply turning exceptions on.
  • Uncontrollable cost, depends on compiler. Even if you check the assembly on PC / msvc, there’s no guarantee the cost is the same on PC / gcc or on PS3. Not using them at all provides peace of mind here.
  • Dubious support on consoles. If you have to drop exceptions on platform X, you probably need a workaround/alternative for it. So why not use the alternative all the time? Or do you drop all error messages / checks on platform X? Or do you maintain two codepaths, one with exceptions, one with the alternative?
  • Writing “proper” exception code is not trivial. The more people on your team, the higher the chance that one of them is going to screw up.

A lot of people don’t see why they should pay this price for something that is not even supposed to happen under normal circumstances.

And then I suddenly realized I never really profiled the cost of exceptions in ICE. There’s a simple reason for this: I have a check preventing me from compiling with exceptions enabled. It has been there for years, to make sure I never forgot to turn them off. Removing them at the time was more like a hunch, and a good practice back in 1999 (exceptions were not supported on Dreamcast and we got bitten once when porting a PC engine to this console).

Anyway, out of curiosity I removed the check and compiled Konoko Payne with exceptions turned on (using VC7). As expected the size of DLLs became bigger (numbers are sizes without & with exceptions):

DLL                    Without       With
IceCharacter.dll         94208      98304
IceCore.dll             286720     299008
IceDX9Renderer.dll      180224     184320
IceImageWork.dll        102400     114688
IceMaths.dll            225280     245760
IcePhysics.dll          217088     249856
IcePrelighter.dll        37376      43008
IceRenderer.dll         327680     344064
IceRenderManager.dll    937984    1028096
IceTerrain.dll           37376      39936
KonokoPayne.dll         589824     684032
Meshmerizer.dll         307200     364544
Opcode.dll              454656     499712
SystemPlugs.dll         139264     163840
Total                  3937280    4358656

So the difference is 421376 bytes, or 9.66% of the total with exceptions enabled. Almost 10%! That’s huge.

But wait. There’s more. The biggest surprise was the performance impact. I just ran the game and checked the framerate immediately when it starts, not moving the camera or anything.

FPS without exceptions enabled: ~85
FPS with exceptions enabled: ~67

Oops. That’s a drop of 18 FPS here?! I admit I didn’t expect this. I never saw exceptions having such a big impact on framerate. Last time I checked this, in the NovodeX SDK, the difference between the builds with or without exceptions was around 5 FPS only IIRC. But then again, I was only recompiling the physics engine, not a complete game… This might explain why I never recorded such a huge performance impact.

This sort of confirms something I’ve been suspecting for a long time: optimizations matter. All of them. Conventional wisdom says that “premature optimization” is bad, and that you should only optimize your bottlenecks until you get a “flat profile”. Fair enough. But what if you had optimized all your code all the time instead? What if you hadn’t skipped “small” optimizations that could have been performed everywhere? Disabling exceptions for a single function doesn’t save much, but disabling them everywhere in the code adds up to an 18 FPS gain! What if the same were true of other things…? What if you could gain the same or more by writing optimized code “all the time”? This is something I’ve suspected for a while, and seen in practice a few times.

I remember when I first joined NovodeX and started optimizing the SDK. There was no clear bottleneck; it was usually a fifty-fifty story between collision detection and the solver. However the code was literally full of small nasty things that could have been written in a better way. I spent weeks and weeks rewriting those: removing templated arrays from critical areas, inlining (or actually __forceinlining) things instead of trusting the compiler for this, replacing divisions with multiplications by the inverse, moving the code around so that called functions are closer to callers (ideally in the same compilation unit - closer in the CPP usually means closer in memory after compilation), replacing for loops with while loops, replacing qsort with radix, removing conditionals in the solver, replacing some virtual calls with C-style function pointers, saving a few bytes here and there in data structures, ordering data members in classes by size to minimize losses in padding, etc, etc, etc. The core algorithms never changed: just the way the code was written. Just details. It was all about details. But everywhere.

Did it pay off? I sure think it did. I will always remember a comment made by Dennis from Meqon, visiting us in Zürich, when we were still friendly competitors. He said something like: “I noticed a big speed increase between the previous release and the one you worked on. Did you rewrite things using SSE code?”. No, I hadn’t been playing with SSE. I just micro-optimized the damn thing absolutely everywhere, for weeks. I’m sorry but details still matter. The devil lies there, making your code slow.
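Two of the items in that list are easy to show in isolation. What follows is only a sketch with made-up names (`Unordered`, `Ordered`, `scaleByDivision`, `scaleByInverse`), not code from the NovodeX SDK. It illustrates ordering data members by size to reduce padding, and replacing a per-element division with a multiplication by the inverse:

```cpp
#include <cstddef>

// Members in arbitrary order: the compiler inserts padding so that each
// member starts at an address suitable for its alignment.
struct Unordered {
    char   flag;
    double mass;
    char   type;
    int    id;
};

// Same members, sorted from largest to smallest: less padding is needed.
struct Ordered {
    double mass;
    int    id;
    char   flag;
    char   type;
};

// One division per element.
void scaleByDivision(float* values, std::size_t count, float divisor) {
    for (std::size_t i = 0; i < count; ++i)
        values[i] /= divisor;
}

// One division in total, then one multiplication per element.
// The results can differ from the version above in the last bits,
// which is why compilers usually do not make this change on their own.
void scaleByInverse(float* values, std::size_t count, float divisor) {
    const float inverse = 1.0f / divisor;
    for (std::size_t i = 0; i < count; ++i)
        values[i] *= inverse;
}
```

On a typical 64-bit target `sizeof(Unordered)` is 24 and `sizeof(Ordered)` is 16, but the exact figures depend on the platform ABI, so it is worth checking with `sizeof` on each target rather than assuming.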

In that respect, the optimization process is not much different from what it was on the Atari ST, where we “saved one NOP” at a time in inner loops. (We counted time in NOPs, not in cycles.) One NOP didn’t matter much! But saving one NOP every day for two weeks certainly did.

13 Responses to “Exceptions: just say no.”

  1. kuranes Says:

    Nice information thanks !

    I searched the web a lot for C++ optimisation techniques, but still find it hard to find anything valuable apart from Pete Isensee’s work
    and his slides/papers: http://www.tantalon.com/pete.htm

    “padding, etc, etc, etc.”
    but now we want all details…

  2. Ruud van Gaal Says:

    Great article. :)
    Indeed, optimization should never be neglected to the extent of creating slow code, then never to return again because other code was even worse! ;-)

    I never saw the elegance of exceptions (the opposite actually), so I never use them. Reorganizing functions seems like a nice touch, hadn’t thought about that before.

  3. ggn Says:

    Btw Pierre, I think you wouldn’t want to turn THESE exceptions off: http://www.pouet.net/groups.php?which=3159 ;)

    On another note, while I fully realise that C/C++ is the way to go these days, I’m glad that 68000 asm is still my main language! I sure have missed out on all the “fun” like what you describe in your post here, and I certainly don’t mind ;)

  4. Eric Parker Says:

    On your idea of optimizing from the start, the same idea can be applied to memory bandwidth as well. I’ve been playing around with CUDA a bit recently and found that for an FEA solver the FLOPS are more than enough, but that memory bandwidth becomes the biggest issue. A way to optimize for memory is to benchmark the actual memory bandwidth as you add code. If the bandwidth drops at one step, then figure out why and don’t move on until it is back up. An added benefit is that you are forced to understand the hardware. This incremental approach is a very efficient way to learn what is important and what isn’t.

  5. Maciej Says:

    Just wondering - how can you “check” if we’re compiling with exceptions enabled? I’d gladly include such a test in my code as well.

  6. admin Says:

    There’s an MSVC-specific preprocessor define: _CPPUNWIND

    Check IcePreprocessor.h in the sweep-and-prune library for example.

  7. John W. Ratcliff Says:

    Choir - Me - Preaching

  8. Jalf Says:

    Hmmmm, just one little question.
    If you take an engine that does all its error checking without using exceptions, and then additionally enable exceptions (giving you the extra overhead), aren’t you getting the worst of both worlds? And of course performance would take a hit then.

    If you used exceptions from day one, you’d be able to save *some* processing time otherwise spent checking return codes.
    Whether that would make up for the overhead of enabling exceptions, I have no idea. I’m not trying to say that you’re wrong or that exceptions should (or shouldn’t) be used.

    But I think you’re missing out one important factor that would make real-world exception code perform at least slightly better than in your test.

  9. admin Says:

    Fair enough. Except I don’t believe for a second that checking return codes takes any significant time during the frame. Errors are checked in appropriate places, mainly at init time - not runtime.

    Exceptions however, generate overhead even in the middle of your inner loops. I’ve seen that a few times in the past, e.g. in the middle of a recursive function related to skeletal animation. Simply declaring a local matrix, even with an inlined empty constructor, generated useless exceptions-related code - and slowed things down here.

    Anyway I see your point, but I think the time spent checking error codes is simply insignificant.

  10. admin Says:

    Ok Jalf, here you go: http://www.ogre3d.org/phpBB2/viewtopic.php?t=40222&postdays=0&postorder=asc&start=0&sid=031ef1388c0e59fddcf348e28385533c

    They did the same test in OGRE (a project using exceptions instead of error codes) and got the same results (or worse, up to a 20-30 FPS difference).

  11. parapete Says:

    I’ve just cited this post in my final year uni project, so, you know… Thanks :)

  12. fengerfafa2006 Says:

    Mail from Yahoo to your address (p.terdiman@codercorner.com) doesn’t go through, so I’m asking here; sorry to bother you.
    Hi Pierre Terdiman,
    Thank you very much for Opcode; it gave me many ideas for testing mesh vs mesh. I downloaded the “Opcode 1.3 Test Framework”, which doesn’t include a mesh vs mesh collision test, so I referred to your manual “User manual (for version 1.2 !):OpcodeUserManual.pdf”. But it only covers version 1.2, and version 1.3 differs greatly from 1.2: the 1.2 manual does show the mesh vs mesh collision method, but I can’t get it to work in version 1.3. So I really need your help! Could you provide an Opcode 1.3 manual, or just tell me how to do mesh vs mesh in version 1.3?
    Looking forward to your reply!

  13. A thousand cuts « Miles Macklin Says:

    [...] in Uncategorized A while back Pierre Terdiman posted about his experience optimizing Novodex and how it amounted to just lots and lots of little things. [...]
