While I’m at it…

What’s the deal with alpha-sorted polygons those days? I hear more and more that engine X or Y doesn’t support them because “it’s too slow”. Errr, hello? We even sorted polygons on 8 Mhz Atari STs, I assume you can manage to do the same on a 3 Ghz multi-core machine? Sure the number of polygons has increased, but so has the CPU speed. So what’s the problem?

  • the first problem is that they don’t use the right algorithm. If the number of polygons and the CPU speed evolve in parallel, the ratio only remains constant with an O(n) sort. Since they insist on using qsort, well, they’re doomed before the race even starts.
  • the second problem is that they don’t generate their sort keys efficiently. No, you don’t need that square root. No, you don’t need the distance to the camera position. You need a simple dot-product: the vertex projection on the camera’s forward vector.
  • the third problem is that they sort everything each frame. It’s useless. Sort the index buffer once, and only sort it again when the view direction has changed significantly. You don’t need an exact sort each frame since the whole process is not accurate to begin with (each triangle is sorted according to a single witness point). So it means you need to actually sort stuff only once in a while - or all the time but over N frames, if you prefer.
  • the fourth problem is that they sort at runtime. You don’t have to. As we just said the sorting is approximate, so you can precompute 8 sorts/index buffers for the 8 possible octants in which the view vector can be, and simply pick up the correct one at runtime. This has been used countless times in countless demos on Atari and PC. Seriously old skool. It certainly increases the memory used, but here’s my deal for you: drop the STL, and you’ll win back all the memory you need to include those precomputed sorted buffers. The number of transparent objects shouldn’t be huge anyway, so this is usually not a problem. Especially since those sorted buffers can be easily compressed like crazy, and decompressed to the actual D3D index buffer only when the view vector changes.

So, what’s your excuse already?

6 Responses to “While I’m at it…”



  1. Nathanael Presson Says:

    I think you may slightly miss the point here, when peoples say ‘too slow’ they speak about shaders/textures/etc… switches, not sorting. Sorting was an issue on Atari/Amiga at the time, not today, never (unless you ‘bubble’ it:) ). I agree that radix sort is the right tool for the job, but that’s just not, by far, the bottleneck in my experience. If you happens to deal only with static and shaders homogeneous transparent faces, you should kiss your designers :).

    Btw, nice to see you blogging.

  2. admin Says:

    Maybe I missed the point, but I still don’t get it then. I’m talking about fully transparent objects, like, I don’t know, a transparent torus. Or a transparent teapot. The whole object uses the same shader, exactly as if it would be opaque. I don’t see why you would have more shader switches with transparent objects?

    Anyway there’s another, simpler way to summarize the rant: if people still support particle systems, which are even worse in terms of requirements & performance hits, why would they drop the (easier) transparent objects?

    …so I still don’t get it.

  3. Nathanael Presson Says:

    Ok, my bad, i had smokes,water,explosions,fire sutff in mind still,
    transparency have nasty side effects, like dealing with dynamic shadows, or deferred shading among other things.
    Concerning particle systems, when i think about it, almost everything i see end up with Add or Sub blending, so they don’t even bother so sort anything.
    But in the end, its perhaps that young coders are a bit lazy, and think that gpu draw, cpu compute, so that computing to draw must be bad. Old folks me like me had to rasterize polygons in the good old times :). And yes i do support transparency in my current project, it’s just that no one figured yet where to put that damn transparent torus ^^.

  4. Ilian Says:

    You could also make use of the PSX method of drawing those particles sorted [the O(N) type of sorting you meant, I guess] . Just make a simple uber-shader and put all particle-textures of the game into one big texture. Using particles with different types of blending in the same frame will still be problematic, though.

  5. Mark Wayland Says:

    As Ilian says, you can use PSX style ordering tables. I did that in Carmageddon TDR2000 back in 1999 - and that was for each translucent polygon! We actually had a car called the Wraith (I think) which was totally translucent and required pretty much polygon accurate sorting. The ordering table is just a hash table by depth, nothing fancy. Fast as anything - definitely no qsorts involved!

  6. Phil Says:

    One thing I would like to point out is that often methods that rely on the CPU to do things with resources used by a GPU are actually at a bigger disadvantage now than they used to be.

    GPU power has grown faster than CPU power, so it stands to reason that the object’s complexity has grown faster than the CPU power required to perform the operation. So that means you have to set aside a larger percentage of your CPU to do this than you did in the past - granted back then you also had to do rasterization etc, but I think the point is still interesting to consider.

    Now this applies to per-frame sorting of polygons, if you pre-compute the order (for however many quadrants) then I totally agree that there is no reason not to do it.

    Phil

shopfr.org cialis