Creating efficient triangle strips

 

 

Pierre Terdiman

Last revision:  04.01.2000

 

 

Efficient rendering of triangle-based meshes often requires you pack them into triangle strips. Strips provide interesting advantages : the mesh is stored in a more compact way, hence wasting less ram, and you save bandwidth when you send it to your favorite rendering API. It is usually said that you need two to three times less bus traffic with strips than with rough triangles. In a nutshell, triangle strips are the most efficient primitives on today’s hardware.

 

However strips give rise to a thorny problem : the way one should create them. In this article I’ll review the standard algorithms used to create strips, and discuss some improvements I made in my own stripifier.

 

 

Should I care ?

 

For the happy beginners out there, let’s briefly recall what a triangle strip is. Say your mesh contains a list of connected triangles. A triangle is made of three vertex references, and in case of connected triangles, two of them may be shared from one triangle to another. The list of indices resulting from this sharing forms a triangle strip. For example those triangles :

 

012

123

234

345

are equivalent to a single strip :

012345

 

If your own 3D engine only works with trilists (i.e. standard lists of triangles), you may wonder whether moving to tristrips is worth the pain. List of triangles are a lot easier to deal with, especially when you have dynamic topologies (e.g. for terrains) and may well be fast enough for your needs... But come on ! Things never are fast enough ! You should keep trilists for all dynamic meshes whose topology changes over time, because recomputing strips on-the-fly is a very tedious task which sometimes involves the use of skip lists [1], and other non-trivial data structures. However on-the-fly stripification sometimes happens with smart drivers, provided your trilists have been preprocessed and transformed into a suitable format. The Dreamcast version of DirectX for example, contains an index list optimizer which transforms your vertex references so that the driver can further efficiently build strips during the DrawPrimitive call. Unfortunately this doesn’t exist on PC, neither in DirectX, nor in OpenGL. So you’re still supposed to create your own triangle strips, and that’s the only way to reach the limits of the newest video cards such as the GeForce256.

 

 

 

 

 

Related work

 

 

The creation of triangle strips from an arbitrary mesh is an NP-complete problem [4]. NP stands for « Nondeterministic Polynomial time », and a problem is said to be NP-complete if it is both NP and NP-hard. Basically, all we have to know is that a NP problem doesn’t have any theoretical solution. Hence, one way or another we’re stuck with heuristic methods to solve it, and we’re free to tackle that problem the way we want. Roughly speaking, there are two main algorithms used to create triangle strips : the SGI algorithm and the STRIPE algorithm. Both have been implemented and widely used through years. The most popular source code available for the first one was written by Brad Grantham [3], the other is part of a classic package called STRIPE [5], written by Evans & al. At Meltdown X’99, Mike Chow from 3dfx presented a variant of the SGI algorithm [2] which works just as well.

 

Unfortunately, all available codes have various problems. Brad Grantham’s code outputs strips which aren’t single-sided, STRIPE especially works with quads, and so on. My implementation works with triangles in input, outputs single-sided strips, and uses the SGI algorithm or Chow’s variant according to what the user wants. On top of that, I also implemented the look-ahead improvement that Mike Chow presented in [2].

 

 

 

The algorithms

 

In order to stripify a mesh you need a data structure to describe the connections between all faces. The adjacency structures I use are presented in the next paragraph. Some may have used a classic winged edge structure, but I find mine easier to handle. Anyway, you may rewrite all of this the way you want, that’s just a matter of feeling.

 

Here’s the standard algorithm :

1)     choose a starting face for a strip

2)     choose a direction (i.e. an edge) in which you’ll walk along the strip

3)     actually extend the strip in the chosen direction until you reach a triangle with no forwards connections

4)     go to 1) until all faces have been visited

 

A basic improvement (at least implemented by Grantham) is :

3b) reverse the strip and extend it in the opposite direction

 

 

I’ll now go into the details of each step.

 

 

 

Adjacency structures

 

To create triangle strips by tracking connected triangles you need to be able to jump from one triangle to any of its three possible adjacent neighbors. Hence the need for adjacency structures which will provide this information. Here’s what I use :

 

struct AdjTriangle{

udword VRef[3];      // Vertex references

udword ATri[3];      // Triangle references

};

 

The triangle has the usual references to three vertices, and also three new references to possible adjacent triangles. My convention say that :

 

ATri[0] is the triangle adjacent to edge 0-1

ATri[1] is the triangle adjacent to edge 0-2

ATri[2] is the triangle adjacent to edge 1-2

 

Of course edge a-b is the one which links VRef[a] and VRef[b]. I could have used three explicit names (say ATri01, ATri02 and ATri12) but using an array allows for an easiest access. I also could have used pointers but indices are better : I can pack extra information in their most significant bits. Current version uses the two most significant bits to encode an edge number between 0 and 2. What edge number ? For example when I jump from triangle A to triangle B through the edge 0-1 (i.e. A->ATri[0]), the counterpart edge in B is the one whose ID is encoded by the two most significant bits of A->ATri[0]. That way I automatically know where I come from when I jump to a new triangle. This is not difficult to implement, and this is quite handy. Another way would have been to keep track of the two vertex references of the incoming edge, and scan the three vertex references of the new triangle to find which of its edges is the same as the one we just crossed. Yuck ! My way is a lot faster, I just have a udword to shift. Is it worth the extra code ? Is the speed gain negligible ? You may think that’s a lot of pain just to create triangle strips, don’t you ? Correct. But the way they’re built, you can use the exact same structures to implement a lot of other things. I successfully used them to track the silhouette of a mesh (an operation involved in the shadow volumes computation, but also in occlusion culling using shadow frusta, etc), do local searches in collision detection, and subdividing surfaces with the modified Butterfly algorithm. Among other things. Trust me, those ones are worth the effort.

 

Fine. Now, how do we create the structures ?

 

We first allocate a structure for each face, initialized will the correct vertex references and null links. Since we’re not using pointers, a null link is encoded as 0xffffffff. We also create three edge structures for each triangle :

 

struct AdjEdge{

   udword      Ref0;       // Vertex reference

   udword      Ref1;       // Vertex reference

   udword      FaceNb;      // Owner face

};

 

At first we don’t try to detect shared edges, we just create three times as many edges as there are faces. Vertex references are pre-sorted in those structures, so that Ref0 < Ref1. That way we ensure an edge a-b will be encoded as an edge b-a. All we have to do next is to sort all the AdjEdge structures, in order to group faces sharing the same edge. If we use an enhanced Radix Sort as described in [6], this part is both easy and fast : actually just one line of code. Then, we just have to read the structures in sorted order to get connected faces back. Once we have those faces and the common edge, we just have to fill the adjacency structures with the relevant information. Some faces may have boundary edges whose links will remain encoded as 0xffffffff. As we will see just below, those faces will be the first ones chosen by the SGI algorithm when we’ll start creating strips.

 

 

 

Choosing the first triangle

 

There are two standard ways of choosing the first triangle of a strip:

 

-         the first method is the one from the SGI algorithm : pick the less connected faces first. Such faces, due to their lack of connections, are easily left isolated after some strips have been created. The idea is to take those faces first, in order to reduce the number of isolated triangles in the final mesh. This is just a heuristic method, but it works well in practice.

 

-         The second method is just to take the first free face, i.e. the first face which doesn’t belong to a strip yet. This is what Chow does and according to him it’s just as good as the SGI way.

 

Both methods can be coded in a very similar way by using a precomputed insertion order, i.e. the order in which we’ll pick the faces to start creating a new strip. The insertion order for the second method is just the decimal one : we pick triangle 1 first, then triangle 2, and so on. For the SGI algorithm, we just have to sort the default list according to the number of connections of each triangle. The number of connections depends on the number of boundary edges, as defined in the previous paragraph, and we already have this information thanks to the adjacency structures. We just have to call the sort routine once more.

 

My own code takes one method or the other according to what the user wants.

 

 

 

Choosing the initial direction

 

Once we have our starting face, we must choose a direction in which we’ll extend the strip. Most implementations just take the first edge, and let it go. Chow proposed in [2] to look ahead. I don’t know if what I did was the same as what he suggested. In my implementation indeed I chose the simple, brute-force solution : there are three edges, or three possible directions, then I just compute the three possible strips and select the longest one.

 

Since the creation of a strip takes linear time, this is not a matter of running time, but just a matter of coding : this is relatively painful to implement, because you must keep track of every operations you do, just to be able to discard them in the end. Well. No subtle things there, just some more buffers to fill and some extra code to debug.

 

Of course I also reverse the three possible strips and extend all of them in the opposite directions. Hence, starting from a given face, I just can’t miss the best possible strip sharing this face. The only thing which could produce better strips is the order in which you select your starting faces, but this is left for further investigation.

 

Walking the strip is done with the following code :

 

udword Length = 2;      // Initial length is 2, we have 2 indices in input

strip[0] = oldest;      // First index of the strip

strip[1] = middle;      // Second index of the strip

 

bool DoTheStrip = true;

while(DoTheStrip)

{

// Get the third index of a face given two of them

udword Newest = mAdj->mFaces[face].OppositeVertex(oldest, middle);

 

strip[Length++] = Newest;      // Extend the strip,...

*faces++ = face;              // ...keep track of the face,...

tags[face] = true;            // ...and mark it as "done".

 

      // Get the edge ID…

ubyte CurEdge = mAdj->mFaces[face].FindEdge(middle, Newest);

 

// ...and use it to catch the link to adjacent face.

udword Link = mAdj->mFaces[face].ATri[CurEdge];

 

if(IS_BOUNDARY(Link))      DoTheStrip = false;

// If the face is no more connected, we're done...

else

{

// ...else the link gives us the new face index.

face = MAKE_ADJ_TRI(Link);

 

// Is the new face already done?

if (tags[face])      DoTheStrip=false;

}

oldest = middle;         // Shift the indices and wrap

middle = Newest;

}

return Length;

 

 

Click here for Picture1 : the three triangle strips (Red, Magenta, Brown) have all been generated starting from the first face of the sphere. Only the longest one is kept in the final strip, others are discarded.

 

 

 

 

 

 

Backface culling

 

There is a little problem with standard available code for tristrips : backface culling. For example the code from Brad Grantham produces strips whose orientation isn’t guaranteed to be the same as the one of the original mesh. In other words, if you want your strip to be correctly displayed, you must use double-sided faces. Quite annoying. This problem comes from the 3b) optimisation : when reversing the strip, the original starting face can be flipped, depending on the strip length. Care must be taken to ensure the final strip (extended in both directions) has the same orientation as the original model.

 

That’s not as easy as it sounds. To reverse the culling of a strip, it isn’t sufficient to write it in reverse order. It actually depends on its length.

 

Let’s have some examples.

 

Say we start from face (0,1,2). From this face, we create the first part of a strip, for example :

 

0 1 2 3 4

 

Those 5 indices encode 3 faces :

 

012

123

234

 

Please note a strip of N indices always encodes N-2 faces. The first face encoded, (0,1,2) is in an arbitrary CW order, and we want that order to be conserved. Now, to extend the strip in the opposite direction, we must write it in reverse order, so that the first face becomes the last one :

 

4 3 2 1 0

 

Then we can extend that strip even more, and compute a final strip which would be, for example :

 

4 3 2 1 0 5 6 7

 

But this strip now encodes the following faces :

 

432 (+)

321 (-)

210 (+)

105 (-)

056 (+)

567 (-)

 

The original face (0,1,2) is now the third one in the strip. Don’t forget that culling order is inverted from one face to the next in a triangle strip. The sign near the faces gives you the actual culling. It is the same as :

 

432 (+)

312 (+)

210 (+)

150 (+)

056 (+)

576 (+)

 

Hence, our original face is now displayed as (2,1,0), which is the same as (0,2,1) (you can safely « scroll » the numbers without changing the culling). As you see, that face has been inverted, and if the original one was CW, that one is to be displayed as CCW.

 

If you just write the whole strip in reverse order once gain, you may not fix the problem :

 

7 6 5 0 1 2 3 4

 

encodes

 

765 (+)

650 (-)

501 (+)

012 (-)

123 (+)

234 (-)

 

That is, we get our original face back (0,1,2) but its position in the strip implies it still is displayed as CCW (see the minus sign on the side). Would our final strip have had another extra index, the final culling for our original face would have been correct.

 

In bad cases such as our example, the only solution is to replicate the first index, introducing a void vertex but actually fixing the culling problem.

 

In short, here’s the recipe for single-sided strips:

 

-         if the length of the first part of the strip is odd, the strip must be reversed

-         to reverse the strip, write it in reverse order. If the position of the original face in this new reversed strip is odd, you’re done. Else replicate the first index.

 

 

Picture 2 :

Click for a one-sided striped teapot.

 

 

 

 

Connecting strips

 

Longer strips can be created by using swaps [7], but multiple strips can also be connected in a single one, even if they don’t share vertices. For example, say you want to connect the strip :

 

0 1 2 3 4

 

and the strip :

 

5 6 7 8 9

 

They don’t have any common vertex, but you still can create the following strip :

 

0 1 2 3 4 4 5 5 6 7 8 9

 

Which encodes the following triangles :

 

012 (+)

123 (-)

234 (+)

 

344 (-)

445 (+)

455 (-)

556 (+)

 

567 (-)

678 (+)

789 (-)

 

This method requires inserting two void vertices in the strip, and it produces four void faces, but since those faces have two common indices, they rapidly get discarded by your rendering API as zero-area faces, and don’t get very far into the rendering pipeline – ie they’re almost free. Whatever happens, sending two more vertices is still better than calling DrawPrimitive two times. Multiple rendering calls indeed is the main disadvantage of strips, compared to trilists. If you manage to render all your strips with a single call, you’re getting best of both worlds. Hence, even if linking two strips in such a way introduces extra vertices, it usually is a win. As far as I know, this method works well in practice.

 

However, as the signs near the faces suggest, linking two strips may flip the culling ! That’s why packing all your strips in a single one is somewhat delicate. The easiest way is to create your strips in the usual way, as described until now, and to link them together only in the end. Sometimes you will need to flip a strip, as in the example above. This happens when the total length of all accumulated strips is odd. To flip the incoming strip you can just replicate the first vertex. However that strip may already begin with a replicated vertex, because of the way we created them. In such cases, we just can discard the first replicated vertex, which flips the strip as well, but saves some space.

 

Picture 3 : a mesh packed in a single, single-sided strip

 

 

 

 

 

Source code

 

All the features discussed in this article have been implemented in the companion source code. You’re free to include it in your own commercial or non-commercial products, as long as you send me a mail for notification. Feedback is welcome as well.

 

 

 

 

 

 

 

 

 

 

 

References

 

 

[1] Jihad El-Sana, Elvir Azanli, Amitabh Varshney, Skip Strips : maintaining Triangle Strips for View-dependent Rendering

 

[2] Mike Chow, Using Strips for Higher Game Performance, Meltdown X99 presentation

 

[3] Brad Grantham’s web site :

http://tofu.alt.net/~grantham/meshifier/show.html

 

[4] Francine Evans and Steven S. Skiena and Amitabh Varshney. Optimizing Triangle Strips for Fast Rendering , IEEE Visualization '96,  pp. 319-326 (October 1996). IEEE. Edited by Roni Yagel and Gregory M. Nielson. ISBN 0-89791-864-9.

 

[5] STRIPE homepage :

http://www.cs.sunysb.edu/~stripe/

[6] Pierre Terdiman, Radix Sort Revisited,

http://codercorner.com/RadixSortRevisited.htm

 

 [7] Tomas Möller, Eric Haines, Real-Time Rendering (p.232) ISBN 1-56881-101-2