Triangle configuration table

 

 

Here's a very old tip that might still be useful nowadays.

 

I was browsing the source code for Nick's very nice "Soft-Wired Shaders" COTD, when I noticed this in the rasterizer's triangle setup :

 

††††††††††††††† // Sort: V1 top, V2 middle, V3 bottom

††††††††††††††† if(V1->y > V3->y) swap(V1, V3);

††††††††††††††† if(V2->y > V3->y) swap(V2, V3);

††††††††††††††† if(V1->y > V2->y) swap(V1, V2);

 

This is a very common piece of code in rasterizers, yet some years ago it was considered "sub-optimal" for at least two reasons :

 

-         FPU comparisons producing bad-looking assembly code

-         branch misprediction penalties, since there's typically no coherence at this point of the pipeline, from one triangle to the next (triangleís orientation is mostly random)

 

This is obviously not the bottleneck in Nick's code, and I'm not even sure old optimization rules still apply on today's machines, but the following tip might still be relevant.

 

You can replace these tests with a "configuration table", that gets rid of the two previously mentioned problems :

 

#define LEFT††††††††† 0†††††† // long edge on the left side of triangle

#define RIGHT†††††† 1†††††† // long edge on the right side of triangle

#define DUMMY†† 255 †††// for padding

 

#pragma pack(1)

static ubyte gConfigTable[] =

{

††††††††††††††† 0,†††††††††††† ††††††††††††††† 255,††††††††††††††† 255,††††††††††††††† 255,††††††††††††††† 255,††††††††††††††† 255,††††††††††††††† DUMMY,††††††††††††††† DUMMY,

††††††††††††††† RIGHT,††††††††††††††† 2,††††††††††††††† 1,††††††††††††††† 0,††††††††††††††† 1,††††††††††††††† 0,††††††††††††††† DUMMY,††††††††††††††† DUMMY,

††††††††††††††† RIGHT,††††††††††††††† 1,††††††††††††††† 0,††††††††††††††† 2,††††††††††††††† 0,††††††††††††††† 2,††††††††††††††† DUMMY,††††††††††††††† DUMMY,

††††††††††††††† LEFT,†††† ††††††††††††††† 1,††††††††††††††† 2,††††††††††††††† 0,††††††††††††††† 0,††††††††††††††† 2,††††††††††††††† DUMMY,††††††††††††††† DUMMY,

††††††††††††††† RIGHT,††††††††††††††† 0,††††††††††††††† 2,††††††††††††††† 1,††††††††††††††† 2,††††††††††††††† 1,††††††††††††††† DUMMY,††††††††††††††† DUMMY,

††††††††††††††† LEFT,†††† ††††††††††††††† 2,††††††††††††††† 0,††††††††††††††† 1,††††††††††††††† 1,††††††††††††††† 0,††††††††††††††† DUMMY,††††††††††††††† DUMMY,

††††††††††††††† LEFT,†††† ††††††††††††††† 0,††††††††††††††† 1,††††††††††††††† 2,††††††††††††††† 2,††††††††††††††† 1,††††††††††††††† DUMMY,††††††††††††††† DUMMY,

††††††††††††††† 0,†††††††††††† ††††††††††††††† 255,††††††††††††††† 255,††††††††††††††† 255,††††††††††††††† 255,††††††††††††††† 255,††††††††††††††† DUMMY,††††††††††††††† DUMMY

};

#pragma pack()

 

To find triangle's configuration you simply index the table using a bitmask, computed from triangleís Y coords :

 

#define SIGN_BITMASK††††††††††††††† ††††††††††††††† 0x80000000

 

// Integer representation of a floating-point value.

#define IR(x)††††††††††††††† ((udword&)(x))

 

udword dy0 = (IR(V1->y) - IR(V2->y)) & SIGN_BITMASK;

udword dy1 = (IR(V2->y) - IR(V3->y)) & SIGN_BITMASK;

udword dy2 = (IR(V3->y) - IR(V1->y)) & SIGN_BITMASK;

udword Index = (dy0 >> 29)|(dy1 >> 30)|(dy2 >> 31) << 3;

 

ubyte Type††††††††††††††† = gConfigTable[Index+0];

ubyte Top††††††††††††††† = gConfigTable[Index+1];

ubyte Middle††††††††††††††† = gConfigTable[Index+2];

ubyte Bottom††††††††††††††† = gConfigTable[Index+3];

ubyte Right††††††††††††††† = gConfigTable[Index+4];

ubyte Left††††††††††††††† = gConfigTable[Index+5];

 

Note that it requires properly clipped TL-vertices, whose Y is positive, in order to work.

 

Then you pretty much know everything :

 

Type says if this is a LEFT or RIGHT triangle, i.e. if the long edge is on the right or on the left side of it.

 

Top, Middle and Bottom are sorted vertex indices, similar to what Nickís code is computing in the original code.

 

As a free bonus you also get the indices of Right and Left vertices, that can be used to compute gradients without a single ďifĒ or any special case. Resulting code is very short and clean.

 

For example, hereís how you can render a flat triangle :

 

void FlatRenderer::DrawTriangle(const Triangle* t)

{

††††††††††††††† SetFPUCeilMode();††††††††††††††† // We let the casts do the ceiling for us

 

††††††††††††††† udword dy0 = (IR(t->mVerts[0].y) - IR(t->mVerts[1].y))&SIGN_BITMASK;

††††††††††††††† udword dy1 = (IR(t->mVerts[1].y) - IR(t->mVerts[2].y))&SIGN_BITMASK;

††††††††††††††† udword dy2 = (IR(t->mVerts[2].y) - IR(t->mVerts[0].y))&SIGN_BITMASK;

††††††††††††††† udword Index = ((dy0 >> 29)|(dy1 >> 30)|(dy2 >> 31)) << 3;

 

††††††††††††††† ubyte Type††††† = gConfigTable[Index+0];

††††††††††††††† ubyte Top††††††† = gConfigTable[Index+1];

††††††††††††††† ubyte Middle†† = gConfigTable[Index+2];

††††††††††††††† ubyte Bottom†† = gConfigTable[Index+3];

 

††††††††††††††† ubyte Right††††† = gConfigTable[Index+4];

††††††††††††††† ubyte Left†††††††† = gConfigTable[Index+5];

 

††††††††††††††† // Compute #scanlines to go to the middle vertex

††††††††††††††† udword CeilYBottom††††††††††††††† = udword(t->mVerts[Bottom].y);

††††††††††††††† udword CeilYMiddle††††††††††††††† = udword(t->mVerts[Middle].y);

††††††††††††††† udword CeilYTop††††††††††††††† = udword(t->mVerts[Top].y);

††††††††††††††† udword Count†††† ††††††††††††††† = CeilYMiddle - CeilYTop;

 

††††††††††††††† // Prestep for subpixel accuracy

††††††††††††††† float PreStep = t->mVerts[Top].y - float(CeilYTop);

 

††††††††††††††† // Compute slopes and usual stuff

††††††††††††††† float dX[2];

††††††††††††††† dX[RIGHT] = (t->mVerts[Right].x - t->mVerts[Top].x) / (t->mVerts[Right].y - t->mVerts[Top].y);

††††††††††††††† dX[LEFT] = (t->mVerts[Left].x - t->mVerts[Top].x) / (t->mVerts[Left].y - t->mVerts[Top].y);

 

††††††††††††††† // Prestep slopes

††††††††††††††† float CurrentX[2];

††††††††††††††† CurrentX[RIGHT] = t->mVerts[Top].x - PreStep * dX[RIGHT];

††††††††††††††† CurrentX[LEFT] = t->mVerts[Top].x - PreStep * dX[LEFT];

 

††††††††††††††† // Draw triangle

††††††††††††††† ubyte* Dest = &mBuffer[CeilYTop*mPitch];

††††††††††††††† udword Nb=2;

 

††††††††††††††† while(Nb--)

††††††††††††††† {

††††††††††††††† ††††††††††††††† while(Count)

††††††††††††††† ††††††††††††††† {

†††††††††††††††††††††††††††††† ††††††††††††††† sdword StartPix = sdword(CurrentX[LEFT]);

†††††††††††††††††††††††††††††† ††††††††††††††† sdword NbPixels = (sdword)CurrentX[RIGHT] - StartPix;

†††††††††††††††††††††††††††††† ††††††††††††††† udword* Buf = ((udword*)Dest)+StartPix;

 

†††††††††††††††††††††††††††††† ††††††††††††††† while(NbPixels-->0)

†††††††††††††††††††††††††††††† ††††††††††††††† {

†††††††††††††††††††††††††††††† ††††††††††††††† ††††††††††††††† *Buf++ = mColor;

†††††††††††††††††††††††††††††† ††††††††††††††† }

 

†††††††††††††††††††††††††††††† ††††††††††††††† CurrentX[LEFT]+=dX[LEFT];

†††††††††††††††††††††††††††††† ††††††††††††††† CurrentX[RIGHT]+=dX[RIGHT];

†††††††††††††††††††††††††††††† ††††††††††††††† Dest+=mPitch;

†††††††††††††††††††††††††††††† ††††††††††††††† Count--;

††††††††††††††† ††††††††††††††† }

 

††††††††††††††† ††††††††††††††† Count = CeilYBottom - CeilYMiddle;

††††††††††††††† ††††††††††††††† if(Count)

††††††††††††††† ††††††††††††††† {

†††††††††††††††††††††††††††††† ††††††††††††††† PreStep = t->mVerts[Middle].y - float(CeilYMiddle);

†††††††††††††††††††††††††††††† ††††††††††††††† dX[Type] = (t->mVerts[Bottom].x - t->mVerts[Middle].x) / (t->mVerts[Bottom].y - t->mVerts[Middle].y);

†††††††††††††††††††††††††††††† ††††††††††††††† CurrentX[Type] = t->mVerts[Middle].x - PreStep * dX[Type];

††††††††††††††† ††††††††††††††† }

††††††††††††††† }

††††††††††††††† SetFPUFloorMode();

}

 

 

Notes :

-         the above code is only included for information purpose, it may not be very safe. For example I donít think it works correctly if you donít compile using /QIfist.

-         of course you don't have to change the FPU mode for each triangle... The above code was not written with speed in mind !

 

 

This tip might be obsolete nowadays, but I've never seen it explained anywhere (though Iím sure itís old news to many of you). Somehow I felt it would be a good idea to share it nonetheless. I mainly used this in Phototracer's scanline (Phototracer is an old Photoshop plug-in), and before in various PC demos.

 

 

Pierre