Modulo 3

 

 

Ever encountered this one?

 

      int Index;  // 0, 1 or 2

      int Index2 = (Index+1)%3;

      int Index3 = (Index+2)%3;

 

It appears all over the place, in particular when you have some algorithm working on vertices, and "Index" is the index of a particular component, i.e. X, Y or Z.

 

Each time I see this code I feel terribly sorry for the poor guy who wrote it. Especially when I am that guy.

 

Needless to say (but I'll say it anyway), this code is evil since in this case it includes two hidden divisions by 3. Which is slow.

 

The usual way to get rid of the modulos is to bite the bullet and write explicit comparisons. For example:

 

      int Index;  // 0, 1 or 2

      int Index2 = Index+1;   if(Index2>2)      Index2 -= 3;

      int Index3 = Index+2;   if(Index3>2)      Index3 -= 3;

 

This is a bit better. But not much. Of course, you traded the modulos for comparisons, i.e. you replaced slow divisions with potential branch mispredictions, which are just as evil.

 

Another popular way to fix this (used in 3DS MAX, for example) is to use a small lookup table:

 

      int LUT[] = { 1, 2, 0, 1 };

      int Index;  // 0, 1 or 2

      int Index2 = LUT[Index];

      int Index3 = LUT[Index+1];

 

It almost looks nice. Almost. But of course it suffers from two issues:

- the LUT has to be written to memory each time

- then read back

 

To get rid of the first issue you usually do:

 

      static int LUT[] = { 1, 2, 0, 1 };

      int Index;  // 0, 1 or 2

      int Index2 = LUT[Index];

      int Index3 = LUT[Index+1];

 

Hence you just have to read a static LUT. Well, it's also bad. Accessing memory is bad, no matter what. Accessing a static array is even worse, since the corresponding memory can be anywhere, and probably not close to your stack. In short, you replaced slow divisions with even slower cache misses. This is not really a progress.

 

So, we have the choice between 3 evil ways to do this. Hmmm. Is there any hope left?

 

Well, maybe. I'd like to propose this alternative:

 

      int Index;  // 0, 1 or 2

      int Index3 = 0x01000201;

      Index3>>=(Index<<3);

      int Index2 = Index3 & 0xff;

      Index3>>=8;

      Index3&=0xff;

 

No modulo, no comparisons, no memory access. Looks good!

 

In case you don’t actually need all the 3 indices, but only the next one in the sequence, you can use this alternative function from Adam Moravanszky:

 

int Modulo(int edge)

{

     return ((edge&1)<<1) | ((~(edge+1)&2) >>1);

}

 

 

 

 

Addendum March 22, 2007:

 

Christer Ericson was kind enough to send two alternative methods. The first one was:

 

 

int Index;  // 0, 1 or 2
int Index2 = ( 9 >> (Index << 1)) & 3;
int Index3 = (18 >> (Index << 1)) & 3;

int Modulo(int edge)
{
      return ( 9 >> (edge << 1)) & 3;
}

 

 

And the most recent one is so simple I think we can just close the topic:

 

 

int Index1 = n;  // 0, 1 or 2
int Index2 = (1  << Index1) & 3;
int Index3 = (1  << Index2) & 3;

int Modulo(int edge)

{
      return (1 << edge) & 3;
}

 

 

 

 

Pierre Terdiman