Sweep-and-prune benchmark

This little page sums up various sweep-and-prune (SAP) tests. The goal is to find all colliding pairs in a set of N dynamic (moving) objects. We have three algorithms available :

- brute force, performing N ( N-1) / 2 overlap tests

- standard SAP algorithm (V-Collide implementation)

- radix-based SAP

In the following screenshots, you can see the results for N = 1024 moving cubes.

BF = brute force

BI = radix-based

SAPVC = V-Collide

(the "AABB build" timing has nothing to do here, forget it)

Here are the pics :

Here are the comments:

- All versions find the same number of pairs, which is good !

- The standard SAP algorithm takes advantage of temporal coherence. Hence, running time depends on objects' motions. Slower motions produce faster running time.

- On the other hand, both brute force and radix versions don't depend on the underlying objects motions. "Brute-force" doesn't because it performs the full job all the time, "Radix" doesn't because it's O(n) regardless of actual floating-point values.

- All objects are moving each frame, so it might not be a very realistic situation. In typical cases, only a small subset actually moves each time (say 10% of the N objects). In any case, the lower bound for the running time of the standard SAP is when there's no motion at all (first picture).

- In the best case, the vanilla SAP is faster than the radix version. This is quite normal. When there's no motion at all, the vanilla SAP should actually take no time -at all-. In this test we still call the Update() method even when the world matrices have not been modified, which explain why it still takes some time to complete when there's no motion. In a better implementation, that method shouldn't be called at all, and the vanilla SAP should take no time at all (in theory).

- With slow/normal motions however, the vanilla SAP quickly becomes slower than the radix-based version.

- For very fast motions the vanilla SAP's running time plummets completely, while the radix version remains steady.

What you don't see :

- For 1024 objects, building the vanilla SAP's lists is far from immediate. Some seconds at best. There's no delay for other methods. Hence the radix-version is especially good for one-shot queries.

- The vanilla SAP is not cache-friendly, maintaining 3 linked lists all the time, and relying on a lot of runtime object creation/destruction, leading to memory fragmentation.

- Finally the V-Collide implementation has an extra limitation, since the bounding box used in the SAP algorithm must be a cube. This limitation could be removed, but it would complicate the code a bit, making it slower. On the other hand, other methods work with arbitrary AABBs.

Conclusion :

If only a few objects are moving each frame (among N of them), and if you're not running on a console, use the vanilla SAP.

In any other cases, the radix-based version definitely is a nice alternative.

Other possibilities :

- Loose octrees (but they're slower)

- Sphere trees à la Ratcliff (not tested so far)

- H-Grids (probably similar to loose octrees)