I recently ran across an interesting paper, Stackless KD-Tree Traversal for High Performance GPU Ray Tracing, which documented the strides made by GPU-based ray-tracing over the last decade and introduced a new way of mapping acceleration structure traversal to modern GPUs, namely Nvidia’s new G80. The paper was authored by Philipp Slusallek’s talented computer graphics group at Saarland University in Germany. Our own Cell iRT ray-tracer was based on papers written by Philipp’s students, so we have great respect for their work. It was interesting to see the great lengths researchers are willing to go to in order to harvest a fraction of the floating point potential locked away in these black boxes.
From 10,000 feet here’s how the Cell processor stacks up to Nvidia’s new G80 GPU:
Both parts are compared at 90-nanometre.
As you can see, the G80 is twice as big, which is a good indication that it requires twice the power, and it produces twice the floating point performance on paper. However, when we ran one of the benchmarks discussed in the paper, the Stanford Bunny, we found that the Cell processor combined with iRT produces significantly better performance (we don’t have access to the other datasets listed in the paper):
Left to Right:
2.6 GHz AMD Opteron - Saarland Ray-tracer
Nvidia GeForce 8800 GTX - Saarland Ray-tracer
Sony Playstation3 (partial 3.2 GHz Cell processor running Linux) - IBM iRT
3.2 GHz Cell Processor - IBM iRT
IBM QS20 Blade (Two 3.2 GHz Cell Processors) - IBM iRT
In fact, one Cell processor is four to five times faster at ray-tracing the Stanford Bunny than the G80, and the Cell QS20 blade, which has comparable floating point power on paper, is eight to eleven times faster. Both the G80 and Cell crush the AMD Opteron, arguably the most popular production rendering processor today, at ray-tracing. It’s also interesting to note that secondary rays are less costly on Cell, which is where ray-tracing becomes interesting. Primary ray casting is only interesting from an academic perspective. The real issue is secondary rays, and GPUs have traditionally had problems with these due to their incoherent nature. When you factor power into the equation it gets even more interesting, given that Cell is half the size of the G80 and produces four to five times the ray-tracing performance.
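As a rough back-of-the-envelope check on that last point, here is a minimal sketch that turns the ratios quoted above into performance per unit of die area. It assumes, as this post does, that die size is a reasonable stand-in for power draw. The function and variable names are made up for illustration, and the inputs are only the approximate ratios from the text, not measured figures.

```python
def perf_per_area(speedup, relative_area):
    """Relative ray-tracing throughput per unit of die area (G80 = 1.0)."""
    return speedup / relative_area

# Ratios quoted in the post (approximate, not measurements):
# Cell is roughly half the size of the G80 and four to five times faster
# on the Stanford Bunny.
cell_area_vs_g80 = 0.5
low = perf_per_area(4.0, cell_area_vs_g80)
high = perf_per_area(5.0, cell_area_vs_g80)

print(f"Cell vs G80, performance per unit area: {low:.0f}x to {high:.0f}x")
```

Under those assumptions, Cell comes out at roughly eight to ten times the G80's ray-tracing performance per unit of die area on this one benchmark.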
Things are starting to get interesting, and Intel is hot on the trail with their Larrabee part, which is said to be designed for ray-tracing.
Only time will tell….