Ray tracing, Parallel Computing and a Bugatti Veyron

At last week's Hot Chips symposium, Nvidia founder and CEO Jen-Hsun Huang delivered the first keynote about the GPU computing revolution.

The keynote was definitely the highlight of the conference, but before I get all swoony over the incredible directional flame sprites and the finger-licking Bugatti Veyron their GPUs can render, first I need to pick on Nvidia a little.

That’s because the company was selling $200 3-D glasses at their booth. Or, they were trying to. I didn’t see anyone buy them, and if anyone did, they didn’t tell me about it.

The glasses were supposed to augment a very engrossing 3-D Batman game Nvidia had nakedly set up to lure passers-by. Apparently they created a deeper z-space by giving each lens a different refresh rate. Something like that. I put on the glasses and played for a while. It’s telling either of my unsophistication with games or of how unimpressive these glasses were that I failed to notice that you had to actually turn them on—when someone out pointed my mistake, and I flipped the on switch, the only difference I noticed was a pretty blue LED light.

But enough: let’s make with the swooning.

First, Huang took the audience back to February of 1993, when he'd just finished his master’s in electrical engineering at Stanford, and Nvidia was just a gleam in a venture capitalist's eye. For perspective, 1993 is so long ago that there was no need to have a PC on your desktop even if you were trying to get people to invest in your computer company. “If we had told our investors at the time that we’d be using the same hardware to play games and try to cure cancer," he said, "I am sure we would not have been funded."

“The GPU will likely be parallel processor for the future,” he told the crowd. Computers are being driven to parallel computing because people can do magical things with them.

Nvidia’s Teraflop-capable GPUs can, in fact, do some things that would have literally appeared to be magic to a person in 1993: Augmented reality in Monday Night Football, where it’s possible for the football players to stand on top of the 3-D rendered line of scrimmage projected onto the field but under the players. The flags rendered under the ice at Olympic hockey games; Ann Curry’s set during the 2008 election coverage. But you know all this stuff.

The point is this: The GPU has evolved faster than any other tech component, their complexity increasing from a few million transistors in 1994 to billions in 2009. That’s a thousand-fold increase in complexity in only 15 years.

What did they do with all that complexity? Shaders. Shaders and programmable pipelines made it possible for computer game designers to be artists. Let’s take an extreme example. Pacman and his attendant ghosts are lovable, clunkety and pixelated.

Pacman

Let's leave aside the fact that these were animated with pixels instead of polygons and that GPUs barely existed when Pacman was born. With the obscene amount of processing power GPUs now command, a programmer can now create a specific mood for his or her game by automatically shading all scenes and objects with a hypercolor style or a sepia tint, you name it. The result can be anything from the eye-poppingly surreal textures of Super Mario Galaxy...

...to the otherworldly, overexposed dreamscape of Riven or Myst.

Shading is great but Nvidia wanted to take it to the next level: articulate the surfaces, but also the physics underlying what you see on the surface. Now you’re getting into computational visualization.

This is where ray tracing comes in. With ray tracing, an image is generated by tracing a path through each pixel in a virtual screen, and calculating the color of the object visible through it. Huang showed us what exactly ray tracing can do by way of a Bugatti Veyron, rendered with 2 million polygons worth of luscious, mouth-watering detail.

[This image was from the 2008 SIGGRAPH conference-- the image from Hot Chips isn't online yet but it's even prettier!]

Because ray tracing constructs the entire image using information from the computed trajectory of rays of light bouncing from surface to surface, you can light the scene, place your object into the scene, and then do a “walk through”, panning inside the car, where it’s possible to see details—there is no independent lighting inside the car—provided exclusively by ambient “light” rays diffracting and reflecting off the environment and streaming in through the windows. The lighting was so complex and subtle you begin to understand how the GPU could harness physics simulations as impossibly complex as molecular dynamics.

This animation was running on three GeForce GPUs, each with almost 1 Tflop of processing horsepower. That’s about 2.7 Tflops to sustain animation that was very close to photorealistic. (1500-2000 instructions per component, all in HD. 100 shader instructions per component, 4 components per pixel [R G B alpha], 1.5 Flops per instruction on average, 60 frames per second, etc—that adds up to 500 shader Gflps: and if this sentence makes you want to die, read "Data Monster," the tutorial on GPUs and graphics processing in the September issue of Spectrum.) But that only represents, Huang said, about 10 percent of the total math capability of a GPU.

Meanwhile, let’s do a little side by side comparison. Intel's vaunted Nehalem CPU, trotted out earlier that day: 3 Ghz, 4 cores, and a bunch of other stuff—theoretical peak performance of 96 Gflops. That's great for general purpose computing, but two orders of magnitude short of being able to run the Bugatti animation in real time, which requires 5 Tflops. Nehalem—and the CPU in general—is designed for general-purpose computing, but not for graphics.

Animators will be making increasingly photorealistic art for games: water, fire, clouds, smoke—anything that obeys the laws of physics can be rendered to look real, provided you have the right algorithms and a monster amount of GPU muscle. To prove that point, he showed a nice video of water gently rippling in the sunlit breeze. It was more than photorealistic. But to do all that, you’re using a 3D fluid solver that renders in agonizing detail about 262,000 individual particles to generate fluid motion. Each particle has its own shadow and motion blur. Not to mention color, alpha, etc.

But ray tracing has a way to go, Huang said. Where it's great for photorealism, it’s not good for real-time rendering. The Bugatti for example was super-impressive in still frame; but when you moved around it, it got grainy and monochrome. Not for long—as soon as you stopped, the image filled in remarkably fast. If you're just making a movie, you can pre-bake the animation as long as you want. For games that's obviously a nonstarter.

To illustrate the true power of ray tracing, Huang showed us the directional flames Industrial Light & Magic did for the Harry Potter movie, which are apparently just unthinkable without monster processing power. Fire is amazingly complex because it’s alive, dynamic, moving and turbulent, so normally, to do fire special effects, animators use and sculpt sprites of real flames. But you can’t animate flame sprites directionally. The ILM fire simulator runs on top of CUDA, and the realistic flames shooting out of Dumbledore's hands are as good as any real-life flame thrower.

In addition, there are some things you can’t pre-animate because you don’t know how it will work at game time. For example, a really awful tackle in a football video game. Animators combine physics simulations and morph them with motion capture, because even though the motion capture is convincing to a certain extent, a brutal tackle would be really painful to motion-capture.

When a program is written taking full advantage of the GPU, obscene improvements are the norm, and not just for graphics. A certain unnamed quantum chemistry program, for example, had a 130X speedup when it was run properly on a GPU. It’s totally doable when an application is inherently parallelizeable.

The point is this: Moore’s law as applied to Intel’s CPUs can reap performance improvements of, on average, 20 percent per year.

By contrast, over the next 6 years, Huang predicted, a co-processing architecture (ganging together a CPU and one or more GPUs) would enable a performance improvement of 570X. Understandably, later blog posts that referenced this figure had people's heads exploding. But keep in mind, this is for specialized applications: graphics, oil & gas equations, seismic, molecular dynamics, quantum chemistry.

I assume ray tracing lends itself to parallel computing, and also that with a 570X performance improvement, this Bugatti will look photorealistic in real time by 2015. But I think the real issue is whether that 570X speedup will help humanoid characters be truly photorealistic by 2015.

Huang wrapped up the talk by wowing us with all manner of Star Trek daydreams—the real-time universal translator, the smartphone app that can tell you what you’re looking if you just snap a picture of it (WANT).

But even with all those goodies, I’m still stuck on the Uncanny Valley problem. I wonder how far we’ll have to go into physics simulations before we break humanoid characters out of the Uncanny Valley. Even the most advanced animations—Beowulf and Digital Emily—are convincing until they start talking. There’s something impossible to render accurately about teeth, I think. Digital Emily was perfect until she showed her teeth, and the sad thing is, when I mentioned this to Paul Debevec, he looked crestfallen and explained that they had modeled the teeth exactly.

The upshot is this: I don’t think we’re going to get out of the Uncanny Valley until we can do essentially molecular dynamics on every part of the human face, and that includes building the teeth from the ground up.

The good news is, if Huang’s prediction proves true, and GPU performance increases by 570 over the next six years, that’s not a crazy thing to aspire to do. Whether it’s worthwhile, that’s another story.

gpus hardware embedded systems nvidia hot chips processors audio software devices gaming

Topics

Sections

More

For IEEE Members

For IEEE Members

IEEE Spectrum

Follow IEEE Spectrum

Support IEEE Spectrum

Ray tracing, Parallel Computing and a Bugatti Veyron

Thanks, Nvidia!

2D Gold Sheet Draws Graphene Comparisons

How Field AI is Conquering Unstructured Autonomy

Expect a Wave of Waferscale Computers

Related Stories

Ignoring Intel Rumors, GlobalFoundries Will Do $1-billion Expansion

South Korea’s $450-Billion Investment Latest in Chip Making Push

Nanoscale Plasma Switch Goes From Zero to Kilowatts in Picoseconds

Topics

Sections

More

For IEEE Members

For IEEE Members

IEEE Spectrum

Follow IEEE Spectrum

Support IEEE Spectrum

Enjoy more free content and benefits by creating an account

Saving articles to read later requires an IEEE Spectrum account

The Institute content is only available for members

Downloading full PDF issues is exclusive for IEEE Members

Downloading this e-book is exclusive for IEEE Members

Access to Spectrum 's Digital Edition is exclusive for IEEE Members

Following topics is a feature exclusive for IEEE Members

Adding your response to an article requires an IEEE Spectrum account

Create an account to access more content and features on IEEE Spectrum , including the ability to save articles to read later, download Spectrum Collections, and participate in conversations with readers and editors. For more exclusive content and features, consider Joining IEEE .

Join the world’s largest professional organization devoted to engineering and applied sciences and get access to all of Spectrum’s articles, archives, PDF downloads, and other benefits. Learn more →

Join the world’s largest professional organization devoted to engineering and applied sciences and get access to this e-book plus all of IEEE Spectrum’s articles, archives, PDF downloads, and other benefits. Learn more →

Access Thousands of Articles — Completely Free

Create an account and get exclusive content and features: Save articles, download collections, and talk to tech insiders — all free! For full access and benefits, join IEEE as a paying member.

Ray tracing, Parallel Computing and a Bugatti Veyron

Thanks, Nvidia!

2D Gold Sheet Draws Graphene Comparisons

How Field AI is Conquering Unstructured Autonomy

Expect a Wave of Waferscale Computers

Related Stories

Ignoring Intel Rumors, GlobalFoundries Will Do $1-billion Expansion

South Korea’s $450-Billion Investment Latest in Chip Making Push

Nanoscale Plasma Switch Goes From Zero to Kilowatts in Picoseconds