Many researchers have co-opted powerful graphics processing units, or GPUs, to run climate models and other scientific programs, while tech and financial giants use large banks of these processors to train machine-learning algorithms. They all have video-game players to thank for the emergence of these workhorse processors: It was gamers who stoked the original demand for chips that could do the massive amounts of parallel number crunching required to produce rich graphics quickly enough to keep up with fast-paced action.
Of course, gamers weren’t the only people pushing graphics technology forward. By 1995, movies like Pixar’s Toy Story, the first full-length digitally animated movie, had demonstrated the potential of high-quality computer animation. But gamers drove the technology in a very specific direction. Pixar had created Toy Story’s graphics by slowly rendering each frame individually and then stitching it all together. Gamers, on the other hand, were interested only in technologies that could generate graphics in real time.
This demand for faster and better graphics had existed since the advent of mainstream video games in the 1970s. By the late 1990s, graphics chips for the games market had improved considerably from their humble beginnings, when displaying, say, a handful of colors at a resolution of 300 by 200 pixels was considered impressive. But Nvidia’s NV20, released in 2001 under the GeForce 3 brand, marked a key turning point, and one that serendipitously opened up new vistas in scientific computing and artificial intelligence.
Generating rich imagery on the fly for a video game was still a tremendous computational challenge in 1998, when the Nvidia team began working on the NV20. “I was always fascinated by the idea of being able to create super-realistic images in real time,” says NV20 system architect Steven Molnar. The graphics processors of the era couldn’t generate complex textures, realistic reflective surfaces, or shadows in real time. Thanks to a new architecture, the NV20 would offer all this to game developers.
At the time, the market for GPUs was highly competitive, says the NV20’s chief architect, John Montrym. Manufacturers had to strike a careful balance between offering new (and potentially expensive) features that game designers might not actually end up using and making sure their new chips could still run existing graphics programs efficiently. With the NV20, Nvidia decided to take a risk and went for an ambitious design, one that was difficult to manufacture and that required major changes to the company’s existing recipe for building graphics chips.
Montrym and Molnar concentrated on streamlining the chip’s operation. They changed how memory was partitioned, dividing 128-bit chunks of data into four 32-bit portions, which made fetching data from memory more efficient. They also adopted a system called z-cull, which could predict which pixels in a 3D scene were going to be obscured by other objects and would then toss the unneeded pixel data out of working memory to save processing power. All this enabled faster, richer graphics, but it made the chip hard to build.
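The idea behind discarding hidden pixels can be illustrated with a minimal sketch. This is not Nvidia's z-cull hardware, whose internals the article does not describe; it is a simplified per-pixel depth test, and the names `Fragment` and `cull_and_shade` are hypothetical. It assumes that a smaller depth value means a surface is closer to the viewer.

```python
from typing import List, Optional, Tuple

# Hypothetical fragment record: (x, y, depth, label). Smaller depth = closer.
Fragment = Tuple[int, int, float, str]


def cull_and_shade(
    fragments: List[Fragment], width: int, height: int
) -> Tuple[List[List[Optional[str]]], int]:
    """Keep only the nearest fragment per pixel and count how many were shaded."""
    depth = [[float("inf")] * width for _ in range(height)]
    color: List[List[Optional[str]]] = [[None] * width for _ in range(height)]
    shaded = 0
    for x, y, z, label in fragments:
        # A fragment behind what is already stored is hidden: skip it
        # before doing any (expensive) shading work.
        if z >= depth[y][x]:
            continue
        depth[y][x] = z
        color[y][x] = label  # stand-in for the real shading computation
        shaded += 1
    return color, shaded


if __name__ == "__main__":
    frags = [(0, 0, 1.0, "wall"), (0, 0, 5.0, "tree"), (1, 0, 2.0, "tree")]
    image, work_done = cull_and_shade(frags, width=2, height=1)
    print(image, work_done)
```

In this toy scene the tree fragment behind the wall is never shaded, which is the kind of saving the paragraph above describes.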
To cram these new features onto the NV20, hardware engineer John Robinson had to switch to what was then the latest semiconductor technology available—manufacturing chips at 150 nanometers. Robinson sighs and looks stressed out just thinking about that period. “I saw that chip as taking one slightly bigger bite than we should have at the time,” he says. “It was a very difficult chip to get into production.” But after more trial and error than usual, Robinson made it work.
Another big risk was letting developers get under the NV20’s hood and tinker. “There was a sense that if you want really realistic images, you need to expose programmability to developers,” says Molnar. The NV20 wasn’t fully programmable, but for the first time Nvidia made some of the onboard features configurable by game developers. With the NV20, developers could reach in and adjust the GPU’s pixel and vertex shaders.
Pixel shaders are key to making computer graphics look realistic. An object may have an exquisitely detailed and realistic 3D shape, but it will still look flat and artificial if it doesn’t appear to be made out of a specific material and reflect light in the way that that material would. Vertex shaders help bring 3D objects to life, allowing them to be reshaped on the fly. For example, Molnar explains, developers could program a function that changed the height of a surface to simulate waves on water or program another function to display realistic moving joints in an animated figure.
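Molnar's water example can be sketched in a few lines. The code below is an illustration only, written in Python rather than any real shader language, and the function names and parameters (`wave_vertex`, `amplitude`, `wavelength`, `speed`) are assumptions made for this sketch. It shows the core idea of a vertex function: each vertex is moved independently, here by raising or lowering its height along a sine wave.

```python
import math
from typing import List, Tuple

Vertex = Tuple[float, float, float]  # (x, y, z), with y as height


def wave_vertex(
    vertex: Vertex,
    time: float,
    amplitude: float = 0.5,
    wavelength: float = 4.0,
    speed: float = 1.0,
) -> Vertex:
    """Displace one vertex's height along a sine wave travelling in x."""
    x, y, z = vertex
    phase = 2.0 * math.pi * (x / wavelength - speed * time)
    return (x, y + amplitude * math.sin(phase), z)


def displace_mesh(vertices: List[Vertex], time: float) -> List[Vertex]:
    """Apply the same function to every vertex; no vertex depends on another."""
    return [wave_vertex(v, time) for v in vertices]


if __name__ == "__main__":
    flat_surface = [(float(i), 0.0, 0.0) for i in range(5)]
    for v in displace_mesh(flat_surface, time=0.0):
        print(v)
```

Because every vertex is processed by the same small function with no dependence on its neighbors, the work can be spread across many processing units at once.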
Because game developers could now adjust the NV20’s shaders to their needs, and even write their own vertex shader functions from scratch, they were no longer stuck with what the chip designer had chosen. They could create much more realistic game environments without sacrificing speed.
This decision opened the door to using GPUs for things other than graphics. As successive generations of GPUs offered more and more programmability, developers began hacking them for their own purposes, a practice that eventually led to the use of GPUs for scientific computing and later for training machine-learning algorithms. That was because the architecture required to process countless pixels on a screen in parallel, according to the preferences of a game designer, turned out to be just the thing for handling other massively parallel math problems, such as adjusting the weights of a neural network. The NV20, says Montrym, was the first step on that road.
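The resemblance between the two workloads is easy to see in a toy example. The sketch below is an assumption-laden illustration, not code from any GPU toolkit: `brighten` and `sgd_step` are hypothetical names, and the loops run sequentially in plain Python. What matters is the shape of the computation. In both functions each output element depends only on the matching input elements, which is the pattern a GPU can execute across many elements at the same time.

```python
from typing import List


def brighten(pixels: List[int], gain: float) -> List[int]:
    """Graphics-style work: scale every pixel value independently, capped at 255."""
    return [min(255, int(p * gain)) for p in pixels]


def sgd_step(weights: List[float], grads: List[float], lr: float) -> List[float]:
    """Learning-style work: nudge every weight independently against its gradient."""
    return [w - lr * g for w, g in zip(weights, grads)]


if __name__ == "__main__":
    print(brighten([100, 200], gain=1.5))
    print(sgd_step([1.0, 2.0], [0.5, -1.0], lr=0.1))
```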
These risks also paid off in the short term. Microsoft chose a custom version of the chip to run the graphics for its first Xbox games console, which became an international hit. Ironically, none of the NV20’s lead engineers plays video games; they’re more excited about what came later, in scientific computing and AI. “I don’t have time to play video games; I’m too busy!” says Montrym. “Computer gaming paid the bills, it caused us to continue to evolve, and nowadays, if you look at the portfolio of who uses our chips, society derives enormous value from GPU-based general-purpose computing.”