LATE LAST YEAR, the news that two heavyweight computers were racing for the title of world's fastest left an excited supercomputing community in suspense. The competition came to an end in November, when the Top500 supercomputer ranking project released its twice-yearly report.
IBM's Blue Gene/L [see photo, " Your Move, Garry"], to be delivered this spring to the U.S. Department of Energy's Lawrence Livermore National Laboratory in Livermore, Calif., took the top spot, with a performance of 70.72 teraflops (trillion floating-point operations per second). Silicon Graphics' Columbia, built for NASA and named after the space shuttle lost in 2003, came in second, at 51.87 teraflops. The two machines displaced Japan's famed Earth Simulator to third place after that 35.86-teraflop computer had reigned supreme for two and a half years [see table, " The Top Three Supercomputers"].
"This was tremendously important," says Tarek El-Ghazawi, a professor of electrical and computer engineering at George Washington University, in Washington, D.C. IBM and Silicon Graphics Inc.�SGI�came up with promising new designs, he says, showing it's possible to achieve dramatic increases in speed without sending costs through the roof. Blue Gene/L cost roughly a quarter of what was spent on the Earth Simulator, and Columbia just an eighth.
The new breed of supercomputers brings technology advances that may ultimately trickle down to a variety of high-performance computers, benefiting not only big-bucks buyers like the Energy Department and NASA but many other organizations in need of serious computing horsepower. That market, worth US $6 billion worldwide, pits IBM and SGI against Cray, HP, NEC, and Sun, among others.
What innovations catapulted Blue Gene/L and Columbia to the top?
IBM takes particular pride in its building block: a microchip containing not one but two microprocessors based on the company's PowerPC family. To these, IBM added memory, communications functions, and extra circuitry to speed up floating-point operations. Even so, the chip barely covers a fingernail and consumes a mere 10 to 15 watts.
It's important that the building blocks be small and energy efficient, because they work together by the thousands, with each one crunching a tiny piece of a massive computational task. After five years of work, IBM engineers managed to cram 16 384 chips into 16 refrigerator-sized cabinets, then installed five types of networks that link the processors together in different ways and guarantee that none of them starve for data.
"It's just a maniacal focus on power, and it's a maniacal focus on integration," says Tilak Agerwala, vice president of systems at IBM Research and an IEEE fellow. At one-fourth of its planned size, Blue Gene/L has twice the computing power of the Earth Simulator, yet it's 1/50 the size and consumes 1/14 the power required by the Japanese computer, Agerwala says.
SGI, in Mountain View, Calif., isn't too shy to boast of its achievements, either. For Columbia's design, it took 20 of its Altix systems, high-performance computers each powered by 512 Intel Itanium 2 processors, and connected them with an industry-standard InfiniBand networking system, getting them all up and running in only 120 days, says Dave Parry, a senior vice president at SGI.
NASA brought the 10 240-processor machine online one batch of processors at a time, without disrupting systems already running, says Walt Brooks, chief of NASA's advanced supercomputing division at the agency's Ames Research Center, in Moffett Field, Calif. Simulations of space shuttle missions that used to take two to three months to generate now run in hours, he says. The machine will raise NASA's overall computing power 10-fold.
The new supermachines' prodigious performance hasn't stopped many in the supercomputing field from asking how long they will really stay ahead of new contenders.
For one thing, Blue Gene/L and Columbia (and other top-ranking supercomputers, for that matter) have distinct architectures, perhaps an indication that there's currently no single winning strategy for building these ultrafast computers. In fact, just a few years ago, the field "was very monochromatic, with very few architectures," says IEEE Fellow Jack Dongarra, a professor of computer science at the University of Tennessee at Knoxville and one of the organizers of the Top500 project. "Now there is a much richer collection of machines," he adds.
Current designs range from off-the-shelf clusters, which are basically large collections of cheap computers hooked together with a standard high-speed network, to highly customized machines like the Earth Simulator. That machine was built by Japanese computer maker NEC Corp. out of specially designed, and especially costly, components such as vector microprocessors, which perform some operations in parallel and thus faster than standard microprocessors.
IBM and SGI chose a mix of commodity and custom architectures, a strategy that seems to have proved successful, especially for Blue Gene/L, whose monstrous final 130 000-plus-processor configuration will achieve 360 teraflops. Dongarra predicts that this machine will lead the Top500 list for at least several more editions.
"But in the end," he hastens to add, "it's not really a question of speed, it's a question of what kind of science, what kinds of new ideas, what kind of deeper understanding do we get by using this equipment." The number of teraflops is just "a trophy," Dongarra says. "And we all know that that trophy won't last forever."