Frontier Supercomputer to Usher in Exascale Computing

Oak Ridge National Lab may be the first to reach 1,000,000,000,000,000,000 operations per second

Technicians at Oak Ridge National Laboratory are assembling massive racks that will constitute the Frontier supercomputer.

Oak Ridge National Laboratory

In 2018, a new supercomputer called Summit was installed at Oak Ridge National Laboratory, in Tennessee. Its theoretical peak capacity was nearly 200 petaflops—that’s 200 thousand trillion floating-point operations per second. At the time, it was the most powerful supercomputer in the world, beating out the previous record holder, China’s Sunway TaihuLight, by a comfortable margin, according to the well-known Top500 ranking of supercomputers. (Summit is currently No. 2, a Japanese supercomputer called Fugaku having since overtaken it.)

In just four short years, though, demand for supercomputing services at Oak Ridge has outstripped even this colossal machine. “Summit is four to five times oversubscribed,” says Justin Whitt, who directs ORNL’s Leadership Computing Facility. “That limits the number of research projects that can use it.”


The obvious remedy is to get a faster supercomputer. And that’s exactly what Oak Ridge is doing. The new supercomputer being assembled there is called Frontier. When complete, it will have a peak theoretical capacity in excess of 1.5 exaflops.

The remarkable thing about Frontier is not that it will be more than seven times as powerful as Summit, stunning as that figure is. The remarkable thing is that it will use only twice the power. That’s still a lot of power—Frontier is expected to draw 29 megawatts, enough to power a town the size of Cupertino, Calif. But it’s a manageable amount, both in terms of what the grid there can supply and what the electricity bill will be.
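The arithmetic behind that comparison can be sketched in a few lines of Python. This is a rough illustration using the article's rounded peak figures, not measured benchmark results. Summit's power draw is not stated in the article, so the value below is an assumption derived from the "twice the power" claim, and the variable and function names are purely illustrative.

```python
PETA = 1e15  # floating-point operations per second in one petaflop
EXA = 1e18   # floating-point operations per second in one exaflop

# Peak theoretical figures quoted in the article (rounded).
summit_peak_flops = 200 * PETA    # "nearly 200 petaflops"
frontier_peak_flops = 1.5 * EXA   # "in excess of 1.5 exaflops"
frontier_power_watts = 29e6       # "expected to draw 29 megawatts"

# Assumption: Summit draws half of Frontier's power, inferred from
# "it will use only twice the power". The article gives no Summit figure.
summit_power_watts = frontier_power_watts / 2


def flops_per_watt(peak_flops: float, power_watts: float) -> float:
    """Peak operations per second delivered for each watt drawn."""
    return peak_flops / power_watts


speedup = frontier_peak_flops / summit_peak_flops
efficiency_gain = (
    flops_per_watt(frontier_peak_flops, frontier_power_watts)
    / flops_per_watt(summit_peak_flops, summit_power_watts)
)

print(f"Peak speedup over Summit: {speedup:.2f}x")
print(f"Gain in peak operations per watt: {efficiency_gain:.2f}x")
```

Under these assumptions, the gain in operations per watt is simply the speedup divided by the increase in power draw.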

“The efficiency comes from putting more computer hardware in smaller and smaller spaces,” says Whitt. “Each of these [computer] cabinets weighs as much as a full-sized pickup.” That’s because they are stuffed with what ORNL’s spec sheet describes as “high density compute blades powered by HPC- and AI-optimized AMD EPYC processors and Radeon Instinct GPU accelerators purpose-built for the needs of exascale computing.”

Building a supercomputer of this capacity is hard enough. But doing so during a pandemic has been especially challenging. “Supply-chain issues were broad,” says Whitt, including shortages of many things that aren’t special to building a high-performance supercomputer. “It could just be sheet metal or screws.”

Supply issues are indeed the reason Frontier will become operational in 2022 ahead of another planned supercomputer, Aurora, which will be installed at Argonne National Laboratory, in Illinois. Aurora was to come first, but its construction has been delayed, because Intel is having difficulty supplying the processors and GPUs needed for this machine.

At the time of this writing, technicians at Oak Ridge were assembling and testing parts of Frontier in hopes that the giant machine would come together before the end of 2021, with the intention of making it fully operational and available to users in 2022. Will we then be able to call it the world’s first exascale supercomputer?

That depends on your definition. “[Japan’s Fugaku supercomputer] actually achieved 2 exaflops with a different benchmark,” says Jack Dongarra of the University of Tennessee, one of the specialists behind the Top500 list. Those rankings, he explains, are based on a benchmark that involves 64-bit floating-point calculations, the kind used to solve three-dimensional partial differential equations as required for many physical simulations. “That’s the bottom line of what supercomputers are being used for,” says Dongarra. But he also points out that supercomputers are increasingly used to train deep neural networks, where 16-bit precision can suffice.
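To give a feel for the gap between the two precisions Dongarra mentions, here is a minimal Python sketch that rounds a few values to IEEE 754 16-bit and 64-bit floating point using the standard library's `struct` module. It is not part of any Top500 benchmark. The helper names are hypothetical and the sample values are chosen only for illustration.

```python
import struct


def round_to_half(value: float) -> float:
    """Round a value to IEEE 754 16-bit (half) precision and back."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def round_to_double(value: float) -> float:
    """Round a value to IEEE 754 64-bit (double) precision and back."""
    return struct.unpack("<d", struct.pack("<d", value))[0]


# Illustrative values only: a common decimal fraction and an integer
# just past the range where half precision can represent every integer.
for sample in (0.1, 2049.0):
    half = round_to_half(sample)
    double = round_to_double(sample)
    print(f"value={sample!r}  16-bit={half!r}  64-bit={double!r}")
```

A 16-bit float carries roughly three decimal digits of precision, which can be acceptable for neural-network training but is usually too coarse for the physical simulations that the 64-bit benchmark is meant to represent.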

And then there’s Folding@Home, a distributed-computing project intended to simulate protein folding. “I would call that a specialized computer,” says Dongarra, one that can do its job because the calculations involved are “embarrassingly parallel.” That is, separate computers can perform the required calculations independently—or at least largely so, with what little communication between them is needed being conveyed over the Internet. In March of 2020, the Folding@Home project proudly announced on Twitter, “We’ve crossed the exaflop barrier!”
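The idea of an "embarrassingly parallel" workload can be sketched as follows. This is a toy stand-in, not Folding@Home's actual software: each hypothetical work unit runs a small Monte Carlo estimate with its own seed, never communicates with the others, and the only coordination is collecting the results at the end. Threads on one machine stand in here for separate computers on the Internet.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def simulate_work_unit(seed: int, samples: int = 10_000) -> int:
    """Hypothetical independent task: count random points that land
    inside the unit quarter circle. It needs no data from other tasks."""
    rng = random.Random(seed)  # private generator, so tasks share no state
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def run_all(seeds, samples: int = 10_000) -> float:
    """Farm the work units out to a pool and combine their results."""
    seeds = list(seeds)
    with ThreadPoolExecutor(max_workers=4) as pool:
        hit_counts = list(
            pool.map(lambda seed: simulate_work_unit(seed, samples), seeds)
        )
    # The single communication step: gather the counts into one estimate of pi.
    return 4.0 * sum(hit_counts) / (samples * len(seeds))


estimate = run_all(range(8))
print(f"Combined estimate of pi from 8 independent work units: {estimate:.4f}")
```

Because no work unit waits on another, adding more workers scales the throughput almost linearly, which is why such a project can reach exaflop rates without a supercomputer's high-speed interconnect.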

But if you stick with the usual benchmark, the one used for the Top500 ranking, no supercomputer yet qualifies as an exascale machine. Frontier may be the first. Or well, it’s on track to be the first known exascale supercomputer, says Dongarra. He explains that before the June 2021 Top500 ranking came out, a rumor emerged that China has at least one, if not two, supercomputers already running at exascale.

Why would Chinese engineers construct such a machine without telling anyone about it? At the time, Dongarra says, he thought that maybe they were waiting for the 100th anniversary of the founding of the Chinese Communist Party. But that date came and went in July. He now speculates that Chinese officials may be worried that making the machine’s existence public would exacerbate geopolitical rivalries and cause the United States to restrict the export of certain technologies to China.

Perhaps that explains it. But it’s going to be increasingly difficult for Chinese researchers not to let this cat, if it truly exists, out of the bag. For the moment, anyway, with only rumors to go on, this exascale rival to Frontier is a Schrödinger’s cat—both here and not here at the same time.

This article appears in the January 2022 print issue as “The Exascale Era Is Upon Us.”

The Conversation (1)
James Brady, 30 Dec, 2021
LF

This article is a little short on how Frontier is put together. Nothing on the communication links: their bandwidth, protocol cost, latency. Without that information it is hard to tell what applications Frontier is good for, and I am more interested in what can be done with it than how much it is heating eastern Tennessee.

The World’s Largest Camera Is Nearly Complete

The future heart of the Vera C. Rubin Observatory will soon make its way to Chile

The LSST camera, eventually bound for the Vera C. Rubin Observatory in Chile, sits on its stand in a Bay Area clean room.

Jacqueline Ramseyer Orrell/SLAC National Accelerator Laboratory

The world’s largest camera sits within a nondescript industrial building in the hills above San Francisco Bay.

If all goes well, this camera will one day fit into the heart of the future Vera C. Rubin Observatory in Chile. For the last seven years, engineers have been crafting the camera in a clean room at the SLAC National Accelerator Laboratory in Menlo Park, Calif. In May 2023, if all goes according to plan, the camera will finally fly to its destination, itself currently under construction in the desert highlands of northern Chile.

Building a camera as complex as this requires a good deal of patience, testing, and careful engineering. The road to that flight has been long, and there’s still some way to go before the end is in sight.

Lab Revisits the Task of Putting Common Sense in AI

New nonprofit Basis hopes to model human reasoning to inform science and public policy

iStock

The field of artificial intelligence has embraced deep learning—in which algorithms find patterns in big data sets—after moving on from earlier systems that more explicitly modeled human reasoning. But deep learning has its flaws: AI models often show a lack of common sense, for example. A new nonprofit, Basis, hopes to build software tools that advance the earlier method of modeling human reasoning, and then apply that method toward pressing problems in scientific discovery and public policy.

To date, Basis has received a government grant and a donation of a few million dollars. Advisors include Rui Costa, a neuroscientist who heads the Allen Institute in Seattle, and Anthony Philippakis, the chief data officer of the Broad Institute in Cambridge, Mass. In July, over tacos at the International Conference on Machine Learning, I spoke with Zenna Tavares, a Basis cofounder, and Sam Witty, a Basis research scientist, about human intelligence, problems with academia, and trash collection. The following transcript has been edited for brevity and clarity.

How a Dual Curing Adhesive Works

UV22DC80-1 is an abrasion-resistant adhesive system that meets NASA low outgassing specs

Master Bond's UV22DC80-1 is a one component, nanosilica filled, dual cure system with UV and heat curing mechanisms.

Master Bond

This sponsored article is brought to you by Master Bond.

Master Bond UV22DC80-1 is a nanosilica filled, dual cure epoxy based system. Nanosilica filled epoxy formulations are designed to further improve performance and processing properties.

The specific filler will play a crucial role in determining key parameters such as viscosity, flow, aging characteristics, strength, shrinkage, hardness, and exotherm. As a dual curing system, UV22DC80-1 cures readily upon exposure to UV light, and will cross link in shadowed out areas when heat is added.
