Will China Attain Exascale Supercomputing in 2020?

The U.S., China, Japan, and the EU are all striving to reach the next big milestone in supercomputing, but only China has claimed it will do so this year


3 min read
Photo-illustration depicting four country flags as martial artists.
Photo-Illustration: Edmon de Haro

To the supercomputer world, what separates “peta" from “exa" is more than just three orders of magnitude.

As measured in floating-point operations per second (a.k.a. FLOPS), one petaflop (1015 FLOPS) falls in the middle of what might be called commodity high-performance computing (HPC). In this domain, hardware is hardware, and what matters most is increasing processing speed as cost-effectively as possible.

Now the United States, China, Japan, and the European Union are all striving to reach the exaflop (1018 ) scale. The Chinese have claimed they will hit that mark in 2020. But they haven't said so lately: Attempts to contact officials at the National Supercomputer Center, in Guangzhou; Tsinghua University, in Beijing; and Xi'an Jiaotong University yielded either no response or no comment.

It's a fine question of when exactly the exascale barrier is deemed to have been broken—when a computer's theoretical peak performance exceeds 1 exaflop or when its maximum real-world compute speed hits that mark. Indeed, the sheer volume of compute power is less important than it used to be.

“Now it's more about customization, special-purpose systems," says Bob Sorensen, vice president of research and technology with the HPC consulting firm Hyperion Research. “We're starting to see almost a trend towards specialization of HPC hardware, as opposed to a drive towards a one-size-fits-all commodity" approach.

The United States' exascale computing efforts, involving three separate machines, total US $1.8 billion for the hardware alone, says Jack Dongarra, a professor of electrical engineering and computer science at the University of Tennessee. He says exascale algorithms and applications may cost another $1.8 billion to develop.

And as for the electric bill, it's still unclear exactly how many megawatts one of these machines might gulp down. One recent ballpark estimate puts the power consumption of a projected Chinese exaflop system at 65 megawatts. If the machine ran continuously for one year, the electricity bill alone would come to about $60 million.

Dongarra says he's skeptical that any system, in China or anywhere else, will achieve one sustained exaflop anytime before 2021, or possibly even 2022. In the United States, he says, two exascale machines will be used for public research and development, including seismic analysis, weather and climate modeling, and AI research. The third will be reserved for national-security research, such as simulating nuclear weapons.

“The first one that'll be deployed will be at Argonne [National Laboratory, near Chicago], an open-science lab. That goes by the name Aurora or, sometimes, A21," Dongarra says. It will have Intel processors, with Cray developing the interconnecting fabric between the more than 200 cabinets projected to house the supercomputer. A21's architecture will reportedly include Intel's Optane memory modules, which represent a hybrid of DRAM and flash memory. Peak capacity for the machine should reach 1 exaflop when it's deployed in 2021.

The other U.S. open-science machine, at Oak Ridge National Laboratory, in Tennessee, will be called Frontier and is projected to launch later in 2021 with a peak capacity in the neighborhood of 1.5 exaflops. Its AMD processors will be dispersed in more than 100 cabinets, with four graphics processing units for each CPU.

The third, El Capitan, will be operated out of Lawrence Livermore National Laboratory, in California. Its peak capacity is also projected to come in at 1.5 exaflops. Launching sometime in 2022, El Capitan will be restricted to users in the national security field.

China's three announced exascale projects, Dongarra says, also each have their own configurations and hardware. In part because of President Trump's China trade war, China will be developing its own processors and high-speed interconnects.

“China is very aggressive in high-performance computing," Dongarra notes. “Back in 2001, the Top 500 list had no Chinese machines. Today they're dominant." As of June 2019, China had 219 of the world's 500 fastest supercomputers, whereas the United States had 116. (Tally together the number of petaflops in each machine and the numbers come out a little different. In terms of performance, the United States has 38 percent of the world's HPC resources, whereas China has 30 percent.)

China's three exascale systems are all built around CPUs manufactured in China. They are to be based at the National University of Defense Technology, using a yet-to-be-announced CPU; the National Research Center of Parallel Computer Engineering and Technology, using a nonaccelerated ARM-based CPU; and the Chinese HPC company Sugon, using an AMD-licensed x 86 with accelerators from the Chinese company HyGon.

Japan's future exascale machine, Fugaku, is being jointly developed by Fujitsu and Riken, using ARM architecture. And not to be left out, the EU also has exascale projects in the works, the most interesting of which centers on a European processor initiative, which Dongarra speculates may use the open-source RISC-V architecture.

All four of the major players—China, the United States, Japan, and the EU—have gone all-in on building out their own CPU and accelerator technologies, Sorensen says. “It's a rebirth of interesting architectures," he says. “There's lots of innovation out there."

The Conversation (0)

New AI Speeds Computer Graphics by Up to 5x

Neural rendering harnesses machine learning to paint pixels

5 min read
Four examples of Nvidia's Instant NeRF 2D-to-3D machine learning model placed side-by-side.

Nvidia Instant NeRF uses neural rendering to generate 3D visuals from 2D images.

NVIDIA

On 20 September, Nvidia’s Vice President of Applied Deep Learning, Bryan Cantanzaro, went to Twitter with a bold claim: In certain GPU-heavy games, like the classic first-person platformer Portal, seven out of eight pixels on the screen are generated by a new machine-learning algorithm. That’s enough, he said, to accelerate rendering by up to 5x.

This impressive feat is currently limited to a few dozen 3D games, but it’s a hint at the gains neural rendering will soon deliver. The technique will unlock new potential in everyday consumer electronics.

Keep Reading ↓Show less

Golf Robot Learns To Putt Like A Pro

Watch out Tiger Woods, Golfi has a mean short game

4 min read
Golf Robot Learns To Putt Like A Pro

While being able to drive the ball 300 yards might get the fans excited, a solid putting game is often what separates a golf champion from the journeymen. A robot built by German researchers is quickly becoming a master of this short game using a clever combination of classical control engineering and machine learning.

In golf tournaments, players often scout out the greens the day beforehand to think through how they are going to play their shots, says Annika Junker, a doctoral student at Paderborn University in Germany. So she and her colleagues decided to see if giving a robot similar capabilities could help it to sink a putt from anywhere on the green, without assistance from a human.

Keep Reading ↓Show less

Learn How Global Configuration Management and IBM CLM Work Together

In this presentation we will build the case for component-based requirements management

2 min read

This is a sponsored article brought to you by 321 Gang.

To fully support Requirements Management (RM) best practices, a tool needs to support traceability, versioning, reuse, and Product Line Engineering (PLE). This is especially true when designing large complex systems or systems that follow standards and regulations. Most modern requirement tools do a decent job of capturing requirements and related metadata. Some tools also support rudimentary mechanisms for baselining and traceability capabilities (“linking” requirements). The earlier versions of IBM DOORS Next supported a rich configurable traceability and even a rudimentary form of reuse. DOORS Next became a complete solution for managing requirements a few years ago when IBM invented and implemented Global Configuration Management (GCM) as part of its Engineering Lifecycle Management (ELM, formerly known as Collaborative Lifecycle Management or simply CLM) suite of integrated tools. On the surface, it seems that GCM just provides versioning capability, but it is so much more than that. GCM arms product/system development organizations with support for advanced requirement reuse, traceability that supports versioning, release management and variant management. It is also possible to manage collections of related Application Lifecycle Management (ALM) and Systems Engineering artifacts in a single configuration.

Keep Reading ↓Show less