Over the last three decades, we have witnessed an enormous increase in the processing power of supercomputers, with gains in speed of roughly three orders of magnitude every 10 years. From gigaflops in the mid-1980s, computers reached teraflop speeds in the mid-1990s, and petaflop speeds at the end of the first decade of this millennium. The next logical step, an exaflop computer, is still quite a distance away, as evidenced by the fact that China’s NUDT Tianhe-2 supercomputer cranks out 33.86 petaflops.
The Tianhe-2 runs on 24 megawatts of electricity, and extrapolations show that an exaflop computer built with today’s technology would consume the entire output of a power station, clearly an option that is not sustainable.
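The extrapolation can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: it takes the two Tianhe-2 figures quoted above and assumes, as a deliberate simplification, that power draw scales linearly with performance. The function and constant names are hypothetical, chosen for this example.

```python
# Rough linear extrapolation from Tianhe-2 to a 1-exaflop machine.
# Assumption (not from the article): power scales linearly with performance.

TIANHE2_PETAFLOPS = 33.86      # performance quoted in the article
TIANHE2_MEGAWATTS = 24.0       # power draw quoted in the article
PETAFLOPS_PER_EXAFLOP = 1000.0


def extrapolated_power_mw(target_petaflops,
                          ref_petaflops=TIANHE2_PETAFLOPS,
                          ref_megawatts=TIANHE2_MEGAWATTS):
    """Scale the reference machine's power draw to the target performance."""
    return ref_megawatts * target_petaflops / ref_petaflops


exaflop_mw = extrapolated_power_mw(PETAFLOPS_PER_EXAFLOP)
print(f"Estimated draw for 1 exaflop: {exaflop_mw:.0f} MW")
```

Under this assumption the estimate comes out at roughly 709 megawatts, which is on the scale of a sizable power station. Real machines would not scale this simply, so the number is best read as an order-of-magnitude indication.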
Jesus Carretero, a researcher in computer architecture at the Polytechnical School of the Charles III University of Madrid, is the coordinator of NESUS (Network for Sustainable Ultrascale Computing), a research network focusing on the next generation of ultrascale computing systems, which include not only parallel but also distributed systems. Funded by the European Union, the four-year NESUS project started this past March with contributions from researchers from 29 countries. The aim of the research network is to investigate how the European research community will be able to deal with the challenges of ultrascale computing, not only in terms of energy, but also in terms of the hardware and software challenges that will arise with these large systems, explains Carretero. "I know that most people, when they talk about sustainability, they talk about energy. Energy is a big problem, but it is only part of the problem."
Six working groups are exploring several aspects of large-scale computing, ranging from programming to computer hardware and dealing with both high-performance computing (HPC) and distributed computing. Staying on top of new developments is also important. "One of our working groups deals with technology surveillance and continuously feeds information about new systems into our network, which allows us to incorporate new technologies fast," says Carretero.
He mentions two specific developments: general-purpose graphics processing units (GP-GPUs) and memristors. The extremely fast GP-GPUs will be used in a large number of applications, such as adaptive optics in the European Extremely Large Telescope, but will require new programming models. Memristors will be nonvolatile, compact, energy-saving replacements for disk drives. "These huge nonvolatile memories, several terabytes, will be important for bigger computers, and they will be available in three or four years. Therefore we are following that technology," says Carretero.
Large future systems will also need new approaches to software. "Basically here we have two goals. One goal is enhancing the system software stack to increase the scalability and sustainability of the systems. The other is an effort to port applications to these new architectures as fast as possible," says Carretero.
"Because these systems are very large, we are also looking for some automatic programming systems and new programming paradigms to make this easier," he adds.
"Our action has become very popular in the scientific community," reports Carretero. The research network now covers 45 countries with around 200 scientists.
Carretero is less optimistic about a quick ramping up to the exaflop computer; the required energy and scale, he says, are still problems that need to be worked out. "I don't know who will be willing to pay for those machines," says Carretero. "We are probably bound to that technology in the mid-term, but in the meantime we should find a way of getting computer power like this by 'federating' several parallel supercomputers."
Currently, European researchers looking to take advantage of distributed systems have only the Internet as a connector, says Carretero. This has degraded the performance of some very-high-throughput projects. Asked whether there are plans for a network comparable to the Backbone Network in the United States, Carretero says, "For research purposes there is a good example in the GEANT network, which provides very high bandwidth. But right now, it is difficult to extend this at a global scale. I don't see movements in Europe as compared to the U.S. because it requires large investments and agreement between countries."