As Moore’s Law slows for CPUs, dedicated graphics co-processors are picking up some of the slack. Just as GPUs are changing the game in deep learning and autonomous cars, the GPU-powered desktop PC might even begin to keep pace with the conventional supercomputer for a portion of supercomputer applications.

For instance, Russian scientists report this month that they’ve been able to solve computational problems in nuclear physics using an off-the-shelf, high-end PC containing a GPU. And, they say, after fine-tuning their algorithm for GPUs, they were able to run their calculations faster than the traditional, CPU-powered supercomputer their colleagues use. Bonus: they ran those calculations for free, whereas their colleagues must pay for access to the supercomputer to run their computations.

“I have no doubts that many research groups over the world can reach the similar results in their own fields such as geophysics, seismology, plasma physics, [medical] diagnostics, etc.” says Vladimir Kukulin, professor of theoretical physics at Lomonosov Moscow State University. “But only by combining two above ingredients: Reformulation of the whole problem, and then by inventing some effective way how to parallelize the whole execution in thousands or even millions of independent threads.”

The problem Kukulin’s group was tackling involved the extensive calculations needed to describe scattering problems in their field — such as when one nucleon collides with a particle or another nucleon and produces a spray of particles and daughter nuclei as a result. This nuclear many-body problem, Kukulin says, could require calculations involving matrices containing millions of elements.

Matrix algebra with this many moving parts can stymie even a supercomputer. But, Kukulin says, his group also realized that calculations with giant matrices can be broken into independent threads of instructions that run in parallel with many other similar threads. The parallelizability of the group’s nuclear calculations, in other words, made them a prime candidate for running efficiently on a GPU.
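To make the idea of independent threads concrete, here is a minimal sketch in plain Python. It is not the group's algorithm, only an illustration under simple assumptions: in a matrix-vector product, each output element depends on a single row of the matrix, so the rows can be processed in any order (or all at once) without changing the answer. The function names `row_dot` and `matvec_any_order` are hypothetical, chosen for this example.

```python
import random


def row_dot(row, vec):
    """One independent unit of work: a single matrix row times the vector."""
    return sum(a * b for a, b in zip(row, vec))


def matvec_any_order(matrix, vec, seed=0):
    """Compute the matrix-vector product, visiting rows in a shuffled order.

    The shuffle stands in for a GPU scheduling threads in no fixed order.
    Because no row needs another row's result, the order cannot matter.
    """
    order = list(range(len(matrix)))
    random.Random(seed).shuffle(order)
    result = [0.0] * len(matrix)
    for i in order:
        result[i] = row_dot(matrix[i], vec)
    return result


matrix = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
vec = [10.0, 1.0]
print(matvec_any_order(matrix, vec))  # [12.0, 34.0, 56.0]
```

On a GPU, each call to `row_dot` would correspond roughly to one thread. A matrix with millions of elements supplies thousands of such units of work.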

The GPU, originally designed to handle matrix-heavy calculations needed to generate real-time graphics, is finding unexpected applications in a number of fields today, including Bitcoin mining, molecular modeling, and the applications noted above. Kukulin says the list of computational tasks the GPU can handle, a list that now includes nuclear physics, will only increase.

Overall, he says, the kinds of problems that could lend themselves to GPU supercomputing on the cheap are those whose individual elements do not depend on one another. That is because interdependence means that individual elements (i.e., threads in a GPU’s calculations) would have to go through regular if-then logic gates, checking for each element’s ongoing influence on other elements of the calculation.

And such conditional logic steps are probably going to involve the CPU, which slows an otherwise streamlined calculation down. Instead, to maximize the speedup a GPU can produce, he says it’s best to find a way of expressing one’s problem — or find an approximation that enables the expression — as a system containing many discrete and unconnected elements.

“You should write your problem into a form that allows you to massively parallelize,” he says. “It’s necessary to avoid somehow any conditional.”
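One common way to avoid a per-element conditional, shown here as a hedged sketch rather than anything taken from Kukulin's code, is to replace the if-then step with arithmetic that every element executes identically. The example assumes the flags are exactly 0 or 1 and the values are finite numbers. Both function names are hypothetical.

```python
def select_branching(flags, a, b):
    """Pick a[i] where flags[i] is 1, else b[i], using an explicit if-then."""
    out = []
    for f, x, y in zip(flags, a, b):
        if f:
            out.append(x)
        else:
            out.append(y)
    return out


def select_branch_free(flags, a, b):
    """Same selection with no conditional: every element runs the same arithmetic.

    Assumes each flag is exactly 0 or 1 and all values are finite.
    """
    return [f * x + (1 - f) * y for f, x, y in zip(flags, a, b)]


flags = [1, 0, 1, 0]
a = [1.5, 2.5, 3.5, 4.5]
b = [10.0, 20.0, 30.0, 40.0]
print(select_branch_free(flags, a, b))  # [1.5, 20.0, 3.5, 40.0]
```

The branch-free version does slightly more arithmetic per element, but every element follows the same instruction sequence, which is the property a massively parallel formulation relies on.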

So complicated simulations, in which every component’s interaction depends on other components and their trajectories in time, would be difficult to translate into a GPU-ready problem. By contrast, he says, tsunami early warning systems, which must predict where the waves will make landfall faster than real time, have sped up when run on GPUs as opposed to traditional CPU supercomputers.

Kukulin says 3D ultrasound imaging, a compute-intensive medical diagnostic tool, could be more widely embraced if medical offices needed only a few thousand dollars for a desktop PC as opposed to many thousands of dollars for access to a supercomputer.

Ultimately, then, fitting a problem to the GPU requires not just programming finesse but also expertise in the field of application, to find a formulation of the problem that a GPU can actually speed up.

“This is some art, but not [just] in programming,” Kukulin says.

Kukulin’s group’s research was published in the July issue of the journal *Computer Physics Communications* (and is also available on the arXiv preprint server).

Mark Anderson is the news manager at *IEEE Spectrum*. He has a bachelor's degree in physics and a master's degree in astrophysics.