
How To Kill A Supercomputer: Dirty Power, Cosmic Rays, and Bad Solder

Will future exascale supercomputers be able to withstand the steady onslaught of routine faults?

11 min read
Illustration: Shaw Nielsen

As a child, were you ever afraid that a monster lurking in your bedroom would leap out of the dark and get you? My job at Oak Ridge National Laboratory is to worry about a similar monster, hiding in the steel cabinets of the supercomputers and threatening to crash the largest computing machines on the planet. 

The monster is something supercomputer specialists call resilience, or rather the lack of it. It has bitten several supercomputers in the past. A high-profile example affected what was the second-fastest supercomputer in the world in 2002, a machine called ASCI Q at Los Alamos National Laboratory. When it was first installed at the New Mexico lab, this computer couldn’t run for more than an hour or so without crashing.


Medal of Honor Goes to Microsensor and Systems Pioneer

The UCLA professor developed aerospace and automotive safety systems

3 min read
UCLA Samueli School of Engineering

IEEE Life Fellow Asad M. Madni is the recipient of this year’s IEEE Medal of Honor. He is being recognized “for pioneering contributions to the development and commercialization of innovative sensing and systems technologies, and for distinguished research leadership.”


Video Friday: An Agile Year

Your weekly selection of awesome robot videos

3 min read

Video Friday is your weekly selection of awesome robotics videos, collected by your friends at IEEE Spectrum robotics. We’ll also be posting a weekly calendar of upcoming robotics events for the next few months; here's what we have so far (send us your events!):

ICRA 2022: 23–27 May 2022, Philadelphia
ERF 2022: 28–30 June 2022, Rotterdam, Netherlands
CLAWAR 2022: 12–14 September 2022, Açores, Portugal

Let us know if you have suggestions for next week, and enjoy today's videos.

