ISS Repair Space Walk: A Glimpse Into the Station's Future
NASA is changing the way it handles hardware problems
6 August 2010—The dramatic emergency-repair space walks assigned to astronauts aboard the International Space Station (ISS) signify much more than the repair itself. The astronauts are the first to employ an entirely new mode of spacecraft maintenance. Previous approaches to keeping the 380-metric-ton orbital outpost functional are being retired, along with the United States’ space shuttle fleet. Astronauts should expect this new emergency-repair scenario for the remainder of the station’s lifetime, which could be decades.
From now on, urgent repairs will be performed entirely by broadly trained space-station crews, not by specialized teams on brief shuttle visits as was previously done. These crews will use stocks of spare parts left inside and outside the station by the final visiting shuttles. These resources are being sent up based on a careful analysis of the “mean time between failure” (MTBF) of the spacecraft’s components, which are designed to last for years in space.
The precipitating event for today’s space walk occurred on the evening of 31 July, when a slew of error messages announced the malfunction of a coolant loop pump module on the outside of the station. Without the heat-removal services of the pump, the redundant second-pump system was unable to transfer all the waste heat from the station’s electrical power usage to exterior radiators that dump the heat into deep space. The result was a prescripted power-down by the astronauts and the installation of a set of jumper cables to reroute some power around the downed equipment. Contrary to some press reports, there was nothing wrong with the electrical power generation system itself.
The actual failure had long been anticipated, based purely on the expected lifetime of the pumps, and four spare pump modules had already been laid in reserve aboard the station. In fact, this failure was one of 14 specific problems that analysis had already indicated would require quick fixes. “The criteria for tasks being added to this list,” NASA spokesman Kelly Humphries explains, “is that the failure of the function provided by the [unit] causes a situation placing the ISS in a configuration that is zero tolerant [to some further failures].” In other words, if this thing breaks, the astronauts had better hope nothing else does before they get the system fixed.
The so-called Big 14 all relate to the station’s electrical power system. (Critical life-support functions have more levels of redundancy, so they don’t have to be fixed as quickly.) Ten involve keeping the primary electrical power system running, by replacing switching units, distribution hubs, and controller assemblies for power generation, storage, and distribution units. The other four, including the one that did occur, involve keeping the station’s thermal control system working, which in the vacuum of space is the only way the ISS can dump waste heat from consumed electricity.
Such is the complexity of the ISS that failures can interact in ways too numerous to be analyzed in detail. Consequently, the plan at mission control in Houston was to wait for the first Big 14 failure and then to ratchet up its activity to respond to it. Since this failure, flight-control teams have been working at a “pedal to the metal” tempo more typical of a two-week shuttle flight than of the more relaxed “long-distance runner” pace of the multimonth ISS expeditions.
That’s not to say NASA hasn’t been preparing. Each group of specialists on the ground has ideas to try in the event of specific failures that would minimize the damage of second independent failures. The most serious such “second failure” would occur if the alternate coolant loop also went down. Since the first unit failed about 80 000 hours into a predicted MTBF of 100 000 hours, and no known single cause could induce both pumps to fail at about the same time, a second failure would require some extraordinarily bad luck. But such bad luck isn’t unknown in the history of manned spaceflight, so top priority was given to preparing for worst-case scenarios like losing a second cooling loop during the first loop’s repair space walk.
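To give a rough sense of how unlikely an independent second pump failure is, here is a minimal sketch, not NASA’s actual reliability analysis. It assumes a constant failure rate (an exponential model), under which the hours a pump has already run do not change its odds; real pumps wear out, so the true figure could differ. The 100 000-hour MTBF comes from the figures above; the seven-day exposure window and the function name are illustrative assumptions.

```python
import math


def failure_probability(window_hours: float, mtbf_hours: float) -> float:
    """Chance of at least one failure within window_hours.

    Assumes a constant failure rate (exponential model), which is a
    simplification and not the method NASA is reported to use.
    """
    return 1.0 - math.exp(-window_hours / mtbf_hours)


MTBF_HOURS = 100_000.0         # predicted MTBF quoted for the pump module
REPAIR_WINDOW_HOURS = 7 * 24   # assumption: about one week until the repair

p = failure_probability(REPAIR_WINDOW_HOURS, MTBF_HOURS)
print(f"Chance the second pump fails during the window: {p:.2%}")
```

Under these assumptions the chance works out to a fraction of 1 percent, which is consistent with calling a double failure extraordinarily bad luck while still being large enough to justify contingency planning.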
Even when such a double failure happens, there are still margins of safety. First, the Russian segment of space-station modules has its own independent power and cooling system, and a flexible air duct is available to pump some of its cooled air into the U.S. side. Second, the large volume of the U.S. segment provides enough good air for safe breathing for several days of repair work. Last, some internal equipment can still operate at intervals of 6 to 8 hours even when the pumps aren’t working, especially if astronauts use ad hoc measures, such as swinging equipment away from its wall mounts to surround it with the station’s air flow or wrapping it in cool-water bags or even ice packs.
The spacewalking repair team cannot be expected to know how to handle every possible Big 14 task, because there isn’t enough training time. So in response, NASA has overhauled both how astronauts train to handle hardware breakdowns and how the agency develops step-by-step procedures to do this. Crews are now trained for a set of generic space-walk maintenance operations. Next, specialists in each system are expected to tailor their procedures to make maximum use of the astronauts’ generic spacewalking skills to perform the specialized tasks. Once the specific repair is settled on, the orbiting astronauts can then sharpen their skills on simulators and practice panels in the space station while watching videos of fellow astronauts performing the desired operations in simulators on Earth.
The usual preparation period for a Big 14 task is two weeks. But because of a fortunate coincidence (good luck is also known to occur in space), a space walk was already on the schedule, so many of the required preparations have already been done. This allowed the two weeks to be squeezed down to about seven days. That put some pressure on workers at mission control and the crew. But in the tradition of the Apollo 13 crisis, astronauts and ground controllers have responded. And, after this first time, they will have to do so again and again in the years ahead.
This article was updated on 11 August 2010.
About the Author
James Oberg worked as an aerospace engineer at NASA for 22 years. He switched to journalism in the late 1990s and now makes his living reporting on space for such outlets as Popular Science, NBC News, and of course, IEEE Spectrum. He interviewed Wesley T. Huntress, author of NASA’s new manned spaceflight plan, for IEEE Spectrum Online in April 2010.