The Benefits of Risk
Would we be safer overall if we just accept a few deaths due to software?
Steven Cherry: Hi, this is Steven Cherry for IEEE Spectrum’s “Techwise Conversations.”
There’s an odd and terrifying notion in systems engineering called “the automation paradox.” It was tacitly in the news this month when Popular Mechanics magazine published a controversial analysis of the June 2009 crash of Air France Flight 447. Here’s what they wrote:
“Over the decades, airliners have been built with increasingly automated flight control functions. These have the potential to remove a great deal of uncertainty and danger from aviation. But they also remove important information from the attention of the flight crew. While the airplane’s avionics track crucial parameters such as location, speed, and heading, the human beings can pay attention to something else. But when trouble suddenly springs up and the computer decides that it can no longer cope—on a dark night, perhaps, in turbulence, far from land—the humans might find themselves with a very incomplete notion of what’s going on.”
Two years ago, in our December 2009 issue, a feature story entitled “Automated to Death” made this exact point. It said: “Operators are increasingly left out of the loop, at least until something unexpected happens. Then the operators need to get involved quickly and flawlessly.” The article was summed up in its subhead: “As software pilots more of our vehicles, humans can pay the ultimate price. Robert N. Charette investigates the causes and consequences of the automation paradox.”
My guest today is Robert N. Charette. He’s a 20-plus-year veteran of systems engineering, risk management, and the development of large-scale software-intensive systems. He’s also a contributing editor at Spectrum and the founding contributor of our popular blog The Risk Factor. He’s also been a frequent guest on this show. Bob, welcome back to the podcast.
Bob Charette: Well, thank you, Steven. Glad to be back.
Steven Cherry: Bob, in your 2009 article, you cited the research of psychologist Lisanne Bainbridge—and here I’m quoting from the article—“the irony, she said, is that the more advanced the automated system, the more crucial the contribution of the human operator becomes to the successful operation of the system. But the more reliable the automation, the less the human operator may be able to contribute to that success.” Did that factor into the Air France 447 crash?
Bob Charette: I think so. It’s a little hard to be absolutely sure. The French safety bureau is still going through everything; they started a group to look at the human factor issues that were involved. From what is known—and again we don’t know everything, obviously, until the report comes out—it does look like the pilots were confused by the operation of the Airbus automated flight control system, flight management system, and what was happening and didn’t recognize the state that it was in. And this is very simplified, but the computer was in one sense acting one way and the pilots were assuming it was acting a different way. So there was this irony that Airbus systems—which are very reliable—when something does happen, pilots have in the past been sometimes confused about what they were supposed to do next or how the system is actually operating. In this particular case, what was interesting was that the particular error that caused the system to move from autopilot to manual control hadn’t been experienced before—in millions of hours from what I understand of flight operations, so the pilots were encountering something that maybe no one had in that particular set of circumstances.
Steven Cherry: Bob, there were a couple of incidents in Australia from a few years ago that are in the news right now because the Australian Transportation Safety Bureau just released some findings. Did the automation paradox play a role in them?
Bob Charette: I think they did, Steve. There was an incident in 2008 involving a Qantas flight from Singapore to Perth that was forced to make an emergency landing in Learmonth airbase in Western Australia. And what happened was that there was a problem with the Air Data Inertial Reference Unit—ADIRU as it’s called—and what happened is that this very unexpected error caused the aircraft to climb unexpectedly and then lose altitude, and 110 of the 300 plus passengers and 9 of the crew were injured, and almost a dozen were seriously injured. And what happened was that the pilots didn’t understand what was going on and there was an error—which the Transportation Safety Board still isn’t quite exactly sure what caused it—which caused the computer system to malfunction. And I’m using malfunction in kind of a very narrow term because what they determined was that it didn’t appear to be a software problem, it didn’t appear to be an obvious hardware problem. They think it actually may be related to something called a SEE, or single event effect, which might even be a cosmic ray going through the circuitry, which caused some kind of spike to occur which they were able to see. But all in all, what happened is that the pilots didn’t know what was going on, had to take control of the aircraft, and as I said, a number of people were injured. And then there was another incident, another report that just came out this week by the Australian Transportation Safety Board which talked about some user errors in a United Arab Emirates Airbus flight taking off from Melbourne in 2009 going on to Dubai, and where the copilot entered the wrong weight of the aircraft. The error wasn’t caught, and they started to proceed to take off, and the computer system, having entered the wrong weight, didn’t react the way it should.
And luckily they were able to gain control and take off, but not before the tail struck the runway three times and took out some lights at the end of the runway, and it could have been a tragedy. So in each one of these cases, the computers acted in ways that the pilots didn’t recognize right away and were caught in this paradox and irony of having to jump in when it didn’t appear that there was any need to jump in.
Steven Cherry: Bob, we’ve been focusing on air travel, but you had a second feature article in 2009, and it was entitled “This Car Runs on Code.” And you wrote, “It takes dozens of microprocessors running 100 million lines of code to get a premium car out of the driveway, and this software is only going to get more complex.” Can we expect to see the automation paradox enter the automotive world as well—or maybe it already has?
Bob Charette: Again, anecdotally, it appears that it already has. We see it in some sense in the systems that are onboard like GPS systems, where people believe the GPS system telling them to turn left, turn right, or go in a particular direction. A number of years ago, a German motorist drove his BMW straight into the Havel River in eastern Germany. He told the police that the navigation system told him that he could cross the river. Unfortunately, the navigation system didn’t tell him that he needed to have a ferry.
Steven Cherry: [laughs] So that really is the automation paradox, because if the GPS system were less good, then the driver wouldn’t have had so much confidence in it.
Bob Charette: Right, and in fact what’s happened both in New York state and more so in the U.K. this past year, is that councils in the U.K. are basically getting control over how GPS displays the maps in their local communities, because in a number of cases drivers, especially lorry drivers, are driving onto roads that are not wide enough or smack into bridges that are too small.
Steven Cherry: Bob, you mentioned in the Air France case how rare that failure mode was. And in the Australian incidents as well, in your blog post you quoted the ATSB as saying, “The failure mode was probably initiated by a single rare type of internal or external trigger event combined with a marginal susceptibility to that type of event within the hardware component. There were only three known occasions of the failure mode in over 128 million hours of unit operation.” Maybe we should just accept the automation paradox, and you know, do our darndest to drive down the failure rate as low as we can. But overall, aren’t we a lot safer with software flying our planes than pilots?
Bob Charette: Oh, I think we’re safer with software assisting pilots. I think that it’s going to be interesting as we move toward a totally automated situation, where we may have commercial airlines flying without pilots. That’s within the realm of possibility, and Spectrum Editor Phil Ross wrote about this just recently. Again, I think one of the issues is whether or not from a design standpoint, or systems engineering standpoint, is a system—for instance, is an aircraft safer if you design it with the expectation that there will be no pilot, versus having a system that has to interact with a pilot? For instance, the companies like Rockwell Collins are developing a panic button for a pilot so that when the aircraft gets in trouble, you can hit this, I guess, a big red button, which will take over the aircraft and try to put it into the right, stable configuration. In fact, in the Air France case, it’s looking more and more like the aircraft actually was safe and that the actions of the pilots actually put that aircraft into danger; they didn’t recognize that the aircraft was okay. So it’s an interesting area, and I think that like all things, whether it’s the automated aircraft or fully automated cars, you know, I think the time’s coming; it’s just going to be whether or not the populace accepts it. And also a big part of it’s our expectation—I think that maybe we expect systems that are fully automated or highly automated never to crash, and I think that may just be an unreasonable expectation.
Steven Cherry: Yeah, it seems like there’s that and then there’s also the way in which people treat human risks and automation risks differently, I guess.
Bob Charette: Well, I think that’s a perception thing. I think, again, we may accept risks because we think we’re in control, and we’re all a little bit leery of turning total control over to third parties, especially computers. And especially if you’ve been on your laptop or desktop recently and you’ve had the blue screen of death or some other computer foul-up, which you say, “Well, what’s gonna happen?” But I think in systems we’re getting better safety, better reliability, but I don’t think we will ever have perfect systems, because we just can’t model every situation and every particular error. It’s a matter of cost-benefit risk/safety trade-offs and how much we want to pay for that safety.
Steven Cherry: Well, but I guess part of the problem is just getting people to make a rational decision here. I mean, if we could reduce 43 000 automobile deaths each year just in the U.S. down to, you know, 20 000 or 10 000, still, you might get people saying, “Well, even though the highways would be safer overall with the cars doing the driving, I would be safer with me doing the driving”—and there’s this odd fact of something like 85 percent of all drivers think they’re especially good drivers.
Bob Charette: Well, that’s the Lake Wobegon effect, right?—is that everybody’s above average. Again, I think that it becomes a slow evolution. What people forget is that—I think it was Union Pacific who was proudly touting that only one person a day was being killed in train accidents around 1900, where before it was, I guess, 10 or 12 people a day. So it’s just our expectations of safety and what we want to pay, and how society—you know, what it values and what it doesn’t.
Steven Cherry: Very good. Well, you know, Bob I’ve heard you describe yourself as a “risk ecologist” who investigates the impact of the changing concept of risk on technology and societal development. You know, it seems society’s going to have to change its understanding of risk a lot faster than it wants to.
Bob Charette: Well, we’re in a more complex set of situations; we have a lot more interfaces between systems, and where there’s interfaces, there’s risk and complexity as we just don’t have as much understanding as we did before, and we have to in some sense put a little bit of faith in the engineers and scientists who are doing these things. But at the same time, I think we also need to ask the questions whether or not something is safe enough, and we shouldn’t take it as rote in any case that you can’t question these things. There’s always room for questioning how good something is, and as engineers and scientists we need to be open about that as well.
Steven Cherry: Very good. Well, if we’re ever going to catch up to the way automation is changing our lives, it’s going to come about through the efforts of you and your fellow risk ecologists. So thanks for being one, and thanks for joining us.
Bob Charette: It’s been a pleasure, Steven, and have a good holiday.
Steven Cherry: We’ve been speaking with risk management expert and software engineer Bob Charette about the risks and benefits of our increasing dependence on software.
This is the last show to be recorded this year, so I’d like to take a moment to thank everyone who’s contributed to it: Ariel Bleicher, who produced a number of shows in 2010 and 2011, first as our journalism intern and then as a freelancer, including our award-winning show about Google Ngram; Barbara Finkelstein, who produced four shows in the past couple of months, including what I fully expect is our next award winner, about the future of work; Sharon Basco, our radio consultant, whose gentle encouragement and fierce expertise have helped me refine the show’s format over the past couple of years; Joe Levine and Michele Kogon, our adroit and stalwart copy editors; Randi Silberman Klett, who every week astonishes me by defying the conventional wisdom that you have to settle for two out of three when it comes to quality, cost, and speed; and most of all, Francesco Ferorelli, our audio engineer, who makes us all sound better than we are and does it with a ready smile and a grace beyond his years. Wishing you a happy end of 2011 and an even better 2012, for IEEE Spectrum’s “Techwise Conversations,” I’m Steven Cherry.
Announcer: “Techwise Conversations” is sponsored by National Instruments.
This interview was recorded 20 December 2011.
Audio Engineer: Francesco Ferorelli
NOTE: Transcripts are created for the convenience of our readers and listeners and may not perfectly match their associated interviews and narratives. The authoritative record of IEEE Spectrum’s audio programming is the audio version.