Learning From Software Failure
This issue of IEEE Spectrum contains a special report about custom-made enterprise software and its many spectacular failures—the kind that bankrupt companies and cost governments and whole industries tens of billions of dollars a year. We put a man on the moon. So why can’t we make software that works?
Companies and governments undertake these customized IT ventures to make themselves run more efficiently and more effectively. Some of these projects are huge and extremely complex. It’s now common to see multibillion-dollar efforts that take years or even decades to complete. And when that software is good, it can transform entire organizations, as companies like Wal-Mart and Dell Computer have shown.
But when it’s bad, it’s horrid. Nobody really knows how much is wasted. In the United States, a conservative estimate is US $60 billion to $75 billion every year. Just how many ventures fail outright (meaning that they’re either canceled partway through or abandoned shortly after completion) is controversial. For large projects, it’s probably in the 15 to 20 percent range. And then you have lots of other code that’s delivered late or way over budget.
What’s the problem with custom enterprise software? In “Why Software Fails,” risk-management expert Robert N. Charette, an IEEE member, tackles this question. Basically, far too much of it doesn’t work very well, for reasons that are well understood and preventable: bad or nonexistent process documentation, impossible-to-meet requirements, poor and ever-changing specifications, quality-control issues, and perhaps the biggest problem of all—people. The clients who can’t figure out what they really want or need, the vendors who can’t or won’t rein them in, the managers who see scope creep tearing through a project but look the other way.
One of the best-publicized software failures is the subject of “Who Killed the Virtual Case File?” Senior Associate Editor Harry Goldstein investigates the trail of missteps that brought down the FBI’s Virtual Case File system. This custom software was supposed to automate the bureau’s paper-based work environment, allowing agents to share investigative information via a computer network. Instead, the FBI claims that the Virtual Case File’s contractor, Science Applications International Corp., delivered code so bug-ridden that the bureau scrapped the $170 million project earlier this year. But various government and independent reports show that the FBI shares a good part of the blame for the failure.
As the FBI gears up to spend hundreds of millions more on software during the next several years, questions remain as to how the Virtual Case File project went so wrong and whether an even bigger failure lies ahead. Despite a good deal of attention in the press, the inner workings of the initiative have remained largely invisible—until now. Goldstein’s interviews with many of the people directly involved reveal an effort that succumbed to the most basic mistakes of software development.
Is anything going right? In “The Exterminators,” Contributing Editor Philip E. Ross describes how UK-based Praxis High Integrity Systems is applying formal methods to software engineering and emerging with largely bug-free code. These mathematical methods work well for relatively small programs (fewer than 200 000 lines of code). Because of the added expense, they’re best suited for mission-critical systems, like air-traffic control programs, that must be bulletproof and totally reliable. Issues of scalability remain, however, and that is one of many reasons more people aren’t using such methods: 200 000 lines of code is one thing; the 200 million lines in a supply-chain management system are quite another.
Future software failures are everywhere in the making. The FBI’s replacement for the Virtual Case File system is on deck. The push to automate and digitize medical records looks like another breeding ground for fatal bugs. After you read our report, you will know why so many of these projects end in disaster—and how such costly disasters can often be avoided.
Many people helped us put this report together, but the IEEE members listed here were particularly generous with their time and their knowledge. Any fault found with these pages rests with the editors. We thank these members for their support and guidance.
Susan K. Land, software engineering section manager, Northrop Grumman IT/TASC, IEEE Senior Member
James C. McGroddy, former director of research, IBM, IEEE Life Fellow
James W. Moore, senior principal engineer, The MITRE Corp., IEEE Senior Member
Peter G. Neumann, principal scientist, SRI International Computer Science Laboratory, IEEE Life Fellow
George Spix, chief architect, Microsoft Corp., IEEE Member
Elaine J. Weyuker, technology leader, AT&T Labs—Research, IEEE Fellow
The editorial content of IEEE Spectrum magazine does not represent official positions of the IEEE or its organizational units.