Steven Cherry Hi, this is Steven Cherry for Radio Spectrum.
The coronavirus pandemic has exposed any number of weaknesses in our technologies, business models, medical systems, media, and more. Perhaps none is more exposed than what my guest today calls, “The Hidden World of Legacy IT.” If you remember last April’s infamous call for volunteer COBOL programmers by the governor of New Jersey, when his state’s unemployment and disability benefits systems needed to be updated, that turned out to be just the tip of a ubiquitous multi-trillion-dollar iceberg—yes, trillion with ‘t’—of outdated systems. Some of them are even more important to us than getting out unemployment checks—though that’s pretty important in its own right. Water treatment plants, telephone exchanges, power grids, and air traffic control are just a few of the systems controlled by antiquated code.
In 2005, Bob Charette wrote a seminal article, entitled “Why Software Fails.” Now, fifteen years later, he strikes a similar nerve with another cover story that shines a light at the vast and largely hidden problem of legacy IT. Bob is a 30-year veteran of IT consulting, a fellow IEEE Spectrumcontributing editor, and I’m happy to say my good friend as well as my guest today. He joins us by Skype.
Bob, welcome to the podcast.
Bob Charette Thank you, Steven.
Steven Cherry Bob, legacy software, like a middle child or your knees meniscus, isn’t something we think about much until there’s a problem. You note that we know more about government systems because the costs are a matter of public record. But are the problems of the corporate world just as bad?
Bob Charette Yes. There’s really not a lot of difference between what’s happening in government and what’s happening in industry. As you mentioned, government is more visible because you have auditors who are looking at failures and [are] publishing reports. But there’s been major problems in airlines and banks and insurance companies—just about every industry that has IT has a problem with legacy systems in one way or another.
Steven Cherry Bob, the numbers are staggering. In the past 10 years, at least $2.5 trillion has been spent trying to replace legacy IT systems, of which some seven hundred and twenty billion dollars was utterly wasted on failed replacement efforts. And that’s before the last of the COBOL generation retires. Just how big a problem is this?
Bob Charette That’s a really good question. The size of the problem really is unknown. We have no clear count of the number of systems that are legacy in government where we should be able to have a pretty good idea. We have really no insight into what’s happening in industry. The only thing that we that we do know is that we’re spending trillions of dollars annually in terms of operations and maintenance of these systems, and as you mentioned, we’re spending hundreds of billions per year in trying to modernize them with large numbers failing. This is this is one of the things that when I was doing the research and you try to find some authoritative number, there just isn’t any there at all.
In fact, a recent report by the Social Security Administration’s Inspector General basically said that even they could not figure out how many systems were actually legacy in Social Security. And in fact, the Social Security Administration didn’t know itself either.
Steven Cherry So some of that is record-keeping problems, some of that is secrecy, especially on the corporate side. A little bit of that might be definitional. So maybe we should step back and ask what the philosophers call the the ti esti [τὸ τί ἐστι] question—the what-is question. Does everybody agree on what legacy IT is? What counts as legacy?
Bob Charette No. And that’s another problem. What happens is there’s different definitions in different organizations and in different government agencies, even in the US government. And no one has a standard definition. About the closest that we come to is that it’s a system that does not meet the business need for some reason. Now, I want to make it very clear: The definition doesn’t say that it has to be old, or past a certain point in time Nor does it mean that it’s COBOL. There are systems that have been built and are less than 10 years old that are considered legacy because they no longer meet the business need. So the idea is, is that there’s lots of reasons why it may not meet the business needs—there may be obsolescent hardware, the software software may not be usable or feasible to be improved. There may be bugs in the system that just can’t be fixed at any reasonable cost. So there’s a lot of reasons why a system may be termed legacy, but there’s really no general definition that everybody agrees with.
Steven Cherry Bob, states like your Virginia and New York and every state in the Union keep building new roads, seemingly without a thought. A few years ago, a Bloomberg article noted that every mile of fresh new road will one day become a mile of crumbling old road that needs additional attention. Less than half of all road budgets go to maintenance. A Texas study found that the 40-year cost to maintain a $120 million highway was about $800 million. Do we see the same thing in IT? Do we keep building new systems, seemingly without a second thought that we’re going to have to maintain them?
Bob Charette Yes, and for good reason. When we build a system and it actually works, it works usually for a fairly long time. There’s kind of an irony and a paradox. The irony is that the longer these systems live, the harder they are to replace. Paradoxically, because they’re so important, they also don’t receive any attention in terms of spend. Typically, for every dollar that’s spent on developing a system, there’s somewhere between eight and 10 dollars that’s being spent to maintain it over its life. But very few systems actually are retired before their time. Almost every system that I know of, of any size tends to last a lot longer than what the designers ever intended.
Steven Cherry The Bloomberg article noted that disproportionate spending by states on road expansion, at the expense of regular repair—again, less than half of state road budgets are spent on maintenance—has left many roads in poor condition. IT spent a lot of money on maintenance, but a GAO study found that a big part of IT budgets are for operations and maintenance at the expense of modernization or replacement. And in fact, that ratio is getting higher, that less and less money is available for upgrades.
Bob Charette Well, there’s two factors at play. One is, it’s easier to build new systems, so there’s money to build new systems, and that’s what we we constantly do. So we’re building new IT systems over time, which has again, proliferated the number of systems that we need to maintain. So as we build more systems, we’re going to eat up more of our funding so that when it comes time to actually modernize these, there’s less money available. The other aspect is, as we build these systems, we don’t build them a standalone systems. These systems are interconnected with others. And so when you interconnect lots of different systems, you’re not maintaining just an individual s— you’re maintaining this system of systems. And that becomes more costly. Because the systems are interconnected, and because they are very costly to replace, we tend to hold onto these systems longer. And so what happens is that the more systems that you build and interconnect, the harder it is later to replace them, because the cost of replacement is huge. And the probability of failure is also huge.
Steven Cherry Finally—and I promise to get off the highway comparison after this—there seems to be a point at which roads, even when well maintained, need to be reconstructed, not just maintained and repaved. And that point for roads is typically the 40-year mark. Are we seeing something like that in IT?
Bob Charette Well, we’re starting to see a radical change. I think that one of the real changes in IT development and maintenance and support has been the notion of what’s called DevOps, this notion of having development and operations being merged into one.
Since the beginning almost of IT systems development, we’ve thought about it as kind of being in two phases. There is the development phase, and then there was a separate maintenance phase. And a maintenance phase could last anywhere from 8, 10, some systems now are 20, 30, 40 years old. The idea now is to say when you develop it, you have to think about how you’re going to support it and therefore, development and maintenance are rolled into one. It’s kind of this idea that software is never done and therefore, hopefully in the future, this problem of legacy in many ways will go away. We’ll still have to worry about at some point where you can’t really support it anymore. But we should have a lot fewer failures, at least in the operational side. And our costs hopefully will also go down.
Steven Cherry So we can have the best of intentions, but we build roads and bridges and tunnels to last for 40 or 50 years, and then seventy-five years later, we realize we still need them and will for the foreseeable future. Are we still going to need COBOL programmers in 2030? 2050? 2100?
Bob Charette Probably. There’s so much coded in COBOL. And a lot of them work extremely well. And it’s not the software so much that that is the problem. It’s the hardware obsolescence. I can easily foresee COBOL systems being around for another 30, 40, maybe even 50 years. And even that I may be underestimating the longevity of these systems. What’s true in the military, where aircraft like to be 50 to, which was supposed to have about a 20 to 25 year life, is now one hundred years old, replacing everything in the aircraft over a period of time.
There is research being done by DARPA and others to look at how to extend systems and possibly have a system be around for 100 years. And you can do that if you’re extremely clever in how you design it. And also have this idea of how I’m going to constantly upgrade and constantly repair the system and make it easy to move both the data and the hardware. And so I think, again, we’re starting to see the realization that IT, which at one time—again, systems designers were always thinking about 10 years is great, twenty years is fantastic—that maybe now that these system’s, core systems, may be around for one hundred years,.
Steven Cherry Age and complexity have another consequence: Unlike roads, there’s a cybersecurity aspect to all of this as well.
Bob Charette Yeah, that’s probably the biggest weakness that that occurs in new systems, as well as with legacy systems. Legacy systems were never really built with security in mind. And in fact, one of the common complaints even today with new systems is that security isn’t built in; it’s bolted on afterwards, which makes it extremely difficult.
I think security has really come to the fore, especially in the last two or three years where we’ve had this ... In fact last year we had over 100 government systems in the United States—local, state and federal systems—that were subject to ransomware attacks and successful ransomware attacks because the attackers focused in on legacy systems, because they were not as well maintained in terms of their security practices as well as the ability to be made secure. So I think security is going to be an ongoing issue into the foreseeable future.
Steven Cherry The distinction between development and operations brings to mind another one, and that is we think of executable software and data as very separate things. That’s the basis of computing architectures ever since John von Neumann. But legacy IT has a problem with data as well as software, doesn’t it?
Bob Charette One of the areas that we didn’t get to explore very deeply in the story, mostly because of space limitations, is is the problem of data. Data is one of the most difficult things to move from one system to another. In the story, we talked about a Navy system, a payroll system ... The Navy was trying to consolidate 55 systems into one and they use dozens of programing languages. They have multiple databases. The formats are different. How the data is accessed—what business processes, how they use the data—is different. And when you try to think about how you’re going to move all that information and make sure that the information is relevant, it’s correct. We want to make sure we don’t have dirty data. Those things all need to come to be so that when we move to a new system, the data actually is what we want. And in fact, if you take a look at the IRS, the IRS has 60-year-old systems and the reason they have 60-year-old systems is because they have 60-year-old data on millions of companies and millions of—or hundreds of millions of—taxpayers and trying to move that data to new systems and make sure that you don’t lose it and you don’t corrupt it has been a decades-long problem that they’ve been trying to solve.
Steven Cherry Making sure you don’t lose individuals or duplicate individuals across databases when you merge them.
Bob Charette One of the worst things that you can do is have not only duplicate data, but have data that actually is incorrect and then you just move that incorrect data into a new system.
Steven Cherry Well, Bob, as I said, you did it before with why software fails and you’ve done it again with this detailed investigation. Thanks for publishing “The Hidden World of Legacy IT,” and thanks for joining us today.
Bob Charette My pleasure. Steven.
We’ve been speaking with IT consultant Bob Charette about the enormous and still-growing problem of legacy IT systems.
Radio Spectrum is brought to you by IEEE Spectrum, the member magazine of the Institute of Electrical and Electronic Engineers.
For Radio Spectrum, I’m Steven Cherry.
Note: Transcripts are created for the convenience of our readers and listeners. The authoritative record of IEEE Spectrum’s audio programming is the audio version.