Airline Systems Meltdowns

Yesterday morning, United Airlines' flight and maintenance dispatch system, which operates at O'Hare in Chicago, went down for about two hours, grounding all flights worldwide. United, the world's second largest airline, said that 268 domestic and international flights were delayed and 24 domestic flights were canceled. According to reports, Unimatic, the dispatch system that shut down, dispatches flight crews, determines the weight and balance of an aircraft, relays flight plans and weather to pilots, and confirms that maintenance checks have been carried out as required.

More interesting is that United not only does not know what caused the malfunction, but also does not know why the dispatch system's back-up failed as well. Apparently a major hardware upgrade was made on the 18th of May, but it is not known whether this had anything to do with the problems experienced.

Last month, on the 27th of May, the integrated computer system of All Nippon Airways (ANA) malfunctioned, which led to the cancellation of 131 flights and affected over 70,000 passengers. The system controls reservations, boarding procedures and luggage flows for domestic flights, and delivers the information to computer terminals at airports across Japan. As in the United case above, back-up systems didn't seem to kick in properly. Problems continued into the next day, with another flight canceled and another nine delayed. ANA blamed the problem on the installation of three new computers the previous week.

You may also remember that in March of this year, US Airways had extensive trouble rolling out an integrated reservation system as well.

I am waiting for the perfect storm to occur in the US: one airline loses its dispatch system while another loses its dispatch system while the FAA loses its radar systems, all on a Friday afternoon with bad weather across the nation. Should be a sight to behold. Given the increased complexity of these types of systems, and the overall fragility of the air traffic control system, this should not be unexpected.

Update: This afternoon United Airlines said that the cause of the problem was an operational error during routine system testing. Then, during the recovery, a hardware error occurred.

Update 2: AP is reporting that an employee mistake caused the problems with the Unimatic and the back-up system.


Risk Factor

IEEE Spectrum's risk analysis blog, featuring daily news, updates and analysis on computing and IT projects, software and systems failures, successes and innovations, security threats, and more.

Robert Charette
Spotsylvania, Va.
Willie D. Jones
New York City