There was a small story in ComputerWorld over the weekend reporting that a card in a New York City Verizon peering router, which handles traffic for Verizon's DSL and FIOS services between its network and the Internet, went bad at about 1515 EDT on Friday. The failure caused an outage in parts of New York and the northeastern US for about 40 minutes before it was fixed.

What caught my eye was a paragraph in the ComputerWorld story pointing to a Twitter post by Verizon Senior Vice President Eric Rabe, in which he wrote:

"When routers have problems, they are designed to report that they are sick. Internet traffic is rerouted to adjacent routers automatically and sent around the trouble spot. In this case, that didn't happen. The router went into a hung state and did not appear to the rest of the network as though it was having problems.

That meant that some user traffic from the northeast continued to flow to the stalled router, but couldn't be processed. Presto - an outage for those users."
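The failure mode Rabe describes can be illustrated with a minimal sketch. It is not Verizon's actual mechanism; the `Router` class, the router names, and the three-second timeout are all hypothetical. It contrasts trusting a router's own status report with treating silence as failure, which is roughly the idea behind the hello and dead timers used by real routing protocols.

```python
from dataclasses import dataclass


@dataclass
class Router:
    # Hypothetical model of a router, for illustration only.
    name: str
    reports_healthy: bool = True  # what the router says about itself
    last_heartbeat: float = 0.0   # last time it actually answered a probe


def usable_by_self_report(router: Router) -> bool:
    # Trusts the router's own status; a hung router never flips this flag.
    return router.reports_healthy


def usable_by_heartbeat(router: Router, now: float, timeout: float = 3.0) -> bool:
    # Treats silence as failure: no recent answer means route around it.
    return router.reports_healthy and (now - router.last_heartbeat) <= timeout


def pick_routes(routers, check):
    # Returns the names of routers that the given check considers usable.
    return [r.name for r in routers if check(r)]


# Assumed scenario: "nyc-1" hung at t=10 and has not answered a probe since.
routers = [Router("nyc-1", True, 10.0), Router("nyc-2", True, 59.0)]
now = 60.0

by_report = pick_routes(routers, usable_by_self_report)
by_heartbeat = pick_routes(routers, lambda r: usable_by_heartbeat(r, now))
```

In this sketch the self-report check keeps sending traffic to the hung router, while the heartbeat check drops it from the list because it has been silent for longer than the timeout.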

An analyst quoted in the ComputerWorld story said that it was "highly unusual" for one router to take down large portions of a network. He pointed out that such outages happen more often when someone accidentally cuts a cable, as occurred twice in the past 18 months in Australia.

When I read about the problems with the router, it also reminded me a bit of the DC Metro crash. There, the US National Transportation Safety Board (NTSB) discovered that a spurious signal generated by a track circuit module transmitter mimicked a valid signal and bypassed the rails via an unintended signal path. The module receiver sensed the spurious signal, so the train that had stopped in the track circuit where the accident occurred was not detected.

Unfortunately, in that case "presto" resulted in a crash.

Expecting the unexpected is a fundamental engineering principle of which we seem to need constant reminding.

Or as the Greek philosopher Heraclitus wrote some 2,500 years ago, "If you do not expect the unexpected, you will not find it; for it is hard to be sought out, and difficult."
