IT Hiccups of the Week: Amazon’s Christmas Eve Incident Hits Netflix

Luckily, there have been rather few IT uff das over the holiday season. That said, there were still a few worth noting.

The world’s stock exchanges continued their streak of glitches over the last year with one at the New York Stock Exchange ending 2012 and two new ones to start 2013, one at the Nasdaq and the other at the London Stock Exchange, which served to undercut the exchanges’ promises to reduce operational snafus.

Netflix experienced two glitches, one at the hands of Amazon on Christmas Eve and one of its own making. At about 12:30 PST on Christmas Eve, Netflix’s streaming video went out for customers across Canada, Latin America, and the United States, and wasn’t fully restored until late Christmas morning. Netflix placed the blame on a problem at the Amazon Web Services (AWS) cloud computing center in Northern Virginia, which Amazon admitted experienced an “event” on 24 December. Amazon, in a long and detailed explanation, said that the root cause was a programmer who, while doing system maintenance, accidentally logically deleted “a portion of the ELB [Elastic Load Balancing] state data.” It took a couple of hours before the accident showed up in a way that could be diagnosed correctly, and many more before an effective fix took hold. Other AWS users were also affected by the incident, but given that it was Christmas Eve, there weren’t widespread complaints reported.

Amazon apologized for the incident and claimed that it would “use it to drive further improvement” in its services. Netflix also apologized for the outage, explaining that it had thought it had built in enough redundancy to handle such an incident. In a backhand swipe at Amazon’s vaunted claims for its cloud’s service reliability, Netflix said, “It is still early days for cloud innovation and there is certainly more to do in terms of building resiliency in the cloud,” and added that it would be investigating further approaches to improving its reliability.

The other Netflix problem occurred on New Year’s Eve. In this case, the “technical difficulties” didn’t affect streaming video but rather the ability of some Netflix customers to add discs to their mailing queue. Apparently the minor issue was fixed by New Year’s Day.

A more impactful technical difficulty was felt in Michigan when the state’s Department of Technology, Management and Budget (DTMB) failed to load food assistance benefits on state-issued debit cards at the beginning of the year, the AP reported. It happened because of a “human error” in which the department “failed to give a vendor a computer file required.” Some 85 000 food assistance recipients out of 1.8 million were affected; those affected had identification numbers ending in “0,” “1,” and “2,” television station CBS Detroit reported. The situation was fixed by noon on Saturday, and the DTMB promised to “find out why exactly this happened” and “figure out how to make sure this doesn’t happen again.”

A similar “forgot to load the data” excuse was given as the reason why some Augusta, Georgia city employees discovered that their Blue Cross Blue Shield of Georgia health insurance cards were showing up as invalid on New Year’s Day. According to a story at the Augusta Chronicle, “Mike Blanchard, the deputy information technology director … said [in an email to city employees] the problem was caused when … [an] employee benefits file failed to load into the Blue Cross system.” The insurance policy was still valid, but until new cards were mailed out sometime this week, “employees should present a copy of an attached letter at the doctor’s office and pharmacy and have the office call Blue Cross or pharmacy services for confirmation.”

The Chronicle reported that city employees weren't happy, especially given that, “City officials offered similar explanations when hundreds of Augusta employees’ and retirees’ insurance previously turned up canceled,” last year.

Finally, there was an AP report that those dependent on unemployment benefits in Arizona wouldn’t have to wait for their checks after all. The Arizona Department of Economic Security had warned benefit recipients that the late decision by the U.S. government to extend unemployment benefits as part of the “fiscal cliff” agreement might delay unemployment checks for up to a week because of the computer programming changes required. However, the AP reported, the programming changes were completed ahead of schedule and no delay would result.

A programming change done ahead of schedule? Almost sounds like the ending of A Christmas Carol.

Risk Factor

IEEE Spectrum's risk analysis blog, featuring daily news, updates and analysis on computing and IT projects, software and systems failures, successes and innovations, security threats, and more.

Contributors

 
Contributor
Willie D. Jones
 
