It’s been a fairly quiet week in regard to IT glitches of any major significance. That said, there were still a sufficient number of snarls, snafus and errors to interfere with work as well as generally upset, annoy and outrage a lot of people. We start off this week's review with an issue affecting NASA’s $2.5 billion Mars rover mission.
NASA Curiosity Goes into Safe Mode Due to Memory Issue
Responding to a problem it detected Wednesday morning with the data coming from the Mars rover Curiosity, NASA announced on Thursday that it had “switched the rover to a redundant onboard computer in response to a memory issue on the computer that had been active.”
NASA said that it will shift the rover from its current “safe mode” operation to full operational status over the next few days as well as troubleshoot what is causing the “glitch in flash memory linked to the other, now-inactive, computer.”
The NASA press release stated that on Wednesday the rover communicated “at all scheduled communication windows…but it did not send recorded data, only current status information. The status information revealed that the computer had not switched to the usual daily ‘sleep’ mode when planned. Diagnostic work in a testing simulation at JPL indicates the situation involved corrupted memory at an A-side memory location used for addressing memory files.”
A detailed story at CNET quoted Curiosity Project Manager Richard Cook as telling CBS News that, “We were in a state where the software was partially working and partially not, and we wanted to switch from that state to a pristine version of the software running on a pristine set of hardware.”
The project team thinks that space radiation, while a remote possibility, may in fact be to blame, CNET said. Again quoting Cook:
“In general, there are lots of layers of protection, the memory is self correcting and the software is supposed to be tolerant to it…But what we are theorizing happened is that we got what's called a double bit error, where you get an uncorrectable memory error in a particularly sensitive place, which is where the directory for the whole memory was sitting…So you essentially lost knowledge of where everything was. Again, software is supposed to be tolerant of that...But it looks like there was potentially a problem where software kind of got into a confused state where parts of the software were working fine but other parts of software were kind of waiting on the memory to do something...and the hardware was confused as to where things were.”
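To see why a double bit error is a different beast from a single flipped bit, here is a minimal sketch of a textbook single-error-correcting, double-error-detecting (SECDED) Hamming code. It is an illustration only: nothing in the reporting says which error-correction scheme Curiosity's memory actually uses, and the function names `encode` and `decode` and the 4-bit word size are assumptions made for the example.

```python
# Textbook SECDED Hamming(8,4) sketch. NOT the rover's actual scheme;
# names and the 4-bit data width are assumptions for illustration.

def encode(nibble):
    """Encode 4 data bits as an 8-bit codeword (list of 0/1 values)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8                       # index 0 holds the overall parity bit
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]   # covers positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]   # covers positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]   # covers positions 4, 5, 6, 7
    code[0] = sum(code[1:]) % 2             # makes total parity even
    return code


def decode(code):
    """Return (status, nibble); nibble is None if the word is unrecoverable."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos              # XOR of the positions holding a 1
    parity_bad = sum(code) % 2 == 1

    if syndrome == 0 and not parity_bad:
        status = "ok"
    elif parity_bad:
        # Exactly one bit flipped; the syndrome names its position
        # (0 means the overall parity bit itself was hit).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome but even overall parity: two bits flipped.
        # The error is detected, but its location cannot be determined.
        return "uncorrectable", None

    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, nibble


word = encode(0b1011)

one_flip = list(word)
one_flip[5] ^= 1                         # a single upset: repaired silently

two_flips = list(word)
two_flips[5] ^= 1
two_flips[6] ^= 1                        # a double upset: detected, not repaired

print(decode(word))
print(decode(one_flip))
print(decode(two_flips))
```

With one flipped bit the decoder can pinpoint and repair the damage; with two it can only report that the word is bad. If that word happens to belong to the directory describing where everything else is stored, the software is left with the kind of gap Cook describes.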
Cook indicated that, in essence, a reboot of the inactive computer should clear things up, but that the team will do a lot of analysis before that happens to make sure that there isn’t anything more troublesome lurking about.
The rover problem no doubt annoyed many NASA scientists given that Curiosity had, only a few days earlier, drilled into the Mars surface to gather for analysis the “first sample ever collected from the interior of a rock on another planet.”
Amazon Bug Wipes Out iOS Users’ Kindle Libraries
Less significant than the Curiosity issue, but still annoying to Amazon’s Apple customers, was a buggy update to the company’s Kindle app for iOS.
On Wednesday morning, Amazon warned Apple iOS users not to download its latest Kindle app because of “a known issue” which turned out to be something capable of deleting Kindle libraries from their devices, CNET reported. Reading the comments at the Amazon Kindle forum seemed to indicate that not everyone was affected by the bug, however.
According to CNET, “After downloading the initial update [version 3.6.1], existing Kindle users were logged out of their accounts, and everything they had downloaded was deleted from their devices. They also lost bookmarks and other settings, according to angry comments on iTunes. Users then had to log back in to Kindle and redownload their books from the cloud. Some complained that they had to delete the app entirely and download it again.”
Amazon fixed the problem later in the day with version 3.6.2. How many users got whacked is not known, or at least Amazon isn’t saying. Interestingly, Amazon’s own page for the Kindle app for iPad, iPhone & iPod touch lists the latest app version as 3.5.
T-Shirt Maker Blames Computer for Violent Phrases Targeting Women
Another company found that its online offerings needed some updating as well. Worcester, Mass., T-shirt maker Solid Gold Bomb was in full apology mode last week after its t-shirts appeared for sale on Amazon bearing a range of phrases, of which “Keep Calm and Punch Her” was one of the “least hateful,” a story at the Daily Mail reported on Thursday and again with more vigor on Friday. Soon after the t-shirts appeared for sale, a flood of outrage against both Solid Gold Bomb and Amazon appeared on social networks.
Solid Gold Bomb founder Michael Fowler tried somewhat unconvincingly to explain that the problem resided with a poorly thought out and careless computer algorithm he created that allowed certain offensive combinations of words to appear on the t-shirts. Fowler didn’t say exactly why neither he nor anyone else in his company thought the words “Keep Calm and …” when combined with “murder,” “knife” and “rape” would be generally acceptable—especially when they could further be combined with the words “her” or most other personal pronouns, for that matter.
Fowler says that the company is correcting the problem, and that, “Rest assured, we do not condone the offense nor do we have any desire to promote it. Ultimately, it comes down to my error and I should singly accept repsonsibility (sic) for the mistake. Again, my sincere apologies for the unintended outcome.”
Maybe Solid Gold Bomb could invest in a spell checker as well.
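For the curious, the kind of unchecked word combination Fowler blames is easy to reproduce, and just as easy to guard against. The following is a minimal sketch, assuming a simple template that pairs every verb with every pronoun. It is not Solid Gold Bomb's code, which has not been published; the word lists, the `generate_phrases` function and the deny-list approach are all hypothetical.

```python
# Hypothetical slogan generator with a deny-list check.
# This is an assumed design for illustration, not the company's algorithm.
from itertools import product

TEMPLATE = "Keep Calm and {verb} {obj}"


def generate_phrases(verbs, objects, blocked_verbs):
    """Yield every verb/object slogan, skipping verbs on the deny-list."""
    blocked = {v.lower() for v in blocked_verbs}
    for verb, obj in product(verbs, objects):
        if verb.lower() in blocked:
            continue                     # never emit a blocked combination
        yield TEMPLATE.format(verb=verb.title(), obj=obj.title())


verbs = ["hug", "thank", "punch"]        # sample word list (assumed)
objects = ["her", "him", "them"]
blocked = ["punch"]                      # verbs a reviewer has ruled out

phrases = list(generate_phrases(verbs, objects, blocked))
for phrase in phrases:
    print(phrase)
```

A deny-list only catches the words someone thought to put on it, so a human look at the generated catalog before it goes on sale would still be prudent.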
EHR Problem at Canberra Hospital Forces Emergency Patients Elsewhere
There was a brief news item in the Canberra Times about a “glitch” in Canberra Hospital’s electronic health record system Wednesday afternoon that “forced their emergency department to direct patients with ‘non-urgent’ issues elsewhere.” The story went on to say that critical emergency patients were still being seen, but the emergency department was “reverting to paper systems while IT experts fix the problem.”
The problem was reportedly fixed by Thursday.
An eerily similar problem hit Bellevue Hospital in New York City a few weeks ago, according to the Wall Street Journal. In this case, ambulances containing anyone other than psychiatric patients were diverted away from the hospital’s emergency room because of a “computer glitch.”
I was under the impression that a quick and seamless rollover to paper medical records was standard operating procedure in the event that an EHR system went offline, but it seems that this switch-over isn't so easy for emergency rooms.
BB&T Bank Customers Get Scare
“Diverted attention” could also describe BB&T customers this weekend. With all the stories about identity thieves being able to drain personal bank accounts, many BB&T customers wondered if that had happened to them when they tried to access their accounts and saw zero balances.
According to WRCB TV in Chattanooga, Tennessee, an “internal computer issue” that affected ATM, mobile, and online systems incorrectly indicated to customers that they had zero bank balances and therefore could not access or move their money. I bet more than one customer experienced a quick panic attack when that happened.
While BB&T, which is headquartered in Winston-Salem, North Carolina, has not said how many customers were affected (it has some 6.4 million customers), there were enough calls to its 1-800 customer service lines to overload its phone system.
BB&T says everything is okay now, and assures its customers that it “will reverse any fees or charges that may have occurred as a result of this issue.”
Robert N. Charette is a Contributing Editor to IEEE Spectrum and an acknowledged international authority on information technology and systems risk management. A self-described “risk ecologist,” he is interested in the intersections of business, political, technological, and societal risks. Charette is an award-winning author of multiple books and numerous articles on the subjects of risk management, project and program management, innovation, and entrepreneurship. A Life Senior Member of the IEEE, Charette was a recipient of the IEEE Computer Society’s Golden Core Award in 2008.