The web was buzzing yesterday with stories of the 150,000 or so Gmail users who found either that they could not access their accounts or that their accounts were empty.
At first, Google thought 0.08% of its 193 million Gmail user accounts had been affected, but it later lowered that figure to 0.02%, or some 40,000 accounts. Google claims that nearly all of the affected accounts have now been restored.
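For readers who want to check the arithmetic behind those percentages, here is a minimal sketch. It assumes the 193 million total quoted above is accurate; the function and variable names are purely illustrative.

```python
# Total Gmail accounts as quoted in this post (an assumption taken from the text).
TOTAL_ACCOUNTS = 193_000_000


def affected_accounts(percent: float, total: int = TOTAL_ACCOUNTS) -> int:
    """Return the number of accounts that a given percentage of the total represents."""
    return round(total * percent / 100)


initial_estimate = affected_accounts(0.08)  # Google's first figure
revised_estimate = affected_accounts(0.02)  # Google's lowered figure

print(f"Initial estimate: {initial_estimate:,} accounts")
print(f"Revised estimate: {revised_estimate:,} accounts")
```

The first figure works out to roughly 154,000 accounts, in line with the "150,000 or so" widely reported, and the revised figure to just under 39,000, which is where "some 40,000" comes from.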
According to the official Gmail Blog, even though Google keeps multiple copies of its Gmail account data spread across multiple data centers, a bug in a software update was apparently still able to delete those copies.
The Gmail Blog went on to say:
"We released a storage software update that introduced the unexpected bug, which caused 0.02% of Gmail users to temporarily lose access to their email. When we discovered the problem, we immediately stopped the deployment of the new software and reverted to the old version."
In addition:
"If you were affected by this issue, it’s important to note that email sent to you between 6:00 PM PST on February 27 and 2:00 PM PST on February 28 was likely not delivered to your mailbox, and the senders would have received a notification that their messages weren’t delivered. "
However, the Gmail Blog post also said:
"To protect your information from these unusual bugs, we also back it up to tape. Since the tapes are offline, they’re protected from such software bugs. But restoring data from them also takes longer than transferring your requests to another data center, which is why it’s taken us hours to get the email back instead of milliseconds. "
Yep, you read that correctly: tapes.
As this controversial blog post over at Fortune points out (it has gotten lots of nasty comments), that's likely a whole lot of tape.
UPDATE: Thursday, 03 March 2011
Apparently getting all the affected Gmail user accounts back online is taking longer than Google expected.
As related in this blog post at the LA Times last night, Google is now refusing to confirm that everyone will have their service restored by Wednesday evening. Google had originally promised that the problem would be fixed by Monday night, as this article yesterday from ComputerWorld noted.
This updated blog post at the LA Times reports Google claiming that everything is back to normal, and asking users whose accounts still don't appear to be restored to contact the company, since the cause may be an unrelated issue.
Robert N. Charette is a Contributing Editor to IEEE Spectrum and an acknowledged international authority on information technology and systems risk management. A self-described “risk ecologist,” he is interested in the intersections of business, political, technological, and societal risks. Charette is an award-winning author of multiple books and numerous articles on the subjects of risk management, project and program management, innovation, and entrepreneurship. A Life Senior Member of the IEEE, Charette was a recipient of the IEEE Computer Society’s Golden Core Award in 2008.