Cloud computing puts your desktop wherever you want it
This is part of IEEE Spectrum’s special report: Top 11 Technologies of the Decade
Illustration: Frank Chimero
Just 18 years ago the Internet was in its infancy, a mere playground for tech-savvy frontiersmen who knew how to search a directory and FTP a file. Then in 1993 it hit puberty, when the Web’s graphical browsers and clickable hyperlinks began to attract a wider audience. Finally, in the 2000s, it came of age, with blogs, tweets, and social networking dizzying billions of ever more naive users with relentless waves of information, entertainment, and gossip.
This, the adulthood of the Internet, has come about for many reasons, all of them supporting a single conceptual advance: We’ve cut clean through the barrier between hardware and software. And it’s deeply personal. Videos of our most embarrassing moments, e-mails detailing our deepest heartaches, and every digit of our bank accounts, social security numbers, and credit cards are splintered into thousands of servers controlled by dozens—hundreds?—of companies.
Welcome to cloud computing. We’ve been catapulted into this nebulous state by the powerful convergence of widespread broadband access, the profusion of mobile devices enabling near-constant Internet connectivity, and hundreds of innovations that have made data centers much easier to build and run. For most of us, physical storage may well become obsolete in the next few years. We can now run intensive computing tasks on someone else’s servers cheaply, or even for free. If this all sounds a lot like time-sharing on a mainframe, you’re right. But this time it’s accessible to all, and it’s more than a little addictive.
The seduction of the business world began first, in 2000, when Salesforce.com started hosting software for interacting with customers that a client could rebrand as its own. Customers’ personal details, of course, went straight into Salesforce’s databases. Since then, hundreds of companies have turned their old physical products into virtual services or invented new ones by harnessing the potential of cloud computing.
Consumers were tempted four years later, when Google offered them their gateway drug: Gmail, a free online e-mail service with unprecedented amounts of storage space. The bargain had Faustian overtones—store your e-mail with us for free, and in exchange we’ll unleash creepy bots to scan your prose—but the illusion of infinite storage proved too thoroughly enthralling. This was Google, after all: big, brawny, able to warp space and time.
Gmail’s infinite storage was a start. But the program’s developers also made use of a handy new feature. Now they could roll out updates whenever they pleased, guaranteeing that Gmail users were all in sync without having to visit a Web site to download and install an update. The same principle applied to the collaborative editing tools of Google Docs, which moved users’ documents into the browser with no need for backups to a hard drive. “Six years ago”—before the launch of Docs—“office productivity on the Web wasn’t even an idea,” recalls Rajen Sheth, a product manager at Google.
Docs thus took a first, tentative bite out of such package software products as Microsoft Office. Soon hundreds of companies were nibbling away.
Adding new features and fixing glitches, it turned out, could be a fluid and invisible process. Indeed, sites like the photo storage service Flickr and the blog platform WordPress continually seep out new products, features, and fixes. Scraping software off individual hard drives and running it in anonymous data centers obliterated the old, plodding cycles of product releases and patches.
In 2008, Google took a step back from software and launched App Engine. For next to nothing, Google now lets its users upload Java or Python code that is then modified to run swiftly on any desired number of machines. Anyone with a zany idea for a Web application could test it out on Google’s servers with minimal financial risk. Let’s say your Web app explodes in popularity: App Engine will sense the spike and swiftly increase your computing ration.
With App Engine, Google began dabbling in a space already dominated by another massive player, Amazon.com. No longer the placid bookstore most customers may have assumed it to be, in 2000 Amazon had begun to use its sales platform to host the Web sites of other companies, such as the budget retailer Target. In 2006 came rentable data storage, followed by a smorgasbord of “instances,” essentially slices of a server available in dozens of shapes and sizes. (Not satisfied? Fine: The CPU of an instance, which Amazon calls a compute unit, is equivalent to that of a 1.0- to 1.2-gigahertz 2007 Opteron or 2007 Xeon processor.)
To get a flavor of the options, for as little as about US $0.03 an hour, you can bid on unused instances in Amazon’s cloud. As long as your bid exceeds a price set by Amazon, that spare capacity is yours. At the higher end, around $2.28 per hour can get you a “quadruple extra large” instance with 68 gigabytes of memory, 1690 GB of storage, and a veritable bounty of 26 compute units.
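The bidding rule above can be sketched in a few lines of Python. This is only an illustration, not Amazon's actual pricing mechanism: the function name `run_spot_instance`, the price series, and the assumption that each hour is charged at the going price rather than at your bid are all hypothetical.

```python
def run_spot_instance(bid, hourly_prices):
    """Return (hours_run, total_cost) for one bid against a series of hourly prices.

    Assumption (not from the article): the instance runs in every hour where
    the bid exceeds the going price, and that hour is billed at the going price.
    """
    hours_run = 0
    total_cost = 0.0
    for price in hourly_prices:
        if bid > price:          # bid beats the price set by the provider
            hours_run += 1
            total_cost += price
        # otherwise the spare capacity goes to someone else for that hour
    return hours_run, total_cost


# Hypothetical going prices, in US dollars per hour.
prices = [0.03, 0.04, 0.03, 0.06, 0.03]
hours, cost = run_spot_instance(bid=0.05, hourly_prices=prices)
print(hours, round(cost, 2))
```

With these made-up prices, the instance sits out the one hour in which the going price climbs above the bid.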
In a sense, the cloud environment makes it easier to just get things done. The price of running 10 servers for 1000 hours is identical to that of running 1000 machines for 10 hours, a flexibility that doesn’t exist in most corporate server rooms. “These are unglamorous, heavy-lifting tasks that are the price of admission for doing what your customers value,” says Adam Selipsky, a vice president at Amazon Web Services.
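The equivalence between 10 servers for 1000 hours and 1000 machines for 10 hours follows directly from billing by the machine-hour. Here is a minimal sketch, assuming a flat hourly rate with no volume discounts or minimum charges; the rate of 10 cents is a placeholder, not a quoted price.

```python
def rental_cost_cents(machines, hours, cents_per_machine_hour):
    """Total cost when every machine-hour is billed at the same flat rate."""
    return machines * hours * cents_per_machine_hour


RATE_CENTS = 10  # hypothetical flat rate, in US cents per machine-hour

slow_and_narrow = rental_cost_cents(10, 1000, RATE_CENTS)   # 10 servers, 1000 hours
fast_and_wide = rental_cost_cents(1000, 10, RATE_CENTS)     # 1000 servers, 10 hours

# Both runs consume 10,000 machine-hours, so the bills match.
print(slow_and_narrow == fast_and_wide)
```

The bill is the same either way, but the second job finishes 100 times sooner, which is the flexibility a fixed server room cannot offer.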
As unglamorous as an electric utility, some might say. Indeed, Amazon’s cloud services are as close as we’ve gotten to the 50-year-old dream of “utility computing,” in which processing is treated like power. Users pay for what they use and don’t install their own generating capacity. The idea of every company running its own generators seems ludicrous, and some would argue that computing should be viewed the same way.
Selling instances, of course, is nothing like selling paperbacks, toasters, or DVDs. Where Google’s business model revolves around collecting the world’s digital assets, Amazon has more of a split personality, one that has led to some odd relationships. To help sell movies, for example, Amazon now streams video on demand, much like companies such as Netflix. Netflix, however, also uses Amazon’s servers to stream its movies. In other words, Amazon’s servers are so cheap and useful that even its competitors can’t stay away. But to understand what’s truly fueling the addiction to the cloud, you’ll need to glance a bit farther back in time.
COMPANY TO WATCH:
F-Secure Corp. uses the cloud to protect the cloud. Its global network of servers detects malicious software and distributes protective updates in minutes. To assess a threat, it uses the Internet itself: A widely available application is more likely to be safe than a unique file.
Transmitting a terabyte of data from Boston to San Francisco can take a week. So the impatient are returning to an old idea, “Sneakernet”: Put your data on a disc, take it to FedEx, and get it to a data center in a day.
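As a rough check on that figure, the sketch below computes the sustained throughput implied by moving a terabyte in a week, and the rate a one-day courier delivery is effectively matching. It assumes a decimal terabyte (10^12 bytes) and ignores protocol overhead; the function name is illustrative only.

```python
def implied_throughput_mbps(num_bytes, seconds):
    """Average throughput, in megabits per second, needed to move num_bytes in the given time."""
    return num_bytes * 8 / seconds / 1e6


TERABYTE = 10**12              # decimal terabyte, in bytes (an assumption)
ONE_DAY = 24 * 3600            # seconds
ONE_WEEK = 7 * ONE_DAY

over_the_wire = implied_throughput_mbps(TERABYTE, ONE_WEEK)   # roughly 13 Mb/s
by_courier = implied_throughput_mbps(TERABYTE, ONE_DAY)       # roughly 93 Mb/s

print(f"network: {over_the_wire:.1f} Mb/s, courier: {by_courier:.1f} Mb/s")
```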
Dude, where are my bits? In the growing obfuscation of who’s responsible for what data, Amazon recently deployed its storefront platform on privacy-challenged Facebook for the first time. The irresistible business case? Selling Pampers diapers.
In the mid-1990s, a handful of computer science graduate students at Stanford University became interested in technologies that IBM had developed in the 1960s and ’70s to let multiple users share a single machine. By the 1980s, when cheap servers and desktop computers began to supplant mainframe computers, those “virtualization” techniques had fallen out of favor.
The students applied some of those dusty old ideas to PCs running Microsoft Windows and Linux. They built what’s called a hypervisor, a layer of software that goes between hardware and other higher-level software structures, deciding which of them will get how much access to CPU, storage, and memory. “We called it Disco—another great idea from the ’70s ready to make a comeback,” recalls Stephen Herrod, who was one of the students.
They realized that virtualization could address many of the problems that had begun to plague the IT industry. For one thing, servers commonly operated at as little as a tenth of their capacity, according to International Data Corp., because key applications each had a dedicated server. Giving each application its own machine was a way of limiting vulnerabilities, because true disaster-proofing was essentially unaffordable.
So the students spawned a start-up, VMware. They started by emulating an Intel x86 microprocessor’s behavior in software. But those early attempts didn’t always work smoothly. “When you mess up an emulation and then run Windows 95 on top of it, you sometimes get funny results,” Herrod, now VMware’s chief technology officer, recalls. They’d wait an hour for the operating system to boot up, only to see the Windows graphics rendered upside down or all reds displayed as purple. But slowly they figured out how to emulate first the processor, then the video cards and network cards. Finally they had a software version of a PC—a virtual machine.
Next they set out to load multiple virtual machines on one piece of hardware, allowing them to run several operating systems on a single machine. Armed with these techniques, VMware began helping its customers consolidate their data centers on an almost epic scale—shrinking 500 servers down to 20. “You literally go up to a server, suck the brains out of it, and plop it on a virtual machine, with no disruption to how you run the application or what it looks like,” Herrod says.
Also useful was an automated process that could switch out the underlying hardware that supported an up-and-running virtual machine, allowing it to move from, say, a Dell machine to an HP server. This was the essence of load balancing: If one server started failing or got too choked up with virtual machines, some of them could move off to another server, eliminating a potential bottleneck.
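The placement decision behind that kind of load balancing can be sketched as follows. This is a toy model under stated assumptions, not VMware's algorithm: it covers only the choice of which virtual machine to move and where, not the mechanics of moving a running machine. The host names, load numbers, and the "smallest VM to least-loaded host" policy are all hypothetical.

```python
def rebalance(hosts, capacity):
    """Move virtual machines off overloaded hosts.

    hosts: dict mapping host name -> list of per-VM loads (modified in place).
    capacity: maximum total load a host should carry.
    Returns a list of (vm_load, source, destination) migrations.
    """
    migrations = []
    for source in list(hosts):
        while sum(hosts[source]) > capacity:
            vm = min(hosts[source])  # hypothetical policy: move the smallest VM first
            candidates = [
                h for h in hosts
                if h != source and sum(hosts[h]) + vm <= capacity
            ]
            if not candidates:
                break  # nowhere to put it; leave the host overloaded
            dest = min(candidates, key=lambda h: sum(hosts[h]))  # least-loaded host
            hosts[source].remove(vm)
            hosts[dest].append(vm)
            migrations.append((vm, source, dest))
    return migrations


# Hypothetical cluster: loads are percentages of one host's capacity.
hosts = {"dell-1": [40, 30, 30, 20], "hp-1": [10], "hp-2": [50]}
moves = rebalance(hosts, capacity=100)
print(moves)
```

In this example the overloaded Dell host sheds its smallest virtual machine to the least-loaded HP host, after which every host is within capacity.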
You might think that the virtual machines would run far more slowly than the underlying hardware, but the engineers solved the problem with a trick that separates mundane from “privileged” computing tasks. When the virtual machines sharing a single server execute routine commands, those computations all run on the bare metal, mixed together with their neighbors’ tasks in a computational salad bowl. Only when a virtual machine needs to perform a privileged task, such as accessing the network, does the processing retreat into its walled-off software alcove, where the calculating continues, bento-box style.
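The split between routine and “privileged” work can be illustrated with a toy dispatcher. This is a deliberately simplified model, not how any real hypervisor is implemented: real systems rely on processor privilege levels and related techniques, whereas here the operation names, the `PRIVILEGED` set, and both classes are invented for illustration.

```python
# Hypothetical set of operations that must not touch hardware directly.
PRIVILEGED = {"net_send", "disk_write", "set_page_table"}


class Hypervisor:
    """Handles privileged operations on behalf of its virtual machines."""

    def __init__(self):
        self.trap_log = []

    def emulate(self, vm_name, op, arg):
        # The hypervisor mediates access, keeping each VM walled off from the others.
        self.trap_log.append((vm_name, op, arg))
        return f"emulated {op}"


class VirtualMachine:
    def __init__(self, name, hypervisor):
        self.name = name
        self.hypervisor = hypervisor
        self.direct_count = 0

    def execute(self, op, arg=None):
        if op in PRIVILEGED:
            # Privileged work is handed to the hypervisor.
            return self.hypervisor.emulate(self.name, op, arg)
        # Routine work runs "on the bare metal" with no intervention.
        self.direct_count += 1
        return f"direct {op}"


hv = Hypervisor()
vm = VirtualMachine("vm-a", hv)
results = [vm.execute(op) for op in ["add", "mul", "net_send", "add"]]
print(results)
print(vm.direct_count, len(hv.trap_log))
```

Because most instructions in a typical workload are routine, only a small share of operations take the slower, mediated path.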
Those speedy transitions would not have been possible were it not for another key trend: the consolidation of life into an Intel world. Back in virtualization’s early days, a major goal was to implement foreign architectures on whatever hardware was at hand, say, by emulating a PowerPC on a Sun Microsystems workstation. Virtualization then had two functions: to silo data and to translate commands for the underlying hardware. With microprocessor architectures standardized around the x86, just about any server is now compatible with every other, eliminating the tedious translation step.
VMware no longer has a monopoly on virtualization—a nice open-source option exists as well—but it can take credit for developing much of the master idea. With computers sliced up into anywhere between 5 and 100 flexible, versatile virtual machines, users can claim exactly the computing capacity they need at any given moment. Adding more units or cutting back is simple and immediate. The now-routine tasks of cloning virtual machines and distributing them through multiple data centers make for easy backups. And at a few cents per CPU-hour, cloud computing can be cheap as dirt.
So will all computing move into the cloud? Well, not every bit. Some will stay down here, on Earth, where every roofing tile and toothbrush seems fated to have a microprocessor of its own.
But for you and me, the days of disconnecting and holing up with one’s hard drive are gone. IT managers, too, will surely see their hardware babysitting duties continue to shrink. Cloud providers have argued their case well to small-time operations with unimpressive computing needs and university researchers with massive data sets to crunch through. But those vendors still need to convince Fortune 500 companies that cloud computing isn’t just for start-ups and biology professors short on cash. They need a few more examples like Netflix to prove that mucking around in the server room is a choice, not a necessity.
And we may just need more assurances that our data will always be safe. Data could migrate across national borders, becoming susceptible to an unfriendly regime’s weak human rights laws. A cloud vendor might go out of business, change its pricing, be acquired by an archrival, or get wiped out by a hurricane. To protect themselves, cloud dwellers will want their data to be able to transfer smoothly from cloud to cloud. Right now, it does not.
The true test of the cloud, then, may emerge in the next generation of court cases, where the murky details of consumer protections and data ownership in a cloud-based world will eventually be hashed out. That’s when we’ll grasp the repercussions of our new addiction—and when we may finally learn exactly how the dream of the Internet, in which all the world’s computers function as one, might also be a nightmare.