Google AI Learns Classic Arcade Games From Scratch, Would Probably Beat You at Them

Deep learning artificial intelligence that plays Space Invaders could inspire better search, translation, and mobile apps

Illustration: Google DeepMind

New artificial intelligence software from Google can teach itself how to play—and often master—classic 1980s Atari arcade games.

“This work is the first time that anyone has built a single general-learning system that can learn directly from experience to master a wide range of challenging tasks—in this case, a set of Atari games—and perform at or better than human level at those games,” says one of the AI’s creators, Demis Hassabis, who works at Google DeepMind in London. Hassabis and colleagues detailed their findings in this week’s issue of the journal Nature. (And you can download the source code from Google here.)

The researchers hope to apply the ideas behind their AI to Google products such as search, machine translation, and smartphone apps “to make those things smarter,” Hassabis says.

Artificial intelligence is now experiencing a renaissance because of groundbreaking advances in machine learning. One important machine learning strategy is reinforcement learning, in which a program known as an agent learns through trial and error what actions maximize a future reward.
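
To make that trial-and-error loop concrete, the sketch below shows tabular Q-learning, the classic reinforcement-learning method that underlies DeepMind’s approach. The states, actions, and hyperparameters here are placeholders chosen for illustration, and DeepMind’s agent replaces the lookup table with a deep neural network.

```python
import random
from collections import defaultdict

# Minimal tabular Q-learning sketch, for illustration only. All hyperparameters
# and the state/action encoding are assumptions, not values from the paper.
ALPHA = 0.1    # learning rate
GAMMA = 0.99   # how strongly future rewards count
EPSILON = 0.1  # fraction of moves made at random, to keep exploring

q_values = defaultdict(float)  # (state, action) -> estimated future reward

def choose_action(state, actions):
    """Mostly pick the best-known action, occasionally try a random one."""
    if random.random() < EPSILON:
        return random.choice(actions)
    return max(actions, key=lambda a: q_values[(state, a)])

def learn(state, action, reward, next_state, actions):
    """Trial-and-error update: nudge the estimate toward the reward just
    received plus the discounted value of the best next action."""
    best_next = max(q_values[(next_state, a)] for a in actions)
    target = reward + GAMMA * best_next
    q_values[(state, action)] += ALPHA * (target - q_values[(state, action)])
```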

However, reinforcement learning agents often have problems dealing with data that approach real-world complexity. To improve such agents, the researchers combined reinforcement learning with a technique known as convolutional neural networks, which are hotly pursued under the name “deep learning” by tech giants such as Google, Facebook, and Apple. (The original developer of convolutional networks, Facebook AI chief Yann LeCun, explains deep learning here.)

In an artificial neural network, components known as artificial neurons are fed data, and work together to solve a problem such as reading handwriting or recognizing speech. The network can then alter the pattern of connections among those neurons to change the way they interact, and the network tries solving the problem again. Over time, the network learns which patterns are best at computing solutions.
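
To picture that loop in code, here is a toy sketch (not DeepMind’s code) of a single-layer network nudging its connection weights after each attempt; the data and the rule it learns are invented purely for this example.

```python
import numpy as np

# Toy sketch of the learning loop described above: a tiny one-layer network
# adjusts its connection weights after each pass over the data. The data and
# the target rule are invented for illustration.
rng = np.random.default_rng(0)
inputs = rng.random((100, 4))                        # 100 examples, 4 inputs each
targets = (inputs.sum(axis=1) > 2.0).astype(float)   # arbitrary pattern to learn

weights = rng.normal(size=4)   # the "connections" the network will adjust
bias = 0.0
learning_rate = 0.1

for _ in range(1000):
    # Forward pass: combine the weighted inputs and squash through a sigmoid.
    preds = 1.0 / (1.0 + np.exp(-(inputs @ weights + bias)))
    # Compare with the targets, then shift the connections to reduce the error.
    error = preds - targets
    weights -= learning_rate * inputs.T @ error / len(inputs)
    bias -= learning_rate * error.mean()
```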

Such learning systems differ from other game-playing systems such as Deep Blue’s chess software and Watson’s Jeopardy program, explains Hassabis: 

Those systems are very impressive technical feats—obviously, they beat the human world champions in both those games. The key difference in those kinds of algorithms and systems is that they were largely preprogrammed with those abilities. Take Deep Blue—it was a team of programmers and chess grandmasters that distilled chess knowledge into the program, and then that program efficiently executed that task without adapting or learning anything.
What we’ve done is developed an algorithm that learns from the ground up. It takes perceptual experiences and learns how to do things directly from those perceptual experiences from first principles. The advantage of these kinds of systems is that they can learn and adapt to unexpected things, and the programmers and system designers don’t have to know the solution themselves in order for the machine to master that task.

The new software agent, called a deep Q-network (DQN), was tested on 49 classic Atari 2600 games, including Space Invaders, Ms. Pac-Man, Pong, Asteroids, Centipede, Q*bert, and Breakout. The agent was only fed the scores and data from an 84 by 84 pixel screen—unlike some other general game-playing AIs, the DQN did not know the rules of the games it played beforehand.
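
For readers who want a concrete picture, the sketch below follows the network layout described in the Nature paper: a few convolutional layers read the stacked 84 by 84 pixel frames, and a final layer outputs one estimated future score (a “Q-value”) for each joystick action. It is written here in PyTorch purely as an illustration; DeepMind’s released code is a separate implementation.

```python
import torch
import torch.nn as nn

# Illustrative DQN-style network: stacked game frames in, one Q-value per
# action out. Layer sizes follow the Nature paper; everything else here is
# an assumption made for the sake of a runnable example.
class DQN(nn.Module):
    def __init__(self, n_actions: int, n_frames: int = 4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(n_frames, 32, kernel_size=8, stride=4),  # 84x84 -> 20x20
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2),        # 20x20 -> 9x9
            nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1),        # 9x9 -> 7x7
            nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 512),
            nn.ReLU(),
            nn.Linear(512, n_actions),  # one estimated score per possible action
        )

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        return self.net(frames)

# At play time the agent simply presses the button with the highest predicted
# Q-value, e.g.: action = DQN(n_actions=6)(frames).argmax(dim=1)
```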

The system ran on a single GPU-equipped desktop computer and trained for about two weeks per game. The DQN performed at a level comparable to that of a professional human games tester, achieving more than 75 percent of what the human tester scored on 29 games. The agent also outperformed the best existing reinforcement learning agents on 43 games.

The games at which the DQN excelled were highly varied, including side-scrolling shooters, 3-D car racing, and boxing. “This system is able to generalize to any sequential decision-making problem,” says Koray Kavukcuoglu at Google DeepMind.

The games where the DQN did not do well reflect the limitations of the agent. “Currently, the system learns essentially by pressing keys randomly and then figuring out when this leads to high scores,” says Google DeepMind’s Vlad Mnih. However, such a button-mashing strategy often does not work in games requiring more sophisticated exploration or long-term planning.

The researchers are now moving on to games from the 1990s, including some 3-D games, “where the challenge is much greater,” Hassabis says. “StarCraft and Civilization are the ones we plan to crack at some point.”

So, will it be “Today, Ms. Pac-Man; tomorrow, the world”? No, says Hassabis, noting that the AI-concerned entrepreneur Elon Musk was an early investor in DeepMind, which was later acquired by Google. “I’m good friends with Elon,” says Hassabis. “We agree with him that there are risks, but we’re many, many decades away from any kind of technology we need to worry about.”
