At the end of 2018, Dario “TLO” Wünsch, a well-known professional gamer from Germany, was ranked 42nd in the world in the video game StarCraft II. He’d lost some—especially as he battled debilitating carpal tunnel syndrome—but he’d won enough to still be considered among the world’s best players.
But last week, as he sat before his screen executing the unorthodox moves that have become his signature, he watched helplessly as his opponent slaughtered his armies and laid waste to his StarCraft II kingdom. There was no fist-pumping excitement coming from TLO’s opponent. The German gamer lost to an artificial intelligence agent created by DeepMind Technologies as part of its mission to push the boundaries of AI.
The company, which is measuring its progress by testing its algorithms’ ability to play StarCraft II, is celebrating a major milestone: the introduction last week of AlphaStar, its StarCraft II player.
To publicize AlphaStar’s release, DeepMind, which was acquired several years ago by Alphabet, set up a series of matches last week pitting a couple of its agents (algorithms trained to autonomously react to their environment with a focus on achieving a set of goals) against TLO (short for “The Little One”). To ensure that the results couldn’t be considered a fluke, DeepMind then matched its agents up against a second professional gamer, Grzegorz “MaNa” Komincz. MaNa, who hails from Poland, finished 2018 ranked 13th in the StarCraft II World Championship Series Circuit.
Right out of the gate, the first AlphaStar agent TLO faced picked him apart in short order. The human player had put too much emphasis on gathering resources and building out his kingdom. When the AI agent came calling with imperialism on its mind, TLO’s defenses were meager, his armies overmatched, and his simulated society quickly overtaken. The outcomes after that first game weren’t as lopsided, but TLO walked away from the five matches without a single victory, and he acknowledged what a superb job the AI developers had done.
To win at StarCraft II, a player builds an empire with all the forethought and flexibility such an endeavor requires. Players must weigh competing objectives (gathering resources, building structures, organizing an army, setting up defenses, and fighting battles) and shift their relative importance in real time over the course of a game that could last an hour or more.
What’s more, only a portion of the landscape in the game’s fictional world is visible at any given time, so the odds of winning depend heavily on the player’s memory and ability to set up things that won’t be continuously monitored. Adding to the game’s complexity, a player can choose from more than 300 possible actions at any given moment (compared with the fewer than a dozen moves available in, say, simple arcade games).
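To make the limited-visibility problem concrete, here is a minimal Python sketch of a player who sees only a small window of the map and must rely on memory for everything else. It is a toy illustration under stated assumptions, not how StarCraft II or AlphaStar works internally: the 8-by-8 grid, the square viewing window, and the names `FogOfWarMemory`, `observe`, `recall`, and `coverage` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FogOfWarMemory:
    """Toy model (hypothetical): remembers the last value seen in each map cell."""
    width: int
    height: int
    # Maps (x, y) -> (value, step_last_seen); absent keys were never observed.
    seen: dict = field(default_factory=dict)

    def observe(self, world, center, radius, step):
        """Record every cell inside a square window around `center`."""
        cx, cy = center
        for y in range(max(0, cy - radius), min(self.height, cy + radius + 1)):
            for x in range(max(0, cx - radius), min(self.width, cx + radius + 1)):
                self.seen[(x, y)] = (world[y][x], step)

    def recall(self, cell):
        """Return (value, step_last_seen), or None if the cell was never seen."""
        return self.seen.get(cell)

    def coverage(self):
        """Fraction of the map observed at least once."""
        return len(self.seen) / (self.width * self.height)


# An 8x8 map with one resource patch near the starting corner.
world = [[0] * 8 for _ in range(8)]
world[1][1] = 5

memory = FogOfWarMemory(width=8, height=8)
memory.observe(world, center=(1, 1), radius=1, step=0)

# The view moves to the far corner; meanwhile the patch is mined out unseen.
world[1][1] = 0
memory.observe(world, center=(6, 6), radius=1, step=1)

print(memory.recall((1, 1)))  # remembered value is now out of date
print(memory.recall((4, 4)))  # never observed
print(memory.coverage())
```

In this sketch the remembered value for the resource patch goes stale as soon as the view moves away. That is the bind the paragraph above describes: whatever isn't being watched has to be recalled or planned for in advance.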
That’s complicated enough for any human player to manage. But TLO faced several DeepMind agents, each with different tendencies. So when TLO adjusted his strategy with an eye toward counteracting the first agent’s favorite moves, the DeepMind group switched to another agent, effectively erasing the value of what the German gamer was learning about how a particular algorithm preferred to weight the tasks.
Although MaNa, the Polish gaming professional, is more experienced than TLO at playing with the Protoss characters featured in the version of the game the players had to contend with, he too was frustrated by the AI agents’ use of strategy and fighting techniques that he had never seen from his human opponents. MaNa lost the first four of the five matches against the DeepMind algorithms. Only when DeepMind switched to an agent whose view of the landscape was strictly limited did MaNa manage to score one for humanity.
This series of victories by DeepMind’s team was a big deal. Although AIs have surpassed us in some widely recognized measures of human smarts—trouncing the best human players in the TV game show Jeopardy!, beating world champions in chess and Go, and getting us closer to letting cars drive themselves in order to avoid accidents—these algorithms are still not as good at the game of learning as we humans are.
In fact, up until now, StarCraft II and its predecessor were too complex for AI gamers to take on. Even when the game was dumbed down by simplifying maps of the landscape and changing the rules to give the agents superhuman abilities, the AI agents were easily bested by human professional gamers. But AlphaStar needs no such assistance. Its deep neural network, trained directly from raw game data via supervised and reinforcement learning techniques, more than holds its own.
DeepMind and its partner, Blizzard Entertainment (which provides the game replays needed for training the algorithms), think their use of StarCraft II as a research environment will continue to pay huge dividends. They’ve developed a machine-learning API to help other researchers and developers make better use of the millions of anonymized game replays that will soon be available for sending agents to StarCraft school. And their PySC2 environment wrapper even breaks the game down into chunks that can be used to test an agent on specific tasks, such as shifting its own field of view and collecting mineral resources. The AI developers say that the training environment provided by StarCraft II could even bolster research that would make computers better at sequence prediction and improve their long-term memory.