AI Decisively Defeats Human Poker Players

Photo: Andrew Rush/Pittsburgh Post-Gazette/AP Photo
Poker pro Jason Les (with computer mouse in hand) plays against the Libratus AI.

Humanity has finally folded under the relentless pressure of an artificial intelligence named Libratus in a historic poker tournament loss. As poker pro Jason Les played his last hand and leaned back from the computer screen, he ventured a half-hearted joke about the anticlimactic ending and the lack of sparklers. Then he paused in a moment of reflection.

“120,000 hands of that,” Les said. “Jesus.”

Libratus lived up to its “balanced but forceful” Latin name by becoming the first AI to beat professional poker players at heads-up, no-limit Texas Hold'em. The tournament was held at the Rivers Casino in Pittsburgh from 11 to 30 January. Developed by Carnegie Mellon University, the AI beat four poker pros in the “Brains vs. Artificial Intelligence” tournament by US $1,766,250 in chips over 120,000 hands. The victory margin was large enough to count as a statistically significant win: the researchers could be at least 99.98 percent sure that the AI's victory was not due to chance.

Previous attempts to develop poker-playing AI that exploits the mistakes of opponents—whether AI or human—have generally met with limited success, says Tuomas Sandholm, a computer scientist at Carnegie Mellon University. Libratus instead focuses on improving its own play, an approach Sandholm describes as safer and more reliable than the riskier tactic of trying to exploit opponents' mistakes.

“We looked at fixing holes in our own strategy because it makes our own play safer and safer,” Sandholm says. “When you exploit opponents, you open yourself up to exploitation more and more.”

Even more important, the victory demonstrates how AI has likely surpassed the best humans at strategic reasoning in “imperfect information” games such as poker. The no-limit Texas Hold'em version of poker is a good example of an imperfect-information game because players must deal with the uncertainty of two hidden cards and unrestricted bet sizes. An AI that performs well at no-limit Texas Hold'em could also potentially tackle real-world problems with similar levels of uncertainty.

“The algorithms we used are not poker specific,” Sandholm explains. “They take as input the rules of the game and output strategy.”

In other words, the Libratus algorithms can take the “rules” of any imperfect-information game or scenario and then come up with its own strategy. For example, the Carnegie Mellon team hopes its AI could design drugs to counter viruses that evolve resistance to certain treatments, or perform automated business negotiations. It could also power applications in cybersecurity, military robotic systems, or finance.

The Libratus victory comes two years after a first “Brains vs. Artificial Intelligence” competition held at the Rivers Casino in Pittsburgh in April–May 2015. During that first competition, an earlier AI called Claudico fell short of victory when it challenged four human poker pros. That competition proved a statistical draw in part because it featured just 80,000 hands of poker, which is why the Carnegie Mellon researchers decided to bump up the number of hands to 120,000 in the second tournament.

The four human poker pros who participated in the recent tournament—Jason Les, Dong Kim, Daniel McAulay, and Jimmy Chou—spent many extra hours each day trying to puzzle out Libratus. They teamed up at the start of the tournament with a collective plan of each trying different ranges of bet sizes to probe for weaknesses in the Libratus AI’s strategy that they could exploit. During each night of the tournament, they gathered together back in their hotel rooms to analyze the day’s worth of plays and talk strategy.

The human strategy of playing weird bet sizes had its greatest success in the first week, though the AI never relinquished the lead it took at the start. Libratus held a growing lead of $193,000 in chips by the third day, but the poker pros narrowed the AI's lead by clawing back $42,201 in chips on the fourth day. After losing an additional $8,189 in chips to Libratus on the fifth day, the humans scored a sizable victory of $108,775 in chips on the sixth day and cut the AI's lead to just $50,513.

But Libratus struck back by winning $180,816 in chips on the seventh day. After that, the “wheels were coming off the wagon” for the human poker pros, Sandholm says. They noticed that Libratus seemed to become especially unbeatable toward the last of the four betting rounds in each game, and so they tried betting big up front to force a result before the fourth round. They speculated on how much Libratus could change its strategy within each game. But victory only seemed to slip further away.

One of the players, Jimmy Chou, became convinced that Libratus had tailored its strategy to each individual player. Dong Kim, who performed the best among the four by only losing $85,649 in chips to Libratus, believed that the humans were playing slightly different versions of the AI each day.

After Kim finished playing on the final day, he helped answer some questions for online viewers watching the poker tournament through the live-streaming service Twitch. He congratulated the Carnegie Mellon researchers on a “decisive victory.” But when asked about what went well for the poker pros, he hesitated: “I think what went well was…shit. It’s hard to say. We took such a beating.”

In fact, Libratus played the same overall strategy against all the players based on three main components:

  • First, the AI’s algorithms computed a strategy before the tournament by running for 15 million processor-core hours on a new supercomputer called Bridges. 
  • Second, the AI would perform “end-game solving” during each hand to precisely calculate how much it could afford to risk in the third and fourth betting rounds (the “turn” and “river” rounds in poker parlance). Sandholm credits the end-game solver algorithms as contributing the most to the AI victory. The poker pros noticed Libratus taking longer to compute during these rounds and realized that the AI was especially dangerous in the final rounds, but their “bet big early” counter strategy was ineffective.
  • Third, Libratus ran background computations during each night of the tournament so that it could fix holes in its overall strategy. That meant Libratus was steadily improving its overall level of play and minimizing the ways that its human opponents could exploit its mistakes. It even prioritized fixes based on whether or not its human opponents had noticed and exploited those holes. By comparison, the human poker pros were able to consistently exploit strategic holes in the 2015 tournament against the predecessor AI called Claudico.

By the end of the tournament, the poker pros had long since been resigned to their fate. Daniel McAulay, the last poker pro to finish his hands for the day, turned to an offscreen spectator and joked, “How much do I have to pay you to play the last 50 hands? Uhhhh, this is so brutal.”

The Libratus victory translates into a winning rate of 14.7 big blinds per 100 hands, in poker parlance—an impressive rate considering that the AI was playing four human poker pros. Prior to the start of the tournament, online betting sites had given odds of 4:1 against Libratus, casting the AI as the underdog. But Sandholm seemed confident enough in the AI's tournament performance to state that “there is no human who can beat Libratus.”
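The arithmetic behind that winning rate can be checked in a few lines. The big-blind size is an assumption here—the article doesn't state it—but $50/$100 blinds, commonly reported for this match, make the numbers line up with the 14.7 figure:

```python
# Sanity check of the reported winning rate of 14.7 big blinds per 100 hands.
# Assumption (not stated in the article): big blind = $100 in chips.
chip_lead = 1_766_250   # Libratus's final chip lead over the four pros
hands = 120_000         # total hands played in the tournament
big_blind = 100         # assumed big-blind size in chips

bb_per_100_hands = chip_lead / hands / big_blind * 100
print(round(bb_per_100_hands, 1))  # → 14.7
```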

Despite the historic victory over humans, AI still has a ways to go before it can claim to have perfectly solved heads-up, no-limit Texas Hold'em. That's because the computational power required to solve the game is still far beyond even the most powerful supercomputers. The game has 10^160 possible plays at different stages—far more than the estimated number of atoms in the observable universe. In 2015, a University of Alberta team demonstrated AI that provides a “weak” solution to a less complex version of poker with fixed bet sizes and a fixed number of betting rounds.

But as the defeated poker pros drifted away from their computer stations one by one, gloomy viewer comments floated up on the live stream’s Twitch chat window. “Dude poker is dead!!!!!!!!!!!” wrote one Twitch user before adding “RIP poker.” Others seemed concerned about computer bots dominating future online poker games: “its tough to identify a bot from online poker rooms? ppl are terified [sic].”

There is some good news for anyone who enjoys playing—and winning—at poker. Libratus still required serious supercomputer hardware to perform its calculations and improve its play each night, said Noam Brown, a Ph.D. student in computer science at Carnegie Mellon University who worked with Sandholm on Libratus. Brown reassured the Twitch chat that invincible poker-playing bots probably would not be flooding online poker play anytime soon.

Another Twitch user asked if poker still counts as a skill-based game. The question seemed to reflect anxiety about the meaning of a game that millions of people enjoy playing and watching: What does it all mean if an AI can dominate potentially any human player? But Sandholm told the Twitch chat that he sees poker as “definitely a skill-based game, no question.”

“People are worried that my work here has killed poker: I hope it has done the exact opposite,” Sandholm said. “I think of poker and no-limit [Texas Hold’em] as a recreational intellectual endeavor in much the same way as composing a symphony or performing ballet or playing chess.”

As the final day of the tournament wound down, the Carnegie Mellon professor thanked the online viewers for watching and supporting the competition. And he took the time to answer a number of lingering questions about the new AI overlord of poker.

“Does Libratus call me daddy?” Sandholm read aloud a Twitch chat question. “No, it can’t speak.”

Editor’s Note: This story was updated to reflect the fact that the statistical significance for the Libratus victory was 99.98 percent and not merely 99.7 percent.
