Champions Declared in AI Poker Tournament

University of Alberta’s “Hyperborean” program wins three gold medals

Poker chips and four aces playing cards.
Photo: Frederic Prochasson / iStockphoto

This is the third and final blog post I’ll be writing about the Annual Computer Poker Competition that recently concluded at the ongoing AAAI conference in Toronto. I’m a member of the competing team from the University of Alberta’s Computer Poker Research Group (CPRG).

In my first post, I talked about the three different games that comprise the Texas Hold ’Em competition: heads-up (two-player) limit (fixed betting sizes), heads-up no-limit, and 3-player limit. Each game has two different methods for determining a winner: an “instant runoff bankroll” method that sequentially eliminates competitors based on their losses, and a “total bankroll” method that looks only at their total number of chips. That makes six divisions and six winners overall. The second post described the “Hyperborean” programs that we’ve entered into the competition. With the completion of this year’s competition, I will now discuss the results and give my take on the Poker Symposium introduced at this year’s AAAI conference.
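
To make the difference between the two winner determination rules concrete, here is a minimal Python sketch. It is not the competition's scoring code. It assumes that instant runoff repeatedly drops the entry with the lowest total among the entries still standing and then recomputes totals, and the bot names and chip counts are invented for illustration.

```python
def total_bankroll_winner(winnings):
    """Winner is whoever won the most chips against the whole field."""
    totals = {bot: sum(row.values()) for bot, row in winnings.items()}
    return max(totals, key=totals.get)


def instant_runoff_winner(winnings):
    """Drop the worst remaining bot, recompute totals, repeat (assumed rule)."""
    remaining = sorted(winnings)
    while len(remaining) > 1:
        totals = {
            bot: sum(winnings[bot][other] for other in remaining if other != bot)
            for bot in remaining
        }
        loser = min(remaining, key=totals.get)
        remaining.remove(loser)
    return remaining[0]


# Hypothetical results: winnings[a][b] is what bot a won from bot b.
WINNINGS = {
    "equilibrium": {"exploiter": 1, "weak": 5},
    "exploiter": {"equilibrium": -1, "weak": 20},
    "weak": {"equilibrium": -5, "exploiter": -20},
}

print(total_bankroll_winner(WINNINGS))
print(instant_runoff_winner(WINNINGS))
```

In this made-up field, the bot that takes the most chips from the weak entry wins total bankroll, while the bot that never loses a head-to-head match survives the runoff once the weak entry is eliminated.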

The final tally: Our Hyperborean won three gold medals, a program called Slumbot won two golds, and an Australian program called Little.Rock took home the remaining gold.

While the CPRG has historically done well in heads-up limit, this year saw a new champion of heads-up limit crowned. “Slumbot,” designed by Eric Jackson, an independent hobbyist and co-chair of this year’s competition, won both the instant-runoff and total bankroll divisions. Just like our instant-runoff entry, Slumbot is an approximate Nash equilibrium (a highly “defensive” strategy described in my second post) computed with an algorithm called counterfactual regret minimization. 
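
Counterfactual regret minimization is easiest to see on a toy game. The sketch below is not Slumbot's or Hyperborean's code. It is a textbook-style version for Kuhn poker (three cards, one bet), with no abstraction and all six deals enumerated on every iteration. The names Node, cfr, and train are chosen purely for illustration.

```python
import itertools

ACTIONS = "pb"  # p = pass (check or fold), b = bet (or call)


class Node:
    """Regret and strategy totals for one information set."""

    def __init__(self):
        self.regret_sum = [0.0, 0.0]
        self.strategy_sum = [0.0, 0.0]
        self.strategy = [0.5, 0.5]

    def update_strategy(self):
        # Regret matching: play each action in proportion to its positive regret.
        positive = [max(r, 0.0) for r in self.regret_sum]
        total = sum(positive)
        self.strategy = [p / total for p in positive] if total > 0 else [0.5, 0.5]

    def average_strategy(self):
        # The average strategy, not the current one, approaches equilibrium.
        total = sum(self.strategy_sum)
        return [s / total for s in self.strategy_sum] if total > 0 else [0.5, 0.5]


def cfr(nodes, cards, history, reach):
    """Return the value for the player to act and update regrets on the way."""
    player = len(history) % 2
    opponent = 1 - player
    if len(history) >= 2:
        higher = cards[player] > cards[opponent]
        if history[-2:] == "pp":
            return 1.0 if higher else -1.0  # both checked: showdown for the ante
        if history[-1] == "p":
            return 1.0  # the opponent folded to a bet
        if history[-2:] == "bb":
            return 2.0 if higher else -2.0  # bet and call: showdown for two chips
    key = str(cards[player]) + history
    if key not in nodes:
        nodes[key] = Node()
    node = nodes[key]
    action_values = [0.0, 0.0]
    for a, action in enumerate(ACTIONS):
        next_reach = list(reach)
        next_reach[player] *= node.strategy[a]
        # The child's value is from the opponent's point of view, so negate it.
        action_values[a] = -cfr(nodes, cards, history + action, next_reach)
    node_value = sum(p * v for p, v in zip(node.strategy, action_values))
    for a in range(2):
        # Counterfactual regret is weighted by the opponent's reach probability.
        node.regret_sum[a] += reach[opponent] * (action_values[a] - node_value)
        node.strategy_sum[a] += reach[player] * node.strategy[a]
    return node_value


def train(iterations):
    """Run CFR over all six deals per iteration; return (nodes, average value)."""
    nodes = {}
    deals = list(itertools.permutations([1, 2, 3], 2))
    total_value = 0.0
    for _ in range(iterations):
        for cards in deals:
            total_value += cfr(nodes, cards, "", [1.0, 1.0])
        for node in nodes.values():
            node.update_strategy()
    return nodes, total_value / (iterations * len(deals))


if __name__ == "__main__":
    nodes, value = train(5000)
    print("average value for the first player:", value)
    for key in sorted(nodes):
        print(key, nodes[key].average_strategy())
```

With a few thousand iterations, the average value should approach -1/18 chips per hand for the first player, which is the known value of Kuhn poker, and the average strategy learns to fold the lowest card and call with the highest card when facing a bet. Real Texas Hold ’Em has vastly more information sets than the twelve here, which is where the memory and abstraction trade-offs come in.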

The kicker, though, is that while my team makes use of state-of-the-art supercomputers maintained by Westgrid and Compute Canada, Eric used only low-end machines to build Slumbot. As RAM was limited, Slumbot’s data was instead stored on disk. While this was slow and required four months of computation prior to the competition, the large disk space allowed Slumbot to use very little abstraction compared to Hyperborean and other programs (due to the vast number of possible game states in poker, many programs base their decisions on a simplified, abstract version of the game). This meant that Slumbot could distinguish between many more hands, which resulted in Slumbot being the closest program to a real Nash equilibrium.

In those heads-up limit games, our Hyperborean ended up a close second behind Slumbot in the instant-runoff division, and placed fourth in the total bankroll event. While we succeeded in increasing our winnings against a few opponents in the total bankroll event by generating new strategies on the fly (as described in my second post), in many cases playing our approximate equilibrium would have been better. I suspect that many of the other entries in the total bankroll event, like Slumbot, were also near-equilibrium programs. If true, then because Nash equilibria are “unbeatable,” this rendered our generated strategies practically useless.

For heads-up no-limit, our single entry placed first in the instant-runoff bankroll and second in the total bankroll divisions. Our new technique allowing the program to choose many different bet sizes throughout the game proved quite successful, as Hyperborean never lost a single 1-on-1 match. There is still room for improvement, however, in the total bankroll division, as Hyperborean is not designed to take advantage of an opponent’s weaknesses. The winner of the total bankroll division, “Little.Rock,” by Rod Byrnes, was victorious by taking a lot of chips from some of the weaker no-limit entries.

In the 3-player events, Hyperborean finished on top under both winner determination rules. This year, we improved our 3-player entries significantly by creating a “dynamic expert strategy.” While our 2-player programs fit each card that's dealt into a single “bucket” (a category of cards used in the abstract version of the game) independent of the betting, our 3-player program uses two different abstractions and chooses between them based on the past sequence of actions taken by the players. First, we classify each action sequence as “important” or “unimportant” according to how many chips are in the pot and how many times our program from last year saw that action sequence. Then we create two separate card abstractions. The first allows our program to distinguish between many different hands and is employed at the important decisions. To keep memory requirements feasible, we compensate for this by employing a much coarser, weaker abstraction at the unimportant sequences. In preliminary experiments, this approach significantly improved upon simply using a single, balanced abstraction.
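
The idea of switching abstractions by action sequence can be sketched in a few lines of Python. This is an illustration only, not Hyperborean's code. It assumes that importance is the pot size multiplied by how often the sequence was seen, compared against a threshold, and that a hand is summarized by a single strength number between 0 and 1. The bucket counts, the threshold, and the action-sequence strings are all hypothetical.

```python
FINE_BUCKETS = 16            # hypothetical: many buckets at important decisions
COARSE_BUCKETS = 2           # hypothetical: few buckets everywhere else
IMPORTANCE_THRESHOLD = 1000  # hypothetical cutoff on pot size times frequency


def is_important(pot_chips, times_seen):
    """Assumed rule: big pots that come up often are worth more memory."""
    return pot_chips * times_seen >= IMPORTANCE_THRESHOLD


def bucket_index(hand_strength, num_buckets):
    """Map a hand strength in [0, 1] to one of num_buckets equal-width buckets."""
    if not 0.0 <= hand_strength <= 1.0:
        raise ValueError("hand_strength must be between 0 and 1")
    return min(int(hand_strength * num_buckets), num_buckets - 1)


def abstract_state(action_sequence, hand_strength, pot_chips, seen_counts):
    """Return the abstract state the strategy would look up for this decision."""
    times_seen = seen_counts.get(action_sequence, 0)
    if is_important(pot_chips, times_seen):
        return (action_sequence, "fine", bucket_index(hand_strength, FINE_BUCKETS))
    return (action_sequence, "coarse", bucket_index(hand_strength, COARSE_BUCKETS))


# Hypothetical counts of how often last year's program saw each sequence.
SEEN = {"rc": 400, "rrrc": 3}

# Common sequence: two similar hands land in different buckets.
print(abstract_state("rc", 0.55, 4, SEEN), abstract_state("rc", 0.60, 4, SEEN))
# Rare sequence: the same two hands are lumped together.
print(abstract_state("rrrc", 0.55, 12, SEEN), abstract_state("rrrc", 0.60, 12, SEEN))
```

The memory saved by lumping hands together at rarely seen sequences is what pays for the finer distinctions at the sequences that matter most.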

All of these results were announced at the first-ever Poker Symposium, held at AAAI-12. In addition to the results, many competitors and associates gave short talks or poster presentations about their programs or current research. Highlights included a talk by Sam Ganzfried from Carnegie Mellon University, who discussed his no-limit program, “Tartanian5,” which finished second in the instant-runoff bankroll event and used a new “reverse mapping” rule to interpret non-standard bet sizes. Kevin Waugh, another PhD student from Carnegie Mellon University, talked about the challenges of exploiting weak players with opponent modelling and described some of his first attempts at doing so.

Also at the symposium, a colleague of mine from the University of Alberta, Michael Johanson, showed that being closer to a real Nash equilibrium did not always imply victory in the heads-up limit instant-runoff competitions from previous years. Lastly, Luís Teófilo presented poker research, past and present, from LIACC at the University of Porto. Their work includes PokerLANG, a high-level language for building poker programs, and research towards programs that select high-level actions, such as “check-raise,” rather than a single action per decision point. Overall, there was much value in learning other teams’ approaches and sharing ideas at the Poker Symposium. Hopefully the symposium will be back again next year.
