The Daily - The Sunday Read: ‘How A.I. Conquered Poker’

Starting point is 00:00:00 My name is Keith Romer. I'm a contributor to the New York Times Magazine. So I've been playing poker recreationally for a couple of decades, not anywhere near on the level of the people in this story, but enough to understand what they are trying to do and what they're trying to achieve. I have probably played hundreds of thousands of hands of poker in my life, and it is a game that is fundamentally about decision-making, about when to bluff, when to call, when to raise. And we do this over and over, hundreds or thousands of times in a single game of poker.

Starting point is 00:00:39 For a long time, poker strategy was something that you learned by reading books, or you could watch other good players playing and see what they were doing, and it was kind of an open question. What is the right way to play a particular hand in a particular spot? But now, poker has been solved to a great extent using purely mathematics. What makes poker so difficult to analyze is the sheer scale of the game. Game theorists will talk about games

Starting point is 00:01:12 in terms of the size of their game tree, how many different decision points there are throughout the course of a game. In No Limit Texas Hold'em, just played with two players, the size of the game tree has more branches in it than the number of atoms in the universe. This story is about professional poker players and the way that they have learned from artificial intelligence how to play the game that they make their living from in a way that approaches theoretical perfection. To see all of this in action, I got on a plane for the first

Starting point is 00:01:47 time since the pandemic started, and I flew out to Las Vegas to watch people play for hundreds of thousands of dollars, the best players in the world trying their best to play perfect poker. So here's my article. Last November, in the cavernous Amazon Room of Las Vegas' Rio Casino, two dozen men, dressed mostly in sweatshirts and baseball caps, sat around three well-worn poker tables playing Texas Hold'em. Occasionally, a few passersby stopped to watch the action, but otherwise the players pushed their chips back and forth in dingy obscurity. Except for the taut electric stillness with which they held themselves during a hand, there was no outward sign that these were the greatest poker players in the world,

Starting point is 00:02:42 nor that they were, as the poker saying goes, playing for houses, or at least hefty down payments. This was the first day of a three-day tournament whose official name was the World Series of Poker Super High Roller, though the participants simply called it the 250K, after the $250,000 each had put up to enter it. At one table, a professional player named Seth Davies covertly peeled up the edges of his cards to consider the hand he had just been dealt, the six and seven of diamonds. Over several hours of play,

Starting point is 00:03:17 Davies had managed to grow his starting stack of 1.5 million in tournament chips to well over 2 million, some of which he now slid forward as a raise. A 33-year-old former college baseball player with a trimmed light brown beard, Davies sat upright, intensely following the action as it moved around the table. Two men called his bet before Dan Smith, a fellow pro with a round face, mustache, and whimsically worn cowboy hat, put in a hefty re-raise.

Starting point is 00:03:46 Only Davies called. The dealer laid out a king, four, and five, all clubs, giving Davies a straight draw. Smith checked, bet nothing. Davies bet. Smith called. The turn card was the deuce of diamonds, missing Davies' draw. Again Smith checked. Again Davies bet. Again Smith called. The last card dealt was the deuce of clubs, one final blow to Davies' hopes of improving his hand. By now, the pot at the center of the faded green felt-covered table had grown to more than a million in chips. The last deuce had put four clubs on the table, which meant that if Smith had even one club in his hand, he would make a flush. Davies, who had been betting the whole way, needing an eight or a three to turn his hand into a straight,

Starting point is 00:04:34 had arrived at the end of the hand with precisely nothing. After Smith checked a third time, Davies considered his options for almost a minute before declaring himself all in for $1.7 million in chips. If Smith called, Davies would be out of the tournament, his $250,000 entry fee incinerated in a single ill-timed bluff. Smith studied Davies from under the brim of his cowboy hat, then twisted his face in exasperation at Davies, or perhaps at luck itself. Finally, his features settling in an irritated scowl,

Starting point is 00:05:11 Smith folded, and the dealer pushed the pile of multicolored chips Davies' way. According to Davies, what he felt when the hand was over was not so much triumph as relief. You're playing a pot that's effectively worth a half million dollars in real money, he said afterwards. It's just so much goddamn stress. Real validation wouldn't come until around 2:30 that morning, after the first day of the tournament had come to an end and Davies had made the 15-minute drive from the Rio to his home outside Las Vegas.

Starting point is 00:05:44 There, in an office just in from the garage, he opened a computer program called PioSolver, one of a handful of artificial intelligence-based tools that have, over the last several years, radically remade the way poker is played, especially at the highest levels of the game. Davies input all the details of the hand and then set the program to run. In moments, the solver generated an optimal strategy.

Starting point is 00:06:09 Mostly, the program said, Davies had gotten it right. His bet on the turn when the deuce of diamonds was dealt should have been 80% of the pot instead of 50%, but the 1.7 million chip bluff on the river was the right play. That feels really good, Davies said, even more than winning a huge pot. The real satisfying part is when you nail one like that. Davies went to sleep that night knowing for certain that he played the hand within a few degrees of perfection. The pursuit of perfect poker goes back at least as far as the 1944 publication of Theory of Games and Economic Behavior

Starting point is 00:06:45 by the mathematician John von Neumann and the economist Oskar Morgenstern. The two men wanted to correct what they saw as a fundamental imprecision in the field of economics. We wish, they wrote, to find the mathematically complete principles which define rational behavior for the participants in a social economy

Starting point is 00:07:05 and to derive from them the general characteristics of that behavior. Economic life, they suggested, should be thought of as a series of maximization problems in which individual actors compete to wring as much utility as possible from their daily toil. If von Neumann and Morgenstern could quantify the way good decisions were made, the idea went, they would then be able to build a science of economics on firm ground. It was this desire to model economic decision-making that led them to gameplay. Von Neumann rejected most games as unsuitable to the task, especially those like checkers or chess, in which both players can see all the pieces on the board and share the same information.

Starting point is 00:07:50 Real life is not like that, he explained to Jacob Bronowski, a fellow mathematician. Real life consists of bluffing, of little tactics of deception, of asking yourself, what is the other man going to think I mean to do? And that is what games are about in my theory. Real life, von Neumann thought, was like poker. Using his own simplified version of the game, in which two players were randomly dealt secret numbers and then asked to make bets of a predetermined size on whose number was higher, von Neumann derived the basis for an optimal strategy.

Starting point is 00:08:26 Players should bet large both with their very best hands and, as bluffs, with some definable percentage of their very worst hands. The percentage changed depending on the size of the bet relative to the size of the pot. Von Neumann was able to demonstrate that by bluffing and calling at mathematically precise frequencies, players would do no worse than break even in the long run, even if they provided their opponents with an exact description of their strategy. And if their opponents deployed any strategy against them other than the perfect one von Neumann had described, those opponents were guaranteed to lose, given a large enough sample. Theory of Games pointed the way to a future in which all manner of competitive interactions

Starting point is 00:09:14 could be modeled mathematically. Auctions, submarine warfare, even the way species compete to pass their genes on to future generations. But in strategic terms, poker itself barely advanced in response to von Neumann's proof until it was taken up by members of the Department of Computing Science at the University of Alberta more than five decades later. The early star of the department's game research was a professor named Jonathan Schaeffer, who, after 18 years of work, discovered the solution to checkers. Alberta faculty and students also made significant progress on games as diverse as Go, Othello, StarCraft, and the Canadian pastime of curling. Poker, though, remained a particularly thorny

Starting point is 00:09:58 problem, for precisely the reason von Neumann was attracted to it in the first place—the way hidden information in the game acts as an impediment to good decision-making. Unlike in chess or backgammon, in which both players' moves are clearly legible on the board, in poker, a computer has to interpret its opponent's bets despite never being certain what cards they hold.

Starting point is 00:10:20 Neil Burch, a computer scientist who spent nearly two decades working on poker as a graduate student and researcher at Alberta before joining an artificial intelligence company called DeepMind, characterizes the team's early attempts as pretty unsuccessful. What we found was if you put a knowledgeable poker player in front of the computer and let them poke at it, he says, the program got crushed, absolutely smashed. Partly, this was just a function of the difficulty of modeling all the decisions involved in playing a hand of poker. Game theorists use a diagram of a branching tree to represent the different ways a game can play out. In a straightforward one, like rock, paper, scissors, the tree is small. Three branches for the rock, paper, and scissors you can play, each with three subsequent branches for the rock, paper, and scissors your opponent

Starting point is 00:11:10 can play. The more complicated the game, the larger the tree becomes. For even a simplified version of Texas Hold'em, played heads-up, that is, between just two players, and with bets fixed at a predetermined size, a full game tree contains 316 quadrillion branches. The tree for No-Limit Hold'em, in which players can bet any amount, has even more than that. It really does get truly enormous, Burch says, like larger than the number of atoms in the universe. At first, the Alberta group's approach was to try to shrink the game to a more manageable scale, crudely bucketing hands together that were more or less alike, treating a pair of nines and a pair of tens, say, as if they were identical.
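As a rough illustration of the branching-tree idea, the sketch below enumerates the rock, paper, scissors tree described above: three first moves, each followed by three possible replies. The function and variable names are invented for this example and are not taken from any poker software.

```python
from itertools import product

MOVES = ("rock", "paper", "scissors")

def enumerate_game_tree(moves=MOVES):
    """Return every (your move, opponent's move) path through the tree."""
    return list(product(moves, repeat=2))

paths = enumerate_game_tree()
# Three branches, each with three sub-branches, gives nine end points.
print(len(paths))
```

The same counting logic is what makes poker trees explode: every extra decision point multiplies the number of paths rather than adding to it.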

Starting point is 00:11:54 But as the field of artificial intelligence grew more robust, and as the team's algorithms became better tuned to the intricacies of poker, its programs began to improve. Crucial to this development was an algorithm called counterfactual regret minimization. Computer scientists tasked their machines with identifying poker's optimal strategy by having the programs play against themselves

Starting point is 00:12:19 billions of times and take note of which decisions in the game tree had been least profitable, the regrets, which the AI would learn to minimize in future iterations by making other, better choices. In 2015, the Alberta team announced its success by publishing an article in Science titled, Heads-Up Limit Hold'em Poker Is Solved. For some players, especially those who made a living playing that variant of poker online, the Alberta group's triumph represented a serious threat to their livelihood.
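To make the idea of learning from regrets concrete, here is a minimal sketch of regret matching, the update rule that counterfactual regret minimization applies at each decision point. It is run here on rock, paper, scissors rather than poker, and it is not the Alberta team's code. The biased starting regret for the first player is an assumption added purely so the self-play has something to correct.

```python
ACTIONS = ("rock", "paper", "scissors")
N = len(ACTIONS)

def payoff(a, b):
    """Payoff to the player choosing action index a against action index b."""
    return (0, 1, -1)[(a - b) % 3]

def current_strategy(regrets):
    """Regret matching: play each action in proportion to its positive regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    return [p / total for p in positive] if total > 0 else [1.0 / N] * N

def self_play(iterations=20000):
    # Assumption for illustration: player 0 starts out wishing it had played rock.
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0] * N, [0.0] * N]
    for _ in range(iterations):
        strategies = [current_strategy(r) for r in regrets]
        for player in (0, 1):
            opponent = strategies[1 - player]
            # Expected payoff of each pure action against the opponent's current mix.
            values = [sum(opponent[b] * payoff(a, b) for b in range(N)) for a in range(N)]
            expected = sum(strategies[player][a] * values[a] for a in range(N))
            for a in range(N):
                # Regret: how much better action a would have done than what was played.
                regrets[player][a] += values[a] - expected
                strategy_sums[player][a] += strategies[player][a]
    # The average strategy over all iterations is the one that approaches equilibrium.
    return [[s / iterations for s in sums] for sums in strategy_sums]

average = self_play()
print([round(p, 2) for p in average[0]])
```

In this toy game the averaged strategy should drift toward playing each option about a third of the time, which is the unexploitable mix. Poker solvers apply the same kind of update across a vastly larger tree.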

Starting point is 00:12:51 I remember when we read about it, says the former professional Terence Chan, we were just like, oh, good game. It's been a fun ride. It quickly became clear that academics were not the only ones interested in computers' ability to discover optimal strategy. One former member of the Alberta team who asked me not to name him, citing confidentiality agreements with a software company that currently employs him, told me that he had been paid hundreds of thousands of dollars to help poker players develop software that would identify perfect play and to consult with programmers building bots that would be capable of defeating humans in online games. Players unable to front that kind of money didn't have

Starting point is 00:13:30 to wait long before gaining more affordable access to AI-based strategies. The same year that Science published the Limit Hold'em article, a Polish computer programmer and former online player named Piotrek Lopusiewicz began selling the first version of his application, PioSolver. For $249, players could download a program that approximated the solutions for the far more complicated, no-limit version of the game. As of 2015, a practical actualization of John von Neumann's mathematical proof

Starting point is 00:14:01 was available to anyone with a powerful enough personal computer. One of the earliest and most devoted adopters of what has come to be known as game theory optimal poker is Seth Davies's friend and poker mentor, Jason Koon. On the second day of the three-day Super High Roller tournament, I visited Koon at his multi-million dollar house, located in a gated community inside a larger gated community next to a Jack Nicklaus-designed golf course.

Starting point is 00:14:45 On day one, Koon paid $250,000 to play the super high roller, then a second $250,000 after he was knocked out four hours in. But again, he lost all his chips. Welcome to the world of nosebleed tourneys, he texted me afterward. Just have to play your best. It evens out. For Koon, evening out has taken the form of more than $30 million in in-person tournament winnings, and, he says, at least as much from high-stakes cash games in Las Vegas and Macau, the Asian gambling mecca. Koon began

Starting point is 00:15:17 playing poker seriously in 2006 while rehabbing an injury at West Virginia Wesleyan College, where he was a sprinter on the track team. He made a good living from cards, but he struggled to win consistently in the highest stakes games. I was a pretty mediocre player, pre-solver, he says. But the second solvers came out, I just buried myself in this thing. And I started to improve, like, rapidly, rapidly, rapidly, rapidly. In a home office, decorated mostly with trophies from poker tournaments he has won,

Starting point is 00:15:47 Koon turned to his computer and pulled up a hand on PioSolver. After specifying the size of the players' chip stacks and the range of hands they would play from their particular seats at the table, he entered a random three-card flop that both players would see.

Starting point is 00:16:00 A 13-by-13 grid illustrated all the possible hands one of the players could hold. Koon hovered his mouse over the square for an ace and queen of different suits. The solver indicated that Koon should check 39% of the time, make a bet equivalent to 30% the size of the pot 51% of the time, and bet 70% of the pot the rest of the time. This von Neumann-esque mixed strategy would simultaneously maximize his profit and disguise the strength of his hand. Thanks to tools like PioSolver, Koon has remade his approach to the game, learning what size bets work best in different situations.
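A mixed strategy like the one quoted above for ace-queen is just a probability distribution over actions. The sketch below samples from it many times to show how the frequencies play out. The 10% figure is inferred from "the rest of the time", and the names and fixed seed are choices made for this example only, not anything PioSolver exposes.

```python
import random

# Frequencies quoted above for ace-queen of different suits on this flop.
MIXED_STRATEGY = {
    "check": 0.39,
    "bet 30% of pot": 0.51,
    "bet 70% of pot": 0.10,  # "the rest of the time": 1 - 0.39 - 0.51
}

def choose_action(strategy, rng):
    """Pick one action at random, weighted by the solver's frequencies."""
    actions = list(strategy)
    weights = list(strategy.values())
    return rng.choices(actions, weights=weights, k=1)[0]

rng = random.Random(2015)  # fixed seed only so the sketch is repeatable
sample = [choose_action(MIXED_STRATEGY, rng) for _ in range(10_000)]
for action in MIXED_STRATEGY:
    print(action, sample.count(action) / len(sample))
```

Over a large number of hands the observed frequencies settle close to 39, 51, and 10 percent, even though no single decision reveals which way the player is leaning.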

Starting point is 00:16:39 Sometimes, tiny ones, one-fifth or even one-tenth the size of the pot, are ideal. Other times, giant bets, two or three times the size of the pot, are correct. And while good poker players have always known that they need to maintain a balance between bluffing and playing it straight, solvers define the precise frequency with which Koon should employ one tactic or the other and identify the, sometimes surprising, best and worst hands to bluff with, depending on the cards in play. Erik Seidel, a pro who learned the game in the 1980s, told me that if players like Koon traveled back in time just 15 years with today's knowledge, they would crush the best players of that era. I think also that all the people in the game

Starting point is 00:17:23 would think that they were fish, Seidel said, using the poker argot for bad players. There are a lot of really strange plays now that these guys are making that are effective, but if people saw them back in the day, I think that they'd be invited into the game every night. Against weaker players, Koon will sometimes intentionally diverge from theoretically perfect poker, bluffing more than he should, or betting large when the AI says he should bet small to take advantage of his opponents' mistakes. But against the best professionals, he will mostly just do his best to replicate the solver's decisions to the extent that he is able to remember the AI's preferred bet sizes and the frequencies with which to employ them. Because he knows his own human biases can creep into his decision-making,

Starting point is 00:18:06 Koon will often randomly select which of the solver's tactics to employ in a given hand. He'll glance down at the second hand on his watch or at a poker chip to note the orientation of the casino logo as if it were a clock face in order to generate a percentage between 1 and 100. The higher the percentage, the more aggressive the action he'll take. I'll say, okay, well, I just rolled 9 o'clock, so that's 75%. That's a pretty aggressive number. In that instance, Koon might choose the largest

Starting point is 00:18:36 of the solver's approved bet sizes for his hand, whereas if the second hand had pointed to 3 o'clock, or 25%, he might have checked. Using optimal strategy is no guarantee, of course, that Koon will win any particular hand. Given enough hands, however, the math says he should do no worse than break even, and will in practice do much better than that, depending on how far his opponents' strategies diverge from theoretically perfect play. If he were to play thousands of hands against the solver, Koon says, it's going to win, I promise. Koon is quick to point out that even with access to the solver's perfect strategy,
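The watch trick amounts to turning a clock position into a number and reading an action off a cumulative scale. Here is a small sketch of that mapping. The 40/30/30 split is a made-up strategy for illustration, not a solver output, and the function names are hypothetical.

```python
def clock_to_percent(second_hand_position):
    """Map a second-hand position (0-59) to a percentage; 45 (9 o'clock) gives 75.0."""
    return (second_hand_position % 60) / 60 * 100

def pick_action(percent, strategy):
    """strategy: (action, percent_of_time) pairs ordered least to most aggressive."""
    cumulative = 0
    for action, share in strategy:
        cumulative += share
        if percent < cumulative:
            return action
    return strategy[-1][0]

# Hypothetical frequencies, ordered from passive to aggressive.
HYPOTHETICAL_STRATEGY = [("check", 40), ("bet small", 30), ("bet large", 30)]

for position in (15, 45):  # 3 o'clock and 9 o'clock
    percent = clock_to_percent(position)
    print(position, percent, pick_action(percent, HYPOTHETICAL_STRATEGY))
```

With these assumed frequencies, 9 o'clock lands in the most aggressive band and 3 o'clock lands in the checking band, matching the pattern described above.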

Starting point is 00:19:14 poker remains an incredibly difficult game to play well. The emotional swings that come from winning or losing giant pots, and the fatigue of 12-hour sessions, remain the same challenges as always. But now top players have to put in significant work away from the tables to succeed. Like most top pros, Koon spends a good part of each week studying different situations that might arise, trying to understand the logic behind the program's choices. Solvers can't tell you why they do what they do. They just do it, he says. So now it's on the poker player to figure out why. The best players are able to reverse-engineer the AI's strategy and create heuristics that apply to hands and situations similar to the one they're studying.

Starting point is 00:19:59 Even so, they are working with immense amounts of information. When I suggested to Koon that it was like endlessly rereading a 10,000-page book in order to keep as much of it in his head as possible, he immediately corrected me. A 100,000-page book. The game is so damn hard. In fact, the store of data Koon draws on is even larger than that. He rents nearly 200 terabytes of cloud storage for the game trees he has developed since he started working with solvers. Players sitting down to in-person games have no way to access all that information at the table. But that limitation does not necessarily apply to poker played over the internet. Automated bots, especially in low-stakes games, have been a

Starting point is 00:20:41 problem for internet poker since before the rise of solvers. But now, human players willing to skirt the rules can look up AI strategies on one screen and then use them to play optimally on a second screen. Anytime there are high stakes and a lot of money to be won and a device that might be used for good, Koon says, people have a way to turn it into a cheating tool. Koon isn't especially worried that people are cheating in the games he plays over the internet, but other players aren't so sure. It's the main reason why I don't really play much online anymore,

Starting point is 00:21:14 a pro named Ryan LaPlante says. In a recent $7,000 buy-in online tournament held as part of the World Series of Poker, LaPlante says he recognized the screen names of at least four of the hundred or so competitors as belonging to players who are rumored to have been banned from other sites for using what is called real-time assistance.

Starting point is 00:21:34 LaPlante credits some of the biggest online sites with doing a good job of policing their games, but he worries that as solvers become more ubiquitous, the balance of power will continue to shift towards those who cheat to gain an edge. The only thing I'm confident of, LaPlante says, is that it's going to get a lot worse very quickly. Well after midnight on the super high roller's second day, a German professional named Christoph Vogelsang

Starting point is 00:22:00 called a bet for all his chips with a king and a nine versus another player's ace and jack. According to the solvers, calling was in fact the correct play. All the same, Vogelsang lost the hand and was eliminated from the tournament in sixth place. Unlike a regular poker game where players can leave the table and cash in their chips whenever they feel like it, a poker tournament requires players to continue until they either lose everything or win every single chip in play. Prizes drawn from the pool created by all the buy-ins are paid out based on how long players manage to stay in the game. The person who ends with all the

Starting point is 00:22:36 chips is awarded the first place prize, $3.2 million in this tournament. The second to last survivor gets second place, $2 million, and so on down to the final in-the-money finisher, in this case fifth place, $630,000. Vogelsang and all the players who were eliminated before him received nothing. Given the small sample size of several hundred hands that a player will see over the course of three days, a single poker tournament is an incredibly inexact way of identifying the strongest player in the field. Luck will determine much of the outcome

Starting point is 00:23:11 for even the best players. If the 26 human players in the tournament were replaced with 26 perfectly programmed poker bots, one bot would win, and one would be the first to be eliminated, despite their sharing the same optimal strategy. Poker players tend to take the long view, speaking of tournament buy-ins as investments with a more or less predictable return when averaged over time.

Starting point is 00:23:35 In a relatively tough tournament, the worst players in the field are losing maybe as much as 30 or 40 percent of their buy-in, says Ike Haxton, who plays professionally. Stronger amateurs, he says, should expect to lose an average of about 15% of the money they put in, while the best pros will earn a return of around 5 to 10% over the long run. To dampen the huge swings of fortune that come in the short term, many professionals agree to swap percentages of any potential prize money with one another before the tournament starts. I agree to give you 5% of what I win, say, if you agree to give me 5% of what you win.
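The arithmetic behind those figures is simple enough to sketch. The example below applies the quoted long-run returns to a $250,000 entry and works through a 5% swap. Pairing those returns with this particular buy-in, and the swap scenario itself, are assumptions made for illustration.

```python
def expected_result(buy_in, roi_percent):
    """Long-run average profit per entry (negative means a loss)."""
    return buy_in * roi_percent / 100

def prize_after_swap(my_prize, partner_prize, swap_percent):
    """What I keep after trading swap_percent of my prize for the same share of my partner's."""
    return my_prize * (100 - swap_percent) / 100 + partner_prize * swap_percent / 100

BUY_IN = 250_000  # assumption: applying the quoted returns to the 250K buy-in

for label, roi in [("weak player", -40), ("strong amateur", -15), ("top pro", 10)]:
    print(label, expected_result(BUY_IN, roi))

# Hypothetical swap: I bust with nothing, my partner takes the $3.2 million first prize.
print(prize_after_swap(0, 3_200_000, 5))
```

A swap does not change a player's expected return; it only trades a slice of one's own rare big score for a slice of someone else's, which smooths out the swings.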

Starting point is 00:24:12 Or sell stakes in their future winnings to outside backers, like shares in an old-time whaling voyage. Seth Davies wouldn't tell me the exact details of his own arrangements, but he admitted that less than half of what he put into this tournament had come out of his own bankroll. Even so, after being knocked out on the first day and then paying a second $250,000 to re-enter, he had, quote,

Starting point is 00:24:36 well into six figures of his own money on the line. On the third and final day of the Super High Roller, the five remaining players were relocated from the dilapidated outer tables of the Amazon Room to a made-for-television set at its center. Stage lights brightly illuminated the poker table's gleaming green felt from above, while a 45-foot camera crane swung from side to side to get the best angle on the action.

Starting point is 00:25:02 All five players who had made it this far were guaranteed to turn a profit, but there was still a lot of maneuvering left to determine how far up the payout ladder they could climb. As the game got underway, the chip leader, a 27-year-old Spanish pro named Adrian Mateos, kept up a steady barrage of giant bets against the other players, asking them again and again whether this was the hand with which they wanted to make their final stand, or whether, perhaps, they would rather fold and wait for another player or two to bust out so that they could finish fourth or third instead of fifth and take home an additional $300,000 or $700,000 in prize money. Situations like these bend the

Starting point is 00:25:44 value of players' stacks in strange ways, depending on where they are in the payout hierarchy. Even a single chip can be worth an incredible amount of real money if another player is knocked out of the tournament after you've folded. There are solvers that can model these circumstances as well. But as the chip stacks get shorter relative to the size of the blind bets and antes players are required to put in the pot before each hand begins, flawless play alone offers no real insurance against what often becomes essentially a game of heads or tails. When it comes down to it, Davies says, you just end up running these million-dollar flips, and you hope you win. After one competitor was eliminated, Davies found himself with the shortest

Starting point is 00:26:26 chip stack at the table. With only one more person still to play behind him, he pushed all in with the ace and seven of clubs, just as the solvers said he should, given the size of his stack. The remaining player, a ponytailed Englishman named Ben Heath, quickly called and turned over a pair of jacks, making him a 67% favorite to win the hand. None of the five cards the dealer laid out improved Davies' hand, so Heath won the pot and Davies was eliminated in fourth place. He stood up from the table, collected his backpack and N95 mask, and left the stage.

Starting point is 00:26:58 That was some serious gambling up there, he told me. Davies at least had the satisfaction of knowing how closely his play over the last three days had hewed to the optimal strategy generated on his computer at home. Another consolation was the $930,791 in prize money he would receive for his fourth-place finish. Stowing his cash-out ticket in his pocket, Davies walked over to a nearby $50,000 buy-in tournament already underway. He had planned to get some dinner and rest a little before buying in, but he changed

Starting point is 00:27:30 his mind after seeing how many of the players here were the sort most likely to employ decidedly non-optimal strategies. This 50k looks incredible, Davies told me. I just couldn't not be in there right away. Not every player I spoke to is happy about the way AI-based approaches have changed the poker landscape. For one thing, while the tactics employed in most lower-stakes games today look pretty similar to those in use before the advent of solvers, higher-stakes competition has become much tougher. As optimal strategy has become more widely understood, the advantage in skill the very best players once held over the merely quite good players has narrowed considerably.

Starting point is 00:28:11 But for Doug Polk, who largely retired from poker in 2017 after winning tens of millions of dollars, the change solvers have wrought is more existential. I feel like it kind of killed the soul of the game, Polk says, changing poker, quote, from who can be the most creative problem solver to who can memorize the most stuff and apply it. Piotrek Lopusiewicz, the programmer behind PioSolver, counters by arguing that the new generation of AI tools is merely a continuation of a longer pattern of technological innovation in poker. Before the advent of solvers, top online players like Polk used software to collect data about their opponents' past play and analyze it for potential weaknesses. So now somebody brought a bigger firearm to the arms race, Lopusiewicz says,

Starting point is 00:28:59 and suddenly those guys who weren't in a position to profit were like, oh yeah, but we don't really mean that arms race. We just want our tools, not the better tools. Besides, for Lopusiewicz, solvers haven't so much changed poker as revealed its essence. Whether poker players themselves recognized it or wanted to, at its core, the game was always just the maximization problem John von Neumann revealed it to be. Today, everyone at a certain level is forced to respect the math side, Lopusiewicz says. They can't ignore it anymore.

Starting point is 00:29:39 This story was written and narrated by Keith Romer. To listen to more stories from The New York Times, The New Yorker, Vanity Fair, The Atlantic, and other publications on your smartphone, download Audm on the App Store or the Play Store. Visit audm.com, that's A-U-D-M dot com, for more details.
