The Daily - The Sunday Read: ‘How A.I. Conquered Poker’
Episode Date: February 6, 2022
If you didn’t think poker and artificial intelligence could be bedfellows, think again. Keith Romer delves into the history of man’s pursuit of the perfect game of poker, and explains how the use of A.I. is altering how it is played: individuals using an algorithmic “solver program” to analyze potential weaknesses about themselves and their opponents, thus gaining an advantage. While it feels futuristic, this desire to optimize poker isn’t new. Are these new generations of A.I. tools merely a continuation of a longer pattern of technological innovation in poker, or do they mark an irreversible structural shift? One thing’s for certain: The stakes are high. This story was written by Keith Romer. To hear more audio stories from publications like The New York Times, download Audm for iPhone or Android.
Transcript
My name is Keith Romer. I'm a contributor to the New York Times Magazine.
So I've been playing poker recreationally for a couple of decades,
not anywhere near on the level of the people in this story,
but enough to understand what they are trying to do and what they're trying to achieve.
I have probably played hundreds of thousands of hands of poker in my life,
and it is a game that
is fundamentally about decision-making, about when to bluff, when to call, when to raise.
And we do this over and over, hundreds or thousands of times in a single game of poker.
For a long time, poker strategy was something that you learned by reading books,
or you could watch other good players playing and see what they were doing,
and it was kind of an open question.
What is the right way to play a particular hand in a particular spot?
But now, poker has been solved to a great extent using purely mathematics.
What makes poker so difficult to analyze is the sheer scale of the game.
Game theorists will talk about games
in terms of the size of their game tree,
how many different decision points there are
throughout the course of a game.
In No Limit Texas Hold'em, just played with two players,
the size of the game tree has more branches in it than
the number of atoms in the universe. This story is about professional poker players and the way
that they have learned from artificial intelligence how to play the game that they make their living
from in a way that approaches theoretical perfection. To see all of this in action, I got on a plane for the first
time since the pandemic started, and I flew out to Las Vegas to watch people play for hundreds
of thousands of dollars, the best players in the world trying their best to play perfect poker.
So here's my article.
Last November, in the cavernous Amazon Room of Las Vegas' Rio Casino, two dozen men,
dressed mostly in sweatshirts and baseball caps, sat around three well-worn poker tables playing Texas Hold'em. Occasionally,
a few passersby stopped to watch the action, but otherwise the players pushed their chips back and
forth in dingy obscurity. Except for the taut electric stillness with which they held themselves
during a hand, there was no outward sign that these were the greatest poker players in the world,
nor that they were, as the poker saying goes, playing for houses, or at least hefty down payments. This was the first day of a three-day
tournament whose official name was the World Series of Poker Super High Roller, though the
participants simply called it the 250K, after the $250,000 each had put up to enter it.
At one table, a professional player named Seth Davies covertly peeled up the edges of his cards
to consider the hand he had just been dealt,
the six and seven of diamonds.
Over several hours of play,
Davies had managed to grow his starting stack
of 1.5 million in tournament chips
to well over 2 million,
some of which he now slid forward as a raise.
A 33-year-old former college baseball player with a trimmed light brown beard,
Davies sat upright, intensely following the action as it moved around the table.
Two men called his bet before Dan Smith, a fellow pro with a round face,
mustache, and whimsically worn cowboy hat, put in a hefty re-raise.
Only Davies called. The dealer laid out a king, four, and five, all clubs, giving Davies a straight
draw. Smith checked, bet nothing. Davies bet. Smith called. The turn card was the deuce of diamonds,
missing Davies' draw. Again Smith checked. Again Davies bet. Again Smith called.
The last card dealt was the deuce of clubs, one final blow to Davies' hopes of improving his hand.
By now, the pot at the center of the faded green felt-covered table had grown to more than a
million in chips. The last deuce had put four clubs on the table, which meant that if Smith had even one club in his hand, he would make a flush.
Davies, who had been betting the whole way,
needing an eight or a three to turn his hand into a straight,
had arrived at the end of the hand with precisely nothing.
After Smith checked a third time, Davies considered his options for almost a minute
before declaring himself all in for 1.7 million in chips.
If Smith called, Davies would be out of the tournament,
his $250,000 entry fee incinerated in a single ill-timed bluff.
Smith studied Davies from under the brim of his cowboy hat,
then twisted his face in exasperation at Davies,
or perhaps at luck itself. Finally, his features settling in an irritated scowl,
Smith folded, and the dealer pushed the pile of multicolored chips Davies' way.
According to Davies, what he felt when the hand was over was not so much triumph as relief.
You're playing a pot that's effectively worth a half million dollars in real money,
he said afterwards.
It's just so much goddamn stress.
Real validation wouldn't come until around 2:30 that morning,
after the first day of the tournament had come to an end
and Davies had made the 15-minute drive from the Rio to his home outside Las Vegas.
There, in an office just in from the garage,
he opened a computer program called PioSolver,
one of a handful of artificial intelligence-based tools
that have, over the last several years,
radically remade the way poker is played,
especially at the highest levels of the game.
Davies input all the details of the hand and then set the program to run.
In moments, the solver generated an optimal strategy.
Mostly, the program said, Davies had gotten it right.
His bet on the turn when the deuce of diamonds was dealt
should have been 80% of the pot instead of 50%,
but the 1.7 million chip bluff on the river was the right play.
That feels really good, Davies said, even more than winning
a huge pot. The real satisfying part is when you nail one like that. Davies went to sleep that
night knowing for certain that he had played the hand within a few degrees of perfection.
The pursuit of perfect poker goes back at least as far as the 1944 publication of Theory of Games and Economic Behavior
by the mathematician John von Neumann
and the economist Oskar Morgenstern.
The two men wanted to correct what they saw
as a fundamental imprecision in the field of economics.
We wish, they wrote,
to find the mathematically complete principles
which define rational behavior
for the participants in a social economy
and to derive from them the general characteristics of that behavior. Economic life, they suggested,
should be thought of as a series of maximization problems in which individual actors compete to
wring as much utility as possible from their daily toil. If von Neumann and Morgenstern could quantify the way good
decisions were made, the idea went, they would then be able to build a science of economics
on firm ground. It was this desire to model economic decision-making that led them to
gameplay. Von Neumann rejected most games as unsuitable to the task, especially those like
checkers or chess,
in which both players can see all the pieces on the board and share the same information.
Real life is not like that, he explained to Jacob Bronowski, a fellow mathematician.
Real life consists of bluffing, of little tactics of deception, of asking yourself,
what is the other man going to think I mean to do? And that is what games are about in my theory.
Real life, von Neumann thought, was like poker.
Using his own simplified version of the game,
in which two players were randomly dealt secret numbers
and then asked to make bets of a predetermined size on whose number was higher,
von Neumann derived the basis for an optimal strategy.
Players should bet large both with their very best hands and, as bluffs, with some definable
percentage of their very worst hands. The percentage changed depending on the size of
the bet relative to the size of the pot. Von Neumann was able to demonstrate that by bluffing
and calling at mathematically precise
frequencies, players would do no worse than break even in the long run, even if they provided their
opponents with an exact description of their strategy. And if their opponents deployed any
strategy against them other than the perfect one von Neumann had described, those opponents were guaranteed to lose, given a large enough sample.
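As a rough illustration of the arithmetic behind such mixed strategies, here is a minimal sketch in Python. It assumes a common textbook simplification rather than von Neumann's exact game: one player bets a fixed amount into a pot holding either a very strong hand or a bluff, and the other player can only call or fold. The function names are hypothetical.

```python
from fractions import Fraction


def bluff_fraction(pot, bet):
    """Share of the bettor's bets that should be bluffs (toy model).

    The caller risks `bet` to win `pot + bet`, so calling breaks even when
    x * (pot + bet) - (1 - x) * bet == 0, which gives x = bet / (pot + 2 * bet).
    """
    return Fraction(bet, pot + 2 * bet)


def call_frequency(pot, bet):
    """How often the caller should call so that a bluff breaks even (toy model).

    A bluff risks `bet` to win `pot`, so it breaks even when
    (1 - c) * pot - c * bet == 0, which gives c = pot / (pot + bet).
    """
    return Fraction(pot, pot + bet)


if __name__ == "__main__":
    pot = 100
    for bet in (50, 100, 200):  # half pot, full pot, twice the pot
        print(bet, bluff_fraction(pot, bet), call_frequency(pot, bet))
```

In this toy model, a larger bet relative to the pot supports a larger share of bluffs, which matches the point that the percentage changes with the size of the bet relative to the size of the pot.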
Theory of Games pointed the way to a future in which all manner of competitive interactions
could be modeled mathematically. Auctions, submarine warfare, even the way species compete
to pass their genes on to future generations. But in strategic terms, poker itself barely advanced in response to von Neumann's proof
until it was taken up by members of the Department of Computing Science
at the University of Alberta more than five decades later.
The early star of the department's game research was a professor named Jonathan Schaeffer,
who, after 18 years of work, discovered the solution to checkers. Alberta
faculty and students also made significant progress on games as diverse as Go, Othello,
StarCraft, and the Canadian pastime of curling. Poker, though, remained a particularly thorny
problem, for precisely the reason von Neumann was attracted to it in the first place—the
way hidden information in the game
acts as an impediment to good decision-making.
Unlike in chess or backgammon,
in which both players' moves
are clearly legible on the board,
in poker, a computer has to interpret its opponent's bets
despite never being certain what cards they hold.
Neil Burch, a computer scientist
who spent nearly two decades working on poker
as a graduate student and researcher at Alberta before joining an artificial intelligence company called DeepMind, characterizes the team's early attempts as pretty unsuccessful.
What we found was if you put a knowledgeable poker player in front of the computer and let them poke at it, he says, the program got crushed, absolutely smashed.
Partly, this was just a function of the difficulty of modeling all the decisions involved in playing a hand of poker. Game theorists use a
diagram of a branching tree to represent the different ways a game can play out. In a straightforward
one, like rock, paper, scissors, the tree is small. Three branches for the rock, paper, and scissors
you can play, each with three subsequent branches for the rock, paper, and scissors your opponent
can play. The more complicated the game, the larger the tree becomes. For even a simplified
version of Texas Hold'em, played heads-up, that is, between just two players, and with bets fixed
at a predetermined size, a full game tree contains 316 quadrillion branches.
The tree for no-limit Hold'em, in which players can bet any amount, has even more than that.
It really does get truly enormous, Burch says, like larger than the number of atoms in the universe.
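The rock, paper, scissors tree described above is small enough to build in full. The following is a minimal sketch in Python; the nested-dictionary representation and the function names are illustrative assumptions, not how any particular poker program stores its tree.

```python
MOVES = ("rock", "paper", "scissors")


def build_tree(players_left):
    """Return a nested dict with one branch per move for each player still to act."""
    if players_left == 0:
        return {}  # a leaf: nobody is left to act
    return {move: build_tree(players_left - 1) for move in MOVES}


def count_leaves(tree):
    """Count the distinct ways the game can play out."""
    if not tree:
        return 1
    return sum(count_leaves(subtree) for subtree in tree.values())


tree = build_tree(2)
# Three first-level branches, each with three follow-up branches.
print(len(tree), count_leaves(tree))
```

Each additional decision point multiplies the number of leaves, which is why trees for real poker variants grow so quickly.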
At first, the Alberta group's approach was to try to shrink the game to a more manageable scale,
crudely bucketing hands together that were more or less alike,
treating a pair of nines and a pair of tens, say, as if they were identical.
But as the field of artificial intelligence grew more robust,
and as the team's algorithms became better tuned to the intricacies of poker,
its programs began to improve.
Crucial to this development
was an algorithm called counterfactual regret minimization.
Computer scientists tasked their machines
with identifying poker's optimal strategy
by having the programs play against themselves
billions of times
and take note of which decisions in the game tree
had been least profitable,
the regrets, which the AI would learn to minimize in future iterations by making other,
better choices. In 2015, the Alberta team announced its success by publishing an article
in Science titled Heads-Up Limit Hold'em Poker Is Solved. For some players, especially those who
made a living playing that variant
of poker online, the Alberta group's triumph represented a serious threat to their livelihood.
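The self-play idea behind counterfactual regret minimization can be sketched on a much smaller game. The Python below is not the Alberta team's code and is not full CFR on a poker tree. It is plain regret matching, the per-decision update rule that CFR builds on, applied to rock, paper, scissors. It uses expected payoffs instead of sampled hands so that the run is deterministic, and it gives one player a starting bias toward rock (an assumption made only so there is something to learn). All names are hypothetical.

```python
# Payoff to a player choosing action `a` against an opponent choosing `b`,
# with 0 = rock, 1 = paper, 2 = scissors.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def strategy_from_regrets(regrets):
    """Regret matching: play each action in proportion to its positive regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total == 0:
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]


def train(iterations, initial_regrets=(1.0, 0.0, 0.0)):
    """Run self-play and return each player's average strategy."""
    regrets = [list(initial_regrets), [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0] * 3, [0.0] * 3]
    for _ in range(iterations):
        strategies = [strategy_from_regrets(r) for r in regrets]
        for player in (0, 1):
            own = strategies[player]
            opponent = strategies[1 - player]
            # Expected payoff of each pure action against the opponent's current mix.
            action_values = [
                sum(PAYOFF[a][b] * opponent[b] for b in range(3)) for a in range(3)
            ]
            mix_value = sum(own[a] * action_values[a] for a in range(3))
            for a in range(3):
                # Regret: how much better action `a` would have done than the mix played.
                regrets[player][a] += action_values[a] - mix_value
                strategy_sums[player][a] += own[a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


if __name__ == "__main__":
    for average in train(20000):
        print([round(p, 3) for p in average])
```

Over many iterations the average strategies drift toward playing each move about a third of the time, the unexploitable mix for this game. Poker solvers apply the same kind of update at every decision point of a vastly larger tree.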
I remember when we read about it, says the former professional Terrence Chan,
we were just like, oh, good game. It's been a fun ride. It quickly became clear that academics
were not the only ones interested in computers' ability to discover
optimal strategy. One former member of the Alberta team who asked me not to name him,
citing confidentiality agreements with a software company that currently employs him,
told me that he had been paid hundreds of thousands of dollars to help poker players
develop software that would identify perfect play and to consult with programmers building
bots that would be capable of defeating humans in online games. Players unable to front that kind of money didn't have
to wait long before gaining more affordable access to AI-based strategies. The same year
that Science published the Limit Hold'em article, a Polish computer programmer and former online
player named Piotrek Lopusiewicz began selling the first version of his application, PioSolver.
For $249, players could download a program
that approximated the solutions
for the far more complicated, no-limit version of the game.
As of 2015, a practical actualization
of John von Neumann's mathematical proof
was available to anyone with a powerful enough personal computer.
One of the earliest and most devoted adopters
of what has come to be known as game theory optimal poker
is Seth Davies's friend and poker mentor, Jason Koon.
On the second day of the three-day Super High Roller tournament,
I visited Koon at his multimillion-dollar house,
located in a gated community inside a larger gated community
next to a Jack Nicklaus-designed golf course.
On day one, Koon paid $250,000 to play the Super High Roller,
then a second $250,000 after he was knocked out four hours in.
But again, he lost all his chips.
Welcome to the world of nosebleed tourneys, he texted me afterward.
Just have to play your best. It evens out.
For Koon, evening out has taken the
form of more than $30 million in in-person tournament winnings, and, he says, at least
as much from high-stakes cash games in Las Vegas and Macau, the Asian gambling mecca. Koon began
playing poker seriously in 2006 while rehabbing an injury at West Virginia Wesleyan College,
where he was a sprinter on the track team.
He made a good living from cards,
but he struggled to win consistently in the highest stakes games.
I was a pretty mediocre player, pre-solver, he says.
But the second solvers came out, I just buried myself in this thing. And I started to improve, like, rapidly, rapidly, rapidly, rapidly.
In a home office, decorated mostly with trophies
from poker tournaments he has won,
Koon turned to his computer
and pulled up a hand on PioSolver.
After specifying the size
of the players' chip stacks
and the range of hands they would play
from their particular seats at the table,
he entered a random three-card flop
that both players would see.
A 13-by-13 grid illustrated
all the possible hands
one of the players could hold. Koon hovered
his mouse over the square for an ace and queen of different suits. The solver indicated that Koon
should check 39% of the time, make a bet equivalent to 30% the size of the pot 51% of the time,
and bet 70% of the pot the rest of the time. This von Neumann-esque mixed strategy would simultaneously
maximize his profit and disguise the strength of his hand. Thanks to tools like PioSolver,
Koon has remade his approach to the game, learning what size bets work best in different situations.
Sometimes, tiny ones, one-fifth or even one-tenth the size of the pot, are ideal. Other times, giant bets,
two or three times the size of the pot, are correct. And while good poker players have
always known that they need to maintain a balance between bluffing and playing it straight,
solvers define the precise frequency with which Koon should employ one tactic or the other
and identify the sometimes surprising best and worst hands to
bluff with, depending on the cards in play. Erik Seidel, a pro who learned the game in the 1980s,
told me that if players like Koon traveled back in time just 15 years with today's knowledge,
they would crush the best players of that era. I think also that all the people in the game
would think that they were fish, Seidel said,
using the poker argot for bad players. There are a lot of really strange plays now that these guys
are making that are effective, but if people saw them back in the day, I think that they'd be
invited into the game every night. Against weaker players, Koon will sometimes intentionally diverge
from theoretically perfect poker, bluffing more than he should, or betting large when the AI says he should bet small, to take advantage of his opponents' mistakes.
But against the best professionals, he will mostly just do his best to replicate the solver's
decisions to the extent that he is able to remember the AI's preferred bet sizes and the
frequencies with which to employ them. Because he knows his own human biases can creep into his decision-making,
Koon will often randomly select which of the solver's tactics to employ in a given hand.
He'll glance down at the second hand on his watch or at a poker chip to note the orientation of the
casino logo as if it were a clock face in order to generate a percentage between 1 and 100.
The higher the percentage, the more aggressive the action he'll take.
I'll say, okay, well, I just rolled 9 o'clock,
so that's 75%.
That's a pretty aggressive number.
In that instance, Koon might choose the largest
of the solver's approved bet sizes for his hand,
whereas if the second hand pointed to 3 o'clock,
or 25%, he might have checked.
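The watch trick amounts to turning a clock position into a percentage and reading an action off a cumulative scale. Here is a minimal sketch in Python. It borrows the frequencies from the earlier ace-queen example (check 39%, bet 30% of the pot 51%, bet 70% of the pot the remaining 10%) and orders the actions from least to most aggressive. The function names and the exact mapping are assumptions for illustration, not a description of how any player actually does the arithmetic at the table.

```python
# Actions ordered from least to most aggressive, with the solver's frequencies
# in percent (taken from the ace-queen example; they sum to 100).
MIX = [
    ("check", 39),
    ("bet 30% of pot", 51),
    ("bet 70% of pot", 10),
]


def clock_to_percent(seconds):
    """Map the position of a watch's second hand (0-59) to a percentage."""
    return (seconds % 60) / 60 * 100


def choose_action(percent, mix=MIX):
    """Pick the action whose cumulative frequency band contains `percent`."""
    cumulative = 0
    for action, weight in mix:
        cumulative += weight
        if percent < cumulative:
            return action
    return mix[-1][0]  # only reached if percent is 100 or more


if __name__ == "__main__":
    for seconds in (15, 45, 57):  # 3 o'clock, 9 o'clock, just before 12
        percent = clock_to_percent(seconds)
        print(seconds, percent, choose_action(percent))
```

With this particular mix, 3 o'clock (25%) lands in the checking band and 9 o'clock (75%) lands in the small-bet band. A hand whose mix leans more heavily on large bets would map 75% to a bigger bet.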
Using optimal strategy is no guarantee, of course, that Koon will win any particular hand. Given enough hands, however,
the math says he should do no worse than break even, and will in practice do much better than
that, depending on how far his opponents' strategies diverge from theoretically perfect play.
If he were to play thousands of hands against the solver, Koon says, it's going to win, I promise.
Koon is quick to point out that even with access to the solver's perfect strategy,
poker remains an incredibly difficult game to play well. The emotional swings that come from
winning or losing giant pots, and the fatigue of 12-hour sessions, remain the same challenges as
always. But now top players have to put in significant work away from the tables to succeed.
Like most top pros, Koon spends a good part of each week studying different situations that
might arise, trying to understand the logic behind the program's choices. Solvers can't tell you why
they do what they do. They just do it, he says. So now it's on
the poker player to figure out why. The best players are able to reverse-engineer the AI's
strategy and create heuristics that apply to hands and situations similar to the one they're studying.
Even so, they are working with immense amounts of information. When I suggested to Koon that it
was like endlessly rereading a 10,000-page book in order to keep as much of it in his head as
possible, he immediately corrected me. A 100,000-page book. The game is so damn hard.
In fact, the store of data Koon draws on is even larger than that. He rents nearly 200 terabytes
of cloud storage for the game trees he has developed
since he started working with solvers. Players sitting down to in-person games have no way to
access all that information at the table. But that limitation does not necessarily apply to
poker played over the internet. Automated bots, especially in low-stakes games, have been a
problem for internet poker since before the rise of solvers. But now,
human players willing to skirt the rules can look up AI strategies on one screen and then use them
to play optimally on a second screen. Anytime there are high stakes and a lot of money to be
won and a device that might be used for good, Koon says, people have a way to turn it into a cheating
tool. Koon isn't especially worried that people are cheating
in the games he plays over the internet,
but other players aren't so sure.
It's the main reason why I don't really play much online anymore,
a pro named Ryan LaPlante says.
In a recent $7,000 buy-in online tournament
held as part of the World Series of Poker,
LaPlante says he recognized the screen names
of at least four of the hundred or so competitors
as belonging to players who are rumored
to have been banned from other sites
for using what is called real-time assistance.
LaPlante credits some of the biggest online sites
with doing a good job of policing their games,
but he worries that as solvers become more ubiquitous,
the balance of power will continue to shift towards those who cheat to gain an edge.
The only thing I'm confident of, LaPlante says,
is that it's going to get a lot worse very quickly.
Well after midnight on the Super High Roller's second day,
a German professional named Christoph Vogelsang
called a bet for all his chips with a king and a nine
versus another player's ace and
jack. According to the solvers, calling was in fact the correct play. All the same, Vogelsang
lost the hand and was eliminated from the tournament in sixth place. Unlike a regular
poker game where players can leave the table and cash in their chips whenever they feel like it,
a poker tournament requires players to continue until they either lose everything
or win every single chip in play. Prizes drawn from the pool created by all the buy-ins are
paid out based on how long players manage to stay in the game. The person who ends with all the
chips is awarded the first place prize, $3.2 million in this tournament. The second to last
survivor gets second place, $2 million, and so on down to
the final in-the-money finisher, in this case fifth place, $630,000. Vogelsang and all the
players who were eliminated before him received nothing. Given the small sample size of several
hundred hands that a player will see over the course of three days, a single poker tournament
is an incredibly inexact way
of identifying the strongest player in the field.
Luck will determine much of the outcome
for even the best players.
If the 26 human players in the tournament
were replaced with 26 perfectly programmed poker bots,
one bot would win,
and one would be the first to be eliminated,
despite their sharing the same optimal strategy.
Poker players tend to take the long view, speaking of tournament buy-ins as investments
with a more or less predictable return when averaged over time.
In a relatively tough tournament, the worst players in the field are losing maybe as much
as 30 or 40 percent of their buy-in, says Ike Haxton, who plays professionally.
Stronger amateurs,
he says, should expect to lose an average of about 15% of the money they put in,
while the best pros will earn a return of around 5 to 10% over the long run.
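To put those percentages in dollar terms, the short Python sketch below applies them to the $250,000 entry fee of the tournament in this story. The function name is hypothetical. The figures are long-run averages, so they say nothing about the result of any single tournament.

```python
BUY_IN = 250_000  # entry fee of the Super High Roller, in dollars


def expected_profit(buy_in, roi_percent):
    """Average long-run profit per entry, given a return expressed in percent."""
    return buy_in * roi_percent / 100


# Returns quoted above, in percent of the buy-in.
QUOTED_RETURNS = {
    "worst players (low end)": -40,
    "worst players (high end)": -30,
    "stronger amateurs": -15,
    "best pros (low end)": 5,
    "best pros (high end)": 10,
}

if __name__ == "__main__":
    for label, roi in QUOTED_RETURNS.items():
        print(label, expected_profit(BUY_IN, roi))
```

On these numbers, even the best pros average only $12,500 to $25,000 per $250,000 entry, small next to the swings of a single event.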
To dampen the huge swings of fortune that come in the short term, many professionals agree to
swap percentages of any potential prize money with one another before the tournament starts
(I agree to give you 5% of what I win, say, if you agree to give me 5% of what you win),
or sell stakes in their future winnings to outside backers, like shares in an old-time
whaling voyage.
Seth Davies wouldn't tell me the exact details of his own arrangements, but he admitted that
less than half of what he put into this tournament
had come out of his own bankroll.
Even so, after being knocked out on the first day
and then paying a second $250,000 to re-enter,
he had, quote,
well into six figures of his own money on the line.
On the third and final day of the Super High Roller,
the five remaining players were relocated
from the dilapidated outer tables of the Amazon Room
to a made-for-television set at its center.
Stage lights brightly illuminated the poker table's gleaming green felt from above,
while a 45-foot camera crane swung from side to side
to get the best angle on the action.
All five players who had made it this far were guaranteed
to turn a profit, but there was still a lot of maneuvering left to determine how far up the
payout ladder they could climb. As the game got underway, the chip leader, a 27-year-old Spanish
pro named Adrian Mateos, kept up a steady barrage of giant bets against the other players, asking
them again and again whether this was the hand with which
they wanted to make their final stand, or whether, perhaps, they would rather fold and wait for
another player or two to bust out so that they could finish fourth or third instead of fifth
and take home an additional $300,000 or $700,000 in prize money. Situations like these bend the
value of players' stacks in strange ways, depending on
where they are in the payout hierarchy. Even a single chip can be worth an incredible amount
of real money if another player is knocked out of the tournament after you've folded.
There are solvers that can model these circumstances as well. But as the chip stacks
get shorter relative to the size of the blind bets and antes players are required to put in the pot before each hand begins,
flawless play alone offers no real insurance against what often becomes essentially a game of heads or tails.
When it comes down to it, Davies says, you just end up running these million-dollar flips, and you hope you win.
After one competitor was eliminated, Davies found himself with the shortest
chip stack at the table. With only one more person still to play behind him, he pushed all in with
the ace and seven of clubs, just as the solvers said he should, given the size of his stack.
The remaining player, a ponytailed Englishman named Ben Heath, quickly called and turned over
a pair of jacks, making him a 67% favorite to win the hand.
None of the five cards the dealer laid out improved Davies' hand,
so Heath won the pot and Davies was eliminated in fourth place.
He stood up from the table, collected his backpack and N95 mask,
and left the stage.
That was some serious gambling up there, he told me.
Davies at least had the satisfaction of knowing
how closely his play over
the last three days had hewed to the optimal strategy generated on his computer at home.
Another consolation was the $930,791 in prize money he would receive for his fourth-place finish.
Stowing his cash-out ticket in his pocket, Davies walked over to a nearby $50,000 buy-in tournament
already underway. He had planned to get some dinner and rest a little before buying in, but he changed
his mind after seeing how many of the players here were the sort most likely to employ decidedly
non-optimal strategies. This 50k looks incredible, Davies told me. I just couldn't not be in there
right away. Not every player I spoke to is happy
about the way AI-based approaches have changed the poker landscape. For one thing, while the
tactics employed in most lower-stakes games today look pretty similar to those in use before the
advent of solvers, higher-stakes competition has become much tougher. As optimal strategy has become
more widely understood, the advantage in skill the
very best players once held over the merely quite good players has narrowed considerably.
But for Doug Polk, who largely retired from poker in 2017 after winning tens of millions of dollars,
the change solvers have wrought is more existential. I feel like it kind of killed the soul of the
game, Polk says, changing poker, quote, from who can be the most creative problem solver to who can memorize the most stuff and apply it.
Piotrek Lopusiewicz, the programmer behind PioSolver, counters by arguing that the new generation of AI tools is merely a continuation of a longer pattern of technological innovation in poker.
Before the advent of solvers, top online players like Polk used software to collect data about their opponents' past play
and analyze it for potential weaknesses.
So now somebody brought a bigger firearm to the arms race, Lopusiewicz says,
and suddenly those guys who weren't in a position to profit were like,
oh yeah, but we don't really mean that arms race.
We just want our tools, not the better tools.
Besides, for Lopusiewicz, solvers haven't so much changed poker as revealed its essence.
Whether poker players themselves recognized it or wanted to,
at its core, the game was always just the maximization problem John von Neumann revealed it to be.
Today, everyone at a certain level is forced to respect the math side, Lopusiewicz says.
They can't ignore it anymore.
This story was written and narrated by Keith Romer.
To listen to more stories from The New York Times, The New Yorker, Vanity Fair, The Atlantic,
and other publications on your smartphone, download Audm on the App Store or the Play Store.
Visit Audm, that's A-U-D-M dot com, for more details.