approximating game-theoretic optimal strategies for full scale poker (darse billings ++ 2003 …)

Approximating Game-Theoretic Optimal Strategies for Full Scale Poker

(Darse Billings++ 2003 …)

Presented by Brett Borghetti

21 Jan 2007

21 Feb 2007 Brett Borghetti

Contributions of the work:

• Reduced 2 player Hold’em gamespace from O(10^18) to O(10^7) using approximations

• Built a new pokerbot capable of competing with world-class human opponents

• [Brett says] Developed a solution for mixed strategy equilibrium in a ‘model’ (approximation) of full hold’em poker


Interesting Experiments

• Played fairly well against world class player (Gautam Rao)

• Although ‘thecount’ won the match, statistically the outcome of this match does not indicate which player is better overall


Approach for reducing the gamespace

• Betting Round Reduction (actions per round)

• Elimination of Betting Rounds (rounds per hand)

• Splitting the hand into multiple abstract subgames

• Bucketing of (approximate) equivalence classes of cards


Betting Round Reduction

• Normally, up to 4 legitimate raises are allowed in 2 player Hold’em

• Reduction allowed only 3 legitimate raises to be considered

• Reduces branching factor from 9 to 7

• Experiments showed that this reduction did not significantly reduce EV or perturb strategy

• Reducing to 2 legitimate raises did perturb EV and strategy significantly


Elimination of Betting Rounds

• Explored truncation (treating Hold’em as an n-round game instead of a 4 round game)

– Used EV rollouts for the remaining rounds (assumed all players checked or called in the truncated rounds)

– Explored truncating early rounds and later rounds

• Combined several truncations

– PsOpti1 uses 1-round pre-flop model plus a post-flop model

– PsOpti2 uses 2 overlapping 3 round models (pre-flop through turn and flop through river)
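A minimal sketch of the rollout idea at a truncation point, assuming no more money enters the pot once the model is cut off (everyone checks or calls for free). The function name, its parameters, and the even split on ties are illustrative assumptions, not details taken from the paper.

```python
def rollout_ev(pot, invested, p_win, p_tie=0.0):
    """Expected net result for one player at a truncated leaf.

    pot      -- total chips in the pot when the model is cut off
    invested -- chips this player has already put into that pot
    p_win    -- probability of winning the showdown after the remaining cards
    p_tie    -- probability of a split pot (assumed to be split evenly)
    """
    if not 0.0 <= p_win + p_tie <= 1.0:
        raise ValueError("p_win + p_tie must lie in [0, 1]")
    # No further betting is modelled, so the pot size stays fixed.
    return p_win * pot + p_tie * pot / 2.0 - invested


# A coin-flip showdown over an evenly funded pot is worth nothing on average.
print(rollout_ev(pot=8, invested=4, p_win=0.5))
```

The leaf value is what a solver would see in place of the subtree for the truncated rounds.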


Integration of Truncated Models


Bucketing

• Trying to reduce cardspace via equivalence classes with respect to how to bet and how much the cards are worth (EV)

• Built a 2-d graph (Hand Strength vs Hand Potential)

• Choose N ‘buckets’ (the number of clusters to break up the neighborhoods in the graph)

• Explored performance for different values of N and chose

– N-1 buckets of varying hand strength and low potential

– 1 bucket for low hand strength, high potential cards

• Used transition probabilities to give likelihoods of transitioning between one bucket and another after revealing the next card(s) on the board
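The bucketing scheme above can be sketched roughly as follows. The cutoff values, the equal-width strength bands, and both function names are hypothetical choices for illustration; the slides do not say how the actual bucket boundaries were picked.

```python
from collections import Counter, defaultdict


def assign_bucket(hand_strength, hand_potential, n_buckets,
                  strength_cutoff=0.5, potential_cutoff=0.3):
    """Map a hand to a bucket index in range(n_buckets).

    The last bucket is reserved for weak hands with high potential;
    the other n_buckets - 1 split hand strength into equal-width bands.
    Both cutoffs are illustrative assumptions.
    """
    if hand_strength < strength_cutoff and hand_potential >= potential_cutoff:
        return n_buckets - 1
    strength_buckets = n_buckets - 1
    index = int(hand_strength * strength_buckets)
    return min(index, strength_buckets - 1)  # hand_strength == 1.0 stays in range


def transition_probabilities(observed_pairs):
    """Estimate P(next bucket | current bucket) from (current, next) samples."""
    counts = defaultdict(Counter)
    for src, dst in observed_pairs:
        counts[src][dst] += 1
    return {
        src: {dst: c / sum(row.values()) for dst, c in row.items()}
        for src, row in counts.items()
    }


print(assign_bucket(0.95, 0.05, n_buckets=6))  # strong made hand
print(assign_bucket(0.20, 0.40, n_buckets=6))  # weak hand with a good draw
```

In this sketch the transition table would be filled by dealing sample boards and recording which bucket each hand moves to once the next card(s) are revealed.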


Pseudo-Optimal Play

• With the approximated game tree, they used a powerful LP solver (CPLEX with the Barrier method & 2 GB RAM) to solve the linear program for equilibrium play

– Calculation took ‘less than a day’ of computing

• Produced a large lookup table of probability triples for each bucket in each possible condition <P(fold),P(call),P(raise)> which sum to 1

• Play a mixed strategy by randomly choosing one action according to the distribution.
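Playing from the lookup table amounts to sampling one action per decision. A minimal sketch, assuming each table entry is stored as a (P(fold), P(call), P(raise)) tuple; the names ACTIONS and choose_action are illustrative, not from the paper.

```python
import random

ACTIONS = ("fold", "call", "raise")


def choose_action(triple, rng=random):
    """Sample one action from a <P(fold), P(call), P(raise)> triple."""
    if len(triple) != 3 or any(p < 0 for p in triple):
        raise ValueError("expected three non-negative probabilities")
    if abs(sum(triple) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # random.choices draws with replacement according to the given weights.
    return rng.choices(ACTIONS, weights=triple, k=1)[0]


# A degenerate triple always produces the same action.
print(choose_action((0.0, 0.0, 1.0)))  # raise
```

Passing a seeded random.Random instance as rng makes a session reproducible, which helps when replaying hands for analysis.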


Issues [Brett]

• Only works with 2 players (the authors list an N-player version as future work)

• Does not contain an explicit opponent model that attempts to exploit its current opponent


Background Information


Texas Hold’em Heads-up Limit Poker Basics

• 2 Players

• 4 Betting Rounds per hand

– Preflop (2 hole cards), Flop (3 community cards), Turn (1 cc), River (1 cc)

• Action set = {fold, call(check), raise(bet)}

• Up to 3 raises allowed per round

• Round is over when either

– All players are even in the pot via a final call and each player has had at least one opportunity to act [go on to next round]

– One player folds [other player wins]


Requirements for a World Class Poker Player

• Able to assess

– Hand Strength

– Hand Potential

– Opponent’s Betting Strategy (opponent model)

• Has a strong

– Betting strategy

– Ability to play deceptively [bluff vs. slow play*]

– Ability to play unpredictably


Optimal vs Maximal play

• Optimal player makes decisions based on game-theoretic probabilities without regard to specific context (opponent’s plays)

• Maximal player takes into account the opponent’s sub-optimal tendencies and adjusts its play to exploit perceived weaknesses


Hand Assessment (Hand Strength = HS)

• Pre-Flop HS is determined from the “income rate” of each of the 169 equivalence classes, estimated from 1M simulated poker hands

• Flop HS is determined by comparing each of the 1081 possible opponent hands with ours and counting how many wins each player has
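The two counts quoted above can be checked with a few lines of arithmetic. The hand_strength helper is a sketch only: counting a tie as half a win is an assumption, since the slide only speaks of wins for each player.

```python
from math import comb

RANKS = 13

# Pre-flop classes: pocket pairs, suited non-pairs, offsuit non-pairs.
pairs = RANKS
suited = comb(RANKS, 2)
offsuit = comb(RANKS, 2)
preflop_classes = pairs + suited + offsuit

# On the flop we have seen our 2 hole cards and 3 board cards.
unseen_cards = 52 - 2 - 3
opponent_hands = comb(unseen_cards, 2)


def hand_strength(ahead, tied, behind):
    """Fraction of opponent holdings we currently beat (ties count half)."""
    total = ahead + tied + behind
    return (ahead + tied / 2.0) / total


print(preflop_classes)  # 169
print(opponent_hands)   # 1081
```

The ahead, tied, and behind tallies would come from ranking our hand against each of the 1081 opponent holdings with a hand evaluator, which is omitted here.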


Hand Potential (HP) at the Flop

• PPot1 = likelihood that our hand will improve with one card (the turn card)

• PPot2 = likelihood that our hand will improve with two cards (turn and river)

• NPot1 and 2 = equivalent calculations of likelihood that our opponent’s hand will get better than ours on the turn and/or river


Effective Hand Strength & Pot Odds

• EHS = HS_n + (1 - HS_n) × PPot_n

– The chance that we either are ahead or could pull ahead by the end of n=1 or n=2 cards from now

• Pot odds = P(win)/(Expected Return on Pot)

– Example: if your chance of winning is 25%, you would call a $4 bet to win a $16 pot because your earnings are 0.25*$20 = $5 and hence you can expect to win $5 every time you pay $4 for an expected net gain of $1.00 per play.
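Both formulas on this slide are short enough to check by hand or in code. A minimal sketch with illustrative function names; the EHS inputs in the usage line are made-up numbers, while the pot odds call reproduces the $4 into $16 example.

```python
def effective_hand_strength(hs, ppot):
    """EHS = HS_n + (1 - HS_n) * PPot_n."""
    return hs + (1.0 - hs) * ppot


def expected_net_gain(p_win, pot, call_cost):
    """Average profit per call: win (pot + our call) with p_win, always pay the call."""
    return p_win * (pot + call_cost) - call_cost


# Hypothetical hand: ahead half the time, 20% chance to improve when behind.
print(effective_hand_strength(0.5, 0.2))

# Slide example: 25% to win, $4 to call, $16 already in the pot.
print(expected_net_gain(0.25, pot=16, call_cost=4))  # 1.0
```

A call has positive expectation whenever expected_net_gain is above zero, which for this example means any winning chance above 4/20 = 20%.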
