Poker as a Testbed for Machine Intelligence Research By Darse Billings, Dennis Papp, Jonathan Schaeffer, Duane Szafron Presented By:- Debraj Manna Gada Kekin Dhiraj Raunak Pillani

Upload: logan-rose

Post on 05-Jan-2016

Poker as a Testbed for Machine Intelligence Research

By Darse Billings, Dennis Papp, Jonathan Schaeffer, Duane Szafron

Presented By:-

Debraj Manna

Gada Kekin Dhiraj

Raunak Pillani

CONTENT

Introduction

Characteristics of Poker Game

Texas Hold’Em

Requirements From Players

Lokibot

Experiment

Future Work

INTRODUCTION

Game researchers have traditionally used chess and other board games as testbeds.

Poker can be a better testbed for decision-making problems.

POKER

Poker is a game of:

Imperfect knowledge

Risk management

Agent modelling

Unreliable information

Deception

The heuristic search and evaluation methods employed in chess are of little help here.

AI PROBLEM CHARACTERISTICS

General Application Problem    Problem Realization in Poker

Imperfect knowledge            Opponents' hands are hidden
Multiple competing agents      Many competing players
Risk management                Betting strategies and their consequences
Agent modeling                 Identifying patterns in opponent's play and exploiting them
Deception                      Bluffing and varying style of play
Unreliable information         Taking into account your opponents' deceptive plays

TEXAS HOLD 'EM

Pre-Flop – Each player is dealt two cards face down

Community cards are dealt in 3 stages:

Flop – 3 cards are dealt face up

Turn – the 4th community card is dealt face up

River – the last community card is dealt face up

A round of betting is held at each stage

Showdown – the player with the best five-card hand wins the pot

BETTING STRATEGY

FOLD – Withdraw from the current hand

CALL – Match the current bet

RAISE – Raise the current outstanding bet

Only 3 raises are allowed in a round.

REQUIREMENT

Hand Strength – strength of your hand compared to your opponents' hands.

Hand Potential – probability of the hand improving as additional cards appear.

Betting Strategy – deciding whether to fold, call, or raise in a given situation.

Bluffing – allows you to make a profit even on weak hands.

REQUIREMENT (contd.)

Opponent Modeling – determining a probability distribution for each opponent's strategy.

Unpredictability – making it difficult for opponents to model your strategy.

Lokibot (later changed to Pokibot)

Pre-flop Evaluation

52 choose 2 = 1326 possible combinations for two cards

Approximate income rate for each starting hand using a simulation of 1,000,000 poker games done against nine random opponents

Highest income rate: a pair of aces

Lowest income rate: 2 and 7 (of different suits)

One time evaluation
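
The count of 1,326 starting hands can be checked by direct enumeration. The following is a minimal sketch, not code from Lokibot: the card encoding (rank character plus suit character) and the helper name `hand_class` are assumptions made for illustration. It also groups the hands into classes that differ only by rank and suitedness, which is the granularity at which a pre-flop income-rate table is usually stored.

```python
from itertools import combinations
from math import comb

RANKS = "23456789TJQKA"
SUITS = "cdhs"
DECK = [r + s for r in RANKS for s in SUITS]

def hand_class(card_a, card_b):
    """Name a two-card hand by ranks and suitedness, e.g. 'AA', 'AKs', '72o'."""
    hi, lo = sorted((card_a, card_b), key=lambda c: RANKS.index(c[0]), reverse=True)
    if hi[0] == lo[0]:
        return hi[0] + lo[0]  # pocket pair
    return hi[0] + lo[0] + ("s" if hi[1] == lo[1] else "o")

# Every unordered pair of distinct cards is one possible starting hand.
starting_hands = list(combinations(DECK, 2))
classes = {hand_class(a, b) for a, b in starting_hands}

print(len(starting_hands), comb(52, 2))  # two ways of counting the same hands
print(len(classes))                      # distinct classes up to suit symmetry
```

Both counts on the first line come out to 1,326, and the second line gives 169 classes (13 pairs, 78 suited, 78 offsuit).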

Hand Evaluation

1. Hand Strength

Assessment of the current strength of the hand

Enumeration techniques can provide an accurate estimate of the probability of currently holding the strongest hand.

2. Hand Potential

Potential changes in hand strength

Hand Strength

Starting hand is A♦-Q♣ and the flop is 3♥-4♣-J♥

47 remaining unknown cards and {47 choose 2} = 1,081 possible hands an opponent might hold.

Hand strength is estimated by simply counting number of possible hands that are:

better than ours (any pair, two pair, A-K, or three of a kind: 444 hands)

equal to ours (9 possible remaining A-Q combinations)

worse than ours (628)
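
The counting above can be reproduced with a short enumeration. This is a sketch under stated assumptions, not Lokibot's implementation: the example is assumed to be A♦-Q♣ on a flop of 3♥-4♣-J♥ (the card images did not survive in this transcript), the functions `rank5` and `flop_hand_strength` are hypothetical names, and ties are counted as half a win when turning the counts into a single hand-strength value. The evaluator only handles exactly five cards, so it works on the flop but not on later rounds.

```python
from collections import Counter
from itertools import combinations

RANKS = "23456789TJQKA"
SUITS = "cdhs"

def rank5(cards):
    """Return a tuple that orders five-card hands; a larger tuple is a stronger hand."""
    values = sorted((RANKS.index(c[0]) for c in cards), reverse=True)
    counts = Counter(values)
    # Ranks ordered by multiplicity first, then by value (pairs before kickers).
    ordered = sorted(counts, key=lambda v: (counts[v], v), reverse=True)
    shape = sorted(counts.values(), reverse=True)
    flush = len({c[1] for c in cards}) == 1
    straight = len(counts) == 5 and values[0] - values[4] == 4
    if values == [12, 3, 2, 1, 0]:  # A-2-3-4-5 counts as a five-high straight
        straight, ordered = True, [3, 2, 1, 0, -1]
    if straight and flush:
        category = 8
    elif shape == [4, 1]:
        category = 7
    elif shape == [3, 2]:
        category = 6
    elif flush:
        category = 5
    elif straight:
        category = 4
    elif shape == [3, 1, 1]:
        category = 3
    elif shape == [2, 2, 1]:
        category = 2
    elif shape == [2, 1, 1, 1]:
        category = 1
    else:
        category = 0
    return (category, tuple(ordered))

def flop_hand_strength(hole, flop):
    """Count the opponent holdings we beat, tie, and lose to on the flop."""
    known = set(hole) | set(flop)
    unseen = [r + s for r in RANKS for s in SUITS if r + s not in known]
    ours = rank5(hole + flop)
    ahead = tied = behind = 0
    for opp in combinations(unseen, 2):
        theirs = rank5(list(opp) + flop)
        if ours > theirs:
            ahead += 1
        elif ours == theirs:
            tied += 1
        else:
            behind += 1
    # Ties are weighted as half a win (an assumption of this sketch).
    hs = (ahead + tied / 2) / (ahead + tied + behind)
    return ahead, tied, behind, hs

ahead, tied, behind, hs = flop_hand_strength(["Ad", "Qc"], ["3h", "4c", "Jh"])
print(ahead, tied, behind, round(hs, 3))  # 628 9 444 0.585
```

With these inputs the enumeration covers all 1,081 opponent holdings and matches the slide's counts of 628 worse, 9 equal, and 444 better hands.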

Hand Potential

Hand strength alone is insufficient to assess the quality of a hand

Example Hand: Flop: Next card: ,

Positive / Negative Potential

Hand Potential (contd.)

Rows: status after 5 cards (flop). Columns: status after 7 cards (turn and river).

5 cards \ 7 cards      Ahead      Tied    Behind          Sum

Ahead                449,005     3,211   169,504      621,720 =   628 x 990
Tied                       0     8,370       540        8,910 =     9 x 990
Behind                91,981     1,036   346,543      439,560 =   444 x 990
Sum                  540,986    12,617   516,587    1,070,190 = 1,081 x 990

Hand Potential (contd.)

If T{row,col} refers to the values in the table (B, T, A, and S are Behind, Tied, Ahead, and Sum, resp.) then Ppot and Npot are calculated by:

Ppot = (T{B,A} + T{B,T}/2 + T{T,A}/2 ) / ( T{B,S} + T{T,S}/2)

Npot = (T{A,B} + T{A,T}/2 + T{T,B}/2 ) / ( T{A,S} + T{T,S}/2)

Ppot = 0.208 and Npot = 0.274
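
As a check, the two formulas can be evaluated directly on the table values. This is a small illustrative sketch; the dictionary layout `T[row][col]` and the name `row_sum` are assumptions chosen to mirror the T{row,col} notation, with A, T, and B standing for Ahead, Tied, and Behind.

```python
# Transition counts: T[row][col], where the row is the status after the flop
# (5 cards) and the column is the status after the turn and river (7 cards).
T = {
    "A": {"A": 449_005, "T": 3_211, "B": 169_504},
    "T": {"A": 0, "T": 8_370, "B": 540},
    "B": {"A": 91_981, "T": 1_036, "B": 346_543},
}
# The "Sum" column of the table, i.e. T{row,S}.
row_sum = {row: sum(cols.values()) for row, cols in T.items()}

# Positive potential: behind or tied now, but ahead or tied at the end.
ppot = (T["B"]["A"] + T["B"]["T"] / 2 + T["T"]["A"] / 2) / (row_sum["B"] + row_sum["T"] / 2)
# Negative potential: ahead or tied now, but behind or tied at the end.
npot = (T["A"]["B"] + T["A"]["T"] / 2 + T["T"]["B"] / 2) / (row_sum["A"] + row_sum["T"] / 2)

print(round(ppot, 3), round(npot, 3))  # 0.208 0.274
```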

Betting Strategy

Hand strength and potential are combined into effective hand strength (EHS):

EHS = HSn + (1 - HSn ) x Ppot

where HSn is the adjusted hand strength for n opponents, Ppot is the positive potential.

EHS is the probability that we are currently ahead (HSn) plus, in those cases where we are behind, the Ppot chance that we will pull ahead

pot_odds = bets_to_us / ( bets_in_pot + bets_to_us )

Call when Ppot > pot_odds
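
The formulas on this slide can be sketched in a few lines. The function names are hypothetical, and two inputs are assumptions: HSn is approximated as the single-opponent hand strength raised to the power n (the slides do not define the adjustment), and hs = 0.585 is (628 + 9/2) / 1,081 from the hand-strength counts, with ties counted as half.

```python
def effective_hand_strength(hs, ppot, n_opponents=1):
    """EHS = HSn + (1 - HSn) * Ppot, with HSn approximated as hs ** n_opponents."""
    hs_n = hs ** n_opponents  # assumption: opponents' hands treated as independent
    return hs_n + (1 - hs_n) * ppot

def pot_odds(bets_to_us, bets_in_pot):
    """Fraction of the final pot that we must contribute to call."""
    return bets_to_us / (bets_in_pot + bets_to_us)

def should_call(ppot, bets_to_us, bets_in_pot):
    """Call with a drawing hand when its positive potential exceeds the pot odds."""
    return ppot > pot_odds(bets_to_us, bets_in_pot)

ehs = effective_hand_strength(hs=0.585, ppot=0.208)
print(round(ehs, 3))                              # 0.671
print(pot_odds(1, 4), should_call(0.208, 1, 4))   # 0.2 True
```

With one bet to call and four bets already in the pot, the pot odds are 0.2, so a hand with Ppot = 0.208 is a marginal call; with only three bets in the pot the odds rise to 0.25 and the same hand should not call on potential alone.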

Experiment

Player A is the most advanced version of the program

Player E is a basic player

Player B lacks an appropriate weighting of subcases, using a uniform distribution for all possible opponent hands.

Player C uses a simplistic pre-flop hand selection method, rather than the advanced system which accounts for player position and number of opponents.

Player D lacks the computation of hand potential, which is used in modifying the effective hand strength and calling with proper pot odds.

Experiment (contd.)

Experiment (contd.)

The bot was also run against other poker-playing bots and human players over the Internet.

In its current state, the bot showed losses against advanced players.

Work In Progress

Lokibot is currently a predictable player: it reacts the same way in a given situation, irrespective of any historical information

Opponent modeling: When Lokibot is better able to infer likely holdings for the opponent, it will be capable of much better decisions

Betting strategy: bluff with high potential hands and occasionally bet a strong hand weakly

Work Done After The Paper

Later versions used simulation to choose an action: for each action Lokibot could take, they simulated how the other players (as estimated by the opponent model) would respond.

They included selective sampling simulation. Opponent modelling consisted of weights for each hole-card combination, describing the probabilities of each action (bet, call, fold), and opponents were measured by their rate of each action.

The most recent work has concerned other approaches to poker game-tree search methods, as well as ways to evaluate the performance of agents.

Contributions Of This Paper

Showing that poker can be a testbed of real-world decision making,

Identifying the major requirements of high-performance poker,

Presenting new enumeration techniques for hand-strength and potential, and

Demonstrating a working program that successfully plays "real" poker.

REFERENCE

Billings D., Papp D., Schaeffer J. and Szafron D. "Poker as a Testbed for Machine Intelligence Research." In Advances in Artificial Intelligence (Mercer R. and Neufeld E. eds.), Springer-Verlag, pp 1-15, 1998.

http://www.poker-academy.com