automated poker - penn engineeringcse400/cse400_2009_2010/final_report/kim_a… · that behavior...

Automated Poker

Dept. of CIS - Senior Design 2009-2010

Albert [email protected]

Univ. of PennsylvaniaPhiladelphia, PA

Barry [email protected]


Jonathan M. [email protected]


ABSTRACTWe discuss the design and development of a multiplayerTexas Hold’em Poker game where user-created bots will com-pete against each other instead of humans. The purpose forthis application is to allow users to play more hands thanthey normally could in traditional poker websites, where eachaction must be manually performed. This idea is not unlikethe shift towards algorithmic stock trading machines in thefinancial world. We first develop a distributed back-end sys-tem, providing the infrastructure for the bots to play at highfrequencies on many tables. We then provide tools and aninterface for users to implement their own algorithms intounique bots. We create several example bots with variousdecision-making abilities, which we then analyze collectivelyfor testing purposes and to demonstrate that our applicationcan serve as an alternative to popular online poker sites. Increating this application, we seek to answer several issuessurrounding the idea. We first determine whether a realis-tic poker environment can be mimicked, where participantsfairly earn/lose money based on their decision-making abil-ities. We also determine whether a user can feasibly trans-late his/her playing strategy into an algorithm. Because botscan play a significantly greater number of hands than a per-son can, we aim to investigate whether this difference affectsthe gameplay from a human’s perspective. We closely mon-itor and study the quantitative aspects of the game, such asvariance, increase/decrease in profits, and strategy implica-tions.

1. INTRODUCTIONThere are currently dozens of popular online poker web-

sites allowing users to play various poker games, includingTexas Hold’ em, Omaha, and Seven Card Stud. These sitesattract tens of thousands of active players from all over theworld. There is an ongoing debate (in and out of the play-ing field) on whether poker is gambling or a complicatedgame of skill. However, there is no doubt that successfulplayers are able to repeatedly make money from this gamethrough careful probability and risk analysis, in addition togood judgment.

Poker is played with only the money pooled together fromits participants. As opposed to other casino games, such asBlackjack, players do not compete against a “house”, butrather, with other players. This aspect of the game is com-monly used in favor of poker’s classification as a game ofskill, rather than luck. In a game of Hold’em Poker, thedealer will distribute two cards to each player, then sequen-tially place community cards in the center of the table after

each turn of betting. The object of the game is to bet ac-cording to the strength of one’s hand against others’ or tofold (give up play without committing any more money) ifthe odds are not in favor of the player.

Several poker terminologies will be used throughout thisproposal. A group of players (usually up to 8 to 10 max)can play at a single table. By sitting at multiple tables(permitted online), players participate in several indepen-dent games, each having no effect on the outcome of another.Players will act one by one around a table, and they can bet,call a bet, or fold. Betting is when a player places money ina pot, usually indicative of possessing a strong hand (theycould also be faking, or “bluffing”, their strength). Whena bet occurs, any following bets or calls for that round arerequired to at least match amount originally wagered. Play-ers can also fold a hand, essentially giving up play for thatround and not having to risk any additional money.

It is very common for experienced poker players to play onmultiple online tables at a time (some up to 20 or more) sincethis reduces variability and provides an outlet for greaterreturns. In a sense, this act can be compared to portfoliomanagers or stock traders taking positions in a diversifiedportfolio to reduce return volatility. In the past decade,an increasing number of investment banks, mutual/hedgefunds, and trading firms have been buying and selling stocksthrough electronic exchanges [6]. One source claims that73% of daily equity trades is done automatically by machineswith intelligent algorithms [11].

These machines have significant advantages over tradi-tional (human) traders. Aside from executing trades in amatter of milliseconds, they require much less manpowerand attention to fine details. Algorithmic traders specifyalgorithms that satisfy their overarching goal and let themrun throughout the trading period. By engaging in a hugevolume of trades with incremental returns, institutions areable to reap substantial profits. Even during the midst ofthe recent financial crisis, firms such as Goldman Sachs maderecord profits due to these electronic trades [4].

Our project seeks to mimic the automated nature of algo-rithmic trading in multiplayer poker software (particularlyHold’em). Automated decision-making bots are currentlydiscouraged by (and usually banned from) online poker sites,as they may provide an unfair advantage over live players.These programs/bots work in conjunction with statisticalanalysis software to decide on the best course of actions ver-sus certain types of players. For example, the program mightdecide that a certain opponent has given up a pot 75% of thetime in a specific scenario and automatically try to exploit

that behavior again.We believe that the future of poker lies in the automated

world. Instead of humans playing against one another, pokercan be played with user-created algorithms, or “bots,” thatcompete on the poker table. With rapid advances in arti-ficial intelligence, bots can be designed to play poker in avery similar manner of its creator without human errors orfaults (for example, judgment can be clouded by emotion).We want to develop an application that allows users to playfairly against each other by allowing them to create com-petitive algorithms that can win more hands for them andeventually make money.

2. RELATED WORKNot much academic research has been performed specifi-

cally regarding the online poker world and much less regard-ing an automated poker offering to the public. However, be-cause of our project’s unique position at the intersection ofonline poker, high-speed information storage and transfer,and automated decision-making processes, there has beenplenty of related research in those fields that we can use toadvance our project. Specifically, we explore the researchperformed on the system technology we want to use for ourback-end as well as the bot decision-making abilities and be-haviors we want to make available for users playing on oursoftware.

We envision our poker exchange to act very similarly to ahigh-speed online stock exchange: both take in large amountsof orders (raise, call, or fold for the poker exchange versusbuy or sell for the stock exchange), execute these orders, andtransfer and store all relevant data in a timely fashion. Hen-dershott details the importance of handling these actions vialimit order books and electronic communication networks ina trading environment [8]. Our poker exchange will containanalogous systems that will receive/respond to bot requestsas well as store and process information during a game.

We considered various parallel computing and distributedcaching technologies to implement this, as we felt they wouldbest handle and process the data in real-time. Parallel com-puting would be useful in our system because most of thecalculations will be repetitive, and hence, can be partitionedinto table groups on several machines. Distributed cachingallows us to store data solely in memory, allowing faster ac-cess. Much of the data we will be dealing with only requirestemporary storage, and combined with parallel computing,we can distribute only the necessary data into each of thepartitions to be worked on.

Rodriguez et al. describes two caching architectures, hi-erarchical and distributed caching, and compares them interms of various performance metrics. These metrics in-clude the client’s latency, bandwidth usage, and disk spaceusage [10] .

We also studied several articles that examined the effec-tiveness of distributed computing over a single large ma-chine. In their article, Ahuja et al. provides a comprehen-sive explanation of JavaSpaces, a distributed programmingtool designed for Java [1]. They describe JavaSpaces as asimple framework that can be used to solve a large class ofdistributed problems.

The key feature of JavaSpaces is that it allows processes towork in parallel with each other and not have to worry aboutthe actions of the other processes. Within JavaSpaces, pro-cesses are loosely related, and communicate through a “per-

sistent object store” called a “space” rather than throughdirect communication. Ahuja et al. then introduces Gi-gaSpaces as “an industrial-strength JavaSpaces implementa-tion,” providing many wrapper libraries that optimizes theapplication and simplifies development.

Ahuja et al. also introduces a series of experiments thatwere conducted to test the relative effectiveness between Gi-gaSpaces and ORBacus, a product based on the CommonObject Request Broker Architecture (COBRA) standard.A distributed, “parallel insertion sort” application was exe-cuted on both platforms, and response times were recorded.The article was able to conclude that GigaSpaces was ableto consistently outperform ORBacus. Therefore, Ahuja etal. claims that the JavaSpaces platform was ideal for de-veloping distributed parallel algorithms, which we chose topursue.

Hancke et al. also details the performance of different dis-tributed systems [7]. The authors calculated the overhead ofdistributed programming tools, such as JavaSpaces, using astatistical approach. In doing so, their primary motivation isto determine whether such distributed programming is idealfor solving high performance calculations. Their article con-cludes that although the overhead for JavaSpaces is not toolarge, there is significant variability, especially when outliersare considered.

These factors will be of important consideration in build-ing our poker exchange, and we hope that our poker ex-change implements an effective caching system that reducesmany of the problems associated with storing and processinglarge amounts of data quickly.

In addition to learning about the system component ofour application, we must also be aware of the nature of thealgorithms that will take place on the platform.

Perhaps one of the first poker-playing bots publicized onthe internet was the Poki system, created by Billings etal. [3]. Poki is a poker-playing algorithm designed for limitHold’em games. Billings describes the significant attributesbuilt into Poki as hand-strength assessment, evaluation ofhand potential, bluffing, unpredictability, and opponent mod-eling.

Put simply, the authors describe a central evaluation sys-tem takes in information about the hand, the model it hasdeveloped thus far on its opponents, the state of the board,its pot-odds, and various other factors and develops a setof possible actions (for example, 25% fold, 50% call, 25%raise). Then an action selector would select one of these ac-tions at random. Although the Poki system presented a verysolid foundation on the necessities of a poker-playing algo-rithm, it focused on limit Hold’em, a form of poker more con-strained and arguably less interesting than no-limit Hold’em.In 2007, Beattie et al. [2] introduced a model designed toplay the more popular no-limit variant of Texas Hold’em.This poker bot focuses on three key considerations: handvalue, risk, and aggressiveness.

For each consideration, Beattie’s algorithm uses mathe-matical modeling to decide on the best course of action. Forexample, Beattie describes risk as a function of the bet size,the pot size, and the size of the blinds. Beattie implementedmany of these poker-playing agents and concluded that theyperformed competently versus “average” human players.

We draw from these works that our system must allowbots to store and update information while playing the gamein order to make complex decisions. Moreover, our applica-

tion must provide an API that allows users access to dataspecific to a certain table. We must find an optimal way forusers to access and process these data in their algorithmswithout infringing upon overall system performance.

Our goal is to combine these prior establishments, botharchitectural and algorithm-related, as components to buildan innovation and efficient poker-exchange system.

3. SYSTEM MODELOur primary goal is to create a multiplayer poker appli-

cation that can conceptually serve as a feasible alternativeto popular online poker websites. Because poker is a gameplayed relative to other players’ actions, a player must waitfor all of the other players to complete their actions beforemaking his/her action. In the event that the player folds ahand, he/she must then wait until the next set of cards aredealt. Even when every step of this is done automatically,there is inevitably some waiting time for other players to“think”, or to wait until the current hand is over.

This project seeks to reduce that waiting time for a partic-ipant to make a decision. We achieve this by automating thedecision-making process via bots. We attempt to create asetting in which an individual “algorithm” can be exploitedto work constantly by replicating the same table multipletimes. This can be compared to taking a position in a fi-nancial options or futures contract, where a single contractconsists of hundreds of shares. Any small benefit/loss re-ceived from the table or option is multiplied by the numberof tables or shares, respectively. This allows the bot to playmany more hands in a reduced variance setting against theexact same opponents. The tables can be replicated enoughtimes for most bots on average to approach a scenario wherethe bottleneck in performance is due to thinking time, orminimizing idle time for each bot.

Our poker software/game consists of both the back-endserver and the interface (mainly API calls) for users to cre-ate and monitor bots. As mentioned before, the server isdeveloped on a distributed platform, taking advantage ofthe parallelization of data storage and processing. An inter-face allows users to program their own algorithms via APIcalls through which the back-end machines will respond to.In addition, it allows the bot creators to monitor the per-formance and statistics regarding his/her bot and the tablesavailable for play.

Most of the back-end system workflow occurs in-memoryacross several different machines. A remote application, orthe user interface to this distributed application, chooses toupload a specific type of bot to join a poker table againstother players. A Grid Service Manager (GSM), which ab-stracts the partitioning schema from the point of view ofthe remote application, takes this request and instantiates abot to upload in the memory of the computer hosting thatparticular poker table.

In the memory of the host (as shown in Figure 1), there isnow the bot object that the GSM just uploaded, as well asthe poker table and its current participants (several otherbots). At this stage, the new bot is not considered to beplaying on the table yet. The arrival (write action) of thisbot into the space triggers a notify event, making a managerprocess (TableService) aware of this arrival. This managerprocess handles everything related to the poker table oper-ations (this could be thought as a dealer role). This processthen modifies the table object to incorporate information

Figure 1: 1. Bot object is written into the spacethrough a remote application. 2. On this writeevent, TableService is notified of the Bot object’sarrival. 3. TableService proceeds to modify data inthe Table object to “connect” the Table and new Bottogether.

from the new bot.During a hand, each player, or bot, is given two random

cards from the deck. This occurs when the table managerprocess writes a notification object into the space, one foreach bot participating in the hand. This notification ob-ject contains information about the cards distributed to theplayers. Another manager process that handles bot actionsis notified of this start of a new hand and takes these noti-fication objects and incorporates the information to the botobjects.

The table manager proceeds to immediately write anothernotification object in the space, informing a bot of its turn tomake an action (Figure 2). This notification object containsinformation about the history of the current hand (thereis no information at the start of the hand, aside from theinitial pot size from blinds and antes). Again, the bot man-ager takes this notification object out of the space and feedsinformation to the respective bot.

It then invokes the bot to make a decision object to writeback in the space, containing information regarding the spe-cific action and bet amount if any (Figure 3). The tablemanager then takes this decision object and processes thenew information on the table object. This process repeatsuntil the hand is over.

This workflow all occurs within the context of a single ma-chine. Each machine operates a chunk of the total “space”of the distributed system. Bots and poker tables that arewritten to the space can be partitioned based on a specifiedfield to ensure it is distributed to the correct machine. A re-

Figure 2: 4. On a certain Bot’s turn, a BotTurnOb-ject is written to the space containing informationabout the current round. 5. This write event no-tifies the BotService. 6. The BotService invokes afunction in Bot to make a decision based on the newinformation.

mote application monitors and controls the objects that areintroduced into the space. To this application, the partition-ing logic is transparent, and it is simply communicating to aproxy of the space that can automatically redirect requeststo the correct partition. Hence, the interface thinks it iscommunicating with a single, but large, memory instance.

4. SYSTEM IMPLEMENTATIONTo implement the above design and system workflow, a

third-party product called GigaSpaces is used. GigaSpacesis a platform that allows one to easily distribute a large ap-plication across several machines in parallel. It essentiallycreates a large “cloud” of memory, called the “space” or “Gi-gaSpace”, that is spread across these machines, and allowsdifferent services to interact with each other via the space.Services in our project refer to the applications that actu-ally handle business logic and the reading and writing ofdata objects.

It is important to note that each partition of the cloudoperate independently of each other, as the system relieson parallel computing. The cluster of partitions is referredcollectively as the“space” since to the outside user or remoteapplication, the partitioning schema is transparent. Eachpartition has a replicated set of services, equivalent across allmachines, and they operate solely on the set of data objectslocal to their machine. For example, when a user wants towrite an object into the space, GigaSpaces will automaticallyroute that request into the appropriate machine.

GigaSpaces is built entirely on Java, which provides some

Figure 3: 7. After invoking Bot’s function, BotSer-vice writes a DecisionObject to the space. 8. Thiswrite event notifies the TableService. 9. TableSer-vice takes the DecisionObject and modifies the datain Table to reflect the most recent action.

Figure 4: Various GigaSpaces Topologies: parti-tioned, partitioned with backups, replicated [5]

benefits. Likewise, the poker system is built using Java be-cause of the extensive packages available with this language.Since many of the required functions and data structuresare readily available for use, development will be relativelyeasier and faster compared to low-level programming lan-guages. In addition, GigaSpaces builds on top of JavaSpacesand Jini, both built by Sun Microsystems as part of the Java

package.JavaSpaces and Jini allow coordination and distribution

of objects across a distributed system set up using the Jininetwork protocol. GigaSpaces builds an additional layer ontop of these technologies to allow creation of a service-baseddistributed application and data grid, and it abstracts awaymuch of the lower-level networking and distribution aspects.

Because the goal of this system is to play many hands ina given period of time, the design and implementation de-cisions were geared towards maximizing performance. How-ever, there is also considerable effort to keep this applicationas modular as possible.

We classify our application into three types of classes: ser-vices, data objects, and data access objects. There are twoservices in the system: TableService and BotService. Thepurpose of these classes is to handle all of the business logicbehind the game of poker. For example, TableService com-pares two players’ hands and determine the winner. It thendistributes the pot (collective bet amount for all players thatround) to the winner(s). In our partitioned application, eachpartition will have a copy of the services running in memory.

Data objects are essentially Java POJOs (Plain Old JavaObjects). As their name implies, they primarily serve tostore data and hold information about the current state.

Though some of the fields in the data objects are JavaCollections, complex fields are otherwise avoided, such as areference to other objects in the POJOs. Instead, referencesto their primary keys are used (i.e., BotID). This is a designchoice made on the fact that a requirement of the system isto achieve optimal performance. GigaSpaces provides suchoptimization when querying objects using a primary key orindexed field in the space, much like a database. In fact, thespace can essentially be viewed as a distributed databaserunning in memory, where it will be populated by the dataobjects that store data.

Data access objects (DAOs) were implemented to com-municate between the services and data objects. The DAOscontain an instance of a “GigaSpace” variable, which is theproxy to the space. Whenever the services process somelogic and need to read or modify objects in the space, thisis done by calling the data access objects.

Several “helper” classes that did not fall under the abovecategories were implemented as well. In particular, the Pok-erAnalyzer class processes most of the logic specific to therules of poker, such as determining the winner(s), given aset of players still playing at the end of a hand.

PokerAnalyzer is a library of many helper methods de-signed to aid TableService in the determination of winners.In addition to being accurate, this library is designed to behighly flexible and efficient, taking advantage of code reuseand paying special attention to the use of effective algo-rithms.

TableService is a service class that helps process a roundof Texas Hold’em from beginning to end. This service hasseveral main responsibilities. First, it is TableService’s jobto process all the DecisionObjects written into the spacefrom the BotService and Bots. TableService accepts a Deci-sionObject and immediately updates the poker table objectas well as fields in this object, such as the state of the currenthand, based on the DecisionObject.

For example, if a bot submits a DecisionObject to bet$500, TableService will add $500 to the pot, and then notifythe bot that is next to act. On the other hand, if a bot sub-

mits a DecisionObject to call a $500 bet, effectively closingthe action for the round, then TableService will add $500to the pot, recognize that the round of betting is over, anddeal out the next set of cards.

When it is a specific bot’s turn to act, TableService willbegin the process of notifying it by invoking the notifyBot().This method writes a BotTurnObject into the object spacefor BotService to retrieve. The BotTurnObject contains allthe information relevant to the hand, including all previousactions leading up to this turn, the status of the commu-nity cards, and the ID of the intended receiving bot. Thereceiving bot, through the BotService, retrieves all the in-formation, runs its algorithm (as specified by the creator ofthe bot), and writes a DecisionObject into the space. ThisDecisionObject contains all the vital information for Table-Service to process, such as the bot’s action, the amount (ifthe action is to bet), its own ID and the ID of the table.The BotService puts the DecisionObject back into the ob-ject space for the TableService to retrieve.

The other main responsibility of TableService is to deter-mine the winner of each hand after the showdown stage hasbeen reached. In order to accomplish this task, TableServicesurveys the table, takes each player’s cards that they hold,combines themg with the community cards, and determinesthe best possible poker hand for each player. TableServicemakes use of a HashMap to associate each bot with its cards,then iterates through this HashMap comparing each bot’shand, and determines a winner (or multiple winners in thecase of a tie). As mentioned before, TableService utilizesPokerAnalyzer to do this. Finally, TableService awards thewinning bot with the pot and then resets the pot to $0 andshuffles the cards for a new deal.

BotService is also an in-memory service that manages thebot objects in the space. When the TableService writes aBotTurnObject into the space, BotService is notified of thiswrite action. It takes this object from the space (and re-moves it), reads the information, and relays it to the bot.It requests the bot to make a decision, based on the infor-mation provided (and any other information it stored), andwrites a DecisionObject into the space for the TableServiceto pick up.

The key point of this implementation is the use of notifica-tion system on an event in the space. When a service writesan object into the space, GigaSpaces notifies another servicethat has registered an interest in such write event. This nextservice automatically calls a function that processes anotherset of logic. This notification system is implemented using aNotificationContainer, built into the GigaSpaces platform.It is essentially a wrapper class that communicates with thespace and other services on a certain event.

Another “container”, called the PollingContainer, simi-larly wraps around a method in TableService, but will con-tinuously poll for a certain type of object until it obtainsone. In this case, it will poll on any available tables waitingto be processed. Polling improves performance when dealingwith objects in batches, as it allows a system to aggregateall objects and take them altogether instead of being noti-fied one at a time. Notifications also alert all idle threads ofan interested service, causing CPU utilization to spike for ashort period of time.

An advantage of utilizing the notification and polling-based architecture implemented in GigaSpaces/JavaSpacesover using other methods is that it drastically reduces con-

currency issues and many possible race conditions while main-taining high performance. Our in-memory components onlyselect the table and bot objects that are ready to be pro-cessed. In the notification-based operations, if multiple no-tification signals are produced simultaneously, such as a bottelling a service to process its decision, our program will au-tomatically queue the work requests in the order receivedand process them based on the number of available runningthreads. In the polling-based operations, our applicationswill poll and operate only on the objects that are ready tobe processed; otherwise, they will immediately be released.

Our application is also designed to minimize any complica-tions from multiple threads running. Since most of our entiresoftware operates on an in-memory cache, where objects arequickly being read and modified, there may be certain raceconditions that could arise. We have ensured that these donot happen by sacrificing a tiny portion of performance bymaking sure read/write locks are properly secured in eventhe most remote corner cases. Whereas we could have dealtwith these issues by simply restarting a hand when such aproblem occurs, ensuring correctness was more easily im-plemented into our system. When an object is polled, it iscompletely removed from the space; no other threads willknow of its existence thereafter.

5. RESULTSThe preliminary tests of our automated poker platform

were done with several test bots, prototype versions of botsthat we expect users to submit to our system. With thesebots, we were able to perform test runs to determine boththe accuracy and efficiency of our platform. Overall, weperformed several batches of five-minute test sessions. Foreach session, we tested a different table setting (9, 6, or 2player size) and different combinations of test bots.

Figure 5: Sample flowchart for RandomBot show-ing its decision process. In this case, it follows a20/30/50% distribution for folding, calling, or bet-ting. For each instance, its actual action will bedetermined randomly.

Three of the test bots we developed are called Random-Bot, SmarterBot, and ModelBot, and their names roughlyreflect their abilities. RandomBot (Figure 5) is the mostprimitive of the bots we created and serves simply as abenchmark for other bots. It makes each of its decisionson a random basis. Each time it needs to make a decision,

RandomBot determines its action based on a fixed probabil-ity distribution. For example, during the preflop round ofbetting, it may decide to fold 50% of the time, call 30% ofthe time, or raise 20% of the time (or otherwise determinedby its creator). RandomBot also acts independently of thestrength of its own card holdings; it essentially ignores whatcards it currently has. Thus, as designed, its decision basisis strictly random and we expect it to perform rather poorlyagainst more intelligent bots.

Figure 6: Sample flowchart for SmarterBot.SmarterBot first determines the strength of itshand, then evaluates the actions taken leading upto its turn, and finally makes a decision.

SmarterBot (Figure 6) is a more detailed test bot with theability evaluate its own card holdings. Preflop, it evaluatesits cards and determines their relative strength against tworandomly dealt cards. This measure of strength (and anyother future mention of “strength”) refers to the percentagesof winning the pot if the two hands are played until theriver card. The relative strengths of each hand have alreadybeen well-established in the poker community, and we usedthis information in a simple lookup fashion to determinea certain hand’s strength. If SmarterBot determines thatit has a strong hand (for example, lies within the top 20percentile), then it will proceed to play the hand as long asthere hasn’t been an abnormally large bet. Otherwise, itwill fold to the bet. Similarly, after the flop, SmarterBotdetermines its action strictly based on the strength of itshand. If it determines that it has made a high-card pair orbetter, it will continue to play the hand by either callingthe bet or raising initial bet amount. Otherwise, it will getout of the hand by folding. Once again, we reiterate thatthere is nothing special about playing the top 20% of handsor continuing with a top pair or better; these are simplyuser-set parameters that can be altered at any time. In thissense, our automated poker platform allows the user to freelymodel their bots after their own games.

Figure 7: Sample flowchart for ModelBot. Afterevaluating the strength of its own hand, it then usesits model to analyze the behavior characteristics ofeach of its opponents, and makes a decision basedon both factors.

Our third example bot, called ModelBot (Figure 7), is animprovement on SmartBot and with a more complex algo-rithm with some basic artificial intelligence capabilities. Asits name suggests, ModelBot contains data structures andstatistics logic that is able to record and predict the behav-iors of its opponents. For example, it keeps track of howoften each opponent raises during a certain betting stage,how often they fold to a raise, how often they bluff, etc. Af-ter every hand it observes, it updates the relevant statisticsfor each opponent that was involved in the previous hand.Then, in future hands, it uses its model on each opponentas a basis for its decision. For instance, if it records thata particular opponent has bluffed 0 out of 20 past times onthe river card (meaning the opponent has always bet witha strong holding), it will not call a river bet from that op-ponent without having a very strong hand itself. If, on thecontrary, its statistics on the opponent suggests that the op-ponent bluffs on the river with a very high frequency (say,above 50%), then it will call its opponent’s bet with a muchwider range of hands, perhaps a single pair or even a highcard. We envision that eventually, most of the winning botson our automated poker platform will incorporate a modelof some kind, as most winning poker players do in real life.

After running our test sessions, we gathered preliminarydata on the win rates of each bot when placed in various ta-bles. As expected, ModelBot performed the best, Smarter-Bot performed second-best, and RandomBot came in last.When we simulated a single table of nine bots (three of eachtype) randomly positioned around the table, both Model-

Bot and SmarterBot consistently came out as winners, whileRandomBot was an overall loser. On average, ModelBotachieved a win rate of about 14-16 big blinds per 100 hands,SmarterBot achieved a win rate of about 8-10 big blinds per100 hands, and RandomBot achieved a loss rate of about23-25 big blinds per 100 hands. When simulating tablesconsisting of only two types of bots, we achieved similar re-sults. Both ModelBot and SmarterBot achieved relativelylarge win rates against RandomBot, and though it variedwidely, ModelBot was able to achieve a very small win rateagainst SmarterBot.

These win rates are not surprising. When all three testbots played versus one another, ModelBot and SmarterBotwere much more selective in the hands they played, as dic-tated by their algorithms. RandomBot, on the other hand,played a much larger proportion of hands without worryingabout the strength of its holding. Thus, in a typical hand,there would be one or two ModelBots or SmarterBots withvery strong holdings against two or three RandomBots withcompletely random holdings. The ModelBots and Smarter-Bots would win those hands more often than not.

The relatively small win rate of ModelBot over Smarter-Bot is slightly more non-obvious, but nonetheless can beexplained. In essence, ModelBot has all the hand strength-determining features that SmarterBot has, but it is also ableto keep track of statistics on its opposing bots. However, be-cause SmarterBot is programmed to play solid “ABC” pokerand rarely make unprofitable decisions, ModelBot’s model ofSmarterBot is simply that of a typical, solid bot, and thusit is rarely able to exploit any weaknesses. Thus, it onlyachieved an average win rate of around 3 big blinds per 100hands over SmarterBot.

The users of our automated poker platform are required tosubmit a bot that can interact“correctly”with our platform.That is, the user bot simply needs to be able to submit a de-cision object notifying our platform of its action. Whateveralgorithms or techniques the bot undergoes to come up withthe decision is entirely up to the user. However, we envisionthat through natural selection, only bots with solid, winningalgorithms will continue to play on our platform. Becauseof the large number of tables bots can play on and highspeed of each hand, bots can experience an extremely largenumber of hands per hour. Thus, any bots with exploitableerrors will lose money very rapidly, and only the bots withthe most solid and clever algorithms will be able to continueplaying. We predict that the eventual majority of user botswill have algorithms consisting of three key elements: an ap-titude to determine the strength of its holdings relative topossible opponent holdings, the capability to model its op-ponents and keep track of its opponents’ decision patterns,and the ability to randomize its decisions to a certain extentso that even the smartest of opponents will not be able topredict its holdings or next moves with complete certainty.

A benefit of our automated poker platform is that it pro-vides a means for users to test their novel strategies beforeimplementing them in their own games, whether traditionallive/online or in a future automated platform. Our systemprovides a fast and effective way to test a strategy for a largenumber of repetitions. Users will first need to build the im-portant aspects of the strategy into their test bots, and thensubmit this bot to our platform, which allows it to play ver-sus a variety of other bots with different styles. Then, userscan analyze how their bots fared against each type of oppo-

nent, and subsequently determine if their strategy is viable.For example, if one thought that going all in pre-flop withthe top 50 percent of hands is a viable strategy in an envi-ronment where all stack sizes are less than ten times the bigblind, he or she could easily program such a bot and have itplay on the “low stack” tables of our platform. Then, afterone hour, or roughly 20,000 hands, the user can observe howthe bot did in terms of profit and loss, and analyze the bot’sstrengths and weaknesses versus different types of bots. Heor she can finally develop potential improvements to thisbot and ultimately conclude whether this strategy is a rea-sonable one or not. By using our system, users can gain abetter understanding of the effectiveness of any strategy ina very short period of time.

6. SYSTEM PERFORMANCEIn the ideal setting, our automated poker software would

have been tested through a large mixture of bots ranging incomplexity from making purely random decisions to beingprofessionally competitive, across a huge number of tables.Though such an exhaustive performance test of our systemwas not feasible at the time of its completion, we have gainedvaluable insight into the overall behavior of our programby analyzing a subset of basic situations. Our system isfurthermore heavily dependent on the type and amount ofhardware available to be deployed on, so the initial statisticsdo not necessarily reflect the end-goal performance of ourprogram.

We first looked into implementing a synchronized repli-cation or backup of each space cluster. This would ensurethat in the case of a failover in one partition, the systemas a whole would continue to run seamlessly (with only aslight decrease in performance). To minimize the risk of to-tal failover, the replicated partitions were placed in separatemachines. GigaSpaces automatically attempts to convertthe replication into a primary partition while the failed ma-chine is being restored.

After running several tests, using a synchronized repli-cated schema for our system caused performance to dropon average an order of magnitude. This decrease is not toounexpected, since the replications were present in machinesalso running primary partitions. Copying each object acrossa network synchronously (and preventing other operationsfrom running until replication is complete) would create abottleneck in performance. In addition, though many of theinteractions between services seem trivial, our system wouldbe forced to replicate many of the intermediary objects usedto communicate between services.

Another factor that may affect the performance when us-ing synchronized replication is the design of our objects.Some objects, such as a table, contain various complex fields(data structures) that contain information regarding the stateof the current hand and all bots associated with the table.When these objects are backed up, GigaSpaces will also haveto back up these complex fields. Though we could drasti-cally improve the replication time by modifying the classstructures, doing so would create other disadvantages in ourprogram as well.

On the other hand, using an asynchronous replication isnot very practical given the fast-paced nature of poker andour platform. Since data regarding a specific hand is veryshort lived (order of milliseconds), an asynchronous backupwould serve no useful purpose other than to store persis-

tent data. At closest, the only useful information to beasynchronously backed up would be certain long-term botinformation, such as its chip stack amount.

In any case, we found it most practical to remove the repli-cations completely, since a failover in one partition wouldonly cause a single hand (out of many) to be interrupted.Analyzing the probability of such failover happening in con-junction with that hand being important (one involving asignificant pot size) led us to the decision of simply waitinguntil the machine is restored and restarting that hand. Ide-ally and with the resources, however, we would likely main-tain a synchronized backup to ensure perfect correctness.

Another performance issue we analyzed was based on thetime limit we placed for each bot to make a decision. In ourcurrent series of tests, bots do not contain a significantlycomplex algorithm; each decision-making algorithm is fairlyquick and would probably not fare well against a decenthuman player. However, in the ideal case that our systemdoes operate with bots that are competitive, the designersof each bot would want to ideally use up the most amountof time they are provided with to make the best decision.

To replicate this scenario, we incorporated time-stampswithin the iterations of services handling bot and table logicand induced an artificial“waiting”time for each bot’s decision-making process. We assume that bots that take longer thanthe given amount will time out and automatically submit adefault action, either a check or fold, as is the case in bothlive and online poker.

We tested a single 9-player table using a modified Smarter-Bot through a range of induced waiting times. At higherwaiting times (0.5s to 1s), we discovered that the number ofhands per hour varied proportionally with the waiting times.In each hand, every bot would be forced to make a “deci-sion” during preflop play. On average, 2 to 4 bots chose totake the hand into the flop, and even fewer towards the turnand river. Ignoring overhead costs, this would equate to anaverage of 10-15 “decision-making” periods per hand. At 1second per decision, this would yield a 10-15 second periodper hand (about 300 hands per hour), and so on. This re-sult is consistent with the typical number of players beinginvolved in most hands in live or online poker.

We gradually reduced the time-limit threshold down totens of milliseconds, and though the results varied widely,there was an expected increase in the number of hands played.The performance tended to plateau when we reduced thetimes down to below 25 milliseconds per decision, approach-ing the same performance as when we did not induce anywaiting time. At a combined total of around 100ms perhand, our simulations resulted in slightly under the theo-retical average of 36,000 hands per hour. Again, this valuevaried widely depending on the nature of each hand and theoverhead associated with different operations.

As depicted in Figure 8, the number of hands we wereable to attain is shown relative to the other alternatives ofplaying poker, all of which require manual inputs for everydecision. During a typical live poker game (such as in acasino), players are able to play around 25 hands per hour.In a single online poker table, this number increases to about75. Recently, Full Tilt Poker, one of the most popular on-line sites for poker, released a fast-paced version of a pokertable called Rush Poker. Once a player folds a hand, he orshe is immediately taken to a new table, reducing the wait-ing time between each hand. This drastically improves the

Figure 8: Comparison of number of hands per hourin a single 9-person table for different types of pokergames. The number of hands is plotted on a loga-rithmic scale.

number of hands played per hour. However, none of thesegames come close to the potential frequency achieved by anautomated poker platform. We believe that the 36,000 fig-ure is a conservative estimate, and higher numbers can beachieved through an improved system implementation andbetter hardware.

We note that simulating a waiting time was only feasiblebecause the bots resided in the in-memory grid alongside theservices, where communication is much faster compared to anetworked communication. In a networked setting where theserver communicates with clients to get the bots’ decisions,many other issues arise such as latency which may affectthe time limit per decision. In addition, actual performancemay be faster without time logging, since the operation tokeep track of time uses up system resources in each of thepartitions.

7. SHORTCOMINGSThere are several current shortcomings of our project that

can be improved in the future. First, our consumer baseexperiences a very high barrier to entry. That is, in order tobe able to participate in our poker games, users must havethe capability to program a bot with functionality that isconsistent with what we expect. Creating anything at thislevel of sophistication requires a considerable programmingbackground, and further, making a poker bot that is ableto win money requires a sophisticated poker knowledge baseas well. Thus, it is difficult to predict the traffic and userbase that a platform like ours can attract. However, as boththe world of poker become more and more technologicallysophisticated, we expect this problem to be less and less ofan issue in the future.

Furthermore, our platform is currently lacking any typeof graphical user interface (GUI). Because the direct usersof our platform will be bots rather than human beings, wedecided that the functionality of our system was of primaryinterest, and therefore did not focus on creating a standalone

user interface. Consequently, all results, hand reports, andstatistics are currently outputted and analyzed separately.For our platform to be commercially viable, however, wewill need to develop some sort of GUI. This way, when usersdecide to review their bot’s playing history or performance,he or she can do so simply and easily.

Another shortcoming for our automated poker platformis that we do not properly address security issues. Thoughthe platform is designed such that bots only have accessto a certain set of information, this does not prevent usersfrom designing malicious bots that could affect the behaviorand logic of our system. For example, such a malicious botmay intentionally take a long time to act, delaying the handand disrupting the game flow. Other bots may attempt tosend in faulty decision objects, multiple decision objects, ordecision objects out of turn. Although our system generallysafeguards against the creation of faulty objects, it has noexplicit layer of security or detecting them and thus cannotguarantee to be immune to malicious attacks.

We also do not address the issue of possible collusion thatmight occur in automated poker. Though collusion existsin any form of poker, allowing bots to automatically makedecisions could further amplify the problem. For instance,bots may data mine on the behavior of other bots, and pos-sibly share this information through a separate network orvia detectable patterns in the actions it takes.

8. FUTURE WORKWhile much of the foundation and basic aspects of the

ideal automated poker software have been addressed in ourproject, there are still many remaining issues that have notbeen covered.

First, as mentioned before, a graphical user interface mustbe developed to allow the users and creators of bots to beable to build and maintain their playing algorithms, even ifthe users do not come from a programming background. Allinteraction between the client and server is currently doneprogrammatically. Our ideal, long-term goal would be torelease a commercialized version of our software, allowingusers to play on our product as they do on the many popu-lar poker sites in existence today. Hence, the user interfaceaspect of our project plays an important part of differenti-ating our product from others’.

Creating a real-time interface that keeps track of the bots’performance is not much different from what way we haveimplemented it already, but can be easily improved withthe GigaSpaces platform into a graphical interface as well.Instead of reading the bot objects directly, which affect per-formance, we can place an additional statistics service ineach of the partitions. Clients can then read detailed “stat”objects written directly into the space by these services cor-responding to each bot and table. If a client maintains sev-eral bots across different partitions, the space proxy wouldbe able to take advantage of the hidden partitioning logicand aggregate all these “stat” objects across all the relevantpartitions. Furthermore, by creating a graphical interface,bot managers would be able to track their performance muchlike quantitative day traders.

We would also need to develop a more practical commu-nication channel between the client and the server compo-nents. In our project, the only network communication be-ing used is for the clients to upload its bots into the memoryspace and to keep track of the bots’ performance. The actual

decision-making process is done via in-memory communica-tion with bots stored directly in the data grid. This presentsmany security issues in an ideal setting, and we would likethe server to instead communicate with client bots througha network. However, the issue of latency rises, which maydrastically affect the speed of our platform.

9. CONCLUSIONAlthough invented almost a century ago, the game of

Poker, and especially Texas Hold’em, has never been morepopular than it is today. Over the past decade, an increas-ingly large number of people have begun playing poker on-line, where more hands are played in a shorter amount oftime. We believe that the next natural step in the evolu-tionary path of poker is automated poker. We developeda prototype of an automated poker platform, where user-designed bots are placed against one another in multiple,high-speed poker games. Our platform is able to effectivelyincrease the number of hands played by several orders ofmagnitude, therefore reducing the variance of each sessionand therefore allowing winning players to better realize theirexpected profits. Furthermore, we developed several sampletest bots to assess the functionality and effectiveness of ourplatform. With a few improvements to our program’s GUI,security, and user-friendliness, we envision our project to bethe prototype of the future of online poker.

10. REFERENCES[1] Sanjay P. Ahuja, Roger Eggen, and Anjani K. Jha. A

performance evaluation of distributed algorithms onshared memory and message passing middlewareplatforms. Informatica, 29:327–333, 2005.

[2] Brien Beattie, Garrett Nicolai, David Gerhard, andRobert J. Hilderman. Pattern classification in no-limitpoker: A head-start evolutionary approach. LectureNotes in Artificial Intelligence, 4509:204–215, 2007.

[3] Darse Billings, Aaron Davidson, Jonathan Schaeffer,and Duane Szafron. The challenge of poker. ArtificialIntelligence, 134:201–240, 2002.

[4] Bloomberg. Goldman sachs posts record profit,beating estimates.http://www.bloomberg.com/apps/news?pid=

20601082sid=aezVfob8myV4.

[5] GigaSpaces. Gigaspaces topologies image.http://www.gigaspaces.com/wiki/download/

attachments/50760325/topologies.jpg/.

[6] Aite Group. Algorithmic trading: Hype or reality?http://www.aitegroup.com/reports/20050328.php/.

[7] Frederic Hancke, Frans Arickx, Jan Broeckhove, andTom Dhaene. Modeling overhead of distributedsystems using statistical techniques. University ofAntwerp, Belgium.

[8] Terrence Hendershott. Electronic trading in financialmarkets. IT Pro, pages 10–14, August 2003.

[9] Hanyang Quek, Chunghoong Woo, Kaychen Tan, andArthur Tay. Evolving nash-optimal poker strategiesusing evolutionary computation. Front. Comput. Sci.China, 3(1):73–91, 2009.

[10] Pablo Rodriguez, Christian Spanner, and Ernst W.Biersack. Analysis of web caching architectures:Hierarchical and distributed caching. IEEE/ACMTransactions on Networking, 9(4):404–418, 2001.

[11] Advanced Trading. The real story of trading softwareespionage.http://advancedtrading.com/algorithms/

showArticle.jhtml?articleID=218401501/.

[12] Ian Watson and Jonathan Rubin. Casper: Acase-based poker-bot. Lecture Notes in ArtificialIntelligence, 5360:594–600, 2008.

automated poker - penn engineeringcse400/cse400_2009_2010/final_report/kim_a… · that behavior...

Documents