

Research Article
A Study of Prisoner's Dilemma Game Model with Incomplete Information

Xiuqin Deng¹ and Jiadi Deng²

¹ School of Applied Mathematics, Guangdong University of Technology, Guangzhou, Guangdong 510006, China
² Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China

Correspondence should be addressed to Xiuqin Deng; xiuqindeng163com

Received 25 May 2014; Revised 22 September 2014; Accepted 23 September 2014

Academic Editor: Yiu-ming Cheung

Copyright © 2015 X. Deng and J. Deng. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Prisoners' dilemma is a typical game theory issue. In our study, it is regarded as an incomplete information game with unpublicized game strategies. We solve the problem by establishing a machine learning model using the Bayes formula; the model established is referred to as the Bayes model. Based on the Bayes model, we can predict players' choices to better complete the unknown information in the game. We also suggest using a hash table to improve space and time complexity. We build a game system with several types of game strategy for testing. In two-player and multiplayer games, the Bayes model is superior to the other strategy models: the total income using the Bayes model is higher than that of the other models. Moreover, from the results of the games of the natural model against the Bayes model, as well as of the natural model against the TFT model, it is found that the Bayes model accrued more benefits than the TFT model on average. This demonstrates that the Bayes model introduced in this study is feasible and effective. Therefore, it provides a novel method of solving incomplete information game problems.

1. Introduction

Incomplete information games are influenced by the private information owned by at least one game player, such as the current game state, the decision-making mechanism of other players, the game state of other players, and the reward/punishment mechanism of the game [1]. However, due to the absence of an optimal or relatively optimal solution (at any given game state), such an incomplete information game is insoluble by traditional study methods. This is because the other players constrain the strategy in incomplete information games. Harsanyi [2] analyzed an incomplete information game using Bayesian game player strategies and proposed methods for modeling and analyzing game problems. Zinkevich et al. [3] investigated incomplete information games using Nash equilibrium and the regret minimization method and proposed some game strategies for improving poker game problems.

As a branch of artificial intelligence and a cutting-edge research topic, machine learning has received great attention in related fields in recent years. Machine learning is defined as a research method that aims to obtain a more desirable approximate solution based on the general rules obtained by analyzing a large amount of data [4]. Statistical machine learning is a branch of machine learning. It integrates statistical theory into machine learning by combining probability theory and stochastic mathematics with machine learning to improve efficiency and accuracy [5, 6]. The Bayesian classification algorithm is a commonly used machine learning method [7]. The simplified model of naive Bayesian classification is often used in text classification.

The prisoners' dilemma game is a classic cooperation and selection problem based on the assumption of selfish human motives [8]. It is popular and widely applied in mathematics and economics [9]. For a long time, it has been a classical game theory problem and has attracted great interest from mathematics and economics researchers around the world. Game theory was born in the mid-twentieth century and was founded by von Neumann (a famous mathematician and founding father of computing) and Morgenstern (a famous economist). The starting point for the development of game theory was the publication of John von Neumann and Oskar Morgenstern's seminal work The Theory of Games and Economic Behavior in 1944 [10]. Game theory brought radical changes to economics and provided a standard analysis tool for economists. In light of the contributions of game theory to economics, the Royal Swedish Academy of Sciences awarded Nobel Prizes for economics to Nash, Harsanyi, and Selten in 1994 and to Aumann and Schelling in 2005 [10]. In the famous artificial intelligence algorithm competition on the prisoners' dilemma, Axelrod concluded that a TFT (tit-for-tat) model was the optimum solution through brute force competition among the participating algorithms [11, 12]. Miller [13] introduced an automaton model to simplify and analyze the prisoners' dilemma and proposed a more general prisoners' dilemma decision analysis method. He also applied the model to solve problems arising from generalisation. Press suggested that the prisoner's dilemma is an ultimatum game and gave an example of a strategy which can gain an unfair share of rewards to support his claim [14]. Using a genetic algorithm, Lin and Wu [15] studied the evolution of strategies in the iterated prisoner's dilemma on complex networks and found that agents located on complex networks can naturally develop some self-organization mechanisms of cooperation, which not only result in the emergence of cooperation but also strengthen and sustain persistent cooperation.

Hindawi Publishing Corporation, Mathematical Problems in Engineering, Volume 2015, Article ID 452042, 10 pages, http://dx.doi.org/10.1155/2015/452042

Evolutionary game theory [16, 17] extends and combines ideas from game theory and evolutionary biology to study the evolution of an interacting population of individuals. Perhaps one of the simplest games in evolutionary game theory is the so-called evolutionary spatial prisoners' dilemma (ESPD) [18]. Cardillo et al. [19] investigated the coevolution of strategies and update rules in the ESPD. The authors concluded, for a variety of underlying graph topologies, that when the dynamics coevolves with the strategies, it generally leads to more cooperation in the weak prisoners' dilemma. Du et al. discussed another evolutionary method that uses an improved weighted network to solve the problem [20]. Reference [21] proposed a model using two graphs in conjunction with the ESPD, one for determining player interaction and a second for updating strategies. Moreover, Wang et al. have proposed some evolutionary algorithms to solve relevant problems [22, 23]. Game theory techniques have been widely applied to various engineering design problems in which the action of one component has an impact on (and perhaps conflicts with) that of any other component.

Prisoners' dilemma can be regarded as a game with incomplete information. It satisfies the conditions for an incomplete information game; namely, the players of each game are incapable of determining the choice of their rival in the current situation. In our study, we propose a naive Bayesian classification method, which is used to establish the machine learning model for the prisoners' dilemma in an attempt to solve it through statistical machine learning. With the use of Bayesian classification, the opponents' strategy can be represented as the probability of each choice, which improves the accuracy of the prediction of the opponents' strategy. Moreover, we introduce a multistep evaluation to provide high-precision information for the final decision in our strategy. In the record step, we suggest some efficient data structures and ensure a reasonable space and time complexity in our method. We test the proposed method in competitions with some typical methods. The simulation results show that our method outperforms four classical methods (see Figure 1 for more details). We further apply the Bayes method to multiplayer games. The simulation result indicates that the Bayes method gains the highest income in the multiplayer test (see Figure 5 for more details).

Figure 1: The incomes of the models after 10000 games. [Bar chart; vertical axis ticks from 0 to 40000; groups: Random1, Random2, TFT1, TFT2, Pavlov1, Pavlov2, GTFT1, GTFT2; series: Bayes, Rival.]

2. Game Model

2.1. Game with Incomplete Information. Games with incomplete information can be defined as games in which the players are uncertain about some important parameters [2]. Generally, the incomplete information can be regarded as the players' lack of full information on some basic law of the game. Such incomplete information can mainly arise in three different ways, as follows.

(1) Lack of information on the profit function of the game: the profit function is a function with $N$ or more parameters that gives the final income of each player ($N$ is the total number of players). It can be represented as

$$p = P(s_{x_1}, s_{x_2}, \ldots, s_{x_N}, p_1, p_2, \ldots, p_m), \tag{1}$$

where $s_{x_1}, s_{x_2}, \ldots, s_{x_N}$ is the strategy set of all $N$ players and $p_1, p_2, \ldots, p_m$ are the other parameters involved in the outcome, such as the number of turns.

(2) Lack of information on the strategy of other players in the game: a strategy list can be defined as the decisions a player made in each step ($L$ is the current number of turns):

$$s_i = \{d_1, d_2, \ldots, d_L\}. \tag{2}$$

Players make their decision for the next step based on the previous situation of the game:

$$d_{l+1} = D(g_1, g_2, \ldots, g_l). \tag{3}$$

(3) Lack of information on the available choice space of others in the game: each player's choice may be limited by the rules or background of the game. That is to say, the players' actions in each step are finite but uncertain.

We can say that all other cases of incomplete information can be reduced to these three basic cases [1]. In the prisoner's dilemma, the player is not aware of the other player's strategy (case 2), for there is no best or winning strategy in the game. Yet the profit function (the outcome according to each player's choice) and the choice space (cooperate or betray) are definite in the game.

In our research, we mainly focus on games of case 2, where players know nothing about the others' strategies. Such a game model can be found in a wide range of areas. For example, in some economic models, the choice space is limited by the situation of the problem and the profit curves can be found from previous models. The strategy of each competitor is unpredictable when the profit is not linear or varies with time (as it does in reality). So such a model can be regarded as a game with incomplete information of case 2.

2.2. Game of Prisoners' Dilemma. In this game, two individuals each choose cooperation or defection. If the two individuals cooperate with each other, they both earn income $R$; if they both defect, the income of both sides is $P$; if one individual cooperates while the other defects, the cooperative one gains $S$ while the treacherous player gains $T$ (Table 1). Here $T > R > P > S$ and $2R > S + T$. The latter formula means that the total income of two cooperating individuals is always larger than the total gained when one individual defects. However, for an individual, the income earned by defecting against a cooperator is greater than that earned by mutual cooperation.

In this experiment, the parameters selected were consistent with those that Axelrod and Hamilton [12] used in solving the prisoners' dilemma. That is to say, the incomes were $R = 3$, $T = 5$, $S = 0$, and $P = 1$, which satisfy the conditions $T > R > P > S$ and $2R > S + T$.

In addition, each pair of strategies competed for $L_m$ turns; namely, both sides had to make $L_m$ selections. The result of each turn was recorded, and the information on the opponent's selection was sent to each player at the end of each turn. After $L_m$ turns, the results of each pair of strategies were listed in the form of their total scores. The corresponding strategy models were evaluated according to the total score. In general competition, $L_m$ was set to 10000. It can be seen that the game in the long term yields stable income results.
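The payoff rule and the repeated-match protocol described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' game system: the names `PAYOFF`, `play_match`, `always_cooperate`, and `always_defect` are hypothetical, and each strategy is assumed to be a function that receives the list of past (own choice, rival choice) pairs.

```python
# Payoff values from the text: R = 3, T = 5, S = 0, P = 1.
R, T, S, P = 3, 5, 0, 1
C, D = "C", "D"

# PAYOFF[(a, b)] = (income of player A, income of player B), as in Table 1.
PAYOFF = {
    (C, C): (R, R),
    (C, D): (S, T),
    (D, C): (T, S),
    (D, D): (P, P),
}


def play_match(strategy_a, strategy_b, rounds):
    """Play `rounds` turns and return the total scores of both sides.

    Each strategy is called with its own view of the history: a list of
    (own choice, rival choice) pairs, oldest first. Both choices are made
    before either side sees the other's choice for the current turn.
    """
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(hist_a)
        b = strategy_b(hist_b)
        income_a, income_b = PAYOFF[(a, b)]
        score_a += income_a
        score_b += income_b
        # Feedback: each player learns the rival's choice at the end of the turn.
        hist_a.append((a, b))
        hist_b.append((b, a))
    return score_a, score_b


def always_cooperate(history):
    """Hypothetical fixed strategy used only to exercise the sketch."""
    return C


def always_defect(history):
    """Hypothetical fixed strategy used only to exercise the sketch."""
    return D


if __name__ == "__main__":
    print(play_match(always_cooperate, always_defect, 10))
```

With these payoffs, two unconditional cooperators each collect $R$ per turn, while a cooperator facing a defector collects $S$ per turn and the defector collects $T$.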

Table 1: Game information.

| (Player A, Player B) | Cooperation (C) | Defection (D) |
|---|---|---|
| Cooperation (C) | ($R$, $R$) | ($S$, $T$) |
| Defection (D) | ($T$, $S$) | ($P$, $P$) |

Several typical strategy models are presented in the following.

(1) TFT (tit-for-tat) strategy, that is, a "return like-for-like" strategy: TFT is a well-known model for the prisoners' dilemma. The main idea of this strategy is that, starting with cooperation, the selection in each round is made on the basis of the selection from the previous round. That is, if the rival selected cooperation or defection in the previous round, that selection is repeated in the current round. This strategy performed best in the artificial intelligence algorithm competition organized by Axelrod and Hamilton [12] (the TFT strategy in this study is consistent in concept with that TFT strategy, but differs in its details).

(2) PTFT, an improved TFT strategy [24]: this strategy is relatively more selfish than TFT. It still starts with cooperation; however, in the following rounds, cooperation is only selectable in the absence of defection for three rounds.

(3) GTFT, another improved TFT strategy [8]: this strategy allows a certain probability of cooperation in the case of rival defection and a certain probability of defection in the case of cooperation. It resolves the deadlock arising from mutual defection in the competition.

(4) Pavlov, a different strategic concept [8]: it bullies the weak and fears the strong; namely, cooperation is continued in cases of mutual cooperation. However, defection is selected when one side chooses defection. Moreover, in the case of mutual defection, cooperation is given priority. Such a strategy represents a local optimum in genetic algorithm terms.

(5) Random, a random strategy, that is, one that randomly returns cooperation or noncooperation: in related programs, a 50:50 random strategy is most commonly applied; that is, the probability of returning cooperation or defection is 50% each. This strategy is mainly adopted to assess fixed strategies, set competition parameters, and so forth.

(6) Normal, a strategy mode developed by simulating common players: in this strategy, cooperation or defection is selected with different probabilities based on the selections of both sides in the previous game. This strategy simulates player participation in a game using different strategies.
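Several of the strategies listed above can be written as small decision functions. The sketch below is illustrative only and follows the verbal descriptions in the list; the function names, the history format (a list of (own choice, rival choice) pairs), the opening move of Pavlov, and the GTFT probabilities `forgive` and `betray` are assumptions, since the text does not give these details.

```python
import random

C, D = "C", "D"


def tft(history):
    """Tit-for-tat: cooperate first, then repeat the rival's previous choice."""
    return C if not history else history[-1][1]


def pavlov(history):
    """Pavlov as described above: cooperate after mutual cooperation or
    mutual defection, defect when exactly one side defected.
    Opening with cooperation is an assumption."""
    if not history:
        return C
    mine, rival = history[-1]
    return C if mine == rival else D


def make_gtft(forgive=0.1, betray=0.1, rng=random):
    """TFT with a chance of cooperating after a rival defection (`forgive`)
    and a chance of defecting after a rival cooperation (`betray`).
    Both probabilities are placeholders, not values from the paper."""
    def gtft(history):
        base = tft(history)
        if not history:
            return base
        if base == D and rng.random() < forgive:
            return C
        if base == C and rng.random() < betray:
            return D
        return base
    return gtft


def make_random(p_cooperate=0.5, rng=random):
    """Random strategy; p_cooperate = 0.5 gives the 50:50 variant."""
    def rand(history):
        return C if rng.random() < p_cooperate else D
    return rand
```

Representing each strategy as a function of the visible history keeps the strategies interchangeable, which is convenient when every pair has to be played against each other.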

2.3. Extended Prisoners' Dilemma Model. The original prisoners' dilemma problem contains only two players with two different choices, and the profit function of the game is given and fixed during the game. However, most problems in reality are not such simple scenes but multiplayer variants. Therefore, multiplayer formats afford the opportunity for a useful expansion of the prisoners' dilemma game.

In the extended prisoners' dilemma, $N$ players are considered in a game. A time-invariant profit function is given. The profit function has $N$ parameters, and for each set of inputs only one corresponding output is produced. The choice set contains two elements, cooperation and defection. That is to say, each player can only select one of these two choices as their action. The game has $L_m$ rounds. In one round, all players have to give their choice to the judges at the same time and then get the feedback. The feedback includes the other players' choices. As the profit function is provided for the game, all players can derive their competitors' profit from the feedback, which includes all players' choices. As this is a game with incomplete information, all the players can mask their strategy. In addition, the other players' choices are unknown before the feedback step.

By referring to a published multiplayer dilemma study [25], a reward and punishment rule was defined for this study as follows.

(1) When all players selected cooperation (C), their income $R$ was averaged across all players.

(2) When some players selected defection (D), the income $T$ was averaged out among the treacherous players, while the income $S$ was shared amongst the cooperative players.

(3) When all players selected defection (D), the income $P$ was averaged out among all players.

For a prisoners' dilemma with $n = 4$, the parameters were set to $R = 12$, $T = 10$, $S = 0$, and $P = 4$, which is in agreement with the standard form of the game. That is, the individual optimum solution of every player was obtained when one player selected defection, while the overall optimum solution was obtained when all players selected cooperation.

In this prisoners' game, the four strategies of each group were unavailable to the other players before they made their decision. After decisions were made, the income of each player was calculated and the decisions were revealed. The game was repeated 10000 times.
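The reward and punishment rule for the multiplayer game can be sketched as a single profit function. This is one reading of rules (1) to (3) above, with "averaged" and "shared" both taken to mean an equal split; the function name `multiplayer_profit` and its signature are hypothetical. The default parameters are the values stated for $n = 4$.

```python
def multiplayer_profit(choices, R=12.0, T=10.0, S=0.0, P=4.0):
    """Return one income per player for a single round.

    choices: sequence of "C" / "D", one entry per player.
    Assumes the stated incomes are split equally within each group.
    """
    n = len(choices)
    defectors = sum(1 for c in choices if c == "D")
    cooperators = n - defectors
    if defectors == 0:
        # Rule (1): everyone cooperated, R is averaged across all players.
        return [R / n] * n
    if cooperators == 0:
        # Rule (3): everyone defected, P is averaged across all players.
        return [P / n] * n
    # Rule (2): T is split among defectors, S among cooperators.
    return [T / defectors if c == "D" else S / cooperators for c in choices]


if __name__ == "__main__":
    print(multiplayer_profit(["C", "C", "C", "C"]))
    print(multiplayer_profit(["D", "C", "C", "C"]))
    print(multiplayer_profit(["D", "D", "D", "D"]))
```

Under this reading, a lone defector among four players receives the whole of $T$, which is more than the per-player share of $R$ under full cooperation, while full cooperation gives the largest total income.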

3. Strategy in Prisoner's Dilemma Game

In the prisoner's dilemma game we are studying, the strategy of the other players is unknown, while the profit function and choice space are clear. Our strategy in this game is to reduce the unclear information and maximize the expected profit. There are at least three challenges, as follows.

(i) The strategies of other players are variable and unstable. They may make a different choice in the same situation (like the strategy of making a random choice). That is to say, no best choice can be selected in a single game.

(ii) All of the other players in the game may have different strategies, and they play together in one game.

(iii) The performance and efficiency of the strategy should be guaranteed, especially when the number of players is large.

To solve the problems listed above, we propose the Bayes formula as the basic method of our strategy. Though players' choices can be considered random and irregular, their actions can be described as a series of probabilities of the possible choices. The Bayes formula provides us with a way to make predictions based on the history. According to the Bayes formula, we can build our prediction table, which includes the probability sets for each individual player. As the game in our study is a multiple-turn game, the historical data is easy to obtain and store.

Table 2: Decision possibility table of TFT.

| Choice of last turn (my, opponent's) | My choice of next turn (C) | My choice of next turn (D) |
|---|---|---|
| (C, C) | 1.0 | 0 |
| (C, D) | 0 | 1.0 |
| (D, C) | 1.0 | 0 |
| (D, D) | 0 | 1.0 |

After the prediction, we can evaluate each choice in our choice space with some probability-based method, for instance, the expectation of profit. To be more convincing, we can include multiple future steps in our evaluation. Finally, we select the choice with the highest value as the decision of the current turn.

To make our strategy more effective, we can use an $N$-dimensional array or a hash table for data storage. The $N$-dimensional array shows the best performance in competitions with a small number of players. The hash table is used in games including a large number of players.
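The trade-off between the two storage options can be illustrated with a counter of how often each combination of the players' choices has occurred. The sketch below is an assumption about how such a store might look, not the authors' data structure: `JointChoiceCounter` and `dense_cells` are hypothetical names, and Python's `collections.Counter` (a dictionary, hence a hash table) stands in for the hash table mentioned in the text.

```python
from collections import Counter


class JointChoiceCounter:
    """Hash-table store: one counter per observed combination of choices.

    Keys are tuples holding the choices of all N players in one turn,
    so only combinations that actually occurred take up space.
    """

    def __init__(self):
        self._counts = Counter()

    def record(self, choices):
        """Register the choices of all players for one finished turn."""
        self._counts[tuple(choices)] += 1

    def count(self, choices):
        """Number of turns in which exactly this combination occurred."""
        return self._counts[tuple(choices)]  # unseen combinations give 0

    def stored_entries(self):
        return len(self._counts)


def dense_cells(k, n):
    """Cells needed by a dense N-dimensional array with K entries per axis."""
    return k ** n


if __name__ == "__main__":
    store = JointChoiceCounter()
    store.record(("C", "D", "C"))
    store.record(("C", "D", "C"))
    print(store.count(("C", "D", "C")), store.stored_entries(), dense_cells(2, 20))
```

A dense array needs $K^N$ cells regardless of what was observed, whereas the hash table never holds more entries than the number of recorded turns, which is why it scales better as the number of players grows.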

3.1. Prediction. In an extended prisoner's dilemma game model, we assume that there are $N$ players in the game and each player has $K$ choices. The goal of prediction is to get the probability distribution of each player's choice, that is,

$$P(X_i = c_k). \tag{4}$$

We do not know each player's strategy, but we can get each player's choices in the past. Our strategy assumes that other players base their decision on the historical data they recorded and on a decision possibility table. Other players record a limited number of steps of historical data and use a decision possibility table to make their decision. For example, Table 2 shows the decision possibility table used in TFT. Consequently, we can infer such a table with tools from probability theory, including the Bayes formula, and make a prediction of the other players' choices based on the history data.

The decision possibility table can be presented as the possibility function $P(X_i = c_k \mid H = h_t)$, which means the possibility of player $i$ ($X_i$) choosing $c_k$ when the history of opponents' choices is $h_t$. From the Bayes formula [26], we know that

$$P(X_i = c_k \mid H = h_t) = \frac{P(X_i = c_k) \prod_j P(X_{j,m} = h_{j,t} \mid X_i = c_k)}{\sum_k \left( P(X_i = c_k) \prod_j P(X_{j,m} = h_{j,t} \mid X_i = c_k) \right)}, \tag{5}$$

where $X_i$ means the choice of player $i$, $c_k$ represents the $k$th choice, $H$ represents the history of the choices of all players, and $h_t$ is an $m$-dimensional vector meaning the current record of the players' decisions:

$$
\begin{aligned}
h_j &= \{R_{j-1}, R_{j-2}, \ldots, R_{j-m}\} \\
    &= \{r_{1,j-1}, r_{2,j-1}, \ldots, r_{N,j-1}, \ldots, r_{1,j-m}, r_{2,j-m}, \ldots, r_{N,j-m}\} \quad (r_{i,j} \in \{c_1, c_2, \ldots, c_K\}), \\
h_{i,l} &= \{r_{i,l-1}, r_{i,l-2}, \ldots, r_{i,l-m}\},
\end{aligned} \tag{6}
$$

where $R_j$ is the set of choices of each player in the $j$th turn. The record set $h_j$ dates back $m$ turns and includes $m$ sets of records; $r_{i,j}$ is the choice of player $i$ in the $j$th turn, which is one of the elements of the set of choices; and $X_{j,m}$ is the $m$-step choice history of player $j$. Here we consider that all the strategies base their decision on the last few steps, and therefore older historical choices are immaterial to the current judgment. If $m$ is very large, or in some extreme cases, there may not be enough records for building $h_j$, and the denominator of the formula may become 0. To avoid such a situation, we can correct the original formula by simultaneously increasing the numerator and the denominator:

$$P(X_i = c_k \mid H = h_t) = \frac{\left( P(X_i = c_k) \prod_j P(X_{j,m} = h_{j,t} \mid X_i = c_k) \right) + 1}{\left( \sum_k \left( P(X_i = c_k) \prod_j P(X_{j,m} = h_{j,t} \mid X_i = c_k) \right) \right) + |c_k|}. \tag{7}$$
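The corrected formula (7) can be checked numerically with a short function. The sketch below is an illustration under assumptions: $|c_k|$ is read as the number of available choices $K$, which the text does not state explicitly, and the function name `smoothed_posterior` and its argument layout are hypothetical.

```python
from math import prod


def smoothed_posterior(priors, likelihoods):
    """Evaluate formula (7) for every choice c_k.

    priors[k]      : P(X_i = c_k)
    likelihoods[k] : list of P(X_{j,m} = h_{j,t} | X_i = c_k), one per player j
    Assumes |c_k| in (7) is the number of choices, len(priors).
    """
    joint = [p * prod(terms) for p, terms in zip(priors, likelihoods)]
    denominator = sum(joint) + len(priors)
    return [(value + 1) / denominator for value in joint]


if __name__ == "__main__":
    # No usable records: every product is 0, yet the result is well defined.
    print(smoothed_posterior([0.5, 0.5], [[0.0], [0.0]]))
    # Records that favour the first choice.
    print(smoothed_posterior([0.5, 0.5], [[1.0], [0.0]]))
```

Under this reading the corrected values still sum to 1, and when no record matches the current history the prediction falls back to the uniform distribution instead of dividing by zero.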

The next step is to get the historical record for the possibility of each situation. From the previous analysis, we know that we should get the values of $P(X_i = c_k)$ and $P(H = h_j \mid X_i = c_k)$ from $R_j$. Basically, we can easily get the probabilities from the formulas

$$
\begin{aligned}
P(X_i = c_k) &= \frac{\sum_{l=1}^{L} \tau_k(r_{i,l})}{\sum_{l=1}^{L} \sigma_i(r_{i,l})}, \\
\prod_j P(X_{j,m} = h_{j,t} \mid X_i = c_k) &= P(X_{1,m} = h_{1,t}, X_{2,m} = h_{2,t}, \ldots, X_{N,m} = h_{N,t} \mid X_i = c_k) \\
&= \frac{\sum_{l=m}^{L} \prod_{j=1}^{N} \tau^{*}_{i,k,t}(h_{j,l})}{\sum_{l=m}^{L} \prod_{j=1}^{N} \sigma^{*}_{t}(h_{j,l})}, \\
\tau_k(r_{i,l}) &= \begin{cases} 1 & (r_{i,l} = c_k,\ r_{i,l} \in R_l) \\ 0 & (\text{else}), \end{cases} \\
\sigma_i(r_{i,l}) &= \begin{cases} 1 & (r_{i,l} \in R_l) \\ 0 & (\text{else}), \end{cases} \\
\tau^{*}_{i,k,t}(h_{j,l}) &= \begin{cases} 1 & (r_{i,l} = c_k,\ h_{j,l} = h_{j,t}) \\ 0 & (\text{else}), \end{cases} \\
\sigma^{*}_{t}(h_{j,l}) &= \begin{cases} 1 & (h_{j,l} = h_{j,t}) \\ 0 & (\text{else}). \end{cases}
\end{aligned} \tag{8}
$$

Here $h_{i,l}$ is an ordered and comparable sequence of the $m$-step records of player $i$, and $t$ is the current turn number. We can get the result of formula (7) from the data we record.
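The indicator sums in formula (8) amount to counting over the recorded turns. The sketch below shows one way to do that counting, under the reading that the product of the $\sigma^{*}$ terms is 1 exactly when the $m$ turns preceding turn $l$ equal the $m$ most recent turns for every player. The names `prior` and `window_ratio`, and the record layout (one tuple of all players' choices per turn), are assumptions.

```python
def prior(records, i, choice):
    """First line of (8): share of recorded turns in which player i chose `choice`."""
    if not records:
        return None
    return sum(1 for turn in records if turn[i] == choice) / len(records)


def window_ratio(records, i, choice, m):
    """Second line of (8), read as a ratio of counts.

    Denominator: past turns l whose preceding m turns equal the m most
    recent turns (all players' windows match).
    Numerator: those of them in which player i chose `choice` in turn l.
    Returns None when no past window matches (the zero-denominator case).
    """
    t = len(records)
    if t < m:
        return None
    current = records[t - m:]              # the m most recent turns
    hits = total = 0
    for l in range(m, t):
        if records[l - m:l] == current:    # windows of all players match
            total += 1
            hits += records[l][i] == choice
    return hits / total if total else None


if __name__ == "__main__":
    # Player 0 is "us"; player 1 repeats our previous choice (tit-for-tat).
    records = [("C", "C"), ("D", "C"), ("D", "D"),
               ("C", "D"), ("C", "C"), ("D", "C")]
    print(prior(records, 1, "C"))
    print(window_ratio(records, 1, "D", 1))
    print(window_ratio(records, 1, "D", 3))
```

With a one-step window the counts recover the tit-for-tat response to our last move, while a three-step window finds no matching record in this short history, which is the situation the correction in formula (7) is meant to handle.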

Moreover, as the strategy varies from player to player, we should not expect all players to use similar strategies in the game. For example, some players may ignore the choices they made themselves and focus on the others' choices. In that case, the records of the player himself should be abandoned when making the prediction. So we have to add a weight for each history record. The weight function is relevant to the player and the choice,

$$\omega_i = w_i(X_j, c_k), \tag{9}$$

and the probability formula becomes

$$\prod_j P(X_{j,m} = h_{j,t} \mid X_i = c_k) = \frac{\sum_{l=m}^{L} \prod_{j=1}^{N} w_i(X_j, c_k)\, \tau^{*}_{i,k,t}(h_{j,l})}{\sum_{l=m}^{L} \prod_{j=1}^{N} w_i(X_j, c_k)\, \sigma^{*}_{t}(h_{j,l})}. \tag{10}$$

The weight function, which distributes the weight to each history record, varies between strategies. In the experiment on the depth of steps considered by the weight function, we found that depths within 5 perform similarly in the game (see Figure 13). So in our study, we use a one-turn weight function for prediction. In addition, we find that most of the classical strategies ignore their own choices and treat all the choices in the same way. Therefore, we can obtain one of the weight functions by deriving these attributes from the classical strategies:

$$w_i(X_j, c_k) = \begin{cases} 1 & (i = j) \\ 0 & (\text{else}). \end{cases} \tag{11}$$

3.2. Income Evaluation. In this step, we must evaluate each choice we can return and choose one of them as the final decision. We select the choice with the highest score after the evaluation:

$$C_t = \arg\max_k \operatorname{val}(c_k, h_t). \tag{12}$$

We can base the evaluation on the possibility of each player's choices and on the profit function. Since the profit function is given, we can easily predict the value that our player would obtain from each choice. In one-step prediction, the value equals the expectation of the income. Consider

$$\operatorname{val}(c_k, h_t) = \sum \operatorname{profit}(c_k, c_{x_1}, \ldots, c_{x_N}) \cdot P(X_1 = c_{x_1}, \ldots, X_N = c_{x_N} \mid H = h_t). \tag{13}$$
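The one-step evaluation in formulas (12) and (13) can be sketched as an expectation over the predicted choices of the other players. The code below is a simplified illustration, not the authors' evaluator: it assumes the other players' predicted choices are independent (formula (13) uses their joint probability), it puts our player at index 0 of the profit function, and the names `expected_income`, `best_choice`, and `two_player_profit` are hypothetical.

```python
from itertools import product

# Two-player payoffs from Section 2.2 (R = 3, T = 5, S = 0, P = 1).
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}


def two_player_profit(choices):
    """Profit function for the classical game; returns one income per player."""
    return PAYOFF[tuple(choices)]


def expected_income(my_choice, others, profit):
    """One-step value of `my_choice`, in the spirit of formula (13).

    others: one {choice: probability} dict per other player, treated as
            independent predictions (an assumption of this sketch).
    profit: maps a tuple of all players' choices to a sequence of incomes,
            with our player at index 0.
    """
    total = 0.0
    for combo in product(*(dist.items() for dist in others)):
        prob = 1.0
        for _, p in combo:
            prob *= p
        outcome = (my_choice,) + tuple(c for c, _ in combo)
        total += prob * profit(outcome)[0]
    return total


def best_choice(others, profit, choices=("C", "D")):
    """Formula (12): the choice with the highest evaluated value."""
    return max(choices, key=lambda c: expected_income(c, others, profit))


if __name__ == "__main__":
    prediction = [{"C": 0.5, "D": 0.5}]
    print(expected_income("C", prediction, two_player_profit))
    print(expected_income("D", prediction, two_player_profit))
    print(best_choice(prediction, two_player_profit))
```

For these payoffs, defection has the higher one-step expected income whatever the predicted distribution, because $T > R$ and $P > S$; this is one reason a multistep evaluation can change the decision.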

The function "profit" is a given profit function with $N$ parameters, where $N$ is equal to the number of players; it returns a vector of the incomes of the players who make the same choice as the first parameter. We can make our value function more far-sighted if both $N$ and $K$ are small (as in the classical prisoner's dilemma, where $N = 2$ and $K = 2$). A $p$-step prediction of the income is then a more efficient method. We can get the best choice with the recursive program (see Algorithm 1).

    Function p_step(c_k, p)
        if p < 0 then
            return 0
        V ← 0
        for each choice c_k do
            V ← V + val(c_k, h_t)
            Append(h_t, c_k, c_x2, ..., c_xN)
            for each choice c do
                V ← V + p_step(c, p − 1)
            end
            Remove(h_t, c_k, c_x2, ..., c_xN)
        end
        return V

Algorithm 1
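A runnable variant of the multistep idea is sketched below for the two-player game. It is not a transcription of Algorithm 1: where the printed pseudocode accumulates the values of all choices, this sketch weights each outcome by its predicted probability and takes the best of our follow-up choices, which is an assumption about the intended evaluation. The names `lookahead` and `tft_model` are hypothetical, and `tft_model` is a stand-in rival model rather than the Bayes prediction itself.

```python
# Two-player payoffs from Section 2.2; index 0 of each pair is our income.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}


def tft_model(history):
    """Hypothetical rival model: a tit-for-tat rival repeats our last choice.

    history holds (our choice, rival choice) pairs, oldest first.
    """
    last_mine = history[-1][0] if history else "C"
    return {last_mine: 1.0}


def lookahead(my_choice, history, depth, predict, choices=("C", "D")):
    """Expected income of `my_choice` now plus the best expected income
    over the next `depth` turns, given a rival prediction function."""
    value = 0.0
    for rival_choice, prob in predict(history).items():
        if prob == 0.0:
            continue
        value += prob * PAYOFF[(my_choice, rival_choice)][0]
        if depth > 0:
            history.append((my_choice, rival_choice))   # cf. Append in Algorithm 1
            value += prob * max(
                lookahead(c, history, depth - 1, predict, choices)
                for c in choices
            )
            history.pop()                               # cf. Remove in Algorithm 1
    return value


if __name__ == "__main__":
    for depth in (0, 1):
        print(depth, lookahead("C", [], depth, tft_model),
              lookahead("D", [], depth, tft_model))
```

Against the tit-for-tat model, defection has the higher value at depth 0, but already at depth 1 cooperation scores higher, because the rival's retaliation in the following turn is taken into account.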

3.3. Feedback Record. At the end of each turn of the game, we receive feedback from the system. The feedback includes each player's choice and the income each player obtains. In our model the profit function is given, so the incomes of all players can be computed from the historical records of all players' choices.

The content of the feedback can be represented as follows:

$$F_l = \{c_{x_1}, c_{x_2}, \ldots, c_{x_N}\}, \quad c_{x_i} \in \{c_1, c_2, \ldots, c_K\}. \tag{14}$$

We can record the feedback as a list. With a list, the space complexity of the record is $O(LN)$. In the prediction step, the time complexity is $O(KN^2L^2)$; in the income evaluation step, it is $O(NK^N)$; and in the recording step, it is $O(N)$. In our discussion the number of players and the number of choices are relatively small while the number of turns is large, so that $NK \ll L$. The bottleneck of the problem is therefore the time complexity of prediction.

To exploit the facts that all records have the same weight and that only one historical step is considered, we can use an $N$-dimensional array for storage. We build an $N$-dimensional array $A$ in which each dimension has length $K$. The entries of $A$ are counters of the specific situations, each representing one combination in the choice space. Thus $A[c_{x_1}][c_{x_2}] \cdots [c_{x_N}]$ is the total number of turns in which players $1, 2, \ldots, N$ chose $c_{x_1}, c_{x_2}, \ldots, c_{x_N}$. In this situation, formula (10) simplifies to

$$\prod_j P(X_{j,m} = h_{j,t} \mid X_i = c_k) = \frac{A[r_{1,t-1}][r_{2,t-1}] \cdots [r_{i-1,t-1}][c_k][r_{i+1,t-1}] \cdots [r_{N,t-1}]}{\sum_{k'=1}^{K} A[r_{1,t-1}][r_{2,t-1}] \cdots [r_{i-1,t-1}][c_{k'}][r_{i+1,t-1}] \cdots [r_{N,t-1}]}. \tag{15}$$

The time complexity of prediction is reduced to $O(NK)$. The time complexity of the recording step becomes $O(1)$ (in an actual data structure it is $O(N)$), while the space complexity rises to $O(K^N)$. This space complexity is acceptable when $K$ and $N$ are relatively small.
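The counter array and the normalisation in formula (15) can be sketched in Python as follows. This is a hedged illustration rather than the authors' code: choices are encoded as integers 0 to K−1, the N-dimensional array is stored as a flat list of length K^N, and the class name `JointCounter` and the uniform fallback for unseen situations are assumptions of the sketch.

```python
class JointCounter:
    """Flat K**N array of counters, indexed by one choice per player."""

    def __init__(self, n_players, n_choices):
        self.n = n_players
        self.k = n_choices
        self.counts = [0] * (n_choices ** n_players)

    def _index(self, outcome):
        # Mixed-radix (base K) encoding of the tuple (c_x1, ..., c_xN).
        idx = 0
        for c in outcome:
            idx = idx * self.k + c
        return idx

    def record(self, outcome):
        """Recording step: O(N) to compute the index, then one increment."""
        self.counts[self._index(outcome)] += 1

    def predict(self, player, last_outcome):
        """Distribution over `player`'s choices, following formula (15).

        The other players' slots are fixed to their previous-round choices,
        so K entries are read, each located in O(N) time.
        """
        row = []
        for c in range(self.k):
            probe = list(last_outcome)
            probe[player] = c
            row.append(self.counts[self._index(probe)])
        total = sum(row)
        if total == 0:
            return [1.0 / self.k] * self.k  # assumed fallback: uniform guess
        return [x / total for x in row]
```

For the classical game with N = 2 and K = 2 the array holds only 4 counters, whereas a list record over 10000 turns stores 20000 choices.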

The $N$-dimensional array only supports looking back one step, and its data structure becomes complex when $N$ is large. Here we can use a hash table to keep the record more efficiently. A list of $m$-step data records is shown as follows:

$$F_l, F_{l-1}, \ldots, F_{l-m+1} = \{c_{x_1}, c_{x_2}, \ldots, c_{x_N}\}_{l-1}, \ldots, \{c_{x_1}, c_{x_2}, \ldots, c_{x_N}\}_{l-m}. \tag{16}$$

We can obtain its hash code through a hash function:

$$\mathrm{hash}_l = \mathit{Ha}(F_l, F_{l-1}, \ldots, F_{l-m+1}). \tag{17}$$

The hash code provides an index into a record array, which records the number of times a specific situation has occurred. With a hash table, the time complexity of prediction remains $O(NK)$, and the time complexity of recording is $O(1)$. The space complexity becomes $O(1)$ in the sense that it is determined by the device and is independent of $N$ and $K$. Table 3 shows the space and time complexity of the different strategies.
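A hash-table record of $m$-step histories might look like the following Python sketch. Several points are assumptions and not taken from the paper: the class name `HistoryTable`; the use of Python's built-in tuple hashing in place of the unspecified hash function of (17); and the prediction rule, which generalises formula (15) by taking the last $m-1$ rounds as context and probing the next round with the other players repeating their latest choices. Note also that a Python `Counter` grows with the number of distinct windows observed, unlike the fixed-size record array described above.

```python
from collections import Counter, deque


class HistoryTable:
    """Counts how often each m-step window of joint outcomes has occurred."""

    def __init__(self, m):
        self.m = m
        self.window = deque(maxlen=m)  # the last m feedback tuples
        self.table = Counter()         # hash table: window -> count

    def record(self, feedback):
        """Append one round's feedback F_l and count the resulting window."""
        self.window.append(tuple(feedback))
        if len(self.window) == self.m:
            self.table[tuple(self.window)] += 1

    def predict(self, player, choices):
        """Relative frequency of each choice of `player` in the next round."""
        uniform = {c: 1.0 / len(choices) for c in choices}
        if not self.window:
            return uniform
        recent = tuple(self.window)
        # Context: the last m-1 rounds (empty when m == 1).
        context = recent[-(self.m - 1):] if self.m > 1 else ()
        last = recent[-1]
        counts = {}
        for c in choices:
            # Hypothetical next round: only `player`'s slot is varied.
            probe = last[:player] + (c,) + last[player + 1:]
            counts[c] = self.table[context + (probe,)]
        total = sum(counts.values())
        if total == 0:
            return uniform  # assumed fallback for unseen situations
        return {c: n / total for c, n in counts.items()}
```

Each prediction performs K dictionary lookups, and each key is built from at most $m$ tuples of $N$ choices.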

4. Brief Steps of the Algorithm

First, we built an environment for the prisoner's dilemma game. Each player is asked to provide a strategy function and an update function. The program of the environment is shown in Algorithm 2.

The strategy and update functions that we provide are shown in Algorithm 3.
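The environment loop and the strategy/update interface described above can be sketched for two players in Python. This is only an illustration under stated assumptions: the payoffs are the classical values R = 3, T = 5, S = 0, and P = 1, and the classes `TitForTat` and `AlwaysDefect` and the function `play` are hypothetical names introduced here to exercise the loop.

```python
# Incomes of (player A, player B) for each pair of choices.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}


class TitForTat:
    """Cooperate first, then repeat the opponent's previous choice."""

    def __init__(self):
        self.last_opponent = "C"

    def strategy(self):
        return self.last_opponent

    def update(self, own, opponent):
        self.last_opponent = opponent


class AlwaysDefect:
    """Hypothetical baseline used only to exercise the loop."""

    def strategy(self):
        return "D"

    def update(self, own, opponent):
        pass


def play(player_a, player_b, max_turn):
    """Run the game loop for two players and return their total incomes."""
    totals = [0, 0]
    for _ in range(max_turn):
        a, b = player_a.strategy(), player_b.strategy()  # simultaneous moves
        income_a, income_b = PAYOFF[(a, b)]
        totals[0] += income_a
        totals[1] += income_b
        player_a.update(a, b)  # feedback step
        player_b.update(b, a)
    return tuple(totals)
```

Over 10 turns, `play(TitForTat(), AlwaysDefect(), 10)` gives the tit-for-tat player 0 in the first turn and 1 in each of the remaining nine turns, against 5 and then 1 per turn for the defector.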

5. Experimental Results and Analysis

5.1. The Performance of the Bayes Model in the Double-Player Game. Four typical models were each run against the Bayes model 10000 times, and the total incomes of both players in each game were recorded. Figure 1 shows the overall incomes of both players over 10000 games, comparing the proposed Bayes model with the other four typical models. Overall, the Bayes model was more advantageous and achieved a higher score (overall income) than the other four. Of the four typical strategy models, TFT performed best: its overall income was equivalent to that of the Bayes model and higher than that of all the others. For each game pair, this research presents two test results, each corresponding to one of the two stable score outcomes of the selected game pair. Figure 2 reveals that the Bayes model earned a higher income than the random, Pavlov, and GTFT models. The income ratio to the GTFT model reached 6.6, while that to the TFT model also exceeded one. Examination of the final income from repeated games showed that the Bayes model was more advantageous than the other four typical strategy models tested here.

In the games repeated 10000 times, the cases in which the Bayes model scored 5, 3, 1, and 0 were statistically analyzed. As shown in Figure 3, the scores of 5 and 1 accounted for a relatively large proportion; that is to say, the Bayes model was prone to defection. Analysis of Figures 1 and 3 implied that high scores mostly corresponded to cases scoring 5. Moreover, the results of each game competition showed that the incomes achieved by the Bayes model were higher when it manifested its tendency to defection.

Table 3: Time and space complexity of the Bayes method with different data structures and of some typical models.

Method        Space complexity   Time complexity of prediction   Time complexity of record
List          $O(LN)$            $O(KN^2L^2)$                    $O(1)$
N-d array     $O(K^N)$           $O(NK)$                         $O(N)$
Hash table    $O(1)$             $O(NK)$                         $O(1)$
TFT           $O(N)$             $O(N)$                          $O(N)$
GTFT          $O(N)$             $O(N)$                          $O(N)$
Pavlov        $O(N)$             $O(N)$                          $O(N)$
Random        $O(1)$             $O(1)$                          $O(1)$

for t from 0 to MAX_TURN do
    for each player i do
        decision[i] ← player[i].strategy()
    end
    feedback ← profit(decision)
    for each player i do
        player[i].update(feedback)
    end
end

Algorithm 2

procedure strategy()
    if tem_turn < TURN_THRESHOLD then
        decision ← random(choice_space)
    else
        prediction_list ← predict(restore)
        for each choice c do
            profit[c] ← profit_expect(c, prediction_list)
        end
        decision ← arg(max(profit))
    end
    return decision
end
procedure update(feedback)
    storage.add(feedback)

Algorithm 3

Analysis of the overall income of both players in each game (Figure 4) showed that the overall income in the game with the TFT model was lower. According to the performance of the rivals in each game (Figure 3), and in comparison with test result 2 (names ending in "2"), both strategy models in test result 1 (names ending in "1") were less inclined to defection; therefore, their overall income was higher. Since TFT is considered to be the model that achieves the most desirable results among the four typical strategy models, the games setting the TFT strategy model against the Bayes model were investigated in particular. In this game, the overall incomes of both sides were lower than those in the other game competitions. This indicated that both the TFT strategy model and the Bayes model suffered losses in the game between them. Study of the single-game results of the Bayes model against the TFT model showed that the scores of the Bayes model were 1 (both sides simultaneously selected defection). This result suggested that, in any game between the Bayes model and the TFT model, defection appeared more frequently, which reflects the defection-prone tendencies of the two models.

Figure 2: The income ratio of the Bayes model to the other models in the games (bars: Random1, Random2, TFT1, TFT2, Pavlov1, Pavlov2, GTFT1, GTFT2).

Figure 3: The distribution of Bayesian scores (5, 3, 1, and 0) in each game.

Figure 4: The comparisons of overall game incomes (Bayes versus rival).

5.2. The Performance of the Bayes Model in a Multiplayer Game. For multiplayer games, this study used four different strategy models jointly to run against the Bayes model, and the overall income of each model was recorded. With the methods described in Section 4, the income accruing to each player was distributed and the overall income was calculated. In this section, the decision methods of the TFT, Pavlov, and GTFT models differed slightly from those applied in the two-player game: if any one of the other players defected in the previous round, the rivals selected defection. In the two-player game, by contrast, the decision for the current round was made by examining the decisions of players A and B in the previous round.

It can be deduced from Figure 5 that, in the multiplayer game, the Bayes model returned the highest overall income. In addition, the overall game situation implied that cooperation was the choice most often selected by the four typical strategy models. That is to say, in the game each of the four models treated cooperation as its main strategy (Figure 6).

5.3. Analysis of the Performance of the Bayes Model versus Normal Models. The normal model refers to the strategy models that are likely to be encountered in real-life enactments of the game, simulated here using a natural model. The natural model adopts a random strategy; that is, when faced with identical decisions from the previous round, different natural models select cooperation with different probabilities. To verify that the strategy selected by the proposed Bayes model reaps more benefits in games against the normal model, the game between them was repeated 500 times, with 1000 selections in each game. This was an attempt to analyze the advantages and disadvantages of the Bayes model more comprehensively.
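One way to realise such a natural model is a memory-one stochastic player whose probability of cooperation depends on the joint decision of the previous round. The following Python sketch makes several assumptions that the paper does not specify: the four probabilities are drawn uniformly at random, the opening move is a fair coin flip, and the names `NaturalModel` and `random_natural_model` are illustrative.

```python
import random


class NaturalModel:
    """Memory-one stochastic player: P(cooperate) depends on the last round."""

    def __init__(self, coop_prob, rng=None):
        # coop_prob maps (own_last, opponent_last) -> probability of "C".
        self.coop_prob = coop_prob
        self.rng = rng or random.Random()
        self.last = None

    def strategy(self):
        if self.last is None:
            # Assumed opening move: cooperate with probability 0.5.
            return "C" if self.rng.random() < 0.5 else "D"
        return "C" if self.rng.random() < self.coop_prob[self.last] else "D"

    def update(self, own, opponent):
        self.last = (own, opponent)


def random_natural_model(rng):
    """Draw the four cooperation probabilities uniformly at random."""
    keys = [("C", "C"), ("C", "D"), ("D", "C"), ("D", "D")]
    return NaturalModel({k: rng.random() for k in keys}, rng)
```

Calling `random_natural_model` once per repetition yields a different opponent for each of the repeated games.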

Figure 7 shows that the Bayes model performed better than most of the natural models. Overall, the income of the Bayes model reached approximately 3000, and even 4500 in individual extreme cases. The income ratio in Figure 8 peaked at approximately 14, while most of the income ratios were above one.

In addition, the TFT model also played games against the normal model; the result is shown in Figure 9. The average income of the TFT model was approximately 2500, which was lower than that of the Bayes model while still equivalent to those of its rivals.

5.4. The Performance of the Bayes Model When Run over Fewer Games. Since the Bayes model is a machine learning model, it needs a certain amount of data to guarantee its learning. Therefore, whether the Bayes model can still achieve better game results when fewer games are played should be considered.

Figure 5: The overall income of each model after 10000 rounds of the multiplayer game (models: Bayes, TFT, Pavlov, GTFT).

Figure 6: Frequency of occurrence of games in the one-shot game as the number of betrayals varies from 0 to 4.

In this study, the Bayes model was evaluated over fewer games: it competed against the TFT model 100 times, with the game repeated 100 times in each competition. Figures 10 and 11 show that the Bayes model was more advantageous.

5.5. Game Results from a PTFT Model Compared with the Other Models. The game models discussed above considered only the selections made by both sides in the previous step, whereas the PTFT model takes into account the selections of the previous three steps. The incomes of each model over 10000 runs of the PTFT model against the other models are shown in Figure 12. They suggest that the Bayes model was at a disadvantage in this game and gained neither more nor less than the TFT model (both models became trapped in the mutual defection deadlock). Moreover, the Pavlov model returned the lowest individual income, while the GTFT model gained the most.

This revealed one of the disadvantages of the Bayes model: if it fails to consider all the state characteristics that may appear in the game, it cannot obtain the optimum solution. In the current experiment, since the number of decision steps of both sides considered by the Bayes model was set to one, the Bayes model was incapable of obtaining the optimum income in the game against the PTFT model.

6. Conclusions

The authors regarded the prisoners' dilemma as an incomplete information game with unpublicized game strategies. In the research, a machine learning model was constructed to solve problems in incomplete information games. Based on the Bayesian model, we can predict players' choices to better complete the unknown information. We also suggested the hash table to improve space and time complexity, and we built a game system with several types of game strategy for testing. The experimental results show that the proposed Bayes model obtained better game results than conventional typical strategy models in double- or multiplayer games run 10000 times. The Bayes model even appeared slightly better than the TFT model, the acknowledged optimal strategy. In games against the more general single-step decision model and in runs with fewer games, the Bayes model also dominated. This result indicates that the naive Bayesian classification algorithm is feasible and effective for establishing the strategy model of an incomplete information game. It provides a novel idea for solving incomplete information game problems.

Figure 7: The income of the Bayes model and normal models in the game.

Figure 8: The income ratio of the Bayes model versus the general model.

Figure 9: The income of the TFT model and normal models in the game.

However, the results obtained by the naive Bayesian classification algorithm showed certain defects: it was unable to obtain the desired solution when the decision ability of a rival was beyond its estimation range. This reduces the applicability of the machine learning algorithm when it encounters complex models, and should be the subject of future research.

Figure 10: The incomes of the Bayes and TFT models after 100 games.

Figure 11: The average income of the Bayes and TFT models after 100 games.

Figure 12: The incomes of each model (rival versus PTFT) after 10000 games.

Figure 13: The income of the Bayes model with different depths of considered step in the weight function when competing with a random m-step strategy (m = 1 to 5).

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This research was supported by the National Natural Science Foundation of China (no. 61100148) and the Project on the Integration of Industry, Education and Research of Guangdong Province (no. 2012B091100489).

References

[1] O. Nomia, Games with Incomplete Information, Universite Paris 1 Pantheon-Sorbonne, Paris, France, 1998.

[2] J. C. Harsanyi, "Games with incomplete information played by 'Bayesian' players, I. The basic model," Management Science, vol. 14, no. 3, pp. 159–182, 1967.

[3] M. Zinkevich, M. Johanson, M. H. Bowling, and C. Piccione, "Regret minimization in games with incomplete information," Advances in Neural Information Processing Systems, vol. 2008, no. 20, pp. 1729–1736, 2008.

[4] E. Alpaydin, Introduction to Machine Learning, The MIT Press, Cambridge, Mass, USA, 2004.

[5] C. M. Bishop, Pattern Recognition and Machine Learning, Information Science and Statistics, Springer, New York, NY, USA, 2006.

[6] X. G. Zhang, "Introduction to statistical learning theory and support vector machines," Acta Automatica Sinica, vol. 26, no. 1, pp. 32–42, 2000.

[7] R. J. Hanson, B. Cheeseman, and P. Stutz, Bayesian Classification Theory, NASA Ames Research Center, Artificial Intelligence Research Branch, 1991.

[8] M. Nowak and K. Sigmund, "A strategy of win-stay, lose-shift that outperforms tit-for-tat in the Prisoner's Dilemma game," Nature, vol. 364, no. 6432, pp. 56–58, 1993.

[9] D. M. Kreps, P. Milgrom, J. Roberts, and R. Wilson, "Rational cooperation in the finitely repeated prisoners' dilemma," Journal of Economic Theory, vol. 27, no. 2, pp. 245–252, 1982.

[10] D. W. K. Yeung, L. A. Petrosyan, and M. C. C. Lee, Dynamic Cooperation: A Paradigm on the Cutting-Edge of Game Theory, China Market Press, 2007.

[11] R. Axelrod, "Effective choice in the prisoner's dilemma," Journal of Conflict Resolution, vol. 24, no. 1, pp. 3–25, 1980.

[12] R. Axelrod and W. D. Hamilton, "The evolution of cooperation," Science, vol. 211, no. 4489, pp. 1390–1396, 1981.

[13] J. H. Miller, "The coevolution of automata in the repeated prisoner's dilemma," Journal of Economic Behavior and Organization, vol. 29, no. 1, pp. 87–112, 1996.

[14] W. H. Press and F. J. Dyson, "Iterated Prisoner's Dilemma contains strategies that dominate any evolutionary opponent," Proceedings of the National Academy of Sciences of the United States of America, vol. 109, no. 26, pp. 10409–10413, 2012.

[15] H. Lin and C.-X. Wu, "Evolution of strategies based on genetic algorithm in the iterated prisoner's dilemma on complex networks," Acta Physica Sinica, vol. 56, no. 8, pp. 4313–4318, 2007.

[16] M. A. Nowak and K. Sigmund, "Evolutionary dynamics of biological games," Science, vol. 303, no. 5659, pp. 793–799, 2004.

[17] J. W. Weibull, Evolutionary Game Theory, MIT Press, Cambridge, Mass, USA, 1997.

[18] M. A. Nowak and R. M. May, "Evolutionary games and spatial chaos," Nature, vol. 359, no. 6398, pp. 826–829, 1992.

[19] A. Cardillo, J. Gomez-Gardenes, D. Vilone, and A. Sanchez, "Co-evolution of strategies and update rules in the prisoner's dilemma game on complex networks," New Journal of Physics, vol. 12, no. 10, Article ID 103034, 2010.

[20] W.-B. Du, H.-R. Zheng, and M.-B. Hu, "Evolutionary prisoner's dilemma game on weighted scale-free networks," Physica A: Statistical Mechanics and Its Applications, vol. 387, no. 14, pp. 3796–3800, 2008.

[21] H. Ohtsuki, C. Hauert, E. Lieberman, and M. A. Nowak, "A simple rule for the evolution of cooperation on graphs and social networks," Nature, vol. 441, no. 7092, pp. 502–505, 2006.

[22] Y. Wang and C. Dang, "An evolutionary algorithm for global optimization based on level-set evolution and latin squares," IEEE Transactions on Evolutionary Computation, vol. 11, no. 5, pp. 579–595, 2007.

[23] Y. Wang, Y.-C. Jiao, and H. Li, "An evolutionary algorithm for solving nonlinear bilevel programming based on a new constraint-handling scheme," IEEE Transactions on Systems, Man and Cybernetics C: Applications and Reviews, vol. 35, no. 2, pp. 221–232, 2005.

[24] R. Selten and R. Stoecker, "End behavior in sequences of finite Prisoner's Dilemma supergames: A learning theory approach," Journal of Economic Behavior and Organization, vol. 7, no. 1, pp. 47–70, 1986.

[25] D. Y. Jiang, Situation Analysis of Double Action Games with Entropy, Science Press, New York, NY, USA, 2010.

[26] P. A. Flach and N. Lachiche, "Naive Bayesian classification of structured data," Machine Learning, vol. 57, no. 3, pp. 233–269, 2004.



Economic Behavior in 1944 [10]. Game theory brought radical changes to economics and provided a standard analysis tool for economists. In light of the contributions of game theory to economics, the Royal Swedish Academy of Sciences awarded Nobel Prizes for economics to Nash, Harsanyi, and Selten in 1994 and to Aumann and Schelling in 2005 [10]. In the famous artificial intelligence algorithm competition on the prisoners' dilemma, Axelrod concluded that the TFT (tit-for-tat) model was the optimum solution through brute force competition among the participating algorithms [11, 12]. Miller [13] introduced an automaton model to simplify and analyze the prisoners' dilemma and proposed a more general decision analysis method for it. He also applied the model to solve problems arising from generalisation. Press and Dyson suggested that the prisoner's dilemma is an ultimatum game and gave an example of a strategy that can gain an unfair share of rewards to support this claim [14]. Using a genetic algorithm, Lin and Wu [15] studied the evolution of strategies in the iterated prisoner's dilemma on complex networks and found that agents located on complex networks can naturally develop self-organization mechanisms of cooperation, which not only result in the emergence of cooperation but also strengthen and sustain persistent cooperation.

Evolutionary game theory [16, 17] extends and combines ideas from game theory and evolutionary biology to study the evolution of an interacting population of individuals. Perhaps one of the simplest games in evolutionary game theory is the so-called evolutionary spatial prisoners' dilemma (ESPD) [18]. Cardillo et al. [19] investigated the coevolution of strategies and update rules in the ESPD. The authors concluded, for a variety of underlying graph topologies, that when the dynamics coevolve with the strategies, this generally leads to more cooperation in the weak prisoners' dilemma. Du et al. discussed another evolutionary method that uses an improved weighted network to solve the problem [20]. Ohtsuki et al. [21] proposed a model using two graphs in conjunction with the ESPD: one for determining player interaction and a second for updating strategies. Moreover, Wang et al. have proposed evolutionary algorithms to solve relevant problems [22, 23]. Game theory techniques have been widely applied to various engineering design problems in which the action of one component has an impact on (and perhaps conflicts with) that of any other component.

The prisoners' dilemma can be regarded as a game with incomplete information. It satisfies the conditions for an incomplete information game; namely, the players of each game are incapable of determining the choice of their rival at any given stage. In our study, we propose a naive Bayesian classification method to establish a machine learning model for the prisoners' dilemma, in an attempt to solve it through statistical machine learning. With Bayesian classification, the opponents' strategy can be represented as the probability of each choice, which improves the accuracy of the prediction of the opponents' strategy. Moreover, we introduce an evaluation with multiple processes to provide high-precision information for the final decision in our strategy. In the recording step, we suggest some efficient data structures and ensure a reasonable space and time complexity for our method. We test the proposed method in competitions against some typical methods. The simulation results show that our method outperforms four classical methods (see Figure 1 for more details). We further apply the Bayes method to multiplayer games. The simulation result indicates that the Bayes method gains the highest income in the multiplayer test (see Figure 5 for more details).

Figure 1: The incomes of the models after 10000 games (Bayes versus rival; rivals: Random1, Random2, TFT1, TFT2, Pavlov1, Pavlov2, GTFT1, GTFT2).

2. Game Model

2.1. Game with Incomplete Information. Games with incomplete information can be defined as games in which the players are uncertain about some important parameters [2]. Generally, incomplete information can be regarded as the players' lack of full information on some basic law of the game. Such incomplete information can mainly arise in three different ways, as follows.

(1) Lack of information on the profit function of the game: the profit function is a function with $N$ or more parameters that gives the final income of each player ($N$ is the total number of players). It can be represented as

$$p = P(s_{x_1}, s_{x_2}, \ldots, s_{x_N}, p_1, p_2, \ldots, p_m), \tag{1}$$

where $s_{x_1}, s_{x_2}, \ldots, s_{x_N}$ are the strategy sets of all $N$ players and $p_1, p_2, \ldots, p_m$ are the other parameters affecting the outcome, such as the number of turns.

(2) Lack of information on the strategy of other players in the game: a strategy list can be defined as the decisions a player makes in each step ($L$ is the current number of turns):

$$s_i = \{d_1, d_2, \ldots, d_L\}. \tag{2}$$

Players make their decision for the next step based on the previous situations of the game:

$$d_{l+1} = D(g_1, g_2, \ldots, g_l). \tag{3}$$

(3) Lack of information on the available choice space of others in the game: each player's choice may be limited by the rules or the background of the game. That is to say, the players' actions in each step are finite but uncertain.

All other cases of incomplete information can be reduced to these three basic cases [1]. In the prisoner's dilemma, the player is not aware of the other player's strategy (case 2), because there is no best or winning strategy in the game. Yet the profit function (the outcome according to each player's choice) and the choice space (cooperate or betray) are definite in the game.

In our research, we mainly focus on games in case 2, where players know nothing about others' strategies. Such game models can be found in a wide range of areas. For example, in some economic models the choice space is limited by the situation of the problem, and the profit curves can be obtained from previous models. The strategy of each competitor is unpredictable when the profit is not linear or varies with time (as it does in reality). Such a model can therefore be regarded as a game with incomplete information in case 2.

2.2. Game of Prisoners' Dilemma. In this game, two individuals each choose cooperation or defection. If the two individuals cooperate with each other, they both earn income $R$; if they both defect, the income of both sides is $P$; if one individual cooperates while the other defects, the cooperative one gains $S$ while the treacherous player gains $T$ (Table 1). Here $T > R > P > S$ and $2R > S + T$. The latter formula means that the total income of two cooperative individuals is always larger than that gained when one individual defects. However, for an individual, the income earned by defecting against a cooperator is greater than that earned by mutual cooperation.

In this experiment, the parameters selected were consistent with those that Axelrod and Hamilton [12] used in solving the prisoners' dilemma. That is to say, the incomes were $R = 3$, $T = 5$, $S = 0$, and $P = 1$, which satisfy the conditions $T > R > P > S$ and $2R > S + T$.
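As a quick check of these conditions, the payoff table and the two inequalities can be written down directly. The sketch below is only an illustration; the function name `is_prisoners_dilemma` is hypothetical.

```python
R, T, S, P = 3, 5, 0, 1  # incomes as selected in this experiment

# Income of (player A, player B) for each pair of choices, as in Table 1.
PAYOFF = {
    ("C", "C"): (R, R),
    ("C", "D"): (S, T),
    ("D", "C"): (T, S),
    ("D", "D"): (P, P),
}


def is_prisoners_dilemma(r, t, s, p):
    """Check the two conditions T > R > P > S and 2R > S + T."""
    return t > r > p > s and 2 * r > s + t
```

With the selected values, $2R = 6$ exceeds $S + T = 5$, so both conditions hold. Raising $T$ to 7 would keep the ordering but violate the second condition, since alternating defection would then pay more than mutual cooperation.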

In addition, each pair of strategies competed $L_m$ times; namely, both sides had to make $L_m$ selections. The result of each turn was recorded, and the opponent's selection was sent to each player at the end of each turn. After $L_m$ turns, the results of each pair of strategies were listed in the form of their total scores, and the corresponding strategy models were evaluated according to the total score. In general competition, $L_m$ was set to 10000, so that the long-term game could yield stable income results.

Several typical strategy models are presented in thefollowing

(1) TFT (tit-for-tat) strategy that is ldquoreturn like-for-likerdquostrategy TFT is a well-known model for prisonersrsquodilemma The main idea of this strategy is thatby starting with cooperation the strategy selectionof a round is made on the basis of the selectionfrom the previous round That is if the rival selectscooperation or defection in the previous round theselection will be repeated in the current round This

Table 1 Game information

(Player A Player B) Cooperation (C) Defection (D)Cooperation (C) (119877 119877) (119878 119879)Defection (D) (119879 119878) (119875 119875)

strategy performed best in the artificial intelligencealgorithm competition organized by Axelrod andHamilton [12] (although theTFT strategy in this studywas consistent in concept with that TFT strategy thedifference here lies in its details)

(2) PTFT: an improved TFT strategy [24]. This strategy is relatively more selfish than TFT. It still starts with cooperation; however, in the following rounds, cooperation is selected only if there has been no defection for three rounds.

(3) GTFT: another improved TFT strategy [8]. It allows a certain probability of cooperation in the case of rival defection and a certain probability of defection in the case of rival cooperation. It resolves the deadlock arising from mutual defection in the competition.

(4) Pavlov: a different strategic concept [8]. It bullies the weak and fears the strong; namely, cooperation is continued in cases of mutual cooperation. However, defection is selected when one side chooses defection. Moreover, in the case of mutual defection, cooperation is given priority. Such a strategy represents a local optimum in genetic algorithm terms.

(5) Random: a random strategy, that is, randomly returning cooperation or noncooperation. In related programs, a 50:50 random strategy is most commonly applied; that is, the probability of returning cooperation or defection is 50%. This strategy is mainly adopted to assess fixed strategies, set competition parameters, and so forth.

(6) Normal: a strategy mode developed by simulating common players. In this strategy, cooperation or defection is selected with different probabilities based on the selections of both sides in the previous game. This strategy simulates player participation in a game using different strategies.
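As an illustration only, the deterministic strategies listed above can be sketched as functions of the previous round. The function names, the "C"/"D" encoding, and the opening move chosen for Pavlov are assumptions made for this sketch; the implementation used in this study may differ in its details.

```python
import random

C, D = "C", "D"  # assumed encoding of cooperation and defection


def tft(my_last, rival_last):
    """Tit-for-tat: cooperate first, then repeat the rival's previous choice."""
    return C if rival_last is None else rival_last


def pavlov(my_last, rival_last):
    """Pavlov as described above: C after (C, C) or (D, D), D after a mixed round."""
    if my_last is None or rival_last is None:
        return C  # assumption: the opening move is cooperation
    return C if my_last == rival_last else D


def random_strategy(my_last, rival_last):
    """50:50 random strategy."""
    return random.choice([C, D])


def play(strategy_a, strategy_b, rounds):
    """Play a two-player repeated game and return the list of (a, b) choices."""
    history = []
    a_last = b_last = None
    for _ in range(rounds):
        a = strategy_a(a_last, b_last)
        b = strategy_b(b_last, a_last)
        history.append((a, b))
        a_last, b_last = a, b
    return history
```

For example, `play(tft, pavlov, 3)` stays in mutual cooperation, while TFT facing a permanent defector is exploited once and then defects in every later round.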

2.3. Extended Prisoners’ Dilemma Model. The original prisoners’ dilemma problem contains only two players with two different choices, and the profit function of the game is given and fixed during the game. However, most real problems do not arise in such a simple setting; they are multiplayer variants. Therefore, multiplayer formats afford the opportunity for a useful expansion of the prisoners’ dilemma game.

In the extended prisoners’ dilemma, $N$ players are considered in a game. A time-invariant profit function is given. The profit function has $N$ parameters, and each set of inputs produces exactly one corresponding output. The choice set contains two elements, cooperation and defection; that is to say, each player can select only one of these two choices as their action. The game has $L_m$ rounds. In each round, all players submit their choices to the judges at the same time and then receive the feedback, which includes the other players’ choices. As the profit function is provided for the game, all players can compute their competitors’ profits from the feedback. As this is a game with incomplete information, all players can mask their strategies. In addition, the other players’ choices are unknown until the feedback is returned.

By referring to a published multiplayer dilemma study [25], a reward and punishment rule was defined for this study as follows.

(1) When all players selected cooperation (C), the income $R$ was averaged across all players.

(2) When some players selected defection (D), the income $T$ was averaged out among the defecting players, while the income $S$ was shared amongst the cooperating players.

(3) When all players selected defection (D), the income $P$ was averaged out among all players.

For a prisoners’ dilemma with $n = 4$, the parameters were set to $R = 12$, $T = 10$, $S = 0$, and $P = 4$, which was in agreement with the standard form of the game. That is, the individual optimum solution of a player was obtained when that one player selected defection, while the overall optimum solution was obtained when all players selected cooperation.

In this prisoners’ game, the four strategies of each group were unavailable to the other players before they made their decisions. After the decisions were made, the income of each player was calculated and the decisions were revealed. The game was repeated 10,000 times.
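The reward and punishment rule above can be written as a small function. This is a minimal sketch: the function name and the "C"/"D" encoding are assumptions, while the default parameter values are those stated for the four-player game.

```python
def multiplayer_payoff(choices, R=12.0, T=10.0, S=0.0, P=4.0):
    """Return the income of each player for one round.

    choices: list of "C" or "D", one entry per player.
    """
    n = len(choices)
    defectors = sum(1 for c in choices if c == "D")
    cooperators = n - defectors
    if defectors == 0:
        # Rule (1): R is averaged across all players.
        return [R / n] * n
    if cooperators == 0:
        # Rule (3): P is averaged across all players.
        return [P / n] * n
    # Rule (2): T is split among defectors, S among cooperators.
    return [T / defectors if c == "D" else S / cooperators for c in choices]
```

With four players, full cooperation yields 3 per player (12 in total), a lone defector receives 10 while the cooperators receive 0, and full defection yields 1 per player.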

3. Strategy in Prisoner’s Dilemma Game

In the prisoner’s dilemma game we are studying, the strategies of the other players are unknown, while the profit function and the choice space are known. Our strategy in this game is to reduce the unknown information and maximize the expected profit. There are at least three challenges, as follows.

(i) The strategies of the other players are variable and unstable. They may make different choices in the same situation (for example, a strategy that chooses randomly). That is to say, no best choice can be selected in a single game.

(ii) The other players in the game may all have different strategies, and they play together in one game.

(iii) The performance and efficiency of the strategy should be guaranteed, especially when the number of players is large.

To solve the problems listed above, we propose the Bayes formula as the basic method of our strategy. Though players’ choices can be considered random and irregular, their actions can be described as a series of probabilities of the possible choices. The Bayes formula provides us with a way to make predictions based on the history. According to the Bayes formula, we can build our prediction table, which includes the probability sets for each individual player. As the game in our study is a multiple-turn game, the historical data is easy to obtain and store.

Table 2: Decision possibility table of TFT.

Choice of last turn (my, opponent’s)   My choice of next turn (C)   My choice of next turn (D)
(C, C)                                 1.0                          0
(C, D)                                 0                            1.0
(D, C)                                 1.0                          0
(D, D)                                 0                            1.0

After the prediction, we can evaluate each choice in our choice space with some probability-based method, for instance, the expected profit. To be more convincing, we can include multiple future steps in our evaluation. Finally, we select the choice with the highest value as the decision for the current turn.

To make our strategy more efficient, we can use an $N$-dimensional array or a hash table for data storage. The $N$-dimensional array shows the best performance in competitions with a small number of players. The hash table is used in games with a large number of players.

3.1. Prediction. In an extended prisoner’s dilemma game model, we assume that there are $N$ players in the game and each player has $K$ choices. The goal of prediction is to obtain the probability distribution of each player’s choice, that is,

$$P(X_i = c_k). \tag{4}$$

We do not know each player’s strategy, but we can observe each player’s past choices. Our strategy assumes that the other players base their decisions on the historical data they have recorded and on a decision possibility table. The other players record a limited number of steps of historical data and use a decision possibility table to make their decisions. For example, Table 2 shows the decision possibility table used in TFT. Consequently, we can infer such a table using tools from probability theory, including the Bayes formula, and predict the other players’ choices based on the historical data.

The decision possibility table can be represented as the probability function $P(X_i = c_k \mid H = h_t)$, which means the probability that player $i$ ($X_i$) chooses $c_k$ when the history of the opponents’ choices is $h_t$. From the Bayes formula [26], we know that

$$P(X_i = c_k \mid H = h_t) = \frac{P(X_i = c_k) \prod_j P(X_{j,m} = h_{j,t} \mid X_i = c_k)}{\sum_k \left( P(X_i = c_k) \prod_j P(X_{j,m} = h_{j,t} \mid X_i = c_k) \right)}, \tag{5}$$

where $X_i$ means the choice of player $i$, $c_k$ represents the $k$th choice, $H$ represents the history of the choices of all players, and $h_t$ is an $m$-dimensional vector representing the current record of the players’ decisions:

$$\begin{aligned}
h_j &= \{R_{j-1}, R_{j-2}, \ldots, R_{j-m}\} \\
&= \{r_{1,j-1}, r_{2,j-1}, \ldots, r_{N,j-1}, \ldots, r_{1,j-m}, r_{2,j-m}, \ldots, r_{N,j-m}\} \quad (r_{i,j} \in \{c_1, c_2, \ldots, c_K\}), \\
h_{i,l} &= \{r_{i,l-1}, r_{i,l-2}, \ldots, r_{i,l-m}\},
\end{aligned} \tag{6}$$

where $R_j$ is the set of choices of each player in the $j$th turn. The record set $h_j$ dates back $m$ turns and includes $m$ sets of records, and $r_{i,j}$ is player $i$’s choice in the $j$th turn, which is one of the elements of the set of choices. $X_{j,m}$ is the $m$-step choice history of player $j$. Here we consider that all strategies base their decisions on the last few steps, and therefore older historical choices are immaterial to the current decision. If $m$ is very large, or in some extreme cases, there may not be enough records to build $h_j$, and the denominator of the formula may become 0. To avoid this situation, we can correct the original formula by simultaneously increasing the numerator and the denominator:

$$P(X_i = c_k \mid H = h_t) = \frac{\left( P(X_i = c_k) \prod_j P(X_{j,m} = h_{j,t} \mid X_i = c_k) \right) + 1}{\left( \sum_k \left( P(X_i = c_k) \prod_j P(X_{j,m} = h_{j,t} \mid X_i = c_k) \right) \right) + |c_k|}. \tag{7}$$

The next step is to obtain, from the historical record, the probability of each situation. From the previous analysis, we know that we need the values of $P(X_i = c_k)$ and $P(H = h_j \mid X_i = c_k)$ from $R_j$. Basically, we can easily get these probabilities from the formulas

$$\begin{aligned}
P(X_i = c_k) &= \frac{\sum_{l}^{L} \tau_k(r_{i,l})}{\sum_{l}^{L} \sigma_i(r_{i,l})}, \\
\prod_j P(X_{j,m} = h_{j,t} \mid X_i = c_k) &= P(X_{1,m} = h_{1,t}, X_{2,m} = h_{2,t}, \ldots, X_{N,m} = h_{N,t} \mid X_i = c_k) \\
&= \frac{\sum_{l=m}^{L} \prod_{j=1}^{N} \tau^{*}_{i,k,t}(h_{j,l})}{\sum_{l=m}^{L} \prod_{j=1}^{N} \sigma^{*}_{t}(h_{j,l})}, \\
\tau_k(r_{i,l}) &= \begin{cases} 1, & r_{i,l} = c_k,\ r_{i,l} \in R_l, \\ 0, & \text{else}, \end{cases} \\
\sigma_i(r_{i,l}) &= \begin{cases} 1, & r_{i,l} \in R_l, \\ 0, & \text{else}, \end{cases} \\
\tau^{*}_{i,k,t}(h_{j,l}) &= \begin{cases} 1, & r_{i,l} = c_k,\ h_{j,l} = h_{j,t}, \\ 0, & \text{else}, \end{cases} \\
\sigma^{*}_{t}(h_{j,l}) &= \begin{cases} 1, & h_{j,l} = h_{j,t}, \\ 0, & \text{else}. \end{cases}
\end{aligned} \tag{8}$$

Here $h_{i,l}$ is an ordered and comparable sequence of player $i$’s $m$-step records, and $t$ is the number of the current turn. We can obtain the result of formula (7) from the data we record.
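The counting idea behind formulas (7) and (8) can be illustrated with a simplified sketch. It is not the exact estimator above: it counts how often a player made each choice after the same $m$-round history and applies the add-one correction directly to those counts, so that an unseen history gives a uniform prediction instead of a zero denominator. The function name and the data layout (one tuple of all players’ choices per turn) are assumptions.

```python
from collections import defaultdict


def predict(rounds, player, m, choices=("C", "D")):
    """Estimate P(X_player = c | last m rounds) from the recorded rounds.

    rounds: list of tuples; rounds[l][i] is the choice of player i in turn l.
    """
    # counts[history][choice] = number of times `player` made `choice`
    # right after the m-round `history` occurred.
    counts = defaultdict(lambda: defaultdict(int))
    for l in range(m, len(rounds)):
        history = tuple(rounds[l - m:l])
        counts[history][rounds[l][player]] += 1

    current = tuple(rounds[-m:])
    seen = counts[current]
    total = sum(seen.values())
    # Add 1 to each count and the number of choices to the total.
    return {c: (seen[c] + 1) / (total + len(choices)) for c in choices}
```

For instance, if player 1 plays tit-for-tat against player 0 and the last recorded round is ("C", "D"), the single matching record in a five-round history raises the estimated probability of cooperation to 2/3.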

Moreover, as the strategy varies from player to player, we should not expect all players to use similar strategies in the game. For example, some players may ignore their own choices and focus on the others’ choices; in that case, the player’s own records should be discarded when making the prediction. So we have to add a weight for each history record. The weight function depends on the player and the choice,

$$\omega_i = w_i(X_j, c_k), \tag{9}$$

and the probability formula becomes

$$\prod_j P(X_{j,m} = h_{j,t} \mid X_i = c_k) = \frac{\sum_{l=m}^{L} \prod_{j=1}^{N} w_i(X_j, c_k)\, \tau^{*}_{i,k,t}(h_{j,l})}{\sum_{l=m}^{L} \prod_{j=1}^{N} w_i(X_j, c_k)\, \sigma^{*}_{t}(h_{j,l})}. \tag{10}$$

The weight function, which distributes the weight to each history record, varies between strategies. In the experiment on the depth of considered steps of weight functions, we found that depths within 5 perform similarly in the game (see Figure 13). So in our study we use a one-turn weight function for prediction. In addition, we find that most of the classical strategies ignore their own choices and treat all choices the same. Therefore, we can derive one weight function by summarizing these attributes of classical strategies:

$$w_i(X_j, c_k) = \begin{cases} 1, & i = j, \\ 0, & \text{else}. \end{cases} \tag{11}$$

3.2. Income Evaluation. In this step, we must evaluate each choice we can return and choose one of them as the final decision. We select the choice with the highest score after the evaluation:

$$C_t = \arg\max_k \operatorname{val}(c_k, h_t). \tag{12}$$

We base the evaluation on the probabilities of each player’s choices and on the profit function. As the profit function is given, we can easily predict the value that our player would obtain from a choice. In one-step prediction, the value equals the expected income. Consider

$$\operatorname{val}(c_k, h_t) = \sum \operatorname{profit}(c_k, c_{x_1}, \ldots, c_{x_N}) \cdot P(X_1 = c_{x_1}, \ldots, X_N = c_{x_N} \mid H = h_t). \tag{13}$$

The function “profit” is a given profit function with $N$ parameters, where $N$ equals the number of players; it returns a vector of the incomes of the players who make the same choice as the first parameter. We can make our value function more far-sighted if both $N$ and $K$ are small (as in the classical prisoner’s dilemma, where $N = 2$, $K = 2$). A $p$-step prediction of the income will be a more effective method. We can get the best choice with the recursive program (see Algorithm 1).

Function p_step(c_k, p)
    if p < 0 then
        return 0
    v ← 0
    for each choice c_k do
        v ← v + val(c_k, h_t)
        Append(h_t, c_k, c_{x_2}, ..., c_{x_N})
        for each choice c do
            v ← v + p_step(c, p − 1)
        end
        Remove(h_t, c_k, c_{x_2}, ..., c_{x_N})
    end
    return v

Algorithm 1
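A one-step version of the evaluation in formulas (12) and (13) can be sketched for the two-player case. The payoff values 5, 3, 1, and 0 are the scores reported in Section 5.1; their assignment to the four outcomes follows the conventional prisoner’s dilemma ordering and is an assumption here, as are the function names.

```python
# Assumed two-player payoff for (my choice, rival's choice).
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def expected_value(my_choice, rival_probs, payoff=PAYOFF):
    """One-step val(c_k, h_t): expected income of `my_choice`.

    rival_probs: predicted distribution of the rival's choice, e.g. {"C": 0.5, "D": 0.5}.
    """
    return sum(payoff[(my_choice, rival)] * p for rival, p in rival_probs.items())


def best_choice(rival_probs, choices=("C", "D"), payoff=PAYOFF):
    """Formula (12): pick the choice with the highest evaluated value."""
    return max(choices, key=lambda c: expected_value(c, rival_probs, payoff))
```

With a 50:50 prediction, cooperation is valued at 1.5 and defection at 3.0, so `best_choice` returns "D". Under this payoff, defection has the higher one-step value for any prediction, which is why looking several steps ahead, as in Algorithm 1, can change the decision.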

3.3. Feedback Record. At the end of each turn of the game, we receive feedback from the system. The feedback includes each player’s choice and the income they obtain. In our model, the profit function is given, so we can determine the income of all players from the historical records of all players.

The content of the feedback can be represented as follows:

$$F_l = \{c_{x_1}, c_{x_2}, \ldots, c_{x_N}\} \quad (c_{x_i} \in \{c_1, c_2, \ldots, c_K\}). \tag{14}$$

We can record the feedback as a list. As a list, the space complexity of the record is $O(LN)$. In the prediction step, the time complexity is $O(KN^2L^2)$. In the income evaluation step, the time complexity is $O(NK^N)$. In the recording step, the time complexity is $O(N)$. In our discussion, the number of players and the number of choices are relatively small and the number of turns is large, such that $NK \ll L$. We therefore find that the bottleneck of the problem is the time complexity of prediction.

To address the case in which all records have the same weight and only one historical step is considered, we can use an $N$-dimensional array for storage. We build an $N$-dimensional array $A$ in which each dimension has length $K$. The entries of $A$ are counters of all specific situations, which represent the combinations of the choice space. Then $A[c_{x_1}][c_{x_2}] \cdots [c_{x_N}]$ is the total number of turns in which players $1, 2, \ldots, N$ chose $c_{x_1}, c_{x_2}, \ldots, c_{x_N}$. In this situation, formula (10) simplifies to

$$\prod_j P(X_{j,m} = h_{j,t} \mid X_i = c_k) = \frac{A[r_{1,t-1}][r_{2,t-1}] \cdots [r_{i-1,t-1}][c_k][r_{i+1,t-1}] \cdots [r_{N,t-1}]}{\sum_{k=1}^{K} A[r_{1,t-1}][r_{2,t-1}] \cdots [r_{i-1,t-1}][c_k][r_{i+1,t-1}] \cdots [r_{N,t-1}]}. \tag{15}$$

The time complexity of the prediction reduces to $O(NK)$. The time complexity of the recording step is $O(1)$ (actually, it is $O(N)$ in an actual data structure), while the space complexity rises to $O(K^N)$. The space complexity is acceptable when $K$ and $N$ are relatively small.
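One possible way to realize the $N$-dimensional counter array is a flat list of $K^N$ cells addressed by a mixed-radix index. The class below is a sketch under that assumption; the class and method names are hypothetical, choices are encoded as integers $0, \ldots, K-1$, and the uniform fallback for an all-zero slice is an added assumption rather than part of formula (15).

```python
class JointCounter:
    """Flat array of K**N counters indexed by the joint choice of N players."""

    def __init__(self, n_players, n_choices):
        self.n = n_players
        self.k = n_choices
        self.cells = [0] * (n_choices ** n_players)  # O(K^N) space

    def _index(self, joint):
        # Mixed-radix encoding of (c_x1, ..., c_xN); O(N) per access.
        idx = 0
        for choice in joint:
            idx = idx * self.k + choice
        return idx

    def record(self, joint):
        self.cells[self._index(joint)] += 1

    def predict(self, last_joint, player):
        """Ratio in the style of formula (15), varying only `player`'s slot."""
        joint = list(last_joint)
        counts = []
        for c in range(self.k):
            joint[player] = c
            counts.append(self.cells[self._index(joint)])
        total = sum(counts)
        if total == 0:
            return [1.0 / self.k] * self.k  # assumption: uniform when unseen
        return [x / total for x in counts]
```

Each prediction reads $K$ cells at $O(N)$ indexing cost each, which matches the $O(NK)$ prediction time stated above.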

The $N$-dimensional array is suitable only for 1-step date-back, and its data structure becomes complex when $N$ becomes large. Here we can use a hash table to make a more efficient record. A list of $m$-step data records is as follows:

$$\{F_l, F_{l-1}, \ldots, F_{l-m+1}\} = \{\{c_{x_1}, c_{x_2}, \ldots, c_{x_N}\}_{l-1}, \ldots, \{c_{x_1}, c_{x_2}, \ldots, c_{x_N}\}_{l-m}\}. \tag{16}$$

We can get its hash code through a hash function:

$$\mathrm{hash}_l = Ha(F_l, F_{l-1}, \ldots, F_{l-m+1}). \tag{17}$$

The hash code provides an index into a record array, which records the number of times a specific situation occurs. With the use of a hash table, the complexity of prediction is still $O(NK)$. The time complexity of recording is $O(1)$. The space complexity becomes $O(1)$, which depends on the device and is not relevant to $N$ or $K$. Table 3 shows the space and time complexity of the different strategies.
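The hash-table record can be sketched with Python’s built-in hashing of tuples standing in for the hash function $Ha$. The class and method names are hypothetical. Note one difference from the description above: a Python `Counter` grows with the number of distinct situations observed, whereas the text describes a record array whose size is fixed by the device.

```python
from collections import Counter, deque


class HistoryCounter:
    """Counts how often a feedback follows a given m-step history."""

    def __init__(self, m):
        self.m = m
        self.window = deque(maxlen=m)  # the last m feedback tuples
        self.counts = Counter()        # (history, next feedback) -> occurrences

    def record(self, feedback):
        feedback = tuple(feedback)
        if len(self.window) == self.m:
            # Tuples are hashable, so the m-step history serves as the key.
            self.counts[(tuple(self.window), feedback)] += 1
        self.window.append(feedback)

    def count(self, history, feedback):
        key = (tuple(map(tuple, history)), tuple(feedback))
        return self.counts[key]
```

Recording a turn and looking up a situation are both single dictionary operations on average, independent of the number of turns already played.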

4. Brief Step of the Algorithm

First, we built an environment for the prisoner’s dilemma game. Each player is asked to provide a strategy function and an update function. The program of the environment is shown in Algorithm 2.

The strategy and update functions we provide are shown in Algorithm 3.
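The interface described here, in which each player supplies a strategy function and an update function, can be sketched as follows. The class names, the feedback format (a tuple of all players’ choices, from which incomes can be derived via the profit function), and the two example players are assumptions for illustration; the Bayes player itself is not reproduced.

```python
import random


class RandomPlayer:
    """Returns cooperation or defection with equal probability."""

    def strategy(self):
        return random.choice(["C", "D"])

    def update(self, feedback):
        pass  # a random player keeps no history


class TitForTatPlayer:
    """Two-player TFT: repeats the rival's previous choice."""

    def __init__(self, rival_index):
        self.rival_index = rival_index
        self.last = None

    def strategy(self):
        return "C" if self.last is None else self.last

    def update(self, feedback):
        self.last = feedback[self.rival_index]


def run_game(players, max_turn):
    """Environment loop: collect decisions, then broadcast the feedback."""
    log = []
    for _ in range(max_turn):
        decisions = [p.strategy() for p in players]
        feedback = tuple(decisions)
        for p in players:
            p.update(feedback)
        log.append(feedback)
    return log
```

Because all decisions are collected before any feedback is delivered, no player can react to another player’s choice within the same turn.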

5. Experimental Results and Analysis

5.1. The Performance of Bayes Model in the Double-Player Game. Four typical models were run against the Bayes model 10,000 times each; the total incomes of both players in each game were recorded. Figure 1 shows the overall incomes of both players recorded over 10,000 games, comparing the proposed Bayes model with the other four typical models. Overall, the Bayes model was more advantageous and achieved a higher score (overall income) than the other four. Of these other four typical strategy models, TFT performed best. It showed an overall income equivalent to that of the Bayes model and a higher income than all the others. For each game pair, this research presented two test results, each corresponding to one of the two stable score results from the selected game pair. Figure 2 reveals that the Bayes model earned a higher income than the Random, Pavlov, and GTFT models. The income ratio to the GTFT model reached 6.6, while that with the TFT model also exceeded one. By examining the final income from repeated games, it was found that the Bayes model was more advantageous than the other four typical strategy models tested here.

In games repeated 10,000 times, the cases in which the Bayes model scored 5, 3, 1, and 0 were statistically analyzed. As shown in Figure 3, the scores of 5 and 1 represented a relatively large proportion. That is to say, the Bayes model was prone to defection. Analysis of Figures 1 and 3 implied that high scores mostly corresponded to cases scoring 5. Moreover, the results of each game competition showed that the incomes achieved by the Bayes model were higher when manifesting its tendency to defection.

Table 3: Time and space complexity of Bayes method with different data structures and some typical models.

Method         Space complexity   Time complexity of prediction   Time complexity of record
List           $O(LN)$            $O(KN^2L^2)$                    $O(1)$
$N$-d array    $O(K^N)$           $O(NK)$                         $O(N)$
Hash table     $O(1)$             $O(NK)$                         $O(1)$
TFT            $O(N)$             $O(N)$                          $O(N)$
GTFT           $O(N)$             $O(N)$                          $O(N)$
Pavlov         $O(N)$             $O(N)$                          $O(N)$
Random         $O(1)$             $O(1)$                          $O(1)$

for t from 0 to MAX_TURN do
    for each player i do
        decision[i] ← player[i].strategy
    end
    feedback ← profit(decision)
    for each player i do
        player[i].update(feedback)
    end
end

Algorithm 2

procedure strategy()
    if tem_turn < TURN_THRESHOLD then
        decision ← random(choice_space)
    else
        prediction_list ← predict(restore)
        for each choice c do
            profit[c] ← profit_expect(c, prediction_list)
        end
        decision ← arg(max(profit))
    end
    return decision
end
procedure update(feedback)
    storage.add(feedback)

Algorithm 3

Analysis of the overall income of both players in each game (Figure 4) showed that the overall income in the game with a TFT model was lower. According to the performance of the rivals in each game (Figure 3) and comparison with test result 2 (with names ending in “2”), it was noted that both strategy models in test result 1 (with names ending in “1”) were less inclined to defection. Therefore, their overall income was higher. Since TFT is considered to be the model that achieves the most desirable results amongst the four typical strategy models, games setting the TFT strategy model and the Bayes model in opposition were mainly investigated. In this game, the overall incomes of both sides were lower than those in the other game competitions. This indicated that both the TFT strategy model and the Bayes model suffered losses in the game between them. By studying the single-game results from the Bayes model and the TFT model, it was found that the scores from the Bayes model were 1 (both sides simultaneously selected defection). This result suggested that, in any game between the Bayes model and the TFT model, defection appeared more frequently and represented a mark of the defection-prone tendencies of the two models.

Figure 2: The income ratio of the Bayes model to other models in the games (Random1, Random2, TFT1, TFT2, Pavlov1, Pavlov2, GTFT1, GTFT2).

Figure 3: The distribution of Bayesian scores (5, 3, 1, 0) in each game (Random1, Random2, TFT1, TFT2, Pavlov1, Pavlov2, GTFT1, GTFT2).

Figure 4: The comparisons of overall game incomes (series: Bayes, Rival).

5.2. The Performance of the Bayes Model in a Multiplayer Game. With regard to multiplayer games, this study jointly used four different strategy models to run against the Bayes model, and the overall income from each model was recorded. With the methods described in Section 4, the income accruing to each player was distributed and the overall income was calculated. In this section, the decision methods of the TFT, Pavlov, and GTFT models differed slightly from those applied to the two-player game: in the event of the defection of one of the other players in the previous round, the rivals selected defection. In the following two-player game, the decision for the current round was made by examining the decisions of players A and B in the previous round.

It can be deduced from Figure 5 that, in the multiplayer game, the Bayes model returned the highest overall income. In addition, the overall game situation implied that the proportion of rounds in which the four typical strategy models selected cooperation was the highest. That is to say, in the game with four models, each treated cooperation as its main strategy (Figure 6).

5.3. Analysis of the Performance of the Bayes Model versus Normal Models. The normal model refers to the strategy models that are possibly encountered in real-life enactments of the game, simulated here using a natural model. The natural model was a model adopting a random strategy (that is, when faced with identical decisions from the previous round, the probabilities that the natural models selected cooperation were different). To verify that the strategy selected by the proposed Bayes model reaped more benefits in games versus the normal model, the game between them was repeated 500 times. In each game, there were 1000 selections. This was an attempt to analyze the advantages and disadvantages of the Bayes model more comprehensively.

Figure 7 shows that the Bayes model performed better than most of the natural models. Overall, the income from the Bayes model reached approximately 3000, and even 4500 in individual extreme cases. The income ratio in Figure 8 was maximized at approximately 14, while most of the income ratios were above one.

In addition, the TFT model also played games against the normal model; the result is shown in Figure 9. The average income of the TFT model was approximately 2500, which was lower than that of the Bayes model while still equivalent to those of its other rivals.

5.4. The Performance of the Bayes Model When Run over Fewer Games. Since the Bayes model was a machine learning model, it needed a certain amount of data to guarantee its learning. Therefore, when there were fewer games, whether or not the Bayes model could achieve better game results should be considered.

In this study, the Bayes model was evaluated using fewer games in competition with the TFT model: 100 times, with the game repeated 100 times. From Figures 10 and 11, we can see that the Bayes model was more advantageous.

Figure 5: The overall income of each model (Bayes, TFT, Pavlov, GTFT) after 10,000 times of multiplayer game.

Figure 6: Frequency of occurrence of games in the one-shot game when the number of betrayals is varied (horizontal axis: the number of betrayals, 0 to 4).

5.5. Game Results from a PTFT Model Compared with the Other Models. The game models discussed above merely considered the attitude immediately after the selection of the previous step of both sides, while the PTFT model took account of the attitude during the selection of the previous three steps. Over 10,000 runs of the PTFT model against the other models, the incomes of each model are shown in Figure 12, which suggested that the Bayes model was at a disadvantage in the game and gained neither more nor less than the TFT model (both models became trapped in the mutual defection deadlock). Moreover, the Pavlov model returned the lowest individual income, while the GTFT model gained the most.

This revealed one of the disadvantages of the Bayes model: if it failed to comprehensively consider all state characteristics that may appear in the game, it could not obtain the optimum solution. In the current experiment, since the number of decision state steps of both sides considered by the Bayes model was set to one, the Bayes model was incapable of obtaining the optimum income in the game against the PTFT model.

6. Conclusions

The authors regarded the prisoners’ dilemma as an incomplete information game with unpublicized game strategies. In the research, a machine learning model was constructed to solve problems in incomplete information games. Based on the Bayesian model, we can predict players’ choices to better complete the unknown information. We also suggested the hash table to improve space and time complexity. We built a game system with several types of game strategy for testing. The experimental results show that the proposed Bayes model obtained more desirable game results than conventional typical strategy models in double- or multiplayer games run 10,000 times. It was even believed that the Bayes model was slightly better than the TFT model, the acknowledged optimal strategy. In a game with more general single-step decision modeling and fewer game runs, the Bayes model also dominated. This result indicated that the naive Bayesian classification algorithm was feasible and effective for establishing the strategy model of an incomplete information game. It provided a novel idea for solving incomplete information game problems.

Figure 7: The income of the Bayes model and normal models in the game (series: Bayes, Normal).

Figure 8: The income ratio of the Bayes model versus the general model.

Figure 9: The income of the TFT model and normal models in the game (series: TFT, Normal).

However, the results obtained by the naive Bayesian classification algorithm showed certain defects: it was unable to obtain the desired solution when the decision ability of a rival was beyond its estimation range. This reduced the applicability of the machine learning algorithm when encountering complex models. This should be the subject of future research.

Figure 10: The incomes of the Bayes and TFT models after 100 games.

Figure 11: The average income of the Bayes and TFT models after 100 games.

Figure 12: The incomes of each model (Bayes, TFT, Pavlov, GTFT) after 10,000 games (series: Rival, PTFT).

Figure 13: The income of Bayes model with different depths of considered step in weight function when competing with random $m$-step strategy (horizontal axis: depth of considered step, 1 to 10; series: $m = 1$ to $m = 5$).

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This research was supported by the National Natural Science Foundation of China (No. 61100148) and the Project on the Integration of Industry, Education and Research of Guangdong Province (No. 2012B091100489).

References

[1] O. Nomia, Games with Incomplete Information, Universite Paris 1 Pantheon-Sorbonne, Paris, France, 1998.

[2] J. C. Harsanyi, “Games with incomplete information played by “Bayesian” players. I. The basic model,” Management Science, vol. 14, no. 3, pp. 159–182, 1967.

[3] M. Zinkevich, M. Johanson, M. H. Bowling, and C. Piccione, “Regret minimization in games with incomplete information,” Advances in Neural Information Processing Systems, vol. 2008, no. 20, pp. 1729–1736, 2008.

[4] E. Alpaydin, Introduction to Machine Learning, The MIT Press, Cambridge, Mass, USA, 2004.

[5] C. M. Bishop, Pattern Recognition and Machine Learning, Information Science and Statistics, Springer, New York, NY, USA, 2006.

[6] X. G. Zhang, “Introduction to statistical learning theory and support vector machines,” Acta Automatica Sinica, vol. 26, no. 1, pp. 32–42, 2000.

[7] R. J. Hanson, B. Cheeseman, and P. Stutz, Bayesian Classification Theory, NASA Ames Research Center, Artificial Intelligence Research Branch, 1991.

[8] M. Nowak and K. Sigmund, “A strategy of win-stay, lose-shift that outperforms tit-for-tat in the Prisoner’s Dilemma game,” Nature, vol. 364, no. 6432, pp. 56–58, 1993.

[9] D. M. Kreps, P. Milgrom, J. Roberts, and R. Wilson, “Rational cooperation in the finitely repeated prisoners’ dilemma,” Journal of Economic Theory, vol. 27, no. 2, pp. 245–252, 1982.

[10] D. W. K. Yeung, L. A. Petrosyan, and M. C. C. Lee, Dynamic Cooperation: A Paradigm on the Cutting-Edge of Game Theory, China Market Press, 2007.

[11] R. Axelrod, “Effective choice in the prisoner’s dilemma,” Journal of Conflict Resolution, vol. 24, no. 1, pp. 3–25, 1980.

[12] R. Axelrod and W. D. Hamilton, “The evolution of cooperation,” Science, vol. 211, no. 4489, pp. 1390–1396, 1981.

[13] J. H. Miller, “The coevolution of automata in the repeated prisoner’s dilemma,” Journal of Economic Behavior and Organization, vol. 29, no. 1, pp. 87–112, 1996.

[14] W. H. Press and F. J. Dyson, “Iterated Prisoner’s Dilemma contains strategies that dominate any evolutionary opponent,” Proceedings of the National Academy of Sciences of the United States of America, vol. 109, no. 26, pp. 10409–10413, 2012.

[15] H. Lin and C.-X. Wu, “Evolution of strategies based on genetic algorithm in the iterated prisoner’s dilemma on complex networks,” Acta Physica Sinica, vol. 56, no. 8, pp. 4313–4318, 2007.

[16] M. A. Nowak and K. Sigmund, “Evolutionary dynamics of biological games,” Science, vol. 303, no. 5659, pp. 793–799, 2004.

[17] J. W. Weibull, Evolutionary Game Theory, MIT Press, Cambridge, Mass, USA, 1997.

[18] M. A. Nowak and R. M. May, “Evolutionary games and spatial chaos,” Nature, vol. 359, no. 6398, pp. 826–829, 1992.

[19] A. Cardillo, J. Gomez-Gardenes, D. Vilone, and A. Sanchez, “Co-evolution of strategies and update rules in the prisoner’s dilemma game on complex networks,” New Journal of Physics, vol. 12, no. 10, Article ID 103034, 2010.

[20] W.-B. Du, H.-R. Zheng, and M.-B. Hu, “Evolutionary prisoner’s dilemma game on weighted scale-free networks,” Physica A: Statistical Mechanics and Its Applications, vol. 387, no. 14, pp. 3796–3800, 2008.

[21] H. Ohtsuki, C. Hauert, E. Lieberman, and M. A. Nowak, “A simple rule for the evolution of cooperation on graphs and social networks,” Nature, vol. 441, no. 7092, pp. 502–505, 2006.

[22] Y. Wang and C. Dang, “An evolutionary algorithm for global optimization based on level-set evolution and latin squares,” IEEE Transactions on Evolutionary Computation, vol. 11, no. 5, pp. 579–595, 2007.

[23] Y. Wang, Y.-C. Jiao, and H. Li, “An evolutionary algorithm for solving nonlinear bilevel programming based on a new constraint-handling scheme,” IEEE Transactions on Systems, Man and Cybernetics C: Applications and Reviews, vol. 35, no. 2, pp. 221–232, 2005.

[24] R. Selten and R. Stoecker, “End behavior in sequences of finite Prisoner’s Dilemma supergames: A learning theory approach,” Journal of Economic Behavior and Organization, vol. 7, no. 1, pp. 47–70, 1986.

[25] D. Y. Jiang, Situation Analysis of Double Action Games with Entropy, Science Press, New York, NY, USA, 2010.

[26] P. A. Flach and N. Lachiche, “Naive Bayesian classification of structured data,” Machine Learning, vol. 57, no. 3, pp. 233–269, 2004.


Mathematical Problems in Engineering 3

limited by the rule or background of the game. That is to say, the players’ action in each step is finite but uncertain.

We can say that all other cases of incomplete information can be reduced to these three basic cases [1]. In the prisoner’s dilemma, a player is not aware of the other player’s strategy (case 2), for there is no best or winning strategy in the game. Yet the profit function (the outcome according to each player’s choice) and the choice space (cooperate or betray) are definite in the game.

In our research, we mainly focus on the game in case 2, where players know nothing about the others’ strategies. Such a game model can be found in a wide range of areas. For example, in some economic models the choice space is limited by the situation of the problem, and the profit curves can be found from previous models. The strategy of each competitor is unpredictable when the profit is not linear or varies with time (as it does in reality). So such a model can be regarded as a game with incomplete information in case 2.

2.2. Game of Prisoners’ Dilemma. In this game, two individuals each choose cooperation or defection. If the two individuals cooperate with each other, they both earn income $R$; if they both defect, the income of both sides is $P$; if one individual cooperates while the other betrays, the cooperative one gains $S$ while the treacherous player gains $T$ (Table 1). Here $T > R > P > S$ and $2R > S + T$. The latter inequality means that the total income of two cooperating individuals is always larger than the total gained when one individual betrays the other. However, for an individual, the income earned by defecting against a cooperator is greater than that earned by mutual cooperation.

In this experiment, the parameters selected were consistent with those that Axelrod and Hamilton [12] used in solving the prisoners’ dilemma. That is to say, the incomes were $R = 3$, $T = 5$, $S = 0$, and $P = 1$, which satisfy the conditions $T > R > P > S$ and $2R > S + T$.
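As a quick check of the two conditions, the payoff table can be written down directly. This is a minimal sketch; the names `PAYOFF` and `is_prisoners_dilemma` are illustrative and do not come from the paper.

```python
# Payoff values quoted in the text (from Axelrod and Hamilton [12]).
R, T, S, P = 3, 5, 0, 1

# PAYOFF[(a, b)] -> (income of player A, income of player B), as in Table 1.
# The dictionary name is illustrative only.
PAYOFF = {
    ("C", "C"): (R, R),
    ("C", "D"): (S, T),
    ("D", "C"): (T, S),
    ("D", "D"): (P, P),
}


def is_prisoners_dilemma(r, t, s, p):
    """Check the two conditions T > R > P > S and 2R > S + T."""
    return t > r > p > s and 2 * r > s + t


print(is_prisoners_dilemma(R, T, S, P))  # True
```

With these values, $2R = 6 > S + T = 5$, so mutual cooperation yields more in total than one-sided defection.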

In addition, each pair of strategies competed for $L_m$ turns; namely, both sides had to make $L_m$ selections. The result of each turn was recorded, and the opponent’s selection was sent to each player at the end of each turn. After $L_m$ turns, the results of each pair of strategies were listed in the form of their total scores. According to the total score, the strategy of the corresponding strategy models was evaluated. In general competition, $L_m$ was set to 10000. It can be seen that the game in the long term could yield stable income results.

Table 1: Game information.

(Player A, Player B)    Cooperation (C)    Defection (D)
Cooperation (C)         ($R$, $R$)         ($S$, $T$)
Defection (D)           ($T$, $S$)         ($P$, $P$)

Several typical strategy models are presented in the following.

(1) TFT (tit-for-tat) strategy, that is, the “return like-for-like” strategy. TFT is a well-known model for the prisoners’ dilemma. The main idea of this strategy is that, starting with cooperation, the selection in each round is made on the basis of the selection in the previous round. That is, whatever the rival selected in the previous round (cooperation or defection) is repeated in the current round. This strategy performed best in the artificial intelligence algorithm competition organized by Axelrod and Hamilton [12] (the TFT strategy in this study is consistent in concept with that TFT strategy, but differs in its details).

(2) PTFT: an improved TFT strategy [24]. This strategy is relatively more selfish than TFT. It still starts with cooperation; however, in the following rounds, cooperation is selected only when there has been no defection for three rounds.

(3) GTFT: another improved TFT strategy [8]. This strategy allows a certain probability of cooperation in the case of rival defection and a certain probability of defection in the case of cooperation. It resolves the deadlock arising from mutual defection in the competition.

(4) Pavlov: a different strategic concept [8]. It bullies the weak and fears the strong; namely, cooperation is continued in cases of mutual cooperation, while defection is selected when one side chooses defection. Moreover, in the case of mutual defection, cooperation is given priority. Such a strategy represents a local optimum in genetic algorithm terms.

(5) Random: a random strategy, that is, one randomly returning cooperation or noncooperation. In related programs, a 50:50 random strategy is most commonly applied; that is, the probability of returning cooperation or defection is 50%. This strategy is mainly adopted to assess fixed strategies, set competition parameters, and so forth.

(6) Normal: a strategy mode developed by simulating common players. In this strategy, cooperation or defection is selected with different probabilities based on the selections of both sides in the previous game. This strategy simulates player participation in a game using different strategies.
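The one-step strategies in the list above can be sketched as small functions of the previous round. This is only an illustration under stated assumptions: the function names are hypothetical, the opening move of Pavlov and GTFT is assumed to be cooperation, and the GTFT probabilities are placeholders, since the text gives no values. PTFT and Normal are omitted because they need more state than one round.

```python
import random


def tft(last_round):
    """Tit-for-tat: cooperate first, then repeat the rival's previous move."""
    if last_round is None:
        return "C"
    _mine, rival = last_round
    return rival


def pavlov(last_round):
    """Pavlov as described in the list: cooperate after (C, C) or (D, D),
    defect when exactly one side defected. Opening with C is an assumption."""
    if last_round is None:
        return "C"
    mine, rival = last_round
    return "C" if mine == rival else "D"


def gtft(last_round, forgive=0.1, betray=0.1, rng=random):
    """Generous TFT: forgives a defection with probability `forgive` and
    defects after cooperation with probability `betray`.
    Both default values are illustrative, not taken from the paper."""
    if last_round is None:
        return "C"
    _mine, rival = last_round
    if rival == "D":
        return "C" if rng.random() < forgive else "D"
    return "D" if rng.random() < betray else "C"


def random_strategy(last_round=None, rng=random):
    """50:50 random strategy; ignores the history."""
    return "C" if rng.random() < 0.5 else "D"
```

Here `last_round` is a pair `(my previous choice, rival's previous choice)`, or `None` in the first round.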

2.3. Extended Prisoners’ Dilemma Model. The original prisoners’ dilemma problem contains only two players with two different choices, and the profit function of the game is given and fixed during the game. However, most problems in reality do not arise in such a simple setting but as multiplayer variants. Therefore, multiplayer formats afford the opportunity for a useful expansion of the prisoners’ dilemma game.

In the extended prisoners’ dilemma, $N$ players are considered in a game. A time-invariant profit function is given. The profit function has $N$ parameters, and for each set of inputs only one corresponding output is produced. The choice set contains two elements, cooperation and defection; that is to say, each player can select only one of these two choices as their action. The game has $L_m$ rounds. In one round, all players have to give their choices to the judges at the same time and then get the feedback. The feedback includes the other players’ choices. As the profit function is provided for the game, all players can obtain the competitors’ profits through the feedback, which includes all players’ choices. As this is a game with incomplete information, all the players can mask their strategies. In addition, other players’ choices are unknown before the step of returning feedback.

By referring to a published multiplayer dilemma study [25], a reward and punishment rule was defined for this study as follows.

(1) When all players selected cooperation (C), the income $R$ was divided equally among all players.

(2) When some players selected defection (D), the income $T$ was divided equally among the treacherous players, while the income $S$ was shared among the cooperative players.

(3) When all players selected defection (D), the income $P$ was divided equally among all players.

For a prisoners’ dilemma with $n = 4$, the parameters were set to $R = 12$, $T = 10$, $S = 0$, and $P = 4$, which was in agreement with the standard form of the game. That is, the individual optimum solution of every player was obtained when one player selected defection, while the overall optimum solution was obtained when all players selected cooperation.
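The three rules above can be sketched as a profit function. This is a hedged illustration: it reads “averaged” as an equal split of the stated income among the relevant group, and the function name `profit` and its signature are assumptions made for the example.

```python
def profit(choices, R=12, T=10, S=0, P=4):
    """Return one income per player for a single round.

    choices: sequence of "C" / "D", one entry per player.
    Assumption: each pool (R, T, S or P) is split equally among
    the players it applies to.
    """
    n = len(choices)
    defectors = sum(1 for c in choices if c == "D")
    cooperators = n - defectors
    if defectors == 0:          # rule (1): everyone cooperates
        return [R / n] * n
    if cooperators == 0:        # rule (3): everyone defects
        return [P / n] * n
    # rule (2): mixed round
    return [T / defectors if c == "D" else S / cooperators for c in choices]


print(profit(["C", "C", "C", "C"]))  # [3.0, 3.0, 3.0, 3.0]
print(profit(["D", "C", "C", "C"]))  # [10.0, 0.0, 0.0, 0.0]
```

With four players, a lone defector earns 10 against 3 under full cooperation, while full cooperation gives the largest total (12), which matches the individual and overall optima described above.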

In this prisoners’ game, the strategies of the four players were unavailable to the other players before they made their decisions. After decisions were made, the income of each player was calculated and the decisions were revealed; the game was repeated 10000 times.

3. Strategy in Prisoner’s Dilemma Game

In the prisoner’s dilemma game we are studying, the strategies of the other players are unknown, while the profit function and choice space are clear. Our strategy in this game is to reduce the unclear information and maximize the expected profit. There are at least three challenges, as follows.

(i) The strategies of other players are variable and unstable. They may make different choices in the same situation (for example, a strategy that chooses at random). That is to say, no best choice can be selected in a single game.

(ii) The other players in the game may all have different strategies, and they play together in one game.

(iii) The performance and efficiency of the strategy should be guaranteed, especially when the number of players is large.

To solve the problems listed above, we propose the Bayes formula as the basic method of our strategy. Though players’ choices can be considered random and irregular, their actions can be described as a series of probabilities of the possible choices. The Bayes formula provides us with a way to make predictions based on the history. According to the Bayes formula, we can build our prediction table, which includes the probability sets for each individual player. As the game in our study is a multiple-turn game, the historical data are easy to obtain and store.

Table 2: Decision possibility table of TFT.

Choice of last turn (my, opponent’s)    My choice of next turn (C)    My choice of next turn (D)
(C, C)                                  1.0                           0
(C, D)                                  0                             1.0
(D, C)                                  1.0                           0
(D, D)                                  0                             1.0

After the prediction, we can evaluate each choice in our choice space with some probability-based method, for instance, the expectation of profit. To be more convincing, we can include multiple future steps in our evaluation. Finally, we select the choice with the highest value as the decision for the current turn.

To make our strategy more efficient, we can use an $N$-dimensional array or a hash table for data storage. The $N$-dimensional array shows the best performance in competitions with a small number of players. The hash table is used in games with a large number of players.

3.1. Prediction. In an extended prisoner’s dilemma game model, we assume that there are $N$ players in the game and each player has $K$ choices. The goal of prediction is to get the probability distribution of each player’s choice, that is,

$$P(X_i = c_k). \tag{4}$$

We do not know each player’s strategy, but we can get each player’s choices in the past. Our strategy assumes that other players base their decisions on the historical data they have recorded and on a decision possibility table: each player records a limited number of steps of historical data and uses a decision possibility table to make a decision. For example, Table 2 shows the decision possibility table used by TFT. Consequently, we can infer such a table using tools from probability theory, including the Bayes formula, and predict other players’ choices based on the historical data.

The decision possibility table can be presented as the conditional probability function $P(X_i = c_k \mid H = h_t)$, which is the probability that player $i$ ($X_i$) chooses $c_k$ when the history of opponents’ choices is $h_t$. From the Bayes formula [26], we know that

$$P(X_i = c_k \mid H = h_t) = \frac{P(X_i = c_k)\prod_j P(X_{j,m} = h_{j,t} \mid X_i = c_k)}{\sum_k \left(P(X_i = c_k)\prod_j P(X_{j,m} = h_{j,t} \mid X_i = c_k)\right)}, \tag{5}$$

where $X_i$ is the choice of player $i$, $c_k$ represents the $k$th choice, $H$ represents the history of the choices of all players, and $h_t$ is an $m$-dimensional vector holding the current record of the players’ decisions:

$$h_j = \{R_{j-1}, R_{j-2}, \ldots, R_{j-m}\} = \{\{r_{1,j-1}, r_{2,j-1}, \ldots, r_{N,j-1}\}, \ldots, \{r_{1,j-m}, r_{2,j-m}, \ldots, r_{N,j-m}\}\}, \quad r_{i,j} \in \{c_1, c_2, \ldots, c_K\},$$

$$h_{i,l} = \{r_{i,l-1}, r_{i,l-2}, \ldots, r_{i,l-m}\}, \tag{6}$$

where $R_j$ is the set of choices of all players in the $j$th turn. The record set $h_j$ dates back $m$ turns and includes $m$ sets of records, $r_{i,j}$ is the choice of player $i$ in the $j$th turn, which is one of the elements of the set of choices, and $X_{j,m}$ is the $m$-step choice history of player $j$. Here we consider that all the strategies base their decisions on the last few steps, so that older historical choices are immaterial to the current decision. If $m$ is very large, or in some extreme cases, there may not be enough records for building $h_j$, and the denominator of the formula may become 0. To avoid this situation, we correct the original formula by simultaneously increasing the numerator and the denominator:

$$P(X_i = c_k \mid H = h_t) = \frac{P(X_i = c_k)\prod_j P(X_{j,m} = h_{j,t} \mid X_i = c_k) + 1}{\sum_k \left(P(X_i = c_k)\prod_j P(X_{j,m} = h_{j,t} \mid X_i = c_k)\right) + |c_k|}. \tag{7}$$
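The correction in formula (7) can be illustrated numerically. This is a minimal sketch that assumes $|c_k|$ denotes the number of available choices (as in add-one smoothing); the function name is hypothetical.

```python
def smoothed_posterior(scores):
    """Apply the correction of formula (7).

    scores[k] is the unnormalized term
    P(X_i = c_k) * prod_j P(X_jm = h_jt | X_i = c_k) for choice k.
    Assumption: |c_k| is read as the number of choices, len(scores).
    """
    n_choices = len(scores)
    total = sum(scores)
    return [(s + 1) / (total + n_choices) for s in scores]


# With no usable records every score is 0 and formula (5) would divide by 0;
# the corrected formula falls back to a uniform prediction.
print(smoothed_posterior([0.0, 0.0]))  # [0.5, 0.5]
```

Under this reading the corrected values still sum to one, because the numerators gain one unit per choice and the denominator gains the number of choices.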

The next step is to obtain the probability of each situation from the historical record. From the previous analysis, we know that we should extract the values of $P(X_i = c_k)$ and $P(H = h_j \mid X_i = c_k)$ from $R_j$. Basically, we can easily get the probabilities from the formulas

$$P(X_i = c_k) = \frac{\sum^{L}\tau_k(r_{i,l})}{\sum^{L}\sigma_i(r_{i,l})},$$

$$\prod_j P(X_{j,m} = h_{j,t} \mid X_i = c_k) = P(X_{1,m} = h_{1,t}, X_{2,m} = h_{2,t}, \ldots, X_{N,m} = h_{N,t} \mid X_i = c_k) = \frac{\sum_{l=m}^{L}\prod_{j=1}^{N}\tau^{*}_{i,k,t}(h_{j,l})}{\sum_{l=m}^{L}\prod_{j=1}^{N}\sigma^{*}_{t}(h_{j,l})},$$

$$\tau_k(r_{i,l}) = \begin{cases} 1 & (r_{i,l} = c_k,\ r_{i,l} \in R_l) \\ 0 & (\text{else}), \end{cases} \qquad \sigma_i(r_{i,l}) = \begin{cases} 1 & (r_{i,l} \in R_l) \\ 0 & (\text{else}), \end{cases}$$

$$\tau^{*}_{i,k,t}(h_{j,l}) = \begin{cases} 1 & (r_{i,l} = c_k,\ h_{j,l} = h_{j,t}) \\ 0 & (\text{else}), \end{cases} \qquad \sigma^{*}_{t}(h_{j,l}) = \begin{cases} 1 & (h_{j,l} = h_{j,t}) \\ 0 & (\text{else}). \end{cases} \tag{8}$$

Here $h_{i,l}$ is an ordered and comparable sequence of player $i$’s $m$-step records, and $t$ is the current turn number. We can get the result of formula (7) from the data we record.
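The counting behind formula (8) can be sketched as a scan over the list of recorded turns. This is a simplified reading, not the authors' implementation: it compares the joint $m$-step history of all players at once instead of forming the per-player product, and the function names and record layout (one tuple of choices per turn) are assumptions.

```python
def marginal(records, i, choice):
    """Fraction of recorded turns in which player i chose `choice`
    (the tau_k / sigma_i ratio of formula (8))."""
    if not records:
        return 0.0
    return sum(1 for r in records if r[i] == choice) / len(records)


def history_match_counts(records, i, m, choices):
    """Scan the whole record list: for every past turn l whose preceding
    m turns equal the most recent m turns, count what player i chose at l.
    This mirrors the tau* (numerator) and sigma* (denominator) indicators."""
    current = tuple(records[-m:])
    counts = {c: 0 for c in choices}
    for l in range(m, len(records)):
        if tuple(records[l - m:l]) == current:
            counts[records[l][i]] += 1
    return counts


# records[l] holds the choices of (player 0, player 1) in turn l.
records = [("C", "C"), ("C", "D"), ("D", "C"),
           ("C", "D"), ("D", "C"), ("C", "D")]
print(history_match_counts(records, 0, 1, ("C", "D")))  # {'C': 0, 'D': 2}
```

In this toy record, player 0 defected every time the previous round was (C, D), so the counts point to defection in the next turn. Each prediction rescans the full list, which is the cost that a counter array or hash table is meant to avoid.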

Moreover, as the strategy varies from player to player, we should not expect all players to use similar strategies in the game. For example, some players may ignore the choices they made themselves and focus on others’ choices; the records from the player himself should then be discarded when making the prediction. So we have to add a weight for each history record. The weight function is relevant to the player and the choice,

$$\omega_i = w_i(X_j, c_k), \tag{9}$$

and the probability formula becomes

$$\prod_j P(X_{j,m} = h_{j,t} \mid X_i = c_k) = \frac{\sum_{l=m}^{L}\prod_{j=1}^{N} w_i(X_j, c_k)\,\tau^{*}_{i,k,t}(h_{j,l})}{\sum_{l=m}^{L}\prod_{j=1}^{N} w_i(X_j, c_k)\,\sigma^{*}_{t}(h_{j,l})}. \tag{10}$$

The weight function, which distributes the weight to each history record, varies between strategies. In the experiment on the depth of steps considered by the weight function, we found that depths within 5 performed similarly in the game (see Figure 13). So in our study we use a one-turn weight function for prediction. In addition, we find that most of the classical strategies ignore the self-made choices and treat all the other choices alike. Therefore, we can obtain one of the weight functions by summarizing these attributes of the classical strategies:

$$w_i(X_j, c_k) = \begin{cases} 1 & (i \neq j) \\ 0 & (\text{else}). \end{cases} \tag{11}$$

3.2. Income Evaluation. In this step, we must evaluate each choice we can return and choose one of them as the final decision. We select the choice with the highest score after the evaluation:

$$C_t = \arg\max_k \operatorname{val}(c_k, h_t). \tag{12}$$

We base the evaluation on the probabilities of each player’s choices and the profit function. As the profit function is given, we can easily predict the value that our player would obtain from a choice. In one-step prediction, the value equals the expectation of the income. Consider

$$\operatorname{val}(c_k, h_t) = \sum \operatorname{profit}(c_k, c_{x_1}, \ldots, c_{x_N}) \cdot P(X_1 = c_{x_1}, \ldots, X_N = c_{x_N} \mid H = h_t). \tag{13}$$
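A one-step version of formulas (12) and (13) can be sketched as follows. The sketch assumes that the opponents’ predicted choices are independent, so the joint probability is a product of per-player predictions; that assumption, the function names, and the data layout are illustrative and not taken from the paper.

```python
from itertools import product


def val(my_choice, me, predictions, profit, choices=("C", "D")):
    """Expected one-step income of `my_choice` for player `me`.

    predictions[j][c] is the predicted probability that player j plays c;
    the entry for `me` is ignored. `profit(joint)` returns one income per player.
    """
    others = [j for j in range(len(predictions)) if j != me]
    expected = 0.0
    for combo in product(choices, repeat=len(others)):
        joint = [None] * len(predictions)
        joint[me] = my_choice
        prob = 1.0
        for j, c in zip(others, combo):
            joint[j] = c
            prob *= predictions[j][c]  # independence across opponents: an assumption
        expected += profit(joint)[me] * prob
    return expected


def two_player_profit(joint):
    """Two-player payoffs with R = 3, T = 5, S = 0, P = 1."""
    table = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
             ("D", "C"): (5, 0), ("D", "D"): (1, 1)}
    return table[tuple(joint)]


# Player 1 is predicted to cooperate with probability 0.8.
preds = [None, {"C": 0.8, "D": 0.2}]
best = max(("C", "D"), key=lambda c: val(c, 0, preds, two_player_profit))
print(best)  # D
```

In this example cooperation is worth $3 \times 0.8 = 2.4$ and defection $5 \times 0.8 + 1 \times 0.2 = 4.2$, so the one-step evaluation selects defection. The loop visits $K^{N-1}$ joint choices, in line with the exponential cost of income evaluation.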

The function “profit” is a given profit function with $N$ parameters, where $N$ is equal to the number of players; it returns a vector of the incomes of the players who make the same choice as the first parameter. We can make our value function more far-sighted if both $N$ and $K$ are small (as in the classical prisoner’s dilemma, where $N = 2$ and $K = 2$). A $p$-step prediction of the income is then a more effective method. We can get the best choice with a recursive program (see Algorithm 1).

    function p_step(c_k, p)
        if p < 0 then
            return 0
        V ← 0
        for each choice c_k do
            V ← V + val(c_k, h_t)
            Append(h_t, {c_k, c_x2, …, c_xN})
            for each choice c do
                V ← V + p_step(c, p − 1)
            end
            Remove(h_t, {c_k, c_x2, …, c_xN})
        end
        return V

Algorithm 1

3.3. Feedback Record. At the end of each turn of the game, we get feedback from the system. The feedback includes each player’s choice and the income they get. In our model, the profit function is given, so we can determine the income of all players from the historical records of all players.

The content of the feedback can be represented as follows:

$$F_l = \{c_{x_1}, c_{x_2}, \ldots, c_{x_N}\}, \quad c_{x_i} \in \{c_1, c_2, \ldots, c_K\}. \tag{14}$$

We can record the feedback as a list. For a list, the space complexity of the record is $O(LN)$. In the prediction step, the time complexity is $O(KN^2L^2)$. In the income evaluation step, the time complexity is $O(NK^N)$. In the recording step, the time complexity is $O(N)$. In our discussion, the number of players and the number of choices are relatively small and the number of turns is large, so that $NK \ll L$. The bottleneck of the problem is therefore the time complexity of prediction.

To improve on this in the case where all the records have the same weight and one historical step is considered, we can use an $N$-dimensional array for storage. We build an $N$-dimensional array $A$ in which each dimension has length $K$. The entries of $A$ are counters of all specific situations, which represent the combinations of the choice space. Then $A[c_{x_1}][c_{x_2}]\cdots[c_{x_N}]$ is the total number of turns in which players $1, 2, \ldots, N$ chose $c_{x_1}, c_{x_2}, \ldots, c_{x_N}$. In this situation, formula (10) simplifies to

$$\prod_j P(X_{j,m} = h_{j,t} \mid X_i = c_k) = \frac{A[r_{1,t-1}][r_{2,t-1}]\cdots[r_{i-1,t-1}][c_k][r_{i+1,t-1}]\cdots[r_{N,t-1}]}{\sum_{k=1}^{K} A[r_{1,t-1}][r_{2,t-1}]\cdots[r_{i-1,t-1}][c_k][r_{i+1,t-1}]\cdots[r_{N,t-1}]}. \tag{15}$$

The time complexity of the prediction reduces to $O(NK)$. The time complexity of the recording step is $O(1)$ (it is $O(N)$ in an actual data structure), while the space complexity rises to $O(K^N)$. This space complexity is acceptable when $K$ and $N$ are relatively small.
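The counter array and the ratio in formula (15) can be sketched with a flat list indexed in mixed radix, which behaves like an $N$-dimensional array with $K$ entries per dimension. The class and method names are hypothetical, choices are encoded as integers $0, \ldots, K-1$, and the uniform fallback for an empty slice is an added assumption.

```python
class JointCounter:
    """Counters over all K**N joint choices, stored as a flat list."""

    def __init__(self, n_players, n_choices):
        self.n = n_players
        self.k = n_choices
        self.counts = [0] * (n_choices ** n_players)  # O(K^N) space

    def _index(self, joint):
        idx = 0
        for c in joint:  # mixed-radix index, O(N) per lookup
            idx = idx * self.k + c
        return idx

    def record(self, joint):
        """Count one round; joint[j] is player j's choice in [0, K)."""
        self.counts[self._index(joint)] += 1

    def predict(self, i, last_round):
        """Formula (15): keep every other player's entry fixed to
        `last_round` and normalize over player i's K choices. O(N K) time."""
        cells = []
        for c in range(self.k):
            joint = list(last_round)
            joint[i] = c
            cells.append(self.counts[self._index(joint)])
        total = sum(cells)
        if total == 0:
            return [1.0 / self.k] * self.k  # no data yet: uniform (assumption)
        return [x / total for x in cells]


counter = JointCounter(n_players=2, n_choices=2)
for joint in [(0, 1), (0, 1), (0, 1), (1, 1)]:
    counter.record(joint)
print(counter.predict(0, (0, 1)))  # [0.75, 0.25]
```

Recording touches a single counter once its index is computed, and prediction reads only $K$ counters, which is where the reduction from scanning the full record list comes from.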

The $N$-dimensional array can only handle a 1-step look-back, and its data structure becomes complex when $N$ becomes large. Here we can use a hash table to make a more efficient record. A list of $m$-step data records is shown as follows:

$$\{F_l, F_{l-1}, \ldots, F_{l-m+1}\} = \{\{c_{x_1}, c_{x_2}, \ldots, c_{x_N}\}_{l}, \ldots, \{c_{x_1}, c_{x_2}, \ldots, c_{x_N}\}_{l-m+1}\}. \tag{16}$$

We can get its hash code through a hash function:

$$\mathrm{hash}_l = Ha(F_l, F_{l-1}, \ldots, F_{l-m+1}). \tag{17}$$

The hash code provides an index into the record array, which records the number of times that a specific situation has happened. With the use of a hash table, the time complexity of prediction is still $O(NK)$. The time complexity of recording is $O(1)$, and the space complexity becomes $O(1)$, in the sense that it is fixed by the device and does not depend on $N$ or $K$. Table 3 shows the space and time complexity of the different strategies.
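The hash-based record can be sketched with a fixed-size table of counters. This is an illustration only: the polynomial hash, the table size, and the class name are assumptions, not the paper's $Ha$ function, and one table is kept per predicted player. Distinct histories may collide in one slot, which is the price of a table whose size does not grow with $N$ or $K$.

```python
class HistoryHashTable:
    """Fixed-size table of counters indexed by a hash of the last m rounds."""

    def __init__(self, n_choices, m, table_size=1 << 16):
        self.k = n_choices
        self.m = m
        self.size = table_size
        # counts[slot][c]: how often choice c followed a history hashing to slot
        self.counts = [[0] * n_choices for _ in range(table_size)]

    def _hash(self, history):
        """Illustrative polynomial hash over m rounds of integer choices."""
        h = 0
        for round_choices in history:
            for c in round_choices:
                h = (h * 31 + c + 1) % self.size
        return h

    def record(self, history, next_choice):
        self.counts[self._hash(history)][next_choice] += 1

    def predict(self, history):
        """Smoothed frequencies, using the add-one correction of formula (7)."""
        row = self.counts[self._hash(history)]
        total = sum(row)
        return [(x + 1) / (total + self.k) for x in row]


table = HistoryHashTable(n_choices=2, m=1, table_size=1024)
history = [(0, 1)]           # last round: player 0 chose 0, player 1 chose 1
for _ in range(3):
    table.record(history, 1)  # the predicted player then chose 1, three times
print(table.predict(history))
```

Computing the hash in this sketch still reads all $N \cdot m$ recorded choices, so the constant-time recording figure holds only if the hash is maintained incrementally or $N$ and $m$ are treated as constants.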

4. Brief Steps of the Algorithm

Firstly, we built an environment for the prisoner’s dilemma game. Each player is asked to provide a strategy function and an update function. The program of the environment is shown in Algorithm 2.

The strategy and update functions we provide are shown in Algorithm 3.
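The environment and the player interface described in this section can be sketched in a few lines. The sketch is illustrative: `RandomPlayer` is a hypothetical stand-in for any player exposing `strategy()` and `update()`, the feedback is assumed to carry both the choices and the incomes, and the two-player payoffs use $R = 3$, $T = 5$, $S = 0$, $P = 1$.

```python
import random


class RandomPlayer:
    """Placeholder player exposing the strategy/update interface."""

    def __init__(self, rng):
        self.rng = rng
        self.storage = []  # feedback history

    def strategy(self):
        return self.rng.choice(["C", "D"])

    def update(self, feedback):
        self.storage.append(feedback)


def pd_profit(decision):
    """Two-player payoffs with R = 3, T = 5, S = 0, P = 1."""
    table = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
             ("D", "C"): (5, 0), ("D", "D"): (1, 1)}
    return table[tuple(decision)]


def run_game(players, profit, max_turn):
    """Environment loop: collect all decisions, compute incomes,
    then send the feedback to every player."""
    totals = [0.0] * len(players)
    for _ in range(max_turn):
        decision = [p.strategy() for p in players]  # simultaneous choices
        incomes = profit(decision)
        for idx, p in enumerate(players):
            p.update((tuple(decision), tuple(incomes)))  # assumed feedback layout
            totals[idx] += incomes[idx]
    return totals


players = [RandomPlayer(random.Random(0)), RandomPlayer(random.Random(1))]
totals = run_game(players, pd_profit, max_turn=100)
print(totals)
```

All decisions are collected before any feedback is sent, so no player can react to another player's choice within the same turn.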

5. Experimental Results and Analysis

5.1. The Performance of Bayes Model in the Double-Player Game. Four typical models were each run against the Bayes model 10000 times; the total incomes of both players in each game were recorded. Figure 1 shows the overall incomes of both players recorded over 10000 games, comparing the proposed Bayes model with the other four typical models. Overall, the Bayes model was more advantageous and achieved a higher score (overall income) than the other four. Of these four typical strategy models, TFT performed best: its overall income was equivalent to that of the Bayes model and higher than that of all the others. For each game pair, this research presents two test results, each corresponding to one of the two stable score results from the selected game pair. Figure 2 reveals that the Bayes model earned a higher income than the random, Pavlov, and GTFT models. The income ratio to the GTFT model reached 6.6, while that to the TFT model also exceeded one. By examining the final income from repeated games, it was found that the Bayes model was more advantageous than the other four typical strategy models tested here.

Table 3: Time and space complexity of Bayes method with different data structures and some typical models.

Method        Space complexity    Time complexity of prediction    Time complexity of record
List          $O(LN)$             $O(KN^2L^2)$                     $O(1)$
$N$-d array   $O(K^N)$            $O(NK)$                          $O(N)$
Hash table    $O(1)$              $O(NK)$                          $O(1)$
TFT           $O(N)$              $O(N)$                           $O(N)$
GTFT          $O(N)$              $O(N)$                           $O(N)$
Pavlov        $O(N)$              $O(N)$                           $O(N)$
Random        $O(1)$              $O(1)$                           $O(1)$

    for t from 0 to MAX_TURN do
        for each player i do
            decision[i] ← player[i].strategy()
        end
        feedback ← profit(decision)
        for each player i do
            player[i].update(feedback)
        end
    end

Algorithm 2

    procedure strategy()
        if tem_turn < TURN_THRESHOLD then
            decision ← random(choice_space)
        else
            prediction_list ← predict(storage)
            for each choice c do
                profit[c] ← profit_expect(c, prediction_list)
            end
            decision ← arg(max(profit))
        end
        return decision
    end
    procedure update(feedback)
        storage.add(feedback)
    end

Algorithm 3

In games repeated 10000 times, the cases when the Bayes model scored 5, 3, 1, and 0 were statistically analyzed. As shown in Figure 3, the scores of 5 and 1 represented a relatively large proportion. That is to say, the Bayes model was prone to defection. Analysis of Figures 1 and 3 implied that high scores mostly corresponded to cases scoring 5. Moreover, the results of each game competition showed that the incomes achieved by the Bayes model were higher when it manifested its tendency to defection.

Analysis of the overall income of both players in each game (Figure 4) showed that the overall income in the game with a TFT model was lower. According to the performance of the rivals in each game (Figure 3), and by comparison with test result 2 (names ending with “2”), it was noted that both strategy models in test result 1 (names ending with “1”) were less inclined to defection. Therefore their overall income was higher. Since TFT is considered to be the model that achieves the most desirable results among the four typical strategy models, games setting the TFT strategy model against the Bayes model were mainly investigated. In this game, the overall incomes of both sides were lower than those in other game competitions. This indicated that both sides suffered losses in the game between the TFT strategy model and the Bayes model. By studying the single game results from the Bayes model and the TFT model, it was found that the scores from the Bayes model were 1 (both sides simultaneously selected defection). This result suggested that, in any game between the Bayes model and the TFT model, defection appeared more frequently and represented a mark of the defection-prone tendencies of the two models.

Figure 2: The income ratio of the Bayes model to other models in the games (categories: Random1, Random2, TFT1, TFT2, Pavlov1, Pavlov2, GTFT1, GTFT2).

Figure 3: The distribution of Bayesian scores (5, 3, 1, and 0) in each game (same categories).

Figure 4: The comparisons of overall game incomes (series: Bayes, Rival; same categories).

5.2. The Performance of the Bayes Model in a Multiplayer Game. With regard to multiplayer games, this study jointly used four different strategy models to run against the Bayes model, and the overall income from each model was recorded. With the methods described in Section 4, the income accruing to each player was distributed and the overall income was calculated. In this section, the decision methods of the TFT, Pavlov, and GTFT models differed slightly from those applied to the two-player game: in the event of the defection of one of the other players in the previous round, the rivals selected defection. In the two-player game, by contrast, the decision for the current round was made by examining the decisions of players A and B in the previous round.

It can be deduced from Figure 5 that, in the multiplayer game, the Bayes model returned the highest overall income. In addition, the overall game situation implied that the proportion of rounds in which the four typical strategy models selected cooperation was the highest. That is to say, in the game with four models, each treated cooperation as its main strategy (Figure 6).

5.3. Analysis of the Performance of the Bayes Model versus Normal Models. The normal model refers to the strategy models that are possibly encountered in real-life enactments of the game, simulated here using a natural model. The natural model was a model adopting a random strategy (that is, when faced with identical decisions from the previous round, the probabilities that the natural models selected cooperation were different). To verify that the strategy selected by the proposed Bayes model reaped more benefits in games versus the normal model, the game between them was repeated 500 times. In each game, there were 1000 selections. This was an attempt to analyze the advantages and disadvantages of the Bayes model more comprehensively.

Figure 7 shows that the Bayes model performed better than most of the natural models. Overall, the income from the Bayes model reached approximately 3000, and even 4500 in individual extreme cases. The income ratio in Figure 8 was maximized at approximately 14, while most of the income ratios were above one.

In addition, the TFT model also played games against the normal model; the result is shown in Figure 9. The average income of the TFT model was approximately 2500, which was lower than that of the Bayes model while still equivalent to those of its other rivals.

Figure 5: The overall income of each model (Bayes, TFT, Pavlov, GTFT) after 10000 times of multiplayer game.

Figure 6: Frequency of occurrence of games in the one-shot game when the number of betrayals (0 to 4) is varied.

5.4. The Performance of the Bayes Model When Run over Fewer Games. Since the Bayes model is a machine learning model, it needs a certain amount of data to guarantee its learning. Therefore, whether or not the Bayes model can achieve better game results when there are fewer games should be considered.

In this study the Bayes model was evaluated using fewergame times in competition with the TFT model 100 timeswith the game repeated 100 times From Figures 10 and 11 wecan see that the Bayes model was more advantageous

5.5. Game Results from a PTFT Model Compared with the Other Models. The game models discussed above merely considered the attitude immediately after the selection of the previous step of both sides, while the PTFT model took account of the attitude during the selection of the previous three steps. Over 10000 runs of the PTFT model against the other models, the incomes of each model are shown in Figure 12, which suggested that the Bayes model was at a disadvantage in this game and gained neither more nor less than the TFT model (both models became trapped in the mutual defection deadlock). Moreover, the Pavlov model returned the lowest individual income, while the GTFT model gained the most.

This revealed one of the disadvantages of the Bayes model: if it failed to comprehensively consider all state characteristics that may appear in the game, it could not obtain the optimum solution. In the current experiment, since the number of decision steps of both sides considered by the Bayes model was set to one, the Bayes model was incapable of obtaining the optimum income in the game against the PTFT model.

6. Conclusions

The authors regarded the prisoners' dilemma as an incomplete information game with unpublicized game strategies. In the research, a machine learning model was constructed to solve problems in incomplete information games. Based on the Bayesian model, we can predict players' choices to better complete the unknown information. We also suggested the hash table to improve space and time complexity. We built a game system with several types of game strategy for testing. The experimental results show that the proposed Bayes model obtained more desirable game results than conventional typical strategy models in double- or multiplayer games repeated 10000 times. The Bayes model even appeared slightly better than the TFT model, the acknowledged optimal strategy. In a game with more general single-step decision modeling and fewer game runs, the Bayes model also dominated. This result indicated that the naive Bayesian classification algorithm was feasible and effective for establishing the strategy model of an incomplete information game. It provided a novel idea for solving incomplete information game problems.

Figure 7: The income of the Bayes model and normal models in the game (series: Bayes and Normal).

Figure 8: The income ratio of the Bayes model versus the general model.

Figure 9: The income of the TFT model and normal models in the game (series: TFT and Normal).

However, the results obtained by the naive Bayesian classification algorithm showed certain defects: it was unable to obtain the desired solution when the decision ability of a rival was beyond its estimation range. This reduced the applicability of the machine learning algorithm when encountering complex models, which should be the subject of future research.

Figure 10: The incomes of the Bayes and TFT models after 100 games.

Figure 11: The average income of the Bayes and TFT models after 100 games.

Figure 12: The incomes of each model (Bayes, TFT, Pavlov, and GTFT) after 10000 games (series: Rival and PTFT).

Figure 13: The income of the Bayes model with different depths of considered step in the weight function when competing with a random m-step strategy (horizontal axis: depth of considered step, 1 to 10; series: m = 1 to m = 5).

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This research was supported by the National Natural Science Foundation of China (No. 61100148) and the Project on the Integration of Industry, Education and Research of Guangdong Province (No. 2012B091100489).

References

[1] O. Nomia, Games with Incomplete Information, Universite Paris 1 Pantheon-Sorbonne, Paris, France, 1998.

[2] J. C. Harsanyi, "Games with incomplete information played by 'Bayesian' players, I: The basic model," Management Science, vol. 14, no. 3, pp. 159–182, 1967.

[3] M. Zinkevich, M. Johanson, M. H. Bowling, and C. Piccione, "Regret minimization in games with incomplete information," Advances in Neural Information Processing Systems, vol. 2008, no. 20, pp. 1729–1736, 2008.

[4] E. Alpaydin, Introduction to Machine Learning, The MIT Press, Cambridge, Mass, USA, 2004.

[5] C. M. Bishop, Pattern Recognition and Machine Learning, Information Science and Statistics, Springer, New York, NY, USA, 2006.

[6] X. G. Zhang, "Introduction to statistical learning theory and support vector machines," Acta Automatica Sinica, vol. 26, no. 1, pp. 32–42, 2000.

[7] R. J. Hanson, B. Cheeseman, and P. Stutz, Bayesian Classification Theory, NASA Ames Research Center, Artificial Intelligence Research Branch, 1991.

[8] M. Nowak and K. Sigmund, "A strategy of win-stay, lose-shift that outperforms tit-for-tat in the Prisoner's Dilemma game," Nature, vol. 364, no. 6432, pp. 56–58, 1993.

[9] D. M. Kreps, P. Milgrom, J. Roberts, and R. Wilson, "Rational cooperation in the finitely repeated prisoners' dilemma," Journal of Economic Theory, vol. 27, no. 2, pp. 245–252, 1982.

[10] D. W. K. Yeung, L. A. Petrosyan, and M. C. C. Lee, Dynamic Cooperation: A Paradigm on the Cutting-Edge of Game Theory, China Market Press, 2007.

[11] R. Axelrod, "Effective choice in the prisoner's dilemma," Journal of Conflict Resolution, vol. 24, no. 1, pp. 3–25, 1980.

[12] R. Axelrod and W. D. Hamilton, "The evolution of cooperation," Science, vol. 211, no. 4489, pp. 1390–1396, 1981.

[13] J. H. Miller, "The coevolution of automata in the repeated prisoner's dilemma," Journal of Economic Behavior and Organization, vol. 29, no. 1, pp. 87–112, 1996.

[14] W. H. Press and F. J. Dyson, "Iterated Prisoner's Dilemma contains strategies that dominate any evolutionary opponent," Proceedings of the National Academy of Sciences of the United States of America, vol. 109, no. 26, pp. 10409–10413, 2012.

[15] H. Lin and C.-X. Wu, "Evolution of strategies based on genetic algorithm in the iterated prisoner's dilemma on complex networks," Acta Physica Sinica, vol. 56, no. 8, pp. 4313–4318, 2007.

[16] M. A. Nowak and K. Sigmund, "Evolutionary dynamics of biological games," Science, vol. 303, no. 5659, pp. 793–799, 2004.

[17] J. W. Weibull, Evolutionary Game Theory, MIT Press, Cambridge, Mass, USA, 1997.

[18] M. A. Nowak and R. M. May, "Evolutionary games and spatial chaos," Nature, vol. 359, no. 6398, pp. 826–829, 1992.

[19] A. Cardillo, J. Gomez-Gardenes, D. Vilone, and A. Sanchez, "Co-evolution of strategies and update rules in the prisoner's dilemma game on complex networks," New Journal of Physics, vol. 12, no. 10, Article ID 103034, 2010.

[20] W.-B. Du, H.-R. Zheng, and M.-B. Hu, "Evolutionary prisoner's dilemma game on weighted scale-free networks," Physica A: Statistical Mechanics and Its Applications, vol. 387, no. 14, pp. 3796–3800, 2008.

[21] H. Ohtsuki, C. Hauert, E. Lieberman, and M. A. Nowak, "A simple rule for the evolution of cooperation on graphs and social networks," Nature, vol. 441, no. 7092, pp. 502–505, 2006.

[22] Y. Wang and C. Dang, "An evolutionary algorithm for global optimization based on level-set evolution and latin squares," IEEE Transactions on Evolutionary Computation, vol. 11, no. 5, pp. 579–595, 2007.

[23] Y. Wang, Y.-C. Jiao, and H. Li, "An evolutionary algorithm for solving nonlinear bilevel programming based on a new constraint-handling scheme," IEEE Transactions on Systems, Man and Cybernetics C: Applications and Reviews, vol. 35, no. 2, pp. 221–232, 2005.

[24] R. Selten and R. Stoecker, "End behavior in sequences of finite Prisoner's Dilemma supergames: A learning theory approach," Journal of Economic Behavior and Organization, vol. 7, no. 1, pp. 47–70, 1986.

[25] D. Y. Jiang, Situation Analysis of Double Action Games with Entropy, Science Press, New York, NY, USA, 2010.

[26] P. A. Flach and N. Lachiche, "Naive Bayesian classification of structured data," Machine Learning, vol. 57, no. 3, pp. 233–269, 2004.


4 Mathematical Problems in Engineering

rounds. In one round, all players have to submit their choices to the judges at the same time and then receive the feedback. The feedback includes the other players' choices. As the profit function is provided for the game, all players can work out their competitors' profit from the feedback, which includes all players' choices. As this is a game with incomplete information, all the players can mask their strategy. In addition, the other players' choices are unknown before the feedback is returned.

By referring to a published multiplayer dilemma study [25], a reward and punishment rule was defined for this study as follows.

(1) When all players selected cooperation (C), the income $R$ was shared equally among all players.

(2) When some players selected defection (D), the income $T$ was shared equally among the defecting players, while the income $S$ was shared among the cooperating players.

(3) When all players selected defection (D), the income $P$ was shared equally among all players.

For a prisoners' dilemma with $n = 4$, the parameters were set to $R = 12$, $T = 10$, $S = 0$, and $P = 4$, which was in agreement with the standard form of the game. That is, the individual optimum solution of every player was obtained when one player selected defection, while the overall optimum solution was obtained when all players selected cooperation.
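The three rules and the quoted parameters can be illustrated with a short Python sketch. It assumes that "averaged" means an equal split of the stated income among the players concerned; the function name `payoffs` and the `'C'`/`'D'` encoding are illustrative choices, not taken from the paper.

```python
from typing import List

# Parameters quoted in the text for n = 4 players.
R, T, S, P = 12, 10, 0, 4


def payoffs(choices: List[str]) -> List[float]:
    """Return each player's income for one round; choices are 'C' or 'D'."""
    n = len(choices)
    defectors = choices.count("D")
    cooperators = n - defectors
    if defectors == 0:
        # Rule (1): everyone cooperates, R is split among all players.
        return [R / n] * n
    if cooperators == 0:
        # Rule (3): everyone defects, P is split among all players.
        return [P / n] * n
    # Rule (2): T is split among defectors, S among cooperators.
    return [T / defectors if c == "D" else S / cooperators for c in choices]
```

Under this reading, a lone defector earns 10 while each player earns 3 under full cooperation and 1 under full defection, which reproduces the usual ordering of temptation, reward, and punishment.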

In this prisoners' game, the four strategies of each group were unavailable to the other players before they made their decision. After decisions were made, the income of each player was calculated and the decisions were revealed; the game was repeated 10000 times.

3. Strategy in Prisoner's Dilemma Game

In the prisoner's dilemma game we are studying, the strategy of other players is unknown, while the profit function and choice space are clear. Our strategy in this game is to reduce the unclear information and maximize the expected profit. There are at least three challenges, as follows.

(i) The strategies of other players are variable and unstable. They may make a different choice in the same situation (as in a strategy that chooses at random). That is to say, no best choice can be selected in a single game.

(ii) The other players in the game may all have different strategies, and they play together in one game.

(iii) The performance and efficiency of the strategy should be guaranteed, especially when the number of players is large.

To solve the problems listed above, we propose the Bayes formula as the basic method of our strategy. Though players' choices can be considered random and irregular, their actions can be described as a series of probabilities over the possible choices. The Bayes formula provides us with a way to make predictions based on the history. According to the Bayes formula, we can build our prediction table, which includes the probability sets for each individual player. As the game in our study is a multiple-turn-based game, the historical data is easy to obtain and store.

Table 2: Decision possibility table of TFT.

Choice of last turn (mine, opponent's)    My choice of next turn (C)    My choice of next turn (D)
(C, C)                                    1.0                           0
(C, D)                                    0                             1.0
(D, C)                                    1.0                           0
(D, D)                                    0                             1.0

After the prediction, we can evaluate each choice in our choice space with a probability-based method, for instance, the expectation of profit. To be more convincing, we can include multiple future steps in our evaluation. Finally, we select the choice with the highest value as the decision for the current turn.

To make our strategy more efficient, we can use an $N$-dimension array or a hash table for data storage. The $N$-dimension array shows the best performance in competitions with a small number of players. The hash table is used in games including a large number of players.

3.1. Prediction. In an extended prisoner's dilemma game model, we assume that there are $N$ players in the game and each player has $K$ choices. The goal of prediction is to get the probability distribution of each player's choice, that is,

$$P(X_i = c_k). \tag{4}$$

We do not know each player's strategy, but we can get each player's choices in the past. Our strategy assumes that other players base their decisions on the historical data they recorded and on a decision possibility table. Other players record a limited number of steps of historical data and use a decision possibility table to make their decisions. For example, Table 2 shows the decision possibility table used in TFT. Consequently, we can infer such a table using tools from probability theory, including the Bayes formula, and predict other players' choices based on the historical data.
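To make the idea of a decision possibility table concrete, the following Python sketch stores the TFT table of Table 2 as a dictionary and draws the next choice from it. The names `TFT_TABLE` and `next_choice` are hypothetical, and only the probability of cooperating is stored, since the probability of defecting is its complement.

```python
import random
from typing import Dict, Tuple

# Keys are (my last choice, opponent's last choice); values are P(next = C).
TFT_TABLE: Dict[Tuple[str, str], float] = {
    ("C", "C"): 1.0,
    ("C", "D"): 0.0,
    ("D", "C"): 1.0,
    ("D", "D"): 0.0,
}


def next_choice(table: Dict[Tuple[str, str], float],
                last: Tuple[str, str],
                rng: random.Random) -> str:
    """Sample the next choice given the last round's pair of choices."""
    # rng.random() lies in [0, 1), so probability 1.0 always yields 'C'
    # and probability 0.0 always yields 'D'.
    return "C" if rng.random() < table[last] else "D"
```

A strategy with entries strictly between 0 and 1 would be represented in the same way, which is why a predictor only needs to estimate the entries of such a table.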

The decision possibility table can be presented as the probability function $P(X_i = c_k \mid H = h_t)$, which means the probability that player $i$ ($X_i$) chooses $c_k$ when the history of opponents' choices is $h_t$. From the Bayes formula [26], we know that

$$P(X_i = c_k \mid H = h_t) = \frac{P(X_i = c_k)\prod_j P(X_{j,m} = h_{j,t} \mid X_i = c_k)}{\sum_k \left(P(X_i = c_k)\prod_j P(X_{j,m} = h_{j,t} \mid X_i = c_k)\right)}, \tag{5}$$

where $X_i$ means the choice of player $i$, $c_k$ represents the $k$th choice, $H$ represents the history of the choices of all players, and $h_t$ is an $m$-dimension vector meaning the current record of the players' decisions:

$$\begin{aligned}
h_j &= \{R_{j-1}, R_{j-2}, \ldots, R_{j-m}\} \\
    &= \{r_{1,j-1}, r_{2,j-1}, \ldots, r_{N,j-1}, \ldots, r_{1,j-m}, r_{2,j-m}, \ldots, r_{N,j-m}\} \quad (r_{i,j} \in \{c_1, c_2, \ldots, c_K\}), \\
h_{i,l} &= \{r_{i,l-1}, r_{i,l-2}, \ldots, r_{i,l-m}\},
\end{aligned} \tag{6}$$

where $R_j$ is the set of choices of each player in the $j$th turn. The record set $h_j$ dates back $m$ turns and includes $m$ sets of records, and $r_{i,j}$ is the choice of player $i$ in the $j$th turn, which is one of the elements in the set of choices. $X_{j,m}$ is the $m$-step choice history of player $j$. Here we consider that all strategies base their decision on the last few steps, and therefore the earlier historical choices are immaterial to the current judgment. If $m$ is very large, or in some extreme cases, there may not be enough records for building $h_j$, and the denominator of the formula may become 0. To avoid such a situation, we can correct the original formula by simultaneously increasing the numerator and the denominator:

$$P(X_i = c_k \mid H = h_t) = \frac{\left(P(X_i = c_k)\prod_j P(X_{j,m} = h_{j,t} \mid X_i = c_k)\right) + 1}{\left(\sum_k \left(P(X_i = c_k)\prod_j P(X_{j,m} = h_{j,t} \mid X_i = c_k)\right)\right) + \lvert c_k \rvert}. \tag{7}$$
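The effect of the correction in formula (7) is easiest to see in a count-based setting. The Python sketch below is a simplified variant, not the authors' implementation: it estimates a player's next choice directly from how often each choice followed the current $m$-round history, adds 1 to every count, and adds the number of choices to the total, reading $\lvert c_k \rvert$ as that number (an assumption). The names `predict_next` and `Round` are hypothetical, and $m \ge 1$ is assumed.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

Round = Tuple[str, ...]  # one choice per player for a single turn


def predict_next(rounds: List[Round], player: int, m: int,
                 choices: Sequence[str] = ("C", "D")) -> Dict[str, float]:
    """Smoothed estimate of P(player's next choice | last m rounds), m >= 1."""
    joint: Counter = Counter()  # (state, choice) -> number of occurrences
    seen: Counter = Counter()   # state -> number of occurrences
    for l in range(m, len(rounds)):
        state = tuple(rounds[l - m:l])          # the m rounds before turn l
        joint[(state, rounds[l][player])] += 1  # what the player did next
        seen[state] += 1
    current = tuple(rounds[-m:])
    k = len(choices)
    # Add-one correction: never divides by zero, even for an unseen state.
    return {c: (joint[(current, c)] + 1) / (seen[current] + k) for c in choices}
```

With no matching records the estimate falls back to the uniform distribution, which is the behaviour the corrected formula is meant to guarantee.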

The next step is to obtain the probability of each situation from the historical record. From the previous analysis, we know that we should extract the values of $P(X_i = c_k)$ and $P(H = h_j \mid X_i = c_k)$ from $R_j$. Basically, we can easily get the probabilities from the formulas

$$P(X_i = c_k) = \frac{\sum_{l=1}^{L} \tau_k(r_{i,l})}{\sum_{l=1}^{L} \sigma_i(r_{i,l})},$$

$$\prod_j P(X_{j,m} = h_{j,t} \mid X_i = c_k) = P(X_{1,m} = h_{1,t}, X_{2,m} = h_{2,t}, \ldots, X_{N,m} = h_{N,t} \mid X_i = c_k) = \frac{\sum_{l=m}^{L} \prod_{j=1}^{N} \tau^{*}_{i,k,t}(h_{j,l})}{\sum_{l=m}^{L} \prod_{j=1}^{N} \sigma^{*}_{t}(h_{j,l})},$$

$$\tau_k(r_{i,l}) = \begin{cases} 1 & (r_{i,l} = c_k,\ r_{i,l} \in R_l) \\ 0 & (\text{else}), \end{cases} \qquad
\sigma_i(r_{i,l}) = \begin{cases} 1 & (r_{i,l} \in R_l) \\ 0 & (\text{else}), \end{cases}$$

$$\tau^{*}_{i,k,t}(h_{j,l}) = \begin{cases} 1 & (r_{i,l} = c_k,\ h_{j,l} = h_{j,t}) \\ 0 & (\text{else}), \end{cases} \qquad
\sigma^{*}_{t}(h_{j,l}) = \begin{cases} 1 & (h_{j,l} = h_{j,t}) \\ 0 & (\text{else}). \end{cases} \tag{8}$$

Here $h_{i,l}$ is an ordered and comparable sequence of the $m$-step records of player $i$, and $t$ is the current turn number. We can get the result of formula (7) from the data we record.

Moreover, as the strategy varies from player to player, we should not expect all players to use similar strategies in the game. For example, some players may ignore their own choices and focus on others' choices. In that case, the records from the player himself should be discarded when making the prediction. So we add a weight for each history record. The weight function is relevant to the player and the choice,

$$\omega_i = w_i(X_j, c_k), \tag{9}$$

and the probability formula becomes

$$\prod_j P(X_{j,m} = h_{j,t} \mid X_i = c_k) = \frac{\sum_{l=m}^{L} \prod_{j=1}^{N} w_i(X_j, c_k)\, \tau^{*}_{i,k,t}(h_{j,l})}{\sum_{l=m}^{L} \prod_{j=1}^{N} w_i(X_j, c_k)\, \sigma^{*}_{t}(h_{j,l})}. \tag{10}$$

The weight function, which assigns a weight to each history record, varies across strategies. In the experiment on the depth of considered step of weight functions, we found that depths within 5 performed similarly in the game (see Figure 13). So in our study, we use a one-turn weight function for prediction. In addition, most of the classical strategies ignore their self-made choices and regard all the choices as the same. Therefore, we can derive one weight function from these attributes of classical strategies:

$$w_i(X_j, c_k) = \begin{cases} 1 & (i = j) \\ 0 & (\text{else}). \end{cases} \tag{11}$$

3.2. Income Evaluation. In this step, we must evaluate each choice we can return and choose one of them as the final decision. We select the choice with the highest score after the evaluation:

$$C_t = \arg\max_k \operatorname{val}(c_k, h_t). \tag{12}$$

We base the evaluation on the probability of each player's choices and the profit function. Since the profit function is given, we can easily predict the value that our player would gain from a choice. In one-step prediction, the value equals the expectation of the income. Consider

$$\operatorname{val}(c_k, h_t) = \sum \operatorname{profit}(c_k, c_{x_1}, \ldots, c_{x_N}) \cdot P(X_1 = c_{x_1}, \ldots, X_N = c_{x_N} \mid H = h_t). \tag{13}$$
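The one-step evaluation of formulas (12) and (13) can be sketched in Python as follows. Several points are assumptions made for illustration: the opponents' choices are treated as independent so that the joint probability is a product of per-player predictions, `profit` returns only our own income rather than a vector, and `two_player_profit` uses the classical scores 5, 3, 1, and 0. All function names are hypothetical.

```python
from itertools import product
from typing import Callable, Dict, List, Sequence, Tuple

Profit = Callable[[str, Tuple[str, ...]], float]


def expected_income(my_choice: str, predictions: List[Dict[str, float]],
                    profit: Profit, choices: Sequence[str] = ("C", "D")) -> float:
    """val(c_k, h_t): expected income of my_choice over all opponent combinations."""
    total = 0.0
    for combo in product(choices, repeat=len(predictions)):
        prob = 1.0
        for dist, c in zip(predictions, combo):
            prob *= dist[c]  # independence across opponents is assumed here
        total += prob * profit(my_choice, combo)
    return total


def best_choice(predictions: List[Dict[str, float]], profit: Profit,
                choices: Sequence[str] = ("C", "D")) -> str:
    """C_t: the choice with the highest expected income."""
    return max(choices, key=lambda c: expected_income(c, predictions, profit, choices))


def two_player_profit(mine: str, others: Tuple[str, ...]) -> float:
    """Classical two-player scores (assumed mapping of 5, 3, 1, and 0)."""
    table = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}
    return table[(mine, others[0])]
```

Against an opponent predicted to cooperate with probability 0.5, cooperation is worth 1.5 and defection 3.0, so the one-step rule selects defection. The loop enumerates $K^N$ combinations, which is why this evaluation is only practical for small $N$ and $K$.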

The function "profit" is a given profit function with $N$ parameters, where $N$ is equal to the number of players; it returns a vector of the income of the players who make the same choice as the first parameter. We can make our value function more farsighted if both $N$ and $K$ are small (as in the classical prisoner's dilemma, where $N = 2$ and $K = 2$). A $p$-step prediction of the income will then be a more effective method. We can get the best choice with the recursive program (see Algorithm 1).

Algorithm 1:

    Function p_step(c_k, p)
        if p < 0 then
            return 0
        V ← 0
        for each choice c_k do
            V ← V + val(c_k, h_t)
            Append(h_t, c_k, c_x2, ..., c_xN)
            for each choice c do
                V ← V + p_step(c, p − 1)
            end
            Remove(h_t, c_k, c_x2, ..., c_xN)
        end
        return V

3.3. Feedback Record. At the end of each turn of the game, we get the feedback from the system. The feedback includes each player's choice and the income they get. In our model, the profit function is given, so we can determine the income of all players from the historical records of all players.

The content of the feedback can be represented as follows:

$$F_l = \{c_{x_1}, c_{x_2}, \ldots, c_{x_N}\} \quad (c_{x_i} \in \{c_1, c_2, \ldots, c_K\}). \tag{14}$$

We can record the feedback as a list. As a list, the space complexity of the record is $O(LN)$. In the prediction step, the time complexity is $O(KN^2L^2)$. In the income evaluation step, the time complexity is $O(NK^N)$. In the recording step, the time complexity is $O(N)$. In our discussion, the number of players and the number of choices are relatively small and the number of turns is large, such that $NK \ll L$. The bottleneck of the problem is therefore the time complexity of prediction.

When all the records have the same weight and only one historical step is considered, we can use an $N$-dimension array for storage. We build an $N$-dimension array $A$ where the length of each dimension is $K$. The entries of $A$ are counters of all specific situations, which represent the combinations of the choice space. Then $A[c_{x_1}][c_{x_2}] \cdots [c_{x_N}]$ means the total number of turns in which players $1, 2, \ldots, N$ chose $c_{x_1}, c_{x_2}, \ldots, c_{x_N}$. In such a situation, formula (10) simplifies to

$$\prod_j P(X_{j,m} = h_{j,t} \mid X_i = c_k) = \frac{A[r_{1,t-1}][r_{2,t-1}] \cdots [r_{i-1,t-1}][c_k][r_{i+1,t-1}] \cdots [r_{N,t-1}]}{\sum_{k=1}^{K} A[r_{1,t-1}][r_{2,t-1}] \cdots [r_{i-1,t-1}][c_k][r_{i+1,t-1}] \cdots [r_{N,t-1}]}. \tag{15}$$

The time complexity of the prediction reduces to $O(NK)$. The time complexity of the recording step is $O(1)$ (in an actual data structure it is $O(N)$), while the space complexity rises to $O(K^N)$. The space complexity is acceptable when $K$ and $N$ are relatively small.

The $N$-dimension array is only suitable for a 1-step date-back, and its data structure becomes complex when $N$ becomes large. Here we can use a hash table to make a more efficient record. A list of $m$-step data records is shown as follows:

$$\{F_l, F_{l-1}, \ldots, F_{l-m+1}\} = \{\{c_{x_1}, c_{x_2}, \ldots, c_{x_N}\}_{l-1}, \ldots, \{c_{x_1}, c_{x_2}, \ldots, c_{x_N}\}_{l-m}\}. \tag{16}$$

We can get its hash code through a hash function:

$$\text{hash}_l = Ha(F_l, F_{l-1}, \ldots, F_{l-m+1}). \tag{17}$$

The hash code provides an index into the record array, which records the number of times that a specific situation happens. With the use of a hash table, the time complexity of prediction is still $O(NK)$. The time complexity of recording is $O(1)$, and the space complexity becomes $O(1)$, which depends on the device and is not relevant to $N$ or $K$. Table 3 shows the space and time complexity of the different strategies.
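A hash-table record of $m$-step histories might look like the following Python sketch. Python's built-in `dict` stands in for the hash function $Ha$ and the record array, which is an assumption about how the idea could be realized rather than the paper's data structure. The class name `HistoryCounter` and its methods are hypothetical.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Tuple

Feedback = Tuple[str, ...]            # one choice per player for a single turn
State = Tuple[Feedback, ...]          # the last m feedbacks, oldest first


class HistoryCounter:
    """Counts how often each feedback followed each m-step history."""

    def __init__(self, m: int) -> None:
        self.m = m
        self.window: Deque[Feedback] = deque(maxlen=m)
        self.counts: Dict[Tuple[State, Feedback], int] = defaultdict(int)

    def record(self, feedback: Feedback) -> None:
        # Count only once a full m-step history is available.
        if len(self.window) == self.m:
            self.counts[(tuple(self.window), feedback)] += 1
        self.window.append(feedback)  # the deque drops the oldest entry itself

    def count(self, state: State, feedback: Feedback) -> int:
        # .get avoids creating entries for situations never observed.
        return self.counts.get((state, feedback), 0)
```

Each call to `record` hashes one key, so its cost does not grow with the number of turns already played. Only situations that actually occurred are stored in this sketch.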

4. Brief Steps of the Algorithm

Firstly, we built an environment for the prisoner's dilemma game. Each player is asked to provide a strategy function and an update function. The program of the environment is shown in Algorithm 2.

The strategy and update functions we provide are shown in Algorithm 3.
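The environment loop of Algorithm 2 can be sketched in Python for the two-player case. The players shown are simple stand-ins (tit-for-tat and always-defect) rather than the Bayes model, the payoff table uses the classical scores 5, 3, 1, and 0 as an assumption, and the feedback passed to `update` is the pair of decisions. All names are hypothetical.

```python
from typing import List, Tuple

# (choice of player 0, choice of player 1) -> (income 0, income 1)
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}


class TitForTat:
    def __init__(self) -> None:
        self.last_opponent = "C"  # cooperate on the first turn

    def strategy(self) -> str:
        return self.last_opponent

    def update(self, feedback: Tuple[str, str], me: int) -> None:
        self.last_opponent = feedback[1 - me]


class AlwaysDefect:
    def strategy(self) -> str:
        return "D"

    def update(self, feedback: Tuple[str, str], me: int) -> None:
        pass  # this player keeps no history


def play(players: list, turns: int) -> List[int]:
    """Run the game loop: collect decisions, score them, send feedback."""
    totals = [0, 0]
    for _ in range(turns):
        decision = (players[0].strategy(), players[1].strategy())
        gains = PAYOFF[decision]
        totals = [t + g for t, g in zip(totals, gains)]
        for i, p in enumerate(players):
            p.update(decision, i)
    return totals
```

Any object exposing `strategy` and `update` can be dropped into `play`, which mirrors the requirement that each player supply those two functions.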

5. Experimental Results and Analysis

5.1. The Performance of Bayes Model in the Double-Player Game. Four typical models were run against the Bayes model 10000 times each; the total incomes of both players in each game were recorded. Figure 1 shows the overall incomes of both players recorded over 10000 games, comparing the proposed Bayes model with the other four typical models. Overall, the Bayes model was more advantageous and achieved a higher score (overall income) than the other four. Of these four typical strategy models, TFT performed best. It showed an overall income equivalent to that of the Bayes model and a higher income than all the others. For each game pair, this research presented two test results, each corresponding to one of the two stable score results from the selected game pair. Figure 2 reveals that the Bayes model earned a higher income than the random, Pavlov, and GTFT models. The income ratio to the GTFT model reached 6.6, while that with the TFT model also exceeded one. By examining the final income from repeated games, it was found that the Bayes model was more advantageous than the other four typical strategy models tested here.

Table 3: Time and space complexity of the Bayes method with different data structures and of some typical models.

Method        Space complexity    Time complexity of prediction    Time complexity of record
List          O(LN)               O(KN^2 L^2)                      O(1)
N-d array     O(K^N)              O(NK)                            O(N)
Hash table    O(1)                O(NK)                            O(1)
TFT           O(N)                O(N)                             O(N)
GTFT          O(N)                O(N)                             O(N)
Pavlov        O(N)                O(N)                             O(N)
Random        O(1)                O(1)                             O(1)

Algorithm 2:

    for t from 0 to MAX_TURN do
        for each player i do
            decision[i] ← player[i].strategy
        end
        feedback ← profit(decision)
        for each player i do
            player[i].update(feedback)
        end
    end

Algorithm 3:

    procedure strategy()
        if tem_turn < TURN_THRESHOLD then
            decision ← random(choice_space)
        else
            prediction_list ← predict(restore)
            for each choice c do
                profit[c] ← profit_expect(c, prediction_list)
            end
            decision ← arg(max(profit))
        end
        return decision
    end
    procedure update(feedback)
        storage.add(feedback)

In games repeated 10000 times, the cases when the Bayes model scored 5, 3, 1, and 0 were statistically analyzed. As shown in Figure 3, the scores of 5 and 1 represented a relatively large proportion. That is to say, the Bayes model was prone to defection. Analysis of Figures 1 and 3 implied that high scores mostly corresponded to cases scoring 5. Moreover, the results of each game competition showed that the incomes achieved by the Bayes model were higher when it manifested its tendency to defection.

Analysis of the overall income of both players in each game (Figure 4) showed that the overall income in the game with a TFT model was lower. According to the performance of the rivals in each game (Figure 3) and comparison with test result 2 (with names ending in "2"), it was noted that both strategy models in test result 1 (with names ending in "1") were less inclined to defection. Therefore, their overall income was higher. Since TFT is considered to be the model that achieves the most desirable results amongst the four typical strategy models, games setting the TFT strategy model and the Bayes model in opposition were mainly investigated. In this game, the overall incomes of both sides were lower than those in other game competitions. This indicated that both sides suffered losses in the game between the TFT strategy model and the Bayes model. By studying the single game results from the Bayes model and the TFT model, it was found that the scores from the Bayes model were 1 (both sides simultaneously selected defection). This result suggested that, in any game between the Bayes model and the TFT model, defection appeared more frequently, reflecting the defection-prone tendencies of the two models.

Figure 2: The income ratio of the Bayes model to other models in the games (categories: Random1, Random2, TFT1, TFT2, Pavlov1, Pavlov2, GTFT1, and GTFT2).

Figure 3: The distribution of Bayesian scores (5, 3, 1, and 0) in each game (categories: Random1, Random2, TFT1, TFT2, Pavlov1, Pavlov2, GTFT1, and GTFT2).

Figure 4: The comparisons of overall game incomes (series: Bayes and Rival; categories: Random1, Random2, TFT1, TFT2, Pavlov1, Pavlov2, GTFT1, and GTFT2).

5.2. The Performance of the Bayes Model in a Multiplayer Game. With regard to multiplayer games, this study jointly used four different strategy models to run against the Bayes model, and the overall income from each model was recorded. With the methods described in Section 4, the income accruing to each player was distributed and the overall income was calculated. In this section, the decision method of the TFT, Pavlov, and GTFT models differed slightly from those applied to the two-player game: in the event of the defection of one of the other players in the previous round, the rivals selected defection. In the two-player game, the decision for the current round was made by examining the decisions of players A and B in the previous round.

It can be deduced from Figure 5 that, in the multiplayer game, the Bayes model returned the highest overall income. In addition, the overall game situation implied that the four typical strategy models selected cooperation in the highest proportion of cases. That is to say, in the game with four models, each treated cooperation as its main strategy (Figure 6).

53 Analysis of the Performance of the Bayes Model versusNormal Models The normal model refers to the strategymodels that are possibly encountered in real-life enactmentsof the game simulated here using a natural model The nat-ural model was a model adopting a random strategy (that iswhen faced with identical decisions from the previous roundthe probabilities that the natural models selected cooperationwere different) To verify that the strategy selected by theproposed Bayes model reaped more benefits in games versusthe normal model the game between them was repeated 500times In each game there were 1000 selections It was anattempt tomore comprehensively analyze the advantages anddisadvantages of the Bayes model

Figure 7 shows that the Bayes model performed better than most of the natural models. Overall, the income from the Bayes model reached approximately 3000, and even 4500 in individual extreme cases. The income ratio in Figure 8 was maximized at approximately 14, while most of the income ratios were above one.

In addition, the TFT model also played games against the normal model; the result is shown in Figure 9. The average income of the TFT model was approximately 2500, which was lower than that of the Bayes model while still equivalent to those of its rivals.

5.4. The Performance of the Bayes Model When Run over Fewer Games. Since the Bayes model was a machine learning model, it needed a certain amount of data to guarantee its learning. Therefore, when there were fewer games, whether or not the Bayes model could achieve better game results should be considered.

Figure 5: The overall income of each model (Bayes, TFT, Pavlov, GTFT) after 10000 multiplayer games.

Figure 6: Frequency of occurrence of games in the one-shot game when the number of betrayals is varied (horizontal axis: the number of betrayals, 0 to 4).

In this study, the Bayes model was evaluated over fewer games in competition with the TFT model: each game consisted of 100 rounds, and the game was repeated 100 times. From Figures 10 and 11, we can see that the Bayes model was more advantageous.

5.5. Game Results from a PTFT Model Compared with the Other Models. The game models discussed above merely considered the attitude immediately after the selection of the previous step of both sides, while the PTFT model took account of the attitude during the selection of the previous three steps. Over 10000 runs of the PTFT model against the other models, the incomes of each model are shown in Figure 12, which suggested that the Bayes model was at a disadvantage in this game and gained neither more nor less than the TFT model (both models became trapped in the mutual defection deadlock). Moreover, the Pavlov model returned the lowest individual income, while the GTFT model gained the most.

This revealed one of the disadvantages of the Bayes model: if it fails to comprehensively consider all state characteristics that may appear in the game, it cannot obtain the optimum solution. In the current experiment, since the number of decision state steps of both sides considered by the Bayes model was set to one, the Bayes model was incapable of obtaining the optimum income result in the game against the PTFT model.

6. Conclusions

The authors regarded the prisoners' dilemma as an incomplete information game with unpublicized game strategies. In the research, a machine learning model was constructed to solve problems in incomplete information games. Based on the Bayesian model, we can make the prediction of players' choices to better complete the unknown information. We also suggested the hash table to improve space and time complexity. We built a game system with several types of game strategy for testing. The experimental results show that the proposed Bayes model could obtain more desirable game results than conventional typical strategy models in double- or multiplayer games run 10000 times. The Bayes model even appeared to be slightly better than the TFT model, the acknowledged optimal strategy. In games with more general single-step decision modeling and fewer game runs, the Bayes model also dominated. This result indicated that the naive Bayesian classification algorithm was feasible and effective at establishing the strategy model of an incomplete information game. It provided a novel idea for solving incomplete information game problems.

Figure 7: The income of the Bayes model and normal models in the game (series: Bayes, Normal).

Figure 8: The income ratio of the Bayes model versus the general model.

Figure 9: The income of the TFT model and normal models in the game (series: TFT, Normal).

However, the results obtained by the naive Bayesian classification algorithm showed certain defects: it was unable to obtain the desired solution when the decision ability of a rival was beyond its estimation range. This reduced the applicability of the machine learning algorithm when encountering complex models. This should be the subject of future research.

Figure 10: The incomes of the Bayes and TFT models after 100 games (series: Bayes, TFT).

Figure 11: The average income of the Bayes and TFT models after 100 games.

Figure 12: The incomes of each model (Bayes, TFT, Pavlov, GTFT) after 10000 games (series: Rival, PTFT).

Figure 13: The income of the Bayes model with different depths of considered step in the weight function when competing with a random m-step strategy (horizontal axis: depth of considered step, 1 to 10; series: m = 1 to m = 5).

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This research was supported by the National Natural Science Foundation of China (No. 61100148) and the Project on the Integration of Industry, Education and Research of Guangdong Province (No. 2012B091100489).

References

[1] O. Nomia, Games with Incomplete Information, Universite Paris 1 Pantheon-Sorbonne, Paris, France, 1998.

[2] J. C. Harsanyi, "Games with incomplete information played by 'Bayesian' players, I: the basic model," Management Science, vol. 14, no. 3, pp. 159–182, 1967.

[3] M. Zinkevich, M. Johanson, M. H. Bowling, and C. Piccione, "Regret minimization in games with incomplete information," Advances in Neural Information Processing Systems, vol. 2008, no. 20, pp. 1729–1736, 2008.

[4] E. Alpaydin, Introduction to Machine Learning, The MIT Press, Cambridge, Mass, USA, 2004.

[5] C. M. Bishop, Pattern Recognition and Machine Learning, Information Science and Statistics, Springer, New York, NY, USA, 2006.

[6] X. G. Zhang, "Introduction to statistical learning theory and support vector machines," Acta Automatica Sinica, vol. 26, no. 1, pp. 32–42, 2000.

[7] R. J. Hanson, B. Cheeseman, and P. Stutz, Bayesian Classification Theory, NASA Ames Research Center, Artificial Intelligence Research Branch, 1991.

[8] M. Nowak and K. Sigmund, "A strategy of win-stay, lose-shift that outperforms tit-for-tat in the Prisoner's Dilemma game," Nature, vol. 364, no. 6432, pp. 56–58, 1993.

[9] D. M. Kreps, P. Milgrom, J. Roberts, and R. Wilson, "Rational cooperation in the finitely repeated prisoners' dilemma," Journal of Economic Theory, vol. 27, no. 2, pp. 245–252, 1982.

[10] D. W. K. Yeung, L. A. Petrosyan, and M. C. C. Lee, Dynamic Cooperation: A Paradigm on the Cutting-Edge of Game Theory, China Market Press, 2007.

[11] R. Axelrod, "Effective choice in the prisoner's dilemma," Journal of Conflict Resolution, vol. 24, no. 1, pp. 3–25, 1980.

[12] R. Axelrod and W. D. Hamilton, "The evolution of cooperation," Science, vol. 211, no. 4489, pp. 1390–1396, 1981.

[13] J. H. Miller, "The coevolution of automata in the repeated prisoner's dilemma," Journal of Economic Behavior and Organization, vol. 29, no. 1, pp. 87–112, 1996.

[14] W. H. Press and F. J. Dyson, "Iterated Prisoner's Dilemma contains strategies that dominate any evolutionary opponent," Proceedings of the National Academy of Sciences of the United States of America, vol. 109, no. 26, pp. 10409–10413, 2012.

[15] H. Lin and C.-X. Wu, "Evolution of strategies based on genetic algorithm in the iterated prisoner's dilemma on complex networks," Acta Physica Sinica, vol. 56, no. 8, pp. 4313–4318, 2007.

[16] M. A. Nowak and K. Sigmund, "Evolutionary dynamics of biological games," Science, vol. 303, no. 5659, pp. 793–799, 2004.

[17] J. W. Weibull, Evolutionary Game Theory, MIT Press, Cambridge, Mass, USA, 1997.

[18] M. A. Nowak and R. M. May, "Evolutionary games and spatial chaos," Nature, vol. 359, no. 6398, pp. 826–829, 1992.

[19] A. Cardillo, J. Gomez-Gardenes, D. Vilone, and A. Sanchez, "Co-evolution of strategies and update rules in the prisoner's dilemma game on complex networks," New Journal of Physics, vol. 12, no. 10, Article ID 103034, 2010.

[20] W.-B. Du, H.-R. Zheng, and M.-B. Hu, "Evolutionary prisoner's dilemma game on weighted scale-free networks," Physica A: Statistical Mechanics and Its Applications, vol. 387, no. 14, pp. 3796–3800, 2008.

[21] H. Ohtsuki, C. Hauert, E. Lieberman, and M. A. Nowak, "A simple rule for the evolution of cooperation on graphs and social networks," Nature, vol. 441, no. 7092, pp. 502–505, 2006.

[22] Y. Wang and C. Dang, "An evolutionary algorithm for global optimization based on level-set evolution and latin squares," IEEE Transactions on Evolutionary Computation, vol. 11, no. 5, pp. 579–595, 2007.

[23] Y. Wang, Y.-C. Jiao, and H. Li, "An evolutionary algorithm for solving nonlinear bilevel programming based on a new constraint-handling scheme," IEEE Transactions on Systems, Man and Cybernetics C: Applications and Reviews, vol. 35, no. 2, pp. 221–232, 2005.

[24] R. Selten and R. Stoecker, "End behavior in sequences of finite Prisoner's Dilemma supergames: a learning theory approach," Journal of Economic Behavior and Organization, vol. 7, no. 1, pp. 47–70, 1986.

[25] D. Y. Jiang, Situation Analysis of Double Action Games with Entropy, Science Press, New York, NY, USA, 2010.

[26] P. A. Flach and N. Lachiche, "Naive Bayesian classification of structured data," Machine Learning, vol. 57, no. 3, pp. 233–269, 2004.


and $h_t$ is an $m$-dimension vector meaning the temporary record of the players' decisions:

$$
\begin{aligned}
h_j &= \{R_{j-1}, R_{j-2}, \ldots, R_{j-m}\} \\
    &= \{r_{1,j-1}, r_{2,j-1}, \ldots, r_{N,j-1}, \ldots, r_{1,j-m}, r_{2,j-m}, \ldots, r_{N,j-m}\} \quad (r_{i,j} \in \{c_1, c_2, \ldots, c_K\}), \\
h_{i,l} &= \{r_{i,l-1}, r_{i,l-2}, \ldots, r_{i,l-m}\},
\end{aligned}
\tag{6}
$$

where $R_j$ is the set of choices of each player in the $j$th turn. The record set $h_j$ dates back $m$ turns and includes $m$ sets of records, and $r_{i,j}$ is player $i$'s choice in the $j$th turn, which is one of the elements in the set of choices. $X_{j,m}$ is the $m$-step choice history of player $j$. Here we consider that all the strategies base their decisions on the last few steps, and therefore older historical choices are immaterial to the current judgment. If $m$ is very large, or in some extreme cases, there may not be enough records for building $h_j$, and the denominator of the formula may become 0. To avoid such a situation, we can correct the original formula by simultaneously increasing the numerator and the denominator:

$$
P(X_i = c_k \mid H = h_t) = \frac{P(X_i = c_k) \prod_j P(X_{j,m} = h_{j,t} \mid X_i = c_k) + 1}{\sum_k \left( P(X_i = c_k) \prod_j P(X_{j,m} = h_{j,t} \mid X_i = c_k) \right) + |c_k|}.
\tag{7}
$$
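The effect of the correction in formula (7) can be checked numerically. The sketch below is only an illustration: the function name `smoothed_posterior` is hypothetical, and $|c_k|$ is read as $K$, the number of candidate choices, which is an assumption about the notation.

```python
def smoothed_posterior(priors, likelihoods):
    """Apply the correction of formula (7) to K candidate choices.

    priors[k]      stands for P(X_i = c_k)
    likelihoods[k] stands for prod_j P(X_jm = h_jt | X_i = c_k)
    Assumption: |c_k| is taken to be K, the number of candidate choices.
    """
    joint = [p * q for p, q in zip(priors, likelihoods)]
    denominator = sum(joint) + len(joint)  # cannot be zero when K >= 1
    return [(value + 1) / denominator for value in joint]


# With no usable history every likelihood is 0, yet the result is still defined.
uniform = smoothed_posterior([0.5, 0.5], [0.0, 0.0])
```

Because 1 is added to each of the $K$ numerators and $K$ to the denominator, the corrected values still sum to one, and an empty history yields a uniform prediction instead of a division by zero.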

The next step is to obtain the probability of each situation from the historical record. From the previous analysis, we know that we should extract the values of $P(X_i = c_k)$ and $P(H = h_j \mid X_i = c_k)$ from $R_j$. Basically, we can easily get the probabilities from the formulas

$$
\begin{aligned}
P(X_i = c_k) &= \frac{\sum_{l=1}^{L} \tau_k(r_{i,l})}{\sum_{l=1}^{L} \sigma_i(r_{i,l})}, \\
\prod_j P(X_{j,m} = h_{j,t} \mid X_i = c_k) &= P(X_{1,m} = h_{1,t}, X_{2,m} = h_{2,t}, \ldots, X_{N,m} = h_{N,t} \mid X_i = c_k) \\
&= \frac{\sum_{l=m}^{L} \prod_{j=1}^{N} \tau^{*}_{i,k,t}(h_{j,l})}{\sum_{l=m}^{L} \prod_{j=1}^{N} \sigma^{*}_{t}(h_{j,l})}, \\
\tau_k(r_{i,l}) &= \begin{cases} 1 & (r_{i,l} = c_k,\ r_{i,l} \in R_l) \\ 0 & (\text{else}), \end{cases} \\
\sigma_i(r_{i,l}) &= \begin{cases} 1 & (r_{i,l} \in R_l) \\ 0 & (\text{else}), \end{cases} \\
\tau^{*}_{i,k,t}(h_{j,l}) &= \begin{cases} 1 & (r_{i,l} = c_k,\ h_{j,l} = h_{j,t}) \\ 0 & (\text{else}), \end{cases} \\
\sigma^{*}_{t}(h_{j,l}) &= \begin{cases} 1 & (h_{j,l} = h_{j,t}) \\ 0 & (\text{else}). \end{cases}
\end{aligned}
\tag{8}
$$

Here $h_{i,l}$ is an ordered and comparable sequence of player $i$'s $m$-step records, and $t$ is the current turn number. We can get the result of formula (7) from the data we record.
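The counting ratio in formula (8) can be sketched as a scan over the recorded list. The function name `match_ratio` and the representation of the record as a list of per-turn tuples are assumptions for this example; returning `None` when no past turn matches mirrors the zero-denominator case that formula (7) corrects.

```python
def match_ratio(records, i, choice, m=1):
    """Counting estimate in the spirit of formula (8).

    records: list of tuples, records[l][j] is player j's choice in turn l.
    Among the past turns whose preceding m rounds equal the latest m rounds,
    return the fraction in which player i chose `choice`.
    Returns None when no past turn matches (the denominator would be 0).
    """
    if len(records) < m:
        return None
    current = records[-m:]            # h_t: the latest m rounds
    hits = matches = 0
    for l in range(m, len(records)):
        if records[l - m:l] == current:   # sigma*: history at turn l equals h_t
            matches += 1
            if records[l][i] == choice:   # tau*: and player i chose c_k
                hits += 1
    return hits / matches if matches else None


history = [("C", "C"), ("C", "D"), ("C", "C"), ("C", "D"), ("C", "C")]
estimate = match_ratio(history, i=1, choice="D")
```

Each call rescans the whole list, which is why prediction dominates the running time when the record is kept as a plain list.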

Moreover, as the strategy varies from player to player, we should not expect all players to use similar strategies in the game. For example, some players may ignore the choices made by themselves and focus on others' choices. In that case, the records from the player himself should be discarded when making the prediction. So we have to add a weight for each history record. The weight function is relevant to the player and the choice,

$$
\omega_i = w_i(X_j, c_k),
\tag{9}
$$

and the probability formula becomes

$$
\prod_j P(X_{j,m} = h_{j,t} \mid X_i = c_k) = \frac{\sum_{l=m}^{L} \prod_{j=1}^{N} w_i(X_j, c_k)\, \tau^{*}_{i,k,t}(h_{j,l})}{\sum_{l=m}^{L} \prod_{j=1}^{N} w_i(X_j, c_k)\, \sigma^{*}_{t}(h_{j,l})}.
\tag{10}
$$

The weight function, which distributes the weight to each history record, varies between strategies. In the experiment on the depth of considered step of weight functions, we found that depths within 5 performed similarly in the game (see Figure 13). So in our study we use a one-turn weight function for prediction. In addition, most of the classical strategies ignore the self-made choices and regard all the choices as the same. Therefore, we can obtain one of the weight functions by concluding these attributes from classical strategies:

$$
w_i(X_j, c_k) = \begin{cases} 1 & (i = j) \\ 0 & (\text{else}). \end{cases}
\tag{11}
$$

3.2. Income Evaluation. In this step, we must evaluate each choice we can return and choose one of them as the final decision. We select the choice with the highest score after the evaluation:

$$
C_t = \arg\max_{k} \operatorname{val}(c_k, h_t).
\tag{12}
$$

We can base the evaluation on the probability of each player's choices and the profit function. Since the profit function is given, we can easily predict the value that our player would obtain from the choice. In one-step prediction, the value equals the expectation of the income. Consider

$$
\operatorname{val}(c_k, h_t) = \sum \operatorname{profit}(c_k, c_{x_1}, \ldots, c_{x_N}) \cdot P(X_1 = c_{x_1}, \ldots, X_N = c_{x_N} \mid H = h_t).
\tag{13}
$$
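For the two-player case, formulas (12) and (13) reduce to a short expected-income calculation. The sketch below assumes the standard payoffs 5, 3, 1, and 0 (only the score of 1 for mutual defection is stated explicitly in the experiments); the names `PAYOFF`, `val`, and `best_choice` are illustrative.

```python
# Assumed payoffs for the deciding player: (own choice, rival's choice) -> income.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def val(choice, rival_probs):
    """Expected one-step income of `choice`, as in formula (13).

    rival_probs maps each rival choice to its predicted probability.
    """
    return sum(PAYOFF[(choice, r)] * p for r, p in rival_probs.items())


def best_choice(rival_probs):
    """Pick the choice with the highest expected income, as in formula (12)."""
    return max(("C", "D"), key=lambda c: val(c, rival_probs))


prediction = {"C": 0.5, "D": 0.5}
decision = best_choice(prediction)
```

With these payoffs, defection has the higher one-step expectation for any fixed prediction of the rival, so the history-dependent prediction and the multi-step lookahead are what can make cooperation worthwhile.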

The function "profit" is a given profit function with $N$ parameters, where $N$ is equal to the number of players; it returns a vector of the incomes of the players who make the same choice as the first parameter. We can make our value function more far-sighted if both $N$ and $K$ are small (like the classical prisoner's dilemma, where $N = 2$, $K = 2$). A $p$-step prediction of the income will be a more efficient method. We can get the best choice by the recursive program (see Algorithm 1).

    Function p_step(c_k, p)
        if p < 0 then
            return 0
        v ← 0
        for each choice c_k do
            v ← v + val(c_k, h_t)
            Append(h_t, c_k, c_{x_2}, ..., c_{x_N})
            for each choice c do
                v ← v + p_step(c, p − 1)
            end
            Remove(h_t, c_k, c_{x_2}, ..., c_{x_N})
        end
        return v

Algorithm 1

3.3. Feedback Record. At the end of each turn of the game, we can get the feedback from the system. The feedback includes each player's choice and the income they get. In our model the profit function is given, so we can determine the income of all players from the historical records of all players.

The content of feedback can be represented as follows:

$$
F_l = \{c_{x_1}, c_{x_2}, \ldots, c_{x_N}\} \quad (c_{x_i} \in \{c_1, c_2, \ldots, c_K\}).
\tag{14}
$$

We can record the feedbacks as a list. As a list, the space complexity of the record is $O(LN)$. In the step of prediction, the time complexity is $O(KN^2L^2)$. In the step of income evaluation, the time complexity is $O(NK^N)$. In the step of record, the time complexity is $O(N)$. In our discussion, the number of players and the number of choices are relatively small and the number of turns is large, such that $NK \ll L$. We can therefore see that the bottleneck of the problem is the time complexity of prediction.

To improve on this in the case where all the records have the same weight and one historical step is considered, we can use an $N$-dimension array for storage. We can build an $N$-dimension array $A$ where each dimension's length is $K$. The entries of $A$ are counters of all specific situations, which represent the combinations of the choice space. Then $A[c_{x_1}][c_{x_2}] \cdots [c_{x_N}]$ means the total number of turns in which players $1, 2, \ldots, N$ chose $c_{x_1}, c_{x_2}, \ldots, c_{x_N}$. In such a situation, formula (10) is simplified into

$$
\prod_j P(X_{j,m} = h_{j,t} \mid X_i = c_k) = \frac{A[r_{1,t-1}][r_{2,t-1}] \cdots [r_{i-1,t-1}][c_k][r_{i+1,t-1}] \cdots [r_{N,t-1}]}{\sum_{k=1}^{K} A[r_{1,t-1}][r_{2,t-1}] \cdots [r_{i-1,t-1}][c_k][r_{i+1,t-1}] \cdots [r_{N,t-1}]}.
\tag{15}
$$

The time complexity of the prediction is reduced to $O(NK)$. The time complexity of the recording step is $O(1)$ (in an actual data structure it is $O(N)$), while the space complexity rises to $O(K^N)$. The space complexity is acceptable when $K$ and $N$ are relatively small.

The $N$-dimension array is only suitable for 1-step date-back, and its data structure becomes complex when $N$ becomes large. Here we can use a hash table to make a more efficient record. A list of $m$-step data records is as follows:

$$
\{F_l, F_{l-1}, \ldots, F_{l-m+1}\} = \{\{c_{x_1}, c_{x_2}, \ldots, c_{x_N}\}_{l-1}, \ldots, \{c_{x_1}, c_{x_2}, \ldots, c_{x_N}\}_{l-m}\}.
\tag{16}
$$

We can get its hash code through a hash function:

$$
\mathrm{hash}_l = Ha(F_l, F_{l-1}, \ldots, F_{l-m+1}).
\tag{17}
$$

The hash code provides an index into the record array, which records the number of times that a specific situation happens. With the use of the hash table, the complexity of prediction is still $O(NK)$. The time complexity of recording is $O(1)$. The space complexity becomes $O(1)$, which depends on the device and is not relevant to $N$ or $K$. Table 3 shows the space and time complexity of the different strategies.
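A hash-table record of $m$-step situations might look like the sketch below. It is not the authors' implementation: the class name `HistoryTable` is hypothetical, Python's built-in `dict` hashing stands in for the hash function $Ha$, and keying each counter on the pair (last $m$ feedbacks, next feedback) is an assumption chosen so that the counts can be used directly for prediction.

```python
from collections import defaultdict


class HistoryTable:
    """Counts how often each joint choice followed each m-step history."""

    def __init__(self, m):
        self.m = m
        self.window = []                 # the last m feedback tuples
        self.counts = defaultdict(int)   # (history, next feedback) -> count

    def update(self, feedback):
        """Record one turn; feedback is a tuple with every player's choice."""
        if len(self.window) == self.m:
            self.counts[(tuple(self.window), feedback)] += 1
        self.window = (self.window + [feedback])[-self.m:]

    def count(self, feedback):
        """Number of times `feedback` followed the current m-step history."""
        if len(self.window) < self.m:
            return 0
        return self.counts.get((tuple(self.window), feedback), 0)


table = HistoryTable(m=1)
for fb in [("C", "C"), ("C", "D"), ("C", "C"), ("C", "D"), ("C", "C")]:
    table.update(fb)
seen = table.count(("C", "D"))
```

Unlike a fixed-size record array, a Python `dict` grows with the number of distinct situations actually observed, so the memory use of this sketch depends on the games played rather than on the device.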

4. Brief Step of the Algorithm

Firstly, we built an environment for the prisoner's dilemma game. Each player is asked to provide a strategy and an update function. The program of the environment is shown in Algorithm 2.

The strategy and update functions we provide are shown in Algorithm 3.
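The environment loop and the strategy/update interface can be made concrete with a small runnable sketch. The payoff values, the class names `TitForTat` and `AlwaysDefect`, and the choice to pass the joint decision as feedback are assumptions for illustration; the Bayes player itself is not reproduced here.

```python
# Assumed two-player payoffs: joint decision -> (income of player 0, income of player 1).
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}


class TitForTat:
    def __init__(self, me):
        self.me, self.last = me, None

    def strategy(self):
        # Cooperate first, then repeat the rival's previous choice.
        return "C" if self.last is None else self.last

    def update(self, feedback):
        self.last = feedback[1 - self.me]   # remember the rival's choice


class AlwaysDefect:
    def strategy(self):
        return "D"

    def update(self, feedback):
        pass                                # keeps no state


def play(players, max_turn):
    """Environment loop: collect decisions, pay out, send feedback to everyone."""
    income = [0, 0]
    for _ in range(max_turn):
        decision = tuple(p.strategy() for p in players)
        for i, gain in enumerate(PAYOFF[decision]):
            income[i] += gain
        for p in players:
            p.update(decision)
    return income


result = play([TitForTat(0), AlwaysDefect()], 10)
```

Any player object exposing `strategy()` and `update(feedback)` can be plugged into `play`, which is the property the environment relies on.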

5. Experimental Results and Analysis

5.1. The Performance of Bayes Model in the Double-Player Game. Four typical models were run against the Bayes model 10000 times each, and the total incomes of both players in each game were recorded. Figure 1 shows the overall incomes of both players recorded over 10000 games, comparing the proposed Bayes model with the other four typical models. Overall, the Bayes model was more advantageous and achieved a higher score (overall income) than the other four. Of these four typical strategy models, TFT performed best: it showed an overall income equivalent to that of the Bayes model and a higher income than all the others. For each game pair, this research presented two test results, each corresponding to one of the two stable score results from the selected game pair. Figure 2 reveals that the Bayes model earned a higher income than the random, Pavlov, and GTFT models. The income ratio to the GTFT model reached 6.6, while that with the TFT model also exceeded one. By examining the final income from repeated games, it was found that the Bayes model was more advantageous than the other four typical strategy models tested here.

Table 3: Time and space complexity of the Bayes method with different data structures and some typical models.

    Method       Space complexity   Time complexity of prediction   Time complexity of record
    List         $O(LN)$            $O(KN^2L^2)$                    $O(1)$
    N-d array    $O(K^N)$           $O(NK)$                         $O(N)$
    Hash table   $O(1)$             $O(NK)$                         $O(1)$
    TFT          $O(N)$             $O(N)$                          $O(N)$
    GTFT         $O(N)$             $O(N)$                          $O(N)$
    Pavlov       $O(N)$             $O(N)$                          $O(N)$
    Random       $O(1)$             $O(1)$                          $O(1)$

    for t from 0 to MAX_TURN do
        for each player i do
            decision[i] ← player[i].strategy
        end
        feedback ← profit(decision)
        for each player i do
            player[i].update(feedback)
        end
    end

Algorithm 2

    procedure strategy()
        if tem_turn < TURN_THRESHOLD then
            decision ← random(choice_space)
        else
            prediction_list ← predict(restore)
            for each choice c do
                profit[c] ← profit_expect(c, prediction_list)
            end
            decision ← arg(max(profit))
        end
        return decision
    end
    procedure update(feedback)
        storage.add(feedback)

Algorithm 3

In games repeated 10000 times, the cases when the Bayes model scored 5, 3, 1, and 0 were statistically analyzed. As shown in Figure 3, the scores of 5 and 1 represented a relatively large proportion. That is to say, the Bayes model was prone to defection. Analysis of Figures 1 and 3 implied that high scores mostly corresponded to cases scoring 5. Moreover, the results of each game competition showed that the incomes achieved by the Bayes model were higher when manifesting its tendency to defection.

Analysis of the overall income of both players in each game (Figure 4) showed that the overall income in the game with a TFT model was lower. According to the performance of the rivals in each game (Figure 3) and comparison with test result 2 (with names ending in "2"), it was noted that both strategy models in test result 1 (with names ending in "1") were less inclined to defection. Therefore, their overall income was higher. Since TFT is considered to be the model that can achieve more desirable results amongst the four typical strategy models, games setting the TFT strategy model and the Bayes model in opposition were mainly investigated. In this game, the overall incomes of both sides were lower than those in other game competitions. This indicated that, in the game between the TFT strategy model and the Bayes model, both suffered losses. By studying the single-game results from the Bayes model and the TFT model, it was found that the scores from the Bayes model were 1 (both sides simultaneously selected defection). This result suggested that, in any game between the Bayes model and the TFT model, defection appeared more frequently, which is a mark of the defection-prone tendencies of the two models.

Figure 2: The income ratio of the Bayes model to other models in the games (Random1, Random2, TFT1, TFT2, Pavlov1, Pavlov2, GTFT1, GTFT2).

Figure 3: The distribution of Bayesian scores (5, 3, 1, 0) in each game.

Figure 4: The comparisons of overall game incomes (series: Bayes, Rival).

52 The Performance of the Bayes Model in a MultiplayerGame With regard to multiplayer games this study jointlyused four different strategy models to run against the Bayesmodel and the overall income from eachmodel was recordedWith the methods described in Section 4 the income accru-ing to each player was distributed and the overall incomewas calculated In this section the decision method of TFTPavlov andGTFTmodels differed slightly from those appliedto the two-player game in the event of the defection of oneof the other players in the previous round the rivals selecteddefection In the following two-player game the decision forthe current round was made with the investigation of thedecisions of players A and B in the previous round

It can be deduced from Figure 5 that in the multiplayergame the Bayes model returned the highest overall incomeIn addition the overall game situation implied that theproportion that the four typical strategy models selectedcooperation was the highest That is to say in the game withfour models each treated cooperation as its main strategy(Figure 6)

53 Analysis of the Performance of the Bayes Model versusNormal Models The normal model refers to the strategymodels that are possibly encountered in real-life enactmentsof the game simulated here using a natural model The nat-ural model was a model adopting a random strategy (that iswhen faced with identical decisions from the previous roundthe probabilities that the natural models selected cooperationwere different) To verify that the strategy selected by theproposed Bayes model reaped more benefits in games versusthe normal model the game between them was repeated 500times In each game there were 1000 selections It was anattempt tomore comprehensively analyze the advantages anddisadvantages of the Bayes model

Figure 7 shows that the Bayes model performed betterthan most of natural models Overall the income from theBayes model can reach approximately 3000 and even 4500in individual extreme casesThe income ratio in Figure 8 wasmaximized at approximately 14 while most of the incomeratios were above one

In addition the TFT model also conducted the gameswith the normal model the result is shown in Figure 9 Itwas shown that the average income of the TFT model wasapproximately 2500 which was lower than that of the Bayesmodel while still equivalent to those of its other rivals

54 The Performance of the Bayes Model When Run overFewer Games Since the Bayes model was amachine learningmodel it needed a certain amount of data to guarantee itslearningTherefore when therewere fewer games whether or

0

20000

40000

60000

BayesTFT

PavlovGTFT

Figure 5 The overall income of each model after 10000 times ofmultiplayer game

02000400060008000

0 1 2 3 4The number of betrayals

Figure 6 Frequency ocurrence of games in the one-short gamewhen the number of betrayals is varied

not the Bayes model could achieve better game results shouldbe considered

In this study the Bayes model was evaluated using fewergame times in competition with the TFT model 100 timeswith the game repeated 100 times From Figures 10 and 11 wecan see that the Bayes model was more advantageous

55 Game Results from a PTFT Model Compared with theOtherModels Thegamemodels discussed abovemerely con-sidered the attitude immediately after the selection of the pre-vious step of both sides while the PTFT model took accountof the attitude during the selection of the previous threesteps Over 10000 runs of the PTFT model against the othermodels the incomes of each model are shown in Figure 12which suggested that the Bayes model was disadvantageousover the game and gained neither more nor less than the TFTmodel (both models became trapped in the mutual defectiondeadlock) Moreover the Pavlov model returned the lowestindividual income while the GTFT model gained the most

This revealed one of the disadvantages of the Bayes model: if it fails to comprehensively consider all state characteristics that may appear in the game, it cannot obtain the optimum solution. In the current experiment, since the number of decision-state steps of both sides considered by the Bayes model was set to one, the Bayes model was incapable of obtaining the optimum income in the game against the PTFT model.

Figure 7: The income of the Bayes model and normal models in the game (legend: Bayes, Normal).

Figure 8: The income ratio of the Bayes model versus the general model.

Figure 9: The income of the TFT model and normal models in the game (legend: TFT, Normal).

Figure 10: The incomes of the Bayes and TFT models after 100 games (legend: Bayes, TFT).

Figure 11: The average income of the Bayes and TFT models after 100 games.

Figure 12: The incomes of each model after 10000 games (models: Bayes, TFT, Pavlov, GTFT; legend: Rival, PTFT).

Figure 13: The income of the Bayes model with different depths of considered step in the weight function when competing with a random m-step strategy (x-axis: depth of considered step; legend: m = 1, m = 2, m = 3, m = 4, m = 5).

6. Conclusions

The authors regarded the prisoners’ dilemma as an incomplete information game with unpublicized game strategies. In the research, a machine learning model was constructed to solve problems in incomplete information games. Based on the Bayesian model, we can predict players’ choices to better complete the unknown information. We also suggested a hash table to improve the space and time complexity. We built a game system with several types of game strategy for testing. The experimental results show that the proposed Bayes model obtained more desirable game results than conventional typical strategy models in two-player or multiplayer games repeated 10000 times. The Bayes model even appeared slightly better than the TFT model, which is acknowledged as the optimal strategy. In a game with more general single-step decision modeling and with fewer game runs, the Bayes model also dominated. This result indicated that the naive Bayesian classification algorithm was feasible and effective for establishing the strategy model of an incomplete information game. It provides a novel idea for solving incomplete information game problems.

However, the results obtained by the naive Bayesian classification algorithm showed certain defects: it was unable to obtain the desired solution when the decision ability of a rival was beyond its estimation range. This reduces the applicability of the machine learning algorithm when encountering complex models and should be the subject of future research.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This research was supported by the National Natural Science Foundation of China (No. 61100148) and the Project on the Integration of Industry, Education and Research of Guangdong Province (No. 2012B091100489).

References

[1] O. Nomia, Games with Incomplete Information, Université Paris 1 Panthéon-Sorbonne, Paris, France, 1998.

[2] J. C. Harsanyi, “Games with incomplete information played by “Bayesian” players, I: the basic model,” Management Science, vol. 14, no. 3, pp. 159–182, 1967.

[3] M. Zinkevich, M. Johanson, M. H. Bowling, and C. Piccione, “Regret minimization in games with incomplete information,” Advances in Neural Information Processing Systems, vol. 2008, no. 20, pp. 1729–1736, 2008.

[4] E. Alpaydin, Introduction to Machine Learning, The MIT Press, Cambridge, Mass, USA, 2004.

[5] C. M. Bishop, Pattern Recognition and Machine Learning, Information Science and Statistics, Springer, New York, NY, USA, 2006.

[6] X. G. Zhang, “Introduction to statistical learning theory and support vector machines,” Acta Automatica Sinica, vol. 26, no. 1, pp. 32–42, 2000.

[7] R. J. Hanson, B. Cheeseman, and P. Stutz, Bayesian Classification Theory, NASA Ames Research Center, Artificial Intelligence Research Branch, 1991.

[8] M. Nowak and K. Sigmund, “A strategy of win-stay, lose-shift that outperforms tit-for-tat in the Prisoner’s Dilemma game,” Nature, vol. 364, no. 6432, pp. 56–58, 1993.

[9] D. M. Kreps, P. Milgrom, J. Roberts, and R. Wilson, “Rational cooperation in the finitely repeated prisoners’ dilemma,” Journal of Economic Theory, vol. 27, no. 2, pp. 245–252, 1982.

[10] D. W. K. Yeung, L. A. Petrosyan, and M. C. C. Lee, Dynamic Cooperation: A Paradigm on the Cutting-Edge of Game Theory, China Market Press, 2007.

[11] R. Axelrod, “Effective choice in the prisoner’s dilemma,” Journal of Conflict Resolution, vol. 24, no. 1, pp. 3–25, 1980.

[12] R. Axelrod and W. D. Hamilton, “The evolution of cooperation,” Science, vol. 211, no. 4489, pp. 1390–1396, 1981.

[13] J. H. Miller, “The coevolution of automata in the repeated prisoner’s dilemma,” Journal of Economic Behavior and Organization, vol. 29, no. 1, pp. 87–112, 1996.

[14] W. H. Press and F. J. Dyson, “Iterated Prisoner’s Dilemma contains strategies that dominate any evolutionary opponent,” Proceedings of the National Academy of Sciences of the United States of America, vol. 109, no. 26, pp. 10409–10413, 2012.

[15] H. Lin and C.-X. Wu, “Evolution of strategies based on genetic algorithm in the iterated prisoner’s dilemma on complex networks,” Acta Physica Sinica, vol. 56, no. 8, pp. 4313–4318, 2007.

[16] M. A. Nowak and K. Sigmund, “Evolutionary dynamics of biological games,” Science, vol. 303, no. 5659, pp. 793–799, 2004.

[17] J. W. Weibull, Evolutionary Game Theory, MIT Press, Cambridge, Mass, USA, 1997.

[18] M. A. Nowak and R. M. May, “Evolutionary games and spatial chaos,” Nature, vol. 359, no. 6398, pp. 826–829, 1992.

[19] A. Cardillo, J. Gomez-Gardenes, D. Vilone, and A. Sanchez, “Co-evolution of strategies and update rules in the prisoner’s dilemma game on complex networks,” New Journal of Physics, vol. 12, no. 10, Article ID 103034, 2010.

[20] W.-B. Du, H.-R. Zheng, and M.-B. Hu, “Evolutionary prisoner’s dilemma game on weighted scale-free networks,” Physica A: Statistical Mechanics and Its Applications, vol. 387, no. 14, pp. 3796–3800, 2008.

[21] H. Ohtsuki, C. Hauert, E. Lieberman, and M. A. Nowak, “A simple rule for the evolution of cooperation on graphs and social networks,” Nature, vol. 441, no. 7092, pp. 502–505, 2006.

[22] Y. Wang and C. Dang, “An evolutionary algorithm for global optimization based on level-set evolution and latin squares,” IEEE Transactions on Evolutionary Computation, vol. 11, no. 5, pp. 579–595, 2007.

[23] Y. Wang, Y.-C. Jiao, and H. Li, “An evolutionary algorithm for solving nonlinear bilevel programming based on a new constraint-handling scheme,” IEEE Transactions on Systems, Man and Cybernetics C: Applications and Reviews, vol. 35, no. 2, pp. 221–232, 2005.

[24] R. Selten and R. Stoecker, “End behavior in sequences of finite Prisoner’s Dilemma supergames: a learning theory approach,” Journal of Economic Behavior and Organization, vol. 7, no. 1, pp. 47–70, 1986.

[25] D. Y. Jiang, Situation Analysis of Double Action Games with Entropy, Science Press, New York, NY, USA, 2010.

[26] P. A. Flach and N. Lachiche, “Naive Bayesian classification of structured data,” Machine Learning, vol. 57, no. 3, pp. 233–269, 2004.


Function p_step(c_k, p)
    if p < 0 then
        return 0
    V ← 0
    for each choice c_k do
        V ← V + val(c_k, ht)
        Append(ht, c_k, c_x2, ..., c_xN)
        for each choice c do
            V ← V + p_step(c, p − 1)
        end
        Remove(ht, c_k, c_x2, ..., c_xN)
    end
    return V

Algorithm 1

function more visionary if both $N$ and $K$ are small (as in the classical prisoner’s dilemma, where $N = 2$ and $K = 2$). A $p$-step prediction of the income will be a more efficient method. We can get the best choice by the recursive program (see Algorithm 1).

3.3. Feedback Record. At the end of each turn of the game, we can get the feedback from the system. The feedback includes each player’s choice and the income they get. In our model, the profit function is given, so we can determine the income of all players from the historical records of all players.

The content of feedback can be represented as follows:

$$F_l = \{c_{x_1}, c_{x_2}, \ldots, c_{x_N}\}, \quad c_{x_i} \in \{c_1, c_2, \ldots, c_K\}. \tag{14}$$

We can record the feedbacks as a list. As a list, the space complexity of the record is $O(LN)$. In the prediction step, the time complexity is $O(KN^2L^2)$. In the income evaluation step, the time complexity is $O(NK^N)$. In the recording step, the time complexity is $O(N)$. In our discussion, the number of players and the number of choices are relatively small and the number of turns is large, such that $NK \ll L$. We can therefore see that the bottleneck of the problem will be the time complexity of prediction.

For the case in which all the records have the same weight and only one historical step is considered, we can improve on this by using an $N$-dimensional array for storage. We build an $N$-dimensional array $A$ in which the length of each dimension is $K$. The entries of $A$ are counters of all specific situations, which represent the combinations of the choice space. Then $A[c_{x_1}][c_{x_2}] \cdots [c_{x_N}]$ is the total number of turns in which players $1, 2, \ldots, N$ chose $c_{x_1}, c_{x_2}, \ldots, c_{x_N}$. In this situation, formula (10) simplifies to

$$\prod_{j} P\left(X_{j,m} = h_{j,t} \mid X_i = c_k\right) = \frac{A[r_{1,t-1}][r_{2,t-1}] \cdots [r_{i-1,t-1}][c_k][r_{i+1,t-1}] \cdots [r_{N,t-1}]}{\sum_{k'=1}^{K} A[r_{1,t-1}][r_{2,t-1}] \cdots [r_{i-1,t-1}][c_{k'}][r_{i+1,t-1}] \cdots [r_{N,t-1}]}. \tag{15}$$

The time complexity of the prediction will reduce to $O(NK)$. The time complexity of the recording step will be $O(1)$ (in an actual data structure it is $O(N)$), while the space complexity will rise to $O(K^N)$. The space complexity is acceptable when $K$ and $N$ are relatively small.
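The counter array and the ratio on the right-hand side of (15) can be illustrated with a minimal sketch. It assumes NumPy, choices encoded as integers 0 to K-1, and a uniform fallback when no matching record exists yet; the function names (`make_counter`, `record`, `ratio_15`) are hypothetical and not taken from the paper.

```python
import numpy as np

def make_counter(n_players, n_choices):
    # N-dimensional array A: one axis of length K per player (K**N counters).
    return np.zeros((n_choices,) * n_players, dtype=np.int64)

def record(counter, choices):
    # Count one more turn with this joint combination of choices.
    counter[tuple(choices)] += 1

def ratio_15(counter, others, i):
    # Fix every player's index except player i (the entry others[i] is ignored),
    # then normalise the K counters along axis i, as in the fraction in (15).
    index = list(others)
    index[i] = slice(None)
    counts = counter[tuple(index)].astype(float)
    total = counts.sum()
    if total == 0:
        # Assumption: fall back to a uniform estimate when nothing was recorded.
        return np.full(counts.shape, 1.0 / counts.size)
    return counts / total

# Two players, two choices: record four turns, then condition on player 1 choosing 1.
A = make_counter(2, 2)
for turn in [(0, 1), (1, 1), (1, 1), (0, 0)]:
    record(A, turn)
estimate = ratio_15(A, [0, 1], 0)
```

Each lookup touches K counters, while the array itself holds K**N entries, which matches the trade-off described above.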

The $N$-dimensional array only supports a 1-step look-back, and its data structure becomes complex when $N$ becomes large. Here we can use a hash table to make a more efficient record. A list of $m$-step data records is shown as follows:

$$F_l, F_{l-1}, \ldots, F_{l-m+1} = \{c_{x_1}, c_{x_2}, \ldots, c_{x_N}\}_{l-1}, \ldots, \{c_{x_1}, c_{x_2}, \ldots, c_{x_N}\}_{l-m}. \tag{16}$$

We can get its hash code through a hash function:

$$\mathrm{hash}_l = \mathit{Ha}(F_l, F_{l-1}, \ldots, F_{l-m+1}). \tag{17}$$

The hash code provides an index into a record array, which records the number of times that a specific situation has happened. With the use of the hash table, the time complexity of prediction will still be $O(NK)$. The time complexity of recording will be $O(1)$, and the space complexity will become $O(1)$, in the sense that the table size is fixed by the device and does not depend on $N$ or $K$. Table 3 shows the space and time complexity of the different strategies.
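A minimal sketch of the m-step record keyed by a hash follows. It uses Python's built-in hashing through `collections.Counter` rather than the fixed-size record array described above, so its memory grows with the number of distinct situations seen; the class name `HistoryCounter` and its methods are hypothetical.

```python
from collections import Counter, deque

class HistoryCounter:
    """Counts how often each window of the last m feedbacks has occurred."""

    def __init__(self, m):
        self.m = m
        self.window = deque(maxlen=m)  # the most recent m feedbacks
        self.counts = Counter()        # hash table: m-step situation -> occurrences

    def record(self, feedback):
        # Append the newest feedback; once m feedbacks are available,
        # hash the whole window and increment its counter.
        self.window.append(tuple(feedback))
        if len(self.window) == self.m:
            self.counts[tuple(self.window)] += 1

    def seen(self, situation):
        # Number of times this exact m-step situation has been recorded.
        return self.counts[tuple(tuple(f) for f in situation)]

# Two players alternating between mutual cooperation (0, 0) and mutual defection (1, 1).
history = HistoryCounter(m=2)
for feedback in [(0, 0), (1, 1), (0, 0), (1, 1)]:
    history.record(feedback)
```

Recording and lookup each hash one window of m feedbacks, independently of how many turns have been played.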

4. Brief Steps of the Algorithm

First, we built an environment for the prisoner’s dilemma game. Each player is asked to provide a strategy function and an update function. The program of the environment is shown in Algorithm 2.

The strategy and update functions we provide are shown in Algorithm 3.
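The environment loop and a strategy/update pair can be sketched in runnable form as below. This is only an illustration under stated assumptions: the payoffs use the 5/3/1/0 scores, the feedback is reduced to the tuple of decisions, and the Bayesian predictor is replaced by a plain frequency estimate of the rival's choices. `FrequencyPlayer`, `AlwaysCooperate`, `run`, and the warm-up length are hypothetical.

```python
import random

C, D = 0, 1
CHOICES = (C, D)
# Assumed payoffs: my score for (my choice, rival's choice).
PAYOFF = {(C, C): 3, (C, D): 0, (D, C): 5, (D, D): 1}
TURN_THRESHOLD = 5  # hypothetical number of random warm-up turns

class FrequencyPlayer:
    def __init__(self, me, rng):
        self.me, self.rng, self.storage = me, rng, []

    def predict(self):
        # Stand-in predictor: empirical frequency of the rival's past choices.
        rival = [d[1 - self.me] for d in self.storage]
        n = len(rival)
        return {c: (rival.count(c) / n if n else 1 / len(CHOICES)) for c in CHOICES}

    def strategy(self):
        if len(self.storage) < TURN_THRESHOLD:
            return self.rng.choice(CHOICES)
        prediction = self.predict()
        # Expected profit of each choice under the predicted rival behaviour.
        expect = {c: sum(q * PAYOFF[(c, r)] for r, q in prediction.items())
                  for c in CHOICES}
        return max(expect, key=expect.get)

    def update(self, feedback):
        self.storage.append(feedback)

class AlwaysCooperate:
    def strategy(self):
        return C

    def update(self, feedback):
        pass

def run(players, max_turn):
    # Environment loop for two players: collect decisions, score, broadcast feedback.
    history, totals = [], [0, 0]
    for _ in range(max_turn):
        decision = tuple(p.strategy() for p in players)
        for p in players:
            p.update(decision)
        totals[0] += PAYOFF[(decision[0], decision[1])]
        totals[1] += PAYOFF[(decision[1], decision[0])]
        history.append(decision)
    return totals, history

totals, history = run([FrequencyPlayer(0, random.Random(0)), AlwaysCooperate()], 20)
```

With these payoffs defection has the higher expected profit against any predicted mix, so the sketch defects on every turn after the warm-up.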

5. Experimental Results and Analysis

5.1. The Performance of the Bayes Model in the Two-Player Game. Four typical models were each run against the Bayes model 10000 times; the total incomes of both players in each game were recorded. Figure 1 shows the overall incomes of both players recorded over 10000 games, comparing the proposed Bayes model with the other four typical models. Overall, the Bayes model was more advantageous and achieved a higher score (overall income) than the other four. Of these four typical strategy models, TFT performed best: its overall income was equivalent to that of the Bayes model and higher than those of all the others. For each game pair, this research presents two test results, each corresponding to one of the two stable score results from the selected game pair. Figure 2 reveals that the Bayes model earned a higher income than the random, Pavlov, and GTFT models. The income ratio to the GTFT model reached 6.6, while that to the TFT model also exceeded one. By examining the final income from repeated games, it was found that the Bayes model was more advantageous than the other four typical strategy models tested here.

In games repeated 10000 times, the cases in which the Bayes model scored 5, 3, 1, and 0 were statistically analyzed. As shown in Figure 3, the scores of 5 and 1 represented a relatively large proportion. That is to say, the Bayes model was prone to defection. Analysis of Figures 1 and 3 implied that high scores mostly corresponded to cases scoring 5. Moreover, the results of each game competition showed that the incomes achieved by the Bayes model were higher when it manifested its tendency to defection.

Table 3: Time and space complexity of the Bayes method with different data structures and of some typical models.

Model / data structure    Space complexity    Time complexity of prediction    Time complexity of record
List                      $O(LN)$             $O(KN^2L^2)$                     $O(1)$
$N$-d array               $O(K^N)$            $O(NK)$                          $O(N)$
Hash table                $O(1)$              $O(NK)$                          $O(1)$
TFT                       $O(N)$              $O(N)$                           $O(N)$
GTFT                      $O(N)$              $O(N)$                           $O(N)$
Pavlov                    $O(N)$              $O(N)$                           $O(N)$
Random                    $O(1)$              $O(1)$                           $O(1)$

for t from 0 to MAX_TURN do
    for each player i do
        decision[i] ← player[i].strategy()
    end
    feedback ← profit(decision)
    for each player i do
        player[i].update(feedback)
    end
end

Algorithm 2

procedure strategy()
    if tem_turn < TURN_THRESHOLD then
        decision ← random(choice_space)
    else
        prediction_list ← predict(restore)
        for each choice c do
            profit[c] ← profit_expect(c, prediction_list)
        end
        decision ← arg(max(profit))
    end
    return decision
end
procedure update(feedback)
    storage.add(feedback)

Algorithm 3

Analysis of the overall income of both players in each game (Figure 4) showed that the overall income in the game with a TFT model was lower. According to the performance of the rivals in each game (Figure 3), and in comparison with test result 2 (the names ending with “2”), it was noted that both strategy models in test result 1 (the names ending with “1”) were less inclined to defection. Therefore, their overall income was higher. Since TFT is considered to be the model that achieves the most desirable results amongst the four typical strategy models, games setting the TFT strategy model and the Bayes model in opposition were mainly investigated. In this game, the overall incomes of both sides were lower than those in other game competitions. This indicated that both sides suffered losses in the game between the TFT strategy model and the Bayes model. By studying the single-game results from the Bayes model and the TFT model, it was found that the scores from the Bayes model were 1 (both sides simultaneously selected defection). This result suggested that, in any game between the Bayes model and the TFT model, defection appeared more frequently, which is a mark of the defection-prone tendencies of the two models.

Figure 2: The income ratio of the Bayes model to other models in the games (categories: Random1, Random2, TFT1, TFT2, Pavlov1, Pavlov2, GTFT1, GTFT2).

Figure 3: The distribution of Bayesian scores in each game (legend: 5, 3, 1, 0; categories: Random1, Random2, TFT1, TFT2, Pavlov1, Pavlov2, GTFT1, GTFT2).

Figure 4: The comparisons of overall game incomes (legend: Bayes, Rival; categories: Random1, Random2, TFT1, TFT2, Pavlov1, Pavlov2, GTFT1, GTFT2).
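For readers unfamiliar with the baseline strategies, the sketch below gives common textbook forms of TFT, GTFT, Pavlov (win-stay, lose-shift), and the random strategy, scored with the 5/3/1/0 values used in this section. The opening move, the GTFT generosity of 1/3, and the `always_defect` stand-in for a defection-prone rival are assumptions made for illustration, not the authors' settings.

```python
import random

C, D = "C", "D"
# My score for (my choice, rival's choice), using the 5/3/1/0 scores.
SCORE = {(C, C): 3, (C, D): 0, (D, C): 5, (D, D): 1}

def tft(my_prev, rival_prev, rng):
    # Tit for tat: repeat the rival's previous choice.
    return rival_prev

def gtft(my_prev, rival_prev, rng, generosity=1 / 3):
    # Generous tit for tat: forgive a defection with an assumed probability.
    if rival_prev == D and rng.random() < generosity:
        return C
    return rival_prev

def pavlov(my_prev, rival_prev, rng):
    # Win-stay, lose-shift: cooperate exactly when both chose alike last round.
    return C if my_prev == rival_prev else D

def random_choice(my_prev, rival_prev, rng):
    return rng.choice([C, D])

def always_defect(my_prev, rival_prev, rng):
    # Hypothetical stand-in for a defection-prone rival.
    return D

def play(s1, s2, turns, seed=0):
    rng = random.Random(seed)
    a_prev, b_prev = C, C  # assumed start: as if the previous round was mutual cooperation
    total_a = total_b = 0
    for _ in range(turns):
        a = s1(a_prev, b_prev, rng)
        b = s2(b_prev, a_prev, rng)
        total_a += SCORE[(a, b)]
        total_b += SCORE[(b, a)]
        a_prev, b_prev = a, b
    return total_a, total_b
```

Under these assumptions, `play(always_defect, tft, 10)` returns `(14, 9)`: after the first round both sides defect and score 1 per round, the mutual-defection deadlock described above.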

5.2. The Performance of the Bayes Model in a Multiplayer Game. With regard to multiplayer games, this study jointly used four different strategy models to run against the Bayes model, and the overall income from each model was recorded. With the methods described in Section 4, the income accruing to each player was distributed and the overall income was calculated. In this section, the decision method of the TFT, Pavlov, and GTFT models differed slightly from those applied to the two-player game: in the event of the defection of one of the other players in the previous round, the rivals selected defection. In the following two-player game, the decision for the current round was made with the investigation of the decisions of players A and B in the previous round.

It can be deduced from Figure 5 that, in the multiplayer game, the Bayes model returned the highest overall income. In addition, the overall game situation implied that the proportion of rounds in which the four typical strategy models selected cooperation was the highest. That is to say, in the game with four models, each treated cooperation as its main strategy (Figure 6).
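The multiplayer variant of the TFT rule described in this section (defect if any other player defected in the previous round) might look like the following sketch. The function name and the list encoding of the previous round are hypothetical.

```python
C, D = "C", "D"

def multiplayer_tft(previous_round, me):
    # previous_round: every player's choice in the last round, indexed by player.
    # Defect if at least one of the other players defected; otherwise cooperate.
    others = [choice for i, choice in enumerate(previous_round) if i != me]
    return D if D in others else C
```

The player's own previous choice is ignored, so a single defector in a five-player round pushes the other four to defect in the next round.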

5.3. Analysis of the Performance of the Bayes Model versus Normal Models. The normal model refers to the strategy models that are possibly encountered in real-life enactments of the game, simulated here using a natural model. The natural model was a model adopting a random strategy (that is, when faced with identical decisions from the previous round, the probabilities that the natural models selected cooperation were different). To verify that the strategy selected by the proposed Bayes model reaped more benefits in games versus the normal model, the game between them was repeated 500 times. In each game, there were 1000 selections. This was an attempt to analyze the advantages and disadvantages of the Bayes model more comprehensively.
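One possible reading of the natural model is a stochastic strategy that holds a separate cooperation probability for each outcome of the previous round, with the probabilities drawn at random for each natural model. The sketch below follows that reading; it is an interpretation, and `make_natural_model` is a hypothetical name.

```python
import random

C, D = "C", "D"

def make_natural_model(rng):
    # One cooperation probability per previous-round outcome (mine, rival's),
    # drawn independently so that each natural model behaves differently.
    probs = {(a, b): rng.random() for a in (C, D) for b in (C, D)}

    def choose(my_prev, rival_prev):
        # Cooperate with the probability attached to the previous outcome.
        return C if rng.random() < probs[(my_prev, rival_prev)] else D

    return probs, choose

# Build one natural model; 500 such models would correspond to the 500 games above.
probs, choose = make_natural_model(random.Random(1))
```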

Figure 7 shows that the Bayes model performed better than most of the natural models. Overall, the income from the Bayes model can reach approximately 3000, and even 4500 in individual extreme cases. The income ratio in Figure 8 was maximized at approximately 14, while most of the income ratios were above one.

In addition, the TFT model was also run in games against the normal model; the result is shown in Figure 9. The average income of the TFT model was approximately 2500, which was lower than that of the Bayes model while still equivalent to those of its other rivals.

Figure 5: The overall income of each model after 10000 rounds of the multiplayer game (models: Bayes, TFT, Pavlov, GTFT).

Figure 6: Frequency of occurrence of games in the one-shot game when the number of betrayals is varied (x-axis: the number of betrayals, 0 to 4).

5.4. The Performance of the Bayes Model When Run over Fewer Games. Since the Bayes model is a machine learning model, it needs a certain amount of data to guarantee its learning. Therefore, when there are fewer games, whether or not the Bayes model can achieve better game results should be considered.

In this study, the Bayes model was evaluated over fewer games: it competed with the TFT model 100 times, with the game repeated 100 times in each competition. From Figures 10 and 11, we can see that the Bayes model was more advantageous.

5.5. Game Results from a PTFT Model Compared with the Other Models. The game models discussed above merely considered the attitude immediately after the selection of the previous step of both sides, while the PTFT model took account of the attitude during the selection of the previous three steps. Over 10000 runs of the PTFT model against the other models, the incomes of each model are shown in Figure 12, which suggested that the Bayes model was at a disadvantage in the game and gained neither more nor less than the TFT model (both models became trapped in the mutual defection deadlock). Moreover, the Pavlov model returned the lowest individual income, while the GTFT model gained the most.


Page 7: Research Article A Study of Prisoner s Dilemma Game Model ...downloads.hindawi.com/journals/mpe/2015/452042.pdf · a game system with several types of game strategy for testing. In

Mathematical Problems in Engineering 7

Table 3 Time and space complexity of Bayes method with different data structures and some typical models

Space complexity Time complexity of prediction Time complexity of recordList 119874 (119871119873) 119874 (119870119873

2

1198712

) 119874 (1)

119873-d array 119874 (119870119873

) 119874 (119873119870) 119874 (119873)

Hash table 119874 (1) 119874 (119873119870) 119874 (1)

TFT 119874 (119873) 119874 (119873) 119874 (119873)

GTFT 119874 (119873) 119874 (119873) 119874 (119873)

Pavlov 119874 (119873) 119874 (119873) 119874 (119873)

Random 119874 (1) 119874 (1) 119874 (1)

for 119905 from 0 toMAX TURN dofor each player 119894 do

decision[119894]larr player[119894]strategyendfeedbacklarr profit(decision)for each player 119894 do

player[119894]update(feedback)end

end

Algorithm 2

procedure strategy()if tem turn ltTURN THRESHOLD then

decisionlarr random(choice space)else

prediction listlarr predict(restore)for each choice 119888 do

profit[119888]larrprofit expect(119888 prediction list)

endend

decisionlarr arg(max(profit))return decision

endprocedure update( feedback )

storageadd(feedback)

Algorithm 3

shown in Figure 3 the scores of 5 and 1 represented a relativelylarge proportion That is to say the Bayes model was proneto defection Analysis of Figures 1 and 3 implied that highscores mostly corresponded to cases scoring 5 Moreover theresults of each game competition showed that the incomesachieved by the Bayes model were higher when manifestingits tendency to defection

Analysis of the overall income of both players in eachgame (Figure 4) showed that the overall income in the gamewith a TFT model was lower According to the performanceof the rivals in each game (Figure 3) and comparison withtest result 2 (with the name end with ldquo2rdquo) it was noted thatboth strategy models in test result 1 (with the name end withldquo1rdquo) were less inclined to defection Therefore their overall

Figure 2: The income ratio of the Bayes model to other models in the games (horizontal axis: Random1, Random2, TFT1, TFT2, Pavlov1, Pavlov2, GTFT1, GTFT2).

Figure 3: The distribution of Bayesian scores in each game (scores: 5, 3, 1, 0; horizontal axis: Random1, Random2, TFT1, TFT2, Pavlov1, Pavlov2, GTFT1, GTFT2).

Figure 4: The comparisons of overall game incomes (series: Bayes, Rival; horizontal axis: Random1, Random2, TFT1, TFT2, Pavlov1, Pavlov2, GTFT1, GTFT2).

income was higher. Since TFT is considered to be the model that achieves the most desirable results among the four typical strategy models, games setting the TFT strategy model against the Bayes model were the main focus of investigation. In this game, the overall incomes of both sides were lower than those in the other game competitions. This indicated that both the TFT strategy model and the Bayes model suffered losses in the game between them. Studying the single-game results from the Bayes model and the TFT model showed that the scores from the Bayes model were 1 (both sides simultaneously selected defection). This result suggested that, in any game between the Bayes model and the TFT model, defection appeared more frequently, reflecting the defection-prone tendencies of the two models.

5.2. The Performance of the Bayes Model in a Multiplayer Game. With regard to multiplayer games, this study jointly used four different strategy models to run against the Bayes model, and the overall income from each model was recorded. With the methods described in Section 4, the income accruing to each player was distributed and the overall income was calculated. In this section, the decision method of the TFT, Pavlov, and GTFT models differed slightly from that applied in the two-player game: if any of the other players defected in the previous round, the rivals selected defection. In the two-player game, the decision for the current round was made by examining the decisions of players A and B in the previous round.
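
The multiplayer variant of the TFT rule described above can be sketched as a small function. The function name, the "C"/"D" encoding, and the choice to cooperate in the first round are assumptions for illustration; the paper does not give code for this rule.

```python
def multiplayer_tft(previous_round, own_index):
    """Defect if any other player defected in the previous round.

    previous_round: sequence of "C"/"D" decisions, one per player,
                    or None when no round has been played yet.
    own_index:      position of this player in that sequence.
    """
    if previous_round is None:
        return "C"  # assumed opening move
    others = [d for i, d in enumerate(previous_round) if i != own_index]
    return "D" if "D" in others else "C"
```

Note that the player's own previous decision is ignored: only the decisions of the other players trigger defection.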

It can be deduced from Figure 5 that, in the multiplayer game, the Bayes model returned the highest overall income. In addition, the overall game situation implied that the proportion of rounds in which the four typical strategy models selected cooperation was the highest. That is to say, in the game with four models, each treated cooperation as its main strategy (Figure 6).

5.3. Analysis of the Performance of the Bayes Model versus Normal Models. The normal model refers to the strategy models that are possibly encountered in real-life enactments of the game, simulated here using a natural model. The natural model was a model adopting a random strategy (that is, when faced with identical decisions from the previous round, the probabilities that the natural models selected cooperation were different). To verify that the strategy selected by the proposed Bayes model reaped more benefits in games versus the normal model, the game between them was repeated 500 times. In each game, there were 1000 selections. This was an attempt to analyze the advantages and disadvantages of the Bayes model more comprehensively.
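
One way to read the description of the natural model is sketched below: each instance draws its own cooperation probability for every possible previous-round outcome, so two natural models facing the same previous round cooperate with different probabilities. The class name NaturalModel, the uniform sampling of the probabilities, and the separate first-round probability are assumptions, not details given in the paper.

```python
import random


class NaturalModel:
    """Hypothetical random-strategy rival with per-state cooperation probabilities."""

    def __init__(self, rng):
        self.rng = rng
        states = [(a, b) for a in "CD" for b in "CD"]  # (own, rival) last round
        self.p_cooperate = {state: rng.random() for state in states}
        self.p_first = rng.random()  # assumed rule for the very first round

    def decide(self, previous_state):
        """Return "C" or "D" given the previous joint decision (or None)."""
        if previous_state is None:
            p = self.p_first
        else:
            p = self.p_cooperate[previous_state]
        return "C" if self.rng.random() < p else "D"
```

In an experiment like the one described, a fresh NaturalModel would be created for each of the 500 games, and decide would be called once per selection.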

Figure 7 shows that the Bayes model performed better than most of the natural models. Overall, the income from the Bayes model reached approximately 3000, and even 4500 in individual extreme cases. The income ratio in Figure 8 was maximized at approximately 14, while most of the income ratios were above one.

In addition, the TFT model also played games against the normal model; the result is shown in Figure 9. The average income of the TFT model was approximately 2500, which was lower than that of the Bayes model while still equivalent to those of its other rivals.

5.4. The Performance of the Bayes Model When Run over Fewer Games. Since the Bayes model was a machine learning model, it needed a certain amount of data to guarantee its learning. Therefore, when there were fewer games, whether or

Figure 5: The overall income of each model after 10000 rounds of the multiplayer game (models: Bayes, TFT, Pavlov, GTFT).

Figure 6: Frequency of occurrence of games in the one-shot game as the number of betrayals varies (horizontal axis: the number of betrayals, 0 to 4).

not the Bayes model could achieve better game results should be considered.

In this study, the Bayes model was evaluated over fewer rounds in competition with the TFT model: each game lasted 100 rounds, and the game was repeated 100 times. From Figures 10 and 11, we can see that the Bayes model was more advantageous.

5.5. Game Results from a PTFT Model Compared with the Other Models. The game models discussed above merely considered the attitude immediately after the selection of the previous step of both sides, while the PTFT model took account of the attitude during the selection of the previous three steps. Over 10000 runs of the PTFT model against the other models, the incomes of each model are shown in Figure 12, which suggested that the Bayes model was at a disadvantage in the game and gained neither more nor less than the TFT model (both models became trapped in the mutual defection deadlock). Moreover, the Pavlov model returned the lowest individual income, while the GTFT model gained the most.

This revealed one of the disadvantages of the Bayes model: if it failed to comprehensively consider all state characteristics that may appear in the game, it could not obtain the optimum solution. In the current experiment, since the number of decision state steps of both sides considered by the Bayes model was set to one, the Bayes model was incapable of obtaining the optimum income result in the game against the PTFT model.
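
The effect of the number of considered steps can be illustrated with a small sketch. It keeps, in a hash table, counts of the rival's next choice keyed by the last `depth` joint decisions. The class ConditionalCounter, the helper observe, and the periodic rival (cooperate, cooperate, defect) are hypothetical and chosen only to show that a depth of one can leave a state ambiguous while a larger depth resolves it; they do not reproduce the PTFT model.

```python
from collections import Counter, defaultdict


class ConditionalCounter:
    """Counts the rival's next choice for each state of the last `depth` rounds."""

    def __init__(self, depth):
        if depth < 1:
            raise ValueError("depth must be at least 1")
        self.depth = depth
        self.history = []                   # joint decisions (mine, rival's)
        self.table = defaultdict(Counter)   # hash table: state -> choice counts

    def state(self):
        return tuple(self.history[-self.depth:])

    def record(self, mine, rivals):
        """Store the rival's choice under the state that preceded it."""
        if len(self.history) >= self.depth:
            self.table[self.state()][rivals] += 1
        self.history.append((mine, rivals))

    def predict(self):
        """Estimated distribution of the rival's next choice in the current state."""
        counts = self.table.get(self.state(), Counter())
        total = sum(counts.values())
        return {c: n / total for c, n in counts.items()} if total else {}


def observe(depth, rounds=30):
    """Watch a rival that repeats C, C, D while this player always cooperates."""
    counter = ConditionalCounter(depth)
    for i in range(rounds):
        rival = "D" if i % 3 == 2 else "C"
        counter.record("C", rival)
    return counter
```

With depth 1, the state after mutual cooperation is followed by both "C" and "D", so the prediction stays uncertain; with depth 2, two consecutive rounds of mutual cooperation are always followed by "D".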

6. Conclusions

The authors regarded the prisoners' dilemma as an incomplete information game with unpublicized game strategies. In the research, a machine learning model was constructed to solve problems in incomplete information games. Based on the Bayesian model, we can make the prediction of players'

Figure 7: The income of the Bayes model and normal models in the game (series: Bayes, Normal; horizontal axis: game index, 1 to 500).

Figure 8: The income ratio of the Bayes model versus the general model.

Figure 9: The income of the TFT model and normal models in the game (series: TFT, Normal; horizontal axis: game index, 1 to 500).

choices to better complete the unknown information. We also suggested the hash table to improve space and time complexity. We built a game system with several types of game strategy for testing. The experimental results show that the proposed Bayes model could obtain more desirable game results than conventional typical strategy models in two-player or multiplayer games run 10000 times. It was even believed that the Bayes model was slightly better than the TFT model, the acknowledged optimal strategy. In a game with more general single-step decision modeling and fewer game runs, the Bayes model also dominated. This result indicated that the naive Bayesian classification algorithm was feasible and effective at establishing the strategy model of an incomplete information game. It provided a novel idea for solving incomplete information game problems.

However, the results obtained by the naive Bayesian classification algorithm showed certain defects: it was unable to obtain the desired solution when the decision ability of a rival was beyond its estimation range. Therefore, it reduced

Figure 10: The incomes of the Bayes and TFT models after 100 games (series: Bayes, TFT; horizontal axis: game index, 1 to 100).

Figure 11: The average income of the Bayes and TFT models after 100 games.

Figure 12: The incomes of each model after 10000 games (series: Rival, PTFT; horizontal axis: Bayes, TFT, Pavlov, GTFT).

Figure 13: The income of the Bayes model with different depths of considered step in the weight function when competing with a random m-step strategy (series: m = 1 to m = 5; horizontal axis: depth of considered step, 1 to 10).

the applicability of the machine learning algorithm when encountering complex models. This should be the subject of future research.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This research was supported by the National Natural Science Foundation of China (No. 61100148) and the Project on the Integration of Industry, Education and Research of Guangdong Province (No. 2012B091100489).

References

[1] O. Nomia, Games with Incomplete Information, Universite Paris 1 Pantheon-Sorbonne, Paris, France, 1998.

[2] J. C. Harsanyi, "Games with incomplete information played by 'Bayesian' players. I. The basic model," Management Science, vol. 14, no. 3, pp. 159–182, 1967.

[3] M. Zinkevich, M. Johanson, M. H. Bowling, and C. Piccione, "Regret minimization in games with incomplete information," Advances in Neural Information Processing Systems, vol. 2008, no. 20, pp. 1729–1736, 2008.

[4] E. Alpaydin, Introduction to Machine Learning, The MIT Press, Cambridge, Mass, USA, 2004.

[5] C. M. Bishop, Pattern Recognition and Machine Learning, Information Science and Statistics, Springer, New York, NY, USA, 2006.

[6] X. G. Zhang, "Introduction to statistical learning theory and support vector machines," Acta Automatica Sinica, vol. 26, no. 1, pp. 32–42, 2000.

[7] R. J. Hanson, B. Cheeseman, and P. Stutz, Bayesian Classification Theory, NASA Ames Research Center, Artificial Intelligence Research Branch, 1991.

[8] M. Nowak and K. Sigmund, "A strategy of win-stay, lose-shift that outperforms tit-for-tat in the Prisoner's Dilemma game," Nature, vol. 364, no. 6432, pp. 56–58, 1993.

[9] D. M. Kreps, P. Milgrom, J. Roberts, and R. Wilson, "Rational cooperation in the finitely repeated prisoners' dilemma," Journal of Economic Theory, vol. 27, no. 2, pp. 245–252, 1982.

[10] D. W. K. Yeung, L. A. Petrosyan, and M. C. C. Lee, Dynamic Cooperation: A Paradigm on the Cutting-Edge of Game Theory, China Market Press, 2007.

[11] R. Axelrod, "Effective choice in the prisoner's dilemma," Journal of Conflict Resolution, vol. 24, no. 1, pp. 3–25, 1980.

[12] R. Axelrod and W. D. Hamilton, "The evolution of cooperation," Science, vol. 211, no. 4489, pp. 1390–1396, 1981.

[13] J. H. Miller, "The coevolution of automata in the repeated prisoner's dilemma," Journal of Economic Behavior and Organization, vol. 29, no. 1, pp. 87–112, 1996.

[14] W. H. Press and F. J. Dyson, "Iterated Prisoner's Dilemma contains strategies that dominate any evolutionary opponent," Proceedings of the National Academy of Sciences of the United States of America, vol. 109, no. 26, pp. 10409–10413, 2012.

[15] H. Lin and C.-X. Wu, "Evolution of strategies based on genetic algorithm in the iterated prisoner's dilemma on complex networks," Acta Physica Sinica, vol. 56, no. 8, pp. 4313–4318, 2007.

[16] M. A. Nowak and K. Sigmund, "Evolutionary dynamics of biological games," Science, vol. 303, no. 5659, pp. 793–799, 2004.

[17] J. W. Weibull, Evolutionary Game Theory, MIT Press, Cambridge, Mass, USA, 1997.

[18] M. A. Nowak and R. M. May, "Evolutionary games and spatial chaos," Nature, vol. 359, no. 6398, pp. 826–829, 1992.

[19] A. Cardillo, J. Gomez-Gardenes, D. Vilone, and A. Sanchez, "Co-evolution of strategies and update rules in the prisoner's dilemma game on complex networks," New Journal of Physics, vol. 12, no. 10, Article ID 103034, 2010.

[20] W.-B. Du, H.-R. Zheng, and M.-B. Hu, "Evolutionary prisoner's dilemma game on weighted scale-free networks," Physica A: Statistical Mechanics and Its Applications, vol. 387, no. 14, pp. 3796–3800, 2008.

[21] H. Ohtsuki, C. Hauert, E. Lieberman, and M. A. Nowak, "A simple rule for the evolution of cooperation on graphs and social networks," Nature, vol. 441, no. 7092, pp. 502–505, 2006.

[22] Y. Wang and C. Dang, "An evolutionary algorithm for global optimization based on level-set evolution and latin squares," IEEE Transactions on Evolutionary Computation, vol. 11, no. 5, pp. 579–595, 2007.

[23] Y. Wang, Y.-C. Jiao, and H. Li, "An evolutionary algorithm for solving nonlinear bilevel programming based on a new constraint-handling scheme," IEEE Transactions on Systems, Man and Cybernetics C: Applications and Reviews, vol. 35, no. 2, pp. 221–232, 2005.

[24] R. Selten and R. Stoecker, "End behavior in sequences of finite Prisoner's Dilemma supergames: A learning theory approach," Journal of Economic Behavior and Organization, vol. 7, no. 1, pp. 47–70, 1986.

[25] D. Y. Jiang, Situation Analysis of Double Action Games with Entropy, Science Press, New York, NY, USA, 2010.

[26] P. A. Flach and N. Lachiche, "Naive Bayesian classification of structured data," Machine Learning, vol. 57, no. 3, pp. 233–269, 2004.


Page 8: Research Article A Study of Prisoner s Dilemma Game Model ...downloads.hindawi.com/journals/mpe/2015/452042.pdf · a game system with several types of game strategy for testing. In

8 Mathematical Problems in Engineering

both suffered losses By studying the single game resultsfrom the Bayes model and the TFT model it was foundthat the scores from the Bayes model were 1 (both sidessimultaneously selected defection)This result suggested thatin any game between the Bayes model and the TFT modeldefection appeared more frequently and represented a markof the defection-prone tendencies of the two models

52 The Performance of the Bayes Model in a MultiplayerGame With regard to multiplayer games this study jointlyused four different strategy models to run against the Bayesmodel and the overall income from eachmodel was recordedWith the methods described in Section 4 the income accru-ing to each player was distributed and the overall incomewas calculated In this section the decision method of TFTPavlov andGTFTmodels differed slightly from those appliedto the two-player game in the event of the defection of oneof the other players in the previous round the rivals selecteddefection In the following two-player game the decision forthe current round was made with the investigation of thedecisions of players A and B in the previous round

It can be deduced from Figure 5 that in the multiplayergame the Bayes model returned the highest overall incomeIn addition the overall game situation implied that theproportion that the four typical strategy models selectedcooperation was the highest That is to say in the game withfour models each treated cooperation as its main strategy(Figure 6)

53 Analysis of the Performance of the Bayes Model versusNormal Models The normal model refers to the strategymodels that are possibly encountered in real-life enactmentsof the game simulated here using a natural model The nat-ural model was a model adopting a random strategy (that iswhen faced with identical decisions from the previous roundthe probabilities that the natural models selected cooperationwere different) To verify that the strategy selected by theproposed Bayes model reaped more benefits in games versusthe normal model the game between them was repeated 500times In each game there were 1000 selections It was anattempt tomore comprehensively analyze the advantages anddisadvantages of the Bayes model

Figure 7 shows that the Bayes model performed betterthan most of natural models Overall the income from theBayes model can reach approximately 3000 and even 4500in individual extreme casesThe income ratio in Figure 8 wasmaximized at approximately 14 while most of the incomeratios were above one

In addition the TFT model also conducted the gameswith the normal model the result is shown in Figure 9 Itwas shown that the average income of the TFT model wasapproximately 2500 which was lower than that of the Bayesmodel while still equivalent to those of its other rivals

54 The Performance of the Bayes Model When Run overFewer Games Since the Bayes model was amachine learningmodel it needed a certain amount of data to guarantee itslearningTherefore when therewere fewer games whether or

0

20000

40000

60000

BayesTFT

PavlovGTFT

Figure 5 The overall income of each model after 10000 times ofmultiplayer game

02000400060008000

0 1 2 3 4The number of betrayals

Figure 6 Frequency ocurrence of games in the one-short gamewhen the number of betrayals is varied

not the Bayes model could achieve better game results shouldbe considered

In this study the Bayes model was evaluated using fewergame times in competition with the TFT model 100 timeswith the game repeated 100 times From Figures 10 and 11 wecan see that the Bayes model was more advantageous

55 Game Results from a PTFT Model Compared with theOtherModels Thegamemodels discussed abovemerely con-sidered the attitude immediately after the selection of the pre-vious step of both sides while the PTFT model took accountof the attitude during the selection of the previous threesteps Over 10000 runs of the PTFT model against the othermodels the incomes of each model are shown in Figure 12which suggested that the Bayes model was disadvantageousover the game and gained neither more nor less than the TFTmodel (both models became trapped in the mutual defectiondeadlock) Moreover the Pavlov model returned the lowestindividual income while the GTFT model gained the most

This revealed one of the disadvantages of the Bayesmodelif it failed to comprehensively consider all state characteristicsthat may appear in the game it cannot obtain the optimumsolution In the current experiment since the decision statesteps of both sides considered by the Bayes model were set toone the Bayesmodel was incapable of obtaining the optimumincome result in the game against the PTFT model

6 Conclusions

The authors regarded the prisonersrsquo dilemma as an incom-plete information gamewith unpublicized game strategies Inthe research a machine learning model was constructed tosolve problems in incomplete information games Based onthe Bayesian model we can make the prediction of playersrsquo

Mathematical Problems in Engineering 9

010002000300040005000

1 19 37 55 73 91 109

127

145

163

181

199

217

235

253

271

289

307

325

343

361

379

397

415

433

451

469

487

BayesNormal

Figure 7The income of the Bayes model and normal models in thegame

0

5

10

15

20

0 200 400 600

Figure 8 The income ratio of the Bayes model versus the generalmodel

01000200030004000

1 19 37 55 73 91 109

127

145

163

181

199

217

235

253

271

289

307

325

343

361

379

397

415

433

451

469

487

TFTNormal

Figure 9 The income of the TFT model and normal models in thegame

choices to better complete the unknown information Andwesuggested the hash table to make improvement in space andtime complexity We built a game system with several typesof game strategy for testing The experimental results showthat the proposed Bayes model could obtain more desiredgame results compared with conventional typical strategymodels in double- or multiplayer games 10000 times It waseven believed that the Bayes model was slightly better thanthe acknowledged optimal strategy TFT model In a gamewith more general single-step decision modeling and fewergames runs the Bayes model also dominated This resultindicated that the naive Bayesian classification algorithmwasfeasible and effective at establishing the strategy model of anincomplete information game It provided a novel idea forsolving incomplete information game problems

However the results obtained by the naive Bayesianclassification algorithm showed certain defects it was unableto obtain the desired solution in the case of the decision abilityof a rival beyond its estimation range Therefore it reduced

BayesTFT

050

100150200250

1 7 13 19 25 31 37 43 49 55 61 67 73 79 85 91 97

Figure 10 The incomes of the Bayes and TFT models after 100games

160165170175180

Bayes TFT

Figure 11 The average income of the Bayes and TFT models after100 games

0

10000

20000

30000

40000

Bayes TFT Pavlov GTFT

RivalPTFT

Figure 12 The incomes of each model after 10000 games

0

5000

10000

15000

20000

25000

1 2 3 4 5 6 7 8 9 10Depth of considered step

m = 1m = 2m = 3

m = 4m = 5

Figure 13 The income of Bayes model with different depths ofconsidered step in weight function when completing with randomm-step strategy

10 Mathematical Problems in Engineering

the applicability of the machine learning algorithm whenencountering complex models This should be the subject offuture research

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgments

This research was supported by the National Natural ScienceFoundation of China (No 61100148) and Project on the Inte-gration of Industry Education and Research of GuangdongProvince (No 2012B091100489)

References

[1] O NomiaGames with Incomplete Information Universite Paris1 Pantheon-Sorbonne Paris France 1998

[2] J C Harsanyi ldquoGames with incomplete information played byldquoBayesianrdquo players I The basic modelrdquo Management Sciencevol 14 no 3 pp 159ndash182 1967

[3] M Zinkevich M Johanson M H Bowling and C PiccioneldquoRegret minimization in games with incomplete informationrdquoAdvances in Neural InformationProcessing Systems vol 2008no 20 pp 1729ndash1736 2008

[4] E Alpaydin Introduction to Machine Learning The MIT PressCambridge Mass USA 2004

[5] C M Bishop Pattern Recognition and Machine LearningInformation Science and Statistics Springer New York NYUSA 2006

[6] X G Zhang ldquoIntroduction to statistical learning theory andsupport vector machinesrdquo Acta Automatica Sinica vol 26 no1 pp 32ndash42 2000

[7] R J Hanson B Cheeseman and P StutzBayesian ClassificationTheory NASA Ames Research Center Artificial IntelligenceResearch Branch 1991

[8] M Nowak and K Sigmund ldquoA strategy of win-stay lose-shiftthat outperforms tit-for-tat in the Prisonerrsquos Dilemma gamerdquoNature vol 364 no 6432 pp 56ndash58 1993

[9] D M Kreps P Milgrom J Roberts and R Wilson ldquoRationalcooperation in the finitely repeated prisonersrsquo dilemmardquo Journalof Economic Theory vol 27 no 2 pp 245ndash252 1982

[10] D W K Yeung L A Petrosyan and M C C Lee DynamicCooperation A Paradigm on the Cutting-Edge of Game TheoryChina Market Press 2007

[11] R Axelrod ldquoEffective choice in the prisonerrsquos dilemmardquo Journalof Conflict Resolution vol 24 no 1 pp 3ndash25 1980

[12] R Axelrod andWDHamilton ldquoThe evolution of cooperationrdquoScience vol 211 no 4489 pp 1390ndash1396 1981

[13] J H Miller ldquoThe coevolution of automata in the repeatedprisonerrsquos dilemmardquo Journal of Economic Behavior and Orga-nization vol 29 no 1 pp 87ndash112 1996

[14] W H Press and F J Dyson ldquoIterated Prisonerrsquos Dilemmacontains strategies that dominate any evolutionary opponentrdquoProceedings of the National Academy of Sciences of the UnitedStates of America vol 109 no 26 pp 10409ndash10413 2012

[15] H Lin and C-X Wu ldquoEvolution of strategies based on geneticalgorithm in the iterated prisonerrsquos dilemma on complex net-worksrdquo Acta Physica Sinica vol 56 no 8 pp 4313ndash4318 2007

[16] M A Nowak and K Sigmund ldquoEvolutionary dynamics ofbiological gamesrdquo Science vol 303 no 5659 pp 793ndash799 2004

[17] J W Weibull Evolutionary Game Theory MIT Press Cam-bridge Mass USA 1997

[18] M A Nowak and R M May ldquoEvolutionary games and spatialchaosrdquo Nature vol 359 no 6398 pp 826ndash829 1992

[19] A Cardillo J Gomez-Gardenes D Vilone and A SanchezldquoCo-evolution of strategies and update rules in the prisonerrsquosdilemma game on complex networksrdquo New Journal of Physicsvol 12 no 10 Article ID 103034 2010

[20] W-B Du H-R Zheng andM-B Hu ldquoEvolutionary prisonerrsquosdilemma game on weighted scale-free networksrdquo Physica AStatistical Mechanics and Its Applications vol 387 no 14 pp3796ndash3800 2008

[21] H Ohtsuki C Hauert E Lieberman and M A Nowak ldquoAsimple rule for the evolution of cooperation on graphs andsocial networksrdquo Nature vol 441 no 7092 pp 502ndash505 2006

[22] Y Wang and C Dang ldquoAn evolutionary algorithm for globaloptimization based on level-set evolution and latin squaresrdquoIEEE Transactions on Evolutionary Computation vol 11 no 5pp 579ndash595 2007

[23] Y Wang Y-C Jiao and H Li ldquoAn evolutionary algorithmfor solving nonlinear bilevel programming based on a newconstraint-handling schemerdquo IEEE Transactions on SystemsMan and Cybernetics C Applications and Reviews vol 35 no2 pp 221ndash232 2005

[24] R Selten and R Stoecker ldquoEnd behavior in sequences of finitePrisonerrsquos Dilemma supergames A learning theory approachrdquoJournal of Economic Behavior and Organization vol 7 no 1 pp47ndash70 1986

[25] D Y Jiang Situation Analysis of Double Action Games withEntropy Science Press New York NY USA 2010

[26] P A Flach and N Lachiche ldquoNaive Bayesian classification ofstructured datardquo Machine Learning vol 57 no 3 pp 233ndash2692004

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 9: Research Article A Study of Prisoner s Dilemma Game Model ...downloads.hindawi.com/journals/mpe/2015/452042.pdf · a game system with several types of game strategy for testing. In

Mathematical Problems in Engineering 9

010002000300040005000

1 19 37 55 73 91 109

127

145

163

181

199

217

235

253

271

289

307

325

343

361

379

397

415

433

451

469

487

BayesNormal

Figure 7The income of the Bayes model and normal models in thegame

0

5

10

15

20

0 200 400 600

Figure 8 The income ratio of the Bayes model versus the generalmodel

01000200030004000

1 19 37 55 73 91 109

127

145

163

181

199

217

235

253

271

289

307

325

343

361

379

397

415

433

451

469

487

TFTNormal

Figure 9 The income of the TFT model and normal models in thegame

choices to better complete the unknown information Andwesuggested the hash table to make improvement in space andtime complexity We built a game system with several typesof game strategy for testing The experimental results showthat the proposed Bayes model could obtain more desiredgame results compared with conventional typical strategymodels in double- or multiplayer games 10000 times It waseven believed that the Bayes model was slightly better thanthe acknowledged optimal strategy TFT model In a gamewith more general single-step decision modeling and fewergames runs the Bayes model also dominated This resultindicated that the naive Bayesian classification algorithmwasfeasible and effective at establishing the strategy model of anincomplete information game It provided a novel idea forsolving incomplete information game problems

However the results obtained by the naive Bayesianclassification algorithm showed certain defects it was unableto obtain the desired solution in the case of the decision abilityof a rival beyond its estimation range Therefore it reduced

BayesTFT

050

100150200250

1 7 13 19 25 31 37 43 49 55 61 67 73 79 85 91 97

Figure 10 The incomes of the Bayes and TFT models after 100games

160165170175180

Bayes TFT

Figure 11 The average income of the Bayes and TFT models after100 games

0

10000

20000

30000

40000

Bayes TFT Pavlov GTFT

RivalPTFT

Figure 12 The incomes of each model after 10000 games

0

5000

10000

15000

20000

25000

1 2 3 4 5 6 7 8 9 10Depth of considered step

m = 1m = 2m = 3

m = 4m = 5

Figure 13 The income of Bayes model with different depths ofconsidered step in weight function when completing with randomm-step strategy

10 Mathematical Problems in Engineering

the applicability of the machine learning algorithm whenencountering complex models This should be the subject offuture research

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgments

This research was supported by the National Natural ScienceFoundation of China (No 61100148) and Project on the Inte-gration of Industry Education and Research of GuangdongProvince (No 2012B091100489)

References

[1] O. Nomia, Games with Incomplete Information, Université Paris 1 Panthéon-Sorbonne, Paris, France, 1998.

[2] J. C. Harsanyi, “Games with incomplete information played by ‘Bayesian’ players. I. The basic model,” Management Science, vol. 14, no. 3, pp. 159–182, 1967.

[3] M. Zinkevich, M. Johanson, M. H. Bowling, and C. Piccione, “Regret minimization in games with incomplete information,” Advances in Neural Information Processing Systems, vol. 2008, no. 20, pp. 1729–1736, 2008.

[4] E. Alpaydin, Introduction to Machine Learning, The MIT Press, Cambridge, Mass, USA, 2004.

[5] C. M. Bishop, Pattern Recognition and Machine Learning, Information Science and Statistics, Springer, New York, NY, USA, 2006.

[6] X. G. Zhang, “Introduction to statistical learning theory and support vector machines,” Acta Automatica Sinica, vol. 26, no. 1, pp. 32–42, 2000.

[7] R. J. Hanson, B. Cheeseman, and P. Stutz, Bayesian Classification Theory, NASA Ames Research Center, Artificial Intelligence Research Branch, 1991.

[8] M. Nowak and K. Sigmund, “A strategy of win-stay, lose-shift that outperforms tit-for-tat in the Prisoner’s Dilemma game,” Nature, vol. 364, no. 6432, pp. 56–58, 1993.

[9] D. M. Kreps, P. Milgrom, J. Roberts, and R. Wilson, “Rational cooperation in the finitely repeated prisoners’ dilemma,” Journal of Economic Theory, vol. 27, no. 2, pp. 245–252, 1982.

[10] D. W. K. Yeung, L. A. Petrosyan, and M. C. C. Lee, Dynamic Cooperation: A Paradigm on the Cutting-Edge of Game Theory, China Market Press, 2007.

[11] R. Axelrod, “Effective choice in the prisoner’s dilemma,” Journal of Conflict Resolution, vol. 24, no. 1, pp. 3–25, 1980.

[12] R. Axelrod and W. D. Hamilton, “The evolution of cooperation,” Science, vol. 211, no. 4489, pp. 1390–1396, 1981.

[13] J. H. Miller, “The coevolution of automata in the repeated prisoner’s dilemma,” Journal of Economic Behavior and Organization, vol. 29, no. 1, pp. 87–112, 1996.

[14] W. H. Press and F. J. Dyson, “Iterated Prisoner’s Dilemma contains strategies that dominate any evolutionary opponent,” Proceedings of the National Academy of Sciences of the United States of America, vol. 109, no. 26, pp. 10409–10413, 2012.

[15] H. Lin and C.-X. Wu, “Evolution of strategies based on genetic algorithm in the iterated prisoner’s dilemma on complex networks,” Acta Physica Sinica, vol. 56, no. 8, pp. 4313–4318, 2007.

[16] M. A. Nowak and K. Sigmund, “Evolutionary dynamics of biological games,” Science, vol. 303, no. 5659, pp. 793–799, 2004.

[17] J. W. Weibull, Evolutionary Game Theory, MIT Press, Cambridge, Mass, USA, 1997.

[18] M. A. Nowak and R. M. May, “Evolutionary games and spatial chaos,” Nature, vol. 359, no. 6398, pp. 826–829, 1992.

[19] A. Cardillo, J. Gómez-Gardeñes, D. Vilone, and A. Sánchez, “Co-evolution of strategies and update rules in the prisoner’s dilemma game on complex networks,” New Journal of Physics, vol. 12, no. 10, Article ID 103034, 2010.

[20] W.-B. Du, H.-R. Zheng, and M.-B. Hu, “Evolutionary prisoner’s dilemma game on weighted scale-free networks,” Physica A: Statistical Mechanics and Its Applications, vol. 387, no. 14, pp. 3796–3800, 2008.

[21] H. Ohtsuki, C. Hauert, E. Lieberman, and M. A. Nowak, “A simple rule for the evolution of cooperation on graphs and social networks,” Nature, vol. 441, no. 7092, pp. 502–505, 2006.

[22] Y. Wang and C. Dang, “An evolutionary algorithm for global optimization based on level-set evolution and latin squares,” IEEE Transactions on Evolutionary Computation, vol. 11, no. 5, pp. 579–595, 2007.

[23] Y. Wang, Y.-C. Jiao, and H. Li, “An evolutionary algorithm for solving nonlinear bilevel programming based on a new constraint-handling scheme,” IEEE Transactions on Systems, Man and Cybernetics C: Applications and Reviews, vol. 35, no. 2, pp. 221–232, 2005.

[24] R. Selten and R. Stoecker, “End behavior in sequences of finite Prisoner’s Dilemma supergames: A learning theory approach,” Journal of Economic Behavior and Organization, vol. 7, no. 1, pp. 47–70, 1986.

[25] D. Y. Jiang, Situation Analysis of Double Action Games with Entropy, Science Press, New York, NY, USA, 2010.

[26] P. A. Flach and N. Lachiche, “Naive Bayesian classification of structured data,” Machine Learning, vol. 57, no. 3, pp. 233–269, 2004.


