Maximizing the Chance of Winning in Searching Go Game Trees
Presenter: Ling Zhao
March 16, 2005
Author: Keh-Hsun Chen
Accepted by Information Sciences
Motivation
Traditional approach in Go: maximize territory.
Would it be better to maximize the probability of winning?
Expected territory vs. chance of winning
k groups: group i has value A_i with probability p_i and value -A_i with probability 1-p_i.
(q_i, A'_i) is either (p_i, A_i) or (1-p_i, -A_i).
2^k combinations
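The difference between the two objectives can be illustrated with a minimal sketch that enumerates every combination of group outcomes. It assumes each group is given as a pair (p_i, A_i), that group outcomes are independent, and that "winning" means the total exceeds a threshold (0 by default); the function names and the threshold parameter are hypothetical and do not come from the slides.

```python
from itertools import product


def expected_territory(groups):
    """Expected total value: each group is +A with prob p and -A with prob 1-p."""
    return sum((2 * p - 1) * a for p, a in groups)


def chance_of_winning(groups, threshold=0.0):
    """Probability that the total value exceeds `threshold`.

    Enumerates all 2^k outcome combinations, so it is only practical
    for a small number of groups (assumption: outcomes are independent).
    """
    win = 0.0
    for outcome in product((True, False), repeat=len(groups)):
        prob = 1.0
        total = 0.0
        for positive, (p, a) in zip(outcome, groups):
            prob *= p if positive else 1 - p
            total += a if positive else -a
        if total > threshold:
            win += prob
    return win


# One large risky group and one small, fairly secure group.
groups = [(0.45, 100), (0.9, 20)]
print(expected_territory(groups), chance_of_winning(groups))
```

With these made-up numbers the expected territory is about +6 while the chance of winning is only 0.45, so a position can look favourable by expected territory and still be more likely lost than won.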
Case study: All groups are safe
Territory score is a good predictor of the outcome of the game in endgames.
Less reliable in the opening or middle game.
Major difficulty: measuring no man's lands.
Frontier space points of a block
1. Must be empty points adjacent to the block.
2. Must have an adjacent point that is not the same color as the block.
3. Can be used to measure the openness of the boundary.
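The definition above can be sketched as a small board scan. This is one reading of the two conditions, under stated assumptions: the board is a square list of strings, "." marks an empty point, a block is a maximal set of orthogonally connected stones of one color, and "not the same color as the block" includes empty points as well as opposing stones. All names here are hypothetical.

```python
from collections import deque

EMPTY = "."


def neighbors(r, c, size):
    """Yield the orthogonally adjacent on-board points of (r, c)."""
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < size and 0 <= nc < size:
            yield nr, nc


def block_at(board, r, c):
    """Flood-fill the block of same-colored stones containing (r, c)."""
    color, size = board[r][c], len(board)
    seen, queue = {(r, c)}, deque([(r, c)])
    while queue:
        cr, cc = queue.popleft()
        for nr, nc in neighbors(cr, cc, size):
            if (nr, nc) not in seen and board[nr][nc] == color:
                seen.add((nr, nc))
                queue.append((nr, nc))
    return seen


def frontier_space_points(board, r, c):
    """Empty points adjacent to the block that also touch a non-block-colored point."""
    color, size = board[r][c], len(board)
    frontier = set()
    for br, bc in block_at(board, r, c):
        for er, ec in neighbors(br, bc, size):
            if board[er][ec] != EMPTY:
                continue
            if any(board[nr][nc] != color for nr, nc in neighbors(er, ec, size)):
                frontier.add((er, ec))
    return frontier


board = [
    ".....",
    ".XXX.",
    ".X.X.",
    ".XXX.",
    ".....",
]
print(len(frontier_space_points(board, 1, 1)))
```

In this example the enclosed point at (2, 2) is not a frontier space point because all of its neighbors belong to the block, while the twelve empty points around the outside are.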
Frontier space points
Usually the total number of frontier space points (F) is 0 at the beginning, increases to its peak (about 60) in the middle game, then decreases to 0 at the end. M is the move number.
if (M < 100) {
    if (E_A > 60 + (100 - M)/4) E_W = 1;
    else if (E_A < -60 - (100 - M)/4) E_W = 0;
    else E_W = 0.5 + 0.5 * E_A / (60 + (100 - M)/4);
} else {
    if (E_A > F) E_W = 1;
    else if (E_A < -F) E_W = 0;
    else E_W = 0.5 + 0.5 * E_A / F;
}
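The rule above can be written as a small runnable function. This is a sketch, assuming E_A is the expected territory advantage and E_W the estimated winning probability (the slides do not spell these out), and using real-valued division for (100 - M)/4. The handling of F = 0 is an added assumption: the rule as written would divide by zero when F = 0 and E_A = 0.

```python
def win_probability(e_a, move_number, frontier):
    """Map expected territory advantage e_a to a winning probability in [0, 1].

    Before move 100 the uncertainty bound is 60 + (100 - M) / 4;
    from move 100 on it is the number of frontier space points F.
    """
    if move_number < 100:
        bound = 60 + (100 - move_number) / 4
    else:
        bound = frontier

    if bound <= 0:
        # Assumption (not in the slides): with no open frontier left,
        # the sign of e_a decides and an exact tie counts as 0.5.
        if e_a > 0:
            return 1.0
        if e_a < 0:
            return 0.0
        return 0.5

    if e_a > bound:
        return 1.0
    if e_a < -bound:
        return 0.0
    # Linear interpolation between certain loss and certain win.
    return 0.5 + 0.5 * e_a / bound


print(win_probability(10, 150, 20))
```

For example, a 10-point lead at move 150 with 20 frontier space points left maps to 0.75, while the same lead with no frontier space points left maps to 1.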
Case study: Existence of unsafe groups
k groups: the first k_1 groups are safe, and the rest are unsafe.
Pessimistic evaluation:
Optimistic evaluation:
Battles
An unsafe group's transitive closure of adjacent unsafe groups forms a battle.
Evaluation of one battle
Probabilities p_1, p_2, ..., p_n with sum of 1.
E_W =
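Taking the transitive closure of adjacency amounts to finding connected components among the unsafe groups. A minimal sketch, assuming group identifiers are hashable and adjacency is supplied as a dictionary from each group to the set of groups it touches (both representations are hypothetical, not taken from the slides):

```python
def find_battles(unsafe_groups, adjacent):
    """Partition unsafe groups into battles.

    A battle is a set of unsafe groups linked by chains of adjacency
    that pass only through unsafe groups. Safe groups never join a battle.
    """
    remaining = set(unsafe_groups)
    battles = []
    while remaining:
        start = remaining.pop()
        battle = {start}
        stack = [start]
        while stack:
            group = stack.pop()
            for other in adjacent.get(group, ()):
                if other in remaining:  # unsafe and not yet assigned
                    remaining.remove(other)
                    battle.add(other)
                    stack.append(other)
        battles.append(battle)
    return battles


# "S" is a safe group, so it links nothing together.
adjacent = {
    "a": {"b"},
    "b": {"a", "c"},
    "c": {"b", "S"},
    "d": {"S"},
    "S": {"c", "d"},
}
battles = find_battles({"a", "b", "c", "d"}, adjacent)
print(sorted(sorted(b) for b in battles))
```

Here groups a, b and c form one battle and d forms another, because the only link between c and d runs through the safe group S.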
Multi-battle situation
Combinatorial game model:
G_1 = {A | B}
G = G_1 + G_2 + ... + G_n
Probabilistic combinatorial game (PCG) model:
G_1 = {A_1, p_1, A_2, p_2 | B_1, q_1, B_2, q_2}
G = G_1 + G_2 + ... + G_n
Solution
Minimax based on winning percentage.
Terminal nodes: no branching in the game.
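One possible reading of this search is sketched below; it is not the author's implementation. It assumes each subgame G_i is settled by a single move, that a move by Left yields outcome A_j with probability p_j and a move by Right yields B_j with probability q_j, that players alternate and must move while any subgame remains, and that Left wins when the final total is positive (a zero total is scored as 0.5). The function name and data layout are hypothetical.

```python
from functools import lru_cache


def pcg_win_probability(games, left_to_move=True):
    """Minimax over a sum of one-move probabilistic games.

    games: sequence of (left_options, right_options), where each option
    list holds (outcome_value, probability) pairs summing to 1.
    Returns Left's winning probability under best play by both sides.
    """
    games = tuple((tuple(left), tuple(right)) for left, right in games)

    @lru_cache(maxsize=None)
    def search(remaining, score, left_turn):
        if not remaining:  # terminal node: no branching left in the game
            if score > 0:
                return 1.0
            if score < 0:
                return 0.0
            return 0.5
        best = None
        for i in remaining:
            options = games[i][0] if left_turn else games[i][1]
            rest = tuple(j for j in remaining if j != i)
            # Chance node: average over the probabilistic outcomes of the move.
            value = sum(p * search(rest, score + a, not left_turn)
                        for a, p in options)
            if best is None or (value > best if left_turn else value < best):
                best = value
        return best

    return search(tuple(range(len(games))), 0, left_to_move)


g1 = ([(3, 1.0)], [(-3, 1.0)])               # small but certain
g2 = ([(20, 0.6), (-10, 0.4)], [(-2, 1.0)])  # large gamble for Left
print(pcg_win_probability([g1, g2], left_to_move=True))
```

In this made-up position Left's gamble in g2 has the higher expected score, but moving in g1 first secures the win, so the search based on winning percentage prefers the small certain move.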
Experimental results
The experimental Go Intellect is slightly inferior to the regular version.
Reasons: probabilities and the corresponding outcome scores are difficult to estimate when there are one or more battles.
Solutions: more thorough knowledge engineering and implementation.
Lessons learned
Dynamic modification of weights on some move generators. For example, reduce the weight for attacking moves when far ahead.
Adjust territory evaluation by the probability of winning. For example, if the winning percentage is 99%, add 10 points to the territory score.
Experiments found an incremental increase in performance from the above two techniques.
Conclusions
Right direction for Go.
More concrete experimental results are needed.
Interesting problem in itself, with possible applications in other games like Amazons.
Need better implementation for computing winning probability.