Maximizing the Chance of Winning in Searching Go Game Trees

Presenter: Ling Zhao

March 16, 2005

Author: Keh-Hsun Chen

Accepted by Information Sciences



Motivation

Traditional approach in Go: maximize territory.

Would it be better to maximize the probability of winning?

Expected territory vs. chance of winning

k groups; group i has value A_i with probability p_i, and value -A_i with probability 1 - p_i.

Each combination assigns every group a pair (q_i, A'_i), which is either (p_i, A_i) or (1 - p_i, -A_i).

2^k combinations in total.
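The contrast between the two measures can be sketched by brute-force enumeration of the outcome combinations. This is only an illustration under stated assumptions: the function names are hypothetical, the groups are treated as independent, and "winning" is taken to mean a total score strictly above a threshold (0 by default), none of which the slides specify.

```python
from itertools import product

def expected_territory(groups):
    """groups: list of (p_i, A_i); each group is worth A_i with prob p_i, else -A_i."""
    return sum(p * a + (1 - p) * (-a) for p, a in groups)

def chance_of_winning(groups, threshold=0.0):
    """Sum the probability of every outcome combination whose total beats threshold.

    Assumption: group outcomes are independent, so a combination's probability
    is the product of its q_i values.
    """
    total = 0.0
    # One True/False flag per group: True means the (p_i, A_i) outcome occurred.
    for outcome in product((True, False), repeat=len(groups)):
        prob = 1.0
        score = 0.0
        for (p, a), favourable in zip(groups, outcome):
            prob *= p if favourable else 1 - p
            score += a if favourable else -a
        if score > threshold:
            total += prob
    return total

# Illustrative numbers only, not taken from the talk.
groups = [(0.4, 50), (0.9, 10)]
print(expected_territory(groups), chance_of_winning(groups))
```

The enumeration is exponential in the number of groups, so it is only practical for small positions.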

Case study: All groups are safe

Territory score is a good predictor of the outcome of the game in the endgame.

Less reliable in the opening or middle game.

Major difficulty: measuring no man's land.

Frontier space points of a block

1. Must be an empty point adjacent to the block.

2. Must have an adjacent point that is not the same color as the block.

3. Can be used to measure the openness of the boundary.
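The two conditions above can be sketched as a small board scan. The board encoding, the function names, and the reading of condition 2 (any on-board neighbor that is empty or an opponent stone counts as "not the same color") are assumptions made for illustration; the slides do not give an implementation.

```python
EMPTY = "."

def neighbors(r, c, rows, cols):
    """Yield the on-board orthogonal neighbors of (r, c)."""
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols:
            yield nr, nc

def frontier_space_points(board, block):
    """board: list of equal-length strings; block: set of (row, col) stones of one color."""
    rows, cols = len(board), len(board[0])
    r0, c0 = next(iter(block))
    color = board[r0][c0]
    frontier = set()
    for r, c in block:
        for nr, nc in neighbors(r, c, rows, cols):
            # Condition 1: an empty point adjacent to the block.
            if board[nr][nc] != EMPTY:
                continue
            # Condition 2 (assumed reading): some neighbor of that empty point
            # is not a stone of the block's color.
            if any(board[ar][ac] != color
                   for ar, ac in neighbors(nr, nc, rows, cols)):
                frontier.add((nr, nc))
    return frontier

board = [".X.",
         "XX.",
         "..."]
print(sorted(frontier_space_points(board, {(0, 1), (1, 0), (1, 1)})))
```

Under this reading the corner point (0, 0) is excluded, because all of its on-board neighbors are stones of the block.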

Frontier space points

Usually the total number of frontier space points (F) is 0 at the beginning, increases to its peak (about 60) in the middle game, then decreases to 0 at the end. M is the move number.

if (M < 100) {
    if (E_A > 60 + (100 - M)/4) E_W = 1;
    else if (E_A < -60 - (100 - M)/4) E_W = 0;
    else E_W = 0.5 + 0.5 * E_A/(60 + (100 - M)/4);
} else {
    if (E_A > F) E_W = 1;
    else if (E_A < -F) E_W = 0;
    else E_W = 0.5 + 0.5 * E_A/F;
}
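The rule above translates almost directly into a function. This is a sketch, not the author's code: the function and parameter names are hypothetical, E_A is assumed to be the evaluated territory advantage and E_W the estimated winning probability (the slides do not define either), real-valued division is assumed for (100 - M)/4, and the F = 0 case, which the slide's formula would turn into a division by zero when E_A is also 0, is handled by returning 0.5.

```python
def estimate_win_probability(e_a, move_number, frontier_points):
    """Map a territory advantage e_a to a winning-probability estimate in [0, 1]."""
    if move_number < 100:
        # Early game: the margin shrinks from 85 at move 0 towards 60 at move 100.
        margin = 60 + (100 - move_number) / 4
    else:
        # Later: the number of frontier space points F is the margin.
        margin = frontier_points

    if margin == 0:
        # Assumption: with no frontier left the sign of e_a decides the game,
        # and an exactly even position is scored 0.5.
        if e_a > 0:
            return 1.0
        if e_a < 0:
            return 0.0
        return 0.5

    if e_a > margin:
        return 1.0
    if e_a < -margin:
        return 0.0
    # Linear interpolation between a sure loss (-margin) and a sure win (+margin).
    return 0.5 + 0.5 * e_a / margin

print(estimate_win_probability(40, 20, 0))
print(estimate_win_probability(-5, 150, 10))
```

Folding both branches into a single margin makes it visible that the early-game and late-game rules differ only in how wide the uncertain band is.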

Case study: Existence of unsafe groups

k groups; the first k_1 groups are safe, and the rest are unsafe.

Pessimistic evaluation:

Optimistic evaluation:

Battles

An unsafe group's transitive closure of adjacent unsafe groups forms a battle.

Evaluation of one battle

Probabilities p_1, p_2, ..., p_n with a sum of 1.

E_W =
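Returning to the definition of a battle: taking the transitive closure of adjacency among unsafe groups amounts to finding connected components in a graph whose nodes are the unsafe groups. The sketch below assumes a hypothetical input format (group identifiers plus an adjacency map) and is not taken from the talk.

```python
from collections import deque

def find_battles(unsafe_groups, adjacent):
    """Partition unsafe groups into battles.

    unsafe_groups: iterable of group ids.
    adjacent: dict mapping a group id to the set of ids of adjacent groups
              (safe groups may appear in it; they never join a battle).
    """
    unsafe = set(unsafe_groups)
    seen = set()
    battles = []
    for start in sorted(unsafe):
        if start in seen:
            continue
        # Breadth-first search restricted to unsafe groups.
        battle = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            group = queue.popleft()
            for other in adjacent.get(group, ()):
                if other in unsafe and other not in seen:
                    seen.add(other)
                    battle.add(other)
                    queue.append(other)
        battles.append(battle)
    return battles

adjacent = {"a": {"b", "s"}, "b": {"a"}, "c": {"s"}, "d": set(), "s": {"a", "c"}}
print(find_battles(["a", "b", "c", "d"], adjacent))
```

In the example, "s" stands for a safe group: it touches both "a" and "c" but does not link them into one battle.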

Multi-battle situation

Combinatorial game model:

G_1 = {A | B}

G = G_1 + G_2 + ... + G_n

Probabilistic combinatorial game (PCG) model:

G_1 = {A_1, p_1, A_2, p_2 | B_1, q_1, B_2, q_2}

G = G_1 + G_2 + ... + G_n

Solution

Mini-max based on winning percentage.

Terminal nodes: no branching in the game.
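A minimal sketch of mini-max on winning percentage over a sum of PCGs follows. It rests on several assumptions that the slides do not confirm: each subgame is settled by a single move, the mover's outcome is drawn from that side's (score, probability) list, a node is terminal once every subgame is settled, and Left wins when the final total is strictly positive. The function name and data layout are hypothetical.

```python
from functools import lru_cache

def win_probability(games, left_to_move=True, base_score=0.0):
    """Probability that Left wins under optimal play by both sides.

    games: sequence of (left_outcomes, right_outcomes); each outcomes entry is
           a sequence of (score, probability) pairs whose probabilities sum to 1.
    """
    games = tuple((tuple(left), tuple(right)) for left, right in games)

    @lru_cache(maxsize=None)
    def search(remaining, score, left_turn):
        if not remaining:
            # Terminal node: nothing left to play, so the score decides.
            return 1.0 if score > 0 else 0.0
        values = []
        for pos, i in enumerate(remaining):
            outcomes = games[i][0] if left_turn else games[i][1]
            rest = remaining[:pos] + remaining[pos + 1:]
            # Chance node: average over the probabilistic results of this move.
            values.append(sum(p * search(rest, score + a, not left_turn)
                              for a, p in outcomes))
        # Left maximizes the winning probability, Right minimizes it.
        return max(values) if left_turn else min(values)

    return search(tuple(range(len(games))), base_score, left_to_move)

g1 = (((10, 1.0),), ((-10, 1.0),))
g2 = (((30, 0.5), (0, 0.5)), ((-5, 1.0),))
print(win_probability([g1, g2]))
```

In this example Left should take the certain 10 points in g1 rather than gamble on g2, even though g2 offers the larger expected gain, which is the kind of choice the winning-percentage criterion is meant to capture.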

Experimental results

The experimental Go Intellect is slightly inferior to the regular version.

Reasons: the probabilities and the corresponding outcome scores are difficult to estimate when there are one or more battles.

Solutions: more thorough knowledge engineering and implementation.

Lessons learned

Dynamic modification of weights on some move generators. For example, reduce the weight for attacking moves when far ahead.

Adjust territory evaluation by the probability of winning. For example, if the winning percentage is 99%, add 10 points to the territory score.

Experiments with the above two techniques showed an incremental increase in performance.
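The two techniques can be sketched as small adjustment functions. Only the "99% adds 10 points" rule comes from the slides; the function names, the threshold for being "far ahead", and the damping factor are hypothetical placeholders.

```python
def adjusted_territory(territory, win_prob, bonus_rules=((0.99, 10.0),)):
    """Add the bonus of the highest threshold that win_prob reaches.

    bonus_rules: (threshold, points) pairs. The default holds the single
    example from the slides; any further entries would be the caller's choice.
    """
    bonus = 0.0
    for threshold, points in sorted(bonus_rules):
        if win_prob >= threshold:
            bonus = points
    return territory + bonus

def attack_weight(base_weight, win_prob, ahead_threshold=0.9, damping=0.5):
    """Reduce the weight of attacking moves when far ahead.

    ahead_threshold and damping are hypothetical values for illustration.
    """
    if win_prob >= ahead_threshold:
        return base_weight * damping
    return base_weight

print(adjusted_territory(12.0, 0.99), attack_weight(8.0, 0.95))
```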

Conclusions

Right direction for Go.

Needs more concrete experimental results.

Interesting problem in itself, with possible applications in other games such as Amazons.

Needs a better implementation for computing the winning probability.