[ieee 2012 conference on technologies and applications of artificial intelligence (taai) - tainan,...

Extracting important patterns for building state-action evaluation function in Othello

Huy Nguyen School of Information Science

JAIST Ishikawa, Japan

[email protected]

Kokolo Ikeda School of Information Science

JAIST Ishikawa, Japan

[email protected]

Bac Le Faculty of Information Technology

HCMUS Ho Chi Minh city, Vietnam

[email protected]

Abstract— Important patterns are necessary for building a state-action evaluation function in a board game. A pattern is good if its features are informative. In this paper, we propose a method to evaluate whether a pattern is informative, so that important patterns can be extracted automatically. Patterns must be extracted from a large set of high-quality game records. In our experiments, we apply the method to such a set of Othello game records.

Keywords- Extract patterns; state-action evaluation function; Othello; game records

I. INTRODUCTION

Many board game programs use domain knowledge encoded into patterns. Historically, game Artificial Intelligence programmers studied the target game, chose the patterns whose features they felt were informative, and then tuned the parameters of the features manually. Such parameters may contain errors, have holes, and cannot be easily updated. Today, the parameters can be optimized automatically by many methods if game records are available. The patterns considered in this paper are state-action patterns. The state-action evaluation approach can be used in the game of Go as well as in Othello, especially in Monte Carlo programs. Thus, our main purpose is to extract important patterns for building a state-action evaluation function in Othello. Such knowledge may be used to improve random simulations in Monte-Carlo programs [5].

This paper presents a method to evaluate whether a pattern is important. A pattern, which consists of features, is considered important if the parameters of its features are distributed from very low to very high and the number of features in each interval is the same, that is, uniform. The parameter of a feature represents a characteristic of that feature. Such a pattern is called informative. We propose a method to identify whether the distribution of features is uniform, and a method to identify whether the parameters of features range from very low to very high. With these, important patterns can be extracted automatically by browsing expert game records.

In this paper, the term “pattern” means a set of cells, regardless of whether disks/stones occupy them, in which exactly one designated cell is always the empty position. It is written as {empty position, (cells of pattern)}. In figure 1, we can see 4 patterns: {c2, (a2, b2, c2, d2, e2, f2, g2, h2)}, {c2, (c1, c2, c3, c4, c5, c6, c7, c8)}, {c2, (a4, b3, c2, d1)}, and {c2, (b1, c2, d3, e4, f5, g6, h7)}. The term “feature” means a particular assignment of attributes to the cells of a pattern; each cell has one of 3 attributes: 1 for the Black player, 2 for the White player, and 0 for empty. The maximum number of features in a pattern is 3^(length−1), where “length” is the number of cells in the pattern. For example, the maximum number of features of pattern {c2, (a4, b3, c2, d1)} is 27, and that of pattern {c2, (b1, c2, d3, e4, f5, g6, h7)} is 729. Figure 2 shows two features of two different patterns. A feature carries a lot of information, but the number of times the feature appears in candidate moves and the number of times it appears in selected moves are good characteristics of a feature. Thus, we could consider the ratio of these two counts, equation (2), as a parameter for a feature.
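The feature counts quoted above (27 and 729) can be checked with a few lines of Python. This is only a minimal sketch: the helper name `max_features` is an assumption made for illustration and does not come from the paper.

```python
def max_features(length: int) -> int:
    """Upper bound on the number of features of a pattern with `length` cells.

    One cell is the empty position (its attribute is fixed to 0); each of the
    remaining length - 1 cells can be 0 (empty), 1 (Black) or 2 (White).
    """
    if length < 1:
        raise ValueError("a pattern needs at least one cell")
    return 3 ** (length - 1)


print(max_features(4))  # pattern {c2, (a4, b3, c2, d1)}
print(max_features(7))  # pattern {c2, (b1, c2, d3, e4, f5, g6, h7)}
```

The bound grows by a factor of 3 per added cell, which is why long patterns need many more game records before most of their features are observed.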

Definition 1 (The importance). A pattern is called important if it satisfies three properties: frequent, scatter, and uniform.

Given a pattern and a user threshold of 0.9, suppose that 90% of all possible features of the pattern appear in the game records; this is an example of the frequent property. If the parameters of these features are distributed from very low to very high, for example from 0/1000 to 990/1000, the pattern has the scatter property. If we divide all features into K classes and the number of features in each class is the same, we say that the distribution of features is uniform.

This paper is organized as follows: section 2 discusses related work on patterns in board games, section 3 describes the problem in detail, section 4 presents the experimental results of pattern extraction, section 5 evaluates these results, and section 6 gives the conclusion and future work.

II. RELATED WORK

Patterns have been broadly used in board game programs, some of which use thousands or even millions of them. However, collecting a large set of patterns by hand is difficult and error prone. Thus, many methods have been proposed to extract patterns from professional game records: a pattern extraction scheme for efficiently harvesting patterns of given size and shape by Stern et al. [3]; the use of relative frequencies of local board patterns observed in game records to generate a ranked list of moves by Araki et al. [2]; a neural network approach to generate local moves by van der Werf et al. [9]; a K-nearest-neighbor representation to generate local moves by Bouzy and Chaslot [1]; and automatic acquisition of tactical patterns for eyes or connections by Cazenave [10].

2012 Conference on Technologies and Applications of Artificial Intelligence

978-0-7695-4919-4/12 $26.00 © 2012 IEEE

DOI 10.1109/TAAI.2012.11


Besides, there are methods to optimize the parameters of patterns once the patterns have been observed in game records, such as the Maximum Entropy Method (MEM) [2]; the Bayesian Pattern Ranking method [3], which learns a distribution over the values of a move given a board position based on the local pattern context, for move prediction in the game of Go; and the Bradley-Terry Minorization-Maximization (BTMM) model [6]. The BTMM model can be applied to supervised learning of patterns by considering each sample move as a competition whose winner is the move in question and whose losers are the other legal moves. Each move can be considered as a “team” of features, which allows a large number of such features to be combined without a very high cost [6].

These pattern extraction methods work well for the game of Go, but they are difficult to apply to Othello because only horizontal, vertical, or diagonal patterns are useful in an Othello evaluation function. For optimizing parameters, the BTMM model is well suited to the state-action evaluation approach.

III. PROBLEM DESCRIPTION

State is defined as a pattern attribute vector in cardinal space representing properties of a move, and action as selection of a pattern set [8]. The motivation for extracting important patterns comes from the need to optimize, at a later stage, the parameters of the features of each kind of pattern. Othello is a strategy board game for two players, played on an 8x8 board. There are 64 identical pieces called 'discs', which are light on one side and dark on the other. The current player must place a piece with his side up on the board, in such a position that there exists at least one straight (horizontal, vertical, or diagonal) occupied line between the new piece and another of the current player’s pieces, with one or more contiguous opponent’s pieces between them. Thus, only horizontal, vertical, or diagonal patterns are meaningful in the evaluation function.

Figure 1. Action at position c2 and 4 related patterns.

Figure 2. A feature of pattern having 8 cells.

In figure 2, the feature code is 12202021, where 1 is the Black player, 2 is White, and 0 is empty. There are two empty positions, at indexes 3 and 5. Thus, we have 2 different features, (3, 12202021) and (5, 12202021), of two patterns with the same length.
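As a small illustration of this encoding, the sketch below lists one (empty index, code) pair per empty cell of a feature code. The function name `features_of_code` and the choice of zero-based indexes (which matches the indexes 3 and 5 quoted above) are assumptions made for this example.

```python
def features_of_code(code: str) -> list[tuple[int, str]]:
    """Return one (empty_index, code) pair for every empty cell ('0') in the code."""
    if set(code) - set("012"):
        raise ValueError("cells must be 0 (empty), 1 (Black) or 2 (White)")
    # Indexes are zero-based, counted from the left end of the code.
    return [(index, code) for index, cell in enumerate(code) if cell == "0"]


print(features_of_code("12202021"))
```

Each pair identifies the same disc layout seen from a different candidate move, which is why one board line can contribute features to several patterns.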

Let G be a set of game records. Each record is a list of plies; a ply is a move by one player, and it identifies the selected move and the set of legal moves. Let S be the set of selected moves of G, and L be the set of legal moves of G. Important information for the analysis may be the number of times a feature is selected, called α, and the number of times it is a candidate, called β.

A feature occurs in a legal (candidate) move $m \in L$, where L is the set of legal moves; this is written $\mathrm{occurrence}(\mathit{feature}, m \in L)$. A feature occurs in a selected move $m \in S$, where S is the set of selected moves; this is written $\mathrm{occurrence}(\mathit{feature}, m \in S)$.

$\alpha(\mathit{feature}) := |\mathrm{occurrence}(\mathit{feature}, S)|$ is the number of times that the feature occurs in selected moves of the game records (set S).

$\beta(\mathit{feature}) := |\mathrm{occurrence}(\mathit{feature}, L)|$ is the number of times that the feature occurs in legal moves of the game records (set L).

$$\mathrm{appearance}(\mathit{feature}, G) = \begin{cases} 1 & \text{if } \beta(\mathit{feature}) > 0 \\ 0 & \text{otherwise} \end{cases} \tag{1}$$

$$\frac{\mathrm{occurrence}(\mathit{feature}, S)}{\mathrm{occurrence}(\mathit{feature}, L)} = \frac{\alpha(\mathit{feature})}{\beta(\mathit{feature})} \tag{2}$$

From equation (2), $\alpha(\mathit{feature}) / \beta(\mathit{feature}) \le 1$; this rate is an important characteristic of a feature. If it is near 0, the feature is bad. If it is near 1, the feature is probably good. However, a rate of 1/1 is good but less confident than a rate of 100/100, and a rate of 0/1 is bad but less confident than a rate of 0/100. To increase confidence, we must limit the length of the pattern.
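The two counts and their rate can be gathered in a single pass over the plies. The following is a minimal sketch under stated assumptions: the ply representation (an index of the selected move plus a list of legal moves, each given as a collection of feature identifiers) and the names `count_alpha_beta` and `rate` are hypothetical and only serve the illustration.

```python
from collections import Counter
from fractions import Fraction


def count_alpha_beta(plies):
    """Count alpha (selected) and beta (candidate) occurrences per feature.

    plies: iterable of (selected_index, legal_moves), where legal_moves is a
    list of moves, each move is a collection of hashable feature ids, and
    legal_moves[selected_index] is the move that was actually played.
    """
    alpha, beta = Counter(), Counter()
    for selected_index, legal_moves in plies:
        for index, features in enumerate(legal_moves):
            for feature in set(features):
                beta[feature] += 1           # the feature occurs in a legal move
                if index == selected_index:
                    alpha[feature] += 1      # ... and that move was selected
    return alpha, beta


def rate(alpha, beta, feature):
    """Return alpha/beta as an exact fraction, or None if the feature never appears."""
    if beta[feature] == 0:
        return None
    return Fraction(alpha[feature], beta[feature])


# Toy data: three plies with two legal moves each (feature ids are made up).
plies = [
    (0, [["f1"], ["f2"]]),
    (1, [["f1"], ["f2"]]),
    (0, [["f1", "f3"], ["f2"]]),
]
alpha, beta = count_alpha_beta(plies)
for feature in ("f1", "f2", "f3"):
    print(feature, rate(alpha, beta, feature), "from", beta[feature], "candidate moves")
```

In the toy data the feature `f3` reaches the rate 1 from a single observation, which is exactly the low-confidence situation described above.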

Definition 2 (The frequency). The appearance_ratio of a pattern in G is the number of features of the pattern that appear in G divided by the feature cardinality of the pattern.

$$\mathrm{appearance\_ratio}(\mathit{pattern}, G) := \frac{\sum_{\mathit{feature} \in \mathit{pattern}} \mathrm{appearance}(\mathit{feature}, G)}{|\mathit{features}|} \tag{3}$$

|features| is the feature cardinality of the pattern. If the pattern length is too long, the appearance_ratio will be small. Obviously, $\beta(\mathit{feature})$ of its features is small, too, which means that the rate α/β of the features is not confident. Thus, we can use equation (3) to limit the pattern length. This is the relation between equations (2) and (3).
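A minimal sketch of this ratio follows, assuming the feature cardinality of a pattern is 3^(length−1) as stated in the introduction. The function name `appearance_ratio` and the constant `DELTA` are illustrative choices; the two counts are taken from Table II.

```python
def appearance_ratio(appearing: int, pattern_length: int) -> float:
    """Fraction of the possible features of a pattern that appear in the records."""
    possible = 3 ** (pattern_length - 1)   # feature cardinality of the pattern
    if not 0 <= appearing <= possible:
        raise ValueError("appearing features must lie between 0 and 3^(length-1)")
    return appearing / possible


DELTA = 0.9  # appearance-ratio threshold used in the experiments

# (label, appearing features, pattern length); counts copied from Table II.
rows = [
    ("{C2, (A2,...,H2)}", 2109, 8),
    ("{C2, (B1,...,H7)}", 371, 7),
]
for label, appearing, length in rows:
    ratio = appearance_ratio(appearing, length)
    print(label, round(ratio, 2), "passes" if ratio >= DELTA else "fails")
```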

Remark (The monotonicity). Given a set of game records G, let X and Y be two patterns. Then,

$$X \subseteq Y \Rightarrow \mathrm{appearance\_ratio}(Y, G) \le \mathrm{appearance\_ratio}(X, G).$$

This is useful for pruning patterns in two cases. In case 1, if a pattern does not satisfy the threshold, its super-patterns need not be computed. In case 2, if a pattern satisfies the threshold, its sub-patterns must satisfy it as well.
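Case 1 of the remark can be sketched as an early exit while walking a chain of nested patterns from shortest to longest. The helper `prune_chain` and the lookup table `RATIOS` are assumptions made for this illustration; the ratios themselves are the appearing/possible counts from Table II.

```python
# Appearance ratios at position C2, copied from Table II (appearing / possible).
RATIOS = {
    ("B1", "C2", "D3", "E4"): 18 / 27,
    ("B1", "C2", "D3", "E4", "F5"): 54 / 81,
    ("B1", "C2", "D3", "E4", "F5", "G6"): 151 / 243,
    ("A2", "B2", "C2", "D2"): 27 / 27,
    ("A2", "B2", "C2", "D2", "E2"): 81 / 81,
    ("A2", "B2", "C2", "D2", "E2", "F2"): 243 / 243,
}


def prune_chain(chain, ratio_of, delta):
    """Walk nested patterns (shortest first) and stop at the first failure.

    Returns (kept_patterns, number_of_ratios_evaluated). Once a pattern falls
    below delta, monotonicity says every longer super-pattern fails as well,
    so the remaining ratios are never computed.
    """
    kept, evaluated = [], 0
    for pattern in chain:
        evaluated += 1
        if ratio_of(pattern) < delta:
            break
        kept.append(pattern)
    return kept, evaluated


diagonal = [p for p in RATIOS if p[0] == "B1"]   # nested diagonal patterns
row = [p for p in RATIOS if p[0] == "A2"]        # nested horizontal patterns

print(prune_chain(diagonal, RATIOS.get, 0.9))
print(prune_chain(row, RATIOS.get, 0.9))
```

On the diagonal chain only one ratio is evaluated before the walk stops, whereas every pattern of the horizontal chain is kept.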

Definition 3 (The uniformity). A pattern is uniform if the distribution of its features is the same in all observed classes (classes formed by observing one characteristic). If the uniformity of the pattern is smaller than or equal to the threshold ϒ, the pattern is uniform.

$$\mathrm{uniformity}(\mathit{pattern}, G) := \frac{1}{K} \sum_{i=1}^{K} \frac{(t_i - w)^2}{t_i} \tag{4}$$

Here $N = \sum_{\mathit{feature} \in \mathit{pattern}} \mathrm{appearance}(\mathit{feature}, G)$ is the number of appearing features, and $K = \lceil \log_2 N \rceil$ is the number of classes into which the features are divided. The probability range is from 0 to 1, so each class has width 1/K. $w := N/K$ is the number of features whose parameters belong to distribution class i in theory, and $t_i$ is the number of features whose parameters belong to distribution class i in practice. The threshold ϒ is taken from the Chi-square (χ2) table at row (K−1) and column P = 0.001. This is a statistical validation method for distributions.

If the parameters are distributed continuously from very low to very high, for example from 0/1000 to 990/1000, the features are highly informative and well suited to a state-action evaluation function or to later optimization.

Definition 4 (The scatter). The scatter of a pattern is the number of parameter distribution classes that contain features, divided by the maximum number of classes.

$$\mathrm{scatter}(\mathit{pattern}, G) := \frac{\mathrm{count}_{i=1}^{K}\{\, t_i \mid t_i > 0 \,\} + 1}{K} \tag{5}$$

In other words, the scatter identifies how many of the K classes contain features. If the scatter is 1, the distribution of parameters is continuous from 0 to 1. The constant 1 in equation (5) means that we always have one special class in which the parameters of features are 0. A pattern is considered if its scatter is greater than or equal to the threshold μ. The scatter value ranges from 0 to 1.

Example 1. If we have 15 features, then $K = \lceil \log_2 15 \rceil = 4$. Each class covers an interval of width ¼ = 0.25: class 1 is from 0 to 0.249, class 2 is from 0.25 to 0.499, class 3 is from 0.5 to 0.749, and class 4 is from 0.75 to 0.999.

In case (a), the parameters are 0, 1/15, 2/9, 2/5, 1/5, 8/9, 3/5, 4/9, 11/13, 10/13, 1/3, ¼, 4/7, 5/7, and 9/10. The classes are divided as follows. Class 1: 0, 1/15, 2/9, and 1/5. Class 2: 2/5, 4/9, 1/3, and ¼. Class 3: 3/5, 4/7, and 5/7. Class 4: 8/9, 11/13, 10/13, and 9/10. This case satisfies both the uniform and the scatter properties.

In case (b), the parameters are 0, 1/15, 1/14, 1/13, 2/15, 2/17, 3/7, 7/12, 8/13, 13/24, 15/26, 14/27, 15/27, 3/5, and 9/10. The classes are divided as follows. Class 1: 0, 1/15, 1/14, 1/13, 2/15, and 2/17. Class 2: 3/7. Class 3: 7/12, 8/13, 13/24, 15/26, 14/27, 15/27, and 3/5. Class 4: 9/10. This case satisfies the scatter property but is not uniform.

Thus, we can say that case (a) is better than case (b) in this example.
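The class assignment of Example 1 can be reproduced with exact fractions. The sketch below makes two simplifying assumptions that should be read as such: deviation from uniformity is measured with the standard Pearson chi-square statistic, sum of (t_i − w)²/w, used here as a stand-in that may be normalized differently from equation (4); and the scatter is reported simply as the fraction of non-empty classes, without the constant of equation (5). The function names are hypothetical.

```python
import math
from fractions import Fraction as F


def class_counts(params):
    """Count how many parameters fall into each of K = ceil(log2(N)) equal-width classes."""
    k = math.ceil(math.log2(len(params)))
    counts = [0] * k
    for p in params:
        index = min(int(p * k), k - 1)   # a parameter equal to 1 goes to the last class
        counts[index] += 1
    return counts


def pearson_statistic(counts):
    """Standard Pearson statistic against equal expected counts w = N / K."""
    w = F(sum(counts), len(counts))
    return sum((t - w) ** 2 / w for t in counts)


def nonempty_fraction(counts):
    """Fraction of classes that contain at least one feature."""
    return F(sum(1 for t in counts if t > 0), len(counts))


case_a = [F(0), F(1, 15), F(2, 9), F(2, 5), F(1, 5), F(8, 9), F(3, 5), F(4, 9),
          F(11, 13), F(10, 13), F(1, 3), F(1, 4), F(4, 7), F(5, 7), F(9, 10)]
case_b = [F(0), F(1, 15), F(1, 14), F(1, 13), F(2, 15), F(2, 17), F(3, 7), F(7, 12),
          F(8, 13), F(13, 24), F(15, 26), F(14, 27), F(15, 27), F(3, 5), F(9, 10)]

for name, params in (("a", case_a), ("b", case_b)):
    counts = class_counts(params)
    print(name, counts, float(pearson_statistic(counts)), nonempty_fraction(counts))
```

Both cases fill all four classes, but the counts of case (a) stay close to the expected 3.75 per class, while case (b) piles its features into classes 1 and 3, which is the difference the example describes.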

Problem (Finding important patterns). Given a set of game records G, a position on the game board, and a threshold δ, find the patterns whose appearance_ratio is greater than or equal to δ. Then check whether the scatter of each such pattern is greater than or equal to the threshold μ, and whether its uniformity is smaller than or equal to the threshold ϒ. Patterns that satisfy all of these conditions are important.
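The three conditions can be written as a single predicate. This is a hedged sketch rather than the algorithm of figure 3: it follows Definition 3 (uniformity at most the Chi-square value) and Definition 4 (scatter at least μ), the `PatternStats` container and `is_important` function are hypothetical names, and the first two data rows are copied from Tables II and III while the third is invented to show a passing case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatternStats:
    name: str
    appearance_ratio: float
    scatter: float
    uniformity: float
    chi_square: float   # critical value looked up for this pattern


def is_important(p: PatternStats, delta: float, mu: float) -> bool:
    """Frequent, scattered and uniform, in that order."""
    return (p.appearance_ratio >= delta
            and p.scatter >= mu
            and p.uniformity <= p.chi_square)


candidates = [
    PatternStats("{B2,C2,D2,E2,F2,G2,H2}", 0.99, 1.0, 4392.5, 27.88),   # Tables II, III
    PatternStats("{C1,C2,C3,C4}", 1.0, 0.4, 0.035, 10.83),              # Tables II, III
    PatternStats("hypothetical pattern", 0.95, 0.9, 12.0, 26.13),       # made-up values
]
for p in candidates:
    print(p.name, is_important(p, delta=0.9, mu=0.8))
```

The first pattern fails only the uniformity test and the second fails only the scatter test, which illustrates how a pattern can be frequent without being important.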

Figure 3. Algorithm for finding out important patterns

Figure 4. Uniform function based on equation (4)

Figure 5. Scatter function based on equation (5)

Figure 3 gives pseudo-code for the algorithm that finds important patterns. Its input parameters are the game records, a given position on the game board, the appearance-ratio threshold, and the scatter threshold. The uniformity is calculated and compared with the Chi-square value taken from table 1, and the scatter is calculated and compared with the scatter threshold. The pattern variable represents a kind of pattern; each kind of pattern has 3^(patternLength − 1) features and an appearRatio attribute. Each feature has an important characteristic, the rate α/β, which is represented by the param attribute of the feature in the UNIFORM and SCATTER functions (see figures 4, 5). The appearRatio attribute is calculated by equation (3): it is the number of appearing features divided by the feature cardinality of the pattern.

TABLE I. TABLE OF CHI-SQUARE STATISTICS

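| df | P = 0.05 | P = 0.01 | P = 0.001 | df | P = 0.05 | P = 0.01 | P = 0.001 |
|---|---|---|---|---|---|---|---|
| 1 | 3.84 | 6.64 | 10.83 | 8 | 15.51 | 20.09 | 26.13 |
| 2 | 5.99 | 9.21 | 13.82 | 9 | 16.92 | 21.67 | 27.88 |
| 3 | 7.82 | 11.35 | 16.27 | 10 | 18.31 | 23.21 | 29.59 |
| 4 | 9.49 | 13.28 | 18.47 | 11 | 19.68 | 24.73 | 31.26 |
| 5 | 11.07 | 15.09 | 20.52 | 12 | 21.03 | 26.22 | 32.91 |
| 6 | 12.59 | 16.81 | 22.46 | 13 | 22.36 | 27.69 | 34.53 |
| 7 | 14.07 | 18.48 | 24.32 | 14 | 23.69 | 29.14 | 36.12 |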

TABLE II. PATTERNS SATISFYING δ IN POSITION C2.

| No | Candidate pattern | Appear features | No. of features in pattern | Ratio |
|---|---|---|---|---|
| 1 | {C2, (D1,C2,B3,A4)} | 27 | 27 | 1 |
| 2 | {C2, (B1,C2,D3,E4)} | 18 | 27 | 0.67 |
| 3 | {C2, (B1,C2,D3,E4,F5)} | 54 | 81 | 0.67 |
| 4 | {C2, (B1,C2,D3,E4,F5,G6)} | 151 | 243 | 0.62 |
| 5 | {C2, (B1,C2,D3,E4,F5,G6,H7)} | 371 | 729 | 0.51 |
| 6 | {C2, (C2,D3,E4,F5)} | 18 | 27 | 0.67 |
| 7 | {C2, (C2,D3,E4,F5,G6)} | 54 | 81 | 0.67 |
| 8 | {C2, (C2,D3,E4,F5,G6,H7)} | 151 | 243 | 0.62 |
| 9 | {C2, (A2,B2,C2,D2)} | 27 | 27 | 1 |
| 10 | {C2, (B2,C2,D2,E2)} | 27 | 27 | 1 |
| 11 | {C2, (C2,D2,E2,F2)} | 27 | 27 | 1 |
| 12 | {C2, (A2,B2,C2,D2,E2)} | 81 | 81 | 1 |
| 13 | {C2, (A2,B2,C2,D2,E2,F2)} | 243 | 243 | 1 |
| 14 | {C2, (A2,B2,C2,D2,E2,F2,G2)} | 729 | 729 | 1 |
| 15 | {C2, (A2,B2,C2,D2,E2,F2,G2,H2)} | 2109 | 2187 | 0.96 |
| 16 | {C2, (B2,C2,D2,E2,F2)} | 81 | 81 | 1 |
| 17 | {C2, (B2,C2,D2,E2,F2,G2)} | 243 | 243 | 1 |
| 18 | {C2, (B2,C2,D2,E2,F2,G2,H2)} | 728 | 729 | 0.99 |
| 19 | {C2, (C2,D2,E2,F2,G2)} | 81 | 81 | 1 |
| 20 | {C2, (C2,D2,E2,F2,G2,H2)} | 243 | 243 | 1 |
| 21 | {C2, (C1,C2,C3,C4)} | 27 | 27 | 1 |
| 22 | {C2, (C2,C3,C4,C5)} | 27 | 27 | 1 |
| 23 | {C2, (C2,C3,C4,C5,C6)} | 81 | 81 | 1 |
| 24 | {C2, (C1,C2,C3,C4,C5)} | 81 | 81 | 1 |
| 25 | {C2, (C1,C2,C3,C4,C5,C6)} | 243 | 243 | 1 |
| 26 | {C2, (C1,C2,C3,C4,C5,C6,C7)} | 725 | 729 | 0.99 |
| 27 | {C2, (C2,C3,C4,C5,C6,C7)} | 243 | 243 | 1 |
| 28 | {C2, (C2,C3,C4,C5,C6,C7,C8)} | 729 | 729 | 1 |
| 29 | {C2, (C1,C2,C3,C4,C5,C6,C7,C8)} | 2008 | 2187 | 0.92 |

Example 2. Given 480,000 game records, position C2 on the Othello board, and threshold δ = 90%, table 2 shows all patterns satisfying threshold.

As we can see in table 2, the appearance_ratio of pattern {B1, C2, D3, E4} is 0.67, so the appearance_ratio(s) of its super-patterns are no larger: {B1, C2, D3, E4, F5} is 0.667, {B1, C2, D3, E4, F5, G6} is 0.62, and {B1, C2, D3, E4, F5, G6, H7} is 0.51.

Likewise, the appearance_ratio of pattern {B1, C2, D3, E4, F5} is 0.667, so the appearance_ratio of pattern {B1, C2, D3, E4, F5, G6} cannot be larger. Thus, the remark can be used for pruning unnecessary patterns. To decide whether a pattern is uniform, we compare its uniformity with χ2, which is taken from column P = 0.001 at row (df−1) of table 1; the scatter threshold is μ = 0.8. Table 3 shows the uniformity, the scatter, and the Chi-square value corresponding to each pattern. Unfortunately, in this example no pattern satisfies all the conditions of the problem.

However, we can try more game records, for example from 500,000 to 1,000,000 game records, or reduce the threshold μ, or find important patterns by ranking when patterns have the same scatter. With the scatter threshold at 0.8 and the ranking method, we can select pattern {B2, C2, D2, E2, F2, G2, H2} or pattern {B2, C2, D2, E2, F2, G2}. Of these two, we select only one, because {B2, C2, D2, E2, F2, G2, H2} is a super-pattern of {B2, C2, D2, E2, F2, G2}.

TABLE III. VALUES OF UNIFORMITY, SCATTER, AND CHI-SQUARE.

| No | Candidate pattern | Uniformity | Scatter | df | χ2 |
|---|---|---|---|---|---|
| 1 | {D1,C2,B3,A4} | 0.92 | 0.4 | 2 | 10.83 |
| 2 | {A2,B2,C2,D2} | 13.32 | 0.6 | 3 | 13.82 |
| 3 | {B2,C2,D2,E2} | 33.26 | 0.6 | 3 | 13.82 |
| 4 | {C2,D2,E2,F2} | 3.284 | 0.2 | 2 | 10.83 |
| 5 | {A2,B2,C2,D2,E2} | 26.05 | 0.571 | 4 | 16.27 |
| 6 | {A2,B2,C2,D2,E2,F2} | 1128.02 | 0.75 | 6 | 20.52 |
| 7 | {A2,B2,C2,D2,E2,F2,G2} | 4796.2 | 0.9 | 9 | 26.13 |
| 8 | {B2,C2,D2,E2,F2} | 8.6054 | 0.428 | 3 | 13.82 |
| 9 | {B2,C2,D2,E2,F2,G2} | 1682.5 | 0.875 | 7 | 22.46 |
| 10 | {C2,D2,E2,F2,G2} | 10.267 | 0.285 | 2 | 10.83 |
| 11 | {C2,D2,E2,F2,G2,H2} | 107.47 | 0.5 | 4 | 16.27 |
| 12 | {B2,C2,D2,E2,F2,G2,H2} | 4392.5 | 1 | 10 | 27.88 |
| 13 | {A2,B2,C2,D2,E2,F2,G2,H2} | 11650.5 | 1 | 12 | 31.26 |
| 14 | {C1,C2,C3,C4} | 0.035 | 0.4 | 2 | 10.83 |
| 15 | {C2,C3,C4,C5} | 5.125 | 0.2 | 2 | 10.83 |
| 16 | {C1,C2,C3,C4,C5} | 13.6 | 0.428 | 3 | 13.82 |
| 17 | {C2,C3,C4,C5,C6} | 30.767 | 0.428 | 3 | 13.82 |
| 18 | {C1,C2,C3,C4,C5,C6} | 1209.8 | 0.625 | 5 | 18.47 |
| 19 | {C2,C3,C4,C5,C6,C7} | 144.32 | 0.5 | 4 | 16.27 |
| 20 | {C1,C2,C3,C4,C5,C6,C7} | 3837.45 | 0.9 | 9 | 27.13 |
| 21 | {C2,C3,C4,C5,C6,C7,C8} | 9243.3 | 0.8 | 8 | 24.32 |
| 22 | {C1,C2,C3,C4,C5,C6,C7,C8} | 17521.1 | 1 | 11 | 29.59 |

IV. EXPERIMENTS

Extraction was performed on game records played by strong players on the site of Michael Buro [7], author of the Logistello program. These game records were downloaded from http://skatgame.net/mburo/ggs/game-archive/Othello. The data set consists of 480,000 games. The ELO ratings of the players in these games may range from 1200 to 2600, which is strong enough for extracting important patterns and for training other Othello programs. These game records have the advantage of being publicly available for free.


In Othello, the current player must place a piece with his side up on the board, in such a position that there exists at least one straight (horizontal, vertical, or diagonal) occupied line between the new piece and another of the current player’s pieces. Thus, only horizontal, vertical, or diagonal patterns are meaningful in the evaluation function; other pattern shapes are not considered in the experiment. Because of the symmetry of the game board, we only consider 9 positions in one eighth of the board: a1, b1, c1, d1, b2, c2, d2, c3, d3. The results then apply symmetrically to the other eighths.
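The reduction to one eighth of the board can be sketched by folding a square across the two axes and one diagonal of the board. The function name `canonical_square` is an assumption made for illustration. Folding all 64 squares yields ten classes; the tenth, d4, contains only the four central squares, which are occupied in the starting position and therefore never serve as an empty position, presumably the reason the list above has nine entries.

```python
def canonical_square(square: str) -> str:
    """Map a square such as 'g2' to its representative in the a1-d1-d4 triangle."""
    col = ord(square[0].lower()) - ord("a")
    row = int(square[1:]) - 1
    if not (0 <= col < 8 and 0 <= row < 8):
        raise ValueError(f"not a square of the 8x8 board: {square!r}")
    x = min(col, 7 - col)        # fold left/right
    y = min(row, 7 - row)        # fold top/bottom
    low, high = sorted((x, y))   # fold across the main diagonal
    return f"{'abcd'[high]}{low + 1}"


for square in ("h8", "g2", "a3", "f6", "e2"):
    print(square, "->", canonical_square(square))
```

A full implementation would apply the same reflection to every cell of a pattern, not just to the empty position; that step is omitted here.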

From the data set above, we find patterns under the following conditions: the pattern length is from 4 to 8, and the threshold δ = 0.9, meaning that the appearance_ratio of candidates must be greater than or equal to 0.9. The threshold μ = 0.8 means that we only consider patterns whose factors α/β are scattered such that the patterns are about eighty percent informative. Then, the uniformity(s) of the patterns are calculated and compared with the Chi-square table at column P = 0.001. Unfortunately, for almost all patterns that satisfy the thresholds δ and μ, the uniformity is larger than the value in the Chi-square table, so these patterns are not uniform. Thus, the resulting patterns are identified by ranking among patterns that have the same scatter. When two patterns have a parent-child relationship, for example {B2,C2,D2,E2,F2,G2} and {B2,C2,D2,E2,F2,G2,H2}, we select only one. The final result of the experiment is shown in table 4.

TABLE IV. IMPORTANT PATTERNS IN GAME RECORDS

| No | Pos | Important patterns |
|---|---|---|
| 1 | A1 | {a1, b1, c1, d1, e1, f1, g1}, {a1, a2, a3, a4, a5, a6, a7} |
| 2 | B1 | {a1, b1, c1, d1, e1, f1, g1, h1}, {b1, b2, b3, b4, b5, b6, b7} |
| 3 | C1 | {a1, b1, c1, d1, e1, f1, g1, h1}, {c1, c2, c3, c4, c5, c6, c7} |
| 4 | D1 | {a1, b1, c1, d1, e1, f1, g1, h1} |
| 5 | B2 | {a2, b2, c2, d2, e2, f2, g2, h2}, {b1, b2, b3, b4, b5, b6, b7, b8} |
| 6 | C2 | {a2, b2, c2, d2, e2, f2, g2, h2}, {c1, c2, c3, c4, c5, c6, c7, c8} |
| 7 | D2 | {a2, b2, c2, d2, e2, f2, g2, h2}, {c1, d2, e3, f4, g5, h6} |
| 8 | C3 | {a3, b3, c3, d3, e3, f3, g3}, {c1, c2, c3, c4, c5, c6, c7} |
| 9 | D3 | {c3, d3, e3, f3, g3}, {f1, e2, d3, c4, b5, a6} |

V. EVALUATION

After extracting important patterns for each position on the game board, we need to evaluate whether these patterns are actually useful for a state-action evaluation function. The method is as follows: (1) Find a group of not important patterns, a group of bad patterns, and a group of good patterns. (2) Optimize the parameters of the features in the three groups with the BTMM model [6], and compare the log-likelihoods of the three groups on testing data.

Suppose that the result group of the experiment is good; the other groups consist of patterns whose appearance_ratio(s), scatter(s), and uniformity(s) are lower than those of the good group. Table 5 lists the patterns of the not important group; these patterns have the same appearance_ratio as those of table 4, but their uniformity and scatter are worse. Unfortunately, almost all patterns in table 5 are sub-patterns of patterns in table 4. Table 6 lists the low frequent patterns, whose appearance_ratio(s) are smaller than 0.65. By coincidence, none of these patterns is a sub-pattern of a pattern in table 4.

For the three groups, we optimize the parameters of the features in the patterns from the set of game records with the BTMM model [6]. Equation (6) is used to update the parameters of features, and equation (7) to evaluate the experimental results.

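$$\gamma_i = \frac{W_i}{\sum_{j=1}^{N} \dfrac{C_{ij}}{E_j}} \tag{6}$$

$$\text{aver log-likelihood} = \frac{\sum_{i=1}^{M} \log(\mathrm{prob}(p_i))}{M} \tag{7}$$

$$\mathrm{prob}(p) = \frac{\mathrm{strong}(p)}{E_j} \tag{8}$$

$$\mathrm{strong}(p) = \prod_{i \in p} \gamma_i \tag{9}$$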

where $\gamma_i$ is the parameter of feature i, $W_i = \alpha(\mathit{feature}_i)$, $C_{ij}$ is the product of the parameters of all features that are teammates of feature i in the current state j of the game board, and $E_j$ is the sum of the strengths of the positions at the current state j. The strength of a position is the product of the parameters of all features in the position. N is the number of positions at the current state j. M is the number of moves in all game records.
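To make the update concrete, the sketch below applies the rule γ_i = W_i / Σ_j (C_ij / E_j) to a toy data set and reports the average log-likelihood before and after. It is a simplified illustration, not the implementation used in the experiments: all parameters are updated at once from the previous values, no prior or regularization is applied (so a feature that never wins would be driven to 0), and the data layout and function names are assumptions.

```python
import math


def strength(team, gamma):
    """Strength of a move: the product of the parameters of its features."""
    return math.prod(gamma[f] for f in team)


def mm_update(competitions, gamma):
    """One simultaneous update gamma_i <- W_i / sum_j (C_ij / E_j).

    competitions: list of (winner_index, teams); each team is a tuple of
    distinct feature ids and teams[winner_index] is the selected move.
    """
    wins = {f: 0 for f in gamma}
    denominator = {f: 0.0 for f in gamma}
    for winner_index, teams in competitions:
        e_j = sum(strength(team, gamma) for team in teams)
        for f in teams[winner_index]:
            wins[f] += 1                               # W_i
        for team in teams:
            s = strength(team, gamma)
            for f in team:
                denominator[f] += (s / gamma[f]) / e_j   # C_ij / E_j
    return {f: wins[f] / denominator[f] if denominator[f] > 0 else gamma[f]
            for f in gamma}


def average_log_likelihood(competitions, gamma):
    """Mean log-probability assigned to the selected moves."""
    total = 0.0
    for winner_index, teams in competitions:
        e_j = sum(strength(team, gamma) for team in teams)
        total += math.log(strength(teams[winner_index], gamma) / e_j)
    return total / len(competitions)


# Toy data: feature "A" is selected over "B" three times out of four.
games = [(0, [("A",), ("B",)])] * 3 + [(1, [("A",), ("B",)])]
gamma = {"A": 1.0, "B": 1.0}
before = average_log_likelihood(games, gamma)
gamma = mm_update(games, gamma)
after = average_log_likelihood(games, gamma)
print(gamma, before, after)
```

For this two-feature case the ratio of the updated parameters equals the ratio of wins, and the average log-likelihood increases, which is the quantity plotted in the learning and testing curves.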

TABLE V. NOT IMPORTANT PATTERNS

| No | Pos | Not important patterns |
|---|---|---|
| 1 | A1 | {a1, b1, c1, d1, e1, f1}, {a1, a2, a3, a4, a5, a6} |
| 2 | B1 | {a1, b1, c1, d1, e1, f1}, {b1, b2, b3, b4, b5, b6} |
| 3 | C1 | {a1, b1, c1, d1, e1, f1}, {c1, d2, e3, f4, g5, h6} |
| 4 | D1 | {a1, b1, c1, d1, e1, f1}, {d1, e2, f3, g4, h5} |
| 5 | B2 | {a2, b2, c2, d2, e2, f2, g2}, {b1, b2, b3, b4, b5, b6, b7} |
| 6 | C2 | {a4, b3, c2, d1}, {c1, c2, c3, c4, c5, c6} |
| 7 | D2 | {a2, b2, c2, d2, e2, f2}, {a5, b4, c3, d2, e1} |
| 8 | C3 | {a5, b4, c3, d2, e1}, {c1, c2, c3, c4, c5, c6} |
| 9 | D3 | {a3, b3, c3, d3, e3}, {b5, c4, d3, e2, f1} |

TABLE VI. LOW FREQUENT PATTERNS

| No | Pos | Low frequent patterns |
|---|---|---|
| 1 | A1 | {a1, b2, c3, d4, e5} |
| 2 | B1 | {b1, c2, d3, e4, f5, g6, h7} |
| 3 | C1 | { } |
| 4 | D1 | {d1, d2, d3, d4, d5, d6, d7, d8} |
| 5 | B2 | {b2, c3, d4, e5, f6, g7} |
| 6 | C2 | {b1, c2, d3, e4, f5, g6, h7} |
| 7 | D2 | {d2, d3, d4, d5, d6, d7} |
| 8 | C3 | {a1, b2, c3, d4, e5} |
| 9 | D3 | {c2, d3, e4, f5, g6, h7} |

We use BTMM to optimize the parameters of the features in the groups of patterns on a set of 97,000 game records, and test on 3,000 game records. Each optimization runs 30 loops. The learning and testing curves plot the average log-likelihood, which measures the average probability assigned to the selected action. In figure 6, the testing curve of the important group is better than those of the not important group and the bad pattern group, and the testing curve of the not important group is better than that of the bad pattern group. We can see that the patterns in tables 5, 6 are sub-patterns of table 4. In other words, the patterns in tables 5 and 6 are shorter than those of table 4, which may look unfair to readers. Thus, we also compare three groups in which the patterns have the same length (see figure 7). In this case, the testing curve of a group is good if the measures of the group are good: the measures of the “better group” are better than those of the “initial group”, and the measures of the “initial group” are better than those of the “worse group”.

VI. CONCLUSION AND FUTURE WORK

This research presented a method to identify whether a pattern is important. With it, we can automatically extract important patterns from a large set of game records. In practice, it is rare to find a pattern whose feature factors α/β are distributed uniformly. However, we can extend the method by ranking patterns that have the same scatter. The method helps programmers narrow the pattern space of a position and then select the important patterns by hand. In this way, we can also ignore patterns that are not necessary for the state-action evaluation function. In the experiments, the patterns obtained by this method are important. We can see that the performance of the group including important patterns and other patterns is the same as that of the important group (see figure 7). Thus, we can use only the important patterns instead of all patterns.

We have only run experiments on Othello. This method may also be usable in other board games. We plan to experiment with more groups of patterns and more game records, and to find a suitable way of ranking patterns. The way of evaluating the experimental results is general for every case. Besides, the trade-off between multiple patterns should also be considered, because a move involves multiple patterns, not only one.

REFERENCES

[1] Bouzy, B., and Chaslot, G., Bayesian generation and integration of K-nearest-neighbor patterns for 19x19 Go. In G. Kendall and Simon Lucas, editors, IEEE Symposium on Computational Intelligence in Games, Colchester, UK, pp. 176–181, (2005).

[2] Araki, N., Yoshida, K., Tsuruoka, Y., and Tsujii, J., Move prediction in Go with the maximum entropy method. In Proceedings of the 2007 IEEE Symposium on Computational Intelligence and Games, (2007).

[3] Stern, D., Herbrich, R., and Graepel, T., Bayesian pattern ranking for move prediction in the game of Go. In Proceedings of the 23rd international conference on Machine learning, Pittsburgh, Pennsylvania, USA, pp. 873–880, (2006).

[4] Franck, G., Moyo Go Studio. http://www.moyogo.com/, (2007).

[5] Gelly, S., Wang, Y., Munos, R., and Teytaud, O., Modification of UCT with patterns in Monte-Carlo Go. Technical Report RR-6062, INRIA, (2006).

[6] Coulom, R., Computing Elo Ratings of Move Patterns in the Game of Go. In Computer Games Workshop, Amsterdam, Netherlands, (2007).

[7] http://skatgame.net/mburo/ggs/game-archive/Othello (2012).

[8] Liu, F., and Su, J., An Online Feature Learning Algorithm Using HCI-Based Reinforcement Learning. LNCS 3173, pp. 293–298, (2004).

[9] van der Werf, E., Uiterwijk, J., Postma, E., and van den Herik, J., Local move prediction in Go. Computers and Games, volume 2883 of Lecture Notes in Computer Science, pp. 393–412. Springer, (2002).

[10] Cazenave, T., Automatic acquisition of tactical go rules. In 3rd Game Programming Workshop in Japan, pp. 10–19, Hakone, Japan, (1996).

Figure 6. Learning and testing curves of three groups of patterns

Figure 7. Learning and testing curves of three groups which have the same pattern length
