![Page 1: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/1.jpg)
From Monte-Carlo to win rate first search for “Dobutsu Shogi”
2010/05/22
IHARA Takehiro
![Page 2: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/2.jpg)
Abstract
• On an algorithm for computer shogi (Japanese chess)
• Contents
– Exhibition of Dobutsu Shogi
– Min-max method (conventional)
– Monte-Carlo method (conventional)
– Win rate first search (presented)
![Page 3: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/3.jpg)
Dobutsu shogi
• These slides discuss computer game algorithms using Dobutsu Shogi
• Dobutsu Shogi: a miniature shogi
• Shogi: Japanese chess
• Dobutsu: animal
• Normal shogi is too large for examining new methods
![Page 4: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/4.jpg)
Rules of Dobutsu Shogi 1
Five kinds of pieces
The initial position is as shown in the figure
You win if you capture the opponent's lion
You win if your lion reaches the opposite end
A chick promotes to a chicken
![Page 5: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/5.jpg)
Rules of Dobutsu Shogi 2
All pieces move by one step
Chick: forward
Chicken: vertical, horizontal and forward-diagonal
Lion: around 8 squares
Elephant: diagonal
Giraffe: vertical, horizontal
You can reuse (drop) the pieces that you took
![Page 6: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/6.jpg)
Copyright of Dobutsu Shogi
• I do not know who holds the copyright
– FUJITA Maiko (illustration)
– KITAO Madoka (rule design)
– LPSA (the organization the two designers belonged to)
– GENTOSHA Education (toy seller)
![Page 7: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/7.jpg)
Illustrations in these slides
• Because the copyright situation is complex, these slides use illustrations from the website below instead of FUJITA's
• “SOZAIYA JUN”
• (http://park18.wakwak.com/~osyare/)
![Page 8: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/8.jpg)
Exhibition initial position
Black: win rate first search (presented)
White: min-max method, search depth 9, evaluation function composed of piece values only (conventional)
![Page 9: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/9.jpg)
Exhibition 1st move
Black advanced giraffe
![Page 10: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/10.jpg)
Exhibition 2nd move
White advanced giraffe
![Page 11: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/11.jpg)
Exhibition 3rd move
Black took chick by chick
![Page 12: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/12.jpg)
Exhibition 4th move
White took chick by elephant
![Page 13: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/13.jpg)
Exhibition 5th move
Black advanced elephant
![Page 14: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/14.jpg)
Exhibition 6th move
White dropped chick for defense
![Page 15: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/15.jpg)
Exhibition 7th move
Black moved giraffe backward
![Page 16: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/16.jpg)
Exhibition 8th move
White advanced giraffe
![Page 17: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/17.jpg)
Exhibition 9th move
Black dropped chick for defense
![Page 18: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/18.jpg)
Exhibition 10th move
White took elephant by giraffe
![Page 19: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/19.jpg)
Exhibition 11th move
Black took giraffe by lion
![Page 20: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/20.jpg)
Exhibition 12th move
White dropped elephant
This elephant combination is strong
![Page 21: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/21.jpg)
Exhibition 13th move
Black lion escaped
![Page 22: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/22.jpg)
Exhibition 14th move
White advanced lion
![Page 23: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/23.jpg)
Exhibition 15th move
Black dropped giraffe, giving check
![Page 24: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/24.jpg)
Exhibition 16th move
White escaped lion
![Page 25: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/25.jpg)
Exhibition 17th move
Black advanced giraffe
Black forced white to choose between taking the giraffe and moving the elephant away
![Page 26: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/26.jpg)
Exhibition 18th move
White took giraffe by elephant
![Page 27: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/27.jpg)
Exhibition 19th move
Black took elephant by lion
![Page 28: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/28.jpg)
Exhibition 20th move
White dropped giraffe
![Page 29: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/29.jpg)
Exhibition 21st move
Black dropped elephant behind lion
![Page 30: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/30.jpg)
Exhibition 22nd move
White moved elephant backward
![Page 31: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/31.jpg)
Exhibition 23rd move
Black advanced elephant
![Page 32: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/32.jpg)
Exhibition 24th move
White gave check by giraffe
![Page 33: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/33.jpg)
Exhibition 25th move
Black took giraffe by elephant
![Page 34: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/34.jpg)
Exhibition 26th move
White took elephant by chick
If white had taken by elephant, white would have been mated
![Page 35: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/35.jpg)
Exhibition 27th move
Black lion escaped
![Page 36: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/36.jpg)
Exhibition 28th move
White dropped elephant
![Page 37: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/37.jpg)
Exhibition 29th move
Black gave check by giraffe
![Page 38: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/38.jpg)
Exhibition 30th move
White took giraffe by elephant
![Page 39: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/39.jpg)
Exhibition 31st move
Black took chick by lion, and white resigned
After this: white drops giraffe beside the lion, black giraffe takes elephant with check, white lion takes it, black chick advances, white lion moves backward, black drops chick, checkmate
![Page 40: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/40.jpg)
Min-max method
• A conventional method
• Today the most successful method for shogi
• Explained using a tree structure from the next page
![Page 41: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/41.jpg)
Min-max example: depth 3
Present board position
Board positions after 1 and 2 moves
Board positions after 3 moves
![Page 42: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/42.jpg)
Min-max
Suppose the scores after 3 moves were revealed
-4 -3 10 3 -9 5 23 -8
![Page 43: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/43.jpg)
Min-max
Scores after 2 moves are the maximum of each pair of scores below
-4 -3 10 3 -9 5 23 -8
-3 10 5 23
![Page 44: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/44.jpg)
Min-max
Scores after 1 move are the minimum of each pair of scores below
-4 -3 10 3 -9 5 23 -8
-3 10 5 23
-3 5
![Page 45: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/45.jpg)
Min-max
Select the move having the maximum score
-4 -3 10 3 -9 5 23 -8
-3 10 5 23
-3 5
5
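The four min-max slides above can be reproduced with a short recursive sketch. The nested-list tree layout and the names `minimax` and `tree` are assumptions made for illustration; only the eight leaf scores come from the slides.

```python
def minimax(node, maximizing):
    """Return the min-max score of a node.

    A node is either a number (a leaf score) or a list of child nodes.
    """
    if isinstance(node, (int, float)):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


# Leaf scores after 3 moves, grouped as a binary tree of depth 3.
tree = [[[-4, -3], [10, 3]], [[-9, 5], [23, -8]]]

# Score of each candidate first move: the opponent (minimizer) moves next.
root_scores = [minimax(child, False) for child in tree]
best_move = max(range(len(root_scores)), key=lambda i: root_scores[i])
```

Here `root_scores` holds the two values after 1 move, and `best_move` is the index of the first move to select.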
![Page 46: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/46.jpg)
Min-max method
• Theoretically, you can select the move that leads to the maximum score after N moves
• Theoretically, if we could obtain the scores at the end of the game, we would always win the game
• Practically, the computational cost is too large to calculate all moves
![Page 47: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/47.jpg)
Min-max method
• Although many methods for reducing the computational cost have been proposed, these slides do not cover them (reducing the number of searched nodes is called pruning)
![Page 48: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/48.jpg)
Conclusion of min-max method
• It uses a tree structure
• Scores after N moves are needed
• Pruning is needed
![Page 49: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/49.jpg)
Monte-Carlo method
• While I do not know the history of the Monte-Carlo method, it has been successful for computer “Go” (more precisely, Monte-Carlo tree search has been successful)
• It is said that it is still difficult to apply to computer shogi (or chess-like games)
![Page 50: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/50.jpg)
Outline of Monte-Carlo
• Repeat random moves
• Then the game finishes and the winner is revealed
• Playing a game to its end by random moves is called a playout
[Figure: random moves from the first move to the end of the game form one playout]
![Page 51: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/51.jpg)
Outline of Monte-Carlo
• Repeat playouts
• Obtain the win rate of the first move
• (number of wins) / (number of playouts)
• Finally, select the move having the highest win rate
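As a minimal sketch of this outline, the code below runs the simple Monte-Carlo method on a toy take-away game rather than on shogi. The game (take 1 to 3 stones, whoever takes the last stone wins) and all function names are assumptions chosen only to keep the example short.

```python
import random


def legal_moves(pile):
    """Moves in the toy game: take 1, 2 or 3 stones."""
    return [k for k in (1, 2, 3) if k <= pile]


def playout(pile, player_to_move, rng):
    """Play random moves until the pile is empty; return the winner (0 or 1)."""
    while True:
        pile -= rng.choice(legal_moves(pile))
        if pile == 0:
            return player_to_move  # this player took the last stone
        player_to_move = 1 - player_to_move


def monte_carlo_move(pile, n_playouts, rng):
    """Return (best first move, win rate of every first move) for player 0."""
    rates = {}
    for move in legal_moves(pile):
        rest = pile - move
        if rest == 0:
            wins = n_playouts  # the first move itself wins the game
        else:
            wins = sum(playout(rest, 1, rng) == 0 for _ in range(n_playouts))
        rates[move] = wins / n_playouts  # (number of wins) / (number of playouts)
    best = max(rates, key=rates.get)
    return best, rates
```

For example, `monte_carlo_move(3, 200, random.Random(0))` estimates the win rate of taking 1, 2 or 3 stones from a pile of 3 and returns the move with the highest rate.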
![Page 52: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/52.jpg)
Outline of Monte-Carlo
• That is the whole outline
• For “Go”, this method has become stronger by combining it with a tree structure, which gives Monte-Carlo tree search (these slides do not cover it)
• Another improvement is that playouts use moves chosen by knowledge of “Go” instead of simple random moves
![Page 53: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/53.jpg)
Example of knowledge of “Go”
• Observe 3x3 squares
• Set a low probability of placing a black stone at the center of the upper figure
• Set a high probability of placing a black stone at the center of the lower figure
![Page 54: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/54.jpg)
Monte-Carlo for shogi
• The simple Monte-Carlo method does not work for shogi (too many bad moves appear)
• A likely cause is that only a few of the legal moves in shogi are good
• I do not want to use knowledge of shogi, whether obtained by machine learning or set manually
![Page 55: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/55.jpg)
Why Monte-Carlo for shogi
• It can determine the move from the result at the end of the game, which seems beautiful
• No evaluation function is needed, and no preset knowledge is needed
![Page 56: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/56.jpg)
Discussion Monte using tree
Simple random moves lead to equal win rates for green and red
The truth is that green wins and red loses
This shows the importance of the tree structure
![Page 57: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/57.jpg)
Discussion Monte using tree
Suppose you obtain the win rates after 3 moves by playouts
0.1 0.3 0.7 0.8 0.2 0.6 0.9 0.4
Obtain the win rates of green and red from these rates after 3 moves
![Page 58: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/58.jpg)
Discussion Monte using tree
Ideally the rates are equal to the ones of the min-max method
0.1 0.3 0.7 0.8 0.2 0.6 0.9 0.4
0.3 0.8 0.6 0.9
0.3 0.6
![Page 59: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/59.jpg)
Discussion Monte using tree
• Q: How do you calculate the parent node 0.6 from the child nodes 0.2 and 0.6?
• A: Ignore 0.2
0.2 0.6
0.6
![Page 60: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/60.jpg)
Discussion Monte using tree
• Q: How do you ignore 0.2?
• A1: Always search the node with the maximum win rate
• A2: Sometimes search a node at random
0.2 0.6
0.6
![Page 61: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/61.jpg)
Discussion Monte using tree
Search the node that has the maximum win rate
0.1 0.3 0.7 0.8 0.2 0.6 0.9 0.4
This tactic finds the best path
![Page 62: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/62.jpg)
Win rate first search
• Remember the win rate of each searched node
• Almost always search the node that has the maximum win rate
• Sometimes search randomly (ideally this is not needed)
• Then this algorithm finds the best move
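The selection rule in these bullets could be sketched as follows. The function name `select_child` and the parameter `explore_prob` (how often "sometimes" is) are assumptions; the slides do not give a value for it.

```python
import random


def select_child(win_rates, explore_prob, rng):
    """Pick the index of the child node to search next.

    With probability explore_prob a child is chosen at random;
    otherwise the child with the maximum remembered win rate is chosen.
    """
    if rng.random() < explore_prob:
        return rng.randrange(len(win_rates))
    return max(range(len(win_rates)), key=lambda i: win_rates[i])
```

With `explore_prob` set to 0 this always follows the maximum win rate, which is the ideal case mentioned in the bullets; a small positive value occasionally revisits nodes such as the 0.2 child.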
![Page 63: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/63.jpg)
Additional explanation
• Update the win rate at every playout
• Keep a numerator and a denominator as the win rate
• Add a constant to both the numerator and the denominator when the playout is won
• Add the constant to the denominator only when the playout is lost
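A minimal sketch of this bookkeeping is shown below. The class name `WinRate`, the constant `weight=1.0`, and the `default` returned before any playout are assumptions for illustration.

```python
class WinRate:
    """Win rate stored as a numerator and a denominator."""

    def __init__(self):
        self.num = 0.0
        self.den = 0.0

    def update(self, won, weight=1.0):
        """Add the constant to both parts on a win, to the denominator only on a loss."""
        if won:
            self.num += weight
        self.den += weight

    def value(self, default=1.0):
        """Return num / den, or the (assumed) default if no playout has been recorded."""
        return self.num / self.den if self.den > 0 else default
```

For example, after two won playouts and one lost playout the stored rate is 2/3.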
![Page 64: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/64.jpg)
Problems of presented method
• Win rates of nodes that have not been searched are discussed from the next page
• Many other issues must be hidden, though I have not identified them
![Page 65: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/65.jpg)
Unreached node
• About a node that has not been searched and has no win rate
0.4 0.6 0.3
unreached
![Page 66: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/66.jpg)
Another win rate
• Up to this page, no knowledge of shogi has appeared and only the graph has been used
• This win rate uses knowledge of shogi
• The win rate is calculated by kind of move
• For example, taking a piece, promotion, etc.
![Page 67: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/67.jpg)
Another win rate
• Calculate the win rate from these factors
– Piece position before and after the move
– Kinds of the moving piece and the taken piece
– Whether the position is controlled or not
• A win rate table for all combinations of these factors is prepared
• These win rates are learned by playouts; their values are not preset
![Page 68: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/68.jpg)
Another smaller win rate
• Another, smaller win rate table is prepared
– Kinds of the moving piece and the taken piece
– Whether the position is controlled or not
• Since it is small, it learns fast
• It is used when the “another larger win rate” is not learned yet
• If none of the three kinds of win rate has been learned, let the win rate be 1
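The fallback order described on these slides (the node's own win rate, then the larger table, then the smaller table, then 1) might look like the sketch below. Representing each win rate as a `(numerator, denominator)` pair and treating a zero denominator as "not learned yet" are assumptions; the slides do not say how "learned" is decided.

```python
def estimate_win_rate(node_stat, large_stat, small_stat):
    """Return the first learned win rate, falling back to 1.0.

    Each argument is a (numerator, denominator) pair, or None if no entry exists.
    """
    for stat in (node_stat, large_stat, small_stat):
        if stat is not None and stat[1] > 0:
            return stat[0] / stat[1]
    return 1.0  # none of the three win rates has been learned
```

Because an unlearned move gets win rate 1, a search that prefers the maximum win rate will try it at least once.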
![Page 69: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/69.jpg)
Conclusion of presented method
• Win rates of all searched nodes are remembered and learned by playouts
• In a playout, select the node that has the highest win rate (“win rate first search”)
• Sometimes select a node randomly
• If a win rate has not been learned, the other win rates are used
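Putting these points together, the following is a rough sketch of win rate first search on a toy take-away game (take 1 to 3 stones, whoever takes the last stone wins), not on Dobutsu Shogi. The game, the dictionary keyed by `(pile, player, move)`, the exploration probability of 0.1, and the omission of the move-kind tables are all assumptions made to keep the example small; it is not the author's implementation.

```python
import random


def legal_moves(pile):
    return [k for k in (1, 2, 3) if k <= pile]


def win_rate(stats, key):
    num, den = stats.get(key, (0.0, 0.0))
    return num / den if den > 0 else 1.0  # unlearned nodes get win rate 1


def choose(stats, pile, player, explore_prob, rng):
    moves = legal_moves(pile)
    if rng.random() < explore_prob:
        return rng.choice(moves)  # sometimes select randomly
    return max(moves, key=lambda m: win_rate(stats, (pile, player, m)))


def playout(stats, pile, explore_prob, rng):
    """Play one game by win rate first selection, then update every visited node."""
    path, player = [], 0
    while pile > 0:
        move = choose(stats, pile, player, explore_prob, rng)
        path.append((pile, player, move))
        pile -= move
        winner = player  # whoever moved last took the last stone
        player = 1 - player
    for key in path:
        num, den = stats.get(key, (0.0, 0.0))
        won = key[1] == winner  # win rate is from the mover's point of view
        stats[key] = (num + (1.0 if won else 0.0), den + 1.0)


def win_rate_first_search(pile, n_playouts, explore_prob=0.1, seed=0):
    """Return the first move for player 0 with the highest learned win rate."""
    rng = random.Random(seed)
    stats = {}
    for _ in range(n_playouts):
        playout(stats, pile, explore_prob, rng)

    def rank(move):
        key = (pile, 0, move)
        return (win_rate(stats, key), stats.get(key, (0.0, 0.0))[1])

    return max(legal_moves(pile), key=rank)
```

Calling `win_rate_first_search(3, 500)` runs 500 playouts from a pile of 3 stones; ties in win rate are broken in favor of the more visited move, which is a design choice of this sketch.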
![Page 70: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/70.jpg)
Condition of simulation game
• Win rate first search vs. a simple min-max method (evaluation function composed of piece values only)
• If the game reaches 80 moves, it is regarded as a draw (special rule for this simulation)
![Page 71: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/71.jpg)
Result of simulation 1
| Number of playouts | 10000 | 30000 | 100000 |
|---|---|---|---|
| Presented method: black | 22-76 | 44-52 | 48-49 |
| Presented method: white | 16-81 | 30-68 | 61-35 |

Win-loss record of the presented method in 100 games
Some games were draws
Depth of min-max method is 6
The more playouts, the stronger the method
![Page 72: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/72.jpg)
Result of simulation 2
Win-loss record of the presented method in 100 games
Some games were draws
100000 playouts for the presented method
Almost the same strength as depth-6 min-max

| Depth of min-max | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|
| Presented method: black | 94-6 | 77-20 | 48-49 | 37-61 | 24-73 | 14-85 |
| Presented method: white | 78-21 | 78-20 | 61-35 | 38-57 | 40-52 | 20-74 |
![Page 73: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/73.jpg)
Impression by human viewer
• The presented method frequently makes bad moves
• Although it is a variation of the Monte-Carlo method, it can find a route to mate
• It is good at finding narrow routes
• A difference in the number of playouts clearly shows up as a difference in strength
![Page 74: win rate first search](https://reader035.vdocuments.us/reader035/viewer/2022062319/556dbf44d8b42aed2e8b4e2a/html5/thumbnails/74.jpg)
Conclusion and future issue
• Conclusion
– Playout by win rate first
– Select moves without preset knowledge
– Select moves by the result of playouts
• Future
– Someone can apply it to “Go” or other chess-like games
– I return to research on speech signal processing