![Page 1: Honte, a Go-Playing Program Using Neural Nets](https://reader036.vdocuments.us/reader036/viewer/2022082612/56813a6a550346895da2641c/html5/thumbnails/1.jpg)
Honte, a Go-Playing Program Using Neural Nets
Frederik Dahl
![Page 2: Honte, a Go-Playing Program Using Neural Nets](https://reader036.vdocuments.us/reader036/viewer/2022082612/56813a6a550346895da2641c/html5/thumbnails/2.jpg)
Combined approach
- Supervised learning: shape evaluation
- Reinforcement learning: group safety, territory
- Heuristic evaluation: influence
- Search: capture, connectivity, life and death
![Page 3: Honte, a Go-Playing Program Using Neural Nets](https://reader036.vdocuments.us/reader036/viewer/2022082612/56813a6a550346895da2641c/html5/thumbnails/3.jpg)
Architecture
![Page 4: Honte, a Go-Playing Program Using Neural Nets](https://reader036.vdocuments.us/reader036/viewer/2022082612/56813a6a550346895da2641c/html5/thumbnails/4.jpg)
Shape evaluation: Multilayer perceptron
- 190 inputs
  - Receptive field of radius 3
  - Distance to edge
  - Liberties
  - Captured stones
- 50 hidden nodes
- Single output
  - Will an expert play here?
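The layer sizes above can be sketched as a plain forward pass. This is only an illustration under stated assumptions: the slide gives the sizes (190 inputs, 50 hidden nodes, 1 output) but not the activation functions, the weight initialization, or the feature encoding, so `init_network`, `evaluate_shape`, the tanh hidden layer, and the sigmoid output are all hypothetical choices.

```python
import math
import random

N_INPUTS, N_HIDDEN = 190, 50  # layer sizes taken from the slide


def init_network(rng):
    """Create small random weights (the initialization scheme is an assumption)."""
    w_hidden = [[rng.uniform(-0.1, 0.1) for _ in range(N_INPUTS)]
                for _ in range(N_HIDDEN)]
    b_hidden = [0.0] * N_HIDDEN
    w_out = [rng.uniform(-0.1, 0.1) for _ in range(N_HIDDEN)]
    b_out = 0.0
    return w_hidden, b_hidden, w_out, b_out


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def evaluate_shape(net, features):
    """Map 190 local features to one score in (0, 1): 'will an expert play here?'"""
    w_hidden, b_hidden, w_out, b_out = net
    # Hidden layer: tanh units (assumed; the slide does not name the activation).
    hidden = [math.tanh(sum(w * x for w, x in zip(row, features)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    # Single output squashed to (0, 1) so it can be compared with 0/1 targets.
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)
```

Calling `evaluate_shape(init_network(random.Random(0)), features)` once per candidate intersection would give a score that can be used to rank moves.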
![Page 5: Honte, a Go-Playing Program Using Neural Nets](https://reader036.vdocuments.us/reader036/viewer/2022082612/56813a6a550346895da2641c/html5/thumbnails/5.jpg)
Shape evaluation: Training and performance
- Trained on 400 expert games
  - Expert move used as positive example (+1)
  - Random legal move as negative example (0)
- Error backpropagation
  - error = target - eval
- Performance measured by treating prediction as evaluation function
  - What percentage of legal moves are ranked below the expert move?
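The training targets and the ranking measure can be illustrated with a short sketch. The function names are hypothetical, and two details are assumptions not stated on the slide: the negative example is drawn from legal moves other than the expert move, and the expert move itself is left out of the denominator of the percentage.

```python
import random


def make_training_pairs(legal_moves, expert_move, rng):
    """Return (move, target) pairs: expert move -> 1, one random legal move -> 0."""
    pairs = [(expert_move, 1.0)]
    # Assumption: the negative example is never the expert move itself.
    candidates = [m for m in legal_moves if m != expert_move]
    if candidates:
        pairs.append((rng.choice(candidates), 0.0))
    return pairs


def training_error(target, evaluation):
    """Error signal fed to backpropagation, as on the slide: target - eval."""
    return target - evaluation


def percent_ranked_below(scores, expert_move):
    """Percentage of the other legal moves that score below the expert move.

    `scores` maps each legal move to the network's evaluation of it.
    """
    expert_score = scores[expert_move]
    others = [s for move, s in scores.items() if move != expert_move]
    if not others:
        return 100.0
    below = sum(1 for s in others if s < expert_score)
    return 100.0 * below / len(others)
```

For example, with scores `{"a": 0.9, "b": 0.2, "c": 0.5, "d": 0.95}` and expert move `"a"`, two of the three other moves rank below it, giving about 66.7%.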
![Page 6: Honte, a Go-Playing Program Using Neural Nets](https://reader036.vdocuments.us/reader036/viewer/2022082612/56813a6a550346895da2641c/html5/thumbnails/6.jpg)
Shape evaluation: Results
![Page 7: Honte, a Go-Playing Program Using Neural Nets](https://reader036.vdocuments.us/reader036/viewer/2022082612/56813a6a550346895da2641c/html5/thumbnails/7.jpg)
Local search
- Selective search for local goals
  - Capture
  - Connectivity
  - Life and death
- Only considers moves suggested by shape evaluating network
  - Deep and narrow search
  - Captures common-sense knowledge
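One way to picture a deep and narrow goal search is a depth-limited AND/OR search that expands only the top few suggested moves at each node. The sketch below is not Honte's implementation: `goal_search` and its callback interface are hypothetical, treating an unresolved position as a failure for the attacker is an assumption, and a tiny subtraction game stands in for Go so that the example runs on its own. In a real program `suggest` would return moves ranked by the shape network.

```python
def goal_search(state, attacker_to_move, depth, suggest, play, status, width=3):
    """Return True if the attacker can force the local goal within `depth` plies.

    suggest(state, attacker_to_move) -> moves, best first
    play(state, move, attacker_to_move) -> new state
    status(state) -> True (goal reached), False (goal failed), None (undecided)
    """
    result = status(state)
    if result is not None:
        return result
    if depth == 0:
        return False  # assumption: unresolved counts as failure for the attacker
    moves = suggest(state, attacker_to_move)[:width]  # narrow: top `width` only
    if not moves:
        return False
    outcomes = (
        goal_search(play(state, m, attacker_to_move), not attacker_to_move,
                    depth - 1, suggest, play, status, width)
        for m in moves
    )
    # The attacker needs one working move; the defender must refute every reply.
    return any(outcomes) if attacker_to_move else all(outcomes)


# Toy stand-in for Go: players alternately remove 1 or 2 counters from a pile,
# and the attacker's goal is to take the last counter.
def toy_suggest(state, attacker_to_move):
    pile, _ = state
    return [m for m in (1, 2) if m <= pile]


def toy_play(state, move, attacker_to_move):
    pile, _ = state
    return (pile - move, attacker_to_move)  # remember who moved last


def toy_status(state):
    pile, attacker_moved_last = state
    return attacker_moved_last if pile == 0 else None
```

Because the branching factor is capped by `width`, the depth limit can be set high at modest cost, which is the trade-off the slide describes.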
![Page 8: Honte, a Go-Playing Program Using Neural Nets](https://reader036.vdocuments.us/reader036/viewer/2022082612/56813a6a550346895da2641c/html5/thumbnails/8.jpg)
Group safety evaluation: Multilayer perceptron
- Groups defined by connectable blocks
- 13 inputs
  - Number of stones in group
  - Number of liberties in group
  - Number of proven eyes
  - Average opponent influence over liberties
- 20 hidden nodes
- 1 output
  - Probability of group survival
![Page 9: Honte, a Go-Playing Program Using Neural Nets](https://reader036.vdocuments.us/reader036/viewer/2022082612/56813a6a550346895da2641c/html5/thumbnails/9.jpg)
Group safety evaluation: Temporal difference learning
- Trained by self-play
- Reward signal for the group is the average final safety of stones
  - 0 = captured
  - 1 = survived
- TD(0) is used, replaying games backwards
- Very simple idea:
  - error = eval(next) - eval(now)
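The backward replay can be sketched as follows. To keep the example short, a single logistic unit stands in for the 20-hidden-node network, which is a simplification and not the architecture on the slides; `predict`, `td0_backward`, the learning rate, and the feature vectors are all hypothetical. The final reward plays the role of eval(next) for the last position of the game.

```python
import math


def predict(weights, features):
    """Logistic unit: estimated probability that the group survives."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def td0_backward(weights, episode, reward, alpha=0.5):
    """One TD(0) sweep over a finished game, replayed from the end to the start.

    episode: feature vectors of one group at successive positions, in play order.
    reward:  final safety of the group (0 = captured, 1 = survived).
    """
    target = reward  # value of "next" for the final position
    for features in reversed(episode):
        value = predict(weights, features)
        error = target - value           # error = eval(next) - eval(now)
        slope = value * (1.0 - value)    # derivative of the logistic output
        for i, x in enumerate(features):
            weights[i] += alpha * error * slope * x
        # Replaying backwards lets the freshly updated estimate of this
        # position serve as the target for the position before it.
        target = predict(weights, features)
    return weights
```

Replaying from the end means the final outcome influences every earlier position within a single sweep, instead of creeping back one step per game.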
![Page 10: Honte, a Go-Playing Program Using Neural Nets](https://reader036.vdocuments.us/reader036/viewer/2022082612/56813a6a550346895da2641c/html5/thumbnails/10.jpg)
Influence evaluation
Consider random walks from an intersection
How likely to end up at a black or white stone?
Can also take account of group safety estimates
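A Monte Carlo reading of this idea is sketched below. It is an assumption about how the random walks might be realized, not a description of Honte's code: the board encoding (`X` black, `O` white, `.` empty), the step limit, and the optional `safety` weights that scale each hit by a group safety estimate are all hypothetical.

```python
import random


def influence(board, start, n_walks=1000, max_steps=30, rng=None, safety=None):
    """Estimate (black, white) influence at `start` by random walks.

    board:  list of equal-length strings, 'X' = black, 'O' = white, '.' = empty
            (assumed to be at least 2 points wide or tall).
    safety: optional {(row, col): weight in [0, 1]} for stones; default weight 1.
    """
    rng = rng or random.Random()
    safety = safety or {}
    rows, cols = len(board), len(board[0])
    hits = {"X": 0.0, "O": 0.0}
    for _ in range(n_walks):
        r, c = start
        for _ in range(max_steps + 1):
            stone = board[r][c]
            if stone in hits:
                # The walk ends at the first stone; weak groups count for less.
                hits[stone] += safety.get((r, c), 1.0)
                break
            neighbours = [(r + dr, c + dc)
                          for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                          if 0 <= r + dr < rows and 0 <= c + dc < cols]
            r, c = rng.choice(neighbours)
    return hits["X"] / n_walks, hits["O"] / n_walks
```

Walks that do not reach a stone within the step limit contribute to neither side, so the two numbers need not sum to 1.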
![Page 11: Honte, a Go-Playing Program Using Neural Nets](https://reader036.vdocuments.us/reader036/viewer/2022082612/56813a6a550346895da2641c/html5/thumbnails/11.jpg)
Territory evaluation
- Another multilayer perceptron
- 4 inputs
  - Revised influence (for both sides)
  - Distance from edge
- 10 hidden nodes
- 1 output
  - Predicted territory value
- Trained by TD(0) using eventual territory value as reward signal
![Page 12: Honte, a Go-Playing Program Using Neural Nets](https://reader036.vdocuments.us/reader036/viewer/2022082612/56813a6a550346895da2641c/html5/thumbnails/12.jpg)
Playing strength
- Playing 19x19 Go
  - Approximately even against Handtalk 97-06e
  - Wins more than 50% against Ego 1.0
- Weaknesses
  - Confuses group safety with group strength
  - Has no concept of the aji of a group
![Page 13: Honte, a Go-Playing Program Using Neural Nets](https://reader036.vdocuments.us/reader036/viewer/2022082612/56813a6a550346895da2641c/html5/thumbnails/13.jpg)
Recent work
- New version of WinHonte 1.03
  - Neural net to evaluate sente/gote
- Trial version available online!
![Page 14: Honte, a Go-Playing Program Using Neural Nets](https://reader036.vdocuments.us/reader036/viewer/2022082612/56813a6a550346895da2641c/html5/thumbnails/14.jpg)
Conclusions
- Go knowledge can be learned
- Combining different forms of knowledge can be a good idea
- Multilayer perceptrons provide a flexible representation
- Local search can be used effectively as input features for learning