Evolution and Coevolution of ANNs playing Go
Peter Mayer, 2004
Outline
Computers and Games
The game of Go
Experimental Setup
Training of Go playing ANNs
Evolution of Go Playing ANNs
Summary and Outlook
Games
Algorithms have been designed for games since AI’s onset
Clearly defined rules, yet still complex
Chess received the most attention and is more researched than Go
Two main approaches:
Rely on expertise – directly programmed weighted features; extensive knowledge
Use evolution – less knowledge; more versatility
The game of Go
Oldest (unaltered) strategic board game in the world
10,000,000 players in Japan “alone”
Fairly simple rules, BUT difficult to master:
Immense game tree (~200 options per move)
Complex structures
Many concurrent goals
Go Rules
19x19 board, empty in the beginning
Black & White “stones”
Black starts
Each turn: place 1 stone at an intersection (stones are never moved), OR pass
Go Rules [2]
Objective: get the most points!
Points are acquired by:
Securing territories
Capturing the opponent’s pieces
Go Rules [3]
Same-colored stones at vertically or horizontally adjacent intersections are called a group
An empty intersection adjacent to a stone or group is called a "liberty" of that group
1 liberty = group is in “atari”
No liberties -> CAPTURE! The group is removed
Example: Black places a stone at X, resulting in the right figure
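The group, liberty and capture rules above can be sketched in a few lines. This is only an illustrative sketch, not the code used in the experiments: the board encoding (`"."`, `"B"`, `"W"`) and the function names `neighbors`, `group_and_liberties` and `place_and_capture` are assumptions made for this example.

```python
from typing import List, Set, Tuple

Point = Tuple[int, int]
EMPTY, BLACK, WHITE = ".", "B", "W"  # hypothetical board encoding


def neighbors(size: int, row: int, col: int) -> List[Point]:
    """Vertically and horizontally adjacent intersections that lie on the board."""
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates if 0 <= r < size and 0 <= c < size]


def group_and_liberties(board: List[List[str]], row: int, col: int) -> Tuple[Set[Point], Set[Point]]:
    """Flood-fill the group containing the stone at (row, col) and collect its liberties."""
    color = board[row][col]
    group: Set[Point] = set()
    liberties: Set[Point] = set()
    stack = [(row, col)]
    while stack:
        r, c = stack.pop()
        if (r, c) in group:
            continue
        group.add((r, c))
        for nr, nc in neighbors(len(board), r, c):
            if board[nr][nc] == EMPTY:
                liberties.add((nr, nc))
            elif board[nr][nc] == color:
                stack.append((nr, nc))
    return group, liberties


def place_and_capture(board: List[List[str]], row: int, col: int, color: str) -> int:
    """Place a stone, remove opponent groups left without liberties, return stones captured."""
    board[row][col] = color
    opponent = WHITE if color == BLACK else BLACK
    captured = 0
    for nr, nc in neighbors(len(board), row, col):
        if board[nr][nc] == opponent:
            group, liberties = group_and_liberties(board, nr, nc)
            if not liberties:
                for r, c in group:
                    board[r][c] = EMPTY
                captured += len(group)
    return captured


# A white stone in atari: its only liberty is the point below it.
demo = [list(".B."), list("BWB"), list("...")]
atari_liberties = group_and_liberties(demo, 1, 1)[1]
stones_taken = place_and_capture(demo, 2, 1, BLACK)
```

The sketch does not check suicide or ko; it only shows how a group is found and when it is removed.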
Go Rules [4]
Stones can be placed anywhere, but cannot commit suicide (except Chinese)
Legal if the stone simultaneously captures an opponent’s group (2 right figures)
Suicide – White cannot place at X
White CAN place at X – result: capture
Go Rules [5]
The same position cannot occur more than once
Endless repetitions:
Black can capture in the upper figure by placing at X
White can do the same by placing at Y
Black repeats…
Ko rule: White may not place at Y before playing somewhere else first
Avoids any repetitions
Go Rules – Live and Dead groups
“Dead” groups: groups whose capture is impossible to prevent
It is not necessary to actually capture them
The group remains on the board
At the end of the game, it is removed and added to the captured stones
“Living” groups are impossible to capture
Group with 2 “eyes” – even if White surrounds it, playing at X or Y is suicide
Opponent must play elsewhere
Go Basics – End game
Play continues until both players pass
Players then alternately play stones at “neutral” points, i.e. points adjacent to both White and Black
Also known as “dame” (DAH-MAY)
Dead stones are removed from the board and counted with other prisoners (1 point per prisoner)
Also: 1 point for each intersection surrounded by a player’s stones (“territory”)
Go Basics – End game example
Prisoners were removed already
All 4 points marked X are dame – worthless
Black has 7 points in UR (territory); 2 points in LL; 1 removed prisoner: TOTAL = 10 points
White has 5 in UL; 2 in LR; 2 prisoners: TOTAL = 9 points
Black wins unless komi (5.5 pts compensation) is due
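The arithmetic of this example can be written out as a small sketch. The helper name `final_score` is hypothetical, and the assumption that komi is simply added to White's total is made for this example.

```python
def final_score(territory_points, prisoners, komi=0.0):
    """Territory + prisoners (+ komi, if any), as counted in the end game example."""
    return sum(territory_points) + prisoners + komi


black_total = final_score([7, 2], prisoners=1)            # UR + LL + 1 prisoner
white_total = final_score([5, 2], prisoners=2)            # UL + LR + 2 prisoners
white_with_komi = final_score([5, 2], prisoners=2, komi=5.5)

black_wins_without_komi = black_total > white_total
black_wins_with_komi = black_total > white_with_komi
```

Without komi Black leads by one point; once the 5.5 points are added to White's total the result flips, which is why the slide adds "unless komi is due".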
Ranking and Handicaps
Determine Go players’ strength
Resemblance to martial arts
Both amateur and professional ranking systems
Amateur: 35 kyu to 1 kyu, THEN 1 dan to 7 dan
Pro: 1 dan to 9 dan
Awarded only by Go institutions
Pro dans are much stronger than amateur dans
Ranking and Handicaps (2)
Handicaps:
Weaker player starts with several stones on the board
Placed at specific places
Helps make games more even
Difference in ranks ~ number of handicap stones needed to win
2 stones to even 2 dan against 4 dan
4 stones to even 3 kyu against 2 dan
The most powerful Go programs reach only … 10 kyu!
Outline
Computers and Games
The game of Go
Experimental Setup
Training of Go playing ANNs
Evolution of Go Playing ANNs
Summary and Outlook
Experimental Setup
Opponent Go players
ANN player
Go board (input) representations
Move (output) representations
Coevolution: Hall of Fame coevolution; Cultural coevolution
General evolution setup
Go Players - Random
No strategy
May also play the pass move
“Knows” only the rules of Go (legality of moves)
Usually the weakest opponent
Go Players – Naïve Player
Roughly human-beginner level
Able to save and capture stones
Knows about:
Lost stones
Saving – connecting stones to living groups
Weak stones (not savable)
Go Players – Naïve Strategy
A subset of the strategy of JaGo (the main opponent)
Outline (arranged by priority):
Attempt to save
Try to put opponent into atari
Connect weak stones
Capture opponent groups in atari
Check intersections for placing stones, in random order
Make sure no (own) liberties decrease below 2 as a result
Perform a random move
Go Players – JaGo Player
Java-based program
Best computer player used
Not a strong player: ~16 kyu
Knows standard techniques, mainly save & capture
Uses pattern matching
Looks at the entire board
32 patterns, with rotations and mirrors
Go Players – JaGo Strategy (1)
Save stones in atari
Try to decrease liberties of large groups
Find own savable larger groups
Attack opponent’s groups (decreasing order):
With 2 or more liberties and attackable
With 2 or more stones & fewer than 3 liberties
With 2 or fewer liberties
Go Players – JaGo Strategy (2)
Save own groups with few liberties if savable
Start pattern matching – Response; Center
Random move order
Seek opponent’s groups to capture in 2 moves
Perform a random move which isn’t of a bad pattern
Capture opponent’s single liberties
Connect own weak stones
PASS
Go Players – JaGo Patterns (1)
Go Players – JaGo Patterns (2)
Go Players – GNU Go
Advantages:
5x5 to 19x19 boards
Handles handicaps well
Rated 10 kyu
Problems:
5x5 is solved – open at C3 for 18.5 points (komi=5.5); always wins as Black
GNU Go passes on B3, C2-4, D3 (only correct at C3)
Premature convergence of evolution
ANN Player
Inform the ANN about the actual position
Evaluate the ANN output to receive the next move
Representation is important!
Intention maps:
For each Go move (including PASS), a value in [0,1]
High value – high intention to make the move (and vice versa)
Select the legal move with the highest value
To avoid predictability, also consider suboptimal moves (“creativity factor”)
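Move selection from an intention map might look roughly like the sketch below. The slides do not say how the “creativity factor” is applied; here it is assumed, purely for illustration, to be the probability of stepping down to the next-best legal move. The name `select_move` and the move labels are hypothetical.

```python
import random


def select_move(intentions, legal_moves, creativity=0.0, rng=random):
    """Pick a legal move from an intention map (move -> value in [0, 1]).

    With creativity == 0 the legal move with the highest value is returned.
    Otherwise each step down the ranking is taken with probability `creativity`
    (an assumed mechanism, not specified in the slides).
    """
    ranked = sorted(
        (move for move in intentions if move in legal_moves),
        key=lambda move: intentions[move],
        reverse=True,
    )
    if not ranked:
        raise ValueError("no legal move available")
    index = 0
    while index < len(ranked) - 1 and rng.random() < creativity:
        index += 1
    return ranked[index]


intention_map = {"C3": 0.9, "B2": 0.7, "PASS": 0.1}
greedy_choice = select_move(intention_map, legal_moves={"B2", "PASS"})
```

Illegal moves are filtered out before ranking, so a high intention for an occupied intersection never blocks the choice of a legal move.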
Player Strength
Commonly, to receive a rating, unrated Go players play against rated players (same in Chess)
The strength s of a player is determined by:
The score of 1000 double games
Against each of 3 opponents: Random (R), Naïve (N), JaGo
Divided by the number of games (6,000)
1 is perfect strength
3 opponents help resist over-fitting
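As a rough sketch, the strength measure is a win rate over all games. The function name `strength` and the win counts below are made up for illustration; the sketch also assumes that “score” means the number of games won, which the slides only state later for fitness.

```python
def strength(wins_by_opponent, games_per_opponent=2000):
    """Wins against all opponents divided by the total number of games played.

    1000 double games (one per color) give 2000 games per opponent,
    i.e. 6,000 games for the three opponents.
    """
    total_games = games_per_opponent * len(wins_by_opponent)
    return sum(wins_by_opponent.values()) / total_games


# Hypothetical win counts, for illustration only.
example_strength = strength({"Random": 1900, "Naive": 1000, "JaGo": 100})
```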
Player Competence
Strength does not measure understanding of the rules (legality)
E.g. 2 players receive the same score but only one always tried legal moves first
The competence C of a player is defined as follows:
bi = games; i = moves; tij = number of tried illegal moves; kij = number of possible illegal moves
C is averaged over all games
Board Representations
19x19 boards are far too large, even for evolved agents
Use only 5x5 boards
Board Representations
Should preprocess the position to make the ANN’s life easier
Tested in training experiments
Standard Input Representation (SIR):
2 neurons at each intersection: 1 for the player’s piece; 1 for the opponent’s
No distinction between B and W stones
Optional – 1 neuron to tell if B or W
(2*b^2) neurons (where b is the board size) = 50
Representations - NIR
Naïve Input Representation
More compact
1 neuron per intersection
Set to -1 (player’s stone) or 1 (opponent’s); 0 if empty
Uses half of SIR’s neurons = 25
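The two encodings can be compared side by side in a short sketch. The board format (strings of `"."`, `"B"`, `"W"`), the 0/1 activation values for SIR and the function names are assumptions for illustration; only the neuron counts and the -1/0/1 coding of NIR come from the slides.

```python
EMPTY = "."  # hypothetical encoding of an empty intersection


def encode_sir(board, player):
    """SIR: two inputs per intersection (own stone, opponent stone)."""
    inputs = []
    for row in board:
        for cell in row:
            inputs.append(1.0 if cell == player else 0.0)
            inputs.append(1.0 if cell not in (EMPTY, player) else 0.0)
    return inputs


def encode_nir(board, player):
    """NIR: one input per intersection, -1 own stone, 1 opponent stone, 0 empty."""
    inputs = []
    for row in board:
        for cell in row:
            if cell == EMPTY:
                inputs.append(0.0)
            elif cell == player:
                inputs.append(-1.0)
            else:
                inputs.append(1.0)
    return inputs


position = ["B....", ".W...", ".....", ".....", "....."]
sir_inputs = encode_sir(position, player="B")   # 2 * 5^2 values
nir_inputs = encode_nir(position, player="B")   # 5^2 values
```

Both encodings are relative to the player to move, which is why no distinction between Black and White stones is needed.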
Representations - LVIR
Limited View Input Representation
Splits the Go board into several square areas of size 3x3
Idea – the simplest way of capturing stones works within this area
E.g. capture of 1 stone by surrounding it
Areas overlap at the middle row and middle column
Coding – similar to SIR
w is the number of areas (=4): 2 * 3^2 * w = 72 neurons
Could also be Naïve
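A sketch of how a 5x5 board could be split into the four overlapping 3x3 areas and coded SIR-style. The window origins, the 0/1 activations and the names `lvir_windows` and `encode_lvir` are assumptions for this example; the slides only give the area size, the overlap and the total of 72 neurons.

```python
EMPTY = "."  # hypothetical encoding of an empty intersection


def lvir_windows(board):
    """Return the four 3x3 areas of a 5x5 board; they share the middle row and column."""
    if len(board) != 5 or any(len(row) != 5 for row in board):
        raise ValueError("this sketch only handles 5x5 boards")
    origins = [(0, 0), (0, 2), (2, 0), (2, 2)]
    return [[list(board[r + dr][c:c + 3]) for dr in range(3)] for r, c in origins]


def encode_lvir(board, player):
    """SIR-style coding (own stone, opponent stone) of every cell of every window."""
    inputs = []
    for window in lvir_windows(board):
        for row in window:
            for cell in row:
                inputs.append(1.0 if cell == player else 0.0)
                inputs.append(1.0 if cell not in (EMPTY, player) else 0.0)
    return inputs


center_stone = [".....", ".....", "..B..", ".....", "....."]
lvir_inputs = encode_lvir(center_stone, player="B")
```

Because the areas overlap, a stone on the center point is presented to the network four times, once per window.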
Clever Representations
Based on image processing and circuits
We want fewer raw inputs to allow the ANN to concentrate more on features
Manhattan distance:
Used in integrated circuits where wires run parallel to the X or Y axis
Got its name from Manhattan, NY, where streets are aligned in a grid
P1 = (x1, x2)
P2 = (y1, y2)
d(P1, P2) = |x1 - y1| + |x2 - y2|
Clever Representations
Manhattan distance is related to the distance of Go stones (no diagonals)
distance = [#(separating stones) + 1]
1 if next to each other
2 if separated by one stone
3 for knight’s move or two separating stones
Representations: c-o-Matrix
Co-occurrence matrices
Used in image processing
Many parameters are derived from them: mean, SD, energy, contrast, homogeneity, …
Square
Based on a relation p between image positions (symmetric if p is)
Representations: c-o-Matrix
Elements C[i][j] = number of times a pixel of value (color) i occurs in the image, in the relation specified by p, relative to a pixel of value j
Size is the number of different colors
Representations: c-o-Matrix
An actual Go board is an “image” with 3 different colors (including empty)
Example p1: Manhattan distance of 1 between 2 points
First matrix row:
B near B 16 times
B near W 3 times
B near empty 11 times
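A co-occurrence matrix for a Go position under the relation “Manhattan distance equals d” could be computed as sketched below. The color order (B, W, empty), the function names and the choice to count ordered pairs (so each adjacent pair contributes to both C[i][j] and C[j][i]) are assumptions for this example; the counts in the slide's figure are not reproduced here.

```python
from itertools import product

COLORS = "BW."  # assumed row/column order: Black, White, empty


def manhattan(p, q):
    """Manhattan distance |x1 - y1| + |x2 - y2| between two board points."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def cooccurrence(board, distance=1):
    """C[i][j] = number of ordered point pairs (p, q) at the given Manhattan
    distance where p has color i and q has color j."""
    index = {color: i for i, color in enumerate(COLORS)}
    matrix = [[0] * len(COLORS) for _ in COLORS]
    points = list(product(range(len(board)), repeat=2))
    for p in points:
        for q in points:
            if manhattan(p, q) == distance:
                matrix[index[board[p[0]][p[1]]]][index[board[q[0]][q[1]]]] += 1
    return matrix


tiny_board = ["BW", ".."]
tiny_matrix = cooccurrence(tiny_board, distance=1)
```

Since the distance relation is symmetric, the resulting matrix is symmetric as well, matching the remark that C is symmetric if p is.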
Representations: c-o-Matrix
Does not say much about absolute positions – must be combined:
SIR and C for whole board
NIR and C for whole board
NIR and Cs for 3x3 areas
sLVIR and Cs for 3x3 areas
NLVIR and Cs for 3x3 areas
Output Representations
Only 2
Standard Output Representation (SOR):
Each intersection is represented by 1 neuron
1 for PASS
(b^2 + 1) neurons
Output Representations
Row Column Output Representation (RCOR):
Used to decrease ANN size
5 neurons for columns; 5 for rows
1 for PASS
(2b + 1) neurons
Intention more complicated:
PASS intention is the square of the relevant neuron
RCOR limits the intention map: v1>v2 y1>y2 v4>v3
All values positive, non-zero
Coevolution
Derives non-static fitness, as in nature
1 or more interacting populations
Competitive [battle] vs. Cooperative [subtasks]
Advantages:
“Who needs enemies when you got friends like these?” – saves finding opponents, especially in Go, where no strong program exists
Variety in fitness – adaptive opponents
No upper bound for improvement
Coevolution Methods Applied
Based on work by Lubberts & Miikkulainen [2001]
Hall of Fame:
Host population and Master population
Maintains the ability of the host population to beat opponents of previous generations
Each generation, the best individual is added to the HoF
The whole population competes against a sample of the HoF
Coevolution - HoF
Applied in this research:
HoF initially filled without competition
Individuals get their fitness by competing against the masters
When full, the host with the highest win rate (against masters) joins the HoF
It replaces the first Master to lose all games
Coevolutionary progress cannot be directly seen
Both populations are constantly changing
Cultural Coevolution
A new approach!
Maintains a “culture” of masters resembling the HoF
To enter the culture, a host must defeat all masters
Masters are never replaced – unlimited culture size
Every individual receives a fitness score by competing against all masters
Culture growth rate decreases rapidly
Every new master is the strongest found (yet)
Cultural Coevolution [2]
Numerous advantages:
Maintains the ability to defeat weak players
Keeps good solutions found
Same player cannot enter twice – it would need to defeat itself
Culture’s performance never decreases
Avoids focusing on a specific player’s weakness
As soon as any master is immune, the hosts have to find another way
More masters: less likely to remember all weaknesses
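The culture bookkeeping described above can be sketched as follows. The names `update_culture` and `culture_fitness`, the `beats` callback, and the choice to let the first host enter an empty culture are assumptions for illustration; players are stood in for by plain numbers so the sketch stays self-contained.

```python
def culture_fitness(host, culture, beats):
    """Fraction of masters the host defeats (0.0 while the culture is empty)."""
    if not culture:
        return 0.0
    return sum(1 for master in culture if beats(host, master)) / len(culture)


def update_culture(culture, hosts, beats):
    """Append every host that defeats all current masters; masters are never removed."""
    for host in hosts:
        if all(beats(host, master) for master in culture):
            culture.append(host)
    return culture


def stronger(a, b):
    """Toy stand-in for playing games: a 'player' is just its skill level."""
    return a > b


culture = update_culture([], hosts=[3, 1, 5, 5, 4], beats=stronger)
host_fitness = culture_fitness(4, culture, stronger)
```

In this toy run the second `5` is rejected because it cannot defeat the identical master already in the culture, which mirrors the "cannot enter twice" property.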
General Evolution Setup
Opponents – Random; Naïve; JaGo
Fitness = strength
Rate of wins against all 3 opponents
6,000 games of both colors
Not using scores, only win rates
Defeating more opponents is better
Generalized Multi-Layer Perceptrons (GMLPs)
All non-loop connections are permitted
Evolving: hidden neurons; connections; weights; bias (for non-input)
General Evolution Setup [2]
2 binary chromosomes used:
1 for connections: 0 = no, 1 = yes
1 for hidden neurons (if 0, no connections either)
Number of possible connections:
ni, nh, no – number of input, hidden and output neurons
Determines the size of the chromosome
Real-valued chromosome:
Weights & bias values (seen as weights)
Size is the number of connections + the number of bias values (for non-input neurons)
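The formula for the number of possible connections did not survive in this transcript. The sketch below is a guess, not the author's formula: it counts every feed-forward connection (input to hidden, input to output, hidden to output, and hidden to later hidden). Under the further assumption of 20 hidden neurons, which the slides never state, it reproduces the maximum of 3,010 connections quoted for the evolved GMLPs; the other two helpers reproduce the 7,600 and 2,601 figures quoted elsewhere in the deck.

```python
def max_gmlp_connections(n_in, n_hidden, n_out):
    """Assumed count of all non-loop connections in a generalized MLP."""
    return (
        n_in * n_hidden                      # input  -> hidden
        + n_in * n_out                       # input  -> output
        + n_hidden * n_out                   # hidden -> output
        + n_hidden * (n_hidden - 1) // 2     # hidden -> later hidden
    )


def layered_connections(n_in, n_hidden, n_out):
    """Fully connected 3-layer feed-forward net (the trained reference net)."""
    return n_in * n_hidden + n_hidden * n_out


def fully_recurrent_connections(n_neurons):
    """Every neuron may connect to every neuron, including itself."""
    return n_neurons * n_neurons


gmlp_max = max_gmlp_connections(n_in=50, n_hidden=20, n_out=26)  # 20 hidden is a guess
trained_net = layered_connections(n_in=50, n_hidden=100, n_out=26)
recurrent_max = fully_recurrent_connections(25 + 26)
```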
General Evolution Setup [3]
Tournament selection (size 2)
2-point crossover
Binary mutation: flip bits with 1/L probability
Real-chromosome mutation: multiple-σ self-adaptation (SA)
Each object maintains “strategy” params which alter the distribution of “object” params
Normal distributions used for both
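The operators listed above can be sketched as below. This is a textbook-style sketch, not the author's implementation: in particular, the self-adaptive step uses the common log-normal update of the step sizes with conventional learning rates, which may differ from what was actually used. All function names are hypothetical.

```python
import math
import random


def tournament_select(population, fitness, rng, size=2):
    """Draw `size` distinct individuals and return the fittest of them."""
    contenders = rng.sample(population, size)
    return max(contenders, key=fitness)


def two_point_crossover(parent_a, parent_b, rng):
    """Swap the segment between two random cut points."""
    i, j = sorted(rng.sample(range(len(parent_a) + 1), 2))
    child_a = parent_a[:i] + parent_b[i:j] + parent_a[j:]
    child_b = parent_b[:i] + parent_a[i:j] + parent_b[j:]
    return child_a, child_b


def flip_mutation(bits, rng):
    """Flip each bit independently with probability 1/L (L = chromosome length)."""
    p = 1.0 / len(bits)
    return [1 - bit if rng.random() < p else bit for bit in bits]


def self_adaptive_mutation(weights, sigmas, rng):
    """One step size ("strategy" param) per weight ("object" param).

    Assumed textbook rule: step sizes are scaled by exp of normal draws,
    then each weight is perturbed by a normal draw with its own step size.
    """
    n = len(weights)
    tau_global = 1.0 / math.sqrt(2.0 * n)
    tau_local = 1.0 / math.sqrt(2.0 * math.sqrt(n))
    shared = rng.gauss(0.0, 1.0)
    new_sigmas = [
        s * math.exp(tau_global * shared + tau_local * rng.gauss(0.0, 1.0))
        for s in sigmas
    ]
    new_weights = [w + s * rng.gauss(0.0, 1.0) for w, s in zip(weights, new_sigmas)]
    return new_weights, new_sigmas


rng = random.Random(0)
child_a, child_b = two_point_crossover([0] * 8, [1] * 8, rng)
mutated_bits = flip_mutation(child_a, rng)
```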
Setup – Recurrent Nets
Difficult to learn Go without structured input
Experiments with recurrent nets included
Allow loops for input Ns
Naturally represent adjacent board intersections
No hidden Ns
Played against JaGo
Typically, output changes without input change due to feedback loops
Computed output only once!
Only 2 directly connected Ns influence each other
Evolution should connect only close Ns
Outline
Computers and Games
The game of Go
Experimental Setup
Training of Go playing ANNs
Evolution of Go Playing ANNs
Summary and Outlook
Training ANNs – Setup
Testing the IRs mentioned previously
No Go-specific knowledge used
Each experiment was repeated 20 times
Nets same as Richards [1998]:
3 layers; fully connected; feed-forward
Linear activation for input Ns; sigmoid for the rest
50 input; 26 output; 100 hidden – 7,600 connections
Patterns: JaGo vs JaGo; 5x5 board
Rprop – resilient variant of Backprop
Training ANNs – Experiment 1
Determine number of training cycles
Too few cycles: weights not adjusted properly
Too many: over-fitting
Determine training pattern set
Limits the level a Go player can reach
Should include all 3 game stages
Both expert and novice moves
JaGo vs JaGo
All game stages
No distinction between winner and loser moves
1,000 .. 5,000 cycles; 50/100/200 games
Training ANNs – Results 1
Average of 20 runs
100 & 200 games better than 50
3,000/5,000 cycles don’t add strength
Best – 200 games; 2,000 cycles
Used hereafter
Training ANNs – Experiment 2
Determine number of hidden Ns
Many: diverse features
Few: fewer but stronger features (perhaps better ones); less time-consuming
100 Ns yielded the best results and were selected
Training ANNs – Experiment 3
Output representations: Standard (SOR) vs Row-Column (RCOR)
200 games; 2,000 cycles; 100 hidden Ns
Similar strength; RCOR competence slightly lower
RCOR is still expensive and adds constraints
SOR is used in the following experiments
Training ANNs – IR Experiments
Various input representations
Used reference ANN (RANN): SIR & SOR; 100 hidden; 7,600 connections; 200 games; 2,000 cycles
Strength = 0.2908; Competence = 0.8467
NIR (half input size) & SOR
Strength = 0.2093; Competence = 0.8031
Naïve input makes it difficult to learn Go
LVIR (3x3 windows) & SOR
Strength = 0.2755; Competence = 0.8258
Slightly lower; LVIR doesn’t add input difficulty
Training ANNs – IRs [2]
Whole co-occurrence matrix (dist=1,2,3); SIR & SOR
Found better Strength & Competence!
Knight’s-move matrix adds relevant information
Whole matrix (dist=1,2,3); NIR & SOR
21% fewer connections due to NIR
Better than standard NIR, but still low
Training ANNs – IRs [3]
3x3 matrices (dist=1,2,3); NIR & SOR
Low, but ~20% better than previous (whole matrix) NIR
3x3 matrices (dist=1,2,3); LVIR/NLVIR
Both matrices and board views use 3x3 windows
No improvement; huge number of Ns not necessary
Training ANNs – IRs Summary
Training ANNs – IRs Summary
Trained ANNs do better against JaGo than against Naïve, although JaGo is the better player
Some over-fitting for good players
Against Naïve, outputs are close to zero – no response
NIR ANNs generally weaker than SIR
Manhattan distance of 2 good against Random
IR + whole matrix (dist=2) was strongest
RANN is still best; selected for evolution
Outline
Computers and Games
The game of Go
Experimental Setup
Training of Go playing ANNs
Evolution of Go Playing ANNs
Summary and Outlook
Evolving Go ANNs
Setup of evolution experiments
Evolution of ANNs against computer players:
Random Player; Naïve; JaGo
Recurrent against JaGo
Coevolution: Cultural; Hall of Fame
Training evolved ANNs
Evolution Setup
5x5 boards; komi of 5.5
50 individuals, as described previously (3 chromosomes)
GMLPs with SIR and SOR: max 3,010 connections
Recurrent ANNs using NIR (25 Ns) and SOR (26): max 2,601 connections
Same strength measure as in training (6k games)
Evolution Against Random
Empirically, 64 games to determine fitness
Best ANN evolved {Str=0.4005; Comp=0.48} after 47 gens; 929 connections
Evolved ANNs hardly reacted to different positions
Always in the middle; never in corners – creates eyes
Unnecessary to “think” against Random
Occasionally Random places at a strategic intersection and then usually wins
Only 3 of 20 best ANNs open at the optimal C3
Evolution Against Naïve
Better player; ANNs develop better strategies
Same setting
200 gens for ALL of the population to win ½ of games – fast learning
Best {Str=0.69; Comp=0.487} after 2915 gens
High strength and only 10 hidden Ns!!
Win rates:
Same against Naïve and Random
Low against JaGo (~0.2)
25% use the optimal opening move (still low)
Exploit Naïve’s weaknesses in endgames
Evolution Against JaGo
Far stronger than Naïve (85% wins)
Takes significantly more time for each move
Used distributed computing
64 games would take 32 hours per run
Only 32 games for fitness – empirically sufficient
Best {Str=0.772; Comp=0.476} after 1909 gens
Scores 100% wins
1k gens to score 0.4; in 4 runs, 100% wins in 3k gens!!!
SD twice as large – harder for evolution
Weak against Naïve (~0.4); strong against Random
Evolution Against JaGo
Again, low competence (~0.5)
Evolved strategies:
Still connecting stones, but faster (responsive)
Tenuki (abandon & play elsewhere) to distract JaGo
9 open optimally; all in the 3x3 area around the center
Strength depends heavily on the opening move
Mid-games sometimes show standard Go sequences!
Take advantage of JaGo’s weakness – capturing weak stones
Recurrent Nets Evolution
Natural representation of the Go board
Inputs are connected
More time-consuming
Only 2 runs; 32 games; setting described previously
100% win rate within 1k generations!!!
Both nets open at C3
Strategies:
1 aggressive; 1 distractive
Protect; create living groups; bad endgames
Very high relative strength
0.94 Random; 0.49 Naïve (never played before)
Cultural Coevolution
Until now, much over-fitting was observed
Fitness: 8 games against all masters (4 each color)
Few, because games are quite similar
Results of a typical run – host population:
3,500 gens
90% wins at 500 gens
Stagnation around 1k
Last master added at 462
After 2k, mean fitness decreases
Cultural Coevolution [2]
Masters
21 ANNs
After number 8 all have R>0.8
Last obtained Strength of 0.365
Strategy (both populations)
Many random move selections
Due to many saturated Ns (output=1)
Games usually similar but multiple random moves are hard to defeat
May be caused by mutation (Multiple Self-Adaptation)
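One way saturated outputs could lead to random move selection is sketched below. The slides do not state how a move is chosen from the network outputs, so this assumes the highest-valued legal move is played and that ties are broken at random; `select_move` and its arguments are hypothetical names.

```python
import random
from typing import Mapping, Sequence


def select_move(outputs: Mapping[int, float],
                legal_moves: Sequence[int],
                rng: random.Random) -> int:
    """Pick the legal move with the highest output; break ties at random.

    Assumption: legal_moves is non-empty and every legal move has an output.
    """
    best = max(outputs[m] for m in legal_moves)
    candidates = [m for m in legal_moves if outputs[m] == best]
    return rng.choice(candidates)


rng = random.Random(0)

# Distinct outputs: the choice is fully determined by the network.
clear = {0: 0.2, 1: 0.9, 2: 0.5}
print(select_move(clear, [0, 1, 2], rng))

# Saturated outputs (all 1.0): every legal move ties, so the pick is random.
saturated = {0: 1.0, 1: 1.0, 2: 1.0}
print([select_move(saturated, [0, 1, 2], rng) for _ in range(5)])
```

Under this assumption, the more output neurons saturate at 1, the larger the tie set becomes and the closer play gets to uniform random choice among those moves.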
![Page 71: 1 of 81 Evolution and Coevolution of ANNs playing Go Peter Mayer, 2004](https://reader031.vdocuments.us/reader031/viewer/2022032704/56649d4d5503460f94a2c5b3/html5/thumbnails/71.jpg)
(81) 71
Cultural Coevolution [3]
Strategy (cont.)
Coevolution found easy solution
Computer players are very difficult to beat with saturated neurons
New extremely long experiment (60k gens!) was performed with different mutation (single-SA)
Similar results, except:
Now most culture growth until gen 10k (last at 40k)
Now fewer saturated neurons
Less fitness decrease despite increasing culture strength
![Page 72: 1 of 81 Evolution and Coevolution of ANNs playing Go Peter Mayer, 2004](https://reader031.vdocuments.us/reader031/viewer/2022032704/56649d4d5503460f94a2c5b3/html5/thumbnails/72.jpg)
(81) 72
Cultural Coevolution [4]
Culture Summary
80 members
After #16 Random>0.94
After #29 all opened optimally
After #57 all Strength>0.4
Wins against JaGo ~0.5 Naïve
~15 hidden Ns – fluctuate between successive
![Page 73: 1 of 81 Evolution and Coevolution of ANNs playing Go Peter Mayer, 2004](https://reader031.vdocuments.us/reader031/viewer/2022032704/56649d4d5503460f94a2c5b3/html5/thumbnails/73.jpg)
(81) 73
Recurrent & Cultural
10k gens
Faster learning but basically same results
R>0.9 at C11 (compared to C14)
N>0.2 at C14 (compared to C37)
Strategy
Still bad against JaGo
Bad openings! (only 2% optimal)
Only last 5 masters close to center
Learned not to capture dead groups
![Page 74: 1 of 81 Evolution and Coevolution of ANNs playing Go Peter Mayer, 2004](https://reader031.vdocuments.us/reader031/viewer/2022032704/56649d4d5503460f94a2c5b3/html5/thumbnails/74.jpg)
(81) 74
Hall of Fame Coevolution
Compared to Cultural
Parameters
Important parameter is HoF size={1,2,4,8,16}
Eight games against each master
3k gens were coevolved
After coevolution all HoF ANNs were evaluated
Every 100 gens the best ANN was evaluated
![Page 75: 1 of 81 Evolution and Coevolution of ANNs playing Go Peter Mayer, 2004](https://reader031.vdocuments.us/reader031/viewer/2022032704/56649d4d5503460f94a2c5b3/html5/thumbnails/75.jpg)
(81) 75
Hall of Fame Coevolution [2]
Results – HoF size 1
Masters – low strength of 0.3625
In gen 1k – one ANN had 0.4
Lost solution
HoF changed every generation; cycles
Results – HoF size 16
Master 5 – highest strength of 0.4403 in gen 400
Strength of 0.5057 was obtained and lost
One master was replaced in every generation!
Somehow weak masters remained in the HoF
Host population stagnates (cycles)
![Page 76: 1 of 81 Evolution and Coevolution of ANNs playing Go Peter Mayer, 2004](https://reader031.vdocuments.us/reader031/viewer/2022032704/56649d4d5503460f94a2c5b3/html5/thumbnails/76.jpg)
(81) 76
Hall of Fame Coevolution [3]
Strategies
All place first stone at D4!
HoF coevolution does not encourage diversity among ANNs
![Page 77: 1 of 81 Evolution and Coevolution of ANNs playing Go Peter Mayer, 2004](https://reader031.vdocuments.us/reader031/viewer/2022032704/56649d4d5503460f94a2c5b3/html5/thumbnails/77.jpg)
(81) 77
Training Evolved ANNs
Evolution against JaGo
Strength ~0.77
4-16 hidden Ns
Training
Strength ~0.3
100 hidden Ns
Check whether evolved structure is good
Train after evolution
Train without evolution only using structure
![Page 78: 1 of 81 Evolution and Coevolution of ANNs playing Go Peter Mayer, 2004](https://reader031.vdocuments.us/reader031/viewer/2022032704/56649d4d5503460f94a2c5b3/html5/thumbnails/78.jpg)
(81) 78
Training Evolved ANNs [2]
Used best 2 evolved ANNs against JaGo
Taken from runs 11 & 17
ANN11 – 10 hidden; 1178 connections
ANN17 – 14 hidden; 1162 connections
Trained with 200 games; 2,000 cycles
Experiment 1 (post-evolution)
Results bad! Strength of 0.11 and 0.10 – lower than any trained ANN (RANN has 0.29)
High competence 0.89
![Page 79: 1 of 81 Evolution and Coevolution of ANNs playing Go Peter Mayer, 2004](https://reader031.vdocuments.us/reader031/viewer/2022032704/56649d4d5503460f94a2c5b3/html5/thumbnails/79.jpg)
(81) 79
Training Evolved ANNs [3]
Experiment 2 – keep only evolved structure
Strength below 0.152 (RANN is 0.29)
Weakest against JaGo (0.05) although trained with JaGo
Against Naïve 0.11 (same as RANN)
Evolution creates efficient structures
Few hidden Ns
Difficult to learn with training
High competence because they seldom responded with the same move to different positions
![Page 80: 1 of 81 Evolution and Coevolution of ANNs playing Go Peter Mayer, 2004](https://reader031.vdocuments.us/reader031/viewer/2022032704/56649d4d5503460f94a2c5b3/html5/thumbnails/80.jpg)
(81) 80
Summary
Training could not achieve high Go playing skills
Evolved ANNs specialized in the opponent which was used during evolution
Cultural coevolution generated strong players
Strength increasing throughout the process
Perhaps an ANN stronger than amateurs can be coevolved
Recurrent nets learned faster
![Page 81: 1 of 81 Evolution and Coevolution of ANNs playing Go Peter Mayer, 2004](https://reader031.vdocuments.us/reader031/viewer/2022032704/56649d4d5503460f94a2c5b3/html5/thumbnails/81.jpg)
(81) 81
Summary [2]
2 coevolved ANNs (recurrent and feed-forward) won the grand tournament
Coevolution proved better than evolution for developing Go strategies
Recurrent ANNs would provide a field for further research
More natural board representation
Could contain a fixed input layer representing the board