Evolution and Coevolution of Artificial Neural Networks playing Go
Thesis by Peter Maier, Salzburg, April 2004
Additional paper used: Computer Go, by Martin Müller
Presented by Dima Stopel
Overview
- Go: History and Rules
- Role of the computer in Go
- Brief introduction to ANN
- Experimental Setup
- Training of Go playing ANNs
- Evolution of Go playing ANNs
History of Go
Go is an ancient Chinese board game that is believed to be 2,000 to 4,000 years old.
Go is played around the world and has several names. The Chinese call it Wei-chi. In Korea it is Baduk. The Japanese word is Igo, or just Go.
Go Basics – Stones and Board
Boards:
- Standard: 19x19
- Beginners: 9x9, 13x13
Stones:
- 180 White
- 181 Black
Go Basics – Gameplay and Winning Condition
- Play starts on an empty board.
- Players put their stones on the intersections of the lines on the board.
- Players can pass at any time. Consecutive passes end the game.
- The goal of the game is to control a larger area than the opponent and take more prisoners.
Go Basics – Three Rules of Go
Rule 1: Stones of one color that have been completely surrounded by the opponent are removed from the board as prisoners.
(Liberties: the empty intersections directly adjacent to a stone or group; a group with no liberties left is captured.)
Go Basics – Three Rules of Go
Rule 3 – the "Ko" rule: no infinite repetition; a move may not immediately recreate the previous board position.
Overview
- Go: History and Rules
- Role of the computer in Go
- Brief introduction to ANN
- Experimental Setup
- Training of Go playing ANNs
- Evolution of Go playing ANNs
State of Computer Go – The Challenge
It is easy to write a Go program that can play a complete game. However, it is hard to write a program that plays well.
State of Computer Go – Overview
- 1980: The Ing Foundation issues a million-dollar prize for a professional-level Go program
State of Computer Go – The Challenge
Computer Go poses many formidable conceptual, technical, and software engineering challenges.
- Most programs have required 5–15 person-years of effort.
- They contain 50–100 modules dealing with different aspects of the game.
- Despite all these efforts, the current level of programs is still modest.
State of Computer Go – The Challenge
The search space for 19x19 Go is very large compared to other popular board games.
- The number of distinct board positions is 3^(19x19) = 3^361, about 1.7 x 10^172, and about 1.2% of these are legal.
- In chess about 20 potential moves are available from a board position, but on a standard 19x19 Go board there are about 200–300 potential moves.
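The numbers above can be reproduced with a few lines of arithmetic (a sketch; the 1.2% legality figure is the one quoted on the slide):

```python
import math

# Each of the 19*19 = 361 intersections is empty, black, or white.
positions = 3 ** (19 * 19)          # distinct colourings of the board
legal = positions * 12 // 1000      # roughly 1.2% of these are legal

# 3^361 is about 1.7 * 10^172.
print(f"about 10^{math.log10(positions):.0f} positions")
```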
Overview
- Go: History and Rules
- Role of the computer in Go
- Brief introduction to ANN
- Experimental Setup
- Training of Go playing ANNs
- Evolution of Go playing ANNs
Brief intro to ANN – History and Motivation
- 1943: Warren McCulloch, Walter Pitts – A Logical Calculus of the Ideas Immanent in Nervous Activity
- 1958: Frank Rosenblatt – Perceptron
- 1969: M. Minsky, S. Papert – Perceptrons
- 1974: Paul Werbos – Back-propagation
- 1986: D. E. Rumelhart – Learning Internal Representations by Error Propagation
- 1991: K. Hornik – Approximation Capabilities of Multilayer Feedforward Networks
Brief intro to ANN – General Facts
- Data-driven model
- One of the most flexible mathematical models for data mining
- Very powerful for solving very complex non-linear problems
Brief intro to ANN – Weights and Biases: the Artificial Neuron

    output = f( Σ_{i=1}^{n} w_i · in_i + t )

The neuron forms the weighted sum of its inputs in_i plus a bias t and passes it through the activation function f. Non-linear power!
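The neuron formula can be written as a tiny function; here the logistic sigmoid stands in for f and `bias` plays the role of t (a sketch, not the thesis code):

```python
import math

def neuron(inputs, weights, bias):
    # Weighted sum of the inputs plus the bias t ...
    s = sum(w_i * in_i for w_i, in_i in zip(weights, inputs)) + bias
    # ... squashed by a non-linear activation function f (logistic sigmoid).
    return 1.0 / (1.0 + math.exp(-s))

# The weighted sum here is 0.5*1.0 + 0.25*(-2.0) = 0, so the output is 0.5.
print(neuron([1.0, -2.0], [0.5, 0.25], 0.0))  # 0.5
```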
Brief intro to ANN – Example: the XOR Problem
[Figure: the four points (0,0), (1,0), (0,1) and (1,1) in the x-y plane; the two XOR classes cannot be separated by a single straight line.]
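A single-layer perceptron cannot solve XOR, but a two-layer network with hand-picked weights can; a minimal sketch (the step activation and the particular weights are illustrative choices, not taken from the thesis):

```python
def step(x):
    """Threshold activation: fires (1) when the weighted sum is positive."""
    return 1 if x > 0 else 0

def xor_net(x, y):
    h1 = step(x + y - 0.5)      # hidden unit: fires on OR
    h2 = step(x + y - 1.5)      # hidden unit: fires on AND
    return step(h1 - h2 - 0.5)  # output: OR and not AND = XOR

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor_net(a, b))  # 0 0 -> 0, 0 1 -> 1, 1 0 -> 1, 1 1 -> 0
```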
Brief intro to ANN – Activation Functions

Logistic sigmoid:    f(x) = 1 / (1 + e^(-x))

Sign function:       f(x) = 1 if x > 0;  0 if x = 0;  -1 if x < 0

Hyperbolic tangent:  f(x) = (e^x - e^(-x)) / (e^x + e^(-x))
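The three activation functions above, transcribed directly into code:

```python
import math

def logistic(x):
    """Logistic sigmoid: 1 / (1 + e^-x), output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def sign(x):
    """Sign function: 1 / 0 / -1 (not differentiable at 0)."""
    return 1 if x > 0 else (0 if x == 0 else -1)

def tanh(x):
    """Hyperbolic tangent: (e^x - e^-x) / (e^x + e^-x), output in (-1, 1)."""
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))

print(logistic(0.0), sign(-2.0), tanh(0.0))  # 0.5 -1 0.0
```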
Brief intro to ANN – Multilayer Perceptron
[Diagram: inputs feed a hidden layer, which feeds the output layer; nodes are neurons, connections are weights.]
Brief intro to ANN – Multilayer Perceptron
Most ANN architectures are:
- Feed-forward
- Layered (two layers)
In the past up to three layers were used.
There are many other types:
- Recurrent networks
- Feed-forward but not layered
Brief intro to ANN – Learning Procedure
The process of determining the values of the weights W on the basis of the data is called learning or training.
The network computes an approximation f̂(x) = F(x, W). We want to make f̂(x) as close as possible to the target f(x). We can achieve this by minimizing an error function with respect to W (the weights).

Error function example (sum of squares):

    E = (1/2) Σ_{n=1}^{N} ( f̂(x_n) - f(x_n) )^2
Brief intro to ANN – Example: Gradient Descent
Error function derivative calculation: compute ∂E/∂w for every weight w.
Next weight calculation: w ← w - η · ∂E/∂w, where η is the learning rate.
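A minimal worked example of both steps on a hypothetical one-weight model ŷ = w·x fitted to points from y = 2x (the data and the learning rate are illustrative):

```python
# Fit y_hat = w*x to data from y = 2x by gradient descent on the
# sum-of-squares error E = (1/2) * sum((w*x - y)^2).
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w, eta = 0.0, 0.05

for _ in range(100):
    grad = sum((w * x - y) * x for x, y in data)  # dE/dw
    w -= eta * grad                               # next-weight rule

print(round(w, 3))  # 2.0
```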
Evolution Approach – Together with ANN
Parts that can be evolved:
- Connection weights
- Network topology
- Number of hidden neurons
- Activation functions
Overview
- Go: History and Rules
- Role of the computer in Go
- Brief introduction to ANN
- Experimental Setup
- Training of Go playing ANNs
- Evolution of Go playing ANNs
Experimental Setup – The Referee and Game Ending
Referee: JaGo, a Go-playing Java program written by Fuming Wang, slightly improved.
Game ending:
- Each player passes
- One player has placed all of his stones
- There are no free intersections left
- Fools Draw
Experimental Setup – Computer Go Players
- Random player: only knows the rules
- Naïve player: knows the rules and some basic strategies
- JaGo player: JaGo is the best computer player that has been used; estimated rank ~16 kyu
Experimental Setup – ANN Player
- Creativity factor
- Strength = wins / games
- 2000 games against each player, 6000 games in total
Experimental Setup – Go Board Representation
- Standard input representation: two inputs for each intersection
- Naïve input representation: one input for each intersection
- Limited-view input representation: w-sized windows
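The standard representation can be sketched as follows; the function name, the 0/1/2 board values, and the toy board are illustrative assumptions, not the thesis encoding:

```python
# Two inputs per intersection: one flags the player's own stones,
# the other flags the opponent's stones.
def encode(board, player):
    """board: 2-D list with 0 = empty, 1 = black, 2 = white."""
    inputs = []
    for row in board:
        for cell in row:
            inputs.append(1.0 if cell == player else 0.0)           # own stone
            inputs.append(1.0 if cell not in (0, player) else 0.0)  # opponent
    return inputs

tiny = [[0, 1], [2, 0]]  # 2x2 toy board: one black, one white stone
print(encode(tiny, 1))   # [0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0]
```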
Experimental Setup – Output Representation
- Standard output representation: one output for each intersection
- Row-column output representation: one output for each row and each column
Experimental Setup – Techniques Used
- Simple training
- Simple evolution: using the Random, Naïve, and JaGo players
- Simple coevolution: individuals compete against each other
- Cultural coevolution: individuals compete against a culture
- Hall of Fame coevolution: individuals compete against the HoF
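The Hall-of-Fame idea can be sketched in a few lines: each generation's champion is archived, and fitness is measured against the whole archive, so later generations must still beat earlier champions. `play` (returns 1 for a win, 0 for a loss) and `mutate` are hypothetical stand-ins for real Go games and genetic operators:

```python
import random

def hof_coevolution(population, play, mutate, generations, rng=random):
    hall_of_fame = [population[0]]
    for _ in range(generations):
        # Fitness: total wins against every archived champion.
        scores = [sum(play(ind, champ) for champ in hall_of_fame)
                  for ind in population]
        best = population[scores.index(max(scores))]
        hall_of_fame.append(best)                  # archive the champion
        population = [mutate(best, rng) for _ in population]
    return hall_of_fame

# Toy demo: "networks" are numbers, bigger beats smaller.
hof = hof_coevolution([1.0, 3.0, 2.0],
                      play=lambda a, b: 1 if a > b else 0,
                      mutate=lambda x, rng: x + rng.random(),
                      generations=5)
print(len(hof), hof[-1] >= hof[1])  # 6 True
```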
Experimental Setup – ANN Encoding
Feed-forward ANN: three chromosomes were used
- Binary: connections encoding
- Binary: hidden neurons encoding
- Real: weights and biases encoding
Recurrent ANN: two chromosomes were used (no hidden layer)
- Binary: connections encoding
- Real: weights and biases encoding
Generally, two-point crossover was used. The strength function was always used as fitness.
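Two-point crossover on one chromosome can be sketched like this (a generic implementation, not the thesis code): the segment between two random cut points is swapped between the parents.

```python
import random

def two_point_crossover(parent_a, parent_b, rng=random):
    """Swap the segment between two random cut points."""
    assert len(parent_a) == len(parent_b)
    i, j = sorted(rng.sample(range(len(parent_a) + 1), 2))
    child_a = parent_a[:i] + parent_b[i:j] + parent_a[j:]
    child_b = parent_b[:i] + parent_a[i:j] + parent_b[j:]
    return child_a, child_b

a, b = [0] * 8, [1] * 8
c, d = two_point_crossover(a, b)
print(c, d)  # e.g. [0, 0, 1, 1, 1, 0, 0, 0] [1, 1, 0, 0, 0, 1, 1, 1]
```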
Overview
- Go: History and Rules
- Role of the computer in Go
- Brief introduction to ANN
- Experimental Setup
- Training of Go playing ANNs
- Evolution of Go playing ANNs
Training of Go playing ANNs
- Feed-forward, fully connected ANNs were used
- Each training experiment was repeated 20 times
- The training set consisted of Go games played by JaGo against itself
- The sigmoid function was used
- Number of connections:
- For straight evaluation, each network played 2000 games against the three players
- Learning algorithm used: resilient back-propagation (Rprop)
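Resilient backpropagation adapts a separate step size per weight from the sign of the gradient only; a minimal single-weight sketch (η+ = 1.2 and η- = 0.5 are the usual defaults, and this simplified variant omits weight backtracking):

```python
def rprop_step(w, grad, prev_grad, delta,
               eta_plus=1.2, eta_minus=0.5,
               delta_min=1e-6, delta_max=50.0):
    if grad * prev_grad > 0:        # same direction as last time: speed up
        delta = min(delta * eta_plus, delta_max)
    elif grad * prev_grad < 0:      # gradient sign flipped: slow down
        delta = max(delta * eta_minus, delta_min)
    if grad > 0:                    # step against the gradient sign
        w -= delta
    elif grad < 0:
        w += delta
    return w, delta

# Minimise E(w) = w^2 (gradient 2w) starting from w = 3.
w, delta, prev = 3.0, 0.1, 0.0
for _ in range(60):
    g = 2 * w
    w, delta = rprop_step(w, g, prev, delta)
    prev = g
print(w)  # close to 0
```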
Overview
- Go: History and Rules
- Role of the computer in Go
- Brief introduction to ANN
- Experimental Setup
- Training of Go playing ANNs
- Evolution of Go playing ANNs
Evolution of Go playing ANNs – Initialization
- Binary chromosomes: bits set with probability p
- Real chromosomes: initialized to small random values between -0.1 and 0.1
- A maximum of 20 hidden neurons was used
- 20 experiments for each run
- End conditions: fitness (strength) = 1, or 3000 generations reached
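The initialization rules above, as a sketch (the function names and the chromosome length are illustrative):

```python
import random

# Binary genes are set to 1 with probability p;
# real-valued genes are drawn uniformly from [-0.1, 0.1].
def init_binary(length, p, rng=random):
    return [1 if rng.random() < p else 0 for _ in range(length)]

def init_real(length, rng=random):
    return [rng.uniform(-0.1, 0.1) for _ in range(length)]

connections = init_binary(50, p=0.5)   # connection-mask chromosome
weights = init_real(50)                # weight/bias chromosome
print(len(connections), len(weights))  # 50 50
```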
Evolution of Go playing ANNs – Results for Random
Interesting observation: the output values of the ANNs were not influenced by the inputs; the networks simply tried to occupy important places (mainly in the middle of the board).
Evolution of Go playing ANNs – Results for Naïve
Important observation: an ANN is able to learn basic Go by means of evolution. After more than 200 generations, all ANNs could win at a rate of 50%.
Evolution of Go playing ANNs – Results for JaGo
Important observation: 1000 generations were needed for a mean fitness of 0.4. The standard deviation is twice as high as for Naïve.
Evolution of Go playing ANNs – Results for JaGo: Recurrent ANNs
Important observation: within 1000 generations an individual evolved with a fitness of 1!
Evolution of Go playing ANNs – Cultural Coevolution
[Diagram: the population exchanging individuals with the culture.]
Evolution of Go playing ANNs – Results for Cultural Coevolution
“The analysis of these culture ANNs showed that the ANNs—when playing the black stones—take advantage of a weakness of JaGo.”
Evolution of Go playing ANNs – Results for HoF
Results were poor compared to cultural coevolution.