Evolution and Coevolution of Artificial Neural Networks Playing Go

Thesis by Peter Maier, Salzburg, April 2004
Additional paper used: Computer Go, by Martin Müller
Presented by Dima Stopel
[email protected]

Post on 21-Dec-2015



Overview

Go: History and Rules
Role of the computer in Go
Brief introduction to ANN
Experimental Setup
Training of Go playing ANNs
Evolution of Go playing ANNs

History of Go

Go is an ancient Chinese board game that is believed to be 2,000 to 4,000 years old.

Go is played around the world and has several names. The Chinese call it Wei-chi. In Korea it is Baduk. The Japanese word is Igo, or just Go.

Go Basics: Stones and Board

Boards
Standard: 19x19
Beginners: 9x9, 13x13

Stones
180 white
181 black

Go Basics: Gameplay and Winning Condition

Play starts on an empty board.
Players put their stones at the intersections of the lines on the board.
Players can pass at any time.
Two consecutive passes end the game.
The goal of the game is to control a larger area than the opponent and take more prisoners.

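Go Basics: Three Rules of Go

Rule 1
Stones of one color that have been completely surrounded by the opponent are removed from the board as prisoners.

Liberties: the empty intersections directly adjacent to a stone or connected group of stones. A group with no liberties left is completely surrounded.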
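Rule 1 can be sketched in code as a flood fill that collects a connected group and its liberties. This is only an illustrative sketch: the board encoding (a list of strings with ".", "B" and "W") and the function names are assumptions made here, not taken from the thesis or from JaGo.

```python
from typing import List, Set, Tuple

Point = Tuple[int, int]
EMPTY = "."  # assumed encoding: "." empty, "B" black, "W" white


def group_and_liberties(board: List[str], row: int, col: int) -> Tuple[Set[Point], Set[Point]]:
    """Return the connected group containing (row, col) and its liberties."""
    color = board[row][col]
    if color == EMPTY:
        raise ValueError("no stone at this intersection")
    size = len(board)  # square board assumed
    group: Set[Point] = set()
    liberties: Set[Point] = set()
    stack = [(row, col)]
    while stack:
        r, c = stack.pop()
        if (r, c) in group:
            continue
        group.add((r, c))
        # Only the four orthogonal neighbours count in Go.
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < size and 0 <= nc < size:
                if board[nr][nc] == EMPTY:
                    liberties.add((nr, nc))
                elif board[nr][nc] == color and (nr, nc) not in group:
                    stack.append((nr, nc))
    return group, liberties


def is_captured(board: List[str], row: int, col: int) -> bool:
    """A group is captured when it has no liberties left."""
    _, liberties = group_and_liberties(board, row, col)
    return not liberties
```

For example, a single black stone in the centre of a 3x3 board with white stones on all four sides has no liberties and would be removed.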

Go Basics: Three Rules of Go

Rule 2
No suicide moves are allowed.

Go Basics: Three Rules of Go

Rule 3: The Ko rule
No infinite repetition: a move that would recreate the immediately preceding board position is not allowed.


State of computer in Go: The challenge

It is easy to write a Go program that can play a complete game. However, it is hard to write a program that plays well.

State of computer in Go: Overview picture

• 1980: The Ing Foundation issues a million-dollar prize for a professional-level Go program

State of computer in Go: The challenge

Computer Go poses many formidable conceptual, technical and software engineering challenges.

Most programs have required 5–15 person-years of effort.

They contain 50–100 modules dealing with different aspects of the game.

Despite all these efforts, the current level of programs is still modest.

State of computer in Go: The challenge

The search space for 19 × 19 Go is very large compared to other popular board games.

The number of distinct board positions is $3^{19 \times 19} \approx 10^{172}$, and about 1.2% of these (roughly $2 \times 10^{170}$) are legal.

In chess about 20 potential moves are available from a board position, but on a standard 19x19 Go board there are about 200–300 potential moves.
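As a quick arithmetic check of the size of this search space, the sketch below counts the board configurations when each of the 361 intersections is empty, black or white, and then takes 1.2% of that count as the estimate of legal positions. The variable names are illustrative only.

```python
import math

# Each of the 19 x 19 = 361 intersections is empty, black, or white.
positions = 3 ** (19 * 19)

# Order of magnitude of the total count (Python integers are exact).
exponent = math.log10(positions)

# About 1.2% of the configurations are legal positions.
legal_estimate = positions * 12 // 1000
legal_exponent = math.log10(legal_estimate)

print(f"3^361 is about 10^{exponent:.1f}")
print(f"1.2% of that is about 10^{legal_exponent:.1f}")
```

The total comes out at roughly 10^172.2 and the legal share at roughly 10^170.3, which is far beyond what exhaustive search can cover.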


Brief intro to ANN: History and motivation

1943 Warren McCulloch, Walter Pitts: A Logical Calculus of the Ideas Immanent in Nervous Activity
1958 Frank Rosenblatt: Perceptron
1969 M. Minsky, S. Papert: Perceptrons
1974 Paul Werbos: Back-propagation
1986 D. E. Rumelhart: Learning internal representations by error propagation
1991 K. Hornik: Approximation capabilities of multilayer feedforward networks

Brief intro to ANN: General facts

Data-driven model.

One of the most flexible mathematical models for data mining.

Very powerful when trying to solve very complex nonlinear problems.

Brief intro to ANN: Weights and Biases – Artificial Neuron

$$\text{output} = f\Big(t + \sum_{i=1}^{n} w_i \, in_i\Big)$$

Here $in_i$ are the inputs, $w_i$ the weights, $t$ the bias and $f$ the activation function.

Nonlinear power!
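A single artificial neuron can be sketched in a few lines: a weighted sum of the inputs plus a bias, passed through an activation function. The function name, the argument names and the choice of tanh as the default activation are assumptions made for illustration.

```python
import math
from typing import Callable, Sequence


def neuron_output(
    inputs: Sequence[float],
    weights: Sequence[float],
    bias: float,
    activation: Callable[[float], float] = math.tanh,
) -> float:
    """Weighted sum of the inputs plus a bias, passed through the activation."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    net = bias + sum(w * x for w, x in zip(weights, inputs))
    return activation(net)
```

The nonlinearity comes entirely from the activation function; with the identity as activation the neuron is just a linear model.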

Brief intro to ANN: Example – XOR problem

[Figure: the four XOR inputs (0,0), (0,1), (1,0) and (1,1) plotted in the x–y plane]
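XOR is the classic case that a single neuron cannot solve, because no straight line separates (0,1) and (1,0) from (0,0) and (1,1). A network with one hidden layer can. The sketch below uses hand-picked weights and a step activation purely for illustration; they are assumptions, not values from the presentation.

```python
def step(x: float) -> int:
    """Threshold activation: fires when the net input is positive."""
    return 1 if x > 0 else 0


def xor_network(x: int, y: int) -> int:
    """Two hidden neurons (OR and AND) feeding one output neuron."""
    h_or = step(1.0 * x + 1.0 * y - 0.5)   # fires if at least one input is 1
    h_and = step(1.0 * x + 1.0 * y - 1.5)  # fires only if both inputs are 1
    # Output: OR but not AND, which is exactly XOR.
    return step(1.0 * h_or - 1.0 * h_and - 0.5)


for x in (0, 1):
    for y in (0, 1):
        print(x, y, xor_network(x, y))
```

Each hidden neuron draws one line in the plane; the output neuron combines the two half-planes into the band that contains only (0,1) and (1,0).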

Brief intro to ANN: Example – Spam problem

Brief intro to ANN: Activation Functions

Logistic Sigmoid: $f(x) = \frac{1}{1 + e^{-x}}$

Hyperbolic Tangent: $f(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$

Sign Function: $f(x) = \begin{cases} 1 & x > 0 \\ 0 & x = 0 \\ -1 & x < 0 \end{cases}$
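The three activation functions can be written directly from their definitions. This is a plain sketch with illustrative function names; for very large |x| the exponentials overflow, so production code would normally call `math.tanh` or a numerically stable sigmoid instead.

```python
import math


def logistic(x: float) -> float:
    """Logistic sigmoid: maps any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def hyperbolic_tangent(x: float) -> float:
    """Hyperbolic tangent written out from its definition: range (-1, 1)."""
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))


def sign(x: float) -> int:
    """Sign function: 1 for positive, 0 for zero, -1 for negative input."""
    if x > 0:
        return 1
    if x < 0:
        return -1
    return 0
```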

Brief intro to ANN: Multilayer perceptron

[Figure: multilayer perceptron with inputs, a hidden layer and an output layer; labels mark the neurons and the weights]

Brief intro to ANN: Multilayer perceptron

Most ANN architectures are:
Feed forward
Layered (two layers)

In the past up to three layers were used.

There are many other types:
Recurrent networks
Feed forward but not layered

Brief intro to ANN: Learning Procedure

The process of determining the values for W on the basis of the data is called learning or training.

We want to make $\hat{f}(x) = F(x, W)$ as close as possible to $f(x)$. We can achieve this by minimizing the error function by changing W (weights).

Error function example (sum of squares):

$$E = \frac{1}{2} \sum_{n=1}^{N} \left(\hat{f}(x_n) - f(x_n)\right)^2$$

Brief intro to ANN: Example – Gradient Descent

Error function derivative calculation

Next weight calculation
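The two steps named on this slide can be sketched for the simplest possible model, a single linear unit F(x, W) = w·x + b trained on a sum-of-squares error. The model, the learning rate and the step count are assumptions chosen to keep the example small; they are not the settings used in the thesis.

```python
from typing import Sequence, Tuple


def sum_of_squares(w: float, b: float, xs: Sequence[float], targets: Sequence[float]) -> float:
    """E = 1/2 * sum over samples of (prediction - target)^2."""
    return 0.5 * sum((w * x + b - t) ** 2 for x, t in zip(xs, targets))


def gradient(w: float, b: float, xs: Sequence[float], targets: Sequence[float]) -> Tuple[float, float]:
    """Error function derivative calculation: dE/dw and dE/db."""
    dw = sum((w * x + b - t) * x for x, t in zip(xs, targets))
    db = sum((w * x + b - t) for x, t in zip(xs, targets))
    return dw, db


def gradient_descent(
    xs: Sequence[float],
    targets: Sequence[float],
    learning_rate: float = 0.05,
    steps: int = 500,
) -> Tuple[float, float]:
    """Next weight calculation: move each weight against its gradient."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        dw, db = gradient(w, b, xs, targets)
        w -= learning_rate * dw
        b -= learning_rate * db
    return w, b
```

On data generated from a straight line, for example targets 1, 3, 5, 7 for inputs 0, 1, 2, 3, the loop recovers a slope close to 2 and an offset close to 1. For a multilayer network the same update applies, with back-propagation supplying the derivatives.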

Evolution approach: Together with ANN

Parts that can be evolved:
Connection weights
Network topology
Number of hidden neurons
Activation functions


Experimental Setup: The Referee and Game Ending

Referee
JaGo, a Go playing Java program written by Fuming Wang, slightly improved

Game ending
Both players pass
One player has placed all of his stones
There are no free intersections left
Fools Draw

Experimental Setup: Computer Go players

Random Player
Only knows the rules

Naïve Player
Knows the rules
Knows some basic strategies

JaGo Player
JaGo is the best computer player that has been used
Estimated rank ~16 kyu

Experimental Setup: ANN Player

Creativity factor

Strength (wins / games)

2000 games with each player, total 6000 games
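The strength measure is simply wins divided by games. A minimal sketch of an evaluation against the three opponents could look like this; `play_game` is a hypothetical callable standing in for a full game run by the referee, and reporting an additional overall value across all 6000 games is an assumption made here.

```python
from typing import Callable, Dict, Sequence


def strength(wins: int, games: int) -> float:
    """Strength = wins / games."""
    if games <= 0:
        raise ValueError("games must be positive")
    return wins / games


def evaluate(
    play_game: Callable[[str], bool],  # hypothetical: True if the ANN player wins
    opponents: Sequence[str] = ("Random", "Naive", "JaGo"),
    games_per_opponent: int = 2000,
) -> Dict[str, float]:
    """Play a fixed number of games against each opponent and report strengths."""
    results: Dict[str, float] = {}
    total_wins = 0
    for opponent in opponents:
        wins = sum(1 for _ in range(games_per_opponent) if play_game(opponent))
        results[opponent] = strength(wins, games_per_opponent)
        total_wins += wins
    # Assumption: also report one figure over all games played.
    results["overall"] = strength(total_wins, games_per_opponent * len(opponents))
    return results
```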

Experimental Setup: Go Board Representation

Standard input representation
Two inputs for each intersection

Naïve input representation
One input for each intersection

Limited view input representation
w-sized windows
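The three input representations can be sketched as simple encoders. The slide only gives the number of inputs per intersection, so the concrete values used here (one flag per colour for the standard form, +1 / -1 / 0 for the naïve form) and the way a window is cut out are assumptions for illustration.

```python
from typing import List

EMPTY, BLACK, WHITE = 0, 1, 2  # assumed cell encoding
Board = List[List[int]]


def standard_inputs(board: Board) -> List[float]:
    """Two inputs per intersection: one flag for black, one for white."""
    inputs: List[float] = []
    for row in board:
        for cell in row:
            inputs.append(1.0 if cell == BLACK else 0.0)
            inputs.append(1.0 if cell == WHITE else 0.0)
    return inputs


def naive_inputs(board: Board) -> List[float]:
    """One input per intersection: +1 black, -1 white, 0 empty."""
    value = {EMPTY: 0.0, BLACK: 1.0, WHITE: -1.0}
    return [value[cell] for row in board for cell in row]


def window(board: Board, top: int, left: int, w: int) -> Board:
    """Limited view: the w x w part of the board starting at (top, left)."""
    return [row[left:left + w] for row in board[top:top + w]]
```

A 19x19 board gives 722 inputs in the standard form and 361 in the naïve form, while a window keeps the input layer small regardless of board size.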

Experimental Setup: Output Representation

Standard output representation
One output for each intersection

Row-Column output representation
One output for each row and each column
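Turning network outputs into a move can be sketched as picking the best-scoring legal intersection. The slide does not say how row and column outputs are combined, so adding the two values is an assumption here, as are the function names and the convention that an empty list of legal moves means a pass.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[int, int]


def best_move_standard(outputs: Sequence[float], size: int, legal: List[Point]) -> Optional[Point]:
    """One output per intersection, stored row by row; pick the highest legal one."""
    if not legal:
        return None  # no legal move: treated as a pass in this sketch
    return max(legal, key=lambda rc: outputs[rc[0] * size + rc[1]])


def best_move_row_column(
    row_outputs: Sequence[float],
    col_outputs: Sequence[float],
    legal: List[Point],
) -> Optional[Point]:
    """One output per row and per column; score a point by their sum (assumption)."""
    if not legal:
        return None
    return max(legal, key=lambda rc: row_outputs[rc[0]] + col_outputs[rc[1]])
```

The row-column form needs only 2 × 19 = 38 outputs on a full board instead of 361, at the cost of a coarser preference over intersections.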

Experimental Setup: Techniques used

Simple Training
Simple Evolution: using Random, Naïve and JaGo players
Simple Coevolution: competing against each other
Cultural Coevolution: competing against culture
Hall of Fame Coevolution: competing against HoF

Experimental Setup: ANN Encoding

Feed forward ANN: three chromosomes were used
Binary: connections encoding
Binary: hidden neurons encoding
Real: weights and biases encoding

Recurrent ANN: two chromosomes were used (no hidden layer)
Binary: connections encoding
Real: weights and biases encoding

Generally, two-point crossover was used.
The strength function was always used as fitness.
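Two-point crossover picks two cut points and swaps the segment between them, and it works the same way on binary and real-valued chromosomes. The sketch below represents a chromosome as a plain Python list; the function name and the optional `rng` argument are illustrative choices.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

Gene = TypeVar("Gene")


def two_point_crossover(
    parent_a: Sequence[Gene],
    parent_b: Sequence[Gene],
    rng: random.Random = random.Random(),
) -> Tuple[List[Gene], List[Gene]]:
    """Swap the segment between two random cut points of equal-length parents."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    n = len(parent_a)
    if n < 1:
        raise ValueError("chromosomes must not be empty")
    # Two distinct cut positions in 0..n, sorted so that i < j.
    i, j = sorted(rng.sample(range(n + 1), 2))
    a, b = list(parent_a), list(parent_b)
    child_a = a[:i] + b[i:j] + a[j:]
    child_b = b[:i] + a[i:j] + b[j:]
    return child_a, child_b
```

Every gene position in the two children still holds one gene from each parent, so no genetic material is lost or duplicated by the operator.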


Training of Go playing ANNs

Feed forward, fully connected ANNs were used.
Each training experiment is repeated 20 times.
The training set consisted of Go games played by JaGo against itself.
Sigmoid function was used.
Number of connections:
For strength evaluation each network played 2000 games against each of the three players.
Learning algorithm used: resilient back-propagation.
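Resilient back-propagation adapts a separate step size for every weight and uses only the sign of the gradient, not its magnitude. The sketch below follows the common iRprop- variant; which variant and which constants the thesis used is not stated on the slide, so the values here (1.2, 0.5 and the step bounds) are the usual textbook defaults and should be read as assumptions.

```python
from typing import List, Sequence, Tuple


def rprop_step(
    weights: Sequence[float],
    grads: Sequence[float],
    prev_grads: Sequence[float],
    step_sizes: Sequence[float],
    eta_plus: float = 1.2,
    eta_minus: float = 0.5,
    step_min: float = 1e-6,
    step_max: float = 50.0,
) -> Tuple[List[float], List[float], List[float]]:
    """One update; returns new weights, gradients to remember, and step sizes."""
    new_weights: List[float] = []
    new_prev: List[float] = []
    new_steps: List[float] = []
    for w, g, pg, s in zip(weights, grads, prev_grads, step_sizes):
        if g * pg > 0:
            # Same sign as last time: accelerate.
            s = min(s * eta_plus, step_max)
        elif g * pg < 0:
            # Sign flipped: the minimum was overshot, so slow down
            # and skip the update for this weight (iRprop-).
            s = max(s * eta_minus, step_min)
            g = 0.0
        if g > 0:
            w -= s
        elif g < 0:
            w += s
        new_weights.append(w)
        new_prev.append(g)
        new_steps.append(s)
    return new_weights, new_prev, new_steps
```

A training loop would compute the gradients with back-propagation over the whole training set, call `rprop_step`, and carry the returned gradients and step sizes into the next epoch.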



Evolution of Go playing ANNs: Initialization

Binary chromosomes: each bit is set randomly with probability p.
Real chromosomes: set to small random values between -0.1 and 0.1.
A maximum of 20 hidden neurons was used.
20 experiments for each run.
End conditions: fitness (strength) = 1 or 3000 generations reached.
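The initialization and end conditions above can be sketched as follows. The function names and the optional `rng` argument are illustrative, and reading "set with probability p" as "each bit is 1 with probability p" is an assumption.

```python
import random
from typing import List


def init_binary_chromosome(length: int, p: float, rng: random.Random = random.Random()) -> List[int]:
    """Each bit is set to 1 with probability p, otherwise 0."""
    return [1 if rng.random() < p else 0 for _ in range(length)]


def init_real_chromosome(
    length: int,
    low: float = -0.1,
    high: float = 0.1,
    rng: random.Random = random.Random(),
) -> List[float]:
    """Weights and biases start as small random values in [low, high]."""
    return [rng.uniform(low, high) for _ in range(length)]


def should_stop(best_fitness: float, generation: int, max_generations: int = 3000) -> bool:
    """End condition: perfect strength reached or generation limit hit."""
    return best_fitness >= 1.0 or generation >= max_generations
```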

Evolution of Go playing ANNs: Results for Random

Interesting observation: the output values of the ANNs weren't influenced by the inputs. The networks just tried to catch important places (mainly in the middle of the board).

Evolution of Go playing ANNs: Results for Naïve

Important observation: an ANN is able to learn basic Go by means of evolution. After more than 200 generations, all ANNs could win at a rate of 50%.

Evolution of Go playing ANNs: Results for JaGo

Important observation: 1000 generations were needed for a mean fitness of 0.4. The standard deviation is twice as high as for Naïve.

Evolution of Go playing ANNs: Results for JaGo – Recurrent ANNs

Important observation: within 1000 generations an individual evolved with a fitness of 1!

Evolution of Go playing ANNs: Cultural Coevolution

[Figure: Culture and Population]

Evolution of Go playing ANNs: Results for Cultural Coevolution

“The analysis of these culture ANNs showed that the ANNs—when playing the black stones—take advantage of a weakness of JaGo.”

Evolution of Go playing ANNs: Results for HoF

Poor compared to Cultural Coevolution.

The End

Any Questions?

Thank you ;)