Chapter 9: Genetic Algorithms

Upload: sheena-bates

Post on 13-Dec-2015



Genetic Algorithms

- Based upon biological evolution
- Generate successor hypotheses by repeatedly mutating and recombining current hypotheses
- Act as a randomized, parallel beam search through hypothesis space

Popularity of GAs

- Evolution is a successful, robust method for adaptation in biological systems
- GAs can search complex hypothesis spaces
- Easily parallelized

Genetic Algorithms

- In each iteration, all members of the population are evaluated by the fitness function.
- A new population is generated by probabilistically selecting the most fit individuals.
- Some of the individuals are changed by the Mutation and Crossover operations.

GA Terms

- Fitness: a function that assigns an evaluation score to a hypothesis.
- Fitness Threshold: the fitness value at which the search terminates.
- p: the size of the population of hypotheses.
- r: the fraction of the population to be replaced by crossover in each generation.
- m: the mutation rate.

The Algorithm

Initialize Population: P := p random hypotheses
Evaluate: compute fitness for each h in P
while max fitness < Fitness Threshold do
  Create New Generation P_S:
    Select: probabilistically select (1-r)p hypotheses from P to add to P_S
    Crossover: probabilistically choose rp hypotheses (rp/2 pairs) from P to cross over; add the offspring to P_S
    Mutate: choose m percent of the hypotheses in P_S to mutate
  Update: P := P_S
  Evaluate: compute fitness for each h in P
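The loop above can be sketched in Python as follows. This is only an illustration under several assumptions that the slides do not fix: hypotheses are lists of 0/1 bits, selection is fitness proportionate, crossover is single-point, a mutation flips one random bit, and m is treated as a fraction rather than a percentage. The names `genetic_algorithm`, `n_bits`, `max_generations`, and `seed` are hypothetical.

```python
import random

def genetic_algorithm(fitness, n_bits, threshold, p=20, r=0.6, m=0.05,
                      max_generations=500, seed=0):
    """Evolve bit strings (lists of 0/1); fitness must return a non-negative number."""
    rng = random.Random(seed)
    # Initialize: P := p random hypotheses
    population = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(p)]
    for _ in range(max_generations):
        # Evaluate: compute fitness for each h in P
        scores = [fitness(h) for h in population]
        if max(scores) >= threshold:
            break
        # Tiny offset keeps the weights valid even if every score is 0
        weights = [s + 1e-9 for s in scores]
        # Select: copy (1 - r) * p hypotheses, chosen in proportion to fitness
        n_keep = round((1 - r) * p)
        new_population = [h[:] for h in rng.choices(population, weights=weights, k=n_keep)]
        # Crossover: fill the remaining slots with offspring of selected pairs
        while len(new_population) < p:
            a, b = rng.choices(population, weights=weights, k=2)
            cut = rng.randrange(1, n_bits)  # assumes n_bits >= 2
            new_population += [a[:cut] + b[cut:], b[:cut] + a[cut:]]
        new_population = new_population[:p]
        # Mutate: flip one random bit in a fraction m of the new generation
        for h in rng.sample(new_population, round(m * p)):
            i = rng.randrange(n_bits)
            h[i] = 1 - h[i]
        # Update: P := P_S
        population = new_population
    return max(population, key=fitness)
```

For example, `genetic_algorithm(sum, n_bits=8, threshold=8)` searches for the all-ones string, using the number of 1 bits as the fitness. The generation cap is a safeguard added here; the slide's loop runs until the threshold is reached.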

Classification

- One of the main functions of a machine learning algorithm is classification.
- The agent is presented with a bit string and asked to assign it to one of two or more classes.
- A pattern that classifies all bit strings is called a hypothesis.

Hypothesis Representation

- Hypotheses are often represented by bit strings.
- Each bit in the string has an interpretation associated with it.
- For example, a bit in the string could represent a possible classification.
- It is good to ensure that all possible bit patterns have meaning.

Hypothesis Representation Example

Outlook   Wind   PlayTennis
011       10     10

- Each bit corresponds to a possible value of the attribute.
- A value of 1 indicates that the attribute is allowed to take that value.
- Corresponds to: if Wind = Strong and Outlook = Overcast or Rain, then PlayTennis = Yes.
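A small decoder makes the encoding concrete. This sketch assumes the bit order within each attribute is Sunny, Overcast, Rain for Outlook, Strong, Weak for Wind, and Yes, No for PlayTennis. The Outlook and Wind orders agree with the reading given on the slide, while the PlayTennis order is an assumption. `ATTRIBUTES`, `decode`, and `matches` are illustrative names.

```python
# Assumed value order for each attribute (one bit per value).
ATTRIBUTES = [
    ("Outlook", ["Sunny", "Overcast", "Rain"]),
    ("Wind", ["Strong", "Weak"]),
    ("PlayTennis", ["Yes", "No"]),
]

def decode(bits):
    """Map a bit string such as '011 10 10' to {attribute: allowed values}."""
    bits = bits.replace(" ", "")
    rule, pos = {}, 0
    for name, values in ATTRIBUTES:
        chunk = bits[pos:pos + len(values)]
        rule[name] = [v for v, b in zip(values, chunk) if b == "1"]
        pos += len(values)
    return rule

def matches(bits, example):
    """True if every precondition attribute (all but the last) allows the example's value."""
    rule = decode(bits)
    return all(example[name] in rule[name] for name, _ in ATTRIBUTES[:-1])
```

With this encoding, `decode("011 10 10")` allows Overcast or Rain for Outlook and only Strong for Wind, so a Sunny day does not satisfy the rule's preconditions.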

Crossover

- Two parent hypotheses are chosen probabilistically from the population, based upon their fitness.
- The parent hypotheses combine to form two child hypotheses.
- The child hypotheses are added to the new population.

Crossover Details

- Crossover operator: produces two new offspring from two parents
- Crossover bit mask: determines which parent contributes the bit at each position in the string

Crossover Types

- Single-point crossover
  - Parents are “cut” at one point and swap the part of the bit string after the cut with the other parent
- Two-point crossover
  - Parents are cut at two points and swap the segment between the cuts
  - Often outperforms single-point
- Uniform crossover
  - Each bit is sampled randomly from one parent or the other
  - Often loses coherence in the hypothesis

Crossover Types (examples)

Operator         Initial strings   Crossover mask   Offspring
Single point:    11101001000       11111000000      11101010101
                 00001010101                        00001001000
Two-point:       11101001000       00111110000      11001011000
                 00001010101                        00101000101
Uniform:         11101001000       10011010011      10001000100
                 00001010101                        01101011001
Point mutation:  11101001000                        11100001000
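The three crossover types differ only in how the bit mask is generated, which a short sketch can show. It assumes hypotheses and masks are strings of '0' and '1' with equal length, and that a mask bit of 1 means the first child copies that position from the first parent. The function names are illustrative.

```python
import random

def crossover(parent1, parent2, mask):
    """Child A copies parent1 where the mask is '1' and parent2 elsewhere; child B is the reverse."""
    child_a = "".join(x if m == "1" else y for x, y, m in zip(parent1, parent2, mask))
    child_b = "".join(y if m == "1" else x for x, y, m in zip(parent1, parent2, mask))
    return child_a, child_b

def single_point_mask(n, rng=random):
    """A run of 1s followed by 0s; the cut is placed so both parents contribute."""
    cut = rng.randrange(1, n)
    return "1" * cut + "0" * (n - cut)

def two_point_mask(n, rng=random):
    """A contiguous run of 1s between two distinct cut points."""
    i, j = sorted(rng.sample(range(n + 1), 2))
    return "0" * i + "1" * (j - i) + "0" * (n - j)

def uniform_mask(n, rng=random):
    """Each mask bit is drawn independently and uniformly."""
    return "".join(rng.choice("01") for _ in range(n))
```

Applying `crossover("11101001000", "00001010101", "11111000000")` reproduces the single-point offspring from the example strings, `("11101010101", "00001001000")`. With the two-point mask `00111110000` the same two offspring as in the example are produced, but in the opposite order, because of the mask convention chosen here.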

Mutation

- A number of hypotheses are chosen randomly from the population.
- Each of these hypotheses is randomly mutated to form a slightly different hypothesis.
- The mutated hypotheses replace the original hypotheses.
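A minimal sketch of this step, assuming hypotheses are strings of '0' and '1', that "slightly different" means a single flipped bit, and that the mutation rate m is a fraction of the population. `mutate` and `mutate_population` are illustrative names.

```python
import random

def mutate(hypothesis, rng=random):
    """Return a copy of the bit string with one randomly chosen bit flipped."""
    i = rng.randrange(len(hypothesis))
    flipped = "1" if hypothesis[i] == "0" else "0"
    return hypothesis[:i] + flipped + hypothesis[i + 1:]

def mutate_population(population, m, rng=random):
    """Replace a uniformly chosen fraction m of the population with mutated copies."""
    result = list(population)
    for idx in rng.sample(range(len(result)), round(m * len(result))):
        result[idx] = mutate(result[idx], rng)
    return result
```

The original list is left untouched; the mutated hypotheses take the place of their originals only in the returned population.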

Fitness Function

- Contains the criteria for evaluating a hypothesis
  - Accuracy of the hypothesis
  - Size of the hypothesis
- Main source of inductive bias for genetic algorithms

Selection

- Fitness proportionate selection
  - The probability that a hypothesis is chosen is its fitness relative to the total fitness of the population
- Tournament selection
  - Two hypotheses are chosen at random and the winner (usually the more fit of the two) is selected
- Rank selection
  - The probability that a hypothesis is chosen is proportional to its rank in the population sorted by fitness
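The three schemes can be compared side by side in a short sketch. It assumes non-negative fitness values with a positive total, and, for tournament selection, a hypothetical parameter `p_win` giving the probability that the fitter of the two hypotheses wins. The slide only says that the winner is selected.

```python
import random

def fitness_proportionate(population, fitness, rng=random):
    """Pick one hypothesis with probability fitness(h) / total fitness."""
    weights = [fitness(h) for h in population]
    return rng.choices(population, weights=weights, k=1)[0]

def tournament(population, fitness, p_win=0.9, rng=random):
    """Draw two distinct members at random; the fitter one wins with probability p_win."""
    a, b = rng.sample(population, 2)
    better, worse = (a, b) if fitness(a) >= fitness(b) else (b, a)
    return better if rng.random() < p_win else worse

def rank_selection(population, fitness, rng=random):
    """Pick with probability proportional to rank (worst = 1, best = len(population))."""
    ranked = sorted(population, key=fitness)
    ranks = list(range(1, len(ranked) + 1))
    return rng.choices(ranked, weights=ranks, k=1)[0]
```

Rank selection depends only on the ordering of the fitness values, so one hypothesis with a very large fitness cannot dominate the draw the way it can under fitness proportionate selection.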

Boltzmann Distribution

- Used to probabilistically select which individuals to cross over

$$p(i) = \frac{e^{f(i)/T}}{\sum_{j} e^{f(j)/T}}$$
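The Boltzmann selection probabilities can be computed directly from the fitness values f(i) and the temperature T. The sketch below subtracts the maximum fitness before exponentiating, a standard numerical safeguard that does not change the resulting probabilities. The function names are illustrative.

```python
import math
import random

def boltzmann_probabilities(fitnesses, T):
    """Return p(i) = exp(f(i)/T) / sum_j exp(f(j)/T) for each fitness value."""
    top = max(fitnesses)  # shifting by the maximum avoids overflow for small T
    exps = [math.exp((f - top) / T) for f in fitnesses]
    total = sum(exps)
    return [e / total for e in exps]

def boltzmann_select(population, fitnesses, T, rng=random):
    """Draw one individual according to the Boltzmann probabilities."""
    weights = boltzmann_probabilities(fitnesses, T)
    return rng.choices(population, weights=weights, k=1)[0]
```

A large T makes the distribution nearly uniform, while a small T concentrates almost all of the probability on the fittest individuals.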

Genetic Programming

- Individuals are programs
  - Represented by trees
  - Nodes in the tree represent function calls
- User supplies
  - Primitive functions
  - Terminals
- Allows for arbitrary length

Genetic Programming

- Crossover
  - Crossover points chosen randomly
  - Done by exchanging subtrees
- Mutation
  - Not always necessary
  - Randomly change a node
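Subtree crossover can be sketched with program trees stored as nested Python lists. This representation is an assumption made for illustration: a function call is `[name, arg1, arg2, ...]` and a terminal is a plain string. Only crossover is shown, and all function names are hypothetical.

```python
import copy
import random

def paths(tree, prefix=()):
    """Yield the index path to every node (function call or terminal) in the tree."""
    yield prefix
    if isinstance(tree, list):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, prefix + (i,))

def get(tree, path):
    """Return the subtree found by following the index path."""
    for i in path:
        tree = tree[i]
    return tree

def put(tree, path, subtree):
    """Replace the node at path with subtree and return the resulting root."""
    if not path:
        return subtree
    get(tree, path[:-1])[path[-1]] = subtree
    return tree

def subtree_crossover(parent1, parent2, rng=random):
    """Pick one random node in each parent and exchange the subtrees rooted there."""
    a, b = copy.deepcopy(parent1), copy.deepcopy(parent2)  # leave the parents intact
    path_a = rng.choice(list(paths(a)))
    path_b = rng.choice(list(paths(b)))
    sub_a, sub_b = get(a, path_a), get(b, path_b)
    return put(a, path_a, sub_b), put(b, path_b, sub_a)
```

For example, crossing `["+", ["sin", "x"], "y"]` with `["*", "x", ["sqrt", ["+", "x", "y"]]]` yields two children whose combined node count equals that of the parents, although each child may be larger or smaller than either parent.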

Genetic Programming

- Search through the space of programs
- Other search methods also work
  - Hill climbing
  - Simulated annealing
- Not likely to be effective for large programs
  - Search space much too large

Genetic Programming

- Variations
  - Individuals are programs
  - Individuals are neural networks
    - Back-propagation
    - RBF networks
  - Individuals are reinforcement learning agents
    - Construct policy by genetic operations
    - Could be aided by actual reinforcement learning

Genetic Programming

- Smart variations
  - Hill-climbing mutation
  - Smart crossover
    - Requires a localized evaluation function
    - Extra domain knowledge required

Genetic Programming Applications

- Block stacking, Koza (1992)
  - Spell “universal”
- Operators
  - (MS x): move to stack
  - (MT x): move to table
  - (EQ x y): T if x = y
  - (NOT x)
  - (DU x y): do x until y

Genetic Programming Applications

- Block stacking continued
- Terminal arguments
  - CS (current stack)
  - TB (top correct block)
  - NN (next necessary)
- Final discovered program
  - (EQ (DU (MT CS) (NOT CS)) (DU (MS NN) (NOT NN)))
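To see why the discovered program works, it can be traced with a tiny interpreter. The operator semantics below are an interpretation of the short descriptions on the slides, not a definitive specification: MT x moves the top of the stack to the table if x is in the stack, MS x moves x from the table to the top of the stack, and DU runs its first argument until its second is true. The `limit` argument is a safeguard added here, and the class and function names are hypothetical.

```python
GOAL = "universal"

class BlockWorld:
    """One stack plus a table; blocks are the distinct letters of GOAL."""

    def __init__(self, stack, table):
        self.stack = list(stack)  # bottom block first, top block last
        self.table = set(table)

    def _correct(self):
        """Number of blocks at the bottom of the stack that are already in goal order."""
        n = 0
        while n < len(self.stack) and n < len(GOAL) and self.stack[n] == GOAL[n]:
            n += 1
        return n

    def CS(self):
        """Current stack: the top block, or None (false) if the stack is empty."""
        return self.stack[-1] if self.stack else None

    def TB(self):
        """Top correct block, or None if no block is correctly placed."""
        n = self._correct()
        return self.stack[n - 1] if n else None

    def NN(self):
        """Next necessary block above TB, or None if the word is complete."""
        n = self._correct()
        return GOAL[n] if n < len(GOAL) else None

    def MS(self, x):
        """Move x from the table to the top of the stack; report success."""
        if x in self.table:
            self.table.remove(x)
            self.stack.append(x)
            return True
        return False

    def MT(self, x):
        """If x is in the stack, move the top block to the table; report success."""
        if x in self.stack:
            self.table.add(self.stack.pop())
            return True
        return False

def DU(action, condition, limit=100):
    """Do action until condition holds; give up after limit repetitions."""
    for _ in range(limit):
        action()
        if condition():
            return True
    return False

def discovered_program(w):
    """(EQ (DU (MT CS) (NOT CS)) (DU (MS NN) (NOT NN)))"""
    first = DU(lambda: w.MT(w.CS()), lambda: not w.CS())
    second = DU(lambda: w.MS(w.NN()), lambda: not w.NN())
    return first == second  # EQ only serves to run both loops in sequence
```

Under this reading, the first loop unstacks every block onto the table and the second loop rebuilds the stack one needed block at a time, so the program works from any starting arrangement. Starting from `BlockWorld(stack="lau", table="nivers")`, the stack ends up spelling the goal word.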

Genetic Programming Applications

- Circuit design (Koza et al. 1996)
  - Gene represents a potential circuit
  - Simulated with SPICE
  - Population of 640,000
  - 64-node parallel processor
  - 98% of circuits invalid in the first generation
  - Good circuit after 137 generations

Genetic Algorithms

- Relationships to other search techniques
  - Mutation is a blind “hill climbing” search
    - Mostly to get out of local minima
  - Selection is just hill climbing
  - Crossover is unique
    - No obvious counterpart in other search techniques
    - The source of power for genetic algorithms

Evolution and Learning

- Lamarckian evolution
  - Proposed that learned traits could be passed on to succeeding generations
  - Contradicted by the scientific evidence in biology
  - Works for genetic algorithms

Evolution and Learning

- Baldwin effect
  - Learning individuals perform better
  - Rely less on hard-coded traits
  - Allows a more diverse gene pool
  - Indirectly accelerates adaptation
- Hinton and Nowlan
  - Early generations had more learning than later ones

Evolution and Learning

- Baldwin effect alters inductive bias
  - Hard-coded weights restrict learning
  - Good hard-coded weights allow faster learning
- Nature vs. nurture
  - Humans have greater learning
  - Require shaping
    - Learn simple things before complex things

Schema Theorem

Probability of selecting a hypothesis:

$$\Pr(h) = \frac{f(h)}{\sum_{i=1}^{n} f(h_i)} = \frac{f(h)}{n\,\bar{f}(t)}$$

- $f$: fitness
- $n$: number of individuals in the population
- $\bar{f}(t)$: average fitness of the individuals at time $t$

Schema Theorem

Probability of selecting a schema:

$$\Pr(h \in s) = \sum_{h \in s \cap p_t} \frac{f(h)}{n\,\bar{f}(t)} = \frac{\hat{u}(s,t)}{n\,\bar{f}(t)}\, m(s,t)$$

- $h \in s \cap p_t$: $h$ is in both the schema $s$ and the population $p_t$
- $\hat{u}(s,t)$: average fitness of the schema
- $m(s,t)$: number of individuals in the schema

Schema Theorem

Equation for the average fitness of a schema:

$$\hat{u}(s,t) = \frac{\sum_{h \in s \cap p_t} f(h)}{m(s,t)}$$

- $h \in s \cap p_t$: $h$ is in both the schema $s$ and the population $p_t$
- $\hat{u}(s,t)$: average fitness of the schema
- $m(s,t)$: number of individuals in the schema

Schema Theorem

Expected number of members of schema $s$:

$$E[m(s,t+1)] = \frac{\hat{u}(s,t)}{\bar{f}(t)}\, m(s,t)$$

- $\bar{f}(t)$: average fitness of the individuals
- $\hat{u}(s,t)$: average fitness of the schema
- $m(s,t)$: number of individuals in the schema

Schema Theorem

Full schema theorem:

$$E[m(s,t+1)] \ge \frac{\hat{u}(s,t)}{\bar{f}(t)}\, m(s,t) \left(1 - p_c\,\frac{d(s)}{l-1}\right) (1 - p_m)^{o(s)}$$

- $p_c$: single-point crossover probability
- $d(s)$: distance between the leftmost and rightmost defined bits
- $l$: length of the bit string
- $p_m$: probability of mutation
- $o(s)$: number of defined bits in the schema
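The bound in the full schema theorem can be evaluated for a concrete population. The sketch assumes schemas are strings over '0', '1', and '*' (where '*' marks an undefined bit), hypotheses are bit strings of the same length, and the average population fitness is positive. The function names are illustrative.

```python
def in_schema(schema, h):
    """True if bit string h agrees with the schema on every defined position."""
    return all(s in ("*", b) for s, b in zip(schema, h))

def order(schema):
    """o(s): number of defined (non-'*') bits."""
    return sum(c != "*" for c in schema)

def defining_length(schema):
    """d(s): distance between the leftmost and rightmost defined bits."""
    defined = [i for i, c in enumerate(schema) if c != "*"]
    return defined[-1] - defined[0] if defined else 0

def expected_count_bound(schema, population, fitness, p_c, p_m):
    """Lower bound on E[m(s, t+1)] given the population at time t."""
    members = [h for h in population if in_schema(schema, h)]
    m = len(members)                                   # m(s, t)
    if m == 0:
        return 0.0
    u_hat = sum(fitness(h) for h in members) / m       # average fitness of the schema
    f_bar = sum(fitness(h) for h in population) / len(population)
    l = len(schema)
    survive_crossover = 1 - p_c * defining_length(schema) / (l - 1)
    survive_mutation = (1 - p_m) ** order(schema)
    return (u_hat / f_bar) * m * survive_crossover * survive_mutation
```

With `p_c = 0` and `p_m = 0` the function reduces to the selection-only expression for the expected number of members. Schemas with a short defining length and few defined bits lose the least to crossover and mutation.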