Just what are “building blocks”? How do (should) Evolutionary Algorithms “work”?
Chris Stephens and Jorge Cervantes, Instituto de Ciencias Nucleares, UNAM. FOGA 2007, 9/1/2007
1
Just what are “building blocks”?
How do (should) Evolutionary Algorithms “work”?
Chris Stephens and Jorge Cervantes,
Instituto de Ciencias Nucleares, UNAM
FOGA 2007, 9/1/2007
2
Theory
What should it do?
It’s mathematically rigorous
It’s useful for practitioners
It’s intuitive
It’s exact
It predicts well
It unifies phenomena
3
Theory
What’s the best approach?
“Old” Schema Theory and the BBH
Dynamical Systems Model
Statistical Mechanics Approach
Engineering “Rules of thumb”
Population Biology Models
Coarse Grained models
4
The Problem of Theory…
Theory
Experiment
The “ideal”
5
The Problem of Theory…
In EC …
Theory
?? ??
Experiment
New Applications
New Algorithms
“Most algorithms are NEVER used (except by the people who created them)” - Darrell Whitley, GECCO 2003 tutorial
e.g. Multi-Resource Traveling Gravedigger Problem with Variable Coffin Size
6
The Problem of Theory…
The EC Expectation Gap
What theoreticians think practitioners are and what practitioners think theoreticians should be
What practitioners think theoreticians are and what theoreticians think practitioners should be
7
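EC Theory – the “Bare Necessities” – the choice of representation
“Objects”, Dim = |X|
[Figure: example objects, e.g. the real vector (1.321, 2.463, 3.149) on axes x, y, z and the string (1,0,0), alongside the labels ES, GAs, Linear GP, Variable-length GAs, GP?]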
8
EC Theory – the “Bare Necessities”
Objects have fitness: Object → f
Objects have interactions: Selection, Mutation, Recombination
[Figure: objects i, j and k; m – recombination “mode”]
Dynamics: Selection + Mutation + Recombination
9
In mathematics…
That’s most of standard population genetics and evolutionary computation!
Finite population model determined by Markov chain. In the infinite population limit for haploids:
Implicit summation over repeated indices
Probability to mutate genotype J to genotype I
Probability to implement recombination
Probability that given recombination takes place it is implemented with mode m
Probability to select genotype I
Conditional probability for “child” J given “parents” K and L and a mode m
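The equation this legend annotates was an image and is absent from the transcript. A form consistent with the five legend entries is sketched here. The symbols M, P' and P(t) are assumed names; only λ and pc(m) appear in the surviving text, and the separate overall factor p_c is inferred from the legend entry “Probability to implement recombination”.

```latex
% Assumed notation: M_I^J = probability to mutate genotype J to genotype I,
% P'_I = probability to select genotype I, p_c = probability to implement recombination,
% p_c(m) = probability of mode m given recombination,
% \lambda_J^{KL}(m) = conditional probability for child J given parents K, L and mode m.
% Repeated indices J, K, L are summed over.
P_I(t+1) = M_I{}^{J}\left[(1-p_c)\,P'_J(t)
         + p_c \sum_m p_c(m)\,\lambda_J{}^{KL}(m)\,P'_K(t)\,P'_L(t)\right]
```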
10
Select an object J; don’t recombine it with another
Mutate it to object I
Select two “parents” K and L
Recombine them with respect to a recombination mode m, applied with probability pc pc(m), to obtain a “child” J
• Ω coupled non-linear difference equations
• There are Ω^3 different λ_J^KL
• Most of them are zero
• In object/string basis for a given m more than one K and L can give rise to J
• Equation is written covariantly (in terms of tensors) and therefore is valid in any coordinate system
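The two routes described above (select and mutate, or select two parents, recombine, then mutate) can be sketched as one generation of the infinite-population model. This is a minimal sketch under stated assumptions, not the authors' code: binary strings, fitness-proportional selection, recombination modes given as binary masks, one child per parent pair, and independent bit-flip mutation with rate `mu`. The function names `child` and `generation` are hypothetical.

```python
from itertools import product

def child(K, L, mask):
    """Child takes the locus from parent K where mask is '1', from L where it is '0'."""
    return "".join(k if m == "1" else l for k, l, m in zip(K, L, mask))

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def generation(P, f, p_c, mask_probs, mu):
    """One generation: proportional selection, masked recombination, bit-flip mutation.

    P maps each string to its proportion, f maps each string to its fitness,
    mask_probs maps each mask (mode m) to p_c(m); all names are assumptions.
    """
    strings = list(P)
    N = len(strings[0])
    mean_f = sum(f[s] * P[s] for s in strings)
    P_sel = {s: f[s] * P[s] / mean_f for s in strings}       # selection
    P_rec = {s: (1 - p_c) * P_sel[s] for s in strings}       # no recombination
    for mask, p_m in mask_probs.items():                     # recombination
        for K in strings:
            for L in strings:
                P_rec[child(K, L, mask)] += p_c * p_m * P_sel[K] * P_sel[L]
    return {                                                 # mutation J -> I
        I: sum(mu ** hamming(I, J) * (1 - mu) ** (N - hamming(I, J)) * P_rec[J]
               for J in strings)
        for I in strings
    }

# Usage: three loci, uniform start, "counting ones" fitness shifted by 1, uniform masks.
strings = ["".join(b) for b in product("01", repeat=3)]
P0 = {s: 1 / 8 for s in strings}
f = {s: 1 + s.count("1") for s in strings}
masks = {m: 1 / 8 for m in strings}
P1 = generation(P0, f, p_c=0.5, mask_probs=masks, mu=0.01)
```

The triple loop over masks and parent pairs makes the cost per generation grow with the cube of the number of strings, which is why direct iteration becomes impractical for long strings.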
11
Two Questions…
1. Can we understand anything “qualitatively” from them?
How does genetic dynamics “work”? (Why and when are recombination and mutation useful?)
What are the effective degrees of freedom/collective modes?
2. Can we “solve” them?
Put them on the computer. Not very feasible for N = 100! (For binary strings that is 2^100 coupled equations.)
12
Can we make things simpler? – consider only one operator…
• Selection only – can get exact solution in terms of “objects”, e.g. strings (microscopic degrees of freedom are good “coordinates” for selection)
• Mutation only – can get exact solution by Fourier transforming (coordinate transformation to the Walsh/Fourier basis); this diagonalizes the mutation matrix, and the solutions are “normal modes” (collective/effective degrees of freedom)
Can answer both 1) and 2) in these cases
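Both exact solutions can be checked numerically on a small example. The sketch below assumes fitness-proportional selection and independent bit-flip mutation with rate `mu`, neither of which the slide specifies, and the function names are hypothetical. Strings are indexed by integers whose binary digits are the loci. For selection the closed form is P_I(t) ∝ f_I^t P_I(0); for mutation each Walsh coefficient is simply multiplied by (1 − 2·mu) raised to the order of the mode.

```python
def popcount(n):
    return bin(n).count("1")

def select(P, f):
    """One step of proportional selection."""
    mean_f = sum(fi * pi for fi, pi in zip(f, P))
    return [fi * pi / mean_f for fi, pi in zip(f, P)]

def selection_closed_form(P0, f, t):
    """Exact solution after t generations: P_I(t) proportional to f_I**t * P_I(0)."""
    weights = [fi ** t * pi for fi, pi in zip(f, P0)]
    Z = sum(weights)
    return [w / Z for w in weights]

def mutate(P, mu, N):
    """One step of bit-flip mutation; popcount(i ^ j) is the Hamming distance."""
    size = len(P)
    return [
        sum(mu ** popcount(i ^ j) * (1 - mu) ** (N - popcount(i ^ j)) * P[j]
            for j in range(size))
        for i in range(size)
    ]

def walsh(P):
    """Walsh transform: w_k = sum_x (-1)**popcount(k & x) * P_x."""
    size = len(P)
    return [sum((-1) ** popcount(k & x) * P[x] for x in range(size))
            for k in range(size)]

# Usage with N = 3 loci (8 strings) and arbitrary illustrative numbers.
N, mu = 3, 0.1
P0 = [x / 36 for x in (1, 2, 3, 4, 5, 6, 7, 8)]
f = [1, 2, 1, 3, 2, 1, 4, 2]

P = P0
for _ in range(5):
    P = select(P, f)
exact = selection_closed_form(P0, f, 5)          # agrees with P up to rounding

before, after = walsh(P0), walsh(mutate(P0, mu, N))
# after[k] equals (1 - 2*mu)**popcount(k) * before[k] up to rounding
```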
But what about recombination?
13
• Consider schemata/marginals and neglect the construction term
Holland’s Schema theorem for schemata of length l and order Nm
[Equation: schema theorem, with the annotations below]
Bigger for fitter schemata
Smaller for longer schemata
Smaller for higher order schemata
Tight linkage beneficial because tightly linked genes are more likely to cross over together
Dynamic schema fitness is population dependent
[Figure: example schemata with defined loci marked “a”]
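For reference, the standard textbook statement of the schema theorem for one-point crossover and bit-flip mutation is sketched here. The notation is the common textbook one and is an assumption, not necessarily the slide's: δ(H) is the defining length and o(H) the order of schema H (corresponding to the slide's l and Nm up to convention), N is the string length, and m(H,t) is the number of instances of H in the population.

```latex
% Assumed textbook notation; the inequality arises because construction of H
% from other schemata (the "construction term") is neglected.
E\left[m(H,t+1)\right] \;\ge\; m(H,t)\,\frac{f(H,t)}{\bar f(t)}
    \left[1 - p_c\,\frac{\delta(H)}{N-1}\right](1-p_m)^{o(H)}
```

The three annotations map onto the three factors: the fitness ratio is bigger for fitter schemata, the bracket is smaller for longer schemata, and the last factor is smaller for higher order schemata.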
14
“building block” Hypothesis:
A GA works by combining short, low-order,
highly fit schemata (“building blocks”) into
fitter higher order schemata
• But how would we recognise one if we saw one?
• Building what?
• How many of them are there?
• Just how are they combined together?
• When is recombination beneficial?
• How does the effect of recombination depend on the fitness landscape (and on other operators/parameters)?
15
Fitness landscape “linkage”
[Figure: strings with epistatic genes marked “a”, contrasting tightly linked epistatic genes with loosely linked epistatic genes]
Create a representation so that epistatic genes are tightly linked
Understand the “linkage” (epistatic) patterns of the fitness landscape (linkage learning)
But… what is the relationship between “landscape blocks” and “building blocks”?
16
Does recombination favour tight linkage?
Perform a “coarse graining” (i.e. write it in terms of schemata) of the RHS of the exact microscopic equations or, equivalently, do a linear coordinate transformation using
[Equation: coarse-grained equation and definition of Δ]
Δ: selection-weighted linkage disequilibrium coefficient
Depends on population state, fitness landscape and recombination distribution
Gives a complete description of the utility of recombination mode by mode and generation by generation
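The coarse-grained equation was an image and is absent from the transcript. A form consistent with the surrounding statements, ignoring mutation, is sketched here. The notation is assumed: P'_I is the probability to select string I, I_m is the schema that I defines on the loci picked out by mask m, and I_m̄ is the schema on the complementary loci.

```latex
% Assumed notation; mutation is left out for clarity.
P_I(t+1) = P'_I(t) - p_c \sum_m p_c(m)\,\Delta_I(m,t),
\qquad
\Delta_I(m,t) = P'_I(t) - P'_{I_m}(t)\,P'_{I_{\bar m}}(t)
```

With this sign convention a negative Δ for a given mode means that recombination through that mode increases the proportion of I, and a positive Δ means that it decreases it.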
17
Building Block schemata
• Object/string construction is now written in terms of schemata/marginals - Building Block schemata
• These BBs are not the same as those of the “building block” hypothesis – they are not necessarily short or low-order or even fit!
• For every recombination mode/channel there is a corresponding unique BB pair
• The number of BB schemata is precisely defined (e.g. 2^N for binary strings)
• They form a coordinate basis (many in fact, one for each object)
• Hierarchical solutions – objects have BBs, these BBs have their BBs etc.
• Hierarchy can be represented diagrammatically
This is how recombination “works”
For a given object/schema it specifies the ONLY ways it can be built
18
Recombination via a particular channel increases/decreases the proportion (effective fitness) of a given string or schema I when Δ < 0 / Δ > 0 respectively
If Δ < 0, “channel” is “non-deceptive”: higher probability to select the Building Blocks of the string/schema than the string/schema itself
Favours “loose linkage”
If Δ > 0, “channel” is “deceptive”: lower probability to select the Building Blocks of the string/schema than the string/schema itself
Favours “tight linkage”
Standard two-bit deception: f(0*) > f(1*) > 0, i.e. Δ > 0
19
Example: three loci, 1-point crossover
Level 1 BBs – BBs of the string (e.g. optimum)
Level 2 BBs – BBs of the BBs
Level 3 BBs – BBs of the BBs of the BBs – there aren’t any, hierarchy terminates at O(1) BBs
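The hierarchy for this example can be enumerated mechanically. The sketch below assumes that, under 1-point crossover, a schema is split by every cut point lying strictly inside its defining span, giving a left and a right Building Block; `bb_pairs` and `hierarchy` are hypothetical helper names and `*` marks an undefined locus.

```python
def bb_pairs(schema):
    """Building Block pairs of a schema under 1-point crossover."""
    defined = [i for i, c in enumerate(schema) if c != "*"]
    if len(defined) < 2:
        return []                      # O(1) schemata cannot be split further
    n = len(schema)
    pairs = set()
    for cut in range(defined[0] + 1, defined[-1] + 1):
        left = schema[:cut] + "*" * (n - cut)
        right = "*" * cut + schema[cut:]
        pairs.add((left, right))
    return sorted(pairs)

def hierarchy(schema):
    """Successive levels of BBs until only order-one schemata remain."""
    levels, current = [], {schema}
    while True:
        nxt = set()
        for s in current:
            for left, right in bb_pairs(s):
                nxt.update((left, right))
        if not nxt:
            return levels
        levels.append(sorted(nxt))
        current = nxt

# Usage: the optimum 111 on three loci.
for depth, level in enumerate(hierarchy("111"), start=1):
    print(f"Level {depth} BBs:", level)
```

For `111` this yields two levels: the pairs (1**, *11) and (11*, **1) at level 1, and the order-one schemata at level 2, after which the recursion stops.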
20
Landscape blocks
Modular landscapes (m landscape blocks):
m=1: NIAH
m=N: “counting ones”
f_0=0: Royal Road function
Concatenated traps
Useful metrics:
[Equation: metric definitions]
Compares the relative effects of two operator sets; e.g. recombination and selection vs selection only, or recombination and selection vs selection and mutation
21
What can theory tell us about selecto-recombinative EAs?
22
Predictions
First, the obvious – if a string or schema does not exist in the population then [equation]
If it does exist then there exists a critical proportion for any string/schema such that if [equation] and hence recombination is “bad”, where the critical proportion is population, mask/mode and landscape dependent
To see interaction between biases of selection and recombination consider a random population, then [equation]
23
Predictions
For 1-block NIAH, N=4; only one landscape block and [equation] (true for any mask)
Recombination is disadvantageous for all masks
For 4-block NIAH, N=4; maximum number of landscape blocks and [equation] (true for any mask)
Recombination is advantageous for all masks
For 2-block NIAH, N=4; intermediate number of landscape blocks
the relative advantage of recombination is mask dependent
0011 is compatible with the landscape blocks but 0001 isn’t
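These three cases can be reproduced with a short calculation of the selection-weighted linkage disequilibrium coefficient for the optimum 1111 in a uniform random population. Several things here are assumptions rather than the authors' setup: fitness-proportional selection, a block landscape whose fitness is the sum over blocks of `f1` for an all-ones block and `f0` otherwise (with illustrative values 2 and 1), and the definition Δ = P'(string) − P'(BB on the mask loci) · P'(BB on the other loci), where Δ > 0 means recombination via that mask is disadvantageous.

```python
from fractions import Fraction
from itertools import product

def block_niah(num_blocks, n_bits=4, f1=2, f0=1):
    """Fitness table for an assumed additive block needle-in-a-haystack landscape."""
    size = n_bits // num_blocks
    table = {}
    for bits in product("01", repeat=n_bits):
        s = "".join(bits)
        blocks = [s[i:i + size] for i in range(0, n_bits, size)]
        table[s] = sum(f1 if b == "1" * size else f0 for b in blocks)
    return table

def delta(fitness, target, mask):
    """Delta for `target` and `mask` in a uniform population (exact rationals)."""
    total = sum(fitness.values())

    def p_select(loci):
        # Selection probability of the schema agreeing with target on `loci`.
        match = sum(f for s, f in fitness.items()
                    if all(s[i] == target[i] for i in loci))
        return Fraction(match, total)

    n = len(target)
    in_mask = [i for i in range(n) if mask[i] == "1"]
    out_mask = [i for i in range(n) if mask[i] == "0"]
    return p_select(range(n)) - p_select(in_mask) * p_select(out_mask)

# Usage: sign of Delta for every non-trivial mask on 1-, 2- and 4-block landscapes.
masks = ["".join(m) for m in product("01", repeat=4)]
masks = [m for m in masks if m not in ("0000", "1111")]
for k in (1, 2, 4):
    landscape = block_niah(k)
    signs = {m: ("+" if delta(landscape, "1111", m) > 0 else "-") for m in masks}
    print(f"{k}-block:", signs)
```

With these illustrative fitness values, Δ is positive for every mask on the 1-block landscape, negative for every mask on the 4-block landscape, and on the 2-block landscape negative only for the masks that cut at the block boundary (0011 and its complement 1100).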
24
Predictions
• Only in “extreme” cases can you say whether recombination is uniformly good or bad
– The more/less epistatic/“unmodular” the landscape the worse/better the effect of recombination
• Better to ask which recombination distribution is good or bad
• Which recombination distribution is best depends on the landscape
– The best recombination distributions are those whose BBs are compatible with the landscape’s blocks, i.e. the underlying modularity
– Also depends on the population and therefore should be time dependent (first search with very mixing recombination to explore for blocks then restrict the mixing to exploit them)
25
When is recombination bad?
Shorter BBs preferred
Lower order BBs preferred
Recombination leads to LESS production of the optimal string or ANY optimal BB or schema than selection only
26
When is recombination good?
Recombination leads to MORE production of ANY optimal string or optimal BB or schema than selection only
Higher order BBs/schemata preferred
Longer BBs/schemata preferred
Preference for O(1) BBs near the string boundary
27
And what about here?
So, is recombination good or bad?
Recombination favours longer optimal schemata; but these aren’t BBs!
These BBs are only favoured after a certain amount of time.
These level 1 O(2) BBs are suppressed
This level 2 O(2) BB is favoured
Preference for O(1) BBs near the string boundary
28
So, what do the Deltas tell us?
[Figure: Deltas by mask]
Recombination is bad for ANY mask… but some masks are worse than others!
Recombination is particularly bad in trying to construct these O(2) BBs/optimal schemata – because of their tight linkage!
Recombination is better constructing these O(2) optimal schemata – because of their loose linkage! But they’re not BBs!
Better to construct the needle with these masks than these; asymmetric BBs preferred
29
So, what do the Deltas tell us?
Recombination is good for ANY mask… but some masks are better than others!
Recombination is particularly good in trying to construct these O(2) BBs – because of their tight linkage!
Better to construct the optimum with these masks than these – “symmetric” BBs preferred
30
So, what do the Deltas tell us?
Recombination is good for SOME masks but BAD for others, and this depends on the landscape!
Splitting up landscape blocks that are also BBs is very BAD
Getting the optimum from recombining BBs that aren’t landscape blocks isn’t good
Getting the optimum from recombining BBs that are also landscape blocks is good. Preference for the mask 0011, the only one that respects the landscape blocks
Note: no sign changes
31
And for finite populations…?
32
2-point crossover, popsize = 13, 1000 repetitions
The more crossover the better it gets!
The hard part here is to find the BBs in the first place. Lots of crossover helps with that.
33
“2-point” crossover, popsize = 13, 100 repetitions
Lots of crossover gives random search (or worse)
Better to cut at block boundaries
Here mutation first finds the blocks then crossover joins them together
34
“2-point” crossover, popsize = 25, 100 reps
Mutation is bad – once you’ve got the BBs – easier to get O(1) BBs!
35
Conclusions
• Recombination works by joining together BBs (not the BBH ones!) – that’s the only way it works
• Objects have BBs which have their BBs which …
• BB basis is the appropriate mathematical description of recombination along with the SWLD coefficients
• Can glean qualitative information from the infinite population equations that is also valid for finite populations
• Recombination is only absolutely good or bad in the extreme situations of maximum and minimum epistasis, and even then it’s good if you don’t have the string/schema you want
• In other cases it depends on the fitness landscape and especially its modularity
• It seems to be particularly beneficial in “modular” landscapes
36
Conclusions
• Instead of asking if recombination is good or bad, better to ask what is a good recombination distribution
• If recombination distributions are allowed to evolve they will do so to respect landscape modularity
• Possible explanation for recombination hotspots
• Coevolution of recombination hotspots and modular landscapes
• Remember that a gene is a “building block”, O(1) in terms of “loci” but O(thousands) in terms of nucleotides
• Modularity can be lots of intragene epistasis but weak intergene epistasis
• Difference between “counting ones” (nucleotides) versus “counting ones” (genes)