TRANSCRIPT
Evolutionary Algorithms
• Darwinian Evolution
• Generic Evolutionary Algorithm
• 4 standard types of EAs
– GA, ES, EP, GP
– GA example
– GP examples
• EAs & Machine Learning (ML)
Thomas Malthus: Essay on the Principle of Population (1798)
• If unchecked, populations grow exponentially:
– dN/dt = kN  =>  N(t) = N0·e^(kt)
• Environment has limited capacity to support life => competition for resources (Malthusian Crunch)
• More young are produced than can possibly survive until adulthood and reproduction.
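Malthus's growth law can be checked numerically. A minimal sketch (the population size N0 = 100 and growth rate k = 0.05 are hypothetical values chosen for illustration) comparing the closed form N(t) = N0·e^(kt) with a direct Euler integration of dN/dt = kN:

```python
import math

def malthus(n0, k, t):
    """Closed-form solution N(t) = N0 * e^(k t) of dN/dt = k N."""
    return n0 * math.exp(k * t)

def euler(n0, k, t, steps=100_000):
    """Integrate dN/dt = k N directly with Euler's method."""
    n, dt = n0, t / steps
    for _ in range(steps):
        n += k * n * dt
    return n

# Hypothetical population: N0 = 100 individuals, k = 0.05 per year.
exact = malthus(100, 0.05, 20)   # 100 * e^1, about 271.8
approx = euler(100, 0.05, 20)    # numerically very close to the same value
```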
Charles Darwin: On the Origin of Species by Means of Natural Selection (1859)
• Phenotypic Variation: When the Malthusian crunch comes, some phenotypes will be better equipped to deal with the environment (food, climate, avoiding predation & disease, etc.).
• Fitness Variation: These individuals will have a better chance of surviving to adulthood and producing healthy young.
• Heritable Variation: If these fitter individuals can pass their “advantages” on to their young, then these traits will increase in future generations, while weaker traits will decline or disappear.
Darwin
• "As many more individuals of each species are born than can possibly survive; and as, consequently, there is a frequently recurring Struggle for Existence, it follows that any being, if it vary however slightly in any manner profitable to itself, under the complex and sometimes varying conditions of life, will have a better chance of surviving, and thus be naturally selected." (Origin, pg. 5)
• Competition for Resources + Heritable Fitness Variation ==> Evolution by Natural Selection
• Niles Eldredge (1999): Natural Filtration of Heritable Information. Nature doesn’t select; it doesn’t care. It just sets the conditions for survival.
Darwinian Evolution
[Diagram: Genotypes (Gtypes) develop into Phenotypes (Ptypes). Natural Selection filters the phenotypes (physiological, behavioral traits); Reproduction & Sex feed the survivors' genotypes forward, where Recombination & Mutation (genetic) introduce variation.]
Evolutionary Algorithms
[Diagram: the same loop with Bit Strings as genotypes and Parameters, Code, Neural Nets, or Rules (P,C,N,R) as phenotypes. A Translate/Generate step plays the role of development; a Performance Test plays the role of natural selection; Recombination & Mutation (R&M) operate syntactically on the bit strings or semantically on the translated structures.]
Evolutionary Computation = Parallel Stochastic Search
[Diagram: the EA cycle: Selection from a Biased Roulette Wheel → Crossover & Mutation → Next Generation → Translation & Performance Test → Selection Biasing, shown for a population of six individuals.]

Selection Biasing:
Indiv:   1  2  3  4  5  6
Fitness: 3  8  2  4  1  1
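The biased roulette wheel can be sketched in a few lines of Python (fitness-proportionate selection, using the fitness values from the example above):

```python
import random

def roulette_select(fitnesses, rng=random.random):
    """Biased roulette wheel: individual i gets a slice of the wheel
    proportional to fitness_i / total, then one spin picks an index."""
    total = sum(fitnesses)
    spin = rng() * total
    cumulative = 0.0
    for i, f in enumerate(fitnesses):
        cumulative += f
        if spin <= cumulative:
            return i
    return len(fitnesses) - 1  # guard against floating-point round-off

# The example above: individuals 1..6 with fitnesses 3 8 2 4 1 1.
# Individual 2 (fitness 8 of 19) wins roughly 42% of the spins.
fitnesses = [3, 8, 2, 4, 1, 1]
```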
Types of Evolutionary Algorithms
Genetic Algorithms (Holland, 1975)
Representation: Bit Strings => Integer or real feature vectors
Syntactic crossover (main) & mutation (secondary)
Evolution Strategies (Rechenberg, 1972; Schwefel, 1995)
Representation: Real-valued feature vectors
Semantic mutation (main) & crossover (secondary)
Evolutionary Programming (Fogel, Owens & Walsh, 1966; Fogel, 1995)
Representation: Real-valued feature vectors or Finite State Machines
Semantic mutation (only)
View each individual as a whole species, hence no crossover
Genetic Programming (Koza, 1992)
Representation: Computer programs (typically in LISP)
Syntactic crossover (main) & mutation (secondary)
Using Evolutionary Algorithms
When
• Large, rough search spaces
• Satisficing, optimization or general design problems
• Entire solutions are easily generated and tested
• Exhaustive search methods are too slow
• Heuristic search methods cannot find good solutions (e.g. Stuck at local max)
Satisficing: exploit parallelism between multiple attributes and multiple states.
How
• Determine a representation of solutions that tolerates mutation and crossover.
• Define fitness function that gives graded evaluations - not just good/bad rating.
• Define selection function = roulette-wheel biasing function (f: fitness -> area)
• Set key EA parameters: population size, mutation rate, crossover rate, # generations, etc.
* EA’s are easy to write, and there’s lots of freeware!
* Specific problems often require specific representations & genetic operators
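The "How" steps above can be sketched as one generic loop. A minimal sketch, using the OneMax problem (maximize the number of 1-bits, a stand-in fitness function of my own choosing) as the test case; all operator names here are illustrative, not from the slides:

```python
import random

def evolve(init, fitness, popsize=30, numgen=50, pcross=0.75, pmut=0.02):
    """Generic GA: random population -> repeat (select, cross, mutate)."""
    def select(pop, scored):
        # roulette-wheel (fitness-proportionate) selection
        total = sum(scored) or 1
        spin, c = random.random() * total, 0.0
        for ind, f in zip(pop, scored):
            c += f
            if spin <= c:
                return ind
        return pop[-1]

    def crossover(a, b):
        p = random.randrange(1, len(a))          # one-point crossover
        return a[:p] + b[p:], b[:p] + a[p:]

    def mutate(ind):
        return [g ^ (random.random() < pmut) for g in ind]  # per-bit flip

    pop = [init() for _ in range(popsize)]
    for _ in range(numgen):
        scored = [fitness(ind) for ind in pop]
        nxt = []
        while len(nxt) < popsize:
            a, b = select(pop, scored), select(pop, scored)
            if random.random() < pcross:
                a, b = crossover(a, b)
            nxt += [mutate(a), mutate(b)]
        pop = nxt[:popsize]
    return max(pop, key=fitness)

# OneMax: 12-bit chromosomes, fitness = number of 1s.
random.seed(1)
best = evolve(lambda: [random.randint(0, 1) for _ in range(12)], sum)
```

The key design point from the slide: `fitness` gives graded evaluations (0..12 here), not a good/bad rating, so selection always has something to bias toward.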
Classic Genetic Algorithm
One-point crossover (cut after bit 7, marked X):
P1: 0111010 X 01011  (4-bit fields decode to 7, 4, 11)
P2: 0001111 X 10010  (4-bit fields decode to 1, 15, 2)
=>
C1: 011101010010  (fields decode to 7, 5, 2)
C2: 000111101011  (fields decode to 1, 14, 11)

Each chromosome may represent:
• Parameters for a controller
• Room #’s for an exam scheduler
• Weights for an artificial neural network
• Instruction codes for a computer program
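A minimal sketch of the bit-string machinery behind this slide: one-point crossover, per-bit mutation, and decoding 4-bit fields into integers. The slide's offspring are exactly reproduced by a cut after bit 7:

```python
import random

def one_point_crossover(p1, p2, point=None):
    """Swap the tails of two equal-length bit strings at one cut point."""
    if point is None:
        point = random.randrange(1, len(p1))
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

def bit_mutate(chrom, pmut, rng=random.random):
    """Flip each bit independently with probability pmut."""
    return ''.join('01'[b == '0'] if rng() < pmut else b for b in chrom)

def decode(chrom, width=4):
    """Read the chromosome as consecutive width-bit unsigned integers."""
    return [int(chrom[i:i + width], 2) for i in range(0, len(chrom), width)]

# The slide's cross: one-point crossover after bit 7.
c1, c2 = one_point_crossover("011101001011", "000111110010", point=7)
# c1 = "011101010010" -> [7, 5, 2];  c2 = "000111101011" -> [1, 14, 11]
```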
Process Scheduling (Kidwell, 1993)
• Task pairs (run + communicate result) to be run on a set of processors
– ((7 16) (11 22) (12 40) (15 22) …)
• A task’s run must finish before result sending begins
• All processors share a central communication line (bus)
• Each processor can handle only one task at a time
• Each processor is capable of running any of the tasks
• Only one processor at a time can send its message
• A task cannot be removed from a processor until both run & send are finished
• Tasks run on the main processor, P0, require no communication time, whereas tasks run on all other processors must send their message to P0
Goal: Schedule the tasks on processors so as to minimize the total timespan
[Diagram: eight processors P0–P7 connected to a shared bus, with P0 as the master.]
Process Schedule Optimization using the Genetic Algorithm
Use GA to search the space of possible schedules (solutions)
1. Represent schedule in a GA-amenable form (i.e. linearize it)
(2 3 4 1 3 1 …)  – the i-th entry is the processor assigned to task #i
(so task #1 runs on processor 2, task #4 on processor 1, …)
As a bit string, 4 bits per entry: 001000110100000100110001…
2. Define a fitness function
Fitness = MaxTimespan - Timespan
* Lower timespan => Higher fitness
Other possibilities: 1/(1 + (Timespan - MinTimespan))
!!! We want Timespan to be low !!!
GA-based Schedule Optimization
*Compute schedule’s timespan by running on a process-network simulator.
1. Remove the next task (whose assigned processor is open) from the task list and start simulating it on that processor.
2. Remove tasks from processors as soon as they finish running & sending.
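A simplified sketch of such a simulator. It is an approximation of the scheme above: it processes tasks strictly in list order rather than pulling the next *runnable* task, which can reorder work; the task values in the usage example are illustrative.

```python
def timespan(schedule, tasks):
    """Simulate the scheduling model, simplified: schedule[i] is the
    processor for task i = (run, send). A processor runs one task at a
    time and stays busy until run & send finish; all sends to the master
    P0 share a single bus; tasks on P0 need no send."""
    free = {}        # processor -> time it becomes free
    bus_free = 0     # the shared communication line
    finish = 0
    for proc, (run, send) in zip(schedule, tasks):
        done_run = free.get(proc, 0) + run
        if proc == 0:
            done = done_run                        # master: no communication
        else:
            done = max(done_run, bus_free) + send  # wait for the bus
            bus_free = done
        free[proc] = done
        finish = max(finish, done)
    return finish

# Fitness = MaxTimespan - Timespan (lower span => higher fitness):
def fitness(schedule, tasks, max_span=916):
    return max_span - timespan(schedule, tasks)
```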
3. Define a biasing function to convert fitness to a proportion of the roulette wheel
Sigma Scaling: (One of many standard biasing functions):
ExpVal(x) = Max ( 0, 1 + (Fitness (x) - AvgFitness) / (2 * StDevFitness))
Normalizing:
Roulette-Wheel%(x) = ExpVal(x) / SumofAllExpVals
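Sigma scaling and normalization in code; run on the Generation # 0 fitness values from the tables further on, it reproduces the slide's wheel shares (20.9% for individual 9, 3.5% for individual 10):

```python
from statistics import mean, pstdev

def sigma_scaled_wheel(fitnesses):
    """ExpVal(x) = max(0, 1 + (Fitness(x) - avg) / (2 * stdev)), then
    normalize the expected values into roulette-wheel proportions."""
    avg, sd = mean(fitnesses), pstdev(fitnesses)
    if sd == 0:                       # uniform population: equal slices
        return [1.0 / len(fitnesses)] * len(fitnesses)
    expvals = [max(0.0, 1 + (f - avg) / (2 * sd)) for f in fitnesses]
    total = sum(expvals)
    return [e / total for e in expvals]

# Generation # 0 fitnesses from the run shown later:
gen0 = [515, 533, 490, 501, 539, 481, 477, 506, 579, 467]
wheel = sigma_scaled_wheel(gen0)   # wheel[8] ~ 0.209, wheel[9] ~ 0.035
```

One property worth noting: as long as no individual is clipped to 0, the expected values sum to the population size, so an average individual always gets exactly one expected offspring.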
4. Select Mutation and Crossover Rates: pmut = 0.01; pcross = 0.75
5. Select a population size popsize = 10
6. Select # of generations: numgen = 10
7. Run the Genetic Algorithm
Generates a random initial population (of schedules) and evolves them via sigma-scaling selection, crossover and mutation for numgen generations
Kidwell’s (1993) Task List
((7 16) (11 22) (12 40) (15 22) (17 23)
(17 23) (19 23) (20 28) (20 27) (26 27)
(28 31) (36 37) (31 29) (28 22) (23 19)
(22 18) (22 17) (29 16) (27 16) (35 15))
MaxTime = Sum of all run & send times = 916 time units
Use 8 processors: P0 - P7, with P0 being the master processor
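The MaxTime figure is easy to verify: summing every run and send time over the 20 task pairs gives 916.

```python
# Kidwell's task list: (run time, send time) for each of the 20 tasks.
tasks = [(7, 16), (11, 22), (12, 40), (15, 22), (17, 23),
         (17, 23), (19, 23), (20, 28), (20, 27), (26, 27),
         (28, 31), (36, 37), (31, 29), (28, 22), (23, 19),
         (22, 18), (22, 17), (29, 16), (27, 16), (35, 15)]

# Worst case: every run and every send happens strictly one after another.
max_time = sum(run + send for run, send in tasks)
print(max_time)  # -> 916
```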
Population of Schedules (Generation # 0)
These are randomly-generated:
Processor List
1: Span: 401 Fitn: 515 (11.0%)| (1 2 1 3 6 7 2 0 0 4 5 1 6 0 5 5 4 7 4 6)
2: Span: 383 Fitn: 533 (13.7%)| (1 0 1 0 0 4 0 1 0 5 3 6 2 5 5 5 1 3 5 7)
3: Span: 426 Fitn: 490 ( 7.1%)| (6 2 2 7 6 5 2 6 7 6 6 5 3 4 0 1 0 2 0 7)
4: Span: 415 Fitn: 501 ( 8.8%)| (5 0 5 7 1 2 5 4 2 6 3 6 2 0 0 4 1 7 3 1)
5: Span: 377 Fitn: 539 (14.7%)| (4 2 0 6 2 1 2 0 1 6 3 2 2 3 3 4 0 3 0 1)
6: Span: 435 Fitn: 481 ( 5.7%)| (3 2 7 7 6 0 3 1 4 7 7 5 1 0 4 6 5 5 5 6)
7: Span: 439 Fitn: 477 ( 5.1%)| (0 3 2 3 2 2 7 0 4 5 2 1 6 3 1 7 1 0 3 2)
8: Span: 410 Fitn: 506 ( 9.6%)| (2 5 2 1 0 0 2 2 5 0 1 3 6 1 3 3 4 6 2 0)
9: Span: 337 Fitn: 579 (20.9%)| (1 0 0 7 0 4 3 1 1 0 0 2 4 1 2 4 6 4 7 6)
10: Span: 449 Fitn: 467 ( 3.5%)| (2 2 2 4 1 4 4 6 6 4 7 4 0 6 2 5 2 7 7 7)
Avg Fitness: 508.80 StDev Fitness: 32.28
[Pie chart: Roulette Wheel (Generation # 0).]

Population of Schedules (Generation # 1)
1: Span: 366 Fitn: 550 (12.1%)| (1 0 0 3 0 4 2 1 0 1 3 6 4 5 7 4 5 7 5 6)
2: Span: 406 Fitn: 510 ( 7.9%)| (1 0 1 4 0 4 1 1 1 4 0 2 2 1 0 5 2 0 7 7)
3: Span: 377 Fitn: 539 (11.0%)| (4 2 0 6 2 1 2 0 1 6 3 2 2 3 3 4 0 3 0 1)
4: Span: 464 Fitn: 452 ( 1.8%)| (1 0 1 3 0 4 2 0 1 4 4 2 4 1 5 5 4 6 4 6)
5: Span: 366 Fitn: 550 (12.1%)| (1 2 0 7 6 7 3 1 0 0 1 1 6 0 2 4 6 5 7 6)
6: Span: 337 Fitn: 579 (15.2%)| (1 0 0 7 0 4 3 1 1 0 0 2 4 1 2 4 6 4 7 6)
7: Span: 337 Fitn: 579 (15.2%)| (1 0 0 7 0 4 3 1 1 0 0 2 4 1 2 4 6 4 7 6)
8: Span: 415 Fitn: 501 ( 7.0%)| (5 0 5 7 1 2 5 4 2 6 3 6 2 0 0 4 1 7 3 1)
9: Span: 465 Fitn: 451 ( 1.7%)| (5 5 7 7 1 2 5 2 3 6 1 6 2 1 0 5 1 7 2 1)
10: Span: 329 Fitn: 587 (16.0%)| (2 0 0 1 0 0 2 4 4 0 3 3 6 0 3 2 4 6 3 0)
Avg Fitness: 529.80 StDev Fitness: 47.43
[Pie chart: Roulette Wheel (Generation # 1).]

Population of Schedules (Generation # 2)
1: Span: 337 Fitn: 579 (11.7%)| (1 0 0 7 0 0 6 0 2 6 3 6 0 4 1 4 1 7 1 1)
2: Span: 465 Fitn: 451 ( 0.0%)| (5 0 5 3 1 6 1 5 0 1 3 6 6 1 6 4 5 7 7 6)
3: Span: 366 Fitn: 550 ( 8.8%)| (1 2 0 7 6 7 3 1 0 0 1 1 6 0 2 4 6 5 7 6)
4: Span: 329 Fitn: 587 (12.5%)| (2 0 0 1 0 0 2 4 4 0 3 3 6 0 3 2 4 6 3 0)
5: Span: 329 Fitn: 587 (12.5%)| (2 0 0 1 0 0 2 4 4 0 3 3 6 0 3 2 4 6 3 0)
6: Span: 270 Fitn: 646 (18.4%)| (1 0 0 5 0 0 2 0 0 0 3 3 4 0 2 2 4 6 3 2)
7: Span: 364 Fitn: 552 ( 9.0%)| (2 0 0 3 0 4 3 5 5 0 0 2 6 1 3 4 6 4 7 4)
8: Span: 337 Fitn: 579 (11.7%)| (1 0 0 7 0 4 3 1 1 0 0 2 4 1 2 4 6 4 7 6)
9: Span: 347 Fitn: 569 (10.7%)| (5 0 1 7 1 6 1 4 0 0 1 2 2 0 0 4 5 7 7 0)
10: Span: 410 Fitn: 506 ( 4.5%)| (1 0 4 7 0 0 7 1 3 6 2 6 4 1 2 4 2 4 3 7)
Avg Fitness: 560.60 StDev Fitness: 49.61
[Pie chart: Roulette Wheel (Generation # 2).]

Population of Schedules (Generation # 6)
1: Span: 263 Fitn: 653 (10.6%)| (1 0 0 7 0 4 6 0 0 4 3 7 0 0 0 0 5 6 3 6)
2: Span: 232 Fitn: 684 (16.7%)| (1 0 0 5 0 0 2 0 0 6 3 3 0 0 2 2 0 7 1 0)
3: Span: 325 Fitn: 591 ( 0.0%)| (1 0 0 5 0 0 6 0 0 2 3 3 4 0 3 4 5 7 3 2)
4: Span: 261 Fitn: 655 (11.0%)| (1 0 0 7 0 0 2 0 0 0 3 2 0 0 2 0 1 6 3 2)
5: Span: 249 Fitn: 667 (13.3%)| (1 0 0 5 0 0 2 0 0 2 3 6 0 0 3 0 5 6 3 2)
6: Span: 255 Fitn: 661 (12.1%)| (1 0 0 5 0 0 6 0 0 0 3 3 0 0 1 0 5 7 3 2)
7: Span: 292 Fitn: 624 ( 4.8%)| (1 0 0 7 0 0 3 0 0 0 3 3 4 0 0 4 1 6 3 2)
8: Span: 269 Fitn: 647 ( 9.4%)| (1 0 0 7 0 4 6 0 0 4 3 7 0 0 2 0 4 6 3 6)
9: Span: 247 Fitn: 669 (13.7%)| (1 0 0 5 0 0 2 0 0 0 3 7 4 0 0 4 1 6 3 2)
10: Span: 274 Fitn: 642 ( 8.4%)| (1 0 0 7 0 0 2 0 0 0 3 3 0 0 3 0 5 6 3 2)
Avg Fitness: 649.30 StDev Fitness: 24.87
[Pie chart: Roulette Wheel (Generation # 6).]

Population of Schedules (Generation # 9)
1: Span: 265 Fitn: 651 ( 0.4%)| (1 0 0 5 0 0 2 0 0 2 3 6 0 0 1 0 5 7 3 2)
2: Span: 241 Fitn: 675 (12.3%)| (1 0 0 7 0 0 2 0 0 4 3 3 0 0 3 0 0 6 1 0)
3: Span: 238 Fitn: 678 (13.8%)| (1 0 0 5 0 0 6 0 0 6 3 3 0 0 1 0 1 6 1 0)
4: Span: 258 Fitn: 658 ( 3.8%)| (1 0 0 5 0 0 6 0 0 6 3 2 0 0 0 0 0 7 1 0)
5: Span: 248 Fitn: 668 ( 8.8%)| (1 0 0 5 0 0 2 0 0 2 3 3 0 0 1 0 1 6 3 0)
6: Span: 246 Fitn: 670 ( 9.8%)| (1 0 0 5 0 0 6 0 0 6 3 7 0 0 1 0 5 7 1 2)
7: Span: 246 Fitn: 670 ( 9.8%)| (1 0 0 5 0 0 2 0 0 4 3 3 0 0 1 0 5 7 1 2)
8: Span: 242 Fitn: 674 (11.8%)| (1 0 0 5 0 0 6 0 0 2 3 6 0 0 3 0 1 6 3 0)
9: Span: 226 Fitn: 690 (19.7%)| (1 0 0 5 0 0 2 0 0 0 3 6 0 0 3 0 5 6 3 2)
10: Span: 246 Fitn: 670 ( 9.8%)| (5 0 0 5 0 0 2 0 0 2 3 6 0 0 3 0 5 7 3 2)
Avg Fitness: 670.40 StDev Fitness: 10.06 (Convergence)
[Pie chart: Roulette Wheel (Generation # 9).]
Process-Schedule Evolution
[Plot: best fitness vs. generation; the GA occasionally falls off a peak before convergence.]
Genetic Programming (GP)
• Genotype = a computer program (often in Lisp)
• Phenotype = Genotype (usually)

Crossover swaps randomly chosen subtrees between the two parent program trees:
Parents:  (p 2 (m (n 3 9)))  and  (f (h 1) (g 4 6))
Children: (p 2 (h 1))  and  (f (m (n 3 9)) (g 4 6))
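GP crossover can be sketched over nested tuples standing in for Lisp expressions (a minimal sketch; `subtrees` and `replace` are helper names of my own):

```python
import random

def subtrees(tree, path=()):
    """Yield (path, subtree) for every node; a tree is an atom or a
    tuple (op, child, child, ...)."""
    yield path, tree
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from subtrees(child, path + (i,))

def replace(tree, path, new):
    """Return a copy of tree with the subtree at path swapped for new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new),) + tree[i + 1:]

def gp_crossover(t1, t2):
    """Pick one random subtree in each parent and swap them."""
    p1, s1 = random.choice(list(subtrees(t1)))
    p2, s2 = random.choice(list(subtrees(t2)))
    return replace(t1, p1, s2), replace(t2, p2, s1)

# The crossover shown above, done by hand with replace():
t1 = ('p', 2, ('m', ('n', 3, 9)))
t2 = ('f', ('h', 1), ('g', 4, 6))
c1 = replace(t1, (2,), ('h', 1))            # -> ('p', 2, ('h', 1))
c2 = replace(t2, (1,), ('m', ('n', 3, 9)))  # -> ('f', ('m', ('n', 3, 9)), ('g', 4, 6))
```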
Symbolic Regression with GP (Koza, 1993)
Given: a set of (X,Y) pairs.
Find: a functional expression that predicts Y, given X.
GP primitives:
Terminals: X
Functions: +, -, *, % (protected division), SIN, COS, EXP, RLOG (protected log)
Fitness Function:
Fitn = 1/(MappingError + 1)
Sample Individual/Solution:
(- (COS (+ X (* X X))) (EXP X (% (+ X X) X)))

[Plots: Target vs. Best-of-Generation curves at Generation 0, Generation 12 and Generation 50.]
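The fitness function in code, with protected division and log as conventionally defined in GP; the target curve and the summed-absolute-error metric here are my own assumptions, since the slide gives only the formula Fitn = 1/(MappingError + 1):

```python
import math

def pdiv(a, b):
    """Protected division '%': conventionally returns 1 when b == 0."""
    return a / b if b != 0 else 1.0

def rlog(a):
    """Protected log 'RLOG': log|a|, with rlog(0) defined as 0."""
    return math.log(abs(a)) if a != 0 else 0.0

def regression_fitness(candidate, target, xs):
    """Fitn = 1 / (MappingError + 1), with the mapping error taken as
    summed absolute error (an assumed metric) over the sample points."""
    error = sum(abs(candidate(x) - target(x)) for x in xs)
    return 1.0 / (error + 1.0)

# A perfect candidate scores exactly 1.0:
xs = [i / 10 for i in range(-10, 11)]
target = lambda x: x * x + x          # assumed target, for illustration
assert regression_fitness(target, target, xs) == 1.0
```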
Robot Wall-Following with GP (Koza, 1993)
Given: Odd-shaped room with robot in center.
Find: A control strategy for the robot that makes it move along the periphery.
GP Primitives:
Terminals: S0, S1, …, S11 (12 sensor readings of distance to wall)
Functions: IFLTE (if less than or equal), PROGN2, MF, MB (move forward/back), TL, TR (turn left/right).
Fitness Function:
Fitn = # peripheral cells visited.
Sample Individual/Strategy:
(IFLTE S3 S7 (MF) (PROGN2 (MB) (IFLTE S4 S9 (TL) (PROGN2 (MB) (TL)))))
Wall-Following Evolution
[Figures: the robot's trajectory at Generations 0, 2, 14 and 49, gradually coming to hug the room's periphery.]
Evolutionary Algorithms & Machine Learning
EAs are very useful ML techniques, but they are quite different from other ML approaches.
Generate-and-Test (G&T)
Conventional ML philosophy is to build as much intelligence into G as possible, so that it produces only fairly good solutions (S). In contrast, EAs include a lot of randomness in G and let T do the work (just as in nature) of filtering out the bad solutions that G produces.
[Diagram: ML devotes a large box to G and a small one to T; EAs a small G and a large T, producing more candidate solutions S. Box size = amount of computational effort used.]
EA & ML
Banzhaf et al. use 4 criteria to compare EAs to the rest of ML.
• Representations
• Solution Generators (i.e., search operators)
• Search Processes
• Learning Feedback
EA & ML: Representation
• ML methods typically involve one representation type; stated another way, ML methods are often representation dependent.
– Logic
– Semantic Networks
– Decision Trees
– Production rules
– Artificial Neural Networks
• EAs can use any representation type - assuming special rep-dependent mutation and crossover operators + a decoder to run (reason with) the expressions that are generated (during fitness evaluation).
• Why are EAs more general?
– ML methods have a deep semantic understanding of the expressions being used, in the sense that they can both reason with them and MODIFY them (i.e. generate new ones) in an intelligent, goal-directed manner.
– EAs have only a shallow syntactic understanding of a representation, such that they can generate legal expressions, but often in a more random and less intelligent manner.
EA & ML: Solution Generators
• ML’s intelligent generators:
– Specializing & generalizing logical concept descriptions by adding/dropping conjuncts and disjuncts.
– Adding nodes to a decision tree.
– Tuning weights or adding hidden layers to an ANN.
• Intelligent in the sense that the changes are governed by knowledge, constraints and goals. The newly-generated expressions thus have a reasonably good chance of being better than their predecessors.
• EAs’ dumb generators:
– Crossover
– Mutation
– Inversion
*Generalizing and specializing are accidental side-effects of these random operations. EAs rely on fitness testing and selection mechanisms to separate the good from the bad.
EA & ML: Search Processes
[Diagram: a search landscape with markers "You are here" and "You want to be here".]
Search Methods (Michalewicz & Fogel, 2000)
Traditional (Serial*):
– Local: Greedy Algs, Hill Climbing, Simplex
– Exhaustive: Breadth-1st, Branch & Bound, Divide & Conquer, Dynamic Prog’ing, Best First (A*), Linear Prog’ing
Evolutionary (Parallel): GA, GP, ES, EP
* Traditional methods can be parallel, but it’s usually an independent parallelism.
* ML uses them all.
Sequential Search
[Diagrams: a single initialization follows one trajectory (1 → 2 → 3 → 4); multiple initializations restart the search from several different points.]

Parallel Search
• Independent Local Searches Done Together
• Interdependent (Competing) Local Searches Done Together: results of one search affect its future & the others. Abort, or focus more resources on it? Spawn a new local search? Analogy: gov’t investment in research.
Exploration -vs- Exploitation
[Diagram: pure exploitation concentrates every new search (2) around the best point found so far (1); pure exploration scatters them widely and remembers the best; "a little of both" mixes the two. All local searches are interdependent.]
Solution Space
Solution Space: all complete solutions to the problem.
Feasible solutions: satisfy all hard constraints.
Optimal solutions: best possible; satisfy all hard constraints (related to topology) & most or all soft constraints.
SAT: Complete => assigns one truth value to each variable. Feasible & Optimal => all clauses true.
TSP: Complete => each city visited once. Feasible => all inter-city links are legal. Optimal => shortest possible route.
Representation Space: Scope
* This is the space that search algorithms work in.
[Diagram: three representation spaces overlaid on the solution space: one with matching scope; one that includes some incomplete solutions; one focused on good solutions, which can miss the optimum entirely.]
Representation Space: Density
[Diagram: a representation space may map onto the solution space 1-1, 1-N, or N-1.]
Typical SAT rep: 1-1 => each variable is T or F (each assignment has exactly one representation).
Typical TSP rep: N-1 => 1-2-3 = 3-1-2 = 3-2-1 = … Every tour of n cities can be represented by 2n equivalent permutations.
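The 2n redundancy is easy to demonstrate: enumerate the rotations of a tour and of its reversal.

```python
def equivalent_tours(tour):
    """All permutations denoting the same cyclic TSP tour:
    n rotations x 2 directions = 2n representations."""
    n = len(tour)
    forms = set()
    for seq in (tour, tour[::-1]):
        for i in range(n):
            forms.add(seq[i:] + seq[:i])   # rotate by i positions
    return forms

print(len(equivalent_tours((1, 2, 3))))  # -> 6, i.e. 2n for n = 3
```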
Difficult Search Spaces
Crossover of feasible solutions can yield infeasible solutions.
Hard to gradually mutate a feasible solution into an optimal solution, since intermediates may be infeasible.
How to handle infeasible solutions???
1. Never generate them.
2. Generate, but do not let them reproduce.
3. Allow them to reproduce, but penalize them heavily in the fitness function.
4. Give them the same status as feasible but non-optimal solutions.
But crossover of infeasible solutions may be the easiest way to create optimal solutions!
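Option 3 above is perhaps the simplest to implement; a minimal sketch (the penalty weight is an arbitrary choice, and `violations` is assumed to count broken hard constraints):

```python
def penalized_fitness(raw_fitness, violations, penalty=1000.0):
    """Infeasible solutions stay in the population and may reproduce,
    but each violated hard constraint costs a heavy fitness penalty."""
    return raw_fitness - penalty * violations

# An infeasible solution with high raw fitness still loses to a
# mediocre feasible one:
assert penalized_fitness(700, violations=1) < penalized_fitness(500, violations=0)
```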
EA & ML: Learning Feedback
• Supervised Learning
– Given correct inputs & outputs, find the mapping.
– Decision-Tree Induction, Version Spaces, Feedforward ANNs with backprop, etc.
– All EAs: fitness based on error between actual and correct outputs.
• Reinforcement Learning
– Given inputs & punishment or reward, but no clear description of the correct output, find the mapping anyway!
– RL, some ANNs, etc.
– All EAs again, where fitness = f(total reward & punishment). Classic example: Classifier Systems (GA + Credit assignment learning)
• Unsupervised Learning
– Given inputs, classify them into groups.
– Kohonen ANNs
– Possible for EAs, but rarely done in practice.
Application Areas for Evolutionary Algorithms
• Optimization: Controllers, Job Schedules, Networks (TSP)
• Electronics: Circuit Design (GP)
• Finance: Stock time-series analysis & prediction
• Economics: Emergence of Markets, Pricing & Purchasing Strategies
• Sociology: cooperation, communication, ANTS!
• Computer Science
– Machine Learning: Classification, Prediction…
– Algorithm design: Sorting networks
• Biology
– Immunology: natural & virtual (computer immune system)
– Ecology: arms races, coevolution
– Population genetics: roles of mutation, crossover & inversion
– Evolution & Learning: Baldwin Effect, Lamarckism…