Introduction to Genetic Algorithms
Speaker: Moch. Rif’an
[email protected]
What are genetic algorithms?
A genetic algorithm (GA) is a search technique used in computing to find exact or approximate solutions to optimization and search problems.
Genetic algorithms are categorized as global search heuristics.
Genetic algorithms are a particular class of evolutionary algorithms (also known as evolutionary computation) that use techniques inspired by evolutionary biology such as inheritance, mutation, selection, and crossover (also called recombination).
A short history of genetic algorithms
1954 - computer simulations of evolution started with the work of Nils Aall Barricelli
1960s - Hans Bremermann published a series of papers that adopted a population of solutions to optimization problems
1963 - Barricelli had simulated the evolution of the ability to play a simple game
1970s - artificial evolution became a widely recognized optimization method
1970 - the first books describing the methods used (Fraser and Burnell)
1980s - the First International Conference on Genetic Algorithms, Pennsylvania; General Electric started selling the world's first genetic algorithm product
1989 - Axcelis, Inc. released Evolver, the world's second GA product and the first for desktop computers
Genetic Algorithms - History
• Pioneered by John Holland in the 1970s
• Became popular in the late 1980s
• Based on ideas from Darwinian evolution
• Can be used to solve a variety of problems that are not easy to solve using other techniques
Biological Background (1) – The cell
• Every animal cell is a complex of many small “factories” working together
• The center of this all is the cell nucleus
• The nucleus contains the genetic information
Biological Background (2) – Chromosomes
• Genetic information is stored in the chromosomes
• Each chromosome is built of DNA
• Chromosomes in humans form pairs
• There are 23 pairs
• The chromosome is divided into parts: genes
• Genes code for properties
• The possible variants of a gene for one property are called alleles
• Every gene has a unique position on the chromosome: the locus
Biological Background (3) – Genetics
• The entire combination of genes is called the genotype
• A genotype develops into a phenotype
• Alleles can be either dominant or recessive
• Dominant alleles are always expressed from the genotype in the phenotype
• Recessive alleles can survive in the population for many generations, without being expressed.
Biological Background (4) – Reproduction
• Reproduction of genetic information:
  • Mitosis
  • Meiosis
• Mitosis copies the same genetic information to new offspring: there is no exchange of information
• Mitosis is the normal way multicellular structures, such as organs, grow
Biological Background (5) – Reproduction
• Meiosis is the basis of sexual reproduction
• After meiotic division, haploid gametes appear
• In reproduction, two gametes fuse into a zygote, which will become the new individual
• Hence genetic information is shared between the parents in order to create new offspring
Biological Background (6) – Reproduction
• During reproduction “errors” occur
• Due to these “errors” genetic variation exists
• Most important “errors” are:
• Recombination (cross-over)
• Mutation
Biological Background (7) – Natural selection
• The Origin of Species: “Preservation of favourable variations and rejection of unfavourable variations.”
• There are more individuals born than can survive, so there is a continuous struggle for life.
• Individuals with an advantage have a greater chance of survival: survival of the fittest.
Biological Background (8) – Natural selection
• Important aspects in natural selection are:
• adaptation to the environment
• isolation of populations in different groups which cannot mutually mate
• If small changes in the genotypes of individuals are expressed easily, especially in small populations, we speak of genetic drift
• Mathematically, this is expressed as fitness: success in life
How GAs are Different from Traditional Search Methods
• GAs work with a coding of the parameter set, not the parameters themselves.
• GAs search from a population of points, not a single point.
• GAs use payoff information, not derivatives or auxiliary knowledge.
• GAs use probabilistic transition rules, not deterministic rules.
Evolution in the real world
• Each cell of a living thing contains chromosomes - strings of DNA
• Each chromosome contains a set of genes - blocks of DNA
• Each gene determines some aspect of the organism (like eye colour)
• A collection of genes is sometimes called a genotype
• A collection of aspects (like eye colour) is sometimes called a phenotype
• Reproduction involves recombination of genes from parents and then small amounts of mutation (errors) in copying
• The fitness of an organism is how much it can reproduce before it dies
• Evolution is based on “survival of the fittest”
Start with a Dream…
• Suppose you have a problem
• You don’t know how to solve it
• What can you do?
• Can you use a computer to somehow find a solution for you?
• This would be nice! Can it be done?
A dumb solution
A “blind generate and test” algorithm:
Repeat
    Generate a random possible solution
    Test the solution and see how good it is
Until solution is good enough
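A minimal Python sketch of the blind generate-and-test loop above. The `generate` and `score` callables, the `max_tries` cap, and the toy problem at the bottom are illustrative assumptions and do not come from the slides.

```python
import random


def blind_generate_and_test(generate, score, good_enough, max_tries=100_000):
    """Draw unrelated random candidates until one scores at least `good_enough`.

    Returns (candidate, tries), or (None, max_tries) if the cap is hit.
    The cap is an added safeguard; the loop on the slide has no upper bound.
    """
    for tries in range(1, max_tries + 1):
        candidate = generate()                 # generate a random possible solution
        if score(candidate) >= good_enough:    # test the solution
            return candidate, tries
    return None, max_tries


# Hypothetical toy problem: find an integer in 0..1000 within 5 of 700.
rng = random.Random(0)
found, tries = blind_generate_and_test(
    generate=lambda: rng.randint(0, 1000),
    score=lambda x: -abs(x - 700),
    good_enough=-5,
)
print(found, tries)
```

Each candidate is drawn independently of all earlier ones, so nothing learned from a bad guess is reused. That is what makes the method "blind".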
Can we use this dumb idea?
• Sometimes - yes:
  • if there are only a few possible solutions
  • and you have enough time
  • then such a method could be used
• For most problems - no:
  • many possible solutions
  • with no time to try them all
  • so this method cannot be used
A “less-dumb” idea (GA)
Generate a set of random solutions
Repeat
    Test each solution in the set (rank them)
    Remove some bad solutions from the set
    Duplicate some good solutions
    Make small changes to some of them
Until best solution is good enough
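The loop above can be sketched in Python as follows. This is one possible reading, not a definitive implementation: dropping exactly the worst half, the population size, the generation cap, and the toy problem are all assumptions made for illustration.

```python
import random


def less_dumb_search(generate, score, mutate, good_enough,
                     pop_size=20, max_generations=1000):
    """Keep a set of solutions; each round drop the worst half and refill
    the set with slightly changed copies of the survivors."""
    population = [generate() for _ in range(pop_size)]
    for generation in range(max_generations):
        population.sort(key=score, reverse=True)     # test and rank
        if score(population[0]) >= good_enough:
            return population[0], generation
        survivors = population[: pop_size // 2]      # remove bad solutions
        copies = [mutate(s) for s in survivors]      # duplicate, then small change
        population = survivors + copies
    return max(population, key=score), max_generations


# Hypothetical toy problem: find an integer in 0..1000 within 5 of 700.
rng = random.Random(0)
best, generations = less_dumb_search(
    generate=lambda: rng.randint(0, 1000),
    score=lambda x: -abs(x - 700),
    mutate=lambda x: min(1000, max(0, x + rng.randint(-10, 10))),
    good_enough=-5,
)
print(best, generations)
```

Unlike blind generate-and-test, new candidates here are derived from the good ones already found, so the search can build on earlier progress.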
Silly Example - Drilling for Oil
• Imagine you had to drill for oil somewhere along a single 1 km desert road
• Problem: choose the best place on the road, the one that produces the most oil per day
• We could represent each solution as a position on the road
• Say, a whole number in the range [0..1000]
Where to drill for oil?
[Figure: the road drawn as a line from 0 to 1000, with Solution1 = 300 and Solution2 = 900 marked on it]
Digging for Oil
• The set of all possible solutions [0..1000] is called the search space or state space
• In this case it’s just one number, but it could be many numbers or symbols
• Often GAs code numbers in binary, producing a bitstring representing a solution
• In our example we choose 10 bits, which is enough to represent 0..1000
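As a small illustration of the 10-bit coding, the Python sketch below converts road positions to bit strings and back. The function names `encode` and `decode` are hypothetical helpers, not something the slides define.

```python
def encode(position, n_bits=10):
    """Return `position` as a fixed-width bit string."""
    if not 0 <= position < 2 ** n_bits:
        raise ValueError(f"{position} does not fit in {n_bits} bits")
    return format(position, f"0{n_bits}b")


def decode(bits):
    """Return the integer a bit string stands for."""
    return int(bits, 2)


print(encode(300))            # 0100101100
print(encode(900))            # 1110000100
print(decode("1111111111"))   # 1023, which is past the end of the road
```

Ten bits cover 0..1023, so 23 of the possible strings decode to positions beyond 1000. A GA using this coding has to handle them somehow, for example by clamping or by giving them a poor fitness; which approach to use is a design choice the slides leave open.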
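Drilling for Oil
[Figure: oil yield (vertical axis, “OIL”) plotted against location along the road (horizontal axis, “Location”, 0 to 1000), with Solution1 = 300 (0100101100) and Solution2 = 900 (1110000100) marked; the value 305 is labelled on the plot]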
Classes of Search Techniques
Search techniques
    Calculus-based techniques
        Fibonacci
        Sort
    Guided random search techniques
        Tabu search
        Hill climbing
        Simulated annealing
        Evolutionary algorithms
            Genetic programming
            Genetic algorithms
    Enumerative techniques
        BFS
        DFS
        Dynamic programming
Search Space
• For a simple function f(x) the search space is one dimensional.
• But by encoding several values into the chromosome many dimensions can be searched, e.g. two dimensions f(x,y)
• The search space can be visualised as a surface or fitness landscape in which fitness dictates height
• Each possible genotype is a point in the space
• A GA tries to move the points to better places (higher fitness) in the space
Fitness landscapes
Search Space
• The nature of the search space dictates how a GA will perform
• A completely random space would be bad for a GA
• GAs can also get stuck in local maxima if the search space contains many of them
• Generally, spaces in which small improvements lead closer to the global optimum are good
GA Algorithm
Generate a set of random solutions
Repeat
    Test each solution in the set (rank them)
    Remove some bad solutions from the set
    Duplicate some good solutions
    Make small changes to some of them
Until best solution is good enough
The Evolutionary Cycle
[Figure: the evolutionary cycle, with the stages initiate & evaluate, population, selection, parents, modification, modified offspring, evaluation, evaluated offspring, discard, and deleted members]
A genetic algorithm maintains a population of candidate solutions for the problem at hand, and makes it evolve by iteratively applying a set of stochastic operators
Vocabulary
• Gene – A single encoding of part of the solution space.
• Chromosome – A string of “genes” that represents a solution.
• Population – The collection of “chromosomes” available to test.
Adding Sex - Crossover
• Although it may work for simple search spaces, our algorithm is still very simple
• It relies on random mutation to find a good solution
• It has been found that introducing “sex” into the algorithm gives better results
• This is done by selecting two parents during reproduction and combining their genes to produce offspring
Adding Sex - Crossover
• Two high-scoring “parent” bit strings (chromosomes) are selected and, with some probability (the crossover rate), combined
• This produces two new offspring (bit strings)
• Each offspring may then be changed randomly (mutation)
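A hedged Python sketch of the two operators just described, for bit-string chromosomes. The slides do not say how the parents' genes are combined; single-point crossover is assumed here as one common choice, and the function names and rates are illustrative.

```python
import random


def single_point_crossover(parent_a, parent_b, crossover_rate, rng):
    """With probability `crossover_rate`, swap the tails of the two parents
    after a random cut point; otherwise return the parents unchanged."""
    if len(parent_a) > 1 and rng.random() < crossover_rate:
        cut = rng.randint(1, len(parent_a) - 1)   # cut strictly inside the string
        return (parent_a[:cut] + parent_b[cut:],
                parent_b[:cut] + parent_a[cut:])
    return parent_a, parent_b


def mutate(bits, mutation_rate, rng):
    """Flip each bit independently with probability `mutation_rate`."""
    return "".join(
        ("0" if bit == "1" else "1") if rng.random() < mutation_rate else bit
        for bit in bits
    )


rng = random.Random(1)
child_a, child_b = single_point_crossover("0100101100", "1110000100", 0.7, rng)
print(child_a, child_b)
print(mutate(child_a, 0.05, rng))
```

Crossover only rearranges bits that already exist in the parents, while mutation can introduce a bit value that no member of the population has at that position.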
Methodology
Genetic algorithms are implemented as a computer simulation in which a population of abstract representations (called chromosomes, the genotype, or the genome) of candidate solutions (called individuals, creatures, or phenotypes) to an optimization problem evolves toward better solutions.
A typical genetic algorithm requires two things to be defined:
• A genetic representation of the solution domain
• A fitness function to evaluate the solution domain
A standard representation of the solution is as an array of bits. Arrays of other types and structures can be used in essentially the same way.
The fitness function is defined over the genetic representation and measures the quality of the represented solution.
Initialization
Initially many individual solutions are randomly generated to form an initial population. The population size depends on the nature of the problem (often hundreds or thousands of possible solutions). Traditionally, the population is generated randomly, covering the entire range of possible solutions.
Selecting Parents
• Many schemes are possible, so long as better-scoring chromosomes are more likely to be selected
• The score is often termed the fitness
• “Roulette Wheel” selection can be used:
  • Add up the fitnesses of all chromosomes
  • Generate a random number R in that range
  • Select the first chromosome in the population that, when all previous fitnesses are added, gives you at least the value R
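The three roulette-wheel steps above translate almost line for line into Python. This is a sketch that assumes non-negative fitness values with a positive sum; the function name is hypothetical, and the sample population reuses the strings and fitness values A to D from the deck's "Simple Example" table.

```python
import random


def roulette_select(population, fitnesses, rng):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitnesses)                 # add up the fitnesses
    r = rng.random() * total               # random number R in [0, total)
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        # Strict comparison (a small deviation from "at least R") so that a
        # zero-fitness individual is never chosen when R happens to be 0.
        if running > r:
            return individual
    return population[-1]                  # guard against float rounding


rng = random.Random(42)
population = ["01101", "11000", "01000", "10011"]
fitnesses = [169, 576, 64, 361]
print([roulette_select(population, fitnesses, rng) for _ in range(4)])
```

Each call costs one pass over the population. Python's standard `random.choices(population, weights=fitnesses)` performs the same kind of weighted draw if a library call is preferred.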
Example: Discrete Representation (Binary alphabet)
[Figure labels: CHROMOSOME, GENE]
Representation of an individual can use discrete values (binary, integer, or any other system with a discrete set of values). The following is an example of binary representation.
Example: Discrete Representation (Binary alphabet)
8 bits Genotype
Phenotype:
• Integer
• Real Number
• Schedule
• ...
• Anything?
Example: Discrete Representation (Binary alphabet)
Phenotype could be integer numbers
Genotype: 10100011
$$1 \cdot 2^7 + 0 \cdot 2^6 + 1 \cdot 2^5 + 0 \cdot 2^4 + 0 \cdot 2^3 + 0 \cdot 2^2 + 1 \cdot 2^1 + 1 \cdot 2^0 = 128 + 32 + 2 + 1 = 163$$
Phenotype: 163
Example: Discrete Representation (Binary alphabet)
Phenotype could be real numbers
e.g. a number between 2.5 and 20.5 using 8 binary digits
Genotype: 10100011
$$x = 2.5 + \frac{163}{256}\,(20.5 - 2.5) = 13.9609$$
Phenotype: 13.9609
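The integer and real-number decodings can be checked with a few lines of Python. The bit string 10100011 is read off the powers-of-two sum in the integer example; the helper names are assumptions made for this sketch.

```python
def bits_to_int(bits):
    """Read a bit string as an unsigned binary integer."""
    return int(bits, 2)


def bits_to_real(bits, low, high):
    """Scale the integer value linearly into the interval starting at `low`.

    Dividing by 2**len(bits), as the slide does with 256, means the largest
    genotype maps to slightly less than `high`.
    """
    return low + bits_to_int(bits) / 2 ** len(bits) * (high - low)


genotype = "10100011"
print(bits_to_int(genotype))                          # 163
print(round(bits_to_real(genotype, 2.5, 20.5), 4))    # 13.9609
```

With 8 bits the real-valued phenotype can take only 256 distinct values, spaced (20.5 - 2.5) / 256 apart, which is about 0.07. More bits give a finer grid at the cost of a longer chromosome.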
Example: Discrete Representation (Binary alphabet)
Phenotype could be a schedule
e.g. 8 jobs, 2 time steps
Genotype = Phenotype:
Job        1  2  3  4  5  6  7  8
Time step  2  1  2  1  1  1  2  2
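One way to read the schedule example, sketched in Python: bit k of the genotype assigns job k to a time step. The genotype 10100011 and the rule "bit 0 means time step 1, bit 1 means time step 2" are assumptions; they are consistent with the time steps 2 1 2 1 1 1 2 2 listed for jobs 1 to 8, but the surviving slide text does not state them.

```python
def bits_to_schedule(bits):
    """Map the k-th bit (1-based) to job k: bit 0 -> time step 1, bit 1 -> time step 2."""
    return {job: int(bit) + 1 for job, bit in enumerate(bits, start=1)}


schedule = bits_to_schedule("10100011")   # assumed genotype, see note above
print(schedule)
```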
Selection
During each successive generation, a proportion of the existing population is selected to breed a new generation. Individual solutions are selected through a fitness-based process, where fitter solutions are more likely to be selected. Certain selection methods rate the fitness of each solution and preferentially select the best solutions. Other methods rate only a random sample of the population, as rating every solution may be very time-consuming.
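The slide does not name a method that rates only a random sample. Tournament selection is one widely used scheme of that kind, and it is shown here purely as an illustration; the function name, the sample size `k`, and the toy fitness function are assumptions.

```python
import random


def tournament_select(population, fitness, k, rng):
    """Rate only `k` randomly sampled individuals and return the fittest of them."""
    contestants = rng.sample(population, k)   # sample without replacement
    return max(contestants, key=fitness)


rng = random.Random(7)
population = ["01101", "11000", "01000", "10011"]


def square_fitness(bits):
    # Hypothetical fitness: square of the encoded integer.
    return int(bits, 2) ** 2


print(tournament_select(population, square_fitness, k=2, rng=rng))
```

Only `k` fitness evaluations are needed per selection. A larger `k` pushes harder toward the current best individuals, and `k` equal to the population size always returns the best one.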
Example (selection 1)
Next we apply fitness proportionate selection with the roulette wheel method:
[Figure: a roulette wheel divided into sectors 1, 2, 3, 4, ..., n; the area of each sector is proportional to the fitness value]
Individual i will have a probability
$$p_i = \frac{f(i)}{\sum_{j} f(j)}$$
of being chosen
We repeat the extraction as many times as the number of individuals we need to have the same parent population size (6 in our case)
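To make the selection probabilities concrete, the Python sketch below applies the formula to the fitness values 7, 5, 7, 4, 8, 3 that the deck's initialization example lists for s1 to s6, then draws six parents. The use of `random.choices` and the seed are illustrative choices, not something the slides prescribe.

```python
import random

fitnesses = {"s1": 7, "s2": 5, "s3": 7, "s4": 4, "s5": 8, "s6": 3}

total = sum(fitnesses.values())                      # sum of all f(i)
probabilities = {name: f / total for name, f in fitnesses.items()}
for name, p in probabilities.items():
    print(f"{name}: {fitnesses[name]}/{total} = {p:.3f}")

# Repeat the extraction six times to get a parent pool of the same size.
rng = random.Random(0)
parents = rng.choices(list(fitnesses), weights=list(fitnesses.values()), k=6)
print(parents)
```

Because the draws are made with replacement, a fit individual such as s5 can appear in the parent pool more than once, and a weak one such as s6 may not appear at all.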
Reproduction
The next step is to generate a second-generation population of solutions from those selected, through genetic operators: crossover (also called recombination) and/or mutation.
For each new solution, a pair of "parent" solutions is selected for breeding from the pool selected previously. By producing a "child" solution using crossover and mutation, a new solution is created which shares many of the characteristics of its "parents". New parents are selected for each child, and the process continues until a new population of solutions of appropriate size is generated.
Termination
This generational process is repeated until a termination condition has been reached:
• A solution is found that satisfies minimum criteria
• Fixed number of generations reached
• Allocated budget (computation time/money) reached
• The highest-ranking solution's fitness is reaching or has reached a plateau such that successive iterations no longer produce better results
• Manual inspection
• Combinations of the above
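Several of these conditions can be combined in one check, as in the Python sketch below. The function name, its parameters, and the way a plateau is detected (no improvement over the last `patience` generations) are assumptions made for illustration; budget and manual inspection are left out.

```python
def should_stop(best_history, target=None, max_generations=None, patience=None):
    """Decide whether to stop, given the best fitness of each generation so far.

    target          - stop once the latest best fitness reaches this value
    max_generations - stop after this many generations
    patience        - stop if the last `patience` generations brought no improvement
    """
    if not best_history:
        return False
    if target is not None and best_history[-1] >= target:
        return True
    if max_generations is not None and len(best_history) >= max_generations:
        return True
    if patience is not None and len(best_history) > patience:
        recent = max(best_history[-patience:])
        earlier = max(best_history[:-patience])
        if recent <= earlier:          # plateau: nothing better recently
            return True
    return False


print(should_stop([1, 2, 3], target=3))             # True
print(should_stop([1, 5, 5, 5, 5], patience=3))     # True
print(should_stop([1, 2, 3, 4, 5], patience=3))     # False
```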
Example (initialization)
We toss a fair coin 60 times and get the following initial population:
s1 = 1111010101    f(s1) = 7
s2 = 0111000101    f(s2) = 5
s3 = 1110110101    f(s3) = 7
s4 = 0100010011    f(s4) = 4
s5 = 1110111101    f(s5) = 8
s6 = 0100110000    f(s6) = 3
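The listed fitness values equal the number of 1 bits in each string. The fitness function is not stated in this part of the deck, so counting ones is an inference. The Python sketch below checks it and shows how 60 coin tosses give six 10-bit individuals; the function names are hypothetical.

```python
import random


def toss_population(n_individuals=6, n_bits=10, seed=None):
    """One fair coin toss per bit: 6 x 10 = 60 tosses for the defaults."""
    rng = random.Random(seed)
    return ["".join(rng.choice("01") for _ in range(n_bits))
            for _ in range(n_individuals)]


def count_ones(bits):
    """Assumed fitness: number of 1 bits in the string."""
    return bits.count("1")


slide_population = {
    "s1": ("1111010101", 7),
    "s2": ("0111000101", 5),
    "s3": ("1110110101", 7),
    "s4": ("0100010011", 4),
    "s5": ("1110111101", 8),
    "s6": ("0100110000", 3),
}
for name, (bits, listed) in slide_population.items():
    print(name, bits, count_ones(bits), count_ones(bits) == listed)

print(toss_population(seed=0))   # a fresh random population of the same shape
```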
Simple Example
• f(x) = {MAX(x^2): 0 <= x <= 31}
• Encode solution: just use 5 bits (1 or 0).
• Generate initial population.
A    0 1 1 0 1
B    1 1 0 0 0
C    0 1 0 0 0
D    1 0 0 1 1
• Evaluate each solution against objective.
Sol.    String    Fitness    % of Total
A       01101     169        14.4
B       11000     576        49.2
C       01000     64         5.5
D       10011     361        30.9
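The fitness table can be reproduced with a short Python sketch: decode each 5-bit string to an integer x, take x squared as the fitness, and express it as a share of the total. Variable and function names are illustrative.

```python
population = {"A": "01101", "B": "11000", "C": "01000", "D": "10011"}


def fitness(bits):
    """Objective from the example: x squared, where x is the encoded integer."""
    return int(bits, 2) ** 2


scores = {name: fitness(bits) for name, bits in population.items()}
total = sum(scores.values())

print("Sol. String Fitness % of Total")
for name, bits in population.items():
    share = 100 * scores[name] / total
    print(f"{name}    {bits}  {scores[name]:7d} {share:10.1f}")
```

The "% of Total" column is exactly the probability that roulette-wheel selection assigns to each string, so B would be picked as a parent about half the time.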
Pseudo-code algorithm
Choose initial population
Evaluate the fitness of each individual in the population
Repeat
    • Select best-ranking individuals to reproduce
    • Breed new generation through crossover and mutation (genetic operations) and give birth to offspring
    • Evaluate the individual fitnesses of the offspring
    • Replace worst-ranked part of population with offspring
Until termination
Simple Example (cont.)
• Create next generation of solutions
  • Probability of “being a parent” depends on the fitness.
• Ways for parents to create next generation
  • Reproduction: use a string again unmodified.
  • Crossover: cut and paste portions of one string to another.
  • Mutation: randomly flip a bit.
  • COMBINATION of all of the above.
The Basic Genetic Algorithm
1. [Start] Generate random population of n chromosomes (suitable solutions for the problem)
2. [Fitness] Evaluate the fitness f(x) of each chromosome x in the population
3. [New population] Create a new population by repeating the following steps until the new population is complete
    1. [Selection] Select two parent chromosomes from the population according to their fitness (the better the fitness, the bigger the chance to be selected)
    2. [Crossover] With a crossover probability, cross over the parents to form new offspring (children). If no crossover was performed, the offspring are exact copies of the parents.
    3. [Mutation] With a mutation probability, mutate new offspring at each locus (position in chromosome).
    4. [Accepting] Place new offspring in the new population
4. [Replace] Use the newly generated population for a further run of the algorithm
5. [Test] If the end condition is satisfied, stop, and return the best solution in the current population
6. [Loop] Go to step 2
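The numbered steps can be assembled into a compact Python sketch. It is one possible reading under stated assumptions: the toy objective x squared on 5-bit strings borrowed from the deck's simple example, single-point crossover, a fixed number of generations as the end condition, and all parameter values. It also remembers the best chromosome seen so far, whereas step 5 returns the best of the current population.

```python
import random


def fitness(bits):
    """Toy objective (an assumption): square of the integer the bits encode."""
    return int(bits, 2) ** 2


def crossover(a, b, rate, rng):
    """Single-point crossover with probability `rate`, else exact copies."""
    if rng.random() < rate:
        cut = rng.randint(1, len(a) - 1)
        return a[:cut] + b[cut:], b[:cut] + a[cut:]
    return a, b


def mutate(bits, rate, rng):
    """Flip each locus independently with probability `rate`."""
    return "".join(
        ("0" if bit == "1" else "1") if rng.random() < rate else bit
        for bit in bits
    )


def run_ga(n=4, n_bits=5, crossover_rate=0.7, mutation_rate=0.05,
           generations=50, seed=1):
    rng = random.Random(seed)
    # 1. [Start] random population of n chromosomes
    population = ["".join(rng.choice("01") for _ in range(n_bits))
                  for _ in range(n)]
    best = max(population, key=fitness)
    for _ in range(generations):                       # 6. [Loop]
        # 2. [Fitness] the tiny offset keeps the weights valid if all are 0
        weights = [fitness(x) + 1e-9 for x in population]
        new_population = []
        while len(new_population) < n:                 # 3. [New population]
            # [Selection] fitness-proportionate, drawn with replacement
            a, b = rng.choices(population, weights=weights, k=2)
            a, b = crossover(a, b, crossover_rate, rng)          # [Crossover]
            new_population += [mutate(a, mutation_rate, rng),    # [Mutation]
                               mutate(b, mutation_rate, rng)]    # [Accepting]
        population = new_population[:n]                # 4. [Replace]
        best = max(population + [best], key=fitness)   # remember best seen
    return best                                        # 5. [Test] fixed budget


best = run_ga()
print(best, int(best, 2), fitness(best))
```

With only four chromosomes the population can lose diversity quickly, so results depend strongly on the seed and on the mutation rate. Larger populations are the usual remedy.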
Example of mutation (Negnevitsky, Pearson Education, 2002)
[Figure: a neural network with inputs x1 (node 1) and x2 (node 2), nodes 3, 4 and 5, and output y (node 6), shown before and after mutation, each with its weights written out as a chromosome. Original network weights: -0.3, 0.9, -0.7, 0.5, -0.8, -0.6, -0.2, 0.1, 0.4. Mutated network weights: 0.2, 0.9, -0.7, 0.5, -0.8, -0.6, -0.2, 0.1, -0.1. The two networks differ in two weights: -0.3 and 0.4 in the original, 0.2 and -0.1 in the mutated network.]
Some GA Application Types
Domain                        Application types
Control                       gas pipeline, pole balancing, missile evasion, pursuit
Design                        semiconductor layout, aircraft design, keyboard configuration, communication networks
Scheduling                    manufacturing, facility scheduling, resource allocation
Robotics                      trajectory planning
Machine Learning              designing neural networks, improving classification algorithms, classifier systems
Signal Processing             filter design
Game Playing                  poker, checkers, prisoner’s dilemma
Combinatorial Optimization    set covering, travelling salesman, routing, bin packing, graph colouring and partitioning
Conclusions
Question: ‘If GAs are so smart, why ain’t they rich?’
Answer: ‘Genetic algorithms are rich - rich in application across a large and growing number of disciplines.’
- David E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning