![Page 1: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/1.jpg)
A Brief Introduction to Genetic Algorithms
Geoff Harcourt
![Page 2: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/2.jpg)
Hi
I’m Geoff
Developer at thoughtbot
Maintainer of thoughtbot/dotfiles and parity (Heroku app shortcuts)
![Page 3: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/3.jpg)
The Knapsack Problem
Given a set of items, each with its own weight, size, and
price, determine the combination of items under
the weight and size budget that has the most value.
CC BY-SA 2.5, https://commons.wikimedia.org/w/index.php?curid=985491
![Page 4: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/4.jpg)
Seating Chart for a Wedding
• Positive points for putting guests who like each other together
• Negative points for putting guests who don’t get along together
• Families must be together
What arrangement produces the most happiness?
![Page 5: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/5.jpg)
Delivery Truck Route
Given a set of packages that must be delivered in
one trip, what’s the ideal order of stops to make the
delivery in the shortest time (and/or the least distance
travelled)?
![Page 6: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/6.jpg)
What Do These Problems Have In Common?
• Potentially massive/infinite set of possible solutions (“large solution space”)
• Optimized solutions are better, but “great” is almost as good as “perfect”
• Cheap to test any one solution’s value (“fitness”)
![Page 7: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/7.jpg)
When Force Isn’t Enough
First potential approach is brute force: test every possible solution
For some problems this technique can find the solution, but if the solution space is too large or infinite, it may not be feasible.
![Page 8: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/8.jpg)
Genetic Algorithms
Genetic Algorithms (GA) are a type of search algorithm that mimics the mechanics of natural
selection to traverse a space of possible solutions and generate high-quality solutions.
![Page 9: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/9.jpg)
What does that mean?
A Genetic Algorithm combines and re-combines
elements of solutions to “evolve” toward more
optimal solutions in a manner similar to that by
which a biological population evolves over time.
![Page 10: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/10.jpg)
Current Genetic Research
![Page 11: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/11.jpg)
How does GA work?
Representation: represent parts of the problem as “genes”
Fitness: a function that can be run against the expressed genes to measure the quality of the solution
Evolution: breeding and/or mutation and selection
![Page 12: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/12.jpg)
Representation
A DNA-based organism represents its genetic code with a base-4 system [A, C, G, T]. A chromosome’s gene sequence might read as AACTGACTGA
Many problems can be expressed as base-2: 0110101000
![Page 13: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/13.jpg)
Representation: The Knapsack Problem
Generate a random set of 100 items, each with its own weight, size, and value. Put the items in an array for reference. Our organism will have one “chromosome” with 100 “genes”. Each gene is set to either 0 (not in the knapsack) or 1 (in the knapsack).
Each gene’s position in the chromosome matches the position in the array of the item it represents.
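A minimal Ruby sketch of this representation follows. The hash keys (`:weight`, `:size`, `:value`), the random ranges, and the helper names are assumptions made for illustration; the talk does not specify them.

```ruby
# Build a random catalog of items. Each item is a hash; the key names and
# the value ranges are placeholders, not taken from the talk.
def random_items(count, rng)
  Array.new(count) do
    { weight: rng.rand(1..20), size: rng.rand(1..20), value: rng.rand(1..50) }
  end
end

# A chromosome is an array of bits, one per item: 1 = packed, 0 = left out.
def random_chromosome(length, rng)
  Array.new(length) { rng.rand(2) }
end

# Decode a chromosome back into the items it selects. Gene i refers to
# items[i], which is what ties the two arrays together.
def packed_items(chromosome, items)
  chromosome.zip(items).select { |gene, _item| gene == 1 }.map(&:last)
end

rng = Random.new(42)
items = random_items(100, rng)
chromosome = random_chromosome(100, rng)
puts "#{packed_items(chromosome, items).size} of #{items.size} items packed"
```

Keeping the items in one fixed array means the chromosome itself never has to store weights or values, only bits.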
![Page 14: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/14.jpg)
Fitness
The Fitness Function is how we test any solution’s fitness, or how effective the solution is.
For some problems the best fitness will be the highest number possible or lowest number possible, or it might be the number closest to an ideal value.
![Page 15: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/15.jpg)
Fitness: The Knapsack Problem
The fitness function for the Knapsack Problem would be the sum of the value of the items that are in the Knapsack.
Our 9-gene chromosome: [0, 1, 1, 0, 0, 0, 1, 0, 1]
Our knapsack has items 1 ($11), 2 ($5), 6 ($3), and 8 ($21), counting positions from 0. Our knapsack’s value is $40, the sum of the dollar values of the items inside.
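The arithmetic can be checked with a few lines of Ruby. Only the values for items 1, 2, 6, and 8 come from the slide; the other five values are made up so the array has nine entries, and `knapsack_value` is a hypothetical helper name.

```ruby
# Dollar values by item index. Indexes 1, 2, 6, and 8 use the slide's values;
# the remaining values are invented for illustration.
values = [7, 11, 5, 9, 14, 2, 3, 8, 21]
chromosome = [0, 1, 1, 0, 0, 0, 1, 0, 1]

# Multiply each gene (0 or 1) by its item's value and add up the results,
# so only packed items contribute.
def knapsack_value(chromosome, values)
  chromosome.zip(values).sum { |gene, value| gene * value }
end

puts knapsack_value(chromosome, values) # prints 40
```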
![Page 16: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/16.jpg)
But wait, there’s more (to fitness)!
Some solutions are invalid. In the Knapsack Problem, solutions whose summed weights or volumes are greater than the knapsack can hold aren’t valid even if they contain the highest dollar value.
The fitness function needs to return a score that disqualifies these solutions. In our case, we’ll return 0 for any knapsack that exceeds the weight or size limit.
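One way to sketch a fitness function with this disqualification rule is shown here. The item hashes, the limits, and the keyword names `max_weight` and `max_size` are assumptions for the example.

```ruby
# Returns the total value of the packed items, or 0 when the knapsack
# exceeds either the weight limit or the size limit.
def fitness(chromosome, items, max_weight:, max_size:)
  packed = chromosome.zip(items).select { |gene, _item| gene == 1 }.map(&:last)
  return 0 if packed.sum { |item| item[:weight] } > max_weight
  return 0 if packed.sum { |item| item[:size] } > max_size

  packed.sum { |item| item[:value] }
end

# Three made-up items, small enough to check by hand.
items = [
  { weight: 4, size: 2, value: 11 },
  { weight: 3, size: 5, value: 5 },
  { weight: 6, size: 4, value: 21 }
]

puts fitness([1, 1, 0], items, max_weight: 10, max_size: 10) # prints 16
puts fitness([1, 1, 1], items, max_weight: 10, max_size: 10) # prints 0 (weight 13 > 10)
```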
![Page 17: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/17.jpg)
Seeding a Population
To determine a solution to our problem, let’s start with a population.
We’ll randomly generate 200 organisms by building 200 chromosomes, randomly flipping the bits in our chromosome either to put the item into the knapsack or hold it out.
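A seeding step along these lines might look like the sketch below. The function name and the fixed random seed are illustrative choices, not part of the talk.

```ruby
# Build a population of random bit-array chromosomes.
def seed_population(population_size, gene_count, rng)
  Array.new(population_size) do
    Array.new(gene_count) { rng.rand(2) } # each gene is 0 or 1 with equal chance
  end
end

rng = Random.new(2016) # fixed seed so runs are repeatable
population = seed_population(200, 100, rng)
puts "#{population.size} organisms, #{population.first.size} genes each"
```

If most random knapsacks turn out to overflow the limits and score 0, one option is to lower the chance of a gene being set to 1 when seeding.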
![Page 18: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/18.jpg)
Seeding (continued)
If we think we know some parts of the solution already (such as an item that’s worth a lot and is small and lightweight), we can use non-random or partially random seed data to nudge the population closer to the solution.
This is called “warm starting”. It should be used carefully, as it may preclude unexpectedly fit solutions.
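As a rough illustration of warm starting, the sketch below pins a few genes to chosen values and leaves the rest random. The helper name `warm_seed` and the pinned indexes (12 and 40) are hypothetical.

```ruby
# Build one chromosome that is random except for genes we choose to pin.
# fixed_genes maps a gene index to the bit it should start with.
def warm_seed(gene_count, fixed_genes, rng)
  chromosome = Array.new(gene_count) { rng.rand(2) }
  fixed_genes.each { |index, bit| chromosome[index] = bit }
  chromosome
end

rng = Random.new(7)
# Suppose we believe items 12 and 40 belong in the knapsack.
population = Array.new(200) { warm_seed(100, { 12 => 1, 40 => 1 }, rng) }
puts population.count { |c| c[12] == 1 && c[40] == 1 } # prints 200
```

Pinning genes in only part of the population, rather than all of it, is one way to limit the risk of shutting out unexpected solutions.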
![Page 19: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/19.jpg)
Now what?
We are going to iterate through a number of generations. In each generation, we’ll use the following mechanisms to move toward the best fitness:
• Crossover
• Mutation
• Selection
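The overall loop can be sketched as a skeleton that takes the three mechanisms as callables. The name `evolve`, its keyword arguments, and the trivial stand-in operators below are assumptions chosen to keep the example short; they only exercise the loop and are not meant as useful operators.

```ruby
# For each generation: pick parents, cross them over, mutate the children,
# and repeat until the new generation is as large as the old one.
def evolve(population, generations:, select_parent:, crossover:, mutate:)
  generations.times do
    next_generation = []
    while next_generation.size < population.size
      children = crossover.call(select_parent.call(population), select_parent.call(population))
      next_generation.concat(children.map { |child| mutate.call(child) })
    end
    # Crossover yields two children at a time, so trim any overshoot.
    population = next_generation.first(population.size)
  end
  population
end

# Deliberately trivial stand-in operators.
rng = Random.new(1)
pick_randomly = ->(pop) { pop[rng.rand(pop.size)] }
swap_halves = ->(a, b) { [a[0...2] + b[2..-1], b[0...2] + a[2..-1]] }
no_mutation = ->(chromosome) { chromosome }

start = [[0, 0, 0, 0], [1, 1, 1, 1], [0, 1, 0, 1]]
final = evolve(start, generations: 5, select_parent: pick_randomly,
                      crossover: swap_halves, mutate: no_mutation)
p final
```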
![Page 20: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/20.jpg)
Sweet, sweet love
Our algorithm takes two organisms* from the population and has them mate. Mating the organisms combines their respective genes and produces two new organisms, each containing some elements of their parents’ gene expressions.
* Complicated algorithms can mate more than two organisms at once; we won’t do that here.
![Page 21: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/21.jpg)
Crossover
Crossover is how we combine genes from two organisms to produce new solutions.
Crossover takes the chromosomes from two organisms and has them trade pieces with each other. The result of crossover is a group of organisms with new combinations of gene expressions that might not have existed in prior generations.
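The talk does not say which crossover scheme it uses. As one common possibility, the sketch below shows single-point crossover: cut both parents at the same random position and swap the tails. The function name is illustrative.

```ruby
# Single-point crossover: choose a cut point strictly inside the chromosome,
# then build two children by swapping the parents' tails.
def single_point_crossover(mother, father, rng)
  point = rng.rand(1...mother.size)
  [
    mother[0...point] + father[point..-1],
    father[0...point] + mother[point..-1]
  ]
end

rng = Random.new(3)
mother = [1, 1, 1, 1, 1, 1]
father = [0, 0, 0, 0, 0, 0]
child_a, child_b = single_point_crossover(mother, father, rng)
p child_a
p child_b
```

Using all-ones and all-zeros parents makes the cut point visible in the output: each child is a run of one parent's genes followed by a run of the other's.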
![Page 22: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/22.jpg)
Crossover
![Page 23: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/23.jpg)
Crossover
Crossover mimics the process of gene recombination
(both between organisms and between
chromosomes themselves) that occurs in biological
organisms.
![Page 24: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/24.jpg)
Crossover
Crossover will do much of the work of creating
genetic diversity (different solutions through different
combinations) from one generation to another, but
what if our seed population was missing some gene
expressions that would produce better solutions?
![Page 25: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/25.jpg)
Mutation
Mutation is a mechanism to maintain genetic
diversity.
Mutation is applied by flipping a gene’s expression
according to a probability defined by the algorithm.
These flipped states introduce new values into the
population and contribute to a wider search space.
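For bit-array chromosomes, per-gene mutation can be sketched as follows. The function name and the 1% rate in the demo are assumptions, not values from the talk.

```ruby
# Flip each gene independently with probability `rate`.
# Returns a new array; the original chromosome is left untouched.
def mutate(chromosome, rate, rng)
  chromosome.map { |gene| rng.rand < rate ? 1 - gene : gene }
end

rng = Random.new(5)
original = Array.new(1000, 0)
mutated = mutate(original, 0.01, rng)
puts "#{mutated.sum} of #{original.size} genes flipped"
```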
![Page 26: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/26.jpg)
Mutation
Mutation probabilities need to be kept low or else
they result in a loss of progress from generation to
generation, and the genetic algorithm becomes more
of a random search than an evolution toward an ideal
solution.
![Page 27: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/27.jpg)
Mutation
It’s often helpful to tweak mutation settings
(probability of mutation) over several tests to see how
they affect the search.
![Page 28: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/28.jpg)
A New Generation
By mating our initial population through crossover
and randomly applying mutations to a small
percentage of genes, we’ve produced a new
generation of solutions.
We’ll test each organism’s fitness to see which
organisms are most fit.
![Page 29: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/29.jpg)
Selection
Here’s where it gets interesting (and where we have
some decisions to make).
In each generation, we want to promote the most fit
solutions and demote the least fit. There are a number
of factors to consider.
![Page 30: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/30.jpg)
Selection
The fittest solutions will be selected more often to
breed into the next generation. The frequency can be
determined by various techniques including:
• weigh by organism’s % of generation’s total fitness
• randomly, weighted by fitness (“roulette wheel”)
• “tournament selection” (taking a subset and picking the best of the subset)
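Two of these techniques can be sketched in a few lines of Ruby. The function names, the parallel `fitnesses` array, and the default tournament size of 3 are assumptions made for the example.

```ruby
# Roulette wheel: each organism's chance of being picked equals its share
# of the generation's total fitness.
def roulette_select(population, fitnesses, rng)
  total = fitnesses.sum
  # If nothing has any fitness, fall back to a uniform random pick.
  return population[rng.rand(population.size)] if total.zero?

  spin = rng.rand * total
  running = 0.0
  population.each_with_index do |organism, index|
    running += fitnesses[index]
    return organism if spin < running
  end
  population.last # guard against floating-point rounding
end

# Tournament: draw a few organisms at random and keep the fittest of them.
def tournament_select(population, fitnesses, rng, size: 3)
  contenders = Array.new(size) { rng.rand(population.size) }
  population[contenders.max_by { |index| fitnesses[index] }]
end

rng = Random.new(11)
population = %i[a b c d]
fitnesses = [1, 2, 3, 14]
picks = Array.new(1000) { roulette_select(population, fitnesses, rng) }
puts "roulette picked :d #{picks.count(:d)} times out of 1000"
puts "tournament picked #{tournament_select(population, fitnesses, rng)}"
```

Roulette selection is sensitive to the scale of the fitness values, while tournament selection depends only on their ranking within each draw.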
![Page 31: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/31.jpg)
Elitism
To ensure we never lose the best solution
found so far, we can make sure that the best organism(s)
is/are always included in the next generation.
This mechanism is termed elitism.
If the threshold for elitism is too harsh, the population
may prematurely converge on a solution, behavior sometimes
called “hill-climbing”.
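A minimal sketch of elitism: copy the top organisms from the current generation and fill the remaining slots with offspring. The function name, the `elite_count` parameter, and the choice to fill with the first offspring are assumptions.

```ruby
# Carry the `elite_count` fittest organisms forward unchanged, then fill
# the rest of the next generation with offspring.
def next_generation_with_elites(current, offspring, elite_count, fitness)
  elites = current.max_by(elite_count) { |organism| fitness.call(organism) }
  elites + offspring.first(current.size - elite_count)
end

fitness = ->(chromosome) { chromosome.sum } # toy fitness: count of 1 bits
current = [[1, 1, 1], [0, 0, 1], [0, 1, 1]]
offspring = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]

p next_generation_with_elites(current, offspring, 1, fitness)
# prints [[1, 1, 1], [0, 0, 0], [1, 0, 0]]
```

Raising `elite_count` protects more of the current generation but leaves fewer slots for new combinations.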
![Page 32: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/32.jpg)
Hill-climbing
A “hill-climbing” algorithm takes a solution and checks adjacent solutions to see if neighboring options are better.
Vulnerable to local maxima/minima
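The local-maximum problem is easy to see on a small one-dimensional example. The landscape values below are made up, and `hill_climb` is a hypothetical helper that treats array indexes as solutions and array values as fitness.

```ruby
# Move to the better neighboring position until neither neighbor
# improves on the current one, then return where we stopped.
def hill_climb(landscape, start)
  position = start
  loop do
    neighbors = [position - 1, position + 1].select { |i| i.between?(0, landscape.size - 1) }
    best = neighbors.max_by { |i| landscape[i] }
    return position if landscape[best] <= landscape[position]

    position = best
  end
end

# A small peak at index 2 and a taller peak at index 7.
landscape = [1, 3, 5, 4, 2, 6, 8, 9, 7]

puts hill_climb(landscape, 0) # prints 2: stuck on the local maximum
puts hill_climb(landscape, 4) # prints 7: reaches the global maximum
```

The result depends entirely on the starting point, which is the weakness that crossover and mutation are meant to offset.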
![Page 33: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/33.jpg)
Why Did I Do This?
Fantasy Baseball!
I play in a fantasy baseball league where over 600
baseball players are controlled by the teams.
Our league was expanding from 14 to 16 teams, and
I wanted to see the effect that the change would
have on positional scarcity.
![Page 34: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/34.jpg)
Why Did I Do This?
Some positions (first base, outfield) have lots of great
hitters, while some (shortstop, catcher) have fewer
good hitters.
Some players are eligible to play at multiple
positions.
I wanted a way to simulate how our league would
draft and allocate players in the upcoming season.
![Page 35: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/35.jpg)
Why Did I Do This?
My first attempt was to build a program that
performed a draft. Each turn it found the weakest
position and then selected the best player who could
play that position.
I noticed that this program frequently turned out
solutions that looked incorrect (positions looked
oddly ranked relative to one another).
![Page 36: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/36.jpg)
Why Did I Do This?
It turns out my solution optimized prematurely, so I
was allocating players inefficiently and failing to
accurately simulate what would happen in a real
draft where people could observe the scarcity of
each position in real time.
![Page 37: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/37.jpg)
Why Did I Do This?
I wasn’t actually interested in getting a perfect
solution, but was concerned with getting something
that was a reasonable representation of what would
happen in an auction.
I used the Darwinning gem, which provides a GA
framework, to simulate the allocation of players.
![Page 38: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/38.jpg)
Further Reading
• Daniel Sellergren - Solving the 0-1 Knapsack Problem with a Genetic Algorithm in Ruby http://www.danielsellergren.com/posts/solving-the-0-1-knapsack-problem-with-a-genetic-algorithm-in-ruby
• MIT Course Lecture (very high-level, great introduction!) - Genetic Algorithms https://www.youtube.com/watch?v=kHyNqSnzP8Y
• Darwinning - Ruby gem for GA https://github.com/dorkrawk/darwinning
![Page 39: Info to Genetic Algorithms - DC Ruby Users Group 11.10.2016](https://reader030.vdocuments.us/reader030/viewer/2022021507/58ef5a891a28ab54718b4621/html5/thumbnails/39.jpg)
Keep in Touch!
GitHub - @geoffharcourt
Twitter - @geoffharcourt
Email - [email protected]
DC Tech Slack - @geoffharcourt