parallel genetic algorithms and the science of asteroseismology a review of the doctoral...
Post on 21-Dec-2015
PARALLEL GENETIC ALGORITHMS AND THE SCIENCE OF ASTEROSEISMOLOGY
A Review of the Doctoral Dissertation Research of Dr. Travis Metcalfe
Outline
Introduction
The Science of Asteroseismology
The Genetic Algorithm
Parallel Computing
Conclusion
Introduction
Astronomers observe the universe and gather information about it. They then fit this information to mathematical models. The process of “fitting” involves adjusting the many parameters of the model. When they have a good fit, they use the parameter settings to tell them something about the object or phenomenon they are studying. The author uses a parallel genetic algorithm to solve this optimization problem.
The Goal of the Research
To further the understanding of the composition and characteristics of white dwarfs
More generally, since white dwarfs are the endpoint for all but the most massive stars, this research can lead to a better understanding of stellar evolution
Traditional Technique
Make an initial “guess” for the parameter values
Use some iterative technique to improve upon the initial guesses
Adjustable Input Parameters
Mass
Temperature
H and He layer masses
Convective efficiency
Core composition
Problem with This Technique
Results often depend on the initial guess
The initial guess is inherently subjective, often the result of intuition or past experience
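The dependence on the starting guess can be seen in a toy problem. The sketch below is only an illustration: the one-dimensional misfit function `f` is a hypothetical stand-in with two minima, not the white dwarf model, and plain gradient descent stands in for "some iterative technique".

```python
def f(x):
    # Hypothetical misfit with a global minimum near x = -1.30
    # and a shallower local minimum near x = 1.13.
    return x**4 - 3 * x**2 + x


def grad(x):
    # Analytic derivative of f.
    return 4 * x**3 - 6 * x + 1


def descend(x, step=0.01, iterations=5000):
    # Plain gradient descent: improve the guess by walking downhill.
    for _ in range(iterations):
        x -= step * grad(x)
    return x


from_left = descend(-2.0)   # settles in the global minimum
from_right = descend(2.0)   # settles in the local minimum instead
print(from_left, f(from_left))
print(from_right, f(from_right))
```

Both runs converge, but only one of them finds the better fit, and nothing in the second run signals that a deeper minimum exists.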
The Genetic Algorithm
A genetic algorithm provides a more systematic approach to optimizing the results
The genetic algorithm used was PIKAIA
PIKAIA is a general-purpose “function optimization” genetic algorithm
Public domain software
Written in Fortran 77
The Science of Asteroseismology
White dwarfs that show a regular variation in light intensity are known as pulsating white dwarfs
Using photometric techniques, this variation in intensity can be measured very accurately with instruments such as the Whole Earth Telescope (WET)
The pulsation is the result of seismic activity within the white dwarf
Just as seismological information can be used to study the internal nature of the Earth, seismological data, as expressed in varying stellar luminosity, can be used to determine the characteristics of these pulsating white dwarfs.
Observed Light Curve for the White Dwarf GD 358.
The Genetic Algorithm
Initial Conditions
Population size: 1000 (in later work this was reduced to 128).
No rationale was given for how the initial population size was chosen, or why it was changed.
For each member of the initial population, parameter values are set randomly
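A minimal sketch of this initialization step follows. The parameter names and search ranges in `BOUNDS` are hypothetical placeholders chosen for illustration; the slides do not give the ranges actually used.

```python
import random

# Hypothetical search ranges (assumptions, not values from the dissertation).
BOUNDS = {
    "teff_k": (20000.0, 30000.0),
    "mass_msun": (0.45, 0.95),
    "log_mhe": (-7.0, -2.0),
}


def random_member(bounds, rng):
    # Draw every parameter uniformly inside its allowed range.
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in bounds.items()}


def initial_population(size, bounds, rng):
    # One independent random draw per population member.
    return [random_member(bounds, rng) for _ in range(size)]


population = initial_population(128, BOUNDS, random.Random(0))
print(len(population), population[0])
```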
Duration
The run continued until the difference between the average fitness and the best fitness in the population was less than 1%.
In later work, he used a constant 200 generations.
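The two stopping rules can be sketched as follows. This assumes fitness is positive with higher values being better, and that the 1% is measured relative to the best fitness; the slides do not state either detail, and the function names are illustrative.

```python
def has_converged(fitnesses, tolerance=0.01):
    # Early rule: stop once the mean fitness is within 1% of the best.
    best = max(fitnesses)
    mean = sum(fitnesses) / len(fitnesses)
    return (best - mean) < tolerance * best


def should_stop(generation, fitnesses, max_generations=None):
    # Later rule: a fixed generation budget (200 in the later work).
    if max_generations is not None:
        return generation >= max_generations
    return has_converged(fitnesses)


print(has_converged([1.0, 0.999, 0.998]))
print(should_stop(200, [1.0, 0.1], max_generations=200))
```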
Fitness Measurement
The model is then run using these initial values
Fitness is based on the root-mean-square differences between the observed and calculated pulsation periods
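As a rough sketch, the rms residual between two period lists can be computed as below. The mapping from residual to fitness, `1 / (1 + rms)`, is an assumption made here so that a perfect match scores 1.0; the slides say only that fitness is "based on" the rms differences. The period values in the usage lines are made up.

```python
import math


def rms_residual(observed, calculated):
    # Root-mean-square difference between matched pulsation periods.
    if len(observed) != len(calculated):
        raise ValueError("period lists must have the same length")
    squared = [(o - c) ** 2 for o, c in zip(observed, calculated)]
    return math.sqrt(sum(squared) / len(squared))


def fitness(observed, calculated):
    # Assumed mapping: smaller residual means higher fitness.
    return 1.0 / (1.0 + rms_residual(observed, calculated))


observed = [423.0, 464.0, 501.0]      # illustrative values only
calculated = [424.0, 462.0, 501.0]
print(rms_residual(observed, calculated), fitness(observed, calculated))
```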
Fitness Measurement
The fitness value is converted to a survival probability by normalizing with respect to the most fit member
The next generation is chosen randomly. This random selection is weighted by each member’s survival probability
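The normalization and weighted draw described above might look like this. It is a sketch of fitness-proportionate selection using Python's standard `random.choices`, assuming positive fitness values; it is not PIKAIA's own selection code.

```python
import random


def survival_probabilities(fitnesses):
    # Normalize by the fittest member, so the best member scores 1.0.
    best = max(fitnesses)
    return [f / best for f in fitnesses]


def select_next_generation(population, fitnesses, rng):
    # Weighted random draw, with replacement, of a same-sized generation.
    weights = survival_probabilities(fitnesses)
    return rng.choices(population, weights=weights, k=len(population))


rng = random.Random(1)
print(survival_probabilities([2.0, 1.0, 4.0]))
print(select_next_generation(["a", "b", "c"], [2.0, 1.0, 4.0], rng))
```

Because the draw is with replacement, fit members tend to appear several times in the next generation while unfit members tend to disappear.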
Crossover
Numerical encoding
The parameter values of each member are concatenated into one long string
A single-point crossover technique is used. The position along the string is picked randomly
Mutation
Mutation is achieved by randomly selecting a digit in the string and changing it to a new, randomly chosen value
Illustration
Consider two members, each with two parameters.
M1 has parameter values X=2.573 and Y=4.457.
M2 has parameter values X=3.547 and Y=2.332.
After encoding, M1=25734457 and M2=35472332
Illustration
The crossover point is randomly chosen, and the string segments are swapped
M1: 25734|457 → 25734332
M2: 35472|332 → 35472457
Illustration
Mutating M1 involves picking a random spot along the string and changing that value:
M1: 257|3|4332 → 25784332
Illustration*
The strings would then be parsed back into parameter values. For M1, this would be:
M1: X=2.578, Y=4.332
* Modified from [1]
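The illustration can be reproduced in a few lines. In this sketch the crossover point and the mutated position are fixed to the values shown on the slides so the output matches; in the real algorithm both are drawn at random. The four-digit, three-decimal encoding is inferred from the example and the helper names are illustrative.

```python
def encode(params, digits=4):
    # Each parameter (assumed to lie in [0, 10)) becomes four decimal digits.
    return "".join(f"{round(p * 1000):0{digits}d}" for p in params)


def decode(string, digits=4):
    # Split the string back into fixed-width fields and rescale.
    return [int(string[i:i + digits]) / 1000 for i in range(0, len(string), digits)]


def crossover(a, b, point):
    # Single-point crossover: swap everything after the cut.
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate(string, position, digit):
    # Replace one digit (0-based position) with a new one.
    return string[:position] + digit + string[position + 1:]


m1 = encode([2.573, 4.457])               # "25734457"
m2 = encode([3.547, 2.332])               # "35472332"
c1, c2 = crossover(m1, m2, point=5)       # "25734332", "35472457"
c1_mutated = mutate(c1, position=3, digit="8")   # "25784332"
print(decode(c1_mutated))                 # [2.578, 4.332]
```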
Crossover and Mutation Rate
The crossover rate: 65%
The mutation rate: 0.3%
In later work, the author increased the crossover rate to 85% and varied the mutation rate from 0.1% to 16.6%, depending on the difference between the mean fitness value and the best fitness value
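One way such an adaptive mutation rate could work is sketched below. Only the 0.1% and 16.6% limits come from the slides; the relative-spread measure, the thresholds `low_spread` and `high_spread`, and the multiplier `factor` are assumptions chosen for illustration and should not be read as the author's settings.

```python
def adjust_mutation_rate(rate, best, mean,
                         lowest=0.001, highest=0.166,
                         factor=1.5, low_spread=0.05, high_spread=0.25):
    # Relative gap between the best and the mean fitness (assumed measure).
    spread = (best - mean) / best
    if spread < low_spread:
        # Population has clustered: mutate more to keep exploring.
        rate *= factor
    elif spread > high_spread:
        # Population is scattered: mutate less to let it settle.
        rate /= factor
    # Keep the rate inside the 0.1% to 16.6% limits quoted above.
    return min(highest, max(lowest, rate))


print(adjust_mutation_rate(0.003, best=1.0, mean=0.99))
print(adjust_mutation_rate(0.15, best=1.0, mean=0.99))
```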
Elitism
The most fit solution was passed unaltered to the next generation
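A small sketch of elitism, assuming the elite member simply overwrites the first slot of the newly bred generation. Which slot is replaced is an arbitrary choice made here, not something the slides specify.

```python
def apply_elitism(old_population, old_fitnesses, new_population):
    # Find the fittest member of the previous generation.
    best_index = max(range(len(old_population)), key=lambda i: old_fitnesses[i])
    # Copy it, unaltered, over one slot of the new generation.
    survivors = list(new_population)
    survivors[0] = old_population[best_index]
    return survivors


print(apply_elitism(["a", "b", "c"], [0.2, 0.9, 0.5], ["x", "y", "z"]))
```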
Rationale
The idea behind the relatively low crossover and mutation rates is to avoid removing promising solutions from each generation too rapidly
Repetition
The paper states: “Repeating this procedure many times with different random number seeds helps to ensure that the minimum found is truly global”
It does not say how many “many times” is, though
Repetition
In a later paper, he uses 5 repetitions
This result was obtained in the following way…
Values were put into the model, and pulsation periods were generated.
The genetic algorithm attempted to find the original parameters based on the output of the model
This was done 20 times, and the results were as follows…
Results (second paper)
First Order Solution…
Run  Teff (K)  M/Ms   log(MHe/M*)  rms   Generation Found
1    26,800    0.560  -5.70        0.67  245
2    25,000    0.600  -5.96        0.00  159
3    24,800    0.605  -5.96        0.52  145
4    25,000    0.600  -5.96        0.00  68
5    22,500    0.660  -6.33        1.11  97
6    25,000    0.600  -5.96        0.00  142
7    25,000    0.600  -5.96        0.00  97
8    25,000    0.600  -5.96        0.00  194
9    25,200    0.595  -5.91        0.42  116
10   26,100    0.575  -5.80        0.54  87
11   23,900    0.625  -6.12        0.79  79
12   25,000    0.600  -5.96        0.00  165
13   26,100    0.575  -5.80        0.54  92
14   25,000    0.600  -5.96        0.00  95
15   24,800    0.605  -5.96        0.52  42
16   26,600    0.565  -5.70        0.72  246
17   24,800    0.605  -5.96        0.52  180
18   25,000    0.600  -5.96        0.00  62
19   24,100    0.620  -6.07        0.76  228
20   25,000    0.600  -5.96        0.00  167
The genetic algorithm found the exact result 9/20 times, and was close enough on four other occasions for the correct result to be determined by the addition of some other iterative technique, for a total of 65% accuracy.
If the GA was run twice and the better result selected, the accuracy increased to 88%
After 5 runs, the accuracy was over 99%
Because no exact answer was first found after generation 200 (the latest appeared at generation 194), the number of generations was reduced to 200
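The 65%, 88% and 99% figures are consistent with treating each run as an independent trial that succeeds 13 times out of 20, so that the chance of at least one success in n runs is 1 - (1 - 0.65)^n. The sketch below checks that arithmetic and the generation cutoff against the table; the independence assumption is an inference here, not something the slides state.

```python
def success_after(runs, single_run_rate):
    # Probability that at least one of `runs` independent runs succeeds.
    return 1 - (1 - single_run_rate) ** runs


single = (9 + 4) / 20   # 9 exact results plus 4 close enough

# "Generation Found" for the nine runs with rms = 0.00 in the table.
exact_generations = [159, 68, 142, 97, 194, 165, 95, 62, 167]

print(f"1 run:  {success_after(1, single):.1%}")
print(f"2 runs: {success_after(2, single):.1%}")
print(f"5 runs: {success_after(5, single):.1%}")
print("latest exact result at generation", max(exact_generations))
```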
Output Curve
Parallel Computing
Problem Division
Part one: running the numerical model using a large number of different initial parameters.
Part two: determining fitness, selecting the next generation, and performing crossover/mutation
Master-Slave Paradigm
Part one (running the model with a given set of parameters) was performed by the slave nodes
Part two (fitness evaluation and selection/crossover/mutation) was performed by the master node
PVM
PVM (Parallel Virtual Machine) was used as the message-passing library
Execution
The master machine generates a job pool of parameter values that it passes to the slave machines.
The slave machines in turn run the model and return the results to the master.
If there are more parameter sets available, the node is given another job.
Execution
The master calculates the variance and determines fitness.
After the models have been run for a given generation, the master determines the members of the next generation and runs the crossover/mutation methods on the appropriate portion of the new population.
As the new parameters are created, they are sent to the slave nodes.
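The job-pool pattern described above can be sketched on a single machine. Here Python's standard `concurrent.futures.ThreadPoolExecutor` stands in for PVM and the networked nodes, and `run_model` is a hypothetical placeholder for the white dwarf model; neither reflects the actual Fortran/PVM implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def run_model(params):
    # Placeholder for the expensive model run done on a worker node.
    x, y = params
    return (x - 1.0) ** 2 + (y + 2.0) ** 2


def evaluate_generation(population, n_workers=4):
    # The master hands parameter sets to workers; each idle worker
    # picks up the next job until the pool is empty. Results come
    # back in the same order as the population.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(run_model, population))


population = [(0.0, 0.0), (1.0, -2.0), (2.0, 1.0), (1.5, -1.5), (3.0, 3.0)]
residuals = evaluate_generation(population)
print(residuals)
```

Only the model runs are farmed out; fitness evaluation, selection, crossover and mutation would stay in the master's own loop around `evaluate_generation`.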
The Network
The cluster is composed of one master computer and 64 slave nodes
The cluster of computers is divided into three subnets
Each subnet is connected to the master serially, using coaxial cable and a 10BASE2 (thin Ethernet) system
Darwin (master computer)
Pentium-II 333 MHz system with 128 MB RAM
Two 8.4 GB hard disks
Three NE-2000 compatible network cards, one for each of the segments
Darwin
Nodes
Motherboard
Processor
Single 32 MB RAM chip
NE-2000 compatible network card
No hard drive!
Nodes
Half of the nodes contain Pentium-II 300 MHz processors, while the other half are AMD K6-II 450 MHz chips
The Cluster
Conclusion
Based on initial results, the use of genetic algorithms appears to be a promising method for minimizing the residual difference between observational data and the Wilson-Devinney model
Conclusion
It is also a wonderful example of how parallel computing, open source software and clusters of workstations can have a profound impact on the course of research.
PIKAIA Namesake
“Pikaia Gracilens, a little worm-like beast that crawled in the mud of a long gone seafloor of the Cambrian era, 530 million years ago. While not particularly impressive in the tooth and claw department, Pikaia is believed to be the founder of the phylum Chordata, whose subsequent evolution had consequences still very much felt today by the rest of the ecosystem”
References
1. Metcalfe, T. S. (1999), Genetic-Algorithm Based Light-Curve Optimization Applied to Observations of the W Ursae Majoris Star BH Cassiopeiae, The Astronomical Journal, Vol. 117, No. 5, pp. 2503-2510
2. Metcalfe, T. S., R. E. Nather, and D. E. Winget (2000), Genetic-Algorithm-Based Asteroseismological Analysis of the DBV White Dwarf GD 358, The Astrophysical Journal, Vol. 545, No. 2, pp. 974-981
3. Metcalfe, T. S. (2000), The Asteroseismology Metacomputer, Baltic Astronomy, Vol. 9, pp. 479-483
References
Author’s Web page: http://www.whitedwarf.org
Wilson-Devinney: http://cdsads.u-strasbg.fr/cgi-bin/nph-bib_query?1971ApJ...166..605W
PIKAIA Web Page: http://www.hao.ucar.edu/public/research/si/pikaia/pikaia.html
References
Image Sources
All images were taken from: http://www.whitedwarf.org
Except…
H-R Diagram: http://www.astunit.com/tutorials/stellar.htm
Pikaia Gracilens: PIKAIA Website