the bayesian optimization algorithm with substructural local search
DESCRIPTION
This work studies the utility of using substructural neighborhoods for local search in the Bayesian optimization algorithm (BOA). The probabilistic model of BOA, which automatically identifies important problem substructures, is used to define the structure of the neighborhoods used in local search. Additionally, a surrogate fitness model is considered to evaluate the improvement of the local search steps. The results show that performing substructural local search in BOA significantly reduces the number of generations necessary to converge to optimal solutions and thus provides substantial speedups.
TRANSCRIPT
The Bayesian Optimization Algorithm with Substructural Local Search
Claudio Lima, Martin Pelikan, Kumara Sastry, Martin Butz, David Goldberg, and Fernando Lobo
OBUPM 2006 2
Overview
Motivation
Bayesian Optimization Algorithm (BOA)
Modeling fitness in BOA
Substructural Neighborhoods
BOA with Substructural Hillclimbing
Results
Conclusions
Future Work
Motivation
Probabilistic models of EDAs allow better recombination of subsolutions
Can we get more from these models? Yes!
Efficiency enhancement on EDAs
  Evaluation relaxation
  Local search in substructural neighborhoods
Bayesian Optimization Algorithm
Pelikan, Goldberg, and Cantú-Paz (1999)
Use Bayesian networks to model good solutions
Model structure => acyclic directed graph
  Nodes represent variables
  Edges represent conditional dependencies
Model parameters => conditional probabilities
  Conditional Probability Tables based on the observed frequencies
  Local structures: Decision Trees or Graphs
Learning a Bayesian Network
Start with an empty network (independence assumption)
Perform the operation that improves the metric the most
  Edge addition, edge removal, edge reversal
  Metric quantifies the likelihood of the model with respect to the data (good solutions)
Stop when no more improvement is possible
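The greedy procedure above can be sketched in a few lines. This is a simplified illustration, not the BOA implementation: it assumes binary variables, uses a BIC-style score as the metric (an assumption; the slides do not name the metric), and only considers edge additions, whereas the slide also lists removals and reversals. The names `family_score`, `creates_cycle`, `learn_network`, and the `max_parents` cap are hypothetical.

```python
import math
from itertools import product

def family_score(data, child, parents):
    """BIC-style score of one node given its parents (binary variables)."""
    counts = {}
    for row in data:
        key = tuple(row[p] for p in parents)
        counts.setdefault(key, [0, 0])[row[child]] += 1
    loglik = 0.0
    for c0, c1 in counts.values():
        for c in (c0, c1):
            if c > 0:
                loglik += c * math.log(c / (c0 + c1))
    # One free parameter per parent configuration, penalized by log(n)/2.
    return loglik - 0.5 * math.log(len(data)) * 2 ** len(parents)

def creates_cycle(parents, src, dst):
    """True if adding src -> dst would close a directed cycle."""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True  # dst is already an ancestor of src
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return False

def learn_network(data, max_parents=2):
    """Greedy edge addition starting from the empty network."""
    n_vars = len(data[0])
    parents = {i: set() for i in range(n_vars)}
    scores = {i: family_score(data, i, ()) for i in range(n_vars)}
    while True:
        best_gain, best_move = 1e-9, None
        for src, dst in product(range(n_vars), repeat=2):
            if src == dst or src in parents[dst]:
                continue
            if len(parents[dst]) >= max_parents or creates_cycle(parents, src, dst):
                continue
            new = family_score(data, dst, sorted(parents[dst] | {src}))
            if new - scores[dst] > best_gain:
                best_gain, best_move = new - scores[dst], (src, dst, new)
        if best_move is None:  # no operation improves the metric: stop
            return parents
        src, dst, new = best_move
        parents[dst].add(src)
        scores[dst] = new

# Toy data: variable 1 copies variable 0, variable 2 is independent of both.
toy_data = [(a, a, b) for a in (0, 1) for b in (0, 1)] * 10
toy_model = learn_network(toy_data)
```

On the toy data the only edge worth adding links variables 0 and 1; edges involving variable 2 raise the penalty without raising the likelihood, so the search stops.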
A 3-bit Example
Model Structure (Directed Acyclic Graph): X2 -> X1, X3 -> X1

Model Parameters (Conditional Probability Table):
X2X3   P(X1=1|X2X3)
00     0.20
01     0.20
10     0.15
11     0.45

Model Parameters (Decision Tree):
X2 = 0: P(X1=1) = 0.20
X2 = 1:
  X3 = 0: P(X1=1) = 0.15
  X3 = 1: P(X1=1) = 0.45
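As a small check of the example, the sketch below encodes both parameterizations and confirms that the decision tree reproduces the full table with three leaves instead of four rows. The probability values are the ones on the slide; the names `CPT_X1` and `p_x1_tree` are hypothetical.

```python
# Full conditional probability table: (x2, x3) -> P(X1=1 | x2, x3)
CPT_X1 = {(0, 0): 0.20, (0, 1): 0.20, (1, 0): 0.15, (1, 1): 0.45}

def p_x1_tree(x2, x3):
    """Decision-tree form: split on X2 first, then on X3 only when X2 = 1."""
    if x2 == 0:
        return 0.20  # X3 is irrelevant in this branch
    return 0.15 if x3 == 0 else 0.45

# The two representations agree on every parent configuration.
tree_matches_table = all(p_x1_tree(x2, x3) == p for (x2, x3), p in CPT_X1.items())
```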
Modeling Fitness in BOA
Bayesian networks extended to store a surrogate fitness model (Pelikan & Sastry, 2004)
The surrogate fitness is learned from a proportion of the population...
...and is used to estimate the fitness of the remaining individuals (therefore reducing evaluations)
The same 3-bit Example
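Conditional Probability Table with fitness statistics:
X2X3   P(X1=1|X2X3)   f(X1=0|X2X3)   f(X1=1|X2X3)
00     0.20           -0.49          0.53
01     0.20           -0.38          0.51
10     0.15           -0.55          0.47
11     0.45           -0.52          0.62

Decision Tree with fitness statistics:
X2 = 0: P(X1=1) = 0.20, f(X1=0) = -0.48, f(X1=1) = 0.54
X2 = 1:
  X3 = 0: P(X1=1) = 0.15, f(X1=0) = -0.55, f(X1=1) = 0.47
  X3 = 1: P(X1=1) = 0.45, f(X1=0) = -0.52, f(X1=1) = 0.62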
Estimated fitness:
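A minimal sketch of how such a surrogate can be evaluated, assuming the additive form in which the estimate is the mean fitness plus one stored contribution per variable, looked up by the values of that variable and its parents. Only the X1 table from the slide is included; the mean fitness of 0.0 and the names `F_X1`, `PARENTS`, and `estimate_fitness` are placeholders for illustration.

```python
# Fitness contributions of X1 from the slide: (x2, x3) -> (f(X1=0|..), f(X1=1|..))
F_X1 = {
    (0, 0): (-0.49, 0.53),
    (0, 1): (-0.38, 0.51),
    (1, 0): (-0.55, 0.47),
    (1, 1): (-0.52, 0.62),
}

# Solutions are tuples (x1, x2, x3); index 0 is X1 with parents X2 and X3.
PARENTS = {0: (1, 2)}
CONTRIB = {0: F_X1}

def estimate_fitness(x, parents, contrib, mean_fitness=0.0):
    """Mean fitness plus the stored contribution of each modeled variable."""
    total = mean_fitness  # placeholder value, not taken from the slides
    for var, table in contrib.items():
        context = tuple(x[p] for p in parents[var])
        total += table[context][x[var]]
    return total

est_111 = estimate_fitness((1, 1, 1), PARENTS, CONTRIB)
est_001 = estimate_fitness((0, 0, 1), PARENTS, CONTRIB)
```

Because each lookup touches only one variable and its parents, comparing alternative settings of a substructure costs a few table reads rather than a full fitness evaluation.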
Why Substructural Neighborhoods?
An efficient mutation operator should search in the correct neighborhood
Oftentimes this is done by incorporating domain- or problem-specific knowledge
However, efficiency typically does not generalize beyond a small number of applications
Bitwise local search has more general applicability but gives inferior results
Substructural Neighborhoods
Neighborhoods defined by the probabilistic model of EDAs
Exploits the underlying problem structure while not losing generality of application
Exploration of neighborhoods respects dependencies between variables
If [X1X2X3] form a linkage group, the neighborhood considered will be 000, 001, 010, ..., 111
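The linkage-group example can be written as a short enumeration. This is an illustrative sketch for binary strings; the function name `substructural_neighbors` and the 0-based indexing are assumptions.

```python
from itertools import product

def substructural_neighbors(x, group):
    """Yield every solution that differs from x only inside the linkage group."""
    for values in product((0, 1), repeat=len(group)):
        y = list(x)
        for idx, val in zip(group, values):
            y[idx] = val
        yield tuple(y)

# Positions 0, 1, 2 play the role of the linkage group [X1X2X3].
example_neighbors = list(substructural_neighbors((0, 1, 1, 0, 1), (0, 1, 2)))
```

A group of k bits produces 2^k candidates (including the current setting), while the bits outside the group stay fixed.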
Substructural Local Search
For uniformly-scaled decomposable problems, substructural local search scales as O(2^k m^1.5) (Sastry & Goldberg, 2004)
  Bitwise hillclimber: O(m^k log m)
Extended Compact GA with substructural local search is more robust than either single-operator-based approach (Lima et al., 2005)
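To get a feel for the gap between the two bounds, the sketch below evaluates both growth terms for a problem with m substructures of k bits each. Constants hidden by the O-notation are ignored and a natural logarithm is assumed, so only the relative growth is meaningful; the function names are hypothetical.

```python
import math

def substructural_cost(k, m):
    """Growth term 2^k * m^1.5 of substructural local search."""
    return 2 ** k * m ** 1.5

def bitwise_cost(k, m):
    """Growth term m^k * log(m) of the bitwise hillclimber."""
    return m ** k * math.log(m)

# Ten 5-bit substructures, as in the 10x5-bit trap experiments.
ratio_10x5 = bitwise_cost(5, 10) / substructural_cost(5, 10)
```

For k=5 and m=10 the bitwise term is already more than two orders of magnitude larger than the substructural one.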
Substructural Neighborhoods in BOA
Model is more complex than in eCGA
What is a linkage group? Which dependencies to consider? Is order relevant?
Example: topology of 3 different substructural neighborhoods for variable X2:
BOA + Substructural Hillclimbing
After model sampling each offspring undergoes local search with a certain probability p_ls
Current model is used to define the neighborhoods
Choice of best subsolutions => surrogate fitness model
Cost of performing local search is then minimal
Substructural Hillclimbing in BOA
Use reverse ancestral ordering of variables
2 different versions of the substructural hillclimber (step 3)
  Evaluated fitness
  Estimated fitness
Result of local search is evaluated
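The points above can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: each neighborhood is taken to be a variable together with its parents in the network (one of several possible topologies), a candidate is accepted whenever it scores strictly higher, and `score` stands for either the real fitness function ("evaluated") or a surrogate ("estimated"). All names are hypothetical.

```python
from itertools import product

def ancestral_order(parents):
    """Order variables so that every variable comes after its parents (DAG assumed)."""
    order, placed = [], set()
    while len(order) < len(parents):
        for v in sorted(parents):
            if v not in placed and set(parents[v]) <= placed:
                order.append(v)
                placed.add(v)
    return order

def substructural_hillclimb(x, parents, score):
    """Visit variables in reverse ancestral order and keep the best
    setting found in each substructural neighborhood."""
    x = tuple(x)
    best = score(x)
    for v in reversed(ancestral_order(parents)):
        group = (v, *parents[v])  # assumed topology: variable plus its parents
        for values in product((0, 1), repeat=len(group)):
            y = list(x)
            for idx, val in zip(group, values):
                y[idx] = val
            y = tuple(y)
            s = score(y)
            if s > best:
                x, best = y, s
    return x, best

def trap3(x):
    """3-bit trap used only for this demonstration."""
    u = sum(x)
    return 3 if u == 3 else 2 - u

chain = {0: (), 1: (0,), 2: (0, 1)}  # hypothetical model: X0 -> X1, {X0, X1} -> X2
climbed, climbed_score = substructural_hillclimb((0, 0, 0), chain, trap3)
```

Starting from 000, every single-bit flip lowers the trap fitness, so a bitwise hillclimber would stay put; the neighborhood of X2 covers all three bits and reaches 111 directly. When `score` is a surrogate, the returned solution would still be evaluated with the real fitness function afterwards, as the slide notes.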
Experiments
Additively decomposable problems
Two important bounds: Onemax and concatenated k-bit traps
Many things in between
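For reference, the two test functions can be written as below. The slides do not give formulas, so this sketch assumes the commonly used definitions: onemax counts the ones, and each k-bit trap returns k when all bits are one and k-1-u otherwise, where u is the number of ones in the block.

```python
def onemax(x):
    """Number of ones in the string; every bit is an independent subproblem."""
    return sum(x)

def trap(block):
    """Fully deceptive trap on one block (assumed standard definition)."""
    k, u = len(block), sum(block)
    return k if u == k else k - 1 - u

def concatenated_traps(x, k=5):
    """Sum of non-overlapping k-bit traps; len(x) should be a multiple of k."""
    return sum(trap(x[i:i + k]) for i in range(0, len(x), k))

all_ones = (1,) * 50
all_zeros = (0,) * 50
```

With l=50 and k=5 the global optimum (all ones) scores 50, while the deceptive attractor (all zeros) scores 40.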
Onemax Results (l=50)
Correctness of substructural neighborhoods is not relevant...
...but the choice of subsolutions relies on the accuracy of the surrogate fitness model
More importantly, the acceptance of the best subsolutions also depends on the surrogate when using estimated fitness
10x5-bit trap Results (l=50)
Correct identification of problem substructure is crucial
Different versions of the hillclimber perform similarly (for small p_ls)
Cost of using evaluated fitness increases significantly with p_ls (and with problem size)
Phase transition in the population size required
Scalability Results (5-bit traps)
Substantial speedups are obtained (η=6 for l=140)
Speedup scales as O(l^0.45) for l<80
For bigger problem sizes the speedup is more moderate
p_ls = 5x10^-4 is adequate for the range of problems tested, but the optimal proportion should decrease for larger problem sizes
More on Scalability...
Scalability Issues
Optimal proportion of local search slowly decreases with problem size
Exploration of substructural neighborhoods is sensitive to the accuracy of model structure
Spurious linkage size grows with problem size
BOA’s sampling ability is not affected because conditional probabilities nearly express independence between spurious and linked variables
Future Work
Model optimal proportion of local search p_ls
Get more accurate model structures
  Only accept pairwise dependencies that improve metric beyond some threshold (significance test)
  Study the improvement function of the metric
Consider other neighborhood topologies
Consider overlapping substructures
Conclusions
Incorporation of substructural local search in BOA leads to significant speedups
Use of surrogate fitness in local search provides effective learning of substructures with minimal cost in evaluations
The importance of designing and hybridizing competent operators has been empirically demonstrated