FDA - A scalable evolutionary algorithm for the optimization of ADFs
By Hossein Momeni
Page 2
Outline
• Factorization Theorem
• FDA
• Analysis of FDA for large populations
• Boltzmann and truncation selection
• Finite and critical population
• Numerical results
• LFDA
Factorized Distribution Algorithm
Iran University of Science and Technology, November 2006
Page 3
Introduction
• In a deceptive function the global optimum x=(1,…,1) is isolated.
• Neighbors of the second-best point x=(0,…,0) have large fitness values.
• GAs are deceived by the fitness distribution.
• Most GAs will converge to x=(0,…,0).
Page 4
Solutions
• Mathematical methods are suitable to optimize deceptive functions
• Consider additively decomposed functions (ADFs)
• Sj are non-overlapping substrings of X with k elements
• This class of functions is of great theoretical and practical importance
• Optimization of an arbitrary function in this space is NP-complete
Page 5
ADFs Optimization Approaches
• Adaptive recombination
• Explicit detection of relations (Kargupta & Goldberg, 1997)
• Dependency trees (Baluja & Davies, 1997)
• Bivariate marginal distributions (Pelikan & Mühlenbein, 1998)
• Estimation of distributions (Mühlenbein et al., 1997)
Page 6
ADF
• Definition: An additively decomposed function (ADF) is defined by:
$f(x) = \sum_{s_i \in S} f_i(x_{s_i})$, with $S = \{s_1, s_2, \ldots, s_l\}$ and $s_i \subseteq X$
• For theoretical analysis, use the Boltzmann distribution
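The definition above can be sketched in a few lines of Python. This is only an illustration: the names `adf_value`, `sets` and `subfunctions` are assumptions, with each sub-function given as a callable on the bits selected by its index set.

```python
from typing import Callable, Sequence


def adf_value(x: Sequence[int],
              sets: Sequence[Sequence[int]],
              subfunctions: Sequence[Callable[[tuple], float]]) -> float:
    """Evaluate f(x) = sum_i f_i(x_{s_i}) for index sets s_i."""
    return sum(f(tuple(x[j] for j in s)) for s, f in zip(sets, subfunctions))


# OneMax written as an ADF: singleton sets s_i = {i}, f_i(x_i) = x_i.
n = 4
sets = [(i,) for i in range(n)]
subfunctions = [lambda xs: xs[0]] * n
print(adf_value([1, 0, 1, 1], sets, subfunctions))
```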
Page 7
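Gibbs or Boltzmann distribution
• Definition: The Gibbs or Boltzmann distribution of a function f is defined for u ≥ 1 by
$p(x) := \frac{\mathrm{Exp}_u f(x)}{F_u}$, where $\mathrm{Exp}_u f(x) := u^{f(x)}$
• $F_u = \sum_y \mathrm{Exp}_u f(y)$ is the partition function
• The larger the function value f(x), the larger p(x)
• Such a search distribution is suitable for an optimization problem
• Computing it directly requires exponential effort, because the partition function sums over all points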
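As a minimal sketch of why direct computation is exponential, the following Python enumerates all 2^n binary strings to obtain the partition function. The function name `boltzmann` and the choice of OneMax as test function are illustrative assumptions.

```python
from itertools import product


def boltzmann(f, n, u):
    """Return {x: u**f(x) / F_u} over all binary strings of length n.

    The enumeration visits 2**n points, which is the exponential cost
    the slide refers to.
    """
    points = list(product((0, 1), repeat=n))
    weights = [u ** f(x) for x in points]
    partition = sum(weights)  # F_u
    return {x: w / partition for x, w in zip(points, weights)}


# Example: OneMax (sum of bits) on 3 bits with basis u = 2.
p = boltzmann(sum, n=3, u=2.0)
```

For u = 1 every weight equals 1, so the same function returns the uniform distribution.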
Page 8
Reducing the cost of computing the B.D.
1) Approximate the Boltzmann distribution (simulated annealing)
2) Look for ADFs whose distribution can be computed in polynomial time
• Factorize the distribution into a product of marginal and conditional probabilities (used by FDA)
Page 9
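Input sets for the Factorization Theorem
Definition: Given S={s1,s2, …, sl}, define for i=1, 2,…, l:
$d_i := \bigcup_{j=1}^{i} s_j$, $b_i := s_i \setminus d_{i-1}$, $c_i := s_i \cap d_{i-1}$, with $d_0 := \emptyset$
In the theory of decomposable graphs:
d_i are called histories
b_i are called residuals
c_i are called separators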
Page 10
Factorization Theorem
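Theorem 1: Let p(x) be a Boltzmann distribution on X.
If
$b_i \neq \emptyset \;\; \forall i = 1, \ldots, l; \quad d_l = X; \quad \forall i \geq 2 \; \exists j < i$ such that $c_i \subseteq s_j$
then
$p(x) = \prod_{i=1}^{l} p(x_{b_i} \mid x_{c_i})$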
Page 11
FDAr
S0: Set t=0. Generate (1-r)*N ≫ 0 points randomly and r*N points according to Equation 16
S1: Selection
S2: Compute the conditional probabilities $p^s(x_{b_i} \mid x_{c_i}, t)$ using the selected points
S3: Generate a new population according to $p(x, t+1) = \prod_{i=1}^{l} p^s(x_{b_i} \mid x_{c_i}, t)$
S4: If the termination criterion is met, finish
S5: Add the best point of the previous generation to the generated points (elitist)
S6: Set t=t+1, go to Step 2
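The loop above can be sketched for the simplest case. The code below is not the general FDA: it assumes a linear fitness function, so that every b_i is a single bit and every c_i is empty, and the factorization reduces to a product of univariate marginals. It also assumes truncation selection, r = 0 (a purely random initial population), and repeats selection in every generation. The function name and all parameter defaults are illustrative.

```python
import random


def fda_univariate(f, n, N=200, tau=0.3, max_gen=100, seed=0):
    """FDA-style loop with the factorization p(x) = prod_i p(x_i)."""
    rng = random.Random(seed)
    # S0: random initial population (assumption: r = 0)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(N)]
    M = max(1, int(tau * N))  # number of selected points
    generations = 0
    for _ in range(max_gen):
        # S1: truncation selection keeps the best tau*N points
        selected = sorted(pop, key=f, reverse=True)[:M]
        # S2: marginals p^s(x_i, t) estimated from the selected points
        p = [sum(x[i] for x in selected) / M for i in range(n)]
        # S3: sample a new population from the product of marginals
        new_pop = [[int(rng.random() < p[i]) for i in range(n)]
                   for _ in range(N)]
        generations += 1
        # S4: stop once every marginal is fixed at 0 or 1
        if all(q in (0.0, 1.0) for q in p):
            pop = new_pop
            break
        # S5: elitism, keep the best point of the previous generation
        new_pop[0] = list(selected[0])
        pop = new_pop
    return max(pop, key=f), generations


best, gens = fda_univariate(sum, n=20)  # OneMax on 20 bits
```

Because of the elitist step, the best fitness in the population never decreases from one generation to the next.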
Page 12
Analysis of Factorization Algorithm
• The computational complexity depends on the factorization and the population size N
• Number of function evaluations: $FE = GEN_e \cdot N$, where $GEN_e$ is the number of generations until convergence, p(x,t+1)=p(x,t)
• The computational complexity of computing N new search points is $\mathrm{compl}(N\ \text{points}) \approx l \cdot N$
• The computational complexity of computing the probabilities is $\mathrm{compl}(p) \approx \left(\sum_{i=1}^{l} 2^{|s_i|}\right) \cdot M$
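A small worked example may make the two cost terms concrete. The helper below simply evaluates l·N and (Σ 2^|s_i|)·M; the function name `fda_cost` and the numbers used are assumptions chosen for illustration, not measurements.

```python
def fda_cost(set_sizes, N, M):
    """Return (cost of sampling N points, cost of estimating probabilities).

    set_sizes holds |s_i| for each of the l sub-functions,
    N is the population size and M the number of selected points.
    """
    l = len(set_sizes)
    sampling = l * N
    estimating = sum(2 ** size for size in set_sizes) * M
    return sampling, estimating


# Ten sub-functions on 3 bits each, N = 1000, M = 300
print(fda_cost([3] * 10, N=1000, M=300))
```

The estimation term grows exponentially in the size of the defining sets, which is why small sets matter more than the number of sub-functions.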
Page 13
Analysis of … (Contd)
• Computation of FDA depends on:
1) Number of sub-functions in the decomposition (l)
2) Size of the defining sets (si)
3) Number of selected points (M)
• An infinite population is needed to compute the probabilities exactly
• A numerically efficient FDA should use a minimal population size N*
• Computing N* is a difficult problem for any search method that uses a population of points
Page 14
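FDA-FAC
• S0: Set i=1; choose $\tilde{s}_i$ as the defining set of a non-linear sub-function
• S1: Compute $\tilde{d}_i := \bigcup_{j=1}^{i} \tilde{s}_j$
• S2: Select $s_k$ which has maximal overlap with $\tilde{d}_i$ and $s_k \not\subseteq \tilde{d}_i$
• S3: If no set is found, go to Step 5
• S4: Set $\tilde{s}_{i+1} := s_k$, $i := i+1$; if i < l go to Step 1
• S5: Compute the factorization using Eq. 6 with the sets $\tilde{s}_i$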
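The greedy ordering can be sketched as follows. This is a sketch under assumptions: the first set in the input is taken as the starting (non-linear) set, ties in overlap are broken by input order, and the final step returns the residual and separator of each ordered set, written here as pairs (b_i, c_i), rather than evaluating Eq. 6 itself. The function name `fda_fac` is illustrative.

```python
def fda_fac(sets):
    """Greedy ordering of defining sets; returns [(b_i, c_i), ...]."""
    remaining = [frozenset(s) for s in sets]
    ordered = [remaining.pop(0)]                 # S0: starting set
    while remaining:
        d = frozenset().union(*ordered)          # S1: union of chosen sets
        # S2: candidates are sets not already contained in d
        candidates = [s for s in remaining if not s <= d]
        if not candidates:                       # S3: nothing left to add
            break
        s_k = max(candidates, key=lambda s: len(s & d))  # maximal overlap
        remaining.remove(s_k)
        ordered.append(s_k)                      # S4: append and continue
    # S5: residuals b_i = s_i \ d_{i-1}, separators c_i = s_i & d_{i-1}
    factors, d = [], frozenset()
    for s in ordered:
        factors.append((sorted(s - d), sorted(s & d)))
        d = d | s
    return factors


# A chain of three overlapping sets, given out of order
print(fda_fac([{0, 1, 2}, {4, 5, 6}, {2, 3, 4}]))
```

For the chain in the example, the set {2, 3, 4} is placed before {4, 5, 6} because it overlaps the starting set, so every separator is contained in an earlier set.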
Page 15
Generation of Initial Population
• Normally the initial population is generated randomly
• For an ADF, the initial points can be generated using the information in its decomposition
• Generate subsets with high local fitness values
• The resulting distribution is an approximation of the Boltzmann distribution
• Conditional probabilities are computed using the local fitness functions
Page 16
Generation of Initial Population….
• The larger u, the steeper the distribution
• If u=1 the distribution is uniform.
Page 17
Generation of Initial Population….
• For the function OneMax(n)=∑xi, FDA computes span=1 and u=10
• There will be 10 times more 1s than 0s in the initial population
• Such an initial population might not give a B.D.
• Only half of the population is generated by this method
• The other half is generated randomly
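A minimal sketch of this initialization for OneMax follows. It assumes that each bit is drawn from the local Boltzmann distribution of f_i(x_i) = x_i, which gives p(x_i = 1) = u / (1 + u), so that u = 10 yields ten times more 1s than 0s. The function name, the parameter `r` for the biased fraction and the defaults are illustrative.

```python
import random


def initial_population(n, N, u=10.0, r=0.5, seed=0):
    """Return N bit strings: a fraction r biased by u, the rest uniform."""
    rng = random.Random(seed)
    p_one = u / (1.0 + u)      # assumed local Boltzmann marginal for OneMax
    n_biased = int(r * N)
    biased = [[int(rng.random() < p_one) for _ in range(n)]
              for _ in range(n_biased)]
    uniform = [[rng.randint(0, 1) for _ in range(n)]
               for _ in range(N - n_biased)]
    return biased + uniform    # biased points first, random points after


pop = initial_population(n=20, N=1000)
```

Keeping a randomly generated half preserves diversity in case the biased half is concentrated on the wrong region.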
Page 18
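Convergence of FDA
• If points are selected according to the Boltzmann distribution, convergence of FDA can be proved.
• For Boltzmann selection with basis v, the distribution p^s of the selected points is given by:
$p^s(x,t) = \frac{p(x,t)\, v^{f(x)}}{\sum_y p(y,t)\, v^{f(y)}}$
• If p(x,t) is a B.D., then p^s(x,t) is a B.D.
• FDA computes new search points according to
$p(x,t+1) = \prod_{i=1}^{l} p^s(x_{b_i} \mid x_{c_i}, t)$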
Page 19
• Theorem 2: If the initial points are distributed according to $p(x,0) = u^{f(x)} / F_u$ with u ≥ 1 and $F_u = \sum_y u^{f(y)}$, then for FDA the distribution at generation t is given by
$p(x,t) = \frac{w^{f(x)}}{F_w}$ with $w = u \cdot v^t$
Tip: B. selection with fixed basis v>1 defines an annealing schedule $T(t) = \frac{1}{t \ln(v) + \ln(u)}$, where t is the number of generations
Theorem 3 remains valid for any annealing schedule with $\lim_{t \to \infty} T(t) = 0$
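The relation w = u·v^t can be checked numerically on a small example. The sketch below only verifies the selection step on the full distribution, which is what FDA reproduces when the factorization is exact; it does not run FDA itself. The test function, the values of u and v, and all function names are assumptions for illustration.

```python
import math
from itertools import product


def boltzmann_probs(f, points, u):
    """Boltzmann distribution with basis u over the given points."""
    weights = [u ** f(x) for x in points]
    total = sum(weights)
    return [w / total for w in weights]


def boltzmann_select(p, f, points, v):
    """One step of Boltzmann selection with basis v applied to p."""
    weights = [pi * v ** f(x) for pi, x in zip(p, points)]
    total = sum(weights)
    return [w / total for w in weights]


def temperature(t, u, v):
    """Annealing schedule T(t) = 1 / (t ln v + ln u); undefined for u = 1, t = 0."""
    return 1.0 / (t * math.log(v) + math.log(u))


def check_theorem2(u=1.5, v=1.2, generations=3):
    """Largest deviation between repeated selection and basis u * v**t."""
    points = list(product((0, 1), repeat=4))
    f = lambda x: x[0] * x[1] + x[2] + 2 * x[3]   # arbitrary small test function
    p = boltzmann_probs(f, points, u)
    worst = 0.0
    for t in range(1, generations + 1):
        p = boltzmann_select(p, f, points, v)
        target = boltzmann_probs(f, points, u * v ** t)
        worst = max(worst, max(abs(a - b) for a, b in zip(p, target)))
    return worst


print(check_theorem2())
```

The deviation is at the level of floating-point rounding, since each selection step multiplies the basis by v.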
Page 20
• Theorem 3 (Convergence): Let $X_{opt} = \{x_{opt_1}, x_{opt_2}, \ldots\}$ be the set of optima; then, based on Theorem 2:
$\lim_{t \to \infty} p(x,t) = 1/|X_{opt}|$ if $x \in X_{opt}$, and 0 otherwise
• FDA with B. selection is an exact simulated annealing algorithm.
• Simulated annealing is controlled by two parameters: N(T) and the annealing schedule
• N can be called the population size
Page 21
Truncation Selection Vs B. selection
• Numerically, truncation selection is easier to implement
• With truncation threshold τ, the best τ*N individuals are selected.
• The conditional probabilities of the selected points are $p^s(x_{b_i} \mid x_{c_i}, t)$
• Based on the factorization theorem, new search points are generated according to
$p(x,t+1) = \prod_{i=1}^{l} p^s(x_{b_i} \mid x_{c_i}, t)$
• Problem: After truncation selection the distribution is not a B.D., therefore
$p^s(x,t) \neq \prod_{i=1}^{l} p^s(x_{b_i} \mid x_{c_i}, t)$
• Because of this inequality, $p(x_{opt}, t+1)$ need not equal $p^s(x_{opt}, t)$, which makes a convergence proof difficult.
Page 22
Theoretical Analysis for Infinite populations
• For the analysis, two linear functions will be investigated:
$\mathrm{OneMax}_n(x) = \sum_{i=1}^{n} x_i$
$\mathrm{Int}_n(x) = \sum_{i=1}^{n} 2^{i-1} x_i$
• OneMax has (n+1) different fitness values, which are multinomially distributed.
• Int has 2^n different fitness values.
• For ADFs the multinomial distribution is typical
• The distribution generated by Int is more special
• Both functions are linear, therefore the following factorization can be used:
$p(x,t+1) = \prod_{i=1}^{n} p^s(x_i, t)$
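The two test functions and the count of distinct fitness values can be checked directly for small n. This is a short sketch; the names `onemax`, `int_fn` and `distinct_values` are illustrative.

```python
from itertools import product


def onemax(x):
    """OneMax_n(x) = sum_i x_i."""
    return sum(x)


def int_fn(x):
    """Int_n(x) = sum_i 2^(i-1) x_i, with i counted from 1."""
    return sum(2 ** i * xi for i, xi in enumerate(x))


def distinct_values(f, n):
    """Number of different fitness values of f on all n-bit strings."""
    return len({f(x) for x in product((0, 1), repeat=n)})


n = 4
print(distinct_values(onemax, n), distinct_values(int_fn, n))
```

For n = 4 this gives n + 1 = 5 values for OneMax and 2^n = 16 values for Int, since Int reads the bit string as a binary number.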
Page 23
• Theorem 4: For B. selection with basis v the probability distribution for OneMax is given by:
$p(x,t) = \frac{v^{t f(x)}}{(1 + v^t)^n}$
• The number of generations needed to generate the optimum with probability 1 − ε is given by:
$GEN_e \approx \frac{\ln(n/\epsilon)}{\ln(v)}$
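As a worked check, the probability of the optimum under the OneMax formula is (v^t / (1 + v^t))^n, and requiring it to be at least 1 − ε leads to roughly ln(n/ε) / ln(v) generations. Here ε is an assumed tolerance on missing the optimum, and the function names and the values n = 32, v = 1.2, ε = 0.01 are illustrative.

```python
import math


def p_opt(n, v, t):
    """Probability of x = (1,...,1) for OneMax after t generations."""
    return (v ** t / (1.0 + v ** t)) ** n


def generations_boltzmann(n, v, eps):
    """Approximate generations until the optimum has probability 1 - eps."""
    return math.log(n / eps) / math.log(v)


n, v, eps = 32, 1.2, 0.01
t = math.ceil(generations_boltzmann(n, v, eps))
print(t, p_opt(n, v, t))
```

The estimate is conservative: since (1 + a)^(-n) ≥ 1 − n·a for a ≥ 0, the optimum has probability at least 1 − ε once v^t ≥ n/ε.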
Page 24
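• Theorem 5: For truncation selection with threshold τ and selection intensity $I_\tau$, the marginal probability p(t) obeys for OneMax
$p(t+1) = p(t) + \frac{I_\tau}{n} \sqrt{n\, p(t)(1 - p(t))}$
• The approximate solution of this equation is:
$p(t) = 0.5 \left(1 + \sin\left(\frac{I_\tau}{\sqrt{n}}\, t + \arcsin(2 p_0 - 1)\right)\right)$
where $p_0 = p(0)$
• The number of generations until convergence is given by:
$GEN_e = \left(\frac{\pi}{2} - \arcsin(2 p_0 - 1)\right) \frac{\sqrt{n}}{I_\tau}$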
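The recurrence and its approximate solution can be compared numerically. The sketch below iterates the difference equation and evaluates the sine formula side by side. The selection intensity I = 1.0 is an arbitrary illustrative value, not one taken from the slides, and the function names are assumptions.

```python
import math


def iterate_marginal(n, I, p0, steps):
    """Iterate p(t+1) = p(t) + (I/n) * sqrt(n p(t) (1 - p(t)))."""
    p, trajectory = p0, [p0]
    for _ in range(steps):
        p = min(1.0, p + (I / n) * math.sqrt(n * p * (1.0 - p)))
        trajectory.append(p)
    return trajectory


def closed_form(n, I, p0, t):
    """Approximate solution; the phase is capped at pi/2 (p = 1)."""
    phase = (I / math.sqrt(n)) * t + math.asin(2.0 * p0 - 1.0)
    return 0.5 * (1.0 + math.sin(min(phase, math.pi / 2)))


def generations_truncation(n, I, p0):
    """Generations until the closed form reaches p = 1."""
    return (math.pi / 2 - math.asin(2.0 * p0 - 1.0)) * math.sqrt(n) / I


n, I, p0 = 100, 1.0, 0.5
trajectory = iterate_marginal(n, I, p0, steps=5)
print(trajectory[5], closed_form(n, I, p0, 5), generations_truncation(n, I, p0))
```

The number of generations grows with the square root of n, in contrast to the logarithmic growth under Boltzmann selection with fixed basis.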
Page 25
Page 26
Comparison Truncation & B. selection
• T.S. needs more generations to converge than B.S.
• GEN_e is of order O(ln(n)) for B.S. and of order O(√n) for T.S.
• If the basis v is small (e.g. v=1.2), T.S. converges faster
Page 27
• B.S. with fixed v gives an annealing schedule of $T(t) = \frac{1}{t \ln(v) + \ln(u)}$
Page 28
• FDA with truncation selection generates a B.D. with annealing schedule
• The annealing schedule depends on the average fitness and the variance of the population.
Page 29
Page 30
Page 31
• For Int the B.D. is concentrated around the optimum
• The selected population has a small diversity
• In a finite population this causes a problem: some genes will get fixed to wrong alleles
Page 32
Analysis of FDA for Finite Populations
In a finite population, convergence of FDA is probabilistic
Page 33
Analysis of FDA for Finite Populations
Figure: Cumulative fixation probability for Int(16), truncation selection vs. Boltzmann selection with v=1.01