TRANSCRIPT
1 / 22 April 2012 EvoCOP 2012, Málaga, Spain
Introduction Landscape Theory Result Implications
Conclusions & Future Work
Exact Computation of the Fitness-Distance Correlation for Pseudoboolean Functions
with One Global Optimum
Francisco Chicano and Enrique Alba
Fitness-Distance Correlation
• A measure of the difficulty of a problem, defined by Jones and Forrest
Fitness-Distance Correlation: Definition
[Figure: scatter plot of fitness value against distance to the optimum]

Definition 2. Given a function $f : \mathbb{B}^n \to \mathbb{R}$, the fitness-distance correlation for $f$ is defined as
$$r = \frac{\mathrm{Cov}_{fd}}{\sigma_f \sigma_d}, \qquad (10)$$
where $\mathrm{Cov}_{fd}$ is the covariance of the fitness values and the distances of the solutions to their nearest global optimum, $\sigma_f$ is the standard deviation of the fitness values in the search space and $\sigma_d$ is the standard deviation of the distances to the nearest global optimum in the search space. Formally:
$$\mathrm{Cov}_{fd} = \frac{1}{2^n}\sum_{x\in\mathbb{B}^n}(f(x)-\bar f)(d(x)-\bar d),$$
$$\bar f = \frac{1}{2^n}\sum_{x\in\mathbb{B}^n} f(x), \qquad \sigma_f = \sqrt{\frac{1}{2^n}\sum_{x\in\mathbb{B}^n}(f(x)-\bar f)^2},$$
$$\bar d = \frac{1}{2^n}\sum_{x\in\mathbb{B}^n} d(x), \qquad \sigma_d = \sqrt{\frac{1}{2^n}\sum_{x\in\mathbb{B}^n}(d(x)-\bar d)^2}, \qquad (11)$$
where the function $d(x)$ is the Hamming distance between $x$ and its nearest global optimum.

The FDC $r$ is a value between $-1$ and $1$. The lower the absolute value of $r$, the more difficult the optimization problem is supposed to be. The exact computation of the FDC using the previous definition requires the evaluation of the complete search space: the global optima must be determined in order to define $d(x)$ and to compute the statistics for $d$ and $f$. If the objective function $f$ is a constant function, then the FDC is not well-defined, since $\sigma_f = 0$.

In the following we focus on the case in which there exists only one global optimum $x^*$ and we know the elementary landscape decomposition of $f$. The following lemma provides expressions for $\bar d$ and $\sigma_d$ in this case.

Lemma 1. Given an optimization problem defined over $\mathbb{B}^n$, if there is only one global optimum $x^*$, then the distance function $d(x)$ defined in Definition 2 is the Hamming distance between $x$ and $x^*$, and its average and standard deviation over the whole search space are given by
$$\bar d = \frac{n}{2}, \qquad \sigma_d = \frac{\sqrt{n}}{2}. \qquad (12)$$

Proof. Since there is only one global optimum, the function $d(x)$ is defined as $d(x) = H(x, x^*)$. Given an integer $0 \le k \le n$, the number of solutions at distance $k$ from $x^*$ is $\binom{n}{k}$. Then we can compute the first two raw moments
Difficult when |r| < 0.15 (Jones & Forrest)
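Definition 2 can be checked by brute force for small n. A minimal sketch (the helper name `fdc` is ours, not from the paper; maximization is assumed, and fitness ties are treated as multiple global optima):

```python
from itertools import product
from math import sqrt

def fdc(f, n):
    """Brute-force FDC of f over B^n per Definition 2 (maximization assumed)."""
    space = list(product((0, 1), repeat=n))
    values = [f(x) for x in space]
    best = max(values)
    optima = [x for x, v in zip(space, values) if v == best]
    # d(x): Hamming distance to the nearest global optimum
    def d(x):
        return min(sum(a != b for a, b in zip(x, o)) for o in optima)
    dists = [d(x) for x in space]
    m = 2 ** n
    fbar = sum(values) / m
    dbar = sum(dists) / m
    cov = sum((v - fbar) * (dd - dbar) for v, dd in zip(values, dists)) / m
    sf = sqrt(sum((v - fbar) ** 2 for v in values) / m)
    sd = sqrt(sum((dd - dbar) ** 2 for dd in dists) / m)
    return cov / (sf * sd)

# Onemax: all coefficients equal, so the FDC is -1 (maximization)
r = fdc(lambda x: sum(x), 6)
```

For Onemax the distance to the optimum is exactly n minus the fitness, so fitness and distance are perfectly anticorrelated and r = −1.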
Landscape Definition
• A landscape is a triple (X, N, f) where
Ø X is the solution space
Ø N is the neighbourhood operator
Ø f is the objective function
The pair (X, N) is called the configuration space
[Figure: example configuration space with ten solutions s0–s9 connected by the neighbourhood relation, each labelled with its fitness value]
• The neighbourhood operator is a function N: X → P(X)
• Solution y is a neighbour of x if y ∈ N(x)
• Regular neighbourhood: d = |N(x)| ∀ x ∈ X
• Symmetric neighbourhood: y ∈ N(x) ⇔ x ∈ N(y)
• Objective function f: X → R (or N, Z, Q)
Elementary Landscapes
• An elementary landscape is a landscape for which Grover's wave equation holds:
  avg_{y ∈ N(x)} f(y) = f(x) + (k/d) (f̄ − f(x))
where f̄ is the average fitness over the whole search space, d is the size of the neighbourhood and k is the eigenvalue, a constant that depends on the problem/instance. The average fitness of the neighbours is a linear expression of f(x).
Elementary Landscapes: Examples

Problem              | Neighbourhood      | d         | k
---------------------|--------------------|-----------|----------
Symmetric TSP        | 2-opt              | n(n-3)/2  | n-1
Symmetric TSP        | swap two cities    | n(n-1)/2  | 2(n-1)
Antisymmetric TSP    | inversions         | n(n-1)/2  | n(n+1)/2
Antisymmetric TSP    | swap two cities    | n(n-1)/2  | 2n
Graph α-Coloring     | recolor 1 vertex   | (α-1)n    | 2α
Graph Matching       | swap two elements  | n(n-1)/2  | 2(n-1)
Graph Bipartitioning | Johnson graph      | n²/4      | 2(n-1)
NEAS                 | bit-flip           | n         | 4
Max Cut              | bit-flip           | n         | 4
Weight Partition     | bit-flip           | n         | 4
Landscape Decomposition
• What if the landscape is not elementary?
• Any landscape can be written as the sum of elementary landscapes
• There exists a set of elementary functions that form a basis of the function space (Fourier basis)
[Figure: a non-elementary function f is projected onto elementary basis functions e1, e2 from the Fourier basis; the projections ⟨e1, f⟩ e1 and ⟨e2, f⟩ e2 are the elementary components of f]
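The decomposition into elementary components can be sketched with the Walsh (Fourier) basis over B^n: grouping the Walsh coefficients by order p yields the order-p components. A small illustration (`walsh_components` is our own name, not from the talk; cost is exponential in n, so small n only):

```python
from itertools import product

def walsh_components(f, n):
    """Split f: B^n -> R into components f[0..n] by Walsh-coefficient order."""
    space = list(product((0, 1), repeat=n))
    N = 2 ** n
    comps = [dict.fromkeys(space, 0.0) for _ in range(n + 1)]
    for w in space:
        p = sum(w)  # order of the Walsh function = number of ones in w
        # Walsh coefficient a_w = (1/2^n) sum_x f(x) (-1)^(w.x)
        a_w = sum(f(x) * (-1) ** sum(wi * xi for wi, xi in zip(w, x))
                  for x in space) / N
        for x in space:
            comps[p][x] += a_w * (-1) ** sum(wi * xi for wi, xi in zip(w, x))
    return comps

# Example: ones count plus a pairwise interaction term (order-2 part)
n = 4
f = lambda x: sum(x) + 2 * x[0] * x[1]
comps = walsh_components(f, n)
x0 = (1, 0, 1, 1)
total = sum(c[x0] for c in comps)   # the components must add back to f(x0)
```

The order-0 component is the constant f̄, and every higher-order component averages to zero over the search space.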
Landscape Decomposition: Examples

Problem              | Neighbourhood      | d         | Components
---------------------|--------------------|-----------|---------------------------------
General TSP          | inversions         | n(n-1)/2  | 2
General TSP          | swap two cities    | n(n-1)/2  | 2
Subset Sum Problem   | bit-flip           | n         | 2
MAX k-SAT            | bit-flip           | n         | k
NK-landscapes        | bit-flip           | n         | k+1
Radio Network Design | bit-flip           | n         | max. nb. of reachable antennae
Frequency Assignment | change 1 frequency | (α-1)n    | 2
QAP                  | swap two elements  | n(n-1)/2  | 3
Pseudoboolean Functions
• The set of solutions X is the set of binary strings of length n
• Neighborhood used in the proof of our main result: one-change neighborhood
Ø Two solutions x and y are neighbors iff Hamming(x,y) = 1
[Figure: the bit string 0111010010 together with its ten one-change neighbors, each obtained by flipping a single bit]
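A one-change neighborhood like the one pictured can be generated in a few lines (bit strings as tuples; `neighbors` is our own helper name):

```python
def neighbors(x):
    """All Hamming-distance-1 neighbors of a bit tuple (one-change neighborhood)."""
    return [x[:i] + (1 - x[i],) + x[i + 1:] for i in range(len(x))]

x = (0, 1, 1, 0)
N = neighbors(x)  # n neighbors: the neighborhood is regular with d = n
```

The neighborhood is also symmetric: flipping the same bit again returns to x, so y ∈ N(x) ⇔ x ∈ N(y).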
Pseudoboolean Function Decomposition
• Any arbitrary pseudoboolean function can be written as
  f(x) = Σ_{p=0}^{n} f_[p](x)
where f_[0] = f̄ (the constant function) and, for p > 0, each f_[p] is an elementary landscape with eigenvalue 2p (an order-p elementary landscape) with zero offset: f̄_[p] = 0.
Spheres around a Solution
• If f is elementary, the average of f in any sphere and any ball of any size around x is a linear expression of f(x)!!! (Sutton, Whitley, Langdon)
  Σ_{H(x,y')=1} f(y') = λ1 f(x),  Σ_{H(x,y'')=2} f(y'') = λ2 f(x),  Σ_{H(x,y''')=3} f(y''') = λ3 f(x)
  (for a zero-offset elementary f; the one-change neighborhood has n+1 possible eigenvalues: 0, 2, 4, …, 2n)
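This sphere-average property can be checked numerically on Onemax, an order-1 elementary landscape. For an order-1 elementary function one can derive the sphere average avg_{H(x,y)=k} f(y) = f̄ + (1 − 2k/n)(f(x) − f̄); that closed form is our derivation, not from the slides, and the sketch below verifies it:

```python
from itertools import combinations

def sphere_avg(f, x, k):
    """Average of f over the Hamming sphere of radius k around x."""
    n = len(x)
    vals = []
    for idx in combinations(range(n), k):
        y = list(x)
        for i in idx:        # flip exactly the k selected bits
            y[i] = 1 - y[i]
        vals.append(f(tuple(y)))
    return sum(vals) / len(vals)

# Onemax: f(x) = number of ones, f̄ = n/2
n = 8
f = lambda x: sum(x)
x = (1, 1, 1, 1, 1, 1, 0, 0)          # f(x) = 6
fbar = n / 2
lhs = sphere_avg(f, x, 3)             # average over the radius-3 sphere
rhs = fbar + (1 - 2 * 3 / n) * (f(x) - fbar)  # linear expression in f(x)
```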
Computing the Covariance
• We can use this property to compute the covariance between f(x) and d(x)
of $d(x)$ over the search space as:
$$\alpha_1 = \bar d = \frac{1}{2^n}\sum_{k=0}^{n}\binom{n}{k}k = \frac{n2^{n-1}}{2^n} = \frac{n}{2},$$
$$\alpha_2 = \overline{d^2} = \frac{1}{2^n}\sum_{k=0}^{n}\binom{n}{k}k^2 = \frac{n(n+1)2^{n-2}}{2^n} = \frac{n(n+1)}{4}.$$
Using these moments we can compute the standard deviation as $\sqrt{\alpha_2 - \alpha_1^2}$, which yields:
$$\sigma_d = \sqrt{\frac{n(n+1)}{4} - \frac{n^2}{4}} = \sqrt{\frac{n}{4}} = \frac{\sqrt{n}}{2}. \qquad (13)$$
∎
Now we are ready to prove the main result of this work.

Theorem 1. Let $f$ be an objective function whose elementary landscape decomposition is $f = \sum_{p=0}^{n} f_{[p]}$, where $f_{[0]}$ is the constant function $f_{[0]}(x) = \bar f$ and each $f_{[p]}$ with $p > 0$ is an order-$p$ elementary function with zero offset. If there exists only one global optimum $x^*$ in the search space, the FDC can be exactly computed as:
$$r = \frac{-f_{[1]}(x^*)}{\sigma_f \sqrt{n}}. \qquad (14)$$
Proof. Let us expand the covariance as
$$\mathrm{Cov}_{fd} = \frac{1}{2^n}\sum_{x\in\mathbb{B}^n} f(x)d(x) - \bar f\,\bar d = \frac{1}{2^n}\sum_{k=0}^{n} k \sum_{\substack{x\in\mathbb{B}^n\\ H(x,x^*)=k}} f(x) - \bar f\,\frac{n}{2}$$
$$= \frac{1}{2^n}\sum_{k=0}^{n} k \sum_{\substack{x\in\mathbb{B}^n\\ H(x,x^*)=k}} \sum_{p=0}^{n} f_{[p]}(x) - f_{[0]}\frac{n}{2} = \frac{1}{2^n}\sum_{k=0}^{n} k \sum_{p=0}^{n} K^{(n)}_{k,p}\, f_{[p]}(x^*) - f_{[0]}\frac{n}{2}$$
$$= \sum_{p=0}^{n}\left(\frac{1}{2^n}\sum_{k=0}^{n} k\, K^{(n)}_{k,p}\right) f_{[p]}(x^*) - f_{[0]}\frac{n}{2} = \frac{n}{2} f_{[0]} - \frac{1}{2} f_{[1]}(x^*) - f_{[0]}\frac{n}{2} = -\frac{1}{2} f_{[1]}(x^*), \qquad (15)$$
where we used the result in Proposition 1. Substituting in (10) we obtain (14). ∎

The previous theorem shows that the only thing we need to know about the global optimum is the value of its first elementary component. With this information we can exactly compute the FDC. Some problems for which we know the elementary landscape decomposition based on the numerical data defining a problem instance are MAX-SAT, 0-1 Unconstrained Quadratic Optimization
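Theorem 1 can be sanity-checked against the brute-force Definition 2 for a small linear function, whose order-1 component has the closed form f_[1](x) = Σᵢ aᵢ(xᵢ − 1/2) (obtained by splitting the mean off (17); maximization and all aᵢ ≠ 0 assumed, so the optimum is unique):

```python
from itertools import product
from math import sqrt

# all coefficients nonzero -> unique global optimum, as Theorem 1 requires
a = [3.0, -1.5, 2.0, 0.5, -4.0]
b = 1.0
n = len(a)
f = lambda x: sum(ai * xi for ai, xi in zip(a, x)) + b
space = list(product((0, 1), repeat=n))
m = 2 ** n
vals = [f(x) for x in space]
xstar = max(space, key=f)                 # the unique global optimum
fbar = sum(vals) / m
sigma_f = sqrt(sum((v - fbar) ** 2 for v in vals) / m)

# order-1 elementary component of a linear function, evaluated at x*
f1_star = sum(ai * (xi - 0.5) for ai, xi in zip(a, xstar))
r_theorem = -f1_star / (sigma_f * sqrt(n))     # eq. (14)

# brute-force FDC per Definition 2
d = lambda x: sum(p != q for p, q in zip(x, xstar))
dbar = sum(d(x) for x in space) / m
cov = sum((f(x) - fbar) * (d(x) - dbar) for x in space) / m
sigma_d = sqrt(sum((d(x) - dbar) ** 2 for x in space) / m)
r_brute = cov / (sigma_f * sigma_d)
```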
Fitness-Distance Correlation Formulas
• Using the previous facts we get for the elementary landscapes…
  r = −f_[p](x*) / (σ_f √n)   if p = 1
  r = 0                        if p > 1
• In general, for an arbitrary function…
  f(x) = f_[0](x) + f_[1](x) + f_[2](x) + … + f_[n](x)
… the only component contributing to r is f_[1](x)
Rugged components are not considered by FDC
Implications for Linear Functions
• Fitness-Distance Correlation for order-1 (linear) elementary landscapes (assume max.)
• We can define a linear elementary landscape with any desired FDC ρ in [−1, 0)
• "Difficult" problems (|r| < 0.15) can be obtained starting at n = 45
The previous corollary states that only elementary landscapes with order $p = 1$ have a nonzero FDC. Furthermore, the FDC depends on the value of the objective function in the global optimum $f(x^*)$ and the average value $\bar f$, but not on the solution $x^*$ itself. We can also observe that if we are maximizing, then $f(x^*) > \bar f$ and the FDC is negative, while if we are minimizing, $f(x^*) < \bar f$ and the FDC is positive.

Interestingly, the order-1 elementary landscapes can always be written as linear functions and they can be optimized in polynomial time. That is, if $f$ is an order-1 elementary function then it can be written in the following way:
$$f(x) = \sum_{i=1}^{n} a_i x_i + b, \qquad (17)$$
where the $a_i$ and $b$ are real values. The following proposition provides the average and the standard deviation for this family of functions.

Proposition 2. Let $f$ be an order-1 elementary function, which can be written as (17). Then, the average and the standard deviation of the function values in the whole search space are:
$$\bar f = b + \frac{1}{2}\sum_{i=1}^{n} a_i, \qquad \sigma_f = \frac{1}{2}\sqrt{\sum_{i=1}^{n} a_i^2}. \qquad (18)$$

Proof. Using the linearity property of the average we can write $\bar f = \overline{\sum_{i=1}^{n} a_i x_i + b}$, and $\bar f$ in (18) follows from the fact that $\bar{x}_i = 1/2$. Now we can compute the variance of $f$ as:
$$\mathrm{Var}[f] = \overline{(f(x)-\bar f)^2} = \overline{\left(\sum_{i=1}^{n} a_i x_i - \frac{1}{2}\sum_{i=1}^{n} a_i\right)^2} = \overline{\left(\sum_{i=1}^{n} a_i\left(x_i - \frac{1}{2}\right)\right)^2}$$
$$= \sum_{i,j=1}^{n} a_i a_j\, \overline{\left(x_i - \frac{1}{2}\right)\left(x_j - \frac{1}{2}\right)} = \sum_{i,j=1}^{n} a_i a_j \left(\overline{x_i x_j} - \frac{1}{2}\bar{x}_i - \frac{1}{2}\bar{x}_j + \frac{1}{4}\right)$$
$$= \sum_{i,j=1}^{n} a_i a_j \left(\overline{x_i x_j} - \frac{1}{4}\right) = \sum_{i,j=1}^{n} a_i a_j \left(\delta_i^j\frac{1}{4} + \frac{1}{4} - \frac{1}{4}\right) = \frac{1}{4}\sum_{i=1}^{n} a_i^2, \qquad (19)$$
where we used again $\bar{x}_i = \bar{x}_j = 1/2$ and $\overline{x_i x_j} = \frac{1}{4}(\delta_i^j + 1)$, $\delta_i^j$ being the Kronecker delta. The expression for $\sigma_f$ in (18) follows from (19). ∎

Using Proposition 2 we can compute the FDC for the order-1 elementary landscapes.

Proposition 3. Let $f$ be an order-1 elementary function written as (17) such that all $a_i \neq 0$. Then, it has only one global optimum and its FDC (assuming maximization) is:
$$r = \frac{-\sum_{i=1}^{n}|a_i|}{\sqrt{n}\sqrt{\sum_{i=1}^{n} a_i^2}}, \qquad (20)$$
which is always in the interval $-1 \le r < 0$.
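Proposition 2 is easy to confirm by enumeration for a small coefficient vector (a sketch; the variable names are ours):

```python
from itertools import product
from math import sqrt

a = [2.0, -3.0, 0.7, 1.2]
b = 5.0
n = len(a)
f = lambda x: sum(ai * xi for ai, xi in zip(a, x)) + b
space = list(product((0, 1), repeat=n))
m = 2 ** n

# statistics by full enumeration of B^n
fbar_enum = sum(f(x) for x in space) / m
sigma_enum = sqrt(sum((f(x) - fbar_enum) ** 2 for x in space) / m)

# closed forms from Proposition 2, eq. (18)
fbar_prop = b + sum(a) / 2
sigma_prop = sqrt(sum(ai ** 2 for ai in a)) / 2
```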
Proof. The global optimum $x^*$ has 1 in all the positions $i$ such that $a_i > 0$, and the maximum fitness value is:
$$f(x^*) = b + \sum_{\substack{i=1\\ a_i>0}}^{n} a_i. \qquad (21)$$
Using Proposition 2 we can write:
$$\bar f - f(x^*) = \left(b + \frac{1}{2}\sum_{i=1}^{n} a_i\right) - \left(b + \sum_{\substack{i=1\\ a_i>0}}^{n} a_i\right) = -\frac{1}{2}\sum_{i=1}^{n}|a_i|. \qquad (22)$$
Replacing the previous expression and $\sigma_f$ in (16) we prove the claimed result. ∎

When all the values of $a_i$ are the same, the FDC computed with (20) is $-1$. This happens in particular for the Onemax problem. But if there exist different values for the $a_i$, then we can reach any arbitrary value in $[-1, 0)$ for $r$. The following theorem provides a way to do it.

Theorem 2. Let $\rho$ be an arbitrary real value in the interval $[-1, 0)$. Then any linear function $f(x)$ given by (17) where $n > 1/\rho^2$, $a_2 = a_3 = \ldots = a_n = 1$ and
$$a_1 = \frac{(n-1) + n|\rho|\sqrt{(1-\rho^2)(n-1)}}{n\rho^2 - 1} \qquad (23)$$
has exactly FDC $r = \rho$.

Proof. The expression for $a_1$ is well-defined since $n\rho^2 > 1$. Replacing all the $a_i$ in (20) we get $r = \rho$. ∎

Theorem 2 provides a solid argument against the use of FDC as a measure of the difficulty of a problem. In effect, we can always build an optimization problem based on a linear function, which can be solved in polynomial time, with an FDC as near as desired to 0 (but not zero), that is, as "difficult" as desired according to the FDC. However, we have to highlight here that for a given FDC value $\rho$ we need at least $n > 1/\rho^2$ variables. Thus, an FDC nearer to 0 requires more variables.

4 FDC, Autocorrelation Length and Local Optima

The autocorrelation length $\ell$ [8] has also been used as a measure of the difficulty of a problem. Chicano and Alba [4] found a negative correlation between $\ell$ and the number of local optima in the 0-1 Unconstrained Quadratic Optimization problem (0-1 UQO), an NP-hard problem [9]. Kinnear [11] also studied the use of autocorrelation measures for problem difficulty, but the results were inconclusive.
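Theorem 2 can be instantiated directly. A sketch (helper names `a1_for_fdc` and `fdc_linear` are ours; eq. (20) assumes maximization and all aᵢ ≠ 0) that also reproduces the a₁ ≈ 7061.43 value shown on the slides for ρ = −0.15 and n = 45:

```python
from math import sqrt

def a1_for_fdc(rho, n):
    """a1 from eq. (23): f = a1*x1 + x2 + ... + xn then has FDC exactly rho."""
    assert n > 1 / rho ** 2          # required for (23) to be well-defined
    return ((n - 1) + n * abs(rho) * sqrt((1 - rho ** 2) * (n - 1))) / (n * rho ** 2 - 1)

def fdc_linear(a, n):
    """FDC of a linear function with nonzero coefficients a, eq. (20)."""
    return -sum(abs(ai) for ai in a) / (sqrt(n) * sqrt(sum(ai ** 2 for ai in a)))

rho, n = -0.15, 45
a1 = a1_for_fdc(rho, n)              # close to the slide's 7061.43
a = [a1] + [1.0] * (n - 1)
r = fdc_linear(a, n)                 # recovers rho
```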
Example for ρ = −0.15 and n = 45, using (23):
a1 = 7061.43
a2 = a3 = … = a45 = 1
• Autocorrelation length can also be exactly computed with landscape theory
• The expression only depends on the instance data (not the global optimum)
Comparison with Autocorrelation Length
In this section we investigate which of the two measures, ℓ or the absolute value of the FDC, is more correlated to the number of local optima for some random instances of the 0-1 UQO. In particular, we have randomly generated 1650 UQO instances using the Palubeckis instance generator [13]. The size of the instances varies between n = 10 and n = 20 and the density (percentage of nonzero elements in the coefficients matrix) varies from 10 to 90 in steps of 20. For each n and density, 30 random instances were generated by randomly selecting the nonzero elements of the coefficients matrix from the interval [−100, 100]. For all the instances we computed the autocorrelation length ℓ, the absolute value of the FDC |r| and the number of local optima (minima) by complete enumeration of the search space. In Table 1 we show the Spearman rank correlation coefficient between the number of local optima and ℓ and |r|. The correlations are computed using all the instances with the same size n.
Table 1. Spearman correlation coefficient for the number of local optima against the autocorrelation length (ℓ) and the absolute value of the FDC (|r|).

n    | 10      | 11      | 12      | 13      | 14      | 15
ℓ    | −0.5467 | −0.5545 | −0.5896 | −0.4796 | −0.4725 | −0.5511
|r|  | −0.1407 | −0.1843 | −0.0787 | −0.1203 | −0.1944 | −0.0538

n    | 16      | 17      | 18      | 19      | 20
ℓ    | −0.4959 | −0.5740 | −0.5872 | −0.5249 | −0.4829
|r|  | −0.1251 | −0.1791 | −0.1339 | −0.3310 | −0.0338
We can observe a high inverse correlation (around −0.5) between the number of local optima and the autocorrelation length, supporting the autocorrelation length conjecture. However, the correlation between the number of local optima and the FDC is low, again supporting the hypothesis that FDC is not an appropriate measure of the difficulty of a problem (this time, from an experimental point of view).
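The Spearman rank correlation used in Table 1 is the Pearson correlation of the ranks, with ties assigned their average rank. In practice one would call `scipy.stats.spearmanr`; a minimal pure-Python sketch:

```python
def ranks(xs):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(xs, ys):
    """Spearman rank correlation = Pearson correlation of the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den

# a perfectly inverse monotone relation gives -1
rho = spearman([1, 2, 3, 4, 5], [10, 8, 6, 4, 2])
```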
5 Conclusion
We have applied landscape theory to exactly compute the Fitness-Distance Correlation of combinatorial optimization problems defined over sets of binary strings. The result is valid in the case in which one single global optimum exists in the landscape. We defer the analysis of the general case to future work.

The expression for the FDC takes into account only the order-1 elementary component of the objective function, while previous work suggests that the components making a problem difficult are the higher-order elementary components. This fact questions the use of FDC as a measure of the difficulty of a problem. We prove that there exist polynomial-time solvable problems with an FDC arbitrarily near to zero. An experimental study over random instances of the 0-1 UQO shows a low correlation between FDC and the number of local optima, supporting the hypothesis that FDC fails to capture the problem difficulty.
• Spearman rank correlation coefficients
• Correlations with the number of local optima
• 0-1 Unconstrained Quadratic Optimization (1650 random instances considered)
• Higher correlation with autocorrelation length
18 / 22 April 2012 EvoCOP 2012, Málaga, Spain
Elementary landscape decomposition of the objective function:

  f(x) = \sum_{p=0}^{n} f_{[p]}(x) = f_{[0]}(x) + f_{[1]}(x) + f_{[2]}(x) + ... + f_{[n]}(x),  where f_{[0]} = \bar{f}

Closed-form FDC when a single global optimum x* exists:

  r = - f_{[1]}(x*) / (\sigma_f \sqrt{n})

so r = 0 whenever f_{[1]}(x*) = 0.

For a linear function f(x) = \sum_j a_j x_j we have f_{[p]} = 0 for p > 1 and f_{[1]}(x) = \sum_j a_j (x_j - 1/2).

Example linear function: a_1 = 7061.43, a_2 = a_3 = ... = a_{45} = 1

Landscape-theory factorization into a problem part and an algorithm part:

  R(A, f) = \Theta(f) \otimes \Lambda(A)    (1)
            (problem)    (algorithm)
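The closed-form expression r = -f_{[1]}(x*) / (σ_f √n) can be cross-checked against the brute-force definition of FDC for a small linear function. A sketch follows, assuming minimisation with positive coefficients so that the unique global optimum is x* = 0...0 (function names are illustrative):

```python
import itertools
import math

def brute_fdc(a):
    """FDC of the linear function f(x) = sum(a_i x_i), minimised,
    computed directly from the definition over all of {0,1}^n."""
    n = len(a)
    xs = list(itertools.product((0, 1), repeat=n))
    f = [sum(ai * xi for ai, xi in zip(a, x)) for x in xs]
    d = [sum(x) for x in xs]  # Hamming distance to x* = 0...0
    m = 2 ** n
    fb, db = sum(f) / m, sum(d) / m
    cov = sum((fi - fb) * (di - db) for fi, di in zip(f, d)) / m
    sf = math.sqrt(sum((fi - fb) ** 2 for fi in f) / m)
    sd = math.sqrt(sum((di - db) ** 2 for di in d) / m)
    return cov / (sf * sd)

def closed_form_fdc(a):
    """r = -f_[1](x*) / (sigma_f * sqrt(n)).  For a linear function
    f_[1] = f - fbar, so f_[1](0...0) = -sum(a)/2, and
    sigma_f = sqrt(sum(a_i^2)) / 2."""
    n = len(a)
    f1_at_opt = -sum(a) / 2
    sigma_f = math.sqrt(sum(ai * ai for ai in a)) / 2
    return -f1_at_opt / (sigma_f * math.sqrt(n))
```

With one dominant coefficient (as in the slide's n = 45 example, scaled down here to n = 10 so enumeration stays cheap), both computations agree and |r| stays close to 1/√n.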
19 / 22 April 2012 EvoCOP 2012, Málaga, Spain
Frequently Asked Questions
How can we use the result?
• Can be used to improve the computation of FDC
• But we need the elementary landscape decomposition of the problem
• Only the order-1 elementary component is required
How can we obtain this?
• Some problems have been decomposed. If yours is not there, you can decompose it (if possible)
20 / 22 April 2012 EvoCOP 2012, Málaga, Spain
Frequently Asked Questions
We also need to know the global optimum, but this is usually not known
• Right, if the optimum is not known the result will be an approximation
The empirical computation of FDC is also an approximation; what is the advantage of the proposed formula?
• First, it requires less computational resources (memory and CPU)
• Second, it gives a value which considers all the solutions in the search space (against the relatively small number of sampled solutions used by the empirical approach)
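For contrast, this is what the empirical estimator looks like: it samples a subset of the search space and, as the previous slide notes, it still needs a (putative) global optimum to measure distances against. Names and the uniform-sampling scheme are illustrative assumptions:

```python
import math
import random

def fdc_sample(f, optimum, n, m, rng):
    """Empirical FDC estimate from m uniform samples of {0,1}^n,
    using a known (or assumed) global optimum to compute distances."""
    fs, ds = [], []
    for _ in range(m):
        x = [rng.randint(0, 1) for _ in range(n)]
        fs.append(f(x))
        ds.append(sum(a != b for a, b in zip(x, optimum)))
    fb, db = sum(fs) / m, sum(ds) / m
    cov = sum((a - fb) * (b - db) for a, b in zip(fs, ds)) / m
    sf = math.sqrt(sum((a - fb) ** 2 for a in fs) / m)
    sd = math.sqrt(sum((b - db) ** 2 for b in ds) / m)
    return cov / (sf * sd)
```

Even with thousands of samples the result only approximates the exact value computed over all 2^n solutions, which is the point the slide is making.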
What happens if more than one global optimum exists?
• Touché. We are working on it.
21 / 22 April 2012 EvoCOP 2012, Málaga, Spain
Conclusions & Future Work
Conclusions
• We provide a closed-form formula for FDC
• FDC only takes into consideration the order-1 elementary component of the function when only one global optimum exists
• Linear functions can be defined to have a value of FDC as low as desired (greater than 0)
• Autocorrelation length seems to be more correlated with the number of local optima than FDC for UQO

Future Work
• Find an expression for the case in which more than one global optimum exists
• Check how good FDC is on "interesting" problems
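The linear-functions claim can be illustrated numerically. For f(x) = Σ a_i x_i with positive coefficients (minimisation, unique optimum 0...0), the FDC definition works out to r = Σa_i / (√(Σa_i²) √n) — a quick derivation from the definitions, not a formula stated verbatim on the slides — which tends to 1/√n as one coefficient dominates, so growing n drives r toward 0 while the problem remains trivially solvable:

```python
import math

def linear_fdc(n, big):
    """FDC of f(x) = big*x_1 + x_2 + ... + x_n (minimised, optimum 0...0),
    via r = sum(a) / (sqrt(sum(a^2)) * sqrt(n)).  As big -> infinity this
    tends to 1/sqrt(n), so larger n gives FDC as close to 0 as desired."""
    a = [big] + [1] * (n - 1)
    return sum(a) / (math.sqrt(sum(x * x for x in a)) * math.sqrt(n))
```

With the slide's example (n = 45, a_1 = 7061.43) the value is about 0.15 ≈ 1/√45; pushing n up shrinks it further.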
22 / 22 April 2012 EvoCOP 2012, Málaga, Spain
Thanks for your attention!
Exact Computation of the Fitness-Distance Correlation for Pseudoboolean Functions with One Global Optimum