multi object optimization
TRANSCRIPT
8/3/2019 Multi Object Optimization
http://slidepdf.com/reader/full/multi-object-optimization 1/55
Evolutionary Methods in Multi-Objective Optimization
- Why do they work? -
Lothar Thiele
Computer Engineering and Networks Laboratory
Dept. of Information Technology and Electrical Engineering
Swiss Federal Institute of Technology (ETH) Zurich
Overview
introduction
limit behavior
run-time
performance measures
Black-Box Optimization
Optimization Algorithm: only allowed to evaluate f (direct search)
[diagram: decision vector x → objective function (e.g. simulation model) → objective vector f(x)]
Issues in EMO
[figure: objective space (y1, y2) illustrating convergence and diversity]
How to maintain a diverse Pareto set approximation? → density estimation
How to prevent nondominated solutions from being lost? → environmental selection
How to guide the population towards the Pareto set?
Multi-objective Optimization
A Generic Multiobjective EA
[diagram: population and archive → sample → vary → evaluate → update/truncate → new population and new archive]
Comparison of Three Implementations
SPEA2
VEGA
extended VEGA
2-objective knapsack problem
Trade-off between distance and diversity?
Performance Assessment: Approaches
Theoretically (by analysis): difficult
Limit behavior (unlimited run-time resources)
Running time analysis
Empirically (by simulation): standard
Problems: randomness, multiple objectives
Issues: quality measures, statistical testing, benchmark problems, visualization, …
Which technique is suited for which problem class?
The Need for Quality Measures
[figure: two Pareto set approximations A and B in objective space]
Is A better than B? → Yes (strictly) / No — independent of user preferences
How much? In what aspects? — dependent on user preferences
Ideal: quality measures allow making both types of statements
Overview
introduction
limit behavior
run-time
performance measures
Analysis: Main Aspects
Evolutionary algorithms are random search heuristics.
[figure: probability that the optimum is found vs. computation time (number of iterations), for Algorithm A applied to Problem B]
Qualitative: limit behavior for t → ∞
Quantitative: expected running time E(T)
Archiving
[diagram: optimization (generate; finite memory) passes solutions to archiving (update, truncate; finite archive A); every solution is passed at least once]
Requirements:
1. Convergence to Pareto front
2. Bounded size of archive
3. Diversity among the stored solutions
Problem: Deterioration
[figure: bounded archive (size 3) at times t and t+1; a new solution is accepted while another is discarded]
A new solution accepted at t+1 is dominated by a solution found previously (and “lost” during the selection process).
Problem: Deterioration
Goal: maintain a “good” front (distance + diversity)
But: most archiving strategies may forget Pareto-optimal solutions…
[figure: deterioration observed for NSGA-II and SPEA]
Limit Behavior: Related Work
Requirements for archive:
1. Convergence
2. Diversity
3. Bounded Size
“store all” [Rudolph 98,00], [Veldhuizen 99]: convergence and diversity, but unbounded size (impractical)
“store one” [Rudolph 98,00], [Hanne 99]: convergence with bounded size, but no diversity (not nice)
in this work: all three requirements
Solution Concept: Epsilon Dominance
Definition 1: ε-Dominance
A ε-dominates B iff (1+ε)·fᵢ(A) ≥ fᵢ(B) for all objectives i
Definition 2: ε-Pareto set
A subset of the Pareto-optimal set which ε-dominates all Pareto-optimal solutions
[figure: dominated and ε-dominated regions in objective space; Pareto set vs. ε-Pareto set]
(known since 1987)
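Definition 1 translates directly into a small predicate. A minimal sketch, assuming maximization and the function name `eps_dominates` being my own choice:

```python
def eps_dominates(fa, fb, eps):
    """True iff objective vector fa epsilon-dominates fb (maximization):
    (1 + eps) * fa[i] >= fb[i] must hold in every objective i."""
    return all((1 + eps) * a >= b for a, b in zip(fa, fb))
```

With eps = 0 this reduces to ordinary weak dominance.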
Keeping Convergence and Diversity
Goal: maintain an ε-Pareto set
Idea: ε-grid, i.e. maintain a set of non-dominated boxes (one solution per box)
Algorithm (ε-update): accept a new solution f iff
the corresponding box is not dominated by any box represented in the archive A
AND
any other archive member in the same box is dominated by the new solution
[figure: ε-grid in objective space (y1, y2) with box boundaries at 1, (1+ε), (1+ε)², (1+ε)³]
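The ε-update rule above can be sketched in a few lines. This is a hypothetical reading of the slide, assuming maximization and objective values ≥ 1 (so the logarithmic grid is well defined); the helper names `box`, `dominates`, and `eps_update` are mine:

```python
import math

def box(f, eps):
    # Map an objective vector (all values >= 1) to its epsilon-grid box index.
    return tuple(math.floor(math.log(v) / math.log(1 + eps)) for v in f)

def dominates(a, b):
    # Pareto dominance for maximization: no worse everywhere, better somewhere.
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def eps_update(archive, f, eps):
    """One step of the epsilon-update: accept f only if its box is not
    dominated, and within a box keep only a dominating newcomer."""
    bf = box(f, eps)
    boxes = {g: box(g, eps) for g in archive}
    if any(dominates(bg, bf) for bg in boxes.values()):
        return archive                      # box dominated: reject f
    same = [g for g in archive if boxes[g] == bf]
    if same:
        if all(dominates(f, g) for g in same):
            return [g for g in archive if boxes[g] != bf] + [f]
        return archive                      # box occupant not dominated: reject f
    # new non-dominated box: accept f, drop members whose boxes it dominates
    return [g for g in archive if not dominates(bf, boxes[g])] + [f]
```

Since at most one solution per box survives and dominated boxes are discarded, the archive stays an ε-Pareto set of everything seen so far, at bounded size.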
Correctness of Archiving Method
Theorem:
Let F = (f 1, f 2, f 3, …) be an infinite sequence of objective vectors passed one by one to the ε-update algorithm, and Ft the union of the first t objective vectors of F.
Then for any t > 0, the following holds:
• the archive A at time t contains an ε-Pareto set of Ft
• the size of the archive A at time t is bounded by (log K / log(1+ε))^(m−1)
(K = maximum objective value, m = number of objectives)
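The size bound from the theorem is easy to evaluate numerically. A small sketch (the function name is my own; the base of the logarithm is irrelevant since only the ratio matters):

```python
import math

def archive_bound(K, m, eps):
    """Theorem's bound on the archive size: (log K / log(1+eps)) ** (m-1),
    with K the maximum objective value and m the number of objectives."""
    return (math.log(K) / math.log(1 + eps)) ** (m - 1)
```

Note the bound is independent of t and grows only polylogarithmically in K, which is what makes the ε-archive practical.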
Correctness of Archiving Method
Sketch of Proof:
3 possible failures for At not being an ε-Pareto set of Ft (indirect proof):
• at time k ≤ t a necessary solution was missed
• at time k ≤ t a necessary solution was expelled
• At contains an f ∉ Pareto set of Ft
Bounded size: the number of boxes in objective space is (log K / log(1+ε))^m, and at most one solution per box is accepted; partitioning the boxes into (log K / log(1+ε))^(m−1) chains, each totally ordered by dominance, shows the archive holds at most one box per chain.
Simulation Example
[figure: archiving method of Rudolph and Agapie (2000) vs. the ε-archive]
Overview
introduction
limit behavior
run-time
performance measures
Running Time Analysis: Related Work
Single-objective EAs, discrete search spaces — expected RT (bounds), RT with high probability (bounds):
[Mühlenbein 92], [Rudolph 97], [Droste, Jansen, Wegener 98,02], [Garnier, Kallel, Schoenauer 99,00], [He, Yao 01,02]
Single-objective EAs, continuous search spaces — asymptotic convergence rates, exact convergence rates:
[Beyer 95,96,…], [Rudolph 97], [Jagerskupper 03]
Multiobjective EAs: [none]
Methodology
Typical “ingredients” of a running time analysis:
Simple algorithms ⇒ SEMO, FEMO, GEMO (“simple”, “fair”, “greedy”)
Simple problems ⇒ mLOTZ, mCOCZ (m-objective pseudo-Boolean problems)
Analytical methods & tools ⇒ general upper bound technique & graph search process
1. Rigorous results for specific algorithm(s) on specific problem(s)
2. General tools & techniques
3. General insights (e.g., is a population beneficial at all?)
Three Simple Multiobjective EAs
Loop: select individual from population → flip randomly chosen bit → insert into population if not dominated → remove dominated from population
Variant 1: SEMO — each individual in the population is selected with the same probability (uniform selection)
Variant 2: FEMO — select individual with minimum number of mutation trials (fair selection)
Variant 3: GEMO — priority to convergence if there is progress (greedy selection)
SEMO
[diagram: population P → uniform selection of x → single-point mutation to x′ → include x′ if not dominated → remove dominated and duplicated members]
Example Algorithm: SEMO
Simple Evolutionary Multiobjective Optimizer
1. Start with a random solution
2. Choose parent randomly (uniform)
3. Produce child by mutating parent
4. If child is not dominated then
• add to population
• discard all dominated
5. Goto 2
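The steps above can be sketched directly in Python. A minimal, non-authoritative sketch for bitstring problems under maximization (the dict-based population and function names are my own choices):

```python
import random

def dominates(fa, fb):
    # Pareto dominance for maximization.
    return all(a >= b for a, b in zip(fa, fb)) and any(a > b for a, b in zip(fa, fb))

def semo(f, n, steps, seed=0):
    """SEMO sketch: uniform parent selection, single-bit mutation,
    population holds only mutually non-dominated solutions."""
    rng = random.Random(seed)
    x = tuple(rng.randint(0, 1) for _ in range(n))
    pop = {x: f(x)}                                   # solution -> objectives
    for _ in range(steps):
        parent = rng.choice(list(pop))                # 2. uniform selection
        i = rng.randrange(n)                          # 3. flip one random bit
        child = parent[:i] + (1 - parent[i],) + parent[i + 1:]
        fc = f(child)
        if any(dominates(fv, fc) for fv in pop.values()):
            continue                                  # 4. child dominated: discard
        pop = {y: fy for y, fy in pop.items() if not dominates(fc, fy)}
        pop[child] = fc                               # add child, drop dominated
    return pop
```

Keying the population by genotype also removes duplicates, matching the “remove dominated and duplicated” rule of the diagram.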
Example Problem: LOTZ (Leading Ones, Trailing Zeroes)
Given: bitstring x ∈ {0,1}ⁿ
Objectives: maximize # of “leading ones”; maximize # of “trailing zeroes”
LOTZ: {0,1}ⁿ → ℕ²
LOTZ(x) = ( ∑ᵢ₌₁ⁿ ∏ⱼ₌₁ⁱ xⱼ , ∑ᵢ₌₁ⁿ ∏ⱼ₌ᵢⁿ (1 − xⱼ) )
Example: x = 1 1 1 0 1 1 0 0 ⇒ LOTZ(x) = (3, 2)
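Equivalently, as code — a counting loop rather than the sum-of-products formula, but with the same result:

```python
def lotz(x):
    """LOTZ: (number of leading ones, number of trailing zeroes) of a bitstring."""
    n = len(x)
    leading = 0
    while leading < n and x[leading] == 1:
        leading += 1
    trailing = 0
    while trailing < n and x[n - 1 - trailing] == 0:
        trailing += 1
    return (leading, trailing)
```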
Run-Time Analysis: Scenario
Problem: leading ones, trailing zeros (LOTZ)
Variation: single-point mutation — flip one randomly chosen bit per individual
[figure: example parent bitstring 1 1 0 1 0 0 0 with a single bit flipped]
The Good News
SEMO behaves like a single-objective EA until the Pareto set has been reached…
[figure: objective space (leading 1s vs. trailing 0s)]
Phase 1: only one solution stored in the archive
Phase 2: only Pareto-optimal solutions stored in the archive
SEMO: Sketch of the Analysis I
Phase 1: until the first Pareto-optimal solution has been found
(i = number of ‘incorrect’ bits, e.g. x = 1 1 0 1 1 0 0 with leading 1s and trailing 0s)
i → i−1: probability of a successful mutation ≥ 1/n, so the expected number of mutations is ≤ n
i = n → i = 0: at most n−1 steps (i = 1 not possible)
expected overall number of mutations = O(n²)
SEMO: Sketch of the Analysis II
Phase 2: from the first to all Pareto-optimal solutions
(j = number of Pareto-optimal solutions in the population)
j → j+1: probability of choosing an outer solution ≥ 1/j, ≤ 2/j
probability of a successful mutation ≥ 1/n, ≤ 2/n
expected number Tⱼ of trials (mutations): ≥ nj/4, ≤ nj
SEMO on LOTZ
Can we do even better?
Our problem is the exploration of the Pareto front.
Uniform sampling is unfair, as it samples early-found Pareto points more frequently.
FEMO on LOTZ
Sketch of Proof
[figure: chain of Pareto-optimal solutions 000000 → 100000 → 110000 → 111000 → 111100 → 111110 → 111111, each reached from its predecessor by one successful mutation]
n individuals must be produced; each successful mutation has probability p. For each individual, the probability that its parent did not generate it within (c/p)·log n trials is at most n⁻ᶜ, so the probability that any individual needs more than (c/p)·log n trials is bounded by n^(1−c).
FEMO on LOTZ
Single-objective (1+1) EA with multistart strategy (ε-constraint method) has running time Θ(n³).
EMO algorithm with fair sampling has running time Θ(n² log n).
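FEMO differs from SEMO only in the parent choice. A hedged sketch of fair selection (helper names are mine; the trial counters follow the slide's description):

```python
import random

def dominates(fa, fb):
    # Pareto dominance for maximization.
    return all(a >= b for a, b in zip(fa, fb)) and any(a > b for a, b in zip(fa, fb))

def femo(f, n, steps, seed=0):
    """FEMO sketch: parent = population member with the fewest mutation
    trials so far (fair selection); otherwise identical to SEMO."""
    rng = random.Random(seed)
    x = tuple(rng.randint(0, 1) for _ in range(n))
    pop, trials = {x: f(x)}, {x: 0}
    for _ in range(steps):
        parent = min(pop, key=trials.get)             # fair selection
        trials[parent] += 1
        i = rng.randrange(n)
        child = parent[:i] + (1 - parent[i],) + parent[i + 1:]
        fc = f(child)
        if child in pop or any(dominates(fv, fc) for fv in pop.values()):
            continue
        pop = {y: fy for y, fy in pop.items() if not dominates(fc, fy)}
        pop[child] = fc
        trials = {y: trials.get(y, 0) for y in pop}   # counters for survivors
    return pop
```

Because every front member gets the same mutation budget, no Pareto point is oversampled, which is where the Θ(n² log n) bound comes from.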
Generalization: Randomized Graph Search
⇒ Pareto front can be modeled as a graph
⇒ Edges correspond to mutations
⇒ Edge weights are mutation probabilities
How long does it take to explore the whole graph?
How should we select the “parents”?
Randomized Graph Search
Comparison to Multistart of Single-objective Method
[figure: feasible region in objective space (f1, f2) shrinking as the constraint is tightened, shown in three snapshots]
Underlying concept:
Convert all objectives except one into constraints
Adaptively vary the constraints
maximize f1 s.t. fi > ci, i ∈ {2, …, m}
E(T) = Θ(n² · n): Θ(n²) to find one point, times n points
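For a finite candidate set, one ε-constraint subproblem is just a filtered maximization. A toy sketch (the exhaustive search and function name are illustrative only, not the method used in the analysis):

```python
def epsilon_constraint(candidates, f, c):
    """Solve one subproblem: maximize f1 subject to f_i > c_i (i = 2..m),
    by exhaustive search over a finite candidate set."""
    feasible = [x for x in candidates
                if all(fi > ci for fi, ci in zip(f(x)[1:], c))]
    return max(feasible, key=lambda x: f(x)[0]) if feasible else None
```

Sweeping c over a grid yields one Pareto point per call, which is why the multistart needs (time per point) × (number of points).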
Optimizer (GEMO)
1. Start with a random solution
2. Choose parent with smallest counter; increase this counter
3. Mutate parent
4. If child is not dominated then
• if child is not equal to anybody: add to population
• else if child is equal to somebody: if its counter = ∞, set it back to 0
• if child dominates anybody: set all other counters to ∞
• discard all dominated
5. Goto 2
Idea: mutation counter for each individual
I. Focus search effort
II. Broaden search effort
… without knowing where we are!
Running Time Analysis: Comparison
[table: running times of the algorithms on the example problems]
A population-based approach can be more efficient than multistart of a single-membered strategy.
Overview
introduction
limit behavior
run-time
performance measures
Performance Assessment: Approaches
Theoretically (by analysis): difficult
Limit behavior (unlimited run-time resources)
Running time analysis
Empirically (by simulation): standard
Problems: randomness, multiple objectives
Issues: quality measures, statistical testing, benchmark problems, visualization, …
Which technique is suited for which problem class?
The Need for Quality Measures
[figure: two Pareto set approximations A and B in objective space]
Is A better than B? → Yes (strictly) / No — independent of user preferences
How much? In what aspects? — dependent on user preferences
Ideal: quality measures allow making both types of statements
Independent of User Preferences
weakly dominates = not worse in all objectives
dominates = better in at least one objective (and the sets are not equal)
strictly dominates = better in all objectives
is incomparable to = neither set weakly better
[figure: example Pareto set approximations A, B, C, D in objective space]
Pareto set approximation (algorithm outcome) = set of incomparable solutions
Ω = set of all Pareto set approximations
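These set relations translate directly into predicates. A sketch for maximization (names are mine; "better" is the set relation the later compatibility/completeness slides rely on):

```python
def weakly_dom(a, b):
    # Solution a weakly dominates solution b: not worse in any objective.
    return all(x >= y for x, y in zip(a, b))

def set_weakly_dominates(A, B):
    # Every member of B is weakly dominated by some member of A.
    return all(any(weakly_dom(a, b) for a in A) for b in B)

def set_better(A, B):
    # A is better than B: A weakly dominates B but not the other way round.
    return set_weakly_dominates(A, B) and not set_weakly_dominates(B, A)

def set_incomparable(A, B):
    # Neither set weakly dominates the other.
    return not set_weakly_dominates(A, B) and not set_weakly_dominates(B, A)
```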
Dependent on User Preferences
Goal: quality measures compare two Pareto set approximations A and B.
             A        B
hypervolume  432.34   420.13
distance     0.3308   0.4532
diversity    0.3637   0.3463
spread       0.3622   0.3601
cardinality  6        5
application of quality measures (here: unary) → comparison and interpretation of quality values → “A better”
Quality Measures: Examples
Unary: hypervolume measure S(A), e.g. S(A) = 60%
Binary: coverage measure C(A,B), e.g. C(A,B) = 25%, C(B,A) = 75%
[figures: objective space (performance vs. cheapness) showing the hypervolume S(A) of set A, and the mutual coverage of sets A and B]
[Zitzler, Thiele: 1999]
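The coverage measure is essentially a one-liner. A sketch for maximization (my helper names):

```python
def weakly_dom(a, b):
    # Not worse in any objective (maximization).
    return all(x >= y for x, y in zip(a, b))

def coverage(A, B):
    """C(A,B): fraction of B weakly dominated by at least one member of A."""
    return sum(any(weakly_dom(a, b) for a in A) for b in B) / len(B)
```

Note C is not symmetric: C(A,B) and C(B,A) must both be reported, as in the slide's example.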
Measures
Status:
Visual comparisons were common until recently
Numerous quality measures have been proposed since the mid-1990s: [Schott: 1995], [Zitzler, Thiele: 1998], [Hansen, Jaszkiewicz: 1998], [Zitzler: 1999], [Van Veldhuizen, Lamont: 2000], [Knowles, Corne, Oates: 2000], [Deb et al.: 2000], [Sayin: 2000], [Tan, Lee, Khor: 2001], [Wu, Azarm: 2001], …
Most popular: unary quality measures (diversity + distance)
No common agreement which measures should be used
Open questions:
What kind of statements do quality measures allow?
Can quality measures detect whether or that a Pareto set approximation is better than another?
Notations
Def: quality indicator I: Ωᵐ → ℝ
Def: interpretation function E: ℝᵏ × ℝᵏ → {false, true}
Def: comparison method based on I = (I₁, I₂, …, Iₖ) and E:
C: Ω × Ω → {false, true}, where C(A,B) = E(I(A), I(B))
[diagram: A, B → quality indicators → ℝᵏ × ℝᵏ → interpretation function → {false, true}]
Relations
Compatibility of a comparison method C:
C yields true ⇒ A is (weakly, strictly) better than B
(C detects that A is (weakly, strictly) better than B)
Completeness of a comparison method C:
A is (weakly, strictly) better than B ⇒ C yields true
Ideal: compatibility and completeness, i.e.,
A is (weakly, strictly) better than B ⇔ C yields true
(C detects whether A is (weakly, strictly) better than B)
Limitations of Unary Quality Measures
Theorem:
There exists no unary quality measure that is able to detect whether A is better than B.
This statement even holds if we consider a finite combination of unary quality measures.
There exists no combination of unary measures applicable to any problem.
Theorem 2: Sketch of Proof
Any subset A ⊆ S is a Pareto set approximation
Lemma: any A′, A′′ ⊆ S must be mapped to a different point in ℝᵏ
Hence the mapping from 2ˢ to ℝᵏ must be an injection, but
the cardinality of 2ˢ is greater than that of ℝᵏ
Power of Unary Quality Indicators
[figure: which relations between sets — dominates, weakly dominates, doesn’t dominate, doesn’t weakly dominate — unary indicators can detect]
Quality Measures: Results
There is no combination of unary quality measures such that “S is better than T in all measures” is equivalent to “S dominates T”.
Basic question: can we say, on the basis of the quality measures, whether or that an algorithm outperforms another?
             S        T
hypervolume  432.34   420.13
distance     0.3308   0.4532
diversity    0.3637   0.3463
spread       0.3622   0.3601
cardinality  6        5
Unary quality measures usually do not tell that S dominates T; at best that S does not dominate T.
[Zitzler et al.: 2002]
A New Measure: ε-Quality Measure
Two solutions:
E(a,b) = max over 1 ≤ i ≤ n of the minimal ε such that ε · fᵢ(a) ≥ fᵢ(b)
[figure: two solutions a, b with E(a,b) = 2 and E(b,a) = ½]
Two approximations:
E(A,B) = max_{b ∈ B} min_{a ∈ A} E(a,b)
[figure: two approximations A, B with E(A,B) = 2 and E(B,A) = ½]
Advantages: allows all kinds of statements (complete and compatible)
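The ε-quality measure is cheap to compute. A sketch of the multiplicative form for maximization with strictly positive objective values (function names are mine):

```python
def eps_ind(fa, fb):
    # Smallest eps with eps * fa[i] >= fb[i] in every objective.
    return max(b / a for a, b in zip(fa, fb))

def eps_indicator(A, B):
    """Binary epsilon-quality measure:
    E(A,B) = max over b in B of min over a in A of eps_ind(a, b)."""
    return max(min(eps_ind(a, b) for a in A) for b in B)
```

E(A,B) ≤ 1 holds exactly when A weakly dominates B, which is what makes the resulting comparison method both complete and compatible.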
Selected Contributions
Algorithms:
° Improved techniques [Zitzler, Thiele: IEEE TEC 1999], [Zitzler, Teich, Bhattacharyya: CEC 2000], [Zitzler, Laumanns, Thiele: EUROGEN 2001]
° Unified model [Laumanns, Zitzler, Thiele: CEC 2000], [Laumanns, Zitzler, Thiele: EMO 2001]
° Test problems [Zitzler, Thiele: PPSN 1998, IEEE TEC 1999], [Deb, Thiele, Laumanns, Zitzler: GECCO 2002]
Theory:
° Convergence/diversity [Laumanns, Thiele, Deb, Zitzler: GECCO 2002], [Laumanns, Thiele, Zitzler, Welzl, Deb: PPSN-…]
How to apply (evolutionary) optimization algorithms to large-scale multiobjective optimization problems?