
Constrained optimization: indirect methods

Jussi Hakanen
Post-doctoral researcher, [email protected]

spring 2014, TIES483 Nonlinear optimization

On constrained optimization

We have seen how to characterize optimal solutions in constrained optimization:
– KKT optimality conditions include the balance of forces (−∇f(x*) expressed with the gradients ∇gᵢ(x*), i ∈ I, and ∇hⱼ(x*)) and the complementarity conditions (μᵢgᵢ(x*) = 0 ∀i)
– Regularity of x* needs to be assumed

Now, we are interested in how to find such solutions.

Methods for constrained optimization

Many methods utilize knowledge about the constraints:
– Linear inequalities or linear equalities
– Nonlinear inequalities or equalities

For example, if a linear constraint is active at some point, you know that by taking steps along the direction of the constraint it remains active.

For nonlinear constraints, you don't have such a direction.

Methods for constrained optimization can be characterized based on how they treat the constraints.

Classification of the methods

Indirect methods: the constrained problem is converted into a sequence of unconstrained problems whose solutions approach the solution of the constrained problem; the intermediate solutions need not be feasible.

Direct methods: the constraints are taken into account explicitly; intermediate solutions are feasible.

Transforming the optimization problem

Constraints of the problem can be transformed if needed:

gᵢ(x) ≤ 0 ⟺ gᵢ(x) + yᵢ² = 0, where yᵢ is a slack variable; the constraint is active if yᵢ = 0 (see the sketch below)
– By adding yᵢ², there is no need to add yᵢ ≥ 0
– If gᵢ(x) is linear, then linearity is preserved by gᵢ(x) + yᵢ = 0, yᵢ ≥ 0

gᵢ(x) ≥ 0 ⟺ −gᵢ(x) ≤ 0

hᵢ(x) = 0 ⟺ hᵢ(x) ≤ 0 and −hᵢ(x) ≤ 0
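To make the slack-variable transformation concrete, here is a minimal Python sketch (the helper name as_equality is ours, not from the lecture):

```python
# Minimal sketch of the slack-variable transformation: the inequality
# g(x) <= 0 becomes the equality h(x, y) = g(x) + y**2 = 0, with no
# sign restriction needed on the new variable y.
def as_equality(g):
    """Given g with g(x) <= 0 intended, return h(x, y) = g(x) + y**2.
    h(x, y) = 0 forces g(x) = -y**2 <= 0; the constraint is active
    (g(x) = 0) exactly when y = 0."""
    return lambda x, y: g(x) + y ** 2

# Example: g(x) = x - 3 <= 0 becomes x - 3 + y**2 = 0.
h = as_equality(lambda x: x - 3.0)
print(h(3.0, 0.0))  # 0.0: the active constraint corresponds to y = 0
```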

Examples of indirect methods

Penalty function methods

Lagrangian methods


Penalty function methods

Include the constraints in the objective function with the help of penalty functions that penalize constraint violations, or even penalize approaching the boundary of S.

Different types:
– Penalty function: penalizes constraint violations
– Barrier function: prevents leaving the feasible region
– Exact penalty function

The resulting unconstrained problems can be solved by using the methods presented earlier in the course.

Penalty function methods

Generate a sequence of points that approach the feasible region from outside.

The constrained problem is converted into

min_{x∈Rⁿ} f(x) + r α(x),

where α(x) is a penalty function and r is a penalty parameter.

Requirements: α(x) ≥ 0 ∀x ∈ Rⁿ, and α(x) = 0 if and only if x ∈ S.

On convergence

When r → ∞, the solutions x_r of the penalty function problems converge to a constrained minimizer (x_r → x* and r α(x_r) → 0):
– All the functions should be continuous
– For each r, a solution of the penalty function problem should exist, and {x_r} should belong to a compact subset of Rⁿ

Examples of penalty functions

Can you give an example of a penalty function α(x)?

For equality constraints
– hᵢ(x) = 0 ⟶ α(x) = ∑_{i=1}^{l} hᵢ(x)² or α(x) = ∑_{i=1}^{l} |hᵢ(x)|^p, p ≥ 2

For inequality constraints
– gᵢ(x) ≤ 0 ⟶ α(x) = ∑_{i=1}^{m} max[0, gᵢ(x)] or α(x) = ∑_{i=1}^{m} max[0, gᵢ(x)]^p, p ≥ 2
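As a concrete illustration, a minimal Python sketch of the quadratic (p = 2) penalty above, assuming the constraints are given as lists of callables (function and parameter names are ours):

```python
# Quadratic penalty (p = 2) for h_i(x) = 0 and g_i(x) <= 0, following
# the formulas above: alpha(x) = sum_i h_i(x)**2 + sum_i max(0, g_i(x))**2.
def quadratic_penalty(x, eqs=(), ineqs=()):
    eq_part = sum(h(x) ** 2 for h in eqs)
    ineq_part = sum(max(0.0, g(x)) ** 2 for g in ineqs)
    return eq_part + ineq_part

# alpha(x) = 0 exactly on the feasible set, alpha(x) > 0 outside it.
print(quadratic_penalty(1.0, ineqs=[lambda x: -x + 2.0]))  # (2-1)^2 = 1.0
print(quadratic_penalty(3.0, ineqs=[lambda x: -x + 2.0]))  # 0.0
```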

How to choose r?

Should be large enough for the solutions to be close enough to the feasible region.

If r is too large, there can be numerical problems in solving the penalty problems.

For large values of r the emphasis is on finding feasible solutions and, thus, the solution can be feasible but far from the optimum.

Typically r is updated iteratively.

Different parameters can be used for different constraints (e.g. gᵢ ⟶ rᵢ, gⱼ ⟶ rⱼ)
– For the sake of simplicity, the same parameter is used here for all the constraints

Algorithm

1) Choose the final tolerance ϵ > 0 and a starting point x¹. Choose r¹ > 0 (not too large) and set h = 1.

2) Solve min_{x∈Rⁿ} f(x) + rʰ α(x) with some method for unconstrained problems (xʰ as a starting point). Let the solution be xʰ⁺¹ = x(rʰ).

3) Test optimality: if rʰ α(xʰ⁺¹) < ϵ, stop; the solution xʰ⁺¹ is close enough to the optimum. Otherwise, set rʰ⁺¹ > rʰ (e.g. rʰ⁺¹ = κrʰ, where κ can be initialized to be e.g. 10). Set h = h + 1 and go to 2).
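A minimal Python sketch of this loop, using scipy.optimize.minimize as the unconstrained solver in step 2 (the function name and default parameter values are ours):

```python
import numpy as np
from scipy.optimize import minimize

def penalty_method(f, alpha, x1, r1=1.0, kappa=10.0, eps=1e-6, max_iter=30):
    """Steps 1-3: solve a sequence of unconstrained problems
    min f(x) + r_h * alpha(x), multiplying r_h by kappa until
    r_h * alpha(x_{h+1}) < eps."""
    x, r = np.atleast_1d(np.asarray(x1, dtype=float)), r1
    for _ in range(max_iter):
        x = minimize(lambda z: f(z) + r * alpha(z), x).x   # step 2
        if r * alpha(x) < eps:                             # step 3
            return x
        r *= kappa
    return x

# Example from the next slide: min x s.t. -x + 2 <= 0.
f = lambda x: x[0]
alpha = lambda x: max(0.0, -x[0] + 2.0) ** 2
print(penalty_method(f, alpha, x1=[0.0]))  # x_r = 2 - 1/(2r) -> 2
```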

Example

min x s.t. −x + 2 ≤ 0

Let α(x) = (max[0, −x + 2])²

Then α(x) = 0 if x ≥ 2, and α(x) = (−x + 2)² if x < 2

The minimum of f + rα is at x = 2 − 1/(2r): for x < 2, setting the derivative 1 − 2r(2 − x) to zero gives 2 − x = 1/(2r), so x_r → 2 as r → ∞.

From Miettinen: Nonlinear optimization, 2007 (in Finnish)

Barrier function method

Prevents leaving the feasible region.

Suitable only for problems with inequality constraints
– The set {x | gᵢ(x) < 0 ∀i} should not be empty

The problem to be solved is

min_r Θ(r) s.t. r ≥ 0,

where Θ(r) = inf_x { f(x) + r β(x) | gᵢ(x) < 0 ∀i }

β is a barrier function: β(x) ≥ 0 when gᵢ(x) < 0 ∀i, and β(x) → ∞ when x approaches the boundary of S.

The constraints gᵢ(x) < 0 can be omitted since β → ∞ on the boundary of S.

On convergence

Denote Θ(r) = f(x_r) + r β(x_r)

Under some assumptions, the solutions x_r of the barrier problems converge to a constrained minimizer (x_r → x* and r β(x_r) → 0) when r → 0⁺:
– All functions should be continuous
– {x | gᵢ(x) < 0 ∀i} ≠ ∅

Properties of barrier functions

Nonnegative and continuous in {x | gᵢ(x) < 0 ∀i}

Approaches ∞ when the boundary of the feasible region is approached from the inside

Ideally: β = 0 in {x | gᵢ(x) < 0 ∀i} and β = ∞ on the boundary
– Guarantees staying in the feasible region
– This kind of discontinuity causes problems for any numerical method

Examples of barrier functions (see the sketch below)
– β(x) = −∑_{i=1}^{m} 1/gᵢ(x)
– β(x) = −∑_{i=1}^{m} ln(min[1, −gᵢ(x)])
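A minimal Python sketch of these two barrier functions, with the constraints gᵢ(x) < 0 given as a list of callables (names are ours):

```python
import math

# Inverse barrier: beta(x) = -sum_i 1/g_i(x). Nonnegative on the strict
# interior (each g_i(x) < 0) and -> infinity as any g_i(x) -> 0 from below.
def inverse_barrier(x, gs):
    return -sum(1.0 / g(x) for g in gs)

# Logarithmic barrier variant from the slide:
# beta(x) = -sum_i ln(min(1, -g_i(x))); zero where all -g_i(x) >= 1.
def log_barrier(x, gs):
    return -sum(math.log(min(1.0, -g(x))) for g in gs)

gs = [lambda x: -x + 1.0]          # feasible interior: x > 1
print(inverse_barrier(1.1, gs))    # 10.0: large near the boundary x = 1
print(log_barrier(2.5, gs))        # 0.0: well inside the region
```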

Algorithm

1) Choose the final tolerance ϵ > 0 and a starting point x¹ s.t. gᵢ(x¹) < 0 ∀i. Choose r¹ > 0, not too small (and a parameter 0 < τ < 1 for reducing r). Set h = 1.

2) Solve min_x f(x) + rʰ β(x) s.t. gᵢ(x) < 0 ∀i by using the starting point xʰ. Let the solution be xʰ⁺¹.

3) Test optimality: if rʰ β(xʰ⁺¹) < ϵ, stop; the solution xʰ⁺¹ is close enough to the optimum. Otherwise, set rʰ⁺¹ < rʰ (e.g. rʰ⁺¹ = τrʰ). Set h = h + 1 and go to 2).
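A minimal Python sketch of the loop; strict feasibility is enforced here by returning +inf outside the region, a simple device that works with a derivative-free inner solver such as Nelder-Mead (names and defaults are ours):

```python
import numpy as np
from scipy.optimize import minimize

def barrier_method(f, beta, gs, x1, r1=1.0, tau=0.1, eps=1e-6, max_iter=30):
    """Steps 1-3: starting strictly feasible, minimize f + r_h * beta,
    shrinking r_h by tau until r_h * beta(x_{h+1}) < eps."""
    def penalized(z, r):
        if any(g(z) >= 0.0 for g in gs):   # fence off the boundary/outside
            return np.inf
        return f(z) + r * beta(z)
    x, r = np.atleast_1d(np.asarray(x1, dtype=float)), r1
    for _ in range(max_iter):
        x = minimize(lambda z: penalized(z, r), x, method="Nelder-Mead").x
        if r * beta(x) < eps:
            return x
        r *= tau
    return x

# Example from the next slide: min x s.t. -x + 1 <= 0, beta(x) = 1/(x - 1).
gs = [lambda z: -z[0] + 1.0]
beta = lambda z: 1.0 / (z[0] - 1.0)
print(barrier_method(lambda z: z[0], beta, gs, x1=[3.0]))  # x_r = 1 + sqrt(r)
```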

Example

min x s.t. −x + 1 ≤ 0

Let β(x) = −1/(−x + 1) = 1/(x − 1) when x ≠ 1

The minimum of f(x) + rβ(x) = x + r(x − 1)⁻¹ is at x = 1 + √r: setting the derivative 1 − r(x − 1)⁻² to zero gives (x − 1)² = r, so x_r → 1 as r → 0⁺.

From Miettinen: Nonlinear optimization, 2007 (in Finnish)

Summary: penalty and barrier function methods

Penalty and barrier functions are usually differentiable.

The minimum is obtained in a limit:
– Penalty function: rʰ → ∞
– Barrier function: rʰ → 0

Choosing the sequence rʰ is essential for convergence:
– If rʰ → ∞ or rʰ → 0 too slowly, a large number of unconstrained problems need to be solved
– If rʰ → ∞ or rʰ → 0 too fast, the solutions of successive unconstrained problems are far from each other and the solution time increases

Exact penalty function

The idea is to have a method where the solution can be found with a small number of iterations.

Suitable for both equality and inequality constraints.

An exact penalty function problem is e.g. of the form

min_{x∈Rⁿ} f(x) + r( ∑_{i=1}^{m} max[0, gᵢ(x)] + ∑_{i=1}^{l} |hᵢ(x)| )
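The penalty term above, as a minimal Python sketch (constraint lists as callables; names are ours):

```python
# l1 exact penalty term: sum_i max(0, g_i(x)) + sum_i |h_i(x)|.
# Piecewise linear in the constraint values, hence not differentiable
# where g_i(x) = 0 or h_i(x) = 0.
def exact_penalty(x, ineqs=(), eqs=()):
    return (sum(max(0.0, g(x)) for g in ineqs)
            + sum(abs(h(x)) for h in eqs))

# Exact penalty problem objective for a given r:
def exact_objective(f, r, ineqs=(), eqs=()):
    return lambda x: f(x) + r * exact_penalty(x, ineqs, eqs)
```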

Exact penalty function method

Theorem: Consider a point x where the necessary KKT conditions hold, and let the corresponding Lagrange multipliers be μ and ν. Assume that the objective and inequality constraint functions are convex and the equality constraint functions are affine. Then x is a solution of the exact penalty function problem with r ≥ max[μᵢ, i = 1, …, m, |νᵢ|, i = 1, …, l].

The solution can be obtained with a finite value of the penalty parameter r.

The algorithm is similar to the penalty function method except that rʰ is increased only if necessary
– E.g. when the feasible region is not approached fast enough

Properties of the exact penalty function

Not differentiable at points x where gᵢ(x) = 0 or hᵢ(x) = 0
– Gradient-based methods are not suitable

If r and the starting point could be chosen appropriately, only one minimization would in principle be required
– If r is too large and the starting point is not close enough to the optimum, minimizing the exact penalty function problem can become difficult

Example

min f(x) = x₁² + x₂² s.t. x₁ + x₂ − 1 = 0

The optimal solution is x* = (1/2, 1/2)ᵀ, ν* = −2x₁* = −2x₂* = −1

Exact penalty function problem: min_{x∈Rⁿ} x₁² + x₂² + r|x₁ + x₂ − 1|

Solution: x* = (r/2, r/2)ᵀ when 0 ≤ r < 1, and x* = (1/2, 1/2)ᵀ when r ≥ 1
– (obtained by using the KKT conditions of an equivalent differentiable problem where the absolute value term is replaced with a new variable and two inequality constraints)

Thus, the solution can be found with r ≥ 1 (= |ν*|)
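A quick numerical check of this example (a sketch; Nelder-Mead is used because the objective is nondifferentiable on the line x₁ + x₂ = 1):

```python
from scipy.optimize import minimize

# min x1^2 + x2^2 + r*|x1 + x2 - 1| with r = 1 (= |nu*|, the threshold).
f = lambda x: x[0]**2 + x[1]**2 + 1.0 * abs(x[0] + x[1] - 1.0)
res = minimize(f, [0.0, 0.0], method="Nelder-Mead", options={"xatol": 1e-8})
print(res.x)  # ~ (0.5, 0.5), the constrained optimum
```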

Example: barrier function

min f(x) = x₁x₂² s.t. x₁² + x₂² − 2 ≤ 0

x* = (−0.8165, −1.1547)ᵀ; the constraint is active at x*

[Figure: (a) level curves of f(x) and the boundary of S; logarithmic barrier function with (b) r = 0.2, (c) r = 0.001]

From Miettinen: Nonlinear optimization, 2007 (in Finnish)

Example: penalty function

min f(x) = x₁x₂² s.t. x₁² + x₂² − 2 ≤ 0

x* = (−0.8165, −1.1547)ᵀ; the constraint is active at x*

[Figure: quadratic penalty function with (a) r = 1, (b) r = 100]

From Miettinen: Nonlinear optimization, 2007 (in Finnish)

Example: exact penalty function

min f(x) = x₁x₂² s.t. x₁² + x₂² − 2 ≤ 0

x* = (−0.8165, −1.1547)ᵀ; the constraint is active at x*, μ* = 0.8165

[Figure: exact penalty function with (a) r = 1.2, (b) r = 5, (c) r = 100]

From Miettinen: Nonlinear optimization, 2007 (in Finnish)

Lagrangian function

Consider the problem

min f(x) s.t. hᵢ(x) = 0, i = 1, …, l

Lagrangian function:

L(x, ν) = f(x) + ∑_{i=1}^{l} νᵢhᵢ(x)

KKT conditions:

∇f(x) + ∑_{i=1}^{l} νᵢ∇hᵢ(x) = 0
hᵢ(x) = 0, i = 1, …, l

Let x* be a minimizer and ν* the corresponding Lagrange multiplier vector.

Properties of the Lagrangian

KKT conditions: x* is a critical point of the Lagrangian function
– x* is not necessarily a minimizer of L(x, ν*)

Thus, minimizing the Lagrangian function doesn't necessarily give a minimum of f(x)
– The Hessian ∇²ₓₓL(x*, ν*) may be indefinite → x* may be a saddle point

Improve the Lagrangian function!

Augmented Lagrangian function

Augmented Lagrangian function:

L_A(x, ν, ϱ) = f(x) + ∑_{i=1}^{l} νᵢhᵢ(x) + (ϱ/2) ∑_{i=1}^{l} hᵢ(x)², ϱ > 0
– Lagrangian function + quadratic penalty function

The point (x*, ν*) is a critical point of the augmented Lagrangian
– ∇ₓL_A(x*, ν*, ϱ) = 0 and (ϱ/2) ∑_{i=1}^{l} hᵢ(x*)² = 0

Hessian: ∇²ₓₓL_A(x*, ν*, ϱ) = ∇²ₓₓL(x*, ν*) + ϱ∇h(x*)ᵀ∇h(x*)

It can be shown that for ϱ greater than some threshold ϱ̄, ∇²ₓₓL_A(x*, ν*, ϱ) is positive definite → x* is a local minimizer of L_A(x, ν*, ϱ)

Need to know ν*

Properties of L_A(x, ν, ϱ)

Differentiable if the original functions are

x* is a minimizer of L_A(x, ν*, ϱ) for a finite ϱ

Lagrangian function + quadratic penalty function

Algorithm

1) Choose the final tolerance ϵ > 0. Choose x¹, νᵢ¹ (i = 1, …, l) and ϱ. Set h = 1.

2) Test optimality: if the optimality conditions are satisfied, stop. The solution is xʰ.

3) Solve (with a suitable method) min_{x∈Rⁿ} L_A(x, νʰ, ϱ) by using xʰ as a starting point. Let the solution be xʰ⁺¹.

4) Update the Lagrange multipliers: e.g. νʰ⁺¹ = νʰ + ϱh(xʰ⁺¹).

5) Increase ϱ if necessary: e.g. if the constraint violation does not decrease enough, ‖h(xʰ)‖ − ‖h(xʰ⁺¹)‖ < ϵ.

6) Set h = h + 1 and go to 2).

Note: xʰ → x* only if νʰ → ν*
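A minimal Python sketch of this algorithm for equality-constrained problems, with ϱ kept fixed and step 5 omitted for brevity (names and defaults are ours):

```python
import numpy as np
from scipy.optimize import minimize

def method_of_multipliers(f, hs, x1, nu1, rho=10.0, eps=1e-8, max_iter=30):
    """Minimize L_A(x, nu, rho) = f(x) + sum_i nu_i h_i(x)
    + (rho/2) sum_i h_i(x)^2, then update nu_i <- nu_i + rho * h_i(x)."""
    x = np.asarray(x1, dtype=float)
    nu = np.asarray(nu1, dtype=float)
    for _ in range(max_iter):
        def L_A(z):
            h = np.array([h_i(z) for h_i in hs])
            return f(z) + nu @ h + 0.5 * rho * (h @ h)
        x = minimize(L_A, x).x                     # step 3
        h = np.array([h_i(x) for h_i in hs])
        if np.linalg.norm(h) < eps:                # step 2: stop test
            return x, nu
        nu = nu + rho * h                          # step 4
    return x, nu

# The earlier example: min x1^2 + x2^2 s.t. x1 + x2 - 1 = 0 (nu* = -1).
f = lambda z: z[0]**2 + z[1]**2
hs = [lambda z: z[0] + z[1] - 1.0]
print(method_of_multipliers(f, hs, [0.0, 0.0], [0.0]))  # ~ (0.5, 0.5), nu ~ -1
```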

Example

min f(x) = x₁x₂² s.t. x₁² + x₂² − 2 ≤ 0

x* = (−0.8165, −1.1547)ᵀ; the constraint is active at x*

Lagrangian function: saddle point at x*

From Miettinen: Nonlinear optimization, 2007 (in Finnish)

Example (cont.)

Augmented Lagrangian function

[Figure: (a) ϱ = 0.075, (b) ϱ = 0.2, (c) ϱ = 100]

From Miettinen: Nonlinear optimization, 2007 (in Finnish)

Example (cont.)

Augmented Lagrangian function

ν* = 0.8165, ϱ = 0.2

[Figure: (a) ν = 0.5, (b) ν = 0.9, (c) ν = 1.0]

From Miettinen: Nonlinear optimization, 2007 (in Finnish)

Topics of the lectures next week

Mon, Feb 10th: Constrained optimization: gradient projection, active set method

Wed, Feb 12th: Constrained optimization: SQP method & Matlab

Study this before the lecture!

Questions to be considered:
– What is the basic idea of gradient projection?
– What is the basic idea of active set methods?
– What is the basic idea of Sequential Quadratic Programming (SQP)?