CS711008Z Algorithm Design and Analysis
Lecture 8. Linear programming: interior point method
Dongbo Bu
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
1 / 32
Outline
Brief history of the interior point method
Basic idea of the interior point method
2 / 32
A brief history of linear programming

In 1949, G. B. Dantzig proposed the simplex algorithm.
In 1971, Klee and Minty gave a counter-example showing that simplex is not a polynomial-time algorithm.
In 1975, L. V. Kantorovich and T. C. Koopmans received the Nobel prize for the application of linear programming in resource allocation.
In 1979, L. G. Khachiyan proposed a polynomial-time ellipsoid method.
In 1984, N. Karmarkar proposed another polynomial-time algorithm, the interior-point method.
In 2001, D. Spielman and S. Teng proposed smoothed complexity to explain the efficiency of the simplex algorithm.
3 / 32
In 1979, L. G. Khachiyan proposed a polynomial-time ellipsoid method for LP

Figure: Leonid G. Khachiyan
4 / 32
In 1984, N. Karmarkar proposed a new polynomial-time algorithm for LP
5 / 32
Basic idea of interior point method
6 / 32
KKT conditions for LP
Let's consider a linear program in slack form and its dual problem, i.e.

Primal problem:
    min  c^T x
    s.t. Ax = b
         x ≥ 0

Dual problem:
    max  b^T y
    s.t. A^T y ≤ c

KKT conditions:
    Primal feasibility: Ax = b, x ≥ 0.
    Dual feasibility: A^T y ≤ c.
    Complementary slackness: x_i = 0 or a_i^T y = c_i for each i = 1, ..., n, where a_i denotes the i-th column of A.
7 / 32
Rewriting the KKT conditions

Let's consider a linear program in slack form and its dual problem, i.e.

Primal problem:
    min  c^T x
    s.t. Ax = b
         x ≥ 0

Dual problem:
    max  b^T y
    s.t. A^T y ≤ c

KKT conditions:
    Primal feasibility: Ax = b, x ≥ 0;
    Dual feasibility: A^T y + σ = c, σ ≥ 0;
    Complementary slackness: x_i σ_i = 0 for i = 1, ..., n.

These conditions consist of m + 2n equations over the m + 2n variables (x, y, σ), plus non-negativity constraints.
8 / 32
Rewriting the KKT conditions further

Let's define diagonal matrices (illustrated here for n = 3):

    X = [ x1  0   0  ]        Σ = [ σ1  0   0  ]
        [ 0   x2  0  ]            [ 0   σ2  0  ]
        [ 0   0   x3 ]            [ 0   0   σ3 ]

and rewrite the complementary slackness condition as:

    XΣe = [ x1σ1 ]   [ 0 ]
          [ x2σ2 ] = [ 0 ]
          [ x3σ3 ]   [ 0 ]

where e = (1, 1, 1)^T.

KKT conditions:
    Primal feasibility: Ax = b, x ≥ 0;
    Dual feasibility: A^T y + σ = c, σ ≥ 0;
    Complementary slackness: XΣe = 0.

We call (x, y, σ) an interior point if x > 0 and σ > 0.
Question: how can we find (x, y, σ) satisfying the KKT conditions?
9 / 32
A simple interior-point method
Basic idea: assume we already know (x, y, σ) that is both primal and dual feasible, i.e.,

    Ax = b, x > 0
    A^T y + σ = c, σ > 0

and try to improve it toward a point (x*, y*, σ*) that makes complementary slackness hold.

Improvement strategy: starting from (x, y, σ), we follow a direction (∆x, ∆y, ∆σ) such that (x + ∆x, y + ∆y, σ + ∆σ) is a better solution, i.e., it comes closer to satisfying complementary slackness.

To find such a direction, we substitute (x + ∆x, y + ∆y, σ + ∆σ) into the KKT conditions:

    Primal feasibility: A(x + ∆x) = b, x + ∆x ≥ 0;
    Dual feasibility: A^T(y + ∆y) + (σ + ∆σ) = c, σ + ∆σ ≥ 0;
    Complementary slackness: (X + ∆X)(Σ + ∆Σ)e = 0.
10 / 32
Since we start from (x, y, σ) with Ax = b, x > 0 and A^T y + σ = c, σ > 0, the above conditions reduce to:

    A∆x = 0, x + ∆x ≥ 0;
    A^T∆y + ∆σ = 0, σ + ∆σ ≥ 0;
    X∆σ + Σ∆x = −XΣe − ∆X∆Σe.

Note that when ∆x and ∆σ are small, the non-linear term −∆X∆Σe is small relative to −XΣe and can thus be dropped, yielding a linear system:

    A∆x = 0, x + ∆x ≥ 0;
    A^T∆y + ∆σ = 0, σ + ∆σ ≥ 0;
    X∆σ + Σ∆x = −XΣe.

The solution is ∆σ = X^{-1}(−XΣe − Σ∆x) = −σ − X^{-1}Σ∆x, where (∆x, ∆y) solves:

    [ −X^{-1}Σ  A^T ] [ ∆x ]   [ σ ]
    [  A         0  ] [ ∆y ] = [ 0 ]
11 / 32
1: Set initial solution x such that Ax = b, x > 0;
2: Set initial solution y, σ such that A^T y + σ = c, σ > 0;
3: while TRUE do
4:     Solve
           [ −X^{-1}Σ  A^T ] [ ∆x ]   [ σ ]
           [  A         0  ] [ ∆y ] = [ 0 ]
5:     Set ∆σ = X^{-1}(−XΣe − Σ∆x) = −σ − X^{-1}Σ∆x;
6:     Set θ_x = min_{j: ∆x_j < 0} x_j / (−∆x_j),  θ_σ = min_{j: ∆σ_j < 0} σ_j / (−∆σ_j);
7:     Set θ = min{1, αθ_x, αθ_σ};
8:     Update x = x + θ∆x, y = y + θ∆y, σ = σ + θ∆σ;
9:     if x_i σ_i ≤ ε for all i = 1, ..., n then
10:        break;
11:    end if
12: end while
13: return (x, y, σ);
12 / 32
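The algorithm above can be sketched in Python with NumPy. This is a minimal illustration, not the lecture's code: the toy LP (min x1 + 2x2 subject to x1 + x2 = 1, x ≥ 0, with optimum x = (1, 0)), the starting point, and the parameters α = 0.9 and ε = 1e-8 are all assumed for the example.

```python
import numpy as np

def affine_interior_point(A, b, c, x, y, sigma, alpha=0.9, eps=1e-8, max_iter=500):
    """Affine interior-point sketch for min c^T x s.t. Ax = b, x >= 0.

    Assumes (x, y, sigma) is strictly feasible:
    Ax = b, x > 0 and A^T y + sigma = c, sigma > 0.
    """
    m, n = A.shape

    def max_step(v, dv):
        # Largest step keeping v + theta*dv >= 0, over coordinates with dv < 0.
        neg = dv < 0
        return np.min(v[neg] / -dv[neg]) if np.any(neg) else np.inf

    for _ in range(max_iter):
        if np.all(x * sigma <= eps):        # complementary slackness (approx.)
            break
        # Solve [ -X^{-1}Sigma  A^T ; A  0 ] [dx; dy] = [sigma; 0]
        M = np.block([[np.diag(-sigma / x), A.T],
                      [A, np.zeros((m, m))]])
        rhs = np.concatenate([sigma, np.zeros(m)])
        sol = np.linalg.solve(M, rhs)
        dx, dy = sol[:n], sol[n:]
        # dsigma = X^{-1}(-X Sigma e - Sigma dx) = -sigma - X^{-1} Sigma dx
        dsigma = -sigma - (sigma / x) * dx
        theta = min(1.0, alpha * max_step(x, dx), alpha * max_step(sigma, dsigma))
        x, y, sigma = x + theta * dx, y + theta * dy, sigma + theta * dsigma
    return x, y, sigma

# Tiny example: min x1 + 2*x2  s.t.  x1 + x2 = 1, x >= 0; optimum x = (1, 0).
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
c = np.array([1.0, 2.0])
x0 = np.array([0.5, 0.5])               # interior primal point: Ax0 = b, x0 > 0
y0 = np.zeros(1)
s0 = c - A.T @ y0                       # interior dual slack: (1, 2) > 0
x, y, sigma = affine_interior_point(A, b, c, x0, y0, s0)
print(x)                                # close to [1, 0]
```

The step-length rule (θ_x, θ_σ scaled by α < 1) is what keeps every iterate strictly inside the positive orthant.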
An example
min  2x1 + 1.5x2
s.t. 12x1 + 24x2 ≥ 120
     16x1 + 16x2 ≥ 120
     30x1 + 12x2 ≥ 120
     x1 ≤ 15
     x2 ≤ 15
     x1 ≥ 0
     x2 ≥ 0
13 / 32
An execution
Starting point 1: x = (10, 10).
[Figure: trajectory of the iterates in the (x1, x2) plane; both axes range from 0 to 15.]
14 / 32
Another execution
Starting point 2: x = (14, 1).
[Figure: trajectory of the iterates in the (x1, x2) plane; both axes range from 0 to 15.]
15 / 32
Centered interior-point method
To avoid this poor performance, we need a way to keep the iterates away from the boundary until the solution approaches the optimum. An effective way to achieve this goal is to relax the complementary slackness condition

    XΣe = 0

into

    XΣe = μe   (μ > 0)

We start from a large μ and gradually reduce it as the algorithm proceeds. At each iteration, we execute the previous affine interior point method.
16 / 32
1: Set initial solution x such that Ax = b, x > 0;
2: Set initial solution y, σ such that A^T y + σ = c, σ > 0;
3: while TRUE do
4:     Estimate μ = β σ^T x / n;
5:     Solve
           [ −X^{-1}Σ  A^T ] [ ∆x ]   [ σ − μX^{-1}e ]
           [  A         0  ] [ ∆y ] = [ 0            ]
6:     Set ∆σ = X^{-1}(μe − XΣe − Σ∆x) = −σ − X^{-1}(Σ∆x − μe);
7:     Set θ_x = min_{j: ∆x_j < 0} x_j / (−∆x_j),  θ_σ = min_{j: ∆σ_j < 0} σ_j / (−∆σ_j);
8:     Set θ = min{1, αθ_x, αθ_σ};
9:     Update x = x + θ∆x, y = y + θ∆y, σ = σ + θ∆σ;
10:    if x_i σ_i ≤ ε for all i = 1, ..., n then
11:        break;
12:    end if
13: end while
14: return (x, y, σ);
17 / 32
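The centered variant differs from the affine one only in the right-hand side and in the ∆σ update. A self-contained NumPy sketch under the same assumptions as before (a toy LP and illustrative values α = 0.9, β = 0.1, ε = 1e-8):

```python
import numpy as np

def central_path_interior_point(A, b, c, x, y, sigma,
                                alpha=0.9, beta=0.1, eps=1e-8, max_iter=500):
    """Centered (path-following) interior-point sketch for
    min c^T x s.t. Ax = b, x >= 0, from a strictly feasible pair.
    beta < 1 shrinks the centering parameter mu each iteration."""
    m, n = A.shape

    def max_step(v, dv):
        neg = dv < 0
        return np.min(v[neg] / -dv[neg]) if np.any(neg) else np.inf

    for _ in range(max_iter):
        if np.all(x * sigma <= eps):
            break
        mu = beta * (sigma @ x) / n      # target value for each x_i * sigma_i
        # Solve [ -X^{-1}Sigma  A^T ; A  0 ] [dx; dy] = [sigma - mu X^{-1} e; 0]
        M = np.block([[np.diag(-sigma / x), A.T],
                      [A, np.zeros((m, m))]])
        rhs = np.concatenate([sigma - mu / x, np.zeros(m)])
        sol = np.linalg.solve(M, rhs)
        dx, dy = sol[:n], sol[n:]
        # dsigma = X^{-1}(mu e - X Sigma e - Sigma dx)
        dsigma = mu / x - sigma - (sigma / x) * dx
        theta = min(1.0, alpha * max_step(x, dx), alpha * max_step(sigma, dsigma))
        x, y, sigma = x + theta * dx, y + theta * dy, sigma + theta * dsigma
    return x, y, sigma

# Same tiny example: min x1 + 2*x2  s.t.  x1 + x2 = 1, x >= 0; optimum x = (1, 0).
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
c = np.array([1.0, 2.0])
x, y, sigma = central_path_interior_point(
    A, b, c, np.array([0.5, 0.5]), np.zeros(1), np.array([1.0, 2.0]))
print(x)    # close to [1, 0]
```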
Running the central path algorithm
Starting point 1: x = (10, 10).
[Figure: trajectory of the iterates in the (x1, x2) plane; both axes range from 0 to 15.]
18 / 32
Running the central path algorithm
Starting point 2: x = (14, 1).
[Figure: trajectory of the iterates in the (x1, x2) plane; both axes range from 0 to 15.]
19 / 32
Another viewpoint
Actually, the relaxed complementary slackness condition

    XΣe = (1/t) e   (t > 0)

corresponds to the following optimization problem:

    min  c^T x − (1/t) Σ_{i=1}^n log x_i
    s.t. Ax = b
         x ≥ 0
20 / 32
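This correspondence can be verified by writing down the Lagrangian of the barrier problem; the short derivation below fills in the intermediate steps (not shown on the slide):

```latex
% Barrier problem: \min_x \; c^T x - \tfrac{1}{t}\sum_{i=1}^n \log x_i
%                  \quad \text{s.t.}\; Ax = b.
% Lagrangian, with multiplier y for Ax = b:
L(x, y) = c^T x - \frac{1}{t}\sum_{i=1}^n \log x_i - y^T (Ax - b)
% Stationarity in each x_i:
\frac{\partial L}{\partial x_i} = c_i - \frac{1}{t\, x_i} - (A^T y)_i = 0
% Defining \sigma_i := \frac{1}{t\, x_i} > 0 recovers dual feasibility
% A^T y + \sigma = c, together with the relaxed slackness
x_i \sigma_i = \frac{1}{t}, \qquad \text{i.e.}\quad X\Sigma e = \frac{1}{t}\, e.
```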
Consider a general optimization problem
Suppose we are trying to solve a convex program with inequality constraints:

    min  f(x)
    s.t. Ax = b
         g(x) ≤ 0

where f(x) and g(x) are convex and twice differentiable.
21 / 32
Let's start from equality-constrained quadratic programming

Suppose we are trying to solve a QP with equality constraints:

    min  (1/2) x^T P x + Q^T x + r
    s.t. Ax = b

Applying the Lagrangian conditions, we have Ax* = b and Px* + Q + A^T λ = 0.

Thus, the optimal point x* can be solved for as follows:

    [ P  A^T ] [ x* ]   [ −Q ]
    [ A   0  ] [ λ  ] = [  b ]
22 / 32
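As a sanity check, the KKT system above can be assembled and solved directly with a linear solver. A minimal NumPy sketch; the particular P, Q, A, b below are assumed toy values, not from the lecture:

```python
import numpy as np

# Toy QP: min (1/2) x^T P x + Q^T x  s.t.  x1 + x2 = 1.
# With P = 2I and Q = 0 this is min x1^2 + x2^2 on the line x1 + x2 = 1,
# whose optimum is x* = (0.5, 0.5) with multiplier lambda = -1.
P = np.array([[2.0, 0.0], [0.0, 2.0]])
Q = np.zeros(2)
A = np.array([[1.0, 1.0]])
b = np.array([1.0])

m, n = A.shape
# Assemble the KKT system  [P A^T; A 0] [x; lambda] = [-Q; b]
K = np.block([[P, A.T], [A, np.zeros((m, m))]])
rhs = np.concatenate([-Q, b])
sol = np.linalg.solve(K, rhs)
x_star, lam = sol[:n], sol[n:]
print(x_star, lam)   # [0.5 0.5] [-1.]
```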
Then how to minimize f(x) with equality constraints? Newton's method

Suppose we are trying to minimize a convex function f(x), which is not a QP:

    min  f(x)
    s.t. Ax = b

Basic idea: let's try to improve from a feasible solution x. At x, we write the Taylor expansion and use the quadratic approximation to f:

    f(x + ∆x) ≈ f(x) + ∇f(x)^T ∆x + (1/2) ∆x^T ∇²f(x) ∆x

    min  f(x + ∆x)
    s.t. A(x + ∆x) = b

Thus, the Newton step ∆x* can be solved for as follows:

    [ ∇²f(x)  A^T ] [ ∆x* ]   [ −∇f(x) ]
    [  A       0  ] [  λ  ] = [   0    ]

(the second block is 0 because x is feasible, so A(x + ∆x) = b reduces to A∆x = 0). Since this is just an approximation to f(x), we need to iteratively perform a line search.
23 / 32
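The Newton iteration can be sketched as follows. The objective exp(x1) + exp(x2) and the constraint x1 + 2x2 = 1 are assumed toy choices; full Newton steps are taken instead of a line search, which suffices for this well-behaved instance:

```python
import numpy as np

# Equality-constrained Newton sketch:
# minimize f(x) = exp(x1) + exp(x2)  subject to  x1 + 2*x2 = 1.
A = np.array([[1.0, 2.0]])
b = np.array([1.0])

def grad(x):
    return np.exp(x)            # gradient of exp(x1) + exp(x2)

def hess(x):
    return np.diag(np.exp(x))   # Hessian is diagonal here

x = np.array([1.0, 0.0])        # feasible start: 1 + 2*0 = 1
m, n = A.shape
for _ in range(50):
    # Newton step: [H A^T; A 0][dx; lam] = [-grad; 0]; A dx = 0 keeps feasibility
    K = np.block([[hess(x), A.T], [A, np.zeros((m, m))]])
    rhs = np.concatenate([-grad(x), np.zeros(m)])
    sol = np.linalg.solve(K, rhs)
    dx = sol[:n]
    x = x + dx                  # full Newton step; a line search would be safer
    if np.linalg.norm(dx) < 1e-12:
        break

# At the optimum, stationarity requires exp(x2) = 2*exp(x1),
# i.e. x2 = x1 + ln 2, giving x1 = (1 - 2*ln 2)/3.
print(x)
```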
Then how to minimize f(x) with inequality constraints?
Suppose we are trying to solve a convex program with inequality constraints:

    min  f(x)
    s.t. Ax = b
         g(x) ≤ 0

where f(x) and g(x) are convex and twice differentiable.

Basic idea:
    1. Transform it into a series of equality-constrained optimization problems;
    2. Solve each equality-constrained optimization problem using Newton's method.
24 / 32
Log barrier function: transform into equality constraints
25 / 32
Log barrier function
Suppose we are trying to solve a convex program with inequality constraints:

    min  f(x)
    s.t. Ax = b
         g(x) ≤ 0

where f(x) and g(x) are convex and twice differentiable. This is equivalent to the following optimization problem:

    min  f(x) + I_(g(x))
    s.t. Ax = b

where

    I_(u) = { 0  if u ≤ 0
            { ∞  otherwise

But the indicator function is not differentiable.
26 / 32
Log barrier function: smooth approximation to the indicator function

Suppose we are trying to solve a convex program with inequality constraints:

    min  f(x)
    s.t. Ax = b
         g(x) ≤ 0

where f(x) and g(x) are convex and twice differentiable. This can be approximated by the following optimization problem:

    min  f(x) − (1/t) log(−g(x))
    s.t. Ax = b

Basic idea: −(1/t) log(−u) is a smooth approximation to the indicator function, and the approximation improves as t → ∞.
27 / 32
Log barrier function approximates indicator function
−(1/t) log(−x) approximates the indicator function, and the approximation improves as t → ∞.
For each setting of t, we obtain an approximation to the original optimization problem.
28 / 32
Interior method and central path
Solve a sequence of optimization problems:

    min  f(x) − (1/t) log(−g(x))
    s.t. Ax = b

For any t > 0, we define x*(t) as the optimal solution. t increases step by step; t should not be too large initially, as the problem is then hard to solve using Newton's method.
Central path: {x*(t) | t > 0}.
29 / 32
Interior point method for LP
Consider the following LP:

    min  c^T x
    s.t. Ax ≤ b
         x ≥ 0
30 / 32
Central path
For any t > 0, we define x*(t) as the optimal solution to:

    min  f(x) − (1/t) log(−g(x))
    s.t. Ax = b

x*(t) is not an optimal solution to the original problem, but the duality gap is bounded by m/t, where m is the number of inequality constraints.
31 / 32
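The m/t bound can be observed on a one-dimensional toy instance (an assumed illustration, not from the slides): minimize f(x) = x² subject to g(x) = 1 − x ≤ 0, so m = 1 and f* = 1, and the barrier minimizer x*(t) has a closed form:

```python
import math

# Barrier problem: min x^2 - (1/t) * log(x - 1)   (domain x > 1).
# Setting the derivative to zero: 2x - 1/(t*(x - 1)) = 0, i.e.
# 2*t*x^2 - 2*t*x - 1 = 0, whose root with x > 1 is:
def x_star(t):
    return (1.0 + math.sqrt(1.0 + 2.0 / t)) / 2.0

for t in [1.0, 10.0, 100.0, 1000.0]:
    gap = x_star(t) ** 2 - 1.0       # f(x*(t)) - f*, with f* = 1
    assert gap <= 1.0 / t            # suboptimality bounded by m/t = 1/t
    print(t, x_star(t), gap)
```

Here the bound can also be checked by hand: the stationarity condition gives x − 1 = 1/(2tx), so the gap (x − 1)(x + 1) = (x + 1)/(2tx) ≤ 1/t whenever x ≥ 1.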
Interior point algorithm
32 / 32