CS711008Z Algorithm Design and Analysis
Lecture 8. Linear programming: interior point method
Dongbo Bu
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
1 / 32
Outline
Brief history of the interior point method
Basic idea of the interior point method
2 / 32
A brief history of linear programming

In 1949, G. B. Dantzig proposed the simplex algorithm.
In 1971, Klee and Minty gave a counter-example showing that simplex is not a polynomial-time algorithm.
In 1975, L. V. Kantorovich and T. C. Koopmans received the Nobel prize for the application of linear programming in resource allocation.
In 1979, L. G. Khachiyan proposed a polynomial-time ellipsoid method.
In 1984, N. Karmarkar proposed another polynomial-time algorithm, the interior-point method.
In 2001, D. Spielman and S. Teng proposed smoothed complexity to explain the efficiency of the simplex algorithm.
3 / 32
In 1979, L. G. Khachiyan proposed a polynomial-time ellipsoid method for LP

Figure: Leonid G. Khachiyan
4 / 32
In 1984, N. Karmarkar proposed a new polynomial-time algorithm for LP
5 / 32
Basic idea of interior point method
6 / 32
KKT conditions for LP
Let's consider a linear program in slack form and its dual problem, i.e.

Primal problem:
    min  c^T x
    s.t. Ax = b
         x ≥ 0

Dual problem:
    max  b^T y
    s.t. A^T y ≤ c

KKT conditions:
    Primal feasibility: Ax = b, x ≥ 0.
    Dual feasibility: A^T y ≤ c.
    Complementary slackness: x_i = 0 or a_i^T y = c_i for each i = 1, ..., n, where a_i denotes the i-th column of A.
7 / 32
Rewriting the KKT conditions

Let's consider a linear program in slack form and its dual problem, i.e.

Primal problem:
    min  c^T x
    s.t. Ax = b
         x ≥ 0

Dual problem:
    max  b^T y
    s.t. A^T y ≤ c

KKT conditions:
    Primal feasibility: Ax = b, x ≥ 0;
    Dual feasibility: A^T y + σ = c, σ ≥ 0;
    Complementary slackness: x_i σ_i = 0 for i = 1, ..., n.

These conditions consist of m + 2n equations over the m + 2n variables (x, y, σ), plus non-negativity constraints.
8 / 32
Rewriting the KKT conditions further

Let's define diagonal matrices (illustrated here for n = 3):

    X = [ x1  0   0  ]        Σ = [ σ1  0   0  ]
        [ 0   x2  0  ]            [ 0   σ2  0  ]
        [ 0   0   x3 ]            [ 0   0   σ3 ]

and rewrite the complementary slackness condition as:

    XΣe = [ x1σ1 ]   [ 0 ]
          [ x2σ2 ] = [ 0 ]
          [ x3σ3 ]   [ 0 ]

where e = (1, 1, 1)^T.

KKT conditions:
    Primal feasibility: Ax = b, x ≥ 0;
    Dual feasibility: A^T y + σ = c, σ ≥ 0;
    Complementary slackness: XΣe = 0.

We call (x, y, σ) an interior point if x > 0 and σ > 0.
Question: how can we find (x, y, σ) satisfying the KKT conditions?
9 / 32
A simple interior-point method
Basic idea: assume we already know (x, y, σ) that is both primal and dual feasible, i.e.,

    Ax = b, x > 0
    A^T y + σ = c, σ > 0

and try to improve it toward a point (x*, y*, σ*) that makes complementary slackness hold.

Improvement strategy: starting from (x, y, σ), we follow a direction (∆x, ∆y, ∆σ) such that (x + ∆x, y + ∆y, σ + ∆σ) is a better solution, i.e., it comes closer to satisfying complementary slackness.

To find such a direction, we substitute (x + ∆x, y + ∆y, σ + ∆σ) into the KKT conditions:

    Primal feasibility: A(x + ∆x) = b, x + ∆x ≥ 0;
    Dual feasibility: A^T(y + ∆y) + (σ + ∆σ) = c, σ + ∆σ ≥ 0;
    Complementary slackness: (X + ∆X)(Σ + ∆Σ)e = 0.
10 / 32
Since we start from (x, y, σ) with Ax = b, x > 0 and A^T y + σ = c, σ > 0, the above conditions reduce to:

    A∆x = 0, x + ∆x ≥ 0;
    A^T∆y + ∆σ = 0, σ + ∆σ ≥ 0;
    X∆σ + Σ∆x = −XΣe − ∆X∆Σe.

Note that when ∆x and ∆σ are small, the non-linear term −∆X∆Σe is small relative to −XΣe and can thus be dropped, yielding a linear system:

    A∆x = 0, x + ∆x ≥ 0;
    A^T∆y + ∆σ = 0, σ + ∆σ ≥ 0;
    X∆σ + Σ∆x = −XΣe.

The solution is ∆σ = X^{-1}(−XΣe − Σ∆x) = −σ − X^{-1}Σ∆x, where (∆x, ∆y) solves:

    [ −X^{-1}Σ  A^T ] [ ∆x ]   [ σ ]
    [  A         0  ] [ ∆y ] = [ 0 ]
11 / 32
1: Set initial solution x such that Ax = b, x > 0;
2: Set initial solution y, σ such that A^T y + σ = c, σ > 0;
3: while TRUE do
4:     Solve
           [ −X^{-1}Σ  A^T ] [ ∆x ]   [ σ ]
           [  A         0  ] [ ∆y ] = [ 0 ]
5:     Set ∆σ = X^{-1}(−XΣe − Σ∆x) = −σ − X^{-1}Σ∆x;
6:     Set θ_x = min_{j: ∆x_j < 0} x_j / (−∆x_j),  θ_σ = min_{j: ∆σ_j < 0} σ_j / (−∆σ_j);
7:     Set θ = min{1, αθ_x, αθ_σ};
8:     Update x = x + θ∆x, y = y + θ∆y, σ = σ + θ∆σ;
9:     if x_i σ_i ≤ ε for all i = 1, ..., n then
10:        break;
11:    end if
12: end while
13: return (x, y, σ);
12 / 32
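The algorithm above can be sketched in Python with NumPy. This is a minimal illustration, not the lecture's code: the toy LP (min x1 + 2x2 subject to x1 + x2 = 1, x ≥ 0, with optimum x = (1, 0)), the starting point, and the parameters α = 0.9 and ε = 1e-8 are all assumed for the example.

```python
import numpy as np

def affine_interior_point(A, b, c, x, y, sigma, alpha=0.9, eps=1e-8, max_iter=500):
    """Affine interior-point sketch for min c^T x s.t. Ax = b, x >= 0.

    Assumes (x, y, sigma) is strictly feasible:
    Ax = b, x > 0 and A^T y + sigma = c, sigma > 0.
    """
    m, n = A.shape

    def max_step(v, dv):
        # Largest step keeping v + theta*dv >= 0, over coordinates with dv < 0.
        neg = dv < 0
        return np.min(v[neg] / -dv[neg]) if np.any(neg) else np.inf

    for _ in range(max_iter):
        if np.all(x * sigma <= eps):        # complementary slackness (approx.)
            break
        # Solve [ -X^{-1}Sigma  A^T ; A  0 ] [dx; dy] = [sigma; 0]
        M = np.block([[np.diag(-sigma / x), A.T],
                      [A, np.zeros((m, m))]])
        rhs = np.concatenate([sigma, np.zeros(m)])
        sol = np.linalg.solve(M, rhs)
        dx, dy = sol[:n], sol[n:]
        # dsigma = X^{-1}(-X Sigma e - Sigma dx) = -sigma - X^{-1} Sigma dx
        dsigma = -sigma - (sigma / x) * dx
        theta = min(1.0, alpha * max_step(x, dx), alpha * max_step(sigma, dsigma))
        x, y, sigma = x + theta * dx, y + theta * dy, sigma + theta * dsigma
    return x, y, sigma

# Tiny example: min x1 + 2*x2  s.t.  x1 + x2 = 1, x >= 0; optimum x = (1, 0).
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
c = np.array([1.0, 2.0])
x0 = np.array([0.5, 0.5])               # interior primal point: Ax0 = b, x0 > 0
y0 = np.zeros(1)
s0 = c - A.T @ y0                       # interior dual slack: (1, 2) > 0
x, y, sigma = affine_interior_point(A, b, c, x0, y0, s0)
print(x)                                # close to [1, 0]
```

The step-length rule (θ_x, θ_σ scaled by α < 1) is what keeps every iterate strictly inside the positive orthant.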
An example
min  2x1 + 1.5x2
s.t. 12x1 + 24x2 ≥ 120
     16x1 + 16x2 ≥ 120
     30x1 + 12x2 ≥ 120
     x1 ≤ 15
     x2 ≤ 15
     x1 ≥ 0
     x2 ≥ 0
13 / 32
An execution
Starting point 1: x = (10, 10).
[Figure: trajectory of the iterates in the (x1, x2) plane; both axes range from 0 to 15.]
14 / 32
Another execution
Starting point 2: x = (14, 1).
[Figure: trajectory of the iterates in the (x1, x2) plane; both axes range from 0 to 15.]
15 / 32
Centered interior-point method
To avoid this poor performance, we need a way to keep the iterates away from the boundary until the solution approaches the optimum. An effective way to achieve this goal is to relax the complementary slackness condition

    XΣe = 0

into

    XΣe = μe   (μ > 0)

We start from a large μ and gradually reduce it as the algorithm proceeds. At each iteration, we execute the previous affine interior point method.
16 / 32
1: Set initial solution x such that Ax = b, x > 0;
2: Set initial solution y, σ such that A^T y + σ = c, σ > 0;
3: while TRUE do
4:     Estimate μ = β σ^T x / n;
5:     Solve
           [ −X^{-1}Σ  A^T ] [ ∆x ]   [ σ − μX^{-1}e ]
           [  A         0  ] [ ∆y ] = [ 0            ]
6:     Set ∆σ = X^{-1}(μe − XΣe − Σ∆x) = −σ − X^{-1}(Σ∆x − μe);
7:     Set θ_x = min_{j: ∆x_j < 0} x_j / (−∆x_j),  θ_σ = min_{j: ∆σ_j < 0} σ_j / (−∆σ_j);
8:     Set θ = min{1, αθ_x, αθ_σ};
9:     Update x = x + θ∆x, y = y + θ∆y, σ = σ + θ∆σ;
10:    if x_i σ_i ≤ ε for all i = 1, ..., n then
11:        break;
12:    end if
13: end while
14: return (x, y, σ);
17 / 32
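The centered variant differs from the affine one only in the right-hand side and in the ∆σ update. A self-contained NumPy sketch under the same assumptions as before (a toy LP and illustrative values α = 0.9, β = 0.1, ε = 1e-8):

```python
import numpy as np

def central_path_interior_point(A, b, c, x, y, sigma,
                                alpha=0.9, beta=0.1, eps=1e-8, max_iter=500):
    """Centered (path-following) interior-point sketch for
    min c^T x s.t. Ax = b, x >= 0, from a strictly feasible pair.
    beta < 1 shrinks the centering parameter mu each iteration."""
    m, n = A.shape

    def max_step(v, dv):
        neg = dv < 0
        return np.min(v[neg] / -dv[neg]) if np.any(neg) else np.inf

    for _ in range(max_iter):
        if np.all(x * sigma <= eps):
            break
        mu = beta * (sigma @ x) / n      # target value for each x_i * sigma_i
        # Solve [ -X^{-1}Sigma  A^T ; A  0 ] [dx; dy] = [sigma - mu X^{-1} e; 0]
        M = np.block([[np.diag(-sigma / x), A.T],
                      [A, np.zeros((m, m))]])
        rhs = np.concatenate([sigma - mu / x, np.zeros(m)])
        sol = np.linalg.solve(M, rhs)
        dx, dy = sol[:n], sol[n:]
        # dsigma = X^{-1}(mu e - X Sigma e - Sigma dx)
        dsigma = mu / x - sigma - (sigma / x) * dx
        theta = min(1.0, alpha * max_step(x, dx), alpha * max_step(sigma, dsigma))
        x, y, sigma = x + theta * dx, y + theta * dy, sigma + theta * dsigma
    return x, y, sigma

# Same tiny example: min x1 + 2*x2  s.t.  x1 + x2 = 1, x >= 0; optimum x = (1, 0).
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
c = np.array([1.0, 2.0])
x, y, sigma = central_path_interior_point(
    A, b, c, np.array([0.5, 0.5]), np.zeros(1), np.array([1.0, 2.0]))
print(x)    # close to [1, 0]
```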
Running the central path algorithm
Starting point 1: x = (10, 10).
[Figure: trajectory of the iterates in the (x1, x2) plane; both axes range from 0 to 15.]
18 / 32
Running the central path algorithm
Starting point 2: x = (14, 1).
[Figure: trajectory of the iterates in the (x1, x2) plane; both axes range from 0 to 15.]
19 / 32
Another viewpoint
Actually, the relaxed complementary slackness condition

    XΣe = (1/t) e   (t > 0)

corresponds to the following optimization problem:

    min  c^T x − (1/t) Σ_{i=1}^n log x_i
    s.t. Ax = b
         x ≥ 0
20 / 32
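This correspondence can be verified by writing down the Lagrangian of the barrier problem; the short derivation below fills in the intermediate steps (not shown on the slide):

```latex
% Barrier problem: \min_x \; c^T x - \tfrac{1}{t}\sum_{i=1}^n \log x_i
%                  \quad \text{s.t.}\; Ax = b.
% Lagrangian, with multiplier y for Ax = b:
L(x, y) = c^T x - \frac{1}{t}\sum_{i=1}^n \log x_i - y^T (Ax - b)
% Stationarity in each x_i:
\frac{\partial L}{\partial x_i} = c_i - \frac{1}{t\, x_i} - (A^T y)_i = 0
% Defining \sigma_i := \frac{1}{t\, x_i} > 0 recovers dual feasibility
% A^T y + \sigma = c, together with the relaxed slackness
x_i \sigma_i = \frac{1}{t}, \qquad \text{i.e.}\quad X\Sigma e = \frac{1}{t}\, e.
```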
Consider a general optimization problem
Suppose we are trying to solve a convex program with inequality constraints:

    min  f(x)
    s.t. Ax = b
         g(x) ≤ 0

where f(x) and g(x) are convex and twice differentiable.
21 / 32
Let's start from equality-constrained quadratic programming

Suppose we are trying to solve a QP with equality constraints:

    min  (1/2) x^T P x + Q^T x + r
    s.t. Ax = b

Applying the Lagrangian conditions, we have Ax* = b and Px* + Q + A^T λ = 0.

Thus, the optimal point x* can be solved for as follows:

    [ P  A^T ] [ x* ]   [ −Q ]
    [ A   0  ] [ λ  ] = [  b ]
22 / 32
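As a sanity check, the KKT system above can be assembled and solved directly with a linear solver. A minimal NumPy sketch; the particular P, Q, A, b below are assumed toy values, not from the lecture:

```python
import numpy as np

# Toy QP: min (1/2) x^T P x + Q^T x  s.t.  x1 + x2 = 1.
# With P = 2I and Q = 0 this is min x1^2 + x2^2 on the line x1 + x2 = 1,
# whose optimum is x* = (0.5, 0.5) with multiplier lambda = -1.
P = np.array([[2.0, 0.0], [0.0, 2.0]])
Q = np.zeros(2)
A = np.array([[1.0, 1.0]])
b = np.array([1.0])

m, n = A.shape
# Assemble the KKT system  [P A^T; A 0] [x; lambda] = [-Q; b]
K = np.block([[P, A.T], [A, np.zeros((m, m))]])
rhs = np.concatenate([-Q, b])
sol = np.linalg.solve(K, rhs)
x_star, lam = sol[:n], sol[n:]
print(x_star, lam)   # [0.5 0.5] [-1.]
```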
Then how to minimize f(x) with equality constraints? Newton's method

Suppose we are trying to minimize a convex function f(x), which is not a QP:

    min  f(x)
    s.t. Ax = b

Basic idea: let's try to improve from a feasible solution x. At x, we write the Taylor expansion and use the quadratic approximation to f:

    f(x + ∆x) ≈ f(x) + ∇f(x)^T ∆x + (1/2) ∆x^T ∇²f(x) ∆x

    min  f(x + ∆x)
    s.t. A(x + ∆x) = b

Thus, the Newton step ∆x* can be solved for as follows:

    [ ∇²f(x)  A^T ] [ ∆x* ]   [ −∇f(x) ]
    [  A       0  ] [  λ  ] = [   0    ]

(the second block is 0 because x is feasible, so A(x + ∆x) = b reduces to A∆x = 0). Since this is just an approximation to f(x), we need to iteratively perform a line search.
23 / 32
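The Newton iteration can be sketched as follows. The objective exp(x1) + exp(x2) and the constraint x1 + 2x2 = 1 are assumed toy choices; full Newton steps are taken instead of a line search, which suffices for this well-behaved instance:

```python
import numpy as np

# Equality-constrained Newton sketch:
# minimize f(x) = exp(x1) + exp(x2)  subject to  x1 + 2*x2 = 1.
A = np.array([[1.0, 2.0]])
b = np.array([1.0])

def grad(x):
    return np.exp(x)            # gradient of exp(x1) + exp(x2)

def hess(x):
    return np.diag(np.exp(x))   # Hessian is diagonal here

x = np.array([1.0, 0.0])        # feasible start: 1 + 2*0 = 1
m, n = A.shape
for _ in range(50):
    # Newton step: [H A^T; A 0][dx; lam] = [-grad; 0]; A dx = 0 keeps feasibility
    K = np.block([[hess(x), A.T], [A, np.zeros((m, m))]])
    rhs = np.concatenate([-grad(x), np.zeros(m)])
    sol = np.linalg.solve(K, rhs)
    dx = sol[:n]
    x = x + dx                  # full Newton step; a line search would be safer
    if np.linalg.norm(dx) < 1e-12:
        break

# At the optimum, stationarity requires exp(x2) = 2*exp(x1),
# i.e. x2 = x1 + ln 2, giving x1 = (1 - 2*ln 2)/3.
print(x)
```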
Then how to minimize f(x) with inequality constraints?
Suppose we are trying to solve a convex program with inequality constraints:

    min  f(x)
    s.t. Ax = b
         g(x) ≤ 0

where f(x) and g(x) are convex and twice differentiable.

Basic idea:
    1. Transform it into a series of equality-constrained optimization problems;
    2. Solve each equality-constrained optimization problem using Newton's method.
24 / 32
Log barrier function: transform into equality constraints
25 / 32
Log barrier function
Suppose we are trying to solve a convex program with inequality constraints:

    min  f(x)
    s.t. Ax = b
         g(x) ≤ 0

where f(x) and g(x) are convex and twice differentiable. This is equivalent to the following optimization problem:

    min  f(x) + I_(g(x))
    s.t. Ax = b

where

    I_(u) = { 0  if u ≤ 0
            { ∞  otherwise

But the indicator function is not differentiable.
26 / 32
Log barrier function: smooth approximation to the indicator function

Suppose we are trying to solve a convex program with inequality constraints:

    min  f(x)
    s.t. Ax = b
         g(x) ≤ 0

where f(x) and g(x) are convex and twice differentiable. This can be approximated by the following optimization problem:

    min  f(x) − (1/t) log(−g(x))
    s.t. Ax = b

Basic idea: −(1/t) log(−u) is a smooth approximation to the indicator function, and the approximation improves as t → ∞.
27 / 32
Log barrier function approximates indicator function
−(1/t) log(−x) approximates the indicator function, and the approximation improves as t → ∞.
For each setting of t, we obtain an approximation to the original optimization problem.
28 / 32
Interior method and central path
Solve a sequence of optimization problems:

    min  f(x) − (1/t) log(−g(x))
    s.t. Ax = b

For any t > 0, we define x*(t) as the optimal solution. t increases step by step; t should not be too large initially, as the problem is then hard to solve using Newton's method.
Central path: {x*(t) | t > 0}.
29 / 32
Interior point method for LP
Consider the following LP:

    min  c^T x
    s.t. Ax ≤ b
         x ≥ 0
30 / 32
Central path
For any t > 0, we define x*(t) as the optimal solution to:

    min  f(x) − (1/t) log(−g(x))
    s.t. Ax = b

x*(t) is not an optimal solution to the original problem, but the duality gap is bounded by m/t, where m is the number of inequality constraints.
31 / 32
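The m/t bound can be observed on a one-dimensional toy instance (an assumed illustration, not from the slides): minimize f(x) = x² subject to g(x) = 1 − x ≤ 0, so m = 1 and f* = 1, and the barrier minimizer x*(t) has a closed form:

```python
import math

# Barrier problem: min x^2 - (1/t) * log(x - 1)   (domain x > 1).
# Setting the derivative to zero: 2x - 1/(t*(x - 1)) = 0, i.e.
# 2*t*x^2 - 2*t*x - 1 = 0, whose root with x > 1 is:
def x_star(t):
    return (1.0 + math.sqrt(1.0 + 2.0 / t)) / 2.0

for t in [1.0, 10.0, 100.0, 1000.0]:
    gap = x_star(t) ** 2 - 1.0       # f(x*(t)) - f*, with f* = 1
    assert gap <= 1.0 / t            # suboptimality bounded by m/t = 1/t
    print(t, x_star(t), gap)
```

Here the bound can also be checked by hand: the stationarity condition gives x − 1 = 1/(2tx), so the gap (x − 1)(x + 1) = (x + 1)/(2tx) ≤ 1/t whenever x ≥ 1.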
Interior point algorithm
32 / 32