conic duality theorems and applicationsweb.stanford.edu/class/msande311/lecture04.pdf · 2020. 1....
TRANSCRIPT
CME307/MS&E311: Optimization Lecture Note #04
Conic Duality Theorems and Applications
Yinyu Ye
Department of Management Science and Engineering
Stanford University
Stanford, CA 94305, U.S.A.
http://www.stanford.edu/˜yyye
Chapters 4.1-4.2 and 6.1-6.4
1
CME307/MS&E311: Optimization Lecture Note #04
Primal and Dual of Conic LP
Recall the pair of
(CLP ) minimize c • xsubject to ai • x = bi, i = 1, 2, ...,m, (Ax = b), x ∈ K;
and it dual problem
(CLD) maximize bTy
subject to∑m
i yiai + s = c, (ATy + s = c), s ∈ K∗,
where y ∈ Rm, s is called the dual slack vector/matrix, and K∗ is the dual cone of K .
Theorem 1 (Weak duality theorem)
c • x− bTy = x • s ≥ 0
for any feasible x of (CLP) and (y, s) of (CLD).
2
CME307/MS&E311: Optimization Lecture Note #04
CLP Duality Theorems
The weak duality theorem shows that a feasible solution to either problem yields a bound on the value of
the other problem. We call c • x− bTy the duality gap.
Corollary 1 Let x∗ ∈ Fp and (y∗, s∗) ∈ Fd. Then, c • x∗ = bTy∗ implies that x∗ is optimal for
(CLP) and (y∗, s∗) is optimal for (CLD).
Is the reverse also true? That is, given x∗ optimal for (CLP), then there is (y∗, s∗) feasible for (CLD) and
c • x∗ = bTy∗?
This is called the Strong Duality Theorem.
“True” when K = Rn+, that is, the polyhedral cone case, but not true in general.
3
CME307/MS&E311: Optimization Lecture Note #04
Proof of Strong Duality Theorem for LP
Let x∗ be an minimizer of (LP). Then the following system
Ax′ − bτ = 0, (x′; τ) ≥ 0, cTx′ − (cTx∗)τ = −1 < 0
must have no feasible solution (x′; τ). This is because otherwise, if τ > 0, x′/τ is feasible for (LP) and
cTx′/τ < cTx∗, which is a contradiction; and if τ = 0, x∗ + x′ is feasible for (LP) and
cT (x∗ + x′) = cTx∗ − 1 < cTx∗, which is also a contradiction. Thus, from the LP alternative system
pair II, there is y∗ feasible for
c−ATy∗ ≥ 0, −cTx∗ + bTy∗ ≥ 0.
Then, y∗ is feasible for (LD) from the first inequality; and from the weak duality theorem and the second
inequality cTx∗ − bTy∗ = 0.
4
CME307/MS&E311: Optimization Lecture Note #04
LP and LD Cases
Theorem 2 The following statements hold for every pair of (LP) and (LD) :
i) If (LP) and (LD) are both feasible, then both problems have optimal solutions and the optimal objective
values of the objective functions are equal, that is, optimal solutions for both (LP) and (LD) exist and
there is no duality gap.
ii) If (LP) or (LD) is feasible and bounded, then the other is feasible and bounded.
iii) If (LP) or (LD) is feasible and unbounded, then the other has no feasible solution.
iv) If (LP) or (LD) is infeasible, then the other is either unbounded or has no feasible solution.
A case that neither (LP) nor (LD) is feasible: c = (−1; 0), A = (0, −1), b = 1.
The proofs follow the Farkas lemma and the Weak Duality Theorem.
5
CME307/MS&E311: Optimization Lecture Note #04
The LP Primal and Dual Relation
The normal relation is proved on Slide 4 of Lecture Note #4.
6
CME307/MS&E311: Optimization Lecture Note #04
Optimality Conditions for LP
(x,y, s) ∈ (Rn+,Rm,Rn
+) :
cTx− bTy = 0
Ax = b
−ATy − s = −c
,
which is a system of linear inequalities and equations. Now it is easy to verify whether or not a pair
(x,y, s) is optimal.
7
CME307/MS&E311: Optimization Lecture Note #04
Complementarity Gap
For feasible x and (y, s), xT s = xT (c−ATy) = cTx− bTy is called the complementarity gap.
If xT s = 0, then we say x and s are complementary to each other.
Since both x and s are nonnegative, xT s = 0 implies that x. ∗ s = 0 or xjsj = 0 for all j = 1, . . . , n.
x. ∗ s = 0
Ax = b
−ATy − s = −c.
This system has total 2n+m unknowns and 2n+m equations including n nonlinear equations.
Interpretation of sj = 0: the jth inequality constraint of the dual is “binding” or “active”.
8
CME307/MS&E311: Optimization Lecture Note #04
Recall the Maze Runner Example
Since y5 = 0 the simplified LP is
maximizey y0 + y1 + y2 + y3 + y4
subject to y0 − γy1 ≤ 0, (x∗1 = 1)
y0 − γ(0.5y2 + 0.25y3 + 0.125y4) ≤ 0, (x∗2 = 0)
y1 − γy2 ≤ 0, (x∗3 = 1 + γ)
y1 − γ(0.5y3 + 0.25y4) ≤ 0, (x∗4 = 0)
y2 − γy3 ≤ 0, (x∗5 = 1 + γ + γ2)
y2 − γ(0.5y4) ≤ 0, (x∗6 = 0)
y3 − γy4 ≤ 0, (x∗7 = 0)
y3 ≤ 0, (x∗8 = 1 + γ + γ2 + γ3)
y4 ≤ 1. (x∗9 = 1)
The maximal solution is y∗1 = 0, y∗1 = 0, y∗2 = 0, y∗3 = 0, y∗4 = 1.
9
CME307/MS&E311: Optimization Lecture Note #04
Cost-to-go values on each state when actions in (red,red,red,blue,black) are taken: the current policy is
optimal.
10
CME307/MS&E311: Optimization Lecture Note #04
Cost-to-go values on each state when actions in (red,red,red,red,black) are taken: the current policy is not
optimal.
11
CME307/MS&E311: Optimization Lecture Note #04
General CLP: an SDP Example with a Duality Gap
The strong duality theorem may not hold for general convex cones:
c =
0 1 0
1 0 0
0 0 0
,a1 =
0 0 0
0 1 0
0 0 0
,a2 =
0 −1 0
−1 0 0
0 0 2
and
b =
0
2
.
12
CME307/MS&E311: Optimization Lecture Note #04
When Strong Duality Theorems Holds for CLP
Theorem 3 The following statements hold for every pair of (CLP) and (CLD):
i) If (CLP) and (CLD) both are feasible, and furthermore one of them have an interior, then there is no
duality gap between (CLP) and (CLD). However, one of the optimal solution may not be attainable.
ii) If (CLP) and (CLD) both are feasible and have interior, then, then both have attainable optimal solutions
with no duality gap.
iii) If (CLP) or (CLD) is feasible and unbounded, then the other has no feasible solution.
iv) If (CLP) or (CLD) is infeasible, and furthermore the other is feasible and has an interior, then the other
is unbounded.
In case i), one of the optimal solution may not be attainable although no gap.
13
CME307/MS&E311: Optimization Lecture Note #04
SDP Example with Zero-Duality Gap but not Attainable
C =
1 0
0 0
, A1 =
0 1
1 0
, and b1 = 2.
The primal has an interior, but the dual does not.
14
CME307/MS&E311: Optimization Lecture Note #04
Proof of CLP Strong Duality Theorem
i) Let Fp be feasible and have an interior, and let z∗ be its infimum. Then we consider the alternative
system pair
Ax− bτ = 0, c • x− z∗τ < 0, (x, τ) ∈ K ×R+,
and
ATy + s = c, −bTy + κ = −z∗, (s, κ) ∈ K∗ ×R+.
But the former is infeasible, so that we have a solution for the latter. From the Weak Duality theorem, we
must have κ = 0, that is, we have a solution (y, s) such that
ATy + s = c, bTy = z∗, s ∈ K∗.
ii) We only need to prove that there exist a solution x ∈ Fp such that c • x = z∗, that is, the infimum of
(CLP) is attainable. But this is just the other side of the proof given that Fd is feasible and has an interior,
and z∗ is also the supremum of (CLD).
iii) The proof by contradiction follows the Weak Duality Theorem.
15
CME307/MS&E311: Optimization Lecture Note #04
iv) Suppose Fd is empty and Fp is feasible and have an interior. Then, we have x̄ ∈ intK and τ̄ > 0
such that Ax̄− bτ̄ = 0, (x̄, τ̄) ∈ int(K ×R+). Then, for any z∗, we again consider the alternative
system pair
Ax− bτ = 0, c • x− z∗τ < 0, (x, τ) ∈ K ×R+,
and
ATy + s = c, −bTy + s = −z∗, (s, s) ∈ K∗ ×R+.
But the latter is infeasible, so that the formal has a feasible solution for any z∗. At such an solution, if
τ > 0, we have c • (x/τ) < z∗; if τ = 0, we have x̂+ αx, where x̂ is any feasible solution for (CLP),
being feasible for (CLP) and its objective value goes to −∞ as α goes to ∞.
16
CME307/MS&E311: Optimization Lecture Note #04
The SDP Primal and Dual Relation
min
(0 0
0 0
)•X
s.t.
(0 0
0 1
)•X = 0(
0 1
1 0
)•X = 2
X ≽ 0
max 2y2
s.t. y1
(0 0
0 1
)+ y2
(0 1
1 0
)+ S =
(0 0
0 0
)S ≽ 0
The Dual is feasible and bounded, but Primal is infeasible.
17
CME307/MS&E311: Optimization Lecture Note #04
Rules to Construct the Dual in General
obj. coef. vector right-hand-side
right-hand-side obj. coef. vector
A AT
Max model Min model
xj ∈ K jth constraint slack sj ∈ K∗
xj “free” jth constraint slack sj = 0
ith constraint slack si ∈ K yi ∈ K∗
ith constraint slack si = 0 yi “free”
The dual of the dual is primal!
18
CME307/MS&E311: Optimization Lecture Note #04
Optimality and Complementarity Conditions for SDP
c •X − bTy = 0
AX = b
−ATy − S = −c
X,S ≽ 0
, (1)
XS = 0
AX = b
−ATy − S = −c
X,S ≽ 0
(2)
19
CME307/MS&E311: Optimization Lecture Note #04
LP, SOCP, and SDP Examples
min 2x1 + x2 + x3
s. t. x1 + x2 + x3 = 1,
(x1;x2;x3) ≥ 0.
max y
s.t. e · y + s = (2; 1; 1),
(s1; s2; s3) ≥ 0.
min 2x1 + x2 + x3
s.t. x1 + x2 + x3 = 1,
x1 −√
x22 + x2
3 ≥ 0.
max y
s.t. e · y + s = (2; 1; 1),
s1 −√s22 + s23 ≥ 0.
For the SOCP case: 2− y ≥√2(1− y)2. Since y = 1 is feasible for the dual, y∗ ≥ 1 so that the dual
constraint becomes 2− y ≥√2(y − 1) or y ≤
√2. Thus, y∗ =
√2, and there is no duality gap.
20
CME307/MS&E311: Optimization Lecture Note #04
minimize
2 .5
.5 1
·
x1 x2
x2 x3
subject to
1 .5
.5 1
·
x1 x2
x2 x3
= 1, x1 x2
x2 x3
≽ 0,
maximize y
subject to
1 .5
.5 1
y + s =
2 .5
.5 1
,
s =
s1 s2
s2 s3
≽ 0.
21
CME307/MS&E311: Optimization Lecture Note #04
Duality Application: Two-Person Zero-Sum Game
Let P be the payoff matrix of a two-person, ”column” and ”row”, zero-sum game.
P =
+3 −1 −4
−3 +1 +4
Let the Column-Player choose a column, say the jth column, of the payoff matrix and the Row-Player
choose a row, say the ith row to play. Then the reward to the Column-Player is Pij that is also the loss to
the Row-Player.
The pure strategy is a single column (row) chosen by the Column-Player (Row-Player).
Players usually use randomized strategies in such a game. A randomized strategy is a fixed probability
distribution vector x (y) where the Column-Player (Row-Player) randomly chooses a column (row)
according to the distribution x (y). Then the expected reward to the Column-Player is yTPx that is also
the epected loss to the Row-Player.
22
CME307/MS&E311: Optimization Lecture Note #04
Nash Equilibrium
In a Nash Equilibrium, if the column (row) strategy is a pure strategy, then the payoff of the chosen column
(row) must be greater (less) than or equal to the payoff of any other chosen column (row) if the
row(column)-player does not change the strategy. If the column (row) strategy is a randomized strategy,
then the expected payoff of the chosen columns (rows) according to the column (row) probability
distribution x∗ (y∗) should be greater (less) than or equal to the expected payoff of any other column (row)
distribution if the row(column)-player does not change its randomized strategy.
(y∗)TPx ≤ (y∗)TPx∗ ≤ yTPx∗, ∀x,y.
The pure Nash Equilibrium may not exist, but randomized Nash Equilibrium always exists.
23
CME307/MS&E311: Optimization Lecture Note #04
LP formulation of Nash Equilibrium
”Column” strategy:
max v
s.t. ve ≤ Px
eTx = 1
x ≥ 0.
”Row” strategy:
min u
s.t. ue ≥ PTy
eTy = 1
y ≥ 0.
Theorem 4 They are dual to each other, and an optimal pair of x∗ and y∗ is a Nash Equilibrium.
24
CME307/MS&E311: Optimization Lecture Note #04
Duality Application: Robust Optimization
Consider a linear program
minimize (c+ Cu)Tx
subject to Ax = b,
x ≥ 0,
where u ≥ 0 and u ≤ e is chosen by an Adversary and beyond decision maker’s control.
Robust Min-Max Model:
minimize max{u≥0, u≤e}(c+ Cu)Tx
subject to Ax = b,
x ≥ 0.
25
CME307/MS&E311: Optimization Lecture Note #04
The Dual of Adversary’s Problem
Adversary’s (primal) problem:
maximizeu cTx+ xTCu
subject to u ≤ e,
u ≥ 0.
Dual of Adversary’s problem:
minimizey cTx+ eTy
subject to y ≥ CTx,
y ≥ 0.
26
CME307/MS&E311: Optimization Lecture Note #04
Decision Maker’s Robust Optimization Model
Robust Min-Min Model:
minimizex miny cTx+ eTy
s.t. y ≥ CTx, y ≥ 0
subject to Ax = b,
x, y ≥ 0.
which is equivalent to
minimizex,y cTx+ eTy
subject to y ≥ CTx,
Ax = b,
x, y ≥ 0.
27
CME307/MS&E311: Optimization Lecture Note #04
(Distributionally) Robust Deep-Learning I
Figure 1: Result of the DRO Learning I: Original
28
CME307/MS&E311: Optimization Lecture Note #04
(Distributionally) Robust Deep-Learning II
Figure 2: Result of the DRO Learning II: Nonrobust Result
29
CME307/MS&E311: Optimization Lecture Note #04
(Distributionally) Robust Deep-Learning III
Figure 3: Result of the DRO Learning III: DRO Result
30