lecture 4. the dual cone and dual problem · lecture 4. the dual cone and dual problem shuzhong...

28
IE 8534 1 Lecture 4. The Dual Cone and Dual Problem Shuzhong Zhang

Upload: vankhanh

Post on 09-Apr-2018

222 views

Category:

Documents


4 download

TRANSCRIPT

Page 1: Lecture 4. The Dual Cone and Dual Problem · Lecture 4. The Dual Cone and Dual Problem Shuzhong Zhang. IE 8534 2 For a convex cone K, its dual cone is defined as K

IE 8534 1

Lecture 4. The Dual Cone and Dual Problem

Shuzhong Zhang

Page 2: Lecture 4. The Dual Cone and Dual Problem · Lecture 4. The Dual Cone and Dual Problem Shuzhong Zhang. IE 8534 2 For a convex cone K, its dual cone is defined as K

IE 8534 2

For a convex cone K, its dual cone is defined as

K∗ = {y | ⟨x, y⟩ ≥ 0, ∀x ∈ K}.

The inner-product can be replaced by ‘xTy’ if the coordinates of the

vectors are given. Clearly, K∗ is always a closed set, regardless if K is

closed or not. In fact, K∗ is also always convex regardless if K is

convex or not.

Shuzhong Zhang

Page 3: Lecture 4. The Dual Cone and Dual Problem · Lecture 4. The Dual Cone and Dual Problem Shuzhong Zhang. IE 8534 2 For a convex cone K, its dual cone is defined as K

IE 8534 3

In light of the dual cone, the problem of finding a lower bound on the

optimal value for the standard conic optimization problem can be

done as follows. Introduce the Lagrangian multiplier y for the equality

constraint b−Ax = 0, and s for the conic constraint x ∈ K. The

objective becomes

cTx+ yT(b−Ax)− sTx = yTb+ (c−ATy − s)Tx.

This will be a lower bound on the optimal value whenever

c−ATy − s = 0 and s ∈ K∗. The quest to find the best lower bound

leads to the dual problem for the standard conic optimization problem:

maximize bTy

subject to ATy + s = c

s ∈ K∗.

(1)

Shuzhong Zhang

Page 4: Lecture 4. The Dual Cone and Dual Problem · Lecture 4. The Dual Cone and Dual Problem Shuzhong Zhang. IE 8534 2 For a convex cone K, its dual cone is defined as K

IE 8534 4

As an example, for the standard SDP problem (primal),

minimize C •Xsubject to Ai •X = bi, i = 1, ...,m

X ≽ 0,

its dual problem is in the form of

maximize∑m

i=1 biyi

subject to∑m

i=1 yiAi + S = C

S ≽ 0.

Shuzhong Zhang

Page 5: Lecture 4. The Dual Cone and Dual Problem · Lecture 4. The Dual Cone and Dual Problem Shuzhong Zhang. IE 8534 2 For a convex cone K, its dual cone is defined as K

IE 8534 5

The so-called weak duality relationship is immediate, by the

construction of the dual problem. This is shown in the next theorem.

Theorem 1 Let x be feasible for the primal problem, and (y, s) be

feasible to its dual. Then,

bTy ≤ cTx.

Moreover, if bTy = cTx then xTs = 0, and they are both optimal to

their own respective problems.

If a pair of primal-dual feasible solutions x and (y, s) both exist and

they satisfy xTs = 0 then we call them a pair of complementary

optimal solutions.

Shuzhong Zhang

Page 6: Lecture 4. The Dual Cone and Dual Problem · Lecture 4. The Dual Cone and Dual Problem Shuzhong Zhang. IE 8534 2 For a convex cone K, its dual cone is defined as K

IE 8534 6

A very useful theorem is the following.

Theorem 2 Let x be feasible for the primal problem, and (y, s) be

feasible to its dual, and x ∈ intK and s ∈ intK∗. Then, the sets of

optimal solutions for both problems are nonempty and compact, and

any pair of primal-dual optimal solutions will be complementary to

each other.

Shuzhong Zhang

Page 7: Lecture 4. The Dual Cone and Dual Problem · Lecture 4. The Dual Cone and Dual Problem Shuzhong Zhang. IE 8534 2 For a convex cone K, its dual cone is defined as K

IE 8534 7

Let us spend some time now to study the structures of the dual cones.

Let us start with the topological structures. If rel (K) is denoted as the

relative interior of K, then (rel (K))∗ = K∗. Let the Minkowski

summation of two sets A and B be

A+B = {z | z = x+ y with x ∈ A, y ∈ B}.

It is clear that the sum of two convex sets remains convex. However,

the sum of two closed convex sets may not be closed any more.

Shuzhong Zhang

Page 8: Lecture 4. The Dual Cone and Dual Problem · Lecture 4. The Dual Cone and Dual Problem Shuzhong Zhang. IE 8534 2 For a convex cone K, its dual cone is defined as K

IE 8534 8

Example 1 The set R+

−1

0

1

+ SOC(3) is convex but not closed.

For instance, the point (0, 1, 0)T is not in the set, but it is in the

closure of the set, since for any ϵ > 0, we may writeϵ

1

0

=1

ϵ

−1

0

1

+

1ϵ + ϵ

1

−1ϵ

,

where the second term is in SOC(3).

Shuzhong Zhang

Page 9: Lecture 4. The Dual Cone and Dual Problem · Lecture 4. The Dual Cone and Dual Problem Shuzhong Zhang. IE 8534 2 For a convex cone K, its dual cone is defined as K

IE 8534 9

However, if A and B are both polyhedra, then A+B remains a

polyhedron, hence closed. It is interesting to note that the

set-summation and the set-intersection operations are a duality pair:

Proposition 1 Suppose that K1 and K2 are convex cones. It holds

that

(K1 +K2)∗ = K∗

1 ∩ K∗2.

Proof. For y ∈ (K1 +K2)∗, it follows that 0 ≤ (x1 + x2)

Ty for all

x1 ∈ K1 and x2 ∈ K2. Since 0 ∈ K2, it follows that 0 ≤ (x1)Ty for all

x1 ∈ K1, thus y ∈ K∗1, and similarly y ∈ K∗

2. Hence, y ∈ K∗1 ∩ K∗

2.

Now, take any y ∈ K∗1 ∩ K∗

2, and any x1 ∈ K1 and x2 ∈ K2. We have

(x1 + x2)Ty = (x1)

Ty + (x2)Ty ≥ 0, and so y ∈ (K1 +K2)

∗. This

proves (K1 +K2)∗ = K∗

1 ∩ K∗2 as desired. 2

Shuzhong Zhang

Page 10: Lecture 4. The Dual Cone and Dual Problem · Lecture 4. The Dual Cone and Dual Problem Shuzhong Zhang. IE 8534 2 For a convex cone K, its dual cone is defined as K

IE 8534 10

The following bipolar theorem can be considered as a cornerstone for

convex analysis.

Theorem 3 Suppose that K is a convex cone. Then (K∗)∗ = clK.

Proof. For any x ∈ K, by the definition of the dual cone, xTy ≥ 0

for all y ∈ K∗, and so K ⊆ (K∗)∗. Therefore clK ⊆ (K∗)∗.

The other containing relationship (K∗)∗ ⊆ clK can be shown by

contradiction. Suppose such is not true then there exists

z ∈ (K∗)∗ \ clK. By the separation theorem for convex cones, there is

vector c such that

cTz < 0, and cTx ≥ 0∀x ∈ clK.

The last relation suggests that c ∈ K∗, which contradicts with the first

relation because, as z ∈ (K∗)∗, it should lead to cTz ≥ 0, but not the

opposite. This contradiction completes the proof. 2

Shuzhong Zhang

Page 11: Lecture 4. The Dual Cone and Dual Problem · Lecture 4. The Dual Cone and Dual Problem Shuzhong Zhang. IE 8534 2 For a convex cone K, its dual cone is defined as K

IE 8534 11

This proof suggests that the bipolar theorem for convex cone is

equivalent to the separation theorem itself. Suppose that K is a closed

convex cone, and v ̸∈ K. Then, the bipolar theorem asserts that

v ̸∈ (K∗)∗, and this implies that there is y ∈ K∗ such that vTy < 0,

and xTy ≥ 0 for any x ∈ K because y ∈ K∗, hence a separation exists.

Shuzhong Zhang

Page 12: Lecture 4. The Dual Cone and Dual Problem · Lecture 4. The Dual Cone and Dual Problem Shuzhong Zhang. IE 8534 2 For a convex cone K, its dual cone is defined as K

IE 8534 12

In case K is closed, the bipolar theorem is what in folklore understood

as ‘the dual of the dual is the primal itself again’. Combining

Theorem 3 and Proposition 1, it follows that

cl (K1 +K2) = (K1 ∩ K2)∗.

It is also useful to compute the effect of linear transformation on a

convex cone in the context of dual operation. Let A be an invertible

matrix. Then, we have

(ATK)∗ = A−1K∗

by observing that y ∈ (ATK)∗ ⇐⇒ (ATx)Ty ≥ 0 ∀x ∈ K ⇐⇒ Ay ∈ K∗

⇐⇒ y ∈ A−1K∗.

Shuzhong Zhang

Page 13: Lecture 4. The Dual Cone and Dual Problem · Lecture 4. The Dual Cone and Dual Problem Shuzhong Zhang. IE 8534 2 For a convex cone K, its dual cone is defined as K

IE 8534 13

A convex cone is called solid if it is full dimensional. Mathematically,

K ⊆ Rn is solid ⇐⇒

spanK = Rn, or K + (−K) = Rn.

A convex cone is called pointed if it does not contain any nontrivial

subspace; i.e. K∩ (−K) = {0}. The following result proves to be useful:

Proposition 2 If K is solid then K∗ is pointed, and if K is pointed

then K∗ is solid.

Shuzhong Zhang

Page 14: Lecture 4. The Dual Cone and Dual Problem · Lecture 4. The Dual Cone and Dual Problem Shuzhong Zhang. IE 8534 2 For a convex cone K, its dual cone is defined as K

IE 8534 14

Proof. Suppose that K is solid, i.e. there is a ball B(x0; δ) ⊆ K with

radius δ > 0 and centered at x0, and at the same time suppose that

there is a line Rc ∈ K∗, then

0 ≤ s(x0 + t)Tc, ∀∥t∥ ≤ δ and s ∈ R,

which is clearly impossible. This contradiction shows that if K is solid

then K∗ must be pointed. By the bipolar theorem, the second part

follows by symmetry. 2

Shuzhong Zhang

Page 15: Lecture 4. The Dual Cone and Dual Problem · Lecture 4. The Dual Cone and Dual Problem Shuzhong Zhang. IE 8534 2 For a convex cone K, its dual cone is defined as K

IE 8534 15

It is easy to compute the following dual cones

• (Rn+)

∗ = Rn+.

• SOC(n)∗ = SOC(n).

• (Sn×n+ )∗ = Sn×n

+ , and (Hn×n+ )∗ = Hn×n

+ .

The above standard cones are so regular that their corresponding dual

cones coincide with the original cones themselves (the primal cones).

In general, they are not the same. As an example, consider the

following Lp cone, where p ≥ 1,

Lp(n+ 1) =

t

x

∣∣∣∣∣∣ t ∈ R, x ∈ Rn, t ≥ ∥x∥p

.

Shuzhong Zhang

Page 16: Lecture 4. The Dual Cone and Dual Problem · Lecture 4. The Dual Cone and Dual Problem Shuzhong Zhang. IE 8534 2 For a convex cone K, its dual cone is defined as K

IE 8534 16

One can show that (Lp(n+ 1))∗ = Lq(n+ 1) where 1/p+ 1/q = 1.

Now, the self-duality of SOC(n+ 1) is due to the fact that

SOC(n+ 1) = L2(n+ 1).

In convex analysis, the dual object for a convex function f(x) is known

as the conjugate of f(x), defined as

f∗(s) = sup{(−s)Tx− f(x) | x ∈ dom f},

where ‘dom f ’ stands for the domain of the function f . The conjugate

function is also known as the Legendre-Fenchel transformation.

Shuzhong Zhang

Page 17: Lecture 4. The Dual Cone and Dual Problem · Lecture 4. The Dual Cone and Dual Problem Shuzhong Zhang. IE 8534 2 For a convex cone K, its dual cone is defined as K

IE 8534 17

Examples of some popular conjugate functions are:

1. For f(x) = 12x

TQx+ bTx where Q ≻ 0, the conjugate function is

f∗(s) = 12 (s+ b)TQ−1(s+ b).

2. For f(x) =∑n

i=1 exi , the conjugate function is

f∗(s) = −∑n

i=1 si log(−si) +∑n

i=1 si.

3. For f(x) =∑n

i=1 xi log xi, the conjugate function is

f∗(s) =∑n

i=1 e1+si .

4. For f(x) = cTx, the conjugate function is f∗(s) = 0 for s = −c

and f∗(s) = +∞ when s ̸= −c.

5. For f(x) = −∑n

i=1 log xi, the conjugate function is

f∗(s) = −n−∑n

i=1 log si.

6. For f(x) = max1≤i≤n xi, the conjugate function is f∗(s) = 0 if∑ni=1 si ≤ −1, and f∗(s) = +∞ otherwise.

Shuzhong Zhang

Page 18: Lecture 4. The Dual Cone and Dual Problem · Lecture 4. The Dual Cone and Dual Problem Shuzhong Zhang. IE 8534 2 For a convex cone K, its dual cone is defined as K

IE 8534 18

If f(x) is strictly convex and differentiable, then f∗(s) is also strictly

convex and differentiable. The famous bi-conjugate theorem states

that f∗∗ = cl f . There is certainly a relationship between the two dual

objects: the dual cone and the conjugate function. Let us now set out

to find the exact relationship. First of all, suppose that f is a convex

function, mapping from its domain in Rn to R. Its epigraphy is a set

in Rn+1 defined as

epi (f) :=

t

x

∣∣∣∣∣∣ t ≥ f(x), t ∈ R, x ∈ dom (f)

.

Shuzhong Zhang

Page 19: Lecture 4. The Dual Cone and Dual Problem · Lecture 4. The Dual Cone and Dual Problem Shuzhong Zhang. IE 8534 2 For a convex cone K, its dual cone is defined as K

IE 8534 19

It is well known that f is a convex function if and only if epi (f) is a

convex set. It turns out that H (epi (f∗)) is the dual of H (epi (f)),

after taking the closure and reorienting itself. To be precise, we have:

Theorem 4 Suppose that f is a convex function, and

K := cl

p

q

x

∣∣∣∣∣∣∣∣ p > 0, q − pf(x/p) ≥ 0

.

Then,

K∗ = cl

u

v

s

∣∣∣∣∣∣∣∣ v > 0, u− vf∗(s/v) ≥ 0

.

Shuzhong Zhang

Page 20: Lecture 4. The Dual Cone and Dual Problem · Lecture 4. The Dual Cone and Dual Problem Shuzhong Zhang. IE 8534 2 For a convex cone K, its dual cone is defined as K

IE 8534 20

Let L2 :=

0, 1

1, 0

and L := L2 ⊕ In = [L2, 02,n; 0n,2, In]. The effect

of applying L to x is to swap the position of x1 and x2. Equivalently,

Theorem 4 can be stated as

(H (epi (f)))∗= cl (L (H (epi (f∗)))) .

Shuzhong Zhang

Page 21: Lecture 4. The Dual Cone and Dual Problem · Lecture 4. The Dual Cone and Dual Problem Shuzhong Zhang. IE 8534 2 For a convex cone K, its dual cone is defined as K

IE 8534 21

Another related concept is the so-called polar set. Suppose S ⊂ Rn.

Then its polar set is defined as

S◦ = {y | (−y)Tx ≤ 1, ∀x ∈ S}.

Clearly, S◦ is always closed and convex, and it always contains the

origin.

If S is a cone, then S◦ = S∗.

One can verify that

S◦ =

y

∣∣∣∣∣∣ 1

y

∈ H(S)∗

.

Shuzhong Zhang

Page 22: Lecture 4. The Dual Cone and Dual Problem · Lecture 4. The Dual Cone and Dual Problem Shuzhong Zhang. IE 8534 2 For a convex cone K, its dual cone is defined as K

IE 8534 22

Some basic properties regarding the polar set immediately follow:

• If A ⊆ B then A◦ ⊇ B◦;

• (∪i∈ISi)◦ = ∩i∈IS

◦i ;

• If S = conv(v1, · · · , vm) then

S◦ = {x | −vTi x ≤ 1, i = 1, 2, ...,m}.

In other words, the polar of a polytope is a polyhedron.

• It always holds S ⊆ S◦◦.

Shuzhong Zhang

Page 23: Lecture 4. The Dual Cone and Dual Problem · Lecture 4. The Dual Cone and Dual Problem Shuzhong Zhang. IE 8534 2 For a convex cone K, its dual cone is defined as K

IE 8534 23

By Theorem 3, we also have

Theorem 5 If S is a convex set, then

S◦◦ = clS.

The above is evidently equivalent to Theorem 3. They can both be

referred to as the bipolar theorem.

Shuzhong Zhang

Page 24: Lecture 4. The Dual Cone and Dual Problem · Lecture 4. The Dual Cone and Dual Problem Shuzhong Zhang. IE 8534 2 For a convex cone K, its dual cone is defined as K

IE 8534 24

Let us return to primal-dual conic optimization models.

In linear programming, it is known that whenever the primal and the

dual problems are both feasible then complementary solutions must

exist. This important property, however, is lost, for a general conic

optimization model, as the following example shows.

Consider

minimize x11

subject to x12 + x21 = 1, x11, x12

x21, x22

≽ 0.

This problem is clearly feasible. However, there is no attainable

optimal solution.

Shuzhong Zhang

Page 25: Lecture 4. The Dual Cone and Dual Problem · Lecture 4. The Dual Cone and Dual Problem Shuzhong Zhang. IE 8534 2 For a convex cone K, its dual cone is defined as K

IE 8534 25

Its dual problem is

maximize y

subject to y

0, 1

1, 0

+

s11, s12

s21, s22

=

1, 0

0, 0

, s11, s12

s21, s22

≽ 0,

or, more explicitly,

maximize y

subject to

1, −y

−y, 0

≽ 0.

The dual problem is feasible and has an optimal solution; however, a

pair of primal-dual complementary optimal solutions fails to exist,

although both primal and dual problems are feasible.

Shuzhong Zhang

Page 26: Lecture 4. The Dual Cone and Dual Problem · Lecture 4. The Dual Cone and Dual Problem Shuzhong Zhang. IE 8534 2 For a convex cone K, its dual cone is defined as K

IE 8534 26

One may guess that the unfortunate situation was caused by the

non-attainable optimal solution. Next example shows that even if both

primal and dual problems are nicely feasible with attainable optimal

solutions, there might still be a positive duality gap, meaning that the

complementarity condition does not hold.

minimize −x12 − x21 + 2x33

subject to x12 + x21 + x33 = 1,

x22 = 0,x11, x12, x13

x21, x22, x23

x31, x32, x33

≽ 0.

Shuzhong Zhang

Page 27: Lecture 4. The Dual Cone and Dual Problem · Lecture 4. The Dual Cone and Dual Problem Shuzhong Zhang. IE 8534 2 For a convex cone K, its dual cone is defined as K

IE 8534 27

The dual of this problem is

maximize y1

subject to

0, −1− y1, 0

−1− y1, −y2, 0

0, 0, 2− y1

≽ 0.

The primal problem has an optimal solution, e.g.

X∗ =

1, 0, 0

0, 0, 0

0, 0, 1

, with the optimal value equal to 2. The dual

problem also is feasible, and has an optimal solution e.g.

(y∗1 , y∗2) = (−1,−1), with optimal value equal to −1. Both problems

are feasible and have attainable optimal solutions; however, their

optimal solutions are not complementary to each other, and there is a

positive duality gap: cTx∗ − bTy∗ = (x∗)Ts∗ > 0.

Shuzhong Zhang

Page 28: Lecture 4. The Dual Cone and Dual Problem · Lecture 4. The Dual Cone and Dual Problem Shuzhong Zhang. IE 8534 2 For a convex cone K, its dual cone is defined as K

IE 8534 28

Key References:

• Yu. Nesterov and A. Nemirovsky, Interior Point Polynomial

Methods in Convex Programming, Studies in Applied Mathematics

13, SIAM, 1994.

• J.F. Sturm, Theory and Algorithms for Semidefinite Programming.

In High Performance Optimization, eds. H. Frenk, K. Roos, T.

Terlaky, and S. Zhang, Kluwer Academic Publishers, 1999.

• S. Zhang, A New Self-Dual Embedding Method for Convex

Programming, Journal of Global Optimization, 29, 479 – 496, 2004.

Shuzhong Zhang