lagrangiandualityandconvexoptimization · lagrangiandualityandconvexoptimization...

91
Lagrangian Duality and Convex Optimization Marylou Gabrié & He He Material originally designed by: Julia Kempe & David Rosenberg CDS, NYU February 16, 2021 M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 1 / 38

Upload: others

Post on 28-Mar-2021

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Lagrangian Duality and Convex Optimization

Marylou Gabrié & He HeMaterial originally designed by:Julia Kempe & David Rosenberg

CDS, NYU

February 16, 2021

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 1 / 38

Page 2: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Optimization

General Optimization Problem: Standard Formx ∈ Rn are the optimization variables and f0 is the objective function.

minimize f0(x)

subject to fi (x)6 0, i = 1, . . . ,mhi (x) = 0, i = 1, . . .p,

Can you think of examples?

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 2 / 38

Page 3: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Optimization

General Optimization Problem: Standard Formx ∈ Rn are the optimization variables and f0 is the objective function.

minimize f0(x)

subject to fi (x)6 0, i = 1, . . . ,mhi (x) = 0, i = 1, . . .p,

Can you think of examples?

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 2 / 38

Page 4: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Optimization

General Optimization Problem: Standard Formx ∈ Rn are the optimization variables and f0 is the objective function.

minimize f0(x)

subject to fi (x)6 0, i = 1, . . . ,mhi (x) = 0, i = 1, . . .p,

Can you think of examples?

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 2 / 38

Page 5: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Why Convex Optimization?

Historically:Linear programs (linear objectives & constraints) were the focus

Nonlinear programs: some easy, some hardBy early 2000s:

Main distinction is between convex and non-convex problemsConvex problems are the ones we know how to solve efficientlyMostly batch methods until... around 2010? (earlier if you were into neural nets)

By 2010 +- few years, most people understood theoptimization / estimation / approximation error tradeoffsaccepted that stochatic methods were often faster to get good results

(especially on big data sets)

now nobody’s scared to try convex optimization machinery on non-convex problems

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 3 / 38

Page 6: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Why Convex Optimization?

Historically:Linear programs (linear objectives & constraints) were the focusNonlinear programs: some easy, some hard

By early 2000s:Main distinction is between convex and non-convex problemsConvex problems are the ones we know how to solve efficientlyMostly batch methods until... around 2010? (earlier if you were into neural nets)

By 2010 +- few years, most people understood theoptimization / estimation / approximation error tradeoffsaccepted that stochatic methods were often faster to get good results

(especially on big data sets)

now nobody’s scared to try convex optimization machinery on non-convex problems

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 3 / 38

Page 7: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Why Convex Optimization?

Historically:Linear programs (linear objectives & constraints) were the focusNonlinear programs: some easy, some hard

By early 2000s:Main distinction is between convex and non-convex problems

Convex problems are the ones we know how to solve efficientlyMostly batch methods until... around 2010? (earlier if you were into neural nets)

By 2010 +- few years, most people understood theoptimization / estimation / approximation error tradeoffsaccepted that stochatic methods were often faster to get good results

(especially on big data sets)

now nobody’s scared to try convex optimization machinery on non-convex problems

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 3 / 38

Page 8: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Why Convex Optimization?

Historically:Linear programs (linear objectives & constraints) were the focusNonlinear programs: some easy, some hard

By early 2000s:Main distinction is between convex and non-convex problemsConvex problems are the ones we know how to solve efficiently

Mostly batch methods until... around 2010? (earlier if you were into neural nets)By 2010 +- few years, most people understood the

optimization / estimation / approximation error tradeoffsaccepted that stochatic methods were often faster to get good results

(especially on big data sets)

now nobody’s scared to try convex optimization machinery on non-convex problems

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 3 / 38

Page 9: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Why Convex Optimization?

Historically:Linear programs (linear objectives & constraints) were the focusNonlinear programs: some easy, some hard

By early 2000s:Main distinction is between convex and non-convex problemsConvex problems are the ones we know how to solve efficientlyMostly batch methods until... around 2010? (earlier if you were into neural nets)

By 2010 +- few years, most people understood theoptimization / estimation / approximation error tradeoffsaccepted that stochatic methods were often faster to get good results

(especially on big data sets)

now nobody’s scared to try convex optimization machinery on non-convex problems

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 3 / 38

Page 10: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Why Convex Optimization?

Historically:Linear programs (linear objectives & constraints) were the focusNonlinear programs: some easy, some hard

By early 2000s:Main distinction is between convex and non-convex problemsConvex problems are the ones we know how to solve efficientlyMostly batch methods until... around 2010? (earlier if you were into neural nets)

By 2010 +- few years, most people understood theoptimization / estimation / approximation error tradeoffsaccepted that stochatic methods were often faster to get good results

(especially on big data sets)

now nobody’s scared to try convex optimization machinery on non-convex problems

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 3 / 38

Page 11: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Your Reference for Convex Optimization

Boyd and Vandenberghe (2004)Very clearly written, but has a ton of detail for a first pass.See the Extreme Abridgement of Boyd and Vandenberghe.

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 4 / 38

Page 12: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

What we will quickly review today

1 Convex Sets and Functions

2 The General Optimization Problem

3 Lagrangian Duality: Convexity not required

4 Convex Optimization

5 Complementary Slackness

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 5 / 38

Page 13: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Table of Contents

1 Convex Sets and Functions

2 The General Optimization Problem

3 Lagrangian Duality: Convexity not required

4 Convex Optimization

5 Complementary Slackness

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 6 / 38

Page 14: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Convex Sets and Functions

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 7 / 38

Page 15: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Notation from Boyd and Vandenberghe

f : Rp→ Rq to mean that f maps from some subset of Rp

namely dom f ⊂ Rp, where dom f is the domain of f

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 8 / 38

Page 16: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Convex Sets

DefinitionA set C is convex if for any x1,x2 ∈ C and any θ with 06 θ6 1 we have

θx1+(1−θ)x2 ∈ C .

KPM Fig. 7.4

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 9 / 38

Page 17: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Convex and Concave Functions

DefinitionA function f : Rn→ R is convex if dom f is a convex set and if for all x ,y ∈ dom f , and06 θ6 1, we have

f (θx +(1−θ)y)6 θf (x)+(1−θ)f (y).

x y

λ1 − λ

A B

KPM Fig. 7.5

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 10 / 38

Page 18: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Examples of Convex Functions on R

Examplesx 7→ ax +b is both convex and concave on R for all a,b ∈ R.

x 7→ |x |p for p > 1 is convex on Rx 7→ eax is convex on R for all a ∈ REvery norm on Rn is convex (e.g. ‖x‖1 and ‖x‖2)Max: (x1, . . . ,xn) 7→max {x1 . . . ,xn} is convex on Rn

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 11 / 38

Page 19: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Examples of Convex Functions on R

Examplesx 7→ ax +b is both convex and concave on R for all a,b ∈ R.x 7→ |x |p for p > 1 is convex on R

x 7→ eax is convex on R for all a ∈ REvery norm on Rn is convex (e.g. ‖x‖1 and ‖x‖2)Max: (x1, . . . ,xn) 7→max {x1 . . . ,xn} is convex on Rn

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 11 / 38

Page 20: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Examples of Convex Functions on R

Examplesx 7→ ax +b is both convex and concave on R for all a,b ∈ R.x 7→ |x |p for p > 1 is convex on Rx 7→ eax is convex on R for all a ∈ R

Every norm on Rn is convex (e.g. ‖x‖1 and ‖x‖2)Max: (x1, . . . ,xn) 7→max {x1 . . . ,xn} is convex on Rn

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 11 / 38

Page 21: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Examples of Convex Functions on R

Examplesx 7→ ax +b is both convex and concave on R for all a,b ∈ R.x 7→ |x |p for p > 1 is convex on Rx 7→ eax is convex on R for all a ∈ REvery norm on Rn is convex (e.g. ‖x‖1 and ‖x‖2)

Max: (x1, . . . ,xn) 7→max {x1 . . . ,xn} is convex on Rn

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 11 / 38

Page 22: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Examples of Convex Functions on R

Examplesx 7→ ax +b is both convex and concave on R for all a,b ∈ R.x 7→ |x |p for p > 1 is convex on Rx 7→ eax is convex on R for all a ∈ REvery norm on Rn is convex (e.g. ‖x‖1 and ‖x‖2)Max: (x1, . . . ,xn) 7→max {x1 . . . ,xn} is convex on Rn

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 11 / 38

Page 23: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Convex Functions and Optimization

DefinitionA function f is strictly convex if the line segment connecting any two points on the graph of flies strictly above the graph (excluding the endpoints).

Consequences for optimization:convex: if there is a local minimum, then it is a global minimumstrictly convex: if there is a local minimum, then it is the unique global minumum

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 12 / 38

Page 24: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Convex Functions and Optimization

DefinitionA function f is strictly convex if the line segment connecting any two points on the graph of flies strictly above the graph (excluding the endpoints).

Consequences for optimization:convex: if there is a local minimum, then it is a global minimumstrictly convex: if there is a local minimum, then it is the unique global minumum

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 12 / 38

Page 25: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Table of Contents

1 Convex Sets and Functions

2 The General Optimization Problem

3 Lagrangian Duality: Convexity not required

4 Convex Optimization

5 Complementary Slackness

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 13 / 38

Page 26: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

The General Optimization Problem

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 14 / 38

Page 27: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

General Optimization Problem: Standard Form

General Optimization Problem: Standard Form

minimize f0(x)

subject to fi (x)6 0, i = 1, . . . ,mhi (x) = 0, i = 1, . . .p,

where x ∈ Rn are the optimization variables and f0 is the objective function.

Assume domain D=⋂m

i=0 dom fi ∩⋂p

i=1 dom hi is nonempty.

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 15 / 38

Page 28: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

General Optimization Problem: Standard Form

General Optimization Problem: Standard Form

minimize f0(x)

subject to fi (x)6 0, i = 1, . . . ,mhi (x) = 0, i = 1, . . .p,

where x ∈ Rn are the optimization variables and f0 is the objective function.

Assume domain D=⋂m

i=0 dom fi ∩⋂p

i=1 dom hi is nonempty.

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 15 / 38

Page 29: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

General Optimization Problem: More Terminology

The set of points satisfying the constraints is called the feasible set.A point x in the feasible set is called a feasible point.

If x is feasible and fi (x) = 0,then we say the inequality constraint fi (x)6 0 is active at x .

The optimal value p∗ of the problem is defined as

p∗ = inf {f0(x) | x satisfies all constraints} .

x∗ is an optimal point (or a solution to the problem) if x∗ is feasible and f (x∗) = p∗.

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 16 / 38

Page 30: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

General Optimization Problem: More Terminology

The set of points satisfying the constraints is called the feasible set.A point x in the feasible set is called a feasible point.If x is feasible and fi (x) = 0,

then we say the inequality constraint fi (x)6 0 is active at x .

The optimal value p∗ of the problem is defined as

p∗ = inf {f0(x) | x satisfies all constraints} .

x∗ is an optimal point (or a solution to the problem) if x∗ is feasible and f (x∗) = p∗.

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 16 / 38

Page 31: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

General Optimization Problem: More Terminology

The set of points satisfying the constraints is called the feasible set.A point x in the feasible set is called a feasible point.If x is feasible and fi (x) = 0,

then we say the inequality constraint fi (x)6 0 is active at x .

The optimal value p∗ of the problem is defined as

p∗ = inf {f0(x) | x satisfies all constraints} .

x∗ is an optimal point (or a solution to the problem) if x∗ is feasible and f (x∗) = p∗.

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 16 / 38

Page 32: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

General Optimization Problem: More Terminology

The set of points satisfying the constraints is called the feasible set.A point x in the feasible set is called a feasible point.If x is feasible and fi (x) = 0,

then we say the inequality constraint fi (x)6 0 is active at x .

The optimal value p∗ of the problem is defined as

p∗ = inf {f0(x) | x satisfies all constraints} .

x∗ is an optimal point (or a solution to the problem) if x∗ is feasible and f (x∗) = p∗.

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 16 / 38

Page 33: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Do We Need Equality Constraints?

Consider an equality-constrained problem:

minimize f0(x)

subject to h(x) = 0

Note thath(x) = 0 ⇐⇒ (h(x)> 0 AND h(x)6 0)

Can be rewritten as

minimize f0(x)

subject to h(x)6 0−h(x)6 0.

For simplicity, we’ll drop equality contraints from this presentation.

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 17 / 38

Page 34: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Do We Need Equality Constraints?

Consider an equality-constrained problem:

minimize f0(x)

subject to h(x) = 0

Note thath(x) = 0 ⇐⇒ (h(x)> 0 AND h(x)6 0)

Can be rewritten as

minimize f0(x)

subject to h(x)6 0−h(x)6 0.

For simplicity, we’ll drop equality contraints from this presentation.

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 17 / 38

Page 35: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Do We Need Equality Constraints?

Consider an equality-constrained problem:

minimize f0(x)

subject to h(x) = 0

Note thath(x) = 0 ⇐⇒ (h(x)> 0 AND h(x)6 0)

Can be rewritten as

minimize f0(x)

subject to h(x)6 0−h(x)6 0.

For simplicity, we’ll drop equality contraints from this presentation.

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 17 / 38

Page 36: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Do We Need Equality Constraints?

Consider an equality-constrained problem:

minimize f0(x)

subject to h(x) = 0

Note thath(x) = 0 ⇐⇒ (h(x)> 0 AND h(x)6 0)

Can be rewritten as

minimize f0(x)

subject to h(x)6 0−h(x)6 0.

For simplicity, we’ll drop equality contraints from this presentation.

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 17 / 38

Page 37: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Do We Need Equality Constraints?

Consider an equality-constrained problem:

minimize f0(x)

subject to h(x) = 0

Note thath(x) = 0 ⇐⇒ (h(x)> 0 AND h(x)6 0)

Can be rewritten as

minimize f0(x)

subject to h(x)6 0−h(x)6 0.

For simplicity, we’ll drop equality contraints from this presentation.M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 17 / 38

Page 38: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Table of Contents

1 Convex Sets and Functions

2 The General Optimization Problem

3 Lagrangian Duality: Convexity not required

4 Convex Optimization

5 Complementary Slackness

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 18 / 38

Page 39: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Lagrangian Duality: Convexity not required

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 19 / 38

Page 40: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

The Lagrangian

The general [inequality-constrained] optimization problem is:

minimize f0(x)

subject to fi (x)6 0, i = 1, . . . ,m

DefinitionThe Lagrangian for this optimization problem is

L(x ,λ) = f0(x)+m∑i=1

λi fi (x).

λi ’s are called Lagrange multipliers (also called the dual variables).

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 20 / 38

Page 41: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

The Lagrangian Encodes the Objective and Constraints

Supremum over Lagrangian gives back encoding of objective and constraints:

supλ�0

L(x ,λ) = supλ�0

(f0(x)+

m∑i=1

λi fi (x)

)

=

{f0(x) when fi (x)6 0 all i∞ otherwise.

Equivalent primal form of optimization problem:

p∗ = infxsupλ�0

L(x ,λ)

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 21 / 38

Page 42: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

The Lagrangian Encodes the Objective and Constraints

Supremum over Lagrangian gives back encoding of objective and constraints:

supλ�0

L(x ,λ) = supλ�0

(f0(x)+

m∑i=1

λi fi (x)

)

=

{f0(x) when fi (x)6 0 all i∞ otherwise.

Equivalent primal form of optimization problem:

p∗ = infxsupλ�0

L(x ,λ)

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 21 / 38

Page 43: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

The Lagrangian Encodes the Objective and Constraints

Supremum over Lagrangian gives back encoding of objective and constraints:

supλ�0

L(x ,λ) = supλ�0

(f0(x)+

m∑i=1

λi fi (x)

)

=

{f0(x) when fi (x)6 0 all i∞ otherwise.

Equivalent primal form of optimization problem:

p∗ = infxsupλ�0

L(x ,λ)

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 21 / 38

Page 44: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

The Primal and the Dual

Original optimization problem in primal form:

p∗ = infxsupλ�0

L(x ,λ)

Get the Lagrangian dual problem by “swapping the inf and the sup”:

d∗ = supλ�0

infxL(x ,λ)

We will show weak duality: p∗ > d∗ for any optimization problem

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 22 / 38

Page 45: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

The Primal and the Dual

Original optimization problem in primal form:

p∗ = infxsupλ�0

L(x ,λ)

Get the Lagrangian dual problem by “swapping the inf and the sup”:

d∗ = supλ�0

infxL(x ,λ)

We will show weak duality: p∗ > d∗ for any optimization problem

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 22 / 38

Page 46: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

The Primal and the Dual

Original optimization problem in primal form:

p∗ = infxsupλ�0

L(x ,λ)

Get the Lagrangian dual problem by “swapping the inf and the sup”:

d∗ = supλ�0

infxL(x ,λ)

We will show weak duality: p∗ > d∗ for any optimization problem

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 22 / 38

Page 47: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Weak Max-Min Inequality

TheoremFor any f :W ×Z → R, we have

supz∈Z

infw∈W

f (w ,z)6 infw∈W

supz∈Z

f (w ,z).

ProofFor any w0 ∈W and z0 ∈ Z , we clearly have

infw∈W

f (w ,z0)6 f (w0,z0)6 supz∈Z

f (w0,z).

Since infw∈W f (w ,z0)6 supz∈Z f (w0,z) for all w0 and z0, we must also have

supz0∈Z

infw∈W

f (w ,z0)6 infw0∈W

supz∈Z

f (w0,z).

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 23 / 38

Page 48: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Weak Max-Min Inequality

TheoremFor any f :W ×Z → R, we have

supz∈Z

infw∈W

f (w ,z)6 infw∈W

supz∈Z

f (w ,z).

ProofFor any w0 ∈W and z0 ∈ Z , we clearly have

infw∈W

f (w ,z0)6 f (w0,z0)6 supz∈Z

f (w0,z).

Since infw∈W f (w ,z0)6 supz∈Z f (w0,z) for all w0 and z0, we must also have

supz0∈Z

infw∈W

f (w ,z0)6 infw0∈W

supz∈Z

f (w0,z).

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 23 / 38

Page 49: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Weak Max-Min Inequality

TheoremFor any f :W ×Z → R, we have

supz∈Z

infw∈W

f (w ,z)6 infw∈W

supz∈Z

f (w ,z).

ProofFor any w0 ∈W and z0 ∈ Z , we clearly have

infw∈W

f (w ,z0)6 f (w0,z0)6 supz∈Z

f (w0,z).

Since infw∈W f (w ,z0)6 supz∈Z f (w0,z) for all w0 and z0, we must also have

supz0∈Z

infw∈W

f (w ,z0)6 infw0∈W

supz∈Z

f (w0,z).

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 23 / 38

Page 50: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Weak Duality

For any optimization problem (not just convex), weak max-min inequality implies weakduality:

p∗ = infxsupλ�0

[f0(x)+

m∑i=1

λi fi (x)

]

> supλ�0,ν

infx

[f0(x)+

m∑i=1

λi fi (x)

]= d∗

The difference p∗−d∗ is called the duality gap.For convex problems, we often have strong duality: p∗ = d∗.

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 24 / 38

Page 51: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Weak Duality

For any optimization problem (not just convex), weak max-min inequality implies weakduality:

p∗ = infxsupλ�0

[f0(x)+

m∑i=1

λi fi (x)

]

> supλ�0,ν

infx

[f0(x)+

m∑i=1

λi fi (x)

]= d∗

The difference p∗−d∗ is called the duality gap.For convex problems, we often have strong duality: p∗ = d∗.

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 24 / 38

Page 52: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

The Lagrange Dual Function

The Lagrangian dual problem:

d∗ = supλ�0

infxL(x ,λ)

DefinitionThe Lagrange dual function (or just dual function) is

g(λ) = infxL(x ,λ) = inf

x

(f0(x)+

m∑i=1

λi fi (x)

).

The dual function may take on the value −∞ (e.g. f0(x) = x).The dual function is always concave

since pointwise min of affine functions

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 25 / 38

Page 53: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

The Lagrange Dual Function

The Lagrangian dual problem:

d∗ = supλ�0

infxL(x ,λ)

DefinitionThe Lagrange dual function (or just dual function) is

g(λ) = infxL(x ,λ) = inf

x

(f0(x)+

m∑i=1

λi fi (x)

).

The dual function may take on the value −∞ (e.g. f0(x) = x).The dual function is always concave

since pointwise min of affine functions

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 25 / 38

Page 54: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

The Lagrange Dual Function

The Lagrangian dual problem:

d∗ = supλ�0

infxL(x ,λ)

DefinitionThe Lagrange dual function (or just dual function) is

g(λ) = infxL(x ,λ) = inf

x

(f0(x)+

m∑i=1

λi fi (x)

).

The dual function may take on the value −∞ (e.g. f0(x) = x).

The dual function is always concavesince pointwise min of affine functions

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 25 / 38

Page 55: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

The Lagrange Dual Function

The Lagrangian dual problem:

d∗ = supλ�0

infxL(x ,λ)

DefinitionThe Lagrange dual function (or just dual function) is

g(λ) = infxL(x ,λ) = inf

x

(f0(x)+

m∑i=1

λi fi (x)

).

The dual function may take on the value −∞ (e.g. f0(x) = x).The dual function is always concave

since pointwise min of affine functionsM.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 25 / 38

Page 56: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

The Lagrange Dual Problem: Search for Best Lower Bound

In terms of Lagrange dual function, we can write weak duality as

p∗ > supλ>0

g(λ) = d∗

So for any λ with λ> 0, Lagrange dual function gives a lower bound on optimalsolution:

p∗ > g(λ) for all λ> 0

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 26 / 38

Page 57: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

The Lagrange Dual Problem: Search for Best Lower Bound

In terms of Lagrange dual function, we can write weak duality as

p∗ > supλ>0

g(λ) = d∗

So for any λ with λ> 0, Lagrange dual function gives a lower bound on optimalsolution:

p∗ > g(λ) for all λ> 0

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 26 / 38

Page 58: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

The Lagrange Dual Problem: Search for Best Lower Bound

The Lagrange dual problem is a search for best lower bound on p∗:

maximize g(λ)

subject to λ� 0.

λ dual feasible if λ� 0 and g(λ)>−∞.λ∗ dual optimal or optimal Lagrange multipliers if they are optimal for theLagrange dual problem.

Lagrange dual problem often easier to solve (simpler constraints).d∗ can be used as stopping criterion for primal optimization.Dual can reveal hidden structure in the solution.

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 27 / 38

Page 59: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

The Lagrange Dual Problem: Search for Best Lower Bound

The Lagrange dual problem is a search for best lower bound on p∗:

maximize g(λ)

subject to λ� 0.

λ dual feasible if λ� 0 and g(λ)>−∞.

λ∗ dual optimal or optimal Lagrange multipliers if they are optimal for theLagrange dual problem.

Lagrange dual problem often easier to solve (simpler constraints).d∗ can be used as stopping criterion for primal optimization.Dual can reveal hidden structure in the solution.

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 27 / 38

Page 60: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

The Lagrange Dual Problem: Search for Best Lower Bound

The Lagrange dual problem is a search for best lower bound on p∗:

maximize g(λ)

subject to λ� 0.

λ dual feasible if λ� 0 and g(λ)>−∞.λ∗ dual optimal or optimal Lagrange multipliers if they are optimal for theLagrange dual problem.

Lagrange dual problem often easier to solve (simpler constraints).d∗ can be used as stopping criterion for primal optimization.Dual can reveal hidden structure in the solution.

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 27 / 38

Page 61: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

The Lagrange Dual Problem: Search for Best Lower Bound

The Lagrange dual problem is a search for best lower bound on p∗:

maximize g(λ)

subject to λ� 0.

λ dual feasible if λ� 0 and g(λ)>−∞.λ∗ dual optimal or optimal Lagrange multipliers if they are optimal for theLagrange dual problem.

Lagrange dual problem often easier to solve (simpler constraints).

d∗ can be used as stopping criterion for primal optimization.Dual can reveal hidden structure in the solution.

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 27 / 38

Page 62: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

The Lagrange Dual Problem: Search for Best Lower Bound

The Lagrange dual problem is a search for best lower bound on p∗:

maximize g(λ)

subject to λ� 0.

λ dual feasible if λ� 0 and g(λ)>−∞.λ∗ dual optimal or optimal Lagrange multipliers if they are optimal for theLagrange dual problem.

Lagrange dual problem often easier to solve (simpler constraints).d∗ can be used as stopping criterion for primal optimization.

Dual can reveal hidden structure in the solution.

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 27 / 38

Page 63: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

The Lagrange Dual Problem: Search for Best Lower Bound

The Lagrange dual problem is a search for best lower bound on p∗:

maximize g(λ)

subject to λ� 0.

λ dual feasible if λ� 0 and g(λ)>−∞.λ∗ dual optimal or optimal Lagrange multipliers if they are optimal for theLagrange dual problem.

Lagrange dual problem often easier to solve (simpler constraints).d∗ can be used as stopping criterion for primal optimization.Dual can reveal hidden structure in the solution.

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 27 / 38

Page 64: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Recap Lagrangian Duality

Lagrangian L(x ,λ) = f0(x)+∑m

i=1λi fi (x), with λi multipliers / dual variables

Equivalence to original optimization problem:

minimize f0(x)

subject to fi (x)6 0, i = 1, . . . ,m⇒ p∗ = inf

xsupλ�0

[L(x ,λ)]

Weak duality p∗ > supλ�0,ν infx [L(x ,λ)] = d∗

Dual function g(λ) = infx L(x ,λ) = infx (f0(x)+∑m

i=1λi fi (x)) is always concaveConvex problems (fi convex) have strong duality p∗ = d∗

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 28 / 38

Page 65: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Recap Lagrangian Duality

Lagrangian L(x ,λ) = f0(x)+∑m

i=1λi fi (x), with λi multipliers / dual variablesEquivalence to original optimization problem:

minimize f0(x)

subject to fi (x)6 0, i = 1, . . . ,m⇒ p∗ = inf

xsupλ�0

[L(x ,λ)]

Weak duality p∗ > supλ�0,ν infx [L(x ,λ)] = d∗

Dual function g(λ) = infx L(x ,λ) = infx (f0(x)+∑m

i=1λi fi (x)) is always concaveConvex problems (fi convex) have strong duality p∗ = d∗

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 28 / 38

Page 66: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Recap Lagrangian Duality

Lagrangian L(x ,λ) = f0(x)+∑m

i=1λi fi (x), with λi multipliers / dual variablesEquivalence to original optimization problem:

minimize f0(x)

subject to fi (x)6 0, i = 1, . . . ,m⇒ p∗ = inf

xsupλ�0

[L(x ,λ)]

Weak duality p∗ > supλ�0,ν infx [L(x ,λ)] = d∗

Dual function g(λ) = infx L(x ,λ) = infx (f0(x)+∑m

i=1λi fi (x)) is always concaveConvex problems (fi convex) have strong duality p∗ = d∗

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 28 / 38

Page 67: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Recap Lagrangian Duality

Lagrangian L(x ,λ) = f0(x)+∑m

i=1λi fi (x), with λi multipliers / dual variablesEquivalence to original optimization problem:

minimize f0(x)

subject to fi (x)6 0, i = 1, . . . ,m⇒ p∗ = inf

xsupλ�0

[L(x ,λ)]

Weak duality p∗ > supλ�0,ν infx [L(x ,λ)] = d∗

Dual function g(λ) = infx L(x ,λ) = infx (f0(x)+∑m

i=1λi fi (x)) is always concave

Convex problems (fi convex) have strong duality p∗ = d∗

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 28 / 38

Page 68: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Recap Lagrangian Duality

Lagrangian L(x ,λ) = f0(x)+∑m

i=1λi fi (x), with λi multipliers / dual variablesEquivalence to original optimization problem:

minimize f0(x)

subject to fi (x)6 0, i = 1, . . . ,m⇒ p∗ = inf

xsupλ�0

[L(x ,λ)]

Weak duality p∗ > supλ�0,ν infx [L(x ,λ)] = d∗

Dual function g(λ) = infx L(x ,λ) = infx (f0(x)+∑m

i=1λi fi (x)) is always concaveConvex problems (fi convex) have strong duality p∗ = d∗

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 28 / 38

Page 69: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Table of Contents

1 Convex Sets and Functions

2 The General Optimization Problem

3 Lagrangian Duality: Convexity not required

4 Convex Optimization

5 Complementary Slackness

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 29 / 38

Page 70: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Convex Optimization

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 30 / 38

Page 71: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Convex Optimization Problem: Standard Form

Convex Optimization Problem: Standard Form

minimize f0(x)

subject to fi (x)6 0, i = 1, . . . ,m

where f0, . . . , fm are convex functions.

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 31 / 38

Page 72: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Strong Duality for Convex Problems

For a convex optimization problems, we usually have strong duality, but not always.For example:

minimize e−x

subject to x2/y 6 0y > 0

The additional conditions needed are called constraint qualifications.

Example from Laurent El Ghaoui’s EE 227A: Lecture 8 Notes, Feb 9, 2012

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 32 / 38

Page 73: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Slater’s Constraint Qualifications for Strong Duality

Sufficient conditions for strong duality in a convex problem.

Roughly: the problem must be strictly feasible.Qualifications when problem domain1 D⊂ Rn is an open set:

Strict feasibility is sufficient. (∃x fi (x)< 0 for i = 1, . . . ,m)For any affine inequality constraints, fi (x)6 0 is sufficient.

Otherwise, see notes or BV Section 5.2.3, p. 226.

1D is the set where all functions are defined, NOT the feasible set.M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 33 / 38

Page 74: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Slater’s Constraint Qualifications for Strong Duality

Sufficient conditions for strong duality in a convex problem.Roughly: the problem must be strictly feasible.

Qualifications when problem domain1 D⊂ Rn is an open set:Strict feasibility is sufficient. (∃x fi (x)< 0 for i = 1, . . . ,m)For any affine inequality constraints, fi (x)6 0 is sufficient.

Otherwise, see notes or BV Section 5.2.3, p. 226.

1D is the set where all functions are defined, NOT the feasible set.M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 33 / 38

Page 75: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Slater’s Constraint Qualifications for Strong Duality

Sufficient conditions for strong duality in a convex problem.Roughly: the problem must be strictly feasible.Qualifications when problem domain1 D⊂ Rn is an open set:

Strict feasibility is sufficient. (∃x fi (x)< 0 for i = 1, . . . ,m)For any affine inequality constraints, fi (x)6 0 is sufficient.

Otherwise, see notes or BV Section 5.2.3, p. 226.

1D is the set where all functions are defined, NOT the feasible set.M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 33 / 38

Page 76: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Slater’s Constraint Qualifications for Strong Duality

Sufficient conditions for strong duality in a convex problem.Roughly: the problem must be strictly feasible.Qualifications when problem domain1 D⊂ Rn is an open set:

Strict feasibility is sufficient. (∃x fi (x)< 0 for i = 1, . . . ,m)For any affine inequality constraints, fi (x)6 0 is sufficient.

Otherwise, see notes or BV Section 5.2.3, p. 226.

1D is the set where all functions are defined, NOT the feasible set.M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 33 / 38

Page 77: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Table of Contents

1 Convex Sets and Functions

2 The General Optimization Problem

3 Lagrangian Duality: Convexity not required

4 Convex Optimization

5 Complementary Slackness

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 34 / 38

Page 78: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Complementary Slackness

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 35 / 38

Page 79: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Complementary Slackness

Consider a general optimization problem (i.e. not necessarily convex).

If we have strong duality, we get an interesting relationship betweenthe optimal Lagrange multiplier λi andthe ith constraint at the optimum: fi (x∗)

Relationship is called “complementary slackness”:

λ∗i fi (x∗) = 0

Always have Lagrange multiplier is zero or constraint is active at optimum or both.

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 36 / 38

Page 80: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Complementary Slackness

Consider a general optimization problem (i.e. not necessarily convex).If we have strong duality, we get an interesting relationship between

the optimal Lagrange multiplier λi andthe ith constraint at the optimum: fi (x∗)

Relationship is called “complementary slackness”:

λ∗i fi (x∗) = 0

Always have Lagrange multiplier is zero or constraint is active at optimum or both.

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 36 / 38

Page 81: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Complementary Slackness

Consider a general optimization problem (i.e. not necessarily convex).If we have strong duality, we get an interesting relationship between

the optimal Lagrange multiplier λi andthe ith constraint at the optimum: fi (x∗)

Relationship is called “complementary slackness”:

λ∗i fi (x∗) = 0

Always have Lagrange multiplier is zero or constraint is active at optimum or both.

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 36 / 38

Page 82: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Complementary Slackness

Consider a general optimization problem (i.e. not necessarily convex).If we have strong duality, we get an interesting relationship between

the optimal Lagrange multiplier λi andthe ith constraint at the optimum: fi (x∗)

Relationship is called “complementary slackness”:

λ∗i fi (x∗) = 0

Always have Lagrange multiplier is zero or constraint is active at optimum or both.

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 36 / 38

Page 83: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Complementary Slackness “Sandwich Proof”

Assume strong duality: p∗ = d∗ in a general optimization problemLet x∗ be primal optimal and λ∗ be dual optimal. Then:

f0(x∗) = g(λ∗) = inf

xL(x ,λ∗)

(strong duality and definition)

6 L(x∗,λ∗)

= f0(x∗)+

m∑i=1

λ∗i fi (x∗)︸ ︷︷ ︸

60

6 f0(x∗).

Each term in sum∑

i=1λ∗i fi (x

∗) must actually be 0. That is

λ∗i fi (x∗) = 0, i = 1, . . . ,m .

This condition is known as complementary slackness.

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 37 / 38

Page 84: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Complementary Slackness “Sandwich Proof”

Assume strong duality: p∗ = d∗ in a general optimization problemLet x∗ be primal optimal and λ∗ be dual optimal. Then:

f0(x∗) = g(λ∗) = inf

xL(x ,λ∗) (strong duality and definition)

6 L(x∗,λ∗)

= f0(x∗)+

m∑i=1

λ∗i fi (x∗)︸ ︷︷ ︸

60

6 f0(x∗).

Each term in sum∑

i=1λ∗i fi (x

∗) must actually be 0. That is

λ∗i fi (x∗) = 0, i = 1, . . . ,m .

This condition is known as complementary slackness.

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 37 / 38

Page 85: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Complementary Slackness “Sandwich Proof”

Assume strong duality: p∗ = d∗ in a general optimization problemLet x∗ be primal optimal and λ∗ be dual optimal. Then:

f0(x∗) = g(λ∗) = inf

xL(x ,λ∗) (strong duality and definition)

6 L(x∗,λ∗)

= f0(x∗)+

m∑i=1

λ∗i fi (x∗)︸ ︷︷ ︸

60

6 f0(x∗).

Each term in sum∑

i=1λ∗i fi (x

∗) must actually be 0. That is

λ∗i fi (x∗) = 0, i = 1, . . . ,m .

This condition is known as complementary slackness.

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 37 / 38

Page 86: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Complementary Slackness “Sandwich Proof”

Assume strong duality: p∗ = d∗ in a general optimization problemLet x∗ be primal optimal and λ∗ be dual optimal. Then:

f0(x∗) = g(λ∗) = inf

xL(x ,λ∗) (strong duality and definition)

6 L(x∗,λ∗)

= f0(x∗)+

m∑i=1

λ∗i fi (x∗)︸ ︷︷ ︸

60

6 f0(x∗).

Each term in sum∑

i=1λ∗i fi (x

∗) must actually be 0. That is

λ∗i fi (x∗) = 0, i = 1, . . . ,m .

This condition is known as complementary slackness.

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 37 / 38

Page 87: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Complementary Slackness “Sandwich Proof”

Assume strong duality: p∗ = d∗ in a general optimization problemLet x∗ be primal optimal and λ∗ be dual optimal. Then:

f0(x∗) = g(λ∗) = inf

xL(x ,λ∗) (strong duality and definition)

6 L(x∗,λ∗)

= f0(x∗)+

m∑i=1

λ∗i fi (x∗)︸ ︷︷ ︸

60

6 f0(x∗).

Each term in sum∑

i=1λ∗i fi (x

∗) must actually be 0. That is

λ∗i fi (x∗) = 0, i = 1, . . . ,m .

This condition is known as complementary slackness.

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 37 / 38

Page 88: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Complementary Slackness “Sandwich Proof”

Assume strong duality: p∗ = d∗ in a general optimization problemLet x∗ be primal optimal and λ∗ be dual optimal. Then:

f0(x∗) = g(λ∗) = inf

xL(x ,λ∗) (strong duality and definition)

6 L(x∗,λ∗)

= f0(x∗)+

m∑i=1

λ∗i fi (x∗)︸ ︷︷ ︸

60

6 f0(x∗).

Each term in sum∑

i=1λ∗i fi (x

∗) must actually be 0. That is

λ∗i fi (x∗) = 0, i = 1, . . . ,m .

This condition is known as complementary slackness.M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 37 / 38

Page 89: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Result of “Sandwich Proof” and Consequences

Let x∗ be primal optimal and λ∗ be dual optimal.If we have strong duality, then

p∗ = d∗ = f0(x∗) = g(λ∗) = L(x∗,λ∗)

and we have complementary slacknessλ∗i fi (x

∗) = 0, i = 1, . . . ,m.

From the proof, we can also conclude that

L(x∗,λ∗) = infxL(x ,λ∗).

If x 7→ L(x ,λ∗) is differentiable, then we must have ∇L(x∗,λ∗) = 0.

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 38 / 38

Page 90: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Result of “Sandwich Proof” and Consequences

Let x∗ be primal optimal and λ∗ be dual optimal.If we have strong duality, then

p∗ = d∗ = f0(x∗) = g(λ∗) = L(x∗,λ∗)

and we have complementary slacknessλ∗i fi (x

∗) = 0, i = 1, . . . ,m.

From the proof, we can also conclude that

L(x∗,λ∗) = infxL(x ,λ∗).

If x 7→ L(x ,λ∗) is differentiable, then we must have ∇L(x∗,λ∗) = 0.

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 38 / 38

Page 91: LagrangianDualityandConvexOptimization · LagrangianDualityandConvexOptimization MarylouGabrié&HeHe Materialoriginallydesignedby: JuliaKempe&DavidRosenberg CDS, NYU February16,2021

Result of “Sandwich Proof” and Consequences

Let x∗ be primal optimal and λ∗ be dual optimal.If we have strong duality, then

p∗ = d∗ = f0(x∗) = g(λ∗) = L(x∗,λ∗)

and we have complementary slacknessλ∗i fi (x

∗) = 0, i = 1, . . . ,m.

From the proof, we can also conclude that

L(x∗,λ∗) = infxL(x ,λ∗).

If x 7→ L(x ,λ∗) is differentiable, then we must have ∇L(x∗,λ∗) = 0.

M.G. H.H. D.R. & J. K. (CDS, NYU) DS-GA 1003 February 16, 2021 38 / 38