
Slope and geometry in variational mathematics

Dmitriy Drusvyatskiy
School of ORIE, Cornell University

Joint work with Aris Daniilidis (Barcelona), Alex D. Ioffe (Technion), Martin Larsson (Lausanne), and Adrian S. Lewis (Cornell)

January 29, 2013

Fix a metric space (X, d) and a function f : X → R.

Slope: “fastest instantaneous rate of decrease”

|∇f|(x̄) := limsup_{x → x̄} [f(x̄) − f(x)] / d(x̄, x).

• If f : R^n → R is smooth, then |∇f|(x̄) = |∇f(x̄)|.

Critical points:

x̄ is critical for f ⇐⇒ |∇f|(x̄) = 0.

Eg: the origin is critical.

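For smooth f the slope reduces to the gradient norm, and that identity is easy to check numerically. A minimal sketch (illustration only; the function f(x, y) = x² + 3y² and all helper names are assumptions, not from the talk):

```python
import math

def f(x, y):
    return x**2 + 3*y**2

def grad_f(x, y):
    return (2*x, 6*y)

def slope_estimate(x, y, radius=1e-6, samples=360):
    # Approximate limsup_{u -> (x,y)} [f(x,y) - f(u)] / d((x,y), u)
    # by sampling points u on a small circle of the given radius.
    best = 0.0
    for k in range(samples):
        t = 2*math.pi*k/samples
        u, v = x + radius*math.cos(t), y + radius*math.sin(t)
        best = max(best, (f(x, y) - f(u, v)) / radius)
    return best

gx, gy = grad_f(1.0, 2.0)
grad_norm = math.hypot(gx, gy)   # |grad f(1,2)| = sqrt(4 + 144)
est = slope_estimate(1.0, 2.0)   # agrees with grad_norm up to discretization
```

The sampled maximum of the difference quotient recovers the steepest local rate of decrease, matching |∇f(x̄)| to a few decimal places.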

Method of alternating projections

Common problem: given sets A, B ⊂ R^n, find some point x ∈ A ∩ B.

(Figure: sets A and B with a point x.)

Distance and projection:

d_B(x) = min_{y ∈ B} |x − y|  and  P_B(x) = {nearest points of B to x}.

Finding points in P_A and P_B is often easy!

Eg 1 (simple example): linear programming:

{x : x ≥ 0} ∩ {x : Ax = b}.

Eg 2 (more interesting): low-order control:

{X ⪰ 0 : rank X ≤ r} ∩ {X : A(X) = b}.

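Both feasibility examples above come with cheap projections. A sketch of the two projection maps in the simplest cases — the nonnegative orthant and a single hyperplane {x : ⟨a, x⟩ = b}, i.e. one row of Ax = b (function names are illustrative assumptions):

```python
def proj_orthant(x):
    # Nearest point of {x : x >= 0}: clip negative coordinates to zero.
    return [max(xi, 0.0) for xi in x]

def proj_hyperplane(x, a, b):
    # Nearest point of {x : <a, x> = b}: move along the normal direction a.
    t = (sum(ai*xi for ai, xi in zip(a, x)) - b) / sum(ai*ai for ai in a)
    return [xi - t*ai for ai, xi in zip(a, x)]

p = proj_orthant([1.0, -2.0, 3.0])                 # [1.0, 0.0, 3.0]
q = proj_hyperplane([0.0, 0.0], [1.0, 1.0], 2.0)   # [1.0, 1.0]
```

For a full system Ax = b one would instead solve a least-squares step, but the one-row case already shows why these projections are "easy."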

Method of alternating projections

Method of alternating projections (von Neumann ’33):

x_{k+1} ∈ P_B(x_k),   x_{k+2} ∈ P_A(x_{k+1}).

(Figure: alternating projections between A and B, starting from x0.)

The “angle” between A and B drives the convergence!

Quantifying the angle:

ψ(x, y) := |x − y| if x ∈ A, y ∈ B;  +∞ otherwise.

Comparison of |∇ψ_y|(x) and |∇ψ_x|(y) quantifies the angle!

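The iteration x_{k+1} ∈ P_B(x_k), x_{k+2} ∈ P_A(x_{k+1}) is easy to run once the projections are in hand. A toy sketch in R² with A the nonnegative orthant and B the hyperplane {x : x1 + x2 = 2} (this concrete pair is an assumption for illustration):

```python
def proj_A(x):
    # A = nonnegative orthant in R^2
    return [max(v, 0.0) for v in x]

def proj_B(x):
    # B = hyperplane {x : x1 + x2 = 2}
    t = (x[0] + x[1] - 2.0) / 2.0
    return [x[0] - t, x[1] - t]

x = [-3.0, 1.0]
for _ in range(60):
    x = proj_A(proj_B(x))   # one alternating-projection cycle
# x is now essentially the intersection point (0, 2)
```

In this example each cycle halves the distance to A ∩ B, a concrete instance of the angle between the sets driving a linear rate.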

Outline

• Variational geometry:
  — Subdifferentials
  — The semi-algebraic case
  — Slope and error bounds
• Applications:
  — Alternating projections and transversality
  — Steepest descent trajectories
• To smooth or not to smooth?
• Active sets in optimization

Subdifferentials

Fréchet subdifferential:

v ∈ ∂f(x̄) ⇐⇒ x̄ is critical for f − ⟨v, ·⟩,

or equivalently

v ∈ ∂f(x̄) ⇐⇒ f(x) ≥ f(x̄) + ⟨v, x − x̄⟩ + o(|x − x̄|).

• Eg: if f is smooth, then ∂f = {∇f}.

Example (y = −|x|):

∂(−|·|)(x) = {1} if x < 0;  ∅ if x = 0;  {−1} if x > 0.

Subdifferential graph:

gph ∂f := {(x, v) ∈ R^n × R^n : v ∈ ∂f(x)}.

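The inequality characterization of Fréchet subgradients can be probed numerically on a small grid. A rough checker (names and tolerances are illustrative assumptions), applied to f = |·| at 0, where the subdifferential is [−1, 1]:

```python
def is_frechet_subgradient(f, x0, v, radius=1e-4, samples=2001):
    # Test f(x) >= f(x0) + v*(x - x0) + o(|x - x0|) on a small grid
    # around x0 (a necessary check only, up to a tiny slack term).
    h = 2*radius/samples
    for i in range(samples):
        x = x0 - radius + (i + 0.5)*h
        if f(x) < f(x0) + v*(x - x0) - 1e-8:
            return False
    return True

ok = is_frechet_subgradient(abs, 0.0, 0.5)    # 0.5 lies in [-1, 1]
bad = is_frechet_subgradient(abs, 0.0, 2.0)   # 2.0 does not
```

The candidate v = 2 fails because |x| falls below the affine minorant 2x immediately to the right of 0.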

Subdifferentials

Fundamental theoretical and computational hurdle:
1. |∇f| is not lsc,
2. gph ∂f is not closed.

Limiting slope:

|∇f|‾(x̄) := liminf_{x → x̄} |∇f|(x).

Limiting subdifferential ∂‾f:

gph ∂‾f := cl(gph ∂f).

Relationship (Ioffe ’00):

|∇f|‾(x̄) = dist(0, ∂‾f(x̄)).


Subdifferentials

Are these notions adequate?

Pathology: gph ∂f can be very large!

What do we expect the size of gph ∂f to be?

• If f : R^n → R is smooth, then gph ∂f is an n-dimensional smooth manifold.

• If f : R^n → R is convex, then gph ∂f is an n-dimensional Lipschitz manifold (Minty ’62).

Multiple authors (Rockafellar, Borwein, Wang, …): there are functions f : R^n → R with “2n-dimensional” gph ∂f.


Semi-algebraic geometry

Q ⊂ R^n is semi-algebraic if it is a finite union of solution sets to finitely many polynomial inequalities.

Semi-algebraicity is robust (Tarski–Seidenberg theorem). Eg:

f semi-algebraic =⇒ gph ∂f and |∇f| are semi-algebraic.

Semi-algebraic Q “stratify” into finitely many manifolds {M_i}. Dimension:

dim Q := max_{i=1,…,k} {dim M_i}.


Semi-algebraic geometry

Theorem (D-Ioffe-Lewis). For semi-algebraic f : R^n → R, we have

dim gph ∂f = n,

even locally around any point in gph ∂f.

Conclusion: criticality is meaningful for concrete variational problems!

Semi-algebraic f have only finitely many critical values (cf. Sard’s theorem) ⇒ intervals (a, b) of non-critical values.

What can we learn from non-criticality?


Slope and error bounds

Common problem: estimate

dist(x, [f ≤ r])   (difficult).

“The residual”:

f(x) − r   (easy).

Desirable quality: there exists κ with

dist(x, [f ≤ r]) ≤ κ (f(x) − r).

Restrict f : R^n → R to a “slice” f⁻¹(a, b).

Lemma (Error bound). The following are equivalent.

Non-criticality:   |∇f| ≥ 1/κ.

Error bound:   dist(x, [f ≤ r]) ≤ κ (f(x) − r), whenever r ∈ (a, f(x)).

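The lemma is transparent for f(x) = |x| on a slice away from the origin: the slope is identically 1 there, so κ = 1 works and the error bound holds with equality. A one-line numerical confirmation (the concrete f is an assumption for illustration):

```python
def f(x):
    return abs(x)

def dist_to_sublevel(x, r):
    # [f <= r] is the interval [-r, r]; distance from x to it.
    return max(abs(x) - r, 0.0)

kappa = 1.0          # valid since the slope of |.| equals 1 away from 0
x, r = 5.0, 2.0
lhs = dist_to_sublevel(x, r)   # the difficult quantity
rhs = kappa * (f(x) - r)       # the easy residual bound
```

Here lhs = rhs = 3, so the residual exactly measures the distance to the sublevel set.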

Alternating projections & transversality


Convergence of alternating projections

Indicator function:

δ_A(x) = 0 if x ∈ A;  ∞ if x ∉ A.

Coupling function:

ψ(x, y) = δ_A(x) + |x − y| + δ_B(y).

(Figure: alternating projections between A and B, starting from x0.)

Sufficient condition: there exists κ > 0 with

max{|∇ψ_x|(y), |∇ψ_y|(x)} ≥ κ

for all x ∈ A and y ∈ B, not in A ∩ B. (Proof: error bound.)

Equivalently: for any v = (x − y)/|x − y|, either

dist(v, ∂δ_B(y)) ≥ κ  or  dist(−v, ∂δ_A(x)) ≥ κ.

Normal cone: N_B(y) := ∂δ_B(y).


Convergence of alternating projections

dist(v, N_B(y)) ≥ κ  or  dist(v, −N_A(x)) ≥ κ.

(Figure: normal cones to a set Q at points x1, …, x7.)

Transversality: N_A(x̄) ∩ −N_B(x̄) = {0}.

Local convergence (D-Ioffe-Lewis ’13):

A and B transverse at x̄ =⇒ local R-linear convergence.

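Transversality forces a positive angle between the sets, and the angle shows up directly in the R-linear rate. A sketch with two lines through the origin in R² meeting at angle θ = π/6 — for this classical setup each full projection cycle contracts distances to the intersection by exactly cos²θ = 3/4 (the concrete setup is an assumption for illustration):

```python
import math

theta = math.pi/6
u = (1.0, 0.0)                           # direction of line A
w = (math.cos(theta), math.sin(theta))   # direction of line B

def proj_line(x, d):
    # Orthogonal projection onto the line spanned by the unit vector d.
    t = x[0]*d[0] + x[1]*d[1]
    return (t*d[0], t*d[1])

x = (2.0, 5.0)
norms = []
for _ in range(8):
    x = proj_line(proj_line(x, w), u)    # one B-then-A cycle
    norms.append(math.hypot(x[0], x[1]))

rates = [b/a for a, b in zip(norms, norms[1:])]
# after the first cycle, every ratio equals cos(theta)**2 = 0.75
```

A smaller angle (near-tangential sets) pushes cos²θ toward 1 and the convergence degrades, which is exactly the failure mode that transversality rules out.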

Steepest descent trajectories


Steepest descent trajectories

Notation: X is a complete metric space.

Bounded speed: a curve x : [0, T] → X is 1-Lipschitz if

dist(x(t), x(s)) ≤ |t − s|.

What are trajectories of steepest descent?

Motivation: 1-Lipschitz curves x : [0, T] → X satisfy

|∇(f ∘ x)| ≤ |∇f|(x).

Definition (Trajectories of near-steepest descent). A curve x : [0, T] → X is a trajectory of near-steepest descent if
• x is 1-Lipschitz,
• f ∘ x is decreasing,
• |(f ∘ x)′| ≥ |∇f|(x), a.e. on [0, T].


Example

Figure: f(x, y) = max{x + y, |x − y|} + x(x + 1) + y(y + 1) + 100

Subgradient dynamical systems

Theorem (D-Ioffe-Lewis). For reasonable f : R^n → R and a curve x : [0, T] → R^n, the following are equivalent.
1. x is a curve of near-steepest descent,
2. f ∘ x is decreasing and (after reparametrizing)

ẋ ∈ −∂f(x), a.e. on [0, T].

Remark: when 2 holds,

ẋ is the shortest element of −∂f(x), a.e. on [0, T].

Reasonable conditions: f is smooth, convex, or semi-algebraic.


Existence

Theorem (Ambrosio et al. ’05, De Giorgi ’93). For reasonable f : X → R, there exist curves of near-steepest descent starting from any point.

Proof ingredients:
• Moreau–Yosida approximation:

x_{k+1} = argmin_{x ∈ X} { f(x) + (1/2τ) d²(x, x_k) }.

• Extraneous topologies ⇒ existence of minimizers and convergence.

Proof is opaque and uses heavy machinery!

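For f(x) = |x| on the real line, the Moreau–Yosida step has a closed form (soft-thresholding), and iterating it descends at constant speed until the minimizer is reached. A sketch (the choice of f and of the step τ are assumptions for illustration):

```python
def prox_abs(z, tau):
    # argmin_x { |x| + (1/(2*tau)) * (x - z)**2 }: soft-thresholding.
    if z > tau:
        return z - tau
    if z < -tau:
        return z + tau
    return 0.0

tau = 0.5
x = 3.2
path = [x]
for _ in range(10):
    x = prox_abs(x, tau)
    path.append(x)
# each step moves by tau toward 0, then the iterates stop at the minimizer
```

The piecewise-linear path traced by the iterates is exactly the kind of discrete trajectory whose limit the existence theorem extracts.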

New proof idea: error bound lemma

Equipartition: for η > 0, partition the values

f(x0) = τ_0 > τ_1 > … > τ_k = f(x0) − η.

Initialize j = 0;
while j < k do
    x_{j+1} ← P_{[f ≤ τ_{j+1}]}(x_j);
    j ← j + 1;
end

Consider the resulting trajectories as k → ∞.

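Reading the equipartition as running from f(x0) down to f(x0) − η, the scheme is a sequence of projections onto shrinking sublevel sets. A one-dimensional sketch with f = |·| (the concrete data are assumptions for illustration); the polyline traced has length exactly η, i.e. the trajectory descends at unit speed:

```python
def proj_sublevel(x, r):
    # Nearest point of the sublevel set [|.| <= r] = [-r, r].
    return max(-r, min(x, r))

x0, eta, k = 4.0, 3.0, 6
# equipartition: tau_0 = f(x0) > tau_1 > ... > tau_k = f(x0) - eta
taus = [abs(x0) - eta*j/k for j in range(k + 1)]

x, traj = x0, [x0]
for j in range(1, k + 1):
    x = proj_sublevel(x, taus[j])
    traj.append(x)

length = sum(abs(b - a) for a, b in zip(traj, traj[1:]))   # equals eta
```

Refining the partition (k → ∞) turns these polylines into the near-steepest-descent curves of the theorem.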

Lengths of steepest descent curves

One motivation: algorithm complexity (eg Attouch et al. ’11).

Theorem (D-Ioffe-Lewis). If f is semi-algebraic, then any bounded trajectory of near-steepest descent (with maximal domain) has bounded length and converges to a critical point.

Historic remarks:
• Łojasiewicz famously proved it for real analytic functions.
• True for convex f (Daniilidis et al. ’12).
• Not true for C∞ functions (Palis, de Melo).


To smooth or not to smooth?


Approximation of functions

Mollification:

f_ε(x) := (f ∗ φ_ε)(x) = ∫_{R^n} φ_ε(x − y) f(y) dy,

where φ_ε is a scaled standard bump function φ.

• Downside: not much control on derivatives.

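A numerical sketch of mollification for f = |·| (the quadrature scheme and all parameters are assumptions for illustration): away from the kink the mollified value matches f, while at the kink it becomes a strictly positive average — the corner is smoothed out:

```python
import math

def bump(t, eps):
    # Scaled standard bump function, supported on (-eps, eps).
    if abs(t) >= eps:
        return 0.0
    return math.exp(-1.0 / (1.0 - (t/eps)**2))

def mollify(f, x, eps, n=2001):
    # (f * phi_eps)(x) by a midpoint Riemann sum, with phi
    # normalized to unit mass on the same grid.
    h = 2*eps/n
    ts = [-eps + (i + 0.5)*h for i in range(n)]
    w = [bump(t, eps) for t in ts]
    mass = sum(w)*h
    return sum(wi*f(x - t) for wi, t in zip(w, ts))*h/mass

at_kink = mollify(abs, 0.0, 0.1)   # positive: averaging rounds the corner
far_off = mollify(abs, 1.0, 0.1)   # f is linear near 1, so the value is 1
```

The "no control on derivatives" downside is visible here too: nothing about this construction respects the geometry of a set Q, which motivates the stratified approximation below.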

Approximation of functions

Set-up: Q ↪ R^n, together with f : R^n → R.

Assume Q is a disjoint union of C∞ manifolds:

Q = M_1 ∪ M_2 ∪ … ∪ M_{k−1} ∪ M_k.

Approximation of functions

Motivation: integration by parts,

∫_Q g Δf dx = ∫_{∂Q} g ⟨∇f, n⟩ dS − ∫_Q ⟨∇g, ∇f⟩ dx.

Goal: approximate f by a C²-smooth f̃ so that ∇f̃ ⊥ n.

Theorem (D-Larsson). Given a continuous f : R^n → R and any ε > 0, there exists a C¹-smooth f̃ satisfying

1. Closeness: |f̃(x) − f(x)| < ε for all x ∈ R^n,

2. Neumann boundary condition: x ∈ M_i =⇒ ∇f̃(x) ∈ T_x M_i,

provided {M_i} is a Whitney stratification of Q.

• For semi-algebraic Q, Whitney stratifications always exist!


Approximation of functions

What about higher orders of smoothness? This needs a more stringent stratification.

Definition (Normally flat stratification). The stratification {M_i} is normally flat if M_i ⊂ cl M_j implies

P_{M_i}(x) = P_{M_i} ∘ P_{M_j}(x) for all x near M_i and M_j.

(Figure: projecting x onto M_j and then onto M_i gives P_{M_i}(x).)

• If {M_i} is normally flat, one can guarantee that f̃ is C∞-smooth!

Example: the partition of a polyhedron into open faces is normally flat.


Approximation of functions

Any other examples?

Notation:
• S^n are the n × n symmetric matrices.
• λ : S^n → R^n is the eigenvalue map

λ(A) = (λ_1(A), …, λ_n(A)).

Normally flat stratifications “lift” (D-Larsson ’12):

A symmetric stratification {M_i} of Q ⊂ R^n is normally flat ⇐⇒ the stratification {λ⁻¹(M_i)} of λ⁻¹(Q) ⊂ S^n is normally flat.


Approximation of functions

Examples:
Sn+ = {X ∈ Sn : X ⪰ 0},
Rn×n+ = {X ∈ Rn×n : det(X) ≥ 0},
Bn2 = {X ∈ Rn×n : σmax(X) ≤ 1}.

=⇒ The strongest approximation results become available!

Applications:
1. Probability: properties of the matrix-valued Bessel process (Larsson ’12).
2. PDEs: geometry of Sobolev spaces.
3. More forthcoming...

28/35
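Membership in these example sets reduces to eigenvalue and singular-value computations, which is one way the spectral description is used in practice. A small illustrative check (the helper names are ours, not the talk's):

```python
import numpy as np

# Numerical membership tests for two of the example sets on the slide.

def in_Sn_plus(X, tol=1e-10):
    """X in S^n_+  <=>  all eigenvalues of the symmetric X are >= 0."""
    return bool(np.all(np.linalg.eigvalsh(X) >= -tol))

def in_B2(X, tol=1e-10):
    """X in the spectral-norm ball  <=>  sigma_max(X) <= 1."""
    return bool(np.linalg.svd(X, compute_uv=False).max() <= 1 + tol)

B = np.array([[2.0, 1.0],
              [1.0, 2.0]])
assert in_Sn_plus(B)          # eigenvalues are 1 and 3, both nonnegative
assert not in_B2(B)           # sigma_max(B) = 3 > 1
assert in_B2(B / 3)           # rescaling puts it on the boundary of the ball
```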


Active sets in optimization.

29/35

Active sets in optimization

Figure: Q is the 4 × 4 Toeplitz spectrahedron

Definition (Partial smoothness)
A convex set Q is partly smooth relative to M ⊂ Q if
1. (Smoothness) M is a smooth manifold,
2. (Sharpness) NM = span NQ on M,
3. (Continuity) NQ varies continuously on M.

30/35


Active sets in optimization

Why do optimizers care?

• Many optimization algorithms identify M in finite time!

Eg: gradient projection, Newton-like, and proximal point methods.

=⇒ Acceleration strategies!

How to see this structure in eigenvalue optimization?

Partly smooth manifolds “lift” (Daniilidis-D-Lewis):

Q partly smooth relative to M
⇐⇒ λ−1(Q) partly smooth relative to λ−1(M),

provided Q is symmetric.

31/35
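The finite-time identification phenomenon can be seen in the simplest setting: projected gradient on the nonnegative orthant. In the toy sketch below (the objective, data, and step size are our illustrative assumptions), the iterates land exactly on the active manifold M = {(t, 0) : t > 0} after finitely many steps, rather than merely approaching it:

```python
import numpy as np

# Projected gradient on:  min (1/2)||x - c||^2  subject to  x >= 0,
# with c = (1, -1).  The solution is (1, 0), which lies on the manifold
# M = {(t, 0) : t > 0}.  The projection zeroes the second coordinate
# exactly after finitely many iterations, and it stays zero.

c = np.array([1.0, -1.0])
x = np.array([3.0, 2.0])
step = 0.5
history = []
for k in range(20):
    x = np.maximum(x - step * (x - c), 0.0)   # gradient step, then projection
    history.append(x.copy())

# the active constraint is identified exactly (second coordinate == 0.0)
assert history[-1][1] == 0.0
print(history[-1])   # close to the solution (1, 0)
```

Once the active manifold is identified, one can switch to a faster method restricted to M, which is the acceleration strategy the slide alludes to.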


Nonsmoothness under symmetry

Actions:
• the permutation group Σ acts on Rn (by permutation),
• the orthogonal group On acts on Sn (by conjugation).

Correspondence:
On-invariant (spectral) ⇐⇒ Σ-invariant (symmetric).

Q is spectral ⇐⇒ Q = λ−1(M) with M symmetric.

Eg: {X : X ⪰ 0} = λ−1({x : x ≥ 0}).

THE TRANSFER PRINCIPLE:
Variational properties of Q and M are in 1-to-1 correspondence!

Eg: convexity, stratifiability, Whitney conditions, normal flatness, partial smoothness, . . .

32/35
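The simplest consequence of the transfer principle can be sanity-checked numerically: the nonnegative orthant M = {x : x ≥ 0} is convex and symmetric, so its spectral lift λ−1(M), the PSD cone, must be convex too. A randomized check of that consequence (not a proof, and the helper name is ours):

```python
import numpy as np

# Convexity transfers from the symmetric set M = {x : x >= 0} to its
# spectral lift lambda^{-1}(M) = {X : X psd}: random convex combinations
# of PSD matrices stay PSD.

rng = np.random.default_rng(2)

def random_psd(n):
    """A random PSD matrix; its eigenvalue vector lies in M."""
    B = rng.standard_normal((n, n))
    return B @ B.T

for _ in range(100):
    X, Y = random_psd(3), random_psd(3)
    t = rng.uniform()
    Z = t * X + (1 - t) * Y              # convex combination in S^3
    # Z remains in lambda^{-1}(M): all eigenvalues nonnegative (up to roundoff)
    assert np.all(np.linalg.eigvalsh(Z) >= -1e-10)
```

The theorem on the slide is the far stronger statement that the correspondence also carries over stratifiability, Whitney conditions, normal flatness, and partial smoothness.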


Summary

[Diagram: Slope and Geometry, feeding into Analysis and Algorithms]

33/35

Thank you.

34/35

References
• Optimality, identifiability, and sensitivity, D-Lewis, submitted to Math. Programming Ser. A.
• The dimension of semi-algebraic subdifferential graphs, D-Ioffe-Lewis, Nonlinear Analysis, 75(3), 1231-1245, 2012.
• Semi-algebraic functions have small subdifferentials, D-Lewis, to appear in Math. Prog. Ser. B.
• Tilt stability, uniform quadratic growth, and strong metric regularity of the subdifferential, D-Lewis, to appear in SIAM J. on Opt.
• Approximating functions on stratifiable domains, D-Larsson, submitted to the Trans. of the AMS.
• Generic nondegeneracy in convex optimization, D-Lewis, Proc. Amer. Math. Soc. 139 (2011), 2519-2527.
• Trajectories of subgradient dynamical systems, D-Ioffe-Lewis, submitted to SIAM J. on Control and Opt.
• Spectral lifts of identifiable sets and partly smooth manifolds, Daniilidis-D-Lewis, preprint.

Available at http://people.orie.cornell.edu/dd379

35/35
