MADMM: a generic algorithm for non-smooth optimization on manifolds
Michael Bronstein
Faculty of Informatics, University of Lugano, Switzerland
Perceptual Computing Group, Intel Corporation, Israel
Louvain-la-Neuve, 25 September 2015
[Figure: diagram of fields: image processing, geometry processing, image analysis, shape analysis, computer vision, computer graphics, pattern recognition, machine learning, graph analysis and processing; labels 2D, 3D, nD]
What is manifold optimization?
Manifold (or manifold-constrained) optimization problem
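$\min_{X \in \mathbb{R}^{n \times m}} f(X) \quad \text{s.t.} \quad X \in \mathcal{M}$
$f : \mathbb{R}^{n \times m} \to \mathbb{R}$ is a smooth function
$\mathcal{M}$ is a Riemannian submanifold of $\mathbb{R}^{n \times m}$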
Absil et al. 2009
Applications
Sphere: principal geodesic analysis [1], 1-bit compressed sensing [2]
Stiefel manifold: eigenvalue, assignment, and Procrustes problems [3], orthogonal dictionary learning [4], binary coding [5]
Product of Stiefel manifolds: functional correspondence [6], manifold learning [7], structure-from-motion [8], sensor localization [9]
Fixed-rank PSD: maxcut problems, sparse PCA [10], matrix completion [11], multidimensional scaling [12]
Oblique: ICA [13], blind source separation [14]
[1] Zhang, Fletcher 2013; [2] Boufounos, Baraniuk 2008; [3] Ten Berghe 1977; [4] Sun et al. 2015; [5] Xia et al. 2015; [6] Kovnatsky et al. 2013; [7] Eynard et al. 2015; [8] Arie-Nachimson et al. 2012; [9] Cucuringu et al. 2012; [10] Journee et al. 2010; [11] Tan et al. 2014; [12] Cayton, Dasgupta 2006; [13] Absil, Gallivan 2006; [14] Kleinsteuber, Shen 2012.
Toy example: eigenvalue problem
$\min_{x \in \mathbb{R}^n} x^\top A x \quad \text{s.t.} \quad x^\top x = 1$
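A quick numerical sanity check of the toy problem may help here. The sketch below assumes a random symmetric matrix A (an illustrative choice, not taken from the talk) and confirms that the constrained minimum equals the smallest eigenvalue of A, attained at the corresponding unit eigenvector.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
B = rng.standard_normal((n, n))
A = (B + B.T) / 2  # illustrative symmetric matrix (an assumption, not from the talk)

# eigh returns eigenvalues in ascending order with orthonormal eigenvectors
eigvals, eigvecs = np.linalg.eigh(A)
x_star = eigvecs[:, 0]          # unit eigenvector of the smallest eigenvalue
f_star = x_star @ A @ x_star    # objective value at x_star

# Random feasible points (unit vectors) for comparison with x_star
X = rng.standard_normal((n, 1000))
X /= np.linalg.norm(X, axis=0)
f_random = np.einsum("ij,ij->j", X, A @ X)  # x^T A x for each column x
```

Every entry of `f_random` is bounded below by `f_star`, which is the Rayleigh quotient bound behind the eigenvalue interpretation.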
Optimization on the manifold: main idea
$\min_{X \in \mathcal{M}} f(X)$
where $f : \mathcal{M} \to \mathbb{R}$ is a function on the manifold (scalar field)
No global system of coordinates
Manifold $\mathcal{M}$ is locally homeomorphic to the tangent space $T_X\mathcal{M}$
Intrinsic gradient $\nabla_{\mathcal{M}} f : \mathcal{M} \to T\mathcal{M}$ such that
$f(\text{“}X + dV\text{”}) = f(X) + \langle \nabla_{\mathcal{M}} f(X), dV \rangle_{T_X\mathcal{M}} + O(\|dV\|^2)$
Intrinsic gradient = projection of the extrinsic gradient
$\nabla_{\mathcal{M}} f(X) = P_{T_X\mathcal{M}} \nabla f(X)$
Exponential map $\exp_X : T_X\mathcal{M} \to \mathcal{M}$
Moving vectors on $\mathcal{M}$ requires parallel transport
Absil et al. 2009
Optimization on the manifold: main idea
[Figure: one iteration on $\mathcal{M}$, built up over four steps, with labels $X^{(k)}$, $\nabla f(X^{(k)})$, $P_{X^{(k)}}$, $\alpha^{(k)} \nabla_{\mathcal{M}} f(X^{(k)})$, $T_{X^{(k)}}\mathcal{M}$, $R_{X^{(k)}}$, $X^{(k+1)}$]
Absil et al. 2009
Optimization on the manifold
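Algorithm 1: Conceptual algorithm for smooth optimization on $\mathcal{M}$
repeat
    Compute extrinsic gradient $\nabla f(X^{(k)})$
    Projection: $\nabla_{\mathcal{M}} f(X^{(k)}) = P_{X^{(k)}}(\nabla f(X^{(k)}))$
    Compute step size $\alpha^{(k)}$ along the descent direction $-\nabla_{\mathcal{M}} f(X^{(k)})$
    Retraction: $X^{(k+1)} = R_{X^{(k)}}(-\alpha^{(k)} \nabla_{\mathcal{M}} f(X^{(k)}))$
    $k \leftarrow k + 1$
until convergence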
Projection and retraction operators are manifold-dependent
Typically expressed in closed form
“Black box”: need to provide only f(X) and gradient ∇f(X)
Absil et al. 2009; Boumal et al. 2014
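The loop of Algorithm 1 can be sketched on the unit sphere, applied to the toy eigenvalue problem. This is a minimal illustration under stated assumptions, not the implementation used in the talk: the function names, the fixed step size (instead of a line search), and the test matrix with eigenvalues 1 to 5 are all choices made here. On the sphere, the projection is $P_x(v) = v - (x^\top v)x$ and one valid retraction is renormalization.

```python
import numpy as np

def project_sphere(x, v):
    """Orthogonal projection of v onto the tangent space of the unit sphere at x."""
    return v - (x @ v) * x

def retract_sphere(x, v):
    """Retraction: step from x along the tangent vector v, then renormalize."""
    y = x + v
    return y / np.linalg.norm(y)

def riemannian_gd(grad_f, x0, step, max_iter=2000, tol=1e-10):
    """Fixed-step Riemannian gradient descent (hypothetical helper for illustration)."""
    x = x0 / np.linalg.norm(x0)
    for _ in range(max_iter):
        g = project_sphere(x, grad_f(x))      # intrinsic gradient
        if np.linalg.norm(g) < tol:           # convergence test
            break
        x = retract_sphere(x, -step * g)      # retraction of the descent step
    return x

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))
A = Q @ np.diag([1.0, 2.0, 3.0, 4.0, 5.0]) @ Q.T   # symmetric, known spectrum

# f(x) = x^T A x has extrinsic gradient 2 A x; only f's gradient is problem-specific
x = riemannian_gd(lambda x: 2 * A @ x, rng.standard_normal(5), step=0.05)
rayleigh = x @ A @ x
```

Only the gradient callback depends on the objective, which is the "black box" property noted above; the projection and retraction depend only on the manifold.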
Prototype problem
Non-smooth manifold optimization problem
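$\min_{X \in \mathcal{M}} f(X) + g(AX)$
$f : \mathbb{R}^{n \times m} \to \mathbb{R}$ is a smooth function
$g : \mathbb{R}^{k \times m} \to \mathbb{R}$ is a non-smooth function
$A$ is a $k \times n$ matrix
$\mathcal{M}$ is a Riemannian submanifold of $\mathbb{R}^{n \times m}$
Typical examples: $g(X) = \|X\|_1$, $\|X\|_{2,1}$, or $\|X\|_*$
Smoothing: approximate
Subgradient: problem dependent
Splitting: problem dependent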
Kovnatsky, B, Glashoff 2015
Smoothing: Chen 2012
Subgradient: Ferreira, Oliveira 1998; Ledyaev, Zhu 2007; Kleinsteuber, Shen 2012
Splitting: Lai, Osher 2014; Neumann et al. 2014; Rosman et al. 2014
Manifold ADMM
Non-smooth manifold optimization problem
$\min_{X \in \mathcal{M}} f(X) + g(AX)$
equivalently written as
$\min_{X \in \mathcal{M},\, Z} f(X) + g(Z) \quad \text{s.t.} \quad Z = AX$
introducing an artificial variable $Z$ and a linear constraint
Apply the method of multipliers only to the constraint $Z = AX$
$\min_{X \in \mathcal{M},\, Z} f(X) + g(Z) + \tfrac{\rho}{2} \|AX - Z + U\|_F^2$
Solve alternating w.r.t. $X$ and $Z$, updating $U \leftarrow U + AX - Z$
Problem breaks into
Smooth manifold optimization sub-problem w.r.t. $X$, and
Non-smooth unconstrained sub-problem w.r.t. $Z$
Hestenes 1969; Powell 1969; Kovnatsky, Glashoff, B 2015
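The variable $U$ in the penalty term is the scaled multiplier. A short derivation, assuming the standard augmented Lagrangian with an unscaled multiplier $Y$ (the symbol $Y$ is introduced here for illustration and does not appear on the slides) and setting $U = Y/\rho$:

```latex
\begin{aligned}
L_\rho(X, Z; Y)
  &= f(X) + g(Z) + \langle Y, AX - Z \rangle + \tfrac{\rho}{2} \|AX - Z\|_F^2 \\
  &= f(X) + g(Z) + \tfrac{\rho}{2} \|AX - Z + U\|_F^2 - \tfrac{\rho}{2} \|U\|_F^2,
  \qquad U = Y / \rho .
\end{aligned}
```

The last term does not depend on $X$ or $Z$, so it can be dropped in both sub-problems, and the multiplier update $Y \leftarrow Y + \rho (AX - Z)$ becomes $U \leftarrow U + AX - Z$.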
MADMM
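Algorithm 2: MADMM for non-smooth optimization on manifold $\mathcal{M}$
Initialize $k \leftarrow 1$, $Z^{(1)} = AX^{(1)}$, $U^{(1)} = 0$
repeat
    X-step: $X^{(k+1)} = \arg\min_{X \in \mathcal{M}} f(X) + \tfrac{\rho}{2} \|AX - Z^{(k)} + U^{(k)}\|_F^2$
    Z-step: $Z^{(k+1)} = \arg\min_{Z} g(Z) + \tfrac{\rho}{2} \|AX^{(k+1)} - Z + U^{(k)}\|_F^2$
    Update $U^{(k+1)} = U^{(k)} + AX^{(k+1)} - Z^{(k+1)}$
    $k \leftarrow k + 1$
until convergence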
Design choices: solver and number of optimization iterations in the X- and Z-steps
In some problems, the X-step and Z-step have a closed form
Parameter ρ > 0 can be chosen fixed or adapted
Kovnatsky, Glashoff, B 2015
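For two of the typical choices of $g$ mentioned earlier, the Z-step has a well-known closed form. The sketch below assumes $g(Z) = \mu \|Z\|_1$ or $g(Z) = \mu \|Z\|_*$; the function names and the weight `mu` are illustrative and not taken from the slides. With $V = AX^{(k+1)} + U^{(k)}$, dividing the Z-step objective by $\rho$ shows it is the proximal operator of the norm with threshold $\mu / \rho$.

```python
import numpy as np

def prox_l1(V, t):
    """argmin_Z t*||Z||_1 + 0.5*||V - Z||_F^2: entrywise soft-thresholding."""
    return np.sign(V) * np.maximum(np.abs(V) - t, 0.0)

def prox_nuclear(V, t):
    """argmin_Z t*||Z||_* + 0.5*||V - Z||_F^2: soft-thresholding of singular values."""
    W, s, Vt = np.linalg.svd(V, full_matrices=False)
    return (W * np.maximum(s - t, 0.0)) @ Vt

def z_step(AX, U, mu, rho, prox=prox_l1):
    """Z-step for g = mu * norm, written in terms of the scaled multiplier U."""
    return prox(AX + U, mu / rho)

V = np.array([[3.0, -0.5], [0.2, -2.0]])
Z = prox_l1(V, 1.0)   # entries with magnitude below 1 are set to zero
```

Entries (or singular values) smaller than the threshold are zeroed, which is how the Z-step produces exactly sparse or low-rank iterates.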
Compressed modes
Laplacian eigenfunctions
The first k eigenfunctions of some Laplacian are used in...
Spectral clustering, dimensionality reduction, spectral distances
Ng et al. 2001; Belkin, Niyogi 2001; Coifman, Lafon 2006
Laplacian eigenfunctions
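Find the first $k$ eigenfunctions of an $n \times n$ Laplacian matrix $\Delta$
$\min_{\Phi \in \mathbb{R}^{n \times k}} \operatorname{tr}(\Phi^\top \Delta \Phi) \quad \text{s.t.} \quad \Phi^\top \Phi = I$
$\operatorname{tr}(\Phi^\top \Delta \Phi) = \sum_{ij} w_{ij} \|\phi_i - \phi_j\|_F^2$, a.k.a. Dirichlet energy in physics
Many efficient solvers with global optimality guarantees
1D Euclidean Laplacian eigenfunctions = Fourier basis
$\Delta e^{-i\omega x} = -\omega^2 e^{-i\omega x}$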
Globally supported!
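A small numerical illustration of this slide, assuming a path-graph Laplacian with unit weights as a stand-in for a discretized 1D Laplacian (the matrix and the tolerance 1e-6 are choices made here). It checks that the first $k$ eigenvectors attain a trace equal to the sum of the $k$ smallest eigenvalues, and measures how many of their entries are numerically non-zero.

```python
import numpy as np

n, k = 100, 6
# Path-graph Laplacian L = D - W with unit weights between neighbouring nodes
L = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
L[0, 0] = L[-1, -1] = 1.0   # end nodes have a single neighbour

eigvals, eigvecs = np.linalg.eigh(L)    # eigenvalues in ascending order
Phi = eigvecs[:, :k]                     # first k eigenvectors, Phi.T @ Phi = I
dirichlet = np.trace(Phi.T @ L @ Phi)    # objective value at the minimizer

# Fraction of entries that are numerically non-zero; close to 1 means global support
support = np.mean(np.abs(Phi) > 1e-6)
```

Here `support` is close to 1: the eigenvectors are cosine-like and non-zero almost everywhere, which is the global support the slide points out.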
Laplacian eigenfunctions: 1D example
[Figure: six panels, $\phi_1$ to $\phi_6$; horizontal axis 0 to 100, vertical axis −0.2 to 0.2]
First eigenfunctions of a 1D Euclidean Laplacian
Laplacian eigenfunctions: non-Euclidean example
[Figure: color scale from min through 0 to max]
First eigenfunctions of a Laplacian on a triangular mesh
Neumann et al. 2014
Compressed modes
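$\min_{\Phi \in \mathbb{R}^{n \times k}} \operatorname{tr}(\Phi^\top \Delta \Phi) + \mu \|\Phi\|_1 \quad \text{s.t.} \quad \Phi^\top \Phi = I$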
Dirichlet energy = smoothness
L1-norm = sparsity
Smoothness + sparsity = localization
Ozolins et al. 2013
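The compressed modes problem fits the MADMM template with $f(\Phi) = \operatorname{tr}(\Phi^\top \Delta \Phi)$, $g = \mu \|\cdot\|_1$, $A = I$, and $\mathcal{M}$ the Stiefel manifold. The following is a minimal sketch under stated assumptions, not the authors' implementation: the X-step runs a few fixed-step Riemannian gradient iterations with a polar (SVD-based) retraction, the step size is a heuristic, the Laplacian is a path graph, and all function names and parameter values (`mu`, `rho`, iteration counts) are illustrative and untuned.

```python
import numpy as np

def soft_threshold(V, t):
    """Closed-form Z-step for the L1 norm (entrywise shrinkage)."""
    return np.sign(V) * np.maximum(np.abs(V) - t, 0.0)

def stiefel_project(X, G):
    """Project G onto the tangent space of {X : X.T @ X = I} at X."""
    XtG = X.T @ G
    return G - X @ (XtG + XtG.T) / 2

def stiefel_retract(X, V):
    """Polar retraction: closest matrix with orthonormal columns to X + V."""
    W, _, Vt = np.linalg.svd(X + V, full_matrices=False)
    return W @ Vt

def madmm_compressed_modes(Lap, k, mu, rho=1.0, outer=200, inner=5, seed=0):
    n = Lap.shape[0]
    rng = np.random.default_rng(seed)
    Phi = stiefel_retract(rng.standard_normal((n, k)), 0.0)  # random feasible start
    Z, U = Phi.copy(), np.zeros((n, k))                      # Z = A Phi with A = I, U = 0
    step = 1.0 / (2 * np.linalg.norm(Lap, 2) + rho)          # heuristic fixed step size
    for _ in range(outer):
        # X-step: a few Riemannian gradient steps on the smooth sub-problem
        for _ in range(inner):
            G = 2 * Lap @ Phi + rho * (Phi - Z + U)          # extrinsic gradient
            Phi = stiefel_retract(Phi, -step * stiefel_project(Phi, G))
        # Z-step: soft-thresholding with threshold mu / rho
        Z = soft_threshold(Phi + U, mu / rho)
        # Scaled multiplier update
        U = U + Phi - Z
    return Phi, Z

n, k = 100, 4
Lap = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
Lap[0, 0] = Lap[-1, -1] = 1.0
Phi, Z = madmm_compressed_modes(Lap, k, mu=0.1)
```

The retraction keeps `Phi` on the manifold at every iteration, while `Z` carries the sparsity; the two agree only as the residual `Phi - Z` shrinks, so in practice both the residual and the objective would be monitored.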
Compressed modes: 1D example
[Figure: six panels φ1 to φ6, each plotted over 0–100]
First compressed modes of a 1D Euclidean Laplacian
Ozolins et al. 2013
![Page 52: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015 · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:](https://reader033.vdocuments.us/reader033/viewer/2022042922/5f3c8de2cbb0b042673dc136/html5/thumbnails/52.jpg)
20/54
Compressed modes: non-Euclidean example
[Figure: functions shown on a mesh; color scale from min through 0 to max]
First compressed modes
Neumann et al. 2014
![Page 53: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015 · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:](https://reader033.vdocuments.us/reader033/viewer/2022042922/5f3c8de2cbb0b042673dc136/html5/thumbnails/53.jpg)
20/54
Compressed modes: non-Euclidean example
[Figure: functions shown on a mesh; color scale from min through 0 to max]
First Laplacian eigenfunctions
Neumann et al. 2014
![Page 54: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015 · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:](https://reader033.vdocuments.us/reader033/viewer/2022042922/5f3c8de2cbb0b042673dc136/html5/thumbnails/54.jpg)
21/54
Wannier functions
Maximally-localized Wannier functions in Si and GaAs crystals
Wannier 1937; Mostofi 2008
![Page 55: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015 · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:](https://reader033.vdocuments.us/reader033/viewer/2022042922/5f3c8de2cbb0b042673dc136/html5/thumbnails/55.jpg)
22/54
Splitting method for orthogonality constraints (SOC)
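$$\min_{\Phi \in \mathbb{R}^{n \times k}} \operatorname{tr}(\Phi^\top \Delta \Phi) + \mu \|\Phi\|_1 \quad \text{s.t.} \quad \Phi^\top \Phi = I$$
Algorithm 3 SOC method for computing compressed modes
Initialize $k \leftarrow 1$, $\Phi^{(1)}$, $P^{(1)} = Q^{(1)} = \Phi^{(1)}$, $U^{(1)} = V^{(1)} = 0$
repeat
$\Phi^{(k+1)} = \arg\min_{\Phi} \operatorname{tr}(\Phi^\top \Delta \Phi) + \frac{\rho}{2} \|\Phi - Q^{(k)} + U^{(k)}\|_F^2 + \frac{\rho'}{2} \|\Phi - P^{(k)} + V^{(k)}\|_F^2$
$Q^{(k+1)} = \arg\min_{Q} \mu \|Q\|_1 + \frac{\rho}{2} \|\Phi^{(k+1)} - Q + U^{(k)}\|_F^2$
$P^{(k+1)} = \arg\min_{P} \frac{\rho'}{2} \|\Phi^{(k+1)} - P + V^{(k)}\|_F^2 \quad \text{s.t.} \quad P^\top P = I$
$U^{(k+1)} = U^{(k)} + \Phi^{(k+1)} - Q^{(k+1)}$
$V^{(k+1)} = V^{(k)} + \Phi^{(k+1)} - P^{(k+1)}$
$k \leftarrow k + 1$
until convergence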
Lai, Osher 2014
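The three SOC sub-problems each admit a closed-form update, which the following NumPy sketch spells out. It is an illustration under assumptions, not the authors' implementation: the function names, the default values of `rho` and `rho2` (standing for ρ and ρ′), the fixed iteration count used in place of a convergence test, and the random initialization are all hypothetical. The Φ-step solves the linear system obtained by setting the gradient to zero (assuming a symmetric ∆), the Q-step is entrywise shrinkage, and the P-step takes the orthogonal polar factor via an SVD.

```python
import numpy as np

def soft_threshold(x, alpha):
    """Entrywise shrinkage: sign(x) * max(0, |x| - alpha)."""
    return np.sign(x) * np.maximum(0.0, np.abs(x) - alpha)

def soc_compressed_modes(L, k, mu, rho=1.0, rho2=1.0, iters=200, seed=0):
    """Sketch of the SOC iteration; L is a symmetric n x n Laplacian."""
    n = L.shape[0]
    rng = np.random.default_rng(seed)
    Phi, _ = np.linalg.qr(rng.standard_normal((n, k)))  # orthonormal start
    P, Q = Phi.copy(), Phi.copy()
    U, V = np.zeros((n, k)), np.zeros((n, k))
    # Phi-step system matrix: gradient 2*L*Phi + rho*(Phi-Q+U) + rho2*(Phi-P+V) = 0
    A = 2.0 * L + (rho + rho2) * np.eye(n)
    for _ in range(iters):
        Phi = np.linalg.solve(A, rho * (Q - U) + rho2 * (P - V))
        # Q-step: proximal map of (mu/rho) * L1 norm
        Q = soft_threshold(Phi + U, mu / rho)
        # P-step: nearest matrix with orthonormal columns to Phi + V
        W, _, Zt = np.linalg.svd(Phi + V, full_matrices=False)
        P = W @ Zt
        # scaled dual updates
        U = U + Phi - Q
        V = V + Phi - P
    return Phi, P, Q
```

Only `P` satisfies the orthogonality constraint exactly; `Phi` and `Q` agree with it only in the limit, which is one motivation for the single-split manifold formulation that follows.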
![Page 56: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015 · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:](https://reader033.vdocuments.us/reader033/viewer/2022042922/5f3c8de2cbb0b042673dc136/html5/thumbnails/56.jpg)
22/54
Splitting method for orthogonality constraints (SOC)
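$$\min_{\Phi, P, Q \in \mathbb{R}^{n \times k}} \operatorname{tr}(\Phi^\top \Delta \Phi) + \mu \|Q\|_1 \quad \text{s.t.} \quad P = \Phi, \; Q = \Phi, \; P^\top P = I$$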
Lai, Osher 2014
![Page 58: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015 · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:](https://reader033.vdocuments.us/reader033/viewer/2022042922/5f3c8de2cbb0b042673dc136/html5/thumbnails/58.jpg)
23/54
Compressed modes as manifold optimization
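$$\min_{\Phi \in S(n,k)} \operatorname{tr}(\Phi^\top \Delta \Phi) + \mu \|\Phi\|_1$$
Stiefel manifold $S(n,k) = \{X \in \mathbb{R}^{n \times k} : X^\top X = I\}$
Sub-problem w.r.t. Φ: smooth manifold-constrained minimization of a quadratic function
$$\min_{\Phi \in S(n,k)} \operatorname{tr}(\Phi^\top \Delta \Phi) + \frac{\rho}{2} \|\Phi - Z + U\|_F^2$$
Sub-problem w.r.t. Z: sparse coding (Lasso) problem
$$\min_{Z} \mu \|Z\|_1 + \frac{\rho}{2} \|\Phi + U - Z\|_F^2$$
Kovnatsky, Glashoff, B 2015; Chen et al. 1995; Tibshirani 1996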
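The Z sub-problem has a closed-form solution, which is where the shrinkage operator in the MADMM iteration comes from. The short derivation below is a sketch; the auxiliary symbol $Y = \Phi + U$ is introduced here only for illustration, and the $\ell_1$ term is taken with weight µ as in the objective.

```latex
% The objective separates over the entries of Z, with Y = \Phi + U:
\min_{Z} \; \mu \|Z\|_1 + \frac{\rho}{2} \|Y - Z\|_F^2
  = \sum_{i,j} \min_{z_{ij}} \Big( \mu |z_{ij}| + \frac{\rho}{2} (y_{ij} - z_{ij})^2 \Big)

% Each scalar problem is solved by soft thresholding with threshold \mu / \rho:
z_{ij}^\star = \operatorname{sign}(y_{ij}) \max\{0, |y_{ij}| - \mu / \rho\}
```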
![Page 59: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015 · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:](https://reader033.vdocuments.us/reader033/viewer/2022042922/5f3c8de2cbb0b042673dc136/html5/thumbnails/59.jpg)
23/54
Compressed modes as manifold optimization
$$\min_{\Phi \in S(n,k),\, Z} \operatorname{tr}(\Phi^\top \Delta \Phi) + \mu \|Z\|_1 + \frac{\rho}{2} \|\Phi - Z + U\|_F^2$$
Kovnatsky, Glashoff, B 2015; Chen et al. 1995; Tibshirani 1996
![Page 62: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015 · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:](https://reader033.vdocuments.us/reader033/viewer/2022042922/5f3c8de2cbb0b042673dc136/html5/thumbnails/62.jpg)
24/54
Compressed modes by MADMM
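Algorithm 4 MADMM for computing compressed modes
Input: n × n Laplacian matrix ∆, parameter µ > 0
Output: first k compressed modes of ∆
Initialize $k \leftarrow 1$, $\Phi^{(1)} \leftarrow$ some orthonormal matrix, $Z^{(1)} = \Phi^{(1)}$, $U^{(1)} = 0$
repeat
$\Phi^{(k+1)} = \arg\min_{\Phi \in S(n,k)} \operatorname{tr}(\Phi^\top \Delta \Phi) + \frac{\rho}{2} \|\Phi - Z^{(k)} + U^{(k)}\|_F^2$
$Z^{(k+1)} = \operatorname{Shrink}_{\mu/\rho}(\Phi^{(k+1)} + U^{(k)})$
$U^{(k+1)} = U^{(k)} + \Phi^{(k+1)} - Z^{(k+1)}$
$k \leftarrow k + 1$
until convergence
where $\operatorname{Shrink}_{\alpha}(x) = \operatorname{sign}(x) \max\{0, |x| - \alpha\}$ is the shrinkage operator, applied entrywise
(the superscript k counts iterations, while the k in S(n, k) is the number of modes)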
Kovnatsky, Glashoff, B 2015
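The MADMM iteration above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' code: the Φ-step is approximated by a few steps of Riemannian gradient descent with a QR retraction and backtracking, a simple stand-in for a dedicated manifold solver. The names `qr_retract`, `phi_step`, `madmm_compressed_modes`, the default `rho`, the inner and outer iteration counts, and the Armijo constant `1e-4` are all hypothetical choices.

```python
import numpy as np

def shrink(x, alpha):
    """Entrywise shrinkage: sign(x) * max(0, |x| - alpha)."""
    return np.sign(x) * np.maximum(0.0, np.abs(x) - alpha)

def qr_retract(Y):
    """Map a full-rank n x k matrix onto the Stiefel manifold via thin QR."""
    Qf, R = np.linalg.qr(Y)
    s = np.sign(np.diag(R))
    s[s == 0] = 1.0
    return Qf * s  # flip column signs so the R factor has a positive diagonal

def phi_step(L, Phi, Z, U, rho, inner=10):
    """Approximate Phi-step: Riemannian gradient descent with backtracking."""
    def cost(X):
        return np.trace(X.T @ L @ X) + 0.5 * rho * np.sum((X - Z + U) ** 2)
    for _ in range(inner):
        G = 2.0 * L @ Phi + rho * (Phi - Z + U)      # Euclidean gradient
        S = Phi.T @ G
        rgrad = G - Phi @ (0.5 * (S + S.T))          # projection to tangent space
        t, c0, g2 = 1.0, cost(Phi), np.sum(rgrad * rgrad)
        while t > 1e-12:                             # Armijo backtracking
            cand = qr_retract(Phi - t * rgrad)
            if cost(cand) <= c0 - 1e-4 * t * g2:
                Phi = cand
                break
            t *= 0.5
    return Phi

def madmm_compressed_modes(L, k, mu, rho=1.0, iters=100, seed=0):
    """Sketch of MADMM for compressed modes; L is a symmetric n x n Laplacian."""
    n = L.shape[0]
    rng = np.random.default_rng(seed)
    Phi = qr_retract(rng.standard_normal((n, k)))    # some orthonormal matrix
    Z, U = Phi.copy(), np.zeros((n, k))
    for _ in range(iters):
        Phi = phi_step(L, Phi, Z, U, rho)            # manifold-constrained step
        Z = shrink(Phi + U, mu / rho)                # sparse coding step
        U = U + Phi - Z                              # scaled dual update
    return Phi, Z
```

Because every candidate passes through the retraction, `Phi` stays exactly orthonormal at each iteration, so only the single splitting variable `Z` is needed.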
![Page 63: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015 · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:](https://reader033.vdocuments.us/reader033/viewer/2022042922/5f3c8de2cbb0b042673dc136/html5/thumbnails/63.jpg)
25/54
Convergence
Convergence of MADMM with different random initializations (compressed modes problem of size n = 500, k = 10)
[Figure: cost vs. time (sec), logarithmic axes]
Kovnatsky, Glashoff, B 2015
![Page 64: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015 · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:](https://reader033.vdocuments.us/reader033/viewer/2022042922/5f3c8de2cbb0b042673dc136/html5/thumbnails/64.jpg)
26/54
Convergence
Convergence of MADMM with different X-step solvers (compressed modes problem of size n = 500, k = 10)
[Figure: cost vs. time (sec), logarithmic axes; curves: trust regions, conjugate gradients]
Kovnatsky, Glashoff, B 2015; Manopt: Boumal et al. 2014
![Page 65: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015 · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:](https://reader033.vdocuments.us/reader033/viewer/2022042922/5f3c8de2cbb0b042673dc136/html5/thumbnails/65.jpg)
27/54
Convergence
Example of convergence of different methods (compressed modes problem of size n = 8 × 10^3, k = 10)
[Figure: cost (log scale) vs. time (sec); curves: Lai & Osher, Neumann et al., MADMM]
Kovnatsky, Glashoff, B 2015; Lai, Osher 2014; Neumann et al. 2014
![Page 66: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015 · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:](https://reader033.vdocuments.us/reader033/viewer/2022042922/5f3c8de2cbb0b042673dc136/html5/thumbnails/66.jpg)
28/54
Scalability
Complexity of different methods (compressed modes problem of size n, k = 10)
[Figure: time per iteration (sec, log scale) vs. problem size n; curves: Lai & Osher, Neumann et al., MADMM]
Kovnatsky, Glashoff, B 2015; Lai, Osher 2014; Neumann et al. 2014
![Page 67: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015 · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:](https://reader033.vdocuments.us/reader033/viewer/2022042922/5f3c8de2cbb0b042673dc136/html5/thumbnails/67.jpg)
29/54
Functional correspondence
![Page 68: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015 · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:](https://reader033.vdocuments.us/reader033/viewer/2022042922/5f3c8de2cbb0b042673dc136/html5/thumbnails/68.jpg)
30/54
Applications of shape correspondence
Texture mapping Pose transfer
B2, Kimmel 2007; Sumner et al. 2004
![Page 69: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015 · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:](https://reader033.vdocuments.us/reader033/viewer/2022042922/5f3c8de2cbb0b042673dc136/html5/thumbnails/69.jpg)
31/54
Shape correspondence
[Figure: point s on shape S, point q on shape Q, map t]
Point-wise map t: S → Q
![Page 70: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015 · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:](https://reader033.vdocuments.us/reader033/viewer/2022042922/5f3c8de2cbb0b042673dc136/html5/thumbnails/70.jpg)
31/54
Shape correspondence
[Figure: points s, s′ on shape S and q, q′ on shape Q, map t]
Minimum-distortion point-wise map t: S → Q
B2, Kimmel 2006
![Page 71: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015 · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:](https://reader033.vdocuments.us/reader033/viewer/2022042922/5f3c8de2cbb0b042673dc136/html5/thumbnails/71.jpg)
31/54
Shape correspondence
[Figure: function f in F(S), function g in F(Q), linear map T]
Functional map T: F(S) → F(Q)
Ovsjanikov et al. 2012
![Page 77: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015 · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:](https://reader033.vdocuments.us/reader033/viewer/2022042922/5f3c8de2cbb0b042673dc136/html5/thumbnails/77.jpg)
32/54
Functional correspondence
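[Figure: a function f ≈ a1φ1 + a2φ2 + ⋯ + akφk on S is mapped by T to g ≈ b1ψ1 + b2ψ2 + ⋯ + bkψk on Q; the matrix C maps the coefficients (a1, …, ak) to (b1, …, bk)]
Representation in truncated Laplacian eigenbasis, $T \approx \Psi C \Phi^\top$
If T is area-preserving, $C^\top C = I$
Represent $C = XY^\top$, then $T \approx \Psi C \Phi^\top = \Psi X (\Phi Y)^\top$ (rotation of bases)
Given known corresponding functions $F = (f_1, \dots, f_q)$ and $G = (g_1, \dots, g_q)$, find C by solving the linear system $C \Phi^\top F = \Psi^\top G$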
Ovsjanikov et al. 2012
![Page 78: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015 · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:](https://reader033.vdocuments.us/reader033/viewer/2022042922/5f3c8de2cbb0b042673dc136/html5/thumbnails/78.jpg)
32/54
Functional correspondence
Given known corresponding Fourier coefficients $A = \Phi^\top F$ and $B = \Psi^\top G$, find C by solving the linear system $CA = B$
Ovsjanikov et al. 2012
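Solving CA = B in the least-squares sense takes one call to a linear solver. The sketch below is illustrative only: `fit_functional_map` is a hypothetical name, and it assumes the basis matrices have orthonormal columns in the standard inner product (no area or mass weighting), so that the Fourier coefficients are simply Φ⊺F and Ψ⊺G.

```python
import numpy as np

def fit_functional_map(Phi, Psi, F, G):
    """Least-squares estimate of C in C A = B, with A = Phi^T F, B = Psi^T G.

    Phi, Psi: n x k and m x k truncated bases (assumed orthonormal columns).
    F, G: n x q and m x q matrices of corresponding functions.
    """
    A = Phi.T @ F          # k x q coefficients on the first shape
    B = Psi.T @ G          # k x q coefficients on the second shape
    # C A = B  <=>  A^T C^T = B^T, which is the form lstsq expects
    Ct, *_ = np.linalg.lstsq(A.T, B.T, rcond=None)
    return Ct.T
```

With q ≥ k corresponding functions whose coefficients span the k-dimensional space, the system determines C uniquely; with fewer, `lstsq` returns the minimum-norm solution.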
![Page 79: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015 · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:](https://reader033.vdocuments.us/reader033/viewer/2022042922/5f3c8de2cbb0b042673dc136/html5/thumbnails/79.jpg)
33/54
Functional correspondence in shape collection
[Figure: shape collection S1, S2, …, SL; for a pair of shapes Si, Sj: AijCij ≈ Bij]
Kovnatsky, B2, Glashoff, Kimmel 2013; Kovnatsky, Glashoff, B 2015
![Page 80: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015 · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:](https://reader033.vdocuments.us/reader033/viewer/2022042922/5f3c8de2cbb0b042673dc136/html5/thumbnails/80.jpg)
33/54
Functional correspondence in shape collection
[Figure: shape collection S1, S2, …, SL with one matrix Xi per shape (X1, X2, …, XL); for a pair of shapes Si, Sj: AijXi ≈ BijXj]
Kovnatsky, B2, Glashoff, Kimmel 2013; Kovnatsky, Glashoff, B 2015
![Page 81: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015 · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:](https://reader033.vdocuments.us/reader033/viewer/2022042922/5f3c8de2cbb0b042673dc136/html5/thumbnails/81.jpg)
34/54
Functional correspondence as manifold optimization
$$\min_{(X_1, \dots, X_L) \in S^L(k,k)} \sum_{i \neq j} \|A_{ij} X_i - B_{ij} X_j\|_{2,1} + \mu \sum_{i=1}^{L} \operatorname{tr}(X_i^\top \Lambda_i X_i)$$
where $A_{ij}, B_{ij}$ are Fourier coefficients of given corresponding functions on shapes $S_i, S_j$, and $\Lambda_i$ holds the first k eigenvalues of the Laplacian $\Delta_{S_i}$
Joint diagonalization of Laplacians: find new bases $\hat{\Phi}_i = \Phi_i X_i$ that approximately diagonalize $\Delta_i$ and are coupled
Optimization on product of Stiefel manifolds $S^L(k,k)$
The $L_{2,1}$-norm makes it possible to cope with outliers in correspondence data
X-step: manifold-constrained minimization of a quadratic function
Z-step: one iteration of shrinkage
Eynard, Kovnatsky, B2, Glashoff 2012; Kovnatsky, Glashoff, B 2015
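For an L2,1 penalty, the shrinkage in the Z-step acts on whole groups of entries rather than on single entries. The sketch below shows the standard group soft-thresholding operator. It assumes the L2,1-norm is the sum of Euclidean norms of the rows (`axis=1`); the slides do not state whether rows or columns are grouped, so the `axis` argument and the name `group_shrink` are assumptions for illustration.

```python
import numpy as np

def group_shrink(M, alpha, axis=1):
    """Proximal map of alpha * ||M||_{2,1}, grouping entries along `axis`.

    Each group g is scaled by max(0, 1 - alpha / ||g||_2), so groups with
    norm at most alpha are set to zero as a whole.
    """
    M = np.asarray(M, dtype=float)
    norms = np.linalg.norm(M, axis=axis, keepdims=True)
    scale = np.zeros_like(norms)
    keep = norms > alpha                 # implies norms > 0 when alpha >= 0
    scale[keep] = 1.0 - alpha / norms[keep]
    return scale * M
```

A group whose residual is large (a likely outlier) is shrunk by a fixed amount rather than in proportion to its size, which is what limits the influence of outliers compared with a squared penalty.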
![Page 86: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015 · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:](https://reader033.vdocuments.us/reader033/viewer/2022042922/5f3c8de2cbb0b042673dc136/html5/thumbnails/86.jpg)
35/54
Correspondence data
Example of correspondence data (10% of outliers shown in red)
Kovnatsky, Glashoff, B 2015
![Page 87: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015 · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:](https://reader033.vdocuments.us/reader033/viewer/2022042922/5f3c8de2cbb0b042673dc136/html5/thumbnails/87.jpg)
36/54
Correspondence quality
Robust (MADMM)
Least squares
Kovnatsky, Glashoff, B 2015
![Page 88: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015 · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:](https://reader033.vdocuments.us/reader033/viewer/2022042922/5f3c8de2cbb0b042673dc136/html5/thumbnails/88.jpg)
37/54
Correspondence quality
Correspondence quality evaluated using Princeton protocol
[Figure: % of correspondences vs. % geodesic diameter; curves: LS, MADMM]
Kovnatsky, Glashoff, B 2015; data: B2, Kimmel 2008, benchmark: Kim et al. 2011
![Page 89: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015 · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:](https://reader033.vdocuments.us/reader033/viewer/2022042922/5f3c8de2cbb0b042673dc136/html5/thumbnails/89.jpg)
38/54
Convergence
Convergence of different methods
[Figure: cost (log scale) vs. time (sec); curve labels 10^-4, 10^-6, 10^-8; legend: Smoothing, MADMM]
Kovnatsky, Glashoff, B 2015
![Page 90: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015 · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:](https://reader033.vdocuments.us/reader033/viewer/2022042922/5f3c8de2cbb0b042673dc136/html5/thumbnails/90.jpg)
39/54
Multimodal spectral clustering
Uncoupled, no outliers
100%
Coupled (L2), no outliers
53%
Coupled (L2), 10% outliers
72%
Coupled (L2,1), 10% outliers
82%
Kovnatsky, Glashoff, B 2015; Eynard, Kovnatsky, B2, Glashoff 2012
![Page 94: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015 · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:](https://reader033.vdocuments.us/reader033/viewer/2022042922/5f3c8de2cbb0b042673dc136/html5/thumbnails/94.jpg)
40/54
Multidimensional scaling
![Page 95: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015 · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:](https://reader033.vdocuments.us/reader033/viewer/2022042922/5f3c8de2cbb0b042673dc136/html5/thumbnails/95.jpg)
41/54
Multidimensional scaling
[Figure: example (squared) distance matrix D and point configuration X]
MDS problem: given an n × n (squared) distance matrix D, find a k-dimensional configuration of points $X \in \mathbb{R}^{n \times k}$ such that
$$\|x_i - x_j\|_2^2 \approx d_{ij}$$
Cayton, Dasgupta 2006
![Page 96: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015 · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:](https://reader033.vdocuments.us/reader033/viewer/2022042922/5f3c8de2cbb0b042673dc136/html5/thumbnails/96.jpg)
42/54
Similarity vs Distance
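Equivalence between distances and similarities
(Squared) distances (EDM) ↔ Similarities (PSD)
Similarities to distances: $\operatorname{dist}(B)_{ij} = b_{ii} + b_{jj} - 2 b_{ij}$
Distances to similarities: $B = -\frac{1}{2} HDH$
where $H = I - \frac{1}{n} \mathbf{1}\mathbf{1}^\top$ is the double-centering matrix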
Schoenberg 1938; Dattorro 2005; Cayton, Dasgupta 2006
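The two maps between squared distances and similarities are easy to check numerically. The sketch below uses hypothetical function names `double_center` and `dist`; it illustrates that for points in Euclidean space, double centering of the squared distance matrix gives the Gram matrix of the centered points, and `dist` maps it back.

```python
import numpy as np

def double_center(D):
    """Similarity matrix B = -1/2 * H D H with H = I - (1/n) 1 1^T."""
    n = D.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return -0.5 * H @ D @ H

def dist(B):
    """Squared distances from similarities: d_ij = b_ii + b_jj - 2 b_ij."""
    b = np.diag(B)
    return b[:, None] + b[None, :] - 2.0 * B

# Usage: round trip for a random point configuration
rng = np.random.default_rng(0)
X = rng.standard_normal((7, 3))
D = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
B = double_center(D)
print(np.allclose(dist(B), D))
```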
![Page 97: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015 · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:](https://reader033.vdocuments.us/reader033/viewer/2022042922/5f3c8de2cbb0b042673dc136/html5/thumbnails/97.jpg)
42/54
Similarity vs Distance
Equivalence between distances and similarities
(Squared) distances (EDM) ↔ Similarities (PSD)
$B = -\frac{1}{2} HDH$
$B^* = U \Lambda_+ U^\top$
where $H = I - \frac{1}{n} \mathbf{1}\mathbf{1}^\top$ is the double-centering matrix
Schoenberg 1938; Dattorro 2005; Cayton, Dasgupta 2006
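When D is not an exact Euclidean distance matrix, B can have negative eigenvalues. The sketch below reads $\Lambda_+$ as the eigenvalue matrix with negative entries set to zero, which is an interpretation of the notation rather than something the slide spells out; `psd_part` is a hypothetical name.

```python
import numpy as np

def psd_part(B):
    """Return U * max(Lambda, 0) * U^T for a symmetric matrix B."""
    w, U = np.linalg.eigh((B + B.T) / 2.0)   # symmetrize against round-off
    return (U * np.maximum(w, 0.0)) @ U.T    # scale columns of U, then recombine
```

In the Frobenius norm this is the closest positive semi-definite matrix to the symmetric part of B.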
![Page 98: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015 · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:](https://reader033.vdocuments.us/reader033/viewer/2022042922/5f3c8de2cbb0b042673dc136/html5/thumbnails/98.jpg)
43/54
Classical MDS
Algorithm 5 Classical MDS
Input: squared distance matrix D
Compute similarity by double centering: $B = -\frac{1}{2} HDH$
Perform eigendecomposition $B = U \Lambda U^\top$ and take the largest k positive eigenvalues $\Lambda_k$ and corresponding eigenvectors $U_k$
Output: $X = U_k \Lambda_k^{1/2}$
Classical MDS as optimization problem: minimize the strain
$$\min_{X \in \mathbb{R}^{n \times k}} \left\| -\tfrac{1}{2} HDH - XX^\top \right\|_F^2$$
Young, Householder 1938
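The steps of classical MDS map directly onto a few NumPy calls. The following is a minimal sketch with a hypothetical function name; clipping non-positive eigenvalues to zero is a choice made here so the square root is always defined, whereas the algorithm above simply assumes the k largest eigenvalues are positive.

```python
import numpy as np

def classical_mds(D, k):
    """Embed n points in R^k from an n x n matrix D of squared distances."""
    n = D.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n       # double-centering matrix
    B = -0.5 * H @ D @ H                      # similarity (Gram) matrix
    w, V = np.linalg.eigh(B)                  # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:k]             # indices of the k largest
    w_k = np.maximum(w[idx], 0.0)             # guard against negative values
    return V[:, idx] * np.sqrt(w_k)           # X = U_k Lambda_k^{1/2}
```

The output is determined only up to rotation and reflection, so a check should compare pairwise distances rather than coordinates.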
![Page 100: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015 · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:](https://reader033.vdocuments.us/reader033/viewer/2022042922/5f3c8de2cbb0b042673dc136/html5/thumbnails/100.jpg)
44/54
Sensitivity to outliers
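Error dispersion by double-centering
[Figure: an error ε in one entry of the (squared) distance matrix spreads over the whole similarity matrix $B = -\frac{1}{2} HDH$, with entries of order ε, ε/n and ε/n²]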
Cayton, Dasgupta 2006
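Because double centering is linear, the effect of an outlier on B can be computed by double-centering the perturbation alone. The sketch below (hypothetical function name and example values) perturbs the symmetric pair of entries (p, q) and (q, p) of D by ε and returns the resulting change in B.

```python
import numpy as np

def centering_response(n, p, q, eps):
    """Change in B = -1/2 * H D H when d_pq and d_qp are perturbed by eps."""
    E = np.zeros((n, n))
    E[p, q] = E[q, p] = eps                   # a single corrupted distance
    H = np.eye(n) - np.ones((n, n)) / n
    return -0.5 * H @ E @ H

# Usage: n = 10 points, outlier of size 1 between points 0 and 1
dB = centering_response(10, 0, 1, 1.0)
print(abs(dB[0, 1]), abs(dB[0, 5]), abs(dB[5, 6]))
```

The three printed magnitudes correspond to the corrupted entry itself (order ε), the other entries in the affected rows and columns (order ε/n), and all remaining entries (exactly ε/n²), so every entry of the similarity matrix is contaminated.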
![Page 101: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015 · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:](https://reader033.vdocuments.us/reader033/viewer/2022042922/5f3c8de2cbb0b042673dc136/html5/thumbnails/101.jpg)
45/54
Sensitivity to outliers
[Figure: 2D map with city labels Seattle, SF, LA, Denver, NY, WDC, Atlanta, Miami, Houston, Chicago]
Distances between 10 US cities computed with classical MDS with distance between NY and LA doubled
Kruskal, Wish 1978; Cayton, Dasgupta 2006
![Page 103: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015 · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:](https://reader033.vdocuments.us/reader033/viewer/2022042922/5f3c8de2cbb0b042673dc136/html5/thumbnails/103.jpg)
46/54
Robust Euclidean embedding (REE)
Minimize a robust norm (instead of the Frobenius norm)
$$\min_{D^* \in \mathrm{EDM}} \|D - D^*\|_1$$
and then recover k-dimensional X from D∗ using classical MDS
Non-smooth
Can be formulated as a semi-definite program (SDP), or
Solved by subgradient minimization
Cayton, Dasgupta 2006
![Page 107: MADMM: a generic algorithm for non-smooth optimization on ... · 9/25/2015 · 4/54 Applications Sphere: principal geodesic analysis1, 1-bit compressed sensing2 Stiefel manifold:](https://reader033.vdocuments.us/reader033/viewer/2022042922/5f3c8de2cbb0b042673dc136/html5/thumbnails/107.jpg)
47/54
REE as manifold optimization
$\min_{B \in S_+(n,k)} \|D - \mathrm{dist}(B)\|_1$
Manifold of fixed-rank positive semi-definite matrices
$S_+(n,k) = \{X \in \mathbb{R}^{n \times n} : X = X^\top \succeq 0,\ \mathrm{rank}(X) = k\}$
Only non-smooth function ($f \equiv 0$)
X-step: manifold-constrained minimization of a quadratic function
Z-step: one iteration of shrinkage
Kovnatsky, Glashoff, B 2015
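The X- and Z-steps above can be read off a standard scaled-form ADMM splitting. The derivation below is a sketch under assumptions not stated on this slide: that $\mathrm{dist}(B)$ is the usual map from a Gram matrix to squared distances, that $\rho > 0$ is the penalty parameter, and that $U$ is the scaled dual variable.

```latex
\begin{align*}
&\text{Assumed: } \mathrm{dist}(B)_{ij} = B_{ii} + B_{jj} - 2B_{ij} \\
&\min_{B \in S_+(n,k),\, Z} \ \|Z\|_1 \quad \text{s.t.} \quad Z = \mathrm{dist}(B) - D \\
&L_\rho(B, Z, U) = \|Z\|_1 + \tfrac{\rho}{2}\,\|\mathrm{dist}(B) - D - Z + U\|_F^2
  \quad (\text{up to a constant in } U) \\
&\text{X-step: } \min_{B \in S_+(n,k)} \|\mathrm{dist}(B) - D - Z + U\|_F^2
  \quad (\text{quadratic, since } \mathrm{dist} \text{ is linear in } B) \\
&\text{Z-step: } \min_{Z} \ \|Z\|_1 + \tfrac{\rho}{2}\,\|Z - A\|_F^2,
  \qquad A = \mathrm{dist}(B) - D + U \\
&\Rightarrow\ Z_{ij} = \operatorname{sign}(A_{ij})\,\max\!\left(|A_{ij}| - \tfrac{1}{\rho},\ 0\right)
\end{align*}
```

The last line is entrywise soft-thresholding with threshold $1/\rho$, which is why the Z-step costs a single shrinkage.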
48/54
REE by MADMM
Algorithm 6: MADMM for solving the REE problem
Input: squared distance matrix $D$
Initialize $k \leftarrow 1$, $Z^{(1)} = X^{(1)}$, $U^{(1)} = 0$
repeat
  X-step: $B^{(k+1)} = \arg\min_{B \in S_+(n,k)} \|\mathrm{dist}(B) - Z^{(k)} - D + U^{(k)}\|_F^2$
  Z-step: $Z^{(k+1)} = \mathrm{Shrink}_{1/\rho}\left(\mathrm{dist}(B^{(k+1)}) - D + U^{(k)}\right)$
  Update $U^{(k+1)} = U^{(k)} + \mathrm{dist}(B^{(k+1)}) - D - Z^{(k+1)}$
  $k \leftarrow k + 1$
until convergence
Kovnatsky, Glashoff, B 2015
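A minimal NumPy sketch of the loop in Algorithm 6, under several assumptions that are not from the slides: $\mathrm{dist}(B)_{ij} = B_{ii} + B_{jj} - 2B_{ij}$; the rank constraint is handled by writing $B = YY^\top$ with $Y \in \mathbb{R}^{n \times r}$ (the rank is called `r` here to avoid clashing with the iteration counter); the X-step is solved only approximately by backtracking gradient descent on $Y$, a swapped-in stand-in for a proper manifold solver; and the iterate is initialized from classical MDS with $Z = \mathrm{dist}(B) - D$. All function names and default values are hypothetical.

```python
import numpy as np

def dist(B):
    """Squared distances from a Gram matrix: dist(B)_ij = B_ii + B_jj - 2 B_ij."""
    d = np.diag(B)
    return d[:, None] + d[None, :] - 2.0 * B

def shrink(A, tau):
    """Entrywise soft-thresholding, the proximal operator of tau * ||.||_1."""
    return np.sign(A) * np.maximum(np.abs(A) - tau, 0.0)

def x_step(Y, T, inner_iters=20):
    """Approximately minimise ||dist(Y Y^T) - T||_F^2 over Y (assumed substitute
    for the manifold-constrained X-step), using Armijo backtracking."""
    def cost(Y_):
        return np.sum((dist(Y_ @ Y_.T) - T) ** 2)
    for _ in range(inner_iters):
        R = dist(Y @ Y.T) - T                          # symmetric residual
        grad = 8.0 * (np.diag(R.sum(axis=1)) - R) @ Y  # gradient with respect to Y
        c0, g2, step = cost(Y), np.sum(grad ** 2), 1.0
        # Halve the step until sufficient decrease holds (NaN never passes).
        while step > 1e-12 and not cost(Y - step * grad) <= c0 - 0.5 * step * g2:
            step *= 0.5
        if step <= 1e-12:
            break                                      # no acceptable step found
        Y = Y - step * grad
    return Y

def ree_madmm(D, r, rho=1.0, iters=50):
    """Sketch of the ADMM loop for min ||D - dist(B)||_1 with B = Y Y^T of rank r."""
    n = D.shape[0]
    H = np.eye(n) - 1.0 / n                            # centering matrix
    w, V = np.linalg.eigh(-0.5 * H @ D @ H)            # classical MDS initialisation
    Y = V[:, -r:] * np.sqrt(np.clip(w[-r:], 0.0, None))
    Z = dist(Y @ Y.T) - D                              # assumed initial Z
    U = np.zeros_like(D)
    for _ in range(iters):
        Y = x_step(Y, D + Z - U)                       # X-step (approximate)
        A = dist(Y @ Y.T) - D + U
        Z = shrink(A, 1.0 / rho)                       # Z-step
        U = A - Z                                      # U + dist(B) - D - Z
    return Y

# Synthetic usage example (random points, one corrupted pair).
rng = np.random.default_rng(0)
X_true = rng.standard_normal((20, 2))
D = dist(X_true @ X_true.T)
D[0, 1] = D[1, 0] = 4.0 * D[0, 1]                      # doubled distance, squared
Y = ree_madmm(D, r=2)
print(np.abs(dist(Y @ Y.T) - D).sum())                 # l1 objective at the result
```

Factoring $B = YY^\top$ keeps every iterate positive semi-definite with rank at most `r` by construction, at the price of an inexact X-step. The value of `rho` sets the shrinkage threshold and should be chosen relative to the scale of the squared distances.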
49/54
Robust Euclidean embedding example
[Figure panels: ground truth, classical MDS, MADMM]
Embedding of distances between 500 US cities corrupted by sparse noise (doubling the distance between a few pairs of cities)
Kovnatsky, Glashoff, B 2015
50/54
Scalability of REE
Complexity of different methods for REE problem of different size n
[Plot: time per iteration (sec, log scale) versus problem size n (0 to 1,000) for SDP, subgradient, and MADMM]
Kovnatsky, Glashoff, B 2015; Cayton, Dasgupta 2006
51/54
Convergence
Convergence of different methods on REE problem of size n = 500
[Plot: stress (log scale) versus time (sec, 0 to 100) for subgradient and MADMM]
Kovnatsky, Glashoff, B 2015; Cayton, Dasgupta 2006
52/54
Conclusions
Non-smooth manifold optimization problems are ubiquitous in machine learning, pattern recognition, signal processing, and computer graphics applications
MADMM is a generic algorithm for such problems
Any manifold, any function
Very simple to implement
No parameters to tune
A. Kovnatsky, K. Glashoff, M. M. Bronstein, ‘MADMM: a generic algorithm for non-smooth optimization on manifolds’, arXiv:1505.07676, May 2015
53/54
A. Kovnatsky
Funded by
54/54
Thank you!