Convex Optimization of Graph Laplacian
Eigenvalues
Stephen Boyd
Stanford University
(Joint work with Persi Diaconis, Arpita Ghosh, Seung-Jean Kim, Sanjay Lall, Pablo Parrilo, Amin Saberi, Jun Sun, Lin Xiao. Thanks to Almir Mutapcic.)
ICM 2006 Madrid, August 29, 2006
Outline
• some basic stuff we’ll need
– graph Laplacian eigenvalues
– convex optimization and semidefinite programming
• the basic idea
• some example problems
– distributed linear averaging
– fastest mixing Markov chain on a graph
– fastest mixing Markov process on a graph
– its dual: maximum variance unfolding
• conclusions
(Weighted) graph Laplacian
• graph G = (V,E) with n = |V | nodes, m = |E| edges
• edge weights w1, . . . , wm ∈ R
• l ∼ (i, j) means edge l connects nodes i, j
• incidence matrix:

Ail =  1   if edge l enters node i
      −1   if edge l leaves node i
       0   otherwise

• (weighted) Laplacian: L = A diag(w) AT

• Lij = −wl              if l ∼ (i, j)
        Σ_{l∼(i,k)} wl   if i = j
        0                otherwise
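The construction above is easy to check numerically; a minimal sketch with numpy (the path graph on 3 nodes and the weights are arbitrary illustrative choices):

```python
import numpy as np

# Path graph on 3 nodes with edges 1-2 and 2-3 (columns of A are edges).
# Slide convention: A[i, l] = 1 if edge l enters node i, -1 if it leaves.
A = np.array([[ 1,  0],
              [-1,  1],
              [ 0, -1]])
w = np.array([2.0, 3.0])          # edge weights w_1, w_2

L = A @ np.diag(w) @ A.T          # weighted Laplacian L = A diag(w) A^T
print(L)
# Off-diagonals are -w_l, diagonals sum the incident weights, rows sum to zero.
```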
Laplacian eigenvalues
• L is symmetric; L1 = 0
• we’ll be interested in the case when L ⪰ 0 (i.e., L is PSD)
(always the case when weights are nonnegative)
• Laplacian eigenvalues (eigenvalues of L):
0 = λ1 ≤ λ2 ≤ · · · ≤ λn
• spectral graph theory connects properties of the graph and the λi (with w = 1)
e.g.: G connected iff λ2 > 0 (with w = 1)
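The connectivity criterion can be checked directly from the spectrum; a small sketch (both test graphs are hypothetical examples):

```python
import numpy as np

def lap_eigs(L):
    """Sorted Laplacian eigenvalues 0 = lambda_1 <= ... <= lambda_n."""
    return np.sort(np.linalg.eigvalsh(L))

# Connected: unweighted path on 3 nodes.
L_path = np.array([[ 1., -1.,  0.],
                   [-1.,  2., -1.],
                   [ 0., -1.,  1.]])
# Disconnected: one edge plus an isolated node.
L_disc = np.array([[ 1., -1.,  0.],
                   [-1.,  1.,  0.],
                   [ 0.,  0.,  0.]])

print(lap_eigs(L_path)[1])   # lambda_2 > 0: connected
print(lap_eigs(L_disc)[1])   # lambda_2 = 0: disconnected
```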
Convex spectral functions
• suppose φ is a symmetric convex function in n− 1 variables
• then ψ(w) = φ(λ2, . . . , λn) is a convex function of weight vector w
• examples:
– φ(u) = 1Tu (i.e., the sum):
ψ(w) = Σ_{i=2}^n λi = Σ_{i=1}^n λi = Tr L = 2·1Tw (twice the total weight)
– φ(u) = maxi ui:
ψ(w) = max{λ2, . . . , λn} = λn (spectral radius)
More examples
• φ(u) = mini ui (concave) gives ψ(w) = λ2, the algebraic connectivity
(a concave function of w)
• φ(u) = Σi 1/ui (with ui > 0):

ψ(w) = Σ_{i=2}^n 1/λi

proportional to total effective resistance of graph, Tr L†
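Both identities above — Tr L = 2·1Tw and Σ_{i≥2} 1/λi = Tr L† — can be spot-checked numerically; a sketch on a weighted triangle (the weights are arbitrary):

```python
import numpy as np

# Weighted triangle: edges 1-2, 2-3, 3-1.
A = np.array([[ 1,  0, -1],
              [-1,  1,  0],
              [ 0, -1,  1]])
w = np.array([1.0, 2.0, 3.0])
L = A @ np.diag(w) @ A.T

lam = np.sort(np.linalg.eigvalsh(L))
print(np.trace(L), 2 * w.sum())                          # Tr L = 2 * 1^T w
print(np.sum(1 / lam[1:]), np.trace(np.linalg.pinv(L)))  # sum 1/lambda_i = Tr L+
```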
Convex optimization
minimize   f(x)
subject to x ∈ X
• x ∈ Rn is optimization variable
• f is convex function (can maximize concave f by minimizing −f)
• X ⊆ Rn is closed convex set
• roughly speaking, convex optimization problems are tractable, ‘easy’ to solve numerically (tractability depends on how f and X are described)
Symmetry in convex optimization
• permutation (matrix) π is a symmetry of the problem if f(πz) = f(z) for all z, and πz ∈ X for all z ∈ X

• if π is a symmetry and the convex optimization problem has a solution, it has a solution invariant under π
(if x⋆ is a solution, so is average over {x⋆, πx⋆, π2x⋆, . . .})
Duality in convex optimization
primal:
minimize   f(x)
subject to x ∈ X

dual:
maximize   g(y)
subject to y ∈ Y
• y is dual variable; dual objective g is concave; Y is closed, convex
(various methods can be used to generate g, Y)
• p⋆ (d⋆) is optimal value of primal (dual) problem
• weak duality : for any x ∈ X , y ∈ Y, f(x) ≥ g(y); hence, p⋆ ≥ d⋆
• strong duality : for convex problems, provided a ‘constraint qualification’ holds, there exist x⋆ ∈ X , y⋆ ∈ Y with f(x⋆) = g(y⋆)
hence, x⋆ is primal optimal, y⋆ is dual optimal, and p⋆ = d⋆
Semidefinite program (SDP)
a particular type of convex optimization problem
minimize   cTx
subject to Σ_{i=1}^n xi Ai ⪯ B, Fx = g
• variable is x ∈ Rn; data are c, F , g, symmetric matrices Ai, B
• ⪯ means inequality with respect to the positive semidefinite cone
• generalization of linear program (LP)
minimize   cTx
subject to Σ_{i=1}^n xi ai ≤ b, Fx = g
(here ai, b are vectors; ≤ means componentwise)
SDP dual
primal SDP:

minimize   cTx
subject to Σ_{i=1}^n xi Ai ⪯ B, Fx = g

dual SDP:

maximize   −Tr ZB − νTg
subject to Z ⪰ 0, (FTν + c)i + Tr ZAi = 0, i = 1, . . . , n

with (matrix) variable Z and (vector) variable ν
SDP algorithms and applications
since 1990s,
• recently developed interior-point algorithms solve SDPs very effectively
(polynomial time, work well in practice)
• many results for LP extended to SDP
• SDP widely used in many fields
(control, combinatorial optimization, machine learning, finance, signal processing, communications, networking, circuit design, mechanical engineering, . . . )
The basic idea
some interesting weight optimization problems have the common form
minimize   ψ(w) = φ(λ2, . . . , λn)
subject to w ∈ W
where φ is symmetric convex, and W is closed convex
• these are convex optimization problems
• we can solve them numerically (up to our ability to store data, compute eigenvalues . . . )
• for some simple graphs, we can get analytic solutions
• associated dual problems can be quite interesting
Distributed averaging
[Figure: a 6-node example graph with node values x1, . . . , x6 and edge weights W12, W13, W14, W24, W36, W45, W46]
• each node of connected graph has initial value xi(0) ∈ R; goal is to compute average 1Tx(0)/n using distributed iterative method
• applications in load balancing, distributed optimization, sensor networks
Distributed linear averaging
• simple linear iteration: replace each node value with weighted average of its own and its neighbors’ values; repeat

xi(t+1) = Wii xi(t) + Σ_{j∈Ni} Wij xj(t)
        = xi(t) − Σ_{j∈Ni} Wij (xi(t) − xj(t))

where Wii + Σ_{j∈Ni} Wij = 1
• we’ll assume Wij = Wji, i.e., weights symmetric
• weights Wij determine whether convergence to average occurs, and if so, how fast

• classical result: convergence if weights Wij (i ≠ j) are small, positive
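A quick simulation of the iteration x(t+1) = Wx(t) shows convergence to the average (the 4-cycle and the constant weight α = 0.3 are arbitrary illustrative choices):

```python
import numpy as np

# 4-cycle Laplacian; constant weight alpha on every edge gives W = I - alpha*L.
n, alpha = 4, 0.3
L = 2 * np.eye(n)
for i in range(n):
    L[i, (i + 1) % n] -= 1
    L[i, (i - 1) % n] -= 1
W = np.eye(n) - alpha * L

x = np.array([4.0, 0.0, 1.0, 3.0])   # average is 2
for _ in range(100):
    x = W @ x                        # x(t+1) = W x(t)
print(x)                             # every entry approaches 2
```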
Convergence rate
• vector form: x(t+1) = Wx(t) (we take Wij = 0 for i ≠ j, (i, j) ∉ E)
• W satisfies W = WT , W1 = 1
• convergence ⇐⇒ lim_{t→∞} W^t = (1/n)11T ⇐⇒ ρ(W − (1/n)11T) = ‖W − (1/n)11T‖ < 1
ρ is spectral radius; ‖ · ‖ is spectral norm
• asymptotic convergence rate given by ‖W − (1/n)11T‖
• convergence time is τ = −1/ log ‖W − (1/n)11T‖
Connection to Laplacian eigenvalues
• identifying Wij = wl for l ∼ (i, j), we have W = I − L
• convergence rate given by
‖W − (1/n)11T‖ = ‖I − L − (1/n)11T‖
               = max{|1 − λ2|, . . . , |1 − λn|}
               = max{1 − λ2, λn − 1}
. . . a convex spectral function, with φ(u) = maxi |1 − ui|
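The identity can be confirmed numerically: the spectral norm of W − (1/n)11T computed directly matches max{1 − λ2, λn − 1} from the Laplacian spectrum. A sketch (the weighted 4-cycle is an arbitrary test case):

```python
import numpy as np

# Weighted 4-cycle Laplacian (arbitrary weights) and W = I - L.
n = 4
w = [0.3, 0.2, 0.25, 0.15]            # weights on edges (i, i+1 mod n)
L = np.zeros((n, n))
for i, wi in enumerate(w):
    j = (i + 1) % n
    L[i, i] += wi; L[j, j] += wi
    L[i, j] -= wi; L[j, i] -= wi

lam = np.sort(np.linalg.eigvalsh(L))
W = np.eye(n) - L
direct = np.linalg.norm(W - np.ones((n, n)) / n, 2)   # spectral norm
via_spectrum = max(1 - lam[1], lam[-1] - 1)
print(direct, via_spectrum)                            # equal
```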
Fastest distributed linear averaging
minimize   ‖W − (1/n)11T‖
subject to W ∈ S, W = WT, W1 = 1
optimization variable is W ; problem data is graph (sparsity pattern S)
in terms of Laplacian eigenvalues
minimize max{1 − λ2, λn − 1}
with variable w ∈ Rm
• these are convex optimization problems
• so, we can efficiently find the weights that give the fastest possible averaging on a graph
Semidefinite programming formulation
introduce scalar variable s to bound spectral norm
minimize   s
subject to −sI ⪯ I − L − (1/n)11T ⪯ sI

(for Z = ZT, ‖Z‖ ≤ s ⇐⇒ −sI ⪯ Z ⪯ sI)
an SDP (hence, can be solved efficiently)
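The reformulation rests on the norm-bound equivalence noted above: for symmetric Z, ‖Z‖ ≤ s exactly when sI − Z and Z + sI are both PSD. A numerical spot-check (the random 4×4 matrix is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
Z = (B + B.T) / 2                    # symmetric test matrix
s = np.linalg.norm(Z, 2)             # spectral norm ||Z||

# -sI <= Z <= sI in the PSD order: both sandwich matrices are PSD.
min_hi = np.linalg.eigvalsh(s * np.eye(4) - Z).min()
min_lo = np.linalg.eigvalsh(Z + s * np.eye(4)).min()
print(min_hi, min_lo)                # both >= 0 (up to roundoff)
```

Shrinking s below ‖Z‖ makes one of the two sandwich matrices indefinite, which is exactly why the scalar s in the SDP bounds the norm.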
Constant weight averaging
• a simple, traditional method: constant weight on all edges, w = α1
• yields update
xi(t+1) = xi(t) + α Σ_{j∈Ni} (xj(t) − xi(t))
• a simple choice: max-degree weight, α = 1/maxi di
di is degree (number of neighbors) of node i
• best constant weight: α⋆ = 2/(λ2 + λn)
(λ2, λn are eigenvalues of unweighted Laplacian, i.e., with w = 1)
• for edge transitive graph, wl = α⋆ is optimal
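For a concrete comparison, the best constant weight can be computed from the unweighted Laplacian spectrum; a sketch on a 5-cycle (an edge-transitive graph, so the constant weight α⋆ is in fact optimal there):

```python
import numpy as np

# 5-cycle: unweighted Laplacian, then compare convergence factors
# rho(alpha) = max{|1 - alpha*lambda_2|, |1 - alpha*lambda_n|}.
n = 5
L = 2 * np.eye(n)
for i in range(n):
    L[i, (i + 1) % n] -= 1
    L[i, (i - 1) % n] -= 1
lam = np.sort(np.linalg.eigvalsh(L))

def rho(alpha):
    return max(abs(1 - alpha * lam[1]), abs(1 - alpha * lam[-1]))

alpha_md = 1 / 2                       # max-degree weight (all degrees 2)
alpha_best = 2 / (lam[1] + lam[-1])    # best constant weight
print(alpha_best, rho(alpha_md), rho(alpha_best))
```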
A small example
                       max-degree   best constant   optimal
ρ = ‖W − (1/n)11T‖        0.779         0.712        0.643
τ = 1/ log(1/ρ)            4.01          2.94         2.27
Optimal weights
(note: some are negative!)
[Figure: the example graph annotated with the optimal edge weights, ranging from −.25 to .50]
λi(W ): −.643, −.643, −.106, 0.000, .368, .643, .643, 1.000
Larger example
50 nodes, 200 edges
                       max-degree   best constant   optimal
ρ = ‖W − (1/n)11T‖         .971          .947         .902
τ = 1/ log(1/ρ)            33.5          18.3          9.7
Eigenvalue distributions
[Figure: histograms of the eigenvalues of W on [−1, 1] for the max-degree, best constant, and optimal weights]
Optimal weights
[Figure: histogram of the optimal edge weights, ranging over roughly −0.5 to 0.5]
69 out of 250 are negative
Another example
• a cut grid with n = 64 nodes, m = 95 edges
• edge width shows weight value (red for negative)
• τ = 85; max-degree τ = 137
Some questions & comments
• how much better are the optimal weights than the simple choices?
– for barbell graphs Kn −Kn, optimal weights are unboundedly better than max-degree, optimal constant, and several other simple weight choices
• what size problems can be handled (on a PC)?
– interior-point algorithms easily handle problems with 10^4 edges
– subgradient-based methods handle problems with 10^6 edges
– any symmetry can be exploited for efficiency gain
• what happens if we require the weights to be nonnegative?
– (we’ll soon see)
Least-mean-square average consensus
• include random noise in averaging process: x(t+1) = Wx(t) + v(t), with v(t) i.i.d., E v(t) = 0, E v(t)v(t)T = I
• steady-state mean-square deviation:
δss = lim_{t→∞} E (1/n) Σ_{i<j} (xi(t) − xj(t))² = Σ_{i=2}^n 1/(λi(2 − λi))

for ρ = max{1 − λ2, λn − 1} < 1
• another convex spectral function
Random walk on a graph
• Markov chain on nodes of G, with transition probabilities on edges
Pij = Prob (X(t+ 1) = j | X(t) = i)
• we’ll focus on symmetric transition probability matrices P
(everything extends to reversible case, with fixed equilibrium distr.)
• identifying Pij with wl for l ∼ (i, j), we have P = I − L
• same as linear averaging matrix W, but here Wij ≥ 0 (i.e., w ≥ 0, diag(L) ≤ 1)
Mixing rate
• probability distribution πi(t) = Prob(X(t) = i) satisfies π(t+1)T = π(t)TP

• since P = PT and P1 = 1, uniform distribution π = (1/n)1 is stationary, i.e., ((1/n)1)TP = ((1/n)1)T
• π(t) → (1/n)1 for any π(0) iff
µ = ‖P − (1/n)11T‖ = ‖I − L− (1/n)11T‖ < 1
µ is called second largest eigenvalue modulus (SLEM) of MC
• SLEM determines convergence (mixing) rate, e.g.,
sup_{π(0)} ‖π(t) − (1/n)1‖tv ≤ (√n/2) µ^t
• associated mixing time is τ = 1/ log(1/µ)
Fastest mixing Markov chain problem
minimize   µ = ‖I − L − (1/n)11T‖ = max{1 − λ2, λn − 1}
subject to w ≥ 0, diag(L) ≤ 1
• optimization variable is w; problem data is graph G
• same as fast linear averaging problem, with additional nonnegativity constraint Wij ≥ 0 on weights
• convex optimization problem (indeed, SDP), hence efficiently solved
Two common suboptimal schemes
• max-degree chain: w = (1/maxi di)1
• Metropolis-Hastings chain: wl = 1/max{di, dj}, where l ∼ (i, j)

(comes from Metropolis method of generating reversible MC with uniform stationary distribution)
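The Metropolis-Hastings construction is a few lines of code; a sketch on a 3-node path graph (a hypothetical example), checking that the resulting chain is symmetric and stochastic, so the uniform distribution is stationary:

```python
import numpy as np

# Metropolis-Hastings weights: P_ij = 1/max(d_i, d_j) on edges,
# and P_ii absorbs the slack so each row sums to 1.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]])
deg = adj.sum(axis=1)
n = len(deg)
P = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        if adj[i, j]:
            P[i, j] = 1.0 / max(deg[i], deg[j])
np.fill_diagonal(P, 1 - P.sum(axis=1))
print(P)   # symmetric, rows sum to 1 => uniform distribution is stationary
```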
Small example
optimal transition probabilities (some are zero)
[Figure: the small example graph annotated with the optimal transition probabilities; several edges get probability zero]
                 max-degree   M.-H.   optimal   (fastest avg)
SLEM µ              .779      .774     .681        (.643)
mixing time τ       4.01      3.91     2.60        (2.27)
Larger example
50 nodes, 200 edges
                 max-degree   M.-H.   optimal   (fastest avg)
SLEM µ              .971      .949     .915        (.902)
mixing time τ       33.5      19.1     11.3         (9.7)
Optimal transition probabilities
• 82 of the 200 edges have zero transition probability
• distribution of positive transition probabilities:
[Figure: histogram of the positive optimal transition probabilities]
Subgraph with positive transition probabilities
Another example
• a cut grid with n = 64 nodes, m = 95 edges
• edge width shows weight value (dotted for zero)
• mixing time τ = 89; Metropolis-Hastings mixing time τ = 120
Some analytical results
• for path, fastest mixing MC is obvious one (Pi,i+1 = 1/2)
• for any edge-transitive graph (hypercube, ring, . . . ), all edge weights are equal, with value 2/(λ2 + λn) (unweighted Laplacian eigenvalues)
Commute time for random walk on graph
• Pij proportional to wl, for l ∼ (i, j); Pii = 0
• P not symmetric, but MC is reversible
• can normalize w as 1Tw = 1
• commute time Cij: time for random walk to return to i after visiting j
• expected commute time averaged over all pairs of nodes is
C = (1/n²) Σ_{i,j=1}^n E Cij = (2/(n − 1)) Σ_{i=2}^n 1/λi
(Chandra et al, 1989)
• called total effective resistance . . . another convex spectral function
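The link between the eigenvalue sum and effective resistance can be checked with the pseudoinverse: the pairwise effective resistances Rij = (ei − ej)T L† (ei − ej) sum to n Tr L† = n Σ_{i≥2} 1/λi, a standard identity. A sketch (the weighted triangle is an arbitrary example):

```python
import numpy as np

# Weighted triangle, weights normalized so 1^T w = 1.
A = np.array([[ 1,  0, -1],
              [-1,  1,  0],
              [ 0, -1,  1]])
w = np.array([1.0, 2.0, 3.0])
w = w / w.sum()
L = A @ np.diag(w) @ A.T
Lp = np.linalg.pinv(L)
n = 3

# Effective resistance between i and j, summed over unordered pairs.
R_tot = sum(Lp[i, i] + Lp[j, j] - 2 * Lp[i, j]
            for i in range(n) for j in range(i + 1, n))
lam = np.sort(np.linalg.eigvalsh(L))
print(R_tot, n * np.sum(1 / lam[1:]))   # equal
```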
Minimizing average commute time
find weights that minimize average commute time on graph:
minimize   C = (2/(n − 1)) Σ_{i=2}^n 1/λi
subject to w ≥ 0, 1Tw = 1
• another convex problem of our general form
Markov process on a graph
• (continuous-time) Markov process on nodes of G, with transition rate wl ≥ 0 between nodes i and j, for l ∼ (i, j)
• probability distribution π(t) ∈ Rn satisfies heat equation π̇(t) = −Lπ(t)
• π(t) = e−tLπ(0)
• π(t) converges to uniform distribution (1/n)1, for any π(0), iff λ2 > 0
• (asymptotic) convergence as e−λ2t; λ2 gives mixing rate of process
• λ2 is concave, homogeneous function of w
(comes from symmetric concave function φ(u) = mini ui)
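The heat-equation dynamics π(t) = e^{−tL} π(0) can be simulated via the eigendecomposition of L (the 3-node path graph and the initial distribution are arbitrary choices):

```python
import numpy as np

# Markov process on a 3-node path: pi(t) = expm(-tL) pi(0), computed from
# the eigendecomposition L = V diag(lam) V^T.
L = np.array([[ 1., -1.,  0.],
              [-1.,  2., -1.],
              [ 0., -1.,  1.]])
lam, V = np.linalg.eigh(L)

def pi_t(pi0, t):
    return V @ (np.exp(-t * lam) * (V.T @ pi0))

pi0 = np.array([1.0, 0.0, 0.0])       # start at node 1
e1 = np.linalg.norm(pi_t(pi0, 1.0) - 1 / 3)
e5 = np.linalg.norm(pi_t(pi0, 5.0) - 1 / 3)
print(e1, e5)   # deviation from uniform shrinks roughly like exp(-lambda_2 t)
```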
Fastest mixing Markov process on a graph
maximize   λ2
subject to Σl dl² wl ≤ 1, w ≥ 0
• variable is w ∈ Rm; data is graph, normalization constants dl > 0
• a convex optimization problem, hence easily solved
• allocate rate across edges so as to maximize mixing rate
• constraint is always tight at solution, i.e., Σl dl² wl = 1

• when dl² = 1/m, optimal value is called absolute algebraic connectivity
Interpretation: Grounded unit capacitor RC circuit
• charge vector q(t) satisfies q̇(t) = −Lq(t), with edge weights given by conductances, wl = gl

• charge equilibrates (i.e., converges to uniform) at rate determined by λ2

• with conductor resistivity ρ, length dl, and cross-sectional area al, we have gl = al/(ρ dl)
• total conductor volume is Σl dl al = ρ Σl dl² wl

• problem is to choose conductor cross-sectional areas, subject to a total volume constraint, so as to make the circuit equilibrate charge as fast as possible
optimal λ2 is .105; uniform allocation of conductance gives λ2 = .073
SDP formulation and dual
alternate formulation:
minimize   Σl dl² wl
subject to λ2 ≥ 1, w ≥ 0

SDP formulation:

minimize   Σl dl² wl
subject to L ⪰ I − (1/n)11T, w ≥ 0

dual problem:

maximize   Tr X
subject to Xii + Xjj − Xij − Xji ≤ dl², l ∼ (i, j)
           1TX1 = 0, X ⪰ 0

with variable X ∈ R^{n×n}
A maximum variance unfolding problem
• use variables x1, . . . , xn ∈ R^n, with X = [x1 · · · xn]T [x1 · · · xn] (i.e., Xij = xiTxj)
• dual problem becomes maximum variance unfolding (MVU) problem
maximize   Σi ‖xi‖²
subject to ‖xi − xj‖ ≤ dl, l ∼ (i, j)
           Σi xi = 0
• position n points in R^n to maximize variance, while respecting local distance constraints
• similar to semidefinite embedding for unsupervised learning of manifolds (Weinberger & Saul 2004)

• surprise: duality between fastest mixing Markov process and maximum variance unfolding
Conclusions
some interesting weight optimization problems have the common form
minimize   ψ(w) = φ(λ2, . . . , λn)
subject to w ∈ W
where φ is symmetric convex, and W is closed convex
• these are convex optimization problems
• we can solve them numerically (up to our ability to store data, compute eigenvalues . . . )
• for some simple graphs, we can get analytic solutions
• associated dual problems can be quite interesting
Some references
• Convex OptimizationCambridge University Press, 2004
• Fast linear iterations for distributed averagingSystems and Control Letters 53:65-78, 2004
• Fastest mixing Markov chain on a graphSIAM Review 46(4):667-689, 2004
• Fastest mixing Markov process on a graph and a connection to amaximum variance unfolding problemto appear, SIAM Review 48(4), 2006
• Minimizing effective resistance of a graphto appear, SIAM Review, 2007
• Distributed average consensus with least-mean-square deviationMTNS p2768-2776, 2006
these and other papers available at www.stanford.edu/~boyd