Introduction · Algorithm · Convergence analysis · Experiment
An exact first-order algorithm for decentralized consensus optimization
Xinwei Sun(1301110047), Jingru Zhang(1301110029)
December 22, 2014
1 Introduction
2 Algorithm
   DGD method
   EXTRA method
   EXTRA as Corrected DGD
3 Convergence analysis
   Convex objective with Lipschitz continuous gradient
   Strongly convex with Lipschitz continuous gradient
4 Experiment
Decentralized Consensus Optimization
min_{x∈R^p} f̄(x) = (1/n) ∑_{i=1}^n f_i(x),  f_i : R^p → R. (1)
EXTRA method:
- uses a fixed step size independent of the network size;
- achieves a convergence rate of O(1/k) for general convex objectives with Lipschitz continuous gradients;
- achieves a linear convergence rate for (restricted) strongly convex objectives.
Previous Methods
Existing first-order decentralized methods: (sub)gradient, (sub)gradient-push, fast (sub)gradient, dual averaging.
- They require more restrictive assumptions and have worse convergence rates.
- With a fixed step size, they do not converge to a solution x⋆, only to its neighborhood.
Notation
x_(i) ∈ R^p: local copy of the global variable x.
x^k_(i): its value at iteration k.
f(x) ≜ ∑_{i=1}^n f_i(x_(i)): an aggregate objective function of the local variables.
x ≜ (x_(1), x_(2), …, x_(n))^T ∈ R^{n×p}: the matrix whose i-th row is x_(i)^T.
Notation
∇f(x) ≜ (∇f_1(x_(1)), ∇f_2(x_(2)), …, ∇f_n(x_(n)))^T ∈ R^{n×p}: the matrix whose i-th row is ∇f_i(x_(i))^T.
x is consensual if all of its rows are identical, i.e., x_(1) = ⋯ = x_(n).
‖A‖_G ≜ √(trace(A^T G A)): the G-matrix norm.
null{A} ≜ {x ∈ R^n | Ax = 0}: the null space of A.
span{A} ≜ {y ∈ R^m | y = Ax for some x ∈ R^n}: the linear span of the columns of A.
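As a quick illustration, the stacked-matrix notation above can be sketched in NumPy (a minimal sketch; the helper names are our own):

```python
import numpy as np

# Stacked notation: x is n-by-p, one row per agent's local copy x_(i).
n, p = 4, 3
rng = np.random.default_rng(0)
x = np.tile(rng.standard_normal(p), (n, 1))   # identical rows: consensual

def is_consensual(x, tol=1e-12):
    """True if all rows of x are identical (up to tol)."""
    return bool(np.all(np.abs(x - x[0]) < tol))

def g_norm(a, g):
    """G-matrix norm: sqrt(trace(A^T G A))."""
    return float(np.sqrt(np.trace(a.T @ g @ a)))

assert is_consensual(x)
# With G = I the G-norm reduces to the Frobenius norm.
assert np.isclose(g_norm(x, np.eye(n)), np.linalg.norm(x, "fro"))
```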
DGD method · EXTRA method · EXTRA as Corrected DGD
Algorithm 1 (DGD)

x^{k+1}_(i) = ∑_{j=1}^n w_ij x^k_(j) − α_k ∇f_i(x^k_(i)), for agent i = 1, …, n. (2)

Matrix version:

x^{k+1} = W x^k − α_k ∇f(x^k). (3)

- x^k_(i) ∈ R^p is the local copy of x held by agent i at iteration k;
- W = [w_ij] ∈ R^{n×n} is a symmetric mixing matrix satisfying null{I − W} = span{1} and σ_max(W − (1/n)11^T) < 1; if two agents i and j are neither neighbors nor identical, then w_ij = 0;
- α_k > 0 is the step size for iteration k.
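A minimal NumPy sketch of the matrix-form update (3), on a made-up quadratic f_i(x) = ½‖x − c_i‖² over a ring network (the network, the targets c_i, and the step size are our own illustrative choices):

```python
import numpy as np

# DGD in matrix form: x^{k+1} = W x^k - alpha * grad f(x^k).
n, p = 4, 2
rng = np.random.default_rng(1)
c = rng.standard_normal((n, p))        # per-agent targets (hypothetical data)

# Symmetric, doubly stochastic mixing matrix for a ring network.
W = np.zeros((n, n))
for i in range(n):
    W[i, i] = 0.5
    W[i, (i + 1) % n] = W[i, (i - 1) % n] = 0.25

alpha = 0.05                           # fixed step size
x = np.zeros((n, p))
for k in range(2000):
    grad = x - c                       # stacked gradients, one row per agent
    x = W @ x - alpha * grad

# The row average reaches the true minimizer mean(c), but the individual
# rows stall in an O(alpha)-neighborhood instead of reaching consensus.
assert np.allclose(x.mean(axis=0), c.mean(axis=0), atol=1e-8)
assert np.linalg.norm(x - x.mean(axis=0)) > 1e-6
```

This already exhibits the dilemma discussed next: a fixed step converges quickly, but not to an exact consensual solution.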
Dilemma of DGD
- With a sequence of diminishing step sizes, DGD converges to an exact solution, but slowly.
- With a fixed step size, DGD converges faster but stalls at an inaccurate solution (an O(α)-neighborhood of a solution).
The Cause of Inexact Convergence With a Fixed Step Size
Let x∞ be the limit of x^k. Taking the limit over k in (3) gives

x∞ = W x∞ − α ∇f(x∞). (4)

If x∞ were consensual, then W x∞ = x∞ and (4) would force ∇f(x∞) = 0, which does not hold in general; hence the fixed-step limit cannot be an exact consensual solution.
Development of EXTRA
Consider a DGD update at iteration k+1 and a DGD-like update with a new mixing matrix W̃ at iteration k:

x^{k+2} = W x^{k+1} − α ∇f(x^{k+1}), (5)
x^{k+1} = W̃ x^k − α ∇f(x^k), (6)
W̃ = (I + W)/2; this choice of W̃ will be generalized later. (7)

Subtracting (6) from (5), we get the update formula of EXTRA:

x^{k+2} − x^{k+1} = W x^{k+1} − W̃ x^k − α ∇f(x^{k+1}) + α ∇f(x^k). (8)
Conclusion 1
Provided that null{I − W} = span{1}, W̃ = (I + W)/2, 1^T(W − W̃) = 0, and ∇f is continuous, if a sequence following EXTRA converges to a point x⋆, then x⋆ is consensual and any of its identical row vectors solves problem (1).
The Algorithm EXTRA
Algorithm 2 (EXTRA)
Choose α > 0 and mixing matrices W ∈ R^{n×n} and W̃ ∈ R^{n×n};
pick any x^0 ∈ R^{n×p};
1. x^1 ← W x^0 − α ∇f(x^0);
2. for k = 0, 1, … do
     x^{k+2} ← (I + W) x^{k+1} − W̃ x^k − α[∇f(x^{k+1}) − ∇f(x^k)];
   end for
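Algorithm 2 as a NumPy sketch, on a made-up quadratic f_i(x) = ½‖x − c_i‖² over a ring network (the data, network, and α = 0.2 are our own choices; for this toy problem the exact consensual solution is the average of the c_i):

```python
import numpy as np

# EXTRA: x^1 = W x^0 - alpha*grad(x^0), then
# x^{k+2} = (I+W) x^{k+1} - W~ x^k - alpha*[grad(x^{k+1}) - grad(x^k)].
n, p = 4, 2
rng = np.random.default_rng(1)
c = rng.standard_normal((n, p))        # per-agent targets (hypothetical data)
grad = lambda x: x - c                 # stacked gradients of f_i = 0.5*||x-c_i||^2

W = np.zeros((n, n))                   # ring-network mixing matrix
for i in range(n):
    W[i, i] = 0.5
    W[i, (i + 1) % n] = W[i, (i - 1) % n] = 0.25
W_tilde = (np.eye(n) + W) / 2          # recommended choice W~ = (I+W)/2

alpha = 0.2
x_prev = np.zeros((n, p))
x_curr = W @ x_prev - alpha * grad(x_prev)            # step 1
for k in range(500):                                  # step 2
    x_next = ((np.eye(n) + W) @ x_curr - W_tilde @ x_prev
              - alpha * (grad(x_curr) - grad(x_prev)))
    x_prev, x_curr = x_curr, x_next

# Unlike fixed-step DGD, EXTRA reaches the exact consensual solution.
assert np.allclose(x_curr, np.tile(c.mean(axis=0), (n, 1)), atol=1e-8)
```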
Assumptions on The Mixing Matrices W and W̃
Assumption 1 (Mixing matrices)
Consider a connected network G = {V, E} consisting of a set of agents V = {1, 2, …, n} and a set of undirected edges E. The mixing matrices W = [w_ij] ∈ R^{n×n} and W̃ = [w̃_ij] ∈ R^{n×n} satisfy
1. (Decentralized property) If i ≠ j and (i, j) ∉ E, then w_ij = w̃_ij = 0.
2. (Symmetry) W = W^T, W̃ = W̃^T.
3. (Null space property) null{W − W̃} = span{1}, null{I − W̃} ⊇ span{1}.
4. (Spectral property) W̃ ≻ 0 and (I + W)/2 ≽ W̃ ≽ W.

In fact, EXTRA can use the same W used in DGD and simply take W̃ = (I + W)/2.
Mixing Matrices
The mixing matrices W and W̃ diffuse information throughout the network, which can significantly affect performance; we will verify this point in the numerical experiments. W can be chosen by the following methods:
- symmetric doubly stochastic matrix: W = W^T, W1 = 1, and w_ij ≥ 0;
- Laplacian-based constant edge weight matrix;
- Metropolis constant edge weight matrix;
- symmetric fastest distributed linear averaging (FDLA) matrix.
W̃ = (I + W)/2 is found to be very efficient.
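As an illustration of one of the listed choices, a sketch of the Metropolis constant edge weight construction (the small example graph is made up):

```python
import numpy as np

def metropolis_weights(n, edges):
    """Metropolis weights: w_ij = 1/(1 + max(deg_i, deg_j)) on edges,
    with the diagonal chosen so every row sums to 1."""
    deg = np.zeros(n, dtype=int)
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    W = np.zeros((n, n))
    for i, j in edges:
        W[i, j] = W[j, i] = 1.0 / (1 + max(deg[i], deg[j]))
    W[np.diag_indices(n)] = 1.0 - W.sum(axis=1)
    return W

edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]   # small example graph
W = metropolis_weights(4, edges)
W_tilde = (np.eye(4) + W) / 2

# W is symmetric and doubly stochastic, as required of a mixing matrix.
assert np.allclose(W, W.T)
assert np.allclose(W.sum(axis=1), 1.0)
```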
From the update formula

x^{k+2} − x^{k+1} = W x^{k+1} − W̃ x^k − α ∇f(x^{k+1}) + α ∇f(x^k),

summing the updates over iterations gives

x^{k+1} = W x^k − α ∇f(x^k) + ∑_{t=0}^{k−1} (W − W̃) x^t, k = 0, 1, … (9)

An EXTRA update is a DGD update with a cumulative correction term. The role of the cumulative term ∑_{t=0}^{k−1} (W − W̃) x^t is to neutralize −α ∇f(x^k) in (span{1})^⊥.
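The equivalence between the EXTRA recursion and the corrected-DGD form (9) can be checked numerically; a sketch on a toy quadratic f_i(x) = ½‖x − c_i‖² (data, network, and step size are our own choices):

```python
import numpy as np

n, p = 4, 2
rng = np.random.default_rng(1)
c = rng.standard_normal((n, p))
grad = lambda x: x - c                 # gradients of f_i = 0.5*||x - c_i||^2
alpha = 0.2

W = np.zeros((n, n))                   # ring-network mixing matrix
for i in range(n):
    W[i, i] = 0.5
    W[i, (i + 1) % n] = W[i, (i - 1) % n] = 0.25
W_tilde = (np.eye(n) + W) / 2

# Generate iterates with the EXTRA recursion.
xs = [np.zeros((n, p))]
xs.append(W @ xs[0] - alpha * grad(xs[0]))
for k in range(20):
    xs.append((np.eye(n) + W) @ xs[k + 1] - W_tilde @ xs[k]
              - alpha * (grad(xs[k + 1]) - grad(xs[k])))

# Each iterate also satisfies the corrected-DGD form (9):
# a DGD step plus the cumulative correction sum_{t<k} (W - W~) x^t.
for k in range(1, 21):
    correction = sum((W - W_tilde) @ xs[t] for t in range(k))
    assert np.allclose(xs[k + 1], W @ xs[k] - alpha * grad(xs[k]) + correction)
```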
Convex objective with Lipschitz continuous gradient · Strongly convex with Lipschitz continuous gradient
Preliminaries
Lemma 1
Assume null{I − W} = span{1}. If

x⋆ = (x⋆_(1), x⋆_(2), …, x⋆_(n))^T ∈ R^{n×p} (10)

satisfies the conditions
1. x⋆ = W x⋆ (consensus),
2. 1^T ∇f(x⋆) = 0 (optimality),
then x⋆_(i), for any i, is a solution to the consensus optimization problem.
Lemma 2
Given mixing matrices W and W̃, define U = (W̃ − W)^{1/2} by letting U ≜ V S^{1/2} V^T ∈ R^{n×n}, where V S V^T = W̃ − W is the economical-form singular value decomposition. Then, under the assumptions above, x⋆ is consensual and x⋆_(1) = x⋆_(2) = ⋯ = x⋆_(n) is optimal for the optimization problem if and only if there exists q⋆ = U p for some p ∈ R^{n×p} such that

U q⋆ + α ∇f(x⋆) = 0, (11)
U x⋆ = 0. (12)
Theorem 1
Under Assumptions 1–3, if α satisfies 0 < α < 2λ_min(W̃)/L_f, then

‖z^k − z⋆‖²_G − ‖z^{k+1} − z⋆‖²_G ≥ ζ‖z^k − z^{k+1}‖²_G, k = 0, 1, …, (13)

where ζ = 1 − αL_f/(2λ_min(W̃)), and

z^k = [q^k; x^k], z⋆ = [q⋆; x⋆], G = [I, 0; 0, W̃].
Key steps in the proof of the theorem above:

(I + W − 2W̃) x⋆ = 0, (14)

(2α/L_f) ‖∇f(x^k) − ∇f(x⋆)‖²_F ≤ 2α⟨x^k − x⋆, ∇f(x^k) − ∇f(x⋆)⟩, (15)

⟨x^{k+1} − x⋆, U(q⋆ − q^{k+1})⟩ = ⟨U(x^{k+1} − x⋆), q⋆ − q^{k+1}⟩ = ⟨U x^{k+1}, q⋆ − q^{k+1}⟩, (16)

⟨x^{k+1} − x⋆, W̃(q⋆ − q^{k+1})⟩ = ⟨W̃(x^{k+1} − x⋆), q⋆ − q^{k+1}⟩. (17)
Key steps in the proof of the theorem above (continued):

‖z^k − z⋆‖²_G − ‖z^{k+1} − z⋆‖²_G − ‖z^k − z^{k+1}‖²_G − 2‖x^{k+1} − x⋆‖²_{I+W−2W̃} + (αL_f/2)‖x^k − x^{k+1}‖²_F ≥ 0, (18)

‖z^k − z⋆‖²_G − ‖z^{k+1} − z⋆‖²_G − ‖z^k − z^{k+1}‖²_G + (αL_f/2)‖x^k − x^{k+1}‖²_F ≥ 0, (19)

‖z^k − z⋆‖²_G − ‖z^{k+1} − z⋆‖²_G ≥ ‖z^k − z^{k+1}‖²_{G′}, (20)

‖z^k − z^{k+1}‖²_{G′} ≥ ζ‖z^k − z^{k+1}‖²_G. (21)
From the analysis above, we can state the three necessary assumptions.

Assumption 2 (Mixing matrices)
Consider a connected network G = {V, E} consisting of a set of agents V = {1, 2, …, n} and a set of undirected edges E. The mixing matrices W = [w_ij] ∈ R^{n×n} and W̃ = [w̃_ij] ∈ R^{n×n} satisfy
1. (Decentralized property) If i ≠ j and (i, j) ∉ E, then w_ij = w̃_ij = 0.
2. (Symmetry) W = W^T, W̃ = W̃^T.
3. (Null space property) null{W − W̃} = span{1}, null{I − W̃} ⊇ span{1}.
4. (Spectral property) W̃ ≻ 0 and (I + W)/2 ≽ W̃ ≽ W.
Assumption 3 (Convex objective with Lipschitz continuous gradient)
The objective functions f_i are proper, closed, convex, and Lipschitz differentiable:

‖∇f_i(x_a) − ∇f_i(x_b)‖₂ ≤ L_{f_i} ‖x_a − x_b‖₂, ∀x_a, x_b ∈ R^p,

where the L_{f_i} ≥ 0 are constants.

Assumption 4 (Solution existence)
The optimization problem has a nonempty set of optimal solutions: X⋆ ≠ ∅.
From the theorem above, we can derive:

Theorem 2
In the same setting as Theorem 1, the following rates hold:
(1) Running-average progress: (1/k) ∑_{t=1}^k ‖z^t − z^{t+1}‖²_G = O(1/k);
(2) Running-best progress: min_{t≤k} {‖z^t − z^{t+1}‖²_G} = o(1/k);
(3) Running-average optimality residuals: (1/k) ∑_{t=1}^k ‖U q^t + α ∇f(x^t)‖²_{W̃} = O(1/k) and (1/k) ∑_{t=1}^k ‖U x^t‖²_F = O(1/k);
(4) Running-best optimality residuals: min_{t≤k} {‖U q^t + α ∇f(x^t)‖²_{W̃}} = o(1/k) and min_{t≤k} {‖U x^t‖²_F} = o(1/k).
Theorem 3
If g(x) ≜ f(x) + (1/(4α))‖x‖²_{W̃−W} is restricted strongly convex with respect to x⋆ with constant µ_g > 0, then with a proper step size α < 2µ_g λ_min(W̃)/L_f², there exists δ > 0 such that the sequence {z^k} generated by EXTRA satisfies

‖z^k − z⋆‖²_G ≥ (1 + δ)‖z^{k+1} − z⋆‖²_G, (22)

i.e., z^k converges to z⋆ at a linear rate.
Denote by µ_f the restricted strong convexity constant of f(x); then in the proof of the theorem above,

µ_g = min{ µ_f − 2L_f γ, λ̃_min(W̃ − W) / (2α(1 + 1/γ²)) }. (23)

Equating the two terms, we set

α = λ̃_min(W̃ − W) / (2(1 + 1/γ²)(µ_f − 2L_f γ)) = O(µ_g / L_f²). (24)

The paper notes that if we instead set α = O(1/L_f), the algorithm still converges and becomes faster, but proving linear convergence under this step size remains an open question.
Experiment setting: we consider solving the problem

minimize_x f̄(x) = (1/n) ∑_{i=1}^n (1/2)‖M_(i) x − y_(i)‖²₂. (25)

Here y_(i) = M_(i) x + e_(i), where y_(i) ∈ R^{m_i} and M_(i) ∈ R^{m_i×p} are measured data, x ∈ R^p is the unknown signal, and e_(i) ∈ R^{m_i} is unknown noise. In this setting, n = 10, m_i = 1 ∀i, and p = 5. The data M_(i) and e_(i), ∀i, are generated from the standard normal distribution. The algorithm starts from x^0_(i) = 0, ∀i, and we set ‖x − x^0_(i)‖ = 300.
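A sketch reproducing this kind of setup with EXTRA (the complete-graph mixing matrix and the step size α = 1/(2L_f), borrowed from the deck's later strongly convex experiment, are our own choices; the deck does not specify the network here):

```python
import numpy as np

# Decentralized least squares (25): n agents, each holds one row M_i, y_i.
n, p = 10, 5
rng = np.random.default_rng(0)
M = rng.standard_normal((n, 1, p))              # m_i = 1 per agent
x_true = rng.standard_normal(p)
y = np.array([M[i] @ x_true + rng.standard_normal(1) for i in range(n)])

def grad(x):                                    # row i: grad f_i(x_(i))
    return np.array([M[i].T @ (M[i] @ x[i] - y[i]) for i in range(n)])

W = np.full((n, n), 1.0 / n)                    # complete-graph averaging
W_tilde = (np.eye(n) + W) / 2

Lf = max(np.sum(M[i] ** 2) for i in range(n))   # Lipschitz constant of grad f_i
alpha = 1.0 / (2.0 * Lf)

x_prev = np.zeros((n, p))
x_curr = W @ x_prev - alpha * grad(x_prev)
for k in range(20000):
    x_next = ((np.eye(n) + W) @ x_curr - W_tilde @ x_prev
              - alpha * (grad(x_curr) - grad(x_prev)))
    x_prev, x_curr = x_curr, x_next

# Every agent's row approaches the centralized least-squares solution.
x_star, *_ = np.linalg.lstsq(M.reshape(n, p), y.ravel(), rcond=None)
```

With m_i = 1, each f_i is not strongly convex on its own, but the aggregate least-squares problem is restricted strongly convex, and EXTRA still drives every agent to the exact consensual solution.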
[Figure: convergence comparison over 3000 iterations (residual on a log scale, 10⁻¹² to 10⁰) for: EXTRA; DGD with fixed α; DGD with α/k^{1/2}; DGD with α/k^{1/3}; DGD with 3α/k^{1/3}; DGD with 5α/k^{1/2}.]
We adjust m_i, ∀i, to 20 to make f_i(x) strongly convex. All other parameters are the same, and we set α = 1/(2L_f).

[Figure: convergence comparison over 250 iterations (residual on a log scale, 10⁻¹² to 10⁰).]
Other Mixing Matrices
We have also tried other methods of choosing the mixing matrix W. The results of using the Metropolis method with different connectivity ratios r = 0.2, 0.5, 0.7 show the significant effect of the mixing matrices.