![Page 1: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/1.jpg)
Fast Leader (Full) Recovery despite Dynamic Faults
Ajoy K. Datta
Stéphane Devismes
Lawrence L. Larmore
Sébastien Tixeuil
![Page 2: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/2.jpg)
Joint Work
ICDCN, 04/01/2013, Mumbai
Ajoy K. Datta & Lawrence L. Larmore
Sébastien Tixeuil
![Page 3: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/3.jpg)
Self-Stabilization [Dijkstra,74]
![Page 4: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/4.jpg)
Self-Stabilization [Dijkstra,74]
A fault = a process state corruption
![Page 5: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/5.jpg)
![Page 6: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/6.jpg)
![Page 7: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/7.jpg)
![Page 8: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/8.jpg)
![Page 9: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/9.jpg)
![Page 10: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/10.jpg)
Self-Stabilization [Dijkstra,74]
Recover after any number of transient faults
![Page 11: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/11.jpg)
Price of the Versatility
1. Several impossibility results
   – E.g., Leader Election and Token Circulation in anonymous networks
2. The stabilization time usually depends on global parameters (diameter, size of the network, …)
![Page 12: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/12.jpg)
![Page 13: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/13.jpg)
![Page 14: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/14.jpg)
When a few faults hit the system
• Self-Stabilization: Ω(D) rounds
• Stronger forms:
  – Fault Containment [Ghosh et al., Dist. Comp. 2007]
  – k-adaptive Self-Stabilization [Burman et al., OPODIS’05]
• Weakened forms:
  – k-stabilization [Beauquier et al., PODC’98]
![Page 15: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/15.jpg)
![Page 16: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/16.jpg)
Fault-Containment
• Pros
  – Self-stabilizing
  – If f ≤ k faults, stabilization time in O(f) rounds
  – Containment radius
  – Fault gap is small
• Cons (currently)
  – k = 1, or
  – Surrounded by a majority of correct processes, or
  – Synchronous setting, or
  – Probabilistic recovery
![Page 17: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/17.jpg)
Fault gap
• The minimum time between consecutive faulty transitions needed to obtain O(f) recovery time
[Figure labels: Legitimate; Illegitimate; ≥ fault gap; O(f)]
![Page 18: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/18.jpg)
Fault gap
• The minimum time between consecutive faulty transitions needed to obtain O(f) recovery time
[Figure labels: Legitimate; Illegitimate; < fault gap; > Ω(D)]
![Page 19: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/19.jpg)
Time-Adaptive Self-stabilization
• Self-Stabilization
• If the Hamming distance to a legitimate configuration is f ≤ k, i.e., f ≤ k faults occur simultaneously (static faults):
  – “output” stabilization in O(f) rounds
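The Hamming-distance condition above can be illustrated with a minimal Python sketch. It assumes a configuration is modeled as a list holding one state per process; the function names and this list model are illustrative assumptions, not notation from the talk.

```python
def hamming_distance(config_a, config_b):
    """Number of processes whose states differ between two configurations."""
    if len(config_a) != len(config_b):
        raise ValueError("configurations must cover the same processes")
    return sum(1 for a, b in zip(config_a, config_b) if a != b)


def distance_to_legitimate(config, legitimate_configs):
    """Smallest Hamming distance from `config` to any legitimate configuration."""
    return min(hamming_distance(config, legit) for legit in legitimate_configs)


def within_k_static_faults(config, legitimate_configs, k):
    """True if `config` could result from f <= k simultaneous state corruptions."""
    return distance_to_legitimate(config, legitimate_configs) <= k
```

For instance, a configuration that differs from some legitimate configuration in two process states is at distance f = 2, so it falls under the guarantee whenever k ≥ 2.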
![Page 20: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/20.jpg)
![Page 21: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/21.jpg)
Output vs. State Stabilization
[Figure labels: Legitimate; Illegitimate; f ≤ k faults; Correct Output: O(f); > Ω(D)]
The fault gap depends on global parameters
![Page 22: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/22.jpg)
k-Stabilization (first definition)
If the Hamming distance to a legitimate configuration is f ≤ k, i.e., f ≤ k faults occur simultaneously, the system eventually recovers.
Otherwise, no guarantee.
![Page 23: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/23.jpg)
k-Stabilization (first definition)
• Pros
  – Can solve more problems than self-stabilization
  – Usually, only-k-dependent stabilization time
  – Usually, only-k-dependent fault gap
• Cons
  – Not self-stabilizing
  – Static faults: f ≤ k faults should occur in a single transition
![Page 24: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/24.jpg)
Our definition of k-stabilization
• Faulty transition = one process state corruption
• Dynamic faults:
  – if f ≤ k faulty transitions occur in an arbitrary manner
• The system eventually recovers
![Page 25: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/25.jpg)
Our definition of k-stabilization
[Figure labels: Legitimate; Illegitimate; 1 fault, 1 fault, 1 fault; f ≤ k faults]
![Page 26: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/26.jpg)
Our contribution
• Leader recovery protocol
  – On an anonymous (yet oriented) ring
  – Asynchronous atomic read/write
  – k-stabilizing if n ≥ 18k + 1
  – Stabilization time O(k²) rounds
  – log(k) bits per process
  – This problem is unsolvable in the self-stabilizing setting
![Page 27: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/27.jpg)
Our contribution
The system starts in a legitimate configuration where one process is elected
![Page 28: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/28.jpg)
![Page 29: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/29.jpg)
Our contribution
Some faulty transitions occur in an arbitrary manner
Fault propagation
![Page 30: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/30.jpg)
![Page 31: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/31.jpg)
Our contribution
If n ≥ 18k + 1, the system recovers the same leader in O(k²) rounds
![Page 32: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/32.jpg)
![Page 33: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/33.jpg)
![Page 34: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/34.jpg)
![Page 35: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/35.jpg)
![Page 36: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/36.jpg)
Fault gap
[Figure labels: Legitimate; Illegitimate; f ≤ k faulty transitions (twice); 0; 0; O(k²) rounds]
![Page 37: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/37.jpg)
Main ideas of the algorithm
![Page 38: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/38.jpg)
Vote = Relative Address ∈ {-3k..3k} ∪ {⊥}
[Figure: ring with votes 3, 2, 1, 0, -1, -2, -3 and ⊥ elsewhere; label 3k]
Interval of relevance: 6k+1 votes
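As a rough illustration of relative-address votes, the following Python sketch builds the votes of a legitimate configuration on a ring. The absolute positions, the sign convention (vote = leader position minus own position) and the name `legitimate_votes` are assumptions made for the example; in the anonymous ring of the talk, processes have no such global indices.

```python
BOTTOM = None  # stands for ⊥ (no vote)


def legitimate_votes(n, k, leader):
    """Votes of all n processes when the process at index `leader` is elected.

    A process within distance 3k of the leader stores the signed offset
    to the leader; every other process stores BOTTOM.
    Assumes n >= 6k + 1 so that the offsets are unambiguous.
    """
    votes = []
    for p in range(n):
        d = (leader - p) % n      # offset in 0..n-1
        if d > n // 2:            # fold into a signed offset
            d -= n
        votes.append(d if abs(d) <= 3 * k else BOTTOM)
    return votes
```

With n = 19, k = 1 and the leader at index 5, exactly 6k + 1 = 7 processes hold a vote, matching the interval of relevance on the slide.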
![Page 39: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/39.jpg)
After k faults
[Figure: ring with votes 3, 2, 1, 0, -1, -2, -3 and ⊥ elsewhere]
![Page 40: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/40.jpg)
After k faults
[Figure: ring with votes 3, 0, 1, 0, -1, -2, -3 and ⊥ elsewhere (the vote 2 has changed to 0)]
![Page 41: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/41.jpg)
![Page 42: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/42.jpg)
After k faults
[Figure: ring with votes 3, 0, 1, 1, 0, -2, -3 and ⊥ elsewhere (three votes have changed)]
At most 3k processes change their votes
Always a majority of votes for the previous leader
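The majority claim follows from simple counting, sketched below in Python. The sketch assumes that all 6k + 1 processes in the leader's interval of relevance initially vote for the leader, and takes the bound of 3k changed votes from the slide; the function names are illustrative.

```python
def votes_left_for_leader(k, changed):
    """Votes still naming the previous leader after `changed` votes were altered."""
    if not 0 <= changed <= 3 * k:
        raise ValueError("the slide bounds the number of changed votes by 3k")
    return (6 * k + 1) - changed


def still_majority(k, changed):
    """True if the remaining votes are a strict majority of the 6k + 1 votes."""
    return 2 * votes_left_for_leader(k, changed) > 6 * k + 1
```

In the worst case 3k votes change, leaving 3k + 1 of the 6k + 1 votes, which is still a strict majority.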
![Page 43: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/43.jpg)
Rumors
[Figure: a process with Vote = 1 and Rumor = 1]
In a legitimate state, Vote = Rumor for all processes
Main idea:
• Vote: hard to change
• Rumor: easy to change
![Page 44: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/44.jpg)
Rumors
[Figure: a process whose Vote and Rumor differ (values 1 and 2)]
If Rumor ≠ Vote
• If Rumor ≠ ⊥
  • Candidate ← Rumor
• Else
  • Candidate ← Vote
Initiate Query(Candidate)
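The candidate selection rule on this slide can be written as a small Python function. This is a sketch only: ⊥ is modeled by a sentinel string, and returning `None` to mean "no query is initiated" is an assumption of the example.

```python
BOTTOM = "⊥"  # sentinel for the empty vote/rumor


def candidate_to_query(vote, rumor):
    """Return the candidate to query, or None when Vote = Rumor (no query)."""
    if rumor == vote:
        return None
    # Prefer the rumor when there is one; otherwise fall back to the vote.
    return rumor if rumor != BOTTOM else vote
```

A process with Vote = 1 and Rumor = 2 would therefore query candidate 2, while a process whose rumor is ⊥ re-checks its own vote.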
![Page 45: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/45.jpg)
Rumors
[Figure: a process whose Vote and Rumor differ (values 1 and 2)]
Query(Candidate) traverses the interval of relevance of the candidate (6k+1 processes) and counts the votes for the candidate
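What a query computes can be approximated by a centralized Python sketch. In the protocol the query travels along the ring from process to process; here the count is done in one loop over a global list, with absolute indices and the convention "own position + vote = position voted for". Both are simplifying assumptions, as is the name `count_votes`.

```python
BOTTOM = "⊥"  # sentinel for the empty vote


def count_votes(votes, k, target):
    """Count the votes naming position `target` inside its interval of relevance.

    `votes[p]` is the relative address stored by process p, or BOTTOM.
    The interval of relevance covers the 6k + 1 positions target-3k..target+3k.
    """
    n = len(votes)
    count = 0
    for offset in range(-3 * k, 3 * k + 1):
        p = (target + offset) % n
        v = votes[p]
        if v != BOTTOM and (p + v) % n == target:
            count += 1
    return count
```

The result is what the initiator compares against the 3k + 1 threshold when the query returns.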
![Page 46: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/46.jpg)
Query Return
• If at least 3k+1 votes for the Candidate
  – If Rumor ≠ ⊥ and Rumor ≠ Candidate
    • Initiate a Denial of Rumor in its interval of relevance
  – Vote ← Candidate
  – Rumor ← Candidate
• Else
  – If Rumor = Candidate, then Rumor ← ⊥
  – Initiate a Denial of Candidate in its interval of relevance
  – If Vote = Candidate, then Vote ← ⊥
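The query-return rule can be sketched as a pure Python function. Two assumptions are made here: the garbled first sub-condition is read as "Rumor ≠ ⊥ and Rumor ≠ Candidate", and a Denial is modeled simply as an entry in a returned list rather than as a wave sent along the ring. The function name and signature are illustrative.

```python
BOTTOM = "⊥"  # sentinel for the empty vote/rumor


def query_return(vote, rumor, candidate, votes_for_candidate, k):
    """Apply the query-return rule; return (new_vote, new_rumor, denials)."""
    denials = []  # values for which a Denial would be initiated
    if votes_for_candidate >= 3 * k + 1:
        # The candidate is confirmed: kill a competing rumor, then adopt it.
        if rumor != BOTTOM and rumor != candidate:
            denials.append(rumor)
        vote = candidate
        rumor = candidate
    else:
        # The candidate is rejected: forget it and deny it.
        if rumor == candidate:
            rumor = BOTTOM
        denials.append(candidate)
        if vote == candidate:
            vote = BOTTOM
    return vote, rumor, denials
```

For example, with k = 1, a process holding Vote = 1 and Rumor = 2 whose query for 2 returns a single vote keeps its vote, clears its rumor, and denies 2.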
![Page 47: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/47.jpg)
Query Tracks
![Page 48: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/48.jpg)
Other tracks
• Denial (to kill a rumor)
• To manage lost queries
  – Probe wave
  – Report
(see the paper)
![Page 49: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/49.jpg)
![Page 50: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/50.jpg)
![Page 51: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/51.jpg)
![Page 52: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/52.jpg)
![Page 53: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/53.jpg)
Deadlock Prevention
• Each two neighboring processes share a resource
  – Think of a chopstick between two philosophers
• Only a process that holds both its left and right resources can initiate a query
• So, at any time, at most n/2 pending initiated queries
• Now, we can have up to 9k rogue queries, i.e., non-initiated queries
• So, n > n/2 + 9k, that is, n ≥ 18k + 1
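The last step is plain algebra on the slide's inequality; a worked version, assuming only that n and k are integers:

```latex
\begin{align*}
n > \frac{n}{2} + 9k
  &\iff \frac{n}{2} > 9k \\
  &\iff n > 18k \\
  &\iff n \ge 18k + 1 \qquad (n,\ k \text{ integers}).
\end{align*}
```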
![Page 54: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/54.jpg)
Conclusion
• Less restrictive definition of k-stabilization
• Using this definition, we solve a problem having no self-stabilizing solution:
  – Leader recovery protocol
    • On an anonymous (yet oriented) ring
    • Only-k-dependent complexity:
      – Stabilization time O(k²) rounds
      – log(k) bits per process
![Page 55: Fast Leader (Full) Recovery despite Dynamic Faults](https://reader035.vdocuments.us/reader035/viewer/2022062322/5681509e550346895dbe99f9/html5/thumbnails/55.jpg)
Thank You!