csce 668 distributed algorithms and systems
Post on 31-Dec-2015
CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS
Fall 2011
Prof. Jennifer Welch
Set 9: Fault Tolerant Consensus
Processor Failures in Message Passing
Crash: at some point the processor stops taking steps.
  At the processor's final step, it might succeed in sending only a subset of the messages it is supposed to send.
Byzantine: the processor changes state arbitrarily and sends messages with arbitrary content.
Consensus Problem
Every processor has an input.
Termination: Eventually every nonfaulty processor must decide on a value.
  The decision is irrevocable!
Agreement: All decisions by nonfaulty processors must be the same.
Validity: If all inputs are the same, then the decision of a nonfaulty processor must equal the common input.
Examples of Consensus
Binary inputs:
  input vector 1,1,1,1,1: decision must be 1
  input vector 0,0,0,0,0: decision must be 0
  input vector 1,0,0,1,0: decision can be either 0 or 1
Multi-valued inputs:
  input vector 1,2,3,2,1: decision can be 1 or 2 or 3
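
The agreement and validity conditions can be checked mechanically on a finished execution. The following is a minimal sketch, not part of the original slides; the function name `check_consensus` and its argument layout are assumptions made for illustration, and termination is not checked because the function only sees decisions that were already made.

```python
def check_consensus(inputs, decisions):
    """Check agreement and validity for one finished execution.

    inputs:    inputs of all processors
    decisions: decisions of the nonfaulty processors only
    Returns (agreement_ok, validity_ok).
    """
    # Agreement: all nonfaulty decisions are the same value.
    agreement_ok = len(set(decisions)) <= 1
    # Validity only constrains executions in which all inputs are equal.
    if len(set(inputs)) == 1:
        validity_ok = all(d == inputs[0] for d in decisions)
    else:
        validity_ok = True
    return agreement_ok, validity_ok


print(check_consensus([1, 1, 1, 1, 1], [1, 1, 1, 1, 1]))  # all-1 inputs, decide 1
print(check_consensus([0, 0, 0, 0, 0], [1, 1, 1, 1, 1]))  # all-0 inputs, decide 1
print(check_consensus([1, 0, 0, 1, 0], [0, 0, 0, 0, 0]))  # mixed inputs, decide 0
print(check_consensus([1, 2, 3, 2, 1], [2, 3, 2, 2, 2]))  # decisions differ
```

The second call violates validity and the fourth violates agreement; the other two satisfy both conditions.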
Overview of Consensus Results
Synchronous system
At most f faulty processors
Tight bounds for message passing:

                              crash failures    Byzantine failures
number of rounds              f + 1             f + 1
total number of processors    f + 1             3f + 1
message size                  polynomial        polynomial
Overview of Consensus Results
Impossible in the asynchronous case.
  Even if we only want to tolerate a single crash failure.
  True both for message passing and shared read-write memory.
Modeling Crash Failures
Modify failure-free definitions of admissible execution to accommodate crash failures:
  All but a set of at most f processors (the faulty ones) take an infinite number of steps.
    In the synchronous case: once a faulty processor fails to take a step in a round, it takes no more steps.
  In a faulty processor's last step, an arbitrary subset of the processor's outgoing messages make it into the channels.
Modeling Byzantine Failures
Modify failure-free definitions of admissible execution to accommodate Byzantine failures:
A set of at most f processors (the faulty ones) can send messages with arbitrary content and change state arbitrarily (i.e., not according to their transition functions).
Consensus Algorithm for Crash Failures
Code for each processor:
  v := my input
  at each round 1 through f+1:
    if I have not yet sent v then send v to all
    wait to receive messages for this round
    v := minimum among all received values and current value of v
    if this is round f+1 then decide on v
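
The per-processor code above can be exercised in a small synchronous simulation. This is a sketch under stated assumptions, not the course's implementation: the name `run_crash_consensus` and the `crashes` argument (a hypothetical map from a processor index to the round of its final step and the recipients its last message actually reaches) are introduced here for illustration only.

```python
def run_crash_consensus(inputs, f, crashes=None):
    """Simulate the f+1 round minimum-flooding algorithm with crash failures.

    crashes: {processor: (round, recipients)} (hypothetical encoding). The
             processor takes its final step in `round`, and its message in
             that round reaches only `recipients`.
    Returns {processor: decision} for processors that never crashed.
    """
    n = len(inputs)
    crashes = crashes or {}
    v = list(inputs)                    # current value of v at each processor
    sent = [set() for _ in range(n)]    # values each processor has already sent
    crashed = set()
    for rnd in range(1, f + 2):         # rounds 1 through f+1
        inbox = [[] for _ in range(n)]
        for p in range(n):
            if p in crashed:
                continue
            crashing_now = p in crashes and crashes[p][0] == rnd
            if v[p] not in sent[p]:     # "if I have not yet sent v"
                targets = crashes[p][1] if crashing_now else range(n)
                for q in targets:
                    inbox[q].append(v[p])
                sent[p].add(v[p])
            if crashing_now:
                crashed.add(p)
        for p in range(n):
            if p not in crashed:        # v := min of received values and v
                v[p] = min([v[p]] + inbox[p])
    return {p: v[p] for p in range(n) if p not in crashed}


# f = 2: processor 0 crashes in round 1 reaching only processor 1, which
# crashes in round 2 reaching only processor 2.
print(run_crash_consensus([0, 5, 5, 5], f=2, crashes={0: (1, [1]), 1: (2, [2])}))
```

In this run the value 0 is passed along by two crashing processors, and the third round lets processor 2 relay it to processor 3, so both survivors decide 0.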
Execution of Algorithm
                Algorithm                          Relation to Formal Model
round 1:        send my input                      in channels initially
                receive round 1 msgs               deliver events
                compute value for v                compute events
round 2:        send v (if this is a new value)    due to previous compute events
                receive round 2 msgs               deliver events
                compute value for v                compute events
…
round f + 1:    send v (if this is a new value)    due to previous compute events
                receive round f + 1 msgs           deliver events
                compute value for v                compute events
                decide v                           part of compute events
Correctness of Crash Consensus Algorithm
Termination: By the code, finish in round f+1.
Validity: Holds since processors do not introduce spurious messages: if all inputs are the same, then that is the only value ever in circulation.
Correctness of Crash Consensus Algorithm
Agreement: Suppose in contradiction pj decides on a smaller value, x, than does pi.
Then x was hidden from pi by a chain of faulty processors:
  q1 --round 1--> q2 --round 2--> … --round f--> qf+1 --round f+1--> pj
  (each processor in the chain passes x on in the round shown, but x never reaches pi)
There are f + 1 faulty processors in this chain, a contradiction.
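
The counting in this argument can be illustrated by tracking only who holds the minimum value x. The sketch below is an illustration with assumed names (`live_holders_of_x` is hypothetical): processors 0 through f-1 form a chain of crashing processors, each passing x to the next index in its final step, and processor f is the first nonfaulty processor to receive x. With only f crashes available, x can be hidden for f rounds but not for f + 1.

```python
def live_holders_of_x(n, f, rounds):
    """Return the non-crashed processors that hold x after `rounds` rounds.

    Processor p < f crashes in the round in which it first sends x, and its
    message reaches only processor p + 1. Every other processor sends x to
    all when it first learns it, as in the flooding algorithm.
    """
    assert n >= f + 2, "need a nonfaulty receiver and at least one bystander"
    knows = {0}          # processor 0 starts with x
    relayed = set()      # processors that have already sent x
    crashed = set()
    for _ in range(rounds):
        # Senders are fixed at the start of the round (synchronous model).
        senders = [p for p in sorted(knows)
                   if p not in relayed and p not in crashed]
        for p in senders:
            if p < f:                    # chain member: partial send, then crash
                recipients = {p + 1}
                crashed.add(p)
            else:                        # nonfaulty: send x to all
                recipients = set(range(n))
            knows |= recipients
            relayed.add(p)
    return knows - crashed


print(live_holders_of_x(n=5, f=2, rounds=2))  # after f rounds
print(live_holders_of_x(n=5, f=2, rounds=3))  # after f + 1 rounds
```

After f rounds only processor f holds x; one more round spreads it to every live processor, which is why hiding x through round f + 1 would need an (f+1)-st faulty processor.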
Performance of Crash Consensus Algorithm
Number of processors n > f
f + 1 rounds
At most n^2 · |V| messages, each of size log |V| bits, where V is the input set.
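
As a quick arithmetic check of these bounds: each processor sends each value at most once, to at most n recipients, which gives n^2 · |V| messages. The helper below is a hypothetical convenience function (`crash_consensus_cost` is not from the slides), and it assumes message size is rounded up to whole bits.

```python
import math


def crash_consensus_cost(n, f, num_values):
    """Return (rounds, max_messages, bits_per_message) for the crash algorithm."""
    rounds = f + 1
    # n senders, each sending each of the |V| values at most once to n recipients.
    max_messages = n * n * num_values
    # One value from V per message; rounded up to whole bits (an assumption).
    bits_per_message = math.ceil(math.log2(num_values))
    return rounds, max_messages, bits_per_message


print(crash_consensus_cost(n=5, f=2, num_values=2))
print(crash_consensus_cost(n=4, f=1, num_values=3))
```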
Lower Bound on Rounds
Assumptions:
  n > f + 1
  every processor is supposed to send a message to every other processor in every round
  input set is {0,1}
Failure-Sparse Executions
Bad behavior for the crash algorithm was when there was one crash per round.
  This is bad in general.
A failure-sparse execution has at most one crash per round.
We will deal exclusively with failure-sparse executions in this proof.
Valence of a Configuration
The valence of a configuration C is the set of all values decided by a nonfaulty processor in some configuration reachable from C by an admissible (failure-sparse) execution.
Bivalent: set contains 0 and 1.
Univalent: set contains only one value
  0-valent or 1-valent
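
For a concrete algorithm and a tiny system, valence can be computed by brute force. The sketch below is an illustration only: it fixes the algorithm to the minimum-flooding crash algorithm from earlier in these slides, enumerates every failure-sparse execution (at most one crash per round, at most f crashes overall, any subset of the crasher's messages delivered), and collects the values decided by processors that are still alive after round f + 1. All function names are assumptions.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of items as a tuple."""
    items = list(items)
    for k in range(len(items) + 1):
        yield from combinations(items, k)


def step(v, sent, crashed, crasher=None, recipients=()):
    """Run one round; `crasher` (if any) reaches only `recipients`, then stops."""
    n = len(v)
    inbox = [[] for _ in range(n)]
    new_sent = list(sent)
    for p in range(n):
        if p in crashed or v[p] in sent[p]:
            continue                      # crashed, or nothing new to send
        targets = recipients if p == crasher else range(n)
        for q in targets:
            inbox[q].append(v[p])
        new_sent[p] = sent[p] | {v[p]}
    new_crashed = crashed | ({crasher} if crasher is not None else set())
    new_v = tuple(v[p] if p in new_crashed else min([v[p]] + inbox[p])
                  for p in range(n))
    return new_v, tuple(new_sent), new_crashed


def valence(v, sent, crashed, rounds_left, f):
    """Set of values decided over all failure-sparse extensions."""
    if rounds_left == 0:
        return {v[p] for p in range(len(v)) if p not in crashed}
    live = [p for p in range(len(v)) if p not in crashed]
    result = set(valence(*step(v, sent, crashed), rounds_left - 1, f))
    if len(crashed) < f:                  # one more crash is still allowed
        for p in live:
            others = [q for q in live if q != p]
            for rec in subsets(others):
                result |= valence(*step(v, sent, crashed, p, rec),
                                  rounds_left - 1, f)
    return result


def initial_valence(inputs, f):
    n = len(inputs)
    empty = tuple(frozenset() for _ in range(n))
    return valence(tuple(inputs), empty, frozenset(), f + 1, f)


for inputs in ([0, 0, 0], [0, 0, 1], [0, 1, 1], [1, 1, 1]):
    vals = initial_valence(inputs, f=1)
    label = "bivalent" if vals == {0, 1} else f"{min(vals)}-valent"
    print(inputs, label)
```

With n = 3 and f = 1, the input vector 0,1,1 is bivalent for this algorithm: the processor holding 0 can crash before sending, in which case the others decide 1, while the failure-free execution decides 0.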
Valence of a Configuration
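[Figure: tree of configurations reachable from C. C is bivalent (0/1). Its children D, E, F, G have valences 0, 0/1, 1, 0/1 respectively. The decisions at the leaves are 0 0 0 0 below D, 0 1 0 1 below E, 1 1 1 1 below F, and 1 0 0 1 below G.]
Legend: 0/1 : bivalent, 1 : 1-valent, 0 : 0-valent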
Statement of Round Lower Bound
Theorem (5.3): Any crash-resilient consensus algorithm requires at least f + 1 rounds in the worst case.
Proof Strategy:
  show there is a bivalent initial config.
  show we can keep things bivalent through round f - 1 (rounds 1, 2, …, f - 2, f - 1)
  show we can keep a n.f. (nonfaulty) proc. from deciding in round f
Existence of Bivalent Initial Config.
Suppose in contradiction all initial configurations are univalent.

inputs      valency
000…00      0    (by validity condition)
000…01      ?
000…11      ?
…
001…11      ?
011…11      ?
111…11      1    (by validity condition)

Somewhere in this list, two adjacent rows must have valencies 0 and 1.
Existence of Bivalent Initial Config.
Let I0 be a 0-valent initial config and I1 be a 1-valent initial config s.t. they differ only in pi's input.
From I0: pi fails initially, no other failures. By termination, eventually the rest decide. All but pi decide 0.
From I1: pi fails initially, no other failures. This execution looks the same as the one above to all the processors except pi, so all but pi decide 0, contradicting that I1 is 1-valent.
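
The choice of I0 and I1 relies on walking the chain of input vectors from all 0s to all 1s, changing one input at a time, until the label flips. A minimal sketch of that walk follows; `find_flip` and the `majority_label` labelling are hypothetical and serve only to illustrate the step, since the argument needs nothing beyond the endpoints being fixed by validity.

```python
def find_flip(n, valency):
    """Walk 00…0, 00…01, …, 11…1 and return the first adjacent pair whose
    labels differ, together with the index i of the single input that changed."""
    chain = [[0] * (n - k) + [1] * k for k in range(n + 1)]
    for a, b in zip(chain, chain[1:]):
        if valency(a) != valency(b):
            i = next(idx for idx in range(n) if a[idx] != b[idx])
            return a, b, i
    return None   # unreachable when the endpoints are labelled 0 and 1


def majority_label(inputs):
    """Hypothetical univalent labelling, used only for illustration."""
    return 1 if 2 * sum(inputs) > len(inputs) else 0


I0, I1, i = find_flip(4, majority_label)
print(I0, I1, "differ only in the input of processor", i)
```

Any labelling that assigns 0 to the all-0 vector and 1 to the all-1 vector must flip somewhere along the chain, and adjacent vectors differ in exactly one processor's input.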
Keeping Things Bivalent
Let α' be a (failure-sparse) (k-1)-round execution ending in a bivalent config., for k - 1 < f - 1.
Show there is a one-round (f-s) extension of α' ending in a bivalent config., so the extension has k < f rounds.
Suppose in contradiction every one-round (f-s) extension of α' ends in a univalent config.
Keeping Things Bivalent
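[Figure: one-round extensions of α' (rounds 1 to k-1, ending bi-val).]
  failure-free round k: 1-val
  pi crashes: 0-val
  extensions in which pi crashes, ordered by the set pi fails to send to:
    …
    pi fails to send to q1,…,qj: 1-val
    pi fails to send to q1,…,qj+1: 0-val
    …
    pi fails to send to q1,…,qm
  now focus on these two extensions: q1,…,qj and q1,…,qj+1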
Keeping Things Bivalent
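[Figure: the two round-k extensions of α' (rounds 1 to k-1).]
  pi fails to send to q1,…,qj: 1-val
  pi fails to send to q1,…,qj+1: 0-val
  In each, qj+1 fails in rd. k+1; no other failures.
  Only qj+1 can tell the difference, so the n.f. processors decide the same value in both: n.f. decide 1 in each, contradicting that the second extension is 0-val.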
Cannot Decide in Round f
We've shown there is an (f - 1)-round (failure-sparse) execution, call it α, ending in a bivalent configuration.
Extending this execution to f rounds might not preserve bivalence.
However, we can keep a processor from explicitly deciding in round f, thus requiring at least one more round (f+1).
Cannot Decide in Round f
Case 1: There is a 1-round (f-s) extension of α ending in a bivalent config. Then we are done.
Case 2: All 1-round (f-s) extensions of α end in univalent configs.
Cannot Decide in Round f
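[Figure: three round-f extensions of α (rounds 1 to f-1, ending bi-val).]
  (a) failure free (pi sends to pj and pk): 1-val
  (b) pi fails to send to nf pj (pi might send to pk): 0-val
  (c) pi fails to send to nf pj, sends to another nf pk
  (a) and (c) look the same to pk, so in (c) pk is either undecided or decided 1.
  (b) and (c) look the same to pj, so in (c) pj is either undecided or decided 0.
  By agreement, pj and pk cannot both have decided in (c), so some nonfaulty processor is still undecided at the end of round f.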