pords12 lec03 global states
TRANSCRIPT
-
7/27/2019 Pords12 Lec03 Global States
1/39
Principles of Reliable
Distributed Systems
Lecture 3: ConsistentGlobal States
Idit Keidar
1 Idit Keidar, Principles of Reliable Distributed Systems
-
7/27/2019 Pords12 Lec03 Global States
2/39
Todays Agenda
Consistent global states
Monitoring a distributed system from within
Snapshots Chandy-Lamport protocol
2 Idit Keidar, Principles of Reliable Distributed Systems
-
7/27/2019 Pords12 Lec03 Global States
3/39
Todays Material
Consistent Global States of DistributedSystems: Fundamental Concepts andMechanisms (1993)zalp Babaoglu, Keith Marzullo
Ch. 4, Distributed Systems 2nd edition,Sape Mullender (Editor)
ACM Press Frontier Series, Addison Wesley
3 Idit Keidar, Principles of Reliable Distributed Systems
-
7/27/2019 Pords12 Lec03 Global States
4/39
Todays Model
Asynchronous
Non-assumption
Note: An algorithm for this model also works
in the synchronous model
No failures
4 Idit Keidar, Principles of Reliable Distributed Systems
-
7/27/2019 Pords12 Lec03 Global States
5/39
Questions About the State of a
Distributed System - Examples
Is the system deadlocked?
How many processes are reading file X?
How much total credit does the bank have? How balanced is the distribution of work
among servers?
What is the longest communication path?
5 Idit Keidar, Principles of Reliable Distributed Systems
-
7/27/2019 Pords12 Lec03 Global States
6/39
Is This Hard?
If youre simulating the distributed systemin a single process then this is easy
Otherwise what do we do?
We can send messages to all processes and askthem to tell us their state
Problem?
While the question is being answered,processes continue to exchange messages andchange their state
6 Idit Keidar, Principles of Reliable Distributed Systems
-
7/27/2019 Pords12 Lec03 Global States
7/39
What is the State of a Distributed
System Anyway?
7 Idit Keidar, Principles of Reliable Distributed Systems
-
7/27/2019 Pords12 Lec03 Global States
8/39
What ismy total
balance?
Joes balance: $500Joes balance: $550
Joes balance: $100Transfer $50 to Bank B
Joes balance: $50
Bank A
Bank B
Joe
50
500
8 Idit Keidar, Principles of Reliable Distributed Systems
-
7/27/2019 Pords12 Lec03 Global States
9/39
Definition of Global State
A system consists of
Sequential processes, p1, p2, , pn Links between pairs of processes
Processes and links are state machines One atomic action can modify the states of both
one process and one link (or only one process)
Global state:
Tuple including states of all system components
Processes
Links (messages in transit)
9 Idit Keidar, Principles of Reliable Distributed Systems
-
7/27/2019 Pords12 Lec03 Global States
10/39
Joes balance: $500
Joes balance: $50
Bank A
Bank B Joes transfer: $50
The global state:
$50, $500, $50
process
states
link
states
10 Idit Keidar, Principles of Reliable Distributed Systems
-
7/27/2019 Pords12 Lec03 Global States
11/39
Ideal Global State
Imagine freezing all the processes and links
simultaneously
The combined states of the processes and links
is the desired global state
With this global state, we can reconstruct the
balance, or tell whether a system is deadlocked
Problem? In our asynchronous system there is no such
thing as simultaneity
11 Idit Keidar, Principles of Reliable Distributed Systems
-
7/27/2019 Pords12 Lec03 Global States
12/39
Runs and Histories
Recall: a run (execution) is an alternating
sequence of events and states
In a given run, each processpigenerates a
(possibly infinite) sequence of events,
called the local history ofpi :
The global history is the set:
12 Idit Keidar, Principles of Reliable Distributed Systems
-
7/27/2019 Pords12 Lec03 Global States
13/39
Distributed Computation
A distributed computation is a partiallyordered set (H, ), ordered by causality Lamports happened before relation
Space-time diagram of a computation:
13 Idit Keidar, Principles of Reliable Distributed Systems
e11
e23
e12 || e32
-
7/27/2019 Pords12 Lec03 Global States
14/39
Cuts
A cutof a distributed computation is a collection
of local history prefixes:
14 Idit Keidar, Principles of Reliable Distributed Systems
-
7/27/2019 Pords12 Lec03 Global States
15/39
Consistency
A cut C is consistentif
A consistent global state is one thatcorresponds to a consistent cut
Rationale:
There is no way to tell whether a particular
global state ever occurs Instead, we define a consistent state as one that
could have occurred
15 Idit Keidar, Principles of Reliable Distributed Systems
-
7/27/2019 Pords12 Lec03 Global States
16/39
Which Cuts are Consistent?
16 Idit Keidar, Principles of Reliable Distributed Systems
e15 did not
happen yet
e15
e36,
which
already
happened
cut C1 is
consistent
-
7/27/2019 Pords12 Lec03 Global States
17/39
And Now, Back to Runs
A run of a distributed computation defines a total
orderingR of its events
E.g.,
17 Idit Keidar, Principles of Reliable Distributed Systems
-
7/27/2019 Pords12 Lec03 Global States
18/39
Monitoring the System State
from Within
We introduce an extra processp0, the monitor
It tries to create a snapshotof the system state
By communicating with other processes
The $1,000,000 question is whether the state
constructed by the monitor is a state that actually
could have occurred in the distributed computation
18 Idit Keidar, Principles of Reliable Distributed Systems
-
7/27/2019 Pords12 Lec03 Global States
19/39
Nave Algorithm #1: Snapshot
Processp0 sends a message to each process
Asking for its state
Whenpireceives such a message, it replies
with its current state
When all nprocesses have replied,p0constructs the snapshot of the global state
Does it work?
19 Idit Keidar, Principles of Reliable Distributed Systems
-
7/27/2019 Pords12 Lec03 Global States
20/39
Example 1/2
e11 did not
yet happen
e32 already
happened
20 Idit Keidar, Principles of Reliable Distributed Systems
-
7/27/2019 Pords12 Lec03 Global States
21/39
Example 2/2: Deadlock Detection
21 Idit Keidar, Principles of Reliable Distributed Systems
deadlock
detected by
monitor
response missed!
-
7/27/2019 Pords12 Lec03 Global States
22/39
Nave Algorithm #1 Problems
Sampling processes at arbitrary times yields
inconsistent states
Example 1
Sampling process states misses messages in
transit
Example 2
Perhaps we can fix this by observing all
events?
22 Idit Keidar, Principles of Reliable Distributed Systems
-
7/27/2019 Pords12 Lec03 Global States
23/39
Nave Algorithm #2:
Continuous Monitoring
Each processp1,p2,, when it executes anevent, also sends a notification top0
For each event ofpi, the monitorp0 updates
its copy ofpis local state At any time, the global state is:
The n-tuple of latest local states
The n(n-1)-tuple of latest link states as reflectedby send and receive events
Are the constructed states consistent?
23 Idit Keidar, Principles of Reliable Distributed Systems
-
7/27/2019 Pords12 Lec03 Global States
24/39
Deadlock Detection Revisited
24 Idit Keidar, Principles of Reliable Distributed Systems
Does it
always
work?
no deadlock
-
7/27/2019 Pords12 Lec03 Global States
25/39
Deadlock Detection Revisited 2
25 Idit Keidar, Principles of Reliable Distributed Systems
deadlockdetected!
Solution?
-
7/27/2019 Pords12 Lec03 Global States
26/39
Delivery Layer to the Rescue
26 Idit Keidar, Principles of Reliable Distributed Systems
Monitor: tracks global state
deliver: update state
Network
receive
Delivery Layer: waits for messages
that should be acted on first
-
7/27/2019 Pords12 Lec03 Global States
27/39
What Delivery Rule?
Is FIFO delivery good enough?
No!
Homework: show a counter example
We needcausal delivery
Ensures that the constructed global state is
consistent
How do we implement it?
LTS or Vector Clocks
27 Idit Keidar, Principles of Reliable Distributed Systems
-
7/27/2019 Pords12 Lec03 Global States
28/39
Back to Snapshots
A snapshot request is initiated atp0 No continuous monitoring
p0 no longer gets all the history of events
Needs to learn the effects of all the events
occurring up to some point
Assume FIFO channels
Is sending the process states enough?
28 Idit Keidar, Principles of Reliable Distributed Systems
-
7/27/2019 Pords12 Lec03 Global States
29/39
Synchronous Snapshot Protocol
The Idea p0 chooses a time tss in the future
Far enough so that it will be able to tell all processesto take a snapshot by then
Using synchrony here in two ways what are they?
Goal: obtain a snapshot of the system at time tss What does the snapshot include?
Processpi will contribute to the snapshot
Its local state si at time tss The states of its incoming linksXji (for
j=1,,n) at time tss
29 Idit Keidar, Principles of Reliable Distributed Systems
-
7/27/2019 Pords12 Lec03 Global States
30/39
Synchronous Snapshot Protocol
1. p0 sends take snapshot at time tss to all processes,where tss is far enough in the future
2. WhenRCi reads tss,pi Records its local state si,
Sends an empty message on each of its outgoing links
Starts recording messages on its incoming linksXji While this is going on,pi does not execute any events
3. Messages are tagged with timestamps fromRC
4. Whenpi receives m with TS(m) tss frompj pi stops recording messages onXji (closes the link)
When all links are closed,pi sends si and theXjis top0
Why?
30 Idit Keidar, Principles of Reliable Distributed Systems
-
7/27/2019 Pords12 Lec03 Global States
31/39
Does This Work?
The snapshot reflects an event e iffe occursbefore tss The effects of such events on links are gathered in
theXji
sets
By FIFO order, links are flushed by messagessent at time tss
Thus,
By the clock condition, the cut is consistent
Reminder: ife f then RC(e) < RC(f)
31 Idit Keidar, Principles of Reliable Distributed Systems
-
7/27/2019 Pords12 Lec03 Global States
32/39
Overcoming Asynchrony
Assumep0 could choose a logical time ltssfar enough in the future
So no processs logical clock will reach ltssbefore the take snapshot at ltss message fromp0 will reach it
We could replaceRCwith logical clocks
Satisfy the clock condition
Would the same algorithm work?
32 Idit Keidar, Principles of Reliable Distributed Systems
-
7/27/2019 Pords12 Lec03 Global States
33/39
The Role of Logical Time?
Note thatpi does nothing special between
Receiving take snapshot at ltss and
Getting the first message that makes its logical
clock reach ltss
We could makepis logical clock becomeltss right away by sending the takesnapshot message with timestamp lt
ss
-1
Making it take snapshot now
So the logical time plays no role!
33 Idit Keidar, Principles of Reliable Distributed Systems
-
7/27/2019 Pords12 Lec03 Global States
34/39
Eliminating ltss
We can replace the explicit use of logical
time with the arrival of the first take
snapshot message
We will call this message a marker
34 Idit Keidar, Principles of Reliable Distributed Systems
-
7/27/2019 Pords12 Lec03 Global States
35/39
Asynchronous Snapshot Protocol
[Chandy, Lamport, 85]
1. p0 sends itself a marker (take snapshot)
2. Whenpi receives the first marker (say frompj), it
Records its local state si
Forwards the marker on all its outgoing links SetsXji to empty and closesXji
Initializes other incoming linksXki to empty, and starts
recording messages received on them
3. Whenpi receives a marker frompk, it
ClosesXki
When all links are closed,pi sends si and theXjis top0
35 Idit Keidar, Principles of Reliable Distributed Systems
-
7/27/2019 Pords12 Lec03 Global States
36/39
p0
p1
p2
e11 e1
2 e13 e1
4 e15 e1
6
e21 e2
2 e23 e2
4 e25
m
Record state s23
X12 is empty
Record state s12
Start recordingX21
X21 is {m}
36 Idit Keidar, Principles of Reliable Distributed Systems
-
7/27/2019 Pords12 Lec03 Global States
37/39
Deadlock Detection Revisited 3
37 Idit Keidar, Principles of Reliable Distributed Systems
X31 includesresponsewaitingforp3
waitingforp1
waiting
forp2
-
7/27/2019 Pords12 Lec03 Global States
38/39
Whats Going On?
Markers traverse each link once In each direction
Forwarded on all links on first receipt
Once a process gets markers on all its links, itterminates (liveness)
Assume ej is in the gathered state andei = sendi(m), ej = receivej(m) Thenp
j
receives mbefore the marker frompi Hence,pi sends mbefore recording its state, andei
is also in the gathered state
What assumption do we use?
38 Idit Keidar, Principles of Reliable Distributed Systems
-
7/27/2019 Pords12 Lec03 Global States
39/39
Chandy-Lamport Summary
The distributed snapshot algorithm described here
came about when I visited Chandy, He posed
the problem to me over dinner, but we had both
had too much wine to think about it right then.The next morning, in the shower, I came up with
the solution. When I arrived at Chandy's office, he
was waiting for me with the same solution.
-- Leslie Lamport
39 Idit K id P i i l f R li bl Di t ib t d S t