What is a distributed system?
TRANSCRIPT
CS 378: Intro to Distributed Computing
Lorenzo Alvisi
Harish Rajamani
What is a distributed system?
“A distributed system is one in which the failure of a computer you didn’t even know existed can render your own computer unusable.”
Leslie Lamport
[Diagram: distributed applications run on middleware, which sits above the local OS of each of Machine A, Machine B, and Machine C, connected by a network.]
What is a distributed system?
A distributed system is software through which a collection of independent computers appears to its users as a single, coherent system.
Goals (and auto-goals) of a Distributed System
Connecting Resources and Users
Transparency
Openness
Scalability
Transparency
Transparency — Description
Access — Hides differences in data representation and invocation mechanisms
Location — Hides where an object resides
Migration — Hides that an object may move to another location
Relocation — Hides from a client the change of location of an object to which the client is bound
Replication — Hides that an object may be replicated, with replicas at different locations
Concurrency — Hides coordination of activities between objects
Failure — Hides the failure and recovery of an object
Persistence — Hides whether a resource is in memory or on disk
Openness
Conform to well-defined interfaces
Easily interact with other open systems
Achieve independence in heterogeneity w.r.t. hardware, platform, languages
Support different app/user-specific policies
ideally, provide only mechanisms!
Scalability
Size scalability: number of users and processes
Geographical scalability: maximum distance between nodes
Administrative scalability: number of administrative domains
Scaling
Distribute: partition data and computation across multiple machines (Java applets, DNS, WWW)
Replicate: make copies of data available at different machines (mirrored web sites, replicated file systems, replicated databases)
Cache: allow client processes to access local copies (Web caches, file caching)
Consistency
A first course in Distributed Computing...
Two basic approaches
cover many interesting systems, and distill from them fundamental principles
focus on a deep understanding of the fundamental principles, and see them instantiated in a few systems
A few intriguing questions
How do we talk about a distributed execution?
Can we draw global conclusions from local information?
Can we coordinate operations without relying on synchrony?
For the problems we know how to solve, how do we characterize the “goodness” of our solution?
Are there problems that simply cannot be solved?
What are useful notions of consistency, and how do we maintain them?
What if part of the system is down? Can we still do useful work? And what if instead part of the system becomes “possessed” and starts behaving arbitrarily? Are all bets off?
Global Predicate Detection and Event Ordering
Our Problem
To compute predicates over the state of a distributed application
Model
Message passing
No failures
Two possible timing assumptions:
1. Synchronous system
2. Asynchronous system
Asynchronous:
No upper bound on message delivery time
No bound on relative process speeds
No centralized clock
Clock Synchronization
External Clock Synchronization: keeps processor clocks within some maximum deviation from an external time source.
• can exchange info about timing events of different systems
• can take actions at real-time deadlines
• synchronization within 0.1 ms
Internal Clock Synchronization: keeps processor clocks within some maximum deviation from each other.
• can measure the duration of distributed activities that start on one process and terminate on another
• can totally order events that occur on a distributed system
Clock Synchronization: Take 1
Assume an upper bound (max) and a lower bound (min) on message delivery time
Guarantee that processes stay synchronized within max − min
Problem: [Figure: histogram of message delivery times (% of messages vs. time in ms) from a 5000-message run at IBM Almaden.]
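The Take-1 guarantee can be sketched numerically. Assuming only the slide's bounds min and max on message delivery time, a slave that receives the master's timestamp T knows the true master time lies in [T + min, T + max]; setting its clock to the midpoint of that interval bounds each slave's error by (max − min)/2, so any two slaves stay within max − min of each other. Function and parameter names here are illustrative:

```python
def adjust_clock(master_timestamp, min_delay, max_delay):
    """Take-1 sketch: on receiving the master's timestamp T, the true
    master time lies in [T + min_delay, T + max_delay]. Setting the
    slave's clock to the midpoint of this interval bounds its error by
    (max_delay - min_delay) / 2, so two slaves stay within
    max_delay - min_delay of each other."""
    estimate = master_timestamp + (min_delay + max_delay) / 2
    worst_case_error = (max_delay - min_delay) / 2
    return estimate, worst_case_error
```

As the histogram suggests, the catch is that a tight, trustworthy max is hard to come by in practice, which makes this bound weak.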
Clock Synchronization: Take 2
No upper bound on message delivery time...
...but a lower bound (min) on message delivery time
Use timeouts to detect process failures
Slaves send messages to the master
Master averages the slaves' values; computes a fault-tolerant average
Precision: 4·max_p − min
Probabilistic Clock Synchronization (Cristian)
Master-slave architecture
Master is connected to an external time source
Slaves read the master's clock and adjust their own
How accurately can a slave read the master's clock?
The Idea
Clock accuracy depends on message round-trip time:
if the round trip is small, master and slave cannot have drifted by much!
Since there is no upper bound on message delivery, there is no certainty of an accurate enough reading...
...but a very accurate reading can be achieved by repeated attempts
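A minimal sketch of the repeated-attempts idea. The callback request_master() and the constant MIN_DELAY are assumptions for illustration: request_master() returns the master's clock value, and MIN_DELAY is the assumed lower bound on one-way message delay. The slave retries until the measured round trip is short enough to meet its target error bound:

```python
import time

MIN_DELAY = 0.001  # assumed lower bound on one-way delay (hypothetical value)

def read_master_clock(request_master, max_error, max_attempts=10):
    """Cristian-style probabilistic clock reading (a sketch).

    We time each round trip locally. If the master replied with time T,
    the master's clock at the moment our reply arrives lies in
    [T + MIN_DELAY, T + rtt - MIN_DELAY]; the midpoint is the best
    estimate, with error at most rtt/2 - MIN_DELAY. Retry until that
    error is within max_error."""
    for _ in range(max_attempts):
        start = time.monotonic()
        t_master = request_master()      # slave asks the master for its time
        rtt = time.monotonic() - start
        error = rtt / 2 - MIN_DELAY      # half-width of the uncertainty interval
        if error <= max_error:
            return t_master + rtt / 2, error
    raise TimeoutError("no round trip fast enough for the target accuracy")
```

Because delivery time has no upper bound, any single attempt may fail the test; the algorithm trades certainty for probability, hence "probabilistic".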
Asynchronous systems
Weakest possible assumptions
cf. “finite progress axiom”
Weak assumptions ⇒ fewer vulnerabilities
Asynchronous ≠ slow
“Interesting” model w.r.t. failures (ah ah ah!)
Client-Server
Processes exchange messages using Remote Procedure Call (RPC)
A client requests a service by sending the server a message. The client blocks while waiting for a response.
The server computes the response (possibly asking other servers) and returns it to the client.
[Diagram: client c blocked waiting on server s.]
Deadlock!
[Diagram: processes p1, p2, p3 deadlocked waiting on one another.]
Goal
Design a protocol by which a processor can determine whether a global predicate (say, deadlock) holds
Wait-For Graphs
Draw an arrow from p_i to p_j if p_j has received a request from p_i but has not responded yet
Cycle in WFG ⇒ deadlock
Deadlock ⇒ cycle in WFG
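Since a cycle in the WFG corresponds to a deadlock in this model, deadlock detection reduces to cycle detection on the graph. A sketch, with the WFG given explicitly as an adjacency map (process names and the representation are illustrative):

```python
def has_deadlock(wait_for):
    """Detect a cycle in a wait-for graph.

    wait_for maps each process to the set of processes it waits for
    (an arrow p_i -> p_j as on the slide). Iterative depth-first search
    with three colors: a GRAY node found again on the current path is a
    back edge, i.e. a cycle, i.e. a deadlock."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {p: WHITE for p in wait_for}
    for root in wait_for:
        if color[root] != WHITE:
            continue
        color[root] = GRAY
        stack = [(root, iter(wait_for.get(root, ())))]
        while stack:
            node, succs = stack[-1]
            advanced = False
            for nxt in succs:
                if color.get(nxt, WHITE) == GRAY:
                    return True  # back edge: cycle in WFG => deadlock
                if color.get(nxt, WHITE) == WHITE:
                    color[nxt] = GRAY
                    stack.append((nxt, iter(wait_for.get(nxt, ()))))
                    advanced = True
                    break
            if not advanced:
                color[node] = BLACK  # fully explored, not on any cycle
                stack.pop()
    return False
```

The hard part, as the slides go on to show, is not this graph check but collecting a *meaningful* global WFG from local information.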
The protocol
p_0 sends a message to p_1 … p_3
On receipt of p_0's message, p_i replies with its state and wait-for info
An execution
[Space-time diagrams over p1, p2, p3: the states collected by the protocol are taken at different times at different processes.]
Ghost Deadlock!
Houston, we have a problem...
Asynchronous system: no centralized clock, etc. etc.
Synchrony useful to coordinate actions and order events
Mmmmhhh...
Events and Histories
Processes execute sequences of events
Events can be of 3 types: local, send, and receive
e_p^i is the i-th event of process p
The local history h_p of process p is the sequence of events executed by process p
h_p^k: prefix that contains the first k events
h_p^0: initial, empty sequence
The history H is the set h_{p_0} ∪ h_{p_1} ∪ … ∪ h_{p_{n−1}}
NOTE: In H, local histories are interpreted as sets, rather than sequences, of events
Ordering events
Observation 1: Events in a local history are totally ordered
Observation 2: For every message m, send(m) precedes receive(m)
[Diagram: time lines of p_i and p_j, with a message m from one process to the other.]
Happened-before (Lamport [1978])
A binary relation → defined over events:
1. if e_i^k, e_i^l ∈ h_i and k < l, then e_i^k → e_i^l
2. if e_i = send(m) and e_j = receive(m), then e_i → e_j
3. if e → e′ and e′ → e″, then e → e″
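The three rules translate directly into a small computation of → over a finite execution. A sketch, assuming events are labeled (p, i) for the i-th event of process p, and messages are given as (send-event, receive-event) pairs; the representation is illustrative:

```python
from itertools import product

def happened_before(history_lengths, messages):
    """Compute Lamport's happened-before relation as a set of pairs.

    history_lengths: dict mapping process p to the length of h_p;
    event i of process p is labeled (p, i).
    messages: iterable of ((p, i), (q, j)) pairs, send -> receive."""
    # Rule 1: events of the same process, in program order
    edges = {((p, i), (p, i + 1))
             for p, n in history_lengths.items() for i in range(n - 1)}
    # Rule 2: send(m) -> receive(m)
    edges |= set(messages)
    # Rule 3: transitive closure (naive fixpoint; fine for small examples)
    hb = set(edges)
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(hb), list(hb)):
            if b == c and (a, d) not in hb:
                hb.add((a, d))
                changed = True
    return hb
```

Events related by → in neither direction are concurrent, which is exactly what makes global snapshots subtle.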
Space-Time diagrams
A graphic representation of a distributed execution
[Diagram: a horizontal time line for each of p1, p2, p3, with events on the lines and messages drawn as arrows between them.]
H and → impose a partial order
Runs and Consistent Runs
A run is a total ordering of the events in H that is consistent with the local histories of the processors
Ex: h_1, h_2, …, h_n is a run
A run is consistent if the total order imposed in the run is an extension of the partial order induced by →
A single distributed computation may correspond to several consistent runs!
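Checking whether a given run is consistent then amounts to comparing positions in the total order against the pairs of →. A sketch (events are arbitrary labels; hb is the happened-before relation given as a set of pairs):

```python
def is_consistent_run(run, hb):
    """True iff the run (a list of events, i.e. a total order) extends
    the partial order hb: no event may appear in the run before an
    event that happened-before it."""
    position = {e: i for i, e in enumerate(run)}
    return all(position[a] < position[b]
               for a, b in hb
               if a in position and b in position)
```

Several different runs can pass this check for the same computation, which is the slide's point: the partial order, not any one run, is the computation.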
Cuts
A cut C is a subset of the global history of H:
C = h_1^{c_1} ∪ h_2^{c_2} ∪ … ∪ h_n^{c_n}
The frontier of C is the set of events e_1^{c_1}, e_2^{c_2}, …, e_n^{c_n}
[Diagram: a cut drawn across the time lines of p1, p2, p3.]
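A cut and its frontier can be computed directly from the prefix lengths c_1 … c_n. A sketch, assuming local histories are given as lists of event labels (the representation is illustrative):

```python
def make_cut(histories, prefix_lengths):
    """Build a cut from per-process prefix lengths.

    histories: dict mapping process p to its local history h_p (a list).
    prefix_lengths: dict mapping p to c_p, the number of events of p
    included in the cut.
    Returns (cut_events, frontier): the union of the prefixes h_p^{c_p},
    and the frontier, i.e. the last event of each nonempty prefix."""
    cut_events = {e for p, n in prefix_lengths.items()
                  for e in histories[p][:n]}
    frontier = {histories[p][n - 1]
                for p, n in prefix_lengths.items() if n > 0}
    return cut_events, frontier
```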
Global states and cuts
The global state of a distributed computation is an n-tuple of local states: Σ = (σ_1, …, σ_n)
To each cut (c_1 … c_n) corresponds a global state (σ_1^{c_1}, …, σ_n^{c_n})