why do we need models? there are many dimensions of variability in distributed systems. examples:...
TRANSCRIPT
Why do we need models?
There are many dimensions of variability in distributed systems. Examples: interprocess communication mechanisms, failure classes, security mechanisms etc.
Models are simple abstractions that help understand the variability -- abstractions that preserve the essential features, but hide the implementation details from observers who view the system at a higher level.
A message passing model
System is a graph G = (V, E).
V= set of nodes (sequential processes)
E = set of edges (links or channels) (bi/unidirectional)
Four types of actions by a process:
- Internal action
- Communication action
- Input action
- Output action
A reliable FIFO channel
• Axiom 1. Message m sent message m received
• Axiom 2. Message propagation delay is arbitrary but finite.
• Axiom 3. m1 sent before m2 m1 received before m2.
P
Q
Life of a process
When a message m is received
1. Evaluate a predicate with m
and the local variables;
2. if predicate = true then
- update internal variables;
-send k (k≥0) messages;
else skip {do nothing}
end if
Synchrony vs. Asynchrony
Send & receive can be blocking or non-blocking
Postal communication is asynchronous:
Telephone communication is synchronous
Synchronous communication or not?
(1) RPC,(2) Email
Synchronous
clocks
Local physical clocks synchronized
Synchronous processes
Lock-step synchrony
Synchronous channels
Bounded delay
Synchronous message-order
First-in first-out channels
Synchronous communication
Communication via handshaking
Shared memory model
Here is an implementation of a channel of capacity (max-1)
Sender P writes into the circular buffer and increments the pointer (when p ≠ q-1 mod max)
Receiver Q reads from the circular buffer and increments the pointer (when q ≠ p)
0
1
2
max-1
P Q
p q
Variations of shared memory models
0
3
21 State reading model.
Each process can read
the states of its neighbors
0
3
21Link register model.
Each process can read from
and write ts adjacent buffers
Modeling wireless networks
• Communication via broadcasting• Limited range• Dynamic topology• Collision of broadcasts
(handled by CSMA/CA protocol)• Low power
0
1
2
3
4
5
6
0
1
2
3
4
5
6
(a)
(b)
RTS RTS
CTS
Weak vs. Strong Models
One object (or operation) of a strong model = More than one objects (or operations) of a weaker model.
Often, weaker models are synonymous with fewer restrictions.
One can add layers to create a stronger model from weaker one).
Examples
HLL model is stronger than assembly language model.
Asynchronous is weaker than synchronous.
Bounded delay is stronger than unbounded delay (channel)
Model transformation
Stronger models
- simplify reasoning, but
- needs extra work to implement
Weaker models
- are easier to implement.
- Have a closer relationship with the real
world
Can model X be implemented using model Y is an interesting question in computer science.
Sample problems
Non-FIFO channel to FIFO channel
Message passing to shared memory
Atomic broadcast to non-atomic broadcast
A resequencing protocol{sender process P} {receiver process Q}var i : integer {initially 0} var k : integer {initially 0}
buffer: buffer[0..∞] of msg {initially k: buffer [k] = empty
repeat repeat send m[i],i to Q; {STORE} receive m[i],i from P; i := i+1 store m[i] into buffer[i];
forever {DELIVER} while buffer[k] ≠ empty do begin
deliver content of buffer [k];buffer [k] := empty k := k+1;
endforever
Observations(a) Needs Unbounded sequence numbers and (b) Needs Unbounded number of buffer slots This is bad.
Now solve the same problem on a model where (a) The propagation delay has a known upper bound of T.(b) The messages are sent out @ r per unit time.(c) The messages are received at a rate faster than r.
The buffer requirement drops to r.T. Synchrony pays.
Question. How can we solve the problem using bounded buffer space when the propagation delay is arbitrarily large?
Non-atomic to atomic broadcast
We now include crash failure as a part of our model. First realize what the difficulty is with a naïve approach:
{process i is the sender}for j = 1 to N-1 (j ≠ i) send message m to neighbor[j]
What if the sender crashes at the middle?
Suggest an algorithm for atomic broadcast.
Other classifications
Reactive vs Transformational systemsA reactive system never sleeps (like: a server, or a token ring)A transformational (or non-reactive systems) reaches a fixed point
after which no further change occurs in the system
Named vs Anonymous systemsIn named systems, process id is a part of the algorithm. In anonymous systems, it is not so. All are equal.
(-) Symmetry breaking is often a challenge.
(+) Saves log N bits. Easy to switch one process by another with no side effect)
Model and complexity
Many measures
o Space complexityo Time complexityo Message complexityo Bit complexityo Round complexity
Consider broadcasting in an n-cube (n=3)
0 1
2
3
4
5
6
7
source
Broadcasting using messages{Process 0} sends m to neighbors
{Process i > 0}
repeatreceive message m {m contains
the value};if m is received for the first time then x[i] := m.value; send x[i] to
each neighbor j > I else discard mend ifforeverWhat is the (1) message complexity (2) space complexity per process?
0 1
2
3
4
5
6
7
Each process jhas a variable x[j]initially undefined
Broadcasting using shared memory
{Process 0} x[0] := v{Process i > 0}repeatif a neighbor j < i : x[i] ≠ x[j]
then x[i] := x[j] {this is a step}
else skip end ifforever
What is the time complexity? How many steps are needed?Can be arbitrarily large!
0 1
2
3
4
5
6
7
Each process jhas a variable x[j]initially undefined
Broadcasting using shared memory
Now, use “large atomicity”, where
in one step, a process can read
the states of ALL its neighbors.
What is the time complexity?
How many steps are needed?
The time complexity is now O(n2) 0 1
2
3
4
5
6
7
Each process jhas a variable x[j]initially undefined
Complexity in rounds
Rounds are truly defined for
synchronous systems. An
asynchronous round consists
of a number of steps where
every process (including the
slowest one) takes at least
one step.
How many rounds do you need
to complete the broadcast?
0 1
2
3
4
5
6
7
Each process jhas a variable x[j]initially undefined
Exercises
Consider an anonymous distributed system consisting of N
processes. The topology is a completely connected
network, and the links are bidirectional. Propose an
algorithm using which processes can acquire unique
identifiers. (Hint: use coin flipping, and organize the
computation in rounds).
Exercises
Alice and Bob enter into an agreement: whenever one falls
sick, (s)he will call the other person. Since making the
agreement, no one called the other person, so both
concluded that they are in good health. Assume that the
clocks are synchronized, communication links are perfect,
and a telephone call requires zero time to reach. What kind
of interprocess communication model is this?