ics362 – distributed systems dr. ken cosh lecture 8
TRANSCRIPT
![Page 1: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/1.jpg)
ICS362 – Distributed Systems
Dr. Ken Cosh
Lecture 8
![Page 2: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/2.jpg)
Review
Replication & Consistency– Data Centric Consistency Models
Continuous Consistency Sequential Consistency Causal Consistency Entry (/Release) Consistency
– Client Centric Consistency Models Eventual Consistency Monotonic Reads Monotonic Writes Read Your Writes Writes Follows Reads
– Replica Management Replica & Content Placement
– Protocols Remote Write Protocols Local Write Protocols Active Replication – Quorum Based Protocols
![Page 3: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/3.jpg)
This Week
Fault Tolerance– Process Resilience– Reliable Client-Server Communication– Reliable Group Communication– Distributed Commit– Recovery
![Page 4: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/4.jpg)
Fault Tolerance
One of our primary Distributed Systems goals was Fault Tolerance– i.e. a partial failure may result in some
components not working, but at the same time other components may be totally unaffected
Whereas in non-Distributed Systems a failure may bring down the whole system.
![Page 5: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/5.jpg)
Dependent Systems
Fault tolerance is closely related to the concept of dependability
– i.e. the degree of trust users have with a system. In distributed systems we consider the following
properties affecting dependability– Availability– Reliability– Safety– Maintainability– (Security)
![Page 6: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/6.jpg)
Availability / Reliability
Availability– The property that a system is ready to be used when
requested Measured by a probability
Reliability– The property that a system can run continuously without
failure Measured by a time interval
Note: These are different definitions to those discussed in other courses… ;)
![Page 7: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/7.jpg)
Availability / Reliability
If a system goes down for one millisecond every hour;– It is highly available (>99.9999%)– But highly unreliable
If a system never crashes, but is shut down for 2 weeks each year– It is highly reliable– But not very available (96%)
![Page 8: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/8.jpg)
Safety / Maintainability
Safety– Situations where a temporary failure in a system leads to
something catastrophic Human life, injury, environmental damage etc.
Maintainability– Refers to how easily a failed system can be repaired– Highly maintainable systems are often highly available
Especially if the failures can be automatically detected and corrected
![Page 9: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/9.jpg)
Failures
A system fails when it doesn’t perform as promised– When one or more service can’t be provided
An error is the system state which leads to the failure
– Perhaps a message sent across the network is damaged
An error is caused by a fault (hence fault tolerance– The fault could be incorrect transmission medium (which is
easily corrected), or poor weather conditions (which is not so easily corrected).
![Page 10: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/10.jpg)
Faults
Faults lead to Errors, Errors lead to Failures– But there are different types of faults.
Transient Faults– Occur once and then disappear.
E.g. bird flies through a microwave beam transmitter– The operation can simply be repeated
Intermittent Faults– Occur, then vanish then reappear
E.g. loose contact on a connector– Typically disappear when the engineer arrives!
Permanent Faults– Occur until the faulty component is replaced
E.g. Burnt out disk, software bug
![Page 11: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/11.jpg)
Failure Models
Distributed Systems– Collection of Clients & Servers communicating
and providing services Both machine and communication channels could cause
faults
– Complex dependencies between servers A faulty server may be caused by a fault within a
different server
– There are several different types of failures
![Page 12: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/12.jpg)
Crash Failure
Server prematurely halts, but was working until it stopped.– Perhaps caused by the operating system in which
case there is one solution
Reboot it! Our PCs suffer from crash failures so
frequently that we just ‘accept it’– the reset button is now on the front of the case.
![Page 13: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/13.jpg)
Omission Failure
Server fails to respond to a request– Receive Omission
When the server didn’t receive the request in the first place.
– Send Omission When the server fails to send the response
– Perhaps a send buffer overflow.
![Page 14: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/14.jpg)
Timing Failure
When the server’s response is outside of a specified time interval– Remember isochronous data streams?– Providing data too soon can cause as many
problems as being too late…
![Page 15: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/15.jpg)
Response failure
When the server’s response is just incorrect Value Failure
– When the server simply sends the wrong reply to a request
State Transition Failure– When the server reacts unexpectedly to an
incoming request Perhaps it can’t recognise the message, or perhaps it
has no code for dealing with the message.
![Page 16: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/16.jpg)
Arbitrary Failures
Perhaps the most serious failures, also known as Byzantine Failures.– Server produces output that it shouldn’t have, but
it can’t be detected as being incorrect.– Worse is when the server works maliciously with
other servers to produce intentionally wrong answers
We’ll return to Byzantine later…
![Page 17: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/17.jpg)
Redundancy
The key to masking failures is Redundancy– Information Redundancy
Extra bits added to allow recovery from damaged bits (e.g. Hamming codes)
– Time Redundancy If need be after a period of time the action is performed
again (perhaps if a transaction aborts)
– Physical Redundancy Extra equipment / processes to make it possible to
continue with broken components (replication!)
![Page 18: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/18.jpg)
Physical Redundancy
We have 2 eyes, 2 ears, 2 lungs… A boeing 747 has 4 engines, but can fly with
only 3. In football we have a referee and 2 referees
assistants (linesmen) TMR or (Triple Modular Redundancy) works
by having 3 components
![Page 19: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/19.jpg)
Triple Modular Redundancy
![Page 20: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/20.jpg)
Triple Modular Redundancy
Suppose A2 fails.– Each voter (V1, V2, V3) gets 2 good inputs
allowing them to pass the correct value to stage B.
Suppose voter V1 fails.– B1 will get an incorrect input, but B2 & B3 can
produce the correct output so V4-V6 can choose the correct response.
![Page 21: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/21.jpg)
Process Resilience
Similar to TMR, the key to tolerating faulty processes is organising multiple identical processes in a group.
– When a message is sent to the group, all processes receive it, in the hope that one can deal with it.
Process groups are dynamic– A process can join or leave, and a process could be part of
multiple groups at the same time
The group can be considered as a single abstraction– i.e. a message can be sent to the group regardless of which
processes are in the group
![Page 22: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/22.jpg)
Flat Groups vs Hierarchical Groups
In a Flat Group all processes are equal– Decisions are made collectively
In a Hierarchical Group one process may be the co-ordinator– The co-ordinator decides which worker process is
best to perform some request
![Page 23: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/23.jpg)
Flat Groups vs Hierarchical Groups
![Page 24: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/24.jpg)
Flat Groups vs Hierarchical Groups
Flat Groups have no single point of failure– If one crashes, the group continues but just
becomes smaller– But, decision making is complicated, often
involving a vote Hierarchical Groups are the opposite
– If the coordinator breaks, the group breaks– But, the coordinator can make decisions without
interrupting the others
![Page 25: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/25.jpg)
Group Membership Management
How do we know which processes are part of a group?
– We could have a group server responsible for creating, deleting groups and allowing processes to join and leave a group
This is efficient, but again results in a single point of failure
– Alternatively it could be managed in a distributed style To join or leave a group a process simply lets everyone know
they are there or they are leaving Assuming they leave voluntarily and don’t just crash
![Page 26: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/26.jpg)
Group Membership Management
A further issue with distributed management is that joining / leaving needs to be synchronous with messages being sent
– i.e. when a process joins it should then receive all subsequent messages and should stop receiving messages when it leaves
Which means joining and leaving are added to the process queue
Also, what happens when too many processes leave and the group can’t function any longer?
– We need to rebuild the group – what if multiple processes attempt to rebuild the group simultaneously?
![Page 27: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/27.jpg)
How many processes are needed?
A system is k fault tolerant, if k components fail and it continues working.
– If processes fail silently k+1 processes are needed.– If processes exhibit Byzantine failures, 2k+1 are needed
Byzantine failures occur when a process continues to send erroneous or random replies
But how do we determine (with certainty) that k processes might fail, but k+1 won’t?
![Page 28: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/28.jpg)
What are the processes deciding?
Who should be coordinator? Whether or not to commit a transaction? How do we divide up tasks? How / When should we synchronise? …
![Page 29: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/29.jpg)
Failure Detection
How can we know when a process has failed?– Ping - “Are you alive?”
But is it the process or the communication channel that has failed?
False Positives
– Gossiping – “I’m alive!”
![Page 30: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/30.jpg)
Reliable CommunicationClient / Server
As well as processes being ‘unreliable’, the communication between processes is ‘unreliable’.
Building a fault tolerant DS involves managing point to point communication.
– TCP masks omission failures such as lost messages using acknowledgements and retransmissions
– But this doesn’t resolve crash failures when the server may crash during transmission
![Page 31: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/31.jpg)
RPC Semantics
RPC works well when client and server are functioning. If there is a crash it’s not easy to mask the difference between local and remote calls.
– The client is unable to locate the server– The request message from the client to the server is lost– The server crashes after receiving a request– The reply message from the server to the client is lost– The client crashes after sending a request
Each pose different problems
![Page 32: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/32.jpg)
Client Cannot Locate Server
Server could be down, or perhaps has upgraded and is now using a different communication format
We could generate an exception (Java, C)– Not every language has exceptions– Exceptions destroy the transparency
If the RPC responds with an exception “Cannot Locate Server”, it is clear that it isn’t a single processor system.
![Page 33: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/33.jpg)
Lost Request Messages
Easiest to deal with– Start a timer, if the timer expires before an
acknowledgement or a reply, then send the message again.
Server just needs to detect if it is a message or a retransmission
– But, if too many messages are lost the client will conclude “Cannot Locate Server”
![Page 34: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/34.jpg)
Server Crashes
Tricky as there are different scenarios
The client can’t tell the difference between b and c, but they need different responses
![Page 35: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/35.jpg)
Server Crashes
The server has 2 options– At Least Once Semantics– At Most Once Semantics
While we would like– Exactly Once Semantics
There is no way to arrange this
![Page 36: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/36.jpg)
Semantics
At least Once Semantics– Wait until the server reboots and try the operation again.– Keep trying until you get a response– The RPC will be carried out at least once, but possibly
more. At most Once Semantics
– Give up immediately!– The RPC may have been carried out, but wont be carried
out more than once. Alternative:
– Give no guarantees, so the RPC may happen anywhere between zero or a large number of times.
![Page 37: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/37.jpg)
Server Crashes
The client also has options (4)– Never reissue a request– Always reissue a request– Reissue a request if it did not yet receive an
acknowledgement– Reissue a request if it received an
acknowledgement, but no reply
![Page 38: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/38.jpg)
Server Crashes
With 2 server strategies and 4 client strategies, there are 8 possible combinations– None of them are satisfactory
In short, the possibility of server crashes radically changes the nature of RPC, very different from single processor systems.
![Page 39: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/39.jpg)
Lost Reply Messages
Also difficult– Did the reply get lost, or is the server just slow?
Resend the request based on a client timer?– Depends whether the request is idempotent
Idempotency– Can the request is performed more than once
without any damage being done?
![Page 40: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/40.jpg)
Idempotency
Consider a request for the first 1024bytes of data from file “xyz.txt”
Consider a request to transfer 1,000,000B from your account to mine
What happens if the reply is lost 10 times?
![Page 41: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/41.jpg)
Lost Reply Messages
An alternative is to contain a sequence number within each request
– The retransmission will then have a different sequence number from the original request and the server can distinguish the two.
However, this requires the server to maintain administration for each client
A further option is to send a bit in the message header indicating if it is an original request or a retransmission
– Original requests can be performed, but care should be taken with retransmissions.
![Page 42: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/42.jpg)
Client Crashes
When the client (parent) crashes after it has sent an RPC then the process becomes an ‘orphan’.
– i.e. there is no parent waiting for the results of the process. Orphans cause problems
– They waste CPU (and other) resources– They can cause confusion if they send their result just after the
client reboots How can we deal with orphans?
– Exterminate them– Reincarnation– Gentle reincarnation– Expiration
![Page 43: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/43.jpg)
Orphan Extermination
Each time a client sends an RPC message it stores on a hard disk what it is about to do.
When it reboots it checks the log and explicitly kills off any orphans.
Downsides:– It’s expensive writing to disks– It might not work, as the orphans may have themselves
made RPC calls creating grand-orphans– If the network is broken it might not be possible to find the
orphans again– If the orphan has a lock on some resource, that lock may
remain in place forever
![Page 44: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/44.jpg)
Reincarnation
When the client returns it sends a message to all other machines declaring a new epoch– Complete with a new epoch number
All servers can check if they have remote computations and if so kill them– If any are missed when they report back they will
have a different epoch number so are easy to detect
![Page 45: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/45.jpg)
Gentle Reincarnation
When an epoch request comes in, each machine tries to locate the owner of their remote computations– If the owner can’t be located, the computation is
killed.
![Page 46: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/46.jpg)
Expiration
Each RPC is given a standard amount of time T to complete the job– If it can’t finish, then it explicitly asks for a new
quantum
If a client crashes and waits T before rebooting all orphans are sure to be gone.
The problem is choosing a suitable T.
![Page 47: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/47.jpg)
Reliable Group CommunicationProcess Groups
Reliable Multicasting enables messages to be delivered to all members of a process group
Unfortunately enabling reliable multicasting is not that easy
– Most transport layers support reliable point-to-point communication channels, but not reliable communication to groups.
– At its simplest we can use multiple point-to-point messages
![Page 48: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/48.jpg)
Reliable Multicasting
What happens when a process joins during the communication?– Should it get the message?
What happens if the sending process crashes?
To simplify, lets assume that we know who is in the group and nobody is going to join or leave
![Page 49: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/49.jpg)
Basic Reliable Multicasting
![Page 50: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/50.jpg)
Basic Reliable Multicasting
Each message has a sequence number and then stores the message until it receives “Acknowledge” from every other process.
If a receiver missed a message it can simple request resubmission– Or if the sender doesn’t get all the
acknowledgements within a certain amount of time, it can resend the message.
![Page 51: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/51.jpg)
Scalability?
Clearly as the process group grows, there are an increasing number of ‘Acknowledgements’
– Do we need to give this feedback?
We could only give the negative acknowledgements – and this would scale better.
– However the sender is then forced to keep all sent messages in a buffer indefinitely waiting for retransmission requests
![Page 52: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/52.jpg)
Scalable Reliable Multicasting
Only negative acknowledgements (NACK) are sent.– What happens if there are a lot of NACKs?
When a process notices a missing message, it multicasts the NACK– But waits a random delay R before the NACK.– Therefore if another process receives a NACK it
can suppress it’s own NACK feedback as it knows the message will be retransmitted shortly.
![Page 53: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/53.jpg)
Scalable Reliable Multicasting
![Page 54: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/54.jpg)
Scalable Reliable Multicasting
One downside is that all processes are interrupted by the NACK, even those who successfully received the original message
It could be more efficient to group processes who regularly miss the same messages
![Page 55: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/55.jpg)
Hierarchical Feedback Control
![Page 56: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/56.jpg)
Atomic Multicast
Now lets reconsider reliable multicasting in the presence of potential process failure
– The message needs to be delivered to all processes or none at all.
– The messages also need to be delivered in the same order to all processes
Messages can be stored in a middleware layer and delivered to the application when an agreement is made on group membership
– If a process fails, it is no longer a member of the group, if it rejoins it must have it’s state brought up to date before continuing
![Page 57: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/57.jpg)
Message Receipt vs Message Delivery
![Page 58: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/58.jpg)
Group View
A group view is the list of processes contained in a group.
– Suppose message m is multicast through the group– Simultaneously a process joins or leaves the group creating
a view change message vc
We have to ensure that m is delivered to all processes before vc
– (Unless vc indicates that the sender of m has failed)
![Page 59: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/59.jpg)
Atomic Multicasting
![Page 60: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/60.jpg)
View Changes
Here we create virtual synchrony A view change acts as a barrier across no
multicast can pass It is comparable to using a synchronisation
variable Each view change ushers in a new epoch
![Page 61: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/61.jpg)
Multicast Message Ordering
We consider 4 different orderings– Unordered Multicasts– FIFO-ordered Multicasts– Causally-ordered Multicasts– Totally-ordered Multicasts
![Page 62: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/62.jpg)
Reliable Unordered Multicasts
No message ordering constraints
![Page 63: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/63.jpg)
Reliable FIFO-ordered multicasts
Messages from any one process are delivered in order
![Page 64: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/64.jpg)
Reliable Causally-ordered Multicast
Regardless of where the messages come from, if one causally precedes another the communication layer will deliver them in order– This can be implemented through vector
timestamps
![Page 65: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/65.jpg)
Total-ordered Multicasts
When messages are delivered, they are delivered in the same order to all group members– With FIFO ordering still respected.
![Page 66: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/66.jpg)
Distributed Commit
The Atomic Multicast problem is an example of the Distributed Commit problem– Whereby a distributed set of processes commit to
performing some operation, or not. One Phase Commit?
– The co-ordinator instructs processes to perform an operation (or not)
– With the obvious problem when a process may not be able to perform the operation
![Page 67: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/67.jpg)
2 Phase Commit
1) Co-ordinator sends out VOTE_REQUEST 2) Participant responds with VOTE_COMMIT
or VOTE_ABORT 3) Co-ordinator compares responses and
sends out GLOBAL_COMMIT or GLOBAL_ABORT
4) Participant either performs operation (or not!)
![Page 68: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/68.jpg)
2-Phase Commit
At least we can discover if processes are capable of performing the operation, but further issues arise when processes crash (or are blocked waiting for a response).
If a process crashes time outs can be used. If the co-ordinator crashes, processes could
consult with one another to figure out whether to Globally Commit or not.
![Page 69: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/69.jpg)
3-Phase Commit
The biggest problem arises when the Co-ordinator blocks or crashes with all processes waiting for the GLOBAL_COMMIT or GLOBAL_ABORT message
This is rare, but a 3 Phase Commit is (at least theoretically) applicable to resolve this.
![Page 70: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/70.jpg)
Recovery
A further aspect of fault tolerance involves the ability to recover from an error (before it results in a failure).– Backward Recovery
Return from a present erroneous state to a previous error-free state (a check point)
– Forward Recovery Attempt to move forward to a new error-free state.
![Page 71: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/71.jpg)
Forward & Backward Recovery
Backward– Consider retransmission of messages – here we
return to a previous state before the message was sent.
Forward– In Erasure Correction a missing packet is
constructed from information in other packets – when a message can be constructed from k out of n packets.
![Page 72: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/72.jpg)
Backward Recovery
Widely implemented technique Involves creating checkpoints that can be
returned to Challenges
– It can be expensive in terms of performance– No guarantee that the error will simply reoccur– It might not be possible
Try rolling back an ATM to before the 10,000B was erroneously issued.
![Page 73: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/73.jpg)
Checkpointing with Logging
One way of improving the costs of creating regular checkpoints– Log messages (sent and received) in between
checkpoints, and then replay the messages
![Page 74: ICS362 – Distributed Systems Dr. Ken Cosh Lecture 8](https://reader036.vdocuments.us/reader036/viewer/2022081603/56649f2f5503460f94c49d53/html5/thumbnails/74.jpg)
Review
Introduction to Fault Tolerance Process Resilience Reliable Client – Server Communication Reliable Group Communication Distributed Commit Recovery