outline
04/22/23 COP5611 1
Outline
• Announcement
• Midterm Review
• Distributed File Systems – continued
  – If we have time
Announcements
• Please turn in your homework #3 at the beginning of class
• The midterm will be on March 20
  – This coming Thursday
  – It will be an open-book, open-note exam
Operating System
• An operating system is a layer of software on a bare machine that performs two basic functions
  – Resource management
    • To manage resources so that they are used in an efficient and fair manner
  – User friendliness
Distributed Systems
• A distributed system is a collection of independent computers that appears to its users as a single coherent system
  – Independent computers means that they do not share memory or a clock
  – The computers communicate with each other by exchanging messages over a communication network
Distributed Systems – cont.
Distributed Systems – cont.
• Advantages
  – The computing power of a group of cheap workstations can be enormous
    • Decisive price/performance advantage over traditional time-sharing systems
  – Resource sharing
  – Enhanced performance
  – Improved reliability and availability
  – Modular expandability
Distributed System Architecture – cont.
• Distributed systems are often classified based on the hardware
  – Multiprocessor systems
  – Homogeneous multi-computer systems
  – Heterogeneous multi-computer systems
Distributed Operating Systems
• Hardware for distributed systems is important, but the software largely determines what a distributed system looks like to a user
• Distributed operating systems are much like traditional operating systems
  – Resource management
  – User friendliness
  – The key concept is transparency
Distributed Operating Systems – cont.
• In a truly distributed operating system, the user views the system as a virtual uniprocessor system even though physically it consists of multiple computers
  – In other words, the use of multiple computers and the accessing of remote data and resources should be invisible to the user
Overview of Different Kinds of Distributed Systems
| System | Description | Main Goal |
| --- | --- | --- |
| DOS | Tightly-coupled operating system for multi-processors and homogeneous multicomputers | Hide and manage hardware resources |
| NOS | Loosely-coupled operating system for heterogeneous multicomputers (LAN and WAN) | Offer local services to remote clients |
| Middleware | Additional layer atop NOS implementing general-purpose services | Provide distribution transparency |
Multicomputer Operating Systems
• General structure of a multicomputer operating system
Network Operating System
Middleware and Openness
• In an open middleware-based distributed system, the protocols used by each middleware layer should be the same, as well as the interfaces they offer to applications.
Comparison Between Systems
| Item | Distributed OS (Multiproc.) | Distributed OS (Multicomp.) | Network OS | Middleware-based OS |
| --- | --- | --- | --- | --- |
| Degree of transparency | Very High | High | Low | High |
| Same OS on all nodes | Yes | Yes | No | No |
| Number of copies of OS | 1 | N | N | N |
| Basis for communication | Shared memory | Messages | Files | Model specific |
| Resource management | Global, central | Global, distributed | Per node | Per node |
| Scalability | No | Moderately | Yes | Varies |
| Openness | Closed | Closed | Open | Open |
Issues in Distributed Operating Systems
• Absence of global knowledge
  – In a distributed system, due to the unavailability of a global memory and a global clock and due to unpredictable message delays, it is practically impossible for a computer to collect up-to-date information about the global state of the distributed system
  – Therefore a fundamental problem is to develop efficient techniques to implement decentralized system-wide control
  – Another problem is how to order all the events
Issues in Distributed Operating Systems – cont.
• Naming
  – Plays an important role in achieving location transparency
  – A name service maps a logical name into a physical address by making use of a table lookup, an algorithm, or a combination of both
  – In distributed systems, the tables may be replicated and stored at many places
    • Consider naming in a distributed file system
Issues in Distributed Operating Systems – cont.
• Scalability
  – Systems generally grow with time, especially distributed systems
  – Scalability requires that the growth should not result in system unavailability or degraded performance
  – This puts additional constraints on design approaches
Issues in Distributed Operating Systems – cont.
• Compatibility
  – Refers to the interoperability among the resources in a system
  – Three different levels
    • Binary level
      – All processors execute the same binary instruction repertoire
      – Virtual binary level
    • Execution level
      – Same source code can be compiled and executed properly
    • Protocol level
      – A common set of protocols
Issues in Distributed Operating Systems – cont.
• Process synchronization
  – The synchronization of processes in distributed systems is difficult because of the unavailability of shared memory
    • Processes running on different computers must be synchronized when they try to concurrently access a shared resource
    • This is the mutual exclusion problem as in classical operating systems
Issues in Distributed Operating Systems – cont.
• Resource management
  – Resource management needs to make both local and remote resources available to users in an effective manner
  – Data migration
    • Distributed file system
    • Distributed shared memory
  – Computation migration
    • Remote procedure call
  – Distributed scheduling
Issues in Distributed Operating Systems – cont.
• Structuring
  – The distributed operating system requires some additional constraints on the structure of the underlying operating system
  – The collective kernel structure
    • An operating system is structured as a collection of processes that are largely independent of each other
  – Object-oriented operating system
    • The operating system’s services are implemented as objects
Clients and Servers
• General interaction between a client and a server.
Layered Protocols
• Layers, interfaces, and protocols in the OSI model.
Network Layer
• The primary task of a network layer is routing
• The most widely used network protocol is the connection-less IP (Internet Protocol)
  – Each IP packet is routed to its destination independent of all others
• A connection-oriented protocol is gaining popularity
  – Virtual channel in ATM networks
Transport Layer
• This layer is the last part of a basic network protocol stack
  – In other words, this layer can be used by application developers
• An important aspect of this layer is to provide end-to-end communication
  – The Internet transport protocol is called TCP (Transmission Control Protocol)
  – The Internet protocol suite also supports a connectionless transport protocol called UDP (User Datagram Protocol)
Sockets
• Socket primitives for TCP/IP.

| Primitive | Meaning |
| --- | --- |
| Socket | Create a new communication endpoint |
| Bind | Attach a local address to a socket |
| Listen | Announce willingness to accept connections |
| Accept | Block caller until a connection request arrives |
| Connect | Actively attempt to establish a connection |
| Send | Send some data over the connection |
| Receive | Receive some data over the connection |
| Close | Release the connection |
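The socket primitives map closely onto Python's standard `socket` module. The following is a minimal sketch of a one-shot echo exchange, assuming a loopback address and an OS-chosen port; the names `run_echo_server` and `echo_once` are illustrative and do not come from the slides.

```python
import socket
import threading


def run_echo_server(ready, info):
    """Server side: Socket, Bind, Listen, Accept, Receive, Send, Close."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:  # Socket
        srv.bind(("127.0.0.1", 0))    # Bind (port 0: let the OS pick a free port)
        srv.listen(1)                 # Listen
        info["port"] = srv.getsockname()[1]
        ready.set()                   # tell the client the port is known
        conn, _addr = srv.accept()    # Accept: blocks until a client connects
        with conn:                    # leaving the block performs Close
            data = conn.recv(1024)    # Receive
            conn.sendall(data)        # Send the same bytes back


def echo_once(message: bytes) -> bytes:
    """Client side: Socket, Connect, Send, Receive, Close."""
    ready, info = threading.Event(), {}
    server = threading.Thread(target=run_echo_server, args=(ready, info))
    server.start()
    ready.wait(timeout=5)
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as cli:  # Socket
        cli.connect(("127.0.0.1", info["port"]))                    # Connect
        cli.sendall(message)                                        # Send
        reply = cli.recv(1024)                                      # Receive
    server.join()
    return reply
```

Calling `echo_once(b"hello")` runs the server in a background thread and returns whatever the server echoed. A real server would loop over `accept` and handle short reads, which this sketch omits.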
Sockets – cont.
• Connection-oriented communication pattern using sockets.
Socket Programming
• Review
  – IP
  – TCP
  – UDP
  – Port
• Server Design Issues
  – Iterative vs. concurrent server
  – Stateless vs. stateful server
  – Multithreaded server
A Multithreaded Server
The Message Passing Model
• The message passing model provides two basic communication primitives
  – Send and receive
  – Send has two logical parameters, a message and its destination
  – Receive has two logical parameters, the source and a buffer for storing the message
Semantics of Send and Receive Primitives
• There are several design issues regarding send and receive primitives
  – Buffered or un-buffered
  – Blocking vs. non-blocking primitives
    • With blocking primitives, the send does not return control until the message has been sent or received, and the receive does not return control until a message is copied to the buffer
    • With non-blocking primitives, the send returns control as soon as the message is copied, and the receive signals its intention to receive a message and provides a buffer for it
Semantics of Send and Receive Primitives – cont.
• Synchronous vs. asynchronous primitives
  – With synchronous primitives, a SEND primitive is blocked until a corresponding RECEIVE primitive is executed
  – With asynchronous primitives, a SEND primitive does not block if there is no corresponding execution of a RECEIVE primitive
    • The messages are buffered
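The difference between a synchronous and an asynchronous SEND can be sketched with a small in-process channel. This is only an illustration under stated assumptions: the `Channel` class and its method names are hypothetical, and threads stand in for processes on different machines.

```python
import queue
import threading


class Channel:
    """One-directional message channel with a buffer (hypothetical helper)."""

    def __init__(self):
        self._buffer = queue.Queue()

    def send_async(self, message):
        # Asynchronous SEND: buffer the message and return immediately.
        self._buffer.put((message, None))

    def send_sync(self, message):
        # Synchronous SEND: block until a corresponding RECEIVE has run.
        received = threading.Event()
        self._buffer.put((message, received))
        received.wait()

    def receive(self):
        # Blocking RECEIVE: wait until a message is available.
        message, received = self._buffer.get()
        if received is not None:
            received.set()  # release the blocked synchronous sender
        return message


def demo():
    ch = Channel()
    log = []
    ch.send_async("a")  # returns although nobody has received yet

    def sync_sender():
        ch.send_sync("b")      # blocks until the main thread receives "b"
        log.append("sent b")

    sender = threading.Thread(target=sync_sender)
    sender.start()
    first, second = ch.receive(), ch.receive()
    sender.join()
    return first, second, log
```

In `demo`, the asynchronous send completes before any receive is posted, while the synchronous sender only records `"sent b"` after the matching receive has taken its message.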
Remote Procedure Call
• RPC is designed to hide all the details from programmers
  – Overcome the difficulties with the message-passing model
• It extends the conventional local procedure calls to calling procedures on remote computers
Steps of a Remote Procedure Call – cont.
Remote Procedure Call – cont.
• Design issues
  – Structure
    • Mostly based on stub procedures
  – Binding
    • Through a binding server
    • The client specifies the machine and service required
  – Parameter and result passing
    • Representation issues
    • By value and by reference
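The stub structure can be illustrated with a short sketch. It assumes JSON as the wire representation and replaces the network with a direct function call; `PROCEDURES`, `server_stub`, and `client_stub` are hypothetical names chosen for this example.

```python
import json

# Hypothetical server-side procedures, registered by name.
PROCEDURES = {"add": lambda a, b: a + b}


def server_stub(request_bytes: bytes) -> bytes:
    """Unmarshal the request, call the local procedure, marshal the result."""
    request = json.loads(request_bytes.decode("utf-8"))
    result = PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode("utf-8")


def client_stub(proc: str, *args, transport=server_stub):
    """Marshal the call into a message, 'send' it, and unmarshal the reply."""
    request_bytes = json.dumps({"proc": proc, "args": list(args)}).encode("utf-8")
    reply_bytes = transport(request_bytes)  # stands in for a network round trip
    return json.loads(reply_bytes.decode("utf-8"))["result"]
```

To the caller, `client_stub("add", 2, 3)` looks like a local call. Because the arguments are copied into the message, this sketch only supports passing by value; passing by reference would need extra machinery that is not shown.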
Remote Object Invocation
• Extend RPC principles to objects
  – The key feature of an object is that it encapsulates data (called state) and the operations on those data (called methods)
  – Methods are made available through an interface
  – The separation between interfaces and the objects implementing these interfaces allows us to place an interface at one machine, while the object itself resides on another machine
Distributed Objects
• Common organization of a remote object with client-side proxy.
Inherent Limitations of a Distributed System
• Absence of a global clock
  – In a centralized system, time is unambiguous
  – In a distributed system, there exists no system-wide common clock
    • In other words, the notion of global time does not exist
  – Impact of the absence of global time
    • Difficult to reason about temporal order of events
    • Makes it harder to collect up-to-date information on the state of the entire system
Inherent Limitations of a Distributed System
• Absence of shared memory
  – An up-to-date state of the entire system is not available to any individual process
    • This information, however, is necessary for reasoning about the system’s behavior, debugging, and recovering from failures
Lamport’s Logical Clocks
• Logical clocks
  – For a wide class of algorithms, what matters is the internal consistency of clocks, not whether they are close to the real time
  – For these algorithms, the clocks are often called logical clocks
• Lamport proposed a scheme to order events in a distributed system using logical clocks
Lamport’s Logical Clocks – cont.
• Definitions
  – Happened before relation
    • Happened before relation (→) captures the causal dependencies between events
    • It is defined as follows
      – a → b, if a and b are events in the same process and a occurred before b.
      – a → b, if a is the event of sending a message m in a process and b is the event of receipt of the same message m by another process
      – If a → b and b → c, then a → c, i.e., “→” is transitive
Lamport’s Logical Clocks – cont.
• Definitions – continued
  – Causally related events
    • Event a causally affects event b if a → b
  – Concurrent events
    • Two distinct events a and b are said to be concurrent (denoted by a || b) if a ↛ b and b ↛ a
    • For any two events, either a → b, b → a, or a || b
Lamport’s Logical Clocks – cont.
• Implementation rules
  – [IR1] Clock Ci is incremented between any two successive events in process Pi
      Ci := Ci + d (d > 0)
  – [IR2] If event a is the sending of message m by process Pi, then message m is assigned a timestamp tm = Ci(a). On receiving the same message m by process Pj, Cj is set to
      Cj := max(Cj, tm + d)
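Rules IR1 and IR2 fit in a few lines of code. The sketch below is one possible reading, not a definitive implementation: the class name `LamportClock` is illustrative, and it assumes that a receive counts as an event in its own right, so IR1 is applied before IR2 on receipt.

```python
class LamportClock:
    """Scalar logical clock for one process (illustrative sketch)."""

    def __init__(self, d: int = 1):
        self.d = d      # increment d > 0
        self.time = 0   # current value of Ci

    def tick(self) -> int:
        # IR1: Ci := Ci + d before each local event.
        self.time += self.d
        return self.time

    def send(self) -> int:
        # Sending is an event; its clock value is the message timestamp tm.
        return self.tick()

    def receive(self, tm: int) -> int:
        # Assumption: the receive is itself an event, so IR1 is applied first,
        # then IR2: Cj := max(Cj, tm + d).
        self.time = max(self.time + self.d, tm + self.d)
        return self.time
```

For example, if P1 performs one local event and then sends, the message carries timestamp 2, and a receiver whose clock was still 0 jumps to 3.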
An Example
Total Ordering Using Lamport’s Clocks
• If a is any event at process Pi and b is any event at process Pj, then a => b if and only if either
    $C_i(a) < C_j(b)$, or
    $C_i(a) = C_j(b)$ and $P_i \prec P_j$
  – Where $\prec$ is any arbitrary relation that totally orders the processes to break ties
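The total order compares clock values first and falls back on a fixed ordering of the processes only when the clock values are equal. A minimal sketch, assuming events are represented as `(clock_value, process_id, label)` tuples and that numeric process ids serve as the tie-breaking relation (both are assumptions made for this example):

```python
def precedes(a, b) -> bool:
    """True if event a => event b in the total order."""
    clock_a, pid_a, _ = a
    clock_b, pid_b, _ = b
    return clock_a < clock_b or (clock_a == clock_b and pid_a < pid_b)


def totally_ordered(events):
    """Sort events by (clock value, process id), which realizes =>."""
    return sorted(events, key=lambda e: (e[0], e[1]))


events = [(2, 1, "b"), (1, 2, "a"), (2, 0, "c")]
labels = [label for _, _, label in totally_ordered(events)]
```

Here the two events with clock value 2 are ordered by process id, so event `"c"` at process 0 comes before event `"b"` at process 1.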
A Limitation of Lamport’s Clocks
• In Lamport’s system of logical clocks
  – If a → b, then C(a) < C(b)
  – The reverse is not necessarily true if the events have occurred on different processes
A Limitation of Lamport’s Clocks
Vector Clocks
• Implementation rules
  – [IR1] Clock Ci is incremented between any two successive events in process Pi
      Ci[i] := Ci[i] + d (d > 0)
  – [IR2] If event a is the sending of message m by process Pi, then message m is assigned a timestamp tm = Ci(a). On receiving the same message m by process Pj, Cj is set to
      Cj[k] := max(Cj[k], tm[k]) for every k
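The vector clock rules can be sketched in the same style as a scalar clock. The class name `VectorClock` is illustrative, and the sketch assumes that a receive is itself an event at Pj, so the receiver's own component is also advanced by IR1.

```python
class VectorClock:
    """Vector clock for process i out of n processes (illustrative sketch)."""

    def __init__(self, n: int, i: int, d: int = 1):
        self.i, self.d = i, d
        self.v = [0] * n

    def tick(self):
        # IR1: Ci[i] := Ci[i] + d
        self.v[self.i] += self.d
        return list(self.v)

    def send(self):
        # The send is an event; the message carries a copy of the vector as tm.
        return self.tick()

    def receive(self, tm):
        # IR2: Cj[k] := max(Cj[k], tm[k]) for every k
        self.v = [max(mine, theirs) for mine, theirs in zip(self.v, tm)]
        # Assumption: the receive is an event at Pj, so IR1 applies as well.
        return self.tick()
```

With two processes, if P0 sends its first message (timestamp `[1, 0]`) and P1 has already had one local event, P1's clock after the receive is `[1, 2]`.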
Vector Clocks – cont.
Vector Clocks – cont.
• Assertion
  – At any instant, $\forall i, \forall j: C_i[i] \ge C_j[i]$
• Events a and b are causally related if ta < tb or tb < ta. Otherwise, these events are concurrent
• In a system of vector clocks, $a \rightarrow b$ iff $t_a < t_b$
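Comparing two vector timestamps is a component-wise test. The sketch below assumes the usual definition of `<` on vectors (every component less than or equal, and the vectors not identical); the function names are illustrative.

```python
def less_than(ta, tb) -> bool:
    """ta < tb: every component of ta is <= tb, and the vectors differ."""
    return all(x <= y for x, y in zip(ta, tb)) and list(ta) != list(tb)


def concurrent(ta, tb) -> bool:
    """Events are concurrent when neither timestamp is less than the other."""
    return not less_than(ta, tb) and not less_than(tb, ta)
```

For instance, `[1, 0]` is less than `[1, 2]`, so the first event happened before the second, while `[2, 0]` and `[0, 1]` are incomparable and the corresponding events are concurrent. This is the property scalar Lamport clocks lack.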
Causal Ordering of Messages
• The causal ordering of messages tries to maintain the same causal relationship that holds among “message send” events with the corresponding “message receive” events
  – In other words, if Send(M1) → Send(M2), then Receive(M1) → Receive(M2)
  – This is different from causal ordering of events
Causal Ordering of Messages – cont.
Causal Ordering of Messages – cont.
• The basic idea
  – It is very simple
  – Deliver a message only when no causality constraints are violated
  – Otherwise, the message is not delivered immediately but is buffered until all the preceding messages are delivered
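The buffer-then-deliver idea can be sketched for broadcast messages. The delivery test used here is the one commonly given for the Birman-Schiper-Stephenson protocol; the slides do not spell it out, so treat it as an assumption, along with the class name `CausalReceiver` and the convention that `tm[j]` counts the broadcasts sent by process j.

```python
class CausalReceiver:
    """Receiver that delays delivery until causality is respected (sketch)."""

    def __init__(self, n: int):
        self.delivered = [0] * n  # delivered[j]: broadcasts from Pj delivered here
        self.buffer = []          # messages that arrived too early
        self.log = []             # payloads in delivery order

    def _deliverable(self, sender, tm) -> bool:
        # Next message expected from the sender, and nothing the sender had
        # already seen is still missing at this receiver.
        return tm[sender] == self.delivered[sender] + 1 and all(
            tm[k] <= self.delivered[k] for k in range(len(tm)) if k != sender
        )

    def on_receive(self, sender, tm, payload):
        self.buffer.append((sender, tm, payload))
        progress = True
        while progress:  # one delivery may unblock buffered messages
            progress = False
            for item in list(self.buffer):
                s, t, p = item
                if self._deliverable(s, t):
                    self.buffer.remove(item)
                    self.delivered[s] += 1
                    self.log.append(p)
                    progress = True
        return list(self.log)
```

If P0 broadcasts `m1` with timestamp `[1, 0, 0]` and P1, after delivering `m1`, broadcasts `m2` with `[1, 1, 0]`, a receiver that gets `m2` first buffers it and delivers both in causal order once `m1` arrives.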
Birman-Schiper-Stephenson Protocol
Schiper-Eggli-Sando Protocol
Schiper-Eggli-Sando Protocol – cont.
Schiper-Eggli-Sando Protocol – cont.
Local State
• Local state
  – For a site Si, its local state at a given time is defined by the local context of the distributed application, denoted by LSi.
• More notations
  – mij denotes a message sent by Si to Sj
  – send(mij) and rec(mij) denote the corresponding sending and receiving events.
Definitions – cont.
Definitions – cont.
Global State – cont.
Definitions – cont.
Strongly consistent global state: A global state is strongly consistent if it is consistent and transitless
Global State – cont.
Chandy-Lamport’s Global State Recording Algorithm
Cuts of a Distributed Computation
• A cut is a graphical representation of a global state
  – A consistent cut is a graphical representation of a consistent global state
• Definition
  – A cut of a distributed computation is a set C = {c1, c2, ..., cn}, where ci is a cut event at site Si in the history of the distributed computation
Cuts of a Distributed Computation – cont.
Cuts of a Distributed Computation – cont.
Cuts of a Distributed Computation – cont.
Cuts of a Distributed Computation – cont.
Cuts of a Distributed Computation – cont.
The Critical Section Problem
• When processes (centralized or distributed) interact through shared resources, the integrity of the resources may be violated if the accesses are not coordinated
  – The resources may not record all the changes
  – A process may obtain inconsistent values
  – The final state of the shared resource may be inconsistent
Mutual Exclusion
• One solution to the problem is that at any time at most one process can access the shared resources
  – This solution is known as mutual exclusion
  – A critical section is a code segment in a process in which shared resources are accessed
    • A process can have more than one critical section
• There are problems which involve shared resources where mutual exclusion is not the optimal solution
The Structure of Processes
• Structure of process Pi
repeat
    entry section
    critical section
    exit section
    remainder section
until false;
Requirements of Mutual Exclusion Algorithms
• Freedom from deadlocks
  – Two or more sites should not endlessly wait for messages
• Freedom from starvation
  – A site should not have to wait indefinitely to execute its critical section
• Fairness
  – Requests are executed in the order based on logical clocks
• Fault tolerance
  – The algorithm continues to work when some failures occur
Performance Measure for Distributed Mutual Exclusion
• The number of messages per CS invocation
• Synchronization delay
  – The time required after a site leaves the CS and before the next site enters the CS
  – System throughput = 1/(sd + E), where sd is the synchronization delay and E the average CS execution time
• Response time
  – The time interval a request waits for its CS execution to be over after its request messages have been sent out
Performance Measure for Distributed Mutual Exclusion
A Centralized Algorithm
• It is a simple solution
  – One site, called the control site, is responsible for granting permission to the CS execution
  – To request the CS, a site sends a REQUEST message to the control site
    • When a site is done with CS execution, it sends a RELEASE message to the control site
  – The control site queues up the requests for the CS and grants them permission
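The control site's bookkeeping amounts to one holder plus a FIFO queue. The sketch below models only that bookkeeping; the class `ControlSite` and its method names are hypothetical, and message transport is left out.

```python
from collections import deque


class ControlSite:
    """Grants the critical section to one site at a time (sketch)."""

    def __init__(self):
        self.holder = None      # site currently allowed in the CS
        self.waiting = deque()  # queued REQUESTs, in arrival order

    def request(self, site):
        """Handle REQUEST; return the site granted permission now, or None."""
        if self.holder is None:
            self.holder = site
            return site
        self.waiting.append(site)
        return None

    def release(self, site):
        """Handle RELEASE; return the next site granted permission, or None."""
        assert site == self.holder, "only the current holder may release"
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder
```

If S1, S2, and S3 request in that order, S1 is granted at once, and each RELEASE hands permission to the next queued site. The control site is a single point of failure and a potential bottleneck, which motivates the distributed solutions.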
Distributed Solutions
• Non-token-based algorithms
  – Use timestamps to order requests and resolve conflicts between simultaneous requests
  – Lamport’s algorithm and Ricart-Agrawala Algorithm
• Token-based algorithms
  – A unique token is shared among the sites
  – A site is allowed to enter the CS if it possesses the token and continues to hold the token until its CS execution is over; then it passes the token to the next site
Lamport’s Distributed Mutual Exclusion Algorithm
• This algorithm is based on the total ordering using Lamport’s clocks
  – Each process keeps a Lamport logical clock
    • Each process is associated with a unique id that can be used to break ties
  – In the algorithm, each process keeps a queue, request_queuei, which contains mutual exclusion requests ordered by their timestamp and associated id
  – The request set Ri of each process consists of all the processes
  – The communication channel is assumed to be FIFO
Lamport’s Distributed Mutual Exclusion Algorithm – cont.
Lamport’s Distributed Mutual Exclusion Algorithm – cont.
Ricart-Agrawala Algorithm
A Simple Token Ring Algorithm
• When the ring is initialized, one process is given the token
• The token circulates around the ring
  – It is passed from k to k+1 (modulo the ring size)
  – When a process acquires the token from its neighbor, it checks to see if it is waiting to enter its critical section
    • If so, it enters its CS
      – When exiting from its CS, it passes the token to the next process
    • Otherwise, it passes the token to the next process
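The circulation of the token can be simulated in a few lines. This is a single-threaded sketch under stated assumptions: the function `token_ring` and its parameters are hypothetical, processes are numbered 0 to n-1, and a process enters and leaves its CS within the step in which it holds the token.

```python
def token_ring(n, wants_cs, start=0, rounds=1):
    """Return the order in which waiting processes enter the CS.

    n        -- ring size
    wants_cs -- ids of processes waiting to enter their critical section
    start    -- process that holds the token initially
    rounds   -- how many full trips around the ring to simulate
    """
    waiting = set(wants_cs)
    entered = []
    holder = start
    for _ in range(n * rounds):
        if holder in waiting:
            entered.append(holder)   # enter, then exit, the CS
            waiting.discard(holder)
        holder = (holder + 1) % n    # pass the token from k to k+1 (mod n)
    return entered
```

With five processes, the token starting at process 2, and processes 1 and 3 waiting, the token reaches 3 first and 1 last, so the entry order is `[3, 1]`. Entry order follows ring position rather than request time.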
Suzuki-Kasami’s Algorithm
• Data structures
  – Each site maintains a vector consisting of the largest sequence number received so far from other sites
  – The token consists of a queue of requesting sites and an array of integers, consisting of the sequence number of the request that a site executed most recently
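These data structures can be written down directly. The sketch below follows the usual textbook formulation of the Suzuki-Kasami algorithm, which the slides do not spell out, so the field names `RN` and `LN`, the classes `Token` and `Site`, and the release logic should be read as assumptions; message passing and token transfer are left to the caller.

```python
from collections import deque


class Token:
    def __init__(self, n: int):
        self.queue = deque()  # requesting sites waiting for the token
        self.LN = [0] * n     # LN[j]: sequence number of site j's latest executed request


class Site:
    def __init__(self, i: int, n: int):
        self.i = i
        self.RN = [0] * n     # RN[j]: largest sequence number received from site j
        self.token = None     # set only at the site that holds the token

    def make_request(self):
        """Start a new request; returns the REQUEST(i, sn) to broadcast."""
        self.RN[self.i] += 1
        return (self.i, self.RN[self.i])

    def on_request(self, j: int, sn: int):
        """Record a REQUEST(j, sn); taking the max ignores outdated requests."""
        self.RN[j] = max(self.RN[j], sn)

    def release(self):
        """After leaving the CS: update the token, return the next holder or None."""
        tok = self.token
        tok.LN[self.i] = self.RN[self.i]
        for j, rn in enumerate(self.RN):
            # Site j has an outstanding request if RN[j] == LN[j] + 1.
            if j != self.i and j not in tok.queue and rn == tok.LN[j] + 1:
                tok.queue.append(j)
        return tok.queue.popleft() if tok.queue else None
```

When site 0 holds the token, has executed its own first request, and has heard `REQUEST(1, 1)` from site 1, `release()` records `LN[0] = 1` and names site 1 as the next token holder.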
Suzuki-Kasami’s Algorithm – cont.
Distributed Deadlock Detection
• In distributed systems, the system state can be represented by a wait-for graph (WFG)
  – In a WFG, nodes are processes and there is a directed edge from node P1 to node P2 if P1 is blocked and is waiting for P2 to release some resource
  – The system is deadlocked if there is a directed cycle or knot in its WFG
  – The problem is how to maintain the WFG and detect a cycle/knot in the graph
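Once the whole WFG is available at one place, cycle detection is an ordinary depth-first search. The sketch below assumes the graph is given as a dictionary from each process to the processes it waits for; the function name `has_cycle` is illustrative, and knot detection is not covered.

```python
def has_cycle(wfg) -> bool:
    """Return True if the wait-for graph contains a directed cycle.

    wfg -- dict mapping a process to an iterable of processes it waits for
    """
    WHITE, GRAY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    color = {}

    def visit(p) -> bool:
        color[p] = GRAY
        for q in wfg.get(p, ()):
            state = color.get(q, WHITE)
            if state == GRAY:      # back edge: q is already on the current path
                return True
            if state == WHITE and visit(q):
                return True
        color[p] = BLACK
        return False

    return any(color.get(p, WHITE) == WHITE and visit(p) for p in list(wfg))
```

A graph where P1 waits for P2, P2 for P3, and P3 for P1 is reported as deadlocked. The hard part in a distributed system is that no single site holds this dictionary, so the detection algorithms differ mainly in how the graph information is collected or propagated.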
Distributed Deadlock Detection – cont.
• Centralized detection algorithms
• Distributed deadlock algorithms
  – Path-pushing
  – Edge-chasing
  – Diffusion computation
  – Global state detection
  – You need to know the basic ideas but not the details about those algorithms
Agreement Protocols
• In distributed systems, sites are often required to reach mutual agreement
  – In distributed database systems, data managers must agree on whether to commit or to abort a transaction
  – Reaching an agreement requires that the sites have knowledge about values at other sites
• Agreement when the system is free from failures
• Agreement when the system is prone to failure
Agreement Problems
• There are three well-known agreement problems
  – Byzantine agreement problem
  – Consensus problem
  – Interactive consistency problem
Lamport-Shostak-Pease Algorithm
Lamport-Shostak-Pease Algorithm – cont.