message passing

60
Distributed Systems Message Passing Prof. Aniruddha S Rumale Assistant Professor, Comp. Engg. Dept.

Upload: aniruddha

Post on 18-Nov-2014

10 views

Category:

Documents


4 download

DESCRIPTION

this presentation is based on distributed systems syllabus of pune universityI tried to explain the message passing

TRANSCRIPT

Page 1: Message Passing

Distributed Systems

Message Passing

Prof. Aniruddha S RumaleAssistant Professor, Comp. Engg. Dept.

Page 2: Message Passing

Introduction•Process means program in execution.

•Two comps in DS are communicating means two processes communicating with each other.

•Process communication is necessary to achieve some common goal.

•DS needs to provide InterProcess Communication ( IPC)

Page 3: Message Passing

IPCIPC requires information sharing among two or more processes.

1) Original sharing or shared data approach

information to be shared is stored in commonmemory area

2) Copy Sharing or message passing

information to be shared is physically copied fromsender process’s address space to address spacesof all receiver processes in form of message.

Page 4: Message Passing

P1 P2 P1 P2Shared common memory

Shared data approach Message-passing approach

•Generally Computers in N/W do not share memory. So IPCin DS uses Message-passing over shared data

Approach.

•Message-passing system provides message based IPCprotocols by shielding the complex N/W protocolsand also by shielding the multiple heterogeneousplatforms from user.

Page 5: Message Passing

Features of Message-passing system

1. Simplicity should be simple and easy to use.

It must be straightforward to construct new applications and communicate with existingones using systems primitives.

2. Uniform Semanticssemantics of remote

communications( communicating processes are on different nodes) should be as close as possible to local communications( communicating processes are on same node).

Page 6: Message Passing

Features of Message-passing system …continue

3.Efficiencyan IPC protocol of a message-passing system can be

made efficient by reducing the number of message exchanges, as far as practicable, during thecommunication process.

Avoiding the cost of establishing and terminating connections between same pair of processes eachand every message exchange between them.

Minimizing the cost of maintaining the connections.

Piggybacking the acknowledgement of previous messages with the next message during a connection involving several message exchanges between sender and receiver.

Page 7: Message Passing

Features of Message-passing system …continue

4. ReliabiltyDS is prone to catastrophic events.( node crash,communication link (s) failures etc…)

A reliable IPC protocol can cope with failure problemsand guarantees the delivery of message.

Handling of lost messages involves ACKs andretransmissions on the basis of timeout.

Duplicate messages may be sent in the event offailures or timeouts. A reliable IPC protocol must becapable of detecting and handling of duplicates. Itinvolves generating and assigning sequence numbers

to messages.

Page 8: Message Passing

Features of Message-passing system …continue

5.CorrectnessCorrectness is a feature related to IPC protocols forgroup communication.

Atomicity : message sent to a group of receivers willbe delivered to either all of them or none of them.

Ordered Delivery: messages arrive at all receivers inorder acceptable to an application.

Survivability: guarantees that messages will be delivered correctly despite partial failures ofprocesses, machines, or communication links.

Page 9: Message Passing

Features of Message-passing system …continue

6.FlexibiltyNot all applications require the same degree of correctness and reliability of IPC protocols. Thus , the IPC protocol of message passing system must be flexible enough to cater to the various needs of different applications.

IPC primitives must also have the flexibility to permitany kind of control flow between the cooperatingprocesses, including synchronous and asynchronoussend/receive.

Page 10: Message Passing

Features of Message-passing system …continue

7. SecurityA good message-passing system must be capable of

providing a secure end-to-end communication.

A message in transit on the network should not beaccessible to any user other than those to whom it is

addressed and the sender. This involves

a) authentication of receiver(s) of a message by senderb) authentication of sender of message by receiver(s)c) encryption of message before sending it over N/W.

Page 11: Message Passing

Features of Message-passing system …continue

8. Portabilitythe message-passing system should itself be

portable.

the applications written by using the primitives of IPCprotocols of message-passing system should beportable.

Page 12: Message Passing

Issues in IPC by message-passing

Message is a block of information formatted by a sending process in such a manner that it is meaningful to the receiving process.

ActualData orPointerTo data

SequenceNumber

Or Message

ID

Structural information Addresses

NumberOf bytes/elements

Type

SendingProcessAddress

ReceivingProcessAddress

A typical message structure

Page 13: Message Passing

Issues in IPC by message-passing…continued

• who is sender and who is receiver?

•How many receivers? One or many?

•Is the message guaranteed to have been accepted by its receiver(s)?

•Does the sender need to wait for reply?

•How to handle catastrophic events ( node crash, link(s) failure(s), etc…) occurring during communication?

•If receiver is not ready; what to do with message? Discard or store in buffer? What to do if buffer is full?

•Can receiver choose order of acceptance to serve outstanding messages?

Page 14: Message Passing

SynchronizationCentral issue in communication structure is synchronization imposed on the communicating processes by the communication primitives.

Two semantics Blocking and non-blocking can be used.

A primitive is said to have non-blocking semantics if its invocation does not block the execution of its invoker.

If execution of invoker is blocked, it is blocking semantics.

These semantics are primarily used for send and receive primitives.

Incase of blocking send primitive, sending process is blocked after execution of send until it receives an ACK from receiver that the message is received.

Incase of non-blocking send, process proceeds with its execution as soon as the message get copied to buffer.( transferred if Null-buffer is used)

Page 15: Message Passing

Synchronization…continued

Incase of block-receive, receiving process is blocked until it receives a message (ACK).

Incase of non-blocking receive, process proceeds with its execution as soon as receive primitive is executed.

How non-blocking receive knows that message is arrived in buffer?

1. Polling : a test primitive is provided to allow receiver to check the buffer status. A periodic execution of test is carried out called as polling.

2. Interrupt : when buffer get filled and becomes ready to be used by receiving process a software interrupt notifies this to receiver. Saves repeated unsuccessful check of polling.

Page 16: Message Passing

Synchronization…continued

A variant to non-blocking receive primitive is conditional receive primitive. This returns control to invoking process immediately, either with a message or with an indicator of no-message.

In blocking-send primitive, sending process could get blocked forever if receiver crashes or if message loss due to other reasons. To avoid this blocking primitives uses time-out value ( time-stamp, waiting-time) specifying interval of time after which the operation of blocking-primitive ( blocking-send) is terminated with an error status.

Time-out value is either default ( system calculated ) or user defined ( human-time) with respect to communication criteria.

Page 17: Message Passing

Synchronization…continued

When both send and receive primitives of a communication between two processes use blocking semantics, the communication is said to be synchronous; otherwise it is said to be asynchronous.

Synchronous communication comparatively easy to implement than asynchronous communication. Synchronous mode of communication with both

send and receive having blocking semantics

Sender Receiver

Send (message)Execution suspended

message

Receive (message)Execution suspended

Execution Resumed

Execution Resumed

ACKSend (ACK)

BlockedState

Executing State

Page 18: Message Passing

Synchronization…continued

Synchronous communication is more reliable.

If message get lost or is undelivered, no backward error recovery is necessary.

It limits concurrency

It is subject to communication deadlocks.

Less flexible than asynchronous communication.

Requires unnecessary waits for ACK.

And it is slower than asynchronous communication.

Page 19: Message Passing

Buffering In communication, in some cases receiving process may not be ready to receive a message.Such messages need to be stored somewhere, usually in the buffer of receiver, for later reception and processing.

Null Buffering ( No Buffering) [ Synchronous]

No temporary storage at receiver to store the message.

The message remains in the sender process’s address space and execution of send is delayed until the receiver executes the corresponding receive.

Or message is simply discarded and time-out mechanism is used to resend the message after a time-out period.

SendingProcess

ReceivingProcess

Message

ACK

Null Buffering

Page 20: Message Passing

Buffering… continued

Single-Message Buffer [ Synchronous]

Null buffer is not suitable for communication in DS if receiver is not ready, it may require more than two repeated message transfer of same message. Also receiver need to wait for time taken to transfer the message across the N/W.

To avoid this a buffer with a capacity to store one message is used at receivers end. This is because in synchronous mode an application module may have at most one message outstanding at a time.

The message buffer may either be located in the kernel’s address space or in the receiver process’s address space. Logical path of message transfer involves two copy operations.

Receiver

Sender

Buffer to StoreOne

Message

Page 21: Message Passing

Buffering… continued

Unbounded capacity buffer [ asynchronous]

In asynchronous communication sender never wait for receiver to be ready;

causing many pending messages that yet not have been accepted by receiver;

and thus requires unbounded capacity buffer to store all unreceived messages.

Unbounded capacity of buffer is practically impossible. So in practice asynchronous communication uses finite bound buffers. Receiver

Sender

Buffer to StoreMany

Messages

Page 22: Message Passing

Buffering… continued

Finite bound or Multiple-message buffer [asynchronous]

1. Unsuccessful communication :

message transfers simply fail whenever there is no more buffer space.send normally returns an error message to sender : receiver buffer is full, message can’t be delivered.

makes message passing less reliable.

2. Flow-controlled communication :

sender is blocked until the receiver accepts some messages.

introduces synchronization : may results in deadlocks

Communication based on Finite bound buffer implementation is more complex to implement and use than null buffer or single message buffer.

Page 23: Message Passing

Multidatagram messagesAll N/W has upper bound on data to be transmitted at a time, known as Maximum Transfer Unit (MTU).

If sizeof ( message) > MTU

: fragmentMTU( sizeof (eachfragment)<=MTU)

: number each fragment serially

: if each fragment numbered

: send them in packets ( datagram)

Packet = control information + message data.

If MTU> message data to be send : single-datagram message, else multidatagram messages.

Page 24: Message Passing

Multidatagram messages… continued

At receiver side check packets for sequence number

If packet is numbered store it in buffer

Receive all packets with sequence numbers based oncommon control information.

report any error to sender for retransmission ofmissing packet

arrange packets in order

accepts packets in order

reassemble packets to form complete message

acknowledge the sender

Page 25: Message Passing

Encoding and DecodingA message data should be meaningful to the receiving process.

This needs preservation of program objects while transmission from senders address space to receivers address space.

It is not easy to achieve this on homogeneous systems and it is impossible to achieve this on heterogeneous systems.

An absolute pointer value loses its meaning when transferred from one process address space to another.

Different program objects occupy varying amount of storage space.

And to make a message meaningful to the receiver, there must be some way receiver to identify which program object is stored where in the message buffer and how much space each program object occupies.

Page 26: Message Passing

Encoding and Decoding…continued

Due to problems in transferring program objects in their original form, they get converted to a more suitable stream form for transmission.

Conversion process from original to stream form taking place at sender side is known as encoding.

Conversion of stream form of program objects in to their original form at receiver side prior to their use is known as Decoding.

There are two types of representation used for encoding and decoding. Tagged and untagged representation.

Page 27: Message Passing

Encoding and Decoding…continued

Tagged representation :

type of each program object along with its value is encoded in the message.

Receiving process checks the type of each program object in message due to self-describing nature of coded data.

More expensive than untagged representation.

Untagged representation :

message data only contains program objects. No information about type of any program object is given.

Receiver must have prior knowledge to decode the received data as coded data format is not self-describing.

Page 28: Message Passing

Process Addressing

Addressing ( Naming ) of parties involved in communication is an important issue in message based communication.

Explicit addressing :

the process with which communication is desired is explicitly named as a parameter in the communication primitive used.

Eg . Send ( process_id, message)

Receive( process_id, message)

In above we are sending/receiving message to/from a process identified using process_id

Page 29: Message Passing

Process Addressing…continued

Implicit Addressing :

a process willing to communicate does not explicitly name a process for communication in communication primitive used.

Useful in client server communication when client is concerned with service and not the server from set of server-farm, who is going to serve its purpose.

This is also known as functional addressing as address used is of service and not of process.

Eg. Send_any( service_id, message) send message to any process which can provide the service identified by service_id.

Receive_any(process_id, message) Receive message from any process and return the process_id on reception of message.

Page 30: Message Passing

Failure handling

Sendrequest

sender sendersender

ReceiverReceiver

Receiver

lost

lost

Crash

Sendrequest

Sendrequestreceived

message

Sendresponse

messagemessage

message

restarted

Request message is Lost either because

Of link failure or Because receiver

Node may be downAt time of request

For communication

Response loss either dueTo down sender node or

Failed link of comunication

Receiver’s node Crashed before

Receiving requestFor communication

DS is prone to following failures.

Page 31: Message Passing

Failure handling…continued

To cope with these problems, IPC protocols are designed based on the idea of internal retransmissions of messages after time-outs. And prompt return of ACK messages from receiver.

Four-message IPC protocol for client-server:

1. Client sends request message to server.

2. On reception of request server sends ACK to client.

3. On processing of client’s request server sends Reply containing results of processing to client.

4. On reception of reply client sends ACK to server

Client Server

Request

ACK

Reply

ACK

Blocked state

Executing state

Page 32: Message Passing

Failure handling…continued

Three message reliable IPC protocol for client server :

1. Client sends a request message to server

2. On reception server processes the request and prepares a reply and sends it to client; meanwhile client remain blocked.

3. On reception of reply from server client resumes its execution and sends a ACK to server. Server remain blocked until the ACK from client.

Client Server

Request

Reply

ACK

Blocked state

Executing state

Page 33: Message Passing

Failure handling…continued

Two message IPC protocol for client server communication :

1. Client sends a request message to server and enters in block state for time=time-out.

2. On reception of request from client server processes it and prepares a reply and sends it to client.

3. Servers kernel waits for ACK from clients kernel for time=time-out; In absence server retransmits the reply to client.

Client Server

Request

Reply

Blocked state

Executing state

Page 34: Message Passing

Failure handling…continued

Time-out

Time-out

client server

crash

lost

lost

Time-out

Send request

Send request

Send request

Send request

Request message

Retransmit request message

Retransmit request message

Retransmit request message

ACK

ACK

Unsuccessful Requestexecution

Successful request

ExecutionsMay produce

Different results

Send response

Send response

Example of fault tolerant communication between client-server

Page 35: Message Passing

Idempotency & handling duplicate request messages Idempotency means repeatabilty.

Idempotent operation produces same results without any side effects no matter how many times it is performed with the same arguments. Eg. sqrt(64)=8 for any repeated execution.

Nonidempotent operations do not necessarily produce the same results when executed repeatedly with the same arguments. Eg. See the following code

Debit(amount) { if ((balance-amount) >=( minbalance))

{ balance=balance-amount; return(“success”,balance); }

else return(“failure”,balance); }

// produces different results for amount=100 for every operation

Page 36: Message Passing

clientServer

Balance=1000

crash

lost

lost

Time-out

Send request

Send request

Send request

Send request

Debit(100)

Retransmit Debit(100)

Retransmit Debit(100)

Retransmit Debit(100)

ACK( Success,800)

Example of nonidempotent operation without any measures for fault detection

Balance=1000-100=900

Balance=900-100=800

Balance=800-100=700ACK( Success,700)

Received Balance =700Desired=900

Minbalance=200

Nonidempotent operation

Page 37: Message Passing

Idempotency & handling duplicate request messages…continued

Problem of nonidempotency can be solved using by avoiding orphan executions ( executions of client request done at server side, results of which won’t reach to client and so may client keep retransmitting the same request; yielding in wrong result(s) ) of requests from client.

This can be achieved by using exactly-once semantics, which ensures that only one execution of server’s operation is performed for one request.

Requires identification of orphan executions.

Primitives based on exactly-once semantics are most desired but difficult to implement.

Page 38: Message Passing

Exactly-once semanticsuses unique identifier for every request that a client makes.

Sets up a reply cache in the kernel’s address space on the server machine to cache replies.

Before forwarding a request to server, kernel of server machine checks to see if a reply already exists in reply cache or not.

If yes, that means request is duplicate and already executed. So previously computed result is extracted from reply cache and new response is send to client.

If no, request is forwarded to appropriate server by kernel.

RequestID

Execution Status

Executed/

Notreceived

Resultobtained

Reply-cache contents of Exactly-once semantics

Page 39: Message Passing

clientServer

Balance=1000

crash

lost

lost

Time-out

Send request-1

Send request-1

Send request-1

Send request-1

Debit(100)

Retransmit Debit(100)

Retransmit Debit(100)

Retransmit Debit(100)

ACK( Success,900)

Example of exactly-once operation

Reqest-1Balance=1000-100

=900

ACK( Success,900)

Received Balance =900Desired=900

Minbalance=200

Exactly-once operation

Reqest-1 already executed

Balance=900

Reqest-1 already executed

Balance=900

Page 40: Message Passing

Lost and out of sequence packetsKeeping track of lost and out of sequence packets is a issue in multidatagram messages.

In multidatagram message transmission is said to be complete iff all packets are received by a process to which it is sent.

Simple way is acknowledge each packet separately. ( stop-and-wait protocol) . This leads to communication overhead.

Better approach is sending one acknowledgement for complete multidatagram message when all packets get received at receiver end.( blast protocol)

Page 41: Message Passing

Lost and out of sequence packets…continued

In blast protocol a node crash or a link failure may lead to following problems:

one or more packets of multidatagram message arelost in communication

the packets are received out of sequence by thereceiver.

Efficient mechanism is to use a bitmap to identify the packets of a message. In this approach header part of each packet consists of two extra fields, one of which specifies the total number of packets in multidatagram message and other is the bitmap field that specifies the position of this packect in the complete message.

Senderaddress

Receiveraddress

Type of message

MessageID

Rest ofmessage

Packet Sequence noOr bitmap

No ofpackets

Page 42: Message Passing

Lost and out of sequence packets…continued

In multidatagram message a suitable buffer is set aside by receiver using No_of_packets field in first packet.

Bitmap field gives information where exactly a received packet must be stored in set aside buffer for the particular message.

selective repeat :

After time-out, if all packets are not received, Bitmap ids of non-received packets are communicated with the sender.

On receiving this information sender sends only those packets that have not been received by receiver.

The process get repeated until transmission of multidatagram message won’t get completed. i.e. when all packets of message get received by receiver this retransmission of select packets stops.

Page 43: Message Passing

Sender of multidatagram Message that consists of Five packets

Receiver of multidatagram Message

1

2

3

4

5

Buffer For 5

Packets

Send request message

lost

lost

Place this packetIn position 3

Place this packetIn position 5

Place this packetIn position 4

Send ACK

Time-out

Packets ofThe

ResponseMessage

Resend missingpackets

(M1,5,P1)

(M1,5,P3)

(M1,5,P4)

(M1,5,P5)

(M1,5,P1)

(M1,5,P4)

(M1,5,P2)

ACK

Missing packets info.

M1- Messeage ID=15=packets in M1

P1,P2…= Ith packet

Use of bitmap to keep track of lost and out of sequence packets in multidatagram message transmission

Retransmit requestFor missing packets

Create buffer for fivePackets and store this

Packet in position 2

Place this packetIn position 1

Page 44: Message Passing

Group communicationElementary form of communication is one-to-one or unicast

communication.

DS require group communication (in addition to unicast) those are

1. One to many ( single sender and multiple receiver )multicast ( no of receivers are predefined and known) broadcast( no of receivers are unknown andundefined)

2. Many to one ( multiple sender and single receiver)

3. Many to many ( multiple sender and multiple receiver)

Page 45: Message Passing

Group ManagementIn one to many communication, Receiver processes of a message form a group; closed and open.

A closed group is one in which only the members of the group can send a message to the group.

In close group, an outside process cannot send a message to group as a whole, although it may send a message to an individual member of group.

Open group is one in which any process in system can send message to group as a whole.

Usage of close/open group is application specific and any flexible message passing system must support both types of groups.

Facility of Dynamic creation and deletion of groups is must. And a process must be allowed to enter or leave the group at any time.

Page 46: Message Passing

Group Management

Simple mechanism for this is to use centralized group server to manage groups and their membership information.

Centralized server approach suffers from the problems of poor reliability and poor scalability common to all centralized systems.

Replication of group servers adds communication overhead in keeping group information of all group servers consistent.

Page 47: Message Passing

Group addressingTwo level naming scheme is normally used.

High level group name is in ASCII string that is independent of the location information of the processes in group.

Low level group name depends to a large extent on underlying hardware.

Special N/W address to which multiple machines can listen is called as multicast address, possible on some N/Ws. Packet sent to multicast address is delivered to the machines linked to multicast address.

N/Ws, which can not create multicast address may have broadcasting facility by declaring a particular address such as zero as broadcast address. Packet sent to broadcast address is delivered to all machines in entire N/W.

Page 48: Message Passing

Message delivery to receiver processesUser uses high level group names in programs.

Centralized group server maintains mapping between high and low level group addresses ( names) along with the process identifiers of all processes for each group.

Kernels of sender, receiver , and group server does appropriate mapping and unmapping operations with rest of other operations like encoding/decoding to deliver message to correctly to receiver.

Sender is not at all aware of either size of group or actual mechanism used for group addressing.

Sender simply sends the message to a group by specifying its high level name, and the OS takes the responsibility to deliver the message to all the group members.

Page 49: Message Passing

Buffered and unbuffered multicastMulticasting is asynchronous communication mechanism due to following

reasons.

1. It is unrealistic to expect a sending process to wait until all the receiving processes that belong to multicast group are ready to receive the multicast message.

2. Sending process may not be aware of all receiving processes that belong to the multicast group.

For an unbuffered multicast, the message is not buffered for the receiving process and is lost if receiving process is not in a state ready to receive it. So the message is received only by those processes of multicast group that are ready to receive it.

For a buffered multicast, the message is buffered for the receiving processes, so each process of multicast group eventually receive the message.

Same is true for broadcasting communication.

Page 50: Message Passing

Send to all and bulletin-board semantics• Send-to-all semantics : a copy of message is sent to each process of the

multicast group and message is buffered until it is accepted by the process.

Following two factors are ignored by send-to-all semantics.

Relevance of a message to a particular receiver may depend on the receiver’s state.

Message not accepted within a certain time after transmission may no longer be useful; their value may depend on sender’s state.

• Bulletin-board semantics : a message to be multicast is addressed to a channel instead of being sent to every individual process of multicast group.

Receiving process copies message from channel instead of removing it when it makes a receive request on the channel.

Process that have receive access right on the channel constitute the multicast group.Thus channel acts as a bulletin-board.

Page 51: Message Passing

Flexible reliability in multicast communication1. The 0-reliability : no response is expected by sender from

any of the receiver. Useful in asynchronous multicast.

2. The 1-reliability : sender expects the reply from any of receivers.

3. m-out-of-n-reliable : the multicast group consists of n receivers and sender expects a response from m ( 1<m<n) of the n receivers.

Useful for consistency control of replicated information with m=n/2.

4. All-reliable: sender expects a response message from all the receivers of the multicast group.

Page 52: Message Passing
Page 53: Message Passing
Page 54: Message Passing
Page 55: Message Passing
Page 56: Message Passing
Page 57: Message Passing
Page 58: Message Passing
Page 59: Message Passing
Page 60: Message Passing