application-layer group communication server for extending

26
Application-Layer Group Communication Server for Extending Reliable Multicast Protocols Services Ehab Al-Shaer Hussein Abdel-Wahab Kurt Maly Department of Computer Science Old Dominion University Norfolk, VA, 23508 (ehab,wahab,maly)@cs.odu.edu A short version of this paper appeared in the IEEE International Conference on Network Protocols, held in Atlanta, GA, USA, Oct. 1997. Abstract Reliable multicast protocols are becoming an essential element in distributed applications such as interac- tive distance learning applications. However, the existing implementations of reliable multicast protocols are not sufficient to satisfy the group communications requirements of some distributed applications. Our experience of using number of multicast protocols in IRI distance learning applications shows the ne- cessity of providing an application-layer Reliable Multicast Server (RMS) to extend the primitive group communication services provided by such multicast protocols. Examples of such services include handling heterogeneous environments (such as LAN vs. WAN networks and different reliable multicast protocols), automatic fault recovery and simple application interface. In this paper, we motivate and describe the design and the implementation of RMS architecture which has been used in IRI learning sessions for two semesters. We also show how RMS improves the reliability, performance and flexibility of IRI sessions via supporting extended group communication services. KEYWORDS: Reliable Multicast Transport Protocols; Fault Recovery; Distributed Applications. This work is supported in part by National Science Foundation, Sun Microsystems and COX Fibernet.

Upload: duongdan

Post on 30-Dec-2016

225 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: Application-Layer Group Communication Server for Extending

Application-Layer Group Communication Server

for Extending Reliable Multicast Protocols Services�Ehab Al-Shaer Hussein Abdel-Wahab Kurt Maly

Department of Computer ScienceOld Dominion University

Norfolk, VA, 23508(ehab,wahab,maly)@cs.odu.edu

A short version of this paper appeared in the IEEE International Conference on Network Protocols, held in

Atlanta, GA, USA, Oct. 1997.

Abstract

Reliable multicast protocols are becoming an essential element in distributed applications such as interac-

tive distance learning applications. However, the existing implementations of reliable multicast protocols

are not sufficient to satisfy the group communications requirements of some distributed applications. Our

experience of using number of multicast protocols in IRI distance learning applications shows the ne-

cessity of providing an application-layer Reliable Multicast Server (RMS) to extend the primitive group

communication services provided by such multicast protocols. Examples of such services include handling

heterogeneous environments (such as LAN vs. WAN networks and different reliable multicast protocols),

automatic fault recovery and simple application interface. In this paper, we motivate and describe the

design and the implementation of RMS architecture which hasbeen used in IRI learning sessions for two

semesters. We also show how RMS improves the reliability, performance and flexibility of IRI sessions via

supporting extended group communication services.

KEYWORDS: Reliable Multicast Transport Protocols; Fault Recovery; Distributed Applications.�This work is supported in part by National Science Foundation, Sun Microsystems and COX Fibernet.

Page 2: Application-Layer Group Communication Server for Extending

1 Introduction

With the recent advances in network and computing technology, the demand of group-ware distributed

applications is growing rapidly. Examples of such application include distance learning[18], group-ware

editing[13], video conferencing[17, 19], file distribution [9] and distributed interactive simulations [12].

Distributed applications require a reliable group communications to send/receive messages to/from a group

of members. Thus, areliable multicast protocolis necessary (1) to eliminate the unnecessary network

traffic and processing overhead caused by unicast or broadcast communications, and (2) to guarantee a

reliable delivery of messages to members in the group.

Many protocols were proposed to support reliable group communication [4, 6, 11, 14, 15, 20, 22, 23, 24]

(see URL http://www.tascnets.com/mist/doc/ mcpCompare.html) over native IP multicasting described

in [10]. However, few of them were implemented and used in real-world applications. Examples of re-

liable multicast protocols that were implemented and made available in public domain (for academic use)

include Reliable Multicast Protocols version 1.3b(RMP) [23], Generic Scalable Reliable Multicast Proto-

col (GSRM) [11], Reliable Multicast Transport Protocol (RMTP) [20] and Single Connection Emulation

(SCE) [22].

In addition to designing, simulating, verifying and validating reliable multicast protocols, implementing,

integrating and using these protocols in real applicationsoffer a valuable experience to the research com-

munity as well as the end users. We use reliable multicast protocols in the Interactive Remote Instruction

(IRI) [18] system which is a collaborative distributed multimedia system for large-scale distance learning.

Reliable multicasting is essential for distributed applications such as IRI. IRI has been used in teaching

Computer Science courses for two semesters. Our experiencein integrating, tuning and using the existing

reliable multicast protocols in IRI system at different environments shows that most of these implementa-

tions are not yet mature enough to provide sufficient group communication services required by distributed

applications such as IRI. In the following we highlight someof the practical problems that arise when using

many of the existing reliable multicast protocols:� The existence of bugs and failures in the current implementations may result in ceasing the session

or terminating the clients (i.e. the processes using multicast service). We experienced this problem

through our extensive use of reliable multicasting in IRI sessions. Thus, it is required to protect the

application from communications failures and vice versa.� The performance of the existing protocols can be dramatically changed in different domain environ-

ments. For example, some protocols show better performancein LAN than WAN or the Internet

environment. On the other hand, some other protocols may notperform as well as the previous ones

in the LAN but they tend to be more efficient in WAN environment. However, some distributed appli-

cations exist in heterogeneous environment (e.g. operatesin LAN and WAN). Therefore, combining

features of different protocols is important to increase the overall efficiency of group communica-

2

Page 3: Application-Layer Group Communication Server for Extending

tion. This may require integrating different protocols into one framework such as RMS which also

contributes to multicast protocol heterogeneity.� Clients may require other group communication service which are not supported by the protocol

design such asselective re-transmissionin which messages are sent semi-reliably [7, 14] anddynamic

group maskingwhere sender can block some clients in the group from receiving certain streams of

messages.� Some protocols do not support a transparent multi-session group communication service where the

client can join multiple groups simultaneously. For example, in RMP version 1.3b, the client must

maintain a protocol object and event loop per joined group which is not transparent in the user

prospective. Thus, it is important to support a transparentmulti-session multicasting service such

that the clients can join multiple groups simultaneously without managing such groups explicitly.� Exiting protocols can not discriminate between multiple active sessions based on their priority.� Some protocols offer animperativeApplication Programming Interface which involves the applica-

tion clients in low level details of protocol-specific operations. This, consequently, may distort the

client from its own routine and reduce the usability of such protocols. Therefore, it is important to

provide a high-level simple interface to the applications to request and control group communication

services.� Clients in the distributed applications may request some information about the group communication

such as the member list, the status of multicast group, configuration and statistical protocol infor-

mation. Some of these group information requests are not supported directly by reliable multicast

protocols. A unified interface to support such extended group requests is significant for group man-

agement and diagnoses purposes.

For these reasons, , providing an application-layerReliable Multicast Server (RMS)which acts as a service

access point between the clients and the reliable multicastprotocols is necessary to enhance and extend

the group communication services of multicast protocols. In this paper, we will motivate and describe

the design and the implementation of the RMS which was developed to support the reliable multicast

communication in IRI system. We also describe the RMS extended group communication services and

their impact in improving the reliability, performance andflexibility of the group communication in IRI

environment.

This paper is organized as follows: Section 2 describes IRI system and its software architecture; Section 3

discusses the design and the implementation of RMS in IRI system; Section 4 explains the extended group

communication services supported by RMS and their contribution to improve the reliability, the perfor-

mance and the flexibility of multicast sessions; and Section5 presents the concluding remarks.

2 Overview of IRI System

3

Page 4: Application-Layer Group Communication Server for Extending

Kernel Unix Socket Communiucation

Process

Reliable Multicast Server (RMS)

STTSE

Sess

Network Backbone

I P

U D P

CMTPT

Figure 1: IRI Software Architecture

The RMS was developed as part of IRI software architecture toprovide reliable multicasting for IRI tools.

However, RMS design and services are general and can be used by any distributed applications. This is

a fundamental feature in RMS design as will be shown later in this paper. In this section, we give a brief

description of IRI system which is useful for our discussionthroughout this paper.

The Interactive Remote Instruction (IRI1) is a collaborative distributed multimedia system that hasbeen

developed at Old Dominion University to support distance learning [18]. A typical IRI session may include

a teacher and tens to hundreds of students located in different classrooms and they all can participate in one

single “virtual classroom” facilitated by IRI system. Eachstudent, in the virtual classroom, is assigned a

multimedia workstation2 that runs IRI software which consists of 9 processes that communicate with each

other through UNIX Socket messages [18] (See Figure 1). Similarly, IRI applications in different machines

communicate with each other through multicast messages.

The major software components of IRI are Session Control (Sess) and Reliable Multicast Server (RMS)

which are used for resource management and group communication, respectively. IRI provides full interac-

tion via its Audio, Video, Presentation Tool (PT) and Tool Sharing Engine (TSE) [1] components. The PT is

used for slide presentation and TSE is used to share X applications (e.g. Netscape) in the virtual classroom.1http://www.cs.odu.edu/˜tele/iri2Sun Sparc station with audio and video capabilities.

4

Page 5: Application-Layer Group Communication Server for Extending

IRI also provides Survey Tool (ST) for collecting feedback information from students and Class Manage-

ment Tool (CMT) to diagnose the components status. The majorcomponents in IRI, which use RMS, are

Sess, TSE, PT, ST and CMT (see Figure 4). Each component that requires a multicast service joins its own

private multicast group. For example, PT tools in IRI virtual classroom join PTGrpX.cs.odu.edu whereXa unique identifier in IRI session [18].

3 Design and Implementation of the Reliable Multicast Server

The Reliable Multicast Server (RMS) is designed as an application-layer process that acts as an interface

server between the multicast clients and the reliable multicast protocol itself. Therefore, RMS has a well-

defined simple interface to request multicast services suchas join or to inquire information related to the

group communication such as member list or status. We will discuss this issue in more details in Section 4.

RMS communicates with multicast clients by exchanging messages via UNIX sockets (UNIX-domain or

TCP) [21]. Thus, group communication requests such as joining or leaving the group are sent to RMS

which performs these requests by invoking the proper functions specific to the underlying reliable multicast

protocols. Currently, in IRI sessions, RMS uses RMP version1.3b to provide reliable multicasting. How-

ever, any other reliable multicast protocol can be integrated with RMS and without the client knowledge.

Therefore, one of the RMS services is to provide a wrapper that simplifies and standardizes the interface

and the interaction between the clients and the reliable multicast protocols. Moreover, RMS supports many

other group communication services as well. RMS extended services are discussed in Section 4. In this

section, we describe the design and the object-oriented implementation of RMS in IRI system.

The RMS Interface: The RMS component is a service layer built over reliable multicast protocols to

alleviate the difficulty inherited by using reliable multicast protocols API directly. Thus, RMS provides

a simple interface for requesting group communication services such as join, send and leave requests.

Figure 2 shows the supported services and their corresponding interfaces. Three types of interfaces are

provided:initialization, primaryandmiscellaneousgroup communication services. The first one is used to

initialize and activate the group communication service. In particular,ConnectToLocalRMS is used to

establish the data and control channels andReceiveControlReply is used in order to receive, decode

and print the control messages sent by RMS such as status codes and group members information. The

primary interfaces are used to request the group communications services such as sending data to a group,

joining and leaving the group. TheSendConnectRequest() enables non-group member to multicast

data reliably to any group. The miscellaneous interfaces are used for acquiring group information (such

asSendMembersListRequest() to get a list of group members) and protocol information (such as

SendConfigurationRequest() to get protocols configuration) or to support the extended services

described in Section 4.

5

Page 6: Application-Layer Group Communication Server for Extending

\\

\\ APIrms.hh

\\

class APIRMS

{

public:

APIRMS();

˜APIRMS();

/* Service Initialization */

int ConnectToLocalRMS(char *CntLink , int *CntSock, char *DataLink, int *DataSock);

int ConnectToRemoteRMS(char *HostName, char *PortNumber, int *CntSock, int *DataSock);

int ReceiveControlReply(int CntSock);

/* Primary Group Communication Services */

int SendJoinRequest(int CntSock, char *grpname, char *flags);

int SendConnectRequest(int CntSock, char *grpname, char *flags);

int SendLeaveRequest(int CntSock, char *grpnam);

int SendData(int sock, char *grpname, void *Msg, int Size, int QoS);

/* Miscellaneous Group Communication Services */

int SendGroupMaskRequest(int CntSock, char *grpname, int *IDs);

int SendGroupMaskRelease(int CntSock, char *grpname, int *IDs);

int SendMembersListRequest(int CntSock, char *grpname);

int SendGrpInfoRequest(int CntSock, char *grpname);

int SendMemStatusRequest(int CntSock, char *grpname);

int SendGrpNumRequest(int CntSock, char *grpname);

int SendConfigurationRequest(int CntSock, char *grpname);

int SendStatisticsRequest(int CntSock, char *grpname);

void HelpInfo(void);

/* .... */

};

Figure 2: RMS Application Programming Interface

RMS-Client Communication Protocol: To acquire any group communication service, the client (e.g.

IRI components) has to connect to RMS through two socket connections called channels: Control Channel

(CC) and Data Channel (DC). The CC is used to send control messages such as join request or membership

status from the client to RMS. Also, the requests result codeand replies are sent from RMS to the client

over the CC. The DC is used to send and receive data to and from the multicast groups, respectively. RMS

can serve two types of clients:� Local clientswhich reside in the same host of RMS and request group communication service. This

type of clients should use UNIX Socket to communicate with RMS as described above. The UNIX-

domain sockets has two major advantages over other inter-process communications in UNIX envi-

ronment: (1) it is more efficient for inter-process communication than TCP [21], and (2) it is simpler

than shared memory which requires a mutual exclusion mechanism. Clients, in this case, invoke

ConnectToLocalRMS() to indicate UNIX-domain communication.� Remote clientswhich request a group communication from a remote host that does not have RMS.

This type of clients use two TCP connections (CC and DC, as describe before) to communicate

6

Page 7: Application-Layer Group Communication Server for Extending

with RMS. This feature (called “client-tunneling”) is necessary to accommodate clients existing in

hosts that can not run the underlying reliable multicast protocol since it was not ported to this specific

platform or the host is not multicast-enabled3 . Clients, in this case, useConnectToRemoteRMS()

to established two TCP connections with remote host that hasRMS.

For simplicity, we focus our discussion in this paper on local clients (ConnectToLocalRMS()), how-

ever, whatever is used for local clients is directly applicable on remote clients. There are at least three

reasons for establishing different channels for control and data: (1) to enable sending the control infor-

mation as out-of-band messages, (2) to enable the client andthe RMS to communicate via CC in case of

failure in the DC, and (3) to avoid encapsulating the data stream with control headers to distinguish con-

trol messages. This imposes extra processing and copying overhead per each data packet which decreases

the performance. As soon as the UNIX connection is established, the client (IRIApp) starts sending the

group communication requests. Figure 8 in the Appendix shows part of the client code (IRIApp) and its

interaction with RMS.

RMS-RMS Communication Protocol: RMS can use any reliable multicast protocol to communicate

with another RMS and thereby deliver the multicast messagesto remote clients participating in the group.

This is performed transparent to the clients and users. For example, IRI components such as PT send

and receive from their multicast groups with no concern of the underlying multicast protocols and its

specific interface. In fact, the underlying multicast protocol had been upgraded and replaced with other

protocols several times, in IRI, without notifying the clients or developers since the RMS-Client protocols

and the APIRMS are still preserved. Moreover, the clients participating in the same group can request

using different reliable multicast protocols for any reason. The RMS-RMS communication protocol operate

between heterogeneous reliable multicast protocols as long as these protocols are integrated within RMS

frameworks. In this case, RMS is acting as a gateway between different reliable multicast protocols which

is an important for providing a flexible group communicationservice. We will discuss this issue in more

details in Sections 4.2

RMS Objects Interactions: There are five objects used by RMS:� RMSis the main server (MainThread in Figure 4) which receives the client connection and allocates

a thread worker for each one. As shown in Figure 3, RMS also initializes the service which includes

creating the MgmtThread and the UNIX Socket files [21].� MgmtThreadobject is used for group management purposes as will be described later.� RMSThreadobject is used to serve the client and perform the protocol-specific operations such as

joining the group, sending and receiving messages to and from the group, respectively (see Figure 4).3can not send or receive IP multicast packets.

7

Page 8: Application-Layer Group Communication Server for Extending

//

// RMS.cc

//

main(int argc, char **argv)

{

// MainThread

RMSCommandArgs(argc, argv);

if (RMS_Init() < 0)

{ perror("RMS_Init"); exit(1);}

RMS_main_loop();

/* unreachable code */

} /* end of main */

void RMS_main_loop()

{

UnixSock *usock = new UnixSock;

/* Opening Control and Data sockets*/

if( (CntUSock=usock->OpenUnixListener(CntFile)) < 0)

{ perror("OpenUnixListener"); exit(1); }

if( (DataUSock=usock->OpenUnixListener(DataFile)) < 0)

{ perror("OpenUnixListener"); exit(1); }

/* select event loop initialization */

FD_ZERO(&afds);

FD_SET(CntUSock, &afds); /* control for UNIX-domain */

while(TRUE)

{

if (select(nfds, &rfds, (fd_set *)0, (fd_set *)0,

(struct timeval *)0) < 0)

{ perror("select"); continue; }

if (FD_ISSET(CntUSock, &rfds))

{ /* Set up thread arguments and spawn a thread */

count++;

argptr->thrid=count;

argptr->CntSock = CntUSock;

argptr->DataSock = DataUSock;

thr_create(NULL, 0, StartRMSThread, (void *) argptr,

THR_NEW_LWP, &tid);

} /* end of if */

} /* end of while */

} /* end of RMS_main_loop */

Figure 3: RMS Main Program

RMSThread may represent any reliable multicast protocols integrated in RMS. RMS can create as

many RMSThreads as required by the clients’ requests. Usually, there is one RMPThread per multi-

cast client.� APIRMSprovides the application interface to interact with RMS. APIRMS is used by the client (see

Figure 8) in order to request RMS services.� UnixSocketobject is used for UNIX-domain communications between RMS and its local clients.

Multithreaded Architecture: RMS is designed as a multithreaded application where each LWP (Light

Weight Process) or thread is completely independent from the others. This means no shared access exists

8

Page 9: Application-Layer Group Communication Server for Extending

: Data ChannelUnix Socket

: Control Channel DCCC : Control Channel: LWP : LWP

: Thread : Thread Creation: Data Channel

Unix SocketDCCC

Network Backbone

Network

Backbone

Network

Backbone

Kernel

Mgm

tThr

ead

RM

SThr

ead

RM

SThr

ead

MainThreadRMS

R M S CC DCCC DC

Client Client

Client Client

Kernel

Mgm

tThr

ead

RM

SThr

ead

RM

SThr

ead

MainThreadRMS

R M S CC DCCC DC

Kernel

Mgm

tThr

ead

RM

SThr

ead

RM

SThr

ead

MainThreadRMS

R M S CC DCCC DC

Client Client

Figure 4: Architecture of the Reliable Multicast Server

and consequently no synchronization scheme is necessary, which is important to avoid any performance

bottlenecks.

As shown in Figure 3, RMS MainThread starts by initializing the server viaRMS Init() which involves

creating the UNIX sockets files and the MgmtThread which is dedicated for group management purposes

such as collecting group information and sending control requests in the entire virtual classroom. We

will discuss the duty of MgmtThread in more details in Section 4.1. Then, RMS MainThread goes to

listen to clients’ connections through CC (CntFile) and DC (DataFile). Clients such as IRI components

that demand reliable multicasting connect to RMS through two UNIX socket connections (CC and DC), as

described before. As soon as the UNIX connection is acceptedby MainThread, RMS spawns a thread called

RMSThread to serve the corresponding requester and it then goes back to listen to new connections. The

RMSThread then establishes the control and the data connections and waits for clients requests. Figure 4

shows the design architecture of RMS in one machine. It also shows that one RMS can serve more than

one client in the same machine by allocating RMPThread per client.

The RMSThread is created upon request and lasts for the duration of the requesting component. From this

point, the multipoint communication activities between RMSThread and the IRI component go through the

established channels. Figure 3 shows the some parts of the RMS main program, RMS.cc, (declarations are

omitted). The multithreaded design of RMS has many performance and reliability advantages:

(1) It alleviates the overhead of managing multiple processes such as context switching that may degrade

9

Page 10: Application-Layer Group Communication Server for Extending

the performance,

(2) From our experience with the current version of RMP (RMP 1.3b), we found that RMP has some run-

time problems that could cease IRI session. The multithreaded design principle in RMS assists in isolating

RMP failures that may occur in an RMSThread from affecting the other RMSThread(s) and consequently

protecting their components. Even if one or more clients andtheir RMSThread have aborted or crashed,

other clients can still operate using RMS. We found this feature very useful for increasing the reliability

and availability of IRI during the class session.

4 Extended Services in Group Communication

The main objective of this work is toextendandenhancethe services of the available implementations of

reliable multicast protocols and thereby providing an efficient support for group communications. One of

the major motivations for this work is that it contributes tothe existing protocol implementations which

could be used in many other applications. Secondly, it provides important services without modifying the

existing protocol implementation. Thirdly, it requires noinvolvement from the application itself in order to

attain these services. From our experience in integrating and using many multicast protocols (RMP 1.3b,

GSRM, RMTP and SCE) in IRI system, we found it important to present an application-layer RMS to offer

extended services required by distributed applications such as IRI and not supported by the current protocols

implementations. In this section, we will describe examples of extended services supported by RMS in IRI

system. We will also illustrate the impact of these service on improving the reliability, the performance and

the flexibility of the group communications in IRI as an example of distributed applications.

4.1 Group Communication Fault Recovery

Recovery from group communications failure is necessary indistributed applications specially in distance

learning applications such as IRI [3, 5]. Faults result in isolating a client or group of clients from the rest

of the group which cause a considerable destruction to the session activity and dialog. In [3], we discussed

many sources of errors and failures that could be generated in group communications. For examples, from

our practical experience with RMP 1.3b, we found that RMP version 1.3b has some run-time problems/bugs

that could cease the group communication in IRI session. These faults are usually declared by “Failure

Detection” events sent to each or some RMSThread(s). One source of this failure is the abnormal exit of a

member while another member is still sending. In this case, providing means for efficient and rapid recovery

is important in order to resume the session and with minimal destruction. RMS supports an automatic fault

recovery mechanism in case of any identified group failure. The following the algorithm is used by RMS

to recover in case of failure in RMP 1.3b4:4We chose to write it in English to avoid using extra space for explaining the code

10

Page 11: Application-Layer Group Communication Server for Extending

1. Each RMS join a special multicast group calledManagement Group (MgmtGrp). MgmtGrp is served

by an independent RMSThread, called MgmtThread, which consequently will not be effected by

failures occur in other groups.

2. If any RMSThread receives an indication of group failure:

(a) The RMSThread deallocates the current object of reliable multicast protocol.

(b) The RMSThread re-allocates a new object of reliable multicast protocol.

(c) Then, the RMSThread informs the MgmtThread of this failure by:

i. Depositing the failure information in a shared data structure that includes the failure type,

the groupname and the thread unique identification number. This implies acquiring a lock

before accessing the data structure.

ii. Then, RMSThread sends aSIGUSR1 UNIX signal to MgmtThread to indicate the failure

occurrence.

(d) The RMSThread re-joins the group name again, sets ajoin-timer5 and waits for the group re-

formation to complete.

3. When the MgmtThread receives this signal, it will clear the failure entry from the data structure and

multicast a notification of failure (GrpFailure message) toall MgmtThreads.

4. When the MgmtThread receives a notification of failure, itsends aSIGUSER2 signal to the corre-

sponding RMSThread assigned to this group that had the failure.

5. If the RMSThread receives aSIGUSER2 signal, it leaves the group, allocates a new protocol object

and re-join the same group again. It then wait for other RMSThread to join the group again.

6. Whenever a new RMSThread joins the group, RMSThread members receives a list of all active clients

in the group. If all RMSThreads (and clients) are successfully re-joined or the join-timer expires, the

first RMSThread discovered the failure will send a multicastmessage to RMSThreads to resume

working (sending and receiving) normally.

7. RMSThread will ignore any GrpFailure duplicates if they are received after the first one by time

periodt, such thatt � join-timer (seconds).

8. During the recovery time, RMSThread buffers the messagescoming from the client. The RMSThread

will request the client to pause and resume transmission viathe CC if the buffers are full. If the group

is reformed successfully, the buffered messages will be sent immediately.

Basically, the algorithm recovers the communication failures by reforming the failing group transparently

and without the clients involvement. In RMP version 1.3, thegroup communication failure are normally

identified via an explicit event notifications (Failure Detected) from the protocol itself. However, some

reliable multicast protocol may not convey an explicit message when a group failure occurs. For this

reason, we use a simple algorithm for group diagnostic purposes.5it is typically 15 seconds in IRI.

11

Page 12: Application-Layer Group Communication Server for Extending

Group Partitioning Recovery Algorithm: The group partitioning can be detected by some reliable mul-

ticast protocols such as RMP 1.3b which sends keep-alive message to the group every 60 seconds for this

purpose. However, if the underlying multicast protocol does not discover such failures, it can be discovered

by the failure detection algorithm described later in the this section. In either case, the group partitioning

problem can be easily recovered using the same fault recovery algorithm described previously. However,

this algorithm allows each partitioned group to reform independently and it does not enable the partitioned

groups to re-join in one group even after the network portioning problem is resolved. This is not the desired

situation in some distributed applications such IRI. IRI requires that each re-formed group to be connected

with the teacher in the session in order to validate this re-formed group. For this reason, the previous al-

gorithm must be modified to allow re-formation only for the groups that can communicate with the teacher

member. In order to achieve this, step number 6 in the previous algorithm is slightly modified. After the

join-timer expires, the first RMSThread discovered the failure (i.e. partitioning) checks if the teacher has

already joined the group successfully. If this is the case, RMSThread sends a multicast message to resume

group activities (i.e. successful reformation). Otherwise, RMSThread declare unsuccessful reformation

which causes each member to drop its membership again. Members in this portioned group keep repeating

this process frequently (with exponential back-off) untileither the teacher is found or the group members

give up completely.

Group Failure Detection Algorithm: If the group is not active (means a member do not send or receive)

for T imeV alue, RMSThread sends a “GRPping-request” message to the RMSThreads of the group. If

RMSThread receives the GRPping-request, it multicasts “GRPping-reply” message to the group. However,

to avoid GRPping request implosion, the RMSThread sets arequest-timerto a random value between 0

andMaxReqT ime before sending. If the request-timer expired and no GRPping-request is sent, then

RMSThread sends one out and sets thewaiting-timersuch that

waiting-timer> 2 � MaxPropagationT ime +MaxRepT ime6. TheMaxReqT ime can be appropri-

ately chosen based on the network characteristics (such as maximum propagation delay) and number of

members in the group. We foundMaxReqT ime = 600 ms is appropriate in IRI environment (maximum

propagation delay is 15 ms when 30 members are active). However, assigning bigger but reasonable timer

values in this specific algorithm should not cause any harm since the detection mechanism itself is not a

delay sensitive operation (i.e. no big deal if a client discovers the fault few seconds later).

Similarly, the RMSThreads also setreply-timer to a random value between 0 andMaxRepT ime before

sending GRPping-reply. If the reply-timer expired, the RMSThread sends a reply immediately. If some

RMPThreads do not reply within waiting-timer period, then the failure is declared and the fault recovery

algorithm starts since some members do not respond. It is important here to know that each RMSThread

has the group member list as will be described in Section 4.6.In IRI, theT imeV alue (non-active time) is6assuming the packet transmission time is negligible.

12

Page 13: Application-Layer Group Communication Server for Extending

: Data ChannelUnix Socket

DC

CC

CC

RMTP Clients

DC

: LWP

DCCC

: Thread : Thread CreationCC : Control Channel

: LWP : Data Channel

Unix Socket: Control Channel DC

Mgm

tThr

ead

RM

SThr

ead

MainThreadRMS

Mgm

tThr

ead

RM

SThr

ead

MainThreadRMS

R M S (RMTP-based) R M S (RMP-based)

Mgm

tThr

ead

RM

SThr

ead

MainThreadRMS

Multicast Protocol Gateway

R M T P R M P Mgm

tThr

ead

RM

SThr

ead

R M P

Network Backbone

R M T P

CC DC DCCC

RMP Clients

Figure 5: Centralized Gateway Architecture for Inter-Protocol Communication

set to one minute which is unlikely to be reached during IRI session. Similarly, duplicates are ignored if

they are received after the first GRPping request by less thanT imeV alue seconds.

The group fault detection and recovery mechanisms improve the reliability and the robustness of IRI dis-

tributed application. This reduces IRI communication downtime which increases the quality of IRI session

in distance education [3].

4.2 Supporting Inter-protocol Multicast Communication

One of the major extended services that RMS contributes to group communications is offering a mech-

anism to bridge different reliable multicast protocols. Clients using different reliable multicast protocols

can still participate (send and receive) in one multicast group. Moreover, this is performed transparent to

the distributed applications which use reliable multicastprotocols. In this case, the RMS act as a gateway

between two or more protocols. This service is significantlyimportant for the following reasons:

(1) It enables combining the features of different reliablemulticast protocols into one multicast suite in

the distributed application. Reliable multicast protocols have different design objectives and application

domains. Therefore, combining different multicast protocols in one application could be useful to accom-

modate heterogeneous environments (LAN vs WAN environment).

13

Page 14: Application-Layer Group Communication Server for Extending

RMPRMTP

: Data ChannelUnix Socket

RMP

RMP

RMTP

Mgm

tThr

ead

RMTP

Mgm

tThr

ead

: ThreadDC

RMTP

CC : Control Channel

RMS

: LWP

MainThread

: Thread Creation

Network Backbone

Client

SecT

hrea

d

Pri

mT

hrea

dT

oBeS

entQ

RMSThread

R M SDCCC

RMP

To Other RMSTo Other RMS

To Other RMS

Figure 6: Distributed Proxies Architecture for Inter-Protocol Communication

(2) For large-scale applications, it is desirable to provide an open solution where users are not obligated to

use one multicast protocol. So, it is possible to have different users using different reliable multicast proto-

cols. Separating the application requirements from the underlying multicast protocols is useful specially if

it is unlikely to have a standard reliable multicast protocol in the near future.

(3) This service is useful for experimental purposes since it enables the developers as well as the users to

compare and study the behavior of different reliable multicast protocols concurrently.

In IRI system, we chose to integrate RMP which is a token-ringbased protocol and RMTP which is a treebased protocol. Here, we will describe the extension in the RMS implementation to support this service.One single change in the RMS interface in Figure 2 is requiredto support this service:

ConnectToLocalRMS(char *CntLink, int *CSock, char *DataLink,int *DSock, char *ProtocolName)

TheProtocolNameparameter is added in the connect request to determine the primary reliable multicast

protocol used by this client (in IRI, RMP or RMTP), if there isan option for more than one protocol.

This also can be used withConnectToRemoteRMS(). There are two alternative RMS architectures

to support inter-protocol communication:centralized gatewayanddistributed proxiesarchitectures. The

former gains simplicity but the later gains efficiency and flexibility. In the following, we describe both

approaches:

14

Page 15: Application-Layer Group Communication Server for Extending

Centralized Gateway Architecture: This architecture is used when one reliable multicast protocol is

available per RMS for each client. However, different clients (and also RMSs) can use different reliable

multicast protocols to communication in the group. The design architecture of RMS described in Section 3

can be easily extended to support this service. In this architecture, one machine acts as a gateway between

different reliable multicast protocols. It can be a dedicated RMS process or a regular RMS participating

in the group and performing the gateway functionality. In either case, the RMS must be started with a

special flag (-g) in order to activate the gateway service. When the gateway starts, it creates a new thread

(MgmtThread) to join a special group called management group (MgmtGrp). Every RMS member has a

MgmtThread which joins the same MgmtGrp group. The MgmtGrp is used to exchange group information

between the gateway and the MgmtThreads regardless what protocols their clients use. The gateway joins

this group using both RMP and RMTP. This means that the gateway has a MgmtThread for each reliable

multicast protocol used in the distributed application. For example, in IRI, there are one MgmtThread for

RMP clients and another MgmtThread for RMTP clients. If a client sends ajoin or leaverequest, the

RMSThread performs join or leave operations and sends a multicast control message to the MgmtGrp in

order to notify the gateway (as well as all members) of this action. This protocol is necessary because some

reliable multicast protocols may not send any join or leave notification to the group when a member joins

or leaves. If the gateway joins late, it sends a multicast message to MgmtGrp using all protocols (e.g. RMP

and RMTP) to request information about all groups. This information contains each member, its protocol

and its active groups. Hence, each MgmtThread has the group information of those members who use

the same protocols since the join and leave information are multicasted to MgmtGrp. When MgmtThread

receives group information request, it sets a randomized timer. If the timer expires and no reply has been

sent yet, then it sends a reply with group information of every member uses the same reliable multicast

protocol. This mechanism is used to avoid reply implosion. This request is repeated for each protocol used

by RMSs. Similarly, if the gateway crashes for any reason, itwill get the information from at least one

MgmtThread of each protocol domain.

At this point, the gateway has the information of all membersand their groups. Next, the gateway joins the

groups that have members using different reliable multicast protocols. Consequently, whenever the gateway

receives a multicast message, it forwards the message to thesame group through other multicast protocols.

For example, in IRI, when the RMP client sends a multicast message to a group namedX, the gateway

receives and forwards this message to the same groupX using RMTP (i.e. McastSendData() in RMTP

API). Forwarding multicast messages terminates if the group members start using one reliable multicast

protocol (i.e. since other members left the group).

This architecture has the following limitations: (1) it involves one receive and one send extra for each

multicast message which may decrease the overall throughput of the group communications, (2) this extra

processing (e.g. receive and send) imposes a delay in the multicast path to the other group, (3) this archi-

tecture does not allow testing different protocols concurrently as the case in the next architecture, and (4) it

suffers a single-point of failure since members in the groupmay loose some messages during the gateway

crash and recovery time.

15

Page 16: Application-Layer Group Communication Server for Extending

Distributed Proxies Architecture: This architecture is more efficient and flexible than the previous one.

However, it requires each RMS to integrate all different protocols which potentially can be used by the

application clients. It is desirable, in some application domains, to designate different protocols to serve

different clients in the same group even though the clients may have the potential to use the same reliable

multicast protocol. This, in particular, permits the application to assign the local clients (in the same LAN)

to protocols that tend to be more efficient in LAN environmentand the remote clients (i.e. WAN clients)

of the same group to other protocols which are found more suitable in WAN environment. This hybrid

protocol configuration (e.g. combining token ring-based protocol such as RMP and tree-based protocols

such as RMTP ) provides more flexibility to distributed applications and may also improve the performance

and the scalability of multicast protocols.

In particular, in IRI, every RMSThread serving a client integrates both RMP and RMTP. After the client per-

formsConnectToLocalRMS(), as shown in Figure 6, the RMS creates LWP thread called the primary

thread orPrimThreadthat uses the primary protocols and another LWP thread called a secondary thread

or SecThreadthat uses the other reliable multicast protocol. When a joinrequest is sent by the client, the

RMS performs join operation using both protocols. In particular, the PrimThread performs join operation

and then it requests the SecThread to join the same group too.If join and leave operations are announced

to the group by the underlying protocols, then RMSThread canknow immediately if there exist member(s)

that use different protocol. Otherwise, RMSThread uses theMgmtThread protocol as described before to

distributed the group information. Typically, when RMSThread joins a group, it sends a notification to the

MgmtGrp announcing its group names and primary protocol. The MgmtThreads maintain the group infor-

mation using the same protocol described before. RMSThreads can get the group information by accessing

the group data structure stored by the MgmtThreads reside inthe same machine. Before the client starts

send or receive group activity, the PrimThread must send a multicast message to MgmtGrp requesting the

group information. In either case, each RMSThread knows if there exist clients which use different primary

protocol than its own primary protocol in the same group. This information is important for sending data

out to the group. WhenSendData() is invoked, the message is read from the UNIX socket and inserted

in a queue called ToBeSentQ. This queue is checked by both PrimThread and SecThread simultaneously.

If there exists a message in ToBeSentQ and there exist clients in the same group using different primary

protocols, then both PrimThread and SecThread send the message to the group. Only PrimThread receives

multicast messages and forwards them to the client over the UNIX socket. Thus, SecThread is used only for

sending multicast messages to members in the same group using different primary protocols. It is important

to notice that these protocol operations (joining, exchanging group information, sending and receiving) are

completely transparent to the application clients and the users as well.

Figure 6 shows the general design of this architecture. In order to provide asynchronous protocol operations

which make each protocol independent from the other, we use aLWP thread per protocol instead of a single

LWP for all protocols. Thus, blocking operations or API calls of one do not slow down the other protocol

since they are invoked concurrently instead of sequentially, and they share the resources equally (LWPs have

the same priority). Moreover, we use a single queue (ToBeSentQ) instead of queue per thread/protocol in

16

Page 17: Application-Layer Group Communication Server for Extending

order to minimize the number of packet copies from the UNIX socket to the queue. One of the interesting

issues in this architecture is that it permits members usingdifferent protocols to run in different pace.

This implies that clients can be out-of-synch during the session. However, this problem can be controlled

effectively through adjusting the length of ToBeSentQ to the proper value. For example, if the ToBeSentQ

length is 1 slot, then all members regardless of their protocols will run almost in the same pace. In fact, this

could be a useful feature sometimes because it can treat members in the group differently based on their

capabilities.

This solution is more efficient because it involves a single extra send for each multicast messages and the

delay in the multicast path is negligible since each messageis multicasted via independent LWP threads. It

is flexible because (1) it enables users/developers to test and evaluate different multicast protocols concur-

rently, and (2) the client is allowed to use different multicast protocols simultaneously which is important

to accommodate LAN and WAN environments as explained before.

4.3 Extended Protocol Services

RMS can be used effectively to support new group communication services that are not defined in the reli-

able multicast protocol specification. In this section, we discuss two important reliable multicast protocol

services which are provided by RMS with no direct support from the underlying protocol. These protocol

services are:

Selective Re-transmission Mode: RMP version 1.3b does not support semi-reliable transmission mode

which allows the client to discriminate during transmission between messages of the same stream (i.e.

some messages may not be re-transmitted). The selective re-transmission mode is significantly important

in multicasting continuous streams such as video in distributed multimedia applications [7]. Using the

RMS, the client can select the transmission mode (reliable or unreliable) for each message by setting the

QoS parameter inSendData() (0 means unreliable otherwise reliable). However, the client has to enable

this feature when joining the group by setting the proper join flag inSendJoinRequest() [8]. If the

client requests the selective re-transmission service fora certain group, each RMSThread serving this group

establishes two different multicast groups for every client: Reliable Group(RG) using RMP andUnreliable

Group(UG) using native IP multicasting. Thus, each RMSThread serving this particular group, for example

group name GrpXYZ, is engaged with GrpXYZ-RG and GrpXYZ-UG groups which are the reliable and

the unreliable groups of GrpXYZ, respectively. It is important to notice that these groups are transparent

to the client. When the client designates the message as reliable, the RMSThread uses the RG to send this

message. Otherwise, the RMS uses UG to send this message. TheRMSThreads assign a sequence number

(RMS-sequence) for each message (reliable or unreliable) before it is sent.

In the receiving side, the RMSThreads continuesly receive from the RG (RGSock) and UG (UGSock)

groups (as shown in Figure 7). However, it is mandatory (1) toreceive the message stream in same order

17

Page 18: Application-Layer Group Communication Server for Extending

//

// RMSThread.cc

//

/* ...... */

void Receiver(int RGSock, int UGSock)

{

/* ...... */

CurrentSeq = LastSeq=0;

FD_ZERO(&afds);

FD_SET(RGSock, &afds); /* reliable group */

FD_SET(UGSock, &afds); /* unreliable group*/

while(TRUE)

{

LastSeq = CurrentSeq;

if (select(nfds, &rfds, (fd_set *)0, (fd_set *)0,

(struct timeval *)0) < 0)

{ perror("select"); exit(1);}

if (FD_ISSET(RGSock, &rfds))

{ /* Read message from RG (reliable) */

ReceiveDataFromRG(RGSock,MAX_SIZE,RelMsg,&Count);

CurrentSeq = GetSeqNumber(RelMsg);

AddToReceiveQ(RelMsg); // put the original Msg in Queue

}

else if (FD_ISSET(UGSock, &rfds))

{ /* Read message from UR (unreliable) */

ReceiveDataFromURG(UGSock,MAX_SIZE,UnRelMsg,&Count);

CurrentSeq = GetSeqNumber(UnRelMsg);

if (CurrentSeq < LastSeq)

continue; // ignore this message // CASE(1)

else if (CurrentSeq == LastSeq+1)

AddToReceiveQ(UnRelMsg); // CASE(2)

/* then, CurrentSeq > LastSeq+1 */

else if (GetRelSeqNumber(UnRelMsg)<=LastSeq)

AddToReceiveQ(UnRelMsg); // CASE(3)

else

continue; // ignore this message // CASE(4)

}

} /* end of while */

} /* end of Receiver() */

Figure 7: The Receiver Selective Re-transmission Algorithm

it were sent, and (2) to deliver reliable messages to the client. For these reasons, when the RMSThread

receives a message via the UG, it checks its sequence number and discards the unreliable messages that

has sequence number less than the last one received (Case 1 inFigure 7). On the other hand, if it has

the next sequence number, the message will be delivered to the client immediately (Case 2). However,

what if the sequence number of the UG-message is greater thanthe next sequence number (Case 3 and

Case 4). To handle this case effectively, the RMSThread putsalso in the header of the UG-messages the

sequence number of the last RG-message (reliable message) that has been sent. In other words, the UG-

message has its own sequence number as well as the sequence number of last RG-message sent in the same

stream. If the sequence number of the last reliable message found in UG-message has not been received by

RMSThread yet, then this UG-message is discarded (Case 4). Otherwise the UG-message is delivered to

the client (Case 3). This solution does not only avoid infinite buffering of UG-messages, but it also avoids

18

Page 19: Application-Layer Group Communication Server for Extending

unnecessary dropping of UG-messages in the receiver end which increases the throughput and the quality

of group communication. Therefore, clients in the other endreceive messages in order but some unreliable

designated (UG-messages) messages may have been dropped. Figure 7 shows the algorithm in C (only

important parts are shown). This algorithm is transparent to the client and incurs a negligible overhead.

Rapid and Dynamic Sub-Group Formation: The ability to form transient sub-groups is a desirable

service in many distributed applications. A member in a multicast group may desire to send “private”

information to a subset of the group for a short time. For example, in IRI, the teacher may desire to have a

private communication (e.g. chatting) with one or group of students in the virtual classroom.

This service is also needed in many other distributed applications such as monitoring distributed systems [2]

and distributed interactive simulation [14] where messages may be delivered based on some of its contents

rather than the IP address and port number. However, the standard IP dispatching of multicast messages is

based on the class-D IP address and the port number. One solution is to form a new transient sub-group by

requesting the corresponding members tojoin this group temporally and thenleave. However, this solution

suffers an unnecessary overhead specially if such multicast sub-groups change frequently and last for a

short period of time. For these reasons, forming transient groups byjoining andleavingmulticast groups

for each request is not an efficient solution due to the overhead of processing and sending “join” and “leave”

multicast requests specially if such operations are group atomic in which all members should confirm the

new membership.

RMS provides this service more effectively by filtering out the sub-group messages which do not belong to

the members. If a member wants to deliver messages to a subsetof the group, it sends aSendGroupMaskRequest()

to RMS including the member identification numbers (IDs) of the sub-group member. Each member is as-

signed a unique identification number at the join time. The sender can useSendMembersListRequest()

to get the member IDS in the group. Then, the RMSThread at the sender will include a mask number in the

header of each message sent from the requesting member. The RMSThread in the receiver end forwards the

messages only if the member (client) ID matches the mask in the message. This filtering is applied on each

message form this requesting member until it sendsSendGroupMaskRelease(). The ID matching is

typically one binary bit operation (i.e. anding) and one conditional instruction (no string comparisons or

hashing is involved) . This technique is designed to supporta short lasting sub-group multicasting deliv-

ery. For a long term sub-group multicasting, the first approach could be more appropriate to avoid sending

messages to RMSThreads that have non-subgroup members for along time.

4.4 Providing Multi-session Group Communications

It is desirable sometimes to enable the client to join more than one group simultaneously. We call this

feature multi-session group communication. Some protocolimplementations do not provide an efficient

support for this service. For example, in RMP 1.3b, the application has to maintain objects and track the

19

Page 20: Application-Layer Group Communication Server for Extending

event loop for each joined group independently. This complicates the programming task specially when

large number of groups are involved. The RMS provides a simple and effective protocol to support this

service. The client can join as much groups as desired by simply sendingSendJoinRequest()with the

right group name. The client can also choose either to use single RMSThread to serve all groups or to assign

a thread per group (callingConnectToLocalRMS() again). In the former case, the application (client)

is responsible for demultiplexing the data coming from different groups since each message includes the

group name in the header. In the later case, group data will bereceived from different sockets, so messages

demultiplexing in the application is not required.

As well as supporting multiple groups per client, the RMS caneffectively support multiple clients in the

same machine by assigning a RMSThread per client. This alleviates the context switching overhead of

having a process per client. This is typically the case in IRIsystem since components (e.g. Sess, TSE and

PT) are processes that join different groups in every machine [18].

Priority-based Group Service: One of the distinguished features of this service is the potential to sup-

port priority-based group communication service. In this service, the client may join multiple sessions and

enforce some preference in the service level among these sessions. This is extremely useful in multime-

dia distributed applications where different groups have different response requirements. For example, the

audio client in real-time interactive systems requires faster response than whiteboard (PT in IRI) client.

The RMS multithreaded architecture7 enables the client to assign different classes of services for managing

their groups [16]. Briefly, There are two general approachesto support priority-based group communica-

tion service: (1) usingthr setprio() with unbounded thread, and (2) using real-time LWP bounded

threads which is complex and may not be desirable for reliability issues [16]. Although this issue is still a

work in progress, we are looking to design a dynamic scheme using both approaches in IRI. Our current

implementation does not support the priority-based service.

4.5 A Simple Programming Interface

One of the major services provided by RMS is reducing the programming effort involved in using or in-

tegrating the implementations of reliable multicast protocols. Some reliable multicast protocols such as

RMP has an imperative API interface. For example, in RMP 1.3b, the protocol event loop must be con-

trolled from the the application event loop itself which makes the programming environment less man-

ageable (i.e. the applications must release the event control to RMP for sometime and vice versa). Also,

the applications must be aware of some protocol specific states such as “FlowControl is Full” and be-

have accordingly to resolve this problem. The RMS providesasynchronous, declarativeandsimplepro-

gramming interface (See Figure 2). It is asynchronous sincethe application can send a request and goes

back to its event loop immediately. It is declarative since the programmer has to know nothing about7in Solaris

20

Page 21: Application-Layer Group Communication Server for Extending

the state or the specifics of the underlying protocol in orderto issue a request. However, it is suffi-

cient to send to RMS the request with proper parameters. For example, to send a message correctly,

SendData(SockNum,GrpName,Msg,MsgSize,QoS) must be invoked with the right parameters.

Then, RMS will take care of any problems that may arise when sending this message such as flow control

problems. This is performed transparently and asynchronously with the application operations. It is also

simple since the application event loop is independent fromthe protocol event loop. This service has the

following good contributions to IRI system:

(1) Improving usability: it provides all services supported by the underlying protocols to IRI components

such as Sess, PT and TSE while hiding their details and specifics. More interestingly, the underlying re-

liable multicast protocol was upgraded and sometimes replaced by another one with no need to notify IRI

components and developers.

(2) Increasing maintainability:both the application programs (or IRI components) and the RMS can mod-

ify their implementations independently. This offers a considerable flexibility in IRI development and

testing process.

4.6 Extended Group Communication Requests

The RMS provides new group communication requests which maynot be supported directly by the underly-

ing reliable multicast protocols. Examples of such requests include (see Figure 2)SendMembersListRequest()

to request the group member list,SendMemStatusRequest() to check the membership status (e.g.

connected or disconnected from the group),SendGrpInfoRequest() to request the group information

such as multicast address, port number, TTL value and memberID number,SendConfigurationRequest()

to get the protocol configuration information such as windowsize, timers values and buffer sizes, and

SendStatisticsRequest() to collect some statistics on the multicast services such asnumber of

dropped, retransmitted and duplicated packets. This information is useful for group management, problem

diagnosis and performance tuning purposes.

Some of these requests are partially supported by the underlying protocols. For example, RMP sends a

notification whenever a member joins or leaves the group. TheRMS collects such notifications to build

a group information base. In addition, some of these requests are supported after modifying the original

source code of the protocols in order to reveal this information at run-time. Some statistics parameters (e.g.

the current packet drop rate) is an example of this type.

5 Conclusion And Future Work

This paper motivates and describes our work of developing a Reliable Multicast Server (RMS) to enhance

and extend the services provided by the current implementations of reliable multicast protocols. Our experi-

ence in using and integrating number of reliable multicast protocols in IRI, which is a large-scale distributed

21

Page 22: Application-Layer Group Communication Server for Extending

distance learning applications, exposes the importance ofan application-level RMS to support the following

extended services:� Providing an automatic recovery mechanism for group communication failures,� Providing an architecture for inter-protocol communication between different reliable multicast pro-

tocols,� Supporting selective re-transmission mode,� Supporting rapid and dynamic sub-group multicasting,� Providing an efficient multi-session group communications,� Providing a simple declarative group communication programming interface, and� Supporting a new set group communication requests.

Through out our experience of using RMS during IRI teaching sessions in the past two semesters, the per-

formance, the reliability and the flexibility of group communication have been noticeably improved in IRI

virtual classroom. In this paper, we describe the design architecture and the object-oriented implementation

of RMS as part of IRI system. The RMS code is available for academic use from www.cs.odu.edu/ tele/iri.

Our future work in this direction consists of short-term andlong-term plans. Our short-term plan includes

(1) testing and tuning RMS within IRI to run efficiently over Virginia ATM wide area network, (2) providing

an efficient framework for supporting slow clients in multicast groups, (3) developing mechanism to support

different classes of group communication service for different groups or session using real-time threads or

priority scheduling, and (4) conducting some performance benchmarking experiments for the available

implementations of reliable multicast protocols in different environments and compare the result with RMS

using hybrid protocols. The long-term plan includes designing and developing a reliable multicast protocol

that adapts to the requirements of distance learning applications in LAN and WAN environments.

Acknowledgment

Special thanks for Emad Mohamed for his patience in going through this paper and for his insightful com-

ments. Also, we would like to thank Todd Montgomery and Thomas Yeh in GlobalCast Communications

for their immediate response and support. Finally, we wouldlike to thank Sanjoy Paul from AT&T Bell

labs and Rajesh Talpade from Georgia Tech for their help and support.

References

[1] H. Abdel-Wahab and K. Jeffay. Issues, Problems and Solutions in Sharing X Clients on Multiple Displays.Journal of Internetworking Research & Experience, pages 1–15, Vol. 5, No. 1, March 1994.

22

Page 23: Application-Layer Group Communication Server for Extending

[2] Ehab Al-Shaer, Hussein Abdel-Wahab, and Kurt Maly. High-performancdMonitoring Architecture for Large-scale Distributed Systems Using Event Filtering.CS&I’97: Third International Conference on Computer Sci-ence & Informatics, 3:42–46, March 1997.

[3] Ehab Al-Shaer, Alaa Youssef, Hussein Abdel-Wahab, Kurt Maly, , and C.Michael Overstreet. Reliability, Scala-bility and Robustness Issues in IRI.WETICE’97:IEEE6th Workshops on Enabling Technologies: Infrastructurefor Collaborative Enterprises, June 1997.

[4] S. Armstrong, A. Freier, and K. Marzullo. RFC-1301: Multicast Transport Protocol. February 1992.

[5] K. P. Birman and T. A. Joseph. Reliable Communication in the Presence of Failures. ACM Transactions onComputer Systems, 5(1), February 1987.

[6] K. P. Birman, A. Schiper, and P. Stephenson. Lightweight Causal and Atomic Group Multicast.ACM Transac-tions on Computer Systems, 9(3):272–314, August 1991.

[7] R. Braudes and S. Zabele. Requirements for Multicast Protocols, RFC 1458. Technical report, Network WorkingGroup, IETF, May 1993.

[8] D. Clark and D. L.Tennenhouse. Architectural Considerations for NewGeneration of Protocols. InACMSIGCOMM’90, pages 200–208, Philadelphia, September 1990.

[9] J. Cooperstock and S. Kotsopoulos. Why Use a Fishing Line WhenYou Have a Net? An Adaptive MulticastData Distribution Protocol. InIn Proceedings of Usenix ’96, 1996.

[10] S. E. Deering. RFC-1112: Host Extension for IP Multicasting.August 1989.

[11] S. Floyd, V. Jacobson, C. Liu, S. McCanne, and L. Zhang. A Reliable Multicast Framework for Light-weight Sessions and Application Level Framing.To appear in IEEE/ACM Transactions on Networking,http://ftp.ee.lbl.gov/floyd/srm.html.

[12] Hugh Holbrook, Sandeep Singhal, and David Cheriton. Log-Based Receiver-Reliable Multicast for DistributedInteractive Simulation. InACM SIGCOMM’95, pages 328–341, 1995.

[13] Michael Koch. Design Issues and Model for a Distributed Multi-UserEditor. The International Journal ofComputer Supported Cooperative Work, 3(4):359–378, March 1995.

[14] A. Koifman and S. Zabele. RAMP: A Reliable Adaptive Multicast Protocol. In Proceedings IEEE INFOCOM’96, San Francisco, CA., March 1996.

[15] B.N. Levine and J.J. Garcia-Luna-Aceves. A Comparison of Known Classes of Reliable Multicast TransportProtocols. InProceedings of International Conference on Network Protocols, October 1996.

[16] Bil Lewis and Daniel J. Berg, editors.Threads Primer: A Guide to Multithreaded Programming. Sun SoftPress, 1996.

[17] M. R. Macedonia and D. P. Brutzman. Mbone Provides Audio and Video Across the Internet.IEEE ComputerMagazine, pages 30–36, April 1994.

[18] K. Maly, H. Abdel-Wahab, C. M. Overstreet, C. Wild, A. Gupta, A. Youssef, E. Stoica, and E. S. Al-Shaer.Interactive Distance Learning over Intranets.Journal of IEEE Internet Computing, pages 60–71, Feb. 1997.

[19] S. McCanne and V. Jacobson. vic: A Flexible Framework for Packet Video. ACM Multimedia, pages 511–522,November 1995.

[20] S. Paul, K. Sabnani, J. Lin, and S. Bhattacharyya. Reliable Multicast Transport Protocol (RMTP).IEEE Journalon Selected Areas in Communications, 15(3):407–421, April 1997.

[21] W. Richard Stevens.TCP/IP Illustrated, Volume 3: TCP for Transactions, HTTP, NNTP and the UNIX DomainProtocols. Addison-Wesley, Reading, Massachusetts, 1996.

[22] R. Talpade and M. Ammar. Single Connection Emulation: An Architecture for Providing a Reliable MulticastTransport Service. InProceedings of the IEEE International Conference on Distributed Computing Systems,Vancouver, BC, Canada, June 1995.

23

Page 24: Application-Layer Group Communication Server for Extending

[23] B. Whetten, T. Montgomery, and S. Kaplan. A High Performance TotallyOrdered Multicast Protocol. InTheoryand Practice in Distributed Systems, Springer, Verlag LCNS 938, 1994.

[24] R. Yavatkar, J. Griffioen, and M. Sudan. A Reliable Dissemination Protocol for Interactive Collaborative Ap-plications.ACM Multimedia, 1995.

24

Page 25: Application-Layer Group Communication Server for Extending

Appendix

Here, we present a brief explanation of the client code (IRIApp.cc) which is listed in Figure 8. Because of

the space limitation, only the important parts of the code ispresented here. The client starts by connecting

to RMS via the control and data channels using APIRMS object.Then, the client joins thegroupname

by sending the join request to RMS. The client, then, goes to its event loop to perform two major tasks: (1)

it reads control and data messages from the standard input and sends them out to RMS, and (2) it receives

control or data messages from RMS or the group, respectively. The user can switch between both channels

(data or control) by typing “c” or “d” in the standard input. The client usesReceiveControlReply()

to receive and display the control reply messages. The return code is checked to pause the transmission if

it is needed. Also, the client usesSendControlMessage() to decode the control message request and

send it accordingly.

//

// IRIApp.cc

//

APIRMS *RMS = new APIRMS;

UnixSock *usock = new UnixSock;

main()

{

if (RMS->ConnectToLocalRMS(CntFname,&CntSock,

DataFname,&DataSock) < 0)

{ perror{"ConnectToRMS"); exit(1);}

if (RMS->SendJoinRequest(CntSock, grpname, flags) < 0)

{ perror{"SendJoinRequest"); exit(1);}

Client_Mainloop();

} /* end of main program */

void Client_Mainloop(void)

{

StopSending = TRUE;

/* select event loop initialization */

FD_ZERO(&afds); FD_SET(0, &afds);

FD_SET(CntSock, &afds); FD_SET(DataSock, &afds);

while(TRUE)

{

if (select(nfds, &rfds, (fd_set *)0, (fd_set *)0,

&timeout) < 0)

{ perror("select"); exit(1);}

if (FD_ISSET(0, &rfds))

{ /* Read message (Control or Data) from

standard input and send it RMS */

printf("Enter c (for Control) or d (for Data)>\n");

ch=getchar();

if (ch==’c’) /* Control channel is requested */

{

if (SendControlMessage(CntSock) < 0)

{ perror("SendControlMessage"); continue;}

}

else if (ch==’d’) /* Data channel is requested */

{ /* Otherwise, read and send data*/

gets(msg);

if (!StopSending) /* no pause required*/

RMS->SendData(DataSock, NULL, msg,

25

Page 26: Application-Layer Group Communication Server for Extending

//

// IRIApp.cc (Cont.)

//

else if (FD_ISSET(CntSock, &rfds))

{ /* Control Message arrived from RMS */

if (CntCode=RMS->ReceiveControlReply(CntSock) < 0)

{ perror("ReceiveControlReply"); exit(1); }

if (CntCode == PAUSE)

StopSending = TRUE;

else if (CntCode == RESUME)

StopSending = FALSE;

}

else if (FD_ISSET(DataSock, &rfds))

{ /* Data Message arrived from RMS or the group */

if( (tmp=usock->ReceiveData(DataSock, MAX_MSG_SIZE,

buff,&Count)) == NULL)

{ shutdown(1, DataSock); exit(1); }

//

// Here, the client consumes the recieved message

//

}

} /* end of while */

} /* end of main */

int SendControlMessage(in CntSock)

{

/* Read Control message from standard input and

send it to RMS */

gets(request_msg);

if (ControlMessage(request_msg) == Leave)

{

sscanf(request_msg,"%s %s", command, grpname);

return(RMS->SendLeaveRequest(CntSock, grpname));

}

else if (ControlMessage(request_msg) == MembersList)

{

sscanf(request_msg,"%s %s", command, grpname);

retrun(RMS->SendMembersListRequest(CntSock, grpname));

}

/* ... ... ... */

/* and so forth for all control messages */

/* ... ... ... */

}

Figure 8: Example of A Client (IRIApp) Program

strlen(msg)+1, RELIABLE);

}

}

26