ieee transactions on parallel and distributed … · online gaming) that often include many...

peerTalk: A Peer-to-Peer MultipartyVoice-over-IP System

Xiaohui Gu, Member, IEEE, Zhen Wen, Member, IEEE, Philip S. Yu, Fellow, IEEE, andZon-Yin Shae, Member, IEEE

Abstract—Multiparty voice-over-IP (MVoIP) services allow a group of people to freely communicate with each other via the Internet,which have many important applications such as online gaming and teleconferencing. In this paper, we present a peer-to-peer MVoIP

system called peerTalk. Compared to traditional approaches such as server-based mixing, peerTalk achieves better scalability andfailure resilience by dynamically distributing the stream processing workload among different peers. Particularly, peerTalk decouples

the MVoIP service delivery into two phases: mixing phase and distribution phase. The decoupled model allows us to explore theasymmetric property of MVoIP services (for example, distinct speaking/listening activities and unequal inbound/outbound bandwidths)

so that the system can better adapt to distinct stream mixing and distribution requirements. To overcome arbitrary peer departures/failures, peerTalk provides lightweight backup schemes to achieve fast failure recovery. We have implemented a prototype of the

peerTalk system and evaluated its performance using both a large-scale simulation testbed and a real Internet environment. Our initialimplementation demonstrates the feasibility of our approach and shows promising results: peerTalk can outperform existing

approaches such as P2P overlay multicast and coupled distributed processing for providing MVoIP services.

Index Terms—Peer-to-peer streaming, voice-over-IP, adaptive system, service overlay network, quality-of-service, failure resilience.

Ç

1 INTRODUCTION

RECENT Internet advancement has made large-scale livestreaming a reality [37]. Although previous work has

studied the feasibility of supporting stream content deliveryusing peer-to-peer (P2P) architectures (for example, [15],[14], [7], [21], [13], and [12]), little is known whether it isfeasible to provide large-scale multiparty voice-over-IP(MVoIP) services using application end points such as peerhosts. The MVoIP service allows a group of people to freelycommunicate with each other via the Internet, which can beused in many important applications such as massivelymultiplayer online gaming [10], [20], telechorus, and onlinestock trading. Different from conventional conferencingsystems that impose explicit or implicit floor controls, westrive to provide a more flexible MVoIP service that allowsany participant to “speak” at anytime. By speaking, wemean not only the uttering of words but also nonverbalactivities such as shouting, singing, cheering, and laughingthat are common in interactive and spontaneous applica-tions such as online gaming. For example, in the Internetgaming application, MVoIP services allow game players to

easily communicate with each other for deploying strategiesand game spectators to cheer up players. The emergingcollaborative distributed virtual environment applicationssuch as inhabited television [28] and digital virtual world(for example, Second Life [1]) can support large onlinecommunities and highly interactive social events where it iscommon to have overlapping audio transmissions frommultiple participants.

Traditional multiparty conferencing systems employeither multicast (for example, [16], [15], [14], and [7]),illustrated in Fig. 1a, or server-based centralized audiomixing (for example, H.323 multipoint control units),illustrated in Fig. 1b. Using the multicast approach, thesystem needs to distribute multiple audio streams concur-rently from all active speakers to all participants. Althoughmulticast is well suited for broadcast applications thatusually involve one active speaker, it becomes inefficient forinteractive and spontaneous applications (for example,online gaming) that often include many simultaneousspeakers. The system can be overloaded by processingmany audio streams concurrently. Moreover, since anyparticipant is allowed to produce audio streams at any time,we need to maintain a large number of multicast trees for allparticipants, which can incur a lot of maintenance overheadespecially in dynamic P2P environments where peers candynamically leave or join the system. The audio mixingscheme can effectively reduce the number of concurrentstreams, which first mixes the audio streams of all activespeakers into a single stream and then distribute the mixedstream to all participants. However, centralized audiomixing lacks the scalability desired by P2P applicationsthat often have large groups and many concurrent voice-over-IP (VoIP) sessions. For example, the existing mostpopular VoIP system Skype [2] can only supportconferencing sessions with at most five people. Previous

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 19, NO. 3, MARCH 2008 1

. X. Gu is with the Department of Computer Science, North Carolina StateUniversity, 3296 EB-II, 890 Oval Drive, Box 8206, Raleigh, NC 27695.E-mail: [email protected].

. Z. Wen and Z.-Y. Shae are with the IBM T.J. Watson Research Center, 19Slyline Drive, Hawthorne, NY 10532.E-mail: {zhenwen, zshae}@us.ibm.com.

. P.S. Yu is with the Department of Computer Science, University of Illinoisat Chicago, 851 South Morgan Street, Chicago, IL 60607-7053.E-mail: [email protected].

Manuscript received 23 May 2006; revised 24 June 2007; accepted 15 Aug.2007; published online 30 Aug. 2007.Recommended for acceptance by X. Zhang.For information on obtaining reprints of this article, please send e-mail to:[email protected], and reference IEEECS Log Number TPDS-0133-0506.Digital Object Identifier no. 10.1109/TPDS.2007.70766.

1045-9219/08/$25.00 ! 2008 IEEE Published by the IEEE Computer Society

work (for example, [28], [22], and [10]) has proposed thecoupled distributed processing (CDP) approach thatuses the same tree for both stream mixing anddistribution, illustrated in Fig. 1c. However, we observethat MVoIP services present asymmetric properties: 1) thenumber of active speakers (that is, stream sources) is oftendifferent from the number of listeners (for example, streamreceivers), and 2) the inbound bandwidth of a peer can bedifferent from its outbound bandwidth (for example, cablenetwork). Thus, CDP can be suboptimal by using the sametree for both mixing and distribution.

In this paper, we present the design and implementationof the first P2P MVoIP system called peerTalk. Compared toprevious work, our solution presents three unique features.First, peerTalk provides the first decoupled distributedprocessing (DDP) model for MVoIP services, illustrated inFig. 1d. The DDP model partitions the multistream audiodelivery into two phases: 1) mixing phase, which mixesaudio streams of all active speakers into a single stream viaa mixing tree, and 2) distribution phase, which distributes themixed audio stream to all listeners via a distribution tree. Thedecoupled processing model can better match the asym-metric property of the MVoIP application, which allows usto optimize and adapt to distinct stream processingoperations (that is, mixing or distribution) more efficiently.Second, peerTalk is fully distributed and self-organizing,which does not require any specialized servers or IPmulticast support. The system provides scalable MVoIPservices by efficiently distributing the stream processingload among different peers. Thus, peerTalk can naturallyscale up as more peers join the system. Third, peerTalk isadaptive, which can dynamically grow or shrink the mixingtree based on the current number of active speakers. Duringan MVoIP session, the number of active speakers candynamically change over time. For example, in a P2Pgaming application, there can be many active speakers atexciting moments and fewer speakers during quiet periods.Any static solution (for example, predetermined aggrega-tion tree at setup time) can be either oversufficient, whichwastes system resources, or undersufficient, which fails tomeet workload requirements. Thus, peerTalk performscontinuous optimization to adaptively optimize the qualityof the MVoIP service in dynamic P2P environments.

The peerTalk system aims at supporting P2P applications(for example, P2P gaming [20]) where MVoIP services aremostly applicable. However, compared to conventionaldistributed systems, P2P environments present more chal-lenges due to higher failure frequency and arbitrary peer

departures. The peerTalk system provides failure-resilientMVoIP services using a set of lightweight failure recoveryschemes. First, the system maintains a number of backupsfor each mixer on the mixing tree by utilizing redundantresources in P2P environments. Thus, we can achieve fastfailure recovery for time-sensitive VoIP applications byavoiding constructing a new mixing tree on the fly as muchas possible. Second, similar to previous work [15], [6], [36],peerTalk adopts an overlay-based approach for failureresilience. We first connect peer hosts into an overlay meshon top of the IP network. The mixing tree and thedistribution tree are then built on top of the overlay mesh.Finally, we assume cooperative P2P environments wherepeers are willing to share resources with each other. TheP2P VoIP service provides natural incentives forparticipants to share resources since they want to receivehigh-quality VoIP services with low cost.

We have implemented a prototype of the peerTalksystem and conducted extensive experiments in both widearea networks such as PlanetLab [27] and simulated P2Pnetworks. Our experiments validate the feasibility ofsupporting MVoIP services using P2P systems and demon-strate the performance advantages of our approachcompared to existing schemes. More specifically, ourresults show that 1) peerTalk can greatly reduce resourcecontentions in P2P environments compared to the overlaymulticast approach, especially for MVoIP sessions withlarge group sizes and heavy workloads (that is, manyactive speakers), 2) peerTalk achieves much a lower servicedelay than the CDP approach by using separate trees, and3) peerTalk can quickly recover MVoIP service failureswhile maintaining low resource contention and servicedelays among live peers. The rest of this paper is organizedas follows: Section 2 introduces the peerTalk system model.Section 3 presents the detailed design and algorithms forP2P MVoIP service provisioning. Section 4 presents thefailure resilience management schemes. Section 5 presentsthe experimental results and analysis. Section 6 discussesrelated work. Finally, the paper concludes in Section 7.

2 SYSTEM MODEL

In this section, we introduce the peerTalk system model.First, we describe the MVoIP service model and its applica-tions. Second, we present the overlay-based P2PVoIP systemarchitecture. Third, we provide an overview of our approachto providing MVoIP services using a P2P system.

2 IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 19, NO. 3, MARCH 2008

Fig. 1. Design alternatives of MVoIP services. (a) Overlay multicast. (b) Centralized mixing. (c) CDP. (d) DDP.

2.1 Multiparty Voice-over-IP Service Model

MVoIP services allow geographically dispersed participantsto communicate with each other in a more natural way thanother alternative solutions such as instant messaging. Thebasic MVoIP service model considered in this paper is thateach participant is allowed to speak at anytime and shouldbe able to hear the voices of all other active speakers.Different from conventional conferencing systems that oftenimpose explicit or implicit floor control, the MVoIP servicedoes not limit the number of participants who can “speak”and the time when participants can “speak.” By speaking,we mean that participants produce any audio signals thatcould be not only the uttering of words but also nonverbalactivities such as singing, cheering, and laughing, or somebackground sound in a virtual environment (for example,music). The MVoIP service has many interesting applica-tions. For example, in increasingly popular multiplayerInternet game applications [20], the MVoIP service allowsboth players and spectators to communicate naturally inreal time [35]. The players can better coordinate with eachother for deploying strategies using audio than usinginstant text messaging. Moreover, the MVoIP service allowsthe game spectators to cheer up the players in morepersonalized ways [10]. Other important applicationsinclude interactive Internet TV, teleimmersions, audio-enabled teleauctions, and collaborative virtual environ-ments. All of the above applications have a commonproperty that many participants can produce audio streamssimultaneously. Moreover, the number of active speakerscan change over time as the session’s activeness changes.

2.2 Overlay-Based System Architecture

The peerTalk system adopts an overlay-based approach forquality-of-service (QoS) management and failure resilience.Instead of constructing the mixing and distribution treesdirectly, peerTalk first connects peer hosts into an overlaymesh on top of the existing IP network. The mixing anddistribution trees are then constructed on top of the overlaymesh. Each peer is connected with a number of peers calledneighbors via application-level virtual links calledoverlay links. Each overlay link between two peer hosts viand vj, denoted by li;j, can be mapped to the IP networkpath between vi and vj. The number of neighbors to whicha peer host can be connected is called the outbound degreeof the peer host, which is limited by the outboundbandwidth at the peer host. Similarly, the inbound degreeof the peer host is constrained by its inbound bandwidth.The overlay topology can dynamically change while eachpeer selects different neighbor peers. Specifically, toconstruct an overlay mesh with node degree k, each peerselects !k=2" nearby peers as neighbors for network localityand !k=2" random peers as neighbors for failure resilience[31]. Remote random peers allow the overlay network tobetter survive correlated failures.

Each peer sends heartbeat messages to its neighbors toindicate its liveness and current stream processing perfor-mance (for example, processing time and throughput). Eachpeer can keep an up-to-date neighbor list and the neighbors’information based on the heartbeat messages. Each peeralso periodically monitors the network delay to itsneighbors and the bandwidth of the corresponding links

using active probing [19]. Each peer maintains the routingcost (that is, network delay) to every other peer and the paththat leads to such a cost. The distribution tree rooted at eachpeer is constructed from the reverse shortest paths insimilar fashion to DVMRP [16]. The mixing tree isdynamically constructed using the adaptive P2P mixingalgorithm presented in Section 3. The rationale behind theoverlay-based approach include the following: 1) allowingeach peer to maintain QoS information (for example, CPUload) about its neighbors and the network QoS (forexample, network delay and data loss) of its adjacentoverlay links from itself to its neighbors, 2) reducing tree-repairing frequency by leveraging the resilience property ofthe overlay mesh that contains multiple redundant pathsbetween every pair of peer hosts, and 3) leveragingprevious overlay multicast solutions (for example, [15])for building the distribution tree.

2.3 Approach Overview

The peerTalk system provides the MVoIP service using anew P2P stream processing approach, which decouples theaudio stream mixing from the audio stream distribution.Fig. 2 shows a P2P MVoIP session with eight participants.Unlike conventional schemes (for example, centralizedmixing), peerTalk does not require any special serversand uses only the end systems of all participants, calledpeer hosts, to perform audio stream processing in a fullydistributed and self-organizing fashion. Each MVoIPservice session employs a set of audio stream processingcomponents called mixers and distributors. The mixers anddistributors are dynamically instantiated on different peerhosts based on their load conditions. Each mixer, denotedby Mi, has multiple input ports and a single output port.The mixer periodically aggregates the audio samples thatarrived at all input ports into one audio sample andnormalizes the result to generate a mixed audio samplepacket that is sent out via the output port. The mixer is thebasic building block in the mixing phase of the decoupledstream processing. In contrast, each distributor, denoted byDi, has a single input port and multiple output ports. Thedistributor replicates each input audio packet into multiplecopies that are sent out via the output ports. The distributorprovides the basic function for the distribution phase.

Different from traditional client-server systems, P2Psystems consist of end-system hosts, denoted by vi. Thepeer host often has constrained resources such as limitedmemory for buffering audio packets received from

GU ET AL.: PEERTALK: A PEER-TO-PEER MULTIPARTY VOICE-OVER-IP SYSTEM 3

Fig. 2. Decoupled MVoIP service delivery model.

networks and low outbound bandwidth (for example,cable/DSL networks). However, multistream audio proces-sing is often resource intensive (for example, large bufferrequirements for many streams and large bandwidthrequirement for sending/receiving packets), especially forlarge-scale MVoIP service sessions involving many partici-pants. Thus, centralized stream processing becomes inap-plicable in P2P environments since no single peer host canmeet the resource requirements. To address the problem,the peerTalk system employs multiple peer hosts tocollectively fulfill the task of audio stream processing. ThepeerTalk system first connects a number of mixers into amixing tree, illustrated by the upper level tree in Fig. 2. Theleaf nodes of the mixing tree consist of all participating peerhosts. We assume that each peer host performs silencesuppression to save resources. If a peer host is a leaf node inthe mixing tree, it generates an audio stream only if the localparticipant produces any sound. The internal nodes of themixing tree consist of serving peer hosts that provide audiomixing functions. Since the number of active speakers candynamically change, the audio mixing workload varies overtime. The peerTalk system can dynamically grow or shrinkthe mixing tree to adapt to the number of active speakers.For the distribution phase, we leverage the existing overlaymulticast solution (for example, [14] and [15]) to construct adistribution tree to disseminate the mixed audio stream fromthe root of the mixing tree to all listening participants,shown by the lower level tree in Fig. 2. Note that theinternal nodesMi andDi in the mixing tree and distributiontree can be instantiated on peer hosts that belong todifferent VoIP sessions. In this paper, we assume that peersare willing to share their resources when they join thesystem. Some research work has addressed the problem ofenforcing fair resource sharing [25], [11] in P2P systems,which, however, is not the focus of this paper. We alsoassume that peer hosts are trustworthy, and secure audiotransmissions can be achieved using cryptography schemes.

Compared to the multicast approach, our scheme has anextra mixing delay. However, the audio mixing phase cangreatly reduce the network traffic and the stream proces-sing load by reducing the number of concurrent streamseach peer has to handle and distribute across networks. Onthe other hand, the height of the mixing tree is often muchsmaller than that of the distribution tree since the activespeakers often constitute a small subset of all participants.The mixing tree delay is thus relatively small compared tothe distribution tree delay that needs to cover all partici-pants. Moreover, different from the multicast approach thathas to use different multicast trees rooted at active speakers,peerTalk always uses the optimal multicast tree that has thesmallest distribution delay. As a result, peerTalk can bemore efficient than the multicast approach, especially forhighly active large-scale sessions with many active speakersand participants.

3 PEER-TO-PEER VOICE-OVER-IPSERVICE PROVISIONING

We now present a fully distributed algorithm for dynami-cally constructing and adapting the audio mixing trees in

P2P environments. The basic idea of our approach is toadaptively distribute the dynamic audio stream mixingworkload among different peer hosts while continuouslyoptimizing the service quality of different MVoIP sessions.

3.1 Service Provisioning Protocol

We now present the VoIP service session provisioningprotocol in the peerTalk system, illustrated in Fig. 3. At asession beginning, all participants of the session run anelection protocol to select the best peer as the rendezvouspoint that serves as the root of both mixing tree anddistribution tree. Different from the multicast approachwhere each active speaker uses a different tree todisseminate the audio stream to all participants, peerTalkonly uses one distribution tree to send the mixed audiostream to all participants. This provides an optimizationopportunity for the system to employ the best multicast treefor the distribution phase. Thus, we want to place therendezvous point on the peer host that is the source of thebest multicast tree. In the current peerTalk system, the bestmulticast tree is the one that has the minimum averagedelay between the source and all other participants.1 Whentwo multicast trees have similar distribution delays, wechoose the one that has a larger mixing capacity. Specifi-cally, all peers concurrently run the DVMRP algorithm toconstruct multicast trees rooted at themselves. Each peermeasures the average delay of its own multicast tree andthen propagates the delay information plus its mixingcapacity to all other members via the overlay mesh. Allpeers then select the same best peer as the rendezvouspoint. For example, in Fig. 3a, all eight participants initiatethe multicast tree construction algorithm and then select thepeer b as the rendezvous point.

Initially, the mixing tree only includes the root mixerinstantiated on the rendezvous point, illustrated in Fig. 3b.All participants are connected to the root mixer as itschildren. During runtime, the system adaptively grows orshrinks the mixing tree based on the dynamic mixingworkload changes using a fully distributed algorithm. First,the root mixer monitors the number of active speakersamong all participants. If the number of active speakers islarger than the number that the root mixer can handle, itspawns new child mixers on other peer hosts to offload the


1. We can use different criteria for selecting the root mixer. We use thedistribution tree delay as the primary selection criteria because thedistribution delay often accounts for a major part in the end-to-end voicepacket delay. We can also use different composite metrics based on thenetwork conditions and audio mixing requirements.

Fig. 3. MVoIP service session setup protocol. (a) Optimal distribution

tree selection. (b) Setup initial mixing and distribution trees.

audio mixing workload. The basic idea of mixing treeadaptation is that each mixer can either split itself if it isoverloaded or merge with its sibling mixers if it is under-loaded. The mixer is also dynamically migrated amongdifferent peer hosts to achieve improved service quality. Wenow describe the distributed algorithms for mixer splitting,mixer merging, and mixer migration, respectively.

3.2 Mixer Splitting

Each mixer Mi in the mixing tree monitors the number ofaudio streams that concurrently arrived at its input ports.Since peers can perform silence suppression, a leaf node onthe mixing tree generates an audio stream only if the localparticipant produces any sound. An internal node on themixing tree generates an output audio stream if any of itsinput ports receives an input stream. Suppose the mixer Mi

has n input ports denoted by I1; I2; . . . ; In. We use timeseries Ak, 1 # k # n, to describe the data arrival pattern atthe input port Ik. The time series Ak consist of a sequence oftime-stamped numbers denoted by ak 2 Ak. At time t, weset ak $ 1 if there are data arriving at the input port Ik orak $ 0 if no data arrives. Hence, the total number of audiostreams that concurrently arrived at the mixer Mi at time t,denoted by !i%t&, can be calculated as !i%t& $

Pnk$1 ak. To

achieve stability, we use the moving average value of thetotal audio stream number at time t, denoted by Ni;t. Ni;t

can be computed by the exponential smoothing algorithmas follows:

Ni;t $ ! 'Ni;t(1 ) %1( !& ' !i%t&; 0 < !< 1: %1&

For conciseness, we omit the t inNi;t and useNi to representthe moving average value of the total audio stream numberat current time t.

Since peer hosts are often resource constrained, they canonly process a limited number of audio streams whilekeeping up with the input stream rate without droppingdata. Let us consider the mixerMi located on the peer host vithat can process atmostCi streams. ThemixerMi triggers thesplitting process if the number of arriving audio streamsexceeds its processing limit, that is, Ni > Ci. If the over-loaded mixer Mi is not the root mixer, it splits itself intotwo mixers Mi;1 and Mi;2, illustrated in Fig. 4a. One of themMi;1 remains on the host vi and is assigned a subset of thechildren of Mi whose aggregate workload is dCi

2 e. The rest ofthe children are assigned to the newmixerMi;2. The peer hostvi then selects its most lightly loaded neighbor vj to hostMi;2.If the workload ofMi;2 still exceeds the processing limit of vj,the mixer Mi;2 continues to split itself until the workload ofeach new mixer is within the processing limit of its hostingpeer. Note that the above process may trigger the parentof Mi to split since the number of its children is increased.

If the overloaded mixer Mi is the root mixer, that is,Mi $ M0, the peer host vi first creates a new mixer M1 andtransfers all the children of M0 to M1, illustrated in Fig. 4b.The new mixer M1 then becomes the only child ofM0 and ismigrated to one of the neighbors of vi that has the largestavailable stream processing capacity. By doing so, theheight of the mixing tree is thus increased by one. Let usassume that M1 is placed on the peer host vj. If theworkload of M1 still exceeds the capacity of vj, M1 performsthe same splitting as the previous case since M1 is not the

root mixer. All spawned new mixers become the children ofthe root mixer M0. To minimize the average workload forall input streams, we distribute the children of Mi to eachnew spawned mixers Mi;1 . . . ;Mi;k based on the data arrivaltime series A1; . . . ; An. We calculate the correlation coeffi-cient between every two data arrival time series Ai and Aj,which indicates the possibility of concurrent data arrivalson the input ports Ii and Ij. We then allocate the leastcorrelated input streams to the same mixer to minimize theaverage aggregate workload at each mixer.

3.3 Mixer MergingWe now present the mixer merging algorithm illustrated inFig. 4. The mixer merging process can effectively shrink themixing tree to avoid excessive audio mixing overhead(delay and packet loss) by minimizing the number of mixerstraversed by the audio streams. Similar to the mixersplitting process, each mixer Mi monitors the number ofaudio streams that concurrently arrived at its input ports. Ifthe total workload Ni is significantly less than the mixer’sprocessing capacity Ci (for example, Ni < bCi

2 c), the mixerseeks to merge with its succeeding sibling Mj in the mixingtree. If the aggregate workload of Mi and Mj is withinthe processing limit of a single mixer, that is,Ni )Nj # max%Ci; Cj&, we merge the two mixers into onemixer. If Ci # Cj, we delete Mi and connect the children ofMi to Mj. Otherwise, we delete Mj and connect the childrenof Mj to Mi. Note that the above process may trigger theparent of Mi and Mj to perform mixer merging since theinput stream number of the parent mixer decreases. If amixer Mi becomes the only child of its parent mixer Mp, we


Fig. 4. Mixer splitting and merging operations. (a) Nonroot mixersplitting. (b) Root mixer splitting. (c) Nonroot mixer merging. (d) Rootmixer merging.

can merge Mi with Mp to reduce the height of the mixingtree. The situation occurs when the children of Mp mergewith each other into one mixer. Fig. 5 shows thepseudocode of the mixer merging algorithm. To avoidsystem thrashing between mixer splitting and mixermerging, peerTalk requires that mixer merging cannot betriggered within a certain time threshold if the mixer is justpartitioned from the other mixer.

3.4 Mixer MigrationThe peerTalk system performs dynamic mixer migrationto continuously optimize the audio mixing process. Wecan migrate a mixer Mi from a peer host vi to one of theneighbors of vi if the neighbor peer is better in terms of1) a larger stream processing capacity because of moreabundant CPU, memory, and network bandwidth re-sources, 2) better network connection (that is, less delayor packet loss) from the children of Mi to Mi and thenfrom Mi to the parent of Mi, and 3) higher availability[9]. Each of these criteria can lead to different peer hostcomparison results. Thus, the peerTalk system allows theupper level application to prioritize these different criteriafor customized decision making. For illustration, let usassume that criteria 1, 2, and 3 have decreasing priorities.

Each mixer Mi on the peer host vi periodically probesthe neighbor hosts of vi in the overlay mesh to decidewhether migration should be triggered. Let us assume thatvi has k neighbors v1; . . . ; vk. The mixer Mi sends theaddresses of its parent Mp and children M1; . . . ;Mn to all ofits neighbor hosts vj, 1 # j # k. The mixer Mi then askseach neighbor to return a set of information including 1) thecurrent stream processing capacity, 2) the average delay/packet loss from M1; . . . ;Mn to vj and from vj to Mp, and3) the failure probability of vj. The mixer Mi first selectsqualified neighbor hosts whose processing capacity cansatisfy the current workload of Mi. If qualified neighborhosts exist, Mi further selects the best neighbor host thathas 1) the minimum worst case delay/packet loss and2) the lowest failure/departure probability. If the bestneighbor host is significantly better than the current host vi,the mixer Mi is migrated to the selected neighbor host.2

To achieve smooth mixer migration, the system firstcreates a new mixer M 0

i on the selected neighbor host andconnects M 0

i to the parent of Mi and the children of Mi. Inthe meantime, the system still uses Mi to serve the currentMVoIP session. When M 0

i finishes the setup, the children ofMi is notified to send audio streams toM 0

i. The old mixerMi

is then deleted. Since the mixer M 0i may be instantiated on a

more powerful peer host, the mixer migration can trigger amixer merging process. Hence, mixer migration can notonly improve the performance of the current mixing treebut also help to consolidate the mixing tree so as to reduceintermediate mixers during the stream mixing process.

4 FAILURE RESILIENCE MANAGEMENT

We now present a set of lightweight schemes to improve thesystem’s resilience to peer failures and churns. By failureresilience, we mean that the system should be able toquickly recover an MVoIP session from end-system ornetwork failures with minimum service interruption.Compared to dedicated servers, peer hosts are more proneto failures. Hence, failure resilience management becomesparticularly important in P2P environments.3

4.1 Mixer Replication

We design a proactive replication-based failure recoverymechanism to tolerate fail-stop failures of networks andpeer hosts, illustrated in Fig. 6. Different from a reactiveapproach that dynamically finds a replacement for theprimary upon failure, our replication-based approach isproactive by maintaining a number of backups in advance.For example, in Fig. 6, each of the three mixers M0, M1, andM2 maintains one backup mixer for itself. During the VoIPsession, no audio data are sent to the backup mixer.However, the primary mixer needs to periodically probeits backup mixers to monitor their liveness and resourceavailability. The motivation of the proactive approach istwofold. First, P2P environments provide plentiful redun-dant resources for hosting backup replicas. Second, theproactive approach can avoid constructing a new mixingtree on the fly if backup mixers are still usable. Thus, we canachieve fast failure recovery for time-sensitive VoIP services.Each mixer in the mixing tree, called the primary, maintainsa number of backup replicas on different peer hosts.


Fig. 5. Mixer merging algorithm.

2. For stability, mixer migration is triggered only if the performance ofthe neighbor host is better than that of the current host by a certainthreshold value.

3. We can leverage previous resilient overlay multicast solutions (forexample, [8] and [34]) to achieve failure resilience in the distribution phase.Thus, our research focuses on the mixing phase of MVoIP service delivery.

Fig. 6. Failure recovery in a mixing tree.

Let us assume that a primary mixer wants to maintaink backup mixers. As we mentioned before, each mixerperiodically probes its neighbor hosts to decide whetherone of them is better for hosting the mixer. At the sametime, the primary mixer can identify k qualified peer hoststo host replicas. If less than k qualified peer hosts arefound, the primary mixer probes the neighbors of itsneighbors until k replicas are instantiated. During runtime,the primary mixer periodically probes its replicas to checktheir liveness and update the states of all replicas. If one ofthe replicas becomes unavailable, the primary mixer triesto find another qualified peer host in its nearby neighbor-hood to host the replica. When the primary mixer ismigrated to a new peer host, the replicas are also migratedto the neighbors of the new peer host to assure thatbackups are still close to the primary for localized replicamaintenance.

The number of replicas represents the trade-off betweenfailure resilience and replication overhead. If the primarymaintains k replicas up all the time, the primary can survivek( 1 concurrent replica failures. Note that the roles ofdifferent mixers are nonuniform to the failure resilience ofthe mixing tree. The higher level mixers in the mixing treeare more important than the lower level mixers becausethey are responsible for aggregating the output streams ofthose lower level mixers. Thus, we propose a differentiatedmixer replication scheme to maintain more replicas forhigher level mixers in the mixing tree. The motivation ofdifferentiated replication is to maximize the overall failureresilience of the MVoIP service under limited replicationoverhead.

4.2 Failure Detection

The failure of the mixing tree can be caused by eithernetwork failures between peer hosts or end-system failures.We do not distinguish graceful failures (quitting withnotification) from fail-stop failures (crashes/quiet leaving),although the graceful failures can be handled moreefficiently. For example, we can request the quitting peerto continue working until the system finishes switching toone of its replicas.

When replicas stop receiving the heartbeat messagesfrom the primary, they assume that the primary fails.4

Replicas then execute an election algorithm to reach aconsensus on which replica should take over based on apredefined election criteria (for example, smallest peeridentifier). The elected replica then contacts the parentand the children of the failed primary mixer. The parentand the children of the failed mixer then drop theconnections to the failed primary mixer and connect tothe new primary mixer.5 For example, in Fig. 6, when theprimary mixer M2 fails, the replica M 0

2 takes over theaudio mixing process for the participants e, f , and g andconnects to the parent mixer M0.

4.3 Churn Management

In contrast to conventional client-server systems, P2Psystems exhibit a high rate of continuous node arrivalsand departures, which is called churn. The peerTalk systemreacts to churn according to the different roles of peers inthe MVoIP service:

1. Participant. This produces and receives audiostreams.

2. Overlay router. This provides application-level for-warding in the overlay mesh.

3. Mixer. This provides the audio mixing service.4. Distributor. This distributes audio streams to multi-

ple receivers.5. Backup. This hosts mixer replicas.

Peer joins. When a peer wants to join an existing MVoIPsession, it is first incorporated into the P2P overlay mesh byan out-of-band bootstrap mechanism [15]. The peer selects afew peer hosts provided by the bootstrap service asneighbors and also requests a few other peers to add itselfas a neighbor. After the peer successfully joins the overlaymesh, it becomes an overlay router that can forward packetsfor its neighbors. The peer then broadcasts a message toother peers via the overlay mesh requesting to join theMVoIP session. The peer can acquire the session ID from thebootstrap service. If any peer that is already in the sessionreceives the requesting message, it replies to the messagewith the address of the mixer Mi to which it is connected.The peer then connects to the mixer Mi according to thefirst reply it receives and ignores other later replies. Thus,the peer is successfully added into the mixing tree bybecoming a child ofMi. The overlay multicast algorithm canconnect the new peer into the distribution tree. While thepeer stays in the system, the peer can be selected to play therole of a mixer, a distributor, or a backup.

Peer departures. When a peer vi leaves the systemwithout prior notice (that is, crash/disconnection), thesystem first needs to repair the overlay mesh and updatesmembership lists on other live peers. The neighbors of vican detect the departure of vi after they stop receivingthe heartbeat messages from vi for an extended period. Thesystem then updates the mesh by deleting vi from theneighbor lists of all other live peers. The mesh can becomepartitioned because of the departure of vi. The system canrepair the partitioned mesh by adding more overlay links atpartitioned peers [15]. If vi also hosts a primary mixer Mi,the departure of vi triggers dynamic failure recovery torepair the mixing tree with a replica ofMi. If vi only acts as abackup for a primary mixer Mi, the departure of vi causesMi to create a new backup replica.

5 EXPERIMENTAL EVALUATION

We now present an experimental evaluation of the peerTalksystem. We ran large-scale experiments on a networksimulation environment and prototype experiments on thePlanetLab Internet testbed [27]. Our results demonstrate that1) peerTalk that employs DDP can achieve better MVoIPservice quality than CDP and overlay multicast, two existingstate-of-the-art schemes, 2) peerTalk can simultaneouslyachieve both low resource contention and a short network


4. The heartbeat messages are small messages sent with high frequencyto ensure timely failure detection.

5. The session transition may cause VoIP service glitch. To further reducethe failure impact, we can incorporate the failure prediction mechanism intothe system to initiate the session transition protocol before the primary fails,which however is beyond the scope of this paper.

delay, whereas CDP has a long network delay, and overlaymulticast tends to incur high bandwidth congestion, and3) peerTalk can achieve failure resilience under P2P systemchurn by just maintaining a few backup mixers.

5.1 Evaluation Methodology

We have implemented a prototype of the peerTalk systemand tested it on both simulation environments and thePlanetLab Internet testbed [27]. The simulator performspacket-level discrete-event network simulation. The simula-tor uses thedegree-based Internet topologygenerator Inet-3.0[41], [39] to generate a 5,120-node power-law graph torepresent the IP physical network. The delay of each physicallink is distributed in the range of [8, 12] ms similar to [15],which is proportional to the euclidean distance betweentwo end points. The bandwidth of each edge network link isdistributed in the range of [256k, 10M] bps according to thecapacity of current residential access networks (for example,ADSL and cable networks). We also emulate asymmetricresidential access networks (for example, ADSL, cablenetworks) where the upload bandwidth is smaller than thedownload bandwidth. The inbound or outbound bandwidthof a core network node is proportional to the number of itsinbound or outbound physical links. We have conductedexperiments on different physical networks where linkbandwidth follows either a uniform or a Zipf distribution.

To emulate mixer processing delays and peer relayingdelays, each overlay node is configuredwith a certainmixingor relaying capacity denoting the amount of data the overlaynode can mix or relay per second. We assign varied capacityvalues to different hosts to emulate heterogeneous environ-ments. We then randomly select a number of stub nodes asapplication end points (that is, peer hosts). Each peer host israndomly connected to [5, 10] other peers as neighbors toemulate a scalable overlay mesh with low node degrees. Theoverlay topology is connected using the short-long algorithmpresented in [31]. The simulator emulates packet routing atboth the IP layer and the overlay layer using the Dijkstrashortest path algorithm based on the delay metric.

To demonstrate the efficiency of peerTalk, we compareour approach with CDP [28], [22], [10] and overlay multicast[15]. The CDP algorithm first selects the best multicast treeamong all peers similar to the peerTalk system. However,the CDP algorithm uses the same tree for both streammixing and stream distribution. The overlay multicast usesthe DVMRP algorithm [16] to construct multicast trees ontop of the overlay mesh.

A previous study indicates that delay and loss are thekey factors that decide the user’s perception about the voicequality [23]. Hence, we use the following metrics to evaluatethe service quality of an MVoIP service session:

1. Link stress. This is the link stress over all utilizedphysical links where the link stress of one physicallink is defined as RequiredBandwidth

AvaliableBandwidth . Higher link stressimplies a larger network queuing delay and lossprobability.

2. Node stress. This is the node stress over all utilizedpeer hosts where the node stress of one peer host isdefined as the total amount of audio data the peerhost needs to process over its processing capacity.

Larger node stress implies a larger stream proces-sing delay and loss probability at peer hosts.

3. Propagation delay of an MVoIP session. This isdefined as the mean propagation delay from allactive speakers to all listeners where each propaga-tion delay denotes the network propagation delayover the network path for each voice packettravelling from one speaker to one listener.

4. Service delay of an MVoIP session. This is defined asthe mean service delay from all active speakers to alllisteners where each service delay includes networkpropagation delays, peer mixing delays, and peerdistribution delays.6

We use a range of different workloads to evaluate theperformance of the peerTalk system. The voice encodingsfollow the G.711 standard [23] with 64-Kbps codec bit rate,80-byte codec sample size, and 10-ms codec sample intervalEach packet includes 40 bytes for IP/UDP/RTP headersand 160 bytes for voice payload. The stream rate is50 packets per second. Thus, the total bandwidth perconnection is 80-Kbps. We use two different models toemulate the speaking activities:

1. ExplicitON/OFFmodel. Thismodel directly adjusts thenumber of active speakers to reflect speaking activitychanges. The activity of each active speaker alternatesbetweenONperiods andOFFperiods.During theONperiod, a stream of voice packets is generated,whereas no data is generated during the OFF period.The durations of the ON period and the OFF periodare generated from two exponential distributionsbased on a previous experimental study [23].

2. Real VoIP conversation data. These use real telephonyconversations from switchboard data [17], whichconsist of 500 pairs of conversations for a total of1,000 voice streams. Each conversation session lasts300 seconds. The original voice data have beenconverted to VoIP packets and consisted of multiplepairs of users conversing on diverse topics. Unlessotherwise specified, each simulation run lasts300 seconds and has a certain warm-up period forthe system to reach its stable performance.

5.2 Simulation Results

In the first set of experiments,we evaluate the performance ofthe peerTalk system under different session sizes, illustratedin Figs. 7, 8, 9, and 10. The overlay network consists of800 peers.We instantiate threeMVoIP sessions concurrently,where the session size ranges from [50, 500] peers. Theworkload is generated using the explicit ON/OFFmodel thatrandomly selects 10 percent of session members as activespeakers.7 Fig. 7 shows the average link stress on all thephysical links used by the three running MVoIP sessionsunderdifferent algorithms.We conducted experimentsusingboth uniform and Zipf network bandwidth distributions.


6. The simulator emulates the propagation delay on physical links butdoes not emulate the queuing delay, packet losses, or cross traffic forachieving large-scale simulations.

7. The advantage of peerTalk is even more prominent under heavierstream workload with a larger number of active speakers, which is shownby the second set of experimental results.

Weobserve that althoughpeerTalk typically employs smallermixing trees than CDP, peerTalk can achieve similar linkstress as CDP by employing explicit load balancing. Bothapproaches can achieve much lower link stress than themulticast algorithm, especially under large session sizes. Thelink stress reduction is evenmore prominent for the networkwith Zipf bandwidth distribution. The reason is that bothpeerTalk andCDPemploy amultistreamaudiomixingphasethat can greatly reduce the number of concurrent audiostreams distributed across networks. This result indicatesthat both peerTalk and CDP incur much lower networkcongestion than the multicast approach, which implies alower network queuing delay and packet loss rate. Similarly,both peerTalk and CDP imposemuch lower node stress thanthe multicast approach, shown in Fig. 8. Compared to themulticast scheme, both peerTalk and CDP have an extramixing phase.Weneed to evaluatewhether themixing phasecauses a significant increase to the network propagationdelay during the audio stream delivery. Fig. 9 shows theaverage network propagation delays achieved by different

algorithms. The average network propagation delay iscalculated among all the audio packets that are transmittedfrom all speakers to all listeners. We observe that peerTalkhas amuch lower propagation delay than the CDP algorithmby using separate trees for mixing and distribution phases.Fig. 10 shows the average service delay achieved by differentalgorithms as we increase the session size. The service delayincludes the network propagation delay, peer mixing delay,and peer distribution delay. The results show that peerTalkconsistently achieves a lower service delay than the CDP andmulticast approaches. Note that the real service delay of themulticast approach will be higher if we add the networkqueuing delay, which can be induced from the link stressresults. The results show the advantage of the decoupledprocessing model and the adaptive stream mixing schemeemployed by the peerTalk system.

Our second set of experiments compare the performanceof different algorithms under different numbers of activespeakers, shown in Figs. 11, 12, 13, and 14. The number ofactive speakers is controlled by a speaker ratio that denotesthe percentage of session members as active speakers. Every


Fig. 7. Link stress under different session sizes.

Fig. 8. Node stress under different session sizes.

Fig. 9. Propagation delay under different session sizes.

Fig. 10. Service delay under different session sizes.

Fig. 11. Link stress under different speaker ratios.

Fig. 12. Node stress under different speaker ratios.

Fig. 13. Propagation delay under different speaker ratios.

Fig. 14. Service delay under different speaker ratios.

10 seconds, we randomly select a number of sessionmembers as active speakers. Similar to the first set ofexperiments, we use an 800-node overlay network andconcurrently run three MVoIP sessions. Each sessionincludes 100 peers with [5 percent, 30 percent] randomlyselected active speakers. Fig. 11 shows that both peerTalkand CDP have much lower link stress than multicast byemploying audio mixing, especially under high speakerratios. In Fig. 12, we observe that peerTalk can achieve lowernode stress than CDP because of its inherent load balancingcapability. Fig. 13 shows that peerTalk has a much lowernetwork propagation delay than CDP and adaptivelyexpands the mixing tree as the speaker ratio increases.Finally, Fig. 14 shows the total service delay achieved bydifferent algorithms. We observe that peerTalk can consis-tently achieve a lower service delay than CDP and multicastapproaches. Under low speaker ratios, peerTalk can employa small mixing tree to avoid excessive mixing delays; underhigh speaker ratios, peerTalk can adaptively expand themixing tree to handle high stream workloads.

Our third set of experiments studies how differentalgorithms scale as we gradually increase the number ofconcurrent sessions running on top of the overlay system,illustrated in Figs. 15, 16, 17, and 18. In this set of experiments,we use an 800-node overlay network. Each session includes50 randomly selected peers with 10 percent randomlyselected peers as active speakers. Similar to the previoustwo experiments, both peerTalk and CDP incur lower linkstress and node stress than the multicast approach. Further,peerTalk achieves lower node stress thanCDPbyperforming

explicit load balancing using mixer migration. Overall,peerTalk consistently achieves a lower service delay thanCDP and multicast. We also observe that the service delay ofthe multicast approach increases much faster than those ofpeerTalk and CDP as more sessions are created on top of theoverlay system. This results show that audio mixing isnecessary in order to achieve scalable MVoIP services overP2P overlay networks.

We have also compared the performance of differentalgorithms using real VoIP conversation data. We use a400-node overlay network and instantiate three MVoIPsessions concurrently on top of the overlay network. Eachsession consists of 20 peers. The speaking activity of eachpeer pair is the playback of one conversation trace selectedfrom the 500 pairs of conversation trace files. Figs. 19 and 20show the link stress and the total service delay of differentalgorithms under real workloads. The results show a similartrend as the results under synthetic workloads. BothpeerTalk and CDP can significantly reduce the link stressusing audio mixing compared to the multicast approach.Overall, peerTalk consistently achieves lower service a delaythan the other two approaches.

We now evaluate the proactive failure recovery schemesof the peerTalk system under P2P network churn where anumber of peers dynamically leave or join the system,illustrated in Figs. 21 and 22. The algorithm “backup-k”means that we maintain k backup mixers for each primarymixer. We use a 1,000-node overlay network and instantiatethree MVoIP sessions concurrently on top of the overlaynetwork. Each session consists of 100 randomly selected


Fig. 15. Link stress under different session numbers.

Fig. 16. Node stress under different session numbers.

Fig. 17. Propagation delay under different session numbers.

Fig. 18. Service delay under different session numbers.

Fig. 19. Link stress under real VoIP workloads.

Fig. 20. Service delay under real VoIP workloads.

peers with 10 percent speaking ratio. The system randomlyselects a number of departure nodes every 5 secondsaccording to a specified churn rate. During each 300-secondsimulation run, we start from a low-churning system with" $ 10 percent churn rate (that is, 10 percent of the totalsystem peers randomly leave the system8), then increase thechurn rate to 20 percent at time 100, and further increase thechurn rate to 30 percent of all nodes at time 200. The systemreconstructs the distribution tree using the DVMRP algo-rithm and repairs the overlay mesh partition by randomlyadding neighbors to the peers with few neighbors left. InFig. 21, the y-axis shows the accumulated number of failuresthat cannot be recovered by the maintained backup mixers.In Fig. 22, the y-axis shows the failure frequency that denotesthe number of failures that cannot be recovered by thepeerTalk backup scheme every second. The “backup-0”algorithm represents the reactive failure recovery approachthat takes no prevention action (that is, no backup mixers/distributors). The fault tolerance improvement (that is,failure number reduction) from “backup-0” to “backup-1”and from “backup-1” to “backup-2” is much larger than thatfrom “backup-2” to “backup-3” and from “backup-3” to“backup-4.” We observe that by maintaining four backupmixers, the system can recovermost failures even under highsystem churn (that is, up to 30 percent random failing peers).

5.3 PlanetLab Results

To evaluate the feasibility and performance of our approachunder a real Internet environment, we have deployed andevaluated the peerTalk system on the PlanetLab wide areanetwork testbed [27]. The peerTalk software at eachPlanetLab host includes five major modules:

1. Mixer manager. This executes the mixer splitting,mixer merging, and mixer migration algorithms.

2. Overlay topology manager. This maintains the overlaymesh network.

3. Monitoring module. This is responsible for monitoringthe network/service states of neighbors (for exam-ple, network delays).

4. Session manager. This maintains the peer membershipinformation about all VoIP sessions, which is builton top of the DHT system [33], [38], [30].

5. Data transmission module. This is responsible forsending, receiving, and forwarding audio data.

We used the SCRIBE software [14] to realize a P2P overlaymulticast. To evaluate the feasibility of adaptive mixing, wehave measured the average time of basic mixer adaptationoperations (that is, mixer splitting, mixer merging, andmixer migration) in the real Internet setting. Our initialresults indicate that peerTalk can finish the basic mixeradaptation operations between PlanetLab hosts within afew milliseconds.

Different from the simulation that uses an explicit modelto generate the workload, the prototype experiments usedthe ON/OFF workload model. We dynamically adjust themean duration values of the ON period and the OFF periodto emulate different speaking activities. The audio packetsfollow the standard G.711 codec requirements described inSection 5.1. Our experiments used about 50 PlanetLab hoststhat spread across the US. We instantiate two peerTalknodes on each PlanetLab host. At the beginning, each peersends a probe message to all other peers via the SCRIBEmulticast interface and measures the average delay betweenitself and all other peers. All peers then exchange with eachother the average delay from themselves to all other peers.All peers then select the best multicast tree that has theminimum average delay as the optimal distribution tree.The CDP uses the optimal tree for both mixing anddistribution. peerTalk uses the optimal tree for distributionand constructs the mixing tree using the adaptive streammixing algorithm. The overlay multicast scheme usesSCRIBE to perform multistream distribution from all activespeakers to all group members.

We first test the three different algorithms under a lightworkload condition with few concurrent active speakers.Fig. 23 shows the cumulative distribution of total delaysbetween all pairs of communicating participants using thethree different algorithms. Different from the simulation thatonly models network propagation delay, the packet delaymeasured on PlanetLab reflects all the processing andqueuing delays at both peer hosts and Internet connections.Weobserve that peerTalk achieves the best performance (thatis, shortest service delays), whereas CDP has the worstperformance. The reason is that under a light workloadcondition, the advantage of audio mixing is not significant,


8. Some nodes will be dynamically added back to the system to keep thenumber of live nodes in the system at a constant level of %1( "& 'N .

Fig. 21. Total failure number under system churn.

Fig. 22. Failure frequency under system churn.

Fig. 23. Packet delays under a light workload.

and CDP suffers from a large mixing delay. Besides packetdelays, the quality of VoIP services is also affected by theinterpacket delay jitter [23]. The delay jitter describes thevariations of interpacket delays. Thus, we also measuredthe delay jitter result during the above experiment, which isillustrated in Fig. 24. We observe that peerTalk can alsoachieve better delay jitters than the other two schemes.

We then increase the system workload by increasing thenumber of concurrent active speakers. Figs. 25 and 26 showthe cumulative distribution of total delays and delay jittersachieved by different algorithms under a heavy workload.We observe that the effect of audio mixing becomessignificant, and peerTalk can achieve a much lower delaythan the other two alternatives. The multicast approach iscompletely overloaded by the number of concurrentstreams, which have excessive total delays. The experi-mental results validate our hypothesis that adaptive audiomixing can greatly reduce network and stream processingdelays by reducing the link stress and node stress. Such animprovement can offset the small extra mixing delay with alarge margin in most cases compared to the multicastapproach. Since peerTalk tends to perform more adapta-tions under a heavy workload, peerTalk has slightly largerdelay jitters than the other two schemes. However, suchdifference is marginal. Thus, we conclude that peerTalk canperform better than the two state-of-the-art approaches inreal Internet environments.

6 RELATED WORK

In this section, we compare peerTalk with related work thatis classified into three major categories: 1) VoIP systems,2) P2P systems, and 3) distributed multimedia systems.

VoIP systems. Recently, VoIP systems have received a lotof research attention. Much of previous work has beendevoted to evaluating and improving the quality oftwo-party VoIP services (for example, [23] and [40]).Ren et al. [32] proposed an Autonomous-System-awarepeer relay protocol to improve two-party VoIP quality.People have also studied the MVoIP services that presentmore challenges. For example, Rangan et al. proposedhierarchical Media Mixing architectures for supportinglarge-scale audio conferencing [29]. Lennox and Schulzrinnedeveloped a reliable MVoIP system using a full meshtopology [22]. Radenkosvic and GreenHalgh proposed aDistributed Partial Mixing approach to supporting MVoIPservices with TCP fairness [28]. Different from previouswork, the peerTalk system focuses on providing MVoIPservices in P2P environments, which provides a purelyapplication-level solution with unique features of

self-organization, adaptation to workload, and failureresilience. In [18], we have presented the basic adaptivemixer splitting and merging algorithms. This paper presentsthe complete peerTalk framework including the newalgorithms for rendezvous point election, mixer migration,and failure resilience management.

P2P systems. With the popularity of P2P file sharingsystems, P2P systems have drawn much research attention.One salient advantage of P2P systems is that they canaggregate a tremendous amount of resources in a failure-resilient and cost-efficient fashion. Previous work hasaddressed the problems of scalable data lookup usingdistributed hash tables (DHTs) (for example, [33], [38], and[30]) and incentive engineering (for example, [25] and [11])for providing efficient P2P data sharing. Inspired by P2P filesharing systems, researchers have proposed many otherP2P applications such as P2P content delivery (for example,[21], [13], and [12]), P2P file systems (for example, [3]), andP2P storage systems (for example, [4]). While peerTalk canbenefit from many previous P2P research results, ourresearch is orthogonal to previous work. Our work morefocuses on exploring the specific properties and require-ments of MVoIP services. To the best of our knowledge, ourwork is the first study on using P2P stream processing forMVoIP services, which we believe could be a new killerapplication for the P2P technology.

Distributed multimedia systems. Much multimediaprocessing needs to be performed in a distributed fashion.For example, Amir et al. [5] proposed the active serviceframework and applied it on a media transcoding gatewayservice. In [26], Ooi and Renesse proposed a framework todecompose a computation into subcomputations and assignthem to multiple gateways. In [24], Nahrstedt et al.proposed an Hourglass-based system to deliver compositemultimedia content to users in pervasive computingenvironments. The peerTalk system is similar to the abovework in terms of distributing media processing amongmultiple hosts. However, instead of considering genericmedia processing, our work more focuses on P2P audiostream processing for providing MVoIP services. Thus, the


Fig. 24. Delay jitters under a light workload. Fig. 25. Packet delays under a heavy workload.

Fig. 26. Delay jitters under a heavy workload.

new contribution of the peerTalk system is to organize andadapt the audio stream processing based on the uniquefeatures of MVoIP services.

7 CONCLUSION

Traditionally, MVoIP services use a collection of multicasttrees or a centralized audio mixing server. In this paper, weargue that a P2P MVoIP system can achieve betterscalability and cost-effectiveness by adaptively and effi-ciently distributing the stream processing workload amongdifferent peers. To the best of our knowledge, this is the firstwork that studied the P2P system design for the MVoIPapplication, which we believe could be a new killerapplication for the P2P technology. Specifically, this papermakes the following contributions: 1) we propose a noveldecoupled stream processing model that can better explore theasymmetric property of MVoIP services and optimize thestream mixing and distribution processes separately, 2) weprovide localized mixer splitting/merging/migration algo-rithms to continuously optimize the quality of the MVoIPservices according to speaking activity changes, and 3) wepropose lightweight backup schemes to make peerTalkresilient to peer failures/departures by utilizing redundantresources in P2P environments. We have implemented aprototype of the peerTalk system that are evaluated in bothreal-world wide area networks and simulated P2P net-works. Our results show that peerTalk can combine theadvantages of two state-of-the-art approaches (that is,multicast and audio mixing) while overcoming theirdisadvantages for providing MVoIP services.

REFERENCES

[1] The Second Life 3D On-Line Digital World, http://secondlife.com/,2006.

[2] The Skype Internet Telephony System, http://www.skype.com/,2006.

[3] T. Gil, A. Muthitacharoen, R. Morris, and B. Chen, “Ivy: A Read/Write Peer-to-Peer File System,” Proc. Fifth Symp. OperatingSystems Design and Implementation (OSDI ’02), Dec. 2002.

[4] A. Adya et al., “FARSITE: Federated, Available, and ReliableStorage for an Incompletely Trusted Environment,” Proc. FifthSymp. Operating Systems Design and Implementation (OSDI ’02),Dec. 2002.

[5] E. Amir, S. McCanne, and R.H. Katz, “An Active ServiceFramework and Its Application to Real-Time Multimedia Trans-coding,” Proc. ACM SIGCOMM ’98, Oct. 1998.

[6] D. Andersen, H. Balakrishnan, F. Kaashoek, and R. Morris,“Resilient Overlay Networks,” Proc. 18th ACM Symp. OperatingSystems Principles (SOSP ’01), Oct. 2001.

[7] S. Banerjee, B. Bhattacharjee, and C. Kommareddy, “ScalableApplication Layer Multicast,” Proc. ACM SIGCOMM ’02,Aug. 2002.

[8] S. Banerjee, S. Lee, B. Bhattacharjee, and A. Srinivasan, “ResilientMulticast UsingOverlays,” Proc. ACMSIGMETRICS ’03, June 2003.

[9] R. Bhagwan, K. Tati, Y. Cheng, S. Savage, and G.M. Voelker,“TotalRecall: System Support for Automated Availability Manage-ment,” Proc. ACM/Usenix Symp. Networked Systems Design andImplementation (NSDI ’04), Mar. 2004.

[10] A. Bharambe, V. Padmanabhan, and S. Seshan, “SupportingSpectators in Online Multiplayer Games,” Proc. Third ACMWorkshop Hot Topics in Networks (HotNets ’04), Nov. 2004.

[11] A. Blanc, Y.-K. Liu, and A. Vahdat, “Designing Incentives forPeer-to-Peer Routing,” Proc. IEEE INFOCOM ’05, Mar. 2005.

[12] J.W. Byers, J. Considine, M. Mitzenmacher, and S. Rost, “InformedContent Delivery Across Adaptive Overlay Networks,” IEEE/ACM Trans. Networking, vol. 12, no. 5, pp. 767-780, Oct. 2004.

[13] M. Castro, P. Druschel, A.-M. Kermarrec, A. Nandi, A. Rowstron,and A. Singh, “SplitStream: High-Bandwidth Multicast in Co-operative Environments,” Proc. 19th ACM Symp. Operating SystemsPrinciples (SOSP ’03), Oct. 2003.

[14] M. Castro, P. Druschel, A.-M. Kermarrec, and A. Rowstron,“SCRIBE: A Large-Scale and Decentralized Application-LevelMulticast Infrastructure,” Proc. ACM SIGCOMM ’01, Aug. 2001.

[15] Y.-H. Chu, S.G. Rao, S. Seshan, and H. Zhang, “EnablingConferencing Applications on the Internet Using an OverlayMulticast Architecture,” Proc. ACM SIGCOMM ’01, Aug. 2001.

[16] S. Deering, “Multicast Routing in Internetworks and ExtendedLANS,” Proc. ACM SIGCOMM ’88, Aug. 1988.

[17] D. Graff, K. Walker, and A. Canavan, Switchboard-2 Phase II,LDC 99S79, http://www.ldc.upenn.edu/Catalog/, 1999.

[18] X. Gu, Z. Wen, P.S. Yu, and Z.-Y. Shae, “Supporting Multi-PartyVoice-over-IP Services with Peer-to-Peer Stream Processing,” Proc.13th ACM Int’l Conf. Multimedia (Multimedia ’05), Nov. 2005.

[19] Iperf, Iperf Bandwidth Measurement Tool, http://dast.nlanr.net/Projects/Iperf/, 2006.

[20] B. Knutsson, H. Lu, W. Xu, and B. Hopkins, “Peer-to-Peer Supportfor Massively Multiplayer Games,” Proc. IEEE INFOCOM ’04,Mar. 2004.

[21] D. Kostic, A. Rodriguez, J. Albrecht, and A. Vahdat, “Bullet: HighBandwidth Data Dissemination Using an Overlay Mesh,” Proc.19th ACM Symp. Operating Systems Principles (SOSP ’03), Oct. 2003.

[22] J. Lennox and H. Schulzrinne, “A Protocol for ReliableDecentralized Conferencing,” Proc. 13th Int’l Workshop Networkand Operating Systems Support for Digital Audio and Video(NOSSDAV ’03), June 2003.

[23] A. Markopoulou, F. Tobagi, and M. Karam, “Assessment ofVoIP Quality over Internet Backbones,” IEEE Trans. Networking,Oct. 2003.

[24] K. Nahrstedt, B. Yu, J. Liang, and Y. Cui, “Hourglass MultimediaContent and Service Composition Framework for Smart RoomEnvironments,” Elsevier J. Pervasive and Mobile Computing, 2005.

[25] T.-W. Ngan, D. Wallach, and P. Druschel, “Enforcing Fair Sharingof Peer-to-Peer Resources,” Proc. Second Int’l Workshop Peer-to-PeerSystems (IPTPS ’03), Feb. 2003.

[26] W.T. Ooi and R.V. Renesse, “Distributing Media Transformationover Multiple Media Gateways,” Proc. Ninth ACM Int’l MultimediaConf. (Multimedia ’01), Sept. 2001.

[27] L. Peterson, T. Anderson, D. Culler, and T. Roscoet, “A Blueprintfor Introducing Disruptive Change in the Internet,” Proc. FirstACM Workshop Hot Topics in Networks (HotNets ’02), Oct. 2002.

[28] M. Radenkosvic and C. GreenHalgh, “Multi-Party DistributedAudio Service with TCP Fairness,” Proc. 10th ACM Int’l Conf.Multimedia (Multimedia ’02), Dec. 2002.

[29] P.V. Rangan, H.M. Vin, and S. Ramanathan, “CommunicationArchitectures and Algorithms for Media Mixing in MultimediaConferences,” IEEE/ACM Trans. Networking, vol. 1, no. 1, Feb. 1993.

[30] S. Ratnasamy, P. Francis, M. Handley, R. Karp, and S. Shenker,“A Scalable Content-Addressable Network,” Proc. ACMSIGCOMM, 2001.

[31] S. Ratnasamy, M. Handley, R. Karp, and S. Shenker,“Topologically-Aware Overlay Construction and Server Selec-tion,” Proc. IEEE INFOCOM ’02, June 2002.

[32] S. Ren, L. Guo, and X. Zhang, “ASAP: An AS-Aware Peer-RelayProtocol for High Quality VoIP,” Proc. 26th IEEE Int’l Conf.Distributed Computing Systems (ICDCS ’06), July 2006.

[33] A. Rowstron and P. Druschel, “Pastry: Scalable, Distributed ObjectLocation and Routing for Large-Scale Peer-to-Peer Systems,” Proc.IFIP/ACM Int’l Conf. Distributed Systems Platforms (Middleware ’01),Nov. 2001.

[34] D. Rubenstein, S. Kasera, D. Towsley, and J. Kurose, “ImprovingReliable Multicast Using Active Parity Encoding Services (APES),”J. Computer Networks, vol. 44, no. 1, Jan. 2004.

[35] A. Singh and A. Acharya, “Using Session Initiation Protocol toBuild Context-Aware VoIP Support for Multiplayer NetworkedGames,” Proc. ACM SIGCOMM ’04, Aug. 2004.

[36] A.C. Snoeren, K. Conley, and D.K. Gifford, “Mesh Based ContentRouting Using XML,” Proc. 18th ACM Symp. Operating SystemsPrinciples (SOSP ’01), Oct. 2001.

[37] K. Sripanidkulchai, A. Ganjam, B. Maggs, and H. Zhang, “TheFeasibility of Supporting Large-Scale Live Streaming Applica-tions with Dynamic Application End-Points,” Proc. ACMSIGCOMM ’04, Aug. 2004.


[38] I. Stoica, R.Morris, D. Karger,M.F. Kaashoek, andH. Balakrishnan,“Chord: A Scalable Peer-to-Peer Lookup Service for InternetApplications,” Proc. ACM SIGCOMM ’01, Aug. 2001.

[39] H. Tangmunarunkit, R. Govindan, S. Jamin, S. Shenker, andW. Willinger, “Network Topology Generators: Degree-Basedversus Structural,” Proc. ACM SIGCOMM ’02, Aug. 2002.

[40] S. Tao et al., “Improving VoIP Quality through Path Switching,”Proc. IEEE INFOCOM ’05, Mar. 2005.

[41] J. Winick and S. Jamin, “Inet3.0: Internet Topology Generator,”Technical Report UM-CSE-TR-456-02, http://irl.eecs.umich.edu/jamin/, 2002.

Xiaohui Gu received the BS degree incomputer science from Peking University,Beijing, and the MS and PhD degrees in2001 and 2004, respectively, from the Depart-ment of Computer Science, University ofIllinois, Urbana-Champaign. She worked atIBM T.J. Watson Research Center, Hawthorne,New York, from 2004-2007. She is currentlyan assistant professor at North CarolinaState University. Her general research interests

include systems and networking, with a current focus on intelligentsystem management, large-scale data stream processing, and peer-to-peer systems. She received an ILLIAC fellowship, the David J. KuckBest Master Thesis Award, and a Saburo Muroga Fellowship from theUniversity of Illinois, Urbana-Champaign. She received the IBM FirstInvention Award on 2004 and First Invention Plateau Award on 2006.She has served on the program committee and organizing committeein many major conferences including PerCom 2006-07, RTSS 2006,ACM Middleware Doctoral Symposium 2007, ICPS 2005-07, ACMMultimedia 2005 service composition workshop, and ACM MultimediaModeling Conference 2006. She is a member of the IEEE.

Zhen Wen received the PhD degree incomputer science from University of Illinois,Urbana-Champaign. His thesis work was onimproving human computer interaction usinghuman face avatars. He is a research staffmember at IBM T.J. Watson Research Center,Yorktown Heights, New York. His researchinterests include visualization, computer gra-phics, machine learning, pattern recognition,and multimedia systems. At IBM, his current

research focuses on adaptive context-sensitive user interfaces forinformation analysis. He received a best paper award at IUI 2005 and anIBM Research Division Award in 2005. He serves on the technicalcommittees of major conferences such as ACM Multimedia and IEEEMultimedia. He is a member of the IEEE.

Philip S. Yu received the BS degree inelectrical engineering from the National TaiwanUniversity, the MS and PhD degrees in electricalengineering from Stanford University, and theMBA degree from New York University. He iswith the IBM T.J. Watson Research Center,Yorktown Heights, New York, and is currentlythe manager of the Software Tools and Techni-ques Group. His research interests include datamining, Internet applications and technologies,

database systems, multimedia systems, parallel and distributedprocessing, and performance modeling. He has published more than500 papers in refereed journals and conference proceedings. He holdsor has applied for more than 300 US patents. He is an associate editorof ACM Transactions on the Internet Technology and ACM Transac-tions on Knowledge Discovery from Data. He is on the steeringcommittee of the IEEE Conference on Data Mining and was a memberof the IEEE Data Engineering Steering Committee. He was the editor inchief of the IEEE Transactions on Knowledge and Data Engineeringfrom 2001 to 2004, an editor, advisory board member, and also aguest coeditor of the special issue on mining of databases. He had alsoserved as an associate editor of Knowledge and Information Systems.In addition to serving as a program committee member on variousconferences, he was the program chair or cochair of the IEEEWorkshop on Scalable Stream Processing Systems (SSPS ’07), theIEEE Workshop on Mining Evolving and Streaming Data (2006), the2006 Joint Conferences of the Eighth IEEE Conference onE-Commerce Technology (CEC ’06) and the Third IEEE Conferenceon Enterprise Computing, E-Commerce and E-Services (EEE ’06),the 11th IEEE International Conference on Data Engineering, theSixth Pacific Area Conference on Knowledge Discovery and DataMining, the Ninth ACM Sigmod Workshop on Research Issues in DataMining and Knowledge Discovery, the Second IEEE InternationalWorkshop on Research Issues on Data Engineering: Transaction andQuery Processing, the PAKDD Workshop on Knowledge Discoveryfrom Advanced Databases, and the Second IEEE InternationalWorkshop on Advanced Issues of E-Commerce and Web-basedInformation Systems. He served as the general chair or cochair of the2006 ACM Conference on Information and Knowledge Management,the 14th IEEE International Conference on Data Engineering, and theSecond IEEE International Conference on Data Mining. He hasreceived several IBM honors including two IBM Outstanding InnovationAwards, an Outstanding Technical Achievement Award, two ResearchDivision Awards, and the 88th Plateau of Invention Achievement Award.He received a Research Contributions Award in the IEEE InternationalConference on Data Mining in 2003 and also an IEEE Region 1 Awardfor “promoting and perpetuating numerous new electrical engineeringconcepts” in 1999. He is an IBM Master Inventor. He is a fellow of theIEEE and the ACM.

Zon-Yin Shae received the BA and MS degreesin electronic engineering from the NationalChiao-Tung University, Taiwan, and the PhDdegree in electrical engineering from the Uni-versity of Pennsylvania, Philadelphia, in 1989.He has been with the IBM T.J. Watson ResearchCenter, Yorktown Heights, New York, since1989. He works in the areas of multimedianetworking, SIP/VoIP converged networks andapplications, multimedia data analysis, and

service computing. He has held numerous patents and was an activemember of the H323 and MPEG international standard group. He is amember of the IEEE.

. For more information on this or any other computing topic,please visit our Digital Library at www.computer.org/publications/dlib.


ieee transactions on parallel and distributed … · online gaming) that often include many...

Documents