
Live Streaming with Receiver-based Peer-division Multiplexing

Hyunseok Chang, Sugih Jamin, Wenjie Wang

Abstract: A number of commercial peer-to-peer systems for live streaming have been introduced in recent years. The behavior of these popular systems has been extensively studied in several measurement papers. Due to the proprietary nature of these commercial systems, however, these studies have to rely on a black-box approach, where packet traces are collected from a single or a limited number of measurement points, to infer various properties of traffic on the control and data planes. Although such studies are useful to compare different systems from the end-users' perspective, it is difficult to intuitively understand the observed properties without fully reverse-engineering the underlying systems. In this paper we describe the network architecture of Zattoo, one of the largest production live streaming providers in Europe at the time of writing, and present a large-scale measurement study of Zattoo using data collected by the provider. To highlight, we found that even when the Zattoo system was heavily loaded with as high as 20,000 concurrent users on a single overlay, the median channel join delay remained less than 2 to 5 seconds, and that, for a majority of users, the streamed signal lags the over-the-air broadcast signal by no more than 3 seconds.

Index Terms: Peer-to-peer system, live streaming, network architecture

    I. INTRODUCTION

There is an emerging market for IPTV. Numerous commercial systems now offer services over the Internet that are similar to traditional over-the-air, cable, or satellite TV. Live television, time-shifted programming, and content-on-demand are all presently available over the Internet. Increased broadband speed, growth of the broadband subscription base, and improved video compression technologies have contributed to the emergence of these IPTV services.

We draw a distinction between three uses of peer-to-peer (P2P) networks: delay-tolerant file download of archival material, delay-sensitive progressive download (or streaming) of archival material, and real-time live streaming. In the first case, the completion of download is elastic, depending on available bandwidth in the P2P network. The application buffer receives data as it trickles in and informs the user upon the completion of download. The user can then start playing back the file for viewing in the case of a video file. BitTorrent and its variants are examples of delay-tolerant file download systems. In the second case, video playback starts as soon as the application assesses it has sufficient data buffered that, given the estimated download rate and the playback rate, it will not deplete the buffer before the end of file. If this assessment is wrong, the application would have to either pause playback and rebuffer, or slow down playback. While users would like playback to start as soon as possible, the application has some degree of freedom in trading off playback start time against estimated network capacity. Most video-on-demand systems are examples of delay-sensitive progressive-download applications. The third case, real-time live streaming, has the most stringent delay requirement. While progressive download may tolerate initial buffering of tens of seconds or even minutes, live streaming generally cannot tolerate more than a few seconds of buffering. Taking into account the delay introduced by signal ingest and encoding, and network transmission and propagation, the live streaming system can introduce only a few seconds of buffering time end-to-end and still be considered live [1].

H. Chang is with Alcatel-Lucent Bell Labs, Holmdel, NJ 07733 USA (e-mail: [email protected]). S. Jamin is with the EECS Department, University of Michigan, Ann Arbor, MI 48109 USA (e-mail: [email protected]). W. Wang is with IBM Research, CRL, Beijing 100193, China (e-mail: [email protected]). This work was done while authors Chang and Wang were at Zattoo Inc.

The Zattoo peer-to-peer live streaming system was a free-to-use network serving over 3 million registered users in eight European countries at the time of study, with a maximum of over 60,000 concurrent users on a single channel. The system delivers live streams using a receiver-based, peer-division multiplexing scheme as described in Section II. To ensure real-time performance when peer uplink capacity is below requirement, Zattoo subsidizes the network's bandwidth requirement, as described in Section III. After delving into Zattoo's architecture in detail, we study in Sections IV and V large-scale measurements collected during the live broadcast of the UEFA European Football Championship, one of the most popular one-time events in Europe, in June 2008 [2]. During the course of the month of June 2008, Zattoo served more than 35 million sessions to more than one million distinct users. Drawing from these measurements, we report on the operational scalability of Zattoo's live streaming system along several key issues:

1) How does the system scale in terms of overlay size and its effectiveness in utilizing peers' uplink bandwidth?
2) How responsive is the system during channel switching, for example, when compared to the 3-second channel switch time of satellite TV?
3) How effective is the packet retransmission scheme in allowing a peer to recover from transient congestion?
4) How effective is the receiver-based peer-division multiplexing scheme in delivering synchronized sub-streams?
5) How effective is the global bandwidth subsidy system in provisioning for flash crowd scenarios?


6) Would a peer further away from the stream source experience adversely long lag compared to a peer closer to the stream source?
7) How effective is error-correcting code in isolating packet losses on the overlay?

We also discuss in Section VI several challenges in increasing the bandwidth contribution of Zattoo peers. Finally, we describe related work in Section VII and conclude in Section VIII.

II. SYSTEM ARCHITECTURE

The Zattoo system rebroadcasts live TV, captured from satellites, onto the Internet. The system carries each TV channel on a separate peer-to-peer delivery network and is not limited in the number of TV channels it can carry. Although a peer can freely switch from one TV channel to another, thereby departing and joining different peer-to-peer networks, it can only join one peer-to-peer network at any one time. We henceforth limit our description of the Zattoo delivery network as it pertains to carrying one TV channel. Fig. 1 shows a typical setup of a single TV channel carried on the Zattoo network. TV signal captured from satellite is encoded into H.264/AAC streams, encrypted, and sent onto the Zattoo network. The encoding server may be physically separated from the server delivering the encoded content onto the Zattoo network. For ease of exposition, we will consider the two as logically co-located on an Encoding Server. Users are required to register themselves at the Zattoo website to download a free copy of the Zattoo player application. To receive the signal of a channel, the user first authenticates itself to the Zattoo Authentication Server. Upon authentication, the user is granted a ticket with limited lifetime. The user then presents this ticket, along with the identity of the TV channel of interest, to the Zattoo Rendezvous Server. If the ticket specifies that the user is authorized to receive the signal of the said TV channel, the Rendezvous Server returns to the user a list of peers currently joined to the P2P network carrying the channel, together with a signed channel ticket. If the user is the first peer to join a channel, the list of peers it receives contains only the Encoding Server. The user joins the channel by contacting the peers returned by the Rendezvous Server, presenting its channel ticket, and obtaining the live stream of the channel from them (see Section II-A for details).

Each live stream is sent out by the Encoding Server as n logical sub-streams. The signal received from satellite is encoded into a variable-bit rate stream. During periods of source quiescence, no data is generated. During source busy periods, generated data is packetized into a packet stream, with each packet limited to a maximum size. The Encoding Server multiplexes this packet stream onto the Zattoo network as n logical sub-streams. Thus the first packet generated is considered part of the first sub-stream, the second packet that of the second sub-stream, the n-th packet that of the n-th sub-stream. The (n+1)-th packet cycles back to the first sub-stream, and so on, such that the i-th sub-stream carries the (mn+i)-th packets, where m >= 0, 1 <= i <= n, and n is a user-defined constant.

Fig. 1. Zattoo delivery network architecture (Encoding Servers, Demultiplexer, Rendezvous Server, Authentication Server, Feedback Server, and other administrative servers).

We call a set of n packets with the same index multiplier m a segment. Thus m serves as the segment index, while i serves as the packet index within a segment. Each segment is of size n packets. Being the packet index, i also serves as the sub-stream index. The number mn + i is carried in each packet as its sequence number. Zattoo uses the Reed-Solomon (RS) error correcting code (ECC) for forward error correction. The RS code is a systematic code: of the n packets sent per segment, k < n packets carry the live stream data while the remainder carries the redundant data [3, Section 7.3]. Due to the variable-bit rate nature of the data stream, the time period covered by a segment is variable, and a packet may be of size less than the maximum packet size. A packet smaller than the maximum packet size is zero-padded to the maximum packet size for the purposes of computing the (shortened) RS code, but is transmitted in its original size. Once a peer has received k packets per segment, it can reconstruct the remaining n - k packets. We do not differentiate between streaming data and redundant data in our discussion in the remainder of this paper.
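To make the indexing concrete, the following Python sketch (illustrative only; the function names are ours, and the constants n = 16 and k = 13 are the values used in the deployment discussed in Section V-A) maps a packet sequence number to its segment and sub-stream indices and checks when a segment has enough packets for RS reconstruction:

```python
# Illustrative sketch of the sequence-number indexing, not Zattoo's actual implementation.
N_SUBSTREAMS = 16   # n: sub-streams per channel (deployment value, Section V-A)
K_DATA = 13         # k: data packets per segment; n - k = 3 carry RS redundancy

def locate(seq_no: int) -> tuple[int, int]:
    """Return (segment index m, sub-stream index i) for a packet, where seq_no = m*n + i, 1 <= i <= n."""
    m, i = divmod(seq_no - 1, N_SUBSTREAMS)
    return m, i + 1

def segment_recoverable(received_in_segment: int) -> bool:
    """A segment can be fully reconstructed once any k of its n packets have arrived."""
    return received_in_segment >= K_DATA
```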

When a new peer requests to join an existing peer, it specifies the sub-stream(s) it would like to receive from the existing peer. These sub-streams do not have to be consecutive. Contingent upon availability of bandwidth at existing peers, the receiving peer decides how to multiplex a stream onto its set of neighboring peers, giving rise to our description of the Zattoo live streaming protocol as a receiver-based, peer-division multiplexing protocol. The details of peer-division multiplexing are described in Section II-A, while the details of how a peer manages sub-stream forwarding and stream reconstruction are described in Section II-B. Receiver-based peer-division multiplexing has also been used by the latest version of the CoolStreaming peer-to-peer protocol, though it differs from Zattoo in its stream management (Section II-B) and adaptive behavior (Section II-C) [4].

    A. Peer-Division Multiplexing

To minimize per-packet processing time of a stream, the Zattoo protocol sets up a virtual circuit with multiple fan-outs at each peer. When a peer joins a TV channel, it establishes a peer-division multiplexing (PDM) scheme amongst a set of neighboring peers, by building a virtual circuit to each of the neighboring peers.


Barring departure or performance degradation of a neighbor peer, the virtual circuits are maintained until the joining peer switches to another TV channel. With the virtual circuits set up, each packet is forwarded without further per-packet handshaking between peers. We describe the PDM bootstrapping mechanism in this section and the adaptive PDM mechanism to handle peer departure and performance degradation in Section II-C.

The PDM establishment process consists of two phases: the search phase and the join phase. In the search phase, the new, joining peer determines its set of potential neighbors. In the join phase, the joining peer requests peering relationships with a subset of its potential neighbors. Upon acceptance of a peering relationship request, the peers become neighbors and a virtual circuit is formed between them.

Search phase. To obtain a list of potential neighbors, a joining peer sends out a SEARCH message to a random subset of the existing peers returned by the Rendezvous Server. The SEARCH message contains the sub-stream indices for which this joining peer is looking for peering relationships. The sub-stream indices are usually represented as a bitmask of n bits, where n is the number of sub-streams defined for the TV channel. In the beginning, the joining peer will be looking for peering relationships for all sub-streams and have all the bits in the bitmask turned on. In response to a SEARCH message, an existing peer replies with the number of sub-streams it can forward. From the returning SEARCH replies, the joining peer constructs a set of potential neighbors that covers the full set of sub-streams comprising the live stream of the TV channel. The joining peer continues to wait for SEARCH replies until the set of potential neighbors contains at least a minimum number of peers, or until all SEARCH replies have been received. With each SEARCH reply, the existing peer also returns a random subset of its known peers. If a joining peer cannot form a set of potential neighbors that covers all of the sub-streams of the TV channel, it initiates another SEARCH round, sending SEARCH messages to peers newly learned from the previous round. The joining peer gives up if it cannot obtain the full stream after two SEARCH rounds. To help the joining peer synchronize the sub-streams it receives from multiple peers, each existing peer also indicates for each sub-stream the latest sequence number it has received for that sub-stream, and the existence of any quality problem. The joining peer can then choose sub-streams with good quality that are closely synchronized.
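A minimal sketch of the n-bit sub-stream bitmask used during the search phase; the encoding below is our own convention, since the paper does not specify the wire format:

```python
# Hypothetical SEARCH bitmask encoding: bit (i-1) set means sub-stream i is wanted.
N_SUBSTREAMS = 16

def make_bitmask(wanted_substreams: set[int]) -> int:
    """Build an n-bit mask from the sub-stream indices (1..n) a joining peer is looking for."""
    mask = 0
    for i in wanted_substreams:
        mask |= 1 << (i - 1)
    return mask

def covered(masks_offered: list[int]) -> bool:
    """True if the union of sub-streams offered by potential neighbors covers the full stream."""
    union = 0
    for m in masks_offered:
        union |= m
    return union == (1 << N_SUBSTREAMS) - 1

# A joining peer initially requests all sub-streams, i.e. all bits turned on:
initial_mask = make_bitmask(set(range(1, N_SUBSTREAMS + 1)))   # 0xFFFF for n = 16
```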

Join phase. Once the set of potential neighbors is established, the joining peer sends JOIN requests to each potential neighbor. The JOIN request lists the sub-streams for which the joining peer would like to construct a virtual circuit with the potential neighbor. If a joining peer has l potential neighbors, each willing to forward it the full stream of a TV channel, it would typically choose to have each forward only 1/l-th of the stream, to spread out the load amongst the peers and to speed up error recovery, as described in Section II-C. In selecting which of the potential neighbors to peer with, the joining peer gives highest preference to topologically close-by peers, even if these peers have less capacity or carry lower quality sub-streams.

Fig. 2. Zattoo peer with IOB.

The topological location of a peer is defined to be its subnet number, autonomous system (AS) number, and country code, in that order of precedence. A joining peer obtains its own topological location from the Zattoo Authentication Server as part of its authentication process. The lists of peers returned by both the Rendezvous Server and potential neighbors all come attached with topological locations. A topology-aware overlay not only allows us to be ISP-friendly, by minimizing inter-domain traffic and thus saving on transit bandwidth cost, but also helps reduce the number of physical links and metro hops traversed in the overlay network, potentially resulting in enhanced user-perceived stream quality.
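One way to express this neighbor-selection preference (same subnet, then same AS, then same country) is as a sort key; the field names below are our own, and the real client may weigh candidates differently:

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    subnet: str       # e.g. "192.0.2.0/24"
    asn: int          # autonomous system number
    country: str      # country code

def closeness_rank(me: Candidate, peer: Candidate) -> int:
    """Lower rank means topologically closer: same subnet beats same AS beats same country."""
    if peer.subnet == me.subnet:
        return 0
    if peer.asn == me.asn:
        return 1
    if peer.country == me.country:
        return 2
    return 3

# candidates.sort(key=lambda p: closeness_rank(me, p)) would rank close-by peers first.
```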

    B. Stream Management

We represent a peer as a packet buffer, called the IOB, fed by sub-streams incoming from the PDM constructed as described in Section II-A.[1] The IOB drains to (1) a local media player if one is running, (2) a local file if recording is supported, and (3) potentially other peers. Fig. 2 depicts a Zattoo player application with virtual circuits established to four peers. As packets from each sub-stream arrive at the peer, they are stored in the IOB for reassembly to reconstruct the full stream. Portions of the stream that have been reconstructed are then played back to the user. In addition to providing a reassembly area, the IOB also allows a peer to absorb some variability in available network bandwidth and network delay.

[1] In the case of the Encoding Server, which we also consider a peer on the Zattoo network, the buffer is fed by the encoding process.

The IOB is referenced by an input pointer, a repair pointer, and one or more output pointers.


The input pointer points to the slot in the IOB where the next incoming packet, with sequence number higher than the highest sequence number received so far, will be stored. The repair pointer always points one slot beyond the last packet received in order and is used to regulate packet retransmission and adaptive PDM as described later. We assign an output pointer to each forwarding destination. The output pointer of a destination indicates the destination's current forwarding horizon on the IOB. In accordance with the three types of possible forwarding destinations listed above, we have three types of output pointers: player pointer, file pointer, and peer pointer. One would typically have at most one player pointer and one file pointer, but potentially multiple concurrent peer pointers, referencing an IOB. The Zattoo player application does not currently support recording.

Since we maintain the IOB as a circular buffer, if the incoming packet rate is higher than the forwarding rate of a particular destination, the input pointer will overrun the output pointer of that destination. We could move the output pointer to match the input pointer so that we consistently forward the oldest packet in the IOB to the destination. Doing so, however, requires checking the input pointer against all output pointers on every packet arrival. Instead, we have implemented the IOB as a double buffer. With the double buffer, the positions of the output pointers are checked against that of the input pointer only when the input pointer moves from one sub-buffer to the other. When the input pointer moves from sub-buffer a to sub-buffer b, all the output pointers still pointing to sub-buffer b are moved to the start of sub-buffer a, and sub-buffer b is flushed, ready to accept new packets. When a sub-buffer is flushed while there are still output pointers referencing it, packets that have not been forwarded to the destinations associated with those pointers are lost to them, resulting in quality degradation. To minimize packet loss due to sub-buffer flushing, we would like to use large sub-buffers. However, the real-time delay requirement of live streaming limits the usefulness of late-arriving packets and effectively puts a cap on the maximum size of the sub-buffers.
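A toy rendering of the flush rule above, under our own bookkeeping (sub-buffer size and data layout are placeholders, not Zattoo's):

```python
# Illustrative double-buffer bookkeeping: when the input pointer crosses into the other
# sub-buffer, output pointers still stuck in that sub-buffer are relocated and it is flushed.
SUB_BUFFER_SLOTS = 1024  # placeholder size; bounded in practice by the live-streaming delay budget

def on_input_crosses(into_buffer: int, output_pointers: dict[str, tuple[int, int]]) -> dict[str, tuple[int, int]]:
    """output_pointers maps destination name -> (sub_buffer, slot); into_buffer is 0 or 1."""
    other = 1 - into_buffer
    relocated = {}
    for dest, (buf, slot) in output_pointers.items():
        if buf == into_buffer:
            # This destination had not drained the sub-buffer being flushed; its unsent
            # packets are lost to it, and it restarts at the freshly filled sub-buffer.
            relocated[dest] = (other, 0)
        else:
            relocated[dest] = (buf, slot)
    return relocated
```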

Different peers may request different numbers of, possibly non-consecutive, sub-streams. To accommodate the different forwarding rates and regimes required by the destinations, we associate a packet map and forwarding discipline with each output pointer. Fig. 3 shows the packet map associated with an output peer pointer where the peer has requested sub-streams 1, 4, 9, and 14. Every time a peer pointer is repositioned to the beginning of a sub-buffer of the IOB, all the packet slots of the requested sub-streams are marked NEEDed and all the slots of the sub-streams not requested by the peer are marked SKIP. When a NEEDed packet arrives and is stored in the IOB, its state in the packet map is changed to READY. As the peer pointer moves along its associated packet map, READY packets are forwarded to the peer and their states changed to SENT. A slot marked NEEDed but not READY, such as slot n + 4 in Fig. 3, indicates that the packet is lost or will arrive out-of-order and is bypassed. When an out-of-order packet arrives, its slot is changed to READY and the peer pointer is reset to point to this slot. Once the out-of-order packet has been sent to the peer, the peer pointer will move forward, bypassing all SKIP, NEED, and SENT slots until it reaches the next READY slot, where it can resume sending.

Fig. 3. Packet map associated with a peer pointer (slot states SKIP, NEED, READY, SENT; requested sub-streams 1, 4, 9, and 14).

Fig. 4. IOB, input/output pointers, and packet maps.

The player pointer behaves the same as a peer pointer except that all packets in its packet map always start out marked NEEDed.
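The packet-map state machine lends itself to a compact sketch; the state names match the paper, while the layout and function names are our own simplification:

```python
from enum import Enum

class Slot(Enum):
    SKIP = 0    # sub-stream not requested by this destination
    NEED = 1    # requested but not yet received
    READY = 2   # received, waiting to be forwarded
    SENT = 3    # already forwarded

def init_packet_map(n: int, segments: int, requested: set[int]) -> list[Slot]:
    """Mark requested sub-streams NEED and everything else SKIP for one sub-buffer of `segments` segments."""
    return [Slot.NEED if (i % n) + 1 in requested else Slot.SKIP
            for i in range(n * segments)]

def on_packet_arrival(packet_map: list[Slot], slot_idx: int) -> None:
    """A NEEDed packet that arrives becomes READY (an out-of-order arrival also resets the peer pointer)."""
    if packet_map[slot_idx] is Slot.NEED:
        packet_map[slot_idx] = Slot.READY
```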

Fig. 4 shows an IOB consisting of a double buffer, with an input pointer, a repair pointer, an output file pointer, an output player pointer, and two output peer pointers referencing the IOB. Each output pointer has a packet map associated with it. For the scenario depicted in the figure, the player pointer tracks the input pointer and has skipped over some lost packets. Both peer pointers are lagging the input pointer, indicating that the forwarding rates to the peers are bandwidth limited. The file pointer is pointing at the first lost packet. Archiving a live stream to file does not impose a real-time delay bound on packet arrivals. To achieve the best quality recording possible, a recording peer always waits for retransmission of lost packets that cannot be recovered by error correction.

In addition to achieving lossless recording, we use retransmission to let a peer recover from transient network congestion. A peer sends out a retransmission request when the distance between the repair pointer and the input pointer has reached a threshold of R packet slots, usually spanning multiple segments. A retransmission request consists of an R-bit packet mask, with each bit representing a packet, and the sequence number of the packet corresponding to the first bit. Marked bits in the packet mask indicate that the corresponding packets need to be retransmitted. When a packet loss is detected, it could be caused by congestion on the virtual circuits forming the current PDM or by congestion on the path beyond the neighboring peers.


In either case, current neighbor peers will not be good sources of retransmitted packets. Hence we send our retransmission requests to r random peers that are not neighbor peers. A peer receiving a retransmission request will honor the request only if the requested packets are still in its IOB and it has sufficient left-over capacity, after serving its current peers, to transmit all the requested packets. Once a retransmission request is accepted, the peer will retransmit all the requested packets to completion.
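A sketch of building and reading the R-bit retransmission mask; the (base sequence number, mask) pairing mirrors the description above, but the concrete representation and the value of R are our assumptions:

```python
R = 96  # illustrative threshold, in packet slots, between the repair and input pointers

def build_request(base_seq: int, missing: set[int]) -> tuple[int, int]:
    """Return (base_seq, mask) where bit j of mask means packet base_seq + j must be retransmitted."""
    mask = 0
    for seq in missing:
        offset = seq - base_seq
        if 0 <= offset < R:
            mask |= 1 << offset
    return base_seq, mask

def requested_packets(base_seq: int, mask: int) -> list[int]:
    """Expand a request back into the sequence numbers the requester is missing."""
    return [base_seq + j for j in range(R) if mask & (1 << j)]
```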

    C. Adaptive PDM

While we rely on packet retransmission to recover from transient congestion, we have two channel capacity adjustment mechanisms to handle longer-term bandwidth fluctuations. The first mechanism allows a forwarding peer to adapt the number of sub-streams it will forward given its current available bandwidth, while the second allows the receiving peer to switch provider at the sub-stream level.

Peers on the Zattoo network can redistribute a highly variable number of sub-streams, reflecting the high variability in uplink bandwidth of different access network technologies. For a full stream consisting of sixteen constant-bit rate sub-streams, our prior study shows that, based on realistic peer characteristics measured from the Zattoo network, half of the peers can support less than half of a stream, 82% of peers can support less than a full stream, and the remainder can support up to ten full streams (peers that can redistribute more than a full stream are conventionally known as supernodes in the literature) [5]. With variable-bit rate streams, the bandwidth carried by each sub-stream is also variable. To increase peer bandwidth usage, without undue degradation of service, we instituted measurement-based admission control at each peer. In addition to controlling resource commitment, another goal of the measurement-based admission control module is to continually estimate the amount of available uplink bandwidth at a peer.

The amount of available uplink bandwidth at a peer is initially estimated by the peer sending a pair of probe packets to Zattoo's Bandwidth Estimation Server. Once a peer starts forwarding sub-streams to other peers, it will receive from those peers quality-of-service feedback that informs its update of the available uplink bandwidth estimate. A peer sends quality-of-service feedback only if the quality of a sub-stream drops below a certain threshold.[2] Upon receiving quality feedback from multiple peers, a peer first determines if the identified sub-streams are arriving in low quality. If so, the low quality of service may not be caused by a limit on its own available uplink bandwidth, in which case it ignores the low quality feedback. Otherwise, the peer decrements its estimate of available uplink bandwidth. If the new estimate is below the bandwidth needed to support the existing number of virtual circuits, the peer closes a virtual circuit. To reduce the instability introduced into the network, a peer closes first the virtual circuit carrying the smallest number of sub-streams.

[2] Depending on a peer's NAT and/or firewall configuration, Zattoo uses either UDP or TCP as the underlying transport protocol. The quality of a sub-stream is measured differently for UDP and TCP. A packet is considered lost under UDP if it doesn't arrive within a fixed threshold. The quality measure for UDP is computed as a function of both the packet loss rate and the burst error rate (number of contiguous packet losses). The quality measure for TCP is defined to be how far behind a peer is, relative to other peers, in serving its sub-streams.

A peer attempts to increase its available uplink bandwidth estimate periodically if it has fully utilized its current estimate of available uplink bandwidth without triggering any bad quality feedback from neighboring peers. A peer doubles the estimated available uplink bandwidth if the current estimate is below a threshold, switching to linear increase above the threshold, similar to how TCP maintains its congestion window size. A peer also increases its estimate of available uplink bandwidth if a neighbor peer departs the network without any bad quality feedback.
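The slow-start-like growth rule can be sketched as follows; the threshold, step, and decrement factor are placeholders, not Zattoo's actual parameters:

```python
THRESHOLD_KBPS = 1000    # illustrative: below this, the estimate grows multiplicatively
LINEAR_STEP_KBPS = 100   # illustrative additive step above the threshold

def grow_estimate(estimate_kbps: float) -> float:
    """Periodic increase when the current estimate was fully used without bad-quality feedback."""
    if estimate_kbps < THRESHOLD_KBPS:
        return estimate_kbps * 2              # multiplicative growth, like TCP slow start
    return estimate_kbps + LINEAR_STEP_KBPS   # linear growth above the threshold

def shrink_estimate(estimate_kbps: float, factor: float = 0.9) -> float:
    """Decrement on corroborated bad-quality feedback (the paper does not give the exact step)."""
    return estimate_kbps * factor
```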

When the repair pointer lags behind the input pointer by R packet slots, in addition to initiating a retransmission request, a peer also computes a loss rate over the R packets. If the loss rate is above a threshold, the peer considers the neighbor slow and attempts to reconfigure its PDM. In reconfiguring its PDM, a peer attempts to shift half of the sub-streams currently forwarded by the slow neighbor to other existing neighbors. At the same time, it searches for new peer(s) to forward these sub-streams. If new peer(s) are found, the load will be shifted from existing neighbors to the new peer(s). If sub-streams from the slow neighbor continue to suffer after the reconfiguration of the PDM, the peer will drop the neighbor completely and initiate another reconfiguration of the PDM. When a peer loses a neighbor due to reduced available uplink bandwidth at the neighbor or due to neighbor departure, it also initiates a PDM reconfiguration. A peer may also initiate a PDM reconfiguration to switch to a topologically closer peer. Similar to the PDM establishment process, PDM reconfiguration is accomplished by peers exchanging sub-stream bitmasks in a request/response handshake, with each bit of the bitmask representing a sub-stream. During and after a PDM reconfiguration, slow neighbor detection is disabled for a short period of time to allow the system to stabilize.
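A minimal sketch of the slow-neighbor check over the last R packet slots; the loss-rate threshold and the split policy details are our reading of the text, since the paper only says "a threshold":

```python
LOSS_THRESHOLD = 0.2  # illustrative; a neighbor is considered slow above this loss rate

def neighbor_is_slow(received_mask: list[bool]) -> bool:
    """received_mask covers the R slots between the repair and input pointers (True = received)."""
    r = len(received_mask)
    losses = sum(1 for got in received_mask if not got)
    return r > 0 and losses / r > LOSS_THRESHOLD

def split_substreams(substreams: list[int]) -> tuple[list[int], list[int]]:
    """Split a slow neighbor's sub-streams: half to shift to existing neighbors now,
    the rest held until new peer(s) are found (our interpretation of the reconfiguration step)."""
    half = len(substreams) // 2
    return substreams[:half], substreams[half:]
```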

    III. GLOBAL BANDWIDTH SUBSIDY SYSTEM

Each peer on the Zattoo network is assumed to serve a user through a media player, which means that each peer must receive, and can potentially forward, all n sub-streams of the TV channel the user is watching. The limited redistribution capacity of peers on the Zattoo network means that a typical client can contribute only a fraction of the sub-streams that make up a channel. This shortage of bandwidth leads to a global bandwidth deficit in the peer-to-peer network. Whereas BitTorrent-like delay-tolerant file downloads or the delay-sensitive progressive download of video-on-demand applications can mitigate such global bandwidth shortage by increasing download time, a live streaming system such as Zattoo's must subsidize the bandwidth shortfall to provide a real-time delivery guarantee.

Zattoo's Global Bandwidth Subsidy System (or simply, the Subsidy System) consists of a global bandwidth monitoring subsystem, a global bandwidth forecasting and provisioning subsystem, and a pool of Repeater nodes. The monitoring subsystem continuously monitors the global bandwidth requirement of a channel.


The forecasting and provisioning subsystem projects the global bandwidth requirement based on measured history and allocates Repeater nodes to the channel as needed. The monitoring and provisioning of global bandwidth is complicated by two parameters that vary highly over time, client population size and peak streaming rate, and one parameter that varies over space, available uplink bandwidth, which is network-service provider dependent. Forecasting of bandwidth requirement is a vast subject in itself. Zattoo adopted a very simple mechanism, described in Section III-B, which has performed adequately in provisioning the network for both daily demand fluctuations and flash crowd scenarios (see Section IV-C).

When a bandwidth shortage is projected for a channel, the Subsidy System assigns one or more Repeater nodes to the channel. Repeater nodes function as bandwidth multipliers, amplifying the amount of available bandwidth in the network. Each Repeater node serves at most one channel at a time; it joins and leaves a given channel at the behest of the Subsidy System. Repeater nodes receive and serve all n sub-streams of the channel they join, run the same PDM protocol, and are treated by actual peers like any other peers on the network; however, as bandwidth amplifiers, they are usually provisioned to contribute more uplink bandwidth than the download bandwidth they consume. The use of Repeater nodes makes the Zattoo network a hybrid P2P and content distribution network.

We next describe the bandwidth monitoring subsystem of the Subsidy System, followed by the design of the simple bandwidth projection and Repeater node assignment subsystem.

    A. Global Bandwidth Measurement

The capacity metric of a channel is the tuple (D, C), where D is the aggregate download rate required by all users on the channel, and C is the aggregate upload capacity of those users. Usually C < D, and the difference between the two is the global bandwidth deficit of the channel. Since channel viewership changes over time as users join or leave the channel, we need a scalable means to measure and update the capacity metric. We rely on aggregating the capacity metric up the peer-division multiplexing tree. Each peer in the overlay periodically aggregates the capacity metric reported by all its downstream receiver peers, adds its own capacity measure (D, C) to the aggregate, and forwards the resulting capacity metric upstream to its forwarding peers. By the time the capacity metric percolates up to the Encoding Server, it contains the total download and upload rate aggregates of the whole streaming overlay. The Encoding Server then simply forwards the obtained (D, C) to the Subsidy Server.
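The in-tree aggregation can be pictured as a simple fold over the PDM tree; the node structure below is our own stand-in, since the real reports piggyback on the protocol messages:

```python
from dataclasses import dataclass, field

@dataclass
class PeerNode:
    download_rate: float                 # this peer's contribution to D
    upload_capacity: float               # this peer's contribution to C
    children: list["PeerNode"] = field(default_factory=list)

def aggregate_capacity(node: PeerNode) -> tuple[float, float]:
    """Return the (D, C) aggregate for the subtree rooted at this peer."""
    d, c = node.download_rate, node.upload_capacity
    for child in node.children:
        cd, cc = aggregate_capacity(child)
        d += cd
        c += cc
    return d, c

# At the Encoding Server, aggregate_capacity(root) yields the channel-wide (D, C)
# that is forwarded to the Subsidy Server.
```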

    B. Global Bandwidth Projection and Provisioning

For each channel, the Subsidy Server keeps a history of the capacity metric (D, C) reports received from the channel's Encoding Server. The channel utilization ratio U is the ratio of D over C. Based on recent movements of the ratio U, we classify the capacity trend of each channel into the following four categories.

Stable: the ratio U has remained within [S-, S+] for the past Ts reporting periods.
Exploding: the ratio U increased by at least E between Te (e.g., Te = 2) reporting periods.
Increasing: the ratio U has steadily increased by I (I <= E) over the past Ti reporting periods.
Falling: the ratio U has decreased by F over the past Tf reporting periods.

Orthogonal to the capacity-trend based classification above, each channel is further categorized in terms of its capacity utilization ratio as follows.

Under-utilized: the ratio U is below the low threshold L, e.g., U <= 0.5.
Near Capacity: the ratio U is almost 1.0, e.g., U >= 0.9.

If the capacity trend of a channel has been Exploding, one or more Repeater nodes will be assigned to it immediately. If the channel's capacity trend has been Increasing, it will be assigned a Repeater node with a smaller capacity. The goal of the Subsidy System is to keep a channel below Near Capacity. If a channel's capacity trend is Stable and the channel is Under-utilized, the Subsidy System attempts to free Repeater nodes (if any) assigned to the channel. If a channel's capacity utilization is Falling, the Subsidy System waits for the utilization to stabilize before reassigning any Repeater nodes.
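A compact sketch of the provisioning decision given a trend label and utilization; the enum and the 0.5 under-utilization threshold follow the categories above, but the function itself is ours:

```python
from enum import Enum

class Trend(Enum):
    STABLE = "stable"
    EXPLODING = "exploding"
    INCREASING = "increasing"
    FALLING = "falling"

def provisioning_action(trend: Trend, utilization: float) -> str:
    """Decide what the Subsidy System does for one channel in one reporting period."""
    if trend is Trend.EXPLODING:
        return "assign one or more Repeater nodes immediately"
    if trend is Trend.INCREASING:
        return "assign a smaller-capacity Repeater node"
    if trend is Trend.STABLE and utilization <= 0.5:   # Under-utilized
        return "free Repeater nodes assigned to the channel"
    if trend is Trend.FALLING:
        return "wait for utilization to stabilize"
    return "no change"
```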

Each Repeater node periodically sends a keep-alive message to the Subsidy System. The keep-alive message tells the Subsidy System which channel the Repeater node is serving, plus its CPU and capacity utilization ratio. This allows the Subsidy System to monitor the health of Repeater nodes and to increase the stability of the overlay during Repeater reassignment. When reassigning Repeater nodes from a channel, the Subsidy System will start from the Repeater node with the lowest utilization ratio. It will notify the selected Repeater node to stop accepting new peers and then to leave the channel after a specified time.

In addition to Repeater nodes, the Subsidy System may recognize extra capacity from idle peers whose owners are not actively watching a channel. However, our previous study shows that a large number of idle peers are required to make any discernible impact on the global bandwidth deficit of a channel [5]. Our current Subsidy System therefore does not solicit bandwidth contribution from idle peers.

    IV. SERVER-SIDE MEASUREMENTS

In the Zattoo system, two separate centralized collector servers collect usage statistics and error reports, which we call the "stats server" and the "user-feedback server," respectively. The stats server periodically collects aggregated player statistics from individual peers, from which full session logs are constructed and entered into a session database. The session database gives a complete picture of all past and present sessions served by the Zattoo system. A given database entry contains statistics about a particular session, which include join time, leave time, uplink bytes, download bytes, and channel name associated with the session. We study the sessions generated on three major TV channels from three different countries (Germany, Spain, and Switzerland), from June 1st to June 30th, 2008.


TABLE I
SESSION DATABASE (6/1/2008 - 6/30/2008)

Channel   # sessions   # distinct users
ARD       2,102,638    298,601
Cuatro    1,845,843    268,522
SF2       1,425,285    157,639

TABLE II
FEEDBACK LOGS (6/20/2008 - 6/29/2008)

Channel   # feedback logs   # sessions
ARD       871               1,253
Cuatro    2,922             4,568
SF2       656               1,140

Throughout the paper, we label those channels from Germany, Spain, and Switzerland as ARD, Cuatro, and SF2, respectively. Euro 2008 games were held during this period, and those three channels broadcast a majority of the Euro 2008 games, including the final match. See Table I for information about the collected session datasets.

The user-feedback server, on the other hand, collects users' error logs submitted asynchronously by users. The user feedback data here is different from peers' quality feedback used in PDM reconfiguration described in Section II-C. The Zattoo player maintains an encrypted log file which contains, for debugging purposes, the detailed behavior of the client-side P2P engine, as well as the history of all the streaming sessions initiated by a user since the player startup. When users encounter any error while using the player, such as a log-in error, join failure, bad quality streaming, etc., they can choose to report the error by clicking a "Submit Feedback" button on the player, which causes the Zattoo player to send the generated log file to the user-feedback server. Since a given feedback log not only reports on a particular error, but also describes normal sessions generated prior to the occurrence of the error, we can study users' viewing experience (e.g., channel join delay) from the feedback logs. Table II describes the feedback logs collected from June 20th to June 29th. A given feedback log can contain multiple sessions (for the same or different channels), depending on users' viewing behavior. The second column in the table represents the number of feedback logs which contain at least one session generated on the channel listed in the corresponding entry in the first column. The numbers in the third column indicate how many distinct sessions generated on said channel are present in the feedback logs.

    A. Overlay Size and Sharing Ratio

We first study how many concurrent users are supported by the Zattoo system, and how much bandwidth is contributed by them. For this purpose, we use the session database presented in Table I. By using the join/leave timestamps of the collected sessions, we calculate the number of concurrent users on a given channel at time i. Then we calculate the average sharing ratio of the given channel at the same time.

TABLE III
AVERAGE SHARING RATIO

Channel   Off-peak   Peak
ARD       0.335      0.313
Cuatro    0.242      0.224
SF2       0.277      0.222

The average sharing ratio is defined as the total users' uplink rate divided by their download rate on the channel. A sharing ratio of one means users contribute to other peers in the network as much traffic as they download at the time. We calculate the average sharing ratio from the total download/uplink bytes of the collected sessions. We first obtain all the sessions which are active across time i. We call the set of such sessions S_i. Then, assuming the uplink/download bytes of each session are spread uniformly throughout the entire session duration, we approximate the average sharing ratio at time i as

\[
\frac{\sum_{s \in S_i} \mathrm{uplink\_bytes}(s) / \mathrm{duration}(s)}
     {\sum_{s \in S_i} \mathrm{download\_bytes}(s) / \mathrm{duration}(s)}.
\]
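A small sketch of this approximation over session records; the field names are ours, mirroring the session-database columns described above:

```python
from dataclasses import dataclass

@dataclass
class Session:
    join_time: float
    leave_time: float
    uplink_bytes: float
    download_bytes: float

    @property
    def duration(self) -> float:
        return self.leave_time - self.join_time

def average_sharing_ratio(sessions: list[Session], t: float) -> float:
    """Approximate the channel's average sharing ratio at time t from the sessions active at t."""
    active = [s for s in sessions if s.join_time <= t <= s.leave_time and s.duration > 0]
    up = sum(s.uplink_bytes / s.duration for s in active)
    down = sum(s.download_bytes / s.duration for s in active)
    return up / down if down > 0 else 0.0
```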

Fig. 5 shows the overlay size (i.e., the number of concurrent users) and the average sharing ratio super-imposed across the month of June 2008. According to the figure, the overlay size grew to more than 20,000 (e.g., 20,059 on ARD on 6/18 and 22,152 on Cuatro on 6/9). As opposed to the overlay size, the average sharing ratio tends to stay flat throughout the month. Occasional spikes in the sharing ratio all occurred during 2AM to 6AM (GMT) when the channel usage is very low, and may therefore be considered statistically insignificant. By segmenting a 24-hour day into two time periods, off-peak hours (0AM-6PM) and peak hours (6PM-0AM), Table III shows the average sharing ratio in the two time periods separately. Zattoo's usage during peak hours typically accounts for about 50% to 70% of the total usage of the day. According to the table, the average sharing ratio during peak hours is slightly lower than, but not very different from, that during off-peak hours. The Cuatro channel in Spain exhibits a relatively lower sharing ratio than the two other channels. One bandwidth test site [6] reports that the average uplink bandwidth in Spain is about 205 kbps, which is much lower than in Germany (582 kbps) and Switzerland (787 kbps). The lower sharing ratio on the Spanish channel may reflect regional differences in residential access network provisioning. The balance of the required bandwidth is provided by Zattoo's Encoding Server and Repeater nodes.

    B. Channel Switching Delay

When a user clicks on a new channel button, it takes some time (a.k.a. channel switching delay) for the user to be able to start watching streamed video on the Zattoo player. The channel switching delay has two components. First, the Zattoo player needs to contact other available peers and retrieve all required sub-streams from them. We call the delay incurred during this stage the join delay. Once all necessary sub-streams have been negotiated successfully, the player then needs to wait and buffer a minimum amount of the stream (e.g., 3 seconds) before starting to show the video to the user.


Fig. 5. Overlay size and sharing ratio, by day in June 2008. (a) ARD. (b) Cuatro. (c) SF2.

Fig. 6. CDF of user arrival time (by arrival hour), from feedback logs and session database. (a) ARD. (b) Cuatro. (c) SF2.

We call the resulting wait time the buffering delay. The total channel switching delay experienced by users is thus the sum of the join delay and the buffering delay. PPLive reports channel switching delay of around 20 to 30 seconds, but it can be as high as 2 minutes, of which join delay typically accounts for 10 to 15 seconds [7]. We measure the join delay experienced by Zattoo users from the feedback logs described in Table II. Debugging information contained in the feedback logs tells us when a user clicked on a particular channel, and when the player had successfully joined the P2P overlay and started to buffer content. One concern in relying on user-submitted feedback logs to infer join delay is the potential sampling bias associated with them. Users typically submit feedback logs when they encounter some kind of error, which brings up the question of whether the captured sessions are representative samples to study. We attempt to address this concern by comparing the data from feedback logs against those from the session database. The latter captures the complete picture of users' channel watching behavior, and therefore can serve as a reference. In our analysis, we compare the user arrival time distributions obtained from the two data sets. For a fair comparison, we used a subset of the session database which was generated during the same period when the feedback logs were collected (i.e., from June 20th to 29th).

Fig. 6 plots the CDF of user arrivals per hour obtained from the feedback logs and the session database separately. The steep slope of the distributions during hours 18-20 (6-8PM) indicates the high frequency of user arrivals during those hours. On ARD and Cuatro, the user arrival distributions inferred from feedback logs are almost identical to those from the session database. On the other hand, on SF2, the distribution obtained from feedback logs tends to grow slowly during early hours, which indicates that the feedback submission rate during off-peak hours on SF2 was relatively lower than normal.

TABLE IV
MEDIAN CHANNEL JOIN DELAY

           Median join delay       Maximum overlay size
Channel    Off-peak    Peak        Off-peak    Peak
ARD        2.29 sec    1.96 sec    2,313       19,223
Cuatro     3.67 sec    4.48 sec    2,357       8,073
SF2        2.49 sec    2.67 sec    1,126       11,360

Later during peak hours, however, the feedback submission rate picks up as expected, closely matching the actual user arrival rate. Based on this observation, we argue that feedback logs can serve as representative samples of daily user activities.

Fig. 7 shows the CDF of channel join delay for the ARD, Cuatro, and SF2 channels. We show the distributions for off-peak hours (0AM-6PM) and peak hours (6PM-0AM) separately. The median channel join delay is also presented in a similar fashion in Table IV. According to the CDFs, 80% of users experience less than 4 to 8 seconds of join delay, and 50% of users less than 2 seconds of join delay. Also, Table IV shows that even a 10-fold increase in the number of concurrent users during peak hours does not unduly lengthen the channel join delay (up to a 22% increase in median join delay).

    C. Repeater Node Assignment

As illustrated in Fig. 5, the live coverage of the Euro Cup games brought huge flash crowds to the Zattoo system. With typical users contributing only about 25-30% of the average streaming bitrate, Zattoo must subsidize the balance.


Fig. 7. CDF of channel join delay (seconds), off-peak vs. peak hours. (a) ARD. (b) Cuatro. (c) SF2.

Fig. 8. Overlay size and channel provisioning (number of Repeaters assigned, by hour of day). (a) ARD. (b) Cuatro. (c) SF2.

As described in Section III, Zattoo's Subsidy System assigns Repeater nodes to channels that require more bandwidth than their own globally available aggregate upload bandwidth.

Fig. 8 shows the Subsidy System assigning more Repeater nodes to a channel as flash crowds arrived during each Euro Cup game and then gradually reclaiming them as the flash crowd departed. For each of the channels reported, we choose a particular date with the biggest flash crowd on the channel. The dip in overlay sizes on the ARD and Cuatro channels occurred during the half-time break of the games. The Subsidy Server was less aggressive in assigning Repeater nodes to the ARD channel, as compared to the other two channels, because the Repeater nodes in the vicinity of the ARD server have higher capacity than those near the other two.

    V. CLIENT-SIDE MEASUREMENTS

To further study the P2P overlay beyond details obtainable from aggregated session-level statistics, we run several modified Zattoo clients which periodically retrieve the internal states of other participating peers in the network by exchanging SEARCH/JOIN messages with them. After a given probe session is over, the monitoring client archives a log file from which we can analyze the control/data traffic exchanged and detailed protocol behavior. We run the experiment during Zattoo's live coverage of Euro 2008 (June 7th to 29th). The monitoring clients tuned to game channels from one of Zattoo's data centers located in Switzerland while the games were broadcast live. The data sets presented in this paper were collected during the coverage of the championship final on two separate channels: ARD in Germany and Cuatro in Spain. Soccer teams from Germany and Spain participated in the championship final.

As described in Section II-A, Zattoo's peer discovery is guided by peers' topology information. To minimize potential sampling bias caused by our use of a single vantage point for monitoring, we assigned an empty AS number and country code to our monitoring clients, so that their probing is not geared towards those peers located in the same AS and country.

    A. Sub-Stream Synchrony

To ensure good viewing quality, a peer should not only obtain all necessary sub-streams (discounting redundant sub-streams), but also have those sub-streams delivered temporally synchronized with each other for proper online decoding. Receiving out-of-sync sub-streams typically results in a pixelated screen on the player. As described in Sections II-A and II-C, Zattoo's protocol favors sub-streams that are relatively in-sync when constructing the PDM, and continually monitors the sub-streams' progression over time, replacing those sub-streams that have fallen behind and reconfiguring the PDM when necessary. In this section we measure the effectiveness of Zattoo's Adaptive PDM in selecting sub-streams that are largely in-sync.

To quantify such inter-sub-stream synchrony, we measure the difference in the latest (i.e., maximum) packet sequence numbers belonging to different incoming sub-streams. When a remote peer responds to a SEARCH query message, it includes in its SEARCH reply the latest sequence numbers that it has received for all sub-streams. If some sub-streams happen to be lossy or stalled at that time, the peer marks such sub-streams in its SEARCH replies. Thus, we can inspect SEARCH replies from existing peers to study their inter-sub-stream synchrony.


Fig. 9. Sub-stream synchrony, Euro 2008 final on ARD and Cuatro. (a) CDF of the number of bad sub-streams. (b) CDF of sub-stream synchrony (# of packets).

In our experiment, we collected SEARCH replies from 4,420 and 6,530 distinct peers on the ARD and Cuatro channels, respectively, during the 2-hour coverage of the final game. From the collected SEARCH replies, we check how many sub-streams (out of 16) are bad (e.g., lossy or missing) for each peer. Fig. 9(a) shows the CDF of the number of bad sub-streams. According to the figure, about 99% (ARD) and 96% (Cuatro) of peers have 3 or fewer bad sub-streams. The current Zattoo deployment dedicates n - k = 3 sub-streams (out of n = 16) for loss recovery purposes. That is, given a segment of 16 consecutive packets, if a peer has received at least 13 packets, it can reconstruct the remaining 3 packets from the RS error correcting code (see Section II). Thus if a peer can receive any 13 sub-streams out of 16 reliably, it can decode the full stream properly. The result in Fig. 9(a) suggests that the number of bad sub-streams is low enough so as to not cause quality issues in the Zattoo network.

After discounting bad sub-streams, we then look at the synchrony of the remaining good sub-streams in each peer. Fig. 9(b) shows the CDF of the sub-stream synchrony in the two channels. The sub-stream synchrony of a given peer is defined as the difference between the maximum and minimum packet sequence numbers among all sub-streams, which is obtained from the peer's SEARCH reply. For example, if some peer has sub-stream synchrony measured at 100, it means that the peer has one sub-stream that is ahead of another sub-stream by 100 packets. If all the packets are received in order, the sub-stream synchrony of a peer measures at most n - 1. If we received multiple SEARCH replies from the same peer, we average the sub-stream synchrony across all the replies. Given the 500 kbps average channel data rate, 60 consecutive packets roughly correspond to 1 second worth of streaming. Thus, the figure shows that on the Cuatro channel, 20% of peers have their sub-streams completely in-sync, while more than 90% have their sub-streams lagging each other by at most 5 seconds; on the ARD channel, 30% are in-sync, and more than 90% are within 1.5 seconds. The buffer space of the Zattoo player has been dimensioned sufficiently to accommodate such a low degree of out-of-sync sub-streams.

Fig. 10. Peer synchrony: CDF of relative peer synchrony (number of packets) for the Euro 2008 final on ARD and on Cuatro.

    B. Peer Synchrony

While sub-stream synchrony tells us the stream quality different peers may experience, peer synchrony tells us how varied in time peers' viewing points are. In small-scale P2P networks, all participating peers are likely to watch the live stream roughly synchronized in time. However, as the size of the P2P overlay grows, the viewing point of edge nodes may be delayed significantly compared to those closer to the Encoding Server. In the experiment, we define the viewing point of a peer as the median of the latest sequence numbers across its sub-streams. Then we choose one peer (e.g., a Repeater node directly connected to the Encoding Server) as a reference point, and compare other peers' viewing points against the reference viewing point.

Fig. 10 shows the CDFs of relative peer synchrony. The relative peer synchrony of peer X is obtained by subtracting the reference viewing point from the viewing point of X. Thus a peer synchrony of -60 means that a given peer's viewing point is delayed by 60 packets (roughly 1 second for a 500 kbps stream) compared to the reference viewing point. A positive value means that a given peer's stream is ahead of the reference point, which can happen for peers that receive streams directly from the Encoding Server. The figure shows that about 1% of peers on ARD and 4% of peers on Cuatro experienced more than 3 seconds (i.e., 180 packets) of delay in streaming compared to the reference viewing point.
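A minimal sketch of this computation (Python; the packets-per-second constant follows the 500 kbps figure quoted above):

```python
# Illustrative sketch: a peer's viewing point is the median of the latest
# sequence numbers across its sub-streams; relative peer synchrony subtracts
# the reference peer's viewing point. Negative values mean the peer lags.
from statistics import median
from typing import List

PACKETS_PER_SECOND = 60   # ~500 kbps stream => roughly 60 packets per second


def viewing_point(latest_seqs: List[int]) -> float:
    return median(latest_seqs)


def relative_peer_synchrony(peer_seqs: List[int], reference_seqs: List[int]) -> float:
    return viewing_point(peer_seqs) - viewing_point(reference_seqs)


# Example: a peer whose sub-streams trail the reference by ~120 packets
# is about 2 seconds behind the reference viewing point.
ref = [10_000 + i for i in range(16)]
peer = [9_880 + i for i in range(16)]
lag_packets = relative_peer_synchrony(peer, ref)
print(lag_packets, lag_packets / PACKETS_PER_SECOND)   # -120.0  -2.0
```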


Fig. 11. Peer synchrony vs. peer hops from the Encoding Server: CDFs of peer synchrony at depths 2 through 6 for (a) ARD and (b) Cuatro.

To better understand to what extent a peer's position in the overlay affects peer synchrony, we plot in Fig. 11 the CDFs of peer synchrony at different depths, i.e., distances from the Encoding Server. We look at how much delay is introduced for sub-streams traversing i peers from the Encoding Server (i.e., depth i). For this purpose, we associate per-sub-stream viewing point information from the SEARCH reply with per-sub-stream overlay depth information from the JOIN reply, where the SEARCH/JOIN replies were sent by the same peer, close in time (e.g., less than 1 second apart). If the peer-hop distance from the Encoding Server has an adverse effect on playback delay and viewing point, we expect peers further away from the Encoding Server to be further offset from the reference viewing point, resulting in CDFs that are shifted toward the upper left corner for peers at greater distances from the Encoding Server. The figure shows an absence of such a leftward shift in the CDFs. The median delay at depth 6 does not grow by more than 0.5 seconds compared to the median delay at depth 2. The figure shows that having more peers further away from the Encoding Server increases the number of peers that are 0.5 seconds behind the reference viewing point, without increasing the offset in viewing point itself. On the Zattoo network, once the virtual circuits comprising a PDM have been set up, each packet is streamed with minimal delay through each peer. Hence each peer-hop from the Encoding Server introduces delay only in the tens-of-milliseconds range, attesting to the suitability of the network architecture for carrying live media streaming on large-scale P2P networks.
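A small sketch of this per-depth aggregation (Python; the record fields are hypothetical stand-ins for the matched SEARCH/JOIN data described above):

```python
# Illustrative sketch: pair each peer's SEARCH-derived relative synchrony with
# its JOIN-derived hop count from the Encoding Server, then summarize per depth.
from collections import defaultdict
from statistics import median
from typing import Dict, List, NamedTuple


class PeerSample(NamedTuple):
    peer_id: str
    depth: int               # peer hops from the Encoding Server (from JOIN reply)
    rel_synchrony: float     # viewing point minus reference viewing point (packets)


def per_depth_median(samples: List[PeerSample]) -> Dict[int, float]:
    by_depth: Dict[int, List[float]] = defaultdict(list)
    for s in samples:
        by_depth[s.depth].append(s.rel_synchrony)
    return {d: median(v) for d, v in sorted(by_depth.items())}


samples = [PeerSample("a", 2, -30), PeerSample("b", 2, -45),
           PeerSample("c", 6, -50), PeerSample("d", 6, -70)]
print(per_depth_median(samples))   # e.g. {2: -37.5, 6: -60.0}
```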

    C. Effectiveness of ECC in Isolating Loss

Now we investigate the effect of overlay size on the performance scalability of the Zattoo system. Here we focus on client-side quality (e.g., loss rate). As described in Section 1, Zattoo-broadcast media streams are RS encoded, which allows peers to reconstruct the full stream once they obtain at least n - k of the n sub-streams. Since the ECC-encoded stream reconstruction occurs hop by hop in the overlay, it can mask sporadic packet losses, and thus prevent packet losses from being propagated throughout the overlay at large.

To see if such packet loss containment actually occurs in the production system, we run the following experiment. We let our monitoring client join a game channel, stay tuned for 15 seconds, and then leave. We wait for 10 seconds after the 15-second session is over. We repeat this process throughout the 2-hour game coverage and collect the logs. The log file from each session contains a complete list of the packet sequence numbers received from the connected peers during the session, from which we can detect upstream packet losses. We then associate individual packet losses with the peer-path from the Encoding Server. This gives us a rough sense of whether packets traversing longer hops experience higher loss rates. In our analysis, we discount packets delivered during the first 3 seconds of a session to allow the PDM to stabilize.
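A minimal sketch of this loss accounting for a single sub-stream (Python; the warm-up length stands in for the 3-second allowance mentioned above, expressed in packets):

```python
# Illustrative sketch: estimate upstream loss for one sub-stream of a session
# from the received sequence numbers, after discarding a warm-up prefix.
from typing import List, Tuple


def substream_loss(received_seqs: List[int], warmup_packets: int) -> Tuple[int, int]:
    """Return (lost, expected) for one sub-stream after the warm-up window."""
    seqs = sorted(set(received_seqs))
    if not seqs:
        return 0, 0
    start = seqs[0] + warmup_packets          # skip packets from the stabilization period
    window = [s for s in seqs if s >= start]
    if not window:
        return 0, 0
    expected = window[-1] - start + 1         # packets we should have seen in the window
    return expected - len(window), expected


# Example: 3 packets (200..202) missing after the warm-up window.
lost, expected = substream_loss(list(range(100, 200)) + list(range(203, 260)),
                                warmup_packets=12)
print(lost, expected, round(100.0 * lost / expected, 2))   # 3 148 2.03
```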

Fig. 12 shows how the average packet loss rate changes across different overlay depths (in all cases below 1%). On both channels, the packet loss rate does not grow with overlay depth. This result confirms our expectation that ECC helps localize packet losses on the overlay.

VI. PEER-TO-PEER SHARING RATIO

The premise of a given P2P system's success in providing scalable stream distribution is sufficient bandwidth sharing from participating users [5]. Section IV-A shows that the average sharing ratio of Zattoo users ranges from 0.2 to 0.35, which translates into uplink contributions ranging from 100 kbps to 175 kbps. This is far lower than the numbers reported as typical uplink bandwidth in countries where Zattoo is available [6]. Aside from the possibility that users' bandwidth may be shared with other applications, we find factors such as user behavior, support for variable-bit-rate encoding, and heterogeneous NAT environments contributing to sub-optimal sharing performance. Designers of P2P streaming systems must pay attention to these factors to achieve good bandwidth sharing. However, one must also keep in mind that improving bandwidth sharing should not come at the expense of user viewing experience, e.g., due to more frequent uplink bandwidth saturation.


Fig. 12. Packet loss rate (%) vs. peer hops from the Encoding Server for (a) ARD and (b) Cuatro.

Fig. 13. Receiver-peer churn frequency: average number of receiver-peer churns vs. session length (min).

User churns: It is known that frequent user churn accompanied by short sessions, as well as flash-crowd behavior, poses a unique challenge to live streaming systems [7], [8]. User churn can occur in both upstream (e.g., forwarding peers) and downstream (e.g., receiver peers) connectivity on the overlay. While frequent churn of forwarding peers can cause quality problems, frequent changes of receiver peers can lead to under-utilization of a forwarding peer's uplink capacity. To estimate receiver peers' churn behavior, we visualize in Fig. 13 the relationship between session length and the frequency of receiver-peer churns experienced during the sessions. We studied receiver-peer churn behavior for the Cuatro channel using Zattoo's feedback logs described in Table II. The y-value at session length x minutes denotes the average number of receiver-peer churns experienced by sessions with lengths in [x, x + 1). According to the figure, the receiver-peer churn frequency tends to grow linearly with session length, and peers experience approximately one receiver-peer churn every minute.
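The following sketch reproduces the kind of bucketing behind Fig. 13 (Python; the input tuples are hypothetical stand-ins for entries in the feedback logs):

```python
# Illustrative sketch: bucket sessions by length in whole minutes and average
# the number of receiver-peer churns per bucket.
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def churns_by_session_length(sessions: Iterable[Tuple[float, int]]) -> Dict[int, float]:
    """Map session-length bucket [x, x+1) minutes -> average receiver-peer churns."""
    totals: Dict[int, List[int]] = defaultdict(lambda: [0, 0])   # bucket -> [sum, count]
    for length_min, churns in sessions:
        bucket = int(length_min)
        totals[bucket][0] += churns
        totals[bucket][1] += 1
    return {b: s / c for b, (s, c) in sorted(totals.items())}


print(churns_by_session_length([(4.2, 5), (4.8, 3), (30.5, 29)]))
# {4: 4.0, 30: 29.0} -- consistent with roughly one churn per minute
```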

Variable-bit-rate streams: Variable-bit-rate (VBR) encoding is commonly used in production streaming systems, including Zattoo, due to its better quality-to-bandwidth ratio compared to constant-bit-rate encoding. However, VBR streams may put additional strain on peers' bandwidth contribution and quality optimization. As described in Section II-C, the presence of VBR streams requires peers to perform measurement-based admission control when allocating resources to set up a virtual circuit. To avoid degradation of service due to overloading of the uplink bandwidth, the measurement-based admission control module must be conservative both in its reaction to increases in available bandwidth and in its allocation of available bandwidth to newly joining peers. This conservativeness necessarily leads to under-utilization of resources.
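A minimal sketch of such a conservative admission rule (Python; the safety margin, decay factor, and rates are illustrative assumptions, not Zattoo's parameters or implementation):

```python
# Illustrative sketch of a conservative measurement-based admission rule:
# a new virtual circuit is admitted only if the recent peak of measured uplink
# usage plus the sub-stream's peak rate stays under a safety fraction of max_cp.
class ConservativeAdmission:
    def __init__(self, max_cp_kbps: float, safety: float = 0.8):
        self.max_cp = max_cp_kbps
        self.safety = safety          # leave headroom for VBR peaks (assumed value)
        self.peak_used = 0.0          # recent peak of measured uplink usage

    def observe(self, measured_kbps: float) -> None:
        # React quickly to increases, decay slowly on decreases.
        if measured_kbps > self.peak_used:
            self.peak_used = measured_kbps
        else:
            self.peak_used = 0.95 * self.peak_used + 0.05 * measured_kbps

    def admit(self, substream_peak_kbps: float) -> bool:
        return self.peak_used + substream_peak_kbps <= self.safety * self.max_cp


ac = ConservativeAdmission(max_cp_kbps=500)
ac.observe(320)
print(ac.admit(40))   # True:  320 + 40 <= 0.8 * 500
print(ac.admit(90))   # False: 320 + 90 >  0.8 * 500
```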

NAT reachability: Asymmetric reachability imposed by prevalent NAT boxes adds to the difficulty of achieving full utilization of users' uplink capacity. The Zattoo delivery system supports six different NAT configurations: open host, full cone, IP-restricted, port-restricted, symmetric, and UDP-disabled, listed in increasing order of restrictiveness. Not every pairwise communication can occur between different NAT types. For example, peers behind a port-restricted NAT box cannot communicate with those of the symmetric NAT type (we documented a NAT reachability matrix in an earlier work [5]).
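The structure of such a reachability check is sketched below (Python). The entries are placeholders: only the port-restricted/symmetric incompatibility comes from the text above; the other pairs are marked as assumptions, and the full measured matrix is the one documented in [5].

```python
# Illustrative sketch of a NAT reachability lookup. The matrix contents below
# are placeholders, not the measured matrix from [5].
NAT_TYPES = ["open", "full_cone", "ip_restricted", "port_restricted",
             "symmetric", "udp_disabled"]

# Unordered pairs that cannot exchange P2P traffic.
UNREACHABLE_PAIRS = {
    frozenset({"port_restricted", "symmetric"}),   # stated in the text
    frozenset({"symmetric"}),                      # symmetric-symmetric: assumption
    frozenset({"udp_disabled"}),                   # udp_disabled-udp_disabled: assumption
}


def can_communicate(nat_a: str, nat_b: str) -> bool:
    assert nat_a in NAT_TYPES and nat_b in NAT_TYPES
    return frozenset({nat_a, nat_b}) not in UNREACHABLE_PAIRS


print(can_communicate("port_restricted", "symmetric"))   # False
print(can_communicate("full_cone", "symmetric"))          # True under this placeholder matrix
```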

To examine how the varied NAT reachability comes into play as far as sharing performance is concerned, we performed the following two controlled experiments. In both experiments, we ran four Zattoo clients, each with a distinct NAT type, tuned to the same channel concurrently for 30 minutes. In the first experiment, we fixed the maximum allowable uplink capacity (denoted max cp) of those clients to the same constant value.3 In the second experiment, we let the max cp of those clients self-adjust over time (which is the default setting of the Zattoo player). In both experiments, we monitored how the uplink bandwidth utilization ratio (i.e., the ratio between actively used uplink bandwidth and the maximum allowable bandwidth max cp) changes over time. The experiments were repeated three times for reliability.

3 The experiments were performed from one of Zattoo's data centers in Europe, and there was sufficient uplink bandwidth available to support 4 × max cp.
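A sketch of how these two metrics can be computed from periodic samples (Python; the sampling format and the numbers are illustrative assumptions):

```python
# Illustrative sketch: instantaneous and average uplink utilization ratios
# computed from (time, used_kbps, max_cp_kbps) samples via trapezoidal integration.
from typing import List, Tuple


def instantaneous_ratio(used_kbps: float, max_cp_kbps: float) -> float:
    return used_kbps / max_cp_kbps if max_cp_kbps > 0 else 0.0


def average_ratio(samples: List[Tuple[float, float, float]]) -> float:
    """samples: (t_seconds, used_kbps, max_cp_kbps), sorted by time."""
    used_area = cap_area = 0.0
    for (t0, u0, c0), (t1, u1, c1) in zip(samples, samples[1:]):
        dt = t1 - t0
        used_area += 0.5 * (u0 + u1) * dt
        cap_area += 0.5 * (c0 + c1) * dt
    return used_area / cap_area if cap_area > 0 else 0.0


samples = [(0, 0, 500), (60, 250, 500), (120, 450, 500), (180, 480, 500)]
print(average_ratio(samples))   # integral of used bandwidth over integral of max cp
```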

Fig. 14 shows the capacity utilization ratio from these two experiments for the four NAT types. Fig. 14(a) visualizes how fast the capacity utilization ratio converges to 1.0 over time after the client joins the network. Fig. 14(b) plots the average utilization ratio, i.e., [∫ current capacity(t) dt] / [∫ max cp(t) dt], integrated over the session.


In both the constant and adjustable cases, the results clearly exemplify the varied sharing performance across different NAT types. In particular, the symmetric NAT type, which is the most restrictive among the four, shows an inferior sharing ratio compared to the rest. Unfortunately, the symmetric NAT type is the second most popular NAT type among the Zattoo population [5], and can therefore seriously affect Zattoo's overall sharing performance. A relatively small number of peers run the IP-restricted NAT type, hence its sharing performance has not been fine-tuned at Zattoo.

Reliability of NAT detection: To allow communication among clients behind a NAT gateway (i.e., NATed clients), each client in Zattoo performs a NAT detection procedure upon player startup to identify its NAT configuration and advertises it to the rest of the Zattoo network. Zattoo's NAT detection procedure implements the standard UDP-based STUN protocol, which involves communicating with external STUN servers to discover the presence and type of a NAT gateway [9]. The communication occurs over UDP with no reliable-transport guarantee. Lost or delayed STUN packets may lead to inaccurate NAT detection, preventing NATed clients from contacting each other, and therefore adversely affecting their sharing performance.

To understand the reliability and accuracy of the STUN-based NAT detection procedure, we utilize Zattoo's session database (Table I), which stores, among other things, the NAT information, public IP address, and private IP address of each session. We assume that all sessions associated with the same public/private IP address pair are generated under the same NAT configuration. If sessions with the same public/private IP address pair report inconsistent NAT detection results, we consider those sessions as having experienced failed NAT detection, and apply the majority rule to determine the correct result for that configuration.
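The following sketch illustrates this majority-rule analysis (Python; the session tuple layout is a simplified stand-in for the session-database records):

```python
# Illustrative sketch: group sessions by their (public IP, private IP) pair,
# take the most frequently reported NAT type as the correct one, and count
# every disagreeing session as a failed detection.
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

Session = Tuple[str, str, str]   # (public_ip, private_ip, detected_nat_type)


def nat_detection_failure_rate(sessions: Iterable[Session]) -> float:
    by_host: Dict[Tuple[str, str], Counter] = defaultdict(Counter)
    for pub, priv, nat in sessions:
        by_host[(pub, priv)][nat] += 1
    failures = total = 0
    for counts in by_host.values():
        majority = counts.most_common(1)[0][1]
        host_total = sum(counts.values())
        failures += host_total - majority
        total += host_total
    return failures / total if total else 0.0


sessions = [("1.2.3.4", "10.0.0.5", "port_restricted")] * 9 + \
           [("1.2.3.4", "10.0.0.5", "symmetric")]
print(nat_detection_failure_rate(sessions))   # 0.1
```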

Fig. 15 plots the daily trend of the NAT detection failure rate derived from the ARD, Cuatro and SF2 channels during June 2008. The NAT detection failure rate at hour x is the number of bogus NAT detection results divided by the total number of NAT detections occurring in that hour. The total number of NAT detections indicates how busy the Zattoo system was throughout the day. The NAT detection failure rate grows from around 1.5% at 2-3 AM to a peak of almost 6% at 8 PM. This means that at the busiest times, the NAT type of about 6% of clients is not determinable, preventing those clients from contributing any bandwidth to the P2P network.

VII. RELATED WORK

Aside from Zattoo, several commercial peer-to-peer systems intended for live streaming have been introduced since 2005, notably PPLive, PPStream, SopCast, TVAnts, and UUSee from China, and Joost, Livestation, Octoshape, and RawFlow from the EU. A large number of measurement studies have been conducted on one or another of these systems [7], [10]-[16]. Many research prototypes and improvements to existing P2P systems have also been proposed and evaluated [4], [17]-[24]. Our study is unique in that we are able to collect network-core data from a large production system with over 3 million registered users, with intimate knowledge of the underlying network architecture and protocol.

Fig. 15. NAT detection failure rate (%) and number of NAT detections vs. hour of day.

P2P systems are usually classified as either tree-based push or mesh-based swarming [25]. In tree-based push schemes, peers organize themselves into multiple distribution trees [26]-[28]. In mesh-based swarming, peers form a randomly connected mesh, and content is swarmed via dynamically constructed directed acyclic paths [21], [29], [30]. Zattoo is not only a hybrid of these two, similar to the latest version of CoolStreaming [4]; its dependence on Repeater nodes also makes it a hybrid of a P2P network and a content-distribution network (CDN), similar to PPLive [7].

    VIII. CONCLUSION

We have presented a receiver-based, peer-division multiplexing engine to deliver live streaming content on a peer-to-peer network. The same engine can be used to transparently build a hybrid P2P/CDN delivery network by adding Repeater nodes to the network. By analyzing a large amount of usage data collected on the network during one of the largest viewing events in Europe, we have shown that the resulting network can scale to a large number of users and can take good advantage of the available uplink bandwidth at peers. We have also shown that error-correcting codes and packet retransmission can help improve network stability by isolating packet losses and preventing transient congestion from resulting in PDM reconfigurations. We have further shown that the PDM and adaptive PDM schemes presented have small enough overhead to make our system competitive with digital satellite TV in terms of channel switch time, stream synchronization, and signal lag.

    ACKNOWLEDGMENT

The authors would like to thank Adam Goodman for developing the Authentication Server, Alvaro Saurin for the error-correcting code, Zhiheng Wang for an early version of the stream buffer management, Eric Wucherer for the Rendezvous Server, Zach Steindler for the Subsidy System, and the rest of the Zattoo development team for fruitful collaboration.

    REFERENCES

[1] R. auf der Maur, "Die Weiterverbreitung von TV- und Radioprogrammen über IP-basierte Netze," in Entertainment Law (f. d. Schweiz), 1st ed. Stämpfli Verlag, 2006.
[2] UEFA, "Euro 2008," http://www1.uefa.com/.


Fig. 14. Uplink capacity utilization ratio for full-cone, IP-restricted, port-restricted, and symmetric NAT types: (a) current capacity utilization ratio vs. time since joining (min) under constant maximum capacity; (b) average capacity utilization ratio under adjustable maximum capacity.

[3] S. Lin and D. J. Costello, Jr., Error Control Coding, 2nd ed. Pearson Prentice-Hall, 2004.
[4] S. Xie, B. Li, G. Y. Keung, and X. Zhang, "CoolStreaming: Design, Theory, and Practice," IEEE Trans. on Multimedia, vol. 9, no. 8, December 2007.
[5] K. Shami et al., "Impacts of Peer Characteristics on P2PTV Networks Scalability," in Proc. IEEE INFOCOM Mini Conference, April 2009.
[6] Bandwidth-test.net, "Bandwidth test statistics across different countries," http://www.bandwidth-test.net/stats/country/.
[7] X. Hei, C. Liang, J. Liang, Y. Liu, and K. W. Ross, "Insights into PPLive: A Measurement Study of a Large-Scale P2P IPTV System," in Proc. IPTV Workshop, International World Wide Web Conference, 2006.
[8] B. Li et al., "An Empirical Study of Flash Crowd Dynamics in a P2P-Based Live Video Streaming System," in Proc. IEEE Global Telecommunications Conference, 2008.
[9] J. Rosenberg et al., "STUN - Simple Traversal of User Datagram Protocol (UDP) Through Network Address Translators (NATs)," RFC 3489, 2003.
[10] A. Ali, A. Mathur, and H. Zhang, "Measurement of Commercial Peer-to-Peer Live Video Streaming," in Proc. Workshop on Recent Advances in Peer-to-Peer Streaming, August 2006.
[11] L. Vu et al., "Measurement and Modeling of a Large-scale Overlay for Multimedia Streaming," in Proc. Int'l Conf. on Heterogeneous Networking for Quality, Reliability, Security and Robustness, 2007.
[12] T. Silverston and O. Fourmaux, "Measuring P2P IPTV Systems," in Proc. ACM NOSSDAV, November 2008.
[13] C. Wu, B. Li, and S. Zhao, "Magellan: Charting Large-Scale Peer-to-Peer Live Streaming Topologies," in Proc. ICDCS'07, 2007, p. 62.
[14] M. Cha et al., "On Next-Generation Telco-Managed P2P TV Architectures," in Proc. Int'l Workshop on Peer-to-Peer Systems, 2008.
[15] D. Ciullo et al., "Understanding P2P-TV Systems Through Real Measurements," in Proc. IEEE GLOBECOM, November 2008.
[16] E. Alessandria et al., "P2P-TV Systems under Adverse Network Conditions: a Measurement Study," in Proc. IEEE INFOCOM, April 2009.
[17] S. Banerjee et al., "Construction of an Efficient Overlay Multicast Infrastructure for Real-time Applications," in Proc. IEEE INFOCOM, 2003.
[18] D. Tran, K. Hua, and T. Do, "ZIGZAG: An Efficient Peer-to-Peer Scheme for Media Streaming," in Proc. IEEE INFOCOM, 2003.
[19] D. Kostic et al., "Bullet: High Bandwidth Data Dissemination Using an Overlay Mesh," in Proc. ACM SOSP, Bolton Landing, NY, USA, October 2003.
[20] R. Rejaie and S. Stafford, "A Framework for Architecting Peer-to-Peer Receiver-driven Overlays," in Proc. ACM NOSSDAV, 2004.
[21] X. Liao et al., "AnySee: Peer-to-Peer Live Streaming," in Proc. IEEE INFOCOM, April 2006.
[22] F. Pianese et al., "PULSE: An Adaptive, Incentive-Based, Unstructured P2P Live Streaming System," IEEE Trans. on Multimedia, vol. 9, no. 6, 2007.
[23] J. Douceur, J. Lorch, and T. Moscibroda, "Maximizing Total Upload in Latency-Sensitive P2P Applications," in Proc. ACM SPAA, 2007, pp. 270-279.
[24] Y.-W. Sung, M. Bishop, and S. Rao, "Enabling Contribution Awareness in an Overlay Broadcasting System," in Proc. ACM SIGCOMM, 2006.
[25] N. Magharei, R. Rejaie, and Y. Guo, "Mesh or Multiple-Tree: A Comparative Study of Live P2P Streaming Approaches," in Proc. IEEE INFOCOM, May 2007.
[26] V. N. Padmanabhan, H. J. Wang, and P. A. Chou, "Resilient Peer-to-Peer Streaming," in Proc. IEEE ICNP, November 2003.
[27] M. Castro et al., "SplitStream: High-Bandwidth Multicast in Cooperative Environments," in Proc. ACM SOSP, October 2003.
[28] J. Liang and K. Nahrstedt, "DagStream: Locality Aware and Failure Resilient Peer-to-Peer Streaming," in Proc. SPIE Multimedia Computing and Networking, January 2006.
[29] N. Magharei and R. Rejaie, "PRIME: Peer-to-Peer Receiver-drIven MEsh-based Streaming," in Proc. IEEE INFOCOM, May 2007.
[30] M. Hefeeda et al., "PROMISE: Peer-to-Peer Media Streaming Using CollectCast," in Proc. ACM Multimedia, November 2003.

Hyunseok Chang received the B.S. degree in electrical engineering from Seoul National University, Seoul, Korea, in 1998, and the M.S.E. and Ph.D. degrees in computer science and engineering from the University of Michigan, Ann Arbor, in 2001 and 2006, respectively. He is currently a Member of Technical Staff with the Network Protocols and Systems Department, Alcatel-Lucent Bell Labs, Holmdel, NJ. Prior to that, he was a Software Architect at Zattoo, Inc., Ann Arbor, MI. His research interests include content distribution networks, mobile applications, and network measurements.

Sugih Jamin received the Ph.D. degree in computer science from the University of Southern California, Los Angeles, in 1996. He is an Associate Professor with the Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor. He cofounded a peer-to-peer live streaming company, Zattoo, Inc., Ann Arbor, MI, in 2005. Dr. Jamin received the National Science Foundation (NSF) CAREER Award in 1998, the Presidential Early Career Award for Scientists and Engineers (PECASE) in 1999, and the Alfred P. Sloan Research Fellowship in 2001.

Wenjie Wang received the Ph.D. degree in computer science and engineering from the University of Michigan, Ann Arbor, in 2006. He is a Research Staff Member with IBM Research China, Beijing, China. He co-founded a peer-to-peer live streaming company, Zattoo Inc., Ann Arbor, MI, in 2005, based on his Ph.D. dissertation, and was CTO. He also worked in Microsoft Research China and Oracle China as a visiting student and consultant intern.
