signal processing: image communicationmoncef/publications/rate-adaptation-ic-2011.pdf · rate...

Contents lists available at SciVerse ScienceDirect

Signal Processing: Image Communication

Signal Processing: Image Communication ] (]]]]) ]]]–]]]

0923-59

doi:10.1

n Corr

E-m# E

Pleasdistr

journal homepage: www.elsevier.com/locate/image

Rate adaptation for dynamic adaptive streaming over HTTP incontent distribution network

Chenghao Liu a,n, Imed Bouazizi b, Miska M. Hannuksela b,#, Moncef Gabbouj a,#

a Department of Signal Processing, Tampere University of Technology, Finlandb Nokia Research Center, Tampere, Finland

a r t i c l e i n f o

Keywords:

Rate adaptation

Dynamic Adaptive Streaming over HTTP

DASH

Adaptive HTTP Streaming

CDN

65/$ - see front matter & 2011 Elsevier B.V. A

016/j.image.2011.10.001

esponding author. Tel.: þ358 509349231.

ail address: [email protected] (C. Liu).

URASIP member.

e cite this article as: C. Liu, et aibution network, Signal Process. Ima

a b s t r a c t

Recently the 3rd Generation Partnership Project (3GPP) and the Moving Picture Experts

Group (MPEG) specified Dynamic Adaptive Streaming over HTTP (DASH) to cope with

the shortages in progressive HTTP based downloading and Real-time Transport Protocol

(RTP) over the User Datagram Protocol (UDP), shortly RTP/UDP, based streaming. This

paper investigates rate adaptation for the serial segment fetching method and the

parallel segment fetching method in Content Distribution Network (CDN). The serial

segment fetching method requests and receives segments sequentially whereas the

parallel segment fetching method requests media segments in parallel. First, a novel

rate adaptation metric is presented in this paper, which is the ratio of the expected

segment fetch time (ESFT) and the measured segment fetch time to detect network

congestion and spare network capacity quickly. ESFT represents the optimum segment

fetch time determined by the media segment duration multiplied by the number of

parallel HTTP threads to deliver media segments and the remaining duration to fetch

the next segment to keep a certain amount of media time in the client buffer. Second,

two novel rate adaptation algorithms are proposed for the serial and the parallel

segment fetching methods, respectively, based on the proposed rate adaptation metric.

The proposed rate adaptation algorithms use a step-wise switch-up and a multi-step

switch-down strategy upon detecting the spare networks capacity and congestion with

the proposed rate adaptation metric. To provide a good convergence in the representa-

tion level for DASH in CDN, a sliding window is used to measure the latest multiple rate

adaptation metrics to determine switch-up. To decide switch-down, a rate adaptation

metric is used. Each rate adaptation metric represents a reception of a segment/portion

of a segment, which can be fetched from the different edge servers in CDN, hence it can

be used to estimate the corresponding edge server bandwidth. To avoid buffer overflow

due to a slight mismatch in the optimum representation level and bandwidth, an idling

method is used to idle a given duration before sending the next segment. In order to

solve the fairness between different clients who compete for bandwidth, the prioritized

optimum segment fetch time is assigned to the newly joined clients. The proposed rate

adaptation method does not require any transport layer information, which is not

available at the application layer without cross layer communication. Simulation results

show that the proposed rate adaptation algorithms for the serial and the parallel

segment fetching methods quickly adapt the media bitrate to match the end-to-end

network capacity, provide an advanced convergence and fairness between different

clients and also effectively control buffer underflow and overflow for DASH in CDN. The

reported simulation results demonstrate that the parallel rate adaptation outperforms

ll rights reserved.

l., Rate adaptation for dynamic adaptive streaming over HTTP in contentge Commun. (2011), doi:10.1016/j.image.2011.10.001

www.elsevier.com/locate/image

www.elsevier.com/locate/image

dx.doi.org/10.1016/j.image.2011.10.001

mailto:[email protected]

http://www.eurasip.org

dx.doi.org/10.1016/j.image.2011.10.001

dx.doi.org/10.1016/j.image.2011.10.001

C. Liu et al. / Signal Processing: Image Communication ] (]]]]) ]]]–]]]2

Please cite this article as: C. Liu, et adistribution network, Signal Process. Ima

the serial DASH rate adaptation algorithm with respect to achievable media bitrates

while the serial rate adaptation is superior to the parallel DASH with respect to the

convergence and buffer underflow frequency.

& 2011 Elsevier B.V. All rights reserved.

1. Introduction

Recently, Hypertext Transfer Protocol (HTTP) has beenwidely used for the delivery of real-time multimedia contentover the Internet, such as in video streaming applications.Unlike the use of Real-time Transport Protocol (RTP) overUser Datagram Protocol (UDP), HTTP [1] is easy to configureand is typically granted traversal of firewalls and networkaddress translators (NATs), which makes it attractive formultimedia streaming applications. Krasic et al. [2], Wanget al. [3] and Kim and Ammar [4] reported that short-termtransmission rate variation in HTTP/TCP can be smoothed outby receiver side buffering. Thus, commercial solutions, suchas Microsoft Smooth Streaming [5] and Adobe DynamicStreaming [6], have been launched as well as standardizationprojects have been carried out. Adaptive HTTP streaming(AHS) was first standardized in Release 9 of 3GPP packet-switched streaming (PSS) service [7]. MPEG took 3GPP AHSRelease 9 as a starting point for its newly published MPEGDASH standard [8]. 3GPP continued to work on adaptiveHTTP streaming in communication with MPEG and recentlypublished the 3GP-DASH (Dynamic Adaptive Streaming overHTTP) [9]. MPEG DASH and 3GP-DASH are technically similarand are therefore collectively referred to as DASH in thispaper. In DASH, the client continuously requests and receivessmall segments of multimedia content, denoted as mediasegments. To adapt the media bitrate to the varying networkbandwidth, DASH allows clients to request media segmentsfrom different representations, each of which represents aspecific media bitrate.

A DASH client might suffer from frequent interruptionsand sub-optimum media bitrates without an efficient rateadaptation algorithm. A media bitrate higher than the shar-able bandwidth would cause network congestion. As DASHtypically uses Transmission Control Protocol (TCP) as trans-port layer protocol, network congestion causes TCP conges-tion control mechanism to become active, which may furtherresult in a dramatic decrease in throughput. Hence, bufferedmedia data can be drained-up much faster in DASH com-pared to the traditional RTP/UDP-based streaming in case ofnetwork congestion. On the other hand, i.e., when the mediabitrate is lower than the sharable bandwidth, the mediaquality does not reach the optimum allowed by the availablebandwidth. Many of the Internet’s real-time video servicesuse a progressive download approach and hence suffer fromthe above mentioned problems, such as frequent playbackinterruption and sub-optimum streaming quality. Manyvideo services offer a set of pre-defined quality levels of avideo clip to users for manual a-priori selection. If the bitrateof the selected representation turns out to be higher than theavailable end-to-end bandwidth, then the user will mostprobably experience playback interruptions and re-bufferingevents due to buffer underflow. Otherwise, if the bitrate ofthe representation is lower than the available network

l., Rate adaptation fge Commun. (2011),

bandwidth, then the user will consume the content at asub-optimal quality. Since the client will download contentfaster than the playout rate, it could result in inefficient use ofavailable bandwidth if the user stops watching the content.An efficient rate adaptation algorithm is required for DASH tosolve the above problems, specifically, the frequent interrup-tion in playback and sub-optimum streaming quality. How-ever, these problems become even more challenging becauseof the difficulties in differentiating between the short-termthroughput variation, incurred by the congestion control, andthe average throughput changes due to more persistentbandwidth changes. Furthermore, the rate adaptation algo-rithm of DASH should take into account the infrastructure ofthe networks.

The research issues of the rate adaptation for DASHinclude the following aspects. First, the rate adaptationmethod must deploy a metric to determine if the bitrateof a specific representation matches the available end-to-end bandwidth. This metric is expected to distinguishbetween persistent throughput variation due to networkbandwidth changes and the short-term throughputdynamics attributable to the TCP congestion control. Inaddition, the metric should identify network congestionand spare network bandwidth fast enough in order toreact promptly and reach the optimum representationlevel as soon as possible, wherein the higher representa-tion level represents the higher level of media bitrates andthe lower representation level represents the lower levelof media bitrates. Second, the rate adaptation algorithmshould prevent frequent hopping between media repre-sentations, because frequent changes in the perceivedmedia quality are likely to be annoying. In segmentdelivery over distributed networks, such as content dis-tribution network (CDN), preventing too frequent repre-sentation switching becomes more challenging assegments can be transmitted through different routeshaving different bandwidths. Thirdly, the rate adaptationalgorithm should avoid client buffer underflow and over-flow. Buffer underflow causes playback interruptions andoverflow can result in bandwidth waste. Fourthly, the rateadaptation algorithm for DASH should provide good fair-ness between different DASH clients, which compete foravailable bandwidth in the network. Finally, the mediasegment duration should be set appropriately in order tosmooth the short-term HTTP/TCP throughput variationand to provide low enough adaptation latencies.

Two strategies for issuing HTTP requests in DASH havebeen proposed, namely the serial segment fetchingmethod [10] and the parallel segment fetching method[11]. In the former, segments are requested and receivedone after another, whereas in the latter, media segmentsare requested in parallel using several TCP connections.This paper presents novel rate adaptation algorithms forboth the serial and the parallel segment fetching methods.

or dynamic adaptive streaming over HTTP in contentdoi:10.1016/j.image.2011.10.001

dx.doi.org/10.1016/j.image.2011.10.001

C. Liu et al. / Signal Processing: Image Communication ] (]]]]) ]]]–]]] 3

The main contributions of this paper include thefollowing two aspects. First, we propose a novel rateadaptation metric for DASH. The proposed rate adaptationmetric compares the expected segment fetch time ðESFTÞ

with the measured segment fetch time ðSFTÞ to determineif the media bitrate matches with the current bandwidth.The ESFT is the optimum segment fetch time, which isestimated each time before requesting a segment basedon (a) the segment duration multiplied by the number ofparallel segment fetching threads (b) the remaining timeperiod to fetch the next segment. The former is deter-mined by the scheduling method of the serial and theparallel segment fetching methods, whereas the latter isdecided by the real-time buffered media time. The rateadaptation metrics not only take into account the idealsegment fetch time of the serial and the parallel segmentfetching methods but also consider the buffered mediatime. Hence, the proposed rate adaptation metrics canprovide convergence in the representation level andprevent buffer underflow.

Second, we propose two novel rate adaptation algo-rithms for the serial and the parallel segment fetchingmethods in CDN, respectively. The proposed rate adapta-tion algorithms for the serial segment fetching methoduses a single rate adaptation metric to determine a switchup/down of representation levels. The proposed rate adap-tation algorithm for parallel segment fetching methoddeploys a sliding window to select the minimum rateadaptation metrics among multiple metrics. The minimumrate adaptation metrics is used to determine the switch-up/switch-down in order to match the media bitrates to theminimum bandwidth in the distributed edge servers ofCDN. Thus, the sliding window strategy of the proposedrate adaptation algorithm for the parallel segment fetchingmethod can efficiently prevent frequent hopping betweenrepresentation levels. Furthermore, the proposed rate adap-tation algorithms give a priority for newly joined DASHclients in order to share the available bandwidth fairlyamong DASH clients. In case of congestion, DASH clientsconsuming a higher representation level will quickly switchdown to a lower level firstly in order to leave room for DASHclients having a lower level using the proposed rate adapta-tion algorithm. Furthermore, in order to efficiently usenetwork bandwidth and memory resources for the users, amethod of determining the idle time before sending anHTTP GET request for the next segment is presented, thuslimiting the maximum amount of pre-fetching media.

The rest of this paper is organized as follows. Section 2presents the related work to describe the literature worksrelating to the rate adaptation for DASH. Section 3 gives abrief overview of the terminology of DASH, and DASHinfrastructure using a CDN. The serial and the parallelsegment fetching methods are presented in Section 4. InSection 5, a novel rate adaptation metric for DASH ispresented. Sections 6 and 7 provide further details on thetwo novel rate adaptation algorithms for the serial andthe parallel segment fetching methods in CDN, respec-tively. The simulation results of the proposed rate adapta-tion algorithms for the serial and the parallel segmentfetching methods in CDN are provided in Section 8. Finally,Section 9 concludes the paper.

Please cite this article as: C. Liu, et al., Rate adaptation fdistribution network, Signal Process. Image Commun. (2011),

2. Related work

In the last decade, several research works have beenconducted in the rate adaptation for media streamingover TCP, called TCP-based streaming in this paper. In thescenario of the TCP-based media streaming, most of theresearch works focused on sender-driven rate adaptation.For example, Lam et al. [12] proposed a method to estimateclient-side buffer occupancy in the server and to adaptmedia bitrates to maintain the client-side buffer occupancyabove a certain threshold.

Rate adaptation for DASH is a relatively new researchtopic and thus only few research works have been carriedout in this field. In DASH, decisions to adapt mediabitrates are made at the client side where the client bufferoccupancy is known and can be used directly for the rateadaptation. For example, in one of the earliest papers onadaptive streaming over HTTP by Farber et al. [13], rateadaptation decisions were made on the basis of mediaduration in the client buffer. However, client bufferoccupancy based rate adaptation for DASH has the follow-ing drawbacks. First, this method can cause hopping inthe representation level. As the choices of representationlevels are limited, it is difficult to match the optimummedia bitrate to the network capacity. Typically, theoptimum media bitrates should be the highest bitrate,which is lower than the bandwidth. In such a scenario, theclient buffer occupancy, which can be measured asbuffered media time, will dramatically increase even ifthe rate adaptation reaches the optimum media bitrate.Hence, the rate adaptation cannot converge to the opti-mum representation level as the client buffer occupancywill change even when the rate adaptation reaches theoptimum media bitrates. Second, the client buffer basedrate adaptation can result in unfairness between differentclients, which compete for bandwidth. For example, someclients can switch to a high-bitrate representation if ade-quate bandwidth is available, while other clients might notbe able to fetch equally high-bitrate representation if thenetworks are already on the stage of saturation when theystart streaming. Efficient rate adaptation algorithm shouldprovide a fair allocation of the network bandwidth to clients.Kuschnig et al. [14] presented a method of priority anddeadline driven rate adaptation method for scalable videostreaming over TCP using bandwidth estimation and clientbuffer estimation at the server. To stabilize TCP throughputover a given period of time and decrease quality fluctuation,relatively large groups of pictures (GOPs) are used toestimate TCP throughput by deploying the application layerbased and TCP stack based bandwidth estimation techni-ques. As pointed out in [14], the accuracy of the bandwidthestimation method suffers from the large GOP, which iscoped with the priority and deadline driven rate adaptationmethod proposed in the same paper. Recently, Akhshabiet al. [15] reported experimental evaluation of the rateadaptation in DASH. In [15], the experiment is performedusing the Microsoft smooth streaming, Netflix player, andAdobe OSMF player to evaluate the internal rate adaptationalgorithms with certain experimental methodology.

In our previous work [10], a rate adaptation algorithmfor the serial segment fetching method was presented. In


dx.doi.org/10.1016/j.image.2011.10.001


[10], the ratio of the segment duration and the latestsingle segment fetch time was used to detect networkcongestion and spare network capacity. However, the rateadaptation algorithm in [10] has the following short-comings. First, the rate adaptation method was developedfor the case where segments are requested and receivedsequentially, and hence ESFT is determined in [10] to beequal to the media segment duration ðMSDÞ. However, forthe parallel segment fetching method where the number ofsimultaneous HTTP transactions can be larger than one,ESFT should also be proportional to the number of parallelHTTP threads. Second, the buffered media time status of theclient buffer is not taken into account in the rate adaptationmetric proposed in [10], which causes buffer underflow dueto the lower buffered media time. To prevent buffer under-flow, the buffered media time should be used as the metricfor determining the next segment media bitrates. Third, therate adaptation method compares the ratio of the segmentduration by the latest received single segment fetch timewith the switch-up and switch down thresholds to adaptthe representation level for the serial segment fetchingmethod. In the parallel segment fetching method, multi-ple segments are received simultaneously, hence multiplesegment fetch times should be used to decide whether ornot to perform switching-up in order to improve conver-gence to a single representation level when the networksresource capacity is not changed severely. Fourth, thefairness issue between different DASH clients has notbeen discussed in [10].

DASH may employ the current web distribution net-work infrastructure, e.g., proxy caches and CDN. Cur-rently, CDNs are widely used to deliver media and webdata to end users through globally distributed edgeservers. CDNs, such as Akamai, deploy Domain NameSystem (DNS) based request redirection technique toredirect a client request to an appropriate edge serveramong multiple edge servers with the predefined policies.Each client request is used to fetch a media segment inDASH. The detailed request redirection mechanism andthe redirection frequency are described in Section 3.

Scheduling of HTTP GET requests for segment fetchingplays a critical role in achievable streaming bitrates forDASH in CDN. However, the specific scheduling method ofrequesting segments has not been specified in DASH. Onlythe informative instructions of the client behavior areprovided in both 3GP-DASH and MPEG-DASH. One com-mon method is the serial segment fetching method, whichrequests and receives segments one after another. For theDASH in CDN, the serial segment fetching method mightnot yield the optimum streaming service to DASH clientssince it only operates one HTTP thread at a time to delivermedia segments over the distributed edge servers in CDN,which results in inefficient usage of the distributed net-work resources. The parallel segment fetching methodcan occupy network resources of CDN more exhaustivelyas multiple HTTP threads are used to deliver the mediasegments in parallel over distributed networks.

In our previous work [11], a parallel adaptive HTTPstreaming method was presented. The method schedulesHTTP GET segment requests in parallel at the DASH client,for efficiently utilizing the distributed networks resources


and improving the achievable streaming quality. Specifi-cally, our previous paper presented a method, which (a)enables a receiver to request multiple segments in paral-lel, (b) provides a solution to maintain a limited numberof HTTP sessions for receiving segments in parallel and todetermine when to start a new HTTP session to requestthe next segment and (c) can adapt the received mediabitrate while receiving previously requested segments.For the rate adaptation algorithm, [11] deployed a simplerate adaptation algorithm, which uses the buffered mediatime to determine the media bitrates, hence results insimilar problems as described above for [13]. Since thefocus was on improving achievable representation level,the simulation results in [11] were limited in reportingthe average representation level for a single DASH clientuser. In this paper, we investigate the rate adaptation forthe serial and parallel segment fetching in CDN.

In a multiple TCP connection based media streaming,requesting a set of media chunks using multipleTCP streams was presented in [16,17]. Multiple TCPconnection based media streaming was previously pre-sented in [18]. The work in [16,17] extended multiple TCPconnections based media streaming to the request-response based internet video streaming. In the samepapers, the authors presented analytical models for thetotal achievable TCP throughputs for the request-responsebased TCP streaming using multiple TCP connections.Although this paper uses the parallel HTTP requests, theaim and the underlying principles are totally differentfrom those in [16,17], where the aim of multiple TCPconnections was to provide resilience against short-terminsufficient bandwidth using multiple TCP connections fora streaming application. In contrast, the target in ourprevious paper [11] was to fully utilize the distributednetwork resources, for example CDN, to improve HTTP-Streaming quality at the client. The underlying principle ofthe works [16–18] was that by allocating a certain bottle-neck bandwidth to multiple TCP streams, the TCP through-put variation can be smoothed out and the achievable TCPthroughput can be increased to some degree as multiple TCPstreams do not observe packet losses at the same time. Incontrast, [11] is inspired by the distributed infrastructureof CDN.

Evensen et al. [19] presented a method of deployingmultiple heterogeneous interfaces to improve streamingquality. They proposed to divide a segment into multiplesub-segments dynamically and request sub-segmentsusing distributed requests over multiple heterogeneousinterfaces simultaneously. As [19] uses sub-segments,several techniques for partitioning segments into sub-segments were studied and the paper proposed a dynamicand static sub-segment partitioning method. In contrast,our previous paper [11] used segment as a fetching unit,which makes the parallel segments fetching methodsimpler compared to the parallel sub-segment fetchingmethod [19]. The underlying principle of paper [11] is theefficient utilization of the distributed edge servers in CDN.On the contrary, [19] considered a device using multipleheterogeneous network interfaces, such as a smart phoneequipped with the ability to transfer data through WLANand HSDPA networks simultaneously.


dx.doi.org/10.1016/j.image.2011.10.001


3. Terminology and architecture of DASH in CDN

In this section, we shall first explain the terms anddefinition of DASH as specified in 3GPP PSS. Then, theend-to-end systems of DASH, including the DASH contentprovider, DASH server and DASH client, are described. Inaddition, the media distribution architecture in CDN ispresented.

In DASH, the media presentation description (MPD)provides the necessary information for clients to establisha dynamic adaptive streaming over HTTP. The MPDcontains the metadata, such as an HTTP-URL of eachsegment, to make GET segment request. The segmentcontains a certain duration of media data, and metadatato decode and present the included media content. Eachrepresentation consists of multiple media segments andrepresents the coding choice such as encoded bitrate,spatial and temporal resolution, etc. The representationis identified through a unique identifier (ID). The clientcan obtain the available number of representations andthe corresponding properties by accessing the MPD.

Fig. 1 shows a media distribution architecture usingDASH in CDN. An end-to-end DASH system consists of aserver, clients and a content provider.

DASH content preparation prepares the DASH compli-ant media content in an offline segment creation mode, orin a reactive segment creation. In the offline segmentcreation mode, the MPD and the media segments for therepresentations are created before providing client accessto the content. In the reactive segment creation mode, thecontent provider constructs the media segments accord-ing to the received HTTP GET request from a media file,such as an MP4 file, which contains multiple representa-tions. In the second mode, for example, the client requestmay contain the segment start time, segment durationand representation level. The content provider firstlyparses the HTTP request and encapsulates part of anMP4 file to a segment as specified in the HTTP request.A DASH content provider sends DASH compliant mediadata to a DASH server for streaming. The DASH server canbe a standard web server to provide dynamic adaptivestreaming over HTTP to DASH clients.

The DASH client firstly accesses MPD to obtain therepresentation and the address to locate each segment formaking HTTP GET segment request. The client continuouslyrequests and receives the segments from the DASH server.DASH further enables the client to request media segmentsfrom different representations for reacting to varying

DASH server

DASHcontent preparation

CDN server

DNSEdge serversCDN

DASH-clientsgroup

Fig. 1. System for Dynamic Adaptive Streaming over HTTP (DASH) in

CDN.


network resources. The rate adaptation can occur each timebefore requesting a segment.

As DASH facilitates the use of the current web replica-tion and caching infrastructure, CDN is used in this paperto deliver media segments to DASH clients. CDN hasachieved great success in delivering web and media data.The infrastructure of CDN, such as Akamai, consists of a setof edge servers, which deliver content to clients as shown inFig. 1. CDN can redirect a client request, such as a GET mediasegment request, to an appropriate edge server amongmultiple edge servers.

The edge server selection method, especially the fre-quencies of switching edge server, affects the perfor-mance of serial and parallel segment fetching methods.For locating the edge server from which an end userreceives a segment, the DNS request redirection is used toredirect a client request to an appropriate edge server inCDN such as Akamai [20]. The frequencies of switching theedge server depends on the deployed metrics by the DNSbased request redirection.

To select an edge server from multiple edge servercandidates, the metrics of proximity, content cachingpossibility and edge server load can be used in a DNSrequest redirection system [20]. Proximity indicates thedistance between the edge server and the client and aclose-by edge server will be selected. Edge server load ismeasured as the bandwidth and processing loads of theedge servers. The edge server load balancing algorithmdistributes traffics among edge servers targeting a uni-form load distribution at the edge servers. Basically, theproximity based edge server selection method results inthe infrequent switching between edge servers since thedistance between the edge server and the client is rela-tively static. In contrast, the load based edge server selectionmethod causes frequent switching when balancing the loadbetween different edge servers in order to improve theoverall performance.

To achieve improved load balancing, short time-to-live(TTL) is typically used in the DNS based request redirec-tion, also called DNS resolution. For DNS based requestresolution [21], the authoritative DNS name server andlocal DNS server cooperate in a hierarchical way. The localDNS server sends a DNS request to an authoritative DNSserver, which resolves the CDN server name to the IPaddress of the edge server. In the IP address mappingreceived from authoritative DNS server, a time-to-live(TTL) is typically contained for the mapping. When thelocal DNS server queries the authoritative DNS server, thelocal DNS server will cache the content record for the timespecified in the TTL. If the local DNS server receives aquery about the caching name server for the mentionedrecord within the TTL, the local DNS server resolves the edgeserver IP address with the cached content recoding insteadof querying it from the authoritative name server. Krishna-murthy et al. [21] reported that generally a low TTL isassigned in order to assist the DNS based load balancingamong the distributed edge servers. The authors claimedthat the local DNS servers are forced to process frequentDNS lookups by assigning a small DNS TTL for the contentrecord. Therefore, CDN can obtain control for utilizing thedistributed edge servers efficiently by taking advantage of


dx.doi.org/10.1016/j.image.2011.10.001


the load balancing algorithm. As an observation of the realCDN request redirection, they reported that the Akamai andAdero, Digital Island, Speedera and Fasttide assigned 10, 20,20, 120 and 230 s of TTL, respectively.

Recently, Adhikari et al. [22] presented research resultsof uncovering load balancing and server selection strate-gies deployed in Youtube using a reverse engineeringbased methodology. They reported that Youtube deploysthe static load sharing using hash based mechanism, asemi-dynamic approach using location aware DNS resolu-tions and dynamic load-sharing using HTTP redirections.In the first strategy, Youtube maps video-id to a uniquehostname in each of the namespace in the hierarchicalDNS based host namespaces. In the second strategy, theproximity information is used in request redirectionaccording to the ‘‘normal hours’’ and ‘‘busy’’ hours. Inthe third strategy, Youtube uses local ‘‘intra-tier’’ load-sharing before resorting to ‘‘inter-tier’’ load-sharing tofurther balance the load among different edge servers.The research results in [22] showed that the load balan-cing is deployed in practical applications such as Youtubeto smooth the un-even load distribution caused by thecombination of spontaneous video demands and dynamicvideo popularity. The ‘‘inter-tier’’ load-sharing infers thatfrequent edge server switching is possible to balance theload in the distributed edge servers.

Therefore, redirecting a HTTP request for fetching aspecific segment to a specific edge server depends on theload balancing algorithm of DNS based request redirec-tion and TTL value used in CDN. In addition, the dynamiccaching status of media segments and the varying net-work resources further affect the load balancing algo-rithm to select an appropriate edge server. Several factors,including load balancing algorithm, TTL value used anddynamic caching and network resources together deter-mine the switching between different edge servers for theconsecutive requests from the same client. Specifyingsuch frequency is out of the scope of this paper. Tofocused on rate adaptation algorithm of DASH, the pro-posed rate adaptation algorithm is presented as transpar-ent of the edge server switching frequencies.

In the content outsourcing, pull-based and push-basedoutsourcing can be utilized [23]. In the pull-based contentoutsourcing, the client request is redirected to the selectededge server as discussed according to the above policiesusing the DNS redirection or URL rewriting methods. Thepull-based outsourcing can be classified into uncooperativeand cooperative outsourcing. For both uncooperative andcooperative pull-based outsourcing, if the requested seg-ment is cached in the selected edge server, then the clientfetches the segment directly from the edge server. In thecase of a cache miss, the uncooperative outsourcing redir-ects the request either to the CDN server such as Akamaiserver or to the original server. For the cooperative out-sourcing, the edge servers cooperate with each other toresponse to the request from a client, in case of a cache miss.The push-based content outsourcing distributes the con-tents to the edge servers proactively. As the push-basedoutsourcing is on the theoretical study stage, this paperfocuses on the pull-based outsourcing. For the uncoopera-tive and cooperative outsourcing we focus on the former


strategy as most of CDN providers, such as Akamai, deploythis method [23].

Thanks to the distributed resources offered by CDNs, aDASH service over a CDN can improve the scalability instreaming clients. However, the achievable bandwidth fordelivery of media segments is still limited. In mediadelivery over CDN, congestion might not only occur inthe ‘‘last mile’’ of media delivery network, e.g. the linkfrom the Internet service provider to the end user equip-ment, but also in the distributed edge servers. Further-more, more and more end users are connected to theInternet using a high-bandwidth access link. Hence,efficient utilization of the distributed edge servers inCDNs is one of the key factors to provide high qualitymedia in DASH. Different segments can be delivered toDASH clients through different edge servers in CDN withrequest redirection and load balancing algorithm. There-fore, a parallel segment fetching method was proposed inour previous paper [11] to utilize the limited bandwidthin the distributed edge servers more efficiently.

As shown in Fig. 1, the scenario of multiple DASHclients, which compete for the distributed bandwidths inthe edge servers, is discussed in this paper. To investigatethe uncooperative pull based content outsourcing in thescenario of multiple DASH clients, different DASH clientsrequest a different media clip. By requesting differentmedia clips, each GET segment request is sent to the CDNserver, which then pulls the segment from the initialserver.

4. Serial and parallel segment fetching methods

In this section, the serial and the parallel segmentfetching methods are briefly described, [10,11]. A sum-mary of the terms and a definition of the symbolsappearing in this paper are shown in Table 1.

The serial segment fetching method [10] is a commonmethod to request a segment where segments are requestedand received one after another. Fig. 2 shows an example ofrequesting segments serially. The horizontal axis denotesthe time of requesting and receiving segments and theperiod of time to receive segment #i (rec segment #i) isdepicted in Fig. 2. Note that in general the next segmentneed not be requested immediately after completing thereception of the previous segment, but rather segmentrequests can be scheduled according to the buffer occu-pancy and the rate adaptation strategy of the client. Thedetailed flowchart of the serial segment fetching methodwill not be discussed further because of its simplicity.

In the parallel segment fetching method [11] segmentsare requested in parallel. Fig. 3 shows an example of theparallel segment fetching method. In addition, the timeaxis is depicted to scale the time period to request andreceive segments. As shown in Fig. 3, the parallel segmentfetching method maintains a certain number of parallelHTTP threads to request segments in parallel, e.g., twoHTTP threads are used in this example. The purple anddark blue bars denote the receiving segments over the firstand the second HTTP threads, respectively. The purple anddark blue lines indicate the time axis to request and receivethe media segments, respectively. The parallel segment


dx.doi.org/10.1016/j.image.2011.10.001

Table 1Terms and definition of symbols.

Notation Definition

A Slope of the linear function of mk used in Eq. (2)

brcEncoded media bitrate of the current representation level ri

briMedia bitrate of the representation level, wherein r0 and rN�1denote the lowest and the highest representation

bmax Media bitrate of the highest representation level

bmin Media bitrates of the lowest representation level

bmtc Current buffered media time

bmtmin Minimum buffered media time used to determine tid

c The current representation level

DASH Dynamic adaptive streaming over HTTP

d Ratio of the minimum received portions duration and segment duration to determine s0 for smoothing the short-term TCP variation

ed Switch-down factor

ecd Switch-down factor calculated by the current representation level

emind

Minimum switch-down factor calculated by all representation levels

EPFT Expected portions fetch time

eu Switch-up factor

ecu Switch-up factor calculated by the current representation level

emaxu Max switch-up factor calculated by all representation levels

ESFT Expected segment fetch time

uk The ratio of the duration of the received portion and the duration of the full segment, used for scheduling a new parallel HTTP request

ISFT Ideal segment fetch time

k Parallel index, wherein the latest segment has the index 0 and the earliest segment has the index np�1

lk Threshold of jk to schedule a new parallel HTTP request

l0 First term of the linear function of mk used in Eq. (2)

MPD Media portion duration denoting the playback duration contained in the received portions

MPDk Received media portion duration of a segment with the parallel index k

MSD Media segment duration denoting the playback duration contained in a segment

MSDk Media segment duration contained in the segment with the parallel index k

noo Number of the out order received segments

nuoo Upper limit of noo

np Number of parallel HTTP threads to deliver media segments

nup Upper limit of HTTP threads number to deliver media segments in parallel

PFT Portion fetch time denoting the spent time for fetching the portions of a segment

PFTM Portion fetch time based rate adaptation metric

RAMs Rate adaptation metric, including PFTM and SFTM, with the rate adaptation index tRAMs0

The rate adaptation metric with the rate adaptation index t0, which is determined as Eq. (21)

rec segment

#i

spent time to fetch segment #i

qRSFT,

s factor

ri Representation level i

RSFT Remaining segment fetch time

RSFT,

pThe priority remaining segment fetch time for parallel segment fetching method which will be used in the beginning phase of DASH

RSFT,

sThe priority remaining segment fetch time for serial segment fetching method which will be used in the beginning phase of DASH

SFT Segment fetch time

SFTM Segment fetch time based rate adaptation metric

s Rate adaptation index of the segment/portions which starts from t0 and ends at t0þw�1

s0 The staring rate adaptation index of the segment/chunk

TBMT Target buffered media time

tid Idle period of time

t_reqi Time to request a segment #i

tsi Playback timestamp of the first frame of a requesting segment # i

tsns Playback timestamp of the first frame of the next segment

ts0 The current playback time stamp at the time instant of requesting the next segment

up Upper limit of parallel HTTP threads

w Rate adaptation window size which determines the account of RAM used in the rate adaptation

Fig. 2. An example of serial segment fetching method.


fetching method requests segment iþ1 at t_reqiþ1 usingthe second HTTP thread when segment i is still beingreceived using the first HTTP thread by the client. From


time t_reqiþ1 to the time to receive the last byte of segment#i, the two segments, i.e., segment #i�1 and segment #iþ1are received in parallel.


dx.doi.org/10.1016/j.image.2011.10.001

Fig. 3. An example of parallel segment fetching method.


The scheduling method of HTTP GET segment requestsin parallel is described as follows. The number of activeparallel HTTP threads ðnpÞ is compared with predefinedupper limit ðnu

pÞ to limit the maximum number of parallelHTTP threads to deliver media segments in parallel. Werecommend to set nu

p as 2, since two parallel HTTP requestsalready can improve the achievable media bitrates accord-ing to the simulation results [11]. In addition, a large valueof nu

p may increase the possibility of receiving segments outof order, which may further cause buffer drain-up, alsocalled buffer underflow.

Each segment received in parallel is identified with aparallel index k, wherein the last requested parallel indexis equal to 0 and the earliest requested parallel index isequal to np�1. When requesting a new segment in parallel,all parallel indexes are updated in turn as mentioned above.The parallel index is different from the segment IDs markedas #i in Fig. 3. In each received segment, the portion ofreceived media data is referred to as received portion andmedia duration contained in a received portion is denotedas MPD.

The ratio of the duration of the received portion andthe duration of the full segment, jk, is derived as follows

jk ¼MPDk=MSDk, 0rkrnp�1 ð1Þ

The DASH client compares jk with the correspondingthreshold mk for each parallel received segment k fromindex 0 to index np�1, to determine if a new HTTP threadneeds to start to fetch the next segment. If jk is largerthan or equal to mk for all k from 0 to np�1, then a newrequest may be sent and the client checks if the remainingconditions to send the next HTTP request are met. Other-wise, (if jk is lower than mk for any k in the intervalf0;np�1g), the new request will not be sent. The remain-ing conditions to send the next request will be presentedin the following paragraphs. To receive the earlier seg-ment in time, a linear function mk is used as

mk ¼ Akþm0, 0omko1 ð2Þ

In case mk is larger than or equal to 1, it is set as 1. Inthis paper, A is set as 0.2 as in our previous paper [11]. m0

can be set as 1/np. Larger m0 and A will assist to success-fully receive the earlier requested segments but deducethe efficiency in improving the achievable media bitrates.

In addition, the requested segments may be receivedout of order. To assist receiving the earlier requestedsegments in time, the client does not request a newsegment in parallel in case the gap between the latestrequested segment ID and the earliest requested segment


ID is equal to or larger than its upper limit ðnuooÞ. The

smaller nuoo will facilitate to receive the earlier requested

segments in time but will limit the capability of parallelsegment requesting method with respect to improvingreceiving media bitrates. In this paper, nu

oo is set as 5.If np is smaller than nu

p , noo is smaller than nuoo, and jk

is larger than or equal to mk for all k from 0 to np�1 and, anew HTTP request is sent immediately or after a certainidling period derived from the buffered media time. Thebuffered media time is limited for preventing bandwidthwaste due to the fact that a client may stop viewing acertain media clip before playing out the whole down-loaded media data. The idling period between two con-secutive requests is used to limit the maximum amount ofbuffered media time. The idling period must not be toolong so that the buffered media time can smooth out themost severe bandwidth decrease, i.e., from the highestbandwidth to the lowest bandwidth. Therefore, the idletime period is calculated as

tid ¼ bmtc�bmtmin�MSDUbmax=bmin ð3Þ

If tid is greater than 0 then the client idles tid amount oftime before requesting the next segment. Otherwise, (if tid

is less than or equal to 0), the client requests a new segmentimmediately. In case of a small MSDUbmax=bmin, bmtmin

should be set to a relatively large value to increase thecondition of bmtc for idling and to prevent buffer underflowdue to sudden bandwidth decrease. In case of a largeMSDUbmax=bmin, bmtmin can be set as a relatively smallvalue because bmtc for idling becomes large enough. bmtmin

is set to 0 in this paper since MSDðbmax=bminÞ is already largeenough with the selected media bitrates of the representa-tions in the simulation to prevent buffer underflow. Forrequesting the next segment, the rate adaptation moduleshould be performed to determine the representation levelof the next segment. Then, np is increased by one when anew HTTP thread is started to request the next segment inparallel.

To conclude the discussion above, the DASH clientdoes not request a new segment and continues to receivethe previously requested segments in the following threecases. First, the number of segments being received equalsto the limit of the number of parallel HTTP threads, i.e. np

is equal to nup . Second, jk is smaller than mk in one of the

segments being received. Third, the idle period is greaterthan 0.

With the chunked transferring mode of HTTP, the DASHserver can split a media segment into small portions andtransmit a segment as several portions. The client can receive


dx.doi.org/10.1016/j.image.2011.10.001


a segment portion-by-portion until receiving the wholesegment. When new portions of a segment are received bythe client, the client updates the duration of the receivedportion ðMPDkÞ for the parallel received segment. If thewhole segment is received by the client, then np is decreasedby one. Finally, if all segments are received or the user endsthe current DASH, then the current parallel segment fetchingis terminated. The detailed scheduling algorithm is presentedin the following pseudocode.

Scheduling algorithm for HTTP GET segment requests for the parallel segment fetching method in CDN [11]

Initialize the current representation level i as 0

Send HTTP GET request to fetch the first segment immediately

While User has not stopped the current DASH and more segments are available for fetching

Wait for an event

If Received portions of the requested segments

If The received portions of a segment having parallel index k

Update MPDk

If Received a whole segment

np ¼ np�1

Else If Timeout of the scheduled event to send the next HTTP GET request

Send the next HTTP GET request immediately (after processing the rate adaptation module)

np ¼ npþ1

If np onup and noo onu

oo

k¼ 0

While konp

If jk Zmk ,

k¼ kþ1

ElseBreak

End while

If k¼ np

Calculate the idle period tid as Eq. (3)

If tid 40

Schedule an event to send the next HTTP GET request after idling tid amount of time

ElseSend the next HTTP GET request immediately (after processing the rate adaptation module)

np ¼ npþ1

End while

5. Proposed rate adaptation metric

This section presents the proposed rate adaptationmetric for DASH, called segment fetch time ðSFTÞ rateadaptation metric ðSFTMÞ, which compares ESFT with themeasured SFT. The former represents the optimum seg-ment fetch time and the latter denotes the measuredsegment fetch time. It is well known that the instanta-neous TCP transmission rate is dynamically changinghence it is not feasible to measure the streaming mediabitrate capacity using the instantaneous HTTP/TCP recep-tion rate. Instead, a DASH client estimates the optimumsegment fetch time, i.e., ESFT each time before requestinga segment. After receiving a segment, the estimated ESFT

is compared with the measured SFT to determine if themedia bitrate of the current representation matches theavailable end-to-end bandwidth capacity.

In the parallel segment fetching method, the HTTP GETrequest can be sent when a portion of the previous segment,


shortly called as portion in this paper, is received. As thereception data rate of the latest received portion canrepresent the prevailing network bandwidth, the measuredportion fetch time ðPFTÞ is compared with the correspond-ing expected portion fetch time ðEPFTÞ, which is used asanother rate adaptation metric denoted as PFT rate adapta-tion metric ðPFTMÞ.

Specifically, SFT denotes a period of time from the timeinstant of sending a HTTP GET segment request to the

instant of receiving the last byte of the requested seg-ment; while the PFT denotes a period of time from thetime instant of sending a HTTP GET segment request tothe instant of receiving the last byte of the portion of therequested segment. To smooth out short-term variationsin HTTP/TCP reception rate, the measuring period shouldbe selected appropriately. Typically a longer period iscapable of producing a smoother throughput measure-ment. However, a longer period will also result in a slowerrate adaptation behavior. Based on the theoretical analy-sis of smoothing short-term TCP throughput reported inour previous paper [24] and experimental results reportedin the same paper, a measuring duration around 7–10 s issufficient to smooth out the varying instantaneous HTTP/TCP reception rates.

5.1. Expected segment fetch time for DASH

The spare network bandwidth and network congestioncan be identified by comparing ESFT with the measuredsegment fetch time. ESFT is determined by the following


dx.doi.org/10.1016/j.image.2011.10.001

Fig. 4. An example of ideal parallel segment fetching method to demonstrate the relationship between ISFT and MSD.


two factors. One factor is the media segment durationmultiplied by the number of parallel HTTP threads, whichrepresents the ideal time to fetch each segment ðISFTÞ. Theother factor is the available time to fetch the next segmentto maintain the target buffered media time ðTBMTÞ. TBMT isused to smooth the short-term transmission variation. Theavailable time to fetch the next segment is denoted as theremaining segment fetch time ðRSFTÞ. ISFT represents theintrinsic segment fetch time, which is determined as theperiod of time to fetch each segment when the mediabitrate of each segment exactly matches the capacity ofthe network resources. The intrinsic segment fetch timerelies on the segment fetching method as ISFT is differentwith the different nu

p . We will give an example to demon-strate how the segment fetch method affects ISFT afterpresenting the equation to calculate it. While RSFT repre-sents the real-time buffering status of DASH, ISFT, RSFT andESFT are computed as follows in the following paragraphs.

ISFT is determined as

ISFT ¼MSDnup ð4Þ

As discussed in Section 4, the serial segment fetchingmethod can be regarded as a special case of the parallelsegment fetching method, wherein nu

p equals 1.Fig. 4 shows an example of demonstrating a scenario,

in which the media bitrate exactly matches the networkcapacity. In Fig. 4, the purple and dark blue bars and linesrepresent parallel HTTP threads as in Fig. 3. In addition,the playback timestamp of the first frame of each segmentis labeled as tsi, the blue bars and line denote playbackduration of each segment and playback timestamp axis,which also applies to the following figures depicting theparallel segment fetching method. In the ideal scenario,each segment is fetched using exactly ISFT, which equalsto the media segment duration multiplied by nu

p , e.g. 2 inthis example. It should be noticed that the ideal scenariodoes not assume an instantaneous reception rate match-ing the bandwidth but assumes only an average receptionrate matching the bandwidth.

In practice, the above discussed ideal scenario is hardto achieve because of the following reasons. First, there is


typically a slight mismatch between the optimum mediabitrates and the network capacity. Second, the networkresources and throughput are dynamically changing. Tosmooth short-term HTTP/TCP throughput variation, it isbeneficial to maintain a TBMT indicating a desired latencybetween the time instant to receive the last byte of asegment to the playback of the first media frame of asegment. The RSFT depends on the current playbacktimestamp ðts0Þ at the time instant of requesting the nextsegment, the timestamp of the first frame of the nextsegment ðtsnsÞ and TBMT . Hence, RSFT can be calculated as

RSFT ¼ tsns�ts0�TBMT ð5Þ

Based on Eqs. (4) and (5), finally, ESFT is determined as

ESFT ¼minðISFT ,RSFTÞ ð6Þ

In this equation, ESFT represents the optimum seg-ment fetch time, which is used in the rate adaptationmetrics. ISFT will be constant with a specific nu

p but RSFT

will vary dynamically.The reason for using ISFT in the proposed rate adapta-

tion metric is to prevent possible fluctuation in represen-tation level. RSFT is another representation of the bufferedmedia time. Hence if only RSFT were used to determineESFT , then the representation level might fail to convergeto an optimum representation level as discussed in Section 2.A large RSFT might be a consequence of a cumulativeincrease of buffered media time. For example, bufferedmedia time may be accumulated due to a step-wise repre-sentation-level switch-up strategy. Therefore, RSFT mightnot provide an optimal value for ESFT . In contrast, wepropose to determine ESFT as the minimum of ISFT andRSFT . The former represents the intrinsic segment fetchingtime determined by the segment fetching method and thelatter represents the real-time buffering status. Hence theproposed method can efficiently facilitate the convergenceproperty but also prevent buffer underflow.

Fig. 5 shows an example of determining ESFT for theparallel segment fetching method. In this example, it isassumed that nu

p is equal to 2 and TBMT is equal to MSD.The last byte of segment #i�2 is assumed to be received


dx.doi.org/10.1016/j.image.2011.10.001

Fig. 5. Example of determining ESFT for the parallel segment fetching method.


at the client later than expected as in the ideal scenariodepicted in Fig. 4. Consequently, the HTTP GET request forsegment #i is also sent later than in the ideal scenario. SoRSFT is determined using Eq. (5) and is lower than ISFT.According to Eq. (6), ESFT equals to RSFT as shown inFig. 5, which is used in the rate adaptation metrics and therate adaptation algorithms as discussed in the followingsections.

5.2. SFT rate adaptation metric

ESFT represents the optimum duration for fetching asegment. If the measured segment fetch time ðSFTÞ isequal to ESFT calculated before sending the HTTP GETsegment request, then each segment is fetched using theoptimum segment fetch time and TBMT amount of mediadata will be maintained at the client buffer. Hence theencoded media bitrate of the current representationmatches the network capacity. If SFT is larger than thecalculated ESFT, then it means that the average TCPthroughput for fetching the segment is lower than thebitrate of the current representation level. Otherwise, (ifSFT is lower than the calculated ESFT), the smoothed TCPthroughput is higher than the bitrate of the currentrepresentation. The last case can occur in DASH becausethe TCP sender transmits the available data at the highestpossible rate allowed by the TCP congestion control andavoidance algorithm. Hence a ratio of ESFT and themeasured SFT is used as the metric denoted as SFTM todetect congestion and to probe the spare network capa-city.

SFTM¼ ESFT=SFT ð7Þ

where ESFT is calculated as in Eq. (6) each time beforesending an HTTP GET request for the next segment andSFT is measured after a segment is fetched. The proposedrate adaptation metric can be used to identify if the mediabitrate matches the available network capacity, as ourmetric measures the reception status in a given durationto smooth out short-term variations caused by the TCPcongestion control.


5.3. PFT rate adaptation metric

As PFT measures the time spent to fetch portions of asegment instead of the whole segment, PFT should becompared with the expected portions fetch time ðEPFTÞ

instead of the ESFT , where the EPFT denotes the optimumtime to fetch the corresponding portions, which havebeen received by the DASH client. Therefore, PFTM iscomputed as follows:

PFTM¼ ðESFTUMPDÞ=ðPFTUMSDÞ ð8Þ

PFTM rate adaptation is useful in the rate adaptation ofthe parallel segment fetching method. In the parallelsegment fetching method, an HTTP GET request can besent when the previously requested segments have notbeen completely received at the DASH client. In such ascenario, PFT represents the latest reception status andreveals the current network situation. On the other hand,the measuring duration should be larger than a certainduration to smooth out short-term TCP throughputvariations. Several methods of partitioning a segment intosub-segments dynamically and requesting parallel sub-segments, instead of requesting parallel segments, werediscussed in [19]. In this paper, we focus on the rateadaptation and PFT used in one of the rate adaptationmetrics.

One advantage of SFTM and PFTM is that our methodsdo not require information from the transport layer (TCPlayer). Our method uses the reception status at theapplication layer, but does not rely on the TCP throughputcalculation models. In order to use the TCP throughputcalculation equations, the packet loss rates, RTT or thecurrent congestion window are required; however, suchinformation is not available at the application layer. Inaddition, if the current congestion window is included inthe TCP throughput calculation equation, then the TCPthroughput equation basically represents the instanta-neous transmission rate instead of the smoothed TCPthroughput. As it is well known, the instantaneous TCPthroughput is varying and cannot be directly used in therate adaptation to estimate the end-to-end networkscapacity. In contrast, our rate adaptation metrics cansmooth out the varying TCP throughput rate by measur-ing a certain duration of the reception status. Therefore,


dx.doi.org/10.1016/j.image.2011.10.001


our rate adaptation metrics can be applied for the rateadaptation of DASH, which is operated in the applicationlayer.

6. Proposed rate adaptation algorithm for the serialsegment fetching method

In this section, we propose a rate adaptation algorithmfor the serial segment fetching method based on SFTM,presented in Section 5. In the serial segment fetchingmethod, the HTTP GET next segment request is sent afterreceiving the current segment. The rate adaptation algo-rithm occurs each time after receiving a media segmentand before requesting the next segment to determine therepresentation level of the next segment.

In the serial segment fetching method, since ISFT isequal to MSD, Eq. (7) can be expressed according toEq. (6) as

SFTM¼MINðMSD,RSFTÞ=SFT ð9Þ

In our previous work [10], a simplified rate adaptationmetric was presented, which was set to the ratio of MSD

and SFT . The rate adaptation metric in [10] only considersISFT of the serial segment fetching method but does nottake into account RSFT .

In real-time DASH, different RSFTs of different DASHclients will cause unfairness especially to those shortlived and lately joined clients. In the beginning of DASH,if RSFT calculated by Eq. (5) is smaller than the priorityRSFT for a serial segment fetching method ðRSFTs

,Þ in Eq.

(10), then RSFT is replaced by RSFT,

s. Once RSFT calculatedby Eq. (5) is larger than RSFT

,

s, RSFT will always be set asEq. (5).

RSFT,

s ¼ rUMSD ð10Þ

In this paper, r is selected as 0.75. RSFT,

s is especiallyimportant for DASH clients whose initial buffered mediatime is much lower than the ISFT . The priority RSFT has alarger impact on the parallel segment fetch methodcompared to the serial segment fetching method sincethe former occupies the larger ISFT compared to the later.The priority RSFT will be further discussed in Section 7.

The rate adaptation deploys a step-wise switch-up anda multi-step based switch-down method to change therepresentation level. The condition of switch-up/down andthe corresponding operations are described as follows:

Switch-up: It takes place if the following conditionholds:

SFTM41þeu ð11Þ

In Eq. (11), the left term represents the metric to detectspare network bandwidth and the right term denotes thecondition to switch-up to the next higher representationlevel. eu, is determined according to the maximumbitrates jump-up ratio of all representation levels emax

u

and bitrates jump-up ratio of the current representationlevel (ec

u). emaxu is calculated as

emaxu ¼maxfðbriþ 1

�briÞ=bri

, 8i¼ ½0,1,:::,N�1�g ð12Þ

emaxu provides a conservative switch-up factor, which

improves the convergence property of the proposed rate


adaptation algorithm. And ecu is calculated as

ecu ¼ ðbrcþ 1

�brc Þ=brc ð13Þ

ecu provides a more aggressive switch-up factor com-

pared to emaxu . Since ec

u is only determined by the bitratejump ratio of the current representation level and emax

u isdecided by the maximum of bitrate jump ratios of allrepresentation levels. To reach a compromise between theconvergence of the rate adaptation algorithm and theachievable media bitrates, eu is calculated as

eu ¼minfemaxu ,2ec

ug ð14Þ

Eq. (14) not only improves the convergence andachievable media bitrate, but also makes the rate adapta-tion algorithm performance independent of the bitratedistribution of the representation levels. For example, ifemax

u is much larger than ecu, then eu can be determined as

2ecu, which enables the rate adaptation algorithm to

switch-up the representation level to the optimum level.Otherwise, (emax

u is smaller than 2ecu), eu will be decided by

emaxu for improving the convergence.

In the case of a decision to switch-up, the rate adaptationalgorithm selects the next higher representation level. Thereason for using a conservative step-wise switch-up strategyis to prevent playback interruption which might occur incase of aggressive switch-up operations. During a step-wiseswitch-up, moreover, buffered media time can rise to amore safety level to prevent buffer underflow due to asudden bandwidth decrease. So the initial buffering time,spent to buffer the media time to reach its minimum, can beset as a relatively small value to reduce the delay beforestarting playback.

Switch-down: It will be performed if the followinginequality holds

SFTMo1�ed ð15Þ

The left term of Eq. (15) compares the selected mediabitrate to the current bandwidth in terms of ratio of theoptimum segment fetch time to the actual segment fetchtime, while the right term represents the condition toswitch-down to a lower representation level.

To identify ed, the minimum bitrate switch-down ratio ofall representation levels ðemin

d Þ and bitrates switch-downratio of the current representation level ðec

dÞ are used. emind is

calculated as

emind ¼minfðbri

�bri�1Þ=bri

, 8i¼ ½0,1,:::,N�1�g ð16Þ

ecd is calculated as

ecd ¼ ðbrc�brc�1

Þ=brc ð17Þ

Then ed is calculated as

ed ¼maxf2emind ,ec

dg ð18Þ

In case of congestion, SFT will be much larger than ESFT ,i.e. RAM is lower than 1. Hence Eq. (18) selects a switch downfactor at least larger than a specific value such as 2emin

d whichis used in this paper. When ec

d is larger than 2emind , we set ed

as ecd. Eq. (18) enables the proposed rate adaptation algorithm

to provide an advanced convergence property. For a rateadaptation algorithm of DASH, providing an advanced con-vergence property is one of most challenging task since theswitch-up and switch-down behaviors from different DASH


dx.doi.org/10.1016/j.image.2011.10.001


clients will affect other clients. In this paper, we propose Eqs.(14) and (18) to provide a conservative switch-up andswitch-down for improving the convergence property.

Proposed Algorithm 1. Rate adaptation of the serial segment fetching method operated at the client.

Initialize the current representation level i as 0

Send HTTP GET the first segment request immediately

While User has not stopped the current DASH and more segments are available for fetching

Wait for an event

If (Received a segment and tid r0) or (timeout to request the next segment)

If switch up condition meets

Set representation level to the next higher level

Else if switch down condition meets

Representation level i is determined as the first representation ri to meet the inequality (19)

ElseRepresentation level remains unchanged

Send HTTP GET the next segment request immediately

If Received a segment and tid 40

Idle tid period before sending HTTP GET the next segment request

End while

To support the conservative switch-down, a client buffersenough amount of media time to recover from a severebandwidth decrease. To accumulate media time, this paperdeploys an additive increase and multiplicative decreasestrategy to adapt the representation level.

In the switch-down, an aggressive switch-down willbe performed. The selected representation level is deter-mined to be the first representation (in descending order)with level ri to meet

brioSFTMUbrc ð19Þ

The idle time calculation algorithm is deployed beforesending the next GET request in order to prevent clientbuffer overflow, in a similar manner as Eq. (3) presentedin the parallel segment fetching method.

Fig. 6. An example of demonstrating unfairness between different clients due t

and the ESFT to fetch the segment #i and (b) second client playback timestam


The proposed rate adaptation algorithm of the serialsegment fetching method is described as the followingpseudocode.

7. Proposed rate adaptation algorithm for parallelsegment fetching method

In this section, we present a novel rate adaptationalgorithm for the parallel segment fetching method basedon the proposed rate adaptation metrics, i.e. SFTM andPFTM. In this section, SFTM and PFTM are uniformlydenoted as rate adaptation metric RAM.

The rate adaptation algorithm also deploys a step-wiseswitch-up and an aggressive switch-down method, whichare determined according to the following two cases.

Switch-up: It takes place if all RAMt of the parallelreceived segments meet the following condition

RAMt41þeu,8t¼ ½t0,t0þ1,:::,t0þw�1� ð20Þ

o the different RSFTs: (a) first client playback timestamp of each segment

p of each segment and the ESFT to fetch the segment #j.


dx.doi.org/10.1016/j.image.2011.10.001


where the left term represents the metric to detect conges-tion and the right term denotes the condition to switch up tothe next higher representation level. In the above switch-upcondition, a sliding window is used to compare the latest w

number of RAMt with the switch-up threshold to decidewhether or not to perform a switch-up. And the rateadaptation index (t) starts from t0 instead of 0 to set aminimum received potion media duration for each RAMt.RAMt can be used as the rate adaptation metric if the ratio ofthe received portion duration and the segment duration islarger than a predefined threshold (d) or PFT is larger thanISFT as

t0 ¼ argminðkÞ

MPDk

MSDZd or MPDk4 ISFT

� �,

8k¼ ½0,1,:::,np�1� ð21Þ

The reason of setting t0 is to smooth the short-termTCP transmission variation. d is set as 0.7 in this paper. eu

is calculated as in Eq. (14).Switch-down: It will be performed if the RAMt0

meetsthe following condition:

RAMt0o1�ed ð22Þ

t0 is the starting rate adaptation index determined asEq. (21). ed is calculated as in Eq. (18), which provides aconservative switch-down performance since ed is set asat least larger than 2emin

d .In the switch-down, an aggressive switch-down will

be performed. The selected representation level is deter-mined to be the first representation (in descending order)with level ri to meet

brioRAMt0

brc ð23Þ

The fairness between different clients is one of theimportant factors to evaluate the efficiency of a rateadaptation algorithm. Fig. 6. shows an example demonstrat-ing the occurrence of an unfairness between the two clientsbecause of the sustained different ESFTs. Fig. 6(a) and (b)depict the ESFTs of the first and the second clients,respectively. In Fig. 6, the same term and definition of thesymbols are used as in Fig. 4 to depict the playback time-stamps of the first frame of each segment, the playbackdurations of each segment and the playback timestampaxes. It is assumed that the first client starts DASH earlierthan the second client. In this example, we assume that thespare network bandwidth is available when the first clientstarts DASH and the network bandwidth is scarce when thesecond client starts DASH. In Fig. 6(a), the first client reachesa relatively higher representation level 5 and buffers a largeamount of media time at time instant t_reqi using the sparenetworks. The second client in Fig. 6 (b), however, stays at alower representation level 1 and only buffers a smallamount of media time at time instant t_reqj as the networksenters a saturated status when the second client startsDASH. The different buffered media time results in differentRSFTs, which further causes different ESFTs as demon-strated in Fig. 6.

To the first client, shown in Fig. 6(a), the current playingsegment is segment i�5 when the first client requestssegment i at time instant t_reqi. In this example, ISFT equalsto two times MSD as nu

p is set as 2 and TBMT equals MSD.


Since RSFT covers more than two segment durations, it islarger than ISFT . So ESFT needed to fetch segment i is equalto ISFT according to Eq. (6). For the second client, shown inFig. 6(b), the current playing segment is j�3 when thesecond client requests segment j at time instant t_reqj.Under the same TBMT with the first client, RSFT of thesecond client is less than ISFT . Thus ESFT required to fetchsegment j is equal to RSFT according to Eq. (6) for thesecond client. As networks undergo a saturated status, itmight be difficult for the second client to accumulatebuffered media time and RSFT may be sustainably lowerthan ISFT . In real-time DASH, different ESFTs of differentclients will cause different achievable media bitrates, sinceESFT determines RAM, SFTM and PFTM, according to Eqs.(7) and (8).

To solve the problem of unfairness caused by differentRSFTs of different clients, this paper presents a method ofdeploying the priority RSFT for parallel segment fetching

method ( RSFT,

p) in the beginning of DASH. If RSFT

calculated by Eq. (5) is smaller than RSFT,

p calculated by

Eq. (24), then RSFT is replaced by RSFT,

p. Once RSFT

calculated by Eq. (5) is larger than RSFT,

p, RSFT will be

calculated as in Eq. (5).

RSFT,

p ¼ ð1þðnup�1Þ=2ÞMSD ð24Þ

RSFT,

p allows newly started DASH clients to success-fully switch-up to the appropriate medium representationlevel.

The rate adaptation algorithm of the parallel segmentfetching method is shown in the following pseudocode,which operates as the rate adaptation module in Algorithm 1.

Proposed Algorithm 2. Rate adaptation of the parallelsegment fetching method operated at DASH client.

If switch-up condition is met

Set representation level to the next higher level

Else if switch-down condition is met

Representation level i is determined with the first

representation ri to meet Eq. (23)

ElseRepresentation level remains unchanged

End

8. Simulation results

The proposed rate adaptation algorithms for the serialand the parallel segment fetching methods in CDN areimplemented in ns2 [25]. In ns2, the proxy cache moduleconsists of the standard proxy cache, the client and theserver which operate on HTTP/TCP protocols. Recently,Muller and Timmerer [26] presented a reference softwarefor DASH so that rate adaptation algorithms can beimplemented in the reference software for evaluation.The bandwidth dynamic and packets loss are emulatedusing the traffic shaping tools operated on the rooter or atthe end user device. However, emulating the distributednetworks such as CDN is not an easy task. For emulatingCDN, first, a distributed network topology should becreated. For evaluating the dynamics of the distributed


dx.doi.org/10.1016/j.image.2011.10.001

Table 2Two sets of bandwidth settings.

Bandwidth settings Links #1 Links #2 Links #3

Bandwidth setting #1 (Mbits/s) 5.0 2.5 2.5

Bandwidth setting #2 (Mbits/s) 9.0 6.0 6.0

Table 3Group size with the bandwidth setting #1.

Bandwidth settings Minimum Maximum Step

Bandwidth setting #1 8 32 8


network resources, each edge server should be equippedwith the traffic shape tool. Second, it is complex to specifythe transporting pass for traffics in the distributed net-works, e.g., determining the route between the DASHserver and the DASH client for each segment. In contrast,all of the above issues are easy in ns2. Therefore, ns2 is agood choice for evaluating the rate adaptation method forDASH in CDN.

In this paper we evaluate the rate adaptation algorithmin the aspects of (a) convergence in the requesting repre-sentation level indicating the media bitrates, (b) fairnessbetween different clients, which compete for the edgeserver bandwidth, (c) achievable media bitrate, (d) rateadaptation speed and (e) buffer underflow frequency.

To our knowledge, few rate adaptation algorithms arepublished in the literature. Several HTTP streaming playersexist such as Microsoft smooth streaming [5] and AdobeOSMF player [6] have their own internal rate adaptationtechniques; however, they are not public. Hence it is notfeasible to compare the proposed rate adaptation algorithmwith those players. Although we cannot implement thespecific rate adaptation algorithm, it is possible to observethe behavior of the commercial players. As mentioned inthe related work Section of this paper, Akhshabi et al. [15]reported the experimental results and the evaluationfocusing on the evolution of requesting media bitratesover time using commercial players. In Section 8.5 of thispaper, samples of the representation level evolution arereported for 10 different clients that compete for thebandwidth of the edge servers. The proposed rate adapta-tion algorithms are compared with the Smooth streamingreported in [15].

Reporting samples of evolution of the representationlevels has limitation in the coverage of reporting resultsconsidering the page limitation. For example, when the

DASHserver

CDNS-0

ES-7

Internetconnector

DASHclient-0

ExpS-0

DASHclient-1

DASHclient-n

...

ExpS-1

ExpS-8ExpR-0

ExpR-1 ExpS-9

ExpS-6 ExpS-7

ExpR-7 ExpS-15

ExpR-6 ExpS-14

ExpR-8 ExpR-11

ExpR-12

ExpR-15

CDNS-4

ES-0

ES-1

ES-6

Link #1

Link #3

4#kniL2#kniL

Link #5

... ...

...

...

Fig. 7. Network topology of the simulation


DASH client group size was increased to 10, 20 and 30 oreven more, it is not feasible to observe each clientevolution process. To conduct and report extensive simu-lation results, the distribution of the representation levelsof DASH clients are reported. The histogram of therepresentations level, which indicates the occurrence fre-quency of each representation level, is an efficient tool toevaluate the distribution of the representation levels.Reporting each client histogram will also take many pageswith the large number of simulated clients. Hence, eachhistogram is represented as the combination of two values,which capture the features of the histogram and thecomprehensive results and details are reported in Section8.2. The mean of the average media bitrates is reported toevaluate the achievable media bitrates of the overall clientsand the buffer underflow frequency is reported, in Sections8.3 and 8.4, respectively. Finally, samples of the representa-tion level evolutions are reported in Section 8.5.

DASHserver

CDNS-0

Internetconnector

DASHclient-0ExpS-0

DASHclient-1

DASHclient-8

...

ExpS-1

ExpS-2ExpR-0

ExpR-1 ExpS-3

ExpR-2

ExpR-3

ES-0

ES-1

Link #1Link #2

Link #4Link #5

Link #3

. (a) Topology #1, (b) Topology #2.


dx.doi.org/10.1016/j.image.2011.10.001


8.1. Simulation topology and setting

Fig. 7 shows the network topologies used in thesimulation. In the first topology Fig. 7(a), two layers ofmedia distribution networks are deployed including fourCDN servers (CDNS-0, CDNS-1, CDNS-2 and CDNS-3) andeight edge servers (ES-0, ES-1, y, ES-7). In topology #2, twolayers of media distribution networks are deployed includ-ing one CDN server (CDNS-0) and two edge servers(ES-0 andES-1). As shown in Fig. 7, the media segments are deliveredfrom HTTP streaming server to CDN servers, edge severs, theInternet connector and finally to DASH clients. The edgeserver fetches the segment from the connected upper layerCDN server, which then fetches the segment from the DASHserver. The DASH server contains all representations of themedia clip. In this paper, all segments are fetched from theDASH server. Caching scenario is not considered in thispaper. To simulate the above scenario, different clientsrequest the different media clips, hence all segments havenot been cached by the edge servers when they receive aHTTP GET segment request.

Table 2 shows the bandwidth settings of links #1, links#2 and links #3 in Fig. 7. The bandwidths in links #4 andlinks #5 are set as 2.0 Mbits/s. All link delays are set to2 ms. Topology #1 in Fig. 7(a) and the bandwidth setting#1 are used to report the simulation results of Sections8.2, 8.3 and 8.4. Topology #2 in Fig. 7(b) and bandwidthsetting #2 are used to report the representation levelevolution in Section 8.5.

To simulate practical congestion in real networks andevaluate the rate adaptation algorithm, multiple DASHclients are simulated, which compete for the bandwidthsof the distributed edge servers in CDN. The clients areclassified into two groups. For the first group, clients startthe DASH randomly in the time interval between 0 s and10 s and end at 1200 s. For the second group, clients startthe DASH randomly in the time interval 400–410 s andstop at 800 s. By increasing or reducing the number ofclients, denoted as group size, the rate adaptation effi-ciency can be evaluated for different levels of congestion.Here the group size is the sum of the client’s number inthe two groups.

Table 3 shows the minimum group size, maximumgroup size and group size jump step for the bandwidthsetting #1. As shown in Table 3, the group size is muchlarger than the number of edge servers. Hence multipleclients will compete for the bandwidth of the links betweenthe CDN server and the edge server, and the links betweenthe edge server and the Internet connector in Fig. 7. For thebandwidth setting #1, the simulation runs multiple timeswith different group sizes. The numbers of clients in the firstand the second group are equal.

For the first DASH client group, the simulation time isdivided into three periods to report simulation results.The three periods include 0–400 s, i.e., before starting thesecond client group, 400–800 s, i.e., when the secondclient group compete for the bandwidth of the distributednetworks, and 800–1200 s, i.e., after ending the secondclient group. For the second client group, the simulationresults are reported in the period starting from 400 s andending at 800 s.


To simulate the dynamics of the Internet, backgroundtraffic is added to each link between the CDN server andthe edge server, and between the edge server and theInternet connector by attaching the exponential trafficsenders (ExpS-id) and the corresponding receivers (ExpR-id). Specifically, the exponential traffic senders areattached to the CDN servers (CDNS-id) and edge server(ES-id). The corresponding exponential receivers areattached to the edge servers (ES-id) and the Internetconnector. Both operating and idling periods of the expo-nential traffic are set to 0.5 and the average bitrate ofexponential traffics during the operating period is set to0.2 Mbits/s. The exponential traffics start from 0 s and endat 1200 s.

The edge server selection method and the frequency ofswitching the edge server affect the achievable mediabitrates of the serial and the parallel segment fetchingmethods. When a simple edge server selection method suchas round-robin strategy, is used, the frequent switching ofthe edge server balances the edge server loads better thanthe infrequent switching and will more efficiently utilizesthe bandwidths of distributed edge servers. Hence theachievable media bitrate of serial and parallel segmentfetching methods can be improved. As discussed inSection 3, the proximity based edge server selection tech-nique causes infrequent request redirection, and the shortTTL based load balancing algorithm results in frequent edgeserver switching. Recall that the edge server selectionmethod and determination of the frequencies of switchingedge server are out of the scope of this paper. Therefore, thedifferent frequencies of the switching edge server aresimulated in order to evaluate the proposed rate adaptationalgorithms for the serial and the parallel segment fetchingmethods with respect to the different switching frequencies.In the simulation, the frequencies of 1 and 0.25 are used forswitching edge servers, indicating switching edge serversevery one segments and four segments, respectively. Inaddition, clients randomly select the initial edge server fromthe set of edge servers to fetch the first segment. Under aspecific switching frequency, round-robin strategy is used toswitch the edge server.

In our simulations, TBMT and the initial bufferedmedia time are set to 20 s. For achieving adaptive HTTPstreaming, the server provides ten sets of representationsfor clients to adapt the media bitrates wherein the mediabitrates include 64, 128, 192, 256, 384, 512, 640, 896,1152 and 1408 Kbits/s and the corresponding representa-tion level starts from 0 and ends at 9, respectively. Bitratechanging step was set to increase as representation levelreaches to a higher level in order to achieve an equivalentquality improvement each time bitrate switches-up.

8.2. Distribution of representation level

The histogram of the representation level ri is anefficient metric for evaluating the rate adaptation algo-rithm since the histogram indicates the frequency dis-tribution of the inputs. In the histogram calculation, theinput is each segment representation levelx¼ fx0,x1,. . .,xH�1g and the output includes a vector of theoccurrence frequency of all possible representation levels


dx.doi.org/10.1016/j.image.2011.10.001

Fig. 8. An example of representation level histograms of two clients and corresponding (s2f , s2

l ): (a) first client histogram (65.57, 2.33) and (b) second

client histogram (114.84, 0.5).

Fig. 9. Representation level distribution of each client using the proposed rate adaptation algorithm for the serial segment fetching method under the

edge server switching frequency 1: (a) 0–400 s, (b) 400–800 s and (c) 800–1200 s.

Fig. 10. Representation level distribution of each client using the proposed rate adaptation algorithm for the parallel segment fetching method under the

edge server switching frequency 1: (a) 0–400 s, (b) 400–800 s and (c) 800–1200 s.


cr ¼ fcr0,cr1

,. . .,crM�1g and a vector of the distinct representa-

tion level location br ¼ fbr0,br1,. . .,brN�1g. In the former vector,the vector size of occurrence frequency M is equal to thenumber of all possible representation levels provided by thecontent provider and the occurrence count is set as zero fora representation level that has not occurred. In the latervector, if a representation occurrence count is larger than orequal to a constant threshold y, then the correspondingrepresentation level is regarded as a distinct representationlevel and the corresponding representation level valuedenotes the distinct representation level location. In this


paper, y is set as 5. Such a distinct representation levellocations ðbrkÞ forms a vector of distinct representation levellocations denoted as br .

br ¼ fbr0,br1,. . ., drN�1 g ¼ argrkðcrk

Zy, k¼ 0,1,. . .M�1Þ ð25Þ

where N denotes the size of distinct representation levellocation, which is smaller than or equal to M.

The variance is calculated over cr and br to obtain thevariance of the occurrence frequency of all possiblerepresentation levels (s2

f ) and the variance of distinct


dx.doi.org/10.1016/j.image.2011.10.001

Fig. 11. Representation level distribution of each client using the proposed rate adaptation algorithm for the serial segment fetching method under the

edge server switching frequency 0.25: (a) 0–400 s, (b) 400–800 s and (c) 800–1200 s.

Fig. 12. Representation level distribution of each client based on the proposed rate adaptation algorithm for the parallel segment fetching method under

the edge server switching frequency 0.25: (a) 0–400 s, (b) 400–800 s and (c) 800–1200 s.


representation level location (s2l ) as follows:

s2f ¼

PM�1i ¼ 0 ðcri

�crÞ2

Mð26Þ

s2l ¼

PN�1i ¼ 0 ð br ı�br Þ2

Nð27Þ

where cr denotes the mean of the occurrence frequency ofall possible representation levels and br denotes the meanof distinct representation level locations, respectively. Alarge s2

f and a smaller s2l indicate a superior convergence

property and vice versa.Fig. 8 shows an example of calculating s2

f and s2l for two

given histograms of representation levels and comparing theconvergence property of the two histograms by comparingthe combination of s2

f and s2l , denoted as (s2

f , s2l ), of the two

histograms. As shown in Fig. 8, cr and br of the first and thesecond histogram are 1,2,9,12,4,25,0,0,0,0f g, f2,3,5g, and1,2,4,3,7,35,0,0,0,0f g, f4,5g, respectively. As can be observed

from Fig. 8, the second histogram represents a superiorconvergence compared to the first histogram because ofthe following two reasons. First, the shape of the secondhistogram is sharper than that of the first histogram. Second,the distinct representation levels are more focused in thesecond histogram compared to the first histogram. Accordingto Eqs. (26) and (27), (s2

f , s2l ) of the first and the second

histogram are (65.77, 2.33) and (114.84, 0.5), respectively.The second histogram gives a larger s2

f and a smaller s2l

compared to the first histogram. As mentioned before a


larger s2f and a smaller s2

l indicate a superior convergence,hence the second histogram represents a more advancedconvergence, which is consistent with our observation.

Figs. 9 and 10 plot the (s2f , s2

l ) of each client for therate adaptation algorithms of the serial and the parallelsegment fetching methods, respectively, in the interval of0–400 s, 400–800 s, 800–1200 s, with the edge serverswitching frequency 1. Figs. 11 and 12 show the repre-sentation level distributions, similar to Figs. 9 and 10but with the edge server switching frequency 0.25. Thehorizontal and vertical axes represent s2

l and s2f , respec-

tively. The dots distributed in the left-up corner of thefigure indicate better convergence compared to the dotsdistributed in the right-bottom corner of the figure accord-ing to the above mentioned rule that a larger s2

f and asmaller s2

l indicate better convergence performance. Foursets of group sizes from group size 8 to 32 are reported.

Comparing the convergence properties of the threeperiods in Figs. 9 and 10, the best results are given in theperiod of 800–1200 s, medium results are in period 0–400 s and the worst results are shown in period 400–800 sfor the proposed rate adaptation algorithms. The reasonsare as following. In the period of 800–1200 s, less clientscompete for the bandwidth compared to the period of400–800 s and the starting representation level is largerthan 0. In the period of 0–400 s, the representation levelstarts from 0. In the period of 400–800 s, more clientscompete for the bandwidth compared to the other peri-ods. In addition, the convergence property of the rate


dx.doi.org/10.1016/j.image.2011.10.001


adaptation algorithms depends on the group size accord-ing to the simulation results in Figs. 9 and 10. When thegroup size is 8 or 16, the rate adaptation algorithmsprovide an improved convergence. The convergence levelis still acceptable for the group size 24. But it becomesworse when the group size was increased to 32, whichneeds to be improved further. However, the above resultsalready show that the proposed rate adaptation algorithmis superior to the behavior of the Smooth streamingplayer, wherein the two players compete for the band-width of a common link, which was reported in section 6of [15]. According to the evaluation reported in [15], thesecond player, which started later than the first player,failed to switch-up to the appropriate level and keptoscillating between some lower representation levelsdue to the competition from only two players. Improvedconvergence is reported in Figs. 9 and 10 with the groupsize from 8 to 24 compared to the results and evaluationin [15], wherein the group size was 2.

Providing improved convergence for a large group ofDASH clients in CDN is still an open and challenging taskbecause of the following reasons. In the scenario ofmultiple clients competing for the bandwidth in the

Fig. 13. Mean of average bitrates of the first group with edge server switchin

800–1200 s.

Fig. 14. Mean of average bitrates of two groups with edge server switching fr

400–800 s.


distributed edge servers, the number of clients may bedynamically changing, which causes variation in thesharable bandwidth. Furthermore, the rate adaptation ofone client might cause a change in the rate adaptation ofother clients because of the varying sharable bandwidth.For example, when several clients compete for the band-width in a common link, a switch-up behavior of a clientmay cause the reduced sharable bandwidth for otherclients, while, a switch-down behavior of a client mayresult in the increased sharable bandwidth in the otherclients. Such a relation between different clients is calledinter-client effect in this paper. The inter-clients affectionwill increase along with an increase in the group size.Moreover, the edge server selection and load balancingalgorithm affect the variation in the sharable bandwidthof each client and hence affect the convergence propertyof the rate adaptation algorithm of DASH.

Comparing the rate adaptation algorithms for theserial and parallel segment fetching methods, shown inFigs. 9, 11, Figs. 10, 12, respectively, the simulation resultsshow that the rate adaptation algorithm for the serialsegment fetching method slightly outperforms the rateadaptation algorithm for the parallel segment fetching

g frequency 1: (a) the first group in 0–400 s and (b) the first group in

equency 1: (a) the first group in 400–800 s and (b) the second group in


dx.doi.org/10.1016/j.image.2011.10.001

Fig. 15. Mean of average bitrates of the first group with edge server switching frequency 0.25: (a) the first group in 0–400 s and (b) the first group in

800–1200 s.

Fig. 16. Mean of average bitrates of two groups with edge server switching frequency 0.25: (a) the first group in 400–800 s and (b) the second group in

400–800 s.

Table 4Buffer underflow frequency.

Group

size

Buffer underflow frequency

Edge server switching

frequency 1

Edge server switching

frequency 0.25

RA for the

serial

RA for the

parallel

RA for the

serial

RA for the

parallel

8 0 0 0 0

16 0 0 0 0

24 0 0 0 8

32 0 2 0 15


method in the terms of convergence in the representationlevel. Comparing the rate adaptation performance withrespect to the different edge server switching frequency,no significant difference is observed.

8.3. Average bitrates with varying group size

The mean of the average media bitrates for thedifferent group sizes are reported with the bandwidthsetting #1, for analyzing the rate adaptation algorithmswith the respect to the achievable media bitrates underthe conditions of the different crowded networks. In Figs.13–16, the y and x axes denote the mean of the averagemedia bitrates of each client group and the group size,respectively. And the Serial-DASH-G1/2 and the Parallel-DASH-G1/2 denote the rate adaptation results for the serialand parallel segment fetching methods in the first/secondgroup, receptively. Fig. 13(a)–(b) shows the mean of themedia bitrates of the first group in the period of 0–400 s and800–1200 s, respectively. In both periods, the distributednetworks bandwidths are only used by the clients of the firstgroup without competition from the second group. So thenumber of active clients in the period of 0–400 s and


800–1200 s is half of the group size. Fig. 14(a)–(b) showsthe mean of the media bitrates of the first and the secondgroup in the period of 400–800 s. In this period, thedistributed bandwidths are shared by the clients of the twogroups, so the number of active clients is equal to the groupsize. Comparing the rate adaptation algorithms for the serialand the parallel segment fetching methods, the mean of themedia bitrates of the rate adaptation algorithm for theparallel segment fetching method is higher than that of the


dx.doi.org/10.1016/j.image.2011.10.001


serial segment fetching method in most of the cases.Comparing the results in Figs. 13 and 14, the achievablemeans of the average media bitrates in the period of the 0–400 s and 800–1200 s are larger than those in the period of400–800 s. The reason for this is that the clients of thesecond group become active in the period 400–800 s. Inaddition, the mean of the media bitrate decreases along withan increase in the group size in the rate adaptation algo-rithms for the serial and parallel segment fetching methods.

To evaluate the efficiency of the proposed rate adaptaiotnmethods with different edge server switching frequencies,simulation results with the edge server switching frequenciesof 0.25 are reported in Figs. 15 and 16. Simualtion resultsshow that the performance of the achievable media bitratesdecreases in the rate adaptation algrorighms for the serialand the parallel segment fetching methods when the edge

Fig. 17. Representation level evolution of the rate adaptation algorighm for t

setting #2: (a) clients of the fist group and (b) clients of the second group.

Fig. 18. Representation level evolution of the rate adaptation algorighm for th

setting #2: (a) clients of the first group and (b) clients of the second group.


server switching frequency is changed from 1 to 0.25. Themain reason is that the frequent edge server switchingstrategy balances the loads of the distributed edge serversbetter than the infrequent edge server switching strategy. Itindicates that the rate adaptation algorithms depend on theperformance of the load balancing method, hence it is worthyto develop an advanced load balancing algorighm and edgeserver selection method for improving the performance ofDASH especially with respect to achievable media bitrates.

8.4. Buffer underflow frequency

Table 4 reports the frequency of the buffer underflow forvarious group sizes. The edge server switching frequenciesinclude 1 and 0.25. In Table 4, the rate adaptation algo-rithms for the serial and the parallel segment fetching

he serial segment fetching method with group size 10 and bandwidth

e parallel segment fetching method with group size 10 and bandwidth


dx.doi.org/10.1016/j.image.2011.10.001


methods are uniformly noted as RA. Buffer underflow hasnot occurred in the rate adaptation algorithm for the serialsegment fetching method. However, buffer underflowoccurred twice and 23 times in the rate adaptation algo-rithm for the parallel segment fetching method with edgeserver switching frequency 1 and 0.25, respectively. Simula-tion results indicate that the underflow frequency dependson the edge serve switching frequency. In-frequent edgeserver switching cause a higher interruption frequency inthe clients compared to frequent switching. The reason isthat frequent edge server switching provides a moreadvanced load balancing algorithm compared to in-frequentswitching. The target of the parallel segment fetchingmethod is to improve the achievable media bitrates by fullyutilizing the distributed edge server bandwidths. To improvethe achievable media bitrate, however, an advanced loadbalancing algorithm is required. Without an advanced loadbalancing algorithm, the rate adaptation algorithm for theparallel segment fetching method may suffer from a rela-tively higher buffer underflow as shown in Table 4. Theparallel segment fetching method improves the achievablemedia bitrates as reported in the simulation results inSection 8.3. So it is worthy to further exploit the loadbalancing algorithm of the CDN to improve the performanceof DASH rate adaptation algorithm and further enhance themedia streaming experience perceived by users.

8.5. Representation level evolution

In this sub-section, the representation level evolutionsare reported to demonstrate the instantaneous changes ofthe representation level over time. The network topology#2 shown in Fig. 7(b) and the bandwidth setting #2 areused to report the representation level evolution. Theedge server switching frequency is set to 0.25. There arefive clients in the first and the second group. To evaluatethe inter-client effect, the proposed rate adaptation algo-rithm behavior is compared with the rate adaptationbehavior of Smooth streaming reported in [15].

Figs. 17 and 18 show the representation level evolu-tion of the rate adaptation algorithms for the serial andthe parallel segment fetching methods, respectively. The y

and x axes denote the instantaneous representation leveland the simulation time, receptively. In both figures, theresults of the rate adaptation algorithm in the first group andthe second group are reported in (a) and (b), respectively.

First, the rate adaptation speed can be observed fromthe representation level evolution. The rate adaptationspeed is one of the most important factors for evaluatingthe efficiency of the rate adaptation algorithm for DASH.A fast switch-down prevents the client buffer underflowand a fast switch-up affects playback media quality. In theperiod of 0–400 s of Figs. 17-(a) and 18(a), the represen-tation level can switch-up from level 0 to level 6 in lessthan 20 s and can switch-up to the highest level in lessthan 30 s, respectively. In the period of 400–800 s, thenetworks become congested because the clients of thesecond group start DASH randomly at the period of 400–410 s. In this period as shown in Fig. 17(a), the clients ofthe first group and the second group switch-down andswitch-up to the appropriate representation level within


30 s. In the period of 400–800 s as shown in Fig. 18(a),some clients have switched-down to a lower layer bytaking more time compared to the rate adaptation algo-rithm of the serial segment fetching method in Fig. 17(a).However, clients of the second group in Fig. 18(b) are able toswitch-up to a higher level as fast as in Fig. 17(b). In theperiod of 800–1200 s of Figs. 17 (a) and 18(a), after remov-ing the clients of the second group at 800 s, clients of thefirst group will reach to the highest level using less than20 s. The representation level evolution results show thatthe proposed rate adaptation algorithms quickly switch-upand switch-down representation levels by efficiently iden-tifying spare network capacities and network congestion. Ifspare network bandwidths are available, fetching segmenttakes less time compared to the optimum segment fetchtime, i.e. ESFT . Otherwise, if the networks become congested,segment fetch time is larger than ESFT . Hence, the proposedrate adaptation method can switch-up and switch-downrepresentation level quickly.

Second, the convergence behavior, fairness betweendifferent clients and inter-client effect can be evaluatedthrough the evolution of the representation levels. In theperiod of 0–400 s and 800–1200 s of Figs. 17(a) and 18(a),the proposed rate adaptation algorithms for the serialsegment fetching method and the parallel segment fetchingmethod show an improved convergence and a better fairnessbetween clients in the first group. In the period of 400–800 s,Figs. 17(a) and 18(a) show that clients of the first group willswitch-down to a lower level to react to the newly joiningclients of the second group. And Figs. 17(b) and 18(b) showthat clients of the second group will switch-up to theappropriate higher representation level.

In Fig. 17(a) and (b), clients of the first and the secondgroup converge to the levels 6–9, which indicates that theproposed rate adaptation algorithm provides a decentconvergence and fairness between different clients. Closeto time 600 s, a switch-up occurred for several clients inthe first group, which resulted in other clients having toswitch-down to a lower level. For example, a switch-down occurred in the client#3 and client#4 in the firstgroup. When multiple clients compete for the sharablebandwidth, improving convergence becomes a challen-ging task as discussed in Section 8.2. As shown in thesimulation results, a severe hopping of the representationlevels is avoided using the proposed SFTM and theconservative switch-up and switch-down factor given inEqs. (14) and (18), respectively.

In Fig. 18(a) and (b), most of the representation levelsof the first and the second group converge to levels 6–9. InFig. 18(a), a client switches-down to a lower level near550 s but it quickly switches-up to a higher level. Oncecongestion is detected, the proposed rate adaptationmethod deploys the multi-step based switch-down strategy,which may cause over switching-down. To prevent playbackinterruption in DASH due to client buffer underflow, the rateadaptation for DASH should be equipped with a fasterswitch-down speed compared to the rate adaptation forthe RTP/UDP based streaming scenario. In media streamingover HTTP, congestion will cause a TCP congestion window,a TCP throughput decrease, and packets retransmission. Itwill further result in serious draining of the buffered media


dx.doi.org/10.1016/j.image.2011.10.001


time in the client’s buffer. In case of over switching-down,the rate adaptation method can switch-up the representa-tion level to the optimum level with a fast speed by usingthe proposed rate adaptation metrics as shown in theresults in client #4 near 550 s. To further improve conver-gence, the sliding window strategy is used in this paper tomeasure the multiple rate adaptation metrics for switching-up. Each rate adaptation metric represents the reception of asegment or a portion of a segment, i.e. portion, which can befetched in parallel from the distributed edge servers. Theobservation of the multiple segments or portions receptioncan efficiently represent the current bandwidths capacitiesin the distributed networks. Meanwhile, in order to avoidaffecting the speed of switching-up, the representationlevels within the measuring window can be different. Soconsecutive switching-up can occur as shown at the startingphase of DASH in all clients and near 550 s for client #4.

As mentioned in the previous Section 8.2, when twoSmooth streaming players compete for the common linkbandwidth, the second player, which starts later than thefirst player, fails to switch-up to the appropriate higherlevel according to the experimental results and evaluationreported in section 6 of [15]. Instead, the second Smoothstreaming player oscillates between the lower levels ofbitrates, which indicates the unfairness between the firstand the second player. The failure of switching-up in thelater joined client in the Smooth streaming may be incurredas the rate adaptation algorithm deploys the buffered mediatime to decide switch-up and switch-down. Hence, if aplayer has not buffered enough media time, it will fail toswitch-up to the appropriate high level, which causes theunfairness.

In contrast, the reported samples of representation levelevolutions in Figs. 17(b) and 18(b) show that clients of thesecond group, which start later than clients of the first group,can successfully switch-up to the higher representation level.Simulation results of Figs. 17(b) and 18(b) demonstrate thatthe proposed rate adaptation algorithms can efficientlyreduce unfairness. Ten clients compete for the bandwidthin our simulation, while, two clients compete for the band-width in [15]. Thus the competition between different clientsin this paper is more severe. The severe competition makes itmore difficult for the clients to switch-up and convergebecause of the severe inter-client effects. Although thesimulation conditions used are somewhat restricted, theproposed rate adaptation algorithms show a fast switch-upspeed and an acceptable convergence behavior.

Last, buffer underflow does not occur during the wholeperiod for clients of the first and the second group. In thispaper, RSFT , which reveals the buffered media time status, istaken into account to determine ESFT to prevent the bufferunderflow in a severe congestion.

9. Conclusion

In this paper, we propose two rate adaptation metricsand two rate adaptation algorithms for Dynamic AdaptiveStreaming over HTTP (DASH). In DASH, segments can berequested one after another, called serial segment fetch-ing method, or they can be requested in parallel, calledparallel segment fetching method. A novel rate adaptation


metric is proposed, which can detect spare networkscapacity and network congestion. The proposed rateadaptation metric compares the expected segment fetchtime (ESFT) with the measured segment fetch time anddetermines if the current media bitrate matches thenetwork capacity. The optimum segment fetch time isdecided by the ideal segment fetch time (ISFT) and thereal-time remaining segment fetch time (RSFT). The ISFTis determined by the media segment duration (MSD)multiplied by the parallel HTTP threads to deliver seg-ments. The RSFT depends on the reception history ofprevious segments and is related to the buffered mediatime in the client buffer. Based on the proposed rateadaptation metric, two novel rate adaptation algorithmsfor the serial and the parallel segment fetching methodsare presented in this paper. A sliding window strategy isused to measure multiple rate adaptation metrics toreduce the fluctuation frequency in the representationlevels due to the different bandwidths in the differentedge servers distributed in the networks, while notaffecting switch-up speed. Furthermore, this paper inves-tigates the fairness issue between different DASH clientswhich compete for the common link bandwidth. Amethod of solving unfairness is proposed which uses thepriority based RSFT calculation method in the beginningphase of DASH for the newly joining clients. Extensivesimulation results are reported by simulating DASH inCDN with multiple competing DASH clients. Simulationresults show that the efficiency of the proposed rateadaptation algorithms in the aspects of switch-downand switch-up speed, convergence behavior and fairnessbetween different DASH clients. The rate adaptationalgorithm for the parallel segment fetching method out-performs the rate adaptation algorithm for the serialsegment fetching method with respect to achievablemedia bitrates but slightly inferior with respect to con-vergence and buffer underflow frequency. In addition, theachievable media bitrates of the rate adaptation algorithmfor the serial and the parallel segment fetching methoddepends on the load balancing algorithm. For the parallelsegment fetching method, the buffer underflow frequencyof the rate adaptation algorithm relies on the on the loadbalancing algorithm. In future work, we will investigatefurther CDN optimization for DASH.

Acknowledgment

This work was supported by the Academy of Finland,(application number 129,657, Finnish Programme for Centresof Excellence in Research 2006–2011). The authors wouldlike to appreciate the efforts of the anonymous reviewers fortheir comments which have improved the paper.

References

[1] R. Fielding, J. Gettys, J.C. Mogul, H. Frystyk, L. Masinter, P. Leach, T.Berners-Lee, Hypertext Transfer Protocol-HTTP/1.1, RFC 2616, June1999.

[2] C. Krasic, K. Li, J. Walpole, The case for streaming multimediawith TCP, in: Proceedings of the IDMS, Lancaster, UK, September2001.


dx.doi.org/10.1016/j.image.2011.10.001


[3] B. Wang, J. Kurose, P. Shenoy, D. Towsley, Multimedia streaming viaTCP: an analytic performance study, ACM Transactions on Multi-media Computing Communications and Applications (TOMCCAP) 4(2) (2008).

[4] T. Kim, M.H. Ammar, Receiver buffer requirement for video stream-ing over TCP, in: Proceedings of the SPIE VCIP, 2006, San Jose, CA,January 2006.

[5] A. Zambelli, IIS Smooth Streaming Technical Overview, MicrosoftCorporation, 2009.

[6] Adobe, HTTP Dynamic Streaming on the Adobe Flash Platform,Adobe Systems Incorporated, 2010.

[7] 3GPP TS 26.234 Release 9: Transparent end-to-end Packet-switched Streaming Service (PSS), protocols and codecs.

[8] ISO/IEC 23009-1: Dynamic Adaptive Streaming Over HTTP (DASH)-Part 1: Media Presentation Description and Segment Formats, DraftInternational Standard, August 30 2011.

[9] 3GPP TS 26.247 Release 10: Transparent end-to-end Packet-switched Streaming Service (PSS), Progressive Download andDynamic Adaptive Streaming over HTTP (3GP-DASH).

[10] C. Liu, I. Bouazizi, M. Gabbouj, Rate adaptation for adaptive HTTPstreaming, in: Proceedings of the ACM Multimedia SystemsConference (ACM MMSys 2011), San Jose, California, February23–25, 2011.

[11] C. Liu, I. Bouazizi, M. Gabbouj, Parallel adaptive HTTP mediastreaming, in: Proceedings of the IEEE International Conferenceon Computer Communications and Networks (ICCCN 2011),Hawaii, July 31–August 4, 2011.

[12] S. Lam, J.Y.B. Lee, S.C. Liew, W. Wang, A transparent rate adaptationalgorithm for streaming video over the internet, in: Proceedings ofthe Eighteenth International Conference on Advanced InformationNetworking and Applications, Fukuoka, Japan, March 2004.

[13] N. Farber, S. Dohla, J. Issing, Adaptive progressive download basedon the MPEG-4 file format, Journal of Zhejiang University Science A7 (Suppl. 1) (2006) 106–111 (Proceedings of International PacketVideo Workshop 2006).

[14] R. Kuschnig, I. Kofler, H. Hellwagner, An evaluation of TCP-basedrate-control algorithms for adaptive internet streaming of H.264/SVC, in: Proceedings of the ACM Multimedia Systems Conference(ACM MMSys 2010), Scottsdale, Arizona, February 2010.

[15] S. Akhshabi, A. Begen, C. Dovrolis, An experimental evaluation ofrate-adaptation algorithms in adaptive streaming over HTTP, in:Proceedings of the ACM Multimedia Systems Conference (ACMMMSys 2011), San Jose, California, February 23–25, 2011.


[16] R. Kuschnig, I. Kofler, H. Hellwagner, Evaluation of HTTP-basedrequest-response streams for internet video streaming, in: Pro-

ceedings of the ACM Multimedia Systems Conference (ACM MMSys2011), San Jose, California, February 23–25, 2011.

[17] R. Kuschnig, I. Kofler, H. Hellwagner, Improving internet videostreaming performance by parallel TCP-based request-responsestreams. in: Proceedings of the Seventh Annual IEEE Consumer

Communications and Networking Conference (IEEE CCNC 2010),January 2010.

[18] S. Tullimas, T. Nguyen, R. Edgecomb, S. Cheung, Multimediastreaming using multiple TCP connections, ACM Transactions on

Multimedia Computing, Communications, and Applications (TOMC-CAP) 4 (2) (2008).

[19] K. Evensen, D. Kaspar, C. Griwodz, P. Halvorsen, A. Hansen, P.

Engelstad, Improving the performance of quality-adaptive videostreaming over multiple heterogeneous access networks, in: Pro-

ceedings of the ACM Multimedia Systems Conference (ACM MMSys2011), San Jose, California, February 23–25, 2011.

[20] J. Dilley, B. Maggs, J. Parikh, H. Prokop, R. Sitaraman, B. Weihl,Globally distributed content delivery, IEEE Internet Computing(2002) 50–58.

[21] B. Krishnamurthy, C.E. Wills, Y. Zhang, On the use and performanceof content distribution networks, in: Proceedings of Internet

Measurement Workshop, 2001.[22] V.K. Adhikari, S. Jain, Z. Zhang. Where do you ’’tube’’? uncovering

YouTube server selection strategy, in: Proceedings of the IEEEInternational Conference on Computer Communications and Net-works (ICCCN 2011), Hawaii, July 31–August 4, 2011.

[23] G. Pallis, A. Vakali, Insight and perspectives for content deliverynetworks, Communications of the ACM (CACM) 49 (No. 1) (2006)

101–106.[24] C. Liu, I. Bouazizi, M. Gabbouj, Segment duration for rate adaptation

of adaptive HTTP streaming, in: Proceedings of the IEEE InternationalConference on Multimedia and Expo (ICME 2011), Barcelona, Spain,July 11–15, 2011.

[25] The network simulator Ns-2. Available [online]. /http://www.isi.edu/nsnam/nsS.

[26] C. Muller, C. Timmerer, A test-bed for the dynamic adaptivestreaming over HTTP featuring session mobility, in: Proceedings

of the ACM Mulitimedia Systems Conference (ACM MMSys 2011),San Jose, California, February 23–25, 2011.


http://www.isi.edu/nsnam/ns

http://www.isi.edu/nsnam/ns

dx.doi.org/10.1016/j.image.2011.10.001

signal processing: image communicationmoncef/publications/rate-adaptation-ic-2011.pdf · rate...

Documents