Scheduling On-demand Broadcasts: New Metrics and Algorithms

Swarup Acharya

Information Sciences Research Center Bell Laboratories, Lucent Technologies

Murray Hill, NJ

S. Muthukrishnan

Mathematical Sciences Research Center Bell Laboratories, Lucent Technologies

Murray Hill, NJ

Introduction

Broadcast capacity has increased owing to various technological advances

This increase is complemented by the growth of large-scale information-centric applications

Many of these applications are pull-based; that is, they respond to on-demand user requests

In general, pull-based systems are more widespread and adapt better to dynamic workloads

Presentation Index

Description of the on-demand heterogeneous broadcasting setting and formalization of performance metrics

Review of relevant scheduling algorithms and description of new ones

Description of the experimental setup and presentation of the performance results for different algorithms

Conclusions and Future Work

Background and Performance Issues

Aspects of scheduling on-demand broadcasts:

Broadcast Delivery: Bandwidth utilization is higher in broadcast-based systems because a single broadcast of a data item simultaneously satisfies all pending requests for that item

Heterogeneity: Data requirements of users and applications are of different sizes. Encapsulating all responses into single-size broadcasts is wasteful

Clairvoyance: Given the channel bandwidth, the size of the requested data item provides an estimate of its service time

Background and Performance Issues

The Model

•Downlink bandwidth is $B$ Kbytes/s

•Size of the data item of the i-th request is $S_i$ (bytes)

•Service time is $t_i = S_i / B$ milliseconds

•A page is the basic fixed-length unit of data transfer

•Broadcast pages have self-identifying headers

•The server delivers the pages comprising an item in order

Background and Performance Issues

Preemption

Preemption: Interruption of a broadcast to service other requests before resuming the remainder of the original broadcast

Advantages :

•Significantly better performance for all metrics

•Preemptive schedules can often be approximated reasonably

Disadvantages :

•Requires additional buffer space at the server

•Increased algorithm complexity

Preemption is used for heterogeneous workloads to avoid backlog of pending requests while a long job is serviced

Background and Performance Issues

Performance Metrics

Individual measures

•Response time

•Stretch

Overall system performance

•Average of an individual measure

•Maximum of an individual measure

•Average of the maximum stretch for each class* (AMAX)

*Jobs are split into “classes” based on their service time
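
The individual and overall measures can be illustrated with a minimal Python sketch. It assumes the usual definition of stretch as response time divided by service time (the slide names the metric without defining it); the `Request` record and its field names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Request:
    arrival: float      # time the request was issued
    completion: float   # time the last page of the item was received
    service: float      # service time t_i = S_i / B, in the same time unit

def response_time(r: Request) -> float:
    return r.completion - r.arrival

def stretch(r: Request) -> float:
    # Assumed definition: response time normalized by service time.
    return response_time(r) / r.service

def summarize(requests):
    """Average and maximum of both individual measures."""
    resp = [response_time(r) for r in requests]
    strs = [stretch(r) for r in requests]
    return {
        "avg_response": sum(resp) / len(resp),
        "max_response": max(resp),
        "avg_stretch": sum(strs) / len(strs),
        "max_stretch": max(strs),
    }

# A long job that waited a little and a short job that waited a lot.
stats = summarize([Request(0, 10, 5), Request(2, 6, 1)])
```

In this example the short job has the smaller response time (4 versus 10) but the larger stretch (4 versus 2), which is why stretch is the more telling measure when job sizes differ widely.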

Scheduling Algorithms

Non-Preemptive versions

•First-Come-First-Served (FCFS)

Data items are broadcast in the order of their request (Extremely poor performance for most metrics in the broadcast case)

•Longest Wait First (LWF)

The data item which has the largest total wait time is chosen for broadcast (Algorithm is expensive to implement)

•Shortest Service Time First (SSTF)

At the time of scheduling, the data item which has the shortest service time is chosen for the broadcast
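
The three selection rules above can be sketched as follows. This is only an illustration: the data layout (`pending` maps each item to the arrival times of its outstanding requests, `service` maps each item to its service time) and the function names are hypothetical, and ties are broken arbitrarily.

```python
def pick_fcfs(pending):
    # Item whose oldest outstanding request arrived first.
    return min(pending, key=lambda item: min(pending[item]))

def pick_lwf(pending, now):
    # Item with the largest wait time summed over all its pending requests.
    return max(pending, key=lambda item: sum(now - a for a in pending[item]))

def pick_sstf(pending, service):
    # Item with the shortest service time.
    return min(pending, key=lambda item: service[item])

pending = {"A": [0.0], "B": [1.0, 1.5, 2.0], "C": [3.0]}
service = {"A": 8.0, "B": 4.0, "C": 1.0}
now = 4.0

choices = (pick_fcfs(pending), pick_lwf(pending, now), pick_sstf(pending, service))
```

Here the three rules pick three different items: FCFS chooses A (oldest request), LWF chooses B (three requests with a total wait of 7.5), and SSTF chooses C (shortest service time). Note that LWF has to re-sum the waits of every pending request at each decision, which is consistent with the remark that it is expensive to implement.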

Scheduling Algorithms

Preemptive versions

•Preemptive Longest Wait First (PLWF)

After the broadcast of a page, the LWF criterion is applied to pick the subsequent data item for broadcast

•Shortest Remaining Service Time (SRST)

After the broadcast of a page, the SSTF criterion is applied to pick the subsequent data item for broadcast

•Longest Total Stretch First (LTSF)

The data item which has the largest total current stretch is chosen for broadcast

•BASE

An offline algorithm which has complete knowledge of the entire access trace (Not practical)
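
As a rough sketch of the per-page decisions, the functions below re-evaluate the choice after each broadcast page. They assume that the "current stretch" of a pending request is the time it has spent in the system so far divided by its service time; the slide does not spell this out. The dictionaries `pending`, `service`, and `remaining` are hypothetical names.

```python
def pick_srst(remaining):
    # Item with the least service time still left to broadcast.
    return min(remaining, key=lambda item: remaining[item])

def pick_ltsf(pending, service, now):
    # Item with the largest sum of current stretches over its pending requests.
    def total_stretch(item):
        return sum((now - a) / service[item] for a in pending[item])
    return max(pending, key=total_stretch)

pending = {"A": [0.0], "B": [6.0, 7.0]}   # arrival times per item
service = {"A": 10.0, "B": 2.0}           # full service time per item
remaining = {"A": 1.0, "B": 2.0}          # service time left per item
now = 8.0

srst_choice = pick_srst(remaining)
ltsf_choice = pick_ltsf(pending, service, now)
```

With these numbers SRST finishes the nearly complete item A, while LTSF switches to B, whose two short requests have accumulated a total current stretch of 1.5 against 0.8 for A.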

How BASE works:

•Repeatedly guess a value S for the maximum stretch of any job

•Define a deadline for each job: deadline = arrival_time + service_time * S

•Use Earliest Deadline First (EDF) to check whether every job meets its deadline

•Use binary search to find the minimum feasible S

•All pending requests for an object are satisfied simultaneously when it is broadcast

The BASE algorithm is nearly optimal for the point-to-point case (it minimizes the maximum stretch of any job) but is not optimal for the broadcast case
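
The guess-and-check loop can be sketched for the point-to-point case only, where every job is served separately. This is a simplified illustration, not the authors' implementation: it does not model the broadcast rule that one transmission satisfies all pending requests for an object, and the function names and tolerances are assumptions.

```python
import heapq

def edf_feasible(jobs, S):
    """jobs: list of (arrival, service). Run preemptive EDF on one channel and
    report whether every job finishes by arrival + service * S."""
    order = sorted(jobs)
    heap = []                       # entries are (deadline, remaining service)
    t, i, n = 0.0, 0, len(order)
    while i < n or heap:
        if not heap:                # channel idle: jump to the next arrival
            t = max(t, order[i][0])
        while i < n and order[i][0] <= t:
            a, s = order[i]
            heapq.heappush(heap, (a + s * S, s))
            i += 1
        deadline, rem = heapq.heappop(heap)
        next_arrival = order[i][0] if i < n else float("inf")
        run = min(rem, next_arrival - t)    # run until done or next arrival
        t += run
        if rem - run > 1e-12:
            heapq.heappush(heap, (deadline, rem - run))   # preempted
        elif t > deadline + 1e-9:
            return False                                  # finished too late
    return True

def min_max_stretch(jobs, tol=1e-6):
    lo, hi = 1.0, 2.0               # no job can have stretch below 1
    while not edf_feasible(jobs, hi):
        hi *= 2
    while hi - lo > tol:            # binary search on the stretch guess S
        mid = (lo + hi) / 2
        if edf_feasible(jobs, mid):
            hi = mid
        else:
            lo = mid
    return hi

best = min_max_stretch([(0.0, 4.0), (1.0, 1.0)])
```

For the two jobs in the usage line, the best schedule preempts the long job at time 1 to serve the short one, so the long job finishes at time 5 and the minimum feasible S is about 1.25.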

Example :

•Consider 2 requests for data item A with service time x

•First request arrives at t and second at t+1

•The EDF will broadcast A at t followed by a second copy of A at t+x

•The maximum stretch is: $\max\{1, \frac{2x-1}{x}\}$

Alternate approach :

•A is broadcast at [t, t+1) and then is preempted to be rebroadcast at t+1 (due to second request)

•The maximum stretch is: $\max\{\frac{x+1}{x}, 1\}$ (upper bound for an optimal strategy)

•For x>2 we have: $\max\{\frac{x+1}{x}, 1\} < \max\{1, \frac{2x-1}{x}\}$
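
The two schedules in this example can be checked numerically with a small sketch. It takes t = 0 and assumes x >= 1, so that the second request arrives while the first copy is still being broadcast; the function names are hypothetical.

```python
def edf_max_stretch(x):
    # Copy 1 occupies [0, x) and copy 2 occupies [x, 2x).
    first = (x - 0) / x          # request 1: arrives at 0, done at x
    second = (2 * x - 1) / x     # request 2: arrives at 1, done at 2x
    return max(first, second)

def restart_max_stretch(x):
    # A is started at 0, interrupted at 1, and rebroadcast over [1, 1 + x).
    first = (1 + x) / x          # request 1: arrives at 0, done at 1 + x
    second = x / x               # request 2: arrives at 1, done at 1 + x
    return max(first, second)

rows = [(x, edf_max_stretch(x), restart_max_stretch(x)) for x in (2, 4, 10)]
```

The two schedules tie at x = 2 (both give 1.5); for x = 4 restarting gives 1.25 against 1.75 for EDF, and the gap widens as x grows, since the EDF value approaches 2 while the restart value approaches 1.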

Experimental Results

Simulation Model & Parameter Settings

Input Traces :

•A web workload generator was used (SURGE)

•45,000 accesses generated from simultaneous requests by 15 clients

•1000 distinct documents were accessed

•Smallest document: 213 bytes; largest document: 5.6 MB

Parameter Settings :

•High Downlink bandwidth NetBW = 100 Kbytes/s

•Low Downlink bandwidth NetBW = 32 Kbytes/s

•Job length (service time) = roundup(size of doc. / NetBW)

•Page broadcast time (Quanta) = 0.02 sec

Experimental Results

Effect of Preemption

Average Response Time Improvement (HighBW network)

•Preemption improves average response time

•SRST outperforms PLWF (priority to smaller requests)

•PLWF preempts 2% of the requests (for current input trace)

•SRST preempts 8% of the requests (for current input trace)

Average Stretch Improvement (HighBW network)

•Preemption improves average stretch

•SRST outperforms PLWF (priority to smaller requests)

•PLWF preempts mainly on multiple common requests

•SRST preempts additionally based on request size

Experimental Results

Response Time & Stretch Study

Average & Maximum Response Time (HighBW network)

•No single winner among algorithms

•PLWF fares badly in average response time

•SRST fares badly in worst-case response time

•Both LTSF & BASE strike a reasonable balance

Average & Maximum Stretch (HighBW network)

General Observation* :

•BASE does well on the measure it is designed to optimize (maximum stretch)

•BASE also does well on measures it is not designed to optimize (average response time)

*Similar results in LowBW network

Maximum Stretch Per Job Class

(LowBW network)

Algorithm   AMAX     Max. Stretch

SRST        547.80   2563.17

LTSF        347.13   458.33

BASE        304.82   361.33

Class definition :

•Jobs are divided into classes by size

•A job of size between $2^{i-1}+1$ and $2^i$ bytes belongs to class i

•All jobs with size less than 1024 bytes belong to class 1
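
A minimal sketch of the class assignment and of AMAX follows. It applies the two rules above literally, which leaves classes 2 through 9 unused; that reading is an assumption, and the original work may number the classes differently. The function names are hypothetical.

```python
from collections import defaultdict

def job_class(size_bytes: int) -> int:
    if size_bytes < 1024:
        return 1
    # Smallest i with 2**(i-1) + 1 <= size <= 2**i.
    return (size_bytes - 1).bit_length()

def amax(jobs):
    """jobs: iterable of (size_bytes, stretch). Returns the average, over the
    classes that occur, of the maximum stretch seen in each class."""
    worst = defaultdict(float)
    for size, s in jobs:
        c = job_class(size)
        worst[c] = max(worst[c], s)
    return sum(worst.values()) / len(worst)

jobs = [(213, 2.0), (500, 6.0), (1500, 3.0)]
value = amax(jobs)
```

In this toy trace the two small jobs fall in class 1 (maximum stretch 6.0) and the 1500-byte job in class 11 (maximum stretch 3.0), so AMAX is 4.5 while the overall maximum stretch is 6.0. AMAX is thus less dominated by a single badly served class than the global maximum.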

Average Stretch Per Job Class

(LowBW network)

•BASE provides the most desirable overall performance

•It strikes a fine balance between the demands of individual requests and global requirements

•As stated before, BASE is not practical, since it is an offline algorithm and computationally expensive

Experimental Results

Developing an Online BASE algorithm

Problems in creating an online BASE:

Guessing a suitable stretch value:

•An online algorithm must approximately “guess” the stretch value that BASE finds.

•This guess must be adjusted as requests arrive over time.

Choosing candidates for broadcast efficiently:

•Efficiency depends on how the deadlines are maintained as new requests arrive over time.

•It also depends on how the candidate request with the earliest deadline is determined.

Online BASE algorithm settings:

•The stretch guess is based on the past history of accesses

•At any point in time, the current stretch guess is used to set deadlines

•Once a deadline is set, it is not changed, even if the stretch guess changes later

Algorithm variations based on the History Window (HWin):

•MAX: Use the maximum value of the individual stretch of the last HWin satisfied requests as the current guess of stretch

•AVG: Use the average of the stretches of last HWin completed requests

•MAXa: Similar to MAX but stretch value is multiplied by a factor a*

•AVGa: Similar to AVG but stretch value is multiplied by a factor a*

*The factor a is set to $\gamma^{1/3}$, where $\gamma$ = largest doc. so far / smallest doc. so far
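
The four variants can be sketched with one small class. This is an illustration under assumptions: the class and parameter names are hypothetical, the initial guess of 1.0 before any request completes is an assumption, and $\gamma$ is read as the ratio of the largest to the smallest document size seen so far.

```python
from collections import deque

class StretchGuess:
    def __init__(self, hwin, use_max=True, scale=False):
        # Stretches of the last `hwin` completed requests (None = unbounded).
        self.window = deque(maxlen=hwin)
        self.use_max = use_max      # True: MAX / MAXa, False: AVG / AVGa
        self.scale = scale          # True: multiply by a = gamma ** (1/3)
        self.smallest = None
        self.largest = None

    def record(self, size, stretch):
        """Call when a request completes."""
        self.window.append(stretch)
        self.smallest = size if self.smallest is None else min(self.smallest, size)
        self.largest = size if self.largest is None else max(self.largest, size)

    def current(self):
        if not self.window:
            return 1.0              # assumption: no history yet
        w = self.window
        guess = max(w) if self.use_max else sum(w) / len(w)
        if self.scale:
            gamma = self.largest / self.smallest
            guess *= gamma ** (1 / 3)
        return guess

variants = {
    "MAX": StretchGuess(2),
    "AVG": StretchGuess(2, use_max=False),
    "MAXa": StretchGuess(2, scale=True),
}
for g in variants.values():
    for size, s in [(100, 3.0), (800, 1.5), (100, 2.0)]:
        g.record(size, s)
```

With HWin = 2 only the last two stretches (1.5 and 2.0) are remembered, so MAX guesses 2.0 and AVG guesses 1.75; MAXa multiplies the MAX guess by the cube root of 800/100, giving 4.0.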

Maximum Stretch vs. HWin

•Increasing the window size reduces the maximum stretch

•The MAX algorithm follows BASE closely

•Using SoFar as the value of HWin reduces the storage overhead of MAX

•Other values of HWin require maintaining a sliding window of values

Maximum Stretch per class

•MAX matches BASE the best

•AVG tends to do the worst

•MAX has AMAX = 346.87

•AVGa has AMAX = 321.71

Average Stretch per class

•MAX matches BASE the best

•MAXa tends to do the worst

•The graphs show that the factor a has only limited benefit in practice.

Experimental Results

Performance of MAX

Maximum & Average Response Time

•MAX is compared with SRST and LTSF

•Despite its simplifications, MAX matches the performance of BASE

•Although MAX is designed with only stretch performance in mind, its performance on response times is very good

Maximum & Average Stretch

•MAX strikes a fine balance between individual & global requirements

•Its running time is O(log N), where N is the number of pending requests in the system at any time

•Together with SRST, it is the fastest algorithm

•BASE & LTSF are significantly more expensive

•The complexity of MAX can be further reduced to O(log C) by grouping the pending requests into C classes
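
One common way to obtain logarithmic cost per scheduling step is to keep the fixed deadlines in a binary heap. The sketch below assumes that structure; the slides do not say which data structure the authors used, and the class and method names are hypothetical.

```python
import heapq

class DeadlineQueue:
    def __init__(self):
        self.heap = []      # entries are (deadline, request_id, item)
        self.open = {}      # item -> ids of its unsatisfied requests
        self.next_id = 0

    def add(self, item, arrival, service, stretch_guess):
        # The deadline is fixed on arrival and never revised afterwards.
        deadline = arrival + service * stretch_guess
        rid = self.next_id
        self.next_id += 1
        heapq.heappush(self.heap, (deadline, rid, item))
        self.open.setdefault(item, set()).add(rid)

    def earliest(self):
        """Item with the earliest deadline among unsatisfied requests."""
        # Lazily discard entries for requests that were already satisfied.
        while self.heap and self.heap[0][1] not in self.open.get(self.heap[0][2], ()):
            heapq.heappop(self.heap)
        return self.heap[0][2] if self.heap else None

    def satisfied(self, item):
        # One completed broadcast satisfies every pending request for the item.
        self.open.pop(item, None)

q = DeadlineQueue()
q.add("A", arrival=0, service=10, stretch_guess=2)   # deadline 20
q.add("B", arrival=1, service=2, stretch_guess=2)    # deadline 5
q.add("A", arrival=3, service=10, stretch_guess=2)   # deadline 23
first = q.earliest()
q.satisfied("B")
second = q.earliest()
q.satisfied("A")
third = q.earliest()
```

Because `earliest` only peeks, the scheduler can call it after every page and switch items when a tighter deadline arrives. Each request is pushed and popped at most once, so the amortized cost per operation is O(log N) in the number of queued requests.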

Conclusions & Future Work

We have studied the problem of scheduling heterogeneous requests in an on-demand broadcast-based environment

Besides the response time of a request, we studied its stretch, which is a better performance measure for heterogeneous workloads

Several scheduling algorithms based on the stretch measure have been proposed

The MAX algorithm was found to do well, balancing individual worst-case performance and average global performance in both response time and stretch

An open algorithmic problem is raised: determining a schedule that minimizes the average response time or the maximum stretch in the broadcast setting with preemption

BASE optimizes the maximum stretch in the unicast case but no longer does so in the broadcast case

SRST optimizes the average response time in the unicast case but no longer does so in the broadcast case
