Post on 16-Dec-2015

Join-Idle-Queue: A Novel Load Balancing Algorithm for Dynamically Scalable Web Services

Li Yu, Qiaomin Xie, Gabriel Kliot, Alan Geller, James R. Larus, Albert Greenberg

IFIP Performance 2011 Best Paper

Presented by Amir Nahir

Agenda

Queuing terminology and background: M/M/N, Processor Sharing
Motivation
The algorithm
Some analysis
Results
Caveat: where has time gone?

M/M/N

Single shared queue
Jobs wait in the queue
Whenever a server completes one job, it gets the next from the queue

M/M/N

Pros:
  Jobs go to the next server to become available
Cons:
  Centralized (single point of failure, bottleneck)
  Hidden overhead – the time it takes to access the queue to get the job

Service Disciplines

FIFO (a.k.a. FCFS)
Processor sharing

FIFO vs. Processor Sharing

Analysis is very similar
But results are not quite the same
E.g., assume three jobs arrive at the system at time 0

FIFO: Avg. time in system = 2
Processor sharing: Avg. time in system = 3

PS is currently seen as the “realistic” model
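
As a hedged illustration of the two averages above: assuming the three jobs each need one unit of service and a single server works at unit rate (neither is stated on the slide), the sketch below computes completion times under both disciplines. The function names are illustrative only.

```python
def fifo_completion_times(sizes):
    """FIFO: jobs run one at a time, in arrival order."""
    t, done = 0.0, []
    for s in sizes:
        t += s
        done.append(t)
    return done


def ps_completion_times(sizes):
    """Processor sharing: all jobs present split the server equally.

    Returns completion times ordered from the shortest job to the longest.
    """
    done, t, served = [], 0.0, 0.0
    ordered = sorted(sizes)
    for k, s in enumerate(ordered):
        active = len(ordered) - k        # jobs still in the system
        t += (s - served) * active       # time until the next job finishes
        served = s                       # work each remaining job has received
        done.append(t)
    return done


def mean(xs):
    return sum(xs) / len(xs)


jobs = [1.0, 1.0, 1.0]  # assumed: three unit-size jobs, all arriving at time 0
print(mean(fifo_completion_times(jobs)))  # 2.0
print(mean(ps_completion_times(jobs)))    # 3.0
```

Under FIFO the jobs leave at times 1, 2 and 3; under PS all three share the server and leave together at time 3, which is where the gap between the two averages comes from.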

Web-Services: The User’s Experience

No one really tries to model the whole process as a single problem

Common components (unchanged by the research) are often neglected

[Figure: diagram with Server and Scheduler components]

The Main Motivation for the Paper

Reduce delays on the job’s critical path

[Figure: diagram with Server and Scheduler components]

The Join-Idle-Queue Algorithm: System Structure

Two-layer system: dispatchers (front-ends) and processors (back-ends, servers)
The ratio between servers and dispatchers is denoted by r
No assumptions regarding processor discipline (can support PS, FIFO)
Each dispatcher has an I-queue
  The I-queue holds servers (not jobs)

The Join-Idle-Queue Algorithm: Dispatcher Behavior

Upon receiving a job from user:
  If there are servers in the I-queue, dequeue first server and send job to it
  Otherwise – send job to random server
    This deteriorates system performance
This is termed primary load balancing
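
The dispatcher rule above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Dispatcher` class, its method names, and the use of integer server ids are all assumptions made for the example.

```python
import random
from collections import deque


class Dispatcher:
    """Hypothetical front-end holding an I-queue of idle server ids."""

    def __init__(self, servers, rng=None):
        self.servers = list(servers)        # every back-end server id
        self.i_queue = deque()              # idle servers registered here
        self.rng = rng or random.Random()

    def register_idle(self, server_id):
        """Called when an idle server joins this dispatcher's I-queue."""
        self.i_queue.append(server_id)

    def route(self):
        """Pick the server that receives the next job."""
        if self.i_queue:
            # An idle server is known: take the first one off the I-queue.
            return self.i_queue.popleft()
        # No idle server known: fall back to a uniformly random server.
        return self.rng.choice(self.servers)


d = Dispatcher(range(4), random.Random(0))
d.register_idle(2)
first = d.route()    # the registered idle server
second = d.route()   # I-queue is now empty, so a random server is chosen
```

Note that the dispatcher never inspects server queue lengths when a job arrives, which is how the scheme keeps load-balancing communication off the job's critical path.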

The Join-Idle-Queue Algorithm: Server Behavior

Upon completing all jobs:
  Choose a dispatcher
    Two techniques are considered: Random and SQ(d)
  Register in its I-queue
This is termed secondary load balancing
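
A rough sketch of the two ways an idle server might choose a dispatcher. SQ(d) is taken here to mean "sample d dispatchers and join the shortest I-queue among them"; sampling without replacement and the function names are assumptions made for the example.

```python
import random


def choose_random(i_queue_lengths, rng):
    """Random: pick any dispatcher uniformly, ignoring I-queue lengths."""
    return rng.randrange(len(i_queue_lengths))


def choose_sq_d(i_queue_lengths, d, rng):
    """SQ(d): sample d distinct dispatchers, join the shortest I-queue."""
    candidates = rng.sample(range(len(i_queue_lengths)), d)
    return min(candidates, key=lambda j: i_queue_lengths[j])


rng = random.Random(0)
lengths = [3, 0, 2, 5]                    # current I-queue length per dispatcher
target = choose_sq_d(lengths, 2, rng)     # index of the dispatcher to register with
```

With d equal to the number of dispatchers, SQ(d) always finds the globally shortest I-queue; smaller d trades accuracy for fewer probes.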

The Join-Idle-Queue Algorithm at Work

[Figure: servers 1 to 4 with the dispatchers' I-queues]

The Join-Idle-Queue Algorithm: Corner Case 1

[Figure: servers 1 to 4 with the dispatchers' I-queues]

Server 2 is busy processing a job while being registered as “idle” in one of the I-queues

The Join-Idle-Queue Algorithm: Corner Case 2

[Figure: servers 1 to 4 with the dispatchers' I-queues]

Server 2 is reported as “idle” in more than one dispatcher

JIQ Analysis: Some Notations

r – the ratio of servers to dispatchers
When is the algorithm expected to perform better, large r or small r?

JIQ Analysis: Some Notations

p_i – the probability that a server holds exactly i jobs
  p_0 – the probability that a server is idle
λ_i – the arrival rate of jobs to a server which holds exactly i jobs
  λ_0 – the arrival rate of jobs to idle processors
ρ = λ/μ
  Common notation in queuing

Load Balancing Assertions

No matter how you balance the load: p_0 = 1 – λ

[Figure: jobs arrive at rate λn to m dispatchers, which forward them to n servers]

$$\sum_{i=0}^{\infty} \lambda_i p_i = \lambda$$

Load Balancing: So Where Does the Wisdom Go?

It’s not about increasing the probability that a server is idle
It’s about increasing the arrival rate to idle (and lightly loaded) servers
And from there,

$$Q = \sum_{i=1}^{\infty} i\,p_i$$

Theorem 1: Proportion of Occupied I-Queues

There’s a strong connection between idle servers and occupied I-queues
Jobs arrive at the system at rate λn
The expected number of idle servers is (1-λ)n
These idle servers are equally distributed among the dispatchers, so the average number of idle servers per I-queue is (1-λ)n/m = (1-λ)r

Theorem 1: Proportion of Occupied I-Queues

On the other hand, the authors show that “server arrivals” to the I-queue do behave like a Poisson process (when n→∞)
Servers arrive at I-queues at rate ρ
There are ρm occupied I-queues (on average)
And so the average I-queue length, under random secondary load balancing, is:

$$\frac{\rho}{1-\rho} = (1-\lambda)\,r$$
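
Solving the relation ρ/(1-ρ) = (1-λ)r for ρ gives ρ = x/(1+x) with x = (1-λ)r. The small sketch below evaluates this for sample values; the function name and the chosen numbers are illustrative assumptions, not figures from the paper.

```python
def occupied_fraction(r, lam):
    """Proportion of occupied I-queues, from rho / (1 - rho) = (1 - lam) * r."""
    x = r * (1.0 - lam)      # average I-queue length
    return x / (1.0 + x)


# Example: 10 servers per dispatcher at 90% load.
print(round(occupied_fraction(10, 0.9), 3))  # 0.5
```

So at 90% load with r = 10, roughly half of the I-queues hold at least one idle server; a larger r raises that fraction.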

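Corollary 2: The Arrival Rate at Idle Servers (1)

Job arrival rate at the specific dispatcher is λn/m
A job has probability ρ to find an occupied I-queue
Average I-queue length is r(1-λ)

$$\lambda\,\frac{n}{m}\cdot\rho\cdot\frac{1}{r(1-\lambda)} = \frac{\lambda\rho}{1-\lambda}$$

[Figure: servers 1 to 4 with an I-queue]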

Corollary 2: The Arrival Rate at Idle Servers (2)

Job arrival rate at servers is λ
A job has probability (1-ρ) to find an empty I-queue
Overall arrival rate at idle servers:

$$\lambda(1-\rho) + \frac{\lambda\rho}{1-\lambda} = \lambda(1-\rho)(r+1)$$

[Figure: servers 1 to 4 with an I-queue]

Corollary 2: The Arrival Rate at Non-Idle Servers

Job arrival rate at servers is λ
A job has probability (1-ρ) to find an empty I-queue
Arrival rate at busy servers is λ(1-ρ)

[Figure: servers 1 to 4 with an I-queue]

The arrival rate at idle servers is (r+1) times the arrival rate at non-idle servers
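
A quick numeric check of the (r+1) factor, assuming the Theorem 1 relation ρ/(1-ρ) = r(1-λ) for random secondary load balancing. Busy servers only receive randomly dispatched jobs, at rate λ(1-ρ); idle servers additionally receive jobs through the I-queues, at rate λρ/(1-λ). The function name and sample values are illustrative.

```python
def arrival_rates(r, lam):
    """Return (rate at idle servers, rate at busy servers)."""
    x = r * (1.0 - lam)
    rho = x / (1.0 + x)                      # proportion of occupied I-queues
    busy = lam * (1.0 - rho)                 # random dispatch only
    idle = busy + lam * rho / (1.0 - lam)    # random dispatch plus I-queue traffic
    return idle, busy


idle, busy = arrival_rates(10, 0.5)
print(round(idle / busy, 6))  # 11.0
```

The ratio does not depend on λ, only on r, which is one way to see why a larger server-to-dispatcher ratio helps.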

Proportion of Empty I-queues

Results (Exponential Job Length)

Job Length Distributions

Sensitivity to Variance (PS)

Effect of r on Performance

Caveat: Scheduling Still Takes Time…

When the decision for the secondary load balancing takes place, the server is not registered at any I-queue
At this time, performance is expected to degrade.

[Figure: diagram with Server and Scheduler components]
