Priority Scheduling: An Application for the Permutahedron Ethan Bolker UMass-Boston BMC Software AMS Toronto meeting September 24, 2000



2

Plan
• Brief introduction to queueing theory
• Priority scheduling
• Conservation laws and the permutahedron
• Specifying CPU shares

interesting pictures and open questions

References: www.cs.umb.edu/~eb/goalmode
Acknowledgements: Jeff Buzen, Yiping Ding, Dan Keefe, Oliver Chen, Aaron Ball, Tom Larard

3

Queueing theory
• Workload: stream of jobs visiting a server (ATM, time shared CPU, printer, …)
• Jobs queue when server is busy
• Input:
  – Arrival rate: λ job/sec
  – Service demand: s sec/job
• Performance metrics:
  – Utilization: u = λs (must be ≤ 1)
  – Response time: r = ???
  – Degradation: d = r/s
  – Queue length: q = λr (Little’s law)

4

Response time computations
• r, d, q measure queueing delay
  r ≥ s (d ≥ 1), unless parallel processing possible
• Randomness really matters
  r = s (d = 1) if arrivals scheduled (best case, no waiting)
  r >> s for bulk arrivals (worst case, maximum delays)
• Theorem. d = 1/(1-u) if arrivals are Poisson and service is exponentially distributed (M/M/1).
  r = s/(1-u) (think virtual server with speed 1-u)
  q = u/(1-u) (convention: job in service is on queue)
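The M/M/1 formulas above can be evaluated directly. The following is a minimal sketch; the function name `mm1_metrics` and its parameter names are illustrative choices, with the arrival rate written λ in the comments.

```python
def mm1_metrics(arrival_rate, service_demand):
    """Return (u, d, r, q) for an M/M/1 queue.

    arrival_rate   -- jobs per second (lambda)
    service_demand -- seconds of service per job (s)
    """
    u = arrival_rate * service_demand      # utilization u = lambda * s
    if not 0 <= u < 1:
        raise ValueError("utilization must satisfy 0 <= u < 1")
    d = 1.0 / (1.0 - u)                    # degradation d = 1/(1-u)
    r = service_demand * d                 # response time r = s/(1-u)
    q = arrival_rate * r                   # Little's law, equals u/(1-u)
    return u, d, r, q


if __name__ == "__main__":
    u, d, r, q = mm1_metrics(arrival_rate=0.9, service_demand=1.0)
    print(f"u={u:.2f} d={d:.2f} r={r:.2f} q={q:.2f}")
```

The queue length is computed through Little's law rather than from u/(1-u) so that the two expressions can be compared as a sanity check.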

5

M/M/1
• Essential nonlinearity often counterintuitive
  – at u = 90% average queue length is 0.9/(1-0.9) = 9,
  – average response time is s/(1-0.9) = 10s,
  – but 1 customer in 10 has no wait at all (10% idle time)
• A useful guide even when hypotheses fail
  – accurate enough (± 20%) for real computer systems
  – d depends only on u: many small jobs have same impact as few large jobs
  – faster system ⇒ smaller s ⇒ smaller u ⇒ r = s/(1-u)
    double win: less service, less wait
  – waiting costly, server cheap (telephones): want u ≈ 0
  – server costly (doctors): want u ≈ 1 but scheduled
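As a worked instance of the double win, start from u = 0.9 and suppose the server becomes twice as fast. The factor of two is an illustrative assumption, not a figure from the talk.

```latex
\begin{aligned}
\text{before: } & r = \frac{s}{1-0.9} = 10\,s \\
\text{after: }  & s' = \tfrac{s}{2},\quad u' = 0.45,\quad
                  r' = \frac{s/2}{1-0.45} \approx 0.91\,s
\end{aligned}
```

Halving the service demand cuts the response time by roughly a factor of eleven, because both the numerator and the utilization shrink.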

6

Multiple Job Streams

• Multiple workloads, utilizations u1, u2, …

• U = Σ ui < 1

All degradations equal: di = 1/(1-U)

• Suppose priority scheduling possible

Study degradation vector V = (d1, d2, …)

Priority Scheduling
• Priority state: order workloads by priority (ties OK)
  – two workloads, 3 states: 12, 21, [12]
  – three workloads, 13 states:
    • 123 (6 = 3! of these ordered states),
    • [12]3 (3 of these),
    • 1[23] (3 of these),
    • [123] (1 state with no priorities)
  – n wkls, f(n) states, n! ordered (simplex lock combos)
• p(s) = prob( state = s ) = fraction of time in state s
• V(s) = degradation vector when state = s (measure this, or compute it using queueing theory)
• V = Σ p(s) V(s), summed over states s (time avg is convex combination)
• Achievable region is convex hull of vectors V(s)
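The state counts above (3 for two workloads, 13 for three) can be reproduced by enumerating every way to split the workloads into priority levels with ties allowed. This is a small sketch; `priority_states` and `label` are hypothetical helper names.

```python
from itertools import combinations


def priority_states(workloads):
    """Yield every priority state as a tuple of tied groups, highest priority first."""
    workloads = tuple(workloads)
    if not workloads:
        yield ()
        return
    # Choose the non-empty group that shares top priority, then recurse on the rest.
    for size in range(1, len(workloads) + 1):
        for top in combinations(workloads, size):
            rest = tuple(w for w in workloads if w not in top)
            for tail in priority_states(rest):
                yield (top,) + tail


def label(state):
    """Format a state in the slides' notation, e.g. ((1, 2), (3,)) -> '[12]3'."""
    return "".join(
        str(g[0]) if len(g) == 1 else "[" + "".join(map(str, g)) + "]"
        for g in state
    )


if __name__ == "__main__":
    for n in (2, 3, 4):
        states = list(priority_states(range(1, n + 1)))
        ordered = [s for s in states if len(s) == n]   # no ties: n! of these
        print(n, len(states), len(ordered))
    print(sorted(label(s) for s in priority_states((1, 2, 3))))
```

Explicit enumeration grows very quickly with n, so it is only practical for a handful of workloads.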

8

Two workloads

[Figure: axes d1 and d2 with the line d1 = d2. Marked points V(12) (wkl 1 high prio), V(21) and V([12]) (no priorities); the achievable region is indicated.]

9

Two workloads

[Figure: axes d1 and d2 with the line d1 = d2. Marked points V(12) (wkl 1 high prio), V(21) and V([12]) (no priorities); the midpoint 0.5 V(12) + 0.5 V(21) is labeled alongside V([12]).]

10

Two workloads

[Figure: axes d1 and d2 with the line d1 = d2. Marked points V(12) (wkl 1 high prio), V(21) and V([12]) (no priorities).]

note: u1 < u2 ⇒ wkl 2 effect on wkl 1 large

11

Conservation
• No Free Lunch Theorem. Weighted average degradation is constant, independent of priority scheduling scheme:
  Σi (ui/U) di = 1/(1-U)
• Provable from some hypotheses
• Observable in some real systems
• Sometimes false: shortest job first minimizes average response time (printer queues, supermarket express checkout lines)
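As a quick consistency check (a worked instance, not a proof), the no-priority state satisfies the conservation law: every d_i equals 1/(1-U), and the weights u_i/U sum to 1.

```latex
\sum_i \frac{u_i}{U}\, d_i
  = \frac{1}{1-U} \sum_i \frac{u_i}{U}
  = \frac{1}{1-U}\cdot\frac{U}{U}
  = \frac{1}{1-U}
```

Multiplying both sides of the law by U gives the equivalent form \(\sum_i u_i d_i = U/(1-U)\).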

12

Conservation
• For any proper set A of workloads
  Imagine giving those workloads top priority. Then can pretend other wkls don’t exist. In that case
  Σi∈A (ui/U(A)) di = 1/(1-U(A))
  When wkls in A have lower priorities they have higher degradations, so in general
  Σi∈A (ui/U(A)) di ≥ 1/(1-U(A))
• These 2^n - 2 linear inequalities determine the convex achievable region R
• R is a permutahedron: only n! vertices
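The constraints above suggest a way to compute candidate vertices. This sketch assumes that, for a strict priority order, every top-priority prefix meets its conservation constraint with equality (the "pretend other wkls don't exist" argument). Solving those equalities in turn gives d = 1/((1 - U_before)(1 - U_through)) for each workload. The names `vertex` and `satisfies_constraints` are hypothetical, and the utilizations are made up.

```python
from itertools import combinations, permutations


def vertex(u, order):
    """Degradation vector for a strict priority order (highest priority first).

    Assumption: each prefix of `order` satisfies its constraint with equality.
    """
    d = [0.0] * len(u)
    used = 0.0                       # utilization of strictly higher priorities
    for w in order:
        d[w] = 1.0 / ((1.0 - used) * (1.0 - used - u[w]))
        used += u[w]
    return d


def satisfies_constraints(u, d, tol=1e-9):
    """Check the 2^n - 2 subset inequalities and the full-set equality."""
    n, total = len(u), sum(u)
    for size in range(1, n):
        for a in combinations(range(n), size):
            ua = sum(u[i] for i in a)
            if sum(u[i] * d[i] for i in a) < ua / (1.0 - ua) - tol:
                return False
    lhs = sum(ui * di for ui, di in zip(u, d))
    return abs(lhs - total / (1.0 - total)) <= tol


if __name__ == "__main__":
    u = [0.2, 0.3, 0.4]              # hypothetical utilizations, U = 0.9
    for order in permutations(range(len(u))):
        d = vertex(u, order)
        print(order, [round(x, 3) for x in d], satisfies_constraints(u, d))
```

The inequalities are checked in the form Σ ui di ≥ U(A)/(1-U(A)), which is the slide's constraint multiplied through by U(A).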

13

Two workload permutahedron

[Figure: axes d1 and d2 with the line u1d1 + u2d2 = U/(1-U).]

14

Two workload permutahedron

[Figure: axes d1 and d2 with the line u1d1 + u2d2 = U/(1-U), the constraint d2 ≥ 1/(1-u2), and the point V(21).]

15

Two workload permutahedron

[Figure: axes d1 and d2 with the line u1d1 + u2d2 = U/(1-U), the constraints d1 ≥ 1/(1-u1) and d2 ≥ 1/(1-u2), the points V(12) and V(21), and the achievable region.]

16

Three workload permutahedron

[Figure: axes d1, d2 and d3 with the plane u1d1 + u2d2 + u3d3 = U/(1-U) and the vertices V(123) and V(213).]

17

Experimental evidence

18

Four workload permutahedron
4! = 24 vertices (ordered states)
2^4 - 2 = 14 facets (proper subsets) (conservation constraints)
74 faces (states)

Simplicial geometry and transportation polytopes, Trans. Amer. Math. Soc. 217 (1976) 138.
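These counts can be cross-checked by counting priority states by their number of priority levels: a state with k levels is an ordered partition of the n workloads into k groups, and there are k!·S(n,k) of those, where S is a Stirling number of the second kind. The sketch assumes that the figure of 74 leaves out the single no-priority state [1234]; that is an interpretation, not something stated on the slide.

```python
from math import factorial


def stirling2(n, k):
    """Stirling number of the second kind: partitions of n items into k blocks."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def states_by_levels(n):
    """Map k -> number of priority states with exactly k priority levels."""
    return {k: factorial(k) * stirling2(n, k) for k in range(1, n + 1)}


if __name__ == "__main__":
    counts = states_by_levels(4)
    print(counts)                            # states per number of priority levels
    print(sum(counts.values()) - counts[1])  # all states except the one with no priorities
```

For n = 4 the entries for k = 4 and k = 2 match the 24 vertices and 14 facets quoted above.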

19

Scheduling for performance
• Administrator specifies performance goals
  – desired degradations (IBM OS/390) (not today)
  – CPU shares (UNIX offerings from HP, IBM, Sun)
• Operating system dispatches jobs in an attempt to meet goals
• Model predicts degradations by constructing map
  workload performance goals → permutahedron

20

Specifying CPU shares
• Administrator specifies workload CPU shares
• Share f (0 < f < 1) means workload guaranteed fraction f of CPU when at least one of its jobs is queued for service, can get more if some competition is absent
• share ≠ utilization
• share ≠ cap
• share should be renamed guarantee

21

Map shares to degradations - two workloads -
• Suppose f1 and f2 > 0, f1 + f2 = 1
• Model: System operates in state
  – 12 with probability f1
  – 21 with probability f2
  (independent of who is on queue)
• Average degradation vector:
  V = f1 V(12) + f2 V(21)
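The two-workload model can be evaluated numerically once the extreme points are known. This sketch assumes V(12) = (1/(1-u1), 1/((1-u1)(1-U))) and V(21) = (1/((1-u2)(1-U)), 1/(1-u2)), which is what the conservation equalities give when one workload has strict priority. The function name and the sample numbers are illustrative.

```python
def two_workload_degradations(u1, u2, f1):
    """Average degradation vector (d1, d2) when workload 1 has share f1.

    Assumption: V(12) and V(21) are the strict-priority extreme points
    obtained from the conservation equalities, with U = u1 + u2.
    """
    if not (0.0 < f1 < 1.0):
        raise ValueError("share f1 must lie strictly between 0 and 1")
    total = u1 + u2
    if total >= 1.0:
        raise ValueError("total utilization must be below 1")
    v12 = (1.0 / (1.0 - u1), 1.0 / ((1.0 - u1) * (1.0 - total)))
    v21 = (1.0 / ((1.0 - u2) * (1.0 - total)), 1.0 / (1.0 - u2))
    f2 = 1.0 - f1
    # Convex combination V = f1 V(12) + f2 V(21)
    return tuple(f1 * a + f2 * b for a, b in zip(v12, v21))


if __name__ == "__main__":
    d1, d2 = two_workload_degradations(u1=0.3, u2=0.5, f1=0.6)
    print(round(d1, 3), round(d2, 3))
```

Because V is a convex combination of two points on the line u1d1 + u2d2 = U/(1-U), the result stays on that line for every share f1.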

22

Model validation

23

Model validation

24

Map shares to degradations - three (n) workloads -

prob(123) = f1 f2 f3 / ((f1 + f2 + f3) (f2 + f3) (f3))

• Theorem: These n! probabilities sum to 1
  – interesting identity generalizing adding fractions
  – prove by induction, or by coupon collecting

• V = Σ prob(s) V(s), summed over ordered states s

• O(n!), Ω(n!), good enough for n ≤ 9 (12)

• Searching for fast (approximate) algorithm ...
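The product formula for prob(123) extends to any order: each factor is the next workload's share divided by the total share of the workloads not yet placed. The brute-force sketch below (hypothetical function names, made-up shares) checks numerically that the n! probabilities sum to 1.

```python
from itertools import permutations


def order_probability(shares, order):
    """Probability of a strict priority order (highest priority first)."""
    prob = 1.0
    for k, w in enumerate(order):
        # Share of w relative to the workloads not yet placed.
        prob *= shares[w] / sum(shares[v] for v in order[k:])
    return prob


def all_order_probabilities(shares):
    """Map each of the n! orders to its probability."""
    n = len(shares)
    return {order: order_probability(shares, order)
            for order in permutations(range(n))}


if __name__ == "__main__":
    shares = [0.5, 0.3, 0.2]             # hypothetical shares f1, f2, f3
    probs = all_order_probabilities(shares)
    for order, p in probs.items():
        print(order, round(p, 4))
    print("sum =", round(sum(probs.values()), 12))
```

With two workloads the formula reduces to prob(12) = f1 and prob(21) = f2. The loop over all permutations is the factorial cost that limits this approach to small n.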

25

Model validation

26

Model validation

27

Map shares to degradations (geometry)

• Interpret shares as barycentric coordinates in the n-1 simplex

• Study the geometry of the map from the simplex to the n-1 dimensional permutahedron

• Easy when n=2: each is a line segment and map is linear

28

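Mapping a triangle to a hexagon

[Figure: map M from the triangle of shares, with edges labeled f1 = 1, f1 = 0, f3 = 0 and f3 = 1, to the hexagon with vertices 132, 123, 213, 312, 321, 231; one side is marked "wkl 1 high priority", the opposite side "wkl 1 low priority".]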

29

Mapping a triangle to a hexagon

[Figure: labels f1 = 1, {23} and f1 = 0.]

30

Mapping a triangle to a hexagon

31

Implementing fair share scheduling

• Actual Sun/Solaris implementation is subtle

• HP and IBM are black boxes (for me)

• Stochastic solution: randomly choose queued job to dispatch (implement the model rather than model an implementation)

• May require prior computation of priodist(w, p) = prob(wkl w runs at prio p)

• workload priority probabilities, not state probabilities
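One way to read the stochastic solution is sketched below. It assumes that the dispatcher picks among the workloads with queued jobs with probability proportional to their shares; that selection rule, the function name and the workload names are illustrative assumptions rather than a description of any vendor's scheduler.

```python
import random


def pick_workload(shares, queued, rng=random):
    """Choose which queued workload to dispatch next.

    Assumption: a queued workload is picked with probability proportional to
    its share among the workloads that currently have jobs waiting.
    """
    queued = list(queued)
    if not queued:
        return None                      # nothing to dispatch
    weights = [shares[w] for w in queued]
    return rng.choices(queued, weights=weights, k=1)[0]


if __name__ == "__main__":
    shares = {"batch": 0.2, "web": 0.5, "db": 0.3}   # hypothetical workloads
    rng = random.Random(0)
    picks = [pick_workload(shares, ["batch", "web"], rng) for _ in range(10000)]
    # Fraction of picks going to "web"; expected to be near 0.5 / 0.7.
    print(picks.count("web") / len(picks))
```

Only the queued workloads enter the draw, so a workload can receive more than its share when competitors are idle, in line with reading a share as a guarantee.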

32

Priority distributions
• Given degradations, compute a priodist
• A priodist is an n×n matrix with row sums 1
• {priodists} = cartesian product of n (n-1)-simplices
• Map is surjective, not injective
• Look for a well behaved inverse image

priodist space (dim n(n-1)) → permutahedron (dim n-1)
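For intuition about what a priodist looks like, the sketch below goes in the forward direction, from shares to a priodist, rather than from degradations as on the slide. It assumes that strict priority orders occur with product-form probabilities (each workload's share divided by the total share of the workloads not yet placed) and it ignores ties; the function name is hypothetical.

```python
from itertools import permutations


def priodist_from_shares(shares):
    """Return matrix m where m[w][p] = prob(workload w runs at priority p).

    Assumption: strict priority orders occur with product-form
    probabilities derived from the shares; tied states are ignored.
    """
    n = len(shares)
    m = [[0.0] * n for _ in range(n)]
    for order in permutations(range(n)):
        prob = 1.0
        for k, w in enumerate(order):
            prob *= shares[w] / sum(shares[v] for v in order[k:])
        for p, w in enumerate(order):
            m[w][p] += prob              # workload w sits at priority level p
    return m


if __name__ == "__main__":
    for row in priodist_from_shares([0.5, 0.3, 0.2]):   # hypothetical shares
        print([round(x, 3) for x in row], "row sum", round(sum(row), 6))
```

Each row sums to 1 because every workload occupies exactly one priority level in every order.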

33

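Three workload permutahedron

[Figure: hexagon drawn against axes d1 and d2, with vertices 132, 123, 213, 312, 321, 231; edges labeled [12]3, 3[12], 2[13], [23]1, 1[23], [13]2; center [123]; and the lines d1 = d2, d2 = d3, d1 = d3.]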

34

… dissected into 3! quadrilaterals

[Figure: axes d1 and d2 with the lines d1 = d2 and d2 = d3; one quadrilateral is shown with corners 123, [12]3, [123] and 1[23].]

35

… each mapped to from a skew quadrilateral of priodists

P123 =
1 0 0
0 1 0
0 0 1

P[12]3 =
.5 .5 0
.5 .5 0
0 0 1

P1[23] =
1 0 0
0 .5 .5
0 .5 .5

P[123] =
.33 .33 .33
.33 .33 .33
.33 .33 .33

(x,y) → xyP123 + x(1-y) P1[23] + (1-x)yP[12]3 + (1-x)(1-y) P[123] → degradation vector in this corner of permutahedron

[Figure: square with corners P123, P1[23], P[12]3, P[123] and an interior point (x,y), mapped to the quadrilateral with corners 123, [12]3, [123], 1[23].]

36

Skew quadrilaterals

• Given 4 points P00, P01, P10, P11 ∈ R^m, map unit square: (x,y) → xyP00 + x(1-y) P01 + (1-x)yP10 + (1-x)(1-y) P11

• Easy to generalize to 2^k points

• Analogous to convex hull, which maps barycentric coordinates on a simplex

• Reference for this construction?
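The map from the unit square is a short computation. Here is a minimal sketch that keeps the slide's pairing of weights with points; the function name and the sample corner points are illustrative.

```python
def skew_quad(p00, p01, p10, p11, x, y):
    """Map (x, y) in the unit square to a point of R^m.

    Uses the slide's weights: xy for p00, x(1-y) for p01,
    (1-x)y for p10 and (1-x)(1-y) for p11.
    """
    weights = (x * y, x * (1 - y), (1 - x) * y, (1 - x) * (1 - y))
    points = (p00, p01, p10, p11)
    return [sum(w * p[i] for w, p in zip(weights, points))
            for i in range(len(p00))]


if __name__ == "__main__":
    # Hypothetical corner points in R^3, chosen only for illustration.
    corners = ([0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 1])
    print(skew_quad(*corners, 1, 1))      # corner p00
    print(skew_quad(*corners, 0, 0))      # corner p11
    print(skew_quad(*corners, 0.5, 0.5))  # average of the four corners
```

The four weights are non-negative and sum to 1 on the unit square, so every image point lies in the convex hull of the four corners.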

37

Inversion

[Figure: coordinate grid drawn on axes d1 and d2.]

Try to locate * = (d1, d2) on coordinate grid

38

Sequential bisection

[Figure sequence over slides 38 to 42: axes d1 and d2.]

43

… may fail to converge

[Figure: axes d1 and d2.]

44

Tempered sequential bisection

[Figure sequence over slides 44 to 46: axes d1 and d2, with one, then two, then three marked points (o).]

prove that this converges...