
Page 1:

Outline for Today’s Lecture

Objective for today:

• Finish the last lecture without preemption

• Real-time scheduling

• Beyond classic scheduling

• Multiprocessor

• Networks of workstations

• Dynamic voltage scaling

Administrative: ??

Page 2:

Preemptive FCFS: Round Robin

Preemptive timeslicing is one way to improve fairness of FCFS.

If a job does not block or exit, force an involuntary context switch after each quantum Q of CPU time.

The preempted job goes back to the tail of the ready list.

With infinitesimal Q, round robin is called processor sharing.

[Figure: Gantt charts comparing FCFS and round robin (quantum Q = 1, preemption overhead ε) for three jobs with D = 3, 2, 1. Completion times are roughly 3 + ε, 5, and 6, so R = (3 + 5 + 6 + ε)/3 = 4.67 + ε/3.]

In this case, R is unchanged by timeslicing. Is this always true?

Page 3:

Evaluating Round Robin

• Response time. RR reduces response time for short jobs.

For a given load, a job’s wait time is proportional to its D.

• Fairness. RR reduces variance in wait times.

But: RR forces jobs to wait for other jobs that arrived later.

• Throughput. RR imposes extra context switch overhead. CPU is only Q/(Q+ε) as fast as it was before.

Degrades to FCFS with large Q.

Example: D = 5 and D = 1 arriving together. FCFS: R = (5 + 6)/2 = 5.5. Round robin: R = (2 + 6 + ε)/2 = 4 + ε/2.

Q is typically 5-100 milliseconds; ε is 1-5 μs (1998 figures).
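To make the arithmetic above concrete, here is a minimal simulation sketch (not from the lecture; job lists and helper names are illustrative) that computes the average response time R under FCFS and under round robin for jobs that all arrive at time 0, with quantum Q and per-preemption overhead ε.

```python
from collections import deque

def fcfs_response(demands):
    """Average completion (response) time under FCFS, all jobs arrive at t = 0."""
    t, total = 0.0, 0.0
    for d in demands:
        t += d
        total += t
    return total / len(demands)

def rr_response(demands, Q=1.0, eps=0.0):
    """Average completion time under round robin with quantum Q and
    per-preemption overhead eps, all jobs arrive at t = 0."""
    queue = deque((i, d) for i, d in enumerate(demands))
    t, done = 0.0, {}
    while queue:
        i, rem = queue.popleft()
        run = min(Q, rem)
        t += run
        if rem - run > 0:
            t += eps                  # involuntary context switch
            queue.append((i, rem - run))
        else:
            done[i] = t               # job i completes at time t
    return sum(done.values()) / len(demands)

print(fcfs_response([3, 2, 1]))       # ~4.67 (example from the slide)
print(rr_response([3, 2, 1], Q=1))    # also ~4.67 when eps = 0
print(fcfs_response([5, 1]))          # 5.5
print(rr_response([5, 1], Q=1))       # 4.0: RR helps the short job
```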

Page 4:

Minimizing Response Time: SJF

Shortest Job First (SJF) is provably optimal if the goal is to minimize R.

Example: express lanes at the MegaMart

Idea: get short jobs out of the way quickly to minimize the number of jobs waiting while a long job runs.

Intuition: longest jobs do the least possible damage to the wait times of their competitors.

Example: jobs with D = 1, 2, 3 complete at times 1, 3, and 6, so R = (1 + 3 + 6)/3 = 3.33.

Page 5:

SJF

In preemptive case, shortest remaining time first.

In practice, we have to predict the CPU service times (computation time until next blocking).

Favors interactive jobs that need responsiveness and repeatedly interact with the user.

Favors jobs experiencing I/O bursts - soon to block, getting devices busy, getting out of the CPU’s way.

The focus is on an average performance measure; some long jobs may starve under heavy load or a constant arrival of new short jobs.

Page 6:

Behavior of SJF Scheduling

• With SJF, best-case R is not affected by the number of tasks in the system.

Shortest jobs budge to the front of the line.

• Worst-case R is unbounded, just like FCFS.

Since the queue is not “fair”, starvation exists - the longest jobs are repeatedly denied the CPU resource while other more recent jobs continue to be fed.

• SJF sacrifices fairness to lower average response time.

Page 7:

SJF in Practice

Pure SJF is impractical: scheduler cannot predict D values.

However, SJF has value in real systems:

• Many applications execute a sequence of short CPU bursts with I/O in between.

• E.g., interactive jobs block repeatedly to accept user input.

Goal: deliver the best response time to the user.

• E.g., jobs may go through periods of I/O-intensive activity.

Goal: request next I/O operation ASAP to keep devices busy and deliver the best overall throughput.

• Use adaptive internal priority to incorporate SJF into RR.

Weather report strategy: predict future D from the recent past.
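One standard way to realize the “weather report” idea is an exponential average of observed CPU bursts. The sketch below is a minimal illustration; the value of alpha and the initial guess are arbitrary choices, not values from the lecture.

```python
class BurstPredictor:
    """Predict the next CPU burst with an exponential average:
    prediction = alpha * last_observed + (1 - alpha) * old_prediction."""

    def __init__(self, alpha=0.5, initial_guess=10.0):
        self.alpha = alpha
        self.prediction = initial_guess     # used before any burst is observed

    def observe(self, burst_length):
        self.prediction = (self.alpha * burst_length
                           + (1 - self.alpha) * self.prediction)
        return self.prediction

# Example: a process whose bursts are shrinking (becoming interactive)
p = BurstPredictor()
for burst in [12, 8, 4, 2, 2]:
    print(round(p.observe(burst), 2))       # prediction tracks the recent past
```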

Page 8:

Considering I/O

In real systems, overall system performance is determined by the interactions of multiple service centers.

[Figure: jobs start at arrival rate λ, circulate between the CPU and an I/O device (I/O requests and completions), and exit with throughput λ until some center saturates.]

A queue network has K service centers. Each job makes Vk visits to center k, demanding service Sk.

Each job’s total demand at center k is Dk = Vk*Sk

Forced Flow Law: Uk = λk Sk = λ Dk

(Arrivals/throughputs λk at different centers are proportional.)

It is easy to predict Xk, Uk, λk, Rk and Nk at each center: use the Forced Flow Law to predict the arrival rate λk at each center k, then apply Little’s Law to center k. Then:

R = Σ Vk*Rk
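A small sketch of how these operational laws combine. The per-visit residence time Rk = Sk / (1 − Uk) assumes each center behaves like an M/M/1 queue, which is an assumption of this example rather than something stated on the slide; the visit counts and service times are made up.

```python
def network_metrics(lam, visits, service):
    """Per-center metrics for an open queueing network.
    lam: system arrival rate/throughput; visits[k] = Vk; service[k] = Sk."""
    results = []
    R_total = 0.0
    for V, S in zip(visits, service):
        lam_k = lam * V          # Forced Flow Law: per-center throughputs are proportional
        U_k = lam_k * S          # utilization Uk = lambda_k * Sk
        R_k = S / (1 - U_k)      # residence time per visit (M/M/1 assumption)
        N_k = lam_k * R_k        # Little's Law: Nk = lambda_k * Rk
        R_total += V * R_k       # R = sum over centers of Vk * Rk
        results.append((lam_k, U_k, R_k, N_k))
    return results, R_total

# Example: CPU visited 5 times (S = 0.2 each), disk visited 4 times (S = 0.1 each)
per_center, R = network_metrics(lam=0.5, visits=[5, 4], service=[0.2, 0.1])
print(R)    # total response time summed over all visits
```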

Page 9:

Digression: Bottlenecks

It is easy to see that the maximum throughput X of a system is reached as 1/λ approaches Dk for the service center k with the highest demand Dk.

k is called the bottleneck center.

Overall system throughput is limited by λk when Uk approaches 1.

Example 1: CPU demand S0 = 1, I/O demand S1 = 4. This job is I/O bound. How much will performance improve if we double the speed of the CPU? Is it worth it?

Example 2: CPU demand S0 = 4, I/O demand S1 = 4. Demands are evenly balanced. Will multiprogramming improve system throughput in this case?

To improve performance, always attack the bottleneck center!
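A quick check of Example 1 using the bottleneck bound X ≤ 1/max(Dk), as a worked illustration (the demands are the slide’s numbers): with D0 = 1 and D1 = 4, the I/O center caps throughput at 1/4, and doubling CPU speed only lowers D0 to 0.5, leaving the cap unchanged.

```python
def max_throughput(demands):
    """Upper bound on system throughput: X <= 1 / max_k(Dk)."""
    return 1.0 / max(demands)

print(max_throughput([1.0, 4.0]))   # 0.25 jobs per time unit
print(max_throughput([0.5, 4.0]))   # still 0.25: the CPU speedup does not help
print(max_throughput([4.0, 4.0]))   # Example 2: 0.25, both centers saturate together
```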

Page 10:

Two Schedules for CPU/Disk

1. Naive round robin: CPU busy 25/37 (U = 67%), disk busy 15/37 (U = 40%).

2. Round robin with SJF: CPU busy 25/25 (U = 100%), disk busy 15/25 (U = 60%).

About a 33% performance improvement.

Page 11:

Multilevel Feedback Queue

Many systems (e.g., Unix variants) implement priority and incorporate SJF by using a multilevel feedback queue.

• multilevel. Separate queue for each of N priority levels. Use RR on each queue; look at queue i-1 only if queue i is empty.

• feedback. Factor previous behavior into the new job priority.

[Figure: ready queues indexed by priority, from high (I/O-bound jobs waiting for CPU, jobs holding resources, jobs with high external priority) to low (CPU-bound jobs). GetNextToRun selects the job at the head of the highest-priority queue: constant time, no sorting. The priority of CPU-bound jobs decays with system load and service received.]
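A minimal sketch of a multilevel feedback queue in this spirit; the number of levels, the one-level demotion/promotion rules, and the class names are illustrative choices, not the lecture’s specification.

```python
from collections import deque

NUM_LEVELS = 4   # 0 = highest priority

class MLFQ:
    def __init__(self):
        self.queues = [deque() for _ in range(NUM_LEVELS)]

    def add(self, job, level=0):
        self.queues[level].append(job)

    def get_next_to_run(self):
        """Scan levels from highest priority down; no sorting needed."""
        for level, q in enumerate(self.queues):
            if q:
                return level, q.popleft()
        return None, None

    def job_used_full_quantum(self, job, level):
        # CPU-bound behavior: demote one level (priority decays)
        self.add(job, min(level + 1, NUM_LEVELS - 1))

    def job_blocked(self, job, level):
        # I/O-bound behavior: promote one level when it becomes ready again
        self.add(job, max(level - 1, 0))

mlfq = MLFQ()
mlfq.add("editor")
mlfq.add("compiler")
level, job = mlfq.get_next_to_run()       # "editor" runs first
mlfq.job_used_full_quantum(job, level)    # if it used its quantum, it is demoted
```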

Page 12:

Real Time Schedulers

Real-time schedulers must support regular, periodic execution of tasks (e.g., continuous media).

e.g. Microsoft’s Rialto scheduler [Jones97] supports an external interface for:

• CPU Reservations

“I need to execute for X out of every Y units.”

Scheduler exercises admission control at reservation time: application must handle failure of a reservation request.

• Time Constraints

“Run this before my deadline at time T.”
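The CPU reservation interface above suggests a natural admission-control test, sketched below under the simplifying assumption that reservations are accepted only while the summed CPU fractions stay at or below 1; Rialto’s actual algorithm is more involved and is not reproduced here.

```python
def admit(granted, new_request):
    """granted: list of (X, Y) reservations already accepted; new_request: (X, Y).
    Accept only if the total reserved CPU fraction would stay <= 1."""
    total = sum(x / y for x, y in granted) + new_request[0] / new_request[1]
    return total <= 1.0

granted = [(2, 10), (1, 5)]          # 20% + 20% already reserved
print(admit(granted, (5, 10)))       # True: 40% + 50% = 90%
print(admit(granted, (7, 10)))       # False: would exceed 100%; the app must handle failure
```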

Page 13:

Assumptions

Tasks are periodic with a constant interval Ti between requests (request rate 1/Ti)

Each task must be completed before the next request for it occurs

Tasks are independent

Run-time for each task is constant (max), Ci

Any non-periodic tasks are special

Page 14:

Task Model

[Figure: task model timeline: task 1 with period T1 and run time C1 = 1, task 2 with period T2 and run time C2 = 1.]

Page 15:

Definitions

Deadline is time of next request

Overflow at time t if t is deadline of unfulfilled request

Feasible schedule - for a given set of tasks, a scheduling algorithm produces a schedule such that no overflow ever occurs.

Critical instant for a task - time at which a request will have largest response time.

• Occurs when task is requested simultaneously with all tasks of higher priority

Page 16:

Rate Monotonic

Assign priorities to tasks according to their request rates, independent of run times

Optimal in the sense that no other fixed-priority assignment rule can schedule a task set which cannot be scheduled by rate monotonic.

If feasible (fixed) priority assignment exists for some task set, rate monotonic is feasible for that task set.
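A small sketch of rate-monotonic priority assignment together with the Liu and Layland sufficient utilization test U ≤ n(2^(1/n) − 1); passing the test guarantees feasibility, while failing it is inconclusive. The task set below is illustrative.

```python
def rate_monotonic_priorities(tasks):
    """tasks: list of (C, T). Shorter period (higher rate) => higher priority."""
    return sorted(tasks, key=lambda ct: ct[1])   # highest priority first

def rm_sufficient_test(tasks):
    n = len(tasks)
    utilization = sum(C / T for C, T in tasks)
    bound = n * (2 ** (1 / n) - 1)               # Liu & Layland bound
    return utilization, bound, utilization <= bound

tasks = [(1, 4), (1, 5), (2, 10)]                # (run time Ci, period Ti)
print(rate_monotonic_priorities(tasks))          # [(1, 4), (1, 5), (2, 10)]
U, bound, ok = rm_sufficient_test(tasks)
print(round(U, 3), round(bound, 3), ok)          # 0.65 <= 0.78: schedulable
```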

Page 17:

Task Model

[Figure: rate-monotonic schedule of the example task set (task 1: period T1, C1 = 1; task 2: period T2, C2 = 1); the shorter-period task has higher priority.]

Page 18:

Earliest Deadline First

Dynamic algorithm

Priorities are assigned to tasks according to the deadlines of their current request

With EDF there is no idle time prior to an overflow

For a given set of m tasks, EDF is feasible iff C1/T1 + C2/T2 + … + Cm/Tm ≤ 1

If a set of tasks can be scheduled by any algorithm, it can be scheduled by EDF
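The feasibility condition translates directly into code (same illustrative (C, T) task format as above). Note that EDF accepts any task set with total utilization up to 1, while the rate-monotonic sufficient bound is more conservative.

```python
def edf_feasible(tasks):
    """tasks: list of (C, T). Feasible under EDF iff sum(Ci/Ti) <= 1."""
    return sum(C / T for C, T in tasks) <= 1.0

print(edf_feasible([(1, 4), (1, 5), (2, 10)]))   # True  (U = 0.65)
print(edf_feasible([(2, 4), (2, 5), (1, 10)]))   # True  (U = 1.0, exactly full)
print(edf_feasible([(3, 4), (2, 5), (1, 10)]))   # False (U = 1.25)
```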

Page 19:

Task Model

[Figure: EDF schedule of the example task set (task 1: period T1, C1 = 1; task 2: period T2, C2 = 1); priorities follow the deadlines of the current requests.]

Page 20:

Proportional Share

Goals: to integrate real-time and non-real-time tasks, to police ill-behaved tasks, to give every process a well-defined share of the processor.

Each client, i, gets a weight wi

Instantaneous share: fi(t) = wi / Σj∈A(t) wj, where A(t) is the set of active clients at time t

Service time (fi constant in the interval): Si(t0, t1) = fi(t) · (t1 − t0)

The set of active clients varies, so fi varies over time: Si(t0, t1) = ∫ over [t0, t1] of fi(τ) dτ

Page 21:

Common Proportional Share Competitors

• Weighted Round Robin – RR with quantum times equal to share

[Figure: timelines comparing plain RR quanta with weighted RR quanta sized by share.]

• Fair Share – adjustments to priorities to reflect share allocation (compatible with multilevel feedback algorithms), e.g., Linux


Page 24:

Common Proportional Share Competitors

• Fair Queuing / Weighted Fair Queuing / Stride scheduling

• VT – Virtual Time: advances at a rate proportional to share, VTA(t) = WA(t) / SA (service received divided by share)

• VFT – Virtual Finishing Time: the VT a client would have after executing its next time quantum

• WFQ schedules the client with the smallest VFT

• The error EA never falls below −1

[Figure: three clients with shares 3, 2, and 1 have virtual times 2/3, 2/2, and 1/1 at time t; their VFTs are 3/3, 3/2, and 2/1.]
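A minimal sketch of WFQ selection by smallest virtual finishing time; the client names, shares, and unit quantum are illustrative.

```python
class WFQClient:
    def __init__(self, name, share):
        self.name = name
        self.share = share
        self.work = 0.0                     # service received so far, W_A(t)

    def virtual_time(self):
        return self.work / self.share       # VT_A(t) = W_A(t) / S_A

    def virtual_finish_time(self, quantum=1.0):
        return (self.work + quantum) / self.share   # VT after one more quantum

def wfq_pick(clients, quantum=1.0):
    """Schedule the client with the smallest virtual finishing time."""
    chosen = min(clients, key=lambda c: c.virtual_finish_time(quantum))
    chosen.work += quantum
    return chosen

clients = [WFQClient("A", 3), WFQClient("B", 2), WFQClient("C", 1)]
order = [wfq_pick(clients).name for _ in range(6)]
print(order)   # over 6 quanta, A:B:C receive service in ratio 3:2:1
```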

Page 25:

Lottery Scheduling

Lottery scheduling [Waldspurger96] is another scheduling technique.

Elegant approach to periodic execution, priority, and proportional resource allocation.

• Give Wp “lottery tickets” to each process p.

• GetNextToRun selects “winning ticket” randomly.

If ΣWp = N, then each process gets CPU share Wp/N, probabilistically, and over a sufficiently long time interval.

• Flexible: tickets are transferable to allow application-level adjustment of CPU shares.

• Simple, clean, fast. Random choices are often a simple and efficient way to produce the desired overall behavior (probabilistically).

Page 26:

Example List-based Lottery

Example: clients hold 10, 2, 5, 1, and 2 tickets, so T = 20.

Draw Random(0, 19) = 15. Summing along the list gives running totals 10, 12, 17, …; the first client whose running total exceeds the draw (here the third client, with 5 tickets) wins.
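A minimal sketch of the list-based lottery draw described above; the client representation and the simulation loop are illustrative.

```python
import random

def lottery_pick(ticket_counts):
    """Draw a number in [0, T) and walk the list, summing ticket counts
    until the running total exceeds the draw."""
    total = sum(ticket_counts)
    draw = random.randrange(total)        # e.g., Random(0, 19) = 15 on the slide
    running = 0
    for i, tickets in enumerate(ticket_counts):
        running += tickets
        if draw < running:
            return i

counts = [10, 2, 5, 1, 2]                 # T = 20
wins = [0] * len(counts)
for _ in range(20000):
    wins[lottery_pick(counts)] += 1
print(wins)   # roughly proportional to 10:2:5:1:2
```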

Page 27:

Linux Scheduling

Page 28:

Linux Scheduling Policy

Runnable process with highest priority and timeslice remaining runs (SCHED_OTHER policy)

• Dynamically calculated priority

Starts with nice value

Bonus or penalty reflecting whether the process is I/O-bound or compute-bound, by tracking sleep time vs. runnable time: sleep_avg is built up while sleeping and decremented by the timer tick while running

Page 29:

Linux Scheduling Policy

• Dynamically calculated timeslice

The higher the dynamic priority, the longer the timeslice:

• Recalculated every round when “expired” and “active” swap

• Exception for expired interactive tasks: they go back on the active array unless there are starving expired tasks

[Figure: timeslice scale from 10 ms (low priority, less interactive) through 150 ms up to 300 ms (high priority, more interactive).]
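The sketch below mimics the flavor of this calculation. The constants and formulas are simplified illustrations in the spirit of the 2.6-era O(1) scheduler, not the kernel’s actual code.

```python
MAX_SLEEP_AVG = 1000     # illustrative cap on tracked sleep credit (ms)
MAX_BONUS = 10           # dynamic priority can move a few levels either way

def dynamic_priority(static_prio, sleep_avg):
    """Nice-derived static priority adjusted by an interactivity bonus."""
    bonus = int(sleep_avg * MAX_BONUS / MAX_SLEEP_AVG)          # 0..10
    # clamp to the normal (non-realtime) priority range
    return max(100, min(139, static_prio - bonus + MAX_BONUS // 2))

def timeslice_ms(static_prio):
    """Higher (numerically lower) priority gets a longer timeslice, 10..300 ms."""
    span = 139 - 100
    frac = (139 - static_prio) / span
    return int(10 + frac * (300 - 10))

print(dynamic_priority(120, sleep_avg=900))   # interactive task: boosted priority
print(timeslice_ms(110), timeslice_ms(135))   # longer vs. shorter timeslice
```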

Page 30:

Linux task_struct

Process descriptor in kernel memory represents a process (allocated on process creation, deallocated on termination).

• Linux: task_struct, located via the task pointer in the thread_info structure on the process’s kernel stack.

[Figure: task_struct fields include state, prio, policy, static_prio, sleep_avg, time_slice, …]

Page 31:

Runqueue for O(1) Scheduler

[Figure: each runqueue holds two priority arrays, “active” and “expired”, each an array of per-priority queues. Higher-priority, more I/O-bound tasks get timeslices up to 300 ms; lower-priority, more CPU-bound tasks get as little as 10 ms.]


Page 34:

Linux Real-time

No guarantees

SCHED_FIFO

• Static priority, effectively higher than SCHED_OTHER processes*

• No timeslice – it runs until it blocks or yields voluntarily

• RR within same priority level

SCHED_RR

• As above but with a timeslice.

* Although their priority number ranges overlap

Page 35:

Beyond “Ordinary” Uniprocessors

Multiprocessors

• Co-scheduling and gang scheduling

• Hungry puppy task scheduling

• Load balancing

Networks of Workstations

• Harvesting Idle Resources - remote execution and process migration

Laptops and mobile computers

• Power management to extend battery life, scaling processor speed/voltage to tasks at hand, sleep and idle modes.

Page 36:

Multiprocessor Scheduling

What makes the problem different?

Workload consists of parallel programs

• Multiple processes or threads, synchronized and communicating

• Latency is defined by the last piece to finish.

Time-sharing and/or Space-sharing (partitioning up the Mp nodes)

• Both when and where a process should run

Page 37:

Architectures

[Figure: architecture options: a symmetric multiprocessor (processors with caches sharing one memory), a NUMA machine (nodes 0-3, each with a processor, cache, memory, and communication assist, connected by an interconnect), and a cluster.]

Page 38:

Affinity Scheduling

Where (on which node) to run a particular thread during the next time slice?

Processor’s POV: favor processes which have some residual state locally (e.g. cache)

What is a useful measure of affinity for deciding this?

• Least intervening time or intervening activity (number of processes run here since “my” last time)

• Same place as last time “I” ran

• Possible negative effect on load balance

Page 39:

Linux Support for SMP

Every processor has its own private runqueue

Locking – spinlock protects runqueue

Load balancing – pulls tasks from busiest runqueue into mine.

Affinity – the cpus_allowed bitmask constrains a process to a particular set of processors.

load_balance runs from schedule() when the runqueue is empty, or periodically (especially during idle). It prefers to pull processes that are expired, not cache-hot, high priority, and allowed by affinity.

[Figure: symmetric multiprocessor: processors with private caches sharing memory.]

Page 40:

Processor Partitioning

Static or Dynamic

Process Control (Gupta)

• Vary number of processors available

• Match number of processes to processors

• Adjusts # at runtime.

• Works with task-queue or threads programming model

• Suspend and resume are responsibility of runtime package of application

• Impact on “working set”

Page 41:

Process Control Claims

[Figure: typical speed-up profile: speedup vs. number of processes per application rises while parallelism and the working set fit in memory, peaks at a “magic point”, then falls off due to lock contention, memory contention, context switching, and cache corruption.]

Page 42:

Co-Scheduling

John Ousterhout (Medusa OS)

Time-sharing model

Schedule related threads simultaneously. Why?

• Common state and coordination

How?

• Local scheduling decisions after some global initialization (Medusa)

• Centralized (SGI IRIX)

Page 43:

Effect of Workload

Impact of communication and cooperation

Issues: (−) context switches, (+) common state, (−) lock contention, (+) coordination

Page 44:

CM*’s Version

Matrix S (slices) x P (processors)

Allocate a new set of processes (task force) to a row with enough empty slots

Schedule: Round robin through rows of matrix

• If, during a time slice, this processor’s element is empty or not ready, run some other task force’s entry in this column, scanning backward in time (for affinity reasons, and as a purely local “fall-back” decision)

Page 45:

Networks of Workstations

What makes the problem different? Exploiting otherwise “idle” cycles.

A notion of ownership is associated with each workstation.

Global truth is harder to come by in a wide-area context.

Page 46:


Harvesting Idle Cycles

Remote execution on an idle processor in a NOW (network of workstations)

• Finding the idle machine and starting execution there. Related to load-balancing work.

Vacating the remote workstation when its user returns and it is no longer idle

• Process migration

Page 47:

Issues

Why?

Which tasks are candidates for remote execution?

Where to find processing cycles? What does “idle” mean?

When should a task be moved?

How?

Page 48:

Motivation for Cycle Sharing

Load imbalances. Parallel program completion time is determined by the slowest thread, limiting speedup.

Utilization. In the trend from shared mainframes (scheduled cycles) to networks of workstations (statically allocated cycles):

• “Ownership” model

• Heterogeneity

Page 49:

Which Tasks?

Explicit submission to a “batch” scheduler (e.g., Condor), or transparent to the user.

Should be demanding enough to justify overhead of moving elsewhere. Properties?

Proximity of resources.

• Example: move query processing to site of database records.

• Cache affinity

Page 50:

Finding Destination

Defining “idle” workstations

• Keyboard/mouse events? CPU load?

How timely and complete is the load information (given message transit times)?

• Global view maintained by some central manager with local daemons reporting status.

• Limited negotiation with a few peers

• How binding is any offer of free cycles?

Task requirements must match machine capabilities

Page 51:

When to Move

At task invocation. Process is created and run at chosen destination.

Process migration, once task is already running at some node. State must move.

• For adjusting load balance (generally not done)

• On arrival of workstation’s owner (vacate, when no longer idle)

Page 52:

How - Negotiation Phase

Condor example: Central manager with each machine reporting status, properties (e.g. architecture, OS). Regular match of submitted tasks against available resources.

Decentralized example: select peer and ask if load is below threshold. If agreement to accept work, send task. Otherwise keep asking around (until probe limit reached).

Page 53:

How - Execution Phase

Issue - Execution environment.

• File access - possibly without the user having an account on the destination machine or a network file system to provide access to the user’s files.

• UIDs?

Remote System Calls (Condor)

• On original (submitting) machine, run a “shadow” process (runs as user)

• All system calls made by the task at the remote site are “caught” and a message is sent to the shadow.

Page 54:

Remote System Calls

[Figure: on the submitting machine, the shadow process (running as the user) contains remote syscall code and talks to the local OS kernel; on the executing machine, the remote job’s user code is linked with remote syscall stubs in place of the regular syscall stubs, forwarding each system call to the shadow.]

Page 55:

How - Process Migration

Checkpointing current execution state (both for recovery and for migration)

• Generic representation for heterogeneity?

• Condor has a checkpoint file containing register state, memory image, open file descriptors, etc. The checkpoint can be returned to the Condor job queue.

• Mach - package up processor state, let memory working set be demand paged into new site.

• Messages in-flight?

Page 56:

Applying Scheduling to Power Management of the CPU

Page 57:

Dynamic Voltage Scaling

The CPU can run at different clock frequencies/voltages:

• Voltage-scalable processors

• StrongARM SA-2 (500 mW at 600 MHz; 40 mW at 150 MHz)

• Intel XScale

• AMD Mobile K6 Plus

• Transmeta

• Power is proportional to V² × F

• Energy will be affected (+) by lower power and (−) by increased time
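A back-of-the-envelope illustration of why scaling helps, under the common simplifying assumption (not spelled out on the slide) that V scales roughly with F: power then scales like F³, execution time like 1/F, so energy for a fixed workload scales like F².

```python
def relative_energy(freq_scale):
    """Energy to finish a fixed workload, relative to full speed,
    assuming V ~ F: power ~ F^3, time ~ 1/F, so energy ~ F^2."""
    power = freq_scale ** 3
    time = 1.0 / freq_scale
    return power * time            # == freq_scale ** 2

print(relative_energy(1.0))        # 1.0  (full speed)
print(relative_energy(0.5))        # 0.25 (half speed: 4x less energy, 2x the time)
```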

Page 58:

Dynamic Voltage Scheduling

Questions addressed by the scheduler:

• Which process to run

• When to run it

• How long to run it for

• How fast to run the CPU while it runs

Intuitive goal - fill “soft idle” times with slow computation

Page 59:

Background Work in DVS

• Interval scheduling

• Based on observed processor utilization

• “general purpose” -- no deadlines assumed by the system

• Predicting patterns of behavior to squeeze out idle times.

• Worst-case real-time schedulers (Earliest Deadline First)

• Stretch the work to smoothly fill the period without missing deadlines (without inordinate transitioning).

Page 60:

Interval Scheduling (adjust the clock based on a past window; no process reordering involved)

Weiser et al.

• Algorithms (when):

• Past

• AVGN

• Stepping (how much)

• One

• Double

• Peg – min or max

• Based on unfinished work during previous interval

[Figure: CPU load and clock speed over time.]

Page 61:

Implementation of Interval Scheduling Algorithms

Issues:

• Capturing utilization measure

• Start with no a priori information about applications and need to dynamically infer / predict behavior (patterns / “deadlines” / constraints?)

• Idle process or “real” process – usually each quantum is either 100% idle or busy

• AVGN: weighted utilization at time t: Wt = (N · Wt−1 + Ut−1) / (N + 1)

• Inelastic performance constraints – don’t want to allow user to see any performance degradation
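A minimal sketch of the AVGN update driving a simple speed-setting rule; the thresholds, step sizes, and initial state are illustrative, not Weiser et al.’s exact parameters.

```python
class IntervalGovernor:
    def __init__(self, N=3, speed=1.0):
        self.N = N
        self.W = 1.0            # weighted utilization estimate; start assuming a busy CPU
        self.speed = speed      # current relative clock speed in (0, 1]

    def end_of_interval(self, utilization):
        # AVGN: W_t = (N * W_{t-1} + U_{t-1}) / (N + 1)
        self.W = (self.N * self.W + utilization) / (self.N + 1)
        # Simple stepping: speed up if mostly busy, slow down if mostly idle
        if self.W > 0.9:
            self.speed = min(1.0, self.speed + 0.2)
        elif self.W < 0.5:
            self.speed = max(0.2, self.speed - 0.2)
        return self.speed

g = IntervalGovernor()
for u in [1.0, 1.0, 0.2, 0.1, 0.1]:       # per-interval utilization samples
    print(round(g.end_of_interval(u), 2)) # note how slowly the speed ramps down
```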

Page 62:

Results

• It is hard to find any discernible patterns in “real” applications.

• Better at larger time scales (corresponding to larger windows in AVGN), but then the system becomes unresponsive.

• Poor coupling between adaptive decisions of applications themselves and system decision-making (example: MPEG player that can either block or spin)

• NEED application-supplied information

• Simple averaging shows asymmetric behavior – the clock rate drops faster than it ramps up.

• AVGN may not stabilize on the “right” clock speed – oscillations.

Page 63:

Earliest Deadline First DVS

[Figure: EDF schedule of the example task set (task 1: period T1, C1 = 1; task 2: period T2, C2 = 1), with execution stretched to fill the available time at reduced speed.]


Page 65:

EDF-based DVS Algorithm

Sort tasks in EDF order.

Invoked when a thread is added or removed, or when a deadline is reached.

Includes non-runnable threads in the scheduling decision.

speed = MAX over i ≤ n of [ ( Σj≤i workj ) / (deadlinei − current time) ]

Exponential moving average (used to estimate the work values).
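A sketch of the speed computation above; the task representation is illustrative. For each prefix of the EDF-ordered task list, divide the cumulative remaining work by the time until that task’s deadline, and run at the maximum such ratio (clamped to full speed).

```python
def dvs_speed(tasks, current_time):
    """tasks: list of (remaining_work, deadline), sorted by earliest deadline.
    Returns the relative CPU speed in (0, 1] needed to meet all deadlines."""
    speed = 0.0
    cumulative_work = 0.0
    for work, deadline in tasks:
        cumulative_work += work
        slack = deadline - current_time
        if slack <= 0:
            return 1.0                       # already late: run at full speed
        speed = max(speed, cumulative_work / slack)
    return min(1.0, speed)

# Two tasks in EDF order: 2 units due at t=10, 3 more units due at t=20 (t=0 now)
print(dvs_speed([(2, 10), (3, 20)], current_time=0))   # 0.25: quarter speed suffices
```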