


Energy-Aware Resource Adaptation in Tessellation OS

3. Space-time Partitioning and Two-level Scheduling

David Chou, Gage Eads
Par Lab, CS Division, UC Berkeley

• A Spatial Partition receives a vector of basic resources
  – A number of hardware threads, a portion of physical memory, a portion of shared cache memory, and a fraction of memory bandwidth

• Spatial partitioning may vary over time
  – Partitions can be time multiplexed; resources are gang-scheduled
  – Partitioning adapts to the needs of the system
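The resource vector above can be pictured as a small record plus a feasibility check. This is an illustrative sketch only; `ResourceVector` and its fields are invented for the example and are not Tessellation's actual data structures.

```python
from dataclasses import dataclass

@dataclass
class ResourceVector:
    """A spatial partition's allocation, as described above (illustrative)."""
    hw_threads: int          # number of hardware threads
    phys_mem_mb: int         # portion of physical memory
    cache_mb: float          # portion of shared cache memory
    mem_bw_fraction: float   # fraction of memory bandwidth, 0.0-1.0

    def fits_within(self, total: "ResourceVector") -> bool:
        """Can this partition be mapped onto the given machine?"""
        return (self.hw_threads <= total.hw_threads
                and self.phys_mem_mb <= total.phys_mem_mb
                and self.cache_mb <= total.cache_mb
                and self.mem_bw_fraction <= total.mem_bw_fraction)

machine = ResourceVector(32, 65536, 20.0, 1.0)
cell_a = ResourceVector(16, 8192, 4.0, 0.25)
print(cell_a.fits_within(machine))  # True
```

Time multiplexing then amounts to gang-scheduling several such vectors onto the same hardware in different time slices.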

Parallel Computing Laboratory

2. Basic Goals in Tessellation OS

• Scheduling at Level 1: Coarse-grained resource allocation and distribution at the cell level

• Scheduling at Level 2: Fine-grained application-specific scheduling within a cell
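The two levels can be sketched as a toy simulation: level 1 time-multiplexes whole cells, while each cell carries its own application-specific level-2 policy. All names here (`Cell`, `level1_schedule`, the round-robin and per-cell policies) are invented for illustration, not Tessellation code.

```python
class Cell:
    def __init__(self, name, tasks, policy):
        self.name = name
        self.tasks = list(tasks)
        self.policy = policy  # 2nd-level scheduler: picks the next task

    def run_quantum(self):
        # Level 2: the cell's own scheduler decides what runs inside it.
        return self.policy(self.tasks)

def level1_schedule(cells, quanta):
    """Level 1: coarse-grained time multiplexing at the cell level."""
    trace = []
    for i in range(quanta):
        cell = cells[i % len(cells)]   # simple round-robin across cells
        trace.append((cell.name, cell.run_quantum()))
    return trace

fifo = lambda tasks: tasks[0]          # one application's 2nd-level policy
lifo = lambda tasks: tasks[-1]         # another application's policy
cells = [Cell("A", ["a1", "a2"], fifo), Cell("B", ["b1", "b2"], lifo)]
print(level1_schedule(cells, 4))
# [('A', 'a1'), ('B', 'b2'), ('A', 'a1'), ('B', 'b2')]
```

The point of the split is that the kernel never needs to understand any application's internal policy; it only hands each cell its resources.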

5. The Policy Service

Research supported by Microsoft (Award #024263) and Intel (Award #024894) funding, and by matching funding from U.C. Discovery (Award #DIG0710227). We would like to thank the other members of the Par Lab OS Group, particularly Sarah Bird, Juan Colmenares, Steven Hofmeyr, and Burton Smith, for their contributions to this work, as well as the Akaros team.

7. Results

1. Motivation

• The Cell: our partitioning abstraction
  – A user-level software container with guaranteed access to resources

• Basic properties of a cell
  – Full control over resources it owns when mapped to hardware
  – One or more address spaces
  – Communication channels

[Figure: cells A and B partitioned across space and time. Each cell contains a 2nd-level scheduler, 2nd-level memory management, one or more address spaces, tasks, and communication channels between cells.]

[Figure: a partial and simplified view of the architecture. In user space, the Policy Service produces a Space-Time Resource Graph (STRG) — "the plan," listing each cell's ID, resource counts, and time parameters. The Partition Mapping and Multiplexing Layer, containing the Resource Mapper (& STRG Validator) and the Resource Multiplexer, sits above the Partition Mechanism Layer.]

• Policy Service
  – Distributes resources among cells
  – Establishes how cells should be time multiplexed

• Resource Mapper (& STRG Validator)
  – Assigns specific resources to cells
  – Produces only feasible mappings; rejects invalid and infeasible STRGs

• Resource Multiplexer
  – Determines when cells should be activated and suspended
  – Actually activates and suspends cells

Separation between Mapper and Multiplexer:
• Decision-making process
  – Mapping of multiple resources
  – Centralized, because it requires global knowledge
  – Often expensive
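The Mapper/Multiplexer split can be illustrated with a toy plan: one function validates an STRG before it is ever executed, and a separate, cheap function decides which cells are active at a given instant. The dictionary fields and function names are invented for this sketch and are not Tessellation's actual STRG format.

```python
def validate_strg(strg, total_threads):
    """Mapper side: accept only feasible, well-formed plans."""
    for cell in strg:
        if cell["threads"] <= 0 or cell["period_ms"] <= 0:
            return False                       # invalid entry
        if cell["threads"] > total_threads:
            return False                       # infeasible on this machine
    return True

def multiplex(strg, now_ms):
    """Multiplexer side: which cells are active at time now_ms?"""
    active = []
    for cell in strg:
        # Each cell runs for `active_ms` out of every `period_ms`.
        if now_ms % cell["period_ms"] < cell["active_ms"]:
            active.append(cell["id"])
    return active

plan = [{"id": "A", "threads": 16, "period_ms": 10, "active_ms": 5},
        {"id": "B", "threads": 8,  "period_ms": 10, "active_ms": 10}]
assert validate_strg(plan, total_threads=32)
print(multiplex(plan, now_ms=7))   # ['B']  (cell A is in its off phase)
```

Because validation happens once per plan while multiplexing happens continuously, the expensive global decision is kept out of the fast path.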

4. Resource Allocation Architecture

[Figure labels: the Partition Mapping and Multiplexing Layer and Partition Mechanism Layer run in the Tessellation Kernel; the Policy Service runs in User Space.]

• A Partition may also receive
  – Exclusive access to other resources (e.g., a hardware device and a raw storage partition)
  – Guaranteed fractional services from other partitions (e.g., network service)

• Support a simultaneous mix of high-throughput parallel, interactive, and real-time applications
• Allow applications to consistently deliver performance

• Energy-limited devices execute a mix of interactive, high-quality multimedia, and parallel throughput applications that compete for computing resources
• We want to automatically find low-power, high-performance resource configurations while providing dynamic adaptation to changing power environments

• Execution
  – Relatively simple and fast, if the set of cells does not change
  – Allows us to explore decentralized approaches

[Figure: system penalty as a function of the resource allocations to applications 1 and 2.]

Continuously minimize the penalty of the system, subject to restrictions on the total amount of resources.
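The minimization above can be sketched with a simple greedy allocator: hand out one resource unit at a time to whichever application's penalty drops the most. This is an illustrative stand-in for PACORA's actual solver; with convex, decreasing penalty curves this greedy rule reaches the optimum. All names and the example curves are invented.

```python
def minimize_penalty(penalties, total_units):
    """penalties[i](a) -> penalty of app i when given `a` resource units.
    Greedily allocate `total_units` units to minimize the total penalty."""
    alloc = [0] * len(penalties)
    for _ in range(total_units):
        # Marginal benefit of one more unit for each application.
        gains = [penalties[i](alloc[i]) - penalties[i](alloc[i] + 1)
                 for i in range(len(penalties))]
        best = max(range(len(gains)), key=gains.__getitem__)
        alloc[best] += 1
    return alloc

# Two convex penalty curves: app 0 values resources more steeply than app 1.
app0 = lambda a: (8 - a) ** 2 if a < 8 else 0
app1 = lambda a: (4 - a) ** 2 if a < 4 else 0
print(minimize_penalty([app0, app1], total_units=8))  # [6, 2]
```

The resource cap (`total_units`) plays the role of the "restrictions on the total amount of resources" above.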

PACORA Convex Construction

[Figure: penalty plotted against runtime for one application. The penalty begins to rise once the runtime exceeds the service requirement d, with the slope encoding the application's importance.]

• Penalty functions: the value of an application to the system
• Runtime functions: the value of resources to applications

Managing Energy with the Policy Service
• Application 0 can be used to represent the idle resources in the system
  – Penalty_0 is defined to be the total system power
  – Penalty_0 has a slope that depends on the battery charge
  – Assume all idle resources are powered off
• Alternatively, make the penalty a function of the power-delay product

[Figure: Penalty_0 vs. total power. As the battery depletes, the OS may choose to increase the slope of Penalty_0 to reflect the increased value of saving power.]

Two functions represent each application p:
• a penalty function (provided by the user)
• a runtime function, Runtime_p(a_{p,1}, …, a_{p,n}), generated by the system, where a_{p,i} is the allocation of resource i to application p; for application 0, the "runtime" is the total power drawn, Power_0(a_{0,1}, …, a_{0,n})
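The battery-dependent slope of Penalty_0 can be modeled in a few lines. This is an illustrative model only; the slope formula and function names are assumptions, not PACORA's actual definitions.

```python
def power_penalty(total_power_w, battery_frac):
    """Penalty_0: linear in total system power, with a slope that rises
    as the battery depletes (battery_frac = 1.0 full, 0.0 empty)."""
    slope = 1.0 / max(battery_frac, 0.05)   # steeper when nearly empty
    return slope * total_power_w

# The same power draw is penalized more heavily on a low battery:
print(power_penalty(150.0, battery_frac=1.0))   # 150.0
print(power_penalty(150.0, battery_frac=0.25))  # 600.0
```

Raising the slope shifts the system-wide optimum toward allocations that leave more resources with application 0, i.e., powered off.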

6. Energy Control Mechanisms

Energy Control Knobs
• We have 3 main control mechanisms for manipulating the power usage of our evaluation system, a dual-socket Sandy Bridge server with 32 hardware threads
• Power is measured with a LoCal FitPC
• We also use the Intel on-chip energy counters to precisely calculate the power used by the cores and DRAM


1. Cores
• We can allocate and de-allocate hardware threads
• We utilize core idling (C1 state)
  – 0 cores idling: 170 W
  – 29 cores idling: 150 W

2. Dynamic Voltage/Frequency Scaling
• We have 14 different frequency/voltage states
• At 100% frequency on our evaluation machine, wall power measured 250 W
• At 50% frequency, wall power measured 150 W

3. Time Multiplexing
• We choose when, and for how long, each cell executes
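On a stock Linux machine, knobs like cores and DVFS are reachable through the standard cpufreq and CPU-hotplug sysfs interfaces. The helpers below only build the (path, value) pairs; the paths are the usual Linux sysfs locations, while the function names are invented for this sketch, and actually writing the files requires root and a governor that accepts manual speeds.

```python
def cpufreq_request(cpu, khz):
    """Return the sysfs (path, value) pair to pin `cpu` at `khz`."""
    path = f"/sys/devices/system/cpu/cpu{cpu}/cpufreq/scaling_setspeed"
    return path, str(khz)

def core_online_request(cpu, online):
    """Return the sysfs (path, value) pair to wake or idle a core."""
    return f"/sys/devices/system/cpu/cpu{cpu}/online", "1" if online else "0"

path, value = cpufreq_request(3, 1_600_000)
print(path)   # /sys/devices/system/cpu/cpu3/cpufreq/scaling_setspeed
# To apply: open(path, "w").write(value)  -- needs root privileges.
```

Tessellation drives the hardware directly rather than through sysfs, but the underlying knobs (P-states for DVFS, C1 idling per core) are the same.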

Initial Adaptation Experiment:
• We run two instances of swaptions with initial resources of 16 cores, 16% CPU utilization, and maximum CPU frequency. At 25 seconds, we reduce the frequency to 70% of maximum and the core count to 15, as a proof of concept of power control.
• Power data forthcoming…

Future Experiments:
• Complete and run both energy control mechanisms with different applications
• Measure system power and determine our distance from optimal by exploring the resource search space offline