
Managing Energy and Server Resources in Hosting Centers

Jeff Chase, Darrell Anderson, Ron Doyle, Prachi Thakar, Amin Vahdat

Duke University

Back to the Future

Return to server-centered computing: applications run as services accessed through the Internet.
– Web-based services, ASPs, “netsourcing”

Internet services are hosted on server clusters.
– Incrementally scalable, etc.

Server clusters may be managed by a third party.
– Shared data center or hosting center
– Hosting utility offers economies of scale:
• Network access
• Power and cooling
• Administration and security
• Surge capacity

Managing Energy and Server Resources

Key idea: a hosting center OS maintains the balance of requests and responses, energy inputs, and thermal outputs.

[Figure: a hosting center with inputs (requests, energy) and outputs (responses, waste heat).]

1. Adaptively provision server resources to match request load.

2. Provision server resources for energy efficiency.

3. Degrade service on power/cooling failures.

– Power/cooling “browndown”
– Dynamic thermal management [Brooks]

US in 2003: 22 TWh ($1B - $2B+)

Contributions

Architecture/prototype for adaptive provisioning of server resources in Internet server clusters (Muse)
– Software feedback
– Reconfigurable request redirection
– Addresses a key challenge for hosting automation

Foundation for energy management in hosting centers
– 25% - 75% energy savings
– Degrade rationally (“gracefully”) under constraint (e.g., browndown)

Simple “economic” resource allocation
– Continuous utility functions: customers “pay” for performance.
– Balance service quality and resource usage.

Static Provisioning

Dedicate fixed resources per customer
Typical of “co-lo” or dedicated hosting
Reprovision manually as needed
Overprovision for surges
– High variable cost of capacity

How to automate resource provisioning for managed hosting?

Load Is Dynamic

ibm.com external site
• February 2001
• Daily fluctuations (3x)
• Workday cycle
• Weekends off

World Cup soccer site
• May-June 1998
• Seasonal fluctuations
• Event surges (11x)
• ita.ee.lbl.gov

[Figures: throughput (requests/s) vs. time (one week, M through S), and throughput (requests/s) vs. time (two months, weeks 6-8 marked).]

Adaptive Provisioning

- Efficient resource usage
- Load multiplexing
- Surge protection
- Online capacity planning
- Dynamic resource recruitment
- Balance service quality with cost
- Service Level Agreements (SLAs)

Utilization Targets

$\mu_i$ = allocated server resource for service i

$\rho_i$ = utilization of $\mu_i$ at i’s current load $\lambda_i$

$\rho_{target}$ = configurable target level for $\rho_i$. Leave headroom for load spikes.

$\rho_i > \rho_{target}$: service i is underprovisioned

$\rho_i < \rho_{target}$: service i is overprovisioned
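The comparison against the utilization target can be sketched in a few lines of Python. This is an illustration only: the function name `provisioning_state` and the optional `slack` dead band are assumptions, not part of the slides.

```python
def provisioning_state(rho, rho_target, slack=0.0):
    """Classify a service by comparing its utilization to the target.

    rho        -- measured utilization of the service's allotment (0..1)
    rho_target -- configured target utilization (0..1)
    slack      -- hypothetical dead band around the target (assumption)
    """
    if rho > rho_target + slack:
        return "underprovisioned"   # needs more server resource
    if rho < rho_target - slack:
        return "overprovisioned"    # surplus resource can be reclaimed
    return "on target"


if __name__ == "__main__":
    for rho in (0.3, 0.5, 0.8):
        print(rho, provisioning_state(rho, rho_target=0.5))
```

A lower target leaves more headroom for load spikes at the cost of more allocated resource.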

Muse Architecture

[Figure: architecture diagram. Labels: offered request load, reconfigurable switches, server pool (stateless, interchangeable), storage tier, Executive, performance measures, configuration commands, control.]

Executive controls mapping of service traffic to server resources by means of:
• reconfigurable switches
• scheduler controls (shares)

Server Power Draw

State        Power
CPU idle     93 W
CPU max      120 W
boot         136 W
disk spin    6-10 W
off/hiber    2-3 W

866 MHz P-III SuperMicro 370-DER (FreeBSD)
Brand Electronics 21-1850 digital power meter

[Figure: watts vs. work.]

Idling consumes 60% to 70% of peak power demand.
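To see why idle power matters, the measured figures can be plugged into a rough cluster power estimate. The sketch below uses the 93 W idle and 120 W CPU-max readings from the slide. The linear interpolation between idle and max, the 3 W hibernation draw (the top of the 2-3 W range), and the function name `cluster_power_watts` are assumptions made for illustration.

```python
IDLE_W = 93.0       # CPU idle, from the slide
MAX_W = 120.0       # CPU max, from the slide
HIBERNATE_W = 3.0   # upper end of the 2-3 W off/hibernate range


def cluster_power_watts(active, total, utilization):
    """Estimate cluster power draw in watts.

    Assumes (hypothetically) that an active server's draw rises linearly
    from IDLE_W to MAX_W with CPU utilization, and that every inactive
    server is hibernating.
    """
    per_active = IDLE_W + (MAX_W - IDLE_W) * utilization
    return active * per_active + (total - active) * HIBERNATE_W


if __name__ == "__main__":
    # Same total work: four servers at 25% vs. two servers at 50%.
    spread = cluster_power_watts(active=4, total=4, utilization=0.25)
    packed = cluster_power_watts(active=2, total=4, utilization=0.50)
    print(spread, packed)
```

Under these assumptions, concentrating the same work on fewer servers lowers total draw because each powered-on server pays the idle cost regardless of load.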

Energy vs. Service Quality

Active set = {A,B,C,D}: $\rho_i < \rho_{target}$
• Low latency

Active set = {A,B}: $\rho_i = \rho_{target}$
• Meets quality goals
• Saves energy

[Figure: four servers A, B, C, D, shown with all four active and with only A and B active.]

Energy-Conscious Provisioning

Light load: concentrate traffic on a minimal set of servers.
– Step down surplus servers to a low-power state.
• APM and ACPI
– Activate surplus servers on demand.
• Wake-On-LAN

Browndown: can provision for a specified energy target.
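A minimal sketch of sizing the active set, assuming homogeneous servers with a known per-server capacity. The function name, the capacity figures, and the use of the 120 W CPU-max reading as a per-server power bound are assumptions for illustration; the slides do not specify how Muse computes this.

```python
import math


def active_set_size(load_rps, capacity_rps, rho_target, pool_size,
                    power_budget_w=None, watts_per_server=120.0):
    """Smallest number of servers that keeps utilization at or below target.

    load_rps        -- offered load in requests/s
    capacity_rps    -- hypothetical per-server capacity at 100% utilization
    rho_target      -- target utilization (0..1]
    pool_size       -- servers available in the pool
    power_budget_w  -- optional energy target ("browndown"), in watts
    """
    needed = math.ceil(load_rps / (capacity_rps * rho_target))
    size = min(pool_size, max(1, needed))   # keep at least one server awake
    if power_budget_w is not None:
        # Under browndown the energy target caps the active set,
        # even if that means running above the utilization target.
        size = min(size, int(power_budget_w // watts_per_server))
    return size


if __name__ == "__main__":
    print(active_set_size(1000, 500, 0.5, pool_size=8))
    print(active_set_size(1000, 500, 0.5, pool_size=8, power_budget_w=300))
```

Servers outside the active set would be stepped down to a low-power state and woken on demand.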

Resource Economy

Input: the “value” of performance for each customer i.
– Common unit of value: “money”.
– Derives from the economic value of the service.
– Enables SLAs to represent flexible quality vs. cost tradeoffs.

Per-customer utility function $U_i$ = bid – penalty.
– Bid for traffic volume (throughput $\lambda_i$).
– Bid for better service quality, or subtract penalty for poor quality.

Allocate resources to maximize expected global utility (“revenue” or reward).
– Predict performance effects.
– “Sell” to the highest bidder.
– Never sell resources below cost.

Maximize $\sum_i bid_i(\lambda_i(t, \mu_i))$

Subject to $\sum_i \mu_i \le \mu_{max}$

Maximizing Revenue

Consider any customer i with allotment $\mu_i$ at fixed time t.
– The marginal utility ($price_i$) for a resource unit allotted to or reclaimed from i is the gradient of $U_i$ at $\mu_i$.

[Figure: expected utility $U_i(t, \mu_i)$ vs. resource allotment $\mu_i$, with $price_i$ marked as the slope at $\mu_i$.]

Adjust allotments until price equilibrium is reached.

The algorithm assumes that $U_i$ is “concave”: the price gradients are non-negative and monotonically non-increasing.
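The idea of adjusting allotments toward a price equilibrium can be illustrated with a simple greedy allocator over discrete resource units. This is a stand-in sketch, not the Muse algorithm itself: the function name `greedy_allocate`, the unit granularity, and the square-root utility curves in the example are assumptions. It relies on the concavity assumption stated above, under which handing each unit to the current highest marginal utility is optimal for this kind of separable problem.

```python
import heapq
import math


def greedy_allocate(utilities, total_units, unit_cost=0.0):
    """Allocate discrete resource units by marginal utility ("price").

    utilities   -- dict: customer name -> concave function U(units) -> value
    total_units -- resource units available
    unit_cost   -- never "sell" a unit whose marginal utility is at or
                   below this cost
    """
    allot = {name: 0 for name in utilities}
    # Max-heap (via negated keys) of each customer's next-unit price.
    heap = [(-(u(1) - u(0)), name) for name, u in utilities.items()]
    heapq.heapify(heap)
    remaining = total_units
    while remaining > 0 and heap:
        neg_price, name = heapq.heappop(heap)
        if -neg_price <= unit_cost:
            break  # highest remaining bid does not cover cost
        allot[name] += 1
        remaining -= 1
        u, n = utilities[name], allot[name]
        heapq.heappush(heap, (-(u(n + 1) - u(n)), name))
    return allot


if __name__ == "__main__":
    # Hypothetical concave utility curves for two customers.
    bids = {"a": lambda m: 10 * math.sqrt(m), "b": lambda m: 4 * math.sqrt(m)}
    print(greedy_allocate(bids, total_units=5))
    print(greedy_allocate(bids, total_units=5, unit_cost=3.5))
```

When the loop stops because units run out, every customer's next-unit price is at or below the price paid for the last unit sold, which is the discrete analogue of equal prices at equilibrium.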

Feedback and Stability

Allocation planning is incremental.
– Adjust the solution from the previous interval to react to new observations.

Allow system to stabilize before next re-evaluation.
– Set adjustment interval and magnitude to avoid oscillation.
– Control theory applies. [Abdelzaher, Shin et al, 2001]

Filter the load observations to distinguish transient and persistent load changes.
– Internet service workloads are extremely bursty.
– Filter must “balance stability and agility” [Kim and Noble 2001].

“Flop-Flip” Filter

EWMA-based filter alone is not sufficient.
– Average $A_t$ for each interval t: $A_t = \alpha A_{t-1} + (1 - \alpha) O_t$
– The gain $\alpha$ may be variable or flip-flop.

Load estimate $E_t = E_{t-1}$ if $|E_{t-1} - A_t| < tolerance$, else $E_t = A_t$
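The filter rule above translates almost directly into code. The sketch below assumes a gain of 7/8 (the EWMA setting shown in the accompanying plot legend), an arbitrary tolerance of 10 units, and seeds both the average and the estimate with the first observation; the function name and those initial conditions are assumptions.

```python
def flop_flip(observations, alpha=7 / 8, tolerance=10.0):
    """Return the flop-flip load estimate E_t for each observation O_t.

    A_t = alpha * A_{t-1} + (1 - alpha) * O_t    (EWMA)
    E_t = E_{t-1} if |E_{t-1} - A_t| < tolerance, else A_t
    """
    estimates = []
    avg = None
    est = None
    for obs in observations:
        avg = obs if avg is None else alpha * avg + (1 - alpha) * obs
        if est is None or abs(est - avg) >= tolerance:
            est = avg  # average drifted out of tolerance: jump to it
        estimates.append(est)
    return estimates


if __name__ == "__main__":
    noisy = [50, 52, 48, 50]         # small jitter: estimate holds steady
    step = [50] + [100] * 30         # persistent change: estimate follows
    print(flop_flip(noisy))
    print(flop_flip(step)[-1])
```

The estimate stays flat through transient noise and moves in discrete jumps when the smoothed load shifts persistently, which is the stability and agility tradeoff the slide describes.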

[Figure: utilization (%) vs. time (s), 0-1200 s, comparing Raw Data, EWMA ($\alpha$ = 7/8), and Flop-Flip; annotations: “Stable”, “Responsive”.]

IBM Trace Run (Before)

[Figure: throughput (requests/s), power draw (watts), and latency (ms x50) vs. time (minutes), 0-620 minutes; latency marker at 1 ms.]

IBM Trace Run (After)

[Figure: throughput (requests/s), power draw (watts), and latency (ms x50) vs. time (minutes), 0-620 minutes; latency marker at 1 ms.]

Evaluating Energy Savings

Trace replay shows adaptive provisioning in action.

Server energy savings in this experiment were 29%.
– 5-node cluster, 3x load swings, $\rho_{target}$ = 0.5
– Expect roughly comparable savings in cooling costs.
• Ventilation costs are fixed; chiller costs are proportional to thermal loading.

For a given “shape” load curve, achievable energy savings increase with cluster size.
• E.g., higher request volumes,
• or lower $\rho_{target}$ for better service quality.
– Larger clusters give finer granularity to closely match load.
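The granularity effect can be illustrated with a toy model. It assumes the cluster is sized so that peak load needs every server, that each interval's active set is the load fraction rounded up to whole servers, and that savings are measured as avoided server-on time; idle versus busy power and the utilization target are ignored. The function name and the three-point load curve are hypothetical.

```python
import math


def server_time_savings(load_fractions, cluster_size):
    """Fraction of server-on time avoided by powering down surplus servers.

    load_fractions -- load in each interval as a fraction of peak (0..1]
    cluster_size   -- servers needed at peak load (toy-model assumption)
    """
    active_counts = []
    for load in load_fractions:
        # round() guards against float noise such as 0.7 * 10 -> 7.000...1
        needed = math.ceil(round(load * cluster_size, 9))
        active_counts.append(min(cluster_size, max(1, needed)))
    mean_active = sum(active_counts) / len(active_counts)
    return 1.0 - mean_active / cluster_size


if __name__ == "__main__":
    curve = [0.3, 0.5, 1.0]   # hypothetical load "shape"
    for n in (4, 16):
        print(n, round(server_time_savings(curve, n), 3))
```

With the same load shape, the 16-server cluster wastes less capacity to rounding than the 4-server cluster, so its savings are higher.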

Expected Resource Savings

[Figure: savings (%) vs. max servers (0-16) for four traces: World Cup (two month), World Cup (month 2), World Cup (week 8), IBM (week).]

Conclusions

Dynamic request redirection enables fine-grained, continuous control over mapping of workload to physical server resources in hosting centers.

Continuous monitoring and control allows a hosting center OS to provision resources adaptively.

Adaptive resource provisioning is central to energy and thermal management in data centers.
– Adapt to energy “browndown” by degrading service quality.
– Adapt to load swings for 25% - 75% energy savings.

Economic policy framework guides provisioning choices based on SLAs and cost/benefit tradeoffs.

Future Work

– multiple resources (e.g., memory and storage)
– multi-tier services and multiple server pools
– reservations and latency QoS penalties
– rational server allocation and request distribution
– integration with thermal system in data center
– flexibility and power of utility functions
– server networks and overlays
– performability and availability SLAs
– application feedback

Muse Prototype and Testbed

[Figure: testbed diagram. Labels: client cluster (SURGE or trace load generators), LinkSys 100 Mb/s switch, redirectors (PowerEdge 1550), Extreme GigE switch, server pool, Executive, power meter.]

– FreeBSD-based redirectors
– resource containers
– APM and Wake-on-LAN
– faithful trace replay + synthetic Web loads
– server CPU-bound

Throughput and Latency

saturated: $\rho_i > \rho_{target}$
$\lambda_i$ increases linearly with $\mu_i$

Average per-request service demand: $\mu_i \rho_i / \lambda_i$

overprovisioned: $\rho_i < \rho_{target}$
may reclaim: $\mu_i(\rho_{target} - \rho_i)$
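The two expressions on this slide can be checked numerically with a short sketch. The function names and the sample values are assumptions; the units of $\mu_i$ (for example, fractions of a server's CPU) are left abstract, as on the slide.

```python
def service_demand(mu, rho, lam):
    """Average resource consumed per request: mu * rho / lambda.

    mu  -- allotted server resource
    rho -- utilization of that allotment (0..1)
    lam -- throughput in requests/s
    """
    return mu * rho / lam


def reclaimable(mu, rho, rho_target):
    """Resource that may be reclaimed from an overprovisioned service.

    Uses the slide's expression mu * (rho_target - rho); returns 0 when the
    service is at or above its utilization target.
    """
    return max(0.0, mu * (rho_target - rho))


if __name__ == "__main__":
    # Hypothetical service: 2.0 resource units, 25% utilized, 100 requests/s.
    print(service_demand(mu=2.0, rho=0.25, lam=100.0))
    print(reclaimable(mu=2.0, rho=0.25, rho_target=0.5))
```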

[Figures: CPU (%) allocation and usage vs. time (s), 0-180 s; throughput (requests/s) and latency (ms) vs. time (s), 0-180 s.]

An OS for a Hosting Center

Hosting centers are made up of heterogeneous components linked by a network fabric.
– Components are specialized.
– Each component has its own OS.

The role of a hosting center OS is to:
– Manage shared resources (e.g., servers, energy)
– Configure and monitor component interactions
– Direct flow of request/response traffic

Allocation Under Constraint (0)

[Figure: throughput (requests/s, 0-1500) and allotment (servers, 0-3) vs. time (s), 0-1500 s.]

Allocation Under Constraint (1)

[Figure: throughput (requests/s, 0-1500) and allotment (servers, 0-3) vs. time (s), -100 to 1500 s.]

Outline

– Adaptive server provisioning
– Energy-conscious provisioning
– Economic resource allocation
– Stable load estimation
– Experimental results
