Slide 1
Managing Energy and Server Resources in Hosting Centers
Jeff Chase, Darrell Anderson, Ron Doyle, Prachi Thakar, Amin Vahdat
Duke University
Slide 2
Back to the Future
Return to server-centered computing: applications run as services accessed through the Internet.
– Web-based services, ASPs, “netsourcing”
Internet services are hosted on server clusters.
– Incrementally scalable, etc.
Server clusters may be managed by a third party.
– Shared data center or hosting center
– Hosting utility offers economies of scale:
• Network access
• Power and cooling
• Administration and security
• Surge capacity
Slide 3
Managing Energy and Server Resources
Key idea: a hosting center OS maintains the balance of requests and responses, energy inputs, and thermal outputs.
[Figure: hosting center with inputs (requests, energy) and outputs (responses, waste heat)]
1. Adaptively provision server resources to match request load.
2. Provision server resources for energy efficiency.
3. Degrade service on power/cooling failures.
Power/cooling “browndown”
Dynamic thermal management [Brooks]
US in 2003: 22 TWh ($1B - $2B+)
Slide 4
Contributions
Architecture/prototype for adaptive provisioning of server resources in Internet server clusters (Muse)
– Software feedback
– Reconfigurable request redirection
– Addresses a key challenge for hosting automation
Foundation for energy management in hosting centers
– 25% - 75% energy savings
– Degrade rationally (“gracefully”) under constraint (e.g., browndown)
Simple “economic” resource allocation
– Continuous utility functions: customers “pay” for performance.
– Balance service quality and resource usage.
Slide 5
Static Provisioning
Dedicate fixed resources per customer
Typical of “co-lo” or dedicated hosting
Reprovision manually as needed
Overprovision for surges
– High variable cost of capacity
How to automate resource provisioning for managed hosting?
Slide 6
Load Is Dynamic
ibm.com external site
• February 2001
• Daily fluctuations (3x)
• Workday cycle
• Weekends off
World Cup soccer site
• May-June 1998
• Seasonal fluctuations
• Event surges (11x)
• ita.ee.lbl.gov
[Figure: throughput (requests/s) vs. time (one week, M T W Th F S S)]
[Figure: throughput (requests/s) vs. time (two months; weeks 6, 7, 8 marked)]
Slide 7
Adaptive Provisioning
- Efficient resource usage
- Load multiplexing
- Surge protection
- Online capacity planning
- Dynamic resource recruitment
- Balance service quality with cost
- Service Level Agreements (SLAs)
Slide 8
Utilization Targets
$\mu_i$ = allocated server resource for service i
$\rho_i$ = utilization of $\mu_i$ at i’s current load $\lambda_i$
$\rho_{target}$ = configurable target level for $\rho_i$
Leave headroom for load spikes.
$\rho_i > \rho_{target}$: service i is underprovisioned
$\rho_i < \rho_{target}$: service i is overprovisioned
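The comparison of a service's utilization against the target level can be sketched in a few lines of Python. This is an illustration only: the function name `classify`, the optional `tolerance` dead band, and the sample values are assumptions, not part of Muse.

```python
def classify(rho_i: float, rho_target: float, tolerance: float = 0.0) -> str:
    """Compare a service's utilization against the configured target.

    rho_i      -- measured utilization of the service's allotment (0..1)
    rho_target -- configurable target utilization (0..1)
    tolerance  -- hypothetical dead band around the target (an assumption)
    """
    if rho_i > rho_target + tolerance:
        return "underprovisioned"   # too little headroom for load spikes
    if rho_i < rho_target - tolerance:
        return "overprovisioned"    # some of the allotment could be reclaimed
    return "on target"


if __name__ == "__main__":
    for rho in (0.9, 0.5, 0.2):
        print(rho, classify(rho, rho_target=0.5))
```

A nonzero `tolerance` would keep a service that hovers near the target from being reclassified on every measurement.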
Slide 9
Muse Architecture
[Figure labels: executive; control; performance measures; configuration commands; reconfigurable switches; offered request load; server pool (stateless, interchangeable); storage tier]
Executive controls mapping of service traffic to server resources by means of:
• reconfigurable switches
• scheduler controls (shares)
Slide 10
Server Power Draw
[Figure: power draw (watts) vs. work]
CPU idle: 93 W
CPU max: 120 W
boot: 136 W
disk spin: 6-10 W
off/hibernate: 2-3 W
866 MHz P-III SuperMicro 370-DER (FreeBSD)
Brand Electronics 21-1850 digital power meter
Idling consumes 60% to 70% of peak power demand.
Slide 11
Energy vs. Service Quality
Active set = {A,B,C,D}: $\rho_i < \rho_{target}$
• Low latency
Active set = {A,B}: $\rho_i = \rho_{target}$
• Meets quality goals
• Saves energy
Slide 12
Energy-Conscious Provisioning
Light load: concentrate traffic on a minimal set of servers.
– Step down surplus servers to a low-power state.
• APM and ACPI
– Activate surplus servers on demand.
• Wake-On-LAN
Browndown: can provision for a specified energy target.
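A rough sketch of what concentrating traffic on a minimal set of servers buys, using the per-server power figures measured in this deck (93 W idle, 120 W at CPU max, 2-3 W off). The linear idle-to-max power model, the per-server `capacity`, and the function names are assumptions made for illustration.

```python
import math

# Per-server power figures from the measurements reported in this deck (watts).
IDLE_W = 93.0
MAX_W = 120.0
OFF_W = 2.5  # midpoint of the reported 2-3 W off/hibernate range


def servers_needed(load: float, capacity: float, rho_target: float) -> int:
    """Smallest server count that keeps utilization at or below rho_target.

    load is the offered load and capacity the per-server maximum, both in
    requests/s (capacity is a hypothetical parameter).
    """
    return max(1, math.ceil(load / (capacity * rho_target)))


def cluster_power(active: int, total: int, load: float, capacity: float) -> float:
    """Estimated cluster draw in watts.

    Assumes load is spread evenly over the active servers and that power
    rises linearly from idle to max with utilization (a simplification).
    """
    utilization = min(1.0, load / (active * capacity))
    per_active = IDLE_W + (MAX_W - IDLE_W) * utilization
    return active * per_active + (total - active) * OFF_W


if __name__ == "__main__":
    n = servers_needed(load=400, capacity=500, rho_target=0.5)
    print(n, cluster_power(n, 5, 400, 500), cluster_power(5, 5, 400, 500))
```

Because idle power dominates, most of the saving in this model comes from stepping servers down rather than from lowering utilization on servers that stay on.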
Slide 13
Resource Economy
Input: the “value” of performance for each customer i.
– Common unit of value: “money”.
– Derives from the economic value of the service.
– Enables SLAs to represent flexible quality vs. cost tradeoffs.
Per-customer utility function $U_i$ = bid – penalty.
– Bid for traffic volume (throughput $\lambda_i$).
– Bid for better service quality, or subtract penalty for poor quality.
Allocate resources to maximize expected global utility (“revenue” or reward).
– Predict performance effects.
– “Sell” to the highest bidder.
– Never sell resources below cost.
Maximize $\sum_i bid_i(\lambda_i(t, \mu_i))$
Subject to $\sum_i \mu_i \le \mu_{max}$
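To make the optimization concrete, the sketch below solves a tiny instance by exhaustive search: it picks integer allotments that maximize the sum of bids while keeping the total allotment within the maximum. The performance model `throughput`, the linear per-request bids, and the integer resource units are all assumptions for illustration, and exhaustive search is a stand-in that only works at toy sizes.

```python
from itertools import product


def throughput(offered: float, mu: int, per_unit: float = 100.0) -> float:
    """Hypothetical performance model: each resource unit serves up to
    per_unit requests/s, and throughput never exceeds the offered load."""
    return min(offered, mu * per_unit)


def best_allocation(offered, price_per_request, mu_max):
    """Search all integer allotments with sum(mu_i) <= mu_max and return
    (allotments, total_bid) for the one with the largest total bid."""
    n = len(offered)
    best, best_value = None, float("-inf")
    for mus in product(range(mu_max + 1), repeat=n):
        if sum(mus) > mu_max:
            continue  # violates the resource constraint
        value = sum(
            price_per_request[i] * throughput(offered[i], mus[i])
            for i in range(n)
        )
        if value > best_value:
            best, best_value = mus, value
    return best, best_value


if __name__ == "__main__":
    # Customer 0 bids 3.0 per request, customer 1 bids 1.0 per request.
    print(best_allocation([250, 300], [3.0, 1.0], mu_max=4))
```

In this instance the higher bidder receives three of the four units: its third unit still earns more than the second unit would earn for the lower bidder.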
Slide 14
Maximizing Revenue
Consider any customer i with allotment $\mu_i$ at fixed time t.
– The marginal utility ($price_i$) for a resource unit allotted to or reclaimed from i is the gradient of $U_i$ at $\mu_i$.
Adjust allotments until price equilibrium is reached.
The algorithm assumes that $U_i$ is “concave”: the price gradients are non-negative and monotonically non-increasing.
[Figure: expected utility $U_i(t, \mu_i)$ vs. resource allotment $\mu_i$; the slope at $\mu_i$ is $price_i$]
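One simple way to reach a price equilibrium when utilities are concave is to hand out resource units one at a time to whichever customer currently offers the highest marginal utility. The sketch below does that from an empty allocation, which is a simplification: the slides describe adjusting existing allotments. The example utility curves and the `cost` floor are assumptions.

```python
import math


def greedy_allocate(utilities, mu_max, cost=0.0):
    """Allocate mu_max units, one at a time, by highest marginal utility.

    utilities -- list of functions U_i(mu) assumed concave and non-decreasing
    cost      -- hypothetical per-unit cost; units priced at or below it stay unsold
    """
    mu = [0] * len(utilities)
    for _ in range(mu_max):
        # price_i: gain in U_i from one more unit at the current allotment
        prices = [u(m + 1) - u(m) for u, m in zip(utilities, mu)]
        best = max(range(len(utilities)), key=lambda i: prices[i])
        if prices[best] <= cost:
            break  # no remaining buyer values a unit above its cost
        mu[best] += 1
    return mu


if __name__ == "__main__":
    u_a = lambda m: 10 * math.sqrt(m)  # example concave utility (assumed)
    u_b = lambda m: 4 * math.sqrt(m)
    print(greedy_allocate([u_a, u_b], mu_max=6))
```

Concavity is what makes the greedy rule safe: once a customer's price falls below another's, it never rises again, so no later transfer of a unit can raise total utility.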
Slide 15
Feedback and Stability
Allocation planning is incremental.
– Adjust the solution from the previous interval to react to new observations.
Allow system to stabilize before next re-evaluation.
– Set adjustment interval and magnitude to avoid oscillation.
– Control theory applies. [Abdelzaher, Shin et al, 2001]
Filter the load observations to distinguish transient and persistent load changes.
– Internet service workloads are extremely bursty.
– Filter must “balance stability and agility” [Kim and Noble 2001].
Slide 16
“Flop-Flip” Filter
EWMA-based filter alone is not sufficient.
– Average $A_t$ for each interval t: $A_t = \alpha A_{t-1} + (1-\alpha) O_t$
– The gain $\alpha$ may be variable or flip-flop.
Load estimate: $E_t = E_{t-1}$ if $|E_{t-1} - A_t| < tolerance$, else $E_t = A_t$
Stable
Responsive
[Figure: utilization (%) vs. time (s), 0 to 1200 s; series: Raw Data, EWMA (α = 7/8), Flop-Flip]
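The filter rule can be written directly as a short loop: keep an exponentially weighted moving average of the observations, hold the load estimate steady while it stays within a tolerance of that average, and jump to the average otherwise. Seeding both values with the first observation and the default `tolerance` are assumptions; the slides do not specify them.

```python
def flop_flip(observations, alpha=7 / 8, tolerance=10.0):
    """Return (ewma, estimates) for a sequence of load observations.

    A_t = alpha * A_{t-1} + (1 - alpha) * O_t      (moving average)
    E_t = E_{t-1} if |E_{t-1} - A_t| < tolerance   (hold the stable estimate)
          else A_t                                 (switch to the average)
    """
    ewma, estimates = [], []
    a = e = None
    for o in observations:
        # Seed the average with the first observation (an assumption).
        a = o if a is None else alpha * a + (1 - alpha) * o
        if e is None or abs(e - a) >= tolerance:
            e = a
        ewma.append(a)
        estimates.append(e)
    return ewma, estimates


if __name__ == "__main__":
    averages, estimates = flop_flip([40, 44, 36, 80, 80, 80], alpha=0.5)
    print(averages)
    print(estimates)
```

With these inputs the estimate ignores the small wobble around 40 and then steps up once the average moves more than the tolerance away, which is the stable-then-responsive behavior the filter is meant to provide.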
Slide 17
IBM Trace Run (Before)
[Figure: throughput (requests/s, 0 to 2500), power draw (watts, 0 to 350), and latency (ms x50) vs. time (0 to 620 minutes); latency annotation: 1 ms]
Slide 18
IBM Trace Run (After)
[Figure: throughput (requests/s, 0 to 2500), power draw (watts, 0 to 350), and latency (ms x50) vs. time (0 to 620 minutes); latency annotation: 1 ms]
Slide 19
Evaluating Energy Savings
Trace replay shows adaptive provisioning in action.
Server energy savings in this experiment were 29%.
– 5-node cluster, 3x load swings, $\rho_{target}$ = 0.5
– Expect roughly comparable savings in cooling costs.
• Ventilation costs are fixed; chiller costs are proportional to thermal loading.
For a given “shape” load curve, achievable energy savings increase with cluster size.
• E.g., higher request volumes,
• or lower $\rho_{target}$ for better service quality.
– Larger clusters give finer granularity to closely match load.
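The granularity effect can be illustrated with a small calculation: for a fixed load shape, count the whole servers needed in each interval and compare against leaving every server on. The load shape below and the normalization to the cluster's sized peak are invented for illustration, and the result measures server-time saved rather than energy, since servers that are off still draw a few watts.

```python
import math


def resource_savings(load_shape, n_servers):
    """Fraction of server-time saved versus keeping all n_servers on.

    load_shape lists the load in each interval as a fraction of the peak
    the cluster is sized for (an assumed normalization); each interval
    needs ceil(fraction * n_servers) whole servers, and at least one.
    """
    needed = [max(1, math.ceil(f * n_servers)) for f in load_shape]
    return 1.0 - sum(needed) / (len(load_shape) * n_servers)


if __name__ == "__main__":
    shape = [0.3, 0.5, 1.0, 0.6]  # hypothetical load curve
    for n in (2, 4, 16):
        print(n, resource_savings(shape, n))
```

The trend is upward as the cluster grows, although rounding to whole servers means it is not strictly monotonic for every cluster size.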
Slide 20
Expected Resource Savings
[Figure: savings (%) vs. max servers (0 to 16); series: World Cup (two month), World Cup (month 2), World Cup (week 8), IBM (week)]
Slide 21
Conclusions
Dynamic request redirection enables fine-grained, continuous control over mapping of workload to physical server resources in hosting centers.
Continuous monitoring and control allow a hosting center OS to provision resources adaptively.
Adaptive resource provisioning is central to energy and thermal management in data centers.
– Adapt to energy “browndown” by degrading service quality.
– Adapt to load swings for 25% - 75% energy savings.
Economic policy framework guides provisioning choices based on SLAs and cost/benefit tradeoffs.
Slide 22
Future Work
multiple resources (e.g., memory and storage)
multi-tier services and multiple server pools
reservations and latency QoS penalties
rational server allocation and request distribution
integration with thermal system in data center
flexibility and power of utility functions
server networks and overlays
performability and availability SLAs
application feedback
Slide 23
Muse Prototype and Testbed
[Figure labels: executive; client cluster (SURGE or trace load generators); LinkSys 100 Mb/s switch; redirectors (PowerEdge 1550); Extreme GigE switch; server pool; power meter]
FreeBSD-based redirectors
resource containers
APM and Wake-on-LAN
faithful trace replay + synthetic Web loads
server CPU-bound
Slide 24
Throughput and Latency
saturated: $\rho_i > \rho_{target}$
$\lambda_i$ increases linearly with $\mu_i$
Average per-request service demand: $\mu_i \rho_i / \lambda_i$
overprovisioned: $\rho_i < \rho_{target}$
may reclaim: $\mu_i(\rho_{target} - \rho_i)$
[Figure: CPU (%) vs. time (s), 0 to 180 s; series: Allocation, Usage]
[Figure: throughput (requests/s, 0 to 600) and latency (ms, 0 to 100) vs. time (s), 0 to 180 s; series: Throughput, Latency]
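As a worked example of the two quantities on this slide, the sketch below computes the average per-request service demand (allotment times utilization, divided by throughput) and the amount that may be reclaimed from an overprovisioned service (allotment times the gap between target and measured utilization). The function names, the clamp to zero, and the sample numbers are assumptions.

```python
def service_demand(mu_i: float, rho_i: float, lambda_i: float) -> float:
    """Average resource consumed per request: mu_i * rho_i / lambda_i."""
    return mu_i * rho_i / lambda_i


def reclaimable(mu_i: float, rho_i: float, rho_target: float) -> float:
    """Resource that may be reclaimed from an overprovisioned service:
    mu_i * (rho_target - rho_i), clamped to 0 when the service is at or
    above the target (the clamp is an added assumption)."""
    return max(0.0, mu_i * (rho_target - rho_i))


if __name__ == "__main__":
    # Hypothetical service: 2.0 resource units, 25% utilized, 400 requests/s.
    print(service_demand(2.0, 0.25, 400.0))
    print(reclaimable(2.0, 0.25, rho_target=0.5))
```

The service-demand estimate is what lets a provisioner predict how throughput would change if the allotment were adjusted.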
Slide 25
An OS for a Hosting Center
Hosting centers are made up of heterogeneous components linked by a network fabric.
– Components are specialized.
– Each component has its own OS.
The role of a hosting center OS is to:
– Manage shared resources (e.g., servers, energy)
– Configure and monitor component interactions
– Direct flow of request/response traffic
Slide 26
Allocation Under Constraint (0)
[Figure: throughput (requests/s, 0 to 1500) and allotment (servers, 0 to 3) vs. time (0 to 1500 s)]
Slide 27
Allocation Under Constraint (1)
[Figure: throughput (requests/s, 0 to 1500) and allotment (servers, 0 to 3) vs. time (-100 to 1500 s)]
Slide 28
Outline
Adaptive server provisioning
Energy-conscious provisioning
Economic resource allocation
Stable load estimation
Experimental results