super-dense servers - ibm · 2011-03-08 · charles lefurgy ibm research, austin 4 web server...
Super-Dense Servers: An Energy-efficient Approach to
Large-scale Server Clusters
Charles Lefurgy
IBM Research, Austin
Outline
• Problem
  – Internet data centers use a lot of energy
• Opportunity
  – Load-varying applications
  – Servers can be power-managed
• Solution
  – Hardware: Dense server blades
    • Design decisions
    • Software support
  – Software: Power-Aware Request Distribution
    • Framework for cluster energy studies
    • Adapt cluster resources to workload
Motivation
• Internet data centers
  – 25% of operating costs go to energy and cooling
  – Anecdotal evidence that customers’ racks are power-limited
  – Source: Jennifer Mitchell-Jackson’s thesis at http://enduse.lbl.gov/Info/datacenterreport.pdf
  – See also http://www.repp.org/articles/static/1/binaries/data_centers_report.pdf
• Power consumption affects cooling and backup power generation requirements, as well as reliability
  – Higher power means greater investments in sophisticated racks, air conditioning and power generation infrastructure
  – Excessive heat may cause intermittent failures
  – Power draw may become a problem for utilities
• Mobile servers, e.g. aircraft, ships, military applications
Web server energy
• Sites are built with extra capacity
  – To handle load spikes
  – To handle failures
• This causes clusters to be underutilized
  – Nagano 1998 Winter Olympics
    • Average load was 25% of peak encountered
  – Wimbledon 1999
    • Average load was 11% of peak encountered
• Workload varies dramatically
  – Time of day
  – Time of year
  – Application type
• This is an opportunity for power management!
Where does the energy go?
[Figure: Power (W, axis 0 to 45) of a conventional 600 MHz desktop system versus load (idle, then 1, 5, 10, 15, 20 connections), broken down by supply rail: 5V-Disk, 12V-Disk, 12V-Motherboard, 5V-Motherboard, 3.3V-Motherboard]
When can power be saved?
• 1998 Nagano Winter Olympics
Server cluster
Set of computers used as a single system
[Diagram labels: Load balancer; Web server cluster; Content store (NASD, servers)]
The SDS hypothesis
• Use embedded processors for general-purpose servers
  – “Army of turtles” approach
  – Take advantage of $P \propto c f v^{2}$ (dynamic power scales with switched capacitance $c$, clock frequency $f$, and the square of supply voltage $v$)
  – Embedded processors use less speculation and waste less energy
  – Low power + high integration = high density
  – Focus on MIPS / m³ / Watt
• Use blade form factor for high density
• Use blades in tier 1 and tier 2 of web site
  – Parallelism in requests is a match for having many slow blades
  – Tier 3 (database) has too much synchronization and works better on symmetric multiprocessors
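The power relation on this slide (P proportional to c·f·v²) is the reason many slow processors can beat one fast processor on energy. The sketch below is only an illustration: the operating point (half the clock at 80% of the supply voltage) is a hypothetical value chosen for the arithmetic, not a figure from the talk, and the function name is made up for this example.

```python
def relative_dynamic_power(freq_ratio: float, volt_ratio: float,
                           cap_ratio: float = 1.0) -> float:
    """Dynamic power relative to a baseline, using P proportional to c * f * v**2."""
    return cap_ratio * freq_ratio * volt_ratio ** 2


# Hypothetical operating point: half the clock frequency at 80% of the voltage.
one_slow_cpu = relative_dynamic_power(freq_ratio=0.5, volt_ratio=0.8)

# Two such processors match the aggregate clock rate of the baseline processor.
two_slow_cpus = 2 * one_slow_cpu

print(f"one slow CPU:  {one_slow_cpu:.2f} x baseline power")
print(f"two slow CPUs: {two_slow_cpus:.2f} x baseline power at equal aggregate clock")
```

Under these assumed ratios the pair of slow processors draws less dynamic power than the single fast one, which only helps when the workload has enough request-level parallelism to keep both busy.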
SDS blade
• 1 x86 ULV SpeedStep processor, 500/300 MHz
• 512 MB SODIMM (256 MB with disk)
• 2 × 100 Mb/s Ethernet ports
• 1 Toshiba 1.8” IDE 5 GB HDD
• No keyboard, video, mouse
Blade Power Budget
| Component | Worst case power (Watts) |
|---|---|
| Processor | 6.402 |
| SODIMM 256MB | 1.000 |
| Voltage Regulator | 0.005 |
| North/South/Ethernet | 1.980 |
| Ethernet PHY | 0.660 |
| LPC Flash Memory | 0.033 |
| EEPROM | 0.007 |
| PCI to PCI Bridge | 0.173 |
| Supervisory Processor | 0.330 |
| Ethernet Controller | 0.743 |
| Voltage Monitor - I2C | 0.008 |
| Clock Generator | 0.693 |
| Disk | 1.485 |
| 90% efficient power supply | 1.352 |
| Total | 14.871 |
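The budget can be cross-checked with a few lines of arithmetic. The sketch below copies the worst-case figures from this slide and sums them. One observation, offered as a reading of the numbers rather than something the slide states: the power-supply row works out to about 10% of the summed component load.

```python
# Worst-case component power in watts, copied from the slide.
COMPONENTS_W = {
    "Processor": 6.402,
    "SODIMM 256MB": 1.000,
    "Voltage Regulator": 0.005,
    "North/South/Ethernet": 1.980,
    "Ethernet PHY": 0.660,
    "LPC Flash Memory": 0.033,
    "EEPROM": 0.007,
    "PCI to PCI Bridge": 0.173,
    "Supervisory Processor": 0.330,
    "Ethernet Controller": 0.743,
    "Voltage Monitor - I2C": 0.008,
    "Clock Generator": 0.693,
    "Disk": 1.485,
}
SUPPLY_OVERHEAD_W = 1.352  # the "90% efficient power supply" row

load_w = sum(COMPONENTS_W.values())
total_w = load_w + SUPPLY_OVERHEAD_W
processor_share = COMPONENTS_W["Processor"] / total_w

print(f"component load:  {load_w:.3f} W")
print(f"total budget:    {total_w:.3f} W")
print(f"supply overhead: {SUPPLY_OVERHEAD_W / load_w:.1%} of component load")
print(f"processor share: {processor_share:.1%} of total")
```

The processor is the largest single item, which is consistent with the deck's emphasis on low-power embedded processors.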
Bladed Servers
• Blade: Board that plugs into chassis (backplane)
• Advantages
  – Less cabling (this is important!)
  – Potentially lower space, power, cooling needs
  – Mix and match: server, storage, network, etc.
  – Blade cluster offers finer power management and system control
• Disadvantage
  – Current data centers may not be able to cope with higher energy density
Blade enclosure
• Industry standard CompactPCI enclosure. 6U high.
• Network blade: network switching
• Server blade: processor + memory
• System management blade with disk
Rack comparison
| | Conventional | SDS Cluster |
|---|---|---|
| CPUs | 42 | 360 |
| CPUs/U | 1 | 8.57 |
| Processor speed | 101 GHz (x86) (2.4 GHz each) | 180 GHz (x86) (500 MHz each) |
| Main memory | 168 GB | 184 GB |
| Ethernet | 84 Gb/s | 71.4 Gb/s |
| L2 cache | 42 MB | 92 MB |
| I/O buses | 42 | 360 |
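The aggregate processor-speed and density rows follow from the per-CPU figures. The sketch below redoes that arithmetic. The 42U rack height is an assumption inferred from the conventional rack holding 42 CPUs at 1 CPU per U. The slide does not state it.

```python
RACK_UNITS = 42  # assumption: inferred from 42 CPUs at 1 CPU/U, not stated on the slide

racks = {
    "Conventional": {"cpus": 42, "ghz_each": 2.4},
    "SDS Cluster": {"cpus": 360, "ghz_each": 0.5},
}


def aggregate_ghz(rack: dict) -> float:
    """Sum of clock rates across all CPUs in the rack."""
    return rack["cpus"] * rack["ghz_each"]


def cpus_per_unit(rack: dict) -> float:
    """CPU density per rack unit, under the assumed rack height."""
    return rack["cpus"] / RACK_UNITS


for name, rack in racks.items():
    print(f"{name}: {aggregate_ghz(rack):.1f} GHz aggregate, "
          f"{cpus_per_unit(rack):.2f} CPUs/U")
```

Aggregate clock rate is a rough proxy only. It says nothing about per-request latency, which is why the deck restricts blades to the highly parallel web tiers.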
Software for SDS
• Linux Diskless Server Architecture
  – Single system image for all blades
  – Boot from management blade disk
  – Blades are diskless and boot in 20 seconds
• Ethernet block device
  – High performance swap
  – Serving web content
• Blade management across I2C bus
  – H8 microcontroller on blades acts as power switch
• Console over Ethernet
• Power-Aware Request Distribution
  – Quick boot time reduces “idle” power
Evaluation of SDS Cluster
| | IBM x330 | 8 blade SDS cluster |
|---|---|---|
| CPUs | 1 | 8 |
| Processor speed | 1.2 GHz | 300 MHz each |
| Main memory | 2 GB | 256 MB each |
| Ethernet | 1 Gb/s | 100 Mb/s each |
| L2 cache | 512 KB | 512 KB each |

• IBM x330 is what was available when SDS was designed
• Use same total memory
• Conservative cluster configuration in other aspects
• Blades could only use 300 MHz and 256 MB at time of evaluation
• Use a modified TPC-W benchmark (fit images in memory of x330)
<tpc-w> results
| | IBM x330 | 8 blade SDS cluster |
|---|---|---|
| WIPS | 68 | 117 |
| Power | 101.7 W | 104.1 W |
| WIPS/Watt | 0.67 | 1.12 |

• Benchmark is CPU bound
• Blades provide 1.7x performance for similar power level
• 2x aggregate CPU frequency of the blades (8 × 300 MHz vs. 1.2 GHz) helps them win
• This is a conservative result
  – Fixing NIC interrupts, using 500 MHz, and using 512 MB would improve energy-efficiency of blades
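The efficiency row and the 1.7x claim can be reproduced from the throughput and power rows. The sketch below uses only the numbers reported on this slide. The dictionary layout and function name are illustrative.

```python
# Measured results copied from the slide (WIPS = web interactions per second).
results = {
    "IBM x330": {"wips": 68, "power_w": 101.7},
    "8 blade SDS cluster": {"wips": 117, "power_w": 104.1},
}


def wips_per_watt(result: dict) -> float:
    """Energy efficiency: throughput divided by wall power."""
    return result["wips"] / result["power_w"]


x330 = results["IBM x330"]
sds = results["8 blade SDS cluster"]

speedup = sds["wips"] / x330["wips"]
efficiency_gain = wips_per_watt(sds) / wips_per_watt(x330)

for name, result in results.items():
    print(f"{name}: {wips_per_watt(result):.2f} WIPS/Watt")
print(f"throughput ratio: {speedup:.2f}x, efficiency ratio: {efficiency_gain:.2f}x")
```

Because the two systems draw nearly the same power, the throughput ratio and the efficiency ratio come out close to each other.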
Summary of SDS blades
• Lesson: blades are a viable deployment alternative for edge and application servers
• 1.7x better performance for CPU-bound <tpc-w> workload at same energy cost
• Performance is worse for applications in which blades “band” together to provide a single cluster image
  – SPECweb99 requires a lot of memory
  – Blades each have less memory and they cannot share their memories
  – Traditional SMP servers are better here
• Heterogeneous deployments are required until memory density improves
PARD
Power-Aware Request Distribution
PARD
• PARD: a method of scheduling requests among servers so that energy consumption is minimized while maintaining a particular level of performance
• Goals
  – Save energy in a web cluster
  – Do not impact response latency
• Solution: consolidate work onto fewer servers
  – Turn off inactive servers
    • Idle servers use a lot of energy
  – Cost: small increase in response latency
• Mechanism
  – Monitor the cluster
  – Use load balancing
Pitfalls in measuring energy for clusters
• Not measuring total system energy
  – Slowing down the system to save processor power is not useful if other components are on longer and use more idle energy
• Not scaling workload to the system
  – Improving an inefficient machine overstates the results!
• Poor metrics
  – Running the benchmark again and reporting the energy savings is not enough. How was performance impacted?
  – Ignoring response time in web benchmarks
• Poor benchmarks
  – Cluster workloads for energy-efficiency have used manufactured benchmarks
• No idea if results are “good enough”. What is the limit of the method being evaluated?
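The first pitfall in the list above (measuring processor energy instead of total system energy) is easy to see with a small calculation. Every number in the sketch below is hypothetical and chosen only to make the arithmetic visible. It also assumes the machine can be switched off once the job finishes, so energy is counted only while the job runs.

```python
def job_energy_joules(cpu_w: float, rest_w: float, seconds: float) -> float:
    """Energy to finish one job: (processor + rest of system) power times run time."""
    return (cpu_w + rest_w) * seconds


REST_OF_SYSTEM_W = 20.0  # hypothetical: memory, disk, chipset, power supply losses

# Hypothetical job: 100 s at full speed (10 W CPU) or 200 s at half speed (4 W CPU).
cpu_only_fast = job_energy_joules(10.0, 0.0, 100.0)
cpu_only_slow = job_energy_joules(4.0, 0.0, 200.0)
system_fast = job_energy_joules(10.0, REST_OF_SYSTEM_W, 100.0)
system_slow = job_energy_joules(4.0, REST_OF_SYSTEM_W, 200.0)

print(f"CPU only:     fast {cpu_only_fast:.0f} J, slow {cpu_only_slow:.0f} J")
print(f"Whole system: fast {system_fast:.0f} J, slow {system_slow:.0f} J")
```

With these assumed values the processor-only view reports a saving from slowing down, while the whole-system view reports an increase, because the other components stay powered for twice as long.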
Life of a blade web server
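[State diagram: a blade moves among the states off, wake, busy, retire, and standby (low-energy state), grouped into active and inactive. Annotations: web server is running in 30 seconds; shutdown takes 1 second; accept new connections; server selected to turn off drains its current connections.]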
4 dimensions of problem
• Energy savings
• Quality of service (performance)
• System characteristics
• Workload characteristics
• When any two are fixed, there is a trade-off between the other two
• All must be reported to understand results
  – Often, only the first two are used
System Characteristics
• Cluster unit
  – For example, a complete server
• Immunity to overload
  – At what point does a server overload and die?
• System energy consumption
  – Idle power, peak power
• Startup and shutdown delay of cluster units
  – Is this unit used by other units?
• Ability to migrate requests
  – Free up servers to turn them off
Workload characteristics
• Workload unit
  – For example, a connection. Differs by connection type.
• Load profile
  – The instantaneous load and a required minimum QoS
  – “Machine utilization ratio” corresponds to “load ratio”
• Rate of change in workload
  – Can machine respond to spikes quickly enough?
Workloads
• Adapt TPC-W to fit load-varying web trace
[Figure: load (connections, axis 0 to 1200) over time for the Financial and Olympics traces]
Test environment
• Web cluster
  – 8 SDS blades running Apache
  – Linux Virtual Server (LVS) does request distribution
  – <tpc-w> benchmark
• Energy monitoring
  – National Instruments data acquisition equipment and LabVIEW
  – Send data to LVS director
  – Measure “wall power” of cluster
• Cluster utilization
  – Each server periodically collects statistics
  – /proc/stat, netstat: CPU, disk, network stats
  – Send data to LVS director
• Linux Virtual Server
  – Modified to use monitoring to drive request distribution policies
  – Use “Least Connections” scheduling policy
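As an illustration of the per-server statistics collection mentioned above, the sketch below derives CPU utilization from two samples of the aggregate `cpu` line in `/proc/stat`. It is a minimal example under stated assumptions, not the collector used in the study: the function names are invented here, the sample lines are made up, and counting `iowait` as idle time is one common convention among several.

```python
def parse_cpu_line(line: str) -> tuple[int, int]:
    """Return (idle_jiffies, total_jiffies) from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in fields[1:]]
    # Field order: user nice system idle [iowait irq softirq steal guest guest_nice].
    idle = values[3] + (values[4] if len(values) > 4 else 0)
    # Guest time is already included in user/nice, so sum at most the first 8 fields.
    total = sum(values[:8])
    return idle, total


def cpu_utilization(before: str, after: str) -> float:
    """Fraction of non-idle time between two samples of the 'cpu' line."""
    idle0, total0 = parse_cpu_line(before)
    idle1, total1 = parse_cpu_line(after)
    elapsed = total1 - total0
    if elapsed <= 0:
        return 0.0
    return 1.0 - (idle1 - idle0) / elapsed


# Made-up samples; on a live Linux system read the first line of /proc/stat twice.
sample_before = "cpu  100 0 50 800 50 0 0 0 0 0"
sample_after = "cpu  300 0 150 1400 150 0 0 0 0 0"
utilization = cpu_utilization(sample_before, sample_after)
print(f"CPU utilization over the interval: {utilization:.0%}")
```

A collector along these lines would sample periodically on each blade and forward the result to the director, which can then combine it with connection counts when choosing where to send requests.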
Simple threshold model
• Given a workload, always turn on enough servers so that the performance goal is met
  – Assign a threshold to each machine for the number of connections it can hold before the next server is turned on. Based on maximum load acceleration.
  – Turn on another machine when load balancing can no longer be used to avoid putting a machine over its threshold
  – Use future knowledge to set the threshold. This is to estimate the limit of the technique.
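The threshold rule above can be sketched in a few lines. This is a simplified illustration, not the study's implementation: it assumes the load balancer spreads connections evenly, it ignores the wake-up delay except through the choice of threshold, and the threshold of 150 connections and the load trace are hypothetical values. The default cluster size of 8 matches the test environment.

```python
import math


def blades_needed(connections: int, threshold: int,
                  total_blades: int = 8, min_blades: int = 1) -> int:
    """Fewest active blades that keep every blade at or below its connection threshold,
    assuming connections are balanced evenly across the active blades."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    needed = math.ceil(connections / threshold)
    return max(min_blades, min(total_blades, needed))


# Hypothetical load trace (open connections sampled over time).
trace = [40, 120, 310, 560, 900, 430, 90]
plan = [blades_needed(load, threshold=150) for load in trace]

for load, active in zip(trace, plan):
    print(f"{load:4d} connections -> {active} active blade(s)")
```

Lowering the threshold turns blades on earlier, which buys headroom for fast-rising load at the cost of keeping more blades active than the instantaneous load requires.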
[Figure, two panels over a 30-minute trace: number of blades (0 to 8), active vs. required; power in watts (0 to 140), PARD vs. no PARD]
Financial workload
[Figure, two panels over a 30-minute trace: number of blades (0 to 8), active vs. required; power in watts (0 to 150), PARD vs. no PARD]
• Maximum rate of change in trace is higher than Olympics
  – Use a lower threshold to turn blades on earlier
• When rate of change is lower (other parts of trace)
  – Blades turn on earlier than they really should
  – This is why “active” is much higher than “required” for most of trace
  – We can do better by modifying the threshold during the workload
Simple threshold results
| | Financial | Olympics |
|---|---|---|
| Active blades avg. | 7.2 | 4.76 |
| Inactive blades | 10% | 40% |
| Savings possible from workload activity (limit) | 32% | 51% |
| Energy savings measured | 10% | 38% |
Improving results
| | Financial | Olympics |
|---|---|---|
| Active blades avg. | 7.2 (5.57) | 4.76 (4.07) |
| Inactive blades | 10% (30%) | 40% (49%) |
| Savings possible from workload activity (limit) | 32% | 51% |
| Energy savings measured | 10% (26%) | 38% (45%) |

• Change blade threshold as workload changes
  – Requires knowledge of workload change at every point in time
  – Appropriate for predictable, cyclic workloads
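The "inactive blades" percentages follow from the average number of active blades in the 8-blade cluster, and the measured savings can be compared against them. The sketch below redoes that arithmetic with the figures reported on this slide. The labels "fixed" and "varying" for the plain and parenthesized values are this example's own naming.

```python
TOTAL_BLADES = 8  # cluster size from the test environment

# (average active blades, reported inactive fraction, measured energy savings)
reported = {
    ("Olympics", "fixed threshold"): (4.76, 0.40, 0.38),
    ("Olympics", "varying threshold"): (4.07, 0.49, 0.45),
    ("Financial", "fixed threshold"): (7.2, 0.10, 0.10),
    ("Financial", "varying threshold"): (5.57, 0.30, 0.26),
}


def inactive_fraction(avg_active: float) -> float:
    """Share of the cluster that is switched off, on average."""
    return 1.0 - avg_active / TOTAL_BLADES


computed = {key: inactive_fraction(row[0]) for key, row in reported.items()}

for (trace, policy), (avg_active, inactive, saved) in reported.items():
    print(f"{trace:9s} {policy:17s}: computed inactive "
          f"{computed[(trace, policy)]:.1%}, reported {inactive:.0%}, "
          f"measured savings {saved:.0%}")
```

In every case the measured savings sit at or slightly below the inactive fraction.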
Summary of PARD
• 4 dimensions of problem: energy, QoS, system characteristics, workload characteristics
• Developed method of adapting an industry standard workload to study power management
• Energy savings closely track with utilization of machine
• Future work
  – Apply to non-connection workloads
  – Address non-cyclic workloads (spikes)
  – Extend model to multiple power states (voltage scaling processors)
People
Pat Bohrer, Bishop Brock, Mootaz Elnozahy, Wes Felter, Jessie Gonzalez, Tom Keller, Mike Kistler, Ravi Kokku, Charles Lefurgy,
Akihiko Miyoshi, Thanos Papathanasion, Jim Phelan, Karthick Rajamani, Ram Rajamony, Freeman Rawson, Alison Smith, Bruce Smith, Eric Van Hensbergen