Power Optimization Chapter 9 Contributed by Alex Turek 1

Upload: keeley-hankins

Post on 14-Dec-2015


Power Optimization

Chapter 9

Contributed by Alex Turek


Outline

System Design Considerations

Power Optimization (hardware)

Power Optimization (software)

Constant vs. Dynamic Power in ES

Advanced Configuration and Power Interface (ACPI)

Constructing software for maximum power performance

Summary


Computer System Design Considerations

Cost, Performance, Power, Size

Every decision is a trade-off

Generally, lowering power consumption has become more important than increasing performance

Power-optimization techniques were initially developed for mobile platforms

Power optimization is important for all platform types, from mobile devices and embedded systems to multi-million dollar servers


Power Basics

Power can be optimized on both the hardware and software levels

Current computers consist of many MOSFETs (metal-oxide-semiconductor field-effect transistors)


Active Power Consumption

$P = C V^2 f$

C = capacitive load

V = voltage

f = frequency

Voltage affects active power more than frequency or capacitive load does, because power scales with the square of the voltage

Voltage and frequency are first-order determinants of performance
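To make the scaling concrete, the active-power relation can be evaluated numerically. The following is a minimal sketch: the function name and the 20% reductions are illustrative assumptions, not values from the chapter.

```python
def relative_active_power(voltage_scale: float, frequency_scale: float,
                          capacitance_scale: float = 1.0) -> float:
    """Active power relative to a baseline, using P = C * V^2 * f."""
    return capacitance_scale * voltage_scale ** 2 * frequency_scale


# Hypothetical operating points: each lowers one or both quantities by 20%.
print(f"frequency only: {relative_active_power(1.0, 0.8):.3f}")
print(f"voltage only:   {relative_active_power(0.8, 1.0):.3f}")
print(f"both together:  {relative_active_power(0.8, 0.8):.3f}")
```

Under this model, a 20% voltage reduction saves more active power than a 20% frequency reduction, because voltage enters the formula squared.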


Power Consumption in CMOS

Charging and discharging capacitors

Voltage and frequency are first-order determinants of performance

Short-circuit currents: short circuit between supply rails during switching

Leakage (static power consumption): leaking diodes and transistors

Minimizing leakage is extremely important for low-power systems, which are not active most of the time


Intel 3D Transistor

Intel's 3-D Tri-Gate transistors enable chips to operate at lower voltage with lower leakage

The 22nm 3-D Tri-Gate transistors provide up to 37 percent performance increase at low voltage versus Intel's 32nm planar transistors

The new transistors consume less than half the power when at the same performance as 2-D planar transistors on 32nm chips.


Intel 3D Transistor


Optimizing Power Using Software

Power Profiles of Embedded Systems:

Always on/continuous power: for devices that are required to provide near-peak performance nearly all the time (displays, controllers, networking devices, etc.)

Advanced Power Management such as Intel SpeedStep™: policy- and usage-based frequency and voltage control

Initially designed for laptops: plugged-in vs. battery modes

Newer Intel CPUs, such as the Core i7 series, handle usage-based power management, whereas the OS handles policy-based power management

Intel Core i7-3960X: 3.3 GHz (3.9 GHz Turbo), 6 cores, 12 threads


Reducing Power Consumption

Mobile computers/devices created the need for CPUs with lower power consumption

Now that CPUs use less of the overall system power budget, manufacturers are looking for ways to reduce the power consumption of other components

The First Step to Optimizing System Power

Maximize the range of power consumption between low-power states and high-power states

If the max/min states have negligible differences in power consumption, then power management will have no impact


Constant vs. Dynamic Power

Constant Power: an embedded computer consumes a constant, minimum amount of power when powered on, regardless of the level of system activity

Other than by powering a system down completely or transitioning to a sleep state, software cannot influence a system’s constant power

Dynamic Power: power consumption based on load

If most of the components in the system support low-power modes, then dynamic power consumption can be significantly controlled through the careful use of resources


Idle – Peak Ratio

Ratio of idle to peak power consumption:

estimate: 6/9 ≈ 0.66; observed: 9/11 ≈ 0.81

~75% of total system power consumption is completely independent of software-level activity on the platform

In comparison:

Intel Core i7-3960X: 3.3 GHz (3.9 GHz Turbo), 6 cores, 12 threads

Idle power: 62 W; peak power: 210 W

62/210 ≈ 0.295

At 0% utilization, 29.5% of peak power is consumed

FIGURE 9.1 Advertised and Measured Dynamic Power Range for CompuLab’s fit-PC2. The Advertised Power Range Covers 66% of Peak, whereas the Measured Value, When Accounting for Active I/O Devices, Covers 81% of the Observed Peak.
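The idle-to-peak ratios quoted on this slide can be reproduced with a few lines of Python. This is a minimal sketch: the function name is hypothetical, and the fit-PC2 figures are assumed to be in watts, which the slide does not state.

```python
def idle_to_peak_ratio(idle_watts: float, peak_watts: float) -> float:
    """Fraction of peak power that the system still draws while idle."""
    if peak_watts <= 0:
        raise ValueError("peak power must be positive")
    return idle_watts / peak_watts


# (idle, peak) pairs taken from the slide; units assumed to be watts.
systems = {
    "fit-PC2 (advertised)": (6, 9),
    "fit-PC2 (observed)": (9, 11),
    "Core i7-3960X": (62, 210),
}
for name, (idle, peak) in systems.items():
    print(f"{name}: {idle_to_peak_ratio(idle, peak):.3f}")
```

A lower ratio means a wider dynamic power range, and therefore more room for power management to have an effect.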


FIGURE Top: Power Consumption and Efficiency for a System With a Dynamic Power Range Down to 50% of Peak Performance. Power is Normalized to Peak Consumption. Efficiency is Calculated by Dividing Utilization by Normalized Power.

FIGURE Bottom: Power Consumption and Efficiency for a System With a Dynamic Power Range Down to 10% of Peak Performance.

Goal of Power Efficiency: high efficiency over entire operating range
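The efficiency definition in the caption (utilization divided by normalized power) can be sketched as follows. The linear power model between idle and peak is an assumption made for illustration, since the shape of the curves in the figures is not recoverable from the transcript. The function names are hypothetical.

```python
def normalized_power(utilization: float, idle_fraction: float) -> float:
    """Power as a fraction of peak, assuming it rises linearly from
    idle_fraction at 0% utilization to 1.0 at 100% utilization."""
    return idle_fraction + (1.0 - idle_fraction) * utilization


def efficiency(utilization: float, idle_fraction: float) -> float:
    """Efficiency as defined in the caption: utilization / normalized power."""
    return utilization / normalized_power(utilization, idle_fraction)


# Compare a system that idles at 50% of peak with one that idles at 10%.
for idle_fraction in (0.5, 0.1):
    row = ", ".join(f"{efficiency(u / 10, idle_fraction):.2f}"
                    for u in range(1, 11))
    print(f"idle at {idle_fraction:.0%} of peak: {row}")
```

Under this assumed model, the system with the wider dynamic range stays close to full efficiency over much more of its operating range, which is the goal stated on the slide.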


ACPI

ACPI (Advanced Configuration and Power Interface): platform-independent industry standard that defines the roles, responsibilities, and specific mechanisms provided to achieve reliable power management and system configuration

Developed through the cooperation of HP, Intel, Microsoft, Phoenix Technologies, and Toshiba

The interfaces that ACPI replaced, such as Plug and Play BIOS and Advanced Power Management (APM), were platform and operating system specific



ACPI continued…

The operating system is granted exclusive control over all system-level management tasks, including the boot sequence, device configuration, power management, and external event handling (such as thermal monitoring or power button presses)

ACPI Registers: well-defined locations that can be read and written to monitor and change the status of hardware resources

ACPI BIOS: This firmware manages system boot and transitions between sleep and active states.

ACPI Tables: describe the interfaces to the underlying hardware and represent the system description.

Written in a domain-specific language, ACPI Machine Language (AML). The operating system's ACPI driver includes an interpreter for AML.

These tables are provided by the ACPI BIOS


Idle vs. Sleep

Two dominant forms of user-facing power management

Sleep States: system appears to be powered off completely, while in fact some software may periodically execute to service external events, such as network packet arrivals.

First introduced with laptop use in mind

Idle Modes: take advantage of the variation in system utilization when the system is powered up

Transitions between idle modes are made in milliseconds to reduce active power consumption, without the involvement or awareness of the user


ACPI System States

Global System States (Gx States)

Describe the entire system

Transitions between these states are typically observed by the user

G0: Working

G1: Sleeping (suspend or “instant boot” mode)

G2 (S5): Soft Off

G3: Mechanical Off


ACPI System States Continued

Sleep States (Sx States)

Sleep states within the global sleep state G1

S1: All system context is maintained by hardware upon entry and thus provides the lowest waking latency

S2: Operating system is responsible for storing and restoring CPU and cache hierarchy context. Upon entry to S2, CPU and cache state are lost

S3: Powers down more internal units than S2, but otherwise is similar

S4: Writes all system state, including main memory, to nonvolatile storage

S5: Does not store system context.

Enables system to be powered on electronically (Wake on LAN)
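On Linux, the sleep states that a platform supports can be inspected through `/sys/power/state`. The sketch below maps the labels found there to the Sx names used above. The mapping table is an assumption based on common Linux convention: `mem` corresponds to S3 only when the platform's suspend mode is "deep", and `freeze` is a software-only idle state with no Sx equivalent. The function names are hypothetical.

```python
from pathlib import Path
from typing import List

# Conventional Linux label -> ACPI sleep state (an assumption, see above).
LINUX_TO_ACPI = {"standby": "S1", "mem": "S3", "disk": "S4"}


def supported_sleep_states(state_text: str) -> List[str]:
    """Translate the space-separated labels from /sys/power/state."""
    return [LINUX_TO_ACPI[word] for word in state_text.split()
            if word in LINUX_TO_ACPI]


def read_supported_sleep_states(path: str = "/sys/power/state") -> List[str]:
    """Return the supported Sx states, or an empty list if the file is absent."""
    state_file = Path(path)
    if not state_file.exists():
        return []
    return supported_sleep_states(state_file.read_text())


print(read_supported_sleep_states())
```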


ACPI System States Continued

Device Power States (Dx States)

Device class-specific approach to power management that allows devices of the same type to be treated in the same way

Device Classes: Audio, COM Port, Display, Input, Modem, Network, PC Card Controller, Storage

D0 (Fully On): fully powered up, fully operational state. Highest Power Consumption

D1: Initial sleep state for a device. Capable of waking itself or the entire system in response to an external event

D2: Incremental state beyond D1. Operates at a lower power and requires a greater waking latency than D1

D3hot: Saves a device-specific state so it can be awakened without a complete reboot

D3 (Off, or D3cold): Complete power down of the device. Device context not restorable


ACPI System States Continued

Processor Power States (Cx States)

ACPI allows processors to sleep while in the G0 working state. The Cx States only apply when the global state is G0.

C0: The processor executes instructions and operates normally

C1: Lowest transition latency CPU sleep state. Latency is so low that it is negligible and thus not a deciding factor in the operating system’s decision to transition to this state.

C2: Lower-power sleep state than C1. Operating system does consider the transition latency when deciding whether to transition to this state or another.

Transitions to both C1 and C2 are not apparent to software and do not alter system operation.

C3: Offers greater power reduction than C2 at the cost of an increased transition latency. Processor caches maintain their state but do not emit cache coherence traffic

C4: Optional additional power states included in ACPI revision 2.0. Even lower power consumption and greater transition latency. Vendor specific.
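The trade-off between power reduction and transition latency can be illustrated with a small selection routine. This is a sketch of the general idea only, not the policy of any particular operating system. The latency and power figures are hypothetical placeholders, and `pick_c_state` is an illustrative name.

```python
from typing import List, NamedTuple


class CState(NamedTuple):
    name: str
    exit_latency_us: int    # hypothetical wake-up latency
    relative_power: float   # hypothetical power relative to C0


# Ordered from shallowest to deepest; all figures are made up for illustration.
STATES: List[CState] = [
    CState("C1", 1, 0.70),
    CState("C2", 50, 0.40),
    CState("C3", 200, 0.15),
]


def pick_c_state(latency_budget_us: int,
                 states: List[CState] = STATES) -> CState:
    """Return the deepest state whose exit latency fits the budget.

    C1 is always eligible, reflecting its negligible transition latency.
    """
    chosen = states[0]
    for state in states[1:]:
        if state.exit_latency_us <= latency_budget_us:
            chosen = state
    return chosen


for budget in (0, 100, 1000):
    print(budget, pick_c_state(budget).name)
```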


ACPI System States Continued

Processor Performance States (Px States)

Within processor power state C0, ACPI defines a range of performance states that are intended to enable a fully working system to vary its power consumption and performance by operating at different voltage and frequency levels. The operating system explicitly controls transitions between these states. For multiprocessor systems, each processor must support the same number and type of processor performance states.

P0: Maximum performance, maximum power consumption state.

P1: Next highest performing processor performance state and is expected to have second-greatest power consumption

Pn: ACPI allows for a maximum of 16 distinct performance states.


More on Intel SpeedStep Technology

Features performance and power management through voltage and frequency control, thermal monitoring, and thermal management features

Provides a central software control mechanism through which the processor can manage different operating points

In an FSB-based system, processors drive seven output voltage identification (VID) pins to facilitate automatic selection of the processor voltage VCC from the motherboard voltage regulator

Each core in a multi-core processor has its own model-specific register (MSR) to control the VID value. However, all cores must operate at the same voltage and frequency.


Software Construction for Maximum Power Performance

Race to Sleep

Modern systems transition between power states in response to load

Software should be organized to complete all available useful work in a continuous batch and then transition to an idle state, avoiding unnecessary interruptions
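A toy energy model shows why batching helps. It is a sketch under stated assumptions: every wake-up costs a fixed transition time spent at active power without doing useful work, and all default wattages and durations are hypothetical. `period_energy_j` is an illustrative name.

```python
def period_energy_j(work_s: float, wakeups: int, period_s: float = 1.0,
                    active_w: float = 10.0, idle_w: float = 2.0,
                    transition_s: float = 0.005) -> float:
    """Energy in joules over one period for a fixed amount of work.

    Each wake-up adds transition_s of awake time that does no useful work.
    """
    awake_s = work_s + wakeups * transition_s
    if awake_s > period_s:
        raise ValueError("work and transitions do not fit in the period")
    return active_w * awake_s + idle_w * (period_s - awake_s)


# Same 0.2 s of work: one continuous batch versus 50 scattered wake-ups.
batched = period_energy_j(work_s=0.2, wakeups=1)
fragmented = period_energy_j(work_s=0.2, wakeups=50)
print(f"batched:    {batched:.2f} J")
print(f"fragmented: {fragmented:.2f} J")
```

In this model the work itself costs the same in both cases. The difference comes entirely from the extra transitions and the idle time they displace.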

Traditional software development tools emphasize code execution frequency.

Modern software development tools must also take into account how often a program interferes with transitions to low-power CPU states

Linux PowerTOP Tool

Open-source power profiling tool for Linux that reports the occupation frequency of processor sleep and performance states
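The Linux kernel exposes per-state idle counters through the cpuidle sysfs directory, which gives a rough view of the same kind of data. The sketch below reads those counters directly. It makes no claim about how PowerTOP itself is implemented, and `read_idle_states` is a hypothetical helper name.

```python
from pathlib import Path
from typing import Dict, Tuple

CPU0_CPUIDLE = "/sys/devices/system/cpu/cpu0/cpuidle"


def read_idle_states(cpuidle_dir: str = CPU0_CPUIDLE) -> Dict[str, Tuple[int, int]]:
    """Map each idle-state name to (entry count, total time in microseconds).

    Returns an empty dict if the directory is absent, for example on a
    non-Linux system or in a restricted container.
    """
    root = Path(cpuidle_dir)
    states: Dict[str, Tuple[int, int]] = {}
    if not root.is_dir():
        return states
    for state_dir in sorted(root.glob("state*")):
        name = (state_dir / "name").read_text().strip()
        usage = int((state_dir / "usage").read_text())
        time_us = int((state_dir / "time").read_text())
        states[name] = (usage, time_us)
    return states


for name, (usage, time_us) in read_idle_states().items():
    print(f"{name}: entered {usage} times, {time_us} us total")
```

Sampling these counters before and after running a workload shows how much of the interval the processor spent in each idle state.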


Evaluating Software and Systems with PowerTOP

PowerTOP can be used to measure and improve performance of code as it is written

The measurement harness is implemented in Python. To begin, the developer identifies the target code for measurement, referred to as the code-under-test (CUT)

Basic Operation of the Measurement Harness

Phase 1. Discover the maximum rate of execution of the CUT

Phase 2. Measure the power efficiency at the maximum execution rate

Phase 3. Measure the power efficiency as execution rate is scaled down from maximum

Two key parameters manipulated in Phase 3:

- total fraction of “sleep” time during the experimental observational period

- duration of an individual sleep interval
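The phases above can be sketched in Python. This is not the harness described in the chapter, whose code is not shown in the slides. Both function names are hypothetical, and the power measurement itself (Phases 2 and 3) is left out because it depends on external instrumentation. The sketch covers only rate discovery and the two Phase 3 parameters.

```python
import time
from typing import Callable, Tuple


def measure_max_rate(cut: Callable[[], None], duration_s: float = 1.0) -> float:
    """Phase 1: call the code-under-test back to back; return calls per second."""
    calls = 0
    start = time.perf_counter()
    while time.perf_counter() - start < duration_s:
        cut()
        calls += 1
    return calls / (time.perf_counter() - start)


def sleep_schedule(period_s: float, sleep_fraction: float,
                   sleep_interval_s: float) -> Tuple[int, float]:
    """Phase 3: turn the two parameters into a concrete schedule.

    Returns (number of sleep intervals, busy time between consecutive sleeps).
    """
    if not 0.0 <= sleep_fraction < 1.0 or sleep_interval_s <= 0:
        raise ValueError("need 0 <= sleep_fraction < 1 and a positive interval")
    total_sleep_s = period_s * sleep_fraction
    # The small epsilon guards against floating-point error in the division.
    intervals = int(total_sleep_s / sleep_interval_s + 1e-9)
    if intervals == 0:
        return 0, period_s
    busy_s = (period_s - intervals * sleep_interval_s) / intervals
    return intervals, busy_s


print(f"{measure_max_rate(lambda: None, duration_s=0.05):.0f} calls/s")
print(sleep_schedule(period_s=10.0, sleep_fraction=0.5, sleep_interval_s=0.1))
```

Holding the sleep fraction fixed while varying the interval length separates the effect of how much the system sleeps from how often it is woken.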


Concluding Remarks

System Design Considerations – “Everything’s a tradeoff”

Active Power Consumption – $P = C V^2 f$

Idle vs. Peak Ratio – decreasing constant power consumption is most favorable

Advanced Configuration and Power Interface (ACPI) – Standard developed by the industry to provide universal power management

Software techniques for measuring and managing power consumption – race to sleep