Multicore architectural decomposition methods for low-power symmetric and asymmetric multiprocessing

Mark Benson, Director of Software Strategy

Upload: mark-benson

Post on 07-Jan-2017


Page 1: Multi-Core Architectural Decomposition Methods for Low-Power Symmetric and Asymmetric Multi-Processing

Multicore architectural decomposition methods for low-power symmetric and asymmetric multiprocessing
Mark Benson
Director of Software Strategy

Page 2

Mark Benson, Director of Software Strategy, Logic PD

History of Logic PD
• 1960s: Founded as Polivka Logan
• 1980s: Added Mechanical Eng
• 1990s: Added Software, Electrical Eng
• 2000s: Added Products, Manufacturing

Products and Services
• Product Design
• Product Engineering
• Embedded Products
• Manufacturing

Industries
• Industrial, Medical, Aerospace, Military

Employees
• 130 design consultants
• 400 operations staff

Geographies
• Minneapolis, Boston, San Diego

Page 3

The story: low-power for multicore

• Moore’s Prophecy
• Here Be Dragons
• The Multicore Promise
• Amdahl’s Wet Blanket
• The New World
• Dynamic Power
• Scaling the Curve
• Moving the Curve
• Kindle Fire Case Study
• Respecting the Curve(s)

Part I: The Quest (Power AND Performance)

Part II: Destination (Dynamic Scaling)

Page 4

Mobile proliferation*

[Chart: US Population and US Wireless Subscriptions, in millions (0 to 350), 1985 to 2011]

* Source: CTIA; US Census Bureau

Page 5

Industrial products

Page 6

Medical devices

Page 7

Military radios

Page 8

Moore’s prophecy

* Source: http://www.gotw.ca/publications/concurrency-ddj.htm

Thermodynamic issues

Signal integrity issues

Multicore movement

Page 9

The promise of multicore
• Higher performance
• Lower power ($P_{\text{dynamic}} = C f V^2$)
• Heat is reduced and distributed more evenly

[Figure: power vs. performance curves for 1 core and n cores, annotated "Equivalent power at higher performance" or "Equivalent performance at lower power"]
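To make the dynamic power formula concrete, the sketch below evaluates P = C f V^2 for one core at full frequency and for two cores at half frequency. The capacitance and both voltages are invented for illustration (the slide gives no numbers), and the idea that a half-speed core can run at a lower voltage is the usual DVFS premise rather than a measurement.

```python
def dynamic_power(capacitance_f: float, freq_hz: float, voltage_v: float) -> float:
    """Dynamic (switching) power in watts: P = C * f * V^2."""
    return capacitance_f * freq_hz * voltage_v ** 2


# Hypothetical values, not taken from the slides.
C_EFF = 1e-9  # effective switched capacitance per core, in farads

one_core = dynamic_power(C_EFF, 600e6, 1.2)       # 1 core at 600 MHz, 1.2 V
two_cores = 2 * dynamic_power(C_EFF, 300e6, 1.0)  # 2 cores at 300 MHz, 1.0 V

print(f"1 core : {one_core:.3f} W")
print(f"2 cores: {two_cores:.3f} W")
```

Because voltage enters squared, even a modest voltage reduction at the lower frequency outweighs the cost of powering a second core in this toy model.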

Page 10

Amdahl’s wet blanket
• Parallelism has limits
• Heat is still an issue
• Resource contention challenges emerge

Maximum speedup: $1/(1-P)$, where $P$ is the fraction of the work that can run in parallel
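The limit on the slide is the many-core ceiling of Amdahl's law. A minimal sketch of the finite-core form, speedup = 1 / ((1 - P) + P / n), shows how quickly the curve flattens. The parallel fraction of 0.9 is an arbitrary example value, not a figure from the talk.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speedup on `cores` cores when `parallel_fraction` of the work parallelizes."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def amdahl_limit(parallel_fraction: float) -> float:
    """Upper bound on speedup as the core count grows without limit: 1 / (1 - P)."""
    return 1.0 / (1.0 - parallel_fraction)


P = 0.9  # example parallel fraction (assumption)
for n in (1, 2, 4, 8, 16):
    print(f"{n:2d} cores: {amdahl_speedup(P, n):.2f}x")
print(f"limit   : {amdahl_limit(P):.2f}x")
```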

Page 11

The new world

For low-power multicore, parallel software is necessary but not sufficient

Foundation: parallelize software to the max
• Refactor for parallelization and thread-safety
• Leverage SMP for more clock cycles
• Identify and resolve resource constraints

Complication: what about low power?

Page 12

The story: low-power for multicore

[Roadmap repeated from Page 3, with a "You Are Here" marker]

Page 13

Dynamic power
1. Scaling the curve
2. Moving the curve
3. Respecting the curve(s)

[Figure: power vs. performance curve. Labels: lightweight processing, heavyweight processing, dynamic scaling, not desirable]

Page 14

Scaling the curve: power states
• Processor provides: power states, DVFS
• You provide: operating points (DVFS policies)
• You provide: suspend peripheral coordination

[Figure: power vs. performance curve with power states (Off, Suspend, Running) and operating points (OPP1 to OPP6), where OPP = {f, V}. Other labels: "DVFS Engine", "Feeds"]
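One way to picture an operating-point policy is a table of {f, V} pairs plus a rule for choosing among them. The sketch below is a toy policy that picks the slowest operating point that still meets a requested frequency. The table values, the `OperatingPoint` type, and `select_opp` are all hypothetical; real OPP tables come from the SoC vendor, and real DVFS governors are considerably more involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    """An OPP is a {frequency, voltage} pair."""
    name: str
    freq_mhz: int
    voltage_v: float


# Hypothetical OPP table; not taken from the slides or any data sheet.
OPP_TABLE = [
    OperatingPoint("OPP1", 300, 0.95),
    OperatingPoint("OPP2", 600, 1.10),
    OperatingPoint("OPP3", 800, 1.25),
    OperatingPoint("OPP4", 1000, 1.35),
]


def select_opp(required_mhz: float) -> OperatingPoint:
    """Return the slowest OPP that meets the demand, or the fastest if none does."""
    for opp in sorted(OPP_TABLE, key=lambda o: o.freq_mhz):
        if opp.freq_mhz >= required_mhz:
            return opp
    return max(OPP_TABLE, key=lambda o: o.freq_mhz)


print(select_opp(450).name)
```

Choosing the slowest sufficient point keeps both frequency and voltage as low as the workload allows, which is where the V^2 term in dynamic power pays off.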

Page 15

Scaling the curve: wake time
• Deeper sleep consumes less power
• Deeper sleep takes longer to wake up
• Each design (HW + SW) creates a unique profile

Page 16

Scaling the curve: boot quickly

[Figure: two boot timelines labeled "Normal boot" and "Fast boot". Sequences shown: Xloader, Uboot, Kernel, Shell, GUI Framework; and Xloader, Kernel, Shell, GUI Framework. Other labels: Drivers, /dev, 50s, 30s, 10s]

[Figure: power vs. performance curve annotated "Spend more time here" and "Get here more quickly"]

Page 17

Moving the curve: voltage tuning
• Some SoCs have per-chip voltage calibrations
• Examples: SmartReflex™, power/clock gating
• These often require companion chips (PMIC)

[Figure: power vs. performance curves with voltage tuning off and voltage tuning on]

Goal: lower power at equivalent performance

Page 18

Respecting the curve(s)
• Applications processors have many curves
• Make sure your app uses the right curves
• Use dynamic scaling to your advantage

[Figure: power vs. performance curves for ARM, DSP, and ISP]

1. Image resizing, color conversion, AWB, AE, AF
2. Audio/video codecs, data processing
3. GUI, communications stacks
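"Using the right curve" amounts to routing each workload to the processing unit built for it. The sketch below is a simple lookup under the assumption that the three numbered workload groups correspond to the ISP, DSP, and ARM respectively; the slide text does not state the pairing explicitly, and the workload names and `pick_unit` helper are hypothetical.

```python
# Assumed pairing of workload types to processing units (not stated on the slide).
PREFERRED_UNIT = {
    "image_resize": "ISP",
    "color_conversion": "ISP",
    "auto_white_balance": "ISP",
    "audio_codec": "DSP",
    "video_codec": "DSP",
    "gui": "ARM",
    "network_stack": "ARM",
}


def pick_unit(workload: str) -> str:
    """Return the preferred unit for a workload, defaulting to the general-purpose ARM core."""
    return PREFERRED_UNIT.get(workload, "ARM")


for job in ("video_codec", "auto_white_balance", "gui", "unknown_job"):
    print(f"{job:20s} -> {pick_unit(job)}")
```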

Page 19

Kindle Fire Case Study

Page 20

Case study: Kindle Fire
• TI OMAP 4430, 7” 600x1024 display
• Wi-Fi 802.11n, USB 2.0, 8 GB of storage
• Android Gingerbread 2.3, 4400 mAh battery

Page 21

Kindle Fire: dynamic scaling
• 2 vs 1 ARM Cortex-A9s
• System under full load (ANTuTu Benchmark*)

Cores: 1 @ 600 MHz, Power: 2158 mW, Temp: 43.5 °C
Cores: 2 @ 300 MHz, Power: 1930 mW, Temp: 42 °C

[Chart: power (mW) vs. performance (DMIPS) for 1 core (load) and 2 cores (load)]

Under load, multicore performs better

* ANTuTu Benchmark http://www.antutulabs.com

Page 22

Kindle Fire: dynamic scaling
• When idling, the opposite is true
• During idle, it’s more efficient to turn off a core

Cores: 1 @ 600 MHz, Power: 1611 mW, Temp: 38.5 °C
Cores: 2 @ 300 MHz, Power: 1729 mW, Temp: 40 °C

[Chart: power (mW) vs. performance (DMIPS) for 1 core (idle) and 2 cores (idle)]

When idling, single core performs better
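The load and idle measurements from the two dynamic scaling slides can be turned into a tiny decision rule: for each system state, pick the core configuration that drew less power. The power figures below are the ones reported on the slides; the `best_core_count` helper and the two-state model are illustrative assumptions, not the Kindle Fire's actual policy.

```python
# Measured power in mW, keyed by (system state, active cores), from the slides.
# 1 core ran at 600 MHz; 2 cores ran at 300 MHz each.
MEASURED_POWER_MW = {
    ("load", 1): 2158,
    ("load", 2): 1930,
    ("idle", 1): 1611,
    ("idle", 2): 1729,
}


def best_core_count(state: str) -> int:
    """Return the core count with the lowest measured power for the given state."""
    return min((1, 2), key=lambda cores: MEASURED_POWER_MW[(state, cores)])


for state in ("load", "idle"):
    cores = best_core_count(state)
    other = 2 if cores == 1 else 1
    saved = MEASURED_POWER_MW[(state, other)] - MEASURED_POWER_MW[(state, cores)]
    print(f"{state}: use {cores} core(s), saves {saved} mW")
```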

Page 23

Kindle Fire: voltage tuning
• Adjust voltage based on per-chip calibration
• Benefits occur across all power states

Freq: 800 MHz, SmartReflex™: off, Power: 3097 mW, Temp: 50 °C
Freq: 800 MHz, SmartReflex™: on, Power: 2690 mW, Temp: 45 °C

[Chart: power (mW) vs. performance (MHz) for SR off (load) and SR on (load)]

SmartReflex™ moves the curve
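For a sense of scale, the two 800 MHz measurements above can be compared directly. This is plain arithmetic on the reported figures; the variable names are illustrative.

```python
# Reported system power at 800 MHz under load, in mW.
power_sr_off_mw = 3097
power_sr_on_mw = 2690

saved_mw = power_sr_off_mw - power_sr_on_mw
saved_pct = 100.0 * saved_mw / power_sr_off_mw

print(f"SmartReflex saves {saved_mw} mW ({saved_pct:.1f}% of system power)")
```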

Page 24

Kindle Fire: wake time

Off: 0 mW, 25 °C, 52000 ms wake
Suspend: 62 mW, 31 °C, 180 ms wake
Idle: 1721 mW, 34.2 °C, 0 ms wake
Load: 3431 mW, 51.1 °C, 0 ms wake

[Chart: power (mW) and wake time (ms) per power state: Off, Suspend, Idle, Load]

Suspend offers significant power savings with a reasonable wake time
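The trade-off between sleep depth and wake time can be expressed as a small selection rule: given how long the user can tolerate waiting, choose the lowest-power resting state that wakes in time. The power and wake figures are the ones reported on this slide; the `deepest_state` helper and the latency budgets are hypothetical.

```python
# (state, power in mW, wake time in ms) as reported for the Kindle Fire.
RESTING_STATES = [
    ("Off", 0, 52000),
    ("Suspend", 62, 180),
    ("Idle", 1721, 0),
]


def deepest_state(max_wake_ms: int) -> str:
    """Return the lowest-power state whose wake time fits the latency budget."""
    candidates = [s for s in RESTING_STATES if s[2] <= max_wake_ms]
    name, _power_mw, _wake_ms = min(candidates, key=lambda s: s[1])
    return name


for budget_ms in (0, 500, 60000):
    print(f"budget {budget_ms:5d} ms -> {deepest_state(budget_ms)}")
```

With a budget of a few hundred milliseconds the rule lands on Suspend, which matches the slide's conclusion.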

Page 25

Kindle Fire: summary
• Consumer products use dynamic scaling
• You should too
• Processor choice is important
• Software is key

Page 26

The story: low-power for multicore

[Roadmap repeated from Page 3]

Page 27

It’s all about the user

Page 28

THANK YOU
Mark Benson, Director of Software Strategy, Logic PD

[email protected]