todd-islped-keynote-postweb.eecs.umich.edu/~taustin/papers/islped-keynote-final.pdf · title:...

17
1 On the Rules of Low-Power Design (and How to Break Them) On the Rules of Low-Power Design (and How to Break Them) Prof. Todd Austin Advanced Computer Architecture Lab University of Michigan [email protected] Once upon a time… Once upon a time…

Upload: others

Post on 05-Oct-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Todd-ISLPED-keynote-postweb.eecs.umich.edu/~taustin/papers/ISLPED-keynote-final.pdf · Title: Microsoft PowerPoint - Todd-ISLPED-keynote-post.ppt Author: austin Created Date: 8/29/2008

1

On the Rules of Low-Power Design(and How to Break Them)

On the Rules of Low-Power Design(and How to Break Them)

Prof. Todd Austin

Advanced Computer Architecture LabUniversity of Michigan

[email protected]

Once upon a time…Once upon a time…

Page 2: Todd-ISLPED-keynote-postweb.eecs.umich.edu/~taustin/papers/ISLPED-keynote-final.pdf · Title: Microsoft PowerPoint - Todd-ISLPED-keynote-post.ppt Author: austin Created Date: 8/29/2008

2

Rules of Low-Power DesignRules of Low-Power Design

1. Minimize switching activity2. Design for lower load capacitance3. Reduce frequency4. Reduce leakage

and the most important of all:5.5. Decrease supply voltage!Decrease supply voltage!

Process margin

Ambient margin

Critical voltage(determined by

critical path)

0.8vVth

P = aCV2f + VIleakaaCC ff IIleakleakVVVV221.2v

Noise margin

Overclockers Break the RulesOverclockers Break the Rules

Process margin

Ambient margin

0.8vVth

1.2vNoise margin

Page 3: Todd-ISLPED-keynote-postweb.eecs.umich.edu/~taustin/papers/ISLPED-keynote-final.pdf · Title: Microsoft PowerPoint - Todd-ISLPED-keynote-post.ppt Author: austin Created Date: 8/29/2008

3

Goals of This PresentationGoals of This Presentation

Review some of the rules of low-power design

Show how clever designs can break these rulesRazor resilient circuitsSubliminal subthreshold voltage processor

Highlight the benefits of taking a rule-breaking approach to technical research

Investigating OverclockingInvestigating Overclocking

Page 4: Todd-ISLPED-keynote-postweb.eecs.umich.edu/~taustin/papers/ISLPED-keynote-final.pdf · Title: Microsoft PowerPoint - Todd-ISLPED-keynote-post.ppt Author: austin Created Date: 8/29/2008

4

Two Slow Pipelines Check a Fast PipelineTwo Slow Pipelines Check a Fast Pipeline

48-b

it LF

SR

48-b

it LF

SR

48-b

it LF

SR

48-b

it LF

SR

X

X

X

clk/2

clk/2

clk clk

clk/2

clk/2

clk

!=

40-b

it E

rror

Cou

nter

40-b

it E

rror

Cou

nter

Slow Pipeline A

Slow Pipeline B

Fast Pipeline

clk/2

18

18

36

36

36

18x18

18x18

18x18

stabilize

90 MHz

45 MHz

45 MHz

18x18-bit Multiplier Block at 90 MHz and 27 C

0.0000000%0.0000001%0.0000010%0.0000100%0.0001000%0.0010000%0.0100000%0.1000000%1.0000000%10.0000000%100.0000000%

1.141.181.221.261.301.341.381.421.461.501.541.581.621.661.701.741.78Supply Voltage (V)

Erro

r rat

e

Zero-margin@ 1.54 V

Environmental-margin@ 1.69 V

Observation: Voltage Margins Are PlentifulObservation: Voltage Margins Are Plentiful

35% energy savings with 1.3% errors

20% energy savings

One error every 20 seconds!

Margin grows if a few (~1%) errors can be tolerated

Page 5: Todd-ISLPED-keynote-postweb.eecs.umich.edu/~taustin/papers/ISLPED-keynote-final.pdf · Title: Microsoft PowerPoint - Todd-ISLPED-keynote-post.ppt Author: austin Created Date: 8/29/2008

5

Razor Resilient CircuitsRazor Resilient Circuits

Main

FF

Shad

ow La

tch

Main

FF

clk clk

clk_del

5

49 MEM39

9

Double-sampling metastability tolerant latches detect timing errors

Second sample is correct-by-design

Microarchitectural support restores program stateTiming errors treated like branch mispredictions

recover

IF

Razo

r FF ID

Razo

r FF EX

Razo

r FF MEM

(read-only)WB

(reg/mem)

error bubble

recover recover

Razo

r FF

Stab

ilizer

FF

PC

recover

flushID

bubbleerror bubble

flushID

error bubble

flushIDFlushControl

flushID

error

Cycle: 0

inst1inst2inst3inst4inst5

123456

inst6

Distributed Pipeline Recovery

inst2inst7inst8

789

inst3inst4

Builds on existing branch prediction frameworkMultiple cycle penalty for timing failureScalable design as all communication is local

Page 6: Todd-ISLPED-keynote-postweb.eecs.umich.edu/~taustin/papers/ISLPED-keynote-final.pdf · Title: Microsoft PowerPoint - Todd-ISLPED-keynote-post.ppt Author: austin Created Date: 8/29/2008

6

Razor Prototype DesignRazor Prototype Design

Six stage 64-bit Alpha pipeline200MHz in 0.18mm @ 1.8Vtunable via sw from 200-50MHz, 1.8-1.1V

32-entry, 3-port RF, 8K I-Cache/8K D-CacheBranch-not-taken branch predictorFull scan capability

Razor overhead:192 Razor FF out of 2408 (9%)Error-free power overhead:

Razor flip-flops: < 1%Short path buffer: 2.1%

Recovery power overhead:18x an inst, for pipeline recovery

D-Cache

IF ID EX

ME

M

WB

Register FileI-Cache

3.3 m

m

3 mm

Razor Prototype TestbedRazor Prototype Testbed

Page 7: Todd-ISLPED-keynote-postweb.eecs.umich.edu/~taustin/papers/ISLPED-keynote-final.pdf · Title: Microsoft PowerPoint - Todd-ISLPED-keynote-post.ppt Author: austin Created Date: 8/29/2008

7

Razor-Based Dynamic Voltage ScalingRazor-Based Dynamic Voltage Scaling

Eref

VoltageControl

Function Σ...

Pipeline

reset

Vdd

Ediff = Eref - Esample

-

EsampleVoltageRegulator

Edifferror

signals

Current design utilizes a very simple proportionalcontrol function

Control algorithm implemented in software

20 40 60 80 100 120 1400123456789

10

1.481.521.561.601.641.681.721.761.80

Example Voltage Controller ResponseExample Voltage Controller Response

Two minute snapshot of a 15 minute run

Con

trol

ler O

utpu

t Vol

tage

(V)

Perc

enta

ge E

rror

Rat

e

Time (Seconds)01

432

56

910

78

Page 8: Todd-ISLPED-keynote-postweb.eecs.umich.edu/~taustin/papers/ISLPED-keynote-final.pdf · Title: Microsoft PowerPoint - Todd-ISLPED-keynote-post.ppt Author: austin Created Date: 8/29/2008

8

Effects of Razor DVSEffects of Razor DVS

Decreasing Supply Voltage

Energy

Energy ofPipeline

Recovery,Erecovery

Total Energy,Etotal = Eproc + Erecovery

Optimal Etotal

PipelineThroughput

IPC

Energy of Processorw/o Razor Support

1%

50%

Energy of ProcessorOperations, Eproc

Razor Also Improves YieldRazor Also Improves Yield

1.4 1.5 1.6 1.7 1.8

1.4

1.5

1.6

1.7

1.8 Chips Linear Fit y=0.78685x + 0.22117

Voltage at First Failure

Volta

ge a

t 0.1

%Er

ror R

ate

Page 9: Todd-ISLPED-keynote-postweb.eecs.umich.edu/~taustin/papers/ISLPED-keynote-final.pdf · Title: Microsoft PowerPoint - Todd-ISLPED-keynote-post.ppt Author: austin Created Date: 8/29/2008

9

0.8v

How Razor Breaks the RulesHow Razor Breaks the Rules

Traditional worst-case design techniques must observe margin rules for reliable operation

Incorporating timing-error correction mechanisms allow margins to be erased

Infrequent use of critical paths allow for even deeper cuts in Vdd

Process margin

Ambient margin

0.8vVth

1.2vNoise margin

0.8v

Back to the RulesBack to the Rules

1. Minimize switching activity2. Design for lower load capacitance3. Reduce frequency4. Reduce leakage

and the most important of all:5.5. Decrease supply voltage!Decrease supply voltage!

Critical voltage(determined by

critical path)

0.8vVth

P = aCV2f + VIleakaaCC ff IIleakleakVVVV22

Process margin

Ambient margin

1.2vNoise margin

Page 10: Todd-ISLPED-keynote-postweb.eecs.umich.edu/~taustin/papers/ISLPED-keynote-final.pdf · Title: Microsoft PowerPoint - Todd-ISLPED-keynote-post.ppt Author: austin Created Date: 8/29/2008

10

Subthreshold Circuits Break The RulesSubthreshold Circuits Break The Rules

Static logic still works below VthDifferences in Ileak continue to (dis)charge outputsBut diminished Ion/Ioff results in big delays

Approach works if the apps are not too demanding

P

N

OUTIN P

N

OUTIN1.2V

0V

1.2V

0V

0.2V

0V 0V

0.2V

Superthreshold Subthreshold

Sensing Applications

Security

Environmental Industrial

Biomedical

Page 11: Todd-ISLPED-keynote-postweb.eecs.umich.edu/~taustin/papers/ISLPED-keynote-final.pdf · Title: Microsoft PowerPoint - Todd-ISLPED-keynote-post.ppt Author: austin Created Date: 8/29/2008

11

Sensor Processing Data RatesSensor Processing Data Rates

Sensing Communication

Computation

Power Supply

Storage

Sensor Processor

Page 12: Todd-ISLPED-keynote-postweb.eecs.umich.edu/~taustin/papers/ISLPED-keynote-final.pdf · Title: Microsoft PowerPoint - Todd-ISLPED-keynote-post.ppt Author: austin Created Date: 8/29/2008

12

Sensing Performance Demands are LowSensing Performance Demands are Low

2965.01 3943.478036.77 8296.37

1.00

10.00

100.00

1000.00

10000.00

Speed (Hz)

Voltage (V)

Platform

325M250M133M100M

1.21.21.21.2

ARM 1020TARM 920TARM 7TDMIARM 720T

xRT:

# tim

es fa

ster t

han r

eal-ti

me

Fast Growing Leakage Complicates DesignFast Growing Leakage Complicates Design

Energy per Instruction

Cycles per Instruction

Energy per Cycle

Einst = Ecycle CPI

Activity factor - average number of transistor switches per transistor per cycle

Total circuit capacitance

Supply Voltage Leakage current

Clock period

Ecycle = N(½αCsVdd + VddIleaktclk2

Page 13: Todd-ISLPED-keynote-postweb.eecs.umich.edu/~taustin/papers/ISLPED-keynote-final.pdf · Title: Microsoft PowerPoint - Todd-ISLPED-keynote-post.ppt Author: austin Created Date: 8/29/2008

13

Fast Growing Leakage Complicates DesignFast Growing Leakage Complicates Design

Activity factor - average number of transistor switches per transistor per cycle

Total circuit capacitance

Supply Voltage Leakage current

Clock period

Ecycle = N(½αCsVdd + VddIleaktclk2

Impact of voltage reduction

⇓ quad.

⇓ quad.

Edyn

???

⇓ quad.

Ecycle

⇑ ~exp.⇑ exp.⇓ linearSubthreshold

~const.⇑ linear⇓ linearSuperthreshold

EleaktclkIleak

Tension

Fast Growing Leakage Complicates DesignFast Growing Leakage Complicates Design

Impact of voltage reduction

⇓ quad.

⇓ quad.

Edyn

???

⇓ quad.

Ecycle

⇑ ~exp.⇑ exp.⇓ linearSubthreshold

~const.⇑ linear⇓ linearSuperthreshold

EleaktclkIleak

Tension

Page 14: Todd-ISLPED-keynote-postweb.eecs.umich.edu/~taustin/papers/ISLPED-keynote-final.pdf · Title: Microsoft PowerPoint - Todd-ISLPED-keynote-post.ppt Author: austin Created Date: 8/29/2008

14

Lessons from Architectural StudiesLessons from Architectural Studies

To reduce Energy per instructionMinimize CPI

To reduce Vmin and energy per cycleMaximize Transistor utility

To reduce leakage energy per cycleMinimize area

To minimize energy at subthreshold voltages, architects must:

Winning designs tend to be compromising designs that balance area, transistor utility and CPIMemory comprises the single largest factor ofleakage energy, therefore, efficient designs must reduce memory storage requirements

Subliminal Architectural OverviewSubliminal Architectural Overview

Imem4x16x2x12

Dmem128x8

Pref

etch

Buf

fer

2x2x

12

RegisterFile

Scheduler

32-bitTimer

PageControl

OpAControl

OpBControl

μOperationDecoder

RegisterWrite

Control

JumpControl

ALU

IF/ID Stage EX/MEM Stage WB Stage

FlagControl

Carry

FetchControl

ExternalInterrupts

Zero

8

8

8

8

12

24

8

8

Page 15: Todd-ISLPED-keynote-postweb.eecs.umich.edu/~taustin/papers/ISLPED-keynote-final.pdf · Title: Microsoft PowerPoint - Todd-ISLPED-keynote-post.ppt Author: austin Created Date: 8/29/2008

15

Subliminal processors

Large solar cell

Solar cell for adders

level converter array

Discrete adders

Mux-based memories

Custom memories

Solar cell for processor

Discrete cells Solar cell for discrete cells

Test module

Test memory

Level converter array Subliminal processors

Large solar cell

Solar cell for adders

level converter array

Discrete adders

Mux-based memories

Custom memories

Solar cell for processor

Discrete cells Solar cell for discrete cells

Test module

Test memory

Level converter array

First Subliminal ChipFirst Subliminal Chip

0.01 0.1 1 1002468

1012141618202224

Ener

gy/In

st (p

J)

MIPS

Subliminal(Michigan)

0.01 0.1 1 1002468

1012141618202224

Ener

gy/In

st (p

J)

MIPS

0.85pJ/[email protected] 4MIPS4

Hempstead(Harvard)

CleverDust(Berkeley)

SNAP/LE(Cornell)Hempstead

(Harvard)

SNAP/LE(Cornell)

Subliminal(Michigan)

2.25pJ/Inst@1MIPS1

Pareto Analysis of Sensor Network ProcessorsPareto Analysis of Sensor Network Processors

Page 16: Todd-ISLPED-keynote-postweb.eecs.umich.edu/~taustin/papers/ISLPED-keynote-final.pdf · Title: Microsoft PowerPoint - Todd-ISLPED-keynote-post.ppt Author: austin Created Date: 8/29/2008

16

How Subliminal Breaks the RulesHow Subliminal Breaks the Rules

Traditional circuit design relies an transistor switching to perform computation

Static logic circuits continue to operate below Vth by modulating leakage currents

Approach lends itself to low-demand sensor apps, as long as care is taken to build an efficient processor

What I Really Learned…What I Really Learned…

A rule-breaking approach to technical research is effective and engaging

You will find yourself on very fertile ground“It is that which everyone knows is certainly true,that is indeed false.”“The early bird gets the worm.”“If you are not failing some of the time, you are not trying hard enough.”

You will more fully engage your colleaguesOne half will think crazy idea will never workOne half will be intrigued (with your crazy idea)

Page 17: Todd-ISLPED-keynote-postweb.eecs.umich.edu/~taustin/papers/ISLPED-keynote-final.pdf · Title: Microsoft PowerPoint - Todd-ISLPED-keynote-post.ppt Author: austin Created Date: 8/29/2008

17

QuestionsQuestions

??

??

?

? ?

? ?

?

??