EE241 - Spring 2013
Advanced Digital Integrated Circuits
Lecture 17: Power-Performance Tradeoffs
Announcements
Homework 3 due on Monday
Quiz #3 on Monday
Reading
Markovic et al., Methods for true energy-performance optimization, IEEE Journal of Solid-State Circuits, vol. 39, no. 8, pp. 1281-1293, August 2004.
Chandrakasan, Sheng, and Brodersen, Low-power CMOS digital design, IEEE Journal of Solid-State Circuits, vol. 27, no. 4, pp. 473-484, Apr. 1992.
Outline
Last lecture
  Flip-flops
  Power and energy basics
This lecture
  Power-performance tradeoffs at circuit level
  Tradeoffs through supply voltage
5. Low Power Design
C. Power-Performance Tradeoffs at Circuit Level
Alpha-power based delay model
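$t_{p,i} = \frac{K_d V_{DD}}{(V_{DD} - V_{Th})^{\alpha_d}} \left(1 + \frac{C_{L,i}}{C_{in,i}}\right)$
In terms of transistor widths:
$t_{p,i} = \frac{K_d V_{DD}}{(V_{DD} - V_{Th})^{\alpha_d}} \left(1 + \frac{W_{L,i}}{W_{in,i}}\right)$
[Figure: chain of N inverters (stages 1, 2, ..., N) between In and Out, driving load C_L]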
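A minimal sketch of the alpha-power delay model, assuming each stage delay has the form K_d*VDD/(VDD - VTh)^alpha_d * (1 + C_L/C_in). The function names, the default alpha_d = 1.3, and the unit-free capacitance values are illustrative assumptions, not values taken from the lecture.

```python
def stage_delay(c_in, c_load, vdd, vth, k_d=1.0, alpha_d=1.3):
    """Delay of one stage: K_d*VDD/(VDD-VTh)^alpha_d * (1 + C_L/C_in)."""
    if vdd <= vth:
        raise ValueError("model is only meaningful for VDD > VTh")
    drive = k_d * vdd / (vdd - vth) ** alpha_d
    return drive * (1.0 + c_load / c_in)


def chain_delay(caps, c_load, vdd, vth, **kw):
    """Total delay D = sum of stage delays; each stage drives the next stage's input."""
    loads = list(caps[1:]) + [c_load]
    return sum(stage_delay(c, l, vdd, vth, **kw) for c, l in zip(caps, loads))


if __name__ == "__main__":
    caps = [1.0, 4.0, 16.0]  # assumed input capacitance per stage (arbitrary units)
    print(chain_delay(caps, 64.0, vdd=1.0, vth=0.3))
    print(chain_delay(caps, 64.0, vdd=0.7, vth=0.3))  # lower supply gives a larger delay
```

Lowering vdd toward vth makes the drive term grow quickly, which is the delay penalty the later supply-scaling slides trade against energy.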
Energy models
♦ Switching
$E_{Sw} = \alpha_{0 \to 1} (C_{L,i} + C_{int,i}) V_{DD}^2$
♦ Leakage
$E_{Lk} = W_{in} I_0 \, e^{-(V_{Th} - \gamma V_{DD}) / (n V_t)} \, V_{DD} D$
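The two energy components can be sketched as below, assuming switching energy alpha*(C_L + C_int)*VDD^2 and leakage energy W_in*I_0*exp(-(VTh - gamma*VDD)/(n*V_t))*VDD*D. The DIBL coefficient gamma and the default values for gamma, n, and V_t are assumptions chosen only for illustration.

```python
import math


def switching_energy(alpha, c_load, c_int, vdd):
    """E_Sw = alpha * (C_L + C_int) * VDD^2."""
    return alpha * (c_load + c_int) * vdd ** 2


def leakage_energy(w_in, i0, vth, vdd, delay, gamma=0.1, n=1.5, v_t=0.026):
    """E_Lk = W_in * I_0 * exp(-(VTh - gamma*VDD) / (n*V_t)) * VDD * D."""
    return w_in * i0 * math.exp(-(vth - gamma * vdd) / (n * v_t)) * vdd * delay


if __name__ == "__main__":
    print(switching_energy(0.5, 2.0, 1.0, 1.0))
    # Raising VTh by n*V_t*ln(10) (about 90 mV here) cuts leakage energy tenfold.
    step = 1.5 * 0.026 * math.log(10)
    print(leakage_energy(1.0, 1.0, 0.3, 1.0, 1.0) / leakage_energy(1.0, 1.0, 0.3 + step, 1.0, 1.0))
```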
Sizing, Supply, Threshold Optimization
Transistor sizing can yield large power savings with small delay penalties
Gate sizing
Beta-ratio adjustments
Stack resizing
Supply voltage affects both active and leakage energy
Threshold voltage affects primarily the leakage
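Apply to Sizing of an Inverter Chain
[Figure: chain of N inverters (stages 1, 2, ..., N) between In and Out, driving load C_L]
Unconstrained energy: find min $D = \sum_i t_{p,i}$
Constrained energy: find min D, under $E < E_{max}$
Where $E = \sum_i e_i$
Unconstrained optimum: $C_{gin,j} = \sqrt{C_{gin,j-1} C_{gin,j+1}}$, or equivalently $W_j = \sqrt{W_{j-1} W_{j+1}}$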
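For the unconstrained case, the geometric-mean condition W_j = sqrt(W_{j-1} * W_{j+1}) implies a constant stage ratio. The sketch below builds such a chain in closed form; the function name and the choice to express the load as an equivalent width are assumptions made for illustration.

```python
def taper_widths(w_first, w_load, n_stages):
    """Widths of an N-stage chain with constant fanout f = (W_load / W_first)^(1/N).

    Returns N + 1 values: the N stage widths followed by the load as a width.
    """
    f = (w_load / w_first) ** (1.0 / n_stages)
    return [w_first * f ** j for j in range(n_stages + 1)]


if __name__ == "__main__":
    widths = taper_widths(1.0, 64.0, 3)
    print(widths)
    # Every interior width is the geometric mean of its two neighbors.
    for j in range(1, len(widths) - 1):
        print(widths[j] ** 2, widths[j - 1] * widths[j + 1])
```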
Constrained Optimization
Find min(D) subject to E = Emax
Constrained function minimization
E.g. Lagrange multipliers
Can solve analytically for x = Wj, VDD, VTh
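$\Lambda(x) = D(x) + \lambda (E(x) - E_{max})$
$\frac{\partial \Lambda}{\partial x} = 0$
Or dual:
$\Lambda(x) = E(x) + \lambda (D(x) - D_{max})$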
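As a hedged worked step that the slide does not spell out: setting the derivative of the Lagrangian D(x) + lambda*(E(x) - E_max) to zero for each tuning variable x_k (W_j, VDD, VTh) gives the same energy-to-delay sensitivity for every variable. The symbol S_k is introduced here only for illustration.

```latex
\begin{align*}
\frac{\partial}{\partial x_k}\Bigl[D(x) + \lambda\,\bigl(E(x) - E_{max}\bigr)\Bigr] = 0
\;&\Longrightarrow\;
\frac{\partial D}{\partial x_k} + \lambda\,\frac{\partial E}{\partial x_k} = 0 \\
S_k \equiv -\frac{\partial E/\partial x_k}{\partial D/\partial x_k} &= \frac{1}{\lambda}
\quad \text{for every } k
\end{align*}
```

At a constrained optimum, then, the sizing, supply, and threshold sensitivities are all equal to one another.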
Inverter Chain: Sizing Optimization
♦ Variable taper achieves minimum energy
♦ Reduce number of stages at large $d_{inc}$ (delay increase over the minimum)
[Ma, Franzon, IEEE JSSC, 9/94]
[Figure: effective fanout f versus stage (1 to 7), for $d_{inc}$ = 0%, 1%, 10%, 30%, 50%; labels nom and opt]
$W_j = \sqrt{\frac{W_{j-1} W_{j+1}}{1 + \mu W_{j-1}}}$, where $\mu$ depends on K, $V_{DD}^2$, $\tau_{nom}$ and the sizing sensitivity $S_W$
$S_W \propto \frac{e_j}{f_j - f_{j-1}}$
Stojanovic, ICCAD’02
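A minimal sketch of variable taper, assuming the sizing rule W_j = sqrt(W_{j-1} * W_{j+1} / (1 + mu * W_{j-1})) is relaxed stage by stage with the first stage and the load held fixed. Here mu is treated as a free non-negative knob (an assumption; it is not computed from the energy constraint), and mu = 0 recovers the constant-fanout chain.

```python
def size_chain_mu(w_first, w_load, n_stages, mu, sweeps=500):
    """Relax W_j = sqrt(W_{j-1} * W_{j+1} / (1 + mu * W_{j-1})) over the interior stages."""
    # widths[0] is the fixed first stage; widths[-1] is the load expressed as a width.
    widths = [w_first] * n_stages + [w_load]
    for _ in range(sweeps):
        for j in range(1, n_stages):
            widths[j] = (widths[j - 1] * widths[j + 1] / (1.0 + mu * widths[j - 1])) ** 0.5
    return widths


def fanouts(widths):
    """Per-stage fanout W_{j+1} / W_j."""
    return [b / a for a, b in zip(widths, widths[1:])]


if __name__ == "__main__":
    for mu in (0.0, 0.1):
        w = size_chain_mu(1.0, 64.0, 3, mu)
        print(mu, [round(f, 2) for f in fanouts(w)], round(sum(w[1:-1]), 2))
```

With mu > 0 the fanout grows toward the output and the interior stages shrink, so total width (a proxy for switching energy) drops at the cost of delay.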
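Sensitivity to Sizing and Supply
♦ Gate sizing ($W_i$)
$-\frac{\partial E_{sw} / \partial W_j}{\partial D / \partial W_j} = \frac{e_j}{\tau_{nom} (f_j - f_{j-1})}$
$\to \infty$ for equal $f_{eff}$ ($D_{min}$)
♦ Supply voltage ($V_{DD}$)
$-\frac{\partial E_{sw} / \partial V_{DD}}{\partial D / \partial V_{DD}} = \frac{E_{sw}}{D} \cdot \frac{2 (1 - x_v)}{\alpha_d - 1 + x_v}$
$x_v = (V_{Th} + \Delta V_{Th}) / V_{DD}$
[Figure: Sens(Vdd) versus Vdd, with Vth and Vdd^nom marked on the axis]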
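The supply sensitivity can be evaluated directly, assuming the form (E_sw/D) * 2*(1 - x_v)/(alpha_d - 1 + x_v) with x_v = (VTh + dVTh)/VDD. The function name and the default alpha_d = 1.3 are assumptions for illustration.

```python
def supply_sensitivity(e_sw, delay, vdd, vth, dvth=0.0, alpha_d=1.3):
    """-(dE_sw/dVDD)/(dD/dVDD) = (E_sw/D) * 2*(1 - x_v) / (alpha_d - 1 + x_v)."""
    x_v = (vth + dvth) / vdd
    return (e_sw / delay) * 2.0 * (1.0 - x_v) / (alpha_d - 1.0 + x_v)


if __name__ == "__main__":
    # With E_sw/D held at 1, the voltage-dependent factor shrinks as VDD approaches VTh.
    for vdd in (1.0, 0.7, 0.5, 0.3):
        print(vdd, supply_sensitivity(1.0, 1.0, vdd, vth=0.3))
```

The factor falls to zero at VDD = VTh (x_v = 1), so supply reduction buys progressively less energy per unit of added delay as the supply approaches the threshold.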
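Sensitivity to Vth
♦ Threshold voltage ($V_{th}$)
$-\frac{\partial E / \partial (\Delta V_{Th})}{\partial D / \partial (\Delta V_{Th})} = P_{Lk} \left( \frac{V_{DD} - V_{Th} - \Delta V_{Th}}{\alpha_d n V_t} - 1 \right)$
Low initial leakage ⇒ speedup comes for “free”
[Figure: Sens(Vth) versus Vth, from 0 to Vth^nom]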
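A sketch of the threshold sensitivity, assuming the form P_Lk * ((VDD - VTh - dVTh)/(alpha_d * n * V_t) - 1). The leakage-power helper, its unit prefactors, and the defaults for alpha_d, n, and V_t are illustrative assumptions.

```python
import math


def leakage_power(vdd, vth, dvth=0.0, w_in=1.0, i0=1.0, gamma=0.0, n=1.5, v_t=0.026):
    """P_Lk = W_in * I_0 * exp(-(VTh + dVTh - gamma*VDD) / (n*V_t)) * VDD."""
    return w_in * i0 * math.exp(-(vth + dvth - gamma * vdd) / (n * v_t)) * vdd


def threshold_sensitivity(p_leak, vdd, vth, dvth=0.0, alpha_d=1.3, n=1.5, v_t=0.026):
    """-(dE/d(dVTh))/(dD/d(dVTh)) = P_Lk * ((VDD - VTh - dVTh)/(alpha_d*n*V_t) - 1)."""
    return p_leak * ((vdd - vth - dvth) / (alpha_d * n * v_t) - 1.0)


if __name__ == "__main__":
    for dv in (0.0, 0.1, 0.2):
        p = leakage_power(1.0, 0.3, dv)
        print(dv, threshold_sensitivity(p, 1.0, 0.3, dv))
```

Because the sensitivity scales with P_Lk, it is small when leakage starts out low, which is the sense in which the first reductions in threshold come almost for free.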
5. Low Power Design
D. Options for Power Reduction
Reducing active power
Downsizing transistors ($C_L$)
  Slows down logic
Lowering the supply voltage ($V_{DD}$)
  Slows down logic
  Reducing swing slows down the succeeding stage
Reducing frequency (f)
  Does not reduce energy
Reducing switching activity ($\alpha$)
  Logic restructuring, clock gating
Reducing glitching
  Balancing logic
$P_{dyn} \sim \alpha C_L V_{swing} V_{DD} f$
$E \sim \alpha C_L V_{swing} V_{DD}$
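The difference between power and energy can be checked numerically. This sketch assumes P_dyn = alpha*C_L*V_swing*VDD*f and E = alpha*C_L*V_swing*VDD per cycle; the function names and the unit-free input values are illustrative.

```python
def dynamic_energy(alpha, c_load, v_swing, vdd):
    """Energy per cycle: alpha * C_L * V_swing * VDD."""
    return alpha * c_load * v_swing * vdd


def dynamic_power(alpha, c_load, v_swing, vdd, freq):
    """Average dynamic power: energy per cycle times clock frequency."""
    return dynamic_energy(alpha, c_load, v_swing, vdd) * freq


if __name__ == "__main__":
    base = dict(alpha=0.5, c_load=2.0, v_swing=1.0, vdd=1.0)
    print(dynamic_power(freq=1.0, **base), dynamic_energy(**base))
    # Halving f halves power, but the energy per operation is unchanged.
    print(dynamic_power(freq=0.5, **base), dynamic_energy(**base))
    # Halving a full-swing supply (V_swing = VDD) cuts energy to one quarter.
    print(dynamic_energy(0.5, 2.0, 0.5, 0.5))
```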
Power/Energy Optimization Space
Columns: Design Time and Sleep Mode (constant throughput/latency); Run Time (variable throughput/latency)
Active energy
  Design Time: logic design, scaled VDD, transistor sizing, multi-VDD
  Sleep Mode: clock gating
  Run Time: DFS, DVS
Leakage energy
  Design Time: stack effects, transistor sizing, scaling VDD, + multi-VTh
  Sleep Mode: sleep T’s, multi-VDD, variable VTh, + input control
  Run Time: + variable VTh
Energy-Performance Tradeoffs
Techniques are classified by enable time (design time or run time) and by performance impact.
Near-zero perf. penalty: clock gating; architectural switching reduction; multi-VTh; dynamic VDD; dynamic VTh
True tradeoffs: fine-granularity clock gating; VDD, VTH adjustments; multi-VDD; sizing, logic styles; stack forcing; sleep T’s
5. Low Power Design
E. Supply Voltage Reduction
Supply Voltage Adjustment
How to maintain throughput under reduced supply?
Introducing more parallelism/pipelining
  Area increase – cost up
  Cost/power tradeoff
Multiple voltage domains
  Separate supply voltages for different blocks
  Lower VDD for slower blocks
  Cost of DC-DC converters
Dynamic voltage scaling – with variable throughput
Reducing VTH to improve speed
  Leakage issues
Reducing Vdd
$P \times t_d = E_t = C_L V_{dd}^2$
$\frac{E(V_{dd}=2)}{E(V_{dd}=5)} = \frac{C_L \cdot 2^2}{C_L \cdot 5^2}$, so $E(V_{dd}=2) = 0.16 \, E(V_{dd}=5)$
Strong function of voltage ($V^2$ dependence).
Relatively independent of logic function and style.
[Figure: normalized power-delay product versus Vdd (volts), log-scaled axes, for a 51-stage ring oscillator and an 8-bit adder; annotated "quadratic dependence"]
Power-delay product improves with lowering VDD.
Chandrakasan, JSSC’92
Reducing VDD
[Figure: normalized delay and normalized energy versus V_DD (0.2 V to 1 V), with curves for switching power, leakage power, and delay; 32nm process]
Lower VDD Increases Delay
$T_d = \frac{C_L V_{dd}}{I}$, with $I \sim (V_{dd} - V_t)^2$
$\frac{T_d(V_{dd}=2)}{T_d(V_{dd}=5)} = \frac{2 \cdot (5 - 0.7)^2}{5 \cdot (2 - 0.7)^2} \approx 4.4$, i.e. roughly 4x slower
Relatively independent of logic function and style.
[Figure: normalized delay versus Vdd (volts) for adder (SPICE), microcoded DSP chip, multiplier, adder, ring oscillator, and clock generator; 2.0 µm technology]
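The two ratios quoted for a 5 V to 2 V supply reduction can be reproduced with a few lines, assuming E = C_L*Vdd^2 and Td = C_L*Vdd/I with I proportional to (Vdd - Vt)^2 and Vt = 0.7 V. The function names are illustrative; C_L and the current prefactor cancel in both ratios.

```python
def energy_ratio(v_low, v_high):
    """E(v_low) / E(v_high) for E = C_L * Vdd^2."""
    return (v_low / v_high) ** 2


def delay_ratio(v_low, v_high, v_t=0.7):
    """Td(v_low) / Td(v_high) for Td = C_L * Vdd / I, with I ~ (Vdd - Vt)^2."""
    return (v_low * (v_high - v_t) ** 2) / (v_high * (v_low - v_t) ** 2)


if __name__ == "__main__":
    print(round(energy_ratio(2.0, 5.0), 2))  # 0.16
    print(round(delay_ratio(2.0, 5.0), 2))   # 4.38
```

Under these assumptions, dropping the supply from 5 V to 2 V saves about 84% of the switching energy while making the logic a little over four times slower.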
Trade-off Between Power and Delay
$Power = a f C V_{DD}^2 + I_0 \cdot 10^{-V_{TH}/s} \cdot V_{DD}$
$Delay \propto \frac{C V_{DD}}{(V_{DD} - V_{TH})^{1.3}}$
[Figure: surface plots of power (W/gate) and delay (ps) versus V_DD (0.3 to 0.7) and V_TH (-0.1 to 0.3), with an equi-delay contour; 50nm node, FO3 INV]
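A hedged sketch of walking along an equi-delay curve: for each threshold, bisection finds the supply that gives the same delay, and the power at that point is evaluated. It assumes Power = a*f*C*VDD^2 + I_0*10^(-VTH/s)*VDD and Delay = C*VDD/(VDD - VTH)^1.3 with a unit proportionality constant. All parameter values (a, f, C, I_0, s) are arbitrary placeholders, not the 50nm data behind the plot, and the search assumes VTH >= 0 so that delay decreases monotonically with VDD.

```python
def delay(vdd, vth, c=1.0):
    """Delay = C * VDD / (VDD - VTH)^1.3 (proportionality constant taken as 1)."""
    return c * vdd / (vdd - vth) ** 1.3


def power(vdd, vth, a=0.1, f=1.0, c=1.0, i0=1.0, s=0.1):
    """Power = a*f*C*VDD^2 + I_0 * 10^(-VTH/s) * VDD."""
    return a * f * c * vdd ** 2 + i0 * 10 ** (-vth / s) * vdd


def vdd_for_delay(target, vth, margin=1e-3, hi=5.0, iters=100):
    """Bisection for the supply that meets `target` delay at threshold `vth` (vth >= 0)."""
    lo = vth + margin
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if delay(mid, vth) > target:
            lo = mid  # too slow at this supply: search higher voltages
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    target = delay(0.7, 0.3)
    for vth in (0.3, 0.2, 0.1):
        vdd = vdd_for_delay(target, vth)
        print(vth, round(vdd, 3), power(vdd, vth))
```

Lowering VTH lets VDD drop at constant delay, which reduces the switching term but raises the leakage term, so the power along an equi-delay curve reflects a balance between the two.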
Two Types of Processing
Fixed-rate processing (e.g. signal processing for multimedia or communications)
Stream-based computation
No advantage in obtaining throughput in excess of the real-time constraint
Variable-rate or burst-mode computation (e.g. general purpose computation)
Mostly idle (or low-load) with bursts of computation
Faster is better
Architecture Trade-off for Fixed-rate Processing
Reference Datapath
Parallel Datapath
Pipelined Datapath
A Simple Datapath: Summary
Next Lecture
Continue low-power design