Low-Power Design Techniques in Digital Systems
Prof. Vojin G. Oklobdzija
University of California
November 19, 2003
Outline of the Talk
• Power trends in VLSI
• Scaling theory and predictions
• Research efforts in power reduction
• Efficiency measures and design guidelines
• Latches and Flip-Flops for Low-Power
  – Dual-Edge FFs
  – SOI
• Conclusion: Low-Power perspective
Power trends in VLSI
“CMOS Circuits dissipate little power by nature. So believed circuit designers” (Kuroda-Sakurai, 95)
“By the year 2000 power dissipation of high-end ICs will exceed the practical limits of ceramic packages, even if the supply voltage can be feasibly reduced.”
(* Taken from Sakurai’s ISSCC 2001 presentation)
[Figure: Power (W), log scale from 0.01 to 100, versus year ('80 to '95); trend: x4 / 3 years]
Gloom and Doom predictions
Source: Shekhar Borkar, Intel
[Figure: Power (Watts, 0 to 80) versus Year (1995.5 to 2000.5) for RISC, x86, Consumer and Dec Alpha processors, each with an exponential fit]
Exponential fits:
• RISC: $y = 3\times10^{-97}\,e^{0.1131x}$
• x86: $y = 2\times10^{-124}\,e^{0.1442x}$
• Consumer: $y = 6\times10^{-109}\,e^{0.126x}$
• Dec Alpha: $y = 2\times10^{-222}\,e^{0.2574x}$
Growth rates:
• High-end growing at 25% / year
• Consumer (low-end) at 13% / year
• x86 @ 15% / yr
• RISC @ 12% / yr
Power versus Year: taken from ISSCC, uP Report, Hot-Chips
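The quoted yearly growth rates can be cross-checked against the exponents of the fitted curves, since a fit of the form y = a·exp(b·year) implies growth of exp(b) − 1 per year. The sketch below assumes the four exponents belong to RISC, x86, Consumer and Dec Alpha in the order the chart legend lists them; the names `FIT_EXPONENTS` and `annual_growth` are illustrative only.

```python
import math

# Exponents b of the fitted curves y = a * exp(b * year) shown on the slide.
# The pairing of exponent and processor class follows the legend order (assumption).
FIT_EXPONENTS = {
    "RISC": 0.1131,
    "x86": 0.1442,
    "Consumer": 0.126,
    "Dec Alpha": 0.2574,
}

def annual_growth(b: float) -> float:
    """Fractional growth per year implied by y = a * exp(b * year)."""
    return math.exp(b) - 1.0

growth = {name: annual_growth(b) for name, b in FIT_EXPONENTS.items()}
for name, g in growth.items():
    print(f"{name}: {g:.1%} per year")
```

Under this reading the RISC, x86 and Consumer fits land close to the 12%, 15% and 13% quoted on the slide, while the Dec Alpha fit comes out somewhat above the 25% quoted for the high end.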
VDD, Power and Current Trend
[Figure: Voltage [V], Power per chip [W] and VDD current [A] versus Year (1998 to 2014)]
International Technology Roadmap for Semiconductors 1999 update sponsored by the Semiconductor Industry Association in cooperation with European Electronic Component Association (EECA) , Electronic Industries Association of Japan (EIAJ), Korea Semiconductor Industry Association (KSIA), and Taiwan Semiconductor Industry Association (TSIA)
(* Taken from Sakurai’s ISSCC 2001 presentation)
Power Delivery Problem (not just California)
Source: Shekhar Borkar, Intel
Your car starter!
Trend in L di/dt
• di/dt is roughly proportional to I * f, where I is the chip’s current and f is the clock frequency, or I * Vdd * f / Vdd = P * f / Vdd, where P is the chip’s power.
• The trend is: P increases, f increases, Vdd decreases.
• On-chip L and package L decrease only slightly.
• Therefore, L di/dt fluctuation increases significantly.
(* Taken from Norman Chang, HP)
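As a rough illustration of the proportionality di/dt ~ P * f / Vdd, the sketch below compares the two Alpha generations quoted later in this talk (21164: 50W, 300MHz, 3.3V; 21264: 72W, 600MHz, 2.0V). The function name is hypothetical and the result is only a relative figure, not an absolute di/dt.

```python
def relative_di_dt(power_w: float, freq_hz: float, vdd_v: float) -> float:
    """Quantity proportional to di/dt, using di/dt ~ P * f / Vdd."""
    return power_w * freq_hz / vdd_v

# Alpha 21164 -> 21264 figures as quoted in this talk.
old = relative_di_dt(50.0, 300e6, 3.3)
new = relative_di_dt(72.0, 600e6, 2.0)
di_dt_ratio = new / old
print(f"di/dt grows by roughly {di_dt_ratio:.1f}x between the two chips")
```

If the package and on-chip inductance stay nearly constant, the L di/dt noise grows by about the same factor.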
ISPEC^2/Watt vs Feature Size (microns)
[Figure: ISPEC^2/Watt (log scale, 1 to 100) versus Feature Size (0 to 0.60 microns), with power-law fit $y = 0.3733\,x^{-2.5778}$]
Energy-Delay product is improving more than 2x / generation
Saving Grace !
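The "more than 2x / generation" claim can be checked against the fitted curve y = 0.3733 x^-2.5778 from this slide. The sketch assumes that one generation is a 0.7x linear shrink (an assumption, not stated on the slide) and treats ISPEC^2/Watt as the inverse of the energy-delay product.

```python
def efficiency(feature_um: float) -> float:
    """Fitted ISPEC^2/Watt from the slide: y = 0.3733 * x**-2.5778."""
    return 0.3733 * feature_um ** -2.5778

SHRINK = 0.7  # assumed linear shrink per technology generation

# The ratio is independent of the starting feature size for a power law.
gain_per_generation = efficiency(0.35 * SHRINK) / efficiency(0.35)
print(f"ISPEC^2/Watt improves by about {gain_per_generation:.2f}x per generation")
```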
ISPEC^2/Watt vs Year
[Figure: ISPEC^2/Watt (0 to 100) versus Year (1995 to 2001) for Consumer, x86 and Server processors]
• x86 efficiency improving dramatically: 4X / generation
• Average improving 3X / generation
• High-end processors' efficiency not improving
Scaling theory and predictions
The power dissipation has increased 1000 times over 15 years and is exceeding 70 Watts.
Scaling principles:
1. A “constant field scaling” theory [Dennard] assumes that device voltages as well as device dimensions are scaled by a scaling factor x (>1), resulting in a constant electric field in the device. Power density remains constant, and circuit performance can be improved in terms of:
• density: x^2
• speed: x
• power: 1/x^2
• power-delay product: 1/x^3
Limitless progress in CMOS is promised with this scaling scenario
In practice, neither the supply voltage nor the threshold voltage had been scaled until 1990, leading to the theory of “constant voltage scaling”, which assumes a constant voltage.
This assumption yields:
• speed improvement by x^2
• power density increasing rapidly, by x^3
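The scaling factors quoted for both scenarios follow from a first-order model. The sketch below is an illustration under stated assumptions, not the derivation used in the talk: gate capacitance scales as 1/x, drive current follows the long-channel square law (so it scales as 1/x under constant field and as x under constant voltage), delay is C·V/I, and power per circuit is C·V^2·f. All function names are hypothetical.

```python
import math

def _metrics(x: float, cap: float, volt: float, current: float) -> dict:
    """Relative figures of merit from relative C, V and I (all new/old ratios)."""
    delay = cap * volt / current          # gate delay ~ C * V / I
    speed = 1.0 / delay
    power = cap * volt ** 2 * speed       # power per circuit ~ C * V^2 * f
    density = x ** 2                      # circuits per unit area
    return {
        "density": density,
        "speed": speed,
        "power": power,
        "power_density": power * density,
        "power_delay": power * delay,
    }

def constant_field(x: float) -> dict:
    # Dimensions and voltages both shrink by 1/x; long-channel current ~ 1/x.
    return _metrics(x, cap=1 / x, volt=1 / x, current=1 / x)

def constant_voltage(x: float) -> dict:
    # Dimensions shrink by 1/x, voltage is fixed; long-channel current ~ x.
    return _metrics(x, cap=1 / x, volt=1.0, current=x)

cf = constant_field(2.0)
cv = constant_voltage(2.0)
print(cf)
print(cv)
```

For x = 2 the constant-field case reproduces speed x, power 1/x^2, power-delay 1/x^3 and constant power density, while the constant-voltage case gives speed x^2 and power density x^3.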
Constant-field scaling is not realistic; x^0.5 is satisfactory. However, even with that, the power dissipation would exceed that of ECL by 2001: a new philosophy is required!
(* Taken from Sakurai and Kuroda, IEICE 95 paper)
High-Performance View Point on Power
*taken from Ron Preston, DEC Alpha
$P = k\,C\,V^2 f$:
• Shrinking to the new technology (30% reduction)
  – C decreases by 30%
  – f increases by 1/0.7 = 43%
  – $P_{new} = 0.7 \times (1/0.7) \times P_{old} = P_{old}$ (no change in power!)
• New design:
  – Double the number of devices
  – $P_{new} = 2 \times 0.7 \times (1/0.7) \times P_{old} = 2 \times P_{old}$ (power doubles!)
• Scale Vdd by 30% in the new design:
  – $P_{new} = 2 \times 0.7 \times (1/0.7) \times (0.7)^2 \times P_{old} \approx P_{old}$ (power stays constant!)
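The three cases above are plain arithmetic on P = k C V^2 f. A minimal sketch, with every argument expressed as a new/old ratio (the function name and keyword arguments are illustrative):

```python
import math

def relative_power(cap: float = 1.0, vdd: float = 1.0,
                   freq: float = 1.0, devices: float = 1.0) -> float:
    """P_new / P_old from P = k * C * V^2 * f; every argument is a new/old ratio."""
    return devices * cap * vdd ** 2 * freq

# Shrink only: C drops to 0.7x, f rises to 1/0.7x.
shrink_only = relative_power(cap=0.7, freq=1 / 0.7)
# Same shrink, but with twice as many devices.
doubled = relative_power(cap=0.7, freq=1 / 0.7, devices=2)
# Same again, with Vdd also scaled to 0.7x.
doubled_low_vdd = relative_power(cap=0.7, freq=1 / 0.7, devices=2, vdd=0.7)

print(shrink_only, doubled, doubled_low_vdd)
```

The last case evaluates to 2 x 0.49 = 0.98, which is the "power stays constant" result to within 2%.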
High-Performance View Point on Power
*taken from Ron Preston, DEC Alpha
Reality:

| Chip | Process | Vdd | Freq. | Power |
|---|---|---|---|---|
| 21164 | 0.5u | 3.3V | 300MHz | 50W |
| 21264 | 0.35u | 2.0V | 600MHz | 72W |
| Change | -30% | -39% | +100% | +44% |

Paradigm changes: more aggressive circuits, toggle rate increasing, out-of-order and speculative execution.
What to expect:
• Power will be limited by the package and cooling techniques
• Frequency will be determined by the power: as high as the package can take!
Research Efforts in Low-Power Design
$P_{sw} = k\,C_L\,V_{cc}^2\,f_{CLK}$
Reduce switching activity:
• Conditional clock
• Conditional precharge
• Switching off inactive blocks
• Conditional execution
Run it slower:
• Use parallelism
• Fewer pipeline stages
• Use double-edge flip-flop
Technology scaling:
• The highest win
• Thresholds should scale
• Leakage starts to bite
• Dynamic voltage scaling
Reduce the active load:
• Minimize the circuits
• Use more efficient design
• Charge recycling
• More efficient layout
Reducing the Power Dissipation
• The power dissipation can be minimized by reducing:
  – supply voltage
  – load capacitance
  – switching activity
• Reducing the supply voltage brings a quadratic improvement.
• Reducing the load capacitance contributes to the improvement of both power dissipation and circuit speed.
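A small sketch of the three knobs, using the switching-power expression Psw = k CL Vcc^2 fCLK from the previous slide. The numeric values are purely illustrative assumptions; only the ratios matter.

```python
import math

def switching_power(k: float, c_load: float, vcc: float, f_clk: float) -> float:
    """Psw = k * CL * Vcc^2 * fCLK, with k the switching-activity factor."""
    return k * c_load * vcc ** 2 * f_clk

# Illustrative baseline (hypothetical): activity 0.2, 1 nF switched, 3.3 V, 100 MHz.
base = switching_power(0.2, 1e-9, 3.3, 100e6)

half_vcc = switching_power(0.2, 1e-9, 3.3 / 2, 100e6) / base   # quadratic gain
half_cap = switching_power(0.2, 0.5e-9, 3.3, 100e6) / base     # linear gain
half_act = switching_power(0.1, 1e-9, 3.3, 100e6) / base       # linear gain

print(f"half Vcc: {half_vcc:.2f}x, half CL: {half_cap:.2f}x, half activity: {half_act:.2f}x")
```

Halving the supply cuts switching power to a quarter, whereas halving the load or the activity only halves it.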
Voltage Scaling
There are three means to maintain the throughput:
• Reduce Vth to improve circuit speed
• Introduce parallel and pipelined architecture while using slower device speeds (this assumes a limitless number of transistors; in reality the transistor density is only increasing by 60% per year)
• Prepare multiple supply voltages and, for each cluster of circuits, choose the lowest supply voltage that satisfies the speed requirement. (A good level converter is necessary: one that exhibits small delay, consumes little power and occupies a small area.)
Is there an optimal design point ?
Power Dissipation and Circuit Delay
Power: $P = p_t\,f_{CLK}\,C_L\,V_{DD}^2 + I_0\,10^{-V_{th}/S}\,V_{DD}$
Delay: $t_d = \dfrac{k\,Q}{I} = \dfrac{k\,C_L\,V_{DD}}{(V_{DD} - V_{th})^{\alpha}}$, with $\alpha = 1.3$
[Figure: two surface plots over Vth (V), -0.4 to 0.8, and VDD (V), 1 to 4: Power (W), up to 1 x 10^-4, and Delay (s), up to 5 x 10^-10, with operating points A and B marked]
(* Taken from T. Sakurai)
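The power and delay expressions on this slide are read here as P = pt·fCLK·CL·VDD^2 + I0·10^(−Vth/S)·VDD and delay = k·CL·VDD / (VDD − Vth)^α with α = 1.3. The sketch below evaluates that model so the trade-off can be explored numerically. Every default parameter value (pt, CL, I0, S, k) is a hypothetical placeholder, not a figure from the talk.

```python
def power(vdd: float, vth: float, p_t: float = 0.5, f_clk: float = 100e6,
          c_load: float = 1e-12, i0: float = 1e-6, s: float = 0.1) -> float:
    """Dynamic plus subthreshold-leakage power (all defaults are hypothetical)."""
    dynamic = p_t * f_clk * c_load * vdd ** 2
    leakage = i0 * 10 ** (-vth / s) * vdd   # s: subthreshold swing in V/decade
    return dynamic + leakage

def delay(vdd: float, vth: float, k: float = 1.0,
          c_load: float = 1e-12, alpha: float = 1.3) -> float:
    """Alpha-power-law gate delay; only meaningful for vdd > vth."""
    if vdd <= vth:
        raise ValueError("model requires vdd > vth")
    return k * c_load * vdd / (vdd - vth) ** alpha

for vth in (0.0, 0.2, 0.4):
    print(f"Vth={vth:.1f} V: P={power(1.0, vth):.3e} W, delay={delay(1.0, vth):.3e} (arb.)")
```

At a fixed VDD, lowering Vth shortens the delay but raises the leakage term, which is the tension the two surface plots illustrate.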
Power-Delay Product, Energy-Delay Product
Lowest Voltage – Highest Threshold – no optimum
• Power-Delay Product is a misleading measure; it will always favor a processor that operates at a lower frequency
• Energy-Delay is more adequate, but Energy-Delay^2 should be used
(*from Sakurai, Kuroda, IEICE 95 paper)
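To see why the choice of metric matters, the sketch below searches for the supply voltage that minimizes each metric under a simplified model: switching energy per operation proportional to VDD^2 (the power-delay product), delay proportional to VDD / (VDD − Vth)^α with α = 1.3, and leakage ignored. The threshold of 0.4 V and the search range are assumptions chosen for illustration.

```python
ALPHA = 1.3   # delay-model exponent used in this talk
VTH = 0.4     # assumed threshold voltage in volts (hypothetical)

def energy(vdd: float) -> float:
    """Switching energy per operation in arbitrary units (C = 1)."""
    return vdd ** 2

def delay(vdd: float) -> float:
    """Alpha-power-law delay in arbitrary units."""
    return vdd / (vdd - VTH) ** ALPHA

def best_vdd(metric, lo: float = 0.45, hi: float = 3.0, steps: int = 2550) -> float:
    """Grid search for the supply voltage that minimizes the given metric."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return min(grid, key=metric)

v_energy = best_vdd(energy)
v_edp = best_vdd(lambda v: energy(v) * delay(v))
v_ed2 = best_vdd(lambda v: energy(v) * delay(v) ** 2)
print(f"energy: {v_energy:.3f} V, E*D: {v_edp:.3f} V, E*D^2: {v_ed2:.3f} V")
```

Energy alone is minimized at the lowest voltage allowed, so it has no interior optimum. In this model the energy-delay product has one at 3·Vth/(3 − α), and energy-delay^2 at the higher voltage 4·Vth/(4 − 2α), which weights performance more heavily.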
Power-Delay Product, Energy-Delay Product
Horowitz, Indermaur, Gonzales argue against Power-Delay, SLPE’94
Energy-Delay**2
(*courtesy of Prof. T. Sakurai)
Energy-Delay Product vs. Energy-Delay**2
Nowka, Hofstee, Carpenter of IBM argue against Energy-Delay as a design efficiency measure (private communication)
The same design should have relatively the same efficiency
Optimal point: (due to Vth being fixed?)
Example: PowerPC

| Feature | 601+ | 604 | 620 | Diff. |
|---|---|---|---|---|
| Frequency (MHz) | 100 | 100 | 133 (100) | same |
| CMOS Process | .5u 5-metal | .5u 4-metal | .5u 4-metal | ~same |
| Cache Total | 32KB Cache | 16K+16K Cache | 64K | ~same |
| Load/Store Unit | No | Yes | Yes | |
| Dual Integer Unit | No | Yes | Yes | |
| Register Renaming | No | Yes | Yes | |
| Peak Issue | 2 + Br | 4 Insts | 4 Insts | ~double |
| Transistors | 2.8 Million | 3.6 Million | 6.9 Million | +30% / +146% |
| SPECint92 | 105 | 160 | 225 (169) | +50% / +61% |
| SPECfp92 | 125 | 165 | 300 (225) | +30% / +80% |
| Power | 4W | 13W | 30W (22.5W) | +225% / +463% |
| Spec/Watt | 26.5/31.2 | 12.3/12.7 | 7.5/10 | -115% / -252% |
| PF=Watt/Freq**3 | 4.0E-6 | 13.0E-6 | 12.8E-6 | |
| (PF/Trans)*E12 | 1.43 | 3.61 | 1.86 | |
| IPC | 1.05 | 1.6 | 1.69 | |
| PE*IPC**3 (*E6) | 4.01 | 12.98 | 12.69 | |
| PE=Watt/Spec**3 | 3.46E-6 | 3.17E-6 | 2.63E-6 | |
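The derived rows of the PowerPC table can be recomputed from the raw frequency, SPECint92, power and transistor figures. The sketch assumes, based on the numbers matching, that the "IPC" row is SPECint92 divided by MHz and that PF and PE use MHz and SPECint92 directly; the dictionary layout and function name are illustrative.

```python
# Raw figures from the PowerPC table (frequency in MHz, power in watts).
CHIPS = {
    "601+": {"mhz": 100, "specint92": 105, "watts": 4.0, "transistors": 2.8e6},
    "604": {"mhz": 100, "specint92": 160, "watts": 13.0, "transistors": 3.6e6},
    "620": {"mhz": 133, "specint92": 225, "watts": 30.0, "transistors": 6.9e6},
}

def derived_metrics(chip: dict) -> dict:
    """Recompute the efficiency rows of the table from the raw figures."""
    pf = chip["watts"] / chip["mhz"] ** 3          # PF = Watt / Freq^3
    pe = chip["watts"] / chip["specint92"] ** 3    # PE = Watt / Spec^3
    ipc = chip["specint92"] / chip["mhz"]          # "IPC" row (assumed definition)
    return {
        "PF": pf,
        "PF_per_transistor_e12": pf / chip["transistors"] * 1e12,
        "IPC": ipc,
        "PE": pe,
        "PE_IPC3_e6": pe * ipc ** 3 * 1e6,
    }

results = {name: derived_metrics(chip) for name, chip in CHIPS.items()}
for name, row in results.items():
    print(name, {key: f"{value:.3g}" for key, value in row.items()})
```

With these definitions PE·IPC^3 reduces algebraically to PF. The PF values differ by about 3x between the 601+ and the other two chips, while PE stays within about 30% across all three.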

| Feature | Digital 21164 | MIPS 10000 | PowerPC 620 | HP 8000 | Sun Ultra-Sparc |
|---|---|---|---|---|---|
| Freq | 500 MHz | 200 MHz | 200 MHz | 180 MHz | 250 MHz |
| Pipeline Stages | 7 | 5-7 | 5 | 7-9 | 6-9 |
| Issue Rate | 4 | 4 | 4 | 4 | 4 |
| Out-of-Order Exec. | 6 lds | 32 | 16 | 56 | none |
| Register Renam. (int/FP) | none/8 | 32/32 | 8/8 | 56 | none |
| Transistors / Logic transistors | 9.3M/1.8M | 5.9M/2.3M | 6.9M/2.2M | 3.9M*/3.9M | 3.8M/2.0M |
| SPEC95 (Intg/FlPt) | 12.6/18.3 | 8.9/17.2 | 9/9 | 10.8/18.3 | 8.5/15 |
| Power | 25W | 30W | 30W | 40W | 20W |
| SpecInt/Watt | 0.5 | 0.3 | 0.3 | 0.27 | 0.43 |
| 1/Energy*Delay | 6.4 | 2.6 | 2.7 | 2.9 | 3.6 |
| Watt/Freq**3 | 0.2E-6 | 3.75E-6 | 3.75E-6 | 6.86E-6 | 1.28E-6 |
| (PF/Trans)*E12 | 0.022 | 0.64 | 0.54 | 1.76 | 0.34 |
| (PF/LTrans)*E12 | 0.11 | 1.63 | 1.7 | 1.76 | 0.64 |
| Watt/Spec**3 | 12.5E-3 | 42.5E-3 | 41.5E-3 | 31.7E-3 | 32.5E-3 |
Sensitivity to Vth fluctuation
[Figure: Normalized delay (0.6 to 1.8) versus VTH (0 to 1 V) for VDD = 1.0 V, 1.5 V, 3.0 V and 5.0 V, with ΔVTH = ±0.05 V and ±0.15 V]
(* Taken from T. Sakurai)
Use of Different Circuit Families
Capacitance Reduction
The load capacitance is the sum of:
• gate capacitance
• diffusion capacitance
• routing capacitance
Using a small number of transistors, or small transistor sizes, contributes to the reduction in the gate capacitance and the diffusion capacitance.
Pass-transistor logic may have an advantage because it comprises fewer transistors and exhibits smaller stray capacitance than conventional static CMOS logic.
Pass-Transistor Logic
Pass-Transistor Logic: CVSL, CPL, SRPL, DSL, DPL, DCVSPG
SAPL: Sense-Amplifying Pass-transistor Logic
All nodes are first discharged and then evaluated by inputs. Outputs are 100mV above GND.
Where does the power go ?
Power use is different from chip to chip:
• MPU1 is a low-end microprocessor
• MPU2 is a high-end CPU with large cache
• ASSP1 is an MPEG-2 decoder
• ASSP2 is an ATM switch
(*from Sakurai, Kuroda, IEICE 95 paper)
Design Example: Strong Arm 110
Two power modes: idle and sleep
Power:
• 0.5W using 1.1V internal PS: 184 Dhrystone MIPS @ 162MHz
• 1.1W using 2V internal PS: 245 Dhrystone MIPS @ 215MHz
Power Breakdown:
• I-Cache 27%
• D-Cache 16%
• I-Unit 18%
• Exec-Unit 8%
• I-MMU 9%
• D-MMU 8%
• Clock 10%
• Others 4% (PLL < 1%)
*from D. Dobberpuhl
Design Example: Strong Arm 110
*from D. Dobberpuhl
Design Example: Strong Arm 110
However, leakage current starts to affect stand-by power
*from D. Dobberpuhl
Controlling both: VDD and VTH for low power
Controlling VDD and VTH for low power
Low power → Low VDD → Low speed → Low VTH → High leakage → VDD-VTH control

| | Active | Stand-by |
|---|---|---|
| Multiple VTH | Dual-VTH | MTCMOS |
| Variable VTH | VTH hopping | VTCMOS |
| Multiple VDD | Dual-VDD | Boosted gate MOS |
| Variable VDD | VDD hopping | |

*) MTCMOS: Multi-Threshold CMOS
*) VTCMOS: Variable Threshold CMOS
• Multiple: spatial assignment
• Variable: temporal assignment
Software-hardware cooperation
Technology-circuit cooperation
(* from Prof. T. Sakurai)
Dual-VTH concept
• Critical paths: Low-VTH circuit (high leakage)
• Non-critical paths: High-VTH circuit (low leakage)
(* from Prof. T. Sakurai)
Clustered Voltage Scaling for Multiple VDD’s
[Figure: Conventional Design versus CVS Structure, each drawn as flip-flops (FF) with the critical path marked; the CVS structure uses level-shifting F/Fs, and the lower-VDD portion is shown shaded]
(* from Prof. T. Sakurai)
M.Takahashi et al., “A 60mW MPEG4 Video Codec Using Clustered Voltage Scaling with Variable Supply-Voltage Scheme,” ISSCC, pp.36-37, Feb.1998.
Once VL is applied to a logic gate, VL is also applied to all subsequent logic gates up to the F/Fs, which eliminates DC current paths. The F/Fs restore VH.
Energy consumption is proportional to the square of VDD.
VDD should be lowered to the minimum level which ensures the real-time operation.
[Figure: Normalized power (0.0 to 1.0) versus Normalized workload (0.0 to 1.0) for Variable Vdd and Fixed Vdd]
If you don’t need to hustle, VDD should be as low as possible
(* from Prof. T. Sakurai)
Measured voltage waveforms
[Figure: measured VDD waveform over 1 sync frame (200ms), hopping between VDDmax and VDDmin, shown together with the Sleep signal; VDDmax = 8% of the time on average, Sleep = 6% on average]
(* from Prof. T. Sakurai)
Measured power characteristics
Total power = 0.8W x 0.08 + 0.16W x 0.86 + 0.07W x 0.06 = 0.2W
VDD hopping can cut down power consumption to 1/4
[Figure: Power P [W] (0 to 1) versus Supply voltage VDD [V] (0 to 2) at f=100MHz and f=200MHz; 0.8W at VDDmax (8% of the time), 0.16W at VDDmin (86% of the time, down to 1/5), 0.07W in sleep (6% of the time)]
(* from Prof. T. Sakurai)
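The total-power figure on this slide is a time-weighted average over the three operating modes. A short sketch of that arithmetic, using the measured values quoted on the slide (the variable names are illustrative):

```python
# (mode, power in watts, fraction of time) as quoted on the slide.
MODES = [
    ("VDDmax", 0.80, 0.08),
    ("VDDmin", 0.16, 0.86),
    ("sleep", 0.07, 0.06),
]

# Time-weighted average power across the three modes.
average_power = sum(watts * share for _, watts, share in MODES)
# Saving relative to running at VDDmax all the time.
saving_factor = 0.80 / average_power

print(f"average power = {average_power:.3f} W, about 1/{saving_factor:.1f} of fixed-VDD power")
```

The weighted sum comes to roughly 0.2W, a little under four times lower than the 0.8W drawn when running permanently at VDDmax.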
Simulation results
[Figure: two panels, MPEG-2 video decoding and VSELP speech encoding, each plotting Normalized Power P/P_FIX versus Transition Delay T_TD (ms, 0.0 to 1.0) for RPC with 2 levels (f, f/2), 3 levels (f, f/2, f/3), 4 levels (f, f/2, f/3, f/4) and infinite levels, plus post-simulation analysis]
(* from Prof. T. Sakurai)
Aggressive Voltage Scaling
If we can dynamically scale Vdd and Vth the advantage is obvious
*Taken from Kuroda
Example
TransMeta Example
• “Code Morphing” is another contributor to power reduction since it eliminates unnecessary external memory accesses
*Taken from Doug Laird’s presentation, January 19th, 2000
Latches and Flip-Flops for Low-Power
Simulation Condition and Testbench
Timing:
• Total FF overhead is setup + clock-to-output time
• Circuit optimization towards $t_{D-Q}$
• Clock skew robustness obtained from observing the D-Q curve
• $t_d = \min(t_{D-Q})$
Power-Delay Product:
• Overall performance parameter at fixed frequency
• $\mathrm{PDP} = P_{diss}\,t_d$ (at a fixed frequency this is proportional to EDP)
[Figure: test bench with the flip-flop under test (D, Clk, Q and complementary Q), Data In and Clock drivers, and 14X minimum-inverter loads]
Flip-Flop Performance Comparison
• Total power consumed
  – internal power
  – data power
  – clock power
• Measured for four cases
  – no activity (0000… and 1111…)
  – maximum activity (0101010..)
  – average activity (random sequence)
• Delay is (minimum D-Q): Clk-Q + Setup time
[Figure: test bench with Data and Clock inputs driving the D and Clk pins of the flip-flop; capacitances of 50fF, 200fF and 200fF]
OLD TEST BENCH:
• Total Power = Drivers Power + Test Unit Power
• PDP-Optimized = Equal Trade-off on Power and Delay
• Improper Load on Drivers
NEW TEST BENCH:
• Drivers: Fixed Gain and Driving Test Unit Only
• Data-to-Output Delay
• PD^2P Optimized = Best for Constant-Field Scaling
[Figure: schematics of the old and new test benches]
Comparison in terms of speed and EDPtot
Technology: 0.2u, Vdd=2V, T=20°C, measured @ 100MHz
• Delay:
  – below 200ps: SDFF 187ps, HLFF 199ps, K-6 ETL 200ps
  – 200-300ps: PowerPC latch 266ps, 21264 Alpha FF 272ps, Strong Arm FF 275ps, mC2MOS latch 292ps
  – above 500ps: SSTC latch 592ps, DSTC latch 629ps, SSTC* latch 898ps, DSTC* latch 1060ps
• PDPtot @100MHz:
  – below 30fJ: PowerPC latch 28fJ
  – 30 - 50fJ: HLFF 29fJ, SDFF 39fJ, mC2MOS latch 40fJ, 21264 Alpha FF 43fJ, Strong Arm FF 45fJ
  – 50 - 70fJ: K-6 ETL 70fJ
  – above 70fJ: SSTC latch 95fJ, DSTC latch 125fJ
Delay comparison
• F-F design brings the fastest structures
[Figure: bar chart of Delay [ps] (0 to 350) for SDFF, HLFF, K6, PowerPC, Alpha 21264 FF, Strong Arm FF and mC2MOS]
[Figure: bar charts of Delay [ps] in two panels: SDFF, HLFF, PowerPC and mC2MOS (0 to 350 ps); K6, SA-F/F, StrongArm, SSTC and DSTC (0 to 700 ps)]
Overall ranking: PDPtot ranges
[Figure: bar chart of PDPtotal [fJ] (0 to 350) @100MHz for HLFF, SDFF, PowerPC, mC2MOS, StrongArm, Alpha 21264, K6, SSTC, DSTC, SSTC* and DSTC*; activity = 0.25, equal transition probability]
• EDPtot accepted as the overall cost function
• The proposed “low-power” latches from Yuan & Svensson do not show an advantage over the other presented structures (the optimization was not properly done and is yet to be repeated under a different setup)
Overall ranking, zoomed
[Figure: bar chart of PDPtot [fJ] (0 to 80) for HLFF, SDFF, PowerPC, mC2MOS, Strong Arm, Alpha 21264 and K6; activity = 0.25, equal transition probability]
• Real signals have the activity between 0 and 1.0
• Precharged hybrid structures are the fastest, but their power consumption strongly depends on the probability of “ones”
• More “ones” above the point
Overall performance
[Figure: bar charts of PDPtot [fJ] in two panels at activity = 0.5, equal transition probability: HLFF, SDFF, PowerPC and mC2MOS (0 to 60 fJ); SA-F/F, StrongArm110, K6, SSTC and DSTC (0 to 160 fJ)]
Conventional Clk-Q vs. minimum D-Q
• Hidden positive setup time
• Degradation of Clk-Q
[Figure: two scatter plots of Total power [uW] (0 to 400) for HLFF, PowerPC, Strong Arm FF, Alpha 21264 FF, mC2MOS latch, K6 ETL, SSTC, DSTC and SDFF: one versus Delay [ps] (150 to 650), one versus Clk-Q delay [ps] (100 to 350)]
Internal Power distribution
• Four sequences characterize the boundaries for internal power consumption
  – …010101… maximum
  – random, equal transition probability, average
  – …111111… precharge activity
  – …000000… leakage + internal clock processing
[Figure: Internal Power [uW] (0 to 400) by data pattern (random, activity=0.5; …01010101…, activity=1; …11111111…, activity=0; …00000000…, activity=0) for HLFF, SDFF, PowerPC 603 latch, mC2MOS latch, StrongARM FF, Alpha 21264 FF and K6 ETL]
Comparison of Clock power consumption
[Figure: horizontal bar chart of local clock power consumption [uW] (0 to 50) for DSTC MS latch, SSTC MS latch, K6 ETL, StrongArm FF, SA-F/F, mC2MOS, PowerPC MS latch, SDFF and HLFF]
Using Dual-Edge Flip-Flop
(run at ½ of the frequency and save on the power consumed in the clock distribution tree)
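The clock-tree saving follows from the same C·V^2·f relation used earlier for switching power: the clock net toggles every cycle, so halving its frequency halves its power. The sketch below uses the 1.8V supply and 500MHz / 250MHz clocks quoted for the dual-edge comparison in this talk; the clock-tree capacitance is a hypothetical value, and the sketch assumes it is the same for both designs.

```python
import math

def clock_tree_power(c_clk_f: float, vdd_v: float, f_clk_hz: float) -> float:
    """Clock distribution power, P = C * V^2 * f (the clock switches every cycle)."""
    return c_clk_f * vdd_v ** 2 * f_clk_hz

C_CLK = 100e-12  # hypothetical total clock-tree capacitance in farads

single_edge = clock_tree_power(C_CLK, 1.8, 500e6)  # captures data on one edge
dual_edge = clock_tree_power(C_CLK, 1.8, 250e6)    # same data rate on both edges
clock_saving = dual_edge / single_edge

print(f"clock tree: {single_edge:.3f} W -> {dual_edge:.3f} W ({clock_saving:.0%})")
```

Whether the overall saving survives depends on how much extra internal and clock-pin load the dual-edge flip-flops themselves add.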
Dual-Edge vs. Single-Edge Flip-Flops Comparison
[Figure: bar charts of Delay [ps] (0 to 400), Total Power [uW] (0 to 350), Internal Power [uW] (0 to 300), Clock Power [uW] (0 to 60) and Data Power [uW] (0 to 25) for DETFF1, DETFF2, DETFF3, SDFF, HLFF and PowerPC]
• Fujitsu 0.18u process; clock frequency 500MHz (250MHz for Dual-Edge FFs)
• Data activity ratio = 0.5
• VDD = 1.8V
• Temp = 25°C
Silicon on Insulator (SOI) Technology
SOI Comparison
[Figure: bar charts of Delay [ps] (0 to 70), Internal Power [uW] (0 to 140), Clock Power [uW] (0 to 50), Total Power [uW] (0 to 160) and EDP [fJ] @500MHz (0 to 6) for HAL, PowPC, HLFF, SDFF, SAFF and SA 110]
F = 1GHz, activity = 0.5, Le = 0.08 um, VDD = 1.3V, T = 25°C
In conclusion….
What can we expect that low power will bring to us ?
Wearable Computer
Digital Ink
Implantable Computer
Bluetooth
Mosquito
• Brain: ultra small volume; small number of neuron cells; extremely low power; real-time image processing; (artificial) intelligence; 3D flight control
• Sensor: infrared, humidity, CO2
Year 2110
• Long lifetime by DNA manipulation
• Bio-computer
Year 2010
• Extrapolation of the trend with some saturation
• Many important, interesting applications: home, entertainment, office, translation, health care
Year 2020???
• More assembly technique: 3D
• Combination of bio and semiconductor