System level optimization using body bias
Ivan MIRO-PANADES | IP-SoC | 09/03/2020
Outline
• DSP implementation on FDSOI
• Power optimization using body bias
• Variability and how to compensate it
• Design considerations using body bias
DSP introduction
• 32bit VLIW DSP
• Ultra Wide Voltage Range (UWVR) circuit
• UTBB FD-SOI 28nm
• Test case for DVFS & FBB exploration
[Figure: block diagram of the 32bit VLIW DSP core (V/F domain): control, program SRAM, data SRAM, two address generators, register array (64x32b, 2 read / 1 write ports), MAC (32b multiplier, 40b adder), Cordics + divider, shift registers, bit extension and bit truncation units, muxes, comp/select, serial interface, Variability Power Control (CVP), PLL, input/output RAM (2x1Kx32), CODA, one TMFLT-R block and six TMFLT-S blocks]
R. Wilson, "A 460MHz at 397mV, 2.6GHz at 1.3V, 32b VLIW DSP, embedding FMAX tracking," ISSCC 2014.
Technology: STMicro UTBB FDSOI 28 nm
Transistors: Flip-Well (LVT), L=24nm
Core area: 1 mm²
DSP benchmark: FFT 1024
VDD range: 0.397V-1.3V
VBB range: 0V/±2V
DSP Fmax
[Figure: frequency (MHz, 0 to 3000) vs. VDD voltage (mV, 300 to 1300), one curve per forward body bias (boost) value: 0V, 0.5V, 1.0V, 1.5V and 2.0V; annotated points: 1GHz@570mV and 460MHz@397mV]
R. Wilson, "A 460MHz at 397mV, 2.6GHz at 1.3V, 32b VLIW DSP, embedding FMAX tracking," ISSCC 2014.
DSP energy performance measured results
[Figure: energy/cycle (pJ/cycle, 0 to 450) vs. frequency (MHz, 300 to 2500), one curve per forward body bias (boost) value: 0V, 0.5V, 1.0V, 1.5V and 2.0V; annotations: +59% frequency, +14% frequency, -20% energy/cycle]
R. Wilson, "A 460MHz at 397mV, 2.6GHz at 1.3V, 32b VLIW DSP, embedding FMAX tracking," ISSCC 2014.
UWVR 32-bit DSP in 28nm UTBB FDSOI
High-Performance, Low power, Dynamic Flexibility
KEY FEATURES
o Design Methodology
o UWVR standard cells
o Pulsed latches FF
o -2V to 2V FBB/RBB IOs
o PVT sensors
o Timing slack sensors
o Fast loop adjustment
o UWVR SRAM
[Figure callouts: 100x, 476MHz, 2GHz, ISSCC’14]
“28nm FD-SOI and full voltage scaling break low-power limits for DSPs”, IEEE Times Europe
AVFS scheme
DVFS for power-performance trade-off
[Figure: power domain containing the core, a V regulator, an F generator and a controller; plots of frequency vs. voltage and power vs. frequency]
Vdd & F are adjusted to ensure performance at minimum consumption
[Equations: the symbols were lost in extraction; the surviving layout is a three-term relation followed by three proportionality (∝) relations]
Source: Diego Puschini
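The DVFS principle on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the alpha-power-law delay model, the textbook power model (dynamic power proportional to C·F·Vdd², static power proportional to Ileak·Vdd) and every constant below are hypothetical and are not taken from the slides.

```python
# Illustrative DVFS sketch: all constants are assumptions, not measured data.
VTH = 0.35        # assumed threshold voltage (V)
ALPHA = 1.5       # assumed alpha-power-law exponent
K_F = 3.0e9       # assumed frequency scaling constant (Hz)
C_EFF = 1.0e-10   # assumed effective switched capacitance (F)
I_LEAK = 1.0e-3   # assumed leakage current (A), taken as constant here


def fmax(vdd):
    """Maximum frequency (Hz) at a given supply, alpha-power-law model."""
    if vdd <= VTH:
        return 0.0
    return K_F * (vdd - VTH) ** ALPHA / vdd


def power(vdd, freq):
    """Total power (W): dynamic (C*F*Vdd^2) plus static (Ileak*Vdd)."""
    return C_EFF * freq * vdd ** 2 + I_LEAK * vdd


def lowest_vdd_for(freq_target, vdd_levels):
    """Smallest available Vdd whose fmax meets the target, or None."""
    for vdd in sorted(vdd_levels):
        if fmax(vdd) >= freq_target:
            return vdd
    return None


levels = [0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3]
vdd = lowest_vdd_for(1.0e9, levels)
print("chosen Vdd:", vdd, "V, power:", power(vdd, 1.0e9), "W")
```

Because both power terms grow with Vdd in this model, picking the lowest supply that still meets the frequency target is also the lowest-power choice.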
AVFS+BB SCHEME
ABB in FD-SOI
• Ultra Wide Voltage Range (UWVR)
• Body Bias increases efficiency
[Figure: power domain containing the core, a V regulator, an F generator, a BB generator and a controller performing data fusion and adjustment; plot of frequency vs. voltage and efficiency over the UWVR range, 0 to 1.3V]
[Equations: the symbols were lost in extraction; the surviving layout is a three-term relation followed by three proportionality (∝) relations]
Source: http://www.st.com/web/en/about_st/learn_fd-soi.html
Vdd & Vbb should be chosen to minimize consumption
Source: Diego Puschini
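Choosing Vdd and Vbb together can be illustrated with a small exhaustive search. The sketch below is hypothetical throughout: it assumes that forward body bias lowers the threshold voltage linearly (the body factor is an assumed value), that leakage grows by one decade per `SLOPE` volts of threshold reduction, and that frequency follows an alpha-power law. None of these constants come from the slides.

```python
from itertools import product

# Every constant below is an assumption chosen for illustration only.
VTH0 = 0.40          # threshold voltage without body bias (V)
BODY_FACTOR = 0.085  # assumed threshold shift per volt of forward body bias (V/V)
ALPHA = 1.5          # alpha-power-law exponent
K_F = 3.0e9          # frequency scaling constant (Hz)
C_EFF = 1.0e-10      # effective switched capacitance (F)
I_LEAK0 = 1.0e-4     # leakage current without body bias (A)
SLOPE = 0.1          # subthreshold slope (V per decade of leakage)


def vth(vbb):
    """Forward body bias (vbb > 0) lowers the threshold voltage."""
    return VTH0 - BODY_FACTOR * vbb


def fmax(vdd, vbb):
    """Maximum frequency (Hz) for a (Vdd, Vbb) pair."""
    overdrive = vdd - vth(vbb)
    return K_F * overdrive ** ALPHA / vdd if overdrive > 0 else 0.0


def power(vdd, vbb, freq):
    """Total power (W); leakage rises as the threshold voltage drops."""
    leak = I_LEAK0 * 10 ** ((VTH0 - vth(vbb)) / SLOPE)
    return C_EFF * freq * vdd ** 2 + leak * vdd


def best_point(freq_target, vdd_levels, vbb_levels):
    """Return (power, vdd, vbb) of the cheapest point meeting the target, or None."""
    feasible = [(power(vdd, vbb, freq_target), vdd, vbb)
                for vdd, vbb in product(vdd_levels, vbb_levels)
                if fmax(vdd, vbb) >= freq_target]
    return min(feasible) if feasible else None


VDD_LEVELS = [0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3]
no_bb = best_point(1.0e9, VDD_LEVELS, [0.0])
with_bb = best_point(1.0e9, VDD_LEVELS, [0.0, 0.5, 1.0, 1.5, 2.0])
print("without body bias:", no_bb)
print("with body bias:   ", with_bb)
```

In this toy model the extra leakage caused by forward body bias is outweighed by the lower supply it allows, so the search ends up at a lower Vdd than the bias-free case. With a different leakage budget the balance could tip the other way, which is why both knobs are searched jointly.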
Optimizing the power consumption with Body Bias
[Figure: normalized power P and normalized frequency F plotted against each other and against Vdd (V) for the power modes PM1 to PM6]
Conditions: Vdd: 3 supply values; Vbb: from 0V to 1.2V
Akgul, Y., "Power management through DVFS and dynamic body biasing in FD-SOI circuits," Design Automation Conference (DAC), 2014.
Optimizing the power consumption with Body Bias
From Discretely Convex Subset (DCS) with finite Body Bias values... to Piece-Wise Convex Subset (PWCS) with a continuous Body Bias range
[Figure: two panels, "DCS identification" and "PWCS identification", each plotting normalized P vs. normalized F for the power modes PM1 to PM6]
Akgul, Y., "Power management through DVFS and dynamic body biasing in FD-SOI circuits," Design Automation Conference (DAC), 2014.
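One way to picture the identification of a convex subset of power modes is a lower convex hull in the (frequency, power) plane. The sketch below is an assumption about how such a selection could be coded, not the algorithm of the cited paper. The six mode values are invented, and the time-sharing step ignores any transition cost between modes.

```python
def cross(o, a, b):
    """Z component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def lower_convex_subset(modes):
    """Modes (f, p) lying on the lower convex hull; assumes distinct frequencies."""
    hull = []
    for mode in sorted(set(modes)):
        # Drop the previous mode while it lies on or above the new segment.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], mode) <= 0:
            hull.pop()
        hull.append(mode)
    return hull


def average_power(hull, f_target):
    """Average power when time-sharing the two hull modes around f_target."""
    for (f0, p0), (f1, p1) in zip(hull, hull[1:]):
        if f0 <= f_target <= f1:
            share = (f_target - f0) / (f1 - f0)  # fraction of time in the faster mode
            return p0 + share * (p1 - p0)
    raise ValueError("target frequency outside the range of the power modes")


# Hypothetical normalized (F, P) values for six power modes.
MODES = [(0.2, 0.05), (0.4, 0.12), (0.5, 0.30),
         (0.7, 0.35), (0.8, 0.60), (1.0, 1.00)]

hull = lower_convex_subset(MODES)
print("modes kept:", hull)
print("average power at F=0.8:", average_power(hull, 0.8))
```

With these invented values, the mode at F=0.8 is not on the hull: alternating between the modes at F=0.7 and F=1.0 delivers the same average frequency for less average power.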
Variability
PVT variations with very different spatial and temporal dynamics
[Figure: examples of Process, Voltage and Temperature variations; figure sources: LIRMM, www.intel.com]
Keng L. Wong et al., “Enhancing Microprocessor Immunity to Power Supply Noise With Clock-Data Compensation”, IEEE Journal of Solid-State Circuits, vol. 41, no. 4, April 2006
J. Altet et al., “Thermal coupling in integrated circuits: application to thermal testing,” Solid-State Circuits, IEEE Journal of, vol. 36, pp. 81–91, 2001.
P. Li et al., “Efficient full-chip thermal modeling and analysis,” Computer Aided Design, 2004. ICCAD-2004. IEEE/ACM International Conference on, pp. 319–326, 2004.
Rui Zheng et al., “Circuit Aging Prediction for Low-Power Operation”, CICC, 2009
[Figure: MPSoC drawn as 4 rows of 6 blocks, each with its own V/F domain, with GALS communication and three supply levels VH, VM and VL]
Low area overhead per power domain
MPSoC fine-grain power control
Drawbacks:
Many level shifters (leakage, area and latency)
Complex power grid
GALS approach not always easy
Difficult to manage throughput (different frequencies)
Alternative approach using body bias
[Figure: MPSoC drawn as 4 rows of 6 blocks, each with its own Vbb, sharing a single supply V, with GALS]
Only local BB generator
No level shifters
Simpler power grid
GALS approach when needed
Throughput guaranteed (same frequency)
No level shifters
• Blocks with the same power supply but different body bias voltages do not need level shifters
• Modifying the body bias voltage is equivalent to having transistors with a different Vt
• Small safety guards are placed between different body bias domains to avoid parasitic diodes
[Figure: four adjacent Vbb domains]
Simplified power supply
• Body bias current depends on the applied voltage and not on the activity of the circuit
• The current consumption of the body bias is much lower than that of the main power supply
Thin metal wires! The power grid is not congested by the local VBB routing
[Figure: power grid with rails labeled VDD, GND, VDDS and GNDS, and local body bias blocks]
Body bias and leakage dependency
• The leakage power consumption can be reduced by more than 23x using body bias alone
• Replace power gating with body bias domains
• Consider the Break Even Time and the restore time when choosing the technique
• Reverse body bias keeps the state of the logic: no restore time!
[Figure: relative leakage power (log scale, 0.001 to 1) vs. main power supply Vdd (0.5 to 1.1) for 1.1V FBB, 0.6V FBB, 0V FBB and -0.2V FBB (RBB); annotated ratios: 30x and 23x]
Measurements of ARM M0+ on ST28nm FDSOI
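The break-even time mentioned above can be made concrete with a short calculation. In this sketch the break-even time is taken as the idle duration at which the energy spent entering and leaving a low-power state equals the leakage energy it saves. Only the 23x leakage ratio comes from the slide; the idle power, sleep power and transition energies are hypothetical.

```python
def break_even_time(p_idle, p_sleep, e_transition):
    """Idle duration (s) beyond which entering the low-power state saves energy."""
    if p_sleep >= p_idle:
        raise ValueError("the low-power state must consume less than idle")
    return e_transition / (p_idle - p_sleep)


def idle_energy(p_idle, p_sleep, e_transition, t_idle, use_low_power):
    """Energy (J) spent over an idle period, with or without the low-power state."""
    if use_low_power:
        return e_transition + p_sleep * t_idle
    return p_idle * t_idle


P_IDLE = 100e-6  # hypothetical idle leakage power (W)

# Power gating: very low sleep power, but saving and restoring the state
# costs energy (both values hypothetical).
bet_gating = break_even_time(P_IDLE, p_sleep=1e-6, e_transition=2e-6)

# Reverse body bias: leakage divided by 23 (ratio quoted on the slide), state
# retained, so only a small bias-transition energy is assumed (hypothetical).
bet_rbb = break_even_time(P_IDLE, p_sleep=P_IDLE / 23, e_transition=0.1e-6)

print(f"power gating break-even:      {bet_gating * 1e3:.2f} ms")
print(f"reverse body bias break-even: {bet_rbb * 1e3:.2f} ms")
```

Under these assumed numbers reverse body bias pays off for much shorter idle periods than power gating, while power gating still wins for very long idle periods because its sleep power is lower.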
Automatic body bias compensation unit
• Ultra low power and low area body bias generator with automatic compensation
• Body bias generation tracks a sensor target
• Area of 0.0067mm² for a 2mm² design (0.35% area overhead)
• 2.5µW power consumption
• Achieving 50% Leakage Reduction in FDSOI 28nm over 0.35-to-1V VDD Range
[Figure: back bias island with four Vbb domains, a sensor and the BBC block; labeled signals: FTGT, FDIGi, FOP, VNWi, VPWi, VDD]
A. Quelen, “A 2.5μW 0.0067mm2 Automatic Back-Biasing Compensation Unit Achieving 50% Leakage Reduction in FDSOI 28nm over 0.35-to-1V VDD Range”, ISSCC 2018.
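The idea of a body bias that tracks a sensor target can be sketched as a simple feedback loop. Everything here is an assumption made for illustration: the linear sensor model, the choice of a frequency as the tracked quantity, the fixed-step controller and all numeric values. It is not a description of the circuit in the cited paper.

```python
def sensor_frequency(vbb, temperature):
    """Toy sensor model (assumption): faster with forward bias and at high temperature."""
    return 100e6 * (1.0 + 0.3 * vbb) * (1.0 + 0.002 * (temperature - 25.0))


def regulate(f_target, temperature, vbb=0.0, step=0.01,
             vbb_min=-1.0, vbb_max=1.5, iterations=500):
    """Step the body bias up or down until the sensor sits around the target."""
    for _ in range(iterations):
        f_sensor = sensor_frequency(vbb, temperature)
        if f_sensor < f_target:
            vbb = min(vbb + step, vbb_max)   # too slow: more forward bias
        elif f_sensor > f_target:
            vbb = max(vbb - step, vbb_min)   # too fast: less bias, less leakage
    return vbb


F_TARGET = 110e6  # hypothetical target frequency (Hz)
vbb_cold = regulate(F_TARGET, temperature=-40.0)
vbb_hot = regulate(F_TARGET, temperature=85.0)
print(f"settled body bias at -40 C: {vbb_cold:.2f} V")
print(f"settled body bias at  85 C: {vbb_hot:.2f} V")
```

In this toy model the loop applies more forward bias when the sensor is slow and backs off when it is fast, so the bias, and with it the leakage, is never higher than the target requires.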
Energy and leakage gains
Back-biasing compensation minimizes the energy dissipation per logic operation over the full temperature range, especially when VDD ≈ VTH
[Figure: "Energy Gain vs. w/o BBC" (0 to 90) vs. temperature (-40 to 120 °C), curves "w/ BBC, FopMAX" and "w/ BBC, FopFIX", annotations +67% and +83%; VDD = 0.45V, LVT, AF = 5%]
[Figure: "Leakage Gain" (0 to 90) vs. temperature (-40 to 120 °C), curve "LG w/ BBC", annotation +85% Leakage Gain @VDD = 0.45V; VDD = 0.45V, LVT]
[simulations]
Energy optimization
• The body bias and the circuit design influence the energy performance of the circuit
• Leakage has a higher influence when the circuit activity is low
• Choose the correct cells to meet the required performance while using the body bias to reduce the leakage
[Figure: energy/cycle vs. voltage, showing the total energy, switching energy and leakage energy curves and the minimum energy point (MEP)]
Figure annotations:
Influenced by: circuit activity, circuit design, body bias
Influenced by: circuit design, body bias
Influenced by: circuit design, circuit activity, body bias
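The minimum energy point (MEP) and its dependence on circuit activity can be illustrated numerically. The model below is a sketch under assumptions: switching energy per cycle is taken as activity·C·Vdd², leakage energy per cycle as leakage power divided by an alpha-power-law frequency, and every constant is hypothetical.

```python
# All constants are assumptions chosen so that the minimum is visible.
VTH = 0.35       # threshold voltage (V)
ALPHA = 1.5      # alpha-power-law exponent
K_F = 3.0e9      # frequency scaling constant (Hz)
C_EFF = 1.0e-10  # switched capacitance at full activity (F)
I_LEAK = 1.0e-2  # leakage current (A), taken as constant


def energy_per_cycle(vdd, activity):
    """Energy per cycle (J) = switching energy + leakage energy over one period."""
    freq = K_F * (vdd - VTH) ** ALPHA / vdd
    switching = activity * C_EFF * vdd ** 2
    leakage = I_LEAK * vdd / freq
    return switching + leakage


def minimum_energy_point(activity, vdd_sweep):
    """Supply voltage of the sweep that minimizes the energy per cycle."""
    return min(vdd_sweep, key=lambda vdd: energy_per_cycle(vdd, activity))


# Sweep from 0.40 V to 1.30 V in 10 mV steps (always above VTH).
SWEEP = [mv / 1000 for mv in range(400, 1301, 10)]

mep_busy = minimum_energy_point(1.0, SWEEP)
mep_idle = minimum_energy_point(0.1, SWEEP)
print("MEP at full activity:", mep_busy, "V")
print("MEP at 10% activity: ", mep_idle, "V")
```

With lower activity the leakage term weighs more, so the minimum moves to a higher supply voltage where cycles are shorter. This matches the statement that leakage has a higher influence when the circuit activity is low.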
Energy optimization
• Option 1:
  • Design the circuit where the performance is met without FBB
  • FBB is used to boost the performance and also to compensate the variability
• Option 2:
  • Design the circuit where the performance is met with FBB
  • Reducing or removing the FBB reduces the leakage and compensates the variability
Instead of using PB0, PB16 is used with FBB to meet the target frequency
D. Bol, ISSCC’19
Implementation tips
• Don’t forget to close hold timing at low voltage with forward body bias
• Hold-time issues may appear as the forward body bias reduces the propagation delay
• SRAM memories don’t use body bias on the arrays due to stability issues
• SRAM periphery may use body bias
• SRAM array may use a single-well instead of a dual-well approach
• Choose the appropriate PVT corner for STA when closing timing with and without body bias
• Explore the best trade-off when choosing the PVT corners
• Normally-off computing doesn’t have the same energy constraints as an always-on system
• Area overhead of a local body bias generator is extremely low
• Consider multiple body bias domains
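The hold-timing tip can be illustrated with a toy check run with and without forward body bias. This is a deliberately simplified sketch and not an STA flow: it assumes that the body bias speeds up the launch and data path by a single factor while the hold requirement and the clock skew stay unchanged. The paths, delays and speed-up factor are all hypothetical.

```python
def hold_slack(clk_to_q, data_min, hold_time, skew, speedup=1.0):
    """Hold slack (ps); a positive value means the check passes.

    Assumption: only the launch and data delays shrink by `speedup`.
    """
    return (clk_to_q + data_min) / speedup - hold_time - skew


# Hypothetical paths: (clk_to_q, data_min, hold_time, skew), all in ps.
PATHS = {
    "path_a": (60.0, 40.0, 30.0, 50.0),
    "path_b": (80.0, 60.0, 25.0, 40.0),
}

# Hypothetical corners: delay speed-up factor applied by the body bias.
CORNERS = {"no_bb": 1.0, "fbb": 1.3}

violations = [
    (path, corner)
    for path, timing in PATHS.items()
    for corner, speedup in CORNERS.items()
    if hold_slack(*timing, speedup=speedup) < 0
]
print("hold violations:", violations)
```

In this example `path_a` passes without body bias and fails once the forward body bias shortens its data path, which is the situation the tip warns about.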
Conclusions
• FDSOI is well suited for low power design
• Body bias is a knob to manage the trade-off between performance and power
• Local body bias control compensates the local variability; the area overhead is only 0.35%
• Multiple body bias domains don’t need level shifters between domains: simplified implementation
• Reverse body bias can replace power gating while keeping the internal state of the registers
• Meeting the required performance of a circuit with forward body bias may optimize the energy performance of the circuit
References
• R. Wilson, et al. “A 460MHz at 397mV, 2.6GHz at 1.3V, 32b VLIW DSP, embedding FMAX tracking,” ISSCC 2014.
• Y. Akgul, et al. “Power management through DVFS and dynamic body biasing in FD-SOI circuits,” Design Automation Conference (DAC), 2014.
• E. Beigné, et al. “A 460 MHz at 397 mV, 2.6 GHz at 1.3 V, 32 bits VLIW DSP Embedding FMAX Tracking,” JSSC, 2015.
• I. Miro-Panades, et al. “In-situ Fmax/Vmin tracking for energy efficiency and reliability optimization,” IOLTS, 2017.
• A. Quelen, et al. “A 2.5μW 0.0067mm2 Automatic Back-Biasing Compensation Unit Achieving 50% Leakage Reduction in FDSOI 28nm over 0.35-to-1V VDD Range,” ISSCC 2018.
• D. Bol, et al. “A 40-to-80MHz Sub-4μW/MHz ULV Cortex-M0 MCU SoC in 28nm FDSOI with Dual-Loop Adaptive Back-Bias Generator for 20μs Wake-Up From Deep Fully Retentive Sleep Mode,” ISSCC 2019.
Commissariat à l’énergie atomique et aux énergies alternatives | 17 rue des Martyrs | 38054 Grenoble Cedex | www.cea-tech.fr
Public establishment of an industrial and commercial nature | RCS Paris B 775 685 019