Post on 20-Dec-2015
![Page 1: Power Modeling and Architecture Evaluation for FPGA with Novel Circuits for Vdd Programmability Yan Lin, Fei Li and Lei He EE Department, UCLA LHE@ee.ucla.edu](https://reader036.vdocuments.us/reader036/viewer/2022081519/56649d4d5503460f94a2c73c/html5/thumbnails/1.jpg)
Power Modeling and Architecture Evaluation for
FPGA with Novel Circuits for Vdd Programmability
Yan Lin, Fei Li and Lei He
EE Department, UCLA
Partially supported by NSF.
![Page 2: Power Modeling and Architecture Evaluation for FPGA with Novel Circuits for Vdd Programmability Yan Lin, Fei Li and Lei He EE Department, UCLA LHE@ee.ucla.edu](https://reader036.vdocuments.us/reader036/viewer/2022081519/56649d4d5503460f94a2c73c/html5/thumbnails/2.jpg)
Overview
FPGA architecture evaluation
  Area and delay [Rose et al, JSSC’90]
  Power [Poon et al, FPLA’02][Li et al, FPGA’03]
Vdd programmability for power reduction
  Concept in [FPGA’03]
  Application to logic [FPGA’04][DAC’04]
  Application to interconnects [ICCAD’04][Anderson et al, ICCAD’04]
Novel circuits and architecture evaluation for FPGAs with Vdd programmability
  Reduce power by 50% with 17% area and 3% delay increase
![Page 3: Power Modeling and Architecture Evaluation for FPGA with Novel Circuits for Vdd Programmability Yan Lin, Fei Li and Lei He EE Department, UCLA LHE@ee.ucla.edu](https://reader036.vdocuments.us/reader036/viewer/2022081519/56649d4d5503460f94a2c73c/html5/thumbnails/3.jpg)
Outline
Power modeling and architecture evaluation methodology
FPGA Circuits for Vdd Programmability
Architecture Evaluation with Vdd programmability
Conclusions and Ongoing Work
![Page 4: Power Modeling and Architecture Evaluation for FPGA with Novel Circuits for Vdd Programmability Yan Lin, Fei Li and Lei He EE Department, UCLA LHE@ee.ucla.edu](https://reader036.vdocuments.us/reader036/viewer/2022081519/56649d4d5503460f94a2c73c/html5/thumbnails/4.jpg)
Framework fpgaEva-LP
[Figure: fpgaEva-LP flow. Inputs: benchmark circuits and architecture specification (Arch Spec). Stages: Logic Optimization (SIS), Tech-Mapping (RASP), Timing-Driven Packing (T-VPack), Placement & Routing (VPR), Parasitic Extraction, Cycle-accurate Power Simulator. Outputs: area, delay and power.]
![Page 5: Power Modeling and Architecture Evaluation for FPGA with Novel Circuits for Vdd Programmability Yan Lin, Fei Li and Lei He EE Department, UCLA LHE@ee.ucla.edu](https://reader036.vdocuments.us/reader036/viewer/2022081519/56649d4d5503460f94a2c73c/html5/thumbnails/5.jpg)
FPGA Structure and Models
Cluster-based island-style FPGA structure
  100% buffered interconnects, subset switch block
  Input Fc = 50%, output Fc = 25%
Area and delay models similar to [Betz-Rose-Marquardt]
  But based on layout and SPICE for 100nm and below
Mixed-level power model from [FPGA’03]
  Dynamic power
    Capacitive power (functional switch and glitch)
    Short-circuit power (depends on signal transition time)
  Static power
    Sub-threshold leakage
    Reverse-biased leakage
    Gate leakage
![Page 6: Power Modeling and Architecture Evaluation for FPGA with Novel Circuits for Vdd Programmability Yan Lin, Fei Li and Lei He EE Department, UCLA LHE@ee.ucla.edu](https://reader036.vdocuments.us/reader036/viewer/2022081519/56649d4d5503460f94a2c73c/html5/thumbnails/6.jpg)
New Power Model in fpgaEva-LP2
Short-circuit power
  Modeled from switching time * switching power
  fpgaEva-LP used the average signal transition time
  fpgaEva-LP2 calculates the transition time for each buffer as $t_r = \alpha \cdot t_{buffer}$, where $t_{buffer}$ is the buffer delay
  $\alpha$ is NOT a constant 2 as in the literature, due to input slew
  $\alpha$ is pre-characterized by SPICE

| Buffer delay | < 0.012 ns | < 0.03 ns | > 0.03 ns |
|---|---|---|---|
| $\alpha$ | 2 | 4.4 | 7 |
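As a rough illustration of the per-buffer transition-time model on this slide, the sketch below looks up α from the buffer delay and returns α times that delay. The function names are hypothetical, and the handling of delays exactly at 0.012 ns and 0.03 ns is an assumption, since the slide does not say which bin the boundaries belong to.

```python
def alpha_for_delay(buffer_delay_ns: float) -> float:
    """Pick alpha for a buffer delay, using the three bins listed on the slide."""
    if buffer_delay_ns < 0.012:
        return 2.0
    if buffer_delay_ns < 0.03:
        return 4.4
    # Assumption: a delay of exactly 0.03 ns is placed in the last bin.
    return 7.0


def transition_time_ns(buffer_delay_ns: float) -> float:
    """Transition time modeled as alpha times the buffer delay."""
    return alpha_for_delay(buffer_delay_ns) * buffer_delay_ns


# Illustrative delays only; they are not taken from the slides.
for delay in (0.010, 0.020, 0.050):
    print(f"delay={delay:.3f} ns  alpha={alpha_for_delay(delay)}  "
          f"t_r={transition_time_ns(delay):.3f} ns")
```

A simulator built this way would call `transition_time_ns` once per buffer instead of using one average transition time for the whole design.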
![Page 7: Power Modeling and Architecture Evaluation for FPGA with Novel Circuits for Vdd Programmability Yan Lin, Fei Li and Lei He EE Department, UCLA LHE@ee.ucla.edu](https://reader036.vdocuments.us/reader036/viewer/2022081519/56649d4d5503460f94a2c73c/html5/thumbnails/7.jpg)
Validation Using SPICE
Validate by comparison for each power component
High fidelity with average absolute error of 8%
[Figure: FPGA power (watt) for benchmark circuits b1, parity, cm138a, z4ml and decode, comparing SPICE simulation, fpgaEva-LP and fpgaEva-LP2]
![Page 8: Power Modeling and Architecture Evaluation for FPGA with Novel Circuits for Vdd Programmability Yan Lin, Fei Li and Lei He EE Department, UCLA LHE@ee.ucla.edu](https://reader036.vdocuments.us/reader036/viewer/2022081519/56649d4d5503460f94a2c73c/html5/thumbnails/8.jpg)
Impact of Random Seeds in VPR
[Figure: FPGA energy (nJ/cycle) versus critical path delay (ns) for circuit s38584 across 10 VPR runs with different random seeds, annotated with +12% delay and +5% energy]
12% delay variation and 5% energy variation
Min-delay solution among 10 runs is used
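The seed-selection step can be sketched as follows: run placement and routing with several seeds, keep the run with the smallest critical-path delay, and report the relative spread. The run data below is hypothetical and only illustrates the bookkeeping; it is not the s38584 data from the slide.

```python
# Hypothetical (seed, critical-path delay in ns, energy in nJ/cycle) results.
# These values are illustrative and are not taken from the slides.
runs = [
    (1, 10.4, 5.50),
    (2, 11.0, 5.30),
    (3, 11.6, 5.45),
    (4, 10.8, 5.40),
]


def spread(values):
    """Relative spread of a list of positive values: (max - min) / min."""
    return (max(values) - min(values)) / min(values)


# Keep the run with the minimum critical-path delay.
best_seed, best_delay, best_energy = min(runs, key=lambda run: run[1])

delay_spread = spread([run[1] for run in runs])
energy_spread = spread([run[2] for run in runs])

print(f"selected seed {best_seed}: {best_delay} ns, {best_energy} nJ/cycle")
print(f"delay spread {delay_spread:.1%}, energy spread {energy_spread:.1%}")
```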
![Page 9: Power Modeling and Architecture Evaluation for FPGA with Novel Circuits for Vdd Programmability Yan Lin, Fei Li and Lei He EE Department, UCLA LHE@ee.ucla.edu](https://reader036.vdocuments.us/reader036/viewer/2022081519/56649d4d5503460f94a2c73c/html5/thumbnails/9.jpg)
Evaluation of Single-Vdd FPGAs
Architectures explored
  Cluster size N = {6, 8, 10, 12}
  LUT size k = {3, 4, 5, 6, 7}
Energy-delay (ED) dominant architectures
  Architectures with smaller delay or less energy (compared to any other architecture)
  Relaxed ED dominant set may also be valuable
[Figure: total FPGA energy (nJ/cycle) versus critical path delay (ns), one labeled point per architecture (N, k) for N in {6, 8, 10, 12} and k in {3, 4, 5, 6, 7}]
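One way to read the ED-dominant definition is as a Pareto filter: an architecture stays in the set if no other architecture is at least as good in both delay and energy and strictly better in one. The sketch below assumes that reading. The function name and the sample (delay, energy) values are hypothetical and are not read off the plot.

```python
from typing import Dict, List, Tuple

Arch = Tuple[int, int]  # (cluster size N, LUT size k)


def ed_dominant(points: Dict[Arch, Tuple[float, float]]) -> List[Arch]:
    """Return architectures that no other architecture beats in both delay and energy."""
    result = []
    for arch, (delay, energy) in points.items():
        dominated = any(
            other_delay <= delay and other_energy <= energy
            and (other_delay < delay or other_energy < energy)
            for other, (other_delay, other_energy) in points.items()
            if other != arch
        )
        if not dominated:
            result.append(arch)
    # Sort by delay so the set reads from fastest to slowest.
    return sorted(result, key=lambda arch: points[arch][0])


# Hypothetical (delay in ns, energy in nJ/cycle) values for illustration only.
sample = {
    (8, 7): (10.0, 8.0),
    (8, 5): (11.0, 5.5),
    (10, 4): (12.5, 4.7),
    (6, 3): (16.0, 6.0),
}

print(ed_dominant(sample))
```

In this sample, (6, 3) drops out because (10, 4) is both faster and lower in energy. A relaxed set could be obtained by allowing a small tolerance in the comparisons.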
![Page 10: Power Modeling and Architecture Evaluation for FPGA with Novel Circuits for Vdd Programmability Yan Lin, Fei Li and Lei He EE Department, UCLA LHE@ee.ucla.edu](https://reader036.vdocuments.us/reader036/viewer/2022081519/56649d4d5503460f94a2c73c/html5/thumbnails/10.jpg)
Energy versus Delay
For 100nm ITRS technology
  Min-energy arch (N,k) = (10,4) or (8,4)
  Min-delay arch (N,k) = (8,7)
    0.8x delay but 1.7x power
Figure annotation: current commercial architecture
[Figure: total FPGA energy (nJ/cycle) versus critical path delay (ns), one labeled point per architecture (N, k)]
![Page 11: Power Modeling and Architecture Evaluation for FPGA with Novel Circuits for Vdd Programmability Yan Lin, Fei Li and Lei He EE Department, UCLA LHE@ee.ucla.edu](https://reader036.vdocuments.us/reader036/viewer/2022081519/56649d4d5503460f94a2c73c/html5/thumbnails/11.jpg)
Outline
Power modeling and evaluation methodology
FPGA Circuits for Vdd Programmability
Architecture Evaluation with Vdd programmability
Conclusions and Ongoing Work
![Page 12: Power Modeling and Architecture Evaluation for FPGA with Novel Circuits for Vdd Programmability Yan Lin, Fei Li and Lei He EE Department, UCLA LHE@ee.ucla.edu](https://reader036.vdocuments.us/reader036/viewer/2022081519/56649d4d5503460f94a2c73c/html5/thumbnails/12.jpg)
Vdd-programmable FPGA [DAC’04][ICCAD’04]
Vdd-programmable logic block
  Vdd selection
  Power-gating unused blocks
![Page 13: Power Modeling and Architecture Evaluation for FPGA with Novel Circuits for Vdd Programmability Yan Lin, Fei Li and Lei He EE Department, UCLA LHE@ee.ucla.edu](https://reader036.vdocuments.us/reader036/viewer/2022081519/56649d4d5503460f94a2c73c/html5/thumbnails/13.jpg)
Vdd-programmable FPGA [FPGA’04][ICCAD’04]
Vdd-programmable logic block
  Vdd selection
  Power-gating unused blocks
Vdd-programmable switch
Vdd-level conversion is needed when VddL drives VddH
  To avoid excessive leakage
![Page 14: Power Modeling and Architecture Evaluation for FPGA with Novel Circuits for Vdd Programmability Yan Lin, Fei Li and Lei He EE Department, UCLA LHE@ee.ucla.edu](https://reader036.vdocuments.us/reader036/viewer/2022081519/56649d4d5503460f94a2c73c/html5/thumbnails/14.jpg)
Vdd-programmable Routing Switch
Conventional routing switch
Vdd-programmable routing switch
  Brute-force design [ICCAD’04]
    Two extra SRAM cells for each routing switch
  New design
    One extra SRAM cell
    NAND2 gate: minimum size & high-Vt transistor
![Page 15: Power Modeling and Architecture Evaluation for FPGA with Novel Circuits for Vdd Programmability Yan Lin, Fei Li and Lei He EE Department, UCLA LHE@ee.ucla.edu](https://reader036.vdocuments.us/reader036/viewer/2022081519/56649d4d5503460f94a2c73c/html5/thumbnails/15.jpg)
Vdd-Programmable Interconnect Connection Block
New design
  Only TWO extra SRAM cells for n connection switches
  Control logic includes 2n NAND2 gates and a decoder
Brute-force design [ICCAD’04]
  2n extra SRAM cells for n connection switches
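The configuration-cell counts quoted for the two connection-block designs can be compared with a short calculation. This is only a bookkeeping sketch: the function names and the example value of n are hypothetical, and the decoder is not modeled.

```python
def extra_sram_cells(n_switches: int, design: str) -> int:
    """Extra SRAM cells for a connection block with n connection switches."""
    if design == "brute_force":
        return 2 * n_switches  # two extra cells per connection switch
    if design == "new":
        return 2  # fixed count, independent of n
    raise ValueError(f"unknown design: {design}")


def nand2_gates_new_design(n_switches: int) -> int:
    """NAND2 gates in the new design's control logic (decoder not counted)."""
    return 2 * n_switches


n = 16  # hypothetical number of connection switches
print("brute force:", extra_sram_cells(n, "brute_force"), "extra SRAM cells")
print("new design: ", extra_sram_cells(n, "new"), "extra SRAM cells,",
      nand2_gates_new_design(n), "NAND2 gates plus a decoder")
```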
![Page 16: Power Modeling and Architecture Evaluation for FPGA with Novel Circuits for Vdd Programmability Yan Lin, Fei Li and Lei He EE Department, UCLA LHE@ee.ucla.edu](https://reader036.vdocuments.us/reader036/viewer/2022081519/56649d4d5503460f94a2c73c/html5/thumbnails/16.jpg)
Power and Delay
Vdd-programmable switch uses
  4X PMOS power transistor for 7X routing switch
  1X PMOS power transistor for 4X connection switch
Compared to conventional switch
  1000X less leakage power
Connection box is 28% faster and has 18% less dynamic power
  By moving the mux off the critical path of the connection box
Switch delay and energy per switch (Vdd = 1.3 V)

| Type | Delay w/o power transistor (s) | Delay w/ power transistor (s) | Energy w/o power transistor (J) | Energy w/ power transistor (J) |
|---|---|---|---|---|
| Routing | 5.9E-11 | 6.5E-11 (+11%) | 3.3E-14 | 3.2E-14 (-2%) |
| Connection | 2.9E-10 | 2.1E-10 (-28%) | 3.8E-14 | 3.1E-14 (-18%) |
![Page 17: Power Modeling and Architecture Evaluation for FPGA with Novel Circuits for Vdd Programmability Yan Lin, Fei Li and Lei He EE Department, UCLA LHE@ee.ucla.edu](https://reader036.vdocuments.us/reader036/viewer/2022081519/56649d4d5503460f94a2c73c/html5/thumbnails/17.jpg)
Vdd-gateable Routing Switch
Vdd-gateable: two states
  Normal Vdd or power-gating
Enable power-gating capability w/o extra SRAM cells
Can be replaced by tri-state buffer
[Figure labels: Conventional; Power transistor]
![Page 18: Power Modeling and Architecture Evaluation for FPGA with Novel Circuits for Vdd Programmability Yan Lin, Fei Li and Lei He EE Department, UCLA LHE@ee.ucla.edu](https://reader036.vdocuments.us/reader036/viewer/2022081519/56649d4d5503460f94a2c73c/html5/thumbnails/18.jpg)
Vdd-gateable Connection Block
Enable power-gating capability w/ only one extra SRAM cell and a low-leakage decoder
[Figure: conventional versus Vdd-gateable connection block]
![Page 19: Power Modeling and Architecture Evaluation for FPGA with Novel Circuits for Vdd Programmability Yan Lin, Fei Li and Lei He EE Department, UCLA LHE@ee.ucla.edu](https://reader036.vdocuments.us/reader036/viewer/2022081519/56649d4d5503460f94a2c73c/html5/thumbnails/19.jpg)
Outline
Power modeling and evaluation methodology
FPGA Circuits for Vdd Programmability
Architecture Evaluation with Vdd programmability
Conclusions and Ongoing Work
![Page 20: Power Modeling and Architecture Evaluation for FPGA with Novel Circuits for Vdd Programmability Yan Lin, Fei Li and Lei He EE Department, UCLA LHE@ee.ucla.edu](https://reader036.vdocuments.us/reader036/viewer/2022081519/56649d4d5503460f94a2c73c/html5/thumbnails/20.jpg)
FPGA Architecture Classes

| Architecture Class | Logic Block | Interconnect |
|---|---|---|
| Class0 (baseline) | single-Vdd | single-Vdd |
| Class1 | programmable dual-Vdd | programmable dual-Vdd, level converters in routing |
| Class2 | programmable dual-Vdd | VddH and Vdd-gateable |
| Class3 | programmable dual-Vdd | Class1, but no level converters in routing |

High-Vt is applied to configuration SRAM cells for all the classes
![Page 21: Power Modeling and Architecture Evaluation for FPGA with Novel Circuits for Vdd Programmability Yan Lin, Fei Li and Lei He EE Department, UCLA LHE@ee.ucla.edu](https://reader036.vdocuments.us/reader036/viewer/2022081519/56649d4d5503460f94a2c73c/html5/thumbnails/21.jpg)
Vdd-level Converters
Class3 removes Vdd-level converters from interconnects in Class1
  With the constraint that no VddL drives VddH
We developed a routing algorithm in which each routing tree has a single Vdd level
  But trees with different Vdd levels can share the same wire track
Alternative approaches:
  Combined Vdd-level converter and buffer [Anderson et al, ICCAD’04]
  Our new work [DAC’05] allows dual Vdd in a tree, with chip-level time slack budgeting for extra power reduction
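The Class3 constraint can be expressed as a small check on a routing tree: a level converter is needed only where a VddL stage drives a VddH stage, so a tree whose nodes all share one Vdd level never needs one. The sketch below is only an illustration of that rule. The data layout (a node-to-Vdd map plus driver/sink edges) and all names are assumptions, not the authors' router.

```python
from typing import Dict, List, Tuple

VDDH, VDDL = "VddH", "VddL"


def needs_level_converter(driver_vdd: str, sink_vdd: str) -> bool:
    """A converter is needed only when a VddL stage drives a VddH stage."""
    return driver_vdd == VDDL and sink_vdd == VDDH


def tree_is_single_vdd(node_vdd: Dict[str, str]) -> bool:
    """True when every node of the routing tree uses the same Vdd level."""
    return len(set(node_vdd.values())) <= 1


def converter_free(edges: List[Tuple[str, str]], node_vdd: Dict[str, str]) -> bool:
    """True when no (driver, sink) edge has VddL driving VddH."""
    return not any(
        needs_level_converter(node_vdd[driver], node_vdd[sink])
        for driver, sink in edges
    )


# Hypothetical three-node routing tree: source -> switch -> sink.
edges = [("src", "sw1"), ("sw1", "sink")]
low_tree = {"src": VDDL, "sw1": VDDL, "sink": VDDL}
bad_tree = {"src": VDDL, "sw1": VDDH, "sink": VDDH}

print(tree_is_single_vdd(low_tree), converter_free(edges, low_tree))
print(tree_is_single_vdd(bad_tree), converter_free(edges, bad_tree))
```

A single-Vdd tree always passes `converter_free`, which is why the per-tree restriction is sufficient to drop the converters.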
![Page 22: Power Modeling and Architecture Evaluation for FPGA with Novel Circuits for Vdd Programmability Yan Lin, Fei Li and Lei He EE Department, UCLA LHE@ee.ucla.edu](https://reader036.vdocuments.us/reader036/viewer/2022081519/56649d4d5503460f94a2c73c/html5/thumbnails/22.jpg)
Energy versus Delay
ED-product reduction
  20% by Class1 (Vdd-programmable interconnects w/ level converters)
  45% by Class2 (Vdd-gateable interconnects)
  50% by Class3 (Class1 minus level converters)
Performance degrades 3% due to Vdd programmability
[Figure: total FPGA energy per cycle (nJ) versus critical path delay (ns) for (N, k) architectures in Class 0, Class 1, Class 2 and Class 3. Annotations: LUT 4, low energy; LUT 7, high performance.]
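For readers unfamiliar with the metric, the ED-product reduction quoted on this slide can be computed as one minus the ratio of energy times delay between a candidate and the baseline. The sketch below shows the arithmetic. The function names and the sample (energy, delay) pairs are hypothetical and do not correspond to any architecture in the plot.

```python
from typing import Tuple

Point = Tuple[float, float]  # (energy in nJ/cycle, critical-path delay in ns)


def ed_product(point: Point) -> float:
    """Energy-delay product of one design point."""
    energy, delay = point
    return energy * delay


def ed_reduction(baseline: Point, candidate: Point) -> float:
    """Fractional ED-product reduction of candidate relative to baseline."""
    return 1.0 - ed_product(candidate) / ed_product(baseline)


# Hypothetical design points for illustration only.
baseline = (5.0, 12.0)
candidate = (3.0, 12.5)

print(f"ED reduction: {ed_reduction(baseline, candidate):.1%}")
```

Because delay enters the product, a design can lose a few percent of performance and still show a large ED reduction if its energy drops enough.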
![Page 23: Power Modeling and Architecture Evaluation for FPGA with Novel Circuits for Vdd Programmability Yan Lin, Fei Li and Lei He EE Department, UCLA LHE@ee.ucla.edu](https://reader036.vdocuments.us/reader036/viewer/2022081519/56649d4d5503460f94a2c73c/html5/thumbnails/23.jpg)
Energy versus Area
[Figure: total FPGA energy per cycle (nJ) versus total FPGA device area for (N, k) architectures in Class0, Class1, Class2 and Class3. Annotations: Min-area, Min-energy.]
Average area overhead
  118% for Class1 (Vdd-programmable interconnects w/ level converters)
  17% for Class2 (Vdd-gateable interconnects)
  52% for Class3 (Vdd-programmable interconnects w/o level converters)
Class2 is the best considering both energy and area
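The trade-off can be made concrete by pairing the area overheads listed here with the ED-product reductions reported on the Energy versus Delay slide (20%, 45% and 50%). The ranking metric below, ED reduction per unit of area overhead, is an assumption chosen for illustration; the slides do not say how energy and area were weighed.

```python
# ED-product reduction and average area overhead per class, as fractions.
# The figures are the ones quoted in the slides; the metric below is an assumption.
classes = {
    "Class1": {"ed_reduction": 0.20, "area_overhead": 1.18},
    "Class2": {"ed_reduction": 0.45, "area_overhead": 0.17},
    "Class3": {"ed_reduction": 0.50, "area_overhead": 0.52},
}


def benefit_per_area(entry: dict) -> float:
    """ED-product reduction obtained per unit of added area."""
    return entry["ed_reduction"] / entry["area_overhead"]


ranked = sorted(classes, key=lambda name: benefit_per_area(classes[name]), reverse=True)

for name in ranked:
    print(f"{name}: {benefit_per_area(classes[name]):.2f} ED reduction per unit area overhead")
```

Under this metric Class2 ranks first, and Class1 is worse than both alternatives on each axis (less ED reduction at more area).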
![Page 24: Power Modeling and Architecture Evaluation for FPGA with Novel Circuits for Vdd Programmability Yan Lin, Fei Li and Lei He EE Department, UCLA LHE@ee.ucla.edu](https://reader036.vdocuments.us/reader036/viewer/2022081519/56649d4d5503460f94a2c73c/html5/thumbnails/24.jpg)
Energy Breakdown
Class2 and Class3 dramatically reduce global interconnect leakage
But Class1 fails due to leakage in Vdd-level converters
[Figure: total FPGA energy (nJ/cycle) for Class0, Class1, Class2 and Class3 at FPGA architecture (N,k) = (12,4), broken down by energy component]

Energy share by component for each class

| Energy component | Class0 | Class1 | Class2 | Class3 |
|---|---|---|---|---|
| Logic leakage | 2.94% | 2.70% | 4.07% | 4.40% |
| Logic dynamic | 3.71% | 3.04% | 3.92% | 4.32% |
| Local interconnect leakage | 16.03% | 26.22% | 39.69% | 42.93% |
| Local interconnect dynamic | 8.09% | 7.43% | 9.81% | 10.81% |
| Global interconnect leakage | 49.89% | 42.84% | 4.88% | 5.85% |
| Global interconnect dynamic | 19.33% | 17.77% | 37.62% | 31.70% |
![Page 25: Power Modeling and Architecture Evaluation for FPGA with Novel Circuits for Vdd Programmability Yan Lin, Fei Li and Lei He EE Department, UCLA LHE@ee.ucla.edu](https://reader036.vdocuments.us/reader036/viewer/2022081519/56649d4d5503460f94a2c73c/html5/thumbnails/25.jpg)
Area Overhead
[Figure: FPGA area overhead (axis 0% to 20%) for Class2: Vdd-gateable interconnects + Vdd-programmable CLBs, at (N, k) = (12, 4). Legend: Power Transistors & SRAMs (CLBs); Vdd-level Converters (CLBs); Control (Connection Blocks); Power Transistors (Connection Blocks); SRAMs (Connection Blocks); Power Transistors (Routing Switches). Segment values: 3.87%, 0.60%, 4.96%, 4.82%, 1.80%, 1.39%.]
Overhead by resource: Routing Switches 3.87%, Connection Blocks 10.38%, Logic Blocks 3.19%
17% = 9% for power transistors + 5% for control + 2% for SRAM
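As a quick consistency check, the per-resource overheads listed on this slide can be summed to recover the headline figure. The dictionary below only restates the three group totals from the slide; the variable names are hypothetical.

```python
# Area overhead by resource group, in percent, as listed on the slide.
overhead_pct = {
    "Routing Switches": 3.87,
    "Connection Blocks": 10.38,
    "Logic Blocks": 3.19,
}

total_pct = sum(overhead_pct.values())

for group, value in overhead_pct.items():
    print(f"{group}: {value:.2f}% ({value / total_pct:.0%} of the overhead)")
print(f"Total: {total_pct:.2f}%")
```

The total comes to about 17.4%, consistent with the rounded 17% quoted for Class2, and connection blocks account for the largest share.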
![Page 26: Power Modeling and Architecture Evaluation for FPGA with Novel Circuits for Vdd Programmability Yan Lin, Fei Li and Lei He EE Department, UCLA LHE@ee.ucla.edu](https://reader036.vdocuments.us/reader036/viewer/2022081519/56649d4d5503460f94a2c73c/html5/thumbnails/26.jpg)
Conclusions and New Results
Field programmability is needed for fine-grained dual-Vdd and Vdd-gating in FPGA
Vdd-gating offers a better area-power tradeoff than Vdd-selection
  45% energy-delay product reduction with 17% area overhead
Architecture with Vdd-programmability
  LUT size 4: low energy and area
  LUT size 7: best performance
New results [DAC’05]
  Time slack allocation for Vdd-programmable interconnects
  Device and architecture co-optimization for 77% energy-delay reduction
![Page 27: Power Modeling and Architecture Evaluation for FPGA with Novel Circuits for Vdd Programmability Yan Lin, Fei Li and Lei He EE Department, UCLA LHE@ee.ucla.edu](https://reader036.vdocuments.us/reader036/viewer/2022081519/56649d4d5503460f94a2c73c/html5/thumbnails/27.jpg)
References and Download
All references and tools at http://eda.ee.ucla.edu
Results in the slides have been updated compared to the paper in ISFPGA’05