![Page 1: FPGA Interconnect Planning](https://reader033.vdocuments.us/reader033/viewer/2022050805/55cf8fe8550346703ba11f72/html5/thumbnails/1.jpg)
FPGA Interconnect Planning
Amit Singh
Xilinx Inc., San Jose, CA
Malgorzata Marek-Sadowska
University of California, Santa Barbara, Santa Barbara, CA
![Page 2: FPGA Interconnect Planning](https://reader033.vdocuments.us/reader033/viewer/2022050805/55cf8fe8550346703ba11f72/html5/thumbnails/2.jpg)
Outline
Introduction
Previous Work
Architecture Model
• Architecture & Design Rent’s exponent
Clustering
• Spatial regularity
Fanout distribution
• Area minimization
• Area-delay minimization
• Performance
Conclusions
![Page 3: FPGA Interconnect Planning](https://reader033.vdocuments.us/reader033/viewer/2022050805/55cf8fe8550346703ba11f72/html5/thumbnails/3.jpg)
Introduction
Clustered FPGAs
• System-on-Chip (SOC)
Performance, Power, Area
• Matching design and architecture complexities
• Circuit packing/clustering
Fanout distribution (Zarkesh-Ha et al. 2000)
• Rent’s rule
Segment length planning in FPGAs
• Area reduction
• Area-Delay product minimization
![Page 4: FPGA Interconnect Planning](https://reader033.vdocuments.us/reader033/viewer/2022050805/55cf8fe8550346703ba11f72/html5/thumbnails/4.jpg)
Previous Work
Rent’s rule
• Landman, Russo (1971), Donath (1979)
Interconnect distribution model
• Davis et al. (1998), Zarkesh-Ha et al. (2000)
• Local, semi-global, global wiring requirement
Homogeneous, heterogeneous systems
FPGA segment length distribution
• Betz et al. (1999)
• Type of routing switches
• Impact of segment length distribution on area-delay product
Best area-delay product: mix of length 4 and 8 segments
![Page 5: FPGA Interconnect Planning](https://reader033.vdocuments.us/reader033/viewer/2022050805/55cf8fe8550346703ba11f72/html5/thumbnails/5.jpg)
Clustered FPGAs
Clusters of LEs, connection boxes and switch-boxes
Regular 2-D mesh array
Example: Xilinx Virtex series, Altera APEX & Stratix
[Figure: a cluster of LEs and a single logic element]
![Page 6: FPGA Interconnect Planning](https://reader033.vdocuments.us/reader033/viewer/2022050805/55cf8fe8550346703ba11f72/html5/thumbnails/6.jpg)
Routing Model
[Figure: routing model with clusters, switch boxes (S-Box), and connection boxes (C-Box); labels W and Segment Length]
![Page 7: FPGA Interconnect Planning](https://reader033.vdocuments.us/reader033/viewer/2022050805/55cf8fe8550346703ba11f72/html5/thumbnails/7.jpg)
Routing Model
Buffered routing switches
Buffer chain delay
Pass-transistor chain delay
![Page 8: FPGA Interconnect Planning](https://reader033.vdocuments.us/reader033/viewer/2022050805/55cf8fe8550346703ba11f72/html5/thumbnails/8.jpg)
How much interconnect?
~80% of FPGA area = interconnects
Routing resource utilization (RRU) is low
• 100% logic utilization
Depopulating logic clusters
• Regularity
Interconnect complexity guided fanout distribution (Rent’s rule)
• Average fanout
Segment lengths (shorter segments or longer segments?)
Switch type (tri-state buffers or pass transistors?)
![Page 9: FPGA Interconnect Planning](https://reader033.vdocuments.us/reader033/viewer/2022050805/55cf8fe8550346703ba11f72/html5/thumbnails/9.jpg)
Rent’s Rule
Rent’s rule (Landman and Russo, 1971) relates the average number of terminals T to the number of blocks B per module in a partitioned design:
$T = t B^{p}$
p = Rent exponent
t ≅ average number of terminals per block
The Rent exponent is a measure of the complexity of the interconnection topology:
(simple) 0 ≤ p ≤ 1 (complex)
Typical values: 0.5 ≤ p ≤ 0.75
[Figure: log-log plot of T versus B, showing the average and the Rent’s rule fit]
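As a minimal sketch of how the two Rent parameters might be estimated, the Python below fits $T = t B^{p}$ by least squares on a log-log scale, which is where the relation becomes a straight line. The function name `fit_rent` and the sample data are illustrative assumptions and do not come from the slides; the data is synthetic, generated with t = 4 and p = 0.6.

```python
import math


def fit_rent(blocks, terminals):
    """Fit T = t * B**p by linear regression of log T on log B.

    Returns (t, p). Assumes at least two distinct positive block counts.
    """
    xs = [math.log(b) for b in blocks]
    ys = [math.log(t) for t in terminals]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    p = sxy / sxx                          # slope of the log-log line
    t = math.exp(mean_y - p * mean_x)      # intercept, mapped back from log space
    return t, p


# Synthetic (hypothetical) partition data that follows the rule exactly.
blocks = [1, 10, 100, 1000]
terminals = [4.0 * b ** 0.6 for b in blocks]

t_hat, p_hat = fit_rent(blocks, terminals)
print(f"t = {t_hat:.3f}, p = {p_hat:.3f}")
```

On measured partition data the points scatter around the line, so the fitted p is an average complexity rather than an exact property of the design.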
![Page 10: FPGA Interconnect Planning](https://reader033.vdocuments.us/reader033/viewer/2022050805/55cf8fe8550346703ba11f72/html5/thumbnails/10.jpg)
Rent’s Rule
Definitions
• Pd – Rent’s parameter for Design
• Pa – Rent’s parameter for Architecture
[Figure: routing resource utilization versus Rent’s parameter, with Pa = 0.64 marked]
![Page 11: FPGA Interconnect Planning](https://reader033.vdocuments.us/reader033/viewer/2022050805/55cf8fe8550346703ba11f72/html5/thumbnails/11.jpg)
Logic Clustering
• Separation : Sum of all terminals of nets incident to LE
• Degree : Number of nets incident to LE
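[Figure: example with nodes X, Y, C and nets x, y, z, t]
[Equation: gain function G(X), involving the net weight w(x), n, k, and α]
Cluster I/O bound: $IO \le (k+1)\, n^{P_a}$
Seed metric: $c = \dfrac{\text{separation}}{d^{2}}$, where $d$ is the degree
• Net weight: $w(x) = 2/r$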
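The cluster I/O bound on this slide can be illustrated with a short calculation. The sketch below assumes the bound reads $IO \le (k+1)\, n^{P_a}$, that is, Rent's rule applied to a cluster of n LEs with k + 1 terminals each (k inputs and one output). That reading, the function name `cluster_io_bound`, and the sample values k = 4 and n = 8 are assumptions made for illustration; only Pa = 0.64 appears in the slides.

```python
import math


def cluster_io_bound(k: int, n: int, p_a: float) -> int:
    """Largest integer pin count satisfying IO <= (k + 1) * n**p_a.

    k   : number of LUT inputs per LE (each LE is assumed to have k + 1 terminals)
    n   : number of LEs per cluster
    p_a : Rent's parameter of the architecture
    """
    return math.floor((k + 1) * n ** p_a)


# Hypothetical cluster: eight 4-input LEs, with Pa = 0.64.
bound = cluster_io_bound(k=4, n=8, p_a=0.64)
print(bound)
```

With p_a = 1 the bound degenerates to (k + 1) * n, meaning every LE terminal is brought out of the cluster and no net is absorbed.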
![Page 12: FPGA Interconnect Planning](https://reader033.vdocuments.us/reader033/viewer/2022050805/55cf8fe8550346703ba11f72/html5/thumbnails/12.jpg)
Clustering: Seed selection
A: degree = 4; separation = 18; c = 1.125; nets absorbed = 1
B: degree = 4; separation = 8; c = 0.5; nets absorbed = 4
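The seed metric can be sketched in a few lines of Python, using the definitions from the Logic Clustering slide: degree is the number of nets incident to an LE, separation is the sum of the terminal counts of those nets, and c = separation / degree². The netlist representation (a dict from net name to its list of terminals), the function name `seed_metric`, and the LE and net names are hypothetical. The first example is built to match the slide's "degree = 4; separation = 8" case; the second is an invented comparison point.

```python
def seed_metric(nets, le):
    """Return (degree, separation, c) for one logic element.

    nets : dict mapping net name -> list of terminals (LE names) on that net
    le   : name of the logic element; it must touch at least one net
    """
    incident = [pins for pins in nets.values() if le in pins]
    degree = len(incident)                              # nets incident to the LE
    separation = sum(len(pins) for pins in incident)    # terminals on those nets
    return degree, separation, separation / degree ** 2


# Four two-terminal nets: separation = 8, c = 8 / 16 = 0.5.
nets_small = {f"n{i}": ["B", f"u{i}"] for i in range(4)}
print(seed_metric(nets_small, "B"))

# Hypothetical LE "H" on four three-terminal nets: separation = 12, c = 0.75.
nets_wide = {f"m{i}": ["H", f"v{i}", f"w{i}"] for i in range(4)}
print(seed_metric(nets_wide, "H"))
```

In the slide's example the candidate with the smaller c absorbs more nets, which suggests that a low c marks a good seed; the sketch only computes the metric and leaves that ranking choice to the caller.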
![Page 13: FPGA Interconnect Planning](https://reader033.vdocuments.us/reader033/viewer/2022050805/55cf8fe8550346703ba11f72/html5/thumbnails/13.jpg)
Rent’s Rule: Depopulation
Case 1: Pd <= Pa
• Achieve spatial uniformity.
Case 2: Pd > Pa
• Need more routing resources.
• Solution – Depopulate clusters
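One way to make the two cases concrete is to ask how many LEs a cluster can hold before the design's terminal demand exceeds what the architecture supplies. The sketch below assumes demand grows as m^Pd for m packed LEs and supply is n^Pa for a cluster of n LEs, so the largest admissible m satisfies m^Pd ≤ n^Pa. This sizing rule and the function name `les_per_cluster` are assumptions made for illustration, not the procedure used in the slides.

```python
import math


def les_per_cluster(n: int, p_d: float, p_a: float) -> int:
    """Number of LEs to pack into a cluster of capacity n.

    Case 1 (p_d <= p_a): the architecture has enough routing, fill the cluster.
    Case 2 (p_d >  p_a): depopulate so that m**p_d <= n**p_a (assumed rule).
    """
    if p_d <= p_a:
        return n
    return max(1, math.floor(n ** (p_a / p_d)))


# Hypothetical cluster size of 8 with the slides' Pa = 0.64.
print(les_per_cluster(8, p_d=0.60, p_a=0.64))  # Case 1: cluster stays full
print(les_per_cluster(8, p_d=0.75, p_a=0.64))  # Case 2: cluster is depopulated
```

The trade-off is that depopulation leaves logic unused in exchange for lower routing demand per cluster.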
![Page 14: FPGA Interconnect Planning](https://reader033.vdocuments.us/reader033/viewer/2022050805/55cf8fe8550346703ba11f72/html5/thumbnails/14.jpg)
Regularity
Better clustering for increased routability
Routing succeeded with a channel width factor of 12
Routing succeeded with a channel width factor of 21
Avg. fan-out: 2.7
Ours: Avg. fan-out: 3.7
![Page 15: FPGA Interconnect Planning](https://reader033.vdocuments.us/reader033/viewer/2022050805/55cf8fe8550346703ba11f72/html5/thumbnails/15.jpg)
Rent’s Rule: Depopulation
Cluster Pin Utilization
[Figure: number of clusters versus number of cluster pins utilized, comparing Ours and T-VPack]
![Page 16: FPGA Interconnect Planning](https://reader033.vdocuments.us/reader033/viewer/2022050805/55cf8fe8550346703ba11f72/html5/thumbnails/16.jpg)
Segment length Planning
Typical segment distribution
What is the best mix of segments for:
i) Area minimization
ii) Area-delay minimization?
![Page 17: FPGA Interconnect Planning](https://reader033.vdocuments.us/reader033/viewer/2022050805/55cf8fe8550346703ba11f72/html5/thumbnails/17.jpg)
Fanout Distribution
Inter-cluster wire requirement
Equivalent Rent’s parameters of the system:
$k_{eq} = \sqrt[N]{\prod_{i=1}^{n} k_i^{N_i}}$
$p_{eq} = \dfrac{\sum_{i=1}^{n} p_i N_i}{N}$
Netlist profile
• Low fanout nets – smaller length segments
• Global nets – longer segments (long lines)
Global segments – buffered
Shorter segments – not buffered (pass transistor switches)
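The equivalent Rent's parameters can be computed as follows. This sketch assumes the equations read as a weighted geometric mean for k_eq and a weighted arithmetic mean for p_eq, with N taken to be the sum of the N_i. The function name `equivalent_rent` and the sample block data are hypothetical.

```python
import math


def equivalent_rent(blocks):
    """Combine per-block Rent parameters into system-level (k_eq, p_eq).

    blocks : list of (N_i, k_i, p_i) tuples, one per block type
             N_i = block count, k_i = Rent coefficient, p_i = Rent exponent
    N is assumed to be the sum of all N_i.
    """
    total = sum(n_i for n_i, _, _ in blocks)
    # N-th root of prod(k_i ** N_i), evaluated in log space to avoid overflow.
    log_k_eq = sum(n_i * math.log(k_i) for n_i, k_i, _ in blocks) / total
    p_eq = sum(n_i * p_i for n_i, _, p_i in blocks) / total
    return math.exp(log_k_eq), p_eq


# Hypothetical system: 100 blocks with p = 0.6 and 300 blocks with p = 0.7.
k_eq, p_eq = equivalent_rent([(100, 4.0, 0.6), (300, 4.0, 0.7)])
print(f"k_eq = {k_eq:.3f}, p_eq = {p_eq:.3f}")
```

Working in log space keeps the product of k_i^{N_i} from overflowing when the block counts are large.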
![Page 18: FPGA Interconnect Planning](https://reader033.vdocuments.us/reader033/viewer/2022050805/55cf8fe8550346703ba11f72/html5/thumbnails/18.jpg)
Fanout distribution
Array of clusters
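[Figure: number of nets versus pins per net (1 to 10), actual and predicted]
Predicted number of nets with $m$ pins:
$Net(m) = \dfrac{kN\left[(m-1)^{p-1} - m^{p-1}\right]}{m}$
Average fanout:
$Fo_{avg} = \dfrac{1 - (Fo_{Max}+1)^{p-1}}{1 - \Omega(p, Fo_{Max}) - (Fo_{Max}+1)^{p-2}} - 1$
$\Omega(p, Fo_{Max}) = \sum_{n=1}^{Fo_{Max}} \dfrac{n^{p}}{n^{2}(n+1)}$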
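The fanout model can be checked numerically. The sketch below assumes the slide's equations read Net(m) = kN[(m-1)^(p-1) - m^(p-1)]/m for the number of m-pin nets, Ω(p, Fo_Max) = Σ n^p / (n²(n+1)), and Fo_avg = (1 - (Fo_Max+1)^(p-1)) / (1 - Ω - (Fo_Max+1)^(p-2)) - 1. It compares the closed form against a direct weighted average over the distribution, treating a net with m pins as having fanout m - 1. Function names and the sample values of k, N, and Fo_Max are hypothetical.

```python
def net_count(m, k, n_blocks, p):
    """Predicted number of nets with m pins (m >= 2)."""
    return k * n_blocks * ((m - 1) ** (p - 1) - m ** (p - 1)) / m


def omega(p, fo_max):
    """Omega(p, Fo_Max) = sum over n = 1..Fo_Max of n**p / (n**2 * (n + 1))."""
    return sum(n ** p / (n * n * (n + 1)) for n in range(1, fo_max + 1))


def avg_fanout_closed(p, fo_max):
    """Closed-form average fanout."""
    numerator = 1 - (fo_max + 1) ** (p - 1)
    denominator = 1 - omega(p, fo_max) - (fo_max + 1) ** (p - 2)
    return numerator / denominator - 1


def avg_fanout_direct(k, n_blocks, p, fo_max):
    """Average fanout by summing over the distribution (fanout = pins - 1)."""
    counts = {m: net_count(m, k, n_blocks, p) for m in range(2, fo_max + 2)}
    total_nets = sum(counts.values())
    total_fanout = sum((m - 1) * c for m, c in counts.items())
    return total_fanout / total_nets


closed = avg_fanout_closed(p=0.64, fo_max=9)
direct = avg_fanout_direct(k=4.0, n_blocks=1000, p=0.64, fo_max=9)
print(closed, direct)
```

The two values agree because k and N cancel in the ratio, so under this model the average fanout depends only on p and the maximum fanout.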
![Page 19: FPGA Interconnect Planning](https://reader033.vdocuments.us/reader033/viewer/2022050805/55cf8fe8550346703ba11f72/html5/thumbnails/19.jpg)
Fanout distribution: Area-Minimization
Good Placement
Example
![Page 20: FPGA Interconnect Planning](https://reader033.vdocuments.us/reader033/viewer/2022050805/55cf8fe8550346703ba11f72/html5/thumbnails/20.jpg)
Fanout distribution: Area-Minimization
[Figure: area (transistors) for circuits 1 to 6, comparing Ours, Length 1, and Xilinx-like]
![Page 21: FPGA Interconnect Planning](https://reader033.vdocuments.us/reader033/viewer/2022050805/55cf8fe8550346703ba11f72/html5/thumbnails/21.jpg)
Fanout distribution: Area-Minimization
[Figure: critical path (ns) for circuits 1 to 6, comparing Ours, Length 1, and Xilinx-like]
![Page 22: FPGA Interconnect Planning](https://reader033.vdocuments.us/reader033/viewer/2022050805/55cf8fe8550346703ba11f72/html5/thumbnails/22.jpg)
Fanout distribution: Area-Delay Product
Performance!
Average fanout: Critical path model
![Page 23: FPGA Interconnect Planning](https://reader033.vdocuments.us/reader033/viewer/2022050805/55cf8fe8550346703ba11f72/html5/thumbnails/23.jpg)
Fanout distribution: Area-Delay Product
Timing-driven Placer and Router
[Figure: area-delay product per circuit (alu4, apex2, apex4, bigkey, des, diffeq, dsip, ex5p, misex3, s298, seq, tseng), comparing Ours and Xilinx-like]
![Page 24: FPGA Interconnect Planning](https://reader033.vdocuments.us/reader033/viewer/2022050805/55cf8fe8550346703ba11f72/html5/thumbnails/24.jpg)
Fanout distribution: Performance
Using a Timing-driven Placer and Router
[Figure: normalized critical path (normalized delay) per circuit (alu4, apex2, apex4, bigkey, des, diffeq, dsip, ex5p, misex3, s298, seq, tseng, and the average), comparing Ours and Xilinx-like]
![Page 25: FPGA Interconnect Planning](https://reader033.vdocuments.us/reader033/viewer/2022050805/55cf8fe8550346703ba11f72/html5/thumbnails/25.jpg)
Conclusions & Future Work
FPGA Clustering for Regularity
• Rent’s rule
Fanout distribution based segment length planning
• Area minimization
Reduced wire-length
• Area-Delay minimization
Reduction of 29% over state-of-art
Fixed FPGA Architecture
• 20% better area-delay product
Future work: Metal layer assignment
• Applying technique to a pipelined FPGA
![Page 26: FPGA Interconnect Planning](https://reader033.vdocuments.us/reader033/viewer/2022050805/55cf8fe8550346703ba11f72/html5/thumbnails/26.jpg)