1
Mapping Logic to Reconfigurable FPGA Routing
Karel Heyse Karel Bruneel and Dirk Stroobandt
FACULTY OF ENGINEERING AND ARCHITECTURE
Ghent University – Computer Systems Lab – FPL 2012 – 30 August 2012
Ghent University – Computer Systems Lab – FPL 2012 – 30 August 2012 2
FPGA configurationI/O block
LogicBlock
Programmable routing
0 1
0
0
0
1FF
0
AB
O
LUT
Configuration
{ 0 0 0 1 0 1 0 0 0 … }
Ghent University – Computer Systems Lab – FPL 2012 – 30 August 2012 3
Parameterized Configuration
ParameterizedConfiguration
SpecializedConfigurations
{ 0 1 0 A+B AB A 1 }
{ 0 1 0 0 0 0 1 }
* K. Bruneel and D. Stroobandt, “Automatic Generation of Run-time Parameterizable Configurations,” FPL 2008.
{ 0 1 0 1 0 0 1 }
{ 0 1 0 1 0 1 1 }
{ 0 1 0 1 1 1 1 }
0 1
1 0
1 1
0 0A B
Parameters
Ghent University – Computer Systems Lab – FPL 2012 – 30 August 2012 4
Applications
• Dynamic Circuit Specialization– Circuit optimization (smaller, faster, …) using run-
time reconfiguration• Circuit Specialization– Hard coded settings of devices
Very fast generation of specialized configurations
Ghent University – Computer Systems Lab – FPL 2012 – 30 August 2012 5
Static Connections
What’s new in this work
0 1
0
A+B
A
1FF
0
AB
O
LUT
Tunable LUTs (TLUT)
Ghent University – Computer Systems Lab – FPL 2012 – 30 August 2012 6
Tunable Connections (TCON)
AB 1
What’s new in this work
0
A+B
A
1FF
0
AB
O
LUT
Tunable LUTs (TLUT)
Mapping functionality to Tunable Connections (and TLUTs)
Ghent University – Computer Systems Lab – FPL 2012 – 30 August 2012 7
Outline
• FPGA configuration• Applications• What’s new in this work• Toolflow• Technology mapping• Experiments• Conclusion and Future work
Ghent University – Computer Systems Lab – FPL 2012 – 30 August 2012 8
Toolflow
8
Parameterized HDL design
Parameterized configuration
Specialized configuration
Parameter values
Synthesis Technology mapping Placement Routing
Evaluate Boolean fn.
GenericStage
Specialization Stage
Ghent University – Computer Systems Lab – FPL 2012 – 30 August 2012 9
Toolflow
9
Parameterized HDL design
Parameterized configurationSynthesis Technology
mapping Placement Routing
a0
I1 I2 p
O
a1 a2
a3
I0
a4
entity multiplexer isport( --BEGIN PARAM sel : in std_logic_vector(1 downto 0); --END PARAM in : in std_logic_vector(3 downto 0); out : out std_logic);end multiplexer;
architecture behavior of multiplexer isbegin out <= in(conv_integer(sel));end behavior;
Ghent University – Computer Systems Lab – FPL 2012 – 30 August 2012 10
Toolflow
10
Parameterized HDL design
Parameterized configurationSynthesis Technology
mapping Placement Routing
Tunable LUT network
I1 I2
O
I0
a0
I1 I2 p
O
a1 a2
a3
I0
a4
TLUT
TLUT
Ghent University – Computer Systems Lab – FPL 2012 – 30 August 2012 11
Toolflow
11
Parameterized HDL design
Parameterized configurationSynthesis Technology
mapping Placement Routing
TLUT & TCON network
I1 I2
O
I0 I3
C(P)
Tunable Connection (TCON)
TCO
N
TLUT
TLUT
TCO
N
Ghent University – Computer Systems Lab – FPL 2012 – 30 August 2012 12
Toolflow
12
Parameterized HDL design
Parameterized configurationSynthesis Technology
mapping Placement Routing
p
2 Tunable Connections
¬p
Ghent University – Computer Systems Lab – FPL 2012 – 30 August 2012 13
Toolflow
13
Parameterized HDL design
Parameterized configurationSynthesis Technology
mapping Placement Routing
1
2 Tunable Connections
¬1
Ghent University – Computer Systems Lab – FPL 2012 – 30 August 2012 14
Toolflow
14
Parameterized HDL design
Parameterized configurationSynthesis Technology
mapping Placement Routing
0
2 Tunable Connections
¬0
Ghent University – Computer Systems Lab – FPL 2012 – 30 August 2012 15
Technology mapping
a0
I1 I2 I3
O
a1 a2
a3
I0
a4
Ghent University – Computer Systems Lab – FPL 2012 – 30 August 2012 16
Algorithm
• Minimal depth mapping
1. Cone enumeration: Finding all feasible cones per node
2. Cone ranking: Selecting best feasible cone per node3. Cone selection: Selecting cones that provide covering
Ghent University – Computer Systems Lab – FPL 2012 – 30 August 2012 17
Feasibility: LUT
Cone with 3 input nodes = LUT-feasible (3 input LUT)
a0
I1 I2
a1
a3
Ghent University – Computer Systems Lab – FPL 2012 – 30 August 2012 18
Feasibility: TLUT
Cone with 3 input nodesthat aren’t parameters= TLUT-feasible (3 input LUT)
a0
I1 I2
a1
a3
a2
p
Ghent University – Computer Systems Lab – FPL 2012 – 30 August 2012 19
Feasibility: TCON
Equivalent to multiplexer controlled by function of parameters= TCON-feasible
I1 I2p
a0
a2
I2I1 p
a1
Ghent University – Computer Systems Lab – FPL 2012 – 30 August 2012 20
TLC supergate
One TLUT with TCONs attached to its inputs
TLC
TCO
N
TLUT
TCO
N
TCO
N
TCO
N
TCO
N
…
21
Feasibility: TLC supergate
Equivalent to multiplexer controlled by function of parameters and LUTs at its inputs = TLC-feasible
p
I1 I2 I3 I4
q a0
a2a1
I1 I2
b0
I3 I4
c1
c0
p q
0
LUT LUT
22
Feasibility calculation
Based on cone function: Ψ
a0
a2 a1
I1I2
b0
I3I4
c1
c0
pq
Ψ(I1,I2,I3,I4,p,q)
Ghent University – Computer Systems Lab – FPL 2012 – 30 August 2012 23
Feasibility calculation
Using Binary Decision Diagram of cone function
0 0 01111
Ψ
Ghent University – Computer Systems Lab – FPL 2012 – 30 August 2012 24
Feasibility calculation
Using Binary Decision Diagram of coneΨ
f(p,q)
k(I1,I2,l4) m(I1,I2,l3) n(I1,I3,l5)
Ghent University – Computer Systems Lab – FPL 2012 – 30 August 2012 25
Feasibility calculation
Using Binary Decision Diagram of coneΨ
f(p,q)
s(I1) t(I2) v(I3)
Ghent University – Computer Systems Lab – FPL 2012 – 30 August 2012 27
Area and depth
mux
barrels
hift
crossb
ar
regex
1x
regex
2x
regex
4xtlc
1tlc
2-100%
-90%-80%-70%-60%-50%-40%-30%-20%-10%
0%
Area (#LUTs)
TLUTMAP
TCONMAP w. TLC
Rela
tive
to L
UTM
AP
mux
barrels
hift
crossb
ar
regex
1x
regex
2x
regex
4xtlc
1tlc
2-100%
-90%-80%-70%-60%-50%-40%-30%-20%-10%
0%
Depth (#LUTs)
Rela
tive
to L
UTM
AP
Ghent University – Computer Systems Lab – FPL 2012 – 30 August 2012 28
TLC supergate
regex
1x
regex
2x
regex
4xtlc
2tlc
4-60%
-50%
-40%
-30%
-20%
-10%
0%
Area (#LUTs)
With
vs.
with
out T
LC
regex
1x
regex
2x
regex
4xtlc
2tlc
4-60%
-50%
-40%
-30%
-20%
-10%
0%
Depth (#LUTs)
With
vs.
with
out T
LC
Ghent University – Computer Systems Lab – FPL 2012 – 30 August 2012 29
Execution time and scaling behaviour
mux
barrels
hift
crossb
ar
regex
1x
regex
2x
regex
4xtlc
1tlc
20
50
100
150
200
250
300
350
400
Execution time
TLUTMAP
TCONMAP w/o TLC
TCONMAP w. TLC
(ms)
Ghent University – Computer Systems Lab – FPL 2012 – 30 August 2012 30
Conclusion
• Automatically map functionality to reconfigurable routing
• Creates smaller and faster mapping results
• Part of new toolflow to quickly create specialized configurations
31
Mapping Logic to Reconfigurable FPGA Routing
Karel Heyse Karel Bruneel and Dirk Stroobandt
FACULTY OF ENGINEERING AND ARCHITECTURE
Ghent University – Computer Systems Lab – FPL 2012 – 30 August 2012