UIC - University of Illinois at Chicago
Core Identification for Reconfigurable Systems driven by Specification Self-Similarity
Matteo Giani
Univ. ID # 651157728
F. Balasa - A. A. Khokhar - D. Sciuto

Summary
Motivations
Introduction
Aims
State of the Art
The Proposed Approach
  Rationale
  Similarity Extraction
  Specification Covering
Implementation
Experimental results
Conclusions and Future Work

Motivations
Area occupancy
  Image processing / robotics applications
Survivability to changing requirements
  Evolving standards: cryptography, communications
Reconfigurability for Reliability
  Single Event Upsets, Permanent Faults
Designer constraints
  Unsatisfiable timing constraints given device area

Reconfigurability: introduction
[Figure, built up over six slides: plot with axes Time and Area. Point labels: Ao, Sho, So; Aw, Shw, Sw; Ade, Shde; Ad; Avd, Shvd, Svd. Region labels: Solution Space, Feasible Solution Space, Best Implementation.]
Problem: Ad < Ade
[Figure, built up over two slides. Labels: Partial, Total, Embedded, fix]

MicroLAB's System on Reconfigurable Chip Architecture
[Figure: FPGA block diagram. Labels: IP-Core F1, IP-Core F2, IP-Core F3; Fix side: PPC, ICAP]

System on Reconfigurable Chip Architecture: physical constraints
Area constraints:
  Trade-off between area used by fixed components and reconfigurable ones
Communication issues:
  Bit-width of the communication infrastructure
  Number of access points to the communication structure

The Proposed Approach
int test_code( int io , int * o1)
{
int a = 2, b = 10;
[Figure labels: Specification, DFG, Partitioned DFG, Reconfigurable Implementation; the code above is the visible fragment of the example specification]

Aims
Definition of a specification partitioning approach that:
  Aggregates elementary operations in the DFG into clusters suitable to be implemented as configurable modules
  Identifies regular structures in the specification, aiming at generating reusable modules
    Save device area
    Save reconfiguration time

State of the Art
Temporal partitioning approaches
  Reconfigure the whole device at once
  Impossible to hide reconfiguration times

State of the Art
Space-Time partitioning approaches
  Example

State of the Art
Common points among the different approaches
  Reconfiguration times badly affect the system's performance
    Try to embed a loop in each partition
    Try to minimize the need for reconfiguration
  Spatial partitioning approaches often rely on the designer for specification partitioning

The Proposed Approach - Rationale
Reconfiguration times weigh heavily on the latency of the final solution
  Reuse the configurable modules
Our approach: automatically identify recurrent structures in the specification
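
The effect of module reuse on reconfiguration can be illustrated with a toy model. The sketch below is not taken from the thesis: it assumes a single reconfigurable slot and a fixed execution order, and the module names are hypothetical. It only counts how often the slot has to be reloaded.

```python
def count_reconfigurations(schedule):
    """Count how many times a single reconfigurable slot must be reloaded.

    `schedule` is the sequence of module types executed one after another.
    A reload is needed whenever the next module differs from the one
    currently configured (hypothetical single-slot model).
    """
    loaded = None
    reloads = 0
    for module in schedule:
        if module != loaded:
            reloads += 1
            loaded = module
    return reloads


# Four distinct modules: every step needs a reconfiguration.
no_reuse = ["m1", "m2", "m3", "m4"]
# The same amount of work, with one recurrent template "t" used three times.
with_reuse = ["t", "t", "t", "m4"]

print(count_reconfigurations(no_reuse))    # 4
print(count_reconfigurations(with_reuse))  # 2
```

Under this simplified model, the more often one template recurs in the schedule, the fewer reloads are needed.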

The Proposed Approach: Specification -> DFG
The PandA framework
  Behavioral description layer
  Graph layer
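
As an illustration of the Specification -> DFG step, here is a minimal sketch that builds a data-flow graph from straight-line code. It does not use PandA or reproduce its API. The `Stmt` input format and the function name `build_dfg` are assumptions made for this example, and control flow is ignored.

```python
from collections import namedtuple

# One three-address statement: dest = lhs <op> rhs (hypothetical input format).
Stmt = namedtuple("Stmt", "dest op lhs rhs")


def build_dfg(stmts):
    """Return (nodes, edges) for a straight-line list of statements.

    nodes: dict node_id -> operation label
    edges: set of (producer_id, consumer_id) pairs
    Operands with no producing statement (inputs, constants) get no edge.
    """
    nodes, edges = {}, set()
    last_def = {}  # variable name -> id of the node that last wrote it
    for node_id, s in enumerate(stmts):
        nodes[node_id] = s.op
        for operand in (s.lhs, s.rhs):
            if operand in last_def:
                edges.add((last_def[operand], node_id))
        last_def[s.dest] = node_id
    return nodes, edges


program = [
    Stmt("t0", "+", "a", "b"),
    Stmt("t1", "*", "t0", "io"),
    Stmt("t2", "+", "t1", "t0"),
]
nodes, edges = build_dfg(program)
print(nodes)          # {0: '+', 1: '*', 2: '+'}
print(sorted(edges))  # [(0, 1), (0, 2), (1, 2)]
```

Each node is one elementary operation and each edge is a value flowing from its producer to a consumer, which is the form the partitioning phases work on.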

The Proposed Approach: DFG Partitioning
Objective: partition the DFG by identifying clusters that are repeated throughout the specification
Repeated structures -> Isomorphic Subgraphs
  Extraction of isomorphic subgraphs from a given graph is NP-complete
  Heuristics are needed to make the problem tractable

The Proposed Approach: DFG Partitioning
Our approach: two phases
Template Identification
  Produce a collection of isomorphism equivalence classes, each containing some isomorphic subgraphs of the original specification
Graph covering (template choice)
  Choose which of the identified templates are best suited for implementation as (re)configurable modules
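
The graph covering phase can be sketched as a greedy selection of non-overlapping template instances. This is an illustrative simplification rather than the thesis implementation: the data layout (instances as sets of node ids), the function name `greedy_cover` and the pluggable `rank_key` are assumptions.

```python
def greedy_cover(templates, rank_key, total_nodes):
    """Greedily cover DFG nodes with non-overlapping template instances.

    templates: dict template_name -> list of instances (each a set of node ids)
    rank_key:  function(name, instances) -> sortable score, higher is better
    Returns (chosen, cover_percent), chosen being (name, instance) pairs.
    """
    order = sorted(templates, key=lambda n: rank_key(n, templates[n]), reverse=True)
    covered, chosen = set(), []
    for name in order:
        for inst in templates[name]:
            # Accept an instance only if none of its nodes is already covered.
            if covered.isdisjoint(inst):
                chosen.append((name, inst))
                covered |= inst
    return chosen, 100.0 * len(covered) / total_nodes


templates = {
    "big":   [{0, 1, 2, 3}, {4, 5, 6, 7}],
    "small": [{0, 1}, {2, 3}, {8, 9}],
}


def largest_first(name, instances):
    # Rank templates by the node count of one instance.
    return len(instances[0])


chosen, pct = greedy_cover(templates, largest_first, total_nodes=16)
print([name for name, _ in chosen], pct)  # ['big', 'big', 'small'] 62.5
```

Swapping `rank_key` changes which templates are tried first, so the same loop can be driven by different template choice criteria.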

The Proposed Approach: Template Identification
Two algorithms were considered for this phase:
Reversed tree templates
  Copes with the complexity of the Isomorphic Subgraphs problem by restricting the shape of the subgraphs it identifies
Free shape templates
  Copes with the complexity of the Isomorphic Subgraphs problem by expanding pairs of isomorphic subgraphs via a bipartite matching

Template Identification: Reversed-tree templates
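
Restricting templates to a tree shape makes isomorphism cheap to test, because each tree can be reduced to a canonical string. The sketch below rests on assumptions that are not in the slides: a reversed tree is taken to be a root operation together with the operand trees feeding it, trees are cut at a fixed depth, and all operations are treated as commutative. The function names are hypothetical.

```python
from collections import defaultdict


def signature(node, ops, preds, depth):
    """Canonical string for the operand tree rooted at `node`, cut at `depth`."""
    if depth == 0 or not preds.get(node):
        return ops[node]
    # Sorting operand signatures makes operand order irrelevant; this treats
    # every operation as commutative, which is a simplification.
    children = sorted(signature(p, ops, preds, depth - 1) for p in preds[node])
    return f"{ops[node]}({','.join(children)})"


def reversed_tree_templates(ops, preds, depth=2):
    """Group root nodes whose operand trees share a signature."""
    groups = defaultdict(list)
    for node in ops:
        if preds.get(node):  # skip nodes with no operands inside the DFG
            groups[signature(node, ops, preds, depth)].append(node)
    # Only signatures that occur at least twice are reusable templates.
    return {sig: roots for sig, roots in groups.items() if len(roots) > 1}


# ops: node id -> operation; preds: node id -> ids of the operand producers.
ops = {0: "*", 1: "*", 2: "+", 3: "*", 4: "*", 5: "+", 6: "-"}
preds = {2: [0, 1], 5: [3, 4], 6: [2, 5]}
print(reversed_tree_templates(ops, preds))  # {'+(*,*)': [2, 5]}
```

Two roots with the same signature have identical operand trees, so grouping by signature yields the instances of one template without any pairwise graph comparison.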

Template Identification: Free-shape templates
The algorithm produces a pair of isomorphic subgraphs for each run
The produced pairs are used to build equivalence classes of isomorphic subgraphs, exploiting the transitivity of the isomorphism relation
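
Merging pairs into equivalence classes by transitivity is a natural fit for a union-find structure. The following is a minimal sketch, assuming each subgraph is identified by a hashable id (the names `g1`, `g2`, ... are hypothetical; in practice a frozen set of node ids could serve as the id). It is not the thesis code.

```python
def build_classes(pairs):
    """Merge isomorphic pairs into equivalence classes with union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra  # a ~ b, so their classes become one

    classes = {}
    for x in list(parent):
        classes.setdefault(find(x), set()).add(x)
    return sorted(sorted(c) for c in classes.values())


# g1 ~ g2 and g2 ~ g5 imply g1 ~ g5 by transitivity.
pairs = [("g1", "g2"), ("g3", "g4"), ("g2", "g5")]
print(build_classes(pairs))  # [['g1', 'g2', 'g5'], ['g3', 'g4']]
```

Each resulting class corresponds to one template, and its members are the instances of that template.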

Template choice: metrics
Largest Fit First
  Largest templates are best
Most Frequent Fit First
  Templates with the largest number of instances are best
Communication Weight metrics
  E.g., #internal edges vs. #boundary edges ratio
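
The three metrics can be computed from a template's instances and the DFG edge list. The sketch below is an assumption-laden illustration: instances are sets of node ids, the function name `template_metrics` is hypothetical, and the communication ratio is summed over all instances, which the slides do not specify.

```python
def template_metrics(instances, edges):
    """Return (size, frequency, comm_ratio) for one template.

    instances: list of node-id sets, one per occurrence of the template
    edges:     list of (src, dst) pairs of the whole DFG
    comm_ratio = internal edges / boundary edges, summed over all instances
    """
    internal = boundary = 0
    for inst in instances:
        for src, dst in edges:
            inside = (src in inst) + (dst in inst)
            if inside == 2:
                internal += 1   # both endpoints in the cluster
            elif inside == 1:
                boundary += 1   # edge cut by the cluster boundary
    size = len(instances[0])
    ratio = internal / boundary if boundary else float("inf")
    return size, len(instances), ratio


edges = [(0, 1), (1, 2), (2, 6), (3, 4), (4, 5), (5, 6)]
instances = [{0, 1, 2}, {3, 4, 5}]
print(template_metrics(instances, edges))  # (3, 2, 2.0)
```

Largest Fit First would rank templates by the first value, Most Frequent Fit First by the second, and a communication-weight metric by the third.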

Implementation
Implementation work was carried out as an extension to the PandA framework
  C++
  C++ STL
  Boost Graph Library

Experimental Results: Reversed-tree templates
Columns: #Templates, Largest #Instances, Largest Template, Benchmark
[numeric columns are fused in the source; their boundaries are not recoverable]
4066     FDCT
57438    DES - des_encrypt
162319   AES - decryptblock
151316   AES - encryptblock

Experimental Results: Free-shape templates
Benchmark            #Templates   Largest #Instances   Largest Template
FDCT                 1470         2                    62
DES - des_encrypt    1802         2                    100
AES - decryptblock   11006        2                    147
AES - encryptblock   6790         2                    132

Experimental Results: Graph covering - free-shape
Benchmark            Cover % - LFF   Cover % - MFF   Cover % - Comm   CPU Time
FDCT                 76.7            53.8            73.3             6.4 sec
DES - des_encrypt    90.5            59.6            87.8             8.3 sec
AES - decryptblock   85.31           51.7            70.8             61 sec
AES - encryptblock   74.3            32.7            74.1             32.5 sec

Experimental Results: Free-shape - AES - encryptblock
[Figure: template size (nodes) vs. number of identified templates]
[Figure: template size (nodes) vs. number of instances of the most recurrent template]
[Figure: template size (nodes) vs. ratio between the number of edges included in the clusters and the number of edges cut by the cluster boundaries]
[Figure: number of used instances per template size (5, 6, 9, 10, 13, 24, 73, 132) for the LFF, MFF and Comm heuristics]
[Figure: number of covered nodes per template choice heuristic (LFF, MFF, Comm)]

Conclusions
int test_code( int io , int * o1)
{
int a = 2, b = 10;
[Figure labels: Specification, DFG, Partitioned DFG, Reconfigurable Implementation; the code above is the visible fragment of the example specification]

Conclusions, future work
A partitioning approach was defined and implemented, to expose recurrent computing patterns in a system specification
  Starting point: C, SystemC specifications
  Tests carried out on real-world examples
Future Work
  Refinement of the template choice metrics: e.g. area fragmentation
  Heuristics for fixed/reconfigurable modules choice
  Online scheduling, placement of the reconfigurable cores