Management and analysis of bitstream generators for Xilinx FPGAs
By
Davide Candiloro
Thesis committee:
John Lillis (chair), Marco D. Santambrogio, Piotr Gmytrasiewicz
UIC Thesis Defense
Rationale and Main contribution
• Xilinx software is mainly tailored to static designs
• Absence of validation or support for partial dynamic reconfiguration techniques
Therefore:
• Development of a novel flow for the debugging and validation of partial dynamic reconfigurable architectures on Xilinx FPGAs
• Methodology to spot and address possible design flaws
• Design of a framework to automate and ease the designer’s task independently from vendor software
Detailed Contribution
• Automated constraint checking on Reconfigurable Regions
• Guided error resolution and visual constraint editing
• HW functionality area conflict monitoring
• Exploration of the relocation possibilities for partial bitstreams
• Analysis of the end result files of a PDR flow
• Working on an architectural model and representation outside of Xilinx SW
Outline
• FPGA technology
• Partial dynamic reconfiguration, related issues, state of the art (SoA)
• The proposed framework
  – Parser module
  – Reasoner module
  – Design alteration module
• Case study
  – Description
  – Debugging and enhancement using REBIT
• Contributions
• Future works
Xilinx FPGA technology
Three Xilinx families addressed:
Spartan 3
Virtex II Pro
Virtex 4
• Custom HW
• Heterogeneous array
• Per-resource numbering scheme
Xilinx FPGAs and Configuration Memory
Partial Dynamic Reconfiguration
• Static portion of the design
• Several RRs (Reconfigurable Regions) where different RFUs are configured
• Communication via BUS Macros
• Swap hardware at runtime, without disrupting the rest of the design. Two key advantages:
1) efficient area use
2) adaptability of the application
• Application examples: adaptive control, image processing
PDR flows and related issues
• The flows require the manual definition of RRs, conforming to specific guidelines
• The designer must refer correctly to the underlying architecture of the FPGA => error prone
• Vendor software has been designed for static designs
• There is no guarantee that the constraints for the RRs are respected by the Place and Route phase
• This can inject further errors into the design: area conflicts and RR overflowing
• Designer efforts are taken away from the actual application development
PDR issue 1: RR definition
• The flows require constraints to be satisfied when defining RRs in the UCF (User Constraints File)
AREA_GROUP "RR1" RANGE = SLICE_X28Y64:SLICE_X41Y127;
AREA_GROUP "RR1" RANGE = RAMB16_X2Y9:RAMB16_X2Y15;
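The two constraint lines above can be read mechanically before any check is applied. The following is a minimal Python sketch (not the thesis's C++ parser) that extracts the rectangle of sites from each `AREA_GROUP ... RANGE` line. It assumes only the simple two-corner form shown on the slide; `SiteRange` and `parse_area_groups` are hypothetical names introduced for illustration.

```python
import re
from typing import Dict, List, NamedTuple


class SiteRange(NamedTuple):
    """Inclusive rectangle of sites of one resource type (hypothetical helper)."""
    resource: str  # e.g. "SLICE" or "RAMB16"
    x0: int
    y0: int
    x1: int
    y1: int

    @property
    def width(self) -> int:
        return self.x1 - self.x0 + 1

    @property
    def height(self) -> int:
        return self.y1 - self.y0 + 1


# Assumption: only the two-corner form shown on the slide is handled.
_RANGE_RE = re.compile(
    r'AREA_GROUP\s+"(?P<group>[^"]+)"\s+RANGE\s*=\s*'
    r'(?P<res>[A-Z0-9]+)_X(?P<x0>\d+)Y(?P<y0>\d+)\s*:\s*'
    r'(?P=res)_X(?P<x1>\d+)Y(?P<y1>\d+)\s*;'
)


def parse_area_groups(ucf_text: str) -> Dict[str, List[SiteRange]]:
    """Map each area group name to the list of ranges declared for it."""
    groups: Dict[str, List[SiteRange]] = {}
    for m in _RANGE_RE.finditer(ucf_text):
        rng = SiteRange(m["res"], int(m["x0"]), int(m["y0"]),
                        int(m["x1"]), int(m["y1"]))
        if rng.x0 > rng.x1 or rng.y0 > rng.y1:
            raise ValueError(f"corners out of order in {m.group(0)!r}")
        groups.setdefault(m["group"], []).append(rng)
    return groups


UCF = '''
AREA_GROUP "RR1" RANGE = SLICE_X28Y64:SLICE_X41Y127;
AREA_GROUP "RR1" RANGE = RAMB16_X2Y9:RAMB16_X2Y15;
'''

for rng in parse_area_groups(UCF)["RR1"]:
    print(rng.resource, rng.width, "x", rng.height)
```

With the rectangle available as numbers, guideline checks on the RR (whatever the target flow requires) become simple comparisons on `width`, `height`, and the corner coordinates.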
PDR issue 2: Xilinx PAR programs
Place and Route built for static designs
Even if RR defined correctly, HW might overflow it
This situation is NOT reported to the designer
Can inject silent errors in the design due to configuration overwriting and area conflicts
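The overflow situation described above amounts to a containment test between the area actually configured and the RR the designer declared. A minimal Python sketch follows, assuming both are available as inclusive rectangles `(x0, y0, x1, y1)` in the same coordinate system; the function names are hypothetical and this is not the framework's implementation.

```python
from typing import Iterable, List, Tuple

Rect = Tuple[int, int, int, int]  # (x0, y0, x1, y1), corners inclusive


def contains(outer: Rect, inner: Rect) -> bool:
    """True if `inner` lies entirely within `outer`."""
    ox0, oy0, ox1, oy1 = outer
    ix0, iy0, ix1, iy1 = inner
    return ox0 <= ix0 and oy0 <= iy0 and ix1 <= ox1 and iy1 <= oy1


def overflowing(rr: Rect, occupied: Iterable[Rect]) -> List[Rect]:
    """Return the occupied areas that are not fully inside the RR."""
    return [area for area in occupied if not contains(rr, area)]


rr1 = (28, 64, 41, 127)
placed = [(28, 64, 41, 127), (30, 70, 35, 80), (40, 64, 43, 127)]
print(overflowing(rr1, placed))  # the third area extends past column 41
```

An empty result means the placed hardware stays inside the region; any rectangle returned is a candidate for the silent overwriting described on this slide.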
State of the art
PlanAhead® - 2008
• Used to constrain the logic inside particular regions
• Last version adds PDR support
• An error situation is simply reported, but not where it occurs nor how to overcome it
Floorplanner® - 2008
• Editor for the constraints of regions on the chip
• Architecture-aware
• NOT reconfiguration-aware => guidelines not enforced
ChipScope® - 2008
• Used in debugging designs on Xilinx FPGAs
• Only AFTER the design has been downloaded on board
JBits (discontinued) - 2004/5
• Provided low-level access to configuration in bitstreams
Integration with the Earendil flow
• Existing flow for defining FPGA reconfigurable apps
• Proposed flow at the end of the Earendil chain
• Based upon reconfigurable architecture product files
• May thus be inserted at the end of generic flows
The proposed Flow and Framework: Rebit
C++
wxWidgets
Parser Module
• Reads and parses input files to build the data model
• RR definition
• Bitstream occupation
• Static photos
The configuration bitstream
• Analogous structure across the three families
• Occupation must be determined only on the basis of:
  – Number of configuration words
  – Initial Frame Address Register (FAR) value
Frame addressing scheme (FAR)
• Aggregation of datasheet information for the three families
• Minimum (re)configuration unit = a frame
• A column of frames corresponds to a HW column (e.g. a CLB column)
• Bitstreams are meaningful only if composed of whole columns
• The FAR address is automatically incremented by the FPGA
• How to determine the configured resources given a FAR address?
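A first step towards answering the question above is splitting a FAR value into its address fields. The Python sketch below does this for a layout given as a list of `(name, width)` pairs, most significant field first. `EXAMPLE_LAYOUT` is a hypothetical layout chosen for illustration only; the real field names and widths differ per family and must be taken from the relevant configuration guide.

```python
from typing import Dict, List, Tuple

Layout = List[Tuple[str, int]]  # (field name, width in bits), MSB first

# Hypothetical layout for illustration; not copied from any datasheet.
EXAMPLE_LAYOUT: Layout = [
    ("top_bottom", 1),
    ("block_type", 3),
    ("row", 5),
    ("major", 8),
    ("minor", 6),
]


def decode_far(value: int, layout: Layout) -> Dict[str, int]:
    """Split a FAR value into named fields according to `layout`."""
    total = sum(width for _, width in layout)
    if value < 0 or value >> total:
        raise ValueError("value does not fit the layout")
    fields: Dict[str, int] = {}
    shift = total
    for name, width in layout:
        shift -= width
        fields[name] = (value >> shift) & ((1 << width) - 1)
    return fields


def encode_far(fields: Dict[str, int], layout: Layout) -> int:
    """Inverse of decode_far: pack named fields into one integer."""
    value = 0
    for name, width in layout:
        field = fields[name]
        if not 0 <= field < (1 << width):
            raise ValueError(f"{name}={field} does not fit in {width} bits")
        value = (value << width) | field
    return value


start = {"top_bottom": 0, "block_type": 0, "row": 2, "major": 5, "minor": 3}
far = encode_far(start, EXAMPLE_LAYOUT)
print(hex(far), decode_far(far, EXAMPLE_LAYOUT))
```

Keeping the layout as data rather than code is one way to support several families with a single decoder, since only the table changes.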
Implementation: area retrieval (1)
Assumptions on the configuration
Bitstreams show some regular features:
• Gaps (PPCs) do not affect the number of frames needed to configure a column
• Increasing the major address means moving from left to right columns on the FPGA
• Hard cores paired with BRAMs are configured along with the BRAM interconnections
• (V4) Row address increases from the center towards the edges; TOP/BOTTOM bit = 0 means top half
NOT documented by Xilinx
Verified with FPGA Editor + bitstream inspection
Configuration memory maps
• Produced for each of the 4 FPGAs analyzed
• For ANY frame, the map gives the column configured on the device
• Example for Spartan 3 XC3S200
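A memory map of this kind can be modelled as an ordered list of columns, each with the number of frames that configure it. The Python sketch below looks up the column for any frame index and lists the columns touched by a bitstream that starts at a given frame and auto-increments. The column names and frame counts in `MEMORY_MAP` are made up for illustration and are not the XC3S200 values.

```python
from bisect import bisect_right
from itertools import accumulate
from typing import List

# Hypothetical map, left to right: (column name, frames per column).
MEMORY_MAP = [("IOB_L", 2), ("CLB_0", 4), ("CLB_1", 4), ("BRAM_0", 3), ("CLB_2", 4)]

_ENDS = list(accumulate(frames for _, frames in MEMORY_MAP))  # exclusive ends
_STARTS = [0] + _ENDS[:-1]                                    # first frame of each column


def column_index(frame: int) -> int:
    """Index in MEMORY_MAP of the column configured by linear frame `frame`."""
    if not 0 <= frame < _ENDS[-1]:
        raise ValueError(f"frame {frame} outside the device")
    return bisect_right(_ENDS, frame)


def columns_covered(first_frame: int, n_frames: int) -> List[str]:
    """Columns touched by `n_frames` consecutive frames from `first_frame`."""
    if n_frames < 1:
        raise ValueError("n_frames must be positive")
    lo = column_index(first_frame)
    hi = column_index(first_frame + n_frames - 1)
    return [name for name, _ in MEMORY_MAP[lo:hi + 1]]


def is_whole_columns(first_frame: int, n_frames: int) -> bool:
    """True if the frame run starts and ends exactly on column boundaries."""
    return first_frame in _STARTS and first_frame + n_frames in _ENDS


print(columns_covered(2, 8), is_whole_columns(2, 8))
print(columns_covered(3, 8), is_whole_columns(3, 8))
```

The second call shows a run that is misaligned by one frame: it spills into a third column and no longer consists of whole columns.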
Implementation: area retrieval (2)
SLICE X0Y0–X20Y41
Reasoner Module
• Performs the constraint analysis on RRs
• Occupation analysis for bitstream overflowing
• Builds the conflict graph
Conflict Graph
• Conflict graph: conflict = edge
• Incidence matrix: conflict = red
• Which functionalities can be used at the same time?
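One way to build such a graph is to add an edge between two functionalities whenever the areas they configure overlap. The Python sketch below assumes each functionality occupies a single inclusive rectangle `(x0, y0, x1, y1)`. The names and coordinates are hypothetical, and the boolean matrix it prints is only a plain-text stand-in for the coloured matrix view on the slide.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple

Rect = Tuple[int, int, int, int]  # (x0, y0, x1, y1), corners inclusive


def overlaps(a: Rect, b: Rect) -> bool:
    """True if the two inclusive rectangles share at least one site."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    return ax0 <= bx1 and bx0 <= ax1 and ay0 <= by1 and by0 <= ay1


def conflict_graph(areas: Dict[str, Rect]) -> Dict[str, Set[str]]:
    """Adjacency sets: an edge joins two functionalities whose areas overlap."""
    graph: Dict[str, Set[str]] = {name: set() for name in areas}
    for a, b in combinations(areas, 2):
        if overlaps(areas[a], areas[b]):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def conflict_matrix(graph: Dict[str, Set[str]]) -> List[List[bool]]:
    """Symmetric boolean matrix over the sorted names (True = conflict)."""
    names = sorted(graph)
    return [[b in graph[a] for b in names] for a in names]


areas = {"filter": (0, 0, 3, 3), "edge": (2, 2, 5, 5), "other": (6, 0, 8, 3)}
graph = conflict_graph(areas)
print(graph)
print(conflict_matrix(graph))
```

Two functionalities can be used at the same time exactly when no edge joins them, so here `other` is compatible with both `filter` and `edge`, while `filter` and `edge` exclude each other.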
Design Alteration Module
Allows the user to perform modifications to the design
1) Redefining Reconfigurable Regions
2) Relocating partial bitstreams
The RPM grid
• Model of the FPGAs used throughout the framework
• Describes the available resources and their relative positioning
RPM = Relationally Placed Macros
Implementation: equivalent areas
A bitstream can be relocated in areas that:
• Have the same resources as the original
• Preserve relative positions
• Algorithm: sliding window
• The partial grid is shifted onto the global grid in all possible positions
• If every element of the partial grid matches the underlying global grid, a match is found
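The sliding-window search described above can be sketched in a few lines of Python. Both grids are assumed to be rectangular arrays of resource-type codes (here one character per cell, such as `S` for a slice column and `B` for a BRAM column). The encoding and the function name are hypothetical.

```python
from typing import List, Sequence, Tuple

Grid = Sequence[Sequence[str]]  # rows of resource-type codes


def equivalent_positions(global_grid: Grid, partial_grid: Grid) -> List[Tuple[int, int]]:
    """Return every (row, col) offset at which the partial grid matches.

    The partial grid is shifted over the global grid in all positions where
    it fits; an offset is kept only if every cell matches the cell beneath it.
    """
    gh, gw = len(global_grid), len(global_grid[0])
    ph, pw = len(partial_grid), len(partial_grid[0])
    matches: List[Tuple[int, int]] = []
    for row in range(gh - ph + 1):
        for col in range(gw - pw + 1):
            if all(global_grid[row + r][col + c] == partial_grid[r][c]
                   for r in range(ph) for c in range(pw)):
                matches.append((row, col))
    return matches


device = ["SSBSSB",
          "SSBSSB"]
region = ["SB",
          "SB"]
print(equivalent_positions(device, region))
```

The original location of the region is always among the matches; every other offset is a candidate relocation target. The search is exhaustive, costing on the order of (global cells) x (partial cells) comparisons, which is acceptable for grids of this size.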
Demo description
Application
– Edge detection on black and white digital images
  • Input: color digital images
  • Output: edges detected on the input images
Architecture
– 2 IP-Cores
  • Filter (gray scale converter)
  • Edge Detector (E.D.)
– Static area
  • GPP: PPC405
  • SW: standalone
– Reconfigurable area
  • 1 reconfigurable region
Data flow
Input image → Gray scale (Filter) → Edge Detection (E.D.)
Performance analysis (1)
Area (Xilinx VIIP7)
• System
  – Static area: 2100 slices
  – Reconfigurable area: 896 constrained slices
• RFUs
  – Filter (Gray scale): 126 frames, bitstream size 110 KB
  – Edge Detector (E.D.): 158 frames, bitstream size 110 KB
Reconfiguration performance
• Execution time: 0.31 s
• Rec. throughput: 1.02 MB/s
• Rec. time: 0.1 s
• Min data size: 32353 bytes
• Min image size: 180x180
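Two of the figures above can be cross-checked with simple arithmetic. The Python sketch below recomputes the reconfiguration time as bitstream size divided by throughput, and the side of the smallest square image that holds the minimum data size. It assumes binary units (1 KB = 1024 bytes) and one byte per pixel; neither assumption is stated on the slide, and the slide does not say how the minimum data size itself was derived.

```python
import math

KB = 1024        # assumption: binary units
MB = 1024 * KB


def reconfiguration_time(bitstream_bytes: float, throughput_bytes_per_s: float) -> float:
    """Seconds needed to load a partial bitstream at a given throughput."""
    return bitstream_bytes / throughput_bytes_per_s


def min_square_side(n_bytes: int, bytes_per_pixel: int = 1) -> int:
    """Side of the smallest square image holding at least `n_bytes` of pixels."""
    pixels = math.ceil(n_bytes / bytes_per_pixel)
    return math.isqrt(pixels - 1) + 1 if pixels > 0 else 0


t = reconfiguration_time(110 * KB, 1.02 * MB)
side = min_square_side(32353)
print(f"rec. time ~ {t:.3f} s, min image ~ {side}x{side}")
```

Under these assumptions the result is about 0.105 s and 180x180, in line with the rounded values reported on the slide.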
Performance analysis
Enhancement exploration
Is there any way in which we can enhance the application performance/flexibility?
Yes!
Exploring new design solutions using REBIT (we will now see how)
Case study: architecture
• 2 image filters
• 2 partial bitstreams
• 1 RR
• Synthesis finished, we now aim at:
  – Finding flaws in the design, if any
  – Correcting them
Case study: constraint validation
Case study: UCF editing
Case study: relocation
• We have resolved the issues of the design…
• Now we would like to explore new solutions
Case study: data model
• Conflict graph
• Feasible static photos
Aim is to resolve every conflict within each of the static photos
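Checking a static photo against the conflict graph can be sketched as follows in Python. The sketch assumes that a static photo is the set of functionalities that must be configured on the device at the same time, and that the conflict graph is stored as adjacency sets; both are interpretations of the slide, and the names are hypothetical.

```python
from itertools import combinations
from typing import Dict, Iterable, List, Set, Tuple


def conflicts_in(photo: Iterable[str], graph: Dict[str, Set[str]]) -> List[Tuple[str, str]]:
    """Pairs of functionalities in the photo that are joined by a conflict edge."""
    return [(a, b) for a, b in combinations(sorted(photo), 2)
            if b in graph.get(a, set())]


def feasible(photo: Iterable[str], graph: Dict[str, Set[str]]) -> bool:
    """A photo is feasible when none of its members conflict with each other."""
    return not conflicts_in(photo, graph)


conflicts = {"filter": {"edge"}, "edge": {"filter"}, "other": set()}
photos = [{"filter", "other"}, {"filter", "edge"}]
for photo in photos:
    print(sorted(photo), feasible(photo, conflicts), conflicts_in(photo, conflicts))
```

Each pair returned by `conflicts_in` is a conflict that still has to be resolved, for example by redefining an RR or relocating one of the two bitstreams.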
Case study: area conflicts
Contributions of the work
• Novel flow for the DRC of PDR architectures
• Automation of the flow for the validation and debug of PDR architectures: no more manual steps
• Visual editing and guided issue resolution
• Configuration memory maps for the analyzed FPGAs
• Relation of Xilinx bitstream format to the specific architecture
• Development of a framework independent of Xilinx software that integrates knowledge of the architectural details
Future works
• Adding support for new/other FPGAs to the system
• Turning the reasoner module into an expert system, to develop further automation in the definition and validation of the system
• Taking BUS Macros into account, i.e. communication between different RFUs
• Extending the data model with board data, not only chip data
  – Developing methodologies to generate constraints based on IOB connections to the external board components
General Information
Webpage
– www.dresd.org/?q=valerie
Mailing List
– [email protected]
Contact
– To have more information regarding valerie:
  • [email protected]
– For a complete list of information on how to contact us:
  • www.dresd.org/?q=contact_valerie
Questions?
Thank you