Paper Review: XiSystem - A Reconfigurable Processor and System
David Hermann
ENGG*6090: Reconfigurable Computing Systems
University of Guelph

Overview
  High Level Overview
  Classification
  Motivation
  Design Issues
  Detailed Design:
    Reconfigurable Functional Unit
    Reconfigurable I/O Module
  Hardware/Software Co-Design
  Summary and Conclusions

High Level Overview
  Developed by the ARCES lab at the University of Bologna and STMicroelectronics
  XiRisc: VLIW RISC processor with a reconfigurable functional unit
    Reconfigurable functional unit extends RISC processor capabilities for specialized DSP-like tasks
  XiSystem: adds reconfigurable logic to the XiRisc processor specifically for reconfigurable IO tasks
    Reconfigurable IO allows implementation of application-specific interface protocols in hardware

System Architecture
  Reconfigurable IO module connected to the general-purpose processor (GPP) via the system bus
  Reconfigurable logic functional unit closely tied to the core GPP

Preliminary Questions
  How do we classify this type of reconfigurable computing system?
  Why are we interested in this type of system?
  What are some general design issues to consider for this kind of architecture?

Classifications of Reconfigurable Computing Systems
  Review of classifications
    Coupling
    Granularity
    Heterogeneous vs. Homogeneous
    Routing and Topology
    Reconfiguration Methodology
  Where does this system fit in these classifications?

System Classification
  Coupling
    Very tight coupling at functional unit or “close” co-processor level
  Granularity
    Low-to-medium granularity, some specialized functional blocks
  Architecture, Routing and Topology
    Generally homogeneous logic cells
    Additional components and architecture make the overall design heterogeneous
    Specialized “one-dimensional” routing and logic cell layout for the functional unit
  Reconfiguration Methodology
    Well developed hardware/software co-design
    Some emphasis on run-time reconfiguration

Comparison to Other Systems
  This type of system occupies one “corner” of a multi-dimensional design space for reconfigurable computing systems.
  Other “corners”:
    Soft-core in FPGAs
    Fixed GPPs in FPGAs
    Coarse-grained FPGAs
  All are variations on architectural combinations of:
    Reconfigurable Logic
    Fixed-function Logic
    General-purpose Processors
    Interconnection (i.e. coupling) Options

Motivation: Why be interested?
  An architecture for applications where “small” amounts of reconfigurable logic can offer vast speed-ups
  An architecture for extending existing SoCs with reconfigurable logic
  Excellent candidates for complete hardware/software co-design flows
  Interesting contrast to recent FPGA technology developments

Design Issues
  Low-level design details
    What kind of reconfigurable logic and supporting components?
    What kind of routing/topology?
    How do these issues relate to achieving maximum performance from the reconfigurable logic?
  Integration of Functional Unit into Existing Core
    Instruction set changes
    Control unit changes
    Parallelism with existing datapath
    Memory access contention
  Communication with IO Co-processor
    Speed, bandwidth, overhead
  Hardware/Software Co-design
    How to integrate GPP programming (C-based) with the reconfigurable functional unit?
    How to integrate interface/protocol design with the reconfigurable IO module?

XiRisc: Reconfigurable Functional Unit
  Extend a VLIW RISC-type processor with a reconfigurable functional unit
    32-bit load/store architecture
    DSP-like functional units (multiply-accumulators)
  Pipelined Configurable Gate Array (PiCoGA)
    Used to map complex, multi-cycle, pipelined data processing
  Features:
    Configurable datapath and pipelining
    Standard reconfigurable logic cells
    Asymmetric (directional) routing
    Run-time reconfiguration
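
A rough way to see why mapping multi-cycle operations onto a pipelined array can pay off is the textbook pipeline fill count. The sketch below is not taken from the paper: the function names and the 4-stage depth are hypothetical, and it assumes one new set of operands can be issued every cycle with no stalls.

```python
def unpipelined_cycles(n_items: int, depth: int) -> int:
    """Each item occupies the unit for `depth` cycles before the next one starts."""
    return n_items * depth


def pipelined_cycles(n_items: int, depth: int) -> int:
    """The first result appears after `depth` cycles, then one result per cycle."""
    return 0 if n_items == 0 else depth + n_items - 1


if __name__ == "__main__":
    n, depth = 100, 4  # hypothetical workload size and pipeline depth
    print(unpipelined_cycles(n, depth), pipelined_cycles(n, depth))
```

For long data streams the pipelined count approaches one result per cycle, which is the regime where a small amount of reconfigurable logic is most likely to help.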

XiRisc Processor Architecture
  Existing datapath: 2 parallel ALUs plus other shared functional units
  PiCoGA: connected to the same register file inputs/outputs as the other functional units

Run-time Reconfiguration
  Configuration cache holds four independent configurations for each logic cell
    Context switch in one cycle via a special instruction
  Configuration/processing partitioning
    Different configurations can be loaded in different PiCoGA regions
    One computation and one reconfiguration can be executed simultaneously
  Second-level reconfiguration can occur through a dedicated 192-bit bus in only 16 cycles
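
The two reconfiguration levels described above can be sketched as a toy model. Only the three constants (four contexts, 192-bit bus, 16 cycles) come from the slide; the class, its method names, and the rule that an executing context cannot be overwritten are assumptions made for illustration, not the PiCoGA programming interface.

```python
from dataclasses import dataclass, field

NUM_CONTEXTS = 4        # slide: four cached configurations per logic cell
CONFIG_BUS_BITS = 192   # slide: width of the dedicated configuration bus
RELOAD_CYCLES = 16      # slide: cycles for a second-level reconfiguration
# Upper bound on bits moved per reload, ASSUMING one full bus word per cycle.
MAX_BITS_PER_RELOAD = CONFIG_BUS_BITS * RELOAD_CYCLES  # 3072


@dataclass
class ConfigCache:
    """Toy model of a multi-context configuration cache (hypothetical API)."""
    contexts: list = field(default_factory=lambda: [None] * NUM_CONTEXTS)
    active: int = 0
    switch_cycles: int = 0
    reload_cycles: int = 0

    def load(self, slot: int, name: str) -> None:
        """Second-level reconfiguration: fill a cache slot over the bus."""
        if slot == self.active and self.contexts[slot] is not None:
            raise ValueError("slot is currently executing")  # assumed rule
        self.contexts[slot] = name
        self.reload_cycles += RELOAD_CYCLES

    def switch(self, slot: int) -> None:
        """First-level reconfiguration: one-cycle switch to a cached context."""
        if self.contexts[slot] is None:
            raise ValueError(f"context {slot} has not been loaded")
        self.active = slot
        self.switch_cycles += 1
```

Since the slide states that one computation and one reconfiguration can run simultaneously, the reload cycles counted here could in principle be hidden behind useful work, which is why the model keeps the two counters separate.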

XiRisc: Sample Benchmarks
  Enhance computational performance for DSP algorithms (encryption, coding, filtering, etc.)
  Substantially reduced power consumption
    15-20% of architecture without PiCoGA
    Best power consumption reduction is as high as 92%
    Partially from reduced memory access!

XiRisc: Design Advantages
  Architecture provides computational speedup and power consumption improvements
  Functional unit parallelism maximizes usage of GPP and reconfigurable logic
  Efficient run-time reconfiguration improves re-usability of the reconfigurable logic

XiRisc: Design Drawbacks
  Overall performance still heavily limited by memory access
  Size of reconfigurable logic is over 50% of the silicon area

XiSystem: XiRisc with Reconfigurable IO
  XiSystem: adds reconfigurable logic to the XiRisc processor specifically for reconfigurable IO tasks
  Reconfigurable IO connected via 32-bit system bus
    Used to implement application-specific protocols and interfaces
  Features
    Dedicated FIFO buffers
    Dedicated control, state and synchronization registers
    Reconfigurable logic fabric for implementing custom interfaces
    Directly connected to configurable IO pads

XiSystem: Sample Results
  Able to implement a variety of protocols and algorithms within the available reconfigurable logic
    RS232: 39% logic utilization
    I2C: 8% logic utilization
    CRC: 32% logic utilization
    Reed-Solomon Coding: 20% logic utilization
  Reconfigurable logic allows the “interface” to offload computation from the general-purpose processor
    Pre/post data formatting and processing
    Error detection and correction
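
To give a sense of the per-bit work that error detection takes off the GPP when it moves into the IO module, here is a minimal software reference for a CRC. The slides do not say which CRC was implemented; CRC-16/CCITT-FALSE (polynomial 0x1021, initial value 0xFFFF) is chosen here purely as an illustration, and the helper names are hypothetical.

```python
def crc16_ccitt(data: bytes, crc: int = 0xFFFF) -> int:
    """Bitwise CRC-16/CCITT-FALSE: MSB first, no reflection, no final XOR."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):  # one shift (and conditional XOR) per message bit
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def append_crc(payload: bytes) -> bytes:
    """Sender side: attach the checksum, high byte first."""
    return payload + crc16_ccitt(payload).to_bytes(2, "big")


def check_frame(frame: bytes) -> bool:
    """Receiver side: an intact frame leaves a zero remainder."""
    return crc16_ccitt(frame) == 0


if __name__ == "__main__":
    frame = append_crc(b"123456789")
    print(hex(crc16_ccitt(b"123456789")), check_frame(frame))
```

The inner loop runs eight times per byte in software, whereas a hardware shift register can process the bits as they arrive on the interface; that difference is the kind of offloading the slide refers to.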

Hardware/Software Co-Design
  Complicated GPP and reconfigurable logic system
    Requires a good hardware/software co-design workflow
  Mixed design flow
    C-based flow for application processing
      Can be used by software engineers
    HDL-based flow for protocol/interface design
      Can be used by hardware engineers

XiSystem Application Design Flow
  Application processing, including PiCoGA mapping, starts with specialized C code
  Reconfigurable IO mapping starts with customized HDL code

Summary & Conclusions
  Reviewed a complete SoC
    Traditional GPP architecture
    Additional reconfigurable logic
      Specialized data processing
      Specialized interfacing
  System offers a number of key advantages
    Improved performance and power consumption
    Flexibility for application-specific changes and variable interfacing needs
    Complete hardware/software co-design balanced between software and hardware design needs

References
Compton, K. and Hauck, S., “Reconfigurable Computing: A Survey of Systems and Software”, ACM Computing Surveys, Vol. 34, No. 2, Jun. 2002
Todman et al., “Reconfigurable Computing: Architectures and Design Methods”, IEE Proceedings - Computers and Digital Techniques, Vol. 152, No. 2, Mar. 2005
Lodi et al., “A VLIW Processor with Reconfigurable Instruction Set for Embedded Applications”, IEEE Journal of Solid-State Circuits, Vol. 38, No. 11, Nov. 2003
Lodi et al., “XiSystem: A XiRisc-Based SoC with Reconfigurable IO Module”, IEEE Journal of Solid-State Circuits, Vol. 41, No. 1, Jan. 2006