design space exploration of stream-based dataflow...
TRANSCRIPT
-
Design Space Exploration of Stream-based
Dataflow Architectures
Dr. ir. Bart Kienhuis,
Delft University of Technology
In cooperation withPhilips Research Laboratories
E-mail: [email protected]
29 January, 1999
-
(1)
System Level Design
?Application Integrated Circuit
IC
Architecture
Cost Effective
ProgrammabilityMulti-functionalMulti-standard
(Real-time)
Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 1/30
-
(2)
New High-performance DSP Architectures
ProcessorPurposeGeneral
VideoOut
Controller
PE 1 PE 2 PE 3
VideoIn Programmable Communication Network
High BandwidthMemory
Parallelism
Streams
Weakly Programmable
� Operations: 1 Gops - 10 Gops
� Bandwidth: 1000 Mbytes - 10.000 Mbytes per second
Focus is on Stream-based Dataflow ArchitecturesDesign Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 2/30
-
(3)
Outline
� Architecture Template & Video Applications
� The Y-chart Approach
� The Y-chart Environment for Stream-based Dataflow
Architectures
– Retargetable Simulator
– Mapping
� Design Space Exploration
� Example
� Conclusions
Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 3/30
-
(4)
Architecture Template
Global Controller
Routers
{ Fp, Fq }Functional Elements
Communication Structure
Input/Output StreamC
oar
se-G
rain
Pro
cess
ing
Ele
men
t
Fp FqFu
nct
ion
alU
nit
Inp
ut
Bu
ffer
sO
utp
ut
Bu
ffer
s
Design Choices
� Function Repertoire
� Grain Size
� Controller Protocol
� Buffer Capacity
� Packet Length
Metrics Constraints
� Throughput (real-time)
� Utilization
Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 4/30
-
(5)
Video Algorithms
SinkTransposeSRCFIR
FIR SRC TransposeSource
Vertical Lines
Horizontal Lines
Specified as Kahn Process Networks (Lee&Parks’95)
� Dynamic Dataflow
� Deterministic execution trace (Kahn’74)
Assumptions
� Coarse-grained Functions
� Sample BasedDesign Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 5/30
-
(6)
Problem Statement
The Designer’s Problem
� Many design choices
� Need to evaluate different design alternatives
☞ General and structured design approaches are lacking
Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 6/30
-
(7)
The Y-chart Approach(The Methodology)
ApplicationsInstance Mapper
PerformanceNumbers
PerformanceAnalysis
Architecture
Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 7/30
-
(8)
The Y-chart Approach (cont’d)(Abstraction Pyramid)
High
Alternative realizations
cycle-accuratemodels
explore
Design Space
back-of-the-envelope
abstract executablemodels
Ab
stra
ctio
n
Op
po
rtu
nit
ies
Co
st o
f M
od
elin
g/E
valu
atio
n
synthesizable
VHDL models
estimation models
explore
High
Low
Low
Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 8/30
-
(9)
The Y-chart Approach (cont’d)(Stack of Y-Charts)
High Level
Low Level
Medium Level
Applications
PerformanceNumbers
Applications
PerformanceNumbers
Applications
PerformanceNumbers
Mapping
Mapping
Mapping
VHDLSimulator
Cycle AccurateSimulator
Matlab /Mathematica
EstimationModels
Models
ModelsCycle Acc.
VHDL
(moving down in the Abstraction Pyramid}
(moving down in the Abstraction Pyramid}
Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 9/30
-
(10)
The Y-chart Approach (cont’d)
(Design Trajectory)
LOW
HIGH
MEDIUM
LOW
HIGH
MEDIUM
Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 10/30
-
(11)
Y-Chart Environment
(For Stream-based Dataflow Architectures)
Description
RetargetableSimulator (ORAS)
Mapper
PerformanceNumbers
ArchitectureSBF Model
Applications
Design Space Exploration
Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 11/30
-
(12)
Retargetable Architecture Simulator
Required:
� Retargetable simulator
� Cycle accurate
� Fast Simulator
� Functional Correct
Simulator Lang. Accuracy Sim. Speed 1 Video(Architecture) Instr./sec Frame
SPIM (MIPS 3000) C instruction 200.000 10 min.tmsim (TriMedia) C clock-cycle 40.000 54 min.
DLX (DLX) VHDL RTL 500 1.2 day.ORAS C++ cycle 10.000 3.6 hours
Modeling
Define anArchitecture Instance
Architecture Template Architectural Elements
ArchitectureInstance
Programming
3
2
1
4
Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 12/30
-
(13)
Object Oriented Retargetable Architecture Simulator
Buidling Blocks
Architecture
Simulator
ExecutableArch. Inst.
Run-Time Library
Defined as ClassesGrammar
Object Oriented Principles
PAMELA
Function Library (SBF)
DescriptionArchitecture
Parser
Processes
Instrumenting
Architecture Template Architecture Elements
PAMELA constructs
Metric Collectors
Structure
RoutingProgram
1. Structure
2. Execution Model
3. Measuring
Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 13/30
-
(14)
Step1: Architecture Description
FunctionalUnit f
Type: Packet;FunctionalElement FilterA(1,1) f
Type: Synchrone;Function f Type:
HighPass(throughput=1,latency=18); gBinding f Input ( 0->0 ); Output ( 0->0 ); g
g
FunctionalElement FilterB(1,1) f
Type: Synchrone;Function f Type:
LowPass(throughput=1,latency=15); g
Binding f Input ( 0->1 ); Output ( 0->1 ); g
g
g
(Architectural Element) (Type) (Behavior)
Buffer
Pipeline
Functional Element
Handshake
SynchronousAsynchronous
Functional Unit Packet-switchingSample-switching
Processing Element
Functional Element
Controller
Router
ArchitecturalElements
Base Class Abstract Classes Derived Classes
First-come-first-servedRound Robin
First-come-first-servedRound RobinTime-Division-Multiplexe
Time-Division-Multiplexe
Bounded FIFOUnbounded FIFO
Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 14/30
-
(15)
Step2: Execution Model
FIFO::read
f
pam P (data);aSample = queue[readfifo];readfifo = (++readfifo)%cap;pam delay (1);pam V (room);
g
FIFO::write( aSample )
f
pam P (room);queue[writefifo] = aSample;writefifo = (++writefifo)%cap;pam delay (1);pam V (data);
g
Performance Modeling PAMELA
� Mutual Exclusivity
� Condition Synchronization
Producer
Process
Consumer
ReadWrite
FIFO Buffer
ReadWrite
Process
� Processes
� Synchronization Primitives
� Time
Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 15/30
-
(16)
Step3: Metric Collectors
Element Type Performance MetricComm. Structure UtilizationController UtilizationBuffer Filling distributionRouters Response Time ControllerFunctional Unit Utilization, Number of Context Switches
Functional ElementUtilization, Pipeline StallsThroughput, Number of Operations
Architecture Number of Operations, Total execution time
Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 16/30
-
(17)
Mapping
SinkTransposeSRCFIR
FIR SRC TransposeSource
Vertical Lines
Horizontal Lines
Global Controller
Routers
{ Fp, Fq }Functional Elements
Communication Structure
Input/Output Stream
Co
arse
-Gra
inP
roce
ssin
g E
lem
ent
Fp FqFu
nct
ion
alU
nit
Inp
ut
Bu
ffer
sO
utp
ut
Bu
ffer
s
?Mapping is ignoredwhen too difficult
Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 17/30
-
(18)
Mapping (cont’d)
Mapping Approach:
� Explicit description of both Architecture and Applications
� Description Formalisms and Data Types should correspond
Time
Functional Elements
Model of Architecture Model of Computation
Stream Based Functions
Data Type
Streams
SamplesOrdering of Samples Ordering of Samples
Architecture Algorithms
Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 18/30
-
(19)
Stream-based Functions
Basis:1. Describe a network as a
Kahn Process Network (Kahn ’74)
2. Structure the nodes basedon the AST model of Backus(Backus ’78)
� Controller
� State
� Set of Functions
SinkFilter BFilter ASource
State
Read Ports
Write Port
Buffer3
Buffer1
Buffer0
Control
f_b
f_a
SBF Object
Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 19/30
-
(20)
Example of an SBF Object
Binding Function
�(s) =
8>>>>><>>>>>:
fa; if s = 0
fa; if s = 1
fb; if s = 2
fc; if s = 3;
Transition Function
!(s) = s+1 (mod 3):Current State Function Buffer0 Buffer1 Buffer2
s0 fa R R W
s1 fa R R W
s2 fb R W
s3 fc W
fc
Controller
fb
State
fa
Read Ports
Write Port
Buffer0
Buffer1
Buffer2
Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 20/30
-
(21)
Mapping of an SBF Object
FE
Edge
(maps to) (maps to) (maps to)
Functional Unit
Stream
header
packet
data
SBF Object
State
f_a
f_b
Controller
RouterInput BufferOutput Buffer
Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 21/30
-
(22)
Mapping (cont’d)
The Mapping approach results in an Application/Architecture
interface
� No need to rewrite applications
� Capture the correct timing behavior of applications on
arbitrary architecture instances
– Pipeline and Throughput
Example:
Function f Type: LowPass(throughput=1,latency=15); g
Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 22/30
-
(23)
Design Space Exploration
Inverse Transformation
Applications
PerformanceNumbers
Mapping
ArchitectureInstance
ORASParameters Performance
The Acquisition of InsightDesign Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 23/30
-
(24)
Design Space Exploration (cont’d)
Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 24/30
-
(25)
Design Space Exploration (cont’d)
TextInput
ORAS
Script Language PERL
Text
Output
MappingTable
ResultsParameters Results fromthe simulations
Archicturedescription
Nelsis BoxSimulatie
Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 25/30
-
(26)
Experiment
Service Time = 4
Capacity = 20
Capacity = 30
Latency = 1
Throughput = 1
Packet Length = 120
Fir SRC Trans FIR TransSRC
First-come-first-served
Switch Matrix
� Packet Length: f 4 ... 256 g Samples per Packet
� Service Time: f 1 ... 20 g Cycles per Request
Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 26/30
-
(27)
Results
Parallelism
Result of simulating 25 different Architecture InstancesDesign Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 27/30
-
(28)
Results (cont’d)
Utilization
Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 28/30
-
(29)
Conclusions
Presented a Method and Tools for exploring Stream-based
Dataflow Architectures.
☞Better Engineered Architectures in less Time
Method
� Y-chart approach to quantify design choices
� Y-chart environment for Stream-based Dataflow
Architectures
� Systematic exploration of the design space
Tools
� Object Oriented Retargetable Architecture Simulator (ORAS)
� The SBF Model
� Design Space Exploration environment based on NelsisDesign Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 29/30
-
(30)
Future Work
� Generalize presented methods and tools for heterogeneous
system design at different levels of abstraction (Lieverse et al.)
� Compiling Matlab applications into descriptions in terms of
the SBF Model (Rypkema et al.)
For more information look at:
➥http://cas.et.tudelft.nl/research/hse.html
Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 30/30
http://cas.et.tudelft.nl/research/hse.html