design space exploration of stream-based dataflow...

31
Design Space Exploration of Stream-based Dataflow Architectures Dr. ir. Bart Kienhuis, Delft University of Technology In cooperation with Philips Research Laboratories E-mail: [email protected] 29 January, 1999

Upload: others

Post on 25-Jan-2021

0 views

Category:

Documents


0 download

TRANSCRIPT

  • Design Space Exploration of Stream-based

    Dataflow Architectures

    Dr. ir. Bart Kienhuis,

    Delft University of Technology

    In cooperation withPhilips Research Laboratories

    E-mail: [email protected]

    29 January, 1999

  • (1)

    System Level Design

    ?Application Integrated Circuit

    IC

    Architecture

    Cost Effective

    ProgrammabilityMulti-functionalMulti-standard

    (Real-time)

    Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 1/30

  • (2)

    New High-performance DSP Architectures

    ProcessorPurposeGeneral

    VideoOut

    Controller

    PE 1 PE 2 PE 3

    VideoIn Programmable Communication Network

    High BandwidthMemory

    Parallelism

    Streams

    Weakly Programmable

    � Operations: 1 Gops - 10 Gops

    � Bandwidth: 1000 Mbytes - 10.000 Mbytes per second

    Focus is on Stream-based Dataflow ArchitecturesDesign Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 2/30

  • (3)

    Outline

    � Architecture Template & Video Applications

    � The Y-chart Approach

    � The Y-chart Environment for Stream-based Dataflow

    Architectures

    – Retargetable Simulator

    – Mapping

    � Design Space Exploration

    � Example

    � Conclusions

    Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 3/30

  • (4)

    Architecture Template

    Global Controller

    Routers

    { Fp, Fq }Functional Elements

    Communication Structure

    Input/Output StreamC

    oar

    se-G

    rain

    Pro

    cess

    ing

    Ele

    men

    t

    Fp FqFu

    nct

    ion

    alU

    nit

    Inp

    ut

    Bu

    ffer

    sO

    utp

    ut

    Bu

    ffer

    s

    Design Choices

    � Function Repertoire

    � Grain Size

    � Controller Protocol

    � Buffer Capacity

    � Packet Length

    Metrics Constraints

    � Throughput (real-time)

    � Utilization

    Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 4/30

  • (5)

    Video Algorithms

    SinkTransposeSRCFIR

    FIR SRC TransposeSource

    Vertical Lines

    Horizontal Lines

    Specified as Kahn Process Networks (Lee&Parks’95)

    � Dynamic Dataflow

    � Deterministic execution trace (Kahn’74)

    Assumptions

    � Coarse-grained Functions

    � Sample BasedDesign Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 5/30

  • (6)

    Problem Statement

    The Designer’s Problem

    � Many design choices

    � Need to evaluate different design alternatives

    ☞ General and structured design approaches are lacking

    Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 6/30

  • (7)

    The Y-chart Approach(The Methodology)

    ApplicationsInstance Mapper

    PerformanceNumbers

    PerformanceAnalysis

    Architecture

    Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 7/30

  • (8)

    The Y-chart Approach (cont’d)(Abstraction Pyramid)

    High

    Alternative realizations

    cycle-accuratemodels

    explore

    Design Space

    back-of-the-envelope

    abstract executablemodels

    Ab

    stra

    ctio

    n

    Op

    po

    rtu

    nit

    ies

    Co

    st o

    f M

    od

    elin

    g/E

    valu

    atio

    n

    synthesizable

    VHDL models

    estimation models

    explore

    High

    Low

    Low

    Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 8/30

  • (9)

    The Y-chart Approach (cont’d)(Stack of Y-Charts)

    High Level

    Low Level

    Medium Level

    Applications

    PerformanceNumbers

    Applications

    PerformanceNumbers

    Applications

    PerformanceNumbers

    Mapping

    Mapping

    Mapping

    VHDLSimulator

    Cycle AccurateSimulator

    Matlab /Mathematica

    EstimationModels

    Models

    ModelsCycle Acc.

    VHDL

    (moving down in the Abstraction Pyramid}

    (moving down in the Abstraction Pyramid}

    Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 9/30

  • (10)

    The Y-chart Approach (cont’d)

    (Design Trajectory)

    LOW

    HIGH

    MEDIUM

    LOW

    HIGH

    MEDIUM

    Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 10/30

  • (11)

    Y-Chart Environment

    (For Stream-based Dataflow Architectures)

    Description

    RetargetableSimulator (ORAS)

    Mapper

    PerformanceNumbers

    ArchitectureSBF Model

    Applications

    Design Space Exploration

    Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 11/30

  • (12)

    Retargetable Architecture Simulator

    Required:

    � Retargetable simulator

    � Cycle accurate

    � Fast Simulator

    � Functional Correct

    Simulator Lang. Accuracy Sim. Speed 1 Video(Architecture) Instr./sec Frame

    SPIM (MIPS 3000) C instruction 200.000 10 min.tmsim (TriMedia) C clock-cycle 40.000 54 min.

    DLX (DLX) VHDL RTL 500 1.2 day.ORAS C++ cycle 10.000 3.6 hours

    Modeling

    Define anArchitecture Instance

    Architecture Template Architectural Elements

    ArchitectureInstance

    Programming

    3

    2

    1

    4

    Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 12/30

  • (13)

    Object Oriented Retargetable Architecture Simulator

    Buidling Blocks

    Architecture

    Simulator

    ExecutableArch. Inst.

    Run-Time Library

    Defined as ClassesGrammar

    Object Oriented Principles

    PAMELA

    Function Library (SBF)

    DescriptionArchitecture

    Parser

    Processes

    Instrumenting

    Architecture Template Architecture Elements

    PAMELA constructs

    Metric Collectors

    Structure

    RoutingProgram

    1. Structure

    2. Execution Model

    3. Measuring

    Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 13/30

  • (14)

    Step1: Architecture Description

    FunctionalUnit f

    Type: Packet;FunctionalElement FilterA(1,1) f

    Type: Synchrone;Function f Type:

    HighPass(throughput=1,latency=18); gBinding f Input ( 0->0 ); Output ( 0->0 ); g

    g

    FunctionalElement FilterB(1,1) f

    Type: Synchrone;Function f Type:

    LowPass(throughput=1,latency=15); g

    Binding f Input ( 0->1 ); Output ( 0->1 ); g

    g

    g

    (Architectural Element) (Type) (Behavior)

    Buffer

    Pipeline

    Functional Element

    Handshake

    SynchronousAsynchronous

    Functional Unit Packet-switchingSample-switching

    Processing Element

    Functional Element

    Controller

    Router

    ArchitecturalElements

    Base Class Abstract Classes Derived Classes

    First-come-first-servedRound Robin

    First-come-first-servedRound RobinTime-Division-Multiplexe

    Time-Division-Multiplexe

    Bounded FIFOUnbounded FIFO

    Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 14/30

  • (15)

    Step2: Execution Model

    FIFO::read

    f

    pam P (data);aSample = queue[readfifo];readfifo = (++readfifo)%cap;pam delay (1);pam V (room);

    g

    FIFO::write( aSample )

    f

    pam P (room);queue[writefifo] = aSample;writefifo = (++writefifo)%cap;pam delay (1);pam V (data);

    g

    Performance Modeling PAMELA

    � Mutual Exclusivity

    � Condition Synchronization

    Producer

    Process

    Consumer

    ReadWrite

    FIFO Buffer

    ReadWrite

    Process

    � Processes

    � Synchronization Primitives

    � Time

    Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 15/30

  • (16)

    Step3: Metric Collectors

    Element Type Performance MetricComm. Structure UtilizationController UtilizationBuffer Filling distributionRouters Response Time ControllerFunctional Unit Utilization, Number of Context Switches

    Functional ElementUtilization, Pipeline StallsThroughput, Number of Operations

    Architecture Number of Operations, Total execution time

    Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 16/30

  • (17)

    Mapping

    SinkTransposeSRCFIR

    FIR SRC TransposeSource

    Vertical Lines

    Horizontal Lines

    Global Controller

    Routers

    { Fp, Fq }Functional Elements

    Communication Structure

    Input/Output Stream

    Co

    arse

    -Gra

    inP

    roce

    ssin

    g E

    lem

    ent

    Fp FqFu

    nct

    ion

    alU

    nit

    Inp

    ut

    Bu

    ffer

    sO

    utp

    ut

    Bu

    ffer

    s

    ?Mapping is ignoredwhen too difficult

    Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 17/30

  • (18)

    Mapping (cont’d)

    Mapping Approach:

    � Explicit description of both Architecture and Applications

    � Description Formalisms and Data Types should correspond

    Time

    Functional Elements

    Model of Architecture Model of Computation

    Stream Based Functions

    Data Type

    Streams

    SamplesOrdering of Samples Ordering of Samples

    Architecture Algorithms

    Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 18/30

  • (19)

    Stream-based Functions

    Basis:1. Describe a network as a

    Kahn Process Network (Kahn ’74)

    2. Structure the nodes basedon the AST model of Backus(Backus ’78)

    � Controller

    � State

    � Set of Functions

    SinkFilter BFilter ASource

    State

    Read Ports

    Write Port

    Buffer3

    Buffer1

    Buffer0

    Control

    f_b

    f_a

    SBF Object

    Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 19/30

  • (20)

    Example of an SBF Object

    Binding Function

    �(s) =

    8>>>>><>>>>>:

    fa; if s = 0

    fa; if s = 1

    fb; if s = 2

    fc; if s = 3;

    Transition Function

    !(s) = s+1 (mod 3):Current State Function Buffer0 Buffer1 Buffer2

    s0 fa R R W

    s1 fa R R W

    s2 fb R W

    s3 fc W

    fc

    Controller

    fb

    State

    fa

    Read Ports

    Write Port

    Buffer0

    Buffer1

    Buffer2

    Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 20/30

  • (21)

    Mapping of an SBF Object

    FE

    Edge

    (maps to) (maps to) (maps to)

    Functional Unit

    Stream

    header

    packet

    data

    SBF Object

    State

    f_a

    f_b

    Controller

    RouterInput BufferOutput Buffer

    Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 21/30

  • (22)

    Mapping (cont’d)

    The Mapping approach results in an Application/Architecture

    interface

    � No need to rewrite applications

    � Capture the correct timing behavior of applications on

    arbitrary architecture instances

    – Pipeline and Throughput

    Example:

    Function f Type: LowPass(throughput=1,latency=15); g

    Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 22/30

  • (23)

    Design Space Exploration

    Inverse Transformation

    Applications

    PerformanceNumbers

    Mapping

    ArchitectureInstance

    ORASParameters Performance

    The Acquisition of InsightDesign Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 23/30

  • (24)

    Design Space Exploration (cont’d)

    Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 24/30

  • (25)

    Design Space Exploration (cont’d)

    TextInput

    ORAS

    Script Language PERL

    Text

    Output

    MappingTable

    ResultsParameters Results fromthe simulations

    Archicturedescription

    Nelsis BoxSimulatie

    Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 25/30

  • (26)

    Experiment

    Service Time = 4

    Capacity = 20

    Capacity = 30

    Latency = 1

    Throughput = 1

    Packet Length = 120

    Fir SRC Trans FIR TransSRC

    First-come-first-served

    Switch Matrix

    � Packet Length: f 4 ... 256 g Samples per Packet

    � Service Time: f 1 ... 20 g Cycles per Request

    Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 26/30

  • (27)

    Results

    Parallelism

    Result of simulating 25 different Architecture InstancesDesign Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 27/30

  • (28)

    Results (cont’d)

    Utilization

    Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 28/30

  • (29)

    Conclusions

    Presented a Method and Tools for exploring Stream-based

    Dataflow Architectures.

    ☞Better Engineered Architectures in less Time

    Method

    � Y-chart approach to quantify design choices

    � Y-chart environment for Stream-based Dataflow

    Architectures

    � Systematic exploration of the design space

    Tools

    � Object Oriented Retargetable Architecture Simulator (ORAS)

    � The SBF Model

    � Design Space Exploration environment based on NelsisDesign Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 29/30

  • (30)

    Future Work

    � Generalize presented methods and tools for heterogeneous

    system design at different levels of abstraction (Lieverse et al.)

    � Compiling Matlab applications into descriptions in terms of

    the SBF Model (Rypkema et al.)

    For more information look at:

    ➥http://cas.et.tudelft.nl/research/hse.html

    Design Space Exploration of Stream-based Dataflow Architectures / 29 January, 1999 30/30

    http://cas.et.tudelft.nl/research/hse.html