1
ASPDAC’07 Tutorial
Embedded System Design:Concepts and Tools
Daniel D. GajskiAndreas Gerstlauer
Samar AbdiCenter for Embedded Computer Systems
University of California, Irvine
Copyright ©2007 CECSASPDAC’07 Tutorial
Outline
• Requirements
• Modeling
• Verification and synthesis
• Summary, conclusions and outlook
2
ASPDAC’07 Tutorial
Embedded System Design:Requirements
Daniel D. GajskiCenter for Embedded Computer Systems
University of California, Irvine
Copyright ©2007 CECSASPDAC’07 Tutorial
Outline
• Requirements• Issues
• Models
• Platforms
• Tools
• Modeling• Verification and synthesis• Summary, conclusions and outlook
3
Copyright ©2007 CECSASPDAC’07 Tutorial
Closing the System Gap
Real gap: behavior and structure (semantics and syntax)
Copyright ©2007 CECSASPDAC’07 Tutorial
Simulation Based Methodology
Simulatable but not synthesizable or verifiable
Ambiguous semantics of hardware/system level languages
4
Copyright ©2007 CECSASPDAC’07 Tutorial
In Search of a Solution
Algebra: < objects, operations>
Arithmetic algebra allows creation of expressions and equivalences
Copyright ©2007 CECSASPDAC’07 Tutorial
Model Algebra
Model algebra: <objects, compositions>
Model algebra allows creation of models and model equivalences
5
Copyright ©2007 CECSASPDAC’07 Tutorial
Specify-Explore-Refine Methodology
Design decisions
Model refinement
Replacement or re-composition
FPGA board
Copyright ©2007 CECSASPDAC’07 Tutorial
General System Design EnvironmentModel A
Model B
Estimationtool
Synthesistool
Componentlibrary
SimulationtoolGUI Verify
tool
ti
Refinementtool
Transforms:t1t2...
tn
6
Copyright ©2007 CECSASPDAC’07 Tutorial
How many models?
Minimal set for any methodology
(Are 3 enough ?)
• System specification model (application designers)
• Transaction-level model (system designers)
• Pin&Cycle accurate model (implementation designers)
Copyright ©2007 CECSASPDAC’07 Tutorial
Three Models (with Respect to OSI)
Pin / Cycle Accurate Model
Transaction Level Model
Specification Model
7. Application6. Presentation5. Session4. Transport3. Network2b. Link + Stream2a. Media Access Ctrl2a. Protocol1. Physical
7. Application6. Presentation5. Session4. Transport3. Network2b. Link + Stream2a. Media Access Ctrl2a. Protocol1. Physical
Address lines
Data lines
Control lines
TLM
Spec
P/CAM
Source: G Schirner
7
Copyright ©2007 CECSASPDAC’07 Tutorial
How many components?
Minimal set for any design
(Are 4 enough?)
• Processing element (PE)
• Memory
• Transducer / Bridge
• Arbiter
Copyright ©2007 CECSASPDAC’07 Tutorial
General System Model
Bus2Bus1 Bus3
Arbiter 1
PE 1.1
Transducer2-3
Arbiter 2
Arbiter 3
PE 1.2
Memory 1
PE 2.1(Master)
PE 3.1
Memory 3
Interrupt1.1
Interrupt2.1
PE 2.2(Slave)
Transducer1-2 Interrupt2.2
Interrupt3.1
Interrupt3.2
8
Copyright ©2007 CECSASPDAC’07 Tutorial
How many tools?
Minimal set for any methodology
(Are 2 enough?)
• Front-End (for application developers)– Input: C, C++, Mathlab, UML, …
– Output: TLM
• Back-End (for SW/HW system designers)– Input : TLM
– Output: Pin/Cycle accurate Verilog/VHDL
Copyright ©2007 CECSASPDAC’07 Tutorial
DecisionUser
Interface (DUI)
ValidationUser
Interface (VUI)
TIMED
Create
Map
Compile
Replace
Select
Partition
Compiler
Debugger
Stimulate
Verify
ES Environment
Application Tools : Compilers/Debuggers Commercial Tools : FPGA, ASIC
CYCLE ACCURATE
Verify
Simulate
Check
Compile
ESE Front – End
System Capture + Platform Development
SW Development + HW Development
ESE Back – End
9
Copyright ©2007 CECSASPDAC’07 Tutorial
Does it work?
• Intuitively it does– Well defined models, rules, transformations, refinements– Worked in the past: layout, logic, RTL?– System level complexity simplified
• Following talks will prove the concept– ES modeling abstractions– Automatic model generation– Model synthesis and verification– Embedded System Environment (ESE)
Copyright ©2007 CECSASPDAC’07 Tutorial
Outline
RequirementsIssues
Models
Platforms
Tools
• Modeling• Verification and synthesis• Summary, conclusions and outlook
10
ASPDAC’07 Tutorial
Embedded System Design:Modeling
Andreas GerstlauerCenter for Embedded Computer Systems
University of California, Irvinehttp://www.cecs.uci.edu/pub_slides
Copyright ©2007 CECSASPDAC’07 Tutorial
Outline
• Requirements• Modeling
• Introduction• Languages and models• System modeling semantics• Automatic model generation• Design example• Tools• Summary and conclusions
• Verification and synthesis• Summary, conclusions and outlook
11
Copyright ©2007 CECSASPDAC’07 Tutorial
Introduction
• Growing SoC complexities and sizes• Heterogeneous multi-processor SoC (MPSoC)• Networked, distributed systems
System modeling• Validation and analysis• Concurrent hardware/software development• Specification for further implementation/synthesis
Higher levels of abstraction for speed/accuracy tradeoffsRapid, early design space exploration
Copyright ©2007 CECSASPDAC’07 Tutorial
System Modeling
Computation
Com
mun
icat
ion
A B
C
D F
Un-timed
Approximate-timed
Cycle-timed
Un-timed
Approximate-timed
A. System specification modelB. Component modelC. Bus-arbitration modelD. Bus-functional modelE. Cycle-accurate computation modelF. RTL/ISS Implementation model
E
Cycle-timed
Source: Lukai Cai, D. Gajski. “Transaction level modeling: An overview”, ISSS 2003
• Abstraction based on level of detail/granularity• Computation• Communication
System-level design flowPath from model A to model F
12
Copyright ©2007 CECSASPDAC’07 Tutorial
System Design Languages
• Netlists• Structure only: components and connectivity
Gate-level [EDIF], system-level [SPIRIT/XML]• Hardware description languages (HDLs)
• Event-driven behavior: signals/wires, clocks• Register-transfer level (RTL): boolean logic
Discrete event [VHDL, Verilog]• System-level design languages (SLDLs)
• Software behavior: sequential functionality/programsC-based [SpecC, SystemC, SystemVerilog]
Copyright ©2007 CECSASPDAC’07 Tutorial
ESL Modeling Today
• Simulation-centric system modeling• Co-simulation at lower RTL/ISS levels [Axys, VaST]
– CPU-centric, limited architectures/designs [ARM]
• Transaction-level models [CoWare, Summit]– Multi-processor system simulation [SystemC]– Graphical platform assembly [Eclipse]– Processor customization [Tensilica, Lisatek]
• Algorithmic specification [SPW, MATLAB, COSSAP]– Varying models of computation (MoC) [Ptolemy]– Model-based design [UML, MATLAB/Simulink]
Horizontal integration of different models / componentsLack vertical integration for synthesis-centric approach
13
Copyright ©2007 CECSASPDAC’07 Tutorial
System Modeling Needs
• Language ambiguities• Simulation vs. synthesis (subsets)• Impossible to automatically discern implicit meaning
Well-defined, rigorous semantics• Unambiguous, explicit abstractions, models
– Objects and composition rules– Computation and communication
• Systematic flow from specification to implementation– Transformations and refinements
Reliable feedback at early stagesDesign automation for synthesis, verificationRapid, early design space exploration
Copyright ©2007 CECSASPDAC’07 Tutorial
Outline
• Requirements• Modeling
• Introduction• Languages and models• System modeling semantics• Automatic model generation• Design example• Tools• Summary and conclusions
• Verification and synthesis• Summary, conclusions and outlook
14
Copyright ©2007 CECSASPDAC’07 Tutorial
Computation Modeling
• Application modeling• Native process execution (C code)• Back-annotated execution timing
• Processor modeling• Operating system
– Real-time multi-tasking (RTOS model)– Bus drivers (C code)
• Hardware abstraction layer (HAL)– Interrupt handlers– Media accesses
• Processor hardware– Bus interfaces (I/O state machines)– Interrupt suspension and timing
B1 B2
OS
CPU
Drv
Interrupts
Bus
ISRHAL
Process B1(){
…waitfor(15000);…waitfor(25000);…
};
Source: G. Schirner, A. Gerstlauer, R. Doemer. “Abstract, Multifaceted Modeling of Embedded Processors for System Level Design,” ASPDAC, 2007.
Copyright ©2007 CECSASPDAC’07 Tutorial
RTOS Modeling• High-level RTOS abstraction
• Wrap around SLDL primitives, replace event handling
– High-level, abstract model– Target-independent, canonical API– Small overhead, low complexity
• Dynamic scheduling behavior– Real-time scheduling, multi-tasking– Preemption, interrupt handling
• IPC channels– Task communication, synchronization
Application
SLDL
Channels
Application
SLDL
RTOS Model
Channels
Application
SLDL
Comm. & Sync. API
ISS
RTOS
Specification TLM Implementation
Application
CPU HW Model
RTOS Model
B1 B2 B3 B4 B5
SLDL
Source: A. Gerstlauer, H. Yu, D. Gajski. "RTOS Modeling for System-Level Design," DATE, 2003.
15
Copyright ©2007 CECSASPDAC’07 Tutorial
Communication Modeling
SoC communication layers
1InterconnectDriving, samplingPins, wiresPhysical
2aHardwareProtocol timingMedia (word/frame) transactionsProtocol
2aHALData slicing,Arbitration
Shared medium byte streamsMedia Access
2bDriverMultiplexing,Addressing
Point-to-point control/data streamsStream
2bDriverStation typing,Synchronization
Point-to-point logical linksLink
3OS kernelRoutingEnd-to-end packetsNetwork
4OS kernelPacketing, Flow control,Error correction
End-to-end data streamsTransport
5OS kernelSynchronization,Multiplexing
End-to-end untyped messagesSession
6ApplicationData formattingEnd-to-end typed messagesPresentation
7ApplicationComputationChannels, variablesApplication
OSIImplementationFunctionalitySemanticsLayer
Source: A. Gerstlauer, D. Shin, R. Dömer, D. Gajski, "System-Level Communication Modeling for Network-On-Chip Synthesis," ASPDAC, 2005.
Copyright ©2007 CECSASPDAC’07 Tutorial
Transaction-Level Modeling
Pin / Cycle Accurate Model
Transaction Level Model
Specification Model
7. Application6. Presentation5. Session4. Transport3. Network2b. Link + Stream2a. Media Access Ctrl2a. Protocol1. Physical
7. Application6. Presentation5. Session4. Transport3. Network2b. Link + Stream2a. Media Access Ctrl2a. Protocol1. Physical
Address lines
Data lines
Control lines
TLM
Spec
P/CAM
Source: G. Schirner
16
Copyright ©2007 CECSASPDAC’07 Tutorial
Modeling Results
0.01
0.1
1
10
100
1000
10000
100000
Spec. TLM (Net) TLM (Prot) PAM CAM
Sim
ulat
ion
Tim
e [s
]
MP3JPEGGSM
0
5
10
15
20
25
Spec. TLM (Net) TLM (Prot) PAM CAM
Ave
rage
Err
or [%
]
MP3JPEGGSM
Source: G. Schirner, A. Gerstlauer, R. Doemer
Copyright ©2007 CECSASPDAC’07 Tutorial
Outline
• Requirements• Modeling
• Introduction• Languages and models• System modeling semantics• Automatic model generation• Design example• Tools• Summary and conclusions
• Verification and synthesis• Summary, conclusions and outlook
17
Copyright ©2007 CECSASPDAC’07 Tutorial
Automatic Model Generation• Problem: Writing of system models is
• Time consuming, error-prone, tedious• Solution:
• Automatic model generation• Refinement-based approach
• System designer : makes design decisions• Refinement tool : automatically generates the model
• Benefits• No manual model writing, focus on design decisions• Low error rate by automating error-prone tasks• Easy change/upgrade for incremental/derivative design• No change in basic design methodology
Enables fast, extensive design space explorationProductivity gains (1000x)Shorter time-to-market
Copyright ©2007 CECSASPDAC’07 Tutorial
ESL Design Flow
Model GenerationFront-End
Model GenerationFront-End
Synthesis(Hardware/Software)
Back-End
Synthesis(Hardware/Software)
Back-End
TLM
Inst
ruct
ion-
Set S
imul
ator
sSystem
C
Object Code VHDL/Verilog RTL
Platform Architecture Platform Services/Application
System ModelsTLMTLMTLMn
18
Copyright ©2007 CECSASPDAC’07 Tutorial
Platform Architecture
DCT(IP)
TX
CPU(ARM7)
M1(SRAM)
M1Ctrl
I/O4(HW)
CoPro(Custom)
DSP(DSP56k)
MBUS
BUS1 (AHB) BUS2 (DSP)
S
SS
S
M
S
M2/S
M1 M
Arb
iter1
Arb
IP BridgeM
S
DCTBus
S
I/O3(HW)
S
I/O2(HW)
S
I/O1(HW)
S
S
DMA(2753A)
• Components• Processors• Memories• IPs• Custom HW
• Connectivity• Buses• Bridges• Ports
Copyright ©2007 CECSASPDAC’07 Tutorial
Platform Services
DCT
TX
ARM
M1Ctrl
I/O4
HW
DSP
MBUS
BUS1 (AHB) BUS2 (DSP)
Arb
iter1
IP Bridge
DCTBus
I/O3I/O2I/O1DMA
M1
Enc Dec15 10
RTOS
Jpeg
Codebk
v1v2
SI BO BI SO
DCT
v3
• Computation • Processes (hierarchical C)• Scheduling (static/OS)
Application
Test code
19
Copyright ©2007 CECSASPDAC’07 Tutorial
Platform Services
DCT
TX
ARM
M1Ctrl
I/O4
HW
DSP
MBUS
BUS1 (AHB) BUS2 (DSP)
Arb
iter1
IP Bridge
DCTBus
I/O3I/O2I/O1DMA
M1
Enc DecJpeg
Codebk
v1v2
SI BO BI SO
DCT
v3
C2
C1
C3C4
C5
C6
C7
C8
C0
0x400x48
int0
0x14
0x80
0x10,int1 0xA000,intD
0x0C00,intC
0x0C50,intC
0x08
00,in
tA
0x09
00,in
tB
0x09
50
0x08
50
• Computation • Processes (hierarchical C)• Scheduling (static/OS)
• Communication• High-level IPC (channels)• Storage (variables)
Copyright ©2007 CECSASPDAC’07 Tutorial
Model Generation Process• Synthesis = Decision making + model refinement
Successive model refinementLayers of implementation detail
RefinementRefinement
Model nModel n
LibLib
Model n+1Model n+1
Specification modelSpecification model
Implementation modelImplementation model
Optim. algorithmOptim. algorithm
GUIGUI
Design decisions
20
Copyright ©2007 CECSASPDAC’07 Tutorial
Specification Model
RcvData
Co-process
SpchOut
DCT
stripe[]Decoder
JPEG Coder
OSModel
SerOut
SpchIn
SerIn
DSPARM_OS
Mem
DMA
HW
DCT_IP
BI
BO
SO
SICtrl
ARM7
DSP_OS
DC
TAda
pter
Vocoder
Copyright ©2007 CECSASPDAC’07 Tutorial
Transaction-Level Model (Network)
ARM_OSARM7
DMADMA_HW
DCT
DCT_IP
M
S
DSP_OS
DSP
OSModel
HW
SI
SI_HW
BI
BI_HW
SOSO_HW
BOBO_HW
Bridge
MSlinkBri
l
DC
TAda
pter
linkD
MA
linkBrilinkBI
linkHW
linkSI
linkBO
linkSO
Mem
• Data conversion, channel merging• Transducer insertion, Packeting, Routing
21
Copyright ©2007 CECSASPDAC’07 Tutorial
Transaction-Level Model (Protocol)
mem
mac
ARM_OSARM_HAL
AD
DR ARM7
DMA
mac
mem
DMA_HW
AD
DR
Mem
mem
Mem_HW
AD
DR
DCT
DCT_IP
i_semaphore
Arbiter
cfMaster
cfSlave
ahbP
roto
col
T_H
WDSP_OS
DSP
_HA
L
DSP
AD
DR
OSModel
mac
intA intB intC intD
HW
mac
AD
DR
HW_HW
SI
mac
AD
DR
SI_HW
BI
mac
AD
DR
BI_HW
SO
mac
AD
DR
SO_HW
BO
mac
AD
DR
BO_HW
mac
AD
DR
Bridge
dspMaster
dspSlave
dspProtocol
mac
AD
DR
poll POLL_ADDR
poll POLL_ADDR
poll POLL_ADDR
poll POLL_ADDR
• Synchronization, addressing, media acces• Arbitration, data slicing, interrupt handling
Copyright ©2007 CECSASPDAC’07 Tutorial
Pin-Accurate Model
ARM_BF
ISR
l
PIC
ARM_OSARM_HAL
ARM_HW
AD
DR ARM7
DMA
DMA_BF
AD
DR
AD
DR
Mem
Mem_BF
AD
DR
DCT
DCT_IP
Arbiter
T_BF
DSP
_BF
PIC
DSP_OS
DSP
_HA
L DSP
_HW
DSP
l
AD
DR
OSModel
ISR
HW
AD
DR
HW_BF
SI
SI_BF
BI
ADDR,POLL_ADDR
BI_BF
SO
SO_BF
BO
BO_BF
Bridge
AD
DR
AD
DR
ADDR,POLL_ADDR
ADDR,POLL_ADDR
ADDR,POLL_ADDR
Implementation synthesis in backend toolsInterface synthesis on hardware sideBus driver and RTOS synthesis on the software side
22
Copyright ©2007 CECSASPDAC’07 Tutorial
Outline
• Requirements• Modeling
• Introduction• Languages and models• System modeling semantics• Automatic model generation• Design example• Tools• Summary and conclusions
• Verification and synthesis• Summary, conclusions and outlook
Copyright ©2007 CECSASPDAC’07 Tutorial
Example: MP3 Decoder
• Functional block diagram (major blocks only)
• Timing constraints• 38 frames per second• Frame delay < 26.12ms
HuffDec
FilterCoreIMDCT
PCM
FilterCoreIMDCT
mp3 pcm
Left channel
Right channel
AliasRed
AliasRed
2 granules
23
Copyright ©2007 CECSASPDAC’07 Tutorial
System Definition
• Components allocated from database • Bridge used to interface two busses
• Hierarchical processes inside PEs• Parallel, sequential, leaf (C code)
• Channels and variables between PEs
ARM
Mem
Brid
ge
mainBusArbi
ter
PCM
. HuffEnc
AliasRed
AliasRed
IMDCT
IMDCT
FilterCore
FilterCore
PCM
pcmBus
ch1
ch0
real gainpow2[378];...
Copyright ©2007 CECSASPDAC’07 Tutorial
Output: Transaction-Level Model (TLM)
mainBus
OSHAL
Drivers
pcmBus
PCM
. HuffEnc
AliasRed
AliasRed
IMDCT
IMDCT
FilterCore
FilterCore
ARM
Mem
Bridge PCM
24
Copyright ©2007 CECSASPDAC’07 Tutorial
TLM Simulation• Timed-simulation
• Computation delays back-annotated in processes• Communication delays modeled in the bus channels
• Simulation of TLM• Functionally correct but frame delay far exceeds timing constraint
Copyright ©2007 CECSASPDAC’07 Tutorial
Computation Analysis
• View computation time of processes • FilterCore is the most computation-intensive
• Look for parallelism in process hierarchy• FilterCore processes are running in parallel
→Use two identical custom HWs
DecodeMP3 time distribution
huffdecprogranule
Comm.
Progranule time distribution
Comm.right
channel
left channel
Channel time distribution
imdct
Comm.
aliasred
filtercore
ARMARMv7
uCosII RTOS...View Graph...
Properties...
25
Copyright ©2007 CECSASPDAC’07 Tutorial
System Modifications
• Add two more PEs: HW1 and HW2
ARM
Mem
Brid
ge
.
mainBus
HuffEnc
AliasRed
AliasRed
IMDCT
IMDCT
FilterCore
FilterCore
Arbi
ter
PCMPCM
ch1
ch0
real gainpow2[378];...
pcmBus
HW1 HW2
• Connect HW1 and HW2 to the mainBus• Move processes from ARM to HWs
• Channels are automatically inserted
ARM
. HuffEnc
AliasRed
AliasRed
IMDCT
IMDCT
ch3
ch2
FilterCoreFilterCore
Copyright ©2007 CECSASPDAC’07 Tutorial
Modified TLM
mainBus
OSHAL
Drivers
pcmBus
PCM
. HuffEnc
AliasRed
AliasRed
IMDCT
IMDCT
ARM
Mem
Bridge PCM
FilterCore
HW1
FilterCore
HW2
26
Copyright ©2007 CECSASPDAC’07 Tutorial
Modified TLM Simulation
• Simulation of TLM• Functionally correct• Frame delay still violates timing constraint
Copyright ©2007 CECSASPDAC’07 Tutorial
Communication Analysis• Bus utilization graph
• Bus contention graph
27
Copyright ©2007 CECSASPDAC’07 Tutorial
ch1
ch0
More System Modifications
ARM
Mem
Brid
ge
.
mainBus
HuffEnc
AliasRed
AliasRed
IMDCT
IMDCT
Arbi
ter
PCMPCM
real gainpow2[378];...
pcmBus
HW1 HW2FilterCoreFilterCore
ch3
ch2
• Make two ports in HW1, HW2 • Connect HWs directly to the DHS bus
pcmBusch1
ch0
• Remove the bridge
Copyright ©2007 CECSASPDAC’07 Tutorial
Modified TLM
mainBus
OSHAL
Drivers
PCM
. HuffEnc
AliasRed
AliasRed
IMDCT
IMDCT
ARM
Mem
PCM
FilterCore
HW1
FilterCore
HW2
pcmBus
28
Copyright ©2007 CECSASPDAC’07 Tutorial
Modified TLM Simulation
• Simulation of TLM• Functionally correct• Frame delay now meets timing constraint
Copyright ©2007 CECSASPDAC’07 Tutorial
Outline
• Requirements• Modeling
• Introduction• Languages and models• System modeling semantics• Automatic model generation• Design example• Tools• Summary and conclusions
• Verification and synthesis• Summary, conclusions and outlook
29
Copyright ©2007 CECSASPDAC’07 Tutorial
ELEGANT Environment
System-level design exploration
SER
System-level design exploration
SER
Spec-level modelSpec-level model
Resource library
Resource library
SimulatorVisualSpec
withFastVeri
Formalverification
tools
TLMTLM
Pin-accuratemodel
Pin-accuratemodel
Cycle-accurate model
Cycle-accurate model
Behavioral synthesisCyber
Analog-to-digital decision tools
RTL HDLRTL HDL
Source: InterDesign Technologies, Inc.
Copyright ©2007 CECSASPDAC’07 Tutorial
ELEGANT Environment
• Complete ESL design environment• Japanese Aerospace Exploration Agency (JAXA)• Focus on correctness and reliability• Integration of tools for simulation, synthesis, verification
• Specify-Explore-Refine (SER) tool• 1st-generation of automatic model generation• Top-down design
– Specifiy desired system functionality– Explore architectural design alternatives– Refine specification into automatically generated TLM or PAM
30
Copyright ©2007 CECSASPDAC’07 Tutorial
DecisionUser
Interface (DUI)
ValidationUser
Interface (VUI)
TIMED
Create
Map
Compile
Replace
Select
Partition
Compiler
Debugger
Stimulate
Verify
Embedded Systems Environment (ESE)
Application Tools : Compilers/Debuggers Commercial Tools : FPGA, ASIC
CYCLE ACCURATE
Verify
Simulate
Check
Compile
ESE Front – End
System Capture + Platform Development
SW Development + HW Development
ESE Back – End
Copyright ©2007 CECSASPDAC’07 Tutorial
ES Environment (ESE)
• Next-generation ESL tool set• ESE Front-End
– Platform capture and virtual prototyping (modeling, simulation)– Embedded software development environment– Communication network model (advanced TLM) generation– Heterogeneous multi-processor SoC (MPSoC) target support
• ESE Back-End– Platform customization, optimization and implementation– Software and hardware synthesis– FPGA board prototyping
31
Copyright ©2007 CECSASPDAC’07 Tutorial
ESE: System DefinitionESE – MP3Decoder.ese*ESE – MP3Decoder.ese*
Output
PEs
File Edit
Project
p1.sc
p2.sc
Platform1
Platform2
Ready
scc: SpecC Compiler V 2.2.0(c) 1997-2005 CECS, University of California, Irvine
PCM MemARM
right
Main
aliasred
imdct
p2_2.sc
Architecture
Sources
ARMARMv7
uCosII
MemSRAM64
PCM
Bridge
mainBusAMBA_AHB pcmBus
DoubleHdshk
View Project Simulation Help
Database
CEs Busses
ARMv7
SRAM64
CPUs
Memories
SRAM128
PEs
Platforms
Channels
c1
c2
c3
c_double_handshake
c_handshake
c_semaphore
i_sendi_receive
Synthesis
filtercore
left
aliasred
imdct
filtercore
HardwaresHW_StandardNISC
aliasred.caliasred.c
int process_mode;
void antialias(real xr[SBLIMIT][SSLIMIT], struct gr_info_s *gr_info) { int sblim;
if(gr_info->block_type == 2) { if(!gr_info->mixed_block_flag) return; sblim = 1; } else { sblim = gr_info->maxb-1; } }
void main(void) {
Add variable
Edit source...
SerializeParallelize
Remove
Add childAdd channel
Copyright ©2007 CECSASPDAC’07 Tutorial
ESE: System ModificationsESE – MP3Decoder.ese*ESE – MP3Decoder.ese*
Output
File Edit
Project
p1.sc
p2.sc
Platform1
Platform2
Ready
p2_2.sc
Architecture
Sources
ARMARMv7
uCosII
MemSRAM64
mainBusAMBA_AHB
View Project Simulation Help
Database
CEs Busses
ARMv7
SRAM64
CPUs
Memories
SRAM128
PEs
Platforms
Channels
c1
c2
c3
c_double_handshake
c_handshake
c_semaphore
i_sendi_receivePCM
Bridge
pcmBusDoubleHdshk
PEs
PCM MemARM
right
Main
aliasred
imdct
filtercore
left
aliasred
imdct
filtercore
HardwaresHW_StandardNISC
Synthesis
HW2HW1
32
Copyright ©2007 CECSASPDAC’07 Tutorial
ESE: System ModificationsESE – MP3Decoder.ese*ESE – MP3Decoder.ese*
Output
File Edit
Project
p1.sc
p2.sc
Platform1
Platform2
Ready
p2_2.sc
Architecture
Sources
ARMARMv7
uCosII
MemSRAM64
mainBusAMBA_AHB
View Project Simulation Help
Database
CEs Busses
ARMv7
SRAM64
CPUs
Memories
SRAM128
PEs
Platforms
Channels
c1
c2
c3
c_double_handshake
c_handshake
c_semaphore
i_sendi_receivePCM
Bridge
pcmBusDoubleHdshk
HW1 HW2
PEs
HW1 HW2ARM
right
Main
aliasred
imdct
filtercore
left
aliasred
imdct
filtercore
PCM
HardwaresHW_StandardNISC
Synthesis
DisconnectConnect
Add port...
Remove
mainBuspcmBus masterCPU
slave
Copyright ©2007 CECSASPDAC’07 Tutorial
ESE: System ModificationsESE – MP3Decoder.ese*ESE – MP3Decoder.ese*
Output
File Edit
Project
p1.sc
p2.sc
Platform1
Platform2
Ready
p2_2.sc
Architecture
Sources
ARMARMv7
uCosII
MemSRAM64
mainBusAMBA_AHB
View Project Simulation Help
Database
CEs Busses
ARMv7
SRAM64
CPUs
Memories
SRAM128
PEs
Platforms
Channels
c1
c2
c3
c_double_handshake
c_handshake
c_semaphore
i_sendi_receivePCM
Bridge
pcmBusDoubleHdshk
HW1 HW2
PEs
HW1 HW2ARM
right
Main
aliasred
imdct
filtercore
left
aliasred
imdct
filtercore
PCM
HardwaresHW_StandardNISC
Synthesis
filtercore
filtercore
m1 c_double_hm2 c_double_hm3 c_double_hm4 c_double_hm5 c_double_h
33
Copyright ©2007 CECSASPDAC’07 Tutorial
ESE: TLM Simulation
ESE – MP3Decoder.ese*ESE – MP3Decoder.ese*
Output
PEs
File Edit
Project
p1.sc
p2.sc
Platform1
Platform2
Ready
PCM MemARM
right
Main
aliasred
imdct
p2_2.sc
Architecture
Sources
ARMARMv7
uCosII
MemSRAM64
PCM
Bridge
mainBusAMBA_AHB Bus2
DoubleHdshk
View Project Simulation Help
Database
CEs
ARMv7CPUs
Memories
PEs
Platforms
Channels
c1
c2
c3
c_double_handshake
c_handshake
c_semaphore
i_sendi_receive
SynthesisInstrumentCompile
Simulate...Kill simulation
View log...
filtercore
left
aliasred
imdct
filtercore
Bus traceBus trace
Running simulation …% xterm -e ./mp3decoder funky.mp3 funky.pcm
SRAM64SRAM128Hardwares
HW_StandardNISC
Busses
Simulation logSimulation log
MP3Decoder Input MP3 file: funky.mp3 Output file: funky.pcmBitrate = 96000 bits/s, Sampling frequency = 44100 samples/sFrame 1 ... 10.67 msFrame 2 ... 21.33 msFrame 3 ...
Bus traceBus trace
TimeTime
25%65%
10%5%
TimeTime
15%
7%
3%
75%
Copyright ©2007 CECSASPDAC’07 Tutorial
Summary and Conclusions
• Technology advantages• Platform can be easily captured using GUI• TL models are automatically generated for
development and testing of application code • Legacy or preliminary SW can be easily captured and
mapped to the platform• ESL allows concurrent development of platform SW,
HW and application code• ESL allows easy upgrade of platform and reuse of
legacy application SW and RTL HW code
Early, rapid prototyping for design space explorationSignificant productivity gains
34
Copyright ©2007 CECSASPDAC’07 Tutorial
References• System-level languages and modeling
• T. Grötker, S. Liao, G. Martin, S. Swan. System Design with SystemC. Kluwer, 2002.
• A. Gerstlauer, R. Dömer, J. Peng, D. D. Gajski, System Design: A Practical Guide with SpecC, Kluwer, 2001.
• F. Ghenassia. Transaction-Level Modeling with SystemC: TLM Concepts and Applications for Embedded Systems, Springer, 2006.
• System prototyping and simulation• L. Benini, D. Bertozzi, A. Bogoliolo, F. Menichelli, M. Olivieri. “MPARM:
Exploring the Multi-Processor SoC Design Space with SystemC,”Journal of VLSI Signal Processing, Sept. 2005.
• T. Kempf, M. Dörper, R. Leupers, G. Ascheid, H. Meyr, T. Kogel, B. Vanthournout. “A Modular Simulation Framework for Spatial and Temporal Task Mapping onto Multi-Processor SoC Platforms,” DATE Conference, March 2005.
• System-level design and design environments• F. Balarin, H. Hsieh, L. Lavagno, C. Passerone, A. Pinto, A.
Sangiovanni-Vincentelli, Y. Watanabe, G. Yang. “Metropolis: a Design Environment for Heterogeneous Systems,” Multiprocessor Systems-on-Chips (Editors: W. Wolf, A. Jerraya), Morgan Kaufmann, 2004.
• A. A. Jerraya, F. Petrot, A. Bouchhima. “Programming Models and HW-SW Interfaces Abstraction for Multi-Processor SoC,” DAC, June 2006.
• D. Gajski, A. Gerstlauer, R. Doemer, S. Abdi, J. Peng, D. Shin, R. Ang, "ESE Front End," http://www.cecs.uci.edu/pub_slides, March 2006.
Copyright ©2007 CECSASPDAC’07 Tutorial
Outline
Requirements
Modeling
• Verification and synthesis
• Summary, conclusions and outlook
35
ASPDAC’07 Tutorial
Embedded System Design:Verification and Synthesis
Samar AbdiCenter for Embedded Computer Systems
University of California, Irvinehttp://www.cecs.uci.edu/pub_slides
Copyright ©2007 CECSASPDAC’07 Tutorial
Outline
RequirementsModeling
• Verification and synthesis• Existing verification methods• TLM verification using transformations• Issues in ES synthesis• Synthesis of MP3 player design• Results
• Summary, conclusions and outlook
36
Copyright ©2007 CECSASPDAC’07 Tutorial
Design Verification Methods
• Simulation based methods• Specify input test vector, output test vector pair• Run simulation and compare output against expected output
• Formal Methods• Check equivalence of design models or parts of models• Check specified properties on models
• Semi-formal Methods• Specify inputs and outputs as symbolic expressions• Check simulation output against expected expression
Copyright ©2007 CECSASPDAC’07 Tutorial
Simulation
• Task : Create test vectors and simulate model• Tools: VCS(Synopsys), ModelSim(Mentor), NC-Sim(Cadence)• Inputs
• Specification– Typically natural language, incomplete and informal– Used to create interesting stimuli and monitors
• Model of DUT– Typically written in HDL or C or both
• Output• Failed test vectors
– Pointed out in different design representations by debugging tools
Typical simulation environment
DUT
Stim
ulus
Mon
itors
Specification
37
Copyright ©2007 CECSASPDAC’07 Tutorial
Logic and FSM Equivalence Checking
• LEC uses boolean algebra to check for logic equivalence
1 = 1’ ?2= 2’ ?
inpu
ts
outp
uts
12
inpu
ts
outp
uts1’
2’
Equivalence result
p
q
x
ya
b
r
s
x
ya
bty
b
pr
qr
ps pt
qs qt
xx
yx
xyxy
yy
yyaa
bb
bb
× =
• SEC uses FSMs to check for sequential equivalence
Copyright ©2007 CECSASPDAC’07 Tutorial
Model Checking
• Model M satisfies property P? (Clarke, Emerson ’81)• Inputs
• State transition system representation of M• Temporal property P as formula of state properties
• Output• True (property holds)• False + counter-example (property does not hold)
True /False + counter-exampleModel
Checker
P = P2 always leads to P4s1
s4 s3
s2P1
P3P4
P2
M
38
Copyright ©2007 CECSASPDAC’07 Tutorial
New Verification Challenges for SoC Design
• Design complexity• Size
– Verification either takes unreasonable time (eg. Logic simulation)– Or takes unreasonable memory (eg. Model Checking)
• Heterogeneity– HW / SW components on the same chip– Interface problems
• Modeling abstraction– No formalism exists to prove equivalence of high level models– LEC and SEC are not applicable above RTL
• Possible directions• Methodology
– Unified HW/SW models– Model formalization– Automatic model transformations
Copyright ©2007 CECSASPDAC’07 Tutorial
System Verification through Refinement
• Set of models
• Designer Decisions => transformations
• Transformations preserve equivalence• Same partial order of tasks• Same input/output data for each task• Same partial order of data transactions• Equivalent replacements
• All refined models will be “equivalent” to input model
Still need to verifyFirst modelCorrectness of replacements
RefinementTool
t1t2…tm
Model A
Model B
DesignerDecisions
Library ofobjects
39
Copyright ©2007 CECSASPDAC’07 Tutorial
Formal Model Representation
• Model Algebra• < {objects}, {composition rules} >
• Objects• Behaviors (for computation)
– Identity behaviors (output identical to input)• Channels (for synchronized communication)• Variables (for storage)• Ports (for hierarchy / connections)
• Composition rules• Control dependency• Channel transaction• Variable read/write
• Hierarchical behavior• Grouping of sub-behaviors, channels, variables
and their compositions
• System = Top level behavior
b1
c e2
b2
e1
v1
v2
bq1 q2
q3
Copyright ©2007 CECSASPDAC’07 Tutorial
Refinement Example
• Design decisions• Select PEs• Map leaf behaviors to PEs
• Resulting model refinement• Additional hierarchy for PEs• Distribute leaf behaviors inside PEs• Add Identity behaviors and channels
to preserve control flow– Each PE has independent control
• Refinement verification• Express refinement as sequence of
model transformations
e1
b1 e2
b2
sync
PE1 PE2Ma
b 1
b 2
Ms
40
Copyright ©2007 CECSASPDAC’07 Tutorial
Transformation Step 1
( Original Model )
b 1
b 2
Ms
b 1
b 2
e1
e2
( Intermediate Model 1 )
b1 b2q1 q2eb1 b2
q1 q2
Identity Addition Rule
Copyright ©2007 CECSASPDAC’07 Tutorial
Transformation Step 2
b 1
b 2
e1
e2
( Intermediate Model 1 )
b 1
b 2
e1
e2
( Intermediate Model 2 )
sync
e1 e21 e1 e2sync
Channel Addition Rule
41
Copyright ©2007 CECSASPDAC’07 Tutorial
Transformation Step 3
b 1
b 2
e1
e2
( Intermediate Model 2 )
sync
( Final Model )
Hierarchy Addition Rule
e1
b1 e2
b2
sync
PE1 PE2Ma
b sync b sync
Copyright ©2007 CECSASPDAC’07 Tutorial
Summary of ES Verification
• Variety of verification techniques available• But they apply to traditional design flow based on RTL
• Challenges for verification of large system designs• Simulation based techniques take way too long• Most formal techniques cannot scale in size or abstraction level
• Future design and verification• Well defined semantics for models at different abstraction levels• Well defined transformations for design decisions
– Verify transformations– Automate refinements
• Modeling semantics are the key to verification !
42
Copyright ©2007 CECSASPDAC’07 Tutorial
Outline
RequirementsModeling
• Verification and synthesisExisting verification methodsTLM verification using transformations
• Issues in ES synthesis• Synthesis of MP3 player design• Results
• Summary, conclusions and outlook
Copyright ©2007 CECSASPDAC’07 Tutorial
ESL Design Flow
Model GenerationFront-End
Model GenerationFront-End
Synthesis(Hardware/Software)
Back-End
Synthesis(Hardware/Software)
Back-End
TLM
Inst
ruct
ion-
Set S
imul
ator
sSystem
C
Object Code VHDL/Verilog RTL
Platform Architecture Platform Services/Application
System ModelsTLMTLMTLMn
43
Copyright ©2007 CECSASPDAC’07 Tutorial
Input: Transaction Level Model (TLM)
CPU Bus
B1 B2
OS
B4
CPU Mem
IP
B3
HW
HAL
IP Bus
B6
B5
Drivers
Application (C code)
Bridge
Copyright ©2007 CECSASPDAC’07 Tutorial
TLM Features
• Universal Bus Channel (UBC)• Bus is modeled as universal channel with send/recv, read/write functions• Well defined functions for routing, synchronization, arbitration and transfer
• SW modeling• Application SW is modeled as processes in C• A RTOS model or real RTOS is used for dynamic scheduling of processes• Communication with peripherals, memory or other IP is done using UBC
• HW modeling• Application HW is modeled as processes written in C• Communication with processor, memory or other IP is done using UBC
• Memory modeling• Memory is modeled as array in C• Controller is modeled by function in UBC
44
Copyright ©2007 CECSASPDAC’07 Tutorial
Cycle-Accurate Software Synthesis
CPU Bus
CPU
HW IP
IP Bus
Compile
RTOS Synthesis
HALRTOS
EXEB1 B2
OSHAL
B5
Bridge
Program
Copyright ©2007 CECSASPDAC’07 Tutorial
SW Synthesis Issues
• Compiler selection• The designer specifies which compiler is used for the SW
• Library selection• Libraries are selected for SW support such as file systems, string
manipulation etc. • Prototype debugging requires selection of additional libraries
• RTOS selection and targeting• Designer selects an RTOS for the processor • RTOS model is replaced by real RTOS and SW is re-targeted
• Program and data memory• Address range for SW program memory is assigned• Address range for data memory used by program is assigned• For large programs or data, off-chip memory may be allocated
45
Copyright ©2007 CECSASPDAC’07 Tutorial
Cycle-Accurate Hardware Synthesis
CPU Bus
CPU Mem
Behavior in CHW (RTL)
Cycle-accurateSynthesis
IP Bus
IP (RTL)
B3B6
Bridge
Copyright ©2007 CECSASPDAC’07 Tutorial
HW Synthesis Issues
• IP insertion• C model of HW is replaced with pre-designed RTL IP, if available
• RTL synthesis tool selection• RTL synthesis tool must be selected for custom HW design
• C code generation• C code for input to RTL synthesis tool is generated
• Synthesis directives• RTL architecture and clock cycle time is selected• UBC calls are treated as special interface operations, to be later
expanded during interface synthesis
• HDL generation• RTL synthesis results in cycle accurate synthesizable Verilog code
46
Copyright ©2007 CECSASPDAC’07 Tutorial
HW Synthesis Today
• HW synthesis tools• SystemC/C to RTL [Mentor, FORTE, Synfora]
– Based on high level synthesis technology– C/C++ support with user constraints (Catapult)– SystemC TLM support (Cynthesizer)– C based language support (HandelC, Celoxica)
• SystemVerilog to RTL [BlueSpec]– Synthesis from assertions in SystemVerilog (BS Compiler)– Correct-by-construction
• MATLAB to FPGA RTL [Xilinx]– Matlab models to DSP HW (AccelChip)
Component based approachLack of HW-SW co-synthesis supportCapacity and quality issues
Copyright ©2007 CECSASPDAC’07 Tutorial
IP synthesis with NISC technology
offset
status
const
address
CWPC CMem
AG
status
B1B2
ALU Memory
RF
MUL
B3
const
status
RF
OR
ALUAR
MemDR
offset
CMem
CWPC
AG P
bL
Sum
Add
aL
Mul
for(int i=0; i<8; i++)for(int j=0; j<8; j++){
sum=0;for(int k=0; k<8; k++)
sum = sum + A[i][k] ×B[k][j];C[i][j] = sum;
}
for(int i=0; i<8; i++)for(int j=0; j<8; j++){
sum=0;for(int k=0; k<8; k++)
sum = sum + A[i][k] ×B[k][j];C[i][j] = sum;
}
for(int i=0; i<8; i++)for(int j=0; j<8; j++){
i8 = i × 8;sum = *(A + i8) × *(B + j);sum += *(A + i8 + 1) × *(B + 8 + j);...C[i][j] = sum;
}
for(int i=0; i<8; i++)for(int j=0; j<8; j++){
i8 = i × 8;sum = *(A + i8) × *(B + j);sum += *(A + i8 + 1) × *(B + 8 + j);...C[i][j] = sum;
}
NISC Compiler
Results
Code Refinement
NISC Refinement
NISC
Application
NISC
ApplicationNISC
CompilerResults
Iterative design & refinement
Source: M. Reshadi
47
Copyright ©2007 CECSASPDAC’07 Tutorial
Synthesis + feedbackOne
-pas
s sy
nthe
sis
Mem
ory based controllerFSM
bas
ed c
ontr
olle
r
C RTL
Designer control over final architectureNo designer control over final architecture
Global optimization for the whole applicationLocal optimization for blocks/kernels
No timing problems in templatized controller(memory-based, pipelined controller)
Timing problems in controller(long delays in FSM controller)
Available assembly programmingNo assembly programming
Architecture features possibleArchitecture features not considered(cache, interrupt, I/O, branch prediction,…)
Good refinement closure(through preliminary layout)
No refinement closure(no layout info., no improvement)
Compiling algorithm based on measured metrics (compiler knows the final architecture netlist)
Estimation based algorithms(no final architecture information available)
Full DFM(by pre lay out or use of templates)
No design for manufacturability (DFM)(layout generated after synthesis)
Reprogramming / Reuse possible by recompiling(with matching reduction possible)
No reprogramming / No Reuse (synthesized design cannot be reused for other code)
Compile complete application. Any size C possible Synthesis for small block/kernel in application
Full COnly subset of C
C-to-RTL synthesis with NISC technologyPresent C-to-RTL synthesis
Copyright ©2007 CECSASPDAC’07 Tutorial
Cycle-Accurate Interface SynthesisCPU Mem
Bridge
CustomHW
Interface Synthesis
IP
Arbiter
IC
48
Copyright ©2007 CECSASPDAC’07 Tutorial
Interface Synthesis Issues
• Synchronization• UBC has unique flag for each pair of communicating processes• Flag access is implemented as polling, CPU interrupt or interrupt controller
• Arbitration• Selected from library or synthesized to RTL based on policy
• Bridge• Selected from library or synthesized using universal bridge generator
• Addressing• All communicating processes are assigned unique bus addresses
• SW communication synthesis• UBC functions are replaced by RTOS functions and assembly instructions
• HW communication synthesis• DMA controller in RTL is created for each custom HW component• Send/Recv operations are replaced by DMA transfer states
Copyright ©2007 CECSASPDAC’07 Tutorial
Final Pin-Accurate Model
CPU Mem
Bridge
HW IP
Arbiter
HALRTOS
EXE
PAM is downloaded automatically for fast prototyping with FPGAs
ICProgram
49
Copyright ©2007 CECSASPDAC’07 Tutorial
Outline
RequirementsModeling
• Verification and synthesisExisting verification methodsTLM verification using transformationsIssues in ES synthesis
• Synthesis of MP3 player design• Results
• Summary, conclusions and outlook
Copyright ©2007 CECSASPDAC’07 Tutorial
MP3 Player Synthesis
• TLM Input for MP3 application and platform
• Synchronization and Arbitration synthesis
• SW synthesis including RTOS selection and address generation
• HW and Bridge synthesis
• Export to FPGA design tools
• FPGA download and test
50
Copyright ©2007 CECSASPDAC’07 Tutorial
Example: MP3 Decoder
• Functional block diagram (major blocks only)
• Timing constraints• 38 frames per second• Frame delay < 26.12ms
HuffDec
FilterCoreIMDCT
PCM
FilterCoreIMDCT
mp3 pcm
Left channel
Right channel
AliasRed
AliasRed
2 granules
Copyright ©2007 CECSASPDAC’07 Tutorial
MP3 Player TLM
• MP3 encoder mapped to SW (MicroBlaze), filters and IMDCT to HW• Mem1 (on OPB bus) for data, Mem2 (on LMB bus) for program• Custom HWs on DoubleHdshk (DH) bus, with bridge to OPB
OPB Bus
MP3OS
Mic
robl
aze
CPU
RightFilter
HW2Mem1
HAL
DH Bus
Bridge
LeftFilter
HW1
LMB Bus
Mem2
RightIMDCT
HW4
LeftIMDCT
HW3
51
Copyright ©2007 CECSASPDAC’07 Tutorial
Synchronization Selection
• Interrupt signals and connections are selected
Copyright ©2007 CECSASPDAC’07 Tutorial
Model with Synchronization
• Interrupt signals and connections are created
ESE – MP3Decoder.ese*ESE – MP3Decoder.ese*
Output
PEs
File Edit
Project
p1.sc
p2.sc
Platform1
Platform2
Ready
Running simulation …% xterm -e ./mp3decoder funky.mp3 funky.pcm
HW1 HW2 MemCPU
Process1
Process2
Main
Process3
var1
c1
Process4
p2_2.sc
Sources
CPUMicroBlaze
Mem1SRAM64
HW2
(Filter2)
HW1
(Filter1)Bridge
Bus1OPB
Bus2DoubleHdshk
View Project Simulation Help
Database
CEs Busses
XilKernel
gcc
OSs
Compilers
mb-gcc
Platforms
Channels
c1
c2
c3
c_double_handshake
c_handshake
c_semaphore
var1var2
i_sendin bool
c4
int
bool
c_queue
i_sendi_receive
SWPEs
Bus3LMB
Mem2SRAM64
Synthesis
HW4
(IMDCT2)
HW3
(IMDCT1)
52
Copyright ©2007 CECSASPDAC’07 Tutorial
Arbiter Selection
• Arbiter is selected and request / grant pins are connected
ESE – MP3Decoder.ese*ESE – MP3Decoder.ese*
Output
PEs
File Edit
Project
p1.sc
p2.sc
Platform1
Platform2
Ready
Running simulation …% xterm -e ./mp3decoder funky.mp3 funky.pcm
HW1 HW2 MemCPU
Process1
Process2
Main
Process3
var1
c1
Process4
p2_2.sc
Sources
CPUMicroBlaze
Mem1SRAM64
HW2
(Filter2)
HW1
(Filter1)Bridge
Bus1OPB
Bus2DoubleHdshk
View Project Simulation Help
Database
CEs Busses
XilKernel
gcc
OSs
Compilers
mb-gcc
Platforms
Channels
c1
c2
c3
c_double_handshake
c_handshake
c_semaphore
var1var2
i_sendin bool
c4
int
bool
c_queue
i_sendi_receive
SWPEs
Bus3LMB
Mem2SRAM64
Synthesis
ArbitrationArbitration
Bus1 Bus3Bus2
OK
Click to confirm arbiter connections
MasterName
RequestID / Port
GrantID/Port
Req1 Gnt1
Req2 Gnt2
Arbiter FCFS
HW1
HW2
HW4
(IMDCT2)
HW3
(IMDCT1)
Req3 Gnt3
Req4 Gnt4
HW3
HW4
Copyright ©2007 CECSASPDAC’07 Tutorial
Model with Arbitration
• The selected arbiter is instantiated and signals are added to create arbiter connections
53
Copyright ©2007 CECSASPDAC’07 Tutorial
SW synthesis
• Compiler, RTOS and libraries are selected for SW• Default addresses for all addressable memory/bus is generated by ESE
Copyright ©2007 CECSASPDAC’07 Tutorial
Model after SW synthesis
• SW application and drivers are targeted for RTOS and ready for compilation on MicroBlaze
ESE – MP3Decoder.ese*ESE – MP3Decoder.ese*
Output
PEs
File Edit
Project
p1.sc
p2.sc
Platform1
Platform2
Ready
Instantiating FCFS arbiter and connecting request / grantCreating interrupt connection from Bridge to CPU
HW1 HW2 MemCPU
Process1
Process2
Main
Process3
var1
c1
Process4
p2_2.sc
Sources
CPUMicroBlaze
Mem1SRAM64
Bridge
Bus1OPB
Bus2DoubleHdshk
View Project Simulation Help
Database
CEs Busses
XilKernel
gcc
OSs
Compilers
mb-gcc
Platforms
Channels
c1
c2
c3
c_double_handshake
c_handshake
c_semaphore
var1var2
i_sendin bool
c4
int
bool
c_queue
i_sendi_receive
SWPEs
Bus3LMB
Mem2SRAM64
Synthesis
HW2
(Filter2)
HW1
(Filter1)
ArbiterFCFS
HW4
(IMDCT2)
HW3
(IMDCT1)
54
Copyright ©2007 CECSASPDAC’07 Tutorial
HW synthesis
• RTL code for HW components is generated using NISC compiler
ESE – MP3Decoder.ese*ESE – MP3Decoder.ese*
Output
PEs
File Edit
Project
p1.sc
p2.sc
Platform1
Platform2
Ready
Instantiating FCFS arbiter and connecting request / grantCreating interrupt connection from Bridge to CPU
HW1 HW2 MemCPU
Process1
Process2
Main
Process3
var1
c1
Process4
p2_2.sc
Sources
CPUMicroBlaze
Mem1SRAM64
Bridge
Bus1OPB
Bus2DoubleHdshk
View Project Simulation Help
Database
CEs Busses
XilKernel
gcc
OSs
Compilers
mb-gcc
Platforms
Channels
c1
c2
c3
c_double_handshake
c_handshake
c_semaphore
var1var2
i_sendin bool
c4
int
bool
c_queue
i_sendi_receive
SWPEs
Bus3LMB
Mem2SRAM64
Synthesis
HW SynthesisHW SynthesisHW2 HW3HW1
OK
Click to create HW RTL
Replace with IP1
Synthesize with NISC CompilerNISC CompilerESE-RTLSynopsys BCForte
HW2
(Filter2)
HW1
(Filter1)
ArbiterFCFS
HW4
(IMDCT2)
HW3
(IMDCT1)
HW4
Copyright ©2007 CECSASPDAC’07 Tutorial
NISC compilation
• NISC compiler generates synthesizable RTL verilog from application code for selected architecture
55
Copyright ©2007 CECSASPDAC’07 Tutorial
Model after HW synthesis
• Model is updated with RTL code for HW units
Copyright ©2007 CECSASPDAC’07 Tutorial
Bridge synthesis
• RTL code for Bridge is generated using BridgeGenerator
56
Copyright ©2007 CECSASPDAC’07 Tutorial
Model after Bridge synthesis
• Design is ready for prototyping after SW, HW and Interface synthesis
ESE – MP3Decoder.ese*ESE – MP3Decoder.ese*
Output
PEs
File Edit
Project
p1.sc
p2.sc
Platform1
Platform2
Ready
Instantiating FCFS arbiter and connecting request / grantCreating interrupt connection from Bridge to CPU
HW1 HW2 MemCPU
Process1
Process2
Main
Process3
var1
c1
Process4
p2_2.sc
Sources
CPUMicroBlaze
Mem1SRAM64
Bridge
Bus1OPB
Bus2DoubleHdshk
View Project Simulation Help
Database
CEs Busses
XilKernel
gcc
OSs
Compilers
mb-gcc
Platforms
Channels
c1
c2c3
c_double_handshake
c_handshakec_semaphore
var1var2
i_sendin bool
c4
int
bool
c_queue
i_sendi_receive
SWPEs
Bus3LMB
Mem2SRAM64
Synthesis
HW2
(Filter2)
HW1
(Filter1)
ArbiterFCFS
HW4
(IMDCT2)
HW3
(IMDCT1)
Copyright ©2007 CECSASPDAC’07 Tutorial
Export to FPGA Design Tools
• Platform and SW specification files are created for FPGA design tools• C code for Microblaze and Verilog for HWs and Bridge is exported
ESE – MP3Decoder.ese*ESE – MP3Decoder.ese*
Output
PEs
File Edit
Project
p1.sc
p2.sc
Platform1
Platform2
Ready
Instantiating FCFS arbiter and connecting request / grantCreating interrupt connection from Bridge to CPU
HW1 HW2 MemCPU
Process1
Process2
Main
Process3
var1
c1
Process4
p2_2.sc
Sources
CPUMicroBlaze
Mem1SRAM64
Bridge
Bus1OPB
Bus2DoubleHdshk
View Project Simulation Help
Database
CEs Busses
XilKernel
gcc
OSs
Compilers
mb-gcc
Platforms
Channels
c1
c2
c3
c_double_handshake
c_handshake
c_semaphore
var1var2
i_sendin bool
c4
int
bool
c_queue
i_sendi_receive
SWPEs
Bus3LMB
Mem2SRAM64
Synthesis
HW3
(PCM)
HW2
(Filter2)
HW1
(Filter1)
ArbiterFCFS
HW4
(IMDCT2)
HW3
(IMDCT1)
Platform ExportPlatform Export
Export Target Xilinx Platform StudioXilinx Platform StudioAltera Quartus IIARM RealViewCosimulation
CPU ISS MB-Sim
OK
Click to export platform design
57
Copyright ©2007 CECSASPDAC’07 Tutorial
Outline
RequirementsModeling
• Verification and synthesisExisting verification methodsTLM verification using transformationsIssues in ES synthesisSynthesis of MP3 player design
• Results• Summary, conclusions and outlook
Copyright ©2007 CECSASPDAC’07 Tutorial
DecisionUser
Interface (DUI)
ValidationUser
Interface (VUI)
TIMED
Create
Map
Compile
Replace
Select
Partition
Compiler
Debugger
Stimulate
Verify
ESE Synthesis
Application Tools : Compilers/Debuggers Commercial Tools : FPGA, ASIC
CYCLE ACCURATE
Verify
Simulate
Check
Compile
System Capture + Platform Development
SW Development + HW Development
58
Copyright ©2007 CECSASPDAC’07 Tutorial
Design Quality
0102030405060708090
100
SW+0 SW+1 SW+2 SW+4 NISC+0
Design Points
% c
hip
utili
zatio
n
0
0.05
0.1
0.15
0.2
0.25
0.3
0.35
seco
nds %Slices
%BRAMsExec. time
• Area• % of FPGA slices and BRAMS
• Performance• Time to decode 1 frame of MP3 data
Copyright ©2007 CECSASPDAC’07 Tutorial
Development Time with ESE
0
10
20
30
40
50
60
70
Spec. TLM RTL Board
models
pers
on-d
ays SW+0
SW+1SW+2SW+4
Manual Development Time
• Model Development time• Includes time for C, TLM and RTL Verilog coding and debugging
• ESE drastically cuts RTL and Board development time
ESE
0
10
20
30
40
50
60
70
Spec. TLM RTL Board
models
pers
on-d
ays SW+0
SW+1SW+2SW+4
59
Copyright ©2007 CECSASPDAC’07 Tutorial
Validation Time with ESE
0123456789
10
Spec. TLM RTL Boardmodels
seco
nds SW+0
SW+1SW+2SW+4
Validation Time
• Simulation time measured on 3.3 GHz processor• Emulation time measured on board with Timer• ESE cuts validation time from hours to seconds
ESE
0123456789
10
Spec. TLM RTL Boardmodels
seco
nds SW+0
SW+1SW+2SW+4
X
hour
s
18.06 hrs17.71 hrs17.56 hrs15.93 hrs
Copyright ©2007 CECSASPDAC’07 Tutorial
Summary and Conclusions
• Technology advantages• No need to code and debug large RTL HDL models • Bridge and interface synthesis allows flexibility to
include heterogeneous IP in the design• No need for SW developers to understand HW details• Easy application upgrade at TL• C and graphical input of TL model allows even non-
experts to develop and test HW/SW systems
Simplified system developmentHuge productivity gainsShort time to market
60
Copyright ©2007 CECSASPDAC’07 Tutorial
References• Embedded System Verification
• Devadas, Ma, Newton, “On the verification of sequential machines at different levels of abstraction”, 24th DAC, pp.271-276, June 1987
• Clarke, Grumberg, Peled, “Model Checking” , MIT Press• K.L. McMillan, “Symbolic Model Checking: An approach to the State
Explosion Problem” , Kluwer Academic 1993• McFarland, “Formal Verification of Sequential Hardware: A tutorial”,
IEEE Transaction on CAD, pp. 633-653, May 1993.• HW Synthesis
• P. Bannerjee et al. “A Behavioral Synthesis Tool for Exploiting Fine Grain Parallelism in FPGAs ,” LNCS, Sept. 2005.
• J. Cong et al. “"xPilot: A Platform-Based Behavioral Synthesis System", SRC TechCon'05 .
• System synthesis• F. Balarin, H. Hsieh, L. Lavagno, C. Passerone, A. Pinto, A.
Sangiovanni-Vincentelli, Y. Watanabe, G. Yang. “Metropolis: a Design Environment for Heterogeneous Systems,” Multiprocessor Systems-on-Chips (Editors: W. Wolf, A. Jerraya), Morgan Kaufmann, 2004.
• P. Chou, R. Ortega, G. Borriello. “The Chinook hardware/software co-synthesis system ,” ISSS 1996.
• D. Gajski, "ESE Back End," http://www.cecs.uci.edu/pub_slides, March 2006.
Copyright ©2007 CECSASPDAC’07 Tutorial
Outline
Requirements
Modeling
Verification and synthesis
• Summary, conclusions and outlook
61
ASPDAC’07 Tutorial
Embedded System Design:Summary, Conclusions and Outlook
Daniel D. GajskiCenter for Embedded Computer Systems
University of California, Irvine
Copyright ©2007 CECSASPDAC’07 Tutorial
Outline
RequirementsModelingVerification and synthesis
• Summary, conclusions and outlook• Models
• Platforms
• Tools
• Benefits
• Conclusion
62
Copyright ©2007 CECSASPDAC’07 Tutorial
How many models?
Minimal set for any methodology
(3 are enough)
• System specification model (application designers)
• Transaction-level model (system designers)
• Pin&Cycle accurate model (implementation designers)
Copyright ©2007 CECSASPDAC’07 Tutorial
Three Models (with Respect to OSI)
Pin / Cycle Accurate Model
Transaction Level Model
Specification Model
7. Application6. Presentation5. Session4. Transport3. Network2b. Link + Stream2a. Media Access Ctrl2a. Protocol1. Physical
7. Application6. Presentation5. Session4. Transport3. Network2b. Link + Stream2a. Media Access Ctrl2a. Protocol1. Physical
Address lines
Data lines
Control lines
TLM
Spec
P/CAM
Source: G Schirner
63
Copyright ©2007 CECSASPDAC’07 Tutorial
System Specification
Brid
ge
v1
C1
B1 B2
CPU Mem
HW
B3
IP
B4
C2
System Definition = (Partial) Platform + (Partial) Spec
Arb
iter
Communication• Channels (in C)• Variables (in C)
Computation • Behaviors (in C)
Copyright ©2007 CECSASPDAC’07 Tutorial
Transaction-Level Model (TLM)
CPU Bus
B1 B2
OS
B4
CPU Mem
IP
B3
HW
HALDrivers
IP Bus
64
Copyright ©2007 CECSASPDAC’07 Tutorial
Pin/Cycle Accurate Model (P/CAM)
CPU Mem
Bridge
HW IP
Arbiter
HALRTOS
EXE
P/CAM is downloaded automatically for fast prototyping
with FPGAs or ASIC design
ICProgram
Source: D. Gajski et al.
Copyright ©2007 CECSASPDAC’07 Tutorial
How many components?
Minimal set for any design
(4 are enough)
• Processing element (PE)
• Memory
• Transducer / Bridge
• Arbiter
65
Copyright ©2007 CECSASPDAC’07 Tutorial
General System Model
Bus2Bus1 Bus3
Arbiter 1
PE 1.1
Transducer2-3
Arbiter 2
Arbiter 3
PE 1.2
Memory 1
PE 2.1(Master)
PE 3.1
Memory 3
Interrupt1.1
Interrupt2.1
PE 2.2(Slave)
Transducer1-2 Interrupt2.2
Interrupt3.1
Interrupt3.2
Copyright ©2007 CECSASPDAC’07 Tutorial
Transducer Model
Processor1<clk1>
Processor2<clk2>
FSMD1<clk1>
FSMD2<clk2>
Transducer
Ready1Ack_ready1
Ready2Ack_ready2
Memory1 Memory2
PE1 PE2
Addr bus1 Addr bus2Data bus1 Data bus2
Interrupt2Interrupt1
Data1 Data2
Queue<clk3>
Source: H. Cho
66
Copyright ©2007 CECSASPDAC’07 Tutorial
NISC technology architectures
RF / Scratch pad
ALU MUL
Status
PC
AG
CW
offset
status
address
const
CMem
Memory
B2
B1
B3
Data forwarding
Datapath pipelining
Controller pipelining
Programmable controller
Multi-cycle units
Datapath Pipelined units
• Direct compilation of C to HW (fastest possible execution)• Statically and dynamically reconfigurable (anytime, anywhere)• Designed for manufacturability (solving timing closure)
Copyright ©2007 CECSASPDAC’07 Tutorial
5.3X 2.1X 11.6X 2.5X
0.83X 1.3X 0 2.1X
1.25X NA NA NA
DCT with NISC technology
0
0.2
0.4
0.6
0.8
1
1.2
1.4
MIPS NMIPS CDCT1 CDCT2 CDCT3 CDCT4 CDCT5 CDCT6 CDCT7 Manual
Execution Time Power Energy Area
10X 1.3X 12.8X 3X
Performance Power saving Energy saving Area reduction
NMIPS vs. MIPS
CDCT3 vs. NMIPS
CDCT7 vs. NMIPS
CDCT7 vs. Manual
67
Copyright ©2007 CECSASPDAC’07 Tutorial
How many tools?
Minimal set for any methodology
(2 are enough)
• Front-End (for application developers)– Input: C, C++, Mathlab, UML, …
– Output: TLM
• Back-End (for SW/HW system designers)– Input : TLM
– Output: Pin/Cycle accurate Verilog/VHDL
Copyright ©2007 CECSASPDAC’07 Tutorial
DecisionUser
Interface (DUI)
ValidationUser
Interface (VUI)
TIMED
Create
Map
Compile
Replace
Select
Partition
Compiler
Debugger
Stimulate
Verify
ES Environment
Application Tools : Compilers/Debuggers Commercial Tools : FPGA, ASIC
CYCLE ACCURATE
Verify
Simulate
Check
Compile
ESE Front – End
System Capture + Platform Development
SW Development + HW Development
ESE Back – End
68
Copyright ©2007 CECSASPDAC’07 Tutorial
0
10
20
30
40
50
60
70
Spec. TLM RTL Board
models
pers
on-d
ays SW+0
SW+1SW+2SW+4
Development Time with ESE
• ESE drastically cuts RTL and Board development time• Models can be developed at Spec and TL• Synthesizable RTL models are generated automatically by ESE
ESE
Source: S. Abdi
Copyright ©2007 CECSASPDAC’07 Tutorial
0123456789
10
Spec. TLM RTL Boardmodels
seco
nds SW+0
SW+1SW+2SW+4
Validation Time with ESE
• ESE cuts validation time from hours to seconds• No need to verify RTL models• Designers can perform high speed validation at TLM and board
ESE
Source: S. Abdi
69
Copyright ©2007 CECSASPDAC’07 Tutorial
Benefit: Spec-to-Prototype in 1 Week
Copyright ©2007 CECSASPDAC’07 Tutorial
Summary• Underlying concepts
– Well defined models, rules, transformations, refinements– Analogous to layout, logic, RTL– System level complexity simplified
• Proof of concept demonstrated– Embedded System Environment (ESE)– Automatic model generation– Model synthesis and verification– Universal IP technology (NISC)– Productivity gains greater then 1000
• Benefits– Large productivity gains– Easy design management – Easy derivatives– Shorter TTM
70
Copyright ©2007 CECSASPDAC’07 Tutorial
Conclusions
• Extreme makeover is necessary for a new paradigm, where
– SW = HW = SOC = Embedded Systems– Simulation based flow is not acceptable– Design methodology is based on scientific principles
• Model algebra is enabling technology for– System design, modeling and simulation– System synthesis, verification, and test
• What is next?– Change of mind– Application oriented EDA– Looking for early adapters
Copyright ©2007 CECSASPDAC’07 Tutorial
Outline
Requirements
Modeling
Verification and synthesis
Summary, conclusions and outlook
71
ASPDAC’07 Tutorial
Thank You
Daniel D. GajskiAndreas Gerstlauer
Samar AbdiCenter for Embedded Computer Systems
University of California, Irvine