accelerating system-level design and multi-processor systems … · 2004-12-05 · system...

69
© 2002 © 2004 Altera Corporation Accelerating System-Level Design and Multi-Processor Systems in FPGA Accelerating System-Level Design and Multi-Processor Systems in FPGA

Upload: others

Post on 02-Apr-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

© 2002© 2004 Altera Corporation

Accelerating System-Level Design and Multi-Processor Systems in FPGA

Accelerating System-Level Design and Multi-Processor Systems in FPGA

AgendaAgendaAccelerating System-Level Design

System Integration IssuesSOPC BuilderInterfacing to External Processors Summary

Multi-Processor Systems in FPGAWhy Use Multiple Processors?Multi-Processor ArchitecturesDesign ConsiderationsSoftware Development

2© 2004 Altera Corporation

System Integration IssuesSystem Integration IssuesMatching I/O Specifications− Voltage Standards− Timing Requirements

Connecting the Block Interfaces− Control Signal Types − Control Signal Assertion Levels− Data Widths− Transaction Timing

Domain Crossing− Interface Domain− Clock Domain

Time− Develop− Change Orders− Verification

3© 2004 Altera Corporation

© 2002© 2004 Altera Corporation

SOPC BuilderSOPC Builder

Data

Embedded System IntegrationEmbedded System Integration

Address

Address Decoder

Processor (Bus Master)

32-BitInterrupt

Controller

Address

Data

Ethernet (Bus Master)

32-Bit

Slave 316-Bit

Slave 18-Bit

Slave 564-Bit

Slave 432-Bit

Slave 232-Bit

Arbiter

Width-Match Width-MatchWidth-MatchWidth-Match Width-Match

5© 2004 Altera Corporation

SOPC Builder: System DesignSOPC Builder: System Design

6© 2004 Altera Corporation

SOPC Builder: System IntegrationSOPC Builder: System IntegrationSingle Master

Multiple Slaves Multi-Master− Slave-Side Arbitration− Optimize for Throughput− Patch Panel Selection

Automatic Generation− Datapath Logic− Chip Selects− Arbitration Logic− Timing

7© 2004 Altera Corporation

SOPC Builder: IntegrationSOPC Builder: Integration

Width-Match

Interrupt Controller

Address Decoder

Arbiter

Width-Match

Arbiter

Width-Match

Arbiter

Width-Match

Arbiter

Width-Match

Arbiter

SOPC Builder-GeneratedAvalon™

Switch Fabric

Wait-State Generation

Data Multiplexing

Processor(Bus Master)

32-Bit

Ethernet (Bus Master)

32-Bit

Slave 18-Bit

Slave 232-Bit

Slave 316-Bit

Slave 432-Bit

Slave 564-Bit

8© 2004 Altera Corporation

9© 2004 Altera Corporation

Avalon InterfaceAvalon Interface Avalon Signal Typesreset

chipselectaddress

byteenableread

readdatawrite

writedatadata

waitrequestreadyfordata

dataavailabledatavalid

flushbegintransfer

endofpacketirq

irqnumberclk

resetrequest

All Signals AvailableIn Positive or Negative

Form

Avalon Signal Typesreset

chipselectaddress

byteenableread

readdatawrite

writedatadata

waitrequestreadyfordata

dataavailabledatavalid

flushbegintransfer

endofpacketirq

irqnumberclk

resetrequest

All Signals AvailableIn Positive or Negative

Form

An Open Standard− Simple, Synchronous Interface− Switch Fabric Interconnect

Controls Signal Timing Matching Logic Levels

Many Signal Types Supported− Peripheral Uses Only the Signals it Needs

Any Combination of Signals Possible− Support for Arbitrary Setup Time, Hold

Time & Wait StatesSwitch Fabric Generated in SOPC Builder− Eliminates Tedious HDL Writing− Increases Intellectual Property (IP) Re-

Use− Decreases Development Time

Avalon Switch FabricAvalon Switch FabricPeripherals Can Be Dedicated or Shared− Shared Connections Arbitrated on the Slave-Side

Simultaneous Data Transfers Boost Throughput

System CPU(Master 1)

DMA(Master 2)

I/O1

I/O2

DataMemory(32-bit)

ProgramMemory

Masters

Slaves

ArbiterArbiter

Avalon Switch Fabric

VGAController

(8-bit)

VGAController

(8-bit)Data

Memory(32-bit)

ArbiterArbiter

10© 2004 Altera Corporation

SOPC Builder: System GenerationSOPC Builder: System GenerationSoftware

Development-Kit Generation

C Header files

Peripheral Drivers

Software IDE

Software Debuggers

RTOS

Simulation Test Bench-Generation

System GenerationVHDL/Verilog HDL

11© 2004 Altera Corporation

11

12© 2004 Altera Corporation

SOPC Builder: ApplicationsSOPC Builder: Applications

Custom Microcontroller

Ethernet I/FEthernet I/F

RAMRAM

UARTUART

SPISPI CustomCustom

Multi-Processor

CustomCustom

ExternalCPUor

DSP

ExternalCPUor

DSP Bus

I/F

Bus

I/F

UARTUART

PCIPCI

SPISPI

SPISPI

Companion Chip

CustomCustom

SPI+SPI+

SPISPIB

us I/

FB

us I/

F

UARTUART

PCIPCI

ExternalCPUor

DSP

ExternalCPUor

DSP

Interfacing External ProcessorsInterfacing External ProcessorsIncumbent Processor or Digital Signal ProcessorAdd System Elements − Peripherals − Custom Logic− Accelerators

SOPC Builder Glues it TogetherMissing Element is the Interface

External Processor InterfaceExternal Processor Interface

Avalon Switch FabricAvalon Switch Fabric

PeripheralPeripheral

Slave PortSlave Port

Custom IPCustom IP

Slave PortSlave Port

AcceleratorAccelerator

Slave PortSlave Port

ExternalProcessorExternal

Processor

13© 2004 Altera Corporation

Classes of Processor InterfacesClasses of Processor Interfaces

14© 2004 Altera Corporation

Complex & Simple Interfaces− Complex

PCI(X)(Express), Rapid I/O™ & HyperTransport™TechnologiesSolved with Specialized Logic Blocks & Software Protocols

− SimpleCommon on Most Embedded ProcessorsAddress, Data, Chipselect, R/W control, Etc.Simple Software Control of Memory-Mapped Peripherals

Bus Frequency, I/O Timing, & I/O Standard − FPGAs Are Rich in I/O Standards− FPGA Tools Provide I/O Timing− High-Speed Busses Require High-Speed Design

TechniquesSee High-Speed Design Seminar Notes

Control Signal Matching Data Transfer LatencyClock Domain Crossing

Processor Interface IssuesProcessor Interface Issues

15© 2004 Altera Corporation

Simple Processor InterfaceSimple Processor InterfaceMaps FPGA to a Memory Region, SOPC Builder Decodes Each Element Operates Synchronous to the Processor’s Bus ClockTranslates Processor Signal Types to Avalon Equivalents

16© 2004 Altera Corporation

DMA_0DMA_0

MEM_0MEM_0

MEM_NMEM_N

MMMM

MM

SS

SS

CUST_0CUST_0

CUST_NCUST_N

SS

SS

MM

MM

AvalonSwitchFabric

AvalonSwitchFabric

CPU-to-Avalon I/FCPU-to-Avalon I/F

AddressAddressExternalProcessorExternal

Processor Avalon MasterAvalon Master

MM

MM

MM

SS

SS

SS

SS

Exte

rnal

Mem

ory

Inte

rfac

eEx

tern

al M

emor

y In

terf

ace

MM

MMMMDataData

ControlControl

FPGAFPGA

Intel PXA255 Example Intel PXA255 Example

10/100/1G EMACOr 802.11x MAC

10/100/1G EMACOr 802.11x MAC

Backplane Interface

Backplane Interface

USB 2.0 /FireWire

USB 2.0 /FireWire

Integrated Drive Electronics (IDE)Integrated Drive Electronics (IDE)

Visual Display Unit

(LCD/VGA/ SVGA)

Visual Display Unit

(LCD/VGA/ SVGA)

GPIOs(LEDs,

Pushbuttons)

GPIOs(LEDs,

Pushbuttons)UARTUART

Motor Drive(PWM / Stepper)

Motor Drive(PWM / Stepper)

CustomPeripheralCustom

Peripheral

Avalon Sw

itch FabricA

valon Switch Fabric

Altera® FPGAIntel PXA255

Processor InterfaceProcessor Interface

17© 2004 Altera Corporation

VLIO as an Avalon Master PortVLIO as an Avalon Master PortIntel PXA255 Variable Latency I/O (VLIO) Uses a Bi-Directional Data Path, RDY Signal to Add Wait StatesInterface Separates DATA into Read Data & Write Data Paths

VLIO AvalonRDY

MA [25:0])

nPWE

MD[31:0]

nOE

MD [31:0]

DQM[3:0]

EXT_IRQ

MEMCLOCK

RESET

waitrequest

address

write_n

writedata

read_n

readdata

Byteenble[3:0]

irq

clk

reset

Altera FPGAEMIF

RESET

MEMCLOCK

RDY

EXT_IRQ

reset

clk

waitrequest

irq

readdata[31:0]MD [31:0]

writedata[31:0]

nOE

MA[25:0]

nPWE

DQM[3:0]

address [ n:2]

write_n

byteenable[3:0]

read_nPXA

255

VLI

O

18© 2004 Altera Corporation

Relevant Verilog Code to ImplementRelevant Verilog Code to Implement

module ext_proc_if ( //port declarations)//signal declarationsassign a_write_n = e_pwe_n;assign a_read_n = e_oe_n;assign a_addr = {e_ma, 2'b0};assign e_rdy = a_waitrequest; assign a_be_n = e_dqm_n;// work out the bi-directional data bus// if output enable is low, then get the data from the readdata path of the Avalon switch fabricassign e_data = (!e_oe_n && !e_cs_n)? a_data_read : 'bz; // assign the Avalon Switch Fabric write data path to athe a_data_write net (i.e. the incoming data..assign a_data_write = a_write_data;always @(!e_cs_n)

beginif (!e_pwe_n ) a_write_data = e_data;

endendmodule

19© 2004 Altera Corporation

EMIF to Avalon Master PortEMIF to Avalon Master PortTexas Instruments (TI) DSP External Memory Interface (EMIF) Uses a Bi-Directional Data PathInterface Separates DATA into Read Data & Write Data Paths

EMIF AvalonARDY

ADDRESS [21:2])

AWE_n

DATA [31:0]

ARE_n

DATA [31:0]

BE[3:0]

EXT_IRQ

E_CLOCK

reset

waitrequest

address

write_n

writedata

read_n

readdata

byteenble

irq

clk

reset

Altera FPGAEMIF

RESETE_CLOCK

ARDY

EXT_IRQ

resetclk

waitrequest

irq

readdata[31:0]

DATA [31:0]writedata[31:0]

AOE_n

ADDRESS [ n:2]

AWE_n

BE[3:0]

ARE_naddress [ n:2]

write_n

byteenable[3:0]

read_n

TI –

DSP

EM

IF

20© 2004 Altera Corporation

Relevant Verilog Code to ImplementRelevant Verilog Code to Implement

module ext_proc_if ( //port declarations)//signal declarationsassign a_write_n = e_write_n;assign a_read_n = e_read_n;assign a_addr = {e_addr, 2'b0};assign e_rdy = a_waitrequest; assign a_be_n = e_be_n;// work out the bi-directional data bus// if output enable is low, then get the data from the readdata path of the Avalon switch fabricassign e_data = (!e_oe_n && !e_read_n && !e_cs_n)? a_data_read : 'bz; // assign the Avalon Switch Fabric write data path to athe a_data_write net (i.e. the incoming data..assign a_data_write = a_write_data;always @(!e_cs_n)

beginif (!e_write_n ) a_write_data = e_data;

endendmodule

21© 2004 Altera Corporation

Simple Processor InterfaceSimple Processor InterfaceAdvantages− Simple as Wires—1 Logic Element (LE) to

Control Data Bus I/O Direction− Easy Access to Peripherals for the Incumbent

Processor− Limitless Options for Adding Components

Restrictions− Processor Bus & FPGA Logic Operate on

Single Clock Domain− Processor Must Honor waitrequest

22© 2004 Altera Corporation

23© 2004 Altera Corporation

Removing Clock Domain RestrictionsRemoving Clock Domain RestrictionsState Machines Pass Transactions Between Clock DomainsProcessor State Machine Initiates Transaction − Waits for Avalon Master State Machine To Complete

Avalon Master State Machine Initiates Transaction with Avalon Switch Fabric− Waits for Avalon Switch Fabric to Complete

DMA_0DMA_0

MEM_0MEM_0

MEM_NMEM_N

M

M

S

CUST_0CUST_0

CUST_NCUST_N

S

S

M

M

AvalonSwitchFabric

AvalonSwitchFabric

AddressAddressExternalProcessorExternal

ProcessorProcessor

State Machine

Processor State

Machine

Avalon Master State

Machine

Avalon Master State

Machine

External CPU - Avalon I/F

MM

MM

MM

MM

MM

SS

SS

SS

SS

SS

FPGAFPGA

DataData

ControlControl

State Machine Control DiagramState Machine Control Diagram

H_WriteH_wait = 1H_Write

H_wait = 1

H_Idle(Start)H_Idle(Start)

H_DoneH_wait=1H_Done

H_wait=1

H_FinishingH_wait = 0

H_FinishingH_wait = 0

H_ReadH_wait=1H_Read

H_wait=1

CS =1 & Write

= 1

T_Done = 1

CS=1 & Read = 1

CS= 0

T_Done = 1

T_Idle = 1

T_Idle = 1

T_Idle = 0T_Done = 0

T_WriteT_Write

T_IdleT_Idle

T_DoneT_Done

T_ReadT_ReadH_Done OR

H_Finishing ORH_Idle = 1

H_Write= 1 H_Read = 1

Waitreq

uest

= 0

Waitrequest

Waitrequest = 0

Waitrequest = 1

= 1

H_Read ORH_Write = 1

External CPU or Digital SignalExternal CPU or Digital SignalProcessor (Host) SM Diagram

Avalon Master Avalon Master SM DiagramProcessor (Host) SM Diagram SM Diagram

24© 2004 Altera Corporation

State Machine Control State Machine Control Advantages− Processor & FPGA Operation Decoupled− Each Side Free to Operate at fMAX

Restrictions− Adds Size & Complexity to Core− Adds 3-5 Cycles of Latency per Transaction

25© 2004 Altera Corporation

Removing waitrequest RestrictionRemoving waitrequest RestrictionFixed Timing Between Processor & Interface− FIFO or DPRAM− Additional Logic Required to Move Data In/Out of FIFO / DPRAM

FIFO-Based Interface− Single Address, No Random Access− Excellent for Data Streaming Applications

Dual-Port RAM-Based Interface− Allows Fully Loaded Memory with Random Access− Excellent for Multi-Processor Applications

26© 2004 Altera Corporation

FIFO-Based InterfaceFIFO-Based InterfaceFIFO Used to Stream Data to/from a Fixed Destination− Custom Accelerators− Streaming Devices

Data Movement Controlled by Nios II Processor, DMA Controllers, Custom LogicFIFO Depth Affects Data Throughput

27© 2004 Altera Corporation

CPU-to-Avalon I/FCPU-to-Avalon I/F

AddressAddressExternalProcessorExternal

Processor

Read FIFORead FIFO

Write FIFOWrite FIFO SS

SS

DMA_0DMA_0MMMM

SS

SS

CUST_0CUST_0

CUST_NCUST_N

SS

SS

MM

MM

AvalonSwitchFabric

AvalonSwitchFabric

MM

MM

MM

MM

SS

SS

SS

SS

Accelerator

Accelerator

MM

FPGAFPGA

DataData

ControlControl

28© 2004 Altera Corporation

DPRAM-Based Interface DPRAM-Based Interface DPRAMs− Random Access− Mailbox Access

DPRAM Has Fixed Cycle Timing Data Movement on DPRAM Slave Controlled by:− Soft Embedded Processor− DMA on FPGA− IP Accelerator

DMA_0DMA_0MMMM

SS

SS

CUST_0CUST_0

CUST_NCUST_N

SS

SS

MM

MM

AvalonSwitchFabric

AvalonSwitchFabric

CPU-to-Avalon I/FCPU-to-Avalon I/F

AddressAddressExternalProcessorExternal

Processor

MM

MM

MM

MM

SS

SS

SS

SS

DPRAMDPRAMSS

Accelerator

Accelerator

MM

FPGAFPGA

DataData

ControlControl

Combination of InterfacesCombination of Interfaces

Variable Cycle Interfaces− Random Access

Peripherals− Control Variables

FIFO/DPRAM − Low-Latency− Streaming Data

Use External Processor Chip_Selects to Select the Interface

DMA_0DMA_0

MEM_0MEM_0

MEM_NMEM_N

MMMM

SS

SS

CUST_0CUST_0

CUST_NCUST_N

SS

SS

MM

MM

MM

AvalonSwitchFabric

AvalonSwitchFabric

CPU-to-Avalon I/FCPU-to-Avalon I/F

AddressAddressExternalProcessorExternal

ProcessorAvalonMasterAvalonMaster

MM

MM

MM

MM

SS

SS

SS

Streaming DataStreaming Data

Decoder

Decoder

SS

FIFO/DPRAMFIFO/DPRAM SS

FPGAFPGA

DataData

ControlControl

29© 2004 Altera Corporation

Interface Bandwidth & ThroughputInterface Bandwidth & ThroughputDetermined by Your Processor Bus− Width of Bus, Frequency of Bus, Number Cycles per Transfer

Dedicated or Shared Peripheral− SOPC Builder Easy to Define− Slave-Side Arbitration

Latency− Processor: Inherent by Design− Processor Interface: Depends on Type Used− Avalon Switch Fabric: 0 Cycles, Combinatorial− Peripherals: Defined by the Peripheral

30© 2004 Altera Corporation

SummarySummaryMultiple Processor Interface Options − Simple Processor Interface− Asynchronous Interface with State Machines− FIFO-Based Interface− DPRAM-Based Interface− Combination Interfaces

SOPC Builder Automatically Integrates Additional Components− Converts Your Concepts into Systems in

Minutes

31© 2004 Altera Corporation

Additional ResourcesAdditional ResourcesSOPC Builder− Included in All Quartus II Products

Evaluation Nios II & IP Cores− MegaCore IP Library CD− Nios II Embedded Processor Evaluation

Editionwww.altera.com− Literature Section – SOPC Builder− Complete List of Application Notes− Wide Range of Tutorials

32© 2004 Altera Corporation

© 2002© 2004 Altera Corporation

Multi-Processor Systems in FPGAs: Implementation & Debug

Multi-Processor Systems in FPGAs: Implementation & Debug

Multi-Processor Systems Address these ChallengesMulti-Processor Systems Address these Challenges

Increasing Application Requirements− Single Processor Faster Clock Rate Higher Power− Multi–Processors Slower Clock Rate Lower Power− Execute Tasks in Serial or Parallel for Highest

PerformanceExpanding Product Functionality− Custom-Fit Processors for Specific Functions

Simplify Application Software − Limit Code to the Task at Hand

34© 2004 Altera Corporation

Multi-Processor ArchitecturesMulti-Processor ArchitecturesMultiple Independent ProcessorsHost with Off-Load ProcessorsSerially Linked ProcessorsMulti-Channel Processors

35© 2004 Altera Corporation

Multiple Independent ProcessorsMultiple Independent ProcessorsMain System Processor − Primary Application

Secondary Processors− System Monitoring & Reporting Functions

PHYPHYASSPASSP

FlashFlash SDRAMSDRAMMainSystem

Processor

MainSystem

ProcessorMACMAC

SDRAMControllerSDRAM

Controller

MonitoringProcessorMonitoringProcessor

PowerSupplyPowerSupply

FanControl

FanControl

StatusDisplayStatusDisplay

StatusSensorsStatus

Sensors

I/OI/O

I/OI/O

JTAGJTAG

36© 2004 Altera Corporation

Off-Load of Host ProcessorOff-Load of Host ProcessorApplication or I/O Requirements Not Being Met by CPU Add an Off-Load CPUAdditional Off-Load CPU(s) to Further Address Performance or Functional Requirements

General PurposeProcessor

General PurposeGeneral PurposeProcessorProcessor

I/OI/OI/OI/OI/OI/O

SharedMemory/Resource

SharedSharedMemory/Memory/ResourceResource

Off-LoadProcessorOff-Load

Processor I/OI/O

I/O

Off-LoadProcessorOff-Load

Processor I/OI/O

I/O

General-PurposeProcessor

GeneralGeneral--PurposePurposeProcessorProcessor

37© 2004 Altera Corporation

Serial ProcessingSerial ProcessingApplication Divided into Sequential Tasks− One Processor per Task− Simplify Software

Processors Communicate Through Shared Data/ Resources

ININ CPU 0CPU 0 CPU 1CPU 1 OUTOUT

Shared DataResources

38© 2004 Altera Corporation

Multi-Channel ProcessingMulti-Channel Processing

APEX 20KCPU

MAC PacketBuffer

DMA

CPU

MAC PacketBuffer

DMA

UserLogicUserLogic

CPU

MAC PacketBuffer

DMA

CPU

MAC PacketBuffer

DMA

Develop a Single Channel Solution

Replicate to More Channels

39© 2004 Altera Corporation

Master/Slave Processor SystemMaster/Slave Processor SystemMaster Processor− Running Operating System− Primary Application(s)

Slave Processors− Operate Under Master

Processor Control

Examples− Load Balancing in Web

Server − Control Processor with

Multiple Processor Engines for Packet Processing

Master CPU

Master CPU

Slave CPU 0Slave CPU 0

Slave CPU 1Slave CPU 1

Slave CPU 2Slave CPU 2

OUTOUT

ININ

ININ

ININ

40© 2004 Altera Corporation

Design ConsiderationsDesign Considerations

41© 2004 Altera Corporation

System Development Environment− FPGAs− Soft Embedded Processors− System Development Tools

Inter-Processor CommunicationResource SharingSoftware− Project Management− Collective Run & Stop Control

Altera’s Product OfferingsAltera’s Product Offerings

High-Density FPGAs

High-Density FPGAsCPLDsCPLDs

StructuredASICs

StructuredASICs

Low-Cost FPGAs

Low-Cost FPGAs

NextGeneration

NextGeneration

42© 2004 Altera Corporation

Nios: Most Popular Soft Processor EverNios: Most Popular Soft Processor EverOver 14,000 Kits Shipped− Over 4,000 Unique Nios® Licensees (Companies)− More than All Other Embedded Soft Processors Combined *

EDN Hot 100 Product of 2003 Keys to Success− Complete Kit − Simple, Capable CPU− Easy to Use− Low Cost ($995)− Perpetual License− No Royalties

* Semico: Soft-Core Processor Licensees

43© 2004 Altera Corporation

Nios II OverviewNios II OverviewFamily of 3 Processor Cores− Performance Over 200 DMIPS− Cost as Low as 35¢ of Logic

Powerful New Software Development Tools− IDE/Debugger, RTOS,

TCP/IP StackComplete Portfolio of Development Kits− Stratix® , Stratix II & Cyclone™ Kits − Application-Specific Expansion Boards

44© 2004 Altera Corporation

0

50

100

150

200

250

300

$0.00 $1.00 $2.00 $3.00 $4.00 $5.00Cost of CPU Logic

Perf

orm

ance

(DM

IPS)

Processor Cost vs. PerformanceProcessor Cost vs. Performance

Stratix

Cyclone

Stratix II

HardCopy® Stratix

e

s

f

e

s

f

e

s

f

e

s

f

45© 2004 Altera Corporation

46© 2004 Altera Corporation

FPGA Processor Core AdvantagesFPGA Processor Core Advantages

Stratix II EP2S180 & Nios II FastCPU 1% of Device220 DMIPs (each)

Complex Embedded System-on-a-Chip

Processor?Processor? How Many Processors?How Many Processors?Cyclone Series & Nios II Economy

CPU < 20% of Device20 DMIPs

As Low as 35¢

Low-Cost Embedded Solution

2,910 Logic Elements

179,400 Logic Elements

Development Tools – SOPC BuilderDevelopment Tools – SOPC Builder

47© 2004 Altera Corporation

Host-Side Arbitration SchemeHost-Side Arbitration Scheme

MastersCPU 0

Shared BusShared Bus

ArbiterArbiter

I/O1 Shared

MemoryUART(s) SPI(s) Shared

MemorySharedMemory

CustomAccelerator Peripheral

48© 2004 Altera Corporation

CPU 2CPU 1

BottleneckArbiter Determines which Master Has

Access to Shared Bus

BottleneckArbiter Determines which Master Has

Access to Shared Bus

SlavesSlaves

Slave Side ArbitrationSlave Side Arbitration

SharedMemory

49© 2004 Altera Corporation

Slaves MailboxMemory

ArbiterArbiter

Masters

CustomAccelerator Peripheral

DataMemory

DataMemory

Slaves SharedMemory

ArbiterArbiter

SharedI/O

ArbiterArbiterArbiterArbiter ArbiterArbiter

CPU 0

LocalResources

CPU 2CPU 1

LocalResources

LocalResources

Simultaneous Operation for All Masters

SOPC Builder – Slave Side ArbitrationSOPC Builder – Slave Side Arbitration

50© 2004 Altera Corporation

Inter-Processor CommunicationsInter-Processor CommunicationsMemory Management− Access Control− Data Integrity− Amount of Memory

CPU 0CPU 0 CPU 1CPU 1

Buffer

DataData

Write

Read Write

Read

Shared Memory

51© 2004 Altera Corporation

Sharing MemorySharing MemoryPotential Problems with Shared Memory

Buffer 1

Buffer 2

Buffer 3

writewrite = Corrupt Data!CPU 0CPU 0 CPU 1CPU 1

52© 2004 Altera Corporation

Processors Writing to the Same Buffer

Sharing MemorySharing MemoryPartition & Allocate Memory by Processor Limit Access in Application Software

CPU 0CPU 0

Buffer 1

Buffer 2

Buffer 3

CPU 1CPU 1

readwrite

writeread

53© 2004 Altera Corporation

Sharing Memory – MailboxesSharing Memory – MailboxesSoftware Protocol Controls Exchange of Data− Flags Signal Data Ready or Taken

Independent Data Buffers − N * (N-1) Buffers Required

N = Number of Processors Assumes All Shared

CPU 0CPU 0 CPU 1CPU 1

Buffer A

Buffer B

DataDataFlagFlag

DataDataFlagFlag

Write

Read Write

Read

Shared Memory

54© 2004 Altera Corporation

Mailbox in SOPC BuilderMailbox in SOPC Builder

55© 2004 Altera Corporation

Sharing Resources – MutexSharing Resources – MutexControls Access to Shared Memory & Resources

CPU 0CPU 0

CPU 1CPU 1

Buffer 1

Buffer 2

MutexStructures

Hardware Mutex

Hardware Mutex

Any Shared Resource

Any Shared Resource

UARTUART

56© 2004 Altera Corporation

Mutex System in SOPC BuilderMutex System in SOPC Builder

57© 2004 Altera Corporation

Software Development ChallengesSoftware Development Challenges

Software DevelopmentMaintaining Hardware/Software CoherencyPartitioning Software− Sorry, That’s Your Job

58© 2004 Altera Corporation

Nios II Integrated Development Environment (IDE)*Nios II Integrated Development Environment (IDE)*

Features− Project Management− Editor & Compile− Debugger− Flash Programmer

Single JTAG Communication Link− HW/SW Download− HW/SW Debug

SignalTap® II Embedded Logic AnalyzerJTAG Debug Module

− JTAG UARTAdvanced Hardware Debug Features− Hardware Break Points, Data Triggers, Trace

59© 2004 Altera Corporation

* Based on Eclipse Project Integrated Development Environment (IDE)

Nios II IDE: Project Management Nios II IDE: Project Management

Features:• Project Manager• Editor & Compiler• Debugger• Flash Programmer

Software Templates

• System Library Customized to Hardware

• Painless Configuration

Software Components

• Provide Easy-to-Use Examples

• Users May Add Their Own Templates

Included Software Components

• LWIP TCP/IP Stack

• MicroC/OS-II RTOS*

• Altera ZIP FS

• Nios II Run-Time Library

60© 2004 Altera Corporation

Integrated Software Project Management

* MicroC/OS-II requires a separate license for product deployment!

Nios II IDE – Multi-ProcessorsNios II IDE – Multi-ProcessorsMultiple Processors SupportSystem Library Customized to the SOPC Builder Project − HW/SW Coherency

Assign Regions of a Shared Memory to Individual Processors− Linker Script

61© 2004 Altera Corporation

62© 2004 Altera Corporation

Nios II Hardware Abstraction LayerNios II Hardware Abstraction LayerHAL Mailbox RoutinesHAL Mailbox Routines

inter_processor_mbox_init()− Initialize a Mmailbox

inter_processor_mbox_get()− Copy a Message from Mailbox. Block

Until Requested bBtes are Available.inter_processor_mbox_tryget()− Copy a Message from Mailbox. Fail if

Requested Bytes are Not Available.inter_processor_mbox_peek_item()− Copy a Message from Mailbox. Fail if

Requested Bytes are Not Available. Do not Remove the Message from the Mailbox.

inter_processor_mbox_put()− Copy a Message to Mailbox. Block until

the Message Can Be Placed in the Mailbox.

inter_processor_mbox_tryput()− Copy a Message to Mailbox. Fail if the

Message Will Not Fit in the Mailbox.inter_processor_mbox_peek()− Retrieve the Number of Bytes Available

for Getting from the Mailbox.

HAL HAL MutexMutex RoutinesRoutinesinter_processor_mutex_init()− Initialize a Mutex, & Associate it with

the Supplied Hardware Component.inter_processor_mutex_destroy()− Destroy a Mutex, Disassociating it

with the Supplied Hardware Component.

inter_processor_mutex_lock()− Acquire the Lock for the Mutex.

Block until The Lock Can Be Acquired.

inter_processor_mutex_trylock()− Acquire the Lock for the Mutex.

Return if the Mutex is Unavailable.inter_processor_mutex_unlock()− Release the Lock for the Mutex.

Nios II IDE: Debugger Nios II IDE: Debugger

• Breakpoints

• Code Position

Source View

Basic Debug• Run Controls

• Stack View

• Active Debug Sessions

• Variables

• Registers

• Signals

Memory View

Features:• Project Manager• Editor & Compiler• Debugger• Flash Programmer

Debug Targets:• Hardware (JTAG)• Instruction Set Simulator• Logic Simulator (ModelSim)Debug Scope:• HW/SW breakpoints• H/W Data Triggers• Watchpoints• Instruction Trace

Debug Scope

63© 2004 Altera Corporation

Integrated, Broad Capability Debug

Debug – Multi-Processor SupportDebug – Multi-Processor SupportCollective Launch & Terminate

64© 2004 Altera Corporation

Multi-Processor Systems Using FPGAsMulti-Processor Systems Using FPGAs

Multi-Processors − Address Increasing Application Requirements− Expand Product Functionality− Simplify Software Development

FPGAs, Nios II, SOPC Builder − The Perfect-Fit Solutions

65© 2004 Altera Corporation

Additional ResourcesAdditional ResourcesNios II & IP MegaCore® Functions− Nios II Embedded Processor Evaluation Edition− MegaCore IP Library CD

SOPC Builder− Included in All Quartus® II Products

www.altera.com Literature Section− Multi-Processor Systems with Nios II Application

Note− Nios II Handbook− SOPC Builder Documentation

66© 2004 Altera Corporation

© 2002© 2004 Altera Corporation

Multi-Processor Systems in FPGAs: Implementation & Debug

Multi-Processor Systems in FPGAs: Implementation & Debug

68© 2004 Altera Corporation

FPGAs for Multi-Processor DesignsFPGAs for Multi-Processor Designs

PerformancePerformanceThree ConfigurationsHardware AccelerationCustom Instructions

SOPC BuilderThe Perfect-Fit Solution

FlexibilityFlexibility

Powerful Debug Tools

Powerful Debug Tools

Fastest Timeto Market

Fastest Timeto Market

Single or Multiple Standard PeripheralsCustom Peripherals

On-Chip Processor DebugSignalTap II Logic AnalyzerPerformance Analysis Plug-Ins

© 2002© 2004 Altera Corporation

Thank You !Thank You !