Hy-C A Compiler Retargetable for Single-Chip Heterogeneous Multiprocessors Philip Sweany 8/30/2013

Hy-C
A Compiler Retargetable for Single-Chip Heterogeneous Multiprocessors

Philip Sweany
8/30/2013

Hybrid Computing

• Heterogeneous processors on single chip
  – “CPU”
  – FPGA
  – ASIC
  – N “CPU”s, M FPGAs, K ASICs
• Tradeoffs of performance, power, flexibility

Generic Hybrid Architecture

[Figure: a Multi-CPU block (CPU 1, CPU 2, ..., CPU m) and a Multi-FPGA block (FPGA 1, FPGA 2, ..., FPGA n), with Shared Memory.]

Generic Hy-C Tools

[Figure: tool flow with boxes labeled System Specification, Source Code, Optimization Control (Objectives/Constraints), Partitioning, CPU Compiler, FPGA Synthesis, CPU Power-Performance Model, and FPGA Power-Performance Model.]

OMAP Resources (old)

[Figure: a Multi-CPU block containing Veyron, Tesla, and Ducati, with Shared Memory.]

OMAP Processor Resources

• Chiron
  – 2 x 600 MHz (2 symmetric processors, each at 600 MHz, with shared L2)
  – Power 600 µW/MHz
• Tesla
  – DSP Sub-System (C64x derivative); 400 MHz, 8-wide ILP
  – Power 200 µW/MHz
• Ducati
  – 200 MHz (targeted for control, low-latency code)
  – Power 100 µW/MHz

“Canonical” Resources

[Figure: StrongArm, WimpyArm, C64x, and FPGA, with Shared Memory.]

“Canonical” Processor Resources

• StrongArm
  – 2 x 600 MHz (2 symmetric processors, each at 600 MHz, with shared L2)
  – Power 600 µW/MHz
• C64x
  – DSP Sub-System (C64x derivative); 400 MHz, 8-wide ILP
  – Power 200 µW/MHz
• WimpyArm
  – 200 MHz (targeted for control, low-latency code)
  – Power 100 µW/MHz
• FPGA fabric
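
The per-MHz power figures above can be turned into rough active-power estimates by multiplying by the clock rate. The sketch below does that arithmetic. It assumes power scales linearly with frequency and that the µW/MHz figure applies per core; the slides state neither. The class and function names are hypothetical, and the FPGA fabric is left out because the slide gives no figures for it.

```python
# Rough active-power estimate from the slide's figures.
# Assumptions (not from the slides): power scales linearly with clock rate,
# and the uW/MHz figure applies to each core separately.
from dataclasses import dataclass


@dataclass(frozen=True)
class Processor:
    name: str
    cores: int
    clock_mhz: int
    uw_per_mhz: int

    def active_power_mw(self) -> float:
        """Estimated power in milliwatts with all cores at full clock."""
        return self.cores * self.clock_mhz * self.uw_per_mhz / 1000.0


CANONICAL = [
    Processor("StrongArm", cores=2, clock_mhz=600, uw_per_mhz=600),
    Processor("C64x", cores=1, clock_mhz=400, uw_per_mhz=200),
    Processor("WimpyArm", cores=1, clock_mhz=200, uw_per_mhz=100),
]


def power_table(processors):
    """Map each processor name to its estimated active power in mW."""
    return {p.name: p.active_power_mw() for p in processors}


if __name__ == "__main__":
    for name, mw in power_table(CANONICAL).items():
        print(f"{name}: {mw:.0f} mW")
```

Under these assumptions the estimates are 720 mW for StrongArm (both cores), 80 mW for the C64x, and 20 mW for WimpyArm.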

Hy-C for Canonical Chip

[Figure: tool flow with boxes labeled System Specification, Source Code, Optimization Control (Objectives/Constraints), and Partitioning, targeting Strong, Wimpy, C64x, and FPGA.]
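
The slides do not say how Hy-C partitions code across targets. As an illustration only, one simple policy would be to treat a per-task deadline as the constraint and energy as the objective, and then pick, for each task, the lowest-energy target that meets its deadline. In the sketch below, the task names, cycle counts, and deadlines are invented, the function names are hypothetical, and the linear power model (µW/MHz times clock rate) is an assumption. The FPGA is omitted because the slides give no figures for it.

```python
# Hypothetical partitioning sketch; not the Hy-C algorithm.
# Cycle counts and deadlines below are invented for illustration.

# target name -> (clock in MHz, power in uW per MHz), as listed on the slide
TARGETS = {
    "StrongArm": (600, 600),
    "C64x": (400, 200),
    "WimpyArm": (200, 100),
}


def runtime_us(cycles, clock_mhz):
    """Cycles divided by MHz gives microseconds."""
    return cycles / clock_mhz


def energy_nj(cycles, clock_mhz, uw_per_mhz):
    """uW times us gives pJ; divide by 1000 for nJ."""
    power_uw = clock_mhz * uw_per_mhz
    return power_uw * runtime_us(cycles, clock_mhz) / 1000.0


def partition(task_cycles, deadlines_us):
    """Assign each task to the lowest-energy target that meets its deadline."""
    assignment = {}
    for task, per_target in task_cycles.items():
        feasible = []
        for target, cycles in per_target.items():
            clock_mhz, uw_per_mhz = TARGETS[target]
            if runtime_us(cycles, clock_mhz) <= deadlines_us[task]:
                feasible.append((energy_nj(cycles, clock_mhz, uw_per_mhz), target))
        if not feasible:
            raise ValueError(f"no target meets the deadline for {task!r}")
        assignment[task] = min(feasible)[1]
    return assignment


# Invented example: estimated cycles per task on each candidate target.
TASKS = {
    "fir_filter": {"StrongArm": 120_000, "C64x": 20_000, "WimpyArm": 150_000},
    "ui_event": {"StrongArm": 6_000, "WimpyArm": 8_000},
}
DEADLINES_US = {"fir_filter": 100.0, "ui_event": 50.0}

if __name__ == "__main__":
    print(partition(TASKS, DEADLINES_US))
```

With these invented numbers, `fir_filter` goes to the C64x, the only target that finishes within 100 µs, and `ui_event` goes to WimpyArm, which meets the 50 µs deadline at 800 nJ versus 3600 nJ on StrongArm. A real partitioner would also have to account for communication through shared memory and for contention between tasks, which this greedy per-task choice ignores.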

Open Issue(s)

• How should we describe the architecture?
• How should we describe the optimization constraints?
• How/when shall we implement this beast?
• How will we evaluate the “performance” of the generated code?