INSPIRE: The Insieme Parallel Intermediate Representation


Page 1: INSPIRE The Insieme Parallel Intermediate Representation

INSPIRE
The Insieme Parallel Intermediate Representation

Herbert Jordan, Peter Thoman, Simone Pellegrini, Klaus Kofler, and Thomas Fahringer
University of Innsbruck

PACT’13 - 9 September

Page 2: INSPIRE The Insieme Parallel Intermediate Representation

Programming Models

User: C / C++

    void main(…) {
      int sum = 0;
      for(i = 1..10)
        sum += i;
      print(sum);
    }

HW: [Diagram: a single core (C) attached to Memory]
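For reference, a minimal compilable C rendering of the slide's shorthand, assuming the inclusive 1..10 range the slide suggests; `printf` stands in for the slide's `print`:

    #include <stdio.h>

    int main(void) {
        int sum = 0;
        /* the slide's "for(i = 1..10)" written as a standard C loop */
        for (int i = 1; i <= 10; ++i)
            sum += i;
        printf("%d\n", sum);   /* prints 55 */
        return 0;
    }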

Page 3: INSPIRE The Insieme Parallel Intermediate Representation

Programming Models

User: C / C++

    void main(…) {
      int sum = 0;
      for(i = 1..10)
        sum += i;
      print(sum);
    }

Compiler: PL → IR → Assembly
  • instruction selection
  • register allocation
  • optimization

    .START ST
    ST: MOV R1,#2
        MOV R2,#1
    M1: CMP R2,#20
        BGT M2
        MUL R1,R2
        INI R2
        JMP M1

HW: [Diagram: a single core (C) attached to Memory]

Page 4: INSPIRE The Insieme Parallel Intermediate Representation

Programming Models

User: C / C++

    void main(…) {
      int sum = 0;
      for(i = 1..10)
        sum += i;
      print(sum);
    }

Compiler: PL → IR → Assembly
  • instruction selection
  • register allocation
  • optimization
  • loops & latency

    .START ST
    ST: MOV R1,#2
        MOV R2,#1
    M1: CMP R2,#20
        BGT M2
        MUL R1,R2
        INI R2
        JMP M1

HW: [Diagram: a single core (C) with cache ($) and Memory]

Page 5: INSPIRE The Insieme Parallel Intermediate Representation

Programming Models

User: C / C++

    void main(…) {
      int sum = 0;
      for(i = 1..10)
        sum += i;
      print(sum);
    }

Compiler: PL → IR → Assembly
  • instruction selection
  • register allocation
  • optimization
  • loops & latency

    .START ST
    ST: MOV R1,#2
        MOV R2,#1
    M1: CMP R2,#20
        BGT M2
        MUL R1,R2
        INI R2
        JMP M1

HW: [Diagram: a single core (C) with cache ($) and Memory]
    … a 10-year-old architecture

Page 6: INSPIRE The Insieme Parallel Intermediate Representation

Parallel Architectures

  • Multicore: several cores (C) sharing a cache ($) and Memory → OpenMP / Cilk
  • Accelerators: host cores plus a GPU (G) with its own memory (M) → OpenCL / CUDA
  • Clusters: multiple nodes, each with cores (C), cache ($), and memory (M) → MPI / PGAS

Page 7: INSPIRE The Insieme Parallel Intermediate Representation

Compiler Support

C / C++:

    void main(…) {
      int sum = 0;
      #omp pfor
      for(i = 1..10)
        sum += i;
    }

Frontend → IR:

    .START ST
    ST: MOV R1,#2
        MOV R2,#1
        _GOMP_PFOR
    M1: CMP R2,#20
        BGT M2
        MUL R1,R2
        INI …

lib:

    pfor: mov eax,-2
          cmp eax, 2
          xor eax, eax
          ...

Backend → bin:

    Start:   mov eax,2
             mov ebx,1
             call “pfor”
    Label 1: lea esi, Str
             push esi

The compiler itself treats the program as sequential; the parallelism hides behind the library call.
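To make the "magic happens in libraries" point concrete: a conventional OpenMP compiler typically outlines the loop body into a function and hands it to a runtime library. The sketch below is illustrative only; `pfor_runtime` and its signature are hypothetical stand-ins, not the actual GOMP interface, and the placeholder runtime simply runs the whole range so the example is executable.

    #include <stdio.h>

    /* Hypothetical runtime entry point (a stand-in, not the real GOMP API):
       a real runtime would split [lo, hi) across worker threads. */
    static void pfor_runtime(void (*fn)(int, int, void *), int lo, int hi, void *arg) {
        fn(lo, hi, arg);
    }

    /* The loop body as the compiler might outline it; shared state is
       passed by address and is invisible to later compiler analysis. */
    static void outlined_body(int lo, int hi, void *arg) {
        int *sum = arg;
        for (int i = lo; i < hi; ++i)
            *sum += i;
    }

    int main(void) {
        int sum = 0;
        /* roughly what the "#omp pfor" loop becomes after outlining */
        pfor_runtime(outlined_body, 1, 11, &sum);
        printf("sum = %d\n", sum);   /* 55 */
        return 0;
    }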

Page 8: INSPIRE The Insieme Parallel Intermediate Representation

Situation

  • Compilers: unaware of thread-level parallelism; the magic happens in libraries
  • Libraries: limited perspective / scope; no static analysis, no transformations
  • User: has to manage and coordinate parallelism; no performance portability

Page 9: INSPIRE The Insieme Parallel Intermediate Representation

Compiler Support?

User: C / C++

    void main(…) {
      int sum = 0;
      for(i = 1..10)
        sum += i;
      print(sum);
    }

Compiler: PL → IR → Assembly
  • instruction selection
  • register allocation
  • optimization
  • loops & latency
  • vectorization

    .START ST
    ST: MOV R1,#2
        MOV R2,#1
    M1: CMP R2,#20
        BGT M2
        MUL R1,R2
        INI R2
        JMP M1

HW: [Diagram: two nodes, each with multiple cores (C), cache ($), a GPU (G), and memory (M)]

Page 10: INSPIRE The  Insieme  Parallel Intermediate Representation

M$

C CC C

G

MM$

C CC C

G

M

Our approach:HW: Compiler: User:

.START ST ST: MOV R1,#2 MOV R2,#1 M1: CMP R2,#20 BGT M2 MUL R1,R2 INI R2 JMP M1

IR

PL Assembly• instruction selection• register allocation• optimization

• loops & latency• vectorization

void main(…) { int sum = 0; for(i = 1..10) sum += i; print(sum);}

C / C++

M$

C CC C

G

MM$

C CC C

G

M

Page 11: INSPIRE The Insieme Parallel Intermediate Representation

Our approach: Insieme

User: C / C++

    void main(…) {
      int sum = 0;
      #omp pfor
      for(i = 1..10)
        sum += i;
    }

Insieme: PL → PL + extras, via INSPIRE
  • coordinate parallelism
  • high-level optimization
  • auto tuning
  • instrumentation

    unit main(...) {
      ref<int> v1 = 0;
      pfor(..., (){ ... });
    }

Compiler: PL → IR → Assembly
  • instruction selection
  • register allocation
  • optimization
  • loops & latency
  • vectorization

    .START ST
    ST: MOV R1,#2
        MOV R2,#1
    M1: CMP R2,#20
        BGT M2
        MUL R1,R2
        INI R2
        JMP M1

HW: [Diagram: two nodes, each with multiple cores (C), cache ($), a GPU (G), and memory (M)]
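As a hedged sketch of the closure-based form of the pfor shown above, the following C code mimics how a captured variable and a loop-body closure might be carried back into plain C for the system compiler. The names `irt_range`, `irt_pfor`, and the capture struct are hypothetical illustrations, not the actual Insieme runtime (IRT) interface, and the placeholder runtime runs the range sequentially so the sketch is executable.

    #include <stdio.h>

    /* Hypothetical range type: begin, end (exclusive), step. */
    typedef struct { int begin, end, step; } irt_range;

    /* Captured variables of the closure, corresponding to "(){ ... }". */
    typedef struct { int *sum; } capture_t;

    static void loop_body(irt_range r, void *c) {
        capture_t *cap = c;
        for (int i = r.begin; i < r.end; i += r.step)
            *cap->sum += i;
    }

    /* Placeholder "runtime": a real one would distribute subranges
       across the members of a thread group. */
    static void irt_pfor(irt_range r, void (*body)(irt_range, void *), void *c) {
        body(r, c);
    }

    int main(void) {
        int sum = 0;
        capture_t cap = { &sum };
        irt_pfor((irt_range){ 1, 11, 1 }, loop_body, &cap);
        printf("sum = %d\n", sum);   /* 55 */
        return 0;
    }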

Page 12: INSPIRE The Insieme Parallel Intermediate Representation

The Insieme Project

Goal: to establish a research platform for hybrid, thread-level parallelism

Input languages: C/C++, OpenMP, Cilk, OpenCL, MPI, and extensions

Compiler: Frontend → INSPIRE → Backend, plus Static Optimizer and IR Toolbox

Runtime: Dyn. Optimizer, Scheduler, Monitoring, Exec. Engine

Page 13: INSPIRE The Insieme Parallel Intermediate Representation

Parallel Programming

  • OpenMP: pragmas (+ API)
  • Cilk: keywords
  • MPI: library
  • OpenCL: library + JIT

Objective: combine those using a unified formalism and provide an infrastructure for analysis and manipulations
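As a concrete illustration of the variety these models bring, the same parallel sum can be expressed with an OpenMP pragma, a Cilk keyword, or library calls; a minimal sketch, assuming an OpenMP-capable compiler (the pragma is simply ignored otherwise):

    #include <stdio.h>

    int main(void) {
        int sum = 0;

        /* OpenMP: parallelism lives in a pragma plus the runtime library. */
        #pragma omp parallel for reduction(+:sum)
        for (int i = 1; i <= 10; ++i)
            sum += i;

        /* Cilk would express a similar loop with a keyword instead:
         *     cilk_for (int i = 1; i <= 10; ++i) ...
         * MPI and OpenCL move the parallelism into library calls entirely.
         * Several syntactic surfaces, one underlying concept for the IR to unify. */

        printf("sum = %d\n", sum);   /* 55 */
        return 0;
    }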

Page 14: INSPIRE The  Insieme  Parallel Intermediate Representation

INSPIRE RequirementsOpenMP / Cilk / OpenCL / MPI / others

INSPIRE

• complete • unified• explicit• analyzable• transformable• compact• high level• whole program• open system• extensible

OpenCL / MPI / Insieme Runtime / others

Page 15: INSPIRE The Insieme Parallel Intermediate Representation

INSPIRE

Functional Basis
  • first-class functions and closures
  • generic (function) types
  • program = 1 expression

Imperative Constructs
  • loops, conditions, mutable state

Explicit Parallel Constructs
  • to model parallel control flow

Page 16: INSPIRE The Insieme Parallel Intermediate Representation

Parallel Model

Parallel Control Flow: defined by jobs, processed cooperatively by thread groups

Page 17: INSPIRE The Insieme Parallel Intermediate Representation

Parallel Model (2)

  • one work-sharing construct (sketched below)
  • one data-sharing construct
  • point-to-point communication via abstract channels
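A minimal sketch, assuming nothing about INSPIRE's actual constructs, of what cooperative processing of a job by a thread group looks like at the pthreads level: every member runs the same job body, and the loop's iterations are shared among the members by index range.

    #include <pthread.h>
    #include <stdio.h>

    #define GROUP_SIZE 4
    #define N          1000

    /* Per-member view of the job: own id plus a shared result array. */
    typedef struct {
        int id;
        long long *partial;   /* one slot per member: data sharing by distribution */
    } member_t;

    /* The "job": executed by every member of the thread group; the
       work-sharing step assigns each member a contiguous chunk of [0, N). */
    static void *job(void *arg) {
        member_t *m = arg;
        int chunk = (N + GROUP_SIZE - 1) / GROUP_SIZE;
        int lo = m->id * chunk;
        int hi = lo + chunk < N ? lo + chunk : N;
        long long s = 0;
        for (int i = lo; i < hi; ++i)
            s += i;
        m->partial[m->id] = s;
        return NULL;
    }

    int main(void) {
        pthread_t group[GROUP_SIZE];
        member_t member[GROUP_SIZE];
        long long partial[GROUP_SIZE];

        for (int t = 0; t < GROUP_SIZE; ++t) {
            member[t].id = t;
            member[t].partial = partial;
            pthread_create(&group[t], NULL, job, &member[t]);
        }

        long long sum = 0;
        for (int t = 0; t < GROUP_SIZE; ++t) {
            pthread_join(group[t], NULL);
            sum += partial[t];
        }
        printf("sum = %lld\n", sum);   /* 0 + 1 + ... + 999 = 499500 */
        return 0;
    }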

Page 18: INSPIRE The Insieme Parallel Intermediate Representation

Evaluation

What inherent impact does the INSPIRE detour impose?

  • Binary A (GCC): input code (C) compiled directly with GCC 4.6.3 (-O3)
  • Binary B (Insieme): input code (C) passed through the Insieme compiler (FE → INSPIRE → BE, no optimization!) to target code (C, IRT), then compiled with GCC 4.6.3 (-O3)

Page 19: INSPIRE The Insieme Parallel Intermediate Representation

Performance Impact

[Figure: relative execution time of the Insieme-generated binaries vs. the GCC binaries]

Page 20: INSPIRE The Insieme Parallel Intermediate Representation

Derived Work (subset)

  • Adaptive Task Granularity Control
    P. Thoman, H. Jordan, T. Fahringer, Adaptive Granularity Control in Task Parallel Programs Using Multiversioning, Euro-Par 2013
  • Multi-Objective Auto-Tuning
    H. Jordan, P. Thoman, J. J. Durillo et al., A Multi-Objective Auto-Tuning Framework for Parallel Codes, SC 2012
  • Compiler-Aided Loop Scheduling
    P. Thoman, H. Jordan, S. Pellegrini et al., Automatic OpenMP Loop Scheduling: A Combined Compiler and Runtime Approach, IWOMP 2012
  • OpenCL Kernel Partitioning
    K. Kofler, I. Grasso, B. Cosenza, T. Fahringer, An Automatic Input-Sensitive Approach for Heterogeneous Task Partitioning, ICS 2013
  • Improved Usage of MPI Primitives
    S. Pellegrini, T. Hoefler, T. Fahringer, On the Effects of CPU Caches on MPI Point-to-Point Communications, Cluster 2012

Page 21: INSPIRE The Insieme Parallel Intermediate Representation

Conclusion

INSPIRE is designed to
  • represent and unify parallel applications
  • analyze and manipulate parallel codes
  • provide the foundation for researching parallel language extensions

It is based on a comprehensive parallel model, sufficient to cover the leading standards for parallel programming.

Its practicality has been demonstrated by a variety of derived work.

Page 22: INSPIRE The Insieme Parallel Intermediate Representation

Thank You!

Visit: http://insieme-compiler.org
Contact: [email protected]

Page 23: INSPIRE The Insieme Parallel Intermediate Representation

Types: 7 type constructors

Page 24: INSPIRE The Insieme Parallel Intermediate Representation

Expressions: 8 kinds of expressions

Page 25: INSPIRE The Insieme Parallel Intermediate Representation

Statements: 9 types of statements