graphite: the polyhedral framework of gccpingali/cs380c/2013/lectures/...graphite in gcc c c++ java...

31
Graphite: the polyhedral framework of GCC Sebastian Pop AMD - Austin, Texas October 20, 2010 1 / 25 Sebastian Pop Graphite: the polyhedral framework of GCC

Upload: others

Post on 08-Oct-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Graphite: the polyhedral framework of GCCpingali/CS380C/2013/lectures/...Graphite in GCC C C++ Java F95 Ada Asm GIMPLE Machine Description Front-ends Middle-end Back-end RTL GIMPLE

Graphite: the polyhedral framework of GCC

Sebastian Pop

AMD - Austin, Texas

October 20, 2010

1 / 25 Sebastian Pop Graphite: the polyhedral framework of GCC

Page 2: Graphite: the polyhedral framework of GCCpingali/CS380C/2013/lectures/...Graphite in GCC C C++ Java F95 Ada Asm GIMPLE Machine Description Front-ends Middle-end Back-end RTL GIMPLE

Outline

I Graphite in GCC

I detection of SCoPs

I polyhedral representation

I code generation: CLooG

I loop transforms: blocking, flattening, autopar, autovect

2 / 25 Sebastian Pop Graphite: the polyhedral framework of GCC

Page 3: Graphite: the polyhedral framework of GCCpingali/CS380C/2013/lectures/...Graphite in GCC C C++ Java F95 Ada Asm GIMPLE Machine Description Front-ends Middle-end Back-end RTL GIMPLE

Graphite in GCC

C C++ Java F95 Ada

Asm

GIMPLE

Machine Description

Front−ends

Middle−end

Back−endRTL

GIMPLE

GENERIC

GIMPLE − SSA

Graphite

3 / 25 Sebastian Pop Graphite: the polyhedral framework of GCC

Page 4: Graphite: the polyhedral framework of GCCpingali/CS380C/2013/lectures/...Graphite in GCC C C++ Java F95 Ada Asm GIMPLE Machine Description Front-ends Middle-end Back-end RTL GIMPLE

Components of Graphite

C C++ Java F95 Ada

Asm

GIMPLE

Machine Description

SCoP detection

polyhedral representation

apply loop transforms

CLooG code generation

CLAST translation

Graphite

Front−ends

Middle−end

Back−endRTL

GIMPLE

GENERIC

GIMPLE − SSA

4 / 25 Sebastian Pop Graphite: the polyhedral framework of GCC

Page 5: Graphite: the polyhedral framework of GCCpingali/CS380C/2013/lectures/...Graphite in GCC C C++ Java F95 Ada Asm GIMPLE Machine Description Front-ends Middle-end Back-end RTL GIMPLE

Gimple, SSA, CFG, natural loops

I Gimple = 3 address code

I SSA = Static Single Assignment

I CFG = Control Flow Graph

I Natural loops = strongly connected components of the CFG

Reference books:

I the “Dragon Book” (Aho, Lam, Sethi, Ullman)

I Steven Muchnick

I Michael Wolfe

5 / 25 Sebastian Pop Graphite: the polyhedral framework of GCC

Page 6: Graphite: the polyhedral framework of GCCpingali/CS380C/2013/lectures/...Graphite in GCC C C++ Java F95 Ada Asm GIMPLE Machine Description Front-ends Middle-end Back-end RTL GIMPLE
Page 7: Graphite: the polyhedral framework of GCCpingali/CS380C/2013/lectures/...Graphite in GCC C C++ Java F95 Ada Asm GIMPLE Machine Description Front-ends Middle-end Back-end RTL GIMPLE

SCoP detection

SCoP: Static Control Part is a region with no side effects

I induction variables (IVs) affine

I canonical IV: one per loop, from 0 to number of iterations,with steps of 1

I linear loop bounds (linear = function of params and outer IVs)

I linear memory accesses

I side effects: function calls, inline asm, volatile, etc.

I regular control flow: irreducible strongly connectedcomponents not handled

I basic blocks with no memory accesses not represented

I statements = basic blocks with regular memory accesses

7 / 25 Sebastian Pop Graphite: the polyhedral framework of GCC

Page 8: Graphite: the polyhedral framework of GCCpingali/CS380C/2013/lectures/...Graphite in GCC C C++ Java F95 Ada Asm GIMPLE Machine Description Front-ends Middle-end Back-end RTL GIMPLE

SCoP example

0

11*

1

2

12

3

13

4

14

5

15

6

16

7

17

8

18

( 9# )

( 19* )

10

( 20# )

8 / 25 Sebastian Pop Graphite: the polyhedral framework of GCC

Page 9: Graphite: the polyhedral framework of GCCpingali/CS380C/2013/lectures/...Graphite in GCC C C++ Java F95 Ada Asm GIMPLE Machine Description Front-ends Middle-end Back-end RTL GIMPLE

Translation to polyhedral representation

I build Polyhedral Black Boxes (PBB): one statement, asequence of statements, one basic block, or a single entrysingle exit (SESE) region.

I record original PBB schedule

I loop nest around PBB

I conditions around PBB

I find SCoP parameters

I record SCoP context: constraints on parameters

I iteration domains: constraints on IV

I data accesses in PBB

I build the data dependence graph

9 / 25 Sebastian Pop Graphite: the polyhedral framework of GCC

Page 10: Graphite: the polyhedral framework of GCCpingali/CS380C/2013/lectures/...Graphite in GCC C C++ Java F95 Ada Asm GIMPLE Machine Description Front-ends Middle-end Back-end RTL GIMPLE

Representation of scalar dependences in Graphite

I the SSA represents dependences between scalars.

I when the scalar dependences cross the boundary of PBBs, wehave to expose these dependences to the polyhedralframework: translate scalars into arrays (for a scalar variable“s”, define an array of one element and replace all theoccurrences of the scalar variable by “S[0]”).

I commutative associative reductions are special cased in thedata dependence test to remove unwanted dependences.

10 / 25 Sebastian Pop Graphite: the polyhedral framework of GCC

Page 11: Graphite: the polyhedral framework of GCCpingali/CS380C/2013/lectures/...Graphite in GCC C C++ Java F95 Ada Asm GIMPLE Machine Description Front-ends Middle-end Back-end RTL GIMPLE

Polyhedral representation

1. scop context = constraints on parameters

2. iteration domain = bounds of enclosing loops

3. schedule = execution time (static + dynamic)

4. access functions = data reference accesses

for (i=0; i<m; i++)

for (j=5; j<n; j++)

A[2*i][j+1] = ...

i j m n cst1 0 0 0 0

−1 0 1 0 −10 1 0 0 50 −1 0 1 −1

i ≥ 0−i +m − 1 ≥ 0j ≥ 5−j + n − 1 ≥ 0

11 / 25 Sebastian Pop Graphite: the polyhedral framework of GCC

Page 12: Graphite: the polyhedral framework of GCCpingali/CS380C/2013/lectures/...Graphite in GCC C C++ Java F95 Ada Asm GIMPLE Machine Description Front-ends Middle-end Back-end RTL GIMPLE

Polyhedral representation

1. scop context = constraints on parameters

2. iteration domain = bounds of enclosing loops

3. schedule = execution time (static + dynamic)

4. access functions = data reference accesses

I sequence [[s1; s2]]:S[[s1]] = t, S[[s2]] = t + 1

I loop [[loop1 s end1]]: i1 indexes loop1 iterations: dynamic timeS[[loop1]] = t, S[[s]] = (t, i1, 0)

11 / 25 Sebastian Pop Graphite: the polyhedral framework of GCC

Page 13: Graphite: the polyhedral framework of GCCpingali/CS380C/2013/lectures/...Graphite in GCC C C++ Java F95 Ada Asm GIMPLE Machine Description Front-ends Middle-end Back-end RTL GIMPLE

Polyhedral representation

1. scop context = constraints on parameters

2. iteration domain = bounds of enclosing loops

3. schedule = execution time (static + dynamic)

4. access functions = data reference accesses

for (i=0; i<m; i++)

for (j=5; j<n; j++)

A[2*i][j+1] = ...

i j m n cst2 0 0 0 00 1 0 0 1

2 ∗ ij + 1

11 / 25 Sebastian Pop Graphite: the polyhedral framework of GCC

Page 14: Graphite: the polyhedral framework of GCCpingali/CS380C/2013/lectures/...Graphite in GCC C C++ Java F95 Ada Asm GIMPLE Machine Description Front-ends Middle-end Back-end RTL GIMPLE

Polyhedral representation

1. scop context = constraints on parameters

2. iteration domain = bounds of enclosing loops

3. schedule = execution time (static + dynamic)

4. access functions = data reference accesses

data dependences = ILP solution of all these constraints

11 / 25 Sebastian Pop Graphite: the polyhedral framework of GCC

Page 15: Graphite: the polyhedral framework of GCCpingali/CS380C/2013/lectures/...Graphite in GCC C C++ Java F95 Ada Asm GIMPLE Machine Description Front-ends Middle-end Back-end RTL GIMPLE

Data dependence analysis

I data dependences characterize computation sharing

I sharing ⇒ synchronization and communications

I no sharing = parallelism = recomputations (privatization)

I legality of a transform = satisfy original computation order

12 / 25 Sebastian Pop Graphite: the polyhedral framework of GCC

Page 16: Graphite: the polyhedral framework of GCCpingali/CS380C/2013/lectures/...Graphite in GCC C C++ Java F95 Ada Asm GIMPLE Machine Description Front-ends Middle-end Back-end RTL GIMPLE

Counting points in polyhedra

In many program analyses and optimizations, questions startingwith ”how many” need to be answered:

I How many memory locations are touched by a loop?

I How many operations are performed by a loop?

I How many cache lines are touched by a loop?

I How many array elements are accessed between two points?

I How many array elements are live at a given iteration?

I How many times is a statement executed before an iteration?

I How many cache misses does a loop generate?

I How much memory is dynamically allocated?

Techniques used for counting points:

I Ehrhart polynomials

I Barvinok’s generating functions

13 / 25 Sebastian Pop Graphite: the polyhedral framework of GCC

Page 17: Graphite: the polyhedral framework of GCCpingali/CS380C/2013/lectures/...Graphite in GCC C C++ Java F95 Ada Asm GIMPLE Machine Description Front-ends Middle-end Back-end RTL GIMPLE

Loop transforms

I Graphite represents the static and dynamic schedules under apolyhedral format: the scattering polyhedra

I identifying statements belonging to a loop, or updating thesequence of statements on the polyhedral representation isdifficult

I the LST = Loop Statement Tree represents the statementsequence and loop nesting, but does not include informationsabout the iteration domains

I loop transformations are performed on the LST and thenimpacted on the scattering polyhedra

14 / 25 Sebastian Pop Graphite: the polyhedral framework of GCC

Page 18: Graphite: the polyhedral framework of GCCpingali/CS380C/2013/lectures/...Graphite in GCC C C++ Java F95 Ada Asm GIMPLE Machine Description Front-ends Middle-end Back-end RTL GIMPLE

Code generation

I the code generation of an imperative language from thepolyhedral representation introduces imperative languageconstructs: sequence, loops, parallel computations,communication, . . .

I call CLooG for code generation, produces a representationCLAST: CLooG Abstract Syntax Trees

I generate GIMPLE-SSA from CLAST

15 / 25 Sebastian Pop Graphite: the polyhedral framework of GCC

Page 19: Graphite: the polyhedral framework of GCCpingali/CS380C/2013/lectures/...Graphite in GCC C C++ Java F95 Ada Asm GIMPLE Machine Description Front-ends Middle-end Back-end RTL GIMPLE

CLooG’s code generation from polyhedra

16 / 25 Sebastian Pop Graphite: the polyhedral framework of GCC

Page 20: Graphite: the polyhedral framework of GCCpingali/CS380C/2013/lectures/...Graphite in GCC C C++ Java F95 Ada Asm GIMPLE Machine Description Front-ends Middle-end Back-end RTL GIMPLE

CLooG’s code generation from polyhedra

16 / 25 Sebastian Pop Graphite: the polyhedral framework of GCC

Page 21: Graphite: the polyhedral framework of GCCpingali/CS380C/2013/lectures/...Graphite in GCC C C++ Java F95 Ada Asm GIMPLE Machine Description Front-ends Middle-end Back-end RTL GIMPLE

CLooG’s code generation from polyhedra

16 / 25 Sebastian Pop Graphite: the polyhedral framework of GCC

Page 22: Graphite: the polyhedral framework of GCCpingali/CS380C/2013/lectures/...Graphite in GCC C C++ Java F95 Ada Asm GIMPLE Machine Description Front-ends Middle-end Back-end RTL GIMPLE

CLooG’s code generation from polyhedra

16 / 25 Sebastian Pop Graphite: the polyhedral framework of GCC

Page 23: Graphite: the polyhedral framework of GCCpingali/CS380C/2013/lectures/...Graphite in GCC C C++ Java F95 Ada Asm GIMPLE Machine Description Front-ends Middle-end Back-end RTL GIMPLE

Code generation details

I type of induction variables (IV): when a transform increasesthe number of iterations, the original IV type may not be largeenough to contain all the values of the new IV: use the scopcontext to get an approximation of the largest integer of thenew IV, then compute the smallest type that can represent IV

I to backup original code, use SESE versioning:

i f (0 ) {o r i g i n a l code ;

} e l s e {t r an s f o rmed code ;

}

I one could replace the 0 with a runtime condition that validatesadditional assumptions under which a transform is legal

17 / 25 Sebastian Pop Graphite: the polyhedral framework of GCC

Page 24: Graphite: the polyhedral framework of GCCpingali/CS380C/2013/lectures/...Graphite in GCC C C++ Java F95 Ada Asm GIMPLE Machine Description Front-ends Middle-end Back-end RTL GIMPLE

Examples

I strip mining

I interchange: improves spatial and temporal data locality

I loop blocking (tiling): strip mining + interchange

I loop flattening: removes loops, increases ILP, avoids bubblesin processors’ pipeline

18 / 25 Sebastian Pop Graphite: the polyhedral framework of GCC

Page 25: Graphite: the polyhedral framework of GCCpingali/CS380C/2013/lectures/...Graphite in GCC C C++ Java F95 Ada Asm GIMPLE Machine Description Front-ends Middle-end Back-end RTL GIMPLE

Loop blocking

I original loop nest:

f o r ( i = 0 ; i < 1000 ; i++)f o r ( j = 0 ; j < 1000 ; j++)

a [ i ] [ j ] = b [ i ] [ j ] + 1 ;

I strip mining (with strides of 64):

f o r ( s1 = 0 ; s1 <= 15 ; s1++)f o r ( i = 64∗ s1 ; i <= min (64∗ s1 + 63 , 999 ) ; i++)

f o r ( s3 = 0 ; s3 <= 15 ; s3++)f o r ( j = 64∗ s3 ; j <= min (64∗ s3 + 63 , 999 ) ; j++)

a [ i ] [ j ] = b [ i ] [ j ] + 1 ;

I interchange (for better data locality):

f o r ( s1 = 0 ; s1 <= 15 ; s1++)f o r ( s3 = 0 ; s3 <= 15 ; s3++)

f o r ( i = 64∗ s1 ; i <= min (64∗ s1 + 63 , 999 ) ; i++)f o r ( j = 64∗ s3 ; j <= min (64∗ s3 + 63 , 999 ) ; j++)

a [ i ] [ j ] = b [ i ] [ j ] + 1 ;

19 / 25 Sebastian Pop Graphite: the polyhedral framework of GCC

Page 26: Graphite: the polyhedral framework of GCCpingali/CS380C/2013/lectures/...Graphite in GCC C C++ Java F95 Ada Asm GIMPLE Machine Description Front-ends Middle-end Back-end RTL GIMPLE

Loop blocking

��������������������������������

��������������������������������

��������������������������������

����������������������������������������������������������������

����������������������������������������������������������������

��������������������������������

��������������������������������

����������������������������������������������������������������

����������������������������������������������������������������

��������������������������������

��������������������������������

��������������������������������

��������������������������������

��������������������������������

����������������������������������������������������������������

����������������������������������������������������������������

��������������������������������

��������������������������������

����������������������������������������������������������������

������������������������

������������������������

������������������������������������������������

a[i][j] = b[i][j] + 1;

for (i = 0; i < 1000; i++)

for (j = 0; j < 1000; j++)

a[i][j] = b[i][j] + 1;

for (j=64*s3;j<=min(64*s3+63,999);j++)

for (i=64*s1;i<=min(64*s1+63,999);i++)

for (s3=0;s3<=15;s3++)

for (s1=0;s1<=15;s1++)

j

i i

j

20 / 25 Sebastian Pop Graphite: the polyhedral framework of GCC

Page 27: Graphite: the polyhedral framework of GCCpingali/CS380C/2013/lectures/...Graphite in GCC C C++ Java F95 Ada Asm GIMPLE Machine Description Front-ends Middle-end Back-end RTL GIMPLE

Loop blocking

0

10

2

11

3

12

4

13

15

14

5

1

6

7

8

9

0

2

3

4

5

6

7*

( 8# )

1

a[i][j] = b[i][j] + 1;

for (i = 0; i < 1000; i++)

for (j = 0; j < 1000; j++)

a[i][j] = b[i][j] + 1;

for (j=64*s3;j<=min(64*s3+63,999);j++)

for (i=64*s1;i<=min(64*s1+63,999);i++)

for (s3=0;s3<=15;s3++)

for (s1=0;s1<=15;s1++)

21 / 25 Sebastian Pop Graphite: the polyhedral framework of GCC

Page 28: Graphite: the polyhedral framework of GCCpingali/CS380C/2013/lectures/...Graphite in GCC C C++ Java F95 Ada Asm GIMPLE Machine Description Front-ends Middle-end Back-end RTL GIMPLE

Loop flattening

I projection of a loop nest into one dimension (execution trace)

I multiplication of number of iterations for nested loops

I addition of number of iterations for sequential loops

I original loop nest:

f o r ( i = 0 ; i < 1000 ; i++)f o r ( j = 0 ; j < 1000 ; j++)

a [ i ] [ j ] = b [ i ] [ j ] + 1 ;

I loop flattening:

f o r ( t = 0 ; t < 1000 ∗ 1000 ; t++) {i = t / 1000 ;j = t % 1000 ;a [ i ] [ j ] = b [ i ] [ j ] + 1 ;

}

22 / 25 Sebastian Pop Graphite: the polyhedral framework of GCC

Page 29: Graphite: the polyhedral framework of GCCpingali/CS380C/2013/lectures/...Graphite in GCC C C++ Java F95 Ada Asm GIMPLE Machine Description Front-ends Middle-end Back-end RTL GIMPLE

Loop flattening

��������������������������������

��������������������������������

��������������������������������

����������������������������������������������������������������

����������������������������������������������������������������

��������������������������������

��������������������������������

����������������������������������������������������������������

����������������������������������������������������������������

��������������������������������

��������������������������������

��������������������������������

��������������������������������

��������������������������������

����������������������������������������������������������������

����������������������������������������������������������������

��������������������������������

��������������������������������

����������������������������������������������������������������

����������������������������������������������������������������

��������������������������������

��������������������������������

a[i][j] = b[i][j] + 1;

for (i = 0; i < 1000; i++)

for (j = 0; j < 1000; j++)

for (t=0;t<1000*1000;t++) {

i = t / 1000;

j = t % 1000;

a[i][j] = b[i][j] + 1;

}

j

i i

j

same iteration order: loop flattening is always legal

23 / 25 Sebastian Pop Graphite: the polyhedral framework of GCC

Page 30: Graphite: the polyhedral framework of GCCpingali/CS380C/2013/lectures/...Graphite in GCC C C++ Java F95 Ada Asm GIMPLE Machine Description Front-ends Middle-end Back-end RTL GIMPLE

Writing and reading the polyhedral representation

I for GSoC’10 Riyadh Baghdadi added -fgraphite-write and-fgraphite-read to read and write OpenSCoP to disk

I OpenSCoP format: complete polyhedral representation,supported by several other polyhedral tools.

I read and write of OpenSCoP allows development and use ofexternal components (Pluto, PoCC, Pace, etc.)

24 / 25 Sebastian Pop Graphite: the polyhedral framework of GCC

Page 31: Graphite: the polyhedral framework of GCCpingali/CS380C/2013/lectures/...Graphite in GCC C C++ Java F95 Ada Asm GIMPLE Machine Description Front-ends Middle-end Back-end RTL GIMPLE

Auto parallelization with Graphite

I for GSoC’09 Li Feng added -floop-parallelize-all that usesGraphite to tag parallel loops and code generate them usingthe autopar infrastructure of GCC on top of OpenMP runtime.

I OpenCL code generation: soon to be contributed to Graphite,see the GCCSummit’10 paper “GRAPHITE-OpenCL:Generate OpenCL Code from Parallel Loops” by AlexeyKravets, Alexander Monakov, and Andrey Belevantsev fromRussian Accademy of Science (ISPRAS).

I auto-vectorization on the polyhedral representation: still to beworked on . . .

25 / 25 Sebastian Pop Graphite: the polyhedral framework of GCC