Compilation and Program Analysis (#6) : Intermediate Representations: CFG, DAGs (Instruction Selection and Scheduling), SSA Laure Gonnord & Ludovic Henrio & Matthieu Moy Master 1, ENS de Lyon 2019-2020


https://compil-lyon.gitlabpages.inria.fr/compil-lyon/


Big picture

source code
↓
lexical + syntactic analysis + typing
↓
decorated AST
↓
code production (numerous phases)
↓
assembly language

LG/LH/MM (M1/DI-ENSL) Compilation and Program Analysis (#6): IRs 2019-2020


In context 1/2

In the last course we saw the need for a better data structure to propagate and infer information. We need:

• A data structure that helps us reason about the flow of the program.
• One that embeds our three-address code.

→ Control-Flow Graph.


In context 2/2

decorated AST
↓
IR construction
↓
Control-Flow Graph
↓
clever analyses / code generation
↓
assembly language


Control Flow Graph

1 Control Flow Graph

2 Basic Block DAGs, instruction selection/scheduling

3 SSA Control Flow Graph


Definitions

Definition (Basic Block): a basic block is a maximal sequence of (3-address RISCV) instructions without labels (except at the first instruction) and without jumps or calls (except at the last instruction).

Definition (CFG): a control-flow graph is a directed graph whose vertices are basic blocks; an edge B1 → B2 exists if B2 can immediately follow B1 in an execution.

→ Two optimisation levels: local (BB) and global (CFG).


An example 1/2

Let us consider the program:

int x,y;

if (x<4) y=7; else y=42;

x=10;

We have already generated the linear code for a large part of it.


An example 2/2

Linear code:

    li   temp3, 4
    li   temp2, 0
    geq  temp0, temp3, lbl0
    li   temp2, 1
lbl0:                          # if false, jump
    eq   temp2, zero, lelse1
    li   temp4, 7
    mv   temp1, temp4          # y gets 7
    jump lendif1
lelse1:
    li   temp4, 42
    mv   temp1, temp4          # y gets 42
lendif1:
    li   temp5, 10
    mv   temp0, temp5          # end

[Figure: the corresponding CFG, with six basic blocks: start (li temp3 4; li temp2 0; test temp0 >= temp3), an unlabelled block (li temp2 1), lbl0 (test on temp2), an unlabelled block (li temp4 7; mv temp1 temp4), lelse1 (li temp4 42; mv temp1 temp4) and lendif1 (li temp5 10; mv temp0 temp5).]


Identifying Basic Blocks (from 3 address code)

The first instruction of a basic block is called a leader.

We can identify leaders via these three properties:
1 The first instruction in the intermediate code is a leader.
2 Any instruction that is the target of a conditional or unconditional jump is a leader.
3 Any instruction that immediately follows a conditional or unconditional jump is a leader.

Once we have found the leaders, it is straightforward to find the basic blocks: for each leader, its basic block consists of the leader itself, plus all the instructions until the next leader.
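The three leader rules can be sketched in a few lines of Python. This is only an illustration under assumptions: the program is encoded as a list of (label, opcode, operands) triples, the set JUMPS of branch opcodes is modelled on the example listing, and the jump target is assumed to be the last operand. None of these names come from the course tooling.

```python
# Assumed branch opcodes (modelled on the example); the target is the last operand.
JUMPS = {"jump", "geq", "eq"}

def basic_blocks(prog):
    """prog: list of (label or None, opcode, operands).
    Returns the basic blocks as lists of instruction indices."""
    label_at = {lab: i for i, (lab, _, _) in enumerate(prog) if lab}
    leaders = {0}                                  # rule 1: first instruction
    for i, (_, op, args) in enumerate(prog):
        if op in JUMPS:
            leaders.add(label_at[args[-1]])        # rule 2: jump target
            if i + 1 < len(prog):
                leaders.add(i + 1)                 # rule 3: instruction after a jump
    starts = sorted(leaders)
    ends = starts[1:] + [len(prog)]
    return [list(range(s, e)) for s, e in zip(starts, ends)]

# The linear code of the if/else example, in the assumed encoding.
prog = [
    (None,      "li",   ("temp3", 4)),
    (None,      "li",   ("temp2", 0)),
    (None,      "geq",  ("temp0", "temp3", "lbl0")),
    (None,      "li",   ("temp2", 1)),
    ("lbl0",    "eq",   ("temp2", "zero", "lelse1")),
    (None,      "li",   ("temp4", 7)),
    (None,      "mv",   ("temp1", "temp4")),
    (None,      "jump", ("lendif1",)),
    ("lelse1",  "li",   ("temp4", 42)),
    (None,      "mv",   ("temp1", "temp4")),
    ("lendif1", "li",   ("temp5", 10)),
    (None,      "mv",   ("temp0", "temp5")),
]
blocks = basic_blocks(prog)
```

On this input the sketch yields six blocks, which matches the six nodes of the example CFG.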


Basic Block DAGs, instruction selection/scheduling


Big picture

Front-end → a CFG where nodes are basic blocks.

Basic blocks → DAGs that make common computations explicit.

u1 := c - d

u2 := b + u1

u3 := a * u2

u4 := u2 * u1

u5 := u3 + u4

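[Figure: the DAG of this block. The root + has two * children; one * has operands a and the shared + node, the other * has the shared + node and the shared - node; the + node has operands b and the - node; the - node has operands c and d. The DAG is covered by four instruction tiles: MULADD, MUL, ADD, SUB.]

→ Choose instructions (selection) and order them (scheduling).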
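As an illustration of how a basic block becomes a DAG, here is a minimal Python sketch. It assumes a block encoded as (dest, op, a, b) tuples and shares nodes by structural key (a simple form of value numbering); the function name and encoding are hypothetical, not the course's implementation.

```python
def build_dag(block):
    """block: list of (dest, op, a, b) three-address instructions.
    Returns (nodes, current): nodes[i] is ("leaf", name) or (op, left_id, right_id),
    and current maps each variable to the node holding its value."""
    nodes = []       # node id -> node description
    index = {}       # structural key -> node id, so identical computations are shared
    current = {}     # variable -> id of the node currently holding its value

    def node(key):
        if key not in index:
            index[key] = len(nodes)
            nodes.append(key)
        return index[key]

    def value_of(name):
        # A variable not yet defined in the block is an input, hence a leaf.
        return current[name] if name in current else node(("leaf", name))

    for dest, op, a, b in block:
        current[dest] = node((op, value_of(a), value_of(b)))
    return nodes, current

block = [
    ("u1", "-", "c", "d"),
    ("u2", "+", "b", "u1"),
    ("u3", "*", "a", "u2"),
    ("u4", "*", "u2", "u1"),
    ("u5", "+", "u3", "u4"),
]
nodes, current = build_dag(block)
```

For the block above this gives nine nodes (four leaves and five operators). Recomputing c - d into another variable would reuse the node of u1 instead of creating a new one.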


Instruction Selection

Selecting instructions is a DAG-partitioning problem. But what is the objective?

The best instructions:
• cover bigger parts of the computation;
• cause few memory accesses.

→ Assign a cost to each instruction, depending on its addressing mode.


Instruction Selection: an example

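Available tiles (pattern: instruction, cost c):
• + with two arbitrary operands: ADD (c=2)
• + with a constant (cte) operand: ADD (c=1)
• * with two arbitrary operands: MUL (c=2)
• * with a constant (cte) operand: MUL (c=1)
• + with a * as operand: MULADD (c=3)

What is the optimal instruction selection for the following tree?

+
├─ +
│  ├─ *
│  │  ├─ 1515
│  │  └─ a
│  └─ b
└─ 42

→ Finding a tiling of minimal cost is NP-complete (reduction from SAT).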


Tiling trees / DAGs, in practice

For tiling:
• There is an optimal algorithm for trees, based on dynamic programming.
• For DAGs we use heuristics (decomposition into a forest of trees, ...).

→ There is an abundant literature on the subject.
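The dynamic-programming idea for trees can be sketched as follows. This is a hedged illustration, not the algorithm from the literature in full: it uses the five tile costs of the example, and it assumes that leaves cost nothing, that operators are commutative, and that MULADD accepts any operands. The tile names ADDI and MULI are hypothetical labels for the "constant operand" variants of ADD and MUL.

```python
from functools import lru_cache

# Tile costs from the example; ADDI/MULI name the constant-operand tiles (hypothetical names).
COST = {"ADD": 2, "ADDI": 1, "MUL": 2, "MULI": 1, "MULADD": 3}

@lru_cache(maxsize=None)
def best(t):
    """Return (cost, tiles) of a cheapest tiling of tree t.
    Trees are ("var", name), ("cst", value), ("+", l, r) or ("*", l, r)."""
    kind = t[0]
    if kind in ("var", "cst"):
        return 0, ()                      # assumption: leaves are free
    _, left, right = t
    (cl, tl), (cr, tr) = best(left), best(right)
    general, imm = ("ADD", "ADDI") if kind == "+" else ("MUL", "MULI")
    # Candidate 1: the general two-operand tile on top of the best sub-tilings.
    candidates = [(COST[general] + cl + cr, tl + tr + (general,))]
    # Candidate 2: the constant-operand tile, if one child is a constant.
    for const, other in ((left, right), (right, left)):
        if const[0] == "cst":
            co, to = best(other)
            candidates.append((COST[imm] + co, to + (imm,)))
    # Candidate 3: MULADD covers a + together with a * child.
    if kind == "+":
        for mul, other in ((left, right), (right, left)):
            if mul[0] == "*":
                parts = [best(mul[1]), best(mul[2]), best(other)]
                cost = COST["MULADD"] + sum(c for c, _ in parts)
                tiles = sum((ts for _, ts in parts), ()) + ("MULADD",)
                candidates.append((cost, tiles))
    return min(candidates, key=lambda c: c[0])

tree = ("+", ("+", ("*", ("cst", 1515), ("var", "a")), ("var", "b")), ("cst", 42))
cost, tiles = best(tree)
```

Under these assumptions the example tree has a minimal cost of 4. Several tilings may reach the minimum; the sketch returns one of them.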


Instruction Scheduling, what for?

We want an evaluation order for the instructions we selected: choosing it is the job of instruction scheduling.

A schedule is a function θ that associates a logical date with each instruction. To be correct, it must respect data dependencies:

(S1) u1 := c - d
(S2) u2 := b + u1

implies θ(S1) < θ(S2).

→ How to choose among the many correct schedules? It depends on the target architecture.
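The correctness condition on θ can be checked mechanically. The sketch below is an illustration under assumptions: instructions are encoded as (dest, operands) pairs, only flow (read-after-write) dependencies are considered, and θ is given as a list of logical dates indexed by instruction. The function names are hypothetical.

```python
def dependencies(block):
    """Flow dependencies of a block: pairs (i, j) such that instruction j
    reads a variable whose latest definition is instruction i."""
    last_def = {}
    deps = set()
    for j, (dest, operands) in enumerate(block):
        for v in operands:
            if v in last_def:
                deps.add((last_def[v], j))
        last_def[dest] = j
    return deps

def is_correct(theta, deps):
    """theta[i] is the logical date of instruction i."""
    return all(theta[i] < theta[j] for i, j in deps)

block = [
    ("u1", ("c", "d")),    # u1 := c - d
    ("u2", ("b", "u1")),   # u2 := b + u1
    ("u3", ("a", "u2")),   # u3 := a * u2
    ("u4", ("u2", "u1")),  # u4 := u2 * u1
    ("u5", ("u3", "u4")),  # u5 := u3 + u4
]
deps = dependencies(block)
```

With this block, the original order and the order that swaps u3 and u4 are both correct schedules, whereas dating u2 before u1 is not.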


Architecture-dependent choices

The idea is to make the best use of the different resources of the machine:
• instruction parallelism: some machines have parallel units (subinstructions of a given instruction).
• prefetch: some machines have non-blocking loads/stores; we can run some instructions between a load and its use (hide latency!).
• pipeline.
• registers: see next slide.

(Sometimes these criteria are incompatible.)


Register use

Some schedules induce less register pressure than others.

→ How can we find a schedule with less register pressure?
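To make the effect of the order concrete, here is a small Python sketch that measures the peak number of simultaneously live temporaries of a schedule. The seven-instruction example (four loads and three additions) is made up for illustration. The sketch assumes that a destination may reuse the register of an operand that dies at the same instruction, and that a value that is never read stays live until the end.

```python
def max_pressure(schedule):
    """Peak number of simultaneously live temporaries for a given order.
    schedule: list of (dest, operands); operands name earlier dests."""
    last_use = {}
    for i, (_, operands) in enumerate(schedule):
        for v in operands:
            last_use[v] = i
    live, peak = set(), 0
    for i, (dest, operands) in enumerate(schedule):
        live -= {v for v in operands if last_use[v] == i}   # operands dying here
        live.add(dest)
        peak = max(peak, len(live))
    return peak

# Hypothetical block computing (t1 + t2) + (t3 + t4), where t1..t4 are loads.
t1, t2, t3, t4 = ("t1", ()), ("t2", ()), ("t3", ()), ("t4", ())
t5 = ("t5", ("t1", "t2"))
t6 = ("t6", ("t3", "t4"))
t7 = ("t7", ("t5", "t6"))

loads_first = [t1, t2, t3, t4, t5, t6, t7]
interleaved = [t1, t2, t5, t3, t4, t6, t7]
```

Both orders respect the data dependencies, but issuing all loads first keeps four values live at once, while interleaving loads and additions needs only three.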


Scheduling wrt register pressure

Result: the problem can be solved in linear time on trees, but is NP-complete on DAGs (Sethi, 1975).

→ Sethi-Ullman algorithm on trees, heuristics on DAGs.


Sethi-Ullman algorithm on trees

ρ(node) denotes the number of (pseudo-)registers necessary to compute a node:

\[
\rho(\text{leaf}) = 1
\]
\[
\rho(\mathit{op}(e_1, e_2)) =
\begin{cases}
\max\{\rho(e_1), \rho(e_2)\} & \text{if } \rho(e_1) \neq \rho(e_2) \\
\rho(e_1) + 1 & \text{otherwise}
\end{cases}
\]

(For non-balanced subtrees, the idea is to execute the one with the biggest ρ first, then the other branch, then the op. If the tree is balanced, we need an extra register.)

→ The code is then produced by a postfix tree traversal, visiting the biggest register consumers first.


Sethi-Ullman algorithm on trees - an example

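Expression tree, each node labelled with its ρ:

+ (ρ=3)
├─ * (ρ=2)
│  ├─ b (ρ=1)
│  └─ b (ρ=1)
└─ * (ρ=2)
   ├─ 4 (ρ=1)
   └─ * (ρ=2)
      ├─ a (ρ=1)
      └─ c (ρ=1)

Generated code (the figure tracks the temporaries tmp1 tmp2 tmp3 tmp4):

mul  tmp1, b, b
mul  tmp2, a, c
leti tmp3, 4
mul  tmp4, tmp2, tmp3
add  tmp5, tmp1, tmp4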
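The labelling ρ and the traversal order can be sketched in Python. This is a minimal illustration under assumptions: trees are nested tuples (op, left, right), variables are strings used directly as operands, integer constants are loaded with a hypothetical li pseudo-instruction, and ties between subtrees are broken in favour of the left one.

```python
from itertools import count

def rho(t):
    """Sethi-Ullman label: number of registers needed to evaluate tree t."""
    if not isinstance(t, tuple):          # leaf: variable name or constant
        return 1
    _, left, right = t
    rl, rr = rho(left), rho(right)
    return max(rl, rr) if rl != rr else rl + 1

def emit(t, code, fresh):
    """Append the instructions for t to code (post-order, larger rho first)
    and return the name of the operand holding its value."""
    if isinstance(t, str):                # variable: used directly (assumption)
        return t
    if isinstance(t, int):                # constant: loaded into a fresh temporary
        dst = f"tmp{next(fresh)}"
        code.append(("li", dst, t))
        return dst
    op, left, right = t
    if rho(left) >= rho(right):
        a = emit(left, code, fresh)
        b = emit(right, code, fresh)
    else:
        b = emit(right, code, fresh)
        a = emit(left, code, fresh)
    dst = f"tmp{next(fresh)}"
    code.append((op, dst, a, b))
    return dst

# b*b + 4*(a*c), the tree of the example.
tree = ("add", ("mul", "b", "b"), ("mul", 4, ("mul", "a", "c")))
code = []
emit(tree, code, count(1))
```

On the example tree, rho gives 3 at the root and the sketch computes b*b, then a*c, then the constant 4, then the two remaining operations, which is the same evaluation order as in the example.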


Conclusion (instruction selection/scheduling)

Plenty of other algorithms in the literature:
• Scheduling DAGs with heuristics, ...
• Scheduling loops (M2 course on advanced compilation).

Practical session:
• We have (nearly) no choice for the instructions in the RISCV ISA.
• Evaluating the impact of scheduling is a bit hard.

We won't implement any of the previous algorithms.


SSA Control Flow Graph


Credits

Source: http://homepages.dcc.ufmg.br/~fernando/classes/dcc888/ementa/slides/StaticSingleAssignment.pdf


The Static Single Assignment Form

• This name comes from the fact that each variable has only one definition site in the program.
• In other words, the entire program contains only one point where the variable is assigned a value.
• Were we talking about Single Dynamic Assignment, we would be saying that during the execution of the program, the variable is assigned only once.

Variable i has two static assignment sites: at L0 and at L4; thus, this program is not in Static Single Assignment form. Variable s also has two static definition sites. Variable x, on the other hand, has only one static definition site, at L2. Nevertheless, x may be assigned many times dynamically, i.e., during the execution of the program.


A first Example (Cytron 1991)

Each variable is assigned only once (Static Single Assignment form).

Original CFG:

x ← 5
x ← x - 3
x < 3 ?
  branch A: y ← 2 * x ; w ← y
  branch B: y ← x - 3
join:
w ← x - y
z ← x + y

SSA form:

x_1 ← 5
x_2 ← x_1 - 3
x_2 < 3 ?
  branch A: y_1 ← 2 * x_2 ; w_1 ← y_1
  branch B: y_2 ← x_2 - 3
join:
y_3 ← phi(y_1, y_2)
w_2 ← x_2 - y_3
z_1 ← x_2 + y_3
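The renaming step of SSA construction can be illustrated on straight-line code. The sketch below is deliberately partial: it renames variables inside a single basic block only and does not place phi functions at join points, which a real SSA construction must do. The encoding of instructions as (dest, tokens) pairs and the function name are assumptions made for the example.

```python
def rename_block(block, version=None):
    """Rename the variables of a straight-line block so that each one is
    assigned exactly once. block: list of (dest, tokens), where tokens is
    the right-hand side split into strings. Returns (new_block, version)."""
    version = dict(version or {})     # variable -> current version number
    out = []
    for dest, tokens in block:
        # Uses refer to the current version; undefined names and constants are kept.
        new_tokens = [f"{t}_{version[t]}" if t in version else t for t in tokens]
        version[dest] = version.get(dest, 0) + 1
        out.append((f"{dest}_{version[dest]}", new_tokens))
    return out, version

block = [
    ("x", ["5"]),
    ("x", ["x", "-", "3"]),
    ("y", ["2", "*", "x"]),
]
ssa, version = rename_block(block)
```

On this block the sketch produces x_1 ← 5, x_2 ← x_1 - 3 and y_1 ← 2 * x_2, which agrees with the first lines of the SSA form of the example.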


Pro/cons

- Another IR, with the cost of construction/deconstruction.

+ (Some) analyses/optimisations are easier to perform (like register allocation): http://homepages.dcc.ufmg.br/~fernando/classes/dcc888/ementa/slides/SSABasedRA.pdf
