Optimizing Compilers, CISC 673, Spring 2011
Register Allocation
John Cavazos
University of Delaware, Computer & Information Sciences Department

The Memory Hierarchy

Higher = smaller, faster, more expensive, closer to CPU

  registers: 8 integer, 8 floating-point; 1-cycle latency
  L1 cache:  8K data & instructions; 2-cycle latency
  L2 cache:  512K; 7-cycle latency
  RAM:       1GB; 100-cycle latency
  Disk:      40 GB; 38,000,000-cycle latency (!)

Managing the Memory Hierarchy

Programmer view: only two levels of memory
  Main memory (stores & loads)
  Disk (file I/O)

Two things maintain this abstraction:
  Hardware
    Moves data between memory and caches
  Compiler
    Moves data between memory and registers

Overview

Introduction
Register Allocation
  Definition
  History
  Interference graphs
  Graph coloring
  Register spilling

Register Allocation: Definition

Register allocation assigns registers to values
  Candidate values:
    Variables
    Temporaries
    Large constants
  When needed, spill registers to memory

Important low-level optimization
  Registers are 2x – 7x faster than cache
  Judicious use = big performance improvements

Register Allocation: Complications

Explicit names
  Unlike all other levels of hierarchy

Scarce
  Small register files (set of all registers)
  Some reserved by operating system, virtual machine
    e.g., “FP”, “SP”…

Complicated
  Weird constraints, esp. on CISC architectures
  For example: non-orthogonality of instructions

History

As old as intermediate code
  Used in the original FORTRAN compiler (1950’s)

No breakthroughs until 1981!
  Chaitin invented register allocation scheme based on graph coloring
  Simple heuristic, works well in practice

Register Allocation Example

Consider this program with six variables:

  a := c + d
  e := a + b
  f := e - 1

with the assumption that a and e die after use

The register holding a can be “reused” after e := a + b
  Same with variable e

Can allocate a, e, and f all to one register (r1):

  r1 := r2 + r3
  r1 := r1 + r4
  r1 := r1 - 1

Basic Register Allocation Idea

Value in dead variable not needed for rest of the computation
  Register containing dead variable can be reused

Basic rule:
  Variables t1 and t2 can share the same register if at any point in the program at most one of t1 or t2 is live!

Algorithm: Part I

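Compute live variables for each point:

[Figure: control-flow graph of four basic blocks annotated with live-variable sets; the arrows were lost in extraction. The set on a block's header line holds on entry to the block; the set beside an instruction holds just after it.]

  Block 1                  {b,c,f}
    a := b + c             {a,c,f}
    d := -a                {c,d,f}
    e := d + f             {c,d,e,f}

  Block 2                  {c,e}
    f := 2 * e             {c,f}

  Block 3                  {c,d,e,f}
    b := d + e             {b,c,e,f}
    e := e - 1             {b,c,f}

  Block 4                  {c,f}
    b := f + c             {b}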
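The live sets above can be reproduced with a small backward dataflow pass. The following is a minimal sketch, not the course's implementation. The block names (`B1` to `B4`), the successor edges in `SUCC`, and the assumption that only `b` is live at exit are all inferred from the live sets, since the figure's arrows did not survive.

```python
import re

# Basic blocks of the running example.
BLOCKS = {
    "B1": ["a := b + c", "d := -a", "e := d + f"],
    "B2": ["f := 2 * e"],
    "B3": ["b := d + e", "e := e - 1"],
    "B4": ["b := f + c"],
}
# ASSUMPTION: edges inferred from the live sets, not taken from the slide.
SUCC = {"B1": ["B2", "B3"], "B2": ["B4"], "B3": ["B1"], "B4": []}
# ASSUMPTION: b is the only variable live when the code exits after B4.
EXIT_LIVE = {"B4": {"b"}}


def def_use(instr):
    """Split 'x := expr' into the defined name and the set of used names."""
    dst, expr = instr.split(" := ")
    return dst, set(re.findall(r"[a-z]\w*", expr))


def live_before(instrs, live_out):
    """Walk a block backwards; return the live set before each instruction."""
    live, result = set(live_out), []
    for instr in reversed(instrs):
        dst, uses = def_use(instr)
        live = (live - {dst}) | uses  # kill the definition, then add the uses
        result.append(set(live))
    return result[::-1]


def block_out(block, succ, exit_live, live_in):
    """Live-out of a block: union of its successors' live-in sets."""
    out = set(exit_live.get(block, set()))
    for s in succ[block]:
        out |= live_in[s]
    return out


def liveness(blocks, succ, exit_live):
    """Iterate to a fixed point; return (live_in, live_out) per block."""
    live_in = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            new_in = live_before(blocks[b], block_out(b, succ, exit_live, live_in))[0]
            if new_in != live_in[b]:
                live_in[b], changed = new_in, True
    live_out = {b: block_out(b, succ, exit_live, live_in) for b in blocks}
    return live_in, live_out


live_in, live_out = liveness(BLOCKS, SUCC, EXIT_LIVE)
```

Under these assumptions, `live_in["B1"]` is `{b, c, f}` and `live_out["B3"]` is also `{b, c, f}`, which is why `b` stays live after `e := e - 1`.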

Interference Graph

Two variables live simultaneously
  Cannot be allocated in the same register

Construct an interference graph (IG)
  Node for each variable
  Undirected edge between t1 and t2
    If live simultaneously at some point in the program

Two variables can be allocated to same register if no edge connects them

Interference Graph: Example

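For our example:

[Figure: interference graph with nodes a, b, c, d, e, f. Edges implied by the live sets below: a-c, a-f, b-c, b-e, b-f, c-d, c-e, c-f, d-e, d-f, e-f]

b and c cannot be in the same register
b and d can be in the same register

Live sets: {b,c,f} {a,c,f} {c,d,f} {c,d,e,f} {c,e} {b,c,e,f} {c,f} {b}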
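Building the interference graph from the live sets is mechanical: every pair of variables that appears together in some live set gets an edge. A minimal sketch, assuming the graph is stored as a dictionary of neighbor sets (the names `LIVE_SETS`, `interference_graph`, and `ig` are illustrative only):

```python
from itertools import combinations

# Live sets of the running example, as listed on the slide.
LIVE_SETS = [
    {"b", "c", "f"}, {"a", "c", "f"}, {"c", "d", "f"}, {"c", "d", "e", "f"},
    {"c", "e"}, {"b", "c", "e", "f"}, {"c", "f"}, {"b"},
]


def interference_graph(live_sets):
    """Return {variable: set of variables it interferes with}."""
    graph = {}
    for live in live_sets:
        for v in live:
            graph.setdefault(v, set())  # make sure isolated nodes exist too
        for u, v in combinations(sorted(live), 2):
            graph[u].add(v)  # undirected edge: record it in both directions
            graph[v].add(u)
    return graph


ig = interference_graph(LIVE_SETS)
```

Here `"c" in ig["b"]` holds (b and c interfere) while `"d" in ig["b"]` does not, matching the two statements on the slide.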

Graph Coloring

Graph coloring: assignment of colors to nodes
  Nodes connected by edge have different colors

Graph k-colorable = can be colored with k colors

Register Allocation Through Graph Coloring

In our problem, colors = registers
  We need to assign colors (registers) to graph nodes (variables)
  Let k = number of machine registers

If the IG is k-colorable, there’s a register assignment that uses no more than k registers

Color the following graph

What is the smallest k needed to color the graph?

[Figure: the example interference graph with nodes a, b, c, d, e, f]

Graph Coloring Example

Consider the example IG:

[Figure: the example IG with a register label on each node; the labels on the slide are r1, r2 (twice), r3 (twice), and r4]

There is no coloring with fewer than 4 colors
There are 4-colorings of this graph
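For a graph this small, both claims can be checked by exhaustive search. The sketch below is a brute-force check, not a practical allocator, since it tries every assignment. The edge set in `IG` is the one implied by the example's live sets, and `SLIDE_COLORING` is the assignment implied by the rewritten code on the following slide (a, e in r2; b, d in r3; c in r4; f in r1).

```python
from itertools import product

# Interference graph of the running example (edges implied by the live sets).
IG = {
    "a": {"c", "f"},
    "b": {"c", "e", "f"},
    "c": {"a", "b", "d", "e", "f"},
    "d": {"c", "e", "f"},
    "e": {"b", "c", "d", "f"},
    "f": {"a", "b", "c", "d", "e"},
}


def is_valid(graph, coloring):
    """A coloring is valid if no edge joins two nodes of the same color."""
    return all(coloring[u] != coloring[v] for u in graph for v in graph[u])


def find_coloring(graph, k):
    """Try every assignment of k colors; return a valid one or None."""
    nodes = sorted(graph)
    for colors in product(range(1, k + 1), repeat=len(nodes)):
        coloring = dict(zip(nodes, colors))
        if is_valid(graph, coloring):
            return coloring
    return None


# Register numbers implied by the rewritten code on the next slide.
SLIDE_COLORING = {"a": 2, "b": 3, "c": 4, "d": 3, "e": 2, "f": 1}
```

`find_coloring(IG, 3)` finds nothing because b, c, e, and f are pairwise connected and so need four distinct colors, while `find_coloring(IG, 4)` succeeds.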

Graph Coloring Example, Continued

Under this coloring the code becomes:

r2 := r3 + r4

r3 := -r2

r2 := r3 + r1

r1 := 2 * r2

r3 := r3 + r2

r2 := r2 - 1

r3 := r1 + r4
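Once a coloring is known, rewriting the code is a plain substitution of register names for variable names. A minimal sketch, assuming instructions are kept as strings and that every lowercase identifier is a variable (`assign_registers` and `REG` are illustrative names):

```python
import re

# The running example, in the order the instructions are listed.
CODE = [
    "a := b + c",
    "d := -a",
    "e := d + f",
    "f := 2 * e",
    "b := d + e",
    "e := e - 1",
    "b := f + c",
]

# Coloring read off the rewritten code above.
REG = {"a": "r2", "b": "r3", "c": "r4", "d": "r3", "e": "r2", "f": "r1"}


def assign_registers(code, reg):
    """Replace each variable name by its register in a single pass per line."""
    return [re.sub(r"\b[a-z]\w*\b", lambda m: reg[m.group(0)], line)
            for line in code]


allocated = assign_registers(CODE, REG)
```

With this mapping, `allocated` matches the seven register instructions listed above line for line.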

Computing Graph Colorings

How do we compute coloring for IG?

  NP-hard!
  For given # of registers, coloring may not exist

Solution
  Use heuristics

Graph Coloring Algorithm (Chaitin)

while G cannot be k-colored
  while graph G has node N with degree less than k
    Remove N and its edges from G and push N on a stack S
  end while
  if all nodes removed then graph is k-colorable
    while stack S contains node N
      Add N to graph G and assign it a color from k colors
    end while
  else graph G cannot be colored with k colors
    Simplify graph G choosing node N to spill and remove node
    (spill nodes chosen based on number of definitions and uses)
end while

Graph Coloring Heuristic

Observation: “degree < k rule”
  Reduce graph:
    Pick node t with < k neighbors in IG
    Eliminate t and its edges from IG
  If the resulting graph has a k-coloring, so does the original graph

Why?
  Let c1,…,cn be colors assigned to neighbors of t in reduced graph
  Since n < k, we can pick some color for t different from those of its neighbors

Graph Coloring Heuristic (cont’d)

Heuristic:
  Pick node t with fewer than k neighbors
  Put t on a stack and remove it from the IG
  Repeat until all nodes have been removed

Start assigning colors to nodes on the stack (starting with the last node added)
  At each step, pick color different from those assigned to already-colored neighbors
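The two phases of the heuristic (simplify, then assign colors in reverse order) can be sketched as follows. This is one possible rendering under stated assumptions, not the course's code: it breaks ties alphabetically, always picks the lowest free color, and simply gives up with `None` when every remaining node has k or more neighbors.

```python
# Interference graph of the running example (edges implied by the live sets).
IG = {
    "a": {"c", "f"},
    "b": {"c", "e", "f"},
    "c": {"a", "b", "d", "e", "f"},
    "d": {"c", "e", "f"},
    "e": {"b", "c", "d", "f"},
    "f": {"a", "b", "c", "d", "e"},
}


def color_by_simplification(graph, k):
    """Return {node: color in 1..k}, or None if simplification gets stuck."""
    work = {n: set(adj) for n, adj in graph.items()}  # mutable working copy
    stack = []
    while work:
        # Pick any node with fewer than k neighbors (alphabetical tie-break).
        node = next((n for n in sorted(work) if len(work[n]) < k), None)
        if node is None:
            return None  # every remaining node has k or more neighbors
        for m in work.pop(node):
            work[m].discard(node)  # remove the node's edges from the graph
        stack.append(node)

    coloring = {}
    while stack:
        node = stack.pop()  # last node removed is colored first
        used = {coloring[m] for m in graph[node] if m in coloring}
        # The degree < k rule guarantees at least one of the k colors is free.
        coloring[node] = min(c for c in range(1, k + 1) if c not in used)
    return coloring


four = color_by_simplification(IG, 4)
three = color_by_simplification(IG, 3)
```

With k = 4 the function returns a valid coloring; with k = 3 it removes a, finds no further node with fewer than 3 neighbors, and returns `None`. The removal order depends on the tie-break, so it need not match the order shown on the slides.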

Graph Coloring Example (1)

Start with the IG and with k = 4
  Try 4-coloring the graph

[Figure: the full IG with nodes a, b, c, d, e, f]

Stack: {}

Remove a and then d

Graph Coloring Example (2)

Now all nodes have fewer than 4 neighbors and can be removed: c, b, e, f

[Figure: IG after removing a and d; remaining nodes b, c, e, f]

Stack: {d, a}

Graph Coloring Example (3)

Start assigning colors to: f, e, b, c, d, a

[Figure: the full IG with a register label on each node; the labels on the slide are r1, r2 (twice), r3 (twice), and r4]

What if the Heuristic Fails?

What if during simplification we get to a state where all nodes have k or more neighbors?

Example: try to find a 3-coloring of the IG:

[Figure: the full IG with nodes a, b, c, d, e, f]

What if the Heuristic Fails?

Remove a and get stuck (as shown below)
Pick a node as a candidate for spilling
  Assume that f is picked

[Figure: IG after removing a; remaining nodes b, c, d, e, f, each with at least 3 neighbors]

What if the Heuristic Fails?

Remove f and continue the simplification
  Simplification now succeeds: b, d, e, c

[Figure: IG after removing a and f; remaining nodes b, c, d, e]

What if the Heuristic Fails?

During assignment phase, we get to the point when we have to assign a color to f

Hope: among the 4 neighbors of f, we use fewer than 3 colors (optimistic coloring)

[Figure: b, c, d, e colored with r1, r2, r3 (r3 appears twice); f is marked “?”]
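Optimistic coloring changes only what happens when simplification gets stuck: the spill candidate is pushed on the stack anyway, and the decision to spill is deferred until the assignment phase finds no free color. A minimal sketch, assuming a caller-supplied `pick_spill` function (hypothetical; here it is hard-wired to choose f, as the slides assume):

```python
# Interference graph of the running example (edges implied by the live sets).
IG = {
    "a": {"c", "f"},
    "b": {"c", "e", "f"},
    "c": {"a", "b", "d", "e", "f"},
    "d": {"c", "e", "f"},
    "e": {"b", "c", "d", "f"},
    "f": {"a", "b", "c", "d", "e"},
}


def optimistic_color(graph, k, pick_spill):
    """Return (coloring, uncolored): nodes in `uncolored` must really be spilled."""
    work = {n: set(adj) for n, adj in graph.items()}
    stack = []
    while work:
        node = next((n for n in sorted(work) if len(work[n]) < k), None)
        if node is None:
            node = pick_spill(work)  # stuck: push a spill candidate anyway
        for m in work.pop(node):
            work[m].discard(node)
        stack.append(node)

    coloring, uncolored = {}, []
    while stack:
        node = stack.pop()
        used = {coloring[m] for m in graph[node] if m in coloring}
        free = [c for c in range(1, k + 1) if c not in used]
        if free:
            coloring[node] = free[0]  # the optimistic hope worked out
        else:
            uncolored.append(node)    # neighbors already use all k colors
    return coloring, uncolored


coloring, uncolored = optimistic_color(IG, 3, pick_spill=lambda work: "f")
```

For the example with k = 3, `uncolored` ends up as `["f"]`: b, c, and e are pairwise connected, so the neighbors of f always use all three colors, whatever order the remaining nodes are removed in.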

Spilling

Optimistic coloring failed = must spill variable f

Allocate memory location as home of f
  Typically in current stack frame
  Call this address fa

Before each operation that uses f, insert
  f := load fa

After each operation that defines f, insert
  store f, fa

Spilling Example

New code after spilling f:

  a := b + c
  d := -a
  f := load fa
  e := d + f

  f := 2 * e
  store f, fa

  b := d + e
  e := e - 1

  f := load fa
  b := f + c
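The spill rewrite (a load before each use, a store after each definition) is easy to express over a list of instructions. A minimal sketch, assuming instructions are strings of the form `x := expr` and that lowercase identifiers are variables (`insert_spill_code` is an illustrative name):

```python
import re

# The running example before spilling, in listing order.
CODE = [
    "a := b + c",
    "d := -a",
    "e := d + f",
    "f := 2 * e",
    "b := d + e",
    "e := e - 1",
    "b := f + c",
]


def insert_spill_code(code, var, addr):
    """Insert 'var := load addr' before uses and 'store var, addr' after defs."""
    out = []
    for instr in code:
        dst, expr = instr.split(" := ")
        if var in re.findall(r"[a-z]\w*", expr):
            out.append(f"{var} := load {addr}")  # reload before the use
        out.append(instr)
        if dst == var:
            out.append(f"store {var}, {addr}")   # save right after the definition
    return out


spilled = insert_spill_code(CODE, "f", "fa")
```

Applied to the example with `var="f"` and `addr="fa"`, this yields the same ten instructions as the listing above.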

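Recomputing Liveness Information

New liveness information after spilling:

  a := b + c
  d := -a
  f := load fa
  e := d + f

  f := 2 * e
  store f, fa

  b := d + e
  e := e - 1

  f := load fa
  b := f + c

[Figure: the control-flow graph of the spilled code annotated with live sets. The sets recoverable from the slide are the original ones, {b,c,f} {a,c,f} {c,d,f} {c,d,e,f} {c,e} {b,c,e,f} {c,f} {b}, plus three new program points next to the inserted loads and store: {c,d,f}, {c,f}, {c,f}]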

Recomputing Liveness Information

New liveness info almost as before, but f is live only
  Between f := load fa and the next instruction
  Between store f, fa and the preceding instruction

Spilling reduces the live range of f
  Reduces its interferences
  Results in fewer neighbors in IG for f

Recompute IG After Spilling

Remove some edges of spilled node

Here, f still interferes only with c and d
  Resulting IG is 3-colorable

[Figure: recomputed IG over nodes a, b, c, d, e, f]
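Both claims can be checked by rebuilding the graph from the new live sets. The sketch below derives those sets the way the preceding slides describe: f is dropped from every original live set, and the three short ranges next to the loads and the store are added. The exhaustive search is only for checking a six-node example and is not how an allocator would color the graph.

```python
from itertools import combinations, product

# Original live sets of the running example.
OLD_LIVE = [
    {"b", "c", "f"}, {"a", "c", "f"}, {"c", "d", "f"}, {"c", "d", "e", "f"},
    {"c", "e"}, {"b", "c", "e", "f"}, {"c", "f"}, {"b"},
]
# After spilling, f is live only next to its loads and its store.
NEW_LIVE = [s - {"f"} for s in OLD_LIVE] + [{"c", "d", "f"}, {"c", "f"}, {"c", "f"}]


def interference_graph(live_sets):
    """Return {variable: set of variables live at the same point}."""
    graph = {}
    for live in live_sets:
        for v in live:
            graph.setdefault(v, set())
        for u, v in combinations(sorted(live), 2):
            graph[u].add(v)
            graph[v].add(u)
    return graph


def find_coloring(graph, k):
    """Exhaustively search for a valid k-coloring; return it or None."""
    nodes = sorted(graph)
    for colors in product(range(1, k + 1), repeat=len(nodes)):
        coloring = dict(zip(nodes, colors))
        if all(coloring[u] != coloring[v] for u in graph for v in graph[u]):
            return coloring
    return None


new_ig = interference_graph(NEW_LIVE)
three_coloring = find_coloring(new_ig, 3)
```

Here `new_ig["f"]` is `{c, d}` and `three_coloring` is not `None`, so three registers suffice once f has been spilled.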

Spilling, Continued

Additional spills might be required before coloring is found

Tricky part: deciding what to spill
  Possible heuristics:
    Spill variables with most conflicts
    Spill variables with few definitions and uses
    Avoid spilling in inner loops

All are “correct”
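The three heuristics are often folded into a single score. The sketch below is one such combination and is an assumption, not something the slides prescribe: the cost of spilling a variable is its number of definitions and uses, weighted by 10 to the power of its loop depth, divided by its degree in the IG, and the cheapest variable is spilled. The reference counts come from the running example; the loop depths are hypothetical.

```python
def pick_spill(degree, refs, loop_depth):
    """Return the variable with the lowest (weighted refs / degree) score."""
    def cost(v):
        # Few definitions/uses and many conflicts both lower the cost;
        # references inside loops raise it sharply.
        return refs[v] * 10 ** loop_depth[v] / degree[v]
    return min(sorted(degree), key=cost)


# State of the running example when 3-coloring gets stuck (a already removed).
DEGREE = {"b": 3, "c": 4, "d": 3, "e": 4, "f": 4}
# Definitions plus uses of each variable in the example code.
REFS = {"b": 3, "c": 2, "d": 3, "e": 5, "f": 3}
# HYPOTHETICAL loop depths: the slides give no loop structure.
FLAT = {v: 0 for v in DEGREE}
C_IN_LOOP = dict(FLAT, c=1)

choice_flat = pick_spill(DEGREE, REFS, FLAT)
choice_loop = pick_spill(DEGREE, REFS, C_IN_LOOP)
```

With all depths at 0 this score selects c, whereas the slides simply assume f is picked; if c is treated as sitting one loop deep, the same score selects f. Either choice yields correct code, and only the cost of the inserted loads and stores differs.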

Conclusion

Register allocation: “must have” optimization in most compilers:
  Intermediate code uses too many temporaries
  Makes a big difference in performance

Graph coloring:
  Powerful register allocation scheme