Cse322, Programming Languages and Compilers. Lecture #16, May 31, 2007: data flow equations, available expressions, live variables, solving data flow equations, constant propagation, using associativity and commutativity, computing dominators.


Lecture #16, May 31, 2007
• Data flow equations
• Available expressions
• Live variables
• Solving data flow equations
• Constant propagation
• Using associativity and commutativity
• Computing dominators


Assignments

• Reminders
– Project #3 is now due Friday, June 7, 2007 at 5:00 PM
» In order to get the course graded, there will be no extensions
– Final exam will be Tuesday, June 12
• Possibility: next Tuesday, June 5, limited lecture, and I answer questions in class about the project.
• Project Three Info
– The template has been posted since Wed.

• I have finished grading project 1. But I am missing projects from several people.


Notes about using GCC as an assembler
• Since I last taught this course, 64-bit computers have become much more common.
• Compiling in native 64-bit isn't always compatible with the strategy we discussed earlier.
– gcc -m32 myfunc.s runtime.c

– I have successfully done this using a dosbox and gcc 3.2.3.  This is an emulator set up to emulate a 386 (can you say very old?) architecture.  The professor was using CYGWIN on a Windows machine, i.e., very similar to the dosbox.

– OK, that was easier than I thought it would be. Passing "-m32" does the trick for gcc on my x86_64 Ubuntu system, but only after installing 32-bit runtime libs (sudo apt-get install libc6-dev-i386). Fortunately, it looks like the 32-bit libs are installed on the linuxlab boxes.

• The underscore convention can be controlled using the following strategy.

– In the file Phase3.sml, use one of the following definitions to control whether the underscore appears:
– fun globalName nm = nm;
– fun globalName nm = "_"^nm;


Data-Flow Equations
• Compute certain sets for each basic block.
• Examples: set of

– available expressions when entering and leaving a block

– variable definitions (assignments) reaching and leaving a block

– live variables on block entry and exit

– etc.

• The sets are computed by solving some recursive equations on the flow graph.

• There are different equations for each kind of set.

• After computing the sets for each block, it is (relatively) easy to propagate the information to each point inside a block.


Available expressions
• Used to eliminate common subexpressions (inside basic blocks this was done using DAGs).
• Definitions: an expression is generated when it is computed and killed when any of its operands is changed.

• An expression is available at a point p if it has been generated but not killed before control reaches p.

• Inside a basic block, the control flow is sequential (no jumps) so it is easy to propagate Available set information.


Available Expressions (cont.)

Example

(1) a := b+c ; generate E1=b+c
(2) d := a*e ; generate E2=a*e
(3) b := c+d ; kill E1 (but not E2!), generate E3=c+d
(4) e := e+1 ; kill E2, generate and then kill E4=e+1

• Among the expressions generated inside this block, only E3 is available at the end of the block; E3 is generated by the block.
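The gen set of this block can be checked mechanically. The following is a minimal sketch, assuming a made-up representation in which a statement is a record of strings and an expression is a triple; the names `stmt`, `expr`, `uses` and `genAv` are illustrative and not part of the course code.

```sml
(* A statement "dst := l oper r"; all names are plain strings (an assumption). *)
type stmt = {dst : string, l : string, oper : string, r : string}
type expr = string * string * string        (* (l, oper, r) *)

(* Does expression (l, oper, r) mention variable v as an operand? *)
fun uses v ((l, _, r) : expr) = (l = v orelse r = v)

(* Walk the block front to back, carrying the expressions still available. *)
fun genAv (block : stmt list) : expr list =
  let
    fun step ({dst, l, oper, r}, avail) =
      let
        val e = (l, oper, r)
        (* generate e (dropping an older copy so it appears once) *)
        val avail' = List.filter (fn x => x <> e) avail @ [e]
      in
        (* then kill everything that has dst as an operand *)
        List.filter (fn x => not (uses dst x)) avail'
      end
  in
    List.foldl step [] block
  end

val block =
  [ {dst = "a", l = "b", oper = "+", r = "c"}
  , {dst = "d", l = "a", oper = "*", r = "e"}
  , {dst = "b", l = "c", oper = "+", r = "d"}
  , {dst = "e", l = "e", oper = "+", r = "1"} ]

val result = genAv block    (* only c+d survives to the end of the block *)
```

Generating before killing is what makes `e := e+1` produce and immediately kill E4, as on the slide.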


Available expressions (cont.)

• Example: suppose E=a+b is generated somewhere in the program

(1) b := 5 ;kill E

• E is killed by some instruction inside the block; we will say that E is killed by the block.

• Definitions:

• genAv[i] = the set of expressions generated by block i.
• killAv[i] = the set of expressions killed by block i.
• inAv[i] = the set of expressions available at the beginning of block i.
• outAv[i] = the set of expressions available at the end of block i.


Available expressions (cont.)

• Equations (forward-flow, all-path):

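• $\mathit{inAv}[i] = \bigcap_{k \in \mathit{pred}(i)} \mathit{outAv}[k]$
• $\mathit{outAv}[i] = \mathit{genAv}[i] \cup (\mathit{inAv}[i] - \mathit{killAv}[i])$
• $\mathit{inAv}[1] = \{\,\}$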

• At compile time we can only compute an approximation of the “real” (run-time) in and out.

• To be safe, in this case “approximation” means a “subset”.

• The above equations have many solutions (e.g., all empty) and they are all safe approximations.

• To be useful we want the largest solution.

[Figure: predecessor blocks k1 … kn flowing into block i]


Live variables (again)
• A variable is live at a certain point if its value at that point could be used later in the program; otherwise it is dead.

• usedlv[i] = the set of variables used before they are assigned a new value in block i.

• killlv[i] = the set of variables defined (assigned) in block i.

• inlv[i] = the set of live variables at the beginning of block i.

• outlv[i] = the set of live variables at the end of block i.


Live variables (cont.)

• Example

• The inlv-set of the following block is {b,c,e}.
(1) a := b+c ; use b,c; kill a; live: a,c,e
(2) d := a*e ; use a,e; kill d; live: c,d,e
(3) b := c+d ; use c,d; kill b; live: e
(4) e := e+1 ; use e; kill e; live: (none)
(5) end
Note: after (3), b is no longer live, since it will not be used again.
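The inlv-set of the block can be reproduced by walking the statements backwards. This is a small sketch, assuming each statement is given as a pair of the defined variable and the list of used variables; `liveBefore` and `inLv` are hypothetical helper names, and sets are modelled as duplicate-free lists.

```sml
(* stmt: (defined variable, variables used); an illustrative encoding *)
type stmt = string * string list

fun member x xs = List.exists (fn y => y = x) xs
fun union xs ys = xs @ List.filter (fn y => not (member y xs)) ys
fun remove x xs = List.filter (fn y => y <> x) xs

(* live-before = used ∪ (live-after − defined) *)
fun liveBefore ((def, used) : stmt, liveAfter) =
  union used (remove def liveAfter)

(* Fold from the last statement back to the first. *)
fun inLv (block : stmt list) (liveOut : string list) =
  List.foldr liveBefore liveOut block

val block =
  [ ("a", ["b", "c"])     (* (1) a := b+c *)
  , ("d", ["a", "e"])     (* (2) d := a*e *)
  , ("b", ["c", "d"])     (* (3) b := c+d *)
  , ("e", ["e"]) ]        (* (4) e := e+1 *)

(* Nothing is live after the block, matching the empty set after (4). *)
val result = inLv block []
```

With an empty live-out set the fold yields b, c and e, the inlv-set stated on the slide.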


Live variables (cont.)

Equations (backward-flow, any-path):

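• $\mathit{outlv}[i] = \bigcup_{k \in \mathit{succ}(i)} \mathit{inlv}[k]$
• $\mathit{inlv}[i] = \mathit{usedlv}[i] \cup (\mathit{outlv}[i] - \mathit{killlv}[i])$
• Note: there are no initial conditions.
• Here we want the approximations to be supersets of the run-time sets.
• To be useful we want the smallest solutions.

[Figure: block i flowing into successor blocks k1 … kn]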


Solving Data-Flow Equations
• Arbitrary control flow can be achieved by composing three structures:
– sequence (;)
– conditional (if-then-else)
– loop (while-do)

• Disadvantage: may lead to duplication of code.

• The flow graph of structured programs has special properties.

• Solving data-flow equations on these graphs can be reduced to computing attributes.

• In general a more complicated scheme is necessary.


Solving Data-Flow Equations (cont.)

• The general method for solving data-flow equations is by iteration (sometimes called fixpoint iteration).

• General algorithm:

1. Compute local information (i.e., gen/used, kill) at each basic block.

2. For each block i, assign in[i] and out[i] some initial values.

3. Use the recursive equations as assignment statements to compute new values for in[i] and out[i].

4. If the new values are the same as the old ones then stop, otherwise go to step 3.


Solving Data-Flow Equations (cont.)

• In general:

• To compute the smallest solution start with “small” initial values (e.g., empty sets); at each iteration obtain a larger new value.

• To compute the largest solution start with “large” initial values (e.g., “total” sets); at each iteration obtain a smaller new value.


Solving Data-Flow Equations (cont.)

• Example: live variables

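[Flow graph with four basic blocks; the edges were lost in extraction]
Block 1: (1) a := 0
Block 2: (2) b := a+1, (3) c := a+b
Block 3: (4) print a
Block 4: (5) d := c+1, (6) a := c+d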


Solving Data-Flow Equations (cont.)

• Local information:

kill[1] = {a} used[1] = { }

kill[2] = {b,c} used[2] = {a}

kill[3] = { } used[3] = {a}

kill[4] = {d,a} used[4] = {c}

• Initial values (want the smallest solution):

in[i] = out[i] = { }, i=1,2,3,4.


Solving Data-Flow Equations (cont.)

• Iteration 1:
out[i] = { }, i=1,2,3,4
in[1] = { }
in[2] = {a}
in[3] = {a}
in[4] = {c}

• Iteration 2:
out[1] = {a}
out[2] = {a,c}
out[3] = { }
out[4] = {a}
in[1] = { }
in[2] = {a}
in[3] = {a}
in[4] = {c}

• There will be no more changes

$\mathit{outlv}[i] = \bigcup_{k \in \mathit{succ}(i)} \mathit{inlv}[k]$
$\mathit{inlv}[i] = \mathit{usedlv}[i] \cup (\mathit{outlv}[i] - \mathit{killlv}[i])$

[Flow graph repeated: block 1 = (1); block 2 = (2),(3); block 3 = (4); block 4 = (5),(6)]
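The two iterations can be replayed with a short fixpoint loop. The sketch below is illustrative only: the successor edges (1→2, 2→3, 2→4, 4→2) are an assumption inferred from the out-sets listed in the iterations, since the drawing itself is not in the text, and the names `blocks`, `pass` and `solve` are made up for this example.

```sml
type block = {id : int, used : string list, kill : string list, succ : int list}

fun member x xs = List.exists (fn y => y = x) xs
fun union xs ys = xs @ List.filter (fn y => not (member y xs)) ys
fun diff xs ys = List.filter (fn x => not (member x ys)) xs
fun sameSet xs ys = null (diff xs ys) andalso null (diff ys xs)

(* Local information from the slide; the succ edges are an inferred assumption. *)
val blocks : block list =
  [ {id = 1, used = [],    kill = ["a"],      succ = [2]}
  , {id = 2, used = ["a"], kill = ["b", "c"], succ = [3, 4]}
  , {id = 3, used = ["a"], kill = [],         succ = []}
  , {id = 4, used = ["c"], kill = ["d", "a"], succ = [2]} ]

(* Current in-set of block i; unknown blocks count as empty. *)
fun lookup i (ins : (int * string list) list) =
  case List.find (fn (j, _) => j = i) ins of
      SOME (_, s) => s
    | NONE => []

(* One pass: out[i] from the old in-sets of the successors, then in[i]. *)
fun pass ins =
  List.map
    (fn ({id, used, kill, succ} : block) =>
       let val out = List.foldl (fn (k, acc) => union acc (lookup k ins)) [] succ
       in (id, union used (diff out kill), out) end)
    blocks

(* Repeat until no in-set changes; returns (id, in, out) triples. *)
fun solve ins =
  let
    val new = pass ins
    val ins' = List.map (fn (i, inS, _) => (i, inS)) new
    val stable = ListPair.all (fn ((_, a), (_, b)) => sameSet a b) (ins, ins')
  in
    if stable then new else solve ins'
  end

(* Smallest solution: start from empty sets. *)
val result = solve (List.map (fn ({id, ...} : block) => (id, [])) blocks)
```

Under the assumed edges the loop stabilizes on the second pass with the same in- and out-sets as Iteration 2.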


Solving Data-Flow Equations (cont.)

• Using the live variables information:

• b is not in out[2] so it is a local variable in block 2. It does not need to be preserved when control leaves block 2.

• d is local in block 4.


Constant Propagation

• Consider

x <- 0

y <- z * x

w <- y+4

q <- w-1

• We know quite a bit about what is going on, even before we execute the first statement.

x is 0

y is also 0 (why?)

w is 4

q is 3


Constant propagation as a dataflow problem

• General algorithm:

1. Compute local information at each basic block.

2. For each block i, assign in[i] and out[i] some initial values.

3. Use the recursive equations as assignment statements to compute new values for in[i] and out[i].

4. If the new values are the same as the old ones then stop, otherwise go to step 3.

• In constant propagation the blocks are individual statements in the IR

• The information is a constant for each variable (or unknown)

• Example information set = { (a,3), (b,4), (c,$\bot$) }, where $\bot$ means unknown or not a constant


Equations

• $\mathit{constants}(n) = \bigwedge_{p \in \mathit{preds}(n)} F_p(\mathit{constants}(p))$

• $\wedge$ is the pairwise meet of two information elements, $(a,x) \wedge (a,y)$

• And $F_p$ is a function specific to each instruction; it is explained two slides later.


The Meet

• $(a,x) \wedge (a,y)$ = if $x=y$ then $(a,x)$ else $(a,\bot)$

• Note that
– $(a,x) \wedge (a,\bot) = (a,\bot)$
– $(a,\bot) \wedge (a,y) = (a,\bot)$

• For a variable a, the meet of two constants is the constant if they are both the same constant, and unknown otherwise.
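As a rough illustration of the meet, the sketch below models "unknown" as `NONE` and a known constant as `SOME c`. The type `info` and the functions `meetVal` and `meet` are hypothetical names, and the sketch assumes both information sets list the same variables in the same order.

```sml
(* NONE plays the role of "unknown or not a constant". *)
type info = (string * int option) list

(* Meet of two values for the same variable. *)
fun meetVal (SOME x, SOME y) = if x = y then SOME x else NONE
  | meetVal _ = NONE

(* Pairwise meet of two information sets over the same variables, same order. *)
fun meet (p : info, q : info) : info =
  ListPair.map (fn ((a, x), (_, y)) => (a, meetVal (x, y))) (p, q)

val left   = [("a", SOME 3), ("b", SOME 4), ("c", NONE)]
val right  = [("a", SOME 3), ("b", SOME 5), ("c", SOME 1)]
val joined = meet (left, right)   (* a stays 3; b and c become unknown *)
```

Only `a` keeps its constant: `b` disagrees between the two sets and `c` is already unknown on one side.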


Instruction specific function Fp

Each instruction has its own Fp

• The Fp tracks the effect of that instruction on constants

• x <- y

$F_{x \leftarrow y}(p)$ = if $p$ has the form $\{(x,c_1),(y,c_2),\ldots\}$ then $(p - \{(x,c_1)\}) \cup \{(x,c_2)\}$

• x <- y `op` z

$F_{x \leftarrow y\ op\ z}(p)$ = if $p$ has the form $\{(x,c_1),(y,c_2),(z,c_3),\ldots\}$ then $(p - \{(x,c_1)\}) \cup \{(x, c_2\ op\ c_3)\}$
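The two transfer functions can be sketched directly over an association list. This is only an illustration under simplifying assumptions: variables are strings, constants are integers, `NONE` stands for unknown, and `lookup`, `update`, `fCopy` and `fBinop` are made-up names rather than the course's code.

```sml
type info = (string * int option) list

(* Value recorded for v; a variable not in the list is treated as unknown. *)
fun lookup v (p : info) =
  case List.find (fn (a, _) => a = v) p of
      SOME (_, c) => c
    | NONE => NONE

(* (p - {(v, old)}) ∪ {(v, c)} *)
fun update v c (p : info) : info =
  (v, c) :: List.filter (fn (a, _) => a <> v) p

(* F for x <- y *)
fun fCopy (x, y) p = update x (lookup y p) p

(* F for x <- y op z; f is the integer operation *)
fun fBinop (x, f, y, z) p =
  case (lookup y p, lookup z p) of
      (SOME c2, SOME c3) => update x (SOME (f (c2, c3))) p
    | _ => update x NONE p

val p0 = [("y", SOME 2), ("z", SOME 5)]
val p1 = fBinop ("x", op +, "y", "z") p0   (* x becomes 2 + 5 *)
val p2 = fCopy ("w", "q") p1               (* q is unknown, so w is too *)
```

When either operand is unknown the result is recorded as unknown, which is the case the "if p has the form" side condition leaves implicit.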


Operation specific rules

• If an operation has an identity element or an erasure (absorbing) element, then we can use special rules

• x <- n + 0
• x <- n * 0
• x <- n * 1


x <- 0 {(x,0)}

y <- z * x {(x,0),(y,z*0)}={(x,0),(y,0)}

w <- y+4 {(x,0),(y,0),(w,4)}

q <- w-1 {(x,0),(y,0),(w,4),(q,3)}

x <- z*3 {(y,0),(w,4),(q,3)}

Once the constants are computed, the code can be specialized.

x <- 0

y <- 0

w <- 4

q <- 3

x <- z*3


ML code
• Strategy
– Simplify IR expressions under a mapping of variables to constants.
– Build such a mapping by a data flow analysis.
– Rewrite each IR statement using constant information.
• Complication
– In IR1 constants are stored as strings.
– We need to convert back and forth to integers to do simplification.

(* BadConst must be declared before use; shown here in case it is not declared elsewhere *)
exception BadConst of string;

fun constInt "true" = 1
  | constInt "false" = 0
  | constInt s = (case Int.fromString s of
                    SOME n => n
                  | NONE => raise (BadConst s))

val toString = Int.toString;


Optimizing constructions

• We normally build an addition expression by

– plus x y = BINOP(ADD,x,y)

• Use knowledge of ADD and constants to perform some addition at IR.EXP build time.

fun plus(CONST(n,Int),CONST(m,Int))
      = CONST(toString(constInt n + constInt m),Int)
  | plus(CONST("0",Int),n) = n
  | plus(n,CONST("0",Int)) = n
  | plus(x,y) = BINOP(ADD,x,y)


SUB and MUL

fun minus(CONST(n,Int),CONST(m,Int))
      = CONST(toString(constInt n - constInt m),Int)
  | minus(n,CONST("0",Int)) = n
  | minus(x,y) = BINOP(SUB,x,y)

fun times(CONST(n,Int),CONST(m,Int))
      = CONST(toString(constInt n * constInt m),Int)
  | times(CONST("0",Int),n) = CONST("0",Int)
  | times(n,CONST("0",Int)) = CONST("0",Int)
  | times(CONST("1",Int),n) = n
  | times(n,CONST("1",Int)) = n
  | times(x,y) = BINOP(MUL,x,y)
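To see the optimizing constructors at work, here is a self-contained sketch. The datatypes `ty`, `binop` and `exp` are cut-down stand-ins for the course's IR (an assumption; the real IR1 has more constructors), and `plus` and `times` are repeated so the example runs on its own.

```sml
(* Minimal stand-in for the IR; the real IR1 datatype is richer. *)
datatype ty = Int | Bool
datatype binop = ADD | SUB | MUL
datatype exp = CONST of string * ty | VAR of string | BINOP of binop * exp * exp

exception BadConst of string
fun constInt s = case Int.fromString s of SOME n => n | NONE => raise BadConst s
val toString = Int.toString

fun plus (CONST (n, Int), CONST (m, Int)) =
      CONST (toString (constInt n + constInt m), Int)
  | plus (CONST ("0", Int), n) = n
  | plus (n, CONST ("0", Int)) = n
  | plus (x, y) = BINOP (ADD, x, y)

fun times (CONST (n, Int), CONST (m, Int)) =
      CONST (toString (constInt n * constInt m), Int)
  | times (CONST ("0", Int), n) = CONST ("0", Int)
  | times (n, CONST ("0", Int)) = CONST ("0", Int)
  | times (CONST ("1", Int), n) = n
  | times (n, CONST ("1", Int)) = n
  | times (x, y) = BINOP (MUL, x, y)

val e1 = plus (CONST ("2", Int), CONST ("3", Int))   (* folded to the constant 5 *)
val e2 = times (VAR "z", CONST ("0", Int))           (* erased to the constant 0 *)
val e3 = plus (VAR "y", CONST ("0", Int))            (* identity: just y *)
val e4 = plus (VAR "y", VAR "z")                     (* nothing known: a BINOP node *)
```

Only the last call builds a `BINOP` node; the others are simplified while the expression is being constructed.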


Strategy

• Rebuild every EXP by using the optimizing constructors.

fun simp env (BINOP(MUL,x,y))
      = times(simp env x, simp env y)
  | simp env (BINOP(ADD,x,y))
      = plus(simp env x, simp env y)
  | simp env (BINOP(SUB,x,y))
      = minus(simp env x, simp env y)
  | simp env (BINOP(m,x,y))
      = BINOP(m, simp env x, simp env y)
  | simp env x = x;


Using associativity and commutativity
• Consider
– V1 * (3 * (P2 * 6))
– P2 + 0 + T3 + 4
• The strategy on the previous page won't work. Why?
• But the equivalent expressions
– V1 * P2 * 3 * 6
– P2 + T3 + 0 + 4
will work.
• How do we arrange this?
• Flatten into a list
– [V1, 3, P2, 6]
– [P2, 0, T3, 4]
• Rearrange the list
– [V1, P2, 3, 6]
– [P2, T3, 0, 4]
• Then apply the optimizing constructors from right to left


Flattening to a list

fun flatADD (BINOP(ADD,x,y)) = flatADD x @ flatADD y
  | flatADD x = [x]

fun flatMUL (BINOP(MUL,x,y)) = flatMUL x @ flatMUL y
  | flatMUL x = [x]


Rearranging and optimizing

fun split xs =
  let fun constp (CONST(_,Int)) = true
        | constp x = false
  in (List.filter (Bool.not o constp) xs)
     @
     (List.filter constp xs)
  end;

fun useOper oper unit [x] = x
  | useOper oper unit [] = unit
  | useOper oper unit (x::xs)
      = oper(x, useOper oper unit xs);

Note the order of applying oper: the list is folded from the right, so the trailing constants are combined first.
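The whole pipeline (flatten, move constants to the end, fold from the right) can be traced on V1 * (3 * (P2 * 6)). The sketch below is self-contained under stated assumptions: the datatypes are a cut-down stand-in for the course IR, `times` is a shortened version of the optimizing constructor, variables are written `VAR "V1"` rather than with the real IR constructors, and `ONE` is assumed to be the constant 1.

```sml
(* Cut-down stand-in for the IR (an assumption for illustration). *)
datatype ty = Int
datatype binop = ADD | MUL
datatype exp = CONST of string * ty | VAR of string | BINOP of binop * exp * exp

(* Shortened optimizing constructor: fold constants, drop a trailing 1. *)
fun times (CONST (n, Int), CONST (m, Int)) =
      CONST (Int.toString (valOf (Int.fromString n) * valOf (Int.fromString m)), Int)
  | times (n, CONST ("1", Int)) = n
  | times (x, y) = BINOP (MUL, x, y)

fun flatMUL (BINOP (MUL, x, y)) = flatMUL x @ flatMUL y
  | flatMUL x = [x]

(* Non-constants first, constants last; relative order is preserved. *)
fun split xs =
  let fun constp (CONST _) = true
        | constp _ = false
  in List.filter (not o constp) xs @ List.filter constp xs end

fun useOper oper unit [x] = x
  | useOper oper unit [] = unit
  | useOper oper unit (x :: xs) = oper (x, useOper oper unit xs)

val ONE = CONST ("1", Int)

(* V1 * (3 * (P2 * 6)) *)
val e = BINOP (MUL, VAR "V1",
          BINOP (MUL, CONST ("3", Int),
            BINOP (MUL, VAR "P2", CONST ("6", Int))))

val flat   = flatMUL e                    (* V1, 3, P2, 6 *)
val sorted = split flat                   (* V1, P2, 3, 6 *)
val result = useOper times ONE sorted     (* V1 * (P2 * 18) *)
```

Because the constants end up adjacent at the tail of the list, the innermost call to `times` sees two constants and folds 3 * 6 into 18.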


Simplifying

fun simp env (BINOP(MUL,x,y)) =
      useOper times ONE (split (flatMUL (BINOP(MUL,simp env x,simp env y))))
  | simp env (BINOP(ADD,x,y)) =
      useOper plus ZERO (split (flatADD (BINOP(ADD,simp env x,simp env y))))
  | simp env (BINOP(SUB,x,y)) = minus(simp env x, simp env y)
  | simp env (BINOP(m,x,y)) = BINOP(m,simp env x, simp env y)
  | simp env (v as (VAR _ | TEMP _ | PARAM _)) =
      (case List.find (fn (a,b) => a=v) env of
         NONE => v
       | SOME(_,SOME m) => CONST(m,Int)
       | SOME(_,NONE) => v)
  | simp env x = x;

Note: we look variables up!


Constant propagation
• Create and propagate a mapping of variables bound to constants.
• Use the data flow equations
– $\mathit{constants}(n) = \bigwedge_{p \in \mathit{preds}(n)} F_p(\mathit{constants}(p))$
• In a sequence of straight-line code, like we have in a basic block, every node has exactly one predecessor, except the first, which has none.
• We will use a list of pairs to represent the constants function: (EXP * string option) list

fun overRide v m [] = [(v,m)]
  | overRide v m ((x,y)::zs) =
      if v=x then ((x,m)::zs)
      else (x,y)::overRide v m zs
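A quick trace shows how `overRide` behaves as an update on the association list. In this sketch plain strings stand in for the EXP keys (a simplification for illustration), and `overRide` is repeated so the example is self-contained.

```sml
fun overRide v m [] = [(v, m)]
  | overRide v m ((x, y) :: zs) =
      if v = x then (x, m) :: zs
      else (x, y) :: overRide v m zs

val info0 = []
val info1 = overRide "x" (SOME "0") info0   (* x is the constant 0 *)
val info2 = overRide "y" (SOME "0") info1   (* y is appended at the end *)
val info3 = overRide "x" NONE info2         (* x reassigned to something unknown *)
```

An existing entry is replaced in place, so each variable appears at most once and the list keeps the order in which variables were first assigned.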


4 interesting cases

fun constProp info [] = []
  | constProp info (x::xs) =
    (case x of
       MOVE(v as (VAR _ | TEMP _ | PARAM _), n as CONST(m,Int)) =>
         let val info2 = overRide v (SOME m) info
         in (x,info2) :: constProp info2 xs end
     | MOVE(v as (VAR _ | TEMP _ | PARAM _), u as (VAR _ | TEMP _ | PARAM _)) =>
         (case List.find (fn (a,b) => a=u) info of
            SOME(_,w) =>
              let val info2 = overRide v w info
              in (x,info2) :: constProp info2 xs end
          | NONE =>
              let val info2 = overRide v NONE info
              in (x,info2) :: constProp info2 xs end)
     | MOVE(v as (VAR _ | TEMP _ | PARAM _), u as BINOP(_,_,_)) =>
         (case simp info u of
            (new as CONST(m,Int)) =>
              let val info2 = overRide v (SOME m) info
              in (MOVE(v,new),info2) :: constProp info2 xs end
          | new =>
              let val info2 = overRide v NONE info
              in (MOVE(v,new),info2) :: constProp info2 xs end)
     | _ => (x,info) :: constProp info xs)


x <- 5

Move a constant into a variable

MOVE(v as (VAR _ | TEMP _ | PARAM _), n as CONST(m,Int)) =>
  let val info2 = overRide v (SOME m) info
  in (x,info2) :: constProp info2 xs end


x <- y
• Move a variable into another variable

MOVE(v as (VAR _ | TEMP _ | PARAM _), u as (VAR _ | TEMP _ | PARAM _)) =>
  (case List.find (fn (a,b) => a=u) info of
     SOME(_,w) =>
       let val info2 = overRide v w info
       in (x,info2) :: constProp info2 xs end
   | NONE =>
       let val info2 = overRide v NONE info
       in (x,info2) :: constProp info2 xs end)


x <- y + 3
• Move an expression into a variable

MOVE(v as (VAR _ | TEMP _ | PARAM _), u as BINOP(_,_,_)) =>
  (case simp info u of
     (new as CONST(m,Int)) =>
       let val info2 = overRide v (SOME m) info
       in (MOVE(v,new),info2) :: constProp info2 xs end
   | new =>
       let val info2 = overRide v NONE info
       in (MOVE(v,new),info2) :: constProp info2 xs end)


Example

T0 := 0

T1 := (V5 * T0)

T2 := T1 + 4

T3 := T2 - 1

T0 := (V5 * 3)

T0 := 0 % T0=0

T1 := 0 % T0=0, T1=0

T2 := 4 % T0=0, T1=0, T2=4

T3 := 3 % T0=0, T1=0, T2=4, T3=3

T0 := (V5 * 3) % T1=0, T2=4, T3=3