compilation 2007 code generation michael i. schwartzbach brics, university of aarhus

34
Compilation 2007 Compilation 2007 Code Generation Code Generation Michael I. Schwartzbach BRICS, University of Aarhus

Upload: hunter-trees

Post on 14-Dec-2015

219 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Compilation 2007 Code Generation Michael I. Schwartzbach BRICS, University of Aarhus

Compilation 2007Compilation 2007

Code GenerationCode Generation

Michael I. Schwartzbach

BRICS, University of Aarhus

Page 2: Compilation 2007 Code Generation Michael I. Schwartzbach BRICS, University of Aarhus

2Code Generation

Code Generation PhasesCode Generation Phases

Computing resources, such as:• layout of data structures• offsets• register allocation

Generating an internal representation of machine code for statements and expressions

Optimizing the generated code (ignored for now) Emitting the code to files in assembler format Assembling the emitted code to binary format

Page 3: Compilation 2007 Code Generation Michael I. Schwartzbach BRICS, University of Aarhus

3Code Generation

Joos Code GenerationJoos Code Generation

Compute offsets and signatures Generate code for static initializers Generate code for statements and expressions Optimize the generated code (ignored for now) Compute locals and stack limits Emit Jasmin code Assemble Jasmin code to class files

Page 4: Compilation 2007 Code Generation Michael I. Schwartzbach BRICS, University of Aarhus

4Code Generation

Computing OffsetsComputing Offsets

Each formal and local variables must have an offset in the stack frame

The this object always has offset 0 The naive solution:

• enumerate all formals and locals

The better solution:• reuse offsets for locals in disjoint scopes

The clever solution:• exploit liveness information• must still respect the runtime types of locals

Page 5: Compilation 2007 Code Generation Michael I. Schwartzbach BRICS, University of Aarhus

5Code Generation

Naive OffsetsNaive Offsets

public void m(int p , int q , Object r )

int x = 42;

int y ;

{ int z ;

z = 87;

}

{ boolean a ;

Object b ;

{ boolean b ;

int z ;

b = true;

boolean c ;

c = b && (x==87);

}

{ int y ;

y = x;

}

}

}

4

5

1 2 3

6

7

8

9

10

11

12

max = 12

Page 6: Compilation 2007 Code Generation Michael I. Schwartzbach BRICS, University of Aarhus

6Code Generation

Better OffsetsBetter Offsets

public void m(int p , int q , Object r )

int x = 42;

int y ;

{ int z ;

z = 87;

}

{ boolean a ;

Object b ;

{ boolean b ;

int z ;

b = true;

boolean c ;

c = b && (x==87);

}

{ int y ;

y = x;

}

}

}

4

5

1 2 3

6

6

7

8

9

10

8

max = 10

Page 7: Compilation 2007 Code Generation Michael I. Schwartzbach BRICS, University of Aarhus

7Code Generation

Clever OffsetsClever Offsets

public void m(int p , int q , Object r )

int x = 42;

int y ;

{ int z ;

z = 87;

}

{ boolean a ;

Object b ;

{ boolean b ;

int z ;

b = true;

boolean c ;

c = b && (x==87);

}

{ int y ;

y = x;

}

}

}

4

4

1 2 3

5

6

3

6

5

6

5

max = 6

Page 8: Compilation 2007 Code Generation Michael I. Schwartzbach BRICS, University of Aarhus

8Code Generation

The function sig(σ) encodes a type:

sig(void) = V sig(byte) = B

sig(short) = S sig(int) = I

sig(char) = C sig(boolean) = Z

sig(σ[]) = [desc(σ)

sig(C1.C2. ... .Ck) = C1/C2/.../Ck

desc(void) = V desc(byte) = B

desc(short) = S desc(int) = I

desc(char) = C desc(boolean) = Z

desc(σ[]) = [desc(σ)

desc(C1.C2. ... .Ck) = LC1/C2/.../Ck;

Computing Signatures (1/2)Computing Signatures (1/2)

Page 9: Compilation 2007 Code Generation Michael I. Schwartzbach BRICS, University of Aarhus

9Code Generation

Computing Signatures (2/2)Computing Signatures (2/2)

This extends to fields, methods, and constructors The field named x in class C:

sig(C)/x

The method σ m(σ1 x1, ..., σk xk) in class C:

sig(C)/m(desc(σ1)...desc(σk))desc(σ)

The constructor C(σ1 x1, ..., σk xk) in class C:

sig(C)/<init>(desc(σ1)...desc(σk))V

Page 10: Compilation 2007 Code Generation Michael I. Schwartzbach BRICS, University of Aarhus

10Code Generation

Static Initializers (1/2)Static Initializers (1/2)

Initialization of static fields is performed when the class is loaded by the JVM

All static fields are first given default values The code for initialization is written in a special

method with the name <clinit> Fields that are static final and constant

valued must then be initialized Finally, all other static fields are initialized

Page 11: Compilation 2007 Code Generation Michael I. Schwartzbach BRICS, University of Aarhus

11Code Generation

Static Initializers (2/2)Static Initializers (2/2)

public class A { public static int x = A.y+1; public static final int y = 42; public static void main(String[] args) { System.out.print(A.x); }} public class A {

public static int x = A.y+1; public static int y = 42; public static void main(String[] args) { System.out.print(A.x); }}

public class A { public static int x = A.y+1; public static final int y = A.fortytwo(); public static int fortytwo() { return 42; } public static void main(String[] args) { System.out.print(A.x); }}

43

1

1

Page 12: Compilation 2007 Code Generation Michael I. Schwartzbach BRICS, University of Aarhus

12Code Generation

Generating CodeGenerating Code

Each statement and expression generates a sequence of bytecodes

A code template shows how to generate bytecodes for a given language construct

The template ignores the surrounding context This yields a simple, recursive strategy for the

code generation

Page 13: Compilation 2007 Code Generation Michael I. Schwartzbach BRICS, University of Aarhus

13Code Generation

Code Template InvariantsCode Template Invariants

A statement and a void expression leaves the stack height unchanged

A non-void expression increases the stack height by one

This is a local property of each template

The generated code must be verifiable This is not a local property, since the verifier

performs a global static analysis

Page 14: Compilation 2007 Code Generation Michael I. Schwartzbach BRICS, University of Aarhus

14Code Generation

Code Templates (1/12)Code Templates (1/12)

if(E) S E ifeq false S false:

if(E) S1 else S2 E ifeq false S1

goto endif false: S2

endif: nop

while(E) S goto cond: loop:

S cond: E ifne loop

while(true) S loop: S goto loop

Page 15: Compilation 2007 Code Generation Michael I. Schwartzbach BRICS, University of Aarhus

15Code Generation

{ σ n; S } S

Code Templates (2/12)Code Templates (2/12)

E; E

{ σ n = E; S } E σstore offset(n) S

type(E) = void

E; E pop

type(E) ≠ void

throw E; E athrow

return E; E σreturn

return; return

σstore is eitheristore or astoredepending on σ

Page 16: Compilation 2007 Code Generation Michael I. Schwartzbach BRICS, University of Aarhus

16Code Generation

Code Templates (3/12)Code Templates (3/12)

new C(E1,...,Ek) @ δ new sig(C) dupE1

...Ek

invokespecial sig(δ)

this(E1,...,Ek) @ δ aload 0 E1

...Ek

invokespecial sig(δ)

@ δ indicates that δ is the corresponding resolved declaration

Page 17: Compilation 2007 Code Generation Michael I. Schwartzbach BRICS, University of Aarhus

17Code Generation

Code Templates (4/12)Code Templates (4/12)

super(E1,...,Ek) @ δ aload 0E1

...Ek

invokespecial sig(δ)aload 0I1

putfield sig(x1) desc(σ1)... aload 0In

putfield sig(xn) desc(σn)

The current class contains thenon-static field initializations:

σ1 x1 = I1;... σn xn = In;

Page 18: Compilation 2007 Code Generation Michael I. Schwartzbach BRICS, University of Aarhus

18Code Generation

Code Templates (5/12)Code Templates (5/12)

E.m(E1,...,Ek) @ δ E E1

...Ek

invokevirtual sig(δ)

E.m(E1,...,Ek) @ δ E E1

...Ek

invokeinterface sig(δ)

C.m(E1,...,Ek) @ δ E1

...Ek

invokestatic sig(δ)

sig(δ) is an interface

sig(δ) is a class

Page 19: Compilation 2007 Code Generation Michael I. Schwartzbach BRICS, University of Aarhus

19Code Generation

(char)E E i2c

E instanceof C E instanceof sig(C)

Code Templates (6/12)Code Templates (6/12)

(C)E E checkcast sig(C)

Page 20: Compilation 2007 Code Generation Michael I. Schwartzbach BRICS, University of Aarhus

20Code Generation

Code Templates (7/12)Code Templates (7/12)

this aload 0

n σload offset(n)

E.f E getfield sig(f) desc(type(E.f))

E1[E2] E1

E2

σaload

type(E1[E2]) = σ

type(n) = σ

C.f getstatic sig(f) desc(type(C.f))

σaload is either iaload, baload, saload, caload, or aaload depending on σ

Page 21: Compilation 2007 Code Generation Michael I. Schwartzbach BRICS, University of Aarhus

21Code Generation

Code Templates (8/12)Code Templates (8/12)

n = E E dupσstore offset(n)

E1.f = E2 E1

E2

dup_x1 putfield sig(f) desc(type(E1.f))

E1[E2] = E3 E1

E2

E3

dup_x2 σastore

type(E1[E2]) = σ

type(n) = σ

C.f = E Edup

putstatic sig(f) desc(type(C.f))

Page 22: Compilation 2007 Code Generation Michael I. Schwartzbach BRICS, University of Aarhus

22Code Generation

Code Templates (9/12)Code Templates (9/12)

new σ[E] Emultianewarray desc(σ) 1

E.length E arraylength

E.clone()

Einvokevirtual sig(type(E))/clone()Ljava/lang/Object;

Page 23: Compilation 2007 Code Generation Michael I. Schwartzbach BRICS, University of Aarhus

23Code Generation

Code Templates (10/12)Code Templates (10/12)

42 ldc_int 42

true ldc_int 1

null aconst_null

"abc" ldc_string "abc"

Page 24: Compilation 2007 Code Generation Michael I. Schwartzbach BRICS, University of Aarhus

24Code Generation

Code Templates (11/12)Code Templates (11/12)

E1 + E2 E1

E2

iaddtype(E1+E2) = int

E1 + E2 E1

E2

invokevirtual S/concat(LS;)LS;type(E1+E2) = String

S java/lang/String

- E E ineg

Page 25: Compilation 2007 Code Generation Michael I. Schwartzbach BRICS, University of Aarhus

25Code Generation

Code Templates (12/12)Code Templates (12/12)

E1 || E2 E1

dupifne firsttruepopE2

firsttrue:

E1 && E2 E1

dupifeq firstfalsepopE2

firstfalse:

Page 26: Compilation 2007 Code Generation Michael I. Schwartzbach BRICS, University of Aarhus

26Code Generation

Stack and Locals LimitsStack and Locals Limits

The generated code must explicitly state:• the maximal number of local and formal offsets• the maximal local stack height

This is used to determine the size of the frame

The locals limit is the maximal offset + (1 or 0) The stack limit is computed by a static analysis

Page 27: Compilation 2007 Code Generation Michael I. Schwartzbach BRICS, University of Aarhus

27Code Generation

Stack Limit AnalysisStack Limit Analysis

Consider the control flow graph of the bytecodes succ(Si) denotes the set of successor bytecodes

Δ(Si) denotes the change in stack height by Si

S0 denotes the first bytecode

For every bytecode Si we define the following integer-valued properties:• B[[Si]] denotes the stack height before Si

• A[[Si]] denotes the stack height after Si

Page 28: Compilation 2007 Code Generation Michael I. Schwartzbach BRICS, University of Aarhus

28Code Generation

Dataflow ConstraintsDataflow Constraints

B[[S0]] = 0

A[[Si]] = B[[Si]] + Δ(Si)

xsucc(Si): A[[Si]] = B[[x]]

A[[Si]] 0

These constraints must have a solution The stack limit is the largest value of any A[[Si]]

Page 29: Compilation 2007 Code Generation Michael I. Schwartzbach BRICS, University of Aarhus

29Code Generation

Jasmin Class Format (1/3)Jasmin Class Format (1/3)

The overall structure of a Jasmin file is:

.source sourcefile

.class modifiers name

.super sig(superclass)

.implements sig(interface)

.field modifiers desc(type)

constructors

methods

Page 30: Compilation 2007 Code Generation Michael I. Schwartzbach BRICS, University of Aarhus

30Code Generation

Jasmin Class Format (2/3)Jasmin Class Format (2/3)

The structure of a constructor is:

.method modifiers sig(constructor)

.throws sig(exception)

.limit stack stacklimit

.limit locals localslimit

bytecodes

.end method

Page 31: Compilation 2007 Code Generation Michael I. Schwartzbach BRICS, University of Aarhus

31Code Generation

Jasmin Class Format (3/3)Jasmin Class Format (3/3)

The structure of a method is:

.method modifiers sig(method)

.throws sig(exception)

.limit stack stacklimit

.limit locals localslimit

bytecodes

.end method

Page 32: Compilation 2007 Code Generation Michael I. Schwartzbach BRICS, University of Aarhus

32Code Generation

A Tiny ClassA Tiny Class

public class Foo {

public int y = 42;

public Foo(int z) {

y = y+z;

}

public String print(int n) {

if (n==0) return new Integer(y).toString();

else return new Foo(y).print(n-1);

}

}

Page 33: Compilation 2007 Code Generation Michael I. Schwartzbach BRICS, University of Aarhus

33Code Generation

The Generated CodeThe Generated Code

.source Foo.java

.class public Foo

.super java/lang/Object

.field public "y" I

.method public <init>(I)V

.limit stack 3

.limit locals 2aload_0invokespecial java/lang/Object/<init>()Vaload_0bipush 42putfield Foo/y Iaload_0aload_0getfield Foo/y Iiload_1iadddup_x1putfield Foo/y Ipopreturn.end method

.method public print(I)Ljava/lang/String;

.limit stack 3

.limit locals 2iload_1iconst_0if_icmpeq true2iconst_0goto end3true2:iconst_1end3:ifeq false0new java/lang/Integerdupaload_0getfield Foo/y Iinvokespecial java/lang/Integer/<init>(I)Vinvokevirtual java/lang/Integer/toString()Ljava/lang/String;areturnfalse0:new Foodupaload_0getfield Foo/y Iinvokespecial Foo/<init>(I)Viload_1iconst_1isubinvokevirtual Foo/print(I)Ljava/lang/String;areturn.end method

Page 34: Compilation 2007 Code Generation Michael I. Schwartzbach BRICS, University of Aarhus

34Code Generation

The Binary Class FileThe Binary Class File

cafe babe 0000 002e 001e 0c00 0a00 1901

0011 6a61 7661 2f6c 616e 672f 496e 7465

6765 7201 0010 6a61 7661 2f6c 616e 672f

4f62 6a65 6374 0900 0900 1701 0006 3c69

6e69 743e 0700 030c 0005 000d 0100 0346

6f6f 0700 0801 0005 7072 696e 740c 0016

001c 0a00 1d00 1501 0003 2829 5601 0004

436f 6465 0100 0179 0100 0a53 6f75 7263

6546 696c 6501 0001 4901 0004 2849 2956

0a00 0900 010a 0006 0007 0c00 0500 1201

0008 746f 5374 7269 6e67 0c00 0f00 1101

0008 466f 6f2e 6a61 7661 0100 1528 4929

4c6a 6176 612f 6c61 6e67 2f53 7472 696e

673b 0a00 0900 150a 001d 000b 0100 1428

294c 6a61 7661 2f6c 616e 672f 5374 7269

6e67 3b07 0002 0021 0009 0006 0000 0001

0001 000f 0011 0000 0002 0001 0005 0012

0001 000e 0000 0023 0003 0002 0000 0017

2ab7 0014 2a10 2ab5 0004 2a2a b400 041b

605a b500 0457 b100 0000 0000 0100 0a00

1900 0100 0e00 0000 3a00 0300 0200 0000

2e1b 039f 0007 03a7 0004 0499 0012 bb00

1d59 2ab4 0004 b700 0cb6 001b b0bb 0009

592a b400 04b7 001a 1b04 64b6 0013 b000

0000 0000 0100 1000 0000 0200 1800