compilation 2007 code generation michael i. schwartzbach brics, university of aarhus
TRANSCRIPT
Compilation 2007Compilation 2007
Code GenerationCode Generation
Michael I. Schwartzbach
BRICS, University of Aarhus
2Code Generation
Code Generation PhasesCode Generation Phases
Computing resources, such as:• layout of data structures• offsets• register allocation
Generating an internal representation of machine code for statements and expressions
Optimizing the generated code (ignored for now) Emitting the code to files in assembler format Assembling the emitted code to binary format
3Code Generation
Joos Code GenerationJoos Code Generation
Compute offsets and signatures Generate code for static initializers Generate code for statements and expressions Optimize the generated code (ignored for now) Compute locals and stack limits Emit Jasmin code Assemble Jasmin code to class files
4Code Generation
Computing OffsetsComputing Offsets
Each formal and local variables must have an offset in the stack frame
The this object always has offset 0 The naive solution:
• enumerate all formals and locals
The better solution:• reuse offsets for locals in disjoint scopes
The clever solution:• exploit liveness information• must still respect the runtime types of locals
5Code Generation
Naive OffsetsNaive Offsets
public void m(int p , int q , Object r )
int x = 42;
int y ;
{ int z ;
z = 87;
}
{ boolean a ;
Object b ;
{ boolean b ;
int z ;
b = true;
boolean c ;
c = b && (x==87);
}
{ int y ;
y = x;
}
}
}
4
5
1 2 3
6
7
8
9
10
11
12
max = 12
6Code Generation
Better OffsetsBetter Offsets
public void m(int p , int q , Object r )
int x = 42;
int y ;
{ int z ;
z = 87;
}
{ boolean a ;
Object b ;
{ boolean b ;
int z ;
b = true;
boolean c ;
c = b && (x==87);
}
{ int y ;
y = x;
}
}
}
4
5
1 2 3
6
6
7
8
9
10
8
max = 10
7Code Generation
Clever OffsetsClever Offsets
public void m(int p , int q , Object r )
int x = 42;
int y ;
{ int z ;
z = 87;
}
{ boolean a ;
Object b ;
{ boolean b ;
int z ;
b = true;
boolean c ;
c = b && (x==87);
}
{ int y ;
y = x;
}
}
}
4
4
1 2 3
5
6
3
6
5
6
5
max = 6
8Code Generation
The function sig(σ) encodes a type:
sig(void) = V sig(byte) = B
sig(short) = S sig(int) = I
sig(char) = C sig(boolean) = Z
sig(σ[]) = [desc(σ)
sig(C1.C2. ... .Ck) = C1/C2/.../Ck
desc(void) = V desc(byte) = B
desc(short) = S desc(int) = I
desc(char) = C desc(boolean) = Z
desc(σ[]) = [desc(σ)
desc(C1.C2. ... .Ck) = LC1/C2/.../Ck;
Computing Signatures (1/2)Computing Signatures (1/2)
9Code Generation
Computing Signatures (2/2)Computing Signatures (2/2)
This extends to fields, methods, and constructors The field named x in class C:
sig(C)/x
The method σ m(σ1 x1, ..., σk xk) in class C:
sig(C)/m(desc(σ1)...desc(σk))desc(σ)
The constructor C(σ1 x1, ..., σk xk) in class C:
sig(C)/<init>(desc(σ1)...desc(σk))V
10Code Generation
Static Initializers (1/2)Static Initializers (1/2)
Initialization of static fields is performed when the class is loaded by the JVM
All static fields are first given default values The code for initialization is written in a special
method with the name <clinit> Fields that are static final and constant
valued must then be initialized Finally, all other static fields are initialized
11Code Generation
Static Initializers (2/2)Static Initializers (2/2)
public class A { public static int x = A.y+1; public static final int y = 42; public static void main(String[] args) { System.out.print(A.x); }} public class A {
public static int x = A.y+1; public static int y = 42; public static void main(String[] args) { System.out.print(A.x); }}
public class A { public static int x = A.y+1; public static final int y = A.fortytwo(); public static int fortytwo() { return 42; } public static void main(String[] args) { System.out.print(A.x); }}
43
1
1
12Code Generation
Generating CodeGenerating Code
Each statement and expression generates a sequence of bytecodes
A code template shows how to generate bytecodes for a given language construct
The template ignores the surrounding context This yields a simple, recursive strategy for the
code generation
13Code Generation
Code Template InvariantsCode Template Invariants
A statement and a void expression leaves the stack height unchanged
A non-void expression increases the stack height by one
This is a local property of each template
The generated code must be verifiable This is not a local property, since the verifier
performs a global static analysis
14Code Generation
Code Templates (1/12)Code Templates (1/12)
if(E) S E ifeq false S false:
if(E) S1 else S2 E ifeq false S1
goto endif false: S2
endif: nop
while(E) S goto cond: loop:
S cond: E ifne loop
while(true) S loop: S goto loop
15Code Generation
{ σ n; S } S
Code Templates (2/12)Code Templates (2/12)
E; E
{ σ n = E; S } E σstore offset(n) S
type(E) = void
E; E pop
type(E) ≠ void
throw E; E athrow
return E; E σreturn
return; return
σstore is eitheristore or astoredepending on σ
16Code Generation
Code Templates (3/12)Code Templates (3/12)
new C(E1,...,Ek) @ δ new sig(C) dupE1
...Ek
invokespecial sig(δ)
this(E1,...,Ek) @ δ aload 0 E1
...Ek
invokespecial sig(δ)
@ δ indicates that δ is the corresponding resolved declaration
17Code Generation
Code Templates (4/12)Code Templates (4/12)
super(E1,...,Ek) @ δ aload 0E1
...Ek
invokespecial sig(δ)aload 0I1
putfield sig(x1) desc(σ1)... aload 0In
putfield sig(xn) desc(σn)
The current class contains thenon-static field initializations:
σ1 x1 = I1;... σn xn = In;
18Code Generation
Code Templates (5/12)Code Templates (5/12)
E.m(E1,...,Ek) @ δ E E1
...Ek
invokevirtual sig(δ)
E.m(E1,...,Ek) @ δ E E1
...Ek
invokeinterface sig(δ)
C.m(E1,...,Ek) @ δ E1
...Ek
invokestatic sig(δ)
sig(δ) is an interface
sig(δ) is a class
19Code Generation
(char)E E i2c
E instanceof C E instanceof sig(C)
Code Templates (6/12)Code Templates (6/12)
(C)E E checkcast sig(C)
20Code Generation
Code Templates (7/12)Code Templates (7/12)
this aload 0
n σload offset(n)
E.f E getfield sig(f) desc(type(E.f))
E1[E2] E1
E2
σaload
type(E1[E2]) = σ
type(n) = σ
C.f getstatic sig(f) desc(type(C.f))
σaload is either iaload, baload, saload, caload, or aaload depending on σ
21Code Generation
Code Templates (8/12)Code Templates (8/12)
n = E E dupσstore offset(n)
E1.f = E2 E1
E2
dup_x1 putfield sig(f) desc(type(E1.f))
E1[E2] = E3 E1
E2
E3
dup_x2 σastore
type(E1[E2]) = σ
type(n) = σ
C.f = E Edup
putstatic sig(f) desc(type(C.f))
22Code Generation
Code Templates (9/12)Code Templates (9/12)
new σ[E] Emultianewarray desc(σ) 1
E.length E arraylength
E.clone()
Einvokevirtual sig(type(E))/clone()Ljava/lang/Object;
23Code Generation
Code Templates (10/12)Code Templates (10/12)
42 ldc_int 42
true ldc_int 1
null aconst_null
"abc" ldc_string "abc"
24Code Generation
Code Templates (11/12)Code Templates (11/12)
E1 + E2 E1
E2
iaddtype(E1+E2) = int
E1 + E2 E1
E2
invokevirtual S/concat(LS;)LS;type(E1+E2) = String
S java/lang/String
- E E ineg
25Code Generation
Code Templates (12/12)Code Templates (12/12)
E1 || E2 E1
dupifne firsttruepopE2
firsttrue:
E1 && E2 E1
dupifeq firstfalsepopE2
firstfalse:
26Code Generation
Stack and Locals LimitsStack and Locals Limits
The generated code must explicitly state:• the maximal number of local and formal offsets• the maximal local stack height
This is used to determine the size of the frame
The locals limit is the maximal offset + (1 or 0) The stack limit is computed by a static analysis
27Code Generation
Stack Limit AnalysisStack Limit Analysis
Consider the control flow graph of the bytecodes succ(Si) denotes the set of successor bytecodes
Δ(Si) denotes the change in stack height by Si
S0 denotes the first bytecode
For every bytecode Si we define the following integer-valued properties:• B[[Si]] denotes the stack height before Si
• A[[Si]] denotes the stack height after Si
28Code Generation
Dataflow ConstraintsDataflow Constraints
B[[S0]] = 0
A[[Si]] = B[[Si]] + Δ(Si)
xsucc(Si): A[[Si]] = B[[x]]
A[[Si]] 0
These constraints must have a solution The stack limit is the largest value of any A[[Si]]
29Code Generation
Jasmin Class Format (1/3)Jasmin Class Format (1/3)
The overall structure of a Jasmin file is:
.source sourcefile
.class modifiers name
.super sig(superclass)
.implements sig(interface)
.field modifiers desc(type)
constructors
methods
30Code Generation
Jasmin Class Format (2/3)Jasmin Class Format (2/3)
The structure of a constructor is:
.method modifiers sig(constructor)
.throws sig(exception)
.limit stack stacklimit
.limit locals localslimit
bytecodes
.end method
31Code Generation
Jasmin Class Format (3/3)Jasmin Class Format (3/3)
The structure of a method is:
.method modifiers sig(method)
.throws sig(exception)
.limit stack stacklimit
.limit locals localslimit
bytecodes
.end method
32Code Generation
A Tiny ClassA Tiny Class
public class Foo {
public int y = 42;
public Foo(int z) {
y = y+z;
}
public String print(int n) {
if (n==0) return new Integer(y).toString();
else return new Foo(y).print(n-1);
}
}
33Code Generation
The Generated CodeThe Generated Code
.source Foo.java
.class public Foo
.super java/lang/Object
.field public "y" I
.method public <init>(I)V
.limit stack 3
.limit locals 2aload_0invokespecial java/lang/Object/<init>()Vaload_0bipush 42putfield Foo/y Iaload_0aload_0getfield Foo/y Iiload_1iadddup_x1putfield Foo/y Ipopreturn.end method
.method public print(I)Ljava/lang/String;
.limit stack 3
.limit locals 2iload_1iconst_0if_icmpeq true2iconst_0goto end3true2:iconst_1end3:ifeq false0new java/lang/Integerdupaload_0getfield Foo/y Iinvokespecial java/lang/Integer/<init>(I)Vinvokevirtual java/lang/Integer/toString()Ljava/lang/String;areturnfalse0:new Foodupaload_0getfield Foo/y Iinvokespecial Foo/<init>(I)Viload_1iconst_1isubinvokevirtual Foo/print(I)Ljava/lang/String;areturn.end method
34Code Generation
The Binary Class FileThe Binary Class File
cafe babe 0000 002e 001e 0c00 0a00 1901
0011 6a61 7661 2f6c 616e 672f 496e 7465
6765 7201 0010 6a61 7661 2f6c 616e 672f
4f62 6a65 6374 0900 0900 1701 0006 3c69
6e69 743e 0700 030c 0005 000d 0100 0346
6f6f 0700 0801 0005 7072 696e 740c 0016
001c 0a00 1d00 1501 0003 2829 5601 0004
436f 6465 0100 0179 0100 0a53 6f75 7263
6546 696c 6501 0001 4901 0004 2849 2956
0a00 0900 010a 0006 0007 0c00 0500 1201
0008 746f 5374 7269 6e67 0c00 0f00 1101
0008 466f 6f2e 6a61 7661 0100 1528 4929
4c6a 6176 612f 6c61 6e67 2f53 7472 696e
673b 0a00 0900 150a 001d 000b 0100 1428
294c 6a61 7661 2f6c 616e 672f 5374 7269
6e67 3b07 0002 0021 0009 0006 0000 0001
0001 000f 0011 0000 0002 0001 0005 0012
0001 000e 0000 0023 0003 0002 0000 0017
2ab7 0014 2a10 2ab5 0004 2a2a b400 041b
605a b500 0457 b100 0000 0000 0100 0a00
1900 0100 0e00 0000 3a00 0300 0200 0000
2e1b 039f 0007 03a7 0004 0499 0012 bb00
1d59 2ab4 0004 b700 0cb6 001b b0bb 0009
592a b400 04b7 001a 1b04 64b6 0013 b000
0000 0000 0100 1000 0000 0200 1800