![Page 1: Lecture 10 – Code Generation Eran Yahav 1 Reference: Dragon 8. MCD 4.2.4](https://reader030.vdocuments.us/reader030/viewer/2022032708/56649e6f5503460f94b6c89b/html5/thumbnails/1.jpg)
THEORY OF COMPILATION
Lecture 10 – Code Generation
Eran Yahav
Reference: Dragon 8. MCD 4.2.4
You are here
Figure (compiler pipeline): Source text (txt) → Lexical Analysis → Syntax Analysis (Parsing) → Semantic Analysis → Intermediate Representation (IR) → Code Generation → Executable code (exe)
Last Week: Runtime Part II
- Nested procedures
- Object layout
- Inheritance
- Multiple inheritance
Today
- Runtime checks
- Garbage collection
- Generating assembly code
Runtime checks
- Generate code that checks for attempted illegal operations
  - Null pointer check
    - MoveField, MoveArray, ArrayLength, VirtualCall
    - Reference arguments to library functions should not be null
  - Array bounds check
  - Array allocation size check
  - Division by zero
  - …
- If a check fails, jump to error handler code that prints a message and gracefully exits the program
Null pointer check

    # null pointer check
    cmp $0,%eax
    je labelNPE

    labelNPE:
        push $strNPE      # error message
        call __println
        push $1           # error code
        call __exit

Single generated handler for entire program
Array bounds check

    # array bounds check
    mov -4(%eax),%ebx     # ebx = length
    mov $0,%ecx           # ecx = index
    cmp %ecx,%ebx
    jle labelABE          # ebx <= ecx ?
    cmp $0,%ecx
    jl labelABE           # ecx < 0 ?

    labelABE:
        push $strABE      # error message
        call __println
        push $1           # error code
        call __exit

Single generated handler for entire program
Array allocation size check

    # array size check
    cmp $0,%eax           # eax == array size
    jle labelASE          # eax <= 0 ?

    labelASE:
        push $strASE      # error message
        call __println
        push $1           # error code
        call __exit

Single generated handler for entire program
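The three checks above share one pattern: a short inline test at each risky operation, plus one handler per error kind emitted once for the whole program. Below is a minimal sketch of how a code generator might emit them as AT&T-syntax text. The Python function names are hypothetical; the labels, message symbols, and the length-at-offset -4 layout are taken from the slides.

```python
def null_check(reg):
    """Inline test emitted before dereferencing the reference held in reg."""
    return [f"cmp $0,{reg}", "je labelNPE"]

def bounds_check(array_reg, index_reg, scratch="%ebx"):
    """Inline test for array[index]; assumes the length lives at -4(array)."""
    return [
        f"mov -4({array_reg}),{scratch}",  # scratch = length
        f"cmp {index_reg},{scratch}",
        "jle labelABE",                    # length <= index ?
        f"cmp $0,{index_reg}",
        "jl labelABE",                     # index < 0 ?
    ]

def size_check(size_reg):
    """Inline test emitted before allocating an array of size_reg elements."""
    return [f"cmp $0,{size_reg}", "jle labelASE"]

def error_handler(label, message_symbol):
    """One handler body: print the message, then exit with error code 1."""
    return [f"{label}:", f"push ${message_symbol}", "call __println",
            "push $1", "call __exit"]

def all_handlers():
    """Emitted once per program, shared by every inline check."""
    lines = []
    for label, msg in [("labelNPE", "strNPE"),
                       ("labelABE", "strABE"),
                       ("labelASE", "strASE")]:
        lines += error_handler(label, msg)
    return lines

if __name__ == "__main__":
    print("\n".join(bounds_check("%eax", "%ecx") + all_handlers()))
```

Each inline check costs only a few instructions at the use site, which is why the handler bodies are kept out of line and shared.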
Automatic Memory Management
- Automatically free memory when it is no longer needed
  - Not limited to OO programs; we show it here because it is prevalent in OO languages such as Java
  - Also used in functional languages
- Approximate reasoning about object liveness
  - Use reachability to approximate liveness
  - Assume reachable objects are live
  - Non-reachable objects are dead
- Three classical garbage collection techniques
  - Reference counting
  - Mark and sweep
  - Copying
GC using Reference Counting
- Add a reference-count field to every object
  - Counts how many references point to it
- When rc == 0 the object is non-reachable
  - Non-reachable => dead
  - Can be collected (deallocated)
Managing Reference Counts
- Each object has a reference count o.RC
- A newly allocated object o gets o.RC = 1
  - why?
- Write barrier for reference updates

    update(x, old, new) {
      old.RC--;
      new.RC++;
      if (old.RC == 0) collect(old);
    }

- collect(old) will decrement RC for all children and recursively collect objects whose RC reached 0.
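The write barrier and the recursive collect can be simulated in a few lines. This is a sketch under assumptions, not the lecture's implementation: the `Obj` class, the `release` helper (for a reference that simply goes away, such as a dying local variable), and the `freed` log are hypothetical names introduced for illustration. The sketch increments the new target before releasing the old one, which keeps the update safe when both are the same object.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.rc = 1        # slide: a newly allocated object gets RC = 1
        self.fields = {}   # field name -> referenced Obj (or None)

freed = []                 # names of collected objects, in collection order

def collect(obj):
    """Free obj, then drop the references it held to its children."""
    freed.append(obj.name)
    for child in obj.fields.values():
        if child is not None:
            release(child)
    obj.fields.clear()

def release(obj):
    """One reference to obj disappears; collect it if that was the last."""
    obj.rc -= 1
    if obj.rc == 0:
        collect(obj)

def update(holder, field, new):
    """Write barrier for holder.field = new."""
    old = holder.fields.get(field)
    if new is not None:
        new.rc += 1
    holder.fields[field] = new
    if old is not None:
        release(old)

root = Obj("root")
a, b = Obj("a"), Obj("b")
update(root, "x", a)      # a.rc == 2 (allocation reference + root.x)
update(a, "y", b)         # b.rc == 2
release(a); release(b)    # the allocating locals go away: 1 each
update(root, "x", None)   # a drops to 0 and is collected, then b
print(freed)
```

In this reading, the initial count of 1 stands for the reference that the allocation itself hands back to the program.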
Cycles!
- Reference counting cannot identify non-reachable cycles
  - Reference counts for nodes on the cycle will never decrement to 0
- Several approaches for dealing with cycles
  - Ignore
  - Periodically invoke a tracing algorithm to collect cycles
  - Specialized algorithms for collecting cycles
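A two-node example makes the problem concrete. The sketch below uses a hypothetical `Node` class and `point` helper; it only tracks counts and frees nothing.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.rc = 0
        self.next = None

def point(src, dst):
    """Make src.next refer to dst and count the new reference."""
    src.next = dst
    dst.rc += 1

a, b = Node("a"), Node("b")
a.rc += 1          # one reference from a root (e.g. a local variable)
point(a, b)        # a -> b
point(b, a)        # b -> a closes the cycle
a.rc -= 1          # the root reference is dropped

# Neither node is reachable from a root any more, yet both counts stay at 1.
counts = (a.rc, b.rc)
print(counts)
```

Because each node keeps the other's count above zero, a pure reference-counting collector never reclaims either of them.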
GC Using Mark & Sweep
- Marking phase
  - Mark roots
  - Trace all objects transitively reachable from roots
  - Mark every traversed object
- Sweep phase
  - Scan all objects in the heap
  - Collect all unmarked objects
GC Using Mark & Sweep

    mark_sweep() {
      for Ptr in Roots
        mark(Ptr)
      sweep()
    }

    mark(Obj) {
      if mark_bit(Obj) == unmarked {
        mark_bit(Obj) = marked
        for C in Children(Obj)
          mark(C)
      }
    }

    sweep() {
      p = Heap_bottom
      while (p < Heap_top) {
        if (mark_bit(p) == unmarked) then free(p)
        else mark_bit(p) = unmarked
        p = p + size(p)
      }
    }
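The pseudocode can be tried out on a toy heap. In this sketch the heap is a plain Python list standing in for the address-ordered scan `p = p + size(p)`, and "freeing" an object means leaving it out of the returned list. The `Obj` class and the object graph are assumptions made for illustration.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.marked = False
        self.children = []

def mark(obj):
    """Mark obj and everything transitively reachable from it."""
    if not obj.marked:
        obj.marked = True
        for child in obj.children:
            mark(child)

def sweep(heap):
    """Keep marked objects (resetting their mark bit), drop the rest."""
    survivors = []
    for obj in heap:
        if obj.marked:
            obj.marked = False      # ready for the next collection
            survivors.append(obj)
    return survivors

def mark_sweep(roots, heap):
    for root in roots:
        mark(root)
    return sweep(heap)

A, B, C, D, E = (Obj(n) for n in "ABCDE")
A.children = [C]
B.children = [D]
D.children = [B]                    # B and D form an unreachable cycle
heap = mark_sweep(roots=[A], heap=[A, B, C, D, E])
print([o.name for o in heap])
```

Unlike reference counting, the tracing pass reclaims the B/D cycle, because nothing reachable from the roots points into it.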
Copying GC
- Partition the heap into two parts: old space, new space
- GC
  - Copy all reachable objects from old space to new space
  - Swap roles of old/new space
Example
Figure (before copying): old and new spaces, Roots, and objects A, B, C, D, E in the old space.
Figure (after copying): the same heap, with copies of A and C in the new space.
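A minimal sketch of the copy step follows. The object names match the example slides, but the edges between objects (A and C pointing at each other, D pointing at E) are assumed, since the figure's arrows are not recoverable. The `forward` field is a forwarding pointer, a standard device that ensures each object is copied once and that cycles terminate; the `Obj` class and function names are hypothetical.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.children = []
        self.forward = None     # set once the object has been copied

def copy(obj, new_space):
    """Copy obj and everything reachable from it; return the new copy."""
    if obj.forward is None:
        clone = Obj(obj.name)
        obj.forward = clone     # record before recursing, so cycles stop here
        new_space.append(clone)
        clone.children = [copy(c, new_space) for c in obj.children]
    return obj.forward

def collect(roots):
    """Return (new roots, new space); the old space can then be reused."""
    new_space = []
    new_roots = [copy(r, new_space) for r in roots]
    return new_roots, new_space

A, B, C, D, E = (Obj(n) for n in "ABCDE")
A.children = [C]
C.children = [A]                # assumed edge, to exercise the cycle case
D.children = [E]                # assumed edge; D and E are unreachable
new_roots, new_space = collect([A])
print([o.name for o in new_space])
```

Only live objects are touched, so the cost of a collection depends on the amount of reachable data rather than on the heap size.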
Summary
How objects are organized in memory
Automatic management of memory
Coming up… Generating assembly code
target languages
Figure: Code Gen. takes IR + Symbol Table as input and emits one of three target forms: absolute machine code, relative machine code, or assembly.
From IR to ASM: Challenges
- Mapping IR to ASM operations
  - What instruction(s) should be used to implement an IR operation?
  - How do we translate code sequences?
- Call/return of routines
  - Managing activation records
- Memory allocation
- Register allocation
- Optimizations
Intel IA-32 Assembly
- Going from Assembly to Binary…
  - Assembling
  - Linking
- AT&T syntax vs. Intel syntax
- We will use AT&T syntax
  - Matches GNU assembler (GAS)
IA-32 Registers
- Eight 32-bit general-purpose registers
  - EAX – accumulator for operands and result data. Used to return value from function calls.
  - EBX – pointer to data. Often used as array-base address
  - ECX – counter for string and loop operations
  - EDX – I/O pointer (GP for us)
  - ESI – GP and source pointer for string operations
  - EDI – GP and destination pointer for string operations
  - EBP – stack frame (base) pointer
  - ESP – stack pointer
- EFLAGS register
- EIP (instruction pointer) register
- Six 16-bit segment registers
- … (ignore the rest for our purposes)
Not all registers are born equal
- EAX
  - Required operand of MUL, IMUL, DIV and IDIV instructions
  - Contains the result of these operations
- EDX
  - Stores remainder of a DIV or IDIV instruction (EAX stores quotient)
- ESI, EDI
  - ESI – required source pointer for string instructions
  - EDI – required destination pointer for string instructions
- Destination registers of arithmetic operations
  - EAX, EBX, ECX, EDX
- EBP – stack frame (base) pointer
- ESP – stack pointer
IA-32 Addressing Modes
- Machine instructions take zero or more operands
- Source operand
  - Immediate
  - Register
  - Memory location
  - (I/O port)
- Destination operand
  - Register
  - Memory location
  - (I/O port)
Immediate and Register Operands
- Immediate
  - Value specified in the instruction itself
  - GAS syntax – immediate values preceded by $
  - add $4, %esp
- Register
  - Register name is used
  - GAS syntax – register names preceded with %
  - mov %esp,%ebp
Memory and Base Displacement Operands
- Memory operands
  - Value at given address
  - GAS syntax – parentheses
  - mov (%eax), %eax
- Base displacement
  - Value at computed address
  - Address computed out of base register, index register, scale factor, displacement
  - offset = base + (index*scale) + displacement
  - Syntax: disp(base,index,scale)
  - movl $42, 2(%eax)
  - movl $42, 1(%eax,%ecx,4)
Base Displacement Addressing

    mov (%ecx,%ebx,4), %eax

Figure (array base reference): an array of 4-byte elements at indices 0 to 7; %ecx = base, %ebx = 3, so (%ecx,%ebx,4) refers to the element at index 3.

offset = base + (index*scale) + displacement
offset = base + (3*4) + 0 = base + 12
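The address arithmetic above is easy to check numerically. This small sketch mirrors the `disp(base,index,scale)` operand form; the function name and the base address `0x1000` are hypothetical values chosen for illustration.

```python
def effective_address(base, index=0, scale=1, disp=0):
    """Address denoted by the AT&T operand disp(base,index,scale)."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("IA-32 scale factor must be 1, 2, 4 or 8")
    return base + index * scale + disp

ARRAY_BASE = 0x1000   # hypothetical value held in %ecx

# (%ecx,%ebx,4) with %ebx = 3: element 3 of an array of 4-byte elements
addr = effective_address(ARRAY_BASE, index=3, scale=4)
print(hex(addr))

# 2(%eax) with a hypothetical %eax = 0x2000: base plus displacement only
print(hex(effective_address(0x2000, disp=2)))
```

The scale factor matches the element size, so the index register can hold a plain array index rather than a byte offset.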
How do we generate the code?
- Break the IR into basic blocks
- A basic block is a sequence of instructions with
  - Single entry (to first instruction), no jumps to the middle of the block
  - Single exit (last instruction)
  - Code executes as a sequence from first instruction to last instruction without any jumps
- There is an edge from one basic block B1 to another block B2 when the last statement of B1 may jump to B2
Example
Figure (control flow graph): B1 branches to B3 on True and to B2 on False; B2 and B3 both continue to B4.

    B1: t1 := 4 * i
        t2 := a[t1]
        if t2 <= 20 goto B3

    B2: t5 := t2 * t4
        t6 := prod + t5
        prod := t6
        goto B4

    B3: t3 := 4 * i
        t4 := b[t3]
        goto B4

    B4: t7 := i + 1
        i := t2
        goto B5
creating basic blocks
- Input: A sequence of three-address statements
- Output: A list of basic blocks with each three-address statement in exactly one block
- Method
  - Determine the set of leaders (first statement of a block)
    - The first statement is a leader
    - Any statement that is the target of a conditional or unconditional jump is a leader
    - Any statement that immediately follows a goto or conditional jump statement is a leader
  - For each leader, its basic block consists of the leader and all statements up to but not including the next leader or the end of the program
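The leader rules translate almost directly into code. The sketch below assumes statements are plain strings numbered from 1 and that jumps are written `goto (N)`, as in the 17-statement example that appears later in these slides; the function names are hypothetical.

```python
import re

PROGRAM = [
    "i = 1", "j = 1", "t1 = 10*i", "t2 = t1 + j", "t3 = 8*t2",
    "t4 = t3-88", "a[t4] = 0.0", "j = j + 1", "if j <= 10 goto (3)",
    "i = i+1", "if i <= 10 goto (2)", "i = 1", "t5 = i-1",
    "t6 = 88*t5", "a[t6] = 1.0", "i = i+1", "if i <= 10 goto (13)",
]

def find_leaders(stmts):
    """Return the sorted statement numbers (1-based) that start a block."""
    leaders = {1}                                   # first statement
    for number, text in enumerate(stmts, start=1):
        match = re.search(r"goto \((\d+)\)", text)
        if match:
            leaders.add(int(match.group(1)))        # target of a jump
            if number < len(stmts):
                leaders.add(number + 1)             # statement after a jump
    return sorted(leaders)

def basic_blocks(stmts):
    """Each block runs from a leader up to, not including, the next leader."""
    leaders = find_leaders(stmts)
    bounds = leaders + [len(stmts) + 1]
    return [list(range(bounds[k], bounds[k + 1]))
            for k in range(len(leaders))]

print(find_leaders(PROGRAM))
print(basic_blocks(PROGRAM))
```

On this input the six blocks correspond to B1 through B6 of the example CFG.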
control flow graph
- A directed graph G = (V,E)
- Nodes V = basic blocks
- Edges E = control flow
  - (B1,B2) ∈ E when control from B1 flows to B2

Figure (example CFG):

    B1: prod := 0
        i := 1

    B2: t1 := 4 * i
        t2 := a[t1]
        t3 := 4 * i
        t4 := b[t3]
        t5 := t2 * t4
        t6 := prod + t5
        prod := t6
        t7 := i + 1
        i := t7
        if i <= 20 goto B2
example

source:

    for i from 1 to 10 do
      for j from 1 to 10 do
        a[i, j] = 0.0;
    for i from 1 to 10 do
      a[i, i] = 1.0;

IR:

    1)  i = 1
    2)  j = 1
    3)  t1 = 10*i
    4)  t2 = t1 + j
    5)  t3 = 8*t2
    6)  t4 = t3-88
    7)  a[t4] = 0.0
    8)  j = j + 1
    9)  if j <= 10 goto (3)
    10) i = i+1
    11) if i <= 10 goto (2)
    12) i = 1
    13) t5 = i-1
    14) t6 = 88*t5
    15) a[t6] = 1.0
    16) i = i+1
    17) if i <= 10 goto (13)

CFG:

    B1: i = 1
    B2: j = 1
    B3: t1 = 10*i
        t2 = t1 + j
        t3 = 8*t2
        t4 = t3-88
        a[t4] = 0.0
        j = j + 1
        if j <= 10 goto B3
    B4: i = i+1
        if i <= 10 goto B2
    B5: i = 1
    B6: t5 = i-1
        t6 = 88*t5
        a[t6] = 1.0
        i = i+1
        if i <= 10 goto B6
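Once the blocks are known, the edges follow from each block's last statement: a jump adds an edge to its target, and a block that does not end in an unconditional jump also falls through to the next block. The sketch below encodes each block of the example as a hypothetical `(name, jump_target, is_conditional)` triple; this encoding is an assumption made to keep the example short.

```python
def cfg_edges(blocks):
    """blocks: list of (name, jump_target or None, is_conditional), in order."""
    edges = []
    for k, (name, target, conditional) in enumerate(blocks):
        if target is not None:
            edges.append((name, target))            # explicit jump
        falls_through = target is None or conditional
        if falls_through and k + 1 < len(blocks):
            edges.append((name, blocks[k + 1][0]))  # next block in layout
    return edges

EXAMPLE = [
    ("B1", None, False),   # i = 1
    ("B2", None, False),   # j = 1
    ("B3", "B3", True),    # ... if j <= 10 goto B3
    ("B4", "B2", True),    # ... if i <= 10 goto B2
    ("B5", None, False),   # i = 1
    ("B6", "B6", True),    # ... if i <= 10 goto B6
]

edges = cfg_edges(EXAMPLE)
for src, dst in edges:
    print(src, "->", dst)
```

The two self-loops on B3 and B6 are the inner loop and the second loop of the source program; the B4 to B2 edge closes the outer loop.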
Variable Liveness
- A statement x = y + z
  - defines x
  - uses y and z
- A variable x is live at a program point if its value is used at a later point, before x is redefined

Example (showing state after the statement):

    y = 42        x undef, y live, z undef
    z = 73        x undef, y live, z live
    x = y + z     x is live, y dead, z dead
    print(x);     x is dead, y dead, z dead
Computing Liveness Information
- Between basic blocks – dataflow analysis (next lecture)
- Within a single basic block?
  - Idea
    - Use symbol table to record next-use information
    - Scan basic block backwards
    - Update next-use for each variable
Computing Liveness Information
- INPUT: A basic block B of three-address statements. The symbol table initially shows all non-temporary variables in B as being live on exit.
- OUTPUT: At each statement i: x = y + z in B, liveness and next-use information of x, y, and z at i.
- Start at the last statement in B and scan backwards
- At each statement i: x = y + z in B, we do the following:
  1. Attach to i the information currently found in the symbol table regarding the next use and liveness of x, y, and z.
  2. In the symbol table, set x to "not live" and "no next use."
  3. In the symbol table, set y and z to "live" and the next uses of y and z to i.
Computing Liveness Information
- Can we change the order between steps 2 and 3?
  - No: x may also appear as an operand, as in the last statement below. Step 3 must run after step 2 so that such an x is recorded as "live" before the statement.

    x = 1
    y = x + 3
    z = x * 3
    x = x * z
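The backward scan can be run on the four-statement block above. In this sketch a statement is a hypothetical `(x, y, z)` triple for `x = y op z` (the operator does not matter for liveness), constants are given as non-string values and ignored, statements are indexed from 0, and the symbol table is a plain dictionary mapping a variable to `(live, next_use)`.

```python
def next_use(block, live_on_exit):
    """Return, per statement, {var: (live, next_use_index)} as seen at it."""
    table = {v: (True, None) for v in live_on_exit}
    info = [None] * len(block)
    for i in range(len(block) - 1, -1, -1):        # scan backwards
        x, y, z = block[i]
        operands = [v for v in (y, z) if isinstance(v, str)]
        # Step 1: attach what the table currently says about x, y and z.
        info[i] = {v: table.get(v, (False, None)) for v in [x] + operands}
        # Step 2: x is overwritten here, so it is dead just before i.
        table[x] = (False, None)
        # Step 3: the operands are used here, so they are live before i.
        for v in operands:
            table[v] = (True, i)
    return info

BLOCK = [
    ("x", 1, None),     # 0: x = 1
    ("y", "x", 3),      # 1: y = x + 3
    ("z", "x", 3),      # 2: z = x * 3
    ("x", "x", "z"),    # 3: x = x * z
]

info = next_use(BLOCK, live_on_exit={"x", "y", "z"})
for i, entry in enumerate(info):
    print(i, entry)
```

At statement 2 the table reports x as live with next use 3, which is only true because step 3 ran after step 2 when statement 3 was processed.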
common-subexpression elimination

    before          after
    a = b + c       a = b + c
    b = a - d       b = a - d
    c = b + c       c = b + c
    d = a - d       d = b

The second a - d sees the same a and d as the first, so it is replaced by a copy of b. The two b + c are not common: b is redefined between them.
DAG Representation of Basic Blocks

    a = b + c
    b = a - d
    c = b + c
    d = a - d

Figure (DAG): leaves b0, c0 and d0; a "+" node over b0 and c0 labeled a; a "-" node over the a node and d0 labeled b,d; a "+" node over the "-" node and c0 labeled c.
DAG Representation of Basic Blocks

    a = b + c
    b = b - d
    c = c + d
    e = b + c

Figure (DAG): leaves b0, c0 and d0; a "+" node over b0 and c0 labeled a; a "-" node over b0 and d0 labeled b; a "+" node over c0 and d0 labeled c; a "+" node over the b and c nodes labeled e.
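Both DAGs can be built mechanically by reusing a node whenever the same operator is applied to the same operand nodes. The following is a sketch under assumptions: statements are hypothetical `(dest, op, left, right)` tuples, nodes are numbered in creation order, and an initial value such as b0 becomes a leaf the first time b is read.

```python
def build_dag(block):
    """Return (nodes, current, reused) for a basic block."""
    nodes = []      # node id -> ("leaf", name) or (op, left_id, right_id)
    index = {}      # node description -> node id, used to find repeats
    current = {}    # variable -> id of the node holding its current value
    reused = []     # destinations whose expression was already in the DAG

    def intern(desc):
        if desc not in index:
            index[desc] = len(nodes)
            nodes.append(desc)
        return index[desc]

    def value_of(var):
        if var not in current:
            current[var] = intern(("leaf", var + "0"))  # initial value
        return current[var]

    for dest, op, left, right in block:
        desc = (op, value_of(left), value_of(right))
        if desc in index:
            reused.append(dest)         # common subexpression found
        current[dest] = intern(desc)
    return nodes, current, reused

FIRST = [("a", "+", "b", "c"), ("b", "-", "a", "d"),
         ("c", "+", "b", "c"), ("d", "-", "a", "d")]
SECOND = [("a", "+", "b", "c"), ("b", "-", "b", "d"),
          ("c", "+", "c", "d"), ("e", "+", "b", "c")]

nodes1, current1, reused1 = build_dag(FIRST)
nodes2, current2, reused2 = build_dag(SECOND)
print(reused1, reused2)
```

In the first block d shares the node of b. In the second block e = b + c gets a fresh node even though it reads like the first statement, because b and c have both been redefined by then.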
algebraic identities

    before       after
    a = x^2      a = x*x
    b = x*2      b = x+x
    c = x/2      c = x*0.5
    d = 1*x      d = x
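These rewrites can be applied one statement at a time. The sketch below uses hypothetical `(dest, op, left, right)` tuples and a made-up `copy` operator for `d = x`. Note the assumption behind the third rule: replacing `x/2` by `x*0.5` is only valid for floating-point values, since integer division truncates.

```python
def simplify(dest, op, left, right):
    """Rewrite one statement using the four identities from the slide."""
    if op == "^" and right == 2:
        return (dest, "*", left, left)      # x^2 -> x*x
    if op == "*" and right == 2:
        return (dest, "+", left, left)      # x*2 -> x+x
    if op == "/" and right == 2:
        return (dest, "*", left, 0.5)       # x/2 -> x*0.5 (floats only)
    if op == "*" and left == 1:
        return (dest, "copy", right, None)  # 1*x -> x
    return (dest, op, left, right)          # nothing to simplify

BEFORE = [("a", "^", "x", 2), ("b", "*", "x", 2),
          ("c", "/", "x", 2), ("d", "*", 1, "x")]
after = [simplify(*stmt) for stmt in BEFORE]
for stmt in after:
    print(stmt)
```

Each rule trades an operation for one that is typically cheaper on the target machine, which is why the choice of rules depends on the instruction set.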
coming up next
register allocation
The End