Compiler Construction
Sohail Aslam
Lecture 42
Code Generation
The code generation problem is the task of mapping intermediate code to machine code.
Requirements:
• Correctness
• Efficiency
Issues:
• Input language: intermediate code (optimized or not)
• Target: machine code, assembly language
• Memory management: mapping names to data objects in the run-time system
• Instruction selection
• Instruction scheduling
• Register allocation
Target: Assembly Language
General characteristics:
• Byte-addressable with 4-byte words
• N general-purpose registers
• Two-address instructions: op source, destination
Operations:
• MOV source, destination
• ADD source, destination
• SUB source, destination (dest = dest - source)
• GOTO address
• CJ conditional jump
Addressing Modes
MODE               FORM     ADDRESS                      ADDED COST
absolute           M        M                            1
register           R        R                            0
indexed            c(R)     c + contents(R)              1
indirect register  *R       contents(R)                  0
indirect indexed   *c(R)    contents(c + contents(R))    1
literal            #c       c                            1
stack              SP       SP                           0
indexed stack      c(SP)    c + contents(SP)             1
Instruction costs:
• The cost corresponds to the length of the instruction: 1 for the instruction itself plus the added cost of each operand's addressing mode.
• MOV R0,R1 ; R1 = c(R0) has cost 1
• MOV R5,M ; M = c(R5) has cost 2: 1 for the instruction, 1 additional for the memory address
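The cost rule can be checked mechanically. The following is a minimal sketch, not part of the lecture: the function names `operand_cost` and `instruction_cost` are hypothetical, registers are assumed to be written `R0`, `R1`, … or `SP`, and `addr(x)` is assumed to be charged like a literal.

```python
import re

# Assumption: registers are spelled R<digits> or SP; anything else is a
# memory name, a literal, or an indexed form, each with added cost 1.
REGISTER = re.compile(r"^(R\d+|SP)$")


def operand_cost(operand: str) -> int:
    """Added cost of one operand, following the addressing-mode table."""
    op = operand.strip()
    if op.startswith("*"):
        op = op[1:]  # indirection adds nothing by itself; charge what remains
    if REGISTER.match(op):
        return 0     # register, indirect register, stack
    return 1         # absolute, literal, indexed, indirect indexed, indexed stack


def instruction_cost(instruction: str) -> int:
    """1 for the instruction plus the added cost of each operand."""
    text = instruction.split(";")[0].strip()   # drop a trailing ';' comment
    _mnemonic, _, rest = text.partition(" ")
    operands = [o for o in rest.split(",") if o.strip()]
    return 1 + sum(operand_cost(o) for o in operands)


print(instruction_cost("MOV R0,R1"), instruction_cost("MOV R5,M"))  # prints: 1 2
```

Summing `instruction_cost` over a list of instructions gives the cost of a whole code sequence under the same assumptions.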
Simple Code Generation
Define a target code sequence for each intermediate code statement type.
Intermediate    becomes…
a = b           MOV b,a
a = b[c]        MOV addr(b),R0
                ADD c,R0
                MOV *R0,a
a = b + c       MOV b,a
                ADD c,a
a[b] = c        MOV addr(a),R0
                ADD b,R0
                MOV c,*R0
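The table amounts to one fixed template per statement type. A minimal sketch of such a generator follows; the statement kinds `copy`, `add`, `load`, and `store` and the function `generate` are names invented here for illustration, and every array access is assumed to use R0 as in the table.

```python
# One target-code template per intermediate statement type.
# Placeholders: x, y, z are the operand names in the order they appear
# in the intermediate statement.
TEMPLATES = {
    "copy":  ["MOV {y},{x}"],                                      # x = y
    "add":   ["MOV {y},{x}", "ADD {z},{x}"],                       # x = y + z
    "load":  ["MOV addr({y}),R0", "ADD {z},R0", "MOV *R0,{x}"],    # x = y[z]
    "store": ["MOV addr({x}),R0", "ADD {y},R0", "MOV {z},*R0"],    # x[y] = z
}


def generate(kind: str, x: str, y: str, z: str = "") -> list[str]:
    """Expand the template for one statement; no context is consulted."""
    return [line.format(x=x, y=y, z=z) for line in TEMPLATES[kind]]


for line in generate("load", "a", "b", "c"):   # a = b[c]
    print(line)
```

Because each call looks only at its own statement, the generator always reloads R0 and always writes results back to memory, whatever the neighbouring statements do.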
Consider the C statement: a[i] = b[c[j]];
t1 := c[j]      MOV addr(c), R0
                ADD j, R0
                MOV *R0, t1
t2 := b[t1]     MOV addr(b), R0
                ADD t1, R0
                MOV *R0, t2
a[i] := t2      MOV addr(a), R0
                ADD i, R0
                MOV t2, *R0
The cost of this code is 18, and we are forced to allocate space for two temporaries.
Problems
• Local decisions do not produce good code.
• The per-statement templates do not take temporary variables into account.
Optimize
Get rid of the temporaries:
MOV addr(c), R0
ADD j, R0
MOV addr(b), R1
ADD *R0, R1
MOV addr(a), R2
ADD i, R2
MOV *R1, *R2
The cost of this code is 12.
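Since correctness is the first requirement, it is worth checking that the cheaper sequence still computes a[i] = b[c[j]]. The sketch below is a toy interpreter written for this check only: the function `run`, the memory layout, and the sample values are all assumptions, array indices are left unscaled as on the slides, and only the register, indirect, absolute, and addr(x) operand forms are handled.

```python
import re


def run(program, memory, layout):
    """Run MOV/ADD/SUB lines on a toy machine.

    memory: dict address -> value (mutated in place)
    layout: dict variable name -> address (hypothetical, chosen by the caller)
    """
    regs = {}

    def read(loc):
        kind, key = loc
        if kind == "lit":
            return key
        return regs.get(key, 0) if kind == "reg" else memory.get(key, 0)

    def write(loc, value):
        kind, key = loc
        if kind == "lit":
            raise ValueError("cannot store into a literal")
        (regs if kind == "reg" else memory)[key] = value

    def locate(operand):
        op = operand.strip()
        if op.startswith("*"):             # indirect: value of the rest is an address
            return ("mem", read(locate(op[1:])))
        match = re.fullmatch(r"addr\((\w+)\)", op)
        if match:                          # address constant of a variable
            return ("lit", layout[match.group(1)])
        if re.fullmatch(r"R\d+", op):      # register
            return ("reg", op)
        return ("mem", layout[op])         # absolute: a named memory cell

    for line in program:
        mnemonic, rest = line.split(None, 1)
        src, dst = (locate(o) for o in rest.split(","))
        if mnemonic == "MOV":
            write(dst, read(src))
        elif mnemonic == "ADD":
            write(dst, read(dst) + read(src))
        elif mnemonic == "SUB":
            write(dst, read(dst) - read(src))
        else:
            raise ValueError(f"unsupported operation: {mnemonic}")
    return memory


NAIVE = ["MOV addr(c), R0", "ADD j, R0", "MOV *R0, t1",
         "MOV addr(b), R0", "ADD t1, R0", "MOV *R0, t2",
         "MOV addr(a), R0", "ADD i, R0", "MOV t2, *R0"]
OPTIMIZED = ["MOV addr(c), R0", "ADD j, R0", "MOV addr(b), R1", "ADD *R0, R1",
             "MOV addr(a), R2", "ADD i, R2", "MOV *R1, *R2"]

# Hypothetical layout and data: i = 2, j = 1, c[1] = 3, b[3] = 42.
LAYOUT = {"a": 100, "b": 200, "c": 300, "i": 10, "j": 11, "t1": 12, "t2": 13}


def initial_memory():
    return {10: 2, 11: 1, 301: 3, 203: 42}


naive_result = run(NAIVE, initial_memory(), LAYOUT)
optimized_result = run(OPTIMIZED, initial_memory(), LAYOUT)
print(naive_result[102], optimized_result[102])  # a[2] in both runs
```

With this data both runs store 42 into a[2], and only the first one writes to the cells reserved for t1 and t2.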
Can optimize further …
MOV addr(c), R0
ADD j, R0
MOV addr(a), R2
ADD i, R2
MOV *addr(b)(R0), *R2
The cost of this code is 10.
What is needed is a way to generate machine code based on past and future use of the data.
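The slide does not name a technique here. One conventional way to capture "future use" is next-use information, computed by scanning a basic block backwards; the sketch below assumes that approach. The function `next_use` and the `(defined, used)` statement encoding are illustrative choices, not the lecture's notation.

```python
def next_use(block):
    """For each statement, map every name it mentions to the index of the
    next later statement in the block that reads it (None if there is none).

    block: list of (defined_name_or_None, [used_names]) tuples.
    """
    upcoming = {}                       # name -> nearest later reader
    info = [None] * len(block)
    for index in range(len(block) - 1, -1, -1):
        defined, used = block[index]
        names = set(used) | ({defined} if defined else set())
        info[index] = {name: upcoming.get(name) for name in names}
        if defined:
            upcoming.pop(defined, None)  # value before this definition is dead
        for name in used:
            upcoming[name] = index       # this statement reads the name
    return info


# a[i] = b[c[j]] as three-address code
BLOCK = [
    ("t1", ["c", "j"]),        # t1 := c[j]
    ("t2", ["b", "t1"]),       # t2 := b[t1]
    (None, ["a", "i", "t2"]),  # a[i] := t2
]

info = next_use(BLOCK)
print(info[0]["t1"], info[1]["t1"])  # prints: 1 None
```

Here t1 is read only by the statement right after its definition and never again, so a generator that consults this table could keep it in a register instead of allocating memory for it.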