code generator

44
CODE GENERATION (Part I)

Upload: techmx

Post on 27-Oct-2014

164 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Code Generator

CODE GENERATION(Part I)

Page 2: Code Generator

Agenda

• Introduction

• Code Generation

• Issues in the Design of a Code Generator

• The Target Language

• Addresses in the Target Code

• Basic Blocks and Flow Graphs

• Optimization of Basic Blocks

• Next…04/07/23 2

Page 3: Code Generator

Introduction

• The final phase in our compiler model

Position Of a Code Generator

Front EndFront End CodeGenerator

CodeGenerator

CodeOptimizer

Sourceprogram

Intermediate Intermediate

code codeTargetprogram

04/07/23 3

Page 4: Code Generator

Code Generation

• This phase generates the target code consisting of

assembly code.

1. Memory locations are selected for each variable;

2. Instructions are translated into a sequence of assembly instructions;

3. Variables and intermediate results are assigned to memory registers.

04/07/23 4

Page 5: Code Generator

5

Issue in the Design of a Code Generator

• Input to the Code Generator• The Target Program• Instruction Selection• Register Allocation• Evaluation Order

04/07/23

Page 6: Code Generator

6

Input to the Code Generator

• The input to the code generator is – The intermediate representation of the source program

produced by the frontend along with the symbol table.

• Choices for the IR– Three-address representations– Virtual machine representations– Linear representations– Graphical representation

04/07/23

Page 7: Code Generator

7

The Target Program

• The instruction-set architecture of the target machine has a significant impact on the difficulty of constructing a good code generator that produces high-quality machine code.

• The most common target-machine architecture are RISC, CISC, and stack based.– A RISC machine typically has many registers, three-address

instructions, simple addressing modes, and a relatively simple instruction-set architecture.

04/07/23

Page 8: Code Generator

Contd…

– A CISC machine typically has few registers, two-address instructions, and variety of addressing modes, several register classes, variable-length instructions, and instruction with side effects.

– In a stack-based machine, operations are done by pushing operands onto a stack and then performing the operations on the operands at the top of the stack.

04/07/23 8

Page 9: Code Generator

9

Instruction Selection

• Instruction selection is important to obtain efficient code

• Suppose we translate three-address codex:=y+z

to: MOV y,R0ADD z,R0MOV R0,x a:=a+1 MOV a,R0

ADD #1,R0MOV R0,a

ADD #1,a INC a

Cost = 6

Cost = 3 Cost = 2

Better Better

04/07/23

Page 10: Code Generator

10

Register Allocation

• A key problem in code generation is deciding what values to hold in what registers.

• Efficient utilization is particularly important.• The use of registers is often subdivided into two

sub problems:1. Register Allocation, during which we select the set of variables

that will reside in registers at each point in the program.

2. Register assignment, during which we pick the specific register that a variable will reside in.

04/07/23

Page 11: Code Generator

Contd…

• Finding an optimal assignment of registers to variables is difficult, even with single-register machine.

• Mathematically, the problem is NP-complete.

04/07/23 11

Page 12: Code Generator

12

Evaluation Order

• The order in which computations are performed can affect the efficiency of the target code.

• Some computation orders require fewer registers to hold intermediate results than others.

• However, picking a best order in the general case is a difficult NP-complete problem.

04/07/23

Page 13: Code Generator

13

The Target Language

• A Simple Target Machine Model

• Program and Instruction Costs

04/07/23

Page 14: Code Generator

14

A Simple Target Machine Model

• Our target computer models a three-address machine with load and store operations, computation operations, jump operations, and conditional jumps.

• The underlying computer is a byte-addressable machine with n general-purpose registers.

04/07/23

Page 15: Code Generator

Contd…

• Assume the following kinds of instructions are available:– Load operations– Store operations– Computation operations– Unconditional jumps– Conditional jumps

04/07/23 15

Page 16: Code Generator

16

Contd…

• Assume a variety of addressing models:– A variable name x referring o the memory location

that is reserved for x.– Indexed address, a(r), where a is a variable and r is a

register.– A memory can be an integer indexed by a register, for

example, LD R1, 100(R2). – Two indirect addressing modes: *r and *100(r)– Immediate constant addressing mode

04/07/23

Page 17: Code Generator

17

A Simple Target Machine Model

• Example :

x = y –z LD R1, y LD R2, z SUB R1, R1, R2 ST x, R1

b = a[i] LD R1, i MUL R1, R1, 8 LD R2, a(R1) ST b, R2

x = *p LD R1, p LD R2, 0(R1) ST x, R2

*p = y LD R1, p LD R2, y ST 0(R1), R2

04/07/23

Page 18: Code Generator

04/07/23 18

a[j] = c LD R1, c LD R2, j MUL R2, R2, 8 ST a(R2), R1

if x < y goto L LD R1, x LD R2, y SUB R1, R1, R2 BLTZ R1, L

Page 19: Code Generator

19

Program and Instruction Costs

• For simplicity, we take the cost of an instruction to be one plus the costs associated with the addressing modes of the operands.

• Addressing modes involving registers have zero additional cost, while those involving a memory location or constant in them have an additional cost f one.

04/07/23

Page 20: Code Generator

Contd…

• For example, – LD R0, R1 cost = 1– LD R0, M cost = 2– LD R1, *100(R2) cost = 3

04/07/23 20

Page 21: Code Generator

21

Addresses in the Target Code

• Static Allocation• Stack Allocation• Run-Time Allocation

04/07/23

Page 22: Code Generator

22

Static Allocation

• Focus on the following three-address statements:– Call callee– Return– Halt– Action

04/07/23

Page 23: Code Generator

23

Static Allocation

• Store return address and return control to caller

• Store

ST callee.staticArea , #here + 20

BR callee.codeArea

• Return

BR *callee.staticArea

04/07/23

Page 24: Code Generator

24

Runtime Addresses for Names

• Assumption: a name in a three-address statement is really a pointer to a symbol-table entry for that name.

• Note that names must eventually be replaced by code to access storage locations

04/07/23

Page 25: Code Generator

Contd…

• Example: – x=0– Suppose the symbol-table entry for x contains a relative address

12– x is in a statically allocated area beginning at address static– the actual run-time address of x is static + 12– The actual assignment: static [ 12] = 0– For a static area starting at address 100: LD 112, #0

04/07/23 25

Page 26: Code Generator

Basic Blocks and Flow Graphs

• Basic Blocks

• Next-Use Information

• Flow Graphs

• Representation of Flow Graphs

• Loops

2604/07/23

Page 27: Code Generator

Basic Blocks

• Algorithm: Partitioning three-address instructions into basic blocks.

INPUT: A sequence of three-address instructions.

OUTPUT: A list of the basic blocks for that sequence in which each instruction is assigned to exactly one basic block.

2704/07/23

Page 28: Code Generator

Contd…

METHOD: First, we determine those instructions in the intermediate code that are leaders. rules for finding leaders are:

1. The first three-address instruction in the intermediate code is a leader.

2. Any instruction that is the target of a conditional or unconditional jump is a leader.

3. Any instruction that immediately follows a conditional or unconditional jump is a leader.

04/07/23 28

Page 29: Code Generator

Contd…

• Find the leaders

2904/07/23

Page 30: Code Generator

Next-Use Information

• The use of a name in a three-address statement:– Three-address statement i assigns a value to x – Statement j has x as an operand– Control can flow from statement i to j along a path that has no

intervening assignments to x– Then statement j uses the value of x computed at i .

– Say that x is live at statement i .

3004/07/23

Page 31: Code Generator

Next-Use Information

• Algorithm (for x=y+z) : Determining the liveness and next-use information for each statement in a basic block.

– INPUT: A basic block B of three-address statements. Assume the symbol table initially shows all nontemporary variables in B as being live on exit.

– OUTPUT: At each statement i : x = y + z in B, attach to i the liveness and next-use information of x, y, and z .

3104/07/23

Page 32: Code Generator

Contd…

– METHOD: Start at the last statement in B and scan backwards to the beginning of B. At each statement i: x = y + z in B, do the following:

1. Attach to i the information currently found in the symbol table regarding the next use and

liveness of x , y, and z.

2. In the symbol table, set x to "not live" and "no next use."

3. In the symbol table, set y and z to "live" and the next uses of y and z to i.

04/07/23 32

Page 33: Code Generator

33

Flow Graphs

• A flow graph is a graphical depiction of a sequence of instructions with control flow edges

• A flow graph can be defined at the intermediate code level or target code level

MOV 1,R0 MOV n,R1 JMP L2L1: MUL 2,R0 SUB 1,R1L2: JMPNZ R1,L1

MOV 0,R0 MOV n,R1 JMP L2L1: MUL 2,R0 SUB 1,R1L2: JMPNZ R1,L104/07/23

Page 34: Code Generator

34

Loops

• A loop is a collection of basic blocks, such that– All blocks in the collection are strongly

connected– The collection has a unique entry, and the

only way to reach a block in the loop is through the entry

04/07/23

Page 35: Code Generator

Optimizing of Basic Block

• Compile time evaluation• Common sub-expression elimination• Code motion • Strength Reduction• Dead code elimination• Algebraic Transformations

Page 36: Code Generator

36

Compile-Time Evaluation

• Expressions whose values can be pre-computed at the compilation time

• Two ways:– Constant folding– Constant propagation

Page 37: Code Generator

37

Compile-Time Evaluation

• Constant folding: Evaluation of an expression with constant operands to replace the expression with single value

• Example:

area := (22.0/7.0) * r ^ 2

area := 3.14286 * r ^ 2

Page 38: Code Generator

Compile-Time Evaluation

• Constant Propagation: Replace a variable with constant which has been assigned to it earlier.

• Example:pi := 3.14286area = pi * r ^ 2

area = 3.14286 * r ^ 2

Page 39: Code Generator

Common Sub-expression Elimination

• Local common sub-expression elimination– Performed within basic blocks.

a := b * c…

…x := b * c + 5

temp := b * ca := temp……x := temp + 5

Page 40: Code Generator

40

Code Motion

• Moving code from one part of the program to other without modifying the algorithm– Reduce size of the program– Reduce execution frequency of the code

subjected to movement

Page 41: Code Generator

Code Motion

• Similar to common sub-expression elimination but with the objective to reduce code size.

If(a<b) then z:= x * 5else y := x * 5 + 2

temp := x * 2If(a<b) then z:= tempelse y := temp + 2

Page 42: Code Generator

Strength Reduction

• Replacement of an operator with a less costly one.

X = x ^ 2

Y = y * 2

X = x * x

Y = y + y

Page 43: Code Generator

Dead Code Elimination

• Dead Code are portion of the program which will not be executed in Basic block.

If(a==b){b=c ;…..return b ;c = 0 ;}

If(a==b){b=c ;…..return b ;}

Page 44: Code Generator

References

• Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman,

“Compilers: Principles, Techniques, and Tools”

Addison-Wesley, 1986.

• http://en.wikipedia.org/wiki/Code_generation_(compiler)• http://www.mec.ac.in/resources/notes/notes/compiler/

module5/codegenissues.htm

04/07/23 44