
12048 - J. Neira – University of Zaragoza

Lesson 5: Code Generation

1. Introduction
2. Types of intermediate code
3. Declarations
4. Expressions and assignment
5. Control flow
6. Procedures and functions

Readings:
Scott, chapter 9
Muchnick, chapters 4, 6
Aho, chapter 8
Fischer, chapters 10, 11, 12, 13, 14
Holub, chapter 6
Bennett, chapters 4, 10


1/6. Introduction
Why not generate target code directly?
• Intermediate code: interface between front-ends and back-ends
• We look for:
– Transportability
– Possibilities for optimization
• It must be:
– abstract
– simple
• It does not consider:
– Addressing modes
– Size of data
– Registers
– Operation efficiency


Intermediate code
• Advantages:
– It allows machine abstraction, separating high-level operations from their low-level implementation.
– It allows reuse of front-ends and back-ends.
– It allows general optimizations.
• Disadvantages:
– It implies an additional pass for the compiler (the single-pass model, conceptually simple, cannot be used).
– It complicates optimizations specific to the target architecture.
– It is usually orthogonal to the target machine, so translation to a specific architecture will take longer and be less efficient.


2/6. Types of intermediate code
• AST (Abstract Syntax Trees): condensed form of analysis trees, with semantic nodes only and without nodes for terminals (the program is assumed semantically correct).
• Advantages: unification of compilation steps
– Creation of the tree and symbol table
– Semantic analysis
– Optimization
– Generation of object code
• Disadvantage:
– space
[Figure: analysis tree and AST for the expression A + B * C]


ASTs
program gcd(input, output);
var i, j : integer;
begin
  read(i, j);
  while i <> j do
    if i > j then i := i - j
    else j := j - i;
  writeln(i);
end.


Types of intermediate code
• DAGs (Directed Acyclic Graphs): concise syntactic trees
– Some savings in space
– They highlight operations duplicated in the code
– Difficult to build
[Figure: syntactic tree and DAG for B * C + B * C; in the DAG the repeated subtree B * C appears only once]


Types of intermediate code
• TAC (Three-Address Code): sequence of instructions of the form:
    result := operand1 op operand2
– operator: arithmetic / logical
– operands/result: constants, names, temporaries.
• It corresponds to instructions of the type:
    a := b + c        (a, b and c are addresses 1, 2 and 3)
• More complex operations require several instructions:
    (a + b) x (c – (d / e))
• This unrolling facilitates optimization and final code generation.
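The unrolling of (a + b) x (c – (d / e)) into several three-address instructions can be sketched as a post-order walk over an expression tree. This is only an illustration: the Node type, gen_tac and the t0, t1, ... naming are assumptions made for the sketch, not part of the course material.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical expression-tree node: a leaf holds a name, an inner node an operator. */
typedef struct Node {
    char op;                          /* '+', '-', '*', '/', or 0 for a leaf */
    const char *name;                 /* leaf name, e.g. "a"; unused in inner nodes */
    const struct Node *left, *right;
} Node;

static int temp_count = 0;            /* index of the next temporary */

/* Appends the TAC for the subtree n to out and writes into place
 * the name (variable or temporary) that holds the subtree's value. */
void gen_tac(const Node *n, char *place, char *out)
{
    char l[16], r[16], line[64];

    if (n->op == 0) {                 /* leaf: no code, the value lives in the name */
        strcpy(place, n->name);
        return;
    }
    gen_tac(n->left, l, out);         /* operands first ... */
    gen_tac(n->right, r, out);
    sprintf(place, "t%d", temp_count++);
    sprintf(line, "%s := %s %c %s\n", place, l, n->op, r);
    strcat(out, line);                /* ... then the operator itself */
}
```

For the expression above, with * standing for x, the walk produces four instructions (t0 := a + b, t1 := d / e, t2 := c - t1, t3 := t0 * t2): one per operator, each with at most three addresses.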


TAC
• General and simplified assembler for a virtual machine: it includes labels, control flow instructions…
• It includes explicit references to the address of intermediate results (they are given a name).
• The use of names allows reorganization (up to a certain point).
• Some compilers generate this code as final code; easy to interpret (UCSD PCODE, Java).
• Operators:
    Operators   IC      generation   optimization
    many        small   complex      complex
    few         large   simple       simple


TAC
• Syntactic variations:
(Fischer)
    (READI, A)
    (READI, B)
    (GT, A, B, t1)
    (JUMP0, t1, L1)
    (ADDI, A, 5, C)
    (JUMP, L2)
    (LABEL, L1)
    (ADDI, B, 5, C)
    (LABEL, L2)
    (SUBI, C, 1, t2)
    (MULTI, 2, t2, t3)
    (WRITEI, t3)
(Aho)
1. Assignment:
    x := y op z
    x := op y
    x := y
    x[i] := y
    x := y[i]
    x := &y
    x := *y
    *x := y
2. Jumps:
    goto L
    if x oprel y goto L
3. Procedures:
    param x1
    ...
    param xn
    call p, n


Representations of TAC
Source:
    a := b * c + b * d;
• Quadruples: the destination is usually a temporary.
    (*, b, c, t1)
    (*, b, d, t2)
    (+, t1, t2, a)
• Triplets: only operands are represented.
    (1) (*, b, c)
    (2) (*, b, d)
    (3) (+, (1), (2))
    (4) (:=, (3), a)
• Supposed advantage: more concise. False: statistically, more instructions are required.
• Disadvantage: generated code is position dependent. True: positional dependence complicates optimization.
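The positional dependence of triplets can be made concrete with a small sketch in which a triplet names an earlier result by its position in the array. The types and the evaluator are hypothetical, positions are 0-based here (the slide counts from 1), and a variable operand is simplified to carry its value directly.

```c
#include <stddef.h>

typedef enum { VAR, REF } Kind;

/* VAR: value is the variable's current value (a simplification for the sketch).
 * REF: value is the 0-based position of an earlier triplet. */
typedef struct { Kind kind; int value; } Operand;

/* A triplet has no result field: its result is identified by its position. */
typedef struct { char op; Operand a, b; } Triple;

static int operand_value(Operand o, const int *results)
{
    return o.kind == REF ? results[o.value] : o.value;
}

/* Evaluates n triplets in array order; results[i] receives the value of triplet i.
 * Only '*' and '+' are handled in this sketch. */
void eval_triples(const Triple *code, size_t n, int *results)
{
    for (size_t i = 0; i < n; i++) {
        int x = operand_value(code[i].a, results);
        int y = operand_value(code[i].b, results);
        results[i] = (code[i].op == '*') ? x * y : x + y;
    }
}
```

Moving a triplet to another slot changes its position, so every REF operand that points at it would have to be renumbered; a quadruple names its destination explicitly and does not have this problem.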


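Representations of TAC
• Indirect triplets: triplets + an array that indicates the order of execution of instructions.
Source:
    a := (b * c + b * d) * b * c;
Triplets, with execution order 1 2 3 4 5 6:
    (1) b * c
    (2) b * d
    (3) b * c
    (4) (1) + (2)
    (5) (4) * (3)
    (6) a := (5);
Efficient to reorganize: only the order array changes, here to 2 1 3 4 5 6:
    (2) b * d
    (1) b * c
    (3) b * c
    (4) (1) + (2)
    (5) (4) * (3)
    (6) a := (5);
Space can be shared: the repeated b * c is stored once, with execution order 1 2 3 4 5:
    (1) b * c
    (2) b * d
    (3) (1) + (2)
    (4) (3) * (1)
    (5) a := (4);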


RTL: register transfer language
    d = (a + b) * c;


Postfix notation
• Reverse Polish, zero-address code:
– Parenthesis-free mathematical notation
– Operators appear after operands
    a := b * c + b * c
    @a b c mult b c mult add asg
Expressions:
1. E atom ⇒ E’
2. E1 op E2 ⇒ E1’ E2’ op
3. (E) ⇒ E’
Assignment:
    id := E ⇒ @id E’ asg
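The translation rules above map directly onto a recursive function: an atom is emitted as is, a binary node emits both translated operands and then its operator, and parentheses leave no trace because the tree already encodes grouping. The Expr type and the function names below are assumptions made for this sketch.

```c
#include <string.h>

/* Hypothetical tree node: atom is non-NULL for a leaf,
 * op is the operator mnemonic of an inner node (e.g. "mult"). */
typedef struct Expr {
    const char *atom;
    const char *op;
    const struct Expr *left, *right;
} Expr;

/* Rules 1 to 3: appends the postfix form E' of e to out, tokens separated by spaces. */
void postfix(const Expr *e, char *out)
{
    if (e->atom != NULL) {            /* rule 1: an atom translates to itself */
        strcat(out, e->atom);
        strcat(out, " ");
        return;
    }
    postfix(e->left, out);            /* rule 2: E1' E2' op */
    postfix(e->right, out);
    strcat(out, e->op);
    strcat(out, " ");
}

/* Assignment rule: id := E  =>  @id E' asg */
void postfix_assign(const char *id, const Expr *e, char *out)
{
    strcat(out, "@");
    strcat(out, id);
    strcat(out, " ");
    postfix(e, out);
    strcat(out, "asg");
}
```

Applied to a := b * c + b * c, the sketch reproduces the string shown on the slide, @a b c mult b c mult add asg.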


Postfix notation
• Advantages:
– Concise code
– No temporaries
– Some simple optimizations possible:
    a := b * c + b * c
    @a b c mult dup add asg
– Syntactic structure is maintained:
    a + b * c        ⇒   a b c mult add
    (a + b) * c      ⇒   a b add c mult
• Disadvantages:
– Position-dependent code
– Only effective if the target architecture is stack-based.


Intermediate code generation
• Fundamental consideration: generation of correct code.
    a := b * c

    ; Address of A.
    SRF 0 5
    ; Access to B.
    SRF 0 4
    DRF
    ; Access to C.
    SRF 0 3
    DRF
    TMS
    ; Assignment.
    ASG
TIME INEFFICIENCY: Is there any less costly way to carry out multiplication?
SPACE INEFFICIENCY: Is b * c already computed somewhere?
• Without asking these questions, this is reasonably simple.


3/6. Processing of declarations
• No code is generated (with some exceptions).
• Limited to solving problems related to object space allocation:
– Required space
– Place in memory
– Value
• There is both implicit and explicit information.
• Variable declaration: essentially, this is about completing the symbol table
    int sig, nivel = 0;   /* sig: next free address; nivel: nesting level */

    void abrir_bloque()
    {
      ...
      sig = INICIAL;
      ++nivel;
    }

    void cerrar_bloque()
    {
      ...
      nivel--;
    }

    void crear_var (char *nom, int tipo)
    {
      .....
      simbolo->dir = sig;
      switch (tipo) {
        case ENTERO : sig += 2; break;
        case REAL   : sig += 4; break;
        ....
      }
    }
In the end, sig shows the size of the execution block.


Example
    Programa p;
      entero i, j, m;                                  (addresses 3, 4, 5)

      accion dato (ref entero d);                      (address 3)
      Principio
      Fin

      accion mcd (Val entero a, b; ref entero m);      (addresses 3, 4, 5)
        entero r;                                      (address 6)
      Principio
      Fin

    Principio
    Fin


Variable declaration
• What if variable and procedure declarations can be mixed?
• What if you need to maintain information about the size of each block?
• C: this problem does not exist.
• Pascal: many compilers ban this.
    procedure P;
    var i,j,k : integer;        i.dir = 0, j.dir = 2, k.dir = 4
      procedure Q;
      var l,m : integer;        l.dir = 0, m.dir = 2
      begin
        ....
      end;
    var n : integer;            n.dir = 4 (same as k!)
• Possible solution:
    int sig[MAXANID], nivel = 0;

    void abrir_bloque ()
    {
      ...
      sig[++nivel] = INICIAL;
    }

    void cerrar_bloque ()
    {
      ...
      nivel--;
    }

    void crear_var (char *nom, int tipo)
    {
      ....
      simbolo->dir = sig[nivel];
      ...
    }


Initializers, C:
• extern and static:
– Default initial value: 0.
– Only constant values.
– Initializer executed only once.
• auto and register:
– Default initial value: none.
– Non-constant values allowed.
– Executed every time the block is activated.
    #include <stdio.h>
    /* legal? */ int i = 1,
    /* legal? */     j = i + 1,
    /* legal? */     m = 34 + 1, f(),
    /* legal? */     k = f(),
    /* legal? */     l;
    /* legal? */ extern int n = 0;

    int f()
    {
    /* legal? */   int i = 1,
    /* legal? */       j = i + 1, g(),
    /* legal? */       k = g(i),
    /* legal? */       n = f(), l;
    /* legal? */   static int m = 1;
      .....
    }


General scheme, sequential
Source:
    programa p;
    ...
    principio
    ...
    fin
Generated code:
    ENP L0
    ...
    ; Comienzo de p
    L0:
    ...
    ; Fin de p
    LVP
Grammar:
    %union {
      ...
      char *cadena;
      ...
    }
    ...
    programa: tPROGRAMA tIDENTIFICADOR ';'
        {
          inicio_generacion_codigo ();
          $<cadena>$ = nueva_etiqueta ();
          generar (ENP, $<cadena>$);
        }
        declaracion_variables
        declaracion_acciones
        {
          /* sprintf returns an int, so the message is built first and then passed on */
          sprintf (msg, "Comienzo de %s", $2.nombre);
          comentario (msg);
          etiqueta ($<cadena>4);
        }
        bloque_instrucciones
        {
          sprintf (msg, "Fin de %s", $2.nombre);
          comentario (msg);
          generar (LVP);
          fin_generacion_codigo ();
        }


General scheme, AST
Source:
    programa p;
    ...
    principio
    ...
    fin
Generated code:
    ENP L0
    ...
    ; Comienzo de p
    L0:
    ...
    ; Fin de p
    LVP
Grammar:
    %union {
      ...
      LISTA cod;
    }
    ...
    %type <cod> bloque_instrucciones
    %type <cod> declaracion_acciones

    programa: tPROGRAMA tIDENTIFICADOR ';'
        declaracion_variables
        declaracion_acciones
        bloque_instrucciones
        {
          char *lenp = newlabel(), msg[100];

          $$ = code (ENP, lenp);
          concatenar (&($$), $5);
          sprintf (msg, "Comienzo de %s", ...);
          comment (&($$), msg);
          label (&($$), lenp);
          concatenar (&($$), $6);
          sprintf (msg, "Fin de %s", ...);
          comment (&($$), msg);
          emit (&($$), LVP);
          dumpcode ($$, fich_sal);
        }


4/6. Expressions and assignment
• About operands:
– Determine their location
– Carry out implicit conversions
• About operators:
– Observe their precedence
– Observe their associativity
– Observe their order of evaluation (if defined)
– Determine the correct interpretation (if overloaded)
• Generated instructions
– Access to data
» Simple variables
» Records
» Arrays
– Data manipulation
» Conversions
» Operations
– Control flow
» Subrange validations
» Complex operators
    x[i, j] := a*b + c*d - e*f + 10;


Some complications
• Order of evaluation of operands:
    a[i] = i++;
  Is the value of the index the current or the prior value of i?
    FRAMES[pop()] = pop();
  ASG or ASGI?
  In C, neither addition nor assignment has a predefined order of evaluation of its operands.
• Method to evaluate operators:
    push(pop() - pop());
  SBT correctly implemented? Is this relevant for PLUS?
    push (pop() and pop());
  AND correctly implemented in PASCAL?
    push (pop() && pop());
  AND correctly implemented in C?
  In C, operator && is evaluated by short circuiting and thus from left to right.
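Because C does not fix the order in which the two pop() calls in push(pop() - pop()) are evaluated, one safe way to write SBT in an interpreter is to pop into named temporaries. The tiny stack below is a hypothetical sketch with no overflow or underflow checks, not the course's virtual machine.

```c
/* Minimal operand stack, assumed for this sketch only. */
static int stack[64];
static int sp = 0;

void push(int v) { stack[sp++] = v; }
int pop(void)    { return stack[--sp]; }

/* SBT: the right operand was pushed last, so it is on top of the stack.
 * Each pop is a separate statement, so the order no longer depends
 * on how the compiler evaluates the operands of '-'. */
void sbt(void)
{
    int right = pop();
    int left = pop();
    push(left - right);
}
```

After push(10); push(3); sbt(); the stack holds 7 regardless of the compiler. For PLUS the result is the same in either order, which is why the question on the slide only matters for non-commutative operators.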


Operands: access to names (sec)
• The essential information must be obtained from the symbol table.
    factor : TIDENT
        {
          generar (SRF, ?, ?);
          generar (DRF);
          /* parameter? */
        }
      | TIDENT
        {
          generar (SRF, ?, ?);
          /* parameter? */
        }
        '[' expresion ']'
        {
          /* size? */
          generar (DRF);
        }
      | TIDENT '(' args ')'
        { generar (OSF, ?, ?, ?); }
      .....
      ;

    constante : TTRUE
        { generar (STC, 1); }
      | TFALSE
        { generar (STC, 0); }
      | TENTERO
        { generar (STC, $1); }
      | TCARACTER
        { generar (STC, $1); }
      ;


Operands: access to names (AST)
• The essential information must be obtained from the symbol table.
    constante : TTRUE
        { $$.cod = code (STC, 1); }
      | TFALSE
        { $$.cod = code (STC, 0); }
      | TENTERO
        { $$.cod = code (STC, $1); }
      | TCARACTER
        { $$.cod = code (STC, $1); }
      ;

    factor : TIDENT
        {
          $$.cod = code (SRF, ?, ?);
          emit (&($$.cod), DRF);
          /* parameter? */
        }
      | TIDENT '[' expresion ']'
        {
          $$.cod = code (SRF, ?, ?);
          /* parameter? */
          concatenar (&($$.cod), $3.cod);
          /* size? */
          emit (&($$.cod), DRF);
        }
      | TIDENT '(' args ')'
        {
          $$.cod = $3;
          emit (&($$.cod), OSF, ?, ?, ?);
        }
      .....
      ;


Arrays
• Components stored contiguously:
    V[1] V[2] V[3] V[4] V[5]
• For arrays of dimensions defined during compilation, displacement of V[i]:
    (i – lower_bound) x size
• In C:
    i x size
• Reduced to pointer arithmetic:
    v[i] is equivalent to *(v + i)
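Both formulas are short enough to check with a few lines of C. The function names are made up for this sketch, and the displacement is counted in the same units as size.

```c
/* Displacement of V[i] for an array whose first index is lower_bound. */
int displacement(int i, int lower_bound, int size)
{
    return (i - lower_bound) * size;
}

/* In C the lower bound is always 0 and the compiler scales by the element
 * size itself, so indexing reduces to pointer arithmetic. */
int element_at(const int *v, int i)
{
    return *(v + i);   /* same element as v[i] */
}
```

For var v : array[4..8] of integer with size 1, displacement(6, 4, 1) is 2: v[6] is the third component.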


Processing arrays
• Given
    var v : array[4..8] of integer;
    v[<expresión1>] := <expresión2>;

    ; n and o depend on the declaration
    SRF n o
    ; code for expresión1
    ...
    ; lower bound
    STC 4
    SBT
    ; size (superfluous in this case)
    STC 1
    TMS
    PLUS
    ; code for expresión2
    ...
    ASG
So the upper limit is not used?


Processing of arrays
• Given
    var v : array[-3..5] of boolean;
    if v[<expresión>] then ...;

    ; n and o depend on the declaration
    SRF n o
    ; code for expresión
    ...
    ????                  What should be done here?
    STC -3
    GTE
    JMF ... ; error
    STC 5
    LTE
    JMF ... ; error
    ; lower bound
    STC -3
    SBT
    ; size
    STC 1
    TMS
    PLUS
    DRF
    JMF ...


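Contiguous row-wise arrays, v1
    entero v[2..5,3..7,1..8];
    ....
    v[2,3,1] := ...;
    v[5,7,8] := ...;
    v[3,4,5] := ...;
Elements per dimension: 4, 5, 8
Multipliers: 40, 8, 1
Constant part (the same sum for the lower bounds 2, 3, 1): 105
address?
    element     sum of index x multiplier    minus 105
    v[2,3,1]    105                          0
    v[5,7,8]    264                          159
    v[3,4,5]    157                          52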


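Alternatively, v2
    entero v[2..5,3..7,1..8];
    ....
    v[2,3,1] := ...;
    v[5,7,8] := ...;
    v[3,4,5] := ...;
Elements per dimension: 4, 5, 8
address? (constant part: 105)
    element     e1    e1 x 5 + e2    (e1 x 5 + e2) x 8 + e3    minus 105
    v[2,3,1]    2     13             105                       0
    v[5,7,8]    5     32             264                       159
    v[3,4,5]    3     19             157                       52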
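The two address computations can be compared with a short C sketch. It assumes row-wise storage and an element size of 1; the Bounds type and both function names are hypothetical. The first function follows v1 (one multiplier per dimension), the second follows v2 (nested multiplications by the number of elements of each dimension).

```c
#include <stddef.h>

typedef struct { int lo, hi; } Bounds;   /* declared range of one dimension */

/* v1: sum of index x multiplier, minus the same sum for the lower bounds.
 * The multiplier of the last dimension is 1; each earlier one is the
 * product of the sizes of the dimensions that follow it. */
int offset_multipliers(const Bounds *dim, const int *idx, size_t n)
{
    int multiplier = 1, sum = 0, constant = 0;
    for (size_t i = n; i-- > 0; ) {
        sum += idx[i] * multiplier;
        constant += dim[i].lo * multiplier;
        multiplier *= dim[i].hi - dim[i].lo + 1;
    }
    return sum - constant;
}

/* v2: c_1 = e_1, c_i = c_(i-1) x s_i + e_i, then subtract the constant part,
 * which is computed here with the same recurrence on the lower bounds. */
int offset_nested(const Bounds *dim, const int *idx, size_t n)
{
    int acc = idx[0], constant = dim[0].lo;
    for (size_t i = 1; i < n; i++) {
        int size = dim[i].hi - dim[i].lo + 1;
        acc = acc * size + idx[i];
        constant = constant * size + dim[i].lo;
    }
    return acc - constant;
}
```

For entero v[2..5,3..7,1..8] both functions give 0 for v[2,3,1], 159 for v[5,7,8] and 52 for v[3,4,5], matching the tables. In a compiler the constant part (105 here) would be computed once from the declaration rather than at every access.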


Arrays: code generation, v2
    factor : TIDENT
        {
          int nivel = ...;     /* syntactic level */
          int offset = ....;   /* address in the activation block (BA) */

          generar (SRF, nivel, offset);
          /* parameter by reference? */
        }
        '[' lista_indices ']'
        {
          int t = ...;   /* element size */
          int c = ...;   /* constant part */

          generar (STC, c);
          generar (SBT);
          generar (STC, t);
          generar (TMS);
          generar (PLUS);
          generar (DRF);
        }
      ;
[Figure labels: cn, dn, addr]


Arrays: code generation, v2
    %union { ..... int i; ..... }
    %type <i> lista_indices
    %%
    lista_indices: expresion
        {
          $$ = 1;
        }
      | lista_indices ','
        {
          int s_i, i;

          i = $1 + 1;
          s_i = ...;
          generar (STC, s_i);
          generar (TMS);
        }
        expresion
        {
          generar (PLUS);
          $$ = $1 + 1;
        }
      ;
Values computed on the stack:
    c_1 = e_1
    c_i = c_(i-1) x s_i + e_i


And why not...?
    %{ int i; %}
    %%
    lista_indices: expresion
        {
          i = 1;
        }
      | lista_indices ','
        {
          int s_i;

          i = i + 1;
          s_i = ...;
          generar (STC, s_i);
          generar (TMS);
        }
        expresion
        {
          generar (PLUS);
        }
      ;
Values computed on the stack:
    c_1 = e_1
    c_i = c_(i-1) x s_i + e_i
Consider:
    entero v[1..10], w[1..3, 1..4, 1..5, 1..6];
    ....
    w[v[1], v[2], v[3], v[4]] := ...;


Example: for v[3,4,5]
v1:
    ; v
    SRF n o
    ; e_1
    STC 3
    ; m_1
    STC 40
    TMS
    PLUS
    ; e_2
    STC 4
    ; m_2
    STC 8
    TMS
    PLUS
    ; e_3
    STC 5
    ; m_3
    STC 1
    TMS
    PLUS
    ; end of indices
    ; c
    STC 105
    SBT
v2:
    ; v
    SRF n o
    ; e_1
    STC 3
    ; s_2
    STC 5
    TMS
    ; e_2
    STC 4
    PLUS
    ; s_3
    STC 8
    TMS
    ; e_3
    STC 5
    PLUS
    ; end of indices
    ; c
    STC 105
    SBT
    PLUS
Which is more efficient?


Operators
• Recursive algorithm: starting at the root of the syntax tree:
For an n-ary operator:
1. Generate code to evaluate operands 1..n, saving results in temporary locations.
2. Generate code to evaluate the operator, using the operands saved in n locations and storing the result in another temporary location.
Example:
    a*b + c*d - e*f + 10
[Figure: syntax tree of the expression]
    a b * c d * + e f * - 10 +


Arithmetic operators (sec)
    %union { ... int instr; ... }
    %type <instr> op_ad op_mul
    %%
    expr_simple : termino
      | '+' termino
      | '-' termino
        { generar (NGI); }
      | expr_simple op_ad termino
        { generar ($2); }
      ;
    op_ad : '+' { $$ = PLUS; }
      | '-' { $$ = SBT; }
      ;
    termino : factor
      | termino op_mul factor
        { generar ($2); }
      ;
    op_mul : '*' { $$ = TMS; }
      | TDIV { $$ = DIV; }
      | TMOD { $$ = MOD; }
      ;
Logical operators are dealt with in a similar way, except....


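Short circuit: or else
• It implies operations for control flow:
    A or else B   ≡   if A then true else B
First scheme:
         A
         JMT T
         B
         JMP Fin
    T:   STC 1
    Fin:
Instructions executed (a and b are the instruction counts of A and B; T = true, F = false):
    A        T      T      F        F
    B        T      F      T        F
    Instr.   a+2    a+2    a+b+2    a+b+2
Second scheme:
         A
         JMF F
         STC 1
         JMP Fin
    F:   B
    Fin:
Instructions executed:
    A        T      T      F        F
    B        T      F      T        F
    Instr.   a+3    a+3    a+b+1    a+b+1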
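As an illustration of how a generator might emit the first scheme, the sketch below builds the code for A or else B from the already generated code of A and B. The function name, the L0, L1, ... label naming and the use of a text buffer are assumptions made for the example.

```c
#include <stdio.h>

static int label_count = 0;   /* hypothetical generator of fresh label numbers */

/* Writes the first scheme for "A or else B" into out:
 *      A ; JMT T ; B ; JMP Fin ; T: STC 1 ; Fin:
 * code_a and code_b are the listings already generated for A and B,
 * each ending with a newline. */
void gen_or_else(const char *code_a, const char *code_b, char *out, size_t size)
{
    int t = label_count++;
    int fin = label_count++;

    snprintf(out, size, "%sJMT L%d\n%sJMP L%d\nL%d: STC 1\nL%d:\n",
             code_a, t, code_b, fin, t, fin);
}
```

Fresh labels per call matter because or else expressions can be nested; with fixed label names T and Fin the nested code would clash.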


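Short circuit: and then
    A and then B   ≡   if A then B else false
First scheme:
         A
         JMF F
         B
         JMP Fin
    F:   STC 0
    Fin:
Instructions executed (to be filled in; T = true, F = false):
    A        T      T      F      F
    B        T      F      T      F
    Instr.
Second scheme:
         A
         JMT T
         STC 0
         JMP Fin
    T:   B
    Fin:
Instructions executed (to be filled in):
    A        T      T      F      F
    B        T      F      T      F
    Instr.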


Next problem....
• What can we do with intermediate results? That depends on the type of intermediate code.
• Stack machines:
    a *1 b +1 c *2 d - e *3 f +2 10      (operators numbered for reference)
[Figure: syntax tree of the expression]
    ;variable A
    SRF 0 3
    DRF
    ;variable B
    SRF 0 4
    DRF
    ;*1: A * B
    TMS
    ;variable C
    SRF 0 5
    DRF
    ;variable D
    SRF 0 6
    DRF
    ;*2: C * D
    TMS
    ;+1: (A*B)+(C*D)
    PLUS
    ;variable E
    SRF 0 7
    DRF
    ;variable F
    SRF 0 8
    DRF
    ;*3: E * F
    TMS
    ;-: ((A*B)+(C*D))-(E*F)
    SBT
    STC 10
    ;+2: (((A*B)+(C*D))-(E*F))+10
    PLUS


TAC: Temporary variables
• An unlimited number of temporary variables is assumed to be available
    a *1 b +1 c *2 d - e *3 f +2 10

    t0 := A * B;
    t1 := C * D;
    t2 := t0 + t1;
    t3 := E * F;
    t4 := t2 - t3;
    t5 := t4 + 10;
• When translating this code to a target architecture, registers will be used if possible, or else memory addresses: the register allocation problem.


Three Address Code
• A generator of temporary names is used:
    char *newtemp ()
    {
      static int c = 0;
      char *m = malloc (16);   /* room for 't', the digits of an int and '\0' */

      sprintf (m, "t%d", c++);
      return m;
    }
• Expressions have as an attribute the name of the variable in which they are stored.
    %{
    extern char *newtemp();
    %}
    %union {...char *place;...}
    %type <place> TIDENT expresion
    %type <place> expresion_simple termino factor


Three Address Code
    /* place is a char *, so names are passed on by pointer assignment;
       strcpy into an unallocated $$ would write through an invalid pointer */
    expresion : simple { $$ = $1; }
      ;
    simple : termino { $$ = $1; }
      | '+' termino { $$ = $2; }
      | '-' termino
        {
          $$ = newtemp();
          tac ("%s := -%s;\n", $$, $2);
        }
      | simple '+' termino
        {
          $$ = newtemp();
          tac ("%s := %s + %s;\n", $$, $1, $3);
        }
      | simple '-' termino
        {
          $$ = newtemp();
          tac ("%s := %s - %s;\n", $$, $1, $3);
        }
      ;
    factor : TIDENT { $$ = $1; }
      | '(' expresion ')' { $$ = $2; }
      | TENTERO { $$ = malloc (12); sprintf ($$, "%d", $1); }
      ;


Reuse of temporary variables
• Once a temporary variable appears as an operand, it is never used again.
    t0 := A * B;
    t1 := C * D;
    t2 := t0 + t1;
    t3 := E * F;
    t4 := t2 - t3;
    t5 := t4 + 10;
• When translating this intermediate code, registers will be used as much as possible (but their number is limited); otherwise, memory locations.
• Reduction in the number of temporaries required:
1. c = 0
2. To generate a new temporary name, use tc, and increment c by 1.
3. Whenever a temporary name appears as an operand, decrement c by 1.
Result (value of c initially and after each instruction: 0 1 2 1 2 1 1):
    t0 := A * B;
    t1 := C * D;
    t0 := t0 + t1;
    t1 := E * F;
    t0 := t0 - t1;
    t0 := t0 + 10;
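The three rules for reducing the number of temporaries fit in a few lines of C. In this sketch the counter, is_temp and emit_binary are hypothetical names, and a name counts as a temporary if it is 't' followed by a digit, which assumes no source variable is named that way.

```c
#include <stdio.h>
#include <string.h>

static int temp_counter = 0;          /* rule 1: c = 0 */

/* Assumption of this sketch: only temporaries look like t<digit>. */
static int is_temp(const char *name)
{
    return name[0] == 't' && name[1] >= '0' && name[1] <= '9';
}

/* Appends "dst := x op y;" to out and stores dst in place.
 * place may be the same buffer as x or y. */
void emit_binary(const char *x, char op, const char *y, char *place, char *out)
{
    char dst[16], line[64];

    if (is_temp(y)) temp_counter--;   /* rule 3: a temporary used as operand is released */
    if (is_temp(x)) temp_counter--;
    sprintf(dst, "t%d", temp_counter++);   /* rule 2: new name t<c>, then c := c + 1 */
    sprintf(line, "%s := %s %c %s;\n", dst, x, op, y);
    strcat(out, line);
    strcpy(place, dst);               /* copied last, since place may alias x or y */
}
```

Generating a*b + c*d - e*f + 10 with this function yields the six instructions of the reduced listing, using only t0 and t1.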


5/6. Control Flow
• Without considerations of efficiency, code generation is relatively simple:
• Example 1:
    repetir
      <instr>
    hasta <exp>

    INSTR:
    ; <instr>
    ; <exp>
    JMF INSTR

    repetir: tREPETIR
        {
          $<instr>$ = nueva_etiqueta();
          etiqueta ($<instr>$);
        }
        lista_instrucciones
        tHASTA_QUE expresion
        {
          generar (JMF, $<instr>2);
        }
      ;


Selection (sec)
    si <exp>
    ent <instr1>
    si_no <instr2>
    fsi

    ; <exp>
    JMF SINO
    ; <instr1>
    JMP FIN
    SINO:
    ; <instr2>
    FIN:

    seleccion: tSI expresion
        {
          $<sino>$ = nueva_etiqueta();
          generar (JMF, $<sino>$);
        }
        tENT lista_instrucciones
        {
          $<fin>$ = nueva_etiqueta();
          generar (JMP, $<fin>$);
          etiqueta ($<sino>3);
        }
        resto_seleccion tFSI
        {
          etiqueta ($<fin>6);
        }
      ;
    resto_seleccion:
      | tSI_NO lista_instrucciones
      ;


Optimality
• Code generation might not be optimal:
    si i = 1
    ent i := i + 1;
    fsi

    ; exp
    SRF n o
    DRF
    STC 1
    EQ
    JMF L0
    ; l1
    SRF n o
    SRF n o
    DRF
    STC 1
    PLUS
    ASG
    JMP L1
    L0:
    L1:


Selection (AST)
    seleccion:
        tSI expresion
        tENT lista_instrucciones
        resto_seleccion tFSI
        {
          char *lsino = newlabel(),
               *lfin = newlabel();

          $$ = $2.cod;
          emit (&($$), JMF, lsino);
          concatenar (&($$), $4);
          if (longitud_lista($5)) {
            emit (&($$), JMP, lfin);
            label (&($$), lsino);
            concatenar (&($$), $5);
            label (&($$), lfin);
          }
          else label (&($$), lsino);
        }
      ;
    resto_seleccion: { $$ = newcode(); }
      | tSI_NO lista_instrucciones
        { $$ = $2; }
      ;
Generated code without si_no:
    ; <exp>
    JMF SINO
    ; <instr1>
    SINO:
Generated code with si_no:
    ; <exp>
    JMF SINO
    ; <instr1>
    JMP FIN
    SINO:
    ; <instr2>
    FIN:


Multiple selection
    caso <exp>
      <exp1> : <instr1> ;
      ...
      <expn> : <instrn> ;
      dlc <instr>
    fcaso

    ; caso
    ; exp
    ; exp1
    EQ
    JMF EXP2
    ; instr1
    JMP FIN
    EXP2:
    ...
    EXPn:
    ; expn
    EQ
    JMF DLC
    ; instrn
    JMP FIN
    DLC:
    ; dlc
    ; instr
    FIN:
problems?


Multiple selection
Equivalent to:
    si <exp> = <exp1>
    ent <instr1>
    si_no si <exp> = <exp2>
      ent <instr2>
      ....
      si_no si <exp> = <expn>
        ent <instrn>
        si_no <instr>
      fsi
    ...
    fsi

    ; caso
    ; exp
    DUP
    ; exp1
    EQ
    JMF EXP2
    ; instr1
    JMP FIN
    EXP2:
    ...
    EXPn:
    DUP
    ; expn
    EQ
    JMF DLC
    ; instrn
    JMP FIN
    DLC:
    ; dlc
    ; instr
    FIN: POP


While (sec)
    mq <exp> hacer
      <instr>
    fmq

    EXP:
    ; <exp>
    JMF FIN
    ; <instr>
    JMP EXP
    FIN:

    mientras_que: tMQ
        {
          $<exp>$ = nueva_etiqueta();
          etiqueta ($<exp>$);
        }
        expresion
        {
          $<fin>$ = nueva_etiqueta();
          generar (JMF, $<fin>$);
        }
        instr tFMQ
        {
          generar (JMP, $<exp>2);
          etiqueta ($<fin>4);
        }
      ;


While (AST)

mq <exp> hacer
    <bloque>
fmq

mientras_que:
    tMQ expresion bloque tFMQ
    {
        char *lcond = newlabel(), *lbloque = newlabel();

        $$ = code (JMP, lcond);
        label (&($$), lbloque);
        concatenar (&($$), $3);
        label (&($$), lcond);
        concatenar (&($$), $2);
        emit (&($$), JMT, lbloque);
    }

    JMP COND
BLOQUE:
    ; <bloque>
COND:
    ; <exp>
    JMT BLOQUE
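
A minimal C sketch of this test-at-the-bottom layout, assuming the condition and block code are already available as text fragments. The name gen_while, the buffer interface, and the L<n> label numbering are hypothetical and stand in for the code-list functions used on the slide.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Append text to buf, checking that it fits in cap bytes. */
static void append_code(char *buf, size_t cap, const char *text)
{
    assert(strlen(buf) + strlen(text) + 1 <= cap);
    strcat(buf, text);
}

/*
 * Hypothetical generator for `mq <exp> hacer <bloque> fmq`.
 * cond and body are already-generated fragments, each ending in '\n'.
 * first_label numbers the two fresh labels (an illustrative scheme).
 */
void gen_while(char *buf, size_t cap, const char *cond,
               const char *body, int first_label)
{
    char line[32];
    int lcond = first_label;
    int lbloque = first_label + 1;

    buf[0] = '\0';
    snprintf(line, sizeof line, "JMP L%d\n", lcond);    /* enter at the test */
    append_code(buf, cap, line);
    snprintf(line, sizeof line, "L%d:\n", lbloque);
    append_code(buf, cap, line);
    append_code(buf, cap, body);
    snprintf(line, sizeof line, "L%d:\n", lcond);
    append_code(buf, cap, line);
    append_code(buf, cap, cond);
    snprintf(line, sizeof line, "JMT L%d\n", lbloque);  /* repeat while true */
    append_code(buf, cap, line);
}
```

Because the whole code of the condition and the block is available before anything is emitted, the condition can be placed after the block, so each iteration executes a single jump (JMT) instead of the JMF plus JMP pair of the sequential version.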


6/6. Procedures and Functions

accion q (val entero i; ref booleano t);
entero j;
booleano f;

    accion r (ref entero j);
    entero i;
    principio
        q (i, t);
        r (j)
    fin

principio
    q (j, f);
    r (i)
fin

Declarations:
• Retrieve the arguments
• Generate the code for the procedure/function
• Return the resulting value (functions)

Calls:
• Stack the arguments
• Execute the procedure call


Arguments
• Arguments are transmitted through the stack

accion p(val entero i; ref booleano k; val caracter c);
..........
p (a,b,d);

    ; stack A
    ; stack B
    ; stack D
    ; call P
    OSF s l a

    ; retrieve C
    ; retrieve K
    ; retrieve I
    JMP P
    ...
P:
    ; action P
    ...
    CSF


Procedure calls
• When the expressions corresponding to the arguments are evaluated, their values are left on the stack.
• When creating the activation record (AR) of the procedure, the size of the current AR must be preserved (s).
• The change of scope is the difference between the current scope level and the level of the invoked procedure (l).
• The procedure/function address is determined when the symbol is entered in the symbol table (a).

p (n+1, b, entacar(c+1));

    SRF n_n o_n
    DRF
    STC 1
    PLUS
    SRF n_b o_b
    DRF
    SRF n_c o_c
    DRF
    STC 1
    PLUS
    OSF s l a

(n_x and o_x: level and offset of variable x.)


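Procedure calls
• ref parameters require the address of the referenced variable, but expresion generates the code that obtains the value of the variable.
• For a ref argument, eliminate the last instruction generated by expresion (the DRF).

Code generated by expresion for each argument:

    SRF n_n o_n     ; n+1
    DRF
    STC 1
    PLUS
    SRF n_b o_b     ; b
    DRF
    SRF n_c o_c     ; entacar(c+1)
    DRF
    STC 1
    PLUS
    OSF s l a

After eliminating the DRF of the ref argument b:

    SRF n_n o_n     ; n+1
    DRF
    STC 1
    PLUS
    SRF n_b o_b     ; b
    SRF n_c o_c     ; entacar(c+1)
    DRF
    STC 1
    PLUS
    OSF s l a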
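
The rule for ref arguments can be sketched in C as a check on the instruction list produced for one argument. This is an assumption-laden illustration: ref_argument_length is a hypothetical helper, and representing instructions as strings is only for readability. A real compiler would more likely decide in the semantic phase whether the argument is a variable.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: code[0..n-1] holds the instructions that
 * `expresion` generated for one argument.
 * Returns how many instructions to keep when the argument is passed
 * by reference (everything except the final DRF), or -1 when the code
 * does not end in DRF, i.e. the argument is not a variable access and
 * has no address to pass.
 */
int ref_argument_length(const char *const code[], int n)
{
    assert(n >= 0);
    if (n == 0 || strcmp(code[n - 1], "DRF") != 0)
        return -1;
    return n - 1;
}
```

For the argument b, whose code is SRF followed by DRF, the helper returns 1 and only the SRF is kept. For n+1, whose code ends in PLUS, it returns -1, which a compiler would report as an error for a ref parameter.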


Procedure declaration
• Skip over the code of any local procedures and functions

accion q (...);

    accion r (...);
    principio
        ...
    fin

principio
    ...
fin

With jumps over the code of local procedures:

    ; action Q
    ; retrieve args Q
    JMP Q
    ; action R
    ; retrieve args R
    JMP R
R:
    ; code of R
    CSF
Q:
    ; code of Q
    CSF

With the code of R emitted in full before that of Q, no jumps are needed:

    ; action R
    ; retrieve args R
R:
    ; code of R
    CSF
    ; action Q
    ; retrieve args Q
Q:
    ; code of Q
    CSF


Retrieving arguments
• Value/reference parameters are treated separately:
  – All are retrieved at the beginning; some are values, others addresses
  – None is returned at the end
• Copy parameters are treated as local variables:
  – All are retrieved at the beginning
  – Those that are output (O) or input/output (I/O) are returned at the end
• Arguments of type array may require all components to be stacked and then retrieved.

    ; retrieve C
    SRF 0 5
    ASGI
    ; retrieve K
    SRF 0 4
    ASGI
    ; retrieve I
    SRF 0 3
    ASGI
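
The retrieval prologue can be sketched in C as a loop over the parameters in reverse declaration order, since the last argument stacked is the first one found on top of the stack. The function gen_retrieve_args and its interface are hypothetical. The sketch also assumes that every parameter occupies a single cell at consecutive offsets, which would not hold for array parameters.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: names[0..n-1] are the parameter names in
 * declaration order; first_offset is the offset of the first parameter
 * in the activation record (3 in the example above).
 * Writes one "SRF 0 <offset> / ASGI" pair per parameter into buf,
 * starting with the last parameter.
 */
void gen_retrieve_args(char *buf, size_t cap, const char *const names[],
                       int n, int first_offset)
{
    char line[64];

    buf[0] = '\0';
    for (int i = n - 1; i >= 0; i--) {
        int len = snprintf(line, sizeof line,
                           "; retrieve %s\nSRF 0 %d\nASGI\n",
                           names[i], first_offset + i);
        assert(len > 0 && (size_t)len < sizeof line);   /* not truncated */
        assert(strlen(buf) + (size_t)len + 1 <= cap);   /* fits in buf */
        strcat(buf, line);
    }
}
```

With names I, K, C and first_offset 3, the output is the prologue shown above: C at offset 5, then K at 4, then I at 3.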


Use of parameters
1. Value of a reference parameter
2. Address of a reference parameter
3. Value of a value parameter and of a reference parameter
4. Value parameter used as argument to a reference parameter
5. Reference parameter used as argument to a reference parameter
6. Variables used as arguments to value and reference parameters respectively

programa p;
entero i, j;

accion q(ref entero m);
principio
    escribir (m);        (1)
    m := 0               (2)
fin

accion r(val entero k; ref entero l);
principio
    escribir (k, l);     (3)
    l := 0;
    q (k);               (4)
    q (l);               (5)
fin

principio
    r (i, j);            (6)
fin


Code that should be generated

    ENP L0
; action Q
    SRF 0 3      ; retrieve M
    ASGI
    JMP L1
L1: SRF 0 3      ; M            (1)
    DRF
    DRF
    WRT 1
    SRF 0 3      ; M            (2)
    DRF
    STC 0
    ASG
    CSF
; action R
    SRF 0 4      ; retrieve L
    ASGI
    SRF 0 3      ; retrieve K
    ASGI
    JMP L2
L2: SRF 0 3      ; K            (3)
    DRF
    WRT 1
    SRF 0 4      ; L            (3)
    DRF
    DRF
    WRT 1
    SRF 0 4      ; L
    DRF
    STC 0
    ASG
    SRF 0 3      ; K            (4)
    OSF 5 1 1    ; Q
    SRF 0 4      ; L            (5)
    DRF
    OSF 5 1 1    ; Q
    CSF
; main P
L0: SRF 0 3      ; I            (6)
    DRF
    SRF 0 4      ; J
    OSF 5 0 13   ; R
    LVP


And for functions...
• C version:

funcion mcd(val entero a,b) dev entero;
entero r;
principio
    r := a mod b;
    mq r <> 0
        a := b;
        b := r;
        r := a mod b
    fmq;
    dev (b);
fin

    ; retrieve parameters
    SRF 0 4      ; B
    ASGI
    SRF 0 3      ; A
    ASGI
    JMP MCD
    ; code of mcd
MCD:
    ...
    ...
    ; leave result in stack
    SRF 0 4
    DRF
    CSF
    CSF


And for functions...
• Pascal version:

funcion mcd(val entero a, b) dev entero;
entero r;
principio
    r := a mod b;
    mq r <> 0
        a := b;
        b := r;
        r := a mod b
    fmq;
    mcd := b;
fin

    ; retrieve parameters
    SRF 0 5      ; B
    ASGI
    SRF 0 4      ; A
    ASGI
    JMP MCD
    ; code of mcd
MCD:
    ...
    ...
    SRF 0 3      ; MCD
    SRF 0 5      ; B
    DRF
    ASG
    ; leave result in stack
    SRF 0 3      ; MCD
    DRF
    CSF