Chapter 1: Language Processor

Page 1: Chapter 1

Chapter 1

Language Processor

Page 2: Chapter 1

Introduction

• Semantic gap: solved by a PL through
	– Design and coding
	– PL implementation steps
• This introduces a new PL domain, which splits the semantic gap into two:
	– Specification gap: the semantic gap between two specifications of the same task.
	– Execution gap: the gap between the semantics of programs written in different programming languages.

[Diagram] The semantic gap lies between the application domain and the execution domain. Introducing a PL domain splits it: the specification gap lies between the application domain and the PL domain, and the execution gap lies between the PL domain and the execution domain.

Page 3: Chapter 1

Language Processor

• Definition: A language processor (LP) is software that bridges a specification or execution gap.
• Parts of an LP:
	– Language translator: bridges an execution gap (e.g., compiler, assembler)
	– Detranslator
	– Preprocessor
	– Language migrator
• Interpreter: a language processor that bridges an execution gap without generating a machine-language program.

Page 4: Chapter 1

• Problem-oriented languages
	– Less specification gap, more execution gap
• Procedure-oriented languages
	– More specification gap, less execution gap

Page 5: Chapter 1

Language processing activities
• Program generation activity
• Program execution activity:
	– Translation and interpretation

[Diagram] A program generator bridges the specification gap between the application domain and the program generator domain; the generated program in the target PL domain is then run in the execution domain.

Page 6: Chapter 1

• Program translation
	– Translates a program from the source language to machine language.
• Characteristics:
	– A program must be translated before it can be executed.
	– The translated program may be saved in a file; the saved program may be executed repeatedly.
	– A program must be retranslated following modifications.
• Program interpretation:
	– Reads the source program and stores it in memory.
	– Determines its meaning and performs the corresponding actions.

Page 7: Chapter 1

Program Interpretation and Execution

• Program execution
	– Fetch the instruction
	– Decode the instruction to determine the operation
	– Execute the instruction
• Program interpretation
	– Fetch the statement
	– Analyze the statement to determine its meaning
	– Execute the statement
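The fetch/analyze/execute cycle described above can be sketched as a small loop. The three-statement mini-language and all names here are hypothetical, chosen only for illustration (Python used as the sketch language):

```python
# Sketch of an interpreter's fetch-analyze-execute cycle over a tiny,
# made-up statement language; no machine code is ever generated.

def interpret(program):
    env = {}                      # runtime state held by the interpreter
    for stmt in program:          # fetch the next statement
        op = stmt[0]              # analyze: determine its meaning
        if op == "set":           # execute the statement
            _, name, value = stmt
            env[name] = value
        elif op == "add":
            _, name, a, b = stmt
            env[name] = env[a] + env[b]
        elif op == "print":
            print(env[stmt[1]])
    return env

env = interpret([("set", "a", 2), ("set", "b", 3), ("add", "c", "a", "b")])
```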

Page 8: Chapter 1

Comparison

• ?????

Page 9: Chapter 1

Fundamentals of language processing

• LP = Analysis of source program (SP) + Synthesis of target program (TP)
• Analysis of SP
	– Lexical rules: define valid lexical units
	– Syntax rules: define the formation of valid statements
	– Semantic rules: associate meaning with valid statements

Page 10: Chapter 1

Phases of LP

• Forward Reference: A forward reference of a program entity is a reference to the entity which precedes its definition in the program.

Ex.
	struct s { struct t *pt; };
	...
	struct t { struct s *ps; };

• Issues concerning memory requirements and organization of LP.

[Diagram] Source program → Analysis phase → IR → Synthesis phase → Target program; both phases report errors.

Page 11: Chapter 1

Passes of LP

• Language processor pass: A language processor pass is the processing of every statement in a source program, or its equivalent representation, to perform a language processing function.
	– Pass I: performs analysis of the SP.
	– Pass II: performs synthesis of the TP.

Page 12: Chapter 1

Intermediate Representation of Programs

• Intermediate Representation: An Intermediate representation(IR) is a representation of a source program which reflects the effect of some, but not all, analysis and synthesis tasks performed during language processing.

[Diagram] SP → Front end → Intermediate representation (IR) → Back end → TP

Page 13: Chapter 1

IR and Semantic actions

• Properties of an IR
	– Ease of use
	– Processing efficiency
	– Memory efficiency: compact
• Semantic actions:
	– All actions performed by the front end except lexical and syntax analysis are called semantic actions. These include:
		• Checking semantic validity
		• Determining the meaning
		• Constructing the IR

Page 14: Chapter 1

Toy Compiler

• gcc or cc compiler: C or C++
• Toy compiler: ???
	– Front end
		• Lexical analysis
		• Syntax analysis
		• Semantic analysis
	– Back end
		• Memory allocation
		• Code generation

Symbol table generated by the front end:

Symbol | Type  | Length | Address
a      | int   |        |
b      | float |        |
temp   | int   |        |

Page 15: Chapter 1

Front End

• Lexical analysis (scanning)
	– Ex. a := b + i; → id#2 op#5 id#3 op#3 id#1 op#10
• Syntax analysis (parsing)
	a, b : real;
	a := b + i;
• Semantic analysis: the IC tree is generated (e.g., a tree with root real and children a, b for the declaration).

Page 16: Chapter 1

Back End

• Memory allocation: recorded in the symbol table below.
• Code generation: generating assembly language.
	– Issues:
		• Determine where the IR values should be kept.
		• Determine which instructions should be used for type conversion.
		• Determine which addressing mode should be used for accessing variables.

Symbol | Type  | Length | Address
a      | int   | 2      | 2000
b      | float | 4      | 2002
temp   | int   | 2      | 2006

Page 17: Chapter 1

Fundamentals of Language Specification

• Programming language grammars:
	– Terminal symbols
		• Lowercase letters, punctuation marks, null
		• Concatenation (.)
	– Nonterminal symbols: names of syntax categories of the language
	– Productions: also called rewriting rules; a production is a rule of the grammar of the form NT = string of terminals and nonterminals.
• Production examples:

	<Article> = a/an/the
	<Noun> = boy/apple
	<Noun phrase> = <Article><Noun>

Page 18: Chapter 1

Grammar

• Def: A grammar G of a language L(G) is a quadruple (∑, SNT, S, P) where
	– ∑ is the set of terminals
	– SNT is the set of nonterminals
	– S is the distinguished (start) symbol
	– P is the set of productions
• Ex: Derive the sentence "a boy ate an apple"
	– <Sentence> = <Noun phrase> <Verb phrase>
	– <Noun phrase> = <Article> <Noun>
	– <Verb phrase> = <Verb> <Noun phrase>
	– <Article> = a/an/the
	– <Noun> = boy/apple
	– <Verb> = ate
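The derivation above can be reproduced mechanically by repeatedly rewriting the leftmost occurrence of a chosen nonterminal. A minimal sketch (Python; the list-of-symbols representation is an assumption of this sketch, not part of the slides):

```python
# Leftmost derivation of "a boy ate an apple" using the grammar above.
grammar = {
    "<sentence>": [["<noun phrase>", "<verb phrase>"]],
    "<noun phrase>": [["<article>", "<noun>"]],
    "<verb phrase>": [["<verb>", "<noun phrase>"]],
    "<article>": [["a"], ["an"], ["the"]],
    "<noun>": [["boy"], ["apple"]],
    "<verb>": [["ate"]],
}

def expand(form, nonterminal, choice):
    """Replace the leftmost occurrence of `nonterminal` using production `choice`."""
    i = form.index(nonterminal)
    return form[:i] + grammar[nonterminal][choice] + form[i + 1:]

form = ["<sentence>"]
for nt, choice in [("<sentence>", 0), ("<noun phrase>", 0), ("<article>", 0),
                   ("<noun>", 0), ("<verb phrase>", 0), ("<verb>", 0),
                   ("<noun phrase>", 0), ("<article>", 1), ("<noun>", 1)]:
    form = expand(form, nt, choice)
print(" ".join(form))   # a boy ate an apple
```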

Page 19: Chapter 1

Grammar

• Derive a + b * c / 5 and construct the parse tree (top-down):
	– <exp> = <exp> + <term> | <term>
	– <term> = <term> * <factor> | <factor>
	– <factor> = <factor> / <number> | <number>
	– <number> = 0/1/2/3/…/9
• Classification of grammars:
	– Type 0: phrase structure grammar
	– Type 1: context-sensitive grammar
	– Type 2: context-free grammar
	– Type 3: linear grammar or regular grammar

Page 20: Chapter 1

Binding

• Definition: A binding is the association of an attribute of a program entity with a value.
	– Static binding: a binding performed before the execution of a program begins.
	– Dynamic binding: a binding performed after the execution of a program begins.

Page 21: Chapter 1

Chapter - 3

Scanning and

Parsing

Unit-2

Page 22: Chapter 1

Role of lexical Analyzer

Page 23: Chapter 1

Scanning

• Definition: Scanning is the process of recognizing the lexical components in a source string.
• Described by Type-3 grammars (regular grammars).
• Regular grammars are used to identify tokens such as identifiers.
• Regular languages are built from the operations alternation (or), concatenation, and Kleene closure (*).
• Ex. Write a regular expression for strings ending with abb:
	– (a+b)*abb
• Ex. Write a regular expression that recognizes identifiers:
	– R.E. = (letter)(letter/digit)*
	– digit = 0/1/2/…/9
	– letter = a/b/c/…/z
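These token definitions translate directly into library regular expressions. A small sketch using Python's re module (character classes stand in for letter and digit; lowercase letters only, as on the slide):

```python
import re

# letter(letter|digit)* and (a+b)*abb written as Python regular expressions.
identifier = re.compile(r"[a-z][a-z0-9]*\Z")   # \Z anchors the match at end of string
ends_abb   = re.compile(r"[ab]*abb\Z")         # (a+b)*abb

assert identifier.match("count2")
assert not identifier.match("2count")
assert ends_abb.match("ababb")
```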

Page 24: Chapter 1

Regular expression | Meaning
r                  | string r
s                  | string s
r.s or rs          | concatenation of r and s
(r)                | same meaning as r
r/s or (r)/(s)     | alternation (r or s)
[r]                | an optional occurrence of r
(r)*               | 0 or more occurrences of string r
(r)+               | 1 or more occurrences of string r

Page 25: Chapter 1

Examples of regular expressions
• Integer: [+/-](d)+
• Real: [+/-](d)+.(d)+
• Real with optional fraction: [+/-](d)+.(d)*
• Identifier: l(l/d)*

Page 26: Chapter 1

Example of Regular expression

• Strings ending with 0: (0+1)*0
• Strings ending with 11: (0+1)*11
• Strings with an EVEN number of 0s and an ODD number of 1s: (0+1)*(01*01*)*11*0*
• The language of all strings containing exactly two 0s: 1*01*01*
• The language of all strings that do not end with 01: ε + 1 + (0+1)*0 + (0+1)*11

Page 27: Chapter 1

Finite state automaton

• FSA: a triple (S, ∑, T) where S is a finite set of states, ∑ is the alphabet of source symbols, and T is a finite set of state transitions.
• FSAs are classified into DFAs (deterministic) and NFAs (nondeterministic).

Page 28: Chapter 1

DFA from Regular Expression

Construct DFAs for:
• (0+1)*0
• (0+1)*(1+00)
• (0+1)*
• (11+10)*

Page 29: Chapter 1

Transition table from DFA

Transition table for (0+1)*0:

States/input | 0  | 1
q0           | q1 | q0
q1           | q1 | q0

Similarly, construct tables for (0+1)*(1+00), (0+1)*, and (11+10)*.
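The transition table for (0+1)*0 can be run directly as a table-driven simulator. A minimal sketch (Python; state names follow the table above, with q1, the state reached after reading a 0, as the accepting state):

```python
# Table-driven simulation of the DFA for (0+1)*0.
table = {
    ("q0", "0"): "q1", ("q0", "1"): "q0",
    ("q1", "0"): "q1", ("q1", "1"): "q0",
}

def accepts(string, start="q0", accepting=("q1",)):
    state = start
    for ch in string:
        state = table[(state, ch)]   # one table lookup per input symbol
    return state in accepting

assert accepts("1100")       # ends with 0: accepted
assert not accepts("101")    # ends with 1: rejected
```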

Page 30: Chapter 1

DFA and its transition diagram

Check whether the given string aabab is accepted.

Page 31: Chapter 1

Types of Parser

• Top-down parser
	– Backtracking parser
	– Predictive parser
• Bottom-up parser
	– Shift-reduce parser
	– LR parser: SLR, LR, LALR

Page 32: Chapter 1

Example

• Expression grammar (with precedence)
• Input string: x - 2 * y

# | Production rule
1 | expr → expr + term
2 | expr → expr - term
3 | expr → term
4 | term → term * factor
5 | term → term / factor
6 | term → factor
7 | factor → number
8 | factor → identifier

Page 33: Chapter 1

Example

Rule | Sentential form | Input string
-    | expr            | ↑x - 2 * y
1    | expr + term     | ↑x - 2 * y
3    | term + term     | ↑x - 2 * y
6    | factor + term   | ↑x - 2 * y
8    | <id> + term     | ↑x - 2 * y
-    | <id,x> + term   | x ↑- 2 * y

(↑ marks the current position in the input stream.)

• Problem:
	– Can't match the next terminal (the input has -, the tree has +)
	– We guessed wrong at the second step, when we chose expr → expr + term

Page 34: Chapter 1

Backtracking

• Roll back the productions
• Choose a different production for expr
• Continue

Rule | Sentential form | Input string
-    | expr            | ↑x - 2 * y
1    | expr + term     | ↑x - 2 * y    ← undo
3    | term + term     | ↑x - 2 * y    ← undo
6    | factor + term   | ↑x - 2 * y    ← undo
8    | <id> + term     | ↑x - 2 * y    ← undo
?    | <id,x> + term   | x ↑- 2 * y    ← undo

Page 35: Chapter 1

Retrying

Rule | Sentential form  | Input string
-    | expr             | ↑x - 2 * y
2    | expr - term      | ↑x - 2 * y
3    | term - term      | ↑x - 2 * y
6    | factor - term    | ↑x - 2 * y
8    | <id> - term      | ↑x - 2 * y
-    | <id,x> - term    | x - ↑2 * y
6    | <id,x> - factor  | x - ↑2 * y
7    | <id,x> - <num>   | x - ↑2 * y

• Problem:
	– <num> matches 2, but there is more input to read
	– Another cause of backtracking

Page 36: Chapter 1

Successful Parse

• All terminals match; we're finished

Rule | Sentential form           | Input string
-    | expr                      | ↑x - 2 * y
2    | expr - term               | ↑x - 2 * y
3    | term - term               | ↑x - 2 * y
6    | factor - term             | ↑x - 2 * y
8    | <id> - term               | ↑x - 2 * y
-    | <id,x> - term             | x - ↑2 * y
4    | <id,x> - term * fact      | x - ↑2 * y
6    | <id,x> - fact * fact      | x - ↑2 * y
7    | <id,x> - <num> * fact     | x - ↑2 * y
-    | <id,x> - <num,2> * fact   | x - 2 * ↑y
8    | <id,x> - <num,2> * <id>   | x - 2 * ↑y
-    | <id,x> - <num,2> * <id,y> | x - 2 * y ↑

Page 37: Chapter 1

Problems in Top-down Parsing

• Backtracking (as we have seen)
• Left recursion
• Left factoring

Page 38: Chapter 1

Left Recursion

• Problem: termination
	– The wrong choice leads to infinite expansion (more importantly, without consuming any input!)
	– May not be as obvious as this
	– Our grammar is left recursive

Rule | Sentential form                  | Input string
-    | expr                             | ↑x - 2 * y
1    | expr + term                      | ↑x - 2 * y
1    | expr + term + term               | ↑x - 2 * y
1    | expr + term + term + term        | ↑x - 2 * y
1    | expr + term + term + term + term | ↑x - 2 * y

Page 39: Chapter 1

Rules for Left Recursion

• If A → Aa1 | Aa2 | … | Aan | b1 | b2 | … | bm
• After removal of left recursion:

	A  → b1A' | b2A' | … | bmA'
	A' → a1A' | a2A' | … | anA' | є

• Ex. Apply this to:
	• A → Aa | Ab | c | d
	• A → Ac | Aad | bd | є
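The removal rule can be coded directly. A sketch for immediate left recursion only (Python; representing productions as symbol lists and epsilon as "e" are choices of this sketch, not part of the slides):

```python
# Sketch: removing immediate left recursion A -> Aa1|...|Aan|b1|...|bm.
def remove_left_recursion(name, productions):
    """Each production is a list of symbols; returns the rewritten grammar rules."""
    recursive = [p[1:] for p in productions if p and p[0] == name]   # the a_i parts
    others    = [p for p in productions if not p or p[0] != name]    # the b_j parts
    if not recursive:
        return {name: productions}          # nothing to do
    new = name + "'"
    return {
        name: [beta + [new] for beta in others],                 # A  -> b_j A'
        new:  [alpha + [new] for alpha in recursive] + [["e"]],  # A' -> a_i A' | e
    }

# First exercise from the slide, A -> Aa | Ab | c | d:
result = remove_left_recursion("A", [["A", "a"], ["A", "b"], ["c"], ["d"]])
# A  -> c A' | d A'
# A' -> a A' | b A' | e
```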

Page 40: Chapter 1

Removing Left Recursion

• Two cases of left recursion:

	expr → expr + term | expr - term | term
	term → term * factor | term / factor | factor

• Transform as follows:

	expr → term expr2
	expr2 → + term expr2 | - term expr2 | e
	term → factor term2
	term2 → * factor term2 | / factor term2 | e

Page 41: Chapter 1

Left Factoring

• When the choice between two productions is not clear, we may be able to rewrite the productions to defer the decision; this is called left factoring.

Ex.
	Stmt → if expr then stmt else stmt | if expr then stmt
becomes
	Stmt → if expr then stmt S'
	S' → else stmt | є

• Rule: if A → ab1 | ab2 then
	A → aA'
	A' → b1 | b2

Page 42: Chapter 1

Some examples for Left factoring

• S → Assig_stmt | call_stmt | other
	– Assig_stmt → id = exp
	– call_stmt → id(exp_list)

Page 43: Chapter 1

Recursive Descent Parsing

• Example

	Rule 1: S → a S b
	Rule 2: S → b S a
	Rule 3: S → B
	Rule 4: B → b B
	Rule 5: B → є

	– Parse: a a b b b

• Has to use R1: S ⇒ a S b
• Again has to use R1: a S b ⇒ a a S b b
• Now has to use Rule 2 or 3; follow the order (always R2 first):
	a a S b b ⇒ a a b S a b b ⇒ a a b b S a a b b ⇒ a a b b b S a a a b b ⇒ …
	– Now Rule 2 cannot be used any more: a a b b b B a a a b b is incorrect, so backtrack
• After some backtracking, it finally tries
	– S ⇒ a S b ⇒ a a S b b ⇒ a a B b b ⇒ a a b B b b ⇒ a a b b b, which works
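The backtracking search above can be sketched as a recursive-descent recognizer that, for each nonterminal, returns every input position it can reach; trying the alternatives in order and collecting all outcomes is what backtracking amounts to (Python sketch, written for this particular grammar):

```python
# Recursive descent with backtracking for:
#   S -> a S b | b S a | B      B -> b B | epsilon
def parse_S(s, i):
    """Return every position reachable after deriving S from s[i:]."""
    results = set()
    if i < len(s) and s[i] == "a":                 # Rule 1: S -> a S b
        for j in parse_S(s, i + 1):
            if j < len(s) and s[j] == "b":
                results.add(j + 1)
    if i < len(s) and s[i] == "b":                 # Rule 2: S -> b S a
        for j in parse_S(s, i + 1):
            if j < len(s) and s[j] == "a":
                results.add(j + 1)
    results |= parse_B(s, i)                       # Rule 3: S -> B
    return results

def parse_B(s, i):
    results = {i}                                  # Rule 5: B -> epsilon
    if i < len(s) and s[i] == "b":                 # Rule 4: B -> b B
        results |= parse_B(s, i + 1)
    return results

def accepts(s):
    return len(s) in parse_S(s, 0)                 # S must consume all input

assert accepts("aabbb")      # a (a (b) b) b, middle b derived via B
assert not accepts("aab")
```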

Page 44: Chapter 1

Predictive Parsing

• Need to immediately know which rule to apply when seeing the next input character
	– If for every non-terminal X
		• We know what the first terminal of each of X's productions would be
		• And the first terminal of each of X's productions is different
	– Then
		• When the current leftmost non-terminal is X
		• And we can look at the next input character
		• We know exactly which production should be used next to expand X

Page 45: Chapter 1

Predictive Parsing

• Need to immediately know which rule to apply when seeing the next input character
	– If for every non-terminal X
		• We know what the first terminal of each of X's productions would be
		• And the first terminal of each of X's productions is different
	– Example

	Rule 1: S → a S b   (first terminal is a: if the next input is a, use R1)
	Rule 2: S → b S a   (first terminal is b: if the next input is b, use R2)
	Rule 3: S → B       (but R3's first terminal is also b: won't work!)
	Rule 4: B → b B
	Rule 5: B → є

Page 46: Chapter 1

Predictive Parsing

• Need to immediately know which rule to apply when seeing the next input character
	– If for every non-terminal X
		• We know what the first terminal of each of X's productions would be
		• And the first terminal of each of X's productions is different
	– Which grammars do not satisfy the above?
		• If two productions of the same non-terminal have the same first symbol (N or T), you can see immediately that it won't work
			– S → b S a | b B
			– S → B a | B C
		• If the grammar is left recursive, it won't work either
			– S → S a | b B,  B → b B | c
			– The left-recursive rule of S can generate all terminals that the other productions of S can generate
				» S → b B can generate b, so S → S a can also generate b

Page 47: Chapter 1

Predictive Parsing

• Need to rewrite the grammar
	– Left recursion elimination
		• This is required even for the recursive descent parsing algorithm
	– Left factoring
		• Remove the leftmost common factors

Page 48: Chapter 1

First()

• First(α) = { t | α ⇒* tβ }
	– Consider all possible terminal strings derived from α
	– First(α) is the set of the first terminals of those strings
• For all terminals t ∈ T
	– First(t) = {t}

Page 49: Chapter 1

First()

• For all non-terminals X ∈ N
	– If X → e, add e to First(X)
	– If X → α1 α2 … αn
		• Each αi is either a terminal or a non-terminal (not a string as usual)
		• Add all terminals in First(α1) to First(X), excluding e
		• If e ∈ First(α1), …, e ∈ First(αi-1), then add all terminals in First(αi) to First(X)
		• If e ∈ First(α1), …, e ∈ First(αn), then add e to First(X)
• Apply the rules until nothing more can be added
• When adding t or e: add it only if it is not in the set yet

Page 50: Chapter 1

First()

• Grammar

	E → TE'
	E' → +TE' | e
	T → FT'
	T' → *FT' | e
	F → (E) | id | num

• First
	First(*) = {*}, First(+) = {+}, …
	First(F) = {(, id, num}
	First(T') = {*, e}
	First(T) = First(F) = {(, id, num}
	First(E') = {+, e}
	First(E) = First(T) = {(, id, num}
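The apply-until-nothing-changes rules can be coded as a fixed-point loop. A sketch that recomputes the First sets above (Python; "e" encodes epsilon, and any symbol that is not a left-hand side is treated as a terminal, both choices of this sketch):

```python
# Fixed-point computation of First() for the expression grammar above.
grammar = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], ["e"]],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], ["e"]],
    "F":  [["(", "E", ")"], ["id"], ["num"]],
}

def first_sets(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:                              # iterate until nothing is added
        changed = False
        for nt, productions in grammar.items():
            for prod in productions:
                for sym in prod:
                    if sym == "e":
                        add = {"e"}                       # X -> e
                    elif sym not in grammar:
                        add = {sym}                       # terminal: First(t) = {t}
                    else:
                        add = first[sym] - {"e"}          # nonterminal, minus e
                    before = len(first[nt])
                    first[nt] |= add
                    changed |= len(first[nt]) != before
                    if sym == "e" or (sym in grammar and "e" in first[sym]):
                        continue                          # symbol can vanish; look further
                    break                                 # not nullable: stop here
                else:
                    if "e" not in first[nt]:              # whole production nullable
                        first[nt].add("e")
                        changed = True
    return first

first = first_sets(grammar)
# first["E"] == {"(", "id", "num"}, first["E'"] == {"+", "e"}, ...
```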

Page 51: Chapter 1

First()

• Grammar

	S → AB
	A → aA | e
	B → bB | e

• First
	First(A) = {a, e}
	First(B) = {b, e}
	First(S) = First(A) = {a, e}

Is this complete?

Page 52: Chapter 1

First()

• Grammar

	S → AB | B    (R1 | R2)
	A → aA | c    (R3 | R4)
	B → bB | d    (R5 | R6)

• First
	First(A) = {a, c}
	First(B) = {b, d}
	First(S) = First(A) ∪ First(B) = {a, b, c, d}

• Productions
	– First(R1) = {a, c}, First(R2) = {b, d}
	– First(R3) = {a}, First(R4) = {c}
	– First(R5) = {b}, First(R6) = {d}

                 | see a  | see b  | see c  | see d
When expanding S | Use R1 | Use R2 | Use R1 | Use R2
When expanding A | Use R3 | -      | Use R4 | -
When expanding B | -      | Use R5 | -      | Use R6

Input: acbd
	Expand S, seeing a, use R1: S ⇒ AB
	Expand A, seeing a, use R3: AB ⇒ aAB
	Expand A, seeing c, use R4: aAB ⇒ acB
	Expand B, seeing b, use R5: acB ⇒ acbB
	Expand B, seeing d, use R6: acbB ⇒ acbd

Page 53: Chapter 1

First()

• Grammar

	S → AB        (R1)
	A → aA | e    (R2 | R3)
	B → bB | e    (R4 | R5)

• First
	First(A) = {a, e}
	First(B) = {b, e}
	First(S) = First(A) ∪ First(B) = {a, b, e}

• Productions
	– First(R1) = {a, b, e}
	– First(R2) = {a}, First(R3) = {e}
	– First(R4) = {b}, First(R5) = {e}

                 | see a  | see b  | see e
When expanding S | Use R1 | Use R1 | Use R1
When expanding A | Use R2 | -      | Use R3
When expanding B | -      | Use R4 | Use R5

Input: aabb
	Use R1: S ⇒ AB
	Expand A, seeing a, use R2: AB ⇒ aAB
	Expand A, seeing a, use R2: aAB ⇒ aaAB
	Expand A, seeing b: what to do? Not in the table!

Page 54: Chapter 1

Follow()

• Follow(A) = { t | S ⇒* αAtβ }
	– Consider all strings that may follow A
	– Follow(A) is the set of the first terminals of those strings
• Assumptions
	– There is a $ at the end of every input string
	– S is the starting symbol
• For all non-terminals only
	– Add $ into Follow(S)
	– If A → αBβ, add First(β) - {e} into Follow(B)
	– If A → αB, or A → αBβ with e ∈ First(β), add Follow(A) into Follow(B)

Page 55: Chapter 1

Follow()

Grammar: S → AB (R1),  A → aA | e (R2 | R3),  B → bB | e (R4 | R5)

• First
	First(A) = {a, e}
	First(B) = {b, e}
	First(S) = {a, b, e}
• Productions
	– First(R1) = {a, b, e}
	– First(R2) = {a}, First(R3) = {e}
	– First(R4) = {b}, First(R5) = {e}
• Follow
	– Follow(S) = {$}
	– Follow(B) = Follow(S) = {$}
	– Follow(A) = (First(B) - {e}) ∪ Follow(S) = {b, $}
		• Since e ∈ First(B), Follow(S) must also be in Follow(A)

Using First alone, the table is incomplete:

                 | see a  | see b
When expanding S | Use R1 | Use R1
When expanding A | Use R2 | ?
When expanding B | -      | Use R4

Using Follow to place the e-productions:

                 | see a  | see b  | see $
When expanding S | Use R1 | Use R1 | Use R1
When expanding A | Use R2 | Use R3 | Use R3
When expanding B | -      | Use R4 | Use R5

Page 56: Chapter 1

Construct a Parse Table

• Construct a parse table M[N, T ∪ {$}]
	– Non-terminals in the rows and terminals in the columns
• For each production A → α
	– For each terminal a ∈ First(α), add A → α to M[A, a]
		• Meaning: when at A and seeing input a, A → α should be used
	– If e ∈ First(α), then for each terminal a ∈ Follow(A), add A → α to M[A, a]
		• Meaning: when at A and seeing input a, A → α should be used in order to continue the expansion to e
		• Ex: X → AC,  A → B,  B → b | e,  C → cc
	– If e ∈ First(α) and $ ∈ Follow(A), add A → α to M[A, $]
		• Same as the above

Page 57: Chapter 1

First() and Follow(): another example

Grammar: E → TE',  E' → +TE' | e,  T → FT',  T' → *FT' | e,  F → (E) | id | num

• First
	– First(*) = {*}
	– First(F) = {(, id, num}
	– First(T') = {*, e}
	– First(T) = First(F) = {(, id, num}
	– First(E') = {+, e}
	– First(E) = First(T) = {(, id, num}
• Follow
	– Follow(E) = {$, )}
	– Follow(E') = Follow(E) = {$, )}
	– Follow(T) = {$, ), +}
		• Since we have TE' from the first rule and E' can be e
		• Follow(T) = (First(E') - {e}) ∪ Follow(E')
	– Follow(T') = Follow(T) = {$, ), +}
	– Follow(F) = {*, $, ), +}
		• Follow(F) = (First(T') - {e}) ∪ Follow(T')
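The Follow rules can likewise be computed by a fixed-point loop. A sketch for this grammar, seeding the First sets from the slide (Python; "e" for epsilon and symbol-list productions are representation choices of this sketch):

```python
# Fixed-point computation of Follow() for the expression grammar above.
grammar = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], ["e"]],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], ["e"]],
    "F":  [["(", "E", ")"], ["id"], ["num"]],
}
first = {"E": {"(", "id", "num"}, "E'": {"+", "e"},
         "T": {"(", "id", "num"}, "T'": {"*", "e"},
         "F": {"(", "id", "num"}}

def first_of_string(symbols):
    """First() of a symbol string, using the First table above."""
    out = set()
    for sym in symbols:
        f = first.get(sym, {sym})          # terminals: First(t) = {t}
        out |= f - {"e"}
        if "e" not in f:
            return out
    out.add("e")                           # every symbol was nullable
    return out

def follow_sets(start="E"):
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")                 # rule: $ goes into Follow(S)
    changed = True
    while changed:
        changed = False
        for a, productions in grammar.items():
            for prod in productions:
                for i, b in enumerate(prod):
                    if b not in grammar:
                        continue           # Follow is defined for non-terminals only
                    tail = first_of_string(prod[i + 1:])
                    add = tail - {"e"}     # A -> alpha B beta: First(beta) - {e}
                    if "e" in tail:
                        add |= follow[a]   # beta nullable (or empty): Follow(A)
                    before = len(follow[b])
                    follow[b] |= add
                    changed |= len(follow[b]) != before
    return follow

follow = follow_sets()
# follow["E"] == {"$", ")"}, follow["T"] == {"$", ")", "+"}, ...
```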

Page 58: Chapter 1

Construct a Parse Table

Grammar: E → TE',  E' → +TE' | e,  T → FT',  T' → *FT' | e,  F → (E) | id | num

First(E) = First(T) = First(F) = {(, id, num};  First(T') = {*, e};  First(E') = {+, e}
Follow(E) = Follow(E') = {$, )};  Follow(T) = Follow(T') = {$, ), +};  Follow(F) = {*, $, ), +}

Table entries:
	E → TE': First(TE') = {(, id, num}
	E' → +TE': First(+TE') = {+}
	E' → e: Follow(E') = {$, )}
	T → FT': First(FT') = {(, id, num}
	T' → *FT': First(*FT') = {*}
	T' → e: Follow(T') = {$, ), +}

   | id      | num      | *         | +         | (        | )      | $
E  | E → TE' | E → TE'  |           |           | E → TE'  |        |
E' |         |          |           | E' → +TE' |          | E' → e | E' → e
T  | T → FT' | T → FT'  |           |           | T → FT'  |        |
T' |         |          | T' → *FT' | T' → e    |          | T' → e | T' → e
F  | F → id  | F → num  |           |           | F → (E)  |        |

Page 59: Chapter 1

Stack     | Input           | Action
E $       | id + num * id $ | E → TE'
T E' $    | id + num * id $ | T → FT'
F T' E' $ | id + num * id $ | F → id (pop F, remove id from input)
T' E' $   | + num * id $    | T' → e
E' $      | + num * id $    | E' → +TE' (only TE' stays on the stack; remove + from input)
T E' $    | num * id $      | T → FT'
F T' E' $ | num * id $      | F → num (pop F, remove num from input)
T' E' $   | * id $          | T' → *FT' (remove * from input)
F T' E' $ | id $            | F → id (pop F, remove id from input)
T' E' $   | $               | T' → e (pop T'; input unchanged)
E' $      | $               | E' → e
$         | $               | Accept

(Uses the parse table constructed on the previous slide.)
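The trace above is exactly what a table-driven predictive parser does: pop the stack, expand non-terminals via the table, and match terminals against the input. A minimal sketch (Python; the table entries mirror the parse table built earlier, with "e" for epsilon):

```python
# Table-driven LL(1) parse of the expression grammar.
table = {
    ("E", "id"): ["T", "E'"], ("E", "num"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): ["e"], ("E'", "$"): ["e"],
    ("T", "id"): ["F", "T'"], ("T", "num"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "*"): ["*", "F", "T'"], ("T'", "+"): ["e"],
    ("T'", ")"): ["e"], ("T'", "$"): ["e"],
    ("F", "id"): ["id"], ("F", "num"): ["num"], ("F", "("): ["(", "E", ")"],
}
nonterminals = {"E", "E'", "T", "T'", "F"}

def ll1_parse(tokens):
    tokens = tokens + ["$"]
    stack = ["$", "E"]                    # start symbol on top
    pos = 0
    while stack:
        top = stack.pop()
        if top == "e":
            continue                      # epsilon: expand to nothing
        if top not in nonterminals:
            if top != tokens[pos]:
                return False              # terminal mismatch: error
            pos += 1                      # match: consume one input token
            continue
        prod = table.get((top, tokens[pos]))
        if prod is None:
            return False                  # empty table cell: error
        stack.extend(reversed(prod))      # push the production right-to-left
    return pos == len(tokens)

assert ll1_parse(["id", "+", "num", "*", "id"])
assert not ll1_parse(["id", "+", "*"])
```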

Page 60: Chapter 1

More about LL Grammars

• What grammar is not LL(1)?

	S → A | B
	A → aaA | e
	B → abB | b

• First(A) = {a, e}, First(B) = {a, b}, First(S) = {a, b, e}
• Follow(S) = {$}, Follow(A) = {$}, Follow(B) = {$}

LL(1) table (conflict in the a column):

   | a            | b     | $
S  | S → A, S → B | S → B | S → A
A  | A → aaA      | -     | A → e
B  | B → abB      | B → b | -

	– But this grammar is LL(2)
		• If we look ahead 2 input characters, predictive parsing is possible
		• First2(A) = {aa, e}, First2(B) = {ab, b$}, First2(S) = {aa, ab, b$, e}

   | aa      | ab      | b$    | $
S  | S → A   | S → B   | S → B | S → A
A  | A → aaA |         |       | A → e
B  | B → abB |         | B → b |

Page 61: Chapter 1

A Shift-Reduce Parser

Grammar:
	E → E+T | T
	T → T*F | F
	F → (E) | id

Right-most derivation of id+id*id:
	E ⇒ E+T ⇒ E+T*F ⇒ E+T*id ⇒ E+F*id ⇒ E+id*id ⇒ T+id*id ⇒ F+id*id ⇒ id+id*id

A shift-reduce parser traces this derivation in reverse:

Right-most sentential form | Reducing production
id+id*id                   | F → id
F+id*id                    | T → F
T+id*id                    | E → T
E+id*id                    | F → id
E+F*id                     | T → F
E+T*id                     | F → id
E+T*F                      | T → T*F
E+T                        | E → E+T
E                          |

(In the original slides the handles are highlighted in each right-sentential form.)

Page 62: Chapter 1

A Stack Implementation of a Shift-Reduce Parser

• There are four possible actions of a shift-reduce parser:
	1. Shift: the next input symbol is shifted onto the top of the stack.
	2. Reduce: replace the handle on the top of the stack by the corresponding non-terminal.
	3. Accept: successful completion of parsing.
	4. Error: the parser discovers a syntax error and calls an error recovery routine.
• The initial stack contains only the end-marker $.
• The end of the input string is also marked by the end-marker $.

Page 63: Chapter 1

A Stack Implementation of A Shift-Reduce Parser

Stack   | Input     | Action
$       | id+id*id$ | shift
$id     | +id*id$   | reduce by F → id
$F      | +id*id$   | reduce by T → F
$T      | +id*id$   | reduce by E → T
$E      | +id*id$   | shift
$E+     | id*id$    | shift
$E+id   | *id$      | reduce by F → id
$E+F    | *id$      | reduce by T → F
$E+T    | *id$      | shift
$E+T*   | id$       | shift
$E+T*id | $         | reduce by F → id
$E+T*F  | $         | reduce by T → T*F
$E+T    | $         | reduce by E → E+T
$E      | $         | accept

Page 64: Chapter 1

Operator-Precedence Parser

• Operator grammars
	– A small but important class of grammars
	– We can build an efficient operator-precedence parser (a shift-reduce parser) for an operator grammar.
• In an operator grammar, no production rule can have:
	– e on the right side
	– two adjacent non-terminals on the right side

• Ex:
	Not an operator grammar:    Operator grammar:
	E → AB                      E → E+E | E*E | E/E | id
	A → a
	B → b

Page 65: Chapter 1

Precedence Relations

• In operator-precedence parsing, we define three disjoint precedence relations between certain pairs of terminals:

	a <. b    b has higher precedence than a
	a =· b    b has the same precedence as a
	a .> b    b has lower precedence than a

• The determination of the correct precedence relations between terminals is based on the traditional notions of associativity and precedence of operators. (Unary minus causes a problem.)

Page 66: Chapter 1

Using Operator-Precedence Relations

• The intention of the precedence relations is to delimit the handle of a right-sentential form: <. marks the left end, =· appears in the interior of the handle, and .> marks the right end.
• In the input string $a1a2...an$, we insert between each pair of adjacent terminals the precedence relation that holds for that pair.

Page 67: Chapter 1

Using Operator-Precedence Relations

E → E+E | E-E | E*E | E/E | E^E | (E) | -E | id

The partial operator-precedence table for this grammar:

   | id | +  | *  | $
id |    | .> | .> | .>
+  | <. | .> | <. | .>
*  | <. | .> | .> | .>
$  | <. | <. | <. |

• The input string id+id*id with the precedence relations inserted:

	$ <. id .> + <. id .> * <. id .> $
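Inserting the relations is a single pass over adjacent terminal pairs. A sketch driven by the partial table above (Python):

```python
# Insert operator-precedence relations into $ id + id * id $.
prec = {
    ("id", "+"): ".>", ("id", "*"): ".>", ("id", "$"): ".>",
    ("+", "id"): "<.", ("+", "+"): ".>", ("+", "*"): "<.", ("+", "$"): ".>",
    ("*", "id"): "<.", ("*", "+"): ".>", ("*", "*"): ".>", ("*", "$"): ".>",
    ("$", "id"): "<.", ("$", "+"): "<.", ("$", "*"): "<.",
}

def insert_relations(tokens):
    tokens = ["$"] + tokens + ["$"]        # add the end markers
    out = [tokens[0]]
    for a, b in zip(tokens, tokens[1:]):   # one lookup per adjacent pair
        out += [prec[(a, b)], b]
    return " ".join(out)

assert insert_relations(["id", "+", "id", "*", "id"]) == \
    "$ <. id .> + <. id .> * <. id .> $"
```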

Page 68: Chapter 1

To Find the Handles

1. Scan the string from the left end until the first .> is encountered.
2. Then scan backwards (to the left) over any =· until a <. is encountered.
3. The handle contains everything to the left of the first .> and to the right of that <..

$ <. id .> + <. id .> * <. id .> $   E → id    $ id + id * id $
$ <. + <. id .> * <. id .> $         E → id    $ E + id * id $
$ <. + <. * <. id .> $               E → id    $ E + E * id $
$ <. + <. * .> $                     E → E*E   $ E + E * E $
$ <. + .> $                          E → E+E   $ E + E $
$ $                                            $ E $

Page 69: Chapter 1

Operator-Precedence Parsing Algorithm: Example

Stack   | Input     | Action
$       | id+id*id$ | $ <. id, shift
$id     | +id*id$   | id .> +, reduce E → id
$E      | +id*id$   | shift
$E+     | id*id$    | shift
$E+id   | *id$      | id .> *, reduce E → id
$E+E    | *id$      | shift
$E+E*   | id$       | shift
$E+E*id | $         | id .> $, reduce E → id
$E+E*E  | $         | * .> $, reduce E → E*E
$E+E    | $         | + .> $, reduce E → E+E
$E      | $         | accept

(Uses the precedence table from the previous slide.)

Page 70: Chapter 1

Chapter - 6

Introduction to Compiler

Unit-6

Page 71: Chapter 1

Aspects of Compilation

• A compiler bridges the semantic gap between a PL domain and an execution domain.
• Two aspects of compilation are:
	– Generate code to implement the meaning of a source program in the execution domain.
	– Provide diagnostics for violations of PL semantics in a source program.
• Relevant PL features:
	– Data types
	– Data structures
	– Scope rules
	– Control structures

Page 72: Chapter 1

Three-Address Code

• In three-address code, there is at most one operator on the right side of an instruction; that is, no built-up arithmetic expressions are permitted.
• Example: a source-language expression x+y*z might be translated into the sequence of three-address instructions

	t1 = y * z
	t2 = x + t1

  where t1 and t2 are compiler-generated temporary names.
• Generate code for x = a+b+c+d
• Generate code for x = -a*b + -a*b
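For a chain of one operator, such as x = a+b+c+d, three-address code falls out of a left-to-right walk that spills each partial result into a new temporary. A sketch (Python; the string-based output format is a choice of this sketch):

```python
# Generate three-address code for target = op1 OP op2 OP ... (left-associative).
def gen_tac(target, operands, op="+"):
    code, temp = [], 0
    acc = operands[0]                      # running partial result
    for operand in operands[1:]:
        temp += 1
        name = f"t{temp}"                  # fresh compiler-generated temporary
        code.append(f"{name} = {acc} {op} {operand}")
        acc = name
    code.append(f"{target} = {acc}")       # final assignment to the target
    return code

code = gen_tac("x", ["a", "b", "c", "d"])
# t1 = a + b
# t2 = t1 + c
# t3 = t2 + d
# x = t3
```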

Page 73: Chapter 1

Quadruple

Three-address code for x = -a*b + -a*b:

	t1 = uminus a
	t2 = t1 * b
	t3 = uminus a
	t4 = t3 * b
	t5 = t2 + t4
	x = t5

Quadruple representation:

    | OP     | Arg1 | Arg2 | Result
(0) | uminus | a    |      | t1
(1) | *      | t1   | b    | t2
(2) | uminus | a    |      | t3
(3) | *      | t3   | b    | t4
(4) | +      | t2   | t4   | t5
(5) | =      | t5   |      | x

Page 74: Chapter 1

Triple

Three-address code for x = -a*b + -a*b:

	t1 = uminus a
	t2 = t1 * b
	t3 = uminus a
	t4 = t3 * b
	t5 = t2 + t4
	x = t5

Triple representation (temporaries are referred to by instruction number):

    | OP     | Arg1 | Arg2
(0) | uminus | a    |
(1) | *      | (0)  | b
(2) | uminus | a    |
(3) | *      | (2)  | b
(4) | +      | (1)  | (3)
(5) | =      | x    | (4)

Page 75: Chapter 1

Indirect Triples

Three-address code for x = -a*b + -a*b:

	t1 = uminus a
	t2 = t1 * b
	t3 = uminus a
	t4 = t3 * b
	t5 = t2 + t4
	x = t5

The triples are stored in a table (here at positions (11) through (16)), and a separate statement list points to them in execution order:

Statement | Pointer
(0)       | (11)
(1)       | (12)
(2)       | (13)
(3)       | (14)
(4)       | (15)
(5)       | (16)

     | OP     | Arg1 | Arg2
(11) | uminus | a    |
(12) | *      | (11) | b
(13) | uminus | a    |
(14) | *      | (13) | b
(15) | +      | (12) | (14)
(16) | =      | x    | (15)

Page 76: Chapter 1

Example

• Construct quadruple, triple, and indirect triple representations of

	a = b * - c + b * - c

Page 77: Chapter 1

Example

[Figure: quadruple, triple, and indirect triple representations of a = b * - c + b * - c; part (c) shows the indirect triples.]

Page 78: Chapter 1

Aspects of Compilation

• A compiler bridges the specification gap between a PL domain and an execution domain.
• It generates code to implement the meaning of a source program in the execution domain.
• It provides diagnostics for violations of PL semantics in a source program.
• PL features involved:
	• Data types: specification of legal values for variables of the type
	• Data structures:
	• Scope rules: accessibility of variables declared in different blocks of a program
	• Control structures:

Page 79: Chapter 1

Memory Allocation

• Memory binding: an association between the 'memory address' attribute of a data item and the address of a memory area.
	– Static memory allocation: memory is allocated before execution of the program begins.
	– Dynamic memory allocation: memory is allocated during execution of the program.

Page 80: Chapter 1

Static Memory Allocation

• The program consists of three units A, B, C:

	Procedure A: Data(A), calls B()
	Procedure B: Data(B), calls C()
	Procedure C: Data(C)

• Memory layout: code and data of all units are allocated before execution:

	Code(A) | Data(A) | Code(B) | Data(B) | Code(C) | Data(C)

• Advantage??????

Page 81: Chapter 1

Dynamic Memory Allocation

• The program consists of three units A, B, C (A calls B, B calls C).
• Unit A is active: only Data(A) is allocated:

	Code(A) | Code(B) | Code(C) | Data(A)

Page 82: Chapter 1

Dynamic Memory Allocation

• Procedure A calls B: Data(B) gets allocated:

	Code(A) | Code(B) | Code(C) | Data(A) | Data(B)

Page 83: Chapter 1

Dynamic Memory Allocation

• Procedure B calls C: Data(C) gets allocated:

	Code(A) | Code(B) | Code(C) | Data(A) | Data(B) | Data(C)

Page 84: Chapter 1

Dynamic Memory Allocation

• A different scenario: A calls B; B returns; then A calls C.

	Procedure A: Data(A), calls B(), then C()
	Procedure B: Data(B)
	Procedure C: Data(C)

	(a) A active:               Code(A) | Code(B) | Code(C) | Data(A)
	(b) A calls B:              Code(A) | Code(B) | Code(C) | Data(A) | Data(B)
	(c) B returns, A calls C:   Code(A) | Code(B) | Code(C) | Data(A) | Data(C)

• Memory allocation in block-structured languages works the same way.
• Advantage??????

Page 85: Chapter 1

Stack

• Last In, First Out (LIFO) data structure

	main()         { a(0); }
	void a (int m) { b(1); }
	void b (int n) { c(2); }
	void c (int o) { d(3); }
	void d (int p) { }

• As each call is made, a new frame is pushed and the stack pointer moves down: the stack grows downward.

Page 86: Chapter 1

• Activation Records:
	• Also called frames
	• The information (memory) needed by a single execution of a procedure
• A general activation record:

	Return value          | stores the result of the function call
	Actual parameters     | information on the actual parameters
	Optional control link | points to the activation record of the calling procedure
	Optional access link  | refers to non-local data held in other activation records
	Machine status        | saved machine state (e.g., program counter)
	Local variables       | store local data
	Temporaries           | store temporary values

Page 87: Chapter 1

Activation Record for Factorial Program

	main()
	{
		int f;
		f = factorial(3);
	}

	int factorial(int n)
	{
		if (n == 1)
			return 1;
		else
			return n * factorial(n - 1);
	}

Page 88: Chapter 1

Activation Record for Factorial Program

Page 89: Chapter 1

Activation Record for Factorial Program

Page 90: Chapter 1

– Parameter passing
	• The method used to associate actual parameters with formal parameters.
	• The parameter passing method affects the code generated.

• Call-by-value:
	– The actual parameters are evaluated and their r-values are passed to the called procedure.
	– Implementation:
		» A formal parameter is treated like a local name, so the storage for the formals is in the activation record of the called procedure.
		» The caller evaluates the actual parameters and places their r-values in the storage for the formals.

Page 91: Chapter 1

– Call-by-reference:
	• Also called call-by-address or call-by-location.
	• The caller passes to the called procedure a pointer to the storage address of each actual parameter.
		– The actual parameter must have an address; only variables make sense, an expression does not (the location of the temporary that holds the result of the expression would be passed).

– Copy-restore:
	• A hybrid between call-by-value and call-by-reference.
		– The actual parameters are evaluated and their r-values are passed to the called procedure as in call-by-value.
		– When control returns, the r-values of the formal parameters are copied back into the l-values of the actuals.
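Copy-restore semantics can be mimicked in a few lines: copy the r-values in, let the callee work on the copies, and copy the results back on return. A sketch (Python; all function names here are hypothetical):

```python
# Simulating copy-restore (value-result) parameter passing.
def call_copy_restore(procedure, env, actual_names):
    formals = [env[name] for name in actual_names]   # copy r-values in
    procedure(formals)                               # callee mutates only its copies
    for name, value in zip(actual_names, formals):   # on return, copy results back
        env[name] = value                            # ...into the actuals' l-values

def increment_first(args):      # hypothetical callee: bumps its first formal
    args[0] += 1

env = {"x": 10, "y": 20}
call_copy_restore(increment_first, env, ["x"])
assert env["x"] == 11           # the result was copied back on return
```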