Chapter 1: Language Processor

Page 1: Chapter 1

Chapter 1

Language Processor

Page 2: Chapter 1

Introduction

• Semantic gap: solved by a PL through
	– Design and coding
	– PL implementation steps
• This introduces a new PL domain, which splits the semantic gap into two:
	– Specification gap: the semantic gap between two specifications of the same task.
	– Execution gap: the gap between the semantics of programs written in different programming languages.

[Diagram] The semantic gap lies between the application domain and the execution domain. Introducing a PL domain splits it: the specification gap lies between the application domain and the PL domain, and the execution gap lies between the PL domain and the execution domain.

Page 3: Chapter 1

Language Processor

• Definition: A language processor (LP) is software that bridges a specification or execution gap.
• Parts of an LP:
	– Language translator: bridges an execution gap (e.g., compiler, assembler)
	– Detranslator
	– Preprocessor
	– Language migrator
• Interpreter: a language processor that bridges an execution gap without generating a machine-language program.

Page 4: Chapter 1

• Problem-oriented languages
	– Less specification gap, more execution gap
• Procedure-oriented languages
	– More specification gap, less execution gap

Page 5: Chapter 1

Language processing activities
• Program generation activity
• Program execution activity:
	– Translation and interpretation

[Diagram] A program generator bridges the specification gap between the application domain and the program generator domain; the generated program in the target PL domain is then run in the execution domain.

Page 6: Chapter 1

• Program translation
	– Translates a program from the source language to machine language.
• Characteristics:
	– A program must be translated before it can be executed.
	– The translated program may be saved in a file; the saved program may be executed repeatedly.
	– A program must be retranslated following modifications.
• Program interpretation:
	– Reads the source program and stores it in memory.
	– Determines its meaning and performs the corresponding actions.

Page 7: Chapter 1

Program Interpretation and Execution

• Program execution
	– Fetch the instruction
	– Decode the instruction to determine the operation
	– Execute the instruction
• Program interpretation
	– Fetch the statement
	– Analyze the statement to determine its meaning
	– Execute the statement
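The fetch/analyze/execute cycle described above can be sketched as a small loop. The three-statement mini-language and all names here are hypothetical, chosen only for illustration (Python used as the sketch language):

```python
# Sketch of an interpreter's fetch-analyze-execute cycle over a tiny,
# made-up statement language; no machine code is ever generated.

def interpret(program):
    env = {}                      # runtime state held by the interpreter
    for stmt in program:          # fetch the next statement
        op = stmt[0]              # analyze: determine its meaning
        if op == "set":           # execute the statement
            _, name, value = stmt
            env[name] = value
        elif op == "add":
            _, name, a, b = stmt
            env[name] = env[a] + env[b]
        elif op == "print":
            print(env[stmt[1]])
    return env

env = interpret([("set", "a", 2), ("set", "b", 3), ("add", "c", "a", "b")])
```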

Page 8: Chapter 1

Comparison

• ?????

Page 9: Chapter 1

Fundamentals of language processing

• LP = Analysis of source program (SP) + Synthesis of target program (TP)
• Analysis of SP
	– Lexical rules: define valid lexical units
	– Syntax rules: define the formation of valid statements
	– Semantic rules: associate meaning with valid statements

Page 10: Chapter 1

Phases of LP

• Forward Reference: A forward reference of a program entity is a reference to the entity which precedes its definition in the program.

Ex.
	struct s { struct t *pt; };
	...
	struct t { struct s *ps; };

• Issues concerning memory requirements and organization of LP.

[Diagram] Source program → Analysis phase → IR → Synthesis phase → Target program; both phases report errors.

Page 11: Chapter 1

Passes of LP

• Language processor pass: A language processor pass is the processing of every statement in a source program, or its equivalent representation, to perform a language processing function.
	– Pass I: performs analysis of the SP.
	– Pass II: performs synthesis of the TP.

Page 12: Chapter 1

Intermediate Representation of Programs

• Intermediate Representation: An Intermediate representation(IR) is a representation of a source program which reflects the effect of some, but not all, analysis and synthesis tasks performed during language processing.

[Diagram] SP → Front end → Intermediate representation (IR) → Back end → TP

Page 13: Chapter 1

IR and Semantic actions

• Properties of an IR
	– Ease of use
	– Processing efficiency
	– Memory efficiency: compact
• Semantic actions:
	– All actions performed by the front end except lexical and syntax analysis are called semantic actions. These include:
		• Checking semantic validity
		• Determining the meaning
		• Constructing the IR

Page 14: Chapter 1

Toy Compiler

• gcc or cc compiler: C or C++
• Toy compiler: ???
	– Front end
		• Lexical analysis
		• Syntax analysis
		• Semantic analysis
	– Back end
		• Memory allocation
		• Code generation

Symbol table generated by the front end:

Symbol | Type  | Length | Address
a      | int   |        |
b      | float |        |
temp   | int   |        |

Page 15: Chapter 1

Front End

• Lexical analysis (scanning)
	– Ex. a := b + i; → id#2 op#5 id#3 op#3 id#1 op#10
• Syntax analysis (parsing)
	a, b : real;
	a := b + i;
• Semantic analysis: the IC tree is generated (e.g., a tree with root real and children a, b for the declaration).

Page 16: Chapter 1

Back End

• Memory allocation: recorded in the symbol table below.
• Code generation: generating assembly language.
	– Issues:
		• Determine where the IR values should be kept.
		• Determine which instructions should be used for type conversion.
		• Determine which addressing mode should be used for accessing variables.

Symbol | Type  | Length | Address
a      | int   | 2      | 2000
b      | float | 4      | 2002
temp   | int   | 2      | 2006

Page 17: Chapter 1

Fundamentals of Language Specification

• Programming language grammars:
	– Terminal symbols
		• Lowercase letters, punctuation marks, null
		• Concatenation (.)
	– Nonterminal symbols: names of syntax categories of the language
	– Productions: also called rewriting rules; a production is a rule of the grammar of the form NT = string of terminals and nonterminals.
• Production examples:

	<Article> = a/an/the
	<Noun> = boy/apple
	<Noun phrase> = <Article><Noun>

Page 18: Chapter 1

Grammar

• Def: A grammar G of a language L(G) is a quadruple (∑, SNT, S, P) where
	– ∑ is the set of terminals
	– SNT is the set of nonterminals
	– S is the distinguished (start) symbol
	– P is the set of productions
• Ex: Derive the sentence "a boy ate an apple"
	– <Sentence> = <Noun phrase> <Verb phrase>
	– <Noun phrase> = <Article> <Noun>
	– <Verb phrase> = <Verb> <Noun phrase>
	– <Article> = a/an/the
	– <Noun> = boy/apple
	– <Verb> = ate
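The derivation above can be reproduced mechanically by repeatedly rewriting the leftmost occurrence of a chosen nonterminal. A minimal sketch (Python; the list-of-symbols representation is an assumption of this sketch, not part of the slides):

```python
# Leftmost derivation of "a boy ate an apple" using the grammar above.
grammar = {
    "<sentence>": [["<noun phrase>", "<verb phrase>"]],
    "<noun phrase>": [["<article>", "<noun>"]],
    "<verb phrase>": [["<verb>", "<noun phrase>"]],
    "<article>": [["a"], ["an"], ["the"]],
    "<noun>": [["boy"], ["apple"]],
    "<verb>": [["ate"]],
}

def expand(form, nonterminal, choice):
    """Replace the leftmost occurrence of `nonterminal` using production `choice`."""
    i = form.index(nonterminal)
    return form[:i] + grammar[nonterminal][choice] + form[i + 1:]

form = ["<sentence>"]
for nt, choice in [("<sentence>", 0), ("<noun phrase>", 0), ("<article>", 0),
                   ("<noun>", 0), ("<verb phrase>", 0), ("<verb>", 0),
                   ("<noun phrase>", 0), ("<article>", 1), ("<noun>", 1)]:
    form = expand(form, nt, choice)
print(" ".join(form))   # a boy ate an apple
```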

Page 19: Chapter 1

Grammar

• Derive a + b * c / 5 and construct the parse tree (top-down):
	– <exp> = <exp> + <term> | <term>
	– <term> = <term> * <factor> | <factor>
	– <factor> = <factor> / <number> | <number>
	– <number> = 0/1/2/3/…/9
• Classification of grammars:
	– Type 0: phrase structure grammar
	– Type 1: context-sensitive grammar
	– Type 2: context-free grammar
	– Type 3: linear grammar or regular grammar

Page 20: Chapter 1

Binding

• Definition: A binding is the association of an attribute of a program entity with a value.
	– Static binding: a binding performed before the execution of a program begins.
	– Dynamic binding: a binding performed after the execution of a program begins.

Page 21: Chapter 1

Chapter - 3

Scanning and

Parsing

Unit-2

Page 22: Chapter 1

Role of lexical Analyzer

Page 23: Chapter 1

Scanning

• Definition: Scanning is the process of recognizing the lexical components in a source string.
• Described by Type-3 grammars (regular grammars).
• Regular grammars are used to identify tokens such as identifiers.
• Regular languages are built from the operations alternation (or), concatenation, and Kleene closure (*).
• Ex. Write a regular expression for strings ending with abb:
	– (a+b)*abb
• Ex. Write a regular expression that recognizes identifiers:
	– R.E. = (letter)(letter/digit)*
	– digit = 0/1/2/…/9
	– letter = a/b/c/…/z
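These token definitions translate directly into library regular expressions. A small sketch using Python's re module (character classes stand in for letter and digit; lowercase letters only, as on the slide):

```python
import re

# letter(letter|digit)* and (a+b)*abb written as Python regular expressions.
identifier = re.compile(r"[a-z][a-z0-9]*\Z")   # \Z anchors the match at end of string
ends_abb   = re.compile(r"[ab]*abb\Z")         # (a+b)*abb

assert identifier.match("count2")
assert not identifier.match("2count")
assert ends_abb.match("ababb")
```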

Page 24: Chapter 1

Regular expression | Meaning
r                  | string r
s                  | string s
r.s or rs          | concatenation of r and s
(r)                | same meaning as r
r/s or (r)/(s)     | alternation (r or s)
[r]                | an optional occurrence of r
(r)*               | 0 or more occurrences of string r
(r)+               | 1 or more occurrences of string r

Page 25: Chapter 1

Examples of regular expressions
• Integer: [+/-](d)+
• Real: [+/-](d)+.(d)+
• Real with optional fraction: [+/-](d)+.(d)*
• Identifier: l(l/d)*

Page 26: Chapter 1

Example of Regular expression

• Strings ending with 0: (0+1)*0
• Strings ending with 11: (0+1)*11
• Strings with an EVEN number of 0s and an ODD number of 1s: (0+1)*(01*01*)*11*0*
• The language of all strings containing exactly two 0s: 1*01*01*
• The language of all strings that do not end with 01: ε + 1 + (0+1)*0 + (0+1)*11

Page 27: Chapter 1

Finite state automaton

• FSA: a triple (S, ∑, T) where S is a finite set of states, ∑ is the alphabet of source symbols, and T is a finite set of state transitions.
• FSAs are classified into DFAs (deterministic) and NFAs (nondeterministic).

Page 28: Chapter 1

DFA from Regular Expression

Construct DFAs for:
• (0+1)*0
• (0+1)*(1+00)
• (0+1)*
• (11+10)*

Page 29: Chapter 1

Transition table from DFA

Transition table for (0+1)*0:

States/input | 0  | 1
q0           | q1 | q0
q1           | q1 | q0

Similarly, construct tables for (0+1)*(1+00), (0+1)*, and (11+10)*.
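The transition table for (0+1)*0 can be run directly as a table-driven simulator. A minimal sketch (Python; state names follow the table above, with q1, the state reached after reading a 0, as the accepting state):

```python
# Table-driven simulation of the DFA for (0+1)*0.
table = {
    ("q0", "0"): "q1", ("q0", "1"): "q0",
    ("q1", "0"): "q1", ("q1", "1"): "q0",
}

def accepts(string, start="q0", accepting=("q1",)):
    state = start
    for ch in string:
        state = table[(state, ch)]   # one table lookup per input symbol
    return state in accepting

assert accepts("1100")       # ends with 0: accepted
assert not accepts("101")    # ends with 1: rejected
```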

Page 30: Chapter 1

DFA and its transition diagram

Check whether the given string aabab is accepted.

Page 31: Chapter 1

Types of Parser

• Top-down parser
	– Backtracking parser
	– Predictive parser
• Bottom-up parser
	– Shift-reduce parser
	– LR parser: SLR, LR, LALR

Page 32: Chapter 1

Example

• Expression grammar (with precedence)
• Input string: x - 2 * y

# | Production rule
1 | expr → expr + term
2 | expr → expr - term
3 | expr → term
4 | term → term * factor
5 | term → term / factor
6 | term → factor
7 | factor → number
8 | factor → identifier

Page 33: Chapter 1

Example

Rule | Sentential form | Input string
-    | expr            | ↑x - 2 * y
1    | expr + term     | ↑x - 2 * y
3    | term + term     | ↑x - 2 * y
6    | factor + term   | ↑x - 2 * y
8    | <id> + term     | ↑x - 2 * y
-    | <id,x> + term   | x ↑- 2 * y

(↑ marks the current position in the input stream.)

• Problem:
	– Can't match the next terminal (the input has -, the tree has +)
	– We guessed wrong at the second step, when we chose expr → expr + term

Page 34: Chapter 1

Backtracking

• Roll back the productions
• Choose a different production for expr
• Continue

Rule | Sentential form | Input string
-    | expr            | ↑x - 2 * y
1    | expr + term     | ↑x - 2 * y    ← undo
3    | term + term     | ↑x - 2 * y    ← undo
6    | factor + term   | ↑x - 2 * y    ← undo
8    | <id> + term     | ↑x - 2 * y    ← undo
?    | <id,x> + term   | x ↑- 2 * y    ← undo

Page 35: Chapter 1

Retrying

Rule | Sentential form  | Input string
-    | expr             | ↑x - 2 * y
2    | expr - term      | ↑x - 2 * y
3    | term - term      | ↑x - 2 * y
6    | factor - term    | ↑x - 2 * y
8    | <id> - term      | ↑x - 2 * y
-    | <id,x> - term    | x - ↑2 * y
6    | <id,x> - factor  | x - ↑2 * y
7    | <id,x> - <num>   | x - ↑2 * y

• Problem:
	– <num> matches 2, but there is more input to read
	– Another cause of backtracking

Page 36: Chapter 1

Successful Parse

• All terminals match; we're finished

Rule | Sentential form           | Input string
-    | expr                      | ↑x - 2 * y
2    | expr - term               | ↑x - 2 * y
3    | term - term               | ↑x - 2 * y
6    | factor - term             | ↑x - 2 * y
8    | <id> - term               | ↑x - 2 * y
-    | <id,x> - term             | x - ↑2 * y
4    | <id,x> - term * fact      | x - ↑2 * y
6    | <id,x> - fact * fact      | x - ↑2 * y
7    | <id,x> - <num> * fact     | x - ↑2 * y
-    | <id,x> - <num,2> * fact   | x - 2 * ↑y
8    | <id,x> - <num,2> * <id>   | x - 2 * ↑y
-    | <id,x> - <num,2> * <id,y> | x - 2 * y ↑

Page 37: Chapter 1

Problems in Top-down Parsing

• Backtracking (as we have seen)
• Left recursion
• Left factoring

Page 38: Chapter 1

Left Recursion

• Problem: termination
	– The wrong choice leads to infinite expansion (more importantly, without consuming any input!)
	– May not be as obvious as this
	– Our grammar is left recursive

Rule | Sentential form                  | Input string
-    | expr                             | ↑x - 2 * y
1    | expr + term                      | ↑x - 2 * y
1    | expr + term + term               | ↑x - 2 * y
1    | expr + term + term + term        | ↑x - 2 * y
1    | expr + term + term + term + term | ↑x - 2 * y

Page 39: Chapter 1

Rules for Left Recursion

• If A → Aa1 | Aa2 | … | Aan | b1 | b2 | … | bm
• After removal of left recursion:

	A  → b1A' | b2A' | … | bmA'
	A' → a1A' | a2A' | … | anA' | є

• Ex. Apply this to:
	• A → Aa | Ab | c | d
	• A → Ac | Aad | bd | є
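The removal rule can be coded directly. A sketch for immediate left recursion only (Python; representing productions as symbol lists and epsilon as "e" are choices of this sketch, not part of the slides):

```python
# Sketch: removing immediate left recursion A -> Aa1|...|Aan|b1|...|bm.
def remove_left_recursion(name, productions):
    """Each production is a list of symbols; returns the rewritten grammar rules."""
    recursive = [p[1:] for p in productions if p and p[0] == name]   # the a_i parts
    others    = [p for p in productions if not p or p[0] != name]    # the b_j parts
    if not recursive:
        return {name: productions}          # nothing to do
    new = name + "'"
    return {
        name: [beta + [new] for beta in others],                 # A  -> b_j A'
        new:  [alpha + [new] for alpha in recursive] + [["e"]],  # A' -> a_i A' | e
    }

# First exercise from the slide, A -> Aa | Ab | c | d:
result = remove_left_recursion("A", [["A", "a"], ["A", "b"], ["c"], ["d"]])
# A  -> c A' | d A'
# A' -> a A' | b A' | e
```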

Page 40: Chapter 1

Removing Left Recursion

• Two cases of left recursion:

	expr → expr + term | expr - term | term
	term → term * factor | term / factor | factor

• Transform as follows:

	expr → term expr2
	expr2 → + term expr2 | - term expr2 | e
	term → factor term2
	term2 → * factor term2 | / factor term2 | e

Page 41: Chapter 1

Left Factoring

• When the choice between two productions is not clear, we may be able to rewrite the productions to defer the decision; this is called left factoring.

Ex.
	Stmt → if expr then stmt else stmt | if expr then stmt
becomes
	Stmt → if expr then stmt S'
	S' → else stmt | є

• Rule: if A → ab1 | ab2 then
	A → aA'
	A' → b1 | b2

Page 42: Chapter 1

Some examples for Left factoring

• S → Assig_stmt | call_stmt | other
	– Assig_stmt → id = exp
	– call_stmt → id(exp_list)

Page 43: Chapter 1

Recursive Descent Parsing

• Example

	Rule 1: S → a S b
	Rule 2: S → b S a
	Rule 3: S → B
	Rule 4: B → b B
	Rule 5: B → є

	– Parse: a a b b b

• Has to use R1: S ⇒ a S b
• Again has to use R1: a S b ⇒ a a S b b
• Now has to use Rule 2 or 3; follow the order (always R2 first):
	a a S b b ⇒ a a b S a b b ⇒ a a b b S a a b b ⇒ a a b b b S a a a b b ⇒ …
	– Now Rule 2 cannot be used any more: a a b b b B a a a b b is incorrect, so backtrack
• After some backtracking, it finally tries
	– S ⇒ a S b ⇒ a a S b b ⇒ a a B b b ⇒ a a b B b b ⇒ a a b b b, which works
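The backtracking search above can be sketched as a recursive-descent recognizer that, for each nonterminal, returns every input position it can reach; trying the alternatives in order and collecting all outcomes is what backtracking amounts to (Python sketch, written for this particular grammar):

```python
# Recursive descent with backtracking for:
#   S -> a S b | b S a | B      B -> b B | epsilon
def parse_S(s, i):
    """Return every position reachable after deriving S from s[i:]."""
    results = set()
    if i < len(s) and s[i] == "a":                 # Rule 1: S -> a S b
        for j in parse_S(s, i + 1):
            if j < len(s) and s[j] == "b":
                results.add(j + 1)
    if i < len(s) and s[i] == "b":                 # Rule 2: S -> b S a
        for j in parse_S(s, i + 1):
            if j < len(s) and s[j] == "a":
                results.add(j + 1)
    results |= parse_B(s, i)                       # Rule 3: S -> B
    return results

def parse_B(s, i):
    results = {i}                                  # Rule 5: B -> epsilon
    if i < len(s) and s[i] == "b":                 # Rule 4: B -> b B
        results |= parse_B(s, i + 1)
    return results

def accepts(s):
    return len(s) in parse_S(s, 0)                 # S must consume all input

assert accepts("aabbb")      # a (a (b) b) b, middle b derived via B
assert not accepts("aab")
```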

Page 44: Chapter 1

Predictive Parsing

• Need to immediately know which rule to apply when seeing the next input character
	– If for every non-terminal X
		• We know what the first terminal of each of X's productions would be
		• And the first terminal of each of X's productions is different
	– Then
		• When the current leftmost non-terminal is X
		• And we can look at the next input character
		• We know exactly which production should be used next to expand X

Page 45: Chapter 1

Predictive Parsing

• Need to immediately know which rule to apply when seeing the next input character
	– If for every non-terminal X
		• We know what the first terminal of each of X's productions would be
		• And the first terminal of each of X's productions is different
	– Example

	Rule 1: S → a S b   (first terminal is a: if the next input is a, use R1)
	Rule 2: S → b S a   (first terminal is b: if the next input is b, use R2)
	Rule 3: S → B       (but R3's first terminal is also b: won't work!)
	Rule 4: B → b B
	Rule 5: B → є

Page 46: Chapter 1

Predictive Parsing

• Need to immediately know which rule to apply when seeing the next input character
	– If for every non-terminal X
		• We know what the first terminal of each of X's productions would be
		• And the first terminal of each of X's productions is different
	– Which grammars do not satisfy the above?
		• If two productions of the same non-terminal have the same first symbol (N or T), you can see immediately that it won't work
			– S → b S a | b B
			– S → B a | B C
		• If the grammar is left recursive, it won't work either
			– S → S a | b B,  B → b B | c
			– The left-recursive rule of S can generate all terminals that the other productions of S can generate
				» S → b B can generate b, so S → S a can also generate b

Page 47: Chapter 1

Predictive Parsing

• Need to rewrite the grammar
	– Left recursion elimination
		• This is required even for the recursive descent parsing algorithm
	– Left factoring
		• Remove the leftmost common factors

Page 48: Chapter 1

First()

• First(α) = { t | α ⇒* tβ }
	– Consider all possible terminal strings derived from α
	– First(α) is the set of the first terminals of those strings
• For all terminals t ∈ T
	– First(t) = {t}

Page 49: Chapter 1

First()

• For all non-terminals X ∈ N
	– If X → e, add e to First(X)
	– If X → α1 α2 … αn
		• Each αi is either a terminal or a non-terminal (not a string as usual)
		• Add all terminals in First(α1) to First(X), excluding e
		• If e ∈ First(α1), …, e ∈ First(αi-1), then add all terminals in First(αi) to First(X)
		• If e ∈ First(α1), …, e ∈ First(αn), then add e to First(X)
• Apply the rules until nothing more can be added
• When adding t or e: add it only if it is not in the set yet

Page 50: Chapter 1

First()

• Grammar

	E → TE'
	E' → +TE' | e
	T → FT'
	T' → *FT' | e
	F → (E) | id | num

• First
	First(*) = {*}, First(+) = {+}, …
	First(F) = {(, id, num}
	First(T') = {*, e}
	First(T) = First(F) = {(, id, num}
	First(E') = {+, e}
	First(E) = First(T) = {(, id, num}
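The apply-until-nothing-changes rules can be coded as a fixed-point loop. A sketch that recomputes the First sets above (Python; "e" encodes epsilon, and any symbol that is not a left-hand side is treated as a terminal, both choices of this sketch):

```python
# Fixed-point computation of First() for the expression grammar above.
grammar = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], ["e"]],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], ["e"]],
    "F":  [["(", "E", ")"], ["id"], ["num"]],
}

def first_sets(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:                              # iterate until nothing is added
        changed = False
        for nt, productions in grammar.items():
            for prod in productions:
                for sym in prod:
                    if sym == "e":
                        add = {"e"}                       # X -> e
                    elif sym not in grammar:
                        add = {sym}                       # terminal: First(t) = {t}
                    else:
                        add = first[sym] - {"e"}          # nonterminal, minus e
                    before = len(first[nt])
                    first[nt] |= add
                    changed |= len(first[nt]) != before
                    if sym == "e" or (sym in grammar and "e" in first[sym]):
                        continue                          # symbol can vanish; look further
                    break                                 # not nullable: stop here
                else:
                    if "e" not in first[nt]:              # whole production nullable
                        first[nt].add("e")
                        changed = True
    return first

first = first_sets(grammar)
# first["E"] == {"(", "id", "num"}, first["E'"] == {"+", "e"}, ...
```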

Page 51: Chapter 1

First()

• Grammar

	S → AB
	A → aA | e
	B → bB | e

• First
	First(A) = {a, e}
	First(B) = {b, e}
	First(S) = First(A) = {a, e}

Is this complete?

Page 52: Chapter 1

First()

• Grammar

	S → AB | B    (R1 | R2)
	A → aA | c    (R3 | R4)
	B → bB | d    (R5 | R6)

• First
	First(A) = {a, c}
	First(B) = {b, d}
	First(S) = First(A) ∪ First(B) = {a, b, c, d}

• Productions
	– First(R1) = {a, c}, First(R2) = {b, d}
	– First(R3) = {a}, First(R4) = {c}
	– First(R5) = {b}, First(R6) = {d}

                 | see a  | see b  | see c  | see d
When expanding S | Use R1 | Use R2 | Use R1 | Use R2
When expanding A | Use R3 | -      | Use R4 | -
When expanding B | -      | Use R5 | -      | Use R6

Input: acbd
	Expand S, seeing a, use R1: S ⇒ AB
	Expand A, seeing a, use R3: AB ⇒ aAB
	Expand A, seeing c, use R4: aAB ⇒ acB
	Expand B, seeing b, use R5: acB ⇒ acbB
	Expand B, seeing d, use R6: acbB ⇒ acbd

Page 53: Chapter 1

First()

• Grammar

	S → AB        (R1)
	A → aA | e    (R2 | R3)
	B → bB | e    (R4 | R5)

• First
	First(A) = {a, e}
	First(B) = {b, e}
	First(S) = First(A) ∪ First(B) = {a, b, e}

• Productions
	– First(R1) = {a, b, e}
	– First(R2) = {a}, First(R3) = {e}
	– First(R4) = {b}, First(R5) = {e}

                 | see a  | see b  | see e
When expanding S | Use R1 | Use R1 | Use R1
When expanding A | Use R2 | -      | Use R3
When expanding B | -      | Use R4 | Use R5

Input: aabb
	Use R1: S ⇒ AB
	Expand A, seeing a, use R2: AB ⇒ aAB
	Expand A, seeing a, use R2: aAB ⇒ aaAB
	Expand A, seeing b: what to do? Not in the table!

Page 54: Chapter 1

Follow()

• Follow(A) = { t | S ⇒* αAtβ }
	– Consider all strings that may follow A
	– Follow(A) is the set of the first terminals of those strings
• Assumptions
	– There is a $ at the end of every input string
	– S is the starting symbol
• For all non-terminals only
	– Add $ into Follow(S)
	– If A → αBβ, add First(β) - {e} into Follow(B)
	– If A → αB, or A → αBβ with e ∈ First(β), add Follow(A) into Follow(B)

Page 55: Chapter 1

Follow()

Grammar: S → AB (R1),  A → aA | e (R2 | R3),  B → bB | e (R4 | R5)

• First
	First(A) = {a, e}
	First(B) = {b, e}
	First(S) = {a, b, e}
• Productions
	– First(R1) = {a, b, e}
	– First(R2) = {a}, First(R3) = {e}
	– First(R4) = {b}, First(R5) = {e}
• Follow
	– Follow(S) = {$}
	– Follow(B) = Follow(S) = {$}
	– Follow(A) = (First(B) - {e}) ∪ Follow(S) = {b, $}
		• Since e ∈ First(B), Follow(S) must also be in Follow(A)

Using First alone, the table is incomplete:

                 | see a  | see b
When expanding S | Use R1 | Use R1
When expanding A | Use R2 | ?
When expanding B | -      | Use R4

Using Follow to place the e-productions:

                 | see a  | see b  | see $
When expanding S | Use R1 | Use R1 | Use R1
When expanding A | Use R2 | Use R3 | Use R3
When expanding B | -      | Use R4 | Use R5

Page 56: Chapter 1

Construct a Parse Table

• Construct a parse table M[N, T ∪ {$}]
	– Non-terminals in the rows and terminals in the columns
• For each production A → α
	– For each terminal a ∈ First(α), add A → α to M[A, a]
		• Meaning: when at A and seeing input a, A → α should be used
	– If e ∈ First(α), then for each terminal a ∈ Follow(A), add A → α to M[A, a]
		• Meaning: when at A and seeing input a, A → α should be used in order to continue the expansion to e
		• Ex: X → AC,  A → B,  B → b | e,  C → cc
	– If e ∈ First(α) and $ ∈ Follow(A), add A → α to M[A, $]
		• Same as the above

Page 57: Chapter 1

First() and Follow(): another example

Grammar: E → TE',  E' → +TE' | e,  T → FT',  T' → *FT' | e,  F → (E) | id | num

• First
	– First(*) = {*}
	– First(F) = {(, id, num}
	– First(T') = {*, e}
	– First(T) = First(F) = {(, id, num}
	– First(E') = {+, e}
	– First(E) = First(T) = {(, id, num}
• Follow
	– Follow(E) = {$, )}
	– Follow(E') = Follow(E) = {$, )}
	– Follow(T) = {$, ), +}
		• Since we have TE' from the first rule and E' can be e
		• Follow(T) = (First(E') - {e}) ∪ Follow(E')
	– Follow(T') = Follow(T) = {$, ), +}
	– Follow(F) = {*, $, ), +}
		• Follow(F) = (First(T') - {e}) ∪ Follow(T')
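The Follow rules can likewise be computed by a fixed-point loop. A sketch for this grammar, seeding the First sets from the slide (Python; "e" for epsilon and symbol-list productions are representation choices of this sketch):

```python
# Fixed-point computation of Follow() for the expression grammar above.
grammar = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], ["e"]],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], ["e"]],
    "F":  [["(", "E", ")"], ["id"], ["num"]],
}
first = {"E": {"(", "id", "num"}, "E'": {"+", "e"},
         "T": {"(", "id", "num"}, "T'": {"*", "e"},
         "F": {"(", "id", "num"}}

def first_of_string(symbols):
    """First() of a symbol string, using the First table above."""
    out = set()
    for sym in symbols:
        f = first.get(sym, {sym})          # terminals: First(t) = {t}
        out |= f - {"e"}
        if "e" not in f:
            return out
    out.add("e")                           # every symbol was nullable
    return out

def follow_sets(start="E"):
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")                 # rule: $ goes into Follow(S)
    changed = True
    while changed:
        changed = False
        for a, productions in grammar.items():
            for prod in productions:
                for i, b in enumerate(prod):
                    if b not in grammar:
                        continue           # Follow is defined for non-terminals only
                    tail = first_of_string(prod[i + 1:])
                    add = tail - {"e"}     # A -> alpha B beta: First(beta) - {e}
                    if "e" in tail:
                        add |= follow[a]   # beta nullable (or empty): Follow(A)
                    before = len(follow[b])
                    follow[b] |= add
                    changed |= len(follow[b]) != before
    return follow

follow = follow_sets()
# follow["E"] == {"$", ")"}, follow["T"] == {"$", ")", "+"}, ...
```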

Page 58: Chapter 1

Construct a Parse Table

Grammar: E → TE',  E' → +TE' | e,  T → FT',  T' → *FT' | e,  F → (E) | id | num

First(E) = First(T) = First(F) = {(, id, num};  First(T') = {*, e};  First(E') = {+, e}
Follow(E) = Follow(E') = {$, )};  Follow(T) = Follow(T') = {$, ), +};  Follow(F) = {*, $, ), +}

Table entries:
	E → TE': First(TE') = {(, id, num}
	E' → +TE': First(+TE') = {+}
	E' → e: Follow(E') = {$, )}
	T → FT': First(FT') = {(, id, num}
	T' → *FT': First(*FT') = {*}
	T' → e: Follow(T') = {$, ), +}

   | id      | num      | *         | +         | (        | )      | $
E  | E → TE' | E → TE'  |           |           | E → TE'  |        |
E' |         |          |           | E' → +TE' |          | E' → e | E' → e
T  | T → FT' | T → FT'  |           |           | T → FT'  |        |
T' |         |          | T' → *FT' | T' → e    |          | T' → e | T' → e
F  | F → id  | F → num  |           |           | F → (E)  |        |

Page 59: Chapter 1

Stack     | Input           | Action
E $       | id + num * id $ | E → TE'
T E' $    | id + num * id $ | T → FT'
F T' E' $ | id + num * id $ | F → id (pop F, remove id from input)
T' E' $   | + num * id $    | T' → e
E' $      | + num * id $    | E' → +TE' (only TE' stays on the stack; remove + from input)
T E' $    | num * id $      | T → FT'
F T' E' $ | num * id $      | F → num (pop F, remove num from input)
T' E' $   | * id $          | T' → *FT' (remove * from input)
F T' E' $ | id $            | F → id (pop F, remove id from input)
T' E' $   | $               | T' → e (pop T'; input unchanged)
E' $      | $               | E' → e
$         | $               | Accept

(Uses the parse table constructed on the previous slide.)
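The trace above is exactly what a table-driven predictive parser does: pop the stack, expand non-terminals via the table, and match terminals against the input. A minimal sketch (Python; the table entries mirror the parse table built earlier, with "e" for epsilon):

```python
# Table-driven LL(1) parse of the expression grammar.
table = {
    ("E", "id"): ["T", "E'"], ("E", "num"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): ["e"], ("E'", "$"): ["e"],
    ("T", "id"): ["F", "T'"], ("T", "num"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "*"): ["*", "F", "T'"], ("T'", "+"): ["e"],
    ("T'", ")"): ["e"], ("T'", "$"): ["e"],
    ("F", "id"): ["id"], ("F", "num"): ["num"], ("F", "("): ["(", "E", ")"],
}
nonterminals = {"E", "E'", "T", "T'", "F"}

def ll1_parse(tokens):
    tokens = tokens + ["$"]
    stack = ["$", "E"]                    # start symbol on top
    pos = 0
    while stack:
        top = stack.pop()
        if top == "e":
            continue                      # epsilon: expand to nothing
        if top not in nonterminals:
            if top != tokens[pos]:
                return False              # terminal mismatch: error
            pos += 1                      # match: consume one input token
            continue
        prod = table.get((top, tokens[pos]))
        if prod is None:
            return False                  # empty table cell: error
        stack.extend(reversed(prod))      # push the production right-to-left
    return pos == len(tokens)

assert ll1_parse(["id", "+", "num", "*", "id"])
assert not ll1_parse(["id", "+", "*"])
```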

Page 60: Chapter 1

More about LL Grammars

• What grammar is not LL(1)?

	S → A | B
	A → aaA | e
	B → abB | b

• First(A) = {a, e}, First(B) = {a, b}, First(S) = {a, b, e}
• Follow(S) = {$}, Follow(A) = {$}, Follow(B) = {$}

LL(1) table (conflict in the a column):

   | a            | b     | $
S  | S → A, S → B | S → B | S → A
A  | A → aaA      | -     | A → e
B  | B → abB      | B → b | -

	– But this grammar is LL(2)
		• If we look ahead 2 input characters, predictive parsing is possible
		• First2(A) = {aa, e}, First2(B) = {ab, b$}, First2(S) = {aa, ab, b$, e}

   | aa      | ab      | b$    | $
S  | S → A   | S → B   | S → B | S → A
A  | A → aaA |         |       | A → e
B  | B → abB |         | B → b |

Page 61: Chapter 1

A Shift-Reduce Parser

Grammar:
	E → E+T | T
	T → T*F | F
	F → (E) | id

Right-most derivation of id+id*id:
	E ⇒ E+T ⇒ E+T*F ⇒ E+T*id ⇒ E+F*id ⇒ E+id*id ⇒ T+id*id ⇒ F+id*id ⇒ id+id*id

A shift-reduce parser traces this derivation in reverse:

Right-most sentential form | Reducing production
id+id*id                   | F → id
F+id*id                    | T → F
T+id*id                    | E → T
E+id*id                    | F → id
E+F*id                     | T → F
E+T*id                     | F → id
E+T*F                      | T → T*F
E+T                        | E → E+T
E                          |

(In the original slides the handles are highlighted in each right-sentential form.)

Page 62: Chapter 1

A Stack Implementation of a Shift-Reduce Parser

• There are four possible actions of a shift-reduce parser:
	1. Shift: the next input symbol is shifted onto the top of the stack.
	2. Reduce: replace the handle on the top of the stack by the corresponding non-terminal.
	3. Accept: successful completion of parsing.
	4. Error: the parser discovers a syntax error and calls an error recovery routine.
• The initial stack contains only the end-marker $.
• The end of the input string is also marked by the end-marker $.

Page 63: Chapter 1

A Stack Implementation of A Shift-Reduce Parser

Stack   | Input     | Action
$       | id+id*id$ | shift
$id     | +id*id$   | reduce by F → id
$F      | +id*id$   | reduce by T → F
$T      | +id*id$   | reduce by E → T
$E      | +id*id$   | shift
$E+     | id*id$    | shift
$E+id   | *id$      | reduce by F → id
$E+F    | *id$      | reduce by T → F
$E+T    | *id$      | shift
$E+T*   | id$       | shift
$E+T*id | $         | reduce by F → id
$E+T*F  | $         | reduce by T → T*F
$E+T    | $         | reduce by E → E+T
$E      | $         | accept

Page 64: Chapter 1

Operator-Precedence Parser

• Operator grammars
	– A small but important class of grammars
	– We can build an efficient operator-precedence parser (a shift-reduce parser) for an operator grammar.
• In an operator grammar, no production rule can have:
	– e on the right side
	– two adjacent non-terminals on the right side

• Ex:
	Not an operator grammar:    Operator grammar:
	E → AB                      E → E+E | E*E | E/E | id
	A → a
	B → b

Page 65: Chapter 1

Precedence Relations

• In operator-precedence parsing, we define three disjoint precedence relations between certain pairs of terminals:

	a <. b    b has higher precedence than a
	a =· b    b has the same precedence as a
	a .> b    b has lower precedence than a

• The determination of the correct precedence relations between terminals is based on the traditional notions of associativity and precedence of operators. (Unary minus causes a problem.)

Page 66: Chapter 1

Using Operator-Precedence Relations

• The intention of the precedence relations is to delimit the handle of a right-sentential form: <. marks the left end, =· appears in the interior of the handle, and .> marks the right end.
• In the input string $a1a2...an$, we insert between each pair of adjacent terminals the precedence relation that holds for that pair.

Page 67: Chapter 1

Using Operator-Precedence Relations

E → E+E | E-E | E*E | E/E | E^E | (E) | -E | id

The partial operator-precedence table for this grammar:

   | id | +  | *  | $
id |    | .> | .> | .>
+  | <. | .> | <. | .>
*  | <. | .> | .> | .>
$  | <. | <. | <. |

• The input string id+id*id with the precedence relations inserted:

	$ <. id .> + <. id .> * <. id .> $
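Inserting the relations is a single pass over adjacent terminal pairs. A sketch driven by the partial table above (Python):

```python
# Insert operator-precedence relations into $ id + id * id $.
prec = {
    ("id", "+"): ".>", ("id", "*"): ".>", ("id", "$"): ".>",
    ("+", "id"): "<.", ("+", "+"): ".>", ("+", "*"): "<.", ("+", "$"): ".>",
    ("*", "id"): "<.", ("*", "+"): ".>", ("*", "*"): ".>", ("*", "$"): ".>",
    ("$", "id"): "<.", ("$", "+"): "<.", ("$", "*"): "<.",
}

def insert_relations(tokens):
    tokens = ["$"] + tokens + ["$"]        # add the end markers
    out = [tokens[0]]
    for a, b in zip(tokens, tokens[1:]):   # one lookup per adjacent pair
        out += [prec[(a, b)], b]
    return " ".join(out)

assert insert_relations(["id", "+", "id", "*", "id"]) == \
    "$ <. id .> + <. id .> * <. id .> $"
```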

Page 68: Chapter 1

To Find the Handles

1. Scan the string from the left end until the first .> is encountered.
2. Then scan backwards (to the left) over any =· until a <. is encountered.
3. The handle contains everything to the left of the first .> and to the right of that <..

$ <. id .> + <. id .> * <. id .> $   E → id    $ id + id * id $
$ <. + <. id .> * <. id .> $         E → id    $ E + id * id $
$ <. + <. * <. id .> $               E → id    $ E + E * id $
$ <. + <. * .> $                     E → E*E   $ E + E * E $
$ <. + .> $                          E → E+E   $ E + E $
$ $                                            $ E $

Page 69: Chapter 1

Operator-Precedence Parsing Algorithm: Example

Stack   | Input     | Action
$       | id+id*id$ | $ <. id, shift
$id     | +id*id$   | id .> +, reduce E → id
$E      | +id*id$   | shift
$E+     | id*id$    | shift
$E+id   | *id$      | id .> *, reduce E → id
$E+E    | *id$      | shift
$E+E*   | id$       | shift
$E+E*id | $         | id .> $, reduce E → id
$E+E*E  | $         | * .> $, reduce E → E*E
$E+E    | $         | + .> $, reduce E → E+E
$E      | $         | accept

(Uses the precedence table from the previous slide.)

Page 70: Chapter 1

Chapter - 6

Introduction to Compiler

Unit-6

Page 71: Chapter 1

Aspects of Compilation

• A compiler bridges the semantic gap between a PL domain and an execution domain.
• Two aspects of compilation are:
	– Generate code to implement the meaning of a source program in the execution domain.
	– Provide diagnostics for violations of PL semantics in a source program.
• Relevant PL features:
	– Data types
	– Data structures
	– Scope rules
	– Control structures

Page 72: Chapter 1

Three-Address Code

• In three-address code, there is at most one operator on the right side of an instruction; that is, no built-up arithmetic expressions are permitted.
• Example: a source-language expression x+y*z might be translated into the sequence of three-address instructions

	t1 = y * z
	t2 = x + t1

  where t1 and t2 are compiler-generated temporary names.
• Generate code for x = a+b+c+d
• Generate code for x = -a*b + -a*b
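For a chain of one operator, such as x = a+b+c+d, three-address code falls out of a left-to-right walk that spills each partial result into a new temporary. A sketch (Python; the string-based output format is a choice of this sketch):

```python
# Generate three-address code for target = op1 OP op2 OP ... (left-associative).
def gen_tac(target, operands, op="+"):
    code, temp = [], 0
    acc = operands[0]                      # running partial result
    for operand in operands[1:]:
        temp += 1
        name = f"t{temp}"                  # fresh compiler-generated temporary
        code.append(f"{name} = {acc} {op} {operand}")
        acc = name
    code.append(f"{target} = {acc}")       # final assignment to the target
    return code

code = gen_tac("x", ["a", "b", "c", "d"])
# t1 = a + b
# t2 = t1 + c
# t3 = t2 + d
# x = t3
```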

Page 73: Chapter 1

Quadruple

Three-address code for x = -a*b + -a*b:

	t1 = uminus a
	t2 = t1 * b
	t3 = uminus a
	t4 = t3 * b
	t5 = t2 + t4
	x = t5

Quadruple representation:

    | OP     | Arg1 | Arg2 | Result
(0) | uminus | a    |      | t1
(1) | *      | t1   | b    | t2
(2) | uminus | a    |      | t3
(3) | *      | t3   | b    | t4
(4) | +      | t2   | t4   | t5
(5) | =      | t5   |      | x

Page 74: Chapter 1

Triple

Three-address code for x = -a*b + -a*b:

	t1 = uminus a
	t2 = t1 * b
	t3 = uminus a
	t4 = t3 * b
	t5 = t2 + t4
	x = t5

Triple representation (temporaries are referred to by instruction number):

    | OP     | Arg1 | Arg2
(0) | uminus | a    |
(1) | *      | (0)  | b
(2) | uminus | a    |
(3) | *      | (2)  | b
(4) | +      | (1)  | (3)
(5) | =      | x    | (4)

Page 75: Chapter 1

Indirect Triples

Three-address code for x = -a*b + -a*b:

	t1 = uminus a
	t2 = t1 * b
	t3 = uminus a
	t4 = t3 * b
	t5 = t2 + t4
	x = t5

The triples are stored in a table (here at positions (11) through (16)), and a separate statement list points to them in execution order:

Statement | Pointer
(0)       | (11)
(1)       | (12)
(2)       | (13)
(3)       | (14)
(4)       | (15)
(5)       | (16)

     | OP     | Arg1 | Arg2
(11) | uminus | a    |
(12) | *      | (11) | b
(13) | uminus | a    |
(14) | *      | (13) | b
(15) | +      | (12) | (14)
(16) | =      | x    | (15)

Page 76: Chapter 1

Example

• Construct quadruple, triple, and indirect triple representations of

	a = b * - c + b * - c

Page 77: Chapter 1

Example

[Figure: quadruple, triple, and indirect triple representations of a = b * - c + b * - c; part (c) shows the indirect triples.]

Page 78: Chapter 1

Aspects of Compilation

• A compiler bridges the specification gap between a PL domain and an execution domain.
• It generates code to implement the meaning of a source program in the execution domain.
• It provides diagnostics for violations of PL semantics in a source program.
• PL features involved:
	• Data types: specification of legal values for variables of the type
	• Data structures:
	• Scope rules: accessibility of variables declared in different blocks of a program
	• Control structures:

Page 79: Chapter 1

Memory Allocation

• Memory binding: an association between the 'memory address' attribute of a data item and the address of a memory area.
	– Static memory allocation: memory is allocated before execution of the program begins.
	– Dynamic memory allocation: memory is allocated during execution of the program.

Page 80: Chapter 1

Static Memory Allocation

• The program consists of three units A, B, C:

	Procedure A: Data(A), calls B()
	Procedure B: Data(B), calls C()
	Procedure C: Data(C)

• Memory layout: code and data of all units are allocated before execution:

	Code(A) | Data(A) | Code(B) | Data(B) | Code(C) | Data(C)

• Advantage??????

Page 81: Chapter 1

Dynamic Memory Allocation

• The program consists of three units A, B, C (A calls B, B calls C).
• Unit A is active: only Data(A) is allocated:

	Code(A) | Code(B) | Code(C) | Data(A)

Page 82: Chapter 1

Dynamic Memory Allocation

• Procedure A calls B: Data(B) gets allocated:

	Code(A) | Code(B) | Code(C) | Data(A) | Data(B)

Page 83: Chapter 1

Dynamic Memory Allocation

• Procedure B calls C: Data(C) gets allocated:

	Code(A) | Code(B) | Code(C) | Data(A) | Data(B) | Data(C)

Page 84: Chapter 1

Dynamic Memory Allocation

• A different scenario: A calls B; B returns; then A calls C.

	Procedure A: Data(A), calls B(), then C()
	Procedure B: Data(B)
	Procedure C: Data(C)

	(a) A active:               Code(A) | Code(B) | Code(C) | Data(A)
	(b) A calls B:              Code(A) | Code(B) | Code(C) | Data(A) | Data(B)
	(c) B returns, A calls C:   Code(A) | Code(B) | Code(C) | Data(A) | Data(C)

• Memory allocation in block-structured languages works the same way.
• Advantage??????

Page 85: Chapter 1

Stack

• Last In, First Out (LIFO) data structure

	main()         { a(0); }
	void a (int m) { b(1); }
	void b (int n) { c(2); }
	void c (int o) { d(3); }
	void d (int p) { }

• As each call is made, a new frame is pushed and the stack pointer moves down: the stack grows downward.

Page 86: Chapter 1

• Activation Records:
	• Also called frames
	• The information (memory) needed by a single execution of a procedure
• A general activation record:

	Return value          | stores the result of the function call
	Actual parameters     | information on the actual parameters
	Optional control link | points to the activation record of the calling procedure
	Optional access link  | refers to non-local data held in other activation records
	Machine status        | saved machine state (e.g., program counter)
	Local variables       | store local data
	Temporaries           | store temporary values

Page 87: Chapter 1

Activation Record for Factorial Program

	main()
	{
		int f;
		f = factorial(3);
	}

	int factorial(int n)
	{
		if (n == 1)
			return 1;
		else
			return n * factorial(n - 1);
	}

Page 88: Chapter 1

Activation Record for Factorial Program

Page 89: Chapter 1

Activation Record for Factorial Program

Page 90: Chapter 1

– Parameter passing
	• The method used to associate actual parameters with formal parameters.
	• The parameter passing method affects the code generated.

• Call-by-value:
	– The actual parameters are evaluated and their r-values are passed to the called procedure.
	– Implementation:
		» A formal parameter is treated like a local name, so the storage for the formals is in the activation record of the called procedure.
		» The caller evaluates the actual parameters and places their r-values in the storage for the formals.

Page 91: Chapter 1

– Call-by-reference:
	• Also called call-by-address or call-by-location.
	• The caller passes to the called procedure a pointer to the storage address of each actual parameter.
		– The actual parameter must have an address; only variables make sense, an expression does not (the location of the temporary that holds the result of the expression would be passed).

– Copy-restore:
	• A hybrid between call-by-value and call-by-reference.
		– The actual parameters are evaluated and their r-values are passed to the called procedure as in call-by-value.
		– When control returns, the r-values of the formal parameters are copied back into the l-values of the actuals.
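Copy-restore semantics can be mimicked in a few lines: copy the r-values in, let the callee work on the copies, and copy the results back on return. A sketch (Python; all function names here are hypothetical):

```python
# Simulating copy-restore (value-result) parameter passing.
def call_copy_restore(procedure, env, actual_names):
    formals = [env[name] for name in actual_names]   # copy r-values in
    procedure(formals)                               # callee mutates only its copies
    for name, value in zip(actual_names, formals):   # on return, copy results back
        env[name] = value                            # ...into the actuals' l-values

def increment_first(args):      # hypothetical callee: bumps its first formal
    args[0] += 1

env = {"x": 10, "y": 20}
call_copy_restore(increment_first, env, ["x"])
assert env["x"] == 11           # the result was copied back on return
```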