cs419 lec7 cfg

Post on 20-Nov-2014


WELCOME TO A JOURNEY TO COMPILERS

CS419 Lecture 7

Parsing Tokens Using Context-Free Grammars

Cairo University, FCI

Dr. Hussien Sharaf, Computer Science Department, dr.sharaf@from-masr.com


PART ONE

Dr. Hussien M. Sharaf

PARSING

A parser gets a stream of tokens from the scanner and determines whether the syntax (structure) of the program is correct according to the (context-free) grammar of the source language.

Then, it produces a data structure, called a parse tree or an abstract syntax tree, which describes the syntactic structure of the program.

Stream of tokens → parser → Parse/syntax tree
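To make the "stream of tokens" concrete, here is a minimal scanner sketch for a tiny arithmetic language. The token names (NUMBER, PLUS, TIMES, SKIP) and the function name `scan` are assumptions made for illustration; they do not come from the lecture.

```python
import re

# Hypothetical token categories for a tiny arithmetic language (an assumption,
# not taken from the lecture).
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("PLUS", r"\+"),
    ("TIMES", r"\*"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def scan(source):
    """Turn source text into a list of (kind, text) tokens for a parser."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":          # whitespace is not passed on
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

For example, `scan("2 + 3*5")` yields five tokens: a NUMBER, a PLUS, a NUMBER, a TIMES, and a NUMBER. The parser then works on this list instead of on raw characters.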

CFG

A context-free grammar is a notation for defining context-free languages.

It is more powerful than finite automata or REs, but it still cannot define all possible languages.

Useful for nested structures, e.g., parentheses in programming languages.

The basic idea is to use “variables” to stand for sets of strings.

These variables are defined recursively, in terms of one another.

CFG FORMAL DEFINITION

C = (V, Σ, R, S)

V: a finite set of variables.

Σ: symbols, called terminals, of the alphabet of the language being defined.

S ∈ V: a special start symbol.

R: a finite set of production rules of the form A → α, where A ∈ V and α ∈ (V ∪ Σ)*.
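The four components can be written down directly as a data structure. The sketch below is one possible encoding, not the lecture's: the class name `Grammar`, the field names, and the choice of tuples for rule bodies are assumptions. An empty body stands for Λ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    variables: frozenset      # V
    terminals: frozenset      # Σ
    rules: tuple              # R, as (head, body) pairs; body is a tuple of symbols
    start: str                # S

    def check(self):
        """Return True if the four components satisfy the formal definition."""
        if self.variables & self.terminals:
            return False                      # V and Σ are assumed disjoint
        if self.start not in self.variables:
            return False                      # S must be a variable
        symbols = self.variables | self.terminals
        return all(
            head in self.variables and all(sym in symbols for sym in body)
            for head, body in self.rules
        )


# Example: a grammar with rules S → ab and S → aSb.
ANBN = Grammar(
    variables=frozenset({"S"}),
    terminals=frozenset({"a", "b"}),
    rules=(("S", ("a", "b")), ("S", ("a", "S", "b"))),
    start="S",
)
```

`ANBN.check()` returns True, while a rule whose head is a terminal, or whose body uses an undeclared symbol, makes `check()` return False.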

CFG -1

Define the language { a^n b^n | n ≥ 1 }.

Terminals = {a, b}. Variables = {S}. Start symbol = S.

Productions:

S → ab

S → aSb

Summary: S → ab | aSb
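The two productions translate almost line by line into a recursive membership test. This is a minimal sketch; the function name `in_anbn` is an assumption for illustration.

```python
def in_anbn(s):
    """Return True if s can be derived from S using S → ab | aSb."""
    if s == "ab":                       # S → ab
        return True
    if len(s) > 2 and s[0] == "a" and s[-1] == "b":
        return in_anbn(s[1:-1])         # S → aSb, recurse on the inner S
    return False
```

The shortest accepted string is "ab": no production derives the empty string, so the empty string is rejected.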

DERIVATION

We derive strings in the language of a CFG by starting with the start symbol and repeatedly replacing some variable A by the right side of one of its productions.

Derivation example for “aabb”:

Using S → aSb generates an incomplete string that still contains the non-terminal S.

Then using S → ab to replace the inner S generates “aabb”.

S ⇒ aSb ⇒ aabb ……[Successful derivation of aabb]
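The replacement step can be sketched as plain string rewriting. The helper below is illustrative only: the name `derive` is an assumption, and it assumes every symbol is a single character with variables written in upper case.

```python
def derive(start, steps):
    """Apply each production in `steps` to the leftmost occurrence of its head.

    `steps` is a list of (head, body) pairs. Returns every sentential form,
    beginning with the start symbol.
    """
    forms = [start]
    current = start
    for head, body in steps:
        if head not in current:
            raise ValueError(f"no {head} left to replace in {current!r}")
        current = current.replace(head, body, 1)   # rewrite one occurrence only
        forms.append(current)
    return forms


forms = derive("S", [("S", "aSb"), ("S", "ab")])
# forms == ["S", "aSb", "aabb"]
```

The derivation is complete when the last form contains no variable.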

CFG -1 : BALANCED-PARENTHESES

Prod1: S → (S)

Prod2: S → ()

Derive the string ((())).

S ⇒ (S) …..[by prod1]

⇒ ((S)) …..[by prod1]

⇒ ((())) …..[by prod2]
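Because each production adds exactly one pair of parentheses around the single S, the derivation of a string is fully determined by its nesting depth. The sketch below rebuilds the sentential forms; the function name `nested_derivation` is an assumption.

```python
def nested_derivation(s):
    """Return the sentential forms deriving s with S → (S) | (), or None.

    The grammar only produces fully nested strings such as ((())); a balanced
    string like ()() is rejected because no production places two S side by side.
    """
    depth = len(s) // 2
    if depth == 0 or s != "(" * depth + ")" * depth:
        return None
    forms = ["S"]
    for k in range(1, depth):                       # prod1, applied depth-1 times
        forms.append("(" * k + "S" + ")" * k)
    forms.append(s)                                 # prod2 closes the derivation
    return forms
```

Calling `nested_derivation("((()))")` returns the four forms S, (S), ((S)), ((())) shown on the slide.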

CFG -2 : PALINDROME

Describe palindromes of a’s and b’s using a CFG.

1] S → aSa 2] S → bSb 3] S → Λ

Derive “baab” from the above grammar.

S ⇒ bSb [by 2]

⇒ baSab [by 1]

⇒ baab [by 3]
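The same derivation can be found mechanically by peeling matching outer letters. This sketch returns the rule numbers in the order they are applied; the function name `palindrome_rules` is an assumption.

```python
def palindrome_rules(s):
    """Return the rule numbers that derive s from S, or None if s is not derivable.

    Rules, numbered as on the slide: 1] S → aSa  2] S → bSb  3] S → Λ
    """
    if s == "":
        return [3]
    if len(s) >= 2 and s[0] == s[-1] and s[0] in "ab":
        inner = palindrome_rules(s[1:-1])
        if inner is not None:
            return [1 if s[0] == "a" else 2] + inner
    return None
```

For "baab" the result is [2, 1, 3], matching the steps above. An odd-length palindrome such as "aba" gives None, because every rule adds either two letters or none.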

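CFG -3 : EVEN-PALINDROME

i.e. {Λ, abba, abbaabba, …}

S → aSa | bSb | Λ

Derive abaaba:

S ⇒ aSa ⇒ abSba ⇒ abaSaba ⇒ abaaba

[Figure: parse tree for abaaba. The root S has children a, S, a; the next S has children b, S, b; the next S has children a, S, a; the innermost S derives Λ.]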

CFG – 4

Describe anything in (a+b)* using a CFG.

1] S → Λ 2] S → Y 3] Y → aY 4] Y → bY 5] Y → a 6] Y → b

Derive “aab” from the above grammar.

S ⇒ Y [by 2]

⇒ aY [by 3]

⇒ aaY [by 3]

⇒ aab [by 6]

CFG – 5

1] S → Λ 2] S → aS 3] S → bS

Derive “aa” from the above grammar.

S ⇒ aS [by 2]

⇒ aaS [by 2]

⇒ aa [by 1]
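CFG 4 and CFG 5 look different but describe the same language, (a+b)*. One way to check this on short strings is to enumerate everything each grammar derives up to a length bound. The sketch below is a bounded check, not a proof; the names `language_up_to`, `CFG4`, and `CFG5` are assumptions, and variables are assumed to be single upper-case letters.

```python
from collections import deque


def language_up_to(rules, start, max_len):
    """Enumerate every terminal string of length <= max_len derivable from start.

    `rules` maps an upper-case variable to a list of bodies; "" stands for Λ.
    Assumes every sentential form holds at most one variable (true for both
    grammars below), so pruning by terminal count guarantees termination.
    """
    seen, found = {start}, set()
    queue = deque([start])
    while queue:
        form = queue.popleft()
        index = next((i for i, ch in enumerate(form) if ch.isupper()), None)
        if index is None:
            found.add(form)                       # no variable left: a sentence
            continue
        for body in rules[form[index]]:
            new = form[:index] + body + form[index + 1:]
            terminals = sum(not ch.isupper() for ch in new)
            if terminals <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return found


CFG4 = {"S": ["", "Y"], "Y": ["aY", "bY", "a", "b"]}
CFG5 = {"S": ["", "aS", "bS"]}
```

With a bound of 2, both grammars yield the same seven strings: the empty string, a, b, aa, ab, ba, and bb.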


PART TWO


Parsing

A CFG categorizes the statements of a language.

Parsing using a CFG means assigning given statements to the categories defined in the CFG.

Parsing can be expressed using a special type of graph called a tree, in which no cycles exist.

A parse tree is the graph representation of a derivation.

Programmatically, a parse tree can be represented as a dynamic data structure with a single root node.
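A minimal sketch of such a dynamic structure follows. The class name `Node` and the method name `leaves` are assumptions; the example tree is built by hand for the string aabb under the rules S → aSb | ab.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str                                   # a variable or a terminal
    children: list = field(default_factory=list)

    def leaves(self):
        """Concatenate the leaf labels from left to right (the tree's yield)."""
        if not self.children:
            return self.label
        return "".join(child.leaves() for child in self.children)


# Parse tree for aabb under S → aSb | ab, built by hand from the root down.
inner = Node("S", [Node("a"), Node("b")])
root = Node("S", [Node("a"), inner, Node("b")])
```

Reading the leaves of `root` from left to right gives back the parsed string, so `root.leaves()` is "aabb". Each interior node together with its children corresponds to one production used in the derivation.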

Parse tree

(1) A vertex whose label is a non-terminal symbol is a parse tree.

(2) If A → y1 y2 … yn is a rule in R, then the tree with root A and children y1, y2, …, yn (in that order) is a parse tree.

Ambiguity

A grammar can generate the same string in different ways.

Ambiguity occurs when a string has two or more leftmost derivations in the same CFG.

Some sources of ambiguity can be removed by rewriting the grammar, for example by converting it to Chomsky Normal Form (CNF), which doesn’t use Λ-productions (apart from, possibly, one for the start symbol).

Λ-productions can cause ambiguity.

Ex 1

Deduce a CFG for addition and parse the expression 2+3+5.

1] S → S+S | N

2] N → 1|2|3|4|5|6|7|8|9|0 | N1|N2|N3|N4|N5|N6|N7|N8|N9|N0

[Figure: parse tree for 2+3+5 in which 2+3 is grouped first, i.e. (2+3)+5.]

Can you make another parse tree?
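The question can be answered by counting. Under S → S+S | N, any '+' may sit at the root of the tree, and the two sides are then parsed independently. The sketch below counts the resulting trees; the function name `count_parse_trees` is an assumption, and each number is taken to have exactly one derivation from N.

```python
from functools import lru_cache


def count_parse_trees(expr):
    """Count parse trees of a '+'-separated expression under S → S+S | N."""
    operands = tuple(expr.split("+"))
    if not all(op.isdigit() for op in operands):
        raise ValueError("expected digits separated by '+'")

    @lru_cache(maxsize=None)
    def count(lo, hi):
        # Number of trees deriving operands[lo:hi].
        if hi - lo == 1:
            return 1                              # S → N
        # S → S+S: choose which '+' sits at the root.
        return sum(count(lo, mid) * count(mid, hi) for mid in range(lo + 1, hi))

    return count(0, len(operands))
```

For "2+3+5" the count is 2: one tree groups the expression as (2+3)+5 and the other as 2+(3+5). Since a single string has more than one parse tree, the grammar is ambiguous.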

Ex 2

Deduce a CFG for addition/multiplication and parse the expression 2+3*5.

1] S → S+S | S*S | N

2] N → 1|2|3|4|5|6|7|8|9|0 | NN

[Figure: parse tree for 2+3*5 with * at the root, so that 2+3 is grouped first, i.e. (2+3)*5.]

Can you make another parse tree?
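With two operators, ambiguity changes the meaning of the expression, not only the shape of the tree. The sketch below evaluates every parse tree that S → S+S | S*S | N allows and collects the distinct results. The function name `all_values` and the list-of-tokens input format are assumptions.

```python
def all_values(tokens):
    """Return the set of values over every parse tree under S → S+S | S*S | N.

    `tokens` alternates numbers and operators, e.g. [2, "+", 3, "*", 5].
    """
    if len(tokens) == 1:
        return {tokens[0]}                        # S → N
    values = set()
    for i in range(1, len(tokens), 2):            # each operator may be the root
        for left in all_values(tokens[:i]):
            for right in all_values(tokens[i + 1:]):
                values.add(left + right if tokens[i] == "+" else left * right)
    return values
```

For 2+3*5 the result is {17, 25}: the tree with + at the root evaluates to 17, and the tree with * at the root evaluates to 25.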

Ex 3 CFG without ambiguity

Deduce a CFG for addition/multiplication and parse the expression 2*3+5.

1] S → Term | Term + S

2] Term → N | N * Term

3] N → 1|2|3|4|5|6|7|8|9|0

[Figure: parse tree for 2*3+5 with + at the root, so that 2*3 is grouped first, i.e. (2*3)+5.]

Can you make another parse tree?
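Because this grammar is right-recursive, it maps directly onto a recursive-descent parser with one function per variable. The sketch below is one way to write it, not the lecture's implementation; the function names and the nested-tuple tree format are assumptions, and the input is assumed to contain no spaces.

```python
def parse(text):
    """Parse text with S → Term | Term + S, Term → N | N * Term, N → digit.

    Returns a nested tuple: a digit string for N, or (left, operator, right).
    """
    pos = 0

    def number():
        nonlocal pos
        if pos < len(text) and text[pos].isdigit():
            pos += 1
            return text[pos - 1]
        raise SyntaxError(f"digit expected at position {pos}")

    def term():
        nonlocal pos
        left = number()
        if pos < len(text) and text[pos] == "*":  # Term → N * Term
            pos += 1
            return (left, "*", term())
        return left                               # Term → N

    def start():
        nonlocal pos
        left = term()
        if pos < len(text) and text[pos] == "+":  # S → Term + S
            pos += 1
            return (left, "+", start())
        return left                               # S → Term

    tree = start()
    if pos != len(text):
        raise SyntaxError(f"unexpected {text[pos]!r} at position {pos}")
    return tree
```

`parse("2*3+5")` returns (("2", "*", "3"), "+", "5"), and `parse("2+3*5")` returns ("2", "+", ("3", "*", "5")). In both cases the multiplication ends up deeper in the tree than the addition, and at each step only one production applies, so no second tree exists.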

Example 4 : AABB

S → A | A B

A → Λ | a | A b | A A

B → b | b c | B c | b B

Sample derivations:

S ⇒ AB ⇒ AbB ⇒ Abb ⇒ AAbb ⇒ Aabb ⇒ aabb

S ⇒ AB ⇒ AAB ⇒ aAB ⇒ aaB ⇒ aabB ⇒ aabb

[Figure: two different parse trees for aabb, both with S → A B at the root.]

Ex 5

S → A | A B

A → Λ | a | A b | A A

B → b | b c | B c | b B

w = aabb

[Figure: three different parse trees for w = aabb. Two have S → A B at the root; one has S → A at the root and uses A → Λ.]

REMOVING AMBIGUITY

Eliminate “useless” variables.

Eliminate Λ-productions: A → Λ.

Avoid left recursion by replacing it with right recursion.

But if a language is inherently ambiguous, the ambiguity can’t be removed completely. We just need the parsing to continue without entering an infinite loop.
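Replacing left recursion with right recursion can be done mechanically for the immediate case. The sketch below uses the standard textbook rewrite, which turns A → Aα | β into A → βA' and A' → αA' | Λ. This is not necessarily the method intended on the slide. The function name `remove_left_recursion` and the primed variable name are assumptions, and α is assumed to be non-empty.

```python
def remove_left_recursion(head, bodies):
    """Rewrite A → Aα | β as A → βA', A' → αA' | Λ (immediate left recursion only).

    Each body is a tuple of symbols; the empty tuple stands for Λ. The new
    variable name head + "'" is an assumption and must not clash with existing names.
    """
    recursive = [body[1:] for body in bodies if body and body[0] == head]
    others = [body for body in bodies if not body or body[0] != head]
    if not recursive:
        return {head: list(bodies)}               # nothing to rewrite
    tail = head + "'"
    return {
        head: [body + (tail,) for body in others],
        tail: [alpha + (tail,) for alpha in recursive] + [()],
    }


# Example with a hypothetical terminal d standing for any digit: N → N d | d
rewritten = remove_left_recursion("N", [("N", "d"), ("d",)])
# rewritten == {"N": [("d", "N'")], "N'": [("d", "N'"), ()]}
```

Note that this rewrite introduces a Λ-production for the new variable. A top-down parser no longer loops on N, which is the practical goal stated above.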

THANK YOU
