context free grammar. introduction why do we want to learn about context free grammars? used in...

Post on 02-Jan-2016

223 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Context Free Grammar

Introduction

Why do we want to learn about Context Free

Grammars?

Used in many parsers in compilers

Yet another compiler-compiler, yacc

GNU project parser generator, bison

Web application

In describing the formats of XML (eXtensible Markup

Language) documents

Context Free Grammar

G = (V, T, P, S)

V – a set of variables, e.g. {S, A, B, C, D, E}

T – a set of terminals, e.g. {a, b, c}

P – a set of productions rules

In the form of A , where AV, (VT)* e.g. S aB

S is a special variable called the start symbol

CFG – Example 1

Construct a CFG for the following language :

L1={0n1n | n>0}

Thought: If is a string in L1, so is 01, i.e.

0k+11k+1 = 0(0k1k)1

S 01 | 0S1

CFG – Example 2

Construct a CFG for the following language :

L2={0i1j | ij and i, j>0}

Thought: Similar to L1, but at least one more ‘1’ or

at least one more ‘0’

S AC | CB

A 0A | 0

B B1 | 1

C 0C1 |

CFG – Example 3

Determine the language generated by the CFG:

S AS |

A A1 | 0A1 | 01

S generates consecutive ‘A’s

A generates 0i1j where i j

Languages: each block of ‘0’s is followed by at least

as many ‘1’s

Derivation

How a string is generated by a grammar G

Consider the grammar G

S AS |

A A1 | 0A1 | 01

S 0110011 ?

Derivation

A S

S

A SA 1

0 1 0 1A

0 1

S AS

  A1S

  011S

  011AS

  0110A1S

  0110011S

  0110011

S AS

  AAS

  AA

  A0A1

  A0011

  A10011

  0110011

Parse Tree orDerivation Tree Leftmost

DerivationRightmost Derivation

Ambiguity

Each parse tree has one unique leftmost

derivation and one unique rightmost derivation.

A grammar is ambiguous if some strings in it

have more than one parse trees, i.e., it has

more than one leftmost derivations (or more

than one rightmost derivations).

Ambiguity

Consider the following grammar G: S AS | a | b A SS | ab A string generated by this grammar can

have more than one parse trees. Consider the string abb:

ba

A S

S S b

S

A S

a b b

S

Ambiguity

As another example, consider the following grammar:

E E + E | E * E | (E) | x | y | z

There are 2 leftmost derivations for x + y + z: E E + E

x + E x + E + E x + y + E x + y + z

E E + E E + E + E x + E + E x + y + E x + y + z

E + E

x y + z

E

E + E

x y

E

z+

Simplification of CFG

Remove useless variables Generating Variables

Reachable Variables

Remove -productions, e.g. A

Remove unit-productions, e.g. AB

A variable X is useless if:

• X does not generate any string of terminals, or

• The start symbol, S, cannot generate X.

S X

where , (VT)*, X V and T*

Useless Variables

Removal of Non-generating Variables

1 Mark each production of the form:

X where T*

2 Repeat

Mark X where consists of terminals or variables

which are on the left hand side of some marked

productions.

Until no new productions is marked.

3 Remove all unmarked productions.

Example

CFG:

S AB|CAA a

B BC|AB C aB|b

S CA A a

C b

Example

CFG:

S aAa|aBA aS|bDB aBa|bC abb|DDD aDa

S aAa|aB A aS

B aBa|bC abb

Removal of Non-Reachable Variables

1 Mark each production of the form:

S where (VT)*

2 Repeat

Mark X where X appears on the right

hand side of some marked productions.

Until no new productions is marked.

3 Remove all unmarked productions.

Example

CFG:S aAa|aBA aSB aBa|bC abb

S aAa|aB A aS

B aBa|b

Example

CFG:S Aab|ABA aB bD | bBC A | BD a

S Aab|AB A a

B bD | bBD a

Remove Useless Variables

A Bac|bTC|baT aB|TBB aTB|bBCC TBc|aBC|ac

A ba C ac

A ba

Removal of -productions

Step 1: Find nullable variables

Step 2: Remove all -productions by replacing

some productions

How to find nullable variables ?

1 Mark all variables A for which there exists

a production of the form A .

2 Repeat

Mark X for which there exists X where V*

and all symbols in have been marked.

Until no new variables is marked.

Removal of -Productions

We can remove all -productions (except S

if S is nullable) by rewriting some other

productions.

e.g., If X1 and X3 are nullable, we should

replace

A X1X2X3X4 by:

A X1X2X3X4 | X2X3X4 | X1X2X4 | X2X4

Removal of -productions

Example:S A | B | CA aAa | BB bB | bbC aCaa | DD baD | abD |

Step1: nullable variables are D, C and S

Removal of -productions(3)

Step 2:eliminate D by replacing:

D baD | abD into D baD | abD | ba | ab

eliminate C by replacing: C aCaa | D into C aCaa | D | aaa S A | B | C into S A | B | C |

The new grammar: S A | B | C | A aAa | B B bB | bb C aCaa | D | aaa D baD | abD | ba | ab

Example:S A | B | CA aAa | BB bB | bbC aCaa | DD baD | abD |

Example

S ABCDA a B C ED | D BCE b

S ABCD|ABC|ABD|ACD|AB|AC|AD|AA aC ED|ED BC|B|CE b

Removal of Unit Productions

Unit production: A B, where A and B V

1. Detect cycles of unit productions:

A1 A2 A3 … A1

Replaced A1, A2, A3 , …, Ak by any one of them

2. Detect paths of unit productions:

A1 A2 A3 …, Ak , where (VT)*/V

Add productions A1 , A2 , …, Ak and

remove all the unit productions

Removal of Unit Productions

Example:S A | B | C A aa | BB bb | CC cc | A

Cycle: A B C ARemove by replace B,C by

A

S A | A | AA aa | AA bb | AA cc | A

Becomes:S A A aa | bb | cc

Path: S A Remove by adding

productions

S aa | bb | cc

Example

CFG:S → A | Bb A → C | a B → aBa | b C → aSa

S → Bb | aSa |a → A → a | aSa

B → aBa | b C → aSa

Example

bbB

AB

BA

aA

aAS

Substitute

BA

bbB

BAB

aA

aBaAS

|

|

Remove

bbB

BAB

aA

aBaAS

|

|

bbB

AB

aA

aBaAS

|

BB

XX Unit productions of form can be removed immediately

Example

Example

Substitute

ABbbB

aA

aAaBaAS

||

bbB

AB

aA

aBaAS

|

Example

Remove repeated productions

bbB

aA

aBaAS

|

bbB

aA

aAaBaAS

||

Final grammar

Thank You

top related