Subject Name: Finite Automata and Formal Languages

Subject Code: 10CS56

Prepared By: Swetha G, Pramela

Department: ISE & CSE

Date: 07/10/2014



Chapter 4:

Context Free Grammars and Languages


Topics Covered:

•CFG Definition & Derivations

•Parse Tree

•Ambiguity in Grammars and Languages

•Problems


Context Free Grammars - CFG

•Context-free grammars (CFG) define the context-free languages.

•They were developed in the mid-1950s by Noam Chomsky.

•CFGs play an important role in the description and design of programming languages and compilers, for example in the implementation of parsers. They are also used to describe document formats such as XML DTDs.

•A derivation in a grammar can be represented as a parse tree, a pictorial representation of how the grammar generates a string.

•The context-free languages (CFLs) are exactly the languages accepted by pushdown automata, their machine representation.


Context-Free Grammars

Definition:

A context-free grammar is a 4-tuple G = (V, T, P, S) where:

•V is a finite set of variables (non-terminals); each variable represents a language, that is, a set of strings;

•T is a finite set of terminals, the symbols that make up the strings being defined;

•P is a finite set of rules or productions, which recursively define the language. Each production consists of a head (the variable being defined by the production), the production symbol “→”, and a body (a string of 0 or more terminals and variables);

•S is the start variable and represents the language we are defining. The other variables define the auxiliary classes needed to define our language.
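The 4-tuple can be written down directly as a data structure. The following is a minimal Python sketch, not part of the original slides: the names CFG and is_well_formed, the toy grammar S → aSb | є, and the extra check that V and T are disjoint are all assumptions made for illustration.

```python
from typing import NamedTuple

class CFG(NamedTuple):
    variables: frozenset    # V
    terminals: frozenset    # T
    productions: tuple      # P: pairs (head, body), body is a tuple of symbols
    start: str              # S

def is_well_formed(g):
    """Check the side conditions of the definition G = (V, T, P, S)."""
    if g.variables & g.terminals:        # assumption: V and T are disjoint
        return False
    if g.start not in g.variables:       # S must be one of the variables
        return False
    symbols = g.variables | g.terminals
    for head, body in g.productions:
        if head not in g.variables:      # the head is a single variable
            return False
        if any(sym not in symbols for sym in body):
            return False                 # the body is a string over V and T
    return True

# Hypothetical toy grammar S → aSb | є; the empty tuple stands for є.
toy = CFG(
    variables=frozenset({"S"}),
    terminals=frozenset({"a", "b"}),
    productions=(("S", ("a", "S", "b")), ("S", ())),
    start="S",
)
print(is_well_formed(toy))
```

Representing a body as a tuple of symbols keeps multi-character terminals such as "if" or "then" possible.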


Example: Palindromes over the alphabet Σ = { 0 , 1 }

The productions follow an inductive definition of palindromes.

Basis: є, 0 and 1 are palindromes:

P → є

P → 0

P → 1

Induction: if w is a palindrome, then so are 0w0 and 1w1:

P → 0P0

P → 1P1

The grammar is G pal = ( { P } , { 0,1 } , A , P )

Here V = { P } is the set of non-terminals, T = { 0,1 } is the set of terminals, P is the start variable and A is the set of productions defined above.
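The five palindrome productions translate almost line by line into a recursive membership test. This is only a sketch written for this grammar; the function name in_pal_language is an assumption, not something from the slides.

```python
def in_pal_language(w):
    """True if w can be derived from P using P → є | 0 | 1 | 0P0 | 1P1."""
    if any(c not in "01" for c in w):
        return False                 # only strings over {0, 1} qualify
    if len(w) <= 1:
        return True                  # basis: P → є, P → 0, P → 1
    # induction: P → 0P0 or P → 1P1, so the outer symbols must agree
    return w[0] == w[-1] and in_pal_language(w[1:-1])

print(in_pal_language("0110"), in_pal_language("0111"))
```

Each recursive call corresponds to one use of an inductive production, and the base case to one use of a basis production.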


Example 2 : Expressions

Here we define 2 classes of strings: those representing simple numerical expressions (denoted by E) and those representing Boolean expressions (denoted by B).

E → 0

E → 1

E → E + E

E → if B then E else E

B → True

B → False

B → E < E

B → E == E


Compact Notation

We can group all productions defining elements in a certain class to have a

more compact notation.

Example: Palindromes can be defined as

P → є | 0 | 1 | 0P0 | 1P1

Example: The expressions can be defined as

E → 0 | 1 | E + E | if B then E else E

B → True | False | E < E | E == E
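Going from the compact notation back to individual productions is a mechanical split on “→” and “|”. A small Python sketch, assuming that neither symbol occurs as a terminal of the grammar (the helper name expand is hypothetical):

```python
def expand(rule):
    """Split 'A → x | y | z' into the productions [('A', 'x'), ('A', 'y'), ('A', 'z')]."""
    head, bodies = rule.split("→")
    return [(head.strip(), body.strip()) for body in bodies.split("|")]

for head, body in expand("B → True | False | E < E | E == E"):
    print(head, "→", body)
```

Applied to the rule for E, the same function returns the four productions E → 0, E → 1, E → E + E and E → if B then E else E.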


Chomsky Hierarchy

Noam Chomsky, the founder of formal language theory, classified grammars into four types:

•Type 0 : Phrase structure (unrestricted) grammar

•Type 1 : Context Sensitive grammar

•Type 2 : Context Free grammar

•Type 3 : Regular grammar


Type 0 : Phrase structure grammar

A grammar G = ( V,T,P,S ) is a Type 0 / unrestricted / phrase structure grammar if all the productions are of the form α → β, where α belongs to ( V U T )+ and β belongs to ( V U T )*.

There are no restrictions on the lengths of α and β.

α cannot be “ є ” but β can be.


Type 1 : Context Sensitive grammar

A grammar G = ( V,T,P,S ) is a Type 1 / context sensitive grammar if it is Type 0 and all the productions are of the form α → β, where α and β belong to ( V U T )+.

The length of β is restricted: it must be at least the length of α, that is, |β| ≥ |α|.

Neither α nor β can be “ є ”, so it is an “є-free” grammar.


Type 2 : Context Free grammar

A grammar G = ( V,T,P,S ) is a Type 2 / context free grammar if all the productions are of the form A → α, where α belongs to ( V U T )* and A is a non-terminal.

“ є ” can appear on the right hand side of a production.


Type 3 : Regular grammar

A grammar G = ( V,T,P,S ) is a Type 3 / regular grammar if it is right linear or left linear.

A grammar is said to be right linear if all the productions are of the form

A → wB

or

A → w

A grammar is said to be left linear if all the productions are of the form

A → Bw

or

A → w

where A and B are non-terminals and w is a (possibly empty) string of terminals.
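The four definitions can be turned into a simple classifier that inspects the shape of each production. The Python sketch below is an illustration only: the function name chomsky_type and the encoding of productions as pairs of symbol tuples are assumptions, and the function reports the largest type number whose conditions, as stated on these slides, hold for every production.

```python
def chomsky_type(productions, variables):
    """Return 3, 2, 1 or 0: the most restrictive type the grammar satisfies.

    productions: list of (alpha, beta) pairs, each a tuple of symbols.
    variables:   set of non-terminals; every other symbol is a terminal.
    """
    def is_var(sym):
        return sym in variables

    # Types 2 and 3 need a single variable on the left-hand side.
    single_head = all(len(a) == 1 and is_var(a[0]) for a, _ in productions)
    # Right linear (A → wB or A → w): only the last body symbol may be a variable.
    right_linear = single_head and all(
        not any(map(is_var, b[:-1])) for _, b in productions)
    # Left linear (A → Bw or A → w): only the first body symbol may be a variable.
    left_linear = single_head and all(
        not any(map(is_var, b[1:])) for _, b in productions)
    if right_linear or left_linear:
        return 3
    if single_head:
        return 2
    # Type 1: alpha and beta non-empty, and beta at least as long as alpha.
    if all(len(a) >= 1 and len(b) >= len(a) for a, b in productions):
        return 1
    if all(len(a) >= 1 for a, _ in productions):
        return 0
    raise ValueError("the left-hand side of a production cannot be empty")

# The palindrome grammar P → є | 0 | 1 | 0P0 | 1P1; () stands for є.
pal = [(("P",), ()), (("P",), ("0",)), (("P",), ("1",)),
       (("P",), ("0", "P", "0")), (("P",), ("1", "P", "1"))]
print(chomsky_type(pal, {"P"}))
```

The check looks only at the given grammar. A language may still have a grammar of a more restrictive type than the one being classified.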


Defining the language of a grammar:

When CFG productions are applied, it is possible to derive certain strings, and we say these strings are in the language of a certain variable.

The procedure used to show that a string is in the language of a variable is known as inference.

There are two types of inference:

1 . Recursive inference

2 . Derivation.


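Recursive inference

To infer that a string is in the language defined by a CFG, one can use the productions from body to head: strings already known to be in the languages of the variables in a body are concatenated, together with the terminals of the body, to give a string in the language of the head.

Ex : Grammar G = ( { E, I } , T, P, E ) where T = { + , * , ( , ) , a , b , 0 , 1 } and P is the set of productions

E → I | E + E | E * E | ( E )

I → a | b | Ia | Ib | I0 | I1.

Example string to infer: a * ( a + b00 )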


        String inferred     Language of   Production used   String(s) used
i       a                   I             I → a             -
ii      b                   I             I → b             -
iii     b0                  I             I → I0            ii
iv      b00                 I             I → I0            iii
v       a                   E             E → I             i
vi      b00                 E             E → I             iv
vii     a + b00             E             E → E + E         v, vi
viii    ( a + b00 )         E             E → ( E )         vii
ix      a * ( a + b00 )     E             E → E * E         v, viii
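Each row of a recursive inference table can be checked mechanically: take the body of the production, substitute the strings of the rows it relies on for the variables, and compare with the string claimed. The Python sketch below transcribes the table with spaces removed; the names PRODUCTIONS, TABLE and check_table are assumptions, and the last row is read as using E → E * E on rows v and viii, which is the only reading under which it checks out.

```python
PRODUCTIONS = {
    "E": [["I"], ["E", "+", "E"], ["E", "*", "E"], ["(", "E", ")"]],
    "I": [["a"], ["b"], ["I", "a"], ["I", "b"], ["I", "0"], ["I", "1"]],
}

# (row label, string inferred, variable, production body, rows used)
TABLE = [
    ("i", "a", "I", ["a"], []),
    ("ii", "b", "I", ["b"], []),
    ("iii", "b0", "I", ["I", "0"], ["ii"]),
    ("iv", "b00", "I", ["I", "0"], ["iii"]),
    ("v", "a", "E", ["I"], ["i"]),
    ("vi", "b00", "E", ["I"], ["iv"]),
    ("vii", "a+b00", "E", ["E", "+", "E"], ["v", "vi"]),
    ("viii", "(a+b00)", "E", ["(", "E", ")"], ["vii"]),
    ("ix", "a*(a+b00)", "E", ["E", "*", "E"], ["v", "viii"]),
]

def check_table(table):
    """True if every row follows from its production and the rows it cites."""
    inferred = {}                           # row label -> (string, variable)
    for label, string, var, body, used in table:
        if body not in PRODUCTIONS[var]:
            return False                    # not a production of the grammar
        refs = iter(used)
        parts = []
        for sym in body:
            if sym in PRODUCTIONS:          # variable: plug in an earlier row
                ref = next(refs, None)
                if ref not in inferred or inferred[ref][1] != sym:
                    return False
                parts.append(inferred[ref][0])
            else:                           # terminal: stands for itself
                parts.append(sym)
        if next(refs, None) is not None or "".join(parts) != string:
            return False
        inferred[label] = (string, var)
    return True

print(check_table(TABLE))
```

Citing row vii instead of row viii in the last row would produce a*a+b00, so the check would reject that reading.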


Language of a Grammar

Definition: Let G = (V,T,P,S) be a CFG. The language of G, denoted L(G), is the set of terminal strings that can be derived from the start variable, that is,

L(G) = { w ∈ T* | S ⇒* w }

Theorem : L(G pal), where G pal is the palindrome grammar, is the set of palindromes over { 0 , 1 }.


Examples :

1) Construct a context free grammar to accept the language L = {w | w starts and ends with the same symbol} where Σ = {0, 1}

Solution :

S → 0A0 | 1A1 | 0 | 1

A → 0A | 1A | є

The alternatives S → 0 | 1 cover the one-symbol strings, which also start and end with the same symbol.

2) Obtain a grammar to generate strings consisting of at least one a

Solution:

S → a

S → aS

Or

S → a | aS

The language generated can be formally written as L = { a^n : n >= 1 }


Examples :

3) Obtain a grammar to generate strings of a’s and b’s such that the length is a multiple of 3.

Solution:

S → є | AAAS

A → a | b

4) Obtain a grammar to generate strings of a’s and b’s such that the string ends with “ab”

Solution:

S → Aab

A → є | aA | bA
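A proposed grammar can be sanity-checked by computing every string of bounded length in its language and comparing the result with the specification. The Python sketch below does this for examples 2 to 4 by repeating recursive inference until nothing new appears. The names language_up_to, all_strings, EX2, EX3 and EX4 are assumptions, and agreement up to a length bound is evidence, not a proof.

```python
from itertools import product

def language_up_to(productions, start, n):
    """All terminal strings of length <= n in the language of `start`.

    productions maps each variable to a list of bodies (tuples of symbols);
    the empty tuple stands for є.
    """
    lang = {v: set() for v in productions}
    changed = True
    while changed:                          # iterate to a fixed point
        changed = False
        for head, bodies in productions.items():
            for body in bodies:
                partial = {""}              # strings built for the body so far
                for sym in body:
                    choices = lang[sym] if sym in productions else {sym}
                    partial = {p + c for p in partial for c in choices
                               if len(p) + len(c) <= n}
                new = partial - lang[head]
                if new:
                    lang[head] |= new
                    changed = True
    return lang[start]

def all_strings(alphabet, n):
    """All strings over `alphabet` of length <= n."""
    return {"".join(t) for k in range(n + 1) for t in product(alphabet, repeat=k)}

EX2 = {"S": [("a",), ("a", "S")]}
EX3 = {"S": [(), ("A", "A", "A", "S")], "A": [("a",), ("b",)]}
EX4 = {"S": [("A", "a", "b")], "A": [(), ("a", "A"), ("b", "A")]}

print(sorted(language_up_to(EX2, "S", 3)))
print(language_up_to(EX3, "S", 6) == {w for w in all_strings("ab", 6) if len(w) % 3 == 0})
print(language_up_to(EX4, "S", 6) == {w for w in all_strings("ab", 6) if w.endswith("ab")})
```

The loop always terminates because each variable has only finitely many strings of length at most n and the sets only grow.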


Derivations

Applying productions from head to body gives derivations. This requires the definition of a new relation symbol “ ⇒ ”.

At every derivation step we can choose to replace any variable by the right-hand side of one of its productions.

Example:

To obtain any arithmetic expression

E → E + E

E → E - E

E → E * E

E → E / E

E → ( E )

E → id

Obtain id + id * id from E.
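One possible derivation of id + id * id can be replayed step by step in code. In this Python sketch a sentential form is a list of symbols and each step names the position of the E being replaced; the names BODIES, step and derive_example are assumptions, and the sequence of choices shown is just one of several that reach the same string.

```python
BODIES = [["E", "+", "E"], ["E", "-", "E"], ["E", "*", "E"],
          ["E", "/", "E"], ["(", "E", ")"], ["id"]]

def step(form, index, body):
    """One derivation step: replace the E at position `index` by `body`."""
    if form[index] != "E" or body not in BODIES:
        raise ValueError("not a valid derivation step")
    return form[:index] + body + form[index + 1:]

def derive_example():
    """Replay E ⇒ E + E ⇒ id + E ⇒ id + E * E ⇒ id + id * E ⇒ id + id * id."""
    forms = [["E"]]
    choices = [(0, ["E", "+", "E"]), (0, ["id"]), (2, ["E", "*", "E"]),
               (2, ["id"]), (4, ["id"])]
    for index, body in choices:
        forms.append(step(forms[-1], index, body))
    return [" ".join(form) for form in forms]

print(" ⇒ ".join(derive_example()))
```

Starting with E → E * E instead also reaches id + id * id, which is a first hint that this grammar is ambiguous.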


Derivation of a * ( a + b00 ), using the grammar of the recursive inference example (E → I | E + E | E * E | ( E ) and I → a | b | Ia | Ib | I0 | I1):

E ⇒ E * E ⇒ I * E ⇒ a * E ⇒ a * ( E ) ⇒ a * ( E + E ) ⇒ a * ( I + E ) ⇒ a * ( a + E ) ⇒ a * ( a + I ) ⇒ a * ( a + I0 ) ⇒ a * ( a + I00 ) ⇒ a * ( a + b00 )

Note 1 At each step we might have several rules to choose from, e.g.

I * E ⇒ a * E ⇒ a * ( E ) , versus I * E ⇒ I * ( E ) ⇒ a * ( E ).

Note 2 Not all choices lead to successful derivations of a particular string, for instance E ⇒ E + E (at the first step) won’t lead to a derivation of a * ( a + b00 ).

Important:

Recursive inference and derivation are equivalent. A string of terminals w is inferred to be in the language of some variable A iff A ⇒* w.


Derivations are of 2 types :

Leftmost derivation

Rightmost derivation.

Leftmost derivation, written ⇒lm and ⇒*lm : at each step we choose to replace the leftmost variable.

Rightmost derivation, written ⇒rm and ⇒*rm : at each step we choose to replace the rightmost variable.

Example : Consider the following grammar and the string “(a,a)”.

S → L

L → (L) | (L,S) | a

Leftmost derivation:

S ⇒ L ⇒ (L,S) ⇒ (a,S) ⇒ (a,L) ⇒ (a,a)

Rightmost derivation:

S ⇒ L ⇒ (L,S) ⇒ (L,L) ⇒ (L,a) ⇒ (a,a)
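The only difference between the two kinds of derivation is which variable gets replaced at each step. The Python sketch below makes that choice explicit for the grammar S → L, L → (L) | (L,S) | a and derives (a,a) both ways in five steps each. The function name derive and the encoding of sentential forms as plain strings are assumptions for illustration.

```python
VARIABLES = {"S", "L"}
PRODUCTIONS = {("S", "L"), ("L", "(L)"), ("L", "(L,S)"), ("L", "a")}

def derive(steps, leftmost=True):
    """Apply the productions in `steps` to the leftmost (or rightmost) variable."""
    form = "S"
    forms = [form]
    for head, body in steps:
        positions = [i for i, c in enumerate(form) if c in VARIABLES]
        i = positions[0] if leftmost else positions[-1]
        if form[i] != head or (head, body) not in PRODUCTIONS:
            raise ValueError("production does not apply at the chosen variable")
        form = form[:i] + body + form[i + 1:]
        forms.append(form)
    return forms

lm = derive([("S", "L"), ("L", "(L,S)"), ("L", "a"), ("S", "L"), ("L", "a")])
rm = derive([("S", "L"), ("L", "(L,S)"), ("S", "L"), ("L", "a"), ("L", "a")],
            leftmost=False)
print(" ⇒ ".join(lm))
print(" ⇒ ".join(rm))
```

Both runs use the same five productions, only in a different order.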


Sentential Forms

Derivations from the start variable have a special role.

Definition: Let G = (V,T,P,S) be a CFG and let α ∈ (V ∪ T)*. If S ⇒* α, then α is called a sentential form.

We say α is a left sentential form if S ⇒*lm α, or a right sentential form if S ⇒*rm α.

Example:

010P010 is a sentential form of the palindrome grammar, since P ⇒* 010P010.


Ambiguity: a word or expression is ambiguous if it can be understood in two or more possible ways.

The following grammar produces ambiguity in programming languages with conditionals:

C → if b then C else C

C → if b then C

C → s

The expression “if b then if b then s else s” can be interpreted in the following 2 different ways:

1. if b then (if b then s else s)

2. if b then (if b then s) else s

How should the parser of this language understand the expression?


Example: Consider the following grammar

E → 0 | 1 | E + E | E * E

The sentential form E + E * E has the following 2 possible derivations:

1. E ⇒ E + E ⇒ E + E * E

2. E ⇒ E * E ⇒ E + E * E

Hence there are 2 possible meanings for words like 1 + 1 * 0:

1. 1 + (1 * 0) = 1

2. (1 + 1) * 0 = 0

Ambiguous Grammars

Definition:

A CFG G = (V,T,P,S) is ambiguous if there is at least one string w ∈ T* for which we can find two (or more) parse trees, each with root S and yield w.

If each string has at most one parse tree we say that the grammar is unambiguous.
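For the small grammar E → 0 | 1 | E + E | E * E, the number of parse trees of a string can be counted directly by trying every operator as the root of the tree. The Python sketch below does so for strings written without spaces; count_trees and values are hypothetical helper names, and the approach is specific to this grammar and not a general ambiguity test.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def count_trees(w):
    """Number of parse trees with root E and yield w."""
    total = 1 if w in ("0", "1") else 0        # E → 0, E → 1
    for i, c in enumerate(w):
        if c in "+*":                          # E → E + E or E → E * E at the root
            total += count_trees(w[:i]) * count_trees(w[i + 1:])
    return total

def values(w):
    """Set of numbers w can evaluate to, one per way of parsing it."""
    if w in ("0", "1"):
        return {int(w)}
    out = set()
    for i, c in enumerate(w):
        if c == "+":
            out |= {a + b for a in values(w[:i]) for b in values(w[i + 1:])}
        elif c == "*":
            out |= {a * b for a in values(w[:i]) for b in values(w[i + 1:])}
    return out

print(count_trees("1+1*0"), values("1+1*0"))
```

A count of 2 for 1+1*0 matches the two readings 1 + (1 * 0) = 1 and (1 + 1) * 0 = 0, so the grammar is ambiguous.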


Parse Trees

Definition: Let G = (V,T,P,S) be a CFG. The parse trees for G are trees satisfying the following conditions:

•Each interior node is labelled with a variable.

•Each leaf is labelled with a variable, a terminal or є.

•If a node is labelled A and it has children labelled X1, X2, . . . , Xn respectively from left to right, then there must exist in P a production of the form A → X1X2 . . . Xn.

Note: If a leaf is є it should be the only child of its parent A and there should be a production A → є .

Height of a Parse Tree

Definition: The height of a parse tree is the maximum length of a path from the root of the tree to one of its leaves.

Observe: We count the edges on the path, not the nodes (and the leaf) on it.
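Both definitions are easy to check on a concrete tree. In the Python sketch below a tree is a pair (label, list of children), with an empty list for a leaf. This encoding, the names is_parse_tree and height, and the example tree for 0110 in the palindrome grammar are all assumptions for illustration.

```python
PAL = {"P": [[], ["0"], ["1"], ["0", "P", "0"], ["1", "P", "1"]]}   # [] stands for є

def is_parse_tree(tree, productions):
    """Check that every interior node and its children form a production."""
    label, children = tree
    if not children:
        return True                       # a leaf: variable, terminal or є
    body = [child[0] for child in children]
    if body == ["є"]:
        body = []                         # a lone є child means A → є
    return (label in productions and body in productions[label]
            and all(is_parse_tree(child, productions) for child in children))

def height(tree):
    """Number of edges on the longest path from the root to a leaf."""
    label, children = tree
    if not children:
        return 0
    return 1 + max(height(child) for child in children)

# Parse tree of 0110: P → 0P0, then P → 1P1, then P → є.
tree = ("P", [("0", []),
              ("P", [("1", []), ("P", [("є", [])]), ("1", [])]),
              ("0", [])])
print(is_parse_tree(tree, PAL), height(tree))
```

The longest path here runs from the root through two inner P nodes to the є leaf, which is 3 edges.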


Parse Trees

•Parse trees are a way to represent derivations.

•A parse tree reflects the internal structure of a word of the language.

•Using parse trees it becomes very clear which variable was replaced at each step.

•In addition, it becomes clear which terminal symbols were generated/derived from a particular variable.

•Parse trees are very important in compiler theory. A parser turns the source code into its parse tree, which represents the structure of the program.


Example of a Parse Tree :

Derivation of String a * a + b00


Yield of a Parse Tree

Definition: The yield of a parse tree is the string obtained by concatenating the labels of all the leaves of the tree from left to right.

Note :

•The yield of a tree is a string derived from the root of the tree;

•Of particular importance are the yields that consist of only terminals; that is, the leaves are either terminals or є;

•When the root is S and the yield consists of only terminals, we have a parse tree for a string in the language of the grammar;

•Yields can be used to describe the language of a grammar.
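Computing a yield is a left-to-right walk over the leaves. A short Python sketch, assuming trees are encoded as pairs (label, list of children) with an empty list for a leaf; the encoding and the name tree_yield are illustrative choices, not part of the slides.

```python
def tree_yield(tree):
    """Concatenate the leaf labels from left to right; є contributes nothing."""
    label, children = tree
    if not children:
        return "" if label == "є" else label
    return "".join(tree_yield(child) for child in children)

# Palindrome grammar: complete tree for 0110, and a partial tree with a P leaf.
complete = ("P", [("0", []),
                  ("P", [("1", []), ("P", [("є", [])]), ("1", [])]),
                  ("0", [])])
partial = ("P", [("0", []), ("P", []), ("0", [])])
print(tree_yield(complete), tree_yield(partial))
```

The complete tree yields the terminal string 0110, while the partial tree yields the sentential form 0P0, which still contains a variable.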


From Inference to Trees:

Let G = ( V,T,P,S ) be a CFG. If the recursive inference procedure tells us that terminal string w is in the language of variable A, then there is a parse tree with root A and yield w.

Proof:

We do an induction on the number of steps of the inference.

Basis: n = 1. Only the basis of the inference procedure was used, so there must be a production A → w. The desired parse tree then has root A, with one leaf for each symbol of w as its children (or a single leaf є if w = є).


Induction : N = n+1

Suppose that w is inferred to be in the language of A after n+1 steps, and that the theorem holds for all strings x and variables B such that the membership of x in the language of B was inferred using n or fewer inference steps.

Consider the last step of the inference of w. This step uses some production of A, say A → X1X2X3…Xk, where each Xi belongs to ( V U T ).

We can break w up as w = w1w2…wk, where:

Case 1 : if Xi is a terminal, then wi = Xi.

Case 2 : if Xi is a non-terminal, then wi is a string that was previously inferred to be in the language of Xi; that is, this was inferred using at most n steps.

By the IH, for each such variable Xi there is a parse tree Ti with root Xi and yield wi. The following is then a parse tree for grammar G with root A and yield w: the root A has children X1, X2, …, Xk from left to right, where each terminal Xi is a leaf and each variable Xi is the root of its tree Ti.


(Only-if) Let w ∈ L(G), that is, P ⇒* w.

We prove by induction on the length of the derivation of w that w is a palindrome, that is, w = rev(w).

Base case: If the derivation is in one step then we should have P ⇒ є, P ⇒ 0 or P ⇒ 1. In all cases w is a palindrome.

Inductive step: Our IH is that if P ⇒^n w' with n > 0 then w' = rev(w').

Suppose P ⇒^(n+1) w. Then we have 2 cases:

P ⇒ 0P0 ⇒^n 0w'0 = w;

P ⇒ 1P1 ⇒^n 1w'1 = w.

Observe that in both cases P ⇒^n w' with n > 0. Hence by the IH w' is a palindrome, and then so is w.
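The structure of this induction can be mirrored in code: the strings derivable in exactly n steps are built from those derivable in n - 1 steps by wrapping them in a matching pair of symbols. The Python sketch below is a finite spot check of the claim for small n, not a replacement for the proof; the function name derived_in is an assumption.

```python
def derived_in(n):
    """Terminal strings w such that P derives w in exactly n steps (n >= 1)."""
    if n == 1:
        return {"", "0", "1"}               # base case: P ⇒ є, P ⇒ 0, P ⇒ 1
    shorter = derived_in(n - 1)             # the strings w' of the IH
    # inductive step: P ⇒ 0P0 or P ⇒ 1P1, followed by n - 1 more steps
    return {c + w + c for c in "01" for w in shorter}

for n in range(1, 7):
    assert all(w == w[::-1] for w in derived_in(n))
print(sorted(derived_in(2)))
```

Every string produced this way reads the same in both directions, in line with the only-if direction of the theorem.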