non regular languages: two approaches (1) myhill-nerode theorem (not in sipser’s...

58
Computational Models - Lecture 3 Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s book) (2) the Pumping Lemma Algorithmic questions for NFAs Context Free Grammars (time permitting) Sipser’s book, 1.4, 2.1, 2.2 Hopcroft and Ullman, 3.4 Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.1

Upload: vuhanh

Post on 12-May-2018

217 views

Category:

Documents


3 download

TRANSCRIPT

Page 1: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Computational Models - Lecture 3

Non Regular Languages: Two Approaches

(1) Myhill-Nerode Theorem (not in Sipser’s book)

(2) the Pumping Lemma

Algorithmic questions for NFAs

Context Free Grammars (time permitting)

Sipser’s book, 1.4, 2.1, 2.2

Hopcroft and Ullman, 3.4

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.1

Page 2: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Proved Last Time

Thm.: A language, L, is described by a regularexpression, R, if and only if L is regular.

=⇒ construct an NFA accepting R.

⇐= Given a regular language, L, construct anequivalent regular expression

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.2

Page 3: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Negative Results

We have made a lot of progress understanding what finiteautomata can do. But what can’t they do?Is there a DFA that accepts

B = {0n1n|n ≥ 0}C = {w|w has an equal number of 0’s and 1’s}D = {w|w has an equal number of occurrences

of 01 and 10 substrings}Consider B:

DFA must “remember” how many 0’s it has seen

impossible with finite state.

The others are exactly the same.

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.3

Page 4: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Negative Results

Is there a DFA that accepts

B = {0n1n|n ≥ 0}C = {w|w has an equal number of 0’s and 1’s}D = {w|w has an equal number of occurrences

of 01 and 10 substrings}Consider B:

DFA must “remember” how many 0’s it has seen

impossible with finite state.

The others are exactly the same...

Question: Is this a proof?

Answer: No, D is regular!??? (see problem set 2)

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.4

Page 5: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

The Equivalence Relation ∼L

Let L ⊆ Σ∗ be a language.

Define an equivalence relation ∼L on pairs of strings:

Let x, y ∈ Σ∗. We say that x∼L y if for every string z ∈ Σ∗,xz ∈ L if and only if yz ∈ L.

It is easy to see that ∼L is indeed an equivalence relation(reflexive, symmetric, transitive) on Σ∗.

In addition, if x∼L y then for every string z ∈ Σ∗, xz∼L yz aswell (this is called right invariance).

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.5

Page 6: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

The Equivalence Relation ∼L

Like every equivalence relation, ∼L partitions Σ∗ to (disjoint)equivalence classes. For every string x, let [x] ⊆ Σ∗ denoteits equivalence class w.r.t. ∼L (if x ∼L y then [x] = [y] –equality of sets).

Question is, how many equivalence classes does ∼L

induces?

In particular, is the number of equivalence classes of ∼L

finite or infinite?

Well, it could be either finite or infinite. This depends on thelanguage L.

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.6

Page 7: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Classes of ∼L: Three Examples

Let L1 ⊂ {0, 1}∗ contain all strings where the number of1s is divisible by 4. Then ∼L1

has finitely manyequivalence classes.

Let L2 ⊂ {0, 1}∗ contain all strings of the form 0n1n.Then ∼L2

has infinitely many equivalence classes.

Let L3 ⊂ {1}∗ contain all strings whose length is a primenumber. Then ∼L3

has infinitely many equivalenceclasses.

(white-board (Monday) / black-board (Wednesday) proofs.)

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.7

Page 8: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Myhill-Nerode Theorem

Theorem: Let L ⊆ Σ∗ be a language. Then

L is regular ⇐⇒ ∼L has finitely many equivalenceclasses.

Three specific consequences:

L1 ⊂ {0, 1}∗ contains all strings where the number of 1sis divisible by 4. Then L1 is regular.

L2 ⊂ {0, 1}∗ contains all strings of the form 0n1n. ThenL2 is not regular.

Let L3 ⊂ {1}∗ contains all strings whose length is aprime number. Then L3 is not regular.

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.8

Page 9: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Myhill-Nerode Theorem: Proof

=⇒Suppose L is regular. Let M = (Q,Σ, δ, q0, F ) be a DFAaccepting it. For every x ∈ Σ∗, let δ(q0, x) ∈ Q be the statewhere the computation of M on input x ends.

The relation ∼M on pairs of strings is defined as follows:x ∼M y if δ(q0, x) = δ(q0, y).Clearly, ∼M is an equivalence relation.

Furthermore, if x ∼M y, then for every z ∈ Σ∗, alsoxz ∼M yz. Therefore, xz ∈ L if and only if yz ∈ L.

This means that x ∼M y =⇒ x ∼L y.

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.9

Page 10: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Myhill-Nerode Theorem: Proof (cont.)

=⇒

The equivalence relation ∼M has finitely manyequivalence classes (at most the number of states inM ).

We saw that x ∼M y =⇒ x ∼L y, so the number ofequivalence classes of ∼L is less or equal than thenumber of equivalence classes of ∼M .

Therefore, ∼L has finitely many equivalence classes. ♠

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.10

Page 11: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Myhill-Nerode Theorem: Proof (cont.)

⇐=

Suppose ∼L has finitely many equivalence classes.We’ll construct a DFA M that accepts L.

Let x1, . . . , xn ∈ Σ∗ be representatives for the finitelymany equivalence classes of ∼L.

The states of M are the equivalence classes[x1], . . . , [xn].

The transition function δ is defined as follows: For alla ∈ Σ, δ([xi], a) = [xia] (the equivalence class of xia).

Hey, why is this a proper definition of δ?(Hint: right invariance).

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.11

Page 12: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Myhill-Nerode Theorem: Proof (conc.)

⇐=

The initial state is [ε].

The accept states are F = {[xi] |xi ∈ L}.

Easy: On input x ∈ Σ∗, M ends at state [x] (why?).

Therefore M accepts x iff x ∈ L (why?).

So L is accepted by DFA, hence L is regular. ♠

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.12

Page 13: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Myhill-Nerode Theorem

An example: Constructing DFA for the languageL1 ⊂ {0, 1}∗ contains all strings where the number of 1sis divisible by 5.

A few words on minimizing the number of states of aDFA accepting a given language L.

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.13

Page 14: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Pumping Lemma

We will show that all regular languages have a specialproperty.

Suppose L is regular.

If a string in L is longer than a certain critical length �(the pumping length),

then it can be “pumped” to a longer string by repeatingan internal substring any number of times.

The longer string must be in L too.

This is a (second) powerful technique for showing that alanguage is not regular.

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.14

Page 15: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Pumping Lemma

Theorem: If L is a regular language, then there is an � > 0(the pumping length), where if s is any string in L of length|s| > �, then s may be divided into three pieces s = xyz suchthat

for every i ≥ 0, xyiz ∈ L,

|y| > 0, and

|xy| ≤ �.

Remarks: Without the second condition, the theorem wouldbe trivial.The third condition is technical and sometimes useful.

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.15

Page 16: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Pumping Lemma – Proof

Let M = (Q,Σ, δ, q1, F ) be a DFA that accepts L.

Let � be |Q|, the number of states of M .

If s ∈ L has length at least �, consider the sequence ofstates M goes through as it reads s:

s1 s2 s3 s4 s5 s6 . . . sn

↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑q1 q20 q9 q17 q12 q13 q9 q2 q5 ∈ F

Since the sequence of states is of length |s| + 1 > �, and

there are only � different states in Q, at least one state is

repeated (by the pigeonhole principle).

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.16

Page 17: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Pumping Lemma – Proof (cont.)

Write down s = xyz

q1

q9

q5x

y

z

By inspection, M accepts xykz for every k ≥ 0.

|y| > 0 because the state (q9 in figure) is repeated.

To ensure that |xy| ≤ �, pick first state repetition, whichmust occure no later than � + 1 states in sequence.

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.17

Page 18: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

An Application

Theorem: The language B = {0n1n|n > 0} is not regular.

Proof: By contradiction. Suppose B is regular, accepted byDFA M . Let � be the pumping length.

Consider the string s = 0�1�.

By pumping lemma s = xyz, where xykz ∈ B for every k.

If y is all 0, then xykz has too many 0’s.

If y is all 1, then xykz has too many 1’s.

If y is mixed, then xykz is not of right form. ♣

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.18

Page 19: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Another Application

Theorem: The languageC = {w|w has an equal number of 0’s and 1’s} is notregular.Proof: By contradiction. Suppose C is regular, accepted byDFA M . Let � be the pumping length.

Consider the string s = 0�1�.

By pumping lemma s = xyz, where xykz ∈ C for every k.

If y is all 0, then xykz has too many 0’s.

If y is all 1, then xykz has too many 1’s.

If y is mixed, then since |xy| ≤ �, y must be all 0’s,contradiction. ♣

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.19

Page 20: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Yet Another Application

Theorem: The language L3 ⊂ {1}∗, which contains allstrings whose length is a prime number, is not regular.

Proof: By contradiction, using the pumping lemma and some

simple properties of prime numbers. ♣

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.20

Page 21: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Context Switch

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.21

Page 22: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Algorithmic Questions for NFAs

Q.: Given an NFA, N , and a string s, is s ∈ L(N)?

Answer: Construct the DFA equivalent to N and run it on w.

Q.: Is L(N) = ∅?Answer: This is a reachability question in graphs: Is there apath in the states’ graph of N from the start state to someaccepting state. There are simple, efficient algorithms forthis task.

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.22

Page 23: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

More Algorithmic Questions for NFAs

Q.: Is L(N) = Σ∗?

Answer: Check if L(N) = ∅.

Q.: Given N1 and N2, is L(N1) ⊆ L(N2)?

Answer: Check if L(N2) ∩ L(N1) = ∅.

Q.: Given N1 and N2, is L(N1) = L(N2)?

Answer: Check if L(N1) ⊆ L(N2) and L(N2) ⊆ L(N1).

In the future, we will see that for stronger models ofcomputations, many of these problems cannot be solved byany algorithm.

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.23

Page 24: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Another, More Radical Context Switch

So far we sawfinite automata,regular languages,regular expressions,Myhill-Nerode theorem and pumping lemma forregular languages.

We now introduce stronger machines andlanguages with more expressive power:

pushdown automata,context-free languages,context-free grammars,pumping lemma for context-free languages.

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.24

Page 25: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Context-Free Grammars

An example of a context free grammer, G1:

A → 0A1

A → B

B → #

Terminology:

Each line is a substitution rule or production.

Each rule has the form: symbol → string.The left-hand symbol is a variable(usually upper-case).

A string consists of variables and terminals.

One variable is the start variable.

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.25

Page 26: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Rules for Generating Strings

Write down the start variable (lhs of top rule).

Pick a variable written down in current string and aderivation that starts with that variable.

Replace that variable with right-hand side of thatderivation.

Repeat until no variables remain.

Return final string (concatenation of terminals).

Process is inherently non deterministic.

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.26

Page 27: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Example

Grammar G1:

A → 0A1

A → B

B → #

Derivation with G1:

A ⇒ 0A1

⇒ 00A11

⇒ 000A111

⇒ 000B111

⇒ 000#111

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.27

Page 28: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

A Parse Tree

A

A

A

B

#0 10 0 1 1

Question: What strings can be generated in this way fromthe grammar G1?

Answer: Exactly those of the form 0n#1n (n ≥ 0).

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.28

Page 29: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Context-Free Languages

The language generated in this way is called the languageof the grammar.

For example, L(G1) is {0n#1n|n ≥ 0}.

Any language generated by a context-free grammar iscalled a context-free language.

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.29

Page 30: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

A Useful Abbreviation

Rules with same variable on left hand side

A → 0A1

A → B

are written as:

A → 0A1 | B

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.30

Page 31: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

English-like Sentences

A grammar G2 to describe a few English sentences:

< SENTENCE > → < NOUN-PHRASE >< VERB >

< NOUN-PHRASE > → < ARTICLE >< NOUN >

< NOUN > → boy | girl | flower

< ARTICLE > → a | the

< VERB > → touches | likes | sees

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.31

Page 32: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Deriving English-like Sentences

A specific derivation in G2:

< SENTENCE > ⇒ < NOUN-PHRASE >< VERB >

⇒ < ARTICLE >< NOUN >< VERB >

⇒ a < NOUN >< VERB >

⇒ a boy < VERB >

⇒ a boy sees

More strings generated by G2:

a flower sees

the girl touchesSlides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.32

Page 33: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Derivation and Parse Tree

< SENTENCE > ⇒ < NOUN-PHRASE >< VERB >

⇒ < ARTICLE >< NOUN >< VERB >

⇒ a < NOUN >< VERB >

⇒ a boy < VERB >

⇒ a boy sees

SENTENCE

NOUN-PHRASE VERB

ARTICLE NOUN

a boy seesSlides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.33

Page 34: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Formal Definitions

A context-free grammar is a 4-tuple (V,Σ, R, S) where

V is a finite set of variables,

Σ is a finite set of terminals,

R is a finite set of rules: each rule is a variable and afinite string of variables and terminals.

S is the start symbol.

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.34

Page 35: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Formal Definitions

If u and v are strings of variables and terminals,

and A → w is a rule of the grammar, then

we say uAv yields uwv, written uAv ⇒ uwv.

We write u∗⇒ v if u = v or

u ⇒ u1 ⇒ . . . ⇒ uk ⇒ v.

for some sequence u1, u2, . . . , uk.

Definition: The language of the grammar is{w ∈ Σ∗ | S

∗⇒ w}

.

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.35

Page 36: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Example

Consider G4 = (V, {a, b} , R, S).

R (Rules): S → aSb | SS | ε .

Some words in the language: aabb, aababb.

Q.: But what is this language?

Hint: Think of parentheses.

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.36

Page 37: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Arythmetic Example

Consider (V,Σ, R,E) where

V = {E, T, F}Σ = {a,+,×, (, )}

Rules:E → E + T | T

T → T × F | F

F → (E) | a

Strings generated by the grammer:a + a × a and (a + a) × a.What is the language of this grammer?Hint: arithmetic expressions.

E = expression, T = term, F = factor.

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.37

Page 38: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Parse Tree for a + a × a

E → E + T | T

T → T × F | F

F → (E) | a

aXa+a

FFF

T T

TE

E

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.38

Page 39: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Parse Tree for (a + a) × a

E → E + T | T

T → T × F | F

F → (E) | a

( a + aX)a

F F F

T T

E

E

F

T

T

E

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.39

Page 40: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Designing Context-Free Grammars

No recipe in general, but few rules-of-thumb

If CFG is the union of several CFGs, rename variables(not terminals) so they are disjoint, and add new ruleS → S1 | S2 | . . . | Si.

To construct CFG for a regular language, “follow” a DFAfor the language. For initial state q0, make R0 the startvariable. For state transition δ(qi, a) = qj add ruleRi → aRj to grammer. For each final state qf , add ruleRf → ε to grammer.

For languages with linked substrings (like{0n#1n|n ≥ 0} ), a rule of form R → uRv may be helpful,forcing desired relation between substrings.

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.40

Page 41: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Closure Properties

Regular languages are closed underunionconcatenationstar

Context-Free Languages are closed underunion : S → S1 | S2

concatenation S → S1S2

star S → ε | SS

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.41

Page 42: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

More Closure Properties

Regular languages are also closed undercomplement (reverse accept/non-accept states ofDFA)

intersection(L1 ∩ L2 = L1 ∪ L2

).

What about complement and intersection ofcontext-free languages?

Not clear . . .

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.42

Page 43: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Ambiguity

Grammar: E → E+E | E×E | (E) | a

aXa+a

EEE

E

E

aXa+a

EEE

E

E

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.43

Page 44: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Ambiguity

We say that a string w is derived ambiguously fromgrammer G if w has two or more parse trees that generate itfrom G.

Ambiguity is usually not only a syntactic notion but also asemantic one, implying multiple meanings for the samestring.

It is sometime possible to eliminate ambiguity by finding adifferent context free grammer generating the samelanguage. This is true for the grammer above, which can bereplaced by unambiguous grammer from slide 37.

Some languages (e.g. {1i2j3k | i = j or j = k} areinherentrly ambigous.

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.44

Page 45: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Chomsky Normal Form

A simplified, canonical form of context free grammers.Every rule has the form

A → BC

A → a

S → ε

where S is the start symbol, A, B and C are any variable,except B and C not the start symbol, and A can be the startsymbol.

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.45

Page 46: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Theorem

Theorem: Any context-free language is generated by acontext-free grammar in Chomsky normal form.Basic idea:

Add new start symbol S0.

Eliminate all ε rules of the form A → ε.

Eliminate all “unit” rules of the form A → B.

Patch up rules so that grammar generates the samelanguage.

Convert remaining long rules to proper form.

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.46

Page 47: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Proof

Add new start symbol S0 and rule S0 → S.

Guarantees that new start symbol does not appear on right-

hand-side of a rule.

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.47

Page 48: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Proof

Eliminating ε rules.

Repeat:

remove some A → ε.

for each R → uAv, add rule R → uv.

and so on: for R → uAvAw add R → uvAw, R → uAvw,and R → uvw.

for R → A add R → ε, except if R → ε has already beenremoved.

until all ε-rules not involving the original start variable havebeen removed.

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.48

Page 49: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Proof

Eliminate unit rules.

Repeat:

remove some A → B.

for each B → u, add rule A → u, unless this ispreviously removed unit rule. (u is a string of variablesand terminals.)

until all unit rules have been removed.

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.49

Page 50: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Proof

Finally, convert long rules.To replace each A → u1u2 . . . uk (for k ≥ 3), introduce newnon-terminals

N1, N2, . . . , Nk−1

and rules

A → u1N1

N1 → u2N2

...Nk−3 → uk−2Nk−2

Nk−2 → uk−1uk ♠

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.50

Page 51: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Conversion Example

Initial Grammar:

S → ASA | aB

A → B | S

B → b | ε

(1) Add new start state:

S0 → S

S → ASA | aB

A → B | S

B → b | ε

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.51

Page 52: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Conversion Example (2)

S0 → S

S → ASA | aB

A → B | S

B → b | ε

(2) Remove ε-rule B → ε:

S0 → S

S → ASA | aB | a

A → B | S | ε

B → b | ε

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.52

Page 53: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Conversion Example (3)

S0 → S

S → ASA | aB | a

A → B | S | ε

B → b

(3) Remove ε-rule A → ε:

S0 → S

S → ASA | aB | a | AS | SA | S

A → B | S | ε

B → b

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.53

Page 54: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Conversion Example (4)

S0 → S

S → ASA | aB | a | AS | SA | S

A → B | S

B → b

(4) Remove unit rule S → S

S0 → S

S → ASA | aB | a | AS | SA | S

A → B | S

B → b

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.54

Page 55: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Conversion Example (5)

S0 → S

S → ASA | aB | a | AS | SA

A → B | S

B → b

(5) Remove unit rule S0 → S:

S0 → S | ASA | aB | a | AS | SA

S → ASA | aB | a | AS | SA

A → B | S

B → b

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.55

Page 56: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Conversion Example (6)

S0 → ASA | aB | a | AS | SA

S → ASA | aB | a | AS | SA

A → B | S

B → b

(6) Remove unit rule A → B:

S0 → ASA | aB | a | AS | SA

S → ASA | aB | a | AS | SA

A → B | S | b

B → b

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.56

Page 57: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Conversion Example (7)

S0 → ASA | aB | a | AS | SA

S → ASA | aB | a | AS | SA

A → S | b

B → b

Remove unit rule A → S:

S0 → ASA | aB | a | AS | SA

S → ASA | aB | a | AS | SA

A → S | b | ASA | aB | a | AS | SA

B → b

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.57

Page 58: Non Regular Languages: Two Approaches (1) Myhill-Nerode Theorem (not in Sipser’s ...bchor/CM07/Compute3.pdf ·  · 2007-03-10Algorithmic questions for NFAs Context Free Grammars

Conversion Example (8)

S0 → ASA | aB | a | AS | SA

S → ASA | aB | a | AS | SA

A → b | ASA | aB | a | AS | SA

B → b

(8) Final simplification – treat long rules:

S0 → AA1 | UB | a | SA | AS

S → AA1 | UB | a | SA | AS

A → b | AA1 | UB | a | SA | AS

A1 → SA

U → a

B → b√

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.58