tutorial 03 -- csc3130 : formal languages and automata theory haifeng wan ( hfwan@cse.cuhk.edu.hk )...

Post on 12-Jan-2016


Tutorial 03 -- CSC3130: Formal Languages and Automata Theory

Haifeng Wan (hfwan@cse.cuhk.edu.hk)

2009-09-27

Outline

• Pumping Lemma
• DFA Minimization
• Context-free Languages

Pigeonhole Principle

Pigeonhole principle: If m objects are put into n containers, where m > n, then at least one container must hold more than one object.

The pigeonhole principle can be used to prove that certain infinite languages are not regular. Recall: any finite language is regular.

Pumping Lemma for Regular Languages

Theorem: For every regular language L there exists a number n such that every string z in L with |z| ≥ n can be written as z = uvw where
• |uv| ≤ n
• |v| ≥ 1
• for every i ≥ 0, the string u v^i w is in L.

[Figure: the string z divided into three consecutive parts u, v, w]

Pumping Lemma

What does the Pumping Lemma say?

• If an infinite language is regular, it can be defined by a DFA.
• The DFA has a finite number m of states.
• Since the language is infinite, some strings of the language must have length greater than m.
• For a string of length greater than m accepted by the DFA, the walk through the DFA must contain a cycle (by the pigeonhole principle, some state is visited twice).
• Repeating the cycle an arbitrary number of times must yield another string accepted by the DFA.

Note: the Pumping Lemma is a necessary condition for regularity, not a sufficient one. It can be used to prove that a given infinite language is not regular, but it cannot be used to prove that a given infinite language is regular.
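The cycle argument can be tried out on a small machine. The sketch below is only an illustration: the two-state DFA (even number of 1s), the state names and the helper `split_at_first_cycle` are assumptions made for this example and do not come from the tutorial. It finds the first repeated state on the walk, uses it to split z into u, v, w, and checks that pumped strings stay accepted.

```python
from typing import Dict, Set, Tuple

# Hypothetical DFA (not from the tutorial): accepts binary strings
# that contain an even number of 1s.
DELTA: Dict[Tuple[str, str], str] = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd",   ("odd", "1"): "even",
}
START = "even"
ACCEPTING: Set[str] = {"even"}


def accepts(word: str) -> bool:
    """Run the DFA on `word` and report whether it ends in an accepting state."""
    state = START
    for symbol in word:
        state = DELTA[(state, symbol)]
    return state in ACCEPTING


def split_at_first_cycle(z: str) -> Tuple[str, str, str]:
    """Split z = u v w, where v labels the first cycle on the walk for z.

    If len(z) is at least the number of states, some state must repeat
    (pigeonhole principle), so a split is always found.
    """
    first_visit = {START: 0}        # state -> number of symbols read so far
    state = START
    for pos, symbol in enumerate(z, start=1):
        state = DELTA[(state, symbol)]
        if state in first_visit:
            start = first_visit[state]
            return z[:start], z[start:pos], z[pos:]
        first_visit[state] = pos
    raise ValueError("string too short to force a repeated state")


z = "1010"                          # accepted: it contains two 1s
u, v, w = split_at_first_cycle(z)
pumped_ok = all(accepts(u + v * i + w) for i in range(6))
```

Here the split is u = "1", v = "0", w = "10", so |uv| is at most the number of states and every pumped string still has two 1s.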

Outline to Prove by Pumping Lemma

Main idea: proof by contradiction. Brief outline:

1. Assume the language L is regular (and thus the Pumping Lemma holds).
2. Show that repeating the cycle some number of times (“pumping” the cycle) yields a string that is not in L.
3. Conclude by contradiction that L is not regular.

What do we get to choose when using the Pumping Lemma?
• The particular string z in L.
• The number of times i to “pump” the cycle.

Example 1

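Prove that L3 = {xx : x in {0,1}*} is not regular.

Suppose L3 is regular. Then there exists a number n as in the Pumping Lemma.

Choose the string z = 0^m 1 0^m 1 with m > n. It is in L3 (take x = 0^m 1) and |z| ≥ n.

Although the decomposition of z into uvw is unknown, uv must consist entirely of 0s from the first block because |uv| ≤ n < m. Moreover, |v| ≥ 1.

Simply choose i = 2. Then u v^2 w = 0^{m+|v|} 1 0^m 1 has more 0s before the first 1 than between the first 1 and the second 1, so it is not a doubled string and is not in L3.

This contradicts the Pumping Lemma, so L3 is not regular.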
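For small values of n the argument can be checked by brute force. The following is a rough sketch under stated assumptions: the helper names `in_L3` and `every_split_fails` are made up for this illustration, and m = n + 1 is just one convenient choice of m > n. It tries every split with |uv| ≤ n and |v| ≥ 1 and confirms that pumping with i = 2 always leaves L3.

```python
def in_L3(s: str) -> bool:
    """True if s is a doubled string, i.e. its two halves are identical."""
    half, remainder = divmod(len(s), 2)
    return remainder == 0 and s[:half] == s[half:]


def every_split_fails(n: int) -> bool:
    """Check that i = 2 pumps z out of L3 for every legal split of z."""
    m = n + 1                                   # any m > n would do
    z = "0" * m + "1" + "0" * m + "1"
    assert in_L3(z)
    for uv_len in range(1, n + 1):              # |uv| <= n
        for v_len in range(1, uv_len + 1):      # |v| >= 1
            u = z[:uv_len - v_len]
            v = z[uv_len - v_len:uv_len]
            w = z[uv_len:]
            if in_L3(u + v * 2 + w):
                return False                    # a split that survives pumping
    return True


results = [every_split_fails(n) for n in range(1, 9)]
```

A finite check like this is not a proof, but it makes the roles of the two choices (the string z and the exponent i) concrete.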

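Prove that L = {x : x has different numbers of 0s and 1s} is not regular.

Trick: instead of proving this directly, first prove that the complement language D = {x : x has the same number of 0s and 1s} is not regular.

Steps:

1. Recall that we have proven that L' = {0^n 1^n : n ≥ 0} is not regular, and note that L' = D ∩ L(0*1*).
2. If D were regular, then L' would also be regular, because regular languages are closed under intersection. This is a contradiction, so D is not regular.
3. L is the complement of D, and regular languages are closed under complement. So if L were regular, D would be regular too.

Thus L is not regular either.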
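The closure argument relies on one identity: the strings with equally many 0s and 1s, intersected with the regular language 0*1*, are exactly the strings 0^n 1^n. A small sanity check of that identity over all short binary strings could look as follows; the function names are hypothetical and the length bound of 10 is an arbitrary choice for this sketch.

```python
import re
from itertools import product

ZEROS_THEN_ONES = re.compile(r"0*1*")


def in_D(x: str) -> bool:
    """Same number of 0s and 1s."""
    return x.count("0") == x.count("1")


def in_zeros_then_ones(x: str) -> bool:
    """Membership in the regular language 0*1*."""
    return ZEROS_THEN_ONES.fullmatch(x) is not None


def is_0n1n(x: str) -> bool:
    """Membership in {0^n 1^n : n >= 0}."""
    n = len(x) // 2
    return x == "0" * n + "1" * n


# Collect every binary string of length <= 10 on which the two sides disagree.
mismatches = [
    "".join(bits)
    for length in range(11)
    for bits in product("01", repeat=length)
    if (in_D("".join(bits)) and in_zeros_then_ones("".join(bits)))
    != is_0n1n("".join(bits))
]
```

An empty `mismatches` list is what the identity predicts.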

Example 2

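A direct proof that L = {x : x has different numbers of 0s and 1s} is not regular, where n is the number from the Pumping Lemma:

Take x = 0^n 1^{n! + n}. It is in L because n ≠ n! + n.

Then the adversary splits it as x = uvw with |uv| ≤ n and |v| ≥ 1, so v consists entirely of 0s. Let k be the length of the v part.

Now pump it i = (n! + k)/k times. This is an integer because 1 ≤ k ≤ n, so k divides n!.

Then you get u v^i w = 0^{k((n!+k)/k) + (n-k)} 1^{n!+n} = 0^{n!+n} 1^{n!+n}, which has the same number of 0s and 1s and so is not in L. Contradiction.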
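The factorial trick works because every possible length k of v divides n!. As a hedged illustration (helper names are invented here, and only small n are tried since the strings grow like n!), the sketch below plays every adversary split and checks that the pumped string always ends up with equal counts.

```python
from math import factorial


def differs(x: str) -> bool:
    """True if x has different numbers of 0s and 1s."""
    return x.count("0") != x.count("1")


def adversary_always_loses(n: int) -> bool:
    """For every split of x = 0^n 1^(n!+n), pumping (n!+k)/k times balances the counts."""
    x = "0" * n + "1" * (factorial(n) + n)
    assert differs(x)                           # x itself has different counts
    for uv_len in range(1, n + 1):              # |uv| <= n
        for k in range(1, uv_len + 1):          # k = |v| >= 1
            u = x[:uv_len - k]
            v = x[uv_len - k:uv_len]
            w = x[uv_len:]
            i = (factorial(n) + k) // k         # exact: k <= n divides n!
            if differs(u + v * i + w):
                return False
    return True


results = [adversary_always_loses(n) for n in range(1, 7)]
```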

Example

Prove that L2 = {1^m : m is prime} is not regular.

Suppose L2 is regular, and thus the Pumping Lemma holds. Although n is unknown, we can still assume that there is one.

Choose a string z = 1^m where m is a prime number and |uvw| = m > n + 1. Any prefix of z consists entirely of 1s.

Although the decomposition of z into uvw is unknown, it follows that |w| > 1 because |uv| ≤ n and |uvw| > n + 1. Moreover, |v| ≥ 1.

Choose i = |uw|. (Recall that |w| > 1, so |uw| > 1.) We have |u v^i w| = |uw| + |v||uw| = (1 + |v|)|uw|. Because both 1 + |v| and |uw| are greater than 1, the product is a composite number, not a prime. So u v^i w is not in L2.

Thus, L2 is not regular due to the contradiction. Q.E.D.
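Since all strings here are unary, only the lengths matter, which makes the argument easy to check numerically. The sketch below is an illustration with made-up helper names; it picks the smallest prime m > n + 1 (any such prime would do) and verifies that choosing i = |uw| always yields a composite length.

```python
def is_prime(m: int) -> bool:
    """Trial-division primality test, sufficient for small numbers."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


def next_prime_above(bound: int) -> int:
    """Smallest prime strictly greater than `bound`."""
    m = bound + 1
    while not is_prime(m):
        m += 1
    return m


def every_split_fails(n: int) -> bool:
    """For z = 1^m, check that pumping i = |uw| times gives a non-prime length."""
    m = next_prime_above(n + 1)                 # prime m > n + 1
    for uv_len in range(1, n + 1):              # |uv| <= n
        for v_len in range(1, uv_len + 1):      # |v| >= 1
            u_len = uv_len - v_len
            w_len = m - uv_len
            i = u_len + w_len                   # i = |uw|
            pumped_len = u_len + i * v_len + w_len
            if is_prime(pumped_len):
                return False
    return True


results = [every_split_fails(n) for n in range(1, 31)]
```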

Outline

• Pumping Lemma
• DFA Minimization
• Context-free grammars (CFG)

DFA Minimization

There is an algorithm that starts with any DFA and reduces it to the smallest possible DFA for the same language.

The algorithm identifies classes of equivalent states.

These are states that can be merged together without affecting the answer of the computation.

Equivalent and Distinguishable States

Two states q, q’ are equivalent if:

For every string w, the states δ̂(q, w) and δ̂(q’, w) are either both accepting or both rejecting.

Here, δ̂(q, w) is the state that the machine is in if it starts at q and reads the string w.

q, q’ are distinguishable if they are not equivalent:

For some string w, one of the states δ̂(q, w), δ̂(q’, w) is accepting and the other is rejecting.

DFA Minimization Algorithm

Find all pairs of distinguishable states as follows:

1. For any pair of states q, q’: if one of them is accepting and the other is rejecting, mark (q, q’) as distinguishable.

2. Repeat until nothing new is marked: for any pair of states (q, q’) and every alphabet symbol a, if (δ(q, a), δ(q’, a)) is marked as distinguishable, mark (q, q’) as distinguishable.

3. For any pair of states (q, q’): if (q, q’) is not marked as distinguishable, merge q and q’ into a single state.
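The marking procedure translates quite directly into code. What follows is a minimal sketch, not the tutorial's own implementation: the function names, the dictionary encoding of the transition function, and the four-state DFA for "strings ending in 1" are all assumptions made for illustration (the DFAs drawn on the slides are not reproduced here).

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Sequence, Set, Tuple


def distinguishable_pairs(
    states: Sequence[str],
    alphabet: Sequence[str],
    delta: Dict[Tuple[str, str], str],
    accepting: Set[str],
) -> Set[FrozenSet[str]]:
    """Return the set of unordered state pairs marked as distinguishable."""
    # Initial marking: one state accepting, the other rejecting.
    marked = {
        frozenset((q, r))
        for q, r in combinations(states, 2)
        if (q in accepting) != (r in accepting)
    }
    changed = True
    while changed:                              # repeat until nothing new is marked
        changed = False
        for q, r in combinations(states, 2):
            pair = frozenset((q, r))
            if pair in marked:
                continue
            for a in alphabet:
                successors = frozenset((delta[(q, a)], delta[(r, a)]))
                if successors in marked:        # successors already distinguishable
                    marked.add(pair)
                    changed = True
                    break
    return marked


def merge_groups(states: Sequence[str], marked: Set[FrozenSet[str]]) -> List[Set[str]]:
    """Group together states that were never marked as distinguishable."""
    groups: List[Set[str]] = []
    for q in states:
        for group in groups:
            representative = next(iter(group))
            if frozenset((q, representative)) not in marked:
                group.add(q)
                break
        else:
            groups.append({q})
    return groups


# Hypothetical DFA with redundant states; it accepts strings ending in 1.
STATES = ["s0", "s1", "s2", "s3"]
ALPHABET = ["0", "1"]
DELTA = {
    ("s0", "0"): "s1", ("s0", "1"): "s2",
    ("s1", "0"): "s1", ("s1", "1"): "s3",
    ("s2", "0"): "s1", ("s2", "1"): "s3",
    ("s3", "0"): "s0", ("s3", "1"): "s2",
}
ACCEPTING = {"s2", "s3"}

marked = distinguishable_pairs(STATES, ALPHABET, DELTA, ACCEPTING)
groups = merge_groups(STATES, marked)
```

For this example the two rejecting states collapse into one group and the two accepting states into another, giving a two-state minimized DFA. Unordered pairs are stored as `frozenset` so that (q, q’) and (q’, q) are the same table entry.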

Example 1

[Figure: DFA over {0, 1} with states q0, q1, q2, q3, q4, shown next to the table of state pairs]

Example 1 (cont.)

q4 is distinguishable from all other states

Example 1 (cont.)

q0 is distinguishable from q1, q2, q3, q4

Example 1 (cont.)

Merge states not marked distinguishable:
• q0 cannot be merged → group A
• q1, q2, q3 are equivalent → group B
• q4 cannot be merged → group C

Example 1 (cont.)

[Figure: minimized DFA with states qA, qB, qC]

Example 2

[Figure: DFA over {0, 1} with states q0, q1, q2, q3, q4, q5, q6, shown next to the table of state pairs]

Example 2 (cont.)

q2 is distinguishable from all other states

Example 2 (cont.)

q0 is distinguishable from q1, q2, q4, q5, q6

Example 2 (cont.)

q1 is distinguishable from q0, q2, q3, q4, q5

Example 2 (cont.)

q3 is distinguishable from q1, q2, q4, q5, q6

Example 2 (cont.)

q4 is distinguishable from q0, q1, q2, q3, q5, q6

Example 2 (cont.)

q5 is distinguishable from q0, q1, q2, q3, q4, q6

Example 2 (cont.)

Merge states not marked distinguishable:
• q0, q3 are equivalent → group A
• q1, q6 are equivalent → group B
• q2 cannot be merged → group C
• q4 cannot be merged → group D
• q5 cannot be merged → group E

Example 2 (cont.)

[Figure: minimized DFA with states qA, qB, qC, qD, qE]

Outline

• Pumping Lemma
• DFA Minimization
• Context-free Languages

Relations

Context-free Languages L

Context-free Grammars G

Push-down Automata M

L = L(G),  L = L(M),  L(G) = L(M)

PDA = NFA + a stack (infinite memory)

Example (I)

Given the following CFG G with Σ = {a, b}:

S → X | Y

X → aXb | aX | a

Y → aYb | Yb | b

(1) L(G) = ?

Example (I) --- solution: L(S)

S → X | Y

X → aXb | aX | a

Y → aYb | Yb | b

Try to write some strings generated by it:

S ==> X ==> aXb ==> aaXbb ==> aaaXbb ==> aaaabb   (more a’s than b’s)

S ==> Y ==> aYb ==> aYbb ==> aaYbbb ==> aabbbb   (more b’s than a’s)

Observations:

• Starting from S, we can enter one of two states, X or Y, and X, Y are “independent”;

• In state X, more a’s than b’s are always generated;

• In state Y, more b’s than a’s are always generated.

Ls = Lx ∪ Ly

Lx = { a^i b^j : i > j }

Ly = { a^i b^j : i < j }

L(S) = { a^i b^j : i ≠ j }
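The claimed answer can be compared with the grammar by enumerating all short strings that the grammar derives. This is only a sketch: the `generate` helper and the length bound of 8 are assumptions of this illustration, and the enumeration relies on the fact that every sentential form of this grammar contains at most one nonterminal.

```python
from collections import deque
from typing import Dict, List, Set

# The grammar of Example (I); uppercase letters are nonterminals.
RULES: Dict[str, List[str]] = {
    "S": ["X", "Y"],
    "X": ["aXb", "aX", "a"],
    "Y": ["aYb", "Yb", "b"],
}


def generate(rules: Dict[str, List[str]], start: str, max_len: int) -> Set[str]:
    """All terminal strings of length <= max_len derivable from `start`.

    Breadth-first search over sentential forms, always expanding the
    leftmost nonterminal and pruning forms with too many terminals.
    """
    words: Set[str] = set()
    seen = {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        idx = next((i for i, c in enumerate(form) if c in rules), None)
        if idx is None:                         # no nonterminal left
            words.add(form)
            continue
        for rhs in rules[form[idx]]:
            new_form = form[:idx] + rhs + form[idx + 1:]
            terminals = sum(1 for c in new_form if c not in rules)
            if terminals <= max_len and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return words


MAX_LEN = 8
generated = generate(RULES, "S", MAX_LEN)
expected = {
    "a" * i + "b" * j
    for i in range(MAX_LEN + 1)
    for j in range(MAX_LEN + 1 - i)
    if i != j
}
same_language_up_to_bound = generated == expected
```

Agreement up to a length bound does not prove L(G) = { a^i b^j : i ≠ j }, but a mismatch would immediately expose a wrong guess.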

End of this tutorial! Thanks for coming!

Example (II)

Given the following language:

L = {0^i 1^j : i ≤ j ≤ 2i, i = 0, 1, …}, Σ = {0, 1}

(1) design a CFG for it.

Example (II) -- solution: CFG

L = {0^i 1^j : i ≤ j ≤ 2i, i = 0, 1, …}, Σ = {0, 1}

Consider two extreme cases:

(a) if j = i, then L1 = { 0^i 1^j : i = j }, generated by S → 0S1 | ε. Call S → 0S1 the “red-rule”.

(b) if j = 2i, then L2 = { 0^i 1^j : 2i = j }, generated by S → 0S11 | ε. Call S → 0S11 the “blue-rule”.

If i ≤ j ≤ 2i, then choose freely between the “red-rule” and the “blue-rule” at each step of the generation:

S → 0S1

S → 0S11

S → ε

Example (II) -- solution: CFG

L = {0^i 1^j : i ≤ j ≤ 2i, i = 0, 1, …}, Σ = {0, 1}

G:  S → 0S1 | 0S11 | ε

Need to verify L = L(G).

1). L(G) is a subset of L:

Each use of the “red-rule” adds one 0 on the left and one 1 on the right; each use of the “blue-rule” adds one 0 on the left and two 1s on the right. So a derivation that uses i such rules in total produces 0^i 1^j with i ≤ j ≤ 2i. So, L(G) is a subset of L.

2). L is a subset of L(G):

For any w = 0^i 1^j with i ≤ j ≤ 2i, we use the “red-rule” (2i - j) times and then the “blue-rule” (j - i) times, i.e.,

S =*=> 0^{2i-j} S 1^{2i-j} =*=> 0^{2i-j} 0^{j-i} S 1^{2(j-i)} 1^{2i-j} ==> 0^i 1^j = w
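Both inclusions can be spot-checked for short strings. The sketch below is illustrative only: `generate` and `derive` are hypothetical helpers written for this example, and the length bound of 9 is arbitrary. `generate` enumerates L(G) up to the bound, while `derive` replays the derivation with (2i - j) red rules followed by (j - i) blue rules.

```python
from collections import deque
from typing import Dict, List, Set

# The grammar G: red rule S -> 0S1, blue rule S -> 0S11, and S -> epsilon.
RULES: Dict[str, List[str]] = {"S": ["0S1", "0S11", ""]}


def generate(rules: Dict[str, List[str]], start: str, max_len: int) -> Set[str]:
    """All terminal strings of length <= max_len derivable from `start`."""
    words: Set[str] = set()
    seen = {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        idx = next((i for i, c in enumerate(form) if c in rules), None)
        if idx is None:                         # no nonterminal left
            words.add(form)
            continue
        for rhs in rules[form[idx]]:
            new_form = form[:idx] + rhs + form[idx + 1:]
            terminals = sum(1 for c in new_form if c not in rules)
            if terminals <= max_len and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return words


def derive(i: int, j: int) -> str:
    """Replay the derivation: (2i - j) red rules, then (j - i) blue rules."""
    form = "S"
    for _ in range(2 * i - j):
        form = form.replace("S", "0S1")         # red rule
    for _ in range(j - i):
        form = form.replace("S", "0S11")        # blue rule
    return form.replace("S", "")                # finish with S -> epsilon


MAX_LEN = 9
generated = generate(RULES, "S", MAX_LEN)
expected = {
    "0" * i + "1" * j
    for i in range(MAX_LEN + 1)
    for j in range(i, 2 * i + 1)
    if i + j <= MAX_LEN
}
derivations_ok = all(
    derive(i, j) == "0" * i + "1" * j
    for i in range(6)
    for j in range(i, 2 * i + 1)
)
```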
