
Mohammad Ashiqur Rahman

Department of Computer Science, College of Engineering

Tennessee Tech University

Normal Forms for Context-Free Grammars

Foundations of Computer Science, Spring 2017

CYK Algorithm

The CYK Algorithm

Let G = (V, Σ, P, S) be a CFG.

Question: given a string w, is w ∈ L(G)?

The CYK algorithm, developed independently by J. Cocke, D. Younger, and T. Kasami, answers this question.

The CYK Algorithm (2)

The algorithm assumes a grammar in Chomsky normal form, in which every rule has a fixed structure.

It uses dynamic programming: a bottom-up approach.

Revisiting Chomsky normal form

A CFG G = (V, Σ, P, S) is in Chomsky normal form if each rule in G has one of the following forms:

i. A → BC

ii. A → a

iii. S → λ

where A, B, C, S ∈ V, B, C ∈ V − {S}, and a ∈ Σ.
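The three rule forms above can be checked mechanically. The following is a minimal sketch, not part of the original slides. It assumes a grammar is stored as a Python dict from each variable to a list of right-hand-side strings, that variables are single uppercase letters and terminals single lowercase letters, and that the empty string "" stands for λ. The name is_cnf is hypothetical.

```python
def is_cnf(rules, start="S"):
    """Return True if every rule has one of the three Chomsky normal forms.

    Assumed encoding (not from the slides): rules maps a variable to a list
    of right-hand sides; "" stands for lambda.
    """
    variables = set(rules)
    for lhs, bodies in rules.items():
        for rhs in bodies:
            if rhs == "":
                # Form iii: only the start symbol may derive lambda.
                if lhs != start:
                    return False
            elif len(rhs) == 1:
                # Form ii: A -> a, where a is a terminal.
                if rhs in variables or not rhs.islower():
                    return False
            elif len(rhs) == 2:
                # Form i: A -> BC, with B and C variables other than S.
                if any(sym not in variables or sym == start for sym in rhs):
                    return False
            else:
                return False
    return True


example = {"S": ["AT", "AB"], "T": ["XB"], "X": ["AT", "AB"],
           "A": ["a"], "B": ["b"]}
print(is_cnf(example))                # True
print(is_cnf({"S": ["aSb", ""]}))     # False: aSb is too long
```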


Idea of the CYK Algorithm

Let u = x_1 x_2 … x_n be a string to be tested for u ∈ L(G). Let x_{i,j} denote the substring x_i x_{i+1} … x_j of u.

x_{i,i} is simply x_i.

The strategy of the CYK algorithm for the grammar G:

Step 1: For each substring x_{i,i} of u with length one, find the set X_{i,i} of all variables A with a rule A → x_{i,i}.

Step 2: For each substring x_{i,i+1} of u with length two, find the set X_{i,i+1} of all variables A that initiate a derivation A ⇒* x_{i,i+1}.

Step 3: For each substring x_{i,i+2} of u with length three, find the set X_{i,i+2} of all variables A that initiate a derivation A ⇒* x_{i,i+2}.

⋮

Step n − 1: For the substrings x_{1,n−1} and x_{2,n} of u with length n − 1, find the sets X_{1,n−1} and X_{2,n} of all variables A that initiate derivations A ⇒* x_{1,n−1} and A ⇒* x_{2,n}, respectively.

Step n: For the string x_{1,n} = u with length n, find the set X_{1,n} of all variables A that initiate a derivation A ⇒* x_{1,n}.

If S ∈ X_{1,n}, then u ∈ L(G); otherwise u ∉ L(G).
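The steps above can be sketched in Python as follows. This is an illustrative sketch rather than the slides' own code. It assumes the grammar is a dict from variables to lists of right-hand-side strings, each either one terminal or two single-letter variables. The names cyk_table and accepts are hypothetical, and the sample grammar is the one from the worked example in these slides.

```python
def cyk_table(rules, u):
    """Build the sets X[i, j] (1-based, i <= j) for the string u."""
    n = len(u)
    X = {}
    # Step 1: substrings of length one.
    for i in range(1, n + 1):
        X[i, i] = {A for A, bodies in rules.items() if u[i - 1] in bodies}
    # Steps 2..n: substrings of length 2, 3, ..., n.
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            X[i, j] = set()
            for k in range(i, j):  # split into x_{i,k} and x_{k+1,j}
                for A, bodies in rules.items():
                    for rhs in bodies:
                        if (len(rhs) == 2 and rhs[0] in X[i, k]
                                and rhs[1] in X[k + 1, j]):
                            X[i, j].add(A)
    return X


def accepts(rules, u, start="S"):
    """Return True if u is derivable from start."""
    if not u:
        # The empty string is derivable only through the rule S -> lambda,
        # encoded here (by assumption) as the empty right-hand side "".
        return "" in rules.get(start, [])
    return start in cyk_table(rules, u)[1, len(u)]


grammar = {"S": ["AT", "AB"], "T": ["XB"], "X": ["AT", "AB"],
           "A": ["a"], "B": ["b"]}
print(accepts(grammar, "aaabbb"))  # True
print(accepts(grammar, "aabbb"))   # False
```

The three nested loops over length, start position, and split point give a running time cubic in n, multiplied by the number of rules.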

CYK: Matrix Representation

The sets X_{i,j} can be represented as the upper triangular portion of an n × n matrix:

       1        2        3        …   n−1           n
1      X_{1,1}  X_{1,2}  X_{1,3}  …   X_{1,n−1}     X_{1,n}
2               X_{2,2}  X_{2,3}  …   X_{2,n−1}     X_{2,n}
3                        X_{3,3}  …   X_{3,n−1}     X_{3,n}
⋮                                 ⋱   ⋮             ⋮
n−1                                   X_{n−1,n−1}   X_{n−1,n}
n                                                   X_{n,n}
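The order in which the cells of this matrix get filled can be made explicit with a small generator. This is only a sketch and the name fill_order is hypothetical. It walks the main diagonal first (substrings of length one), then each diagonal above it, ending at the single cell (1, n).

```python
def fill_order(n):
    """Yield the cells (i, j) of the upper triangle, one diagonal at a time."""
    for length in range(1, n + 1):           # substring length
        for i in range(1, n - length + 2):   # start position
            yield (i, i + length - 1)


print(list(fill_order(3)))
# [(1, 1), (2, 2), (3, 3), (1, 2), (2, 3), (1, 3)]
```

An n × n matrix has n(n + 1)/2 such cells, for example 21 cells for a string of length 6.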


How the CYK Algorithm Works

Why does it work?

Step 1 is straightforward.

For step 2, the derivation of x_{i,i+1}:

A ⇒ BC ⇒ x_i C ⇒ x_i x_{i+1}

For step t + 1, the derivation of x_{i,i+t}:

A ⇒ BC ⇒* x_{i,k} C ⇒* x_{i,k} x_{k+1,i+t}

A derives x_{i,i+t} only if there is a rule A → BC and an index k with i ≤ k < i + t such that B ∈ X_{i,k} and C ∈ X_{k+1,i+t}.
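The condition above translates directly into a computation for a single cell. The sketch below assumes the same dict-of-strings grammar encoding used for illustration elsewhere. The helper names combine and cell are hypothetical, and the sample grammar is the one from the first worked example in these slides.

```python
def combine(rules, left, right):
    """Return all variables A with a rule A -> BC, B in left, C in right."""
    return {A for A, bodies in rules.items()
            for rhs in bodies
            if len(rhs) == 2 and rhs[0] in left and rhs[1] in right}


def cell(rules, X, i, j):
    """Compute X[i, j] from already-computed cells, trying every split k."""
    result = set()
    for k in range(i, j):  # i <= k < j
        result |= combine(rules, X[i, k], X[k + 1, j])
    return result


grammar = {"S": ["AT", "AB"], "T": ["XB"], "X": ["AT", "AB"],
           "A": ["a"], "B": ["b"]}
# Cells for the substring x_{3,5} = abb of aaabbb, filled by hand.
X = {(3, 3): {"A"}, (4, 4): {"B"}, (5, 5): {"B"},
     (3, 4): {"S", "X"}, (4, 5): set()}
print(cell(grammar, X, 3, 5))  # {'T'}, found with split k = 4 via T -> XB
```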


CYK: Example

Is aaabbb a string in L(G)?

S → AT | AB
T → XB
X → AT | AB
A → a
B → b

     1        2        3        4        5        6
1    X_{1,1}  X_{1,2}  X_{1,3}  X_{1,4}  X_{1,5}  X_{1,6}
2             X_{2,2}  X_{2,3}  X_{2,4}  X_{2,5}  X_{2,6}
3                      X_{3,3}  X_{3,4}  X_{3,5}  X_{3,6}
4                               X_{4,4}  X_{4,5}  X_{4,6}
5                                        X_{5,5}  X_{5,6}
6                                                 X_{6,6}

CYK: Example (2)

Is aaabbb a string in L(G)?

S → AT | AB
T → XB
X → AT | AB
A → a
B → b

     1      2      3      4         5         6
1    {A}    ∅      ∅      ∅         ∅         {S, X}
2           {A}    ∅      ∅         {S, X}    {T}
3                  {A}    {S, X}    {T}       ∅
4                         {B}       ∅         ∅
5                                   {B}       ∅
6                                             {B}

Since S ∈ X_{1,6}, aaabbb ∈ L(G).
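The table for aaabbb can be reproduced programmatically. The following is a compact sketch under the assumption that the grammar is stored as a dict from variables to lists of right-hand-side strings. The names cyk_table and format_table are hypothetical.

```python
def cyk_table(rules, u):
    """Return the sets X[i, j] (1-based) computed bottom-up for the string u."""
    n = len(u)
    X = {(i, i): {A for A, bodies in rules.items() if u[i - 1] in bodies}
         for i in range(1, n + 1)}
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            X[i, j] = {A for A, bodies in rules.items() for rhs in bodies
                       for k in range(i, j)
                       if len(rhs) == 2 and rhs[0] in X[i, k]
                       and rhs[1] in X[k + 1, j]}
    return X


def format_table(X, n):
    """Return the upper-triangular table as a list of text rows."""
    def cell(i, j):
        if j < i:
            return ""  # below the diagonal: unused
        return "{" + ", ".join(sorted(X[i, j])) + "}" if X[i, j] else "∅"
    return ["  ".join(cell(i, j).ljust(6) for j in range(1, n + 1)).rstrip()
            for i in range(1, n + 1)]


grammar = {"S": ["AT", "AB"], "T": ["XB"], "X": ["AT", "AB"],
           "A": ["a"], "B": ["b"]}
X = cyk_table(grammar, "aaabbb")
for row in format_table(X, 6):
    print(row)
```

Cells that the algorithm leaves empty are printed as ∅.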


The CYK Algorithm for Parsing

The CYK algorithm can be used to produce derivations of strings in L(G).

Derivation        Sets
S ⇒ AT            A ∈ X_{1,1}, T ∈ X_{2,6}
  ⇒ aT            T ∈ X_{2,6}
  ⇒ aXB           X ∈ X_{2,5}, B ∈ X_{6,6}
  ⇒ aATB          A ∈ X_{2,2}, T ∈ X_{3,5}, B ∈ X_{6,6}
  ⇒ aaTB          T ∈ X_{3,5}, B ∈ X_{6,6}
  ⇒ aaXBB         X ∈ X_{3,4}, B ∈ X_{5,5}, B ∈ X_{6,6}
  ⇒ aaABBB        A ∈ X_{3,3}, B ∈ X_{4,4}, B ∈ X_{5,5}, B ∈ X_{6,6}
  ⇒* aaabbb

     1      2      3      4         5         6
1    {A}    ∅      ∅      ∅         ∅         {S, X}
2           {A}    ∅      ∅         {S, X}    {T}
3                  {A}    {S, X}    {T}       ∅
4                         {B}       ∅         ∅
5                                   {B}       ∅
6                                             {B}

S → AT | AB
T → XB
X → AT | AB
A → a
B → b
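One way to recover such a derivation is to record, for every variable placed in a cell, the rule and split point that put it there, and then expand the leftmost variable repeatedly. The sketch below illustrates this idea and is not taken from the slides. The names cyk_with_backpointers and leftmost_derivation are hypothetical, and the grammar encoding (a dict from variables to right-hand-side strings) is an assumption.

```python
def cyk_with_backpointers(rules, u):
    """Map (i, j, A) to a terminal (length one) or to a split (k, B, C)."""
    n = len(u)
    back = {}
    for i in range(1, n + 1):
        for A, bodies in rules.items():
            if u[i - 1] in bodies:
                back[i, i, A] = u[i - 1]
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            for k in range(i, j):
                for A, bodies in rules.items():
                    for rhs in bodies:
                        if (len(rhs) == 2 and (i, k, rhs[0]) in back
                                and (k + 1, j, rhs[1]) in back):
                            # Keep the first split found for (i, j, A).
                            back.setdefault((i, j, A), (k, rhs[0], rhs[1]))
    return back


def leftmost_derivation(rules, u, start="S"):
    """Return the sentential forms of one leftmost derivation, or None."""
    back = cyk_with_backpointers(rules, u)
    n = len(u)
    if (1, n, start) not in back:
        return None
    # Each item is a terminal (str) or a pending variable (i, j, A).
    form = [(1, n, start)]
    steps = [start]
    while any(isinstance(item, tuple) for item in form):
        pos = next(p for p, item in enumerate(form) if isinstance(item, tuple))
        i, j, A = form[pos]
        entry = back[i, j, A]
        if isinstance(entry, str):
            form[pos:pos + 1] = [entry]                      # A -> a
        else:
            k, B, C = entry
            form[pos:pos + 1] = [(i, k, B), (k + 1, j, C)]   # A -> BC
        steps.append("".join(x if isinstance(x, str) else x[2] for x in form))
    return steps


grammar = {"S": ["AT", "AB"], "T": ["XB"], "X": ["AT", "AB"],
           "A": ["a"], "B": ["b"]}
print(" => ".join(leftmost_derivation(grammar, "aaabbb")))
```

For aaabbb the first eight sentential forms coincide with the derivation in the table above. The remaining steps rewrite A and the B's to terminals one at a time.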


CYK: Another Example

Is abba a string in L(G)?

S → AX | AY | a
X → AX | a
Y → BY | a
A → a
B → b

     1        2        3        4
1    X_{1,1}  X_{1,2}  X_{1,3}  X_{1,4}
2             X_{2,2}  X_{2,3}  X_{2,4}
3                      X_{3,3}  X_{3,4}
4                               X_{4,4}

CYK: Another Example (2)

Is abba a string in L(G)?

S → AX | AY | a
X → AX | a
Y → BY | a
A → a
B → b

     1               2      3      4
1    {S, X, Y, A}    ∅      ∅      {S}
2                    {B}    ∅      {Y}
3                           {B}    {Y}
4                                  {S, X, Y, A}

Since S ∈ X_{1,4}, abba ∈ L(G).

CYK: Do It Yourself

Is baaba a string in L(G)?

S → AB | BC
A → BA | a
B → CC | b
C → AB | a

     1        2        3        4        5
1    X_{1,1}  X_{1,2}  X_{1,3}  X_{1,4}  X_{1,5}
2             X_{2,2}  X_{2,3}  X_{2,4}  X_{2,5}
3                      X_{3,3}  X_{3,4}  X_{3,5}
4                               X_{4,4}  X_{4,5}
5                                        X_{5,5}

For each correct slot, you will get 0.5 marks. Total: 7.5 marks.

Were You Able to Do It Correctly?

For each correct slot, you will get 0.5 marks. Total: 7.5 marks.

Is baaba ∈ L(G)?

S → AB | BC
A → BA | a
B → CC | b
C → AB | a

     1      2         3         4         5
1    {B}    {S, A}    ∅         ∅         {S, A, C}
2           {A, C}    {B}       {B}       {S, A, C}
3                     {A, C}    {S, C}    {B}
4                               {B}       {S, A}
5                                         {A, C}

Since S ∈ X_{1,5}, baaba ∈ L(G).
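An answer to this exercise can be checked slot by slot against a computed table. The sketch below is illustrative only. The function score is a hypothetical helper that awards 0.5 marks per matching slot, as stated on the slide, and the grammar encoding (a dict from variables to right-hand-side strings) is an assumption.

```python
def cyk_table(rules, u):
    """Return the sets X[i, j] (1-based) computed bottom-up for the string u."""
    n = len(u)
    X = {(i, i): {A for A, bodies in rules.items() if u[i - 1] in bodies}
         for i in range(1, n + 1)}
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            X[i, j] = {A for A, bodies in rules.items() for rhs in bodies
                       for k in range(i, j)
                       if len(rhs) == 2 and rhs[0] in X[i, k]
                       and rhs[1] in X[k + 1, j]}
    return X


def score(rules, u, answer, mark=0.5):
    """Award `mark` for every slot of `answer` that matches the CYK table."""
    reference = cyk_table(rules, u)
    return sum(mark for slot, expected in reference.items()
               if answer.get(slot) == expected)


grammar = {"S": ["AB", "BC"], "A": ["BA", "a"],
           "B": ["CC", "b"], "C": ["AB", "a"]}
# The table from the slide, with set() for the empty slots.
answer = {
    (1, 1): {"B"}, (1, 2): {"S", "A"}, (1, 3): set(), (1, 4): set(),
    (1, 5): {"S", "A", "C"},
    (2, 2): {"A", "C"}, (2, 3): {"B"}, (2, 4): {"B"}, (2, 5): {"S", "A", "C"},
    (3, 3): {"A", "C"}, (3, 4): {"S", "C"}, (3, 5): {"B"},
    (4, 4): {"B"}, (4, 5): {"S", "A"},
    (5, 5): {"A", "C"},
}
print(score(grammar, "baaba", answer))  # 7.5
```

The table has 15 slots, so a fully correct answer earns 15 × 0.5 = 7.5 marks.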


THANKS

Source: Chapter 4, Languages and Machines, Thomas Sudkamp