Algorithms for Context-Free Grammars
Section 3.6
Mon, Oct 31, 2005

The Membership Problem
The Problem: Given a grammar G and a string w, is w ∈ L(G)?
The Algorithm: First rewrite the grammar in Chomsky Normal Form (CNF).

Chomsky Normal Form
A grammar is in Chomsky Normal Form if every rule is of the form
  A → XY, or
  A → a
where A ∈ V − Σ and X, Y ∈ V.
Theorem: For every context-free grammar G, there is a context-free grammar G' in CNF such that
  L(G') = L(G) − (Σ ∪ {e}),
where e denotes the empty string.

CNF Algorithm
The following algorithm converts a grammar to Chomsky Normal Form.
Divide the rules A → α for which |α| ≠ 2 into three groups:
  long rules: |α| ≥ 3.
  short rules: |α| = 1.
  e-rules: α = e.
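
The three groups can be sketched in Python. This is only an illustration: the representation of a rule as a pair (left side, tuple of right-side symbols) and the name classify_rules are assumptions made here, not part of the slides.

```python
def classify_rules(rules):
    """Split the rules whose right side does not have length 2 into three groups."""
    groups = {"long": [], "short": [], "e": []}
    for lhs, rhs in rules:
        if len(rhs) >= 3:
            groups["long"].append((lhs, rhs))
        elif len(rhs) == 1:
            groups["short"].append((lhs, rhs))
        elif len(rhs) == 0:
            groups["e"].append((lhs, rhs))
        # Rules with exactly two right-side symbols need no treatment here.
    return groups


# The example grammar of this section: S -> AB, A -> aA | e, B -> aBb | e.
# The empty tuple stands for the empty string e (an encoding assumed here).
rules = [("S", ("A", "B")), ("A", ("a", "A")), ("A", ()),
         ("B", ("a", "B", "b")), ("B", ())]
groups = classify_rules(rules)
```

With this encoding, the only long rule is B → aBb, and the e-rules are A → e and B → e.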

CNF Algorithm
Eliminate the long rules.
Let the rule be A → X₁X₂…Xₙ, for n ≥ 3.
Rewrite it as a series of rules, where A₁, …, Aₙ₋₂ are new nonterminals:
  A → X₁A₁
  A₁ → X₂A₂
  ⋮
  Aₙ₋₃ → Xₙ₋₂Aₙ₋₂
  Aₙ₋₂ → Xₙ₋₁Xₙ
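
A minimal Python sketch of this step follows. It assumes rules are stored as (left side, tuple of right-side symbols) pairs and that new nonterminals may be named by appending a number to the left side; both choices, and the function name, are illustrative assumptions.

```python
def eliminate_long_rules(rules):
    """Replace every rule with 3 or more right-side symbols by a chain of 2-symbol rules."""
    # Every symbol already in use, so that new nonterminal names do not clash.
    used = {lhs for lhs, _ in rules} | {s for _, rhs in rules for s in rhs}

    def fresh_name(base):
        i = 1
        while f"{base}{i}" in used:
            i += 1
        used.add(f"{base}{i}")
        return f"{base}{i}"

    result = []
    for lhs, rhs in rules:
        if len(rhs) < 3:
            result.append((lhs, rhs))
            continue
        # A -> X1 X2 ... Xn becomes A -> X1 A1, A1 -> X2 A2, ..., A(n-2) -> X(n-1) Xn.
        current = lhs
        for i in range(len(rhs) - 2):
            new = fresh_name(lhs)
            result.append((current, (rhs[i], new)))
            current = new
        result.append((current, (rhs[-2], rhs[-1])))
    return result


# Example grammar of this section: S -> AB, A -> aA | e, B -> aBb | e.
rules = [("S", ("A", "B")), ("A", ("a", "A")), ("A", ()),
         ("B", ("a", "B", "b")), ("B", ())]
converted = eliminate_long_rules(rules)
```

On the example grammar this replaces B → aBb with B → aB1 and B1 → Bb and leaves the other rules unchanged.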

CNF Algorithm
Eliminate the e-rules.
We first determine which nonterminals are erasable: E = {A ∈ V − Σ | A ⇒* e}.
Use the following algorithm:
  Let E = ∅.
  While there is a rule A → α with α ∈ E* and A ∉ E,
    Add A to E.
For every A ∈ E and every rule in which A appears on the right side, add a copy of that rule, with A replaced by e.
Delete the e-rules from R.
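
The two parts of this step can be sketched in Python as follows. The rule encoding (pairs of a left side and a tuple of right-side symbols, with the empty tuple for e) and the function names erasable and eliminate_e_rules are assumptions made for illustration.

```python
def erasable(rules):
    """Return the set of nonterminals that derive the empty string."""
    e = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            # all() is True for an empty right side, so A -> e puts A in the set.
            if lhs not in e and all(sym in e for sym in rhs):
                e.add(lhs)
                changed = True
    return e


def eliminate_e_rules(rules):
    """Add copies of rules with erasable symbols dropped, then delete the e-rules."""
    e = erasable(rules)
    result = set(rules)
    pending = list(rules)
    while pending:
        lhs, rhs = pending.pop()
        for i, sym in enumerate(rhs):
            if sym in e:
                shorter = (lhs, rhs[:i] + rhs[i + 1:])
                if shorter not in result:
                    result.add(shorter)
                    pending.append(shorter)
    # Drop every rule whose right side is empty.
    return {(lhs, rhs) for lhs, rhs in result if rhs}


# Example grammar after the long rules are gone:
# S -> AB, A -> aA | e, B -> aB1 | e, B1 -> Bb.
rules = [("S", ("A", "B")), ("A", ("a", "A")), ("A", ()),
         ("B", ("a", "B1")), ("B", ()), ("B1", ("B", "b"))]
E = erasable(rules)
without_e = eliminate_e_rules(rules)
```

For the example grammar, E comes out as {S, A, B}, and the new rules are S → A, S → B, A → a, and B1 → b.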

CNF Algorithm
Eliminate the short rules.
For each A ∈ V, determine the set of individual symbols that can be derived from A. Call this set D(A).
For each A ∈ V − Σ, use the following algorithm:
  Let D(A) = {A}.
  While there is a rule B → X with B ∈ D(A) and X ∉ D(A),
    Add X to D(A).
For a terminal a, D(a) = {a}.
For every rule A → XY, add a new rule A → X'Y' for every possible choice of X' ∈ D(X) and Y' ∈ D(Y).
For every rule A → XY with A ∈ D(S) − {S}, add the rule S → XY.
Finally, delete the short rules.
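
A Python sketch of the short-rule step is given below. It assumes that the long rules and e-rules have already been removed, so every right side has one or two symbols, and that rules are (left side, tuple of right-side symbols) pairs. The function names and the default start symbol "S" are illustrative assumptions.

```python
def derivable_symbols(rules, symbols):
    """Compute D(A) for every symbol A by closing {A} under the short rules."""
    d = {}
    for a in symbols:
        d[a] = {a}
        changed = True
        while changed:
            changed = False
            for lhs, rhs in rules:
                if len(rhs) == 1 and lhs in d[a] and rhs[0] not in d[a]:
                    d[a].add(rhs[0])
                    changed = True
    return d


def eliminate_short_rules(rules, start="S"):
    symbols = {lhs for lhs, _ in rules} | {s for _, rhs in rules for s in rhs}
    d = derivable_symbols(rules, symbols)
    binary = [(lhs, rhs) for lhs, rhs in rules if len(rhs) == 2]
    # A -> XY yields A -> X'Y' for every X' in D(X) and Y' in D(Y).
    expanded = {(lhs, (x, y))
                for lhs, (x0, y0) in binary
                for x in d[x0]
                for y in d[y0]}
    # Rules of symbols in D(S) - {S} are copied to the start symbol.
    promoted = {(start, rhs) for lhs, rhs in expanded if lhs in d[start] - {start}}
    # Only two-symbol rules are returned, so the short rules are gone.
    return expanded | promoted


# Example grammar after the e-rules are gone:
# S -> AB | A | B, A -> aA | a, B -> aB1, B1 -> Bb | b.
rules = [("S", ("A", "B")), ("S", ("A",)), ("S", ("B",)),
         ("A", ("a", "A")), ("A", ("a",)),
         ("B", ("a", "B1")),
         ("B1", ("B", "b")), ("B1", ("b",))]
d = derivable_symbols(rules, {"S", "A", "B", "B1", "a", "b"})
cnf = eliminate_short_rules(rules)
```

On the example grammar this produces the eleven rules of the final grammar of the worked example.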

Example
Convert the following grammar to CNF.
  S → AB
  A → aA | e
  B → aBb | e

Example
Eliminate the long rules.
Replace B → aBb with
  B → aB1
  B1 → Bb
The grammar is now
  S → AB
  A → aA | e
  B → aB1 | e
  B1 → Bb

Example
Find the set of erasable nonterminals.
Since A → e, B → e, and S → AB, we have E = {S, A, B}.
Eliminate the e-rules.
  From S → AB, add S → A and S → B.
  From A → aA, add A → a.
  From B1 → Bb, add B1 → b.
  Delete A → e and B → e.

Example
The grammar is now
  S → AB | A | B
  A → aA | a
  B → aB1
  B1 → Bb | b

Example
Find the sets of derivable symbols.
  D(S) = {S, A, B, a}
  D(A) = {A, a}
  D(B) = {B}
  D(B1) = {B1, b}

Example
Eliminate the short rules.
Add the rules
  S → aB   (from S → AB, since a ∈ D(A))
  A → aa   (from A → aA, since a ∈ D(A))
  B → ab   (from B → aB1, since b ∈ D(B1))
Delete the short rules
  S → A
  S → B
  A → a
  B1 → b

Example
The grammar is now
  S → AB | aB
  A → aA | aa
  B → aB1 | ab
  B1 → Bb

Example
D(S) − {S} = {A, B, a}, so add the rules
  S → aA
  S → aB1
  S → aa
  S → ab

Example
The final grammar is
  S → AB | aA | aB1 | aB | aa | ab
  A → aA | aa
  B → aB1 | ab
  B1 → Bb

Derivations in CNF
In the grammar of the example, derive the string aaaabb.
  S ⇒ AB ⇒ aaB ⇒ aaaB1 ⇒ aaaBb ⇒ aaaabb
How many steps did it take? Was that predictable? Can we generalize that to a string of length n?
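
The Membership Problem
Now we have a way to solve the membership problem.
Every rule of the converted grammar replaces one nonterminal with two symbols, so each derivation step lengthens the current string by exactly one symbol.
Any string of length n must therefore be derivable in exactly n − 1 steps, and there are only finitely many derivations of that length.
At worst, check every possible derivation of length n − 1.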
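
The exhaustive check can be sketched in Python as below. This is a brute-force illustration, not an efficient parser. It assumes every rule has exactly two right-side symbols, as in the final grammar of the example, and that rules are (left side, tuple of symbols) pairs; the function name derives is hypothetical.

```python
def derives(rules, start, w):
    """Return True if w can be derived from start in exactly len(w) - 1 steps."""
    target = tuple(w)
    if len(target) < 2:
        # The converted grammar derives no string of length 0 or 1.
        return False
    forms = {(start,)}
    for _ in range(len(target) - 1):
        next_forms = set()
        for form in forms:
            for i, sym in enumerate(form):
                for lhs, rhs in rules:
                    if lhs == sym:
                        # Apply the rule at position i.
                        next_forms.add(form[:i] + rhs + form[i + 1:])
        forms = next_forms
    return target in forms


# Final grammar of the example.
cnf = [("S", ("A", "B")), ("S", ("a", "A")), ("S", ("a", "B1")),
       ("S", ("a", "B")), ("S", ("a", "a")), ("S", ("a", "b")),
       ("A", ("a", "A")), ("A", ("a", "a")),
       ("B", ("a", "B1")), ("B", ("a", "b")),
       ("B1", ("B", "b"))]

print(derives(cnf, "S", "aaaabb"))
print(derives(cnf, "S", "abb"))
```

The set of strings after each step can grow quickly, so this sketch is only practical for short inputs.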

Example
Show that abb is not derivable in the previous example.
Find every derivation of length 2.
  S ⇒ AB ⇒ aAB
  S ⇒ AB ⇒ aaB
  S ⇒ AB ⇒ AaB1
  S ⇒ AB ⇒ Aab
  S ⇒ aA ⇒ aaA
  S ⇒ aA ⇒ aaa
  S ⇒ aB1 ⇒ aBb
  S ⇒ aB ⇒ aaB1
  S ⇒ aB ⇒ aab
Only the strings aaa and aab are derived. Therefore abb ∉ L(G).

The Emptiness Problem
The Problem: Given a CFG G, is L(G) empty?
Lemma: If L(G) contains no string of length less than n, where n is the constant of the Pumping Lemma, then L(G) is empty.
Proof:
  By the Pumping Lemma, any string w ∈ L(G) with length at least n can be written as w = uvxyz, where v and y are not both empty.
  Then the string uxz is in L(G), and it is strictly shorter than w.
  By repeating this, we eventually get a string of length less than n that is in L(G).
  So if L(G) is not empty, it contains a string of length less than n.

The Emptiness Problem
The Algorithm:
  Rewrite G in CNF.
  Test every possible string of length less than n.
The Decision:
  If no string of length less than n is in L(G), then L(G) is empty.
  Otherwise, L(G) is not empty.
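
The test loop can be sketched in Python as follows. The sketch takes the pumping constant n and a membership test as parameters rather than computing them; the names is_empty, member, and in_anbn, and the value n = 4 used below, are illustrative assumptions and not a computed pumping constant. Since the CNF grammar drops e and the single-symbol strings, a membership test built on it would have to check those short strings separately.

```python
from itertools import product


def is_empty(member, alphabet, n):
    """Return True if no string of length less than n passes the membership test."""
    for length in range(n):
        for letters in product(alphabet, repeat=length):
            if member("".join(letters)):
                return False
    return True


def in_anbn(w):
    """Hypothetical membership test for the language of strings a^k b^k with k >= 1."""
    k = len(w) // 2
    return k >= 1 and w == "a" * k + "b" * k


print(is_empty(in_anbn, "ab", 4))
print(is_empty(lambda w: False, "ab", 4))
```

The number of strings tested grows exponentially with n, so this is a decision procedure in principle rather than a practical one.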

The Finiteness Problem
The Problem: Given a CFG G, is L(G) finite?
Lemma: If L(G) is infinite, then L(G) contains a string w such that n ≤ |w| < 2n, where n is the constant of the Pumping Lemma.
Proof:
  If L(G) is infinite, then there exists a string in L(G) of length at least 2n.
  Let w be a shortest such string.
  Write w = uvxyz, according to the Pumping Lemma.
  Then uxz is also in L(G), and |uxz| < |w|.
  Since w is a shortest string of length at least 2n, |uxz| < 2n.
  However, |vy| ≤ n, so |uxz| = |w| − |vy| ≥ 2n − n = n.
  Therefore, n ≤ |uxz| < 2n.

The Finiteness Problem
The Algorithm:
  Rewrite the grammar in CNF.
  Test every possible string of length at least n, but less than 2n.
The Decision:
  If L(G) contains a string of length at least n, but less than 2n, then L(G) is infinite, because by the Pumping Lemma that string can be pumped to give infinitely many strings in L(G).
  Otherwise, by the lemma, L(G) is finite.
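
A Python sketch of this test follows. As an assumption of the sketch, the pumping constant n and a membership test are passed in as parameters; the names is_infinite and in_anbn and the value n = 4 are illustrative only and not a computed pumping constant.

```python
from itertools import product


def is_infinite(member, alphabet, n):
    """Return True if some string w with n <= len(w) < 2n passes the membership test."""
    for length in range(n, 2 * n):
        for letters in product(alphabet, repeat=length):
            if member("".join(letters)):
                return True
    return False


def in_anbn(w):
    """Hypothetical membership test for the language of strings a^k b^k with k >= 1."""
    k = len(w) // 2
    return k >= 1 and w == "a" * k + "b" * k


print(is_infinite(in_anbn, "ab", 4))
print(is_infinite(lambda w: w == "ab", "ab", 4))
```

The first call finds aabb in the tested range, while the second language contains only ab, so no string in the range is found.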