cfg-pda.ppt
TRANSCRIPT
-
8/11/2019 CFG-PDA.ppt
1/68
Equivalence of CFG's and PDA's
A language is context free if and only if some pushdown automaton recognizes it
As usual with "if and only if" theorems, there are two directions to prove:
  If a language is context free, then some pushdown automaton recognizes it
  If a pushdown automaton recognizes some language, then it is context free
-
Only If (CFG to PDA)
Let L = L(G) for some CFG G = (V, Σ, P, S)
Idea: have PDA A simulate leftmost derivations in G, where a left-sentential form (LSF) is represented by:
1. The sequence of input symbols that A has consumed from its input, followed by
2. A's stack, top left-most
Example: If (q, abcd, S) |-* (q, cd, ABC), then the LSF represented is abABC
-
Moves of A
If a terminal a is on top of the stack, then there had better be an a waiting on the input. A consumes a from the input and pops it from the stack
  The LSF represented doesn't change!
If a variable B is on top of the stack, then PDA A has a choice of replacing B on the stack by the body of any production with head B
-
Defining the PDA
Define PDA A as follows:
  Q contains a single state, q
  Σ contains the terminal symbols of the grammar
  Γ contains all terminal and non-terminal symbols from the grammar
  F is the empty set (A terminates by empty stack)
  Start stack symbol is the distinguished (start) symbol of the grammar
  δ is defined as follows:
    For each production X → α in the grammar, create a move (q, α) in δ(q, ε, X)
    For each terminal symbol a in the grammar, create a move δ(q, a, a) = {(q, ε)}
-
Example
S → a | aS | bSS | SSb | SbS
PDA A = ({q}, {a,b}, {S,a,b}, δ, q, ∅, S)
δ is defined as
  δ(q, ε, S) = { (q,a), (q,aS), (q,bSS), (q,SSb), (q,SbS) }
  δ(q, a, a) = {(q, ε)}
  δ(q, b, b) = {(q, ε)}
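The construction can be tried out with a minimal Python sketch. The names `cfg_to_pda` and `accepts` are hypothetical, every symbol is assumed to be a single character, and the search is pruned on the assumption that the grammar has no ε-productions (so every stack symbol must still consume at least one input character).

```python
from collections import deque

def cfg_to_pda(productions):
    """Build delta for the one-state PDA; keys are (input symbol or '', stack top)."""
    terminals = {s for bodies in productions.values()
                 for body in bodies for s in body if s not in productions}
    delta = {("", head): set(bodies) for head, bodies in productions.items()}
    for a in terminals:
        delta[(a, a)] = {""}          # match terminal a on input and pop it
    return delta

def accepts(delta, start_symbol, w):
    """Nondeterministic simulation, acceptance by empty stack (stack top is leftmost)."""
    seen = set()
    queue = deque([(0, start_symbol)])
    while queue:
        pos, stack = queue.popleft()
        if (pos, stack) in seen:
            continue
        seen.add((pos, stack))
        if not stack:
            if pos == len(w):
                return True
            continue
        # Pruning (assumes no ε-productions): the stack cannot outgrow the remaining input.
        if len(stack) > len(w) - pos:
            continue
        top, rest = stack[0], stack[1:]
        for body in delta.get(("", top), ()):          # expand a variable
            queue.append((pos, body + rest))
        if pos < len(w):
            for body in delta.get((w[pos], top), ()):  # match a terminal
                queue.append((pos + 1, body + rest))
    return False

grammar = {"S": ["a", "aS", "bSS", "SSb", "SbS"]}
delta = cfg_to_pda(grammar)
print(accepts(delta, "S", "baa"))
```

The breadth-first search stands in for the PDA's nondeterministic choice of production; a real PDA simply "guesses" the right body.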
-
Processing of baa
state  input  stack  move                         action
q      baa    S      (q, bSS) in δ(q, ε, S)       Generate bSS
q      baa    bSS    δ(q, b, b) = {(q, ε)}        Match b
q      aa     SS     (q, a) in δ(q, ε, S)         Generate a
q      aa     aS     δ(q, a, a) = {(q, ε)}        Match a
q      a      S      (q, a) in δ(q, ε, S)         Generate a
q      a      a      δ(q, a, a) = {(q, ε)}        Match a
q      ε      ε      accept
[Figure: parse tree with root S, children b S S, and each S deriving a; the leaves b, a, a are matched one by one against the input b a a]
-
Converting from PDA to CFG
A PDA consumes a character
A CFG generates a character
We want to relate these two
What happens when a PDA consumes a character?
  It may change state
  It may change the stack
-
Converting from PDA to CFG continued
Suppose X is on the stack and a is read
What can happen to X?
  It can be popped
  It may be replaced by one or more other stack symbols
    And so on
  The stack grows and shrinks and grows and shrinks
  Eventually, as more input is consumed, X must be popped (or we'll never reach an empty stack)
  And the state may change many times
We must track all of this!
-
PDA to CFG
STRATEGY 1
Assume L = N(P), where P = (Q, Σ, Γ, δ, q0, Z0, F), F is empty (accept by empty stack)
Key idea: units of PDA action have the net effect of popping one symbol from the stack, consuming some input, and making a state change.
The triple [qZp] is a CFG variable that generates exactly those strings w such that P can read w from the input, pop Z (net effect), and go from state q to state p.
  More precisely, (q, w, Z) |-* (p, ε, ε)
  As a consequence of the above, (q, wx, Zα) |-* (p, x, α) for any x and α.
-
It's a Zen thing
[qZp] is at once a triple involving states and symbols of P, and yet to the CFG we construct it is a single, indivisible object.
(OK, I know that's not a Zen thing, but you get the point)
-
Strategy
A popping rule, e.g., (p, ε) in δ(q, a, Z).
  [qZp] → a
  Pop Z, consume a
A rule that replaces one symbol and state by others, e.g., (p, Y) in δ(q, a, Z).
  For all states r in Q: [qZr] → a [pYr]
  Pop Z, consume a, move to state p, push Y
A rule that replaces one stack symbol by two, e.g., (p, XY) in δ(q, a, Z).
  For all states r and s in Q: [qZs] → a [pXr][rYs]
  Pop Z, consume a, move to state p, push XY; popping X ends in some state r, and popping Y then leads from r to s
There may be some states r that cannot be reached from p while popping X. True, but this does not affect the grammar, since the resulting variables are useless and do not change the language generated by the grammar
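The three rule shapes above are instances of one pattern, sketched below in Python under stated assumptions: `triple_grammar` is a hypothetical name, ε is written as the empty string, pushed strings are tuples with the top symbol first, and a variable [qZp] is represented by the tuple (q, Z, p). The sample PDA is the one used in the example that follows.

```python
from itertools import product

def triple_grammar(states, delta, q0, z0):
    """Return productions (head, body) of the triple construction.
    Terminals are strings; variables are (state, stack symbol, state) tuples."""
    # Acceptance by empty stack: start symbol S goes to [q0 Z0 p] for every state p.
    rules = [("S", ((q0, z0, p),)) for p in states]
    for (q, a, Z), moves in delta.items():
        for p, gamma in moves:
            # Guess the state reached after each pushed symbol has been popped.
            for guessed in product(states, repeat=len(gamma)):
                chain = (p,) + guessed
                body = ((a,) if a else ()) + tuple(
                    (chain[i], gamma[i], chain[i + 1]) for i in range(len(gamma)))
                rules.append(((q, Z, chain[-1]), body))
    return rules

states = ["q1", "q2", "q3"]
delta = {
    ("q1", "0", "Z0"): {("q1", ("X", "Z0"))},
    ("q2", "", "Z0"): {("q3", ())},
    ("q1", "0", "X"): {("q1", ("X", "X"))},
    ("q1", "1", "X"): {("q2", ())},
    ("q2", "1", "X"): {("q2", ())},
}
rules = triple_grammar(states, delta, "q1", "Z0")
print(len(rules))
```

A popping move contributes one production, a move that pushes k symbols contributes |Q|^k productions, which is why the generated grammar is large and mostly useless variables.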
-
δ(q, a, Z) = (p, Y)
[Figure: two copies of a transition from q to p labeled a, Z → Y (consume a, pop Z, push Y, move to state p). In one copy Y is not yet processed; in the other, processing Y leads on to an unknown state marked "?"]
Since we don't know which state the PDA will be in after processing Y, define a production [qZr_n] → a [pYr_n] that ends in each possible state r_n
-
Example
PDA with transitions:
  δ(q1, 0, Z0) = {(q1, XZ0)}
  δ(q2, ε, Z0) = {(q3, ε)}
  δ(q1, 0, X) = {(q1, XX)}
  δ(q1, 1, X) = {(q2, ε)}
  δ(q2, 1, X) = {(q2, ε)}
Productions:
  S → [q1Z0q3]
  [q1Z0q3] → 0 [q1Xq1] [q1Z0q3]
  [q1Z0q3] → 0 [q1Xq2] [q2Z0q3]
  [q1Z0q3] → 0 [q1Xq3] [q3Z0q3]
  [q1Xq1] → 0 [q1Xq1] [q1Xq1]
  [q1Xq1] → 0 [q1Xq2] [q2Xq1]
  [q1Xq1] → 0 [q1Xq3] [q3Xq1]
  [q1Xq2] → 0 [q1Xq1] [q1Xq2]
  [q1Xq2] → 0 [q1Xq2] [q2Xq2]
  [q1Xq2] → 0 [q1Xq3] [q3Xq2]
  [q1Xq2] → 1
  [q2Xq2] → 1
  [q2Z0q3] → ε
Impossible configurations involving q3 are dimmed
Derivation of 000111:
  S ⇒ [q1Z0q3]
    ⇒ 0 [q1Xq2] [q2Z0q3]
    ⇒ 0 0 [q1Xq2] [q2Xq2] [q2Z0q3]
    ⇒ 0 0 0 [q1Xq2] [q2Xq2] [q2Xq2] [q2Z0q3]
    ⇒ 0 0 0 1 [q2Xq2] [q2Xq2] [q2Z0q3]
    ⇒ 0 0 0 1 1 [q2Xq2] [q2Z0q3]
    ⇒ 0 0 0 1 1 1 [q2Z0q3]
    ⇒ 0 0 0 1 1 1
-
Example
PDA with transitions
  δ(q0, a, Z0) = {(q0, AZ0)}
  δ(q0, b, Z0) = {(q0, BZ0)}
  δ(q0, a, A) = {(q0, AA)}
  δ(q0, b, A) = {(q0, BA)}
  δ(q0, a, B) = {(q0, AB)}
  δ(q0, b, B) = {(q0, BB)}
  δ(q0, c, Z0) = {(q1, Z0)}
  δ(q0, c, A) = {(q1, A)}
  δ(q0, c, B) = {(q1, B)}
  δ(q1, a, A) = {(q1, ε)}
  δ(q1, b, B) = {(q1, ε)}
  δ(q1, ε, Z0) = {(q1, ε)}
Productions:
  S → [q0Z0q]
  [q0Z0q] → a [q0Ap] [pZ0q]
  [q0Z0q] → b [q0Bp] [pZ0q]
  [q0Aq] → a [q0Ap] [pAq]
  [q0Aq] → b [q0Bp] [pAq]
  [q0Bq] → a [q0Ap] [pBq]
  [q0Bq] → b [q0Bp] [pBq]
  [q0Z0q] → c [q1Z0q]
  [q0Aq] → c [q1Aq]
  [q0Bq] → c [q1Bq]
  [q1Aq1] → a
  [q1Bq1] → b
  [q1Z0q1] → ε
In the above, q and p can each be either q0 or q1
-
The Full Story
If we specify every non-terminal in terms of all possible states, the expansion contains all of the productions below
  S → [q0Z0q0]
  S → [q0Z0q1]
  [q0Z0q0] → a [q0Aq0] [q0Z0q0]
  [q0Z0q0] → a [q0Aq1] [q1Z0q0]
  [q0Z0q1] → a [q0Aq0] [q0Z0q1]
  [q0Z0q1] → a [q0Aq1] [q1Z0q1]
  [q0Z0q0] → b [q0Bq0] [q0Z0q0]
  [q0Z0q0] → b [q0Bq1] [q1Z0q0]
  [q0Z0q1] → b [q0Bq0] [q0Z0q1]
  [q0Z0q1] → b [q0Bq1] [q1Z0q1]
  [q0Aq0] → a [q0Aq0] [q0Aq0]
  [q0Aq0] → a [q0Aq1] [q1Aq0]
  [q0Aq1] → a [q0Aq0] [q0Aq1]
  [q0Aq1] → a [q0Aq1] [q1Aq1]
  [q0Aq0] → b [q0Bq0] [q0Aq0]
  [q0Aq0] → b [q0Bq1] [q1Aq0]
  [q0Aq1] → b [q0Bq0] [q0Aq1]
  [q0Aq1] → b [q0Bq1] [q1Aq1]
  [q0Bq0] → a [q0Aq0] [q0Bq0]
  [q0Bq0] → a [q0Aq1] [q1Bq0]
  [q0Bq1] → a [q0Aq0] [q0Bq1]
  [q0Bq1] → a [q0Aq1] [q1Bq1]
  [q0Bq0] → b [q0Bq0] [q0Bq0]
  [q0Bq0] → b [q0Bq1] [q1Bq0]
  [q0Bq1] → b [q0Bq0] [q0Bq1]
  [q0Bq1] → b [q0Bq1] [q1Bq1]
  [q0Z0q0] → c [q1Z0q0]
  [q0Z0q1] → c [q1Z0q1]
  [q0Aq0] → c [q1Aq0]
  [q0Aq1] → c [q1Aq1]
  [q0Bq0] → c [q1Bq0]
  [q0Bq1] → c [q1Bq1]
  [q1Aq1] → a
  [q1Bq1] → b
  [q1Z0q1] → ε
-
Deriving bacab
  S → [q0Z0q1]
  [q0Z0q1] → a [q0Aq1] [q1Z0q1]
  [q0Z0q1] → b [q0Bq1] [q1Z0q1]
  [q0Aq1] → a [q0Aq1] [q1Aq1]
  [q0Aq1] → b [q0Bq1] [q1Aq1]
  [q0Bq1] → a [q0Aq1] [q1Bq1]
  [q0Bq1] → b [q0Bq1] [q1Bq1]
  [q0Z0q1] → c [q1Z0q1]
  [q0Aq1] → c [q1Aq1]
  [q0Bq1] → c [q1Bq1]
  [q1Aq1] → a
  [q1Bq1] → b
  [q1Z0q1] → ε
Only some of the productions in the generated grammar will allow for a derivation; the rest are unnecessary
PDA moves
  (q0, bacab, Z0) |- (q0, acab, BZ0)
    |- (q0, cab, ABZ0)
    |- (q1, ab, ABZ0)
    |- (q1, b, BZ0)
    |- (q1, ε, Z0)
    |- (q1, ε, ε)
Corresponding leftmost derivation
  S ⇒ [q0,Z0,q1]
    ⇒ b [q0,B,q1] [q1,Z0,q1]
    ⇒ ba [q0,A,q1] [q1,B,q1] [q1,Z0,q1]
    ⇒ bac [q1,A,q1] [q1,B,q1] [q1,Z0,q1]
    ⇒ baca [q1,B,q1] [q1,Z0,q1]
    ⇒ bacab [q1,Z0,q1]
    ⇒ bacab
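The sequence of PDA moves can be reproduced mechanically. The sketch below hard-codes this example's transition table (the name `trace` and the tuple encoding of the stack are assumptions made here) and relies on the fact that this particular PDA never has a choice of move.

```python
# Transition table of the example PDA: (state, input or '', stack top) -> (state, pushed symbols).
DELTA = {
    ("q0", "a", "Z0"): ("q0", ("A", "Z0")), ("q0", "b", "Z0"): ("q0", ("B", "Z0")),
    ("q0", "a", "A"): ("q0", ("A", "A")),   ("q0", "b", "A"): ("q0", ("B", "A")),
    ("q0", "a", "B"): ("q0", ("A", "B")),   ("q0", "b", "B"): ("q0", ("B", "B")),
    ("q0", "c", "Z0"): ("q1", ("Z0",)),     ("q0", "c", "A"): ("q1", ("A",)),
    ("q0", "c", "B"): ("q1", ("B",)),
    ("q1", "a", "A"): ("q1", ()),           ("q1", "b", "B"): ("q1", ()),
    ("q1", "", "Z0"): ("q1", ()),
}

def trace(w):
    """Run the PDA on w; return (accepted by empty stack?, list of configurations)."""
    state, stack, pos = "q0", ("Z0",), 0
    configs = [(state, w, stack)]
    while stack:
        top = stack[0]
        if pos < len(w) and (state, w[pos], top) in DELTA:
            state, pushed = DELTA[(state, w[pos], top)]
            pos += 1
        elif (state, "", top) in DELTA:
            state, pushed = DELTA[(state, "", top)]
        else:
            break                      # no move available: the PDA is stuck
        stack = pushed + stack[1:]     # replace the top symbol by the pushed string
        configs.append((state, w[pos:], stack))
    return (not stack and pos == len(w)), configs

accepted, configs = trace("bacab")
for config in configs:
    print(config)
```

Each printed configuration corresponds to one line of the PDA moves listed above, with the stack written top first.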
-
PDA to CFG
STRATEGY 2
Convert PDA P into CFG G
Modify P to be a normalized PDA N so that:
  It has a single accept state, q_accept
    Create ε-transitions from old accept states to this new accept state
  It empties the stack before accepting
    Push a special character $ on the stack in the start state (introducing a new start state in the process)
    Introduce a new temporary state q_temp that replaces q_accept, which has transitions popping all characters from the stack (except $)
    Introduce the transition from q_temp to q_accept labeled ε, $ → ε
-
Continued
Each transition either pushes a symbol onto the stack or pops one off the stack, but it does not do both at the same time
  Replace a simultaneous pop/push move with a 2-transition rule that goes through a new state
    E.g. (read a from input, pop b from stack, push c): introduce special state q_temp plus 2 transitions, one doing pop and one doing push:
      q_i --(a, b → c)--> q_j   becomes   q_i --(a, b → ε)--> q_temp --(ε, ε → c)--> q_j
  Replace a transition that neither pops nor pushes with two transitions that push and then immediately pop some newly-created dummy stack symbol
      q_i --(a, ε → ε)--> q_j   becomes   q_i --(a, ε → X)--> q_temp --(ε, X → ε)--> q_j
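The two replacement rules can be expressed as a short Python sketch. Everything here is an illustrative assumption: transitions are tuples (source, input, pop, push, target) with ε written as the empty string, the fresh states are named t0, t1, ... and the dummy symbol D is assumed not to occur in the stack alphabet.

```python
from itertools import count

def normalize(transitions, dummy="D"):
    """Rewrite transitions so that each one either pushes or pops exactly one symbol."""
    fresh = (f"t{i}" for i in count())   # hypothetical names for the new intermediate states
    result = []
    for src, a, pop, push, dst in transitions:
        if pop and push:
            # Simultaneous pop/push: pop first, then push from a new state.
            mid = next(fresh)
            result += [(src, a, pop, "", mid), (mid, "", "", push, dst)]
        elif not pop and not push:
            # Stack untouched: push a dummy symbol and pop it immediately.
            mid = next(fresh)
            result += [(src, a, "", dummy, mid), (mid, "", dummy, "", dst)]
        else:
            result.append((src, a, pop, push, dst))
    return result

normalized = normalize([
    ("qi", "a", "b", "c", "qj"),   # read a, pop b, push c
    ("qi", "a", "", "", "qj"),     # read a, leave the stack alone
    ("qi", "a", "", "X", "qj"),    # already a pure push
])
for transition in normalized:
    print(transition)
```

The input is read on the first of the two new transitions in both cases, so the language accepted is unchanged.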
-
Normalizing the PDA
EXAMPLE
[Figure: state diagram of the example PDA N]
L (N) = (ba + b aa)*b) +
-
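Pure Push Pop
Make sure the stack is always active by replacing inactive stack moves by a push followed by immediate pop of a dummy symbol
[Figure: the example PDA before and after this step; the move that reads a without touching the stack becomes a push of the dummy symbol D followed by a pop of D]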
-
Pure Push Pop
Any move that replaces the top letter on the stack should be changed into a pop followed by a push
[Figure: the example PDA before and after this step; the move that reads a and replaces X by Y becomes a pop of X (reading a) followed by a push of Y]
-
Unique Accept State
Turn off original accept states and connect to a new accept state
(Remember: each move must either push or pop from the stack)
[Figure: the example PDA before and after this step; the old accept states are connected to the new accept state by a push of the dummy symbol D followed by a pop of D]
-
Empty Stack
Make sure the stack empties its content by adding a new dummy empty stack symbol and new start/accept states
[Figure: the example PDA with a new start state, a new accept state, and moves that pop the remaining stack symbols (D, $, X, Y) before accepting]
-
PDA to CFG
INTUITIVE DESCRIPTION
Consider normalized PDA N = (Q, Σ, Γ, δ, q_init, q_accept)
  Starts in q_init with an empty stack
  Ends in q_accept with an empty stack
In general, we can define the language L_pq, for any two states p, q ∈ Q, as the language of all strings that take N from p with an empty stack to q with an empty stack
For each pair of states p and q, define a symbol S_pq in the CFG for the language L_pq
The language of N is L_pq with p = q_init and q = q_accept
-
Steps to process w ∈ L_pq
Two possibilities:
1. During the processing of w the stack becomes empty at some intermediate state r
   This means a word of L_pq can be formed by concatenating a word of L_pr (which brought N from state p to state r with an empty stack) and a word of L_rq (that took N from r to q)
2. The stack is never empty in the middle of N's transit from p to q in processing w
   The first transition (from, say, p to p1) must have been a push, and the last transition (from, say, q1 to q) must have been a pop, and the pop popped exactly the symbol pushed by the first transition from p to p1
-
In other words, if the PDA read a from input as it moved from p to p1, and read b as it moved from q1 to q, then w = ayb, where y is an input that causes the PDA N to start from p1 with an empty stack and end in q1 with an empty stack, i.e., y ∈ L_p1q1
Formally, if there is a push transition (pushing X onto the stack) from p to p1 (reading a) and a pop transition from q1 to q (popping X and reading b), then a word in L_pq can be constructed from the expression a L_p1q1 b
Note that either or both of a and b could be ε
-
The Construction
For every state p, introduce the rule A_pp → ε
  The empty string can always be considered as getting you from p to p without doing anything to the stack, since nothing was read
CONCATENATION RULE: For the case where the stack empties in the middle of the transition from p to q, introduce, for all states p, q, r of N, the rule A_pq → A_pr A_rq
-
RECURSION RULE: Case where the stack is never empty: for any given states p, p1, q1, r of N, such that there is a push transition from p to p1 and a pop transition from q1 to r (that push and pop the same symbol), introduce an appropriate rule
Formally, for p, p1, q1, r of N with transitions of the form
  p --(a, ε → X)--> p1   (push X)
  q1 --(b, X → ε)--> r   (pop X)
introduce the rule A_pr → a A_p1q1 b
-
Formal Definition
FROM SIPSER
P = (Q, Σ, Γ, δ, q0, q_accept)
Non-terminals of G are {A_pq | p, q ∈ Q}
Rules:
  For each p, q, r, s ∈ Q, t ∈ Γ, and a, b ∈ Σ ∪ {ε}, if δ(p, a, ε) contains (r, t) and δ(s, b, t) contains (q, ε), put the rule A_pq → a A_rs b in G
  For each p, q, r ∈ Q, put the rule A_pq → A_pr A_rq in G.
  For each p ∈ Q, put the rule A_pp → ε in G.
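The three rule schemas translate almost literally into code. The sketch below is one possible encoding, not a reference implementation: `sipser_grammar` is a hypothetical name, transitions are tuples (source, input, pop, push, target) with ε as the empty string, each transition is assumed to push or pop exactly one symbol, and the variable A_pq is represented by the pair (p, q). The sample PDA is the parenthesis-matching automaton used in the worked example later in the deck.

```python
def sipser_grammar(states, transitions):
    """Return the set of productions (head, body); terminals are strings, variables are pairs."""
    rules = set()
    for p in states:
        rules.add(((p, p), ()))                            # A_pp -> ε
    for p in states:
        for q in states:
            for r in states:
                rules.add(((p, q), ((p, r), (r, q))))      # A_pq -> A_pr A_rq
    pushes = [t for t in transitions if t[3]]
    pops = [t for t in transitions if t[2]]
    for p, a, _, pushed, r in pushes:
        for s, b, popped, _, q in pops:
            if pushed == popped:                           # same stack symbol t
                body = tuple(x for x in (a, (r, s), b) if x != "")
                rules.add(((p, q), body))                  # A_pq -> a A_rs b
    return rules

states = ["q", "r", "s"]
transitions = [
    ("q", "", "", "$", "r"),    # push $
    ("r", "(", "", "X", "r"),   # read (, push X
    ("r", ")", "X", "", "r"),   # read ), pop X
    ("r", "", "$", "", "s"),    # pop $
]
rules = sipser_grammar(states, transitions)
print(len(rules))
```

Most of the generated concatenation rules involve variables that derive nothing, which is why the worked example prunes the variable set before writing the grammar down.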
-
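CONCATENATION RULE
A_pq → A_pr A_rq
[Figure: stack height plotted against the input string for a run from p to q; the stack is empty again at the intermediate state r, so the input read before r is generated by A_pr, the input read after r is generated by A_rq, and the whole string is generated by A_pq]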
-
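RECURSION RULE
A_pq → a A_rs b
[Figure: stack height plotted against the input string for a run from p to q; the first move reads a and goes from p to r, the last move reads b and goes from s to q, and the stack never empties in between; the middle part is generated by A_rs and the whole string by A_pq]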
-
PDA → CFG
If paths for strings that are accepted by the PDA start and end with an empty stack, it is possible to consider any such path between any two states and recursively generate all such paths
This recursive relationship between paths will give rise to the recursion at the heart of the representative context free grammar
-
The Grammar
The rules for generating paths give a grammar to generate all labels of such paths
The grammar has non-terminals A_qr which will generate all strings x that are processed when passing from state q to state r
Q: Under this assumption, what should the production body (right hand side) for the start variable S be?
-
The Grammar Symbols
A: S = A_pq with p = q_init and q = q_accept, where q_init is the start state and q_accept is the final state
In addition to this start variable, the other variables are all A_qr for which there is a path going from q to r that starts and ends with an empty stack
Note that Sipser doesn't require the extra condition that there be a path from q to r which starts and ends with an empty stack; his method generates all possible combinations. However, those pairs q, r for which no such path exists will create useless variables A_qr, which end up cluttering the grammar and making the construction extremely ugly, even on the simplest PDAs. On the other hand, it is not obvious how one would determine a priori which of the pairs don't have such paths, which probably explains why Sipser didn't include this condition.
-
Grammar Rules
1. BASIS RULE: Add a production A_qq → ε for each state q in the PDA
2. CONCATENATION RULE: Add a production A_pr → A_pq A_qr for all p, q, r when A_pr, A_pq and A_qr are all in V.
3. RECURSION RULE: Add a production A_ps → a A_qr b for all p, s, q, r when
   A_ps and A_qr are in V
   Transitions (q, X) ∈ δ(p, a, ε) and (s, ε) ∈ δ(r, b, X) for the same stack symbol X exist in the PDA
-
Example
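PDA in the normalized form:
[Figure: states q, r, s; transition q → r labeled ε, ε → $; loops on r labeled (, ε → X and ), X → ε; transition r → s labeled ε, $ → ε]
Q: What is the accepted language?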
-
A: CNP = correctly nested parentheses, including sets of pairs [e.g., ()(())]. The number of X's on the stack reflects how deep the current nesting is.
Q: What are the variables for the equivalent grammar? What is the start variable?
-
A: V = {A_qs, A_qq, A_rr, A_ss}, S = A_qs
We don't need A_rq, A_sq, A_sr because the paths go in the wrong direction
We don't need A_qr or A_rs because the PDA can't add or remove $ while at r
  I.e., there is no path between those states that both begins and ends with an empty stack
-
Productions from the Base Rule
The empty string can always be considered as getting you from p to p without doing anything to the stack, since nothing was read
  A_qq → ε
  A_rr → ε
  A_ss → ε
-
Productions from the Concatenation Rule
If you can get from some state p to another state r starting and ending with the stack empty (regardless of stack activity in the process of moving from p to r), and from r to q under the same conditions, then combine the paths to get a path from p to q.
  A_qs → A_qq A_qs | A_qs A_ss
  A_qq → A_qq A_qq
  A_rr → A_rr A_rr
  A_ss → A_ss A_ss
State pairs (q, r) and (r, s) do not satisfy the condition that the stack is empty in both states
-
Productions from the Recursion Rule
For any given states p, p1, q1, q of N, such that there is a push transition from p to p1 and a pop transition from q1 to q (that push and pop the same symbol), i.e., δ(p, a, ε) contains (p1, X) and δ(q1, b, X) contains (q, ε), put the rule A_pq → a A_p1q1 b
  δ(q, ε, ε) contains (r, $) and δ(r, ε, $) contains (s, ε), giving A_qs → ε A_rr ε = A_rr
  δ(r, (, ε) contains (r, X) and δ(r, ), X) contains (r, ε), giving A_rr → ( A_rr )
-
Full Grammar
A_qs → A_rr | A_qq A_qs | A_qs A_ss
A_rr → ε | A_rr A_rr | ( A_rr )
A_qq → ε | A_qq A_qq
A_ss → ε | A_ss A_ss
-
Simplifications
Apart from their ε-productions, A_qq and A_ss are purely self-referential, so the only string that can be derived from either of them is ε.
We can therefore remove the variables A_qq and A_ss, replacing each occurrence by ε
  A_qs → A_rr | A_qq A_qs | A_qs A_ss
  A_rr → ε | A_rr A_rr | ( A_rr )
  A_qq → ε | A_qq A_qq
  A_ss → ε | A_ss A_ss
Becomes:
  A_qs → A_rr | A_qs
  A_rr → ε | A_rr A_rr | ( A_rr )
-
Showing that the grammar works
  A_qs → A_rr | A_qs
  A_rr → ε | A_rr A_rr | ( A_rr )
Rename variables to get:
  S → T | S
  T → ε | TT | (T)
S isn't needed as its whole purpose is to get you to T
So the final (cleaned up) grammar is
  T → ε | TT | (T)
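A quick way to convince yourself that the cleaned-up grammar generates exactly the correctly nested strings is to test membership directly. The recursive check below is only a sketch for small inputs (the name `derives_T` is made up here); it tries each of the three productions in turn and memoizes the results.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def derives_T(w):
    """Return True if T =>* w for the grammar T -> ε | TT | (T)."""
    if w == "":
        return True                                    # T -> ε
    if len(w) >= 2 and w[0] == "(" and w[-1] == ")" and derives_T(w[1:-1]):
        return True                                    # T -> (T)
    # T -> TT; only splits into two non-empty parts are needed,
    # because a split with an empty part derives the same string as T alone.
    return any(derives_T(w[:i]) and derives_T(w[i:]) for i in range(1, len(w)))

for sample in ["", "()", "()(())", "(()", ")("]:
    print(repr(sample), derives_T(sample))
```

Restricting T → TT to non-empty splits is what keeps the recursion finite, since every recursive call is on a strictly shorter string.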
-
Another Example
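Consider the language L = {wcw^R | w ∈ {a,b}*}. A non-normalized PDA for this language is
[Figure: states s and f; loops on s labeled a, ε → a and b, ε → b; transition s → f labeled c, ε → ε; loops on f labeled a, a → ε and b, b → ε]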
-
Convert to Normalized Form
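1. Create new start and accepting states
2. All transitions either pop or push except c, ε → ε; change it to 2 transitions that push and pop a dummy symbol
[Figure: normalized PDA with a new start state (written s' below), the original states s and f, a new intermediate state q, and a new accept state a; the new start state pushes $ on the way to s, s keeps its loops that push a and b, the move on c pushes the dummy symbol D on the way to q and pops it on the way to f, f keeps its loops that pop a and b, and $ is popped on the way to the accept state a]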
-
Generate Grammar
1. Add a start symbol and a production A_qq → ε for each state q in the PDA
  S → A_s'a
  A_s's' → ε
  A_ss → ε
  A_qq → ε
  A_ff → ε
  A_aa → ε
-
Generate Grammar
2. Add a production A_pr → A_pq A_qr for all p, q, r when A_pr, A_pq and A_qr are all in V
  A_s'a → A_s's' A_s'a | A_s'a A_aa
-
Generate Grammar
3. Add a production A_ps → a A_qr b for all p, s, q, r when A_ps and A_qr are in V and transitions (q, X) ∈ δ(p, a, ε), (s, ε) ∈ δ(r, b, X) for the same stack symbol X exist in the PDA
  A_s'a → $ A_sf $
  A_sf → c A_qq
  A_sf → b A_sf b | a A_sf a
-
Final Grammar
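Final grammar             More readable    Simplified
S → A_s'a                 S → T
A_s's' → ε                R → ε
A_ss → ε                  U → ε
A_qq → ε                  V → ε
A_ff → ε                  W → ε
A_aa → ε                  X → ε
A_s'a → A_s's' A_s'a      T → RT
A_s'a → A_s'a A_aa        T → TX
A_s'a → $ A_sf $          T → $Z$          T → $Z$
A_sf → c A_qq             Z → cV           Z → c
A_sf → b A_sf b           Z → bZb          Z → bZb
A_sf → a A_sf a           Z → aZa          Z → aZa
R, U, V, W, X contribute only ε so can be eliminated
T → RT and T → TX then become T → T, which is obviously unnecessary
S is superfluous because it only gets you to T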
S is superfluous because it only getsyou to T
-
Deterministic PDAs
Intuitively: never a choice of move
  δ(q, a, Z) has at most one member for any q, a, Z (including a = ε).
  If δ(q, ε, Z) is nonempty, then δ(q, a, Z) must be empty for all input symbols a.
Why Care?
  Parsers, as in YACC, are really DPDA's.
  Thus, the question of what languages a DPDA can accept is really the question of what programming language syntax can be parsed conveniently.
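The two determinism conditions are easy to check mechanically. The following is a small sketch with hypothetical names; `delta` is assumed to map (state, input symbol or '' for ε, stack top) to a set of moves.

```python
def is_deterministic(delta, input_alphabet):
    """Check both DPDA conditions on a transition table."""
    for (q, a, Z), moves in delta.items():
        if len(moves) > 1:
            return False          # more than one move for the same (q, a, Z)
        if a == "" and moves:
            # An ε-move on (q, Z) forbids any input-consuming move on (q, Z).
            if any(delta.get((q, c, Z)) for c in input_alphabet):
                return False
    return True

# One-state PDA built from a grammar: several bodies to choose from, so not deterministic.
one_state = {("q", "", "S"): {("q", "a"), ("q", "aS")}, ("q", "a", "a"): {("q", "")}}
# Matching phase of a wcw^R recognizer: exactly one move per situation.
matcher = {("q1", "a", "A"): {("q1", "")}, ("q1", "b", "B"): {("q1", "")},
           ("q1", "", "Z0"): {("q1", "")}}
# An ε-move competing with an input move on the same state and stack top.
conflict = {("q", "", "Z"): {("q", "")}, ("q", "a", "Z"): {("q", "Z")}}

print(is_deterministic(one_state, "ab"), is_deterministic(matcher, "ab"),
      is_deterministic(conflict, "ab"))
```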
-
Some Language Relationships
Acceptance by empty stack is hard for a DPDA
  Once it accepts, it dies and cannot accept any continuation.
  Thus, N(P) has the prefix property: if w is in N(P), then wx is NOT in N(P) for any x ≠ ε.
However, parsers do accept by emptying their stack
  Trick: they really process strings followed by a unique endmarker (typically $), e.g., if they accept w$, they consider w to be a correct program.
-
If L is a regular language, then L is a DPDA language
  A DPDA can simulate a DFA, without using its stack (acceptance by final state).
If L is a DPDA language, then L is a CFL that is not inherently ambiguous
  A DPDA yields an unambiguous grammar in the standard construction
Interesting fact: The class of languages accepted by NPDAs is strictly larger than the class accepted by DPDAs!
-
PDA more powerful than FA
[Figure: nested regions, from innermost to outermost: languages accepted by FA or NFA; languages accepted by deterministic PDA; languages accepted by nondeterministic PDA]
-
Cleaning Up Grammars
We can "simplify" grammars to a great extent, e.g.:
1. Get rid of useless symbols -- those that do not participate in any derivation of a terminal string.
2. Get rid of ε-productions -- those of the form variable → ε.
   But you lose the ability to generate ε as a string in the language.
3. Get rid of unit productions -- those of the form variable → variable.
Any CFG can be converted via these and other methods to Chomsky Normal Form: the only production forms are variable → two variables and variable → terminal.
-
Getting Rid of the Empty String
The empty string is a nuisance with grammars and languages in general
We will look at languages that do not contain ε
No loss of generality:
  For language L, let G = (V,T,S,P) be a CFG that generates L - {ε}
  Modify the grammar by adding a new start variable S0 and the productions S0 → S | ε
  This grammar generates L ∪ {ε}, which equals L whenever ε is in L
  Therefore any non-trivial conclusion we make for L - {ε} should transfer to L
-
Useless Symbols
In order for a symbol X to be useful, it must:
1. Derive some terminal string (possibly X is a terminal).
2. Be reachable from the start symbol; i.e., S ⇒* αXβ.
Note that X wouldn't really be useful if α or β included a symbol that didn't satisfy (1), so it is important that (1) be tested first, and symbols that don't derive terminal strings be eliminated before testing (2).
-
Finding Symbols That Don't Derive Any Terminal String
Recursive construction:
  Basis: A terminal surely derives a terminal string.
  Induction: If A is the head of a production whose body is X1 X2 ... Xk, and each Xi is known to derive a terminal string, then surely A derives a terminal string.
Keep going until no more symbols that derive terminal strings are discovered.
-
Example
S → AB | C
A → 0B | C
B → 1 | A0
C → AC | C1
Round 1: 0 and 1 are "in."
Round 2: B → 1 says B is in.
Round 3: A → 0B says A is in.
Round 4: S → AB says S is in.
Round 5: Nothing more can be added.
Thus, C can be eliminated, along with any production that mentions it, leaving S → AB; A → 0B; B → 1 | A0.
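The rounds above are a fixed-point computation, which the following Python sketch mirrors. The function names are hypothetical and each symbol is assumed to be a single character, so a production body can be written as a string.

```python
def generating_symbols(productions, terminals):
    """Symbols that derive some terminal string, found by iterating to a fixed point."""
    generating = set(terminals)                 # basis: every terminal is "in"
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head not in generating and any(
                    all(s in generating for s in body) for body in bodies):
                generating.add(head)            # induction step
                changed = True
    return generating

def drop_nongenerating(productions, terminals):
    """Remove non-generating variables and every production that mentions one."""
    gen = generating_symbols(productions, terminals)
    return {head: [b for b in bodies if all(s in gen for s in b)]
            for head, bodies in productions.items() if head in gen}

grammar = {"S": ["AB", "C"], "A": ["0B", "C"], "B": ["1", "A0"], "C": ["AC", "C1"]}
print(sorted(generating_symbols(grammar, "01")))
print(drop_nongenerating(grammar, "01"))
```

The order in which variables are discovered depends on iteration order, but the final set does not.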
-
Finding Symbols That Can't Be Derived From the Start Symbol
Another recursive algorithm:
  Basis: S is "in."
  Induction: If variable A is in, then so is every symbol in the production bodies for A.
Keep going until no more symbols derivable from S can be found.
-
Example
S → AB
A → 0B
B → 1 | A0
Round 1: S is in.
Round 2: A and B are in.
Round 3: 0 and 1 are in.
Round 4: Nothing can be added.
In this case, all symbols are derivable from S, so no change to the grammar.
The book has an example where not only are there symbols not derivable from S, but you must first eliminate the symbols that don't derive terminal strings, or you get the wrong grammar.
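Reachability from the start symbol is an ordinary graph search. A minimal sketch, with a made-up function name and single-character symbols, is shown below; the second grammar adds a hypothetical variable D that nothing refers to, to show a symbol being left out.

```python
def reachable_symbols(productions, start):
    """Symbols reachable from the start symbol through production bodies."""
    reachable, frontier = {start}, [start]      # basis: S is "in"
    while frontier:
        head = frontier.pop()
        for body in productions.get(head, []):  # terminals have no productions
            for symbol in body:
                if symbol not in reachable:
                    reachable.add(symbol)       # induction step
                    frontier.append(symbol)
    return reachable

grammar = {"S": ["AB"], "A": ["0B"], "B": ["1", "A0"]}
with_extra = dict(grammar, D=["0"])             # D is never mentioned in any body

print(sorted(reachable_symbols(grammar, "S")))
print("D" in reachable_symbols(with_extra, "S"))
```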
-
Eliminating -Productions
A variable A is nullable if A ⇒* ε. Find them by a recursive algorithm:
  Basis: If A → ε is a production, then A is nullable.
  Induction: If A is the head of a production whose body consists of only nullable symbols, then A is nullable.
Once we have the nullable symbols, we can add additional productions and then throw away the productions of the form A → ε for any A.
-
If A → X1 X2 ... Xk is a production, add all productions that can be formed by eliminating some or all of those Xi's that are nullable.
  But, don't eliminate all k if they are all nullable.
Example
  If A → BC is a production, and both B and C are nullable, add A → B | C
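Both steps, finding the nullable variables and expanding the productions, fit in a short Python sketch. The names are hypothetical, symbols are single characters, and ε is written as the empty string. The sample grammar extends the slide's example with made-up productions B → b | ε and C → c | ε so that B and C are actually nullable.

```python
from itertools import product

def nullable_variables(productions):
    """Variables A with A =>* ε, found by iterating to a fixed point."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            # An empty body satisfies all(...) vacuously, which is exactly the basis case.
            if head not in nullable and any(
                    all(s in nullable for s in body) for body in bodies):
                nullable.add(head)
                changed = True
    return nullable

def eliminate_epsilon(productions):
    """Add every variant with nullable symbols dropped, then discard empty bodies."""
    nullable = nullable_variables(productions)
    result = {}
    for head, bodies in productions.items():
        new_bodies = set()
        for body in bodies:
            # Each nullable symbol may be kept or dropped; other symbols are always kept.
            options = [(s, "") if s in nullable else (s,) for s in body]
            for choice in product(*options):
                candidate = "".join(choice)
                if candidate:                   # never drop all k symbols
                    new_bodies.add(candidate)
        result[head] = new_bodies
    return result

grammar = {"A": ["BC"], "B": ["b", ""], "C": ["c", ""]}
print(sorted(nullable_variables(grammar)))
print(eliminate_epsilon(grammar))
```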
-
Eliminating Unit Productions
1. Eliminate useless symbols and ε-productions.
2. Discover those pairs of variables (A, B) such that A ⇒* B.
   Because there are no ε-productions, this derivation can only use unit productions.
   Thus, we can find the pairs by computing reachability in a graph where nodes = variables, and arcs = unit productions.
3. Replace each combination where A ⇒* B ⇒ α and α is other than a single variable by A → α.
   I.e., "short circuit" sequences of unit productions, which must eventually be followed by some other kind of production.
4. Remove all unit productions.
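Steps 2 to 4 can be sketched in a few lines of Python. The function name is hypothetical, symbols are single characters, and the expression grammar used to exercise it is a standard textbook-style illustration rather than one taken from these slides.

```python
def eliminate_units(productions):
    """Replace unit productions by the non-unit bodies they eventually lead to."""
    variables = set(productions)

    def is_unit(body):
        return len(body) == 1 and body in variables

    result = {}
    for a in variables:
        # Step 2: variables reachable from a using unit productions only.
        reach, frontier = {a}, [a]
        while frontier:
            v = frontier.pop()
            for body in productions[v]:
                if is_unit(body) and body not in reach:
                    reach.add(body)
                    frontier.append(body)
        # Steps 3 and 4: keep every non-unit body of every reachable variable.
        result[a] = {body for b in reach for body in productions[b]
                     if not is_unit(body)}
    return result

grammar = {"E": ["T", "E+T"], "T": ["F", "T*F"], "F": ["a", "(E)"]}
print(eliminate_units(grammar))
```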
-
Chomsky Normal Form
1. Get rid of useless symbols, ε-productions, and unit productions (already done).
2. Get rid of productions whose bodies are mixes of terminals and variables, or consist of more than one terminal.
3. Break up production bodies longer than 2.
Result: All productions are of the form A → BC or A → a
-
No Mixed Bodies
1. For each terminal a, introduce a new variable A_a, with one production A_a → a.
2. Replace a in any body where it is not the entire body by A_a.
Now, every body is either a single terminal or it consists only of variables.
Example
  A → 0B1 becomes A_0 → 0; A_1 → 1; A → A_0 B A_1
-
Making Bodies Short
If we have a production like A → BCDE, we can introduce some new variables that allow the variables of the body to be introduced one at a time.
  A body of length k requires k - 2 new variables.
Example
  Introduce F and G; replace A → BCDE by A → BF; F → CG; G → DE.
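The last two Chomsky Normal Form steps (no mixed bodies, then short bodies) can be sketched as follows. The function names and the generated variable names (`A_0`, `N1`, ...) are assumptions made for illustration and are taken not to clash with existing variables; productions are (head, body) pairs with bodies as tuples of symbols. As a small variation on the slide, a variable A_a is only introduced for terminals that actually occur in a longer body.

```python
def remove_mixed_bodies(rules, terminals):
    """Replace terminals inside bodies of length >= 2 by new variables A_a with A_a -> a."""
    out, used = [], set()
    for head, body in rules:
        if len(body) >= 2:
            used |= {s for s in body if s in terminals}
            body = tuple(f"A_{s}" if s in terminals else s for s in body)
        out.append((head, body))
    out += [(f"A_{a}", (a,)) for a in sorted(used)]
    return out

def shorten_bodies(rules):
    """Break bodies longer than 2 into a chain, using k - 2 new variables for length k."""
    out, counter = [], 0
    for head, body in rules:
        while len(body) > 2:
            counter += 1
            new_var = f"N{counter}"
            out.append((head, (body[0], new_var)))   # peel off the first symbol
            head, body = new_var, body[1:]
        out.append((head, body))
    return out

mixed = remove_mixed_bodies([("A", ("0", "B", "1"))], {"0", "1"})
short = shorten_bodies([("A", ("B", "C", "D", "E"))])
print(mixed)
print(short)
```

Running `shorten_bodies` on the output of `remove_mixed_bodies` gives a grammar in which every body is a single terminal or exactly two variables, provided the earlier cleanup steps have already been applied.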
-
Summary Theorem
If L is any CFL, there is a grammar G that generates L - {ε}, for which each production is of the form A → BC or A → a, and there are no useless symbols.