cfg-pda.ppt

Upload: sparaiso

Post on 02-Jun-2018


TRANSCRIPT

  • 8/11/2019 CFG-PDA.ppt

    1/68

    Equivalence of CFG's and PDA's

    A language is context free if and only if some pushdown automaton recognizes it.

    As usual with "if and only if" theorems, there are two directions to prove:

    1. If a language is context free, then some pushdown automaton recognizes it.

    2. If a pushdown automaton recognizes some language, then it is context free.

  • 8/11/2019 CFG-PDA.ppt

    2/68

    Only If (CFG to PDA)

    Let L = L(G) for some CFG G = (V, Σ, P, S).

    Idea: have PDA A simulate leftmost derivations in G, where a left-sentential form (LSF) is represented by:

    1. The sequence of input symbols that A has consumed from its input, followed by

    2. A's stack, top left-most

    Example: If (q, abcd, S) ⊢* (q, cd, ABC), then the LSF represented is abABC.

  • 8/11/2019 CFG-PDA.ppt

    3/68

    Moves of A

    If a terminal a is on top of the stack, then there had better be an a waiting on the input. A consumes a from the input and pops it from the stack. The LSF represented doesn't change!

    If a variable B is on top of the stack, then PDA A has a choice of replacing B on the stack by the body of any production with head B.

  • 8/11/2019 CFG-PDA.ppt

    4/68

    Defining the PDA

    Define PDA A as follows:

    Q contains a single state, q
    Σ contains the terminal symbols of the grammar
    Γ contains all terminal and non-terminal symbols from the grammar
    F is the empty set (A terminates by empty stack)
    Start stack symbol is the distinguished (start) symbol of the grammar
    δ is defined as follows:

    For each production X → α in the grammar, create a move: δ(q, ε, X) contains (q, α)

    For each terminal symbol a in the grammar, create a move: δ(q, a, a) = (q, ε)
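    The construction above can be sketched in a few lines of Python. This is only an illustration: the dictionary encoding of δ, the function name cfg_to_pda, and the use of the empty string for ε are assumptions made here, not part of the slides. The grammar used is the one from the example that follows.

```python
from collections import defaultdict

EPS = ""  # assumption: the empty string stands for ε


def cfg_to_pda(productions, terminals):
    """Build δ for the single-state PDA that simulates leftmost derivations.

    productions: dict mapping a variable to a list of bodies (strings of symbols)
    terminals:   iterable of terminal symbols
    Returns a dict: (state, input symbol or EPS, stack top) -> set of (state, pushed string).
    """
    delta = defaultdict(set)
    for head, bodies in productions.items():
        for body in bodies:
            # Production head -> body: replace head on the stack, reading no input.
            delta[("q", EPS, head)].add(("q", body))
    for a in terminals:
        # Terminal on top of the stack: match it against the input and pop it.
        delta[("q", a, a)].add(("q", EPS))
    return dict(delta)


# S -> a | aS | bSS | SSb | SbS
delta = cfg_to_pda({"S": ["a", "aS", "bSS", "SSb", "SbS"]}, "ab")
for key in sorted(delta):
    print(key, "->", sorted(delta[key]))
```

    The result has one ε-move entry per variable (holding every production body) and one matching entry per terminal.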

  • 8/11/2019 CFG-PDA.ppt

    5/68

    Example

    S → a | aS | bSS | SSb | SbS

    PDA A = ({q}, {a,b}, {S,a,b}, δ, q, ∅, S)

    δ is defined as

    δ(q, ε, S) = { (q, a), (q, aS), (q, bSS), (q, SSb), (q, SbS) }
    δ(q, a, a) = (q, ε)
    δ(q, b, b) = (q, ε)

  • 8/11/2019 CFG-PDA.ppt

    6/68

    Processing of baa

    state  input  stack  move                     action
    q      baa    S      δ(q, ε, S) = (q, bSS)    Generate bSS
    q      baa    bSS    δ(q, b, b) = (q, ε)      Match b
    q      aa     SS     δ(q, ε, S) = (q, a)      Generate a
    q      aa     aS     δ(q, a, a) = (q, ε)      Match a
    q      a      S      δ(q, ε, S) = (q, a)      Generate a
    q      a      a      δ(q, a, a) = (q, ε)      Match a
    q      -      -      - accept -

    [Figure: parse tree with root S and children b, S, S, where each S derives a; the leaves b, a, a are each matched against the input b a a]
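    The trace above can be reproduced with a small search over configurations. The sketch below is an assumption-laden illustration, not the slides' method: it explores the nondeterministic choices with an explicit work list, and it relies on the grammar having no ε-productions so that any stack longer than the remaining input can be discarded.

```python
def accepts(productions, word, start="S"):
    """Does the single-state PDA for the grammar empty its stack on `word`?

    A configuration is (remaining input, stack) with the stack top at the left.
    Assumes no ε-productions: every stack symbol then needs at least one input
    symbol, so stacks longer than the remaining input are pruned. This also
    keeps the search finite for left-recursive productions.
    """
    seen = set()
    todo = [(word, start)]
    while todo:
        rest, stack = todo.pop()
        if (rest, stack) in seen or len(stack) > len(rest):
            continue
        seen.add((rest, stack))
        if not stack:
            if not rest:
                return True  # empty stack and all input consumed
            continue
        top, below = stack[0], stack[1:]
        if top in productions:
            # Variable on top: try every production body ("generate" move).
            for body in productions[top]:
                todo.append((rest, body + below))
        elif rest[0] == top:
            # Terminal on top: it must equal the next input symbol ("match" move).
            todo.append((rest[1:], below))
    return False


grammar = {"S": ["a", "aS", "bSS", "SSb", "SbS"]}
print(accepts(grammar, "baa"))
print(accepts(grammar, "ab"))
```

    For baa the search finds exactly the sequence of generate and match moves shown in the table; for ab every branch dies, so the string is rejected.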

  • 8/11/2019 CFG-PDA.ppt

    7/68

    Converting from PDA to CFG

    A PDA consumes a character

    A CFG generates a character

    We want to relate these two

    What happens when a PDA consumes a character?

    It may change state

    It may change the stack

  • 8/11/2019 CFG-PDA.ppt

    8/68

    Converting from PDA to CFG, continued

    Suppose X is on the stack and a is read. What can happen to X?

    It can be popped

    It may be replaced by one or more other stack symbols

    And so on. The stack grows and shrinks and grows and shrinks

    Eventually, as more input is consumed, X must be popped (or we'll never reach an empty stack)

    And the state may change many times

    We must track all of this!

  • 8/11/2019 CFG-PDA.ppt

    9/68

    PDA to CFG

    STRATEGY 1

    Assume L = N(P), where P = (Q, Σ, Γ, δ, q0, Z0, F), F is empty (accept by empty stack)

    Key idea: units of PDA action have the net effect of popping one symbol from the stack, consuming some input, and making a state change.

    The triple [qZp] is a CFG variable that generates exactly those strings w such that P can read w from the input, pop Z (net effect), and go from state q to state p.

    More precisely, (q, w, Z) ⊢* (p, ε, ε)

    As a consequence of the above, (q, wx, Zα) ⊢* (p, x, α) for any x and α.

  • 8/11/2019 CFG-PDA.ppt

    10/68

    It's a Zen thing

    [qZp] is at once a triple involving states and symbols of P, and yet to the CFG we construct it is a single, indivisible object.

    (OK, I know that's not a Zen thing, but you get the point)

  • 8/11/2019 CFG-PDA.ppt

    11/68

    Strategy

    A popping rule, e.g., (p, ε) in δ(q, a, Z).
    [qZp] → a
    Pop Z, consume a

    A rule that replaces one symbol and state by others, e.g., (p, Y) in δ(q, a, Z).
    For all states r in Q: [qZr] → a[pYr]
    Pop Z, consume a, move to state p, push Y

    A rule that replaces one stack symbol by two, e.g., (p, XY) in δ(q, a, Z).
    For all states r and s in Q: [qZs] → a[pXr][rYs]
    Pop Z, consume a, move to state p, push XY; popping X (net effect) leads to some state r, and popping Y then leads to s

    There may be some states r that cannot be reached from p while popping X. True, but this does not affect the grammar, since the resulting variables are useless and do not affect the language generated by the grammar.
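    The three rule shapes above are instances of one pattern: when a move pushes k symbols, guess the k states reached as each of them is popped. A minimal Python sketch follows, assuming a dictionary encoding of δ and tuples for the [qZp] variables; the function name triple_grammar and the encoding are hypothetical. It also adds S → [q0 Z0 p] for every state p, which the worked examples list only for the states that matter.

```python
from itertools import product


def triple_grammar(states, delta, start_state, start_symbol):
    """Return productions as (head, body) pairs; a variable [qZp] is the tuple (q, Z, p).

    delta maps (state, input symbol or "", stack top) to a set of
    (next state, tuple of pushed symbols), leftmost pushed symbol on top.
    """
    rules = [("S", ((start_state, start_symbol, p),)) for p in states]
    for (q, a, Z), moves in delta.items():
        for p, pushed in moves:
            prefix = (a,) if a else ()  # an ε-move contributes no terminal
            k = len(pushed)
            # Guess the k states reached as each pushed symbol is (net) popped.
            for rs in product(states, repeat=k):
                chain = (p,) + rs
                body = prefix + tuple(
                    (chain[i], pushed[i], chain[i + 1]) for i in range(k)
                )
                head = (q, Z, rs[-1] if k else p)
                rules.append((head, body))
    return rules


# A PDA for 0^n 1^n (n >= 1), accepting by empty stack.
states = ["q1", "q2", "q3"]
delta = {
    ("q1", "0", "Z0"): {("q1", ("X", "Z0"))},
    ("q1", "0", "X"): {("q1", ("X", "X"))},
    ("q1", "1", "X"): {("q2", ())},
    ("q2", "1", "X"): {("q2", ())},
    ("q2", "", "Z0"): {("q3", ())},
}
rules = triple_grammar(states, delta, "q1", "Z0")
print(len(rules), "productions")
```

    Most of the generated variables are useless, which is why hand-written versions of such grammars usually show only the productions that can take part in a derivation.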

  • 8/11/2019 CFG-PDA.ppt

    12/68

    δ(q, a, Z) = (p, Y)

    [Figure: transition from q to p labeled "a, Z → Y" (consume a, pop Z, push Y, move to state p). Immediately after the move, Y is not yet processed; the state reached once Y has been processed is unknown ("?")]

    Since we don't know which state the PDA will be in after processing Y, define a production [qZr_n] → a[pYr_n] that ends in each possible state r_n

  • 8/11/2019 CFG-PDA.ppt

    13/68

    Example

    PDA with transitions:

    δ(q1, 0, Z0) = {(q1, XZ0)}
    δ(q2, ε, Z0) = {(q3, ε)}
    δ(q1, 0, X) = {(q1, XX)}
    δ(q1, 1, X) = {(q2, ε)}
    δ(q2, 1, X) = {(q2, ε)}

    S → [q1Z0q3]
    [q1Z0q3] → 0[q1Xq1][q1Z0q3]
    [q1Z0q3] → 0[q1Xq2][q2Z0q3]
    [q1Z0q3] → 0[q1Xq3][q3Z0q3]
    [q1Xq1] → 0[q1Xq1][q1Xq1]
    [q1Xq1] → 0[q1Xq2][q2Xq1]
    [q1Xq1] → 0[q1Xq3][q3Xq1]
    [q1Xq2] → 0[q1Xq1][q1Xq2]
    [q1Xq2] → 0[q1Xq2][q2Xq2]
    [q1Xq2] → 0[q1Xq3][q3Xq2]
    [q1Xq2] → 1
    [q2Xq2] → 1
    [q2Z0q3] → ε

    Impossible configurations involving q3 are dimmed

    Derivation of 000111:

    S ⇒ [q1Z0q3]
    ⇒ 0[q1Xq2][q2Z0q3]
    ⇒ 00[q1Xq2][q2Xq2][q2Z0q3]
    ⇒ 000[q1Xq2][q2Xq2][q2Xq2][q2Z0q3]
    ⇒ 0001[q2Xq2][q2Xq2][q2Z0q3]
    ⇒ 00011[q2Xq2][q2Z0q3]
    ⇒ 000111[q2Z0q3]
    ⇒ 000111

  • 8/11/2019 CFG-PDA.ppt

    14/68

    Example

    PDA with transitions

    δ(q0, a, Z0) = {(q0, AZ0)}
    δ(q0, b, Z0) = {(q0, BZ0)}
    δ(q0, a, A) = {(q0, AA)}
    δ(q0, b, A) = {(q0, BA)}
    δ(q0, a, B) = {(q0, AB)}
    δ(q0, b, B) = {(q0, BB)}
    δ(q0, c, Z0) = {(q1, Z0)}
    δ(q0, c, A) = {(q1, A)}
    δ(q0, c, B) = {(q1, B)}
    δ(q1, a, A) = {(q1, ε)}
    δ(q1, b, B) = {(q1, ε)}
    δ(q1, ε, Z0) = {(q1, ε)}

    S → [q0Z0q]
    [q0Z0q] → a[q0Ap][pZ0q]
    [q0Z0q] → b[q0Bp][pZ0q]
    [q0Aq] → a[q0Ap][pAq]
    [q0Aq] → b[q0Bp][pAq]
    [q0Bq] → a[q0Ap][pBq]
    [q0Bq] → b[q0Bp][pBq]
    [q0Z0q] → c[q1Z0q]
    [q0Aq] → c[q1Aq]
    [q0Bq] → c[q1Bq]
    [q1Aq1] → a
    [q1Bq1] → b
    [q1Z0q1] → ε

    In the above, q and p can each be either q0 or q1

  • 8/11/2019 CFG-PDA.ppt

    15/68

    The Full Story

    If we specify every non-terminal in terms of all possible states, the expansion would contain all of the productions below

    S → [q0Z0q0]
    S → [q0Z0q1]
    [q0Z0q0] → a[q0Aq0][q0Z0q0]
    [q0Z0q0] → a[q0Aq1][q1Z0q0]
    [q0Z0q1] → a[q0Aq0][q0Z0q1]
    [q0Z0q1] → a[q0Aq1][q1Z0q1]
    [q0Z0q0] → b[q0Bq0][q0Z0q0]
    [q0Z0q0] → b[q0Bq1][q1Z0q0]
    [q0Z0q1] → b[q0Bq0][q0Z0q1]
    [q0Z0q1] → b[q0Bq1][q1Z0q1]
    [q0Aq0] → a[q0Aq0][q0Aq0]
    [q0Aq0] → a[q0Aq1][q1Aq0]
    [q0Aq1] → a[q0Aq0][q0Aq1]
    [q0Aq1] → a[q0Aq1][q1Aq1]
    [q0Aq0] → b[q0Bq0][q0Aq0]
    [q0Aq0] → b[q0Bq1][q1Aq0]
    [q0Aq1] → b[q0Bq0][q0Aq1]
    [q0Aq1] → b[q0Bq1][q1Aq1]
    [q0Bq0] → a[q0Aq0][q0Bq0]
    [q0Bq0] → a[q0Aq1][q1Bq0]
    [q0Bq1] → a[q0Aq0][q0Bq1]
    [q0Bq1] → a[q0Aq1][q1Bq1]
    [q0Bq0] → b[q0Bq0][q0Bq0]
    [q0Bq0] → b[q0Bq1][q1Bq0]
    [q0Bq1] → b[q0Bq0][q0Bq1]
    [q0Bq1] → b[q0Bq1][q1Bq1]
    [q0Z0q0] → c[q1Z0q0]
    [q0Z0q1] → c[q1Z0q1]
    [q0Aq0] → c[q1Aq0]
    [q0Aq1] → c[q1Aq1]
    [q0Bq0] → c[q1Bq0]
    [q0Bq1] → c[q1Bq1]
    [q1Aq1] → a
    [q1Bq1] → b
    [q1Z0q1] → ε

  • 8/11/2019 CFG-PDA.ppt

    16/68

    Deriving bacab

    S → [q0Z0q1]
    [q0Z0q1] → a[q0Aq1][q1Z0q1]
    [q0Z0q1] → b[q0Bq1][q1Z0q1]
    [q0Aq1] → a[q0Aq1][q1Aq1]
    [q0Aq1] → b[q0Bq1][q1Aq1]
    [q0Bq1] → a[q0Aq1][q1Bq1]
    [q0Bq1] → b[q0Bq1][q1Bq1]
    [q0Z0q1] → c[q1Z0q1]
    [q0Aq1] → c[q1Aq1]
    [q0Bq1] → c[q1Bq1]
    [q1Aq1] → a
    [q1Bq1] → b
    [q1Z0q1] → ε

    Only some of the productions in the generated grammar will allow for a derivation; the rest are unnecessary

    PDA moves

    (q0, bacab, Z0) ⊢ (q0, acab, BZ0)
    ⊢ (q0, cab, ABZ0)
    ⊢ (q1, ab, ABZ0)
    ⊢ (q1, b, BZ0)
    ⊢ (q1, ε, Z0)
    ⊢ (q1, ε, ε)

    Corresponding leftmost derivation

    S ⇒ [q0Z0q1]
    ⇒ b[q0Bq1][q1Z0q1]
    ⇒ ba[q0Aq1][q1Bq1][q1Z0q1]
    ⇒ bac[q1Aq1][q1Bq1][q1Z0q1]
    ⇒ baca[q1Bq1][q1Z0q1]
    ⇒ bacab[q1Z0q1]
    ⇒ bacab

  • 8/11/2019 CFG-PDA.ppt

    17/68

    PDA to CFG

    STRATEGY 2

    Convert PDA P into CFG G

    Modify P to be a normalized PDA N so that:

    It has a single accept state, q_accept
      Create ε-transitions from old accept states to this new accept state

    It empties the stack before accepting
      Push a special character $ on the stack in the start state (introducing a new start state in the process)
      Introduce a new temporary state q_temp that replaces q_accept, which has transitions popping all characters from the stack (except $)
      Introduce transition: q_temp → q_accept labeled ε, $ → ε

  • 8/11/2019 CFG-PDA.ppt

    18/68

    Continued

    Each transition either pushes a symbol onto the stack or pops one off the stack, but it does not do both at the same time

    Replace a simultaneous pop/push move with a 2-transition rule that goes through a new state. E.g., for the transition q_i → q_j labeled a, b → c (read a from input, pop b from stack, push c), introduce a special state q_temp plus 2 transitions, one doing the pop and one doing the push:

      q_i → q_temp labeled a, b → ε
      q_temp → q_j labeled ε, ε → c

    Replace a transition that neither pops nor pushes with two transitions that push and then immediately pop some newly created dummy stack symbol X. E.g., q_i → q_j labeled a, ε → ε becomes:

      q_i → q_temp labeled a, ε → X
      q_temp → q_j labeled ε, X → ε
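    The two replacement rules are mechanical enough to sketch in code. The following Python is only an illustration under assumptions made here: transitions are encoded as 5-tuples, the empty string stands for ε, the intermediate states get hypothetical names q_temp0, q_temp1, and so on, and the dummy symbol (D by default) is assumed not to occur in the stack alphabet already.

```python
from itertools import count


def pure_push_pop(transitions, dummy="D"):
    """Rewrite transitions so each one either pushes or pops exactly one symbol.

    A transition is (source, read, pop, push, target); "" stands for ε.
    """
    fresh = count()
    result = []
    for src, read, pop, push, dst in transitions:
        if bool(pop) != bool(push):
            result.append((src, read, pop, push, dst))  # already a pure push or pure pop
            continue
        mid = f"q_temp{next(fresh)}"  # hypothetical name for the new intermediate state
        if pop and push:
            # Pop-and-push: first pop while reading, then push without reading.
            result.append((src, read, pop, "", mid))
            result.append((mid, "", "", push, dst))
        else:
            # Neither pop nor push: push the dummy while reading, then pop it at once.
            result.append((src, read, "", dummy, mid))
            result.append((mid, "", dummy, "", dst))
    return result


for t in pure_push_pop([("qi", "a", "b", "c", "qj"), ("qi", "a", "", "", "qj")]):
    print(t)
```

    Each rewritten pair goes through its own fresh state, so the new transitions cannot be combined with unrelated moves of the original PDA.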

  • 8/11/2019 CFG-PDA.ppt

    19/68

    Normalizing the PDA

    EXAMPLE

    [Figure: state diagram of the example PDA N, with transition labels ε, ε → $; b, ε → X; a, X → Y; ε, $ → ε; a, ε → ε]

    L (N) = (ba + b aa)*b) +

  • 8/11/2019 CFG-PDA.ppt

    20/68

    Pure Push Pop

    Make sure the stack is always active by replacing inactive stack moves by a push followed by an immediate pop of a dummy symbol

    [Figure: the example PDA before and after the change; the inactive move a, ε → ε becomes a, ε → D followed by ε, D → ε]

  • 8/11/2019 CFG-PDA.ppt

    21/68

    Pure Push Pop

    Any move that replaces the top letter on the stack should be changed into a pop followed by a push

    [Figure: the example PDA before and after the change; the move a, X → Y becomes a, X → ε followed by ε, ε → Y]

  • 8/11/2019 CFG-PDA.ppt

    22/68

    Unique Accept State

    Turn off original accept states and connect them to a new accept state

    [Figure: the example PDA before and after adding the new accept state; the added transitions use the dummy symbol D]

    (Remember: each move must either push or pop from the stack)

  • 8/11/2019 CFG-PDA.ppt

    23/68

    Empty Stack

    Make sure the stack empties its content by adding a new dummy empty-stack symbol and new start/accept states

    [Figure: the fully normalized example PDA, using stack symbols D, $, X, Y]

  • 8/11/2019 CFG-PDA.ppt

    24/68

    PDA to CFG

    INTUITIVE DESCRIPTION

    Consider normalized PDA N = (Q, Σ, Γ, δ, q_init, q_accept)
      Starts in q_init with an empty stack
      Ends in q_accept with an empty stack

    In general, we can define the language L_pq, for any two states p, q ∈ Q, as the language of all strings that take N from p with an empty stack to q with an empty stack

    For each pair of states p and q, define a symbol S_pq in the CFG for the language L_pq

    Language of N is L_{q_init q_accept}

  • 8/11/2019 CFG-PDA.ppt

    25/68

    Steps to process w ∈ L_pq

    Two possibilities:

    1. During the processing of w the stack becomes empty at some intermediate state r
       This means a word of L_pq can be formed by concatenating a word of L_pr (which brought N from state p to state r with an empty stack) and a word of L_rq (that took N from r to q)

    2. Stack is never empty in the middle of N's transit from p to q in processing w
       The first transition (from, say, p to p1) must have been a push, and the last transition (from, say, q1 to q) must have been a pop, and the pop popped exactly the symbol pushed by the first transition from p to p1

  • 8/11/2019 CFG-PDA.ppt

    26/68

    In other words, if the PDA read a from input as it moved from p to p1, and read b as it moved from q1 to q, then w = ayb, where y is an input that causes the PDA N to start from p1 with an empty stack and end in q1 with an empty stack, i.e., y ∈ L_p1q1

    Formally, if there is a push transition (pushing X onto the stack) from p to p1 (reading a) and a pop transition from q1 to q (popping X and reading b), then a word in L_pq can be constructed from the expression a L_p1q1 b

    Note that either or both of a or b could be ε

  • 8/11/2019 CFG-PDA.ppt

    27/68

    The Construction

    For every state p, introduce the rule A_pp → ε
      The empty string can always be considered as getting you from p to p without doing anything to the stack, since nothing was read

    CONCATENATION RULE: For the case where the stack empties in the middle of the transition from p to q, introduce, for all states p, q, r of N, the rule

    A_pq → A_pr A_rq

  • 8/11/2019 CFG-PDA.ppt

    28/68

    RECURSION RULE: Case where the stack is never empty: for any given states p, p1, q1, r of N, such that there is a push transition from p to p1 and a pop transition from q1 to r (that push and pop the same symbol), introduce an appropriate rule

    Formally, for p, p1, q1, r of N with transitions of the form

      p → p1 labeled a, ε → X (push X)
      q1 → r labeled b, X → ε (pop X)

    introduce the rule A_pr → a A_p1q1 b

  • 8/11/2019 CFG-PDA.ppt

    29/68

    Formal Definition

    FROM SIPSER

    P = (Q, Σ, Γ, δ, q0, q_accept)

    Non-terminals of G are {A_pq | p, q ∈ Q}

    Rules:

    For each p, q, r, s ∈ Q, t ∈ Γ, and a, b ∈ Σ ∪ {ε}, if δ(p, a, ε) contains (r, t) and δ(s, b, t) contains (q, ε), put the rule A_pq → a A_rs b in G

    For each p, q, r ∈ Q, put the rule A_pq → A_pr A_rq in G.

    For each p ∈ Q, put the rule A_pp → ε in G.
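    The three rule schemas translate almost directly into code. The sketch below is an illustration only: the 5-tuple transition encoding, the ("A", p, q) encoding of A_pq, and the function name sipser_grammar are assumptions made here. The input is the parenthesis-matching PDA that these slides use as a worked example (states q, r, s).

```python
def sipser_grammar(states, transitions):
    """Return the rule set; the variable A_pq is encoded as ("A", p, q).

    transitions: list of (source, read, pop, push, target), "" for ε,
    where every transition is a pure push or a pure pop (normalized PDA).
    """
    rules = set()
    pushes = [t for t in transitions if t[3]]  # (p, a, "", X, r)
    pops = [t for t in transitions if t[2]]    # (s, b, X, "", q)
    for p, a, _, pushed, r in pushes:
        for s, b, popped, _, q in pops:
            if pushed == popped:
                # A_pq -> a A_rs b; an ε in place of a or b contributes no terminal.
                body = tuple(x for x in (a, ("A", r, s), b) if x)
                rules.add((("A", p, q), body))
    for p in states:
        rules.add((("A", p, p), ()))  # A_pp -> ε
        for q in states:
            for r in states:
                # A_pq -> A_pr A_rq
                rules.add((("A", p, q), (("A", p, r), ("A", r, q))))
    return rules


pda = [
    ("q", "", "", "$", "r"),   # push the bottom marker
    ("r", "(", "", "X", "r"),  # push X on "("
    ("r", ")", "X", "", "r"),  # pop X on ")"
    ("r", "", "$", "", "s"),   # pop the bottom marker
]
rules = sipser_grammar(["q", "r", "s"], pda)
print(len(rules), "rules")
```

    Only two of the generated rules come from matching push/pop pairs (A_qs → A_rr and A_rr → ( A_rr )); the rest are the ε and concatenation rules for every combination of states, many of which turn out to be useless.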

  • 8/11/2019 CFG-PDA.ppt

    30/68

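    A_pq → A_pr A_rq

    [Figure: stack height plotted against the input string. The stack is empty at p, at r, and at q; the input between p and r is generated by A_pr, the input between r and q by A_rq, and the whole input by A_pq]

    CONCATENATION RULE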

  • 8/11/2019 CFG-PDA.ppt

    31/68

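    A_pq → a A_rs b

    [Figure: stack height plotted against the input string. The stack is empty only at p and q; a is read on the first move from p to r, b on the last move from s to q, the input in between is generated by A_rs, and the whole input by A_pq]

    RECURSION RULE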

  • 8/11/2019 CFG-PDA.ppt

    32/68

    PDA → CFG

    If paths for strings that are accepted by the PDA start and end with an empty stack, it is possible to consider any such path between any two states and recursively generate all such paths

    This recursive relationship between paths will give rise to the recursion at the heart of the representative context free grammar

  • 8/11/2019 CFG-PDA.ppt

    33/68

    The Grammar

    The rules for generating paths give a grammar to generate all labels of such paths

    The grammar has non-terminals A_qr which will generate all strings x that are processed when passing from state q to state r

    Q: Under this assumption, what should the production body (right hand side) for the start variable S be?

  • 8/11/2019 CFG-PDA.ppt

    34/68

    The Grammar Symbols

    A: S = A_{q_init q_accept}, where q_init is the start state and q_accept is the final state

    In addition to this start variable, the other variables are all A_qr for which there is a path going from q to r that starts and ends with an empty stack

    Note that Sipser doesn't require the extra condition that there be a path from q to r which starts and ends with an empty stack; his method generates all possible combinations. However, those pairs q, r for which no such path exists will create useless variables A_qr which end up cluttering the grammar and making the construction extremely ugly, even on the simplest PDAs. On the other hand, it is not obvious how one would determine a priori which of the pairs don't have such paths, which probably explains why Sipser didn't include this condition.

  • 8/11/2019 CFG-PDA.ppt

    35/68

    Grammar Rules

    1. BASIS RULE: Add a production A_qq → ε for each state q in the PDA

    2. CONCATENATION RULE: Add a production A_pr → A_pq A_qr for all p, q, r when A_pr, A_pq and A_qr are all in V.

    3. RECURSION RULE: Add a production A_ps → a A_qr b for all p, s, q, r when
       A_ps and A_qr are in V
       Transitions (q, X) ∈ δ(p, a, ε), (s, ε) ∈ δ(r, b, X) for the same stack symbol X exist in the PDA

  • 8/11/2019 CFG-PDA.ppt

    36/68

    Example

    PDA in the normalized form:

    Q: What is the accepted language?

    [Figure: PDA with states q, r, s. q → r labeled ε, ε → $; r → r labeled (, ε → X and ), X → ε; r → s labeled ε, $ → ε]

  • 8/11/2019 CFG-PDA.ppt

    37/68

    A: CNP = correctly nested parentheses, including sets of pairs [e.g., ()(())]. The number of X's on the stack reflects how deep the current nesting is.

    Q: What are the variables for the equivalent grammar? What is the start variable?

    [Figure: the example PDA with states q, r, s (repeated)]

  • 8/11/2019 CFG-PDA.ppt

    38/68

    A: V = {A_qs, A_qq, A_rr, A_ss}, S = A_qs

    We don't need A_rq, A_sq, A_sr because the paths go in the wrong direction

    We don't need A_qr or A_rs because we can't add or remove $ while at r
      I.e., not a transition where you both begin and end with an empty stack

    [Figure: the example PDA with states q, r, s (repeated)]

  • 8/11/2019 CFG-PDA.ppt

    39/68

    Productions from the Base Rule

    The empty string can always be considered as getting you from p to p without doing anything to the stack, since nothing was read

    [Figure: the example PDA with states q, r, s (repeated)]

    A_qq → ε, A_rr → ε, A_ss → ε

  • 8/11/2019 CFG-PDA.ppt

    40/68

    Productions from the Concatenation Rule

    If you can get from some state p to another state r starting and ending with the stack empty (regardless of stack activity while moving from p to r), and from r to q under the same conditions, then combine the paths to get a path from p to q.

    [Figure: the example PDA with states q, r, s (repeated)]

    A_qs → A_qq A_qs | A_qs A_ss
    A_qq → A_qq A_qq
    A_rr → A_rr A_rr
    A_ss → A_ss A_ss

    State pairs q and r, and r and s, do not satisfy the condition that the stack is empty in both states

  • 8/11/2019 CFG-PDA.ppt

    41/68

    Productions from the Recursion Rule

    For any given states p, p1, q1, q of N, such that there is a push transition from p to p1 and a pop transition from q1 to q (that push and pop the same symbol), i.e., δ(p, a, ε) contains (p1, X) and δ(q1, b, X) contains (q, ε), put the rule A_pq → a A_p1q1 b

    [Figure: the example PDA with states q, r, s (repeated)]

    A_qs → ε A_rr ε = A_rr
    A_rr → ( A_rr )

    δ(q, ε, ε) contains (r, $) and δ(r, ε, $) contains (s, ε)

    δ(r, (, ε) contains (r, X) and δ(r, ), X) contains (r, ε)

  • 8/11/2019 CFG-PDA.ppt

    42/68

    Full Grammar

    A_qs → A_rr | A_qq A_qs | A_qs A_ss
    A_rr → ε | A_rr A_rr | ( A_rr )
    A_qq → ε | A_qq A_qq
    A_ss → ε | A_ss A_ss

  • 8/11/2019 CFG-PDA.ppt

    43/68

    Simplifications

    Apparently A_qq and A_ss are purely self-referential apart from their ε-productions, so the only string that can be derived from either of them is ε.

    We can therefore remove the variables A_qq, A_ss, replacing each occurrence by ε

    A_qs → A_rr | A_qq A_qs | A_qs A_ss
    A_rr → ε | A_rr A_rr | ( A_rr )
    A_qq → ε | A_qq A_qq
    A_ss → ε | A_ss A_ss

    Becomes:

    A_qs → A_rr | A_qs
    A_rr → ε | A_rr A_rr | ( A_rr )

  • 8/11/2019 CFG-PDA.ppt

    44/68

    Showing that the grammar works

    A_qs → A_rr | A_qs
    A_rr → ε | A_rr A_rr | ( A_rr )

    Rename variables to get:

    S → T | S
    T → ε | TT | (T)

    S isn't needed as its whole purpose is to get you to T

    So the final (cleaned up) grammar is

    T → ε | TT | (T)

  • 8/11/2019 CFG-PDA.ppt

    45/68

    Another Example

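    Consider the language L = {wcw^R | w ∈ {a,b}*}. A non-normalized PDA for this language is

    [Figure: PDA with states s and f. s → s labeled a, ε → a and b, ε → b; s → f labeled c, ε → ε; f → f labeled a, a → ε and b, b → ε]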

  • 8/11/2019 CFG-PDA.ppt

    46/68

    Convert to Normalized Form

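    1. Create new start and accepting states

    2. All transitions either pop or push except c, ε → ε; change it to 2 transitions that push and pop a dummy symbol

    [Figure: the normalized PDA. A new start state pushes $ before entering s, a new accepting state a is reached from f by popping $, and the transition c, ε → ε from s to f is split through an intermediate state q into c, ε → D and ε, D → ε]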

  • 8/11/2019 CFG-PDA.ppt

    47/68

    Generate Grammar

    1. Add start symbol and a production A_qq → ε for each state q in the PDA

    S → A_sa
    A_ss → ε
    A_ss → ε
    A_qq → ε
    A_ff → ε
    A_aa → ε

    [Figure: the normalized PDA (repeated)]

  • 8/11/2019 CFG-PDA.ppt

    48/68

    Generate Grammar

    2. Add a production A_pr → A_pq A_qr for all p, q, r when A_pr, A_pq and A_qr are all in V

    [Figure: the normalized PDA (repeated)]

    A_sa → A_ss A_sa | A_sa A_aa

  • 8/11/2019 CFG-PDA.ppt

    49/68

    Generate Grammar

    3. Add a production A_ps → a A_qr b for all p, s, q, r when A_ps and A_qr are in V and transitions (q, X) ∈ δ(p, a, ε), (s, ε) ∈ δ(r, b, X) for the same stack symbol X exist in the PDA

    [Figure: the normalized PDA (repeated)]

    A_sa → $ A_sf $
    A_sf → c A_qq
    A_sf → b A_sf b | a A_sf a

  • 8/11/2019 CFG-PDA.ppt

    50/68

    Final Grammar

    Final grammar         More readable    Simplified
    S → A_sa              S → T
    A_ss → ε              R → ε
    A_ss → ε              U → ε
    A_qq → ε              V → ε
    A_ff → ε              W → ε
    A_aa → ε              X → ε
    A_sa → A_ss A_sa      T → RT
    A_sa → A_sa A_aa      T → TX
    A_sa → $ A_sf $       T → $Z$          T → $Z$
    A_sf → c A_qq         Z → cV           Z → c
    A_sf → b A_sf b       Z → bZb          Z → bZb
    A_sf → a A_sf a       Z → aZa          Z → aZa

    R, U, V, W, X contribute only ε, so they can be eliminated

    T → RT and T → TX then become T → T, which is obviously unnecessary

    S is superfluous because it only gets you to T

  • 8/11/2019 CFG-PDA.ppt

    51/68

    Deterministic PDAs

    Intuitively: never a choice of move

    δ(q, a, Z) has at most one member for any q, a, Z (including a = ε).

    If δ(q, ε, Z) is nonempty, then δ(q, a, Z) must be empty for all input symbols a.

    Why Care?

    Parsers, as in YACC, are really DPDA's.

    Thus, the question of what languages a DPDA can accept is really the question of what programming language syntax can be parsed conveniently.
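    The two determinism conditions can be checked directly on a transition table. The sketch below assumes δ is stored as a Python dictionary from (state, input symbol or "", stack top) to a set of moves; that encoding and the name is_deterministic are choices made for this illustration.

```python
def is_deterministic(delta, input_alphabet):
    """Check the two DPDA conditions on a transition table.

    delta maps (state, input symbol or "", stack top) to a set of moves.
    """
    for (q, a, Z), moves in delta.items():
        if len(moves) > 1:
            return False  # more than one choice for the same (q, a, Z)
        if a == "" and moves:
            # An ε-move on (q, Z) rules out any input-consuming move on (q, Z).
            if any(delta.get((q, b, Z)) for b in input_alphabet):
                return False
    return True


# The single-state PDA built from a grammar has several choices on (q, ε, S).
from_grammar = {
    ("q", "", "S"): {("q", "a"), ("q", "aS")},
    ("q", "a", "a"): {("q", "")},
}
print(is_deterministic(from_grammar, "ab"))
```

    The PDA produced by the CFG-to-PDA construction fails the first condition as soon as some variable has more than one production, which is why that construction generally yields a nondeterministic machine.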

  • 8/11/2019 CFG-PDA.ppt

    52/68

    Some Language Relationships

    Acceptance by empty stack is hard for a DPDA
      Once it accepts, it dies and cannot accept any continuation.
      Thus, N(P) has the prefix property: if w is in N(P), then wx is NOT in N(P) for any x ≠ ε.

    However, parsers do accept by emptying their stack
      Trick: they really process strings followed by a unique endmarker (typically $), e.g., if they accept w$, they consider w to be a correct program.

  • 8/11/2019 CFG-PDA.ppt

    53/68

    If L is a regular language, then L is a DPDA language
      A DPDA can simulate a DFA, without using its stack (acceptance by final state).

    If L is a DPDA language, then L is a CFL that is not inherently ambiguous
      A DPDA yields an unambiguous grammar in the standard construction

    Interesting fact: The class of languages accepted by NPDAs is larger than those accepted by DPDAs!

  • 8/11/2019 CFG-PDA.ppt

    54/68

    [Figure: nested sets. Languages accepted by FA or NFA, inside languages accepted by deterministic PDA, inside languages accepted by nondeterministic PDA]

    PDA more powerful than FA

  • 8/11/2019 CFG-PDA.ppt

    55/68

    Cleaning Up Grammars

    We can "simplify" grammars to a great extent, e.g.:

    1. Get rid of useless symbols -- those that do not participate in any derivation of a terminal string.

    2. Get rid of ε-productions -- those of the form variable → ε.
       But you lose the ability to generate ε as a string in the language.

    3. Get rid of unit productions -- those of the form variable → variable.

    Any CFG can be converted via these and other methods to Chomsky Normal Form: the only production forms are variable → two variables and variable → terminal.

  • 8/11/2019 CFG-PDA.ppt

    56/68

    Getting Rid of the Empty String

    The empty string is a nuisance with grammars and languages in general

    We will look at languages that do not contain ε

    No loss of generality:
      For language L, let G = (V, T, S, P) be a CFG that generates L - {ε}
      Modify the grammar by adding a new start variable S0 and add productions S0 → S | ε
      This grammar generates L ∪ {ε}, which is L itself when ε is in L
      Therefore any non-trivial conclusion we make for L - {ε} should transfer to L

  • 8/11/2019 CFG-PDA.ppt

    57/68

    Useless Symbols

    In order for a symbol X to be useful, it must:

    1. Derive some terminal string (possibly X is a terminal).

    2. Be reachable from the start symbol; i.e., S ⇒* αXβ.

    Note that X wouldn't really be useful if α or β included a symbol that didn't satisfy (1), so it is important that (1) be tested first, and symbols that don't derive terminal strings be eliminated before testing (2).

  • 8/11/2019 CFG-PDA.ppt

    58/68

    Finding Symbols That Don't Derive Any Terminal String

    Recursive construction:

    Basis: A terminal surely derives a terminal string.

    Induction: If A is the head of a production whose body is X1 X2 … Xk, and each Xi is known to derive a terminal string, then surely A derives a terminal string.

    Keep going until no more symbols that derive terminal strings are discovered.

  • 8/11/2019 CFG-PDA.ppt

    59/68

    Example

    S → AB | C
    A → 0B | C
    B → 1 | A0
    C → AC | C1

    Round 1: 0 and 1 are "in."
    Round 2: B → 1 says B is in.
    Round 3: A → 0B says A is in.
    Round 4: S → AB says S is in.
    Round 5: Nothing more can be added.

    Thus, C can be eliminated, along with any production that mentions it, leaving S → AB; A → 0B; B → 1 | A0.
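    The rounds above amount to a fixed-point computation. A minimal Python sketch, assuming single-character symbols and a dictionary from variables to lists of bodies (the names generating_symbols and drop_symbols are hypothetical):

```python
def generating_symbols(productions, terminals):
    """Return the set of symbols that derive some terminal string."""
    generating = set(terminals)  # basis: every terminal derives itself
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head in generating:
                continue
            # Induction: some body consists only of symbols already known to be "in".
            if any(all(sym in generating for sym in body) for body in bodies):
                generating.add(head)
                changed = True
    return generating


def drop_symbols(productions, keep):
    """Keep only productions whose head and body symbols are all in `keep`."""
    result = {}
    for head, bodies in productions.items():
        if head in keep:
            kept = [b for b in bodies if all(s in keep for s in b)]
            if kept:
                result[head] = kept
    return result


grammar = {"S": ["AB", "C"], "A": ["0B", "C"], "B": ["1", "A0"], "C": ["AC", "C1"]}
gen = generating_symbols(grammar, "01")
print(sorted(gen))
print(drop_symbols(grammar, gen))
```

    The loop may discover the variables in a different order than the rounds listed above, but it stops at the same set.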

  • 8/11/2019 CFG-PDA.ppt

    60/68

    Finding Symbols That Can't Be Derived From the Start Symbol

    Another recursive algorithm:

    Basis: S is "in."

    Induction: If variable A is in, then so is every symbol in the production bodies for A.

    Keep going until no more symbols derivable from S can be found.

  • 8/11/2019 CFG-PDA.ppt

    61/68

    Example

    S → AB
    A → 0B
    B → 1 | A0

    Round 1: S is in.
    Round 2: A and B are in.
    Round 3: 0 and 1 are in.
    Round 4: Nothing can be added.

    In this case, all symbols are derivable from S, so no change to grammar.

    Book has an example where not only are there symbols not derivable from S, but you must eliminate first the symbols that don't derive terminal strings, or you get the wrong grammar.
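    Reachability from the start symbol is an ordinary graph search. The sketch below assumes the same dictionary encoding with single-character symbols; the function name reachable_symbols is hypothetical.

```python
def reachable_symbols(productions, start="S"):
    """Return every symbol that appears in some sentential form derived from `start`."""
    reachable = {start}  # basis: the start symbol is "in"
    todo = [start]
    while todo:
        head = todo.pop()
        # Induction: every symbol in a body of a reachable variable is reachable.
        for body in productions.get(head, []):
            for sym in body:
                if sym not in reachable:
                    reachable.add(sym)
                    todo.append(sym)
    return reachable


grammar = {"S": ["AB"], "A": ["0B"], "B": ["1", "A0"]}
print(sorted(reachable_symbols(grammar)))
```

    Terminals have no entry in the dictionary, so productions.get returns an empty list for them and the search simply stops there.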

  • 8/11/2019 CFG-PDA.ppt

    62/68

    Eliminating ε-Productions

    A variable A is nullable if A ⇒* ε. Find them by a recursive algorithm:

    Basis: If A → ε is a production, then A is nullable.

    Induction: If A is the head of a production whose body consists of only nullable symbols, then A is nullable.

    Once we have the nullable symbols, we can add additional productions and then throw away the productions of the form A → ε for any A.

  • 8/11/2019 CFG-PDA.ppt

    63/68

    If A → X1 X2 … Xk is a production, add all productions that can be formed by eliminating some or all of those Xi's that are nullable.

    But, don't eliminate all k if they are all nullable.

    Example

    If A → BC is a production, and both B and C are nullable, add A → B | C
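    Both steps, finding the nullable variables and expanding the productions, can be sketched as follows. The encoding is an assumption made for illustration: single-character symbols, bodies as strings, and the empty string as the body of an ε-production.

```python
from itertools import product


def nullable_variables(productions):
    """Return the set of variables A with A =>* ε."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            # An empty body (A -> ε) trivially consists of only nullable symbols.
            if head not in nullable and any(
                all(sym in nullable for sym in body) for body in bodies
            ):
                nullable.add(head)
                changed = True
    return nullable


def eliminate_epsilon(productions):
    """Return an equivalent grammar (up to ε) without ε-productions."""
    nullable = nullable_variables(productions)
    result = {}
    for head, bodies in productions.items():
        new_bodies = set()
        for body in bodies:
            # Each nullable symbol may be kept or dropped; others are always kept.
            options = [(s, "") if s in nullable else (s,) for s in body]
            for choice in product(*options):
                candidate = "".join(choice)
                if candidate:  # never eliminate all k symbols
                    new_bodies.add(candidate)
        if new_bodies:
            result[head] = sorted(new_bodies)
    return result


print(eliminate_epsilon({"A": ["BC"], "B": ["b", ""], "C": ["c", ""]}))
```

    For the example above this yields A → B | BC | C, with the ε-productions of B and C gone.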

  • 8/11/2019 CFG-PDA.ppt

    64/68

    Eliminating Unit Productions

    1. Eliminate useless symbols and ε-productions.

    2. Discover those pairs of variables (A, B) such that A ⇒* B.
       Because there are no ε-productions, this derivation can only use unit productions.
       Thus, we can find the pairs by computing reachability in a graph where nodes = variables, and arcs = unit productions.

    3. Replace each combination where A ⇒* B ⇒ α and α is other than a single variable by A → α.
       I.e., "short circuit" sequences of unit productions, which must eventually be followed by some other kind of production.

    4. Remove all unit productions.
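    Steps 2 to 4 can be sketched as a reachability search per variable followed by copying the non-unit bodies. The grammar below is a small expression grammar chosen here purely as an illustration; the encoding (single-character symbols, variables are exactly the dictionary keys) and the function name are assumptions.

```python
def eliminate_unit_productions(productions):
    """Return an equivalent grammar without unit productions.

    Assumes ε-productions are already gone and every variable has an entry.
    """
    variables = set(productions)

    def is_unit(body):
        return len(body) == 1 and body in variables

    result = {}
    for a in variables:
        # Variables reachable from `a` through unit productions only (including a itself).
        reach, todo = {a}, [a]
        while todo:
            x = todo.pop()
            for body in productions[x]:
                if is_unit(body) and body not in reach:
                    reach.add(body)
                    todo.append(body)
        # Short-circuit: `a` gets every non-unit body of every variable it reaches.
        result[a] = sorted(
            {body for b in reach for body in productions[b] if not is_unit(body)}
        )
    return result


grammar = {"E": ["T", "E+T"], "T": ["F", "T*F"], "F": ["a", "(E)"]}
for head, bodies in sorted(eliminate_unit_productions(grammar).items()):
    print(head, "->", " | ".join(bodies))
```

    Here E reaches T and F through unit productions, so it inherits their non-unit bodies, and the chains E → T → F disappear.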

  • 8/11/2019 CFG-PDA.ppt

    65/68

    Chomsky Normal Form

    1. Get rid of useless symbols, ε-productions, and unit productions (already done).

    2. Get rid of productions whose bodies are mixes of terminals and variables, or consist of more than one terminal.

    3. Break up production bodies longer than 2.

    Result: All productions are of the form A → BC or A → a

  • 8/11/2019 CFG-PDA.ppt

    66/68

    No Mixed Bodies

    1. For each terminal a, introduce a new variable A_a, with one production A_a → a.

    2. Replace a in any body where it is not the entire body by A_a.

    Now, every body is either a single terminal or it consists only of variables.

    Example

    A → 0B1 becomes A0 → 0; A1 → 1; A → A0 B A1

  • 8/11/2019 CFG-PDA.ppt

    67/68

    Making Bodies Short

    If we have a production like A → BCDE, we can introduce some new variables that allow the variables of the body to be introduced one at a time.

    A body of length k requires k - 2 new variables.

    Example

    Introduce F and G; replace A → BCDE by A → BF; F → CG; G → DE.
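    The last two steps (no mixed bodies, then short bodies) can be sketched together. In this illustration bodies are tuples of symbol names so that new variables can have multi-character names; the naming schemes "A" + terminal and N1, N2, ... are hypothetical and assumed not to clash with existing variables.

```python
def to_cnf_bodies(productions, terminals):
    """Apply the 'no mixed bodies' and 'short bodies' steps.

    productions: list of (head, body) with body a tuple of symbols.
    Assumes useless symbols, ε-productions and unit productions are already gone.
    """
    rules = []
    term_var = {}  # terminal -> its new variable, e.g. "0" -> "A0"

    def var_for(t):
        if t not in term_var:
            term_var[t] = "A" + t  # hypothetical naming scheme
            rules.append((term_var[t], (t,)))
        return term_var[t]

    fresh = 0
    for head, body in productions:
        if len(body) >= 2:
            # No mixed bodies: replace each terminal by its variable.
            body = tuple(var_for(s) if s in terminals else s for s in body)
        # Short bodies: peel off one variable at a time.
        while len(body) > 2:
            fresh += 1
            new_var = f"N{fresh}"  # hypothetical fresh name
            rules.append((head, (body[0], new_var)))
            head, body = new_var, body[1:]
        rules.append((head, body))
    return rules


for head, body in to_cnf_bodies([("A", ("0", "B", "1"))], {"0", "1"}):
    print(head, "->", " ".join(body))
```

    For a body of length k the loop creates k - 2 new variables, matching the count stated above; for A → BCDE the new variables N1 and N2 play the roles of F and G.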

  • 8/11/2019 CFG-PDA.ppt

    68/68

    Summary Theorem

    If L is any CFL, there is a grammar G that generates L - {ε}, for which each production is of the form A → BC or A → a, and there are no useless symbols.