5-parserconstruction

Upload: randika-perera

Post on 07-Apr-2018


  • 8/4/2019 5-parserconstruction

    1/58

    COS 320

    Compilers

    David Walker


    last time

    context free grammars (Appel 3.1)
        terminals, non-terminals, rules
        derivations & parse trees
        ambiguous grammars

    recursive descent parsers (Appel 3.2)
        parse LL(k) grammars
        easy to write as ML programs
        algorithms for automatic construction from a CFG


    rules:

    1. S ::= IF E THEN S ELSE S
    2.     | BEGIN S L
    3.     | PRINT E
    4. L ::= END
    5.     | ; S L
    6. E ::= NUM = NUM

    non-terminals: S, E, L
    terminals: NUM, IF, THEN, ELSE, BEGIN, END, PRINT, ;, =

    (* getToken and error are assumed to be supplied by the lexer and error handler *)
    datatype token = NUM | IF | THEN | ELSE | BEGIN | END | PRINT | SEMI | EQ

    val tok = ref (getToken ())
    fun advance () = tok := getToken ()
    fun eat t = if (!tok = t) then advance () else error ()

    fun S () = case !tok of
        IF    => (eat IF; E (); eat THEN; S (); eat ELSE; S ())
      | BEGIN => (eat BEGIN; S (); L ())
      | PRINT => (eat PRINT; E ())
      | _     => error ()
    and L () = case !tok of
        END  => eat END
      | SEMI => (eat SEMI; S (); L ())
      | _    => error ()
    and E () = (eat NUM; eat EQ; eat NUM)


    Constructing RD Parsers

    To construct an RD parser, we need to know what rule to apply when

        we have seen a non-terminal X
        we see the next terminal a in the input

    We apply rule X ::= s when

        a is the first symbol of some string that can be generated by s, OR
        s can derive the empty string (is nullable) and a is the first symbol in any string that can follow X


    Computing Nullable Sets

    Non-terminal X is Nullable only if the following constraints are satisfied (computed using iterative analysis)

    base case:

        if (X ::= ) then X is Nullable (the right-hand side is empty)

    inductive case:

        if (X ::= A B C ...) and A, B, C, ... are all Nullable then X is Nullable
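The iterative analysis can be sketched in a few lines. This is a minimal Python sketch rather than code from the course (the slides themselves use ML). The `GRAMMAR` encoding and the name `compute_nullable` are assumptions made for illustration; the grammar is the Z/Y/X example that these slides work through.

```python
# Each rule is (left-hand side, right-hand side); an empty tuple is an empty right-hand side.
GRAMMAR = [
    ("Z", ("X", "Y", "Z")),
    ("Z", ("d",)),
    ("Y", ("c",)),
    ("Y", ()),
    ("X", ("a",)),
    ("X", ("b", "Y", "e")),
]


def compute_nullable(grammar):
    """Return the set of nullable non-terminals, iterating until nothing changes."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhs in grammar:
            # all() over an empty right-hand side is True, which is the base case.
            if lhs not in nullable and all(sym in nullable for sym in rhs):
                nullable.add(lhs)
                changed = True
    return nullable


print(compute_nullable(GRAMMAR))  # {'Y'}
```

Terminals never enter the `nullable` set, so any rule whose right-hand side mentions a terminal can never make its left-hand side nullable.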


    Computing First Sets

    First(X) is computed iteratively

    base case:

    if T is a terminal symbol then First (T) = {T}

    inductive case:

    if X is a non-terminal and (X ::= A B C ...) then

        First (X) = First (X) U First (A B C ...)

    where First (A B C ...) = F1 U F2 U F3 U ... and

        F1 = First (A)
        F2 = First (B), if A is Nullable
        F3 = First (C), if A is Nullable & B is Nullable
        ...
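The First computation follows the same fixed-point pattern. Below is a minimal Python sketch (not the course's ML code); `compute_first`, the `GRAMMAR` encoding, and passing the nullable set in as a parameter are all assumptions made for illustration. The grammar is the Z/Y/X example worked through in these slides, where only Y is nullable.

```python
GRAMMAR = [
    ("Z", ("X", "Y", "Z")),
    ("Z", ("d",)),
    ("Y", ("c",)),
    ("Y", ()),
    ("X", ("a",)),
    ("X", ("b", "Y", "e")),
]
NULLABLE = {"Y"}  # taken as given here


def compute_first(grammar, nullable):
    """Return a dict mapping each non-terminal to its First set."""
    nonterminals = {lhs for lhs, _ in grammar}
    first = {nt: set() for nt in nonterminals}

    def first_of(sym):
        # Base case: the First set of a terminal T is {T}.
        return first[sym] if sym in nonterminals else {sym}

    changed = True
    while changed:
        changed = False
        for lhs, rhs in grammar:
            for sym in rhs:
                new = first_of(sym) - first[lhs]
                if new:
                    first[lhs] |= new
                    changed = True
                # Later symbols contribute only if every earlier one is nullable.
                if sym not in nullable:
                    break
    return first


first = compute_first(GRAMMAR, NULLABLE)
print({nt: sorted(syms) for nt, syms in sorted(first.items())})
# {'X': ['a', 'b'], 'Y': ['c'], 'Z': ['a', 'b', 'd']}
```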


    Computing Follow Sets

    Follow(X) is computed iteratively

    base case:

    initially, we assume nothing in particular follows X

    (Follow (X) is initially { })

    inductive case:

    if (Y ::= s1 X s2) for any strings s1, s2 then

        Follow (X) = First (s2) U Follow (X)

    if (Y ::= s1 X s2) for any strings s1, s2 and s2 is Nullable then

        Follow (X) = Follow (Y) U Follow (X)
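The two inductive rules for Follow can be sketched as follows. This is a minimal Python sketch (not the course's ML code); the function names and data layout are assumptions made for illustration, and the `NULLABLE` and `FIRST` values are copied from the worked example in these slides rather than recomputed.

```python
GRAMMAR = [
    ("Z", ("X", "Y", "Z")),
    ("Z", ("d",)),
    ("Y", ("c",)),
    ("Y", ()),
    ("X", ("a",)),
    ("X", ("b", "Y", "e")),
]
NULLABLE = {"Y"}
FIRST = {"Z": {"a", "b", "d"}, "Y": {"c"}, "X": {"a", "b"}}


def first_of_string(symbols):
    """Return (First set of the string, whether the whole string is nullable)."""
    result = set()
    for sym in symbols:
        result |= FIRST.get(sym, {sym})  # a terminal's First set is itself
        if sym not in NULLABLE:
            return result, False
    return result, True


def compute_follow(grammar):
    """Return a dict mapping each non-terminal to its Follow set."""
    follow = {lhs: set() for lhs, _ in grammar}  # base case: nothing follows
    changed = True
    while changed:
        changed = False
        for lhs, rhs in grammar:
            for i, sym in enumerate(rhs):
                if sym not in follow:  # terminals have no Follow set
                    continue
                rest_first, rest_nullable = first_of_string(rhs[i + 1:])
                new = set(rest_first)        # rule 1: First(s2)
                if rest_nullable:
                    new |= follow[lhs]       # rule 2: Follow(Y) when s2 is nullable
                if not new <= follow[sym]:
                    follow[sym] |= new
                    changed = True
    return follow


follow = compute_follow(GRAMMAR)
print({nt: sorted(syms) for nt, syms in sorted(follow.items())})
# {'X': ['a', 'b', 'c', 'd'], 'Y': ['a', 'b', 'd', 'e'], 'Z': []}
```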


    building a predictive parser

    Z ::= X Y Z
    Z ::= d
    Y ::= c
    Y ::=
    X ::= a
    X ::= b Y e

         nullable   first     follow
    Z
    Y
    X


    building a predictive parser

    Z ::= X Y Z
    Z ::= d
    Y ::= c
    Y ::=
    X ::= a
    X ::= b Y e

         nullable   first     follow
    Z    no
    Y    yes
    X    no

    base case


    building a predictive parser

    Z ::= X Y Z
    Z ::= d
    Y ::= c
    Y ::=
    X ::= a
    X ::= b Y e

         nullable   first     follow
    Z    no
    Y    yes
    X    no

    after one round of induction, we realize we have reached a fixed point


    building a predictive parser

    Z ::= X Y Z
    Z ::= d
    Y ::= c
    Y ::=
    X ::= a
    X ::= b Y e

         nullable   first     follow
    Z    no         d
    Y    yes        c
    X    no         a,b

    base case


    building a predictive parser

    Z ::= X Y Z
    Z ::= d
    Y ::= c
    Y ::=
    X ::= a
    X ::= b Y e

         nullable   first     follow
    Z    no         d,a,b
    Y    yes        c
    X    no         a,b

    after one round of induction, no fixed point


    building a predictive parser

    Z ::= X Y Z
    Z ::= d
    Y ::= c
    Y ::=
    X ::= a
    X ::= b Y e

         nullable   first     follow
    Z    no         d,a,b
    Y    yes        c
    X    no         a,b

    after two rounds of induction, no more changes ==> fixed point


    building a predictive parser

    Z ::= X Y Z
    Z ::= d
    Y ::= c
    Y ::=
    X ::= a
    X ::= b Y e

         nullable   first     follow
    Z    no         d,a,b     { }
    Y    yes        c         { }
    X    no         a,b       { }

    base case


    building a predictive parser

    Z ::= X Y Z
    Z ::= d
    Y ::= c
    Y ::=
    X ::= a
    X ::= b Y e

         nullable   first     follow
    Z    no         d,a,b     { }
    Y    yes        c         e,d,a,b
    X    no         a,b       c,d,a,b

    after one round of induction, no fixed point


    building a predictive parser

    Z ::= X Y Z
    Z ::= d
    Y ::= c
    Y ::=
    X ::= a
    X ::= b Y e

         nullable   first     follow
    Z    no         d,a,b     { }
    Y    yes        c         e,d,a,b
    X    no         a,b       c,d,a,b

    after two rounds of induction, fixed point

    (but notice, computing Follow(X) before Follow (Y) would have required 3rd round)


    Grammar:

        Z ::= X Y Z
        Z ::= d
        Y ::= c
        Y ::=
        X ::= a
        X ::= b Y e

    Computed Sets:

         nullable   first     follow
    Z    no         d,a,b     { }
    Y    yes        c         e,d,a,b
    X    no         a,b       c,d,a,b

    Build parsing table where row X, col T tells parser which clause to execute in function X with next-token T:

         a              b              c          d          e
    Z
    Y
    X

    if T ∈ First(s) then enter (X ::= s) in row X, col T

    if s is Nullable and T ∈ Follow(X) then enter (X ::= s) in row X, col T


    Grammar:

        Z ::= X Y Z
        Z ::= d
        Y ::= c
        Y ::=
        X ::= a
        X ::= b Y e

    Computed Sets:

         nullable   first     follow
    Z    no         d,a,b     { }
    Y    yes        c         e,d,a,b
    X    no         a,b       c,d,a,b

    Build parsing table where row X, col T tells parser which clause to execute in function X with next-token T:

         a              b              c          d          e
    Z    Z ::= X Y Z    Z ::= X Y Z
    Y
    X

    if T ∈ First(s) then enter (X ::= s) in row X, col T

    if s is Nullable and T ∈ Follow(X) then enter (X ::= s) in row X, col T


    Grammar:

        Z ::= X Y Z
        Z ::= d
        Y ::= c
        Y ::=
        X ::= a
        X ::= b Y e

    Computed Sets:

         nullable   first     follow
    Z    no         d,a,b     { }
    Y    yes        c         e,d,a,b
    X    no         a,b       c,d,a,b

    Build parsing table where row X, col T tells parser which clause to execute in function X with next-token T:

         a              b              c          d          e
    Z    Z ::= X Y Z    Z ::= X Y Z               Z ::= d
    Y
    X

    if T ∈ First(s) then enter (X ::= s) in row X, col T

    if s is Nullable and T ∈ Follow(X) then enter (X ::= s) in row X, col T


    Grammar:

        Z ::= X Y Z
        Z ::= d
        Y ::= c
        Y ::=
        X ::= a
        X ::= b Y e

    Computed Sets:

         nullable   first     follow
    Z    no         d,a,b     { }
    Y    yes        c         e,d,a,b
    X    no         a,b       c,d,a,b

    Build parsing table where row X, col T tells parser which clause to execute in function X with next-token T:

         a              b              c          d          e
    Z    Z ::= X Y Z    Z ::= X Y Z               Z ::= d
    Y                                  Y ::= c
    X

    if T ∈ First(s) then enter (X ::= s) in row X, col T

    if s is Nullable and T ∈ Follow(X) then enter (X ::= s) in row X, col T


    Grammar:

        Z ::= X Y Z
        Z ::= d
        Y ::= c
        Y ::=
        X ::= a
        X ::= b Y e

    Computed Sets:

         nullable   first     follow
    Z    no         d,a,b     { }
    Y    yes        c         e,d,a,b
    X    no         a,b       c,d,a,b

    Build parsing table where row X, col T tells parser which clause to execute in function X with next-token T:

         a              b              c          d          e
    Z    Z ::= X Y Z    Z ::= X Y Z               Z ::= d
    Y    Y ::=          Y ::=          Y ::= c    Y ::=      Y ::=
    X

    if T ∈ First(s) then enter (X ::= s) in row X, col T

    if s is Nullable and T ∈ Follow(X) then enter (X ::= s) in row X, col T


    Grammar:

        Z ::= X Y Z
        Z ::= d
        Y ::= c
        Y ::=
        X ::= a
        X ::= b Y e

    Computed Sets:

         nullable   first     follow
    Z    no         d,a,b     { }
    Y    yes        c         e,d,a,b
    X    no         a,b       c,d,a,b

    Build parsing table where row X, col T tells parser which clause to execute in function X with next-token T:

         a              b              c          d          e
    Z    Z ::= X Y Z    Z ::= X Y Z               Z ::= d
    Y    Y ::=          Y ::=          Y ::= c    Y ::=      Y ::=
    X    X ::= a        X ::= b Y e

    if T ∈ First(s) then enter (X ::= s) in row X, col T

    if s is Nullable and T ∈ Follow(X) then enter (X ::= s) in row X, col T


    Grammar:

        Z ::= X Y Z
        Z ::= d
        Y ::= c
        Y ::=
        X ::= a
        X ::= b Y e

    Computed Sets:

         nullable   first     follow
    Z    no         d,a,b     { }
    Y    yes        c         e,d,a,b
    X    no         a,b       c,d,a,b

    What are the blanks?

         a              b              c          d          e
    Z    Z ::= X Y Z    Z ::= X Y Z               Z ::= d
    Y    Y ::=          Y ::=          Y ::= c    Y ::=      Y ::=
    X    X ::= a        X ::= b Y e


    Grammar:

        Z ::= X Y Z
        Z ::= d
        Y ::= c
        Y ::=
        X ::= a
        X ::= b Y e

    Computed Sets:

         nullable   first     follow
    Z    no         d,a,b     { }
    Y    yes        c         e,d,a,b
    X    no         a,b       c,d,a,b

    What are the blanks? --> syntax errors

         a              b              c          d          e
    Z    Z ::= X Y Z    Z ::= X Y Z               Z ::= d
    Y    Y ::=          Y ::=          Y ::= c    Y ::=      Y ::=
    X    X ::= a        X ::= b Y e


    Grammar:

        Z ::= X Y Z
        Z ::= d
        Y ::= c
        Y ::=
        X ::= a
        X ::= b Y e

    Computed Sets:

         nullable   first     follow
    Z    no         d,a,b     { }
    Y    yes        c         e,d,a,b
    X    no         a,b       c,d,a,b

    Is it possible to put 2 grammar rules in the same box?

         a              b              c          d          e
    Z    Z ::= X Y Z    Z ::= X Y Z               Z ::= d
    Y    Y ::=          Y ::=          Y ::= c    Y ::=      Y ::=
    X    X ::= a        X ::= b Y e


    Grammar:

        Z ::= X Y Z
        Z ::= d
        Z ::= d e
        Y ::= c
        Y ::=
        X ::= a
        X ::= b Y e

    Computed Sets:

         nullable   first     follow
    Z    no         d,a,b     { }
    Y    yes        c         e,d,a,b
    X    no         a,b       c,d,a,b

    Is it possible to put 2 grammar rules in the same box?

         a              b              c          d          e
    Z    Z ::= X Y Z    Z ::= X Y Z               Z ::= d
                                                  Z ::= d e
    Y    Y ::=          Y ::=          Y ::= c    Y ::=      Y ::=
    X    X ::= a        X ::= b Y e
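The two table-filling rules, and the check for a box holding two rules, can be sketched as below. This is a minimal Python sketch (not the course's ML code); `build_table`, `conflicts`, and the data layout are assumptions made for illustration, and the nullable, First and Follow values are copied from the computed sets above rather than recomputed. Adding Z ::= d e does not change those sets.

```python
from collections import defaultdict

NULLABLE = {"Y"}
FIRST = {"Z": {"a", "b", "d"}, "Y": {"c"}, "X": {"a", "b"}}
FOLLOW = {"Z": set(), "Y": {"a", "b", "d", "e"}, "X": {"a", "b", "c", "d"}}

LL1_GRAMMAR = [
    ("Z", ("X", "Y", "Z")),
    ("Z", ("d",)),
    ("Y", ("c",)),
    ("Y", ()),
    ("X", ("a",)),
    ("X", ("b", "Y", "e")),
]
NOT_LL1_GRAMMAR = LL1_GRAMMAR + [("Z", ("d", "e"))]


def first_of_string(symbols):
    """Return (First set of the string, whether the whole string is nullable)."""
    result = set()
    for sym in symbols:
        result |= FIRST.get(sym, {sym})  # a terminal's First set is itself
        if sym not in NULLABLE:
            return result, False
    return result, True


def build_table(grammar):
    """Map (non-terminal, terminal) to the list of right-hand sides entered there."""
    table = defaultdict(list)
    for lhs, rhs in grammar:
        columns, rhs_nullable = first_of_string(rhs)   # rule 1: T in First(s)
        if rhs_nullable:
            columns = columns | FOLLOW[lhs]            # rule 2: T in Follow(X)
        for terminal in columns:
            table[(lhs, terminal)].append(rhs)
    return table


def conflicts(table):
    """Return the cells that hold more than one rule."""
    return {cell: rules for cell, rules in table.items() if len(rules) > 1}


print(conflicts(build_table(LL1_GRAMMAR)))      # {}
print(conflicts(build_table(NOT_LL1_GRAMMAR)))  # {('Z', 'd'): [('d',), ('d', 'e')]}
```

Cells that never receive an entry are simply absent from the dictionary; a parser driven by this table would report a syntax error for them.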


    predictive parsing tables

    if a predictive parsing table constructed this way contains no duplicate entries, the grammar is called LL(1)

        Left-to-right parse, Left-most derivation, 1 symbol lookahead

    if not, the grammar is not LL(1)

    in an LL(k) parsing table, columns include every k-length sequence of terminals:

        aa ab ba bb ac ca ...
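To get a feel for how quickly the number of columns grows with k, here is a minimal Python sketch (not from the course). The helper name `lookahead_columns` and the three-terminal alphabet are assumptions made for illustration; the sketch only enumerates the k-length sequences mentioned above.

```python
from itertools import product


def lookahead_columns(terminals, k):
    """Return every length-k sequence of terminals, one string per column."""
    return ["".join(seq) for seq in product(terminals, repeat=k)]


cols = lookahead_columns(["a", "b", "c"], 2)
print(len(cols))  # 9
print(cols[:4])   # ['aa', 'ab', 'ac', 'ba']
```

With n terminals there are n^k such sequences, which is one reason k = 1 is the common case in practice.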


    another trick

    Previously, we saw that grammars with left-recursion were problematic, but could be transformed into LL(1) in some cases

    the example non-LL(1) grammar we just saw:

        Z ::= X Y Z
        Z ::= d
        Z ::= d e
        Y ::= c
        Y ::=
        X ::= a
        X ::= b Y e

    how do we fix it?


    another trick

    Previously, we saw that grammars with left-recursion were problematic, but could be transformed into LL(1) in some cases

    the example non-LL(1) grammar we just saw:

        Z ::= X Y Z
        Z ::= d
        Z ::= d e
        Y ::= c
        Y ::=
        X ::= a
        X ::= b Y e

    solution here is left-factoring:

        Z ::= X Y Z
        Z ::= d W
        Y ::= c
        Y ::=
        X ::= a
        X ::= b Y e
        W ::=
        W ::= e
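The transformation above can be sketched for the simple case where alternatives share a single leading symbol. This is a minimal Python sketch (not the course's ML code); `left_factor` and its return shape are assumptions made for illustration, and it factors only the first shared leading symbol it finds rather than the longest common prefix.

```python
def left_factor(alternatives, fresh):
    """Factor a shared leading symbol out of a list of right-hand sides.

    Returns (new_alternatives, tails). tails holds the right-hand sides for
    the fresh non-terminal, or None if no two alternatives share a leading symbol.
    """
    groups = {}
    for rhs in alternatives:
        if rhs:  # an empty right-hand side has no leading symbol
            groups.setdefault(rhs[0], []).append(rhs)
    shared = next((sym for sym, grp in groups.items() if len(grp) > 1), None)
    if shared is None:
        return list(alternatives), None

    tails = [rhs[1:] for rhs in groups[shared]]
    factored = []
    emitted = False
    for rhs in alternatives:
        if rhs and rhs[0] == shared:
            if not emitted:  # replace the whole group by one factored rule
                factored.append((shared, fresh))
                emitted = True
        else:
            factored.append(rhs)
    return factored, tails


z_rules = [("X", "Y", "Z"), ("d",), ("d", "e")]
new_z, w_rules = left_factor(z_rules, "W")
print(new_z)    # [('X', 'Y', 'Z'), ('d', 'W')]
print(w_rules)  # [(), ('e',)]
```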


    summary of RD parsing

    CFGs are good at specifying programming language structure

    parsing general CFGs is expensive, so we define parsers for simpler classes of CFG
        LL(k), LR(k)

    we can build a recursive descent parser for LL(k) grammars by:
        computing nullable, first and follow sets
        constructing a parse table from the sets
        checking for duplicate entries, which indicates failure
        creating an ML program from the parse table

    if parser construction fails we can
        rewrite the grammar (left factoring, eliminating left recursion) and try again
        try to build a parser using some other method


    summary of RD parsing

    CFGs are good at specifying programming language structure

    parsing general CFGs is expensive, so we define parsers for simpler classes of CFG
        LL(k), LR(k)

    we can build a recursive descent parser for LL(k) grammars by:
        computing nullable, first and follow sets
        constructing a parse table from the sets
        checking for duplicate entries, which indicates failure
        creating an ML program from the parse table

    if parser construction fails we can
        rewrite the grammar (left factoring, eliminating left recursion) and try again
        try to build a parser using some other method... such as using a bottom-up parsing technique


    Bottom-up (Shift-Reduce) Parsing


    shift-reduce parsing

    shift-reduce parsing
        aka: bottom-up parsing
        aka: LR(k)
            Left-to-right parse, Rightmost derivation, k-token lookahead
        more powerful than LL(k) parsers

    LALR variant:
        the basis for parsers for most modern programming languages
        implemented in tools such as ML-Yacc


    shift-reduce parsing example

    Grammar:

        A ::= S EOF
        L ::= L ; S
        L ::= S
        S ::= ( L )
        S ::= id = num

    Parsing Table


    shift-reduce parsing example

    Grammar:

        A ::= S EOF
        L ::= L ; S
        L ::= S
        S ::= ( L )
        S ::= id = num

    Input from lexer: ( id = num ; id = num ) EOF

    State of parse so far:
    yet to read: ( id = num ; id = num ) EOF

    Parsing Table


    shift-reduce parsing example

    Grammar:

        A ::= S EOF
        L ::= L ; S
        L ::= S
        S ::= ( L )
        S ::= id = num

    Input from lexer: ( id = num ; id = num ) EOF

    State of parse so far: (
    yet to read: id = num ; id = num ) EOF

    Parsing Table: SHIFT


    shift-reduce parsing example

    Grammar:

        A ::= S EOF
        L ::= L ; S
        L ::= S
        S ::= ( L )
        S ::= id = num

    Input from lexer: ( id = num ; id = num ) EOF

    State of parse so far: ( id
    yet to read: = num ; id = num ) EOF

    Parsing Table: SHIFT


    shift-reduce parsing example

    Grammar:

        A ::= S EOF
        L ::= L ; S
        L ::= S
        S ::= ( L )
        S ::= id = num

    Input from lexer: ( id = num ; id = num ) EOF

    State of parse so far: ( id =
    yet to read: num ; id = num ) EOF

    Parsing Table: SHIFT


    shift-reduce parsing example

    Grammar:

        A ::= S EOF
        L ::= L ; S
        L ::= S
        S ::= ( L )
        S ::= id = num

    Input from lexer: ( id = num ; id = num ) EOF

    State of parse so far: ( id = num
    yet to read: ; id = num ) EOF

    Parsing Table: SHIFT


    shift-reduce parsing example

    Grammar:

        A ::= S EOF
        L ::= L ; S
        L ::= S
        S ::= ( L )
        S ::= id = num

    Input from lexer: ( id = num ; id = num ) EOF

    State of parse so far: ( S
    yet to read: ; id = num ) EOF

    Parsing Table: REDUCE S ::= id = num


    shift-reduce parsing example

    Grammar:

        A ::= S EOF
        L ::= L ; S
        L ::= S
        S ::= ( L )
        S ::= id = num

    Input from lexer: ( id = num ; id = num ) EOF

    State of parse so far: ( L
    yet to read: ; id = num ) EOF

    Parsing Table: REDUCE L ::= S


    shift-reduce parsing example

    Grammar:

        A ::= S EOF
        L ::= L ; S
        L ::= S
        S ::= ( L )
        S ::= id = num

    Input from lexer: ( id = num ; id = num ) EOF

    State of parse so far: ( L ;
    yet to read: id = num ) EOF

    Parsing Table: SHIFT


    shift-reduce parsing example

    Grammar:

        A ::= S EOF
        L ::= L ; S
        L ::= S
        S ::= ( L )
        S ::= id = num

    Input from lexer: ( id = num ; id = num ) EOF

    State of parse so far: ( L ; id = num
    yet to read: ) EOF

    Parsing Table: SHIFT SHIFT SHIFT


    shift-reduce parsing example

    Grammar:

        A ::= S EOF
        L ::= L ; S
        L ::= S
        S ::= ( L )
        S ::= id = num

    Input from lexer: ( id = num ; id = num ) EOF

    State of parse so far: ( L ; S
    yet to read: ) EOF

    Parsing Table: REDUCE S ::= id = num


    shift-reduce parsing example

    Grammar:

        A ::= S EOF
        L ::= L ; S
        L ::= S
        S ::= ( L )
        S ::= id = num

    Input from lexer: ( id = num ; id = num ) EOF

    State of parse so far: ( L
    yet to read: ) EOF

    Parsing Table: REDUCE L ::= L ; S


    shift-reduce parsing example

    Grammar:

        A ::= S EOF
        L ::= L ; S
        L ::= S
        S ::= ( L )
        S ::= id = num

    Input from lexer: ( id = num ; id = num ) EOF

    State of parse so far: ( L )
    yet to read: EOF

    Parsing Table: SHIFT


    shift-reduce parsing example

    Grammar:

        A ::= S EOF
        L ::= L ; S
        L ::= S
        S ::= ( L )
        S ::= id = num

    Input from lexer: ( id = num ; id = num ) EOF

    State of parse so far: S
    yet to read: EOF

    Parsing Table: REDUCE S ::= ( L )


    shift-reduce parsing example

    Grammar:

        A ::= S EOF
        L ::= L ; S
        L ::= S
        S ::= ( L )
        S ::= id = num

    Input from lexer: ( id = num ; id = num ) EOF

    State of parse so far: A

    Parsing Table: SHIFT, then REDUCE A ::= S EOF, then ACCEPT


    shift-reduce parsing example

    Grammar:

        A ::= S EOF
        L ::= L ; S
        L ::= S
        S ::= ( L )
        S ::= id = num

    Input from lexer: ( id = num ; id = num ) EOF

    State of parse so far: A

    Parsing Table

    A successful parse! Is this grammar LL(1)?


    Shift-reduce algorithm

    Parser keeps track of
        position in current input (what input to read next)
        a stack of terminal & non-terminal symbols representing the parse so far

    Based on next input symbol & stack, parser table indicates
        shift: push next input on to top of stack
        reduce R:
            top of stack should match RHS of rule
            replace top of stack with LHS of rule
        error
        accept (we shift EOF & can reduce what remains on stack to start symbol)
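The stack operations behind shift and reduce can be sketched as follows. This is a minimal Python sketch (not the course's ML code and not what ML-Yacc generates); `run` and the action encoding are assumptions made for illustration. Instead of consulting a real parser table, it replays the sequence of actions shown in the worked example and checks that each reduce matches the top of the stack.

```python
TOKENS = ["(", "id", "=", "num", ";", "id", "=", "num", ")", "EOF"]

# Rules from the example grammar, written as (LHS, RHS).
S_ASSIGN = ("S", ["id", "=", "num"])
L_SINGLE = ("L", ["S"])
L_SEQ = ("L", ["L", ";", "S"])
S_PAREN = ("S", ["(", "L", ")"])
A_START = ("A", ["S", "EOF"])

# The action sequence from the worked example; a rule tuple means "reduce by it".
ACTIONS = (
    ["shift"] * 4 + [S_ASSIGN, L_SINGLE]
    + ["shift"] * 4 + [S_ASSIGN, L_SEQ]
    + ["shift", S_PAREN, "shift", A_START]
)


def run(tokens, actions):
    """Apply shift/reduce actions and return (final stack, tokens consumed)."""
    stack, pos = [], 0
    for action in actions:
        if action == "shift":
            stack.append(tokens[pos])  # push next input on to top of stack
            pos += 1
        else:
            lhs, rhs = action
            start = len(stack) - len(rhs)
            if start < 0 or stack[start:] != rhs:
                raise ValueError(f"top of stack does not match {rhs}")
            del stack[start:]          # pop the RHS ...
            stack.append(lhs)          # ... and push the LHS
    return stack, pos


print(run(TOKENS, ACTIONS))  # (['A'], 10)
```

Accept corresponds to ending with only the start symbol on the stack and all input consumed; error corresponds to the `ValueError` raised when a reduce does not match.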


    Shift-reduce algorithm (a detail)

    The parser summarizes the current parse state using an integer
        the integer is actually a state in a finite automaton
        the current parse state can be computed by running the automaton over the current parse stack

    Revised algorithm: Based on next input symbol & the parse state (as opposed to the entire stack), parser table indicates
        shift s:
            push next input on to top of stack and move automaton into state s
        reduce R & goto s:
            top of stack should match RHS of rule
            replace top of stack with LHS of rule
            move automaton into state s
        error
        accept


    shift-reduce parsing

    Grammar: ????

    Input from lexer: ???????? EOF

    State of parse so far: ????

    Like LL parsing, shift-reduce parsing does not always work. What sort of grammar rules make shift-reduce parsing impossible?


    shift-reduce parsing

    Grammar: ????

    Input from lexer: ???????? EOF

    State of parse so far: ????

    Like LL parsing, shift-reduce parsing does not always work.

    Shift-Reduce errors: can't decide whether to Shift or Reduce

    Reduce-Reduce errors: can't decide whether to Reduce by R1 or R2


    shift-reduce errors

    Grammar:

        A ::= S EOF
        S ::= S + S
        S ::= S * S
        S ::= id

    Input from lexer: ???????? EOF

    State of parse so far: ????


    shift-reduce errors

    Grammar:

        A ::= S EOF
        S ::= S + S
        S ::= S * S
        S ::= id

    Input from lexer: id + id * id EOF

    State of parse so far: S + S

    reduce by rule (S ::= S + S) or shift the * ???

    notice, this is an ambiguous grammar: we are always going to need some mechanism for resolving the outstanding ambiguity before parsing
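The choice between reducing and shifting is not cosmetic: it decides which parse tree is built. The following minimal Python sketch (not from the course) makes that visible by evaluating while it parses. The function `parse` and its `prefer_shift` flag are assumptions made for illustration, and numbers stand in for `id` tokens that have already been reduced to S.

```python
def parse(tokens, prefer_shift):
    """Shift-reduce over numbers, '+' and '*', resolving every conflict one way.

    A number on the stack plays the role of S. When the stack ends in
    S op S and more input remains, both shift and reduce are possible;
    prefer_shift picks between them.
    """
    stack, pos = [], 0
    while True:
        can_reduce = len(stack) >= 3 and stack[-2] in ("+", "*")
        at_end = pos == len(tokens)
        if can_reduce and (at_end or not prefer_shift):
            right = stack.pop()
            op = stack.pop()
            left = stack.pop()
            stack.append(left + right if op == "+" else left * right)
        elif not at_end:
            stack.append(tokens[pos])
            pos += 1
        else:
            return stack


tokens = [1, "+", 2, "*", 3]
print(parse(tokens, prefer_shift=False))  # [9]  reduce first: (1 + 2) * 3
print(parse(tokens, prefer_shift=True))   # [7]  shift first:  1 + (2 * 3)
```

Always preferring one action is a crude policy used only to expose the two outcomes; it is not a general way to resolve such conflicts.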


    shift-reduce errors

    Grammar:

        A ::= S id EOF
        S ::= E ;
        E ::= E ; E
        E ::= id

    Input from lexer: id ; id EOF

    State of parse so far: E ;

    reduce by rule (S ::= E ;) or shift the id

    input might be this, making shifting correct: id ; id ; id EOF

    some unambiguous grammars can't be parsed by LR(1) parsers either


    reduce-reduce errors

    Grammar:

        A ::= S EOF
        S ::= ( E )
        S ::= E
        E ::= ( E )
        E ::= E + E
        E ::= id

    Input from lexer: ( id ) EOF

    State of parse so far: ( E )

    reduce by rule ( S ::= ( E ) ) or reduce by rule ( E ::= ( E ) )


    Summary

    Top-down Parsing
        simple to understand and implement
        you can code it yourself using nullable, first, follow sets
        excellent for quick & dirty parsing jobs

    Bottom-up Parsing
        more complex: uses stack & table
        more powerful
        Bonus: tools do the work for you ==> ML-Yacc
            but you need to understand how shift-reduce & reduce-reduce errors can arise