5-parserconstruction
TRANSCRIPT
-
8/4/2019 5-parserconstruction
1/58
COS 320
Compilers
David Walker
-
8/4/2019 5-parserconstruction
2/58
last time
context free grammars (Appel 3.1)
terminals, non-terminals, rules
derivations & parse trees
ambiguous grammars
recursive descent parsers (Appel 3.2)
parse LL(k) grammars
easy to write as ML programs
algorithms for automatic construction from a CFG
-
8/4/2019 5-parserconstruction
3/58
1. S ::= IF E THEN S ELSE S
2. | BEGIN S L3. | PRINT E
4. L ::= END
5. | ; S L6. E ::= NUM = NUM
non-terminals: S, E, Lterminals: NUM, IF, THEN, ELSE, BEGIN, END, PRINT, ;, =rules:
fun S () = case !tok ofIF => eat IF; E (); eat THEN; S (); eat ELSE; S ()
| BEGIN => eat BEGIN; S (); L ()
| PRINT => eat PRINT; E ()
and L () = case !tok ofEND => eat END
| SEMI => eat SEMI; S (); L ()
and E () = eat NUM; eat EQ; eat NUM
val tok = ref (getToken ())fun advance () = tok := getToken ()fun eat t = if (! tok = t) then advance () else error ()
datatype token = NUM | IF| THEN | ELSE | BEGIN | END| PRINT | SEMI | EQ
-
8/4/2019 5-parserconstruction
4/58
Constructing RD Parsers
To construct an RD parser, we need to knowwhat rule to apply when
we have seen a non terminal X
we see the next terminal a in input
We apply rule X ::= s when
a is the first symbol that can be generated by string s,OR
s reduces to the empty string (is nullable) and a is thefirst symbol in any string that can followX
-
8/4/2019 5-parserconstruction
5/58
Computing Nullable Sets
Non-terminal X is Nullable only if thefollowing constraints are satisfied(computed using iterative analysis)
base case:
if (X := ) then X is Nullable
inductive case:
if (X := ABC...) and A, B, C, ... are all Nullable thenX is Nullable
-
8/4/2019 5-parserconstruction
6/58
Computing First Sets
First(X) is computed iteratively
base case:
if T is a terminal symbol then First (T) = {T}
inductive case:
if X is a non-terminal and (X:= ABC...) then
First (X) = First (X) U First (ABC...)
where First(ABC...) = F1 U F2 U F3 U ... and
F1 = First (A)
F2 = First (B), if A is Nullable
F3 = First (C), if A is Nullable & B is Nullable
...
-
8/4/2019 5-parserconstruction
7/58
Computing Follow Sets
Follow(X) is computed iteratively
base case:
initially, we assume nothing in particular follows X
(Follow (X) is initially { })
inductive case:
if (Y := s1 X s2) for any strings s1, s2 then
Follow (X) = First (s2) U Follow (X)
if (Y := s1 X s2) for any strings s1, s2 then
Follow (X) = Follow(Y) U Follow (X), if s2 is Nullable
-
8/4/2019 5-parserconstruction
8/58
building a predictive parser
Z ::= X Y ZZ ::= d
Y ::= cY ::=
X ::= aX ::= b Y e
nullable first follow
Z
YX
-
8/4/2019 5-parserconstruction
9/58
building a predictive parser
Z ::= X Y ZZ ::= d
Y ::= cY ::=
X ::= aX ::= b Y e
nullable first follow
Z no
Y yesX no
base case
-
8/4/2019 5-parserconstruction
10/58
building a predictive parser
Z ::= X Y ZZ ::= d
Y ::= cY ::=
X ::= aX ::= b Y e
nullable first follow
Z no
Y yesX no
after one round of induction, we realize we have reached a fixed point
-
8/4/2019 5-parserconstruction
11/58
building a predictive parser
Z ::= X Y ZZ ::= d
Y ::= cY ::=
X ::= aX ::= b Y e
nullable first follow
Z no d
Y yes cX no a,b
base case
-
8/4/2019 5-parserconstruction
12/58
building a predictive parser
Z ::= X Y ZZ ::= d
Y ::= cY ::=
X ::= aX ::= b Y e
nullable first follow
Z no d,a,b
Y yes cX no a,b
after one round of induction, no fixed point
-
8/4/2019 5-parserconstruction
13/58
building a predictive parser
Z ::= X Y ZZ ::= d
Y ::= cY ::=
X ::= aX ::= b Y e
nullable first follow
Z no d,a,b
Y yes cX no a,b
after two rounds of induction, no more changes ==> fixed point
-
8/4/2019 5-parserconstruction
14/58
building a predictive parser
Z ::= X Y ZZ ::= d
Y ::= cY ::=
X ::= aX ::= b Y e
nullable first follow
Z no d,a,b { }
Y yes c { }X no a,b { }
base case
-
8/4/2019 5-parserconstruction
15/58
building a predictive parser
Z ::= X Y ZZ ::= d
Y ::= cY ::=
X ::= aX ::= b Y e
nullable first follow
Z no d,a,b { }
Y yes c e,d,a,bX no a,b c,d,a,b
after one round of induction, no fixed point
-
8/4/2019 5-parserconstruction
16/58
building a predictive parser
Z ::= X Y ZZ ::= d
Y ::= cY ::=
X ::= aX ::= b Y e
nullable first follow
Z no d,a,b { }
Y yes c e,d,a,bX no a,b c,d,a,b
after two rounds of induction, fixed point
(but notice, computing Follow(X) before Follow (Y) would have required 3rd round)
-
8/4/2019 5-parserconstruction
17/58
Z ::= X Y ZZ ::= d
Y ::= cY ::=
X ::= aX ::= b Y e
nullable first follow
Z no d,a,b { }
Y yes c e,d,a,b
X no a,b c,d,a,b
Grammar: Computed Sets:
Build parsing table where row X, col Ttells parser which clause to execute infunction X with next-token T:
a b c d e
Z
Y
X
if T First(s) thenenter (X ::= s) in row X, col T
if s is Nullable and T Follow(X)enter (X ::= s) in row X, col T
-
8/4/2019 5-parserconstruction
18/58
Z ::= X Y ZZ ::= d
Y ::= cY ::=
X ::= aX ::= b Y e
nullable first follow
Z no d,a,b { }
Y yes c e,d,a,b
X no a,b c,e,d,a,b
Grammar: Computed Sets:
Build parsing table where row X, col Ttells parser which clause to execute infunction X with next-token T:
a b c d e
Z Z ::= XYZ Z ::= XYZ
Y
X
if T First(s) thenenter (X ::= s) in row X, col T
if s is Nullable and T Follow(X)enter (X ::= s) in row X, col T
-
8/4/2019 5-parserconstruction
19/58
Z ::= X Y ZZ ::= d
Y ::= cY ::=
X ::= aX ::= b Y e
nullable first follow
Z no d,a,b { }
Y yes c e,d,a,b
X no a,b c,e,d,a,b
Grammar: Computed Sets:
Build parsing table where row X, col Ttells parser which clause to execute infunction X with next-token T:
a b c d e
Z Z ::= XYZ Z ::= XYZ Z ::= d
Y
X
if T First(s) thenenter (X ::= s) in row X, col T
if s is Nullable and T Follow(X)enter (X ::= s) in row X, col T
-
8/4/2019 5-parserconstruction
20/58
Z ::= X Y ZZ ::= d
Y ::= cY ::=
X ::= aX ::= b Y e
nullable first follow
Z no d,a,b { }
Y yes c e,d,a,b
X no a,b c,e,d,a,b
Grammar: Computed Sets:
Build parsing table where row X, col Ttells parser which clause to execute infunction X with next-token T:
a b c d e
Z Z ::= XYZ Z ::= XYZ Z ::= d
Y Y ::= c
X
if T First(s) thenenter (X ::= s) in row X, col T
if s is Nullable and T Follow(X)enter (X ::= s) in row X, col T
-
8/4/2019 5-parserconstruction
21/58
Z ::= X Y ZZ ::= d
Y ::= cY ::=
X ::= aX ::= b Y e
nullable first follow
Z no d,a,b { }
Y yes c e,d,a,b
X no a,b c,e,d,a,b
Grammar: Computed Sets:
Build parsing table where row X, col Ttells parser which clause to execute infunction X with next-token T:
a b c d e
Z Z ::= XYZ Z ::= XYZ Z ::= d
Y Y ::= Y ::= Y ::= c Y ::= Y ::=
X
if T First(s) thenenter (X ::= s) in row X, col T
if s is Nullable and T Follow(X)enter (X ::= s) in row X, col T
-
8/4/2019 5-parserconstruction
22/58
Z ::= X Y ZZ ::= d
Y ::= cY ::=
X ::= aX ::= b Y e
nullable first follow
Z no d,a,b { }
Y yes c e,d,a,b
X no a,b c,e,d,a,b
Grammar: Computed Sets:
Build parsing table where row X, col Ttells parser which clause to execute infunction X with next-token T:
a b c d e
Z Z ::= XYZ Z ::= XYZ Z ::= d
Y Y ::= Y ::= Y ::= c Y ::= Y ::=
X X ::= a X ::= b Y e
if T First(s) thenenter (X ::= s) in row X, col T
if s is Nullable and T Follow(X)enter (X ::= s) in row X, col T
-
8/4/2019 5-parserconstruction
23/58
Z ::= X Y ZZ ::= d
Y ::= cY ::=
X ::= aX ::= b Y e
nullable first follow
Z no d,a,b { }
Y yes c e,d,a,b
X no a,b c,e,d,a,b
Grammar: Computed Sets:
What are the blanks?
a b c d e
Z Z ::= XYZ Z ::= XYZ Z ::= d
Y Y ::= Y ::= Y ::= c Y ::= Y ::=
X X ::= a X ::= b Y e
-
8/4/2019 5-parserconstruction
24/58
Z ::= X Y ZZ ::= d
Y ::= cY ::=
X ::= aX ::= b Y e
nullable first follow
Z no d,a,b { }
Y yes c e,d,a,b
X no a,b c,e,d,a,b
Grammar: Computed Sets:
What are the blanks? --> syntax errors
a b c d e
Z Z ::= XYZ Z ::= XYZ Z ::= d
Y Y ::= Y ::= Y ::= c Y ::= Y ::=
X X ::= a X ::= b Y e
-
8/4/2019 5-parserconstruction
25/58
Z ::= X Y ZZ ::= d
Y ::= cY ::=
X ::= aX ::= b Y e
nullable first follow
Z no d,a,b { }
Y yes c e,d,a,b
X no a,b c,e,d,a,b
Grammar: Computed Sets:
Is it possible to put 2 grammar rules in the same box?
a b c d e
Z Z ::= XYZ Z ::= XYZ Z ::= d
Y Y ::= Y ::= Y ::= c Y ::= Y ::=
X X ::= a X ::= b Y e
-
8/4/2019 5-parserconstruction
26/58
Z ::= X Y ZZ ::= dZ ::= d e
Y ::= cY ::=
X ::= aX ::= b Y e
nullable first follow
Z no d,a,b { }
Y yes c e,d,a,b
X no a,b c,e,d,a,b
Grammar: Computed Sets:
Is it possible to put 2 grammar rules in the same box?
a b c d e
Z Z ::= XYZ Z ::= XYZ Z ::= d
Z ::= d e
Y Y ::= Y ::= Y ::= c Y ::= Y ::=
X X ::= a X ::= b Y e
-
8/4/2019 5-parserconstruction
27/58
predictive parsing tables
if a predictive parsing table constructed this waycontains no duplicate entries, the grammar iscalled LL(1)
Left-to-right parse, Left-most derivation, 1 symbollookahead
if not, of the grammar is not LL(1)
in LL(k) parsing table, columns include every k-
length sequence of terminals:
aa ab ba bb ac ca ...
-
8/4/2019 5-parserconstruction
28/58
another trick
Previously, we saw that grammars withleft-recursion were problematic, but couldbe transformed into LL(1) in some cases
the example non-LL(1) grammar we justsaw:
how do we fix it?
Z ::= X Y ZZ ::= dZ ::= d e
Y ::= cY ::=
X ::= aX ::= b Y e
-
8/4/2019 5-parserconstruction
29/58
another trick
Previously, we saw that grammars withleft-recursion were problematic, but couldbe transformed into LL(1) in some cases
the example non-LL(1) grammar we justsaw:
solution here is left-factoring:
Z ::= X Y ZZ ::= dZ ::= d e
Y ::= cY ::=
X ::= aX ::= b Y e
Z ::= X Y ZZ ::= d W Y ::= c
Y ::=X ::= aX ::= b Y e
W ::=W ::= e
-
8/4/2019 5-parserconstruction
30/58
summary of RD parsing
CFGs are good at specifying programming language structure
parsing general CFGs is expensive so we define parsers forsimpler classes of CFG LL(k), LR(k)
we can build a recursive descent parser for LL(k) grammars by: computing nullable, first and follow sets
constructing a parse table from the sets
checking for duplicate entries, which indicates failure
creating an ML program from the parse table
if parser construction fails we can rewrite the grammar (left factoring, eliminating left recursion) and try
again
try to build a parser using some other method
-
8/4/2019 5-parserconstruction
31/58
summary of RD parsing
CFGs are good at specifying programming language structure
parsing general CFGs is expensive so we define parsers forsimpler classes of CFG LL(k), LR(k)
we can build a recursive descent parser for LL(k) grammars by: computing nullable, first and follow sets
constructing a parse table from the sets
checking for duplicate entries, which indicates failure
creating an ML program from the parse table
if parser construction fails we can rewrite the grammar (left factoring, eliminating left recursion) and try
again
try to build a parser using some other method...such as using a bottom-up parsing technique
-
8/4/2019 5-parserconstruction
32/58
Bottom-up (Shift-Reduce) Parsing
-
8/4/2019 5-parserconstruction
33/58
shift-reduce parsing
shift-reduce parsing aka: bottom-up parsing
aka: LR(k) Left-to-right parse, Rightmost
derivation, k-token lookahead more powerful than LL(k) parsers
LALR variant:
the basis for parsers for most modernprogramming languages
implemented in tools such as ML-Yacc
-
8/4/2019 5-parserconstruction
34/58
shift-reduce parsing example
A ::= S EOF L ::= L ; SL ::= S
Grammar:
S ::= ( L )S ::= id = num
Parsing Table
-
8/4/2019 5-parserconstruction
35/58
State of parse so far:
( id = num ; id = num ) EOF
shift-reduce parsing example
A ::= S EOF L ::= L ; SL ::= S
Grammar:
S ::= ( L )S ::= id = num
Input from lexer:
yet to read
Parsing Table
-
8/4/2019 5-parserconstruction
36/58
State of parse so far:
( id = num ; id = num ) EOF
shift-reduce parsing example
A ::= S EOF L ::= L ; SL ::= S
Grammar:
S ::= ( L )S ::= id = num
Input from lexer:
(
yet to read
Parsing Table
SHIFT
-
8/4/2019 5-parserconstruction
37/58
State of parse so far:
( id = num ; id = num ) EOF
shift-reduce parsing example
A ::= S EOF L ::= L ; SL ::= S
Grammar:
S ::= ( L )S ::= id = num
Input from lexer:
( id
yet to read
Parsing Table
SHIFT
-
8/4/2019 5-parserconstruction
38/58
State of parse so far:
( id = num ; id = num ) EOF
shift-reduce parsing example
A ::= S EOF L ::= L ; SL ::= S
Grammar:
S ::= ( L )S ::= id = num
Input from lexer:
( id =
yet to read
Parsing Table
SHIFT
-
8/4/2019 5-parserconstruction
39/58
State of parse so far:
( id = num ; id = num ) EOF
shift-reduce parsing example
A ::= S EOF L ::= L ; SL ::= S
Grammar:
S ::= ( L )S ::= id = num
Input from lexer:
( id = num
yet to read
Parsing Table
SHIFT
-
8/4/2019 5-parserconstruction
40/58
State of parse so far:
( id = num ; id = num ) EOF
shift-reduce parsing example
A ::= S EOF L ::= L ; SL ::= S
Grammar:
S ::= ( L )S ::= id = num
Input from lexer:
( S
yet to read
Parsing Table
REDUCE
S ::= id = num
-
8/4/2019 5-parserconstruction
41/58
State of parse so far:
( id = num ; id = num ) EOF
shift-reduce parsing example
A ::= S EOF L ::= L ; SL ::= S
Grammar:
S ::= ( L )S ::= id = num
Input from lexer:
( L
yet to read
Parsing Table
REDUCE
L ::= S
-
8/4/2019 5-parserconstruction
42/58
State of parse so far:
( id = num ; id = num ) EOF
shift-reduce parsing example
A ::= S EOF L ::= L ; SL ::= S
Grammar:
S ::= ( L )S ::= id = num
Input from lexer:
( L ;
yet to read
Parsing Table
SHIFT
-
8/4/2019 5-parserconstruction
43/58
State of parse so far:
( id = num ; id = num ) EOF
shift-reduce parsing example
A ::= S EOF L ::= L ; SL ::= S
Grammar:
S ::= ( L )S ::= id = num
Input from lexer:
( L ; id = num
yet to read
Parsing Table
SHIFTSHIFTSHIFT
-
8/4/2019 5-parserconstruction
44/58
State of parse so far:
( id = num ; id = num ) EOF
shift-reduce parsing example
A ::= S EOF L ::= L ; SL ::= S
Grammar:
S ::= ( L )S ::= id = num
Input from lexer:
( L ; S
yet to read
Parsing Table
REDUCES ::= id = num
-
8/4/2019 5-parserconstruction
45/58
State of parse so far:
( id = num ; id = num ) EOF
shift-reduce parsing example
A ::= S EOF L ::= L ; SL ::= S
Grammar:
S ::= ( L )S ::= id = num
Input from lexer:
( L
yet to read
Parsing Table
REDUCES ::= L ; S
-
8/4/2019 5-parserconstruction
46/58
State of parse so far:
( id = num ; id = num ) EOF
shift-reduce parsing example
A ::= S EOF L ::= L ; SL ::= S
Grammar:
S ::= ( L )S ::= id = num
Input from lexer:
( L )
yet to read
Parsing Table
SHIFT
-
8/4/2019 5-parserconstruction
47/58
State of parse so far:
( id = num ; id = num ) EOF
shift-reduce parsing example
A ::= S EOF L ::= L ; SL ::= S
Grammar:
S ::= ( L )S ::= id = num
Input from lexer:
S
yet to read
Parsing Table
REDUCES ::= ( L )
-
8/4/2019 5-parserconstruction
48/58
State of parse so far:
( id = num ; id = num ) EOF
shift-reduce parsing example
A ::= S EOF L ::= L ; SL ::= S
Grammar:
S ::= ( L )S ::= id = num
Input from lexer:
A
Parsing Table
SHIFT
REDUCEA ::= S EOF
ACCEPT
-
8/4/2019 5-parserconstruction
49/58
State of parse so far:
( id = num ; id = num ) EOF
shift-reduce parsing example
A ::= S EOF L ::= L ; SL ::= S
Grammar:
S ::= ( L )S ::= id = num
Input from lexer:
A
Parsing Table
A successful parse! Is this grammar LL(1)?
-
8/4/2019 5-parserconstruction
50/58
Shift-reduce algorithm
Parser keeps track of position in current input (what input to read next)
a stack of terminal & non-terminal symbols representing theparse so far
Based on next input symbol & stack, parser tableindicates shift: push next input on to top of stack
reduce R:
top of stack should match RHS of rule
replace top of stack with LHS of rule error
accept (we shift EOF & can reduce what remains on stack tostart symbol)
-
8/4/2019 5-parserconstruction
51/58
Shift-reduce algorithm (a detail)
The parser summarizes the current parse state using aninteger the integer is actually a state in a finite automaton the current parse state can be computed by running the automaton over
the current parse stack
Revised algorithm: Based on next input symbol & the parsestate (as opposed to the entire stack), parser table indicates shift s:
push next input on to top of stack and move automaton into state s
reduce R & goto s: top of stack should match RHS of rule replace top of stack with LHS of rule move automaton into state s
error accept
-
8/4/2019 5-parserconstruction
52/58
State of parse so far:
???????? EOF
shift-reduce parsingGrammar:
Input from lexer:
????
Like LL parsing, shift-reduce parsing does not always work.What sort of grammar rules make shift-reduce parsing impossible?
????
-
8/4/2019 5-parserconstruction
53/58
State of parse so far:
???????? EOF
shift-reduce parsingGrammar:
Input from lexer:
????
Like LL parsing, shift-reduce parsing does not always work.
Shift-Reduce errors: cant decide whether to Shift or Reduce
Reduce-Reduce errors: cant decide whether to Reduce by R1 or R2
????
-
8/4/2019 5-parserconstruction
54/58
State of parse so far:
shift-reduce errors
A ::= S EOF
Grammar:
S ::= S + SS ::= S * SS ::= id
Input from lexer: ???????? EOF
????
-
8/4/2019 5-parserconstruction
55/58
State of parse so far:
id + id * id EOF
shift-reduce errors
A ::= S EOF
Grammar:
S ::= S + SS ::= S * SS ::= id
Input from lexer:
S + S
reduce by rule (S ::= S + S) or
shift the * ???
notice, this is an ambiguousgrammar we are always going toneed some mechanism for resolvingthe outstanding ambiguity before parsing
-
8/4/2019 5-parserconstruction
56/58
State of parse so far:
shift-reduce errors
A ::= S id EOF
Grammar:
E ::= E ; EE ::= id
Input from lexer: id ; id EOF
E ;
S ::= E ;
reduce by rule (S ::= E ;) or
shift the id
id ; id ; id EOF
input might be this,making shifting
correct
some unambiguousgrammars cant be
parsed by LR(1)parsers either
-
8/4/2019 5-parserconstruction
57/58
State of parse so far:
( id) EOF
reduce-reduce errors
A ::= S EOF
Grammar:
S ::= ( E )S ::= E
Input from lexer:
( E )
reduce by rule ( S ::= ( E ) ) or
reduce by rule ( E ::= ( E ) )
E ::= ( E )E ::= E + EE ::= id
-
8/4/2019 5-parserconstruction
58/58
Summary
Top-down Parsing simple to understand and implement
you can code it yourself using nullable, first, followsets
excellent for quick & dirty parsing jobs
Bottom-up Parsing more complex: uses stack & table
more powerful
Bonus: tools do the work for you ==> ML-Yacc but you need to understand how shift-reduce & reduce-
reduce errors can arise