Download - lesson10.ppt
![Page 1: lesson10.ppt](https://reader034.vdocuments.us/reader034/viewer/2022051518/5695d3e81a28ab9b029f9982/html5/thumbnails/1.jpg)
CPSC 388 – Compiler Design and Construction
Implementing a ParserLL(1) and LALR Grammars
FBI Noon Dining Hall Vicki Anderson Recruiter
![Page 2: lesson10.ppt](https://reader034.vdocuments.us/reader034/viewer/2022051518/5695d3e81a28ab9b029f9982/html5/thumbnails/2.jpg)
Announcements
PROG 3 out, due Oct 9th
Get started NOW! HW due Friday HW6 posted, due next Friday
![Page 3: lesson10.ppt](https://reader034.vdocuments.us/reader034/viewer/2022051518/5695d3e81a28ab9b029f9982/html5/thumbnails/3.jpg)
Parsing using CFGs Algorithms can parse using CFGs in O(n3) time (n is the
number of characters in input stream) – TOO SLOW Subclasses of grammars can be parsed in O(n) time
LL(1)1 token of look aheadDo a left most derivationScan input from left to right
LALR(1)one token of look-aheaddo a rightmost derivation in reversescan the input left-to-rightLA means "look-ahead“
(nothing to do with the number of tokens)
![Page 4: lesson10.ppt](https://reader034.vdocuments.us/reader034/viewer/2022051518/5695d3e81a28ab9b029f9982/html5/thumbnails/4.jpg)
LALR(1) More general than LL(1) grammars
(Every LL(1) grammar is a LALR(1) grammar but not vice versa)
Class of grammars used by java_cup, Bison, YACC
Parsed bottom up(start with non-terminals and build tree from leaves
up to root) Covered in text section 4.6-4.7 For class need to understand details of just
LL(1) grammars
![Page 5: lesson10.ppt](https://reader034.vdocuments.us/reader034/viewer/2022051518/5695d3e81a28ab9b029f9982/html5/thumbnails/5.jpg)
LL(1) Grammars – Predictive Parsers “build” parse tree top-down
actually discover tree top-down, don’t actually build it
Keep track of work to be done using a stack Scanned tokens along with stack
correspond to leaves of incomplete tree Use parse table to decide how to parse
input Rows are non-terminals Columns are tokens (plus EOF token) Cells are the bodies of production rules
![Page 6: lesson10.ppt](https://reader034.vdocuments.us/reader034/viewer/2022051518/5695d3e81a28ab9b029f9982/html5/thumbnails/6.jpg)
Predictive Parser Algorithms.push(EOF) // special EOF terminals.push(start) // start is start non-terminalx=s.peek()t=scanner.next_token()While (x != EOF):
if x==t:s.pop()t=scanner.next_token()
else: if x is terminal: errorelse: if table[x][t]==empty: errorelse:
let body=table[x][t] //body of productionoutput x→bodys.pop()s.push(…) //push body from right to left
x=s.peek()
![Page 7: lesson10.ppt](https://reader034.vdocuments.us/reader034/viewer/2022051518/5695d3e81a28ab9b029f9982/html5/thumbnails/7.jpg)
Example Parse using algorithm
Consider the language of balanced parentheses and brackets, e.g. ([])
Input String is “([])EOF” Grammar:
S → ε | ( S ) | [ S ] Parse Table:
( ) [ ] EOF
S (S) ε [S] ε ε
![Page 8: lesson10.ppt](https://reader034.vdocuments.us/reader034/viewer/2022051518/5695d3e81a28ab9b029f9982/html5/thumbnails/8.jpg)
Not All Grammars LL(1) Not all Grammars are LL(1):
S → ( S ) | [ S ] | ( ) | [ ] If input is ( don’t know which rule to
use! Try input “[[]]” to LL(1) grammar
using predictive parser Draw input seen so far Stack Action taken
![Page 9: lesson10.ppt](https://reader034.vdocuments.us/reader034/viewer/2022051518/5695d3e81a28ab9b029f9982/html5/thumbnails/9.jpg)
Is Grammar LL(1)
Given a grammar how do you tell if it is LL(1)?
How to build the parse table?
If parse table is built and only one entry per cell then LL(1)
![Page 10: lesson10.ppt](https://reader034.vdocuments.us/reader034/viewer/2022051518/5695d3e81a28ab9b029f9982/html5/thumbnails/10.jpg)
Non-LL(1) Grammars
If a grammar is left-recursive
If a grammar is not left-factored
It is sometimes possible to change a grammar to remove left-recursion and to make it left-factored
![Page 11: lesson10.ppt](https://reader034.vdocuments.us/reader034/viewer/2022051518/5695d3e81a28ab9b029f9982/html5/thumbnails/11.jpg)
Left-Recursion
Grammar g is recursive if there exists a production such that:
xx
xx
xx
*
*
*
Recursive
Left recursive
Right recursive
![Page 12: lesson10.ppt](https://reader034.vdocuments.us/reader034/viewer/2022051518/5695d3e81a28ab9b029f9982/html5/thumbnails/12.jpg)
Removing Immediate Left-Recursion Consider the grammar
A → Aα | β A is a nonterminal α a sequence of terminals and/or nonterminals β is a sequence of terminals and/or nonterminals
not starting with A Replace production with
A → β A’A’ → α A’ | ε
Two grammars are equivalent (recognize same set of input strings)
![Page 13: lesson10.ppt](https://reader034.vdocuments.us/reader034/viewer/2022051518/5695d3e81a28ab9b029f9982/html5/thumbnails/13.jpg)
You Try it Remove left recursion from the grammar:
exp → exp - factor | factor factor → INTLITERAL | ( exp )
Construct parse tree using original grammar and new grammar using input “5-3-2”
In general more difficult than this to remove left recursion, see text 4.3.3
![Page 14: lesson10.ppt](https://reader034.vdocuments.us/reader034/viewer/2022051518/5695d3e81a28ab9b029f9982/html5/thumbnails/14.jpg)
Left Factored
A grammar is NOT left-factored if a non-terminal has two productions whose bodies have common prefixesexp → ( exp ) | ( )
A top-down predictive parser would not know which production rule to use when seeing input character of “(“
![Page 15: lesson10.ppt](https://reader034.vdocuments.us/reader034/viewer/2022051518/5695d3e81a28ab9b029f9982/html5/thumbnails/15.jpg)
Left Factoring Given a pair of productions:
A → α β1 | α β2 α is sequence of terminals and non-terminals β1 and β2 are sequence of terminals and non-
terminals but don’t have common prefix (may be epsilon)
Change to:A → α A’A’ → β1 | β2
![Page 16: lesson10.ppt](https://reader034.vdocuments.us/reader034/viewer/2022051518/5695d3e81a28ab9b029f9982/html5/thumbnails/16.jpg)
Left Factoring Example
So for grammarexp → ( exp ) | ( )
It becomesexp → ( exp’exp’ → exp ) | )
![Page 17: lesson10.ppt](https://reader034.vdocuments.us/reader034/viewer/2022051518/5695d3e81a28ab9b029f9982/html5/thumbnails/17.jpg)
You Try It
Remove left recursion and do left factoring for grammarexp → ( exp ) | exp exp | ( )
![Page 18: lesson10.ppt](https://reader034.vdocuments.us/reader034/viewer/2022051518/5695d3e81a28ab9b029f9982/html5/thumbnails/18.jpg)
Building Parse Tables Recall a parse table
Every row is a non-terminal Every column is an input token Every cell contains a production body
If any cell contains more than one production body then grammar is not LL(1)
To build parse table need to have FIRST set and FOLLOW set
![Page 19: lesson10.ppt](https://reader034.vdocuments.us/reader034/viewer/2022051518/5695d3e81a28ab9b029f9982/html5/thumbnails/19.jpg)
FIRST set
FIRST(α)α is some sequence of terminals and non-
terminals
FIRST(α) is set of terminals that begin the strings derivable from α
if α can derive ε, then ε is in FIRST(α)
*αt
* tαttFIRST
and
and terminalis |)(
![Page 20: lesson10.ppt](https://reader034.vdocuments.us/reader034/viewer/2022051518/5695d3e81a28ab9b029f9982/html5/thumbnails/20.jpg)
FIRST(X) X is a single terminal, non-terminal or ε FIRST(X)={X} //X is terminal FIRST(X)={ε} //X is ε FIRST(X)=… //X is non-terminal
Look at all productions rules with X as head For each production rule, X →Y1,Y2,…Yn
Put FIRST(Y1) - {ε} into FIRST(X). If ε is in FIRST(Y1), then put FIRST(Y2) - {ε} into
FIRST(X). If ε is in FIRST(Y2), then put FIRST(Y3) - {ε} into
FIRST(X). etc... If ε is in FIRST(Yi) for 1 <= i <= n (all production right-
hand side
![Page 21: lesson10.ppt](https://reader034.vdocuments.us/reader034/viewer/2022051518/5695d3e81a28ab9b029f9982/html5/thumbnails/21.jpg)
Example FIRST Sets
Compute FIRST sets for each non-terminal:exp → term exp’exp’ → - term exp’ | εterm → factor term’term’ → / factor term’ | εfactor → INTLITERAL | ( exp ) {INTLITERAL, ( }
{ -, ε }
{ INTLITERAL, ( }
{ /, ε }
{ INTLITERAL, ( }
![Page 22: lesson10.ppt](https://reader034.vdocuments.us/reader034/viewer/2022051518/5695d3e81a28ab9b029f9982/html5/thumbnails/22.jpg)
FIRST(α) for any α
α is of the form X1, X2, …, Xn
Where each X is a terminal, non-terminal or ε
1. Put FIRST(X1) - {ε} into FIRST(α)
2. If epsilon is in FIRST(X1) put FIRST(X2) into FIRST(α).
3. etc... 4. If ε is in the FIRST set for every Xn,
put ε into FIRST(α).
![Page 23: lesson10.ppt](https://reader034.vdocuments.us/reader034/viewer/2022051518/5695d3e81a28ab9b029f9982/html5/thumbnails/23.jpg)
Example FIRST sets for rules
FIRST( term exp' ) = { INTLITERAL, ( }FIRST( - term exp' ) = { - }FIRST(ε ) = {ε }FIRST( factor term' ) = { INTLITERAL,
( }FIRST( / factor term' ) = { / }FIRST(ε ) = {ε }FIRST( INTLITERAL ) = { INTLITERAL } FIRST( ( exp ) ) = { ( }
![Page 24: lesson10.ppt](https://reader034.vdocuments.us/reader034/viewer/2022051518/5695d3e81a28ab9b029f9982/html5/thumbnails/24.jpg)
Why Do We Care about FIRST(α)?
During parsing, suppose the top-of-stack symbol is nonterminal A, that there are two productions: A → α A → β
And that the current token is x If x is in FIRST(α) then use first production If x is in FIRST(β) then use second
production
![Page 25: lesson10.ppt](https://reader034.vdocuments.us/reader034/viewer/2022051518/5695d3e81a28ab9b029f9982/html5/thumbnails/25.jpg)
FOLLOW(A) sets
Only defined for singlenon-terminals, A
the set of terminals that can appear immediately to the right of A (may include EOF but never ε)
![Page 26: lesson10.ppt](https://reader034.vdocuments.us/reader034/viewer/2022051518/5695d3e81a28ab9b029f9982/html5/thumbnails/26.jpg)
Calculating FOLLOW(A)
If A is start non-terminal put EOF in FOLLOW(A)
Find productions with A in body: For each production X → α A β
put FIRST(β) – {ε} in FOLLOW(A) If ε in FIRST(β) put FOLLOW(X) into
FOLLOW(A) For each production X → α A
put FOLLOW(X) into FOLLOW(A)
![Page 27: lesson10.ppt](https://reader034.vdocuments.us/reader034/viewer/2022051518/5695d3e81a28ab9b029f9982/html5/thumbnails/27.jpg)
FIRST and FOLLOW sets To compute FIRST(A) you must look for A on
a production's left-hand side. To compute FOLLOW(A) you must look for A
on a production's right-hand side. FIRST and FOLLOW sets are always sets of
terminals (plus, perhaps, ε for FIRST sets, and EOF for follow sets).
Nonterminals are never in a FIRST or a FOLLOW set.
![Page 28: lesson10.ppt](https://reader034.vdocuments.us/reader034/viewer/2022051518/5695d3e81a28ab9b029f9982/html5/thumbnails/28.jpg)
Example FOLLOW setsCAPS are non-terminals and lower-case are terminalsS → B c | D BB → a b | c SD → d | ε
X FIRST(X) FOLLOW(X)-------------------------------------------D { d, ε } { a, c }B { a, c } { c, EOF }S { a, c, d } { EOF, c }Note: FOLLOW of S always includes EOF
![Page 29: lesson10.ppt](https://reader034.vdocuments.us/reader034/viewer/2022051518/5695d3e81a28ab9b029f9982/html5/thumbnails/29.jpg)
You Try It
Computer FIRST and FOLLOW sets for:methodHeader → VOID ID LPAREN paramList RPAREN
paramList → epsilon
paramList → nonEmptyParamList
nonEmptyParamList → ID ID
nonEmptyParamList → ID ID COMMA nonEmptyParamList
Remember you need FIRST and FOLLOW sets for all non-terminals and FIRST sets for all bodies of rules
![Page 30: lesson10.ppt](https://reader034.vdocuments.us/reader034/viewer/2022051518/5695d3e81a28ab9b029f9982/html5/thumbnails/30.jpg)
Parse Table
a b c d
S
A
X
R
Non-terminals
CurrentToken
Rule bodies
![Page 31: lesson10.ppt](https://reader034.vdocuments.us/reader034/viewer/2022051518/5695d3e81a28ab9b029f9982/html5/thumbnails/31.jpg)
Parse Table Construction Algorithm
for each production X → α:for each terminal t in First(α):
put α in Table[X,t]
if ε is in First(α) then: for each terminal t in
Follow(X): put α in Table[X,t]
![Page 32: lesson10.ppt](https://reader034.vdocuments.us/reader034/viewer/2022051518/5695d3e81a28ab9b029f9982/html5/thumbnails/32.jpg)
Example Parse Table Construction
S → B c | D BB → a b | c SD → d | εFor this grammar: Construct FIRST and FOLLOW Sets Apply algorithm to calculate parse
table
![Page 33: lesson10.ppt](https://reader034.vdocuments.us/reader034/viewer/2022051518/5695d3e81a28ab9b029f9982/html5/thumbnails/33.jpg)
Example Parse Table Construction
X FIRST(X) FOLLOW(X)---------------------------------------------------D { d, ε } { a, c }B { a, c } { c, EOF }S { a, c, d } { EOF, c }Bc { a, c }DB { d, a, c }ab { a }cS { c }D { d }Ε {ε }
![Page 34: lesson10.ppt](https://reader034.vdocuments.us/reader034/viewer/2022051518/5695d3e81a28ab9b029f9982/html5/thumbnails/34.jpg)
Parse Table
a b c d EOF
S BcDB
BcDB
DB
B
D ε ε
Finish Filling In Table
![Page 35: lesson10.ppt](https://reader034.vdocuments.us/reader034/viewer/2022051518/5695d3e81a28ab9b029f9982/html5/thumbnails/35.jpg)
Predictive Parser Algorithms.push(EOF) // special EOF terminals.push(start) // start is start non-terminalx=s.peek()t=scanner.next_token()While (x != EOF):
if x==t:s.pop()t=scanner.next_token()
else: if x is terminal: errorelse: if table[x][t]==empty: errorelse:
let body=table[x][t] //body of productionoutput x→bodys.pop()s.push(…) //push body from right to left
x=s.peek()