– 1 – CSCE 531 Spring 2006
Lecture 7 Predictive Parsing
Topics: Review, Top-Down Parsing, FIRST, FOLLOW, LL(1) table construction
Readings: 4.4
Homework: Program 2
February 1, 2004
CSCE 531 Compiler Construction
– 2 – CSCE 531 Spring 2006
Overview
Last Time
  Ambiguity in classic programming language grammars: expressions, if-then-else
  Top-down parsing
  Modifying grammars to facilitate top-down parsing
Today’s Lecture
  Regroup halfway to Test 1
  FIRST and FOLLOW
  LL(1) property
References:
Homework:
– 3 – CSCE 531 Spring 2006
Removing the IF-ELSE Ambiguity
Ambiguous grammar:
Stmt → if Expr then Stmt | if Expr then Stmt else Stmt | other stmts
Rewritten grammar, in which each else is matched with the closest unmatched then:
Stmt → MatchedStmt | UnmatchedStmt
MatchedStmt → if Expr then MatchedStmt else MatchedStmt
  | OtherStatements
UnmatchedStmt → if Expr then Stmt
  | if Expr then MatchedStmt else UnmatchedStmt
– 4 – CSCE 531 Spring 2006
Recursive Descent Parsers
Recall the parser from Chapter 2.
A recursive descent parser has a routine for each nonterminal. These routines can call each other. If one of these fails, the parser may backtrack to a point where there is an alternative choice.
In certain cases the grammar is restricted enough that backtracking is never required.
Such a parser is called a predictive parser.
The parser from Chapter 2 is a predictive parser.
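To make the idea concrete, here is a minimal sketch of a predictive recursive descent recognizer, written in Python for the expression grammar used later in these notes (E → T E’, and so on). It is not the Chapter 2 parser; the names parse_expression, ParseError, peek and match are made up for illustration, and tokens are assumed to arrive as a list of strings such as "id" or "+".

```python
class ParseError(Exception):
    """Raised when the input is not in the language (hypothetical helper)."""


def parse_expression(tokens):
    """Predictive recursive-descent recognizer for
         E  -> T E'          E' -> + T E' | - T E' | epsilon
         T  -> F T'          T' -> * F T' | / F T' | epsilon
         F  -> id | num | ( E )
    One routine per nonterminal; one token of lookahead picks the production,
    so no backtracking is ever needed."""
    toks = list(tokens) + ["$"]   # $ marks the end of the input
    pos = 0

    def peek():
        return toks[pos]

    def match(expected):
        nonlocal pos
        if toks[pos] != expected:
            raise ParseError(f"expected {expected!r}, got {toks[pos]!r}")
        pos += 1

    def E():
        T()
        E_prime()

    def E_prime():
        if peek() in ("+", "-"):
            match(peek())
            T()
            E_prime()
        # otherwise choose E' -> epsilon and consume nothing

    def T():
        F()
        T_prime()

    def T_prime():
        if peek() in ("*", "/"):
            match(peek())
            F()
            T_prime()
        # otherwise choose T' -> epsilon

    def F():
        if peek() in ("id", "num"):
            match(peek())
        elif peek() == "(":
            match("(")
            E()
            match(")")
        else:
            raise ParseError(f"unexpected token {peek()!r}")

    E()
    match("$")
    return True
```

Calling parse_expression(["(", "id", "+", "id", ")", "*", "id"]) returns True; an input such as ["id", "+"] raises ParseError.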
– 5 – CSCE 531 Spring 2006
Transition Diagrams for Predictive Parsers
To construct the transition diagram for a predictive parser:
1. Eliminate left recursion from the grammar
2. Left factor the grammar
3. For each nonterminal A do
     Create an initial state and final state.
     For each production A → X1X2 … Xn create a path from the initial state to the final state labeled X1X2 … Xn
   end
– 6 – CSCE 531 Spring 2006
Example
E → T E’
E’ → + T E’ | - T E’ | ε
T → F T’
T’ → * F T’ | / F T’ | ε
F → id | num | ( E )
[Figure: transition diagrams. E has a path labeled T E’; T has a path labeled F T’; E’ has paths labeled + T E’, - T E’, and ε.]
Etcetera. Some of the rest are in the text.
– 7 – CSCE 531 Spring 2006
Predictive Parsing using Transition Diagrams
– 8 – CSCE 531 Spring 2006
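Table Driven Predictive Parsing
[Figure: model of a table-driven predictive parser. The input buffer holds x + ( … $; the stack holds the symbols W X Y S R $; the predictive parsing program reads both, consults the parsing table M, and produces the output.]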
– 9 – CSCE 531 Spring 2006
Table Driven Predictive Parsing
The stack is initialized to contain $S; the $ is the “bottom” marker.
The input has a $ added to the end.
The parse table entry M[X, a] contains what should be done when we see nonterminal X on the stack and current token “a”.
Parse actions for X = top of stack and a = current token:
1. If X = a = $ then halt and announce success.
2. If X = a != $ then pop X off the stack and advance the input pointer to the next token.
3. If X is a nonterminal, consult the table entry M[X, a]; details on next slide.
– 10 – CSCE 531 Spring 2006
M[X, a] Actions
3. If X is a nonterminal then consult M[X, a].
The entry will be either a production or an error entry.
If M[X, a] = {X → UVW} the parser replaces X on the top of the stack with W, V, U, with U on top. As output, print the name of the production used.
– 11 – CSCE 531 Spring 2006
Algorithm 4.3
Set ip to the first token in w$.
Repeat
    Let X be the top of the stack and a be the current token
    if X is a terminal or $ then
        if X = a then
            pop X from the stack and advance ip
        else error()
    else /* X is a nonterminal */
        if M[X, a] = X → Y1Y2 … Yk then begin
            pop X from the stack
            push Yk Yk-1 … Y2 Y1 onto the stack, with Y1 on top
            output the production X → Y1Y2 … Yk
        end
        else error()
Until X = $
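The loop above can be sketched in Python as follows. This is only an illustration under some assumptions: the table is the one for the expression grammar (Figure 4.15 in the text), hand-coded as a dictionary; the names TABLE, NONTERMINALS and predictive_parse are invented for this sketch; and an empty tuple stands for an ε right-hand side.

```python
EPS = ()  # an empty right-hand side stands for epsilon

# Parse table for the expression grammar, keyed by (nonterminal, token).
# Missing keys are the error entries.
TABLE = {
    ("E", "id"): ("T", "E'"), ("E", "("): ("T", "E'"),
    ("E'", "+"): ("+", "T", "E'"), ("E'", "-"): ("-", "T", "E'"),
    ("E'", ")"): EPS, ("E'", "$"): EPS,
    ("T", "id"): ("F", "T'"), ("T", "("): ("F", "T'"),
    ("T'", "+"): EPS, ("T'", "-"): EPS, ("T'", ")"): EPS, ("T'", "$"): EPS,
    ("T'", "*"): ("*", "F", "T'"), ("T'", "/"): ("/", "F", "T'"),
    ("F", "id"): ("id",), ("F", "("): ("(", "E", ")"),
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}


def predictive_parse(tokens, start="E"):
    """Return the productions used, in order (a leftmost derivation),
    or raise SyntaxError on the first error entry or mismatch."""
    tokens = list(tokens) + ["$"]      # append the end marker to the input
    stack = ["$", start]               # $ is the bottom marker
    ip = 0
    output = []
    while True:
        X, a = stack[-1], tokens[ip]
        if X == "$" and a == "$":
            return output              # success
        if X not in NONTERMINALS:      # X is a terminal or $
            if X != a:
                raise SyntaxError(f"expected {X!r}, got {a!r}")
            stack.pop()
            ip += 1
        else:                          # X is a nonterminal
            rhs = TABLE.get((X, a))
            if rhs is None:
                raise SyntaxError(f"error entry M[{X}, {a}]")
            stack.pop()
            stack.extend(reversed(rhs))  # push Yk ... Y1 so Y1 ends up on top
            output.append((X, rhs))
```

For the token list of ( id + id ) * id + id * id, the first three productions reported are E → T E’, T → F T’ and F → ( E ), matching the start of the trace shown later in these notes.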
– 12 – CSCE 531 Spring 2006
Parse Table for Expression Grammar
      id        +          -          *          /          (         )        $
E     E→TE’                                                 E→TE’
E’              E’→+TE’    E’→-TE’                                    E’→ε     E’→ε
T     T→FT’                                                 T→FT’
T’              T’→ε       T’→ε       T’→*FT’    T’→/FT’              T’→ε     T’→ε
F     F→id                                                  F→(E)
Figure 4.15+
– 13 – CSCE 531 Spring 2006
Parse Trace of (z + q) * x + w * y
Stack            Input                              Output
$E               ( id + id ) * id + id * id $
$E’T             ( id + id ) * id + id * id $       E → T E’
$E’T’F           ( id + id ) * id + id * id $       T → F T’
$E’T’)E(         ( id + id ) * id + id * id $       F → ( E )
$E’T’)E          id + id ) * id + id * id $
$E’T’)E’T        id + id ) * id + id * id $         E → T E’
$E’T’)E’T’F      id + id ) * id + id * id $         T → F T’
$E’T’)E’T’id     id + id ) * id + id * id $         F → id
$E’T’)E’T’       + id ) * id + id * id $
$E’T’)E’         + id ) * id + id * id $            T’ → ε
– 14 – CSCE 531 Spring 2006
First and Follow Functions
We are going to develop two auxiliary functions to help compute parse tables.
FIRST(α) is the set of tokens that can start strings derivable from α;
also, if α ⇒* ε then we add ε to FIRST(α).
FOLLOW(N) is the set of tokens that can follow the nonterminal N in some sentential form, i.e.,
FOLLOW(N) = { t | S ⇒* αNtβ }
– 15 – CSCE 531 Spring 2006
Algorithm to Compute First
Input: Grammar symbol X
Output: FIRST(X)
Method
1. If X is a terminal, then FIRST(X) = {X}
2. If X → ε is a production, then add ε to FIRST(X).
3. For each production X → Y1Y2 … Yk
   a. If Y1Y2 … Yi-1 ⇒* ε then add all tokens in FIRST(Yi) to FIRST(X)
   b. If Y1Y2 … Yk ⇒* ε then add ε to FIRST(X)
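One way to run these rules is to repeat them until no FIRST set grows any further. The Python sketch below does that for the expression grammar; the dictionary layout (nonterminal to list of right-hand sides, with an empty list for ε) and the names GRAMMAR, first_of_string and compute_first are assumptions made for this illustration, not notation from the text.

```python
EPS = "ε"

# Expression grammar: nonterminal -> list of right-hand sides ([] means epsilon).
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], ["-", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], ["/", "F", "T'"], []],
    "F":  [["id"], ["num"], ["(", "E", ")"]],
}


def first_of_string(symbols, first):
    """FIRST of a sequence Y1 Y2 ... Yk, given FIRST of each nonterminal."""
    result = set()
    for sym in symbols:
        # Rule 1: for a terminal a, FIRST(a) = {a}.
        sym_first = first[sym] if sym in first else {sym}
        result |= sym_first - {EPS}          # rule 3a
        if EPS not in sym_first:
            return result                    # Yi cannot vanish, so stop here
    result.add(EPS)                          # rules 2 and 3b: every Yi can vanish
    return result


def compute_first(grammar):
    """Apply the rules repeatedly until no FIRST set changes."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, productions in grammar.items():
            for rhs in productions:
                new = first_of_string(rhs, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first
```

Running compute_first(GRAMMAR) gives a dictionary that can be compared against hand calculations for this grammar.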
– 16 – CSCE 531 Spring 2006
Example of First Calculation
E → T E’
E’ → + T E’ | - T E’ | ε
T → F T’
T’ → * F T’ | / F T’ | ε
F → id | num | ( E )
FIRST(token) = {token} for tokens: + - * / ( ) id num
FIRST(F) = { id, num, ( }
FIRST(T’) = ?
  T’ → ε so …
  T’ → * F T’ so …
  T’ → / F T’ so …
  FIRST(T’) = { ε … }
FIRST(T) = FIRST(F)
FIRST(E’) = ?
FIRST(E) = ?
– 17 – CSCE 531 Spring 2006
Algorithm to Compute Follow (p 189)
Input: nonterminal A
Output: FOLLOW(A)
Method
1. Add $ to FOLLOW(S), where $ is the end_of_input marker and S is the start symbol
2. If A → αBβ is a production, then every token in FIRST(β) is added to FOLLOW(B) (note: not ε)
3. If A → αB is a production, or if A → αBβ is a production and β ⇒* ε, then every token in FOLLOW(A) is added to FOLLOW(B)
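As with FIRST, these rules can be applied over and over until nothing changes. The sketch below is one possible Python rendering for the expression grammar. To keep it short, the FIRST sets of the nonterminals are typed in by hand rather than computed, and the names GRAMMAR, FIRST, first_of_string and compute_follow are illustrative assumptions.

```python
EPS = "ε"

# Expression grammar: nonterminal -> list of right-hand sides ([] means epsilon).
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], ["-", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], ["/", "F", "T'"], []],
    "F":  [["id"], ["num"], ["(", "E", ")"]],
}

# FIRST sets of the nonterminals, entered by hand for this sketch.
FIRST = {
    "E": {"id", "num", "("}, "T": {"id", "num", "("}, "F": {"id", "num", "("},
    "E'": {"+", "-", EPS}, "T'": {"*", "/", EPS},
}


def first_of_string(symbols):
    """FIRST of a sequence of grammar symbols (a terminal a has FIRST {a})."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def compute_follow(grammar, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")                       # rule 1
    changed = True
    while changed:
        changed = False
        for A, productions in grammar.items():
            for rhs in productions:
                for i, B in enumerate(rhs):
                    if B not in grammar:
                        continue                 # FOLLOW is only for nonterminals
                    beta_first = first_of_string(rhs[i + 1:])
                    new = beta_first - {EPS}     # rule 2
                    if EPS in beta_first:        # rule 3: beta is empty or nullable
                        new |= follow[A]
                    if not new <= follow[B]:
                        follow[B] |= new
                        changed = True
    return follow
```

Calling compute_follow(GRAMMAR, "E") returns one set per nonterminal, which can be checked against a hand calculation.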
– 18 – CSCE 531 Spring 2006
Example of FOLLOW Calculation
E → T E’
E’ → + T E’ | - T E’ | ε
T → F T’
T’ → * F T’ | / F T’ | ε
F → id | num | ( E )
1. Add $ to FOLLOW(E)
2. E → T E’
   Add FIRST*(E’) to FOLLOW(T)
3. E’ → + T E’ (similarly E’ → - T E’)
   Add FIRST*(E’) to FOLLOW(T)
   E’ ⇒* ε, so FOLLOW(E’) is added to FOLLOW(T)
4. T → F T’
   Add FIRST*(T’) to FOLLOW(F)
   T’ ⇒* ε, so FOLLOW(T) is added to FOLLOW(F)
5. F → ( E )
   Add FIRST( ‘)’ ) to FOLLOW(E)
N      FOLLOW(N)
E      { $
E’     {
T      { + -
T’     {
F      {
– 19 – CSCE 531 Spring 2006
Construction of a Predictive Parse Table
Algorithm 4.4
Input: Grammar G
Output: Predictive Parsing Table M[N, a]
Method
1. For each production A → α do steps 2 and 3
2. For each token a in FIRST(α), add A → α to M[A, a]
3. If ε is in FIRST(α), add A → α to M[A, b] for each token b in FOLLOW(A).
   If ε is in FIRST(α) and $ is in FOLLOW(A), then add A → α to M[A, $]
4. Mark all other entries of M as “error”
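A short Python sketch of this construction follows. It uses the grammar S → iEtSS’ | a, S’ → eS | ε, E → b, with FIRST and FOLLOW sets entered by hand. The names build_table and first_of_string are made up for the illustration, and each table cell holds a list so that multiply defined entries stay visible. Cells that never receive a production are simply absent, which plays the role of “error”.

```python
from collections import defaultdict

EPS = "ε"

# nonterminal -> list of right-hand sides ([] means epsilon)
GRAMMAR = {
    "S":  [["i", "E", "t", "S", "S'"], ["a"]],
    "S'": [["e", "S"], []],
    "E":  [["b"]],
}
# FIRST and FOLLOW of the nonterminals, entered by hand for this sketch.
FIRST = {"S": {"i", "a"}, "S'": {EPS, "e"}, "E": {"b"}}
FOLLOW = {"S": {"$", "e"}, "S'": {"$", "e"}, "E": {"t"}}


def first_of_string(symbols):
    """FIRST of a sequence of grammar symbols (a terminal a has FIRST {a})."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def build_table(grammar):
    """Return M as a dict: (nonterminal, token) -> list of right-hand sides."""
    table = defaultdict(list)
    for A, productions in grammar.items():       # step 1
        for rhs in productions:
            alpha_first = first_of_string(rhs)
            for a in alpha_first - {EPS}:        # step 2
                table[A, a].append(rhs)
            if EPS in alpha_first:               # step 3
                for b in FOLLOW[A]:              # $ is included when it is in FOLLOW(A)
                    table[A, b].append(rhs)
    return dict(table)                           # step 4: absent keys are errors
```

With this grammar the cell for (S’, e) ends up holding two productions, S’ → eS and S’ → ε, so the table has a multiply defined entry.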
– 20 – CSCE 531 Spring 2006
Predictive Parsing Example
Example 4.18 in text; table in Figure 4.15 (slide 12)
Example 4.19
S → iEtSS’ | a
S’ → eS | ε
E → b
Nonterminal   a       b       e                i            t     $
S             S→a                              S→iEtSS’
S’                            S’→eS, S’→ε                         S’→ε
E                     E→b
FIRST(S) = { i, a }
FIRST(S’) = { ε, e }
FIRST(E) = { b }
FOLLOW(S) = { $, e }
FOLLOW(S’) = { $, e }
FOLLOW(E) = { t }
– 21 – CSCE 531 Spring 2006
LL(1) Grammars
A grammar is called LL(1) if its parsing table has no multiply defined entries.
LL(1) grammars
  Must not be ambiguous.
  Must not be left-recursive.
G is LL(1) if and only if whenever A → α | β are two distinct productions:
1. FIRST(α) ∩ FIRST(β) = ∅
2. At most one of α and β can derive ε
3. If β ⇒* ε then FIRST(α) ∩ FOLLOW(A) = ∅
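The three conditions can be checked mechanically, pair by pair. The Python sketch below does so for the grammar of Example 4.19, with FIRST and FOLLOW entered by hand; the function name ll1_conflicts and the wording of the reported reasons are assumptions made for this illustration.

```python
from itertools import combinations

EPS = "ε"

# Grammar of Example 4.19: nonterminal -> right-hand sides ([] means epsilon).
GRAMMAR = {
    "S":  [["i", "E", "t", "S", "S'"], ["a"]],
    "S'": [["e", "S"], []],
    "E":  [["b"]],
}
FIRST = {"S": {"i", "a"}, "S'": {EPS, "e"}, "E": {"b"}}
FOLLOW = {"S": {"$", "e"}, "S'": {"$", "e"}, "E": {"t"}}


def first_of_string(symbols):
    """FIRST of a sequence of grammar symbols (a terminal a has FIRST {a})."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def ll1_conflicts(grammar):
    """Return (A, alpha, beta, reason) for every violated condition."""
    conflicts = []
    for A, productions in grammar.items():
        for alpha, beta in combinations(productions, 2):
            fa, fb = first_of_string(alpha), first_of_string(beta)
            if (fa - {EPS}) & (fb - {EPS}):                  # condition 1
                conflicts.append((A, alpha, beta, "FIRST sets overlap"))
            if EPS in fa and EPS in fb:                      # condition 2
                conflicts.append((A, alpha, beta, "both derive ε"))
            # Condition 3, checked in both directions.
            if EPS in fb and (fa - {EPS}) & FOLLOW[A]:
                conflicts.append((A, alpha, beta, "FIRST meets FOLLOW"))
            if EPS in fa and (fb - {EPS}) & FOLLOW[A]:
                conflicts.append((A, beta, alpha, "FIRST meets FOLLOW"))
    return conflicts
```

For this grammar the only conflict reported involves S’: the token e is in both FIRST(eS) and FOLLOW(S’), which is the dangling-else problem showing up as a violation of condition 3.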
– 22 – CSCE 531 Spring 2006
Error Recovery in Predictive Parsing
Panic Mode Error recovery
If M[A, a] is an error, then throw away input tokens until one in a synchronizing set appears.
Heuristics for the synchronizing sets for A
1. Add FOLLOW(A) to the synchronizing set for A
2. If ‘;’ is a separator or terminator of statements, then keywords that can begin statements will not be in FOLLOW of the nonterminal “Expr”, so a missing “;” would cause those keywords to be skipped. Add the keywords that begin statements to the synchronizing set for “Expr”.
3. …
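A rough Python sketch of panic-mode recovery with heuristic 1 only is given below, using the expression grammar with id as the only operand. It is a simplification, not the scheme of Figure 4.18: on an error entry the parser pops the nonterminal if the current token is in its FOLLOW set and otherwise discards the token, and a mismatched terminal on the stack is popped as if it had been present. The names TABLE, FOLLOW and parse_with_recovery and the error message wording are all assumptions for this illustration.

```python
EPS = ()  # an empty right-hand side stands for epsilon

# Parse table for the expression grammar; missing keys are error entries.
TABLE = {
    ("E", "id"): ("T", "E'"), ("E", "("): ("T", "E'"),
    ("E'", "+"): ("+", "T", "E'"), ("E'", "-"): ("-", "T", "E'"),
    ("E'", ")"): EPS, ("E'", "$"): EPS,
    ("T", "id"): ("F", "T'"), ("T", "("): ("F", "T'"),
    ("T'", "+"): EPS, ("T'", "-"): EPS, ("T'", ")"): EPS, ("T'", "$"): EPS,
    ("T'", "*"): ("*", "F", "T'"), ("T'", "/"): ("/", "F", "T'"),
    ("F", "id"): ("id",), ("F", "("): ("(", "E", ")"),
}
# FOLLOW sets double as the synchronizing sets (heuristic 1).
FOLLOW = {
    "E": {")", "$"}, "E'": {")", "$"},
    "T": {"+", "-", ")", "$"}, "T'": {"+", "-", ")", "$"},
    "F": {"+", "-", "*", "/", ")", "$"},
}


def parse_with_recovery(tokens, start="E"):
    """Parse to the end of the input and return a list of error messages."""
    tokens = list(tokens) + ["$"]
    stack, ip, errors = ["$", start], 0, []
    while True:
        X, a = stack[-1], tokens[ip]
        if X == "$" and a == "$":
            return errors
        if X == "$":                       # input left over after the parse
            errors.append(f"skipped extra token {a!r}")
            ip += 1
        elif X not in FOLLOW:              # X is a terminal
            if X == a:
                ip += 1
            else:
                errors.append(f"missing {X!r} before {a!r}")
            stack.pop()
        elif (X, a) in TABLE:              # normal expansion
            stack.pop()
            stack.extend(reversed(TABLE[X, a]))
        elif a in FOLLOW[X]:               # synchronizing token: give up on X
            errors.append(f"gave up on {X} at {a!r}")
            stack.pop()
        else:                              # panic mode: discard the token
            errors.append(f"skipped {a!r} while parsing {X}")
            ip += 1
```

For example, parse_with_recovery(["id", "+", "*", "id"]) reports a single error (the stray * is skipped) and then finishes the parse, while a correct input such as ["id", "*", "id"] returns an empty list.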
– 23 – CSCE 531 Spring 2006
Parse Table with Synch Entries
Figure 4.18
– 24 – CSCE 531 Spring 2006
Trace with Error Recovery
Figure 4.19
– 25 – CSCE 531 Spring 2006
Bottom up Parsing
Idea: recognize right hand sides of productions so that we produce a rightmost derivation in reverse
“Handle-pruning”
– 26 – CSCE 531 Spring 2006
Reductions in a Shift-Reduce Parser
Figure 4.21
E → E + E | E * E | ( E ) | id
Right-Sentential Form    Handle    Reducing Production
id1 + id2 * id3          id1       E → id
E + id2 * id3            id2       E → id
E + E * id3              id3       E → id      How?
E + E * E                E * E     E → E * E
E + E                    E + E     E → E + E
E
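The reduction sequence in this table can be replayed in a few lines of Python. The sketch below does not find handles; the handle positions are supplied by hand, which is exactly the part a real shift-reduce parser has to work out. The function name reduce_handle and the list-of-tokens representation are assumptions for this illustration, and the subscripts on id are dropped.

```python
def reduce_handle(form, start, rhs, lhs):
    """Replace the handle rhs, found at index start of the sentential form, by lhs."""
    if form[start:start + len(rhs)] != rhs:
        raise ValueError("handle not found at that position")
    return form[:start] + [lhs] + form[start + len(rhs):]


# Right-sentential form id1 + id2 * id3, with subscripts dropped.
form = ["id", "+", "id", "*", "id"]

# (position of handle, handle, left side of the reducing production)
steps = [
    (0, ["id"], "E"),
    (2, ["id"], "E"),
    (4, ["id"], "E"),
    (2, ["E", "*", "E"], "E"),
    (0, ["E", "+", "E"], "E"),
]

trace = [form]
for start, rhs, lhs in steps:
    form = reduce_handle(form, start, rhs, lhs)
    trace.append(form)
```

Reading trace from last to first gives the rightmost derivation of the input from E.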