Parsing - University of Aizu official website, hamada/lp/l4-1-lp.pdf · shift reduce parser 1....
TRANSCRIPT
Parsing
[Figure: compiler pipeline. COOL code (txt) → Lexical Analysis → Syntax Analysis (Parsing, today's topic) → AST, symbol table, etc. → Intermediate Representation (IR) → Code Generation → executable code (exe)]
[Figure: taxonomy of parsing. Top Down Parsing covers Predictive Parsing and LL(k) Parsing, with the issues Left Recursion and Left Factoring. Bottom Up Parsing covers Shift-reduce Parsing and LR(k) Parsing.]
Bottom-Up Parsers
Bottom-up parsers: build the nodes on the bottom of the parse tree first.
Suitable for automatic parser generation; handle a larger class of grammars.
Examples: shift-reduce parser (or LR(k) parsers)
Bottom-up Parsing
• No problem with left-recursion
• Widely used in practice
• LR(1), SLR(1), LALR(1)
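Grammar Hierarchy
[Figure: nested grammar classes inside the non-ambiguous CFGs: CLR(1) contains LALR(1), which contains SLR(1); LL(1) is drawn as a further class within the non-ambiguous CFGs.]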
Bottom-up Parsing
• Works from tokens to start-symbol
• Repeat:
  - identify handle - reducible sequence:
      · non-terminal is not constructed but
      · all its children have been constructed
  - reduce - construct non-terminal and update stack
• Until reducing to start-symbol
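Bottom-up Parsing
Grammar: E → E + (E), E → i   (i = 0, 1, 2, …, 9)
Reductions, from the input down to the start symbol:
1 + (2) + (3)
E + (2) + (3)
E + (E) + (3)
E + (3)
E + (E)
E
[Figure: parse tree for 1 + (2) + (3), built from the leaves 1, 2, 3 up to the root E]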
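The reduction sequence above can be reproduced with a small stack-based sketch. This is an illustration written for this one grammar, not the general algorithm: the function name `reduce_trace` is hypothetical, and the rule "reduce as soon as a right-hand side is on top of the stack" happens to be safe for E → E + (E) | i but is not safe for grammars in general.

```python
def reduce_trace(tokens):
    """Shift-reduce recognizer for E -> E + ( E ) | i, where i is a digit.

    Returns the sentential form obtained after each reduction, or raises
    ValueError if the tokens cannot be reduced to the start symbol E.
    """
    stack, forms = [], []
    for pos, tok in enumerate(tokens):
        stack.append(tok)                                  # shift
        while True:                                        # reduce while a handle is on top
            if stack[-1].isdigit():                        # handle: i
                stack[-1:] = ["E"]
            elif stack[-5:] == ["E", "+", "(", "E", ")"]:  # handle: E + ( E )
                stack[-5:] = ["E"]
            else:
                break
            # Sentential form = stack contents followed by the unread input.
            forms.append(" ".join(stack + tokens[pos + 1:]))
    if stack != ["E"]:
        raise ValueError("input is not derivable from E")
    return forms


for form in reduce_trace(list("1+(2)+(3)")):
    print(form)
```

Each printed line is one of the sentential forms listed on the slide, ending with the single symbol E.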
Bottom-up Parsing
• Is the following grammar LL(1)?
    E → E + (E)
    E → i
• NO
• But this is a useful grammar, e.g. for
    1 + (2)
    1 + (2) + (3)
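The slide does not spell out why the answer is NO. One standard reason is that the production E → E + (E) is left-recursive, and a left-recursive grammar cannot be LL(1); in addition, both alternatives of E can begin with the token i, so one token of lookahead cannot choose between them. The sketch below checks only the first point. The function name and the dictionary encoding of the grammar are assumptions made for illustration.

```python
def immediately_left_recursive(grammar):
    """Return the nonterminals A that have a production of the form A -> A ..."""
    return sorted(
        head
        for head, bodies in grammar.items()
        if any(body and body[0] == head for body in bodies)
    )


# The grammar from the slide; each body is a list of grammar symbols.
grammar = {"E": [["E", "+", "(", "E", ")"], ["i"]]}
print(immediately_left_recursive(grammar))   # ['E']
```

A bottom-up parser has no difficulty with this production, which is the point made earlier: "No problem with left-recursion".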
Bottom-Up Parser
A bottom-up parser, or a shift-reduce parser, begins at the leaves and works up to the top of the tree.
The reduction steps trace a rightmost derivation in reverse.
Consider the grammar:
    S → aABe
    A → Abc | b
    B → d
We want to parse the input string abbcde.
Bottom-Up Parser Example
INPUT: a b b c d e $
Productions: S → aABe, A → Abc, A → b, B → d

The animation slides show the Bottom-Up Parsing Program replacing one handle per step:

Step  INPUT            Handle     Reduction
1     a b b c d e $    b (first)  A → b
2     a A b c d e $    A b c      A → Abc
3     a A d e $        d          B → d
4     a A B e $        a A B e    S → aABe
5     S $              (done)

Note on step 2: the remaining b also matches A → b. We are not reducing here in this example. A parser would reduce, get stuck and then backtrack!

[Figure: the OUTPUT panel grows the parse tree bottom-up; the final tree is S( a, A( A(b), b, c ), B(d), e ).]
This parser is known as an LR Parser because it scans the input from Left to right, and it constructs
a Rightmost derivation in reverse order.
Bottom-Up Parser Example
The scanning of productions for matching with handles in the input string, and backtracking, makes the method used in the previous example very inefficient.
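To make the cost concrete, the naive method can be sketched as a depth-first search over reductions. This is only one illustrative way to write it: the function name `reduce_to_start`, the left-to-right scan order and the `dead_ends` set are assumptions, not something the slides specify.

```python
PRODUCTIONS = [("S", "aABe"), ("A", "Abc"), ("A", "b"), ("B", "d")]


def reduce_to_start(form, start="S", dead_ends=None):
    """Depth-first search for a chain of reductions from `form` to `start`.

    Returns the list of sentential forms on the successful path, or None
    if no sequence of reductions reaches the start symbol.
    """
    if dead_ends is None:
        dead_ends = set()
    if form == start:
        return [form]
    if form in dead_ends:
        return None
    for i in range(len(form)):                  # scan positions left to right
        for head, body in PRODUCTIONS:          # scan every production
            if form.startswith(body, i):
                reduced = form[:i] + head + form[i + len(body):]
                rest = reduce_to_start(reduced, start, dead_ends)
                if rest is not None:
                    return [form] + rest
    dead_ends.add(form)                         # every choice failed: backtrack
    return None


print(" => ".join(reduce_to_start("abbcde")))
# abbcde => aAbcde => aAde => aABe => S
```

Every step rescans the whole sentential form against every production, and a wrong choice of handle is only discovered after the search below it has failed.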
Can we do better?
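LR Parser Example
[Figure: structure of an LR parser. The LR Parsing Program reads the Input, maintains a Stack, consults the action and goto tables, and writes the Output.]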
Shift reduce parser
1. Construct the action-goto table from the given grammar
2. Apply the shift-reduce parsing algorithm to construct the parse tree

Shift reduce parser
1. Construct the action-goto table from the given grammar
This is what makes the difference between the different types of shift-reduce parsing, such as SLR, CLR and LALR.
In this course, due to lack of time, we will not study how to construct the action-goto table.
Shift reduce parser
2. Apply the shift-reduce parsing algorithm to construct the parse tree
The following algorithm shows how we can construct the table of parsing moves for an input string w$ with respect to a given grammar G.

set ip to point to the first symbol of the input string w$
repeat forever begin
    if action[top(stack), current-input(ip)] = shift(s) then begin
        push current-input(ip) then s on top of the stack
        advance ip to the next input symbol
    end
    else if action[top(stack), current-input(ip)] = reduce A → β then begin
        pop 2*|β| symbols off the stack;
        push A then goto[top(stack), A] on top of the stack;
        output the production A → β
    end
    else if action[top(stack), current-input(ip)] = accept then
        return
    else error()
end
LR Parser Example
The following grammar:
    (1) E → E + T
    (2) E → T
    (3) T → T ∗ F
    (4) T → F
    (5) F → ( E )
    (6) F → id
can be parsed with this action and goto table:

         action                          goto
State    id    +     *     (     )     $     E    T    F
0        s5                s4                1    2    3
1              s6                      acc
2              r2    s7          r2    r2
3              r4    r4          r4    r4
4        s5                s4                8    2    3
5              r6    r6          r6    r6
6        s5                s4                     9    3
7        s5                s4                          10
8              s6                s11
9              r1    s7          r1    r1
10             r3    r3          r3    r3
11             r5    r5          r5    r5

s represents shift
r represents reduce
acc represents accept
empty represents error
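The table-driven algorithm and the action/goto table above can be combined into a short runnable sketch. The table entries and production numbers are copied from the slide; everything else (the names `row`, `ACTION`, `GOTO`, `lr_parse`, the use of ASCII `*`, and starting with state 0 on the stack) is an assumption made for illustration.

```python
# Productions numbered as on the slide: (head, number of body symbols, text).
PRODUCTIONS = {
    1: ("E", 3, "E → E + T"),
    2: ("E", 1, "E → T"),
    3: ("T", 3, "T → T * F"),
    4: ("T", 1, "T → F"),
    5: ("F", 3, "F → ( E )"),
    6: ("F", 1, "F → id"),
}


def row(cells):
    """Build one action row from a compact 'symbol:entry' string."""
    return dict(cell.split(":") for cell in cells.split())


# Missing entries mean error, exactly as "empty represents error" on the slide.
ACTION = {
    0: row("id:s5 (:s4"),
    1: row("+:s6 $:acc"),
    2: row("+:r2 *:s7 ):r2 $:r2"),
    3: row("+:r4 *:r4 ):r4 $:r4"),
    4: row("id:s5 (:s4"),
    5: row("+:r6 *:r6 ):r6 $:r6"),
    6: row("id:s5 (:s4"),
    7: row("id:s5 (:s4"),
    8: row("+:s6 ):s11"),
    9: row("+:r1 *:s7 ):r1 $:r1"),
    10: row("+:r3 *:r3 ):r3 $:r3"),
    11: row("+:r5 *:r5 ):r5 $:r5"),
}
GOTO = {
    0: {"E": 1, "T": 2, "F": 3},
    4: {"E": 8, "T": 2, "F": 3},
    6: {"T": 9, "F": 3},
    7: {"F": 10},
}


def lr_parse(tokens):
    """Run the shift-reduce loop; return the productions in output order."""
    tokens = list(tokens) + ["$"]
    stack, ip, output = [0], 0, []          # stack alternates states and symbols
    while True:
        entry = ACTION[stack[-1]].get(tokens[ip])
        if entry is None:                   # empty table cell
            raise SyntaxError(f"unexpected {tokens[ip]!r} in state {stack[-1]}")
        if entry == "acc":
            return output
        if entry[0] == "s":                 # shift: push the symbol, then the state
            stack += [tokens[ip], int(entry[1:])]
            ip += 1
        else:                               # reduce A → β
            head, size, text = PRODUCTIONS[int(entry[1:])]
            del stack[-2 * size:]           # pop 2*|β| entries
            stack += [head, GOTO[stack[-1]][head]]
            output.append(text)


for production in lr_parse("id * id + id".split()):
    print(production)
```

The printed productions, read from last to first, form a rightmost derivation of id * id + id. Keeping both symbols and states on the stack mirrors the "pop 2*|β|" step of the algorithm; a stack of states alone would also work.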
LR Parser Example
INPUT: id ∗ id + id $
GRAMMAR and action/goto table: as on the previous slide.

The animation slides show the STACK (bottom on the left), the remaining INPUT and the production written to OUTPUT at each move of the LR Parsing Program:

Step  STACK              Remaining INPUT    Action
1     0                  id ∗ id + id $     s5: shift id, push 5
2     0 id 5             ∗ id + id $        r6: reduce F → id, goto[0, F] = 3
3     0 F 3              ∗ id + id $        r4: reduce T → F, goto[0, T] = 2
4     0 T 2              ∗ id + id $        s7: shift ∗, push 7
5     0 T 2 ∗ 7          id + id $          s5: shift id, push 5
6     0 T 2 ∗ 7 id 5     + id $             r6: reduce F → id, goto[7, F] = 10
7     0 T 2 ∗ 7 F 10     + id $             r3: reduce T → T ∗ F, goto[0, T] = 2
8     0 T 2              + id $             r2: reduce E → T, goto[0, E] = 1
9     0 E 1              + id $             s6: shift +, push 6
10    0 E 1 + 6          id $               s5: shift id, push 5
11    0 E 1 + 6 id 5     $                  r6: reduce F → id, goto[6, F] = 3
12    0 E 1 + 6 F 3      $                  r4: reduce T → F, goto[6, T] = 9
13    0 E 1 + 6 T 9      $                  r1: reduce E → E + T, goto[0, E] = 1
14    0 E 1              $                  acc: accept

OUTPUT (productions in the order they are emitted):
F → id, T → F, F → id, T → T ∗ F, E → T, F → id, T → F, E → E + T

[Figure: the OUTPUT panel grows the parse tree bottom-up as each production is emitted; the final tree is E( E( T( T( F(id) ) ∗ F(id) ) ) + T( F(id) ) ).]
Constructing Parsing Tables
All LR parsers use the same parsing program that we demonstrated in the previous slides. What differentiates the LR parsers are the action and the goto tables:
• Simple LR (SLR): succeeds for the fewest grammars, but is the easiest to implement.
• Canonical LR: succeeds for the most grammars, but is the hardest to implement. It splits states when necessary to prevent reductions that would get the parser stuck.
• Lookahead LR (LALR): succeeds for most common syntactic constructions used in programming languages, but produces LR tables much smaller than canonical LR.