CSCI 3130: Automata theory and formal languages
Andrej Bogdanov
http://www.cse.cuhk.edu.hk/~andrejb/csc3130
The Chinese University of Hong Kong
LR(1) grammars
Fall 2010
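LR(0) parsing review
A → aAb, A → ab
CFG G → parser generator → "PDA" for parsing G (error if G is not LR(0))
States of the parser for this grammar:
  1: A → •aAb, A → •ab
  2: A → a•Ab, A → a•b, A → •aAb, A → •ab
  3: A → aA•b
  4: A → aAb•
  5: A → ab•
Transitions: 1 --a--> 2, 2 --a--> 2, 2 --A--> 3, 2 --b--> 5, 3 --b--> 4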
Motivation: Fast parsing for programming languages
Parsing computer programs
if (n == 0) { return x; } else { return x + 1; }
Parse tree (sketch):
  Statement → if ParExpression Statement else Statement
  ParExpression → ( Expression )
  Statement → Block
  Expression → ...
  Block → ...
Most programming language CFGs are not LR(0)!
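LR(0) parsing review
A → aAb | ab
States:
  1: A → •aAb, A → •ab
  2: A → a•Ab, A → a•b, A → •aAb, A → •ab
  3: A → aA•b
  4: A → aAb•
  5: A → ab•
Transitions: 1 --a--> 2, 2 --a--> 2, 2 --A--> 3, 2 --b--> 5, 3 --b--> 4
input: a a b b
  stack     action   state
  (empty)   S        1
  1         S        2
  1 2       S        2
  1 2 2     R        5
  1 2       S        3    (after pushing A)
  1 2 3     R        4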
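The trace above can be reproduced with a small table-driven sketch. It hard-codes the five states of the parser for A → aAb | ab. The names `REDUCE`, `GOTO` and `lr0_parse` are illustrative and not part of the lecture. The stack is kept as a list of states whose last entry is the current state.

```python
# Complete items, keyed by the state that contains them: state -> (variable, right-hand side).
REDUCE = {4: ("A", "aAb"), 5: ("A", "ab")}

# Transitions of the five-state DFA for A -> aAb | ab.
GOTO = {(1, "a"): 2, (2, "a"): 2, (2, "A"): 3, (2, "b"): 5, (3, "b"): 4}

def lr0_parse(word):
    """Return the trace as (stack, state, action) triples; raise ValueError on bad input."""
    states, pos, trace = [1], 0, []
    while True:
        state = states[-1]
        if state in REDUCE:
            # The only valid item is complete: reduce.
            lhs, rhs = REDUCE[state]
            trace.append((tuple(states[:-1]), state, "R"))
            del states[-len(rhs):]          # pop one state per symbol of the right-hand side
            if states == [1] and pos == len(word):
                return trace                # whole input reduced to A
            symbol = lhs
        else:
            # No complete item is valid: shift the next input symbol.
            if pos == len(word):
                raise ValueError("unexpected end of input")
            trace.append((tuple(states[:-1]), state, "S"))
            symbol = word[pos]
            pos += 1
        if (states[-1], symbol) not in GOTO:
            raise ValueError(f"no transition on {symbol!r} from state {states[-1]}")
        states.append(GOTO[(states[-1], symbol)])

for stack, state, action in lr0_parse("aabb"):
    print(stack, state, action)
```

Because the grammar is LR(0), each state either shifts or reduces, so the parser never needs to look at the upcoming input to choose an action.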
Meaning of LR(0) items
A → α•Xβ
  •: focus
  to the right of •: undiscovered part
NFA transitions to:
  X → •δ: shift focus to subtree rooted at X (if X is nonterminal)
  A → αX•β: move past subtree rooted at X
Outline of LR(0) parsing algorithm
• LR(0) parser has two kinds of actions:
    no complete item is valid: shift (S)
    there is one valid item, and it is complete: reduce (R)
• What if:
    some valid items complete, some not: S / R conflict
    more than one valid complete item: R / R conflict
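The case analysis above can be written as a short function. This is only a sketch: the item encoding `(variable, right-hand side, dot position)` and the name `classify` are assumptions made for illustration.

```python
def classify(items):
    """items: iterable of (lhs, rhs, dot) triples that are valid in the current state."""
    items = list(items)
    if not items:
        return "error"
    complete = [it for it in items if it[2] == len(it[1])]
    incomplete = [it for it in items if it[2] < len(it[1])]
    if not complete:
        return "shift"                      # no complete item is valid
    if len(complete) == 1 and not incomplete:
        return "reduce"                     # one valid item, and it is complete
    conflicts = []
    if incomplete:
        conflicts.append("S/R conflict")    # some valid items complete, some not
    if len(complete) > 1:
        conflicts.append("R/R conflict")    # more than one valid complete item
    return ", ".join(conflicts)

# Items of the grammar A -> aAb | ab after reading "a", and after reading "ab".
print(classify([("A", "aAb", 1), ("A", "ab", 1), ("A", "aAb", 0), ("A", "ab", 0)]))
print(classify([("A", "ab", 2)]))
```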
Hierarchy of context-free grammars
  context-free grammars: CYK algorithm (slow)
    LR(1) grammars: allow some conflicts; conflicts can be resolved by lookahead
      LR(0) grammars: LR(0) parsing algorithm
A CFG that is not LR(0)
S → A (1) | Bc (2)    A → aA (3) | a (4)    B → a (5) | ab (6)
input: • a ...
valid LR(0) items: S → •A, S → •Bc, A → •aA, A → •a, B → •a, B → •ab
update (after reading a):
input: a • ...
valid LR(0) items: A → a•A, A → a•, B → a•, B → a•b, A → •aA, A → •a
possible actions: R(4), R(5), S(6)
S/R, R/R conflicts!
possible parse trees (figure): S → A with A → a or A → aA; S → Bc with B → a or B → ab
peek inside!
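The update of the valid items can be sketched with the usual closure and goto operations on sets of LR(0) items. The encoding of items as `(variable, right-hand side, dot position)` and the helper names are assumptions made for this example.

```python
GRAMMAR = {"S": ["A", "Bc"], "A": ["aA", "a"], "B": ["a", "ab"]}

def closure(items):
    """Add X -> •γ for every variable X that appears right after a dot."""
    items, todo = set(items), list(items)
    while todo:
        lhs, rhs, dot = todo.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            for production in GRAMMAR[rhs[dot]]:
                new = (rhs[dot], production, 0)
                if new not in items:
                    items.add(new)
                    todo.append(new)
    return items

def goto(items, symbol):
    """Move the dot past `symbol` in every item that allows it, then close."""
    moved = {(l, r, d + 1) for (l, r, d) in items if d < len(r) and r[d] == symbol}
    return closure(moved)

def show(item):
    lhs, rhs, dot = item
    return f"{lhs} -> {rhs[:dot]}•{rhs[dot:]}"

start = closure({("S", production, 0) for production in GRAMMAR["S"]})
after_a = goto(start, "a")
print(sorted(show(item) for item in after_a))
```

The set `after_a` contains the two complete items A → a• and B → a• together with incomplete ones, which is exactly the S/R and R/R conflict named above.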
Lookahead
S → A (1) | Bc (2)    A → aA (3) | a (4)    B → a (5) | ab (6)
input: a • a ...    (peek inside: the next symbol is a)
valid LR(0) items: A → a•A, A → a•, B → a•, B → a•b, A → •aA, A → •a
parse tree must look like this: S → A, A → aA, A → a ...
action: shift
Lookahead
S → A (1) | Bc (2)    A → aA (3) | a (4)    B → a (5) | ab (6)
input: a a • a    (peek inside: the next symbol is a)
valid LR(0) items: A → a•A, A → a•, A → •aA, A → •a
parse tree must look like this: S → A, A → aA, A → aA, A → a ...
action: shift
Lookahead
S → A (1) | Bc (2)    A → aA (3) | a (4)    B → a (5) | ab (6)
input: a a a •    (nothing left to peek at)
valid LR(0) items: A → a•A, A → a•, A → •aA, A → •a
parse tree must look like this: S → A, A → aA, A → aA, A → a
action: reduce
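LR(0) items vs. LR(1) items
A → aAb | ab
LR(0): A → a•Ab
LR(1): [A → a•Ab, b]
(Figure: the same partial parse tree for both items; the LR(1) item also records the terminal b expected after the subtree rooted at A.)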
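LR(1) items
[A → α•β, x]: the subtree rooted at A is followed by the terminal x
[A → α•β, ε]: nothing follows the subtree rooted at A (end of input)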
Generating an LR(1) parser
S → A (1) | Bc (2)    A → aA (3) | a (4)    B → a (5) | ab (6)
CFG → NFA (states are LR(1) items) → DFA + stack (may have S/R, R/R conflicts)
A CFG is LR(1) if conflicts can always be resolved with one symbol of lookahead
NFA for LR(0) parsing
For every LR(0) item S → •α:
  q0 --ε--> S → •α
For every LR(0) item A → α•Xβ:
  A → α•Xβ --X--> A → αX•β
For every pair of LR(0) items A → α•Cβ, C → •δ:
  A → α•Cβ --ε--> C → •δ
notation: a, b: terminals; A, B, C: variables; α, β, δ: mixed strings; X: terminal or variable
NFA for LR(1) parsing
For every item [S → •α, ε]:
  q0 --ε--> [S → •α, ε]
For every LR(1) item [A → α•Xβ, x]:
  [A → α•Xβ, x] --X--> [A → αX•β, x]
For every LR(1) item [A → α•Cβ, x] and production C → δ and every y in FIRST(βx):
  [A → α•Cβ, x] --ε--> [C → •δ, y]
notation: a, b: terminals; A, B, C: variables; α, β, δ: mixed strings; X: terminal or variable
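The three transition rules can be sketched as functions on items. Several choices here are assumptions, not taken from the slides: items are tuples `(variable, right-hand side, dot position, lookahead)`, the empty lookahead (end of input) is written as an explicit marker `"$"`, and `first` assumes the grammar has no productions with an empty right-hand side.

```python
GRAMMAR = {"S": ["A", "Bc"], "A": ["aA", "a"], "B": ["a", "ab"]}
END = "$"   # assumption: explicit marker standing for the empty lookahead

def first(string, seen=()):
    """Leftmost terminals of strings derived from a nonempty `string`."""
    head = string[0]
    if head not in GRAMMAR:
        return {head}                       # a terminal (or END) is its own FIRST
    if head in seen:
        return set()                        # already being expanded on this path
    result = set()
    for production in GRAMMAR[head]:
        result |= first(production, seen + (head,))
    return result

def initial_items():
    """Targets of the transitions leaving q0."""
    return {("S", production, 0, END) for production in GRAMMAR["S"]}

def symbol_move(item):
    """Transition labelled X from [A -> α•Xβ, x] to [A -> αX•β, x], or None."""
    lhs, rhs, dot, x = item
    if dot == len(rhs):
        return None
    return rhs[dot], (lhs, rhs, dot + 1, x)

def epsilon_moves(item):
    """Targets [C -> •δ, y] for [A -> α•Cβ, x], with y ranging over FIRST(βx)."""
    lhs, rhs, dot, x = item
    if dot == len(rhs) or rhs[dot] not in GRAMMAR:
        return set()
    c, beta = rhs[dot], rhs[dot + 1:]
    return {(c, delta, 0, y) for delta in GRAMMAR[c] for y in first(beta + x)}

print(epsilon_moves(("S", "Bc", 0, END)))
```

For the item [S → •Bc] with empty lookahead the only possible y is c, so the new items for B carry the lookahead c.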
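Explaining the transitions
[A → α•Xβ, x] --X--> [A → αX•β, x]
  (Figure: the focus moves past the subtree rooted at X; the symbol x that follows A stays the same.)
[A → α•Cβ, x] --ε--> [C → •δ, y], y ∈ FIRST(βx)
  (Figure: the focus moves into the subtree rooted at C; the terminal y that follows C is the first terminal of βx.)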
FIRST sets
FIRST(α): all leftmost terminals of strings derived from α
S → A (1) | cB (2)    A → aA (3) | a (4)    B → a (5) | ab (6)
  α          a      A      S         cA     BA     ε
  FIRST(α)   {a}    {a}    {a, c}    {c}    {a}    ∅
For every y in FIRST(βx):
  [A → α•Cβ, x] --ε--> [C → •δ, y]
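FIRST sets for the grammar on this slide can be computed by a fixed-point iteration. The sketch below assumes the grammar has no productions with an empty right-hand side, so the FIRST set of a string is determined by its first symbol. The function names are illustrative.

```python
GRAMMAR = {"S": ["A", "cB"], "A": ["aA", "a"], "B": ["a", "ab"]}

def first_sets(grammar):
    """FIRST set of every variable, computed by iterating until nothing changes."""
    first = {variable: set() for variable in grammar}
    changed = True
    while changed:
        changed = False
        for variable, productions in grammar.items():
            for production in productions:
                head = production[0]
                new = set(first[head]) if head in grammar else {head}
                if not new <= first[variable]:
                    first[variable] |= new
                    changed = True
    return first

def first_of(string, first):
    """FIRST set of a mixed string; the empty string has no leftmost terminal."""
    if not string:
        return set()
    head = string[0]
    return set(first[head]) if head in first else {head}

FIRST = first_sets(GRAMMAR)
for alpha in ["a", "A", "S", "cA", "BA", ""]:
    print(repr(alpha), sorted(first_of(alpha, FIRST)))
```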
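Example: Constructing the NFA
S → A (1) | Bc (2)    A → aA (3) | a (4)    B → a (5) | ab (6)
First steps:
  q0 --ε--> [S → •A, ε], [S → •Bc, ε]
  [S → •A, ε] --A--> [S → A•, ε]
  [S → •A, ε] --ε--> [A → •aA, ε], [A → •a, ε]
  [S → •Bc, ε] --B--> [S → B•c, ε]
  [S → •Bc, ε] --ε--> [B → •a, c], [B → •ab, c]
  . . .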
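Example: Constructing the NFA
S → A (1) | Bc (2)    A → aA (3) | a (4)    B → a (5) | ab (6)
Complete NFA:
  q0 --ε--> [S → •A, ε], [S → •Bc, ε]
  [S → •A, ε] --A--> [S → A•, ε]
  [S → •A, ε] --ε--> [A → •aA, ε], [A → •a, ε]
  [S → •Bc, ε] --B--> [S → B•c, ε]
  [S → •Bc, ε] --ε--> [B → •a, c], [B → •ab, c]
  [S → B•c, ε] --c--> [S → Bc•, ε]
  [A → •aA, ε] --a--> [A → a•A, ε]
  [A → a•A, ε] --A--> [A → aA•, ε]
  [A → a•A, ε] --ε--> [A → •aA, ε], [A → •a, ε]
  [A → •a, ε] --a--> [A → a•, ε]
  [B → •a, c] --a--> [B → a•, c]
  [B → •ab, c] --a--> [B → a•b, c]
  [B → a•b, c] --b--> [B → ab•, c]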
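Example: Convert NFA to DFA
S → A | Bc    A → aA | a    B → a | ab
DFA states:
  1: [S → •A, ε] [S → •Bc, ε] [A → •aA, ε] [A → •a, ε] [B → •a, c] [B → •ab, c]
  2: [A → a•A, ε] [A → a•, ε] [B → a•, c] [B → a•b, c] [A → •aA, ε] [A → •a, ε]
  3: [A → a•A, ε] [A → a•, ε] [A → •aA, ε] [A → •a, ε]
  4: [A → aA•, ε]
  5: [S → A•, ε]
  6: [S → B•c, ε]
  7: [S → Bc•, ε]
  8: [B → ab•, c]
LEGEND
  shift terminal: 1 --a--> 2, 2 --a--> 3, 3 --a--> 3, 2 --b--> 8, 6 --c--> 7
  shift variable: 1 --A--> 5, 1 --B--> 6, 2 --A--> 4, 3 --A--> 4
  reduce: complete items in states 2, 3, 4, 5, 7, 8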
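The conversion can be sketched by computing closures of LR(1) item sets directly, which amounts to the subset construction applied to the NFA. The tuple encoding of items, the end marker `"$"` for the empty lookahead, and the assumption that no production has an empty right-hand side are choices made for this example only.

```python
GRAMMAR = {"S": ["A", "Bc"], "A": ["aA", "a"], "B": ["a", "ab"]}
END = "$"   # assumption: explicit marker standing for the empty lookahead

def first(string, seen=()):
    """Leftmost terminals of strings derived from a nonempty `string`."""
    head = string[0]
    if head not in GRAMMAR:
        return {head}
    if head in seen:
        return set()
    result = set()
    for production in GRAMMAR[head]:
        result |= first(production, seen + (head,))
    return result

def closure(items):
    """Follow all ε-transitions of the LR(1) NFA."""
    items, todo = set(items), list(items)
    while todo:
        lhs, rhs, dot, x = todo.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            for delta in GRAMMAR[rhs[dot]]:
                for y in first(rhs[dot + 1:] + x):
                    new = (rhs[dot], delta, 0, y)
                    if new not in items:
                        items.add(new)
                        todo.append(new)
    return frozenset(items)

def goto(state, symbol):
    """DFA transition: move the dot past `symbol`, then close."""
    return closure({(l, r, d + 1, x) for (l, r, d, x) in state
                    if d < len(r) and r[d] == symbol})

def build_dfa():
    start = closure({("S", production, 0, END) for production in GRAMMAR["S"]})
    states, edges, todo = {start}, {}, [start]
    while todo:
        state = todo.pop()
        for symbol in {r[d] for (_, r, d, _) in state if d < len(r)}:
            target = goto(state, symbol)
            edges[(state, symbol)] = target
            if target not in states:
                states.add(target)
                todo.append(target)
    return start, states, edges

start, states, edges = build_dfa()
print(len(states), "states,", len(edges), "transitions")
```

States are represented as frozensets so that two item sets reached along different paths are recognized as the same DFA state.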
Example: Resolving conflicts by lookahead
S → A (1) | Bc (2)    A → aA (3) | a (4)    B → a (5) | ab (6)
State 2: [A → a•A, ε] [A → a•, ε] [B → a•, c] [B → a•b, c] [A → •aA, ε] [A → •a, ε]
  next   action
  a      shift
  b      shift
  c      reduce B → a
  ε      reduce A → a
State 3: [A → a•A, ε] [A → a•, ε] [A → •aA, ε] [A → •a, ε]
  next   action
  a      shift
  b      error
  c      error
  ε      reduce A → a
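The two action tables can be derived mechanically from the items of each state. In this sketch the items are tuples `(variable, right-hand side, dot position, lookahead)` and the empty lookahead is written `"$"`. Both conventions, and the function name `actions`, are assumptions made for illustration.

```python
END = "$"   # assumption: explicit marker standing for the empty lookahead

def actions(state, terminals):
    """Map every possible next symbol to shift, a reduction, error, or conflict."""
    table = {}
    for symbol in list(terminals) + [END]:
        found = set()
        for lhs, rhs, dot, x in state:
            if dot == len(rhs) and x == symbol:
                found.add(f"reduce {lhs} -> {rhs}")   # complete item with matching lookahead
            elif dot < len(rhs) and rhs[dot] == symbol:
                found.add("shift")                    # the dot can move past the next symbol
        if not found:
            table[symbol] = "error"
        elif len(found) == 1:
            table[symbol] = found.pop()
        else:
            table[symbol] = "conflict"
    return table

STATE_2 = {("A", "aA", 1, END), ("A", "a", 1, END), ("B", "a", 1, "c"),
           ("B", "ab", 1, "c"), ("A", "aA", 0, END), ("A", "a", 0, END)}
STATE_3 = {("A", "aA", 1, END), ("A", "a", 1, END),
           ("A", "aA", 0, END), ("A", "a", 0, END)}

print(actions(STATE_2, "abc"))
print(actions(STATE_3, "abc"))
```

No entry comes out as "conflict" for these two states, which is what it means for the lookahead to resolve the S/R and R/R conflicts.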
Example: Reconstruct the parse tree
input: a b c
  stack     action   state
  (empty)   S        1
  1         S        2
  1 2       R        8
  1         S        6    (after pushing B)
  1 6       R        7
(Figure: the DFA with states 1 to 8 from the NFA-to-DFA example.)
parse tree: S → B c, B → a b
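A table-driven sketch of the whole parse, returning the parse tree as nested tuples. The tables are written out by hand for the eight-state DFA of this example. Treating states 4 and 5 as [A → aA•] and [S → A•] is an assumption about the numbering, and `"$"` again stands for the empty lookahead.

```python
END = "$"   # assumption: explicit marker standing for the empty lookahead

# Shift transitions on terminals and variables.
GOTO = {(1, "a"): 2, (1, "A"): 5, (1, "B"): 6, (2, "a"): 3, (2, "A"): 4,
        (2, "b"): 8, (3, "a"): 3, (3, "A"): 4, (6, "c"): 7}

# Reductions, selected by the current state and the next input symbol.
REDUCE = {(2, END): ("A", "a"), (2, "c"): ("B", "a"), (3, END): ("A", "a"),
          (4, END): ("A", "aA"), (5, END): ("S", "A"), (7, END): ("S", "Bc"),
          (8, "c"): ("B", "ab")}

def lr1_parse(word):
    """Return the parse tree of `word` as (variable, children); raise ValueError otherwise."""
    states, trees, pos = [1], [], 0
    while True:
        look = word[pos] if pos < len(word) else END
        state = states[-1]
        if (state, look) in REDUCE:
            lhs, rhs = REDUCE[(state, look)]
            children = trees[-len(rhs):]
            del trees[-len(rhs):]
            del states[-len(rhs):]
            trees.append((lhs, children))
            if lhs == "S":
                return trees[0]             # S is only reduced at the end of the input
            states.append(GOTO[(states[-1], lhs)])
        elif look != END and (state, look) in GOTO:
            states.append(GOTO[(state, look)])
            trees.append(look)              # a leaf of the parse tree
            pos += 1
        else:
            raise ValueError(f"unexpected {look!r} in state {state}")

print(lr1_parse("abc"))
```

The stack of subtrees mirrors the stack of states: a shift pushes a leaf, and a reduction replaces the subtrees of the right-hand side by a single node labelled with the variable.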