Parsing: Earley & PCFGs - University of Washington
TRANSCRIPT
Parsing: Earley & PCFGs
Ling 571 Deep Processing Techniques for NLP
January 19, 2011
Roadmap
- Motivation I: CKY limitations
- The Earley algorithm
- Motivation II: Ambiguity
- Probabilistic Context-Free Grammars (PCFGs)
CKY Algorithm
- Dynamic programming approach
- Yields parsing in cubic time in the length of the input string
- Fairly efficient
- Issues:
  - Requires CNF conversion
    - Limits expressiveness
  - Maintains ambiguity
Earley Parsing
- Avoids the repeated-work/recursion problem
- Dynamic programming: O(N^3)
  - Store partial parses in a "chart"
  - Compactly encodes ambiguity
- Chart entries:
  - Subtree for a single grammar rule
  - Progress in completing the subtree
  - Position of the subtree wrt the input
Earley Algorithm
- Uses dynamic programming to do parallel top-down search in (worst case) O(N^3) time
- First, a left-to-right pass fills out a chart with N+1 states
  - Think of chart entries as sitting between words in the input string, keeping track of states of the parse at these positions
- For each word position, the chart contains the set of states representing all partial parse trees generated to date, e.g. chart[0] contains all partial parse trees generated at the beginning of the sentence
Chart Entries
Represent three types of constituents:
- predicted constituents
- in-progress constituents
- completed constituents
Parse Progress
- Represented by dotted rules
- Position of the • indicates the type of constituent
- 0 Book 1 that 2 flight 3
  - S → • VP, [0,0] (predicted)
  - NP → Det • Nom, [1,2] (in progress)
  - VP → V NP •, [0,3] (completed)
- [x,y] tells us what portion of the input is spanned so far by this rule
- Each state si: <dotted rule>, [<back pointer>, <current position>]
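The state representation above can be sketched as a small data class; the class, field, and method names here are illustrative, not from the lecture:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class State:
    lhs: str         # left-hand side of the rule, e.g. "NP"
    rhs: tuple       # right-hand side symbols, e.g. ("Det", "Nom")
    dot: int         # position of the dot within rhs
    start: int       # back pointer: input position where the constituent begins
    end: int         # current position in the input

    def next_symbol(self):
        """Symbol to the right of the dot, or None if the rule is complete."""
        return self.rhs[self.dot] if self.dot < len(self.rhs) else None

    def is_complete(self):
        return self.dot == len(self.rhs)

# NP -> Det . Nom, [1,2]: Det has been parsed, Nom is predicted next
s = State("NP", ("Det", "Nom"), dot=1, start=1, end=2)
```

A completed state such as VP → V NP •, [0,3] has its dot at the end of the RHS, so `is_complete()` is true.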
0 Book 1 that 2 flight 3

S → • VP, [0,0]
- First 0 means the S constituent begins at the start of the input
- Second 0 means the dot is there too
- So, this is a top-down prediction

NP → Det • Nom, [1,2]
- the NP begins at position 1
- the dot is at position 2
- so, Det has been successfully parsed
- Nom is predicted next
0 Book 1 that 2 flight 3 (continued)

VP → V NP •, [0,3]
- Successful VP parse of the entire input
Successful Parse
- Final answer found by looking at the last entry in the chart
- If an entry resembles S → α •, [0,N], then the input was parsed successfully
- The chart will also contain a record of all possible parses of the input string, given the grammar
Parsing Procedure for the Earley Algorithm
- Move through each set of states in order, applying one of three operators to each state:
  - predictor: add predictions to the chart
  - scanner: read input and add the corresponding state to the chart
  - completer: move the dot to the right when a new constituent is found
- Results (new states) are added to the current or next set of states in the chart
- No backtracking and no states removed: keep the complete history of the parse
States and State Sets
- A dotted rule si is represented as <dotted rule>, [<back pointer>, <current position>]
- A state set Sj is a collection of states si with the same <current position>
Earley Algorithm from Book
Earley Algorithm (simpler!)
1. Add Start → · S, [0,0] to state set 0. Let i = 1.
2. Predict all states you can, adding new predictions to state set 0.
3. Scan input word i: add all matched states to state set Si. Add all new states produced by Complete to state set Si. Add all new states produced by Predict to state set Si. Let i = i + 1. Unless i = n, repeat step 3.
4. At the end, see if state set n contains Start → S ·, [0,n].
3 Main Sub-Routines of the Earley Algorithm
- Predictor: adds predictions into the chart.
- Completer: moves the dot to the right when new constituents are found.
- Scanner: reads the input words and enters states representing those words into the chart.
Predictor
- Intuition: create new states for top-down predictions of a new phrase
- Applied when non-part-of-speech non-terminals are to the right of a dot: S → • VP, [0,0]
- Adds new states to the current chart
  - One new state for each expansion of the non-terminal in the grammar:
    VP → • V, [0,0]
    VP → • V NP, [0,0]
- Formally: given Sj: A → α · B β, [i,j], add Sj: B → · γ, [j,j]
Scanner
- Intuition: create new states for rules matching the part of speech of the next word
- Applicable when a part of speech is to the right of a dot: VP → • V NP, [0,0] ('Book...')
- Looks at the current word in the input
- If it matches, adds state(s) to the next chart: VP → V • NP, [0,1]
- Formally: given Sj: A → α · B β, [i,j], add Sj+1: A → α B · β, [i,j+1]
Completer
- Intuition: the parser has finished a new phrase, so it must find and advance all states that were waiting for this constituent
- Applied when the dot has reached the right end of a rule: NP → Det Nom •, [1,3]
- Finds all states with the dot at 1 and expecting an NP: VP → V • NP, [0,1]
- Adds new (completed) state(s) to the current chart: VP → V NP •, [0,3]
- Formally: given Sk: B → δ ·, [j,k] and Sj: A → α · B β, [i,j], add Sk: A → α B · β, [i,k]
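The three operators above can be combined into one compact working sketch. This is an illustrative implementation, not the book's pseudocode: it assumes the grammar is a dict from non-terminal to possible right-hand sides, and `pos` maps each word to its part-of-speech tags; the names (`earley`, `GAMMA`) are made up here.

```python
from collections import namedtuple

State = namedtuple("State", "lhs rhs dot start end")

def next_sym(st):
    """Symbol to the right of the dot, or None if the state is complete."""
    return st.rhs[st.dot] if st.dot < len(st.rhs) else None

def earley(words, grammar, pos, start="S"):
    n = len(words)
    chart = [[] for _ in range(n + 1)]

    def add(i, st):
        if st not in chart[i]:
            chart[i].append(st)

    add(0, State("GAMMA", (start,), 0, 0, 0))      # dummy start rule
    for i in range(n + 1):
        j = 0
        while j < len(chart[i]):                   # chart[i] grows as we scan it
            st = chart[i][j]
            nxt = next_sym(st)
            if nxt is None:                        # COMPLETER: advance waiters
                for old in chart[st.start]:
                    if next_sym(old) == st.lhs:
                        add(i, old._replace(dot=old.dot + 1, end=i))
            elif nxt in grammar:                   # PREDICTOR: expand non-terminal
                for rhs in grammar[nxt]:
                    add(i, State(nxt, rhs, 0, i, i))
            elif i < n and nxt in pos.get(words[i], ()):   # SCANNER: match POS
                add(i + 1, st._replace(dot=st.dot + 1, end=i + 1))
            j += 1
    return State("GAMMA", (start,), 1, 0, n) in chart[n]

# Toy grammar for "Book that flight" (rule set is illustrative)
grammar = {"S": [("VP",)], "VP": [("V", "NP")],
           "NP": [("Det", "Nom")], "Nom": [("Noun",)]}
pos = {"Book": {"V"}, "that": {"Det"}, "flight": {"Noun"}}
ok = earley(["Book", "that", "flight"], grammar, pos)   # → True
```

The final check looks for the completed dummy start state spanning [0,n], mirroring step 4 of the simpler algorithm above.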
1/17/11 Speech and Language Processing - Jurafsky and Martin
Chart[0]
Note that given a grammar, these entries are the same for all inputs; they can be pre-loaded.
Chart[1]
Charts[2] and [3]
How do we retrieve the parses at the end?
- Augment the Completer to add pointers to the prior states it advances, as a field in the current state
  - i.e., which state did we advance here?
- Read the pointers back from the final state
- What about ambiguity?
  - CKY/Earley can represent it
  - But they can't resolve it
Probabilistic Parsing
- Provides a strategy for solving the disambiguation problem:
  - Compute the probability of all analyses
  - Select the most probable
- Also employed in language modeling for speech recognition
  - N-gram grammars predict words, constrain search
  - Also constrain generation and translation
PCFGs
- Probabilistic Context-Free Grammars
- An augmentation of CFGs
PCFGs
- Augment each production with the probability that the LHS will be expanded as the RHS
  - P(A → β), i.e. P(A → β | A) = P(RHS | LHS)
- The sum over all possible expansions is 1:

  Σβ P(A → β) = 1

- A PCFG is consistent if the sum of the probabilities of all sentences in the language is 1
  - Recursive rules often yield inconsistent grammars
Disambiguation
- A PCFG assigns a probability to each parse tree T for input S
- Probability of T: the product of all rules used to derive T

  P(T,S) = Πi=1..n P(RHSi | LHSi)

  P(T,S) = P(T) P(S|T) = P(T)
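As a quick sketch, the product over rules can be computed directly; the rule probabilities below are made up for illustration, not taken from the lecture's grammar:

```python
from math import prod

# Hypothetical rule probabilities (illustrative values only)
rule_prob = {
    ("S", ("NP", "VP")): 0.8,
    ("NP", ("Det", "Noun")): 0.4,
    ("VP", ("V", "NP")): 0.3,
}

def tree_prob(rules_used):
    """P(T) = product over the derivation of P(RHS_i | LHS_i)."""
    return prod(rule_prob[r] for r in rules_used)

p = tree_prob([("S", ("NP", "VP")),
               ("NP", ("Det", "Noun")),
               ("VP", ("V", "NP"))])
# 0.8 * 0.4 * 0.3 = 0.096
```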
Two candidate parses of the same input (trees shown on slides 33-47), built up rule by rule:

P(T,S) = 0.05 × 0.2 × 0.2 × 0.2 × 0.75 × 0.3 × 0.6 × 0.1 × 0.4 = 2.2×10^-6

P(T,S) = 0.05 × 0.1 × 0.2 × 0.15 × 0.75 × 0.75 × 0.3 × 0.6 × 0.1 × 0.4 = 6.1×10^-7
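The arithmetic for the first parse can be checked in a couple of lines; the second parse's probability is taken directly from the value reported on the slide:

```python
from math import prod

# Factors of the first parse, as listed on the slide
p1 = prod([0.05, 0.2, 0.2, 0.2, 0.75, 0.3, 0.6, 0.1, 0.4])
# Final value reported on the slide for the second parse
p2 = 6.1e-7

best = "parse 1" if p1 > p2 else "parse 2"   # pick the more probable tree
```

Since 2.2×10^-6 > 6.1×10^-7, the first parse is preferred.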
Formalizing Disambiguation
- Select T̂ such that:
  - The string of words S is the yield of the parse tree
  - Select the tree that maximizes the probability of the parse:

  T̂(S) = argmax{T s.t. S = yield(T)} P(T)
Learning Probabilities
- Simplest way: a treebank of parsed sentences
- To compute the probability of a rule, count:
  - The number of times the non-terminal is expanded
  - The number of times the non-terminal is expanded by the given rule

  P(α → β | α) = Count(α → β) / Σγ Count(α → γ) = Count(α → β) / Count(α)

- Alternative: learn probabilities by re-estimation (later)
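The counting scheme above can be sketched over a toy treebank; the trees here are hypothetical data, with each tree reduced to the list of (LHS, RHS) rules used to derive it:

```python
from collections import Counter

# A toy "treebank" (hypothetical data, for illustration only)
trees = [
    [("S", ("NP", "VP")), ("NP", ("Det", "Noun")), ("VP", ("V",))],
    [("S", ("NP", "VP")), ("NP", ("Pronoun",)), ("VP", ("V", "NP")),
     ("NP", ("Det", "Noun"))],
]

rule_counts = Counter(r for t in trees for r in t)           # Count(a -> b)
lhs_counts = Counter(lhs for t in trees for (lhs, _) in t)   # Count(a)

def mle_prob(lhs, rhs):
    """P(lhs -> rhs | lhs) = Count(lhs -> rhs) / Count(lhs)."""
    return rule_counts[(lhs, rhs)] / lhs_counts[lhs]
```

Here NP is expanded 3 times, twice as Det Noun, so mle_prob("NP", ("Det", "Noun")) is 2/3.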
Example PCFG
Issues with PCFGs
- Independence assumptions:
  - Rule expansion is context-independent
  - Allows us to multiply probabilities
  - Is this valid?

            Pronoun   Non-pronoun
  Subject     91%        9%
  Object      34%        66%

- In the Treebank: roughly equi-probable overall
- How can we handle this?
  - Condition on Subj/Obj with parent annotation
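Parent annotation can be sketched as a simple tree transform: each non-terminal is split by the label of its parent, so a subject NP (under S) and an object NP (under VP) become distinct symbols. The tuple-based tree format and the `^` separator here are illustrative choices:

```python
def annotate(tree, parent=None):
    """tree: (label, child, ...) tuples, with plain strings as words."""
    if isinstance(tree, str):          # terminal word: leave unchanged
        return tree
    label, *children = tree
    new_label = f"{label}^{parent}" if parent else label
    return (new_label, *[annotate(c, label) for c in children])

t = ("S",
     ("NP", "she"),
     ("VP", ("V", "saw"),
            ("NP", ("Det", "the"), ("Noun", "cat"))))
annotated = annotate(t)
# The subject NP becomes NP^S; the object NP becomes NP^VP
```

A PCFG trained on annotated trees can then give NP^S and NP^VP different expansion probabilities, capturing the subject/object asymmetry in the table above.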
Issues with PCFGs
- Insufficient lexical conditioning
  - Present only in pre-terminal rules
- Are there cases where other rules should be conditioned on words?
  - Yes: different verbs & prepositions have different attachment preferences
PCFGs
- Augment parsers to handle probabilities
- Adapt PCFGs to handle:
  - Structural dependencies, by splitting nodes
  - Lexical dependencies, by lexicalizing PCFGs