Probabilistic Parsing: Enhancements
Ling 571: Deep Processing Techniques for NLP
January 26, 2011
Parser Issues

PCFGs make many (unwarranted) independence assumptions:
- Structural dependency: NP -> Pronoun is much more likely in subject position
- Lexical dependency: verb subcategorization, coordination ambiguity
Improving PCFGs: Structural Dependencies

How can we capture the subject/object asymmetry, e.g., NPsubj -> Pron vs. NPobj -> Pron?

Parent annotation: annotate each node with its parent in the parse tree
- E.g., NP^S vs. NP^VP
- Also annotate pre-terminals: RB^ADVP vs. RB^VP, IN^SBAR vs. IN^PP
- Can also split rules on other conditions
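Parent annotation is mechanical enough to sketch in a few lines. The following is an illustrative Python sketch, not from the lecture: trees are encoded as nested lists, and the function name is my own.

```python
# A minimal sketch of parent annotation on bracketed trees.
# Trees are nested lists: [label, child1, child2, ...]; leaves are strings.

def parent_annotate(tree, parent=None):
    """Return a copy of tree with each non-terminal label
    rewritten as LABEL^PARENT (pre-terminals included)."""
    label, *children = tree
    new_label = f"{label}^{parent}" if parent is not None else label
    new_children = [
        parent_annotate(c, label) if isinstance(c, list) else c
        for c in children
    ]
    return [new_label, *new_children]

tree = ["S", ["NP", ["PRP", "he"]], ["VP", ["VBD", "slept"]]]
print(parent_annotate(tree))
# NP becomes NP^S; the pre-terminal VBD becomes VBD^VP
```

Note that each distinct LABEL^PARENT pair becomes a new non-terminal, which is exactly where the grammar growth discussed below comes from.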
Parent Annotation

Parent Annotation: Pre-terminals
Parent Annotation

Advantages:
- Captures structural dependency in grammars

Disadvantages:
- Increases the number of rules in the grammar
- Decreases the amount of training data per rule

Strategies exist to search for the optimal number of rules.
Improving PCFGs: Lexical Dependencies

Lexicalized rules (used by the best-known parsers, the Collins and Charniak parsers):
- Each non-terminal is annotated with its lexical head, e.g. the verb for a verb phrase, the noun for a noun phrase
- Each rule must identify one RHS element as the head
- Heads propagate up the tree
- Conceptually like adding one rule per head value:
  VP(dumped) -> VBD(dumped) NP(sacks) PP(into)
  VP(dumped) -> VBD(dumped) NP(cats) PP(into)
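Head propagation can be sketched with a toy head table. The table below is an illustrative stand-in for real head-percolation rules (e.g. Collins's tables), not the lecture's actual rules, and the tree encoding is my own.

```python
# Illustrative sketch of head propagation for lexicalization.
# HEAD_CHILD names which child category supplies the head (toy table).

HEAD_CHILD = {"VP": "VBD", "NP": "NNS", "PP": "IN", "S": "VP"}

def lexicalize(tree):
    """Annotate each node label as LABEL(headword), propagating
    the head word of the designated head child upward."""
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        word = children[0]            # pre-terminal: head is its word
        return [f"{label}({word})", word], word
    new_children, head = [], None
    for child in children:
        new_child, word = lexicalize(child)
        new_children.append(new_child)
        if child[0] == HEAD_CHILD.get(label):
            head = word               # head child found: keep its word
    return [f"{label}({head})", *new_children], head

tree = ["VP", ["VBD", "dumped"], ["NP", ["NNS", "sacks"]],
        ["PP", ["IN", "into"], ["NP", ["NNS", "bins"]]]]
lex, head = lexicalize(tree)
print(head)   # dumped
```

Running this reproduces labels like VP(dumped) and PP(into) from the rule above, with the verb's head word visible at the top of the VP.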
Lexicalized PCFGs

Also add a head tag to non-terminals. The head tag is the part-of-speech tag of the head word:
VP(dumped) -> VBD(dumped) NP(sacks) PP(into)
becomes
VP(dumped,VBD) -> VBD(dumped,VBD) NP(sacks,NNS) PP(into,IN)

Two types of rules:
- Lexical rules: pre-terminal -> word; deterministic, probability 1
- Internal rules: all other expansions; probabilities must be estimated
PLCFGs

Issue: Too many rules. There is no way to find a corpus with enough examples of each.

(Partial) Solution: Assume independence.
- Condition each rule on the category of the LHS and the head
- Condition each head on the category of the LHS and the parent's head
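The first of these conditional estimates can be sketched as a maximum-likelihood ratio of counts. The counts in the dict below are hypothetical, chosen only to make the arithmetic visible.

```python
# Sketch of P(rule | LHS category, head) as an MLE over
# hypothetical treebank counts (the numbers are made up).
from collections import Counter

rule_counts = Counter({
    ("VP", "dumped", ("VBD", "NP", "PP")): 6,
    ("VP", "dumped", ("VBD", "PP")): 3,
})

def p_rule(lhs, head, rhs):
    """Count of this exact rule, divided by the count of all
    expansions of the same (LHS, head) pair."""
    total = sum(c for (l, h, _), c in rule_counts.items()
                if l == lhs and h == head)
    return rule_counts[(lhs, head, rhs)] / total if total else 0.0

print(p_rule("VP", "dumped", ("VBD", "NP", "PP")))  # 6/9, about 0.667
```

The second estimate, for heads, has the same shape with (LHS category, parent's head) as the conditioning context.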
Disambiguation Example

p(VP -> VBD NP PP | VP, dumped) = C(VP(dumped) -> VBD NP PP) / Σ_β C(VP(dumped) -> β) = 6/9 = .67

p(VP -> VBD NP | VP, dumped) = C(VP(dumped) -> VBD NP) / Σ_β C(VP(dumped) -> β) = 0/9 = 0

p(into | PP, dumped) = C(X(dumped) -> ... PP(into) ...) / Σ C(X(dumped) -> ... PP ...) = 2/9 = .22

p(into | PP, sacks) = C(X(sacks) -> ... PP(into) ...) / Σ C(X(sacks) -> ... PP ...) = 0/0 (undefined)
Improving PCFGs: Tradeoffs

Tension: to increase accuracy, we increase specificity (e.g. lexicalization, parent annotation, Markovization, etc.), but this:
- Increases grammar size
- Increases processing time
- Increases training data requirements

How can we balance these?
CNF Factorization & Markovization

CNF factorization converts n-ary branching to binary branching. It can maintain information about the original structure: the neighborhood history and the parent.

Issue: this is potentially explosive. If we keep all context: 72 -> 10K non-terminals!

How much context should we keep? What Markov order?
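The Markov-order choice can be made concrete with a small sketch. The function below binarizes a rule while remembering at most h previously generated sibling labels in each intermediate symbol; the symbol-naming convention is illustrative, not from the lecture.

```python
# Sketch of binarization with horizontal Markovization: keep only
# the last h sibling labels in each intermediate symbol.

def binarize(lhs, rhs, h=1):
    """Turn LHS -> X1 ... Xn into binary rules, remembering at most
    h previously generated siblings in the new symbols."""
    if len(rhs) <= 2:
        return [(lhs, tuple(rhs))]
    rules, parent = [], lhs
    for i in range(len(rhs) - 2):
        hist = rhs[max(0, i - h + 1): i + 1]      # last h siblings
        new_sym = f"{lhs}|<{'-'.join(hist)}>"
        rules.append((parent, (rhs[i], new_sym)))
        parent = new_sym
    rules.append((parent, (rhs[-2], rhs[-1])))
    return rules

rules = binarize("VP", ["VBD", "NP", "PP", "PP"], h=1)
for r in rules:
    print(r)
```

With h=1 only the immediately preceding sibling survives in the symbol name; raising h toward "keep everything" is what produces the non-terminal blow-up mentioned above.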
Different Markov Orders
Markovization & Costs (Mohri & Roark, 2006)
Efficiency

PCKY runs in time cubic in sentence length, times a grammar-size factor, and:
- The grammar can be huge
- The grammar can be extremely ambiguous: hundreds of analyses are not unusual, especially for long sentences

However, we only care about the best parses; the others can be pretty bad. Can we use this to improve efficiency?
Beam Thresholding

Inspired by the beam search algorithm. Assume that low-probability partial parses are unlikely to yield a high-probability parse overall, and keep only the k most probable partial parses: retain only k choices per cell.
- For large grammars, k could be 50 or 100
- For small grammars, 5 or 10
Heuristic Filtering

Intuition: some rules/partial parses are unlikely to end up in the best parse, so don't store them in the table.

Exclusions:
- Low frequency: exclude singleton productions
- Low probability: exclude constituents x such that p(x) < 10^-200
- Low relative probability: exclude x if there exists y such that p(y) > 100 * p(x)
Notes on HW #3

Outline:
- Induce a grammar from a (small) treebank
- Implement probabilistic CKY
- Evaluate the parser
- Improve the parser
Treebank Format

Adapted from the Penn Treebank format. Rules simplified:
- Removed traces and other null elements
- Removed complex tags
- Reformatted POS tags as non-terminals
Reading the Parses

POS unary collapse:
(NP_NNP Ontario) was (NP (NNP Ontario))

Binarization:
VP -> VP' PP; VP' -> VB PP
was VP -> VB PP PP
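Undoing the binarization when reading parses back out can be sketched as a splice of the intermediate nodes. The primed-symbol convention (VP') follows the slide; the nested-list tree encoding is illustrative.

```python
# Sketch of undoing binarization: VP -> X VP' with VP' -> Y Z
# is spliced back into VP -> X Y Z.

def unbinarize(tree):
    """Splice out intermediate primed nodes introduced by
    binarization, restoring n-ary branching."""
    if isinstance(tree, str):
        return tree
    label, *children = tree
    out = []
    for c in map(unbinarize, children):
        if not isinstance(c, str) and c[0] == label + "'":
            out.extend(c[1:])          # splice the primed node's children
        else:
            out.append(c)
    return [label, *out]

t = ["VP", ["VP'", ["VB", "put"], ["PP", ["IN", "on"]]],
     ["PP", ["IN", "in"]]]
print(unbinarize(t))
```

The POS unary collapse could be undone similarly, by splitting a label like NP_NNP back into an NP node dominating an NNP node.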
Start Early!