![Page 1: Fall 2004 Lecture Notes #4 EECS 595 / LING 541 / SI 661 Natural Language Processing](https://reader035.vdocuments.us/reader035/viewer/2022062322/56649ed45503460f94be4d61/html5/thumbnails/1.jpg)
Fall 2004
Lecture Notes #4
EECS 595 / LING 541 / SI 661
Natural Language Processing
![Page 2: Fall 2004 Lecture Notes #4 EECS 595 / LING 541 / SI 661 Natural Language Processing](https://reader035.vdocuments.us/reader035/viewer/2022062322/56649ed45503460f94be4d61/html5/thumbnails/2.jpg)
Parsing with Context-Free Grammars
![Page 3: Fall 2004 Lecture Notes #4 EECS 595 / LING 541 / SI 661 Natural Language Processing](https://reader035.vdocuments.us/reader035/viewer/2022062322/56649ed45503460f94be4d61/html5/thumbnails/3.jpg)
Introduction
• Parsing = associating a structure (parse tree) with an input string using a grammar
• CFGs are declarative: they don’t specify how the parse tree will be constructed
• Parse trees are used in grammar checking, semantic analysis, machine translation, question answering, information extraction
• Example: “How many people in the Human Resources Department receive salaries above $30,000?”
![Page 4: Fall 2004 Lecture Notes #4 EECS 595 / LING 541 / SI 661 Natural Language Processing](https://reader035.vdocuments.us/reader035/viewer/2022062322/56649ed45503460f94be4d61/html5/thumbnails/4.jpg)
Parsing as search
Grammar rules:
S → NP VP
S → Aux NP VP
S → VP
NP → Det Nominal
Nominal → Noun
Nominal → Noun Nominal
NP → Proper-Noun
VP → Verb
VP → Verb NP
Nominal → Nominal PP
Lexicon:
Det → that | this | a
Noun → book | flight | meal | money
Verb → book | include | prefer
Aux → does
Proper-Noun → Houston | TWA
Prep → from | to | on
![Page 5: Fall 2004 Lecture Notes #4 EECS 595 / LING 541 / SI 661 Natural Language Processing](https://reader035.vdocuments.us/reader035/viewer/2022062322/56649ed45503460f94be4d61/html5/thumbnails/5.jpg)
Parsing as search
Book that flight.
[S [VP [Verb Book] [NP [Det that] [Nom [Noun flight]]]]]
Two types of constraints on the parses: a) some that come from the input string, b) others that come from the grammar
![Page 6: Fall 2004 Lecture Notes #4 EECS 595 / LING 541 / SI 661 Natural Language Processing](https://reader035.vdocuments.us/reader035/viewer/2022062322/56649ed45503460f94be4d61/html5/thumbnails/6.jpg)
Top-down parsing
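[Figure: the top-down search space, expanded ply by ply]
Ply 1: S
Ply 2: [S NP VP]; [S Aux NP VP]; [S VP]
Ply 3: [S [NP Det Nom] VP]; [S [NP PropN] VP]; [S Aux [NP Det Nom] VP]; [S Aux [NP PropN] VP]; [S [VP V NP]]; [S [VP V]]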
![Page 7: Fall 2004 Lecture Notes #4 EECS 595 / LING 541 / SI 661 Natural Language Processing](https://reader035.vdocuments.us/reader035/viewer/2022062322/56649ed45503460f94be4d61/html5/thumbnails/7.jpg)
Bottom-up parsing
[Figure: the bottom-up search space for “Book that flight”. Starting from the words, “Book” is tagged as either Noun or Verb, giving the sequences Noun Det Noun and Verb Det Noun; successive plies add NOM nodes over the nouns, then NP and VP nodes.]
![Page 8: Fall 2004 Lecture Notes #4 EECS 595 / LING 541 / SI 661 Natural Language Processing](https://reader035.vdocuments.us/reader035/viewer/2022062322/56649ed45503460f94be4d61/html5/thumbnails/8.jpg)
Comparing TD and BU parsers
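• TD never wastes time exploring trees that cannot result in an S, but it does generate trees that are inconsistent with the input.
• BU never spends effort on trees that are not consistent with the input, but it does build subtrees that can never lead to an S.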
• Needed: some middle ground.
![Page 9: Fall 2004 Lecture Notes #4 EECS 595 / LING 541 / SI 661 Natural Language Processing](https://reader035.vdocuments.us/reader035/viewer/2022062322/56649ed45503460f94be4d61/html5/thumbnails/9.jpg)
Basic TD parser
• Practically infeasible to generate all trees in parallel.
• Use depth-first strategy.
• When arriving at a tree that is inconsistent with the input, return to the most recently generated but still unexplored tree.
fig10.05.pdf
![Page 10: Fall 2004 Lecture Notes #4 EECS 595 / LING 541 / SI 661 Natural Language Processing](https://reader035.vdocuments.us/reader035/viewer/2022062322/56649ed45503460f94be4d61/html5/thumbnails/10.jpg)
function TOP-DOWN-PARSE(input, grammar) returns a parse tree
    agenda ← (Initial S tree, Beginning of input)
    current-search-state ← POP(agenda)
    loop
        if SUCCESSFUL-PARSE?(current-search-state) then
            return TREE(current-search-state)
        else
            if CAT(NODE-TO-EXPAND(current-search-state)) is a POS then
                if CAT(node-to-expand) ∈ POS(CURRENT-INPUT(current-search-state)) then
                    PUSH(APPLY-LEXICAL-RULE(current-search-state), agenda)
                else
                    return reject
            else
                PUSH(APPLY-RULES(current-search-state, grammar), agenda)
        if agenda is empty then
            return reject
        else
            current-search-state ← NEXT(agenda)
    end
A TD-DF-LR (top-down, depth-first, left-to-right) parser
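The agenda-based control loop can be sketched in Python. This is only a minimal recognizer under stated assumptions, not the textbook procedure itself: a search state is a pair (symbols still to expand, input position), the agenda is a stack (which gives the depth-first order), and a failed lexical match simply drops that state so the loop backtracks to the next one. The names `GRAMMAR`, `LEXICON`, and `top_down_recognize` are illustrative, and the left-recursive rule Nominal → Nominal PP is deliberately left out.

```python
# Grammar and lexicon from the "Parsing as search" slide, minus the
# left-recursive rule Nominal -> Nominal PP (assumption: omitted on purpose,
# because a naive top-down, depth-first parser would loop on it).
GRAMMAR = {
    "S": [["NP", "VP"], ["Aux", "NP", "VP"], ["VP"]],
    "NP": [["Det", "Nominal"], ["Proper-Noun"]],
    "Nominal": [["Noun"], ["Noun", "Nominal"]],
    "VP": [["Verb"], ["Verb", "NP"]],
}
LEXICON = {
    "Det": {"that", "this", "a"},
    "Noun": {"book", "flight", "meal", "money"},
    "Verb": {"book", "include", "prefer"},
    "Aux": {"does"},
    "Proper-Noun": {"houston", "twa"},
}


def top_down_recognize(words, grammar=GRAMMAR, lexicon=LEXICON, start="S"):
    """Return True if `words` can be derived from `start`."""
    words = [w.lower() for w in words]
    # Each search state: (symbols still to expand, position in the input).
    agenda = [((start,), 0)]
    while agenda:
        symbols, pos = agenda.pop()  # stack discipline = depth-first
        if not symbols:
            if pos == len(words):
                return True  # everything expanded and all input consumed
            continue  # input left over: backtrack
        first, rest = symbols[0], symbols[1:]
        if first in lexicon:
            # Part-of-speech node: must match the current input word.
            if pos < len(words) and words[pos] in lexicon[first]:
                agenda.append((rest, pos + 1))
        else:
            # Expand the leftmost non-terminal; push in reverse so the
            # first rule listed in the grammar is explored first.
            for rhs in reversed(grammar[first]):
                agenda.append((tuple(rhs) + rest, pos))
    return False
```

For example, `top_down_recognize("Does this flight include a meal".split())` succeeds via S → Aux NP VP, while "Book that flight" succeeds only after the S → NP VP and S → Aux NP VP expansions have been tried and abandoned.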
![Page 11: Fall 2004 Lecture Notes #4 EECS 595 / LING 541 / SI 661 Natural Language Processing](https://reader035.vdocuments.us/reader035/viewer/2022062322/56649ed45503460f94be4d61/html5/thumbnails/11.jpg)
An example
Does this flight include a meal?
fig10.07.pdf fig10.08.pdf fig10.09.pdf
![Page 12: Fall 2004 Lecture Notes #4 EECS 595 / LING 541 / SI 661 Natural Language Processing](https://reader035.vdocuments.us/reader035/viewer/2022062322/56649ed45503460f94be4d61/html5/thumbnails/12.jpg)
Problems with the basic parser
• Left-recursion: rules of the type NP → NP PP. Solution: rewrite each rule of the form A → A β | α using a new symbol A’: A → α A’, A’ → β A’ | ε
• Ambiguity: attachment ambiguity, coordination ambiguity, noun-phrase bracketing ambiguity
• Attachment ambiguity: I saw the Grand Canyon flying to New York
• Coordination ambiguity: old men and women
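The left-recursion rewrite (A → A β | α becomes A → α A’, A’ → β A’ | ε) can be sketched as a small Python function. This is a minimal sketch that handles immediate left recursion only; the function name is hypothetical, productions are represented as lists of symbols, and the empty list stands for ε.

```python
def eliminate_immediate_left_recursion(nonterminal, productions):
    """Rewrite A -> A beta | alpha as A -> alpha A', A' -> beta A' | epsilon.

    `productions` is a list of right-hand sides, each a list of symbols.
    Returns a dict mapping each non-terminal to its new right-hand sides.
    Assumes no production is just [nonterminal] (i.e. no rule A -> A).
    """
    new_symbol = nonterminal + "'"
    # The "beta" parts: what follows the left-recursive occurrence of A.
    betas = [rhs[1:] for rhs in productions if rhs and rhs[0] == nonterminal]
    # The "alpha" parts: productions that do not start with A.
    alphas = [rhs for rhs in productions if not rhs or rhs[0] != nonterminal]
    if not betas:
        return {nonterminal: productions}  # nothing to rewrite
    return {
        nonterminal: [alpha + [new_symbol] for alpha in alphas],
        new_symbol: [beta + [new_symbol] for beta in betas] + [[]],  # [] = epsilon
    }


rewritten = eliminate_immediate_left_recursion(
    "NP", [["NP", "PP"], ["Det", "Nominal"]]
)
```

Here `rewritten` holds NP → Det Nominal NP’ and NP’ → PP NP’ | ε, which generates the same strings without ever expanding NP as its own leftmost child.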
![Page 13: Fall 2004 Lecture Notes #4 EECS 595 / LING 541 / SI 661 Natural Language Processing](https://reader035.vdocuments.us/reader035/viewer/2022062322/56649ed45503460f94be4d61/html5/thumbnails/13.jpg)
Problems with the basic parser
• Example: President Kennedy today pushed aside other White House business to devote all his time and attention to working on the Berlin crisis address he will deliver tomorrow night to the American people over nationwide television and radio.
• Solutions: return all parses or include disambiguation in the parser.
• Inefficient reparsing of subtrees: a flight from Indianapolis to Houston on TWA
fig10.10.pdf fig10.11.pdf fig10.12.pdf fig10.13.pdf
![Page 14: Fall 2004 Lecture Notes #4 EECS 595 / LING 541 / SI 661 Natural Language Processing](https://reader035.vdocuments.us/reader035/viewer/2022062322/56649ed45503460f94be4d61/html5/thumbnails/14.jpg)
The Earley algorithm
• Resolving:
– Left-recursive rules
– Ambiguity
– Inefficient reparsing of subtrees
• A chart with N+1 entries
• Dotted rules
– S → . VP, [0,0]
– NP → Det . Nominal, [1,2]
– VP → V NP ., [0,3]
fig10.14.pdf fig10.15.pdf fig10.16.pdf fig10.17.pdf
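A compact Earley recognizer shows how the chart and dotted rules fit together. This is a sketch under assumptions rather than the textbook's exact procedure: the scanner advances the dot directly over a part-of-speech symbol instead of adding a separate lexical state, the grammar has no ε-rules, and the rule PP → Prep NP is an assumption added so that Nominal → Nominal PP can be used (it does not appear on the grammar slide). The names `State`, `build_chart`, and `earley_recognize` are illustrative.

```python
from collections import namedtuple

# A dotted rule plus the position where it started; the end position is
# the index of the chart entry that holds the state.
State = namedtuple("State", "lhs rhs dot start")

GRAMMAR = {
    "S": [("NP", "VP"), ("Aux", "NP", "VP"), ("VP",)],
    "NP": [("Det", "Nominal"), ("Proper-Noun",)],
    "Nominal": [("Noun",), ("Noun", "Nominal"), ("Nominal", "PP")],
    "VP": [("Verb",), ("Verb", "NP")],
    "PP": [("Prep", "NP")],  # assumption: not on the grammar slide
}
LEXICON = {
    "Det": {"that", "this", "a"},
    "Noun": {"book", "flight", "meal", "money"},
    "Verb": {"book", "include", "prefer"},
    "Aux": {"does"},
    "Proper-Noun": {"houston", "twa"},
    "Prep": {"from", "to", "on"},
}


def build_chart(words, grammar=GRAMMAR, lexicon=LEXICON, start="S"):
    words = [w.lower() for w in words]
    n = len(words)
    chart = [[] for _ in range(n + 1)]  # N+1 entries for N words

    def add(state, i):
        if state not in chart[i]:  # never store a state twice
            chart[i].append(state)

    add(State("GAMMA", (start,), 0, 0), 0)  # dummy start state
    for i in range(n + 1):
        j = 0
        while j < len(chart[i]):  # chart[i] may grow while we process it
            st = chart[i][j]
            j += 1
            if st.dot < len(st.rhs):
                nxt = st.rhs[st.dot]
                if nxt in lexicon:
                    # Scanner: consume the next word if it has this POS.
                    if i < n and words[i] in lexicon[nxt]:
                        add(State(st.lhs, st.rhs, st.dot + 1, st.start), i + 1)
                else:
                    # Predictor: introduce every rule for the expected symbol.
                    for rhs in grammar[nxt]:
                        add(State(nxt, rhs, 0, i), i)
            else:
                # Completer: advance states that were waiting for st.lhs.
                for prev in list(chart[st.start]):
                    if prev.dot < len(prev.rhs) and prev.rhs[prev.dot] == st.lhs:
                        add(State(prev.lhs, prev.rhs, prev.dot + 1, prev.start), i)
    return chart


def earley_recognize(words, start="S"):
    chart = build_chart(words, start=start)
    return State("GAMMA", (start,), 1, 0) in chart[len(words)]
```

Because a state is never added to a chart entry twice, the left-recursive rule Nominal → Nominal PP is predicted once per position instead of looping, and completed constituents are reused rather than reparsed. For "book that flight", the chart contains the three dotted rules listed above (with V written as Verb).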
![Page 15: Fall 2004 Lecture Notes #4 EECS 595 / LING 541 / SI 661 Natural Language Processing](https://reader035.vdocuments.us/reader035/viewer/2022062322/56649ed45503460f94be4d61/html5/thumbnails/15.jpg)
Parsing with FSAs
• Shallow parsing
• Useful for information extraction: noun phrases, verb phrases, locations, etc.
• The FASTUS system (Appelt and Israel, 1997)
• Sample rules for noun groups:
NG → Pronoun | Time-NP | Date-NP
NG → (DETP) (Adjs) HdNns | DETP Ving HdNns
DETP → DETP-CP | DETP-INCP
• Complete determiner-phrases: “the only five”, “another three”, “this”, “many”, “hers”, “all”, “the most”
fig10.19.pdf fig10.18.pdf fig10.20.pdf fig10.21.pdf
![Page 16: Fall 2004 Lecture Notes #4 EECS 595 / LING 541 / SI 661 Natural Language Processing](https://reader035.vdocuments.us/reader035/viewer/2022062322/56649ed45503460f94be4d61/html5/thumbnails/16.jpg)
Sample FASTUS output
Company Name: Bridgestone Sports Co.
Verb Group: said
Noun Group: Friday
Noun Group: it
Verb Group: had set up
Noun Group: a joint venture
Preposition: in
Location: Taiwan
Preposition: with
Noun Group: a local concern
Conjunction: and
Noun Group: a Japanese trading house
Verb Group: to produce
Noun Group: golf clubs
Verb Group: to be shipped
Preposition: to
Location: Japan
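The noun-group pattern NG → (DETP) (Adjs) HdNns can be approximated by a small finite-state scan over part-of-speech tags. The sketch below is illustrative only and is not FASTUS code: the tag names `DETP`, `ADJ`, and `N`, the hand-tagged input, and the function `find_noun_groups` are all assumptions made for the example.

```python
def find_noun_groups(tagged):
    """Scan (word, tag) pairs for the pattern: optional DETP, any ADJs, one or more Ns."""
    groups = []
    i, n = 0, len(tagged)
    while i < n:
        j = i
        if j < n and tagged[j][1] == "DETP":  # optional determiner phrase
            j += 1
        while j < n and tagged[j][1] == "ADJ":  # zero or more adjectives
            j += 1
        k = j
        while k < n and tagged[k][1] == "N":  # one or more head nouns
            k += 1
        if k > j:
            # Reached the accepting state: emit the group and jump past it.
            groups.append(" ".join(word for word, _ in tagged[i:k]))
            i = k
        else:
            i += 1  # no noun group starts here
    return groups


# Hand-tagged fragment of the sample sentence (tags are assumptions).
tagged = [
    ("a", "DETP"), ("joint", "ADJ"), ("venture", "N"),
    ("in", "PREP"), ("Taiwan", "LOC"), ("with", "PREP"),
    ("a", "DETP"), ("local", "ADJ"), ("concern", "N"),
]
noun_groups = find_noun_groups(tagged)
```

With this input, `noun_groups` contains "a joint venture" and "a local concern", matching two of the Noun Group lines in the sample output. No recursion is involved, which is what makes this kind of shallow parsing expressible with finite-state machinery.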
![Page 17: Fall 2004 Lecture Notes #4 EECS 595 / LING 541 / SI 661 Natural Language Processing](https://reader035.vdocuments.us/reader035/viewer/2022062322/56649ed45503460f94be4d61/html5/thumbnails/17.jpg)
Features and unification
![Page 18: Fall 2004 Lecture Notes #4 EECS 595 / LING 541 / SI 661 Natural Language Processing](https://reader035.vdocuments.us/reader035/viewer/2022062322/56649ed45503460f94be4d61/html5/thumbnails/18.jpg)
Introduction
• Grammatical categories have properties
• Constraint-based formalisms
• Example: “this flights” (ungrammatical): agreement is difficult to handle at the level of grammatical categories
• Example: “many water” (ungrammatical): count/mass nouns
• Sample rule that takes into account features: S → NP VP (but only if the number of the NP is equal to the number of the VP)
![Page 19: Fall 2004 Lecture Notes #4 EECS 595 / LING 541 / SI 661 Natural Language Processing](https://reader035.vdocuments.us/reader035/viewer/2022062322/56649ed45503460f94be4d61/html5/thumbnails/19.jpg)
Feature structures
[CAT NP, NUMBER SINGULAR, PERSON 3]
[CAT NP, AGREEMENT [NUMBER SG, PERSON 3]]
Feature paths: {x agreement number}
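One convenient way to experiment with feature structures is to model them as nested dictionaries, with a feature path being a list of feature names followed from the root. This is only a sketch: it ignores shared (reentrant) values, and the helper name `follow_path` is hypothetical.

```python
def follow_path(feature_structure, path):
    """Follow a list of feature names; return None if the path does not exist."""
    current = feature_structure
    for feature in path:
        if not isinstance(current, dict) or feature not in current:
            return None
        current = current[feature]
    return current


# The second feature structure on the slide, with agreement features grouped.
x = {"CAT": "NP", "AGREEMENT": {"NUMBER": "SG", "PERSON": 3}}

number = follow_path(x, ["AGREEMENT", "NUMBER"])  # the path {x agreement number}
```

Here `number` is "SG", and following only `["AGREEMENT"]` returns the whole embedded structure, which is what makes grouping NUMBER and PERSON under AGREEMENT useful: a rule can refer to both at once.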
![Page 20: Fall 2004 Lecture Notes #4 EECS 595 / LING 541 / SI 661 Natural Language Processing](https://reader035.vdocuments.us/reader035/viewer/2022062322/56649ed45503460f94be4d61/html5/thumbnails/20.jpg)
Unification
[NUMBER SG] ⊔ [NUMBER SG]: succeeds (+)
[NUMBER SG] ⊔ [NUMBER PL]: fails (-)
[NUMBER SG] ⊔ [NUMBER []] = [NUMBER SG]
[NUMBER SG] ⊔ [PERSON 3] = ?
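Unification of simple feature structures can be sketched over nested dictionaries. This is a minimal version under assumptions: an empty dictionary plays the role of the unspecified value [], failure is signalled by returning `None`, and reentrancy (shared substructures) is not modelled. The function name `unify` is illustrative.

```python
def unify(a, b):
    """Return the unification of two feature structures, or None on failure."""
    if a == {}:  # the empty structure [] unifies with anything
        return b
    if b == {}:
        return a
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for feature, value in b.items():
            if feature in result:
                merged = unify(result[feature], value)
                if merged is None:
                    return None  # a clash anywhere makes the whole thing fail
                result[feature] = merged
            else:
                result[feature] = value  # compatible information is added
        return result
    return a if a == b else None  # atomic values must be identical


same = unify({"NUMBER": "SG"}, {"NUMBER": "SG"})
clash = unify({"NUMBER": "SG"}, {"NUMBER": "PL"})
filled = unify({"NUMBER": "SG"}, {"NUMBER": {}})
combined = unify({"NUMBER": "SG"}, {"PERSON": 3})
```

The four calls mirror the four cases listed above: the first succeeds, the second fails, the third fills in the unspecified value, and the fourth merges two structures that share no features, so the result carries both NUMBER and PERSON.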
![Page 21: Fall 2004 Lecture Notes #4 EECS 595 / LING 541 / SI 661 Natural Language Processing](https://reader035.vdocuments.us/reader035/viewer/2022062322/56649ed45503460f94be4d61/html5/thumbnails/21.jpg)
Agreement
• S → NP VP
  {NP AGREEMENT} = {VP AGREEMENT}
• Does this flight serve breakfast?
• Do these flights serve breakfast?
• S → Aux NP VP
  {Aux AGREEMENT} = {NP AGREEMENT}
![Page 22: Fall 2004 Lecture Notes #4 EECS 595 / LING 541 / SI 661 Natural Language Processing](https://reader035.vdocuments.us/reader035/viewer/2022062322/56649ed45503460f94be4d61/html5/thumbnails/22.jpg)
Agreement
• These flights
• This flight
• NP → Det Nominal
  {Det AGREEMENT} = {Nominal AGREEMENT}
• Verb → serve
  {Verb AGREEMENT NUMBER} = PL
• Verb → serves
  {Verb AGREEMENT NUMBER} = SG
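The constraint {Det AGREEMENT} = {Nominal AGREEMENT} on NP → Det Nominal can be sketched as a check performed when the NP is built. Several things here are assumptions for illustration: the lexical entries and their feature values, the helper names, and the shortcut of treating a single noun as the Nominal (that is, applying Nominal → Noun silently).

```python
# Hypothetical lexical entries carrying agreement features.
LEXICON = {
    "this": {"CAT": "Det", "AGREEMENT": {"NUMBER": "SG"}},
    "these": {"CAT": "Det", "AGREEMENT": {"NUMBER": "PL"}},
    "flight": {"CAT": "Noun", "AGREEMENT": {"NUMBER": "SG"}},
    "flights": {"CAT": "Noun", "AGREEMENT": {"NUMBER": "PL"}},
}


def unify_flat(a, b):
    """Unify two flat feature dictionaries; return None on a value clash."""
    merged = dict(a)
    for feature, value in b.items():
        if feature in merged and merged[feature] != value:
            return None
        merged[feature] = value
    return merged


def build_np(det_word, noun_word):
    """Apply NP -> Det Nominal, enforcing {Det AGREEMENT} = {Nominal AGREEMENT}."""
    det, noun = LEXICON[det_word], LEXICON[noun_word]
    if det["CAT"] != "Det" or noun["CAT"] != "Noun":
        return None
    agreement = unify_flat(det["AGREEMENT"], noun["AGREEMENT"])
    if agreement is None:
        return None  # agreement clash: the rule does not apply
    return {"CAT": "NP", "AGREEMENT": agreement}
```

With these entries, `build_np("this", "flight")` and `build_np("these", "flights")` each yield an NP carrying the shared agreement value, while `build_np("this", "flights")` is rejected. A single rule plus features therefore replaces separate singular and plural versions of NP → Det Nominal.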
![Page 23: Fall 2004 Lecture Notes #4 EECS 595 / LING 541 / SI 661 Natural Language Processing](https://reader035.vdocuments.us/reader035/viewer/2022062322/56649ed45503460f94be4d61/html5/thumbnails/23.jpg)
Subcategorization
• VP → Verb
  {VP HEAD} = {Verb HEAD}
  {VP HEAD SUBCAT} = INTRANS
• VP → Verb NP
  {VP HEAD} = {Verb HEAD}
  {VP HEAD SUBCAT} = TRANS
• VP → Verb NP NP
  {VP HEAD} = {Verb HEAD}
  {VP HEAD SUBCAT} = DITRANS
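The effect of the SUBCAT constraints is that each VP rule only accepts verbs whose lexical entry names the matching frame. A minimal sketch follows; the verb entries are illustrative assumptions, not taken from the slides, and the complement count stands in for the full rule.

```python
# Number of NP complements required by each VP rule above.
SUBCAT_FRAMES = {"INTRANS": 0, "TRANS": 1, "DITRANS": 2}

# Hypothetical verb entries: {Verb HEAD SUBCAT} values.
VERB_SUBCAT = {"sleep": "INTRANS", "prefer": "TRANS", "give": "DITRANS"}


def vp_is_licensed(verb, num_np_complements):
    """True if some VP rule with that many NPs unifies with the verb's SUBCAT."""
    subcat = VERB_SUBCAT.get(verb)
    if subcat is None:
        return False  # unknown verb: no lexical entry to unify with
    return SUBCAT_FRAMES[subcat] == num_np_complements
```

Under these assumed entries, `vp_is_licensed("prefer", 1)` holds but `vp_is_licensed("prefer", 0)` and `vp_is_licensed("sleep", 1)` do not, which is how the grammar rules out a transitive verb with no object or an intransitive verb with one.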
![Page 24: Fall 2004 Lecture Notes #4 EECS 595 / LING 541 / SI 661 Natural Language Processing](https://reader035.vdocuments.us/reader035/viewer/2022062322/56649ed45503460f94be4d61/html5/thumbnails/24.jpg)
Readings for next time
• J&M Chapters 12, 13, 20
• Lecture notes #4
• FUF/CFUF documentation