Ling573: NLP Systems and Applications
May 7, 2013
Local Context and SMT for Question Expansion
“Statistical Machine Translation for Query Expansion in Answer Retrieval”, Riezler et al., 2007
Investigates data-driven approaches to query expansion:
- Local context analysis (pseudo-relevance feedback)
- Contrast: collection-global measures
- Terms identified by statistical machine translation
- Terms identified by automatic paraphrasing
Motivation
Fundamental challenge in QA (and IR): bridging the “lexical chasm”
- Divide between the user’s information need and the author’s lexical choice
- Result of linguistic ambiguity
Many approaches:
- QA: question reformulation, syntactic rewriting; ontology-based expansion; MT-based reranking
- IR: query expansion with pseudo-relevance feedback
Task & Approach
Goal: answer retrieval from FAQ pages
- IR problem: matching queries to documents of Q-A pairs
- QA problem: finding answers in a restricted document set
Approach: bridge the lexical gap with statistical machine translation
- Perform query expansion
- Expansion terms identified via phrase-based MT
Challenge
Varied results for query expansion
IR:
- External resources: inconclusive (Voorhees), some gain (Fang)
- Local, global collection statistics: substantial gains
QA:
- External resources: mixed results
- Local, global collection statistics: sizable gains
Creating the FAQ Corpus
Prior FAQ collections limited in scope, quality
- Web search and scraping: ‘FAQ’ in title/URL
- Search in proprietary collections: 1-2.8M Q-A pairs
- Inspection shows poor quality
Extracted from 4B-page corpus (they’re Google): precision-oriented extraction
- Search for ‘faq’; train FAQ page classifier: ~800K pages
- Q-A pairs: trained labeler. Features: punctuation, HTML tags (<p>, …), markers (Q:), lexical cues (what, how)
- Yields 10M pairs (98% precision)
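As a rough illustration of such surface cues (not the paper's actual labeler, whose feature set is only sketched on the slide), a feature extractor might look like:

```python
import re

def qa_features(block: str) -> dict:
    """Extract simple surface cues of the kind listed above for labeling
    a text block as question vs. answer (hypothetical simplification)."""
    return {
        "ends_with_qmark": block.rstrip().endswith("?"),          # punctuation cue
        "has_q_marker": bool(re.match(r"\s*(Q:|Question:)", block)),  # 'Q:' marker
        "has_a_marker": bool(re.match(r"\s*(A:|Answer:)", block)),
        "has_p_tag": "<p>" in block.lower(),                      # HTML tag cue
        "starts_wh_word": bool(re.match(r"\s*(what|how|why|when|where|who)\b",
                                        block, re.IGNORECASE)),   # lexical cue
    }
```

A classifier trained over features like these (on labeled FAQ pages) is what lets the extraction stay precision-oriented.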
Machine Translation Model
SMT query expansion builds on alignments from SMT models.
Basic noisy-channel machine translation model (e: English; f: French):
- p(e): ‘language model’; p(f|e): translation model
- Translation probabilities calculated from relative frequencies of phrases
- Phrases: larger blocks of aligned words; translation operates over a sequence of phrases
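The equations on the slide did not survive extraction; in their standard form (a reconstruction of the usual noisy-channel and relative-frequency formulations, not copied from the slides):

```latex
\hat{e} = \arg\max_{e} p(e \mid f) = \arg\max_{e} p(f \mid e)\,p(e),
\qquad
p(\bar{f} \mid \bar{e}) =
  \frac{\mathrm{count}(\bar{f},\bar{e})}{\sum_{\bar{f}'} \mathrm{count}(\bar{f}',\bar{e})}
```

Here $\bar{f}, \bar{e}$ are aligned phrase pairs, and $p(f \mid e)$ decomposes over the phrase segmentation.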
Question-Answer Translation
View Q-A pairs from FAQs as translation pairs: Q as a translation of A (and vice versa).
Goal: learn alignments between question words and synonymous answer words.
- Not interested in fluency, so that part of the MT model is ignored
Issues: differences from typical MT
- Length differences: modify null-alignment weights
- Less important words: use the intersection of bidirectional alignments
Example
Q: “How to live with cat allergies”
Add expansion terms: translations not seen in the original query.
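A minimal sketch of this selection step — keep only terms from the n-best "translations" that are absent from the original query (illustrative code, not the paper's implementation):

```python
def expansion_terms(query: str, translations: list[str]) -> set[str]:
    """Collect terms from n-best translations of the query that do not
    already appear in it (illustrative sketch)."""
    original = set(query.lower().split())
    new_terms = set()
    for t in translations:
        new_terms.update(w for w in t.lower().split() if w not in original)
    return new_terms
```

The translations below are made up for illustration; the paper's actual n-best lists come from the Q-A translation model.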
SMT-based Paraphrasing
Key intuition: identify paraphrases by translating to and from a ‘pivot’ language.
- Paraphrase rewrites yield phrasal ‘synonyms’
- E.g. translate E -> C -> E: find English phrases aligned to the same Chinese phrase
- Given a paraphrase pair (trg, syn): pick the best pivot
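Pivot-based paraphrase scoring with a best-pivot choice can be sketched as follows; the phrase tables `e2f` and `f2e` are hypothetical stand-ins for real SMT phrase tables:

```python
def paraphrase_score(trg: str, syn: str, e2f: dict, f2e: dict) -> float:
    """Score candidate paraphrase 'syn' of phrase 'trg' through the best
    single pivot phrase:  score = max over pivots c of p(c|trg) * p(syn|c).
    e2f[trg][c] and f2e[c][syn] are (hypothetical) phrase-table probabilities."""
    best = 0.0
    for pivot, p_c_given_trg in e2f.get(trg, {}).items():
        p_syn_given_c = f2e.get(pivot, {}).get(syn, 0.0)
        best = max(best, p_c_given_trg * p_syn_given_c)
    return best
```

Summing over all pivots instead of taking the max gives the other common variant; the slide's "pick best pivot" suggests the max.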
SMT-based Paraphrasing
Features employed: phrase translation probabilities, lexical translation probabilities, reordering score, # words, # phrases, language model score.
Trained on NIST multiple Chinese-English translations.
Example
Q: “How to live with cat allergies”
Expansion approach: add new terms from the n-best paraphrases.
Retrieval Model
Weighted linear combination of vector similarity values, computed between the query and fields of the Q-A pair.
8 Q-A pair fields: 1) full FAQ text; 2) question text; 3) answer text; 4) title text; 5-8) fields 1-4 without stopwords.
Highest weights: raw question text; then stopped full text and stopped question text; then stopped answer text and stopped title text.
No phrase matching or stemming.
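A minimal sketch of such a field-weighted retrieval score, assuming sparse term-weight vectors; the field names and weights below are illustrative, not the paper's tuned values:

```python
import math

def cosine(a: dict, b: dict) -> float:
    """Cosine similarity between two sparse term-weight vectors."""
    num = sum(w * b.get(t, 0.0) for t, w in a.items())
    den = math.sqrt(sum(w * w for w in a.values())) * \
          math.sqrt(sum(w * w for w in b.values()))
    return num / den if den else 0.0

def score(query: dict, fields: dict, weights: dict) -> float:
    """Weighted linear combination of query/field similarities,
    mirroring the retrieval model described above."""
    return sum(weights.get(name, 0.0) * cosine(query, vec)
               for name, vec in fields.items())
```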
Query Expansion
SMT term selection:
- New terms from 50-best paraphrases: 7.8 terms added
- New terms from 20-best translations: 3.1 terms added
- Why? Paraphrasing is more constrained, hence less noisy
Weighting: paraphrase terms weighted the same as original terms; translation terms weighted higher for answer text
Local expansion (Xu and Croft):
- Top 20 docs, terms weighted by tf-idf of answers
- Uses answer-preference weighting for retrieval; 9.25 terms added
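Local pseudo-relevance expansion of this flavor can be sketched as follows (a simplification using a plain tf-idf over the top retrieved answers, not Xu and Croft's exact weighting):

```python
import math
from collections import Counter

def local_expansion(top_answers: list[str], query: str, k: int = 10) -> list[str]:
    """Pick expansion terms from the top retrieved answers by a simple
    tf-idf score computed over that local document set."""
    docs = [a.lower().split() for a in top_answers]
    df = Counter(t for d in docs for t in set(d))   # document frequency
    tf = Counter(t for d in docs for t in d)        # total term frequency
    n = len(docs)
    seen = set(query.lower().split())               # skip original query terms
    scored = {t: tf[t] * math.log((n + 1) / df[t])
              for t in tf if t not in seen}
    return sorted(scored, key=scored.get, reverse=True)[:k]
```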
Experiments
Test queries from MetaCrawler query logs: 60 well-formed natural-language questions.
Issue: systems fail on 1/3 of the questions
- No relevant answers retrieved, e.g. “how do you make a cornhusk doll?”, “what does 8x certification mean”, etc.
- Serious recall problem in the QA database
Retrieve 20 results; compute evaluation measures @10, @20.
Evaluation
Manually label the top 20 answers by 2 judges. Quality rating on a 3-point scale:
- adequate (2): includes the answer
- material (1): some relevant information, but no exact answer
- unsatisfactory (0): no relevant information
Compute ‘Success_type @ n’: type = 2, 1, 0 as above; n = number of documents returned.
Why not MRR? To reduce sensitivity to high rank and reward recall improvement: MRR rewards systems with answers in the top 1 but poor performance on everything else.
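The Success_type @ n measure is straightforward to compute; a sketch:

```python
def success_at_n(labels: list[int], threshold: int, n: int) -> bool:
    """Success_type @ n for one query: does any of the top-n returned
    answers have a quality rating at or above the threshold
    (2 = adequate, 1 = material)?"""
    return any(lab >= threshold for lab in labels[:n])

def success_rate(all_labels: list[list[int]], threshold: int, n: int) -> float:
    """Fraction of queries with at least one sufficiently rated answer
    in the top n results."""
    return sum(success_at_n(l, threshold, n) for l in all_labels) / len(all_labels)
```

Unlike MRR, this rewards any sufficiently good answer anywhere in the top n, not just a hit at rank 1.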
Results
Discussion
Compare baseline retrieval to:
- Local RF expansion
- Combined translation- and paraphrase-based expansion
Both forms of query expansion improve on the baseline:
- Success_2 @10: local +11%; SMT +40.7%
- Success_{2,1} (easier task): little change
Example Expansions
(slide shows a table of example expansion terms, annotated + for helpful and - for harmful)
Observations
Expansion improves results under the rigorous criterion, more so for SMT than for local RF. Why?
- Both can introduce some good terms
- Local RF introduces more irrelevant terms; SMT is more constrained
- Challenge: balance introducing information vs. noise
Comparing Question Reformulations
“Exact Phrases in Information Retrieval for Question Answering”, Stoyanchev et al., 2008
Investigates:
- The role of ‘exact phrases’ in retrieval for QA
- Optimal query construction for document retrieval, from the Web or the AQUAINT collection
- Impact of query specificity on passage retrieval
Motivation
Retrieval bottleneck in question answering:
- Retrieval provides the source for answer extraction
- If retrieval fails to return answer-containing documents, downstream answer processing is guaranteed to fail
Focus on recall in the information retrieval phase: there is a consistent relationship between the quality of IR and that of QA.
Main factor in retrieval: the query. Approaches vary from simple to complex processing or expansion with external resources.
Approach
Focus on the use of ‘exact phrases’ from a question.
- Analyze the impact of different linguistic components of the question; relate them to answer candidate sentences
- Evaluate query construction for Web and TREC retrieval; optimize query construction
- Evaluate query construction for sentence retrieval; analyze specificity
Data Sources & Resources
Documents: TREC QA AQUAINT corpus; the Web
Questions: TREC 2006, non-empty questions
Gold standard: NIST-provided relevant documents and answer key; 3.5 docs/question
Resources: IR: Lucene; NLTK and LingPipe for phrase and NE annotation (also hand-corrected)
Query Processing Approach
Exploit less resource-intensive methods: chunking, NER
- Applied only to questions and candidate sentences
- Not applied to the full collection, so usable on the Web
Exact-phrase motivation: phrases can improve retrieval
- “In what year did the movie win academy awards?”
- As a phrase: can rank documents higher; as a disjunction: can dilute the pool
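The contrast between a phrase query and a term disjunction can be illustrated with Lucene-style query strings (the syntax here is generic; the paper's exact query construction is not shown on the slide):

```python
def build_queries(question: str, phrases: list[str]) -> dict:
    """Contrast a bag-of-words disjunction with an exact-phrase query,
    in Lucene-style syntax (illustrative sketch)."""
    words = question.lower().replace("?", "").split()
    return {
        "disjunct": " OR ".join(words),               # any term may match
        "phrase": " ".join(f'"{p}"' for p in phrases),  # terms must be adjacent
    }
```

For the example question, the disjunction lets documents match on common words like "movie", while the quoted phrase "academy awards" must appear contiguously.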
Query ProcessingNER on Question and target
target: 1991 eruption on Mt. Pinatubo vs Nirvana Uses LingPipe: ORG, LOC, PER
Phrases (NLTK) NP, VP, PP
Converted Q-phrases: Heuristic paraphrases on question as declarative
E.g. Who was|is NOUN|PRONOUN VBD NOUN|PRONOUN was|is VBD
q-phrase: expected form of simple answerE.g. When was Mozart born? Mozart was born
How likely are we to see a q-phrase?
![Page 88: Ling573 NLP Systems and Applications May 7, 2013](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681641a550346895dd5d150/html5/thumbnails/88.jpg)
Query Processing NER on Question and target
target: 1991 eruption on Mt. Pinatubo vs Nirvana Uses LingPipe: ORG, LOC, PER
Phrases (NLTK) NP, VP, PP
Converted Q-phrases: Heuristic paraphrases on question as declarative
E.g. Who was|is NOUN|PRONOUN VBD NOUN|PRONOUN was|is VBD q-phrase: expected form of simple answer
E.g. When was Mozart born? Mozart was born How likely are we to see a q-phrase? Unlikely How likely is it to be right if we do see it?
![Page 89: Ling573 NLP Systems and Applications May 7, 2013](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681641a550346895dd5d150/html5/thumbnails/89.jpg)
Query Processing
NER on question and target
target: 1991 eruption of Mt. Pinatubo vs. Nirvana
Uses LingPipe: ORG, LOC, PER
Phrases (NLTK): NP, VP, PP
Converted q-phrases: heuristic paraphrases of the question as a declarative
E.g. Who was|is NOUN|PRONOUN VBD → NOUN|PRONOUN was|is VBD
q-phrase: expected form of a simple answer
E.g. When was Mozart born? → Mozart was born
How likely are we to see a q-phrase? Unlikely
How likely is it to be right if we do see it? Very
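The conversion heuristic can be sketched as a pattern rewrite. The single regex below is a hypothetical illustration of one such rule (the wh-copula pattern from the example), not the authors' actual rule set, which covered more question shapes.

```python
import re

# Hypothetical sketch of one "converted q-phrase" rule: rewrite a
# wh-question into the declarative prefix an answer sentence would
# start with, e.g. "When was Mozart born?" -> "Mozart was born".

def convert_q_phrase(question):
    # Pattern: (When|Where|Who) was|is SUBJECT VBD? -> "SUBJECT was|is VBD"
    m = re.match(r"(?:When|Where|Who)\s+(was|is)\s+(.+?)\s+(\w+)\?$", question)
    if m:
        copula, subject, verb = m.groups()
        return f"{subject} {copula} {verb}"
    return None  # question shape not covered by this rule

assert convert_q_phrase("When was Mozart born?") == "Mozart was born"
```

The resulting string can then be issued as a quoted exact-phrase query; as the slide notes, it rarely matches, but a match is very likely to contain the answer.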
![Page 93: Ling573 NLP Systems and Applications May 7, 2013](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681641a550346895dd5d150/html5/thumbnails/93.jpg)
Comparing Query Forms
Baseline: words from question and target
Experimental: words, quoted exact phrases, quoted named entities
Backoff (Lucene): weight based on type
Backoff (Web): 1) converted q-phrases; 2) phrases; 3) w/o phrases -- until 20 retrieved
Combined with target in all cases
Max 20 documents: expensive downstream processing
Sentences split and ranked
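The Web backoff scheme above amounts to trying the most precise query form first and relaxing until enough documents come back. The sketch below assumes this reading; `backoff_retrieve`, `fake_search`, and the toy index are all illustrative names, and `search` stands in for any retrieval call (Lucene, a web search API).

```python
# Hypothetical sketch of the Web backoff strategy: 1) converted
# q-phrases, 2) quoted phrases, 3) bare words, stopping once k
# documents are retrieved. Each query is combined with the target.

def backoff_retrieve(search, q_phrases, phrases, words, target, k=20):
    docs = []
    for query_group in (q_phrases, phrases, [words]):
        for query in query_group:
            for doc in search(f"{query} {target}"):
                if doc not in docs:          # keep first-retrieved order
                    docs.append(doc)
            if len(docs) >= k:
                return docs[:k]
    return docs

# Toy index keyed by query string, for illustration only.
def fake_search(query):
    index = {
        '"Mozart was born" Mozart': ["doc1"],
        '"born in Salzburg" Mozart': ["doc2"],
        'when born Mozart': ["doc1", "doc3"],
    }
    return index.get(query, [])

hits = backoff_retrieve(fake_search, ['"Mozart was born"'],
                        ['"born in Salzburg"'], "when born", "Mozart", k=2)
assert hits == ["doc1", "doc2"]  # backed off from q-phrase to phrase
```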
![Page 94: Ling573 NLP Systems and Applications May 7, 2013](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681641a550346895dd5d150/html5/thumbnails/94.jpg)
Query Components
![Page 97: Ling573 NLP Systems and Applications May 7, 2013](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681641a550346895dd5d150/html5/thumbnails/97.jpg)
Query Components in Supporting Sentences
Highest precision: converted q-phrase, then phrase, ...
Words likely to appear, but don’t discriminate
![Page 98: Ling573 NLP Systems and Applications May 7, 2013](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681641a550346895dd5d150/html5/thumbnails/98.jpg)
Results
Document and sentence retrieval. Metrics:
Document retrieval: average recall; MRR; overall document recall: % of questions w/ >=1 correct doc
Sentence retrieval: sentence MRR; overall sentence recall; average precision of first sentence; # correct in top 10, top 50
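Two of the metrics above are easy to pin down exactly. In this minimal sketch, `ranks` is an assumed representation: per question, the 1-based rank of the first correct document, or `None` if none was retrieved.

```python
# Minimal sketch of MRR and overall recall over per-question ranks.
# ranks[i] = 1-based rank of the first correct doc for question i,
# or None if no correct doc was retrieved.

def mrr(ranks):
    """Mean reciprocal rank; unanswered questions contribute 0."""
    return sum(1.0 / r for r in ranks if r) / len(ranks)

def overall_recall(ranks):
    """Fraction of questions with at least one correct doc retrieved."""
    return sum(1 for r in ranks if r) / len(ranks)

ranks = [1, 2, None, 1]  # four toy questions
assert mrr(ranks) == (1 + 0.5 + 0 + 1) / 4
assert overall_recall(ranks) == 0.75
```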
![Page 100: Ling573 NLP Systems and Applications May 7, 2013](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681641a550346895dd5d150/html5/thumbnails/100.jpg)
Results
![Page 103: Ling573 NLP Systems and Applications May 7, 2013](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681641a550346895dd5d150/html5/thumbnails/103.jpg)
Discussion
Document retrieval: about half of the correct docs are retrieved, at rank 1-2
Sentence retrieval: lower; correct sentence at ~rank 3
Little difference for exact phrases in AQUAINT
Web: retrieval improved by exact phrases; manual more than automatic (20-30% relative)
Precision affected by tagging errors