![Page 1: Question Answering and Information Extraction · Question Answering and Information Extraction CMSC 473/673 UMBC. December 11 th, 2017](https://reader033.vdocuments.us/reader033/viewer/2022043007/5f9547df840687726124b282/html5/thumbnails/1.jpg)
Question Answering and Information Extraction
CMSC 473/673, UMBC
December 11th, 2017
Course Announcement 1: Project
Due Wednesday 12/20, 11:59 AM
Late days cannot be used
Any questions?
Course Announcement 2: Final Exam
No mandatory final exam
December 20th, 1pm-3pm: optional second midterm/final
Averaged into first midterm score
No practice questions
Register by Monday 12/11: https://goo.gl/forms/aXflKkP0BIRxhOS83
Course Announcement 3: Evaluations
Please fill them out! (We do pay attention to them)
Links from [email protected]
Recap from last time…
Pat and Chandler agreed on a plan.
He said Pat would try the same tactic again.
Is “he” the same person as “Chandler”?
Entity Coreference Resolution
Basic System
Input (Text) → Preprocessing → Mention Detection → Coref Model → Output
What are Named Entities?
Named entity recognition (NER)
Identify proper names in text and classify them into a set of predefined categories of interest:
Person names
Organizations (companies, government organisations, committees, etc.)
Locations (cities, countries, rivers, etc.)
Date and time expressions
Measures (percent, money, weight, etc.), email addresses, Web addresses, street addresses, etc.
Domain-specific: names of drugs, medical conditions, names of ships, bibliographic references, etc.
Cunningham and Bontcheva (2003, RANLP Tutorial)
Two kinds of NE approaches
Knowledge Engineering
rule based, developed by experienced language engineers
make use of human intuition
requires only a small amount of training data
development could be very time consuming
some changes may be hard to accommodate
Learning Systems
requires some (large?) amounts of annotated training data
some changes may require re-annotation of the entire training corpus
annotators can be cheap
Cunningham and Bontcheva (2003, RANLP Tutorial)
Baseline: list lookup approach
System that recognises only entities stored in its lists (gazetteers).
Advantages: simple, fast, language independent, easy to retarget (just create lists)
Disadvantages: impossible to enumerate all names; collection and maintenance of lists; cannot deal with name variants; cannot resolve ambiguity
Cunningham and Bontcheva (2003, RANLP Tutorial)
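The lookup baseline can be sketched in a few lines. The gazetteer entries below are illustrative; a real system would load much larger lists:

```python
# Minimal sketch of a list-lookup (gazetteer) tagger.
# The gazetteer entries are illustrative, not from a real resource.
GAZETTEER = {
    ("New", "York"): "LOC",
    ("Baltimore",): "LOC",
    ("IBM",): "ORG",
}

def gazetteer_tag(tokens):
    """Greedy longest-match lookup over the token sequence."""
    max_len = max(len(key) for key in GAZETTEER)
    tags, i = [], 0
    while i < len(tokens):
        for n in range(max_len, 0, -1):  # prefer longer matches
            span = tuple(tokens[i:i + n])
            if span in GAZETTEER:
                tags.append((" ".join(span), GAZETTEER[span]))
                i += n
                break
        else:
            i += 1  # no entry starts here
    return tags

print(gazetteer_tag("IBM opened an office in New York".split()))
# [('IBM', 'ORG'), ('New York', 'LOC')]
```

Note that the lookup has no way to prefer one label when a string is ambiguous between entries, which is exactly the "cannot resolve ambiguity" disadvantage above.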
Shallow Parsing Approach (internal structure)
Internal evidence: names often have internal structure. These components can be either stored or guessed, e.g. location:
Cap. Word + {City, Forest, Center, River} → Sherwood Forest
Cap. Word + {Street, Boulevard, Avenue, Crescent, Road} → Portobello Street
Cunningham and Bontcheva (2003, RANLP Tutorial)
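These internal-evidence patterns map naturally onto regular expressions. A minimal sketch, with the keyword lists taken from the examples above and coverage deliberately tiny:

```python
import re

# Sketch of the internal-evidence heuristic: a capitalized word followed
# by a location-indicating keyword. Keyword list is illustrative.
LOC_KEYWORDS = r"(?:City|Forest|Center|River|Street|Boulevard|Avenue|Crescent|Road)"
LOC_PATTERN = re.compile(r"\b[A-Z][a-z]+ " + LOC_KEYWORDS + r"\b")

text = "They met on Portobello Street near Sherwood Forest."
print(LOC_PATTERN.findall(text))
# ['Portobello Street', 'Sherwood Forest']
```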
NER and Shallow Parsing: Machine Learning
Sequence models (HMM, CRF) often effective
BIO encoding
Tokens:     Pat   and   Chandler Smith  agreed on  a     plan  .
NER (BIO):  B-PER O     B-PER    I-PER  O      O   O     O     O
Chunking 1: B-NP  O     B-NP     I-NP   B-VP   O   B-NP  I-NP  O
Chunking 2: B-NP  I-NP  I-NP     I-NP   B-VP   O   B-NP  I-NP  O
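A small helper shows how BIO tags are decoded back into labeled spans; the tokens and tags below follow the NER row above:

```python
# Sketch: recover labeled entity spans from BIO tags
# (tokens and tags are assumed to be aligned, one tag per token).
def bio_to_spans(tokens, tags):
    spans, current = [], None
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                spans.append(current)
            current = [tag[2:], [tok]]          # start a new span
        elif tag.startswith("I-") and current and current[0] == tag[2:]:
            current[1].append(tok)              # continue the open span
        else:                                   # "O" or inconsistent I- tag
            if current:
                spans.append(current)
            current = None
    if current:
        spans.append(current)
    return [(label, " ".join(toks)) for label, toks in spans]

tokens = "Pat and Chandler Smith agreed on a plan .".split()
tags = ["B-PER", "O", "B-PER", "I-PER", "O", "O", "O", "O", "O"]
print(bio_to_spans(tokens, tags))
# [('PER', 'Pat'), ('PER', 'Chandler Smith')]
```

An HMM or CRF predicts the tag sequence; this decoding step is the same regardless of which sequence model produced the tags.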
Pat and Chandler agreed on a plan.
He said Pat would try the same tactic again.
?
Model Attempt 1: Binary Classification
Mention-Pair Model
Pat and Chandler agreed on a plan.
He said Pat would try the same tactic again.
Model Attempt 1: Binary Classification
training data: observed positive instances vs. negative instances
naïve approach (take all non-positive pairs): highly imbalanced!
Soon et al. (2001): heuristic for more balanced selection
solution: go left-to-right
for a mention m, select the closest preceding coreferent mention
otherwise, no antecedent is found for m
possible problem: not transitive
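The Soon et al. (2001) selection heuristic can be sketched as follows. Mentions are represented here as (name, gold cluster id) pairs in document order, which is an illustrative simplification of real mention representations:

```python
# Sketch of Soon et al. (2001) training-instance selection.
# mentions: list of (name, gold_cluster_id) pairs in document order.
def soon_pairs(mentions):
    positives, negatives = [], []
    for j, (m, cluster) in enumerate(mentions):
        # find the closest preceding coreferent mention (the antecedent)
        antecedent = None
        for i in range(j - 1, -1, -1):
            if mentions[i][1] == cluster:
                antecedent = i
                break
        if antecedent is None:
            continue  # no antecedent: m starts its own entity
        positives.append((mentions[antecedent][0], m))
        # every mention strictly between the antecedent and m is a negative
        for i in range(antecedent + 1, j):
            negatives.append((mentions[i][0], m))
    return positives, negatives

# "Pat ... Chandler ... He ... Pat": "He" corefers with Chandler (cluster 2),
# the two Pat mentions corefer (cluster 1); "Pat2" marks the second Pat.
mentions = [("Pat", 1), ("Chandler", 2), ("He", 2), ("Pat2", 1)]
print(soon_pairs(mentions))
# ([('Chandler', 'He'), ('Pat', 'Pat2')], [('Chandler', 'Pat2'), ('He', 'Pat2')])
```

Compared with taking all non-coreferent pairs, this yields far fewer negatives, which is the balancing effect the heuristic is after.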
Anaphora
does a mention have an antecedent?
Chris told Pat he aced the test.
Model 2: Entity-Mention Model
entity 1
entity 2
entity 3
entity 4
Pat and Chandler agreed on a plan.
He said Pat would try the same tactic again.
advantage: featurize based on all (or some or none) of the clustered mentions
disadvantage: clustering doesn’t address anaphora
Model 3: Cluster-Ranking Model (Rahman and Ng, 2009)
entity 1
entity 2
entity 3
entity 4
Pat and Chandler agreed on a plan.
He said Pat would try the same tactic again.
learn to rank the clusters and the items in them
Stanford Coref (Lee et al., 2011)
<Core Task Here> Applications
Question answering
Information extraction
Machine translation
Text summarization
Information retrieval
IBM Watson
https://www.youtube.com/watch?v=C5Xnxjq63Zg
https://youtu.be/WFR3lOm_xhE?t=34s
What happened with Watson?
(let’s ask Google)
What Happened with Watson?

David Ferrucci, the manager of the Watson project at IBM Research, explained during a viewing of the show on Monday morning that several things probably confused Watson. First, the category names on Jeopardy! are tricky. The answers often do not exactly fit the category. Watson, in his training phase, learned that categories only weakly suggest the kind of answer that is expected, and, therefore, the machine downgrades their significance. The way the language was parsed provided an advantage for the humans and a disadvantage for Watson, as well. “What US city” wasn’t in the question. If it had been, Watson would have given US cities much more weight as it searched for the answer. Adding to the confusion for Watson, there are cities named Toronto in the United States and the Toronto in Canada has an American League baseball team. It probably picked up those facts from the written material it has digested. Also, the machine didn’t find much evidence to connect either city’s airport to World War II. (Chicago was a very close second on Watson’s list of possible answers.) So this is just one of those situations that’s a snap for a reasonably knowledgeable human but a true brain teaser for the machine.
https://www.huffingtonpost.com/2011/02/15/watson-final-jeopardy_n_823795.html
How many children does the Queen have?
There are still errors (but some questions are harder than others)
Question Answering Motivation
Question answering
Information extraction
Machine translation
Text summarization
Information retrieval
Brief Aside: Information Extraction
ATTACK template
– Type: Gun attack
– Perp: Shining Path
– # killed: 3

BUSINESS NEGOTIATION template
– …

Three people have been fatally shot, and five people, including a mayor, were seriously wounded as a result of a Shining Path attack today.
Remember Our Logical Forms?
Papa ate the caviar
KB
Two Types of QA
Closed domain: often tied to a structured database
Open domain: often tied to unstructured data
Remember Our Logical Forms?
Papa ate the caviar
KB / Corpus
Open Domain: START (1993-Present; Katz, 1997)
SynTactic Analysis using Reversible Transformations
http://start.csail.mit.edu
“The START Server is built on two foundations: the sentence level Natural Language processing capability provided by the START Natural Language system (Katz, 1990) and the idea of natural language annotations for multimedia information segments. This paper starts with an overview of sentence level processing in the START system and then explains how annotating information segments with collections of English sentences makes it possible to use the power of sentence level natural language processing in the service of multimedia information access. The paper ends with a proposal to annotate the World Wide Web.” (Katz, 1997)
SynTactic Analysis using Reversible Transformationshttp://start.csail.mit.edu
Decompose sentences into (subject, verb, object) triples
“De-questionify” the input: How many children does the queen have → The queen has how many children
Apply any needed inference rules
Query against knowledge base
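These steps can be illustrated with a toy sketch. This is not START's actual implementation; the stored triple and the string-matching "de-questionify" rule are purely illustrative:

```python
# Toy sketch of a START-style pipeline: store (subject, verb, object)
# triples, turn a wh-question into a triple pattern, and match against
# the store. The stored fact and the single rule are illustrative only.
TRIPLES = {("the queen", "has", "four children")}

def answer(question):
    # "How many children does the queen have?"
    #   -> pattern (the queen, has, how many children)
    q = question.lower().rstrip("?")
    if q.startswith("how many children does ") and q.endswith(" have"):
        subject = q[len("how many children does "):-len(" have")]
        for s, v, o in TRIPLES:
            if s == subject and v == "has" and "children" in o:
                return o
    return None

print(answer("How many children does the queen have?"))
# four children
```

A real system would derive the triple pattern from a syntactic parse rather than hand-written string rules, and would apply inference rules before querying.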
Basic System

Input Question → Question Analysis → Question Classification → Query Construction → Document Retrieval → Sentence Retrieval → Sentence NLP → Answer Extraction → Answer Validation → Answer
(the retrieval steps draw on a KB and a document corpus)

To learn more: NLP, Information Retrieval (IR), Information Extraction (IE)

Neves --- https://hpi.de/fileadmin/user_upload/fachgebiete/plattner/teaching/NaturalLanguageProcessing/NLP09_QuestionAnswering.pdf
Aspects of NLP
POS tagging
Stemming
Shallow Parsing (chunking)
Predicate argument representation: verb predicates and nominalization
Entity Annotation: stand-alone NERs with a variable number of classes
Dates, times and numeric value normalization
Identification of semantic relations: complex nominals, genitives, adjectival phrases, and adjectival clauses
Event identification
Semantic Parsing
Question Classification
Albert Einstein was born in 14 March 1879.
Albert Einstein was born in Germany.
Albert Einstein was born in a Jewish family.
Neves --- https://hpi.de/fileadmin/user_upload/fachgebiete/plattner/teaching/NaturalLanguageProcessing/NLP09_QuestionAnswering.pdf
Question Classification Taxonomy
LOC:other  Where do hyenas live?
NUM:date   When was Ozzy Osbourne born?
LOC:other  Where do the adventures of “The Swiss Family Robinson” take place?
LOC:other  Where is Procter & Gamble based in the U.S.?
HUM:ind    What barroom judge called himself The Law West of the Pecos?
HUM:gr     What Polynesian people inhabit New Zealand?
http://cogcomp.org/Data/QA/QC/train_1000.label
SLP3: Figure 28.4
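A rule-based baseline over the coarse classes of this taxonomy (LOC, NUM, HUM, …) might look like the sketch below; the specific rules are illustrative, and real systems learn this mapping from the labeled training data instead:

```python
# Illustrative rule-based question classifier over coarse classes.
# Real systems train a classifier on labeled questions instead.
def coarse_class(question):
    q = question.lower()
    if q.startswith("where"):
        return "LOC"
    if q.startswith("when") or q.startswith("how many"):
        return "NUM"
    if q.startswith("who") or (q.startswith("what") and "people" in q):
        return "HUM"
    return "UNK"

print(coarse_class("Where do hyenas live?"))  # LOC
```

Such wh-word rules break down quickly ("What barroom judge…" is HUM:ind despite starting with "what"), which is exactly why the labeled training set above exists.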
Document & Sentence Retrieval

NLP techniques: Vector Space Model, Probabilistic Model, Language Model
Software: Lucene, sklearn, nltk

tf-idf(d, w): term frequency, inverse document frequency
tf: frequency of word w in document d
idf: inverse frequency of documents containing w

tf-idf(d, w) = (count(w ∈ d) / # tokens in d) · log(# documents / # documents containing w)
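The formula translates directly into code. The toy corpus below is illustrative, and the function assumes the word appears in at least one document:

```python
import math
from collections import Counter

# Direct implementation of tf-idf(d, w) as defined above:
# (count of w in d / # tokens in d) * log(# docs / # docs containing w).
# Assumes w occurs in at least one document (otherwise idf divides by zero).
def tf_idf(word, doc, corpus):
    tf = Counter(doc)[word] / len(doc)
    n_containing = sum(1 for d in corpus if word in d)
    idf = math.log(len(corpus) / n_containing)
    return tf * idf

corpus = [
    "the queen has four children".split(),
    "the children went to school".split(),
    "question answering is fun".split(),
]
print(tf_idf("queen", corpus[0], corpus))  # ≈ 0.2197 = (1/5) * ln 3
print(tf_idf("the", corpus[0], corpus))    # lower: "the" is in 2 of 3 docs
```

Words that are frequent in one document but rare across the collection score highest, which is what makes tf-idf useful for ranking documents and sentences against a query.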
Current NLP QA Tasks

TREC (Text Retrieval Conference)
http://trec.nist.gov/
Started in 1992

Freebase Question Answering
e.g., https://nlp.stanford.edu/software/sempre/
Yao et al. (2014)

WikiQA
https://www.microsoft.com/en-us/research/publication/wikiqa-a-challenge-dataset-for-open-domain-question-answering/
orthography
morphology
lexemes
syntax
semantics
pragmatics
discourse
VISION / AUDIO
prosody
intonation
color
Visual Question Answering
http://www.visualqa.org/
Course Goals
• Be introduced to some of the core problems and solutions of NLP (big picture)
• Learn different ways that success and progress can be measured in NLP
• Relate to statistics, machine learning, and linguistics
• Implement NLP programs
• Read and analyze research papers
• Practice your (written) communication skills
Course Recap

Basics of Probability
Requirements to be a distribution (“proportional to”, ∝)
Definitions of conditional probability, joint probability, and independence
Bayes rule, (probability) chain rule

Basics of language modeling
Goal: model (be able to predict) and give a score to language (whole sequences of characters or words)
Simple count-based model
Smoothing (and why we need it): Laplace (add-λ), interpolation, backoff
Evaluation: perplexity

Tasks and Classification (use Bayes rule!)
Posterior decoding vs. noisy channel model
Evaluations: accuracy, precision, recall, and Fβ (F1) scores
Naïve Bayes (given the label, generate/explain each feature independently) and connection to language modeling

Maximum Entropy Models
Meanings of feature functions and weights
Use for language modeling or conditional classification (“posterior in one go”)
How to learn the weights: gradient descent

Distributed Representations & Neural Language Models
What embeddings are and what their motivation is
A common way to evaluate: cosine similarity

Word Modeling

Latent Models
What is meant by “latent”
Expectation Maximization
Basic Example: Unigram Mixture Model (3 coins)

Machine Translation Alignment
Family of methods for learning word-to-word translations
IBM Model 1
Can be used beyond MT (e.g., semantics, paraphrasing)

Hidden Markov Model
Basic Definition: generative bigram model of latent tags
3 Tasks: Likelihood, Most-Likely Sequence, Parameter Estimation
3 Basic Algorithms: Forward (Backward), Viterbi, Baum-Welch
2 Types of Decoding: Viterbi & Posterior

Semi-Supervised Learning
Labeled data (small amount) + unlabeled data (large amount)
Apply EM to get fractional counts to re-estimate parameters

Latent Sequences
Syntactic Parsing
Basic linguistic intuitions
Capturing of some ambiguities and light semantics

Constituency Parsing
Basic Definition: generative tree
3 Tasks: Likelihood, Most-Likely Sequence, Parameter Estimation
3 Basic Algorithms: Inside, Viterbi, Outside
2 Types of Decoding: Viterbi & Posterior

Semi-Supervised Learning
Labeled data (small amount) + unlabeled data (large amount)
Apply EM to get fractional counts to re-estimate parameters

Dependency Parsing
Word-to-word relations
Shift-reduce parsing
Greedy vs. beam search

Semantic Forms
Roles, Frames, and Labeling
Ways to get human judgments (methodology and infrastructure)
Lexical & knowledge resources

Latent Structures
Natural Language Processing
Conditional vs. Sequence
CRF Tutorial, Fig 1.2, Sutton & McCallum (2012)
We’ll cover these in 678
Gradient Ascent
Pick Your Toolkit
PyTorch, Deeplearning4j, TensorFlow, DyNet, Caffe, Keras, MxNet, Gluon, CNTK, …

Comparisons:
https://en.wikipedia.org/wiki/Comparison_of_deep_learning_software
https://deeplearning4j.org/compare-dl4j-tensorflow-pytorch
https://github.com/zer0n/deepframeworks (older: 2015)
http://www.qwantz.com/index.php?comic=170
Thank you for a great semester!
Natural language processing
Semantics
Vision & language processing
Learning with low-to-no supervision