Evaluating an Opinion Annotation Scheme Using a New Multi-perspective Question and Answer Corpus (AAAI 2004 Spring)
Veselin Stoyanov Claire Cardie Diane Litman Janyce Wiebe
Dept. of Comp. Science, Cornell University; Dept. of Comp. Science, Univ. of Pittsburgh
Abstract
Two tasks:
- Constructing a data collection for MPQA.
- Evaluating the hypothesis that low-level perspective information can be useful for MPQA.
Outline: low-level perspective information; corpus creation; evaluation (answer probability, answer rank).
Conclusion: low-level perspective information can be an effective predictor of whether a text segment contains an answer to a question.
Introduction (1/2)
Hypothesis: opinion representations will be useful for practical NLP applications like MPQA.
Multi-perspective question answering (MPQA): answering opinion-oriented questions (“What is the sentiment in the Middle East towards the war on Iraq?”) rather than fact-based questions (“What is the primary substance used in producing chocolate?”).
Introduction (2/2)
Goal is two-fold:
- Present a new corpus of multi-perspective questions and answers.
- Present the results of two experiments that employ the new Q&A corpus to investigate the usefulness of the opinion annotation scheme for multi-perspective vs. fact-based question answering.
Low-Level Perspective Information (1/3)
Suggested by: Wiebe et al. (2002).
Provides: a basis for annotating opinions, beliefs, emotions, sentiment, and other private states expressed in text.
Private state: a general term used to refer to mental and emotional states that cannot be directly observed or verified.
- Explicitly stated: “John is afraid that Sue might fall.”
- Indirectly expressed by the selection of words and the style of language that the speaker or writer uses: “It is about time that we end Saddam’s oppression.” (expressive subjective elements)
Low-Level Perspective Information (2/3)
Source: the experiencer of the private state, i.e., the person/entity whose opinion or emotion is being conveyed in the text.
Overall source: the writer. The writer may also write about the private states of other people, so there can be multiple sources in a single text segment (nesting of sources can be deep and complex).
Example: “Mary believes that Sue is afraid of the dark.”
- “Sue is afraid of the dark”: source is Mary
- “Mary believes…”: source is the writer
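As an illustration (not from the paper), a minimal Python sketch of how such nested source chains might be represented; the class and field names are assumptions:

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class PrivateStateAnnotation:
    """Illustrative container for a private-state annotation; the field
    names are assumptions, not the paper's actual representation."""
    text: str
    source_chain: Tuple[str, ...]  # outermost source (the writer) first

# "Mary believes that Sue is afraid of the dark."
believes = PrivateStateAnnotation(
    text="Mary believes",
    source_chain=("writer", "Mary"),
)
afraid = PrivateStateAnnotation(
    text="is afraid of the dark",
    source_chain=("writer", "Mary", "Sue"),  # nesting: writer -> Mary -> Sue
)
```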
Low-Level Perspective Information (3/3)
Annotations:
- On: the text span that constitutes the private state or speech event phrase itself.
- Inside: the text segment inside the scope of the private state/speech event phrase.
Example: “Tom believes that Ken is an outstanding individual.” Here “Tom believes” is the on span and “that Ken is an outstanding individual” is the inside span.
Attributes:
- Fact annotation: onlyfactive = yes
- Opinion annotation: onlyfactive = no, or an expressive subjective element
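A minimal sketch of how the on/inside spans and the onlyfactive attribute could be encoded for the example sentence; the dictionary layout is an illustrative assumption, not the scheme's actual file format:

```python
sentence = "Tom believes that Ken is an outstanding individual."

def span(text, phrase):
    """Return (start, end) character offsets of phrase within text."""
    start = text.index(phrase)
    return (start, start + len(phrase))

annotation = {
    # "on": the private-state/speech-event phrase itself
    "on": span(sentence, "Tom believes"),
    # "inside": the text inside the scope of that phrase
    "inside": span(sentence, "that Ken is an outstanding individual"),
    # opinion annotation: onlyfactive = "no"; a fact annotation would be "yes"
    "onlyfactive": "no",
}
print(annotation)
```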
The MPQA NRRC Corpus
Source: U.S. Foreign Broadcast Information Service (FBIS).
Using the perspective annotation framework, Wiebe et al. (2003) have manually annotated a considerable number of documents to form the NRRC (Northeast Regional Research Center) corpus.
Interannotator agreement, using the measure agr(a||b): the proportion of a’s annotations that were found by b.
- 85% on explicit private states
- 50% on expressive subjectivity
Conclusion: the good agreement results indicate that annotating opinions is a feasible task.
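A minimal sketch of the directional agr(a||b) measure, under the assumption that annotations are (start, end) character spans and that "found by b" means b produced an overlapping span (the paper's exact matching rule may differ):

```python
def overlaps(x, y):
    """True if two (start, end) spans share at least one character."""
    return x[0] < y[1] and y[0] < x[1]

def agr(a, b):
    """Proportion of a's annotation spans that b also found."""
    if not a:
        return 0.0
    return sum(any(overlaps(sa, sb) for sb in b) for sa in a) / len(a)

# Hypothetical spans; note agr is asymmetric: agr(a, b) != agr(b, a).
a = [(0, 10), (20, 30), (40, 50)]
b = [(5, 12), (41, 48)]
print(agr(a, b))  # 2 of a's 3 spans overlap something in b -> 0.67
```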
MPQA Corpus Creation (1/3)
The creation of the question and answer (Q&A) corpus used to evaluate the low-level perspective annotations in the context of opinion-oriented (opinion) and fact-based (fact) question answering.
98 documents, 4 topics (kyoto, mugabe, humanrights, venezuela), 19~33 documents for each topic.
Diagram: 270,000 documents → SMART → 98 documents.
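The slide shows the 98 documents being selected from the 270,000 with the SMART IR engine; as a rough stand-in sketch of such IR-based topic selection (scikit-learn TF-IDF is my substitution, not the paper's tooling):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def top_documents(docs, topic_query, k):
    """Return indices of the k documents most similar to the topic query."""
    vectorizer = TfidfVectorizer()
    doc_matrix = vectorizer.fit_transform(docs)
    query_vec = vectorizer.transform([topic_query])
    scores = cosine_similarity(query_vec, doc_matrix).ravel()
    return scores.argsort()[::-1][:k]

docs = ["kyoto protocol emissions", "coup in venezuela", "human rights report"]
print(top_documents(docs, "venezuela coup", k=1))  # -> [1]
```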
MPQA Corpus Creation (2/3)
Question creation: a volunteer, given 2 documents on each topic and a set of instructions, created 15 opinion (o) and 15 fact (f) questions for each topic.
Difficulty: the classification associated with each question (fact/opinion) did not always seem appropriate.
“Did any prominent Americans plan to visit Venezuela immediately following the 2002 coup?” Fact? Opinion?
MPQA Corpus Creation (3/3)
Annotating answers: manually added answer annotations for each text segment in the Q&A corpus that constituted or contributed to an answer to any question.
Attributes: topic, question number, confidence.
Difficulties:
- Opinionated documents often express answers to the questions only very indirectly.
- It is hard even for humans to decide what constitutes an answer to a question.
- It was hard for human annotators to judge what can be considered an expression of the opinion of a collective entity, and the conjecture often required a significant amount of background information.
Evaluation of Perspective Annotations for MPQA (1/5)
Two different experiments evaluate the usefulness of the perspective annotations in the context of fact-based and, especially, opinion-based QA:
- Answer probability: the number of answer segments classified as FACT and OPINION, respectively, that answer each question.
- Answer rank: determine the rank of the first retrieved sentence that correctly answers the question.
Evaluation of Perspective Annotations for MPQA (2/5)
Multiple criteria determine whether a text segment should be considered FACT or OPINION based on the underlying perspective annotations:
- 2 association criteria: to determine which perspective annotations should be considered associated with an arbitrary text segment.
- 4 classification criteria: to categorize the segment as one of FACT or OPINION.
The criteria are biased towards opinion annotations, since opinion annotations are expected to be more discriminative. (A sketch of plausible association rules follows below.)
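The actual criteria appear in the backup slides; purely as an illustration, a minimal sketch of two plausible association rules matching the "overlap any" and "cover (all)" labels used in the answer-rank filters, assuming spans are character offsets:

```python
def overlap_any(segment, annotations):
    """Associate if the segment overlaps at least one annotation."""
    s0, s1 = segment
    return any(s0 < a1 and a0 < s1 for a0, a1 in annotations)

def cover_all(segment, annotations):
    """Associate if the annotations jointly cover every character of the
    segment (a simplifying assumption about what 'cover (all)' means)."""
    s0, s1 = segment
    covered = set()
    for a0, a1 in annotations:
        covered.update(range(max(s0, a0), min(s1, a1)))
    return covered == set(range(s0, s1))

seg = (10, 20)
print(overlap_any(seg, [(18, 25)]))  # True: partial overlap suffices
print(cover_all(seg, [(18, 25)]))    # False: chars 10..17 are uncovered
```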
Evaluation of Perspective Annotations for MPQA (3/5)
Answer probability procedure: categorize each answering text segment as OPINION or FACT based on the criteria, then count how many FACT/OPINION segments answer fact/opinion questions. This estimates P(FACT/OPINION answer | fact/opinion question); see the sketch below.
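A minimal sketch of the counting behind the answer-probability experiment; the record fields are hypothetical:

```python
from collections import Counter

def answer_probabilities(answer_segments):
    """Estimate P(segment class | question type) from annotated answers.
    Each record is assumed to carry the type of the question it answers
    ('fact'/'opinion') and its FACT/OPINION classification under one
    criteria combination."""
    pair_counts = Counter(
        (seg["question_type"], seg["segment_class"]) for seg in answer_segments
    )
    type_totals = Counter(seg["question_type"] for seg in answer_segments)
    return {(q, c): n / type_totals[q] for (q, c), n in pair_counts.items()}

segments = [
    {"question_type": "fact", "segment_class": "FACT"},
    {"question_type": "fact", "segment_class": "FACT"},
    {"question_type": "opinion", "segment_class": "OPINION"},
    {"question_type": "opinion", "segment_class": "FACT"},
]
print(answer_probabilities(segments))
# {('fact', 'FACT'): 1.0, ('opinion', 'OPINION'): 0.5, ('opinion', 'FACT'): 0.5}
```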
Evaluation of Perspective Annotations for MPQA (4/5)
Answer rank procedure (see the sketch below):
- Divide the documents into a set of text segments (sentences).
- Run an IR algorithm with each question as the query, producing a ranked list of sentences.
- Apply one of two filters to remove OPINION answers for fact questions and vice versa (opinion: overlap any; fact: cover (all)), yielding a modified ranked list of answers.
- Evaluation: determine the rank of the first correct retrieved sentence (correct: any part of it is annotated as an answer to the question).
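A minimal sketch of the filtered answer-rank computation, assuming a ranked sentence list from the IR run, a FACT/OPINION classifier, and gold answer annotations (all function names here are illustrative):

```python
def first_correct_rank(ranked_sentences, is_answer, classify=None, keep=None):
    """Rank (1-based) of the first retrieved sentence any part of which is
    annotated as an answer to the question. If keep is 'FACT' or 'OPINION',
    sentences of the other class are filtered out first (e.g. OPINION
    answers removed for fact questions, and vice versa)."""
    rank = 0
    for sentence in ranked_sentences:
        if keep is not None and classify(sentence) != keep:
            continue  # removed by the filter
        rank += 1
        if is_answer(sentence):
            return rank
    return None  # the filter may have discarded every answering segment

# Toy usage with hypothetical sentences and a trivial classifier.
ranked = ["opinionated sentence", "factual answer", "another fact"]
rank = first_correct_rank(
    ranked,
    is_answer=lambda s: s == "factual answer",
    classify=lambda s: "OPINION" if "opinionated" in s else "FACT",
    keep="FACT",  # fact question: drop OPINION sentences
)
print(rank)  # 1: the opinionated sentence was filtered out
```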
Evaluation of Perspective Annotations for MPQA (5/5)
Discussion:
- Low-level perspective information can be a reliable predictor of whether a given segment of a document answers an opinion/fact question.
- Low-level perspective information may be used to re-rank potential answers, using the knowledge that the probability of a fact answer appearing in an OPINION segment, and vice versa, is very low (a re-ranking sketch follows below).
- Using filters can sometimes cause all answering segments for a particular question to be discarded, so it is unrealistic to use the FACT/OPINION segment classification as an absolute indicator of whether a segment can answer a fact/opinion question.
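The slides suggest re-ranking rather than hard filtering; a minimal sketch under the assumption that mismatched segments are demoted by a fixed penalty instead of discarded (the penalty scheme is mine, not the paper's):

```python
def rerank(ranked_sentences, question_type, classify, penalty=1000):
    """Demote sentences whose FACT/OPINION class does not match the
    question type, instead of discarding them outright."""
    def sort_key(indexed):
        position, sentence = indexed
        mismatch = classify(sentence).lower() != question_type
        return position + (penalty if mismatch else 0)
    return [s for _, s in sorted(enumerate(ranked_sentences), key=sort_key)]

ranked = ["opinionated sentence", "factual answer"]
print(rerank(ranked, "fact",
             lambda s: "OPINION" if "opinionated" in s else "FACT"))
# ['factual answer', 'opinionated sentence']
```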
Conclusion and Future Work
Both tasks (constructing a data collection & evaluating usefulness) provided insights into the potential difficulties of the task of MPQA and the usefulness of the low-level perspective information.
Main problems:
- Deciding what constitutes an answer.
- The presence of indirect answers (expressive subjectivity).
- Most answers to opinion questions have to be deduced.
Low-level perspective information can be an effective predictor of whether a text segment contains an answer to a question (given the type of the question), but should NOT be used as an absolute indicator, especially with a limited number of documents.
ThanQ
Table 1: Attributes
Table 2: Questions in the Q&A collection by topic
2 association criteria
4 classification criteria
Table 3: Answer Probability
Notation: P(ANSWER class | question type), e.g., P(F|f) = P(FACT answer | fact question).
- 120/415 answer segments annotated for fact/opinion questions, respectively.
- P(F|f) >> P(O|f)
- P(O|o) > P(F|o)
- P(F|f) >> P(F|o); P(O|o) >> P(O|f)
- The criteria achieving the maximum P(F|f) and the maximum P(O|o) are highlighted.
Table 4: Answer Rank
- The filters can remove all answer segments for some questions.
- Rank(overlap) <= Rank(unfilt) for opinion questions.
- Rank(cover) <= Rank(unfilt) for fact questions.
- Results are mixed, but the filtered runs do at least as well as the unfiltered run.