Automatic Classification of Semantic Relations between Facts and Opinions
Koji Murakami, Eric Nichols, Junta Mizuno, Yotaro Watanabe, Hayato Goto, Megumi Ohki, Suguru Matsuyoshi, Kentaro Inui, Yuji Matsumoto
Presented by Aaron Michelony, CMPS 245
May 3, 2011
Abstract
• The authors want to classify and identify semantic relations between facts and opinions on the Web.
• This will enable them to organize information on the Web.
• Recognizing Textual Entailment (RTE) and Cross-document Structure Theory (CST) define sets of semantic relations.
• The authors will expand on these.
• The target data is Japanese web pages.
Recognizing Textual Entailment (RTE)
• The task of deciding whether the meaning of one text is entailed by another text.
• A major task in the RTE Challenge is classifying the semantic relation between a Text (T) and Hypothesis (H) as one of:
o [ENTAILMENT]: H can be inferred from T.
o [CONTRADICTION]: It is very unlikely that both T and H can be true at the same time.
o [UNKNOWN]: Neither of the above holds.
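To make the three-way labeling concrete, here is a minimal sketch of (Text, Hypothesis, label) pairs; the example sentences are invented for illustration and are not taken from the RTE Challenge data.

```python
# Illustrative RTE-style (Text, Hypothesis, label) triples.
# Sentences are invented examples, not actual RTE Challenge items.
rte_examples = [
    ("The company acquired the startup in 2010.",
     "The startup was bought by the company.",
     "ENTAILMENT"),
    ("The company acquired the startup in 2010.",
     "The startup remained independent.",
     "CONTRADICTION"),
    ("The company acquired the startup in 2010.",
     "The startup was founded in Kyoto.",
     "UNKNOWN"),
]

for text, hypothesis, label in rte_examples:
    print(f"[{label}] T: {text} | H: {hypothesis}")
```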
Cross-document Structure Theory (CST)
• Developed by Radev (2000).
• Another task of recognizing semantic relations between sentences.
• An expanded rhetorical structure analysis based on Rhetorical Structure Theory (RST) (1988).
• A corpus of cross-document sentences annotated with CST relations has been constructed.
• The corpus contains 18 kinds of semantic relations, including [EQUIVALENCE], [CONTRADICTION], [JUDGEMENT], [ELABORATION], and [REFINEMENT].
• CST was designed for objective expressions.
Example Semantic Relations
• Query: Xylitol is effective at preventing cavities.
• Matching sentences and output:
o The cavity-prevention effects are greater the more Xylitol is included. [AGREEMENT]
o Xylitol shows effectiveness at maintaining good oral hygiene and preventing cavities. [AGREEMENT]
o There are many opinions about the cavity-prevention effectiveness of Xylitol, but it is not really effective. [CONFLICT]
Semantic Relations between Statements
• Goal: Define semantic relations that are applicable over both facts and opinions.
[AGREEMENT]
• Bi-directional relation where statements have equivalent semantic content on a shared topic.
• Example:
o Bio-ethanol is good for the environment.
o Bio-ethanol is a high-quality fuel, and it has the power to deal with the environmental problems we're facing.
[CONFLICT]
• Bi-directional relation where statements have negative or contradicting semantic content on a shared topic.
• Example:
o Bio-ethanol is good for our earth.
o There is a fact that bio-ethanol furthers the destruction of the environment.
[EVIDENCE]
• Uni-directional relation where one statement provides justification or supporting evidence for the other.
• Example:
o I believe that applying the technology of cloning must be controlled by law.
o There is a need to regulate cloning, because it can be open to abuse.
[CONFINEMENT]
• Uni-directional relation where one statement provides more specific information about the other or quantifies the situations in which it applies.
• Example:
o Steroids have side-effects.
o There is almost no need to worry about side-effects when steroids are used for local treatments.
Recognizing Semantic Relations
1. Identify an [AGREEMENT] or [CONFLICT] relation between the Query and Text.
2. Search the Text sentence for cues that identify [CONFINEMENT] or [EVIDENCE].
3. Infer the applicability of the [CONFINEMENT] or [EVIDENCE] relations in the Text to the Query.
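The three steps above can be sketched as a small pipeline. This is a hypothetical skeleton, not the authors' implementation: `classify_agreement_conflict` is a placeholder stub, and the cue lists reuse the typical cues the deck mentions later ("because"/"due to" for [EVIDENCE], "when"/"if" for [CONFINEMENT]).

```python
# Hypothetical sketch of the three-step relation-recognition pipeline.
# Function names and cue lists are illustrative assumptions.
EVIDENCE_CUES = ("because", "due to")   # typical [EVIDENCE] cues
CONFINEMENT_CUES = ("when", "if")       # typical [CONFINEMENT] cues

def classify_agreement_conflict(query: str, text: str) -> str:
    # Stub for step 1; the real system uses structural alignment
    # plus an SVM classifier. Always answers AGREEMENT here.
    return "AGREEMENT"

def recognize_relation(query: str, text: str) -> str:
    # Step 1: identify [AGREEMENT] or [CONFLICT] between Query and Text.
    base = classify_agreement_conflict(query, text)
    # Step 2: search the Text sentence for relation cues
    # (naive word matching; a real system inspects syntax too).
    words = text.lower().replace(",", " ").split()
    if any(cue in words for cue in CONFINEMENT_CUES):
        # Step 3: infer that the cued relation applies to the Query.
        return "CONFINEMENT"
    if any(cue in " ".join(words) for cue in EVIDENCE_CUES):
        return "EVIDENCE"
    return base
```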
Linguistic Analysis
• Tools:
o For syntactic analysis, the dependency parser CaboCha, which splits the Japanese text into phrase-like chunks and represents syntactic dependencies between the chunks as edges in a graph.
o The predicate-argument structure analyzer ChaPAS.
o Modality analysis resources provided by Matsuyoshi et al. (2010), focusing on tense, modality, and polarity.
Structural Alignment
• Consists of two phases:
1. Lexical alignment
2. Structural alignment
• Aligns chunks based on lexical similarity information, creating an alignment confidence score between 0.0 and 1.0, and aligning chunks whose scores cross an empirically determined threshold.
Structural Alignment
• Uses the following information:
o Surface-level similarity: identical content words or cosine similarity.
o Semantic similarity:
  Predicates: check for matches in a predicate entailment database.
  Arguments: check for synonym or hypernym matches in WordNet or a hypernym collection.
Structural Alignment
• Compare the predicate-argument structure of the query to that of the text and see if they are compatible.
• Example:
o Agricultural chemicals are used in the field.
o Over the field, agricultural chemicals are sprayed.
• Uses the following information:
o # of aligned children
o # of aligned case frames
o # of possible alignments in a window of n chunks
o Predicates indicating existence or quantity, e.g., many, few, to exist, etc.
o Polarity of both parent and child chunks
Structural Alignment
• Use an SVM, trained on 370 sentence pairs.
• Features:
o Distance in edges in the dependency graph between parent and child for both sentences
o Distance in chunks between parent and child
o Binary features indicating whether each chunk is a predicate or argument according to ChaPAS
o POS of the first and last word in each chunk
o When the chunk ends with a case marker, the case of the chunk; otherwise none
o Lexical alignment score of each chunk pair
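The feature list above can be sketched as a feature extractor for one parent-child chunk pair. The `Chunk` type and every field name here are invented for illustration; the authors' actual feature encoding is not specified in the deck.

```python
from dataclasses import dataclass

# Hypothetical chunk representation; fields mirror the feature list above.
@dataclass
class Chunk:
    text: str
    is_predicate: bool  # from ChaPAS in the real system
    first_pos: str      # POS of the first word in the chunk
    last_pos: str       # POS of the last word in the chunk
    case_marker: str    # "" when the chunk does not end with a case marker

def pair_features(parent: Chunk, child: Chunk,
                  edge_dist: int, chunk_dist: int,
                  alignment_score: float) -> dict:
    # One feature dict per parent-child chunk pair, fed to the SVM.
    return {
        "edge_distance": edge_dist,        # edges in the dependency graph
        "chunk_distance": chunk_dist,      # distance in chunks
        "parent_is_predicate": int(parent.is_predicate),
        "child_is_predicate": int(child.is_predicate),
        "parent_pos": parent.first_pos + "/" + parent.last_pos,
        "child_pos": child.first_pos + "/" + child.last_pos,
        "parent_case": parent.case_marker or "none",
        "child_case": child.case_marker or "none",
        "alignment_score": alignment_score,  # lexical alignment confidence
    }
```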
Relation Classification
• After structural alignment, do semantic relation classification.
• Uses an SVM.
• Features:
o Alignments
o Modality
o Antonyms: identify [CONFLICT].
o Negation
o Contextual cues: can identify [CONFINEMENT] or [EVIDENCE] relations. "Because" and "due to" are typical for [EVIDENCE]; "when" and "if" are typical for [CONFINEMENT].
Evaluation
1. Retrieve documents.
2. Extract real sentences that include major subtopic words.
3. Reduce noise in the data.
4. Reduce the search space by identifying sentence pairs, and prepare pairs that look feasible to annotate.
5. Annotate corresponding sentences with [AGREEMENT], [CONFLICT], [CONFINEMENT], or [OTHER].
Results
• Compare two different approaches:
1. 3-class: Semantic relations are directly classified into [AGREEMENT], [CONFLICT], and [CONFINEMENT].
2. Cascaded 3-class: Semantic relations are first classified into [AGREEMENT] and [CONFLICT]; then, using context cues, some of them are reclassified into [CONFINEMENT].
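The cascaded strategy can be sketched in a few lines; the base [AGREEMENT]/[CONFLICT] label is passed in (a stand-in for the trained SVM's output), and the cue list is the one the deck gives for [CONFINEMENT].

```python
# Sketch of the cascaded 3-class strategy; the base classifier is
# assumed to have already produced an AGREEMENT/CONFLICT label.
CONFINEMENT_CUES = ("when", "if")

def cascaded_3class(pair_text: str, base_label: str) -> str:
    # First stage classified the pair into AGREEMENT/CONFLICT (base_label);
    # second stage reclassifies into CONFINEMENT when a context cue appears.
    if any(cue in pair_text.lower().split() for cue in CONFINEMENT_CUES):
        return "CONFINEMENT"
    return base_label
```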
Results
             Baseline         Structural Alignment   Upper-bound
Precision    0.44 (56/126)    0.52 (96/186)          0.74 (135/183)
Recall       0.30 (56/184)    0.52 (96/184)          0.73 (135/184)
F1-score     0.36             0.52                   0.74
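The scores follow directly from the raw counts; recomputing the structural-alignment column as a sanity check:

```python
# Recompute precision, recall, and F1 for the structural-alignment
# system from the table's raw counts (96/186 and 96/184).
precision = 96 / 186
recall = 96 / 184
f1 = 2 * precision * recall / (precision + recall)
print(round(precision, 2), round(recall, 2), round(f1, 2))  # 0.52 0.52 0.52
```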
Error Analysis
• A big cause of incorrect classification is incorrect lexical alignment.
o More resources and more effective methods are needed.
• The most serious problem is the feature engineering necessary to find the optimal way of applying structural alignments and other semantic information to semantic relation classification.