Post on 15-Feb-2017
DINA: A Multi-Dialect Dataset for Arabic Emotion Analysis
Muhammad Abdul-Mageed¹,², Hassan AlHuzliy¹, Duaa’ Abu Elhija¹, Mona Diab²
¹Indiana University, ²The George Washington University
Emotions
• Categories of emotion:
– Ekman (e.g., 1992) proposes six basic emotions: anger, disgust, fear, happiness, sadness, and surprise.
– Plutchik (1980, 1985, 1994) adds trust and anticipation.
• Emotion on three dimensions:
– e.g., Francisco and Gervas (2006) mark the attributes of pleasantness, activation, and dominance in the genre of fairy tales.
– DINA is focused on the Ekman emotions.
Motivations
• Opinion Mining:
– Provides an enriching component beyond the mere binary valence (i.e., positive and negative) of most sentiment analysis systems.
• Health & Wellness:
– Early detection of certain emotional disorders such as depression.
– Improving the well-being of people by exposing them to desired emotions (since emotion is contagious [Kramer et al., 2014]).
• Education:
– Integrating emotionally-aware agents in intelligent computer-assisted language learning, for example, should prove useful and enhance the naturalness of the pedagogical experience.
Motivations Cont.
• Marketing:
– e.g., emotion-sensitive language generation can help with marketing (Heath et al., 2001; Tan et al., 2014), political campaigning, etc.
• Security:
– Deflect potential hazards and anticipate dangerous behaviors.
• Author Profiling:
– Useful for predicting age and gender (Meina et al., 2013; Flekova and Gurevych, 2013; Farias et al., 2013; Bamman et al., 2014; Forner et al., 2013) and personality (Mohammad and Kiritchenko, 2013).
Related Work
• SemEval-2007 Affective Text task (Strapparava and Mihalcea, 2007) [SEM07]:
– Collection and classification of emotion and valence in news headlines
• Aman and Szpakowicz (2007):
– Annotation and detection of emotions from blogs
• Qadir and Riloff (2014), Mohammad (2012), Wang et al. (2012):
– Use hashtags as an approximation of emotion categories to collect emotion data
Arabic: Motivations
• Morphologically Rich Language:
– Highly inflected: person, number, gender, case, mood, aspect, voice
• Strategic Language:
– One of the six official languages of the UN, with ~300M speakers worldwide
• Exponential Web growth:
– More than 2000% growth rate on the Web in 2010 onwards (www.internetworldstats.com).
Arabic Dialects
Data Collection
• Crawled Twitter data using a seed set of fewer than 10 phrases for each of the six Ekman emotion types.
• Each phrase is composed of an emotion word (e.g., “happy”) and the first-person pronoun “I”.
• We collect only tweets where a seed phrase occurs in the tweet body text.
• This approach does not depend on hashtags.
• We collect 500 tweets from each of the 6 emotion types. Total = 3,000.
• Seeds capture various Arabic dialects.
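The seed-matching step described on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' crawler: it assumes the tweets have already been fetched as plain strings, the English seed phrases are hypothetical stand-ins for the Arabic seeds (which are not reproduced here), and assigning a tweet to the first matching emotion is a simplifying assumption.

```python
# Hypothetical English stand-ins for the seed phrases; the actual seeds are
# Arabic (several dialects) and pair an emotion word with the pronoun "I".
SEEDS = {
    "happiness": ["i am happy", "i'm happy"],
    "sadness": ["i am sad", "i'm sad"],
}
MAX_PER_EMOTION = 500  # the slide reports 500 tweets per emotion type


def collect_by_seed(tweets, seeds=SEEDS, cap=MAX_PER_EMOTION):
    """Group tweets by the emotion whose seed phrase occurs in the tweet body."""
    collected = {emotion: [] for emotion in seeds}
    for text in tweets:
        body = text.casefold()  # case-insensitive substring matching
        for emotion, phrases in seeds.items():
            if len(collected[emotion]) >= cap:
                continue  # this emotion already has enough tweets
            if any(phrase in body for phrase in phrases):
                collected[emotion].append(text)
                break  # assumption: keep only the first matching emotion
    return collected


sample = ["I am happy today", "Honestly I am sad", "#happy", "no emotion word here"]
result = collect_by_seed(sample)
```

Because matching is done on the tweet body against full seed phrases, a tweet that only carries a hashtag such as "#happy" is not collected, which mirrors the point that the approach does not depend on hashtags.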
Seeds
Table 1. Example seeds
Annotation
• To verify the utility of this seed-based approach, two college-educated native speakers of Arabic labeled the data.
• For labeling, we use one of four tags from the set {“no-emotion/zero”, “weak-emotion”, “moderate/fair-emotion”, “strong-emotion”}.
• We measure inter-annotator agreement on these intensity labels with Cohen’s Kappa.
• We also calculate the % of emotion-carrying tweets per category (those not assigned the label “no-emotion/zero”).
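As an illustration of the two measurements named above, the sketch below computes Cohen's kappa, (p_o - p_e) / (1 - p_e), for two annotators and the percentage of tweets not labeled “no-emotion/zero”. The function names and the toy label lists are hypothetical; the DINA annotations themselves are not reproduced here.

```python
from collections import Counter

NO_EMOTION = "no-emotion/zero"


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label distribution.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


def percent_emotion(labels):
    """Percentage of items whose label is anything other than no-emotion."""
    return 100.0 * sum(label != NO_EMOTION for label in labels) / len(labels)


# Toy annotations for four tweets (illustrative only).
ann_a = ["strong-emotion", "strong-emotion", NO_EMOTION, "weak-emotion"]
ann_b = ["strong-emotion", "weak-emotion", NO_EMOTION, "weak-emotion"]
kappa = cohens_kappa(ann_a, ann_b)
pct_a = percent_emotion(ann_a)
```

On the toy data, observed agreement is 3/4 and chance agreement is 5/16, so kappa works out to 7/11 (about 0.64), and 75% of annotator A's labels carry emotion.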
DINA: Agreement & % Emotion
Table 3. Agreement in fine-grained annotation and average percentage of emotion
Gold Labels from Happiness Class
Table 2. Agreement in happiness annotation
Examples: Anger
Examples: Disgust
Examples: Fear
Examples: Happiness
Examples: Sadness
Examples: Surprise
Context of No- and Mixed Emotions
• Even with a list of well-crafted seeds, both annotators assign “no-emotion” for 7.5% of the data.
• This is a function of emotion being a pragmatics-level phenomenon.
• Contexts for “no-emotion” include:
– Reported speech
– Sarcasm
Reported Speech
Sarcasm
Conclusion
• Emotion is like other pragmatics-level phenomena; hence a seed-based collection approach is useful, but not perfect.
• Phenomena like reported speech and sarcasm interact with our method for emotion data collection.
• DINA is multidialectal, but we do not have exact dialect labels on the tweets.
• DINA currently has 3,000 tweets, and we plan to grow it.
• Full evaluation of DINA is only possible when we build models exploiting these data, which we plan to do.