final talk

Post on 25-Feb-2016

DESCRIPTION

Final talk. Automatically Acquiring a Dictionary of Emotion-Provoking Events. Student: Hoa Vu- Trong – VNU Supervisor: Graham sensei - NAIST. Can Twitter benefit a dialogue system?. Twitter users. Dialog System. Machine : Hello! User : Hello! User : A guy next to me today, - PowerPoint PPT Presentation

TRANSCRIPT

1/20

Final talk

Automatically Acquiring a Dictionary of Emotion-Provoking Events

Student: Hoa Vu-Trong – VNU

Supervisor: Graham sensei – NAIST

2/20

Can Twitter benefit a dialogue system?

Dialog System

Machine: Hello!

User: Hello!

User: A guy next to me today is too noisy!

Machine: That's so annoying!

User:

Twitter users

3/20

Motivation

● Emotion is not always expressed by specific words.

● 4% of words imply emotion [1]

[1] Pennebaker, J.W., Mehl, M.R., Niederhoffer, K.: Psychological aspects of natural language use: Our words, our selves. Annual Review of Psychology 54, 547–577 (2003)

● Text emotion classifier

Simple architecture of a dialogue system with emotion adaptation.

1. I feel happy today
2. I met my friend today

4/20

Motivation

● An arbitrarily large set of emotion-provoking events can be collected from Twitter

You must be very happy

400M tweets/day

5/20

Method

● Emotions and events are related.
● Pattern learning is an effective way to harvest semantic relations
– Espresso (Pantel and Pennacchiotti, 2006).

Ex:
“I'm happy that I have the support of my friends. I love all of them!”
“I'm sad that tomorrow is Monday and I have to work. It's bad day”

Pattern: I be EMOTION that EVENT

Instances:
happy – I have the support of my friends
sad – tomorrow is Monday and I have to work
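As a rough illustration of how the seed pattern "I be EMOTION that EVENT" could pull out the two instances above, here is a minimal Python sketch. The regular expression, the `EMOTIONS` list, and the function name `extract_instances` are assumptions made for this example; the talk's actual pipeline relies on tweet normalization and a parser rather than a regex.

```python
import re

# Hypothetical seed emotion words, taken from the two example tweets.
EMOTIONS = ["happy", "sad"]

# Illustrative approximation of "I be EMOTION that EVENT":
# "be" is matched by a few surface forms, and the event is taken to be
# everything up to the next sentence-ending punctuation mark.
SEED_PATTERN = re.compile(
    r"\bI(?:'m| am| was)\s+(?P<emotion>" + "|".join(EMOTIONS) + r")"
    r"\s+that\s+(?P<event>[^.!?]+)",
    re.IGNORECASE,
)

def extract_instances(tweet):
    """Return the (emotion, event) pairs matched by the seed pattern."""
    return [
        (m.group("emotion").lower(), m.group("event").strip())
        for m in SEED_PATTERN.finditer(tweet)
    ]

tweets = [
    "I'm happy that I have the support of my friends. I love all of them!",
    "I'm sad that tomorrow is Monday and I have to work. It's bad day",
]
instances = [pair for t in tweets for pair in extract_instances(t)]
for emotion, event in instances:
    print(emotion, "-", event)
```

A regex keeps the sketch self-contained; it does not check that the extracted event is a well-formed sentence.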

6/20

Espresso Algorithm

● Used for mining semantic relations (e.g., is-a, has-a, …); begins with some seed instances.
● Each iteration contains 3 phases:
– Pattern induction
– Pattern ranking
– Instance extraction
● Stopping criterion: enough patterns have been found, the average reliability of the patterns decreases by t%, or a defined number of iterations is exceeded.

7/20

Espresso Algorithm

● Pattern induction: infers all the patterns P that connect the seed instances. Ex:

I be EMOTION that EVENT . I love all of you
I be EMOTION that EVENT . It be bad day

I be EMOTION that EVENT - 2 times
EMOTION that EVENT . - 2 times
EMOTION that EVENT . I love all - 1 time
……

I'm happy that I have the support of my friends. I love all of them!
I'm sad that tomorrow is Monday and I have to work. It's bad day
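The counting step in this example can be sketched as follows. This is only one plausible reading of pattern induction: every contiguous token span that contains both placeholders is treated as a candidate pattern. The function name `candidate_patterns` and the `max_len` cap are assumptions for this sketch, not details given in the talk.

```python
from collections import Counter

def candidate_patterns(tokens, max_len=8):
    """Yield every contiguous token span that contains both placeholders.

    max_len is a hypothetical cap on pattern length, used here only to
    keep the number of candidates small.
    """
    for start in range(len(tokens)):
        last_end = min(len(tokens), start + max_len)
        for end in range(start + 2, last_end + 1):
            span = tokens[start:end]
            if "EMOTION" in span and "EVENT" in span:
                yield " ".join(span)

# Sentences after the seed instances were replaced by placeholders.
sentences = [
    "I be EMOTION that EVENT . I love all of you",
    "I be EMOTION that EVENT . It be bad day",
]
counts = Counter(
    pattern for s in sentences for pattern in candidate_patterns(s.split())
)

for pattern in ("I be EMOTION that EVENT",
                "EMOTION that EVENT .",
                "EMOTION that EVENT . I love all"):
    print(pattern, "-", counts[pattern])
```

The three printed counts correspond to the three candidate patterns listed on the slide.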

8/20

Espresso Algorithm

● Pattern ranking: rank all the patterns and extract the top K most reliable ones.
● Reliable pattern: one that is both highly precise and extracts many instances (more in next slides).

9/20

Espresso Algorithm

● Instance extraction: retrieves the top M most reliable instances that match the K patterns selected in the previous phase.
● Reliable instance: one that is highly associated with as many reliable patterns as possible (more in next slides).

10/20

Espresso Algorithm

● The strength of association between an instance i = (x, y) and a pattern p is measured by PMI.

$$\mathrm{pmi}(i, p) = \log\left(\frac{\mathrm{count}(i, p)}{\mathrm{count}(i) \times \mathrm{count}(p)}\right)$$
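A small sketch of the PMI score exactly as written on this slide, using raw counts. The function name, the handling of a zero co-occurrence count, and the example counts are assumptions made for illustration.

```python
import math

def pmi(count_ip, count_i, count_p):
    """pmi(i, p) = log(count(i, p) / (count(i) * count(p))).

    Returning -inf when the instance and pattern never co-occur is an
    assumption of this sketch; the slide does not cover that case.
    """
    if count_ip == 0:
        return float("-inf")
    return math.log(count_ip / (count_i * count_p))

# Hypothetical counts: the pair co-occurs twice, the instance appears
# 4 times overall and the pattern 10 times overall.
score = pmi(count_ip=2, count_i=4, count_p=10)
print(score)
```

With raw counts the score is usually negative; only the relative ordering of scores matters when they are later normalized by the maximum PMI.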

11/20

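Espresso Algorithm

● Pattern reliability:

$$r(p) = \frac{\sum_{i \in I} \left( \frac{\mathrm{pmi}(i, p)}{\mathrm{maxPMI}} \ast r(i) \right)}{\mathrm{count}(I)}, \qquad 0 < r(p) \leqslant 1$$

● Instance reliability:

$$r'(i) = \frac{\sum_{p \in P} \left( \frac{\mathrm{pmi}(i, p)}{\mathrm{maxPMI}} \ast r(p) \right)}{\mathrm{count}(P)}, \qquad 0 < r'(i) \leqslant 1$$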
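The two reliability formulas can be sketched in Python as below. The PMI values, the identifiers `i1`, `i2`, `p1`, `p2`, and the function names are hypothetical toy data chosen so the arithmetic is easy to follow; seed instances are given reliability 1, as stated later in the talk.

```python
def pattern_reliability(pattern, instances, pmi, instance_rel, max_pmi):
    """r(p): average over instances of pmi(i, p) / maxPMI * r(i)."""
    total = sum(pmi[(i, pattern)] / max_pmi * instance_rel[i] for i in instances)
    return total / len(instances)

def instance_reliability(instance, patterns, pmi, pattern_rel, max_pmi):
    """r'(i): average over patterns of pmi(i, p) / maxPMI * r(p)."""
    total = sum(pmi[(instance, p)] / max_pmi * pattern_rel[p] for p in patterns)
    return total / len(patterns)

# Hypothetical toy PMI table for two instances and two patterns.
instances = ["i1", "i2"]
patterns = ["p1", "p2"]
pmi = {("i1", "p1"): 2.0, ("i2", "p1"): 1.0,
       ("i1", "p2"): 0.5, ("i2", "p2"): 0.0}
max_pmi = max(pmi.values())

seed_rel = {"i1": 1.0, "i2": 1.0}  # seed instances have reliability 1
pattern_rel = {p: pattern_reliability(p, instances, pmi, seed_rel, max_pmi)
               for p in patterns}
new_instance_rel = {i: instance_reliability(i, patterns, pmi, pattern_rel, max_pmi)
                    for i in instances}
print(pattern_rel)
print(new_instance_rel)
```

In this toy run `p1` ends up more reliable than `p2` because it is more strongly associated with the seeds, and `i1` ends up more reliable than `i2` for the same reason.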

Grouping events

● Relieves sparsity issues to some extent by sharing statistics among the events in a single group
● Allows humans to understand the events better, highlighting the important events shared by many people
● Uses hierarchical agglomerative clustering with the single-linkage criterion and cosine similarity as the similarity measure
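A minimal pure-Python sketch of single-linkage agglomerative clustering over events is shown below. Several choices here are assumptions, since the slide does not specify them: events are represented as bag-of-words count vectors, merging stops once the best similarity falls below a `threshold`, and the example events are invented.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def single_linkage(events, threshold=0.5):
    """Greedily merge the two most similar clusters until none reach threshold.

    Under single linkage, the similarity of two clusters is the similarity
    of their closest (most similar) pair of members.
    """
    vectors = [Counter(e.lower().split()) for e in events]
    clusters = [[i] for i in range(len(events))]
    while len(clusters) > 1:
        best_sim, best_pair = -1.0, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                sim = max(cosine(vectors[i], vectors[j])
                          for i in clusters[a] for j in clusters[b])
                if sim > best_sim:
                    best_sim, best_pair = sim, (a, b)
        if best_sim < threshold:
            break
        a, b = best_pair
        clusters[a].extend(clusters.pop(b))  # b > a, so index a stays valid
    return [[events[i] for i in c] for c in clusters]

events = [
    "I met my friend",
    "I met my best friend",
    "tomorrow is Monday",
    "tomorrow is Monday again",
]
groups = single_linkage(events)
for group in groups:
    print(group)
```

The quadratic search over cluster pairs is fine for a handful of events; a large dictionary would need a more efficient implementation.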

13/20

Experiments

● Data corpus: 30 million tweets from Neubig and Duh '13 [1]
● Tweet normalization by Han et al. '12 [2]
● The Stanford parser [3] was employed to ensure that each event is a sentence

[1] Graham Neubig, Kevin Duh. How Much is Said in a Tweet? A Multilingual, Information-theoretic Perspective. In AAAI Spring Symposium on Analyzing Microtext. Stanford, California. March 2013.
[2] Han et al. Automatically Constructing a Normalisation Dictionary for Microblogs. In EMNLP 2012.
[3] http://nlp.stanford.edu/software/lex-parser.shtml

14/20

Experiments

● 6 basic emotion classes defined by Ekman [1]:
– Anger: angry, mad
– Disgust: disgusted, terrible
– Fear: afraid, scared
– Happiness: happy, glad
– Sadness: sad, upset
– Surprise: surprised, astonished

[1] Ekman, P.: Universals and cultural differences in facial expressions of emotions. Nebraska Symposium on Motivation 19, 207–283 (1972)

15/20

Experiments

● We start the system with the seed instances collected by the pattern: “I be EMOTION that EVENT”
● The reliability of the seed instances is 1.
● Stopping criterion: a fixed limit on the number of iterations.

16/20

Result

● Happiness: 14027 events
● Sadness: 3909 events
● Fear: 8798 events
● Anger: 2133 events
● Surprise: 2466 events
● Disgust: 26 events

17/20

Result

● Some new patterns:

I feel EMOTION when EVENT

I be EMOTION because EVENT

I be EMOTION EVENT

I get so EMOTION when EVENT

Make me EMOTION when EVENT

Get really EMOTION that EVENT

Be really EMOTION to hear that EVENT

Be EMOTION to know that EVENT

EMOTION at the fact that EVENT

be EMOTION to death that EVENT

18/20

Evaluation

● Using Mean Reciprocal Rank (MRR):

$$\mathrm{MRR} = \frac{1}{|Q|} \sum_{i=1}^{|Q|} \frac{1}{\mathrm{rank}_i}$$

| Predicted | Human annotation | Rank | Reciprocal rank |
|---|---|---|---|
| Surprise | Happiness, Surprise, Sadness | 2 | 1/2 |
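The MRR computation can be sketched as follows. The function name is hypothetical, the second query is invented for illustration, and scoring a prediction that does not appear in the human ranking as 0 is an assumption of this sketch.

```python
def mean_reciprocal_rank(predictions, human_rankings):
    """Average of 1 / rank of each predicted emotion in the human ranking.

    A prediction missing from the ranking contributes 0 (an assumption).
    """
    total = 0.0
    for predicted, ranked in zip(predictions, human_rankings):
        if predicted in ranked:
            total += 1.0 / (ranked.index(predicted) + 1)
    return total / len(predictions)

# First query: the slide's example, where the prediction is ranked 2nd.
# Second query: a hypothetical event whose prediction is ranked 1st.
predictions = ["Surprise", "Happiness"]
human_rankings = [
    ["Happiness", "Surprise", "Sadness"],
    ["Happiness", "Sadness"],
]
mrr = mean_reciprocal_rank(predictions, human_rankings)
print(mrr)
```

The first query contributes 1/2 and the second contributes 1, so the mean over the two queries is 0.75.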

19/20

Evaluation

● Measuring recall
– Asking 30 people about 5 events that provoke each of five emotions

| Emotions | Events |
|---|---|
| happiness | meeting friends; buying/getting something I want; going on a date |
| sadness | a plan gets cancelled; someone dies/gets sick; failing a test |
| anger | someone breaks a promise; someone insults me; someone breaks something of mine |
| fear | getting a sudden phone call; seeing an insect; walking at night |
| surprise | seeing a friend unexpectedly; seeing a car suddenly appear; hearing a loud noise |

20/20

Evaluation

● Evaluating emotion-provoking events
● Human evaluation on top 100 groups.

| Methods | MRR | Recall |
|---|---|---|
| Seed | 51.8 | 5.21 |
| Seed + clustering | 66.1 | 9.40 |
| Espresso | 51.5 | 8.55 |
| Espresso + clustering | 74.7 | 16.2 |

| Emotions | MRR | Recall |
|---|---|---|
| Happiness | 100 | 26.9 |
| Sadness | 82.3 | 10.0 |
| Anger | 82.4 | 15.8 |
| Fear | 46.3 | 27.3 |
| Surprise | 58.3 | 0.0 |

21/20

Discussion

● Recall is still relatively low
● Events extracted from Twitter were somewhat biased towards everyday events or events regarding love and dating
● For surprise, we didn't manage to extract any of the events created by the annotators at all

22/20

In Conclusion

● This work focuses on acquiring emotion-provoking events
● The Espresso algorithm is used to learn patterns and extract events; similar events are then grouped to create a dictionary.
● Paper submitted to EACL 2014

23/20

Thank you very much
