talk at mit hci seminar
Post on 27-Jun-2015
Machine learning approaches for understanding social interactions on Twitter
May 6, 2014
Alice Oh
alice.oh@kaist.edu
aoh@seas.harvard.edu
http://uilab.kaist.ac.kr/members/aliceoh/
Our Research
• Topic Modeling
  • ICML 2014: Hierarchical Dirichlet scaling process
  • IJCAI 2013: Context-dependent conceptualization
  • NIPS Big Learning Workshop 2012: Distributed online learning for latent Dirichlet allocation
  • CIKM 2012: Recursive Chinese restaurant processes for modeling topic hierarchies
  • ICML 2012: Dirichlet processes with mixed random measures
• Social Media Analysis
  • ACL 2014 Workshop: Self-disclosure topic model
  • WWW 2014: Computational analysis of agenda setting theory
  • AAAI 2013: Hierarchical aspect-sentiment model
  • ICWSM 2012: Social aspects of emotions in Twitter conversations
  • ACL 2012: Self-disclosure and relationship strength in Twitter conversations
  • WSDM 2011: Aspect sentiment unification model for online review analysis
Contact Information
• At Harvard until end of July, 2014 and open for
  • Collaborations: writing papers, sharing data, etc.
  • Discussions about topic modeling and computational social science
• Going back to KAIST in August
  • http://uilab.kaist.ac.kr
  • alice.oh@kaist.edu
  • Can recommend students for intern, postdoc, and researcher positions
• Please consider attending
  • ICWSM (program co-chair), Ann Arbor, MI
  • ACL Workshop on Social Dynamics and Personal Attributes (co-organizer), Baltimore, MD
What is topic modeling?
Blei, Communications of the ACM, 2012
Motivation
• What are the topics discussed in the article?
• Is the article related to
• household finances?
• price of gasoline?
• price of Apple stock?
• How would you build an automatic system for answering these questions?
http://www.nytimes.com/2010/08/09/sports/autoracing/09nascar.html?hp
nascar, races, track, raceway, race, cars, fuel, auto, racing
economic, slowdown, sales, recession, costs, spending, save
fans, spectators, sports, leagues, teams, competition
http://www.nytimes.com/2010/08/09/sports/autoracing/09nascar.html?
nascar, races, track, raceway, race, cars, fuel, auto, racing
economic, slowdown, sales, recession, costs, spending, save
fans, spectators, sports, leagues, teams, competition
Topics: multinomial over words
Topic Distributions
Input to LDA
http://www.nytimes.com/2010/08/09/sports/autoracing/09nascar.html?
Topics Discovered by LDA
nascar   0.12     spending    0.09     sports    0.12
races    0.10     economic    0.07     team      0.11
cars     0.10     recession   0.06     game      0.10
racing   0.09     save        0.05     player    0.10
track    0.08     money       0.05     athlete   0.09
speed    0.06     cut         0.04     win       0.07
...               ...                  ...
money    0.002    speed       0.003    nascar    0.001
Topics: multinomial over vocabulary
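To make "topics are multinomials over the vocabulary" concrete, the following minimal sketch reuses the probabilities listed above. The topic names (`racing`, `economy`, `sports`), the `FLOOR` probability for unlisted words, and the scoring helper are illustrative assumptions, not part of the talk; a fitted LDA model infers per-document topic proportions rather than picking a single best topic.

```python
import math

# Partial topic-word probabilities copied from the table above.
# The topic names are hypothetical labels added only for readability.
TOPICS = {
    "racing": {"nascar": 0.12, "races": 0.10, "cars": 0.10, "racing": 0.09,
               "track": 0.08, "speed": 0.06, "money": 0.002},
    "economy": {"spending": 0.09, "economic": 0.07, "recession": 0.06,
                "save": 0.05, "money": 0.05, "cut": 0.04, "speed": 0.003},
    "sports": {"sports": 0.12, "team": 0.11, "game": 0.10, "player": 0.10,
               "athlete": 0.09, "win": 0.07, "nascar": 0.001},
}

FLOOR = 1e-4  # assumed probability for words the slide does not list


def log_likelihood(words, topic):
    """Log-probability of a bag of words under one topic's multinomial."""
    return sum(math.log(topic.get(w, FLOOR)) for w in words)


def best_topic(words):
    """Name of the topic that gives the word list the highest likelihood."""
    return max(TOPICS, key=lambda name: log_likelihood(words, TOPICS[name]))


def top_words(name, k=3):
    """The k most probable listed words of a topic."""
    topic = TOPICS[name]
    return sorted(topic, key=topic.get, reverse=True)[:k]
```

For example, `best_topic(["recession", "spending", "save"])` picks the topic whose listed words cover the input, which is the kind of question ("is the article about household finances?") the motivation slide asks.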
http://www.nytimes.com/2010/08/09/sports/autoracing/09nascar.html?
nascar, races, track, raceway, race, cars, fuel, auto, racing
economic, slowdown, sales, recession, costs, spending, save
fans, spectators, sports, leagues, teams, competition
Topics: multinomial over words
Topic Distributions
Graphical Representation of LDA
Topic Distributions
nascar, races, track, raceway, race, cars, fuel, auto, racing
economic, slowdown, sales, recession, costs, spending, save
fans, spectators, sports, leagues, teams, competition
Topics: multinomial over words
Topics
sales xxx slowdown recession cars races spending xxx save costs fuel
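The graphical model corresponds to a generative story: draw per-document topic proportions, then for each word draw a topic and then a word from that topic. A minimal sketch of that process follows; the toy `TOPICS` (uniform over a few words from the lists above), the `alpha` value, and all function names are assumptions for illustration, and this samples from the model rather than performing inference.

```python
import random


def uniform(words):
    """Toy topic: equal probability for every listed word (an assumption)."""
    return {w: 1.0 / len(words) for w in words}


# Hypothetical topics built from words shown on the slides.
TOPICS = [
    uniform(["nascar", "races", "track", "cars", "fuel"]),
    uniform(["economic", "slowdown", "sales", "recession", "costs", "spending", "save"]),
    uniform(["fans", "spectators", "sports", "teams"]),
]


def sample_dirichlet(alphas, rng):
    """Draw from a Dirichlet by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(topics, alpha, n_words, rng):
    """Sample topic proportions, then a topic and a word for each position."""
    theta = sample_dirichlet([alpha] * len(topics), rng)
    assignments, words = [], []
    for _ in range(n_words):
        z = rng.choices(range(len(topics)), weights=theta)[0]
        vocab = list(topics[z])
        w = rng.choices(vocab, weights=[topics[z][v] for v in vocab])[0]
        assignments.append(z)
        words.append(w)
    return theta, assignments, words
```

Calling `generate_document(TOPICS, alpha=0.5, n_words=12, rng=random.Random(0))` returns the sampled proportions, the per-word topic assignments, and the words themselves.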
Do you feel what I feel? Social Aspects of Emotions in Twitter Conversations
Suin Kim, JinYeong Bak, Alice Oh ICWSM 2012
Twitter conversation data
• Twitter conversation data: approx. 220k dyads who “reply” to each other and 1,670k conversational chains (we now have about 5x this amount)
Emotion Cycles
We propose that organizational dyads and groups inhabit emotion cycles: Emotions of an individual influence the emotions, thoughts and behaviors of others; others’ reactions can then influence their future interactions with the individual expressing the original emotion, as well as that individual’s future emotions and behaviors. People can mimic the emotions of others, thereby extending the social presence of a specific emotion, but can also respond to others’ emotions, extending the range of emotions present.
Topic model with a twist
• Dirichlet forest prior (Andrzejewski et al.)
  • Mixture of Dirichlet tree distributions
  • Dirichlet tree: generalization of the Dirichlet distribution
• Knowledge is expressed using Must-link and Cannot-link primitives
  • Must-link(love, sweetheart)
  • Cannot-link(exciting, bored)
DF-LDA
Domain knowledge in Dirichlet forest prior
Seed Words
• Anticipation: hope, wait, await, inspir, excit, bore, readi, expect, nervou, calm, motiv, prepar, certain, anxiou, optimist, forese
• Joy: awesom, amaz, wonder, excit, glad, fine, beauti, high, lucki, super, perfect, complet, special, bless, safe, proud
• Anger: shit, bitch, ass, mean, damn, mad, jealou, piss, annoi, angri, upset, moron, rage, screw, stuck, irrit
• Surprise: amaz, wow, wonder, weird, lucki, differ, awkward, confus, holi, strang, shock, odd, embarrass, overwhelm, astound, astonish
• Fear: scare, stress, horror, nervou, terror, alarm, behind, panic, fear, afraid, desper, threaten, tens, terrifi, fright, anxiou
• Sadness: sorri, bad, aw, sad, wrong, hurt, blue, dead, lost, crush, weak, depress, wors, low, terribl, lone
• Disgust: sick, wrong, evil, fat, ugli, horribl, gross, terribl, selfish, miser, pathet, disgust, worthless, aw, asham, fuck
• Acceptance: okai, ok, same, alright, safe, lazi, relax, peac, content, normal, secur, complet, numb, fulfil, comfort, defeat
Must-link within a class; Cannot-link between classes
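The rule "Must-link within a class, Cannot-link between classes" can be sketched as a pair-generation step over the seed lists. The `SEEDS` subset and the `build_links` helper below are hypothetical; the sketch only enumerates the constraint pairs and does not encode them into a Dirichlet forest prior. Because some stems appear in more than one class (for example `excit` under both Anticipation and Joy), the sketch assumes that a pair already marked Must-link is never also marked Cannot-link.

```python
from itertools import combinations

# Small illustrative subset of the seed stems listed above.
SEEDS = {
    "joy": ["awesom", "glad", "proud"],
    "anger": ["mad", "upset", "rage"],
    "sadness": ["sorri", "hurt", "lone"],
}


def build_links(seeds):
    """Return (must_links, cannot_links) as sets of sorted word pairs."""
    must, cannot = set(), set()
    # Must-link: every pair of distinct seed words inside one class.
    for words in seeds.values():
        for a, b in combinations(sorted(set(words)), 2):
            must.add((a, b))
    # Cannot-link: every pair of words drawn from two different classes.
    for (_, words1), (_, words2) in combinations(sorted(seeds.items()), 2):
        for a in set(words1):
            for b in set(words2):
                pair = tuple(sorted((a, b)))
                # Assumed tie-break: skip shared words and must-linked pairs.
                if a != b and pair not in must:
                    cannot.add(pair)
    return must, cannot
```

With three classes of three words each, this yields 9 Must-link pairs and 27 Cannot-link pairs.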
Emotion Topics: How do we express emotions?
Joy / Anticipation
• Topic 114: omg love haha thank really
• Topic 107: love thank follow wow
• Topic 159: good day hope morning thank
• Topic 158: love thank miss hug
• Topic 125: hope better feel thank soon
• Topic 26: good thank hope miss
• Topic 146: come wait week day june
• Topic 146: good day time work
Anger
• Topic 131: lmao fuck ass bitch shit
• Topic 4: ass yo lmao nigga
• Topic 19: lmao shit damn fuck oh
• Topic 13: shit nigga smh yea
Fear
• Topic 48: omg oh lmao shit scare
• Topic 78: happen heart attack hospital
• Topic 27: don’t come night sleep outside
• Topic 140: time got work day
Surprise
• Topic 172: yeag know think true funny
• Topic 89: know don’t think look
• Topic 15: think don’t know make really
• Topic 94: haha dont think really
Sadness / Disgust
• Topic 6: oh sorry haha know didnt
• Topic 59: hurt got good bad
• Topic 106: tweet reply didn’t read sorry
• Topic 155: oh really make feel
• Topic 116: oh fuck don’t ye ew
• Topic 116: look haha oh know
• Topic 22: don’t oh think yeah lmao
• Topic 174: don’t think say people
Acceptance
• Topic 43: ok oh thank cool okay
• Topic 102: know try let ok
• Topic 199: xx thank good okay follow
• Topic 8: night love good sleep
Neutral
• Topic 180: com www http check youtube
• Topic 156: twitter facebook people account
• Topic 184: account google app work email
• Topic 67: food chicken cook rt
Counts shown on the slide: 29, 70, 21, 14, 5, 17, 7, 18, 19
Emotion Topics: How do we express emotions?
Joy / Anticipation
• Topic 114: omg love haha thank really
• Topic 107: love thank follow wow
• Topic 125: hope better feel thank soon
• Topic 26: good thank hope miss
Sadness
• Topic 6: oh sorry haha know didnt
• Topic 59: hurt got good bad
Neutral
• Topic 180: com www http check youtube
• Topic 156: twitter facebook people account
Annotations on the slide: Greeting, Caring, Sympathy, IT/Tech
Emotion-tagged conversations
A (Love): @amithpr @dhempe @OperaIndia - Would you have any update on @mrunmaiy's health - hope she is recovering well?
B (neut): @labnol @dhempe she is recovering but slow. The injury is on the spine therefore worrisome. Still in icu.
A (Sadness): @amithpr thanks for the update.. extremely said to hear that news..
B (neut): @labnol #prayformrun She is a fighter and will come out of this

B (neut): @AyeItsMeiMei just tell ur followers to report her for spam. then she'll be kicked off twitter
A (Anger): @Jakeosaurous dude I didn't even do shit to her I'm just here tweeting & she calls me a ugly bitch? I was like oh wow thanks?
B (neut): @AyeItsMeiMei yeah clearly shes so ugly she cant even use her real pic:P so dont feel bad
A (Love): @Jakeosaurous haha. I don't care. She's getting spammed with hate. Hahaha. (": thanks though.
B (neut): @AyeItsMeiMei np
Emotion Transitions (Plutchik’s Wheel of Emotions)
[Figure: transition graph over eight emotions. Node labels: Joy 39.7%, Acceptance 10.4%, Fear 2.6%, Surprise 7.4%, Anticipation 15.1%, Disgust 2.9%, Sadness 9.1%, Anger 12.8%. Numeric labels on the graph range from 0.11 to 0.51.]
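One way to estimate emotion transitions from emotion-tagged conversations is to count consecutive label pairs and normalize each row. The sketch below assumes a transition means "the emotion of one tweet followed by the emotion of the next tweet in the same chain"; the talk does not spell out its exact definition, and the function name and sample labels are hypothetical.

```python
from collections import Counter, defaultdict


def transition_probabilities(conversations):
    """Estimate P(next emotion | previous emotion) from label sequences.

    conversations: list of lists of emotion labels, one list per chain.
    """
    counts = defaultdict(Counter)
    for labels in conversations:
        # Pair each label with the label of the following tweet.
        for prev, nxt in zip(labels, labels[1:]):
            counts[prev][nxt] += 1
    return {
        prev: {nxt: c / sum(row.values()) for nxt, c in row.items()}
        for prev, row in counts.items()
    }


# Hypothetical emotion-tagged chains.
CHAINS = [
    ["Joy", "Joy", "Sadness"],
    ["Joy", "Joy"],
    ["Sadness", "Joy"],
]
```

Here `transition_probabilities(CHAINS)["Joy"]` gives Joy → Joy two thirds of the time and Joy → Sadness one third of the time.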
Defining “Influence”
[Figure: timeline of a conversation between User A and User B, with one tweet marked as the “emotion influencing tweet”.]
Tweets shown, in order:
• Having a tough day today. RIP Harrison. I’ll miss you a ton :/
• Just pray about it. God will help you.
• Not really religious, but thanks man. :)
• If you need talk you know I’m here.
Emotion labels shown: (Sadness), (Acceptance), (Anticipation)
Emotion Influences: What can you say to make your partner feel better?
Joy → Sadness
• Topic 117: tweet people don’t read post
• Topic 59: hurt got bad pain feel
Sadness → Joy
• Topic 18: wear look think love black
• Topic 24: love thank great new look
Anticipation → Surprise
• Topic 96: music listen play song good
• Topic 178: follow tweet people twitter thank
Acceptance → Anger
• Topic 31: i’m got lmax shit da
• Topic 13: lmao shit nigga smh yea
Disgust → Joy
• Topic 61: watch new live tv tonight
• Topic 63: watch good think know look
Annotations on the slide: Suggesting, Greeting, Sympathy, Swear words, Complaining
Emotion Influence: Joy to Anger
[Bar chart: one bar per expressed emotion (Anticipation, Joy, Surprise, Fear, Anger, Sadness, Disgust, Acceptance, Neutral); values range from 0.041 to 0.265.]
Expressing Anger has a 26.5% chance of changing the partner’s emotion from Joy to Anger.

Emotion Influence: Sadness to Joy
[Bar chart: one bar per expressed emotion (Anticipation, Joy, Surprise, Fear, Anger, Sadness, Disgust, Acceptance, Neutral); values range from 0.191 to 0.358.]
Expressing Joy has a 35.8% chance of changing the partner’s emotion from Sadness to Joy.
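A figure such as "35.8% chance of changing the partner's emotion from Sadness to Joy" can be read as a conditional frequency. The sketch below assumes each observation is a triple (partner's emotion before, emotion of the influencing tweet, partner's emotion after) and that the denominator is every case where the partner started in the `before` emotion and received a tweet with that emotion; the talk does not state its exact normalization, and the helper names and sample data are hypothetical.

```python
from collections import Counter


def triples_from_chain(labels):
    """Turn a strictly alternating A, B, A, B... label chain into triples."""
    return [tuple(labels[i:i + 3]) for i in range(len(labels) - 2)]


def influence(triples, before, after):
    """For each expressed emotion, the fraction of cases in which the
    partner moved from `before` to `after` after receiving it."""
    total, changed = Counter(), Counter()
    for a_before, b_emotion, a_after in triples:
        if a_before != before:
            continue
        total[b_emotion] += 1
        if a_after == after:
            changed[b_emotion] += 1
    return {emotion: changed[emotion] / total[emotion] for emotion in total}


# Hypothetical observations.
TRIPLES = [
    ("Sadness", "Joy", "Joy"),
    ("Sadness", "Joy", "Sadness"),
    ("Sadness", "Anger", "Sadness"),
    ("Joy", "Joy", "Joy"),
]
```

On this toy data, `influence(TRIPLES, "Sadness", "Joy")` reports 0.5 for expressing Joy and 0.0 for expressing Anger.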
Self-disclosure topic model
JinYeong Bak, Chin-Yew Lin, and Alice Oh
ACL 2014 Workshop on Social Dynamics and Personal Attributes
Self-disclosure Research using Twitter
• People disclose personal and secretive information
  • to build and maintain interpersonal relationships
  • to get social support
• Twitter is a great source for naturally-occurring, large-scale, longitudinal data on self-disclosure behavior
• We develop a topic model for classifying self-disclosure behavior into three categories: G (general, no disclosure), M (medium disclosure), H (high disclosure)
• We look at the correlation between self-disclosure behavior and the frequency of Twitter conversations in longitudinal data
Self-disclosure in Twitter conversations
Conversation 1:
So my brother is going to Roskilde Festival and my mother and sister is going to England.. That leaves me, my dad and my dog.
@cccc why aren't you going to england?
@dddd because my sister is going with 3 of her friends and my mom's just there... to be there. And my sister didn't want me to come :(

Conversation 2:
I'm moving out.
@xxxx ??? What's going on bb?
@yyyy Mother. Done with her. I am planning to get out now. There's nothing I can do, we dont get along
@xxxx I'm.sorry hunn. That's rough. Where are you going to go though?
@yyyy Probably stay at a friends place in the timebeing until I find a place to live!
@xxxx :/ well I'm glad your getting out if she is being horrible to you

Conversation 3:
Oh, prepregnancy pants, you are so uncomfortable.
@eeee You can put them on? Jealous.
@ffff they are cutting into my flesh and are giving me a ridiculous muffin top. It isn't pretty. But we have company coming over.
@eeee Yea, I tried yesterday. I got one pair of shorts to button painfully and my jeans just laughed at me.
Data
• Full data
  • 88k users, 51k dyads
  • 1.3M conversations
  • 10.5M tweets
  • Longitudinal data from August 2007 to July 2013
• Labeled data (gold standard for self-disclosure level)
  • 101 conversations
  • 673 tweets
Graphical Representation of SDTM
3 sets of topics, one for G, M, and H levels
By using a topic model, we can
• classify the levels of disclosure
• discover topics associated with each level
• generalize to other social media sites using the same set of seed words
Seed Words
• Medium level: frequent trigrams for personally identifiable information
• High level: automatically extracted from the sixbillionsecrets website
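Collecting "frequent trigrams" can be sketched as a simple counting step. The whitespace tokenization, the `min_count` threshold, and the sample tweets below are assumptions for illustration; the talk does not describe how its medium-level seed trigrams were actually selected.

```python
from collections import Counter


def frequent_trigrams(tweets, min_count=2):
    """Return (trigram, count) pairs that occur at least min_count times."""
    counts = Counter()
    for text in tweets:
        tokens = text.lower().split()  # naive whitespace tokenizer
        # Slide a window of three consecutive tokens over the tweet.
        counts.update(zip(tokens, tokens[1:], tokens[2:]))
    return [(" ".join(tri), c) for tri, c in counts.most_common() if c >= min_count]


# Hypothetical tweets containing personally identifiable phrasing.
SAMPLE_TWEETS = [
    "my name is Ann",
    "My name is Bob and I live in Ohio",
    "i live in Texas",
]
```

On the sample tweets, only "my name is" and "i live in" reach the threshold of two occurrences.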
Classification Results
Direct Classification using the Models
Classification with SVM using Features Learned from Models
Self-disclosure topics
SD level & conversation frequency
Sociolinguistic Analysis of Twitter in Multilingual Societies
Suin Kim, Ingmar Weber, Li Wei, and Alice Oh
Under Review
Data
Visualization of the network
How are they connected?
• English monolinguals and X-EN bilinguals bridge the network
Closer look at Bilinguals: Which language do they choose?
Closer look at Bilinguals: Hashtag usage
Closer look at Bilinguals: Topics (Results of LDA)
Future directions
• Develop model for prediction of language choice in bilinguals
• Look at how English is used throughout the world
• Cognitive studies of first- and second-language use
• Self-disclosure and relationship building
• Email me for data sharing, collaborating, discussing, …
• alice.oh@kaist.edu