gender gap in collaborative platforms: language and emotions in wikipedia discussions
TRANSCRIPT
Gender Gap in Collaborative Platforms:Language and emotions in Wikipedia Discussions
David Laniado, Daniela Iosub, Carlos Castillo,Mayo Fuster Morell and Andreas Kaltenbrunner
Universitat Pompeu Fabra, January 17, 2017
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 1 / 58
Outline
1 IntroductionWikipedia content biasesWikipedia gender gapWikipedia discussion spacesGoal and research questions
2 Framework of analysisData acquisition and pre-processingUser gender labellingLanguage and sentiment analysis
3 ResultsEmotions and statusEmotions and genderNetworked emotions
4 Conclusions
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 2 / 58
Outline
1 IntroductionWikipedia content biasesWikipedia gender gapWikipedia discussion spacesGoal and research questions
2 Framework of analysisData acquisition and pre-processingUser gender labellingLanguage and sentiment analysis
3 ResultsEmotions and statusEmotions and genderNetworked emotions
4 Conclusions
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 3 / 58
Wikipedia is a teenager
Happy birthdayWikipedia!
English Wikipedia isnow 16 years old
Catalan Wikipedia willbe 16 in March (thesecond oldest one)
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 4 / 58
The largest human knowledge repository
Fifth most visited web siteAmong top results for search queries about almost any topicLargest collaborative projectConditions and reflects public opinion... with some bias
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 5 / 58
Wikipedia’s social experiment
"The problem with Wikipedia is that it only works in practice. In theory,it can never work."
(Wikipedian popular joke)
A crazy idea: anyone can editAll relevant points of view should be representedPolicies of Notability and Neutral Point of viewQuality assured by editors’ negotiations over contentThe more people with different points of view contributing, thebetter the quality→ Biases in the editor community may cause biases in the content
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 6 / 58
Outline
1 IntroductionWikipedia content biasesWikipedia gender gapWikipedia discussion spacesGoal and research questions
2 Framework of analysisData acquisition and pre-processingUser gender labellingLanguage and sentiment analysis
3 ResultsEmotions and statusEmotions and genderNetworked emotions
4 Conclusions
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 7 / 58
Bias in the content?
Top global biographies by birth country (Young-Ho Eom et al, 2015)top central biographies from each of the 24 major Wikipediasthe 100 most central (PageRank) in each version’s hyperlink network
→ striking geographic biashttp://www.quantware.ups-tlse.fr/QWLIB/topwikipeople/geofigs/pagerank24x100.html
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 8 / 58
Top Women Biographies
Rank NA PageRank female figures CC Century LC1 24 Elizabeth II UK 20 EN2 17 Mary (mother of Jesus) IL -1 HE3 12 Queen Victoria UK 19 EN4 6 Elizabeth I of England UK 16 EN5 2 Maria Theresa AT 18 DE6 1 Benazir Bhutto PK 20 HI7 1 Catherine the Great PL 18 PL8 1 Anne Frank DE 20 DE9 1 Indira Gandhi IN 20 HI10 1 Margrethe II of Denmark DK 20 DA
Top 10 global female historical figures by PageRank for the 24 majorWikipedia editions (Young-Ho Eom et al, 2015)
NA → number of language editions in which a biography appears in thetop 100 rankCC → birth country codeLC → language code corresponding to the birth country
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 9 / 58
Top biographies by gender
Number of women among the top global biographies by birth century
Number of women among the 100 most central biographies for eachlanguage edition
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 10 / 58
Outline
1 IntroductionWikipedia content biasesWikipedia gender gapWikipedia discussion spacesGoal and research questions
2 Framework of analysisData acquisition and pre-processingUser gender labellingLanguage and sentiment analysis
3 ResultsEmotions and statusEmotions and genderNetworked emotions
4 Conclusions
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 11 / 58
Wikipedia editor gender gap
Estimated women participation2011 editor survey: 9%2013 editor survey: 13%
corrected with propensity score estimation: 16% (Mako Hill andShaw, 2013)
while in most online social networks women are more active!
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 12 / 58
Why women do not edit Wikipedia?
1 A lack of user-friendliness in the editing interface2 Not having enough free time3 A lack of self-confidence4 Aversion to conflict and an unwillingness to participate in lengthy
edit wars5 Belief that their contributions are too likely to be reverted or
deleted6 Some find its overall atmosphere misogynistic7 Wikipedia culture is sexual in ways they find off-putting8 Being addressed as male is off-putting to women whose primary
language has grammatical gender9 Fewer opportunities than other sites for social relationships and a
welcoming tone
(Sue Gardener, 2011)David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 13 / 58
Outline
1 IntroductionWikipedia content biasesWikipedia gender gapWikipedia discussion spacesGoal and research questions
2 Framework of analysisData acquisition and pre-processingUser gender labellingLanguage and sentiment analysis
3 ResultsEmotions and statusEmotions and genderNetworked emotions
4 Conclusions
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 14 / 58
Emotional factors and discussions
Importance ofemotional factorsDiscussion spaces arefundamental to thecollaborative processDiscussion triggersemotions and breedsparticular emotionalenvironments
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 15 / 58
Wikipedia’s most visible side
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 16 / 58
Article talk pages
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 17 / 58
Discussions in article talk pages
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 18 / 58
Outline
1 IntroductionWikipedia content biasesWikipedia gender gapWikipedia discussion spacesGoal and research questions
2 Framework of analysisData acquisition and pre-processingUser gender labellingLanguage and sentiment analysis
3 ResultsEmotions and statusEmotions and genderNetworked emotions
4 Conclusions
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 19 / 58
Studying emotions and language in talk pages
Interactions in WikipediaImplicit→ editingExplicit→ communication
Article talk pages→ discussions about how to improve articlesUser talk pages→ a kind of public in-boxes
Goal: Shed light on the emotional dimension of the interactionsextensive analysis of emotions in explicit communicationsentiment analysis of comments in article talk and personal talkpages
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 20 / 58
Research questions
1 How are the emotional and communication styles of editorsaffected by their status?
2 How are the emotional and communication styles of editorsaffected by their gender?
3 How are the emotional expressions affected by interacting withothers in comment threads (emotional congruence)?
4 How are the emotional styles of editors related to those of theeditors they interact more frequently with (emotionalhomophily)?
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 21 / 58
Publications
Results published in:
Laniado, D., Castillo, C., Kaltenbrunner, A., and Fuster Morell, M. F. (2012)Emotions and dialogue in a peer-production community: the case of Wikipedia.8th International Symposium on Wikis and Open Collaboration, WikiSym’12
Iosub, D., Laniado, D., Castillo, C., Fuster Morell, M. F., and Kaltenbrunner, A. (2014)Emotions under Discussion: Gender, Status and Communication in Online Collaboration.Plos One, 9(8)
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 22 / 58
Outline
1 IntroductionWikipedia content biasesWikipedia gender gapWikipedia discussion spacesGoal and research questions
2 Framework of analysisData acquisition and pre-processingUser gender labellingLanguage and sentiment analysis
3 ResultsEmotions and statusEmotions and genderNetworked emotions
4 Conclusions
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 23 / 58
Outline
1 IntroductionWikipedia content biasesWikipedia gender gapWikipedia discussion spacesGoal and research questions
2 Framework of analysisData acquisition and pre-processingUser gender labellingLanguage and sentiment analysis
3 ResultsEmotions and statusEmotions and genderNetworked emotions
4 Conclusions
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 24 / 58
Dataset: conversationsExtracting conversations among editors from the English Wikipedia
Articles 3 210 039Articles with talk page (ATP) 871 485 (27.1%)Editors who comment articles 350 958Editors with ≥ 100 comments on ATP 12 231 (3.5%)Total comments in ATP 11 041 246Comments containing ANEW words 7 414 411 (67.2%)Comments by editors with ≥ 100 comments on ATP 5 480 544 (49.6%)Comments by these editors and with ANEW words 3 649 297 (33.3%)
Table: Data extracted from a complete dump of the English Wikipedia, datedMarch 12th, 2010
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 25 / 58
Outline
1 IntroductionWikipedia content biasesWikipedia gender gapWikipedia discussion spacesGoal and research questions
2 Framework of analysisData acquisition and pre-processingUser gender labellingLanguage and sentiment analysis
3 ResultsEmotions and statusEmotions and genderNetworked emotions
4 Conclusions
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 26 / 58
User gender labelling
≈ 12 000 users wrote ≥ 100 comments in articles talk pagesGender identified through Wikipedia API for ≈ 2 000 of themA sample of 1 385 users for manual labelling throughcrowdsourcing (Crowdflower)
Non-admins Admins TotalMen 1 087 1 526 2 613Women 68 97 165Unknown 6 850 2 603 9 453Total 8 005 4 226 12 231
Table: Users with ≥ 100 comments by gender and administrator status.
Gender could be identified only for ≈ 50% of users:real name or username (50% of those identified)implicitly stated gender (27% of women, 20% of men)pronoun (15% of women, 10% of men)other indicators: userboxes, pictures, links to personal blogs...
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 27 / 58
Outline
1 IntroductionWikipedia content biasesWikipedia gender gapWikipedia discussion spacesGoal and research questions
2 Framework of analysisData acquisition and pre-processingUser gender labellingLanguage and sentiment analysis
3 ResultsEmotions and statusEmotions and genderNetworked emotions
4 Conclusions
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 28 / 58
Measuring the Emotional Content of Discussions
Lexicon-based methods
relying on three different instruments:
Affective norms for English words (ANEW)
Linguistic Inquiry and Word Count (LIWC)
SentiStrength
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 29 / 58
Measuring the Emotional Content of DiscussionsMethod 1: Affective norms for English words (ANEW)
Rates a list of 1060 frequent words on a 9 point scale in threedimensions:
Valence
Arousal
Dominance
assign emotion scores to each word from the lexicon
Bradley and Lang. (1999).Affective norms for English words (ANEW) Technical report C-1.The Center for Research in Psychophysiology, University of Florida, FL.
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 30 / 58
Measuring the Emotional Content of DiscussionsMethod 2: Linguistic Inquiry and Word Count (LIWC)
Two scores for basic emotion (compared with ANEW valence)positive emotionnegative emotion
Discrete measures of emotions (anger, anxiety, sadness, affect)Other classes of words to characterize language (i.e. personalpronouns, tentative words, fillers...)
→ Count the proportion of words belonging to each class
Pennebaker J, Chung C, Ireland M, Gonzales A, Booth R (2010).The development and psychometric properties of LIWC2007. Austin, TX.
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 31 / 58
Measuring the Emotional Content of DiscussionsMethod 2: Linguistic Inquiry and Word Count (LIWC)
Dictionary size ExamplesAnger 91 hate, kill, annoyed
Anxiety 84 worried, fearful, nervousSadness 101 crying, grief, sadTentative 155 maybe, perhaps, guessCertainty 83 always, never
Fillers 9 blah, you knowPast 155 went, ran, had
Present 169 is, does, hearFuture 48 will, gonna
Social words 455 mate, talk, child
Table: Description of LIWC measures (as per http://www.liwc.net).
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 32 / 58
Measuring Relationship-Orientation with LIWC
DefinitionCommunication that promotes socialaffiliation and emotional connection:
preoccupation with others (use ofpersonal pronouns, e.g., I, you)preoccupation with the larger socialdomain (e.g., references to friends andfamily)expression of positive emotion
ExamplesWe are glad to have you. If I can help at all letme know :)
A-giau has smiled at you. Smiles promoteWikiLove and hopefully this one has madeyour day better...Happy editing
Congrats! Thank you for your dedication.
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 33 / 58
Measuring the Emotional Content of DiscussionsMethod 3: SentiStrength
SentiStrengthBased on LIWC and developed for short web textsAccounts for modes of textual expression specific to the onlineenvironment, e.g. emoticons and abbreviationsProvides a positive and a negative score for emotional valenceEmotion score is the strongest positive and negative emotionexpressed in a commentFinal scores are averages over comments in a given category
Thelwall M, Buckley K, Paltoglou G, Cai D, Kappas A (2010)Sentiment strength detection in short informal text.Journal of the American Society for Information Science and Technology 61: 2544 – 2558.
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 34 / 58
Example: results with three different emotional lexica
Table: Example messages with their corresponding Valence(ANEW) orpositive & negative scores (LIWC, SentiStrength)
ANEW LIWC SentiSt.Valence + - + -
Sounds like a good challenge - to be proven or disproven. I’mhappy if it can be shown to go further using closed cubic poly-nomial solutions. The nice thing about these are that they arepretty easy to test numerically . . .
7.4 12.5 0 3 -2
–in “Exact trigonometric constants”
Seems you have not yet seen female lover after having sexwho do not wish to have sex with the same lover any more :)Once you’ve seen it, you understand very well what war of Venusmeans compared to war of Mars.
5.5 6.8 4.5 4 -3
–in “House (astrology)”
What about the whirlie hazing, the alcohol abuse, the emotionalpoverty, the suicide in 1995/6, the biotech plans which werestopped by pitzer protests . . .
1.6 4 8 1 -4
–in “Harvey Mudd College”
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 35 / 58
Sentiment analysisStatistical tests
Compute average values with the three lexica for each user
Compare distribution of values for two groups of users (e.g.:admins vs regulars, women vs men)
Most variables are not normally distributed
⇓
Mann-Whitney U-testCompare distributions of rankings
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 36 / 58
Outline
1 IntroductionWikipedia content biasesWikipedia gender gapWikipedia discussion spacesGoal and research questions
2 Framework of analysisData acquisition and pre-processingUser gender labellingLanguage and sentiment analysis
3 ResultsEmotions and statusEmotions and genderNetworked emotions
4 Conclusions
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 37 / 58
Outline
1 IntroductionWikipedia content biasesWikipedia gender gapWikipedia discussion spacesGoal and research questions
2 Framework of analysisData acquisition and pre-processingUser gender labellingLanguage and sentiment analysis
3 ResultsEmotions and statusEmotions and genderNetworked emotions
4 Conclusions
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 38 / 58
Emotions and Status
Table: Emotions and Status: Administrators promote a generally neutral toneon article talk pages. Regular editors express more negative emotion, andare more emotional.
(Article Talk) Regular Admin Mann-Whitney U-Test p-valueLIWCPositive 2.369 2.409 -4.308 p < 0.001Negative 1.368 1.120 -18.578 p < 0.001Affect 3.784 3.661 -8.466 p < 0.001Anxiety 0.180 0.166 -5.834 p < 0.001Anger 0.554 0.446 -19.217 p < 0.001Sadness 0.175 0.166 -4.450 p < 0.001SentiStrengthPositive 1.805 1.774 -14.603 p < 0.001Negative -2.005 -1.912 -23.046 p < 0.001
When difference is statistically significant (p-value in bold) the larger absolute value is underlined
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 39 / 58
Emotions and Status
Admins:more positive emotion(ANEW and LIWC)generally, emotionallyreserved compared toregular users (LIWC)
Regular users:more emotional
more affect, and moreanxiety, anger andsadness (LIWC)
stronger positive andnegative words thanadmins (SentiStrength)
Personal talk pagesIn personal talk pages, admins are more emotional compared tothe article talk pages
more positive emotion compared to regular editors, but also moreanxiety and sadness
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 40 / 58
Dialogue and Status
Table: Dialogue and Status: Administrators are more impersonal in article talkpages. Regular editors are more concerned with others.
(Article Talk) Regular Admin Mann-Whitney U-test p-valueRelationship-orientationPersonal pronouns 5.135 4.815 -13.561 p < 0.001Use of “I” 2.456 2.429 -1.733 p=0.083Use of “You” 1.043 0.892 -12.573 p < 0.001Use of “Shehe” 0.609 0.526 -8.657 p < 0.001Social words 6.320 5.810 -19.013 p < 0.001CertaintyCertainty 1.426 1.317 -16.824 p < 0.001Tentativeness 3.199 3.169 -2.210 p < 0.001Filler words 0.168 0.155 -6.687 p < 0.001Temporal OrientationPast 2.376 2.305 -5.696 p < 0.001Present 8.011 7.841 -8.060 p < 0.001Future 1.114 1.166 -9.887 p < 0.001
When difference is statistically significant (p-value in bold) the larger absolute value is underlined
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 41 / 58
Dialogue and Status
Adminsmore neutral andimpersonal toneless relationship orientedmore concerned with thefuturetend to "rule with reason"
Regular usersmore relationship-oriented
more personal pronounsand more social words
more concerned with pastmore insecure, but not inpersonal spaces
more certainty, tentativeand filler words in articletalk pages
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 42 / 58
Outline
1 IntroductionWikipedia content biasesWikipedia gender gapWikipedia discussion spacesGoal and research questions
2 Framework of analysisData acquisition and pre-processingUser gender labellingLanguage and sentiment analysis
3 ResultsEmotions and statusEmotions and genderNetworked emotions
4 Conclusions
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 43 / 58
Emotions and gender
ANEW Words more used by women and menSize accounts for difference in frequency
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 44 / 58
Emotions and gender
Women use words associated to more positive emotions
Result consistent and significant with the three lexicons
ANEW: Difference is not significant when normalising by article→ difference might be due to topic selection: women choose toparticipate in topics which have more positive discussions
No significant difference in expression of negative emotions
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 45 / 58
Topics, emotions and genderN≥1 ANEW words; corr=−0.64 (p=0.002)
prop. of male comments
mea
n va
lenc
e
0.82 0.84 0.86 0.88 0.9 0.92 0.94 0.964.7
4.8
4.9
5
5.1
5.2
5.3
5.4
5.5
5.6
ComputingArtsPhilosophyLanguageHealthMathematicsBeliefSportsAgricultureEnvironmentTechn. & app. sci.LawSocietyBusinessEducationCulturePeopleSciencePoliticsGeography and placesHistory and events
Figure: Mean valence (ANEW) for discussions of articles in different topiccategories, vs the proportion of comments written by men
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 46 / 58
Dialogue and gender
Table: Dialogue and Gender: Women use a more relationship-orientedspeech style.
(Article Talk) Men Women Mann-Whitney U-test p-valueRelationship-orientationPersonal pronouns 4.964 5.420 -4.375 p < 0.001Use of “I” 2.488 2.764 -3.945 p < 0.001Use of “You” 0.936 0.957 -0.926 p=0.355Use of “Shehe” pronouns 0.541 0.713 -4.657 p < 0.001Social words 5.960 6.353 -3.487 p < 0.001CertaintyCertainty 1.346 (1397) 1.300 (1263) -2.078 p = 0.038*Tentativeness 3.150 3.215 -1.162 p=0.245Filler words 0.161 0.160 -0.137 p=0.891Temporal OrientationPast 2.325 2.543 -4.305 p < 0.001Present 7.897 8.180 -3.086 p = 0.002Future 1.168 1.147 -1.008 p=0.314
When difference is statistically significant (p-value in bold) the larger absolute value isunderlined. Cases where the averages are not informative are marked with an asterisk * and
include the mean ranks Mann-Whitney U-test next to the averages in parentheses.
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 47 / 58
Dialogue and Gender
Women write longer messages
Women are more relationship-orientedmore personal pronouns, in particular “I”, more social words
Women are not more insecureLess certainty words, no significant difference for tentativeness andfiller words
Women admins are more relationship oriented than men adminsDifferent leadership style
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 48 / 58
Qualitative analysis: Relationship orientation
Manual classification of 100 commentsThree main types of comments high in relationship-orientation:
inviting comments that explain the edit in a friendly tone, and call forfurther intervention and collaborationcommon perspective-building comments that are focused onunderstanding others and solving debates in a constructive mannerappreciative comments that contain positive emotions andcelebrate others’ actions
⇒ This suggests that relationship-orientation may be conducive tosuccessful collaboration
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 49 / 58
Outline
1 IntroductionWikipedia content biasesWikipedia gender gapWikipedia discussion spacesGoal and research questions
2 Framework of analysisData acquisition and pre-processingUser gender labellingLanguage and sentiment analysis
3 ResultsEmotions and statusEmotions and genderNetworked emotions
4 Conclusions
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 50 / 58
Emotional congruence
Comparison of each comment with the comment it replies tonot based only on our set of users, but on all comments (from allusers)
Emotions: editors tend to reply with:more positive emotionless negative emotionless anger, anxiety and sadnessstronger words, both positive and negative (SentiStrength)
Dialogue: editors tend to reply with:more relationship oriented speechless tentative and certainty words
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 51 / 58
Emotional homophilyMixing patterns: do users interact preferentially with similar users?
Disassortativity by activityusers who write more comments tend to reply preferentially to lessactive users, and viceversa
Assortativity by genderMen interact more with other men, and women with other women
Assortativity by emotion and languageUsers interact more with others similar in emotional expression andcommunication style
also in the network of communication on personal talk pages
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 52 / 58
Emotional homophilyExample: homophily by expression of anger
edges connect users whohave exchanged at least 10replies
node color represents thelevel of anger expressed by auser, from low to high
node size → proportional tothe number of connections ofa user
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 53 / 58
Outline
1 IntroductionWikipedia content biasesWikipedia gender gapWikipedia discussion spacesGoal and research questions
2 Framework of analysisData acquisition and pre-processingUser gender labellingLanguage and sentiment analysis
3 ResultsEmotions and statusEmotions and genderNetworked emotions
4 Conclusions
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 54 / 58
Conclusions
Administrators and experienced users play a pivotal rolethey tend to interact especially with less experienced usersthey promote a positive but impersonal environment
Men and women have a different communication stylewomen participate in discussions that have a more positive tonemen interact more with men, and women with womenwomen use a more emotional and relationship-oriented languagewomen admins have a relationship-oriented leadership style
⇒ promoting relationship-orientated leadership could lead to a morepositive environment
⇒ giving women more space in the community could result in a morewelcoming envoronment, for both women and men
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 55 / 58
Future work
Longitudinal analysishow do emotional styles of editors change over time and withincreasing experience?how do emotions in the discussions affect participation?
Qualitative analysis and human annotationinclude non-textual emotional aspects such as emoticons, barnstars and virtual giftsdeal with sarcasm, measure the extent of condescending orpaternalistic language in comments addressed at women editors
Examine other online spacesSimilar conclusions might hold for other online spacesespecially in discussions involving conflict, decision making andpower dynamics
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 56 / 58
Some referencesM. M. Bradley and P. J .Lang.Affective norms for English words (ANEW) Technical report C-1.The Center for Research in Psychophysiology, University of Florida, FL, 2012.
B. Collier and J. Bear.Conflict, criticism, or confidence: an empirical examination of the gender gap in wikipedia contributions.In Proc. of CSCW, 2012.
Eom, Y.H., Aragón, P., Laniado, D., Kaltenbrunner, A., Vigna, S., Shepelyansky, D.L.: Interactions of cultures and toppeople of wikipedia from ranking of 24 language editions.PLoS ONE 10(3), e0114,825 (2015).
Iosub, D., Laniado, D., Castillo, C., Fuster Morell, M. F., and Kaltenbrunner, A. (2014)Emotions under Discussion: Gender, Status and Communication in Online Collaboration.Plos One, 9(8)
O. Kucuktunc, B. B. Cambazoglu, I. Weber, and H. Ferhatosmanoglu.A large-scale sentiment analysis for Yahoo! answers.In Proc. of WSDM, 2012.
D. Laniado, R. Tasso, Y. Volkovich, and A. Kaltenbrunner.When the Wikipedians talk: Network and tree structure of Wikipedia discussion pages.In Proc. of ICWSM, 2011.
Laniado, D., Castillo, C., Kaltenbrunner, A., and Fuster Morell, M. F. (2012)Emotions and dialogue in a peer-production community: the case of Wikipedia.8th International Symposium on Wikis and Open Collaboration, WikiSym’12
H. Zhu, R. Kraut, A. KitturEffectiveness of shared leadership in online communities.In Proc. of CSCW, 2012.
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 57 / 58
Questions?
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 58 / 58