deep feelings

21

Upload: marco-guerini

Post on 29-Jul-2015

213 views

Category:

Social Media


0 download

TRANSCRIPT

Trento-­‐RISE  

Sorbonne  Univ.  

@m_guerini  

@stjaco  

GOAL  

•  Inves?gate   the   rela?ons   between   virality   of   news  ar?cles  and  the  emo?ons  they  are  found  to  evoke.    –  Compare  different  viral  phenomena  to  understand  if  they  are  affected  by  emo?ons  in  a  similar  way  

–  Compare  emo?ons  and  virality  in  a  cross-­‐lingual  experimental  seKng;    

–  Inves?gate  the  effect  of  deep  cons?tuents  of  emo?ons  in  viral  phenomena.    

 

What  is  Virality?  

•  Virality:  tendency  of  a  content  either  to  spread  quickly  within   a   community   or   to   receive   a   great   deal   of  aRen?on  by  it.    

•  Growing   experimental   evidence   that   Virality   is   a  phenomenon   with   many   facets:   several   different  effects  of  persuasive  communica?on  are  comprised.  

•  Our  virality  indicators:    !  Plusones    !  Tweets  !  Comments  

Virality  and  Persuasion  Dimensions  

•  NARROWCASTING:  ar?cle  comments.  Readers  who  comment  on  the  ar?cle  are  contribu?ng  to  a  discussion  among  a  small  subset  of  the  readers  of  that  ar?cle.  

•  BROADCASTING:  the  act  of  sharing  an  ar?cle  to  social  networking  sites  as  a  form  of  broadcas(ng.    

Standing  on  the  shoulders  of  giants  

•  Small   sample   of   ar?cles   manually  annotated  (∼2,500  documents)    

•  Only   one   virality   index   (email-­‐forward  count,  aka  narrowcas(ng)    

•  Just  hin?ng  to  the  role  of  one  deep  cons?tuents  of  emo?ons:  arousal.    

Early  works  on  emo?ons  and  virality  (Berger  and  Milkman),   paving   the  way   for   this   novel   line  of  research.  A  few  limita?ons,  we  try  to  overcome:    

Berger,  Jonah,  and  Katherine  L.  Milkman.  "What  makes  online  content  viral?.”  Journal  of  Marke(ng  Research  49.2  (2012):  192-­‐205.  

How  

•  The   mass-­‐adop?on   of   sharing  widgets ,   and   recent l y   o f  emo0onal   tagging   widgets,   on  popular   websites,   provides   an  impressive  amount  of  data.      

Dataset  65,000   news   ar?cles   and   1.5   million   emo?on   votes  from  rappler.com  (ENG)  and  corriere.it  (ITA)        

53,226  news  ar?cles  -­‐  rappler.com  12,437  news  ar?cles  -­‐  corriere.it    

1,145,543    emo?onal  votes  -­‐  rappler.com  320,697  emo?onal  votes  -­‐  corriere.it    

A  Linear  Explana?on  

We   use   simple   linear   models   to   analyze   the  rela?onship   between   an   ar?cle   emo?onal  characteris?cs  and  the  corresponding  impact  on  virality  indices  

ID   AFRAID   AMUSED   ANGRY   ANNOYED   DONT_CARE   HAPPY   INSPIRED   SAD   COMMENTS   PLUSONES   TWEETS  

art_10002   75   0   0   0   0   0   25   0   231   3   8  art_10003   0   50   0   16   17   17   0   0   0   54   128  art_10004   52   2   3   2   2   6   2   31   7   12   734  art_10009   0   0   0   0   0   50   50   0   154   6   9  

(ENG)  Results  confirmed  …  

Rappler CommentsEstimate Std. Err Sigf.

ANNOYED .61 .03 ***ANGRY .54 .02 ***

INSPIRED .23 .02 ***DONT_CARE .18 .04 ***

AMUSED .13 .02 ***HAPPY .10 .02 ***

SAD .07 .02 ***AFRAID -.03 .03 †

Rappler TweetsEstimate Std. Err Sigf.

INSPIRED .52 .02 ***ANGRY .32 .02 ***

ANNOYED .32 .03 ***HAPPY .31 .02 ***

DONT_CARE .30 .04 ***SAD .24 .02 ***

AMUSED .21 .02 ***AFRAID .19 .03 ***

Rappler G+Estimate Std. Err Sigf.

INSPIRED .55 .02 ***ANGRY .25 .02 ***HAPPY .24 .02 ***

AMUSED .21 .02 ***ANNOYED .20 .03 ***

SAD .16 .02 ***AFRAID .14 .03 ***

DONT_CARE .13 .04 ***

Table 3: Emotions impact on rappler.com articles viral indices. ***: p <.001; **: p <.01; *: p <.05; †: not significant – thislegend applies to all tables in this paper.

Corriere CommentsEstimate Std. Err Sigf.

ANNOYED 1.11 .04 ***HAPPY .40 .04 ***

AFRAID .19 .06 **AMUSED .13 .06 *

SAD .12 .05 *

Corriere TweetsEstimate Std. Err Sigf.

ANNOYED .33 .03 ***SAD .20 .05 ***

HAPPY .14 .03 ***AFRAID .06 .05 †

AMUSED .02 .05 †

Corriere G+Estimate Std. Err Sigf.

SAD .57 .05 ***ANNOYED .39 .03 ***

AMUSED .34 .05 ***AFRAID .33 .05 ***HAPPY .27 .03 ***

Table 4: Emotions impact on corriere.it articles viral indices.

versely, no previous studies have dealt at this scale with Italianlanguage, and it will be worth investigating in future works whatfactors may influence the trend shown for emotional dimensions inthe corriere.it data.

Turning to the statistics of the various viral indices available forthe two datasets, we provide a summary in Table 5.

Viral Index Rapplerµ CorriereµCOMMENTS 4.11 83.22THREADS 2.81 -G+ SHARES .91 3.79TWITTER SHARES 32.33 40.65FACEBOOK SHARES - 502.93

Table 5: Mean figures for virality indices.

From now on, we will only consider the intersection of the twosets of indices.

3.1 Narrow- and Broad- castingIn the literature, broadcasting refers to the act of communicating

or transmitting a content to numerous recipients simultaneously,over a communication network; on the contrary, narrowcasting hastraditionally been understood as the dissemination of informationto a narrow audience, rather than to the broader public at large.With the advent of new media, this definition as been updated (see,for instance, [15]). In fact, from the perspective of readers of onlinecontent, they can decide whether and how to share such content byeither broadcasting it to their audience or to address only a smallportion of it (narrowcasting). In [4], for example, the act of for-warding a news article by email is treated as a form of narrowcast-ing, since in such case subjects are addressing a selected section oftheir audience/contacts.

Following the above definition, we will consider article com-ments as a form of narrowcasting, as the readers who upload acomment on the article page are contributing to a discussion whichhappens at most between the readers of that article (and most prob-ably, in fact, to a small subset of it). On the other hand, we considerthe act of sharing an article to social networking sites as a form ofbroadcasting.

The rationale behind this distinction of viral facets in narrow-casting and broadcasting lies in previous research efforts showing

that virality can be assessed through different metrics which repre-sent distinct phenomena: it is not only the magnitude of spreading(virality) that depends on content but also the users reactions (com-ments, shares, tweets, etc.) [10, 11, 28].

In the following sections, all virality indices have been standard-ized in order to make them comparable (µ = 0,� = 1) across thetwo datasets. Emotion scores range between 0 and 1.

4. EMOTION ANALYSISTo analyze the relationship between an article emotional char-

acteristics and the corresponding impact on virality indices we usesimple linear models. In this section we will discuss the importanceof each emotion for the various virality indices, while the overallexplanatory power of the models will be discussed in Section 6.

Results on the rappler.com dataset, shown in Table 3, are inaccordance with the findings of Berger et al. [4]: high influence ofINSPIRING (awe) and of negative-valence and high-arousal emo-tions such as ANGER, jointly with a low influence of SADNESS.This similarity stands out for the specific case of narrowcasting,and it is maintained for broadcasting, albeit to a lower extent.

As mentioned in Section 2, other previous research works [9,16], focusing on diverse scenarios and using heterogeneous datasets,have reported results consistent with these findings.

Nonetheless, turning to the Italian resource used in our analy-ses, we notice that results obtained from the corriere.it data,summarized in Table 4, are partially in line with [4] for what con-cerns narrowcasting, while drastically diverge on broadcasting:in the latter case, SADNESS (a low-arousal emotional dimension)is found to be most relevant for virality, disproving the hypothe-sis that arousal alone can be used to explain virality phenomena.It should be noted that the lack of an INSPIRED dimension forcorriere.it cannot account for such discrepancy on broad-casting, since it seems very unlikely that its potential votes wouldhave been collected by SADNESS.

It can be hypothesized, as a consequence, that strong culturaldifferences emerge when it comes to emotions (more precisely, toexplicitly tagging one’s own feeling on a website). What factorsmight underlie this phenomenon (e.g. historical period, editorialchoices, deeper cultural sensibilities, etc.) represents a very fasci-nating research question, which is out of the scope of this work.

Rather than focusing on specific emotions, in the next section weattempt to provide a more general explanation.

Results  on  rappler.com  in  line  with  Berger  and  Milkman:      •  high  influence  of  INSPIRING  (awe)    •  high  influence  of  nega?ve-­‐valence  and  high-­‐arousal  emo?ons  such  as  ANGER  •  low  influence    of  SADNESS.    

(ITA)  …  or  not?  Rappler Comments

Estimate Std. Err Sigf.ANNOYED .61 .03 ***

ANGRY .54 .02 ***INSPIRED .23 .02 ***

DONT_CARE .18 .04 ***AMUSED .13 .02 ***

HAPPY .10 .02 ***SAD .07 .02 ***

AFRAID -.03 .03 †

Rappler TweetsEstimate Std. Err Sigf.

INSPIRED .52 .02 ***ANGRY .32 .02 ***

ANNOYED .32 .03 ***HAPPY .31 .02 ***

DONT_CARE .30 .04 ***SAD .24 .02 ***

AMUSED .21 .02 ***AFRAID .19 .03 ***

Rappler G+Estimate Std. Err Sigf.

INSPIRED .55 .02 ***ANGRY .25 .02 ***HAPPY .24 .02 ***

AMUSED .21 .02 ***ANNOYED .20 .03 ***

SAD .16 .02 ***AFRAID .14 .03 ***

DONT_CARE .13 .04 ***

Table 3: Emotions impact on rappler.com articles viral indices. ***: p <.001; **: p <.01; *: p <.05; †: not significant – thislegend applies to all tables in this paper.

Corriere CommentsEstimate Std. Err Sigf.

ANNOYED 1.11 .04 ***HAPPY .40 .04 ***

AFRAID .19 .06 **AMUSED .13 .06 *

SAD .12 .05 *

Corriere TweetsEstimate Std. Err Sigf.

ANNOYED .33 .03 ***SAD .20 .05 ***

HAPPY .14 .03 ***AFRAID .06 .05 †

AMUSED .02 .05 †

Corriere G+Estimate Std. Err Sigf.

SAD .57 .05 ***ANNOYED .39 .03 ***

AMUSED .34 .05 ***AFRAID .33 .05 ***HAPPY .27 .03 ***

Table 4: Emotions impact on corriere.it articles viral indices.

versely, no previous studies have dealt at this scale with Italianlanguage, and it will be worth investigating in future works whatfactors may influence the trend shown for emotional dimensions inthe corriere.it data.

Turning to the statistics of the various viral indices available forthe two datasets, we provide a summary in Table 5.

Viral Index Rapplerµ CorriereµCOMMENTS 4.11 83.22THREADS 2.81 -G+ SHARES .91 3.79TWITTER SHARES 32.33 40.65FACEBOOK SHARES - 502.93

Table 5: Mean figures for virality indices.

From now on, we will only consider the intersection of the twosets of indices.

3.1 Narrow- and Broad- castingIn the literature, broadcasting refers to the act of communicating

or transmitting a content to numerous recipients simultaneously,over a communication network; on the contrary, narrowcasting hastraditionally been understood as the dissemination of informationto a narrow audience, rather than to the broader public at large.With the advent of new media, this definition as been updated (see,for instance, [15]). In fact, from the perspective of readers of onlinecontent, they can decide whether and how to share such content byeither broadcasting it to their audience or to address only a smallportion of it (narrowcasting). In [4], for example, the act of for-warding a news article by email is treated as a form of narrowcast-ing, since in such case subjects are addressing a selected section oftheir audience/contacts.

Following the above definition, we will consider article com-ments as a form of narrowcasting, as the readers who upload acomment on the article page are contributing to a discussion whichhappens at most between the readers of that article (and most prob-ably, in fact, to a small subset of it). On the other hand, we considerthe act of sharing an article to social networking sites as a form ofbroadcasting.

The rationale behind this distinction of viral facets in narrow-casting and broadcasting lies in previous research efforts showing

that virality can be assessed through different metrics which repre-sent distinct phenomena: it is not only the magnitude of spreading(virality) that depends on content but also the users reactions (com-ments, shares, tweets, etc.) [10, 11, 28].

In the following sections, all virality indices have been standard-ized in order to make them comparable (µ = 0,� = 1) across thetwo datasets. Emotion scores range between 0 and 1.

4. EMOTION ANALYSISTo analyze the relationship between an article emotional char-

acteristics and the corresponding impact on virality indices we usesimple linear models. In this section we will discuss the importanceof each emotion for the various virality indices, while the overallexplanatory power of the models will be discussed in Section 6.

Results on the rappler.com dataset, shown in Table 3, are inaccordance with the findings of Berger et al. [4]: high influence ofINSPIRING (awe) and of negative-valence and high-arousal emo-tions such as ANGER, jointly with a low influence of SADNESS.This similarity stands out for the specific case of narrowcasting,and it is maintained for broadcasting, albeit to a lower extent.

As mentioned in Section 2, other previous research works [9,16], focusing on diverse scenarios and using heterogeneous datasets,have reported results consistent with these findings.

Nonetheless, turning to the Italian resource used in our analy-ses, we notice that results obtained from the corriere.it data,summarized in Table 4, are partially in line with [4] for what con-cerns narrowcasting, while drastically diverge on broadcasting:in the latter case, SADNESS (a low-arousal emotional dimension)is found to be most relevant for virality, disproving the hypothe-sis that arousal alone can be used to explain virality phenomena.It should be noted that the lack of an INSPIRED dimension forcorriere.it cannot account for such discrepancy on broad-casting, since it seems very unlikely that its potential votes wouldhave been collected by SADNESS.

It can be hypothesized, as a consequence, that strong culturaldifferences emerge when it comes to emotions (more precisely, toexplicitly tagging one’s own feeling on a website). What factorsmight underlie this phenomenon (e.g. historical period, editorialchoices, deeper cultural sensibilities, etc.) represents a very fasci-nating research question, which is out of the scope of this work.

Rather than focusing on specific emotions, in the next section weattempt to provide a more general explanation.

Turning  to  corriere.it:    •  par?ally  in  line  with  (Berger  and  Milkman)  for  what  concerns  narrowcas(ng    •  dras?cally  diverge  on  broadcas(ng:  SADNESS  is  found  to  be  most  relevant  for  virality,  

disproving  that  arousal  alone  can  be  used  to  explain  virality  phenomena.      

Idea  

Rather   than   focusing   on   specific   differences   in  the   linear  models,  we   look  for  configura?ons   in  the   deeper   structure   of   emo?ons   that   can  account  for  such  discrepancies.    

The  VAD  Model  

•  the  Valence,  Arousal,  and  Dominance  (VAD)  circumplex  model  of  affect:  –  Valence,  deno?ng  the  degree  of  posi?ve/nega?ve  affec?vity  (e.g.  FEAR  has  high  nega?ve  valence,  while  JOY  has  high  posi?ve  valence);    

– Arousal,  ranging  from  calming  to  exci?ng  (e.g.  ANGER  is  denoted  by  high  arousal  while  SADNESS  by  low  arousal);    

– Dominance,  going  from  “controlled”  to  “in  control”  (e.g.  INSPIRED,  highly  in  control,  vs  FEAR,  overwhelming).    

Mapping  to  the  VAD  model  •  Exploit  the  work  of  Warriner  et  al.:  a  resource  mapping  14  

thousands  words  to  the  VAD  circumplex  model  on  a  Likert  scale,  including  words  represen?ng  our  emo?onal  labels.    

•  To  compute  VAD  scores  for  a  given  ar?cle  doc  we  mul?ply  the  percentage  of  votes  each  emo?on  e  is  found  to  evoke  in  doc  by  the  corresponding  VAD  score,  and  then  take  the  sum  over  the  n  emo?onal  dimensions  considered.    

corriere.it commentsEstimate Std. Err Sigf.

AROUSAL .3741 .0372 ***VALENCE -.3155 .0486 ***DOMINANCE .1000 .0761 †

rappler.com commentsEstimate Std. Err Sigf.

AROUSAL .1205 .0101 ***VALENCE -.1636 .0165 ***DOMINANCE .0728 .0221 **

corriere.it tweetsEstimate Std. Err Sigf.

DOMINANCE .2158 .0709 **VALENCE -.2176 .0453 ***AROUSAL .0400 .0348 †

rappler.com tweetsEstimate Std. Err Sigf.

DOMINANCE .2244 .0221 ***VALENCE -.1495 .0165 ***AROUSAL .0065 .0101 †

corriere.it g+ sharesEstimate Std. Err Sigf.

DOMINANCE .4817 .0704 ***VALENCE -.3781 .0450 ***AROUSAL -.0242 .0345 †

rappler.com g+ sharesEstimate Std. Err Sigf.

DOMINANCE .2619 .0221 ***VALENCE -.1530 .0165 ***AROUSAL -.0371 .0101 ***

Table 7: Valence, Arousal and Dominance significant effects from simple Linear Models.

EMOTION Valence Arousal DominanceAFRAID 2.25 5.12 2.71AMUSED 7.05 4.27 5.93ANGRY 2.53 6.02 4.11ANNOYED 2.80 5.29 4.08DONT_CARE 3.53 4.27 3.62HAPPY 8.47 6.05 7.21INSPIRED 6.89 5.56 7.30SAD 2.10 3.49 3.84

Table 6: Valence, Arousal and Dominance scores for emotionlabels, as provided by [34].

5. VAD ANALYSISWe now turn to investigate how basic constituents of emotions,

such as Valence, Arousal, and Dominance (VAD), connect to vi-rality. The widely adopted VAD circumplex model of affect [6,27] maps emotions on a three-dimensional space, namely: Valence,denoting the degree of positive/negative affectivity (e.g. FEAR hashigh negative valence, while JOY has high positive valence); Arousal,ranging from calming to exciting (e.g. ANGER is denoted by higharousal while SADNESS by low arousal); and Dominance, goingfrom “controlled” to “in control” (e.g. INSPIRED, highly in con-trol, vs FEAR, overwhelming).

We thus proceed to map the emotions an article is found to evoketo the VAD circumplex model. To do so, we exploit the work ofWarriner et al. [34], who provided a resource mapping roughly 14thousands words to the VAD circumplex model, including wordsrepresenting the emotional dimensions we consider in this work –see Table 6, which for us serve as a gold standard. Such scores arein a Likert scale, ranging from 1 (low/negative) to 9 (high/positive).

Then, in order to compute VAD scores for a given article doc wesimply multiply the percentage of votes each emotion e is found toevoke in doc by the corresponding VAD score provided in Table 6,and then take the sum over the n emotional dimensions considered.The formula for Valence is provided below; the equations used forArousal and Dominance are akin.

docv =nX

i=1

votes%(ei)⇥ valence(ei) (1)

As with virality indices, each VAD dimension has been stan-dardized. Subsequently, we build linear models in order to as-sess whether and how VAD dimensions are connected to viral-ity. The results reported in Table 7 show very interesting trends:VAD dimension estimates are found to be consistent among thetwo datasets (see estimate order and sign in Table 7), in both broad-casting (tweets/g+ shares) and narrowcasting (comments) scenar-ios. Comparing these results with those provided in the previoussections, we see that the cross-cultural divergences in the relationsbetween emotional dimensions and virality indices disappear whenaccounting for their more profound VAD components.

This finding hints at a generalized, culturally indipendent phe-nomenon: readers of the articles in our datasets tend to choosecommunication forms of narrowcasting when the content is arous-ing but on which they feel less in control2.

Conversely, they turn to broadcasting (e.g. share on social net-works) when they feel more in control. This result is further sup-ported by the fact that the less important dimensions (i.e. Domi-nance for narrowcasting, and Arousal for broadcasting) are mostof the time not significant, or with a slight negative impact on themodel, indicating that these two dimensions switch roles when tran-sitioning from narrowcasting to broadcasting (and viceversa). Thisfinding is important as it appears to be valid, in our analyses, amongthe two different languages (and cultural factors) our two datasetsprovide.

Finally, the role of Valence appears to be consistent among allindices of virality in both datasets, with negative valence contribut-ing to higher virality. This result is in line with [14], who foundthat news-related content spreads more when imbued with negativeValence.

6. R2 ANALYSISIn this section we examine the explanatory power of our mod-

els and compare them with the results obtained by the models pre-sented in Berger et al. [4] – in particular, with model 2 (Positivityand Emotionality) and model 3 (Positivity and Emotionality plusthe emotions Awe, Anger, Anxiety, Sadness).

2Again, this result is in line with the intuition in [4], that arousalhas a strong effect on narrowcasting. Nonetheless, it appears ofparamount importance to take the role of Dominance (not consid-ered in [4]) into account in order to understand broadcasting phe-nomena.

corriere.it commentsEstimate Std. Err Sigf.

AROUSAL .3741 .0372 ***VALENCE -.3155 .0486 ***DOMINANCE .1000 .0761 †

rappler.com commentsEstimate Std. Err Sigf.

AROUSAL .1205 .0101 ***VALENCE -.1636 .0165 ***DOMINANCE .0728 .0221 **

corriere.it tweetsEstimate Std. Err Sigf.

DOMINANCE .2158 .0709 **VALENCE -.2176 .0453 ***AROUSAL .0400 .0348 †

rappler.com tweetsEstimate Std. Err Sigf.

DOMINANCE .2244 .0221 ***VALENCE -.1495 .0165 ***AROUSAL .0065 .0101 †

corriere.it g+ sharesEstimate Std. Err Sigf.

DOMINANCE .4817 .0704 ***VALENCE -.3781 .0450 ***AROUSAL -.0242 .0345 †

rappler.com g+ sharesEstimate Std. Err Sigf.

DOMINANCE .2619 .0221 ***VALENCE -.1530 .0165 ***AROUSAL -.0371 .0101 ***

Table 7: Valence, Arousal and Dominance significant effects from simple Linear Models.

EMOTION Valence Arousal DominanceAFRAID 2.25 5.12 2.71AMUSED 7.05 4.27 5.93ANGRY 2.53 6.02 4.11ANNOYED 2.80 5.29 4.08DONT_CARE 3.53 4.27 3.62HAPPY 8.47 6.05 7.21INSPIRED 6.89 5.56 7.30SAD 2.10 3.49 3.84

Table 6: Valence, Arousal and Dominance scores for emotionlabels, as provided by [34].

5. VAD ANALYSISWe now turn to investigate how basic constituents of emotions,

such as Valence, Arousal, and Dominance (VAD), connect to vi-rality. The widely adopted VAD circumplex model of affect [6,27] maps emotions on a three-dimensional space, namely: Valence,denoting the degree of positive/negative affectivity (e.g. FEAR hashigh negative valence, while JOY has high positive valence); Arousal,ranging from calming to exciting (e.g. ANGER is denoted by higharousal while SADNESS by low arousal); and Dominance, goingfrom “controlled” to “in control” (e.g. INSPIRED, highly in con-trol, vs FEAR, overwhelming).

We thus proceed to map the emotions an article is found to evoketo the VAD circumplex model. To do so, we exploit the work ofWarriner et al. [34], who provided a resource mapping roughly 14thousands words to the VAD circumplex model, including wordsrepresenting the emotional dimensions we consider in this work –see Table 6, which for us serve as a gold standard. Such scores arein a Likert scale, ranging from 1 (low/negative) to 9 (high/positive).

Then, in order to compute VAD scores for a given article doc wesimply multiply the percentage of votes each emotion e is found toevoke in doc by the corresponding VAD score provided in Table 6,and then take the sum over the n emotional dimensions considered.The formula for Valence is provided below; the equations used forArousal and Dominance are akin.

docv =nX

i=1

votes%(ei)⇥ valence(ei) (1)

As with virality indices, each VAD dimension has been stan-dardized. Subsequently, we build linear models in order to as-sess whether and how VAD dimensions are connected to viral-ity. The results reported in Table 7 show very interesting trends:VAD dimension estimates are found to be consistent among thetwo datasets (see estimate order and sign in Table 7), in both broad-casting (tweets/g+ shares) and narrowcasting (comments) scenar-ios. Comparing these results with those provided in the previoussections, we see that the cross-cultural divergences in the relationsbetween emotional dimensions and virality indices disappear whenaccounting for their more profound VAD components.

This finding hints at a generalized, culturally indipendent phe-nomenon: readers of the articles in our datasets tend to choosecommunication forms of narrowcasting when the content is arous-ing but on which they feel less in control2.

Conversely, they turn to broadcasting (e.g. share on social net-works) when they feel more in control. This result is further sup-ported by the fact that the less important dimensions (i.e. Domi-nance for narrowcasting, and Arousal for broadcasting) are mostof the time not significant, or with a slight negative impact on themodel, indicating that these two dimensions switch roles when tran-sitioning from narrowcasting to broadcasting (and viceversa). Thisfinding is important as it appears to be valid, in our analyses, amongthe two different languages (and cultural factors) our two datasetsprovide.

Finally, the role of Valence appears to be consistent among allindices of virality in both datasets, with negative valence contribut-ing to higher virality. This result is in line with [14], who foundthat news-related content spreads more when imbued with negativeValence.

6. R2 ANALYSISIn this section we examine the explanatory power of our mod-

els and compare them with the results obtained by the models pre-sented in Berger et al. [4] – in particular, with model 2 (Positivityand Emotionality) and model 3 (Positivity and Emotionality plusthe emotions Awe, Anger, Anxiety, Sadness).

2Again, this result is in line with the intuition in [4], that arousalhas a strong effect on narrowcasting. Nonetheless, it appears ofparamount importance to take the role of Dominance (not consid-ered in [4]) into account in order to understand broadcasting phe-nomena.

…  

VAD  Discussion  

•  Cross-­‐cultural  divergences  disappear  when  using  VAD  components.    

•  A  generalized,  culturally  independent  phenomenon:    –  narrowcas0ng  when  the  content  is  arousing.    –  broadcas0ng  when  there  is  control  (dominance).    

•  Further  evidence:  less  important  dimensions  (i.e.  Dominance  for  narrowcas(ng,  and  Arousal  for  broadcas(ng)  are  most  of  the  ?me  not  significant.  These  two  dimensions  switch  roles  when  transi?oning  from  narrowcas?ng  to  broadcas?ng  (and  viceversa).  

VAD  linear  Models  

corriere.it commentsEstimate Std. Err Sigf.

AROUSAL .3741 .0372 ***VALENCE -.3155 .0486 ***DOMINANCE .1000 .0761 †

rappler.com commentsEstimate Std. Err Sigf.

AROUSAL .1205 .0101 ***VALENCE -.1636 .0165 ***DOMINANCE .0728 .0221 **

corriere.it tweetsEstimate Std. Err Sigf.

DOMINANCE .2158 .0709 **VALENCE -.2176 .0453 ***AROUSAL .0400 .0348 †

rappler.com tweetsEstimate Std. Err Sigf.

DOMINANCE .2244 .0221 ***VALENCE -.1495 .0165 ***AROUSAL .0065 .0101 †

corriere.it g+ sharesEstimate Std. Err Sigf.

DOMINANCE .4817 .0704 ***VALENCE -.3781 .0450 ***AROUSAL -.0242 .0345 †

rappler.com g+ sharesEstimate Std. Err Sigf.

DOMINANCE .2619 .0221 ***VALENCE -.1530 .0165 ***AROUSAL -.0371 .0101 ***

Table 7: Valence, Arousal and Dominance significant effects from simple Linear Models.

EMOTION Valence Arousal DominanceAFRAID 2.25 5.12 2.71AMUSED 7.05 4.27 5.93ANGRY 2.53 6.02 4.11ANNOYED 2.80 5.29 4.08DONT_CARE 3.53 4.27 3.62HAPPY 8.47 6.05 7.21INSPIRED 6.89 5.56 7.30SAD 2.10 3.49 3.84

Table 6: Valence, Arousal and Dominance scores for emotionlabels, as provided by [34].

5. VAD ANALYSISWe now turn to investigate how basic constituents of emotions,

such as Valence, Arousal, and Dominance (VAD), connect to vi-rality. The widely adopted VAD circumplex model of affect [6,27] maps emotions on a three-dimensional space, namely: Valence,denoting the degree of positive/negative affectivity (e.g. FEAR hashigh negative valence, while JOY has high positive valence); Arousal,ranging from calming to exciting (e.g. ANGER is denoted by higharousal while SADNESS by low arousal); and Dominance, goingfrom “controlled” to “in control” (e.g. INSPIRED, highly in con-trol, vs FEAR, overwhelming).

We thus proceed to map the emotions an article is found to evoketo the VAD circumplex model. To do so, we exploit the work ofWarriner et al. [34], who provided a resource mapping roughly 14thousands words to the VAD circumplex model, including wordsrepresenting the emotional dimensions we consider in this work –see Table 6, which for us serve as a gold standard. Such scores arein a Likert scale, ranging from 1 (low/negative) to 9 (high/positive).

Then, in order to compute VAD scores for a given article doc wesimply multiply the percentage of votes each emotion e is found toevoke in doc by the corresponding VAD score provided in Table 6,and then take the sum over the n emotional dimensions considered.The formula for Valence is provided below; the equations used forArousal and Dominance are akin.

docv =nX

i=1

votes%(ei)⇥ valence(ei) (1)

As with virality indices, each VAD dimension has been stan-dardized. Subsequently, we build linear models in order to as-sess whether and how VAD dimensions are connected to viral-ity. The results reported in Table 7 show very interesting trends:VAD dimension estimates are found to be consistent among thetwo datasets (see estimate order and sign in Table 7), in both broad-casting (tweets/g+ shares) and narrowcasting (comments) scenar-ios. Comparing these results with those provided in the previoussections, we see that the cross-cultural divergences in the relationsbetween emotional dimensions and virality indices disappear whenaccounting for their more profound VAD components.

This finding hints at a generalized, culturally indipendent phe-nomenon: readers of the articles in our datasets tend to choosecommunication forms of narrowcasting when the content is arous-ing but on which they feel less in control2.

Conversely, they turn to broadcasting (e.g. share on social net-works) when they feel more in control. This result is further sup-ported by the fact that the less important dimensions (i.e. Domi-nance for narrowcasting, and Arousal for broadcasting) are mostof the time not significant, or with a slight negative impact on themodel, indicating that these two dimensions switch roles when tran-sitioning from narrowcasting to broadcasting (and viceversa). Thisfinding is important as it appears to be valid, in our analyses, amongthe two different languages (and cultural factors) our two datasetsprovide.

Finally, the role of Valence appears to be consistent among allindices of virality in both datasets, with negative valence contribut-ing to higher virality. This result is in line with [14], who foundthat news-related content spreads more when imbued with negativeValence.

6. R2 ANALYSISIn this section we examine the explanatory power of our mod-

els and compare them with the results obtained by the models pre-sented in Berger et al. [4] – in particular, with model 2 (Positivityand Emotionality) and model 3 (Positivity and Emotionality plusthe emotions Awe, Anger, Anxiety, Sadness).

2Again, this result is in line with the intuition in [4], that arousalhas a strong effect on narrowcasting. Nonetheless, it appears ofparamount importance to take the role of Dominance (not consid-ered in [4]) into account in order to understand broadcasting phe-nomena.

R2  ANALYSIS  

We compare our VAD models with model 2, which was automat-ically computed starting from a lexicon of relevant affective words(the LIWC lexicon described in [25]), and our emotion-based mod-els with model 3 in [4]. It should be noted that the extensive workpresented in [4] includes other models accounting for variables(such as publication time, homepage position, author, among oth-ers) not available in our datasets. Nonetheless, it has been shownin [4] that, while augmenting the explanatory power, these variablesdo not influence weight and role of the emotions in the models.

In [4], the dependent variable is represented as binary (i.e., 1 forthose articles that make it on the most emailed list, 0 for all the oth-ers). On the contrary, our viral indices are originally represented ascontinuous variables. In order to provide a fair comparison we thusproceeded to binarize the virality indices V of article n, accordingto the following scheme:

Vn =

(1, if Vn > mean(V ) + sd(V )

0, otherwise(2)

In this way, we obtain a small sample of the most viral articlesin our dataset, specular to the most emailed list in [4]. We alsoused logistic regression in accordance with [4]. For comparison,we report in table 8 the McFadden’s R2 used by Berger et al. [4],along with the standard R2.

Viral Index R

2 McFadden’s R2

corriere.it emotion modelsComments .0813 .0922G+ .0207 .0527Tweets .0084 .0604

corriere.it VAD modelsComments .0530 .0840G+ .0195 .0446Tweets .0072 .0591rappler.com emotion models

Comments .0191 .0801G+ .0115 .0314Tweets .0112 .0300

rappler.com VAD modelsComments .0101 .0569G+ .0088 .0288Tweets .0100 .0266

Berger et al. modelsModel 2 - .0400Model 3 - .0700

Table 8: R2 scores for the various linear models described andcomparison with models presented in Berger et al. [4]

Hence, when projecting emotions to their basic VAD dimen-sions, the models experience only a slight drop in terms of explana-tory power, while gaining the advantage of language-invariance.

Moreover, our regression models for narrowcasting have greaterexplanatory power than the best performing model in [4]. This factcan be partially explained by the quality of our data: our datasetsare much bigger and the affective annotations of the news articleswere massively crowdsourced to the readers.

Finally, emotions have significant effects on both narrowcastingand broadcasting, with a stronger impact on the former. Again, thisfinding is cross-cultural consistent.

7. CONCLUSIONSIn this article we have provided a comprehensive investigation

on the relations between virality of news articles and the emotionsthey are found to evoke. By exploiting a high-coverage and bilin-gual corpus of 65k documents containing metrics of their spread onsocial networks as well as a massive affective annotation providedby readers (more than 1.5 millions votes), we presented a thoroughanalysis of the interplay between evoked emotions and viral facets.

We highlighted and discussed our findings in light of a cross-lingual approach.

Our results show differences in evoked emotions and correspond-ing viral effects across the bilingual data we crawled; amongst otherfindings, we note the remarkably discordant influence of the SAD-NESS affective dimension between the english- and italian- lan-guage datasets, in contrast with the hypothesis that arousal alonecan be used to explain virality phenomena. While these findingsseem to indicate effects of cultural differences on the relation ex-isting between emotions and virality, such hypothesis could onlybe properly tested were more information on the annotators demo-graphics available (e.g. geographic provenance).

Still, when accounting for the deeper constituents of emotionsour analyses provided compelling evidence of a generalized ex-planatory model rooted in the Valence-Arousal-Dominance (VAD)circumplex. Viral facets seem to be coherently affected by partic-ular VAD configurations (namely, the alternation between Domi-nance and Arousal), and these configurations indicate a clear con-nection with distinct phenomena underlying persuasive communi-cation. In particular, high arousal is more connected to narrowcast-ing phenomena, while Dominance to broadcasting phenomena.

Extensions of this work will include deeper investigations of therelations between VAD dimensions and virality. For instance, whilein the analyses reported herein we have used a resource [34] to mapaffective tags to their VAD constituents, we plan to increase theresolution by associating each article with a VAD representationderived from its textual content.

The results presented in this article can be very valuable in con-texts such as content marketing and native advertising, and thus berelevant not only for social science researchers interested in under-standing the factors behind virality phenomena, but also for mar-keting and industry people.

AcknowledgementsThe work of M. Guerini has been supported by the Trento RISEPerTe project. The work of J. Staiano has been partially funded bythe French government through the National Agency for Research(ANR) under the Sorbonne Universités Excellence Initiative pro-gram (Idex, reference ANR-11-IDEX-0004-02) in the frameworkof the Investments for the future program, and by the German Fed-eral Ministry of Research and Education, within the EBITA project.

8. REFERENCES[1] J. Z. Aaditeshwar Seth and R. Cohen. A multi-disciplinary

approach for recommending weblog messages. In The AAAI2008 Workshop on Enhanced Messaging, 2008.

[2] T. Althoff, C. Danescu-Niculescu-Mizil, and D. Jurafsky.How to ask for a favor: A case study on the success ofaltruistic requests. Proocedings of ICWSM, 2014.

[3] S. Aral and D. Walker. Creating social contagion throughviral product design: A randomized trial of peer influence innetworks. Management Science, 57(9):1623–1639, 2011.

[4] J. Berger and K. L. Milkman. What makes online contentviral? Journal of Marketing Research, 49(2):192–205, 2012.

 •  When  using  VAD  dimensions,  only  a  slight  drop  

in   terms   of   explanatory   power,   while   gaining  the  advantage  of  language-­‐invariance.  

•  Our  regression  models  for  narrowcas?ng  have  greater  explanatory  power  than   in  Berger  and  Milkman.    Par?ally  explained  by  the  quality  of  our   data:   much   bigger   dataset   and   affec?ve  annota?ons    massively  crowdsourced.  

•  Emo?ons   have   significant   effects   on   both  narrowcas?ng   and   broadcas?ng:   with   a  stronger   impact   on   the   former.   Again,   this  finding  is  cross-­‐cultural  consistent.  

Conclusions  

•  A  study  on  rela?ons  between  emo?ons  and  virality.    •  While  there  seem  to  be  cultural  differences  in  these  rela?ons,  projec?ng  documents  onto  deep  structure  of  emo?ons  these  differences  disappear.      – narrowcas0ng  when  the  content  is  arousing.    – broadcas0ng  when  there  is  control  (dominance).    

   Further  details:  M.  Guerini  and  J.  Staiano.  2015.  Deep  Feelings:  A  Massive  Cross-­‐Lingual  Study  on  the  Rela?on  between  Emo?ons  and  Virality.  In  Proceedings  of  WWW  '15  Companion,  299-­‐305.