people pattern: "the science of sharing"

The Science of Sharing Jason Baldridge Co-founder, People Pattern Associate Professor, The University of Texas at Austin @jasonbaldridge

Upload: people-pattern

Post on 14-Apr-2017


Category: Data & Analytics



TRANSCRIPT


Preliminary notes

• This talk incorporates results and images from many different research papers by people working primarily in social network analysis.

• As such, this talk is a synthesis of that work put together into a narrative to introduce key abilities and results. I felt this high-level view was the best way to discuss “The Science of Sharing”, rather than relying primarily on my own work or work done at People Pattern. Also, I was really impressed by the work researchers are doing in social network analysis and wanted to share even a glimpse of the problems they are tackling and what they are finding.

• The high-level progression of this talk is:

  • Document analysis at scale: meme tracking combined with other variables like sentiment and bias.

  • Social network at scale: information cascades and virality, inference of social networks given meme-like information as contagions.

  • The node level perspective and its effects on what an individual sees and shares: illusions, effort and overload, topics, personality and demographics.

  • Personas and segmentation: grouping based on demographics and interests.

• The last item is work done at People Pattern. I stress that neither I nor People Pattern was involved with the research papers cited in the other slides. My own academic research focuses on natural language processing, especially machine learning for learning syntactic parsers and performing geolocation using text. For more on those topics, see: http://www.jasonbaldridge.com/papers

• In the actual talk, I didn’t cover the slides on email overload (to keep things to 30 minutes), but for this deck, I’ve put them back in their place.

• References and links to PDFs of all cited work are at the end of this deck. They are also available on this post on my blog: https://bcomposes.wordpress.com/2015/10/23/references-for-my-izeafest-talk/

Link to livestream of the talk: https://youtu.be/8_aFymHQZbM?t=6h51m18s

Meme tracking

Leskovec et al. (2009). “Meme-tracking and the Dynamics of the News Cycle.”

Automatic detection and tracking of memes over time.

Meme tracking

Leskovec et al. (2009). “Meme-tracking and the Dynamics of the News Cycle.”

Meme oscillation heartbeat from blogs to mainstream media.

Quoting Patterns in Political Coverage

Niculae et al. (2015). “QUOTUS: The Structure of Political Media Coverage as Revealed by Quoting Patterns.”

Measuring bias is subjective and hard. Personal estimates of bias are influenced by the availability heuristic.

57% of Americans perceive media as biased. 73% of conservatives think bias is liberal.

11% of liberals think bias is liberal.

Similarly: husbands and wives both estimate their contributions to family activities differently.

[Lee & Waite (2005): http://www.jstor.org/stable/3600272]

Read this!

Quoting Patterns in Political Coverage

Niculae et al. (2015). “QUOTUS: The Structure of Political Media Coverage as Revealed by Quoting Patterns.”

Automated tracking of quotations from Obama’s speeches.

Red: quoted in conservative media. Blue: quoted in liberal media.

Niculae et al. (2015). “QUOTUS: The Structure of Political Media Coverage as Revealed by Quoting Patterns.”

Dimensionality reduction reveals two main bias dimensions: (one) independent-mainstream & (two) foreign-liberal-conservative.

Quoting Patterns in Political Coverage

Niculae et al. (2015). “QUOTUS: The Structure of Political Media Coverage as Revealed by Quoting Patterns.”

Sentiment across two bias dimensions: more mainstream & conservative correlates with negative sentiment.

Quoting Patterns in Political Coverage

Structural Virality

Goel et al. (2015). “The Structural Virality of Online Diffusion”

Information cascades can propagate via broadcast and viral diffusion. Most cascades contain both broadcast and viral spreading.

[Figure panels: Broadcast; Viral]
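To make the broadcast/viral distinction concrete, here is a minimal sketch of one way to compute structural virality, assuming the definition used by Goel et al.: the average shortest-path distance between all pairs of nodes in a cascade tree. The function name `structural_virality` and the two toy cascades are illustrative assumptions, not taken from the paper, and the sketch assumes the cascade is a single connected tree.

```python
from collections import deque


def structural_virality(edges):
    """Average shortest-path distance over all node pairs of a connected cascade tree."""
    # Build an undirected adjacency map from (parent, child) share edges.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    n = len(adj)
    if n < 2:
        raise ValueError("a cascade needs at least two nodes")
    total = 0
    for source in adj:
        # Breadth-first search gives hop distances from this source.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            cur = queue.popleft()
            for nxt in adj[cur]:
                if nxt not in dist:
                    dist[nxt] = dist[cur] + 1
                    queue.append(nxt)
        total += sum(dist.values())
    # Each unordered pair was counted twice, once from each endpoint.
    return total / (n * (n - 1))


# Hypothetical toy cascades with seven nodes each.
broadcast = [("root", f"u{i}") for i in range(6)]   # one account reaches everyone directly
viral = [(f"u{i}", f"u{i + 1}") for i in range(6)]  # each person passes it to one more

print(structural_virality(broadcast))
print(structural_virality(viral))
```

The star-shaped broadcast cascade scores lower than the chain-shaped viral one, even though both reach the same number of people.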

Structural Virality

Goel et al. (2015). “The Structural Virality of Online Diffusion”

Twitter cascades characterized by structural virality, increasing down and to the right.

Structural Virality

Goel et al. (2015). “The Structural Virality of Online Diffusion”

Petition cascades are smallest, but have highest structural virality.

Structural Virality

Goel et al. (2015). “The Structural Virality of Online Diffusion”

99% of content adoptions terminate in a single generation.

The largest image and video cascades are low on structural virality.

Broadcast is by far the dominant mode to reach large audiences. This means pay-to-play when you need to go big reliably.

Predicting memes using network structure

Weng et al. (2014). “Predicting Successful Memes using Network and Community Structure.”

Viral (a,b) and non-viral (c,d) memes diffuse differently at start.

Predicting memes using network structure

Weng et al. (2014). “Predicting Successful Memes using Network and Community Structure.”

Network configurations of early adopters impact virality. These relationships are better predictors than early popularity.

Rumor cascades

Friggeri et al. (2015). “Rumor Cascades”

Spread of false rumors is kept in check in social networks.

False Cabela’s Obamacare receipt cascade on Facebook. Snopes links (red dots) typically end a branch of a rumor cascade.

Rumor cascades

Friggeri et al. (2015). “Rumor Cascades”

Being snoped increases the likelihood of deletion of the original post. False rumors are more likely to be deleted.

Rumor cascades

Friggeri et al. (2015). “Rumor Cascades”

Being snoped lowers the rate of reshares.

Information propagation

Gomez Rodriguez et al. (2014). “Uncovering the structure and temporal dynamics of information propagation.”

Contagion model: Information infects nodes, which become active. Information spreads from active nodes along the network edges.
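The contagion idea can be sketched as a simple simulation. This is a discrete-time independent-cascade simplification, not the continuous-time model of Gomez Rodriguez et al.; the toy graph, the single transmission probability `p`, and the function name `simulate_cascade` are all assumptions made for illustration.

```python
import random


def simulate_cascade(edges, seeds, p, rng):
    """Spread information from the seed nodes along directed edges.

    edges: dict mapping a node to the nodes it can pass information to.
    p: probability that an active node infects each inactive neighbor (one try per edge).
    """
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for node in frontier:
            for nbr in edges.get(node, []):
                # A newly active node gets exactly one chance per outgoing edge.
                if nbr not in active and rng.random() < p:
                    active.add(nbr)
                    next_frontier.append(nbr)
        frontier = next_frontier
    return active


# Hypothetical network: one media site feeding two blogs, which feed a third.
toy_edges = {
    "media": ["blog_a", "blog_b"],
    "blog_a": ["blog_c"],
    "blog_b": ["blog_c"],
    "blog_c": [],
    "isolated": [],
}

rng = random.Random(0)
reached = simulate_cascade(toy_edges, ["media"], p=0.5, rng=rng)
print(sorted(reached))
```

Running the simulation many times with different seeds gives a distribution of cascade sizes for a given network and transmission probability.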

Information propagation

Gomez Rodriguez et al. (2014). “Uncovering the structure and temporal dynamics of information propagation.”

Given information cascades, infer network using contagion model.

Information propagation

Gomez Rodriguez et al. (2014). “Uncovering the structure and temporal dynamics of information propagation.”

Inferred structure shows emerging and vanishing clusters. Red: mainstream media. Blue: blogs.

[Figure panels: March 2011; June 2011; October 2011]

Information propagation

Gomez Rodriguez et al. (2014). “Uncovering the structure and temporal dynamics of information propagation.”

Evolution of network for Fukushima articles.

Information propagation

Gomez Rodriguez et al. (2014). “Uncovering the structure and temporal dynamics of information propagation.”

Information generally flows from mainstream media to blogs. Blogs play a crucial role in information dissemination in civil movements.

Information propagation

Gomez Rodriguez et al. (2014). “Uncovering the structure and temporal dynamics of information propagation.”

Blogs and mainstream media swap influence during course of event. Increased blog influence proportion correlates with social unrest.

Is virality/contagion a bad metaphor?

Taylor Swift has 65 million Twitter followers who can receive her messages. One individual cannot sneeze on and infect that many people simultaneously.

The likelihood of disease infection increases independently with exposure to different infected individuals, but “infection” by an idea increases greatly when exposed to it by multiple, independent parties.

Majority illusion

Lerman et al. (2015). “The Majority Illusion in Social Networks.”

Friendship paradox: on average, most people have fewer friends than their friends have.

This generalizes to any node attribute, which may explain why people overestimate their friends’ alcohol consumption.
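The friendship paradox is easy to check on a small graph. The sketch below uses a hypothetical five-person network (one well-connected hub and four less connected people); the names and the function `friendship_paradox` are assumptions for illustration only.

```python
def friendship_paradox(adj):
    """Compare each node's degree with the average degree of its friends.

    adj: dict mapping each node to a list of its friends (symmetric).
    Returns (mean degree, mean of friends' average degree,
             fraction of nodes with fewer friends than their friends average).
    """
    degree = {node: len(friends) for node, friends in adj.items()}
    mean_degree = sum(degree.values()) / len(degree)
    # Average degree of each node's friends, skipping nodes with no friends.
    friend_mean = {
        node: sum(degree[f] for f in friends) / len(friends)
        for node, friends in adj.items()
        if friends
    }
    mean_friend_degree = sum(friend_mean.values()) / len(friend_mean)
    fewer = sum(1 for node, m in friend_mean.items() if degree[node] < m)
    return mean_degree, mean_friend_degree, fewer / len(friend_mean)


# Hypothetical network: hub "h" knows everyone; "a" and "b" also know each other.
toy_network = {
    "h": ["a", "b", "c", "d"],
    "a": ["h", "b"],
    "b": ["h", "a"],
    "c": ["h"],
    "d": ["h"],
}

mean_degree, mean_friend_degree, fraction_fewer = friendship_paradox(toy_network)
print(mean_degree, mean_friend_degree, fraction_fewer)
```

In this toy network four of the five people have fewer friends than their friends do on average, because the hub appears in everyone's friend list.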

Majority illusion

Lerman et al. (2015). “The Majority Illusion in Social Networks.”

The connectedness of “infected” people greatly impacts the perception of others. A minority opinion can appear extremely popular for each individual (left side).
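A small worked example shows how this can happen. The sketch assumes a node "sees a majority" when more than half of its neighbors are active; that threshold, the toy network, and the function name `majority_illusion` are illustrative assumptions rather than the exact setup of Lerman et al.

```python
def majority_illusion(adj, active):
    """Return (global fraction of active nodes,
               fraction of nodes for which more than half of neighbors are active)."""
    global_fraction = len(active) / len(adj)
    fooled = 0
    for node, nbrs in adj.items():
        if not nbrs:
            continue  # isolated nodes observe nothing
        active_nbrs = sum(1 for n in nbrs if n in active)
        if active_nbrs / len(nbrs) > 0.5:
            fooled += 1
    return global_fraction, fooled / len(adj)


# Hypothetical network: two well-connected nodes "x" and "y" hold the minority opinion.
edge_list = [(hub, p) for hub in ("x", "y") for p in "abcdef"]
edge_list += [("a", "b"), ("c", "d"), ("e", "f")]

network = {}
for u, v in edge_list:
    network.setdefault(u, set()).add(v)
    network.setdefault(v, set()).add(u)

global_fraction, illusion_fraction = majority_illusion(network, active={"x", "y"})
print(global_fraction, illusion_fraction)
```

Only a quarter of the nodes are active, yet three quarters of the nodes see the opinion held by most of their neighbors, because the two active nodes are the best connected.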

Majority illusion

Lerman et al. (2015). “The Majority Illusion in Social Networks.”

The size of majority illusion in Digg and political blogs, varying the number and connectedness of infected nodes.

Strength of weak ties

Brokerage positions expose their network to diverse information. Weak links are easy to establish in social media, but they increase cognitive load.

[Figure panels: Embedded position; Brokerage position]

Kang and Lerman (2015). “User Effort and Network Structure Mediate Access to Information in Networks.”

User effort and network structure

Kang and Lerman (2015). “User Effort and Network Structure Mediate Access to Information in Networks.”

Twitter users with more diverse networks see more diverse content. More active users (red dots) see more diverse content regardless.

User effort and network structure

Kang and Lerman (2015). “User Effort and Network Structure Mediate Access to Information in Networks.”

High network diversity users tend to see more general topics. Low diversity users tend to focus on one or two niche topics.

Information overload

Gomez-Rodriguez et al. (2014). “Quantifying Information Overload in Social Media and its Impact on Social Contagions”

After incoming rate passes 30 tweets per hour, retweeting drops.

Information overload

Gomez-Rodriguez et al. (2014). “Quantifying Information Overload in Social Media and its Impact on Social Contagions”

Users are responsive until 50-100 tweets/hour, then give up or resort to other techniques or tools.

Information overload

Gomez-Rodriguez et al. (2014). “Quantifying Information Overload in Social Media and its Impact on Social Contagions”

More background information leads to smaller cascades.

Email overload

Kooti et al. (2015). “Evolution of Conversations in the Age of Email Overload.”

Longer email threads have shorter, quicker responses. Long last response signals end of thread.

Email overload

Kooti et al. (2015). “Evolution of Conversations in the Age of Email Overload.”

Emails received on weekends are replied to more slowly & tersely.

Email overload

Kooti et al. (2015). “Evolution of Conversations in the Age of Email Overload.”

People email more as they get more emails, but get buried.

Email overload

Kooti et al. (2015). “Evolution of Conversations in the Age of Email Overload.”

Younger people are less sensitive to overload than older ones.

Information diets

Most users consume one or two dominant topics.

Kulshrestha et al. (2015). “Characterizing Information Diets of Social Media Users.”

Information diets

Kulshrestha et al. (2015). “Characterizing Information Diets of Social Media Users.”

Social media concentrates more on real-time topics.

Personality classification

Yarkoni (2010). “Personality in 100,000 Words: A large-scale analysis of personality and word use among bloggers.”

Language production provides a window on personality at scale.

Personality classification

Iacobelli et al. (2015). “Large Scale Personality Classification of Bloggers.”

Bigrams as indicators of high/low scorers in personality classification.

[Figure: bigrams indicating high scorers vs. low scorers for each trait: Neuroticism, Extroversion, Openness, Agreeableness, Conscientiousness]
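For readers unfamiliar with bigram features, here is a minimal sketch of counting adjacent word pairs and ranking those that occur more often in one group of texts than in another. The slides do not describe the paper's feature selection, so the ranking by raw count difference, the function names, and the invented example sentences are all assumptions and carry no empirical meaning.

```python
from collections import Counter


def bigram_counts(texts):
    """Count adjacent word pairs (bigrams) over a list of texts."""
    counts = Counter()
    for text in texts:
        tokens = text.lower().split()
        counts.update(zip(tokens, tokens[1:]))
    return counts


def distinctive_bigrams(group_a, group_b, top=3):
    """Bigrams from group_a ranked by how much more often they occur there than in group_b."""
    a, b = bigram_counts(group_a), bigram_counts(group_b)
    diff = {bigram: a[bigram] - b[bigram] for bigram in a}
    # Sort by descending difference, breaking ties alphabetically for stable output.
    return sorted(diff, key=lambda bigram: (-diff[bigram], bigram))[:top]


# Invented toy sentences, not data from any study.
group_a = ["we went out", "we went home"]
group_b = ["i went home"]

print(distinctive_bigrams(group_a, group_b, top=2))
```

A real classifier would use many more texts and a statistical measure of association, but the features themselves are just these word pairs.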

Ad Targeting and Personality

Chen et al. (2015). “Making Use of Derived Personality: The Case of Social Media Ad Targeting.”

Twitter users whose language indicates higher openness and lower neuroticism are more likely to respond positively to an ad.

Antisocial Behavior Online

Cheng et al. (2015). “Antisocial Behavior in Online Discussion Communities.”

Comparing banned & normal users (in retrospect): banned users wrote posts that are less relevant, harder to read, and less positive.

FBU: Future banned users; NBU: Never banned users

Race and sharing

http://www.theatlantic.com/technology/archive/2015/10/race-social-media/408889/

Frequency of sharing for topics on social media varies by race.

[Figure panels: Events or entertainment; Education or schools]

Re “race”: please read this book.

Tailored audiences

People Pattern and Smarty Pants Vitamins case study.

Human analysis and machine learning can be used to characterize and identify personas using social media profiles.


Tailored audiences

People Pattern and Smarty Pants Vitamins case study.

Interest prediction and extraction of interest-specific keywords. Promoted tweet copy informed by persona-based keywords.


Tailored audiences

People Pattern and Smarty Pants Vitamins case study.

Persona-based campaigns with audience-driven ad copy produced higher engagement at lower cost per conversion.

[Figure: bar charts of conversions and cost per conversion for the Control, Overscheduled Parent, and Grab & Go campaigns]

Sub-micro segmentation

We have limited attention and many options. The best, most relevant content is often created by those with very similar passions, interests, and demographics.

Doresa Jennings

• PhD, BGSU
• Lives in the southern USA
• Mother of profoundly gifted children
• Homeschooler
• Commitment to STEM
• African-American

Cheryl Baldridge

• JD, Yale
• Lives in the southern USA
• Mother of profoundly gifted children
• Homeschooler
• Commitment to STEM
• African-American

Dr. J creates a lot of original text and video. My busy wife makes time for it all.

Other content is less compelling for her.

http://kdacademy.blogspot.com/

https://www.youtube.com/user/DAJedu

Conclusion

Authentic, original content is the most compelling.

Audience understanding is essential: demographics, personality and microsegment relevance.

Pay-to-play to reliably get your word out.

Content consumers must constantly manage information overload.

Large scale analysis of networks and documents reveals hidden patterns.

References

• Chen et al. (2015). “Making Use of Derived Personality: The Case of Social Media Ad Targeting.” - http://www.aaai.org/ocs/index.php/ICWSM/ICWSM15/paper/view/10508

• Cheng et al. (2015). “Antisocial Behavior in Online Discussion Communities.” - http://arxiv.org/abs/1504.00680

• Friggeri et al. (2015). “Rumor Cascades.” - http://www.aaai.org/ocs/index.php/ICWSM/ICWSM14/paper/view/8122

• Goel et al. (2015). “The Structural Virality of Online Diffusion.” - https://5harad.com/papers/twiral.pdf

• Gomez-Rodriguez et al. (2014). “Quantifying Information Overload in Social Media and its Impact on Social Contagions.” - http://arxiv.org/abs/1403.6838

• Gomez Rodriguez et al. (2014). “Uncovering the structure and temporal dynamics of information propagation.” - http://www.mpi-sws.org/~manuelgr/pubs/S2050124214000034a.pdf

• Iacobelli et al. (2015). “Large Scale Personality Classification of Bloggers.” - http://www.research.ed.ac.uk/portal/files/12949424/Iacobelli_Gill_et_al_2011_Large_scale_personality_classification_of_bloggers.pdf

• Kang and Lerman (2015). “User Effort and Network Structure Mediate Access to Information in Networks.” - http://www.aaai.org/ocs/index.php/ICWSM/ICWSM15/paper/view/10483

• Kooti et al. (2015). “Evolution of Conversations in the Age of Email Overload.” - http://arxiv.org/abs/1504.00704

• Kulshrestha et al. (2015). “Characterizing Information Diets of Social Media Users.” - https://www.aaai.org/ocs/index.php/ICWSM/ICWSM15/paper/viewFile/10595/10505

• Lerman et al. (2015). “The Majority Illusion in Social Networks.” - http://arxiv.org/abs/1506.03022

• Leskovec et al. (2009). “Meme-tracking and the Dynamics of the News Cycle.” - http://www.memetracker.org/quotes-kdd09.pdf

• Niculae et al. (2015). “QUOTUS: The Structure of Political Media Coverage as Revealed by Quoting Patterns.” - http://snap.stanford.edu/quotus/

• Weng et al. (2014). “Predicting Successful Memes using Network and Community Structure.” - http://arxiv.org/abs/1403.6199

• Yarkoni (2010). “Personality in 100,000 Words: A large-scale analysis of personality and word use among bloggers.” - http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2885844/