Cascading Behavior in Large Blog Graphs: Patterns and a Model
Leskovec et al. (SDM 2007)
Why?
• Temporal Aspects
– How does information spread in a social network?
– How does popularity die off? Linearly, exponentially, or …?
• Topological Aspects
– Do information cascades have common structures?
– What are their properties, such as the size distribution?
Preliminaries
• Trivial vs. non-trivial cascades
• Cascade initiator
• Stars and chains
• Connector nodes
Dataset
• 21.3 million posts and 2.5 million blogs from Aug and Sep 2005
• Start with the most cited blog posts in Aug ’05
• Traverse conversations forward (inlinks) and backward (outlinks)
• Max depth = 100; max breadth = 500
• Collected
– Unique post ID
– Blog URL
– Post permalink
– Post date
– Post content
– Post links
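The traversal described on the Dataset slide can be sketched as a bounded breadth-first crawl. This is only an illustration, not the authors' crawler: the `inlinks` and `outlinks` dictionaries are hypothetical stand-ins for the link data, and "max breadth" is read here as a cap on the links followed per post in each direction, which is an assumption.

```python
from collections import deque

MAX_DEPTH = 100    # "Max depth" from the slide
MAX_BREADTH = 500  # "max breadth"; assumed to be a per-post cap on links followed


def crawl(seeds, inlinks, outlinks, max_depth=MAX_DEPTH, max_breadth=MAX_BREADTH):
    """Collect the post IDs reachable from the seed posts in both directions.

    inlinks / outlinks are hypothetical dicts mapping a post ID to a list of
    post IDs (posts that cite it / posts that it cites).
    """
    seen = set(seeds)
    queue = deque((post, 0) for post in seeds)
    while queue:
        post, depth = queue.popleft()
        if depth >= max_depth:
            continue  # do not expand posts at the depth limit
        # Forward = posts citing this one; backward = posts this one cites.
        forward = inlinks.get(post, [])[:max_breadth]
        backward = outlinks.get(post, [])[:max_breadth]
        for nxt in forward + backward:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return seen
```

Calling `crawl(["a"], inlinks, outlinks)` with the most cited posts as seeds would return the set of posts in the surrounding conversations, up to the two limits.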
Temporal Patterns
How does popularity die?
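One common way to tell linear, exponential, and power-law decay apart is to check in which coordinate system the curve becomes a straight line. The sketch below assumes NumPy and uses made-up data; the function name `decay_fits` and the example exponent are illustrative and do not come from the slides.

```python
import numpy as np


def decay_fits(days, counts):
    """Return R^2 of a straight-line fit in three coordinate systems.

    Assumes days > 0 and counts > 0 so that the logarithms are defined.
    """
    days = np.asarray(days, dtype=float)
    counts = np.asarray(counts, dtype=float)

    def r2(x, y):
        slope, intercept = np.polyfit(x, y, 1)
        resid = y - (slope * x + intercept)
        return 1.0 - resid.var() / y.var()

    return {
        "linear": r2(days, counts),                      # counts vs. days
        "exponential": r2(days, np.log(counts)),         # log(counts) vs. days
        "power_law": r2(np.log(days), np.log(counts)),   # log-log
    }


# Synthetic example: link counts decaying as days^-1.5 (an arbitrary choice).
days = np.arange(1, 31, dtype=float)
fits = decay_fits(days, 1000.0 * days ** -1.5)
best = max(fits, key=fits.get)
```

The coordinate system with the highest R^2 suggests the decay shape; on real data a proper likelihood-based comparison would be more reliable than this quick check.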
Blog Network Topology
Popular blogs that receive many inlinks do not necessarily produce many outlinks.
Post Network Topology
98% of the posts are isolated
Topological Patterns
Common Cascade Shapes (G_r is the shape with frequency rank r)
97% are trivial cascades
Topological Patterns
Cascade Size Distribution
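A size distribution like this one can be approximated from raw link data. The sketch below simplifies by treating each weakly connected component of the post graph as one cascade, which is an assumption and not necessarily the definition used in the paper; `cascade_sizes` and its arguments are hypothetical names.

```python
from collections import Counter, defaultdict


def cascade_sizes(posts, links):
    """Map cascade size -> number of cascades of that size.

    posts: iterable of post IDs; links: iterable of (citing, cited) pairs.
    """
    adj = defaultdict(set)
    for src, dst in links:
        adj[src].add(dst)
        adj[dst].add(src)  # ignore direction: weak connectivity

    seen, sizes = set(), []
    for post in posts:
        if post in seen:
            continue
        # Depth-first walk over one component, counting its posts.
        stack, size = [post], 0
        seen.add(post)
        while stack:
            node = stack.pop()
            size += 1
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        sizes.append(size)
    return Counter(sizes)


dist = cascade_sizes(["a", "b", "c", "d", "e"], [("b", "a"), ("c", "a")])
trivial_fraction = dist[1] / sum(dist.values())
```

Under this simplification, a size-1 component corresponds to a trivial cascade (an isolated post), so `trivial_fraction` is the share of cascades that are trivial in the toy data.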
Observations
• Most cascades follow tree-like structures.
• A linear increase in diameter requires an exponential increase in cascade size.
• The probability that a node will be part of a cascade decreases with the number of cascades it is already part of.
Generative Model
• Susceptible-Infected-Susceptible (SIS) Model
• β: “infection probability” of a post
• A blog can be either “infected” or “susceptible”
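The slide gives only the ingredients of the model, so the following is one plausible reading, not the authors' exact procedure: a random blog starts a cascade, each blog that links to an infected blog becomes infected with probability β, and an infected blog returns to the susceptible state after its turn. The names `generate_cascade`, `in_neighbors`, and `max_edges` are hypothetical; `max_edges` is a safety cap added here so the loop terminates on cyclic graphs.

```python
import random


def generate_cascade(in_neighbors, beta, rng=None, max_edges=10_000):
    """Return cascade edges as (newly_infected_blog, source_blog) pairs.

    in_neighbors: hypothetical dict, blog -> list of blogs that link to it.
    beta: uniform infection probability.
    """
    rng = rng or random.Random()
    start = rng.choice(sorted(in_neighbors))  # random cascade initiator
    edges = []
    frontier = [start]  # currently infected blogs
    while frontier and len(edges) < max_edges:
        blog = frontier.pop()
        for reader in in_neighbors.get(blog, []):
            if len(edges) < max_edges and rng.random() < beta:
                edges.append((reader, blog))
                frontier.append(reader)
        # SIS: `blog` is susceptible again here and may be re-infected later.
    return edges
```

Running this many times with a fixed β and tabulating the shapes and sizes of the generated cascades is how such a model would be compared against the observed patterns.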
Summary
• Temporal patterns
• Topological patterns
• Generative model
Food for thought
• Blogs are sparsely linked. Not many posts link to the original post from which they got their content. How can information diffusion be studied in these scenarios?
– Beyond link analysis
• A uniform infection probability is an unrealistic assumption
• Multiple cascades can initiate simultaneously
• Few studies examine the “tipping point” in cascades
• Does a cascade die a natural death, or is there some factor that affects its lifespan?
[Figure: timeline T-1, T, T+1; backward traversal follows outlinks, forward traversal follows inlinks]