Dynamic Structural Equation Models for Tracking Cascades over Social Networks

Brian Baingana, Gonzalo Mateos and Georgios B. Giannakis Dynamic Structural Equation Models for Tracking Cascades over Social Networks Acknowledgments: NSF ECCS Grant No. 1202135 and NSF AST Grant No. 1247885 December 17, 2013

Page 1: Dynamic Structural Equation Models for Tracking Cascades over Social Networks

Brian Baingana, Gonzalo Mateos and Georgios B. Giannakis

Dynamic Structural Equation Models for Tracking Cascades over Social Networks

Acknowledgments: NSF ECCS Grant No. 1202135 and NSF AST Grant No. 1247885

December 17, 2013

Page 2: Context and motivation

Contagions: popular news stories, infectious diseases, buying patterns

Propagate in cascades over social networks

Network topologies: unobservable, dynamic, sparse

Topology inference vital: viral advertising, healthcare policy

Goal: track unobservable time-varying network topology from cascade traces

B. Baingana, G. Mateos, and G. B. Giannakis, "Dynamic structural equation models for social network topology inference," IEEE J. of Selected Topics in Signal Processing, 2013 (arXiv:1309.6683 [cs.SI])

Page 3: Contributions in context

Contributions

Dynamic SEM for tracking slowly-varying sparse networks

Accounting for external influences – Identifiability [Bazerque-Baingana-GG’13]

ADMM-based topology inference algorithm

Related work

Static, undirected networks, e.g., [Meinshausen-Bühlmann’06], [Friedman et al.’07]

MLE-based dynamic network inference [Rodriguez-Leskovec’13]

Time-invariant sparse SEM for gene network inference [Cai-Bazerque-GG’13]

Structural equation models (SEM): [Goldberger’72]

Statistical framework for modeling causal interactions (endo/exogenous effects)

Used in economics, psychometrics, social sciences, genetics… [Pearl’09]

J. Pearl, Causality: Models, Reasoning, and Inference, 2nd Ed., Cambridge Univ. Press, 2009

Page 4: Cascades over dynamic networks

N-node directed, dynamic network, C cascades observed over T time intervals

Unknown (asymmetric) adjacency matrices

Example: N = 16 websites, C = 2 news events, T = 2 days

[Figure: Event #1 and Event #2]

Cascade infection times depend on:

Causal interactions among nodes (topological influences)

Susceptibility to infection (non-topological influences)

Page 5: Model and problem statement

Data: infection time of node i by contagion c during interval t

Dynamic SEM: captures (directed) topological influences, external influences, and un-modeled dynamics

Problem statement: track the unknown time-varying network topology from the observed infection times
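
The model equation itself did not survive in this transcript. A plausible reconstruction, following the standard SEM form and matching the terms named on this slide (topological influence, external influence, un-modeled dynamics), is sketched here. All symbols are assumed notation rather than taken from the slide: $y_{ic}^{t}$ is the infection time of node $i$ by contagion $c$ during interval $t$, $a_{ij}^{t}$ is the weight of the directed edge from node $j$ to node $i$, $x_{ic}$ is the susceptibility of node $i$ to contagion $c$, $b_{ii}^{t}$ weights that external influence, and $e_{ic}^{t}$ collects un-modeled dynamics.

```latex
y_{ic}^{t} \;=\; \sum_{j \neq i} a_{ij}^{t}\, y_{jc}^{t} \;+\; b_{ii}^{t}\, x_{ic} \;+\; e_{ic}^{t},
\qquad i = 1,\dots,N,\quad c = 1,\dots,C,\quad t = 1,\dots,T

% Stacking nodes along rows and cascades along columns:
\mathbf{Y}^{t} \;=\; \mathbf{A}^{t}\,\mathbf{Y}^{t} \;+\; \mathbf{B}^{t}\,\mathbf{X} \;+\; \mathbf{E}^{t},
\qquad a_{ii}^{t} = 0,\quad \mathbf{B}^{t}\ \text{diagonal}
```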

Page 6: Exponentially-weighted LS criterion

Structural spatio-temporal properties

Slowly time-varying topology

Sparse edge connectivity

Sparsity-promoting exponentially-weighted least-squares (LS) estimator (P1)

Edge sparsity encouraged by ℓ1-norm regularization with a positive weight

Tracking dynamic topologies possible if past data are exponentially discounted
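
The expression for (P1) is missing from the transcript. One form consistent with the description on this slide (exponentially weighted LS fit plus an ℓ1 penalty on the adjacency matrix) is given here as a sketch. The notation is assumed: $\mathbf{Y}^{\tau}$ holds the infection times observed in interval $\tau$, $\mathbf{X}$ the external influences, $\mathbf{A}$ and $\mathbf{B}$ the adjacency and external-influence matrices, $\beta \in (0, 1]$ a forgetting factor, and $\lambda_t > 0$ the regularization weight.

```latex
\{\hat{\mathbf{A}}^{t}, \hat{\mathbf{B}}^{t}\}
\;=\; \arg\min_{\mathbf{A},\,\mathbf{B}}\;
\frac{1}{2} \sum_{\tau=1}^{t} \beta^{\,t-\tau}\,
\bigl\| \mathbf{Y}^{\tau} - \mathbf{A}\mathbf{Y}^{\tau} - \mathbf{B}\mathbf{X} \bigr\|_{F}^{2}
\;+\; \lambda_t \,\|\mathbf{A}\|_{1}
\qquad \text{s.t.}\quad a_{ii} = 0,\;\; b_{ij} = 0 \;\;\forall\, i \neq j
```

With $\beta < 1$ the weight $\beta^{\,t-\tau}$ shrinks geometrically for older intervals, so the estimate is dominated by recent data; with $\beta = 1$ all intervals count equally, which suits a static topology.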

Page 7: Topology-tracking algorithm

Alternating-direction method of multipliers (ADMM), e.g., [Bertsekas-Tsitsiklis’89]

Each time interval:

Acquire new data

Recursively update data sample (cross-)correlations

Solve (P2) using ADMM

Attractive features

Provably convergent, closed-form updates (unconstrained LS and soft-thresholding)

Fixed computational cost and memory storage requirement per time interval

Page 8: ADMM iterations

Sequential data terms can be updated recursively
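
The recursion for the data terms is not preserved in the transcript. As a minimal sketch of the idea, assuming the data terms are exponentially weighted sample correlations of the infection-time matrix `Y_t` with itself and with the external-influence matrix `X` (the names `P`, `Q`, `beta` and both helper functions are hypothetical), the recursive update can be written and checked against the batch sum as follows:

```python
import numpy as np

def update_correlations(P, Q, Y_t, X, beta):
    """One recursive step of the exponentially weighted sample correlations.

    P tracks sum_tau beta^(t - tau) * Y^tau (Y^tau)^T,
    Q tracks sum_tau beta^(t - tau) * X (Y^tau)^T.
    Only P, Q and the newest Y_t are needed, so storage does not grow with t.
    """
    P_new = beta * P + Y_t @ Y_t.T
    Q_new = beta * Q + X @ Y_t.T
    return P_new, Q_new

def batch_correlations(Y_list, X, beta):
    """Same quantities computed directly from all data seen so far."""
    t = len(Y_list)
    P = sum(beta ** (t - 1 - k) * Y @ Y.T for k, Y in enumerate(Y_list))
    Q = sum(beta ** (t - 1 - k) * X @ Y.T for k, Y in enumerate(Y_list))
    return P, Q

# Illustrative sizes only (not the values used in the talk).
rng = np.random.default_rng(0)
N, C, T, beta = 5, 8, 6, 0.9
X = rng.standard_normal((N, C))
Y_list = [rng.standard_normal((N, C)) for _ in range(T)]

P = np.zeros((N, N))
Q = np.zeros((N, N))
for Y_t in Y_list:
    P, Q = update_correlations(P, Q, Y_t, X, beta)

P_batch, Q_batch = batch_correlations(Y_list, X, beta)
```

Because each step touches only `P`, `Q` and the newest data matrix, the per-interval cost and memory stay fixed, which matches the fixed-cost property claimed for the tracking algorithm.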

Page 9: Simulation setup

Kronecker graph [Leskovec et al.’10]: N = 64 nodes, generated from a seed graph

Non-zero edge weights varied over time

Edge weights selected uniformly at random

Non-smooth edge weight variation
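
A setup of this kind can be sketched in a few lines. The seed graph, the number of cascades, the number of intervals, the weight range and the noise level did not survive in the transcript, so every numeric value below is hypothetical, and the data model `Y = A Y + B X + E` is the assumed matrix form of the dynamic SEM rather than a quotation from the slides.

```python
import numpy as np

def kronecker_support(seed, power):
    """0/1 edge support built as the `power`-fold Kronecker product of `seed`."""
    support = seed.copy()
    for _ in range(power - 1):
        support = np.kron(support, seed)
    np.fill_diagonal(support, 0)  # enforce no self-loops
    return support

def draw_adjacency(support, rng, low=0.1, high=0.5):
    """Uniform random weights on the support, rescaled so that I - A is invertible."""
    A = support * rng.uniform(low, high, size=support.shape)
    return 0.9 * A / A.sum(axis=1).max()  # largest row sum becomes 0.9 < 1

def simulate_interval(A, B, X, noise_std, rng):
    """Solve Y = A Y + B X + E for Y, i.e. Y = (I - A)^{-1} (B X + E)."""
    E = noise_std * rng.standard_normal(X.shape)
    Y = np.linalg.solve(np.eye(A.shape[0]) - A, B @ X + E)
    return Y, E

rng = np.random.default_rng(1)
seed = np.array([[0, 1, 0, 0],
                 [0, 0, 1, 1],
                 [1, 0, 0, 0],
                 [0, 1, 0, 0]])             # hypothetical 4 x 4 seed graph
support = kronecker_support(seed, 3)        # 4**3 = 64 nodes
N, C, T = support.shape[0], 10, 3           # C and T are illustrative values
B = np.diag(rng.uniform(0.5, 1.0, size=N))  # diagonal external-influence weights
X = rng.uniform(0.0, 3.0, size=(N, C))      # external inputs, one column per cascade

A_list, Y_list, E_list = [], [], []
for t in range(T):
    A_t = draw_adjacency(support, rng)      # weights redrawn each interval (abrupt change)
    Y_t, E_t = simulate_interval(A_t, B, X, 0.1, rng)
    A_list.append(A_t)
    Y_list.append(Y_t)
    E_list.append(E_t)
```

Redrawing the weights independently in every interval gives a non-smooth variation on a fixed edge support; a smooth variant would instead perturb the previous interval's weights slightly.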

Page 10: Simulation results

Algorithm parameters

Initialization

Error performance

Page 11: The rise of Kim Jong-un

Web mentions of “Kim Jong-un” tracked from March’11 to Feb.’12

N = 360 websites, C = 466 cascades, T = 45 weeks

Kim Jong-un: supreme leader of N. Korea

Increased media frenzy following Kim Jong-un’s ascent to power in 2011

[Figure: network snapshots at t = 10 weeks and t = 40 weeks]

Data: SNAP’s “Web and blog datasets” http://snap.stanford.edu/infopath/data.html

Page 12: LinkedIn goes public

Tracking phrase “Reid Hoffman” between March’11 and Feb.’12

N = 125 websites, C = 85 cascades, T = 41 weeks

[Figure: network snapshots at t = 5 weeks and t = 30 weeks; legend: US sites]

Datasets include other interesting “memes”: “Amy Winehouse”, “Syria”, “Wikileaks”, …

Data: SNAP’s “Web and blog datasets” http://snap.stanford.edu/infopath/data.html

Page 13: Conclusions

Dynamic SEM for modeling node infection times due to cascades

Topological influences and external sources of information diffusion

Accounts for edge sparsity typical of social networks

ADMM algorithm for tracking slowly-varying network topologies

Corroborating tests with synthetic and real cascades of online social media

Key events manifested as network connectivity changes

Ongoing and future research

Identifiability of sparse and dynamic SEMs

Statistical model consistency

Large-scale MapReduce/GraphLab implementations

Kernel extensions for network topology forecasting

Thank You!

Page 14: ADMM closed-form updates

Update with equality constraints

Update by soft-thresholding operator
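
The update formulas themselves are not preserved in the transcript. To illustrate the two kinds of steps named on this slide, here is a minimal sketch of the standard scaled-form ADMM for a generic ℓ1-regularized least-squares problem: an unconstrained LS solve followed by soft-thresholding. It is not the authors' exact per-row update; the function names, the splitting `a = z`, the penalty parameter `rho` and the test problem are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(v, kappa):
    """Elementwise soft-thresholding: shrink v toward zero by kappa."""
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)

def admm_lasso(G, r, lam, rho=1.0, n_iter=300):
    """Minimise 0.5 * a^T G a - r^T a + lam * ||a||_1 by scaled-form ADMM.

    For a regression y ~ Phi a, G = Phi^T Phi and r = Phi^T y, so only
    correlation-type quantities are needed, never the raw data.
    """
    n = G.shape[0]
    z = np.zeros(n)
    u = np.zeros(n)
    L = np.linalg.cholesky(G + rho * np.eye(n))  # factor once, reuse every iteration
    for _ in range(n_iter):
        rhs = r + rho * (z - u)
        a = np.linalg.solve(L.T, np.linalg.solve(L, rhs))  # unconstrained LS step
        z = soft_threshold(a + u, lam / rho)               # sparsity-promoting step
        u = u + a - z                                      # scaled dual update
    return z

# Hypothetical test problem: 3 active coefficients out of 20.
rng = np.random.default_rng(2)
Phi = rng.standard_normal((200, 20))
a_true = np.zeros(20)
a_true[:3] = [2.0, -3.0, 1.5]
y = Phi @ a_true + 0.01 * rng.standard_normal(200)

a_hat = admm_lasso(Phi.T @ Phi, Phi.T @ y, lam=5.0, rho=50.0)
```

Returning `z` rather than `a` yields exact zeros on the inactive coefficients, since `z` is the output of the soft-thresholding step. In a topology-tracking setting, a zero-diagonal constraint on the adjacency matrix could be handled by leaving node i out of its own regressors.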