
Time-Critical Influence Maximization in Social Networks with Time-Delayed Diffusion Process

Wei Chen (Microsoft Research Asia), Wei Lu (U. of British Columbia), Ning Zhang (U. of Sci and Tech of China)

This work was done during the internships of Wei Lu and Ning Zhang at Microsoft Research Asia.

Influence in Social Networks

Thursday, July 26, 2012. AAAI 2012, Toronto, Ontario.

We live in communities and interact with social acquaintances

This forms social networks

In social interactions, we influence each other

Kinect is great


Influence Diffusion & Viral Marketing


Word-of-mouth effects

Social Networks as Directed Graphs


Nodes: Individuals in the network

Edges: Links between individuals

Edge weight: Influence probability p(u,v) – the probability that v will be influenced by u

Figure: an example directed graph with an influence probability on each edge.

A Classical Influence Propagation Model


Independent Cascade (IC) (Kempe, Kleinberg, and Tardos 2003)

Initially some seed nodes are activated

At each (discrete) time step, each newly activated node u gets one chance to activate each inactive neighbor v, independently with probability p(u,v)

Influence spread: Expected number of nodes activated

Other propagation models

Linear Threshold (LT)

General Threshold

etc
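The Independent Cascade process described above can be sketched in Python as follows. This is a minimal illustration, not code from the talk: the adjacency layout `out_edges` and the function names `simulate_ic` and `estimate_spread` are assumptions made for the example.

```python
import random


def simulate_ic(out_edges, seeds, rng):
    """Run one Independent Cascade trial and return the set of active nodes.

    out_edges: dict mapping node u -> list of (v, p_uv) pairs (assumed layout).
    """
    active = set(seeds)
    frontier = list(seeds)  # nodes activated in the previous time step
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, p_uv in out_edges.get(u, []):
                # u gets exactly one chance to activate each inactive neighbor
                if v not in active and rng.random() < p_uv:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active


def estimate_spread(out_edges, seeds, trials=1000, seed=0):
    """Monte Carlo estimate of the influence spread (expected number active)."""
    rng = random.Random(seed)
    total = sum(len(simulate_ic(out_edges, seeds, rng)) for _ in range(trials))
    return total / trials
```

Averaging over many trials approximates the expected number of activated nodes; the number of trials needed for a given accuracy depends on the graph.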

Influence Maximization


Problem

Select k individuals such that by activating them, the expected spread of influence is maximized.

Input: A directed graph representing a social network, with influence probabilities on edges

Output: Seed set of size k

The problem is NP-hard, and it is #P-hard to compute exact influence

Temporal Aspects in Influence Diffusion


Influence diffusion can be time-delayed

Heterogeneity of human activities and interactions (Iribarren and Moro, 2009)

Network topology and burstiness (Karsai et al., 2011)

NOT captured in classical propagation models

Temporal Aspects in Influence Diffusion


The task of influence maximization may be time-critical.

Xbox 360 + Kinect is on sale in Vancouver area BestBuys, for 3 days only!

Alice has grabbed this great deal, and wants to inform Bob

But Bob has been away on a road trip in Banff National Park

No stable Internet & cellphone access (Rocky Mountains!) & uncertain return time

Viral marketing campaigns may have a limited time horizon, which affects the spread of word-of-mouth.

NOT captured in current propagation models

Our Work


Extend the influence maximization problem to have a deadline constraint: Time-Critical Influence Maximization

Propose a new propagation model to reflect temporal delays of influence diffusion: Independent Cascade with Meeting events (IC-M)

Independent Cascade with Meeting Events (The IC-M model)


Model parameters

Social networks modeled as directed graphs

Influence probability p(u,v)

Meeting probability m(u,v), the probability that u and v meet in each time step

Diffusion dynamics

Initially, a seed set is targeted and activated

At each time step, u and v meet with probability m(u,v).

If u is active and is meeting v for the first time after u's activation, u influences v with probability p(u,v)
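The diffusion dynamics above can be sketched as a single simulation trial. This is an illustrative reading of the model, not the authors' code: the edge layout `(v, p_uv, m_uv)`, the function name `simulate_icm`, and the convention that seeds are active at step 0 and the horizon runs over steps 1..T are assumptions.

```python
import random


def simulate_icm(out_edges, seeds, T, rng):
    """One IC-M trial; returns the set of nodes active by the end of step T.

    out_edges: dict u -> list of (v, p_uv, m_uv) triples (assumed layout).
    """
    active = set(seeds)
    # Edges from active nodes whose single influence attempt is still unused.
    pending = [(v, p, m) for u in seeds for (v, p, m) in out_edges.get(u, [])]
    for _step in range(T):
        newly_active = set()
        waiting = []
        for v, p, m in pending:
            if v in active:
                continue  # target already active, nothing left to do
            if rng.random() < m:  # the two endpoints meet in this step
                if rng.random() < p:  # the one-shot influence attempt
                    newly_active.add(v)
                # success or failure, the attempt is now used up
            else:
                waiting.append((v, p, m))  # no meeting yet, retry next step
        active |= newly_active
        # Nodes activated in this step start meeting neighbors next step.
        for v in newly_active:
            waiting.extend(out_edges.get(v, []))
        pending = waiting
    return active
```

With m(u,v) = 1 on every edge the meeting delay disappears and each step behaves like one step of the classical IC model, cut off after T steps.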

Time-Critical Influence Maximization


Original Influence Maximization

Diffusion ends only when no more nodes can be activated

Unconstrained time horizon

Meeting probabilities not essential


Given an integer T << |V|, we only consider influence spread within T time steps

T: the deadline constraint.

Representing limited time horizon.



Problem Formulation

Input: G=(V, E), k, T

Objective: find k seeds such that the spread of influence by the end of time step T is maximized (under the IC-M model)

NP-hard
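In symbols, the objective above can be written as follows. The notation $\sigma_T(S)$ for the expected number of nodes active by the end of time step T under IC-M is an assumption made here, not notation taken from the slides.

```latex
S^{*} \in \operatorname*{arg\,max}_{S \subseteq V,\ |S| = k} \sigma_T(S)
```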

Greedy Approximation Algorithm


Under the IC-M model, our objective function (spread of influence) is monotone and submodular in the seed set.

Monotonicity: As the seed set grows, influence is non-decreasing

Submodularity: The law of diminishing marginal returns

Greedy approximation algorithm

Repeat k rounds

In each round, select the node v that provides the largest marginal gain in influence spread

Approximation ratio = 1-1/e ≈ 63% (Nemhauser et al., 1978)

#P-hard to compute influence spread exactly for IC-M

Can use Monte Carlo simulations to estimate, but very costly
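The greedy procedure with Monte Carlo estimation can be sketched as below. This is a plain, unoptimized illustration under assumptions: the names `spread_icm` and `greedy_seeds`, the edge layout `(v, p_uv, m_uv)`, and the default trial count are all hypothetical choices for the example.

```python
import random


def spread_icm(out_edges, seeds, T, trials, rng):
    """Monte Carlo estimate of the expected number of nodes active by step T."""
    total = 0
    for _trial in range(trials):
        active = set(seeds)
        pending = [e for u in seeds for e in out_edges.get(u, [])]
        for _step in range(T):
            newly_active, waiting = set(), []
            for v, p, m in pending:
                if v in active:
                    continue
                if rng.random() < m:  # meeting happens
                    if rng.random() < p:  # one-shot influence attempt
                        newly_active.add(v)
                else:
                    waiting.append((v, p, m))
            active |= newly_active
            for v in newly_active:
                waiting.extend(out_edges.get(v, []))
            pending = waiting
        total += len(active)
    return total / trials


def greedy_seeds(nodes, out_edges, k, T, trials=200, seed=0):
    """Pick k seeds, each round adding the node with the best estimated spread."""
    rng = random.Random(seed)
    chosen = []
    for _round in range(k):
        best_node, best_spread = None, -1.0
        for v in nodes:
            if v in chosen:
                continue
            s = spread_icm(out_edges, chosen + [v], T, trials, rng)
            if s > best_spread:
                best_node, best_spread = v, s
        if best_node is None:
            break  # fewer than k nodes available
        chosen.append(best_node)
    return chosen
```

Each round evaluates every remaining node with `trials` simulations, so the cost grows as k times the number of nodes times the trial count, which is why this approach is costly on large graphs.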

Overcome the Inefficiency of Greedy


MIA: Maximum Influence Arborescence (Chen et al., 2010)

Heuristic No.1 (MIA-M algorithm)

Design an efficient algorithm to compute influence spread exactly in tree structures

Leverage it to design scalable heuristics for time-critical influence maximization in general graphs

Heuristic No.2 (MIA-C algorithm)

For each pair of nodes (u,v), estimate the probability that influence will propagate from u to v by combining p(u,v), m(u,v), and the deadline T

Convert the problem to one in the classical IC model, and solve it using MIA
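As an illustration of the conversion idea in Heuristic No.2, the sketch below folds p(u,v), m(u,v), and T into one edge probability. The specific formula is an assumption chosen for the example (the chance that u, active from step 0, meets v at least once within T steps and then succeeds); the authors' actual estimate may differ and is given in their technical report. The names `converted_probability` and `convert_graph` are hypothetical.

```python
def converted_probability(p_uv, m_uv, T):
    """Assumed estimate: P(at least one meeting in T steps) * P(influence)."""
    return p_uv * (1.0 - (1.0 - m_uv) ** T)


def convert_graph(edges, T):
    """Map each edge (u, v) -> (p, m) to a single classical IC probability."""
    return {e: converted_probability(p, m, T) for e, (p, m) in edges.items()}
```

The converted graph carries one probability per edge, so any algorithm for the classical IC model can then be run on it unchanged.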

Compute Influence in Directed Trees (In-Arborescences)


Activation probability ap(u,t): the probability that u becomes active right at time step t


Calculating Activation Probability


Step 1: For any seed set S, compute ap(u,t) given S via dynamic programming

Base cases

The recursion: ap(u,t) =

Step 2: By linearity of expectation, for a given seed set S,

inf(S) = $\sum_{u \in V} \sum_{t=0}^{T} ap(u,t)$
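The slide's own base cases and recursion did not survive extraction, so the sketch below derives a dynamic program directly from the IC-M definition. It should be read as one consistent reconstruction under stated assumptions, not as the authors' formula: seeds are active at step 0, a node w activated at step s reaches its out-neighbor u by step t with probability p(w,u)(1 - (1 - m(w,u))^(t-s)), and in an in-arborescence the in-neighbors of u act independently. The names `activation_probabilities` and `expected_spread` are hypothetical.

```python
def activation_probabilities(nodes, in_nbrs, p, m, seeds, T):
    """Return ap[u][t] = P(u becomes active exactly at step t), t = 0..T.

    in_nbrs: dict u -> list of in-neighbors; the graph is assumed to be an
    in-arborescence (or forest), so in-neighbors have disjoint subtrees.
    p, m: dicts keyed by edge (w, u).
    """
    ap = {}

    def solve(u):
        if u in ap:
            return ap[u]
        if u in seeds:
            ap[u] = [1.0] + [0.0] * T  # base case: seeds are active at step 0
            return ap[u]
        nbr_ap = [(w, solve(w)) for w in in_nbrs.get(u, [])]
        # inactive[t] = P(u is still inactive at the end of step t)
        inactive = [1.0] * (T + 1)
        for t in range(1, T + 1):
            prod = 1.0
            for w, ap_w in nbr_ap:
                hit = p[(w, u)]
                miss = 1.0 - m[(w, u)]
                # P(w has activated u by step t), summed over w's activation time
                reach = sum(ap_w[s] * hit * (1.0 - miss ** (t - s))
                            for s in range(t))
                prod *= 1.0 - reach
            inactive[t] = prod
        ap[u] = [0.0] + [inactive[t - 1] - inactive[t] for t in range(1, T + 1)]
        return ap[u]

    for u in nodes:
        solve(u)
    return ap


def expected_spread(ap):
    """Sum ap(u, t) over all nodes and all steps up to the deadline."""
    return sum(sum(row) for row in ap.values())
```

For a single edge a -> b with p = m = 0.5, seed a, and T = 2, this gives ap(b,1) = 0.25 and ap(b,2) = 0.125, so the spread is 1.375. The recursive helper is fine for small trees; a very deep tree would need an iterative traversal.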

Computing Influence in General Graphs


Utilize the dynamic programming algorithm for trees

Restrict incoming influence to a node u in a local region

Influence from nodes far away can be ignored

“Sparsify” the local region of a node u to an in-arborescence, by including only the strongest influence path from other nodes to u

Dijkstra’s algorithm
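Finding the strongest influence paths into a node can be sketched with Dijkstra's algorithm on edge lengths -log(probability), so that the shortest path is the most probable one. This is a simplified illustration: it uses a single probability per edge and ignores meeting delays, and the function name `strongest_paths_to` and the cutoff parameter `threshold` are assumptions for the example.

```python
import heapq
import math


def strongest_paths_to(root, in_edges, threshold=0.0):
    """Build an in-arborescence of most-probable paths ending at root.

    in_edges: dict v -> list of (u, prob) for edges u -> v (assumed layout).
    Returns (path_prob, next_hop): the best path probability from each kept
    node to root, and the next node on that path.
    """
    dist = {root: 0.0}  # -log of the best path probability found so far
    next_hop = {}
    heap = [(0.0, root)]
    done = set()
    while heap:
        d, v = heapq.heappop(heap)
        if v in done:
            continue  # stale heap entry
        done.add(v)
        for u, prob in in_edges.get(v, []):
            if prob <= 0.0 or u in done:
                continue
            cand = d - math.log(prob)
            # Keep u only if the path is strong enough (local region cutoff).
            if cand < dist.get(u, math.inf) and math.exp(-cand) >= threshold:
                dist[u] = cand
                next_hop[u] = v
                heapq.heappush(heap, (cand, u))
    path_prob = {u: math.exp(-d) for u, d in dist.items()}
    return path_prob, next_hop
```

Raising `threshold` shrinks the local region, which is how influence from far-away nodes gets ignored in this sketch.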

Experiments: Network Datasets


NetHEPT: A co-authorship network from the arxiv.org High Energy Physics Theory section.

WikiVote: A who-voted-whom network from Wikipedia

Epinions: A who-trusts-whom network from the customer reviews site Epinions.com

DBLP: A co-authorship network from DBLP

              NetHEPT   WikiVote   Epinions   DBLP
# Nodes       15K       7.1K       75K        655K
# Edges       62K       101K       509K       2.0M
Avg. degree   4.12      26.6       13.4       6.1
Max. degree   64        1065       3079       588

Experimental Results


Panels: (a) NetHEPT, (b) Epinions

Graph parameters:

• Influence probability: p(u,v) = 1.0 / in-degree(v)

• Meeting probability: m(u,v) = 1.0 / out-degree(u)

Fig. Spread of influence (Y-axis) vs. seed set size (X-axis), T = 5 and 15

Experimental Results


Panels: (a) NetHEPT, (b) Epinions

Fig. Spread of influence (Y-axis) vs. seed set size (X-axis), T = 5 and 15

Graph parameters:

• Influence probability: p(u,v) = 1.0 / in-degree(v)

• Meeting probability: m(u,v) chosen uniformly at random from {0.2, 0.3, … , 0.7, 0.8}

Running Time Comparisons


T = 5, Weighted meeting probabilities. DNP = Did Not Complete (within 72 hours)

In general, Greedy is too slow to use in practice

Cannot scale to large graphs

MIA-M and MIA-C are 2-3 orders of magnitude faster

MIA-M is a little slower than MIA-C and MIA, but has higher spread of influence

         NetHEPT   WikiVote   Epinions   DBLP
Greedy   40m       22m        DNP        DNP
MIA-M    1.6s      7.9s       41s        6.6m
MIA-C    0.3s      0.4s       2.7s       24s
MIA      0.3s      1.4s       12s        40s

Conclusions, Discussions & Future Work


Conclusions

Time-Critical Influence Maximization Problem

Independent Cascade model with Meeting events (IC-M)

Approximation algorithm & heuristic solutions

Extensions & Refinements

Linear Threshold model with Meeting events (LT-M)

More efficient computation of activation probabilities in tree structures

Details available in our full technical report: arXiv 1204.3074

Future Work

Extend classical propagation models to incorporate login events

Extend to more general propagation models


Thanks!!! Questions???

KDD 2012 tutorial on Information and Influence Spread in Social Networks (Aug 12, Beijing, China)

Carlos Castillo (Qatar Computing Research Institute)

Wei Chen (Microsoft Research Asia)

Laks V.S. Lakshmanan (University of British Columbia)

