Time-Critical Influence Maximization in Social Networks with Time-Delayed Diffusion Process
Wei Chen (Microsoft Research Asia)
Wei Lu (U. of British Columbia)
Ning Zhang (U. of Sci and Tech of China)
This work was done during the internships of Wei Lu and Ning Zhang at Microsoft Research Asia.
Influence in Social Networks
Thursday, July 26, 2012 AAAI 2012, Toronto, Ontario. 2
We live in communities and interact with social acquaintances
This forms social networks
In social interactions, we influence each other
Example: "Kinect is great" passed from person to person
Influence Diffusion & Viral Marketing
Word-of-mouth effects
Social Networks as Directed Graphs
Nodes: Individuals in the network
Edges: Links between individuals
Edge weight: Influence probability p(u,v) – the probability that v will be influenced by u
[Figure: example directed graph with an influence probability on each edge]
A Classical Influence Propagation Model
Independent Cascade (IC) (Kempe, Kleinberg, and Tardos 2003)
Initially some seed nodes are activated
At each discrete time step, each newly activated node u gets one chance to activate each inactive neighbor v, succeeding independently with probability p(u,v)
Influence spread: Expected number of nodes activated
Other propagation models
Linear Threshold (LT)
General Threshold
etc
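The IC dynamics above can be sketched as a small Monte Carlo simulation. This is a minimal illustration, not the authors' code: the function names `simulate_ic` and `estimate_spread`, and the nested-dict graph layout, are assumptions made for the example.

```python
import random

def simulate_ic(graph, seeds, rng):
    """Run one IC cascade and return the set of activated nodes.

    graph maps each node u to a dict {v: p(u, v)} of out-neighbors
    (a layout assumed for this sketch).
    """
    active = set(seeds)
    frontier = list(seeds)  # nodes activated in the previous step
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, p in graph.get(u, {}).items():
                # Each newly activated u gets a single chance on each inactive v.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active

def estimate_spread(graph, seeds, runs=1000, seed=0):
    """Average cascade size over `runs` simulations (estimated influence spread)."""
    rng = random.Random(seed)
    total = sum(len(simulate_ic(graph, seeds, rng)) for _ in range(runs))
    return total / runs
```

For example, `estimate_spread({"a": {"b": 0.5}}, ["a"])` averages cascades in which b is activated about half the time.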
Influence Maximization
Problem
Select k individuals such that by
activating them, the expected
spread of influence is maximized.
Input: A directed graph representing a social network, with influence probabilities on edges
Output: Seed set of size k
NP-hard
#P-hard to compute exact influence
Temporal Aspects in Influence Diffusion
Influence diffusion can be time-delayed
Heterogeneity of human activities and interactions (Iribarren and
Moro, 2009)
Network topology and burstiness (Karsai et al., 2011)
NOT captured in classical propagation models
Temporal Aspects in Influence Diffusion
The task of influence maximization may be time-critical.
Xbox 360 + Kinect is on sale in Vancouver area BestBuys, for
3 days only !!!!!
Alice has grabbed this great deal, and wants to inform Bob
But Bob’s been away on a road trip in Banff National Park
No stable Internet & cellphone access (Rocky mountains!) &
Uncertain return time
Viral marketing campaigns may have limited time horizon,
which affects the spread of word-of-mouth.
NOT captured in current propagation models
Our Work
Extend the influence maximization problem to have a deadline
constraint: Time-Critical Influence Maximization
Propose a new propagation model to reflect temporal delays
of influence diffusion: Independent Cascade with Meeting
events (IC-M)
Independent Cascade with Meeting Events (The IC-M model)
Model parameters
Social networks modeled as directed graphs
Influence probability p(u,v)
Meeting probability m(u,v), the probability that u and v
meet in each time step
Diffusion dynamics
Initially, a seed set is targeted and activated
At each time step, u and v meet with probability m(u,v).
If u is active and this is its first meeting with v since u became active, u influences v with probability p(u,v)
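A single IC-M cascade up to a deadline can be simulated as below. This is a sketch under stated assumptions: the name `simulate_icm`, the nested dicts `p[u][v]` and `m[u][v]`, and the rule that a node activated at step t starts meeting its neighbors from step t + 1 are choices made for this example.

```python
import random

def simulate_icm(p, m, seeds, T, rng):
    """One IC-M cascade up to deadline T; returns {node: activation step}.

    p[u][v] and m[u][v] are the influence and meeting probabilities of edge
    (u, v); both layouts are assumptions of this sketch.
    """
    activated_at = {s: 0 for s in seeds}
    # Edges whose first meeting (since the tail became active) is still pending.
    pending = [(u, v) for u in seeds for v in p.get(u, {})]
    for t in range(1, T + 1):
        still_pending, newly = [], {}
        for u, v in pending:
            if v in activated_at:
                continue  # v is already active, so this edge no longer matters
            if rng.random() >= m[u][v]:
                still_pending.append((u, v))  # no meeting this step, retry later
            elif rng.random() < p[u][v]:
                newly[v] = t  # first meeting happened and influence succeeded
            # Otherwise the first meeting failed to influence v: drop the edge.
        activated_at.update(newly)
        pending = still_pending + [(v, w) for v in newly for w in p.get(v, {})]
    return activated_at
```

The number of nodes active by the deadline is `len(simulate_icm(p, m, seeds, T, rng))`; averaging it over many runs gives a Monte Carlo estimate of the time-critical spread.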
Time-Critical Influence Maximization
Original Influence Maximization
Diffusion ends only when no more nodes can be activated
Unconstrained time horizon
Meeting probabilities not essential
Time-Critical Influence Maximization
Given an integer T << |V|, we only consider influence spread
within T time steps
T: the deadline constraint.
Representing limited time horizon.
Time-Critical Influence Maximization
Problem Formulation
Input: G=(V, E), k, T
Objective: find k seeds such that the spread of influence by the
end of time step T is maximized (under the IC-M model)
NP-hard
Greedy Approximation Algorithm
Under the IC-M model, our objective function (spread of
influence) is monotone and submodular in the seed set.
Monotonicity: As the seed set grows, influence is non-decreasing
Submodularity: The law of diminishing marginal returns
Greedy approximation algorithm
Repeat k rounds
In each round, select the node v that provides the largest marginal
gain in influence spread
Approximation ratio = 1-1/e ≈ 63% (Nemhauser et al., 1978)
#P-hard to compute influence spread exactly for IC-M
Can use Monte Carlo simulations to estimate, but very costly
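The greedy loop described above can be sketched as follows. The callable `spread` is a hypothetical stand-in for any influence estimator (for instance a Monte Carlo estimate under IC-M within T steps); the function name `greedy_seeds` is likewise an assumption of this example.

```python
def greedy_seeds(nodes, k, spread):
    """Pick up to k seeds, each round adding the node with the largest marginal gain.

    spread(S) is a caller-supplied estimator of influence spread for a seed
    list S (hypothetical interface for this sketch).
    """
    seeds = []
    current = spread(seeds)
    for _ in range(k):
        best, best_gain = None, float("-inf")
        for v in nodes:
            if v in seeds:
                continue
            gain = spread(seeds + [v]) - current  # marginal gain of adding v
            if gain > best_gain:
                best, best_gain = v, gain
        if best is None:
            break  # fewer than k candidate nodes
        seeds.append(best)
        current += best_gain
    return seeds
```

Each round calls `spread` once per candidate node, which is why the cost of the estimator dominates the running time when it relies on simulation.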
Overcome the Inefficiency of Greedy
MIA: Maximum Influence Arborescence (Chen et al., 2010)
Heuristic No.1 (MIA-M algorithm)
Design an efficient algorithm to compute influence spread exactly in tree structures
Leverage it to design scalable heuristics for time-critical influence maximization in general graphs
Heuristic No.2 (MIA-C algorithm)
For each pair of nodes (u,v), estimate the probability that influence will propagate from u to v by combining p(u,v), m(u,v), and the deadline T
Convert the problem to one in the classical IC model, and solve it using MIA
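As an illustration of how p(u,v), m(u,v), and T might be combined for a single edge, the sketch below uses the probability that an already active u meets v at least once within T steps and then succeeds in influencing it. This formula is an assumption chosen for the example and is not necessarily the estimate used in MIA-C; the function name is hypothetical.

```python
def converted_probability(p_uv, m_uv, T):
    """Single-edge estimate (an assumption of this sketch, not the slides' formula).

    Probability that an active u meets v at least once within T steps,
    1 - (1 - m)^T, times the probability p that the first meeting succeeds.
    """
    return p_uv * (1.0 - (1.0 - m_uv) ** T)
```

With this choice the converted value never exceeds p(u,v), and it approaches p(u,v) as T grows or as m(u,v) approaches 1.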
Compute Influence in Directed Trees (In-Arborescences)
Activation probability ap(u,t): the probability that u becomes active right at time step t
Calculating Activation Probability
Step 1: For any seed set S, compute ap(u,t) given S via dynamic programming
Base cases
The recursion: ap(u,t) =
Step 2: By linearity of expectation, for a given seed set S,
$\mathit{inf}(S) = \sum_{u} \sum_{t=0}^{T} ap(u,t)$, summing over all nodes u in the tree
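One way to realize the two steps, derived here from the IC-M dynamics rather than taken from the slides, is sketched below. It assumes that seeds have ap = 1 at t = 0 and 0 afterwards, that non-seeds have ap = 0 at t = 0, and that in an in-arborescence the in-neighbors of u act independently. The names `activation_probabilities` and `influence`, and the dict layouts, are hypothetical.

```python
def activation_probabilities(in_nbrs, p, m, seeds, T):
    """ap[u][t] for t = 0..T in an in-arborescence (sketch under stated assumptions).

    in_nbrs[u] lists the in-neighbors w of u and must have a key for every
    node, seeds included; p[(w, u)] and m[(w, u)] are the influence and
    meeting probabilities of edge (w, u).
    """
    seeds = set(seeds)
    ap = {u: [0.0] * (T + 1) for u in in_nbrs}
    for s in seeds:
        ap[s][0] = 1.0  # base case: seeds are active at time 0

    def inactive_through(u, t):
        # Probability that no in-neighbor has activated u by the end of step t.
        prod = 1.0
        for w in in_nbrs[u]:
            # w active at step tp meets u within (t - tp) steps, then succeeds w.p. p.
            reached = sum(
                ap[w][tp] * p[(w, u)] * (1.0 - (1.0 - m[(w, u)]) ** (t - tp))
                for tp in range(t)
            )
            prod *= 1.0 - reached
        return prod

    for t in range(1, T + 1):
        for u in in_nbrs:
            if u not in seeds:
                # Active exactly at t = inactive through t - 1 but not through t.
                ap[u][t] = inactive_through(u, t - 1) - inactive_through(u, t)
    return ap

def influence(ap):
    """Expected number of nodes active by the deadline, by linearity of expectation."""
    return sum(sum(row) for row in ap.values())
```

On the chain a -> b -> c with all p = 1, all m = 0.5, seed a, and T = 2, this gives ap(b,1) = 0.5, ap(b,2) = 0.25, ap(c,2) = 0.25, for a total influence of 2.0.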
Computing Influence in General Graphs
Utilize the dynamic programming algorithm for trees
Restrict incoming influence to a node u in a local region
Influence from nodes far away can be ignored
“Sparsify” the local region of a node u to an in-arborescence, by
including only the strongest influence path from other nodes
to u
Dijkstra’s algorithm
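Finding the strongest influence paths into a node can be sketched with Dijkstra's algorithm on edge lengths -log(prob), so that the shortest path is the path with the largest probability product. In this sketch the per-edge strength `prob` is whatever estimate one chooses, and the cutoff `theta` that bounds the local region is an assumed parameter; the function name and data layout are hypothetical.

```python
import heapq
import math

def strongest_paths_into(root, in_edges, theta=0.0):
    """Return {w: (path_prob, next_hop)} for the strongest path from each w to root.

    in_edges[v] is a list of (w, prob) pairs for edges w -> v. Paths whose
    probability falls below theta are ignored (theta is an assumption of this
    sketch). The next_hop links form an in-arborescence rooted at root.
    """
    max_dist = -math.log(theta) if theta > 0.0 else math.inf
    dist = {root: 0.0}
    parent = {root: None}
    heap = [(0.0, 0, root)]  # (distance, tie-breaker, node)
    pushes = 1
    while heap:
        d, _, v = heapq.heappop(heap)
        if d > dist[v]:
            continue  # stale heap entry
        for w, prob in in_edges.get(v, []):
            if prob <= 0.0:
                continue
            nd = d - math.log(prob)  # multiplying probabilities = adding -log terms
            if nd <= max_dist and nd < dist.get(w, math.inf):
                dist[w] = nd
                parent[w] = v
                heapq.heappush(heap, (nd, pushes, w))
                pushes += 1
    return {w: (math.exp(-dist[w]), parent[w]) for w in dist}
```

Nodes absent from the result have no path to the root with probability at least `theta`, which is how influence from far-away nodes gets ignored.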
Experiments: Network Datasets
NetHEPT: A co-authorship network from the arxiv.org High Energy Physics Theory section.
WikiVote: A who-voted-whom network from Wikipedia
Epinions: A who-trusts-whom network from the customer
reviews site Epinions.com
DBLP: A co-authorship network from DBLP
Statistic    NetHEPT  WikiVote  Epinions  DBLP
# Nodes      15K      7.1K      75K       655K
# Edges      62K      101K      509K      2.0M
Avg. degree  4.12     26.6      13.4      6.1
Max. degree  64       1065      3079      588
Experimental Results
Fig. Spread of influence (Y-axis) vs. seed set size (X-axis), T = 5 and 15. Panels: (a) NetHEPT, (b) Epinions
Graph parameters:
• Influence probability: p(u,v) = 1.0 / in-degree(v)
• Meeting probability: m(u,v) = 1.0 / out-degree(u)
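The degree-based parameter setting above can be computed from an edge list as in the following sketch. The function name `weighted_parameters` and the edge-keyed dicts are assumptions for illustration.

```python
from collections import Counter

def weighted_parameters(edges):
    """Assign p(u,v) = 1 / in-degree(v) and m(u,v) = 1 / out-degree(u).

    edges is a list of directed (u, v) pairs without duplicates
    (a layout assumed for this sketch).
    """
    in_deg = Counter(v for _, v in edges)
    out_deg = Counter(u for u, _ in edges)
    p = {(u, v): 1.0 / in_deg[v] for u, v in edges}
    m = {(u, v): 1.0 / out_deg[u] for u, v in edges}
    return p, m
```

Under this setting the influence probabilities on the edges entering any node sum to 1, and so do the meeting probabilities on the edges leaving any node.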
Experimental Results
Fig. Spread of influence (Y-axis) vs. seed set size (X-axis), T = 5 and 15. Panels: (a) NetHEPT, (b) Epinions
Graph parameters:
• Influence probability: p(u,v) = 1.0 / in-degree(v)
• Meeting probability: m(u,v) chosen uniformly at random from {0.2, 0.3, … , 0.7, 0.8}
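Drawing the meeting probabilities uniformly from the listed set could look like the sketch below. The function name, the fixed random seed, and the edge-keyed dict are assumptions of this example.

```python
import random

# The seven values 0.2, 0.3, ..., 0.8 listed above.
MEETING_CHOICES = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]

def random_meeting_probabilities(edges, seed=0):
    """Pick m(u,v) uniformly at random from MEETING_CHOICES for every edge."""
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    return {(u, v): rng.choice(MEETING_CHOICES) for u, v in edges}
```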
Running Time Comparisons
T = 5, Weighted meeting probabilities. DNP = Did Not Complete (within 72 hours)
In general, Greedy is too slow to use in practice
Cannot scale to large graphs
MIA-M, MIA-C are 2-3 orders of magnitude faster
MIA-M is a little bit slower than MIA-C, MIA
But has higher spread of influence
Algorithm  NetHEPT  WikiVote  Epinions  DBLP
Greedy     40m      22m       DNP       DNP
MIA-M      1.6s     7.9s      41s       6.6m
MIA-C      0.3s     0.4s      2.7s      24s
MIA        0.3s     1.4s      12s       40s
Conclusions, Discussions & Future Work
Conclusions
Time-Critical Influence Maximization Problem
Independent Cascade model with Meeting events (IC-M)
Approximation algorithm & heuristic solutions
Extensions & Refinements
Linear Threshold model with Meeting events (LT-M)
More efficient computation of activation probabilities in tree structures
Details available in our full technical report: arXiv 1204.3074
Future Work
Extend classical propagation models to incorporate login events
Extend to more general propagation models
Thanks!!! Questions???
KDD 2012 tutorial on Information and Influence Spread in
Social Networks (Aug 12, Beijing, China)
Carlos Castillo (Qatar Computing Research Institute)
Wei Chen (Microsoft Research Asia)
Laks V.S. Lakshmanan (University of British Columbia)