Presentation on Social Event Detection using mobile phone data, February 24, 2011
Social Event Detection
V.A. Traag (1), A. Browet (1), F. Calabrese (2), F. Morlot (3)
(1) Department of Applied Mathematics, UCL, Louvain-la-Neuve, Belgium
(2) SENSEable City Lab, MIT, Cambridge, USA
(3) Orange Labs, Issy-les-Moulineaux, France
24 February 2011
Outline
1 Motivation
2 Bayesian Location Inference
3 Identification of frequent locations
4 Event detection
5 Presence probability
Introduction
Purpose
Analyze mobility and social behaviour of mobile phone users:
1 Detect social events, i.e. unusually large gatherings of people.
2 Identify frequent locations such as home or office.
Motivation
1 Between 70% and 80% of human mobility is explained by the daily home-office routine (Barabasi et al.). Analyze the out-of-ordinary behaviour.
2 Anticipate the impact of large events on urban transit for traffic regulation or public transportation.
3 Identification/classification of users and their habits for telecommunication companies.
Introduction
Available data
1 Precise location of antennas but no orientation information.
2 A record for each connection to the network (calls, text messages, mobile internet, ...).
Compute 2 probability measures
1 $\phi_i(x)$: the probability of being connected to antenna $i$ given a position $x$.
2 $\psi_i(x)$: the probability of being at position $x$ given that the user was connected to antenna $i$.
Location Inference
The signal strength at position $x$ of an antenna $i$ located at position $X_i$ is determined by:
• the power of the antenna $p_i$, assumed equal for all antennas: $p_i = p$;
• the loss of signal strength over distance:
\[ L_i(x) = \frac{1}{\|x - X_i\|^{\beta}}; \]
• a stochastic fading of the signal, i.e. the Rayleigh fading $R_i$:
\[ \Pr(R_i \le r) = F(r) = 1 - e^{-r}. \]
Location Inference
The signal strength of antenna $i$ is then given by
\[ S_i(x) = p_i L_i(x) R_i. \]
Further assumptions:
• $R_i \perp\!\!\!\perp R_j \quad \forall i \neq j$.
• given a position $x$, the user connects to the antenna $i$ with the highest signal strength:
\[ S_i(x) \ge S_j(x) \ \ \forall j \in \mathcal{X} \iff S_i(x) = \max_{j \in \mathcal{X}} S_j(x) \]
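The model above can be simulated directly. The following is a minimal sketch, not the authors' implementation: the function name `connect`, the value `beta=3.0` and the choice `p = 1` are assumptions made for illustration, and `x` must not coincide with an antenna position.

```python
import math
import random

def connect(x, antennas, beta=3.0, rng=random):
    """Sample the index of the antenna a user at position x connects to.

    beta is an assumed path-loss exponent; all antennas share power p = 1.
    """
    best, best_signal = None, -1.0
    for i, X in enumerate(antennas):
        loss = math.dist(x, X) ** -beta      # L_i(x) = 1 / ||x - X_i||^beta
        fading = rng.expovariate(1.0)        # R_i with F(r) = 1 - exp(-r)
        signal = loss * fading               # S_i(x) = p * L_i(x) * R_i, p = 1
        if signal > best_signal:
            best, best_signal = i, signal
    return best

# Usage: estimate how often a user close to antenna 0 connects to it.
rng = random.Random(0)
antennas = [(0.0, 0.0), (10.0, 0.0)]
share = sum(connect((1.0, 0.0), antennas, rng=rng) == 0 for _ in range(2000)) / 2000
```

Because the fading is random, a user does not always connect to the nearest antenna, which is why the following slides work with connection probabilities rather than fixed cells.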
Location Inference
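Let $a_i$ denote the event that a user connects to antenna $i$.
\[ \Pr(a_i \mid x) = \Pr\Big(S_i(x) = \max_{j \in \mathcal{X}} S_j(x)\Big) = \prod_{j \in \mathcal{X},\, j \neq i} \Pr\big(p_i L_i(x) R_i \ge p_j L_j(x) R_j\big) \]
If we assume that the random variable $R_i$ takes a specific value $r$,
\[ \Pr(a_i \mid x, R_i = r) = \prod_{j \in \mathcal{X},\, j \neq i} \Pr\Big(R_j \le \frac{L_i(x)}{L_j(x)}\, r\Big) = \prod_{j \in \mathcal{X},\, j \neq i} F\Big(\frac{L_i(x)}{L_j(x)}\, r\Big) \]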
Location Inference
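Then, it follows that
\[ \phi_i(x) = \Pr(a_i \mid x) = \int_0^{\infty} f(r)\, \Pr(a_i \mid x, R_i = r)\, dr \]
\[ = \int_0^{\infty} e^{-r} \prod_{j \in \mathcal{X},\, j \neq i} \left(1 - \exp\left(-r\, \frac{\|x - X_j\|^{\beta}}{\|x - X_i\|^{\beta}}\right)\right) dr \]
\[ \approx \int_0^{\infty} e^{-r} \prod_{j \in \mathcal{X}_i} \left(1 - \exp\left(-r\, \frac{\|x - X_j\|^{\beta}}{\|x - X_i\|^{\beta}}\right)\right) dr \]
How to choose the local neighborhood $\mathcal{X}_i$, and what is its impact?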
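The integral for the connection probability can be evaluated numerically. The sketch below uses a plain trapezoid rule over all antennas (not the local neighborhood); the name `phi`, the exponent `beta=3.0`, the truncation point `r_max` and the step count are illustrative assumptions, not values from the presentation.

```python
import math

def phi(i, x, antennas, beta=3.0, r_max=40.0, steps=4000):
    """Approximate Pr(a_i | x) by integrating over the fading value r."""
    d_i = math.dist(x, antennas[i])
    # (||x - X_j|| / ||x - X_i||)^beta for every other antenna j
    ratios = [(math.dist(x, X) / d_i) ** beta
              for j, X in enumerate(antennas) if j != i]

    def integrand(r):
        value = math.exp(-r)                 # density f(r) of R_i
        for c in ratios:
            value *= 1.0 - math.exp(-r * c)  # F(L_i(x) / L_j(x) * r)
        return value

    # Trapezoid rule on [0, r_max]; exp(-r) is negligible beyond r_max.
    h = r_max / steps
    total = 0.5 * (integrand(0.0) + integrand(r_max))
    total += sum(integrand(k * h) for k in range(1, steps))
    return total * h

# Usage: the probabilities over all antennas should sum to about 1.
antennas = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
total = sum(phi(i, (4.0, 3.0), antennas) for i in range(len(antennas)))
```

With two antennas the integral has the closed form $c / (1 + c)$, where $c$ is the single distance ratio raised to $\beta$, which gives a quick sanity check for a numerical routine like this one.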
Location Inference
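Delaunay radius:
\[ \rho_i = \max\{\, d(X_i, X_j) \mid j \text{ is a Delaunay neighbour of } i \,\} \]
The domain $D_i$ is defined by
\[ D_i = \{\, x \mid r \rho_i \ge d(x, X_i) \,\} \]
where $r$ is a scaling factor for the radius.
The neighborhood is computed as
\[ \mathcal{X}_i = \{\, j \mid X_j \in D_i,\ j \in \mathcal{X} \,\} \]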
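As a rough sketch of this construction, the function below computes the local neighborhood from a precomputed Delaunay adjacency. The adjacency mapping `delaunay_neighbours` is a hypothetical input (it would come from a triangulation routine, which is not shown), and excluding antenna `i` from its own neighborhood is an assumption made here.

```python
import math

def local_neighbourhood(i, antennas, delaunay_neighbours, r=2.0):
    """Return the indices j with X_j inside D_i, excluding i itself.

    delaunay_neighbours maps an antenna index to the indices of its
    Delaunay neighbours (assumed to be computed elsewhere).
    """
    X_i = antennas[i]
    # Delaunay radius: distance to the farthest Delaunay neighbour
    rho_i = max(math.dist(X_i, antennas[j]) for j in delaunay_neighbours[i])
    return [j for j, X_j in enumerate(antennas)
            if j != i and math.dist(X_i, X_j) <= r * rho_i]

# Usage with a made-up adjacency for four antennas on a line.
antennas = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (10.0, 0.0)]
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
nearby = local_neighbourhood(0, antennas, adjacency, r=3.0)
```

A larger `r` includes more antennas in the product and so trades computation time for accuracy.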
Location Inference
Average error on 1000 random points
[Figure: average error (vertical axis, 0 to 0.014) against r (horizontal axis, 1 to 3)]
Location Inference
Based on Bayes' rule, we obtain
\[ \psi_i(x) = \Pr(x \mid a_i) = \frac{\Pr(a_i \mid x)\, \Pr(x)}{\Pr(a_i)} \]
The ratio $\Pr(x) / \Pr(a_i)$ is not known but can be assumed constant over the domain $D_i$. It follows that
\[ \psi_i(x) = \frac{\phi_i(x)}{\int_{D_i} \phi_i(x)\, dx} \]
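The normalisation can be approximated on a grid. In this sketch the domain is taken to be a disc around the antenna, the callable `phi_i` stands for any function returning the connection probability at a point, and the name `psi_on_grid` and the grid size are illustrative assumptions.

```python
import math

def psi_on_grid(phi_i, centre, radius, n=50):
    """Approximate psi_i at the cell midpoints of an n x n grid clipped to D_i.

    Returns (density, h): a dict from midpoint to density value, and the cell size.
    """
    h = 2.0 * radius / n
    cells = {}
    for a in range(n):
        for b in range(n):
            x = (centre[0] - radius + (a + 0.5) * h,
                 centre[1] - radius + (b + 0.5) * h)
            if math.dist(x, centre) <= radius:   # keep only points inside D_i
                cells[x] = phi_i(x)
    norm = sum(cells.values()) * h * h           # approximates the integral over D_i
    return {x: v / norm for x, v in cells.items()}, h

# Usage with a toy stand-in for phi_i that decays with distance.
toy_phi = lambda x: math.exp(-(x[0] ** 2 + x[1] ** 2))
density, h = psi_on_grid(toy_phi, (0.0, 0.0), 2.0, n=40)
```

Using cell midpoints with an even `n` keeps the grid away from the antenna position itself, where the distance ratios in the connection probability are undefined.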
Location Inference
Probability density ψi (x)
Frequent Location Identification
The probability that a user at position $x$ connects to antenna $i$ is $\phi_i(x)$.
The probability that the user made $k_i$ calls with antenna $i$ is then $\phi_i(x)^{k_i}$.
The likelihood of observing those calling frequencies is
\[ L(x \mid k) = \prod_{i \in H} \phi_i(x)^{k_i} \iff \log L(x \mid k) = \sum_{i \in H} k_i \log \phi_i(x) \]
Maximum Likelihood Estimator (MLE)
\[ \hat{x}_h(u) = \arg\max_x \log L(x \mid k(u)) \]
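One simple way to approximate the estimator is a search over candidate positions. The sketch below is an illustration under assumptions: `mle_location` and `phi_two` are hypothetical names, the candidate grid is made up, and the toy model uses only two antennas, for which the connection probability reduces to $c / (1 + c)$ with $c$ the distance ratio raised to $\beta$.

```python
import math

def mle_location(counts, phi, candidates):
    """Return the candidate position maximising sum_i k_i * log(phi(i, x)).

    counts maps antenna index i to the number of calls k_i.
    """
    def log_likelihood(x):
        return sum(k * math.log(phi(i, x)) for i, k in counts.items())
    return max(candidates, key=log_likelihood)

# Toy two-antenna model (assumed positions and exponent).
ANTENNAS = [(0.0, 0.0), (10.0, 0.0)]
BETA = 3.0

def phi_two(i, x):
    d_i = math.dist(x, ANTENNAS[i])
    d_j = math.dist(x, ANTENNAS[1 - i])
    c = (d_j / d_i) ** BETA
    return c / (1.0 + c)

# Candidate positions on the segment between the two antennas.
candidates = [(0.5 * k, 0.0) for k in range(1, 20)]
x_hat = mle_location({0: 5, 1: 5}, phi_two, candidates)
```

With equal call counts on both antennas the estimate falls on the midpoint; shifting the counts towards one antenna pulls the estimate towards it.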
Overview Event Detection
General
• Looking for unusually large gatherings of people.
• Which people are likely to be attending a (possible) event?
• Should be present at the event location with high probability.
• Should not be there often.
Presence probability
Given calls in the neighbourhood, what is the probability that the user was present during the time interval of an event?
Ordinary probability
What is the average probability that a user was present during other weeks?
Presence probability
Derivation
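• The probability that the user is in area $A$ at time $t_c$ of a call $c$ is $p_c$.
• Assume a constant leave and arrival rate $\gamma$.
• Then for $t \neq t_c$ we have $e^{-\gamma |t - t_c|}\, p_c$.
• Take the maximum over all calls $c$ of a user and average over the event interval $[t_s, t_e]$:
\[ p_p = \frac{1}{t_e - t_s} \int_{t_s}^{t_e} \max_c \, e^{-\gamma |t - t_c|}\, p_c \, dt \]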
Motivation
• More calls ⇒ higher presence probability
• Calls close by ⇒ higher presence probability
• Don’t take into account calls outside of area.
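The presence probability can be approximated with a midpoint rule. This is a sketch under assumptions: the name `presence_probability`, the default rate `gamma=1.0` and the number of integration steps are illustrative, and `calls` is assumed to contain only calls made inside the area.

```python
import math

def presence_probability(calls, t_start, t_end, gamma=1.0, steps=1000):
    """Average over [t_start, t_end] of max_c exp(-gamma * |t - t_c|) * p_c.

    calls is a list of (t_c, p_c) pairs for calls made within the area.
    """
    if not calls:
        return 0.0
    h = (t_end - t_start) / steps
    total = 0.0
    for k in range(steps):
        t = t_start + (k + 0.5) * h              # midpoint of the k-th step
        total += max(p_c * math.exp(-gamma * abs(t - t_c)) for t_c, p_c in calls)
    return total * h / (t_end - t_start)

# Usage: a second nearby call cannot lower the presence probability.
one_call = presence_probability([(15.0, 0.3)], 14.0, 16.0)
two_calls = presence_probability([(15.0, 0.3), (15.5, 0.3)], 14.0, 16.0)
```

For a single call at the centre of a two-hour interval with `gamma = 1`, the integral evaluates to $p_c (1 - e^{-1})$, which is a convenient check.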
Presence probability
[Figure: presence probability (vertical axis, 0 to 0.35) against time (horizontal axis, 13 to 19), with the first and second call marked]
Ordinary probability
How regularly is the user in the area? (Consider only the same weekday and the same time of day.)
[Figure: calendar of April (days 1 to 30) with the legend below]
Was not present, i.e. $p_p(i) = 0$
Was in area with probability $p_p(2)$
Was in area with probability $p_p(5)$
Ordinary probability
Ordinary probability defined as the average presence probability over $W$ weeks, i.e.
\[ p_o = \frac{1}{W} \sum_{i=1}^{W} p_p(i) \]
Probability of attending
Maximum ordinary probability
• Should be present with relatively high probability
• Relatively rarely present ⇒ small $p_o$ (i.e. only for the event)
• What is the theoretical maximum ordinary probability $\bar{p}_o$?
• Theoretical maximum: make an infinite number of calls with the ‘best’ antenna.
Probability of attending
• The probability that the user attended is then calculated as
\[ p_a = p_p \left(1 - \frac{p_o}{\bar{p}_o}\right) \]
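A small numeric sketch of the ordinary and attendance probabilities follows. The function names and the sample values are made up for illustration, and the maximum ordinary probability is passed in as a given number rather than derived.

```python
def ordinary_probability(weekly_presence):
    """Average presence probability over the W comparison weeks."""
    return sum(weekly_presence) / len(weekly_presence)

def attendance_probability(p_p, p_o, p_o_max):
    """p_a = p_p * (1 - p_o / p_o_max)."""
    return p_p * (1.0 - p_o / p_o_max)

# Usage: a user rarely seen in the area but likely present at the event.
p_o = ordinary_probability([0.0, 0.2, 0.0, 0.0, 0.3])   # five comparison weeks
p_a = attendance_probability(p_p=0.8, p_o=p_o, p_o_max=0.5)
```

A user who is in the area every week has an ordinary probability close to the maximum, so the attendance probability drops towards zero even when the presence probability during the event is high.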
Event detection
Number of attendees
• Mark user as (possible) attendee if $p_a$ is high enough.
• Number of (possible) attendees at week $w$ given by $n_w$.
• Mark week $w$ as event if $n_w$ is high enough.
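One way to decide whether the number of attendees is high enough is to standardise the weekly counts, as the Z-score axis of the example figures suggests. The sketch below is an assumption about how that could be done: the name `detect_event_weeks`, the threshold of 2 and the sample counts are illustrative, not taken from the presentation.

```python
import statistics

def detect_event_weeks(attendee_counts, threshold=2.0):
    """Return the week indices whose attendee count has a Z-score above threshold."""
    mean = statistics.mean(attendee_counts)
    std = statistics.pstdev(attendee_counts)     # population standard deviation
    if std == 0:
        return []                                # all weeks identical: no event
    return [w for w, n in enumerate(attendee_counts)
            if (n - mean) / std > threshold]

# Usage with made-up weekly attendee counts n_w.
counts = [10, 12, 9, 11, 10, 60, 10, 11]
event_weeks = detect_event_weeks(counts)
```

Note that a single very large week also inflates the standard deviation, so with few weeks of data a lower threshold or a robust spread estimate may be preferable.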
Example: Stadium
[Figure: Z-score (vertical axis, −2 to 5) per week (horizontal axis, 0 to 60)]
Example: Stadium
[Figure: number of calls (vertical axis, 0 to 350) per hour (horizontal axis, 0 to 24) for three groups: Not attending, Attending, Regular]
Example: Park
[Figure: Z-score (vertical axis, −4 to 4) per week (horizontal axis, 0 to 60)]
Example: Park
[Figure: number of calls (vertical axis, 0 to 350) per hour (horizontal axis, 0 to 24) for three groups: Not attending, Attending, Regular]
Example: Rural area
[Figure: Z-score (vertical axis, −4 to 4) per week (horizontal axis, 0 to 60)]
Sensitivity
Conclusions
Conclusions
• Possible to detect ‘social events’ in mobile phone data
• Robust to antenna positioning and switching
• Interesting observation: non-routine behaviour seems massive
Further considerations
• Use simpler (faster) method to detect irregularities
• Refine location estimation by likelihood inference
Questions? Suggestions? Remarks?