machine learning in simple networksmmds.imm.dtu.dk/presentations/hansen.pdf · european physical...
TRANSCRIPT
![Page 1: Machine Learning in Simple Networksmmds.imm.dtu.dk/presentations/hansen.pdf · European Physical Journal B 60(1) 83-88 (2007). S. Lehmann, L.K. Hansen: Deterministic modularity optimization](https://reader031.vdocuments.us/reader031/viewer/2022022611/5b98695b09d3f219118c1b94/html5/thumbnails/1.jpg)
Machine Learning in Simple Networks
Lars Kai Hansen
www.imm.dtu.dk/~lkh
Outline
• Communities and link prediction
• Modularity
– Modularity as a combinatorial optimization problem
– Gibbs sampling
• Detection threshold – a phase transition?
• Learning community parameters
– The Hofman-Wiggins generative model
– Is there a threshold for detection when you learn the parameters and complexity?
Muzeeker
• Wikipedia-based common sense
• Wikipedia used as a proxy for the music user's mental model
• Implementation: filter retrieval using Wikipedia’s articles/categories
• Muzeeker.com
• LINK PREDICTION to complete the ontological quality of Wikipedia
Network models
• Nodes/vertices and links/edges
– Directed / undirected
– Weighted / unweighted
• Link distributions
– Random
– Long tail
– Hubs and authorities
• Link-induced correlations
– The rich club
• Communities
– Link prediction
Motivation for community detection
• Community structure may mark a non-stationary link distribution with “high and low density” sub-networks, hence summarizing with a single “model” could be misleading
Modularity can be predictive for dynamics
M.E.J. Newman and M. Girvan, Finding and evaluating community structure in networks, Phys. Rev. E 69, 026113 (2004).
Modularity objective function
The modularity is expressed as a sum over links, such that we penalize missing links in communities; "missing" is measured relative to a null distribution $P^0_{ij}$.
$$Q = \sum_{ij} \left[ \frac{A_{ij}}{2m} - P_i P_j \right] \delta(c_i, c_j)$$
$c_i$ is the community assignment of node $i$, and $2m = \sum_{ij} A_{ij}$, $k_i = \sum_j A_{ij}$.
The null is a baseline distribution $P^0_{ij} = P_i P_j = k_i k_j/(2m)^2$.
The value of the modularity lies in the range [−1,1]. It is positive if the number of edges within groups exceeds the number expected on the basis of chance.
M.E.J. Newman and M. Girvan, Finding and evaluating community structure in networks, Physical Review E 69, 026113 (2004), cond-mat/0308217.
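As a small worked example of the modularity on this slide, the sketch below evaluates Q directly from an adjacency matrix and a vector of community labels. The toy graph (two triangles joined by a single link) and the function name `modularity` are assumptions made for illustration; they are not part of the talk.

```python
import numpy as np

def modularity(A, c):
    """Q = sum_ij [A_ij/(2m) - k_i k_j/(2m)^2] * delta(c_i, c_j)."""
    A = np.asarray(A, dtype=float)
    c = np.asarray(c)
    k = A.sum(axis=1)                  # degrees k_i = sum_j A_ij
    two_m = A.sum()                    # 2m = sum_ij A_ij
    P0 = np.outer(k, k) / two_m**2     # null distribution k_i k_j / (2m)^2
    same = c[:, None] == c[None, :]    # delta(c_i, c_j)
    return float(((A / two_m - P0) * same).sum())

# Toy graph (assumed for illustration): two triangles joined by one link.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1
c = np.array([0, 0, 0, 1, 1, 1])

print(modularity(A, c))                      # one community per triangle
print(modularity(A, np.zeros(6, dtype=int))) # everything in one community
```

For this graph the two-triangle partition gives Q = 5/14, while placing all nodes in one community gives Q = 0, in line with the statement that Q is positive only when within-group links exceed the chance expectation.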
Potts representation
Introduce 0,1 binary variables $S_{kj}$ coding the community assignment: “node j is member of community k”
$$\delta(c_i, c_j) = \sum_k S_{ki} S_{kj}$$
$$P_i = \sum_j \frac{A_{ij}}{2m}$$
$$Q = \sum_{ij} \left[ \frac{A_{ij}}{2m} - P_i P_j \right] \delta(c_i, c_j) = \sum_{ij} \left[ \frac{A_{ij}}{2m} - P_i P_j \right] \sum_k S_{ki} S_{kj}$$
$$Q = \frac{1}{2m} \sum_{ijk} B_{ij} S_{ki} S_{kj} = \frac{\mathrm{Tr}(S B S')}{2m}$$
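The equivalence between the sum over links and the trace form can be checked numerically. The sketch below is a minimal illustration, assuming a toy graph of two triangles joined by one link and taking B_ij = A_ij − k_i k_j/(2m), which is the matrix implied by B_ij/(2m) = A_ij/(2m) − P_i P_j.

```python
import numpy as np

# Toy graph (assumed for illustration): two triangles joined by one link.
N, K = 6, 2
A = np.zeros((N, N))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1
c = np.array([0, 0, 0, 1, 1, 1])

# One-hot Potts variables: S[k, j] = 1 if node j is a member of community k.
S = np.zeros((K, N))
S[c, np.arange(N)] = 1

k = A.sum(axis=1)
two_m = A.sum()
B = A - np.outer(k, k) / two_m           # B_ij = A_ij - k_i k_j / (2m)

delta = S.T @ S                          # delta(c_i, c_j) = sum_k S_ki S_kj
Q_sum = ((A / two_m - np.outer(k, k) / two_m**2) * delta).sum()
Q_trace = np.trace(S @ B @ S.T) / two_m  # Tr(S B S') / (2m)

print(Q_sum, Q_trace)
```

Both expressions evaluate to 5/14 for this partition.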
Spectral optimization
• Newman relaxes the optimization problem to the simplex
$$Q = \frac{1}{2m} \sum_{ijk} B_{ij} S_{ki} S_{kj} = \frac{\mathrm{Tr}(S B S')}{2m}$$
$$L = \frac{\mathrm{Tr}(S B S')}{2m} + \mathrm{Tr}(\Lambda S)$$
$$B S = S \Lambda$$
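To make the spectral idea concrete, here is a minimal sketch of the two-community special case, where the nodes are split by the signs of the leading eigenvector of B. This is the well-known leading-eigenvector bisection, shown as an illustration rather than as the exact relaxation on the slide; the toy graph and the function name are assumptions.

```python
import numpy as np

def leading_eigenvector_split(A):
    """Split nodes in two by the sign of the leading eigenvector of B."""
    A = np.asarray(A, dtype=float)
    k = A.sum(axis=1)
    two_m = A.sum()
    B = A - np.outer(k, k) / two_m           # B_ij = A_ij - k_i k_j / (2m)
    eigvals, eigvecs = np.linalg.eigh(B)     # eigenvalues in ascending order
    v = eigvecs[:, -1]                       # eigenvector of the largest eigenvalue
    return (v > 0).astype(int), eigvals[-1]

# Toy graph (assumed for illustration): two triangles joined by one link.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1

labels, lam = leading_eigenvector_split(A)
print(labels, lam)
```

The overall sign of an eigenvector is arbitrary, so only the grouping of nodes is meaningful, not which group is labelled 0 or 1.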
Combinatorial optimization
• We can use a physics analogy: simulated annealing (Kirkpatrick et al. 1983)
$$P(S|A,T) \propto \exp\left(\frac{Q(S)}{T}\right) = \exp\left(\frac{\mathrm{Tr}(S B S')}{2mT}\right)$$
• Gibbs sampling is a Monte Carlo realization of a Markov process in which each variable is randomly assigned according to its conditional distribution given all the other variables
$$P(S_j | S_{-j}, A, T) = \frac{P(S|A,T)}{\sum_{S_j} P(S|A,T)}$$
S. Geman, D. Geman, "Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images", IEEE Transactions on Pattern Analysis and Machine Intelligence 6 (6): 721–741 (1984)
Potts model 1-node
• Discrete probability distribution on states k = 1,…,K
$$P(S|A,T) \propto \exp\left(\frac{1}{T} \sum_{k=1}^{K} S_k \varphi_k\right)$$
$$P(S|A,T) = \prod_k r_k^{S_k}$$
$$\langle S_k \rangle = r_k = \frac{\exp(\varphi_k/T)}{\sum_{k'} \exp(\varphi_{k'}/T)}$$
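The single-node Potts distribution is a softmax of the fields φ_k at temperature T. The sketch below computes the probabilities r_k and draws one one-hot state; the function names and the example field values are assumptions for illustration.

```python
import numpy as np

def potts_probs(phi, T):
    """r_k = exp(phi_k / T) / sum_k' exp(phi_k' / T), computed stably."""
    phi = np.asarray(phi, dtype=float)
    w = np.exp((phi - phi.max()) / T)   # subtracting the max leaves r unchanged
    return w / w.sum()

def sample_potts(r, rng):
    """Draw a one-hot state S with P(S_k = 1) = r_k."""
    S = np.zeros(len(r), dtype=int)
    S[rng.choice(len(r), p=r)] = 1
    return S

phi = np.array([0.0, 1.0, 2.0])         # example fields (assumed values)
r = potts_probs(phi, T=1.0)
cold = potts_probs(phi, T=1e-3)         # low temperature: mass on the largest field
S = sample_potts(r, np.random.default_rng(0))
print(r, cold, S)
```

Lowering T concentrates the distribution on the state with the largest field, which is what an annealing schedule exploits.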
Gibbs sampling
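$$\varphi_{ki} = \sum_j \frac{B_{ij}}{2m} S_{kj} = \sum_j \frac{A_{ij}}{2m} S_{kj} - \frac{k_i}{2m} \sum_j \frac{k_j}{2m} S_{kj}$$
$$r_{ki} = \frac{\exp(\varphi_{ki}/T)}{\sum_{k'} \exp(\varphi_{k'i}/T)}$$
$$S_i \sim \mathrm{potts}(r_i)$$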
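A single Gibbs sweep over the nodes can be sketched as follows. Two choices here are assumptions rather than details taken from the slide: node i's own term is dropped from the sum over j before resampling it, and nodes are visited in random order. With the self term dropped, sampling from exp(φ_ki/T) matches the exact conditional of exp(Q/T') at T' = 2T, so the factor of two only rescales the temperature.

```python
import numpy as np

def gibbs_sweep(A, S, T, rng):
    """One sweep over all nodes. S is a K x N one-hot matrix, updated in place."""
    K, N = S.shape
    k = A.sum(axis=1)
    two_m = A.sum()
    B = A - np.outer(k, k) / two_m
    for i in rng.permutation(N):
        S[:, i] = 0                        # exclude node i's own term (assumption)
        phi = S @ B[i] / two_m             # phi_ki = sum_j B_ij S_kj / (2m)
        w = np.exp((phi - phi.max()) / T)
        r = w / w.sum()                    # r_ki
        S[rng.choice(K, p=r), i] = 1       # S_i ~ potts(r_i)
    return S

# Toy graph (assumed for illustration): two triangles joined by one link.
N, K = 6, 2
A = np.zeros((N, N))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1
truth = np.array([0, 0, 0, 1, 1, 1])
S0 = np.zeros((K, N))
S0[truth, np.arange(N)] = 1

S1 = gibbs_sweep(A, S0.copy(), T=1e-3, rng=np.random.default_rng(0))
print(S1.argmax(axis=0))
```

At a very low temperature the sweep leaves the two-triangle partition unchanged, since every node's field favours its current community.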
Deterministic annealing
• Instead of drawing Gibbs samples according to the marginals, we can average; this provides a set of self-consistent equations for the means (for 0,1 Bernoulli variables the mean is the probability $\mu_{ki} = P(S_{ki} = 1)$, written $r_{ki}$ in the equations)
$$r_{ki} = \frac{\exp(\varphi_{ki}/T)}{\sum_{k'} \exp(\varphi_{k'i}/T)}$$
$$\varphi_{ki} = \sum_j \frac{B_{ij}}{2m} r_{kj} = \sum_j \frac{A_{ij}}{2m} r_{kj} - P_i \sum_j P_j r_{kj}$$
S. Lehmann, L.K. Hansen: Deterministic modularity optimization European Physical Journal B 60(1) 83-88 (2007).
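The self-consistent equations can be iterated as a fixed-point scheme while the temperature is lowered. The sketch below is one possible reading, not the published algorithm: the geometric temperature schedule, the number of inner iterations, the random initialisation and the parallel update of all nodes are assumptions.

```python
import numpy as np

def deterministic_annealing(A, K, temperatures, rng, n_inner=50):
    """Iterate r_ki = softmax_k(phi_ki / T), phi_ki = sum_j B_ij r_kj / (2m)."""
    A = np.asarray(A, dtype=float)
    N = A.shape[0]
    k = A.sum(axis=1)
    two_m = A.sum()
    B = A - np.outer(k, k) / two_m
    r = rng.dirichlet(np.ones(K), size=N).T     # K x N, each column sums to 1
    for T in temperatures:
        for _ in range(n_inner):
            phi = r @ B / two_m                 # B is symmetric
            phi -= phi.max(axis=0)              # stabilise the exponentials
            r = np.exp(phi / T)
            r /= r.sum(axis=0)
    return r

# Toy graph (assumed for illustration): two triangles joined by one link.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1

r = deterministic_annealing(A, K=2, temperatures=np.geomspace(1.0, 0.01, 20),
                            rng=np.random.default_rng(0))
print(r.round(3))
```

A hard assignment can be read off with `r.argmax(axis=0)` once the temperature is low enough for the columns of r to be nearly one-hot.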
Experimental evaluation
• Create a simple testbed with link probability and “noise”
S. Lehmann, L.K. Hansen: Deterministic modularity optimization European Physical Journal B 60(1) 83-88 (2007).
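One way to build such a testbed is a planted-partition graph. The sketch below is an assumption about the setup, not a description of the published experiment: nodes in the same community are linked with probability `p`, and `q` plays the role of the “noise” as the link probability between communities.

```python
import numpy as np

def planted_partition(N, C, p, q, rng):
    """Undirected graph with C equal-sized planted communities.

    p: link probability within a community
    q: link probability between communities ("noise", an assumed reading)
    """
    c = np.arange(N) % C                        # planted assignments
    same = c[:, None] == c[None, :]
    prob = np.where(same, p, q)
    upper = np.triu(rng.random((N, N)) < prob, k=1).astype(int)
    A = upper + upper.T                         # symmetric, no self-links
    return A, c

A, c = planted_partition(N=200, C=4, p=0.2, q=0.02, rng=np.random.default_rng(1))
print(A.sum() // 2, np.bincount(c))
```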
Generative community model (Hofman & Wiggins, 2008)
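$$P(A|S,p,q) = p^{c} (1-p)^{d} q^{e} (1-q)^{f}$$
$$c = \frac{1}{2} \sum_{j \neq i,\, k} A_{ij} S_{kj} S_{ki}$$
$$d = \frac{1}{2} \sum_{j \neq i,\, k} (1 - A_{ij}) S_{kj} S_{ki}$$
$$e = \frac{1}{2} \sum_{j \neq i} A_{ij} \left( 1 - \sum_k S_{kj} S_{ki} \right)$$
$$f = \frac{1}{2} \sum_{j \neq i} (1 - A_{ij}) \left( 1 - \sum_k S_{kj} S_{ki} \right)$$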
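The four exponents are simply counts of node pairs: linked and unlinked pairs within communities (c, d) and between communities (e, f). The sketch below computes them and the resulting log-likelihood for a toy graph; the graph, the function names and the values p = 0.9, q = 0.1 are assumptions for illustration.

```python
import numpy as np

def link_counts(A, S):
    """Return (c, d, e, f) for adjacency A (N x N) and one-hot S (K x N)."""
    N = A.shape[0]
    same = S.T @ S                   # same[i, j] = sum_k S_ki S_kj
    off = 1.0 - np.eye(N)            # restrict the sums to j != i
    c = 0.5 * (off * A * same).sum()
    d = 0.5 * (off * (1 - A) * same).sum()
    e = 0.5 * (off * A * (1 - same)).sum()
    f = 0.5 * (off * (1 - A) * (1 - same)).sum()
    return c, d, e, f

def log_likelihood(A, S, p, q):
    """log P(A | S, p, q) = c log p + d log(1-p) + e log q + f log(1-q)."""
    c, d, e, f = link_counts(A, S)
    return c * np.log(p) + d * np.log(1 - p) + e * np.log(q) + f * np.log(1 - q)

# Toy graph (assumed for illustration): two triangles joined by one link.
N, K = 6, 2
A = np.zeros((N, N))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1
S = np.zeros((K, N))
S[[0, 0, 0, 1, 1, 1], np.arange(N)] = 1

counts = link_counts(A, S)
ll = log_likelihood(A, S, p=0.9, q=0.1)
print(counts, ll)
```

For the two-triangle partition the counts are c = 6, d = 0, e = 1, f = 8, which add up to the 15 node pairs of a 6-node graph.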
Learning parameters of the generative model
• Hofman & Wiggins (2008)
– “Variational Bayes”
– Dirichlet/beta prior and posterior distributions for the probabilities
– Very well determined (overkill)
– Independent binomials for the assignment variables (misses correlation)
• Here
– Maximum likelihood for the parameters
– Gibbs sampling for the assignments
Jake M. Hofman and Chris H. Wiggins, Bayesian Approach to Network Modularity, Phys. Rev. Lett. 100, 258701 (2008).
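The two steps listed under “Here” can be sketched as follows. For fixed assignments, the maximum-likelihood estimates are the observed link fractions within and between communities. For fixed parameters, each node is resampled from its conditional distribution. The uniform prior over communities, the clipping constant `eps` and all function names are assumptions, not details from the talk.

```python
import numpy as np

def ml_parameters(A, labels):
    """p_hat, q_hat: link fractions over intra- and inter-community pairs."""
    iu = np.triu_indices(len(A), k=1)            # each pair counted once
    intra = (labels[:, None] == labels[None, :])[iu]
    links = A[iu]
    p_hat = links[intra].mean() if intra.any() else 0.5
    q_hat = links[~intra].mean() if (~intra).any() else 0.5
    return float(p_hat), float(q_hat)

def gibbs_sweep(A, labels, K, p, q, rng, eps=1e-6):
    """Resample every node's community given the others (uniform prior assumed)."""
    p, q = np.clip([p, q], eps, 1 - eps)         # keep the logarithms finite
    w_link = np.log(p / q)                       # weight of an observed link
    w_gap = np.log((1 - p) / (1 - q))            # weight of a missing link
    N = len(A)
    S = np.eye(K)[labels].T.copy()               # K x N one-hot
    for i in rng.permutation(N):
        S[:, i] = 0                              # exclude j = i
        logit = S @ (A[i] * w_link + (1 - A[i]) * w_gap)
        r = np.exp(logit - logit.max())
        r /= r.sum()
        S[rng.choice(K, p=r), i] = 1
    return S.argmax(axis=0)

# Toy graph (assumed for illustration): two triangles joined by one link.
A = np.zeros((6, 6), dtype=int)
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1
labels = np.array([0, 0, 0, 1, 1, 1])

p_hat, q_hat = ml_parameters(A, labels)
new_labels = gibbs_sweep(A, labels, K=2, p=p_hat, q=q_hat,
                         rng=np.random.default_rng(0))
print(p_hat, q_hat, new_labels)
```

Alternating the two functions gives a simple learning loop: sample assignments, re-estimate p and q, and repeat.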
The community detection threshold
How many links are needed to detect the structure?
Jörg Reichardt and Michele Leone, (Un)detectable Cluster Structure in Sparse Networks, Phys. Rev. Lett. 101, 078701 (2008).
$$P_{in} = \frac{p}{q(C-1)} = \frac{SNR}{C-1}$$
Experimental design
• Planted solution
– N = 1000 nodes
– Ctrue = 5
– Quality: mutual information between planted assignments and the best identified
• Gibbs sampling
– No annealing
– Burn-in 200 iterations
– Averaging 800 iterations
• Parameter learning
– Q = 10 iterations
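The quality measure can be computed from the joint histogram of planted and identified labels. The sketch below is a minimal version; measuring in bits (base-2 logarithm) is an assumption, and the function name is hypothetical.

```python
import numpy as np

def mutual_information(a, b):
    """Mutual information (in bits, an assumed unit) between two labelings."""
    a = np.asarray(a)
    b = np.asarray(b)
    joint = np.zeros((a.max() + 1, b.max() + 1))
    np.add.at(joint, (a, b), 1)              # joint label counts
    joint /= joint.sum()
    pa = joint.sum(axis=1, keepdims=True)    # marginal of a
    pb = joint.sum(axis=0, keepdims=True)    # marginal of b
    nz = joint > 0                           # skip empty cells (0 log 0 = 0)
    return float((joint[nz] * np.log2(joint[nz] / (pa @ pb)[nz])).sum())

planted = np.repeat(np.arange(5), 200)       # N = 1000, C = 5
relabelled = (planted + 1) % 5               # same partition, permuted labels
uninformative = np.zeros(1000, dtype=int)    # everything in one community

print(mutual_information(planted, planted))
print(mutual_information(planted, relabelled))
print(mutual_information(planted, uninformative))
```

The measure is invariant to relabelling the communities, which matters because a sampler has no way of knowing the planted label names. With five equal-sized planted communities, a perfect recovery scores log2(5) ≈ 2.32 bits and an uninformative assignment scores 0.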
Community Detection – fully informed on number of communities and probabilities
[Figure: four panels titled COMMUNITY DETECTION (N = 1000, C = 5, SNR = 5), (N = 1000, C = 5, SNR = 10), (N = 1000, C = 5, SNR = 50) and (N = 1000, C = 10, SNR = 50). x-axis: INTRA COMMUNITY LINK PROB (P), 0 to 0.05. y-axis: MUTUAL INF. PLANTED COMMUNITY, 0 to 2.5.]
Now what happens to the phase transition if we learn the parameters … with a too complex model (C > Ctrue = 5)?
[Figure: two panels titled COMMUNITY DETECTION (N = 1000, C = 10, SNR = 10) and (N = 1000, C = 10, SNR = 5). x-axis: INTRA COMMUNITY LINK PROB (P), 0 to 0.1. y-axis: MUTUAL INF. PLANTED COMMUNITY, 0 to 2.5.]
[Figure: bar chart. x-axis: COMMUNITY, 1 to 10. y-axis: MEMBERSHIPS, 0 to 200.]
Conclusions
• Community detection can be formulated as an inference problem (Hofman & Wiggins, 2008)
• The sampling process for fixed SNR has a phase-transition-like detection threshold (Reichardt & Leone, 2008)
• The phase transition remains (sharpens?) if you learn the parameters of a generative model with unknown complexity