Spectral Analysis of Signed Graphs for Clustering, Prediction and Visualization
Jérôme Kunegis¹, Stephan Schmidt¹, Andreas Lommatzsch¹ & Jürgen Lerner²
¹DAI Lab, Technische Universität Berlin, ²Universität Konstanz, Germany
10th SIAM International Conference on Data Mining, April 29–May 1, Columbus, Ohio

Upload: jerome-kunegis

Post on 02-Nov-2014


DESCRIPTION

We study the application of spectral clustering, prediction and visualization methods to graphs with negatively weighted edges. We show that several characteristic matrices of graphs can be extended to graphs with positively and negatively weighted edges, giving signed spectral clustering methods, signed graph kernels and network visualization methods that apply to signed graphs. In particular, we review a signed variant of the graph Laplacian. We derive our results by considering random walks, graph clustering, graph drawing and electrical networks, showing that they all result in the same formalism for handling negatively weighted edges. We illustrate our methods using examples from social networks with negative edges and bipartite rating graphs.

TRANSCRIPT

Page 1: Spectral Analysis of Signed Graphs for Clustering, Prediction and Visualization

Spectral Analysis of Signed Graphs for Clustering, Prediction and Visualization

Jérôme Kunegis¹, Stephan Schmidt¹, Andreas Lommatzsch¹ & Jürgen Lerner²
¹DAI Lab, Technische Universität Berlin, ²Universität Konstanz, Germany

10th SIAM International Conference on Data Mining, April 29–May 1, Columbus, Ohio

Page 2: Spectral Analysis of Signed Graphs for Clustering, Prediction and Visualization

Kunegis et al. Spectral Analysis of Signed Graphs 2

Introduction: Negative Edges

Some websites allow you to have foes:

Example: Slashdot Zoo (Kunegis 2009)

Page 3: Spectral Analysis of Signed Graphs for Clustering, Prediction and Visualization

Introduction: Signed Graphs

• The resulting social network is signed

• Edges are positive or negative

• In this talk: we use the graph Laplacian to study signed graphs

Example: Slashdot Zoo (Kunegis 2009)

[Figure: the relations around a user ("me") in the Slashdot Zoo: friend of, foe of, fan of, freak of]

Page 4: Spectral Analysis of Signed Graphs for Clustering, Prediction and Visualization

Outline

Introduction: Signed Graphs

1. Negative Edges and the Laplacian
2. Balance, Conflict and the Graph Spectrum
3. Communities, Cuts and Clustering
4. Resistance, Conductivity and Link Prediction

Discussion

Page 5: Spectral Analysis of Signed Graphs for Clustering, Prediction and Visualization

1. Negative Edges and the Laplacian

• Graph drawing: Place each node at the center of its neighbors

$v_0 = \frac{1}{3} (v_1 + v_2 + v_3)$

Algebraically: $D v = A v$

Solution 1: Upper eigenvectors of $D^{-1} A$, using $A \in \{0, 1\}^{n \times n}$

Solution 2: Lower eigenvectors of $D - A$, with $D_{ii} = \sum_j A_{ij}$

We look at solution 2: $L = D - A$ is the Laplacian matrix

[Figure: node v0 placed at the center of its neighbors v1, v2 and v3]
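As a minimal sketch of solution 2, the following Python snippet builds $L = D - A$ and uses its lowest non-trivial eigenvectors as node coordinates. The example graph (a 4-cycle), the function names and the choice to skip the constant eigenvector are assumptions made for illustration; they are not taken from the slides.

```python
import numpy as np

def laplacian(A):
    """Unsigned Laplacian L = D - A, with D_ii = sum_j A_ij."""
    A = np.asarray(A, dtype=float)
    return np.diag(A.sum(axis=1)) - A

def spectral_layout(L, dim=2):
    """Use the `dim` lowest non-trivial eigenvectors of L as coordinates.

    Assumes a connected graph, so that only the first eigenvector
    (eigenvalue 0, constant entries) is skipped.
    """
    eigenvalues, eigenvectors = np.linalg.eigh(L)  # ascending eigenvalues
    return eigenvalues[1:dim + 1], eigenvectors[:, 1:dim + 1]

# Hypothetical example graph: the 4-cycle 0-1-2-3-0.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]])
L = laplacian(A)
eigenvalues, coords = spectral_layout(L)
print(coords)  # one row of 2-D coordinates per node
```

Only the constant vector satisfies $D v = A v$ exactly, which is why the drawing uses the next eigenvectors as the closest non-trivial solutions.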

Page 6: Spectral Analysis of Signed Graphs for Clustering, Prediction and Visualization

Drawing Signed Graphs

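• Replace ‘negative’ neighbors by their antipodal points

$v_0 = \frac{1}{3} (-v_1 + v_2 + v_3)$

Solution: lower eigenvectors of $L = D - A$

Using $A \in \{-1, 0, +1\}^{n \times n}$

And $D_{ii} = \sum_j |A_{ij}|$

[Figure: v0 placed at the center of −v1, v2 and v3, where −v1 is the antipodal point of the negative neighbor v1]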
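The signed Laplacian can be sketched in a few lines of Python. The star graph and the 2-D neighbor positions below are made-up illustration values; the check confirms that the row of $L$ belonging to v0 encodes the antipodal placement rule.

```python
import numpy as np

def signed_laplacian(A):
    """Signed Laplacian L = D - A, with D_ii = sum_j |A_ij|."""
    A = np.asarray(A, dtype=float)
    return np.diag(np.abs(A).sum(axis=1)) - A

# Node 0 (v0) has one negative neighbor (v1) and two positive ones (v2, v3).
A = np.array([[ 0, -1, 1, 1],
              [-1,  0, 0, 0],
              [ 1,  0, 0, 0],
              [ 1,  0, 0, 0]])
L = signed_laplacian(A)

# Hypothetical positions of the neighbors in the plane.
v1 = np.array([1.0, 0.0])
v2 = np.array([0.0, 1.0])
v3 = np.array([0.0, -1.0])
v0 = (-v1 + v2 + v3) / 3          # center of -v1, v2 and v3
X = np.vstack([v0, v1, v2, v3])   # one row per node

residual = L[0] @ X               # 3*v0 + v1 - v2 - v3
print(residual)                   # zero when v0 follows the placement rule
```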

Page 7: Spectral Analysis of Signed Graphs for Clustering, Prediction and Visualization

Example: Synthetic Graph

Unsigned Graph Drawing → Signed Graph Drawing

Page 8: Spectral Analysis of Signed Graphs for Clustering, Prediction and Visualization

2. Balance, Conflict and the Graph Spectrum

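• ‘Balanced’ graphs have a perfect 2-clustering: all edges within a cluster are positive and all edges between the two clusters are negative

• Invert all negative edges to obtain the underlying unsigned graph

• Effect on the Laplacian decomposition: in every eigenvector, the entries belonging to one cluster change sign

• Therefore: The spectrum of a balanced graph is the same as for the underlying unsigned graph (λ₁ = 0)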
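The statement about balanced graphs can be checked numerically. In this sketch the balanced example graph (a 4-cycle with clusters {0, 1} and {2, 3}) and the variable names are assumptions for illustration. The diagonal matrix `S` holds the cluster signs, and the signed Laplacian equals `S @ L_unsigned @ S`.

```python
import numpy as np

def signed_laplacian(A):
    """Signed Laplacian L = D - A, with D_ii = sum_j |A_ij|."""
    A = np.asarray(A, dtype=float)
    return np.diag(np.abs(A).sum(axis=1)) - A

# Balanced 4-cycle: positive edges inside {0, 1} and {2, 3},
# negative edges between the two clusters.
A = np.array([[ 0,  1,  0, -1],
              [ 1,  0, -1,  0],
              [ 0, -1,  0,  1],
              [-1,  0,  1,  0]])
s = np.array([1, 1, -1, -1])      # +1 for one cluster, -1 for the other
S = np.diag(s)

L_signed = signed_laplacian(A)
L_unsigned = signed_laplacian(np.abs(A))   # all negative edges inverted

spectrum_signed = np.linalg.eigvalsh(L_signed)
spectrum_unsigned = np.linalg.eigvalsh(L_unsigned)
print(spectrum_signed, spectrum_unsigned)
```

The cluster indicator `s` is itself the eigenvector for λ₁ = 0, which is the sign-flipped version of the constant eigenvector of the unsigned graph.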

Page 9: Spectral Analysis of Signed Graphs for Clustering, Prediction and Visualization

The Laplacian Spectrum of Unbalanced Graphs

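• Networks with conflict contain cycles with an odd number of negative edges

• The Laplacian is always positive semidefinite

$x^T L x = \sum_{\{i,j\}} |A_{ij}| (x_i - \operatorname{sgn}(A_{ij}) x_j)^2 \ge 0$, where the sum runs over all edges {i, j}

• In unbalanced networks: λ₁ > 0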
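A small numerical check of the quadratic form, as a sketch: the triangle with one negative edge and the test vector `x` are arbitrary illustration values, and the edge sum counts each undirected edge once.

```python
import numpy as np

def signed_laplacian(A):
    """Signed Laplacian L = D - A, with D_ii = sum_j |A_ij|."""
    A = np.asarray(A, dtype=float)
    return np.diag(np.abs(A).sum(axis=1)) - A

def quadratic_form(A, x):
    """Sum over edges {i, j} of |A_ij| * (x_i - sgn(A_ij) * x_j)^2."""
    A = np.asarray(A, dtype=float)
    total = 0.0
    n = A.shape[0]
    for i in range(n):
        for j in range(i + 1, n):      # each undirected edge once
            if A[i, j] != 0:
                total += abs(A[i, j]) * (x[i] - np.sign(A[i, j]) * x[j]) ** 2
    return total

# Unbalanced triangle: two positive edges and one negative edge.
A = np.array([[0,  1,  1],
              [1,  0, -1],
              [1, -1,  0]])
L = signed_laplacian(A)
x = np.array([0.5, -1.0, 2.0])        # arbitrary test vector

lhs = x @ L @ x
rhs = quadratic_form(A, x)
lambda_1 = np.linalg.eigvalsh(L)[0]   # smallest eigenvalue
print(lhs, rhs, lambda_1)
```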

Page 10: Spectral Analysis of Signed Graphs for Clustering, Prediction and Visualization

Algebraic Conflict

• λ₁ denotes conflict

Network          λ₁
MovieLens 100k   0.4285
MovieLens 1M     0.3761
Jester           0.06515
MovieLens 10M    0.006183
Slashdot Zoo     0.006183
Epinions         0.004438

Rows are ordered from most conflict (top) to least conflict (bottom).

For effect of size, see Appendix

Page 11: Spectral Analysis of Signed Graphs for Clustering, Prediction and Visualization

3. Communities, Cuts and Clustering

The tribal groups of the Eastern Central Highlands of New Guinea can be friends (‘rova’) or enemies (‘hina’)

Graphic uses two lower eigenvectors of L

Page 12: Spectral Analysis of Signed Graphs for Clustering, Prediction and Visualization

Finding Communities

The Laplacian matrix finds communities:

• Communities are connected internally by many positive edges

• Communities are separated by many negative edges

Page 13: Spectral Analysis of Signed Graphs for Clustering, Prediction and Visualization

Signed Spectral Clustering

• Compute the d lower eigenvectors of L

• Use k-means to cluster nodes in this d-dimensional space

• Minimize signed normalized cut between communities X and Y

$\mathrm{SNC}(X, Y) = (|X|^{-1} + |Y|^{-1}) \cdot (2\,\mathrm{pos}(X, Y) + \mathrm{neg}(X, X) + \mathrm{neg}(Y, Y))$

pos(X, Y): number of positive edges between X and Y; neg(X, X): number of negative edges within X
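The following sketch illustrates the two-cluster special case. As a simplification, it uses d = 1 and splits the nodes by the sign of the lowest eigenvector instead of running k-means, and it evaluates the signed normalized cut for a good and a bad split. The example graph and all function names are assumptions for illustration.

```python
import numpy as np

def adjacency(n, positive, negative):
    """Symmetric signed adjacency matrix from two edge lists."""
    A = np.zeros((n, n))
    for i, j in positive:
        A[i, j] = A[j, i] = 1
    for i, j in negative:
        A[i, j] = A[j, i] = -1
    return A

def signed_laplacian(A):
    return np.diag(np.abs(A).sum(axis=1)) - A

def two_clusters(A):
    """Split nodes by the sign of the lowest eigenvector (d = 1, no k-means)."""
    _, eigenvectors = np.linalg.eigh(signed_laplacian(A))
    lowest = eigenvectors[:, 0]
    X = [i for i in range(len(lowest)) if lowest[i] >= 0]
    Y = [i for i in range(len(lowest)) if lowest[i] < 0]
    return X, Y

def signed_normalized_cut(A, X, Y):
    """SNC(X, Y) = (1/|X| + 1/|Y|) * (2 pos(X, Y) + neg(X, X) + neg(Y, Y))."""
    pos_between = np.sum(A[np.ix_(X, Y)] > 0)
    # Each negative edge inside a cluster appears twice in the symmetric block.
    neg_within = (np.sum(A[np.ix_(X, X)] < 0) + np.sum(A[np.ix_(Y, Y)] < 0)) // 2
    return (1 / len(X) + 1 / len(Y)) * (2 * pos_between + neg_within)

# Two positive triangles {0, 1, 2} and {3, 4, 5} joined by two negative edges.
A = adjacency(6,
              positive=[(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)],
              negative=[(2, 3), (0, 5)])
X, Y = two_clusters(A)
print(sorted([X, Y]), signed_normalized_cut(A, X, Y))
print(signed_normalized_cut(A, [0, 1], [2, 3, 4, 5]))  # a worse split
```

For more than two clusters, the rows of the d lowest eigenvectors would be passed to a k-means implementation instead of the sign split.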

Page 14: Spectral Analysis of Signed Graphs for Clustering, Prediction and Visualization

Example: Wikipedia Reverts

• Users revert each other's edits on the controversial Wikipedia article ‘Criticism of Prem Rawat’

• All edges are negative

• Distance to the center is normalized to one

• Four clusters are apparent
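The normalization step could look like the sketch below. It assumes that "center" means the origin of the eigenvector space, which the slide does not state explicitly; the coordinates are made-up values.

```python
import numpy as np

def normalize_rows(coords):
    """Scale every node position to distance 1 from the origin."""
    coords = np.asarray(coords, dtype=float)
    norms = np.linalg.norm(coords, axis=1, keepdims=True)
    norms[norms == 0] = 1.0   # nodes exactly at the origin stay where they are
    return coords / norms

# Hypothetical 2-D spectral coordinates of three nodes.
coords = np.array([[3.0, 4.0],
                   [0.0, -2.0],
                   [-0.5, 0.5]])
unit = normalize_rows(coords)
print(unit)
```

After this step only the direction of each node matters, so clusters show up as groups of nodes at similar angles.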

Page 15: Spectral Analysis of Signed Graphs for Clustering, Prediction and Visualization

4. Resistance, Conductivity and Link Prediction

• Consider a network of electrical resistances:

• Between any two nodes, the network has an effective resistance

• The resistance distance is a squared Euclidean metric

Page 16: Spectral Analysis of Signed Graphs for Clustering, Prediction and Visualization

Link Prediction

• The resistance distance can be used for link prediction:

– Long paths count less
– Parallel paths count more

$\mathrm{dist}(i, j) = (L^+)_{ii} + (L^+)_{jj} - (L^+)_{ij} - (L^+)_{ji}$, where $L^+$ is the pseudoinverse of L

• Problem: How to handle negative edges?
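For unsigned graphs, the resistance distance can be sketched with NumPy's pseudoinverse. The two example graphs (a path and a triangle with unit resistances) are illustration values chosen to show the series and parallel effects; the cutoff `rcond=1e-10` is an assumption to discard the zero eigenvalue robustly.

```python
import numpy as np

def laplacian(A):
    """Unsigned Laplacian L = D - A."""
    A = np.asarray(A, dtype=float)
    return np.diag(A.sum(axis=1)) - A

def resistance_distance(L):
    """dist(i, j) = L+_ii + L+_jj - L+_ij - L+_ji for all node pairs."""
    Lp = np.linalg.pinv(L, rcond=1e-10)
    d = np.diag(Lp)
    return d[:, None] + d[None, :] - Lp - Lp.T

# Path 0-1-2: nodes 0 and 2 are joined only by a path of length two.
path = np.array([[0, 1, 0],
                 [1, 0, 1],
                 [0, 1, 0]])
# Triangle: nodes 0 and 1 are joined directly and via node 2 in parallel.
triangle = np.array([[0, 1, 1],
                     [1, 0, 1],
                     [1, 1, 0]])

R_path = resistance_distance(laplacian(path))
R_triangle = resistance_distance(laplacian(triangle))
print(R_path[0, 1], R_path[0, 2])   # single edge vs. two edges in series
print(R_triangle[0, 1])             # edge plus a parallel path
```

In this example the series connection doubles the distance, while the extra parallel path reduces the distance of a direct edge from 1 to 2/3.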

Page 17: Spectral Analysis of Signed Graphs for Clustering, Prediction and Visualization

Voltage Inversion

• Solution: inverting amplifier

$\mathrm{dist}(i, j) = (L^+)_{ii} + (L^+)_{jj} - (L^+)_{ij} - (L^+)_{ji}$

• Using signed Laplacian L

• Is squared Euclidean because L is positive semidefinite

[Figure: circuit with edge weights w and −w]
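The same formula can be evaluated with the signed Laplacian. This sketch uses an unbalanced triangle as a made-up example, for which L is positive definite and the pseudoinverse coincides with the ordinary inverse; the function names are assumptions.

```python
import numpy as np

def signed_laplacian(A):
    """Signed Laplacian L = D - A, with D_ii = sum_j |A_ij|."""
    A = np.asarray(A, dtype=float)
    return np.diag(np.abs(A).sum(axis=1)) - A

def signed_resistance_distance(A):
    """dist(i, j) = L+_ii + L+_jj - L+_ij - L+_ji with the signed Laplacian."""
    Lp = np.linalg.pinv(signed_laplacian(A), rcond=1e-10)
    d = np.diag(Lp)
    return d[:, None] + d[None, :] - Lp - Lp.T

# Triangle with positive edges 0-1 and 0-2 and a negative edge 1-2.
A = np.array([[0,  1,  1],
              [1,  0, -1],
              [1, -1,  0]])
R = signed_resistance_distance(A)
print(R[0, 1], R[1, 2])   # positive edge vs. negative edge
```

In this example the pair joined by the negative edge ends up farther apart (distance 2) than the pairs joined by positive edges (distance 1).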

Page 18: Spectral Analysis of Signed Graphs for Clustering, Prediction and Visualization

Evaluation: Link Sign Prediction

• Task: Predict the sign of new links

• Problem: Find a function F(A) = B

[Figure: known positive links (A), known negative links (A), and links to be predicted (B)]
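One possible choice of F, sketched under assumptions: the sign of an unknown link is read off the corresponding entry of the signed regularized Laplacian kernel $(I + \alpha L)^{-1}$ computed from the known links. The tiny example graphs, the value α = 1 and the function names are illustration choices, not the evaluation setup of the talk.

```python
import numpy as np

def signed_laplacian(A):
    """Signed Laplacian L = D - A, with D_ii = sum_j |A_ij|."""
    A = np.asarray(A, dtype=float)
    return np.diag(np.abs(A).sum(axis=1)) - A

def predict_sign(A, i, j, alpha=1.0):
    """Predict the sign of link (i, j) from the kernel (I + alpha L)^-1."""
    L = signed_laplacian(A)
    K = np.linalg.inv(np.eye(len(L)) + alpha * L)
    return int(np.sign(K[i, j]))

# Known links: 0-1 is positive, 1-2 is negative; link 0-2 is unknown.
friend_of_enemy = np.array([[0,  1,  0],
                            [1,  0, -1],
                            [0, -1,  0]])
# Known links: 0-1 and 1-2 are both negative; link 0-2 is unknown.
enemy_of_enemy = np.array([[ 0, -1,  0],
                           [-1,  0, -1],
                           [ 0, -1,  0]])

print(predict_sign(friend_of_enemy, 0, 2))
print(predict_sign(enemy_of_enemy, 0, 2))
```

On these two toy graphs the prediction follows the multiplication rule: the enemy of a friend is predicted as a foe, and the enemy of an enemy as a friend.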

Page 19: Spectral Analysis of Signed Graphs for Clustering, Prediction and Visualization

Graph Kernels

Link prediction functions using the Laplacian:

• $L^+$ – Signed Laplacian kernel

• $(I + \alpha L)^{-1}$ – Signed regularized Laplacian kernel

• $\exp(-\alpha L)$ – Signed ‘heat diffusion’

Other link prediction functions:

• $(A)_k$ – Rank reduction

• $\exp(A)$ – Matrix exponential

• $\mathrm{Poly}(A)$ – Path counting
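All three Laplacian kernels are functions of the spectrum of L, so one eigendecomposition is enough to compute them. The sketch below assumes a symmetric signed Laplacian; the example graph, the value of α and the tolerance for treating an eigenvalue as zero are illustration choices.

```python
import numpy as np

def signed_laplacian(A):
    """Signed Laplacian L = D - A, with D_ii = sum_j |A_ij|."""
    A = np.asarray(A, dtype=float)
    return np.diag(np.abs(A).sum(axis=1)) - A

def laplacian_kernels(L, alpha=0.5, tol=1e-10):
    """Return (L+, (I + alpha L)^-1, exp(-alpha L)) from one eigendecomposition."""
    eigenvalues, U = np.linalg.eigh(L)
    # Pseudoinverse: invert the non-zero eigenvalues, keep zeros at zero.
    inverse = np.array([1.0 / lam if lam > tol else 0.0 for lam in eigenvalues])
    pseudoinverse = (U * inverse) @ U.T
    regularized = (U / (1.0 + alpha * eigenvalues)) @ U.T
    heat = (U * np.exp(-alpha * eigenvalues)) @ U.T
    return pseudoinverse, regularized, heat

# Balanced path: 0-1 positive, 1-2 negative (L has one zero eigenvalue).
A = np.array([[0,  1,  0],
              [1,  0, -1],
              [0, -1,  0]])
L = signed_laplacian(A)
pseudoinverse, regularized, heat = laplacian_kernels(L, alpha=0.5)
print(regularized)
```

Multiplying `U` by a vector scales its columns, so each kernel is `U f(Λ) U^T` for the corresponding scalar function f.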

Page 20: Spectral Analysis of Signed Graphs for Clustering, Prediction and Visualization

Evaluation Results

• MovieLens: predict good / bad rating

• Best rating prediction: signed regularized Laplacian graph kernel

Link prediction function       RMSE
Rank reduction                 0.838
Path counting                  0.840
Matrix exponential             0.839
Signed resistance distance     0.812
Signed regularized Laplacian   0.778
Signed heat diffusion          0.789

Page 21: Spectral Analysis of Signed Graphs for Clustering, Prediction and Visualization

Summary

• The Laplacian matrix applies to signed graphs

• The Laplacian spectrum denotes graph conflict

• The signed Laplacian arises in several ways:

– For graph drawing, the Laplacian implements antipodal proximity
– For clustering, the Laplacian implements signed cuts
– For electrical networks, the Laplacian interprets negation as inversion of electrical potential

Page 22: Spectral Analysis of Signed Graphs for Clustering, Prediction and Visualization

Thank You

Page 23: Spectral Analysis of Signed Graphs for Clustering, Prediction and Visualization

References

P. Hage, F. Harary. Structural Models in Anthropology. Cambridge University Press, 1983.

F. Harary. On the notion of balance of a signed graph. Michigan Math. J., 2:143–146, 1953.

J. Kunegis, A. Lommatzsch, C. Bauckhage. The Slashdot Zoo: Mining a social network with negative edges. Proc. Int. World Wide Web Conf., pages 741–750, 2009.

J. Leskovec, D. Huttenlocher, J. Kleinberg. Predicting positive and negative links in online social networks. Proc. Int. World Wide Web Conf., 2010.

Page 24: Spectral Analysis of Signed Graphs for Clustering, Prediction and Visualization

Appendix – Balance vs Volume

Page 25: Spectral Analysis of Signed Graphs for Clustering, Prediction and Visualization

Appendix – Scalability

• Evaluation results as a function of the reduced rank k

Page 26: Spectral Analysis of Signed Graphs for Clustering, Prediction and Visualization

Appendix – Balance, Conflict and the Graph Spectrum

Look at triads of users (Harary 1953):

• In balanced triangles, the multiplication rule holds: the product of the three edge signs is positive

• If it doesn't, there is conflict

[Figure: examples of balanced and conflicting triangles]
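The multiplication rule can be written as a one-line check. In this sketch, edge signs are encoded as +1 and −1, and the function name is a hypothetical choice.

```python
def is_balanced_triangle(sign_uv, sign_vw, sign_uw):
    """A triangle is balanced when the product of its three edge signs is positive.

    Equivalently, sign_uw equals sign_uv * sign_vw (the multiplication rule).
    """
    return sign_uv * sign_vw * sign_uw > 0

# All four sign patterns of a triangle, up to relabeling of the nodes.
print(is_balanced_triangle(+1, +1, +1))  # friend of a friend is a friend
print(is_balanced_triangle(+1, -1, -1))  # enemy of a friend is an enemy
print(is_balanced_triangle(+1, +1, -1))  # two friends of mine are enemies
print(is_balanced_triangle(-1, -1, -1))  # three mutual enemies
```

The first two patterns are balanced; the last two contain an odd number of negative edges and therefore indicate conflict.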

Page 27: Spectral Analysis of Signed Graphs for Clustering, Prediction and Visualization

The Signed Clustering Coefficient

• How many triangles are balanced?

Cs = (#balanced − #unbalanced) / #possible

• This measure is local, not global (Kunegis 2009)

[Figure: what is the sign (±) of the edge uv between nodes u and v?]
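A possible matrix formulation of this coefficient for an undirected signed graph is sketched below. It assumes that "#possible" counts the paths of length two between distinct nodes, and it counts every triangle once per such path; this reading of the slide, like the example graphs, is an assumption.

```python
import numpy as np

def signed_clustering_coefficient(A):
    """(#balanced - #unbalanced) closed paths of length two / #paths of length two."""
    A = np.asarray(A, dtype=float)
    off_diagonal = ~np.eye(len(A), dtype=bool)
    # Entry (i, j) of A * (A @ A): sum over k of the sign product of triangle i-k-j.
    closed = np.sum((A * (A @ A))[off_diagonal])
    # Entry (i, j) of |A| @ |A|: number of paths i-k-j, regardless of sign.
    possible = np.sum((np.abs(A) @ np.abs(A))[off_diagonal])
    return closed / possible if possible > 0 else 0.0

balanced = np.array([[0, 1, 1],
                     [1, 0, 1],
                     [1, 1, 0]])
unbalanced = np.array([[0,  1,  1],
                       [1,  0, -1],
                       [1, -1,  0]])
# One balanced triangle (0, 1, 2) and one unbalanced triangle (1, 2, 3).
mixed = np.array([[0, 1,  1,  0],
                  [1, 0,  1,  1],
                  [1, 1,  0, -1],
                  [0, 1, -1,  0]])

print(signed_clustering_coefficient(balanced))
print(signed_clustering_coefficient(unbalanced))
print(signed_clustering_coefficient(mixed))
```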

Page 28: Spectral Analysis of Signed Graphs for Clustering, Prediction and Visualization

Introduction: Networks

• Many websites allow you to have friends:

Example: Facebook