finding all maximal cliques in very large social networks

Finding All Maximal Cliques in Very Large Social Networks 16 March 2016, EDBT 2016, Bordeaux, France Alessio Conte°, Roberto De Virgilio § , Antonio Maccioni § , Maurizio Patrignani § , Riccardo Torlone § °Università di Pisa § Università Roma Tre


Page 1: Finding All Maximal Cliques in Very Large Social Networks

Finding All Maximal Cliques in Very Large Social Networks

16 March 2016, EDBT 2016, Bordeaux, France

Alessio Conte°, Roberto De Virgilio§, Antonio Maccioni§, Maurizio Patrignani§, Riccardo Torlone§

°Università di Pisa §Università Roma Tre

Page 2: Finding All Maximal Cliques in Very Large Social Networks

Social Network Analysis

Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 1

Page 3: Finding All Maximal Cliques in Very Large Social Networks

Community Detection


Page 4: Finding All Maximal Cliques in Very Large Social Networks

Cliques

[Figure: example graph on nodes A, J, H, F, Z, with cliques on {A, J}, {A, J, F} and {A, J, H, F}.]

Page 5: Finding All Maximal Cliques in Very Large Social Networks

Maximal Clique Enumeration (MCE)

[Figure: example graph broken down into its maximal cliques; node labels A, J, H, F, D, E, S, Y, G, U, W.]

Page 6: Finding All Maximal Cliques in Very Large Social Networks

Distributed MCE

[Cheng et al., KDD 2012] [Cheng et al., SIGMOD 2010]

Block size m (max number of nodes) = 5

[Figure: example graph split into blocks of at most m nodes; node labels A, J, H, F, D, U, S, W, E, G, Y, with D, E and S appearing in more than one block.]


Page 9: Finding All Maximal Cliques in Very Large Social Networks

Distributed MCE

[Cheng et al., KDD 2012] [Cheng et al., SIGMOD 2010]

Block size m (max number of nodes) = 5

[Figure: larger example graph; node labels A, J, H, F, Z, R, P, D, L, E, S, W, U, X, G, Y.]

Page 10: Finding All Maximal Cliques in Very Large Social Networks

Distributed MCE

[Cheng et al., KDD 2012] [Cheng et al., SIGMOD 2010]

Block size m (max number of nodes) = 5

[Figure: the larger graph split into blocks of at most m nodes; node labels A, J, H, F, Z, R, P, D, L, E, S, W, U, X, G, Y, with D, E and S appearing in more than one block.]


Page 12: Finding All Maximal Cliques in Very Large Social Networks

Distributed MCE

[Cheng et al., KDD 2012] [Cheng et al., SIGMOD 2010]

Block size m (max number of nodes) = 5

[Figure: the larger graph split into blocks of at most m nodes; the group D, E, S is highlighted with the label "undetected cliques".]

Page 13: Finding All Maximal Cliques in Very Large Social Networks

Hub Nodes Effect

Block size m (max number of nodes) = 5

[Figure: example graph with node labels A, J, H, F, Z, R, P, D, L, E, S, W, U, X, G, Y.]

Page 14: Finding All Maximal Cliques in Very Large Social Networks

The Block Size Effect

* Taken from [Cheng et al., KDD 2012]

Efficiency vs. completeness/correctness

Page 15: Finding All Maximal Cliques in Very Large Social Networks

Overview of the Approach

[Figure: pipeline of the approach]

- Input: G = (N, E)
- 1st level decomposition: N is split into N_f and N_h
- 2nd level decomposition: N_f is split into Block 1 ... Block z
- Block analysis and FIND MAX CLIQUES on the blocks produce C_f
- Induced graph of N_h: FIND MAX CLIQUES is applied again and produces C_h
- Output: C = c_1, c_2, ..., c_n, with C = C_f U C_h

Page 16: Finding All Maximal Cliques in Very Large Social Networks

1st Level Decomposition

Separate hubs from the rest of the nodes in N
- according to a maximum block size m

[Figure: inside FIND MAX CLIQUES, the 1st level decomposition splits the nodes of G = (N, E) into N_f and N_h; example graph with node labels A, J, H, F, Z, R, P, L, W, U, X, G, Y, with D, E, S set apart.]
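The split on this slide can be sketched in a few lines. This is a minimal sketch, not the authors' implementation: it assumes the graph is stored as a dict of neighbour sets and that a hub is a node of degree at least m, which is the threshold used in the convergence theorem of this talk. The function name `split_hubs` and the toy graph are hypothetical.

```python
def split_hubs(adj, m):
    """Split nodes into non-hubs (degree < m) and hubs (degree >= m)."""
    feasible = {v for v, nbrs in adj.items() if len(nbrs) < m}
    hubs = set(adj) - feasible
    return feasible, hubs

# Hypothetical toy graph: "D" is adjacent to every other node.
adj = {
    "A": {"D", "J"},
    "J": {"A", "D"},
    "H": {"D"},
    "F": {"D"},
    "D": {"A", "J", "H", "F"},
}

feasible, hubs = split_hubs(adj, m=4)
print(sorted(feasible), sorted(hubs))
```

A node with m or more neighbours cannot share a block of at most m nodes with its whole neighbourhood, which is one way to read why such nodes are handled separately.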

Page 17: Finding All Maximal Cliques in Very Large Social Networks

1st Level Decomposition - Lemma

The set of all maximal cliques C of G can be obtained by computing C_f and C_h alone
- C = C_f U C_h
- C_f is the set of cliques with at least one node in N_f
- C_h is the set of cliques with all the nodes in N_h
- the proof of this lemma is in our paper
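The partition in the lemma can be checked by hand on a tiny graph. The sketch below is only an illustration under stated assumptions: hubs are taken to be the nodes of degree at least m, maximal cliques are found by exhaustive search (usable only on very small graphs), and all function names and the toy graph are hypothetical.

```python
from itertools import combinations

def is_clique(adj, nodes):
    """True if every pair of the given nodes is adjacent."""
    return all(v in adj[u] for u, v in combinations(nodes, 2))

def maximal_cliques_bruteforce(adj):
    """All maximal cliques by exhaustive search (tiny graphs only)."""
    nodes = sorted(adj)
    cliques = [frozenset(s)
               for r in range(1, len(nodes) + 1)
               for s in combinations(nodes, r)
               if is_clique(adj, s)]
    # Keep only cliques that are not strictly contained in another clique.
    return {c for c in cliques if not any(c < d for d in cliques)}

def partition_cliques(cliques, hubs):
    """Split maximal cliques into C_f (touch a non-hub) and C_h (hubs only)."""
    c_h = {c for c in cliques if c <= hubs}
    return cliques - c_h, c_h

# Hypothetical toy graph: triangle D-E-S, each with one extra neighbour.
adj = {
    "D": {"E", "S", "A"},
    "E": {"D", "S", "B"},
    "S": {"D", "E", "C"},
    "A": {"D"},
    "B": {"E"},
    "C": {"S"},
}
m = 3
hubs = {v for v in adj if len(adj[v]) >= m}
cliques = maximal_cliques_bruteforce(adj)
c_f, c_h = partition_cliques(cliques, hubs)
print(sorted(map(sorted, c_f)), sorted(map(sorted, c_h)))
```

Here the triangle {D, E, S} lies entirely among the hubs and lands in C_h, while the three edges to A, B and C land in C_f.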

Page 18: Finding All Maximal Cliques in Very Large Social Networks

2nd Level Decomposition

[Figure: inside FIND MAX CLIQUES, the 2nd level decomposition splits N_f into Block 1 ... Block z.]

Page 19: Finding All Maximal Cliques in Very Large Social Networks

2nd Level Decomposition

Legend: kernel node, border node, visited node

[Figure: block B1 being built over nodes A, J, H during the 2nd level decomposition of N_f.]


Page 21: Finding All Maximal Cliques in Very Large Social Networks

2nd Level Decomposition

Legend: kernel node, border node, visited node

[Figure: block B1 extended to nodes A, J, H, F, D during the 2nd level decomposition of N_f.]

Page 22: Finding All Maximal Cliques in Very Large Social Networks

2nd Level Decomposition

Legend: kernel node, border node, visited node
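The slides do not spell out how blocks are grown, so the following is only a guess at the general idea, not the procedure from the paper. It assumes that each non-hub node becomes a kernel node of exactly one block, that a block also holds the neighbours of its kernel nodes as border nodes, and that a block never exceeds m nodes. The greedy order and the name `build_blocks` are hypothetical.

```python
def build_blocks(adj, feasible, m):
    """Greedily group non-hub nodes into blocks of at most m nodes.

    Returns a list of (kernel, border) pairs of sets.
    """
    order = sorted(feasible)          # fixed order, for reproducibility
    assigned = set()
    blocks = []
    for seed in order:
        if seed in assigned:
            continue
        kernel = {seed}
        block = {seed} | adj[seed]    # a kernel node brings its neighbours
        for v in order:
            if v in assigned or v in kernel:
                continue
            grown = block | {v} | adj[v]
            if len(grown) <= m:       # v fits together with its neighbours
                kernel.add(v)
                block = grown
        assigned |= kernel
        blocks.append((kernel, block - kernel))
    return blocks

# Hypothetical toy graph; with m = 4 every node has degree < m.
adj = {
    "A": {"J", "H"},
    "J": {"A", "H"},
    "H": {"A", "J", "F"},
    "F": {"H", "D"},
    "D": {"F"},
}
blocks = build_blocks(adj, feasible=set(adj), m=4)
for kernel, border in blocks:
    print(sorted(kernel), sorted(border))
```

Because every neighbour of a kernel node is inside the same block, any clique containing a kernel node is fully contained in that block and can be found locally.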

Page 23: Finding All Maximal Cliques in Very Large Social Networks

Maximal Clique Enumeration

There are many algorithms for MCE
- no single algorithm outperforms all the others
- but each has specific advantages

Examples: Tomita et al., Eppstein et al., ...

[Figure: inside FIND MAX CLIQUES, block analysis is applied to Block 1 ... Block z to produce C_f.]
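As one concrete instance of such an algorithm, here is a textbook sketch of Bron-Kerbosch recursion with a pivot chosen to maximise the number of candidate neighbours, in the style usually attributed to Tomita et al. It is not the code used in the talk; the adjacency-dict representation and the toy graph are assumptions.

```python
def bron_kerbosch_pivot(adj, r=frozenset(), p=None, x=None):
    """Yield every maximal clique of the graph as a frozenset.

    r: current clique, p: candidate nodes, x: already explored nodes.
    """
    if p is None:
        p, x = set(adj), set()
    if not p and not x:
        yield r                       # r cannot be extended: it is maximal
        return
    # Pivot: the node of P | X with the most neighbours in P.
    pivot = max(p | x, key=lambda u: len(p & adj[u]))
    # Only candidates outside the pivot's neighbourhood need branching.
    for v in list(p - adj[pivot]):
        yield from bron_kerbosch_pivot(adj, r | {v}, p & adj[v], x & adj[v])
        p.remove(v)
        x.add(v)

# Hypothetical toy graph: triangle A-J-H plus the edge H-F.
adj = {
    "A": {"J", "H"},
    "J": {"A", "H"},
    "H": {"A", "J", "F"},
    "F": {"H"},
}
found = set(bron_kerbosch_pivot(adj))
print(sorted(map(sorted, found)))
```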

Page 24: Finding All Maximal Cliques in Very Large Social Networks

Block Analysis

We determine the best-fit MCE algorithm for each block

[Figure: block analysis takes a block and selects the best-fit algorithm (Tomita et al., Eppstein et al., ...); within FIND MAX CLIQUES this is applied to Block 1 ... Block z to produce C_f.]

Page 25: Finding All Maximal Cliques in Very Large Social Networks

Decision Tree


Page 26: Finding All Maximal Cliques in Very Large Social Networks

Recursion Over Hub Nodes

[Figure: FIND MAX CLIQUES is applied again to the graph induced by N_h, producing C_h.]

Page 27: Finding All Maximal Cliques in Very Large Social Networks

Recursion Over Hub Nodes

The graph induced by the hub nodes is processed recursively

Legend: kernel node, border node, visited node

[Figure: block B12 over the hub nodes D, E, S; FIND MAX CLIQUES on the graph induced by N_h produces C_h.]

Page 28: Finding All Maximal Cliques in Very Large Social Networks

Convergence Guarantee

If the degeneracy of the graph is lower than the block size, the whole process converges

Theorem. Let G be a graph and let G_i, with i = 1, 2, 3, ..., be a sequence of subgraphs of G such that G_1 = G and G_i, for i > 1, is the graph induced by the nodes of G_{i-1} of degree greater than or equal to m. Let the degeneracy d of G be strictly less than m + 1.

1. There is a value q such that all G_j, with j ≥ q, are empty graphs.

2. There exists a graph with n nodes for which q is Ω(n).
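The sequence G_1, G_2, ... in the theorem can be generated directly. The sketch below is a minimal illustration under the theorem's own hub rule (degree greater than or equal to m). The helper names and the toy graph are hypothetical, and the guard for a step that removes no node is an addition of this sketch.

```python
def induced_subgraph(adj, keep):
    """Graph induced by the node set `keep`."""
    return {v: adj[v] & keep for v in keep}

def hub_sequence(adj, m):
    """Return [G_1, G_2, ...] up to and including the first empty graph."""
    graphs = [adj]
    while graphs[-1]:
        g = graphs[-1]
        hubs = {v for v, nbrs in g.items() if len(nbrs) >= m}
        if hubs == set(g):
            # Every node is a hub, so the next graph would be identical.
            raise ValueError("no progress: every node has degree >= m")
        graphs.append(induced_subgraph(g, hubs))
    return graphs

# Hypothetical toy graph: triangle D-E-S, each with one extra neighbour.
adj = {
    "D": {"E", "S", "A"},
    "E": {"D", "S", "B"},
    "S": {"D", "E", "C"},
    "A": {"D"},
    "B": {"E"},
    "C": {"S"},
}
graphs = hub_sequence(adj, m=3)
print([sorted(g) for g in graphs])
```

On this toy graph G_2 is the triangle D-E-S, whose nodes all have degree 2, so G_3 is already empty.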

Page 29: Finding All Maximal Cliques in Very Large Social Networks

Degeneracy and Sparsity

A measure of the sparsity of a graph
- it is the highest value d for which the network contains a d-core; a d-core is obtained by recursively removing nodes with degree less than d
- degeneracy is typically < 100 on scale-free real graphs
- Facebook has a degeneracy of ~54
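The definition above translates into a short peeling procedure: repeatedly remove a node of minimum degree and record the largest degree seen at removal time. This is a simple quadratic-time sketch for small graphs, assuming an adjacency dict; the function name and the toy graphs are hypothetical.

```python
def degeneracy(adj):
    """Largest d such that the graph contains a non-empty d-core."""
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}
    d = 0
    while remaining:
        # Peel off a node of minimum degree in the remaining graph.
        v = min(remaining, key=lambda u: len(remaining[u]))
        d = max(d, len(remaining[v]))
        for u in remaining.pop(v):
            remaining[u].discard(v)
    return d

# Hypothetical toy graphs.
triangle_with_tails = {
    "D": {"E", "S", "A"},
    "E": {"D", "S", "B"},
    "S": {"D", "E", "C"},
    "A": {"D"},
    "B": {"E"},
    "C": {"S"},
}
path = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
print(degeneracy(triangle_with_tails), degeneracy(path))
```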

Page 30: Finding All Maximal Cliques in Very Large Social Networks

Experiments: Decomposition


Page 31: Finding All Maximal Cliques in Very Large Social Networks

Experiments: the Block Size Effect


Page 32: Finding All Maximal Cliques in Very Large Social Networks

Experiments: Decision Tree


Page 33: Finding All Maximal Cliques in Very Large Social Networks

Experiments: Effectiveness


Page 34: Finding All Maximal Cliques in Very Large Social Networks

Conclusion

An approach for computing maximal cliques over an arbitrarily large graph
- taking advantage of distributed computation
- leveraging the advantages of different existing algorithms for MCE

Completeness and correctness are not compromised by hub nodes
- unlike the state of the art

Page 35: Finding All Maximal Cliques in Very Large Social Networks

Future Work

Take into account the “semantics” of the graph in order to search for specific kinds of cliques

Extend our framework to compute more relaxed communities
- k-plexes, k-clans, k-cores, etc.

Page 36: Finding All Maximal Cliques in Very Large Social Networks

Thanks For The Attention
