scalable community detection benchmark generation

Scalable Community Detection

Benchmark Generation

Jonathan Berry1 Cynthia Phillips1

Siva Rajamanickam1 George M. Slota2

1Sandia National Labs, 2Rensselaer Polytechnic [email protected], [email protected], [email protected], [email protected]

SIAM CSC 6 June 2018

Sandia National Laboratories is a multimission laboratory managed and operated by National Technology &Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International Inc., for the U.S.

Department of Energys National Nuclear Security Administration under contract DE-NA0003525

1 / 13

Motivation

Better evaluate community detection algorithmsprocessing O(Billion)-sized graphs on HPC resources

Small-scale state-of-the-art: “LFR”Lancichinetti, Fortunato, Radicchi, 2008With >1600 citations, this is a de facto standardGenerates ground truth to test againstHas a tunable parameter for community coherence: µLimited scalability: best implementation takes ∼17hrs togenerate O(1B) edges (Hamann et al., 2017)

Large-scale state-of-the-artWithout a reliable ground truth, parallel algorithms testwith modularity or similar measuresThis approach is flawed in several ways

Goal: evaluate at HPC scale against ground truth2 / 13

Overview

Primary results of this work:

We develop a novel method for generating large-scalegraphs with a tunable ground truth community structure

We utilize the scalable BTER generator (Kolda et al.,2014) as a core step

Our approach generates large-scale communitybenchmarking graphs at a rate of 1B edge/minute onKNL– Orders-of-magnitude faster than state-of-the-art

3 / 13

BTERBlock Two-level Erdos-Reyni Graph Generator

Step 0: Input degree (nd) and clustering coefficient (cd) distributions

Step 1: With ordered degree sequence, group d + 1 vertices v of degree d(v) >= d into affinity blocksStep 2: Use Erdos-Renyi probability pd = 3√cd to create intra-block edges via G(n, p) processStep 3: Create inter-block edges via Chung-Lu process

4 / 13


Step 0: Input degree (nd) and clustering coefficient (cd) distributionsStep 1: With ordered degree sequence, group d + 1 vertices v of degree d(v) >= d into affinity blocks

Step 2: Use Erdos-Renyi probability pd = 3√cd to create intra-block edges via G(n, p) processStep 3: Create inter-block edges via Chung-Lu process

4 / 13


Step 0: Input degree (nd) and clustering coefficient (cd) distributionsStep 1: With ordered degree sequence, group d + 1 vertices v of degree d(v) >= d into affinity blocksStep 2: Use Erdos-Renyi probability pd = 3√cd to create intra-block edges via G(n, p) process

Step 3: Create inter-block edges via Chung-Lu process

4 / 13


Step 0: Input degree (nd) and clustering coefficient (cd) distributionsStep 1: With ordered degree sequence, group d + 1 vertices v of degree d(v) >= d into affinity blocksStep 2: Use Erdos-Renyi probability pd = 3√cd to create intra-block edges via G(n, p) processStep 3: Create inter-block edges via Chung-Lu process

4 / 13

Our Implementation - For Community DetectionwBTER – wrapped BTER

How we wrap the baseline BTER process for generatinggraphs for community detection benchmarking:

Treat affinity blocks as ground truth communities

We have a native µn, based on ratio of inter- tointra-block edges generated from the original distributions

Can shift µn to some target goal µg via a Linear Programsolve (to be described) – we use Pyomo and CBC

Our BTER implementation: fully-parallelized inshared-memory with OpenMP/C++

5 / 13

Linear ProgramShifting the native µ of a graph’s CC distribution

Minimally shift the input clustering coefficient (CC) distribution suchthat the output graph has a desired goal µg considering both definitions:

µg =1

N

∑d

dinterd

µg =1

2M

∑d

nddinter

minimize∑d

|pd − pd|

subject to∑d

ndpd = N(1− µg)∑d

dndpd = 2M(1− µg)

0 ≤ pd ≤ 1output cd = p3d

pd is G(n, p) probabilities per degree from CC distribution cd, pd = 3√cd

pd is output probabilities to get new CC distribution cd, cd = p3dnd is degree distribution, n vertices of d degreedinter is expected number of inter-community edges for vertex of degree dN is number of vertices in graph, M is number of edges

6 / 13

Experimental SetupTest system and test graphs

Test System: Bowman at Sandia Labs – each node has aKNL with 68 cores, 96 GB DDR, and 16 GB MCDRAM

Test Graphs:

Network n m davg dmax D

LJ-fp 4.2 M 27 M 18 20 K 18uk-2002 18 M 261 M 28 195 K 28Wikilinks 26 M 332 M 23 39 K 170RMAT 26 67 M 1.1 B 16 6.7 K 8Friendster 66 M 1.8 B 27 5.2 K 34

Graphs are from the SNAP, Koblenz, and LAW databases.LiveJournal-fp is a parsed version of LiveJournal from SNAP.

7 / 13

Shifting DistributionHow the CC distribution shifts for varying µ

Only every 5th value plotted for better visualizationGenerally, distribution is most “accurate” near native µBetter smoothing of distribution via LP constraints is future work

0

0.2

0.4

0.6

0.8

1

2 100 500 1000 2000

Vertex Degree

Clu

ste

rin

g C

oe

ffic

ien

t

Native 0.05 0.3 0.5 0.7

8 / 13

Hitting Target MuAccuracy of LP for generating desired µ

Generation accuracy is comparable to LFRLess than 5% error in most instancesError is greatest at lower µ targets

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7

Target−mu

Ou

tpu

t−m

u

LJ−fp uk−2002 WikiLinks RMAT_26 Friendster

9 / 13

Generation time vs. target µ(Left) Time vs. µ – (Right) Time vs. graph scale

Strong scaling generally good up to 2 threads/coreTime decreases with increasing µ, due to coupon collectors edgegeneration scaling - higher CC requires more attempts for each edgeGeneration time a function of scale and complexity (max degree)

Average ∼2 minutes for 1.8B unique edgesOriginal BTER code: ∼4 min. for 1.2B edges on 32 node Hadoop cluster

Fastest LFR implementation: 17 hours for 1B edges in shared-memory

0

10

100

500

17 32 68 136 204 272

Threads

Ge

ne

ratio

n T

ime

(s)

LJ−fp uk−2002 WikiLinks RMAT_26 Friendster

0

50

100

150

0 500 1000 1500

Number of Edges (Millions)

Ge

ne

ratio

n T

ime

(s)

0.05

0.1

0.2

0.3

0.4

0.5

0.6

0.7

10 / 13

A Note on BTER Assortativity

An issue with our approach so far is the degreehomogeneity of communities

We propose the following addition:

Consider intra-comm edge count of each vertexPermute community assignments of all vertices withsame countObservation: won’t affect µ, de-homogenizescommunities in terms of degree

This approach might also be applied to baseline BTERgeneration

11 / 13

Timing BreakdownFull wBTER approach with community permutation, µ = 0.5

Time costs of majorwBTER steps withcommunity assignmentpermutation

Work Complexity:d = Dmax, n = |V |,m = |E|

LP: expected to scaleas O(d log d)EdgeGen: O(m log d)Finalize: O(n+m)CSR: O(n+m)Swap: O(n log n+m)

0

30

60

90

LJ−fp

uk−2002

WikiLinks

RM

AT_26

Friendster

Graph

Tim

e (

s)

Parts

LP

EdgeGen

Finalize

CSR

Swap

LP: linear programEdgeGen: primary BTER phaseFinalize: remove 0-degree vertices & cleanupCSR: create graph representationSwap: community degree permutation

12 / 13

Conclusionsand future work

We shift a graph’s CCD to fit a µ generated by BTER

Our approach can output graphs for community detectionorder-of-magnitudes faster than commonly-usedgenerators, e.g., LFR

Our approach can output graphs with more realisticdegree and CC distributions than commonly-usedgenerators

Future Work:Better develop LP to reduce noise in output CC distributionShift graph scale – i.e., output equivalent distributions for agraph with 2×, 12× original scaleDevelop generation methods for hierarchical communities

www.gmslota.com, [email protected] / 13

scalable community detection benchmark generation

Documents