
Structure-Aware Distance Measures

for Comparing Clusterings in Graphs

Jeffrey Chan*, Nguyen Xuan Vinh, Wei Liu, James Bailey, Christopher A. Leckie, Kotagiri Ramamohanarao, and Jian Pei

1 Department of Computing and Information Systems, University of Melbourne, Australia
[email protected]

2 School of Computing Science, Simon Fraser University, BC, Canada

Abstract. Clustering in graphs aims to group vertices with similar patterns of connections. Applications include discovering communities and latent structures in graphs. Many algorithms have been proposed to find graph clusterings, but an open problem is the need for suitable comparison measures to quantitatively validate these algorithms, perform consensus clustering and track evolving (graph) clusters across time. To date, most comparison measures have focused on comparing the vertex groupings, and completely ignore the difference in the structural approximations in the clusterings, which can lead to counter-intuitive comparisons. In this paper, we propose new measures that account for differences in the approximations. We focus on comparison measures for two important graph clustering approaches, community detection and blockmodelling, and propose comparison measures that work for weighted (and unweighted) graphs.

Keywords: blockmodelling, community, clustering, comparison, structural, weighted graphs.

1 Introduction

Many data are relational, including friendship graphs, communication networks and protein-protein interaction networks. An important type of analysis is graph clustering, which involves grouping the vertices based on the similarity of their connectivity. Graph clusterings are used to discover communities with similar interests (for marketing) and to discover the inherent structure of graphs. Despite the popularity of graph clustering, existing graph cluster comparison measures have focused on using membership-based measures. To the best of the authors' knowledge, there is no work that evaluates whether membership-based comparison measures are appropriate for comparing graph clusterings. This analysis is important, as comparison measures are often used for comparing algorithms [1][2] and form the basis of consensus clustering [3] and tracking algorithms [4]; knowing which properties measures possess allows us to evaluate whether a measure is appropriate for a task, and if not, to propose new ones that are. One of

* Corresponding author.

V.S. Tseng et al. (Eds.): PAKDD 2014, Part I, LNAI 8443, pp. 362–373, 2014.
© Springer International Publishing Switzerland 2014


(a) Graph. (b) Adjacency matrix. (c) Image diagram.

Fig. 1. Example of an online Q&A forum. Each vertex corresponds to a forum user, and each edge represents replying between users (Figure 1a). For the corresponding adjacency matrix (Figure 1b), the positions induce a division of the adjacency matrix into blocks, delineated by the red dotted lines. Edge weights in Figure 1c are the expected edge probabilities between each pair of positions.

the important properties a measure should possess is the ability to be "spatially"/structurally aware. In traditional clustering of points, it was found that clusterings could be similar in memberships, but their points distributed very differently. This can lead to counter-intuitive situations where such clusterings were considered similar [5][6]. We show the same scenario occurs when comparing graph clusterings. Therefore, in this paper, we address the open problems of analysing and showing why membership comparison measures can be inadequate for comparing graph clusterings, and then propose new measures that are aware of structural differences. We shortly illustrate why comparison measures should be structure aware, but we first introduce graph clustering.

Clustering in Graphs – Community Detection and Blockmodelling: Two popular approaches for graph clustering are community detection [7] and blockmodelling [8]. These approaches are used in many important clustering applications [9][1][2][4], and hence we focus on the comparison of communities and blockmodels in this paper. Community detection decomposes a graph into a community structure, where vertices from the same community have many edges between themselves and vertices of different communities have few edges. Community structure has been found in many graphs, but it is only one of many alternatives for grouping vertices and possible graph structures (such as core-periphery). In contrast, an alternative approach is blockmodelling [8]. A blockmodel¹ partitions the set of vertices into groups (called positions), where for any pair of positions, there are either many edges, or few edges, between the positions.

As an example of the difference between the two approaches, consider a reply graph between a group of experts and questioners in a question and answer (Q&A) forum, illustrated in Figure 1. The vertices represent users, and the directed edges represent one user replying to another. If we use a popular community detection algorithm [7] to decompose the graph, all the vertices are

¹ We use the popular structural equivalence to assess similarity [8].


incorrectly assigned into a single group, as its underlying structure is not of a community type, and no structure is discerned. If we use a blockmodelling decomposition, we obtain the correct decomposition into two positions, the experts C1 ({Jan, Ann, Bob}) and the questioners C2 ({May, Ed, Pat, Li}) (see Figure 1b, which illustrates the adjacency matrix rearranged according to the positions). The experts C1 reply to questions from the questioners (C1 to C2), the questioners C2 have their questions answered by the experts (C1 to C2), and both groups seldom converse among themselves. The overall structure can be succinctly summarised by the image diagram in Figure 1c, which shows the positions as vertices, and the position-to-position aggregated interactions as edges. As can be seen, this graph has a two-position bipartite structure, which the blockmodelling decomposition can find but a community decomposition fails to discover. The similarity definition of blockmodelling actually includes the community one; hence blockmodelling can be considered a generalisation of community detection, and comparison measures that work with blockmodels would work for comparing communities as well. Therefore we focus on measures that compare blockmodels, since they can also compare communities.

Comparing Communities and Blockmodels: As discussed earlier, almost all existing literature compares blockmodels (and communities) based on their positions and ignores any difference in the adjacency structure within and between positions. This can lead to unintuitive and undesirable behaviour. A blockmodel comparison measure should possess the three following properties, which we introduce and motivate using two examples.

The first property is that the measure should be sensitive to adjacency differences in the blocks. Figures 2a (BM 11), 2b (BM 12) and 2c (BM 13) illustrate example 1. The vertices in each blockmodel are ordered according to their positions, and the boundary between the positions in the adjacency matrix is illustrated with red dotted lines. This effectively divides the adjacency matrix into blocks, which represent the position-to-position interactions. The positional differences from BM 12 and BM 13 to BM 11 are the same. However, the distributions of edges (frequency of edges and non-edges) are very different between BM 11 and BM 13 (there is one single dense block with all the edges in BM 11 (and BM 12), while the edges are distributed across the blocks more evenly in BM 13). Hence a measure should indicate BM 11 is more similar to BM 12 than to BM 13. Positional measures fail to achieve this.

The second property is that a measure should account for differences in the edge distributions across all blocks. In example 1, BM 11 and BM 12 have similar distributions across all blocks since their edges are concentrated in a single block, while BM 13 is different because its edges are spread quite evenly across all blocks. We show that two existing comparison measures for blockmodels fail this property.

The third property is that a measure should be sensitive to weighted edge distributions. Many graphs have weights on the vertices and edges. It might be possible to binarise the weights, but this throws away important comparison information. For example, if the graph in the blockmodels of Figures 2d to 2f


(a) BM 11. (b) BM 12. (c) BM 13.

(d) BM 21. (e) BM 22. (f) BM 23.

Fig. 2. Example to illustrate the strengths and weaknesses of different blockmodel comparison measures. Each row corresponds to 3 blockmodels of the same graph.

(BM 21 to BM 23, dubbed example 2) were binarised, then BM 22 and BM 23 would have exactly the same edge distribution differences from BM 21. But when the actual weights are taken into account (darker pixels in the figures represent larger-valued weights), BM 22 will be correctly considered as more similar to BM 21, as most of the high-value edge weights in the top-left block match, while in BM 23 these edges are evenly distributed among the four blocks.

These properties have important applications. For example, in algorithm evaluation, the blockmodel outputs of the algorithms are compared to a gold standard. Imagine that the gold standard is BM 11 and two algorithms produced BM 12 and BM 13. Measures without these properties will incorrectly rank the two algorithms as equally accurate, since they have the same position differences to BM 11. In contrast, a measure that possesses the first two properties will correctly rank the algorithm producing BM 12 as more accurate. Another application is in consensus blockmodelling, which finds a blockmodel that is the average of a set of blockmodels. If a measure does not possess the 3rd property and only considers position similarity, then the consensus blockmodel might have a very different weight distribution to the other blockmodels and hence should not be considered a consensus (e.g., BM 23 as a consensus of BM 21 and BM 22).

In summary, our contributions are: a) we propose three structural-based properties that a blockmodel comparison measure should possess; b) we analyse existing measures and show that they do not possess these properties; c) we propose new measures that satisfy these properties; and d) we perform experiments on synthetic and real data to study the monotonicity of the new measures.


2 Related Work

In this section, we describe related work in three key areas: set comparison, spatially aware clustering comparison and subspace clustering comparison.

Set Comparison: There has been extensive work in using set comparison for cluster comparison. Hence we discuss a few selected measures, and refer interested readers to the excellent survey of [10]. The first class of measures in this area involves computing the agreement between two clusterings in terms of how many pairs of vertices are in the same and different clusters. Examples include the Rand and Jaccard indices [10]. The second class of measures is based on set matching and information theory, where the two clusterings are compared as two sets of sets. Popular examples include Normalised and Adjusted Mutual Information (NMI, AMI) [11]. As demonstrated in Section 1, these set-based measures do not take into account any adjacency differences between the blockmodels, resulting in some counter-intuitive comparison behaviour.

Spatially Aware Clustering Comparison: In [12], Zhou et al. proposed a measure that compares clusters based on membership and the distances between their centroids. In [5], Bae et al. computed the density of clusters using a grid, and then used the cosine similarity measure to compare the cluster distributions. Coen et al. [6] took a similar approach but used a transportation distance to compute the distances between clusters and between clusterings. All these measures depend on a notion of a distance between points (between clusters of the two clusterings). In blockmodel and community comparison, there are positions of vertices and the edges between them, but no notion of a distance between vertices across two blockmodels; hence existing spatially aware measures cannot be applied to blockmodel comparison.

Subspace Clustering Comparison: In subspace clustering, the aim is to find a group of objects that are close in a subset of the feature space. In [13], Patrikainen and Meila proposed the first subspace validation measure that considered both the similarities in the object clusters and in the subspaces that the clusters occupied. They treated subspace clusters as sets of (object, feature) pairs, and then used set-based validation measures for comparing subspace clusters. When extended to compare the sets of (vertex, vertex) pairs between blockmodels, these measures are equivalent to comparing the positions (proof omitted due to lack of space), and hence have the same issues as the set comparison measures.

In summary, there has been much related work in cluster comparison, but none that addresses the unique problem of comparing blockmodels and communities. What is needed is a measure that is sensitive to differences in the adjacency structure while working in the relation spaces of graphs.

3 Blockmodelling (Graph Clustering) Background and Properties

In this section, we summarise the key ideas of blockmodelling (see [14][8] for more detail). A graph G(V,E) consists of a set of vertices V and a set of edges E,


where E ∈ {0,1}^(|V|×|V|) for unweighted graphs and E ∈ N^(|V|×|V|) for weighted graphs². The edge relation can be represented by an adjacency matrix A whose rows and columns are indexed by the vertices of G.

We use the Q&A example of Figure 1 to illustrate the notation. A blockmodel partitions a set of vertices into a set of positions C = {C1, C2, ..., Ck}. We denote the number of positions in C by k. C can alternatively be specified by φ(.), a mapping function from vertices to the set of positions (i.e., φ : V → C). A block A_{r,c} is a submatrix of A, with the rows and columns drawn from Cr and Cc respectively, Cr, Cc ∈ C; e.g., A_{1,1} defines the upper-left submatrix in Figure 1b. Let Ψ_{r,c} denote the random variable representing the probability mass function of the edge weights in block A_{r,c}; e.g., p(Ψ_{1,1} = 0) = 7/9, p(Ψ_{1,1} = 1) = 2/9. This representation allows us to model both weighted and unweighted graphs. For simplicity, we introduce Υ_{r,c} = p(Ψ_{r,c} = 1) for unweighted graphs. We define M as the blockmodel image matrix that models inter-position densities, M : C × C → [0,1], M_{r,c} = Υ_{r,c}. For the rest of the paper, we use a superscript notation to distinguish two different instances of a variable (e.g., G(1) and G(2)).
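The block PMFs Ψ_{r,c} and the image matrix M can be computed directly from an adjacency matrix and a membership vector. The following is a minimal sketch assuming NumPy; `block_pmfs` and `image_matrix` are illustrative helper names (not from the paper), with φ encoded as an integer vector:

```python
from collections import Counter
import numpy as np

def block_pmfs(A, phi, k):
    """PMF of edge weights (the Psi_{r,c} variables) for every block A_{r,c}."""
    A = np.asarray(A)
    groups = [np.flatnonzero(np.asarray(phi) == r) for r in range(k)]
    pmfs = {}
    for r in range(k):
        for c in range(k):
            # flatten the submatrix of rows in position r and columns in position c
            cells = A[np.ix_(groups[r], groups[c])].ravel().tolist()
            pmfs[(r, c)] = {w: cnt / len(cells) for w, cnt in Counter(cells).items()}
    return pmfs

def image_matrix(A, phi, k):
    """Image matrix M with M_{r,c} = Upsilon_{r,c} = p(Psi_{r,c} = 1)."""
    pmfs = block_pmfs(A, phi, k)
    return np.array([[pmfs[(r, c)].get(1, 0.0) for c in range(k)] for r in range(k)])
```

On a toy graph with a 3-vertex position containing two internal edges, `block_pmfs` reproduces the paper's example values p(Ψ_{1,1} = 0) = 7/9 and p(Ψ_{1,1} = 1) = 2/9.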

A blockmodel B(l)(C(l), M(l)) is defined by its set of positions C(l) (the matrix version is C(l)) and its image matrix M(l). The (unweighted) blockmodelling problem can be considered as finding a blockmodel approximation of A(l) as C(l) M(l) (C(l))^T that minimises a sum of squared errors³ [15], ||A(l) − C(l) M(l) (C(l))^T||². We denote C(l) M(l) (C(l))^T by Ā(l). Given two blockmodels B(1) and B(2), the distance between them is defined as d(B(1), B(2)).
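The sum-of-squared-errors objective above can be sketched in a few lines; this is an illustrative snippet assuming NumPy, with `membership_matrix` and `blockmodel_sse` as hypothetical names for the one-hot matrix C and the objective ||A − C M C^T||²:

```python
import numpy as np

def membership_matrix(phi, k):
    """One-hot n x k position indicator matrix C."""
    C = np.zeros((len(phi), k))
    C[np.arange(len(phi)), phi] = 1.0
    return C

def blockmodel_sse(A, phi, M):
    """Sum of squared errors ||A - C M C^T||^2 of the blockmodel approximation."""
    C = membership_matrix(phi, np.asarray(M).shape[0])
    A_approx = C @ np.asarray(M, dtype=float) @ C.T  # the approximation of A
    return float(((np.asarray(A, dtype=float) - A_approx) ** 2).sum())
```

When A is exactly block-constant and equal to C M C^T, the error is zero; flipping a single cell raises it by one.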

3.1 Desired Properties of Comparison Measures

In this section, we formalise the properties that a blockmodel comparison measure should possess (as discussed in Section 1). These properties allow us to formally evaluate the measures in the next section.

The following properties assume that there are three blockmodels with approximations Ā(1), Ā(2) and Ā(3) of the same graph, Ā(1) ≠ Ā(2) ≠ Ā(3), and that they are not co-linear, i.e., |Ā(1) − Ā(3)| ≠ |Ā(1) − Ā(2)| + |Ā(2) − Ā(3)|. They can be considered as measuring the sensitivity to differences in the structure (approximation) of the blockmodels.

P1: Approximation sensitivity: Given an unweighted graph and three blockmodels, a measure is approximation sensitive if d(Ā(1), Ā(2)) ≠ d(Ā(2), Ā(3)).

P2: Block edge distribution sensitivity: Given dmKL(Ā(2), Ā(1)) < dmKL(Ā(3), Ā(1)), a measure is block edge distribution sensitive if d(Ā(2), Ā(1)) < d(Ā(3), Ā(1)). dmKL(Ā(1), Ā(2)) is the KL divergence [16] for matrices, and is defined as dmKL(Ā(1), Ā(2)) = Σ_{i,j}^{|V(1)|} [ Ā(1)_{i,j} log( Ā(1)_{i,j} / Ā(2)_{i,j} ) − Ā(1)_{i,j} + Ā(2)_{i,j} ]. It measures the difference in the edge (weight) distribution across all the blocks.

P3: Weight sensitivity: Given a weighted graph and three blockmodels⁴, a measure is weight sensitive if d(Ā(1), Ā(2)) ≠ d(Ā(2), Ā(3)).

² For simplicity, we assume a discrete set of weights. But this can easily be extended to continuous weight values.

³ This is one of several popular blockmodelling objective formulations. See [1] for an objective for weighted graphs.
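The matrix KL divergence dmKL used in property P2 can be sketched numerically as follows. This is an illustrative implementation (the name `dm_kl` and the epsilon guard are our own), treating 0·log 0 as 0 as is conventional for the generalised KL divergence:

```python
import numpy as np

def dm_kl(A1, A2, eps=1e-12):
    """Generalised KL divergence between two nonnegative matrices:
    sum_{i,j} A1*log(A1/A2) - A1 + A2, with 0*log(0) taken as 0."""
    A1 = np.asarray(A1, dtype=float)
    A2 = np.asarray(A2, dtype=float)
    log_term = np.where(A1 > 0,
                        A1 * np.log(np.maximum(A1, eps) / np.maximum(A2, eps)),
                        0.0)
    return float(np.sum(log_term - A1 + A2))
```

As expected of a divergence, it is zero for identical matrices and strictly positive otherwise.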

In addition, we evaluate the important monotonicity property of the measures. Basically, we desire measures that increase in value as the compared blockmodels become more different. We measure "more different" by the minimum number of position changes needed to transform one blockmodel into another.

4 Blockmodel (Graph Clustering) Comparison Approaches

In this section, we describe existing measures, propose new blockmodel comparison approaches, and analyse their properties. Existing work for comparing blockmodels falls into two categories: positional and reconstruction measures. Positional measures compare the sets of positions associated with each blockmodel [10]. Reconstruction measures compare the blockmodel approximations of the graphs and can be expressed as d(B(1), B(2)) = d(Ā(1), Ā(2)). We next provide details on two existing reconstruction blockmodel measures.

4.1 Edge and Block Reconstruction Distances

The edge and block reconstruction distances were proposed in [8] and [4] respectively. The edge reconstruction distance [8] measures the difference in the expected edge probabilities across all edges (recall that V(1) = V(2)):

dRE(B(1), B(2)) = Σ_{i}^{|V(1)|} Σ_{j}^{|V(2)|} | Υ(1)_{φ(i),φ(j)} − Υ(2)_{φ(i),φ(j)} |   (1)

The block reconstruction distance [4] measures the difference in block densities over all pairs of blocks, weighted by the overlap of the positions:

dRB(B(1), B(2)) = Σ_{r1,c1}^{k1} Σ_{r2,c2}^{k2} ( |C(1)_{r1} ∩ C(2)_{r2}| / n ) · ( |C(1)_{c1} ∩ C(2)_{c2}| / n ) · | Υ(1)_{r1,c1} − Υ(2)_{r2,c2} |   (2)

The two measures in fact differ only by a factor of 1/n² (we prove this new result in Theorem 1). It is easier to understand and propose new measures based on the edge than the block reconstruction distance. But in terms of computational complexity, it takes O(k²) to compute the block reconstruction distance and O(n²) for the edge distance, hence it is more efficient to compute the block distance. Therefore Theorem 1 permits us to use whichever measure is more convenient for the task at hand.

⁴ Ā consists of n² PMFs of a random variable with the event space of edge weights.
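The 1/n² relationship between equations (1) and (2) can be checked numerically. The sketch below assumes both blockmodels partition the same vertex set; `d_re`, `d_rb` and `densities` are illustrative names for direct translations of the two formulas:

```python
import numpy as np

def densities(A, phi, k):
    """Block densities Upsilon_{r,c}: fraction of ones in each block."""
    g = [np.flatnonzero(np.asarray(phi) == r) for r in range(k)]
    return np.array([[A[np.ix_(g[r], g[c])].mean() for c in range(k)] for r in range(k)])

def d_re(A, phi1, phi2, k1, k2):
    """Edge reconstruction distance, Eq. (1): sum over all vertex pairs."""
    U1, U2 = densities(A, phi1, k1), densities(A, phi2, k2)
    n = len(phi1)
    return sum(abs(U1[phi1[i], phi1[j]] - U2[phi2[i], phi2[j]])
               for i in range(n) for j in range(n))

def d_rb(A, phi1, phi2, k1, k2):
    """Block reconstruction distance, Eq. (2): overlap-weighted sum over block pairs."""
    U1, U2 = densities(A, phi1, k1), densities(A, phi2, k2)
    n = len(phi1)
    g1 = [set(np.flatnonzero(np.asarray(phi1) == r)) for r in range(k1)]
    g2 = [set(np.flatnonzero(np.asarray(phi2) == r)) for r in range(k2)]
    return sum(len(g1[r1] & g2[r2]) / n * len(g1[c1] & g2[c2]) / n
               * abs(U1[r1, c1] - U2[r2, c2])
               for r1 in range(k1) for c1 in range(k1)
               for r2 in range(k2) for c2 in range(k2))
```

Grouping the (i, j) pairs of Eq. (1) by which block pair they fall into yields exactly the overlap counts of Eq. (2), which is the content of Theorem 1.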


Theorem 1. dRB(B(1), B(2)) = (1/n²) · dRE(B(1), B(2))

Proof. The proof involves arithmetic manipulation. We detail the proof in a supplementary⁵ paper due to space limitations.

Both these reconstruction distances possess the approximation sensitivity property (P1) but fail the block edge distribution property (P2). Reconsider the example of Figures 2a to 2c: dRB(BM11, BM12) = 0.21 and dRB(BM11, BM13) = 0.18. This means that the reconstruction distances incorrectly rank BM 13 as closer to BM 11 than BM 12. To explain why, we first state that the Earth Mover's Distance (EMD) can be considered as a generalisation of the reconstruction measures (see Section 4.3). The EMD finds the minimum amount of mass to move from one PMF (of a block) to another. What this means is that the reconstruction distance considers the cost to move a unit of mass to be the same, whether the source PMF is a unimodal distribution or a uniformly distributed one. Hence the reconstruction distances only consider the total number of units of mass moved and do not consider the differences in the distribution of edge densities across all the blocks, leading them to fail property P2.

4.2 KL Reconstruction Measure

We propose to use the KL divergence [17] to compare the block densities (PMFs). Although it is similar in form to the P2 property definition, the proposed measure is in the form of the reconstruction distance, and the KL divergence is a natural approach to measuring distribution differences. It is defined as:

Definition 1. KL Reconstruction:

dRKL(B(1), B(2)) = Σ_{r1,c1}^{|C(1)|} Σ_{r2,c2}^{|C(2)|} ( |C_{r1} ∩ C_{r2}| / n ) · ( |C_{c1} ∩ C_{c2}| / n ) · dKL(Ψ(1)_{r1,c1}, Ψ(2)_{r2,c2})   (3)

where dKL(Ψ(1)_{r1,c1}, Ψ(2)_{r2,c2}) = Σ_x p(Ψ(1)_{r1,c1} = x) log( p(Ψ(1)_{r1,c1} = x) / p(Ψ(2)_{r2,c2} = x) ).

The KL reconstruction measure can be considered as using the edge distributions (across the blocks) in B(2) to encode the edge distributions in B(1). This means it is sensitive to differences in the distribution of the block densities, and it gives the correct ranking for example 1 (see Section 5).

The KL divergence is asymmetric, hence dRKL(.) is also asymmetric, and it can handle weighted graphs. The Jeffrey and Jensen-Shannon divergences [17] are symmetric versions of the KL divergence, and are included for comparison. Unfortunately they fail the block edge distribution property (see Section 5).
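The inner term of the KL reconstruction measure, the divergence between two block PMFs, is easy to sketch. A minimal illustration (`d_kl` is our own name), representing PMFs as weight-to-probability dicts and guarding against zero mass in the second argument with a small epsilon:

```python
from math import log

def d_kl(p, q, eps=1e-12):
    """KL divergence between discrete PMFs given as {weight: probability} dicts."""
    return sum(px * log(px / max(q.get(x, 0.0), eps))
               for x, px in p.items() if px > 0)
```

A quick check confirms the asymmetry noted above: for a skewed and a uniform binary PMF, the divergence differs depending on which argument does the encoding.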

4.3 Weighted Block and Edge Reconstruction

In this section, we show how the edge/block reconstruction measures can be generalised to weighted graphs. We compare the blockmodels of weighted graphs

⁵ Available at http://people.eng.unimelb.edu.au/jeffreyc/


by comparing the PMFs of the blocks. We desire a measure that can compare multi-valued PMFs and consider the difference in the actual weight values. One such measure is the Earth Mover's Distance (EMD) [18], which also reduces to the reconstruction measures for unweighted graphs (see Theorem 2).

Definition 2. EMD Reconstruction Measure:

dREMD(B(1), B(2)) = Σ_{r1,c1}^{|C(1)|} Σ_{r2,c2}^{|C(2)|} ( |C_{r1} ∩ C_{r2}| / n ) · ( |C_{c1} ∩ C_{c2}| / n ) · dED(Ψ(1)_{r1,c1}, Ψ(2)_{r2,c2})   (4)

where dED(Ψ(1)_{x,y}, Ψ(2)_{a,b}) = min_m Σ_{u=0}^{L} Σ_{w=0}^{L} m_{u,w} · d(u,w), subject to: a) m_{u,w} ≥ 0; b) Σ_w m_{u,w} = p(Ψ(1)_{x,y} = u); c) Σ_u m_{u,w} = p(Ψ(2)_{a,b} = w); and d) d(u,w) = |u − w|.

We now show that the EMD reconstruction measure is in fact a generalisation of the reconstruction measures.

Theorem 2. When we are comparing unweighted graphs, dED(Ψ(1)_{x,y}, Ψ(2)_{a,b}) = | Υ(1)_{x,y} − Υ(2)_{a,b} | and dREMD(B(1), B(2)) = dRB(B(1), B(2)).

Proof. The proof involves arithmetic manipulation and reasoning on the constraints. Due to space limitations, please refer to the supplementary paper for details.
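For the one-dimensional ground distance d(u,w) = |u − w| of Definition 2, the transportation problem has a well-known closed form: the L1 distance between the two CDFs. The sketch below assumes integer weights 0..L and PMFs as dicts (`emd_1d` is an illustrative name); it also illustrates Theorem 2, since for binary PMFs the value collapses to |Υ(1) − Υ(2)|:

```python
def emd_1d(p, q, L):
    """EMD between two PMFs on {0,...,L} with ground distance |u - w|;
    equals the L1 distance between their CDFs."""
    cdf_p = cdf_q = 0.0
    total = 0.0
    for w in range(L):  # the CDFs agree at w = L, so that term contributes 0
        cdf_p += p.get(w, 0.0)
        cdf_q += q.get(w, 0.0)
        total += abs(cdf_p - cdf_q)
    return total
```

For example, moving 0.5 of the mass from weight 2 down to weight 1 costs 0.5, while two binary PMFs give exactly the density difference of Theorem 2.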

Theorem 2 is a useful result, as it helps to explain why the block reconstruction distance fails property P2. In addition, it means the EMD reconstruction distance can be used in place of the block distance, since Theorem 2 tells us that the EMD distance is a generalisation of the block one for unweighted graphs. At the same time, the EMD distance satisfies property P3 while the block one does not.

5 Evaluation of the Measures

In this section, we evaluate the measures using the proposed properties and empirically demonstrate their monotonicity (since it is difficult to prove this analytically). We also show how each of the measures performs on the examples from Section 1. In the experiments, we compare NMI, a popular and representative example of the positional measures, against the block (RB), EMD (REMD), KL (RKL), Jeffrey (RsKL) and JS (RJS) reconstruction distances.

5.1 Evaluation of the Examples

Table 1 shows the comparison results between the original blockmodel (BM *1) and the two other blockmodels (BM *2 and *3) of the examples in Figure 2. As can be seen, NMI cannot distinguish between the three blockmodels across all the datasets. RB fails to correctly rank the blockmodels of example 1, and fails


to distinguish the weighted blockmodels of example 2. As expected, REMD has the same values as RB for example 1, but correctly classifies the relative ordering of example 2. RsKL and RJS fail example 1. RKL correctly distinguishes the ordered example 1, but like the other distributional measures (RsKL and RJS) cannot distinguish the blockmodels of example 2.

Table 1. Measure values when comparing the original blockmodel (BM *1) against the other blockmodels in the Karate club and other examples from Section 1. In each case, ideally d(BM *2, BM *1) should be less than d(BM *3, BM *1).

               Example 1            Example 2
        BM 12     BM 13      BM 22     BM 23

NMI     0.3845    0.3845     0.1887    0.1887
RB      0.2100    0.1800     0.2188    0.2188
REMD    0.2100    0.1800     7.2266    10.5469
RKL     0.2803    4.4432     7.2407    7.2407
RsKL    5.5650    4.6501     9.0060    9.0060
RJS     0.1676    0.1259     0.2633    0.2633

5.2 Monotonicity Analysis

To evaluate monotonicity, we vary the membership of the positions for several real datasets. Each position change corresponds to a change in the membership of a vertex, and we ensure a vertex can only change membership once in each simulation. From a starting blockmodel, we generated 100 different runs, where up to 50% of the vertices change position. Table 2 shows the statistics of the three real datasets⁶ on which we evaluate monotonicity.
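The perturbation procedure described above can be sketched as follows. This is an illustrative helper (`perturb_memberships` is our own name), in which each sampled vertex is moved exactly once to a uniformly chosen different position:

```python
import random

def perturb_memberships(phi, k, frac, seed=0):
    """Move a fraction `frac` of vertices to a different position,
    each chosen vertex changing membership exactly once."""
    rng = random.Random(seed)
    phi = list(phi)
    movers = rng.sample(range(len(phi)), int(frac * len(phi)))
    for v in movers:
        # assign a new position different from the current one
        phi[v] = rng.choice([c for c in range(k) if c != phi[v]])
    return phi
```

Because every sampled vertex receives a position different from its current one, the number of membership differences equals the number of sampled vertices exactly.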

Table 2. Statistics of the real datasets

Dataset     Vert #   Edge #   Weighted?   Pos. #

Karate      34       78       N           2
Football    115      613      N           12
C.Elegans   297      2359     Y           5

Figure 3 shows the results for REMD, RKL, RsKL and NMI (we plotted 1-NMI). As can be seen, all measures are monotonically increasing as the positions change. However, in the rare case where the adjacency approximation remains the same after a position splits into two, the reconstruction distances can fail to distinguish the pre-split and post-split blockmodels.

5.3 Properties of the Measures

Table 3 shows the measures and the properties they have. For each measure, we prove (see supplementary) whether it possesses each of the properties.

⁶ Available at http://www-personal.umich.edu/~mejn/netdata/


(a) Karate club. (b) Football. (c) C.Elegans.

Fig. 3. Evaluation of the monotonicity of the distance measures as the difference in the positions of the starting blockmodel and modified blockmodel increases

Table 3. List of blockmodel measures and their properties. *: The KL-based measures can fail when the distributions have the same shape but different weight values. #: not strictly monotonic (can fail the coincidence axiom).

Property                       NMI   RB    REMD   RKL   RsKL   RJS

P1: Edge Dist. Sensitivity     -     ✓     ✓      ✓     ✓      ✓
P2: Block Dist. Sensitivity    -     -     -      ✓     -      -
P3: Weight Sensitivity         -     -     ✓      -*    -*     -*
Monotonicity                   ✓     ✓#    ✓#     ✓#    ✓#     ✓#

Table 3 confirms the empirical evaluation of Section 5.1. The positional measures (e.g., NMI) do not consider blockmodel approximations and hence fail the structural sensitivity properties. The existing RB and RE distances cannot be applied to weighted graphs and are not block edge distribution sensitive. The proposed REMD generalises the reconstruction distances to weighted graphs, but still carries the same assumptions as those distances, and hence fails the block edge distribution sensitivity property. The proposed KL-based distances are block edge distribution sensitive, but not weight sensitive, as they measure distribution differences but ignore differences in weight values.

From these analyses, we recommend using the KL reconstruction distance when comparing unweighted blockmodels, as it possesses the first two properties. When comparing weighted graphs, the EMD distance may be preferred, as it satisfies the weight sensitivity property. A further option is to combine several measures via ideas from multi-objective optimisation, e.g., as a weighted linear sum, which we leave to future work.
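The weighted linear sum mentioned above could be sketched as follows; `combined_distance` and the 0.7/0.3 weighting are hypothetical choices for illustration, not part of the paper.

```python
def combined_distance(measure_values, weights):
    """Hypothetical weighted linear sum of several normalised comparison
    measures; the paper leaves such combinations to future work."""
    if len(measure_values) != len(weights):
        raise ValueError("need one weight per measure")
    total = sum(weights)
    # Normalise by the total weight so the result stays on the same scale
    # as the input measures.
    return sum(w * m for w, m in zip(weights, measure_values)) / total

# e.g. combine a KL reconstruction distance (0.2) and an EMD distance (0.5),
# weighted 70/30 towards the KL term:
d = combined_distance([0.2, 0.5], [0.7, 0.3])  # 0.29
```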

6 Conclusion

Blockmodel comparison measures are used for validating blockmodelling and community detection algorithms, finding consensus blockmodels, and other tasks that require the comparison of blockmodels or communities. In this paper, we have shown that popular positional measures cannot distinguish important differences in blockmodel structure and approximations because they do not reflect a number of key structural properties. We have also proposed two new measures, one based on the EMD and the other based on the KL divergence. We formally proved that these new measures possess a number of the desired structural properties and used empirical experiments to show that they are monotonic.

Future work includes introducing new measures for evaluating mixed membership blockmodels [1] and evaluating multi-objective optimisation approaches for combining measures.

References

1. Airoldi, E.M., Blei, D.M., Fienberg, S.E., Xing, E.P.: Mixed membership stochastic blockmodels. J. of Machine Learning Research 9, 1981–2014 (2008)

2. Pinkert, S., Schultz, J., Reichardt, J.: Protein interaction networks–more than mere modules. PLoS Computational Biology 6(1), e1000659 (2010)

3. Lancichinetti, A., Fortunato, S.: Consensus clustering in complex networks. Nature 2(336) (2012)

4. Chan, J., Liu, W., Leckie, C., Bailey, J., Kotagiri, R.: SeqiBloc: Mining Multi-time Spanning Blockmodels in Dynamic Graphs. In: Proceedings of KDD, pp. 651–659 (2012)

5. Bae, E., Bailey, J., Dong, G.: A clustering comparison measure using density profiles and its application to the discovery of alternate clusterings. Data Mining and Knowledge Discovery 21(3), 427–471 (2010)

6. Coen, M.H., Ansari, H.M., Fillmore, N.: Comparing Clusterings in Space. In: Proceedings of ICML, pp. 231–238 (2010)

7. Rosvall, M., Bergstrom, C.T.: Maps of random walks on complex networks reveal community structure. PNAS 105, 1118–1123 (2008)

8. Wasserman, S., Faust, K.: Social Network Analysis: Methods and Applications. Cambridge Univ. Press (1994)

9. Chan, J., Lam, S., Hayes, C.: Increasing the Scalability of the Fitting of Generalised Block Models for Social Networks. In: Proceedings of IJCAI, pp. 1218–1224 (2011)

10. Halkidi, M., Batistakis, Y., Vazirgiannis, M.: On Clustering Validation Techniques. J. of Intelligent Information Systems 17(2/3), 107–145 (2001)

11. Vinh, N., Epps, J., Bailey, J.: Information theoretic measures for clusterings comparison: Variants, properties, normalization and correction for chance. The J. of Machine Learning Research 11, 2837–2854 (2010)

12. Zhou, D., Li, J., Zha, H.: A new Mallows distance based metric for comparing clusterings. In: Proceedings of the ICDM, pp. 1028–1035 (2005)

13. Patrikainen, A., Meila, M.: Comparing Subspace Clusterings. IEEE Trans. on Knowl. Eng. 18(7), 902–916 (2006)

14. Doreian, P., Batagelj, V., Ferligoj, A.: Generalized Blockmodeling. Cambridge Univ. Press (2005)

15. Chan, J., Liu, W., Kan, A., Leckie, C., Bailey, J., Kotagiri, R.: Discovering latent blockmodels in sparse and noisy graphs using non-negative matrix factorisation. In: Proceedings of CIKM, pp. 811–816 (2013)

16. Lee, D., Seung, H.: Algorithms for non-negative matrix factorization. In: Proceedings of NIPS, pp. 556–562 (2000)

17. Cover, T., Thomas, J.: Elements of Information Theory. Wiley-Interscience (2006)

18. Rubner, Y., Tomasi, C., Guibas, L.: The earth mover's distance as a metric for image retrieval. International Journal of Computer Vision 40(2), 99–121 (2000)