
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

Distributed k-Means Algorithm and Fuzzy c-Means Algorithm for Sensor Networks Based on Multiagent Consensus Theory

Jiahu Qin, Member, IEEE, Weiming Fu, Huijun Gao, Fellow, IEEE, and Wei Xing Zheng, Fellow, IEEE

Abstract—This paper is concerned with developing a distributed k-means algorithm and a distributed fuzzy c-means algorithm for wireless sensor networks (WSNs) in which each node is equipped with sensors. The underlying topology of the WSN is assumed to be strongly connected. The consensus algorithm from multiagent consensus theory is utilized to exchange the measurement information of the sensors in the WSN. To obtain a faster convergence speed as well as a higher possibility of reaching the global optimum, a distributed k-means++ algorithm is first proposed to find the initial centroids before executing the distributed k-means algorithm and the distributed fuzzy c-means algorithm. The proposed distributed k-means algorithm partitions the data observed by the nodes into measure-dependent groups with small in-group and large out-group distances, while the proposed distributed fuzzy c-means algorithm partitions the data observed by the nodes into different measure-dependent groups with degrees of membership ranging from 0 to 1. Simulation results show that the proposed distributed algorithms achieve almost the same results as those given by the centralized clustering algorithms.

Index Terms—Consensus, distributed algorithm, fuzzy c-means, hard clustering, k-means, soft clustering, wireless sensor network (WSN).

Manuscript received June 24, 2015; revised October 25, 2015 and January 24, 2016; accepted February 2, 2016. This work was supported in part by the National Natural Science Foundation of China under Grant 61333012 and Grant 61473269, in part by the Fundamental Research Funds for the Central Universities under Grant WK2100100023, in part by the Anhui Provincial Natural Science Foundation under Grant 1408085QF105, and in part by the Australian Research Council under Grant DP120104986. This paper was recommended by Associate Editor R. Selmic.

J. Qin and W. Fu are with the Department of Automation, University of Science and Technology of China, Hefei 230027, China (e-mail: [email protected]; [email protected]).

H. Gao is with the Research Institute of Intelligent Control and Systems, Harbin Institute of Technology, Harbin 150001, China (e-mail: [email protected]).

W. X. Zheng is with the School of Computing, Engineering and Mathematics, Western Sydney University, Sydney, NSW 2751, Australia (e-mail: [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TCYB.2016.2526683

I. INTRODUCTION

A WIRELESS sensor network (WSN) consists of spatially distributed autonomous sensors with limited computing power, storage space, and communication range that monitor physical or environmental conditions, such as temperature, sound, and pressure, and cooperatively pass their data through the network to a main location. The development of WSNs was motivated by military applications such as battlefield surveillance; today such networks are used in many industrial applications, for instance, industrial and manufacturing automation, machine health monitoring, and so on [1]–[6].

When using a WSN as an exploratory infrastructure, it is often desired to infer hidden structures in the distributed data collected by the sensors. For a scene segmentation and monitoring application, nodes in the WSN need to collaborate with each other in order to segment the scene, identify the object of interest, and thereafter classify it. Outlier detection is also a typical task for WSNs, with applications in monitoring chemical spillage and intrusion detection [14].

Many centralized clustering algorithms have been proposed to solve the data clustering problem in order to grasp the fundamental aspects of the ongoing situation [7]–[11]. However, in a WSN, large amounts of data are collected dispersedly by geographically distributed nodes over the network. Due to limited energy, communication, computation, and storage resources, centralizing the whole distributed data set at one fusion node to perform centralized clustering may be impractical. Thus, there is a great demand for distributed data clustering algorithms in which the global clustering problem is solved at each individual node based on local data and limited information exchange among nodes. In addition, compared with centralized clustering, distributed clustering is more flexible and more robust to node and/or link failures.

Clustering can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to find a cluster efficiently. One of the most popular notions of a cluster is a group with small in-group and large out-group differences [7]–[11]. Different measures of difference (or similarity) may be used to place items into clusters, including distance and intensity. A great number of algorithms [7]–[10] have been proposed to deal with the clustering problem based on distance. Among them, the k-means algorithm is the most popular hard clustering algorithm, which features simple and fast-convergent iterations [7].

Parallel and distributed implementations of the k-means algorithm have arisen most often because of its high efficiency in dealing with large data sets. The algorithm proposed in [12] is a parallel version of the k-means algorithm on distributed-memory multiprocessors, which has no communication constraints. In [13], observations are assigned to



distributed stations until all observations in a station belong to a cluster, based on the partition method of the k-means algorithm. It is apparent that a large number of observations need to be exchanged, which makes the network burden heavy. A kind of distributed k-means algorithm is proposed in [14] for peer-to-peer networks, in which each node gathers basic information about itself and then disseminates it to all the others. It can be observed that the algorithm proposed in [14] does not scale well, since a very large amount of data needs to be stored when the number of nodes is huge. Forero et al. [15], [16] treated the clustering problem performed by distributed k-means algorithms as a distributed optimization problem. Duality theory and decomposition methods, which play a major role in the field of distributed optimization [17], are utilized to derive the distributed k-means algorithms. However, an additional parameter is introduced, and furthermore, the clustering results may not converge when an unsuitable parameter is selected. In [18], the distributed consensus algorithm from the multiagent systems community [19] is applied to help develop a distributed k-means algorithm. Although the algorithm proposed in [18] can accomplish the clustering work in a distributed manner without introducing any parameters, there are still some issues that need to be noted. The first issue is its high computational complexity. The second issue is that the algorithm cannot work when the network is a directed graph, which does happen in practice since different sensors may have different sensing radii. Finally, a hard clustering algorithm is not always suitable, because in some situations, such as soil science [20] and image processing [21], overlapping classes may be useful in dealing with the observations, for which soft clustering is required. Certainly, there are also some other techniques for tackling distributed clustering problems that are not based on the k-means algorithm. For example, [22] presents a type of distributed information-theoretic clustering algorithm based on the centralized one. In [23], distributed clustering and learning schemes over networks are considered, in which different learning tasks are completed by different clusters.

We consider in this paper the framework in which the nodes in the WSN each hold a (possibly high-dimensional) measurement and the network is a strongly connected directed graph. To be specific, we aim to extend the algorithms proposed in [18] from the following perspectives.

1) In order to obtain a faster convergence speed as well as a higher possibility of reaching the global optimum, the k-means++ algorithm [24], a centralized algorithm for finding the initial centroids of the k-means algorithm, is extended to a distributed one.

2) The distributed k-means algorithm proposed in this paper has a lower computational complexity than the one in [18] and can deal with the case where the network topology is a directed graph, irrespective of whether the underlying topology of each cluster is connected or not.

3) A distributed fuzzy c-means clustering algorithm is provided for distributed soft clustering.

The remainder of this paper is organized as follows. Some graph theory and the consensus algorithms are introduced in Section II. The k-means clustering algorithm is reviewed and a distributed k-means algorithm is proposed in Section III, while a distributed fuzzy c-means algorithm is developed in Section IV. Section V provides simulation results to demonstrate the usefulness of the proposed algorithms. Finally, some conclusions are drawn in Section VI.

II. PRELIMINARIES

In order to provide a distributed implementation of the k-means algorithm, we need a tool to diffuse information among the nodes. The average-consensus, max-consensus, and min-consensus algorithms have proven effective in fusing local observations by means of one-hop communication. In this section, we first introduce some notation from graph theory and then the consensus algorithms.

A. Graph Theory

Let G = (V, E, A) be a weighted directed graph of order n with a set of nodes V = {1, . . . , n}, a set of edges E ⊂ V × V, and a weighted adjacency matrix A = [a_ij] ∈ R^{n×n} with nonnegative adjacency elements a_ij. If there is a directed edge from j to i in the directed graph G, then the edge is denoted by (i, j). The adjacency elements associated with the edges of the graph are positive, i.e., (i, j) ∈ E ⇔ a_ij > 0. The set of neighbors of node i is denoted by N_i = {j ∈ V : (i, j) ∈ E}.

A directed path from node i to node j of G is a sequence of edges (i, i_1), (i_1, i_2), . . . , (i_{l−1}, i_l), (i_l, j), whose length is l + 1. The distance between two nodes is the length of the shortest path connecting them. A directed graph is strongly connected if there is a directed path from every node to every other node. The diameter D of a strongly connected graph is the maximum of the distances between all pairs of nodes.

For a directed graph, the in-degree and out-degree of node i are, respectively, the sums of the weights of its incoming and outgoing edges. A directed graph is weight balanced if, for all i ∈ V, Σ_{j=1}^n a_ij = Σ_{j=1}^n a_ji, i.e., the in-degree and out-degree coincide.

An undirected graph is a special kind of directed graph in which a_ij = a_ji for all i, j ∈ V. Apparently, an undirected graph is always weight balanced.

B. Consensus Algorithm

Based on the notation given above, we review the consensus algorithms in this section. A large number of consensus algorithms have been proposed (see [19], [25]–[29] and the references therein), but we only introduce some simple ones here.

For a set of n nodes associated with a graph G, the discrete-time dynamics is described by

    x_i(t + 1) = u_i(N_i ∪ {i}, t),  x_i(0) = x_i0,  i = 1, . . . , n    (1)

where x_i(t), x_i0 ∈ R^d, and u_i(N_i ∪ {i}, t) is a function of the states of the nodes in the set N_i ∪ {i}. Here the dimension of the states of the nodes is supposed to be 1 for simplicity. In fact,


if the state is high-dimensional, then the same operation can be conducted on every component of the state individually.

Let X(x_10, . . . , x_n0) ∈ R be any function of the initial states of all the nodes. The X-problem is to seek a proper u_i(N_i ∪ {i}, t) such that lim_{t→∞} x_i(t) = X(x_10, . . . , x_n0) in a distributed way.

In the max-consensus problem, the nodes are required to converge to the maximum of the initial states, i.e., X(·) is the maximum of its arguments. When the directed graph is strongly connected, max-consensus can be reached [19] if the following control law is chosen:

    u_i(N_i ∪ {i}, t) = max_{j∈N_i∪{i}} x_j(t),  i = 1, . . . , n.    (2)

Apparently, the maximum value spreads through the network within D steps, where D is the diameter of the network, which can be upper-bounded by n, the number of nodes. Regarding the computational complexity, for node i we have |N_i| ∝ n, where |·| denotes the cardinality of a set; thus the computational complexity is O(n²) for a node.

Similarly, when the directed graph is strongly connected, min-consensus can be reached [19] if the following control law is chosen:

    u_i(N_i ∪ {i}, t) = min_{j∈N_i∪{i}} x_j(t),  i = 1, . . . , n    (3)

and the computational complexity for one node is also O(n²).

Remark 1: In order to perform the max-consensus and min-consensus algorithms, every node needs to know the number of nodes n in the network. This is a strong assumption that lacks robustness (e.g., some nodes could run out of battery) and prevents scaling (i.e., if a new node is added to the network, every other node should update the number of nodes n). This problem can be alleviated by using a semi-distributed algorithm for estimating n (see the work in [30]). On the other hand, if an upper bound on n is provided, which is a much milder condition, then the max-consensus and min-consensus algorithms also perform well.
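As an illustration, the following Python sketch simulates synchronous rounds of the update (2) on a small directed ring; min-consensus is obtained by replacing max with min. The adjacency-list representation and the example graph are assumptions made only for this sketch.

import numpy as np

def max_consensus(x0, in_neighbors, n_rounds):
    # Control law (2): each node replaces its state by the maximum over
    # itself and its in-neighbors N_i. After D rounds (D = diameter,
    # upper-bounded by the number of nodes n) every state is the global max.
    x = np.asarray(x0, dtype=float)
    for _ in range(n_rounds):
        x = np.array([x[[i] + in_neighbors[i]].max() for i in range(len(x))])
    return x

# Directed ring on four nodes: node i receives from node (i - 1) mod 4.
in_neighbors = {0: [3], 1: [0], 2: [1], 3: [2]}
print(max_consensus([0.3, 0.9, 0.1, 0.5], in_neighbors, n_rounds=4))  # all 0.9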

In the average-consensus problem, the nodes are required to converge to the average of their initial states, i.e., X(·) is the mean of its arguments. When the directed graph is strongly connected and weight balanced, average-consensus can be reached [19] if the following control law is chosen:

    u_i(N_i ∪ {i}, t) = x_i(t) + τ Σ_{j∈N_i} a_ij (x_j(t) − x_i(t))    (4)

where the parameter τ is assumed to satisfy τ ≤ 1/max_i(Σ_{j≠i} a_ij). Suppose that the number of iterations is t_max; then the computational complexity is O(n t_max) for a node, since |N_i| ∝ n.

Remark 2: Note that the result of the average-consensus algorithm is only asymptotically correct. This is different from the max-consensus/min-consensus algorithms, which provide an exact result after a finite number of iterations. In the following, we introduce the finite-time average-consensus algorithm, which obtains the exact average in a finite number of steps.

Introduce W = [w_ij] ∈ R^{N×N} with

    w_ij = τ a_ij for i ≠ j,  w_ii = 1 − τ Σ_{k=1}^N a_ik.

Denote by q(t) the minimal polynomial of the matrix W, which is the unique monic polynomial of smallest degree such that q(W) = 0. Suppose that the minimal polynomial of W has degree σ + 1 (σ + 1 ≤ N) and takes the form q(t) = t^{σ+1} + α_σ t^σ + · · · + α_1 t + α_0.

In the finite-time average-consensus problem, the nodes are required to obtain the exact average of their initial states after a finite number of steps. When the directed graph is strongly connected and weight balanced, finite-time average-consensus can be reached within σ steps [31] by letting

    x_i(σ + 1) = ([x_i(σ) x_i(σ − 1) · · · x_i(0)] S) / ([1 1 · · · 1] S)    (5)

with

    S = [1, 1 + α_σ, 1 + α_σ + α_{σ−1}, . . . , 1 + Σ_{j=1}^σ α_j]^T

if the control law (4) is chosen, since

    lim_{t→∞} x_i(t) = ([x_i(σ) x_i(σ − 1) · · · x_i(0)] S) / ([1 1 · · · 1] S).

The distributed method to obtain α_0, . . . , α_σ is also provided in [31]. Similarly, the corresponding computational complexity is O(n²) for a node.
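The following numerical sketch verifies (5) on the same four-node example. For brevity, the coefficients α_0, . . . , α_σ of the minimal polynomial are computed centrally here by a least-squares dependence test on the powers of W; in the actual algorithm each node obtains them in a distributed way as in [31].

import numpy as np

A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)            # weight balanced
n = A.shape[0]
tau = 1.0 / (1.0 + A.sum(axis=1).max())
W = tau * A + np.diag(1.0 - tau * A.sum(axis=1))     # w_ij = tau*a_ij, w_ii = 1 - tau*sum_k a_ik

# Minimal polynomial t^deg + alpha_{deg-1} t^{deg-1} + ... + alpha_0 of W:
# the first linear dependence among I, W, W^2, ... (so deg = sigma + 1).
for deg in range(1, n + 1):
    K = np.stack([np.linalg.matrix_power(W, j).ravel() for j in range(deg)], axis=1)
    b = -np.linalg.matrix_power(W, deg).ravel()
    alpha, *_ = np.linalg.lstsq(K, b, rcond=None)
    if np.linalg.norm(K @ alpha - b) < 1e-10:
        break

# S = [1, 1 + a_sigma, 1 + a_sigma + a_{sigma-1}, ..., 1 + sum_{j=1}^{sigma} a_j]^T
S = np.array([1.0] + [1.0 + alpha[deg - m:].sum() for m in range(1, deg)])

# Run sigma = deg - 1 steps of x(t+1) = W x(t), then apply (5).
x = [np.array([1.0, 2.0, 3.0, 6.0])]
for _ in range(deg - 1):
    x.append(W @ x[-1])
history = np.stack(x[::-1])                          # rows: x(sigma), ..., x(0)
print((S @ history) / S.sum())                       # every entry equals the exact mean 3.0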

In the following, we denote by

    x = max-consensus(x_i, N_i, n)

and

    x = min-consensus(x_i, N_i, n)

the execution of n iterations of the max-consensus procedure described by (2) and of the min-consensus procedure described by (3) by the ith node, respectively, and we denote by

    x = average-consensus(x_i, N_i, a_i, n)

the execution of n iterations of the finite-time average-consensus procedure based on (4) by the ith node, where x_i is the initial state of node i, N_i is the set of neighbors of node i, a_i is the stack of the weights of node i's incoming edges, and n is the number of nodes in the network. The returned value x of each of the first two algorithms is the final state of node i, and the returned value x of the last algorithm is obtained by (5) for node i; this value is the same for every node after running each algorithm.

It is a strong assumption that a directed graph is weight balanced. However, some weight-balancing techniques [32], [33] can be utilized. In this paper, we employ the mirror imbalance-correcting algorithm presented in [33] to make the network weight balanced; its basic idea is that each node adjusts the weights of its outgoing edges according to a certain strategy. In the worst-case scenario, the computational complexity is O(n⁴) for each node.


III. DISTRIBUTED k-MEANS ALGORITHM

In this section, the distributed k-means algorithm will be proposed. For this purpose, we first briefly introduce the centralized k-means algorithm, which has been well developed in the existing literature.

A. Introduction to the Centralized k-Means Algorithm

Given a set of observations {x_1, x_2, . . . , x_n}, where each observation is a d-dimensional real-valued vector, the k-means algorithm [7] aims to partition the n observations into k (≤ n) sets S = {S_1, S_2, . . . , S_k} so as to minimize the within-cluster sum of squares (WCSS) function. In other words, its objective is to find

    argmin_S Σ_{j=1}^k Σ_{x_i∈S_j} ‖x_i − c_j‖²    (6)

where c_j is the representative of cluster j, generally the centroid of the points in S_j.

The algorithm uses an iterative refinement technique. Given an initial set of k centroids c_1(1), . . . , c_k(1), the algorithm proceeds by alternating between an assignment step and an update step as follows.

During the assignment step, each observation is assigned to the cluster with the nearest centroid, that is,

    S_i(T) = {x_p : ‖x_p − c_i(T)‖² ≤ ‖x_p − c_j(T)‖², 1 ≤ j ≤ k},  i = 1, . . . , k    (7)

where each x_p is assigned to exactly one cluster, even if it could be assigned to two or more of them. Apparently, this step minimizes the WCSS function.

During the update step, the centroid c_i(T + 1) of the observations in the new cluster is computed as follows:

    c_i(T + 1) = (1/|S_i(T)|) Σ_{x_j∈S_i(T)} x_j,  i = 1, . . . , k.    (8)

Since the arithmetic mean is a least-squares estimator, this step also minimizes the WCSS function.

The algorithm converges when the centroids no longer change. The k-means algorithm converges to a (local) optimum, but there is no guarantee that it converges to the global optimum [7]. Since, for a given k, the result of the above clustering algorithm depends solely on the initial centroids, a common practice in clustering sensor observations is to execute the algorithm several times with different initial centroids and then select the best solution.

The computational complexity of the k-means algorithm is O(nkdM) [7], where n is the number of d-dimensional vectors, k is the number of clusters, and M is the number of iterations before reaching convergence.
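For reference, a compact Python sketch of the alternation between (7) and (8), assuming the observations and initial centroids are given as NumPy arrays:

import numpy as np

def kmeans(X, c_init, max_iter=100):
    # Alternate the assignment step (7) and the update step (8) until the
    # centroids no longer change. X is (n, d); c_init is (k, d).
    c = c_init.copy()
    for _ in range(max_iter):
        # Assignment: index of the nearest centroid for every observation.
        labels = ((X[:, None, :] - c[None, :, :]) ** 2).sum(axis=2).argmin(axis=1)
        # Update: mean of each cluster (an empty cluster keeps its centroid).
        c_new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j) else c[j]
                          for j in range(len(c))])
        if np.allclose(c_new, c):
            break
        c = c_new
    return c, labels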

B. Choosing Initial Centroids Using the Distributed k-Means++ Algorithm

The choice of the initial centroids is the key to making the k-means algorithm work well. In the k-means algorithm, the partition result depends only on the initial centroids for a given k, and so does that of the distributed one.

Algorithm 1: Distributed k-Means++ Algorithm: ith Node

Data: x_i, n, k, N_i
Result: c(1)

\* Propagation of the first centroid *\
temp_i = random(0, 1);
temp = max-consensus(temp_i, N_i, n);
if temp_i == temp then
    x_ic = x_i;
else
    x_ic = [−∞, . . . , −∞]′;
end
c_1(1) = max-consensus(x_ic, N_i, n);
for m = 2; m ≤ k; m++ do
    \* Propagation of the maximum of all the nodes' distances to their nearest centroids *\
    d_i = min_{1≤j≤m−1} ‖x_i − c_j(1)‖;
    rand_i = d_i² × random(0, 1);
    rand_max = max-consensus(rand_i, N_i, n);
    \* Propagation of the mth centroid *\
    if rand_i == rand_max then
        x_ic = x_i;
    else
        x_ic = [−∞, . . . , −∞]′;
    end
    c_m(1) = max-consensus(x_ic, N_i, n);
end
c(1) = [c_1(1)′, . . . , c_k(1)′]′;

Thus, an effective distributed method for choosing the initial centroids is important. Here we provide a distributed implementation of the k-means++ algorithm [24], a centralized algorithm for finding the initial centroids for the k-means algorithm. It is noted that k-means++ generally outperforms k-means in terms of both accuracy and speed [24].

Let D(x) denote the distance from observation x to the closest centroid that has already been chosen. The k-means++ algorithm [24] is executed as follows.

1) Choose an observation uniformly at random from {x_1, x_2, . . . , x_n} as the first centroid c_1.
2) Take a new centroid c_j, choosing x from the observations with probability D(x)² / Σ_{x∈{x_1,x_2,...,x_n}} D(x)².
3) Repeat step 2) until k centroids have been taken altogether.

Consider that the n nodes are deployed in a Euclidean space of any dimension and suppose that each node has a limited sensing radius, which may differ from node to node, so that the underlying topology of the WSN is directed. Each node is endowed with a real vector x_i ∈ R^d representing its observation. Here we assume that every node has its own unique identification (ID) and, further, that the underlying topology of the WSN is strongly connected. The detailed realization of the distributed k-means++ algorithm is shown in Algorithm 1.


Node i knows its observation x_i, n (the number of nodes in the network), k (the number of clusters), and N_i (the set of node i's neighbors). Executing the distributed k-means++ algorithm on all the nodes synchronously yields the initial centroids c(1), which are propagated to all the nodes. The detailed procedure is as follows.

1) First, every node produces a random number between 0 and 1, and the maximum of these numbers is found by all nodes through the max-consensus algorithm. The node holding the maximum chooses its observation while the others choose [−∞, . . . , −∞]′ ∈ R^d to participate in the max-consensus algorithm, so that the observation of the node with the largest random number, designated as the first centroid c_1(1), becomes known to every node.

2) Next, all the nodes perform the following steps iteratively until all k centroids are fixed. In the mth iteration, each node calculates its distance to the existing centroids to obtain d_i = min_{1≤j≤m−1} ‖x_i − c_j(1)‖, i = 1, . . . , n. Then every node produces a random number rand_i between 0 and d_i², followed by the max-consensus algorithm to spread rand_max = max_{1≤i≤n} rand_i throughout the network. If a certain node, say i, satisfies rand_i = rand_max, then its observation is designated as the mth centroid and is obtained by the other nodes in the same manner as for the first centroid.

Since the computational complexity of the max-consensus algorithm is O(dn²) and the max-consensus algorithm needs to be executed 2k times, the computational complexity of the distributed k-means++ algorithm is O(kdn²) for each node.

Remark 3: A contradiction arises if there exist at least two nodes, say node i and node j, such that rand_i = rand_j = rand_max in Algorithm 1, since both nodes would choose their own observations as the new centroid. A solution to this contradiction is that any node ℓ with rand_ℓ = rand_max provides its ID while the others provide 0 to participate in the max-consensus algorithm. Therefore, the maximum ID among the nodes with rand_ℓ = rand_max is obtained by every node, and that node's observation is chosen as the new centroid. In fact, the same operation needs to be performed when obtaining the first centroid, although the probability that two nodes produce the same random number is normally very small.
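The per-node logic of Algorithm 1, including the ID-based tie-breaking of Remark 3, can be simulated in Python as follows; each max-consensus call is emulated by the global maximum it returns on a strongly connected digraph.

import numpy as np

def distributed_kmeanspp(X, k, rng=None):
    # Simulation of Algorithm 1. X is the (n, d) stack of the nodes'
    # observations; consensus rounds are emulated by global maxima.
    rng = rng or np.random.default_rng()
    n, _ = X.shape
    # First centroid: the node drawing the largest number in [0, 1) wins.
    winner = int(rng.random(n).argmax())
    c = [X[winner]]
    for _ in range(1, k):
        # d_i: distance from node i to its nearest already-chosen centroid.
        d = np.min([np.linalg.norm(X - cj, axis=1) for cj in c], axis=0)
        rand = d ** 2 * rng.random(n)               # rand_i drawn in [0, d_i^2)
        # max-consensus on rand_i; ties broken by the larger node ID (Remark 3).
        winner = int(np.flatnonzero(rand == rand.max()).max())
        c.append(X[winner])
    return np.array(c)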

C. Distributed k-Means Algorithm

Based on the above work, in this section we introduce the proposed distributed k-means algorithm in detail.

The objective of the distributed k-means algorithm is to partition the data observed by the nodes into k clusters that minimize the WCSS function (6) via a distributed approach. Similarly to the work in [18], we follow the same steps as the centralized algorithm, except that every step is performed in a decentralized manner. During the assignment step, if all the centroids are spread through the network via the max-consensus algorithm, then each node knows which cluster it belongs to. During the update step, the finite-time average-consensus algorithm is utilized to choose the mean of each cluster as the new centroid. Note that some pretreatments, such as weight balancing and normalization [see Fig. 1(a)], need to be done before the distributed algorithm is executed. The flowchart for the distributed k-means algorithm is shown in Fig. 1(a) and the concrete realization of the algorithm is shown in Algorithm 2.

Fig. 1. Flowcharts for the proposed algorithms. (a) Distributed k-means algorithm. (b) Distributed fuzzy c-means algorithm.

Note that in Algorithm 2, mirror-imbalance-correcting(·) is the mirror imbalance-correcting algorithm in [33] and distributed-k-means++(·) is the distributed k-means++ algorithm proposed in Section III-B.

Node i knows its observation x_i ∈ R^d, n (the number of nodes), k (the number of clusters), and N_i (the set of node i's neighbors). All the nodes execute the distributed k-means algorithm synchronously, and the final centroids c and the cluster k_i to which node i belongs are obtained by every node. The whole algorithm is executed as follows.

1) Weight Balancing: To compute the average of a number of observations by the finite-time average-consensus algorithm, the strongly connected directed graph needs to be weight balanced. The mirror imbalance-correcting algorithm [33] is executed to assign weights to node i's outgoing edges so as to make the network weight balanced.

2) Normalization: A min–max standardization method is used to normalize the components of the observations. The max-consensus and min-consensus algorithms [19] are first used to obtain the maximum and minimum values of every component of the observation.


Algorithm 2: Distributed k-Means Algorithm: ith Node

Data: x_i = [x_i1, . . . , x_id]′, n, k, N_i
Result: c, k_i

\* Weight balancing *\
a_i = mirror-imbalance-correcting(N_i);
\* Normalization *\
[max_1, . . . , max_d]′ = max-consensus(x_i, N_i, n);
[min_1, . . . , min_d]′ = min-consensus(x_i, N_i, n);
x̂_ij = (x_ij − min_j)/(max_j − min_j), for j = 1, . . . , d;
x̂_i = [x̂_i1, . . . , x̂_id]′;
\* Choice of the initial centroids *\
c(1) ≜ [c_1(1)′, . . . , c_k(1)′]′ = distributed-k-means++(x̂_i, n, k, N_i);
T = 1;
while true do
    \* Assignment step *\
    k_i = argmin_j ‖x̂_i − c_j(T)‖;
    \* Update step *\
    T = T + 1;
    for j = 1; j ≤ k; j++ do
        if k_i == j then
            n_ic = 1;
        else
            n_ic = 0;
        end
        c_j(T) = average-consensus(n_ic x̂_i, N_i, a_i, n);
        n_j = average-consensus(n_ic, N_i, a_i, n);
        c_j(T) = c_j(T)/n_j;
    end
    c(T) = [c_1(T)′, . . . , c_k(T)′]′;
    \* Check convergence *\
    if c(T) == c(T − 1) then
        break;
    end
end
c = c(T);

For the jth component x_ij of node i's observation, if the maximum and minimum values are max_j and min_j, respectively, then invoking the min–max standardization, x_ij is normalized as follows:

    x̂_ij = (x_ij − min_j)/(max_j − min_j).    (9)
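In simulation, this step amounts to two consensus results per component; a two-line Python sketch:

import numpy as np

def normalize(X):
    # Min-max standardization (9); lo and hi are exactly what n rounds of
    # min-consensus and max-consensus return at every node.
    lo, hi = X.min(axis=0), X.max(axis=0)
    return (X - lo) / (hi - lo)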

3) Choice of the Initial Centroids: As in the centralized k-means++ algorithm, this step is the key to a faster convergence speed as well as a higher possibility of obtaining the global optimum. The normalized observations are fed to the distributed k-means++ algorithm to produce the initial centroids. Let T = 1, and denote the initial centroids by c(1).

4) Assignment Step: Since the centroids have been provided in step 3), each node needs to figure out which cluster, represented by its centroid, it belongs to. More specifically, node i belongs to the cluster k_i whose centroid is the nearest among all the centroids, i.e., k_i = argmin_j ‖x̂_i − c_j(T)‖.

5) Update Step: After the assignment step, the update step is executed. Let T = T + 1. To compute the jth centroid, each node provides its normalized observation if it is assigned to the jth cluster and [0, . . . , 0]′ ∈ R^d otherwise, to participate in the finite-time average-consensus algorithm. If the number of nodes in the jth cluster is q_j and the centroid is c_j(T), then apparently q_j c_j(T) = Σ_{i=1}^n n_ic x̂_i. Thus, the result of the finite-time average-consensus algorithm is q_j c_j(T)/n.

Then every node chooses 1 if it is assigned to the jth cluster and 0 otherwise to participate in the finite-time average-consensus algorithm, whose result is q_j/n since q_j = Σ_{i=1}^n n_ic. Apparently, the jth cluster's centroid can be obtained as c_j(T) = (q_j c_j(T)/n)/(q_j/n).

6) Check Convergence: After the new centroids are produced, we need to judge whether the algorithm has converged. Obviously, if c(T) = c(T − 1), then c(T) is the final set of cluster centroids c. Otherwise, the assignment step and the update step are repeated until the algorithm converges.

It is known that the computational complexity of the distributed k-means++ algorithm is O(kdn²) for a node and that of the mirror imbalance-correcting algorithm is O(n⁴) for a node. During the normalization step, every node executes the max-consensus and min-consensus algorithms once each, and the dimension of the observations is d, so the computational complexity is O(dn²). As for the assignment and update steps, if the algorithm converges within T steps, then the computational complexity is O(kdn²T), since the complexity of the finite-time average-consensus algorithm is O(dn²) for a node. In conclusion, the computational complexity of the distributed k-means algorithm is O(n² max{n², dkT}) for a node; when the network is undirected, the computational complexity is O(n²dkT) for a node, since an undirected graph is always weight balanced.
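The ratio trick of the update step can be checked in a few lines of Python; the two finite-time average-consensus runs are emulated by the exact network-wide averages they return:

import numpy as np

def update_centroids(x_hat, labels, k):
    # Update step of Algorithm 2. Node i contributes n_ic * x_hat_i and n_ic;
    # the consensus results q_j c_j / n and q_j / n are emulated by exact
    # means, and their ratio recovers the centroid c_j.
    c = np.empty((k, x_hat.shape[1]))
    for j in range(k):
        nic = (labels == j).astype(float)  # 1 if node i is in cluster j, else 0
        c[j] = (nic[:, None] * x_hat).mean(axis=0) / nic.mean()
    return c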

D. Advantages of the Distributed k-Means Algorithm

Compared with the algorithms proposed in [18], the main difference lies in the update step, regarding how to calculate the averages of the observations in each cluster. In the algorithm proposed in this paper, when calculating the average of the nodes' observations in each cluster, each node provides valid information if it belongs to the cluster and invalid information (i.e., 0) otherwise, through the whole network. Thus, the proposed algorithm is valid as long as the whole network is strongly connected. However, in the algorithms given in [18], the centroid of each cluster of observations is calculated by employing the finite-time average-consensus algorithm on the underlying network of that cluster, leading to two shortcomings of the algorithm proposed in [18].


The first shortcoming is that it cannot be extended directly to the case where the network topology is a directed graph, because there is no guarantee that the subnetwork (i.e., the underlying topology of each cluster) is strongly connected even if the whole network itself is strongly connected. The second shortcoming is that, in order for the algorithm in [18] to cope with disconnected clusters, it involves a high computational complexity of O(n²k max{d, n}|σ(S_j(t))|T), where σ(S_j(t)) denotes the set of subclusters of cluster S_j(t) at step t, while the computational complexity of the algorithm proposed herein is only O(n²dkT). In other words, the advantages of the algorithm proposed in this paper over that in [18] are its capability of dealing with a directed network topology and its lower computational complexity.

Remark 4: The distributed k-means algorithm prefers clusters of approximately similar size, as it always assigns an observation to the nearest centroid. This often leads to incorrect cut borders between clusters. Moreover, it is sometimes necessary for an observation to be assigned to more than one cluster [20], [21]. For example, when a data point is located in the overlapping area of several different clusters, it may not be appropriate to assign the data point to exactly one cluster, since it has the properties of those overlapping clusters. Hence, a distributed fuzzy partition method, which is introduced in the next section, is needed.

IV. DISTRIBUTED FUZZY c-MEANS ALGORITHM

In this section, we introduce the distributed fuzzy c-means algorithm. To this end, the fuzzy c-means algorithm is first briefly presented.

A. Introduction to the Centralized Fuzzy c-Means Algorithm

Consider a given set of observations {x_1, x_2, . . . , x_n}, where each observation is a d-dimensional real vector. Similarly to the k-means algorithm, the fuzzy c-means algorithm [34], [35] aims to partition the n observations into k (≤ n) sets S = {S_1, S_2, . . . , S_k} with degrees of membership described by a row-stochastic membership matrix U = [u_ij] ∈ R^{n×k}, where the membership value u_ij means that node i belongs to cluster j with degree u_ij, so as to minimize an objective function. The objective function is given by

    Σ_{i=1}^n Σ_{j=1}^k u_ij^m ‖x_i − c_j‖²    (10)

where c_j is the representative of cluster j, called the center hereafter, and m is the fuzzifier determining the level of cluster fuzziness. In the absence of experimentation or domain knowledge, m is commonly set to 2 [35].

The Lagrange multiplier method [34] can be used to solve this problem, and the necessary conditions for minimizing the objective function are

    c_j = Σ_{i=1}^n u_ij^m x_i / Σ_{i=1}^n u_ij^m,  j = 1, . . . , k    (11)

and

    u_ij = 1 / Σ_{t=1}^k (‖x_i − c_j‖/‖x_i − c_t‖)^{2/(m−1)},  i = 1, . . . , n,  j = 1, . . . , k.    (12)

Similar to the k-means algorithm, the fuzzy c-means algorithm also uses an iterative refinement technique. Given an initial set of k centers c_1(1), . . . , c_k(1) or an initial membership matrix, the algorithm proceeds by alternating between updating c_j and U according to (11) and (12).

Remark 5: The algorithm also minimizes the intracluster variance, but it has the same problems as the k-means algorithm: the obtained minimum is a local minimum, and the results depend on the initial choice.
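A compact Python sketch of the alternation between (12) and (11), assuming the observations and initial centers are given as NumPy arrays:

import numpy as np

def fuzzy_cmeans(X, c_init, m=2.0, tol=1e-6, max_iter=100):
    # Alternate the membership update (12) and the center update (11)
    # until the centers move by less than tol. X is (n, d); c_init is (k, d).
    c = c_init.copy()
    for _ in range(max_iter):
        dist = np.linalg.norm(X[:, None, :] - c[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # guard against a zero distance
        # (12): u_ij = 1 / sum_t (d_ij / d_it)^{2/(m-1)}
        U = 1.0 / ((dist[:, :, None] / dist[:, None, :]) ** (2.0 / (m - 1))).sum(axis=2)
        # (11): c_j = sum_i u_ij^m x_i / sum_i u_ij^m
        Um = U ** m
        c_new = (Um.T @ X) / Um.sum(axis=0)[:, None]
        if np.linalg.norm(c_new - c) < tol:
            c = c_new
            break
        c = c_new
    return c, U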

B. Distributed Fuzzy c-Means Algorithm

In this section, the proposed distributed fuzzy c-means algorithm is presented in detail.

We make the same assumptions as for the distributed k-means algorithm, i.e., the n nodes deployed in a space of any dimension are endowed with real vectors x_i ∈ R^d, i = 1, . . . , n, representing the observations, and the underlying topology of the WSN is strongly connected.

The goal of the proposed algorithm is to obtain the centers representing the different clusters and the membership matrix describing to what extent a node belongs to each cluster, so as to minimize the objective function (10). We follow the same steps as the centralized algorithm, except that every step is performed in a decentralized manner. If all the centers are spread across the network via the max-consensus algorithm, then each node can calculate its row of the membership matrix directly according to (12). When computing the new centers, the finite-time average-consensus algorithm is utilized. The flowchart for the distributed fuzzy c-means algorithm is shown in Fig. 1(b) and the concrete realization of the algorithm is given in Algorithm 3.

Node i knows its observation x_i ∈ R^d, n (the number of nodes in the network), k (the number of clusters), N_i (the set of neighbors of node i), and δ (the parameter used to judge whether the algorithm has converged). All the nodes execute the distributed fuzzy c-means algorithm synchronously, and the final centers c and the membership values u_ij, describing to what extent node i belongs to cluster j, are obtained by every node. The whole algorithm is executed as follows.

1) Weight Balancing, Normalization, and Choice of the Initial Centers: These steps are the same as in the distributed k-means algorithm.

2) Calculate Membership Matrix: Since the centers have been provided in step 1), each node can calculate its memberships as shown in (12).

3) Calculate New Centers: Each node provides u_ij^m x̂_i and u_ij^m to participate in the finite-time average-consensus algorithm, whose results are Σ_{i=1}^n u_ij^m x̂_i / n and Σ_{i=1}^n u_ij^m / n, respectively. Apparently, the ratio of the two results is the new center according to (11).


Algorithm 3: Distributed Fuzzy c-Means Algorithm: ith Node

Data: x_i = [x_i1, . . . , x_id]′, n, k, N_i, δ
Result: c, u_ij (j = 1, . . . , k)

\* Weight balancing *\
a_i = mirror-imbalance-correcting(N_i);
\* Normalization *\
[max_1, . . . , max_d]′ = max-consensus(x_i, N_i, n);
[min_1, . . . , min_d]′ = min-consensus(x_i, N_i, n);
x̂_ij = (x_ij − min_j)/(max_j − min_j), for j = 1, . . . , d;
x̂_i = [x̂_i1, . . . , x̂_id]′;
\* Choice of the initial centers *\
c(1) ≜ [c_1(1)′, . . . , c_k(1)′]′ = distributed-k-means++(x̂_i, n, k, N_i);
T = 1;
while true do
    \* Calculate membership matrix *\
    for j = 1; j ≤ k; j++ do
        u_ij = 1 / Σ_{t=1}^k (‖x̂_i − c_j(T)‖/‖x̂_i − c_t(T)‖)^{2/(m−1)};
    end
    \* Calculate new centers *\
    T = T + 1;
    for j = 1; j ≤ k; j++ do
        c_j(T) = average-consensus(u_ij^m x̂_i, N_i, a_i, n);
        n_j = average-consensus(u_ij^m, N_i, a_i, n);
        c_j(T) = c_j(T)/n_j;
    end
    c(T) = [c_1(T)′, . . . , c_k(T)′]′;
    \* Check convergence *\
    if ‖c(T) − c(T − 1)‖ < δ then
        break;
    end
end
c = c(T);

4) Check Convergence: After the new centers are produced, we need to judge whether the algorithm has converged. Obviously, if ‖c(T) − c(T − 1)‖ < δ, which means that the centers change little between two iterations, then c(T) is the final set of cluster centers c. Otherwise, the membership matrix and the new centers are calculated iteratively until the algorithm converges.

Similar to the distributed k-means algorithm, the computational complexity per node of the distributed fuzzy c-means algorithm is O(n² max{n², dkT}) if the network is directed and O(n²dkT) if it is undirected.
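As with the distributed k-means algorithm, the center update reduces to a ratio of two consensus results; a Python sketch with the consensus runs emulated by exact means (the 1/n factors cancel in the ratio):

import numpy as np

def fcm_center_update(x_hat, U, m=2.0):
    # Center update of Algorithm 3: for each cluster j, the two consensus
    # runs return sum_i u_ij^m * x_hat_i / n and sum_i u_ij^m / n; their
    # ratio is the new center given by (11).
    Um = U ** m                                   # (n, k) membership weights
    return (Um.T @ x_hat) / Um.sum(axis=0)[:, None]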

Remark 6: Similar to the centralized fuzzy c-means algorithm, the distributed fuzzy c-means algorithm can also begin with a given initial set of centers or a given initial membership matrix. Note that the distributed k-means++ algorithm provided in Section III-B is also applicable to finding an initial set of centers for the distributed fuzzy c-means algorithm.

Fig. 2. Observations of the positions with initial centroids and undirected network topology.

Fig. 3. Clustering result with final centroids of the distributed k-means algorithm in the undirected graph.

V. SIMULATION RESULTS

We now provide several examples to illustrate the capabilities of the proposed distributed k-means algorithm and distributed fuzzy c-means algorithm.

Example 1: In this example, we consider a network topology that is an undirected graph. Here 40 nodes are randomly placed in the region [0, 1] × [0, 1] and the observation of each node is its position. We need to partition the 40 measurements into four clusters. The network is shown in Fig. 2, and the initial centroids chosen by the distributed k-means algorithm are marked by red crosses.

First, we use the proposed distributed k-means algorithm to deal with these observations; the clustering result is depicted in Fig. 3, in which different types of points denote different clusters and the final centroids are marked by red crosses.

Fig. 4. Clustering result with final centroids of the distributed fuzzy c-means algorithm in the undirected graph.

Fig. 5. Observations of the positions with initial centroids and directed network topology.

Then the proposed distributed fuzzy c-means algorithm is used to deal with the observations. Considering that soft clustering does not produce a fixed partition, we apply the following treatment so that the clustering result can be observed directly. For node i, if u_ij ≥ 0.5, then we assign node i to cluster j; if u_ij < 0.5 for all j, then we call node i a fuzzy point, presented by a "plus" marker in the simulation results. The simulation result is depicted in Fig. 4. Note that the fuzzy points are always on the borders of clusters, which tallies with the actual situation.
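This visualization rule is a simple threshold on the membership matrix; a Python sketch (a label of −1 marks a fuzzy point):

import numpy as np

def visualize_labels(U, thresh=0.5):
    # Node i is drawn in cluster j if u_ij >= thresh; if u_ij < thresh for
    # all j, it is marked as a fuzzy point (-1), the "plus" markers in Fig. 4.
    best = U.argmax(axis=1)
    return np.where(U.max(axis=1) >= thresh, best, -1)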

We next consider the same observations, except that the network is a connected directed graph, as shown in Fig. 5, where the initial centroids are marked by red crosses. The clustering result using the distributed k-means algorithm is depicted in Fig. 6, in which different types of points denote different clusters and the final centroids are marked by red crosses. When the distributed fuzzy c-means algorithm is applied, the corresponding results are shown in Fig. 7. From Figs. 3 and 6, which are plotted using the distributed k-means algorithm, as well as Figs. 4 and 7, which are plotted using the distributed fuzzy c-means algorithm, it can be observed that the clustering is the same for the same observations regardless of whether the network topology is undirected or directed.

Fig. 6. Clustering result with final centroids of the distributed k-means algorithm in the directed graph.

Fig. 7. Clustering result with final centroids of the distributed fuzzy c-means algorithm in the directed graph.

Example 2: Consider a WSN with 3600 nodes. The network is a connected undirected graph in which each node communicates with 50 other nodes. The observed data are generated from nine classes, with the vectors of each class drawn from a 2-D Gaussian distribution with common covariance Σ = 2I_2 and means μ_1 = [6, 6]^T, μ_2 = [0, 6]^T, μ_3 = [−6, 6]^T, μ_4 = [6, 0]^T, μ_5 = [0, 0]^T, μ_6 = [−6, 0]^T, μ_7 = [6, −6]^T, μ_8 = [0, −6]^T, and μ_9 = [−6, −6]^T. Every node randomly draws one data point from the nine classes, with 400 data points drawn from each class.
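This data model can be reproduced in Python as follows (the random seed is an arbitrary choice for the sketch):

import numpy as np

rng = np.random.default_rng(0)
# Nine 2-D Gaussian classes, common covariance 2*I_2, means on the grid
# {-6, 0, 6}^2, 400 points per class, 3600 observations (one per node) total.
means = [[mx, my] for my in (6, 0, -6) for mx in (6, 0, -6)]
X = np.vstack([rng.multivariate_normal(mu, 2 * np.eye(2), size=400) for mu in means])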

Fig. 8(b) depicts the evolution of the WCSS function for the distributed k-means algorithm with k = 6, 9, 12 and for the centralized k-means algorithm with k = 9. This illustrates that the convergence of the proposed distributed implementation is similar to that of the centralized algorithm. The clustering result of the distributed k-means algorithm with k = 9 is displayed in Fig. 8(a). Table I compares the performance, including the final value of the WCSS function and the number of iterations before convergence over 100 Monte Carlo runs, for the distributed k-means algorithm (denoted dk-means), the k-means algorithm with initial centroids chosen by the k-means++ algorithm (denoted k-means++), and the k-means algorithm with initial centroids randomly chosen from the observations (denoted k-means). It can be observed that the distributed k-means algorithm performs the best in terms of both speed and accuracy.

Fig. 9(b) depicts the evolution of the objective function for the distributed fuzzy c-means algorithm with k = 6, 9, 12 and for the centralized fuzzy c-means algorithm with k = 9. This illustrates that the convergence of the proposed distributed implementation is similar to that of the centralized algorithm. The clustering result of the distributed fuzzy c-means algorithm


Fig. 8. Clustering by the distributed k-means algorithm in the undirected graph with 3600 observations following a Gaussian distribution. (a) Clustering result of the distributed k-means algorithm with k = 9. (b) Evolution of the WCSS functions of the distributed k-means algorithm with k = 6, 9, 12 and of the centralized k-means algorithm with k = 9.

TABLE I. CENTRALIZED VERSUS DISTRIBUTED k-MEANS: PERFORMANCE COMPARISON

with k = 9 is displayed in Fig. 9(a), in which the fuzzy points are presented by black circles. Table II compares the performance, including the final value of the objective function and the number of iterations before convergence over 100 Monte Carlo runs, for the distributed fuzzy c-means algorithm (denoted dc-means), the fuzzy c-means algorithm with initial centers chosen by the k-means++ algorithm (denoted fuzzy c-means++), and the fuzzy c-means algorithm with initial centers randomly chosen from the observations (denoted fuzzy c-means). It can be seen that the distributed

Fig. 9. Clustering by the distributed fuzzy c-means algorithm in the undirected graph with 3600 observations following a Gaussian distribution. (a) Clustering result of the distributed fuzzy c-means algorithm with k = 9. (b) Evolution of the objective functions of the distributed fuzzy c-means algorithm with k = 6, 9, 12 and of the centralized fuzzy c-means algorithm with k = 9.

TABLE II. CENTRALIZED VERSUS DISTRIBUTED FUZZY c-MEANS: PERFORMANCE COMPARISON

fuzzy c-means algorithm performs the best in terms of both speed and accuracy.

Example 3: This example tests the proposed decentralized schemes on real data, namely the first cloud-cover database available from the UC Irvine Machine Learning Repository, with the goal of identifying regions sharing common characteristics. The cloud dataset is composed of 1024 points in 10-D. Suppose that the WSN consisting of 1024 nodes is a connected undirected graph in which each node communicates with 20 other nodes.


Fig. 10. Evolution of the WCSS functions of the distributed k-means algorithm with k = 10, 25, 50 in the undirected graph with 1024 10-D observations from the cloud dataset.

Fig. 11. Evolution of the objective functions of the distributed fuzzy c-means algorithm with k = 10, 25, 50 in the undirected graph with 1024 10-D observations from the cloud dataset.

Fig. 10 depicts the evolution of the WCSS functions of the distributed k-means algorithm with k = 10, 25, 50, which confirms the convergence of the proposed distributed k-means algorithm. Moreover, Fig. 11 plots the evolution of the objective functions of the distributed fuzzy c-means algorithm with k = 10, 25, 50, which demonstrates the convergence of the proposed distributed fuzzy c-means algorithm.

VI. CONCLUSION

In this paper, based on the distributed consensus theory in multiagent systems, we have developed a distributed k-means clustering algorithm as well as a distributed fuzzy c-means algorithm for clustering data in WSNs whose underlying graph topology is strongly connected. The proposed distributed k-means algorithm, by virtue of one-hop communication, has been proven capable of partitioning the data observed by the nodes into measure-dependent groups that have small in-group and large out-group differences. On the other hand, the proposed distributed fuzzy c-means algorithm has been shown to partition the data observed by the nodes into different measure-dependent groups with degrees of membership values. Finally, simulation results have been provided to illustrate the feasibility of the proposed distributed algorithms.
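The one-hop exchanges on which both algorithms rest are average-consensus iterations: each node repeatedly replaces its value with a weighted average of its neighbors' values and, when the graph is connected, every node converges to the network-wide average. The sketch below uses Metropolis weights, one common choice for undirected graphs; it illustrates the mechanism and is not the exact weight design used in this paper.

```python
import numpy as np
import networkx as nx

def average_consensus(G, x0, iters=200):
    """Average consensus on a connected undirected networkx graph G.

    Builds the Metropolis weight matrix W (doubly stochastic) and iterates
    x <- W x, so every entry of x approaches the average of x0.
    """
    nodes = list(G)
    idx = {v: i for i, v in enumerate(nodes)}
    W = np.zeros((len(nodes), len(nodes)))
    for u, v in G.edges:
        w = 1.0 / (1 + max(G.degree[u], G.degree[v]))  # Metropolis weight
        W[idx[u], idx[v]] = W[idx[v], idx[u]] = w
    np.fill_diagonal(W, 1.0 - W.sum(axis=1))           # make rows sum to one
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x = W @ x                                      # one-hop update
    return x
```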

ACKNOWLEDGMENT

The authors would like to thank the Associate Editor and the anonymous reviewers for their valuable comments and suggestions that have helped to improve this paper considerably.


Jiahu Qin (M’12) received the first Ph.D. degree in control science and engineering from the Harbin Institute of Technology, Harbin, China, in 2012, and the second Ph.D. degree in information engineering from the Australian National University, Canberra, ACT, Australia, in 2014.

He was a Visiting Fellow with the University of Western Sydney, Sydney, NSW, Australia, in 2009 and 2015, and a Research Fellow with the Australian National University from 2013 to 2014. He is currently a Professor with the University of Science and Technology of China, Hefei, China. His current research interests include consensus and coordination in multiagent systems as well as complex network theory and its applications.

Weiming Fu received the B.E. degree in automation from the University of Science and Technology of China, Hefei, China, in 2014, where he is currently pursuing the master’s degree.

His current research interests include consensus problems in multiagent coordination and synchronization of complex dynamical networks.

Huijun Gao (F’14) received the Ph.D. degree in control science and engineering from the Harbin Institute of Technology, Harbin, China, in 2005.

He was a Research Associate with the Department of Mechanical Engineering, University of Hong Kong, Hong Kong, from 2003 to 2004. From 2005 to 2007, he was a Post-Doctoral Research Fellow with the Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB, Canada. Since 2004, he has been with the Harbin Institute of Technology, where he is currently a Professor and the Director of the Research Institute of Intelligent Control and Systems. His current research interests include network-based control, robust control/filter theory, time-delay systems, and their engineering applications.

Dr. Gao was a recipient of the IEEE J. David Irwin Early Career Award from the IEEE IES. He is a Co-Editor-in-Chief for the IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, and an Associate Editor for Automatica, the IEEE TRANSACTIONS ON CYBERNETICS, the IEEE TRANSACTIONS ON FUZZY SYSTEMS, the IEEE/ASME TRANSACTIONS ON MECHATRONICS, and the IEEE TRANSACTIONS ON CONTROL SYSTEMS TECHNOLOGY. He is an AdCom Member of the IEEE Industrial Electronics Society.

Wei Xing Zheng (M’93–SM’98–F’14) received the Ph.D. degree in electrical engineering from Southeast University, Nanjing, China, in 1989.

He has held various faculty/research/visiting positions with Southeast University; the Imperial College of Science, Technology and Medicine, London, U.K.; the University of Western Australia, Perth, WA, Australia; the Curtin University of Technology, Perth; the Munich University of Technology, Munich, Germany; the University of Virginia, Charlottesville, VA, USA; and the University of California at Davis, Davis, CA, USA. He is currently a Full Professor with Western Sydney University, Sydney, NSW, Australia.

Dr. Zheng served as an Associate Editor for the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS-I: FUNDAMENTAL THEORY AND APPLICATIONS, the IEEE TRANSACTIONS ON AUTOMATIC CONTROL, the IEEE SIGNAL PROCESSING LETTERS, the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS-II: EXPRESS BRIEFS, and the IEEE TRANSACTIONS ON FUZZY SYSTEMS, and as a Guest Editor for the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS-I: REGULAR PAPERS. He is currently an Associate Editor for Automatica, the IEEE TRANSACTIONS ON AUTOMATIC CONTROL (the second term), the IEEE TRANSACTIONS ON CYBERNETICS, the IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, and other scholarly journals. He has also served as the Chair of the IEEE Circuits and Systems Society’s Technical Committee on Neural Systems and Applications and the IEEE Circuits and Systems Society’s Technical Committee on Blind Signal Processing. He is a Fellow of IEEE.