


Cluster analysis using optimization algorithms with newly designed objective functions

http://dx.doi.org/10.1016/j.eswa.2015.03.031
0957-4174/© 2015 Published by Elsevier Ltd.

E-mail address: [email protected]


D. Binu
Aloy Labs, Bengaluru, India


Article info

Article history: Available online xxxx

Keywords: Clustering; Optimization; Genetic algorithm (GA); Cuckoo search (CS); Particle swarm optimization (PSO); Kernel space


Abstract

Clustering finds various applications in fields such as medicine and telecommunications as an unsupervised learning method, which is much needed in expert systems and their applications. Many algorithms have been developed for clustering in the fifty years since the introduction of k-means clustering. Recently, optimization algorithms have been applied to clustering to find optimal clusters with the help of different objective functions. Accordingly, in this research, clustering is performed using three newly designed objective functions along with four existing objective functions, with the help of optimization algorithms such as the genetic algorithm, cuckoo search and particle swarm optimization. The three new objective functions incorporate the cumulative summation of fuzzy membership and distance values in the original data space, in kernel space, and in multiple kernel space. With these seven objective functions and three search algorithms, 21 different clustering algorithms are discussed in total, and their performance is validated on 16 datasets comprising synthetic data and small- and large-scale real data. The comparison is made with five different evaluation metrics to validate effectiveness and efficiency. From the research outcome, suggestions are presented for selecting a suitable algorithm among the 21 for particular data, and the results prove that the effectiveness of cluster analysis depends mainly on the objective function, while its efficiency depends on the search algorithm.

© 2015 Published by Elsevier Ltd.


1. Introduction

Expert systems are intelligent software programs designed to make useful and intelligent managerial decisions in domains ranging from agriculture, finance, education and medicine to military science, process control, space technology and engineering. Expert systems require different data mining methods to support the decision-making process. Among these, classification and clustering are two important methods applied widely in expert systems. Clustering, an unsupervised learning task, has received significant attention among researchers due to its wide applicability over the fifty years since the introduction of the k-means clustering algorithm (McQueen, 1967), which remains a well-known clustering algorithm owing to its simplicity. Following the reception of k-means clustering, variants of it have been introduced by different researchers to address problems such as initialization (Khan & Ahmad, 2004), the choice of the k-value (Pham, Dimov, & Nguyen, 2004), and distance computation. One of the most accepted clustering methods after k-means is fuzzy c-means clustering (FCM) (Bezdek, 1981), which includes the fuzzy concept in computing the cluster centroids. Many variants of the FCM algorithm have also been proposed (Ji, Pang, Zhou, Han, & Wang, 2012; Ji et al., 2012; Kannana, Ramathilagam, & Chung, 2012; Linda & Manic, 2012; Maji, 2011). Important variants of FCM are kernel fuzzy clustering (Zhang & Chen, 2004) and multiple kernel-based clustering (Chen, Chen, & Lu, 2011), which build on the FCM algorithm with the inclusion of kernels and are widely accepted for their capability of handling non-linear data. More interestingly, all of these algorithms have found importance in image segmentation (Chen et al., 2011; Zhang & Chen, 2004) and related applications (Ji, Pang, et al., 2012; Ji et al., 2012; Li & Qi, 2007; Sulaiman & Isa, 2010; Szilágyi, Szilágyi, Benyób, & Benyó, 2011; Zhao, Jiao, & Liu, 2013).

After the introduction of soft computing techniques, the clustering problem was transformed into an optimization problem: finding the optimal clusters in a defined search space. Accordingly, most optimization algorithms have been applied to clustering problems. For example, the pioneering optimization algorithm GA (Mualik & Bandyopadhyay, 2002) was applied to clustering first, and then the PSO algorithm (Premalatha & Natarajan, 2008), Artificial Bee Colony (Zhang, Ouyang, & Ning, 2010), Bacterial Foraging Optimization (Wan, Li, Xiao, Wang, & Yang, 2012), Simulated Annealing (Selim & Alsultan, 1991), the Differential Evolution algorithm (Das, Abraham, & Konar, 2008), evolutionary algorithms (Castellanos-Garzón & Diaz, 2013) and Firefly (Senthilnath, Omkar, & Mani, 2011) were subsequently applied to clustering. Recently, İnkaya, Kayalıgil, and Özdemirel (2015) utilized Ant Colony Optimization for a clustering methodology using two objective functions, namely adjusted compactness and relative separation. Liyong, Witold, Wei, Xiaodong, and Li (2014) utilized genetically guided alternating optimization for fuzzy c-means clustering; here, interval numbers were introduced for attribute weighting in weighted fuzzy c-means (WFCM) clustering to obtain appropriate weights more easily from the viewpoint of geometric probability. Hoang, Yadav, Kumar, and Panda (2014) utilized the recent optimization algorithm called Harmony Search for clustering. Yuwono, Su, Moulton, and Nguyen (2014) developed Rapid Centroid Estimation, utilizing the rules of the PSO algorithm to reduce the computational complexity and produce clusters with higher purity. These recent algorithms utilized traditional objective functions for evaluating the clustering solution.

Subsequently, hybrid algorithms entered the field of clustering, exploiting the advantages of both algorithms taken for hybridization. Here, two optimization algorithms are combined to perform the clustering task, as in GA with PSO (Kuo, Syu, Chen, & Tien, 2012). From this we can say that whenever a new optimization algorithm appears, researchers are ready to apply it to the clustering process. Due to the successful application of hybrid algorithms to clustering, researchers then hybridized traditional clustering algorithms with optimization algorithms. For example, GA was combined with k-means clustering to give genetic k-means (Krishna & Murty, 1999), and a similar work is given in Niknam and Amiri (2010). Recently, Krishnasamy, Kulkarni, and Paramesran (2014) proposed a hybrid evolutionary data clustering algorithm referred to as K-MCI, whereby k-means and modified cohort intelligence are combined for data clustering. Wei, Yingying, Soon Cheol, and Xuezhong (2015) developed a hybrid evolutionary computation approach utilizing quantum-behaved particle swarm optimization for data clustering. Garcia-Piquer, Fornells, Bacardit, Orriols-Puig, and Golobardes (2014) developed multiobjective clustering to guide the search following a cycle based on evolutionary algorithms. Tengke, Shengrui, Qingshan, and Huang (2014) proposed a cascade optimization framework that combines the weighted conditional probability distribution (WCPD) and WFI models for data clustering.

In optimization-based clustering applications, the clustering process is driven by the fitness function, which validates the optimal clusters achieved. The constraint here is that the fitness function should be capable of ensuring good cluster quality. The objective function is also responsible for validating the clustering output and directing the search toward the optimal cluster centroids. Looking at clustering fitness functions, most optimization-based algorithms use the k-means objective (minimum mean square distance) as the fitness function for the optimal search (Wan et al., 2012) because of its simple computation. Similarly, the FCM objective has been applied as a fitness function for finding the optimal cluster centroids (Ouadfel & Meshoul, 2012) due to its flexibility and effectiveness. Some authors apply cluster validity indices within swarm intelligence-based optimization algorithms (Xu, Xu, & Wunsch, 2012), taking a different perspective on cluster quality. In addition, fuzzy cluster validity indices have been developed with the inclusion of fuzzy theory and then applied with an optimization algorithm, GA (Pakhira, Bandyopadhyay, & Maulik, 2005). Going further, multiple objectives have been combined for clustering optimization, as in Bandyopadhyay (2011); there, cluster stability and validity are combined as the fitness and solved using an optimization algorithm, simulated annealing (Saha & Bandyopadhyay, 2009).

From this overall analysis, our finding is that most optimization algorithms use the k-means (KM) or FCM objective for clustering optimization. Moreover, to the best of our knowledge, the MKFCM (multiple kernel FCM) objective has not previously been solved through optimization-based clustering. So, with the intention of performing the clustering task with optimization, two well-known objectives (k-means and FCM), two recent objectives (KFCM and MKFCM) and three newly designed objective functions are utilized here. The reasons for selecting these objectives are (i) applicability and popularity (k-means and FCM), (ii) recency and standing (KFCM and MKFCM), and (iii) effectiveness and importance (the three newly designed objective functions). We then need optimization algorithms to solve these objectives. Even though various optimization algorithms are presented in the literature, three are chosen for our clustering task: GA, PSO and CS. GA is chosen because it is traditional and popular (Goldberg & David, 1989); PSO is an intelligent algorithm accepted by various researchers for its capability of adapting toward its most optimistic position (Kennedy & Eberhart, 1995); and CS is a recent and effective algorithm proven better for various complex engineering problems (Yang & Deb, 2010).

The paper is organized as follows: Section 2 presents the contributions of the paper and Section 3 discusses objective measures taken from the literature. Section 4 presents the newly designed objective functions and Section 5 describes the solution encoding procedure. Section 6 discusses the optimization algorithms used for data clustering and Section 7 presents the experimentation with detailed results. Finally, the conclusion is summed up in Section 8.

2. Contributions of the paper

The most important contributions of the paper are as follows:

(i) Clustering process with optimization: We develop the clustering process through optimization techniques in order to achieve optimal cluster quality. Two traditional objective functions (KM and FCM), two recent objective functions (KFCM and MKFCM) and three newly developed objective functions are employed for the task, and the optimization algorithms GA, PSO and CS are considered.

(ii) Hybridization: To the best of our knowledge, the MKFCM objective is solved with optimization algorithms for the first time in this work. The three optimization algorithms GA, PSO and CS are combined with the MKFCM objective function to obtain three new hybrid algorithms (GA-MKFCM, PSO-MKFCM and CS-MKFCM) not previously presented in the literature.

(iii) New objective functions: We design three new objective functions (FCM + CF, KFCM + KCF, MKFCM + MKCF) that include the cumulative summation of fuzzy membership and distance values. The same cumulative summation is also performed in kernel space and in multiple kernel space. These three new objective functions are derived with proper mathematical formulation, and the corresponding theorems and proofs are provided.


Table 1. Algorithms.

Objective       GA                    PSO                    CS
KM              GA-KM (c)             PSO-KM (c)             CS-KM (c)
FCM             GA-FCM (c)            PSO-FCM (c)            CS-FCM (c)
KFCM            GA-KFCM (c)           PSO-KFCM (c)           CS-KFCM (c)
MKFCM           GA-MKFCM (b)          PSO-MKFCM (b)          CS-MKFCM (b)
FCM + CF        GA-FCM + CF (a)       PSO-FCM + CF (a)       CS-FCM + CF (a)
KFCM + KCF      GA-KFCM + KCF (a)     PSO-KFCM + KCF (a)     CS-KFCM + KCF (a)
MKFCM + MKCF    GA-MKFCM + MKCF (a)   PSO-MKFCM + MKCF (a)   CS-MKFCM + MKCF (a)

(a) Denotes algorithms with a new objective function (novel works).
(b) Denotes algorithms with an existing objective function (no similar works).
(c) Denotes existing algorithms.


(iv) Algorithms: We present nine new algorithms (GA-FCM + CF, PSO-FCM + CF, CS-FCM + CF, GA-KFCM + KCF, PSO-KFCM + KCF, CS-KFCM + KCF, GA-MKFCM + MKCF, PSO-MKFCM + MKCF and CS-MKFCM + MKCF) with the help of the three newly designed objective functions. Three further algorithms (GA-MKFCM, PSO-MKFCM and CS-MKFCM) are presented through a hybridization not previously found in the literature. Nine existing algorithms are also considered, so 21 algorithms in total are discussed in this paper. These algorithms are the major contribution of this paper after detailed analysis of the literature. The clustering algorithms, formulated by hybridizing the optimization algorithms with the objective functions, are specified in Table 1.

(v) Validation: To validate the 21 algorithms, five evaluation metrics and 16 datasets are used, of which eight are real datasets, two are image datasets and six are synthetically generated. The performance of the algorithms is then extensively analyzed from three different perspectives, namely search algorithm, objective function and hybridization, to determine their effect on the clustering results. Finally, the best algorithms for clustering are suggested with respect to the characteristics of the input data.

2.1. Problem definition

Let $X$ be the database, consisting of $n$ data points located in $d$-dimensional real space, $x_i \in R^d$. Clustering is defined as dividing the $n$ data points into $g$ clusters, which means that cluster centres $m_j\ (1 \le j \le g)$ should be identified from the input database.

3. Objective measures considered from the literature

3.1. Objective function 1: (KM)

The clustering problem defined above is converted into an optimization problem by minimizing the summation of distances between all data points and their nearest cluster centres. The objective function of k-means clustering (McQueen, 1967) is

$$OB_{KM} = \sum_{i=1}^{n} \sum_{j=1}^{g} \left\| x_i - m_j \right\|^2 \qquad (1)$$

subject to the following constraints:

(i) $m_i \cap m_j = \phi,\; i, j = 1, 2, \ldots, g,\; i \ne j$
(ii) $m_i \ne \phi,\; i = 1, 2, \ldots, g$
(iii) $\bigcup_{j=1}^{g} \{ x_i \in m_j \} = X,\; i = 1, 2, \ldots, n \qquad (2)$
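To make the computation concrete, the following minimal NumPy sketch evaluates the KM objective for a candidate centroid set; each point is attributed to its nearest centroid, which is how the hard-partition constraints (i)-(iii) are realised. (The paper's experiments are in MATLAB; this Python sketch, and the names `km_objective`, `X`, `M`, are illustrative assumptions, not the paper's code.)

```python
import numpy as np

def km_objective(X, M):
    """Eq. (1) under constraints (2): total squared distance from each
    data point (rows of X, shape n x d) to its nearest centroid
    (rows of M, shape g x d). Illustrative sketch only."""
    # pairwise squared Euclidean distances, shape (n, g)
    d2 = ((X[:, None, :] - M[None, :, :]) ** 2).sum(axis=2)
    # hard assignment to the nearest centroid realises the partition
    return d2.min(axis=1).sum()
```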


3.2. Objective function 2: (FCM)

The objective of the clustering problem can be represented in another way, utilizing the fuzzy membership function along with the distance variable. The objective function of FCM (Bezdek, 1981) is

$$OB_{FCM} = \sum_{i=1}^{n} \sum_{j=1}^{g} u_{ij}^{b} \left\| x_i - m_j \right\|^2 \qquad (3)$$

subject to the following constraints:

(i) $m_i \cap m_j = \phi,\; i, j = 1, 2, \ldots, g,\; i \ne j$
(ii) $m_i \ne \phi,\; i = 1, 2, \ldots, g$
(iii) $\bigcup_{j=1}^{g} \{ x_i \in m_j \} = X,\; i = 1, 2, \ldots, n$
(iv) $\sum_{j=1}^{g} u_{ij} = 1,\; i = 1, 2, \ldots, n$
(v) $u_{ij}^{b} = \dfrac{ \left\| x_i - m_j \right\|^{-\frac{1}{b-1}} }{ \sum_{j=1}^{g} \left\| x_i - m_j \right\|^{-\frac{1}{b-1}} } \qquad (4)$
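A sketch of how the FCM objective could be evaluated for a candidate centroid set follows, with memberships computed directly from constraint (v); the function and variable names are hypothetical, and a small epsilon is added to avoid division by zero (an implementation detail the paper does not discuss).

```python
import numpy as np

def fcm_objective(X, M, b=2.0, eps=1e-12):
    """Eq. (3): sum of u_ij^b * ||x_i - m_j||^2, with u_ij^b taken
    from constraint (v) of (4). Illustrative sketch only."""
    d = np.sqrt(((X[:, None, :] - M[None, :, :]) ** 2).sum(axis=2)) + eps
    w = d ** (-1.0 / (b - 1.0))              # ||x_i - m_j||^(-1/(b-1))
    u_b = w / w.sum(axis=1, keepdims=True)   # normalised as in (v)
    return (u_b * d ** 2).sum()
```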

3.3. Objective function 3: (KFCM)

The same clustering problem can be cast as an optimization problem with distance and fuzzy membership as variables, but with the computation done in kernel space instead of the original data space. The kernel-based clustering problem is given as (Zhang & Chen, 2004):

$$OB_{KFCM} = \sum_{i=1}^{n} \sum_{j=1}^{g} u_{ij}^{b} \left( 1 - k(x_i, m_j) \right) \qquad (5)$$

subject to the following constraints:

(i) $m_i \cap m_j = \phi,\; i, j = 1, 2, \ldots, g,\; i \ne j$
(ii) $m_i \ne \phi,\; i = 1, 2, \ldots, g$
(iii) $\bigcup_{j=1}^{g} \{ x_i \in m_j \} = X,\; i = 1, 2, \ldots, n$
(iv) $\sum_{j=1}^{g} u_{ij} = 1,\; i = 1, 2, \ldots, n$
(v) $u_{ij}^{b} = \dfrac{ \left( 1 - k(x_i, m_j) \right)^{-\frac{1}{b-1}} }{ \sum_{j=1}^{g} \left( 1 - k(x_i, m_j) \right)^{-\frac{1}{b-1}} }$
(vi) $k(x_i, m_j) = \exp\!\left( - \dfrac{ \left\| x_i - m_j \right\|^2 }{ \sigma^2 } \right) \qquad (6)$

As suggested in Chen et al. (2011), the values of $\sigma$ and the fuzzification coefficient $b$ are fixed as 150 and 2, respectively.

3.4. Objective function 4: (MKFCM)

The purpose of grouping the data points presented in the original database $X$ can also be denoted as an optimization problem (Chen et al., 2011):

$$OB_{MKFCM} = \sum_{i=1}^{n} \sum_{j=1}^{g} u_{ij}^{b} \left( 1 - k_{com}(x_i, m_j) \right) \qquad (7)$$

subject to the following constraints:


(i) $m_i \cap m_j = \phi,\; i, j = 1, 2, \ldots, g,\; i \ne j$
(ii) $m_i \ne \phi,\; i = 1, 2, \ldots, g$
(iii) $\bigcup_{j=1}^{g} \{ x_i \in m_j \} = X,\; i = 1, 2, \ldots, n$
(iv) $\sum_{j=1}^{g} u_{ij} = 1,\; i = 1, 2, \ldots, n$
(v) $u_{ij}^{b} = \dfrac{ \left( 1 - k_{com}(x_i, m_j) \right)^{-\frac{1}{b-1}} }{ \sum_{j=1}^{g} \left( 1 - k_{com}(x_i, m_j) \right)^{-\frac{1}{b-1}} }$
(vi) $k_{com}(x_i, m_j) = k_1(x_i, m_j) \cdot k_2(x_i, m_j)$
(vii) $k_1(x_i, m_j) = k_2(x_i, m_j) = \exp\!\left( - \dfrac{ \left\| x_i - m_j \right\|^2 }{ \sigma^2 } \right) \qquad (8)$
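With both base kernels fixed to the same Gaussian, constraints (vi)-(vii) reduce to a simple product; a two-line sketch, reusing the `gaussian_kernel` helper assumed in the KFCM sketch above:

```python
def combined_kernel(X, M, sigma=150.0):
    """Constraints (vi)-(vii): k_com = k1 * k2 with k1 = k2 Gaussian."""
    k = gaussian_kernel(X, M, sigma)   # assumed helper from the KFCM sketch
    return k * k
```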


4. Devising of new objective functions

4.1. Objective function 5: (FCM + CF)

The new objective function is introduced with distance and fuzzy variables along with two additional variables not defined previously, called the cumulative distance and cumulative fuzzy values. These two additional variables are added to the clustering objective function because outlier data points or noise can contribute to the objective function even more than the original data points; this contribution can be suppressed by adding the cumulative distance and cumulative membership values to the old objective function. With this intention, two variables are added to the FCM objective function to make it more suitable for clustering even when outlier data points are present. The following objective function is designed for clustering by adding to the FCM objective the newly introduced term called the cumulative function (CF). The objective function of clustering based on the new term is defined as

$$OB_{FCM+CF} = \sum_{j=1}^{g} \sum_{i=1,\, i \in j}^{n} u_{ij}^{b} \left\| x_i - m_j \right\|^2 + \sum_{j=1}^{g} \left( \sum_{i=1,\, i \in j}^{n} u_{ij}^{b} \right) \left( \sum_{i=1,\, i \in j}^{n} \left\| x_i - m_j \right\|^2 \right) \qquad (9)$$

subject to the following constraints:

(i) $m_i \cap m_j = \phi,\; i, j = 1, 2, \ldots, g,\; i \ne j$
(ii) $m_i \ne \phi,\; i = 1, 2, \ldots, g$
(iii) $\bigcup_{j=1}^{g} \{ x_i \in m_j \} = X,\; i = 1, 2, \ldots, n$
(iv) $\sum_{j=1}^{g} u_{ij} = 1,\; i = 1, 2, \ldots, n$
(v) $u_{ij}^{b} = \dfrac{ \left\| x_i - m_j \right\|^{-\frac{1}{b-1}} }{ \sum_{j=1}^{g} \left\| x_i - m_j \right\|^{-\frac{1}{b-1}} } \qquad (10)$


4.2. Objective function 6: (KFCM + KCF)

The new definition of the cumulative function in kernel space is given here and utilized along with the objective function of KFCM to define the clustering optimization. It is admittedly a small variant of the cumulative function defined above, but it affects the performance of the algorithm considerably. The difference is simply that the earlier cumulative function is computed in the original data space, whereas here the kernel space is used for the membership and distance computations instead. Hence we name it the kernel cumulative function (KCF). The new objective function for clustering considering KCF is

$$OB_{KFCM+KCF} = \sum_{j=1}^{g} \sum_{i=1,\, i \in j}^{n} u_{ij}^{b} \left( 1 - k(x_i, m_j) \right) + \sum_{j=1}^{g} \left( \sum_{i=1,\, i \in j}^{n} u_{ij}^{b} \right) \left( \sum_{i=1,\, i \in j}^{n} \left( 1 - k(x_i, m_j) \right) \right) \qquad (11)$$

subject to the following constraints:

(i) $m_i \cap m_j = \phi,\; i, j = 1, 2, \ldots, g,\; i \ne j$
(ii) $m_i \ne \phi,\; i = 1, 2, \ldots, g$
(iii) $\bigcup_{j=1}^{g} \{ x_i \in m_j \} = X,\; i = 1, 2, \ldots, n$
(iv) $\sum_{j=1}^{g} u_{ij} = 1,\; i = 1, 2, \ldots, n$
(v) $u_{ij}^{b} = \dfrac{ \left( 1 - k(x_i, m_j) \right)^{-\frac{1}{b-1}} }{ \sum_{j=1}^{g} \left( 1 - k(x_i, m_j) \right)^{-\frac{1}{b-1}} }$
(vi) $k(x_i, m_j) = \exp\!\left( - \dfrac{ \left\| x_i - m_j \right\|^2 }{ \sigma^2 } \right) \qquad (12)$

4.3. Objective function 7: (MKFCM + MKCF)

Even though kernel space plays a significant role in the distance computation of clustering algorithms, multiple kernels also play a key role in differentiating the data points. Here, multiple kernels are combined to act as the kernel, so this objective is simply the variant of the kernel-based objective function with the combined kernel. Similarly, the cumulative function is also computed with multiple kernels, so we refer to it hereafter as the multiple kernel cumulative function (MKCF). The sum of these two terms taken for the clustering optimization is

$$OB_{MKFCM+MKCF} = \sum_{j=1}^{g} \sum_{i=1,\, i \in j}^{n} u_{ij}^{b} \left( 1 - k_{com}(x_i, m_j) \right) + \sum_{j=1}^{g} \left( \sum_{i=1,\, i \in j}^{n} u_{ij}^{b} \right) \left( \sum_{i=1,\, i \in j}^{n} \left( 1 - k_{com}(x_i, m_j) \right) \right) \qquad (13)$$

subject to the following constraints:

(i) $m_i \cap m_j = \phi,\; i, j = 1, 2, \ldots, g,\; i \ne j$
(ii) $m_i \ne \phi,\; i = 1, 2, \ldots, g$
(iii) $\bigcup_{j=1}^{g} \{ x_i \in m_j \} = X,\; i = 1, 2, \ldots, n$
(iv) $\sum_{j=1}^{g} u_{ij} = 1,\; i = 1, 2, \ldots, n$
(v) $u_{ij}^{b} = \dfrac{ \left( 1 - k_{com}(x_i, m_j) \right)^{-\frac{1}{b-1}} }{ \sum_{j=1}^{g} \left( 1 - k_{com}(x_i, m_j) \right)^{-\frac{1}{b-1}} }$
(vi) $k_{com}(x_i, m_j) = k_1(x_i, m_j) \cdot k_2(x_i, m_j)$
(vii) $k_1(x_i, m_j) = k_2(x_i, m_j) = \exp\!\left( - \dfrac{ \left\| x_i - m_j \right\|^2 }{ \sigma^2 } \right) \qquad (14)$

5. Solution encoding

The solution encoding procedure employed in the clustering is given in Fig. 1. Every solution defined in the problem space is a vector consisting of $d \times g$ values; this signifies that the solution holds the centroid values for the given dataset $X$. Suppose the dataset has 3 attributes (3 dimensions); then the solution vector contains six elements if the number of clusters needed is two. The first three elements of the solution vector form the first centroid, and the last three elements form the second centroid. This way of representing the solution can fulfill the optimization criterion with low computation time even when the dimension of the solution is high.


Fig. 1. Solution encoding.
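A sketch of this encoding (the helper name is hypothetical; the mapping itself follows the description above):

```python
import numpy as np

def decode(solution, g, d):
    """Interpret a flat solution vector of d*g values as g centroids of
    dimension d: the first d entries form the first centroid, and so on."""
    return np.asarray(solution).reshape(g, d)

# e.g. d = 3 attributes, g = 2 clusters: the 6-element vector
# [m11, m12, m13, m21, m22, m23] decodes to two 3-dimensional centroids.
```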



6. Optimization algorithms for data clustering

Solving optimization problems requires well-established heuristic procedures from the field of intelligent search to support decision making in expert systems. Metaheuristics are widely recognized as efficient approaches for many hard optimization problems, including cluster analysis. The application of metaheuristics to real-world optimization problems is a rapidly growing field of research, owing to the importance of optimization problems in the scientific and industrial worlds for faster decision making. As these metaheuristic technologies mature and lead to widespread deployment of expert systems, finding optimal solutions becomes ever more important. Here, three different heuristic search algorithms are utilized for solving the clustering problem.

6.1. Genetic algorithm

Step 1: Initial population: Initially, P solutions are given in a population, every chromosome being a d × g vector.

Step 2: Fitness computation: For every chromosome (centroid set), the objective function is computed.

Step 3: Selection: From the population, a set of chromosomes is selected randomly based on the selection rate.

Step 4: Crossover: The crossover operator is applied to two selected candidates, producing two new individuals.

Step 5: Mutation: The new set of individuals is then fed to the mutation operator, which also provides a new set of chromosomes.

Step 6: Termination: After performing the crossover and mutation operators, the algorithm returns to step 2 until the maximum number of iterations specified by the user is reached.
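A loose sketch of steps 1-6 for centroid optimisation follows. The paper does not specify the crossover or mutation operators, so single-point crossover, Gaussian perturbation of scale 0.1 and elitist truncation are assumptions made purely for illustration; the default rates 0.8 and 0.005 follow Section 7.3, and `objective` can be any of the objective-function sketches above.

```python
import numpy as np

def ga_clustering(X, g, objective, pop=10, iters=100,
                  p_cross=0.8, p_mut=0.005, seed=0):
    """Sketch of the GA steps; chromosomes are flat d*g centroid vectors
    seeded from the data. Operator details are assumptions."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    P = X[rng.integers(0, n, size=(pop, g))].reshape(pop, g * d)   # step 1
    for _ in range(iters):                                         # step 6
        fit = [objective(X, c.reshape(g, d)) for c in P]           # step 2
        elite = P[np.argsort(fit)[: pop // 2]]                     # step 3
        children = []
        while len(elite) + len(children) < pop:
            i, j = rng.integers(0, len(elite), 2)
            a, b = elite[i].copy(), elite[j].copy()
            if rng.random() < p_cross:                             # step 4
                cut = int(rng.integers(1, g * d))
                a[cut:], b[cut:] = b[cut:].copy(), a[cut:].copy()
            for c in (a, b):                                       # step 5
                mask = rng.random(g * d) < p_mut
                c[mask] += rng.normal(0.0, 0.1, int(mask.sum()))
            children += [a, b]
        P = np.vstack([elite, *children])[:pop]
    best = min(P, key=lambda c: objective(X, c.reshape(g, d)))
    return best.reshape(g, d)
```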


6.2. PSO algorithm

Step 1. Initially, P solutions (centroid sets) are given in an initial set of particles. The initial solutions are taken from the input dataset X, and the velocity of every particle is set to zero.

Step 2. For every particle (centroid set), the fitness is computed based on the objective function, and the particle's position with minimum fitness in the current iteration is assigned as pbest; pbest is the local best, guiding each particle toward its best position.

Step 3. Determine the particle that has the minimum fitness over all iterations executed and update it as gbest, the global best position found by all the particles in the search space.

Step 4. Once pbest and gbest are found, the particles' (centroids') velocities are newly generated using the following equation:


$$v_{t+1} = w \cdot v_t + \phi_1 \cdot rnd() \cdot (p_{best} - x_t) + \phi_2 \cdot rnd() \cdot (g_{best} - x_t) \qquad (15)$$

where $\phi_1$ and $\phi_2$ are set to two, $v_t$ is the old velocity of the particle, $rnd()$ is a random number in (0, 1), and $x_t$ is the current particle position used to find the new velocity.

Step 5. Then, new positions for all the particles are found using the new velocities and the previous positions. The formula used for computing the new position is as follows:

$$x_{t+1} = x_t + v_{t+1} \qquad (16)$$

Step 6. Once the new position is generated, lower-bound and upper-bound conditions are checked against the bounds available in the input database: if a new position value is less than the lower bound, it is replaced with the lower-bound value, and if it is greater than the upper bound, it is replaced with the upper-bound value.

Step 7. Go to step 2 until the maximum iteration is reached.
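A vectorised sketch of the update (Eqs. (15)-(16)) with the clipping described above; the parameter defaults follow the experimental setup of Section 7.3 (w = 0.72, phi1 = phi2 = 1.49), and all names are illustrative assumptions.

```python
import numpy as np

def pso_step(x, v, pbest, gbest, lb, ub,
             w=0.72, phi1=1.49, phi2=1.49, rng=None):
    """One PSO update: eq. (15) for the velocity, eq. (16) for the
    position, then bound handling. x, v, pbest: (pop, d*g); gbest: (d*g,)."""
    if rng is None:
        rng = np.random.default_rng()
    r1, r2 = rng.random(x.shape), rng.random(x.shape)
    v_new = w * v + phi1 * r1 * (pbest - x) + phi2 * r2 * (gbest - x)  # (15)
    x_new = np.clip(x + v_new, lb, ub)                                 # (16)
    return x_new, v_new
```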

6.3. Cuckoo search algorithm

Step 1: Initially, P solutions are given in an initial set of nests, and every nest is represented as a d × g vector.


Step 2: Choose a random number j between 1 and P through the Levy flight equation, and select the corresponding solution (centroid set).

Step 3: Evaluate the fitness of the nest at location j based on the objective function; then generate a random number i between 1 and P blindly, and choose the solution at location i of the initial population to find its fitness.

Step 4: Replace nest j by a new solution if the fitness of nest j is less than that of nest i. The fitnesses of the two solutions taken from the previous steps are compared, and the new solution x(t+1) for the worst nest is generated by


$$x^{(t+1)} = x^{(t)} + \alpha \oplus \mathrm{Levy}(\lambda) \qquad (17)$$

where $\alpha > 0$ is the step size, which should be related to the scale of the problem of interest, and the product $\oplus$ denotes entry-wise multiplication. Levy flights essentially provide a random walk whose random steps are drawn from a Levy distribution for large steps,

$$\mathrm{Levy} \sim u = t^{-\lambda}, \quad 1 < \lambda \le 3 \qquad (18)$$

Once the new position is generated, lower-bound and upper-bound conditions are checked against the bounds available in the input database: if a value of the new Levy-flight solution is greater than the upper bound, it is replaced with the upper-bound value, and if it is less than the lower bound, it is replaced with the lower-bound value.


Table 2. Description of datasets.

Dataset name       Notation  Data objects  Attributes  No. of classes  Dimension of solution
Iris               RD1       150           4           3               9
PID                RD2       768           8           2               14
Wine               RD3       178           13          3               36
Sonar              RD4       208           60          2               118
Blood Transfusion  RD5       748           5           2               8
Mammogram          RD6       961           6           2               10
Secom              LRD1      1567          591         2               1180
Madelon            LRD2      4400          500         2               998
Synthetic          SD1       10,029        3           2               4
Synthetic          SD2       12,745        3           3               6
Synthetic          SD3       42,248        3           2               4
Synthetic          SD4       55,647        3           2               4
Synthetic          SD5       5761          3           2               4
Synthetic          SD6       22,162        3           2               4
Image              Micro     65,025        6           2               10
Image              MRI       65,025        6           3               15

Step 5: Based on the probability pa given in the algorithm, the worst set of nests is identified and new nests are built in the corresponding locations.

Step 6: The best set of nests is maintained in every iteration based on the objective function, and the process is continued from steps 2 to 5 until the maximum iteration is reached.
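The paper gives only the Levy tail law of Eq. (18); a common way to draw such steps is Mantegna's algorithm, used in the sketch below (an implementation choice, not stated in the paper; here beta = 1.5, so the tail exponent lambda = 1 + beta lies in the (1, 3] range of Eq. (18)).

```python
import numpy as np
from math import gamma, sin, pi

def levy_step(x, alpha=1.0, beta=1.5, rng=None):
    """Eq. (17): x(t+1) = x(t) + alpha (entry-wise) Levy step.
    Steps drawn via Mantegna's algorithm (an assumed choice)."""
    if rng is None:
        rng = np.random.default_rng()
    sigma = (gamma(1 + beta) * sin(pi * beta / 2)
             / (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0.0, sigma, size=x.shape)
    v = rng.normal(0.0, 1.0, size=x.shape)
    return x + alpha * u / np.abs(v) ** (1 / beta)   # before bound checks
```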

7. Results and discussion

This section presents the experimental validation of the clustering techniques. First, the evaluation metrics and the datasets taken for validation are fully described. Then, detailed experimental results are given in tables, together with the corresponding discussion.

7.1. Evaluation metrics

The performance of the algorithms is measured through effectiveness and efficiency. Effectiveness is evaluated with four evaluation metrics: clustering accuracy (CA), given in Yang and Chen (2011); Rand coefficient (RC) and Jaccard coefficient (JC), given in Wan et al. (2012); and the adjusted Rand index (ARI). Efficiency is measured with computation time.

The definitions of these metrics are given as follows. Clustering accuracy:

$$CA = \frac{ \sum_{i=1}^{K} \max_{j \in \{1, 2, \ldots, g\}} \dfrac{ 2 \left| C_i \cap Pm_j \right| }{ \left| C_i \right| + \left| Pm_j \right| } }{K} \qquad (19)$$

Here, $C = \{ C_1, \ldots, C_K \}$ is a labeled data set that offers the ground truth, and $Pm = \{ Pm_1, \ldots, Pm_g \}$ is a partition produced by a clustering algorithm for the data set.

Rand coefficient: $RC = (SS + DD)/(SS + SD + DS + DD) \qquad (20)$


Jaccard coefficient: $JC = SS/(SS + SD + DS) \qquad (21)$

Adjusted Rand index:

$$ARI = \frac{ \binom{n}{2} (SS + DD) - \left[ (SS + SD)(SS + DS) + (DS + DD)(SD + DD) \right] }{ \binom{n}{2}^{2} - \left[ (SS + SD)(SS + DS) + (DS + DD)(SD + DD) \right] } \qquad (22)$$

Here, SS, SD, DS and DD represent the numbers of possible pairs of data points where:

SS: both data points belong to the same cluster and the same group.
SD: both data points belong to the same cluster but different groups.
DS: both data points belong to different clusters but the same group.
DD: both data points belong to different clusters and different groups.
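Given the four pair counts, RC, JC and ARI reduce to a few lines (a direct transcription of Eqs. (20)-(22); note that SS + SD + DS + DD equals the total number of pairs C(n, 2)):

```python
def pair_metrics(ss, sd, ds, dd):
    """Eqs. (20)-(22): Rand coefficient, Jaccard coefficient and
    adjusted Rand index from the pair counts SS, SD, DS, DD."""
    rc = (ss + dd) / (ss + sd + ds + dd)                        # eq. (20)
    jc = ss / (ss + sd + ds)                                    # eq. (21)
    pairs = ss + sd + ds + dd                                   # C(n, 2)
    cross = (ss + sd) * (ss + ds) + (ds + dd) * (sd + dd)
    ari = (pairs * (ss + dd) - cross) / (pairs ** 2 - cross)    # eq. (22)
    return rc, jc, ari
```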

Computation time: The efficiency of all the algorithms is evaluated with the execution time, which is measured by the MATLAB functions "tic" and "toc"; "toc" reads the elapsed time from the stopwatch timer started by the "tic" function.

7.2. Datasets description

The experimental validation is performed with 16 datasets taken from four categories: small-scale real data, synthetic data, large-scale real data and image data. Table 2 lists the total data objects, total attributes, number of classes and dimension of solution for all 16 datasets taken for the experimentation, and Fig. 2 shows the synthetic data generated and the image data.

Small-scale real data: For small-scale validation, datasets are taken from the UCI machine learning repository (UCI, 2013): Iris, PID, Wine, Blood Transfusion, Mammogram and Sonar are used to evaluate the clustering. Synthetic data: The synthetic data are generated to validate the performance of the algorithms with respect to various sizes, shapes, overlaps and numbers of classes. This synthetic-data experimentation shows to what extent the algorithms cope with different shapes, densities of data records, data sizes, overlaps and class counts.


Fig. 2. Visualization of synthetic (SD1-SD6) and image (microarray, MRI) data.


Accordingly, the SD1 and SD5 datasets are generated with the intention of evaluating clustering of letter symbols, SD2 is generated to evaluate the clustering algorithms on widely accepted shapes, and SD6 is generated to validate performance under overlap. Similarly, SD3 and SD4 are used to validate the performance of the algorithms on irregularly shaped clusters. The synthetic datasets are generated through a synthetic-data routine that takes as input an image drawn by the user. The image is converted to data samples by finding the locations of pixels with intensity greater than 0; based on each pixel location, a 2-D data point is formed from its x and y coordinate values.
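A sketch of that conversion (hypothetical helper; any grayscale array drawn by the user would do):

```python
import numpy as np

def image_to_points(img):
    """Convert a user-drawn grayscale image into 2-D samples: every
    pixel with intensity > 0 yields one (x, y) data record."""
    ys, xs = np.nonzero(np.asarray(img) > 0)
    return np.column_stack([xs, ys]).astype(float)
```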

Large-scale real data: Two datasets, Secom and Madelon, with solution dimensions of nearly 1000, are taken from the UCI machine learning repository (UCI, 2013). Image data: Two medical images, an MRI and a microarray image, are taken, and the data are formulated based on gray-level intensity. Here, the total number of data records equals the total number of pixels in the image, and the attributes of every pixel are its intensity value and the values of its four neighbouring pixels.


7.3. Experimental set up

The clustering algorithms are written in MATLAB (version R2011a), and the results are obtained on a system with a 2.13 GHz Intel(R) Pentium(R) CPU and 2 GB of RAM. For the genetic algorithm, the crossover rate and mutation rate are fixed at 0.8 and 0.005, respectively, based on the suggestion given in Hong and Kwong (2008). For the PSO algorithm, the parameters are fixed as $w = 0.72$, $\phi_1 = 1.49$, $\phi_2 = 1.49$ based on the recommendation of Merwe and Engelbrecht (2003). Also, $p_a = 0.25$ and $\alpha = 1$ are set for the cuckoo search algorithm as per Elkeran (2013). Another operator considered is the cluster size, which is fixed based on the ground-truth classes in each dataset; that is, the number of classes in the dataset equals the cluster size given for clustering. Since the cluster size and the attributes are fixed, the dimension of the solution is also fixed. The other two parameters common to the optimization algorithms are the iteration count and the population size, fixed at 100 and 10, respectively, as these are global convergence operators common to all the optimization algorithms.

7.4. Performance evaluation

7.4.1. Experimentation with synthetic data

The experimentation of all 21 algorithms and k-means on the synthetic data is reported in this section. The average performance of the algorithms for synthetic data is given in three tables, corresponding to three different views: search-algorithm perspective, objective perspective and hybridization perspective. After finding the performance measures for all the algorithms, the results are summarized by taking the mean value over the six synthetic datasets. So, for the hybridization perspective, six samples (the measures over all six synthetic datasets) are collected for each of the 21 algorithms and k-means to fill Table 5. Similarly, the algorithmic-perspective summary for synthetic data given in Table 3 is filled by taking seven samples (out of the 21 algorithms, every search algorithm is used seven times) for every dataset; globally, each entry is the average performance over 42 samples (7 algorithms × 6 synthetic datasets). This means every tuple of Table 3 is the average performance of the seven genetic-based algorithms of Table 1 over the six synthetic datasets. Table 4 summarizes the performance of all the clustering algorithms from the perspective of objective formulation; its values are obtained by taking three samples (out of the 21 algorithms, three share each objective function) for every dataset, i.e. the average over 18 samples (3 same-objective algorithms × 6 synthetic datasets) collected after performing clustering with the three same-objective algorithms of Table 1 on the six synthetic data samples.

For all six synthetic datasets, the maximum performance (one) is achieved by one of the 21 algorithms in terms of CA, RC and JC. In terms of ARI, CS-KM achieves the maximum performance, reaching 0.9936. The minimum computation times for the SD1, SD2, SD3, SD4, SD5 and SD6 data are 148.206113 s, 280.510192 s, 701.163344 s, 950.327078 s, 87.518454 s and 5415 s, respectively. The minimum varies with the dataset and the objective function, so performance cannot be compared through these objectives alone. From Table 3, CS obtains 0.9164 in terms of CA, which is higher than the genetic and PSO algorithms. For RC, the CS algorithm is better, achieving 0.8735, higher than the other two algorithms. Similarly, in terms of ARI, the CS algorithm is better, achieving 0.9179, again higher than the other two; the CS algorithm thus outperforms the existing algorithms on synthetic data. For the objective minimization, the KM objective achieves 0.9870, higher than the other existing and proposed objective functions; from the algorithmic perspective also, the KM-combined algorithms provide better results.


Table 3. Summary of data experimentation in the view of search algorithm.

                        GA        PSO       CS        K-means
Synthetic data
  CA                    0.8921    0.8748    0.9164    0.8944
  RC                    0.8724    0.8548    0.8735    0.8463
  JC                    0.8609    0.7647    0.8745    0.7747
  ARI                   0.9028    0.8885    0.9179    0.9014
  Time                  49288.0   2237.0    5665.1    5415.2
Small scale real data
  CA                    0.809386  0.808186  0.819169  0.7535
  RC                    0.59194   0.611712  0.599895  0.6522
  JC                    0.45826   0.481398  0.441381  0.4768
  ARI                   0.8047    0.7975    0.8168    0.7605
  Time                  54.9      2.8       7.2       24.3
Image data
  CA                    0.9229    0.911     0.9293    0.9211
  RC                    0.8772    0.8322    0.8506    0.8567
  JC                    0.8833    0.7506    0.7910    0.8083
  ARI                   0.9338    0.9277    0.9287    0.9286
  Time                  55994.0   1665.0    3771.0    13490.0
Large scale real data
  CA                    0.7961    0.8104    0.8059    0.7625
  RC                    0.5967    0.6094    0.5944    0.5149
  JC                    0.6274    0.6374    0.6337    0.5769
  ARI                   0.7513    0.7584    0.7562    0.7346
  Time                  903.6     34.9      79.7      295.3

Table 4. Summary of data experimentation in the view of objective formalization.

                        KM       FCM      KFCM     MKFCM    FCM + CF  KFCM + KCF  MKFCM + MKCF
Synthetic data
  CA                    0.9870   0.9620   0.8030   0.7418   0.9486    0.9227      0.7392
  RC                    0.9876   0.9436   0.8226   0.6556   0.8854    0.8775      0.6513
  JC                    0.9728   0.9146   0.8707   0.5145   0.8334    0.8140      0.4968
  ARI                   0.9772   0.9187   0.9555   0.7568   0.9464    0.8633      0.7660
  Time                  5415.0   7341.0   8979.0   5079.0   3980.0    3015.0      6067.0
Small scale real data
  CA                    0.7539   0.7944   0.8670   0.6135   0.8409    0.9309      0.8848
  RC                    0.6523   0.6215   0.6194   0.4806   0.6431    0.6744      0.5167
  JC                    0.4768   0.4629   0.4777   0.3678   0.4586    0.5465      0.4319
  ARI                   0.7608   0.7893   0.8769   0.5803   0.8458    0.9322      0.8592
  Time                  24.3     24.4     21.0     20.2     23.8      18.6        19.3
Image data
  CA                    0.9362   0.9564   0.9641   0.8115   0.9769    0.9456      0.8568
  RC                    0.9068   0.8886   0.8938   0.7312   0.9141    0.8699      0.7924
  JC                    0.8728   0.8287   0.8537   0.6862   0.8759    0.8181      0.7228
  ARI                   0.9212   0.9500   0.9780   0.8252   0.9948    0.9654      0.8759
  Time                  13490    22132    22138    30323    10477     14392       30387
Large scale real data
  CA                    0.7628   0.7235   0.8525   0.8527   0.7321    0.8529      0.8529
  RC                    0.5149   0.4738   0.6872   0.6874   0.4636    0.6872      0.6872
  JC                    0.5770   0.5496   0.6873   0.6877   0.5536    0.6873      0.6873
  ARI                   0.7228   0.7150   0.7797   0.7796   0.7188    0.7797      0.7797
  Time                  295.4    306.1    359.1    364.4    306.2     434.5       376.7

Table 5. Summary of synthetic data experimentation in the view of objective formalization and search algorithm.

                       CA      RC      JC      ARI     Time
K-means                0.8944  0.8463  0.7747  0.9014  5415.2
GA-KM (c)              0.9792  0.9628  0.9395  0.9711  14735.7
PSO-KM (c)             1       1       1       0.9669  454.9
CS-KM (c)              1       1       1       0.9936  1055.1
GA-FCM (c)             0.9713  0.9296  0.8826  0.9740  12661.2
PSO-FCM (c)            1       0.9578  0.9297  0.9749  455.2
CS-FCM (c)             0.9844  0.9615  0.9315  0.9872  1174.6
GA-KFCM (c)            0.9751  0.9254  0.8742  0.9699  15075.4
PSO-KFCM (c)           0.8884  0.8648  0.8054  0.9151  650.9
CS-KFCM (c)            0.9864  0.9608  0.9308  0.9815  1224.7
GA-MKFCM (b)           0.7259  0.6422  0.4996  0.7741  25063.1
PSO-MKFCM (b)          0.7244  0.6390  0.5124  0.7221  604.7
CS-MKFCM (b)           0.7752  0.6857  0.5314  0.7742  1611.6
GA-FCM + CF (a)        0.8904  0.8623  0.7835  0.9233  13021.1
PSO-FCM + CF (a)       0.9742  0.9140  0.8627  0.9741  555.5
CS-FCM + CF (a)        0.9212  0.8799  0.8540  0.9419  2584.4
GA-KFCM + KCF (a)      0.9752  0.9243  0.8725  0.9524  8800.7
PSO-KFCM + KCF (a)     0.7232  0.6664  0.5015  0.6533  628.5
CS-KFCM + KCF (a)      0.7168  0.6148  0.4518  0.5814  2616.8
GA-MKFCM + MKCF (a)    0.7712  0.6627  0.5035  0.7547  14249.6
PSO-MKFCM + MKCF (a)   0.7186  0.6189  0.4510  0.7776  986.4
CS-MKFCM + MKCF (a)    0.7711  0.6722  0.5357  0.7655  2965.1



7.4.2. Experimentation with real data of small and medium scale dimension

The summarized experimental results for the Iris, PID, Wine, Sonar, Transfusion and Mammogram data are given in Tables 3, 4 and 6. This experimentation tests the effectiveness and efficiency of the clustering algorithms on small and medium scale problems. On the PID, Sonar, Transfusion and Mammogram datasets, the maximum CA is reached. The maximum accuracy reached on the Iris data is 0.9867; for the Iris data, the maximum RC is 0.8923 and the maximum JC is 0.719, and in terms of ARI the maximum value is 0.9480.

Every clustering algorithm takes the six different real datasets as input, and the clustering outputs are measured with CA, JC, RC, ARI and computation time. Summaries of the experimental results are then obtained from three different perspectives and presented in Tables 3, 4 and 6. In Table 3, every tuple belonging to real data is the mean of CA, JC and RC over 42 samples (seven algorithms per search method × six datasets). Based on this analysis, the PSO algorithm provides better performance in terms of RC and JC, but for cuckoo search the summary value is 0.819 in terms of CA and 0.8168 in terms of ARI, which is higher than the other two algorithms. The computational cost of the PSO algorithm is much lower on these real datasets.


Table 6. Summary of real data experimentation in the view of objective formalization and search algorithm.

                       CA      RC      JC      ARI     Time
K-means                0.7535  0.6522  0.4768  0.7605  24.3
GA-KM (c)              0.7729  0.6550  0.4848  0.7771  62.5
PSO-KM (c)             0.7148  0.6457  0.4590  0.7268  3.3
CS-KM (c)              0.7741  0.6562  0.4867  0.7780  7.2
GA-FCM (c)             0.7489  0.5913  0.4332  0.7323  62.6
PSO-FCM (c)            0.8179  0.6376  0.4783  0.8276  3.2
CS-FCM (c)             0.8163  0.6356  0.4772  0.8079  7.3
GA-KFCM (c)            0.9144  0.6380  0.5049  0.9069  52.8
PSO-KFCM (c)           0.8699  0.6266  0.4806  0.8804  3.1
CS-KFCM (c)            0.8167  0.5935  0.4475  0.8441  7.1
GA-MKFCM (b)           0.5833  0.4631  0.3939  0.5814  50.2
PSO-MKFCM (b)          0.5927  0.4621  0.3807  0.5223  3.2
CS-MKFCM (b)           0.6646  0.5165  0.3287  0.6374  7.1
GA-FCM + CF (a)        0.7900  0.6228  0.4523  0.7912  56.3
PSO-FCM + CF (a)       0.8846  0.6620  0.4734  0.8889  2.9
CS-FCM + CF (a)        0.8480  0.6444  0.4502  0.8572  12.2
GA-KFCM + KCF (a)      0.9480  0.7089  0.5494  0.9585  48.7
PSO-KFCM + KCF (a)     0.9458  0.6404  0.5892  0.9464  2.1
CS-KFCM + KCF (a)      0.8990  0.6740  0.5006  0.8918  4.9
GA-MKFCM + MKCF (a)    0.9079  0.4641  0.3890  0.8858  51.2
PSO-MKFCM + MKCF (a)   0.8312  0.6073  0.5082  0.7909  2.0
CS-MKFCM + MKCF (a)    0.9152  0.4787  0.3984  0.9011  4.6

Table 7. Summary of image data experimentation in the view of objective formalization and search algorithm.

                       CA      RC      JC      ARI     Time
K-means                0.9211  0.8567  0.8083  0.9286  13490.6
GA-KM (c)              0.9480  0.9191  0.9401  0.9366  35772.9
PSO-KM (c)             0.8820  0.8382  0.7335  0.8512  1683.7
CS-KM (c)              0.9787  0.9632  0.9447  0.9786  3013.7
GA-FCM (c)             0.9722  0.9606  0.9383  0.9745  60674.2
PSO-FCM (c)            0.9114  0.7962  0.6981  0.8896  1903.7
CS-FCM (c)             0.9857  0.9090  0.8497  0.9855  3818.4
GA-KFCM (c)            0.9563  0.9342  0.9381  0.9615  61348.8
PSO-KFCM (c)           0.9706  0.8777  0.8170  0.8239  1658.9
CS-KFCM (c)            0.9656  0.8695  0.8061  0.9706  3406.6
GA-MKFCM (b)           0.8783  0.8212  0.8783  0.9106  85831.7
PSO-MKFCM (b)          0.7475  0.6918  0.5933  0.7762  1673.6
CS-MKFCM (b)           0.8088  0.6807  0.5870  0.7892  3463.7
GA-FCM + CF (a)        0.9689  0.9469  0.9492  0.9963  27193.3
PSO-FCM + CF (a)       0.9749  0.8857  0.8258  0.9822  1380.1
CS-FCM + CF (a)        0.9870  0.9096  0.8529  0.8859  2858.5
GA-KFCM + KCF (a)      0.8970  0.8544  0.8194  0.8942  38051.3
PSO-KFCM + KCF (a)     0.9542  0.8497  0.7850  0.8954  1658.3
CS-KFCM + KCF (a)      0.9857  0.9055  0.8500  0.9928  3468.1
GA-MKFCM + MKCF (a)    0.8401  0.7740  0.7199  0.8621  83090.0
PSO-MKFCM + MKCF (a)   0.9363  0.8861  0.8016  0.9830  1699.5
CS-MKFCM + MKCF (a)    0.7941  0.7171  0.6471  0.7822  6371.3



In Table 4, the summary is based on the objective-function performance on real data; each value is the mean of 18 samples (three algorithms per objective function out of the 21 × six datasets). Here, the proposed kernel-based objective outperforms all other objectives in terms of CA, RC and JC: the proposed KFCM + KCF achieves 0.9309 in CA and 0.9322 in ARI, higher than the existing objective functions. Likewise, the summary of the hybridized algorithms on real data is given in Table 6, where the values are averages over the six real datasets. Here, GA-KFCM + KCF provides 0.94807 as the CA value and 0.7089 as the RC value; in terms of JC, PSO-KFCM + KCF provides the better result.


7.4.3. Experimentation with image data

The experimentation of all 21 algorithms on the image data is performed in this section. To further analyze the clustering algorithms, the image data are taken and all the algorithms are applied to the microarray and MRI images. The maximum CA reached by any of the algorithms is 0.9895 and the maximum RC is 0.9792; the JC reached for the microarray data is 0.9751. Similarly, for the MRI image, the maximum values obtained over all the algorithms are 0.9992, 0.9774 and 0.9639 in terms of CA, RC and JC, respectively. The minimum time taken is 1159.035 s for the microarray data and 1601.326 s for the MRI image data.

The summarized results for the two image datasets are given in Tables 3, 4 and 7. In Table 3, the performance metrics are averages over the same search algorithm applied to the two datasets (14 samples in total); based on this table, the CS algorithm gives 0.9293 in terms of CA, while the other metrics are better for GA. The summarized image-data results from the perspective of the objective functions are given in Table 4; here, the proposed FCM + CF-based objectives outperform all the presented objectives, achieving nearly 0.9 for all the evaluation metrics CA, RC, ARI and JC. The summarized results of the hybridized algorithms are given in Table 7, as the average over the two datasets for every algorithm; based on this table, except for RC, the proposed CS-FCM + CF provides the better results of 0.987 in CA and 0.9492 in terms of JC, and the proposed GA-FCM + CF provides the better result of 0.9963 in ARI.

7.4.4. Experimentation with real data of large scale dimension

For the large-scale analysis, two real datasets, Secom and Madelon, are taken, and the performance of the algorithms is analyzed to find their suitability for large-scale cluster analysis. Based on the experimentation with large-scale data, the maximum CA reached for the Secom data is 0.9994 and the maximum RC is 0.877; the maximum ARI reached for the Secom data is 0.8525, and the minimum computation time required for the Secom data is 26.73115 s. For the Madelon data, the maximum performance in terms of CA, RC and JC is 0.7065, 0.4995 and 0.5085, and the minimum computation time is 34.75245 s.

The summarized results are presented in Tables 3, 4 and 8. In Table 3, the average performance is computed by taking the mean value over the seven algorithms sharing the same search algorithm on the two datasets, i.e., the mean of 14 samples. Table 3 clearly indicates that the PSO algorithm outperformed the other search algorithms in all the evaluation metrics. The PSO algorithm gave 0.8104 as CA, whereas the second rank goes to CS, which reached only 0.8054. The computation time of the PSO algorithm for large scale data is half that of the CS algorithm. From Table 4, the MKFCM objectives are better compared with the other objective functions. The proposed MKFCM + MKCF provided 0.8529 as CA, which is higher than the existing objectives. From Table 8, PSO-MKFCM achieved better results in all the evaluation metrics taken: CA, RC, ARI and JC. The values of CA, RC and JC for this objective function are 0.85295, 0.68825 and 0.68845, respectively.

7.5. Discussion

A thorough discussion of the experimentation with the 21 different algorithms on the 16 different datasets is provided in this section. The performance of the clustering algorithms is analyzed and discussed through four different categories (based on the datasets) with respect to effectiveness and efficiency, along with three perspectives (search algorithm, objective function and hybridized form). The effectiveness is assessed based on CA, JC and RC, and the efficiency is assessed using the computation time.
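The paper does not restate its metric formulas in this section, so the sketch below shows one common set of definitions as an assumption on our part rather than the authors' exact implementation: CA as majority-vote cluster accuracy, and RC and JC as the pair-counting Rand and Jaccard coefficients.

```python
from itertools import combinations
import numpy as np

def _pair_counts(true, pred):
    """Count agreeing/disagreeing point pairs for pair-based indices (O(n^2))."""
    a = b = c = d = 0
    for i, j in combinations(range(len(true)), 2):
        same_t, same_p = true[i] == true[j], pred[i] == pred[j]
        if same_t and same_p:
            a += 1          # together in both partitions
        elif same_t:
            b += 1          # together in the true partition only
        elif same_p:
            c += 1          # together in the predicted partition only
        else:
            d += 1          # separated in both partitions
    return a, b, c, d

def rand_coefficient(true, pred):
    a, b, c, d = _pair_counts(true, pred)
    return (a + d) / (a + b + c + d)

def jaccard_coefficient(true, pred):
    a, b, c, _ = _pair_counts(true, pred)
    return a / (a + b + c)

def clustering_accuracy(true, pred):
    """Map each cluster to its majority class label, then measure accuracy.
    Labels are assumed to be nonnegative integers."""
    true, pred = np.asarray(true), np.asarray(pred)
    hits = sum(np.bincount(true[pred == k]).max() for k in np.unique(pred))
    return hits / len(true)
```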

Table 8
Summary of large data experimentation in the view of objective formalization and search algorithm.

Algorithm               CA      RC      JC      ARI     Time (s)
K-means                 0.7625  0.5149  0.5765  0.7346  590.7
GA-KM^c                 0.7601  0.5015  0.5739  0.7333  784.3
PSO-KM^c                0.7664  0.5565  0.5800  0.7364  30.9
CS-KM^c                 0.7620  0.4868  0.5771  0.7340  70.9
GA-FCM^c                0.6678  0.4609  0.5156  0.6871  816.3
PSO-FCM^c               0.7616  0.4944  0.5753  0.7345  30.9
CS-FCM^c                0.7412  0.4663  0.5581  0.7275  71.1
GA-KFCM^c               0.8529  0.6872  0.6873  0.7795  956.3
PSO-KFCM^c              0.8529  0.6872  0.6873  0.7795  36.8
CS-KFCM^c               0.8529  0.6872  0.6873  0.7797  84.5
GA-MKFCM^b              0.8526  0.6866  0.6868  0.7795  968.6
PSO-MKFCM^b             0.8529  0.6882  0.6884  0.7795  38.0
CS-MKFCM^b              0.8526  0.6875  0.6878  0.7795  86.5
GA-FCM + CF^a           0.7333  0.4667  0.5538  0.7199  816.5
PSO-FCM + CF^a          0.7333  0.4651  0.5561  0.7199  31.3
CS-FCM + CF^a           0.7272  0.4591  0.5508  0.7165  70.9
GA-KFCM + KCF^a         0.8529  0.6872  0.6873  0.7795  979.4
PSO-KFCM + KCF^a        0.8529  0.6872  0.6873  0.7795  41.2
CS-KFCM + KCF^a         0.8529  0.6872  0.6873  0.7795  86.3
GA-MKFCM + MKCF^a       0.8529  0.6872  0.6873  0.7795  1004.2
PSO-MKFCM + MKCF^a      0.8529  0.6872  0.6873  0.7795  38.4
CS-MKFCM + MKCF^a       0.8529  0.6872  0.6873  0.7795  87.6

7.5.1. Complexity analysis

The computational complexity of the objective function is computed based on big O notation. For the worst case, the computational complexity (CC) of the proposed objective function is computed as follows:

$CC = O\left(ng(d+1) + n \log g + 2ng \log g\right)$   (22)

For best case,

$CC = O\left(n(g(d+3)) + 1\right)$   (23)

For average case,

$CC = O\left(ng(d + 1.5 + g)\right)$   (24)

where $n$ is the total number of data points, $d$ is the dimension of the real data space, and $g$ is the number of clusters.
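As a worked illustration of Eqs. (22)–(24), the sketch below evaluates the three operation-count expressions for concrete values of n, d and g. The logarithm base is not stated in the paper; base 2 is assumed here for illustration.

```python
import math

def cc_worst(n, d, g):
    # Eq. (22): ng(d + 1) + n log g + 2ng log g
    return n * g * (d + 1) + n * math.log2(g) + 2 * n * g * math.log2(g)

def cc_best(n, d, g):
    # Eq. (23): n(g(d + 3)) + 1
    return n * (g * (d + 3)) + 1

def cc_average(n, d, g):
    # Eq. (24): ng(d + 1.5 + g)
    return n * g * (d + 1.5 + g)

# e.g., a dataset with 1000 points, 10 dimensions and 4 clusters:
print(cc_best(1000, 10, 4), cc_average(1000, 10, 4), cc_worst(1000, 10, 4))
```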

7.5.2. Findings

For the first category (experimentation with synthetic data), the CS-based algorithms outperformed in the three effectiveness metrics, while the efficiency is better for the PSO-based algorithms, in the search-algorithm-dependent perspective. In the perspective of objective functions, the KM objective outperformed for all three evaluation metrics considered to prove the effectiveness. With respect to the hybridized clustering, PSO-KM and CS-KM provided the maximum accuracy, and PSO-KM requires less computation time. This performance ensures that the existing KM is better suited if the data is integer-valued.

From the results, we can easily understand that the multiple kernel-based algorithms struggled to find the exact number of clusters as compared with the other clustering algorithms. While analyzing the performance based on the optimization algorithms, there is no big performance deviation among the GA, PSO and CS algorithms. Also, the objective function-based performance changes for every objective function, which is the core of the effectiveness analysis. Here, the multiple kernel-based objectives, whether the proposed objective function or the existing one, do not contribute much to the performance of the clustering process as compared with the other objective functions.

For the second category (experimentation with small and medium scale real data), the PSO algorithm outperformed in two evaluation metrics and the cuckoo search algorithm in the other one.

The computation effort is minimized for the PSO algorithm. For the objective-based analysis, the proposed KFCM + KCF proved better in all the effectiveness measures as well as in efficiency as compared with the other objective functions. In terms of the hybridized results, GA-KFCM + KCF scored the highest rank in terms of CA and RC. On the other hand, PSO-KFCM + KCF is better in JC, and the better efficiency is also achieved by PSO-KFCM + KCF. The existing objective functions did not provide as much effectiveness as the proposed kernel-based objective function, which provided better results for the real data drawn from different applications. The performance achieved by the proposed algorithm ensures that the application and its data range are not a problem for the kernel-based objective function.

For the third category (experimentation with image data), the GA algorithm performed better in two evaluation metrics and the CS algorithm in the other one. Again, the time is minimized for the PSO algorithm. In the perspective of objective formulation, the proposed FCM + CF outperformed in both effectiveness and efficiency. In the hybridized form, GA-FCM + CF provided better results in terms of JC and CS-FCM + CF in terms of CA. The computation effort is minimized when PSO-FCM + CF is used for the image data, so this ensures that the proposed FCM + CF provides good significance if the data range is constant for all the attributes.

For the fourth category (experimentation with large scale data), the PSO-based algorithms outperformed in both effectiveness and efficiency. In the objective function-based analysis, the effectiveness measures are better improved by MKFCM and MKFCM + MKCF, while the better efficiency is obtained by k-means for the large scale data. For the hybridized form, PSO-MKFCM is the better choice to obtain more effective clustering, and the efficient result is achieved by the PSO-FCM algorithm for the large scale data. The conclusion from the experimentation with large scale data is that, if the attribute size is very large and the data ranges and variances are not uniform, the proposed MKFCM + MKCF provides better results as compared with the existing objective functions.

7.5.3. Suggestions

Based on the analysis, we can easily say that the algorithmic effectiveness is decided by the objective function and the algorithmic efficiency is decided by the search algorithm. For better efficiency, the PSO algorithm is the right choice for all the different sets of data, whether small scale or large scale. For effectiveness, the right objective function should be selected based on the characteristics of the dataset, such as the range of values, dimension, image content and data type (integer or floating point). Based on these data characteristics, the suggestion is that KM can be chosen if (i) the data is fully integer and within a constant interval, and (ii) the dimension of the solution is low. The KFCM + KCF can be chosen if (i) the data is any integer or floating point value data, and (ii) the dimension is small or medium. The FCM + CF can be chosen only if (i) the range of values is constant for all attributes, (ii) the values are medium, and (iii) the task is image analysis. Finally, the MKFCM + MKCF can be chosen only if (i) the dimension of the data is high, (ii) the range of values is not constant, and (iii) the data may be integer or floating point. A minimal decision-rule sketch of these suggestions follows.
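The sketch below encodes the above rules as a small decision procedure. The function name and the categorical encoding of the data characteristics are our own illustration (assumptions), not the paper's notation:

```python
def suggest_objective(integer_only: bool, constant_range: bool,
                      dimension: str, image_data: bool = False) -> str:
    """Rule-of-thumb selector distilled from Section 7.5.3.
    dimension is 'low', 'medium' or 'high'."""
    if dimension == "high" and not constant_range:
        return "MKFCM + MKCF"   # high-dimensional data with non-uniform ranges
    if integer_only and constant_range and dimension == "low":
        return "KM"             # integer data in a constant interval, low dimension
    if constant_range and image_data:
        return "FCM + CF"       # constant attribute ranges, e.g. image analysis
    return "KFCM + KCF"         # general integer/floating-point, small-to-medium dimension
```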

8. Conclusion

We have presented 21 different techniques to find optimal clusters with the help of different objective functions and optimization algorithms for expert systems and their applications in the fields of medicine, telecommunication and engineering. Here, three objective functions are newly designed, alongside four existing objective functions, to incorporate with the optimization algorithms.

The new objective functions were introduced with the consideration of distance and fuzzy variables, along with two additional variables that were not defined previously, called cumulative distance and cumulative fuzzy values. In total, 21 different clustering algorithms are discussed with mathematical formulation after blending each objective function with a search algorithm, namely GA, CS or PSO. The performance of the algorithms is evaluated with 16 different datasets of different shapes and characteristics. The effectiveness and efficiency of the algorithms are compared using three different evaluation metrics together with the computation time. From the research outcome, we can easily conclude that the algorithmic effectiveness depends mainly on the objective function, while the algorithmic efficiency is decided by the search algorithm. Finally, the right choice of algorithm is suggested depending on the characteristics of the input data.

Clustering is a potential method with various applications, especially in expert systems; various expert systems need a good unsupervised learning strategy to fulfill their requirements. The important practical applications of clustering found in the literature include speaker clustering (Tang, Chu, Hasegawa-Johnson, & Huang, 2012), analysis of fMRI data (Zhang, Xianguo, Zhen, Wei, & Huafu, 2011), wireless sensor networks (Youssef, Youssef, & Younis, 2009), grouping of text documents (Cao, Zhiang, Junjie, & Hui, 2013), auditory scene categorization (Cai, Lie, & Hanjalic, 2008), news story clustering (Xiao, Chong-Wah, & Hauptmann, 2008), target tracking (Liang et al., 2010), network clustering (Huang, Heli, Qinbao, Hongbo, & Jiawei, 2013), cancer gene expression profiles (Zhiwen, Le, You, Hau-San, & Guoqiang, 2012), and social networks (Caimei, Xiaohua, & Jung-ran, 2011).

The major limitation of the proposed algorithm is the user-given cluster size, which requires knowledge of the data from the user. The second limitation is the number of iterations, which is the termination criterion utilized here for convergence; a critical analysis is required to define a better termination criterion. Also, multiple parametric inputs and the optimal fixing of threshold values are to be overcome for better application of the clustering process.

The proposed clustering algorithms can be applied to clinical decision support systems, disease diagnosis, agricultural research, forecasting and routing to obtain more effective results based on the findings of this research. Also, the clustering can be extended by modifying the optimization search algorithm towards reducing the time complexity. Again, the effectiveness can be improved further by including different constraints, as per the datasets or the user, in the objective functions.

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.eswa.2015.03.031.

References

Bandyopadhyay, S. (2011). Multiobjective simulated annealing for fuzzy clustering with stability and validity. IEEE Transactions on Systems, Man, and Cybernetics – Part C: Applications and Reviews, 41(5), 682–691.
Bezdek, J. C. (1981). Pattern recognition with fuzzy objective function algorithms. New York: Plenum Press. ISBN 0306406713.
Cai, R., Lie, L., & Hanjalic, A. (2008). Co-clustering for auditory scene categorization. IEEE Transactions on Multimedia, 10(4), 596–606.
Caimei, L., Xiaohua, H., & Jung-ran, P. (2011). The social tagging network for web clustering. IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans, 41(5), 840–852.
Cao, J., Zhiang, W., Junjie, W., & Hui, X. (2013). SAIL: Summation-based incremental learning for information-theoretic text clustering. IEEE Transactions on Cybernetics, 43(2), 570–584.

Castellanos-Garzón, J. A., & Diaz, F. (2013). An evolutionary computational model applied to cluster analysis of DNA microarray data. Expert Systems with Applications, 40(7), 2575–2591.
Chen, L., Chen, C. L. P., & Lu, M. (2011). A multiple-kernel fuzzy c-means algorithm for image segmentation. IEEE Transactions on Systems, Man, and Cybernetics – Part B: Cybernetics, 41(5).
Das, S., Abraham, A., & Konar, A. (2008). Automatic clustering using an improved differential evolution algorithm. IEEE Transactions on Systems, Man, and Cybernetics – Part A: Systems and Humans, 38(1), 218–237.
Hoang, D. C., Yadav, P., Kumar, R., & Panda, S. K. (2014). Implementation of a harmony search algorithm-based clustering protocol for energy-efficient wireless sensor networks. IEEE Transactions on Industrial Informatics, 10(1), 774–783.
Elkeran, A. (2013). A new approach for sheet nesting problem using guided cuckoo search and pairwise clustering. European Journal of Operational Research, 231(3), 757–769.
Krishnasamy, G., Kulkarni, A. J., & Paramesran, R. (2014). Hybrid approach for data clustering based on modified cohort intelligence and K-means. Expert Systems with Applications, 41(13), 6009–6016.
Garcia-Piquer, A., Fornells, A., Bacardit, J., Orriols-Puig, A., & Golobardes, E. (2014). Large-scale experimental evaluation of cluster representations for multiobjective evolutionary clustering. IEEE Transactions on Evolutionary Computation, 18(1), 36–53.
Goldberg, D. E. (1989). Genetic algorithms in search, optimization and machine learning. Reading, MA: Addison-Wesley Professional. ISBN 978-0201157673.
Graves, D., & Pedrycz, W. (2010). Kernel-based fuzzy clustering and fuzzy clustering: A comparative experimental study. Fuzzy Sets and Systems, 161, 522–543.
Hong, Y., & Kwong, S. (2008). To combine steady-state genetic algorithm and ensemble learning for data clustering. Pattern Recognition Letters, 29(9), 1416–1423.
Huang, J., Heli, S., Qinbao, S., Hongbo, D., & Jiawei, H. (2013). Revealing density-based clustering structure from the core-connected tree of a network. IEEE Transactions on Knowledge and Data Engineering, 25(8), 1876–1889.
İnkaya, T., Kayalıgil, S., & Özdemirel, N. E. (2015). Ant colony optimization based clustering methodology. Applied Soft Computing, 28, 301–311.
Ji, J., Pang, W., Zhou, C., Han, X., & Wang, Z. (2012). A fuzzy k-prototype clustering algorithm for mixed numeric and categorical data. Knowledge-Based Systems, 30, 129–135.
Ji, Z., Xi, Y., Chen, Q., Sun, Q., Xia, D., & Feng, D. D. (2012). Fuzzy c-means clustering with weighted image patch for image segmentation. Applied Soft Computing, 12, 1659–1667.
Kannana, S. R., Ramathilagam, S., & Chung, P. C. (2012). Effective fuzzy c-means clustering algorithms for data clustering problems. Expert Systems with Applications, 39(7), 6292–6300.
Kennedy, J., & Eberhart, R. C. (1995). Particle swarm optimization. In Proc. of the IEEE int'l conf. on neural networks (Vol. IV, pp. 1942–1948). Piscataway, NJ: IEEE Service Center.
Khan, S. S., & Ahmad, A. (2004). Cluster center initialization algorithm for K-means clustering. Pattern Recognition Letters, 25, 1293–1302.
Krishna, K., & Murty, M. N. (1999). Genetic K-means algorithm. IEEE Transactions on Systems, Man, and Cybernetics – Part B: Cybernetics, 29, 433–439.
Kuo, R. J., Syu, Y. J., Chen, Z. Y., & Tien, F. C. (2012). Integration of particle swarm optimization and genetic algorithm for dynamic clustering. Information Sciences, 19, 124–140.
Liang, Z., Chaovalitwongse, W. A., Rodriguez, A. D., Jeffcoat, D. E., Grundel, D. A., & O'Neal, J. K. (2010). Optimization of spatiotemporal clustering for target tracking from multisensor data. IEEE Transactions on Systems, Man, and Cybernetics – Part C: Applications and Reviews, 40(2), 176–188.
Linda, O., & Manic, M. (2012). General type-2 fuzzy c-means algorithm for uncertain fuzzy clustering. IEEE Transactions on Fuzzy Systems, 20(5), 883–897.
Li, & Qi (2007). Spatial kernel K-harmonic means clustering for multi-spectral image segmentation. IET Image Processing, 1(2), 156–167.
Liyong, Z., Witold, P., Wei, L., Xiaodong, L., & Li, Z. (2014). An interval weighed fuzzy c-means clustering by genetically guided alternating optimization. Expert Systems with Applications, 41(13), 5960–5971.
Maji, P. (2011). Fuzzy-rough supervised attribute clustering algorithm and classification of microarray data. IEEE Transactions on Systems, Man, and Cybernetics – Part B: Cybernetics, 41(1), 222–233.
McQueen, J. (1967). Some methods for classification and analysis of multivariate observations. In Proc. of the fifth Berkeley symposium on mathematical statistics and probability (pp. 281–297).
Merwe, D. W. V., & Engelbrecht, A. P. (2003). Data clustering using particle swarm optimization. In The congress on evolutionary computation (Vol. 1, pp. 215–220).
Mualik, U., & Bandyopadhyay, S. (2002). Genetic algorithm based clustering technique. Pattern Recognition, 33, 1455–1465.
Niknam, T., & Amiri, B. (2010). An efficient hybrid approach based on PSO, ACO and k-means for cluster analysis. Applied Soft Computing, 10, 183–197.
Ouadfel, S., & Meshoul, S. (2012). Handling fuzzy image clustering with a modified ABC algorithm. International Journal of Intelligent Systems and Applications, 12, 65–74.
Pakhira, M. K., Bandyopadhyay, S., & Maulik, U. (2005). A study of some fuzzy cluster validity indices, genetic clustering and application to pixel classification. Fuzzy Sets and Systems, 155(2).
Pham, D. T., Dimov, S. S., & Nguyen, C. D. (2004). Selection of K in K-means clustering. In Proc. IMechE (p. 219).

Premalatha, K., & Natarajan, A. M. (2008). A new approach for data clustering based on PSO with local search. Computer and Information Science, 1(4).
Saha, S., & Bandyopadhyay, S. (2009). A new multiobjective simulated annealing based clustering technique using symmetry. Pattern Recognition Letters, 30, 1392–1403.
Selim, S. Z., & Alsultan, K. (1991). A simulated annealing algorithm for the clustering problem. Pattern Recognition, 24(10), 1003–1008.
Senthilnath, J., Omkar, S. N., & Mani, V. (2011). Clustering using firefly algorithm: Performance study. Swarm and Evolutionary Computation, 1, 164–171.
Sulaiman, S. N., & Isa, N. A. M. (2010). Adaptive fuzzy-K-means clustering algorithm for image segmentation. IEEE Transactions on Consumer Electronics, 56(4), 2661–2668.
Szilágyi, L., Szilágyi, S. M., Benyó, B., & Benyó, Z. (2011). Intensity inhomogeneity compensation and segmentation of MR brain images using hybrid c-means clustering models. Biomedical Signal Processing and Control, 6, 3–12.
Tang, H., Chu, S. M., Hasegawa-Johnson, M., & Huang, T. S. (2012). Partially supervised speaker clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(5), 959–971.
Tengke, X., Shengrui, W., Qingshan, J., & Huang, J. Z. (2014). A novel variable-order Markov model for clustering categorical sequences. IEEE Transactions on Knowledge and Data Engineering, 26(10), 2339–2353.
UCI machine learning repository. <http://archive.ics.uci.edu/ml/datasets.html>.
Wan, M., Li, L., Xiao, J., Wang, C., & Yang, Y. (2012). Data clustering using bacterial foraging optimization. Journal of Intelligent Information Systems, 38(2), 321–341.
Wei, S., Yingying, Q., Soon Cheol, P., & Xuezhong, Q. (2015). A hybrid evolutionary computation approach with its application for optimizing text document clustering. Expert Systems with Applications, 42(5), 2517–2524.
Xiao, W., Chong-Wah, N., & Hauptmann, A. G. (2008). Multimodal news story clustering with pairwise visual near-duplicate constraint. IEEE Transactions on Multimedia, 10(2), 188–199.

Xu, R., Xu, J., & Wunsch, D. C. (2012). A comparison study of validity indices on swarm-intelligence-based clustering. IEEE Transactions on Systems, Man, and Cybernetics – Part B: Cybernetics, 42(4), 1243–1256.
Yang, Y., & Chen, K. (2011). Temporal data clustering via weighted clustering ensemble with different representations. IEEE Transactions on Knowledge and Data Engineering, 23(2).
Yang, X. S., & Deb, S. (2010). Engineering optimization by cuckoo search. International Journal of Mathematical Modelling and Numerical Optimisation, 1(4), 330–343.
Youssef, M., Youssef, A., & Younis, M. (2009). Overlapping multihop clustering for wireless sensor networks. IEEE Transactions on Parallel and Distributed Systems, 20(12), 1844–1856.
Yuwono, M., Su, S. W., Moulton, B. D., & Nguyen, H. T. (2014). Clustering using variants of rapid centroid estimation. IEEE Transactions on Evolutionary Computation, 18(3), 366–377.
Zhang, D. Q., & Chen, S. C. (2004). A novel kernelized fuzzy C-means algorithm with application in medical image segmentation. Artificial Intelligence in Medicine, 32(1), 37–50.
Zhang, C., Ouyang, D., & Ning, J. (2010). An artificial bee colony approach for clustering. Expert Systems with Applications, 37, 4761–4767.
Zhang, J., Xianguo, T., Zhen, Y., Wei, L., & Huafu, C. (2011). Analysis of fMRI data using an integrated principal component analysis and supervised affinity propagation clustering approach. IEEE Transactions on Biomedical Engineering, 58(11), 3184–3196.
Zhao, F., Jiao, L., & Liu, H. (2013). Kernel generalized fuzzy c-means clustering with spatial information for image segmentation. Digital Signal Processing, 23, 184–199.
Zhiwen, Y., Le, L., You, J., Hau-San, W., & Guoqiang, H. (2012). SC³: Triple spectral clustering-based consensus clustering framework for class discovery from cancer gene expression profiles. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 9(6), 1751–1765.
