
NMF based Gene Selection Algorithm for Improving Performance of the Spectral Cancer Clustering

Andri Mirzal
Faculty of Computing
Universiti Teknologi Malaysia
Skudai, Johor Bahru, Malaysia
Email: [email protected]

Abstract—Analyzing cancers using microarray gene expression datasets is currently an active research area in the medical community. There are many tasks related to this research, e.g., clustering and classification, data compression, and sample characterization. In this paper, we discuss the task of cancer clustering. Spectral clustering is one of the most commonly used methods in cancer clustering. As gene expression datasets are usually highly imbalanced, i.e., containing only a few tissue samples (hundreds at most) but each expressed by thousands of genes, filtering out irrelevant and potentially misleading gene expressions is a necessary step to improve the performance of the method. In this paper, we propose an unsupervised gene selection algorithm based on the nonnegative matrix factorization (NMF). Our algorithm makes use of the clustering capability of the NMF to select the most informative genes. Clustering performance of the spectral method is then evaluated by comparing the results on the original datasets with the results on the pruned datasets. Our results suggest that the proposed algorithm can be used to improve the clustering performance of the spectral method.

Keywords—cancer clustering, gene selection, nonnegative matrix factorization, spectral clustering

I. INTRODUCTION

Cancer clustering is the task of grouping samples from patients with cancers so that samples of the same type are clustered in the same group (usually each group refers to a specific cancer type) [1]. In some datasets, normal tissues are also included for control purposes [2]. In the literature, one can find two related terms, cancer clustering and cancer classification, which are sometimes used interchangeably. In this paper we explicitly differentiate these terms: cancer clustering refers to the unsupervised task of grouping the samples, and cancer classification refers to the supervised task where classifiers are trained first before being used to classify the samples.

In recent years, thousands of new gene expression datasets have been generated. These datasets usually consist of only a few samples (hundreds at most), but each sample is represented by thousands of gene expressions. This characteristic makes analyzing the datasets quite challenging because most clustering and classification techniques perform poorly when the number of samples is small. In addition, the high dimensionality of the data suggests that many of the gene expressions are actually irrelevant and possibly misleading, and thus a gene selection procedure should be employed to clean the data. For the classification problem, the small number of samples creates the additional problem of overfitting the classifiers [3].

The use of gene selection procedures to improve classification performance has been extensively studied [1], [3]–[17]. Most of the proposed methods are based on support vector machines (SVMs), and it has been shown that these methods can significantly improve the performance of the classifiers. In cancer clustering research, however, gene selection is not yet well studied. The common approach is to use all dimensions, which potentially reduces the performance of the clustering algorithms because the data can contain irrelevant and misleading gene expressions.

In this paper, we propose an unsupervised gene selection algorithm based on the nonnegative matrix factorization (NMF). NMF is a matrix factorization technique that decomposes a nonnegative matrix into a pair of other nonnegative matrices. It has been successfully applied in many problem domains including clustering [4]–[6], [18]–[32], image analysis [33]–[37], and feature extraction [4], [25], [38], [39]. The proposed algorithm exploits the fact that the NMF can group similar genes in an unsupervised manner and that the membership degrees of each gene to the clusters are directly given by the entries in the corresponding column of the coefficient matrix. We then use the proposed algorithm to improve the performance of spectral clustering.

II. THE SPECTRAL CLUSTERING

Spectral clustering is a family of multiway clustering techniques that make use of eigenvectors of the data matrix to perform the clustering. Depending on the choice of the matrix, the number of eigenvectors, and the algorithm used to infer clusters from the eigenvectors, many spectral clustering algorithms are available, e.g., [40]–[42] (a detailed discussion on spectral clustering can be found in ref. [43]). Here we use the spectral clustering algorithm proposed by Ng et al. [41]. We choose this algorithm because of its intuitiveness and clustering capability. Algorithm 1 outlines the algorithm, where $\mathbb{R}^{M \times N}_+$ denotes an $M$-by-$N$ nonnegative matrix and $\mathbb{B}^{M \times K}_+$ denotes an $M$-by-$K$ binary matrix.

III. THE PROPOSED ALGORITHM

Given a nonnegative data matrix $\mathbf{A} \in \mathbb{R}^{M \times N}_+$, the NMF decomposes the matrix into the basis matrix $\mathbf{B} \in \mathbb{R}^{M \times R}_+$ and the coefficient matrix $\mathbf{C} \in \mathbb{R}^{R \times N}_+$ such that:

$$\mathbf{A} \approx \mathbf{B}\mathbf{C}.$$


Algorithm 1 A spectral clustering algorithm by Ng et al. [41]

1) Input: Rectangular data matrix $\mathbf{A} \in \mathbb{R}^{M \times N}_+$ with $M$ data points, and the number of clusters $K$.
2) Construct a symmetric affinity matrix $\hat{\mathbf{A}} \in \mathbb{R}^{M \times M}_+$ from $\mathbf{A}$ by using the Gaussian kernel.
3) Normalize $\hat{\mathbf{A}}$ by $\hat{\mathbf{A}} \leftarrow \mathbf{D}^{-1/2}\hat{\mathbf{A}}\mathbf{D}^{-1/2}$, where $\mathbf{D}$ is a diagonal matrix with $D_{ii} = \sum_j \hat{A}_{ij}$.
4) Compute the $K$ eigenvectors that correspond to the $K$ largest eigenvalues of $\hat{\mathbf{A}}$, and form $\mathbf{X} = [\mathbf{x}_1, \ldots, \mathbf{x}_K] \in \mathbb{R}^{M \times K}$, where $\mathbf{x}_k$ is the $k$-th eigenvector.
5) Normalize every row of $\mathbf{X}$, i.e., $X_{ij} \leftarrow X_{ij} / \bigl(\sum_j X_{ij}^2\bigr)^{1/2}$.
6) Apply k-means clustering on the rows of $\mathbf{X}$ to obtain the clustering indicator matrix in $\mathbb{B}^{M \times K}_+$.
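For concreteness, the following is a minimal Python sketch of Algorithm 1. It is not the code used in the paper: NumPy and scikit-learn are assumed as the numerical libraries, and the Gaussian kernel width sigma is an assumed parameter whose value the paper does not report.

```python
import numpy as np
from sklearn.cluster import KMeans

def spectral_clustering(A, K, sigma=1.0, random_state=0):
    """Sketch of the Ng et al. algorithm; sigma is an assumed kernel width."""
    # Step 2: Gaussian-kernel affinity between the M samples (rows of A).
    sq_dists = np.sum((A[:, None, :] - A[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # zero self-affinities, as in Ng et al.
    # Step 3: symmetric normalization W <- D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1) + 1e-12)
    W_norm = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    # Step 4: eigenvectors of the K largest eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(W_norm)
    X = eigvecs[:, np.argsort(eigvals)[::-1][:K]]
    # Step 5: normalize every row of X to unit length.
    X = X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-12)
    # Step 6: k-means on the rows of X gives the cluster labels.
    return KMeans(n_clusters=K, n_init=10, random_state=random_state).fit_predict(X)
```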

To compute $\mathbf{B}$ and $\mathbf{C}$, usually the following optimization problem is used:

$$\min_{\mathbf{B},\mathbf{C}} J(\mathbf{B},\mathbf{C}) = \frac{1}{2}\,\|\mathbf{A}-\mathbf{B}\mathbf{C}\|_F^2 \quad \text{s.t. } \mathbf{B} \ge 0,\ \mathbf{C} \ge 0, \qquad (1)$$

where $\|\mathbf{X}\|_F$ denotes the Frobenius norm of $\mathbf{X}$.

There are many algorithms proposed to solve the optimization problem in eq. (1). However, for clustering purposes, there is not much performance difference between the standard NMF algorithm proposed by Lee and Seung [44] and the more advanced and application-specific algorithms [4]–[6], [18]–[32]. Accordingly, we use the standard NMF algorithm. Algorithm 2 outlines the algorithm, where $b^{(k)}_{mr}$ denotes the $(m,r)$ entry of $\mathbf{B}$ at the $k$-th iteration, $\mathbf{X}^T$ denotes the transpose of $\mathbf{X}$, and $\delta$ denotes a small positive number to avoid division by zero.

Algorithm 2 The standard NMF algorithm [44].

Initialization: $\mathbf{B}^{(0)} > 0$ and $\mathbf{C}^{(0)} > 0$.
for $k = 0, \ldots, \text{maxiter}$ do

$$b^{(k+1)}_{mr} \leftarrow b^{(k)}_{mr}\, \frac{(\mathbf{A}\mathbf{C}^{(k)T})_{mr}}{(\mathbf{B}^{(k)}\mathbf{C}^{(k)}\mathbf{C}^{(k)T})_{mr} + \delta} \quad \forall m, r$$

$$c^{(k+1)}_{rn} \leftarrow c^{(k)}_{rn}\, \frac{(\mathbf{B}^{(k+1)T}\mathbf{A})_{rn}}{(\mathbf{B}^{(k+1)T}\mathbf{B}^{(k+1)}\mathbf{C}^{(k)})_{rn} + \delta} \quad \forall r, n$$

end for
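A direct NumPy transcription of these multiplicative updates might look as follows. The random positive initialization and the defaults maxiter = 100 and delta = 1e-8 mirror the settings reported later in Section IV, but the function itself is only a sketch, not the author's implementation.

```python
import numpy as np

def nmf_multiplicative(A, R, maxiter=100, delta=1e-8, seed=None):
    """Sketch of Algorithm 2: Lee-Seung multiplicative updates for A ~= BC."""
    rng = np.random.default_rng(seed)
    M, N = A.shape
    B = rng.random((M, R)) + delta  # B(0) > 0
    C = rng.random((R, N)) + delta  # C(0) > 0
    for _ in range(maxiter):
        # b_mr <- b_mr * (A C^T)_mr / ((B C C^T)_mr + delta)
        B *= (A @ C.T) / (B @ C @ C.T + delta)
        # c_rn <- c_rn * (B^T A)_rn / ((B^T B C)_rn + delta), using the updated B
        C *= (B.T @ A) / (B.T @ B @ C + delta)
    return B, C
```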

Let $\mathbf{A}$ denote the sample-by-gene matrix containing the gene expression data and $R$ denote the number of cancer classes. By using Algorithm 2 to factorize $\mathbf{A}$ into $\mathbf{B}$ and $\mathbf{C}$, the $n$-th column of $\mathbf{C}$ describes the clustering membership degrees of the $n$-th gene to each cluster, where the more positive the entry, the more likely the gene belongs to the corresponding cluster. For the hard clustering case, the membership is determined by the most positive entry. Further, if we normalize each column of $\mathbf{C}$, i.e., $c_{rn} \leftarrow c_{rn} / \sum_r c_{rn}$, the entries in each row become comparable, and consequently the $r$-th row of $\mathbf{C}$ describes the membership strength of the genes to the $r$-th cluster. Thus, we can sort these rows to find the most "informative genes" for the corresponding clusters. And by choosing some top genes for each cluster, we can select the most informative genes and remove some irrelevant and misleading genes.

This process is the core of our algorithm. However, because the NMF does not have the uniqueness property, the process is repeated so that only genes that consistently come out at the top are selected. Because of this repetition, we introduce a scoring scheme that assigns predefined scores to the top genes at each trial, and genes with the largest cumulative scores are then selected as the most informative genes. Our scoring scheme is based on the MotoGP scoring system, but the scores are assigned only to the top 10 genes in each cluster (the scores for the top 10 genes are: 25, 20, 16, 13, 11, 10, 9, 8, 7, and 6). Algorithm 3 outlines the complete gene selection procedure.

Algorithm 3 NMF based gene selection algorithm.

1) Input: Gene expression data matrix $\mathbf{A} \in \mathbb{R}^{M \times N}_+$ (the rows correspond to the samples and the columns correspond to the genes) and the number of clusters $R$.
2) Normalize each column of $\mathbf{A}$, i.e., $a_{mn} \leftarrow a_{mn} / \sum_m a_{mn}$.
3) for $l = 0, \ldots, L$ do
   a) Compute $\mathbf{C}$ using Algorithm 2.
   b) Normalize each column of $\mathbf{C}$, i.e., $c_{rn} \leftarrow c_{rn} / \sum_r c_{rn}$.
   c) Sort each row of $\mathbf{C}$ in descending order.
   d) Assign scores to the top 10 genes in each row of $\mathbf{C}$.
   e) Accumulate the scores by adding the current scores to the previous ones.
4) end for
5) Select the top $G$ genes according to the cumulative scores.
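The scoring loop of Algorithm 3 can be sketched as below, reusing the nmf_multiplicative function from the previous sketch. The default G = 50 is only a placeholder, since Section IV shows that the number of retained genes is chosen per dataset.

```python
import numpy as np

# MotoGP-style scores for the top 10 genes of each cluster (from the paper).
SCORES = np.array([25, 20, 16, 13, 11, 10, 9, 8, 7, 6])

def select_genes(A, R, L=100, G=50, seed=None):
    """Sketch of Algorithm 3: returns the indices of the G top-scoring genes."""
    A = A / (A.sum(axis=0, keepdims=True) + 1e-12)      # step 2: normalize columns of A
    cumulative = np.zeros(A.shape[1])
    for trial in range(L):                              # step 3: repeat over L trials
        _, C = nmf_multiplicative(A, R, seed=None if seed is None else seed + trial)
        C = C / (C.sum(axis=0, keepdims=True) + 1e-12)  # step 3b: normalize columns of C
        for r in range(R):
            top10 = np.argsort(C[r])[::-1][:10]         # steps 3c-3d: top 10 genes of row r
            cumulative[top10] += SCORES                 # step 3e: accumulate the scores
    return np.argsort(cumulative)[::-1][:G]             # step 5: top G genes overall
```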

IV. EXPERIMENTAL RESULTS

To evaluate the capability of the proposed algorithm in improving the performance of spectral clustering, six publicly available cancer datasets from the work of Souto et al. [45], in which they compiled a comprehensive collection of datasets from many resources (35 datasets in total), were used. Table I summarizes the information of the datasets. As shown, the datasets are quite representative: the number of classes varies from 2 to 10, the number of samples varies from tens to hundreds, and one dataset, Su-2001, contains multiple types of cancer.

There are some parameters that need to be chosen. The first is maxiter in Algorithm 2. Here we set maxiter to 100 as the standard NMF algorithm is known to be fast in minimizing the error only for the first iterations [49].

TABLE I. CANCER DATASETS.

Dataset name         Tissue     #Samples   #Genes   #Classes
Nutt-2003-v2         Brain      28         1070     2
Armstrong-2002-v2    Blood      72         2194     3
Tomlins-2006-v2      Prostate   92         1288     4
Pomeroy-2002-v2      Brain      42         1379     5
Yeoh-2002-v2         Bone       248        2526     6
Su-2001              Multi      174        1571     10


The second is the number of trials $L$ in Algorithm 3. After several attempts, we found that there was not much performance gain between $L = 100$ and $L > 100$; thus we set $L$ to 100. The third is the number of top genes $G$ in step 5 of Algorithm 3. After several attempts, $G$ was set to 20, 1600, 50, 300, 2000, and 200 for Nutt, Armstrong, Tomlins, Pomeroy, Yeoh, and Su respectively. And $\delta$ in Algorithm 2 was set to $10^{-8}$.

To evaluate clustering performance, two metrics were used: Accuracy and Adjusted Rand Index (ARI). Accuracy is the most commonly used metric to measure the performance of clustering algorithms in the medical community. It measures the fraction of the dominant class in a cluster. Accuracy is defined as [23]:

$$Accuracy = \frac{1}{M} \sum_{r=1}^{R} \max_s c_{rs},$$

where $r$ and $s$ denote the $r$-th cluster and the $s$-th reference class respectively, $R$ denotes the number of clusters produced by the clustering algorithm, $M$ denotes the number of samples, and $c_{rs}$ denotes the number of samples in the $r$-th cluster that belong to the $s$-th class. The values of Accuracy are between 0 and 1, with 1 indicating a perfect agreement between the reference classes and the clustering results. In the machine learning community, this metric is also known as Purity [25].
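A small helper computing this metric from two label vectors could look as follows (a sketch; the integer re-encoding of the reference classes is an implementation detail, not part of the paper).

```python
import numpy as np

def accuracy(labels, classes):
    """Purity/Accuracy: fraction of samples in the dominant class of their cluster."""
    labels = np.asarray(labels)
    _, classes = np.unique(classes, return_inverse=True)  # encode classes as 0..S-1
    total = 0
    for r in np.unique(labels):
        # c_rs for cluster r over all classes s; keep the dominant count max_s c_rs
        total += np.bincount(classes[labels == r]).max()
    return total / labels.size
```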

The Adjusted Rand Index (ARI) has values ranging from -1 to 1, with 1 indicating perfect agreement and values near 0 or negative corresponding to clusters found by chance. ARI is defined as [46]–[48]:

$$ARI = \frac{\sum_{rs} \binom{c_{rs}}{2} - \binom{M}{2}^{-1} \sum_r \binom{c_{r*}}{2} \sum_s \binom{c_{*s}}{2}}{\frac{1}{2}\left[\sum_r \binom{c_{r*}}{2} + \sum_s \binom{c_{*s}}{2}\right] - \binom{M}{2}^{-1} \sum_r \binom{c_{r*}}{2} \sum_s \binom{c_{*s}}{2}},$$

where $c_{r*}$ denotes the number of samples in the $r$-th cluster, and $c_{*s}$ denotes the number of samples in the $s$-th class.
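The same quantities can be computed from the contingency table of counts $c_{rs}$; the sketch below assumes Python 3.8+ for math.comb and uses $M$, the number of samples, as the total in the binomial term.

```python
import numpy as np
from math import comb

def adjusted_rand_index(labels, classes):
    """Sketch of the ARI formula above, built from the contingency table c_rs."""
    labels, classes = np.asarray(labels), np.asarray(classes)
    _, l_idx = np.unique(labels, return_inverse=True)
    _, c_idx = np.unique(classes, return_inverse=True)
    cont = np.zeros((l_idx.max() + 1, c_idx.max() + 1), dtype=int)
    np.add.at(cont, (l_idx, c_idx), 1)                        # c_rs counts
    sum_rs = sum(comb(int(c), 2) for c in cont.ravel())       # sum over all (r, s)
    sum_r = sum(comb(int(c), 2) for c in cont.sum(axis=1))    # cluster sizes c_r*
    sum_s = sum(comb(int(c), 2) for c in cont.sum(axis=0))    # class sizes c_*s
    expected = sum_r * sum_s / comb(labels.size, 2)
    max_index = 0.5 * (sum_r + sum_s)
    return (sum_rs - expected) / (max_index - expected)
```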

The experiment procedure is as follows. First, Algorithm 3 was used to select the top genes from the original data matrix $\mathbf{A} \in \mathbb{R}^{M \times N}_+$. Then a new pruned data matrix in $\mathbb{R}^{M \times G}_+$ was formed with the top $G$ genes. This matrix was then input to Algorithm 1 to obtain the clustering indicator matrix. The clustering quality was then measured using Accuracy and ARI. Because of the nonuniqueness of the NMF, this procedure was repeated 100 times to obtain more statistically sound results.
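Putting the earlier sketches together, this procedure could be run as below. run_experiment, A_full, and true_classes are hypothetical names for the loaded expression matrix and its reference labels, not names from the paper.

```python
import numpy as np

def run_experiment(A_full, true_classes, R, G, n_trials=100):
    """Repeat gene selection + spectral clustering; report mean and std of both metrics."""
    acc, ari = [], []
    for _ in range(n_trials):
        genes = select_genes(A_full, R, L=100, G=G)   # Algorithm 3 on the full M x N matrix
        A_pruned = A_full[:, genes]                   # pruned M x G matrix
        labels = spectral_clustering(A_pruned, K=R)   # Algorithm 1 on the pruned matrix
        acc.append(accuracy(labels, true_classes))
        ari.append(adjusted_rand_index(labels, true_classes))
    return (np.mean(acc), np.std(acc)), (np.mean(ari), np.std(ari))
```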

Fig. 1 shows the performance of spectral clustering with and without the gene selection procedure. As shown, spectral clustering performed quite well on three datasets (Armstrong, Pomeroy, and Su), and produced rather unsatisfactory results on the other three (Nutt, Tomlins, and Yeoh). Gene selection improved the clustering performance of spectral clustering in all cases, with larger improvements observed in the cases where the clustering results were rather unsatisfactory. This implies that there are not many irrelevant and misleading genes in the first set of cases, so the results with and without gene selection are comparable; on the other hand, there are some such genes in the second set of cases, and these were removed by the gene selection process.

Tables II and III give the detailed experimental results for the 100 trials, where the values are displayed in the format average value ± standard deviation.

[Fig. 1 here: grouped bar charts over the Nutt, Armstrong, Tomlins, Pomeroy, Yeoh, and Su datasets, with and without gene selection; panel (a) Accuracy, panel (b) ARI.]

Fig. 1. Performance of the spectral clustering with and without gene selection measured by Accuracy and ARI (average values over 100 runs).

TABLE II. ACCURACY AND ARI WITHOUT GENE SELECTION.

Dataset name         Accuracy        ARI
Nutt-2003-v2         0.571 ± 0.000   0.002 ± 0.000
Armstrong-2002-v2    0.861 ± 0.000   0.624 ± 0.000
Tomlins-2006-v2      0.587 ± 0.024   0.227 ± 0.019
Pomeroy-2002-v2      0.759 ± 0.023   0.503 ± 0.035
Yeoh-2002-v2         0.675 ± 0.030   0.347 ± 0.048
Su-2001              0.744 ± 0.030   0.523 ± 0.050

TABLE III. ACCURACY AND ARI WITH GENE SELECTION.

Dataset name         Accuracy        ARI
Nutt-2003-v2         0.664 ± 0.036   0.095 ± 0.033
Armstrong-2002-v2    0.878 ± 0.014   0.667 ± 0.037
Tomlins-2006-v2      0.679 ± 0.031   0.343 ± 0.045
Pomeroy-2002-v2      0.767 ± 0.018   0.540 ± 0.032
Yeoh-2002-v2         0.730 ± 0.027   0.408 ± 0.030
Su-2001              0.746 ± 0.033   0.569 ± 0.044


V. CONCLUSION

We have presented a gene selection algorithm based on the NMF to select the most informative genes from a microarray gene expression dataset. The experimental results showed that the proposed algorithm improved the performance of spectral clustering, with more visible improvements observed in the cases where spectral clustering produced rather unsatisfactory results.

ACKNOWLEDGMENT

The author would like to thank the reviewers for useful comments. This research was supported by the Ministry of Higher Education of Malaysia and Universiti Teknologi Malaysia under Exploratory Research Grant Scheme R.J130000.7828.4L095.

REFERENCES

[1] T.R. Golub, D.K. Slonim, P. Tamayo, C. Huard, M. Gaasenbeek, J.P. Mesirov, H. Coller, M.L. Loh, J.R. Downing, M.A. Caligiuri, C.D. Bloomfield, and E.S. Lander, “Molecular classification of cancer: class discovery and class prediction by gene expression monitoring,” Science, Vol. 286(5439), pp. 531-537, 1999.

[2] S.L. Pomeroy, P. Tamayo, M. Gaasenbeek, L.M. Sturla, M. Angelo, M.E. McLaughlin, J.Y.H. Kim, L.C. Goumnerova, P.M. Black, C. Lau, J.C. Allen, D. Zagzag, J.M. Olson, T. Curran, C. Wetmore, J.A. Biegel, T. Poggio, S. Mukherjee, R. Rifkin, A. Califano, G. Stolovitzky, D.N. Louis, J.P. Mesirov, E.S. Lander, and T.R. Golub, “Prediction of central nervous system embryonal tumour outcome based on gene expression,” Nature, Vol. 415(6870), pp. 436-442, 2002.

[3] I. Guyon, J. Weston, S. Barnhill, and V. Vapnik, “Gene Selection for Cancer Classification using Support Vector Machines,” Machine Learning, Vol. 46(1-3), pp. 389-422, 2002.

[4] J.P. Brunet, P. Tamayo, T.R. Golub, and J.P. Mesirov, “Metagenes and molecular pattern discovery using matrix factorization,” Proc. Natl Acad. Sci. USA, Vol. 101(12), pp. 4164-4169, 2003.

[5] C.H. Zheng, D.S. Huang, D. Zhang, and X.Z. Kong, “Tumor Clustering Using Nonnegative Matrix Factorization With Gene Selection,” IEEE Transactions on Information Technology in Biomedicine, Vol. 13(4), pp. 599-607, 2009.

[6] N. Yuvaraj and P. Vivekanandan, “An efficient SVM based tumor classification with symmetry Non-negative Matrix Factorization using gene expression data,” Proc. Int’l Conf. on Information Communication and Embedded Systems, pp. 761-768, 2013.

[7] M. Pirooznia, J.Y. Yang, M.Q. Yang, and Y. Deng, “A comparative study of different machine learning methods on microarray gene expression data,” BMC Genomics, Vol. 9(Suppl 1), pp. S13, 2008.

[8] X. Liu, A. Krishnan, and A. Mondry, “An Entropy-based gene selection method for cancer classification using microarray data,” BMC Bioinformatics, Vol. 6, pp. 76, 2005.

[9] L. Wang, F. Chu, and W. Xie, “Accurate Cancer Classification Using Expressions of Very Few Genes,” IEEE/ACM Transactions on Computational Biology and Bioinformatics, Vol. 4(1), pp. 40-53, 2007.

[10] L.Y. Chuang, H.W. Chang, C.J. Tu, and C.H. Yang, “Improved binary PSO for feature selection using gene expression data,” Computational Biology and Chemistry, Vol. 32(1), pp. 29-37, 2008.

[11] P. Mitra and D.D. Majumder, “Feature Selection and Gene Clustering from Gene Expression Data,” Proc. the 17th Int’l Conf. on Pattern Recognition, pp. 343-346, 2004.

[12] T.S. Furey, N. Cristianini, N. Duffy, D.W. Bednarski, M. Schummer, and D. Haussler, “Support vector machine classification and validation of cancer tissue samples using microarray expression data,” Bioinformatics, Vol. 16(10), pp. 906-914, 2000.

[13] S. Moon and H. Qi, “Hybrid Dimensionality Reduction Method Based on Support Vector Machine and Independent Component Analysis,” IEEE Transactions on Neural Networks and Learning Systems, Vol. 23(5), pp. 749-761, 2012.

[14] Y. Lee and C.K. Lee, “Classification of multiple cancer types by multicategory support vector machines using gene expression data,” Bioinformatics, Vol. 19(9), pp. 1132-1139, 2003.

[15] X. Zhang, X. Lu, Q. Shi, X. Xu, H.E. Leung, L.N. Harris, J.D. Iglehart, A. Miron, J.S. Liu, and W.H. Wong, “Recursive SVM feature selection and sample classification for mass-spectrometry and microarray data,” BMC Bioinformatics, Vol. 7(197), 2006.

[16] Y. Lu and J. Han, “Cancer classification using gene expression data,” Information Systems, Vol. 28(4), pp. 243-268, 2003.

[17] H.H. Zhang, J. Ahn, X. Lin, and C. Park, “Gene selection using support vector machines with non-convex penalty,” Bioinformatics, Vol. 22(1), pp. 88-95, 2006.

[18] F. Shahnaz, M.W. Berry, V. Pauca, and R.J. Plemmons, “Document clustering using nonnegative matrix factorization,” Information Processing & Management, Vol. 42(2), pp. 373-386, 2006.

[19] W. Xu, X. Liu, and Y. Gong, “Document clustering based on non-negative matrix factorization,” Proc. ACM SIGIR, pp. 267-273, 2003.

[20] M. Berry, M. Brown, A. Langville, P. Pauca, and R.J. Plemmons, “Algorithms and applications for approximate nonnegative matrix factorization,” Computational Statistics and Data Analysis, Vol. 52(1), pp. 155-173, 2007.

[21] J. Yoo and S. Choi, “Orthogonal nonnegative matrix factorization: Multiplicative updates on Stiefel manifolds,” Proc. 9th Int’l Conf. Intelligent Data Engineering and Automated Learning, pp. 140-147, 2008.

[22] J. Yoo and S. Choi, “Orthogonal nonnegative matrix tri-factorization for co-clustering: Multiplicative updates on Stiefel manifolds,” Information Processing & Management, Vol. 46(5), pp. 559-570, 2010.

[23] Y. Gao and G. Church, “Improving Molecular cancer class discovery through sparse non-negative matrix factorization,” Bioinformatics, Vol. 21(21), pp. 3970-3975, 2005.

[24] D. Dueck, Q.D. Morris, and B.J. Frey, “Multi-way clustering of microarray data using probabilistic sparse matrix factorization,” Bioinformatics, Vol. 21(1), pp. 145-151, 2005.

[25] H. Kim and H. Park, “Sparse non-negative matrix factorizations via alternating non-negativity constrained least squares for microarray data analysis,” Bioinformatics, Vol. 23(12), pp. 1495-1502, 2007.

[26] K. Devarajan, “Nonnegative Matrix Factorization: An Analytical and Interpretive Tool in Computational Biology,” PLoS Computational Biology, Vol. 4(7), pp. e1000029, 2008.

[27] H. Kim and H. Park, “Nonnegative matrix factorization based on alternating nonnegativity constrained least squares and active set method,” SIAM J. Matrix Anal. Appl., Vol. 30(2), pp. 713-730, 2008.

[28] P. Carmona-Saez, R.D. Pascual-Marqui, F. Tirado, J.M. Carazo, and A. Pascual-Montano, “Biclustering of gene expression data by non-smooth non-negative matrix factorization,” BMC Bioinformatics, Vol. 7(78), 2006.

[29] K. Inamura, T. Fujiwara, Y. Hoshida, T. Isagawa, M.H. Jones, C. Virtanen, M. Shimane, Y. Satoh, S. Okumura, K. Nakagawa, E. Tsuchiya, S. Ishikawa, H. Aburatani, H. Nomura, and Y. Ishikawa, “Two subclasses of lung squamous cell carcinoma with different gene expression profiles and prognosis identified by hierarchical clustering and non-negative matrix factorization,” Oncogene, Vol. 24, pp. 7105-7113, 2005.

[30] P. Fogel, S.S. Young, D.M. Hawkins, and N. Ledirac, “Inferential, robust non-negative matrix factorization analysis of microarray data,” Bioinformatics, Vol. 23(1), pp. 44-49, 2007.

[31] G. Wang, A.V. Kossenkov, and M.F. Ochs, “LS-NMF: A modified non-negative matrix factorization algorithm utilizing uncertainty estimates,” BMC Bioinformatics, Vol. 7(175), 2006.

[32] J.J.Y. Wang, X. Wang, and X. Gao, “Non-negative matrix factorization by maximizing correntropy for cancer clustering,” BMC Bioinformatics, Vol. 14(107), 2013.

[33] P.O. Hoyer, “Non-negative matrix factorization with sparseness constraints,” The Journal of Machine Learning Research, Vol. 5, pp. 1457-1469, 2004.

[34] S.Z. Li, X.W. Hou, H.J. Zhang, and Q.S. Cheng, “Learning spatially localized, parts-based representation,” Proc. IEEE Comp. Soc. Conf. on Computer Vision and Pattern Recognition, pp. 207-212, 2001.

[35] D. Wang and H. Lu, “On-line learning parts-based representation via incremental orthogonal projective non-negative matrix factorization,” Signal Processing, Vol. 93(6), pp. 1608-1623, 2013.


[36] A. Pascual-Montano, J.M. Carazo, K. Kochi, D. Lehman, and R.D. Pascual-Marqui, “Nonsmooth nonnegative matrix factorization,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 28(3), pp. 403-415, 2006.

[37] N. Gillis and F. Glineur, “A multilevel approach for nonnegative matrix factorization,” Journal of Computational and Applied Mathematics, Vol. 236(7), pp. 1708-1723, 2012.

[38] H. Kim and H. Park, “Nonnegative matrix factorization based on alternating nonnegativity constrained least squares and active set method,” SIAM J. Matrix Anal. & Appl., Vol. 30(2), pp. 713-730, 2008.

[39] W. Kim, B. Chen, J. Kim, Y. Pan, and H. Park, “Sparse nonnegative matrix factorization for protein sequence motif discovery,” Expert Systems with Applications, Vol. 38(10), pp. 13198-13207, 2011.

[40] J. Shi and J. Malik, “Normalized cuts and image segmentation,” IEEE Trans. Pattern Anal. Mach. Intell., Vol. 22(8), pp. 888-905, 2000.

[41] A. Ng, M.I. Jordan, and Y. Weiss, “On spectral clustering: analysis and an algorithm,” Proc. Advances in Neural Information Processing Systems, pp. 849-856, 2002.

[42] S.X. Yu and J. Shi, “Multiclass spectral clustering,” Proc. IEEE Int’l Conf. on Computer Vision, pp. 313-319, 2003.

[43] U. Luxburg, “A tutorial on spectral clustering,” Statistics and Computing, Vol. 17, pp. 395-416, 2007.

[44] D. Lee and H. Seung, “Learning the parts of objects by non-negative matrix factorization,” Nature, Vol. 401(6755), pp. 788-791, 1999.

[45] M.C.P. Souto, I.G. Costa, D.S.A. Araujo, T.B. Ludermir, and A. Schliep, “Clustering cancer gene expression data: a comparative study,” BMC Bioinformatics, Vol. 9(497), 2008.

[46] W.M. Rand, “Objective criteria for the evaluation of clustering methods,” Journal of the American Statistical Association, Vol. 66(336), pp. 846-850, 1971.

[47] L. Hubert and P. Arabie, “Comparing partitions,” Journal of Classification, Vol. 2(1), pp. 193-218, 1985.

[48] N.X. Vinh, J. Epps, and J. Bailey, “Information theoretic measures for clustering comparison: Is a correction for chance necessary?,” Proc. 26th Annual Int’l Conf. on Machine Learning, pp. 1073-1080, 2009.

[49] C.J. Lin, “On the convergence of multiplicative update algorithms for nonnegative matrix factorization,” IEEE Transactions on Neural Networks, Vol. 18(6), pp. 1589-1596, 2007.
