Iterative K-Means Algorithm Based on Fisher Discriminant
University of Joensuu, Department of Computer Science, Joensuu, Finland
Mantao Xu
to be presented at: Information Fusion 2004
Problem Formulation

Given N data samples X = {x1, x2, …, xN}, construct the codebook C = {c1, c2, …, cM} such that the mean-square-error

  MSE = (1/N) · Σ_{i=1}^{N} || x_i − c_{p(i)} ||²

is minimized. The class membership p(i) is

  p(i) = arg min_j || x_i − c_j ||²
Traditional K-Means Algorithm

Iterations of two steps:
- assignment of a class label to each data vector
- computation of each cluster centroid by averaging all data vectors assigned to it

Characteristics:
- randomized initial partition or codebook
- convergence to a local minimum
- use of the L2, L1 and L∞ distances
- fast and easy implementation

Extensions: kernel K-Means algorithm, EM algorithm, K-median algorithm
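The two-step iteration above can be sketched in a few lines of numpy; `kmeans` here is a minimal illustration (function and parameter names are mine), not the implementation used in the paper.

```python
import numpy as np

def kmeans(X, C, max_iter=100):
    """Plain two-step K-Means: alternate assignment and centroid update.

    X: (N, d) data matrix, C: (M, d) initial centroids.
    Returns final centroids and class labels p(i).
    """
    C = C.copy()
    for _ in range(max_iter):
        # Assignment step: label each vector with its nearest centroid (L2).
        d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
        p = d2.argmin(axis=1)
        # Update step: centroid = mean of the vectors assigned to it.
        newC = np.array([X[p == j].mean(axis=0) if np.any(p == j) else C[j]
                         for j in range(len(C))])
        if np.allclose(newC, C):   # converged to a local minimum
            break
        C = newC
    return C, p
```

As the slide notes, the result depends on the randomized initialization and only a local minimum of the MSE is guaranteed.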
Motivation

Investigation of a clustering algorithm that:
- iteratively performs the regular K-Means algorithm in searching for a solution close to the global optimum
- estimates an initial partition close to the optimal solution at each iteration
- applies a dissimilarity function based on the current partition instead of the L2 distance
Selecting the K-Means Initial Partition

Selection of the initial partition based on Fisher discriminant and dynamic programming:
- A suboptimal partition is estimated by dynamic programming in a one-dimensional subspace of the feature space.
- The one-dimensional subspace is constructed through linear multi-class Fisher discriminant analysis.
- The output class of K-Means in each iteration is selected as the input class of the discriminant analysis for the next iteration.
The Multi-Class Fisher Discriminant Analysis

The separation of the input classes in the discriminant direction w can be measured by the F-ratio validity index F(w). The multi-class linear Fisher discriminant w is the minimizer of the F-ratio validity index:

  S_W = Σ_{i=1}^{N} (x_i − c_{p(i)}) (x_i − c_{p(i)})^T      (within-class scatter)

  S_B = Σ_{j=1}^{M} n_j (c_j − x̄) (c_j − x̄)^T               (between-class scatter)

  F(w) = k · (w^T S_W w) / (w^T S_B w)

  w = arg min_w F(w)

where n_j is the number of vectors in cluster j and x̄ is the mean of all data samples.
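Minimizing F(w) is equivalent to maximizing the between/within Rayleigh quotient, which can be solved as a generalized eigenproblem S_B w = λ S_W w. The sketch below does this with scipy; the helper name and the small ridge term on S_W are my additions, not the paper's.

```python
import numpy as np
from scipy.linalg import eigh

def fisher_direction(X, p, C):
    """Direction w minimizing F(w) = k * (w' S_W w) / (w' S_B w).

    X: (N, d) data, p: class labels, C: (M, d) centroids.
    Returns the leading generalized eigenvector of S_B w = lambda * S_W w.
    A small ridge keeps S_W positive definite.
    """
    N, d = X.shape
    xbar = X.mean(axis=0)
    S_W = 1e-8 * np.eye(d)          # within-class scatter (+ ridge)
    S_B = np.zeros((d, d))          # between-class scatter
    for j in range(len(C)):
        D = X[p == j] - C[j]
        S_W += D.T @ D
        m = C[j] - xbar
        S_B += (p == j).sum() * np.outer(m, m)
    vals, vecs = eigh(S_B, S_W)     # generalized symmetric eigenproblem
    return vecs[:, -1]              # eigenvector of the largest eigenvalue
```

For two clusters separated along one axis and noisy along another, the returned w aligns with the separating axis.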
The Dynamic Programming in the Discriminant Direction

The optimal convex partition Q_k = {(q_{j−1}, q_j] | j = 1, …, k} in the discriminant direction w can be estimated by dynamic programming in terms of the MSE distortion on the discriminant subspace:

  E(Q_k[1, n]) = min_{k ≤ t ≤ n} { E(Q_{k−1}[1, t−1]) + mse_w([t, n]) }    (1)

  mse_w([a, b]) = Σ_{s=a}^{b} | x_s^w − x̄^w_[a,b] |²,  with  x̄^w_[a,b] = (1/(b − a + 1)) · Σ_{s=a}^{b} x_s^w

or in terms of the MSE distortion on the original feature space:

  E(Q̂_k[1, n]) = min_{k ≤ t ≤ n} { E(Q̂_{k−1}[1, t−1]) + mse([t, n]) }    (2)

  mse([a, b]) = Σ_{s=a}^{b} || x_s − x̄_[a,b] ||²,  with  x̄_[a,b] = (1/(b − a + 1)) · Σ_{s=a}^{b} x_s

(x_s^w denotes the projection of x_s onto the discriminant direction w.)
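The recurrence admits a standard O(k·n²) dynamic program once prefix sums make each interval cost O(1). Below is a sketch on sorted scalar projections (the data are projected and sorted before this step); the function and variable names are mine.

```python
import numpy as np

def optimal_1d_partition(y, k):
    """Optimal convex partition of sorted 1-D values y into k intervals,
    minimizing total MSE distortion via the recurrence above.
    Returns one segment label per point; y must already be sorted.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Prefix sums give each interval cost in O(1).
    s1 = np.concatenate([[0.0], np.cumsum(y)])
    s2 = np.concatenate([[0.0], np.cumsum(y * y)])

    def mse(a, b):  # SSE of points y[a:b] around their mean
        return (s2[b] - s2[a]) - (s1[b] - s1[a]) ** 2 / (b - a)

    # E[j, t]: best distortion of the first t points split into j intervals.
    E = np.full((k + 1, n + 1), np.inf)
    E[0, 0] = 0.0
    cut = np.zeros((k + 1, n + 1), dtype=int)
    for j in range(1, k + 1):
        for t in range(j, n + 1):
            for s in range(j - 1, t):
                c = E[j - 1, s] + mse(s, t)
                if c < E[j, t]:
                    E[j, t], cut[j, t] = c, s
    # Backtrack the optimal cut points into labels.
    labels = np.empty(n, dtype=int)
    t = n
    for j in range(k, 0, -1):
        s = cut[j, t]
        labels[s:t] = j - 1
        t = s
    return labels
```

Because the points are sorted, the optimal clusters are convex intervals, which is what makes the dynamic program exact in one dimension.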
Application of Delta-MSE Dissimilarity
[Figure: vectors x1…x4 and y1…y3 split between clusters G1 and G2; moving x4 gives Delta-MSE(x4, G1) = removal variance and Delta-MSE(x4, G2) = addition variance.]
Moving a vector x from cluster i to cluster j changes the MSE function [10] by:

  v(x) = (n_j / (n_j + 1)) · || x − c_j ||² − (n_i / (n_i − 1)) · || x − c_i ||²

This yields the Delta-MSE dissimilarity between a vector x_i and a centroid c_j:

  Delta-MSE(x_i, c_j) = w_ij · || x_i − c_j ||²

  w_ij = n_j / (n_j + 1)   if p(i) ≠ j   (addition variance)
  w_ij = n_j / (n_j − 1)   if p(i) = j   (removal variance)
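The weighting rule above translates directly into code; a small sketch (the function signature is mine):

```python
import numpy as np

def delta_mse(x, c_j, n_j, same_cluster):
    """Delta-MSE dissimilarity between vector x and centroid c_j of a
    cluster of size n_j. The weight is the addition variance n_j/(n_j+1)
    when x lies outside cluster j, and the removal variance n_j/(n_j-1)
    when x already belongs to cluster j.
    """
    w = n_j / (n_j - 1.0) if same_cluster else n_j / (n_j + 1.0)
    return w * float(np.sum((np.asarray(x) - np.asarray(c_j)) ** 2))
```

The asymmetric weights account for how the centroid itself shifts when a vector is added to or removed from a cluster.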
Pseudocode of the Iterative K-Means Algorithm

Function SubOptimalKMeans(X, k, m)
  input:  dataset X, number of clusters k, number of iterations m
  output: class labels P_OPT
  C ← randomly chosen cluster centroids from X
  P ← K-Means(X, C, k)
  f_min ← F-ratio of P
  P_OPT ← P
  for j = 1 to m
    w ← Fisher discriminant solved from class labels P
    X_w ← all data X projected onto discriminant direction w, sorted
    P_w ← clustering of X_w solved optimally by dynamic programming
    (C, P) ← K-Means(X, P_w, k)
    f_ratio ← F-ratio of P
    if f_ratio < f_min then
      P_OPT ← P
      f_min ← f_ratio
    end if
  end for
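For illustration, the pseudocode can be transliterated into a runnable sketch. This is my own assembly under stated assumptions (trace-form F-ratio k·SSW/SSB, a small ridge on S_W, helper names mine), not the authors' implementation:

```python
import numpy as np
from scipy.linalg import eigh

def _kmeans(X, p, k, iters=100):
    """Regular K-Means started from an initial partition (label vector) p."""
    for _ in range(iters):
        C = np.array([X[p == j].mean(0) if np.any(p == j) else X[j]
                      for j in range(k)])
        q = ((X[:, None] - C[None]) ** 2).sum(-1).argmin(1)
        if np.array_equal(q, p):
            break
        p = q
    return C, p

def _f_ratio(X, p, k):
    """F-ratio validity index; assumed here in trace form k * SSW / SSB."""
    xbar = X.mean(0)
    ssw = ssb = 0.0
    for j in range(k):
        Xj = X[p == j]
        if len(Xj) == 0:
            continue
        ssw += ((Xj - Xj.mean(0)) ** 2).sum()
        ssb += len(Xj) * ((Xj.mean(0) - xbar) ** 2).sum()
    return k * ssw / max(ssb, 1e-12)

def _fisher_w(X, p, C):
    """Discriminant direction: leading eigenvector of S_B w = lambda S_W w."""
    d = X.shape[1]
    xbar = X.mean(0)
    S_W, S_B = 1e-8 * np.eye(d), np.zeros((d, d))
    for j in range(len(C)):
        D = X[p == j] - C[j]
        S_W += D.T @ D
        S_B += (p == j).sum() * np.outer(C[j] - xbar, C[j] - xbar)
    return eigh(S_B, S_W)[1][:, -1]

def _dp_partition(y, k):
    """Optimal k-interval partition of sorted 1-D values (criterion (1))."""
    n = len(y)
    s1 = np.concatenate([[0.0], np.cumsum(y)])
    s2 = np.concatenate([[0.0], np.cumsum(y * y)])
    mse = lambda a, b: (s2[b] - s2[a]) - (s1[b] - s1[a]) ** 2 / (b - a)
    E = np.full((k + 1, n + 1), np.inf)
    E[0, 0] = 0.0
    cut = np.zeros((k + 1, n + 1), dtype=int)
    for j in range(1, k + 1):
        for t in range(j, n + 1):
            for s in range(j - 1, t):
                c = E[j - 1, s] + mse(s, t)
                if c < E[j, t]:
                    E[j, t], cut[j, t] = c, s
    lab, t = np.empty(n, dtype=int), n
    for j in range(k, 0, -1):
        s = cut[j, t]
        lab[s:t] = j - 1
        t = s
    return lab

def sub_optimal_kmeans(X, k, m, seed=0):
    """SubOptimalKMeans(X, k, m): iterate Fisher projection + DP + K-Means."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), size=k, replace=False)]   # random initial codebook
    p = ((X[:, None] - C[None]) ** 2).sum(-1).argmin(1)
    C, p = _kmeans(X, p, k)
    p_opt, f_min = p, _f_ratio(X, p, k)
    for _ in range(m):
        w = _fisher_w(X, p, C)            # discriminant direction from P
        proj = X @ w                      # project all data onto w ...
        order = np.argsort(proj)          # ... and sort
        pw = np.empty(len(X), dtype=int)
        pw[order] = _dp_partition(proj[order], k)
        C, p = _kmeans(X, pw, k)          # refine with regular K-Means
        f = _f_ratio(X, p, k)
        if f < f_min:                     # keep the best partition seen
            p_opt, f_min = p, f
    return p_opt
```

Each iteration re-derives the discriminant direction from the current labels, so a poor random start can still be recovered by the DP step along the new projection.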
Four K-Means algorithms were compared in the experimental tests:
- KD-tree based K-Means: selects its initial cluster centroids from the k bucket centers of a kd-tree structure that is built recursively by principal component analysis
- PCA based K-Means: an intuitive approach that estimates a sub-optimal initial partition by applying the dynamic programming in the principal component direction
- LFD-I: the proposed iterative K-Means algorithm based on dynamic programming criterion (1)
- LFD-II: the proposed iterative K-Means algorithm based on dynamic programming criterion (2)
Comparisons of the four K-Means algorithms
Table 1: Performance comparisons (in F-ratio validity indices) of the four K-Means algorithms on the practical numbers of clusters
Datasets k KD-Tree PCA LFD-I LFD-II
boston 9 4.083 3.526 3.515 3.515
glass 6 5.931 3.974 3.966 3.984
heart 5 5.436 5.420 5.410 5.410
image 7 3.556 2.622 2.615 2.499
thyroid 3 2.265 2.265 2.264 2.264
F-ratios produced by the four K-Means clusterings on dataset glass
[Chart "Glass": F-ratio (approx. 2.8-5.8) vs. number of clusters (3-24) for KD-Tree, PCA, LFD-I and LFD-II.]
F-ratios produced by the four K-Means clusterings on dataset heart
[Chart "Heart": F-ratio (approx. 5-8) vs. number of clusters (3-24) for KD-Tree, PCA, LFD-I and LFD-II.]
F-ratios produced by the four K-Means clusterings on dataset image
[Chart "Image": F-ratio (approx. 2-5) vs. number of clusters (3-24) for KD-Tree, PCA, LFD-I and LFD-II.]
Conclusions

- A new approach to the k-center clustering problem that iteratively incorporates Fisher discriminant analysis and the dynamic programming technique.
- The proposed approach in general outperforms the two other algorithms: the PCA based K-Means algorithm and the kd-tree based K-Means algorithm.
- The classification performance gain of the proposed approach over the other two increases with the number of clusters.
Further Work

- Solving the k-center clustering problem by iteratively incorporating the kernel Fisher discriminant analysis and the dynamic programming technique.
- Solving the k-center clustering problem by incorporating the kernel PCA technique and the dynamic programming technique.