Iterative K-Means Algorithm Based on Fisher Discriminant
University of Joensuu, Department of Computer Science, Joensuu, Finland
Mantao Xu
to be presented at: Information Fusion 2004
Problem Formulation

Given N data samples X = {x1, x2, …, xN}, construct the codebook C = {c1, c2, …, cM} such that the mean-square-error

  MSE = (1/N) · Σ_{i=1}^{N} || x_i − c_{p(i)} ||²

is minimized. The class membership p(i) is

  p(i) = arg min_j || x_i − c_j ||²
Traditional K-Means Algorithm

Iterations of two steps:
- assignment of a class label to each data vector
- computation of each cluster centroid by averaging all data vectors assigned to it

Characteristics:
- randomized initial partition or codebook
- convergence to a local minimum
- use of the L2, L1 and L∞ distances
- fast and easy implementation

Extensions: kernel K-Means algorithm, EM algorithm, K-median algorithm
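The two-step iteration above can be sketched in a few lines of numpy; `kmeans` here is a minimal illustration (function and parameter names are mine), not the implementation used in the paper.

```python
import numpy as np

def kmeans(X, C, max_iter=100):
    """Plain two-step K-Means: alternate assignment and centroid update.

    X: (N, d) data matrix, C: (M, d) initial centroids.
    Returns final centroids and class labels p(i).
    """
    C = C.copy()
    for _ in range(max_iter):
        # Assignment step: label each vector with its nearest centroid (L2).
        d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
        p = d2.argmin(axis=1)
        # Update step: centroid = mean of the vectors assigned to it.
        newC = np.array([X[p == j].mean(axis=0) if np.any(p == j) else C[j]
                         for j in range(len(C))])
        if np.allclose(newC, C):   # converged to a local minimum
            break
        C = newC
    return C, p
```

As the slide notes, the result depends on the randomized initialization and only a local minimum of the MSE is guaranteed.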
Motivation

Investigation of a clustering algorithm that:
- iteratively performs the regular K-Means algorithm in searching for a solution close to the global optimum
- estimates an initial partition close to the optimal solution at each iteration
- applies a dissimilarity function based on the current partition instead of the L2 distance
Selecting the K-Means Initial Partition

Selection of the initial partition based on Fisher discriminant and dynamic programming:
- A suboptimal partition is estimated by dynamic programming in a one-dimensional subspace of the feature space.
- The one-dimensional subspace is constructed through linear multi-class Fisher discriminant analysis.
- The output class of K-Means in each iteration is selected as the input class of the discriminant analysis for the next iteration.
The Multi-Class Fisher Discriminant Analysis

The separation of the input classes in the discriminant direction w can be measured by the F-ratio validity index F(w). The multi-class linear Fisher discriminant w is the minimizer of the F-ratio validity index:

  S_W = Σ_{i=1}^{N} (x_i − c_{p(i)}) (x_i − c_{p(i)})^T      (within-class scatter)

  S_B = Σ_{j=1}^{M} n_j (c_j − x̄) (c_j − x̄)^T               (between-class scatter)

  F(w) = k · (w^T S_W w) / (w^T S_B w)

  w = arg min_w F(w)

where n_j is the number of vectors in cluster j and x̄ is the mean of all data samples.
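Minimizing F(w) is equivalent to maximizing the between/within Rayleigh quotient, which can be solved as a generalized eigenproblem S_B w = λ S_W w. The sketch below does this with scipy; the helper name and the small ridge term on S_W are my additions, not the paper's.

```python
import numpy as np
from scipy.linalg import eigh

def fisher_direction(X, p, C):
    """Direction w minimizing F(w) = k * (w' S_W w) / (w' S_B w).

    X: (N, d) data, p: class labels, C: (M, d) centroids.
    Returns the leading generalized eigenvector of S_B w = lambda * S_W w.
    A small ridge keeps S_W positive definite.
    """
    N, d = X.shape
    xbar = X.mean(axis=0)
    S_W = 1e-8 * np.eye(d)          # within-class scatter (+ ridge)
    S_B = np.zeros((d, d))          # between-class scatter
    for j in range(len(C)):
        D = X[p == j] - C[j]
        S_W += D.T @ D
        m = C[j] - xbar
        S_B += (p == j).sum() * np.outer(m, m)
    vals, vecs = eigh(S_B, S_W)     # generalized symmetric eigenproblem
    return vecs[:, -1]              # eigenvector of the largest eigenvalue
```

For two clusters separated along one axis and noisy along another, the returned w aligns with the separating axis.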
The Dynamic Programming in the Discriminant Direction

The optimal convex partition Q_k = {(q_{j−1}, q_j] | j = 1, …, k} in the discriminant direction w can be estimated by dynamic programming in terms of the MSE distortion on the discriminant subspace:

  E(Q_k[1, n]) = min_{k ≤ t ≤ n} { E(Q_{k−1}[1, t−1]) + mse_w([t, n]) }    (1)

  mse_w([a, b]) = Σ_{s=a}^{b} | x_s^w − x̄^w_[a,b] |²,  with  x̄^w_[a,b] = (1/(b − a + 1)) · Σ_{s=a}^{b} x_s^w

or in terms of the MSE distortion on the original feature space:

  E(Q̂_k[1, n]) = min_{k ≤ t ≤ n} { E(Q̂_{k−1}[1, t−1]) + mse([t, n]) }    (2)

  mse([a, b]) = Σ_{s=a}^{b} || x_s − x̄_[a,b] ||²,  with  x̄_[a,b] = (1/(b − a + 1)) · Σ_{s=a}^{b} x_s

(x_s^w denotes the projection of x_s onto the discriminant direction w.)
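The recurrence admits a standard O(k·n²) dynamic program once prefix sums make each interval cost O(1). Below is a sketch on sorted scalar projections (the data are projected and sorted before this step); the function and variable names are mine.

```python
import numpy as np

def optimal_1d_partition(y, k):
    """Optimal convex partition of sorted 1-D values y into k intervals,
    minimizing total MSE distortion via the recurrence above.
    Returns one segment label per point; y must already be sorted.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Prefix sums give each interval cost in O(1).
    s1 = np.concatenate([[0.0], np.cumsum(y)])
    s2 = np.concatenate([[0.0], np.cumsum(y * y)])

    def mse(a, b):  # SSE of points y[a:b] around their mean
        return (s2[b] - s2[a]) - (s1[b] - s1[a]) ** 2 / (b - a)

    # E[j, t]: best distortion of the first t points split into j intervals.
    E = np.full((k + 1, n + 1), np.inf)
    E[0, 0] = 0.0
    cut = np.zeros((k + 1, n + 1), dtype=int)
    for j in range(1, k + 1):
        for t in range(j, n + 1):
            for s in range(j - 1, t):
                c = E[j - 1, s] + mse(s, t)
                if c < E[j, t]:
                    E[j, t], cut[j, t] = c, s
    # Backtrack the optimal cut points into labels.
    labels = np.empty(n, dtype=int)
    t = n
    for j in range(k, 0, -1):
        s = cut[j, t]
        labels[s:t] = j - 1
        t = s
    return labels
```

Because the points are sorted, the optimal clusters are convex intervals, which is what makes the dynamic program exact in one dimension.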
Application of Delta-MSE Dissimilarity
[Figure: vectors x1…x4 and y1…y3 split between clusters G1 and G2; moving x4 gives Delta-MSE(x4, G1) = removal variance and Delta-MSE(x4, G2) = addition variance.]
Moving a vector x from cluster i to cluster j changes the MSE function [10] by:

  v(x) = (n_j / (n_j + 1)) · || x − c_j ||² − (n_i / (n_i − 1)) · || x − c_i ||²

This yields the Delta-MSE dissimilarity between a vector x_i and a centroid c_j:

  Delta-MSE(x_i, c_j) = w_ij · || x_i − c_j ||²

  w_ij = n_j / (n_j + 1)   if p(i) ≠ j   (addition variance)
  w_ij = n_j / (n_j − 1)   if p(i) = j   (removal variance)
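The weighting rule above translates directly into code; a small sketch (the function signature is mine):

```python
import numpy as np

def delta_mse(x, c_j, n_j, same_cluster):
    """Delta-MSE dissimilarity between vector x and centroid c_j of a
    cluster of size n_j. The weight is the addition variance n_j/(n_j+1)
    when x lies outside cluster j, and the removal variance n_j/(n_j-1)
    when x already belongs to cluster j.
    """
    w = n_j / (n_j - 1.0) if same_cluster else n_j / (n_j + 1.0)
    return w * float(np.sum((np.asarray(x) - np.asarray(c_j)) ** 2))
```

The asymmetric weights account for how the centroid itself shifts when a vector is added to or removed from a cluster.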
Pseudocode of the Iterative K-Means Algorithm

Function SubOptimalKMeans(X, k, m)
  input:  dataset X, number of clusters k, number of iterations m
  output: class labels P_OPT
  C ← randomly chosen cluster centroids from X
  P ← K-Means(X, C, k)
  f_min ← F-ratio of P
  P_OPT ← P
  for j = 1 to m
    w ← Fisher discriminant solved from class labels P
    X_w ← all data X projected onto discriminant direction w, sorted
    P_w ← clustering of X_w solved optimally by dynamic programming
    (C, P) ← K-Means(X, P_w, k)
    f_ratio ← F-ratio of P
    if f_ratio < f_min then
      P_OPT ← P
      f_min ← f_ratio
    end if
  end for
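For illustration, the pseudocode can be transliterated into a runnable sketch. This is my own assembly under stated assumptions (trace-form F-ratio k·SSW/SSB, a small ridge on S_W, helper names mine), not the authors' implementation:

```python
import numpy as np
from scipy.linalg import eigh

def _kmeans(X, p, k, iters=100):
    """Regular K-Means started from an initial partition (label vector) p."""
    for _ in range(iters):
        C = np.array([X[p == j].mean(0) if np.any(p == j) else X[j]
                      for j in range(k)])
        q = ((X[:, None] - C[None]) ** 2).sum(-1).argmin(1)
        if np.array_equal(q, p):
            break
        p = q
    return C, p

def _f_ratio(X, p, k):
    """F-ratio validity index; assumed here in trace form k * SSW / SSB."""
    xbar = X.mean(0)
    ssw = ssb = 0.0
    for j in range(k):
        Xj = X[p == j]
        if len(Xj) == 0:
            continue
        ssw += ((Xj - Xj.mean(0)) ** 2).sum()
        ssb += len(Xj) * ((Xj.mean(0) - xbar) ** 2).sum()
    return k * ssw / max(ssb, 1e-12)

def _fisher_w(X, p, C):
    """Discriminant direction: leading eigenvector of S_B w = lambda S_W w."""
    d = X.shape[1]
    xbar = X.mean(0)
    S_W, S_B = 1e-8 * np.eye(d), np.zeros((d, d))
    for j in range(len(C)):
        D = X[p == j] - C[j]
        S_W += D.T @ D
        S_B += (p == j).sum() * np.outer(C[j] - xbar, C[j] - xbar)
    return eigh(S_B, S_W)[1][:, -1]

def _dp_partition(y, k):
    """Optimal k-interval partition of sorted 1-D values (criterion (1))."""
    n = len(y)
    s1 = np.concatenate([[0.0], np.cumsum(y)])
    s2 = np.concatenate([[0.0], np.cumsum(y * y)])
    mse = lambda a, b: (s2[b] - s2[a]) - (s1[b] - s1[a]) ** 2 / (b - a)
    E = np.full((k + 1, n + 1), np.inf)
    E[0, 0] = 0.0
    cut = np.zeros((k + 1, n + 1), dtype=int)
    for j in range(1, k + 1):
        for t in range(j, n + 1):
            for s in range(j - 1, t):
                c = E[j - 1, s] + mse(s, t)
                if c < E[j, t]:
                    E[j, t], cut[j, t] = c, s
    lab, t = np.empty(n, dtype=int), n
    for j in range(k, 0, -1):
        s = cut[j, t]
        lab[s:t] = j - 1
        t = s
    return lab

def sub_optimal_kmeans(X, k, m, seed=0):
    """SubOptimalKMeans(X, k, m): iterate Fisher projection + DP + K-Means."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), size=k, replace=False)]   # random initial codebook
    p = ((X[:, None] - C[None]) ** 2).sum(-1).argmin(1)
    C, p = _kmeans(X, p, k)
    p_opt, f_min = p, _f_ratio(X, p, k)
    for _ in range(m):
        w = _fisher_w(X, p, C)            # discriminant direction from P
        proj = X @ w                      # project all data onto w ...
        order = np.argsort(proj)          # ... and sort
        pw = np.empty(len(X), dtype=int)
        pw[order] = _dp_partition(proj[order], k)
        C, p = _kmeans(X, pw, k)          # refine with regular K-Means
        f = _f_ratio(X, p, k)
        if f < f_min:                     # keep the best partition seen
            p_opt, f_min = p, f
    return p_opt
```

Each iteration re-derives the discriminant direction from the current labels, so a poor random start can still be recovered by the DP step along the new projection.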
Four K-Means algorithms were compared in the experimental tests:
- KD-tree based K-Means: selects its initial cluster centroids from the k bucket centers of a kd-tree structure that is built recursively by principal component analysis
- PCA based K-Means: an intuitive approach that estimates a sub-optimal initial partition by applying the dynamic programming in the principal component direction
- LFD-I: the proposed iterative K-Means algorithm based on dynamic programming criterion (1)
- LFD-II: the proposed iterative K-Means algorithm based on dynamic programming criterion (2)
Comparisons of the four K-Means algorithms
Table 1: Performance comparisons (in F-ratio validity indices) of the four K-Means algorithms on the practical numbers of clusters
Datasets k KD-Tree PCA LFD-I LFD-II
boston 9 4.083 3.526 3.515 3.515
glass 6 5.931 3.974 3.966 3.984
heart 5 5.436 5.420 5.410 5.410
image 7 3.556 2.622 2.615 2.499
thyroid 3 2.265 2.265 2.264 2.264
F-ratios produced by the four K-Means clusterings on dataset glass
[Chart "Glass": F-ratio (approx. 2.8-5.8) vs. number of clusters (3-24) for KD-Tree, PCA, LFD-I and LFD-II.]
F-ratios produced by the four K-Means clusterings on dataset heart
[Chart "Heart": F-ratio (approx. 5-8) vs. number of clusters (3-24) for KD-Tree, PCA, LFD-I and LFD-II.]
F-ratios produced by the four K-Means clusterings on dataset image
[Chart "Image": F-ratio (approx. 2-5) vs. number of clusters (3-24) for KD-Tree, PCA, LFD-I and LFD-II.]
Conclusions

- A new approach to the k-center clustering problem that iteratively incorporates Fisher discriminant analysis and the dynamic programming technique.
- The proposed approach in general outperforms the two other algorithms: the PCA based K-Means algorithm and the kd-tree based K-Means algorithm.
- The classification performance gain of the proposed approach over the other two increases with the number of clusters.
Further Work

- Solving the k-center clustering problem by iteratively incorporating the kernel Fisher discriminant analysis and the dynamic programming technique.
- Solving the k-center clustering problem by incorporating the kernel PCA technique and the dynamic programming technique.