Post on 13-Jan-2016

Generalized Fuzzy Clustering Model with Fuzzy C-Means

Hong Jiang

Computer Science and Engineering, University of South Carolina, Columbia, SC 29208, US

CSCE 790E

Abstract

-- Introduction
-- Generalized Fuzzy Clustering Model
-- Realization
-- Experiment results
-- Conclusion

Introduction

What is Cluster Analysis?

-- The classification of objects into categories.

Applications of Cluster Analysis:

-- Pattern recognition, the classification of documents in information retrieval, social groupings based on various criteria, etc.

Why Fuzzy Clustering?

-- Weaker requirements are desirable: an object may belong to more than one cluster, each with a degree of membership, instead of to exactly one category.

Fuzzy c-means

Generalized Fuzzy Clustering Model

[Diagram: Original Objects -> Feature Extractor -> Feature Information -> Fuzzy Cluster Analyzer -> Cluster Information -> Post Treatment -> Goal Objects]

(Cont.)

-- Original Objects: the representation of the input data, obtained by measurements on the objects that are to be recognized. It may be any kind of data in any kind of data structure.

-- Feature Information: characteristic features extracted from the input data, in terms of which the dimensionality of the pattern vectors can be reduced. The features should be characterizing attributes that discriminate well between the given pattern classes.

-- Cluster Information: category information obtained through cluster analysis.

-- Goal Objects: the final desired result; this stage may not be necessary.

Fuzzy Cluster Analyzer

[Diagram. Inputs: Feature Data (f_n x d), Cluster Number (c_n), Exponent (expo). Blocks: Initialize (f_n), U^expo, E-step, Distance Compute, M-step. Matrices: U (c_n x f_n), C (c_n x d), D (c_n x f_n). Output: Cost.]

U: fuzzy partition matrix; C: center matrix; D: distance matrix.
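For reference, the quantities named in the analyzer can be written in the standard fuzzy c-means form. This is a sketch under assumed notation that the slides do not define: x_k is the k-th row of the feature data, c_i the i-th row of C, u_ik and d_ik the entries of U and D, m the exponent (expo), and J the cost.

```latex
\begin{align}
  J      &= \sum_{i=1}^{c_n} \sum_{k=1}^{f_n} u_{ik}^{m} \, d_{ik}^{2},
         \qquad \sum_{i=1}^{c_n} u_{ik} = 1 \ \text{for every } k, \\
  c_i    &= \frac{\sum_{k=1}^{f_n} u_{ik}^{m} \, x_k}{\sum_{k=1}^{f_n} u_{ik}^{m}}, \\
  d_{ik} &= \lVert x_k - c_i \rVert , \\
  u_{ik} &= \left( \sum_{j=1}^{c_n}
            \left( \frac{d_{ik}}{d_{jk}} \right)^{2/(m-1)} \right)^{-1} .
\end{align}
```

The membership update requires m > 1 and d_jk > 0; an implementation has to handle a data point that coincides with a center.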

Realization

-- Initialization: generate the initial fuzzy partition matrix for clustering.
-- U^expo: get the matrix after exponential modification.
-- E-step: get the new center matrix.
-- Distance compute: calculate the distance between each center and the input feature data. Default: Euclidean distance.
-- M-step: get the new fuzzy partition matrix and the cost function value (used to control the iterations).
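The five steps above can be sketched in Python with NumPy. This is a minimal illustration, not the author's implementation: the function names (init_partition, euclidean_distance, fcm_step, fcm), the stopping rule on the change in cost, and the defaults for max_iter, min_improve, and seed are all assumptions. Matrix shapes follow the slides: U is (c_n x f_n), C is (c_n x d), D is (c_n x f_n).

```python
import numpy as np

def init_partition(c_n, f_n, rng):
    """Initialization: random fuzzy partition matrix U whose columns sum to 1."""
    u = rng.random((c_n, f_n))
    return u / u.sum(axis=0, keepdims=True)

def euclidean_distance(centers, data):
    """Distance compute: D[i, k] = Euclidean distance from center i to point k."""
    diff = centers[:, None, :] - data[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def fcm_step(data, u, expo):
    """One iteration: U^expo, E-step, distance compute, M-step."""
    um = u ** expo                                         # U^expo
    centers = (um @ data) / um.sum(axis=1, keepdims=True)  # E-step: new C
    dist = euclidean_distance(centers, data)               # D
    cost = float((um * dist ** 2).sum())                   # cost for current U, new C
    dist = np.fmax(dist, np.finfo(float).eps)              # guard against zero distance
    tmp = dist ** (-2.0 / (expo - 1.0))
    new_u = tmp / tmp.sum(axis=0, keepdims=True)           # M-step: new U
    return new_u, centers, cost

def fcm(data, c_n, expo=2.0, max_iter=100, min_improve=1e-5, seed=0):
    """Iterate until the cost changes by less than min_improve (assumed rule)."""
    rng = np.random.default_rng(seed)
    u = init_partition(c_n, data.shape[0], rng)
    prev_cost = np.inf
    for _ in range(max_iter):
        u, centers, cost = fcm_step(data, u, expo)
        if abs(prev_cost - cost) < min_improve:
            break
        prev_cost = cost
    return u, centers, cost
```

Calling fcm(data, c_n=3) on an (f_n x d) array returns the partition matrix, the centers, and the last cost value. The result depends on the random initial partition, so different seeds may order the clusters differently.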

Experiment results

example 1

Feature Data (15 x 2):

-0.0429  -5.8091
 0.0421  -6.9078
 0.6455  -5.8091
-0.2485  -6.2146
-0.5465  -6.9078
-5.8091  -2.2538
-6.9078   0.5585
-4.2687   0.6092
-4.9618   0.0208
-5.5215  -1.5418
-0.5108   0
-0.1054   0.2624
 0.4055  -0.3567
-1.2040  -0.1054
-0.2231  -0.5108

[Figures: clustering state at Step 0, Step 1, Step 10, Step 15, Step 20, and Step 25]

Result (15 x 3; each row holds one data point's membership in the three clusters):

0.0031  0.9952  0.0017
0.0161  0.9735  0.0105
0.0230  0.9650  0.0120
0.0006  0.9991  0.0004
0.0175  0.9701  0.0124
0.0856  0.0562  0.8583
0.0829  0.0365  0.8806
0.1562  0.0343  0.8096
0.0272  0.0083  0.9645
0.0362  0.0185  0.9453
0.9942  0.0023  0.0035
0.9660  0.0141  0.0200
0.9308  0.0347  0.0345
0.9777  0.0072  0.0151
0.9788  0.0097  0.0114
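A fuzzy result like the one above can be turned into a hard assignment by picking, for each data point, the cluster with the largest membership. The sketch below assumes the result values are read as 15 rows of 3 memberships (each row then sums to about 1); the variable names are illustrative and the cluster indices are zero-based.

```python
import numpy as np

# Result values from example 1, read as 15 points x 3 clusters (assumed layout).
membership = np.array([
    [0.0031, 0.9952, 0.0017], [0.0161, 0.9735, 0.0105], [0.0230, 0.9650, 0.0120],
    [0.0006, 0.9991, 0.0004], [0.0175, 0.9701, 0.0124], [0.0856, 0.0562, 0.8583],
    [0.0829, 0.0365, 0.8806], [0.1562, 0.0343, 0.8096], [0.0272, 0.0083, 0.9645],
    [0.0362, 0.0185, 0.9453], [0.9942, 0.0023, 0.0035], [0.9660, 0.0141, 0.0200],
    [0.9308, 0.0347, 0.0345], [0.9777, 0.0072, 0.0151], [0.9788, 0.0097, 0.0114],
])

# Hard label = index of the largest membership in each row.
labels = membership.argmax(axis=1)
print(labels.tolist())  # [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0]
```

Under this reading, the first five points fall in one cluster, the next five in a second, and the last five in a third, which matches the three visible groups in the feature data.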

Experiment results

example 2

[Figures: clustering results for Cluster Number = 2 and Cluster Number = 4]

Experiment results

example 3

[Figure: Original Image]

Feature data (6000 x 3) are obtained based on texture.

http://vulcan.ee.iastate.edu/~dickerson/classes/ee571x/homework/hw4soln/hw4.html

[Figure: Clustering Result]

Conclusion

Model evaluation:

– Easy to understand.
– Extensible to other applications.
– Independent.
– Convenient to improve.

Possible improvements:

– Obtain Feature Data (normalization, well discriminated?)
– Determine Cluster Number
– U^expo (time consuming, other representation)
– Distance Computation (other kind of distance)
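Two of the improvements listed above, feature normalization and a different distance, can be illustrated briefly. This is a sketch under assumptions: z-score scaling and the Manhattan (city-block) distance are chosen here only as examples, the slides do not say which alternatives were intended, and the function names are hypothetical. Note that with a non-Euclidean distance the weighted-mean center update is no longer guaranteed to minimize the cost, so swapping the distance alone is a heuristic.

```python
import numpy as np

def zscore(data):
    """Scale each feature column to zero mean and unit standard deviation."""
    mean = data.mean(axis=0)
    std = data.std(axis=0)
    std[std == 0] = 1.0  # leave constant features unscaled
    return (data - mean) / std

def manhattan_distance(centers, data):
    """Alternative D (c_n x f_n): sum of absolute coordinate differences."""
    return np.abs(centers[:, None, :] - data[None, :, :]).sum(axis=2)
```

Normalizing first keeps a feature with a large numeric range from dominating the distance computation.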
