Intelligent Database Systems Lab
Presenter: JHOU, YU-LIANG
Authors: Yiu-ming Cheung, Hong Jia
2013, Pattern Recognition (PR)
Categorical-and-numerical-attribute data clustering based on a unified
similarity metric without knowing cluster number
Outlines
• Motivation
• Objectives
• Methodology
• Experiments
• Conclusions
• Comments
Motivation
• Clustering mixed data is a nontrivial task because there is an awkward gap between the similarity metrics for categorical and numerical data.
Objectives
• This paper presents a general clustering framework based on the concept of object-cluster similarity and gives a unified similarity metric that can be applied to data with categorical, numerical, or mixed attributes.
Methodology
Object-cluster similarity metric
• categorical attributes
Methodology
Object-cluster similarity metric
• numerical attributes
• mixed data
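The slides name the three cases of the object-cluster similarity without giving formulas. The sketch below is one plausible reading, not the paper's exact definition. It assumes that the categorical similarity is the average fraction of cluster members sharing the object's value on each attribute, that the numerical similarity is exp(-0.5 × squared Euclidean distance to the cluster centroid) normalized over all clusters, and that the mixed similarity weights the two parts by d_c/(d_c+1) and 1/(d_c+1), where d_c is the number of categorical attributes. All function names are hypothetical.

```python
import math
from collections import Counter


def categorical_similarity(obj_cat, cluster_cat):
    """Assumed form: mean, over attributes, of the fraction of cluster
    members that share the object's value on that attribute."""
    if not obj_cat or not cluster_cat:
        return 0.0
    total = 0.0
    for r, value in enumerate(obj_cat):
        counts = Counter(member[r] for member in cluster_cat)
        total += counts[value] / len(cluster_cat)
    return total / len(obj_cat)


def numerical_similarities(obj_num, centroids):
    """Assumed form: exp(-0.5 * squared distance to each centroid),
    normalized so the values over all clusters sum to 1."""
    raw = [
        math.exp(-0.5 * sum((a - b) ** 2 for a, b in zip(obj_num, c)))
        for c in centroids
    ]
    norm = sum(raw)
    if norm == 0.0:  # every distance underflowed; fall back to uniform
        return [1.0 / len(centroids)] * len(centroids)
    return [v / norm for v in raw]


def mixed_similarity(cat_sim, num_sim, n_categorical):
    """Assumed weighting: the numerical part counts as one attribute
    next to the n_categorical categorical attributes."""
    w = n_categorical / (n_categorical + 1)
    return w * cat_sim + (1.0 - w) * num_sim
```

Under these assumptions every similarity lies in [0, 1], so the categorical and numerical parts can be combined without rescaling either one.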
Methodology
Iterative clustering algorithm
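The slide gives only the algorithm's name. A minimal sketch of an iterative scheme in this spirit, assuming each object is repeatedly reassigned to the cluster with the highest object-cluster similarity until no assignment changes, might look as follows. The round-robin initialization, the categorical-only similarity, and the function names are illustrative assumptions, not details taken from the paper.

```python
from collections import Counter


def similarity(obj, members):
    """Assumed metric: mean fraction of cluster members sharing the
    object's value on each categorical attribute."""
    if not members:
        return 0.0
    return sum(
        Counter(m[r] for m in members)[v] / len(members)
        for r, v in enumerate(obj)
    ) / len(obj)


def iterative_clustering(data, k, max_iter=100):
    # Hypothetical initialization: assign objects to clusters round-robin.
    labels = [i % k for i in range(len(data))]
    for _ in range(max_iter):
        changed = False
        for i, obj in enumerate(data):
            # Rebuild cluster memberships from the current labels
            # (simple but not efficient; fine for a sketch).
            clusters = [
                [data[n] for n, lab in enumerate(labels) if lab == j]
                for j in range(k)
            ]
            # Move the object to its most similar cluster.
            best = max(range(k), key=lambda j: similarity(obj, clusters[j]))
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:  # converged: a full pass made no reassignment
            break
    return labels
```

Note that this sketch takes the number of clusters `k` as an input; it does not attempt the automatic cluster-number selection covered on the following slides.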
Methodology
Automatic selection of cluster number
• Competition mechanism
Methodology
Automatic selection of cluster number
• Penalization mechanism
Experiments
Data sets
Experiments
Mixed data
Experiments
Categorical data
Conclusions
• The proposed approach reduces the time consumed by the clustering process, improves its efficiency, and overcomes the cluster number selection problem.
Comments
• Advantages: saves time and improves efficiency.
• Applications: clustering.