measurement of quality in cluster analysisucakche/presentations/cquality... · 2013-07-24 · of...
TRANSCRIPT
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Measurement of quality in cluster analysis
Christian Hennig
July 24, 2013
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Which clustering is better?Why datasets without known truth?
1. Introduction
IFCS task force for cluster benchmarking(Nema Dean, Iven van Mechelen, Fritz Leisch, Doug Steinley,Bernd Bischl, Isabelle Guyon, Christian Hennig)
Data repository for systematic comparison of qualityof different cluster analysis algorithms
In this presentation: compare quality of clusteringsbased on clustering and data alone,without reference to known truth.
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Which clustering is better?Why datasets without known truth?
Which clustering is better?(Old faithful geyser data)
−2 −1 0 1 2
−2
−1
01
mclust
waiting
dura
tion
−2 −1 0 1 2
−2
−1
01
pam
waiting
dura
tion
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Which clustering is better?Why datasets without known truth?
Which clustering is better?
−10 0 10 20 30
010
2030
40
xy[,1]
xy[,2
]
−10 0 10 20 30
010
2030
40
xy[,1]
xy[,2
]
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Which clustering is better?Why datasets without known truth?
Which tower is better?
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Which clustering is better?Why datasets without known truth?
Why datasets without known truth?Benchmarking approaches:
◮ Real datasets with known classes◮ Simulated datasets from mixture distributions◮ Datasets with intuitive classes by fiat◮ Real datasets without known classes
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Which clustering is better?Why datasets without known truth?
Why datasets without known truth?Benchmarking approaches:
◮ Real datasets with known classes◮ Simulated datasets from mixture distributions◮ Datasets with intuitive classes by fiat◮ Real datasets without known classes
Misclassification rates or Rand indexare (more or less) straightforward.
So why use datasets without known truth?
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Which clustering is better?Why datasets without known truth?
What’s wrong with knowing the truth?
Disclaimer: knowing the truth is not evil.There is definitely a role for datasetswith known truth in cluster benchmarking.
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Which clustering is better?Why datasets without known truth?
What’s wrong with knowing the truth?
Disclaimer: knowing the truth is not evil.There is definitely a role for datasetswith known truth in cluster benchmarking.
Measuring cluster quality“ignoring” the truth can be of useeven if truth is known.(May explain which truths a method can discover.)
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Which clustering is better?Why datasets without known truth?
What’s wrong with knowing the truth?But. . .
◮ In datasets with known classesclustering is not of real scientific interest.(Or one may want to find different clusterings.)Deviate systematically from real clustering problems.
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Which clustering is better?Why datasets without known truth?
What’s wrong with knowing the truth?But. . .
◮ In datasets with known classesclustering is not of real scientific interest.(Or one may want to find different clusterings.)Deviate systematically from real clustering problems.
◮ The fact that we know certain true classesdoesn’t preclude other legitimate/”true” clusterings.
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Which clustering is better?Why datasets without known truth?
What’s wrong with knowing the truth?But. . .
◮ In datasets with known classesclustering is not of real scientific interest.(Or one may want to find different clusterings.)Deviate systematically from real clustering problems.
◮ The fact that we know certain true classesdoesn’t preclude other legitimate/”true” clusterings.
◮ Classes in supervised classification problemsmay not qualify as data analytic clusters.
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Which clustering is better?Why datasets without known truth?
What’s wrong with knowing the truth?But. . .
◮ In datasets with known classesclustering is not of real scientific interest.(Or one may want to find different clusterings.)Deviate systematically from real clustering problems.
◮ The fact that we know certain true classesdoesn’t preclude other legitimate/”true” clusterings.
◮ Classes in supervised classification problemsmay not qualify as data analytic clusters.
So there could be better truths than the known one.
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Which clustering is better?Why datasets without known truth?
Which clustering is better?(10-d vowel data; Hastie, Tibshirani and Friedman ESL)
1
2
3
4
5
6 7
89
0
a
1
2
3
4
5
6 7
8
9
0
a 1
2
3
4
5
6 7
8
9
0
a
1
2
3
4
5
6
7
8
9
0
a
1
23
4
5
6
78
9
0
a
1
2
3 4
5
678
9
0
a
1
2
3
4
56
7
8
9
0
a
1
2
3
4
56
7
8
9
0
a
1
2
3
4
56
7
8
9
0
a
1
2
3
4
56
7
8
9
0
a
1
2
3
4
56
7
8
9
0
a
1
2
3
4
56
7
8
9
0
a
1
2
3
4
5
6
7
8 9
0
a
1
2
3
4
5
6
7
89
0
a
1
2
3
4
5
6
7
8
9
0
a
1
2
3
4
5
6
7
89
0
a
1
2
3
45 6
7
89
0
a
1
2
3
45
6
7
8
9
0
a 12
3
4
56
7
8
9
0
a
12
3
4
56
7
89
0
a
12
3
4
56
7
8
9
0
a
12
3
4
5
6
7
89
0
a
12
3
4
5
6
7
89
0
a
12
3
4
5
67
8
9
0
a
1
2
3
4
5
67
8
9
0
a
1
2
3
45
6
7
89
0
a
1
2
3
45
6
7
8
9
0
a
1
2
3
4
5
6
7
8
9
0
a
12
3
4
5
6
7
8
9
0
a1
2
3
4
5
67
8
9
0
a
1
2
34
5
678
90
a
1
2
345
67
8
9 0
a
1
2
34
5
67
8
90
a
1
2
34
5
67
8
9
0
a
1
2
34
5
6
78
90
a
1
2
3
45
678
90
a
12
3
4
5
6
7
8 9 0
a
1
2
3
4
5
6
7
89
0
a
1
2
3
4
5
6
7
89
0
a
1
2
3
4
56
7
89
0
a
1
2
3
4
5 6
7
89
0
a
1
2
3
4
5 6
7
89
0
a
12
3
456
7
8
9
0
a
1
2
3
4
56
7
8
9
0
a
1
2
3
4
56
7
8
9
0
a
1
2
3
4
5
6
7
8
90
a
1
2
3
4
5
6
7
8
9
0
a
1
2
3
4
5
6
7
8
9
0
a
1
23
4
56
7
8
9
0
a
1
2
3
4
56
7
8
9
0
a
1
2
3
4
5
6
7
8
9
0
a
1
2
3
4
5
6
7
8
9
0
a
1
2
3
4
56
7
8
9
0
a
1
2
3
4
56
7
8
9
0
a
12
3
4
56
7
8
9
0
a
1
2
3
4
5 67
89
0
a
1
23
4
56
78
9
0
a
1
23
4
5
6
78
9
0
a
12
3
4
5
6
7
8
90
a
1
2
3
4
5
6
7
8
90
a 1
2
3
4
56
7
89
0
a
1
2 3
4
56
7
89
0
a
1
2 3
4
5
6
7
89
0
a
1
2 3
4
5
6
7
89
0
a
1
23
4
5
6
7
8
9
0
a
1
23
45
67
8
9
0
a
1
2
3
4
56
78
9
0
a
1
2
3
4
5
6
78
9
0
a
1
2
3
4
5
6
7
8
9
0
a
1
2
3
4
56
7
8
9
0
a
1
2
3
4
5
67
8
9
0
a
1
2
3
4
5
6
7
8
9
0
a
1
2
3
4
5 6
7
8
9
0
a 1
2
3
4
5 6
7
8
9
0
a1
2
3
4
56
7
8
9
0
a
1
2
3
4
5
6
7
8
9
0
a
1
2
3
4
5
6
7
8
9
0
a
1
2
3
4
5
67
8
9
0
a
1
2
3
4
5 6
7
8
90
a
1
2
3
4
5 6
7
8
9
0
a
1
2
3
4
5 6
7
8
9
0
a
1
2
3
4
5 6
7
8
9
0
a
1
2
3
4
5
6
7
89
0
a
1
2
3
4
5
6
7
8
9
0
a
1
2
34
5
6
7
8 9
0 a
1
23
4
5
6
7
8
90
a
1
2 34
5
6
7
8
90
a
1
234
5
6
7
8
90 a
1
23
4
5
67
8
9
0
a
1
23
4
5
6
7
8
9
0
a
−8 −6 −4 −2 0 2
−12
−10
−8
−6
−4
−2
dc 1
dc 2
9
5
6
6
8
0
a
1
a
a
0
9
9
6
6
8 0
a
1
a
a
0
99
7
6
1
0
a1
a
a
0
5
9
5
6
1
0
a1
a
a
05
9
9
6
1
0
a1
1
a
05
99
6
0
0
a
1
1
a
6
5
5
96
888
1
1
1
3
5
5
96
8 8
8
1
2
1
3
5
5
96
8 8
8
12
1
3
5
5
9
6
8 88
12
1
3
5
5
96
888
1 2
1
3
5
5
96
8
3
8
1
2
1
3 5
56
68
88
1
1
1
6
5
56
68
8
8
1
2
1
45
56
6
88
8
1
2
1
4
5
56
68 8
81
1
1
45
5
6
688
11
11
6
5
5
6
68
8
1
1 1
1
6
5
5
6
6
6 6
8
2
2
5
6
5
5
6
6
66
82
25
6
5
5
6
66
68
2 2
5
6
5
5
6
60 68
2 2
5
6
5
5
6
606
8
22
5
6
9
5
6
60
6
8
21
5
6
77
6
6
88
8
8
24
3
77
a
6
88
8
8
2
43
77
a
68 8
88
1
43
77a
63 8
8
8
1
4 3
7
a
a
63
3
8
8
2
4 3 7
a
a
63
3
2
2
2
43
a
a
a
33
3
32
3
a
3
a
a
a
33
3
32
3
a
3
a
a
a
33
3
3
2
3
a
3
a
a
a
0
3
3
3
2
3
a
3
a
a
a
3
3
3
3
2
3
a
3
a
a
a
3
3
3
3
2
3
a
3
aa
a
3
33
82
4
43
a
7
a
3
3
3
82
4
43
7
7a
3
33
82
4
4
3
7
7a
3
33
82
4
4
3
7
7
a
3
33
82
4
4
3
7
7
a
3
3
3
82
4
4
3
55
9
0
00
1
1
0 5
0
5
5
9
00
0
1
1
0
5
0
5
5
9
0
0
0
1
1
0
5
0
5
5
9
0
0
8
1
1
0
5
0
5
5
9
0
0
80
1
0
5
0
5
5
9
0
080
1
0
5
0
7
9
9
6
86
8
1
11
6
7
9
9
6
8
61
1
1 1
6
7
9
9
6
8
61
1
11
6
7
9
9
6
8
6
1
1
1 1
6
7
99
6
8
6
1
1 1
1
6
7
9
9
6
8
6
1
1
1
1
6
9
9
9
0
0
0
0
1
01
0
99
9
0
0
0
0
1
01
0
99
9
0
0
0
01
0
1
0
99
9
0
0
0
01
0
1
0
9
9
9
0
0
0
0
1
0
1
0
9
9
9
0
0
0
0
1
0
1
0
9
9
6
68 6
0
a
0 0
0
9
9
6
6
06
a
a
0
0
0
9
9
6
6
0
0
a
a
0
0
0
99
9
6
0
0
a
a
00
0
99
9
6
0
0
a
a
00
0
99
9
6
0
0
a
a
0
0
0
7
7
7
68 6
11
4
4
6
7
7
7
686
1
1
4
4
6
777
68 6
11
4
4
6
77
a
686
11
4
4
6
77a
68
6
8
1
4
4
6
7
aa
68
68
1
4
4
6
7
7
76
8 68
1
8
2
0
7
7
76
8 68
1
0
2
0
757
6
868
1
0
2
0
7
5
7
6
868
1
0
2
0
7
5
7
6
8 08
1
0
2
0
7
5
76
80
8
1 5
2
0
7
7
7
6
8
802 4
4
6
7
7
7
6
8
802 4
4
6
7
7
7
6
8
812
4
4
3
7
77
6
8
3
82
4
4
3
7
77
6
8
3
82
4
4
3
7
776
8
3
8
2
4
4
3
a
a
a
3
8
3
22
2
23
a a
a
3
8
3
22
2
23
aa
a
3
8
3
22
2
23
a a
a
38
3
2
2
2
2
3
aa
a
33
3
22
2
2
3
aa
a
33
3
22
2
2
4
−10 −8 −6 −4 −2 0 2
−8
−6
−4
−2
02
dc 1
dc 2
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Which clustering is better?Why datasets without known truth?
What’s wrong with knowing the truth?◮ Identification mixture components/clusters
is problematic.◮ Mixture of two components may be unimodal.
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Which clustering is better?Why datasets without known truth?
What’s wrong with knowing the truth?◮ Identification mixture components/clusters
is problematic.◮ Mixture of two components may be unimodal.◮ Observations in tails may rather be outliers
than cluster members (t-distributions).
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Which clustering is better?Why datasets without known truth?
What’s wrong with knowing the truth?◮ Identification mixture components/clusters
is problematic.◮ Mixture of two components may be unimodal.◮ Observations in tails may rather be outliers
than cluster members (t-distributions).◮ Clustering aims may deviate from
finding intuitive clusters or mixture components.
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Which clustering is better?Why datasets without known truth?
Which clustering is better?
−10 0 10 20 30
010
2030
40
xy[,1]
xy[,2
]
−10 0 10 20 30
010
2030
40
xy[,1]
xy[,2
]
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Cluster validation indexesGeneral philosophyTypical clustering aims
2. Basic thoughts
There is a range of cluster validation indexesmeasuring clustering quality, such asAverage silhouette width (ASW)(Kaufman and Rouseeuw 1990)sw(i , C) = b(i ,C)−a(i ,C)
max(a(i ,C),b(i ,C)) ,
a(i , C) =1
|Cj | − 1
∑
x∈Cj
d(xi , x), b(i , C) = minxi 6∈Cl
1|Cl |
∑
x∈Cl
d(xi , x).
Maximum average sw ⇒ good C.
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Cluster validation indexesGeneral philosophyTypical clustering aims
Most such indexes balance within-cluster homogeneityagainst between-cluster separation.
“One size fits it all”-approach.
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Cluster validation indexesGeneral philosophyTypical clustering aims
Most such indexes balance within-cluster homogeneityagainst between-cluster separation.
“One size fits it all”-approach.
Homogeneity will always dominate here:
−10 0 10 20 30
010
2030
40
xy[,1]
xy[,2
]
−10 0 10 20 30
010
2030
40
xy[,1]
xy[,2
]
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Cluster validation indexesGeneral philosophyTypical clustering aims
General philosophy
There are various different aims of clustering.Depending on application,these aims carry different weights.
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Cluster validation indexesGeneral philosophyTypical clustering aims
General philosophy
There are various different aims of clustering.Depending on application,these aims carry different weights.
Measure them separately to characterisewhat a method does best,instead of producing a single ranking.
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Cluster validation indexesGeneral philosophyTypical clustering aims
General philosophy
There are various different aims of clustering.Depending on application,these aims carry different weights.
Measure them separately to characterisewhat a method does best,instead of producing a single ranking.
Can piece together overall qualityas weighted mean of separate statistics.
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Cluster validation indexesGeneral philosophyTypical clustering aims
Typical clustering aims◮ Between-cluster separation
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Cluster validation indexesGeneral philosophyTypical clustering aims
Typical clustering aims◮ Between-cluster separation◮ Within-cluster homogeneity (low distances)
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Cluster validation indexesGeneral philosophyTypical clustering aims
Typical clustering aims◮ Between-cluster separation◮ Within-cluster homogeneity (low distances)◮ Within-cluster homogeneous distributional shape
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Cluster validation indexesGeneral philosophyTypical clustering aims
Typical clustering aims◮ Between-cluster separation◮ Within-cluster homogeneity (low distances)◮ Within-cluster homogeneous distributional shape◮ Good representation of data by centroids
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Cluster validation indexesGeneral philosophyTypical clustering aims
Typical clustering aims◮ Between-cluster separation◮ Within-cluster homogeneity (low distances)◮ Within-cluster homogeneous distributional shape◮ Good representation of data by centroids◮ Good representation of dissimilarity
by clustering-induced metric
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Cluster validation indexesGeneral philosophyTypical clustering aims
Typical clustering aims◮ Between-cluster separation◮ Within-cluster homogeneity (low distances)◮ Within-cluster homogeneous distributional shape◮ Good representation of data by centroids◮ Good representation of dissimilarity
by clustering-induced metric◮ Clusters are regions of high density
without within-cluster gaps
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Cluster validation indexesGeneral philosophyTypical clustering aims
Typical clustering aims◮ Between-cluster separation◮ Within-cluster homogeneity (low distances)◮ Within-cluster homogeneous distributional shape◮ Good representation of data by centroids◮ Good representation of dissimilarity
by clustering-induced metric◮ Clusters are regions of high density
without within-cluster gaps◮ Uniform cluster sizes
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Cluster validation indexesGeneral philosophyTypical clustering aims
Typical clustering aims◮ Between-cluster separation◮ Within-cluster homogeneity (low distances)◮ Within-cluster homogeneous distributional shape◮ Good representation of data by centroids◮ Good representation of dissimilarity
by clustering-induced metric◮ Clusters are regions of high density
without within-cluster gaps◮ Uniform cluster sizes◮ Stability (requires knowledge of method)
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Cluster validation indexesGeneral philosophyTypical clustering aims
E.g., pattern recognition in imagesrequires separation,
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Cluster validation indexesGeneral philosophyTypical clustering aims
E.g., pattern recognition in imagesrequires separation,
clustering for information reduction requiresgood representation by centroids,
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Cluster validation indexesGeneral philosophyTypical clustering aims
E.g., pattern recognition in imagesrequires separation,
clustering for information reduction requiresgood representation by centroids,
groups in social network analysis shouldn’t havelarge within-cluster gaps,
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Cluster validation indexesGeneral philosophyTypical clustering aims
E.g., pattern recognition in imagesrequires separation,
clustering for information reduction requiresgood representation by centroids,
groups in social network analysis shouldn’t havelarge within-cluster gaps,
underlying “true” classes (biological species)may cause homogeneous distributional shapes.
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Principle of direct interpretationMeasuring between-cluster separationOther statistics
3. Cluster quality statistics
Aim: measure all that’s of interestby statistics in [0, 1] (1 is good)(so that different statistics are comparableand weighted means make sense).
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Principle of direct interpretationMeasuring between-cluster separationOther statistics
3. Cluster quality statistics
Aim: measure all that’s of interestby statistics in [0, 1] (1 is good)(so that different statistics are comparableand weighted means make sense).
Principle of direct interpretation:Aim at translating requirements directly into formulae;that’s not optimisation, not estimation of any “truth”.
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Principle of direct interpretationMeasuring between-cluster separationOther statistics
Warning: requires bold subjective tuning decisions.
And it’s work in progress.
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Principle of direct interpretationMeasuring between-cluster separationOther statistics
Measuring between-cluster separation
∃ several ways measuring separation (as for other aims).Straightforward: min distance between any two clusters,or distance between centroids (e.g., k-means).
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Principle of direct interpretationMeasuring between-cluster separationOther statistics
−2 −1 0 1 2
−2
−1
01
waiting
dura
tion
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Principle of direct interpretationMeasuring between-cluster separationOther statistics
−2 −1 0 1 2
−2
−1
01
waiting
dura
tion
M
M
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Principle of direct interpretationMeasuring between-cluster separationOther statistics
Measuring between-cluster separation
∃ several ways measuring separation (as for other aims).Straightforward: min distance between any two clusters,or distance between centroids (e.g., k-means).
These measure quite different concepts of separation.(min distance relies on only two points;centroid distance ignores what goes on at border.)
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Principle of direct interpretationMeasuring between-cluster separationOther statistics
p-separation index:More stable version of “min distance”:Average distance to nearest point in different cluster forp = 10% “border” points in any cluster.(ASW averages 100% to all in neighbouring cluster.)
−2 −1 0 1 2
−2
−1
01
waiting
dura
tion
X
X
XX
X
X
X
XX
XX XX
X
X
X
X
X
X
XX
X X
X
X
X X
X
XX
X
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Principle of direct interpretationMeasuring between-cluster separationOther statistics
p-stability index:Average distance to nearest point in different cluster forp = 10% “border” points in any cluster.
Problems: choice of p, standardisation.
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Principle of direct interpretationMeasuring between-cluster separationOther statistics
p-stability index:Average distance to nearest point in different cluster forp = 10% “border” points in any cluster.
Problems: choice of p, standardisation.
May standardise by maximum distance;range then is [0, 1], but values may be very small,max distance may be outlying,implicit downweighting if used inoverall quality weighted mean.
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Principle of direct interpretationMeasuring between-cluster separationOther statistics
Could use average/median distance etc. and bound by 1(“separation larger ave distance ⇒ perfect”).
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Principle of direct interpretationMeasuring between-cluster separationOther statistics
Could use average/median distance etc. and bound by 1(“separation larger ave distance ⇒ perfect”).
Probably not fully satisfactory.
May use nonlinear transformation to [0, 1]pronouncing differences between lower values,taking into account whetherMax distance >> ave/median distance.
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Principle of direct interpretationMeasuring between-cluster separationOther statistics
Could use average/median distance etc. and bound by 1(“separation larger ave distance ⇒ perfect”).
Probably not fully satisfactory.
May use nonlinear transformation to [0, 1]pronouncing differences between lower values,taking into account whetherMax distance >> ave/median distance.
Stick to max distance standardisation here.
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Principle of direct interpretationMeasuring between-cluster separationOther statistics
Could use average/median distance etc. and bound by 1(“separation larger ave distance ⇒ perfect”).
Probably not fully satisfactory.
May use nonlinear transformation to [0, 1]pronouncing differences between lower values,taking into account whetherMax distance >> ave/median distance.
Stick to max distance standardisation here.
p = 0.1 intuitive; sensitivity?
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Principle of direct interpretationMeasuring between-cluster separationOther statistics
Alternative concept:Distance-based knn density index
Measures whether border points have lowest density,highest density is within clusters i .
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Principle of direct interpretationMeasuring between-cluster separationOther statistics
Alternative concept:Distance-based knn density index
Measures whether border points have lowest density,highest density is within clusters i .
“Border points” here: nBi points that have points
from other clusters among k = 4-nearest neighbours,nI
i interior points.
Pointwise density: k /(2∗mean distance to k-nn).
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Principle of direct interpretationMeasuring between-cluster separationOther statistics
−2 −1 0 1 2
−2
−1
01
waiting
dura
tion
BB B
B
B
B
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Principle of direct interpretationMeasuring between-cluster separationOther statistics
−2 −1 0 1 2
−2
−1
01
waiting
dura
tion
BB B
B
B
B
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Principle of direct interpretationMeasuring between-cluster separationOther statistics
Clusterwise density index r∗i :(mean border density)/(mean interior density),0 if nB
i = 0, 1 if nIi = 0.
Aggregation (∈ [0, 1]):
ID = 1 − ((1 − q)r1 + qr2)1((1 − q)r1 + qr2 ≤ 1),
r1 =∑
wi r∗i , r2 = bi,
q = 0.5 − |n−
P
nIi
n − 0.5|, wi =nI
iP
nIi,
Overall: b mean border density, i mean interior density.
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Principle of direct interpretationMeasuring between-cluster separationOther statistics
Aggregation (∈ [0, 1]):
ID = 1 − ((1 − q)r1 + qr2)1((1 − q)r1 + qr2 ≤ 1),
r1 =∑
wi r∗i , r2 = bi,
q = 0.5 − |n−
P
nIi
n − 0.5|, wi =nI
iP
nIi,
Idea: r1 measures “cluster-relative” density ratio,r2 overall.
Both of interest, but for∑
nIi ≈ n or 0,
one side of r2 relies on very weak information.
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Principle of direct interpretationMeasuring between-cluster separationOther statistics
Aggregation (∈ [0, 1]):
ID = 1 − ((1 − q)r1 + qr2)1((1 − q)r1 + qr2 ≤ 1),
r1 =∑
wi r∗i , r2 = bi,
q = 0.5 − |n−
P
nIi
n − 0.5|, wi =nI
iP
nIi,
Idea: r1 measures “cluster-relative” density ratio,r2 overall.
Both of interest, but for∑
nIi ≈ n or 0,
one side of r2 relies on very weak information.
Although r1 downweights clusters with nIi small,
outlier one-point clusters still produce too good ID.
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Principle of direct interpretationMeasuring between-cluster separationOther statistics
Other statistics◮ Within-cluster average distance
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Principle of direct interpretationMeasuring between-cluster separationOther statistics
Other statistics◮ Within-cluster average distance◮ Aggregated within-cluster similarity
(Kolmogorov etc.) to normal/uniform
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Principle of direct interpretationMeasuring between-cluster separationOther statistics
Other statistics◮ Within-cluster average distance◮ Aggregated within-cluster similarity
(Kolmogorov etc.) to normal/uniform◮ Within-cluster (squared) distance to centroid
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Principle of direct interpretationMeasuring between-cluster separationOther statistics
Other statistics◮ Within-cluster average distance◮ Aggregated within-cluster similarity
(Kolmogorov etc.) to normal/uniform◮ Within-cluster (squared) distance to centroid◮ ρ(distance, cluster induced distance) (Hubert’s Γ)
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Principle of direct interpretationMeasuring between-cluster separationOther statistics
Other statistics◮ Within-cluster average distance◮ Aggregated within-cluster similarity
(Kolmogorov etc.) to normal/uniform◮ Within-cluster (squared) distance to centroid◮ ρ(distance, cluster induced distance) (Hubert’s Γ)◮ Entropy of cluster sizes
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Principle of direct interpretationMeasuring between-cluster separationOther statistics
Other statistics◮ Within-cluster average distance◮ Aggregated within-cluster similarity
(Kolmogorov etc.) to normal/uniform◮ Within-cluster (squared) distance to centroid◮ ρ(distance, cluster induced distance) (Hubert’s Γ)◮ Entropy of cluster sizes◮ Within-cluster nearest neighbour distances
coefficient of variation
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Principle of direct interpretationMeasuring between-cluster separationOther statistics
Other statistics◮ Within-cluster average distance◮ Aggregated within-cluster similarity
(Kolmogorov etc.) to normal/uniform◮ Within-cluster (squared) distance to centroid◮ ρ(distance, cluster induced distance) (Hubert’s Γ)◮ Entropy of cluster sizes◮ Within-cluster nearest neighbour distances
coefficient of variation◮ Average largest within-cluster gap
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Principle of direct interpretationMeasuring between-cluster separationOther statistics
All need standardisation/transformation.
Most are dissimilarity-based,allow flexible use with non-Euclidean data,given meaningful distance measure.
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Principle of direct interpretationMeasuring between-cluster separationOther statistics
Data set submission to benchmarking repositoryrequires filling in questionnaire, e.g.
◮ Should clusters be similar or dissimilar in size?
◮ Are there requirements on what should be the unifying/common ground for elements to belong to the samecluster? Small within-cluster dissimilarities (and, if yes, in which respect)?
◮ Are there requirements on what should be the discriminating ground for elements to belong to differentclusters? Large between-cluster dissimilarities (and, if yes, in which respect)? Separation (and, if yes, ofwhich kind)? Other (and, if yes, what form do these requirements take)?
◮ Are there requirements on the between-cluster heterogeneity, that is, the structure of between-clusterdifferences (e.g., should lie in low-dimensional space, other)?
◮ (Some other; not all yet formalised by indexes)
◮ Please indicate the importance of those criteria selected by filling in a numerical weight.
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
4. Examples
−10 0 10 20 30
010
2030
40
xy[,1]
xy[,2
]
−10 0 10 20 30
010
2030
40
xy[,1]
xy[,2
]
3-means mclust-3ave within 0.811 0.643sep index 0.163 0.306
density index 0.460 0.876within gap 0.927 0.949
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
−2 −1 0 1 2
−2
−1
01
mclust
waiting
dura
tion
−2 −1 0 1 2
−2
−1
01
pam
waitingdu
ratio
n
−2 −1 0 1 2
−2
−1
01
spectral
waiting
dura
tion
−2 −1 0 1 2
−2
−1
01
ave.linkage
waiting
dura
tion
−2 −1 0 1 2
−2
−1
01
single linkage
waiting
dura
tion
−2 −1 0 1 2
−2
−1
01
pdfCluster (3)
waiting
dura
tion
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
mclust pam spect ave.l sing.l comp.l pdf3ave within 0.783 0.797 0.792 0.794 0.666 0.779 0.875sep index 0.127 0.045 0.127 0.096 0.175 0.103 0.065
density 0.910 0.733 0.864 0.903 0.969 0.874 0.719gap 0.888 0.891 0.891 0.891 0.929 0.891 0.906
coef var 0.541 0.567 0.554 0.564 0.573 0.545 0.554gamma 0.679 0.708 0.709 0.711 0.064 0.664 0.767
normality 0.880 0.838 0.854 0.841 0.786 0.882 0.856entropy 0.923 0.974 0.941 0.952 0.023 0.913 0.999
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Weighthed mean:full weight: ave within, sep index0.8 weight: entropyhalf weight: within nn cov, gap, min separation,density index, hubert gamma, normality, uniformity
pdf3 spect ave.l mclust kmeans comp.l pam sing.l0.624 0.622 0.622 0.619 0.618 0.610 0.601 0.460
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
◮ Problem with pam captured.
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
◮ Problem with pam captured.◮ Single linkage: useless (entropy 0.023; one-point cluster),
good values in some indexes (careful!), bad in others.
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
◮ Problem with pam captured.◮ Single linkage: useless (entropy 0.023; one-point cluster),
good values in some indexes (careful!), bad in others.◮ Comparison 2-cluster vs. 3-cluster (pdfCluster):
individual indexes unfair;ave within better, separation worse with larger k (etc.)Depends on proper weighting.Could add parsimony index.
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
◮ Problem with pam captured.◮ Single linkage: useless (entropy 0.023; one-point cluster),
good values in some indexes (careful!), bad in others.◮ Comparison 2-cluster vs. 3-cluster (pdfCluster):
individual indexes unfair;ave within better, separation worse with larger k (etc.)Depends on proper weighting.Could add parsimony index.
◮ mclust not best in normality,ave.l not best in ave within!Individual indexes may favour certain methods,but not as obvious as it seems.
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
European land snails data (Hausdorf, Hennig 2003,2006)Presence-absence (0-1) data for species in regions;“geographical Kulczynski dissimilarity”;clustering for “biotic elements” (natural history),originally clustered with mclust after MDS.Can compare with distance-based.
−0.4 −0.2 0.0 0.2 0.4 0.6
−0.
6−
0.4
−0.
20.
00.
20.
4
datanp[,1]
data
np[,2
]
−0.4 −0.2 0.0 0.2 0.4 0.6
−0.
6−
0.4
−0.
20.
00.
20.
4
datanp[,1]
data
np[,2
]
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
−0.4 −0.2 0.0 0.2 0.4 0.6
−0.
6−
0.4
−0.
20.
00.
20.
4
datanp[,1]
data
np[,2
]
−0.4 −0.2 0.0 0.2 0.4 0.6
−0.
6−
0.4
−0.
20.
00.
20.
4
datanp[,1]
data
np[,2
]
mclust ave.lave within 0.619 0.645
gap 0.766 0.771density index 0.503 0.852
sep index 0.055 0.126entropy 0.929 0.717
normality (MDS) 0.805 0.781uniformity (MDS) 0.393 0.302
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
−0.4 −0.2 0.0 0.2 0.4 0.6−
0.6
−0.
4−
0.2
0.0
0.2
0.4
datanp[,1]
data
np[,2
]
−0.4 −0.2 0.0 0.2 0.4 0.6
−0.
6−
0.4
−0.
20.
00.
20.
4
datanp[,1]
data
np[,2
]
mclust only better in entropyand MDS-distribution based indexes.
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
−0.4 −0.2 0.0 0.2 0.4 0.6−
0.6
−0.
4−
0.2
0.0
0.2
0.4
datanp[,1]
data
np[,2
]
−0.4 −0.2 0.0 0.2 0.4 0.6
−0.
6−
0.4
−0.
20.
00.
20.
4
datanp[,1]
data
np[,2
]
mclust only better in entropyand MDS-distribution based indexes.
Perturbed by “noise cluster”;better cluster with “noise component”.How to use indexes with unclustered data?Could just ignore them but that givesclustering with “noise” unfair advantage.
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
1
2
3
4
5
6 7
89
0
a
1
2
3
4
5
6 7
8
9
0
a 1
2
3
4
5
6 7
8
9
0
a
1
2
3
4
5
6
7
8
9
0
a
1
23
4
5
6
78
9
0
a
1
2
3 4
5
678
9
0
a
1
2
3
4
56
7
8
9
0
a
1
2
3
4
56
7
8
9
0
a
1
2
3
4
56
7
8
9
0
a
1
2
3
4
56
7
8
9
0
a
1
2
3
4
56
7
8
9
0
a
1
2
3
4
56
7
8
9
0
a
1
2
3
4
5
6
7
8 9
0
a
1
2
3
4
5
6
7
89
0
a
1
2
3
4
5
6
7
8
9
0
a
1
2
3
4
5
6
7
89
0
a
1
2
3
45 6
7
89
0
a
1
2
3
45
6
7
8
9
0
a 12
3
4
56
7
8
9
0
a
12
3
4
56
7
89
0
a
12
3
4
56
7
8
9
0
a
12
3
4
5
6
7
89
0
a
12
3
4
5
6
7
89
0
a
12
3
4
5
67
8
9
0
a
1
2
3
4
5
67
8
9
0
a
1
2
3
45
6
7
89
0
a
1
2
3
45
6
7
8
9
0
a
1
2
3
4
5
6
7
8
9
0
a
12
3
4
5
6
7
8
9
0
a1
2
3
4
5
67
8
9
0
a
1
2
34
5
678
90
a
1
2
345
67
8
9 0
a
1
2
34
5
67
8
90
a
1
2
34
5
67
8
9
0
a
1
2
34
5
6
78
90
a
1
2
3
45
678
90
a
12
3
4
5
6
7
8 9 0
a
1
2
3
4
5
6
7
89
0
a
1
2
3
4
5
6
7
89
0
a
1
2
3
4
56
7
89
0
a
1
2
3
4
5 6
7
89
0
a
1
2
3
4
5 6
7
89
0
a
12
3
456
7
8
9
0
a
1
2
3
4
56
7
8
9
0
a
1
2
3
4
56
7
8
9
0
a
1
2
3
4
5
6
7
8
90
a
1
2
3
4
5
6
7
8
9
0
a
1
2
3
4
5
6
7
8
9
0
a
1
23
4
56
7
8
9
0
a
1
2
3
4
56
7
8
9
0
a
1
2
3
4
5
6
7
8
9
0
a
1
2
3
4
5
6
7
8
9
0
a
1
2
3
4
56
7
8
9
0
a
1
2
3
4
56
7
8
9
0
a
12
3
4
56
7
8
9
0
a
1
2
3
4
5 67
89
0
a
1
23
4
56
78
9
0
a
1
23
4
5
6
78
9
0
a
12
3
4
5
6
7
8
90
a
1
2
3
4
5
6
7
8
90
a 1
2
3
4
56
7
89
0
a
1
2 3
4
56
7
89
0
a
1
2 3
4
5
6
7
89
0
a
1
2 3
4
5
6
7
89
0
a
1
23
4
5
6
7
8
9
0
a
1
23
45
67
8
9
0
a
1
2
3
4
56
78
9
0
a
1
2
3
4
5
6
78
9
0
a
1
2
3
4
5
6
7
8
9
0
a
1
2
3
4
56
7
8
9
0
a
1
2
3
4
5
67
8
9
0
a
1
2
3
4
5
6
7
8
9
0
a
1
2
3
4
5 6
7
8
9
0
a 1
2
3
4
5 6
7
8
9
0
a1
2
3
4
56
7
8
9
0
a
1
2
3
4
5
6
7
8
9
0
a
1
2
3
4
5
6
7
8
9
0
a
1
2
3
4
5
67
8
9
0
a
1
2
3
4
5 6
7
8
90
a
1
2
3
4
5 6
7
8
9
0
a
1
2
3
4
5 6
7
8
9
0
a
1
2
3
4
5 6
7
8
9
0
a
1
2
3
4
5
6
7
89
0
a
1
2
3
4
5
6
7
8
9
0
a
1
2
34
5
6
7
8 9
0 a
1
23
4
5
6
7
8
90
a
1
2 34
5
6
7
8
90
a
1
234
5
6
7
8
90 a
1
23
4
5
67
8
9
0
a
1
23
4
5
6
7
8
9
0
a
−8 −6 −4 −2 0 2
−12
−10
−8
−6
−4
−2
dc 1
dc 2
9
5
6
6
8
0
a
1
a
a
0
9
9
6
6
8 0
a
1
a
a
0
99
7
6
1
0
a1
a
a
0
5
9
5
6
1
0
a1
a
a
05
9
9
6
1
0
a1
1
a
05
99
6
0
0
a
1
1
a
6
5
5
96
888
1
1
1
3
5
5
96
8 8
8
1
2
1
3
5
5
96
8 8
8
12
1
3
5
5
9
6
8 88
12
1
3
5
5
96
888
1 2
1
3
5
5
96
8
3
8
1
2
1
3 5
56
68
88
1
1
1
6
5
56
68
8
8
1
2
1
45
56
6
88
8
1
2
1
4
5
56
68 8
81
1
1
45
5
6
688
11
11
6
5
5
6
68
8
1
1 1
1
6
5
5
6
6
6 6
8
2
2
5
6
5
5
6
6
66
82
25
6
5
5
6
66
68
2 2
5
6
5
5
6
60 68
2 2
5
6
5
5
6
606
8
22
5
6
9
5
6
60
6
8
21
5
6
77
6
6
88
8
8
24
3
77
a
6
88
8
8
2
43
77
a
68 8
88
1
43
77a
63 8
8
8
1
4 3
7
a
a
63
3
8
8
2
4 3 7
a
a
63
3
2
2
2
43
a
a
a
33
3
32
3
a
3
a
a
a
33
3
32
3
a
3
a
a
a
33
3
3
2
3
a
3
a
a
a
0
3
3
3
2
3
a
3
a
a
a
3
3
3
3
2
3
a
3
a
a
a
3
3
3
3
2
3
a
3
aa
a
3
33
82
4
43
a
7
a
3
3
3
82
4
43
7
7a
3
33
82
4
4
3
7
7a
3
33
82
4
4
3
7
7
a
3
33
82
4
4
3
7
7
a
3
3
3
82
4
4
3
55
9
0
00
1
1
0 5
0
5
5
9
00
0
1
1
0
5
0
5
5
9
0
0
0
1
1
0
5
0
5
5
9
0
0
8
1
1
0
5
0
5
5
9
0
0
80
1
0
5
0
5
5
9
0
080
1
0
5
0
7
9
9
6
86
8
1
11
6
7
9
9
6
8
61
1
1 1
6
7
9
9
6
8
61
1
11
6
7
9
9
6
8
6
1
1
1 1
6
7
99
6
8
6
1
1 1
1
6
7
9
9
6
8
6
1
1
1
1
6
9
9
9
0
0
0
0
1
01
0
99
9
0
0
0
0
1
01
0
99
9
0
0
0
01
0
1
0
99
9
0
0
0
01
0
1
0
9
9
9
0
0
0
0
1
0
1
0
9
9
9
0
0
0
0
1
0
1
0
9
9
6
68 6
0
a
0 0
0
9
9
6
6
06
a
a
0
0
0
9
9
6
6
0
0
a
a
0
0
0
99
9
6
0
0
a
a
00
0
99
9
6
0
0
a
a
00
0
99
9
6
0
0
a
a
0
0
0
7
7
7
68 6
11
4
4
6
7
7
7
686
1
1
4
4
6
777
68 6
11
4
4
6
77
a
686
11
4
4
6
77a
68
6
8
1
4
4
6
7
aa
68
68
1
4
4
6
7
7
76
8 68
1
8
2
0
7
7
76
8 68
1
0
2
0
757
6
868
1
0
2
0
7
5
7
6
868
1
0
2
0
7
5
7
6
8 08
1
0
2
0
7
5
76
80
8
1 5
2
0
7
7
7
6
8
802 4
4
6
7
7
7
6
8
802 4
4
6
7
7
7
6
8
812
4
4
3
7
77
6
8
3
82
4
4
3
7
77
6
8
3
82
4
4
3
7
776
8
3
8
2
4
4
3
a
a
a
3
8
3
22
2
23
a a
a
3
8
3
22
2
23
aa
a
3
8
3
22
2
23
a a
a
38
3
2
2
2
2
3
aa
a
33
3
22
2
2
3
aa
a
33
3
22
2
2
4
−10 −8 −6 −4 −2 0 2
−8
−6
−4
−2
02
dc 1dc
2
7
777a a 1
91
3
a
7
7
7
7a
a
1
91
3
a
7
7
7
7
a
a
1
91
3
a
77
7
7
a a
1
91 3
a
7
77
7
a a
1
9 13
7
7
77
7
a
a
1
9a
3
7
7
7
6
7
a
a
a
9
9
9
7
7
7
6
7
aa
a
9
9
9
7
7
7
6
7
a
a
a
9
9
9
7
7
7
6
7
aa
a
9
9
9
7
7
7
6
7
a
a
a
9 9
9
7
7
7
6
7
a a
a
9
9
9
77
7
7
7
a
a
a 9
99
77
7
7
7
aa
a
99
9
7
7
7
7
7
a a
a
9
99
7
77
7
7
aa
a
9
9
9
7
77
77
a
a
a
9
9
9
7
77
7
7
a
a
a
9
9
9
7
7
77
a
a
a
a9
9
7
7
7
77
aa
a
a9
9
7
7
7
77
a
aa
a9 9
77
7
77
a
a
a
a9 9
77
7
77
a
a
a
a
9 9
77
7
77
a
a
aa
9
9
7
7
344
a
a a
a
a
98
a
3
44
a
a
a
aa
9 8
a
3
44
a
aa
aa 9
8
a 3
44
aa
a
aa
9
8
a
3
44a
aa
aa
9 8
a
3
44a
a a
a
a
9
8
a
34
4a
a a
a
9
a4
a
34
4a
a aa
9
a
4
a
3
4
4a
a aa
9
a
4
a
34
4aa aa
9 a
4
a
34
4
aa
aa
9a
4
a34
4
aa
aa
9a
4
a34
4
a
aa
a
9 8
4a 34
4
a
a aa
9 8
4a3
4
4a
a
aa
9 8
4a
34
4a
aa
a
9
8
4a
3
4
4a
a aa
9
8
4
a
3
4
4
a
aaa
98
4
a
77
26
7
77
76
76 77
26
7
77
7
6
76
77
26
6
7
7
76
7
6
77
26
6
77
7
6
7
6
77
26
6
7
77
6
7
6
77
26
6
77
7
6
76
0
7
7
7
a
aa
9
9 9
a
0
7 77
a
a
a
99
9
a
0
7 77
a
a
a
999
a
0
7 77
a
aa
9
99
a
0
7 77
aa
a
99
9
a
0
777
a a
a
9
9
9
a
5
2
2
2
2
22
96
9
6
5
2
2
2
222
9
6
9
6
52
2
2
22 2
9
6
9
6
52
2
222 2
9
6
9
6
52
2
22
2
2
96
9
6
5
2
2
222
2
9
6
9
67
7
7
aa a 1
1
a a
a
7
7
7
aa a 1
1
a
a
a
7
77
aaa
1
1
a
a
a
7
7 7
a aa
1
1
a
a
a
7
77
aa
a1
1
a
a
a
7
77
a
a
a1
1
a
a
a
34
4
7
aa a
9
8
8
a
3
4
47
a
aa
98
8
a3
4
47
a
a
a
98 8
a34
4
7
a
a
a
9 8 8
a
34
4
7
a
a
9988
a
3
4
4
7
a
a
99 8
8
a
3
4
4
aa
a
a9
7
5
7
3
a
4
aa a
a 9
7
5
7
3
a
4
a
a
a
a9
7
5
7
3
a 4
aa
a
a9
7
5
7
3
a4
aa
7
a
9
7
5
7
3
a 4
aa
7
a
9
7
5
7
3
44
aa
a
a
9
a
8
a
3
4
4
aa
a
a
9
a
8
a
3
4
4
a
aa
a
9a
8
a
3
4
4
aa
a
a
9a
8
a
3
4
4
a
a
a
a
9
a
8
a
3
4
4a
a
aa
9
a
8
a
3
4
4a
a
a
a
9
9
5
a3
4
4a
a
a
a
99
5
a
34
4
a
a
a
a
99
5
a 3
44
a
a
a
a9
9
5
a3
44
a
a
a
a
9
95
a3 4
4
a
a
a
a
9
95
a
−6 −4 −2 0 2
−6
−4
−2
02
dc 1
dc 2
true 11-means spectralave within 0.691 0.734 0.692sep index 0.069 0.093 0.130
hubert gamma 0.224 0.411 0.400entropy 1.000 0.983 0.739
ARI 1.000 0.205 0.142
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
true 11-means spectralave within 0.691 0.734 0.692sep index 0.069 0.093 0.130
hubert gamma 0.224 0.411 0.400entropy 1.000 0.983 0.739
ARI 1.000 0.205 0.142
“true” no good clustering.Ave within, entropy are only indexespositively correlated with ARI!
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
true 11-means spectralave within 0.691 0.734 0.692sep index 0.069 0.093 0.130
hubert gamma 0.224 0.411 0.400entropy 1.000 0.983 0.739
ARI 1.000 0.205 0.142
“true” no good clustering.Ave within, entropy are only indexespositively correlated with ARI!
Good ARI needs good ave within and nothing else here.Use to explain results in data with known classes.
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Crabs data (2 species, m/f):
FL
6 8 12 16 20 20 30 40 50
1015
20
68
1216
20
RW
CL
1525
3545
2030
4050
CW
10 15 20 15 25 35 45 10 15 20
1015
20
BD
FL
6 8 12 16 20 20 30 40 50
1015
20
68
1216
20
RW
CL
1525
3545
2030
4050
CW
10 15 20 15 25 35 45 10 15 20
1015
20
BD
spectral
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Crabs data (2 species, m/f):
FL
6 8 12 16 20 20 30 40 50
1015
20
68
1216
20
RW
CL
1525
3545
2030
4050
CW
10 15 20 15 25 35 45 10 15 20
1015
20
BD
FL
6 8 12 16 20 20 30 40 50
1015
20
68
1216
20
RW
CL
1525
3545
2030
4050
CW
10 15 20 15 25 35 45 10 15 20
1015
20
BD
mclust
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
true mclust spectralave within 0.761 0.828 0.908
density 0.167 0.027 0.246hubert gamma 0.060 0.291 0.591
ARI 1.000 0.316 0.023
◮ “true” is worst according to most indexes.But there is a visible pattern!
◮ All indexes (except entropy) arenegatively correlated with ARI.
◮ mclust has best ARI out of 8 methodsbut quite bad index values.
Indexes fail to capture what goes on here:-(
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
5. Discussion◮ Clustering quality is multidimensional.
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
5. Discussion◮ Clustering quality is multidimensional.◮ Provide multidimensional evaluation,
characterising a method’s behaviour.
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
5. Discussion◮ Clustering quality is multidimensional.◮ Provide multidimensional evaluation,
characterising a method’s behaviour.◮ Can aggregate criteria by weighted mean
given well justified weights.
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
5. Discussion◮ Clustering quality is multidimensional.◮ Provide multidimensional evaluation,
characterising a method’s behaviour.◮ Can aggregate criteria by weighted mean
given well justified weights.◮ Can use to explain performance
in data with known truth
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
5. Discussion◮ Clustering quality is multidimensional.◮ Provide multidimensional evaluation,
characterising a method’s behaviour.◮ Can aggregate criteria by weighted mean
given well justified weights.◮ Can use to explain performance
in data with known truth◮ Designers of new methods should specify
what aspects of clustering they aim at,so that it can be tested.
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Open problems◮ Dependence on subjective tuning hurts
(weights, k-nn, percentage, standardisation).
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Open problems◮ Dependence on subjective tuning hurts
(weights, k-nn, percentage, standardisation).◮ But it’s honest; such decisions are needed
to define a good clustering in practice.
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Open problems◮ Dependence on subjective tuning hurts
(weights, k-nn, percentage, standardisation).◮ But it’s honest; such decisions are needed
to define a good clustering in practice.◮ Proper behaviour of criteria
(standardisation, transformation,different numbers of clusters)for fair aggregation and comparability?
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Open problems◮ Dependence on subjective tuning hurts
(weights, k-nn, percentage, standardisation).◮ But it’s honest; such decisions are needed
to define a good clustering in practice.◮ Proper behaviour of criteria
(standardisation, transformation,different numbers of clusters)for fair aggregation and comparability?
◮ Index choice vs. method definition(average linkage not always optimal for ave. distance)
Christian Hennig Measurement of quality in cluster analysis
IntroductionBasic thoughts
Cluster quality statisticsExamples
Discussion
Open problems◮ Dependence on subjective tuning hurts
(weights, k-nn, percentage, standardisation).◮ But it’s honest; such decisions are needed
to define a good clustering in practice.◮ Proper behaviour of criteria
(standardisation, transformation,different numbers of clusters)for fair aggregation and comparability?
◮ Index choice vs. method definition(average linkage not always optimal for ave. distance)
◮ With given weights, optimise quality?
Christian Hennig Measurement of quality in cluster analysis