
Clustering

Lecturer: Dr. Bo Yuan
E-mail: [email protected]

Overview

- Partitioning Methods: K-Means, Sequential Leader
- Model Based Methods
- Density Based Methods
- Hierarchical Methods

What is cluster analysis?

Finding groups of objects:
- Objects similar to each other are in the same group
- Objects are different from those in other groups

Unsupervised learning: no labels, data driven.

Clusters

[Figures: example clusters, illustrating intra-cluster similarity and inter-cluster separation]

Applications of Clustering

- Marketing: finding groups of customers with similar behaviours
- Biology: finding groups of animals or plants with similar features
- Bioinformatics: clustering of microarray data, genes and sequences
- Earthquake Studies: clustering observed earthquake epicenters to identify dangerous zones
- WWW: clustering weblog data to discover groups of similar access patterns
- Social Networks: discovering groups of individuals with close internal friendships

Earthquakes

[Figure: map of clustered earthquake epicenters]

Image Segmentation

[Figure: image segmentation example]

The Big Picture

[Figure]

Requirements

- Scalability
- Ability to deal with different types of attributes
- Ability to discover clusters with arbitrary shape
- Minimal requirements for domain knowledge
- Ability to deal with noise and outliers
- Insensitivity to the order of input records
- Incorporation of user-defined constraints
- Interpretability and usability

Practical Considerations

[Figure]

Normalization or Not

[Figure: the effect of normalizing attributes before clustering]

Evaluation

A common criterion is the within-cluster sum of squared errors, for cluster centres $m_1, \ldots, m_c$:

$$J(m_1,\ldots,m_c)=\sum_{i=1}^{c}\sum_{x\in D_i}\lVert x-m_i\rVert^{2}$$

[Figure: two candidate clusterings of the same data compared]

Evaluation

[Figure]

The Influence of Outliers

[Figure: with K = 2, a single outlier pulls a centroid away from the rest of its cluster]

K-Means

[Figures: three slides stepping through K-Means iterations on a 2D dataset]

K-Means

1. Determine the value of K.
2. Choose K cluster centres randomly.
3. Assign each data point to its closest centroid.
4. Use the mean of each cluster to update each centroid.
5. Repeat until no assignment changes.
6. Return the K centroids.

Reference: J. MacQueen (1967). "Some Methods for Classification and Analysis of Multivariate Observations." Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, Vol. 1, pp. 281-297.
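The steps above translate directly into a short NumPy sketch. This is a minimal illustration, not the lecture's reference code; the function name, the random seed, and the empty-cluster guard are our own choices, and Euclidean distance is assumed.

```python
import numpy as np

def kmeans(X, k, max_iter=100, seed=0):
    """Minimal K-Means: X is an (n, d) array, k the number of clusters."""
    rng = np.random.default_rng(seed)
    # Step 2: choose K initial centres randomly from the data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(max_iter):
        # Step 3: assign each data point to its closest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 4: use the mean of each cluster to update each centroid.
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:        # keep the old centre if a cluster goes empty
                new_centroids[j] = members.mean(axis=0)
        # Step 5: repeat until no centroid moves any more.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels            # Step 6: return the K centroids
```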

Comments on K-Means

Pros:
- Simple, and works well for regular disjoint clusters
- Converges relatively fast
- Relatively efficient and scalable: O(tkn), where t = number of iterations, k = number of centroids, n = number of data points

Cons:
- Need to specify the value of K in advance (difficult; domain knowledge may help)
- May converge to local optima (in practice, try different initial centroids)
- May be sensitive to noisy data and outliers (mean of data points ...)
- Not suitable for clusters of non-convex shapes

The Influence of Initial Centroids

[Figures: two slides showing different initialisations leading to different final clusterings]

The K-Medoids Method

The basic idea is to use real data points as centres.

1. Determine the value of K in advance.
2. Randomly select K points as medoids.
3. Assign each data point to the closest medoid.
4. Calculate the cost of the configuration, J.
5. For each medoid m and each non-medoid point o: swap m and o and calculate the new cost of the configuration, J'.
6. If the cost of the best new configuration J' is lower than J, make the corresponding swap and repeat the above steps; otherwise, terminate the procedure.

A code sketch follows the example below.

The K-Medoids Method

[Figure: a swap example comparing a configuration with cost 20 against one with cost 26]
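Below is a hedged PAM-style sketch of the swap procedure described above; the quadratic search over (medoid, non-medoid) pairs mirrors the loop on the slide. Function and variable names are illustrative, and Euclidean distance is assumed.

```python
import numpy as np

def k_medoids(X, k, max_iter=100, seed=0):
    """PAM-style K-Medoids sketch: centres are actual data points."""
    rng = np.random.default_rng(seed)
    n = len(X)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)  # pairwise distances
    medoids = list(rng.choice(n, size=k, replace=False))

    def cost(meds):
        # J: total distance of every point to its closest medoid.
        return dist[:, meds].min(axis=1).sum()

    best = cost(medoids)
    for _ in range(max_iter):
        improved = False
        # For each medoid slot i and each non-medoid o, try the swap; keep it if J' < J.
        for i in range(k):
            for o in range(n):
                if o in medoids:
                    continue
                trial = medoids.copy()
                trial[i] = o
                c = cost(trial)
                if c < best:
                    best, medoids, improved = c, trial, True
        if not improved:
            break
    labels = dist[:, medoids].argmin(axis=1)
    return medoids, labels
```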

Sequential Leader Clustering

A very efficient clustering algorithm: a single pass with no iteration, time complexity O(nk), and no need to specify K in advance.

1. Choose a cluster threshold value.
2. For every new data point, compute the distance between it and every cluster's centre.
3. If the distance is smaller than the chosen threshold, assign the new data point to the corresponding cluster and re-compute that cluster's centre.
4. Otherwise, create a new cluster with the new data point as its centre.

Clustering results may be influenced by the order in which the data points arrive.
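A minimal single-pass sketch of the procedure, assuming Euclidean distance; the function name and the mean-based centre update are our own choices.

```python
import numpy as np

def sequential_leader(X, threshold):
    """Single-pass leader clustering: one scan over the data, no iteration."""
    centres, members = [], []
    for x in X:
        if centres:
            d = np.linalg.norm(np.array(centres) - x, axis=1)
            j = d.argmin()
            if d[j] < threshold:
                # Close enough: join cluster j and re-compute its centre.
                members[j].append(x)
                centres[j] = np.mean(members[j], axis=0)
                continue
        # No existing centre is within the threshold: start a new cluster at x.
        centres.append(x.astype(float))
        members.append([x])
    return centres, members
```

Note that the threshold plays the role of K: a smaller threshold yields more clusters.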

Silhouette

A method for the interpretation and validation of clusters of data: a succinct graphical representation of how well each data point lies within its cluster, compared to other clusters.

- a(i): the average dissimilarity of i with all other points in the same cluster
- b(i): the lowest average dissimilarity of i to any other cluster

$$s(i)=\frac{b(i)-a(i)}{\max\{a(i),\,b(i)\}}$$
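A direct NumPy transcription of the definition above (illustrative only; it assumes Euclidean dissimilarity and at least two clusters):

```python
import numpy as np

def silhouette(X, labels):
    """s(i) for every point, computed straight from a(i) and b(i)."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    s = np.zeros(len(X))
    for i in range(len(X)):
        same = labels == labels[i]
        same[i] = False
        # a(i): average dissimilarity to the other members of i's own cluster.
        a = dist[i, same].mean() if same.any() else 0.0
        # b(i): lowest average dissimilarity to the members of any other cluster.
        b = min(dist[i, labels == c].mean() for c in set(labels) if c != labels[i])
        s[i] = (b - a) / max(a, b)
    return s
```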

Silhouette

[Figure: silhouette plot (x-axis: Silhouette Value, from about -0.2 to 1; y-axis: Cluster) alongside the clustered 2D data]

Gaussian Mixture

A single Gaussian component:

$$g(x;\mu,\sigma)=\frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}}$$

A mixture of n components, with non-negative weights that sum to one:

$$f(x)=\sum_{i=1}^{n}\alpha_{i}\,g_{i}(x;\mu_{i},\sigma_{i}),\qquad \alpha_{i}\ge 0,\quad \sum_{i=1}^{n}\alpha_{i}=1$$
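For concreteness, a few lines evaluating the mixture density f(x) above (names are illustrative; 1D components assumed):

```python
import numpy as np

def mixture_pdf(x, mu, sigma, alpha):
    """f(x) = sum_i alpha_i * g(x; mu_i, sigma_i) for 1D Gaussian components."""
    g = np.exp(-(x[:, None] - mu) ** 2 / (2 * sigma ** 2)) / (np.sqrt(2 * np.pi) * sigma)
    return (alpha * g).sum(axis=1)
```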

Clustering by Mixture Models

[Figure]

K-Means Revisited

$$\theta=\big((x_{1},y_{1}),\,(x_{2},y_{2})\big)\qquad\text{model parameters (the cluster centres)}$$

$$Z=(\text{Cluster 1},\,\text{Cluster 2})\qquad\text{latent parameters (the cluster assignments)}$$

Expectation Maximization

[Figure slides: the E-step / M-step iteration]

EM: Gaussian Mixture

Notation:
- m: the number of data points
- n: the number of mixture components
- z_ij: whether instance i is generated by the jth Gaussian

E-step, the expected membership of instance i in component j:

$$E[z_{ij}]=\frac{p(x=x_{i}\mid\mu=\mu_{j})}{\sum_{k=1}^{n}p(x=x_{i}\mid\mu=\mu_{k})}=\frac{e^{-(x_{i}-\mu_{j})^{2}/2\sigma^{2}}}{\sum_{k=1}^{n}e^{-(x_{i}-\mu_{k})^{2}/2\sigma^{2}}}$$

M-step, re-estimating the means and mixture weights:

$$\mu_{j}=\frac{\sum_{i=1}^{m}E[z_{ij}]\,x_{i}}{\sum_{i=1}^{m}E[z_{ij}]},\qquad \alpha_{j}=\frac{1}{m}\sum_{i=1}^{m}E[z_{ij}]$$
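Putting the two steps together, here is a compact sketch for a 1D mixture with a shared, fixed variance (a simplification matching the formulas above; all names are illustrative):

```python
import numpy as np

def em_gmm_1d(x, n_components, n_iter=50, seed=0):
    """EM for a 1D Gaussian mixture with a shared, fixed sigma^2."""
    rng = np.random.default_rng(seed)
    mu = rng.choice(x, size=n_components, replace=False).astype(float)
    sigma2 = x.var()                        # fixed variance, for simplicity
    alpha = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: E[z_ij] proportional to alpha_j * exp(-(x_i - mu_j)^2 / 2 sigma^2).
        lik = alpha * np.exp(-(x[:, None] - mu[None, :]) ** 2 / (2 * sigma2))
        resp = lik / lik.sum(axis=1, keepdims=True)
        # M-step: responsibility-weighted means and mixture weights.
        mu = (resp * x[:, None]).sum(axis=0) / resp.sum(axis=0)
        alpha = resp.mean(axis=0)
    return mu, alpha, resp
```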

Density Based Methods

- Generate clusters of arbitrary shapes
- Robust against noise
- No K value required in advance
- Somewhat similar to human vision

DBSCAN

Density-Based Spatial Clustering of Applications with Noise.

- Density: the number of points within a specified radius
- Core point: a point with high density
- Border point: a point with low density, but in the neighbourhood of a core point
- Noise point: neither a core point nor a border point

[Figure: example marking a core point, a border point, and a noise point]

DBSCAN

[Figure: p and q directly density reachable; p and q density reachable through a chain of core points; p and q density connected via a point o]

DBSCAN

A cluster is defined as a maximal set of density connected points:

1. Start from a randomly selected unseen point P.
2. If P is a core point, build a cluster by gradually adding all points that are density reachable to the current point set.
3. Noise points are discarded (left unlabelled).
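A minimal sketch of this expansion procedure. The parameter names eps and min_pts are the conventional ones but are not given on the slides; Euclidean distance is assumed.

```python
import numpy as np

def dbscan(X, eps, min_pts):
    """Minimal DBSCAN sketch; labels: -1 = noise, 0..k-1 = cluster ids."""
    n = len(X)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    neighbours = [np.flatnonzero(dist[i] <= eps) for i in range(n)]
    labels = np.full(n, -1)
    visited = np.zeros(n, dtype=bool)
    cluster = 0
    for p in range(n):
        if visited[p]:
            continue
        visited[p] = True
        if len(neighbours[p]) < min_pts:
            continue                          # not core; stays noise unless claimed later
        labels[p] = cluster                   # P is a core point: start a new cluster
        queue = list(neighbours[p])
        while queue:
            q = queue.pop()
            if labels[q] == -1:
                labels[q] = cluster           # density reachable: join the cluster
            if not visited[q]:
                visited[q] = True
                if len(neighbours[q]) >= min_pts:
                    queue.extend(neighbours[q])   # q is also core: expand through it
        cluster += 1
    return labels
```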

Hierarchical Clustering

Produces a set of nested, tree-like clusters:
- Can be visualized as a dendrogram
- A clustering is obtained by cutting the dendrogram at the desired level
- No need to specify K in advance
- May correspond to meaningful taxonomies

Dinosaur Family Tree

[Figure: a dinosaur family tree as an example of a meaningful hierarchy]

Agglomerative Methods

Bottom-up method:

1. Assign each data point to its own cluster.
2. Calculate the proximity matrix.
3. Merge the pair of closest clusters.
4. Repeat until only a single cluster remains.

How is the distance between clusters calculated?
- Single link: the minimum distance between points
- Complete link: the maximum distance between points

A code sketch follows below.
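A naive sketch of the loop above, parameterised by the linkage rule (illustrative only; real implementations update the proximity matrix incrementally rather than rescanning all pairs):

```python
import numpy as np

def agglomerative(X, linkage="single"):
    """Naive bottom-up clustering; returns the sequence of merges."""
    clusters = {i: [i] for i in range(len(X))}
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    merges = []
    while len(clusters) > 1:
        best, pair = np.inf, None
        ids = list(clusters)
        # Find the pair of closest clusters under the chosen linkage.
        for a in range(len(ids)):
            for b in range(a + 1, len(ids)):
                d = dist[np.ix_(clusters[ids[a]], clusters[ids[b]])]
                d = d.min() if linkage == "single" else d.max()   # single vs complete link
                if d < best:
                    best, pair = d, (ids[a], ids[b])
        i, j = pair
        merges.append((i, j, best))
        clusters[i] = clusters[i] + clusters[j]   # merge cluster j into cluster i
        del clusters[j]
    return merges
```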

Example

Single-link agglomerative clustering on a distance matrix of six Italian cities:

        BA   FI   MI   NA   RM   TO
BA       0  662  877  255  412  996
FI     662    0  295  468  268  400
MI     877  295    0  754  564  138
NA     255  468  754    0  219  869
RM     412  268  564  219    0  669
TO     996  400  138  869  669    0

Example

MI and TO are the closest pair (138), so they are merged into MI/TO:

        BA   FI  MI/TO   NA   RM
BA       0  662   877   255  412
FI     662    0   295   468  268
MI/TO  877  295     0   754  564
NA     255  468   754     0  219
RM     412  268   564   219    0

Next, NA and RM are merged (219):

        BA   FI  MI/TO  NA/RM
BA       0  662   877    255
FI     662    0   295    268
MI/TO  877  295     0    564
NA/RM  255  268   564      0

Example

BA then joins NA/RM (255):

           BA/NA/RM   FI  MI/TO
BA/NA/RM          0  268    564
FI              268    0    295
MI/TO           564  295      0

Finally FI joins BA/NA/RM (268), leaving two clusters at distance 295:

              BA/FI/NA/RM  MI/TO
BA/FI/NA/RM             0    295
MI/TO                 295      0
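The same merge sequence can be reproduced with SciPy's hierarchical clustering, shown here as a cross-check of the worked example (plotting the dendrogram additionally requires matplotlib):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram
from scipy.spatial.distance import squareform

cities = ["BA", "FI", "MI", "NA", "RM", "TO"]
D = np.array([
    [  0, 662, 877, 255, 412, 996],
    [662,   0, 295, 468, 268, 400],
    [877, 295,   0, 754, 564, 138],
    [255, 468, 754,   0, 219, 869],
    [412, 268, 564, 219,   0, 669],
    [996, 400, 138, 869, 669,   0],
])
# Single-link agglomerative clustering on the condensed distance matrix.
Z = linkage(squareform(D), method="single")
print(Z)                      # merge order: (MI, TO) at 138, (NA, RM) at 219, ...
dendrogram(Z, labels=cities)  # visualize the hierarchy
```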

Min vs Max

[Figure: single link (min) and complete link (max) producing different clusterings of the same data]

Reading Materials

Text books:
- Richard O. Duda et al., Pattern Classification, Chapter 10. John Wiley & Sons.
- J. Han and M. Kamber, Data Mining: Concepts and Techniques, Chapter 8. Morgan Kaufmann.

Survey papers:
- A. K. Jain, M. N. Murty and P. J. Flynn (1999). "Data Clustering: A Review." ACM Computing Surveys, Vol. 31(3), pp. 264-323.
- R. Xu and D. Wunsch (2005). "Survey of Clustering Algorithms." IEEE Transactions on Neural Networks, Vol. 16(3), pp. 645-678.
- A. K. Jain (2010). "Data Clustering: 50 Years Beyond K-Means." Pattern Recognition Letters, Vol. 31, pp. 651-666.

Online tutorials:
- http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/
- http://www.autonlab.org/tutorials/kmeans.html
- http://users.informatik.uni-halle.de/~hinneburg/ClusterTutorial/

Review

- What is clustering?
- What are the two categories of clustering methods?
- How does the K-Means algorithm work?
- What are the major issues of K-Means?
- How to control the number of clusters in Sequential Leader Clustering?
- How to use Gaussian mixture models for clustering?
- What are the main advantages of density based methods?
- What is the core idea of DBSCAN?
- What is the general procedure of hierarchical clustering?
- Which clustering methods do not require K as an input?

Next Week's Class Talk

Volunteers are required for next week's class talk.

Topic: Affinity Propagation
- Science 315, 972-976, 2007
- Clustering by passing messages between points
- http://www.psi.toronto.edu/index.php?q=affinity%20propagation

Topic: Clustering by Fast Search and Find of Density Peaks
- Science 344, 1492-1496, 2014
- Cluster centers have higher density than their neighbors
- Cluster centers are distant from other points with higher densities

Length: 20 minutes, plus question time.

Assignment

Topic: Clustering Techniques and Applications

Techniques:
- K-Means
- Another clustering method, for comparison

Task 1: 2D artificial datasets
- Demonstrate the influence of data patterns
- Demonstrate the influence of algorithm factors

Task 2: Image segmentation
- Gray vs. colour

Deliverables:
- Reports (experiment specification, algorithm parameters, in-depth analysis)
- Code (any programming language, with detailed comments)

Due: Sunday, 28 December

Credit: 15

  • Clustering
  • Overview
  • What is cluster analysis
  • Clusters
  • Clusters (2)
  • Applications of Clustering
  • Earthquakes
  • Image Segmentation
  • The Big Picture
  • Requirements
  • Practical Considerations
  • Normalization or Not
  • Evaluation
  • Evaluation (2)
  • The Influence of Outliers
  • K-Means
  • K-Means (2)
  • K-Means (3)
  • K-Means (4)
  • Comments on K-Means
  • The Influence of Initial Centroids
  • The Influence of Initial Centroids (2)
  • The K-Medoids Method
  • The K-Medoids Method (2)
  • Sequential Leader Clustering
  • Silhouette
  • Silhouette (2)
  • Gaussian Mixture
  • Clustering by Mixture Models
  • K-Means Revisited
  • Expectation Maximization
  • Slide 32
  • EM Gaussian Mixture
  • Density Based Methods
  • DBSCAN
  • DBSCAN (2)
  • DBSCAN (3)
  • Hierarchical Clustering
  • Dinosaur Family Tree
  • Agglomerative Methods
  • Example
  • Example (2)
  • Example (3)
  • Min vs Max
  • Reading Materials
  • Review
  • Next Weekrsquos Class Talk
  • Assignment
Page 2: Clustering

Overview

Partitioning Methods K-Means Sequential Leader Model Based Methods Density Based Methods

Hierarchical Methods

2

What is cluster analysis

Finding groups of objects Objects similar to each other are in the same group Objects are different from those in other groups

Unsupervised Learning No labels Data driven

3

Clusters

4

Inter-Cluster

Intra-Cluster

Clusters

5

Applications of Clustering

Marketing Finding groups of customers with similar behaviours

Biology Finding groups of animals or plants with similar features

Bioinformatics Clustering of microarray data genes and sequences

Earthquake Studies Clustering observed earthquake epicenters to identify dangerous zones

WWW Clustering weblog data to discover groups of similar access patterns

Social Networks Discovering groups of individuals with close friendships internally

6

Earthquakes

7

Image Segmentation

8

The Big Picture

9

Requirements

Scalability

Ability to deal with different types of attributes

Ability to discover clusters with arbitrary shape

Minimum requirements for domain knowledge

Ability to deal with noise and outliers

Insensitivity to order of input records

Incorporation of user-defined constraints

Interpretability and usability

10

Practical Considerations

11

Normalization or Not

12

Evaluation

13

ii Dxi

i

c

i Dxie x

nmmxJ 1

1

2

VS

Evaluation

14

The Influence of Outliers

15

outlier

K=2

K-Means

16

K-Means

17

K-Means

18

K-Means

Determine the value of K

Choose K cluster centres randomly

Each data point is assigned to its closest centroid

Use the mean of each cluster to update each centroid

Repeat until no more new assignment

Return the K centroids

Reference J MacQueen (1967) Some Methods for Classification and Analysis of

Multivariate Observations Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability vol1 pp 281-297

19

Comments on K-Means

Pros Simple and works well for regular disjoint clusters Converges relatively fast Relatively efficient and scalable O(tkn)

bull t iteration k number of centroids n number of data points

Cons Need to specify the value of K in advance

bull Difficult and domain knowledge may help May converge to local optima

bull In practice try different initial centroids May be sensitive to noisy data and outliers

bull Mean of data points hellip Not suitable for clusters of

bull Non-convex shapes

20

The Influence of Initial Centroids

21

The Influence of Initial Centroids

22

The K-Medoids Method

The basic idea is to use real data points as centres

Determine the value of K in advance

Randomly select K points as medoids

Assign each data point to the closest medoid

Calculate the cost of the configuration J

For each medoid m For each non-medoid point o Swap m and o and calculate the new cost of configuration Jprime

If the cost of the best new configuration J is lower than J make the corresponding swap and repeat the above steps

Otherwise terminate the procedure23

The K-Medoids Method

24

Cost =20 Cost =26

Sequential Leader Clustering

A very efficient clustering algorithm No iteration Time complexity O(nk)

No need to specify K in advance

Choose a cluster threshold value

For every new data point Compute the distance between the new data point and every clusters centre

If the distance is smaller than the chosen threshold assign the new data point to the corresponding cluster and re-compute cluster centre

Otherwise create a new cluster with the new data point as its centre

Clustering results may be influenced by the sequence of data points

25

Silhouette

A method of interpretation and validation of clusters of data

A succinct graphical representation of how well each data point lies within its cluster compared to other clusters

a(i) average dissimilarity of i with all other points in the same cluster

b(i) the lowest average dissimilarity of i to other clusters

26

)()(max)()()(iaibiaibis

Silhouette

27

-02 0 02 04 06 08 1

1

2

Silhouette Value

Clu

ster

-3 -2 -1 0 1 2 3 4-3

-2

-1

0

1

2

3

4

Gaussian Mixture

28

)2()(

2

22

21)(

xexg

1amp0)()(1

i

ii

n

iiii xgxf

Clustering by Mixture Models

29

K-Means Revisited

30

120579=(1199091 1199101 ) (1199092 119910 2)

119885=119862119897119906119904119905119890119903 1 119862119897119906119904119905119890119903 2

model parameters

latent parameters

Expectation Maximization

31

32

EM Gaussian Mixture

33

Gaussianjth by the generated is i instancer whethecomponents mixture ofnumber the

points data ofnumber the

ijznm

n

kk

x

j

x

n

kkki

jjiij

ki

ji

e

e

xxp

xxpzE

1

)(2

1

)(2

1

1

22

22

)|(

)|(][

m

iij

m

iiij

j

zE

xzE

1

1

][

][

m

iijj zE

m 1

][1

Density Based Methods

Generate clusters of arbitrary shapes

Robust against noise

No K value required in advance

Somewhat similar to human vision

34

DBSCAN

Density-Based Spatial Clustering of Applications with Noise

Density number of points within a specified radius

Core Point points with high density

Border Point points with low density but in the neighbourhood of a core point

Noise Point neither a core point nor a border point

35

Core Point

Noise Point

Border Point

DBSCAN

36

p

q

directly density reachable

p

q

density reachable

o

qp

density connected

DBSCAN

A cluster is defined as the maximal set of density connected points Start from a randomly selected unseen point P If P is a core point build a cluster by gradually adding all points that are

density reachable to the current point set Noise points are discarded (unlabelled)

37

Hierarchical Clustering

Produce a set of nested tree-like clusters

Can be visualized as a dendrogram Clustering is obtained by cutting at desired level No need to specify K in advance May correspond to meaningful taxonomies

38

Dinosaur Family Tree

39

Agglomerative Methods

Bottom-up Method

Assign each data point to a cluster

Calculate the proximity matrix

Merge the pair of closest clusters

Repeat until only a single cluster remains

How to calculate the distance between clusters

Single Link Minimum distance between points

Complete Link Maximum distance between points

40

Example

41

BA FI MI NA RM TO

BA 0 662 877 255 412 996

FI 662 0 295 468 268 400

MI 877 295 0 754 564 138

NA 255 468 754 0 219 869

RM 412 268 564 219 0 669

TO 996 400 138 869 669 0Single Link

Example

42

BA FI MITO NA RM

BA 0 662 877 255 412

FI 662 0 295 468 268

MITO 877 295 0 754 564

NA 255 468 754 0 219

RM 412 268 564 219 0

BA FI MITO NARM

BA 0 662 877 255FI 662 0 295 268

MITO 877 295 0 564

NARM 255 268 564 0

Example

43

BANARM FI MITO

BANARM 0 268 564FI 268 0 295

MITO 564 295 0

BAFINARM MITO

BAFINARM 0 295

MITO 295 0

Min vs Max

44

Reading Materials

Text Books Richard O Duda et al Pattern Classification Chapter 10 John Wiley amp Sons J Han and M Kamber Data Mining Concepts and Techniques Chapter 8

Morgan Kaufmann

Survey Papers A K Jain M N Murty and P J Flynn (1999) ldquoData Clustering A Reviewrdquo ACM

Computing Surveys Vol 31(3) pp 264-323 R Xu and D Wunsch (2005) ldquoSurvey of Clustering Algorithmsrdquo IEEE Transactions

on Neural Networks Vol 16(3) pp 645-678 A K Jain (2010) ldquoData Clustering 50 Years Beyond K-Meansrdquo Pattern

Recognition Letters Vol 31 pp 651-666

Online Tutorials httphomedeipolimiitmatteuccClusteringtutorial_html httpwwwautonlaborgtutorialskmeanshtml httpusersinformatikuni-hallede~hinneburClusterTutorial

45

Review

What is clustering

What are the two categories of clustering methods

How does the K-Means algorithm work

What are the major issues of K-Means

How to control the number of clusters in Sequential Leader Clustering

How to use Gaussian mixture models for clustering

What are the main advantages of density methods

What is the core idea of DBSCAN

What is the general procedure of hierarchical clustering

Which clustering methods do not require K as the input

46

Next Weekrsquos Class Talk

Volunteers are required for next weekrsquos class talk

Topic Affinity Propagation Science 315 972ndash976 2007 Clustering by passing messages between points httpwwwpsitorontoeduindexphpq=affinity20propagation

Topic Clustering by Fast Search and Find of Density Peaks Science 344 1492ndash1496 2014 Cluster centers higher density than neighbors Cluster centers distant from others points with higher densities

Length 20 minutes plus question time

47

Assignment

Topic Clustering Techniques and Applications

Techniques K-Means Another clustering method for comparison

Task 1 2D Artificial Datasets To demonstrate the influence of data patterns To demonstrate the influence of algorithm factors

Task 2 Image Segmentation Gray vs Colour

Deliverables Reports (experiment specification algorithm parameters in-depth analysis) Code (any programming language with detailed comments)

Due Sunday 28 December

Credit 1548

  • Clustering
  • Overview
  • What is cluster analysis
  • Clusters
  • Clusters (2)
  • Applications of Clustering
  • Earthquakes
  • Image Segmentation
  • The Big Picture
  • Requirements
  • Practical Considerations
  • Normalization or Not
  • Evaluation
  • Evaluation (2)
  • The Influence of Outliers
  • K-Means
  • K-Means (2)
  • K-Means (3)
  • K-Means (4)
  • Comments on K-Means
  • The Influence of Initial Centroids
  • The Influence of Initial Centroids (2)
  • The K-Medoids Method
  • The K-Medoids Method (2)
  • Sequential Leader Clustering
  • Silhouette
  • Silhouette (2)
  • Gaussian Mixture
  • Clustering by Mixture Models
  • K-Means Revisited
  • Expectation Maximization
  • Slide 32
  • EM Gaussian Mixture
  • Density Based Methods
  • DBSCAN
  • DBSCAN (2)
  • DBSCAN (3)
  • Hierarchical Clustering
  • Dinosaur Family Tree
  • Agglomerative Methods
  • Example
  • Example (2)
  • Example (3)
  • Min vs Max
  • Reading Materials
  • Review
  • Next Weekrsquos Class Talk
  • Assignment
Page 3: Clustering

What is cluster analysis

Finding groups of objects Objects similar to each other are in the same group Objects are different from those in other groups

Unsupervised Learning No labels Data driven

3

Clusters

4

Inter-Cluster

Intra-Cluster

Clusters

5

Applications of Clustering

Marketing Finding groups of customers with similar behaviours

Biology Finding groups of animals or plants with similar features

Bioinformatics Clustering of microarray data genes and sequences

Earthquake Studies Clustering observed earthquake epicenters to identify dangerous zones

WWW Clustering weblog data to discover groups of similar access patterns

Social Networks Discovering groups of individuals with close friendships internally

6

Earthquakes

7

Image Segmentation

8

The Big Picture

9

Requirements

Scalability

Ability to deal with different types of attributes

Ability to discover clusters with arbitrary shape

Minimum requirements for domain knowledge

Ability to deal with noise and outliers

Insensitivity to order of input records

Incorporation of user-defined constraints

Interpretability and usability

10

Practical Considerations

11

Normalization or Not

12

Evaluation

13

ii Dxi

i

c

i Dxie x

nmmxJ 1

1

2

VS

Evaluation

14

The Influence of Outliers

15

outlier

K=2

K-Means

16

K-Means

17

K-Means

18

K-Means

Determine the value of K

Choose K cluster centres randomly

Each data point is assigned to its closest centroid

Use the mean of each cluster to update each centroid

Repeat until no more new assignment

Return the K centroids

Reference J MacQueen (1967) Some Methods for Classification and Analysis of

Multivariate Observations Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability vol1 pp 281-297

19

Comments on K-Means

Pros Simple and works well for regular disjoint clusters Converges relatively fast Relatively efficient and scalable O(tkn)

bull t iteration k number of centroids n number of data points

Cons Need to specify the value of K in advance

bull Difficult and domain knowledge may help May converge to local optima

bull In practice try different initial centroids May be sensitive to noisy data and outliers

bull Mean of data points hellip Not suitable for clusters of

bull Non-convex shapes

20

The Influence of Initial Centroids

21

The Influence of Initial Centroids

22

The K-Medoids Method

The basic idea is to use real data points as centres

Determine the value of K in advance

Randomly select K points as medoids

Assign each data point to the closest medoid

Calculate the cost of the configuration J

For each medoid m For each non-medoid point o Swap m and o and calculate the new cost of configuration Jprime

If the cost of the best new configuration J is lower than J make the corresponding swap and repeat the above steps

Otherwise terminate the procedure23

The K-Medoids Method

24

Cost =20 Cost =26

Sequential Leader Clustering

A very efficient clustering algorithm No iteration Time complexity O(nk)

No need to specify K in advance

Choose a cluster threshold value

For every new data point Compute the distance between the new data point and every clusters centre

If the distance is smaller than the chosen threshold assign the new data point to the corresponding cluster and re-compute cluster centre

Otherwise create a new cluster with the new data point as its centre

Clustering results may be influenced by the sequence of data points

25

Silhouette

A method of interpretation and validation of clusters of data

A succinct graphical representation of how well each data point lies within its cluster compared to other clusters

a(i) average dissimilarity of i with all other points in the same cluster

b(i) the lowest average dissimilarity of i to other clusters

26

)()(max)()()(iaibiaibis

Silhouette

27

-02 0 02 04 06 08 1

1

2

Silhouette Value

Clu

ster

-3 -2 -1 0 1 2 3 4-3

-2

-1

0

1

2

3

4

Gaussian Mixture

28

)2()(

2

22

21)(

xexg

1amp0)()(1

i

ii

n

iiii xgxf

Clustering by Mixture Models

29

K-Means Revisited

30

120579=(1199091 1199101 ) (1199092 119910 2)

119885=119862119897119906119904119905119890119903 1 119862119897119906119904119905119890119903 2

model parameters

latent parameters

Expectation Maximization

31

32

EM Gaussian Mixture

33

Gaussianjth by the generated is i instancer whethecomponents mixture ofnumber the

points data ofnumber the

ijznm

n

kk

x

j

x

n

kkki

jjiij

ki

ji

e

e

xxp

xxpzE

1

)(2

1

)(2

1

1

22

22

)|(

)|(][

m

iij

m

iiij

j

zE

xzE

1

1

][

][

m

iijj zE

m 1

][1

Density Based Methods

Generate clusters of arbitrary shapes

Robust against noise

No K value required in advance

Somewhat similar to human vision

34

DBSCAN

Density-Based Spatial Clustering of Applications with Noise

Density number of points within a specified radius

Core Point points with high density

Border Point points with low density but in the neighbourhood of a core point

Noise Point neither a core point nor a border point

35

Core Point

Noise Point

Border Point

DBSCAN

36

p

q

directly density reachable

p

q

density reachable

o

qp

density connected

DBSCAN

A cluster is defined as the maximal set of density connected points Start from a randomly selected unseen point P If P is a core point build a cluster by gradually adding all points that are

density reachable to the current point set Noise points are discarded (unlabelled)

37

Hierarchical Clustering

Produce a set of nested tree-like clusters

Can be visualized as a dendrogram Clustering is obtained by cutting at desired level No need to specify K in advance May correspond to meaningful taxonomies

38

Dinosaur Family Tree

39

Agglomerative Methods

Bottom-up Method

Assign each data point to a cluster

Calculate the proximity matrix

Merge the pair of closest clusters

Repeat until only a single cluster remains

How to calculate the distance between clusters

Single Link Minimum distance between points

Complete Link Maximum distance between points

40

Example

41

BA FI MI NA RM TO

BA 0 662 877 255 412 996

FI 662 0 295 468 268 400

MI 877 295 0 754 564 138

NA 255 468 754 0 219 869

RM 412 268 564 219 0 669

TO 996 400 138 869 669 0Single Link

Example

42

BA FI MITO NA RM

BA 0 662 877 255 412

FI 662 0 295 468 268

MITO 877 295 0 754 564

NA 255 468 754 0 219

RM 412 268 564 219 0

BA FI MITO NARM

BA 0 662 877 255FI 662 0 295 268

MITO 877 295 0 564

NARM 255 268 564 0

Example

43

BANARM FI MITO

BANARM 0 268 564FI 268 0 295

MITO 564 295 0

BAFINARM MITO

BAFINARM 0 295

MITO 295 0

Min vs Max

44

Reading Materials

Text Books Richard O Duda et al Pattern Classification Chapter 10 John Wiley amp Sons J Han and M Kamber Data Mining Concepts and Techniques Chapter 8

Morgan Kaufmann

Survey Papers A K Jain M N Murty and P J Flynn (1999) ldquoData Clustering A Reviewrdquo ACM

Computing Surveys Vol 31(3) pp 264-323 R Xu and D Wunsch (2005) ldquoSurvey of Clustering Algorithmsrdquo IEEE Transactions

on Neural Networks Vol 16(3) pp 645-678 A K Jain (2010) ldquoData Clustering 50 Years Beyond K-Meansrdquo Pattern

Recognition Letters Vol 31 pp 651-666

Online Tutorials httphomedeipolimiitmatteuccClusteringtutorial_html httpwwwautonlaborgtutorialskmeanshtml httpusersinformatikuni-hallede~hinneburClusterTutorial

45

Review

What is clustering

What are the two categories of clustering methods

How does the K-Means algorithm work

What are the major issues of K-Means

How to control the number of clusters in Sequential Leader Clustering

How to use Gaussian mixture models for clustering

What are the main advantages of density methods

What is the core idea of DBSCAN

What is the general procedure of hierarchical clustering

Which clustering methods do not require K as the input

46

Next Weekrsquos Class Talk

Volunteers are required for next weekrsquos class talk

Topic Affinity Propagation Science 315 972ndash976 2007 Clustering by passing messages between points httpwwwpsitorontoeduindexphpq=affinity20propagation

Topic Clustering by Fast Search and Find of Density Peaks Science 344 1492ndash1496 2014 Cluster centers higher density than neighbors Cluster centers distant from others points with higher densities

Length 20 minutes plus question time

47

Assignment

Topic Clustering Techniques and Applications

Techniques K-Means Another clustering method for comparison

Task 1 2D Artificial Datasets To demonstrate the influence of data patterns To demonstrate the influence of algorithm factors

Task 2 Image Segmentation Gray vs Colour

Deliverables Reports (experiment specification algorithm parameters in-depth analysis) Code (any programming language with detailed comments)

Due Sunday 28 December

Credit 1548

  • Clustering
  • Overview
  • What is cluster analysis
  • Clusters
  • Clusters (2)
  • Applications of Clustering
  • Earthquakes
  • Image Segmentation
  • The Big Picture
  • Requirements
  • Practical Considerations
  • Normalization or Not
  • Evaluation
  • Evaluation (2)
  • The Influence of Outliers
  • K-Means
  • K-Means (2)
  • K-Means (3)
  • K-Means (4)
  • Comments on K-Means
  • The Influence of Initial Centroids
  • The Influence of Initial Centroids (2)
  • The K-Medoids Method
  • The K-Medoids Method (2)
  • Sequential Leader Clustering
  • Silhouette
  • Silhouette (2)
  • Gaussian Mixture
  • Clustering by Mixture Models
  • K-Means Revisited
  • Expectation Maximization
  • Slide 32
  • EM Gaussian Mixture
  • Density Based Methods
  • DBSCAN
  • DBSCAN (2)
  • DBSCAN (3)
  • Hierarchical Clustering
  • Dinosaur Family Tree
  • Agglomerative Methods
  • Example
  • Example (2)
  • Example (3)
  • Min vs Max
  • Reading Materials
  • Review
  • Next Weekrsquos Class Talk
  • Assignment
Page 4: Clustering

Clusters

4

Inter-Cluster

Intra-Cluster

Clusters

5

Applications of Clustering

Marketing Finding groups of customers with similar behaviours

Biology Finding groups of animals or plants with similar features

Bioinformatics Clustering of microarray data genes and sequences

Earthquake Studies Clustering observed earthquake epicenters to identify dangerous zones

WWW Clustering weblog data to discover groups of similar access patterns

Social Networks Discovering groups of individuals with close friendships internally

6

Earthquakes

7

Image Segmentation

8

The Big Picture

9

Requirements

Scalability

Ability to deal with different types of attributes

Ability to discover clusters with arbitrary shape

Minimum requirements for domain knowledge

Ability to deal with noise and outliers

Insensitivity to order of input records

Incorporation of user-defined constraints

Interpretability and usability

10

Practical Considerations

11

Normalization or Not

12

Evaluation

13

ii Dxi

i

c

i Dxie x

nmmxJ 1

1

2

VS

Evaluation

14

The Influence of Outliers

15

outlier

K=2

K-Means

16

K-Means

17

K-Means

18

K-Means

Determine the value of K

Choose K cluster centres randomly

Each data point is assigned to its closest centroid

Use the mean of each cluster to update each centroid

Repeat until no more new assignment

Return the K centroids

Reference J MacQueen (1967) Some Methods for Classification and Analysis of

Multivariate Observations Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability vol1 pp 281-297

19

Comments on K-Means

Pros Simple and works well for regular disjoint clusters Converges relatively fast Relatively efficient and scalable O(tkn)

bull t iteration k number of centroids n number of data points

Cons Need to specify the value of K in advance

bull Difficult and domain knowledge may help May converge to local optima

bull In practice try different initial centroids May be sensitive to noisy data and outliers

bull Mean of data points hellip Not suitable for clusters of

bull Non-convex shapes

20

The Influence of Initial Centroids

21

The Influence of Initial Centroids

22

The K-Medoids Method

The basic idea is to use real data points as centres

Determine the value of K in advance

Randomly select K points as medoids

Assign each data point to the closest medoid

Calculate the cost of the configuration J

For each medoid m For each non-medoid point o Swap m and o and calculate the new cost of configuration Jprime

If the cost of the best new configuration J is lower than J make the corresponding swap and repeat the above steps

Otherwise terminate the procedure23

The K-Medoids Method

24

Cost =20 Cost =26

Sequential Leader Clustering

A very efficient clustering algorithm No iteration Time complexity O(nk)

No need to specify K in advance

Choose a cluster threshold value

For every new data point Compute the distance between the new data point and every clusters centre

If the distance is smaller than the chosen threshold assign the new data point to the corresponding cluster and re-compute cluster centre

Otherwise create a new cluster with the new data point as its centre

Clustering results may be influenced by the sequence of data points

25

Silhouette

A method of interpretation and validation of clusters of data

A succinct graphical representation of how well each data point lies within its cluster compared to other clusters

a(i) average dissimilarity of i with all other points in the same cluster

b(i) the lowest average dissimilarity of i to other clusters

26

)()(max)()()(iaibiaibis

Silhouette

27

-02 0 02 04 06 08 1

1

2

Silhouette Value

Clu

ster

-3 -2 -1 0 1 2 3 4-3

-2

-1

0

1

2

3

4

Gaussian Mixture

28

)2()(

2

22

21)(

xexg

1amp0)()(1

i

ii

n

iiii xgxf

Clustering by Mixture Models

29

K-Means Revisited

30

120579=(1199091 1199101 ) (1199092 119910 2)

119885=119862119897119906119904119905119890119903 1 119862119897119906119904119905119890119903 2

model parameters

latent parameters

Expectation Maximization

31

32

EM Gaussian Mixture

33

Gaussianjth by the generated is i instancer whethecomponents mixture ofnumber the

points data ofnumber the

ijznm

n

kk

x

j

x

n

kkki

jjiij

ki

ji

e

e

xxp

xxpzE

1

)(2

1

)(2

1

1

22

22

)|(

)|(][

m

iij

m

iiij

j

zE

xzE

1

1

][

][

m

iijj zE

m 1

][1

Density Based Methods

Generate clusters of arbitrary shapes

Robust against noise

No K value required in advance

Somewhat similar to human vision

34

DBSCAN

Density-Based Spatial Clustering of Applications with Noise

Density number of points within a specified radius

Core Point points with high density

Border Point points with low density but in the neighbourhood of a core point

Noise Point neither a core point nor a border point

35

Core Point

Noise Point

Border Point

DBSCAN

36

p

q

directly density reachable

p

q

density reachable

o

qp

density connected

DBSCAN

A cluster is defined as the maximal set of density connected points Start from a randomly selected unseen point P If P is a core point build a cluster by gradually adding all points that are

density reachable to the current point set Noise points are discarded (unlabelled)

37

Hierarchical Clustering

Produce a set of nested tree-like clusters

Can be visualized as a dendrogram Clustering is obtained by cutting at desired level No need to specify K in advance May correspond to meaningful taxonomies

38

Dinosaur Family Tree

39

Agglomerative Methods

Bottom-up Method

Assign each data point to a cluster

Calculate the proximity matrix

Merge the pair of closest clusters

Repeat until only a single cluster remains

How to calculate the distance between clusters

Single Link Minimum distance between points

Complete Link Maximum distance between points

40

Example

41

BA FI MI NA RM TO

BA 0 662 877 255 412 996

FI 662 0 295 468 268 400

MI 877 295 0 754 564 138

NA 255 468 754 0 219 869

RM 412 268 564 219 0 669

TO 996 400 138 869 669 0Single Link

Example

42

BA FI MITO NA RM

BA 0 662 877 255 412

FI 662 0 295 468 268

MITO 877 295 0 754 564

NA 255 468 754 0 219

RM 412 268 564 219 0

BA FI MITO NARM

BA 0 662 877 255FI 662 0 295 268

MITO 877 295 0 564

NARM 255 268 564 0

Example

43

BANARM FI MITO

BANARM 0 268 564FI 268 0 295

MITO 564 295 0

BAFINARM MITO

BAFINARM 0 295

MITO 295 0

Min vs Max

44

Reading Materials

Text Books Richard O Duda et al Pattern Classification Chapter 10 John Wiley amp Sons J Han and M Kamber Data Mining Concepts and Techniques Chapter 8

Morgan Kaufmann

Survey Papers A K Jain M N Murty and P J Flynn (1999) ldquoData Clustering A Reviewrdquo ACM

Computing Surveys Vol 31(3) pp 264-323 R Xu and D Wunsch (2005) ldquoSurvey of Clustering Algorithmsrdquo IEEE Transactions

on Neural Networks Vol 16(3) pp 645-678 A K Jain (2010) ldquoData Clustering 50 Years Beyond K-Meansrdquo Pattern

Recognition Letters Vol 31 pp 651-666

Online Tutorials httphomedeipolimiitmatteuccClusteringtutorial_html httpwwwautonlaborgtutorialskmeanshtml httpusersinformatikuni-hallede~hinneburClusterTutorial

45

Review

What is clustering

What are the two categories of clustering methods

How does the K-Means algorithm work

What are the major issues of K-Means

How to control the number of clusters in Sequential Leader Clustering

How to use Gaussian mixture models for clustering

What are the main advantages of density methods

What is the core idea of DBSCAN

What is the general procedure of hierarchical clustering

Which clustering methods do not require K as the input

46

Next Weekrsquos Class Talk

Volunteers are required for next weekrsquos class talk

Topic Affinity Propagation Science 315 972ndash976 2007 Clustering by passing messages between points httpwwwpsitorontoeduindexphpq=affinity20propagation

Topic Clustering by Fast Search and Find of Density Peaks Science 344 1492ndash1496 2014 Cluster centers higher density than neighbors Cluster centers distant from others points with higher densities

Length 20 minutes plus question time

47

Assignment

Topic Clustering Techniques and Applications

Techniques K-Means Another clustering method for comparison

Task 1 2D Artificial Datasets To demonstrate the influence of data patterns To demonstrate the influence of algorithm factors

Task 2 Image Segmentation Gray vs Colour

Deliverables Reports (experiment specification algorithm parameters in-depth analysis) Code (any programming language with detailed comments)

Due Sunday 28 December

Credit 1548

  • Clustering
  • Overview
  • What is cluster analysis
  • Clusters
  • Clusters (2)
  • Applications of Clustering
  • Earthquakes
  • Image Segmentation
  • The Big Picture
  • Requirements
  • Practical Considerations
  • Normalization or Not
  • Evaluation
  • Evaluation (2)
  • The Influence of Outliers
  • K-Means
  • K-Means (2)
  • K-Means (3)
  • K-Means (4)
  • Comments on K-Means
  • The Influence of Initial Centroids
  • The Influence of Initial Centroids (2)
  • The K-Medoids Method
  • The K-Medoids Method (2)
  • Sequential Leader Clustering
  • Silhouette
  • Silhouette (2)
  • Gaussian Mixture
  • Clustering by Mixture Models
  • K-Means Revisited
  • Expectation Maximization
  • Slide 32
  • EM Gaussian Mixture
  • Density Based Methods
  • DBSCAN
  • DBSCAN (2)
  • DBSCAN (3)
  • Hierarchical Clustering
  • Dinosaur Family Tree
  • Agglomerative Methods
  • Example
  • Example (2)
  • Example (3)
  • Min vs Max
  • Reading Materials
  • Review
  • Next Weekrsquos Class Talk
  • Assignment
Page 5: Clustering

Clusters

5

Applications of Clustering

Marketing Finding groups of customers with similar behaviours

Biology Finding groups of animals or plants with similar features

Bioinformatics Clustering of microarray data genes and sequences

Earthquake Studies Clustering observed earthquake epicenters to identify dangerous zones

WWW Clustering weblog data to discover groups of similar access patterns

Social Networks Discovering groups of individuals with close friendships internally

6

Earthquakes

7

Image Segmentation

8

The Big Picture

9

Requirements

Scalability

Ability to deal with different types of attributes

Ability to discover clusters with arbitrary shape

Minimum requirements for domain knowledge

Ability to deal with noise and outliers

Insensitivity to order of input records

Incorporation of user-defined constraints

Interpretability and usability

10

Practical Considerations

11

Normalization or Not

12

Evaluation

13

ii Dxi

i

c

i Dxie x

nmmxJ 1

1

2

VS

Evaluation

14

The Influence of Outliers

15

outlier

K=2

K-Means

16

K-Means

17

K-Means

18

K-Means

Determine the value of K

Choose K cluster centres randomly

Each data point is assigned to its closest centroid

Use the mean of each cluster to update each centroid

Repeat until no more new assignment

Return the K centroids

Reference J MacQueen (1967) Some Methods for Classification and Analysis of

Multivariate Observations Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability vol1 pp 281-297

19

Comments on K-Means

Pros Simple and works well for regular disjoint clusters Converges relatively fast Relatively efficient and scalable O(tkn)

bull t iteration k number of centroids n number of data points

Cons Need to specify the value of K in advance

bull Difficult and domain knowledge may help May converge to local optima

bull In practice try different initial centroids May be sensitive to noisy data and outliers

bull Mean of data points hellip Not suitable for clusters of

bull Non-convex shapes

20

The Influence of Initial Centroids

21

The Influence of Initial Centroids

22

The K-Medoids Method

The basic idea is to use real data points as centres

Determine the value of K in advance

Randomly select K points as medoids

Assign each data point to the closest medoid

Calculate the cost of the configuration J

For each medoid m For each non-medoid point o Swap m and o and calculate the new cost of configuration Jprime

If the cost of the best new configuration J is lower than J make the corresponding swap and repeat the above steps

Otherwise terminate the procedure23

The K-Medoids Method

24

Cost =20 Cost =26

Sequential Leader Clustering

A very efficient clustering algorithm No iteration Time complexity O(nk)

No need to specify K in advance

Choose a cluster threshold value

For every new data point Compute the distance between the new data point and every clusters centre

If the distance is smaller than the chosen threshold assign the new data point to the corresponding cluster and re-compute cluster centre

Otherwise create a new cluster with the new data point as its centre

Clustering results may be influenced by the sequence of data points

25

Silhouette

A method of interpretation and validation of clusters of data

A succinct graphical representation of how well each data point lies within its cluster compared to other clusters

a(i) average dissimilarity of i with all other points in the same cluster

b(i) the lowest average dissimilarity of i to other clusters

26

)()(max)()()(iaibiaibis

Silhouette

27

-02 0 02 04 06 08 1

1

2

Silhouette Value

Clu

ster

-3 -2 -1 0 1 2 3 4-3

-2

-1

0

1

2

3

4

Gaussian Mixture

28

)2()(

2

22

21)(

xexg

1amp0)()(1

i

ii

n

iiii xgxf

Clustering by Mixture Models

29

K-Means Revisited

30

120579=(1199091 1199101 ) (1199092 119910 2)

119885=119862119897119906119904119905119890119903 1 119862119897119906119904119905119890119903 2

model parameters

latent parameters

Expectation Maximization

31

32

EM Gaussian Mixture

33

Gaussianjth by the generated is i instancer whethecomponents mixture ofnumber the

points data ofnumber the

ijznm

n

kk

x

j

x

n

kkki

jjiij

ki

ji

e

e

xxp

xxpzE

1

)(2

1

)(2

1

1

22

22

)|(

)|(][

m

iij

m

iiij

j

zE

xzE

1

1

][

][

m

iijj zE

m 1

][1

Density Based Methods

Generate clusters of arbitrary shapes

Robust against noise

No K value required in advance

Somewhat similar to human vision

34

DBSCAN

Density-Based Spatial Clustering of Applications with Noise

Density number of points within a specified radius

Core Point points with high density

Border Point points with low density but in the neighbourhood of a core point

Noise Point neither a core point nor a border point

35

Core Point

Noise Point

Border Point

DBSCAN

36

p

q

directly density reachable

p

q

density reachable

o

qp

density connected

DBSCAN

A cluster is defined as the maximal set of density connected points Start from a randomly selected unseen point P If P is a core point build a cluster by gradually adding all points that are

density reachable to the current point set Noise points are discarded (unlabelled)

37

Hierarchical Clustering

Produce a set of nested tree-like clusters

Can be visualized as a dendrogram Clustering is obtained by cutting at desired level No need to specify K in advance May correspond to meaningful taxonomies

38

Dinosaur Family Tree

39

Agglomerative Methods

Bottom-up Method

Assign each data point to a cluster

Calculate the proximity matrix

Merge the pair of closest clusters

Repeat until only a single cluster remains

How to calculate the distance between clusters

Single Link Minimum distance between points

Complete Link Maximum distance between points

40

Example

41

BA FI MI NA RM TO

BA 0 662 877 255 412 996

FI 662 0 295 468 268 400

MI 877 295 0 754 564 138

NA 255 468 754 0 219 869

RM 412 268 564 219 0 669

TO 996 400 138 869 669 0Single Link

Example

42

BA FI MITO NA RM

BA 0 662 877 255 412

FI 662 0 295 468 268

MITO 877 295 0 754 564

NA 255 468 754 0 219

RM 412 268 564 219 0

BA FI MITO NARM

BA 0 662 877 255FI 662 0 295 268

MITO 877 295 0 564

NARM 255 268 564 0

Example

43

BANARM FI MITO

BANARM 0 268 564FI 268 0 295

MITO 564 295 0

BAFINARM MITO

BAFINARM 0 295

MITO 295 0

Min vs Max

44

Reading Materials

Text Books Richard O Duda et al Pattern Classification Chapter 10 John Wiley amp Sons J Han and M Kamber Data Mining Concepts and Techniques Chapter 8

Morgan Kaufmann

Survey Papers A K Jain M N Murty and P J Flynn (1999) ldquoData Clustering A Reviewrdquo ACM

Computing Surveys Vol 31(3) pp 264-323 R Xu and D Wunsch (2005) ldquoSurvey of Clustering Algorithmsrdquo IEEE Transactions

on Neural Networks Vol 16(3) pp 645-678 A K Jain (2010) ldquoData Clustering 50 Years Beyond K-Meansrdquo Pattern

Recognition Letters Vol 31 pp 651-666

Online Tutorials httphomedeipolimiitmatteuccClusteringtutorial_html httpwwwautonlaborgtutorialskmeanshtml httpusersinformatikuni-hallede~hinneburClusterTutorial

45

Review

What is clustering

What are the two categories of clustering methods

How does the K-Means algorithm work

What are the major issues of K-Means

How to control the number of clusters in Sequential Leader Clustering

How to use Gaussian mixture models for clustering

What are the main advantages of density methods

What is the core idea of DBSCAN

What is the general procedure of hierarchical clustering

Which clustering methods do not require K as the input

46

Next Weekrsquos Class Talk

Volunteers are required for next weekrsquos class talk

Topic Affinity Propagation Science 315 972ndash976 2007 Clustering by passing messages between points httpwwwpsitorontoeduindexphpq=affinity20propagation

Topic Clustering by Fast Search and Find of Density Peaks Science 344 1492ndash1496 2014 Cluster centers higher density than neighbors Cluster centers distant from others points with higher densities

Length 20 minutes plus question time

47

Assignment

Topic Clustering Techniques and Applications

Techniques K-Means Another clustering method for comparison

Task 1 2D Artificial Datasets To demonstrate the influence of data patterns To demonstrate the influence of algorithm factors

Task 2 Image Segmentation Gray vs Colour

Deliverables Reports (experiment specification algorithm parameters in-depth analysis) Code (any programming language with detailed comments)

Due Sunday 28 December

Credit 1548

  • Clustering
  • Overview
  • What is cluster analysis
  • Clusters
  • Clusters (2)
  • Applications of Clustering
  • Earthquakes
  • Image Segmentation
  • The Big Picture
  • Requirements
  • Practical Considerations
  • Normalization or Not
  • Evaluation
  • Evaluation (2)
  • The Influence of Outliers
  • K-Means
  • K-Means (2)
  • K-Means (3)
  • K-Means (4)
  • Comments on K-Means
  • The Influence of Initial Centroids
  • The Influence of Initial Centroids (2)
  • The K-Medoids Method
  • The K-Medoids Method (2)
  • Sequential Leader Clustering
  • Silhouette
  • Silhouette (2)
  • Gaussian Mixture
  • Clustering by Mixture Models
  • K-Means Revisited
  • Expectation Maximization
  • Slide 32
  • EM Gaussian Mixture
  • Density Based Methods
  • DBSCAN
  • DBSCAN (2)
  • DBSCAN (3)
  • Hierarchical Clustering
  • Dinosaur Family Tree
  • Agglomerative Methods
  • Example
  • Example (2)
  • Example (3)
  • Min vs Max
  • Reading Materials
  • Review
  • Next Weekrsquos Class Talk
  • Assignment
Page 6: Clustering

Applications of Clustering

Marketing Finding groups of customers with similar behaviours

Biology Finding groups of animals or plants with similar features

Bioinformatics Clustering of microarray data genes and sequences

Earthquake Studies Clustering observed earthquake epicenters to identify dangerous zones

WWW Clustering weblog data to discover groups of similar access patterns

Social Networks Discovering groups of individuals with close friendships internally

6

Earthquakes

7

Image Segmentation

8

The Big Picture

9

Requirements

Scalability

Ability to deal with different types of attributes

Ability to discover clusters with arbitrary shape

Minimum requirements for domain knowledge

Ability to deal with noise and outliers

Insensitivity to order of input records

Incorporation of user-defined constraints

Interpretability and usability

10

Practical Considerations

11

Normalization or Not

12

Evaluation

13

ii Dxi

i

c

i Dxie x

nmmxJ 1

1

2

VS

Evaluation

14

The Influence of Outliers

15

outlier

K=2

K-Means

16

K-Means

17

K-Means

18

K-Means

Determine the value of K

Choose K cluster centres randomly

Each data point is assigned to its closest centroid

Use the mean of each cluster to update each centroid

Repeat until no more new assignment

Return the K centroids

Reference J MacQueen (1967) Some Methods for Classification and Analysis of

Multivariate Observations Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability vol1 pp 281-297

19

Comments on K-Means

Pros Simple and works well for regular disjoint clusters Converges relatively fast Relatively efficient and scalable O(tkn)

bull t iteration k number of centroids n number of data points

Cons Need to specify the value of K in advance

bull Difficult and domain knowledge may help May converge to local optima

bull In practice try different initial centroids May be sensitive to noisy data and outliers

bull Mean of data points hellip Not suitable for clusters of

bull Non-convex shapes

20

The Influence of Initial Centroids

21

The Influence of Initial Centroids

22

The K-Medoids Method

The basic idea is to use real data points as centres

Determine the value of K in advance

Randomly select K points as medoids

Assign each data point to the closest medoid

Calculate the cost of the configuration J

For each medoid m For each non-medoid point o Swap m and o and calculate the new cost of configuration Jprime

If the cost of the best new configuration J is lower than J make the corresponding swap and repeat the above steps

Otherwise terminate the procedure23

The K-Medoids Method

24

Cost =20 Cost =26

Sequential Leader Clustering

A very efficient clustering algorithm No iteration Time complexity O(nk)

No need to specify K in advance

Choose a cluster threshold value

For every new data point Compute the distance between the new data point and every clusters centre

If the distance is smaller than the chosen threshold assign the new data point to the corresponding cluster and re-compute cluster centre

Otherwise create a new cluster with the new data point as its centre

Clustering results may be influenced by the sequence of data points

25

Silhouette

A method of interpretation and validation of clusters of data

A succinct graphical representation of how well each data point lies within its cluster compared to other clusters

a(i) average dissimilarity of i with all other points in the same cluster

b(i) the lowest average dissimilarity of i to other clusters

26

)()(max)()()(iaibiaibis

Silhouette

27

-02 0 02 04 06 08 1

1

2

Silhouette Value

Clu

ster

-3 -2 -1 0 1 2 3 4-3

-2

-1

0

1

2

3

4

Gaussian Mixture

28

)2()(

2

22

21)(

xexg

1amp0)()(1

i

ii

n

iiii xgxf

Clustering by Mixture Models

29

K-Means Revisited

30

120579=(1199091 1199101 ) (1199092 119910 2)

119885=119862119897119906119904119905119890119903 1 119862119897119906119904119905119890119903 2

model parameters

latent parameters

Expectation Maximization

31

32

EM Gaussian Mixture

33

Gaussianjth by the generated is i instancer whethecomponents mixture ofnumber the

points data ofnumber the

ijznm

n

kk

x

j

x

n

kkki

jjiij

ki

ji

e

e

xxp

xxpzE

1

)(2

1

)(2

1

1

22

22

)|(

)|(][

m

iij

m

iiij

j

zE

xzE

1

1

][

][

m

iijj zE

m 1

][1

Density Based Methods

Generate clusters of arbitrary shapes

Robust against noise

No K value required in advance

Somewhat similar to human vision

34

DBSCAN

Density-Based Spatial Clustering of Applications with Noise

Density number of points within a specified radius

Core Point points with high density

Border Point points with low density but in the neighbourhood of a core point

Noise Point neither a core point nor a border point

35

Core Point

Noise Point

Border Point

DBSCAN

36

p

q

directly density reachable

p

q

density reachable

o

qp

density connected

DBSCAN

A cluster is defined as the maximal set of density connected points Start from a randomly selected unseen point P If P is a core point build a cluster by gradually adding all points that are

density reachable to the current point set Noise points are discarded (unlabelled)

37

Hierarchical Clustering

Produce a set of nested tree-like clusters

Can be visualized as a dendrogram Clustering is obtained by cutting at desired level No need to specify K in advance May correspond to meaningful taxonomies

38

Dinosaur Family Tree

39

Agglomerative Methods

Bottom-up Method

Assign each data point to a cluster

Calculate the proximity matrix

Merge the pair of closest clusters

Repeat until only a single cluster remains

How to calculate the distance between clusters

Single Link Minimum distance between points

Complete Link Maximum distance between points

40

Example

41

BA FI MI NA RM TO

BA 0 662 877 255 412 996

FI 662 0 295 468 268 400

MI 877 295 0 754 564 138

NA 255 468 754 0 219 869

RM 412 268 564 219 0 669

TO 996 400 138 869 669 0Single Link

Example

42

BA FI MITO NA RM

BA 0 662 877 255 412

FI 662 0 295 468 268

MITO 877 295 0 754 564

NA 255 468 754 0 219

RM 412 268 564 219 0

BA FI MITO NARM

BA 0 662 877 255FI 662 0 295 268

MITO 877 295 0 564

NARM 255 268 564 0

Example

43

BANARM FI MITO

BANARM 0 268 564FI 268 0 295

MITO 564 295 0

BAFINARM MITO

BAFINARM 0 295

MITO 295 0

Min vs Max

44

Reading Materials

Text Books Richard O Duda et al Pattern Classification Chapter 10 John Wiley amp Sons J Han and M Kamber Data Mining Concepts and Techniques Chapter 8

Morgan Kaufmann

Survey Papers A K Jain M N Murty and P J Flynn (1999) ldquoData Clustering A Reviewrdquo ACM

Computing Surveys Vol 31(3) pp 264-323 R Xu and D Wunsch (2005) ldquoSurvey of Clustering Algorithmsrdquo IEEE Transactions

on Neural Networks Vol 16(3) pp 645-678 A K Jain (2010) ldquoData Clustering 50 Years Beyond K-Meansrdquo Pattern

Recognition Letters Vol 31 pp 651-666

Online Tutorials httphomedeipolimiitmatteuccClusteringtutorial_html httpwwwautonlaborgtutorialskmeanshtml httpusersinformatikuni-hallede~hinneburClusterTutorial

45

Review

What is clustering?

What are the two categories of clustering methods?

How does the K-Means algorithm work?

What are the major issues of K-Means?

How to control the number of clusters in Sequential Leader Clustering?

How to use Gaussian mixture models for clustering?

What are the main advantages of density based methods?

What is the core idea of DBSCAN?

What is the general procedure of hierarchical clustering?

Which clustering methods do not require K as the input?

46

Next Weekrsquos Class Talk

Volunteers are required for next week's class talk.

Topic: Affinity Propagation. Science, 315, 972–976, 2007. Clustering by passing messages between points. http://www.psi.toronto.edu/index.php?q=affinity%20propagation

Topic: Clustering by Fast Search and Find of Density Peaks. Science, 344, 1492–1496, 2014. Cluster centers have a higher density than their neighbors, and are distant from any points of higher density.

Length: 20 minutes plus question time.

47

Assignment

Topic: Clustering Techniques and Applications

Techniques: K-Means, plus another clustering method of your choice for comparison

Task 1: 2D Artificial Datasets, to demonstrate the influence of data patterns and of algorithm factors

Task 2: Image Segmentation, gray vs. colour

Deliverables: Reports (experiment specification, algorithm parameters, in-depth analysis) and Code (any programming language, with detailed comments)

Due: Sunday, 28 December

Credit: 15

48

Page 7: Clustering

Earthquakes

7

Image Segmentation

8

The Big Picture

9

Requirements

Scalability

Ability to deal with different types of attributes

Ability to discover clusters with arbitrary shape

Minimum requirements for domain knowledge

Ability to deal with noise and outliers

Insensitivity to order of input records

Incorporation of user-defined constraints

Interpretability and usability

10

Practical Considerations

11

Normalization or Not

12

Evaluation

13

ii Dxi

i

c

i Dxie x

nmmxJ 1

1

2

VS

Evaluation

14

The Influence of Outliers

15

outlier

K=2

K-Means

16

K-Means

17

K-Means

18

K-Means

Determine the value of K

Choose K cluster centres randomly

Each data point is assigned to its closest centroid

Use the mean of each cluster to update each centroid

Repeat until no more new assignment

Return the K centroids

Reference J MacQueen (1967) Some Methods for Classification and Analysis of

Multivariate Observations Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability vol1 pp 281-297

19

Comments on K-Means

Pros Simple and works well for regular disjoint clusters Converges relatively fast Relatively efficient and scalable O(tkn)

bull t iteration k number of centroids n number of data points

Cons Need to specify the value of K in advance

bull Difficult and domain knowledge may help May converge to local optima

bull In practice try different initial centroids May be sensitive to noisy data and outliers

bull Mean of data points hellip Not suitable for clusters of

bull Non-convex shapes

20

The Influence of Initial Centroids

21

The Influence of Initial Centroids

22

The K-Medoids Method

The basic idea is to use real data points as centres

Determine the value of K in advance

Randomly select K points as medoids

Assign each data point to the closest medoid

Calculate the cost of the configuration J

For each medoid m For each non-medoid point o Swap m and o and calculate the new cost of configuration Jprime

If the cost of the best new configuration J is lower than J make the corresponding swap and repeat the above steps

Otherwise terminate the procedure23

The K-Medoids Method

24

Cost =20 Cost =26

Sequential Leader Clustering

A very efficient clustering algorithm No iteration Time complexity O(nk)

No need to specify K in advance

Choose a cluster threshold value

For every new data point Compute the distance between the new data point and every clusters centre

If the distance is smaller than the chosen threshold assign the new data point to the corresponding cluster and re-compute cluster centre

Otherwise create a new cluster with the new data point as its centre

Clustering results may be influenced by the sequence of data points

25

Silhouette

A method of interpretation and validation of clusters of data

A succinct graphical representation of how well each data point lies within its cluster compared to other clusters

a(i) average dissimilarity of i with all other points in the same cluster

b(i) the lowest average dissimilarity of i to other clusters

26

)()(max)()()(iaibiaibis

Silhouette

27

-02 0 02 04 06 08 1

1

2

Silhouette Value

Clu

ster

-3 -2 -1 0 1 2 3 4-3

-2

-1

0

1

2

3

4

Gaussian Mixture

28

)2()(

2

22

21)(

xexg

1amp0)()(1

i

ii

n

iiii xgxf

Clustering by Mixture Models

29

K-Means Revisited

30

120579=(1199091 1199101 ) (1199092 119910 2)

119885=119862119897119906119904119905119890119903 1 119862119897119906119904119905119890119903 2

model parameters

latent parameters

Expectation Maximization

31

32

EM Gaussian Mixture

33

Gaussianjth by the generated is i instancer whethecomponents mixture ofnumber the

points data ofnumber the

ijznm

n

kk

x

j

x

n

kkki

jjiij

ki

ji

e

e

xxp

xxpzE

1

)(2

1

)(2

1

1

22

22

)|(

)|(][

m

iij

m

iiij

j

zE

xzE

1

1

][

][

m

iijj zE

m 1

][1

Density Based Methods

Generate clusters of arbitrary shapes

Robust against noise

No K value required in advance

Somewhat similar to human vision

34

DBSCAN

Density-Based Spatial Clustering of Applications with Noise

Density number of points within a specified radius

Core Point points with high density

Border Point points with low density but in the neighbourhood of a core point

Noise Point neither a core point nor a border point

35

Core Point

Noise Point

Border Point

DBSCAN

36

p

q

directly density reachable

p

q

density reachable

o

qp

density connected

DBSCAN

A cluster is defined as the maximal set of density connected points Start from a randomly selected unseen point P If P is a core point build a cluster by gradually adding all points that are

density reachable to the current point set Noise points are discarded (unlabelled)

37

Hierarchical Clustering

Produce a set of nested tree-like clusters

Can be visualized as a dendrogram Clustering is obtained by cutting at desired level No need to specify K in advance May correspond to meaningful taxonomies

38

Dinosaur Family Tree

39

Agglomerative Methods

Bottom-up Method

Assign each data point to a cluster

Calculate the proximity matrix

Merge the pair of closest clusters

Repeat until only a single cluster remains

How to calculate the distance between clusters

Single Link Minimum distance between points

Complete Link Maximum distance between points

40

Example

41

BA FI MI NA RM TO

BA 0 662 877 255 412 996

FI 662 0 295 468 268 400

MI 877 295 0 754 564 138

NA 255 468 754 0 219 869

RM 412 268 564 219 0 669

TO 996 400 138 869 669 0Single Link

Example

42

BA FI MITO NA RM

BA 0 662 877 255 412

FI 662 0 295 468 268

MITO 877 295 0 754 564

NA 255 468 754 0 219

RM 412 268 564 219 0

BA FI MITO NARM

BA 0 662 877 255FI 662 0 295 268

MITO 877 295 0 564

NARM 255 268 564 0

Example

43

BANARM FI MITO

BANARM 0 268 564FI 268 0 295

MITO 564 295 0

BAFINARM MITO

BAFINARM 0 295

MITO 295 0

Min vs Max

44

Reading Materials

Text Books Richard O Duda et al Pattern Classification Chapter 10 John Wiley amp Sons J Han and M Kamber Data Mining Concepts and Techniques Chapter 8

Morgan Kaufmann

Survey Papers A K Jain M N Murty and P J Flynn (1999) ldquoData Clustering A Reviewrdquo ACM

Computing Surveys Vol 31(3) pp 264-323 R Xu and D Wunsch (2005) ldquoSurvey of Clustering Algorithmsrdquo IEEE Transactions

on Neural Networks Vol 16(3) pp 645-678 A K Jain (2010) ldquoData Clustering 50 Years Beyond K-Meansrdquo Pattern

Recognition Letters Vol 31 pp 651-666

Online Tutorials httphomedeipolimiitmatteuccClusteringtutorial_html httpwwwautonlaborgtutorialskmeanshtml httpusersinformatikuni-hallede~hinneburClusterTutorial

45

Review

What is clustering

What are the two categories of clustering methods

How does the K-Means algorithm work

What are the major issues of K-Means

How to control the number of clusters in Sequential Leader Clustering

How to use Gaussian mixture models for clustering

What are the main advantages of density methods

What is the core idea of DBSCAN

What is the general procedure of hierarchical clustering

Which clustering methods do not require K as the input

46

Next Weekrsquos Class Talk

Volunteers are required for next weekrsquos class talk

Topic Affinity Propagation Science 315 972ndash976 2007 Clustering by passing messages between points httpwwwpsitorontoeduindexphpq=affinity20propagation

Topic Clustering by Fast Search and Find of Density Peaks Science 344 1492ndash1496 2014 Cluster centers higher density than neighbors Cluster centers distant from others points with higher densities

Length 20 minutes plus question time

47

Assignment

Topic Clustering Techniques and Applications

Techniques K-Means Another clustering method for comparison

Task 1 2D Artificial Datasets To demonstrate the influence of data patterns To demonstrate the influence of algorithm factors

Task 2 Image Segmentation Gray vs Colour

Deliverables Reports (experiment specification algorithm parameters in-depth analysis) Code (any programming language with detailed comments)

Due Sunday 28 December

Credit 1548

  • Clustering
  • Overview
  • What is cluster analysis
  • Clusters
  • Clusters (2)
  • Applications of Clustering
  • Earthquakes
  • Image Segmentation
  • The Big Picture
  • Requirements
  • Practical Considerations
  • Normalization or Not
  • Evaluation
  • Evaluation (2)
  • The Influence of Outliers
  • K-Means
  • K-Means (2)
  • K-Means (3)
  • K-Means (4)
  • Comments on K-Means
  • The Influence of Initial Centroids
  • The Influence of Initial Centroids (2)
  • The K-Medoids Method
  • The K-Medoids Method (2)
  • Sequential Leader Clustering
  • Silhouette
  • Silhouette (2)
  • Gaussian Mixture
  • Clustering by Mixture Models
  • K-Means Revisited
  • Expectation Maximization
  • Slide 32
  • EM Gaussian Mixture
  • Density Based Methods
  • DBSCAN
  • DBSCAN (2)
  • DBSCAN (3)
  • Hierarchical Clustering
  • Dinosaur Family Tree
  • Agglomerative Methods
  • Example
  • Example (2)
  • Example (3)
  • Min vs Max
  • Reading Materials
  • Review
  • Next Weekrsquos Class Talk
  • Assignment
Page 8: Clustering

Image Segmentation

8

The Big Picture

9

Requirements

Scalability

Ability to deal with different types of attributes

Ability to discover clusters with arbitrary shape

Minimum requirements for domain knowledge

Ability to deal with noise and outliers

Insensitivity to order of input records

Incorporation of user-defined constraints

Interpretability and usability

10

Practical Considerations

11

Normalization or Not

12

Evaluation

13

ii Dxi

i

c

i Dxie x

nmmxJ 1

1

2

VS

Evaluation

14

The Influence of Outliers

15

outlier

K=2

K-Means

16

K-Means

17

K-Means

18

K-Means

Determine the value of K

Choose K cluster centres randomly

Each data point is assigned to its closest centroid

Use the mean of each cluster to update each centroid

Repeat until no more new assignment

Return the K centroids

Reference J MacQueen (1967) Some Methods for Classification and Analysis of

Multivariate Observations Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability vol1 pp 281-297

19

Comments on K-Means

Pros Simple and works well for regular disjoint clusters Converges relatively fast Relatively efficient and scalable O(tkn)

bull t iteration k number of centroids n number of data points

Cons Need to specify the value of K in advance

bull Difficult and domain knowledge may help May converge to local optima

bull In practice try different initial centroids May be sensitive to noisy data and outliers

bull Mean of data points hellip Not suitable for clusters of

bull Non-convex shapes

20

The Influence of Initial Centroids

21

The Influence of Initial Centroids

22

The K-Medoids Method

The basic idea is to use real data points as centres

Determine the value of K in advance

Randomly select K points as medoids

Assign each data point to the closest medoid

Calculate the cost of the configuration J

For each medoid m For each non-medoid point o Swap m and o and calculate the new cost of configuration Jprime

If the cost of the best new configuration J is lower than J make the corresponding swap and repeat the above steps

Otherwise terminate the procedure23

The K-Medoids Method

24

Cost =20 Cost =26

Sequential Leader Clustering

A very efficient clustering algorithm No iteration Time complexity O(nk)

No need to specify K in advance

Choose a cluster threshold value

For every new data point Compute the distance between the new data point and every clusters centre

If the distance is smaller than the chosen threshold assign the new data point to the corresponding cluster and re-compute cluster centre

Otherwise create a new cluster with the new data point as its centre

Clustering results may be influenced by the sequence of data points

25

Silhouette

A method of interpretation and validation of clusters of data

A succinct graphical representation of how well each data point lies within its cluster compared to other clusters

a(i) average dissimilarity of i with all other points in the same cluster

b(i) the lowest average dissimilarity of i to other clusters

26

)()(max)()()(iaibiaibis

Silhouette

27

-02 0 02 04 06 08 1

1

2

Silhouette Value

Clu

ster

-3 -2 -1 0 1 2 3 4-3

-2

-1

0

1

2

3

4

Gaussian Mixture

28

)2()(

2

22

21)(

xexg

1amp0)()(1

i

ii

n

iiii xgxf

Clustering by Mixture Models

29

K-Means Revisited

30

120579=(1199091 1199101 ) (1199092 119910 2)

119885=119862119897119906119904119905119890119903 1 119862119897119906119904119905119890119903 2

model parameters

latent parameters

Expectation Maximization

31

32

EM Gaussian Mixture

33

Gaussianjth by the generated is i instancer whethecomponents mixture ofnumber the

points data ofnumber the

ijznm

n

kk

x

j

x

n

kkki

jjiij

ki

ji

e

e

xxp

xxpzE

1

)(2

1

)(2

1

1

22

22

)|(

)|(][

m

iij

m

iiij

j

zE

xzE

1

1

][

][

m

iijj zE

m 1

][1

Density Based Methods

Generate clusters of arbitrary shapes

Robust against noise

No K value required in advance

Somewhat similar to human vision

34

DBSCAN

Density-Based Spatial Clustering of Applications with Noise

Density number of points within a specified radius

Core Point points with high density

Border Point points with low density but in the neighbourhood of a core point

Noise Point neither a core point nor a border point

35

Core Point

Noise Point

Border Point

DBSCAN

36

p

q

directly density reachable

p

q

density reachable

o

qp

density connected

DBSCAN

A cluster is defined as the maximal set of density connected points Start from a randomly selected unseen point P If P is a core point build a cluster by gradually adding all points that are

density reachable to the current point set Noise points are discarded (unlabelled)

37

Hierarchical Clustering

Produce a set of nested tree-like clusters

Can be visualized as a dendrogram Clustering is obtained by cutting at desired level No need to specify K in advance May correspond to meaningful taxonomies

38

Dinosaur Family Tree

39

Agglomerative Methods

Bottom-up Method

Assign each data point to a cluster

Calculate the proximity matrix

Merge the pair of closest clusters

Repeat until only a single cluster remains

How to calculate the distance between clusters

Single Link Minimum distance between points

Complete Link Maximum distance between points

40

Example

41

BA FI MI NA RM TO

BA 0 662 877 255 412 996

FI 662 0 295 468 268 400

MI 877 295 0 754 564 138

NA 255 468 754 0 219 869

RM 412 268 564 219 0 669

TO 996 400 138 869 669 0Single Link

Example

42

BA FI MITO NA RM

BA 0 662 877 255 412

FI 662 0 295 468 268

MITO 877 295 0 754 564

NA 255 468 754 0 219

RM 412 268 564 219 0

BA FI MITO NARM

BA 0 662 877 255FI 662 0 295 268

MITO 877 295 0 564

NARM 255 268 564 0

Example

43

BANARM FI MITO

BANARM 0 268 564FI 268 0 295

MITO 564 295 0

BAFINARM MITO

BAFINARM 0 295

MITO 295 0

Min vs Max

44

Reading Materials

Text Books Richard O Duda et al Pattern Classification Chapter 10 John Wiley amp Sons J Han and M Kamber Data Mining Concepts and Techniques Chapter 8

Morgan Kaufmann

Survey Papers A K Jain M N Murty and P J Flynn (1999) ldquoData Clustering A Reviewrdquo ACM

Computing Surveys Vol 31(3) pp 264-323 R Xu and D Wunsch (2005) ldquoSurvey of Clustering Algorithmsrdquo IEEE Transactions

on Neural Networks Vol 16(3) pp 645-678 A K Jain (2010) ldquoData Clustering 50 Years Beyond K-Meansrdquo Pattern

Recognition Letters Vol 31 pp 651-666

Online Tutorials httphomedeipolimiitmatteuccClusteringtutorial_html httpwwwautonlaborgtutorialskmeanshtml httpusersinformatikuni-hallede~hinneburClusterTutorial

45

Review

What is clustering

What are the two categories of clustering methods

How does the K-Means algorithm work

What are the major issues of K-Means

How to control the number of clusters in Sequential Leader Clustering

How to use Gaussian mixture models for clustering

What are the main advantages of density methods

What is the core idea of DBSCAN

What is the general procedure of hierarchical clustering

Which clustering methods do not require K as the input

46

Next Weekrsquos Class Talk

Volunteers are required for next weekrsquos class talk

Topic Affinity Propagation Science 315 972ndash976 2007 Clustering by passing messages between points httpwwwpsitorontoeduindexphpq=affinity20propagation

Topic Clustering by Fast Search and Find of Density Peaks Science 344 1492ndash1496 2014 Cluster centers higher density than neighbors Cluster centers distant from others points with higher densities

Length 20 minutes plus question time

47

Assignment

Topic Clustering Techniques and Applications

Techniques K-Means Another clustering method for comparison

Task 1 2D Artificial Datasets To demonstrate the influence of data patterns To demonstrate the influence of algorithm factors

Task 2 Image Segmentation Gray vs Colour

Deliverables Reports (experiment specification algorithm parameters in-depth analysis) Code (any programming language with detailed comments)

Due Sunday 28 December

Credit 1548

  • Clustering
  • Overview
  • What is cluster analysis
  • Clusters
  • Clusters (2)
  • Applications of Clustering
  • Earthquakes
  • Image Segmentation
  • The Big Picture
  • Requirements
  • Practical Considerations
  • Normalization or Not
  • Evaluation
  • Evaluation (2)
  • The Influence of Outliers
  • K-Means
  • K-Means (2)
  • K-Means (3)
  • K-Means (4)
  • Comments on K-Means
  • The Influence of Initial Centroids
  • The Influence of Initial Centroids (2)
  • The K-Medoids Method
  • The K-Medoids Method (2)
  • Sequential Leader Clustering
  • Silhouette
  • Silhouette (2)
  • Gaussian Mixture
  • Clustering by Mixture Models
  • K-Means Revisited
  • Expectation Maximization
  • Slide 32
  • EM Gaussian Mixture
  • Density Based Methods
  • DBSCAN
  • DBSCAN (2)
  • DBSCAN (3)
  • Hierarchical Clustering
  • Dinosaur Family Tree
  • Agglomerative Methods
  • Example
  • Example (2)
  • Example (3)
  • Min vs Max
  • Reading Materials
  • Review
  • Next Weekrsquos Class Talk
  • Assignment
Page 9: Clustering

The Big Picture

9

Requirements

Scalability

Ability to deal with different types of attributes

Ability to discover clusters with arbitrary shape

Minimum requirements for domain knowledge

Ability to deal with noise and outliers

Insensitivity to order of input records

Incorporation of user-defined constraints

Interpretability and usability

10

Practical Considerations

11

Normalization or Not

12

Evaluation

13

ii Dxi

i

c

i Dxie x

nmmxJ 1

1

2

VS

Evaluation

14

The Influence of Outliers

15

outlier

K=2

K-Means

16

K-Means

17

K-Means

18

K-Means

Determine the value of K

Choose K cluster centres randomly

Each data point is assigned to its closest centroid

Use the mean of each cluster to update each centroid

Repeat until no more new assignment

Return the K centroids

Reference J MacQueen (1967) Some Methods for Classification and Analysis of

Multivariate Observations Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability vol1 pp 281-297

19

Comments on K-Means

Pros Simple and works well for regular disjoint clusters Converges relatively fast Relatively efficient and scalable O(tkn)

bull t iteration k number of centroids n number of data points

Cons Need to specify the value of K in advance

bull Difficult and domain knowledge may help May converge to local optima

bull In practice try different initial centroids May be sensitive to noisy data and outliers

bull Mean of data points hellip Not suitable for clusters of

bull Non-convex shapes

20

The Influence of Initial Centroids

21

The Influence of Initial Centroids

22

The K-Medoids Method

The basic idea is to use real data points as centres

Determine the value of K in advance

Randomly select K points as medoids

Assign each data point to the closest medoid

Calculate the cost of the configuration J

For each medoid m For each non-medoid point o Swap m and o and calculate the new cost of configuration Jprime

If the cost of the best new configuration J is lower than J make the corresponding swap and repeat the above steps

Otherwise terminate the procedure23

The K-Medoids Method

24

Cost =20 Cost =26

Sequential Leader Clustering

A very efficient clustering algorithm No iteration Time complexity O(nk)

No need to specify K in advance

Choose a cluster threshold value

For every new data point Compute the distance between the new data point and every clusters centre

If the distance is smaller than the chosen threshold assign the new data point to the corresponding cluster and re-compute cluster centre

Otherwise create a new cluster with the new data point as its centre

Clustering results may be influenced by the sequence of data points

25

Silhouette

A method of interpretation and validation of clusters of data

A succinct graphical representation of how well each data point lies within its cluster compared to other clusters

a(i) average dissimilarity of i with all other points in the same cluster

b(i) the lowest average dissimilarity of i to other clusters

26

)()(max)()()(iaibiaibis

Silhouette

27

-02 0 02 04 06 08 1

1

2

Silhouette Value

Clu

ster

-3 -2 -1 0 1 2 3 4-3

-2

-1

0

1

2

3

4

Gaussian Mixture

28

)2()(

2

22

21)(

xexg

1amp0)()(1

i

ii

n

iiii xgxf

Clustering by Mixture Models

29

K-Means Revisited

30

120579=(1199091 1199101 ) (1199092 119910 2)

119885=119862119897119906119904119905119890119903 1 119862119897119906119904119905119890119903 2

model parameters

latent parameters

Expectation Maximization

31

32

EM Gaussian Mixture

33

Gaussianjth by the generated is i instancer whethecomponents mixture ofnumber the

points data ofnumber the

ijznm

n

kk

x

j

x

n

kkki

jjiij

ki

ji

e

e

xxp

xxpzE

1

)(2

1

)(2

1

1

22

22

)|(

)|(][

m

iij

m

iiij

j

zE

xzE

1

1

][

][

m

iijj zE

m 1

][1

Density Based Methods

Generate clusters of arbitrary shapes

Robust against noise

No K value required in advance

Somewhat similar to human vision

34

DBSCAN

Density-Based Spatial Clustering of Applications with Noise

Density number of points within a specified radius

Core Point points with high density

Border Point points with low density but in the neighbourhood of a core point

Noise Point neither a core point nor a border point

35

Core Point

Noise Point

Border Point

DBSCAN

36

p

q

directly density reachable

p

q

density reachable

o

qp

density connected

DBSCAN

A cluster is defined as the maximal set of density connected points Start from a randomly selected unseen point P If P is a core point build a cluster by gradually adding all points that are

density reachable to the current point set Noise points are discarded (unlabelled)

37

Hierarchical Clustering

Produce a set of nested tree-like clusters

Can be visualized as a dendrogram Clustering is obtained by cutting at desired level No need to specify K in advance May correspond to meaningful taxonomies

38

Dinosaur Family Tree

39

Agglomerative Methods

Bottom-up Method

Assign each data point to a cluster

Calculate the proximity matrix

Merge the pair of closest clusters

Repeat until only a single cluster remains

How to calculate the distance between clusters

Single Link Minimum distance between points

Complete Link Maximum distance between points

40

Example

41

BA FI MI NA RM TO

BA 0 662 877 255 412 996

FI 662 0 295 468 268 400

MI 877 295 0 754 564 138

NA 255 468 754 0 219 869

RM 412 268 564 219 0 669

TO 996 400 138 869 669 0Single Link

Example

42

BA FI MITO NA RM

BA 0 662 877 255 412

FI 662 0 295 468 268

MITO 877 295 0 754 564

NA 255 468 754 0 219

RM 412 268 564 219 0

BA FI MITO NARM

BA 0 662 877 255FI 662 0 295 268

MITO 877 295 0 564

NARM 255 268 564 0

Example

43

BANARM FI MITO

BANARM 0 268 564FI 268 0 295

MITO 564 295 0

BAFINARM MITO

BAFINARM 0 295

MITO 295 0

Min vs Max

44

Reading Materials

Text Books Richard O Duda et al Pattern Classification Chapter 10 John Wiley amp Sons J Han and M Kamber Data Mining Concepts and Techniques Chapter 8

Morgan Kaufmann

Survey Papers A K Jain M N Murty and P J Flynn (1999) ldquoData Clustering A Reviewrdquo ACM

Computing Surveys Vol 31(3) pp 264-323 R Xu and D Wunsch (2005) ldquoSurvey of Clustering Algorithmsrdquo IEEE Transactions

on Neural Networks Vol 16(3) pp 645-678 A K Jain (2010) ldquoData Clustering 50 Years Beyond K-Meansrdquo Pattern

Recognition Letters Vol 31 pp 651-666

Online Tutorials httphomedeipolimiitmatteuccClusteringtutorial_html httpwwwautonlaborgtutorialskmeanshtml httpusersinformatikuni-hallede~hinneburClusterTutorial

45

Review

What is clustering

What are the two categories of clustering methods

How does the K-Means algorithm work

What are the major issues of K-Means

How to control the number of clusters in Sequential Leader Clustering

How to use Gaussian mixture models for clustering

What are the main advantages of density methods

What is the core idea of DBSCAN

What is the general procedure of hierarchical clustering

Which clustering methods do not require K as the input

46

Next Weekrsquos Class Talk

Volunteers are required for next weekrsquos class talk

Topic Affinity Propagation Science 315 972ndash976 2007 Clustering by passing messages between points httpwwwpsitorontoeduindexphpq=affinity20propagation

Topic Clustering by Fast Search and Find of Density Peaks Science 344 1492ndash1496 2014 Cluster centers higher density than neighbors Cluster centers distant from others points with higher densities

Length 20 minutes plus question time

47

Assignment

Topic Clustering Techniques and Applications

Techniques K-Means Another clustering method for comparison

Task 1 2D Artificial Datasets To demonstrate the influence of data patterns To demonstrate the influence of algorithm factors

Task 2 Image Segmentation Gray vs Colour

Deliverables Reports (experiment specification algorithm parameters in-depth analysis) Code (any programming language with detailed comments)

Due Sunday 28 December

Credit 1548

  • Clustering
  • Overview
  • What is cluster analysis
  • Clusters
  • Clusters (2)
  • Applications of Clustering
  • Earthquakes
  • Image Segmentation
  • The Big Picture
  • Requirements
  • Practical Considerations
  • Normalization or Not
  • Evaluation
  • Evaluation (2)
  • The Influence of Outliers
  • K-Means
  • K-Means (2)
  • K-Means (3)
  • K-Means (4)
  • Comments on K-Means
  • The Influence of Initial Centroids
  • The Influence of Initial Centroids (2)
  • The K-Medoids Method
  • The K-Medoids Method (2)
  • Sequential Leader Clustering
  • Silhouette
  • Silhouette (2)
  • Gaussian Mixture
  • Clustering by Mixture Models
  • K-Means Revisited
  • Expectation Maximization
  • Slide 32
  • EM Gaussian Mixture
  • Density Based Methods
  • DBSCAN
  • DBSCAN (2)
  • DBSCAN (3)
  • Hierarchical Clustering
  • Dinosaur Family Tree
  • Agglomerative Methods
  • Example
  • Example (2)
  • Example (3)
  • Min vs Max
  • Reading Materials
  • Review
  • Next Weekrsquos Class Talk
  • Assignment
Page 10: Clustering

Requirements

Scalability

Ability to deal with different types of attributes

Ability to discover clusters with arbitrary shape

Minimum requirements for domain knowledge

Ability to deal with noise and outliers

Insensitivity to order of input records

Incorporation of user-defined constraints

Interpretability and usability

10

Practical Considerations

11

Normalization or Not

12

Evaluation

13

ii Dxi

i

c

i Dxie x

nmmxJ 1

1

2

VS

Evaluation

14

The Influence of Outliers

15

outlier

K=2

K-Means

16

K-Means

17

K-Means

18

K-Means

Determine the value of K

Choose K cluster centres randomly

Each data point is assigned to its closest centroid

Use the mean of each cluster to update each centroid

Repeat until no more new assignment

Return the K centroids

Reference J MacQueen (1967) Some Methods for Classification and Analysis of

Multivariate Observations Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability vol1 pp 281-297

19

Comments on K-Means

Pros Simple and works well for regular disjoint clusters Converges relatively fast Relatively efficient and scalable O(tkn)

bull t iteration k number of centroids n number of data points

Cons Need to specify the value of K in advance

bull Difficult and domain knowledge may help May converge to local optima

bull In practice try different initial centroids May be sensitive to noisy data and outliers

bull Mean of data points hellip Not suitable for clusters of

bull Non-convex shapes

20

The Influence of Initial Centroids

21

The Influence of Initial Centroids

22

The K-Medoids Method

The basic idea is to use real data points as centres

Determine the value of K in advance

Randomly select K points as medoids

Assign each data point to the closest medoid

Calculate the cost of the configuration J

For each medoid m For each non-medoid point o Swap m and o and calculate the new cost of configuration Jprime

If the cost of the best new configuration J is lower than J make the corresponding swap and repeat the above steps

Otherwise terminate the procedure23

The K-Medoids Method

24

Cost =20 Cost =26

Sequential Leader Clustering

A very efficient clustering algorithm No iteration Time complexity O(nk)

No need to specify K in advance

Choose a cluster threshold value

For every new data point Compute the distance between the new data point and every clusters centre

If the distance is smaller than the chosen threshold assign the new data point to the corresponding cluster and re-compute cluster centre

Otherwise create a new cluster with the new data point as its centre

Clustering results may be influenced by the sequence of data points

25

Silhouette

A method of interpretation and validation of clusters of data

A succinct graphical representation of how well each data point lies within its cluster compared to other clusters

a(i) average dissimilarity of i with all other points in the same cluster

b(i) the lowest average dissimilarity of i to other clusters

26

)()(max)()()(iaibiaibis

Silhouette

27

-02 0 02 04 06 08 1

1

2

Silhouette Value

Clu

ster

-3 -2 -1 0 1 2 3 4-3

-2

-1

0

1

2

3

4

Gaussian Mixture

28

)2()(

2

22

21)(

xexg

1amp0)()(1

i

ii

n

iiii xgxf

Clustering by Mixture Models

29

K-Means Revisited

30

120579=(1199091 1199101 ) (1199092 119910 2)

119885=119862119897119906119904119905119890119903 1 119862119897119906119904119905119890119903 2

model parameters

latent parameters

Expectation Maximization

31

32

EM Gaussian Mixture

33

Gaussianjth by the generated is i instancer whethecomponents mixture ofnumber the

points data ofnumber the

ijznm

n

kk

x

j

x

n

kkki

jjiij

ki

ji

e

e

xxp

xxpzE

1

)(2

1

)(2

1

1

22

22

)|(

)|(][

m

iij

m

iiij

j

zE

xzE

1

1

][

][

m

iijj zE

m 1

][1

Density Based Methods

Generate clusters of arbitrary shapes

Robust against noise

No K value required in advance

Somewhat similar to human vision

34

DBSCAN

Density-Based Spatial Clustering of Applications with Noise

Density number of points within a specified radius

Core Point points with high density

Border Point points with low density but in the neighbourhood of a core point

Noise Point neither a core point nor a border point

35

Core Point

Noise Point

Border Point

DBSCAN

36

p

q

directly density reachable

p

q

density reachable

o

qp

density connected

DBSCAN

A cluster is defined as the maximal set of density connected points Start from a randomly selected unseen point P If P is a core point build a cluster by gradually adding all points that are

density reachable to the current point set Noise points are discarded (unlabelled)

37

Hierarchical Clustering

Produce a set of nested tree-like clusters

Can be visualized as a dendrogram Clustering is obtained by cutting at desired level No need to specify K in advance May correspond to meaningful taxonomies

38

Dinosaur Family Tree

39

Agglomerative Methods

Bottom-up Method

Assign each data point to a cluster

Calculate the proximity matrix

Merge the pair of closest clusters

Repeat until only a single cluster remains

How to calculate the distance between clusters

Single Link Minimum distance between points

Complete Link Maximum distance between points

40

Example

41

BA FI MI NA RM TO

BA 0 662 877 255 412 996

FI 662 0 295 468 268 400

MI 877 295 0 754 564 138

NA 255 468 754 0 219 869

RM 412 268 564 219 0 669

TO 996 400 138 869 669 0Single Link

Example

42

BA FI MITO NA RM

BA 0 662 877 255 412

FI 662 0 295 468 268

MITO 877 295 0 754 564

NA 255 468 754 0 219

RM 412 268 564 219 0

BA FI MITO NARM

BA 0 662 877 255FI 662 0 295 268

MITO 877 295 0 564

NARM 255 268 564 0

Example

43

BANARM FI MITO

BANARM 0 268 564FI 268 0 295

MITO 564 295 0

BAFINARM MITO

BAFINARM 0 295

MITO 295 0

Min vs Max

44

Reading Materials

Text Books Richard O Duda et al Pattern Classification Chapter 10 John Wiley amp Sons J Han and M Kamber Data Mining Concepts and Techniques Chapter 8

Morgan Kaufmann

Survey Papers A K Jain M N Murty and P J Flynn (1999) ldquoData Clustering A Reviewrdquo ACM

Computing Surveys Vol 31(3) pp 264-323 R Xu and D Wunsch (2005) ldquoSurvey of Clustering Algorithmsrdquo IEEE Transactions

on Neural Networks Vol 16(3) pp 645-678 A K Jain (2010) ldquoData Clustering 50 Years Beyond K-Meansrdquo Pattern

Recognition Letters Vol 31 pp 651-666

Online Tutorials httphomedeipolimiitmatteuccClusteringtutorial_html httpwwwautonlaborgtutorialskmeanshtml httpusersinformatikuni-hallede~hinneburClusterTutorial

45

Review

What is clustering

What are the two categories of clustering methods

How does the K-Means algorithm work

What are the major issues of K-Means

How to control the number of clusters in Sequential Leader Clustering

How to use Gaussian mixture models for clustering

What are the main advantages of density methods

What is the core idea of DBSCAN

What is the general procedure of hierarchical clustering

Which clustering methods do not require K as the input

46

Next Weekrsquos Class Talk

Volunteers are required for next weekrsquos class talk

Topic Affinity Propagation Science 315 972ndash976 2007 Clustering by passing messages between points httpwwwpsitorontoeduindexphpq=affinity20propagation

Topic Clustering by Fast Search and Find of Density Peaks Science 344 1492ndash1496 2014 Cluster centers higher density than neighbors Cluster centers distant from others points with higher densities

Length 20 minutes plus question time

47

Assignment

Topic Clustering Techniques and Applications

Techniques K-Means Another clustering method for comparison

Task 1 2D Artificial Datasets To demonstrate the influence of data patterns To demonstrate the influence of algorithm factors

Task 2 Image Segmentation Gray vs Colour

Deliverables Reports (experiment specification algorithm parameters in-depth analysis) Code (any programming language with detailed comments)

Due Sunday 28 December

Credit 1548

  • Clustering
  • Overview
  • What is cluster analysis
  • Clusters
  • Clusters (2)
  • Applications of Clustering
  • Earthquakes
  • Image Segmentation
  • The Big Picture
  • Requirements
  • Practical Considerations
  • Normalization or Not
  • Evaluation
  • Evaluation (2)
  • The Influence of Outliers
  • K-Means
  • K-Means (2)
  • K-Means (3)
  • K-Means (4)
  • Comments on K-Means
  • The Influence of Initial Centroids
  • The Influence of Initial Centroids (2)
  • The K-Medoids Method
  • The K-Medoids Method (2)
  • Sequential Leader Clustering
  • Silhouette
  • Silhouette (2)
  • Gaussian Mixture
  • Clustering by Mixture Models
  • K-Means Revisited
  • Expectation Maximization
  • Slide 32
  • EM Gaussian Mixture
  • Density Based Methods
  • DBSCAN
  • DBSCAN (2)
  • DBSCAN (3)
  • Hierarchical Clustering
  • Dinosaur Family Tree
  • Agglomerative Methods
  • Example
  • Example (2)
  • Example (3)
  • Min vs Max
  • Reading Materials
  • Review
  • Next Weekrsquos Class Talk
  • Assignment
Page 11: Clustering

Practical Considerations

11

Normalization or Not

12

Evaluation

13

ii Dxi

i

c

i Dxie x

nmmxJ 1

1

2

VS

Evaluation

14

The Influence of Outliers

15

outlier

K=2

K-Means

16

K-Means

17

K-Means

18

K-Means

Determine the value of K

Choose K cluster centres randomly

Each data point is assigned to its closest centroid

Use the mean of each cluster to update each centroid

Repeat until no more new assignment

Return the K centroids

Reference J MacQueen (1967) Some Methods for Classification and Analysis of

Multivariate Observations Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability vol1 pp 281-297

19

Comments on K-Means

Pros Simple and works well for regular disjoint clusters Converges relatively fast Relatively efficient and scalable O(tkn)

bull t iteration k number of centroids n number of data points

Cons Need to specify the value of K in advance

bull Difficult and domain knowledge may help May converge to local optima

bull In practice try different initial centroids May be sensitive to noisy data and outliers

bull Mean of data points hellip Not suitable for clusters of

bull Non-convex shapes

20

The Influence of Initial Centroids

21

The Influence of Initial Centroids

22

The K-Medoids Method

The basic idea is to use real data points as centres

Determine the value of K in advance

Randomly select K points as medoids

Assign each data point to the closest medoid

Calculate the cost of the configuration J

For each medoid m For each non-medoid point o Swap m and o and calculate the new cost of configuration Jprime

If the cost of the best new configuration J is lower than J make the corresponding swap and repeat the above steps

Otherwise terminate the procedure23

The K-Medoids Method

24

Cost =20 Cost =26

Sequential Leader Clustering

A very efficient clustering algorithm No iteration Time complexity O(nk)

No need to specify K in advance

Choose a cluster threshold value

For every new data point Compute the distance between the new data point and every clusters centre

If the distance is smaller than the chosen threshold assign the new data point to the corresponding cluster and re-compute cluster centre

Otherwise create a new cluster with the new data point as its centre

Clustering results may be influenced by the sequence of data points

25

Silhouette

A method of interpretation and validation of clusters of data

A succinct graphical representation of how well each data point lies within its cluster compared to other clusters

a(i) average dissimilarity of i with all other points in the same cluster

b(i) the lowest average dissimilarity of i to other clusters

26

)()(max)()()(iaibiaibis

Silhouette

27

-02 0 02 04 06 08 1

1

2

Silhouette Value

Clu

ster

-3 -2 -1 0 1 2 3 4-3

-2

-1

0

1

2

3

4

Gaussian Mixture

28

)2()(

2

22

21)(

xexg

1amp0)()(1

i

ii

n

iiii xgxf

Clustering by Mixture Models

29

K-Means Revisited

30

120579=(1199091 1199101 ) (1199092 119910 2)

119885=119862119897119906119904119905119890119903 1 119862119897119906119904119905119890119903 2

model parameters

latent parameters

Expectation Maximization

31

32

EM Gaussian Mixture

33

Gaussianjth by the generated is i instancer whethecomponents mixture ofnumber the

points data ofnumber the

ijznm

n

kk

x

j

x

n

kkki

jjiij

ki

ji

e

e

xxp

xxpzE

1

)(2

1

)(2

1

1

22

22

)|(

)|(][

m

iij

m

iiij

j

zE

xzE

1

1

][

][

m

iijj zE

m 1

][1

Density Based Methods

Generate clusters of arbitrary shapes

Robust against noise

No K value required in advance

Somewhat similar to human vision

34

DBSCAN

Density-Based Spatial Clustering of Applications with Noise

Density number of points within a specified radius

Core Point points with high density

Border Point points with low density but in the neighbourhood of a core point

Noise Point neither a core point nor a border point

35

Core Point

Noise Point

Border Point

DBSCAN

36

p

q

directly density reachable

p

q

density reachable

o

qp

density connected

DBSCAN

A cluster is defined as the maximal set of density connected points Start from a randomly selected unseen point P If P is a core point build a cluster by gradually adding all points that are

density reachable to the current point set Noise points are discarded (unlabelled)

37

Hierarchical Clustering

Produce a set of nested tree-like clusters

Can be visualized as a dendrogram Clustering is obtained by cutting at desired level No need to specify K in advance May correspond to meaningful taxonomies

38

Dinosaur Family Tree

39

Agglomerative Methods

Bottom-up Method

Assign each data point to a cluster

Calculate the proximity matrix

Merge the pair of closest clusters

Repeat until only a single cluster remains

How to calculate the distance between clusters

Single Link Minimum distance between points

Complete Link Maximum distance between points

40

Example

41

BA FI MI NA RM TO

BA 0 662 877 255 412 996

FI 662 0 295 468 268 400

MI 877 295 0 754 564 138

NA 255 468 754 0 219 869

RM 412 268 564 219 0 669

TO 996 400 138 869 669 0Single Link

Example

42

BA FI MITO NA RM

BA 0 662 877 255 412

FI 662 0 295 468 268

MITO 877 295 0 754 564

NA 255 468 754 0 219

RM 412 268 564 219 0

BA FI MITO NARM

BA 0 662 877 255FI 662 0 295 268

MITO 877 295 0 564

NARM 255 268 564 0

Example

43

BANARM FI MITO

BANARM 0 268 564FI 268 0 295

MITO 564 295 0

BAFINARM MITO

BAFINARM 0 295

MITO 295 0

Min vs Max

44

Reading Materials

Text Books Richard O Duda et al Pattern Classification Chapter 10 John Wiley amp Sons J Han and M Kamber Data Mining Concepts and Techniques Chapter 8

Morgan Kaufmann

Survey Papers A K Jain M N Murty and P J Flynn (1999) ldquoData Clustering A Reviewrdquo ACM

Computing Surveys Vol 31(3) pp 264-323 R Xu and D Wunsch (2005) ldquoSurvey of Clustering Algorithmsrdquo IEEE Transactions

on Neural Networks Vol 16(3) pp 645-678 A K Jain (2010) ldquoData Clustering 50 Years Beyond K-Meansrdquo Pattern

Recognition Letters Vol 31 pp 651-666

Online Tutorials httphomedeipolimiitmatteuccClusteringtutorial_html httpwwwautonlaborgtutorialskmeanshtml httpusersinformatikuni-hallede~hinneburClusterTutorial

45

Review

What is clustering

What are the two categories of clustering methods

How does the K-Means algorithm work

What are the major issues of K-Means

How to control the number of clusters in Sequential Leader Clustering

How to use Gaussian mixture models for clustering

What are the main advantages of density methods

What is the core idea of DBSCAN

What is the general procedure of hierarchical clustering

Which clustering methods do not require K as the input

46

Next Weekrsquos Class Talk

Volunteers are required for next weekrsquos class talk

Topic Affinity Propagation Science 315 972ndash976 2007 Clustering by passing messages between points httpwwwpsitorontoeduindexphpq=affinity20propagation

Topic Clustering by Fast Search and Find of Density Peaks Science 344 1492ndash1496 2014 Cluster centers higher density than neighbors Cluster centers distant from others points with higher densities

Length 20 minutes plus question time

47

Assignment

Topic Clustering Techniques and Applications

Techniques K-Means Another clustering method for comparison

Task 1 2D Artificial Datasets To demonstrate the influence of data patterns To demonstrate the influence of algorithm factors

Task 2 Image Segmentation Gray vs Colour

Deliverables Reports (experiment specification algorithm parameters in-depth analysis) Code (any programming language with detailed comments)

Due Sunday 28 December

Credit 1548

  • Clustering
  • Overview
  • What is cluster analysis
  • Clusters
  • Clusters (2)
  • Applications of Clustering
  • Earthquakes
  • Image Segmentation
  • The Big Picture
  • Requirements
  • Practical Considerations
  • Normalization or Not
  • Evaluation
  • Evaluation (2)
  • The Influence of Outliers
  • K-Means
  • K-Means (2)
  • K-Means (3)
  • K-Means (4)
  • Comments on K-Means
  • The Influence of Initial Centroids
  • The Influence of Initial Centroids (2)
  • The K-Medoids Method
  • The K-Medoids Method (2)
  • Sequential Leader Clustering
  • Silhouette
  • Silhouette (2)
  • Gaussian Mixture
  • Clustering by Mixture Models
  • K-Means Revisited
  • Expectation Maximization
  • Slide 32
  • EM Gaussian Mixture
  • Density Based Methods
  • DBSCAN
  • DBSCAN (2)
  • DBSCAN (3)
  • Hierarchical Clustering
  • Dinosaur Family Tree
  • Agglomerative Methods
  • Example
  • Example (2)
  • Example (3)
  • Min vs Max
  • Reading Materials
  • Review
  • Next Weekrsquos Class Talk
  • Assignment
Page 12: Clustering

Normalization or Not

12

Evaluation

13

ii Dxi

i

c

i Dxie x

nmmxJ 1

1

2

VS

Evaluation

14

The Influence of Outliers

15

outlier

K=2

K-Means

16

K-Means

17

K-Means

18

K-Means

Determine the value of K

Choose K cluster centres randomly

Each data point is assigned to its closest centroid

Use the mean of each cluster to update each centroid

Repeat until no more new assignment

Return the K centroids

Reference J MacQueen (1967) Some Methods for Classification and Analysis of

Multivariate Observations Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability vol1 pp 281-297

19

Comments on K-Means

Pros Simple and works well for regular disjoint clusters Converges relatively fast Relatively efficient and scalable O(tkn)

bull t iteration k number of centroids n number of data points

Cons Need to specify the value of K in advance

bull Difficult and domain knowledge may help May converge to local optima

bull In practice try different initial centroids May be sensitive to noisy data and outliers

bull Mean of data points hellip Not suitable for clusters of

bull Non-convex shapes

20

The Influence of Initial Centroids

21

The Influence of Initial Centroids

22

The K-Medoids Method

The basic idea is to use real data points as centres

Determine the value of K in advance

Randomly select K points as medoids

Assign each data point to the closest medoid

Calculate the cost of the configuration J

For each medoid m For each non-medoid point o Swap m and o and calculate the new cost of configuration Jprime

If the cost of the best new configuration J is lower than J make the corresponding swap and repeat the above steps

Otherwise terminate the procedure23

The K-Medoids Method

24

Cost =20 Cost =26

Sequential Leader Clustering

A very efficient clustering algorithm No iteration Time complexity O(nk)

No need to specify K in advance

Choose a cluster threshold value

For every new data point Compute the distance between the new data point and every clusters centre

If the distance is smaller than the chosen threshold assign the new data point to the corresponding cluster and re-compute cluster centre

Otherwise create a new cluster with the new data point as its centre

Clustering results may be influenced by the sequence of data points

25

Silhouette

A method of interpretation and validation of clusters of data

A succinct graphical representation of how well each data point lies within its cluster compared to other clusters

a(i) average dissimilarity of i with all other points in the same cluster

b(i) the lowest average dissimilarity of i to other clusters

26

)()(max)()()(iaibiaibis

Silhouette

27

-02 0 02 04 06 08 1

1

2

Silhouette Value

Clu

ster

-3 -2 -1 0 1 2 3 4-3

-2

-1

0

1

2

3

4

Gaussian Mixture

28

)2()(

2

22

21)(

xexg

1amp0)()(1

i

ii

n

iiii xgxf

Clustering by Mixture Models

29

K-Means Revisited

30

120579=(1199091 1199101 ) (1199092 119910 2)

119885=119862119897119906119904119905119890119903 1 119862119897119906119904119905119890119903 2

model parameters

latent parameters

Expectation Maximization

31

32

EM Gaussian Mixture

33

Gaussianjth by the generated is i instancer whethecomponents mixture ofnumber the

points data ofnumber the

ijznm

n

kk

x

j

x

n

kkki

jjiij

ki

ji

e

e

xxp

xxpzE

1

)(2

1

)(2

1

1

22

22

)|(

)|(][

m

iij

m

iiij

j

zE

xzE

1

1

][

][

m

iijj zE

m 1

][1

Density Based Methods

Generate clusters of arbitrary shapes

Robust against noise

No K value required in advance

Somewhat similar to human vision

34

DBSCAN

Density-Based Spatial Clustering of Applications with Noise

Density number of points within a specified radius

Core Point points with high density

Border Point points with low density but in the neighbourhood of a core point

Noise Point neither a core point nor a border point

35

Core Point

Noise Point

Border Point

DBSCAN

36

p

q

directly density reachable

p

q

density reachable

o

qp

density connected

DBSCAN

A cluster is defined as the maximal set of density connected points Start from a randomly selected unseen point P If P is a core point build a cluster by gradually adding all points that are

density reachable to the current point set Noise points are discarded (unlabelled)

37

Hierarchical Clustering

Produce a set of nested tree-like clusters

Can be visualized as a dendrogram Clustering is obtained by cutting at desired level No need to specify K in advance May correspond to meaningful taxonomies

38

Dinosaur Family Tree

39

Agglomerative Methods

Bottom-up Method

Assign each data point to a cluster

Calculate the proximity matrix

Merge the pair of closest clusters

Repeat until only a single cluster remains

How to calculate the distance between clusters

Single Link Minimum distance between points

Complete Link Maximum distance between points

40

Example

41

BA FI MI NA RM TO

BA 0 662 877 255 412 996

FI 662 0 295 468 268 400

MI 877 295 0 754 564 138

NA 255 468 754 0 219 869

RM 412 268 564 219 0 669

TO 996 400 138 869 669 0Single Link

Example

42

BA FI MITO NA RM

BA 0 662 877 255 412

FI 662 0 295 468 268

MITO 877 295 0 754 564

NA 255 468 754 0 219

RM 412 268 564 219 0

BA FI MITO NARM

BA 0 662 877 255FI 662 0 295 268

MITO 877 295 0 564

NARM 255 268 564 0

Example

43

BANARM FI MITO

BANARM 0 268 564FI 268 0 295

MITO 564 295 0

BAFINARM MITO

BAFINARM 0 295

MITO 295 0

Min vs Max

44

Reading Materials

Text Books Richard O Duda et al Pattern Classification Chapter 10 John Wiley amp Sons J Han and M Kamber Data Mining Concepts and Techniques Chapter 8

Morgan Kaufmann

Survey Papers A K Jain M N Murty and P J Flynn (1999) ldquoData Clustering A Reviewrdquo ACM

Computing Surveys Vol 31(3) pp 264-323 R Xu and D Wunsch (2005) ldquoSurvey of Clustering Algorithmsrdquo IEEE Transactions

on Neural Networks Vol 16(3) pp 645-678 A K Jain (2010) ldquoData Clustering 50 Years Beyond K-Meansrdquo Pattern

Recognition Letters Vol 31 pp 651-666

24

Cost = 20 vs. Cost = 26
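A minimal sketch of the configuration cost J being compared above, assuming Euclidean distance and NumPy (both are assumptions; the slides do not prescribe either):

import numpy as np

def configuration_cost(X, medoid_idx):
    # J: sum of distances from every data point to its closest medoid
    d = np.linalg.norm(X[:, None, :] - X[medoid_idx][None, :, :], axis=2)
    return d.min(axis=1).sum()

# Compare two candidate configurations; a swap is kept only if the cost drops.
X = np.array([[0., 0.], [1., 0.], [0., 1.], [8., 8.], [9., 8.]])
print(configuration_cost(X, [0, 3]), configuration_cost(X, [1, 3]))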

Sequential Leader Clustering

A very efficient clustering algorithm: no iteration, time complexity O(nk)

No need to specify K in advance

Choose a cluster threshold value

For every new data point, compute the distance between it and every cluster's centre

If the smallest such distance is below the chosen threshold, assign the new data point to the corresponding cluster and re-compute that cluster's centre

Otherwise, create a new cluster with the new data point as its centre

Clustering results may be influenced by the sequence of data points

25
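A compact sketch of the procedure above (Euclidean distance and the incremental mean update are assumptions; variable names are illustrative):

import numpy as np

def sequential_leader(points, threshold):
    centres, counts, labels = [], [], []
    for x in points:
        x = np.asarray(x, dtype=float)
        d = [np.linalg.norm(x - c) for c in centres]
        j = int(np.argmin(d)) if d else -1
        if j >= 0 and d[j] < threshold:
            counts[j] += 1                              # join the nearest cluster
            centres[j] += (x - centres[j]) / counts[j]  # re-compute its centre
            labels.append(j)
        else:
            centres.append(x.copy())                    # start a new cluster
            counts.append(1)
            labels.append(len(centres) - 1)
    return centres, labels

# Different input orders can give different clusterings.
print(sequential_leader([[0, 0], [0.4, 0], [5, 5], [0.2, 0.1]], threshold=1.0)[1])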

Silhouette

A method of interpretation and validation of clusters of data

A succinct graphical representation of how well each data point lies within its cluster compared to other clusters

a(i): the average dissimilarity of point i to all other points in the same cluster

b(i): the lowest average dissimilarity of point i to the points of any other cluster

26

$$s(i) = \frac{b(i) - a(i)}{\max\{a(i),\ b(i)\}}$$
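A small hand-rolled sketch of this definition (it assumes every cluster has at least two points; library routines would normally be used):

import numpy as np

def silhouette(i, X, labels):
    d = np.linalg.norm(X - X[i], axis=1)          # distances from point i
    others = np.arange(len(X)) != i
    a = d[(labels == labels[i]) & others].mean()  # a(i): own-cluster dissimilarity
    b = min(d[labels == c].mean()                 # b(i): best other cluster
            for c in set(labels) if c != labels[i])
    return (b - a) / max(a, b)

X = np.array([[0., 0.], [0., 1.], [5., 5.], [5., 6.]])
labels = np.array([0, 0, 1, 1])
print([round(silhouette(i, X, labels), 3) for i in range(4)])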

Silhouette

27

[Figure: silhouette plot of a two-cluster example, showing each point's silhouette value (roughly -0.2 to 1) grouped by cluster, together with a scatter plot of the underlying 2D data.]

Gaussian Mixture

28

$$g(x;\ \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

$$f(x) = \sum_{i=1}^{n} \alpha_i\, g(x;\ \mu_i, \sigma_i), \qquad \alpha_i \ge 0, \quad \sum_{i=1}^{n} \alpha_i = 1$$
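As a quick illustration of these two formulas (the weights, means, and variances below are made-up numbers):

import numpy as np

def gaussian(x, mu, sigma):
    # g(x; mu, sigma): the normal density
    return np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (np.sqrt(2 * np.pi) * sigma)

def mixture_pdf(x, alphas, mus, sigmas):
    # f(x) = sum_i alpha_i * g(x; mu_i, sigma_i), with the alphas summing to 1
    return sum(a * gaussian(x, m, s) for a, m, s in zip(alphas, mus, sigmas))

print(mixture_pdf(0.5, [0.3, 0.7], [0.0, 2.0], [1.0, 0.5]))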

Clustering by Mixture Models

29

K-Means Revisited

30

$$\theta = \big((x_1, y_1),\ (x_2, y_2)\big)$$: model parameters (the cluster centres)

$$Z = (\text{Cluster 1},\ \text{Cluster 2})$$: latent parameters (the cluster assignments)

Expectation Maximization

31

32

EM Gaussian Mixture

33

Notation: m is the number of data points, n is the number of mixture components, and z_ij indicates whether instance i is generated by the j-th Gaussian.

E-step: compute the expected responsibilities (assuming a shared variance $\sigma^2$):

$$E[z_{ij}] = \frac{p(x = x_i \mid \mu = \mu_j)}{\sum_{k=1}^{n} p(x = x_i \mid \mu = \mu_k)} = \frac{e^{-(x_i - \mu_j)^2 / 2\sigma^2}}{\sum_{k=1}^{n} e^{-(x_i - \mu_k)^2 / 2\sigma^2}}$$

M-step: re-estimate the means and the mixing weights:

$$\mu_j = \frac{\sum_{i=1}^{m} E[z_{ij}]\, x_i}{\sum_{i=1}^{m} E[z_{ij}]}, \qquad \alpha_j = \frac{1}{m} \sum_{i=1}^{m} E[z_{ij}]$$
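A minimal EM sketch mirroring the E-step and M-step above for 1-D data with a fixed, shared variance (the fixed sigma and the initialisation are simplifying assumptions, and this version also weights the E-step by the current mixing weights, a small extension of the formula above):

import numpy as np

def em_gmm_1d(x, n_components, sigma=1.0, n_iter=50, seed=0):
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    mu = rng.choice(x, n_components, replace=False)   # initial means
    alpha = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: responsibilities E[z_ij], one row per data point
        r = alpha * np.exp(-(x[:, None] - mu[None, :]) ** 2 / (2 * sigma ** 2))
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate means and mixing weights
        mu = (r * x[:, None]).sum(axis=0) / r.sum(axis=0)
        alpha = r.mean(axis=0)
    return mu, alpha

mu, alpha = em_gmm_1d([0.1, -0.2, 0.3, 4.9, 5.2, 5.1], n_components=2)
print(mu.round(2), alpha.round(2))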

Density Based Methods

Generate clusters of arbitrary shapes

Robust against noise

No K value required in advance

Somewhat similar to human vision

34

DBSCAN

Density-Based Spatial Clustering of Applications with Noise

Density: the number of points within a specified radius

Core Point: a point with high density

Border Point: a point with low density, but in the neighbourhood of a core point

Noise Point: neither a core point nor a border point

35

[Figure: an example neighbourhood illustrating a core point, a border point, and a noise point.]

DBSCAN

36

[Figure: q is directly density reachable from core point p; q is density reachable from p via a chain of directly density reachable points; p and q are density connected through a common point o.]

DBSCAN

A cluster is defined as a maximal set of density connected points. Start from a randomly selected unseen point P. If P is a core point, build a cluster by gradually adding all points that are density reachable from the current point set. Noise points are discarded (unlabelled).

37
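A naive O(n²) sketch of this procedure (the parameter names eps and min_pts are conventional choices, not from the slides):

import numpy as np

def dbscan(X, eps, min_pts):
    n = len(X)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    neigh = [np.flatnonzero(dist[i] <= eps) for i in range(n)]
    labels = np.full(n, -1)            # -1 marks noise (unlabelled)
    visited = np.zeros(n, dtype=bool)
    cluster = 0
    for p in range(n):
        if visited[p] or len(neigh[p]) < min_pts:
            continue                   # only an unseen core point seeds a cluster
        stack = [p]
        while stack:                   # grow the cluster via density reachability
            q = stack.pop()
            if visited[q]:
                continue
            visited[q] = True
            labels[q] = cluster
            if len(neigh[q]) >= min_pts:    # only core points expand the frontier
                stack.extend(neigh[q])
        cluster += 1
    return labels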

Hierarchical Clustering

Produce a set of nested tree-like clusters

Can be visualized as a dendrogram: a clustering is obtained by cutting it at the desired level. No need to specify K in advance. May correspond to meaningful taxonomies.

38
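In practice the dendrogram and its cut are one call away in SciPy (using SciPy here is an assumption; the lecture does not prescribe a library):

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster, dendrogram
from scipy.spatial.distance import pdist

X = np.random.rand(20, 2)
Z = linkage(pdist(X), method="single")            # bottom-up merge history
labels = fcluster(Z, t=3, criterion="maxclust")   # cut the tree into 3 clusters
tree = dendrogram(Z, no_plot=True)                # tree layout (plot with matplotlib)
print(labels)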

Dinosaur Family Tree

39

Agglomerative Methods

Bottom-up Method

Assign each data point to its own cluster

Calculate the proximity matrix

Merge the pair of closest clusters

Repeat until only a single cluster remains

How to calculate the distance between clusters?

Single Link: minimum distance between points

Complete Link: maximum distance between points

40
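A hand-rolled single-link sketch over a precomputed distance matrix; fed the city distances from the example below, it reproduces the merge sequence shown there (the printing is illustrative):

import numpy as np

def single_link(dist, names):
    # Repeatedly merge the closest pair of clusters (single link: minimum distance).
    dist = dist.astype(float)
    np.fill_diagonal(dist, np.inf)
    names = list(names)
    while len(names) > 1:
        i, j = np.unravel_index(np.argmin(dist), dist.shape)
        i, j = min(i, j), max(i, j)
        print(f"merge {names[i]} + {names[j]} at {dist[i, j]:.0f}")
        merged = np.minimum(dist[i], dist[j])   # single-link distance update
        dist[i, :] = merged
        dist[:, i] = merged
        dist[i, i] = np.inf
        dist = np.delete(np.delete(dist, j, axis=0), j, axis=1)
        names[i] = names[i] + "/" + names.pop(j)

cities = ["BA", "FI", "MI", "NA", "RM", "TO"]
D = np.array([[  0, 662, 877, 255, 412, 996],
              [662,   0, 295, 468, 268, 400],
              [877, 295,   0, 754, 564, 138],
              [255, 468, 754,   0, 219, 869],
              [412, 268, 564, 219,   0, 669],
              [996, 400, 138, 869, 669,   0]])
single_link(D, cities)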

Example

41

Single Link

      BA   FI   MI   NA   RM   TO
BA     0  662  877  255  412  996
FI   662    0  295  468  268  400
MI   877  295    0  754  564  138
NA   255  468  754    0  219  869
RM   412  268  564  219    0  669
TO   996  400  138  869  669    0

Example

42

After merging MI and TO:

       BA   FI  MI/TO  NA   RM
BA      0  662   877  255  412
FI    662    0   295  468  268
MI/TO 877  295     0  754  564
NA    255  468   754    0  219
RM    412  268   564  219    0

After merging NA and RM:

       BA   FI  MI/TO NA/RM
BA      0  662   877   255
FI    662    0   295   268
MI/TO 877  295     0   564
NA/RM 255  268   564     0

Example

43

After merging BA with NA/RM:

          BA/NA/RM   FI  MI/TO
BA/NA/RM        0   268   564
FI            268     0   295
MI/TO         564   295     0

After merging FI with BA/NA/RM:

             BA/FI/NA/RM  MI/TO
BA/FI/NA/RM            0    295
MI/TO                295      0

Min vs Max

44

Reading Materials

Text Books:
Richard O. Duda et al., Pattern Classification, Chapter 10, John Wiley & Sons.
J. Han and M. Kamber, Data Mining: Concepts and Techniques, Chapter 8, Morgan Kaufmann.

Survey Papers:
A. K. Jain, M. N. Murty and P. J. Flynn (1999). "Data Clustering: A Review". ACM Computing Surveys, Vol. 31(3), pp. 264-323.
R. Xu and D. Wunsch (2005). "Survey of Clustering Algorithms". IEEE Transactions on Neural Networks, Vol. 16(3), pp. 645-678.
A. K. Jain (2010). "Data Clustering: 50 Years Beyond K-Means". Pattern Recognition Letters, Vol. 31, pp. 651-666.

Online Tutorials:
http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/
http://www.autonlab.org/tutorials/kmeans.html
http://users.informatik.uni-halle.de/~hinneburg/ClusterTutorial/

45

Review

What is clustering?

What are the two categories of clustering methods?

How does the K-Means algorithm work?

What are the major issues of K-Means?

How to control the number of clusters in Sequential Leader Clustering?

How to use Gaussian mixture models for clustering?

What are the main advantages of density based methods?

What is the core idea of DBSCAN?

What is the general procedure of hierarchical clustering?

Which clustering methods do not require K as input?

46

Next Week's Class Talk

Volunteers are required for next week's class talk.

Topic: Affinity Propagation. Science, 315, 972–976, 2007. Clustering by passing messages between points. http://www.psi.toronto.edu/index.php?q=affinity%20propagation

Topic: Clustering by Fast Search and Find of Density Peaks. Science, 344, 1492–1496, 2014. Cluster centers have higher density than their neighbors, and are relatively distant from other points with higher densities.

Length: 20 minutes, plus question time

47

Assignment

Topic: Clustering Techniques and Applications

Techniques: K-Means, and another clustering method for comparison

Task 1: 2D artificial datasets, to demonstrate the influence of data patterns and of algorithm factors

Task 2: image segmentation, gray vs. colour

Deliverables: report (experiment specification, algorithm parameters, in-depth analysis) and code (any programming language, with detailed comments)

Due: Sunday, 28 December

Credit: 15

48

  • Clustering
  • Overview
  • What is cluster analysis
  • Clusters
  • Clusters (2)
  • Applications of Clustering
  • Earthquakes
  • Image Segmentation
  • The Big Picture
  • Requirements
  • Practical Considerations
  • Normalization or Not
  • Evaluation
  • Evaluation (2)
  • The Influence of Outliers
  • K-Means
  • K-Means (2)
  • K-Means (3)
  • K-Means (4)
  • Comments on K-Means
  • The Influence of Initial Centroids
  • The Influence of Initial Centroids (2)
  • The K-Medoids Method
  • The K-Medoids Method (2)
  • Sequential Leader Clustering
  • Silhouette
  • Silhouette (2)
  • Gaussian Mixture
  • Clustering by Mixture Models
  • K-Means Revisited
  • Expectation Maximization
  • Slide 32
  • EM Gaussian Mixture
  • Density Based Methods
  • DBSCAN
  • DBSCAN (2)
  • DBSCAN (3)
  • Hierarchical Clustering
  • Dinosaur Family Tree
  • Agglomerative Methods
  • Example
  • Example (2)
  • Example (3)
  • Min vs Max
  • Reading Materials
  • Review
  • Next Weekrsquos Class Talk
  • Assignment
Page 14: Clustering

Evaluation

14

The Influence of Outliers

15

outlier

K=2

K-Means

16

K-Means

17

K-Means

18

K-Means

Determine the value of K

Choose K cluster centres randomly

Each data point is assigned to its closest centroid

Use the mean of each cluster to update each centroid

Repeat until no more new assignment

Return the K centroids

Reference J MacQueen (1967) Some Methods for Classification and Analysis of

Multivariate Observations Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability vol1 pp 281-297

19

Comments on K-Means

Pros Simple and works well for regular disjoint clusters Converges relatively fast Relatively efficient and scalable O(tkn)

bull t iteration k number of centroids n number of data points

Cons Need to specify the value of K in advance

bull Difficult and domain knowledge may help May converge to local optima

bull In practice try different initial centroids May be sensitive to noisy data and outliers

bull Mean of data points hellip Not suitable for clusters of

bull Non-convex shapes

20

The Influence of Initial Centroids

21

The Influence of Initial Centroids

22

The K-Medoids Method

The basic idea is to use real data points as centres

Determine the value of K in advance

Randomly select K points as medoids

Assign each data point to the closest medoid

Calculate the cost of the configuration J

For each medoid m For each non-medoid point o Swap m and o and calculate the new cost of configuration Jprime

If the cost of the best new configuration J is lower than J make the corresponding swap and repeat the above steps

Otherwise terminate the procedure23

The K-Medoids Method

24

Cost =20 Cost =26

Sequential Leader Clustering

A very efficient clustering algorithm No iteration Time complexity O(nk)

No need to specify K in advance

Choose a cluster threshold value

For every new data point Compute the distance between the new data point and every clusters centre

If the distance is smaller than the chosen threshold assign the new data point to the corresponding cluster and re-compute cluster centre

Otherwise create a new cluster with the new data point as its centre

Clustering results may be influenced by the sequence of data points

25

Silhouette

A method of interpretation and validation of clusters of data

A succinct graphical representation of how well each data point lies within its cluster compared to other clusters

a(i) average dissimilarity of i with all other points in the same cluster

b(i) the lowest average dissimilarity of i to other clusters

26

)()(max)()()(iaibiaibis

Silhouette

27

-02 0 02 04 06 08 1

1

2

Silhouette Value

Clu

ster

-3 -2 -1 0 1 2 3 4-3

-2

-1

0

1

2

3

4

Gaussian Mixture

28

)2()(

2

22

21)(

xexg

1amp0)()(1

i

ii

n

iiii xgxf

Clustering by Mixture Models

29

K-Means Revisited

30

120579=(1199091 1199101 ) (1199092 119910 2)

119885=119862119897119906119904119905119890119903 1 119862119897119906119904119905119890119903 2

model parameters

latent parameters

Expectation Maximization

31

32

EM Gaussian Mixture

33

Gaussianjth by the generated is i instancer whethecomponents mixture ofnumber the

points data ofnumber the

ijznm

n

kk

x

j

x

n

kkki

jjiij

ki

ji

e

e

xxp

xxpzE

1

)(2

1

)(2

1

1

22

22

)|(

)|(][

m

iij

m

iiij

j

zE

xzE

1

1

][

][

m

iijj zE

m 1

][1

Density Based Methods

Generate clusters of arbitrary shapes

Robust against noise

No K value required in advance

Somewhat similar to human vision

34

DBSCAN

Density-Based Spatial Clustering of Applications with Noise

Density number of points within a specified radius

Core Point points with high density

Border Point points with low density but in the neighbourhood of a core point

Noise Point neither a core point nor a border point

35

Core Point

Noise Point

Border Point

DBSCAN

36

p

q

directly density reachable

p

q

density reachable

o

qp

density connected

DBSCAN

A cluster is defined as the maximal set of density connected points Start from a randomly selected unseen point P If P is a core point build a cluster by gradually adding all points that are

density reachable to the current point set Noise points are discarded (unlabelled)

37

Hierarchical Clustering

Produce a set of nested tree-like clusters

Can be visualized as a dendrogram Clustering is obtained by cutting at desired level No need to specify K in advance May correspond to meaningful taxonomies

38

Dinosaur Family Tree

39

Agglomerative Methods

Bottom-up Method

Assign each data point to a cluster

Calculate the proximity matrix

Merge the pair of closest clusters

Repeat until only a single cluster remains

How to calculate the distance between clusters

Single Link Minimum distance between points

Complete Link Maximum distance between points

40

Example

41

BA FI MI NA RM TO

BA 0 662 877 255 412 996

FI 662 0 295 468 268 400

MI 877 295 0 754 564 138

NA 255 468 754 0 219 869

RM 412 268 564 219 0 669

TO 996 400 138 869 669 0Single Link

Example

42

BA FI MITO NA RM

BA 0 662 877 255 412

FI 662 0 295 468 268

MITO 877 295 0 754 564

NA 255 468 754 0 219

RM 412 268 564 219 0

BA FI MITO NARM

BA 0 662 877 255FI 662 0 295 268

MITO 877 295 0 564

NARM 255 268 564 0

Example

43

BANARM FI MITO

BANARM 0 268 564FI 268 0 295

MITO 564 295 0

BAFINARM MITO

BAFINARM 0 295

MITO 295 0

Min vs Max

44

Reading Materials

Text Books Richard O Duda et al Pattern Classification Chapter 10 John Wiley amp Sons J Han and M Kamber Data Mining Concepts and Techniques Chapter 8

Morgan Kaufmann

Survey Papers A K Jain M N Murty and P J Flynn (1999) ldquoData Clustering A Reviewrdquo ACM

Computing Surveys Vol 31(3) pp 264-323 R Xu and D Wunsch (2005) ldquoSurvey of Clustering Algorithmsrdquo IEEE Transactions

on Neural Networks Vol 16(3) pp 645-678 A K Jain (2010) ldquoData Clustering 50 Years Beyond K-Meansrdquo Pattern

Recognition Letters Vol 31 pp 651-666

Online Tutorials httphomedeipolimiitmatteuccClusteringtutorial_html httpwwwautonlaborgtutorialskmeanshtml httpusersinformatikuni-hallede~hinneburClusterTutorial

45

Review

What is clustering

What are the two categories of clustering methods

How does the K-Means algorithm work

What are the major issues of K-Means

How to control the number of clusters in Sequential Leader Clustering

How to use Gaussian mixture models for clustering

What are the main advantages of density methods

What is the core idea of DBSCAN

What is the general procedure of hierarchical clustering

Which clustering methods do not require K as the input

46

Next Weekrsquos Class Talk

Volunteers are required for next weekrsquos class talk

Topic Affinity Propagation Science 315 972ndash976 2007 Clustering by passing messages between points httpwwwpsitorontoeduindexphpq=affinity20propagation

Topic Clustering by Fast Search and Find of Density Peaks Science 344 1492ndash1496 2014 Cluster centers higher density than neighbors Cluster centers distant from others points with higher densities

Length 20 minutes plus question time

47

Assignment

Topic Clustering Techniques and Applications

Techniques K-Means Another clustering method for comparison

Task 1 2D Artificial Datasets To demonstrate the influence of data patterns To demonstrate the influence of algorithm factors

Task 2 Image Segmentation Gray vs Colour

Deliverables Reports (experiment specification algorithm parameters in-depth analysis) Code (any programming language with detailed comments)

Due Sunday 28 December

Credit 1548

  • Clustering
  • Overview
  • What is cluster analysis
  • Clusters
  • Clusters (2)
  • Applications of Clustering
  • Earthquakes
  • Image Segmentation
  • The Big Picture
  • Requirements
  • Practical Considerations
  • Normalization or Not
  • Evaluation
  • Evaluation (2)
  • The Influence of Outliers
  • K-Means
  • K-Means (2)
  • K-Means (3)
  • K-Means (4)
  • Comments on K-Means
  • The Influence of Initial Centroids
  • The Influence of Initial Centroids (2)
  • The K-Medoids Method
  • The K-Medoids Method (2)
  • Sequential Leader Clustering
  • Silhouette
  • Silhouette (2)
  • Gaussian Mixture
  • Clustering by Mixture Models
  • K-Means Revisited
  • Expectation Maximization
  • Slide 32
  • EM Gaussian Mixture
  • Density Based Methods
  • DBSCAN
  • DBSCAN (2)
  • DBSCAN (3)
  • Hierarchical Clustering
  • Dinosaur Family Tree
  • Agglomerative Methods
  • Example
  • Example (2)
  • Example (3)
  • Min vs Max
  • Reading Materials
  • Review
  • Next Weekrsquos Class Talk
  • Assignment
Page 15: Clustering

The Influence of Outliers

15

outlier

K=2

K-Means

16

K-Means

17

K-Means

18

K-Means

Determine the value of K

Choose K cluster centres randomly

Each data point is assigned to its closest centroid

Use the mean of each cluster to update each centroid

Repeat until no more new assignment

Return the K centroids

Reference J MacQueen (1967) Some Methods for Classification and Analysis of

Multivariate Observations Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability vol1 pp 281-297

19

Comments on K-Means

Pros Simple and works well for regular disjoint clusters Converges relatively fast Relatively efficient and scalable O(tkn)

bull t iteration k number of centroids n number of data points

Cons Need to specify the value of K in advance

bull Difficult and domain knowledge may help May converge to local optima

bull In practice try different initial centroids May be sensitive to noisy data and outliers

bull Mean of data points hellip Not suitable for clusters of

bull Non-convex shapes

20

The Influence of Initial Centroids

21

The Influence of Initial Centroids

22

The K-Medoids Method

The basic idea is to use real data points as centres

Determine the value of K in advance

Randomly select K points as medoids

Assign each data point to the closest medoid

Calculate the cost of the configuration J

For each medoid m For each non-medoid point o Swap m and o and calculate the new cost of configuration Jprime

If the cost of the best new configuration J is lower than J make the corresponding swap and repeat the above steps

Otherwise terminate the procedure23

The K-Medoids Method

24

Cost =20 Cost =26

Sequential Leader Clustering

A very efficient clustering algorithm No iteration Time complexity O(nk)

No need to specify K in advance

Choose a cluster threshold value

For every new data point Compute the distance between the new data point and every clusters centre

If the distance is smaller than the chosen threshold assign the new data point to the corresponding cluster and re-compute cluster centre

Otherwise create a new cluster with the new data point as its centre

Clustering results may be influenced by the sequence of data points

25

Silhouette

A method of interpretation and validation of clusters of data

A succinct graphical representation of how well each data point lies within its cluster compared to other clusters

a(i) average dissimilarity of i with all other points in the same cluster

b(i) the lowest average dissimilarity of i to other clusters

26

)()(max)()()(iaibiaibis

Silhouette

27

-02 0 02 04 06 08 1

1

2

Silhouette Value

Clu

ster

-3 -2 -1 0 1 2 3 4-3

-2

-1

0

1

2

3

4

Gaussian Mixture

28

)2()(

2

22

21)(

xexg

1amp0)()(1

i

ii

n

iiii xgxf

Clustering by Mixture Models

29

K-Means Revisited

30

120579=(1199091 1199101 ) (1199092 119910 2)

119885=119862119897119906119904119905119890119903 1 119862119897119906119904119905119890119903 2

model parameters

latent parameters

Expectation Maximization

31

32

EM Gaussian Mixture

33

Gaussianjth by the generated is i instancer whethecomponents mixture ofnumber the

points data ofnumber the

ijznm

n

kk

x

j

x

n

kkki

jjiij

ki

ji

e

e

xxp

xxpzE

1

)(2

1

)(2

1

1

22

22

)|(

)|(][

m

iij

m

iiij

j

zE

xzE

1

1

][

][

m

iijj zE

m 1

][1

Density Based Methods

Generate clusters of arbitrary shapes

Robust against noise

No K value required in advance

Somewhat similar to human vision

34

DBSCAN

Density-Based Spatial Clustering of Applications with Noise

Density number of points within a specified radius

Core Point points with high density

Border Point points with low density but in the neighbourhood of a core point

Noise Point neither a core point nor a border point

35

Core Point

Noise Point

Border Point

DBSCAN

36

p

q

directly density reachable

p

q

density reachable

o

qp

density connected

DBSCAN

A cluster is defined as the maximal set of density connected points Start from a randomly selected unseen point P If P is a core point build a cluster by gradually adding all points that are

density reachable to the current point set Noise points are discarded (unlabelled)

37

Hierarchical Clustering

Produce a set of nested tree-like clusters

Can be visualized as a dendrogram Clustering is obtained by cutting at desired level No need to specify K in advance May correspond to meaningful taxonomies

38

Dinosaur Family Tree

39

Agglomerative Methods

Bottom-up Method

Assign each data point to a cluster

Calculate the proximity matrix

Merge the pair of closest clusters

Repeat until only a single cluster remains

How to calculate the distance between clusters

Single Link Minimum distance between points

Complete Link Maximum distance between points

40

Example

41

BA FI MI NA RM TO

BA 0 662 877 255 412 996

FI 662 0 295 468 268 400

MI 877 295 0 754 564 138

NA 255 468 754 0 219 869

RM 412 268 564 219 0 669

TO 996 400 138 869 669 0Single Link

Example

42

BA FI MITO NA RM

BA 0 662 877 255 412

FI 662 0 295 468 268

MITO 877 295 0 754 564

NA 255 468 754 0 219

RM 412 268 564 219 0

BA FI MITO NARM

BA 0 662 877 255FI 662 0 295 268

MITO 877 295 0 564

NARM 255 268 564 0

Example

43

BANARM FI MITO

BANARM 0 268 564FI 268 0 295

MITO 564 295 0

BAFINARM MITO

BAFINARM 0 295

MITO 295 0

Min vs Max

44

Reading Materials

Text Books Richard O Duda et al Pattern Classification Chapter 10 John Wiley amp Sons J Han and M Kamber Data Mining Concepts and Techniques Chapter 8

Morgan Kaufmann

Survey Papers A K Jain M N Murty and P J Flynn (1999) ldquoData Clustering A Reviewrdquo ACM

Computing Surveys Vol 31(3) pp 264-323 R Xu and D Wunsch (2005) ldquoSurvey of Clustering Algorithmsrdquo IEEE Transactions

on Neural Networks Vol 16(3) pp 645-678 A K Jain (2010) ldquoData Clustering 50 Years Beyond K-Meansrdquo Pattern

Recognition Letters Vol 31 pp 651-666

Online Tutorials httphomedeipolimiitmatteuccClusteringtutorial_html httpwwwautonlaborgtutorialskmeanshtml httpusersinformatikuni-hallede~hinneburClusterTutorial

45

Review

What is clustering

What are the two categories of clustering methods

How does the K-Means algorithm work

What are the major issues of K-Means

How to control the number of clusters in Sequential Leader Clustering

How to use Gaussian mixture models for clustering

What are the main advantages of density methods

What is the core idea of DBSCAN

What is the general procedure of hierarchical clustering

Which clustering methods do not require K as the input

46

Next Weekrsquos Class Talk

Volunteers are required for next weekrsquos class talk

Topic Affinity Propagation Science 315 972ndash976 2007 Clustering by passing messages between points httpwwwpsitorontoeduindexphpq=affinity20propagation

Topic Clustering by Fast Search and Find of Density Peaks Science 344 1492ndash1496 2014 Cluster centers higher density than neighbors Cluster centers distant from others points with higher densities

Length 20 minutes plus question time

47

Assignment

Topic Clustering Techniques and Applications

Techniques K-Means Another clustering method for comparison

Task 1 2D Artificial Datasets To demonstrate the influence of data patterns To demonstrate the influence of algorithm factors

Task 2 Image Segmentation Gray vs Colour

Deliverables Reports (experiment specification algorithm parameters in-depth analysis) Code (any programming language with detailed comments)

Due Sunday 28 December

Credit 1548

  • Clustering
  • Overview
  • What is cluster analysis
  • Clusters
  • Clusters (2)
  • Applications of Clustering
  • Earthquakes
  • Image Segmentation
  • The Big Picture
  • Requirements
  • Practical Considerations
  • Normalization or Not
  • Evaluation
  • Evaluation (2)
  • The Influence of Outliers
  • K-Means
  • K-Means (2)
  • K-Means (3)
  • K-Means (4)
  • Comments on K-Means
  • The Influence of Initial Centroids
  • The Influence of Initial Centroids (2)
  • The K-Medoids Method
  • The K-Medoids Method (2)
  • Sequential Leader Clustering
  • Silhouette
  • Silhouette (2)
  • Gaussian Mixture
  • Clustering by Mixture Models
  • K-Means Revisited
  • Expectation Maximization
  • Slide 32
  • EM Gaussian Mixture
  • Density Based Methods
  • DBSCAN
  • DBSCAN (2)
  • DBSCAN (3)
  • Hierarchical Clustering
  • Dinosaur Family Tree
  • Agglomerative Methods
  • Example
  • Example (2)
  • Example (3)
  • Min vs Max
  • Reading Materials
  • Review
  • Next Weekrsquos Class Talk
  • Assignment
Page 16: Clustering

K-Means

16

K-Means

17

K-Means

18

K-Means

Determine the value of K

Choose K cluster centres randomly

Each data point is assigned to its closest centroid

Use the mean of each cluster to update each centroid

Repeat until no more new assignment

Return the K centroids

Reference J MacQueen (1967) Some Methods for Classification and Analysis of

Multivariate Observations Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability vol1 pp 281-297

19

Comments on K-Means

Pros Simple and works well for regular disjoint clusters Converges relatively fast Relatively efficient and scalable O(tkn)

bull t iteration k number of centroids n number of data points

Cons Need to specify the value of K in advance

bull Difficult and domain knowledge may help May converge to local optima

bull In practice try different initial centroids May be sensitive to noisy data and outliers

bull Mean of data points hellip Not suitable for clusters of

bull Non-convex shapes

20

The Influence of Initial Centroids

21

The Influence of Initial Centroids

22

The K-Medoids Method

The basic idea is to use real data points as centres

Determine the value of K in advance

Randomly select K points as medoids

Assign each data point to the closest medoid

Calculate the cost of the configuration J

For each medoid m For each non-medoid point o Swap m and o and calculate the new cost of configuration Jprime

If the cost of the best new configuration J is lower than J make the corresponding swap and repeat the above steps

Otherwise terminate the procedure23

The K-Medoids Method

24

Cost =20 Cost =26

Sequential Leader Clustering

A very efficient clustering algorithm No iteration Time complexity O(nk)

No need to specify K in advance

Choose a cluster threshold value

For every new data point Compute the distance between the new data point and every clusters centre

If the distance is smaller than the chosen threshold assign the new data point to the corresponding cluster and re-compute cluster centre

Otherwise create a new cluster with the new data point as its centre

Clustering results may be influenced by the sequence of data points

25

Silhouette

A method of interpretation and validation of clusters of data

A succinct graphical representation of how well each data point lies within its cluster compared to other clusters

a(i) average dissimilarity of i with all other points in the same cluster

b(i) the lowest average dissimilarity of i to other clusters

26

)()(max)()()(iaibiaibis

Silhouette

27

-02 0 02 04 06 08 1

1

2

Silhouette Value

Clu

ster

-3 -2 -1 0 1 2 3 4-3

-2

-1

0

1

2

3

4

Gaussian Mixture

28

)2()(

2

22

21)(

xexg

1amp0)()(1

i

ii

n

iiii xgxf

Clustering by Mixture Models

29

K-Means Revisited

30

120579=(1199091 1199101 ) (1199092 119910 2)

119885=119862119897119906119904119905119890119903 1 119862119897119906119904119905119890119903 2

model parameters

latent parameters

Expectation Maximization

31

32

EM Gaussian Mixture

33

Gaussianjth by the generated is i instancer whethecomponents mixture ofnumber the

points data ofnumber the

ijznm

n

kk

x

j

x

n

kkki

jjiij

ki

ji

e

e

xxp

xxpzE

1

)(2

1

)(2

1

1

22

22

)|(

)|(][

m

iij

m

iiij

j

zE

xzE

1

1

][

][

m

iijj zE

m 1

][1

Density Based Methods

Generate clusters of arbitrary shapes

Robust against noise

No K value required in advance

Somewhat similar to human vision

34

DBSCAN

Density-Based Spatial Clustering of Applications with Noise

Density number of points within a specified radius

Core Point points with high density

Border Point points with low density but in the neighbourhood of a core point

Noise Point neither a core point nor a border point

35

Core Point

Noise Point

Border Point

DBSCAN

36

p

q

directly density reachable

p

q

density reachable

o

qp

density connected

DBSCAN

A cluster is defined as the maximal set of density connected points Start from a randomly selected unseen point P If P is a core point build a cluster by gradually adding all points that are

density reachable to the current point set Noise points are discarded (unlabelled)

37

Hierarchical Clustering

Produce a set of nested tree-like clusters

Can be visualized as a dendrogram Clustering is obtained by cutting at desired level No need to specify K in advance May correspond to meaningful taxonomies

38

Dinosaur Family Tree

39

Agglomerative Methods

Bottom-up Method

Assign each data point to a cluster

Calculate the proximity matrix

Merge the pair of closest clusters

Repeat until only a single cluster remains

How to calculate the distance between clusters

Single Link Minimum distance between points

Complete Link Maximum distance between points

40

Example

41

BA FI MI NA RM TO

BA 0 662 877 255 412 996

FI 662 0 295 468 268 400

MI 877 295 0 754 564 138

NA 255 468 754 0 219 869

RM 412 268 564 219 0 669

TO 996 400 138 869 669 0Single Link

Example

42

BA FI MITO NA RM

BA 0 662 877 255 412

FI 662 0 295 468 268

MITO 877 295 0 754 564

NA 255 468 754 0 219

RM 412 268 564 219 0

BA FI MITO NARM

BA 0 662 877 255FI 662 0 295 268

MITO 877 295 0 564

NARM 255 268 564 0

Example

43

BANARM FI MITO

BANARM 0 268 564FI 268 0 295

MITO 564 295 0

BAFINARM MITO

BAFINARM 0 295

MITO 295 0

Min vs Max

44

Reading Materials

Text Books Richard O Duda et al Pattern Classification Chapter 10 John Wiley amp Sons J Han and M Kamber Data Mining Concepts and Techniques Chapter 8

Morgan Kaufmann

Survey Papers A K Jain M N Murty and P J Flynn (1999) ldquoData Clustering A Reviewrdquo ACM

Computing Surveys Vol 31(3) pp 264-323 R Xu and D Wunsch (2005) ldquoSurvey of Clustering Algorithmsrdquo IEEE Transactions

on Neural Networks Vol 16(3) pp 645-678 A K Jain (2010) ldquoData Clustering 50 Years Beyond K-Meansrdquo Pattern

Recognition Letters Vol 31 pp 651-666

Online Tutorials httphomedeipolimiitmatteuccClusteringtutorial_html httpwwwautonlaborgtutorialskmeanshtml httpusersinformatikuni-hallede~hinneburClusterTutorial

45

Review

What is clustering

What are the two categories of clustering methods

How does the K-Means algorithm work

What are the major issues of K-Means

How to control the number of clusters in Sequential Leader Clustering

How to use Gaussian mixture models for clustering

What are the main advantages of density methods

What is the core idea of DBSCAN

What is the general procedure of hierarchical clustering

Which clustering methods do not require K as the input

46

Next Weekrsquos Class Talk

Volunteers are required for next weekrsquos class talk

Topic Affinity Propagation Science 315 972ndash976 2007 Clustering by passing messages between points httpwwwpsitorontoeduindexphpq=affinity20propagation

Topic Clustering by Fast Search and Find of Density Peaks Science 344 1492ndash1496 2014 Cluster centers higher density than neighbors Cluster centers distant from others points with higher densities

Length 20 minutes plus question time

47

Assignment

Topic Clustering Techniques and Applications

Techniques K-Means Another clustering method for comparison

Task 1 2D Artificial Datasets To demonstrate the influence of data patterns To demonstrate the influence of algorithm factors

Task 2 Image Segmentation Gray vs Colour

Deliverables Reports (experiment specification algorithm parameters in-depth analysis) Code (any programming language with detailed comments)

Due Sunday 28 December

Credit 1548

  • Clustering
  • Overview
  • What is cluster analysis
  • Clusters
  • Clusters (2)
  • Applications of Clustering
  • Earthquakes
  • Image Segmentation
  • The Big Picture
  • Requirements
  • Practical Considerations
  • Normalization or Not
  • Evaluation
  • Evaluation (2)
  • The Influence of Outliers
  • K-Means
  • K-Means (2)
  • K-Means (3)
  • K-Means (4)
  • Comments on K-Means
  • The Influence of Initial Centroids
  • The Influence of Initial Centroids (2)
  • The K-Medoids Method
  • The K-Medoids Method (2)
  • Sequential Leader Clustering
  • Silhouette
  • Silhouette (2)
  • Gaussian Mixture
  • Clustering by Mixture Models
  • K-Means Revisited
  • Expectation Maximization
  • Slide 32
  • EM Gaussian Mixture
  • Density Based Methods
  • DBSCAN
  • DBSCAN (2)
  • DBSCAN (3)
  • Hierarchical Clustering
  • Dinosaur Family Tree
  • Agglomerative Methods
  • Example
  • Example (2)
  • Example (3)
  • Min vs Max
  • Reading Materials
  • Review
  • Next Weekrsquos Class Talk
  • Assignment
Page 17: Clustering

K-Means

17

K-Means

18

K-Means

Determine the value of K

Choose K cluster centres randomly

Each data point is assigned to its closest centroid

Use the mean of each cluster to update each centroid

Repeat until no more new assignment

Return the K centroids

Reference J MacQueen (1967) Some Methods for Classification and Analysis of

Multivariate Observations Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability vol1 pp 281-297

19

Comments on K-Means

Pros Simple and works well for regular disjoint clusters Converges relatively fast Relatively efficient and scalable O(tkn)

bull t iteration k number of centroids n number of data points

Cons Need to specify the value of K in advance

bull Difficult and domain knowledge may help May converge to local optima

bull In practice try different initial centroids May be sensitive to noisy data and outliers

bull Mean of data points hellip Not suitable for clusters of

bull Non-convex shapes

20

The Influence of Initial Centroids

21

The Influence of Initial Centroids

22

The K-Medoids Method

The basic idea is to use real data points as centres

Determine the value of K in advance

Randomly select K points as medoids

Assign each data point to the closest medoid

Calculate the cost of the configuration J

For each medoid m For each non-medoid point o Swap m and o and calculate the new cost of configuration Jprime

If the cost of the best new configuration J is lower than J make the corresponding swap and repeat the above steps

Otherwise terminate the procedure23

The K-Medoids Method

24

Cost =20 Cost =26

Sequential Leader Clustering

A very efficient clustering algorithm No iteration Time complexity O(nk)

No need to specify K in advance

Choose a cluster threshold value

For every new data point Compute the distance between the new data point and every clusters centre

If the distance is smaller than the chosen threshold assign the new data point to the corresponding cluster and re-compute cluster centre

Otherwise create a new cluster with the new data point as its centre

Clustering results may be influenced by the sequence of data points

25

Silhouette

A method of interpretation and validation of clusters of data

A succinct graphical representation of how well each data point lies within its cluster compared to other clusters

a(i) average dissimilarity of i with all other points in the same cluster

b(i) the lowest average dissimilarity of i to other clusters

26

)()(max)()()(iaibiaibis

Silhouette

27

-02 0 02 04 06 08 1

1

2

Silhouette Value

Clu

ster

-3 -2 -1 0 1 2 3 4-3

-2

-1

0

1

2

3

4

Gaussian Mixture

28

)2()(

2

22

21)(

xexg

1amp0)()(1

i

ii

n

iiii xgxf

Clustering by Mixture Models

29

K-Means Revisited

30

120579=(1199091 1199101 ) (1199092 119910 2)

119885=119862119897119906119904119905119890119903 1 119862119897119906119904119905119890119903 2

model parameters

latent parameters

Expectation Maximization

31

32

EM Gaussian Mixture

33

Gaussianjth by the generated is i instancer whethecomponents mixture ofnumber the

points data ofnumber the

ijznm

n

kk

x

j

x

n

kkki

jjiij

ki

ji

e

e

xxp

xxpzE

1

)(2

1

)(2

1

1

22

22

)|(

)|(][

m

iij

m

iiij

j

zE

xzE

1

1

][

][

m

iijj zE

m 1

][1

Density Based Methods

Generate clusters of arbitrary shapes

Robust against noise

No K value required in advance

Somewhat similar to human vision

34

DBSCAN

Density-Based Spatial Clustering of Applications with Noise

Density number of points within a specified radius

Core Point points with high density

Border Point points with low density but in the neighbourhood of a core point

Noise Point neither a core point nor a border point

35

Core Point

Noise Point

Border Point

DBSCAN

36

p

q

directly density reachable

p

q

density reachable

o

qp

density connected

DBSCAN

A cluster is defined as the maximal set of density connected points Start from a randomly selected unseen point P If P is a core point build a cluster by gradually adding all points that are

density reachable to the current point set Noise points are discarded (unlabelled)

37

Hierarchical Clustering

Produce a set of nested tree-like clusters

Can be visualized as a dendrogram Clustering is obtained by cutting at desired level No need to specify K in advance May correspond to meaningful taxonomies

38

Dinosaur Family Tree

39

Agglomerative Methods

Bottom-up Method

Assign each data point to a cluster

Calculate the proximity matrix

Merge the pair of closest clusters

Repeat until only a single cluster remains

How to calculate the distance between clusters

Single Link Minimum distance between points

Complete Link Maximum distance between points

40

Example

41

BA FI MI NA RM TO

BA 0 662 877 255 412 996

FI 662 0 295 468 268 400

MI 877 295 0 754 564 138

NA 255 468 754 0 219 869

RM 412 268 564 219 0 669

TO 996 400 138 869 669 0Single Link

Example

42

BA FI MITO NA RM

BA 0 662 877 255 412

FI 662 0 295 468 268

MITO 877 295 0 754 564

NA 255 468 754 0 219

RM 412 268 564 219 0

BA FI MITO NARM

BA 0 662 877 255FI 662 0 295 268

MITO 877 295 0 564

NARM 255 268 564 0

Example

43

BANARM FI MITO

BANARM 0 268 564FI 268 0 295

MITO 564 295 0

BAFINARM MITO

BAFINARM 0 295

MITO 295 0

Min vs Max

44

Reading Materials

Text Books Richard O Duda et al Pattern Classification Chapter 10 John Wiley amp Sons J Han and M Kamber Data Mining Concepts and Techniques Chapter 8

Morgan Kaufmann

Survey Papers A K Jain M N Murty and P J Flynn (1999) ldquoData Clustering A Reviewrdquo ACM

Computing Surveys Vol 31(3) pp 264-323 R Xu and D Wunsch (2005) ldquoSurvey of Clustering Algorithmsrdquo IEEE Transactions

on Neural Networks Vol 16(3) pp 645-678 A K Jain (2010) ldquoData Clustering 50 Years Beyond K-Meansrdquo Pattern

Recognition Letters Vol 31 pp 651-666

Online Tutorials httphomedeipolimiitmatteuccClusteringtutorial_html httpwwwautonlaborgtutorialskmeanshtml httpusersinformatikuni-hallede~hinneburClusterTutorial

45

Review

What is clustering

What are the two categories of clustering methods

How does the K-Means algorithm work

What are the major issues of K-Means

How to control the number of clusters in Sequential Leader Clustering

How to use Gaussian mixture models for clustering

What are the main advantages of density methods

What is the core idea of DBSCAN

What is the general procedure of hierarchical clustering

Which clustering methods do not require K as the input

46

Next Weekrsquos Class Talk

Volunteers are required for next weekrsquos class talk

Topic Affinity Propagation Science 315 972ndash976 2007 Clustering by passing messages between points httpwwwpsitorontoeduindexphpq=affinity20propagation

Topic Clustering by Fast Search and Find of Density Peaks Science 344 1492ndash1496 2014 Cluster centers higher density than neighbors Cluster centers distant from others points with higher densities

Length 20 minutes plus question time

47

Assignment

Topic Clustering Techniques and Applications

Techniques K-Means Another clustering method for comparison

Task 1 2D Artificial Datasets To demonstrate the influence of data patterns To demonstrate the influence of algorithm factors

Task 2 Image Segmentation Gray vs Colour

Deliverables Reports (experiment specification algorithm parameters in-depth analysis) Code (any programming language with detailed comments)

Due Sunday 28 December

Credit 1548

  • Clustering
  • Overview
  • What is cluster analysis
  • Clusters
  • Clusters (2)
  • Applications of Clustering
  • Earthquakes
  • Image Segmentation
  • The Big Picture
  • Requirements
  • Practical Considerations
  • Normalization or Not
  • Evaluation
  • Evaluation (2)
  • The Influence of Outliers
  • K-Means
  • K-Means (2)
  • K-Means (3)
  • K-Means (4)
  • Comments on K-Means
  • The Influence of Initial Centroids
  • The Influence of Initial Centroids (2)
  • The K-Medoids Method
  • The K-Medoids Method (2)
  • Sequential Leader Clustering
  • Silhouette
  • Silhouette (2)
  • Gaussian Mixture
  • Clustering by Mixture Models
  • K-Means Revisited
  • Expectation Maximization
  • Slide 32
  • EM Gaussian Mixture
  • Density Based Methods
  • DBSCAN
  • DBSCAN (2)
  • DBSCAN (3)
  • Hierarchical Clustering
  • Dinosaur Family Tree
  • Agglomerative Methods
  • Example
  • Example (2)
  • Example (3)
  • Min vs Max
  • Reading Materials
  • Review
  • Next Weekrsquos Class Talk
  • Assignment
Page 18: Clustering

K-Means

18

K-Means

Determine the value of K

Choose K cluster centres randomly

Each data point is assigned to its closest centroid

Use the mean of each cluster to update each centroid

Repeat until no more new assignment

Return the K centroids

Reference J MacQueen (1967) Some Methods for Classification and Analysis of

Multivariate Observations Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability vol1 pp 281-297

19

Comments on K-Means

Pros Simple and works well for regular disjoint clusters Converges relatively fast Relatively efficient and scalable O(tkn)

bull t iteration k number of centroids n number of data points

Cons Need to specify the value of K in advance

bull Difficult and domain knowledge may help May converge to local optima

bull In practice try different initial centroids May be sensitive to noisy data and outliers

bull Mean of data points hellip Not suitable for clusters of

bull Non-convex shapes

20

The Influence of Initial Centroids

21

The Influence of Initial Centroids

22

The K-Medoids Method

The basic idea is to use real data points as centres

Determine the value of K in advance

Randomly select K points as medoids

Assign each data point to the closest medoid

Calculate the cost of the configuration J

For each medoid m For each non-medoid point o Swap m and o and calculate the new cost of configuration Jprime

If the cost of the best new configuration J is lower than J make the corresponding swap and repeat the above steps

Otherwise terminate the procedure23

The K-Medoids Method

24

Cost =20 Cost =26

Sequential Leader Clustering

A very efficient clustering algorithm No iteration Time complexity O(nk)

No need to specify K in advance

Choose a cluster threshold value

For every new data point Compute the distance between the new data point and every clusters centre

If the distance is smaller than the chosen threshold assign the new data point to the corresponding cluster and re-compute cluster centre

Otherwise create a new cluster with the new data point as its centre

Clustering results may be influenced by the sequence of data points

25

Silhouette

A method of interpretation and validation of clusters of data

A succinct graphical representation of how well each data point lies within its cluster compared to other clusters

a(i) average dissimilarity of i with all other points in the same cluster

b(i) the lowest average dissimilarity of i to other clusters

26

)()(max)()()(iaibiaibis

Silhouette

27

-02 0 02 04 06 08 1

1

2

Silhouette Value

Clu

ster

-3 -2 -1 0 1 2 3 4-3

-2

-1

0

1

2

3

4

Gaussian Mixture

28

)2()(

2

22

21)(

xexg

1amp0)()(1

i

ii

n

iiii xgxf

Clustering by Mixture Models

29

K-Means Revisited

30

120579=(1199091 1199101 ) (1199092 119910 2)

119885=119862119897119906119904119905119890119903 1 119862119897119906119904119905119890119903 2

model parameters

latent parameters

Expectation Maximization

31

32

EM Gaussian Mixture

33

Gaussianjth by the generated is i instancer whethecomponents mixture ofnumber the

points data ofnumber the

ijznm

n

kk

x

j

x

n

kkki

jjiij

ki

ji

e

e

xxp

xxpzE

1

)(2

1

)(2

1

1

22

22

)|(

)|(][

m

iij

m

iiij

j

zE

xzE

1

1

][

][

m

iijj zE

m 1

][1

Density Based Methods

Generate clusters of arbitrary shapes

Robust against noise

No K value required in advance

Somewhat similar to human vision

34

DBSCAN

Density-Based Spatial Clustering of Applications with Noise

Density number of points within a specified radius

Core Point points with high density

Border Point points with low density but in the neighbourhood of a core point

Noise Point neither a core point nor a border point

35

Core Point

Noise Point

Border Point

DBSCAN

36

p

q

directly density reachable

p

q

density reachable

o

qp

density connected

DBSCAN

A cluster is defined as the maximal set of density connected points Start from a randomly selected unseen point P If P is a core point build a cluster by gradually adding all points that are

density reachable to the current point set Noise points are discarded (unlabelled)

37

Hierarchical Clustering

Produce a set of nested tree-like clusters

Can be visualized as a dendrogram Clustering is obtained by cutting at desired level No need to specify K in advance May correspond to meaningful taxonomies

38

Dinosaur Family Tree

39

Agglomerative Methods

Bottom-up Method

Assign each data point to a cluster

Calculate the proximity matrix

Merge the pair of closest clusters

Repeat until only a single cluster remains

How to calculate the distance between clusters

Single Link Minimum distance between points

Complete Link Maximum distance between points

40

Example

41

BA FI MI NA RM TO

BA 0 662 877 255 412 996

FI 662 0 295 468 268 400

MI 877 295 0 754 564 138

NA 255 468 754 0 219 869

RM 412 268 564 219 0 669

TO 996 400 138 869 669 0Single Link

Example

42

BA FI MITO NA RM

BA 0 662 877 255 412

FI 662 0 295 468 268

MITO 877 295 0 754 564

NA 255 468 754 0 219

RM 412 268 564 219 0

BA FI MITO NARM

BA 0 662 877 255FI 662 0 295 268

MITO 877 295 0 564

NARM 255 268 564 0

Example

43

BANARM FI MITO

BANARM 0 268 564FI 268 0 295

MITO 564 295 0

BAFINARM MITO

BAFINARM 0 295

MITO 295 0

Min vs Max

44

Reading Materials

Text Books Richard O Duda et al Pattern Classification Chapter 10 John Wiley amp Sons J Han and M Kamber Data Mining Concepts and Techniques Chapter 8

Morgan Kaufmann

Survey Papers A K Jain M N Murty and P J Flynn (1999) ldquoData Clustering A Reviewrdquo ACM

Computing Surveys Vol 31(3) pp 264-323 R Xu and D Wunsch (2005) ldquoSurvey of Clustering Algorithmsrdquo IEEE Transactions

on Neural Networks Vol 16(3) pp 645-678 A K Jain (2010) ldquoData Clustering 50 Years Beyond K-Meansrdquo Pattern

Recognition Letters Vol 31 pp 651-666

Online Tutorials httphomedeipolimiitmatteuccClusteringtutorial_html httpwwwautonlaborgtutorialskmeanshtml httpusersinformatikuni-hallede~hinneburClusterTutorial

45

Review

What is clustering

What are the two categories of clustering methods

How does the K-Means algorithm work

What are the major issues of K-Means

How to control the number of clusters in Sequential Leader Clustering

How to use Gaussian mixture models for clustering

What are the main advantages of density methods

What is the core idea of DBSCAN

What is the general procedure of hierarchical clustering

Which clustering methods do not require K as the input

46

Next Weekrsquos Class Talk

Volunteers are required for next weekrsquos class talk

Topic Affinity Propagation Science 315 972ndash976 2007 Clustering by passing messages between points httpwwwpsitorontoeduindexphpq=affinity20propagation

Topic Clustering by Fast Search and Find of Density Peaks Science 344 1492ndash1496 2014 Cluster centers higher density than neighbors Cluster centers distant from others points with higher densities

Length 20 minutes plus question time

47

Assignment

Topic Clustering Techniques and Applications

Techniques K-Means Another clustering method for comparison

Task 1 2D Artificial Datasets To demonstrate the influence of data patterns To demonstrate the influence of algorithm factors

Task 2 Image Segmentation Gray vs Colour

Deliverables Reports (experiment specification algorithm parameters in-depth analysis) Code (any programming language with detailed comments)

Due Sunday 28 December

Credit 1548

  • Clustering
  • Overview
  • What is cluster analysis
  • Clusters
  • Clusters (2)
  • Applications of Clustering
  • Earthquakes
  • Image Segmentation
  • The Big Picture
  • Requirements
  • Practical Considerations
  • Normalization or Not
  • Evaluation
  • Evaluation (2)
  • The Influence of Outliers
  • K-Means
  • K-Means (2)
  • K-Means (3)
  • K-Means (4)
  • Comments on K-Means
  • The Influence of Initial Centroids
  • The Influence of Initial Centroids (2)
  • The K-Medoids Method
  • The K-Medoids Method (2)
  • Sequential Leader Clustering
  • Silhouette
  • Silhouette (2)
  • Gaussian Mixture
  • Clustering by Mixture Models
  • K-Means Revisited
  • Expectation Maximization
  • Slide 32
  • EM Gaussian Mixture
  • Density Based Methods
  • DBSCAN
  • DBSCAN (2)
  • DBSCAN (3)
  • Hierarchical Clustering
  • Dinosaur Family Tree
  • Agglomerative Methods
  • Example
  • Example (2)
  • Example (3)
  • Min vs Max
  • Reading Materials
  • Review
  • Next Weekrsquos Class Talk
  • Assignment
Page 19: Clustering

K-Means

Determine the value of K

Choose K cluster centres randomly

Each data point is assigned to its closest centroid

Use the mean of each cluster to update each centroid

Repeat until no more new assignment

Return the K centroids

Reference J MacQueen (1967) Some Methods for Classification and Analysis of

Multivariate Observations Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability vol1 pp 281-297

19

Comments on K-Means

Pros Simple and works well for regular disjoint clusters Converges relatively fast Relatively efficient and scalable O(tkn)

bull t iteration k number of centroids n number of data points

Cons Need to specify the value of K in advance

bull Difficult and domain knowledge may help May converge to local optima

bull In practice try different initial centroids May be sensitive to noisy data and outliers

bull Mean of data points hellip Not suitable for clusters of

bull Non-convex shapes

20

The Influence of Initial Centroids

21

The Influence of Initial Centroids

22

The K-Medoids Method

The basic idea is to use real data points as centres

Determine the value of K in advance

Randomly select K points as medoids

Assign each data point to the closest medoid

Calculate the cost of the configuration J

For each medoid m For each non-medoid point o Swap m and o and calculate the new cost of configuration Jprime

If the cost of the best new configuration J is lower than J make the corresponding swap and repeat the above steps

Otherwise terminate the procedure23

The K-Medoids Method

24

Cost =20 Cost =26

Sequential Leader Clustering

A very efficient clustering algorithm No iteration Time complexity O(nk)

No need to specify K in advance

Choose a cluster threshold value

For every new data point Compute the distance between the new data point and every clusters centre

If the distance is smaller than the chosen threshold assign the new data point to the corresponding cluster and re-compute cluster centre

Otherwise create a new cluster with the new data point as its centre

Clustering results may be influenced by the sequence of data points

25

Silhouette

A method of interpretation and validation of clusters of data

A succinct graphical representation of how well each data point lies within its cluster compared to other clusters

a(i) average dissimilarity of i with all other points in the same cluster

b(i) the lowest average dissimilarity of i to other clusters

26

)()(max)()()(iaibiaibis

Silhouette

27

-02 0 02 04 06 08 1

1

2

Silhouette Value

Clu

ster

-3 -2 -1 0 1 2 3 4-3

-2

-1

0

1

2

3

4

Gaussian Mixture

28

)2()(

2

22

21)(

xexg

1amp0)()(1

i

ii

n

iiii xgxf

Clustering by Mixture Models

29

K-Means Revisited

30

120579=(1199091 1199101 ) (1199092 119910 2)

119885=119862119897119906119904119905119890119903 1 119862119897119906119904119905119890119903 2

model parameters

latent parameters

Expectation Maximization

31

32

EM Gaussian Mixture

33

Gaussianjth by the generated is i instancer whethecomponents mixture ofnumber the

points data ofnumber the

ijznm

n

kk

x

j

x

n

kkki

jjiij

ki

ji

e

e

xxp

xxpzE

1

)(2

1

)(2

1

1

22

22

)|(

)|(][

m

iij

m

iiij

j

zE

xzE

1

1

][

][

m

iijj zE

m 1

][1

Density Based Methods

Generate clusters of arbitrary shapes

Robust against noise

No K value required in advance

Somewhat similar to human vision

34

DBSCAN

Density-Based Spatial Clustering of Applications with Noise

Density number of points within a specified radius

Core Point points with high density

Border Point points with low density but in the neighbourhood of a core point

Noise Point neither a core point nor a border point

35

Core Point

Noise Point

Border Point

DBSCAN

36

p

q

directly density reachable

p

q

density reachable

o

qp

density connected

DBSCAN

A cluster is defined as the maximal set of density connected points Start from a randomly selected unseen point P If P is a core point build a cluster by gradually adding all points that are

density reachable to the current point set Noise points are discarded (unlabelled)

37

Hierarchical Clustering

Produce a set of nested tree-like clusters

Can be visualized as a dendrogram Clustering is obtained by cutting at desired level No need to specify K in advance May correspond to meaningful taxonomies

38

Dinosaur Family Tree

39

Agglomerative Methods

Bottom-up Method

Assign each data point to a cluster

Calculate the proximity matrix

Merge the pair of closest clusters

Repeat until only a single cluster remains

How to calculate the distance between clusters

Single Link Minimum distance between points

Complete Link Maximum distance between points

40

Example

41

BA FI MI NA RM TO

BA 0 662 877 255 412 996

FI 662 0 295 468 268 400

MI 877 295 0 754 564 138

NA 255 468 754 0 219 869

RM 412 268 564 219 0 669

TO 996 400 138 869 669 0Single Link

Example

42

BA FI MITO NA RM

BA 0 662 877 255 412

FI 662 0 295 468 268

MITO 877 295 0 754 564

NA 255 468 754 0 219

RM 412 268 564 219 0

BA FI MITO NARM

BA 0 662 877 255FI 662 0 295 268

MITO 877 295 0 564

NARM 255 268 564 0

Example

43

BANARM FI MITO

BANARM 0 268 564FI 268 0 295

MITO 564 295 0

BAFINARM MITO

BAFINARM 0 295

MITO 295 0

Min vs Max

44

Reading Materials

Text Books Richard O Duda et al Pattern Classification Chapter 10 John Wiley amp Sons J Han and M Kamber Data Mining Concepts and Techniques Chapter 8

Morgan Kaufmann

Survey Papers A K Jain M N Murty and P J Flynn (1999) ldquoData Clustering A Reviewrdquo ACM

Computing Surveys Vol 31(3) pp 264-323 R Xu and D Wunsch (2005) ldquoSurvey of Clustering Algorithmsrdquo IEEE Transactions

on Neural Networks Vol 16(3) pp 645-678 A K Jain (2010) ldquoData Clustering 50 Years Beyond K-Meansrdquo Pattern

Recognition Letters Vol 31 pp 651-666

Online Tutorials httphomedeipolimiitmatteuccClusteringtutorial_html httpwwwautonlaborgtutorialskmeanshtml httpusersinformatikuni-hallede~hinneburClusterTutorial

45

Review

What is clustering?

What are the two categories of clustering methods?

How does the K-Means algorithm work?

What are the major issues of K-Means?

How to control the number of clusters in Sequential Leader Clustering?

How to use Gaussian mixture models for clustering?

What are the main advantages of density based methods?

What is the core idea of DBSCAN?

What is the general procedure of hierarchical clustering?

Which clustering methods do not require K as the input?

46

Next Week's Class Talk

Volunteers are required for next week's class talk.

Topic 1: Affinity Propagation (Science 315, 972-976, 2007)
Clustering by passing messages between points
http://www.psi.toronto.edu/index.php?q=affinity%20propagation

Topic 2: Clustering by Fast Search and Find of Density Peaks (Science 344, 1492-1496, 2014)
Cluster centers have a higher density than their neighbors
Cluster centers are distant from other points with higher densities

Length: 20 minutes plus question time

47

Assignment

Topic: Clustering Techniques and Applications

Techniques: K-Means, plus another clustering method for comparison

Task 1: 2D Artificial Datasets
To demonstrate the influence of data patterns
To demonstrate the influence of algorithm factors

Task 2: Image Segmentation (Gray vs. Colour)

Deliverables:
Reports (experiment specification, algorithm parameters, in-depth analysis)
Code (any programming language, with detailed comments)

Due: Sunday, 28 December

Credit: 15

48
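As a possible starting point for Task 2 (an assumption, not a prescribed design: scikit-learn is available and the image is an H x W x 3 NumPy array):

```python
import numpy as np
from sklearn.cluster import KMeans

def segment(image, k):
    """Cluster pixel colours with K-Means; returns a segment id per pixel."""
    h, w, c = image.shape
    pixels = image.reshape(-1, c).astype(float)             # one row per pixel
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(pixels)
    return labels.reshape(h, w)
```

A grayscale image can be handled the same way by reshaping its intensities into a single column.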

  • Clustering
  • Overview
  • What is cluster analysis
  • Clusters
  • Clusters (2)
  • Applications of Clustering
  • Earthquakes
  • Image Segmentation
  • The Big Picture
  • Requirements
  • Practical Considerations
  • Normalization or Not
  • Evaluation
  • Evaluation (2)
  • The Influence of Outliers
  • K-Means
  • K-Means (2)
  • K-Means (3)
  • K-Means (4)
  • Comments on K-Means
  • The Influence of Initial Centroids
  • The Influence of Initial Centroids (2)
  • The K-Medoids Method
  • The K-Medoids Method (2)
  • Sequential Leader Clustering
  • Silhouette
  • Silhouette (2)
  • Gaussian Mixture
  • Clustering by Mixture Models
  • K-Means Revisited
  • Expectation Maximization
  • Slide 32
  • EM Gaussian Mixture
  • Density Based Methods
  • DBSCAN
  • DBSCAN (2)
  • DBSCAN (3)
  • Hierarchical Clustering
  • Dinosaur Family Tree
  • Agglomerative Methods
  • Example
  • Example (2)
  • Example (3)
  • Min vs Max
  • Reading Materials
  • Review
  • Next Weekrsquos Class Talk
  • Assignment
Page 20: Clustering

Comments on K-Means

Pros Simple and works well for regular disjoint clusters Converges relatively fast Relatively efficient and scalable O(tkn)

bull t iteration k number of centroids n number of data points

Cons Need to specify the value of K in advance

bull Difficult and domain knowledge may help May converge to local optima

bull In practice try different initial centroids May be sensitive to noisy data and outliers

bull Mean of data points hellip Not suitable for clusters of

bull Non-convex shapes

20

The Influence of Initial Centroids

21

The Influence of Initial Centroids

22

The K-Medoids Method

The basic idea is to use real data points as centres

Determine the value of K in advance

Randomly select K points as medoids

Assign each data point to the closest medoid

Calculate the cost of the configuration J

For each medoid m For each non-medoid point o Swap m and o and calculate the new cost of configuration Jprime

If the cost of the best new configuration J is lower than J make the corresponding swap and repeat the above steps

Otherwise terminate the procedure23

The K-Medoids Method

24

Cost =20 Cost =26

Sequential Leader Clustering

A very efficient clustering algorithm No iteration Time complexity O(nk)

No need to specify K in advance

Choose a cluster threshold value

For every new data point Compute the distance between the new data point and every clusters centre

If the distance is smaller than the chosen threshold assign the new data point to the corresponding cluster and re-compute cluster centre

Otherwise create a new cluster with the new data point as its centre

Clustering results may be influenced by the sequence of data points

25

Silhouette

A method of interpretation and validation of clusters of data

A succinct graphical representation of how well each data point lies within its cluster compared to other clusters

a(i) average dissimilarity of i with all other points in the same cluster

b(i) the lowest average dissimilarity of i to other clusters

26

)()(max)()()(iaibiaibis

Silhouette

27

-02 0 02 04 06 08 1

1

2

Silhouette Value

Clu

ster

-3 -2 -1 0 1 2 3 4-3

-2

-1

0

1

2

3

4

Gaussian Mixture

28

)2()(

2

22

21)(

xexg

1amp0)()(1

i

ii

n

iiii xgxf

Clustering by Mixture Models

29

K-Means Revisited

30

120579=(1199091 1199101 ) (1199092 119910 2)

119885=119862119897119906119904119905119890119903 1 119862119897119906119904119905119890119903 2

model parameters

latent parameters

Expectation Maximization

31

32

EM Gaussian Mixture

33

Gaussianjth by the generated is i instancer whethecomponents mixture ofnumber the

points data ofnumber the

ijznm

n

kk

x

j

x

n

kkki

jjiij

ki

ji

e

e

xxp

xxpzE

1

)(2

1

)(2

1

1

22

22

)|(

)|(][

m

iij

m

iiij

j

zE

xzE

1

1

][

][

m

iijj zE

m 1

][1

Density Based Methods

Generate clusters of arbitrary shapes

Robust against noise

No K value required in advance

Somewhat similar to human vision

34

DBSCAN

Density-Based Spatial Clustering of Applications with Noise

Density number of points within a specified radius

Core Point points with high density

Border Point points with low density but in the neighbourhood of a core point

Noise Point neither a core point nor a border point

35

Core Point

Noise Point

Border Point

DBSCAN

36

p

q

directly density reachable

p

q

density reachable

o

qp

density connected

DBSCAN

A cluster is defined as the maximal set of density connected points Start from a randomly selected unseen point P If P is a core point build a cluster by gradually adding all points that are

density reachable to the current point set Noise points are discarded (unlabelled)

37

Hierarchical Clustering

Produce a set of nested tree-like clusters

Can be visualized as a dendrogram Clustering is obtained by cutting at desired level No need to specify K in advance May correspond to meaningful taxonomies

38

Dinosaur Family Tree

39

Agglomerative Methods

Bottom-up Method

Assign each data point to a cluster

Calculate the proximity matrix

Merge the pair of closest clusters

Repeat until only a single cluster remains

How to calculate the distance between clusters

Single Link Minimum distance between points

Complete Link Maximum distance between points

40

Example

41

BA FI MI NA RM TO

BA 0 662 877 255 412 996

FI 662 0 295 468 268 400

MI 877 295 0 754 564 138

NA 255 468 754 0 219 869

RM 412 268 564 219 0 669

TO 996 400 138 869 669 0Single Link

Example

42

BA FI MITO NA RM

BA 0 662 877 255 412

FI 662 0 295 468 268

MITO 877 295 0 754 564

NA 255 468 754 0 219

RM 412 268 564 219 0

BA FI MITO NARM

BA 0 662 877 255FI 662 0 295 268

MITO 877 295 0 564

NARM 255 268 564 0

Example

43

BANARM FI MITO

BANARM 0 268 564FI 268 0 295

MITO 564 295 0

BAFINARM MITO

BAFINARM 0 295

MITO 295 0

Min vs Max

44

Reading Materials

Text Books Richard O Duda et al Pattern Classification Chapter 10 John Wiley amp Sons J Han and M Kamber Data Mining Concepts and Techniques Chapter 8

Morgan Kaufmann

Survey Papers A K Jain M N Murty and P J Flynn (1999) ldquoData Clustering A Reviewrdquo ACM

Computing Surveys Vol 31(3) pp 264-323 R Xu and D Wunsch (2005) ldquoSurvey of Clustering Algorithmsrdquo IEEE Transactions

on Neural Networks Vol 16(3) pp 645-678 A K Jain (2010) ldquoData Clustering 50 Years Beyond K-Meansrdquo Pattern

Recognition Letters Vol 31 pp 651-666

Online Tutorials httphomedeipolimiitmatteuccClusteringtutorial_html httpwwwautonlaborgtutorialskmeanshtml httpusersinformatikuni-hallede~hinneburClusterTutorial

45

Review

What is clustering

What are the two categories of clustering methods

How does the K-Means algorithm work

What are the major issues of K-Means

How to control the number of clusters in Sequential Leader Clustering

How to use Gaussian mixture models for clustering

What are the main advantages of density methods

What is the core idea of DBSCAN

What is the general procedure of hierarchical clustering

Which clustering methods do not require K as the input

46

Next Weekrsquos Class Talk

Volunteers are required for next weekrsquos class talk

Topic Affinity Propagation Science 315 972ndash976 2007 Clustering by passing messages between points httpwwwpsitorontoeduindexphpq=affinity20propagation

Topic Clustering by Fast Search and Find of Density Peaks Science 344 1492ndash1496 2014 Cluster centers higher density than neighbors Cluster centers distant from others points with higher densities

Length 20 minutes plus question time

47

Assignment

Topic Clustering Techniques and Applications

Techniques K-Means Another clustering method for comparison

Task 1 2D Artificial Datasets To demonstrate the influence of data patterns To demonstrate the influence of algorithm factors

Task 2 Image Segmentation Gray vs Colour

Deliverables Reports (experiment specification algorithm parameters in-depth analysis) Code (any programming language with detailed comments)

Due Sunday 28 December

Credit 1548

  • Clustering
  • Overview
  • What is cluster analysis
  • Clusters
  • Clusters (2)
  • Applications of Clustering
  • Earthquakes
  • Image Segmentation
  • The Big Picture
  • Requirements
  • Practical Considerations
  • Normalization or Not
  • Evaluation
  • Evaluation (2)
  • The Influence of Outliers
  • K-Means
  • K-Means (2)
  • K-Means (3)
  • K-Means (4)
  • Comments on K-Means
  • The Influence of Initial Centroids
  • The Influence of Initial Centroids (2)
  • The K-Medoids Method
  • The K-Medoids Method (2)
  • Sequential Leader Clustering
  • Silhouette
  • Silhouette (2)
  • Gaussian Mixture
  • Clustering by Mixture Models
  • K-Means Revisited
  • Expectation Maximization
  • Slide 32
  • EM Gaussian Mixture
  • Density Based Methods
  • DBSCAN
  • DBSCAN (2)
  • DBSCAN (3)
  • Hierarchical Clustering
  • Dinosaur Family Tree
  • Agglomerative Methods
  • Example
  • Example (2)
  • Example (3)
  • Min vs Max
  • Reading Materials
  • Review
  • Next Weekrsquos Class Talk
  • Assignment
Page 21: Clustering

The Influence of Initial Centroids

21

The Influence of Initial Centroids

22

The K-Medoids Method

The basic idea is to use real data points as centres

Determine the value of K in advance

Randomly select K points as medoids

Assign each data point to the closest medoid

Calculate the cost of the configuration J

For each medoid m For each non-medoid point o Swap m and o and calculate the new cost of configuration Jprime

If the cost of the best new configuration J is lower than J make the corresponding swap and repeat the above steps

Otherwise terminate the procedure23

The K-Medoids Method

24

Cost =20 Cost =26

Sequential Leader Clustering

A very efficient clustering algorithm No iteration Time complexity O(nk)

No need to specify K in advance

Choose a cluster threshold value

For every new data point Compute the distance between the new data point and every clusters centre

If the distance is smaller than the chosen threshold assign the new data point to the corresponding cluster and re-compute cluster centre

Otherwise create a new cluster with the new data point as its centre

Clustering results may be influenced by the sequence of data points

25

Silhouette

A method of interpretation and validation of clusters of data

A succinct graphical representation of how well each data point lies within its cluster compared to other clusters

a(i) average dissimilarity of i with all other points in the same cluster

b(i) the lowest average dissimilarity of i to other clusters

26

)()(max)()()(iaibiaibis

Silhouette

27

-02 0 02 04 06 08 1

1

2

Silhouette Value

Clu

ster

-3 -2 -1 0 1 2 3 4-3

-2

-1

0

1

2

3

4

Gaussian Mixture

28

)2()(

2

22

21)(

xexg

1amp0)()(1

i

ii

n

iiii xgxf

Clustering by Mixture Models

29

K-Means Revisited

30

120579=(1199091 1199101 ) (1199092 119910 2)

119885=119862119897119906119904119905119890119903 1 119862119897119906119904119905119890119903 2

model parameters

latent parameters

Expectation Maximization

31

32

EM Gaussian Mixture

33

Gaussianjth by the generated is i instancer whethecomponents mixture ofnumber the

points data ofnumber the

ijznm

n

kk

x

j

x

n

kkki

jjiij

ki

ji

e

e

xxp

xxpzE

1

)(2

1

)(2

1

1

22

22

)|(

)|(][

m

iij

m

iiij

j

zE

xzE

1

1

][

][

m

iijj zE

m 1

][1

Density Based Methods

Generate clusters of arbitrary shapes

Robust against noise

No K value required in advance

Somewhat similar to human vision

34

DBSCAN

Density-Based Spatial Clustering of Applications with Noise

Density number of points within a specified radius

Core Point points with high density

Border Point points with low density but in the neighbourhood of a core point

Noise Point neither a core point nor a border point

35

Core Point

Noise Point

Border Point

DBSCAN

36

p

q

directly density reachable

p

q

density reachable

o

qp

density connected

DBSCAN

A cluster is defined as the maximal set of density connected points Start from a randomly selected unseen point P If P is a core point build a cluster by gradually adding all points that are

density reachable to the current point set Noise points are discarded (unlabelled)

37

Hierarchical Clustering

Produce a set of nested tree-like clusters

Can be visualized as a dendrogram Clustering is obtained by cutting at desired level No need to specify K in advance May correspond to meaningful taxonomies

38

Dinosaur Family Tree

39

Agglomerative Methods

Bottom-up Method

Assign each data point to a cluster

Calculate the proximity matrix

Merge the pair of closest clusters

Repeat until only a single cluster remains

How to calculate the distance between clusters

Single Link Minimum distance between points

Complete Link Maximum distance between points

40

Example

41

BA FI MI NA RM TO

BA 0 662 877 255 412 996

FI 662 0 295 468 268 400

MI 877 295 0 754 564 138

NA 255 468 754 0 219 869

RM 412 268 564 219 0 669

TO 996 400 138 869 669 0Single Link

Example

42

BA FI MITO NA RM

BA 0 662 877 255 412

FI 662 0 295 468 268

MITO 877 295 0 754 564

NA 255 468 754 0 219

RM 412 268 564 219 0

BA FI MITO NARM

BA 0 662 877 255FI 662 0 295 268

MITO 877 295 0 564

NARM 255 268 564 0

Example

43

BANARM FI MITO

BANARM 0 268 564FI 268 0 295

MITO 564 295 0

BAFINARM MITO

BAFINARM 0 295

MITO 295 0

Min vs Max

44

Reading Materials

Text Books Richard O Duda et al Pattern Classification Chapter 10 John Wiley amp Sons J Han and M Kamber Data Mining Concepts and Techniques Chapter 8

Morgan Kaufmann

Survey Papers A K Jain M N Murty and P J Flynn (1999) ldquoData Clustering A Reviewrdquo ACM

Computing Surveys Vol 31(3) pp 264-323 R Xu and D Wunsch (2005) ldquoSurvey of Clustering Algorithmsrdquo IEEE Transactions

on Neural Networks Vol 16(3) pp 645-678 A K Jain (2010) ldquoData Clustering 50 Years Beyond K-Meansrdquo Pattern

Recognition Letters Vol 31 pp 651-666

Online Tutorials httphomedeipolimiitmatteuccClusteringtutorial_html httpwwwautonlaborgtutorialskmeanshtml httpusersinformatikuni-hallede~hinneburClusterTutorial

45

Review

What is clustering

What are the two categories of clustering methods

How does the K-Means algorithm work

What are the major issues of K-Means

How to control the number of clusters in Sequential Leader Clustering

How to use Gaussian mixture models for clustering

What are the main advantages of density methods

What is the core idea of DBSCAN

What is the general procedure of hierarchical clustering

Which clustering methods do not require K as the input

46

Next Weekrsquos Class Talk

Volunteers are required for next weekrsquos class talk

Topic Affinity Propagation Science 315 972ndash976 2007 Clustering by passing messages between points httpwwwpsitorontoeduindexphpq=affinity20propagation

Topic Clustering by Fast Search and Find of Density Peaks Science 344 1492ndash1496 2014 Cluster centers higher density than neighbors Cluster centers distant from others points with higher densities

Length 20 minutes plus question time

47

Assignment

Topic Clustering Techniques and Applications

Techniques K-Means Another clustering method for comparison

Task 1 2D Artificial Datasets To demonstrate the influence of data patterns To demonstrate the influence of algorithm factors

Task 2 Image Segmentation Gray vs Colour

Deliverables Reports (experiment specification algorithm parameters in-depth analysis) Code (any programming language with detailed comments)

Due Sunday 28 December

Credit 1548

  • Clustering
  • Overview
  • What is cluster analysis
  • Clusters
  • Clusters (2)
  • Applications of Clustering
  • Earthquakes
  • Image Segmentation
  • The Big Picture
  • Requirements
  • Practical Considerations
  • Normalization or Not
  • Evaluation
  • Evaluation (2)
  • The Influence of Outliers
  • K-Means
  • K-Means (2)
  • K-Means (3)
  • K-Means (4)
  • Comments on K-Means
  • The Influence of Initial Centroids
  • The Influence of Initial Centroids (2)
  • The K-Medoids Method
  • The K-Medoids Method (2)
  • Sequential Leader Clustering
  • Silhouette
  • Silhouette (2)
  • Gaussian Mixture
  • Clustering by Mixture Models
  • K-Means Revisited
  • Expectation Maximization
  • Slide 32
  • EM Gaussian Mixture
  • Density Based Methods
  • DBSCAN
  • DBSCAN (2)
  • DBSCAN (3)
  • Hierarchical Clustering
  • Dinosaur Family Tree
  • Agglomerative Methods
  • Example
  • Example (2)
  • Example (3)
  • Min vs Max
  • Reading Materials
  • Review
  • Next Weekrsquos Class Talk
  • Assignment
Page 22: Clustering

The Influence of Initial Centroids

22

The K-Medoids Method

The basic idea is to use real data points as centres

Determine the value of K in advance

Randomly select K points as medoids

Assign each data point to the closest medoid

Calculate the cost of the configuration J

For each medoid m For each non-medoid point o Swap m and o and calculate the new cost of configuration Jprime

If the cost of the best new configuration J is lower than J make the corresponding swap and repeat the above steps

Otherwise terminate the procedure23

The K-Medoids Method

24

Cost =20 Cost =26

Sequential Leader Clustering

A very efficient clustering algorithm No iteration Time complexity O(nk)

No need to specify K in advance

Choose a cluster threshold value

For every new data point Compute the distance between the new data point and every clusters centre

If the distance is smaller than the chosen threshold assign the new data point to the corresponding cluster and re-compute cluster centre

Otherwise create a new cluster with the new data point as its centre

Clustering results may be influenced by the sequence of data points

25

Silhouette

A method of interpretation and validation of clusters of data

A succinct graphical representation of how well each data point lies within its cluster compared to other clusters

a(i) average dissimilarity of i with all other points in the same cluster

b(i) the lowest average dissimilarity of i to other clusters

26

)()(max)()()(iaibiaibis

Silhouette

27

-02 0 02 04 06 08 1

1

2

Silhouette Value

Clu

ster

-3 -2 -1 0 1 2 3 4-3

-2

-1

0

1

2

3

4

Gaussian Mixture

28

)2()(

2

22

21)(

xexg

1amp0)()(1

i

ii

n

iiii xgxf

Clustering by Mixture Models

29

K-Means Revisited

30

120579=(1199091 1199101 ) (1199092 119910 2)

119885=119862119897119906119904119905119890119903 1 119862119897119906119904119905119890119903 2

model parameters

latent parameters

Expectation Maximization

31

32

EM Gaussian Mixture

33

Gaussianjth by the generated is i instancer whethecomponents mixture ofnumber the

points data ofnumber the

ijznm

n

kk

x

j

x

n

kkki

jjiij

ki

ji

e

e

xxp

xxpzE

1

)(2

1

)(2

1

1

22

22

)|(

)|(][

m

iij

m

iiij

j

zE

xzE

1

1

][

][

m

iijj zE

m 1

][1

Density Based Methods

Generate clusters of arbitrary shapes

Robust against noise

No K value required in advance

Somewhat similar to human vision

34

DBSCAN

Density-Based Spatial Clustering of Applications with Noise

Density number of points within a specified radius

Core Point points with high density

Border Point points with low density but in the neighbourhood of a core point

Noise Point neither a core point nor a border point

35

Core Point

Noise Point

Border Point

DBSCAN

36

p

q

directly density reachable

p

q

density reachable

o

qp

density connected

DBSCAN

A cluster is defined as the maximal set of density connected points Start from a randomly selected unseen point P If P is a core point build a cluster by gradually adding all points that are

density reachable to the current point set Noise points are discarded (unlabelled)

37

Hierarchical Clustering

Produce a set of nested tree-like clusters

Can be visualized as a dendrogram Clustering is obtained by cutting at desired level No need to specify K in advance May correspond to meaningful taxonomies

38

Dinosaur Family Tree

39

Agglomerative Methods

Bottom-up Method

Assign each data point to a cluster

Calculate the proximity matrix

Merge the pair of closest clusters

Repeat until only a single cluster remains

How to calculate the distance between clusters

Single Link Minimum distance between points

Complete Link Maximum distance between points

40

Example

41

BA FI MI NA RM TO

BA 0 662 877 255 412 996

FI 662 0 295 468 268 400

MI 877 295 0 754 564 138

NA 255 468 754 0 219 869

RM 412 268 564 219 0 669

TO 996 400 138 869 669 0Single Link

Example

42

BA FI MITO NA RM

BA 0 662 877 255 412

FI 662 0 295 468 268

MITO 877 295 0 754 564

NA 255 468 754 0 219

RM 412 268 564 219 0

BA FI MITO NARM

BA 0 662 877 255FI 662 0 295 268

MITO 877 295 0 564

NARM 255 268 564 0

Example

43

BANARM FI MITO

BANARM 0 268 564FI 268 0 295

MITO 564 295 0

BAFINARM MITO

BAFINARM 0 295

MITO 295 0

Min vs Max

44

Reading Materials

Text Books Richard O Duda et al Pattern Classification Chapter 10 John Wiley amp Sons J Han and M Kamber Data Mining Concepts and Techniques Chapter 8

Morgan Kaufmann

Survey Papers A K Jain M N Murty and P J Flynn (1999) ldquoData Clustering A Reviewrdquo ACM

Computing Surveys Vol 31(3) pp 264-323 R Xu and D Wunsch (2005) ldquoSurvey of Clustering Algorithmsrdquo IEEE Transactions

on Neural Networks Vol 16(3) pp 645-678 A K Jain (2010) ldquoData Clustering 50 Years Beyond K-Meansrdquo Pattern

Recognition Letters Vol 31 pp 651-666

Online Tutorials httphomedeipolimiitmatteuccClusteringtutorial_html httpwwwautonlaborgtutorialskmeanshtml httpusersinformatikuni-hallede~hinneburClusterTutorial

45

Review

What is clustering

What are the two categories of clustering methods

How does the K-Means algorithm work

What are the major issues of K-Means

How to control the number of clusters in Sequential Leader Clustering

How to use Gaussian mixture models for clustering

What are the main advantages of density methods

What is the core idea of DBSCAN

What is the general procedure of hierarchical clustering

Which clustering methods do not require K as the input

46

Next Weekrsquos Class Talk

Volunteers are required for next weekrsquos class talk

Topic Affinity Propagation Science 315 972ndash976 2007 Clustering by passing messages between points httpwwwpsitorontoeduindexphpq=affinity20propagation

Topic Clustering by Fast Search and Find of Density Peaks Science 344 1492ndash1496 2014 Cluster centers higher density than neighbors Cluster centers distant from others points with higher densities

Length 20 minutes plus question time

47

Assignment

Topic Clustering Techniques and Applications

Techniques K-Means Another clustering method for comparison

Task 1 2D Artificial Datasets To demonstrate the influence of data patterns To demonstrate the influence of algorithm factors

Task 2 Image Segmentation Gray vs Colour

Deliverables Reports (experiment specification algorithm parameters in-depth analysis) Code (any programming language with detailed comments)

Due Sunday 28 December

Credit 1548

  • Clustering
  • Overview
  • What is cluster analysis
  • Clusters
  • Clusters (2)
  • Applications of Clustering
  • Earthquakes
  • Image Segmentation
  • The Big Picture
  • Requirements
  • Practical Considerations
  • Normalization or Not
  • Evaluation
  • Evaluation (2)
  • The Influence of Outliers
  • K-Means
  • K-Means (2)
  • K-Means (3)
  • K-Means (4)
  • Comments on K-Means
  • The Influence of Initial Centroids
  • The Influence of Initial Centroids (2)
  • The K-Medoids Method
  • The K-Medoids Method (2)
  • Sequential Leader Clustering
  • Silhouette
  • Silhouette (2)
  • Gaussian Mixture
  • Clustering by Mixture Models
  • K-Means Revisited
  • Expectation Maximization
  • Slide 32
  • EM Gaussian Mixture
  • Density Based Methods
  • DBSCAN
  • DBSCAN (2)
  • DBSCAN (3)
  • Hierarchical Clustering
  • Dinosaur Family Tree
  • Agglomerative Methods
  • Example
  • Example (2)
  • Example (3)
  • Min vs Max
  • Reading Materials
  • Review
  • Next Weekrsquos Class Talk
  • Assignment
Page 23: Clustering

The K-Medoids Method

The basic idea is to use real data points as centres

Determine the value of K in advance

Randomly select K points as medoids

Assign each data point to the closest medoid

Calculate the cost of the configuration J

For each medoid m For each non-medoid point o Swap m and o and calculate the new cost of configuration Jprime

If the cost of the best new configuration J is lower than J make the corresponding swap and repeat the above steps

Otherwise terminate the procedure23

The K-Medoids Method

24

Cost =20 Cost =26

Sequential Leader Clustering

A very efficient clustering algorithm No iteration Time complexity O(nk)

No need to specify K in advance

Choose a cluster threshold value

For every new data point Compute the distance between the new data point and every clusters centre

If the distance is smaller than the chosen threshold assign the new data point to the corresponding cluster and re-compute cluster centre

Otherwise create a new cluster with the new data point as its centre

Clustering results may be influenced by the sequence of data points

25

Silhouette

A method of interpretation and validation of clusters of data

A succinct graphical representation of how well each data point lies within its cluster compared to other clusters

a(i) average dissimilarity of i with all other points in the same cluster

b(i) the lowest average dissimilarity of i to other clusters

26

)()(max)()()(iaibiaibis

Silhouette

27

-02 0 02 04 06 08 1

1

2

Silhouette Value

Clu

ster

-3 -2 -1 0 1 2 3 4-3

-2

-1

0

1

2

3

4

Gaussian Mixture

28

)2()(

2

22

21)(

xexg

1amp0)()(1

i

ii

n

iiii xgxf

Clustering by Mixture Models

29

K-Means Revisited

30

120579=(1199091 1199101 ) (1199092 119910 2)

119885=119862119897119906119904119905119890119903 1 119862119897119906119904119905119890119903 2

model parameters

latent parameters

Expectation Maximization

31

32

EM Gaussian Mixture

33

Gaussianjth by the generated is i instancer whethecomponents mixture ofnumber the

points data ofnumber the

ijznm

n

kk

x

j

x

n

kkki

jjiij

ki

ji

e

e

xxp

xxpzE

1

)(2

1

)(2

1

1

22

22

)|(

)|(][

m

iij

m

iiij

j

zE

xzE

1

1

][

][

m

iijj zE

m 1

][1

Density Based Methods

Generate clusters of arbitrary shapes

Robust against noise

No K value required in advance

Somewhat similar to human vision

34

DBSCAN

Density-Based Spatial Clustering of Applications with Noise

Density number of points within a specified radius

Core Point points with high density

Border Point points with low density but in the neighbourhood of a core point

Noise Point neither a core point nor a border point

35

Core Point

Noise Point

Border Point

DBSCAN

36

p

q

directly density reachable

p

q

density reachable

o

qp

density connected

DBSCAN

A cluster is defined as the maximal set of density connected points Start from a randomly selected unseen point P If P is a core point build a cluster by gradually adding all points that are

density reachable to the current point set Noise points are discarded (unlabelled)

37

Hierarchical Clustering

Produce a set of nested tree-like clusters

Can be visualized as a dendrogram Clustering is obtained by cutting at desired level No need to specify K in advance May correspond to meaningful taxonomies

38

Dinosaur Family Tree

39

Agglomerative Methods

Bottom-up Method

Assign each data point to a cluster

Calculate the proximity matrix

Merge the pair of closest clusters

Repeat until only a single cluster remains

How to calculate the distance between clusters

Single Link Minimum distance between points

Complete Link Maximum distance between points

40

Example

41

BA FI MI NA RM TO

BA 0 662 877 255 412 996

FI 662 0 295 468 268 400

MI 877 295 0 754 564 138

NA 255 468 754 0 219 869

RM 412 268 564 219 0 669

TO 996 400 138 869 669 0Single Link

Example

42

BA FI MITO NA RM

BA 0 662 877 255 412

FI 662 0 295 468 268

MITO 877 295 0 754 564

NA 255 468 754 0 219

RM 412 268 564 219 0

BA FI MITO NARM

BA 0 662 877 255FI 662 0 295 268

MITO 877 295 0 564

NARM 255 268 564 0

Example

43

BANARM FI MITO

BANARM 0 268 564FI 268 0 295

MITO 564 295 0

BAFINARM MITO

BAFINARM 0 295

MITO 295 0

Min vs Max

44

Reading Materials

Text Books Richard O Duda et al Pattern Classification Chapter 10 John Wiley amp Sons J Han and M Kamber Data Mining Concepts and Techniques Chapter 8

Morgan Kaufmann

Survey Papers A K Jain M N Murty and P J Flynn (1999) ldquoData Clustering A Reviewrdquo ACM

Computing Surveys Vol 31(3) pp 264-323 R Xu and D Wunsch (2005) ldquoSurvey of Clustering Algorithmsrdquo IEEE Transactions

on Neural Networks Vol 16(3) pp 645-678 A K Jain (2010) ldquoData Clustering 50 Years Beyond K-Meansrdquo Pattern

Recognition Letters Vol 31 pp 651-666

Online Tutorials httphomedeipolimiitmatteuccClusteringtutorial_html httpwwwautonlaborgtutorialskmeanshtml httpusersinformatikuni-hallede~hinneburClusterTutorial

45

Review

What is clustering

What are the two categories of clustering methods

How does the K-Means algorithm work

What are the major issues of K-Means

How to control the number of clusters in Sequential Leader Clustering

How to use Gaussian mixture models for clustering

What are the main advantages of density methods

What is the core idea of DBSCAN

What is the general procedure of hierarchical clustering

Which clustering methods do not require K as the input

46

Next Weekrsquos Class Talk

Volunteers are required for next weekrsquos class talk

Topic Affinity Propagation Science 315 972ndash976 2007 Clustering by passing messages between points httpwwwpsitorontoeduindexphpq=affinity20propagation

Topic Clustering by Fast Search and Find of Density Peaks Science 344 1492ndash1496 2014 Cluster centers higher density than neighbors Cluster centers distant from others points with higher densities

Length 20 minutes plus question time

47

Assignment

Topic Clustering Techniques and Applications

Techniques K-Means Another clustering method for comparison

Task 1 2D Artificial Datasets To demonstrate the influence of data patterns To demonstrate the influence of algorithm factors

Task 2 Image Segmentation Gray vs Colour

Deliverables Reports (experiment specification algorithm parameters in-depth analysis) Code (any programming language with detailed comments)

Due Sunday 28 December

Credit 1548

  • Clustering
  • Overview
  • What is cluster analysis
  • Clusters
  • Clusters (2)
  • Applications of Clustering
  • Earthquakes
  • Image Segmentation
  • The Big Picture
  • Requirements
  • Practical Considerations
  • Normalization or Not
  • Evaluation
  • Evaluation (2)
  • The Influence of Outliers
  • K-Means
  • K-Means (2)
  • K-Means (3)
  • K-Means (4)
  • Comments on K-Means
  • The Influence of Initial Centroids
  • The Influence of Initial Centroids (2)
  • The K-Medoids Method
  • The K-Medoids Method (2)
  • Sequential Leader Clustering
  • Silhouette
  • Silhouette (2)
  • Gaussian Mixture
  • Clustering by Mixture Models
  • K-Means Revisited
  • Expectation Maximization
  • Slide 32
  • EM Gaussian Mixture
  • Density Based Methods
  • DBSCAN
  • DBSCAN (2)
  • DBSCAN (3)
  • Hierarchical Clustering
  • Dinosaur Family Tree
  • Agglomerative Methods
  • Example
  • Example (2)
  • Example (3)
  • Min vs Max
  • Reading Materials
  • Review
  • Next Weekrsquos Class Talk
  • Assignment
Page 24: Clustering

The K-Medoids Method

24

Cost =20 Cost =26

Sequential Leader Clustering

A very efficient clustering algorithm No iteration Time complexity O(nk)

No need to specify K in advance

Choose a cluster threshold value

For every new data point Compute the distance between the new data point and every clusters centre

If the distance is smaller than the chosen threshold assign the new data point to the corresponding cluster and re-compute cluster centre

Otherwise create a new cluster with the new data point as its centre

Clustering results may be influenced by the sequence of data points

25

Silhouette

A method of interpretation and validation of clusters of data

A succinct graphical representation of how well each data point lies within its cluster compared to other clusters

a(i) average dissimilarity of i with all other points in the same cluster

b(i) the lowest average dissimilarity of i to other clusters

26

)()(max)()()(iaibiaibis

Silhouette

27

-02 0 02 04 06 08 1

1

2

Silhouette Value

Clu

ster

-3 -2 -1 0 1 2 3 4-3

-2

-1

0

1

2

3

4

Gaussian Mixture

28

)2()(

2

22

21)(

xexg

1amp0)()(1

i

ii

n

iiii xgxf

Clustering by Mixture Models

29

K-Means Revisited

30

120579=(1199091 1199101 ) (1199092 119910 2)

119885=119862119897119906119904119905119890119903 1 119862119897119906119904119905119890119903 2

model parameters

latent parameters

Expectation Maximization

31

32

EM Gaussian Mixture

33

Gaussianjth by the generated is i instancer whethecomponents mixture ofnumber the

points data ofnumber the

ijznm

n

kk

x

j

x

n

kkki

jjiij

ki

ji

e

e

xxp

xxpzE

1

)(2

1

)(2

1

1

22

22

)|(

)|(][

m

iij

m

iiij

j

zE

xzE

1

1

][

][

m

iijj zE

m 1

][1

Density Based Methods

Generate clusters of arbitrary shapes

Robust against noise

No K value required in advance

Somewhat similar to human vision

34

DBSCAN

Density-Based Spatial Clustering of Applications with Noise

Density number of points within a specified radius

Core Point points with high density

Border Point points with low density but in the neighbourhood of a core point

Noise Point neither a core point nor a border point

35

Core Point

Noise Point

Border Point

DBSCAN

36

p

q

directly density reachable

p

q

density reachable

o

qp

density connected

DBSCAN

A cluster is defined as the maximal set of density connected points Start from a randomly selected unseen point P If P is a core point build a cluster by gradually adding all points that are

density reachable to the current point set Noise points are discarded (unlabelled)

37

Hierarchical Clustering

Produce a set of nested tree-like clusters

Can be visualized as a dendrogram Clustering is obtained by cutting at desired level No need to specify K in advance May correspond to meaningful taxonomies

38

Dinosaur Family Tree

39

Agglomerative Methods

Bottom-up Method

Assign each data point to a cluster

Calculate the proximity matrix

Merge the pair of closest clusters

Repeat until only a single cluster remains

How to calculate the distance between clusters

Single Link Minimum distance between points

Complete Link Maximum distance between points

40

Example

41

BA FI MI NA RM TO

BA 0 662 877 255 412 996

FI 662 0 295 468 268 400

MI 877 295 0 754 564 138

NA 255 468 754 0 219 869

RM 412 268 564 219 0 669

TO 996 400 138 869 669 0Single Link

Example

42

BA FI MITO NA RM

BA 0 662 877 255 412

FI 662 0 295 468 268

MITO 877 295 0 754 564

NA 255 468 754 0 219

RM 412 268 564 219 0

BA FI MITO NARM

BA 0 662 877 255FI 662 0 295 268

MITO 877 295 0 564

NARM 255 268 564 0

Example

43

BANARM FI MITO

BANARM 0 268 564FI 268 0 295

MITO 564 295 0

BAFINARM MITO

BAFINARM 0 295

MITO 295 0

Min vs Max

44

Reading Materials

Text Books Richard O Duda et al Pattern Classification Chapter 10 John Wiley amp Sons J Han and M Kamber Data Mining Concepts and Techniques Chapter 8

Morgan Kaufmann

Survey Papers A K Jain M N Murty and P J Flynn (1999) ldquoData Clustering A Reviewrdquo ACM

Computing Surveys Vol 31(3) pp 264-323 R Xu and D Wunsch (2005) ldquoSurvey of Clustering Algorithmsrdquo IEEE Transactions

on Neural Networks Vol 16(3) pp 645-678 A K Jain (2010) ldquoData Clustering 50 Years Beyond K-Meansrdquo Pattern

Recognition Letters Vol 31 pp 651-666

Online Tutorials httphomedeipolimiitmatteuccClusteringtutorial_html httpwwwautonlaborgtutorialskmeanshtml httpusersinformatikuni-hallede~hinneburClusterTutorial

45

Review

What is clustering

What are the two categories of clustering methods

How does the K-Means algorithm work

What are the major issues of K-Means

How to control the number of clusters in Sequential Leader Clustering

How to use Gaussian mixture models for clustering

What are the main advantages of density methods

What is the core idea of DBSCAN

What is the general procedure of hierarchical clustering

Which clustering methods do not require K as the input

46

Next Weekrsquos Class Talk

Volunteers are required for next weekrsquos class talk

Topic Affinity Propagation Science 315 972ndash976 2007 Clustering by passing messages between points httpwwwpsitorontoeduindexphpq=affinity20propagation

Topic Clustering by Fast Search and Find of Density Peaks Science 344 1492ndash1496 2014 Cluster centers higher density than neighbors Cluster centers distant from others points with higher densities

Length 20 minutes plus question time

47

Assignment

Topic Clustering Techniques and Applications

Techniques K-Means Another clustering method for comparison

Task 1 2D Artificial Datasets To demonstrate the influence of data patterns To demonstrate the influence of algorithm factors

Task 2 Image Segmentation Gray vs Colour

Deliverables Reports (experiment specification algorithm parameters in-depth analysis) Code (any programming language with detailed comments)

Due Sunday 28 December

Credit 1548

  • Clustering
  • Overview
  • What is cluster analysis
  • Clusters
  • Clusters (2)
  • Applications of Clustering
  • Earthquakes
  • Image Segmentation
  • The Big Picture
  • Requirements
  • Practical Considerations
  • Normalization or Not
  • Evaluation
  • Evaluation (2)
  • The Influence of Outliers
  • K-Means
  • K-Means (2)
  • K-Means (3)
  • K-Means (4)
  • Comments on K-Means
  • The Influence of Initial Centroids
  • The Influence of Initial Centroids (2)
  • The K-Medoids Method
  • The K-Medoids Method (2)
  • Sequential Leader Clustering
  • Silhouette
  • Silhouette (2)
  • Gaussian Mixture
  • Clustering by Mixture Models
  • K-Means Revisited
  • Expectation Maximization
  • Slide 32
  • EM Gaussian Mixture
  • Density Based Methods
  • DBSCAN
  • DBSCAN (2)
  • DBSCAN (3)
  • Hierarchical Clustering
  • Dinosaur Family Tree
  • Agglomerative Methods
  • Example
  • Example (2)
  • Example (3)
  • Min vs Max
  • Reading Materials
  • Review
  • Next Weekrsquos Class Talk
  • Assignment
Page 25: Clustering

Sequential Leader Clustering

A very efficient clustering algorithm No iteration Time complexity O(nk)

No need to specify K in advance

Choose a cluster threshold value

For every new data point Compute the distance between the new data point and every clusters centre

If the distance is smaller than the chosen threshold assign the new data point to the corresponding cluster and re-compute cluster centre

Otherwise create a new cluster with the new data point as its centre

Clustering results may be influenced by the sequence of data points

25

Silhouette

A method of interpretation and validation of clusters of data

A succinct graphical representation of how well each data point lies within its cluster compared to other clusters

a(i) average dissimilarity of i with all other points in the same cluster

b(i) the lowest average dissimilarity of i to other clusters

26

)()(max)()()(iaibiaibis

Silhouette

27

-02 0 02 04 06 08 1

1

2

Silhouette Value

Clu

ster

-3 -2 -1 0 1 2 3 4-3

-2

-1

0

1

2

3

4

Gaussian Mixture

28

)2()(

2

22

21)(

xexg

1amp0)()(1

i

ii

n

iiii xgxf

Clustering by Mixture Models

29

K-Means Revisited

30

120579=(1199091 1199101 ) (1199092 119910 2)

119885=119862119897119906119904119905119890119903 1 119862119897119906119904119905119890119903 2

model parameters

latent parameters

Expectation Maximization

31

32

EM Gaussian Mixture

33

Gaussianjth by the generated is i instancer whethecomponents mixture ofnumber the

points data ofnumber the

ijznm

n

kk

x

j

x

n

kkki

jjiij

ki

ji

e

e

xxp

xxpzE

1

)(2

1

)(2

1

1

22

22

)|(

)|(][

m

iij

m

iiij

j

zE

xzE

1

1

][

][

m

iijj zE

m 1

][1

Density Based Methods

Generate clusters of arbitrary shapes

Robust against noise

No K value required in advance

Somewhat similar to human vision

34

DBSCAN

Density-Based Spatial Clustering of Applications with Noise

Density number of points within a specified radius

Core Point points with high density

Border Point points with low density but in the neighbourhood of a core point

Noise Point neither a core point nor a border point

35

Core Point

Noise Point

Border Point

DBSCAN

36

p

q

directly density reachable

p

q

density reachable

o

qp

density connected

DBSCAN

A cluster is defined as the maximal set of density connected points Start from a randomly selected unseen point P If P is a core point build a cluster by gradually adding all points that are

density reachable to the current point set Noise points are discarded (unlabelled)

37

Hierarchical Clustering

Produce a set of nested tree-like clusters

Can be visualized as a dendrogram Clustering is obtained by cutting at desired level No need to specify K in advance May correspond to meaningful taxonomies

38

Dinosaur Family Tree

39

Agglomerative Methods

Bottom-up Method

Assign each data point to a cluster

Calculate the proximity matrix

Merge the pair of closest clusters

Repeat until only a single cluster remains

How to calculate the distance between clusters

Single Link Minimum distance between points

Complete Link Maximum distance between points

40

Example

41

BA FI MI NA RM TO

BA 0 662 877 255 412 996

FI 662 0 295 468 268 400

MI 877 295 0 754 564 138

NA 255 468 754 0 219 869

RM 412 268 564 219 0 669

TO 996 400 138 869 669 0Single Link

Example

42

BA FI MITO NA RM

BA 0 662 877 255 412

FI 662 0 295 468 268

MITO 877 295 0 754 564

NA 255 468 754 0 219

RM 412 268 564 219 0

BA FI MITO NARM

BA 0 662 877 255FI 662 0 295 268

MITO 877 295 0 564

NARM 255 268 564 0

Example

43

BANARM FI MITO

BANARM 0 268 564FI 268 0 295

MITO 564 295 0

BAFINARM MITO

BAFINARM 0 295

MITO 295 0

Min vs Max

44

Reading Materials

Text Books Richard O Duda et al Pattern Classification Chapter 10 John Wiley amp Sons J Han and M Kamber Data Mining Concepts and Techniques Chapter 8

Morgan Kaufmann

Survey Papers A K Jain M N Murty and P J Flynn (1999) ldquoData Clustering A Reviewrdquo ACM

Computing Surveys Vol 31(3) pp 264-323 R Xu and D Wunsch (2005) ldquoSurvey of Clustering Algorithmsrdquo IEEE Transactions

on Neural Networks Vol 16(3) pp 645-678 A K Jain (2010) ldquoData Clustering 50 Years Beyond K-Meansrdquo Pattern

Recognition Letters Vol 31 pp 651-666

Online Tutorials httphomedeipolimiitmatteuccClusteringtutorial_html httpwwwautonlaborgtutorialskmeanshtml httpusersinformatikuni-hallede~hinneburClusterTutorial

45

Review

What is clustering

What are the two categories of clustering methods

How does the K-Means algorithm work

What are the major issues of K-Means

How to control the number of clusters in Sequential Leader Clustering

How to use Gaussian mixture models for clustering

What are the main advantages of density methods

What is the core idea of DBSCAN

What is the general procedure of hierarchical clustering

Which clustering methods do not require K as the input

46

Next Weekrsquos Class Talk

Volunteers are required for next weekrsquos class talk

Topic Affinity Propagation Science 315 972ndash976 2007 Clustering by passing messages between points httpwwwpsitorontoeduindexphpq=affinity20propagation

Topic Clustering by Fast Search and Find of Density Peaks Science 344 1492ndash1496 2014 Cluster centers higher density than neighbors Cluster centers distant from others points with higher densities

Length 20 minutes plus question time

47

Assignment

Topic Clustering Techniques and Applications

Techniques K-Means Another clustering method for comparison

Task 1 2D Artificial Datasets To demonstrate the influence of data patterns To demonstrate the influence of algorithm factors

Task 2 Image Segmentation Gray vs Colour

Deliverables Reports (experiment specification algorithm parameters in-depth analysis) Code (any programming language with detailed comments)

Due Sunday 28 December

Credit 1548

  • Clustering
  • Overview
  • What is cluster analysis
  • Clusters
  • Clusters (2)
  • Applications of Clustering
  • Earthquakes
  • Image Segmentation
  • The Big Picture
  • Requirements
  • Practical Considerations
  • Normalization or Not
  • Evaluation
  • Evaluation (2)
  • The Influence of Outliers
  • K-Means
  • K-Means (2)
  • K-Means (3)
  • K-Means (4)
  • Comments on K-Means
  • The Influence of Initial Centroids
  • The Influence of Initial Centroids (2)
  • The K-Medoids Method
  • The K-Medoids Method (2)
  • Sequential Leader Clustering
  • Silhouette
  • Silhouette (2)
  • Gaussian Mixture
  • Clustering by Mixture Models
  • K-Means Revisited
  • Expectation Maximization
  • Slide 32
  • EM Gaussian Mixture
  • Density Based Methods
  • DBSCAN
  • DBSCAN (2)
  • DBSCAN (3)
  • Hierarchical Clustering
  • Dinosaur Family Tree
  • Agglomerative Methods
  • Example
  • Example (2)
  • Example (3)
  • Min vs Max
  • Reading Materials
  • Review
  • Next Weekrsquos Class Talk
  • Assignment
Page 26: Clustering

Silhouette

A method of interpretation and validation of clusters of data

A succinct graphical representation of how well each data point lies within its cluster compared to other clusters

a(i) average dissimilarity of i with all other points in the same cluster

b(i) the lowest average dissimilarity of i to other clusters

26

)()(max)()()(iaibiaibis

Silhouette

27

-02 0 02 04 06 08 1

1

2

Silhouette Value

Clu

ster

-3 -2 -1 0 1 2 3 4-3

-2

-1

0

1

2

3

4

Gaussian Mixture

28

)2()(

2

22

21)(

xexg

1amp0)()(1

i

ii

n

iiii xgxf

Clustering by Mixture Models

29

K-Means Revisited

30

120579=(1199091 1199101 ) (1199092 119910 2)

119885=119862119897119906119904119905119890119903 1 119862119897119906119904119905119890119903 2

model parameters

latent parameters

Expectation Maximization

31

32

EM Gaussian Mixture

33

Gaussianjth by the generated is i instancer whethecomponents mixture ofnumber the

points data ofnumber the

ijznm

n

kk

x

j

x

n

kkki

jjiij

ki

ji

e

e

xxp

xxpzE

1

)(2

1

)(2

1

1

22

22

)|(

)|(][

m

iij

m

iiij

j

zE

xzE

1

1

][

][

m

iijj zE

m 1

][1

Density Based Methods

Generate clusters of arbitrary shapes

Robust against noise

No K value required in advance

Somewhat similar to human vision

34

DBSCAN

Density-Based Spatial Clustering of Applications with Noise

Density number of points within a specified radius

Core Point points with high density

Border Point points with low density but in the neighbourhood of a core point

Noise Point neither a core point nor a border point

35

Core Point

Noise Point

Border Point

DBSCAN

36

p

q

directly density reachable

p

q

density reachable

o

qp

density connected

DBSCAN

A cluster is defined as the maximal set of density connected points Start from a randomly selected unseen point P If P is a core point build a cluster by gradually adding all points that are

density reachable to the current point set Noise points are discarded (unlabelled)

37

Hierarchical Clustering

Produce a set of nested tree-like clusters

Can be visualized as a dendrogram Clustering is obtained by cutting at desired level No need to specify K in advance May correspond to meaningful taxonomies

38

Dinosaur Family Tree

39

Agglomerative Methods

Bottom-up Method

Assign each data point to a cluster

Calculate the proximity matrix

Merge the pair of closest clusters

Repeat until only a single cluster remains

How to calculate the distance between clusters

Single Link Minimum distance between points

Complete Link Maximum distance between points

40

Example

41

BA FI MI NA RM TO

BA 0 662 877 255 412 996

FI 662 0 295 468 268 400

MI 877 295 0 754 564 138

NA 255 468 754 0 219 869

RM 412 268 564 219 0 669

TO 996 400 138 869 669 0Single Link

Example

42

BA FI MITO NA RM

BA 0 662 877 255 412

FI 662 0 295 468 268

MITO 877 295 0 754 564

NA 255 468 754 0 219

RM 412 268 564 219 0

BA FI MITO NARM

BA 0 662 877 255FI 662 0 295 268

MITO 877 295 0 564

NARM 255 268 564 0

Example

43

BANARM FI MITO

BANARM 0 268 564FI 268 0 295

MITO 564 295 0

BAFINARM MITO

BAFINARM 0 295

MITO 295 0

Min vs Max

44

Reading Materials

Text Books Richard O Duda et al Pattern Classification Chapter 10 John Wiley amp Sons J Han and M Kamber Data Mining Concepts and Techniques Chapter 8

Morgan Kaufmann

Survey Papers A K Jain M N Murty and P J Flynn (1999) ldquoData Clustering A Reviewrdquo ACM

Computing Surveys Vol 31(3) pp 264-323 R Xu and D Wunsch (2005) ldquoSurvey of Clustering Algorithmsrdquo IEEE Transactions

on Neural Networks Vol 16(3) pp 645-678 A K Jain (2010) ldquoData Clustering 50 Years Beyond K-Meansrdquo Pattern

Recognition Letters Vol 31 pp 651-666

Online Tutorials httphomedeipolimiitmatteuccClusteringtutorial_html httpwwwautonlaborgtutorialskmeanshtml httpusersinformatikuni-hallede~hinneburClusterTutorial

45

Review

What is clustering

What are the two categories of clustering methods

How does the K-Means algorithm work

What are the major issues of K-Means

How to control the number of clusters in Sequential Leader Clustering

How to use Gaussian mixture models for clustering

What are the main advantages of density methods

What is the core idea of DBSCAN

What is the general procedure of hierarchical clustering

Which clustering methods do not require K as the input

46

Next Weekrsquos Class Talk

Volunteers are required for next weekrsquos class talk

Topic Affinity Propagation Science 315 972ndash976 2007 Clustering by passing messages between points httpwwwpsitorontoeduindexphpq=affinity20propagation

Topic Clustering by Fast Search and Find of Density Peaks Science 344 1492ndash1496 2014 Cluster centers higher density than neighbors Cluster centers distant from others points with higher densities

Length 20 minutes plus question time

47

Assignment

Topic Clustering Techniques and Applications

Techniques K-Means Another clustering method for comparison

Task 1 2D Artificial Datasets To demonstrate the influence of data patterns To demonstrate the influence of algorithm factors

Task 2 Image Segmentation Gray vs Colour

Deliverables Reports (experiment specification algorithm parameters in-depth analysis) Code (any programming language with detailed comments)

Due Sunday 28 December

Credit 1548

  • Clustering
  • Overview
  • What is cluster analysis
  • Clusters
  • Clusters (2)
  • Applications of Clustering
  • Earthquakes
  • Image Segmentation
  • The Big Picture
  • Requirements
  • Practical Considerations
  • Normalization or Not
  • Evaluation
  • Evaluation (2)
  • The Influence of Outliers
  • K-Means
  • K-Means (2)
  • K-Means (3)
  • K-Means (4)
  • Comments on K-Means
  • The Influence of Initial Centroids
  • The Influence of Initial Centroids (2)
  • The K-Medoids Method
  • The K-Medoids Method (2)
  • Sequential Leader Clustering
  • Silhouette
  • Silhouette (2)
  • Gaussian Mixture
  • Clustering by Mixture Models
  • K-Means Revisited
  • Expectation Maximization
  • Slide 32
  • EM Gaussian Mixture
  • Density Based Methods
  • DBSCAN
  • DBSCAN (2)
  • DBSCAN (3)
  • Hierarchical Clustering
  • Dinosaur Family Tree
  • Agglomerative Methods
  • Example
  • Example (2)
  • Example (3)
  • Min vs Max
  • Reading Materials
  • Review
  • Next Weekrsquos Class Talk
  • Assignment
Page 27: Clustering

Silhouette

27

-02 0 02 04 06 08 1

1

2

Silhouette Value

Clu

ster

-3 -2 -1 0 1 2 3 4-3

-2

-1

0

1

2

3

4

Gaussian Mixture

28

)2()(

2

22

21)(

xexg

1amp0)()(1

i

ii

n

iiii xgxf

Clustering by Mixture Models

29

K-Means Revisited

30

120579=(1199091 1199101 ) (1199092 119910 2)

119885=119862119897119906119904119905119890119903 1 119862119897119906119904119905119890119903 2

model parameters

latent parameters

Expectation Maximization

31

32

EM Gaussian Mixture

33

Gaussianjth by the generated is i instancer whethecomponents mixture ofnumber the

points data ofnumber the

ijznm

n

kk

x

j

x

n

kkki

jjiij

ki

ji

e

e

xxp

xxpzE

1

)(2

1

)(2

1

1

22

22

)|(

)|(][

m

iij

m

iiij

j

zE

xzE

1

1

][

][

m

iijj zE

m 1

][1

Density Based Methods

Generate clusters of arbitrary shapes

Robust against noise

No K value required in advance

Somewhat similar to human vision

34

DBSCAN

Density-Based Spatial Clustering of Applications with Noise

Density number of points within a specified radius

Core Point points with high density

Border Point points with low density but in the neighbourhood of a core point

Noise Point neither a core point nor a border point

35

Core Point

Noise Point

Border Point

DBSCAN

36

p

q

directly density reachable

p

q

density reachable

o

qp

density connected

DBSCAN

A cluster is defined as the maximal set of density connected points Start from a randomly selected unseen point P If P is a core point build a cluster by gradually adding all points that are

density reachable to the current point set Noise points are discarded (unlabelled)

37

Hierarchical Clustering

Produce a set of nested tree-like clusters

Can be visualized as a dendrogram Clustering is obtained by cutting at desired level No need to specify K in advance May correspond to meaningful taxonomies

38

Dinosaur Family Tree

39

Agglomerative Methods

Bottom-up Method

Assign each data point to a cluster

Calculate the proximity matrix

Merge the pair of closest clusters

Repeat until only a single cluster remains

How to calculate the distance between clusters

Single Link Minimum distance between points

Complete Link Maximum distance between points

40

Example

41

BA FI MI NA RM TO

BA 0 662 877 255 412 996

FI 662 0 295 468 268 400

MI 877 295 0 754 564 138

NA 255 468 754 0 219 869

RM 412 268 564 219 0 669

TO 996 400 138 869 669 0Single Link

Example

42

BA FI MITO NA RM

BA 0 662 877 255 412

FI 662 0 295 468 268

MITO 877 295 0 754 564

NA 255 468 754 0 219

RM 412 268 564 219 0

BA FI MITO NARM

BA 0 662 877 255FI 662 0 295 268

MITO 877 295 0 564

NARM 255 268 564 0

Example

43

BANARM FI MITO

BANARM 0 268 564FI 268 0 295

MITO 564 295 0

BAFINARM MITO

BAFINARM 0 295

MITO 295 0

Min vs Max

44

Reading Materials

Text Books Richard O Duda et al Pattern Classification Chapter 10 John Wiley amp Sons J Han and M Kamber Data Mining Concepts and Techniques Chapter 8

Morgan Kaufmann

Survey Papers A K Jain M N Murty and P J Flynn (1999) ldquoData Clustering A Reviewrdquo ACM

Computing Surveys Vol 31(3) pp 264-323 R Xu and D Wunsch (2005) ldquoSurvey of Clustering Algorithmsrdquo IEEE Transactions

on Neural Networks Vol 16(3) pp 645-678 A K Jain (2010) ldquoData Clustering 50 Years Beyond K-Meansrdquo Pattern

Recognition Letters Vol 31 pp 651-666

Online Tutorials httphomedeipolimiitmatteuccClusteringtutorial_html httpwwwautonlaborgtutorialskmeanshtml httpusersinformatikuni-hallede~hinneburClusterTutorial

45

Review

What is clustering

What are the two categories of clustering methods

How does the K-Means algorithm work

What are the major issues of K-Means

How to control the number of clusters in Sequential Leader Clustering

How to use Gaussian mixture models for clustering

What are the main advantages of density methods

What is the core idea of DBSCAN

What is the general procedure of hierarchical clustering

Which clustering methods do not require K as the input

46

Next Weekrsquos Class Talk

Volunteers are required for next weekrsquos class talk

Topic: Affinity Propagation
Science, 315: 972-976, 2007
Clustering by passing messages between points
http://www.psi.toronto.edu/index.php?q=affinity%20propagation

Topic: Clustering by Fast Search and Find of Density Peaks
Science, 344: 1492-1496, 2014
Cluster centers have higher density than their neighbors
Cluster centers are distant from other points with higher densities

Length: 20 minutes plus question time

47

Assignment

Topic: Clustering Techniques and Applications

Techniques: K-Means, plus another clustering method for comparison

Task 1: 2D Artificial Datasets
To demonstrate the influence of data patterns
To demonstrate the influence of algorithm factors

Task 2: Image Segmentation
Gray vs. Colour

Deliverables:
Reports (experiment specification, algorithm parameters, in-depth analysis)
Code (any programming language, with detailed comments)

Due: Sunday, 28 December

Credit: 15%

48



Expectation Maximization

31

32

EM Gaussian Mixture

33

Gaussianjth by the generated is i instancer whethecomponents mixture ofnumber the

points data ofnumber the

ijznm

n

kk

x

j

x

n

kkki

jjiij

ki

ji

e

e

xxp

xxpzE

1

)(2

1

)(2

1

1

22

22

)|(

)|(][

m

iij

m

iiij

j

zE

xzE

1

1

][

][

m

iijj zE

m 1

][1

Density Based Methods

Generate clusters of arbitrary shapes

Robust against noise

No K value required in advance

Somewhat similar to human vision

34

DBSCAN

Density-Based Spatial Clustering of Applications with Noise

Density number of points within a specified radius

Core Point points with high density

Border Point points with low density but in the neighbourhood of a core point

Noise Point neither a core point nor a border point

35

Core Point

Noise Point

Border Point

DBSCAN

36

p

q

directly density reachable

p

q

density reachable

o

qp

density connected

DBSCAN

A cluster is defined as the maximal set of density connected points Start from a randomly selected unseen point P If P is a core point build a cluster by gradually adding all points that are

density reachable to the current point set Noise points are discarded (unlabelled)

37

Hierarchical Clustering

Produce a set of nested tree-like clusters

Can be visualized as a dendrogram Clustering is obtained by cutting at desired level No need to specify K in advance May correspond to meaningful taxonomies

38

Dinosaur Family Tree

39

Agglomerative Methods

Bottom-up Method

Assign each data point to a cluster

Calculate the proximity matrix

Merge the pair of closest clusters

Repeat until only a single cluster remains

How to calculate the distance between clusters

Single Link Minimum distance between points

Complete Link Maximum distance between points

40

Example

41

BA FI MI NA RM TO

BA 0 662 877 255 412 996

FI 662 0 295 468 268 400

MI 877 295 0 754 564 138

NA 255 468 754 0 219 869

RM 412 268 564 219 0 669

TO 996 400 138 869 669 0Single Link

Example

42

BA FI MITO NA RM

BA 0 662 877 255 412

FI 662 0 295 468 268

MITO 877 295 0 754 564

NA 255 468 754 0 219

RM 412 268 564 219 0

BA FI MITO NARM

BA 0 662 877 255FI 662 0 295 268

MITO 877 295 0 564

NARM 255 268 564 0

Example

43

BANARM FI MITO

BANARM 0 268 564FI 268 0 295

MITO 564 295 0

BAFINARM MITO

BAFINARM 0 295

MITO 295 0

Min vs Max

44

Reading Materials

Text Books Richard O Duda et al Pattern Classification Chapter 10 John Wiley amp Sons J Han and M Kamber Data Mining Concepts and Techniques Chapter 8

Morgan Kaufmann

Survey Papers A K Jain M N Murty and P J Flynn (1999) ldquoData Clustering A Reviewrdquo ACM

Computing Surveys Vol 31(3) pp 264-323 R Xu and D Wunsch (2005) ldquoSurvey of Clustering Algorithmsrdquo IEEE Transactions

on Neural Networks Vol 16(3) pp 645-678 A K Jain (2010) ldquoData Clustering 50 Years Beyond K-Meansrdquo Pattern

Recognition Letters Vol 31 pp 651-666

Online Tutorials httphomedeipolimiitmatteuccClusteringtutorial_html httpwwwautonlaborgtutorialskmeanshtml httpusersinformatikuni-hallede~hinneburClusterTutorial

45

Review

What is clustering

What are the two categories of clustering methods

How does the K-Means algorithm work

What are the major issues of K-Means

How to control the number of clusters in Sequential Leader Clustering

How to use Gaussian mixture models for clustering

What are the main advantages of density methods

What is the core idea of DBSCAN

What is the general procedure of hierarchical clustering

Which clustering methods do not require K as the input

46

Next Weekrsquos Class Talk

Volunteers are required for next weekrsquos class talk

Topic Affinity Propagation Science 315 972ndash976 2007 Clustering by passing messages between points httpwwwpsitorontoeduindexphpq=affinity20propagation

Topic Clustering by Fast Search and Find of Density Peaks Science 344 1492ndash1496 2014 Cluster centers higher density than neighbors Cluster centers distant from others points with higher densities

Length 20 minutes plus question time

47

Assignment

Topic Clustering Techniques and Applications

Techniques K-Means Another clustering method for comparison

Task 1 2D Artificial Datasets To demonstrate the influence of data patterns To demonstrate the influence of algorithm factors

Task 2 Image Segmentation Gray vs Colour

Deliverables Reports (experiment specification algorithm parameters in-depth analysis) Code (any programming language with detailed comments)

Due Sunday 28 December

Credit 1548

  • Clustering
  • Overview
  • What is cluster analysis
  • Clusters
  • Clusters (2)
  • Applications of Clustering
  • Earthquakes
  • Image Segmentation
  • The Big Picture
  • Requirements
  • Practical Considerations
  • Normalization or Not
  • Evaluation
  • Evaluation (2)
  • The Influence of Outliers
  • K-Means
  • K-Means (2)
  • K-Means (3)
  • K-Means (4)
  • Comments on K-Means
  • The Influence of Initial Centroids
  • The Influence of Initial Centroids (2)
  • The K-Medoids Method
  • The K-Medoids Method (2)
  • Sequential Leader Clustering
  • Silhouette
  • Silhouette (2)
  • Gaussian Mixture
  • Clustering by Mixture Models
  • K-Means Revisited
  • Expectation Maximization
  • Slide 32
  • EM Gaussian Mixture
  • Density Based Methods
  • DBSCAN
  • DBSCAN (2)
  • DBSCAN (3)
  • Hierarchical Clustering
  • Dinosaur Family Tree
  • Agglomerative Methods
  • Example
  • Example (2)
  • Example (3)
  • Min vs Max
  • Reading Materials
  • Review
  • Next Weekrsquos Class Talk
  • Assignment
Page 29: Clustering

Clustering by Mixture Models

29

K-Means Revisited

30

120579=(1199091 1199101 ) (1199092 119910 2)

119885=119862119897119906119904119905119890119903 1 119862119897119906119904119905119890119903 2

model parameters

latent parameters

Expectation Maximization

31

32

EM Gaussian Mixture

33

Gaussianjth by the generated is i instancer whethecomponents mixture ofnumber the

points data ofnumber the

ijznm

n

kk

x

j

x

n

kkki

jjiij

ki

ji

e

e

xxp

xxpzE

1

)(2

1

)(2

1

1

22

22

)|(

)|(][

m

iij

m

iiij

j

zE

xzE

1

1

][

][

m

iijj zE

m 1

][1

Density Based Methods

Generate clusters of arbitrary shapes

Robust against noise

No K value required in advance

Somewhat similar to human vision

34

DBSCAN

Density-Based Spatial Clustering of Applications with Noise

Density number of points within a specified radius

Core Point points with high density

Border Point points with low density but in the neighbourhood of a core point

Noise Point neither a core point nor a border point

35

Core Point

Noise Point

Border Point

DBSCAN

36

p

q

directly density reachable

p

q

density reachable

o

qp

density connected

DBSCAN

A cluster is defined as the maximal set of density connected points Start from a randomly selected unseen point P If P is a core point build a cluster by gradually adding all points that are

density reachable to the current point set Noise points are discarded (unlabelled)

37

Hierarchical Clustering

Produce a set of nested tree-like clusters

Can be visualized as a dendrogram Clustering is obtained by cutting at desired level No need to specify K in advance May correspond to meaningful taxonomies

38

Dinosaur Family Tree

39

Agglomerative Methods

Bottom-up Method

Assign each data point to a cluster

Calculate the proximity matrix

Merge the pair of closest clusters

Repeat until only a single cluster remains

How to calculate the distance between clusters

Single Link Minimum distance between points

Complete Link Maximum distance between points

40

Example

41

BA FI MI NA RM TO

BA 0 662 877 255 412 996

FI 662 0 295 468 268 400

MI 877 295 0 754 564 138

NA 255 468 754 0 219 869

RM 412 268 564 219 0 669

TO 996 400 138 869 669 0Single Link

Example

42

BA FI MITO NA RM

BA 0 662 877 255 412

FI 662 0 295 468 268

MITO 877 295 0 754 564

NA 255 468 754 0 219

RM 412 268 564 219 0

BA FI MITO NARM

BA 0 662 877 255FI 662 0 295 268

MITO 877 295 0 564

NARM 255 268 564 0

Example

43

BANARM FI MITO

BANARM 0 268 564FI 268 0 295

MITO 564 295 0

BAFINARM MITO

BAFINARM 0 295

MITO 295 0

Min vs Max

44

Reading Materials

Text Books Richard O Duda et al Pattern Classification Chapter 10 John Wiley amp Sons J Han and M Kamber Data Mining Concepts and Techniques Chapter 8

Morgan Kaufmann

Survey Papers A K Jain M N Murty and P J Flynn (1999) ldquoData Clustering A Reviewrdquo ACM

Computing Surveys Vol 31(3) pp 264-323 R Xu and D Wunsch (2005) ldquoSurvey of Clustering Algorithmsrdquo IEEE Transactions

on Neural Networks Vol 16(3) pp 645-678 A K Jain (2010) ldquoData Clustering 50 Years Beyond K-Meansrdquo Pattern

Recognition Letters Vol 31 pp 651-666

Online Tutorials httphomedeipolimiitmatteuccClusteringtutorial_html httpwwwautonlaborgtutorialskmeanshtml httpusersinformatikuni-hallede~hinneburClusterTutorial

45

Review

What is clustering

What are the two categories of clustering methods

How does the K-Means algorithm work

What are the major issues of K-Means

How to control the number of clusters in Sequential Leader Clustering

How to use Gaussian mixture models for clustering

What are the main advantages of density methods

What is the core idea of DBSCAN

What is the general procedure of hierarchical clustering

Which clustering methods do not require K as the input

46

Next Weekrsquos Class Talk

Volunteers are required for next weekrsquos class talk

Topic Affinity Propagation Science 315 972ndash976 2007 Clustering by passing messages between points httpwwwpsitorontoeduindexphpq=affinity20propagation

Topic Clustering by Fast Search and Find of Density Peaks Science 344 1492ndash1496 2014 Cluster centers higher density than neighbors Cluster centers distant from others points with higher densities

Length 20 minutes plus question time

47

Assignment

Topic Clustering Techniques and Applications

Techniques K-Means Another clustering method for comparison

Task 1 2D Artificial Datasets To demonstrate the influence of data patterns To demonstrate the influence of algorithm factors

Task 2 Image Segmentation Gray vs Colour

Deliverables Reports (experiment specification algorithm parameters in-depth analysis) Code (any programming language with detailed comments)

Due Sunday 28 December

Credit 1548

  • Clustering
  • Overview
  • What is cluster analysis
  • Clusters
  • Clusters (2)
  • Applications of Clustering
  • Earthquakes
  • Image Segmentation
  • The Big Picture
  • Requirements
  • Practical Considerations
  • Normalization or Not
  • Evaluation
  • Evaluation (2)
  • The Influence of Outliers
  • K-Means
  • K-Means (2)
  • K-Means (3)
  • K-Means (4)
  • Comments on K-Means
  • The Influence of Initial Centroids
  • The Influence of Initial Centroids (2)
  • The K-Medoids Method
  • The K-Medoids Method (2)
  • Sequential Leader Clustering
  • Silhouette
  • Silhouette (2)
  • Gaussian Mixture
  • Clustering by Mixture Models
  • K-Means Revisited
  • Expectation Maximization
  • Slide 32
  • EM Gaussian Mixture
  • Density Based Methods
  • DBSCAN
  • DBSCAN (2)
  • DBSCAN (3)
  • Hierarchical Clustering
  • Dinosaur Family Tree
  • Agglomerative Methods
  • Example
  • Example (2)
  • Example (3)
  • Min vs Max
  • Reading Materials
  • Review
  • Next Weekrsquos Class Talk
  • Assignment
Page 30: Clustering

K-Means Revisited

30

120579=(1199091 1199101 ) (1199092 119910 2)

119885=119862119897119906119904119905119890119903 1 119862119897119906119904119905119890119903 2

model parameters

latent parameters

Expectation Maximization

31

32

EM Gaussian Mixture

33

Gaussianjth by the generated is i instancer whethecomponents mixture ofnumber the

points data ofnumber the

ijznm

n

kk

x

j

x

n

kkki

jjiij

ki

ji

e

e

xxp

xxpzE

1

)(2

1

)(2

1

1

22

22

)|(

)|(][

m

iij

m

iiij

j

zE

xzE

1

1

][

][

m

iijj zE

m 1

][1

Density Based Methods

Generate clusters of arbitrary shapes

Robust against noise

No K value required in advance

Somewhat similar to human vision

34

DBSCAN

Density-Based Spatial Clustering of Applications with Noise

Density number of points within a specified radius

Core Point points with high density

Border Point points with low density but in the neighbourhood of a core point

Noise Point neither a core point nor a border point

35

Core Point

Noise Point

Border Point

DBSCAN

36

p

q

directly density reachable

p

q

density reachable

o

qp

density connected

DBSCAN

A cluster is defined as the maximal set of density connected points Start from a randomly selected unseen point P If P is a core point build a cluster by gradually adding all points that are

density reachable to the current point set Noise points are discarded (unlabelled)

37

Hierarchical Clustering

Produce a set of nested tree-like clusters

Can be visualized as a dendrogram Clustering is obtained by cutting at desired level No need to specify K in advance May correspond to meaningful taxonomies

38

Dinosaur Family Tree

39

Agglomerative Methods

Bottom-up Method

Assign each data point to a cluster

Calculate the proximity matrix

Merge the pair of closest clusters

Repeat until only a single cluster remains

How to calculate the distance between clusters

Single Link Minimum distance between points

Complete Link Maximum distance between points

40

Example

41

BA FI MI NA RM TO

BA 0 662 877 255 412 996

FI 662 0 295 468 268 400

MI 877 295 0 754 564 138

NA 255 468 754 0 219 869

RM 412 268 564 219 0 669

TO 996 400 138 869 669 0Single Link

Example

42

BA FI MITO NA RM

BA 0 662 877 255 412

FI 662 0 295 468 268

MITO 877 295 0 754 564

NA 255 468 754 0 219

RM 412 268 564 219 0

BA FI MITO NARM

BA 0 662 877 255FI 662 0 295 268

MITO 877 295 0 564

NARM 255 268 564 0

Example

43

BANARM FI MITO

BANARM 0 268 564FI 268 0 295

MITO 564 295 0

BAFINARM MITO

BAFINARM 0 295

MITO 295 0

Min vs Max

44

Reading Materials

Text Books Richard O Duda et al Pattern Classification Chapter 10 John Wiley amp Sons J Han and M Kamber Data Mining Concepts and Techniques Chapter 8

Morgan Kaufmann

Survey Papers A K Jain M N Murty and P J Flynn (1999) ldquoData Clustering A Reviewrdquo ACM

Computing Surveys Vol 31(3) pp 264-323 R Xu and D Wunsch (2005) ldquoSurvey of Clustering Algorithmsrdquo IEEE Transactions

on Neural Networks Vol 16(3) pp 645-678 A K Jain (2010) ldquoData Clustering 50 Years Beyond K-Meansrdquo Pattern

Recognition Letters Vol 31 pp 651-666

Online Tutorials httphomedeipolimiitmatteuccClusteringtutorial_html httpwwwautonlaborgtutorialskmeanshtml httpusersinformatikuni-hallede~hinneburClusterTutorial

45

Review

What is clustering

What are the two categories of clustering methods

How does the K-Means algorithm work

What are the major issues of K-Means

How to control the number of clusters in Sequential Leader Clustering

How to use Gaussian mixture models for clustering

What are the main advantages of density methods

What is the core idea of DBSCAN

What is the general procedure of hierarchical clustering

Which clustering methods do not require K as the input

46

Next Weekrsquos Class Talk

Volunteers are required for next weekrsquos class talk

Topic Affinity Propagation Science 315 972ndash976 2007 Clustering by passing messages between points httpwwwpsitorontoeduindexphpq=affinity20propagation

Topic Clustering by Fast Search and Find of Density Peaks Science 344 1492ndash1496 2014 Cluster centers higher density than neighbors Cluster centers distant from others points with higher densities

Length 20 minutes plus question time

47

Assignment

Topic Clustering Techniques and Applications

Techniques K-Means Another clustering method for comparison

Task 1 2D Artificial Datasets To demonstrate the influence of data patterns To demonstrate the influence of algorithm factors

Task 2 Image Segmentation Gray vs Colour

Deliverables Reports (experiment specification algorithm parameters in-depth analysis) Code (any programming language with detailed comments)

Due Sunday 28 December

Credit 1548

  • Clustering
  • Overview
  • What is cluster analysis
  • Clusters
  • Clusters (2)
  • Applications of Clustering
  • Earthquakes
  • Image Segmentation
  • The Big Picture
  • Requirements
  • Practical Considerations
  • Normalization or Not
  • Evaluation
  • Evaluation (2)
  • The Influence of Outliers
  • K-Means
  • K-Means (2)
  • K-Means (3)
  • K-Means (4)
  • Comments on K-Means
  • The Influence of Initial Centroids
  • The Influence of Initial Centroids (2)
  • The K-Medoids Method
  • The K-Medoids Method (2)
  • Sequential Leader Clustering
  • Silhouette
  • Silhouette (2)
  • Gaussian Mixture
  • Clustering by Mixture Models
  • K-Means Revisited
  • Expectation Maximization
  • Slide 32
  • EM Gaussian Mixture
  • Density Based Methods
  • DBSCAN
  • DBSCAN (2)
  • DBSCAN (3)
  • Hierarchical Clustering
  • Dinosaur Family Tree
  • Agglomerative Methods
  • Example
  • Example (2)
  • Example (3)
  • Min vs Max
  • Reading Materials
  • Review
  • Next Weekrsquos Class Talk
  • Assignment
Page 31: Clustering

Expectation Maximization

31

32

EM Gaussian Mixture

33

Gaussianjth by the generated is i instancer whethecomponents mixture ofnumber the

points data ofnumber the

ijznm

n

kk

x

j

x

n

kkki

jjiij

ki

ji

e

e

xxp

xxpzE

1

)(2

1

)(2

1

1

22

22

)|(

)|(][

m

iij

m

iiij

j

zE

xzE

1

1

][

][

m

iijj zE

m 1

][1

Density Based Methods

Generate clusters of arbitrary shapes

Robust against noise

No K value required in advance

Somewhat similar to human vision

34

DBSCAN

Density-Based Spatial Clustering of Applications with Noise

Density number of points within a specified radius

Core Point points with high density

Border Point points with low density but in the neighbourhood of a core point

Noise Point neither a core point nor a border point

35

Core Point

Noise Point

Border Point

DBSCAN

36

p

q

directly density reachable

p

q

density reachable

o

qp

density connected

DBSCAN

A cluster is defined as the maximal set of density connected points Start from a randomly selected unseen point P If P is a core point build a cluster by gradually adding all points that are

density reachable to the current point set Noise points are discarded (unlabelled)

37

Hierarchical Clustering

Produce a set of nested tree-like clusters

Can be visualized as a dendrogram Clustering is obtained by cutting at desired level No need to specify K in advance May correspond to meaningful taxonomies

38

Dinosaur Family Tree

39

Agglomerative Methods

Bottom-up Method

Assign each data point to a cluster

Calculate the proximity matrix

Merge the pair of closest clusters

Repeat until only a single cluster remains

How to calculate the distance between clusters

Single Link Minimum distance between points

Complete Link Maximum distance between points

40

Example

41

BA FI MI NA RM TO

BA 0 662 877 255 412 996

FI 662 0 295 468 268 400

MI 877 295 0 754 564 138

NA 255 468 754 0 219 869

RM 412 268 564 219 0 669

TO 996 400 138 869 669 0Single Link

Example

42

BA FI MITO NA RM

BA 0 662 877 255 412

FI 662 0 295 468 268

MITO 877 295 0 754 564

NA 255 468 754 0 219

RM 412 268 564 219 0

BA FI MITO NARM

BA 0 662 877 255FI 662 0 295 268

MITO 877 295 0 564

NARM 255 268 564 0

Example

43

BANARM FI MITO

BANARM 0 268 564FI 268 0 295

MITO 564 295 0

BAFINARM MITO

BAFINARM 0 295

MITO 295 0

Min vs Max

44

Reading Materials

Text Books Richard O Duda et al Pattern Classification Chapter 10 John Wiley amp Sons J Han and M Kamber Data Mining Concepts and Techniques Chapter 8

Morgan Kaufmann

Survey Papers A K Jain M N Murty and P J Flynn (1999) ldquoData Clustering A Reviewrdquo ACM

Computing Surveys Vol 31(3) pp 264-323 R Xu and D Wunsch (2005) ldquoSurvey of Clustering Algorithmsrdquo IEEE Transactions

on Neural Networks Vol 16(3) pp 645-678 A K Jain (2010) ldquoData Clustering 50 Years Beyond K-Meansrdquo Pattern

Recognition Letters Vol 31 pp 651-666

Online Tutorials httphomedeipolimiitmatteuccClusteringtutorial_html httpwwwautonlaborgtutorialskmeanshtml httpusersinformatikuni-hallede~hinneburClusterTutorial

45

Review

What is clustering

What are the two categories of clustering methods

How does the K-Means algorithm work

What are the major issues of K-Means

How to control the number of clusters in Sequential Leader Clustering

How to use Gaussian mixture models for clustering

What are the main advantages of density methods

What is the core idea of DBSCAN

What is the general procedure of hierarchical clustering

Which clustering methods do not require K as the input

46

Next Weekrsquos Class Talk

Volunteers are required for next weekrsquos class talk

Topic Affinity Propagation Science 315 972ndash976 2007 Clustering by passing messages between points httpwwwpsitorontoeduindexphpq=affinity20propagation

Topic Clustering by Fast Search and Find of Density Peaks Science 344 1492ndash1496 2014 Cluster centers higher density than neighbors Cluster centers distant from others points with higher densities

Length 20 minutes plus question time

47

Assignment

Topic Clustering Techniques and Applications

Techniques K-Means Another clustering method for comparison

Task 1 2D Artificial Datasets To demonstrate the influence of data patterns To demonstrate the influence of algorithm factors

Task 2 Image Segmentation Gray vs Colour

Deliverables Reports (experiment specification algorithm parameters in-depth analysis) Code (any programming language with detailed comments)

Due Sunday 28 December

Credit 1548

  • Clustering
  • Overview
  • What is cluster analysis
  • Clusters
  • Clusters (2)
  • Applications of Clustering
  • Earthquakes
  • Image Segmentation
  • The Big Picture
  • Requirements
  • Practical Considerations
  • Normalization or Not
  • Evaluation
  • Evaluation (2)
  • The Influence of Outliers
  • K-Means
  • K-Means (2)
  • K-Means (3)
  • K-Means (4)
  • Comments on K-Means
  • The Influence of Initial Centroids
  • The Influence of Initial Centroids (2)
  • The K-Medoids Method
  • The K-Medoids Method (2)
  • Sequential Leader Clustering
  • Silhouette
  • Silhouette (2)
  • Gaussian Mixture
  • Clustering by Mixture Models
  • K-Means Revisited
  • Expectation Maximization
  • Slide 32
  • EM Gaussian Mixture
  • Density Based Methods
  • DBSCAN
  • DBSCAN (2)
  • DBSCAN (3)
  • Hierarchical Clustering
  • Dinosaur Family Tree
  • Agglomerative Methods
  • Example
  • Example (2)
  • Example (3)
  • Min vs Max
  • Reading Materials
  • Review
  • Next Weekrsquos Class Talk
  • Assignment
Page 32: Clustering

32

EM Gaussian Mixture

33

Gaussianjth by the generated is i instancer whethecomponents mixture ofnumber the

points data ofnumber the

ijznm

n

kk

x

j

x

n

kkki

jjiij

ki

ji

e

e

xxp

xxpzE

1

)(2

1

)(2

1

1

22

22

)|(

)|(][

m

iij

m

iiij

j

zE

xzE

1

1

][

][

m

iijj zE

m 1

][1

Density Based Methods

Generate clusters of arbitrary shapes

Robust against noise

No K value required in advance

Somewhat similar to human vision

34

DBSCAN

Density-Based Spatial Clustering of Applications with Noise

Density number of points within a specified radius

Core Point points with high density

Border Point points with low density but in the neighbourhood of a core point

Noise Point neither a core point nor a border point

35

Core Point

Noise Point

Border Point

DBSCAN

36

p

q

directly density reachable

p

q

density reachable

o

qp

density connected

DBSCAN

A cluster is defined as the maximal set of density connected points Start from a randomly selected unseen point P If P is a core point build a cluster by gradually adding all points that are

density reachable to the current point set Noise points are discarded (unlabelled)

37

Hierarchical Clustering

Produce a set of nested tree-like clusters

Can be visualized as a dendrogram Clustering is obtained by cutting at desired level No need to specify K in advance May correspond to meaningful taxonomies

38

Dinosaur Family Tree

39

Agglomerative Methods

Bottom-up Method

Assign each data point to a cluster

Calculate the proximity matrix

Merge the pair of closest clusters

Repeat until only a single cluster remains

How to calculate the distance between clusters

Single Link Minimum distance between points

Complete Link Maximum distance between points

40

Example

41

BA FI MI NA RM TO

BA 0 662 877 255 412 996

FI 662 0 295 468 268 400

MI 877 295 0 754 564 138

NA 255 468 754 0 219 869

RM 412 268 564 219 0 669

TO 996 400 138 869 669 0Single Link

Example

42

BA FI MITO NA RM

BA 0 662 877 255 412

FI 662 0 295 468 268

MITO 877 295 0 754 564

NA 255 468 754 0 219

RM 412 268 564 219 0

BA FI MITO NARM

BA 0 662 877 255FI 662 0 295 268

MITO 877 295 0 564

NARM 255 268 564 0

Example

43

BANARM FI MITO

BANARM 0 268 564FI 268 0 295

MITO 564 295 0

BAFINARM MITO

BAFINARM 0 295

MITO 295 0

Min vs Max

44

Reading Materials

Text Books Richard O Duda et al Pattern Classification Chapter 10 John Wiley amp Sons J Han and M Kamber Data Mining Concepts and Techniques Chapter 8

Morgan Kaufmann

Survey Papers A K Jain M N Murty and P J Flynn (1999) ldquoData Clustering A Reviewrdquo ACM

Computing Surveys Vol 31(3) pp 264-323 R Xu and D Wunsch (2005) ldquoSurvey of Clustering Algorithmsrdquo IEEE Transactions

on Neural Networks Vol 16(3) pp 645-678 A K Jain (2010) ldquoData Clustering 50 Years Beyond K-Meansrdquo Pattern

Recognition Letters Vol 31 pp 651-666

Online Tutorials httphomedeipolimiitmatteuccClusteringtutorial_html httpwwwautonlaborgtutorialskmeanshtml httpusersinformatikuni-hallede~hinneburClusterTutorial

45

Review

What is clustering

What are the two categories of clustering methods

How does the K-Means algorithm work

What are the major issues of K-Means

How to control the number of clusters in Sequential Leader Clustering

How to use Gaussian mixture models for clustering

What are the main advantages of density methods

What is the core idea of DBSCAN

What is the general procedure of hierarchical clustering

Which clustering methods do not require K as the input

46

Next Weekrsquos Class Talk

Volunteers are required for next weekrsquos class talk

Topic Affinity Propagation Science 315 972ndash976 2007 Clustering by passing messages between points httpwwwpsitorontoeduindexphpq=affinity20propagation

Topic Clustering by Fast Search and Find of Density Peaks Science 344 1492ndash1496 2014 Cluster centers higher density than neighbors Cluster centers distant from others points with higher densities

Length 20 minutes plus question time

47

Assignment

Topic Clustering Techniques and Applications

Techniques K-Means Another clustering method for comparison

Task 1 2D Artificial Datasets To demonstrate the influence of data patterns To demonstrate the influence of algorithm factors

Task 2 Image Segmentation Gray vs Colour

Deliverables Reports (experiment specification algorithm parameters in-depth analysis) Code (any programming language with detailed comments)

Due Sunday 28 December

Credit 1548

  • Clustering
  • Overview
  • What is cluster analysis
  • Clusters
  • Clusters (2)
  • Applications of Clustering
  • Earthquakes
  • Image Segmentation
  • The Big Picture
  • Requirements
  • Practical Considerations
  • Normalization or Not
  • Evaluation
  • Evaluation (2)
  • The Influence of Outliers
  • K-Means
  • K-Means (2)
  • K-Means (3)
  • K-Means (4)
  • Comments on K-Means
  • The Influence of Initial Centroids
  • The Influence of Initial Centroids (2)
  • The K-Medoids Method
  • The K-Medoids Method (2)
  • Sequential Leader Clustering
  • Silhouette
  • Silhouette (2)
  • Gaussian Mixture
  • Clustering by Mixture Models
  • K-Means Revisited
  • Expectation Maximization
  • Slide 32
  • EM Gaussian Mixture
  • Density Based Methods
  • DBSCAN
  • DBSCAN (2)
  • DBSCAN (3)
  • Hierarchical Clustering
  • Dinosaur Family Tree
  • Agglomerative Methods
  • Example
  • Example (2)
  • Example (3)
  • Min vs Max
  • Reading Materials
  • Review
  • Next Weekrsquos Class Talk
  • Assignment
Page 33: Clustering

EM Gaussian Mixture

33

Gaussianjth by the generated is i instancer whethecomponents mixture ofnumber the

points data ofnumber the

ijznm

n

kk

x

j

x

n

kkki

jjiij

ki

ji

e

e

xxp

xxpzE

1

)(2

1

)(2

1

1

22

22

)|(

)|(][

m

iij

m

iiij

j

zE

xzE

1

1

][

][

m

iijj zE

m 1

][1

Density Based Methods

Generate clusters of arbitrary shapes

Robust against noise

No K value required in advance

Somewhat similar to human vision

34

DBSCAN

Density-Based Spatial Clustering of Applications with Noise

Density number of points within a specified radius

Core Point points with high density

Border Point points with low density but in the neighbourhood of a core point

Noise Point neither a core point nor a border point

35

Core Point

Noise Point

Border Point

DBSCAN

36

p

q

directly density reachable

p

q

density reachable

o

qp

density connected

DBSCAN

A cluster is defined as the maximal set of density connected points Start from a randomly selected unseen point P If P is a core point build a cluster by gradually adding all points that are

density reachable to the current point set Noise points are discarded (unlabelled)

37

Hierarchical Clustering

Produce a set of nested tree-like clusters

Can be visualized as a dendrogram Clustering is obtained by cutting at desired level No need to specify K in advance May correspond to meaningful taxonomies

38

Dinosaur Family Tree

39

Agglomerative Methods

Bottom-up Method

Assign each data point to a cluster

Calculate the proximity matrix

Merge the pair of closest clusters

Repeat until only a single cluster remains

How to calculate the distance between clusters

Single Link Minimum distance between points

Complete Link Maximum distance between points

40

Example

41

BA FI MI NA RM TO

BA 0 662 877 255 412 996

FI 662 0 295 468 268 400

MI 877 295 0 754 564 138

NA 255 468 754 0 219 869

RM 412 268 564 219 0 669

TO 996 400 138 869 669 0Single Link

Example

42

BA FI MITO NA RM

BA 0 662 877 255 412

FI 662 0 295 468 268

MITO 877 295 0 754 564

NA 255 468 754 0 219

RM 412 268 564 219 0

BA FI MITO NARM

BA 0 662 877 255FI 662 0 295 268

MITO 877 295 0 564

NARM 255 268 564 0

Example

43

BANARM FI MITO

BANARM 0 268 564FI 268 0 295

MITO 564 295 0

BAFINARM MITO

BAFINARM 0 295

MITO 295 0

Min vs Max

44

Reading Materials

Text Books Richard O Duda et al Pattern Classification Chapter 10 John Wiley amp Sons J Han and M Kamber Data Mining Concepts and Techniques Chapter 8

Morgan Kaufmann

Survey Papers A K Jain M N Murty and P J Flynn (1999) ldquoData Clustering A Reviewrdquo ACM

Computing Surveys Vol 31(3) pp 264-323 R Xu and D Wunsch (2005) ldquoSurvey of Clustering Algorithmsrdquo IEEE Transactions

on Neural Networks Vol 16(3) pp 645-678 A K Jain (2010) ldquoData Clustering 50 Years Beyond K-Meansrdquo Pattern

Recognition Letters Vol 31 pp 651-666

Online Tutorials httphomedeipolimiitmatteuccClusteringtutorial_html httpwwwautonlaborgtutorialskmeanshtml httpusersinformatikuni-hallede~hinneburClusterTutorial

45

Review

What is clustering

What are the two categories of clustering methods

How does the K-Means algorithm work

What are the major issues of K-Means

How to control the number of clusters in Sequential Leader Clustering

How to use Gaussian mixture models for clustering

What are the main advantages of density methods

What is the core idea of DBSCAN

What is the general procedure of hierarchical clustering

Which clustering methods do not require K as the input

46

Next Weekrsquos Class Talk

Volunteers are required for next weekrsquos class talk

Topic Affinity Propagation Science 315 972ndash976 2007 Clustering by passing messages between points httpwwwpsitorontoeduindexphpq=affinity20propagation

Topic Clustering by Fast Search and Find of Density Peaks Science 344 1492ndash1496 2014 Cluster centers higher density than neighbors Cluster centers distant from others points with higher densities

Length 20 minutes plus question time

47

Assignment

Topic Clustering Techniques and Applications

Techniques K-Means Another clustering method for comparison

Task 1 2D Artificial Datasets To demonstrate the influence of data patterns To demonstrate the influence of algorithm factors

Task 2 Image Segmentation Gray vs Colour

Deliverables Reports (experiment specification algorithm parameters in-depth analysis) Code (any programming language with detailed comments)

Due Sunday 28 December

Credit 1548

  • Clustering
  • Overview
  • What is cluster analysis
  • Clusters
  • Clusters (2)
  • Applications of Clustering
  • Earthquakes
  • Image Segmentation
  • The Big Picture
  • Requirements
  • Practical Considerations
  • Normalization or Not
  • Evaluation
  • Evaluation (2)
  • The Influence of Outliers
  • K-Means
  • K-Means (2)
  • K-Means (3)
  • K-Means (4)
  • Comments on K-Means
  • The Influence of Initial Centroids
  • The Influence of Initial Centroids (2)
  • The K-Medoids Method
  • The K-Medoids Method (2)
  • Sequential Leader Clustering
  • Silhouette
  • Silhouette (2)
  • Gaussian Mixture
  • Clustering by Mixture Models
  • K-Means Revisited
  • Expectation Maximization
  • Slide 32
  • EM Gaussian Mixture
  • Density Based Methods
  • DBSCAN
  • DBSCAN (2)
  • DBSCAN (3)
  • Hierarchical Clustering
  • Dinosaur Family Tree
  • Agglomerative Methods
  • Example
  • Example (2)
  • Example (3)
  • Min vs Max
  • Reading Materials
  • Review
  • Next Weekrsquos Class Talk
  • Assignment
Page 34: Clustering

Density Based Methods

Generate clusters of arbitrary shapes

Robust against noise

No K value required in advance

Somewhat similar to human vision

34

DBSCAN

Density-Based Spatial Clustering of Applications with Noise

Density number of points within a specified radius

Core Point points with high density

Border Point points with low density but in the neighbourhood of a core point

Noise Point neither a core point nor a border point

35

Core Point

Noise Point

Border Point

DBSCAN

36

p

q

directly density reachable

p

q

density reachable

o

qp

density connected

DBSCAN

A cluster is defined as the maximal set of density connected points Start from a randomly selected unseen point P If P is a core point build a cluster by gradually adding all points that are

density reachable to the current point set Noise points are discarded (unlabelled)

37

Hierarchical Clustering

Produce a set of nested tree-like clusters

Can be visualized as a dendrogram Clustering is obtained by cutting at desired level No need to specify K in advance May correspond to meaningful taxonomies

38

Dinosaur Family Tree

39

Agglomerative Methods

Bottom-up Method

Assign each data point to a cluster

Calculate the proximity matrix

Merge the pair of closest clusters

Repeat until only a single cluster remains

How to calculate the distance between clusters

Single Link Minimum distance between points

Complete Link Maximum distance between points

40

Example

41

BA FI MI NA RM TO

BA 0 662 877 255 412 996

FI 662 0 295 468 268 400

MI 877 295 0 754 564 138

NA 255 468 754 0 219 869

RM 412 268 564 219 0 669

TO 996 400 138 869 669 0Single Link

Example

42

BA FI MITO NA RM

BA 0 662 877 255 412

FI 662 0 295 468 268

MITO 877 295 0 754 564

NA 255 468 754 0 219

RM 412 268 564 219 0

BA FI MITO NARM

BA 0 662 877 255FI 662 0 295 268

MITO 877 295 0 564

NARM 255 268 564 0

Example

43

BANARM FI MITO

BANARM 0 268 564FI 268 0 295

MITO 564 295 0

BAFINARM MITO

BAFINARM 0 295

MITO 295 0

Min vs Max

44

Reading Materials

Text Books Richard O Duda et al Pattern Classification Chapter 10 John Wiley amp Sons J Han and M Kamber Data Mining Concepts and Techniques Chapter 8

Morgan Kaufmann

Survey Papers A K Jain M N Murty and P J Flynn (1999) ldquoData Clustering A Reviewrdquo ACM

Computing Surveys Vol 31(3) pp 264-323 R Xu and D Wunsch (2005) ldquoSurvey of Clustering Algorithmsrdquo IEEE Transactions

on Neural Networks Vol 16(3) pp 645-678 A K Jain (2010) ldquoData Clustering 50 Years Beyond K-Meansrdquo Pattern

Recognition Letters Vol 31 pp 651-666

Online Tutorials httphomedeipolimiitmatteuccClusteringtutorial_html httpwwwautonlaborgtutorialskmeanshtml httpusersinformatikuni-hallede~hinneburClusterTutorial

45

Review

What is clustering

What are the two categories of clustering methods

How does the K-Means algorithm work

What are the major issues of K-Means

How to control the number of clusters in Sequential Leader Clustering

How to use Gaussian mixture models for clustering

What are the main advantages of density methods

What is the core idea of DBSCAN

What is the general procedure of hierarchical clustering

Which clustering methods do not require K as the input

46

Next Weekrsquos Class Talk

Volunteers are required for next weekrsquos class talk

Topic Affinity Propagation Science 315 972ndash976 2007 Clustering by passing messages between points httpwwwpsitorontoeduindexphpq=affinity20propagation

Topic Clustering by Fast Search and Find of Density Peaks Science 344 1492ndash1496 2014 Cluster centers higher density than neighbors Cluster centers distant from others points with higher densities

Length 20 minutes plus question time

47

Assignment

Topic Clustering Techniques and Applications

Techniques K-Means Another clustering method for comparison

Task 1 2D Artificial Datasets To demonstrate the influence of data patterns To demonstrate the influence of algorithm factors

Task 2 Image Segmentation Gray vs Colour

Deliverables Reports (experiment specification algorithm parameters in-depth analysis) Code (any programming language with detailed comments)

Due Sunday 28 December

Credit 1548

  • Clustering
  • Overview
  • What is cluster analysis
  • Clusters
  • Clusters (2)
  • Applications of Clustering
  • Earthquakes
  • Image Segmentation
  • The Big Picture
  • Requirements
  • Practical Considerations
  • Normalization or Not
  • Evaluation
  • Evaluation (2)
  • The Influence of Outliers
  • K-Means
  • K-Means (2)
  • K-Means (3)
  • K-Means (4)
  • Comments on K-Means
  • The Influence of Initial Centroids
  • The Influence of Initial Centroids (2)
  • The K-Medoids Method
  • The K-Medoids Method (2)
  • Sequential Leader Clustering
  • Silhouette
  • Silhouette (2)
  • Gaussian Mixture
  • Clustering by Mixture Models
  • K-Means Revisited
  • Expectation Maximization
  • Slide 32
  • EM Gaussian Mixture
  • Density Based Methods
  • DBSCAN
  • DBSCAN (2)
  • DBSCAN (3)
  • Hierarchical Clustering
  • Dinosaur Family Tree
  • Agglomerative Methods
  • Example
  • Example (2)
  • Example (3)
  • Min vs Max
  • Reading Materials
  • Review
  • Next Weekrsquos Class Talk
  • Assignment
Page 35: Clustering

DBSCAN

Density-Based Spatial Clustering of Applications with Noise

Density number of points within a specified radius

Core Point points with high density

Border Point points with low density but in the neighbourhood of a core point

Noise Point neither a core point nor a border point

35

Core Point

Noise Point

Border Point

DBSCAN

36

p

q

directly density reachable

p

q

density reachable

o

qp

density connected

DBSCAN

A cluster is defined as the maximal set of density connected points Start from a randomly selected unseen point P If P is a core point build a cluster by gradually adding all points that are

density reachable to the current point set Noise points are discarded (unlabelled)

37

Hierarchical Clustering

Produce a set of nested tree-like clusters

Can be visualized as a dendrogram Clustering is obtained by cutting at desired level No need to specify K in advance May correspond to meaningful taxonomies

38

Dinosaur Family Tree

39

Agglomerative Methods

Bottom-up Method

Assign each data point to a cluster

Calculate the proximity matrix

Merge the pair of closest clusters

Repeat until only a single cluster remains

How to calculate the distance between clusters

Single Link Minimum distance between points

Complete Link Maximum distance between points

40

Example

41

BA FI MI NA RM TO

BA 0 662 877 255 412 996

FI 662 0 295 468 268 400

MI 877 295 0 754 564 138

NA 255 468 754 0 219 869

RM 412 268 564 219 0 669

TO 996 400 138 869 669 0Single Link

Example

42

BA FI MITO NA RM

BA 0 662 877 255 412

FI 662 0 295 468 268

MITO 877 295 0 754 564

NA 255 468 754 0 219

RM 412 268 564 219 0

BA FI MITO NARM

BA 0 662 877 255FI 662 0 295 268

MITO 877 295 0 564

NARM 255 268 564 0

Example

43

BANARM FI MITO

BANARM 0 268 564FI 268 0 295

MITO 564 295 0

BAFINARM MITO

BAFINARM 0 295

MITO 295 0

Min vs Max

44

Reading Materials

Text Books Richard O Duda et al Pattern Classification Chapter 10 John Wiley amp Sons J Han and M Kamber Data Mining Concepts and Techniques Chapter 8

Morgan Kaufmann

Survey Papers A K Jain M N Murty and P J Flynn (1999) ldquoData Clustering A Reviewrdquo ACM

Computing Surveys Vol 31(3) pp 264-323 R Xu and D Wunsch (2005) ldquoSurvey of Clustering Algorithmsrdquo IEEE Transactions

on Neural Networks Vol 16(3) pp 645-678 A K Jain (2010) ldquoData Clustering 50 Years Beyond K-Meansrdquo Pattern

Recognition Letters Vol 31 pp 651-666

Online Tutorials httphomedeipolimiitmatteuccClusteringtutorial_html httpwwwautonlaborgtutorialskmeanshtml httpusersinformatikuni-hallede~hinneburClusterTutorial

45

Review

What is clustering

What are the two categories of clustering methods

How does the K-Means algorithm work

What are the major issues of K-Means

How to control the number of clusters in Sequential Leader Clustering

How to use Gaussian mixture models for clustering

What are the main advantages of density methods

What is the core idea of DBSCAN

What is the general procedure of hierarchical clustering

Which clustering methods do not require K as the input

46

Next Weekrsquos Class Talk

Volunteers are required for next weekrsquos class talk

Topic Affinity Propagation Science 315 972ndash976 2007 Clustering by passing messages between points httpwwwpsitorontoeduindexphpq=affinity20propagation

Topic Clustering by Fast Search and Find of Density Peaks Science 344 1492ndash1496 2014 Cluster centers higher density than neighbors Cluster centers distant from others points with higher densities

Length 20 minutes plus question time

47

Assignment

Topic Clustering Techniques and Applications

Techniques K-Means Another clustering method for comparison

Task 1 2D Artificial Datasets To demonstrate the influence of data patterns To demonstrate the influence of algorithm factors

Task 2 Image Segmentation Gray vs Colour

Deliverables Reports (experiment specification algorithm parameters in-depth analysis) Code (any programming language with detailed comments)

Due Sunday 28 December

Credit 1548

  • Clustering
  • Overview
  • What is cluster analysis
  • Clusters
  • Clusters (2)
  • Applications of Clustering
  • Earthquakes
  • Image Segmentation
  • The Big Picture
  • Requirements
  • Practical Considerations
  • Normalization or Not
  • Evaluation
  • Evaluation (2)
  • The Influence of Outliers
  • K-Means
  • K-Means (2)
  • K-Means (3)
  • K-Means (4)
  • Comments on K-Means
  • The Influence of Initial Centroids
  • The Influence of Initial Centroids (2)
  • The K-Medoids Method
  • The K-Medoids Method (2)
  • Sequential Leader Clustering
  • Silhouette
  • Silhouette (2)
  • Gaussian Mixture
  • Clustering by Mixture Models
  • K-Means Revisited
  • Expectation Maximization
  • Slide 32
  • EM Gaussian Mixture
  • Density Based Methods
  • DBSCAN
  • DBSCAN (2)
  • DBSCAN (3)
  • Hierarchical Clustering
  • Dinosaur Family Tree
  • Agglomerative Methods
  • Example
  • Example (2)
  • Example (3)
  • Min vs Max
  • Reading Materials
  • Review
  • Next Weekrsquos Class Talk
  • Assignment
Page 36: Clustering

DBSCAN

36

p

q

directly density reachable

p

q

density reachable

o

qp

density connected

DBSCAN

A cluster is defined as the maximal set of density connected points Start from a randomly selected unseen point P If P is a core point build a cluster by gradually adding all points that are

density reachable to the current point set Noise points are discarded (unlabelled)

37

Hierarchical Clustering

Produce a set of nested tree-like clusters

Can be visualized as a dendrogram Clustering is obtained by cutting at desired level No need to specify K in advance May correspond to meaningful taxonomies

38

Dinosaur Family Tree

39

Agglomerative Methods

Bottom-up Method

Assign each data point to a cluster

Calculate the proximity matrix

Merge the pair of closest clusters

Repeat until only a single cluster remains

How to calculate the distance between clusters

Single Link Minimum distance between points

Complete Link Maximum distance between points

40

Example

41

BA FI MI NA RM TO

BA 0 662 877 255 412 996

FI 662 0 295 468 268 400

MI 877 295 0 754 564 138

NA 255 468 754 0 219 869

RM 412 268 564 219 0 669

TO 996 400 138 869 669 0Single Link

Example

42

BA FI MITO NA RM

BA 0 662 877 255 412

FI 662 0 295 468 268

MITO 877 295 0 754 564

NA 255 468 754 0 219

RM 412 268 564 219 0

BA FI MITO NARM

BA 0 662 877 255FI 662 0 295 268

MITO 877 295 0 564

NARM 255 268 564 0

Example

43

BANARM FI MITO

BANARM 0 268 564FI 268 0 295

MITO 564 295 0

BAFINARM MITO

BAFINARM 0 295

MITO 295 0

Min vs Max

44

Reading Materials

Text Books Richard O Duda et al Pattern Classification Chapter 10 John Wiley amp Sons J Han and M Kamber Data Mining Concepts and Techniques Chapter 8

Morgan Kaufmann

Survey Papers A K Jain M N Murty and P J Flynn (1999) ldquoData Clustering A Reviewrdquo ACM

Computing Surveys Vol 31(3) pp 264-323 R Xu and D Wunsch (2005) ldquoSurvey of Clustering Algorithmsrdquo IEEE Transactions

on Neural Networks Vol 16(3) pp 645-678 A K Jain (2010) ldquoData Clustering 50 Years Beyond K-Meansrdquo Pattern

Recognition Letters Vol 31 pp 651-666

Online Tutorials httphomedeipolimiitmatteuccClusteringtutorial_html httpwwwautonlaborgtutorialskmeanshtml httpusersinformatikuni-hallede~hinneburClusterTutorial

45

Review

What is clustering

What are the two categories of clustering methods

How does the K-Means algorithm work

What are the major issues of K-Means

How to control the number of clusters in Sequential Leader Clustering

How to use Gaussian mixture models for clustering

What are the main advantages of density methods

What is the core idea of DBSCAN

What is the general procedure of hierarchical clustering

Which clustering methods do not require K as the input

46

Next Weekrsquos Class Talk

Volunteers are required for next weekrsquos class talk

Topic Affinity Propagation Science 315 972ndash976 2007 Clustering by passing messages between points httpwwwpsitorontoeduindexphpq=affinity20propagation

Topic Clustering by Fast Search and Find of Density Peaks Science 344 1492ndash1496 2014 Cluster centers higher density than neighbors Cluster centers distant from others points with higher densities

Length 20 minutes plus question time

47

Assignment

Topic Clustering Techniques and Applications

Techniques K-Means Another clustering method for comparison

Task 1 2D Artificial Datasets To demonstrate the influence of data patterns To demonstrate the influence of algorithm factors

Task 2 Image Segmentation Gray vs Colour

Deliverables Reports (experiment specification algorithm parameters in-depth analysis) Code (any programming language with detailed comments)

Due Sunday 28 December

Credit 1548

  • Clustering
  • Overview
  • What is cluster analysis
  • Clusters
  • Clusters (2)
  • Applications of Clustering
  • Earthquakes
  • Image Segmentation
  • The Big Picture
  • Requirements
  • Practical Considerations
  • Normalization or Not
  • Evaluation
  • Evaluation (2)
  • The Influence of Outliers
  • K-Means
  • K-Means (2)
  • K-Means (3)
  • K-Means (4)
  • Comments on K-Means
  • The Influence of Initial Centroids
  • The Influence of Initial Centroids (2)
  • The K-Medoids Method
  • The K-Medoids Method (2)
  • Sequential Leader Clustering
  • Silhouette
  • Silhouette (2)
  • Gaussian Mixture
  • Clustering by Mixture Models
  • K-Means Revisited
  • Expectation Maximization
  • Slide 32
  • EM Gaussian Mixture
  • Density Based Methods
  • DBSCAN
  • DBSCAN (2)
  • DBSCAN (3)
  • Hierarchical Clustering
  • Dinosaur Family Tree
  • Agglomerative Methods
  • Example
  • Example (2)
  • Example (3)
  • Min vs Max
  • Reading Materials
  • Review
  • Next Weekrsquos Class Talk
  • Assignment
Page 37: Clustering

DBSCAN

A cluster is defined as the maximal set of density connected points Start from a randomly selected unseen point P If P is a core point build a cluster by gradually adding all points that are

density reachable to the current point set Noise points are discarded (unlabelled)

37

Hierarchical Clustering

Produce a set of nested tree-like clusters

Can be visualized as a dendrogram Clustering is obtained by cutting at desired level No need to specify K in advance May correspond to meaningful taxonomies

38

Dinosaur Family Tree

39

Agglomerative Methods

Bottom-up Method

Assign each data point to a cluster

Calculate the proximity matrix

Merge the pair of closest clusters

Repeat until only a single cluster remains

How to calculate the distance between clusters

Single Link Minimum distance between points

Complete Link Maximum distance between points

40

Example

41

BA FI MI NA RM TO

BA 0 662 877 255 412 996

FI 662 0 295 468 268 400

MI 877 295 0 754 564 138

NA 255 468 754 0 219 869

RM 412 268 564 219 0 669

TO 996 400 138 869 669 0Single Link

Example

42

BA FI MITO NA RM

BA 0 662 877 255 412

FI 662 0 295 468 268

MITO 877 295 0 754 564

NA 255 468 754 0 219

RM 412 268 564 219 0

BA FI MITO NARM

BA 0 662 877 255FI 662 0 295 268

MITO 877 295 0 564

NARM 255 268 564 0

Example

43

BANARM FI MITO

BANARM 0 268 564FI 268 0 295

MITO 564 295 0

BAFINARM MITO

BAFINARM 0 295

MITO 295 0

Min vs Max

44

Reading Materials

Text Books Richard O Duda et al Pattern Classification Chapter 10 John Wiley amp Sons J Han and M Kamber Data Mining Concepts and Techniques Chapter 8

Morgan Kaufmann

Survey Papers A K Jain M N Murty and P J Flynn (1999) ldquoData Clustering A Reviewrdquo ACM

Computing Surveys Vol 31(3) pp 264-323 R Xu and D Wunsch (2005) ldquoSurvey of Clustering Algorithmsrdquo IEEE Transactions

on Neural Networks Vol 16(3) pp 645-678 A K Jain (2010) ldquoData Clustering 50 Years Beyond K-Meansrdquo Pattern

Recognition Letters Vol 31 pp 651-666

Online Tutorials httphomedeipolimiitmatteuccClusteringtutorial_html httpwwwautonlaborgtutorialskmeanshtml httpusersinformatikuni-hallede~hinneburClusterTutorial

45

Review

What is clustering

What are the two categories of clustering methods

How does the K-Means algorithm work

What are the major issues of K-Means

How to control the number of clusters in Sequential Leader Clustering

How to use Gaussian mixture models for clustering

What are the main advantages of density methods

What is the core idea of DBSCAN

What is the general procedure of hierarchical clustering

Which clustering methods do not require K as the input

46

Next Weekrsquos Class Talk

Volunteers are required for next weekrsquos class talk

Topic Affinity Propagation Science 315 972ndash976 2007 Clustering by passing messages between points httpwwwpsitorontoeduindexphpq=affinity20propagation

Topic Clustering by Fast Search and Find of Density Peaks Science 344 1492ndash1496 2014 Cluster centers higher density than neighbors Cluster centers distant from others points with higher densities

Length 20 minutes plus question time

47

Assignment

Topic Clustering Techniques and Applications

Techniques K-Means Another clustering method for comparison

Task 1 2D Artificial Datasets To demonstrate the influence of data patterns To demonstrate the influence of algorithm factors

Task 2 Image Segmentation Gray vs Colour

Deliverables Reports (experiment specification algorithm parameters in-depth analysis) Code (any programming language with detailed comments)

Due Sunday 28 December

Credit 1548

  • Clustering
  • Overview
  • What is cluster analysis
  • Clusters
  • Clusters (2)
  • Applications of Clustering
  • Earthquakes
  • Image Segmentation
  • The Big Picture
  • Requirements
  • Practical Considerations
  • Normalization or Not
  • Evaluation
  • Evaluation (2)
  • The Influence of Outliers
  • K-Means
  • K-Means (2)
  • K-Means (3)
  • K-Means (4)
  • Comments on K-Means
  • The Influence of Initial Centroids
  • The Influence of Initial Centroids (2)
  • The K-Medoids Method
  • The K-Medoids Method (2)
  • Sequential Leader Clustering
  • Silhouette
  • Silhouette (2)
  • Gaussian Mixture
  • Clustering by Mixture Models
  • K-Means Revisited
  • Expectation Maximization
  • Slide 32
  • EM Gaussian Mixture
  • Density Based Methods
  • DBSCAN
  • DBSCAN (2)
  • DBSCAN (3)
  • Hierarchical Clustering
  • Dinosaur Family Tree
  • Agglomerative Methods
  • Example
  • Example (2)
  • Example (3)
  • Min vs Max
  • Reading Materials
  • Review
  • Next Weekrsquos Class Talk
  • Assignment
Page 38: Clustering

Hierarchical Clustering

Produce a set of nested tree-like clusters

Can be visualized as a dendrogram Clustering is obtained by cutting at desired level No need to specify K in advance May correspond to meaningful taxonomies

38

Dinosaur Family Tree

39

Agglomerative Methods

Bottom-up Method

Assign each data point to a cluster

Calculate the proximity matrix

Merge the pair of closest clusters

Repeat until only a single cluster remains

How to calculate the distance between clusters

Single Link Minimum distance between points

Complete Link Maximum distance between points

40

Example

41

BA FI MI NA RM TO

BA 0 662 877 255 412 996

FI 662 0 295 468 268 400

MI 877 295 0 754 564 138

NA 255 468 754 0 219 869

RM 412 268 564 219 0 669

TO 996 400 138 869 669 0Single Link

Example

42

BA FI MITO NA RM

BA 0 662 877 255 412

FI 662 0 295 468 268

MITO 877 295 0 754 564

NA 255 468 754 0 219

RM 412 268 564 219 0

BA FI MITO NARM

BA 0 662 877 255FI 662 0 295 268

MITO 877 295 0 564

NARM 255 268 564 0

Example (continued)

BA then joins NA/RM (distance 255):

           BA/NA/RM   FI  MI/TO
BA/NA/RM         0   268    564
FI             268     0    295
MI/TO          564   295      0

FI joins next (distance 268), and the final merge happens at distance 295:

              BA/FI/NA/RM  MI/TO
BA/FI/NA/RM            0    295
MI/TO                295      0

Min vs Max

Single link (min) tends to produce elongated, chained clusters, while complete link (max) favours compact, roughly spherical clusters.

Reading Materials

Text Books:
Richard O. Duda et al., Pattern Classification, Chapter 10, John Wiley & Sons.
J. Han and M. Kamber, Data Mining: Concepts and Techniques, Chapter 8, Morgan Kaufmann.

Survey Papers:
A. K. Jain, M. N. Murty and P. J. Flynn (1999). "Data Clustering: A Review". ACM Computing Surveys, Vol. 31(3), pp. 264-323.
R. Xu and D. Wunsch (2005). "Survey of Clustering Algorithms". IEEE Transactions on Neural Networks, Vol. 16(3), pp. 645-678.
A. K. Jain (2010). "Data Clustering: 50 Years Beyond K-Means". Pattern Recognition Letters, Vol. 31, pp. 651-666.

Online Tutorials:
http://home.dei.polimi.it/matteucc/Clustering/tutorial_html
http://www.autonlab.org/tutorials/kmeans.html
http://users.informatik.uni-halle.de/~hinneburg/ClusterTutorial/

Review

What is clustering?

What are the two categories of clustering methods?

How does the K-Means algorithm work?

What are the major issues of K-Means?

How to control the number of clusters in Sequential Leader Clustering?

How to use Gaussian mixture models for clustering?

What are the main advantages of density methods?

What is the core idea of DBSCAN?

What is the general procedure of hierarchical clustering?

Which clustering methods do not require K as the input?

Next Week's Class Talk

Volunteers are required for next week's class talk.

Topic 1: Affinity Propagation ("Clustering by Passing Messages Between Data Points", Science, 315, pp. 972-976, 2007). http://www.psi.toronto.edu/index.php?q=affinity%20propagation

Topic 2: Clustering by Fast Search and Find of Density Peaks (Science, 344, pp. 1492-1496, 2014). Key idea: cluster centers have higher density than their neighbors and are relatively distant from any points of higher density.

Length: 20 minutes plus question time.

Assignment

Topic: Clustering Techniques and Applications

Techniques: K-Means, plus another clustering method of your choice for comparison

Task 1: 2D artificial datasets, to demonstrate the influence of data patterns and of algorithm factors

Task 2: Image segmentation, gray vs. colour (a starter sketch follows)

Deliverables: a report (experiment specification, algorithm parameters, in-depth analysis) and code (any programming language, with detailed comments)

Due: Sunday, 28 December

Credit: 15
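For Task 2, one possible starting point (my sketch, not part of the assignment hand-out; it assumes scikit-learn and Pillow are installed, and the file name input.jpg is a placeholder) is to cluster pixel colours with K-Means and repaint each pixel with its cluster centre:

import numpy as np
from sklearn.cluster import KMeans
from PIL import Image

img = np.asarray(Image.open("input.jpg"))       # H x W x 3 RGB image
pixels = img.reshape(-1, 3).astype(float)        # one row per pixel

k = 5                                            # number of segments to try
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(pixels)

# Replace every pixel with its cluster centre to visualise the segments.
segmented = km.cluster_centers_[km.labels_].reshape(img.shape)
Image.fromarray(segmented.astype(np.uint8)).save("segmented.jpg")

Varying k and the initialisation, and comparing the result against the second clustering method, covers the "algorithm factors" part of the analysis.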
