
Page 1: Expectation Maximization and Gaussian Mixture Models

Expectation Maximization and Mixture of Gaussians

1

Page 2: Expectation Maximization and Gaussian Mixture Models

 Recommend me some music!

 Discover groups of similar songs…

My Music Collection:
君の知らない物語 (bpm 125)
Bach Sonata #1 (bpm 60)
Only my railgun (bpm 120)

Bpm 90!

2

Page 3: Expectation Maximization and Gaussian Mixture Models

 Recommend me some music!

 Discover groups of similar songs…

My Music Collection:
君の知らない物語 (bpm 125)
Bach Sonata #1 (bpm 60)
Only my railgun (bpm 120)

Discovered groups: bpm 60, bpm 120

3

Page 4: Expectation Maximization and Gaussian Mixture Models

An unsupervised classification method

4

Page 5: Expectation Maximization and Gaussian Mixture Models

1.  Initialize K “means” µk, one for each class

 E.g. use random starting points, or choose K random points from the set

Example: K = 2, with means µ1 and µ2

5

Page 6: Expectation Maximization and Gaussian Mixture Models

2.  Phase 1: Assign each point to closest mean

3.  Phase 2: Update means of the new clusters

[Figure: each point gets a hard 1/0 assignment to its closest mean µk]

6

Page 7: Expectation Maximization and Gaussian Mixture Models

2.  Phase 1: Assign each point to closest mean

3.  Phase 2: Update means of the new clusters


7

Page 8: Expectation Maximization and Gaussian Mixture Models

2.  Phase 1: Assign each point to closest mean

3.  Phase 2: Update means of the new clusters

8

Page 9: Expectation Maximization and Gaussian Mixture Models

2.  Phase 1: Assign each point to closest mean

3.  Phase 2: Update means of the new clusters

9

Page 10: Expectation Maximization and Gaussian Mixture Models

2.  Phase 1: Assign each point to closest mean

3.  Phase 2: Update means of the new clusters

10

Page 11: Expectation Maximization and Gaussian Mixture Models


2.  Phase 1: Assign each point to closest mean

3.  Phase 2: Update means of the new clusters


11

Page 12: Expectation Maximization and Gaussian Mixture Models

2.  Phase 1: Assign each point to closest mean

3.  Phase 2: Update means of the new clusters

12

Page 13: Expectation Maximization and Gaussian Mixture Models

2.  Phase 1: Assign each point to closest mean

3.  Phase 2: Update means of the new clusters


13

Page 14: Expectation Maximization and Gaussian Mixture Models

2.  Phase 1: Assign each point to closest mean

3.  Phase 2: Update means of the new clusters

14

Page 15: Expectation Maximization and Gaussian Mixture Models

4.  When the means no longer change, clustering is DONE.

15
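As a concrete reference for the K-means loop on the preceding slides, here is a minimal numpy sketch; the toy bpm data, the tolerance, and the helper name `kmeans` are my own choices, not part of the deck.

```python
import numpy as np

def kmeans(X, K, n_iters=100, tol=1e-6, seed=0):
    """Plain K-means (Lloyd's algorithm): hard-assign points, then update means."""
    rng = np.random.default_rng(seed)
    # 1. Initialize K means by choosing K random points from the set
    means = X[rng.choice(len(X), size=K, replace=False)]
    for _ in range(n_iters):
        # Phase 1: assign each point to its closest mean
        dists = np.linalg.norm(X[:, None, :] - means[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Phase 2: update each mean to the centroid of its new cluster
        # (assumes no cluster goes empty)
        new_means = np.array([X[labels == k].mean(axis=0) for k in range(K)])
        # 4. Stop when the means no longer change
        if np.linalg.norm(new_means - means) < tol:
            means = new_means
            break
        means = new_means
    return means, labels

# Toy example: two groups of songs described only by bpm, K = 2
X = np.array([[60.], [62.], [58.], [120.], [125.], [118.]])
means, labels = kmeans(X, K=2)
print(means.ravel(), labels)
```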

Page 16: Expectation Maximization and Gaussian Mixture Models

 In K-means, a point can only have 1 class

 But what about points that lie in between groups? E.g. Jazz + Classical

16

Page 17: Expectation Maximization and Gaussian Mixture Models

The Famous “GMM”: Gaussian Mixture Model

17

Page 18: Expectation Maximization and Gaussian Mixture Models

p(X) = N(X | µ,Σ)

Gaussian == “Normal” distribution

µ: Mean
Σ: Variance (covariance matrix)

18
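As a quick illustration of evaluating N(X | µ, Σ) with SciPy; the specific µ and Σ values below are invented for the example.

```python
import numpy as np
from scipy.stats import multivariate_normal

# A single 2-D Gaussian N(X | µ, Σ); parameter values are illustrative only
mu = np.array([0.0, 0.0])          # mean µ
Sigma = np.array([[1.0, 0.3],      # covariance Σ
                  [0.3, 2.0]])

x = np.array([0.5, -1.0])
print(multivariate_normal(mean=mu, cov=Sigma).pdf(x))  # p(x) = N(x | µ, Σ)
```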

Page 19: Expectation Maximization and Gaussian Mixture Models

p(X) = N(X |µ,Σ) + N(X |µ,Σ)

19

Page 20: Expectation Maximization and Gaussian Mixture Models

p(X) = N(X |µ1,Σ1) + N(X |µ2,Σ2)


20

Page 21: Expectation Maximization and Gaussian Mixture Models

p(X) = π1N(X | µ1,Σ1) + π2N(X | µ2,Σ2)

Mixing Coefficients: ∑_{k=1}^{K} πk = 1

Example: π1 = 0.7, π2 = 0.3

21

Page 22: Expectation Maximization and Gaussian Mixture Models

p(X) = ∑_{k=1}^{K} πkN(X | µk,Σk)

Example: K = 2

22
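A small sketch of evaluating this mixture density for K = 2; the 1-D parameter values (bpm-like means) are invented purely to illustrate the formula.

```python
import numpy as np
from scipy.stats import norm

# p(x) = Σ_k πk N(x | µk, Σk) for a 1-D, K = 2 mixture (illustrative values)
pis    = np.array([0.7, 0.3])        # mixing coefficients, must sum to 1
mus    = np.array([60.0, 120.0])     # means (e.g. bpm of two song groups)
sigmas = np.array([5.0, 8.0])        # standard deviations

def gmm_pdf(x):
    return sum(pi * norm.pdf(x, loc=mu, scale=s)
               for pi, mu, s in zip(pis, mus, sigmas))

print(gmm_pdf(90.0))   # a point "in between" groups gets density from both
```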

Page 23: Expectation Maximization and Gaussian Mixture Models

 K-means is a classifier

 Mixture of Gaussians is a probability model

 We can USE it as a “soft” classifier

23

Page 24: Expectation Maximization and Gaussian Mixture Models

 K-means is a classifier

 Mixture of Gaussians is a probability model

 We can USE it as a “soft” classifier

24

Page 25: Expectation Maximization and Gaussian Mixture Models

 K-means is a classifier

 Mixture of Gaussians is a probability model

 We can USE it as a “soft” classifier

Parameters to fit to data:
•  Mean µk
•  Covariance Σk
•  Mixing coefficient πk

25

Page 26: Expectation Maximization and Gaussian Mixture Models

EM for GMM

26

Page 27: Expectation Maximization and Gaussian Mixture Models

1.  Initialize means µk
2.  E Step: Assign each point to a cluster
3.  M Step: Given clusters, refine mean µk of each cluster k
4.  Stop when change in means is small

27

Page 28: Expectation Maximization and Gaussian Mixture Models

1.  Initialize Gaussian* parameters: means µk, covariances Σk, and mixing coefficients πk

2.  E Step: Assign each point Xn an assignment score γ(znk) for each cluster k

3.  M Step: Given scores, adjust µk, Σk, πk for each cluster k

4.  Evaluate likelihood. If likelihood or parameters converge, stop.

[Figure: unlike K-means’ hard 1/0 assignments, each point now gets soft scores, e.g. 0.5 / 0.5]

*There are K Gaussians

28

Page 29: Expectation Maximization and Gaussian Mixture Models

1.  Initialize µk, Σk, πk, one for each Gaussian k

 Tip! Use the K-means result to initialize:

µk ← µk (the K-means cluster mean)
Σk ← cov(cluster(k))
πk ← Number of points in k / Total number of points

29
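A sketch of this initialization step, assuming a K-means result (means and hard labels) like the one from the `kmeans` sketch earlier; the helper name and the small ridge added to each covariance are my own additions.

```python
import numpy as np

def init_gmm_from_kmeans(X, means, labels):
    """Initialize GMM parameters from a K-means result (means, hard labels)."""
    K = len(means)
    mus = means.copy()                                        # µk ← K-means mean
    Sigmas = np.array([np.cov(X[labels == k].T) + 1e-6 * np.eye(X.shape[1])
                       for k in range(K)])                    # Σk ← cov(cluster k), plus a tiny ridge
    pis = np.array([(labels == k).mean() for k in range(K)])  # πk ← Nk / N
    return mus, Sigmas, pis
```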

Page 30: Expectation Maximization and Gaussian Mixture Models

2.  E Step: For each point Xn, determine its assignment score γ(znk) to each Gaussian k (e.g. 0.7 / 0.3)

γ(znk) is called a “responsibility”: how much is this Gaussian k responsible for this point Xn? The znk are latent variables.

30
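A sketch of the E step using the standard textbook responsibility formula γ(znk) = πk N(Xn | µk, Σk) / Σj πj N(Xn | µj, Σj), which is what the slide's assignment score refers to; array names and shapes follow the earlier sketches.

```python
import numpy as np
from scipy.stats import multivariate_normal

def e_step(X, mus, Sigmas, pis):
    """E step: γ(znk) = πk N(Xn | µk, Σk) / Σj πj N(Xn | µj, Σj)."""
    K = len(pis)
    # Unnormalized scores: πk N(Xn | µk, Σk) for every point and every Gaussian
    scores = np.column_stack([
        pis[k] * multivariate_normal(mean=mus[k], cov=Sigmas[k]).pdf(X)
        for k in range(K)
    ])
    return scores / scores.sum(axis=1, keepdims=True)   # each row sums to 1
```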

Page 31: Expectation Maximization and Gaussian Mixture Models

3.  M Step: For each Gaussian k, update parameters using the new γ(znk)

Mean of Gaussian k: weight each point Xn by its responsibility γ(znk)

Find the mean that “fits” the assignment scores best

31
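A sketch of this mean update, using the standard M-step formula µk = (1/Nk) Σn γ(znk) Xn with Nk = Σn γ(znk); the `gamma` array is the responsibility matrix from the E-step sketch above.

```python
import numpy as np

def m_step_means(X, gamma):
    """M step (means): µk = (1/Nk) Σn γ(znk) Xn, where Nk = Σn γ(znk)."""
    Nk = gamma.sum(axis=0)              # effective number of points per Gaussian
    return (gamma.T @ X) / Nk[:, None]  # responsibility-weighted average of the data
```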

Page 32: Expectation Maximization and Gaussian Mixture Models

3.  M Step: For each Gaussian k, update parameters using the new γ(znk)

Covariance matrix of Gaussian k: responsibility-weighted covariance around the new mean µk (just calculated this!)

32
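A sketch of the covariance update under the standard formula Σk = (1/Nk) Σn γ(znk)(Xn − µk)(Xn − µk)ᵀ, using the means just computed; names follow the earlier sketches.

```python
import numpy as np

def m_step_covariances(X, gamma, mus):
    """M step (covariances): Σk = (1/Nk) Σn γ(znk) (Xn - µk)(Xn - µk)^T."""
    N, d = X.shape
    Nk = gamma.sum(axis=0)
    Sigmas = np.empty((len(Nk), d, d))
    for k in range(len(Nk)):
        diff = X - mus[k]        # uses the mean just calculated in the M step
        Sigmas[k] = (gamma[:, k, None] * diff).T @ diff / Nk[k]
    return Sigmas
```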

Page 33: Expectation Maximization and Gaussian Mixture Models

3.  M Step: For each Gaussian k, update parameters using the new γ(znk)

Mixing Coefficient for Gaussian k: πk = Nk / Total # of points. Since Nk is a sum of soft responsibilities it can be fractional, e.g. 105.6 / 200.

33
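A sketch of the mixing-coefficient update πk = Nk / N; note that Nk sums soft responsibilities, which is how a value like 105.6 out of 200 points arises.

```python
import numpy as np

def m_step_mixing(gamma):
    """M step (mixing coefficients): πk = Nk / N, with Nk = Σn γ(znk)."""
    Nk = gamma.sum(axis=0)        # e.g. array([105.6, 94.4]) for 200 points
    return Nk / gamma.shape[0]    # e.g. 105.6 / 200 = 0.528
```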

Page 34: Expectation Maximization and Gaussian Mixture Models

4.  Evaluate log likelihood. If likelihood or parameters converge, stop. Else go to Step 2 (E step).

Likelihood is the probability that the data X was generated by the parameters you found, i.e. correctness!

34
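A sketch of evaluating the GMM log likelihood ln p(X) = Σn ln Σk πk N(Xn | µk, Σk), with a simple convergence test; the tolerance is an arbitrary illustrative choice.

```python
import numpy as np
from scipy.stats import multivariate_normal

def log_likelihood(X, mus, Sigmas, pis):
    """ln p(X | µ, Σ, π) = Σn ln ( Σk πk N(Xn | µk, Σk) )"""
    mixture = sum(pis[k] * multivariate_normal(mean=mus[k], cov=Sigmas[k]).pdf(X)
                  for k in range(len(pis)))
    return np.log(mixture).sum()

# Convergence check sketch: stop when the improvement is tiny
# if abs(ll_new - ll_old) < 1e-6: stop, else go back to the E step
```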

Page 35: Expectation Maximization and Gaussian Mixture Models

35

Page 36: Expectation Maximization and Gaussian Mixture Models

1.  Initialize parameters θold

2.  E Step: Evaluate p(Z | X, θold)

3.  M Step: Evaluate θnew by maximizing the expected complete-data log likelihood (the function Q)

4.  Evaluate log likelihood. If likelihood or parameters converge, stop. Else θold ← θnew and go to E Step.

where X: observed variables, Z: hidden variables, p(X | θ): likelihood

36
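Putting the pieces together, a minimal EM loop for a GMM as a concrete instance of this general recipe; it assumes the helper functions sketched earlier (`kmeans`, `init_gmm_from_kmeans`, `e_step`, `m_step_means`, `m_step_covariances`, `m_step_mixing`, `log_likelihood`) are in scope, and all names are mine rather than the deck's.

```python
import numpy as np

def em_gmm(X, K, n_iters=100, tol=1e-6):
    """EM for a Gaussian Mixture Model, assembled from the step sketches above."""
    # 1. Initialize θ_old (here: from a K-means run, as the deck suggests)
    means, labels = kmeans(X, K)
    mus, Sigmas, pis = init_gmm_from_kmeans(X, means, labels)
    ll_old = -np.inf
    for _ in range(n_iters):
        gamma = e_step(X, mus, Sigmas, pis)            # 2. E step: p(Z | X, θ_old)
        mus = m_step_means(X, gamma)                   # 3. M step: θ_new
        Sigmas = m_step_covariances(X, gamma, mus)
        pis = m_step_mixing(gamma)
        ll_new = log_likelihood(X, mus, Sigmas, pis)   # 4. Check convergence
        if abs(ll_new - ll_old) < tol:
            break
        ll_old = ll_new                                # θ_old ← θ_new, back to E step
    return mus, Sigmas, pis, gamma
```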

Page 37: Expectation Maximization and Gaussian Mixture Models

 K-means can be formulated as EM

 EM for Gaussian Mixtures

 EM for Bernoulli Mixtures

 EM for Bayesian Linear Regression

37

Page 38: Expectation Maximization and Gaussian Mixture Models

 “Expectation”: Calculate the fixed, data-dependent parameters of the function Q.

 “Maximization”: Once the parameters of Q are known, Q is fully determined, so now we can maximize it.

38

Page 39: Expectation Maximization and Gaussian Mixture Models

 We learned how to cluster data in an unsupervised manner

 Gaussian Mixture Models are useful for modeling data with “soft” cluster assignments

 Expectation Maximization is a method used when we have a model with latent variables (values we don’t know, but estimate with each step)

39

Page 40: Expectation Maximization and Gaussian Mixture Models

 My question: What other applications could use EM? How about EM for GMMs?

40