Maximum Likelihood Estimation: Multivariate Normal Distribution
Post on 24-Dec-2015
Maximum Likelihood Estimation
Multivariate Normal distribution
The Method of Maximum Likelihood
Suppose that the data $x_1, \dots, x_n$ has joint density function
$$f(x_1, \dots, x_n;\ \theta_1, \dots, \theta_p)$$
where $\theta = (\theta_1, \dots, \theta_p)$ are unknown parameters assumed to lie in $\Omega$ (a subset of $p$-dimensional space).
We want to estimate the parameters $\theta_1, \dots, \theta_p$.
Definition: The Likelihood function
Suppose that the data $x_1, \dots, x_n$ has joint density function
$$f(x_1, \dots, x_n;\ \theta_1, \dots, \theta_p)$$
Then, given the data, the Likelihood function is defined to be
$$L(\theta_1, \dots, \theta_p) = f(x_1, \dots, x_n;\ \theta_1, \dots, \theta_p)$$
Note: the domain of $L(\theta_1, \dots, \theta_p)$ is the set $\Omega$.
Definition: Maximum Likelihood Estimators
Suppose that the data $x_1, \dots, x_n$ has joint density function
$$f(x_1, \dots, x_n;\ \theta_1, \dots, \theta_p)$$
Then the Likelihood function is defined to be
$$L(\theta_1, \dots, \theta_p) = f(x_1, \dots, x_n;\ \theta_1, \dots, \theta_p)$$
and the Maximum Likelihood estimators of the parameters $\theta_1, \dots, \theta_p$ are the values $\hat\theta_1, \dots, \hat\theta_p$ such that
$$L(\hat\theta_1, \dots, \hat\theta_p) = \max_{\theta_1, \dots, \theta_p} L(\theta_1, \dots, \theta_p)$$
Note: maximizing $L(\theta_1, \dots, \theta_p)$ is equivalent to maximizing
$$l(\theta_1, \dots, \theta_p) = \ln L(\theta_1, \dots, \theta_p),$$
the log-likelihood function.
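As a quick numerical sketch of this equivalence (not from the original slides; the coin-flip data here is hypothetical): a grid search over the parameter of a Bernoulli model finds the same maximizer whether we score the grid by $L$ or by $\ln L$.

```python
import numpy as np

# Hypothetical data: 10 Bernoulli trials with 7 successes.
n_trials, n_success = 10, 7

theta = np.linspace(0.01, 0.99, 981)    # grid over the parameter space
L = theta**n_success * (1 - theta)**(n_trials - n_success)           # likelihood
logL = n_success * np.log(theta) + (n_trials - n_success) * np.log(1 - theta)

# Both criteria are maximized at the same grid point (7/10 = 0.7 here).
assert np.argmax(L) == np.argmax(logL)
```

The log transform is preferred in practice because products of many small densities underflow, while their logs sum stably.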
The Multivariate Normal Distribution
Maximum Likelihood Estimation
Let $\vec x_1, \vec x_2, \dots, \vec x_n$ denote a sample (independent) from the $p$-variate normal distribution with mean vector $\vec\mu$ and covariance matrix $\Sigma$.
Note:
The matrix
$$X = \begin{bmatrix} \vec x_1, \vec x_2, \dots, \vec x_n \end{bmatrix}_{p\times n} = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & & \vdots \\ x_{p1} & x_{p2} & \cdots & x_{pn} \end{bmatrix}$$
is called the data matrix.
The vector
$$\vec x_{np\times 1} = \begin{bmatrix} \vec x_1 \\ \vec x_2 \\ \vdots \\ \vec x_n \end{bmatrix} = \begin{bmatrix} x_{11} \\ \vdots \\ x_{p1} \\ \vdots \\ x_{1n} \\ \vdots \\ x_{pn} \end{bmatrix}$$
is called the data vector.
The mean vector
The vector
$$\bar{\vec x} = \frac{1}{n}\left(\vec x_1 + \vec x_2 + \cdots + \vec x_n\right) = \begin{bmatrix}\bar x_1 \\ \bar x_2 \\ \vdots \\ \bar x_p\end{bmatrix}$$
is called the sample mean vector.
Note:
$$\bar x_i = \frac{1}{n}\left(x_{i1} + x_{i2} + \cdots + x_{in}\right) = \frac{1}{n}\sum_{j=1}^n x_{ij}$$
Also
$$\bar{\vec x} = \frac{1}{n}\begin{bmatrix} x_{11} & \cdots & x_{1n} \\ \vdots & & \vdots \\ x_{p1} & \cdots & x_{pn}\end{bmatrix}\begin{bmatrix}1 \\ \vdots \\ 1\end{bmatrix} = \frac{1}{n} X \vec 1$$
In terms of the data vector:
$$\bar{\vec x} = \frac{1}{n}\begin{bmatrix} I, I, \dots, I\end{bmatrix}\vec x = A\vec x, \qquad \text{where } A = \frac{1}{n}\begin{bmatrix} I, I, \dots, I\end{bmatrix}_{p\times np}.$$
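A small numerical sketch (the data here is simulated, and the variable names are mine) confirming that the three expressions for the sample mean vector agree: the average of the columns, $\frac{1}{n}X\vec 1$, and $A\vec x$ applied to the stacked data vector.

```python
import numpy as np

rng = np.random.default_rng(0)
p, n = 3, 5
X = rng.normal(size=(p, n))            # data matrix, columns are observations

xbar_cols = X.mean(axis=1)             # (1/n) * (x_1 + ... + x_n)
xbar_ones = X @ np.ones(n) / n         # (1/n) * X * 1
A = np.hstack([np.eye(p)] * n) / n     # A = (1/n)[I, I, ..., I], shape p x np
xbar_vec = A @ X.T.ravel()             # A applied to the stacked data vector

assert np.allclose(xbar_cols, xbar_ones)
assert np.allclose(xbar_cols, xbar_vec)
```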
Graphical representation of sample mean vector
[Figure: the data vectors $\vec x_1, \vec x_2, \dots, \vec x_n$ plotted as points, with $\bar{\vec x}$ at their center.]
The sample mean vector is the centroid of the data vectors.
The Sample Covariance matrix
The sample covariance matrix:
$$S = \begin{bmatrix} s_{11} & s_{12} & \cdots & s_{1p} \\ s_{12} & s_{22} & \cdots & s_{2p} \\ \vdots & \vdots & & \vdots \\ s_{1p} & s_{2p} & \cdots & s_{pp} \end{bmatrix}$$
where
$$s_{ik} = \frac{1}{n-1}\sum_{j=1}^n (x_{ij} - \bar x_i)(x_{kj} - \bar x_k).$$
There are different ways of representing the sample covariance matrix:
$$S = \frac{1}{n-1}\sum_{j=1}^n \left(\vec x_j - \bar{\vec x}\right)\left(\vec x_j - \bar{\vec x}\right)' = \frac{1}{n-1}\begin{bmatrix}\vec x_1 - \bar{\vec x}, \dots, \vec x_n - \bar{\vec x}\end{bmatrix}\begin{bmatrix}(\vec x_1 - \bar{\vec x})' \\ \vdots \\ (\vec x_n - \bar{\vec x})'\end{bmatrix}$$
$$= \frac{1}{n-1}\left(X - \bar{\vec x}\,\vec 1'\right)\left(X - \bar{\vec x}\,\vec 1'\right)' = \frac{1}{n-1}\left(X - \tfrac{1}{n}XJ\right)\left(X - \tfrac{1}{n}XJ\right)'$$
where $\vec 1 = (1, \dots, 1)'$ and $J = \vec 1\,\vec 1'$ is the $n\times n$ matrix of 1's. Hence
$$S = \frac{1}{n-1}\,X\left(I - \tfrac{1}{n}J\right)\left(I - \tfrac{1}{n}J\right)'X'$$
and, because $I - \tfrac{1}{n}J$ is symmetric and idempotent,
$$S = \frac{1}{n-1}\,X\left(I - \tfrac{1}{n}J\right)X'.$$
Maximum Likelihood Estimation
Multivariate Normal distribution
Let $\vec x_1, \vec x_2, \dots, \vec x_n$ denote a sample (independent) from the $p$-variate normal distribution with mean vector $\vec\mu$ and covariance matrix $\Sigma$:
$$f(\vec x_1, \dots, \vec x_n;\ \vec\mu, \Sigma) = \prod_{i=1}^n \frac{1}{(2\pi)^{p/2}|\Sigma|^{1/2}}\, e^{-\frac{1}{2}(\vec x_i - \vec\mu)'\Sigma^{-1}(\vec x_i - \vec\mu)}$$
Then the joint density function of $\vec x_1, \vec x_2, \dots, \vec x_n$ is:
$$\frac{1}{(2\pi)^{np/2}|\Sigma|^{n/2}}\, e^{-\frac{1}{2}\sum_{i=1}^n(\vec x_i - \vec\mu)'\Sigma^{-1}(\vec x_i - \vec\mu)}$$
The Likelihood function is:
$$L(\vec\mu, \Sigma) = \frac{1}{(2\pi)^{np/2}|\Sigma|^{n/2}}\, e^{-\frac{1}{2}\sum_{i=1}^n(\vec x_i - \vec\mu)'\Sigma^{-1}(\vec x_i - \vec\mu)}$$
and the Log-likelihood function is:
$$l(\vec\mu, \Sigma) = \ln L(\vec\mu, \Sigma) = -\frac{np}{2}\ln(2\pi) - \frac{n}{2}\ln|\Sigma| - \frac{1}{2}\sum_{i=1}^n(\vec x_i - \vec\mu)'\Sigma^{-1}(\vec x_i - \vec\mu)$$
To find the Maximum Likelihood estimators of $\vec\mu$ and $\Sigma$, we need to find $\hat{\vec\mu}$ and $\hat\Sigma$ to maximize
$$L(\vec\mu, \Sigma) = \frac{1}{(2\pi)^{np/2}|\Sigma|^{n/2}}\, e^{-\frac{1}{2}\sum_{i=1}^n(\vec x_i - \vec\mu)'\Sigma^{-1}(\vec x_i - \vec\mu)}$$
or equivalently maximize
$$l(\vec\mu, \Sigma) = -\frac{np}{2}\ln(2\pi) - \frac{n}{2}\ln|\Sigma| - \frac{1}{2}\sum_{i=1}^n(\vec x_i - \vec\mu)'\Sigma^{-1}(\vec x_i - \vec\mu)$$
Note:
$$\frac{\partial\, l(\vec\mu, \Sigma)}{\partial \vec\mu} = -\frac{1}{2}\,\frac{\partial}{\partial \vec\mu}\sum_{i=1}^n(\vec x_i - \vec\mu)'\Sigma^{-1}(\vec x_i - \vec\mu) = \sum_{i=1}^n \Sigma^{-1}(\vec x_i - \vec\mu) = \Sigma^{-1}\left(\sum_{i=1}^n \vec x_i - n\vec\mu\right)$$
Setting this equal to $\vec 0$,
$$\Sigma^{-1}\left(\sum_{i=1}^n \vec x_i - n\vec\mu\right) = \vec 0$$
hence
$$\hat{\vec\mu} = \frac{1}{n}\sum_{i=1}^n \vec x_i = \bar{\vec x}.$$
Now
$$l(\vec\mu, \Sigma) = -\frac{np}{2}\ln(2\pi) - \frac{n}{2}\ln|\Sigma| - \frac{1}{2}\sum_{i=1}^n(\vec x_i - \vec\mu)'\Sigma^{-1}(\vec x_i - \vec\mu)$$
$$= -\frac{np}{2}\ln(2\pi) - \frac{n}{2}\ln|\Sigma| - \frac{1}{2}\sum_{i=1}^n \operatorname{tr}\!\left[(\vec x_i - \vec\mu)'\Sigma^{-1}(\vec x_i - \vec\mu)\right]$$
(each quadratic form is a $1\times 1$ matrix, so it equals its own trace)
$$= -\frac{np}{2}\ln(2\pi) - \frac{n}{2}\ln|\Sigma| - \frac{1}{2}\sum_{i=1}^n \operatorname{tr}\!\left[\Sigma^{-1}(\vec x_i - \vec\mu)(\vec x_i - \vec\mu)'\right]$$
using $\operatorname{tr}(AB) = \operatorname{tr}(BA)$, so that
$$l(\vec\mu, \Sigma) = -\frac{np}{2}\ln(2\pi) - \frac{n}{2}\ln|\Sigma| - \frac{1}{2}\operatorname{tr}\!\left[\Sigma^{-1}\sum_{i=1}^n(\vec x_i - \vec\mu)(\vec x_i - \vec\mu)'\right].$$
Now, differentiating $l(\vec\mu, \Sigma)$ with respect to $\Sigma$:
$$\frac{\partial\, l(\vec\mu, \Sigma)}{\partial \Sigma} = -\frac{n}{2}\,\frac{\partial \ln|\Sigma|}{\partial \Sigma} - \frac{1}{2}\,\frac{\partial}{\partial \Sigma}\operatorname{tr}\!\left[\Sigma^{-1}\sum_{i=1}^n(\vec x_i - \vec\mu)(\vec x_i - \vec\mu)'\right]$$
$$= -\frac{n}{2}\Sigma^{-1} + \frac{1}{2}\,\Sigma^{-1}\left[\sum_{i=1}^n(\vec x_i - \vec\mu)(\vec x_i - \vec\mu)'\right]\Sigma^{-1} = 0_{p\times p}$$
hence
$$\hat\Sigma = \frac{1}{n}\sum_{i=1}^n(\vec x_i - \hat{\vec\mu})(\vec x_i - \hat{\vec\mu})' = \frac{1}{n}\sum_{i=1}^n(\vec x_i - \bar{\vec x})(\vec x_i - \bar{\vec x})' = \frac{n-1}{n}S$$
Summary:
The Maximum Likelihood estimators of $\vec\mu$ and $\Sigma$ are
$$\hat{\vec\mu} = \frac{1}{n}\sum_{i=1}^n \vec x_i = \bar{\vec x}$$
and
$$\hat\Sigma = \frac{1}{n}\sum_{i=1}^n(\vec x_i - \bar{\vec x})(\vec x_i - \bar{\vec x})' = \frac{n-1}{n}S.$$
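The summary above can be checked numerically. This is a minimal sketch with simulated data (the function name `mvn_loglik` is mine): it computes $\hat{\vec\mu}$ and $\hat\Sigma$ in closed form, confirms $\hat\Sigma = \frac{n-1}{n}S$, and verifies that the log-likelihood at the MLEs dominates nearby parameter values.

```python
import numpy as np

def mvn_loglik(X, mu, Sigma):
    """Multivariate normal log-likelihood; X holds observations in columns."""
    p, n = X.shape
    D = X - mu[:, None]
    quad = np.trace(np.linalg.inv(Sigma) @ (D @ D.T))
    return (-0.5 * n * p * np.log(2 * np.pi)
            - 0.5 * n * np.log(np.linalg.det(Sigma))
            - 0.5 * quad)

rng = np.random.default_rng(2)
p, n = 2, 50
X = rng.normal(size=(p, n))

mu_hat = X.mean(axis=1)
Sigma_hat = (X - mu_hat[:, None]) @ (X - mu_hat[:, None]).T / n

# MLE of Sigma is the (n-1)/n multiple of the sample covariance matrix S.
assert np.allclose(Sigma_hat, (n - 1) / n * np.cov(X))

# The log-likelihood at the MLEs dominates nearby parameter values.
best = mvn_loglik(X, mu_hat, Sigma_hat)
assert best >= mvn_loglik(X, mu_hat + 0.1, Sigma_hat)
assert best >= mvn_loglik(X, mu_hat, Sigma_hat + 0.1 * np.eye(p))
```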
Sampling distribution of the MLE's
Note:
$$\hat{\vec\mu} = \bar{\vec x} = \frac{1}{n}\sum_{i=1}^n \vec x_i = \frac{1}{n}\begin{bmatrix} I, \dots, I\end{bmatrix}\begin{bmatrix}\vec x_1 \\ \vdots \\ \vec x_n\end{bmatrix} = A\vec x$$
The joint density function of $\vec x_1, \vec x_2, \dots, \vec x_n$ is:
$$f(\vec x_1, \dots, \vec x_n;\ \vec\mu, \Sigma) = \prod_{i=1}^n \frac{1}{(2\pi)^{p/2}|\Sigma|^{1/2}}\, e^{-\frac{1}{2}(\vec x_i - \vec\mu)'\Sigma^{-1}(\vec x_i - \vec\mu)} = \frac{1}{(2\pi)^{np/2}|\Sigma|^{n/2}}\, e^{-\frac{1}{2}\sum_{i=1}^n(\vec x_i - \vec\mu)'\Sigma^{-1}(\vec x_i - \vec\mu)}$$
This distribution is $np$-variate normal with mean vector
$$\vec\mu^* = \begin{bmatrix}\vec\mu \\ \vdots \\ \vec\mu\end{bmatrix}$$
and covariance matrix
$$\Sigma^* = \begin{bmatrix}\Sigma & & 0 \\ & \ddots & \\ 0 & & \Sigma\end{bmatrix}$$
Thus the distribution of
$$\hat{\vec\mu} = \bar{\vec x} = \frac{1}{n}\begin{bmatrix} I, \dots, I\end{bmatrix}\vec x = A\vec x$$
is $p$-variate normal with mean vector
$$A\vec\mu^* = \frac{1}{n}\begin{bmatrix} I, \dots, I\end{bmatrix}\begin{bmatrix}\vec\mu \\ \vdots \\ \vec\mu\end{bmatrix} = \frac{1}{n}\,n\vec\mu = \vec\mu$$
and covariance matrix
$$A\Sigma^* A' = \frac{1}{n}\begin{bmatrix} I, \dots, I\end{bmatrix}\begin{bmatrix}\Sigma & & 0 \\ & \ddots & \\ 0 & & \Sigma\end{bmatrix}\frac{1}{n}\begin{bmatrix} I \\ \vdots \\ I\end{bmatrix} = \frac{1}{n^2}\,n\Sigma = \frac{1}{n}\Sigma$$
Summary
The sampling distribution of $\hat{\vec\mu} = \bar{\vec x}$ is $p$-variate normal with mean vector $\vec\mu$ and covariance matrix $\frac{1}{n}\Sigma$.
The sampling distribution of the sample covariance matrix $S$, and of $n\hat\Sigma = (n-1)S$, is related to the Wishart distribution, described next.
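A Monte Carlo sketch of the first claim (the parameter values below are hypothetical): drawing many samples of size $n$ and recording each sample mean vector, the empirical mean and covariance of $\bar{\vec x}$ should approach $\vec\mu$ and $\Sigma/n$.

```python
import numpy as np

rng = np.random.default_rng(3)
p, n, reps = 2, 10, 20000
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
C = np.linalg.cholesky(Sigma)          # to transform iid normals to N(mu, Sigma)

# Draw `reps` samples of size n and record each sample mean vector.
xbars = np.empty((reps, p))
for r in range(reps):
    X = mu[:, None] + C @ rng.normal(size=(p, n))
    xbars[r] = X.mean(axis=1)

# Empirically, mean(xbar) ~ mu and cov(xbar) ~ Sigma / n.
assert np.allclose(xbars.mean(axis=0), mu, atol=0.05)
assert np.allclose(np.cov(xbars.T), Sigma / n, atol=0.05)
```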
The Wishart distribution
A multivariate generalization of the $\chi^2$ distribution.
Let $\vec z_1, \vec z_2, \dots, \vec z_k$ be $k$ independent random $p$-vectors, each having a $p$-variate normal distribution with mean vector $\vec 0$ and covariance matrix $\Sigma$ ($p\times p$).
Let
$$U = \vec z_1\vec z_1' + \vec z_2\vec z_2' + \cdots + \vec z_k\vec z_k'$$
Then $U$ is said to have the $p$-variate Wishart distribution with $k$ degrees of freedom:
$$U \sim W_p(\Sigma, k)$$
Definition: the p-variate Wishart distribution
Suppose $U \sim W_p(\Sigma, k)$. Then the density of $U$ is:
$$f_U(u) = \frac{|u|^{(k-p-1)/2}\,\exp\!\left[-\tfrac{1}{2}\operatorname{tr}\!\left(\Sigma^{-1}u\right)\right]}{2^{kp/2}\,|\Sigma|^{k/2}\,\Gamma_p(k/2)}$$
where $\Gamma_p(\cdot)$ is the multivariate gamma function, i.e.
$$\Gamma_p(k/2) = \pi^{p(p-1)/4}\prod_{j=1}^{p}\Gamma\!\left(\tfrac{k+1-j}{2}\right).$$
The density of the p-variate Wishart distribution
It can be easily checked that when $p = 1$ and $\Sigma = 1$ the Wishart distribution becomes the $\chi^2$ distribution with $k$ degrees of freedom.
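Two Monte Carlo sketches of the facts above (parameter values are hypothetical): a Wishart matrix built as $\sum_i \vec z_i\vec z_i'$ has expectation $k\Sigma$, and for $p=1$, $\Sigma=1$ the construction reduces to a $\chi^2_k$ variable (mean $k$, variance $2k$).

```python
import numpy as np

rng = np.random.default_rng(4)
p, k, reps = 2, 5, 50000
Sigma = np.array([[1.0, 0.3],
                  [0.3, 2.0]])
C = np.linalg.cholesky(Sigma)

# Draw U = z1 z1' + ... + zk zk' with z_i ~ N_p(0, Sigma), many times.
Z = C @ rng.normal(size=(reps, p, k))      # columns of each slice are the z_i
Us = Z @ Z.transpose(0, 2, 1)

# E[U] = k * Sigma for the Wishart distribution.
assert np.allclose(Us.mean(axis=0), k * Sigma, atol=0.1)

# For p = 1, Sigma = 1, U is chi-square with k df: mean k, variance 2k.
u1 = (rng.normal(size=(reps, k)) ** 2).sum(axis=1)
assert abs(u1.mean() - k) < 0.1
assert abs(u1.var() - 2 * k) < 0.5
```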
Theorem
Suppose $U \sim W_p(\Sigma, k)$. Let $C_{q\times p}$ denote a matrix of rank $q \le p$. Then
$$V = CUC' \sim W_q(C\Sigma C', k)$$
Corollary 1:
$$v = \vec a'U\vec a \sim \sigma_a^2\,\chi^2_k \quad \text{with } \sigma_a^2 = \vec a'\Sigma\vec a$$
Corollary 2: If $u_{ii}$ is the $i$th diagonal element of $U$, then $u_{ii} \sim \sigma_{ii}\,\chi^2_k$, where $\sigma_{ii}$ is the $i$th diagonal element of $\Sigma$.
Proof: set $\vec a = \vec e_i = [0, \dots, 0, 1, 0, \dots, 0]'$.
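A Monte Carlo sketch of Corollary 1 (with a hypothetical $\Sigma$ and $\vec a$ of my choosing): simulating $v = \vec a'U\vec a$ repeatedly, its mean and variance should match those of $\sigma_a^2\,\chi^2_k$, namely $\sigma_a^2 k$ and $2\sigma_a^4 k$.

```python
import numpy as np

rng = np.random.default_rng(8)
p, k, reps = 2, 6, 50000
Sigma = np.array([[1.0, 0.4],
                  [0.4, 1.5]])
C = np.linalg.cholesky(Sigma)
a = np.array([1.0, -2.0])
sigma_a2 = a @ Sigma @ a                        # a' Sigma a

# v = a' U a with U = Z Z' ~ W_p(Sigma, k), simulated `reps` times.
Z = C @ rng.normal(size=(reps, p, k))
v = np.einsum('i,rij,rkj,k->r', a, Z, Z, a)     # a' (Z Z') a per replicate

assert abs(v.mean() - sigma_a2 * k) < 0.5       # chi2_k has mean k
assert abs(v.var() - 2 * k * sigma_a2**2) < 30  # chi2_k has variance 2k
```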
Theorem
Suppose $U_1 \sim W_p(\Sigma, k_1)$ and $U_2 \sim W_p(\Sigma, k_2)$ are independent; then
$$V = U_1 + U_2 \sim W_p(\Sigma, k_1 + k_2)$$
Theorem
Suppose $U_1 \sim W_p(\Sigma, k_1)$ and $U_2$ are independent, and
$$V = U_1 + U_2 \sim W_p(\Sigma, k) \quad \text{with } k > k_1;$$
then
$$U_2 \sim W_p(\Sigma, k - k_1)$$
Theorem
Let $\vec x_1, \vec x_2, \dots, \vec x_n$ be a sample from $N_p(\vec\mu, \Sigma)$; then
$$U = \sum_{i=1}^n (\vec x_i - \vec\mu)(\vec x_i - \vec\mu)' \sim W_p(\Sigma, n)$$
Theorem
Let $\vec x_1, \vec x_2, \dots, \vec x_n$ be a sample from $N_p(\vec\mu, \Sigma)$; then
$$U = \sum_{i=1}^n (\vec x_i - \bar{\vec x})(\vec x_i - \bar{\vec x})' = (n-1)S \sim W_p(\Sigma, n-1)$$
Proof:
$$\sum_{i=1}^n (\vec x_i - \vec\mu)(\vec x_i - \vec\mu)' = \sum_{i=1}^n (\vec x_i - \bar{\vec x} + \bar{\vec x} - \vec\mu)(\vec x_i - \bar{\vec x} + \bar{\vec x} - \vec\mu)'$$
Expanding, the cross terms vanish because $\sum_{i=1}^n (\vec x_i - \bar{\vec x}) = \vec 0$; hence
$$\sum_{i=1}^n (\vec x_i - \vec\mu)(\vec x_i - \vec\mu)' = \sum_{i=1}^n (\vec x_i - \bar{\vec x})(\vec x_i - \bar{\vec x})' + n(\bar{\vec x} - \vec\mu)(\bar{\vec x} - \vec\mu)' = U + n(\bar{\vec x} - \vec\mu)(\bar{\vec x} - \vec\mu)',$$
etc.
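A quick numerical confirmation of the decomposition used in the proof (with simulated data of my choosing): the sum of outer products about $\vec\mu$ equals the sum about $\bar{\vec x}$ plus the correction term $n(\bar{\vec x}-\vec\mu)(\bar{\vec x}-\vec\mu)'$.

```python
import numpy as np

rng = np.random.default_rng(9)
p, n = 3, 10
mu = np.array([1.0, 0.0, -1.0])
X = mu[:, None] + rng.normal(size=(p, n))       # columns are x_1, ..., x_n
xbar = X.mean(axis=1)

# sum_i (x_i - mu)(x_i - mu)'  ==  sum_i (x_i - xbar)(x_i - xbar)' + n (xbar - mu)(xbar - mu)'
lhs = (X - mu[:, None]) @ (X - mu[:, None]).T
rhs = (X - xbar[:, None]) @ (X - xbar[:, None]).T \
      + n * np.outer(xbar - mu, xbar - mu)
assert np.allclose(lhs, rhs)
```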
Theorem
Let $\vec x_1, \vec x_2, \dots, \vec x_n$ be a sample from $N_p(\vec\mu, \Sigma)$; then $\bar{\vec x}$ is independent of
$$U = \sum_{i=1}^n (\vec x_i - \bar{\vec x})(\vec x_i - \bar{\vec x})' = (n-1)S$$
Proof:
Let
$$H = \begin{bmatrix} \tfrac{1}{\sqrt n} & \tfrac{1}{\sqrt n} & \cdots & \tfrac{1}{\sqrt n} \\ h_{21} & h_{22} & \cdots & h_{2n} \\ \vdots & \vdots & & \vdots \\ h_{n1} & h_{n2} & \cdots & h_{nn} \end{bmatrix}$$
be orthogonal. Then $H'H = HH' = I$.
Let
$$H^*_{np\times np} = H \otimes I = \begin{bmatrix} \tfrac{1}{\sqrt n}I & \tfrac{1}{\sqrt n}I & \cdots & \tfrac{1}{\sqrt n}I \\ h_{21}I & h_{22}I & \cdots & h_{2n}I \\ \vdots & \vdots & & \vdots \\ h_{n1}I & h_{n2}I & \cdots & h_{nn}I \end{bmatrix}$$
Note: $H^*$, the Kronecker product of $H$ and $I$, is also orthogonal.
Properties of the Kronecker product
$$A \otimes B = \begin{bmatrix} a_{11}B & \cdots & a_{1n}B \\ \vdots & & \vdots \\ a_{m1}B & \cdots & a_{mn}B \end{bmatrix}$$
1. $(A \otimes B)(C \otimes D) = AC \otimes BD$
2. $(A \otimes B)' = A' \otimes B'$
3. $(A \otimes B)^{-1} = A^{-1} \otimes B^{-1}$
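The three Kronecker-product properties can be checked numerically with NumPy's `np.kron` (a sketch with random matrices; Gaussian matrices are invertible almost surely, so property 3 applies):

```python
import numpy as np

rng = np.random.default_rng(5)
A, B = rng.normal(size=(2, 2)), rng.normal(size=(3, 3))
C, D = rng.normal(size=(2, 2)), rng.normal(size=(3, 3))

# 1. Mixed-product property: (A x B)(C x D) = AC x BD
assert np.allclose(np.kron(A, B) @ np.kron(C, D), np.kron(A @ C, B @ D))
# 2. Transpose distributes over the Kronecker product
assert np.allclose(np.kron(A, B).T, np.kron(A.T, B.T))
# 3. Inverse distributes over the Kronecker product
assert np.allclose(np.linalg.inv(np.kron(A, B)),
                   np.kron(np.linalg.inv(A), np.linalg.inv(B)))
```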
Let
$$\vec u = \begin{bmatrix}\vec u_1 \\ \vec u_2 \\ \vdots \\ \vec u_n\end{bmatrix} = H^*\vec x = (H \otimes I)\begin{bmatrix}\vec x_1 \\ \vec x_2 \\ \vdots \\ \vec x_n\end{bmatrix}$$
so that
$$\vec u_1 = \frac{1}{\sqrt n}\sum_{j=1}^n \vec x_j = \sqrt n\,\bar{\vec x}, \qquad \vec u_i = \sum_{j=1}^n h_{ij}\vec x_j \ \text{ for } i = 2, 3, \dots, n.$$
Note (since $H$ is orthogonal):
$$\sum_{i=1}^n \vec u_i\vec u_i' = \sum_{i=1}^n \vec x_i\vec x_i'$$
hence
$$\sum_{i=2}^n \vec u_i\vec u_i' = \sum_{i=1}^n \vec u_i\vec u_i' - \vec u_1\vec u_1' = \sum_{i=1}^n \vec x_i\vec x_i' - n\bar{\vec x}\bar{\vec x}' = \sum_{i=1}^n (\vec x_i - \bar{\vec x})(\vec x_i - \bar{\vec x})' = (n-1)S.$$
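The identities above can be verified numerically. This sketch (data and construction of $H$ are my own; one standard way to get an orthogonal $H$ with constant first row is via a QR factorization) checks that $\vec u_1 = \sqrt n\,\bar{\vec x}$ and that $\sum_{i\ge 2}\vec u_i\vec u_i' = (n-1)S$:

```python
import numpy as np

rng = np.random.default_rng(6)
p, n = 2, 6
X = rng.normal(size=(p, n))                       # columns are x_1, ..., x_n

# Build an orthogonal H whose first row is (1/sqrt(n), ..., 1/sqrt(n)):
# QR-factorize a matrix whose first column is the constant unit vector.
M = np.column_stack([np.ones(n) / np.sqrt(n), rng.normal(size=(n, n - 1))])
Q, _ = np.linalg.qr(M)
H = np.sign(Q[0, 0]) * Q.T                        # fix sign so row 1 is +1/sqrt(n)

U = X @ H.T                                       # column i is u_i = sum_j h_ij x_j
xbar = X.mean(axis=1)

assert np.allclose(U[:, 0], np.sqrt(n) * xbar)    # u_1 = sqrt(n) * xbar
S = np.cov(X)
assert np.allclose(U[:, 1:] @ U[:, 1:].T, (n - 1) * S)   # sum_{i>=2} u_i u_i' = (n-1)S
```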
The distribution of the data vector $\vec x = (\vec x_1', \dots, \vec x_n')'$ is $np$-variate normal with mean vector
$$\vec\mu^* = \begin{bmatrix}\vec\mu \\ \vdots \\ \vec\mu\end{bmatrix} = \vec 1 \otimes \vec\mu$$
and covariance matrix
$$\Sigma^* = \begin{bmatrix}\Sigma & & 0 \\ & \ddots & \\ 0 & & \Sigma\end{bmatrix} = I \otimes \Sigma$$
Thus the joint distribution of $\vec u = H^*\vec x = (H \otimes I)\vec x$ is $np$-variate normal with mean vector
$$(H \otimes I)\vec\mu^* = (H \otimes I)(\vec 1 \otimes \vec\mu) = (H\vec 1) \otimes \vec\mu = \begin{bmatrix}\sqrt n\,\vec\mu \\ \vec 0 \\ \vdots \\ \vec 0\end{bmatrix}$$
(since the first row of $H$ is $\tfrac{1}{\sqrt n}\vec 1'$ and the remaining rows of the orthogonal matrix $H$ are orthogonal to $\vec 1$), and covariance matrix
$$(H \otimes I)\Sigma^*(H \otimes I)' = (H \otimes I)(I \otimes \Sigma)(H' \otimes I) = (HH') \otimes \Sigma = I \otimes \Sigma = \begin{bmatrix}\Sigma & & 0 \\ & \ddots & \\ 0 & & \Sigma\end{bmatrix}$$
Thus $\vec u_1 = \sqrt n\,\bar{\vec x},\ \vec u_2, \dots, \vec u_n$ are independent, with $\vec u_i \sim N_p(\vec 0, \Sigma)$ for $i = 2, \dots, n$. Hence $\bar{\vec x}$ is independent of
$$U = \sum_{i=2}^n \vec u_i\vec u_i' = \sum_{i=1}^n (\vec x_i - \bar{\vec x})(\vec x_i - \bar{\vec x})' = (n-1)S \sim W_p(\Sigma, n-1).$$
Summary: Sampling distribution of MLE's for the multivariate Normal distribution
Let $\vec x_1, \vec x_2, \dots, \vec x_n$ be a sample from $N_p(\vec\mu, \Sigma)$; then
$$\bar{\vec x} \sim N_p\!\left(\vec\mu, \tfrac{1}{n}\Sigma\right)$$
and
$$U = \sum_{i=1}^n (\vec x_i - \bar{\vec x})(\vec x_i - \bar{\vec x})' = (n-1)S = n\hat\Sigma \sim W_p(\Sigma, n-1).$$
Also
$$u_{ii} = (n-1)s_{ii} \sim \sigma_{ii}\,\chi^2_{n-1}.$$
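A final Monte Carlo sketch of the last claim for a single coordinate (the variance value is hypothetical): $(n-1)s_{ii}/\sigma_{ii}$ should behave like a $\chi^2_{n-1}$ variable, with mean $n-1$ and variance $2(n-1)$.

```python
import numpy as np

rng = np.random.default_rng(7)
n, reps = 8, 50000
sigma11 = 2.0                                    # variance of the first coordinate

# Simulate (n-1)*s11 / sigma11 over many samples of size n; it should be
# chi-square with n-1 degrees of freedom: mean n-1, variance 2(n-1).
samples = rng.normal(scale=np.sqrt(sigma11), size=(reps, n))
stat = (n - 1) * samples.var(axis=1, ddof=1) / sigma11

assert abs(stat.mean() - (n - 1)) < 0.1
assert abs(stat.var() - 2 * (n - 1)) < 0.5
```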