Maximum Likelihood Estimation: Multivariate Normal Distribution
Post on 24-Dec-2015
Maximum Likelihood Estimation
Multivariate Normal distribution
The Method of Maximum Likelihood
Suppose that the data $x_1, \dots, x_n$ has joint density function
$$f(x_1, \dots, x_n;\ \theta_1, \dots, \theta_p)$$
where $\theta = (\theta_1, \dots, \theta_p)$ are unknown parameters assumed to lie in $\Omega$ (a subset of $p$-dimensional space).
We want to estimate the parameters $\theta_1, \dots, \theta_p$.
Definition: The Likelihood function
Suppose that the data $x_1, \dots, x_n$ has joint density function
$$f(x_1, \dots, x_n;\ \theta_1, \dots, \theta_p)$$
Then, given the data, the Likelihood function is defined to be
$$L(\theta_1, \dots, \theta_p) = f(x_1, \dots, x_n;\ \theta_1, \dots, \theta_p)$$
Note: the domain of $L(\theta_1, \dots, \theta_p)$ is the set $\Omega$.
Definition: Maximum Likelihood Estimators
Suppose that the data $x_1, \dots, x_n$ has joint density function
$$f(x_1, \dots, x_n;\ \theta_1, \dots, \theta_p)$$
Then the Likelihood function is defined to be
$$L(\theta_1, \dots, \theta_p) = f(x_1, \dots, x_n;\ \theta_1, \dots, \theta_p)$$
and the Maximum Likelihood estimators of the parameters $\theta_1, \dots, \theta_p$ are the values $\hat\theta_1, \dots, \hat\theta_p$ such that
$$L(\hat\theta_1, \dots, \hat\theta_p) = \max_{\theta_1, \dots, \theta_p} L(\theta_1, \dots, \theta_p)$$
Note: maximizing $L(\theta_1, \dots, \theta_p)$ is equivalent to maximizing
$$l(\theta_1, \dots, \theta_p) = \ln L(\theta_1, \dots, \theta_p),$$
the log-likelihood function.
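As a quick numerical sketch of this equivalence (not from the original slides; the coin-flip data here is hypothetical): a grid search over the parameter of a Bernoulli model finds the same maximizer whether we score the grid by $L$ or by $\ln L$.

```python
import numpy as np

# Hypothetical data: 10 Bernoulli trials with 7 successes.
n_trials, n_success = 10, 7

theta = np.linspace(0.01, 0.99, 981)    # grid over the parameter space
L = theta**n_success * (1 - theta)**(n_trials - n_success)           # likelihood
logL = n_success * np.log(theta) + (n_trials - n_success) * np.log(1 - theta)

# Both criteria are maximized at the same grid point (7/10 = 0.7 here).
assert np.argmax(L) == np.argmax(logL)
```

The log transform is preferred in practice because products of many small densities underflow, while their logs sum stably.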
The Multivariate Normal Distribution
Maximum Likelihood Estimation
Let $\vec x_1, \vec x_2, \dots, \vec x_n$ denote a sample (independent) from the $p$-variate normal distribution with mean vector $\vec\mu$ and covariance matrix $\Sigma$.
Note:
The matrix
$$X = \begin{bmatrix} \vec x_1, \vec x_2, \dots, \vec x_n \end{bmatrix}_{p\times n} = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & & \vdots \\ x_{p1} & x_{p2} & \cdots & x_{pn} \end{bmatrix}$$
is called the data matrix.
The vector
$$\vec x_{np\times 1} = \begin{bmatrix} \vec x_1 \\ \vec x_2 \\ \vdots \\ \vec x_n \end{bmatrix} = \begin{bmatrix} x_{11} \\ \vdots \\ x_{p1} \\ \vdots \\ x_{1n} \\ \vdots \\ x_{pn} \end{bmatrix}$$
is called the data vector.
The mean vector
The vector
$$\bar{\vec x} = \frac{1}{n}\left(\vec x_1 + \vec x_2 + \cdots + \vec x_n\right) = \begin{bmatrix}\bar x_1 \\ \bar x_2 \\ \vdots \\ \bar x_p\end{bmatrix}$$
is called the sample mean vector.
Note:
$$\bar x_i = \frac{1}{n}\left(x_{i1} + x_{i2} + \cdots + x_{in}\right) = \frac{1}{n}\sum_{j=1}^n x_{ij}$$
Also
$$\bar{\vec x} = \frac{1}{n}\begin{bmatrix} x_{11} & \cdots & x_{1n} \\ \vdots & & \vdots \\ x_{p1} & \cdots & x_{pn}\end{bmatrix}\begin{bmatrix}1 \\ \vdots \\ 1\end{bmatrix} = \frac{1}{n} X \vec 1$$
In terms of the data vector:
$$\bar{\vec x} = \frac{1}{n}\begin{bmatrix} I, I, \dots, I\end{bmatrix}\vec x = A\vec x, \qquad \text{where } A = \frac{1}{n}\begin{bmatrix} I, I, \dots, I\end{bmatrix}_{p\times np}.$$
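A small numerical sketch (the data here is simulated, and the variable names are mine) confirming that the three expressions for the sample mean vector agree: the average of the columns, $\frac{1}{n}X\vec 1$, and $A\vec x$ applied to the stacked data vector.

```python
import numpy as np

rng = np.random.default_rng(0)
p, n = 3, 5
X = rng.normal(size=(p, n))            # data matrix, columns are observations

xbar_cols = X.mean(axis=1)             # (1/n) * (x_1 + ... + x_n)
xbar_ones = X @ np.ones(n) / n         # (1/n) * X * 1
A = np.hstack([np.eye(p)] * n) / n     # A = (1/n)[I, I, ..., I], shape p x np
xbar_vec = A @ X.T.ravel()             # A applied to the stacked data vector

assert np.allclose(xbar_cols, xbar_ones)
assert np.allclose(xbar_cols, xbar_vec)
```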
Graphical representation of sample mean vector
[Figure: the data vectors $\vec x_1, \vec x_2, \dots, \vec x_n$ plotted as points, with $\bar{\vec x}$ at their center.]
The sample mean vector is the centroid of the data vectors.
The Sample Covariance matrix
The sample covariance matrix:
$$S = \begin{bmatrix} s_{11} & s_{12} & \cdots & s_{1p} \\ s_{12} & s_{22} & \cdots & s_{2p} \\ \vdots & \vdots & & \vdots \\ s_{1p} & s_{2p} & \cdots & s_{pp} \end{bmatrix}$$
where
$$s_{ik} = \frac{1}{n-1}\sum_{j=1}^n (x_{ij} - \bar x_i)(x_{kj} - \bar x_k).$$
There are different ways of representing the sample covariance matrix:
$$S = \frac{1}{n-1}\sum_{j=1}^n \left(\vec x_j - \bar{\vec x}\right)\left(\vec x_j - \bar{\vec x}\right)' = \frac{1}{n-1}\begin{bmatrix}\vec x_1 - \bar{\vec x}, \dots, \vec x_n - \bar{\vec x}\end{bmatrix}\begin{bmatrix}(\vec x_1 - \bar{\vec x})' \\ \vdots \\ (\vec x_n - \bar{\vec x})'\end{bmatrix}$$
$$= \frac{1}{n-1}\left(X - \bar{\vec x}\,\vec 1'\right)\left(X - \bar{\vec x}\,\vec 1'\right)' = \frac{1}{n-1}\left(X - \tfrac{1}{n}XJ\right)\left(X - \tfrac{1}{n}XJ\right)'$$
where $\vec 1 = (1, \dots, 1)'$ and $J = \vec 1\,\vec 1'$ is the $n\times n$ matrix of 1's. Hence
$$S = \frac{1}{n-1}\,X\left(I - \tfrac{1}{n}J\right)\left(I - \tfrac{1}{n}J\right)'X'$$
and, because $I - \tfrac{1}{n}J$ is symmetric and idempotent,
$$S = \frac{1}{n-1}\,X\left(I - \tfrac{1}{n}J\right)X'.$$
Maximum Likelihood Estimation
Multivariate Normal distribution
Let $\vec x_1, \vec x_2, \dots, \vec x_n$ denote a sample (independent) from the $p$-variate normal distribution with mean vector $\vec\mu$ and covariance matrix $\Sigma$:
$$f(\vec x_1, \dots, \vec x_n;\ \vec\mu, \Sigma) = \prod_{i=1}^n \frac{1}{(2\pi)^{p/2}|\Sigma|^{1/2}}\, e^{-\frac{1}{2}(\vec x_i - \vec\mu)'\Sigma^{-1}(\vec x_i - \vec\mu)}$$
Then the joint density function of $\vec x_1, \vec x_2, \dots, \vec x_n$ is:
$$\frac{1}{(2\pi)^{np/2}|\Sigma|^{n/2}}\, e^{-\frac{1}{2}\sum_{i=1}^n(\vec x_i - \vec\mu)'\Sigma^{-1}(\vec x_i - \vec\mu)}$$
The Likelihood function is:
$$L(\vec\mu, \Sigma) = \frac{1}{(2\pi)^{np/2}|\Sigma|^{n/2}}\, e^{-\frac{1}{2}\sum_{i=1}^n(\vec x_i - \vec\mu)'\Sigma^{-1}(\vec x_i - \vec\mu)}$$
and the Log-likelihood function is:
$$l(\vec\mu, \Sigma) = \ln L(\vec\mu, \Sigma) = -\frac{np}{2}\ln(2\pi) - \frac{n}{2}\ln|\Sigma| - \frac{1}{2}\sum_{i=1}^n(\vec x_i - \vec\mu)'\Sigma^{-1}(\vec x_i - \vec\mu)$$
To find the Maximum Likelihood estimators of $\vec\mu$ and $\Sigma$, we need to find $\hat{\vec\mu}$ and $\hat\Sigma$ to maximize
$$L(\vec\mu, \Sigma) = \frac{1}{(2\pi)^{np/2}|\Sigma|^{n/2}}\, e^{-\frac{1}{2}\sum_{i=1}^n(\vec x_i - \vec\mu)'\Sigma^{-1}(\vec x_i - \vec\mu)}$$
or equivalently maximize
$$l(\vec\mu, \Sigma) = -\frac{np}{2}\ln(2\pi) - \frac{n}{2}\ln|\Sigma| - \frac{1}{2}\sum_{i=1}^n(\vec x_i - \vec\mu)'\Sigma^{-1}(\vec x_i - \vec\mu)$$
Note:
$$\frac{\partial\, l(\vec\mu, \Sigma)}{\partial \vec\mu} = -\frac{1}{2}\,\frac{\partial}{\partial \vec\mu}\sum_{i=1}^n(\vec x_i - \vec\mu)'\Sigma^{-1}(\vec x_i - \vec\mu) = \sum_{i=1}^n \Sigma^{-1}(\vec x_i - \vec\mu) = \Sigma^{-1}\left(\sum_{i=1}^n \vec x_i - n\vec\mu\right)$$
Setting this equal to $\vec 0$,
$$\Sigma^{-1}\left(\sum_{i=1}^n \vec x_i - n\vec\mu\right) = \vec 0$$
hence
$$\hat{\vec\mu} = \frac{1}{n}\sum_{i=1}^n \vec x_i = \bar{\vec x}.$$
Now
$$l(\vec\mu, \Sigma) = -\frac{np}{2}\ln(2\pi) - \frac{n}{2}\ln|\Sigma| - \frac{1}{2}\sum_{i=1}^n(\vec x_i - \vec\mu)'\Sigma^{-1}(\vec x_i - \vec\mu)$$
$$= -\frac{np}{2}\ln(2\pi) - \frac{n}{2}\ln|\Sigma| - \frac{1}{2}\sum_{i=1}^n \operatorname{tr}\!\left[(\vec x_i - \vec\mu)'\Sigma^{-1}(\vec x_i - \vec\mu)\right]$$
(each quadratic form is a $1\times 1$ matrix, so it equals its own trace)
$$= -\frac{np}{2}\ln(2\pi) - \frac{n}{2}\ln|\Sigma| - \frac{1}{2}\sum_{i=1}^n \operatorname{tr}\!\left[\Sigma^{-1}(\vec x_i - \vec\mu)(\vec x_i - \vec\mu)'\right]$$
using $\operatorname{tr}(AB) = \operatorname{tr}(BA)$, so that
$$l(\vec\mu, \Sigma) = -\frac{np}{2}\ln(2\pi) - \frac{n}{2}\ln|\Sigma| - \frac{1}{2}\operatorname{tr}\!\left[\Sigma^{-1}\sum_{i=1}^n(\vec x_i - \vec\mu)(\vec x_i - \vec\mu)'\right].$$
Now, differentiating $l(\vec\mu, \Sigma)$ with respect to $\Sigma$:
$$\frac{\partial\, l(\vec\mu, \Sigma)}{\partial \Sigma} = -\frac{n}{2}\,\frac{\partial \ln|\Sigma|}{\partial \Sigma} - \frac{1}{2}\,\frac{\partial}{\partial \Sigma}\operatorname{tr}\!\left[\Sigma^{-1}\sum_{i=1}^n(\vec x_i - \vec\mu)(\vec x_i - \vec\mu)'\right]$$
$$= -\frac{n}{2}\Sigma^{-1} + \frac{1}{2}\,\Sigma^{-1}\left[\sum_{i=1}^n(\vec x_i - \vec\mu)(\vec x_i - \vec\mu)'\right]\Sigma^{-1} = 0_{p\times p}$$
hence
$$\hat\Sigma = \frac{1}{n}\sum_{i=1}^n(\vec x_i - \hat{\vec\mu})(\vec x_i - \hat{\vec\mu})' = \frac{1}{n}\sum_{i=1}^n(\vec x_i - \bar{\vec x})(\vec x_i - \bar{\vec x})' = \frac{n-1}{n}S$$
Summary:
The Maximum Likelihood estimators of $\vec\mu$ and $\Sigma$ are
$$\hat{\vec\mu} = \frac{1}{n}\sum_{i=1}^n \vec x_i = \bar{\vec x}$$
and
$$\hat\Sigma = \frac{1}{n}\sum_{i=1}^n(\vec x_i - \bar{\vec x})(\vec x_i - \bar{\vec x})' = \frac{n-1}{n}S.$$
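The summary above can be checked numerically. This is a minimal sketch with simulated data (the function name `mvn_loglik` is mine): it computes $\hat{\vec\mu}$ and $\hat\Sigma$ in closed form, confirms $\hat\Sigma = \frac{n-1}{n}S$, and verifies that the log-likelihood at the MLEs dominates nearby parameter values.

```python
import numpy as np

def mvn_loglik(X, mu, Sigma):
    """Multivariate normal log-likelihood; X holds observations in columns."""
    p, n = X.shape
    D = X - mu[:, None]
    quad = np.trace(np.linalg.inv(Sigma) @ (D @ D.T))
    return (-0.5 * n * p * np.log(2 * np.pi)
            - 0.5 * n * np.log(np.linalg.det(Sigma))
            - 0.5 * quad)

rng = np.random.default_rng(2)
p, n = 2, 50
X = rng.normal(size=(p, n))

mu_hat = X.mean(axis=1)
Sigma_hat = (X - mu_hat[:, None]) @ (X - mu_hat[:, None]).T / n

# MLE of Sigma is the (n-1)/n multiple of the sample covariance matrix S.
assert np.allclose(Sigma_hat, (n - 1) / n * np.cov(X))

# The log-likelihood at the MLEs dominates nearby parameter values.
best = mvn_loglik(X, mu_hat, Sigma_hat)
assert best >= mvn_loglik(X, mu_hat + 0.1, Sigma_hat)
assert best >= mvn_loglik(X, mu_hat, Sigma_hat + 0.1 * np.eye(p))
```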
Sampling distribution of the MLE's
Note:
$$\hat{\vec\mu} = \bar{\vec x} = \frac{1}{n}\sum_{i=1}^n \vec x_i = \frac{1}{n}\begin{bmatrix} I, \dots, I\end{bmatrix}\begin{bmatrix}\vec x_1 \\ \vdots \\ \vec x_n\end{bmatrix} = A\vec x$$
The joint density function of $\vec x_1, \vec x_2, \dots, \vec x_n$ is:
$$f(\vec x_1, \dots, \vec x_n;\ \vec\mu, \Sigma) = \prod_{i=1}^n \frac{1}{(2\pi)^{p/2}|\Sigma|^{1/2}}\, e^{-\frac{1}{2}(\vec x_i - \vec\mu)'\Sigma^{-1}(\vec x_i - \vec\mu)} = \frac{1}{(2\pi)^{np/2}|\Sigma|^{n/2}}\, e^{-\frac{1}{2}\sum_{i=1}^n(\vec x_i - \vec\mu)'\Sigma^{-1}(\vec x_i - \vec\mu)}$$
This distribution is $np$-variate normal with mean vector
$$\vec\mu^* = \begin{bmatrix}\vec\mu \\ \vdots \\ \vec\mu\end{bmatrix}$$
and covariance matrix
$$\Sigma^* = \begin{bmatrix}\Sigma & & 0 \\ & \ddots & \\ 0 & & \Sigma\end{bmatrix}$$
Thus the distribution of
$$\hat{\vec\mu} = \bar{\vec x} = \frac{1}{n}\begin{bmatrix} I, \dots, I\end{bmatrix}\vec x = A\vec x$$
is $p$-variate normal with mean vector
$$A\vec\mu^* = \frac{1}{n}\begin{bmatrix} I, \dots, I\end{bmatrix}\begin{bmatrix}\vec\mu \\ \vdots \\ \vec\mu\end{bmatrix} = \frac{1}{n}\,n\vec\mu = \vec\mu$$
and covariance matrix
$$A\Sigma^* A' = \frac{1}{n}\begin{bmatrix} I, \dots, I\end{bmatrix}\begin{bmatrix}\Sigma & & 0 \\ & \ddots & \\ 0 & & \Sigma\end{bmatrix}\frac{1}{n}\begin{bmatrix} I \\ \vdots \\ I\end{bmatrix} = \frac{1}{n^2}\,n\Sigma = \frac{1}{n}\Sigma$$
Summary
The sampling distribution of $\hat{\vec\mu} = \bar{\vec x}$ is $p$-variate normal with mean vector $\vec\mu$ and covariance matrix $\frac{1}{n}\Sigma$.
The sampling distribution of the sample covariance matrix $S$, and of $n\hat\Sigma = (n-1)S$, is related to the Wishart distribution, described next.
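A Monte Carlo sketch of the first claim (the parameter values below are hypothetical): drawing many samples of size $n$ and recording each sample mean vector, the empirical mean and covariance of $\bar{\vec x}$ should approach $\vec\mu$ and $\Sigma/n$.

```python
import numpy as np

rng = np.random.default_rng(3)
p, n, reps = 2, 10, 20000
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
C = np.linalg.cholesky(Sigma)          # to transform iid normals to N(mu, Sigma)

# Draw `reps` samples of size n and record each sample mean vector.
xbars = np.empty((reps, p))
for r in range(reps):
    X = mu[:, None] + C @ rng.normal(size=(p, n))
    xbars[r] = X.mean(axis=1)

# Empirically, mean(xbar) ~ mu and cov(xbar) ~ Sigma / n.
assert np.allclose(xbars.mean(axis=0), mu, atol=0.05)
assert np.allclose(np.cov(xbars.T), Sigma / n, atol=0.05)
```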
The Wishart distribution
A multivariate generalization of the $\chi^2$ distribution.
Let $\vec z_1, \vec z_2, \dots, \vec z_k$ be $k$ independent random $p$-vectors, each having a $p$-variate normal distribution with mean vector $\vec 0$ and covariance matrix $\Sigma$ ($p\times p$).
Let
$$U = \vec z_1\vec z_1' + \vec z_2\vec z_2' + \cdots + \vec z_k\vec z_k'$$
Then $U$ is said to have the $p$-variate Wishart distribution with $k$ degrees of freedom:
$$U \sim W_p(\Sigma, k)$$
Definition: the p-variate Wishart distribution
Suppose $U \sim W_p(\Sigma, k)$. Then the density of $U$ is:
$$f_U(u) = \frac{|u|^{(k-p-1)/2}\,\exp\!\left[-\tfrac{1}{2}\operatorname{tr}\!\left(\Sigma^{-1}u\right)\right]}{2^{kp/2}\,|\Sigma|^{k/2}\,\Gamma_p(k/2)}$$
where $\Gamma_p(\cdot)$ is the multivariate gamma function, i.e.
$$\Gamma_p(k/2) = \pi^{p(p-1)/4}\prod_{j=1}^{p}\Gamma\!\left(\tfrac{k+1-j}{2}\right).$$
The density of the p-variate Wishart distribution
It can be easily checked that when $p = 1$ and $\Sigma = 1$ the Wishart distribution becomes the $\chi^2$ distribution with $k$ degrees of freedom.
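Two Monte Carlo sketches of the facts above (parameter values are hypothetical): a Wishart matrix built as $\sum_i \vec z_i\vec z_i'$ has expectation $k\Sigma$, and for $p=1$, $\Sigma=1$ the construction reduces to a $\chi^2_k$ variable (mean $k$, variance $2k$).

```python
import numpy as np

rng = np.random.default_rng(4)
p, k, reps = 2, 5, 50000
Sigma = np.array([[1.0, 0.3],
                  [0.3, 2.0]])
C = np.linalg.cholesky(Sigma)

# Draw U = z1 z1' + ... + zk zk' with z_i ~ N_p(0, Sigma), many times.
Z = C @ rng.normal(size=(reps, p, k))      # columns of each slice are the z_i
Us = Z @ Z.transpose(0, 2, 1)

# E[U] = k * Sigma for the Wishart distribution.
assert np.allclose(Us.mean(axis=0), k * Sigma, atol=0.1)

# For p = 1, Sigma = 1, U is chi-square with k df: mean k, variance 2k.
u1 = (rng.normal(size=(reps, k)) ** 2).sum(axis=1)
assert abs(u1.mean() - k) < 0.1
assert abs(u1.var() - 2 * k) < 0.5
```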
Theorem
Suppose $U \sim W_p(\Sigma, k)$. Let $C_{q\times p}$ denote a matrix of rank $q \le p$. Then
$$V = CUC' \sim W_q(C\Sigma C', k)$$
Corollary 1:
$$v = \vec a'U\vec a \sim \sigma_a^2\,\chi^2_k \quad \text{with } \sigma_a^2 = \vec a'\Sigma\vec a$$
Corollary 2: If $u_{ii}$ is the $i$th diagonal element of $U$, then $u_{ii} \sim \sigma_{ii}\,\chi^2_k$, where $\sigma_{ii}$ is the $i$th diagonal element of $\Sigma$.
Proof: set $\vec a = \vec e_i = [0, \dots, 0, 1, 0, \dots, 0]'$.
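A Monte Carlo sketch of Corollary 1 (with a hypothetical $\Sigma$ and $\vec a$ of my choosing): simulating $v = \vec a'U\vec a$ repeatedly, its mean and variance should match those of $\sigma_a^2\,\chi^2_k$, namely $\sigma_a^2 k$ and $2\sigma_a^4 k$.

```python
import numpy as np

rng = np.random.default_rng(8)
p, k, reps = 2, 6, 50000
Sigma = np.array([[1.0, 0.4],
                  [0.4, 1.5]])
C = np.linalg.cholesky(Sigma)
a = np.array([1.0, -2.0])
sigma_a2 = a @ Sigma @ a                        # a' Sigma a

# v = a' U a with U = Z Z' ~ W_p(Sigma, k), simulated `reps` times.
Z = C @ rng.normal(size=(reps, p, k))
v = np.einsum('i,rij,rkj,k->r', a, Z, Z, a)     # a' (Z Z') a per replicate

assert abs(v.mean() - sigma_a2 * k) < 0.5       # chi2_k has mean k
assert abs(v.var() - 2 * k * sigma_a2**2) < 30  # chi2_k has variance 2k
```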
Theorem
Suppose $U_1 \sim W_p(\Sigma, k_1)$ and $U_2 \sim W_p(\Sigma, k_2)$ are independent; then
$$V = U_1 + U_2 \sim W_p(\Sigma, k_1 + k_2)$$
Theorem
Suppose $U_1 \sim W_p(\Sigma, k_1)$ and $U_2$ are independent, and
$$V = U_1 + U_2 \sim W_p(\Sigma, k) \quad \text{with } k > k_1;$$
then
$$U_2 \sim W_p(\Sigma, k - k_1)$$
Theorem
Let $\vec x_1, \vec x_2, \dots, \vec x_n$ be a sample from $N_p(\vec\mu, \Sigma)$; then
$$U = \sum_{i=1}^n (\vec x_i - \vec\mu)(\vec x_i - \vec\mu)' \sim W_p(\Sigma, n)$$
Theorem
Let $\vec x_1, \vec x_2, \dots, \vec x_n$ be a sample from $N_p(\vec\mu, \Sigma)$; then
$$U = \sum_{i=1}^n (\vec x_i - \bar{\vec x})(\vec x_i - \bar{\vec x})' = (n-1)S \sim W_p(\Sigma, n-1)$$
Proof:
$$\sum_{i=1}^n (\vec x_i - \vec\mu)(\vec x_i - \vec\mu)' = \sum_{i=1}^n (\vec x_i - \bar{\vec x} + \bar{\vec x} - \vec\mu)(\vec x_i - \bar{\vec x} + \bar{\vec x} - \vec\mu)'$$
Expanding, the cross terms vanish because $\sum_{i=1}^n (\vec x_i - \bar{\vec x}) = \vec 0$; hence
$$\sum_{i=1}^n (\vec x_i - \vec\mu)(\vec x_i - \vec\mu)' = \sum_{i=1}^n (\vec x_i - \bar{\vec x})(\vec x_i - \bar{\vec x})' + n(\bar{\vec x} - \vec\mu)(\bar{\vec x} - \vec\mu)' = U + n(\bar{\vec x} - \vec\mu)(\bar{\vec x} - \vec\mu)',$$
etc.
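A quick numerical confirmation of the decomposition used in the proof (with simulated data of my choosing): the sum of outer products about $\vec\mu$ equals the sum about $\bar{\vec x}$ plus the correction term $n(\bar{\vec x}-\vec\mu)(\bar{\vec x}-\vec\mu)'$.

```python
import numpy as np

rng = np.random.default_rng(9)
p, n = 3, 10
mu = np.array([1.0, 0.0, -1.0])
X = mu[:, None] + rng.normal(size=(p, n))       # columns are x_1, ..., x_n
xbar = X.mean(axis=1)

# sum_i (x_i - mu)(x_i - mu)'  ==  sum_i (x_i - xbar)(x_i - xbar)' + n (xbar - mu)(xbar - mu)'
lhs = (X - mu[:, None]) @ (X - mu[:, None]).T
rhs = (X - xbar[:, None]) @ (X - xbar[:, None]).T \
      + n * np.outer(xbar - mu, xbar - mu)
assert np.allclose(lhs, rhs)
```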
Theorem
Let $\vec x_1, \vec x_2, \dots, \vec x_n$ be a sample from $N_p(\vec\mu, \Sigma)$; then $\bar{\vec x}$ is independent of
$$U = \sum_{i=1}^n (\vec x_i - \bar{\vec x})(\vec x_i - \bar{\vec x})' = (n-1)S$$
Proof:
Let
$$H = \begin{bmatrix} \tfrac{1}{\sqrt n} & \tfrac{1}{\sqrt n} & \cdots & \tfrac{1}{\sqrt n} \\ h_{21} & h_{22} & \cdots & h_{2n} \\ \vdots & \vdots & & \vdots \\ h_{n1} & h_{n2} & \cdots & h_{nn} \end{bmatrix}$$
be orthogonal. Then $H'H = HH' = I$.
Let
$$H^*_{np\times np} = H \otimes I = \begin{bmatrix} \tfrac{1}{\sqrt n}I & \tfrac{1}{\sqrt n}I & \cdots & \tfrac{1}{\sqrt n}I \\ h_{21}I & h_{22}I & \cdots & h_{2n}I \\ \vdots & \vdots & & \vdots \\ h_{n1}I & h_{n2}I & \cdots & h_{nn}I \end{bmatrix}$$
Note: $H^*$, the Kronecker product of $H$ and $I$, is also orthogonal.
Properties of the Kronecker product
$$A \otimes B = \begin{bmatrix} a_{11}B & \cdots & a_{1n}B \\ \vdots & & \vdots \\ a_{m1}B & \cdots & a_{mn}B \end{bmatrix}$$
1. $(A \otimes B)(C \otimes D) = AC \otimes BD$
2. $(A \otimes B)' = A' \otimes B'$
3. $(A \otimes B)^{-1} = A^{-1} \otimes B^{-1}$
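The three Kronecker-product properties can be checked numerically with NumPy's `np.kron` (a sketch with random matrices; Gaussian matrices are invertible almost surely, so property 3 applies):

```python
import numpy as np

rng = np.random.default_rng(5)
A, B = rng.normal(size=(2, 2)), rng.normal(size=(3, 3))
C, D = rng.normal(size=(2, 2)), rng.normal(size=(3, 3))

# 1. Mixed-product property: (A x B)(C x D) = AC x BD
assert np.allclose(np.kron(A, B) @ np.kron(C, D), np.kron(A @ C, B @ D))
# 2. Transpose distributes over the Kronecker product
assert np.allclose(np.kron(A, B).T, np.kron(A.T, B.T))
# 3. Inverse distributes over the Kronecker product
assert np.allclose(np.linalg.inv(np.kron(A, B)),
                   np.kron(np.linalg.inv(A), np.linalg.inv(B)))
```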
Let
$$\vec u = \begin{bmatrix}\vec u_1 \\ \vec u_2 \\ \vdots \\ \vec u_n\end{bmatrix} = H^*\vec x = (H \otimes I)\begin{bmatrix}\vec x_1 \\ \vec x_2 \\ \vdots \\ \vec x_n\end{bmatrix}$$
so that
$$\vec u_1 = \frac{1}{\sqrt n}\sum_{j=1}^n \vec x_j = \sqrt n\,\bar{\vec x}, \qquad \vec u_i = \sum_{j=1}^n h_{ij}\vec x_j \ \text{ for } i = 2, 3, \dots, n.$$
Note (since $H$ is orthogonal):
$$\sum_{i=1}^n \vec u_i\vec u_i' = \sum_{i=1}^n \vec x_i\vec x_i'$$
hence
$$\sum_{i=2}^n \vec u_i\vec u_i' = \sum_{i=1}^n \vec u_i\vec u_i' - \vec u_1\vec u_1' = \sum_{i=1}^n \vec x_i\vec x_i' - n\bar{\vec x}\bar{\vec x}' = \sum_{i=1}^n (\vec x_i - \bar{\vec x})(\vec x_i - \bar{\vec x})' = (n-1)S.$$
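The identities above can be verified numerically. This sketch (data and construction of $H$ are my own; one standard way to get an orthogonal $H$ with constant first row is via a QR factorization) checks that $\vec u_1 = \sqrt n\,\bar{\vec x}$ and that $\sum_{i\ge 2}\vec u_i\vec u_i' = (n-1)S$:

```python
import numpy as np

rng = np.random.default_rng(6)
p, n = 2, 6
X = rng.normal(size=(p, n))                       # columns are x_1, ..., x_n

# Build an orthogonal H whose first row is (1/sqrt(n), ..., 1/sqrt(n)):
# QR-factorize a matrix whose first column is the constant unit vector.
M = np.column_stack([np.ones(n) / np.sqrt(n), rng.normal(size=(n, n - 1))])
Q, _ = np.linalg.qr(M)
H = np.sign(Q[0, 0]) * Q.T                        # fix sign so row 1 is +1/sqrt(n)

U = X @ H.T                                       # column i is u_i = sum_j h_ij x_j
xbar = X.mean(axis=1)

assert np.allclose(U[:, 0], np.sqrt(n) * xbar)    # u_1 = sqrt(n) * xbar
S = np.cov(X)
assert np.allclose(U[:, 1:] @ U[:, 1:].T, (n - 1) * S)   # sum_{i>=2} u_i u_i' = (n-1)S
```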
The distribution of the data vector $\vec x = (\vec x_1', \dots, \vec x_n')'$ is $np$-variate normal with mean vector
$$\vec\mu^* = \begin{bmatrix}\vec\mu \\ \vdots \\ \vec\mu\end{bmatrix} = \vec 1 \otimes \vec\mu$$
and covariance matrix
$$\Sigma^* = \begin{bmatrix}\Sigma & & 0 \\ & \ddots & \\ 0 & & \Sigma\end{bmatrix} = I \otimes \Sigma$$
Thus the joint distribution of $\vec u = H^*\vec x = (H \otimes I)\vec x$ is $np$-variate normal with mean vector
$$(H \otimes I)\vec\mu^* = (H \otimes I)(\vec 1 \otimes \vec\mu) = (H\vec 1) \otimes \vec\mu = \begin{bmatrix}\sqrt n\,\vec\mu \\ \vec 0 \\ \vdots \\ \vec 0\end{bmatrix}$$
(since the first row of $H$ is $\tfrac{1}{\sqrt n}\vec 1'$ and the remaining rows of the orthogonal matrix $H$ are orthogonal to $\vec 1$), and covariance matrix
$$(H \otimes I)\Sigma^*(H \otimes I)' = (H \otimes I)(I \otimes \Sigma)(H' \otimes I) = (HH') \otimes \Sigma = I \otimes \Sigma = \begin{bmatrix}\Sigma & & 0 \\ & \ddots & \\ 0 & & \Sigma\end{bmatrix}$$
Thus $\vec u_1 = \sqrt n\,\bar{\vec x},\ \vec u_2, \dots, \vec u_n$ are independent, with $\vec u_i \sim N_p(\vec 0, \Sigma)$ for $i = 2, \dots, n$. Hence $\bar{\vec x}$ is independent of
$$U = \sum_{i=2}^n \vec u_i\vec u_i' = \sum_{i=1}^n (\vec x_i - \bar{\vec x})(\vec x_i - \bar{\vec x})' = (n-1)S \sim W_p(\Sigma, n-1).$$
Summary: Sampling distribution of MLE's for the multivariate Normal distribution
Let $\vec x_1, \vec x_2, \dots, \vec x_n$ be a sample from $N_p(\vec\mu, \Sigma)$; then
$$\bar{\vec x} \sim N_p\!\left(\vec\mu, \tfrac{1}{n}\Sigma\right)$$
and
$$U = \sum_{i=1}^n (\vec x_i - \bar{\vec x})(\vec x_i - \bar{\vec x})' = (n-1)S = n\hat\Sigma \sim W_p(\Sigma, n-1).$$
Also
$$u_{ii} = (n-1)s_{ii} \sim \sigma_{ii}\,\chi^2_{n-1}.$$
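A final Monte Carlo sketch of the last claim for a single coordinate (the variance value is hypothetical): $(n-1)s_{ii}/\sigma_{ii}$ should behave like a $\chi^2_{n-1}$ variable, with mean $n-1$ and variance $2(n-1)$.

```python
import numpy as np

rng = np.random.default_rng(7)
n, reps = 8, 50000
sigma11 = 2.0                                    # variance of the first coordinate

# Simulate (n-1)*s11 / sigma11 over many samples of size n; it should be
# chi-square with n-1 degrees of freedom: mean n-1, variance 2(n-1).
samples = rng.normal(scale=np.sqrt(sigma11), size=(reps, n))
stat = (n - 1) * samples.var(axis=1, ddof=1) / sigma11

assert abs(stat.mean() - (n - 1)) < 0.1
assert abs(stat.var() - 2 * (n - 1)) < 0.5
```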