point estimation slides
TRANSCRIPT
-
8/3/2019 Point Estimation Slides
1/21
Data Analysis and Statistical Arbitrage
Lecture 2: Point Estimation
Oli Atlason
Outline
- Parametric Statistical Models
- Method of Moments and Maximum Likelihood
- Bias and MSE
- Asymptotic Properties
Parametric Statistical Models
Example: We observe X_1, \ldots, X_n.
X_i: the number of alpha particles emitted by a sample during the i-th time interval of an experiment.
Natural model: X_i \sim Poisson(\lambda), with X_1, \ldots, X_n independent.
Poisson distribution
P(X_i = k) = p^{Poisson}_\lambda(k) = \frac{\lambda^k e^{-\lambda}}{k!}, \quad k \in \mathbb{N}, \ \lambda > 0
The formal model consists, for a given n, of the family
\{ \prod_{i=1}^n p^{Poisson}_\lambda(X_i) \}_{\lambda \in \mathbb{R}_+}
Here the parameter space is \mathbb{R}_+.
Parametric Statistical Models
General definition: A parametric statistical model for observations X = (X_1, \ldots, X_n) is a family \{f_\theta(x)\}_{\theta \in \Theta} of probability distributions. We want to know which f_\theta is responsible for the data.
Point estimator: a function \hat\theta(x) of the observations. This is a guess at what \theta was used to generate the data.
Sampling distribution: \hat\theta(X) takes random values; it has a distribution derived from that of X.
Method of Moments
... is a simple way of finding an estimator.
k-th sample moment:
\hat\mu_k = \frac{1}{n} \sum_{i=1}^n (X_i)^k
k-th population moment:
\mu_k(\theta) = E_\theta[X^k]
Now solve for \theta in the system (\mu_k(\theta) = \hat\mu_k)_{k=1,\ldots,p}.
Usually one needs as many equations (p) as there are parameters.
Method of Moments: Examples
Poisson
For the Poisson, \theta = \lambda. Also
E[X] = \sum_{k=0}^\infty k P(X = k) = \sum_{k=0}^\infty k \frac{\lambda^k e^{-\lambda}}{k!} = \lambda
thus the method of moments estimator is \hat\lambda(X) = \hat\mu_1 = \frac{1}{n} \sum X_i.
Normal
There are two parameters, \theta = (\mu, \sigma^2), so we need 2 equations. We know that E[X] = \mu and Var[X] = \sigma^2. Thus
\mu_1 = \mu, \quad \mu_2 = \sigma^2 + \mu^2
The method of moments estimator is \hat\mu = \hat\mu_1, and
\hat\sigma^2 = \hat\mu_2 - \hat\mu_1^2 = \frac{1}{n} \sum x_i^2 - \bar{x}^2 = \frac{1}{n} \sum (x_i - \bar{x})^2 + \frac{2}{n} \sum x_i \bar{x} - \bar{x}^2 - \bar{x}^2 = \frac{1}{n} \sum (x_i - \bar{x})^2
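The two moment equations for the normal can be checked numerically. A minimal sketch in Python (the helper name `mom_normal` and the true values 5.0 and 2.0 are my choices, not from the slides):

```python
import random

def mom_normal(xs):
    """Method-of-moments estimates for a normal sample: match the first
    two sample moments to mu and sigma^2 + mu^2, then solve."""
    n = len(xs)
    m1 = sum(xs) / n                     # first sample moment
    m2 = sum(x * x for x in xs) / n      # second sample moment
    return m1, m2 - m1 * m1              # (mu_hat, sigma2_hat)

random.seed(0)
sample = [random.gauss(5.0, 2.0) for _ in range(100_000)]
mu_hat, sigma2_hat = mom_normal(sample)
```

With 100,000 draws the estimates land close to the true values \mu = 5 and \sigma^2 = 4.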
Maximum Likelihood
The likelihood is the density of the data viewed as a function of \theta. When the data are i.i.d. (independent and identically distributed) with density f_\theta, then
L(\theta) = f_\theta(X_1, \ldots, X_n) = \prod_{i=1}^n f_\theta(X_i)
The maximum likelihood estimate (MLE), \hat\theta, is the \theta that maximizes the likelihood.
The idea is that \hat\theta is the value of \theta for which the observed sample is most likely.
Finding the MLE:
- Helpful to take logs
- Usually use calculus (L'(\theta) = 0)
- Remember to check values at boundaries and second derivatives
MLE: Example
Poisson
L(\lambda) = \prod_{i=1}^n \frac{\lambda^{X_i} e^{-\lambda}}{X_i!}
Take logs (\mathrm{argmax}_x \log f(x) = \mathrm{argmax}_x f(x)):
l(\lambda) = \log L(\lambda) = \sum_{i=1}^n (X_i \log\lambda - \lambda - \log X_i!)
l'(\lambda) = \frac{1}{\lambda} \sum X_i - n = 0, i.e. the MLE is \hat\lambda = \frac{1}{n} \sum X_i.
In this example, the method of moments and the MLE give the same answer.
Note that we don't really need X = (X_1, \ldots, X_n) here, only the sum. T(X) = \sum X_i is called a sufficient statistic.
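The Poisson MLE depends on the data only through the sum, which is easy to see in code. A sketch (the sampler uses Knuth's multiplication method; the function names and the rate 3.0 are my choices):

```python
import math
import random

def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate by Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def poisson_mle(xs):
    """MLE of the Poisson rate: the sample mean. It uses the data only
    through sum(xs), the sufficient statistic T(X)."""
    return sum(xs) / len(xs)

rng = random.Random(1)
data = [sample_poisson(3.0, rng) for _ in range(50_000)]
lam_hat = poisson_mle(data)
```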
Bias and MSE
\hat\theta(X) is an estimator of \theta. Then
Bias = E_\theta[\hat\theta(X)] - \theta
MSE = E_\theta[(\hat\theta(X) - \theta)^2]
Note:
- \hat\theta(X) is random (a function of the sample)
- Bias and MSE are non-random
- Bias and MSE are functions of \theta
When E_\theta[\hat\theta] = \theta, the estimator is called unbiased.
Bias Example: Sample variance
For an i.i.d. sample X_1, \ldots, X_n from a N(\mu, \sigma^2), the method of moments and maximum likelihood estimators coincide and are
\hat\sigma^2 = \frac{1}{n} \sum_i (X_i - \bar X)^2
This estimator is biased:
E[\hat\sigma^2] = \frac{1}{n} E[\sum_i (X_i^2 - 2X_i\bar X + \bar X^2)] = \frac{1}{n} (\sum_i E[X_i^2] - n E[\bar X^2]) = E[X_1^2] - E[\bar X^2] = (\sigma^2 + \mu^2) - (\frac{\sigma^2}{n} + \mu^2) = \sigma^2 \frac{n-1}{n}
which implies
Bias = E[\hat\sigma^2] - \sigma^2 = -\frac{\sigma^2}{n}
Bias Example: Sample variance
From the last lecture, the sample variance is
S_n^2 = \frac{1}{n-1} \sum_i (X_i - \bar X)^2
By the preceding derivation, S_n^2 is an unbiased estimator of \sigma^2.
Frequently method of moments estimators and MLEs are biased and can be made slightly better by a small change.
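The bias of the divide-by-n estimator, and its removal by the 1/(n-1) correction, show up clearly in a short simulation (n = 5 and \sigma^2 = 4 are my choices):

```python
import random

def var_mle(xs):
    """Divide-by-n variance (MoM/MLE): biased, E = sigma^2 * (n-1)/n."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / n

def var_unbiased(xs):
    """Sample variance S_n^2 with the 1/(n-1) correction: unbiased."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)

rng = random.Random(2)
n, reps = 5, 100_000
mle_avg = unb_avg = 0.0
for _ in range(reps):
    xs = [rng.gauss(0.0, 2.0) for _ in range(n)]   # true sigma^2 = 4
    mle_avg += var_mle(xs)
    unb_avg += var_unbiased(xs)
mle_avg /= reps   # theory: 4 * (5-1)/5 = 3.2
unb_avg /= reps   # theory: 4
```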
Mean Squared Error
The MSE combines the bias and the variance of the estimator.
MSE = E[(\hat\theta - \theta)^2] = E[(\hat\theta - E[\hat\theta] + E[\hat\theta] - \theta)^2]
= E[(\hat\theta - E[\hat\theta])^2] + (E[\hat\theta] - \theta)^2 + 2 (E[\hat\theta] - \theta) E[\hat\theta - E[\hat\theta]]
= Var[\hat\theta] + Bias^2
since the cross term vanishes: E[\hat\theta - E[\hat\theta]] = 0.
Bias and variance are sometimes described in terms of accuracy and precision, respectively.
MSE Example: Sample Variance
To compute the MSE of S_n^2, recall that under independence and normality
\frac{\sum_i (X_i - \bar X)^2}{\sigma^2} \sim \chi^2_{n-1}
which has mean n-1 and variance 2(n-1). Thus
MSE[S_n^2] = Bias^2 + Var[S_n^2] = Var[\frac{1}{n-1} \sum_i (X_i - \bar X)^2] = (\frac{\sigma^2}{n-1})^2 \cdot 2(n-1) = \frac{2\sigma^4}{n-1}
For the MLE, however,
MSE[\hat\sigma^2] = Bias^2 + Var[\hat\sigma^2] = (\frac{\sigma^2}{n})^2 + Var[\frac{1}{n} \sum_i (X_i - \bar X)^2] = \frac{\sigma^4}{n^2} + \sigma^4 \frac{2(n-1)}{n^2} = \sigma^4 \frac{2n-1}{n^2} < MSE[S_n^2]
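The inequality MSE[\hat\sigma^2] < MSE[S_n^2] can be verified by Monte Carlo. A sketch (n = 5, \sigma^2 = 1, and the function name are my choices):

```python
import random

def mc_mse_variance(n, reps, seed):
    """Monte Carlo MSE of the 1/(n-1) and 1/n variance estimators
    for N(0, 1) samples, where the true variance is 1."""
    rng = random.Random(seed)
    se_unb = se_mle = 0.0
    for _ in range(reps):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        xbar = sum(xs) / n
        ss = sum((x - xbar) ** 2 for x in xs)
        se_unb += (ss / (n - 1) - 1.0) ** 2   # squared error of S_n^2
        se_mle += (ss / n - 1.0) ** 2         # squared error of the MLE
    return se_unb / reps, se_mle / reps

mse_unb, mse_mle = mc_mse_variance(n=5, reps=100_000, seed=3)
# theory: MSE[S_n^2] = 2/(n-1) = 0.5 ; MSE[MLE] = (2n-1)/n^2 = 0.36
```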
MSE Example: Sample Variance
The MSE is perhaps not natural for scale parameters. In fact, the minimum-MSE estimator of this form is
\frac{1}{n+1} \sum_i (X_i - \bar X)^2
All three estimators are asymptotically identical (\lim \frac{n+1}{n-1} = 1).
Method of moments and MLE estimators are rarely unbiased.
However, the MLE has nice asymptotic properties.
Example: Gamma distribution, one observation
Recall that \Gamma(\alpha) = \int_0^\infty t^{\alpha-1} e^{-t} dt.
Important property: \Gamma(\alpha) = (\alpha-1)\Gamma(\alpha-1) for \alpha > 1.
Gamma distribution with parameters \alpha, \beta > 0:
f_{\alpha,\beta}(x) = \frac{\beta^\alpha}{\Gamma(\alpha)} x^{\alpha-1} e^{-\beta x}, \quad x > 0
We have one observation, X \in \mathbb{R}_+, and we know \alpha. Find the MLE for \beta; compute its bias and MSE.
Solution:
l(\beta) = \alpha \log\beta - \log\Gamma(\alpha) + (\alpha - 1)\log x - \beta x
l'(\beta) = \frac{\alpha}{\beta} - x = 0
Our candidate is \hat\beta = \frac{\alpha}{x}.
Example: Gamma distribution, one observation (2)
Check: l''(\beta) = -\frac{\alpha}{\beta^2} < 0
and \hat\beta = \alpha/x > 0, i.e. \hat\beta is in the parameter space. Moments:
E[\hat\beta] = \int_0^\infty \frac{\alpha}{x} \frac{\beta^\alpha}{\Gamma(\alpha)} x^{\alpha-1} e^{-\beta x} dx = \frac{\alpha\beta\,\Gamma(\alpha-1)}{\Gamma(\alpha)} \int_0^\infty \frac{\beta^{\alpha-1}}{\Gamma(\alpha-1)} x^{(\alpha-1)-1} e^{-\beta x} dx = \frac{\alpha\beta}{\alpha-1}
and in the same way
E[\hat\beta^2] = \frac{\alpha^2 \beta^2 \Gamma(\alpha-2)}{\Gamma(\alpha)} = \frac{\alpha^2 \beta^2}{(\alpha-1)(\alpha-2)}
Example: Gamma distribution, one observation (3)
E[\hat\beta] = \frac{\alpha\beta}{\alpha-1} and E[\hat\beta^2] = \frac{\alpha^2\beta^2}{(\alpha-1)(\alpha-2)}, so
Bias = E[\hat\beta] - \beta = \frac{\alpha\beta}{\alpha-1} - \beta = \frac{\beta}{\alpha-1}
MSE = Bias^2 + Var[\hat\beta]
= \frac{\beta^2}{(\alpha-1)^2} + E[\hat\beta^2] - E[\hat\beta]^2
= \frac{\beta^2}{(\alpha-1)^2} + \frac{\alpha^2\beta^2}{(\alpha-1)(\alpha-2)} - \frac{\alpha^2\beta^2}{(\alpha-1)^2}
= \frac{\beta^2(\alpha+2)}{(\alpha-1)(\alpha-2)}
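These bias and MSE formulas can be checked by simulation. A sketch with \alpha = 5 and \beta = 2 (my choices; note that Python's `gammavariate` takes a *scale*, while the slides' \beta is a *rate*):

```python
import random

alpha, beta = 5.0, 2.0            # shape (known) and rate (to estimate)
rng = random.Random(4)
reps = 200_000
est_sum = sq_err_sum = 0.0
for _ in range(reps):
    x = rng.gammavariate(alpha, 1.0 / beta)   # scale = 1/rate
    b_hat = alpha / x                         # one-observation MLE
    est_sum += b_hat
    sq_err_sum += (b_hat - beta) ** 2
mean_est = est_sum / reps     # theory: alpha*beta/(alpha-1) = 2.5
mc_mse = sq_err_sum / reps    # theory: beta^2*(alpha+2)/((alpha-1)*(alpha-2)) = 7/3
```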
Properties of MLE
Invariance: MLEs are invariant under transformations: if \hat\theta is the MLE of \theta, then g(\hat\theta) is the MLE of g(\theta).
Ex: In the Poisson model, \mu = 1/\lambda measures the waiting time between observations. By invariance, \hat\mu = 1/\hat\lambda = \frac{n}{\sum X_i}.
Consistency: \hat\theta_n is the MLE obtained from X_1, \ldots, X_n. Then, under minimal technical conditions,
\hat\theta_n \to \theta in probability.
Compare with the statements of:
- unbiasedness (E[\hat\theta] = \theta),
- strong consistency (\hat\theta_n \to \theta a.s.).
Method of moments estimators are often also consistent.
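Both properties are easy to watch numerically: the running Poisson MLE settles on the true rate, and by invariance its reciprocal estimates the waiting time. A sketch (the rate 2.0 and all names are my choices; the sampler is Knuth's multiplication method):

```python
import math
import random

def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate by Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def running_means(xs):
    """The Poisson MLE after 1, 2, ..., len(xs) observations."""
    out, s = [], 0.0
    for i, x in enumerate(xs, 1):
        s += x
        out.append(s / i)
    return out

rng = random.Random(5)
lam = 2.0
path = running_means([sample_poisson(lam, rng) for _ in range(20_000)])
err_final = abs(path[-1] - lam)   # consistency: small for large n
mu_hat = 1.0 / path[-1]           # invariance: MLE of the waiting time 1/lam
```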
Properties of MLE (2)
Theorem (Cramér-Rao). Under regularity conditions, notably \frac{\partial}{\partial\theta} E_\theta[\hat\theta(X)] = \int \hat\theta(x) \frac{\partial}{\partial\theta} f(x|\theta)\, dx,
Var[\hat\theta(X)] \ge \frac{(1 + Bias'(\theta))^2}{n I(\theta)}
where
I(\theta) = E[(\frac{\partial}{\partial\theta} \log f(X|\theta))^2] = -E[\frac{\partial^2}{\partial\theta^2} \log f(X|\theta)]
is the Fisher information of f.
Theorem (Fisher). The MLE achieves the bound asymptotically:
\sqrt{n}(\hat\theta_n - \theta) \to_D N(0, \frac{1}{I(\theta)})
The MLE is asymptotically efficient, i.e. it attains the lowest possible asymptotic variance.
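Fisher's theorem can be illustrated for the Poisson, where I(\lambda) = 1/\lambda, so \sqrt{n}(\hat\lambda - \lambda) should be close to N(0, \lambda). A sketch (\lambda = 4 and n = 100 are my choices):

```python
import math
import random

def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate by Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

lam, n, reps = 4.0, 100, 10_000
rng = random.Random(6)
zs = []
for _ in range(reps):
    mle = sum(sample_poisson(lam, rng) for _ in range(n)) / n
    zs.append(math.sqrt(n) * (mle - lam))
mean_z = sum(zs) / reps
var_z = sum(z * z for z in zs) / reps - mean_z ** 2
# asymptotically: mean_z near 0, var_z near 1/I(lambda) = lambda = 4
```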
Example: Gamma, n observations
Let X_1, \ldots, X_n be i.i.d. Gamma(\alpha, \beta), with \alpha and \beta unknown.
L(\alpha, \beta) = \prod_{i=1}^n f_{\alpha,\beta}(x_i) = \prod_{i=1}^n \frac{\beta^\alpha}{\Gamma(\alpha)} x_i^{\alpha-1} e^{-\beta x_i}
l(\alpha, \beta) = \sum_i (\alpha \log\beta - \log\Gamma(\alpha) + (\alpha-1)\log x_i - \beta x_i)
l_\alpha(\alpha, \beta) = n \log\beta + \sum \log x_i - n \frac{\Gamma'(\alpha)}{\Gamma(\alpha)}
l_\beta(\alpha, \beta) = \frac{n\alpha}{\beta} - \sum x_i
The last equation gives \hat\beta = \frac{n\alpha}{\sum x_i}. One must then solve l_\alpha(\alpha, \hat\beta(\alpha)) = 0 numerically.
It is hard to compute the small-sample bias and MSE, so we use asymptotic methods.
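A sketch of the numerical solution: profile out \beta = n\alpha/\sum x_i, then bisect the remaining score equation in \alpha. The crude digamma approximation and all names are mine; a library routine such as `scipy.special.digamma` would also do:

```python
import math
import random

def digamma(a, h=1e-6):
    """Crude digamma: central difference of math.lgamma (adequate here)."""
    return (math.lgamma(a + h) - math.lgamma(a - h)) / (2.0 * h)

def gamma_mle(xs, lo=1e-3, hi=100.0, iters=100):
    """MLE for Gamma(alpha, beta) with beta a rate. Substituting
    beta = alpha/xbar into l_alpha = 0 leaves a decreasing function
    g(alpha) = log(alpha) - log(xbar) + mean(log x) - digamma(alpha),
    whose root we find by bisection."""
    n = len(xs)
    xbar = sum(xs) / n
    mean_log = sum(math.log(x) for x in xs) / n
    def g(a):
        return math.log(a) - math.log(xbar) + mean_log - digamma(a)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:     # g is decreasing, so the root lies above mid
            lo = mid
        else:
            hi = mid
    alpha_hat = 0.5 * (lo + hi)
    return alpha_hat, alpha_hat / xbar     # (alpha_hat, beta_hat)

rng = random.Random(7)
# gammavariate takes a scale; the slides' beta is a rate, so scale = 1/beta
xs = [rng.gammavariate(3.0, 0.5) for _ in range(50_000)]
alpha_hat, beta_hat = gamma_mle(xs)
```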
Example: Gamma, n observations (2)
Fisher information: nI(\theta) = -E[l''(\theta)].
l_{\alpha\alpha}(\alpha, \beta) = -n \frac{d}{d\alpha} \frac{\Gamma'(\alpha)}{\Gamma(\alpha)} = -n \frac{\Gamma''(\alpha)\Gamma(\alpha) - \Gamma'(\alpha)^2}{\Gamma(\alpha)^2}
l_{\alpha\beta}(\alpha, \beta) = \frac{n}{\beta}
l_{\beta\beta}(\alpha, \beta) = -\frac{n\alpha}{\beta^2}
For \alpha known, we can use e.g. the approximation
\sqrt{n}(\hat\beta - \beta) \approx N(0, \frac{\beta^2}{\alpha})
i.e. \hat\beta \approx N(\beta, \frac{\beta^2}{\alpha n})
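This approximation is straightforward to check by simulation when \alpha is known: the MLE from n observations is \hat\beta = \alpha/\bar x, and \sqrt{n}(\hat\beta - \beta) should have variance near \beta^2/\alpha. A sketch (\alpha = 4, \beta = 2 are my choices):

```python
import math
import random

alpha, beta, n, reps = 4.0, 2.0, 200, 5_000
rng = random.Random(8)
zs = []
for _ in range(reps):
    # gammavariate takes a scale; the slides' beta is a rate, so scale = 1/beta
    xbar = sum(rng.gammavariate(alpha, 1.0 / beta) for _ in range(n)) / n
    zs.append(math.sqrt(n) * (alpha / xbar - beta))   # alpha/xbar is the MLE
mean_z = sum(zs) / reps
var_z = sum(z * z for z in zs) / reps - mean_z ** 2
# theory: var_z should be near beta^2 / alpha = 1.0
```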