point estimation slides
TRANSCRIPT
-
8/3/2019 Point Estimation Slides
1/21
Data Analysis and Statistical Arbitrage
Lecture 2: Point Estimation
Oli Atlason
Outline
- Parametric Statistical Models
- Method of Moments and Maximum Likelihood
- Bias and MSE
- Asymptotic Properties
Parametric Statistical Models
Example: We observe X_1, \ldots, X_n.
X_i: the number of alpha particles emitted by a sample during the i-th time interval of an experiment.
Natural model: X_i \sim Poisson(\lambda), with X_1, \ldots, X_n independent.
Poisson distribution
P(X_i = k) = p^{Poisson}_\lambda(k) = \frac{\lambda^k e^{-\lambda}}{k!}, \quad k \in \mathbb{N}, \ \lambda > 0
The formal model consists, for a given n, of the family
\{ \prod_{i=1}^n p^{Poisson}_\lambda(X_i) \}_{\lambda \in \mathbb{R}_+}
Here the parameter space is \mathbb{R}_+.
Parametric Statistical Models
General definition: A parametric statistical model for observations X = (X_1, \ldots, X_n) is a family \{f_\theta(x)\}_{\theta \in \Theta} of probability distributions. We want to know which f_\theta is responsible for the data.
Point estimator: a function \hat\theta(x) of the observations. This is a guess at what \theta was used to generate the data.
Sampling distribution: \hat\theta(X) takes random values; it has a distribution derived from that of X.
Method of Moments
... is a simple way of finding an estimator.
k-th sample moment:
\hat\mu_k = \frac{1}{n} \sum_{i=1}^n (X_i)^k
k-th population moment:
\mu_k(\theta) = E_\theta[X^k]
Now solve for \theta in the system (\mu_k(\theta) = \hat\mu_k)_{k=1,\ldots,p}.
Usually one needs as many equations (p) as there are parameters.
Method of Moments: Examples
Poisson
For the Poisson, \theta = \lambda. Also
E[X] = \sum_{k=0}^\infty k P(X = k) = \sum_{k=0}^\infty k \frac{\lambda^k e^{-\lambda}}{k!} = \lambda
thus the method of moments estimator is \hat\lambda(X) = \hat\mu_1 = \frac{1}{n} \sum X_i.
Normal
There are two parameters, \theta = (\mu, \sigma^2), so we need 2 equations. We know that E[X] = \mu and Var[X] = \sigma^2. Thus
\mu_1 = \mu, \quad \mu_2 = \sigma^2 + \mu^2
The method of moments estimator is \hat\mu = \hat\mu_1, and
\hat\sigma^2 = \hat\mu_2 - \hat\mu_1^2 = \frac{1}{n} \sum x_i^2 - \bar{x}^2 = \frac{1}{n} \sum (x_i - \bar{x})^2 + \frac{2}{n} \sum x_i \bar{x} - \bar{x}^2 - \bar{x}^2 = \frac{1}{n} \sum (x_i - \bar{x})^2
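The two moment equations for the normal can be checked numerically. A minimal sketch in Python (the helper name `mom_normal` and the true values 5.0 and 2.0 are my choices, not from the slides):

```python
import random

def mom_normal(xs):
    """Method-of-moments estimates for a normal sample: match the first
    two sample moments to mu and sigma^2 + mu^2, then solve."""
    n = len(xs)
    m1 = sum(xs) / n                     # first sample moment
    m2 = sum(x * x for x in xs) / n      # second sample moment
    return m1, m2 - m1 * m1              # (mu_hat, sigma2_hat)

random.seed(0)
sample = [random.gauss(5.0, 2.0) for _ in range(100_000)]
mu_hat, sigma2_hat = mom_normal(sample)
```

With 100,000 draws the estimates land close to the true values \mu = 5 and \sigma^2 = 4.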
Maximum Likelihood
The likelihood is the density of the data viewed as a function of \theta. When the data are i.i.d. (independent and identically distributed) with density f_\theta, then
L(\theta) = f_\theta(X_1, \ldots, X_n) = \prod_{i=1}^n f_\theta(X_i)
The maximum likelihood estimate (MLE), \hat\theta, is the \theta that maximizes the likelihood.
The idea is that \hat\theta is the value of \theta for which the observed sample is most likely.
Finding the MLE:
- Helpful to take logs
- Usually use calculus (L'(\theta) = 0)
- Remember to check values at boundaries and second derivatives
MLE: Example
Poisson
L(\lambda) = \prod_{i=1}^n \frac{\lambda^{X_i} e^{-\lambda}}{X_i!}
Take logs (\mathrm{argmax}_x \log f(x) = \mathrm{argmax}_x f(x)):
l(\lambda) = \log L(\lambda) = \sum_{i=1}^n (X_i \log\lambda - \lambda - \log X_i!)
l'(\lambda) = \frac{1}{\lambda} \sum X_i - n = 0, i.e. the MLE is \hat\lambda = \frac{1}{n} \sum X_i.
In this example, the method of moments and the MLE give the same answer.
Note that we don't really need X = (X_1, \ldots, X_n) here, only the sum. T(X) = \sum X_i is called a sufficient statistic.
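The Poisson MLE depends on the data only through the sum, which is easy to see in code. A sketch (the sampler uses Knuth's multiplication method; the function names and the rate 3.0 are my choices):

```python
import math
import random

def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate by Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def poisson_mle(xs):
    """MLE of the Poisson rate: the sample mean. It uses the data only
    through sum(xs), the sufficient statistic T(X)."""
    return sum(xs) / len(xs)

rng = random.Random(1)
data = [sample_poisson(3.0, rng) for _ in range(50_000)]
lam_hat = poisson_mle(data)
```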
Bias and MSE
\hat\theta(X) is an estimator of \theta. Then
Bias = E_\theta[\hat\theta(X)] - \theta
MSE = E_\theta[(\hat\theta(X) - \theta)^2]
Note:
- \hat\theta(X) is random (a function of the sample)
- Bias and MSE are non-random
- Bias and MSE are functions of \theta
When E_\theta[\hat\theta] = \theta, the estimator is called unbiased.
Bias Example: Sample variance
For an i.i.d. sample X_1, \ldots, X_n from a N(\mu, \sigma^2), the method of moments and maximum likelihood estimators coincide and are
\hat\sigma^2 = \frac{1}{n} \sum_i (X_i - \bar X)^2
This estimator is biased:
E[\hat\sigma^2] = \frac{1}{n} E[\sum_i (X_i^2 - 2X_i\bar X + \bar X^2)] = \frac{1}{n} (\sum_i E[X_i^2] - n E[\bar X^2]) = E[X_1^2] - E[\bar X^2] = (\sigma^2 + \mu^2) - (\frac{\sigma^2}{n} + \mu^2) = \sigma^2 \frac{n-1}{n}
which implies
Bias = E[\hat\sigma^2] - \sigma^2 = -\frac{\sigma^2}{n}
Bias Example: Sample variance
From the last lecture, the sample variance is
S_n^2 = \frac{1}{n-1} \sum_i (X_i - \bar X)^2
By the preceding derivation, S_n^2 is an unbiased estimator of \sigma^2.
Frequently method of moments estimators and MLEs are biased and can be made slightly better by a small change.
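The bias of the divide-by-n estimator, and its removal by the 1/(n-1) correction, show up clearly in a short simulation (n = 5 and \sigma^2 = 4 are my choices):

```python
import random

def var_mle(xs):
    """Divide-by-n variance (MoM/MLE): biased, E = sigma^2 * (n-1)/n."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / n

def var_unbiased(xs):
    """Sample variance S_n^2 with the 1/(n-1) correction: unbiased."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)

rng = random.Random(2)
n, reps = 5, 100_000
mle_avg = unb_avg = 0.0
for _ in range(reps):
    xs = [rng.gauss(0.0, 2.0) for _ in range(n)]   # true sigma^2 = 4
    mle_avg += var_mle(xs)
    unb_avg += var_unbiased(xs)
mle_avg /= reps   # theory: 4 * (5-1)/5 = 3.2
unb_avg /= reps   # theory: 4
```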
Mean Squared Error
The MSE combines the bias and the variance of the estimator.
MSE = E[(\hat\theta - \theta)^2] = E[(\hat\theta - E[\hat\theta] + E[\hat\theta] - \theta)^2]
= E[(\hat\theta - E[\hat\theta])^2] + (E[\hat\theta] - \theta)^2 + 2 (E[\hat\theta] - \theta) E[\hat\theta - E[\hat\theta]]
= Var[\hat\theta] + Bias^2
since the cross term vanishes: E[\hat\theta - E[\hat\theta]] = 0.
Bias and variance are sometimes described in terms of accuracy and precision, respectively.
MSE Example: Sample Variance
To compute the MSE of S_n^2, recall that under independence and normality
\frac{\sum_i (X_i - \bar X)^2}{\sigma^2} \sim \chi^2_{n-1}
which has mean n-1 and variance 2(n-1). Thus
MSE[S_n^2] = Bias^2 + Var[S_n^2] = Var[\frac{1}{n-1} \sum_i (X_i - \bar X)^2] = (\frac{\sigma^2}{n-1})^2 \cdot 2(n-1) = \frac{2\sigma^4}{n-1}
For the MLE, however,
MSE[\hat\sigma^2] = Bias^2 + Var[\hat\sigma^2] = (\frac{\sigma^2}{n})^2 + Var[\frac{1}{n} \sum_i (X_i - \bar X)^2] = \frac{\sigma^4}{n^2} + \sigma^4 \frac{2(n-1)}{n^2} = \sigma^4 \frac{2n-1}{n^2} < MSE[S_n^2]
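The inequality MSE[\hat\sigma^2] < MSE[S_n^2] can be verified by Monte Carlo. A sketch (n = 5, \sigma^2 = 1, and the function name are my choices):

```python
import random

def mc_mse_variance(n, reps, seed):
    """Monte Carlo MSE of the 1/(n-1) and 1/n variance estimators
    for N(0, 1) samples, where the true variance is 1."""
    rng = random.Random(seed)
    se_unb = se_mle = 0.0
    for _ in range(reps):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        xbar = sum(xs) / n
        ss = sum((x - xbar) ** 2 for x in xs)
        se_unb += (ss / (n - 1) - 1.0) ** 2   # squared error of S_n^2
        se_mle += (ss / n - 1.0) ** 2         # squared error of the MLE
    return se_unb / reps, se_mle / reps

mse_unb, mse_mle = mc_mse_variance(n=5, reps=100_000, seed=3)
# theory: MSE[S_n^2] = 2/(n-1) = 0.5 ; MSE[MLE] = (2n-1)/n^2 = 0.36
```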
MSE Example: Sample Variance
The MSE is perhaps not natural for scale parameters. In fact, the minimum-MSE estimator of this form is
\frac{1}{n+1} \sum_i (X_i - \bar X)^2
All three estimators are asymptotically identical (\lim \frac{n+1}{n-1} = 1).
Method of moments and MLE estimators are rarely unbiased.
However, the MLE has nice asymptotic properties.
Example: Gamma distribution, one observation
Recall that \Gamma(\alpha) = \int_0^\infty t^{\alpha-1} e^{-t} dt.
Important property: \Gamma(\alpha) = (\alpha-1)\Gamma(\alpha-1) for \alpha > 1.
Gamma distribution with parameters \alpha, \beta > 0:
f_{\alpha,\beta}(x) = \frac{\beta^\alpha}{\Gamma(\alpha)} x^{\alpha-1} e^{-\beta x}, \quad x > 0
We have one observation, X \in \mathbb{R}_+, and we know \alpha. Find the MLE for \beta; compute its bias and MSE.
Solution:
l(\beta) = \alpha \log\beta - \log\Gamma(\alpha) + (\alpha - 1)\log x - \beta x
l'(\beta) = \frac{\alpha}{\beta} - x = 0
Our candidate is \hat\beta = \frac{\alpha}{x}.
Example: Gamma distribution, one observation (2)
Check: l''(\beta) = -\frac{\alpha}{\beta^2} < 0
and \hat\beta = \alpha/x > 0, i.e. \hat\beta is in the parameter space. Moments:
E[\hat\beta] = \int_0^\infty \frac{\alpha}{x} \frac{\beta^\alpha}{\Gamma(\alpha)} x^{\alpha-1} e^{-\beta x} dx = \frac{\alpha\beta\,\Gamma(\alpha-1)}{\Gamma(\alpha)} \int_0^\infty \frac{\beta^{\alpha-1}}{\Gamma(\alpha-1)} x^{(\alpha-1)-1} e^{-\beta x} dx = \frac{\alpha\beta}{\alpha-1}
and in the same way
E[\hat\beta^2] = \frac{\alpha^2 \beta^2 \Gamma(\alpha-2)}{\Gamma(\alpha)} = \frac{\alpha^2 \beta^2}{(\alpha-1)(\alpha-2)}
Example: Gamma distribution, one observation (3)
E[\hat\beta] = \frac{\alpha\beta}{\alpha-1} and E[\hat\beta^2] = \frac{\alpha^2\beta^2}{(\alpha-1)(\alpha-2)}, so
Bias = E[\hat\beta] - \beta = \frac{\alpha\beta}{\alpha-1} - \beta = \frac{\beta}{\alpha-1}
MSE = Bias^2 + Var[\hat\beta]
= \frac{\beta^2}{(\alpha-1)^2} + E[\hat\beta^2] - E[\hat\beta]^2
= \frac{\beta^2}{(\alpha-1)^2} + \frac{\alpha^2\beta^2}{(\alpha-1)(\alpha-2)} - \frac{\alpha^2\beta^2}{(\alpha-1)^2}
= \frac{\beta^2(\alpha+2)}{(\alpha-1)(\alpha-2)}
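These bias and MSE formulas can be checked by simulation. A sketch with \alpha = 5 and \beta = 2 (my choices; note that Python's `gammavariate` takes a *scale*, while the slides' \beta is a *rate*):

```python
import random

alpha, beta = 5.0, 2.0            # shape (known) and rate (to estimate)
rng = random.Random(4)
reps = 200_000
est_sum = sq_err_sum = 0.0
for _ in range(reps):
    x = rng.gammavariate(alpha, 1.0 / beta)   # scale = 1/rate
    b_hat = alpha / x                         # one-observation MLE
    est_sum += b_hat
    sq_err_sum += (b_hat - beta) ** 2
mean_est = est_sum / reps     # theory: alpha*beta/(alpha-1) = 2.5
mc_mse = sq_err_sum / reps    # theory: beta^2*(alpha+2)/((alpha-1)*(alpha-2)) = 7/3
```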
Properties of MLE
Invariance: MLEs are invariant under transformations: if \hat\theta is the MLE of \theta, then g(\hat\theta) is the MLE of g(\theta).
Ex: In the Poisson model, \mu = 1/\lambda measures the waiting time between observations. By invariance, \hat\mu = 1/\hat\lambda = \frac{n}{\sum X_i}.
Consistency: \hat\theta_n is the MLE obtained from X_1, \ldots, X_n. Then, under minimal technical conditions,
\hat\theta_n \to \theta in probability.
Compare with the statements of:
- unbiasedness (E[\hat\theta] = \theta),
- strong consistency (\hat\theta_n \to \theta a.s.).
Method of moments estimators are often also consistent.
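Both properties are easy to watch numerically: the running Poisson MLE settles on the true rate, and by invariance its reciprocal estimates the waiting time. A sketch (the rate 2.0 and all names are my choices; the sampler is Knuth's multiplication method):

```python
import math
import random

def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate by Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def running_means(xs):
    """The Poisson MLE after 1, 2, ..., len(xs) observations."""
    out, s = [], 0.0
    for i, x in enumerate(xs, 1):
        s += x
        out.append(s / i)
    return out

rng = random.Random(5)
lam = 2.0
path = running_means([sample_poisson(lam, rng) for _ in range(20_000)])
err_final = abs(path[-1] - lam)   # consistency: small for large n
mu_hat = 1.0 / path[-1]           # invariance: MLE of the waiting time 1/lam
```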
Properties of MLE (2)
Theorem (Cramér-Rao). Under regularity conditions, notably \frac{\partial}{\partial\theta} E_\theta[\hat\theta(X)] = \int \hat\theta(x) \frac{\partial}{\partial\theta} f(x|\theta)\, dx,
Var[\hat\theta(X)] \ge \frac{(1 + Bias'(\theta))^2}{n I(\theta)}
where
I(\theta) = E[(\frac{\partial}{\partial\theta} \log f(X|\theta))^2] = -E[\frac{\partial^2}{\partial\theta^2} \log f(X|\theta)]
is the Fisher information of f.
Theorem (Fisher). The MLE achieves the bound asymptotically:
\sqrt{n}(\hat\theta_n - \theta) \to_D N(0, \frac{1}{I(\theta)})
The MLE is asymptotically efficient, i.e. it attains the lowest possible asymptotic variance.
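Fisher's theorem can be illustrated for the Poisson, where I(\lambda) = 1/\lambda, so \sqrt{n}(\hat\lambda - \lambda) should be close to N(0, \lambda). A sketch (\lambda = 4 and n = 100 are my choices):

```python
import math
import random

def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate by Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

lam, n, reps = 4.0, 100, 10_000
rng = random.Random(6)
zs = []
for _ in range(reps):
    mle = sum(sample_poisson(lam, rng) for _ in range(n)) / n
    zs.append(math.sqrt(n) * (mle - lam))
mean_z = sum(zs) / reps
var_z = sum(z * z for z in zs) / reps - mean_z ** 2
# asymptotically: mean_z near 0, var_z near 1/I(lambda) = lambda = 4
```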
Example: Gamma, n observations
Let X_1, \ldots, X_n be i.i.d. Gamma(\alpha, \beta), with \alpha and \beta unknown.
L(\alpha, \beta) = \prod_{i=1}^n f_{\alpha,\beta}(x_i) = \prod_{i=1}^n \frac{\beta^\alpha}{\Gamma(\alpha)} x_i^{\alpha-1} e^{-\beta x_i}
l(\alpha, \beta) = \sum_i (\alpha \log\beta - \log\Gamma(\alpha) + (\alpha-1)\log x_i - \beta x_i)
l_\alpha(\alpha, \beta) = n \log\beta + \sum \log x_i - n \frac{\Gamma'(\alpha)}{\Gamma(\alpha)}
l_\beta(\alpha, \beta) = \frac{n\alpha}{\beta} - \sum x_i
The last equation gives \hat\beta = \frac{n\alpha}{\sum x_i}. One must then solve l_\alpha(\alpha, \hat\beta(\alpha)) = 0 numerically.
It is hard to compute the small-sample bias and MSE, so we use asymptotic methods.
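A sketch of the numerical solution: profile out \beta = n\alpha/\sum x_i, then bisect the remaining score equation in \alpha. The crude digamma approximation and all names are mine; a library routine such as `scipy.special.digamma` would also do:

```python
import math
import random

def digamma(a, h=1e-6):
    """Crude digamma: central difference of math.lgamma (adequate here)."""
    return (math.lgamma(a + h) - math.lgamma(a - h)) / (2.0 * h)

def gamma_mle(xs, lo=1e-3, hi=100.0, iters=100):
    """MLE for Gamma(alpha, beta) with beta a rate. Substituting
    beta = alpha/xbar into l_alpha = 0 leaves a decreasing function
    g(alpha) = log(alpha) - log(xbar) + mean(log x) - digamma(alpha),
    whose root we find by bisection."""
    n = len(xs)
    xbar = sum(xs) / n
    mean_log = sum(math.log(x) for x in xs) / n
    def g(a):
        return math.log(a) - math.log(xbar) + mean_log - digamma(a)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:     # g is decreasing, so the root lies above mid
            lo = mid
        else:
            hi = mid
    alpha_hat = 0.5 * (lo + hi)
    return alpha_hat, alpha_hat / xbar     # (alpha_hat, beta_hat)

rng = random.Random(7)
# gammavariate takes a scale; the slides' beta is a rate, so scale = 1/beta
xs = [rng.gammavariate(3.0, 0.5) for _ in range(50_000)]
alpha_hat, beta_hat = gamma_mle(xs)
```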
Example: Gamma, n observations (2)
Fisher information: nI(\theta) = -E[l''(\theta)].
l_{\alpha\alpha}(\alpha, \beta) = -n \frac{d}{d\alpha} \frac{\Gamma'(\alpha)}{\Gamma(\alpha)} = -n \frac{\Gamma''(\alpha)\Gamma(\alpha) - \Gamma'(\alpha)^2}{\Gamma(\alpha)^2}
l_{\alpha\beta}(\alpha, \beta) = \frac{n}{\beta}
l_{\beta\beta}(\alpha, \beta) = -\frac{n\alpha}{\beta^2}
For \alpha known, we can use e.g. the approximation
\sqrt{n}(\hat\beta - \beta) \approx N(0, \frac{\beta^2}{\alpha})
i.e. \hat\beta \approx N(\beta, \frac{\beta^2}{\alpha n})
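This approximation is straightforward to check by simulation when \alpha is known: the MLE from n observations is \hat\beta = \alpha/\bar x, and \sqrt{n}(\hat\beta - \beta) should have variance near \beta^2/\alpha. A sketch (\alpha = 4, \beta = 2 are my choices):

```python
import math
import random

alpha, beta, n, reps = 4.0, 2.0, 200, 5_000
rng = random.Random(8)
zs = []
for _ in range(reps):
    # gammavariate takes a scale; the slides' beta is a rate, so scale = 1/beta
    xbar = sum(rng.gammavariate(alpha, 1.0 / beta) for _ in range(n)) / n
    zs.append(math.sqrt(n) * (alpha / xbar - beta))   # alpha/xbar is the MLE
mean_z = sum(zs) / reps
var_z = sum(z * z for z in zs) / reps - mean_z ** 2
# theory: var_z should be near beta^2 / alpha = 1.0
```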