estimation and testing of population parameters

8/11/2019 Estimation and Testing of Population Parameters


EC114 Introduction to Quantitative Economics
6. Estimation and Testing of Population Parameters

Department of Economics
University of Essex

15/17 November 2011


Outline

1 Motivation

2 Point Estimates

3 Confidence Intervals

4 Testing Hypotheses About a Population Mean

Reference: R. L. Thomas, Using Statistics in Economics, McGraw-Hill, 2005, sections 3.3 and 4.1.



Motivation

We have used the Central Limit Theorem (CLT) to calculate probabilities involving the sample mean, $\bar{X}$, assuming that the population mean and variance, $\mu$ and $\sigma^2$, are known.

However, usually the values of the population parameters are unknown.

This is a common situation in statistics (and econometrics), and so we try to estimate the unknown parameters from the sample data.

We shall consider point estimates (a single number) as well as confidence intervals (a range of values) for the population mean, $\mu$.



Point Estimates

The obvious estimator of $\mu$ is the sample mean, $\bar{X}$.

It is actually a good estimator: we know that $E(\bar{X}) = \mu$. This means that, in repeated samples, $\bar{X}$ will on average be equal to $\mu$.

Important: we are not saying that $\bar{X}$ is equal to $\mu$, and the statement that $\bar{X} = \mu$ is INCORRECT. It is $E(\bar{X})$ that equals $\mu$, i.e. the mean of all $\bar{X}$ values obtained from many samples.

There is no systematic tendency for there to be an error in estimating $\mu$ by $\bar{X}$, i.e. no systematic tendency to overestimate or underestimate. When $E(\bar{X}) = \mu$ we say that there is no bias in our estimator and $\bar{X}$ is an unbiased estimator of $\mu$.
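The distinction between $\bar{X} = \mu$ (false in general) and $E(\bar{X}) = \mu$ (true) can be checked exactly on a toy example. The sketch below assumes a hypothetical population of four equally likely values, which is not taken from the lecture, and enumerates every possible sample of size 2 drawn with replacement. The names `population` and `mean` are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy population (an assumption for illustration only);
# each value is equally likely on every draw.
population = [2, 4, 6, 12]
n = 2  # sample size


def mean(values):
    """Exact arithmetic mean, returned as a Fraction."""
    values = list(values)
    return Fraction(sum(values), len(values))


mu = mean(population)

# Every ordered sample of size n drawn with replacement is equally likely,
# so E(X-bar) is the plain average of the sample means over all samples.
sample_means = [mean(sample) for sample in product(population, repeat=n)]
expected_sample_mean = mean(sample_means)

print("mu       =", mu)
print("E(X-bar) =", expected_sample_mean)
print("sample means range from", min(sample_means), "to", max(sample_means))
```

Individual sample means differ from $\mu$, but their average over all possible samples does not. Exact fractions are used so that the equality is not blurred by rounding.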





We shall also need to estimate the population variance, $\sigma^2$.

An obvious estimator is

$$v^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n}.$$

But $v^2$ is not unbiased because $E(v^2) \neq \sigma^2$. In fact, it can be shown that

$$E(v^2) = \frac{n-1}{n}\,\sigma^2 < \sigma^2$$

so that, on average, $v^2$ underestimates $\sigma^2$.




This is depicted below:

[Figure not reproduced]

i.e. $v^2$ is a biased estimator of $\sigma^2$.

Is it possible to construct an unbiased estimator of $\sigma^2$?




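An alternative estimator replaces $n$ in the denominator of $v^2$ with $n-1$:

$$s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}.$$

The factor $n-1$ compensates for the downward bias and we find that $s^2 > v^2$.

In fact,

$$s^2 = \frac{n}{n-1}\,v^2,$$

and so

$$E(s^2) = E\left(\frac{n}{n-1}\,v^2\right) = \frac{n}{n-1}\,E(v^2) = \frac{n}{n-1}\cdot\frac{n-1}{n}\,\sigma^2 = \sigma^2.$$

Hence $s^2$ is an unbiased estimator of $\sigma^2$.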
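The two expectations can be verified exactly by brute force. The following is a minimal sketch, assuming a hypothetical four-value population (not from the lecture) sampled with replacement. It averages the $n$-divisor and $(n-1)$-divisor estimators over every possible sample of size 3. The helper names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy population (an assumption for illustration only).
population = [2, 4, 6, 12]
n = 3  # sample size


def mean(values):
    """Exact arithmetic mean, returned as a Fraction."""
    values = list(values)
    return Fraction(sum(values), len(values))


def sum_sq_dev(sample):
    """Sum of squared deviations of the sample from its own mean."""
    xbar = mean(sample)
    return sum((x - xbar) ** 2 for x in sample)


mu = mean(population)
sigma2 = mean((x - mu) ** 2 for x in population)  # population variance

# All ordered samples of size n drawn with replacement are equally likely,
# so each expectation is a plain average over the full list of samples.
samples = list(product(population, repeat=n))
expected_v2 = mean(sum_sq_dev(s) / n for s in samples)        # divisor n
expected_s2 = mean(sum_sq_dev(s) / (n - 1) for s in samples)  # divisor n - 1

print("sigma^2 =", sigma2)
print("E(v^2)  =", expected_v2, "= (n-1)/n * sigma^2")
print("E(s^2)  =", expected_s2)
```

With this population $\sigma^2 = 14$, so the enumeration gives $E(v^2) = 28/3$ and $E(s^2) = 14$, matching the two formulas.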






Example. (Thomas, Example 3.8) During May, a random sample of packages leaving a large wholesale store have weights (in kg) as follows:

250, 2000, 720, 1200, 310, 280, 1460, 180.

Find unbiased estimates of the mean and variance of all packages leaving the store in May.

Solution. The table below shows the calculation of the relevant sums. We obtain:

$$\bar{X} = \frac{\sum_i X_i}{n} = \frac{6400}{8} = 800;$$

$$s^2 = \frac{\sum_i (X_i - \bar{X})^2}{n-1} = \frac{3{,}239{,}400}{7} = 462{,}771.43.$$






       $X_i$  $X_i - \bar{X}$  $(X_i - \bar{X})^2$    $X_i^2$
         250             -550              302,500     62,500
        2000             1200            1,440,000  4,000,000
         720              -80                6,400    518,400
        1200              400              160,000  1,440,000
         310             -490              240,100     96,100
         280             -520              270,400     78,400
        1460              660              435,600  2,131,600
         180             -620              384,400     32,400
Sum     6400                0            3,239,400  8,359,400






Note that another (computationally easier) way to calculate $s^2$ is to use

$$s^2 = \frac{\sum_i X_i^2}{n-1} - \frac{n}{n-1}\,\bar{X}^2,$$

which only involves computing $\sum_i X_i^2$ rather than $\sum_i (X_i - \bar{X})^2$.

For the previous example,

$$s^2 = \frac{8{,}359{,}400}{7} - \frac{8}{7}\,800^2 = 1{,}194{,}200 - 731{,}428.57 = 462{,}771.43$$

as required.
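The two routes to $s^2$ in the worked example can be cross-checked numerically. The sketch below uses the eight package weights from the example and compares the definition, the shortcut formula, and Python's standard-library `statistics.variance`, which also divides by $n-1$. The variable names are illustrative.

```python
import statistics

# Package weights (kg) from the worked example.
weights = [250, 2000, 720, 1200, 310, 280, 1460, 180]
n = len(weights)

xbar = sum(weights) / n

# Definition: sum of squared deviations from the mean, divided by n - 1.
s2_definition = sum((x - xbar) ** 2 for x in weights) / (n - 1)

# Shortcut: needs only the sum of squares and the sample mean.
s2_shortcut = sum(x ** 2 for x in weights) / (n - 1) - n / (n - 1) * xbar ** 2

# Library check: statistics.variance is the sample variance (divisor n - 1).
s2_library = statistics.variance(weights)

print(f"sample mean     = {xbar:.2f}")
print(f"s^2, definition = {s2_definition:.2f}")
print(f"s^2, shortcut   = {s2_shortcut:.2f}")
print(f"s^2, library    = {s2_library:.2f}")
```

All three variance figures agree with the 462,771.43 obtained by hand. In floating-point arithmetic the shortcut subtracts two large numbers, so the definition is usually the safer choice for data with a large mean relative to its spread.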


Confidence Intervals


Sometimes we wish to specify a degree of confidence in our estimator.

One way of doing this is to indicate a range of values within which we are 95% confident that the true parameter value lies.

Suppose we wish to construct a confidence interval for the population mean, $\mu$. The aim is to find two values, $\bar{X} - E$ and $\bar{X} + E$, such that there is a 95% probability that the range $\bar{X} - E$ to $\bar{X} + E$ will contain $\mu$, i.e. we need to find an $E$ such that

$$\Pr(\bar{X} - E < \mu < \bar{X} + E) = 0.95.$$

Note that the centre of the range is $\bar{X}$ so we can express the range as $\bar{X} \pm E$.






We shall make use of the Central Limit Theorem:

$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right) \;\Rightarrow\; Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1).$$

We need to find a value, $k$, such that

$$\Pr(-k < Z < k) = 0.95$$

i.e. the value $k$ puts 2.5% of the distribution in each tail.

The tables show that $k = 1.96$; the relevant row is

z     0.00    ...  0.03    0.04    0.05    0.06    0.07    0.08    0.09
1.9   0.4713  ...  0.4732  0.4738  0.4744  0.4750  0.4756  0.4761  0.4767
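The table lookup can be reproduced in code. This is a minimal sketch using `statistics.NormalDist` from the Python standard library. The function name `two_sided_k` is an illustrative choice, not notation from the lecture.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution, N(0, 1)


def two_sided_k(confidence):
    """Return k such that Pr(-k < Z < k) equals the given confidence level."""
    tail = (1 - confidence) / 2  # probability left in each tail
    return Z.inv_cdf(1 - tail)


k95 = two_sided_k(0.95)

# The table entry for z = 1.96 is the area between 0 and z,
# i.e. the cumulative probability minus one half.
area_0_to_196 = Z.cdf(1.96) - 0.5

print(f"k for 95%: {k95:.4f}")
print(f"area between 0 and 1.96: {area_0_to_196:.4f}")
```

The table entry 0.4750 in row 1.9, column 0.06 is half of 0.95, which is why $k = 1.96$ leaves 2.5% in each tail.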





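In terms of the $N(0, 1)$ distribution (figure not reproduced), we have

$$\Pr(-1.96 < Z < 1.96) = 0.95;$$

substituting the expression for $Z$:

$$\Pr\left(-1.96 < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.96\right) = 0.95.$$

Multiplying through by $\sigma/\sqrt{n}$, subtracting $\bar{X}$ and multiplying by $-1$ (which reverses the inequalities) gives

$$\Pr\left(\bar{X} + 1.96\,\frac{\sigma}{\sqrt{n}} > \mu > \bar{X} - 1.96\,\frac{\sigma}{\sqrt{n}}\right) = 0.95$$

or (reversing the order)

$$\Pr\left(\bar{X} - 1.96\,\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\,\frac{\sigma}{\sqrt{n}}\right) = 0.95.$$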





We have therefore shown that $E = 1.96\,\sigma/\sqrt{n}$ and have found a 95% large-sample confidence interval for $\mu$.

We can write the confidence interval as

$$\bar{X} \pm 1.96\,\frac{\sigma}{\sqrt{n}}.$$

Problem: $\sigma$ is unknown, therefore we replace it with $s$ so that for practical purposes the confidence interval is

$$\bar{X} \pm 1.96\,\frac{s}{\sqrt{n}}.$$

Remember that we are saying there is a 95% probability that the true (unknown) value lies in the range

$$\bar{X} - 1.96\,\frac{s}{\sqrt{n}} \quad\text{to}\quad \bar{X} + 1.96\,\frac{s}{\sqrt{n}}.$$
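The interval $\bar{X} \pm 1.96\,s/\sqrt{n}$ is straightforward to compute. The sketch below is illustrative only: the function name `large_sample_ci` is hypothetical, and the inputs ($n = 400$, $\bar{X} = 17{,}890$, $s = 2048$) are borrowed from the summary statistics in this lecture's hypothesis-testing example, where no confidence interval is actually computed.

```python
from math import sqrt
from statistics import NormalDist


def large_sample_ci(xbar, s, n, confidence=0.95):
    """Large-sample confidence interval for the mean: xbar +/- k * s / sqrt(n)."""
    # k leaves (1 - confidence) / 2 in each tail of N(0, 1); about 1.96 for 95%.
    k = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = k * s / sqrt(n)  # this is E in the lecture's notation
    return xbar - half_width, xbar + half_width


# Illustrative inputs (assumed here; taken from the later testing example).
lower, upper = large_sample_ci(xbar=17890, s=2048, n=400)
print(f"95% CI: {lower:.2f} to {upper:.2f}")
```

Raising `confidence` to 0.99 widens the interval, because a larger $k$ is needed to capture more of the distribution.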




If we wish to increase the probability level, say to 99%, we need to find a $k$ such that $\Pr(-k < Z < k) = 0.99$ [...]

[...] if $TS > 1.64$ we reject $H_0$ at the 0.05 (5%) level of significance.

The level of significance is $\alpha = \Pr(\text{reject } H_0 \mid H_0 \text{ is true})$. If $\alpha = 0.01$ we reject $H_0$ if $TS > 2.33$.


Testing Hypotheses About a Population Mean

A rejection of $H_0$ at the 1% significance level is stronger than rejection at the 5% level, but if $\alpha$ is too small we would almost never reject $H_0$!

Our decision rule or test criterion is:

reject $H_0$ if $TS = \dfrac{\bar{X} - 17{,}670}{\sigma/\sqrt{n}} > 1.64$;

otherwise we accept $H_0$.

Another way of writing this is:

reject $H_0$ if $\bar{X} > 17{,}670 + 1.64\,\dfrac{\sigma}{\sqrt{n}}$.





Suppose, from our sample of $n = 400$ residents, we find that $\bar{X} = 17{,}890$ and $s = 2048$. Using $s$ in place of $\sigma$ (recall that $s^2$ is an unbiased estimator of $\sigma^2$) we obtain

$$TS = \frac{17{,}890 - 17{,}670}{2048/\sqrt{400}} = 2.15.$$

Hence $TS > 1.64$ and we reject the null $H_0: \mu = 17{,}670$ at the 5% significance level in favour of $H_A: \mu > 17{,}670$.

Note, however, that $TS < 2.33$ so we would not reject $H_0$ at the 1% significance level.
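The one-tail test above can be sketched in a few lines. The function name `one_tail_z_test` is illustrative, and the critical values are computed from the standard normal distribution rather than read from tables, so they appear as roughly 1.645 and 2.326 instead of the rounded 1.64 and 2.33.

```python
from math import sqrt
from statistics import NormalDist


def one_tail_z_test(xbar, mu0, s, n, alpha):
    """Upper-tail large-sample test of H0: mu = mu0 against HA: mu > mu0."""
    ts = (xbar - mu0) / (s / sqrt(n))            # test statistic
    critical = NormalDist().inv_cdf(1 - alpha)   # puts alpha in the upper tail
    return ts, critical, ts > critical


# Figures from the example: n = 400, sample mean 17,890, s = 2048.
for alpha in (0.05, 0.01):
    ts, critical, reject = one_tail_z_test(17890, 17670, 2048, 400, alpha)
    decision = "reject H0" if reject else "do not reject H0"
    print(f"alpha = {alpha}: TS = {ts:.2f}, critical value = {critical:.3f}, {decision}")
```

The output reproduces the conclusion in the text: $H_0$ is rejected at the 5% level but not at the 1% level.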





The previous test is an example of a one-tail test because, under $H_A$, we were only interested in values of $\mu$ greater than 17,670.

Suppose, instead, that we write the alternative hypothesis as:

$$H_A: \mu \neq 17{,}670.$$

Now, under $H_A$, values of $\mu$ both greater and less than 17,670 are included; this becomes a two-tail test.

There are now two critical values, one at each end of the distribution: for a 5% level of significance these values are $\pm 1.96$; for a 1% significance level they are $\pm 2.58$.

Note that, for a significance level $\alpha$, the critical values put an area of $\alpha/2$ into each end of the distribution, e.g. if $\alpha = 0.05$ then 0.025 (2.5%) goes into each end.





The two-tail rejection region is depicted below:

[Figure not reproduced: two-tail rejection region]

Our decision rule at the 5% significance level is:

reject $H_0$ if $TS > 1.96$ or $TS < -1.96$; otherwise we accept $H_0$.

Another way of writing this is in terms of the absolute value of $TS$, denoted $|TS|$: reject $H_0$ if $|TS| > 1.96$.
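The two-tail rule $|TS| > k$ differs from the one-tail rule only in how the critical value is chosen. The sketch below is illustrative: `two_tail_z_test` is a hypothetical helper, and applying it to the figures from the residents example ($n = 400$, $\bar{X} = 17{,}890$, $s = 2048$, $H_0: \mu = 17{,}670$) is an assumption made here, since the lecture does not carry out that calculation.

```python
from math import sqrt
from statistics import NormalDist


def two_tail_z_test(ts, alpha):
    """Two-tail decision: reject H0 when |TS| exceeds the critical value."""
    # alpha / 2 goes into each tail of N(0, 1).
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return critical, abs(ts) > critical


# Test statistic computed from the example's figures (assumed reuse).
ts = (17890 - 17670) / (2048 / sqrt(400))

for alpha in (0.05, 0.01):
    critical, reject = two_tail_z_test(ts, alpha)
    decision = "reject H0" if reject else "do not reject H0"
    print(f"alpha = {alpha}: |TS| = {abs(ts):.2f}, critical value = {critical:.2f}, {decision}")
```

Because the absolute value is used, a test statistic of $-2.15$ would lead to exactly the same decisions.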


Summary

Point estimates

Confidence intervals

Testing hypotheses about a population mean

Next week:

Hypothesis testing

