
t TEST OF A HYPOTHESIS RELATING TO A POPULATION MEAN

The diagram summarizes the procedure for performing a 5% significance test on a population mean under the assumption that we know the standard deviation of the sample mean.

s.d. of $\bar{X}$ known

discrepancy between hypothetical value and sample estimate, in terms of s.d.:

$$z = \frac{\bar{X} - \mu_0}{\text{s.d.}}$$

5% significance test: reject $H_0: \mu = \mu_0$ if $z > 1.96$ or $z < -1.96$

This is a very unrealistic assumption. We usually have to estimate the standard deviation with the standard error, and we use the standard error in the test statistic instead.

s.d. of $\bar{X}$ not known

discrepancy between hypothetical value and sample estimate, in terms of s.e.:

$$t = \frac{\bar{X} - \mu_0}{\text{s.e.}}$$

Because we have replaced the standard deviation in its denominator with the standard error, the test statistic has a t distribution instead of a normal distribution.


Accordingly, we refer to the test statistic as a t statistic. In other respects the test procedure is much the same.
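As a minimal sketch of how the t statistic might be computed from raw data: the slides do not say how the standard error is obtained, so this assumes the usual estimate, the sample standard deviation divided by the square root of the sample size. The function name `t_statistic` and the data are hypothetical and serve only as an illustration.

```python
import math
import statistics

def t_statistic(sample, mu_0):
    """t statistic for H0: population mean = mu_0 (hypothetical helper)."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    # statistics.stdev uses the n - 1 divisor (sample standard deviation).
    s = statistics.stdev(sample)
    # Assumption: standard error of the sample mean = s / sqrt(n).
    standard_error = s / math.sqrt(n)
    return (x_bar - mu_0) / standard_error

# Invented data, for illustration only.
sample = [2.0, 4.0, 4.0, 6.0, 9.0]
t = t_statistic(sample, mu_0=3.0)
print(f"t = {t:.3f}")
```

The only change from the z statistic is the denominator: an estimate computed from the same sample replaces the known standard deviation.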

5% significance test when the s.d. of $\bar{X}$ is not known: reject $H_0: \mu = \mu_0$ if $t > t_{\text{crit}}$ or $t < -t_{\text{crit}}$

We look up the critical value of t, and if the absolute value of the t statistic is greater than it, we reject the null hypothesis. If it is not, we do not.

Here is a graph of a normal distribution with zero mean and unit variance.

[Figure: density of the normal distribution; horizontal axis from –6 to 6, vertical axis from 0 to 0.4]

A graph of a t distribution with 10 degrees of freedom (this term will be defined in a moment) has been added.

[Figure: densities of the normal distribution and the t distribution with 10 d.f.; horizontal axis from –6 to 6, vertical axis from 0 to 0.4]

When the number of degrees of freedom is large, the t distribution looks very much like a normal distribution (and as the number increases, it converges on one).


Even when the number of degrees of freedom is small, as in this case, the distributions are very similar.


Here is another t distribution, this time with only 5 degrees of freedom. It is still very similar to a normal distribution.
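The similarity of the curves can be checked numerically. The sketch below evaluates the standard textbook density formulas, which are not given in the slides, and reports the largest gap between the t density and the normal density on a grid from –6 to 6. The helper names `normal_pdf`, `t_pdf` and `max_gap` are hypothetical.

```python
import math

def normal_pdf(x):
    """Density of the normal distribution with zero mean and unit variance."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    # Work with logs of the gamma function so that large df does not overflow.
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def max_gap(df):
    """Largest absolute density difference on a grid from -6 to 6."""
    grid = [i / 100 for i in range(-600, 601)]
    return max(abs(t_pdf(x, df) - normal_pdf(x)) for x in grid)

for df in (5, 10, 100, 1000):
    print(f"d.f. = {df:4d}: largest density gap = {max_gap(df):.5f}")
```

The gap shrinks as the number of degrees of freedom grows, which is the convergence described above.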

[Figure: densities of the normal distribution and the t distributions with 10 d.f. and 5 d.f.; horizontal axis from –6 to 6, vertical axis from 0 to 0.4]

So why do we make such a fuss about referring to the t distribution rather than the normal distribution? Would it really matter if we always used 1.96 for the 5% test and 2.58 for the 1% test?


The answer is that it does make a difference. Although the distributions are generally quite similar, the t distribution has longer tails than the normal distribution, and the smaller the number of degrees of freedom, the greater the difference.

As a consequence, the probability of obtaining a high test statistic on a pure chance basis is greater with a t distribution than with a normal distribution.
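A small simulation can make this concrete. The sketch below is an illustration rather than part of the original slides: it draws repeated samples from a normal population in which the null hypothesis is true, computes the t statistic for each, and counts how often its absolute value exceeds 1.96. It assumes the standard error is the sample standard deviation divided by the square root of n, and takes the degrees of freedom to be n – 1. The function name, seed and replication count are arbitrary choices.

```python
import math
import random

def simulated_rejection_rate(n, cutoff=1.96, replications=20000, seed=0):
    """Share of samples of size n with |t| > cutoff when H0 is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(replications):
        # The population mean really is 0, so every rejection is pure chance.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        mean = sum(sample) / n
        variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
        t = mean / math.sqrt(variance / n)
        if abs(t) > cutoff:
            rejections += 1
    return rejections / replications

for n in (6, 11):
    rate = simulated_rejection_rate(n)
    print(f"n = {n:2d} ({n - 1} d.f.): |t| > 1.96 in {rate:.1%} of samples")
```

With 1.96 as the cutoff, the share of chance rejections comes out well above the nominal 5%, and more so for the smaller sample.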

[Figure: tails of the normal, t (10 d.f.), and t (5 d.f.) densities, with the vertical axis magnified to the range 0 to 0.05; horizontal axis from –6 to 6]

This means that the rejection regions have to start more standard deviations away from zero for a t distribution than for a normal distribution.


The 2.5% tail of a normal distribution starts 1.96 standard deviations from its mean.

[Figure: tails of the normal, t (10 d.f.), and t (5 d.f.) densities, with the start of the 2.5% tail of the normal distribution marked at –1.96]

The 2.5% tail of a t distribution with 10 degrees of freedom starts 2.23 standard deviations from its mean.

[Figure: tails of the normal, t (10 d.f.), and t (5 d.f.) densities, with the start of the 2.5% tail of the t distribution with 10 d.f. marked at –2.23]

The 2.5% tail of a t distribution with 5 degrees of freedom starts 2.57 standard deviations from its mean.

[Figure: tails of the normal, t (10 d.f.), and t (5 d.f.) densities, with the start of the 2.5% tail of the t distribution with 5 d.f. marked at –2.57]

For this reason we need to refer to a table of critical values of t when performing significance tests on a population mean whose standard deviation has to be estimated.

t distribution: critical values of t

Degrees of freedom   Two-sided test:      10%       5%       2%       1%     0.2%     0.1%
                     One-sided test:       5%     2.5%       1%     0.5%     0.1%    0.05%
1                                       6.314   12.706   31.821   63.657   318.31   636.62
2                                       2.920    4.303    6.965    9.925   22.327   31.598
3                                       2.353    3.182    4.541    5.841   10.214   12.924
4                                       2.132    2.776    3.747    4.604    7.173    8.610
5                                       2.015    2.571    3.365    4.032    5.893    6.869
…
18                                      1.734    2.101    2.552    2.878    3.610    3.922
19                                      1.729    2.093    2.539    2.861    3.579    3.883
20                                      1.725    2.086    2.528    2.845    3.552    3.850
…
600                                     1.647    1.964    2.333    2.584    3.104    3.307
∞                                       1.645    1.960    2.326    2.576    3.090    3.291

At the top of the table are listed possible significance levels for a test. For the time being we will be performing two-sided tests, so ignore the line for one-sided tests.


Hence if we are performing a (two-sided) 5% significance test, we should use the 5% column of the two-sided row in the table.


The left-hand column lists degrees of freedom. When estimating the population mean using the sample mean, the number of degrees of freedom is defined to be the number of observations minus 1: d.f. = n – 1.

Thus, if there are 20 observations in the sample, as in the sales tax example we will discuss in a moment, the number of degrees of freedom would be 19 and the critical value of t for a 5% test would be 2.093.

n = 20, d.f. = 19

Note that as the number of degrees of freedom becomes large, the critical value converges on 1.96, the critical value for the normal distribution. This is because the t distribution converges on the normal distribution.
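The tabulated values can be reproduced without a printed table. The following is only a sketch under stated assumptions: it integrates the standard t density with Simpson's rule and finds the critical value by bisection. The helper names, the step count and the search bracket of 0 to 50 are arbitrary choices, suitable for moderate degrees of freedom and the 5% and 1% levels rather than a general-purpose routine.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(x, df, steps=1000):
    """P(T > x) for x >= 0, by Simpson's rule on [0, x] (steps must be even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # Half the probability lies above zero; subtract the part between 0 and x.
    return 0.5 - total * h / 3.0

def t_critical(df, two_sided_level):
    """Critical value for a two-sided test, found by bisection on [0, 50]."""
    lo, hi = 0.0, 50.0
    for _ in range(40):
        mid = (lo + hi) / 2
        if 2 * t_upper_tail(mid, df) > two_sided_level:
            lo = mid  # both tails still hold too much probability: move out
        else:
            hi = mid
    return (lo + hi) / 2

for df in (5, 10, 19, 600):
    print(f"d.f. = {df:3d}: 5% -> {t_critical(df, 0.05):.3f}, "
          f"1% -> {t_critical(df, 0.01):.3f}")
```

For 19 degrees of freedom this should agree with the table entries 2.093 and 2.861 to three decimal places, and the 5% value for 600 degrees of freedom is already close to 1.96.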

Hence, referring back to the summary of the test procedure, when n = 20 and so degrees of freedom = 19, we should reject the null hypothesis at the 5% significance level if the absolute value of t is greater than 2.093.

5% significance test: reject $H_0: \mu = \mu_0$ if $t > 2.093$ or $t < -2.093$

If instead we wished to perform a 1% significance test, we would use the 1% column of the two-sided row in the table. Note that as the number of degrees of freedom becomes large, the critical value converges to 2.58, the critical value for the normal distribution.


For a sample of 20 observations, the critical value of t at the 1% level is 2.861.


So we should use this figure in the test procedure for a 1% test.

1% significance test: reject $H_0: \mu = \mu_0$ if $t > 2.861$ or $t < -2.861$

We now consider an example of a t test. A certain city abolishes its local sales tax on consumer expenditure. We wish to determine whether the abolition had a significant effect on sales.

Example: effect of abolition of sales tax on household expenditure

Effect on household expenditure: $\mu$

We take as our null hypothesis that there was no effect: $H_0: \mu = 0$.

Null hypothesis: $H_0: \mu = 0$

Alternative hypothesis: $H_1: \mu \neq 0$

A survey of 20 households shows that, in the following month, mean household expenditure increased by $160 and the standard error of the increase was $60.

Sample size: n = 20

Mean increase in expenditure: $160

Standard error of increase: $60

The test statistic is 2.67.

Test statistic:

$$t = \frac{\bar{X} - \mu_0}{\text{s.e.}} = \frac{160 - 0}{60} = 2.67$$

The critical values of t with 19 degrees of freedom are 2.09 at the 5 percent significance level and 2.86 at the 1 percent level.

Critical values of t: $t_{\text{crit},\,5\%}(19) = 2.09$, $t_{\text{crit},\,1\%}(19) = 2.86$

Hence we reject the null hypothesis of no effect at the 5 percent level but not at the 1 percent level. This is a case where we should mention both tests.
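The arithmetic of the example can be laid out as a short script. This is just an illustrative sketch: the figures are those quoted in the slides, the critical values are copied from the 19 d.f. row of the table, and the variable names are made up.

```python
# Figures from the sales tax example.
mean_increase = 160.0   # sample mean increase in expenditure, in dollars
standard_error = 60.0   # standard error of the increase, in dollars
mu_0 = 0.0              # value of the mean under H0 (no effect)

t = (mean_increase - mu_0) / standard_error

# Two-sided critical values of t for 19 degrees of freedom (n = 20).
critical_values = {"5%": 2.093, "1%": 2.861}

decisions = {}
for level, t_crit in critical_values.items():
    decisions[level] = "reject H0" if abs(t) > t_crit else "do not reject H0"
    print(f"{level} level: |t| = {abs(t):.2f}, t_crit = {t_crit}: {decisions[level]}")
```

Because 2.67 lies between the two critical values, the outcome depends on the significance level chosen, which is why both tests are reported.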

Conclusion: Reject $H_0$ at 5% level. Do not reject $H_0$ at 1% level.

Copyright Christopher Dougherty 2012.

These slideshows may be downloaded by anyone, anywhere for personal use. Subject to respect for copyright and, where appropriate, attribution, they may be used as a resource for teaching an econometrics course. There is no need to refer to the author.

The content of this slideshow comes from Section R.11 of C. Dougherty, Introduction to Econometrics, fourth edition 2011, Oxford University Press. Additional (free) resources for both students and instructors may be downloaded from the OUP Online Resource Centre http://www.oup.com/uk/orc/bin/9780199567089/.

Individuals studying econometrics on their own who feel that they might benefit from participation in a formal course should consider the London School of Economics summer school course EC212 Introduction to Econometrics http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx or the University of London International Programmes distance learning course EC2020 Elements of Econometrics www.londoninternational.ac.uk/lse.

2012.11.02