
Unit 4: Inference for numerical variables
Lecture 1: Bootstrap, paired, and two sample

Statistics 101

Thomas Leininger

June 4, 2013


Bootstrap & Randomization testing

Rent in Durham - bootstrap interval

The dot plot below shows the distribution of means of 100 bootstrap samples from the original sample. Estimate the 90% bootstrap confidence interval based on this bootstrap distribution.

[Figure: dot plot of the 100 bootstrap means; horizontal axis from 900 to 1400, with the values 1013.9 and 1354.3 marked.]
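The percentile method behind this question can be sketched in R. This is only an illustration: the rent values are invented (the original sample is not reproduced in these slides), and the names `rent`, `n_boot`, and `boot_means` are placeholders, so the resulting interval will not match the values marked on the plot.

```r
# Hypothetical sample of monthly rents (made-up values, not the lecture data)
rent <- c(850, 920, 1010, 1100, 1150, 1230, 1300, 1420, 980, 1600)

set.seed(101)   # make the resampling reproducible
n_boot <- 100   # number of bootstrap samples, as in the dot plot

# Each bootstrap sample draws n values from the sample with replacement;
# we record the mean of every resample
boot_means <- replicate(
  n_boot,
  mean(sample(rent, size = length(rent), replace = TRUE))
)

# 90% percentile interval: cut off the lowest 5% and the highest 5% of the means
ci_90 <- quantile(boot_means, probs = c(0.05, 0.95))
ci_90
```

With 100 bootstrap means, this is roughly the same as counting in 5 dots from each end of the dot plot.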

Statistics 101 (Thomas Leininger) U4 - L1: Bootstrap, paired, and two sample June 4, 2013 2 / 27


Bootstrap applet

http://wise.cgu.edu/bootstrap/


Paired data Paired observations

200 observations were randomly sampled from the High School and Beyond survey. The same students took a reading and writing test and their scores are shown below. At first glance, does there appear to be a difference between the average reading and writing test score?

[Figure: distributions of the read and write scores, plotted on a common scale from 0 to 100.]


Question

The same students took a reading and writing test and their scores are shown below. Are the reading and writing scores of each student independent of each other?

        id    read   write
1       70    57     52
2       86    44     33
3       141   63     44
4       172   47     52
...     ...   ...    ...
200     137   63     65

(a) Yes
(b) No

Answer: (b) No


Analyzing paired data

When two sets of observations have this special correspondence (not independent), they are said to be paired.

To analyze paired data, it is often useful to look at the difference in outcomes of each pair of observations.

diff = read − write

It is important that we always subtract using a consistent order.

        id    read   write   diff
1       70    57     52      5
2       86    44     33      11
3       141   63     44      19
4       172   47     52      -5
...     ...   ...    ...     ...
200     137   63     65      -2

[Figure: histogram of the differences; horizontal axis from −20 to 20, vertical axis Frequency from 0 to 40.]
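As a small illustration, the differences for the five rows shown can be computed in R as follows. The data frame name `hsb` is a placeholder chosen here, and only the rows visible in the table are included, not the full sample of 200.

```r
# The five rows visible in the table (not the full sample of 200 students)
hsb <- data.frame(
  id    = c(70, 86, 141, 172, 137),
  read  = c(57, 44, 63, 47, 63),
  write = c(52, 33, 44, 52, 65)
)

# Always subtract in the same order: read minus write
hsb$diff <- hsb$read - hsb$write
hsb$diff

# With the full data, hist(hsb$diff) would draw the histogram of differences
```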


Parameter and point estimate

Parameter of interest: Average difference between the reading and writing scores of all high school students.

µdiff

Point estimate: Average difference between the reading and writing scores of sampled high school students.

x̄diff


Paired data Inference for paired data

Setting the hypotheses

If in fact there was no difference between the scores on the reading and writing exams, what would you expect the average difference to be?

0

What are the hypotheses for testing if there is a difference between the average reading and writing scores?

H0: There is no difference between the average reading and writing score.

µdiff = 0

HA: There is a difference between the average reading and writing score.

µdiff ≠ 0


Nothing new here

The analysis is no different than what we have done before.

We have data from one sample: differences.

We are testing to see if the average difference is different than 0.


Checking assumptions & conditions

Question

Which of the following is true?

(a) Since students are sampled randomly, we can assume that the difference between the reading and writing scores of one student in the sample is independent of another.

(b) The distribution of differences is bimodal, therefore we cannot continue with the hypothesis test.

(c) In order for differences to be random we should have sampled with replacement.

(d) Since students are sampled randomly, we can assume that the sampling distribution of the average difference will be nearly normal.


Application exercise: Calculating the test-statistic and the p-value

The observed average difference between the two scores is -0.545 points and the standard deviation of the difference is 8.887 points. Which of the below is the closest p-value for evaluating a difference between the average scores on the two exams? (n = 200)

(a) 20%
(b) 40%
(c) 5%
(d) 48%
(e) 95%

Answer: (b) 40%

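[Figure: distribution centered at 0, with −0.545 and 0.545 marked on the horizontal axis.]

$$Z = \frac{-0.545 - 0}{\frac{8.887}{\sqrt{200}}} = \frac{-0.545}{0.628} = -0.87$$

$$\text{p-value} = 0.1922 \times 2 = 0.3844$$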
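The same calculation can be sketched in R using the summary statistics given in the exercise. The variable names are placeholders chosen here. Because `pnorm()` works with the unrounded test statistic rather than a table lookup, the p-value may differ slightly in the third decimal place from a hand calculation.

```r
x_bar_diff <- -0.545   # observed average difference (read - write)
s_diff     <- 8.887    # standard deviation of the differences
n          <- 200      # number of students (pairs)
mu_null    <- 0        # average difference under H0

se <- s_diff / sqrt(n)              # standard error of the mean difference
z  <- (x_bar_diff - mu_null) / se   # test statistic

# Two-sided p-value: area in both tails beyond |z|
p_value <- 2 * pnorm(-abs(z))

round(c(SE = se, Z = z, p = p_value), 3)
```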


Interpretation of p-value

Question

Which of the following is the correct interpretation of the p-value?

(a) Probability that the average scores on the reading and writing exams are equal.

(b) Probability that the average scores on the reading and writing exams are different.

(c) Probability of obtaining a random sample of 200 students where the average difference between the reading and writing scores is at least 0.545 (in either direction), if in fact the true average difference between the scores is 0.

(d) Probability of incorrectly rejecting the null hypothesis if in fact the null hypothesis is true.


HT↔ CI

Question

Suppose we were to construct a 95% confidence interval for the average difference between the reading and writing scores. Would you expect this interval to include 0?

(a) yes
(b) no
(c) cannot tell from the information given

$$-0.545 \pm 1.96 \times \frac{8.887}{\sqrt{200}} = -0.545 \pm 1.96 \times 0.628 = -0.545 \pm 1.23 = (-1.775, 0.685)$$
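A minimal R sketch of this interval, again from the summary statistics only. The variable names are placeholders, and `qnorm(0.975)` is used in place of the rounded 1.96, so the endpoints can differ from the hand calculation in the third decimal place.

```r
x_bar_diff <- -0.545   # observed average difference
s_diff     <- 8.887    # standard deviation of the differences
n          <- 200      # number of pairs

z_star <- qnorm(0.975)              # critical value for 95% confidence, about 1.96
me     <- z_star * s_diff / sqrt(n) # margin of error

ci <- c(lower = x_bar_diff - me, upper = x_bar_diff + me)
round(ci, 3)

# The interval includes 0, in agreement with failing to reject H0 at the 5% level
contains_zero <- ci[["lower"]] < 0 && 0 < ci[["upper"]]
contains_zero
```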


Difference of two means Confidence intervals for differences of means

The General Social Survey (GSS), conducted by NORC at the University of Chicago, contains a standard ‘core’ of demographic, behavioral, and attitudinal questions, plus topics of special interest. Many of the core questions have remained unchanged since 1972 to facilitate time-trend studies as well as replication of earlier findings. Below is an excerpt from the 2010 data set. The variables are number of hours worked per week and highest educational attainment.

        degree           hrs1
1       BACHELOR         55
2       BACHELOR         45
3       JUNIOR COLLEGE   45
...
1172    HIGH SCHOOL      40


Exploratory analysis

What can you say about the relationship between educational attainment and hours worked per week?

[Figure: hours worked per week (0 to 80) by educational attainment: Less than HS, HS, Jr Coll, Bachelor's, Graduate.]


Collapsing levels into two

Say we are only interested in the difference between the number of hours worked per week by college and non-college graduates. Then we combine the levels of education into two:

hs or lower ← less than high school or high school
coll or higher ← junior college, bachelor’s, and graduate

Here is how you can do this in R:

# create a new empty variable
gss$edu = NA

# use logical conditions to assign the levels of the new variable
gss$edu[gss$degree == "LESS THAN HIGH SCHOOL" |
        gss$degree == "HIGH SCHOOL"] = "hs or lower"
gss$edu[gss$degree == "JUNIOR COLLEGE" | gss$degree == "BACHELOR" |
        gss$degree == "GRADUATE"] = "coll or higher"

# make sure new variable is categorical
gss$edu = as.factor(gss$edu)
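To try the recoding without the full data set, here is a sketch on a toy data frame built from the four rows of the excerpt. The name `gss_toy` and the helper vectors `hs_levels` and `coll_levels` are assumptions made for this illustration, and `%in%` is used as an alternative to the chained `==` comparisons; it is not the form used on the slide.

```r
# Toy data frame with the four rows shown in the excerpt (not the full GSS data)
gss_toy <- data.frame(
  degree = c("BACHELOR", "BACHELOR", "JUNIOR COLLEGE", "HIGH SCHOOL"),
  hrs1   = c(55, 45, 45, 40)
)

# Hypothetical helper vectors listing which degrees go into each group
hs_levels   <- c("LESS THAN HIGH SCHOOL", "HIGH SCHOOL")
coll_levels <- c("JUNIOR COLLEGE", "BACHELOR", "GRADUATE")

# Start with an empty variable, then fill in the two groups;
# any degree not listed in either vector stays NA
gss_toy$edu <- NA
gss_toy$edu[gss_toy$degree %in% hs_levels]   <- "hs or lower"
gss_toy$edu[gss_toy$degree %in% coll_levels] <- "coll or higher"

# Make sure the new variable is categorical
gss_toy$edu <- as.factor(gss_toy$edu)

# Count how many rows fall in each group
table(gss_toy$edu)
```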


Exploratory analysis - another look

                  x̄      s       n
coll or higher    41.8   15.14   505
hs or lower       39.4   15.12   667

[Figure: histograms of hours worked per week (0 to 80) for "coll or higher" and "hs or lower"; vertical axis: Frequency.]


Parameter and point estimate

We want to construct a 95% confidence interval for the average difference between the number of hours worked per week by Americans with a college degree and those with a high school degree or lower. What are the parameter of interest and the point estimate?

Parameter of interest: Average difference between the number of hours worked per week by all Americans with a college degree and those with a high school degree or lower.

µcoll − µhs

Point estimate: Average difference between the number of hours worked per week by sampled Americans with a college degree and those with a high school degree or lower.

x̄coll − x̄hs


Checking assumptions & conditions

1 Independence:

  Within groups: both samples are random. We can assume that the number of hours worked per week by one college graduate in the sample is independent of another, and the number of hours worked per week by someone with a HS degree or lower in the sample is independent of another as well.

  Between groups: ← new! Since the sample is random, we have no reason to believe that the college graduates in the sample would not be independent of those with a HS degree or lower.

2 Sample size / skew:

  Both distributions look reasonably symmetric, and the sample sizes are at least 30, therefore we can assume that the sampling distributions of the average number of hours worked per week by college graduates and by those with a HS degree or lower are nearly normal. Hence the sampling distribution of the average difference will be nearly normal as well.


Confidence interval for difference between two means

All confidence intervals have the same form:

point estimate ± ME

And ME = critical value × SE of point estimate

In this case the point estimate is x̄1 − x̄2

Since the sample sizes are large enough, the critical value is $z^\star$

So the only new concept is the standard error of the difference between two means...

Standard error of the difference between two sample means

$$SE_{(\bar{x}_1 - \bar{x}_2)} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$


Let’s put things in context

Calculate the standard error of the average difference between the number of hours worked per week by college graduates and those with a HS degree or lower.

                  x̄      s       n
coll or higher    41.8   15.14   505
hs or lower       39.4   15.12   667

$$SE_{(\bar{x}_{coll} - \bar{x}_{hs})} = \sqrt{\frac{s_{coll}^2}{n_{coll}} + \frac{s_{hs}^2}{n_{hs}}} = \sqrt{\frac{15.14^2}{505} + \frac{15.12^2}{667}} = 0.89$$
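The standard error can be checked in R from the summary table. The variable names below are placeholders introduced for this sketch.

```r
# Summary statistics from the table
s_coll <- 15.14; n_coll <- 505   # coll or higher
s_hs   <- 15.12; n_hs   <- 667   # hs or lower

# Standard error of the difference between two sample means
se_diff <- sqrt(s_coll^2 / n_coll + s_hs^2 / n_hs)
round(se_diff, 2)   # 0.89
```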


Confidence interval for the difference (cont.)

Estimate (using a 95% confidence interval) the average difference between the number of hours worked per week by Americans with a college degree and those with a high school degree or lower.

$$\bar{x}_{coll} = 41.8 \qquad \bar{x}_{hs} = 39.4 \qquad SE_{(\bar{x}_{coll} - \bar{x}_{hs})} = 0.89$$

$$(\bar{x}_{coll} - \bar{x}_{hs}) \pm z^\star \times SE_{(\bar{x}_{coll} - \bar{x}_{hs})} = (41.8 - 39.4) \pm 1.96 \times 0.89 = 2.4 \pm 1.74 = (0.66, 4.14)$$
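The same interval as a short R sketch, using the rounded values from the calculation. The variable names are placeholders chosen here.

```r
x_bar_coll <- 41.8   # sample mean, coll or higher
x_bar_hs   <- 39.4   # sample mean, hs or lower
se_diff    <- 0.89   # standard error of the difference (rounded)
z_star     <- 1.96   # critical value for 95% confidence

point_est <- x_bar_coll - x_bar_hs   # point estimate of the difference
me        <- z_star * se_diff        # margin of error

ci <- c(lower = point_est - me, upper = point_est + me)
round(ci, 2)
```

Both endpoints are positive, which matters for how the interval is interpreted.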


Interpretation of a confidence interval for the difference

Question

Which of the following is the best interpretation of the confidence interval we just calculated?

We are 95% confident that

(a) college grads work on average 0.66 to 4.14 hours more per week than those with a HS degree or lower.

(b) college grads work on average 0.66 hours less to 4.14 hours more per week than those with a HS degree or lower.

(c) college grads work on average 0.66 to 4.14 hours less per week than those with a HS degree or lower.


Reality check

Do these results sound reasonable? Why or why not?


Difference of two means Hypothesis tests for differences of means

Setting the hypotheses

What are the hypotheses for testing if there is a difference between the average number of hours worked per week by college graduates and those with a HS degree or lower?

H0: µcoll = µhs

There is no difference in the average number of hours worked per week by college graduates and those with a HS degree or lower. Any observed difference between the sample means is due to natural sampling variation (chance).

HA: µcoll ≠ µhs

There is a difference in the average number of hours worked per week by college graduates and those with a HS degree or lower.


Calculating the test-statistic and the p-value

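H0: µcoll = µhs → µcoll − µhs = 0
HA: µcoll ≠ µhs → µcoll − µhs ≠ 0

x̄coll − x̄hs = 2.4, SE(x̄coll − x̄hs) = 0.89

[Figure: distribution of average differences centered at 0, with −2.4 and 2.4 marked on the horizontal axis.]

$$Z = \frac{(\bar{x}_{coll} - \bar{x}_{hs}) - 0}{SE_{(\bar{x}_{coll} - \bar{x}_{hs})}} = \frac{2.4}{0.89} = 2.70$$

$$\text{upper tail} = 1 - 0.9965 = 0.0035$$

$$\text{p-value} = 2 \times 0.0035 = 0.007$$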
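A minimal R sketch of this test, starting from the point estimate and standard error given above. The variable names are placeholders, and `pnorm()` replaces the table lookup, so the tail area is computed from the unrounded test statistic.

```r
point_est <- 2.4    # observed difference in sample means (coll - hs)
se_diff   <- 0.89   # standard error of the difference
null_val  <- 0      # difference in means under H0

z <- (point_est - null_val) / se_diff   # test statistic

# Area above z under the standard normal curve
upper_tail <- pnorm(z, lower.tail = FALSE)

# Two-sided test: double the tail area
p_value <- 2 * upper_tail

round(c(Z = z, upper_tail = upper_tail, p = p_value), 4)
```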


Conclusion of the test

Question

Which of the following is correct based on the results of the hypothesis test we just conducted?

(a) There is a 0.7% chance that there is no difference between the average number of hours worked per week by college graduates and those with a HS degree or lower.

(b) Since the p-value is low, we reject H0. The data provide convincing evidence of a difference between the average number of hours worked per week by college graduates and those with a HS degree or lower.

(c) Since we rejected H0, we may have made a Type 2 error.

(d) Since the p-value is low, we fail to reject H0. The data do not provide convincing evidence of a difference between the average number of hours worked per week by college graduates and those with a HS degree or lower.
