
Stat 512 – Day 8

Tests of Significance (Ch. 6)

Last Time

Use random sampling to eliminate sampling errors

Use caution to reduce nonsampling errors

Use probability theory to estimate the size of random sampling errors

Normal distribution is often useful

Last Time – Central Limit Theorem

Case 1: Variable of interest is quantitative

E.g., date of pennies, length of words

Parameters: population mean, μ; population standard deviation, σ

Sampling distribution for sample means will be centered at μ with standard deviation σ/√n (assuming random sample)

Sampling distribution will follow a normal distribution if the population is normal; otherwise the sampling distribution will be approximately normal if n is “large” (n > 30)
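The claim about sample means can be checked with a small simulation. The sketch below is only an illustration: the skewed population is a made-up example (exponential with mean 10), not data from the course, and the function name `simulate_sample_means` is hypothetical.

```python
import math
import random
import statistics


def simulate_sample_means(population, n, reps, seed=0):
    """Draw `reps` random samples of size n (with replacement); return their means."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.choices(population, k=n)) for _ in range(reps)]


# Hypothetical right-skewed population (assumption, for illustration only).
pop_rng = random.Random(1)
population = [pop_rng.expovariate(1 / 10) for _ in range(20_000)]

mu = statistics.fmean(population)      # population mean
sigma = statistics.pstdev(population)  # population standard deviation

n = 40  # "large" in the n > 30 sense
means = simulate_sample_means(population, n, reps=5_000)

center = statistics.fmean(means)
spread = statistics.pstdev(means)
predicted_spread = sigma / math.sqrt(n)

print("population mean      :", round(mu, 2))
print("mean of sample means :", round(center, 2))
print("sigma / sqrt(n)      :", round(predicted_spread, 3))
print("SD of sample means   :", round(spread, 3))
```

The last two printed values should be close to each other, and the first two as well, which is what the Central Limit Theorem predicts without any simulation.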

Last Time – Central Limit Theorem

Case 2: Variable of interest is qualitative (binary)

E.g., orange or not orange candy, did heart transplant patient survive

Parameters: population proportion/probability, π

Sampling distribution for sample proportions will be centered at π with standard deviation √(π(1−π)/n) (assuming random sample)

Sampling distribution will approximately follow a normal distribution if n is “large” (nπ > 10 and n(1−π) > 10)
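The same kind of check works for sample proportions. This is a minimal sketch with illustrative values (π = .45 and n = 25 are assumptions chosen so that nπ and n(1−π) both exceed 10); `simulate_sample_proportions` is a hypothetical helper name.

```python
import math
import random
import statistics


def simulate_sample_proportions(pi, n, reps, seed=0):
    """Simulate `reps` random samples of n binary outcomes; return each sample proportion."""
    rng = random.Random(seed)
    return [sum(rng.random() < pi for _ in range(n)) / n for _ in range(reps)]


pi = 0.45  # assumed population proportion (illustration only)
n = 25     # n*pi = 11.25 and n*(1 - pi) = 13.75, both above 10

props = simulate_sample_proportions(pi, n, reps=10_000)

prop_center = statistics.fmean(props)
prop_spread = statistics.pstdev(props)
theory_sd = math.sqrt(pi * (1 - pi) / n)

print("mean of sample proportions:", round(prop_center, 3))
print("SD of sample proportions  :", round(prop_spread, 4))
print("sqrt(pi(1 - pi)/n)        :", round(theory_sd, 4))
```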

[Figure: many samples are drawn from a population with a given population proportion; each sample yields a sample proportion p̂, and the collected p̂ values form the sampling distribution of sample proportions. Is the sample proportion that I observed surprising?]

Last Time – Central Limit Theorem

Moral: We don’t have to keep running these simulations with means and proportions; we know what the answer is going to be!

Therefore

Knowledge of the sampling distribution allows us to make judgments of whether an observation is surprising

Instead of running simulations, we have the Central Limit Theorem to tell us about the behavior of many (hypothetical) samples

Technical Conditions

Keep in mind that we can’t always apply the Central Limit Theorem

Means: random sample, n > 30 or normal population

Proportions: random sample, nπ > 10 and n(1−π) > 10

Consider both sample size and value of π

Recall Gettysburg Address
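The numeric parts of these conditions are easy to encode. The sketch below uses hypothetical helper names, and it cannot check the "random sample" requirement, which depends on how the data were collected. The sample sizes in the usage lines are ones that appear elsewhere in these notes.

```python
def clt_ok_for_mean(n, population_is_normal=False):
    """Size condition for means: normal population, or n > 30."""
    return population_is_normal or n > 30


def clt_ok_for_proportion(n, pi):
    """Size condition for proportions: n*pi > 10 and n*(1 - pi) > 10."""
    return n * pi > 10 and n * (1 - pi) > 10


# Both the sample size and the value of pi matter:
print(clt_ok_for_proportion(361, 0.15))  # 54.15 and 306.85 -> True
print(clt_ok_for_proportion(10, 0.15))   # 1.5 is not above 10 -> False
print(clt_ok_for_mean(130))              # True
print(clt_ok_for_mean(20))               # False unless the population is normal
```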

Practice Problem 2

Hospital under suspicion for high mortality rate from heart transplantations

Last ten cases had an 80% mortality rate

Researchers determined that 15% was a reasonable benchmark for transplantation.

Parameter of interest: π = the overall mortality rate for heart transplants at this hospital

Give the hospital the benefit of the doubt and assume π is equal to .15

See if that is consistent with the data at the hospital

Practice Problem 2

Observed: 19.7% of the 361 heart transplant patients between 1986 and 2000 died at this hospital

Can this be explained “by chance”?

Central Limit Theorem: If π = .15, we would see a sample proportion at least as large as .197 in only .62% of random samples.

Either a really unlucky coincidence or the π = .15 belief is wrong

More Formal Structure

1. Assumptions

Review data collection plan

Simple random sample?

Quantitative or qualitative data?

Are the conditions for the Central Limit Theorem met?

Check technical conditions


2. State competing conjectures about the parameter of interest

π = .15 (equals national rate): Null hypothesis, H0

π > .15 (what we actually suspect): Alternative hypothesis, Ha


3. Measure the discrepancy between what was observed and what was hypothesized

$$z = \frac{\text{observed } \hat{p} - \text{hypothesized } \pi}{SD(\hat{p})}$$

$$z = \frac{.197 - .15}{\sqrt{.15(.85)/361}} = 2.50$$

“test statistic”
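The test statistic in step 3 can be reproduced directly from the numbers in Practice Problem 2. This is a minimal sketch; the function and argument names (`z_statistic`, `p_hat`, `pi_0`) are illustrative choices, not notation from the course.

```python
import math


def z_statistic(p_hat, pi_0, n):
    """(observed proportion - hypothesized proportion) / SD of p-hat under the null."""
    sd_p_hat = math.sqrt(pi_0 * (1 - pi_0) / n)
    return (p_hat - pi_0) / sd_p_hat


# Hospital data: 19.7% of 361 patients died; hypothesized rate is .15.
z = z_statistic(p_hat=0.197, pi_0=0.15, n=361)
print(round(z, 2))  # 2.5
```

Note that the standard deviation uses the hypothesized value .15, not the observed .197, because the calculation assumes the null hypothesis is true.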


4. How unusual is this discrepancy, assuming the null hypothesis is true?

Probability p̂ > .197 = probability Z > 2.50 = .0062 (matches direction of alternative hypothesis)

“p-value”, assuming π = .15

The smaller the p-value, the stronger the evidence against the null hypothesis

Would get such an extreme sample proportion in only .6% of random samples if π = .15.
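The p-value in step 4 is an upper-tail area under the standard normal curve, so it can be computed without a table. A minimal sketch using Python's standard-library `statistics.NormalDist`:

```python
from statistics import NormalDist

z = 2.50  # test statistic from step 3

# Upper tail, because the alternative hypothesis is "greater than .15".
p_value = 1 - NormalDist().cdf(z)
print(round(p_value, 4))  # 0.0062
```

For an alternative in the "less than" direction, the lower tail `NormalDist().cdf(z)` would be used instead.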


5. Draw a conclusion

Use same cut-off guidelines

E.g., p-value < .05: statistically significant evidence against the null hypothesis; “Reject the null hypothesis”

If p-value is considered large, we “fail to reject the null hypothesis”

Test of Significance

0. Define parameter of interest

1. Check technical conditions

2. State null and alternative hypotheses about the population parameter of interest (in symbols and in words)

3. Calculate test statistic

4. Determine p-value

Be able to interpret

5. State conclusion (about the null hypothesis)

Sketch a picture of the sampling distribution!

In English!
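Steps 1, 3, and 4 of this outline can be collected into one small function for a single proportion. This is a sketch under stated assumptions, not a standard library routine: the name `one_proportion_z_test` and its `alternative` argument are hypothetical, and defining the parameter, stating hypotheses, and writing the conclusion in English remain the analyst's job.

```python
import math
from statistics import NormalDist


def one_proportion_z_test(p_hat, n, pi_0, alternative="greater"):
    """Return (z, p_value) for H0: pi = pi_0 using the normal approximation."""
    # Step 1: size conditions (random sampling must be judged separately).
    if not (n * pi_0 > 10 and n * (1 - pi_0) > 10):
        raise ValueError("sample too small for the normal approximation")

    # Step 3: test statistic.
    z = (p_hat - pi_0) / math.sqrt(pi_0 * (1 - pi_0) / n)

    # Step 4: p-value in the direction of the alternative hypothesis.
    lower = NormalDist().cdf(z)
    if alternative == "greater":
        p_value = 1 - lower
    elif alternative == "less":
        p_value = lower
    elif alternative == "two-sided":
        p_value = 2 * min(lower, 1 - lower)
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value


# Hospital example: H0: pi = .15 versus Ha: pi > .15.
z, p_value = one_proportion_z_test(p_hat=0.197, n=361, pi_0=0.15)
print(round(z, 2), round(p_value, 4))  # 2.5 0.0062
```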

Example 1: Cohen v. Brown University

Observational units = Brown University athletes; variable = gender

Population = process of males and females becoming athletes at Brown

Parameter of interest = probability of a Brown University athlete being female, π

H0: π = .51 (proportional to student body)

Ha: π < .51 (smaller prob of an athlete being F)

Example 1: Cohen v. Brown Univ

Give the university the benefit of the doubt

Does the CLT apply? 897(.51) = 457.5 > 10 and 897(.49) = 439.5 > 10

Consider this a “random sample” from this process

Test statistic: z = -7.79

P-value < .000000287 (from Table A)

Reject H0: very unlikely to have only 38% females in the sample if randomly selected from a population with 51%

Strong evidence that π < .51
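As a check on the reported test statistic, the calculation can be written out using the rounded sample proportion of 38% quoted above (the exact count of female athletes is not given here, so .38 is an approximation):

```latex
z = \frac{\hat{p} - \pi}{\sqrt{\pi(1-\pi)/n}}
  = \frac{.38 - .51}{\sqrt{.51(.49)/897}}
  \approx \frac{-.13}{.0167}
  \approx -7.79
```

The negative sign matches the direction of the alternative hypothesis (π < .51), so the p-value is the lower-tail area below −7.79.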

Example 2: Kissing the Right Way

“Human Behavior: Adult Persistence of Head-Turning Asymmetry,” Güntürkün, O. Nature, 421: 771, 2003.

Example 2: Kissing the Right Way

Let π represent the probability of a kissing couple turning to the right

H0: π = .5 (equally likely left or right)

Ha: π > .5 (majority of couples turn right)

CLT? .5(124) = 62 > 10, assuming it’s a representative sample

Test statistic: z = 3.23

P-value = .000619 < .05

Reject H0

Conclude a majority of couples turn to the right, as long as we believe this is a representative sample

Example 3: Body Temperatures

Observational units: healthy adults

Variable: temperature (quantitative)

Parameter: let μ represent the average body temperature of a healthy adult

Sample size is large (n = 130 > 30) and we are assuming this is a representative sample

H0: μ = 98.6°F

Ha….

Test statistic: -6.22…

Behavior of $\frac{\bar{x} - \mu}{s/\sqrt{n}}$

$\frac{\bar{x} - \mu}{s/\sqrt{n}}$ versus $\frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$

t distribution, Table B

Minitab/applet (p. 5)
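A short simulation can show why standardizing with the sample standard deviation s instead of σ calls for the t distribution. The sketch below is an illustration under assumptions: it samples from a hypothetical standard normal population with a deliberately small n = 5, and the helper names are made up.

```python
import math
import random


def standardized_means(n, reps, mu=0.0, sigma=1.0, seed=0):
    """Standardize each sample mean twice: once with sigma, once with the sample SD s."""
    rng = random.Random(seed)
    with_sigma, with_s = [], []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        s = math.sqrt(sum((x - x_bar) ** 2 for x in sample) / (n - 1))
        with_sigma.append((x_bar - mu) / (sigma / math.sqrt(n)))
        with_s.append((x_bar - mu) / (s / math.sqrt(n)))
    return with_sigma, with_s


def tail_fraction(values, cutoff=1.96):
    """Fraction of values farther than `cutoff` from zero."""
    return sum(abs(v) > cutoff for v in values) / len(values)


z_like, t_like = standardized_means(n=5, reps=20_000)
print("using sigma:", tail_fraction(z_like))  # close to the normal value of .05
print("using s    :", tail_fraction(t_like))  # noticeably larger: heavier tails
```

The extra variability from estimating σ with s puts more of the distribution in the tails, which is what the t distribution (with n − 1 degrees of freedom) accounts for. The difference shrinks as n grows.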

Example 4: Golden Rectangles

Sample is skewed to the right

Evidence that population is skewed to the right as well

Population is probably not normal, so need large n.

Sample size is 20, which isn’t all that large.

Proceed with caution

Example 4: Golden Rectangles

Let μ represent the mean ratio used by American Indians

H0: μ = .618 (same ratio on average)

Ha: μ ≠ .618 (ratio used by American Indians differs)

t = 2.05 with df = 19

P-value = .054

Weak evidence against H0

Not overwhelmingly convincing that the mean ratio used by American Indians differs from .618.
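The two-sided p-value for t = 2.05 with 19 degrees of freedom would normally come from Table B or software such as Minitab. As a rough cross-check that needs only the standard library, the sketch below numerically integrates the t density with Simpson's rule; this is a swapped-in technique for illustration, and the function names are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of the t distribution with `df` degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_two_sided_p_value(t, df, steps=2000):
    """Area outside (-|t|, |t|), using Simpson's rule on [0, |t|]; `steps` must be even."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_a = total * h / 3
    # The density is symmetric, so the central area is twice the area from 0 to |t|.
    return 1 - 2 * area_zero_to_a


p_two_sided = t_two_sided_p_value(2.05, df=19)
print(round(p_two_sided, 3))  # should be close to the reported .054
```

Because the alternative hypothesis is two-sided (μ ≠ .618), both tails are counted, which is why the p-value lands just above the .05 cut-off.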

For Tuesday

Read Ch. 5 (confidence intervals)

Complete PP 9

Turn in second project report (see syllabus for details)

Start HW 5

Online anonymous Blackboard survey by Thursday
