exercises - department of statistical sciencesfisher.utstat.toronto.edu/~hadas/sta220/lecture...


Post on 06-Jun-2020


Exercises - Department of Statistical Sciences, fisher.utstat.toronto.edu/~hadas/STA220/Lecture notes/week9.pdf

week 9 1

Exercises

1. Z ~ N(0, 1). Find P(-1.96 < Z < 1.96).

2. Z ~ N(0, 1). Find the value of c such that P(-c < Z < c) = 0.95.

3. Z ~ N(0, 1). Find the value of c such that P(-c < Z < c) = 0.90.

4. X ~ N(500, 15). Find the values of c and d such that P(c < X < d ) = 0.95.
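One way to check answers to Exercises 1 to 4 numerically is sketched below with Python's standard-library `statistics.NormalDist`. The sketch assumes that the second argument in N(500, 15) is the standard deviation and that the interval asked for in Exercise 4 is the symmetric one around the mean (other choices of c and d also have probability 0.95). The helper name `central_z` is illustrative only.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal, N(0, 1)

# Exercise 1: P(-1.96 < Z < 1.96)
prob = Z.cdf(1.96) - Z.cdf(-1.96)

def central_z(C):
    """Return c such that P(-c < Z < c) = C (area C in the middle)."""
    return Z.inv_cdf((1 + C) / 2)

c95 = central_z(0.95)  # Exercise 2, about 1.96
c90 = central_z(0.90)  # Exercise 3, about 1.645

# Exercise 4: X ~ N(500, 15), with 15 read as the standard deviation (assumption)
mu, sigma = 500, 15
c, d = mu - c95 * sigma, mu + c95 * sigma

print(round(prob, 4), round(c95, 3), round(c90, 3), round(c, 1), round(d, 1))
```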


5. X ~ N(μ, σ). Find the values of c and d (in terms of μ and σ) such that P(c < X < d) = 0.95.

6. X ~ N(μ, σ). Find the values of c and d (in terms of μ and σ) such that P(c < X < d) = 0.90.

7. X ~ N(500, 15). Let $\bar{X}$ be the mean of a random sample of size 9. Find the values of c and d such that P(c < $\bar{X}$ < d) = 0.95.

8. X ~ N(μ, σ). Let $\bar{X}$ be the mean of a random sample of size n. Find the values of c and d such that P(c < $\bar{X}$ < d) = 0.95.
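A minimal numerical sketch for Exercise 7 follows. It assumes 15 is the population standard deviation and takes the interval symmetric about the mean; the only change from the single-observation case is that the spread of the sample mean is σ/√n rather than σ.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 500, 15, 9              # 15 is read as the standard deviation (assumption)
z_star = NormalDist().inv_cdf(0.975)   # central 95% leaves 2.5% in each tail

se = sigma / sqrt(n)                   # standard deviation of the sample mean: 15 / 3
c, d = mu - z_star * se, mu + z_star * se

print(round(c, 1), round(d, 1))
```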


Point Estimates and CI

• A basic tool in statistical inference is a point estimate of the population parameter. However, an estimate without an indication of its variability is of little value.

• Example:

Parameter    Estimate     Std. Error
μ            $\bar{X}$
σ²           S²
p            $\hat{p}$

• A level C confidence interval for a parameter is an interval computed from sample data by a method that has probability C of producing an interval containing the true value of the parameter.


Confidence interval for the population mean

• Choose an SRS of size n from a population having unknown mean μ and known stdev. σ. A level C confidence interval for μ is an interval of the form

$$\left(\bar{x} - z^* \frac{\sigma}{\sqrt{n}},\ \bar{x} + z^* \frac{\sigma}{\sqrt{n}}\right), \quad \text{i.e.} \quad \bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}$$

• Here z* is the value on the standard normal curve with area C between -z* and z*. The interval is exact when the population distribution is normal and approximately correct for large n in other cases.

• In general CIs have the form: Estimate ± margin of error.

• In the above case,

$$\text{Margin of error} = m = z^* \frac{\sigma}{\sqrt{n}}$$


• Note, in the above formula for the CI for the population mean, $\sigma/\sqrt{n}$ is the stdev. of the sample mean $\bar{X}$ (this is also known as the std. error of the sample mean), and the CI can also be written as

$$\bar{x} \pm z^* \cdot \text{Std.Error}(\bar{X})$$

• The width of any CI is L = 2m, i.e. twice the margin of error.

• Here are three ways to reduce the margin of error (and the width of the CI):

    Use a lower level of confidence (smaller C).
    Increase the sample size n.
    Reduce σ (usually not possible).
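The three ways of shrinking the margin of error can be seen numerically in the sketch below. The baseline values (σ = 10, n = 25, C = 0.95) are hypothetical and chosen only for illustration, and `margin_of_error` is an illustrative helper name.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(sigma, n, C):
    """m = z* * sigma / sqrt(n) for a level-C interval with known sigma."""
    z_star = NormalDist().inv_cdf((1 + C) / 2)
    return z_star * sigma / sqrt(n)

base = margin_of_error(sigma=10, n=25, C=0.95)         # hypothetical baseline
lower_conf = margin_of_error(sigma=10, n=25, C=0.90)   # smaller C
bigger_n = margin_of_error(sigma=10, n=100, C=0.95)    # 4 times the sample size
smaller_sd = margin_of_error(sigma=5, n=25, C=0.95)    # half the sigma

print(round(base, 3), round(lower_conf, 3), round(bigger_n, 3), round(smaller_sd, 3))
```

Because n enters through a square root, quadrupling the sample size only halves the margin of error, whereas halving σ halves it directly.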


Sample size for desired margin of error

• The CI for the population mean will have a specified margin of error m when the sample size is

$$n = \left(\frac{z^* \sigma}{m}\right)^2$$

• Example: A limnologist wishes to estimate the mean phosphate content per unit volume of lake water. It is known from previous studies that the stdev. has a fairly stable value of 4 mg. How many water samples must the limnologist analyze to be 90% certain that the error of estimation does not exceed 0.8 mg?
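The limnologist example can be worked through as in the following sketch, which applies the sample-size formula and rounds up to the next whole sample (rounding up is the usual convention so that the margin does not exceed m). The function name `sample_size_for_mean` is illustrative.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, m, C):
    """Smallest n with z* * sigma / sqrt(n) <= m, i.e. n >= (z* * sigma / m)^2."""
    z_star = NormalDist().inv_cdf((1 + C) / 2)
    return math.ceil((z_star * sigma / m) ** 2)

# sigma = 4 mg, desired margin 0.8 mg, 90% confidence
n_required = sample_size_for_mean(sigma=4, m=0.8, C=0.90)
print(n_required)
```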


Exercise 6.10 on p397 IPS

• You want to rent an unfurnished one-bedroom apartment for next semester. The mean monthly rent for a random sample of 10 apartments advertised in the local newspaper is $580. Assume that the stdev. is $90. Find a 95% CI for the mean monthly rent for unfurnished one-bedroom apartments available for rent in this community.

• How large a sample of one-bedroom apartments would be needed to estimate the mean µ within ±$20 with 90% confidence?
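A sketch of both parts of this exercise, assuming the stated $90 is the known population stdev. and that the z interval applies; variable names are illustrative.

```python
import math
from statistics import NormalDist

def z_star(C):
    """Critical value with area C between -z* and z*."""
    return NormalDist().inv_cdf((1 + C) / 2)

xbar, sigma, n = 580, 90, 10

# 95% CI for the mean monthly rent
m = z_star(0.95) * sigma / math.sqrt(n)
ci = (xbar - m, xbar + m)

# Sample size for a margin of $20 with 90% confidence, rounded up
n_needed = math.ceil((z_star(0.90) * sigma / 20) ** 2)

print(tuple(round(v, 1) for v in ci), n_needed)
```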


Exercise 6.19 on p398 IPS

• The question gives the data on the Degree of Reading Power (DRP) scores for 44 students. A 95% CI for the population mean score is given in the MINITAB output below.

DRP Scores
40 26 39 14 42 18 25 43 46 27 19 47 19 26 35 34 15 44 40 38 31 46
52 25 35 35 33 29 34 41 49 28 52 47 35 48 22 33 41 51 27 14 54 45

Z Confidence Intervals
The assumed sigma = 11.0

Variable    N   Mean   StDev  SE Mean  95.0 % CI
DRP Scor   44  35.09  11.19     1.66  (31.84, 38.34)

• MINITAB Command: Stat > Basic Statistics > 1 Sample Z and select ‘Confidence interval’
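The MINITAB interval can be reproduced from the listed scores with a few lines of Python. This is only a cross-check using the assumed σ = 11.0 from the output, not MINITAB's own procedure.

```python
from math import sqrt
from statistics import NormalDist, mean

drp = [40, 26, 39, 14, 42, 18, 25, 43, 46, 27, 19, 47, 19, 26, 35, 34, 15, 44, 40, 38, 31, 46,
       52, 25, 35, 35, 33, 29, 34, 41, 49, 28, 52, 47, 35, 48, 22, 33, 41, 51, 27, 14, 54, 45]

sigma = 11.0                            # assumed known, as in the MINITAB output
n = len(drp)
xbar = mean(drp)
se_mean = sigma / sqrt(n)               # "SE Mean" column
z_star = NormalDist().inv_cdf(0.975)    # 95% confidence

lo, hi = xbar - z_star * se_mean, xbar + z_star * se_mean
print(n, round(xbar, 2), round(se_mean, 2), (round(lo, 2), round(hi, 2)))
```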


Exercise

A random sample of 85 students in Chicago city high schools took a course designed to improve SAT scores. Based on these students, a 90% CI for the mean improvement in SAT scores for all Chicago high school students is computed as (72.3, 91.4) points. Which of the following statements are true?

a) 90% of the students in the sample improved their scores by between 72.3 and 91.4 points.

b) 90% of the students in the population improved their scores by between 72.3 and 91.4 points.

c) A 95% CI will contain the value 72.3.

d) The margin of error of the 90% CI above is 9.55.

e) A 90% CI based on a sample of 340 (85 × 4) students will have margin of error 9.55/4.


CIs for the population proportion p

• Choose an SRS of size n from a population having unknown proportion p of successes. An approximate level C confidence interval for p is

$$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

Again z* is the value on the standard normal curve with area C between -z* and z*.

• Note 1: Std. error of the sample proportion is

$$SE(\hat{p}) = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

• Note 2: Margin of error of this CI is

$$m = z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

• The above CI can be written as

$$\hat{p} \pm z^* \, SE(\hat{p})$$

• Use this interval when the number of successes and the number of failures are both at least 15.


• When the sample sizes are small use either tables of exact CIs or approximate CIs based on Wilson’s estimate given by

$$\tilde{p} \pm z^* \, SE(\tilde{p})$$

where

$$\tilde{p} = \frac{X+2}{n+4} \quad \text{and} \quad SE(\tilde{p}) = \sqrt{\frac{\tilde{p}(1-\tilde{p})}{n+4}}$$

• Note, Wilson’s estimate is called the plus four estimate.

• Read Pages 539-540 in IPS.
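A small sketch of the plus four interval is given below. The data (3 successes in 12 trials) are hypothetical, chosen only because the success count is too small for the large-sample interval; `plus_four_interval` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist

def plus_four_interval(x, n, C=0.95):
    """Plus four (Wilson) estimate and its level-C interval."""
    p_tilde = (x + 2) / (n + 4)                    # add 2 successes and 2 failures
    se = sqrt(p_tilde * (1 - p_tilde) / (n + 4))
    z_star = NormalDist().inv_cdf((1 + C) / 2)
    return p_tilde, (p_tilde - z_star * se, p_tilde + z_star * se)

# Hypothetical small sample: 3 successes in 12 trials
p_tilde, (lo, hi) = plus_four_interval(x=3, n=12)
print(p_tilde, round(lo, 3), round(hi, 3))
```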


Example

• In a sample of 400 computer memory chips made at Digital Devices, Inc., 40 were found to be defective. Give a 95% confidence interval for the proportion of defective chips in the population from which the sample was taken.
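As a sketch, the large-sample interval for this example can be computed as follows; with 40 defective and 360 non-defective chips, both counts are at least 15, so the large-sample formula is taken to apply.

```python
from math import sqrt
from statistics import NormalDist

x, n = 40, 400
p_hat = x / n                            # sample proportion of defective chips
se = sqrt(p_hat * (1 - p_hat) / n)       # SE(p-hat)
z_star = NormalDist().inv_cdf(0.975)     # 95% confidence

m = z_star * se                          # margin of error
lo, hi = p_hat - m, p_hat + m
print(p_hat, round(se, 4), round(lo, 4), round(hi, 4))
```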


Sample size for desired Margin of error

• The level C confidence interval will have margin of error approximately equal to the specified margin of error m when the sample size n is

$$n = \left(\frac{z^*}{m}\right)^2 p^*(1-p^*)$$

• Here z* is the critical value for the confidence level C and p* is a guessed value for the proportion of successes in a future sample.

• The margin of error will be less than or equal to m if p* is chosen to be 0.5. The sample size required is then given by

$$n = \left(\frac{z^*}{2m}\right)^2$$
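The two sample-size formulas can be compared in a short sketch. The target margin of 0.03 at 95% confidence and the guess p* = 0.2 are hypothetical values for illustration; results are rounded up to whole observations.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(m, C, p_star=0.5):
    """n = (z*/m)^2 * p*(1 - p*), rounded up; p* = 0.5 is the conservative choice."""
    z_star = NormalDist().inv_cdf((1 + C) / 2)
    return math.ceil((z_star / m) ** 2 * p_star * (1 - p_star))

n_conservative = sample_size_for_proportion(m=0.03, C=0.95)            # p* = 0.5
n_with_guess = sample_size_for_proportion(m=0.03, C=0.95, p_star=0.2)  # hypothetical guess

# With p* = 0.5 the general formula reduces to (z* / (2m))^2
z95 = NormalDist().inv_cdf(0.975)
n_closed_form = math.ceil((z95 / (2 * 0.03)) ** 2)

print(n_conservative, n_with_guess, n_closed_form)
```

Since p*(1 − p*) is largest at p* = 0.5, any other guess gives a smaller required n.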


Example

The Gallup Poll asked a sample of 1785 U.S. adults, “Did you, yourself, happen to attend church or synagogue in the last 7 days?” Of the respondents, 750 said “Yes.” Suppose (it is not, in fact, true) that Gallup's sample was an SRS.

(a) Give a 99% confidence interval for the proportion of all U.S. adults who attended church or synagogue during the week preceding the poll.

(b) Do the results provide good evidence that less than half of the population attended church or synagogue?

(c) How large a sample would be required to obtain a margin of error of 0.01 in a 99% confidence interval for the proportion who attend church or synagogue? (Use Gallup's result as the guessed value of p).
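The calculations for parts (a) to (c) might be set up as in the sketch below, treating the sample as an SRS as the example instructs. For part (b) the sketch only checks whether the whole 99% interval lies below 0.5, which is one informal way to read the evidence; a formal test of H0: p = 0.5 would be a separate calculation.

```python
import math
from statistics import NormalDist

x, n = 750, 1785
p_hat = x / n
z_star = NormalDist().inv_cdf(0.995)        # 99% confidence: 0.5% in each tail

# (a) large-sample 99% CI
se = math.sqrt(p_hat * (1 - p_hat) / n)
lo, hi = p_hat - z_star * se, p_hat + z_star * se

# (b) does the whole interval sit below one half?
interval_below_half = hi < 0.5

# (c) sample size for margin 0.01, using Gallup's result as the guessed p*
n_needed = math.ceil((z_star / 0.01) ** 2 * p_hat * (1 - p_hat))

print(round(lo, 3), round(hi, 3), interval_below_half, n_needed)
```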


Exercise

Assume that a U.S. study and a Canadian study to estimate the proportion of adults in favour of capital punishment are conducted using simple random samples (not really practical). Assume the true unknown proportions in the 2 countries are fairly similar. The U.S. survey uses a sample 9 times bigger than the Canadian sample. Both samples are quite large. The U.S. population is 9 times bigger than the Canadian population. The Canadian confidence interval will be:

a) 9 times wider than the U.S. confidence interval
b) 3 times wider than the U.S. confidence interval
c) the same width as the U.S. confidence interval
d) 9 times smaller than the U.S. confidence interval
e) 3 times smaller than the U.S. confidence interval


Statistical tests for the population mean (σ known)

• A significance test is a formal procedure for comparing observed data with a hypothesis whose truth we want to assess. The hypothesis is a statement about the parameters in a population or model.

• Null hypothesis The statement being tested in a test of significance is called the null hypothesis. The test of significance is designed to assess the strength of the evidence against the null hypothesis. Usually the null hypothesis is a statement of “no effect” or “no difference”.

• We abbreviate “null hypothesis” as H0 .


Example

Each of the following situations requires a significance test about a population mean μ. State the appropriate null hypothesis H0 and alternative hypothesis Ha in each case.

(a) The mean area of the several thousand apartments in a new development is advertised to be 1250 square feet. A tenant group thinks that the apartments are smaller than advertised. They hire an engineer to measure a sample of apartments to test their suspicion.

(b) Larry's car averages 32 miles per gallon on the highway. He now switches to a new motor oil that is advertised as increasing gas mileage. After driving 3000 highway miles with the new oil, he wants to determine if his gas mileage actually has increased.

(c) The diameter of a spindle in a small motor is supposed to be 5 millimeters. If the spindle is either too small or too large, the motor will not perform properly. The manufacturer measures the diameter in a sample of motors to determine whether the mean diameter has moved away from the target.


Test Statistic

• The test is based on a statistic that estimates the parameter that appears in the hypotheses. Usually this is the same estimate we would use in a confidence interval for the parameter. When H0 is true, we expect the estimate to take a value near the parameter value specified in H0.

• Values of the estimate far from the parameter value specified by H0 give evidence against H0. The alternative hypothesis determines which directions count against H0.

• A test statistic measures compatibility between the null hypothesis and the data.

• We use it for the probability calculation that we need for our test of significance

• It is a random variable with a distribution that we know.


Example

• An air freight company wishes to test whether or not the mean weight of parcels shipped on a particular route exceeds 10 pounds. A random sample of 49 shipping orders was examined and found to have an average weight of 11 pounds. Assume that the stdev. of the weights (σ) is 2.8 pounds.

• The null and alternative hypotheses in this problem are: H0: μ = 10 ; Ha: μ > 10.

• The test statistic for this problem is the standardized version of $\bar{X}$:

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

• Decision: ?


P-value and Significance level

• The probability, computed under the assumption that H0 is true, that the test statistic would take a value as extreme or more extreme than that actually observed is called the P-value of the test. The smaller the P-value, the stronger the evidence against H0 provided by the data.

• The decisive value of P is called the significance level. It is denoted by α.

• Statistical significance: If the P-value is as small or smaller than α, we reject H0 and say that the data are statistically significant at level α.

• The P-value is the smallest level α at which the data are significant.


Z Test for a population mean (σ known)

• To test the hypothesis H0: µ = µ0 based on an SRS of size n from a population with unknown mean µ and known stdev σ, compute the test statistic

$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$

• In terms of a standard Normal variable Z, the P-value for the test of H0 against

    Ha : µ > µ0 is P( Z ≥ z )
    Ha : µ < µ0 is P( Z ≤ z )
    Ha : µ ≠ µ0 is 2·P( Z ≥ |z| )

• These P-values are exact if the population distribution is normal and are approximately correct for large n in other cases.
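The three P-value rules can be collected into one small function, sketched here and applied to the air freight example from earlier in these notes (x̄ = 11, µ0 = 10, σ = 2.8, n = 49, Ha: µ > 10). The function name `z_test` and the `alternative` labels are illustrative choices, not MINITAB's interface.

```python
from math import sqrt
from statistics import NormalDist

def z_test(xbar, mu0, sigma, n, alternative):
    """Return (z, P-value) for the one-sample z test with known sigma."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    Z = NormalDist()
    if alternative == "greater":        # Ha: mu > mu0
        p = 1 - Z.cdf(z)
    elif alternative == "less":         # Ha: mu < mu0
        p = Z.cdf(z)
    elif alternative == "two-sided":    # Ha: mu != mu0
        p = 2 * (1 - Z.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    return z, p

z, p = z_test(xbar=11, mu0=10, sigma=2.8, n=49, alternative="greater")
print(round(z, 2), round(p, 4))
```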


Critical value approach

• We can base our test conclusions on a fixed significance level α without computing the P-value.

• For this we need to find a critical value z* from the standard normal distribution with a specified tail area (to the right or left depending on Ha). This tail area is called the rejection region.

• If the test statistic falls in the rejection region we reject H0 and conclude that the data are statistically significant at level α.

• A P-value is more informative than a reject-or-not finding at a fixed significance level because it can tell us about the strength of the evidence we found against H0.


Example

• The Pfft Light Bulb Company claims that the mean life of its 2 watt bulbs is 1300 hours. Suspecting that the claim is too high, Nalph Rader gathered a random sample of 64 bulbs and tested each. He found the average life to be 1295 hours. Test the company's claim using α = 0.01. Assume σ = 20 hours.
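A sketch of this example using both the critical value approach and the P-value, assuming the hypotheses are H0: μ = 1300 against Ha: μ < 1300 (the suspicion is that the claim is too high):

```python
from math import sqrt
from statistics import NormalDist

mu0, xbar, sigma, n, alpha = 1300, 1295, 20, 64, 0.01

z = (xbar - mu0) / (sigma / sqrt(n))      # test statistic

# Critical value approach: rejection region is the lower tail of area alpha
z_crit = NormalDist().inv_cdf(alpha)      # about -2.326
reject_by_critical_value = z <= z_crit

# P-value approach for Ha: mu < mu0
p_value = NormalDist().cdf(z)
reject_by_p_value = p_value <= alpha

print(z, round(z_crit, 3), round(p_value, 4), reject_by_critical_value, reject_by_p_value)
```

Both approaches always lead to the same decision; the P-value additionally shows how close the data came to the rejection region.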


Exercise

• A standard intelligence examination has been given for several years with an average score of 80 and a standard deviation of 7. If 25 students, taught with special emphasis on reading skill, obtain a mean grade of 83 on the examination, is there reason to believe that the special emphasis changes the result on the test? Use α = 0.05.


Exercise 6.57 on p421 IPS

• The question gives the data on the Degree of Reading Power (DRP) scores for 44 students. The MINITAB output for the test is given below.

Z-Test
Test of mu = 32.00 vs mu > 32.00
The assumed sigma = 11.0

Variable    N   Mean   StDev  SE Mean     Z      P
DRP Scor   44  35.09  11.19     1.66  1.86  0.031

• MINITAB Command: Stat > Basic Statistics > 1 Sample Z and select ‘Test mean’
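The Z and P columns of this output can be cross-checked from the summary values shown (N = 44, Mean = 35.09, assumed sigma = 11.0). This is a sketch of the arithmetic only; using the rounded mean from the printout is an assumption that does not affect the two-decimal result.

```python
from math import sqrt
from statistics import NormalDist

n, xbar, sigma, mu0 = 44, 35.09, 11.0, 32.0   # values read off the MINITAB output

z = (xbar - mu0) / (sigma / sqrt(n))          # "Z" column
p_value = 1 - NormalDist().cdf(z)             # Ha: mu > 32, upper-tail area ("P" column)

print(round(z, 2), round(p_value, 3))
```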


Confidence Intervals and two-sided tests

• A level α two-sided significance test rejects a hypothesis H0: μ = μ0 exactly when the value μ0 falls outside the 1 − α confidence interval for μ.

• Example: For the intelligence examination exercise above, a 95% CI is 83 ± 1.96·(7/5) = (80.256, 85.744). The value 80 is not in this interval, and so we reject H0: μ = 80 at the 5% level of significance.
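The agreement between the two-sided test and the confidence interval can be checked directly for the intelligence examination numbers (x̄ = 83, μ0 = 80, σ = 7, n = 25). The sketch uses the unrounded critical value rather than 1.96, so the endpoints differ slightly in the later decimals.

```python
from math import sqrt
from statistics import NormalDist

mu0, xbar, sigma, n, alpha = 80, 83, 7, 25, 0.05
Z = NormalDist()
se = sigma / sqrt(n)

# Two-sided z test
z = (xbar - mu0) / se
p_two_sided = 2 * (1 - Z.cdf(abs(z)))
rejects_by_test = p_two_sided <= alpha

# 1 - alpha confidence interval
z_star = Z.inv_cdf(1 - alpha / 2)
lo, hi = xbar - z_star * se, xbar + z_star * se
rejects_by_ci = not (lo <= mu0 <= hi)

print(round(z, 3), round(p_two_sided, 4), (round(lo, 2), round(hi, 2)))
print(rejects_by_test, rejects_by_ci)
```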


Large sample signif. tests for a population proportion

• Draw an SRS of size n from a large population with unknown proportion p of successes. To test the null hypothesis H0: p = p0, compute the z statistic

$$z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1-p_0)}{n}}}$$

• In terms of a standard normal random variable Z, the approximate P-value for the test of H0 against

    Ha : p > p0 is P( Z ≥ z )
    Ha : p < p0 is P( Z ≤ z )
    Ha : p ≠ p0 is 2·P( Z ≥ |z| )

• Use the large-sample z test as long as the expected number of successes, np0, and the expected number of failures, n(1 − p0), are both greater than 10.


Example

Leroy, a starting player for a major college basketball team, made only 38.4% of his free throws last season. During the summer he worked on developing a softer shot in the hope of improving his free-throw accuracy. In the first eight games of this season Leroy made 25 free throws in 40 attempts. Let p be his probability of making each free throw he shoots this season.

(a) State the null hypothesis H0 that Leroy's free-throw probability has remained the same as last year and the alternative Ha that his work in the summer resulted in a higher probability of success.

(b) Calculate the z statistic for testing H0 versus Ha.

(c) Do you accept or reject H0 for α = 0.05? Find the P-value.

(d) Give a 90% confidence interval for Leroy's free-throw success probability for the new season. Are you convinced that he is now a better free-throw shooter than last season?

(e) What assumptions are needed for the validity of the test and confidence interval calculations that you performed?
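Parts (b) to (d) could be computed along the following lines, using the large-sample formulas from these notes with H0: p = 0.384 and Ha: p > 0.384. Note that the test uses p0 in its standard error while the confidence interval uses the sample proportion; the shots are assumed to be independent with a constant success probability.

```python
from math import sqrt
from statistics import NormalDist

x, n, p0 = 25, 40, 0.384
p_hat = x / n
Z = NormalDist()

# (b) z statistic: standard error computed under H0
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# (c) approximate P-value for Ha: p > p0
p_value = 1 - Z.cdf(z)
reject_at_5_percent = p_value <= 0.05

# (d) large-sample 90% CI: standard error computed from p_hat
z_star = Z.inv_cdf(0.95)
se_hat = sqrt(p_hat * (1 - p_hat) / n)
lo, hi = p_hat - z_star * se_hat, p_hat + z_star * se_hat

print(round(z, 2), round(p_value, 5), reject_at_5_percent, (round(lo, 3), round(hi, 3)))
```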


• MINITAB gives the exact p-value for the test.

• Commands: Stats > Basic Statistics > 1 Proportion

• MINITAB output for the above example is given below.

Test and Confidence Interval for One Proportion

Test of p = 0.384 vs p > 0.384

                                                      Exact
Sample   X    N   Sample p   90.0 % CI                P-Value
1       25   40   0.625000   (0.482752, 0.752705)     0.002
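The exact P-value in this output can be approximated by hand as a binomial tail probability. The sketch below assumes that MINITAB's "exact" P-value for Ha: p > 0.384 is P(X ≥ 25) for X ~ Binomial(40, 0.384); it does not attempt to reproduce the exact confidence interval, and `binomial_upper_tail` is an illustrative name.

```python
from math import comb

def binomial_upper_tail(x, n, p0):
    """P(X >= x) for X ~ Binomial(n, p0)."""
    return sum(comb(n, k) * p0**k * (1 - p0)**(n - k) for k in range(x, n + 1))

exact_p = binomial_upper_tail(25, 40, 0.384)
print(round(exact_p, 3))
```

The exact tail is somewhat larger than the normal approximation gives, which is why the reported value differs from the large-sample P-value.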