
Page 1: Proportions and Percentiles for Standard Normal Scorespersonal.psu.edu/drh20/100/spring2004/exams/all3.pdfIn the infamous poll conducted by Literary Digest, ten million questionnaires

STAT 100, Section 2 Final exam practice questions Spring, 2004

Proportions and Percentiles for Standard Normal Scores

Standard   Proportion                Standard   Proportion
score, z   below z      Percentile   score, z   below z      Percentile
−3.00      0.0013       0.13         0.03       0.51         51
−2.576     0.005        0.50         0.05       0.52         52
−2.33      0.01         1            0.08       0.53         53
−2.05      0.02         2            0.10       0.54         54
−1.96      0.025        2.5          0.13       0.55         55
−1.88      0.03         3            0.15       0.56         56
−1.75      0.04         4            0.18       0.57         57
−1.64      0.05         5            0.20       0.58         58
−1.55      0.06         6            0.23       0.59         59
−1.48      0.07         7            0.25       0.60         60
−1.41      0.08         8            0.28       0.61         61
−1.34      0.09         9            0.31       0.62         62
−1.28      0.10         10           0.33       0.63         63
−1.23      0.11         11           0.36       0.64         64
−1.17      0.12         12           0.39       0.65         65
−1.13      0.13         13           0.41       0.66         66
−1.08      0.14         14           0.44       0.67         67
−1.04      0.15         15           0.47       0.68         68
−0.99      0.16         16           0.50       0.69         69
−0.95      0.17         17           0.52       0.70         70
−0.92      0.18         18           0.55       0.71         71
−0.88      0.19         19           0.58       0.72         72
−0.84      0.20         20           0.61       0.73         73
−0.81      0.21         21           0.64       0.74         74
−0.77      0.22         22           0.67       0.75         75
−0.74      0.23         23           0.71       0.76         76
−0.71      0.24         24           0.74       0.77         77
−0.67      0.25         25           0.77       0.78         78
−0.64      0.26         26           0.81       0.79         79
−0.61      0.27         27           0.84       0.80         80
−0.58      0.28         28           0.88       0.81         81
−0.55      0.29         29           0.92       0.82         82
−0.52      0.30         30           0.95       0.83         83
−0.50      0.31         31           0.99       0.84         84
−0.47      0.32         32           1.04       0.85         85
−0.44      0.33         33           1.08       0.86         86
−0.41      0.34         34           1.13       0.87         87
−0.39      0.35         35           1.17       0.88         88
−0.36      0.36         36           1.23       0.89         89
−0.33      0.37         37           1.28       0.90         90
−0.31      0.38         38           1.34       0.91         91
−0.28      0.39         39           1.41       0.92         92
−0.25      0.40         40           1.48       0.93         93
−0.23      0.41         41           1.55       0.94         94
−0.20      0.42         42           1.64       0.95         95
−0.18      0.43         43           1.75       0.96         96
−0.15      0.44         44           1.88       0.97         97
−0.13      0.45         45           1.96       0.975        97.5
−0.10      0.46         46           2.05       0.98         98
−0.08      0.47         47           2.33       0.99         99
−0.05      0.48         48           2.576      0.995        99.5
−0.03      0.49         49           3.00       0.9987       99.87
0.00       0.50         50           3.75       0.9999       99.99

Question 1. In a study of the serum DHEA-S level for a random sample of 30 people who practice transcendental meditation, it was found that the sample mean DHEA-S level was 117.0 and the sample standard deviation was 21.0. Give a 95% confidence interval for the true population mean.

(A) $117.0 \pm 2 \times \sqrt{\frac{.21 \times (1 - .21)}{30}}$, or 116.9 to 117.1
(B) $117.0 \pm 2 \times \sqrt{\frac{21.0}{30}}$, or 115.3 to 118.7
(C) $117.0 \pm \sqrt{\frac{21.0}{30}}$, or 116.2 to 117.8
(D) $117.0 \pm 2 \times \frac{21.0}{\sqrt{30}}$, or 109.3 to 124.7
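The interval arithmetic in Question 1 can be checked with a short Python sketch. It assumes the convention used throughout these questions of taking 2 as the approximate 95% multiplier and $s/\sqrt{n}$ as the standard error of the mean; the function name `mean_confidence_interval` is illustrative only.

```python
import math

def mean_confidence_interval(mean, sd, n, multiplier=2.0):
    """Return (low, high) for mean ± multiplier × sd/√n."""
    sem = sd / math.sqrt(n)      # standard error of the mean
    margin = multiplier * sem    # half-width of the interval
    return mean - margin, mean + margin

# Values from Question 1: mean 117.0, standard deviation 21.0, n = 30
low, high = mean_confidence_interval(117.0, 21.0, 30)
print(f"{low:.1f} to {high:.1f}")
```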


Question 2. A random sample of students was asked the following question: “Are your parents divorced or separated?” The data are given below.

         Div/sep?
         No    Yes   All
Female   99    34    133
Male     77    24    101
All      176   58    234

What is the expected count (assuming the skeptic is correct) corresponding to the 99 in the upper left?

(A) $\frac{99}{133}$, or .74
(B) $\frac{176 + 133}{2}$, or 154.5
(C) $\frac{99}{234}$, or .42
(D) 99
(E) $\frac{176 \times 133}{234}$, or 100.03
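As a minimal sketch of how an expected count is computed for one cell of a two-way table, assuming "the skeptic is correct" means there is no relationship between the row and column variables: the expected count is (row total) × (column total) / (grand total). The function name below is illustrative.

```python
def expected_count(row_total, col_total, grand_total):
    """Expected cell count when the row and column variables are unrelated."""
    return row_total * col_total / grand_total

# Female / No cell from Question 2: row total 133, column total 176, grand total 234
print(round(expected_count(133, 176, 234), 2))
```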

Question 3. A certain test for hepatitis has a sensitivity of 90%. What does this mean?

(A) 90% of individuals who have hepatitis will test negative.
(B) 90% of individuals who do not have hepatitis will test negative.
(C) 90% of individuals who have hepatitis will test positive.
(D) 90% of individuals who do not have hepatitis will test positive.

Question 4. Consider the scatterplot below.

[Scatterplot of y versus x; the x axis runs from 80 to 120 and the y axis from −70 to −35.]

One of the numbers below is the correct correlation of the variables in the scatterplot. Which one is it?

(A) .52
(B) 8
(C) 0
(D) −.61
(E) −1


Question 5. In the infamous poll conducted by Literary Digest, ten million questionnaires were mailed out and 2.3 million were returned. The main reason the results were so biased was the problem of

(A) volunteer response
(B) volunteer sample
(C) too few questionnaires mailed out

Question 6. Suppose that we know the sample mean is 31.0 and the standard error of the mean is 2.0. We wish to calculate a 95% confidence interval. Which of the following is correct?

(A) We can’t compute the interval, but if we also knew the sample size then we could compute it.
(B) We can’t compute the interval, but if we also knew the sample standard deviation then we could compute it.
(C) The interval is 31.0 ± 2.0.
(D) The interval is 31.0 ± 2 × 2.0.

Question 7. Four fair coins (penny, nickel, dime, and quarter) are flipped simultaneously. (“Fair” means that the probability of heads is $\frac{1}{2}$.) Below are two possible outcomes.

(I) Heads on all four coins

(II) Heads on penny and nickel; tails on dime and quarter

Which of the following statements is correct?

(A) It is impossible to tell in this case whether one outcome has a higher probability than the other.
(B) Outcomes (I) and (II) have the same probability.
(C) Outcome (II) has a higher probability than outcome (I).
(D) Outcome (I) has a higher probability than outcome (II).

Question 8. In a statistical study, the sample should be

(A) at least 10% as large as the population
(B) representative of the population
(C) comprised of those individuals who are easiest to find
(D) comprised of those individuals with the strongest opinions
(E) larger than 2500 individuals

Question 9. To survey the opinions of its customers, an airline company made a list of all its flights and randomly selected 20 flights. All of the passengers on those flights were asked to fill out a survey. This sample is

(A) a systematic sample
(B) a stratified random sample
(C) a convenience sample
(D) a cluster sample
(E) a simple random sample


Question 10. Which of the following is the subject of work by Lee Salk that we discussed in class?

(A) The Gallup poll and its method for sampling adult Americans
(B) Whether mothers carry babies on the left side to be near the beating heart
(C) The Literary Digest poll of 1936 and its main source of bias
(D) Italian art and why some infants have the faces of old men in paintings

Question 11. It has been observed that individuals will often act differently when they know they are being studied than when they are not being studied. This phenomenon, which can lead to biased survey results, is known as

(A) the Hawthorne effect
(B) the Gallup effect
(C) the placebo effect
(D) volunteer response bias
(E) the randomization effect

Question 12. The Gallup organization sometimes uses random digit dialing to obtain a sample of individuals from the population of telephone owners. Such a sample is

(A) a cluster sample
(B) a systematic sample
(C) a haphazard sample
(D) a simple random sample
(E) a convenience sample

Question 13. In the context of designing a statistical experiment, by “control” we mean that

(A) we control carefully the observed outcomes of the experiment.
(B) we control carefully the cost of performing the experiment.
(C) there must be a basis for making comparisons.
(D) there is a need to control the subjects of the experiment.

Question 14. Suppose you have to cross a train track on your commute. The probability that you will have to wait for a train is 1/5, or .20. If you don’t have to wait, the commute takes 15 minutes, but if you have to wait, the commute takes 20 minutes. What is the expected value of the time it takes you to commute?

(A) $\frac{15 + 20}{2}$, or 17.5 minutes
(B) $20 \times \frac{1}{5}$, or 4 minutes
(C) $20 \times \frac{1}{5} + 15 \times \frac{4}{5}$, or 16 minutes
(D) $(20 + 15) \times \frac{1}{5}$, or 7 minutes
(E) $20 \times \frac{1}{5} + 15$, or 19 minutes
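An expected value is a probability-weighted sum of the possible outcomes. The following sketch illustrates that definition with the commute times from Question 14; the helper name `expected_value` and its input format are assumptions made for this example.

```python
def expected_value(outcomes):
    """outcomes: list of (value, probability) pairs whose probabilities sum to 1."""
    total_prob = sum(p for _, p in outcomes)
    if abs(total_prob - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(value * p for value, p in outcomes)

# 20 minutes with probability 1/5 (wait), 15 minutes with probability 4/5 (no wait)
commute = [(20, 1 / 5), (15, 4 / 5)]
print(expected_value(commute))
```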


Question 15. In a 1996 study conducted at a Philadelphia restaurant, putting a smiley face on the bill

(A) increased the tip for a female waitress but decreased the tip for a male waiter.
(B) did not change the tip for either a female waitress or a male waiter.
(C) increased the tip for both a female waitress and a male waiter.
(D) decreased the tip for both a female waitress and a male waiter.
(E) decreased the tip for a female waitress but increased the tip for a male waiter.

Question 16. Why is it not possible to conduct a randomized experiment to determine whether sniffing glue can cause seizures in adolescents?

(A) It is not possible to obtain a sample that would be representative of adolescents.
(B) It is unethical to require the treatment group to sniff glue.
(C) It is never possible to infer causation from the results of a randomized experiment.
(D) The Hawthorne effect would bias any results of such an experiment.

Question 17. From a news report: “Teenagers who smoke marijuana at least twice a month were found to have significantly lower grade point averages than those who smoke no marijuana.”

Which of the following is most likely?

(A) This was a randomized experiment; marijuana can be said to have caused lower GPA.
(B) This was a randomized experiment; marijuana cannot be said to have caused lower GPA.
(C) This was an observational study; marijuana can be said to have caused lower GPA.
(D) This was an observational study; marijuana cannot be said to have caused lower GPA.

Question 18. In a particular coastal city in the Northeast, it has been shown that when the monthly number of ice cream cones sold increases, the number of drowning deaths also increases. However, this is not a causal relationship, and it turns out that both ice cream cone sales and drowning deaths are related to the outside temperature; the warmer months bring many more people to the beach and also encourage people to buy ice cream.

In this example, temperature is called:

(A) A confounding variable
(B) A Hawthorne effect
(C) A standardized score
(D) A placebo

Question 19. What is an 88% confidence interval for a true population mean?

(A) sample mean ± 1.17 × SEM
(B) sample mean ± 2 × SEM
(C) sample mean ± .19 × SEM
(D) sample mean ± 1.55 × SEM
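The multiplier for a given confidence level can be read from the table of standard normal scores: a central area of 88% leaves 6% in each tail, so the multiplier is the z score with proportion 0.94 below it. The sketch below reproduces that lookup with Python's standard-library `statistics.NormalDist` instead of the printed table; the function name `ci_multiplier` is illustrative.

```python
from statistics import NormalDist

def ci_multiplier(confidence):
    """z such that the central `confidence` share of a standard normal lies within ±z."""
    upper_tail = (1 - confidence) / 2            # area left in each tail
    return NormalDist().inv_cdf(1 - upper_tail)  # z with that much area below it

for level in (0.88, 0.90, 0.95, 0.98):
    print(level, round(ci_multiplier(level), 2))
```

The table's rounded entries (for example 1.64 for 90%) may differ from these values in the last digit.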


Question 20. Suppose we take repeated samples of size 1005 from a population and compute a 90% confidence interval for the true mean from each sample. In the long run, the proportion of these intervals that fail to capture the true mean will be approximately

(A) (sample standard deviation)$/\sqrt{1005}$
(B) .90
(C) .10
(D) $.90 \times \sqrt{1005}$

Question 21. In what year did the infamous Literary Digest poll take place, and who were the two major party presidential candidates?

(A) 1952, Eisenhower and Stevenson
(B) 1936, Landon and Roosevelt
(C) 1960, Kennedy and Nixon
(D) 1948, Dewey and Truman
(E) 1964, Johnson and Goldwater

Question 22.

Choose the best estimate for the standard deviation of the sample depicted in the histogram above:

(A) 3
(B) 2
(C) 1
(D) 15
(E) 4

Question 23. All other things remaining constant, the sample proportion that results in the widest confidence interval is

(A) .333
(B) .50
(C) .25
(D) .95


Question 24. For any data set, the standard deviation is

(A) the average of the sample mean and quartiles.
(B) a measure of center, like the mean, median, and mode.
(C) a measure of the spread or variability of the data.
(D) the z-score of the sample mean.
(E) equal to the maximum observation minus the minimum observation.

Question 25. To study the effectiveness of vitamin C in preventing colds, a researcher recruited 200 people. She randomly assigned 100 of them to take vitamin C and 100 of them to take nothing at all. What is the biggest problem with this study?

(A) the presence of interacting variables
(B) the inability to infer causation
(C) the placebo effect
(D) volunteer response

Question 26. A weekly news magazine reports that the president’s approval rating is 50%, plus or minus 3%. What does this mean?

(A) Only 3% of the population was sampled for this particular survey.
(B) 50% is the standard deviation of a histogram of sample proportions.
(C) 3% is the standard deviation of a histogram of sample proportions.
(D) The true proportion of people who approve is unknown, but 47% to 53% is a range of plausible values.

Question 27. Which of the following was observed in the 1920’s and 1930’s at a Western Electric plant in Illinois?

(A) The matched-pairs experimental design
(B) The effect of confounding variables
(C) The effect of interacting variables
(D) The placebo effect
(E) The Hawthorne effect

Question 28. We have discussed some of the work of Daniel Kahneman and Amos Tversky in class. What is the subject of this work?

(A) Pioneering work on linear regression, including a famous study of “regression to the mean”
(B) The psychology of statistics, including the representativeness heuristic
(C) The misuse of correlation coefficients, especially as they are often used to imply causation
(D) The largest clinical trial in U.S. history, used to determine that a vaccine for polio is effective


Question 29. Which of the following is not part of the five-number summary for a dataset?

(A) First quartile
(B) Interquartile range (IQR)
(C) Minimum
(D) Median

Question 30. A 95% confidence interval for the true mean weight in pounds of a particular breed of dogs is reported to be 16.0 ± 1.0. What can we conclude?

(A) The interval from 15.0 to 17.0 is a range of reasonable values for the true mean, calculated using a procedure that will capture the true mean 95% of the time if done repeatedly.
(B) 95% of all dogs of this breed have weights between 15.0 and 17.0 pounds.
(C) In 95% of all possible samples from this breed of dogs, the sample mean will lie between 15.0 and 17.0 pounds.
(D) If a single dog is selected at random from this population, then the true weight of this dog will lie between 15.0 and 17.0 pounds 95% of the time.

Question 31. Suppose that a particular over-the-counter pregnancy test has a sensitivity of 90% and a specificity of 99%. What is the probability of a false positive?

(A) 10%
(B) 99%
(C) 90%
(D) 1%
(E) None of the above

Question 32. Which of the following will be closest to the standard deviation of the histogram of many, many sample proportions from equally-sized samples?

(A) The area under a normal curve to the left of 1.64
(B) The true proportion of the population
(C) 1.96 standard deviations
(D) The square root of: (true proportion) × (1 − true proportion)/(sample size)
(E) The square root of: (true proportion)/(sample size)

Question 33. The managers of a mall wanted to know whether consumers would be willing to pay for parking if a modern parking structure were built. The mall posted an interviewer at the door and told her to collect a sample of 100 opinions by asking the next person who came in the door each time she had finished an interview. This sample is

(A) a convenience sample
(B) a simple random sample
(C) a stratified random sample
(D) a cluster sample
(E) a systematic sample


Question 34.

Which of the following is true about the sample depicted in the histogram above?

(A) The mean is larger than the median.
(B) It is impossible to tell from this histogram alone whether the mean is larger than, smaller than, or equal to the median.
(C) The mean is equal to the median.
(D) The mean is smaller than the median.

Question 35. Consider the following two variables measured on a sample of people: height (in inches) and weight (in pounds). We would expect the correlation to be

(A) −1
(B) between −1 and 0
(C) 0
(D) between 0 and 1
(E) 1

Question 36. In a study of the relationship between ideal weight and actual weight, both measured in pounds, it is found that the correlation is .867 and the regression equation is

Idealwt = 25.6 + .779 × Actualwt

What is the predicted ideal weight for a person who weighs 100 pounds?

(A) 25.6 + .779 × 100, or 103.5 pounds
(B) 100 pounds
(C) 100 × .867, or 86.7 pounds
(D) .867 × (25.6 + 100), or 108.9 pounds
(E) 100 + .867, or 100.867 pounds


Question 37. In a randomized experiment involving a new vitamin supplement intended to reduce the chances of catching a cold, suppose that subjects were randomly divided into two groups of 100 each. Over the course of an entire winter, 13 of the subjects receiving the supplement got colds and 24 of those not receiving the supplement got colds. In this study, what is the risk for the treatment group?

(A) $\frac{13}{100}$
(B) $\frac{13}{24}$
(C) $\frac{24}{100}$
(D) $\frac{24}{76}$
(E) $\frac{13}{87}$

Question 38. Suppose that in a particular sample, we find that 23% of female college students attend weekly religious services and 26% of male college students attend weekly religious services. Which of the following is true?

(A) This difference between males and females is more likely to be statistically significant if the sample is larger.
(B) This difference between males and females is less likely to be statistically significant if the sample is larger.
(C) This difference between males and females is more likely to be practically significant if the sample is larger.
(D) This difference between males and females is less likely to be practically significant if the sample is larger.

Question 39. In a study of the relationship between handspan in centimeters and height in inches, it is found that the correlation is .80. Suppose that the handspan measurements are converted to inches (this is accomplished by dividing all handspan measurements by 2.54). What will be the new correlation coefficient after both handspan and height are measured in inches?

(A) .80 × 2.54, or 2.03
(B) $\frac{.80}{2.54}$, or .31
(C) .80
(D) 1
(E) It is impossible to determine the new correlation coefficient without seeing the actual dataset.

Question 40. All other things remaining constant, if the sample size is multiplied by 16 then the width of a confidence interval is multiplied by

(A) 1/16
(B) 2
(C) 1
(D) 1/4
(E) 16


Question 41. To learn how its employees felt about a shorter fall semester, a university divided employees into three categories: staff, faculty, and student employees. A random sample was selected from each group. This sample is

(A) a haphazard sample
(B) a cluster sample
(C) a simple random sample
(D) a stratified random sample
(E) a convenience sample

Question 42. To investigate the proportion of times a particular coin will come to rest heads-side-up after being spun on the table, an experimenter spins 100 coins. If this experiment is carried out once a day for three years and all of the daily sample proportions are graphed in a histogram, the shape of this histogram will be approximately

(A) Normal (bell-shaped)
(B) Semi-circular
(C) Flat, like a rectangle
(D) Triangular

Question 43. Suppose we measure the length of someone’s arm with great accuracy. This measurement is

(A) Categorical
(B) Unreliable
(C) Continuous
(D) Discrete

Question 44. In the scatterplot below, consider the point marked by a triangle.

[Scatterplot of y versus x, with both axes running from 20 to 80; one point is marked by a triangle.]

Which of the following will occur if the marked point is removed from the dataset?

(A) The slope and the correlation will both decrease.
(B) The slope will decrease and the correlation will increase.
(C) The slope and the correlation will both increase.
(D) The slope will increase and the correlation will decrease.


Question 45. Suppose that we were to take many, many samples of size 100 from a particular population and create a histogram of the sample means from each sample. The shape of this histogram would be approximately

(A) Semi-circular
(B) Normal (bell-shaped)
(C) Triangular
(D) A skewed curve with a long right tail

Question 46. Consider the following three statements regarding outliers:

(I) Outliers may be created by errors in data entry.

(II) Sometimes outliers are the most interesting points in a dataset.

(III) A single outlier can dramatically decrease the correlation in a dataset.

Which of the above statements are true?

(A) I and II only
(B) II and III only
(C) I only
(D) II only
(E) I, II, and III

Question 47. Consider the events “A occurs” and “A does not occur”.

(I) The probabilities of these two events sum to 1.

(II) These two events are mutually exclusive.

(III) These two events are independent.

Which of the above statements are true?

(A) III only
(B) I and II only
(C) I and III only
(D) I, II, and III
(E) II and III only


Question 48. A study comparing the children of mothers who smoked 10 or more cigarettes per day while pregnant to those of mothers who did not smoke found the following sample data on birth weight (in grams) of these children:

                     Mean   SE Mean
Nonsmokers           3436   120
≥ 10 cigs per day    3025   160

Give a 95% confidence interval for the difference of population means (mean for nonsmokers minus mean for smokers).

(A) $411 \pm 2\sqrt{120^2 + 160^2}$
(B) $-40 \pm 2\sqrt{120 + 160}$
(C) $3436 \pm 2 \times \frac{3025}{\sqrt{160}}$
(D) $-40 \pm 2\sqrt{\frac{3436}{\sqrt{120}} + \frac{3025}{\sqrt{160}}}$
(E) $411 \pm 2\sqrt{120 + 160}$
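For two independent samples, the standard error of a difference of means is the square root of the sum of the squared standard errors. The sketch below applies that rule to the figures in Question 48, assuming the two groups are independent and using 2 as the approximate 95% multiplier; the function name is illustrative.

```python
import math

def difference_interval(mean1, se1, mean2, se2, multiplier=2.0):
    """Interval for (mean1 - mean2), assuming the two samples are independent."""
    diff = mean1 - mean2
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2)  # standard errors combine in quadrature
    return diff - multiplier * se_diff, diff + multiplier * se_diff

# Nonsmokers: mean 3436, SE 120; smokers: mean 3025, SE 160
print(difference_interval(3436, 120, 3025, 160))
```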

Question 49. Suppose you repeatedly toss an unfair coin for which the probability of tossing heads is .2, or 20%. The probability that you do not toss heads on any of your first four tosses is:

(A) .8 + .8 + .8 + .8 = 3.2
(B) .8
(C) $(.8)^4 = .41$
(D) $1 - (.8)^4 = .59$
(E) None of the above
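When tosses are independent, the probability that none of them is a success is the product of the individual failure probabilities. A minimal sketch of that multiplication rule, assuming independent tosses with the same probability of heads each time (the function name is illustrative):

```python
def prob_no_successes(p_success, n_trials):
    """Probability of zero successes in n independent trials."""
    return (1 - p_success) ** n_trials

# Probability of heads is .2; four tosses
print(round(prob_no_successes(0.2, 4), 4))
```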

Question 50. The probability of winning a game of craps is equal to .493. You decide you are going to play craps repeatedly until you win a game. What is the probability that your first win occurs on the second game of craps?

(A) .507
(B) $.493^2$
(C) .507 × .493
(D) .493
(E) .493 + .507

Question 51. Suppose that you work for an automobile insurance company. You charge each of your policyholders a $400 annual premium. To each policyholder who files a legitimate claim after an accident, you pay a lump sum of $2400. You know from past experience that 10% of your policyholders file legitimate claims each year. What is the insurance company’s expected profit for each policyholder?

(A) −$2000 × .1 + $400 × .9, or $160
(B) $400 × .9, or $360
(C) $2400 × .1, or $240
(D) $2800 × .1 + $400 × .9, or $640
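The expected profit per policyholder is the probability-weighted average of the profit with a claim and the profit without one. The sketch below works through that calculation with the figures from Question 51, assuming each policyholder files at most one claim per year; the function name is illustrative.

```python
def expected_profit(premium, payout, claim_probability):
    """Expected profit per policyholder, assuming at most one claim per year."""
    profit_if_claim = premium - payout   # premium collected, lump sum paid out
    profit_if_no_claim = premium         # premium collected, nothing paid out
    return (profit_if_claim * claim_probability
            + profit_if_no_claim * (1 - claim_probability))

# $400 premium, $2400 payout, 10% of policyholders file a claim
print(round(expected_profit(400, 2400, 0.10), 2))
```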


Question 52. A study comparing the children of mothers who smoked 10 or more cigarettes per day while pregnant to those of mothers who did not smoke found the following sample data on IQ of these children:

                     Mean     Standard deviation   Sample size
Nonsmokers           113.28   10.28                87
≥ 10 cigs per day    103.12   14.73                63

What is the standard deviation of the difference of sample means?

(A) $\frac{10.28}{87} + \frac{14.73}{63}$, or 0.35
(B) $\sqrt{\frac{10.28^2}{87} + \frac{14.73^2}{63}}$, or 2.15
(C) (10.28 + 14.73)/2, or 12.51
(D) $\frac{10.28}{\sqrt{87}} + \frac{14.73}{\sqrt{63}}$, or 2.96
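When only standard deviations and sample sizes are given, the standard error of each mean is $s/\sqrt{n}$, and for independent samples the two standard errors combine as the square root of the sum of their squares. A minimal sketch under that independence assumption, with an illustrative function name:

```python
import math

def se_difference_of_means(sd1, n1, sd2, n2):
    """Standard deviation of (mean1 - mean2) for two independent samples."""
    # (sd/√n)² = sd²/n for each group; add the two, then take the square root
    return math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)

# Figures from Question 52
print(se_difference_of_means(10.28, 87, 14.73, 63))
```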

Question 53. Which of the following samples has a standard deviation of 0?

(A) 0, 0, 0, 0, 1
(B) −2, −2, −2, 2
(C) 1, 2, 2, 2, 2, 3
(D) 5, 5, 5, 5
(E) −2, −1, 0, 1, 2

Question 54. If the confidence coefficient for a particular confidence interval changes from 90% to 95%, will the interval get wider or narrower?

(A) Wider
(B) Narrower
(C) Neither; it will stay the same width
(D) It is impossible to determine the answer without knowing the sample size.

Question 55. A certain confidence interval is reported as sample mean ± 2.33 × SEM. What is the confidence coefficient of this interval?

(A) 95%
(B) 90%
(C) 98%
(D) 99%

Question 56. Suppose that 10% of the population carries a particular genetic trait. A new test being developed to detect the trait has a false positive probability of 10% and a false negative probability of 10%. (In other words, both the sensitivity and specificity equal 90%.) Given that the test is positive for a certain individual, what is the probability that individual is actually a carrier?

(A) 9%
(B) .09 + .09, or 18%
(C) .09 × .81, or 7.3%
(D) 90%
(E) $\frac{.09}{.09 + .09}$, or 50%
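The probability of being a carrier given a positive test is the share of all positive tests that come from true carriers. The sketch below carries out that calculation from prevalence, sensitivity, and specificity, using the figures in Question 56; the function and variable names are illustrative.

```python
def prob_carrier_given_positive(prevalence, sensitivity, specificity):
    """P(carrier | positive test), by comparing true positives with all positives."""
    true_positive = prevalence * sensitivity               # carriers who test positive
    false_positive = (1 - prevalence) * (1 - specificity)  # non-carriers who test positive
    return true_positive / (true_positive + false_positive)

# 10% prevalence, 90% sensitivity, 90% specificity
print(round(prob_carrier_given_positive(0.10, 0.90, 0.90), 2))
```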


Question 57. Suppose we ask people what color their eyes are. This variable is

(A) Biased
(B) Continuous
(C) Unreliable
(D) Categorical

Question 58. Suppose that we asked a sample of PSU students whether they watch more than 10 hours of television per week. In order to compare the percentage of women who said yes to the percentage of men who said yes, we ran a chi-squared analysis and obtained a chi-squared statistic of 5.37. What conclusion may be drawn from this statistic?

(A) There is a statistically significant difference between men and women on this question.
(B) There is a practically significant difference between men and women on this question.
(C) There is not a statistically significant difference between men and women on this question.
(D) There is no way to draw any conclusion about statistical or practical significance without seeing the actual data.

Question 59. The variables x (temperature in °C) and y (temperature in °F) are exactly related by the formula y = 32 + 1.8x. Which of the following statements is true for a sample of observed temperatures measured in both Fahrenheit and Celsius?

(A) The slope of the regression equation and the correlation coefficient have opposite signs.
(B) The slope of the regression equation is 1.8 and the correlation coefficient is 1.
(C) The slope of the regression equation is 1.8 and the correlation coefficient is 32.
(D) The slope of the regression equation and the correlation coefficient both equal 1.8.

Question 60. A random sample of 400 people is asked a particular yes-or-no question. 80 of them answer yes. What is the approximate standard deviation of the sample proportion of yeses?

(A) $\sqrt{\left(\frac{80}{400}\right)\left(1 - \frac{80}{400}\right)}$, or 0.40
(B) $\left(\frac{80}{400}\right) / \sqrt{400}$, or 0.01
(C) $\sqrt{\left(\frac{80}{400}\right)\left(1 - \frac{80}{400}\right) / 400}$, or 0.02
(D) $\frac{80}{400}$, or 0.20
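The standard deviation of a sample proportion is approximated by $\sqrt{\hat{p}(1 - \hat{p})/n}$, where $\hat{p}$ is the sample proportion standing in for the unknown true proportion. A minimal sketch of that formula with the counts from Question 60 (the function name is illustrative):

```python
import math

def se_proportion(successes, n):
    """Approximate standard deviation of a sample proportion."""
    p_hat = successes / n
    return math.sqrt(p_hat * (1 - p_hat) / n)

# 80 yeses out of 400 respondents
print(round(se_proportion(80, 400), 2))
```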

Question 61. A random sample of 25 people who visited a weight-loss clinic finds that the amount of weight lost has a sample mean of 8.0 pounds and a sample standard deviation of 5.0 pounds. What is the standard error of the mean (SE Mean)?

(A) 5.0/8.0, or 0.625 pounds
(B) $8.0/\sqrt{25}$, or 1.6 pounds
(C) $\sqrt{5.0/8.0}$, or 0.791 pounds
(D) $5.0/\sqrt{25}$, or 1.0 pounds
(E) 8.0/25, or 0.32 pounds


Question 62. For a sample of young children, why is shoe size strongly positively correlated with size of vocabulary?

(A) Cultures in which children tend to be larger tend to be cultures in which reading at an early age is stressed
(B) Children with larger shoes tend to be better nourished and thus also better educated than children less well nourished
(C) Children with small feet have a harder time learning to walk than children with large feet, so they have less mental energy to focus on language acquisition
(D) Both shoe size and size of vocabulary are positively correlated with age

Question 63. If the SE Mean increases, will a 95% confidence interval for a population mean get wider or narrower?

(A) Neither; it will stay the same width
(B) Narrower
(C) Wider
(D) It is impossible to determine the answer without knowing the sample size.

Question 64. A random sample of students was asked the following question: “Do you have a tattoo?” The data are given below.

         Tattoo?
         No    Yes   All
Female   105   31    136
Male     85    15    100
All      190   46    236

Chi-Sq = 0.184 + 0.761 + 0.251 + 1.035 = 2.231

Suppose that every number in the above table were multiplied by 10. What would be the new value of the chi-squared statistic?

(A) 2231
(B) .2231
(C) 22.31
(D) 2.231
(E) 223.1
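The effect of rescaling a table on the chi-squared statistic can be explored numerically. The sketch below computes the statistic as the sum of (observed − expected)²/expected over all cells, with expected counts taken as (row total) × (column total) / (grand total), and applies it to the tattoo table and to a copy with every count multiplied by 10. The function name `chi_squared` is illustrative, and no continuity correction is applied.

```python
def chi_squared(table):
    """Chi-squared statistic for a two-way table given as rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat

tattoo = [[105, 31], [85, 15]]                       # rows: Female, Male; columns: No, Yes
scaled = [[10 * count for count in row] for row in tattoo]
print(round(chi_squared(tattoo), 3), round(chi_squared(scaled), 2))
```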

Question 65. Every confidence interval we have learned about in this class is of the form

(sample estimate)± (multiplier)× (standard deviation of sample estimate).

What is the value of “multiplier” for a 90% confidence interval?

(A) .90
(B) 2
(C) 1.64
(D) 2.33


Question 66. To investigate whether strong electromagnetic fields cause cancer, a group of 100 rats is randomly split into two groups: Those who are exposed to a strong electromagnetic field for several hours a day, and those who are not. The number of rats who develop cancer is recorded for each group, and the results are given below:

                Cancer?
                No    Yes   All
Mag field       60    40    100
No mag field    80    20    100
All             140   60    200

Consider the following statement about the increased risk of cancer due to electromagnetic field:

In this study, the risk for the electromagnetic field group is ____% higher than the baseline risk.

What goes in the blank?

(A) 0
(B) 300
(C) 400
(D) 100
(E) 200
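The arithmetic behind a "percent higher than baseline" statement can be sketched as follows. This assumes the unexposed group supplies the baseline risk and that increased risk is measured as (risk − baseline risk)/baseline risk; the variable names are illustrative only.

```python
# Counts from the table above (cancer cases, group size)
exposed_cancer, exposed_total = 40, 100
baseline_cancer, baseline_total = 20, 100

risk_exposed = exposed_cancer / exposed_total      # risk in the mag field group
risk_baseline = baseline_cancer / baseline_total   # risk in the no mag field group

# Relative risk: how many times the baseline risk the exposed risk is
relative_risk = risk_exposed / risk_baseline

# Increased risk, as a percentage of the baseline risk
percent_increase = (risk_exposed - risk_baseline) / risk_baseline * 100
print(relative_risk, percent_increase)
```

Note that relative risk and percent increase are different quantities: a relative risk of k corresponds to an increase of (k − 1) × 100 percent.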

Question 67. A pollster interested in opinions on deficit spending divided a city into city blocks, then surveyed the third house to the west of the southeast corner of each block. If the house was divided into apartments, the westernmost ground floor apartment was selected. This sample is

(A) a haphazard sample
(B) a convenience sample
(C) a simple random sample
(D) a stratified random sample
(E) a systematic sample


Question 68. Here is a hypothetical research question, asked of a random sample of students: Do you own a pet? (Data and related output below.)

Expected counts are shown below observed counts in each cell.

          Own pet?
          No      Yes     All
Female    26      50      76
          29.44   46.55

Male      17      18      35
          13.55   21.44

All       43      68      111

Chi-Sq = 0.40 + 0.25 + 0.87 + ????

How do we find the number that should replace “????” in the chi-squared formula above?

(A) 18/35

(B) 21.44/111

(C) (18 − 21.44)²/21.44

(D) (68 × 35)/111

(E) 18/111
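As a sanity check on how each term in the Chi-Sq line arises, the sketch below assumes the standard per-cell contribution, (observed − expected)²/expected, and first reproduces the 0.40 shown for the Female/No cell before applying the same rule to the Male/Yes cell. The function name `cell_contribution` is hypothetical.

```python
def cell_contribution(observed, expected):
    """One cell's contribution to the chi-squared statistic."""
    return (observed - expected) ** 2 / expected

# Female / No cell: should match the first printed term (0.40)
female_no = cell_contribution(26, 29.44)

# Male / Yes cell: the term hidden behind "????"
missing_term = cell_contribution(18, 21.44)

print(round(female_no, 2), round(missing_term, 2))
```

Because the printed expected counts are rounded, recomputed terms can differ from the output in the last decimal place.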

Question 69. The Empirical Rule for normal distributions states, in part, that if a dataset is approximately normally distributed (or bell-shaped), then

(A) about 68% of all observations fall within one standard deviation of the mean.
(B) about 80% of all observations fall within two standard deviations of the mean.
(C) at most 90% of all observations fall within three standard deviations of the mean.
(D) All of the above
(E) None of the above

Question 70. In a statistical study, the population is

(A) those people or objects unreachable by the experimenters.
(B) all people in the United States.
(C) the group of people or objects about whom conclusions are to be drawn.
(D) all people in the world.


Question 71. The mean of a histogram of many, many sample proportions from equally-sized random samples will be approximately

(A) The square root of: (true proportion) × (1 − true proportion)/(sample size)
(B) The area under a normal curve to the left of 1.64
(C) 1.96 standard deviations
(D) The true proportion of the population
(E) The square root of: (true proportion)/(sample size)

Question 72. Suppose that a random sample of 900 Penn State students is selected and asked whether they live on campus. What is the margin of error for this study?

(A) 0
(B) It is impossible to say without knowing the size of the population
(C) 1/√900, or 3.3%
(D) 1/900, or .0011%
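The conservative margin of error for a sample proportion is commonly approximated as 1/√n, where n is the sample size. The sketch below assumes that rule of thumb (at roughly 95% confidence); the function name is hypothetical.

```python
import math

def conservative_margin_of_error(sample_size):
    """Rule-of-thumb margin of error for a sample proportion: 1 / sqrt(n)."""
    return 1 / math.sqrt(sample_size)

moe = conservative_margin_of_error(900)
print(f"{moe:.1%}")
```

The population size does not appear anywhere in the calculation; only the sample size does.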

Question 73. Statistics consists of:

(A) The collection of data by the news media.
(B) Methods for recording all the “stats” in sports events.
(C) Procedures and principles for obtaining and processing information in order to make decisions under uncertainty.
(D) Methods for calculating odds for betting on horse races.

Question 74. The mean of a histogram of many, many sample means of equal size from the same population will be approximately

(A) The true standard deviation of the population
(B) (true standard deviation of the population)/√200
(C) The true mean of the population
(D) The area under a normal curve to the left of 1.64

Question 75. Participants in a study were asked to circle which of the following two statements they considered more likely:

• The United States and Russia will fight a nuclear war in the next ten years.

• The United States and Russia will fight a nuclear war in the next ten years after being drawn into a larger conflict involving countries such as Iraq, Libya, Israel, or Pakistan.

A large number of participants circled the second option. However, the second option COULD NOT be more likely than the first, and participants who thought it could were guilty of

(A) the gambler’s fallacy
(B) the conjunction fallacy
(C) the anchoring fallacy
(D) the causation fallacy


Question 76. 100 American adults were sampled at random, and 55% of them expressed the belief that Martha Stewart should go to prison. A 95% confidence interval for the true population proportion is

(A) .55 ± 1.64 √((.55)(1 − .55)/100)

(B) .55 ± 2 √((.55)/100)

(C) .55 ± 2 √((.55)(1 − .55)/100)

(D) .55 ± 1.64 √((.55)/100)
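For reference, a confidence interval for a proportion of the form (sample estimate) ± (multiplier) × (standard deviation of sample estimate) can be sketched as below. It assumes the standard deviation of a sample proportion is √(p(1 − p)/n) and that the multiplier for 95% confidence is rounded to 2; the function name `proportion_ci` is hypothetical.

```python
import math

def proportion_ci(p_hat, n, multiplier=2):
    """Approximate confidence interval for a population proportion."""
    # Standard deviation (standard error) of the sample proportion
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - multiplier * se, p_hat + multiplier * se

low, high = proportion_ci(0.55, 100)
print(round(low, 4), round(high, 4))
```

Passing `multiplier=1.64` instead would give the narrower 90% interval.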

Question 77. People in the U.S. were interviewed and asked “Of the following issues, which is the most serious issue facing our nation today?” The choices were the economy, education, terrorism, and health care. This is an example of

(A) An unreliable question
(B) An open question
(C) A closed question
(D) An easy question

Question 78. The sample proportion from a sample of size 100 is found to be .36. What is the approximate standard deviation of this sample proportion?

(A) 1/√100

(B) √(.36 × .64/100)

(C) .36

(D) √.36

Question 79. Suppose that teenagers make up 16% of the population and senior citizens make up 26% of the population. If we choose a person at random from the population, what is the probability that the person chosen is either a teenager or a senior citizen?

(A) .16 + .26, or .42
(B) (.16 + .26)/2, or .21
(C) .16 × .26, or .04
(D) .26
(E) .16

Question 80. Suppose that I flip two coins, a nickel and a quarter. Which of the following is true regarding the two events “Heads on nickel” and “Heads on quarter”?

(A) These two events are both independent and mutually exclusive.
(B) These two events are mutually exclusive.
(C) These two events are independent.
(D) These two events are neither independent nor mutually exclusive.


Question 81. A study comparing weight loss for a group of men who only dieted without exercise with a group of men who only exercised without dieting found the following weight loss information (in kilograms) from the samples:

                 Mean   Standard deviation   Sample size   SE Mean
Diet only        7.2    3.7                  42            0.571
Exercise only    4.0    3.9                  47            0.569

If we express a 95% confidence interval for the difference in population means as A ± 2B, which of the following could be the value of A?

(A) 7.2 − 4.0, or 3.2
(B) 0.571 − 0.569, or 0.002
(C) 42 − 47, or −5
(D) 3.7 − 3.9, or −0.2
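The two pieces of an interval of the form A ± 2B for a difference in means can be sketched as follows. This assumes independent samples and the usual formula for the standard error of a difference, √(SE₁² + SE₂²), which the question itself does not state; the variable names are illustrative.

```python
import math

# Summary statistics from the table above
mean_diet, se_diet = 7.2, 0.571
mean_exercise, se_exercise = 4.0, 0.569

# A: the sample estimate of the difference in population means
difference = mean_diet - mean_exercise

# B: standard error of the difference (assumes independent samples)
se_difference = math.sqrt(se_diet ** 2 + se_exercise ** 2)

low = difference - 2 * se_difference
high = difference + 2 * se_difference
print(round(difference, 1), round(se_difference, 3), round(low, 2), round(high, 2))
```

The SE Mean column itself follows from (standard deviation)/√(sample size), for example 3.7/√42 for the diet-only group.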

Question 82. The ancient Roman practice of decimation is a gruesome example of

(A) random sampling
(B) stratified sampling
(C) haphazard or convenience sampling
(D) systematic sampling
(E) cluster sampling

Question 83. Suppose we take repeated samples, all of size 200, from the same population and calculate the sample mean in each case. The standard deviation of all these sample means will be approximately

(A) The area under a normal curve to the left of 1.64
(B) (true standard deviation of the population)/√200
(C) The true mean of the population
(D) The true standard deviation of the population

Question 84. Which of the following samples has a median of 20?

(A) 17, 21, 21, 21
(B) 20, 20, 30, 40
(C) 10, 15, 20, 300, 500
(D) 16, 17, 18, 19, 20

Question 85. A dataset contains yearly measurements since 1960 of the median household income in the United States and the number of people (per 100,000 population) in prison in the United States. We observe a strong correlation of .75. Based on this information, we can say that each of the following statements is true EXCEPT

(A) Higher incomes are associated with higher prison populations.
(B) Higher incomes induce people to commit more crimes.
(C) Years with higher household income have tended to be years with higher prison populations.
(D) In a regression equation where median household income is x and prison population is y, the slope would be positive.


Question 86. A study comparing mothers who smoked 10 or more cigarettes per day while pregnant to mothers who did not smoke while pregnant gives a 95% confidence interval for the mean difference in amount of education (in grades) as 0.15 to 1.19. What can we conclude from this?

(A) There is not a statistically significant difference between mean education for the two groups of mothers.

(B) There is a statistically significant difference between mean education for the two groups of mothers.

(C) We can never learn anything from samples about the differences in education levels for the two groups of mothers.

(D) We need additional information in order to determine whether there is a statistically significant difference between mean education for the two groups of mothers.

Question 87. If the Gallup organization wishes to conduct a survey of people in the United States aged 18 and over (of whom there are about 210 million), which of the following is most likely to be the sample size used?

(A) 2.1 million
(B) 100,000
(C) 21 million
(D) 4.2 million
(E) 1500

Question 88. If a clock is consistently ten minutes fast, it is

(A) Unreliable and unbiased
(B) Unreliable and biased
(C) Reliable and unbiased
(D) Reliable and biased

Question 89. All other things remaining constant, if the population size triples from 10 million to 30 million, will a 95% confidence interval for the population mean get wider or narrower?

(A) Narrower
(B) Neither; population size does not affect the confidence interval
(C) Wider
(D) It is impossible to determine the answer without also knowing the sample size.

Question 90. From a news report: “Blood pressure fell significantly in subjects who got 400–500 milligrams of magnesium a day for four weeks, but not in those getting a placebo.”

Which of the following is most likely?

(A) This was a randomized experiment; magnesium can be said to have caused the lower blood pressure.

(B) This was a randomized experiment; magnesium cannot be said to have caused the lower blood pressure.

(C) This was an observational study; magnesium can be said to have caused the lower blood pressure.

(D) This was an observational study; magnesium cannot be said to have caused the lower blood pressure.


Question 91. The game “odd man” is played as follows: Three people all flip a coin. If one of the players has a different outcome than the other two, that player is the loser. Otherwise, the game is a draw. What is the probability that a game of odd man ends in a draw?

(A) 1/2, or .5

(B) 1/8, or .125

(C) 7/8 × 7/8, or .766

(D) 1/8 + 1/8, or .25

(E) 7/8 + 7/8, or 1.75
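With only three fair coins, the sample space is small enough to list outright. The sketch below assumes fair, independent coins, so that all eight outcomes are equally likely, and counts the outcomes in which no player differs from the other two.

```python
from itertools import product

# All equally likely outcomes for three coin flips, e.g. ('H', 'T', 'H')
outcomes = list(product("HT", repeat=3))

# A draw means nobody is the odd one out: all three flips match
draws = [flips for flips in outcomes if len(set(flips)) == 1]

p_draw = len(draws) / len(outcomes)
print(len(outcomes), draws, p_draw)
```

The same enumeration approach extends to more players by changing `repeat`, although the definition of a draw would then need restating.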

Question 92. All other things remaining constant, if the sample size is multiplied by 4 then the standard error of the mean

(A) becomes twice as large
(B) becomes four times as large
(C) becomes half as large
(D) becomes one fourth as large
(E) does not change
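The effect of sample size on the standard error can be checked numerically. The sketch below assumes SE Mean = (standard deviation)/√(sample size); the standard deviation of 12 and the sample size of 25 are hypothetical values chosen only for illustration.

```python
import math

def standard_error(sd, n):
    """Standard error of the mean: sd / sqrt(n)."""
    return sd / math.sqrt(n)

sd, n = 12.0, 25  # hypothetical values, not from the exam

se_before = standard_error(sd, n)
se_after = standard_error(sd, 4 * n)

# Ratio of the new standard error to the old one
ratio = se_after / se_before
print(se_before, se_after, ratio)
```

The ratio does not depend on the particular values of `sd` and `n`, since both cancel in the division.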

Question 93. Can randomization play a role in an observational study? Why?

(A) No, because it is not possible to infer causal relationships from randomized experiments.
(B) No, because randomization applies only to randomized experiments.
(C) Yes, because selection of a random sample ensures that the sample will be representative of the population.
(D) Yes, because random assignment of individuals to various treatments can help protect against potential confounding variables.


Question 94. The percent of any sample that falls above the first quartile equals

(A) 75%
(B) 60%
(C) 90%
(D) 50%
(E) Almost 100%

Question 95.

Both maps shown above were produced in the 1950’s. In class, we discussed how the upper graph was used in a misleading way. What is misleading about the upper graph?

(A) In 1950, the densely populated states shown in black contained far more people than those eastern states shown in white even though the areas look fairly even.

(B) The states shown in black are not a random sample of states, so the results of this study may not be applied to the larger U.S. population.

(C) Because it fails to color any of the eastern states black, this map contains an inherent bias.

(D) The states shown in black have large areas relative to their 1950 population, so the amount of income represented appears much larger than it really is.

Question 96. What is the first quartile of the sample 10, 10, 10, 10, 11, 12, 13, 20?

(A) 12
(B) 10
(C) 2
(D) 10.5
(E) 12.5
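Quartile conventions differ between textbooks and software. The sketch below assumes one common classroom convention, in which the first quartile is the median of the lower half of the ordered data (the middle value is left out when the sample size is odd); the function name `first_quartile` is hypothetical.

```python
from statistics import median

def first_quartile(values):
    """First quartile as the median of the lower half of the ordered data."""
    ordered = sorted(values)
    lower_half = ordered[: len(ordered) // 2]
    return median(lower_half)

sample = [10, 10, 10, 10, 11, 12, 13, 20]
q1 = first_quartile(sample)
print(q1)
```

Other conventions, such as interpolating between ordered values, can give slightly different quartiles for some samples.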


Each of the next four questions refers to the following boxplot:

Question 97. What is the interquartile range (IQR) of the sample?

(A) 8 minus 3, or 5
(B) 10 minus 0, or 10
(C) 20 minus 6, or 14
(D) 6 minus 6, or 0
(E) 20 minus 0, or 20

Question 98. Which of the following values is found in the sample and is considered an outlier?

(A) 3
(B) 0
(C) 10
(D) 6
(E) 20

Question 99. What is the first quartile of the sample?

(A) 8
(B) 10
(C) 3
(D) 0
(E) 6

Question 100. What is the median of the sample?

(A) 3
(B) 0
(C) 6
(D) 20
(E) 10


The following two questions refer to this paragraph:

Researchers at a prominent medical school reported that chronic users of illegal drug M had lower IQ scores than those who did not use M chronically.

Question 101. This study is:

(A) A randomized experiment; to avoid detection by legal authorities, subjects must have used M in a randomized manner.
(B) Worthless; there is no way to establish any association between drug use and IQ scores.
(C) A stratified random sample; subjects were stratified into overlapping groups according to whether or not they used M chronically.
(D) An observational study; the researchers could not have offered M to subjects and could not have randomized subjects into drug-use or non-drug-use groups.

Question 102. A newspaper headline reported the medical study as “More Evidence That Chronic Illegal Drug Use Impairs Mental Ability!” This headline is:

(A) Accurate; stratified random samples lead to randomized experiments from which we cannot infer causation.
(B) Accurate; there is fundamentally solid anecdotal evidence that chronic users of M do poorly in school.
(C) Misleading; we can never infer causation from randomized experiments.
(D) Misleading; it implies that there is a causal connection between chronic use of M and brain function, but causation cannot be inferred from an observational study.


Question 103.

What is the biggest flaw of the graph above?

(A) The sizes of the pens are misleading because there is no pen representing zero dollars for the sake of comparison.
(B) The pens are correctly scaled according to area, but the resulting heights make the differences look much larger than they really are.
(C) The use of pens is distracting and hides the informational content of the graph.
(D) The pens are correctly scaled according to height, but the resulting areas make the differences look much larger than they really are.


The following three questions refer to this experiment:

To test the effects of drugs and alcohol on driving performance, 20 volunteers were each asked to take a driving test under three sobriety conditions: sober, after two drinks, and after smoking marijuana. The order under which they took these was randomized. An evaluator watched them drive on a test course and rated their accuracy on a scale from 1 to 10, without knowing which condition they were under each time.

Question 104. For this experiment, the explanatory and response variables are

(A) sobriety condition and driving accuracy, respectively
(B) sobriety condition and gender, respectively
(C) gender and driving experience, respectively
(D) driving accuracy and sobriety condition, respectively
(E) gender and driving accuracy, respectively

Question 105. This experiment is

(A) double-blind
(B) single-blind
(C) neither single-blind nor double-blind

Question 106. This experiment is

(A) a matched-pair design
(B) a block design
(C) neither a block design nor a matched-pair design

For each of the following three questions, suppose an observational study finds that people who use public transportation to get to work have better knowledge of current affairs than those who drive to work, but that this relationship is weaker for well-educated people.

Question 107. In this study, whether the participant reads a daily newspaper is

(A) the explanatory variable
(B) an interacting variable
(C) a confounding variable
(D) the response variable

Question 108. In this study, knowledge of current affairs is

(A) the explanatory variable
(B) an interacting variable
(C) a confounding variable
(D) the response variable


Question 109. In this study, level of education is

(A) the explanatory variable
(B) an interacting variable
(C) a confounding variable
(D) the response variable


For the following four questions, assume that adult heights are normally distributed with mean 68 inches and standard deviation 4 inches.

Question 110. What percent of the population is shorter than 72 inches tall? (Hint: The calculations in this question are very easy.)

(A) 58%
(B) 84%
(C) 76%
(D) 69%
(E) 72%

Question 111. Dot is 64 inches tall. What is Dot’s standardized score (z-score)?

(A) (64 − 68)/4, or −1

(B) (68 − 64) × 4, or 16

(C) (68 − 64)/4, or 1

(D) 68/4, or 17

(E) It is impossible to know for sure without also knowing Dot’s percentile.

Question 112. Scott has a standardized score (z-score) of 0. How tall is Scott?

(A) 68 − 4, or 64 inches
(B) 68 inches
(C) 4 + 68 × 0, or 4 inches
(D) 68 + 4, or 72 inches

Question 113. What height is at the 67th percentile?

(A) 75 + 4 × 0.67, or 77.68 inches
(B) 68 + 4 × 0.75, or 71.00 inches
(C) 67 + 4 × 0.75, or 70.00 inches
(D) 68 + 4 × 0.44, or 69.76 inches
(E) 67 + 0.44, or 67.44 inches
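The conversions between heights, z-scores, and percentiles used in these four questions can be sketched with Python's standard library. This assumes the stated normal model (mean 68, standard deviation 4) and uses `statistics.NormalDist` in place of the printed table, so results may differ from table lookups in the last decimal place.

```python
from statistics import NormalDist

heights = NormalDist(mu=68, sigma=4)

# Proportion of the population shorter than 72 inches
share_below_72 = heights.cdf(72)

# Standardized score: (observed value - mean) / standard deviation
dot_z = (64 - heights.mean) / heights.stdev

# Height for a given z-score: mean + z * standard deviation
scott_height = heights.mean + 0 * heights.stdev

# Height at the 67th percentile
height_67th = heights.inv_cdf(0.67)

print(round(share_below_72, 2), dot_z, scott_height, round(height_67th, 2))
```

By hand, the same steps go through the table at the start of the exam: convert the height to a z-score and read off the proportion below it, or read off the z-score for a percentile and convert it back to a height.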
