examview - ch 1 to 4 study guide - miami-dade county ...teachers.dadeschools.net/sdaniel/ch 1 to 4...


Name: ________________________ Class: ___________________ Date: __________ ID: A

AP Stats: Chapters 1 to 4 Study Guide

Multiple Choice
Identify the choice that best completes the statement or answers the question.

1. A policeman records the speeds of cars on a certain section of roadway with a radar gun. The histogram below shows the distribution of speeds for 251 cars.

Which of the following measures of center and spread would be the best ones to use when summarizing these data?
A. Mean and interquartile range.
B. Median and range.
C. Median and standard deviation.
D. Mean and standard deviation.
E. Median and interquartile range.

2. Students with above-average scores on Exam 1 in STAT 001 tend to also get above-average scores on Exam 2. But the relationship is only moderately strong. In fact, a linear relationship between Exam 2 scores and Exam 1 scores explains only 36% of the variance of the Exam 2 scores.
A. The correlation between Exam 1 scores and Exam 2 scores is r = .6.
B. The correlation between Exam 1 scores and Exam 2 scores is r = ± .6 (can't tell which).
C. The correlation between Exam 1 scores and Exam 2 scores is r = ± .36 (can't tell which).
D. The correlation between Exam 1 scores and Exam 2 scores is r = .36.
E. There is not enough information to say what r is.
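The arithmetic behind Question 2 can be sketched as follows. This is a minimal illustration, not part of the original guide: the fraction of variance explained by a linear fit is r², so the magnitude of r is the square root of 0.36, and the sign is taken from the direction of association described in the stem. The variable names are illustrative assumptions.

```python
import math

r_squared = 0.36                  # stem: the line explains 36% of the variance
magnitude = math.sqrt(r_squared)  # |r|

# Stem: above-average Exam 1 scores go with above-average Exam 2 scores,
# so the association (and therefore r) is positive.
sign = 1
r = sign * magnitude
print(round(r, 2))
```

If the stem had not described the direction of the association, only the magnitude of r could be recovered from r².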

3. Suppose we fit the least-squares regression line to a set of data. If a plot of the residuals shows a curved pattern,
A. the correlation must be positive.
B. the correlation must be 0.
C. outliers must be present.
D. r² = 0.
E. a straight line is not a good summary for the data.

4. A stratified random sample addresses the same issues as which of the following experimental designs?
A. A block design.
B. A matched pairs design.
C. A double-blind experiment.
D. An experiment with a placebo.
E. A confounded, nonrandomized study.


5. An example of a nonsampling error that can reduce the accuracy of a sample survey is
A. interviewing people at shopping malls to obtain a sample.
B. using the telephone directory as the sampling frame.
C. many members of the sample cannot be contacted.
D. variation due to chance in choosing a sample at random.
E. using voluntary response to choose the sample.

6. If the heights of 99.7% of American men are between 5' 0" and 7' 0", what is your estimate of the standard deviation of the height of American men?
A. 1"
B. 4"
C. 12"
D. 3"
E. 6"
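A short sketch of the reasoning for Question 6, assuming (as the 99.7% figure suggests) that heights are roughly Normal so the 68-95-99.7 rule applies. The interval holding 99.7% of values then runs from three standard deviations below the mean to three above, a total width of six standard deviations. Variable names are illustrative.

```python
# Convert the endpoints 5'0" and 7'0" to inches.
low_in = 5 * 12
high_in = 7 * 12

# Under the 68-95-99.7 rule, 99.7% of values lie within mean ± 3 SD,
# so the interval is 6 standard deviations wide.
sd_estimate = (high_in - low_in) / 6
print(sd_estimate)
```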

7. A study of the effects of television on child development measured how many hours of television each of 125 grade school children watched per week during a school year and each child’s reading score. Which variable would you put on the horizontal axis of a scatterplot of the data?
A. It makes no difference, because there is no explanatory-response distinction in this study.
B. Reading score, because it is the explanatory variable.
C. Reading score, because it is the response variable.
D. Hours of television, because it is the response variable.
E. Hours of television, because it is the explanatory variable.

Scenario 3-9

A study gathers data on the outside temperature during the winter, in degrees Fahrenheit, and the amount of natural gas a household consumes, in cubic feet per day. Call the temperature x and gas consumption y. The house is heated with gas, so x helps explain y. The least-squares regression line for predicting y from x is:

8. Use Scenario 3-9. What does the number 1344 represent in the equation?
A. Predicted gas usage (in cubic feet) when the temperature is 0 degrees Fahrenheit.
B. Predicted gas usage (in cubic feet) when the temperature is 19 degrees Fahrenheit.
C. The maximum possible gas a household can use.
D. It’s the y-intercept of the regression line, but it has no practical purpose in the context of the problem.
E. None of the above.

9. Use Scenario 3-9. On a day when the temperature is 20°F, the regression line predicts that gas used will be about
A. 1724 cubic feet.
B. 964 cubic feet.
C. 1325 cubic feet.
D. 1383 cubic feet.
E. none of the above.


Scenario 4-2
You want to know the opinions of American school teachers about establishing a national test for high school graduation. You obtain a list of the members of the National Education Association (the largest teachers' union) and mail a questionnaire to 2500 teachers chosen at random from this list. In all, 1347 teachers return the questionnaire.

10. Use Scenario 4-2. The sampling frame is
A. all American school teachers.
B. the 2500 teachers to whom you mailed the questionnaire.
C. all American school students.
D. all members of the National Education Association.
E. the 1347 teachers who mail back the questionnaire.

11. Use Scenario 4-2. The sample is
A. all American school teachers.
B. the 1347 teachers who mail back the questionnaire.
C. all members of the National Education Association.
D. the 2500 teachers to whom you mailed the questionnaire.
E. all American school students.

12. Which of the following statements are true about the least-squares regression line?
I. The slope is the predicted change in the response variable associated with a unit increase in the explanatory variable.
II. The line always passes through the point (x̄, ȳ), the means of the explanatory and response variables, respectively.
III. It is the line that minimizes the sum of the squared residuals.
A. III only.
B. I only.
C. I, II, and III are all true.
D. I and III only.
E. II only.

13. You measure the age, marital status and earned income of an SRS of 1463 women. The number and type of variables you have measured is
A. four; two categorical and two quantitative.
B. three; two categorical and one quantitative.
C. three; one categorical and two quantitative.
D. four; one categorical and three quantitative.
E. 14563.

14. “Least-squares” in the term “least-squares regression line” refers to
A. Minimizing the sum of the squares of all values of the explanatory variable.
B. Minimizing the squares of the differences between each value of the response variable and each value of the explanatory variable.
C. Minimizing the products of each value of the response variable and the predicted value based on the regression equation.
D. Minimizing the sum of the squares of all values of the response variable.
E. Minimizing the sum of the squares of the residuals.


Scenario 4-5
In order to assess the effects of exercise on reducing cholesterol, a researcher took a random sample of fifty people from a local gym who exercised regularly and another random sample of fifty people from the surrounding community who did not exercise regularly. They all reported to a clinic to have their cholesterol measured. The subjects were unaware of the purpose of the study, and the technician measuring the cholesterol was not aware of whether or not subjects exercised regularly.

15. Use Scenario 4-5. Which of the following best describes the inferences the researcher can make based on his results?
A. He cannot make inferences about either cause and effect or the populations from which the samples were taken.
B. There is not enough information to make judgments about the scope of inference.
C. He can make inferences about both cause and effect and the populations from which the samples were taken.
D. He can make inferences about cause and effect, but not about the populations from which the samples were taken.
E. He can make inferences about the populations from which the samples were taken, but not about cause and effect.

16. Use Scenario 4-5. This is a(n)
A. matched pairs experiment.
B. experiment, but not a double blind experiment.
C. double blind experiment.
D. observational study.
E. block design.

17. Frequently, telephone poll-takers call near dinner time (between 6 pm and 7 pm) because most people are at home then. This is an effort to avoid
A. a convenience sample.
B. response bias.
C. calling people after they have gone to bed.
D. nonresponse.
E. voluntary response bias.

18. A lobster fisherman is keeping track of the productivity of a set of traps he has placed in a favorite location. Below are the numbers of lobsters in these traps over the course of 12 different hauls.

0 3 3 3 4 5 5 6 7 7 12 14

According to the 1.5 × IQR rule, which values in the above distribution are outliers?
A. 14 only
B. 12 and 14
C. 0, 12, and 14
D. 0 and 14
E. 0 only
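The 1.5 × IQR rule in Question 18 can be checked with a small sketch. It assumes the common textbook convention that Q1 and Q3 are the medians of the lower and upper halves of the sorted data (other software uses slightly different quartile definitions, which can shift the fences). The helper name `quartiles` is hypothetical.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    When the number of values is odd, the middle value is left out of both halves.
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    upper = data[len(data) - half:]
    return median(lower), median(upper)

hauls = [0, 3, 3, 3, 4, 5, 5, 6, 7, 7, 12, 14]  # data from Question 18

q1, q3 = quartiles(hauls)
iqr = q3 - q1
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr

# A value is flagged as an outlier if it falls outside the fences.
outliers = [x for x in hauls if x < low_fence or x > high_fence]
print(q1, q3, iqr, low_fence, high_fence, outliers)
```

With these data the fences sit at -3 and 13, so 12 is inside the upper fence and 0 is inside the lower one.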


19. A study of child development measures the age (in months) at which a child begins to talk and also the child's score on an ability test given several years later. The study asks whether the age at which a child talks helps predict the later test score. The least-squares regression line of test score y on age x is y = 110 – 1.3x. According to this regression line, what happens (on the average) to children who talk one month later than other children?
A. Their predicted test scores go down 1.3 points.
B. Their predicted test scores are 108.7.
C. Their predicted test scores go up 110 points.
D. Their predicted test scores go up 1.3 points.
E. Their predicted test scores go down 110 points.

Scenario 1-2
Below is a two-way table summarizing the number of cylinders in selected car models manufactured in six different countries in the 1990s.

              Number of cylinders
Country       4     5     6     8   Total
France        0     0     1     0       1
Germany       4     1     0     0       5
Italy         1     0     0     0       1
Japan         6     0     1     0       7
Sweden        1     0     1     0       2
U.S.A.        7     0     7     8      22
Total        19     1    10     8      38

20. Use Scenario 1-2. Which of the following is a marginal distribution?
A. The percentage of cars manufactured in Germany for each number of cylinders.
B. The percentage of all cars manufactured in each country.
C. The numbers 4, 5, 6, 8.
D. The number of four-cylinder cars manufactured in Germany.
E. The percentage of all four-cylinder cars manufactured in Germany.

21. Use Scenario 1-2. From this table, we might conclude that
A. the only eight cylinder cars in this data set were manufactured in Germany.
B. about 18% of the cars sold in the United States were manufactured in Japan.
C. there is a strong association between country of origin and number of cylinders.
D. All the cars on Italian roads have four cylinders.
E. these data could be more effectively presented with a box plot.

22. Use Scenario 1-2. The percentage of all cars listed in the table with 4-cylinder engines is
A. 50%.
B. 80%.
C. 91%.
D. 21%.
E. 19%.

23. Use Scenario 1-2. The percent of cars with 4-cylinder engines that are made in Germany is
A. 80%.
B. 50%.
C. 21%.
D. 10.5%.
E. 91%.
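Questions 22 and 23 differ only in the denominator: one is a marginal percentage (out of all cars), the other a conditional percentage (out of 4-cylinder cars only). The sketch below encodes the Scenario 1-2 counts in a nested dictionary; the data layout and variable names are illustrative choices, not something the guide prescribes.

```python
# Counts from the Scenario 1-2 table: country -> {cylinders: number of models}
table = {
    "France":  {4: 0, 5: 0, 6: 1, 8: 0},
    "Germany": {4: 4, 5: 1, 6: 0, 8: 0},
    "Italy":   {4: 1, 5: 0, 6: 0, 8: 0},
    "Japan":   {4: 6, 5: 0, 6: 1, 8: 0},
    "Sweden":  {4: 1, 5: 0, 6: 1, 8: 0},
    "U.S.A.":  {4: 7, 5: 0, 6: 7, 8: 8},
}

grand_total = sum(sum(row.values()) for row in table.values())
four_cyl_total = sum(row[4] for row in table.values())

# Marginal: share of ALL cars that have 4 cylinders (Question 22).
pct_four_cyl = 100 * four_cyl_total / grand_total

# Conditional: share of 4-CYLINDER cars that are German (Question 23).
pct_german_given_four = 100 * table["Germany"][4] / four_cyl_total

print(grand_total, four_cyl_total)
print(round(pct_four_cyl, 1), round(pct_german_given_four, 1))
```

Swapping the condition changes the answer: the share of German cars that have 4 cylinders would instead divide by the Germany row total of 5.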

24. If removing an observation from a data set would have a marked change on the equation of the least-squares regression line, the point is called
A. a response.
B. an outlier.
C. resistant.
D. influential.
E. a residual.

25. You open a package of plain M & M candies and count how many there are of each color. The distribution of the variable “candy color” is:
A. The total number of candies in the package.
B. Six, the number of different colors there are in the package.
C. The six different colors and how many there are of each.
D. Since “color” is a categorical variable, it doesn’t have a distribution.
E. The colors: Red, Orange, Green, Yellow, Brown, and Blue.


26. An opinion research firm wants to find the country’s reaction to a speech by a famous politician. They randomly select six states, then randomly select ten Zip Codes from each state. Fifty people from each Zip Code are randomly selected for the survey. This is an example of
A. multistage sampling.
B. convenience sampling.
C. stratified random sampling.
D. simple random sampling.
E. cluster sampling.

27. IQs among undergraduates at Mountain Tech are approximately Normally distributed. The mean undergraduate IQ is 110. About 95% of undergraduates have IQs between 100 and 120. The standard deviation of these IQs is about
A. 10.
B. 5.
C. 20.
D. 25.
E. 15.
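One way to sanity-check an answer to Question 27 is to work backwards: take the standard deviation implied by the 68-95-99.7 rule and confirm that a Normal model with that spread really puts about 95% of IQs between 100 and 120. The sketch uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from statistics import NormalDist

mean, low, high = 110, 100, 120

# About 95% of a Normal distribution lies within mean ± 2 SD,
# so the interval from 100 to 120 is 4 standard deviations wide.
sd = (high - low) / 4

# Check: proportion of the model between the two endpoints.
iq = NormalDist(mu=mean, sigma=sd)
coverage = iq.cdf(high) - iq.cdf(low)
print(sd, round(coverage, 3))
```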

28. One hundred volunteers who suffer from severe depression are available for a study. Fifty are selected at random and are given a new drug that is thought to be particularly effective in treating severe depression. The other fifty are given an existing drug for treating severe depression. A psychiatrist evaluates the symptoms of all volunteers after four weeks in order to determine if there has been substantial improvement in the severity of the depression. The study would be double blind if
A. all volunteers were not allowed to see the psychiatrist nor the psychiatrist allowed to see the volunteers during the session in which the psychiatrist evaluated the severity of the depression.
B. neither drug had any identifying marks on it.
C. neither the volunteers nor the psychiatrist knew which treatment any person had received.
D. the patients were given a placebo.
E. all of the above.

29. You want to use numerical summaries to describe a distribution that is strongly skewed to the left. Which combination of measure of center and spread would be the best ones to use?
A. Mean and interquartile range.
B. Median and standard deviation.
C. Mean and standard deviation.
D. Median and interquartile range.
E. Median and range.

30. The standard deviation of 16 people’s weights (in pounds) is computed to be 5.4. The variance of these measurements is
A. 21.6.
B. 2.24.
C. 29.16.
D. 256.
E. 52.34.
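Question 30 rests on a single relationship: the variance is the square of the standard deviation, so the sample size of 16 plays no role once the standard deviation is given. A minimal check, with illustrative variable names:

```python
sd_pounds = 5.4                # standard deviation, in pounds

# Variance is the standard deviation squared, so its units are pounds squared.
variance = sd_pounds ** 2
print(round(variance, 2))
```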

31. Which of the following statements concerning residuals is true?
A. The value of a residual is the observed value of the response minus the value of the response that one would predict from the least-squares regression line.
B. An influential point on a scatterplot is not necessarily the point with the largest residual.
C. A plot of the residuals is useful for assessing the fit of the least-squares regression line.
D. The sum of the residuals is always 0.
E. All of the above.
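Two of the residual facts used in this guide (residual = observed minus predicted, and the residuals from a least-squares line summing to 0) can be seen numerically. The sketch below fits a least-squares line by the usual formulas to a small data set that is invented purely for illustration; it is not data from the guide.

```python
from statistics import mean

# Hypothetical data, made up for this illustration only.
x = [1, 2, 3, 4, 5]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

x_bar, y_bar = mean(x), mean(y)

# Least-squares slope and intercept.
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Residual = observed response minus predicted response.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

print(round(sum(residuals), 10))                       # residuals sum to 0 (up to rounding)
print(round(intercept + slope * x_bar - y_bar, 10))    # line passes through (x̄, ȳ)
```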

32. A researcher wishes to determine whether the rate of water flow (in liters per second) over an experimental soil bed can be used to predict the amount of soil washed away (in kilograms). In this study, the explanatory variable is
A. depth of soil bed.
B. rate of water flow.
C. liters/second.
D. size of soil bed.
E. amount of eroded soil.


33. Entomologist Heinz Kaefer has a colony of bongo spiders in his lab. There are 1000 adult spiders in the colony, and their weights are Normally distributed with mean 11 grams and standard deviation 2 grams. About how many spiders are there in the colony which weigh more than 12 grams?
A. 160
B. 310
C. 117
D. 690
E. 840
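For Question 33, 12 grams is half a standard deviation above the mean, which is not one of the 68-95-99.7 benchmarks, so a table or software is needed. A minimal sketch using `statistics.NormalDist` from the Python standard library in place of a printed Normal table (variable names are illustrative):

```python
from statistics import NormalDist

colony_size = 1000
weights = NormalDist(mu=11, sigma=2)   # model from the question stem

z = (12 - 11) / 2                      # standardized value of 12 grams
p_heavier = 1 - weights.cdf(12)        # area to the right of 12 grams
expected_count = colony_size * p_heavier

print(z, round(p_heavier, 4), round(expected_count))
```

The result is an expected count under the Normal model, so it should be compared with the nearest answer choice rather than read as an exact headcount.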

34. You catch 10 cockroaches in your bedroom and measure their lengths in centimeters. Which of these sets of numerical descriptions are all measured in centimeters?
A. median length, variance of lengths, largest length
B. median length, first and third quartiles of lengths
C. mean length, standard deviation of lengths, median length
D. mean length, median length, variance of lengths.
E. both (B) and (C)

35. When controlled experiments are impractical or unethical, which of the following would be necessary to establish a cause-and-effect relation between two variables?
A. An association between the variables is observed in many different settings.
B. Strong association between the variables.
C. The alleged cause is plausible.
D. There is no obvious lurking variable that would affect the response variable.
E. All of the above.

Scenario 4-1
A sportswriter wants to know how strongly Lafayette residents support the local minor league baseball team, the Lafayette Leopards. She stands outside the stadium before a game and interviews the first 20 people who enter the stadium.

36. Use Scenario 4-1. The intended population for this survey is
A. the 20 people who gave the sportswriter their opinion.
B. all American adults.
C. all residents of Lafayette.
D. all Leopard fans.
E. all people attending the game the day the survey was conducted.

37. Which of the following is correct?
A. The square of the correlation is the proportion of the data lying on the least-squares regression line.
B. The mean of the residuals from least-squares regression is 0.
C. The square of the correlation is the slope of the least-squares regression line.
D. The correlation r is the slope of the least-squares regression line.
E. The sum of the squared residuals from the least-squares line is 0.

38. Two variables are said to be negatively associated if
A. larger values of one variable are associated with larger values of the other.
B. smaller values of one variable are associated with smaller values of the other.
C. smaller values of one variable are associated with both larger or smaller values of the other.
D. larger values of one variable are associated with smaller values of the other.
E. there is no pattern in the relationship between the two variables.


39. Scores on the 1995 SAT verbal aptitude test x among Kentucky high school seniors were normally distributed with mean 420 and standard deviation 80. Scores on the 1995 SAT quantitative aptitude test y among Kentucky high school seniors were normally distributed with mean 440 and standard deviation 60. The least-squares regression line has the equation y = .6x + 188. The correlation between verbal scores and math scores is
A. .8
B. 0
C. –.8
D. .45
E. cannot be determined from the information given
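Question 39 can be approached by rearranging the slope formula for a least-squares line, b = r · (s_y / s_x), to solve for r. The sketch below does that with the numbers in the stem and also checks that the stated intercept is consistent with the line passing through the point of means. Variable names are illustrative.

```python
# Values from the question stem
mean_x, sd_x = 420, 80     # verbal scores (explanatory)
mean_y, sd_y = 440, 60     # quantitative scores (response)
slope, intercept = 0.6, 188

# Slope of a least-squares line: b = r * (sd_y / sd_x), so r = b * sd_x / sd_y.
r = slope * sd_x / sd_y

# Consistency check: the line should pass through (mean_x, mean_y).
predicted_at_mean = slope * mean_x + intercept

print(round(r, 2), round(predicted_at_mean, 1))
```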


Scenario 3-7

Below is a scatter plot (with the least squares regression line) for calories and protein (in grams) in one cup of 11 varieties of dried beans. The computer output for this regression is below the plot.

40. Use Scenario 3-7. Which of the following best describes what the number S = 3.37648 represents?
A. The ratio of the standard deviation of protein to the standard deviation of calories is 3.37648.
B. The standard deviation of the residuals is 3.37648.
C. The standard deviation of the explanatory variable, calories, is 3.37648.
D. The standard deviation of the response variable, protein content, is 3.37648.
E. The slope of the regression line is 3.37648.

41. Use Scenario 3-7. The circled point on the scatter plot represents lima beans, which have 621 calories and 37 grams of protein. The residual for lima beans is:
A. 4.18
B. –4.18
C. –37.0
D. 41.18
E. 37.0

42. In the late 1990s Scotland was considering independence from England. An opinion poll showed that 51% of Scots favor "independence." Another poll taken at the same time showed that only 34% favored being "separate" from England. The reason these results differ by so much is that
A. the wording of questions has a big effect on poll results.
B. more follow-up efforts reduced the nonresponse rate of the second poll.
C. samples will usually differ just by chance due to random sampling.
D. the second poll suffered from undercoverage.
E. the sample sizes are different, so the margins of error are different.


Scenario 3-5
In a statistics course a linear regression equation was computed to predict the final exam score from the score on the first test. The equation of the least-squares regression line was

where ŷ represents the predicted final exam score and x is the score on the first exam.

43. Use Scenario 3-5. Suppose Joe scores a 90 on the first exam. What would be the predicted value of his score on the final exam?
A. 81
B. 90
C. 89
D. 91
E. Cannot be determined from the information given. We also need to know the correlation.

44. Use Scenario 3-5. The first test score is
A. the response variable.
B. a lurking variable.
C. the explanatory variable.
D. the intercept.
E. the slope.

45. The Normal curve below describes the death rates per 100,000 people in developed countries in the 1990s.

The mean and standard deviation of this distribution are approximately
A. Mean 190; Standard Deviation 65
B. Mean 100; Standard Deviation 100
C. Mean 200; Standard Deviation 130
D. Mean 100; Standard Deviation 65
E. Mean 190; Standard Deviation 100

46. The essential difference between an experiment and an observational study is that
A. observational studies may have confounded variables, but experiments never do.
B. an experiment imposes treatments on the subjects, but an observational study does not.
C. observational studies cannot have response variables.
D. in an experiment, people must give their informed consent before being allowed to participate.
E. observational studies are always biased.

47. The standard deviation of 16 people’s weights (in pounds) is computed to be 5.4. The units for the variance of these measurements are
A. pounds.
B. percentiles.
C. pounds squared.
D. square root pounds.
E. no units. Variance never has units.

48. A local tax reform group polls the residents of the school district and asks the question, “Do you think the school board should stop spending taxpayers’ money on non-essential arts programs in elementary schools?” The results of this poll are likely to
A. Underestimate support for arts programs because of nonsampling error.
B. Accurately estimate support for arts programs.
C. Overestimate support for arts programs because of undercoverage.
D. Overestimate support for arts programs because of nonsampling error.
E. Underestimate support for arts programs because of undercoverage.

49. An experiment compares the taste of a new spaghetti sauce with the taste of a commercially successful sauce readily available in grocery stores. Each of a number of tasters tastes both sauces (in random order) and says which tastes better. This is called a
A. double-blind design.
B. simple random sample.
C. completely randomized design.
D. matched pairs design.
E. stratified random sample.


50. A study is conducted to determine if one can predict the yield of a crop based on the amount of fertilizer applied to the soil. The response variable in this study is
A. amount of rainfall.
B. yield of the crop.
C. the experimenter.
D. amount of fertilizer applied to the soil.
E. the soil.

51. Alexa’s school newspaper publishes an article saying that a poll of 200 male and female students indicated that 60% of the male students did the summer reading, while only 45% of the female students did the summer reading. Alexa suspects that this is a distortion of the true facts, and that Simpson’s paradox is to blame. She suspects that grade level (9th through 12th) is a lurking variable. What should she do to investigate her suspicions?
A. Make two two-way tables, one for males, one for females, in which the two variables are grade level and yes/no for summer reading.
B. Look at the conditional distributions in a two-way table of gender versus yes/no for summer reading.
C. Compare grade level and yes/no for summer reading in one two-way table, without dividing students according to gender.
D. Undertake a new poll and only ask students in grade 12 about the summer reading.
E. Draw parallel box plots of the summer reading data for males and females to see if there is a difference in the shape or center of the two distributions.

52. A statistics teacher asks the 29 students in his statistics class how many minutes they spent on one homework assignment. The distribution of the variable “time on homework” is
A. the number of students who were asked the questions, that is, 29.
B. the difference between the longest time and the shortest time among the students’ responses.
C. a description of what values the variable takes and how often it takes them.
D. the average distance between each value of the variable.
E. the average time the students spent on the assignment.

53. Items produced by a manufacturing process are supposed to weigh 90 grams. The manufacturing process is such, however, that there is variability in the items produced and they do not all weigh exactly 90 grams. The distribution of weights can be approximated by a Normal distribution with mean 90 grams and a standard deviation of 1 gram. About what percentage of the items will either weigh less than 87 grams or more than 93 grams?
A. 94%
B. 0.3%
C. 0.15%
D. 6%
E. 99.7%

Scenario 4-6
Does caffeine improve exam performance? Suppose all students in the 8:30 section of a course are given a "treatment" (two cups of coffee) and all students in the 9:30 section are not permitted to have any caffeine before a mid-term exam.

54. Use Scenario 4-6. The response variable in this study is
A. teacher's performance.
B. two cups of coffee.
C. exam performance.
D. the time the class is held.
E. class attendance.

55. Use Scenario 4-6. Instead of giving all students in the 8:30 section two cups of coffee, students in the 8:30 section are randomly assigned to a treatment group (two cups of coffee) or a control group (two cups of decaffeinated coffee). The coffee is so bad that students cannot tell whether they are in the treatment or the control group. As it turns out, students in both groups do better on the exam than students in the 9:30 section, who weren't given anything. This could be the result of
A. sampling variability.
B. voluntary response.
C. an observational study.
D. the placebo effect.
E. all of the above.


Scenario 3-6
A researcher wishes to study how the average weight Y (in kilograms) of children changes during the first year of life. He plots these averages versus the age X (in months) and decides to fit a least-squares regression line to the data with X as the explanatory variable and Y as the response variable. He computes the following quantities.
r = correlation between X and Y = 0.9
x̄ = mean of the values of X = 6.5
ȳ = mean of the values of Y = 6.6
Sx = standard deviation of the values of X = 3.6
Sy = standard deviation of the values of Y = 1.2

56. Use Scenario 3-6. The slope of the least-squares line is
A. 2.7.
B. 3.0.
C. 1.01.
D. 0.30.
E. 0.88.

57. Use Scenario 3-6. The y-intercept of the least-squares line is
A. 4.52
B. 4.65
C. 8.55
D. 8.48
E. –10.95
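Questions 56 and 57 use the two standard formulas for a least-squares line built from summary statistics: slope b = r · (Sy / Sx), and intercept a = ȳ − b · x̄ (which follows from the line passing through the point of means). A short sketch with the Scenario 3-6 values; the variable names are illustrative.

```python
# Summary statistics from Scenario 3-6
r = 0.9
x_bar, y_bar = 6.5, 6.6    # mean age (months), mean weight (kg)
s_x, s_y = 3.6, 1.2        # standard deviations of X and Y

# Slope: b = r * (s_y / s_x)
slope = r * s_y / s_x

# Intercept: the line passes through (x_bar, y_bar), so a = y_bar - b * x_bar
intercept = y_bar - slope * x_bar

print(round(slope, 2), round(intercept, 2))
```

Note that the ratio is Sy over Sx; inverting it is a common slip and gives a slope of 2.7 instead.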

58. Which one of the following statements is correct?
A. Faculty who are good researchers tend to be poor teachers and vice versa, so the correlation between teaching and research is 0.
B. Women tend to be, on average, about 3.5 inches shorter than the men they marry, so the correlation between the heights of spouses must be negative.
C. The correlation r equals the proportion of times that two variables lie on a straight line.
D. If people with larger heads tend to be more intelligent, then we would expect the correlation between head size and intelligence to be positive.
E. A researcher finds the correlation between the shoe size of children and their score on a reading test to be 0.22. The researcher must have made a mistake since these two variables are clearly unrelated and must have correlation 0.

59. The least-squares regression line is fit to a set of data. If one of the data points has a positive residual, then
A. the point must lie near the right edge of the scatterplot.
B. the point must lie above the least-squares regression line.
C. the correlation between the values of the response and explanatory variables must be positive.
D. the point is probably an influential point.
E. all of the above.

60. A company produces packets of soap powder labeled "Giant Size 32 Ounces." The actual weight of soap powder in a box has a Normal distribution with a mean of 33 oz. and a standard deviation of 0.8 oz. What proportion of packets are underweight (i.e., weigh less than 32 oz.)?
A. 0.841.
B. 0.106.
C. 0.159.
D. 0.115.
E. 0.212.
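Question 60 is a left-tail Normal calculation: standardize 32 oz., then find the area below that z-score. The sketch below uses `statistics.NormalDist` from the Python standard library as a stand-in for a printed Normal table; variable names are illustrative.

```python
from statistics import NormalDist

label_weight = 32
fill = NormalDist(mu=33, sigma=0.8)    # model from the question stem

z = (label_weight - 33) / 0.8          # standardized label weight
p_underweight = fill.cdf(label_weight) # area to the left of 32 oz.

print(round(z, 2), round(p_underweight, 3))
```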

61. A survey typically records many variables of interest to the researchers involved. Below are some of the variables from a survey conducted by the U.S. Postal Service. Which of the variables is categorical?
A. Total household income, before taxes, in 1993
B. County of residence
C. Number of people, both adults and children, living in the household
D. Number of rooms in the dwelling
E. Age of respondent

Scenario 4-7
A farmer wishes to determine which of two brands of baby pig pellets, Kent or Moormans, produces better weight gains. Two of his sows each give birth to litters of 10 pigs on the same day, so he decides to give the baby pigs in litter A only Kent pellets, while the pigs in litter B will get only Moormans pellets. After four weeks, the average weight gain for pigs in litter A is greater than the average weight gain for pigs in litter B.

62. Use Scenario 4-7. If the farmer had fed Kent pellets to an SRS of 5 pigs from litter A and an SRS of 5 pigs from litter B, with the remaining 10 pigs getting Moormans pellets, then he would have been using
A. a systematic random sample.
B. a block design.
C. a convenience sample.
D. a matched-pairs design.
E. a completely randomized design.
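The assignment described in question 62, a separate SRS taken within each litter, can be sketched as follows. The pig labels and the fixed seed are hypothetical and serve only to make the example repeatable.

```python
import random

rng = random.Random(1)  # fixed seed only so the sketch is repeatable

# Hypothetical labels for the 10 pigs in each litter.
litters = {
    "A": [f"A{i}" for i in range(1, 11)],
    "B": [f"B{i}" for i in range(1, 11)],
}

assignment = {}
for litter, pigs in litters.items():
    # Randomize separately inside each litter: an SRS of 5 pigs gets Kent.
    kent_group = set(rng.sample(pigs, 5))
    for pig in pigs:
        assignment[pig] = "Kent" if pig in kent_group else "Moormans"

for litter, pigs in litters.items():
    kent_count = sum(assignment[p] == "Kent" for p in pigs)
    print(f"litter {litter}: {kent_count} Kent, {len(pigs) - kent_count} Moormans")
```

Because each litter contributes equally to both feeds, any genetic difference between the litters is spread evenly across the two treatments.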

63. Use Scenario 4-7. The feed they get is not the only factor affecting the rate at which pigs gain weight. Genetic differences also affect weight gain. It is likely that the pigs in litter A are genetically different from the pigs in litter B, since the two litters have different mothers. Since the farmer is only interested in determining which brand of pellets is better, the study suffers from
A. invalid measurement.
B. experimenter bias.
C. common response.
D. convenience sampling.
E. confounding.

64. The plot shown below is a Normal probability plot for the total annual cost (tuition plus room and board) to attend 126 of the top colleges in the country in 2005. Which statement is true for these data?
A. The data are approximately Normally distributed.
B. The data are clearly Normally distributed.
C. The data are clearly skewed to the left.
D. There is insufficient information to determine the shape of the distribution.
E. The data are clearly skewed to the right.

65. Let X denote the time taken for a computer link to be made between the terminal in an executive's office and the computer at a remote factory site. It is known that X has a Normal distribution with a mean of 15 seconds and a standard deviation of 3 seconds. On 90% of the occasions the computer link is made in less than
A. 18.11 seconds.
B. 18.84 seconds.
C. 11.16 seconds.
D. 19.39 seconds.
E. 15.95 seconds.
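Question 65 runs the Normal calculation backwards: the area is given and the cutoff value is wanted. A minimal sketch, again assuming Python's standard-library `statistics.NormalDist` (variable names are illustrative):

```python
from statistics import NormalDist

# Link time: Normal, mean 15 s, sd 3 s (from question 65).
link_time = NormalDist(mu=15, sigma=3)

# Value with 90% of the distribution to its left (the 90th percentile).
z90 = NormalDist().inv_cdf(0.90)   # z-score for the 90th percentile
t90 = link_time.inv_cdf(0.90)      # same percentile on the original scale

print(f"z = {z90:.2f}, 90th percentile = {t90:.2f} seconds")
```

With a printed table, the equivalent step is to find the z-score whose left-hand area is closest to 0.90 and then unstandardize with mean + z × sd.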

66. Which of the following is not a major principle of good design for all experiments?
A. Replication
B. Comparison to a control
C. Blocking
D. Randomization
E. All of these are important principles for every experiment.

Scenario 3-8

A fisheries biologist studying whitefish in a Canadian Lake collected data on the length (in centimeters) and egg production for 25 female fish. A scatter plot of her results and computer regression analysis of egg production versus fish length are given below. Note that Number of eggs is given in thousands (i.e., “40” means 40,000 eggs).

Predictor      Coef       SE Coef   T       P
Constant       -142.74    25.55     -5.59   0.000
Fish length    39.250     5.392     7.28    0.000

S = 6.75133 R-Sq = 69.7% R-Sq(adj) = 68.4%

67. Use Scenario 3-8. Which of the following is the plot of residuals versus fish lengths?

A.

B.

C.

D.

E.

68. Use Scenario 3-8. On average, how far are the predicted y-values from the actual y-values?
A. 0.697
B. 25.55
C. 5.392
D. 6.75133
E. Cannot be determined without the original data.

69. Use Scenario 3-8. The equation of the least-squares regression line is
A. Eggs = –142.74 + 39.25(Length)
B. Eggs = 25.55 + 5.392(Length)
C. Eggs = –142.74 + 39.25(Eggs)
D. Eggs = 25.55 + 5.392(Eggs)
E. Eggs = 39.25 – 142.74(Length)

70. Use Scenario 3-8. Which of the following statements can be made on the basis of the computer output?
A. 68.4% of the variation in fish length can be accounted for by the linear regression of egg production on fish length.
B. 83.5% of the variation in egg production can be accounted for by the linear regression of egg production on fish length.
C. 83.5% of the variation in fish length can be accounted for by the linear regression of egg production on fish length.
D. 69.7% of the variation in egg production can be accounted for by the linear regression of egg production on fish length.
E. 69.7% of the variation in fish length can be accounted for by the linear regression of egg production on fish length.
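The quantities in the Scenario 3-8 computer output relate to one another in a way that is easy to check. The sketch below only re-reads numbers printed in the output; the function name and the example length are hypothetical and chosen purely for illustration.

```python
import math

# Values read from the computer output in Scenario 3-8.
intercept = -142.74   # "Constant" row, Coef column
slope = 39.250        # "Fish length" row, Coef column
r_squared = 0.697     # R-Sq = 69.7%

def predicted_eggs(length):
    """Predicted egg count (in thousands) for a given fish length."""
    return intercept + slope * length

# r has the same sign as the slope, and its magnitude is sqrt(R-Sq).
r = math.copysign(math.sqrt(r_squared), slope)

example_length = 4.5  # hypothetical input, not a value from the data set
print(f"r = {r:.3f}")
print(f"predicted eggs: {predicted_eggs(example_length):.1f} thousand")
```

Note that R-Sq describes the fraction of variation in the response explained by the line, while its square root is the correlation, which is why both figures can appear among answer choices.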

71. A company produces packets of soap powder that are labeled "Giant Size 32 Ounces." The actual weight of soap powder in a box has a Normal distribution with a mean of 33 oz. and a standard deviation of 0.7 oz. 95% of packets actually contain more than x oz. of soap powder. What is x?
A. 32.88
B. 34.15
C. 31.60
D. 31.85
E. 34.40
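Question 71 asks for a value with a given area to its right rather than to its left. One way to set up the calculation, using the table value z ≈ –1.645 for a left-hand area of 0.05, is sketched here:

```latex
P(X > x) = 0.95 \iff P(X \le x) = 0.05, \qquad z_{0.05} \approx -1.645
x = \mu + z_{0.05}\,\sigma = 33 + (-1.645)(0.7) \approx 31.85 \text{ oz}
```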

Scenario 3-4
Consider the following scatterplot of amounts of CO (carbon monoxide) and NOX (nitrogen oxide) in grams per mile driven in the exhausts of cars. The least-squares regression line has been drawn in the plot.

72. Use Scenario 3-4. In the scatterplot, the point indicated by the open circle
A. is an outlier.
B. has a zero value for the correlation.
C. has a negative value for the residual.
D. has a positive value for the residual.
E. has a zero value for the residual.

73. Simple random sampling
A. reduces variability.
B. offsets bias resulting from undercoverage and nonresponse.
C. reduces bias resulting from poorly worded questions.
D. reduces bias resulting from the behavior of the interviewer.
E. None of the above.

74. In a study of the link between high blood pressure and cardiovascular disease, a group of white males aged 35 to 64 was followed for 5 years. At the beginning of the study, each man had his blood pressure measured and it was classified as either "low" systolic blood pressure (less than 140 mm Hg) or "high" blood pressure (140 mm Hg or higher). The following table gives the number of men in each blood pressure category and the number of deaths from cardiovascular disease during the 5-year period.

Blood pressure   Deaths   Total
Low              10       2000
High             5        3500

Based on these data, which of the following statements is correct?
A. The mortality rate (proportion of deaths) for men with high blood pressure is 5 times that of men with low blood pressure.
B. These data are consistent with the idea that there is a link between high blood pressure and death from cardiovascular disease.
C. Although there were more deaths in the high blood pressure group, this is expected, because there were 1500 more men in that group.
D. These data probably understate the link between high blood pressure and death from cardiovascular disease, because men will tend to understate their true blood pressure.
E. All of the above.

75. Which of the following is true of the correlation r?
A. –1 ≤ r ≤ 1.
B. It is a resistant measure of association.
C. Whenever all the data lie on a perfectly straight line, the correlation r will always be equal to +1.0.
D. If r is the correlation between X and Y, then –r is the correlation between Y and X.
E. All of the above.

Scenario 3-2
The following table and scatter plot present data on wine consumption (in liters per person per year) and death rate from heart attacks (in deaths per 100,000 people per year) in 19 developed Western countries.

Wine Consumption and Heart Attacks
Country      Alcohol from wine   Heart disease deaths   Country          Alcohol from wine   Heart disease deaths
Australia    2.5                 211                    Netherlands      1.8                 167
Austria      3.9                 167                    New Zealand      1.9                 266
Belgium      2.9                 131                    Norway           0.8                 227
Canada       2.4                 191                    Spain            6.5                 86
Denmark      2.9                 220                    Sweden           1.6                 115
Finland      0.8                 297                    Switzerland      5.8                 285
France       9.1                 71                     United Kingdom   1.3                 199
Iceland      0.8                 211                    United States    1.2                 172
Ireland      0.7                 300                    West Germany     2.7
Italy        7.9                 107

76. Use Scenario 3-2. The correlation between wine consumption and heart disease deaths is one of the following values. From the scatterplot, which must it be?
A. r is very close to 0
B. r = 0.84
C. r = –0.84
D. r = –0.25
E. r = 0.25

77. Use Scenario 3-2. If heart disease death rate were expressed as deaths per 1,000 people instead of as deaths per 100,000 people, how would the correlation r between wine consumption and heart disease death rate change?
A. r would be multiplied by 10.
B. r would be divided by 100.
C. r would not change.
D. r would be divided by 10.
E. r would be multiplied by 100.
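The effect of a change of units on r, as raised in question 77, can be explored directly. The sketch below uses only the first seven rows of the Scenario 3-2 table as printed (Australia through France), so the value of r it produces is for that subset and not for all 19 countries. The helper function `correlation` is written out here for illustration.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Pearson r: sum of products of standardized scores, divided by n - 1."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    n = len(xs)
    return sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
               for x, y in zip(xs, ys)) / (n - 1)

# Australia, Austria, Belgium, Canada, Denmark, Finland, France (table subset).
wine = [2.5, 3.9, 2.9, 2.4, 2.9, 0.8, 9.1]
deaths_per_100k = [211, 167, 131, 191, 220, 297, 71]

# Re-express the same death rates per 1,000 people.
deaths_per_1k = [d / 100 for d in deaths_per_100k]

r_100k = correlation(wine, deaths_per_100k)
r_1k = correlation(wine, deaths_per_1k)
print(f"per 100,000: r = {r_100k:.4f}")
print(f"per 1,000:   r = {r_1k:.4f}")
```

Rescaling a variable changes its mean and standard deviation by the same factor, so each standardized score, and therefore r, is left as it was.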

78. Use Scenario 3-2. Which country is represented by the clear triangle in the scatter plot?
A. Canada
B. Italy
C. Belgium
D. New Zealand
E. Finland

79. Use Scenario 3-2. The scatterplot shows that
A. the amount of wine a country drinks is not related to its heart disease death rate.
B. country is the explanatory variable.
C. heart disease deaths is the explanatory variable.
D. countries that drink more wine have lower death rates from heart disease.
E. countries that drink more wine have higher death rates from heart disease.

Scenario 1-6
The following is a boxplot of the birth weights (in ounces) of a sample of 160 infants born in a local hospital.

80. Use Scenario 1-6. About 40 of the birthweights were below
A. 132 ounces.
B. 92 ounces.
C. 122 ounces.
D. 112 ounces.
E. 102 ounces.

81. Use Scenario 1-6. The median birthweight is approximately
A. 100 ounces.
B. 80.5 ounces.
C. 110 ounces.
D. 90 ounces.
E. 120 ounces.

82. Using the standard Normal distribution tables, the area under the standard Normal curve corresponding to –0.5 < Z < 1.2 is
A. 0.2815.
B. 0.5764.
C. 0.8849.
D. 0.3661.
E. 0.3085.
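Reading the interval in question 82 as –0.5 < Z < 1.2, the area between two z-values is the difference of two left-hand table areas. A worked version with the usual four-decimal table entries:

```latex
P(-0.5 < Z < 1.2) = \Phi(1.2) - \Phi(-0.5) = 0.8849 - 0.3085 = 0.5764
```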

83. The five-number summary of the distribution of 316 scores on a statistics exam is:
0 26 31 36 50
The scores are approximately Normal. The standard deviation of test scores must be about
A. 7.5.
B. 5.0.
C. 10.
D. 55.
E. 0.67.
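One way to approach question 83 is to use the fact that, in any Normal distribution, the quartiles sit about 0.67 standard deviations either side of the mean. The sketch below turns that into a rough estimate; it is an approximation that relies on the stated Normality, and the variable names are illustrative.

```python
from statistics import NormalDist

# Quartiles taken from the five-number summary 0 26 31 36 50.
q1, q3 = 26, 36
iqr = q3 - q1

# z-score of the third quartile of a standard Normal distribution (about 0.674).
z_q3 = NormalDist().inv_cdf(0.75)

# IQR spans from -z_q3 to +z_q3 standard deviations, so sd is about IQR / (2 * z_q3).
sd_estimate = iqr / (2 * z_q3)

print(f"z for Q3 = {z_q3:.3f}, estimated sd = {sd_estimate:.1f}")
```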

84. Are dogs better at tracking the movements of brightly colored objects? Fifteen experienced “disk dogs” who have been trained to catch flying disks in mid-air are given the chance to catch a bright red disk or a plain white disk. Each disk is thrown 10 times for each dog, with the sequence of disks (red or white) determined randomly. The proportion of red disks caught to the proportion of white disks caught is compared for each dog. This is an example of a
A. double-blind design.
B. matched pairs design.
C. simple random sample.
D. stratified random sample.
E. completely randomized design.

85. Birthweights at a local hospital have a Normal distribution with a mean of 110 oz. and a standard deviation of 15 oz. The proportion of infants with birthweights under 95 oz. is about
A. 0.500.
B. 0.025.
C. 0.159.
D. 0.341.
E. 0.841.

86. Just before the presidential election of 1936, the magazine Literary Digest predicted, incorrectly as it turned out, that Alf Landon would defeat Franklin Delano Roosevelt. Landon lost in a landslide. It turned out that the magazine had only polled its own subscribers, plus others from a list of automobile owners and a list of people who had telephone service. All three groups had higher than typical incomes during the Great Depression. This is an example of
A. response bias.
B. voluntary response bias.
C. nonresponse.
D. undercoverage.
E. bias resulting from question wording.

87. Which of the following statements describes what the standard deviation of residuals for a regression equation can be used for?
I. It describes the typical vertical distance between an observed data point and the regression line.
II. It evaluates whether a linear model is appropriate for a set of data.
III. It measures the overall precision of predictions made using the regression equation.
A. I only
B. II only
C. III only
D. Both I and II
E. Both I and III

88. The eight students listed below are enrolled in a new honors course developed by the chemistry department.
1. Alvarez     5. Miller
2. Barlow      6. Pfouts
3. Nahhas      7. Berliner
4. Salter      8. Verducci

Starting at the beginning of the random number list below, choose a simple random sample of four students to be interviewed in detail about the quality of the course. Use the labels attached to the eight names.

41842 81868 71035 09001 43367 49497 54580 81507

The sample you obtain is
A. 4, 1, 8, and 4.
B. Salter, Alvarez, Verducci, Pfouts.
C. Alvarez, Barlow, Salter, and Verducci.
D. Salter, Alvarez, Verducci, Salter.
E. Alvarez, Barlow, Nahhas, and Salter.
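The table-of-random-digits procedure in question 88 can be traced step by step. The sketch below assumes the usual convention: read one digit at a time, skip digits that are not labels (0 and 9), and skip any label already chosen.

```python
# Labels attached to the eight names in question 88.
students = {1: "Alvarez", 2: "Barlow", 3: "Nahhas", 4: "Salter",
            5: "Miller", 6: "Pfouts", 7: "Berliner", 8: "Verducci"}

# The random number list from the question, with the spacing removed.
digits = "41842 81868 71035 09001 43367 49497 54580 81507".replace(" ", "")

sample = []
for ch in digits:
    label = int(ch)
    # Keep the digit only if it is a valid label that has not been used yet.
    if label in students and label not in sample:
        sample.append(label)
    if len(sample) == 4:
        break

names = [students[label] for label in sample]
print(sample, names)
```

Skipping repeats matters because a simple random sample of four students must contain four different people.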

89. The fraction of the variation in the values of a response y that is explained by the least-squares regression of y on x is the
A. sum of the squared residuals.
B. correlation coefficient.
C. slope of the least-squares regression line.
D. intercept of the least-squares regression line.
E. square of the correlation coefficient.

90. Which of the following best describes the correlation r?
A. The average of the products of the standardized scores of X and Y for each point.
B. The average perpendicular distance between each data point and the least-squares regression line.
C. The average of the squared products of the standardized scores of X and Y for each point.
D. The average of the differences between each X value and each Y value.
E. The average of the products of each of the X and Y values for each point.
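For reference when weighing the descriptions in question 90, the standard formula for the correlation is given below. It is written with the usual n – 1 divisor, so "average" is meant slightly loosely; the symbols follow common textbook notation rather than anything defined in this guide.

```latex
r = \frac{1}{n-1} \sum_{i=1}^{n}
    \left( \frac{x_i - \bar{x}}{s_x} \right)
    \left( \frac{y_i - \bar{y}}{s_y} \right)
```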

91. The correlation between the age and height of children is found to be about r = 0.7. Suppose we use the age x of a child to predict the height y of the child. We conclude that
A. about 70% of the time, age will accurately predict height.
B. the fraction of the variation in heights explained by the least-squares regression line of y on x is 0.70.
C. the fraction of the variation in heights explained by the least-squares regression line of y on x is 0.49.
D. the least-squares regression line of y on x would have a slope of 0.7.
E. the line explains about 49% of the data.

Scenario 3-3
Consider the following scatterplot, which describes the relationship between stopping distance (in feet) and air temperature (in degrees Centigrade) for a certain 2,000-pound car travelling 40 mph.

92. Use Scenario 3-3. If another data point were added with an air temperature of 0°C and a stopping distance of 80 feet, the correlation would
A. Whether this data point causes an increase or decrease cannot be determined without recalculating the correlation.
B. Increase, since this new point is an outlier that does not follow the pattern in the data.
C. Stay nearly the same, since correlation is resistant to outliers.
D. Decrease, since this new point is an outlier that does not follow the pattern in the data.
E. Increase, since there would be more data points.

93. In an experiment, an observed effect so large that it would rarely occur by chance is called
A. influential.
B. statistically significant.
C. an outlier.
D. replication.
E. bias.

94. Below is a scatterplot of wine consumption (in liters per person per year) and death rate from heart attacks (in deaths per 100,000 people per year) in 19 developed Western countries. European countries are designated by closed circles, other countries are designated by open circles.

Which of the following statements is not supported by the information in the scatter plot?
A. The correlation between wine consumption and heart disease deaths is equally strong in European countries and non-European countries.
B. The four countries with the highest rates of wine consumption are all European.
C. The country with the highest heart disease death rate is in Europe.
D. About half the European countries consume more wine per person than any of the non-European countries.
E. On average, the non-European countries drink less wine and have more heart attacks.

95. In order to select a sample of undergraduate students in the United States, I select a simple random sample of four states. From each of these states, I select a simple random sample of two colleges or universities. Finally, from each of these eight colleges or universities, I select a simple random sample of 20 undergraduates. My final sample consists of 160 undergraduates. This is an example of
A. convenience sampling.
B. simple random sampling.
C. cluster sampling.
D. stratified random sampling.
E. multistage sampling.
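The selection procedure in question 95 can be simulated to see how the stages nest. The sampling frame below is entirely hypothetical (10 states, 5 colleges per state, 200 students per college); only the stage sizes of 4, 2, and 20 come from the question.

```python
import random

rng = random.Random(0)  # fixed seed only so the sketch is repeatable

# Hypothetical frame: state -> college -> list of student identifiers.
frame = {
    f"state{s}": {
        f"state{s}-college{c}": [f"state{s}-college{c}-student{i}" for i in range(200)]
        for c in range(5)
    }
    for s in range(10)
}

sample = []
for state in rng.sample(sorted(frame), 4):                    # stage 1: SRS of 4 states
    for college in rng.sample(sorted(frame[state]), 2):       # stage 2: SRS of 2 colleges
        sample.extend(rng.sample(frame[state][college], 20))  # stage 3: SRS of 20 students

print(len(sample))
```

Each stage is a simple random sample, but the final group is not an SRS of all undergraduates, because students at colleges that were never chosen have no chance of appearing.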

96. Different writers have different styles. One way to quantify this difference is to compare the distribution of word lengths in their work. Below are parallel boxplots describing the distributions of word lengths for the first 60 words in Henry James’s The Turn of the Screw, J.K. Rowling’s Harry Potter and the Chamber of Secrets, and Chapter 1 of your statistics textbook (labeled “Starnes” below).

Based on the graphs, which one of the following statements must be true?
A. The longest word in the distribution of Rowling’s word lengths is shorter than 25% of the words in the “James” distribution.
B. The range of Rowling’s word lengths is smaller than the interquartile range of Starnes’s word lengths.
C. The median word length for Rowling is longer than for either Starnes or James.
D. Dot plots of the distributions of James’s word lengths and Starnes’s word lengths are identical.
E. 75% of the words in Rowling’s distribution are longer than the median word length in Starnes’s distribution.