Upload: voduong

Post on 05-Jun-2018

Page 1: Mathacle PSet ----- Stats, Concepts in Statistics and ...mathacle.com/MathPSet/Stats/Mathacle_Pset_Stats_1st_Quarterly... · PSet ----- Stats, Concepts in Statistics and Probability

Mathacle PSet ----- Stats, Concepts in Statistics and Probability Level ---- 3 Number --- 1 Name: ___________________ Date: _____________

Part 2 – Quarterly Exam Questions

MULTIPLE-CHOICE QUESTIONS

I. SAMPLING

MC I-1.) [APSTATSMC2002-9] A volunteer for a mayoral candidate's campaign periodically conducts polls to estimate the proportion of people in the city who are planning to vote for this candidate in the upcoming election. Two weeks before the election, the volunteer plans to double the sample size in the polls. The main purpose of this is to
(A) reduce nonresponse bias
(B) reduce the effects of confounding variables
(C) reduce bias due to the interviewer effect
(D) decrease the variability in the population
(E) decrease the standard deviation of the sampling distribution of the sample proportion

MC I-2.) [APSTATSMC2002-15] A high school statistics class wants to conduct a survey to determine what percentage of students in the school would be willing to pay a fee for participating in after-school activities. Twenty students are randomly selected from each of the freshman, sophomore, junior, and senior classes to complete the survey. This plan is an example of which type of sampling?
(A) Cluster
(B) Convenience
(C) Simple random
(D) Stratified random
(E) Systematic

MC I-3.) [APSTATSMC2002-16] Jason wants to determine how age and gender are related to political party preference in his town. Voter registration lists are stratified by gender and age-group. Jason selects a simple random sample of 50 men from the 20 to 29 age-group and records their age, gender, and party registration (Democratic, Republican, neither). He also selects an independent simple random sample of 60 women from the 40 to 49 age-group and records the same information. Of the following, which is the most important observation about Jason's plan?
(A) The plan is well conceived and should serve the intended purpose.
(B) His samples are too small.
(C) He should have used equal sample sizes.
(D) He should have randomly selected the two age groups instead of choosing them nonrandomly.
(E) He will be unable to tell whether a difference in party affiliation is related to differences in age or to the difference in gender.

MC I-4.) [APSTATSMC2015-2] A researcher wanted to estimate the average amount of money spent on extracurricular activities per school in a certain region. The researcher randomly selected 20 public schools and 20 private schools in the region to use for a sample. Which of the following best describes the type of the sample that was taken?
(A) A census
(B) A cluster sample
(C) A convenience sample
(D) A simple random sample
(E) A stratified sample

MC I-5.) [APSTATSMC2007-20] Which of the following is NOT a characteristic of stratified sampling?
(A) Random sampling is part of the sampling procedure.
(B) The population is divided into groups of units that are similar on some characteristic.
(C) The strata are based on facts known before the sample is selected.
(D) Each individual unit in the population belongs to one and only one of the strata.
(E) Every possible subset of the population, of the desired sample size, has an equal chance of being selected.

MC I-6.) [APSTATSMC2012-15] A polling firm is interested in surveying a representative sample of registered voters in the United States. The firm has automated its sampling so that random phone numbers within the United States are called. Each time a number is called, the procedure below is followed.
• If there is no response or if an answering machine is reached, another number is automatically called.
• If a person answers, a survey worker verifies that the person is at least 18 years of age.
• If the person is not at least 18 years of age, no response is recorded, and another number is called.
• If the person is at least 18 years of age, that person is surveyed.
Some people claim the procedure being used does not permit the results to be extended to all registered voters. Which of the following is NOT a legitimate concern about the procedure being used?
(A) Registered voters with children under the age of 18 years may be underrepresented in the sample.
(B) Registered voters with unlisted telephone numbers may be underrepresented in the sample.
(C) Registered voters who have more than one telephone number may be overrepresented in the sample.
(D) Registered voters who live in households consisting of more than one voter may be underrepresented.
(E) People who are not registered to vote may bias the sample results.

MC I-7.) [APSTATSMC2012-18] When using a one-sample t-procedure to construct a confidence interval for the mean of a finite population, a condition is that the population size be at least 10 times the sample size. The reason for the condition is to ensure that
(A) the sample size is large enough
(B) the central limit theorem is applicable for the sample mean
(C) the sample standard deviation is a good approximation of the population standard deviation
(D) the degree of dependence among observations is negligible
(E) the sampling method is not biased

MC I-8.) [APSTATSMC2013-2] A school principal wanted to investigate student opinion about the food served in the school cafeteria. The principal selected at random 50 first-year students, 50 second-year students, 50 third-year students, and 50 fourth-year students to complete a questionnaire. Which of the following best describes the principal’s sampling plan?
(A) A stratified random sample
(B) A simple random sample
(C) A cluster sample
(D) A convenience sample
(E) A systematic sample

MC I-9.) [APSTATSMC2013-27] A certain motel is roughly 20 miles from the entrance to Yosemite National Park. The motel manager wants to get a better estimate of the distance and asks five people to each measure the distance, to the nearest tenth of a mile, using the odometer in his or her car. The manager will use the median of the five measurements as the estimate of the distance. Which of the following statements is NOT a statistical justification for the manager’s plan?
(A) Odometer reading should be considered a variable when used to measure this distance.
(B) The median of the five measurements is more likely to be close to the actual distance than is a single measurement.
(C) The actual distance should be considered a variable, and taking five measurements allows the manager to estimate the variability in the actual distance.
(D) If one or two odometers give inaccurate readings, the estimate still should be fairly close to the actual distance.
(E) The manager can get some indication of how far off the estimate might be.

MC I-10.) [APSTATSMC2013-33] A regional transportation authority is interested in estimating the mean number of minutes working adults in the region spend commuting to work on a typical day. A random sample of working adults will be selected from each of three strata: urban, suburban, and rural. Selected individuals will be asked the number of minutes they spend commuting to work on a typical day. Why is stratification used in this situation?
(A) To remove bias when estimating the proportion of working adults living in urban, suburban, and rural areas.
(B) To remove bias when estimating the mean commuting time.
(C) To reduce bias when estimating the mean commuting time.
(D) To decrease the variability in estimates of the proportion of working adults living in urban, suburban, and rural areas.
(E) To decrease the variability in estimates of the mean commuting time.
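
The selection step described in MC I-10, a random sample drawn separately from each of the three strata, might be simulated as in the minimal Python sketch below. The ID lists, the stratum sizes, and the per-stratum sample size of 25 are hypothetical placeholders, since the question does not give them; the sketch only illustrates the mechanics of the draw.

```python
import random


def sample_each_stratum(strata, per_stratum, seed=None):
    """Draw a simple random sample of `per_stratum` units from every stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(units, per_stratum) for name, units in strata.items()}


# Hypothetical sampling frame: made-up ID strings for working adults in each stratum.
strata = {
    "urban": [f"U{i}" for i in range(500)],
    "suburban": [f"S{i}" for i in range(300)],
    "rural": [f"R{i}" for i in range(200)],
}

selected = sample_each_stratum(strata, per_stratum=25, seed=0)
for name, units in selected.items():
    print(name, len(units))
```

Each stratum is sampled independently, so every stratum is guaranteed to appear in the sample with the chosen count.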

II. DESIGN OF STUDIES

MC II-1.) [APSTATSMC2013-15] An experiment will be conducted to determine whether children learn their multiplication facts better by practicing with flash cards or by practicing on a computer. Children who volunteer for the experiment will be randomly assigned to one of the two treatments. Because the children’s gender may affect the outcome, there will be blocking by gender. After practice, the children will be given a test on their multiplication facts. Why will it be impossible to conduct a double-blind experiment?
(A) The experimenter will know whether the child is a boy or a girl and whether he or she used flash cards or the computer.
(B) The child will know whether he or she is a boy or a girl.
(C) The child will know whether he or she used flash cards or the computer.
(D) The person who grades the tests will know whether the child was a boy or a girl.
(E) The person who grades the tests will know whether the child used flash cards or the computer.

MC II-2.) [APSTATSMC1997-18]

MC II-3.) [APSTATSMC2002-25] A study of existing records of 27,000 automobile accidents involving children in Michigan found that about 10 percent of children who were wearing a seatbelt (group SB) were injured and that about 15 percent of children who were not wearing a seatbelt (group NSB) were injured. Which of the following statements should NOT be included in a summary report about this study?
(A) Driver behavior may be a potential confounding factor.
(B) The child's location in the car may be a potential confounding factor.
(C) This study was not an experiment, and cause-and-effect inferences are not warranted.
(D) This study demonstrates clearly that seat belts save children from injury.
(E) Concluding that seatbelts save children from injury is risky, at least until the study is independently replicated.

MC II-4.) [APSTATSMC2007-9] A television news editor would like to know how local registered voters would respond to the question, "Are you in favor of the school bond measure that will be voted on in an upcoming special election?" A television survey is conducted during a break in the evening news by listing two telephone numbers side by side on the screen, one for viewers to call if they approve of the bond measure, and the other to call if they disapprove. This survey method could produce biased results for a number of reasons. Which one of the following is the most obvious reason?
(A) It uses a stratified sample rather than a simple random sample.
(B) People who feel strongly about the issue are more likely to respond.
(C) Viewers should be told about the issues before the survey is conducted.
(D) Some registered voters who call might not vote in the election.
(E) The wording of the question is biased.

MC II-5.) [APSTATSMC2007-31] Automobile brake pads are either metallic or nonmetallic. An experiment is to be conducted to determine whether the stopping distance is the same for both types of brake pads. In previous studies, it was determined that car size (small, medium, large) is associated with stopping distance, but car type (sedan, wagon, coupe) is not associated with stopping distance. The experiment would be best done
(A) by blocking on car size
(B) by blocking on car type
(C) by blocking on stopping distance
(D) by blocking on brake pad type
(E) without blocking

MC II-6.) [APSTATSMC2007-35] A group of students has 60 houseflies in a large container and needs to assign 20 to each of the three groups labeled A, B, and C for an experiment. They can capture the flies one at a time when the flies enter a side chamber in the container that is baited with food. Which of the following methods will be most likely to result in three comparable groups of 20 houseflies each?
(A) Label the first 20 flies caught as Group A, the second 20 caught as group B, and the third 20 caught as group C.
(B) Write the letters A, B, and C on separate slips of paper. Randomly pick one of the slips of paper and assign the first 20 flies caught to that group. Pick another slip and assign the next 20 flies caught to that group. Assign the remaining flies to the remaining group.
(C) When each fly is caught, roll a die. If the die shows an even number, the fly is labeled A. If the die shows an odd number, the fly is labeled B. When 20 flies have been labeled A and 20 have been labeled B, the remaining flies are then labeled C.
(D) Place each fly in its own numbered container (numbered from 1 to 60) in the order that it was caught. Write the numbers from 1 to 60 on slips of paper, put the slips in a jar, and mix them well. Pick 20 numbers out of the jar. Assign the flies in the containers with those numbers to group A. Pick 20 more numbers and assign the flies in the containers with those numbers to group B. Assign the remaining 20 flies to group C.
(E) When each fly is caught, roll a die. If the die shows a 1 or 2, the fly is labeled A. If the die shows a 3 or 4, the fly is labeled B. If the die shows a 5 or 6, the fly is labeled C. Repeat this process for all 60 flies.
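
For readers who want to see one of the MC II-6 procedures in action, the slip-drawing method described in choice (D) could be simulated roughly as follows. This is only a sketch: the function name `assign_by_slips` and the use of a seeded shuffle to stand in for mixing the jar are assumptions made for illustration, not part of the question.

```python
import random


def assign_by_slips(n_units=60, group_labels=("A", "B", "C"), seed=None):
    """Shuffle numbered slips and deal them out in equal-sized groups."""
    if n_units % len(group_labels) != 0:
        raise ValueError("n_units must divide evenly among the groups")
    rng = random.Random(seed)
    slips = list(range(1, n_units + 1))  # container numbers 1..n_units
    rng.shuffle(slips)                   # stands in for mixing the jar
    size = n_units // len(group_labels)
    return {
        label: sorted(slips[i * size:(i + 1) * size])
        for i, label in enumerate(group_labels)
    }


groups = assign_by_slips(seed=1)
for label, members in groups.items():
    print(label, len(members), members[:5])
```

Changing `seed` produces a different assignment; the group sizes stay fixed at 20 because the slips are dealt out rather than assigned one at a time.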

MC II-7.) [APSTATSMC2013-34] A randomized block design will be used in an experiment to compare two lotions that protect people from getting sunburned. Which of the following should guide the formation of the blocks?
(A) Participants in the same block should receive the same lotion.
(B) Participants should be randomly assigned to the blocks.
(C) Participants should be kept blind as to which block they are in.
(D) Participants within each block should be as similar as possible with respect to how easily they get sunburned.
(E) Participants within each block should be as different as possible with respect to how easily they get sunburned.

MC II-8.) [APSTATSMC2015-14] The dining and nutrition staff at the University of Georgia plans to survey students to get their opinion on the new nutrition program introduced this semester at each of the on-campus dining halls. They are interested in getting feedback from students living both on-campus and off-campus about the new gluten-free and vegetarian options offered at each meal. Which of the following sampling methods is the most appropriate for accomplishing this?
(A) Hand out a survey to every 10th student that enters each dining hall on a specified day.
(B) Group students by housing status, one group representing those living on campus and the other representing those living off campus. Email a survey to 100 randomly selected students from each group.
(C) On equally sized slips of paper, write down the names of all the dormitories on campus as well as all the apartment complexes off campus. Put all the names in a hat, mix them well, and draw out five of them. Email a survey to all students in the five randomly selected buildings.
(D) Hand out a survey to the first 50 students that enter each dining hall on a specified day.
(E) Create a Facebook page for each dining hall where students can post their comments.

MC II-9.) [APSTATSMC2015-22] A university statistics professor wants to know if including review problems in each set of homework problems (treatment I) is more effective than including only new problems (treatment II). He teaches three sections of the course: a morning, an afternoon, and an evening section, each with 30 students. Within each section the professor randomly assigns 15 students to treatment I and 15 students to treatment II. Compared to randomly assigning 45 students to each treatment, what is the advantage of randomly assigning 15 students to each treatment within each section?
(A) Random assignment within section eliminates the placebo effect.
(B) Random assignment within section allows the professor to generalize the results to all sections.
(C) Random assignment within section permits the professor and students to be blinded as to the treatment group assignment.
(D) Random assignment within section accounts for possible differences in performance due to the time of day the class meets.
(E) Random assignment within section reduces the effect of nonresponse bias.

MC II-10.) [APSTATSMC2015-30] Nearly 12,000 high school students across 11 different countries were surveyed about both their sleeping habits and their performance in school. Based on the results, researchers concluded that a lack of sleep is linked to students earning poor grades in school. Which of the following statements is true?
(A) This is an observational study. Therefore, researchers cannot conclude that a lack of sleep causes poor grades.
(B) This is an observational study. Therefore, researchers can conclude that a lack of sleep causes poor grades.
(C) This study is a well-designed experiment. Therefore, researchers cannot conclude that a lack of sleep causes poor grades.
(D) This study is a well-designed experiment. Therefore, researchers can conclude that a lack of sleep causes poor grades.
(E) This is neither an observational study nor a well-designed experiment.

III. EXPLORING DATA

MC III-1.) [APSTATSMC1997-14]

MC III-2.) [APSTATSMC1997-21]

MC III-3.) [APSTATSMC1997-22]

MC III-4.) [APSTATSMC2002-14]

The boxplots shown above summarize two data sets, I and II. Based on the boxplots, which of the following statements about these two data sets CANNOT be justified?
(A) The range of data set I is equal to the range of data set II.
(B) The interquartile range of data set I is equal to the interquartile range of data set II.
(C) The median of data set I is less than the median of data set II.
(D) Data set I and data set II have the same number of data points.
(E) About 75% of the values in data set II are greater than or equal to about 50% of the values in data set I.

MC III-5.) [APSTATSMC2002-20] A small town employs 34 salaried, nonunion employees. Each employee receives an annual salary increase of between $500 and $2000 based on a performance review by the mayor's staff. Some employees are members of the mayor's political party, and the rest are not. Students at the local high school form two lists, A and B, one for the raises granted to employees who are in the mayor's party, and the other for raises granted to employees who are not. They want to display a graph (or graphs) of the salary increases in the student newspaper that readers can use to judge whether the two groups of employees have been treated in a reasonably equitable manner. Which of the following displays is least likely to be useful to readers for this purpose?
(A) Back-to-back stemplots of A and B
(B) Scatterplot of B versus A
(C) Parallel boxplots of A and B
(D) Histograms of A and B that are drawn to the same scale
(E) Dotplots of A and B that are drawn to the same scale

MC III-6.) [APSTATSMC2002-27]

The figure above shows a cumulative relative frequency histogram of 40 scores on a test given in an AP Statistics class. Which of the following conclusions can be made from the graph?
(A) There is greater variability in the lower 20 test scores than in the higher 20 test scores.
(B) The median test score is less than 50.
(C) Sixty percent of the students had test scores above 80.
(D) If the passing score is 70, most students did not pass the test.
(E) The horizontal nature of the graph for the test scores of 60 and below indicates that those scores occurred most frequently.

MC III-7.) [APSTATSMC2007-15] The histograms below represent the distribution of five different data sets, each containing 28 integers, from 1 through 7, inclusive. The horizontal and vertical scales are the same for all graphs. Which graph represents the data set with the largest standard deviation?

MC III-8.) [APSTATSMC2007-33] Five estimators for a parameter are being evaluated. The true value of the parameter is 0. Simulations of 100 random samples, each of size n, are drawn from the population. For each simulated sample, the five estimates are computed. The histograms below display the simulated sampling distributions for the five estimators. Which simulated sampling distribution is associated with the best estimator for this parameter?

MC III-9.) [APSTATSMC2013-05] The amount of time required for each of 100 mice to navigate through a maze was recorded. The histogram below shows the distribution of times, in seconds, for the 100 mice.

Which of the following values is closest to the standard deviation of the 100 times?
(A) 2.5 seconds
(B) 10 seconds
(C) 20 seconds
(D) 50 seconds
(E) 90 seconds

MC III-10.) [APSTATSMC2013-06] A graph (not shown) of the selling prices of homes in a certain city for the month of April reveals that the distribution is skewed to the left. Which of the following statements is the most reasonable conclusion about the selling prices based on the graph?
(A) The mean is greater than the median.
(B) The median is the average of the first quartile and the third quartile.
(C) There are fewer selling prices between the first quartile and the median than there are between the median and the third quartile.
(D) There are more selling prices that are less than the mean than selling prices that are greater than the mean.
(E) The value of maximum minus third quartile is less than the value of first quartile minus minimum.

FREE-RESPONSE QUESTIONS

Directions: Show all your work. Indicate clearly the methods you use, because you will be scored on the correctness of your methods as well as on the accuracy and completeness of your results and explanations.

FRQ 1.1.) [APSTATSFRQ2014-04] As part of its twenty-fifth reunion celebration, the class of 1988 (students who graduated in 1988) at a state university held a reception on campus. In an informal survey, the director of alumni development asked 50 of the attendees about their incomes. The director computed the mean income of the 50 attendees to be $189,952. In a news release, the director announced, “The members of our class of 1988 enjoyed resounding success. Last year’s mean income of its members was $189,952!”

a.) What would be a statistical advantage of using the median of the reported incomes, rather than the mean, as the estimate of the typical income?

b.) The director felt the members who attended the reception may be different from the class as a whole. A more detailed survey of the class was planned to find a better estimate of the income as well as other facts about the alumni. The staff developed two methods based on the available funds to carry out the survey.

Method 1: Send out an e-mail to all 6,826 members of the class asking them to complete an online form. The staff estimates that at least 600 members will respond.

Method 2: Select a simple random sample of members of the class and contact the selected members directly by phone. Follow up to ensure that all responses are obtained. Because method 2 will require more time than method 1, the staff estimates that only 100 members of the class could be contacted using method 2.

Which of the two methods would you select for estimating the average yearly income of all 6,826 members of the class of 1988? Explain your reasoning by comparing the two methods and the effect of each method on the estimate.

FRQ 1.2.) [APSTATSFRQ2011B-02] People with acrophobia (fear of heights) sometimes enroll in therapy sessions to help them overcome this fear. Typically, seven or eight therapy sessions are needed before improvement is noticed. A study was conducted to determine whether the drug D-cycloserine, used in combination with fewer therapy sessions, would help people with acrophobia overcome this fear. Each of 27 people who participated in the study received a pill before each of two therapy sessions. Seventeen of the 27 people were randomly assigned to receive a D-cycloserine pill, and the remaining 10 people received a placebo. After the two therapy sessions, none of the 27 people received additional pills or therapy. Three months after the administration of the pills and the two therapy sessions, each of the 27 people was evaluated to see if he or she had improved.

a.) Was this study an experiment or an observational study? Provide an explanation to support your answer.

b.) When the data were analyzed, the D-cycloserine group showed statistically significantly more improvement than the placebo group did. Based on this result, would the researchers be justified in concluding that the D-cycloserine pill and two therapy sessions are as beneficial as eight therapy sessions without the pill? Justify your answer.

c.) A newspaper article that summarized the results of this study did not explain how it was determined which people received D-cycloserine and which received the placebo. Suppose the researchers allowed the therapists to choose which people received D-cycloserine and which received the placebo, and no randomization was used. Explain why such a method of assignment might lead to an incorrect conclusion.

FRQ 1.3.) [APSTATSFRQ2015-01] Two large corporations, A and B, hire many new college graduates as accountants at entry-level positions. In 2009 the starting salary for an entry-level accountant position was $36,000 a year at both corporations. At each corporation, data were collected from 30 employees who were hired in 2009 as entry-level accountants and were still employed at the corporation five years later. The yearly salaries of the 60 employees in 2014 are summarized in the boxplots below.

a.) Write a few sentences comparing the distributions of the yearly salaries at the two corporations.

b.) Suppose both corporations offered you a job for $36,000 a year as an entry-level accountant.

(i) Based on the boxplots, give one reason why you might choose to accept the job at corporation A.

(ii) Based on the boxplots, give one reason why you might choose to accept the job at corporation B.


FRQ 1.4.) [APSTATSFRQ2013-06, Investigative Task] Tropical storms in the Pacific Ocean with sustained winds that exceed 74 miles per hour are called typhoons. Graph A below displays the number of recorded typhoons in two regions of the Pacific Ocean, the Eastern Pacific and the Western Pacific, for the years from 1997 to 2010.

a.) Compare the distributions of yearly frequencies of typhoons for the two regions of the Pacific Ocean for the years from 1997 to 2010.

b.) For each region, describe how the yearly frequencies changed over the time period from 1997 to 2010.


A moving average for data collected at regular time increments is the average of data values for two or more consecutive increments. The 4-year moving averages for the typhoon data are provided in the table below. For example, the Eastern Pacific 4-year moving average for 2000 is the average of 22, 16, 15, and 21, which is equal to 18.50.
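The worked example above can be checked with a short Python sketch. The function name `trailing_moving_average` is an illustrative choice, and the sketch assumes each average is labeled by the last year in its window, so the 2000 value uses 1997 through 2000. Only the four Eastern Pacific counts quoted in the text are used.

```python
def trailing_moving_average(values, window=4):
    """Average each run of `window` consecutive values.

    The k-th result covers values[k : k + window]; it is labeled by the
    last year in that run (an assumption about the table's convention).
    """
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[start:start + window]) / window
        for start in range(len(values) - window + 1)
    ]


# The four Eastern Pacific counts quoted in the text for the 2000 average.
eastern_counts = [22, 16, 15, 21]
print(trailing_moving_average(eastern_counts))  # [18.5]
```

With a longer list of yearly counts, the same call returns one average for every year that has at least three preceding years of data.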

c.) Show how to calculate the 4-year moving average for the year 2010 in the Western Pacific. Write your value in the appropriate place in the table.


d.) Graph B below shows both yearly frequencies (connected by dashed lines) and the respective 4-year moving averages (connected by solid lines). Use your answer in part (c) to complete the graph.

e.) Consider graph B.

i) What information is more apparent from the plots of the 4-year moving averages than from the plots of the yearly frequencies of typhoons?

ii) What information is less apparent from the plots of the 4-year moving averages than from the plots of the yearly frequencies of typhoons?


Answers

Part 2 – Quarterly Exam Questions

I. SAMPLING

MC I-1.) E. You can only reduce the variability of a sample by increasing the sample size, since $\sigma = \sqrt{\dfrac{p_0 q_0}{n}}$. You cannot reduce the variability of the population. Bias and the effects of confounding factors are “built in” to the way you collect the sample, so increasing the sample size may not reduce them.

MC I-2.) D. This is a typical stratified sample: homogeneous strata are used to reduce the sampling variability for a given sample size.

MC I-3.) E. Jason probably should have chosen the same age groups for both men and women.

MC I-4.) E.

MC I-5.) E. The strata, not arbitrary subsets of the population, are usually chosen for homogeneity. The individuals in each stratum are selected randomly.

MC I-6.) B. Since the polling firm has decided to survey by phone, it need not be concerned with the people it cannot reach. It should worry only about the problems that arise as it conducts the survey.

MC I-7.) D. The sample size has to be small enough, relative to the population, to ignore the “non-replacement” problem, as in the example of Skittles.

MC I-8.) A.

MC I-9.) C. The actual distance is a parameter of the experiment, so you can measure the variability of the parameter.

MC I-10.) E. The advantage of using stratification is to reduce the sampling variability for a given sample size.

II. DESIGN OF STUDIES

MC II-1.) C

MC II-2.) E

MC II-3.) D. This was an observational study, so you cannot claim cause and effect, since there was no control group for comparison.

MC II-4.) B. Response/non-response bias could be the worst compared with the other kinds of bias.

MC II-5.) A. Size is associated with distance.

MC II-6.) D. This is actually more of a sampling problem than a design problem. For answer E, you would not obtain a 20/20/20 division.


MC II-7.) D. Like stratified sampling.

MC II-8.) B. Again, this is actually a sampling problem.

MC II-9.) D. Reduce the effect of confounding/lurking factors.

MC II-10.) A

III. EXPLORING DATA

MC III-1.) D. Look for a statistic that can be compared across groups to see whether it is different or similar. In this case, the statistic is the ratio of jobs to population.

MC III-2.) B. Any change in the data would affect the mean. For the other “central” measures, it depends on what is changed.

MC III-3.) A. The association of the data with schools was not provided.

MC III-4.) D. Boxplots do not show the total number of data values.

MC III-5.) B. There are no corresponding paired variables, so scatterplots are not appropriate.

MC III-6.) A. This is a cumulative distribution, so the x-value at the 50% mark on the y-axis represents the median; that is, the median is in the 70s. This also means that the scores of the bottom half range from the 30s to the 70s and the scores of the upper half range from the 70s to 100.

MC III-7.) D. The standard deviation “measures” how far the data are from the “center”.

MC III-8.) B. All 5 estimates are closely clustered around zero.

MC III-9.) B. Use the rule that $6\sigma$ covers about 99% of the data: $6\sigma = 115 - 60 \Rightarrow \sigma \approx 9$.

MC III-10.) E
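Two of the numerical arguments in the multiple-choice answers can be checked with a small Python sketch: the standard deviation of a sample proportion in MC I-1 and the range-over-six estimate in MC III-9. The function names, the proportion 0.5, and the sample size 400 are hypothetical values chosen for illustration; only the endpoints 60 and 115 come from the answer key.

```python
import math


def sd_sample_proportion(p, n):
    """Standard deviation of the sample proportion: sqrt(p * q / n), q = 1 - p."""
    return math.sqrt(p * (1 - p) / n)


def sd_from_range(low, high, widths=6):
    """Rough estimate of sigma, assuming the data span about `widths` sigmas."""
    return (high - low) / widths


# MC I-1: doubling n shrinks the standard deviation by a factor of 1/sqrt(2).
p0, n = 0.5, 400  # hypothetical values, not taken from the exam question
ratio = sd_sample_proportion(p0, 2 * n) / sd_sample_proportion(p0, n)
print(round(ratio, 3))  # 0.707

# MC III-9: a spread from 60 to 115 gives sigma of roughly 9.
print(round(sd_from_range(60, 115), 2))  # 9.17
```

The ratio does not depend on the particular p0 or n chosen, which is why doubling the sample size always reduces the standard deviation by about 29%.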


The following answers/solutions are from College Board. Your answers/solutions could vary.

FRQ 1.1.)

a.) The median is less affected by skewness and outliers than the mean. With a variable such as income, a small number of very large incomes could dramatically increase the mean but not the median. Therefore, the median would provide a better estimate of a typical income value.

b.) Method 2 is better than Method 1. A sample obtained from Method 1 could be biased because of the voluntary nature of the response. It is plausible that class members with larger incomes might be more likely to return the form than class members with smaller incomes. The mean income for such a sample would overestimate the mean income of all class members. With Method 2, despite the smaller sample size, the random selection is likely to result in a sample that is more representative of the entire class and produce an unbiased estimate of mean yearly income of all class members.

FRQ 1.2.)

a.) The study was an experiment because treatments (D-cycloserine or placebo) were imposed by the researchers on the people with acrophobia.

b.) No, the experiment was designed to compare the D-cycloserine group with a control group that received the placebo. The researchers can conclude that the D-cycloserine pill and two therapy sessions show significantly more improvement than a placebo and two therapy sessions. However, there is no basis for comparison with another group of people with acrophobia who received eight therapy sessions and no pill.

c.) One example is that if the therapists were allowed to choose who received the placebo and who received D-cycloserine, they might assign the people with more severe acrophobia to one of the groups and the people with less severe acrophobia to the other group. Thus, the improvement after only two therapy sessions could be related to the initial severity of the acrophobia rather than to the effects of D-cycloserine.

FRQ 1.3.)


FRQ 1.4.)

a.) The Western Pacific Ocean had more typhoons than the Eastern Pacific Ocean in all but one of these years. The average seems to have been about 31 typhoons per year in the Western Pacific Ocean, which is higher than the average of about 19 typhoons per year in the Eastern Pacific Ocean. The Western Pacific Ocean also saw more variability (in number of typhoons per year) than the Eastern Pacific Ocean; for example, the range of the frequencies for the Western Pacific is about 21 typhoons and only 10 typhoons for the Eastern Pacific.

b.) The Western Pacific Ocean had a decreasing trend in number of typhoons per year over this time period, especially from about 2001 through 2010. In contrast, the Eastern Pacific Ocean was fairly consistent in the number of typhoons per year over this time period, with a slight increasing trend in the later years from 2005 through 2010.

c.)

d.)

e.)

(i) The overall trends across this time period were more apparent with the moving averages than with the original frequencies. The moving averages reduce variability, making more apparent the overall decreasing trend in number of typhoons in the Western Pacific Ocean and the slight increasing trend in the number of typhoons in the Eastern Pacific Ocean.

(ii) The year-to-year variability in number of typhoons is less apparent with the moving averages than with the original frequencies.
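The smoothing effect described in part (e) can be illustrated with a brief Python sketch. The yearly counts below are invented for illustration and are NOT the typhoon data from the exam; the helper names are also hypothetical. The sketch compares the range of a noisy, downward-drifting series with the range of its 4-year moving averages.

```python
def trailing_moving_average(values, window=4):
    """Average each run of `window` consecutive values."""
    return [
        sum(values[start:start + window]) / window
        for start in range(len(values) - window + 1)
    ]


def value_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)


# Hypothetical yearly counts (made up for this sketch, not the exam data).
yearly_counts = [30, 24, 33, 21, 28, 19, 26, 17]
smoothed = trailing_moving_average(yearly_counts)

print(smoothed)  # [27.0, 26.5, 25.25, 23.5, 22.5]
print(value_range(yearly_counts), value_range(smoothed))  # 16 4.5
```

In this made-up series the raw counts jump up and down from year to year, while the moving averages decrease steadily, so the trend is easier to see and the year-to-year swings are hidden.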