UNIVERSITY OF OTAGO EXAMINATIONS 2011

STATISTICS

Paper STAT110

Statistical Methods

Semester One

(TIME ALLOWED : THREE HOURS)

This paper comprises 32 pages (including the 2-page Formulae Summary)

Answer questions as follows:

Attempt ALL questions in both Sections.

Section A - 60 marks in 7 questions. Write your working and answers in the spaces provided. If you need more space, give a reference to the back of the previous page and write your additional work there.

Section B - Multiple-choice: 40 marks in 40 questions. Use the separate answer sheet provided for this section. No marks deducted for incorrect answers.

The following material is provided: Formulae Summary (Appendix 1, pages 31-32)

Candidates are permitted copies of: Nil

Other instructions: No restriction on the model of calculator that may be used, but no device with communication capability shall be accepted as a calculator (subject to inspection by Examiners).

Office Use Only

Question  Mark
A1        3
A2        4
A3        8
A4        12
A5        12
A6        10
A7        11
TOTAL

STUDENT ID NUMBER

TURN OVER

Only Section B of this exam is multi-choice. Ignore questions 22, 39 and 40. Section A is less relevant, as it contains questions that are not multi-choice. Solutions to both sections are on the resources page.

SECTION A

Attempt all the questions in this Section. Write your working and answers in the spaces provided. If you need more space, give a reference to the back of the previous page and write your additional work there.

1. A survey was conducted to investigate the proportion of registered nurses in the North Island who are actively employed. A random sample of 450 nurses selected from the registry of nurses showed 277 were actively employed.

(a) Using 1.96 as the multiplier, calculate the 95% confidence interval for the proportion of registered nurses actively employed:

2 marks

(b) Provide an appropriate interpretation of your confidence interval found above:

1 mark

Q1: 3 marks
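
The calculation in question 1(a) can be sketched in Python as follows. This is a minimal illustration, not the official solution: it assumes the usual large-sample interval, sample proportion plus or minus 1.96 standard errors, and the function name `proportion_ci` is made up for this example.

```python
import math

def proportion_ci(successes, n, multiplier=1.96):
    """Large-sample confidence interval for a single proportion."""
    p_hat = successes / n
    # Standard error of a sample proportion: sqrt(p(1 - p) / n).
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - multiplier * se, p_hat + multiplier * se

# 277 of the 450 sampled nurses were actively employed.
lower, upper = proportion_ci(277, 450)
print(f"95% CI: ({lower:.4f}, {upper:.4f})")
```

The interval describes plausible values for the population proportion, which is the kind of statement part (b) asks for.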

2. A study was conducted to look at average weekly expenditure per household on food in a particular community compared to the overall national average of $195.20 for a couple with 2 dependent children. A random sample of n = 120 households in the community gave a mean and a standard deviation of $204.91 and $34.64, respectively. Test the hypothesis that the mean weekly household expenditure for the community is different from the overall national average. Use α = 0.05.

(a) State the appropriate pair of hypotheses for this test.

2 marks

(b) Calculate the standardized test statistic for this study.

1 mark

(c) Given the p-value for this test is 0.0021, state what can be concluded from this hypothesis test.

1 mark

Q2: 4 marks
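
A minimal sketch of the test statistic in question 2(b), assuming the standardized statistic is the sample mean minus the hypothesised mean, divided by s/√n. The two-sided p-value here uses a standard normal reference distribution, which is an assumption chosen because it agrees with the quoted 0.0021; the function name is illustrative only.

```python
import math

def one_sample_z(sample_mean, null_mean, sd, n):
    """Standardized test statistic and two-sided normal p-value."""
    se = sd / math.sqrt(n)
    z = (sample_mean - null_mean) / se
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for a standard normal.
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided

z, p = one_sample_z(204.91, 195.20, 34.64, 120)
print(f"z = {z:.3f}, two-sided p = {p:.4f}")
```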

3. A researcher is interested in whether men are more or less likely than women to exercise. A random sample of 80 men and 90 women were asked whether they exercised or not. Of the 80 men, 76 reported having exercised; of the 90 women, only 69 reported having exercised. For this question use a 0.05 level of significance.

(a) State the null and alternative hypotheses for testing whether men are more or less likely than women to exercise.

2 marks

(b) Give the sample proportions for both men and women in the survey:

(i) The sample proportion for men is:

1 mark

(ii) The sample proportion for women is:

1 mark

(c) Calculate the pooled sample proportion defined by $p^* = (X_M + X_F)/(n_M + n_F)$.

1 mark

(d) Calculate the estimate of the standard error for the difference between the two proportions.

1 mark

(e) Calculate the test statistic for testing the null hypothesis using your answers in (b) and (d).

1 mark

(f) Given that the p-value for this test is 0.0008, state your conclusion about the hypothesis test.

1 mark

Q3: 8 marks
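
The steps of question 3 (sample proportions, pooled proportion, standard error, test statistic) can be sketched as below. This assumes the pooled standard error $\sqrt{p^*(1-p^*)(1/n_M + 1/n_F)}$, which is the usual form under the null hypothesis of equal proportions; the variable names are illustrative.

```python
import math

x_m, n_m = 76, 80   # men who exercised, men sampled
x_f, n_f = 69, 90   # women who exercised, women sampled

p_m = x_m / n_m
p_f = x_f / n_f
p_pooled = (x_m + x_f) / (n_m + n_f)

# Standard error of the difference under H0 (assumed pooled form).
se_diff = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n_m + 1 / n_f))
z = (p_m - p_f) / se_diff

print(f"p_M = {p_m:.4f}, p_F = {p_f:.4f}, pooled = {p_pooled:.4f}")
print(f"SE = {se_diff:.4f}, z = {z:.3f}")
```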

4. A survey was undertaken to compare the divorce rates of smokers and nonsmokers. The table below shows the smoking habits and divorce history of the 1688 respondents:

              Divorced
Smoke?    Yes    No      Total
Yes       240    254     494
No        375    819     1194
Total     615    1073    1688

(a) Calculate the estimated risk of divorce for:

(i) for those who smoke

1 mark

(ii) for nonsmokers

1 mark

(b) Estimate the relative risk of divorce for those who smoke compared with nonsmokers.

1 mark

(c) Calculate the estimated odds of divorce for:

(i) for those who smoke

1 mark

(ii) for nonsmokers

1 mark

(d) Estimate the odds ratio of divorce for those who smoke compared with nonsmokers.

1 mark

(e) Calculate the standard error for the log odds ratio.

2 marks

(f) Using the multiplier of 1.96, calculate a 95% confidence interval for the log odds ratio.

2 marks

(g) Transform the above interval to the original scale giving the 95% confidence interval for the odds ratio.

1 mark

(h) Interpret the result in part (g). Two sentences are sufficient.

1 mark

Q4: 12 marks
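
The chain of calculations in question 4 (risks, relative risk, odds, odds ratio, and the interval built on the log scale) can be sketched as follows. It assumes the standard error of the log odds ratio is the square root of the sum of reciprocal cell counts; the variable names are illustrative and this is not the official solution.

```python
import math

# Cell counts from the 2x2 table: (divorced, not divorced).
smoke_yes, smoke_no = 240, 254
nonsmoke_yes, nonsmoke_no = 375, 819

risk_smoke = smoke_yes / (smoke_yes + smoke_no)
risk_nonsmoke = nonsmoke_yes / (nonsmoke_yes + nonsmoke_no)
relative_risk = risk_smoke / risk_nonsmoke

odds_smoke = smoke_yes / smoke_no
odds_nonsmoke = nonsmoke_yes / nonsmoke_no
odds_ratio = odds_smoke / odds_nonsmoke

# Assumed form: SE(log OR) = sqrt(1/a + 1/b + 1/c + 1/d).
se_log_or = math.sqrt(
    1 / smoke_yes + 1 / smoke_no + 1 / nonsmoke_yes + 1 / nonsmoke_no
)
log_or = math.log(odds_ratio)
log_ci = (log_or - 1.96 * se_log_or, log_or + 1.96 * se_log_or)
# Back-transform the interval endpoints to the odds-ratio scale.
or_ci = (math.exp(log_ci[0]), math.exp(log_ci[1]))

print(f"RR = {relative_risk:.3f}, OR = {odds_ratio:.3f}")
print(f"95% CI for OR: ({or_ci[0]:.3f}, {or_ci[1]:.3f})")
```

Working on the log scale is what makes the back-transformed interval asymmetric about the odds ratio estimate.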

5. Researchers interested in the relationship between blood pressure and birthweight in newborns measured systolic blood pressure Y (in mm of Hg) and birthweight X (in kg) for 16 infants. Using the pairs of measurements they obtained the following statistics:

$\bar{x} = 3.41$

$\bar{y} = 88.06$

$\sum_i (x_i - \bar{x})^2 = 4.25$

$\sum_i (y_i - \bar{y})^2 = 671.34$

$\sum_i (x_i - \bar{x})(y_i - \bar{y}) = 23.51$

$\sum_i (y_i - \hat{y}_i)^2 = 541.64$

(a) What is the value of the slope of the least squares regression line of Y on X?

1 mark

(b) What is the value of the intercept?

1 mark

(c) In equation form show the least squares regression line of Y on X.

1 mark

(d) Find the estimated standard deviation of the points about the regression line.

1 mark

(e) Calculate an estimate for the standard error of the slope of the regression line, $s_{\beta_1}$.

1 mark

(f) Using 2.145 as the appropriate multiplier, calculate a 95% confidence interval for the slope of the regression line.

2 marks

(g) Interpret the confidence interval calculated in (f). In particular, what does it tell you about the effect of birthweight on the systolic blood pressure of babies?

1 mark

(h) Calculate a prediction of the systolic blood pressure of a baby with birthweight of 2.5 kg.

1 mark

(i) Calculate a 95% confidence interval for the prediction calculated in part (h) using the multiplier of 2.145.

3 marks

Q5: 12 marks
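
Parts (a) to (h) of question 5 follow directly from the summary statistics, and can be sketched as below. The sketch assumes the usual least squares formulas (slope = Sxy/Sxx, residual standard deviation on n − 2 degrees of freedom); the variable names are illustrative. Part (i) is left out because the wording does not settle which interval formula is intended.

```python
import math

n = 16
x_bar, y_bar = 3.41, 88.06
sxx = 4.25     # sum of (x_i - x_bar)^2
sxy = 23.51    # sum of (x_i - x_bar)(y_i - y_bar)
sse = 541.64   # sum of squared residuals

slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Residual standard deviation uses n - 2 degrees of freedom.
s = math.sqrt(sse / (n - 2))
se_slope = s / math.sqrt(sxx)

t_mult = 2.145  # multiplier given in part (f)
ci_slope = (slope - t_mult * se_slope, slope + t_mult * se_slope)

# Predicted systolic blood pressure at a birthweight of 2.5 kg.
prediction = intercept + slope * 2.5

print(f"y = {intercept:.2f} + {slope:.2f} x")
print(f"s = {s:.3f}, SE(slope) = {se_slope:.3f}")
print(f"95% CI for slope: ({ci_slope[0]:.3f}, {ci_slope[1]:.3f})")
print(f"prediction at 2.5 kg: {prediction:.2f}")
```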

6. To determine if female greenveined white butterflies have proportionately larger abdomens than males, 51 butterflies (27 male and 24 female) were measured. Abdomen dry weight (mg) and total dry weight (mg) were recorded. Because larger butterflies have bigger abdomens, total dry weight was recorded to determine whether any differences could be simply due to the sex of the butterfly.

We have fitted two models below. In each the outcome variable is Abdomen Dry Weight (AbdDryW). Sex (coded 0 for males and 1 for females) and Total Dry Weight are the explanatory variables.

> summary(fit1)

Call:
lm(formula = AbdDryW ~ Sex, data = data)

Residuals:
    Min      1Q  Median      3Q     Max
-3.1154 -0.8054 -0.1231  0.8169  5.0169

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   6.7608     0.6126  11.036 4.02e-15 ***
Sex           2.5423     0.3989   6.373 5.31e-08 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.446 on 51 degrees of freedom
Multiple R-squared: 0.4433, Adjusted R-squared: 0.4324
F-statistic: 40.62 on 1 and 51 DF, p-value: 5.314e-08

> summary(fit2)

Call:
lm(formula = AbdDryW ~ Sex + TotDryW, data = data)

Residuals:
     Min       1Q   Median       3Q      Max
-0.88778 -0.30356 -0.05991  0.24367  2.33472

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -5.76720    0.72681  -7.935 2.10e-10 ***
Sex          1.59750    0.15542  10.279 6.32e-14 ***
TotDryW      0.54102    0.02985  18.126  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.5306 on 50 degrees of freedom
Multiple R-squared: 0.9265, Adjusted R-squared: 0.9235
F-statistic: 315 on 2 and 50 DF, p-value: < 2.2e-16

(a) What is the estimated difference in mean abdomen dry weight of females compared to male butterflies without adjusting for total dry weight?

1 mark

(b) In Model 1, is there evidence of a significant difference in mean abdomen dry weight of females compared to male butterflies? If you answer yes, does the abdomen of females appear larger/smaller or no different on average than those of males?

1 mark

(c) What percentage of variation in abdomen dry weight is explained by sex without adjusting for total dry weight?

1 mark

(d) What is the estimated difference in mean abdomen dry weight of females compared to male butterflies after adjusting for total dry weight?

1 mark

(e) Using the t-multiplier 2.0086, calculate the 95% confidence interval for the difference in mean abdomen dry weight of females compared to male butterflies after adjusting for total dry weight.

1 mark

(f) After adjusting for total dry weight of the butterflies, is there a significant difference between the mean abdomen dry weight of females compared to male butterflies? Provide evidence to support your conclusion.

1 mark

(g) What is the change in the percentage of variation in abdomen dry weight explained after adjusting for total dry weight?

1 mark

(h) State the pair of fitted equations that express the estimated abdomen dry weight (F for females and M for males) in terms of total dry weight for each sex of the butterfly:

2 marks

(i) Does adding the variable for total dry weight of the butterflies improve the regression model? Provide evidence to support your conclusion.

1 mark

Q6: 10 marks

7. To evaluate the effect of aerobic training on aerobic performance, ten male athletes from four sports were tested on a treadmill subject to a graded workload. Athletes were tested until the point of exhaustion, and their maximum oxygen output recorded. The data (litres/minute) are given in the following table.

         Squash    Marathon  Soccer    Rowing
         5.1       4.3       4.5       4.6
         5.7       4.5       4.4       5.0
         ...       ...       ...       ...
         5.1       4.1       4.9       5.3
C_j      50.6      46.5      45.9      53.0
C_j^2    2560.36   2162.25   2106.81   2809.00

(a) Carry out an analysis of variance on these data by completing the ANOVA table below (note that the overall mean sum of squares is 960.410). Use the box provided for your calculations.

ANOVA table:

Source        SS       DF   MS   F
Overall mean  960.41
Sport Effect
Error
Total         973.524

5 marks

(b) If the p-value for this test is 0.0112, what is your conclusion regarding the mean maximum oxygen outputs for the four sports? Justify your answer.

2 marks

(c) Using the multiplier of 2.0281 from the t-distribution, calculate the 95% confidence interval for the mean maximum oxygen output for marathon athletes.

2 marks

(d) Compare the mean maximum oxygen outputs between squash and marathon athletes using a 95% confidence interval for the difference:

$\mu_{\text{squash}} - \mu_{\text{marathon}}$.

Use 2.0281 as the multiplier.

1 mark

(e) From the confidence interval calculated in part (d), what conclusion do you come to regarding difference in mean maximum oxygen outputs between squash and marathon athletes?

1 mark

Q7: 11 marks
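
A hedged sketch of how the ANOVA table in question 7(a) and the intervals in parts (c) and (d) could be filled in from the column totals. It assumes the "Total" row is the uncorrected total sum of squares (so that Sport and Error together account for 973.524 − 960.41), and that each sport has 10 athletes; the variable names are illustrative.

```python
import math

# Column totals C_j from the question; 10 athletes per sport (assumed).
col_totals = {"Squash": 50.6, "Marathon": 46.5, "Soccer": 45.9, "Rowing": 53.0}
n_per_group = 10
k = len(col_totals)

ss_mean = 960.41     # overall mean SS, given in the table
ss_total = 973.524   # total SS, assumed to be the sum of squared observations

# Sport SS = sum(C_j^2) / n - overall mean SS.
ss_sport = sum(c ** 2 for c in col_totals.values()) / n_per_group - ss_mean
ss_error = ss_total - ss_mean - ss_sport

df_sport = k - 1
df_error = k * n_per_group - k
ms_sport = ss_sport / df_sport
ms_error = ss_error / df_error
f_stat = ms_sport / ms_error

t_mult = 2.0281  # multiplier given in parts (c) and (d)

# Part (c): interval for the marathon mean, using the pooled error MS.
marathon_mean = col_totals["Marathon"] / n_per_group
se_mean = math.sqrt(ms_error / n_per_group)
ci_marathon = (marathon_mean - t_mult * se_mean, marathon_mean + t_mult * se_mean)

# Part (d): interval for the squash minus marathon difference.
diff = (col_totals["Squash"] - col_totals["Marathon"]) / n_per_group
se_diff = math.sqrt(2 * ms_error / n_per_group)
ci_diff = (diff - t_mult * se_diff, diff + t_mult * se_diff)

print(f"Sport: SS={ss_sport:.3f}, DF={df_sport}, MS={ms_sport:.3f}, F={f_stat:.2f}")
print(f"Error: SS={ss_error:.3f}, DF={df_error}, MS={ms_error:.4f}")
print(f"Marathon CI: ({ci_marathon[0]:.3f}, {ci_marathon[1]:.3f})")
print(f"Squash - Marathon CI: ({ci_diff[0]:.3f}, {ci_diff[1]:.3f})")
```

The error degrees of freedom from this layout are consistent with the multiplier 2.0281 quoted in the question.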

SECTION B

For each question in this Section, select 1 answer only from the 5 options provided. Record your answers on the separate sheet provided.

Information for questions 1–2 A researcher interested in heat stress among runners surveyed a random sample of 1,000 runners who took part in fun runs in New Zealand in 2000 and 2001. Heat stress was reported by 18% of the 217 runners who responded.

1. The quantity 18% is:

(A) A parameter only

(B) A statistic, a random variable and a residual

(C) A statistic, an estimate and an observed value of a random variable

(D) A parameter, random variable and a statistic

(E) An estimate and a statistic but it is not a random variable

2. The parameter of interest in this survey is:

(A) New Zealand runners who suffered from heat stress in fun runs carried out in New Zealand in 2000 and 2001.

(B) The number of those who reported that they suffered from heat stress out of those who responded.

(C) The percentage of runners who suffered from heat stress in fun runs carried out in New Zealand in 2000 and 2001.

(D) The number of runners in the sample of 1000 who reported that they suffered from heat stress.

(E) The percentage of runners in the sample of 1000 who suffered from heat stress in fun runs carried out in New Zealand in 2000 and 2001.

3. A researcher wishes to test whether a standard fish food and a new product work equally well at producing fish of equal weight after a 2-month feeding program. The experimenter has 2 identical fish tanks (1 & 2) to put fish in and is considering how to assign the 40 tagged fish to the tanks. To properly assign the fish, one step would be to:

(A) Assign the fish at random to the two tanks and give the standard feed to tank 1 and new product to the other.

(B) Put the darker coloured fish into tank 1, the lighter coloured fish in tank 2 with the feed assigned at random to the tanks.

(C) Obtain pairs of fish whose weights are virtually equal at the start of the experiment and put the heavier of the pair into tank 2, the other to tank 1 with the feed assigned at random to the tanks.

(D) Divide fish by their initial length and assign the 20 longest fish to tank 1, the 20 shorter fish to tank 2 with the feed assigned at random to the tanks.

(E) Put all the odd-number tagged fish in one tank, the even-number tagged fish in the other, and give the new food type to both tanks.

4. A new headache remedy was given to a group of 25 subjects who had headaches. Four hours after taking the new remedy, 20 of the subjects reported that their headaches had disappeared. From this information you conclude:

(A) Nothing, because there is no control group for comparison.

(B) That the new treatment is better than aspirin.

(C) Nothing, because the sample size is too small.

(D) That the remedy is not effective for the treatment of headaches.

(E) That the remedy is effective for the treatment of headaches.

5. A nutritionist wants to study the effect of storage time (6, 12 and 18 months) on the amount of vitamin C present in freeze dried fruit. Six fruit packs were randomly assigned to each of the three storage times. The treatment, experimental unit, and response are respectively:

(A) Random assignment, a fruit pack, amount of vitamin C

(B) A specific storage time, a fruit pack, amount of vitamin C

(C) A specific storage time, the nutritionist, amount of vitamin C

(D) A fruit pack, amount of vitamin C, a specific storage time

(E) A specific storage time, amount of vitamin C, a fruit pack

6. A customer satisfaction survey is conducted by sampling from a list of 5,000 new whiteware buyers. The list included 1,000 buyers from each of five brands. The analyst selects a sample of 500 whiteware buyers, by randomly sampling 100 buyers of each brand. Is this an example of a simple random sample?

(A) Yes, because the whiteware buyers of every brand were equally represented in the sample.

(B) Yes, because each buyer in the sample was randomly sampled.

(C) No, because the population consisted of purchasers of five different brands of whiteware.

(D) Yes, because each buyer in the sample had an equal chance of being sampled.

(E) No, because every possible 500-buyer sample did not have an equal chance of being chosen.

Information for questions 7–10 To assess opinions on research into genetic modification a student decides to carry out a telephone survey. A random selection of phone numbers will be used and the student needs to determine how many people to include in the sample. The parameter of interest is the proportion of people in the country who are against such research.

7. If the student decides to survey 150 people, a 95% confidence interval for the proportion of people against the research will:

(A) Be exactly ±0.08 to 2dp

(B) Be at most ±0.08 to 2dp

(C) Be at most ±0.11 to 2dp

(D) Be at least ±0.11 to 2dp

(E) Be exactly ±0.10 to 2dp

8. If the student wants this confidence interval to be at most ±7%, the minimum number of people that will need to be surveyed is:

(A) 100

(B) 784

(C) 14

(D) 400

(E) 196
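
The reasoning behind questions 7 and 8 rests on the fact that p(1 − p) is largest at p = 0.5, so the margin of error for any proportion is bounded by the value at 0.5. A minimal sketch, assuming the 1.96 multiplier and using an illustrative function name:

```python
import math

def worst_case_margin(n, multiplier=1.96):
    """Largest possible margin of error for a proportion with sample size n.

    p(1 - p) is maximised at p = 0.5, where it equals 0.25.
    """
    return multiplier * math.sqrt(0.25 / n)

for n in (100, 150, 196, 400):
    print(n, round(worst_case_margin(n), 4))
```

Scanning the printed margins shows which sample sizes keep the bound at or below a chosen target.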

9. A friend suggests that there might be a relationship between opinions on this subject and political preference. The student agrees and decides to modify the survey by deciding in advance to select a specific number of people who support each of the political parties in New Zealand. This modification to the survey design is known as:

(A) Systematic sampling

(B) Stratification

(C) Randomisation

(D) Blocking

(E) Replication

10. Of those called, 14% of people refuse to answer any questions. What type of problem might this cause when drawing conclusions from the results?

(A) Low expected counts

(B) Non-response bias

(C) Recall bias

(D) Insignificance as he needs 90% response rate

(E) Selection bias

Information for questions 11–12 A fair coin is tossed three times. It will either come up heads or tails.

11. If the order matters how many outcomes are possible?

(A) 9

(B) 8

(C) 6

(D) 12

(E) 10

12. What is the probability that the coin comes up tails exactly twice out of the three tosses?

(A) 0.375

(B) 0.250

(C) 0.333

(D) 0.500

(E) 0.200
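
For a small sample space like the three coin tosses in questions 11 and 12, the outcomes can simply be enumerated. A minimal sketch using the standard library; the variable names are illustrative:

```python
from itertools import product

# Every ordered sequence of three tosses, e.g. ('H', 'T', 'T').
outcomes = list(product("HT", repeat=3))

# Sequences with exactly two tails.
two_tails = [o for o in outcomes if o.count("T") == 2]

# With a fair coin every ordered sequence is equally likely.
prob_two_tails = len(two_tails) / len(outcomes)

print(len(outcomes), two_tails, prob_two_tails)
```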

Information for questions 13–16 As part of the 2010 Pacific Drugs and Alcohol Survey, the 1122 participants were asked about their alcohol consumption during the previous year (2009). The results are shown in the table below:

                 Drank alcohol
              Yes (A)   No (A′)   Total
Female (F)    312       298       610
Male (M)      297       215       512
Total         609       513       1122

13. An estimate of Pr(A|F) is

(A) 312/609

(B) 312/1122

(C) 312/298

(D) 312/297

(E) 312/610

14. An estimate of Pr(A ∩ F) is

(A) 610/1122

(B) 312/609

(C) 610/609

(D) 609/1122

(E) 312/1122

15. An estimate of Pr(A) is

(A) 312/609

(B) 610/1122

(C) 609/1122

(D) 312/1122

(E) 610/609

16. An estimate of the probability that a person is female, given that the person drank alcohol, is

(A) 609/610

(B) 609/1122

(C) 312/609

(D) 312/1122

(E) 312/610

Information for questions 17–18 The average salary for an employee at one corporation is $50000 per year, with a variance of $8000. This year, management awards the following bonuses to every employee.

• A Christmas bonus of $900

• An incentive bonus equal to 10 percent of the employee’s salary

17. What is the mean of the total bonus received by employees?

(A) 50900

(B) 42900

(C) 5900

(D) 50000

(E) 1700

18. What is the standard deviation of the total bonus received by employees?

(A) 894.43

(B) 3.00

(C) 8.94

(D) 80.00

(E) 28.28
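
Questions 17 and 18 are an application of the rules for a linear transformation of a random variable: if B = a + bS, then E(B) = a + bE(S) and sd(B) = |b| sd(S). A minimal sketch under that assumption, with illustrative variable names:

```python
import math

salary_mean = 50_000
salary_variance = 8_000
fixed_bonus = 900      # Christmas bonus, the constant a
rate = 0.10            # incentive bonus rate, the coefficient b

# Total bonus B = fixed_bonus + rate * salary.
bonus_mean = fixed_bonus + rate * salary_mean
# Adding a constant does not change the spread; scaling multiplies the sd.
bonus_sd = abs(rate) * math.sqrt(salary_variance)

print(bonus_mean, round(bonus_sd, 2))
```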

19. A ten-question quiz has five true-false questions and five multiple-choice questions.A student randomly picks an answer for every question. Let X denote the numberof answers that are correct. Which condition for a binomial experiment is not met?

(A) The outcome is either the answer is correct or it is incorrect.

(B) There are two outcomes.

(C) Outcomes are independent from one trial to the next.

(D) Probability of success is the same for all trials.

(E) The number of trials is fixed.

20. We take a random sample of size n from a normal distribution with a mean of $\mu$ and variance of $\sigma^2$. We calculate the sample mean $\bar{x}$ for this sample. Which of the following formulae gives upper and lower bounds on the middle 90% of the distribution of $\bar{x}$?

(A) $\mu \pm 1.645 \times \sigma^2 \times n$

(B) $\mu \pm 1.645 \times \sigma^2/n$

(C) $\bar{x} \pm 1.645 \times \sigma^2/\sqrt{n}$

(D) $\bar{x} \pm 1.645 \times \sigma/\sqrt{n}$

(E) $\mu \pm 1.645 \times \sigma/\sqrt{n}$

21. Suppose we take a sample of size n from some distribution (not necessarily normal) and calculate the sample mean $\bar{x}$ and sample standard deviation s. Select the option which best describes why the Central Limit Theorem is an important result:

(A) We can use the formula $p \pm 1.96\sqrt{p(1-p)/n}$ to calculate a 95% confidence interval for a population mean if we have a large enough sample.

(B) It allows us to assume that the sample mean has a t-distribution.

(C) We can use the formula $\bar{x} \pm 1.96s/n$ to calculate a 95% confidence interval for a population mean even if we only have a small sample.

(D) If we have a large enough sample, we do not need to assume a normal distribution for the population when calculating a 95% confidence interval for the population mean and can use the formula $\bar{x} \pm 1.96s/\sqrt{n}$ for the calculation.

(E) Even if we have small samples, we can assume the population is normally distributed therefore we can use the formula $\bar{x} \pm 1.96s^2/\sqrt{n}$ when calculating a 95% confidence interval for the population mean.

Information for questions 22–23 A normal random variable X has a mean of 7 and a standard deviation of 5. Z is a standard normal random variable.

22. The diagram above represents a standard normal distribution. Which of these statements is true if we wish to find the proportion of values for which Pr(X > 5)?

(A) The area labeled w equals Pr(X > 5).

(B) The area labeled y equals 1− Pr(X > 5).

(C) The area labeled y equals 0.40 to 2dp

(D) The point labeled z equals 0.40 to 2dp

(E) The point labeled z equals −0.40 to 2dp

23. If Pr(Z > 1.96) = 0.025, what is the value x for which Pr(X > x) = 0.025?

(A) -2.80 to 2dp

(B) 7.39 to 2dp

(C) -16.80 to 2dp

(D) 16.80 to 2dp

(E) 2.80 to 2dp

Information for questions 24–25 A biologist measured the lengths of 10 maple leaves. The following lengths (in cm) were obtained.

4.9 7.1 4.1 9.1 6.7 7.8 8.8 7.0 7.5 9.0

24. The standard deviation of this sample to 4 dp is 1.6687 cm. Select the correct standard error of the mean for the above study:

(A) $1.6687^2/\sqrt{9}$

(B) $1.6687^2/\sqrt{10}$

(C) $1.6687/\sqrt{9}$

(D) $1.6687/\sqrt{10}$

(E) $1.6687^2/10$

25. Using the multiplier 2.2621, the 95% confidence interval for the mean length of the 10 maple leaves is closest to:

(A) (6.5701, 7.8299)

(B) (5.2082, 9.1918)

(C) (6.0063, 8.3937)

(D) (5.9418, 8.4582)

(E) (5.1004, 9.2996)
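
The quantities in questions 24 and 25 can be checked directly from the ten leaf lengths. A minimal sketch using the standard library, assuming the standard error of the mean is the sample standard deviation divided by √n; the variable names are illustrative:

```python
import math
import statistics

lengths = [4.9, 7.1, 4.1, 9.1, 6.7, 7.8, 8.8, 7.0, 7.5, 9.0]
n = len(lengths)

mean = statistics.mean(lengths)
sd = statistics.stdev(lengths)   # sample standard deviation (n - 1 divisor)
se = sd / math.sqrt(n)

t_mult = 2.2621                  # multiplier given in question 25
ci = (mean - t_mult * se, mean + t_mult * se)

print(f"mean = {mean:.2f}, sd = {sd:.4f}, se = {se:.4f}")
print(f"95% CI: ({ci[0]:.4f}, {ci[1]:.4f})")
```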

Q22 - ignore. Figure not provided

Information for questions 26–31 To investigate whether a new vaccine will benefit asthmatics, 40 participants received the vaccine while 40 participants received a placebo. After the study was completed, 16 participants from the vaccine group had improved asthma symptoms, and 12 participants from the placebo group had improved asthma symptoms.

26. The difference between the proportion of the vaccine and placebo groups who had improvement and the standard error of the difference are respectively:

(A) 0.1000 and 0.0170

(B) 4 and 0.2079

(C) 0.1000 and 0.1061

(D) 4 and 0.1061

(E) 0.1000 and 0.2079

27. The 95% confidence interval for the difference in proportions is:

(A) 0.1000 ± 1.96 × 0.1061

(B) 4 ± 1.96 × 0.2079

(C) 0.1000 ± 1.96 × 0.2079

(D) 4 ± 1.96 × 0.1061

(E) 0.1000 ± 1.96 × 0.0170

Further information for questions 28–31 Another way to analyse these data is to carry out a $\chi^2$ test to assess if there is an association between group and improvement. Counts and totals are displayed in the following table.

             Improvement
Group      Yes    No     Total
Vaccine    16     24     40
Placebo    12     28     40
Total      28     52     80

28. The null hypothesis in the $\chi^2$ test is:

(A) The means for each group are the same.

(B) There is an association between group and improvement.

(C) The probability of having improved asthma symptoms is greater in the vaccine group.

(D) The probability of having improved asthma symptoms is the same in each group.

(E) There is no association between group and improvement.

29. Under the null hypothesis, the expected number of participants in the vaccine group who had improved asthma symptoms is:

(A) 14.00

(B) 15.50

(C) 28.00

(D) 12.00

(E) 40.35

30. The $\chi^2$ statistic of this test is 0.88 to 2dp. The degrees of freedom associated with this is:

(A) 78

(B) 4

(C) 2

(D) 79

(E) 1

31. The p-value for this test statistic is 0.9979. We therefore conclude that:

(A) There is evidence of an association between group and improvement.

(B) The probability of improved asthma symptoms is 0.9979.

(C) The probability that there is no association between group and improvement is 0.9979.

(D) There is no evidence of an association between group and improvement.

(E) The probability of experiencing improved asthma symptoms is greater in the vaccine group.
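
The expected counts, test statistic and degrees of freedom used in questions 29 and 30 can be sketched as below. The sketch assumes the usual rule, expected count = row total × column total / grand total, with no continuity correction; the variable names are illustrative.

```python
# Observed counts: rows are Vaccine, Placebo; columns are Yes, No.
observed = [[16, 24], [12, 28]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under no association.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-squared statistic: sum over cells of (O - E)^2 / E.
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

df = (len(row_totals) - 1) * (len(col_totals) - 1)

print(expected)
print(f"chi-squared = {chi2:.2f} on {df} df")
```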

Information for questions 32–35 It is known that 15% of people are allergic to wasp venom. Suppose a group of 11 people walk through a forest track known to harbour wasps. Assume that the properties of the binomial distribution are satisfied.

32. What is the mean of the number of people in groups of size 11 that will be allergic to wasp venom?

(A) $\pi/n = (15/100)/11$

(B) $n \times \pi = 11 \times 15/100$

(C) $n \times \pi = 0.5 \times 15$

(D) $\pi(1 - \pi)/n = 15/100 \times 85/100 \times 1/11$

(E) $\pi = 15/100$

33. What is the variance of the number of people in groups of size 11 that are allergic to wasp venom?

(A) $n \times \pi = 11 \times 15/100$

(B) $\sqrt{\pi \times (1 - \pi)/n} = \sqrt{(15/100 \times (1 - 15/100))/11}$

(C) $n \times \pi \times (1 - \pi) = 11 \times 15/100 \times (1 - 15/100)$

(D) $n \times \pi \times (1 - \pi) = 11 \times 1/2 \times 1/2$

(E) $\pi \times (1 - \pi)/\sqrt{n} = (15/100 \times (1 - 15/100))/\sqrt{11}$

34. If 11 people from one family are sampled, would it still be appropriate to use the above binomial distribution to model the number in the group allergic to wasp venom?

(A) No, because different members of the family will have different probabilities of being allergic to wasp venom.

(B) Yes, because they are all from the same family a binomial approximation will be more likely to be satisfied.

(C) No, because the sample size of 11 is small.

(D) Yes, because there is no information to suggest that this family would be anything other than a representative sample from the population.

(E) No, because they are all from the same family and so the assumption of independence will be violated.

35. Suppose we calculated the probability that no more than 4 people would be allergic to wasp venom using a normal approximation to the binomial. Which of these statements is NOT true?

(A) To correct for continuity we would need to take the area under the normal curve from minus infinity to 3.5.

(B) The normal approximation can never be exact but it will get better in large samples.

(C) The normal approximation is unlikely to be accurate because nπ is less than 5.

(D) The normal approximation is useful for binomial calculations when n is large.

(E) To correct for continuity we would need to take the area under the normal curve from −∞ to 4.5.

Information for questions 36–37 Among Takahe chicks, 11% are thought to be underweight. A sample of 30 Takahe chicks is taken in one breeding season.

36. Using the results for the binomial distribution, the mean and standard deviation of the number of chicks that are underweight is closest to:

(A) Mean = 3.3, Standard deviation = 1.71

(B) Mean = 30, Standard deviation = 2.94

(C) Mean = 30, Standard deviation = 5.17

(D) Mean = 3.3, Standard deviation = 26.70

(E) Mean = 330, Standard deviation = 8.63

37. If you calculated the probability that 6 or more chicks are underweight for a sample of size 30 using a normal approximation to the binomial, the standardized z value used in the calculation is closest to:

(A) 2.28

(B) -1.28

(C) 1.28

(D) 1.58

(E) 1.87
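
A minimal sketch of the binomial summaries in question 36 and the standardized value in question 37. It assumes a continuity correction is applied, so that "6 or more" is treated as the area above 5.5 on the normal scale; the variable names are illustrative.

```python
import math

n = 30        # chicks sampled
pi = 0.11     # probability a chick is underweight

mean = n * pi
sd = math.sqrt(n * pi * (1 - pi))

# P(X >= 6) is approximated by the normal area above 5.5
# (continuity correction, assumed here).
cutoff = 6 - 0.5
z = (cutoff - mean) / sd

print(f"mean = {mean:.1f}, sd = {sd:.2f}, z = {z:.2f}")
```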

Information for questions 38–40 As part of a study into the distribution of Emoia murphyi (Murphy’s tree skink) throughout the Southwest Pacific, the following data were obtained. The table records the summary statistics for the head length in mm of 23 female skinks found on two separate islands:

         n     sample mean   sample std.dev.
Samoa    13    14.1          0.60
Vavau    10    13.6          0.88

The main question of interest is whether or not there is a difference between the mean lengths of the skinks from the two islands.

38. The researchers wish to analyse the data using the Student’s t-distribution. Here are some possible underlying assumptions:

I. The underlying head lengths are normally distributed.

II. The recordings are independent.

III. The mean head length should be known for the population.

Which of the assumptions are necessary for a valid analysis using the Student’s t-distribution?

(A) I and II only.

(B) I only.

(C) II only.

(D) I and III only.

(E) All of I, II and III.

39. The 95% confidence interval for the mean head length of female skinks from Vavau is:

(A) $13.6 \pm (2.262 \times 0.88/10)$

(B) $13.6 \pm (2.262 \times 0.88^2/\sqrt{10})$

(C) $13.6 \pm (2.262 \times 0.88/\sqrt{10})$

(D) $13.6 \pm (2.262 \times 0.88/\sqrt{9})$

(E) $0.88 \pm (2.262 \times 13.6/\sqrt{23})$

40. A 95% confidence interval for the difference between the mean head length of female skinks of those from Samoa and those from Vavau is equal to:

(A) $(0.6 - 0.88) \pm (2.080 \times 0.733 \times \sqrt{0.6/13 + 0.88/10})$

(B) $(14.1 - 13.6) \pm (2.080 \times \sqrt{0.6^2/10 + 0.88^2/13})$

(C) $(0.6 - 0.88) \pm (2.080 \times 0.538 \times \sqrt{1/13 + 1/10})$

(D) $(14.1 - 13.6) \pm (2.080 \times 0.538 \times \sqrt{13.6/13 + 14.1/10})$

(E) $(14.1 - 13.6) \pm (2.080 \times 0.733 \times \sqrt{1/13 + 1/10})$

Q39 - ignore. Multiplier for CI would be provided in exam.

Q40 - ignore. Multiplier for CI would be provided in exam.