statistical inference-(mgt601) mid fa2015

Upload: muhammad-azam

Post on 29-Jan-2016

Page 1: Statistical Inference-(MGT601) Mid FA2015

A coin is tossed 100 times. What is the expected number of heads?

Answer: 50 (expected count = n × p = 100 × 0.5; 0.5 or 1/2 is the probability of a head on a single toss)
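
A quick check of the expected-count arithmetic, as a minimal sketch. The helper name `expected_successes` is made up for illustration and assumes independent tosses of a fair coin:

```python
from fractions import Fraction

def expected_successes(n_trials: int, p_success: Fraction) -> Fraction:
    """Expected number of successes in n independent trials: n * p."""
    return n_trials * p_success

# 100 tosses, probability of a head on each toss assumed to be 1/2.
expected_heads = expected_successes(100, Fraction(1, 2))
print(expected_heads)  # 50
```

The probability of a head on one toss is 1/2, while the expected number of heads over 100 tosses is 50.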

Given H

Answer Z=1

Two cards

Answer : 64

When referring to

Answer : None of these

When a null

Answer : <42

One-tailed two-tailed

Answer : z test one tailed

Which of the following is the first step in calculating the median of a data set?

Answer : Array the data

The accuracy of the prediction of variable can be improved

Answer : adding more independent variables to the regression model

The hypothesis

Answer : Simple Hypothesis

Given p

Answer : No, Because P(A and B) (1st option)

When referring to a curve that tails off to the left end

Answer : none of these

The test statistics is equal to

Answer : None of these

The correct formula is (Sample mean − Population mean) / Standard error

A pair of dice are thrown

Answer : 1/6 Good answer
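
The 1/6 can be confirmed by listing all outcomes, assuming the question is the one from the Mid Term list (a total of either 5 or 11 with two fair dice). This is only an illustrative sketch and the variable names are made up:

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) outcomes.
outcomes = list(product(range(1, 7), repeat=2))

# Keep the rolls whose total is 5 or 11.
favourable = [roll for roll in outcomes if sum(roll) in (5, 11)]
p_five_or_eleven = Fraction(len(favourable), len(outcomes))
print(len(outcomes), len(favourable), p_five_or_eleven)  # 36 6 1/6
```

Four rolls total 5 and two rolls total 11, so 6 of the 36 outcomes are favourable.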

If p(a or b)

Answer : P(A) + P(B) is the joint probability of A and B (I am not sure about this)

When n dice are rolled, the possible outcomes are:

Answer : 6^n

The number of road accidents is a

Answer : discrete

The grouped data are called

Answer : secondary data

A two tailed test of a difference between two proportions led to

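Answer : 0.10 (with z = 1.85, the two-tailed critical value 1.645 at alpha = 0.10 is exceeded, but 1.96 at alpha = 0.05 is not)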

The range of test statistic-z is

Answer: 3rd option

The frequency divided by the total number of observations is called

Answer : relative frequency

Two cards are drawn from a well-shuffled

Answer : 1/221 (calculation required so I can’t ans)

For two tailed test of hypothesis at

Answer : between the two critical values

If a sample of size m is drawn from one population and size

Answer : n-1 , m-1

For an upper tailed test of the difference of two means based on dependent samples

Answer : 2.015 (t critical value with 5 degrees of freedom, for dependent samples of size 6 at alpha = 0.05)

Given Ho

Answer : z=1 accept Ho

Two cards are switched

Answer : 2 but not sure (I also not sure)

Two cards are drawn

Given 130

Answer : Z

The range of test

Answer : c (-infinity to +infinity)

1st Quiz

1. If four coins are tossed, how many elements will the sample space contain? A= 2, B= 4, C= 16.

2. A bag contains 10 red balls and 7 blue balls. A ball is drawn at random. The probability that the ball drawn is red is: A= 7, B= 7/17, C= 10/17, D= 3/17.

3. A fair coin is tossed three times. What is the probability that at least one head appears. A=7/8.

B= 6/8. C= 5/8. D= 4/8.

4. If a dice is thrown twice, the number of elements in the sample space. A= 2. B= 4. C=16. D=36.

5. Mean, Median and mode always coincide in the case of …………..Distribution. A= Poisson. B=

Binomial. C= Normal. D= Hypergeometric.

6. Two events. A and B are mutually exclusive and each have a nonzero probability. If event A is

known to occur, the probability of the occurrence of event B is. A= one. B= any positive value.

C=0, d= any value between 0 to 1.

7. The probability of getting a head in tossing of a coin is. A= 0.5, B= 1, C= 1.5, d= -0.5.

8. The probability of an event cannot be. A= 1, B=0.1, C= 0.5, D=-0.5

9. Find X compliment, x= 2,8,4,4,6,8,10. A= 49, B= 42, C= 9, D= 6.

10. P (A intersection B) =….., A= P(A) P(A/B), B= P (B) P(B/B). C= P(A) P(B/A), D= none of these.

11. For a random sample of 9 women, the average resting pulse rate is x-bar = 76 beats per minute and the sample standard deviation is s = 5. The standard error of the sample mean is: A= 0.557, B= 0.745, C= 1.667, D= 2.778.

12. Null and alternative hypotheses are statements about: A= population parameters, B= sample parameters, C= sample statistics, D= it depends: sometimes population parameters and sometimes sample parameters.
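
For question 11 above, the standard error of the sample mean is s / sqrt(n). A minimal sketch of that arithmetic follows; the function name `standard_error` is made up for illustration:

```python
import math

def standard_error(sample_sd: float, n: int) -> float:
    """Standard error of the sample mean: s / sqrt(n)."""
    return sample_sd / math.sqrt(n)

# Question 11: s = 5, n = 9.
se = standard_error(5, 9)
print(round(se, 3))  # 1.667
```

The sample mean of 76 does not enter the calculation; only s and n matter.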

2nd Quiz

Q.1. In order to carry out a chi square test on data in contingency table, the observed values in the table

should be. A= close to the expected values, b= all greater than or equal to 5. C = frequencies, d=

quantitative.

Q.2. If two attributes A and B have perfect association the value of coefficient of association is equal to.

A= +1, B= 0, C= -1, D= (r-1 x c-1)

Q.3. The degree of freedom for chi square are (r-1)(c-1) for a contingency table with r-rows and c-

columns so for 2*2 contingency table there are. A= one degree of freedom, b= Two degree of freedom,

c= three degree of freedom, d= four degree of freedom,

Q.4. For an r*c contingency table the number of degrees of freedom equal. A= rc, b= r+c, c= (r-1) + (c-1),

d= (r-1)(c-1)

Q.5. For a 3 * 3 contingency table the number of cells in the table are. A= 3, b= 6, c= 9, d= 4.

Q.6. The total area under the curve of chi-square distribution is, A= 1, B= 0.5, C= 0 to infinity, D= - infinity

to + infinity

Q.7. Chi-square curve ranges from. A= – infinity to + infinity, B= 0 to infinity, C= - infinity to 0, D= 0 to 1.

Q.8.The value of chi- square statistics is always. A= negative, B= 0, C= non negative, D= one.

Q.9.The slope of the simple linear regression equation (x is the independent variable and y is the

dependent variable) represents the. A= mean value of y when x=0, b= change in mean value of y per

unit change in x, c= True value of y for a fixed value of x, d= Variance of the value of x.

Q.10. The range of values of the correlation coefficient is. A= 0 to 1, B= -1 to 0, C= -1 to 1, D= none of the above.
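
Several questions in this quiz rely on two facts about a chi-square test on a contingency table: the statistic is a sum of non-negative terms, and it has (r-1)(c-1) degrees of freedom. The sketch below illustrates both in plain Python. The function name and the 2 x 2 table are hypothetical and are not taken from the quiz:

```python
def chi_square_independence(observed):
    """Return (statistic, degrees_of_freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected

    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, dof

# Hypothetical 2 x 2 table of frequencies (illustration only).
table = [[20, 30], [30, 20]]
stat, dof = chi_square_independence(table)
print(stat, dof)  # 4.0 1
```

Every term is a squared difference divided by a positive expected count, so the statistic can never be negative, and a 2 x 2 table has one degree of freedom.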

Mid Term

1. With a lower significance level the probability of rejecting a null hypothesis that is actually true:

Decrease

2. Specify which probability distribution to use in a hypothesis test and whether it will be one tailed or two tailed if the following information is given: t-test, two tailed

3. A pair of dice are thrown. Find the probability of getting a total of either 5 or 11: 1/6

4. If we want to test whether the proportions of more than two populations are equal, we use: a. analysis of variance. b. estimation. c. the variance. d. interval estimates. e. none of these.

5. Which of the following is based on the relationship or association between two or more

variables? Regression and Correlation

6. Which of the following test could be based on the normal distribution? a. Difference between

independent means, b. difference between dependent means, c. difference between

proportions. d. (a, c but not b) e. All of the above

7. The measure of central tendency listed below is: 1.The mean

8. Quantitative variable are variables measured on a __ scale. numeric

9. a/2 is called: Two tailed significance level

10. the normal distribution is the appropriate distribution to use in testing hypotheses about all of

the above

11. scores that differ greatly from the measures of central tendency are called: extreme scores

12. when a null hypothesis is accepted, it is possible that. a. b. Zcal < Ztab

13. A fair coin is tossed three times. What is the probability that at least one head appears? 7/8

14. When one card is selected at random from a pack of 52 cards playing cards, the possible

selections are: 52

15. Given H … find Z and make the statistical decision Z=2.4 reject H0

16. The number of road accidents is a __ random variable (continuous or discrete): discrete

18. When data are classified according to a single characteristic, it is called Qualitative classification / Simple Classification

19. A relative frequency distribution presents frequencies in terms of only A and C (Fractions

and percentages)

20. Which of the following is true about the number of variables in regression? There can only be one dependent variable but multiple independent variables

21. Decision makers make decisions on the appropriate significance level by examining the cost of

a). performing the test. b). A type I error. c). A type II error. d). a and b. e. b and c.

22. In the case that two events A and B are mutually exclusive, P(A union B)= ? P(A)+ P(B)

23. Find Σx, x = 1, 7, 3, 4, 6, 8, 10: 39

24. Find Σx, x = 1, 7, 3, 4, 6, 8, 10. A= 49, B= 23, C= 39, D= 59

25. When referring to a curve that tails off to the left end, you would call it none of these

26. A binomial distribution may be approximated by a Poisson distribution if: 1= n is large, p is large, 2= n is small, p is large, 3= none of these, 4= a and b but not c.

27. Specify which probability distribution to use in a hypothesis test and whether it will be one-

tailed or two-tailed given the following information

28. The square of variance of a distribution is the : None of these

29. Which of the following is true regarding the acceptance and rejection region? All of the

above

30. for a normal curve with mean 55 and standard deviation 10, what will be the area under curve

to the right of value 55?

a)1.0

b)0.68

c)0.32

d)0.5

e) none of the above

31. What does regression mean? The general process of predicting one variable from another variable

32. what is the probability that a randomly selected value of a population is greater than median of

that population? ½ (0.5)

33. if P(A or B)=P(A), then 1. A and B are mutually exclusive 2. The venn diagram area of A

and B overlap 3. P (A) + P(B) is the joint probability of A and B. 4. None of these. 5. All of these.

34. If the null hypothesis is rejected, then we may be making

a. Correct decision

b. type I error

c. type II error

d. either A or B

e. either A or C

35. A bag contains one rupee, 50-paisa and 25-paisa coins in the ratio

2 : 3 : 5. Their total value is Rs.144. The value of 50-paisa coins is

Rs.24

Rs.36

Rs.48

Rs.72

Rs.80

36. Which of the following normal curves is most likely the curve for u=10, sigma =5? Curve

for u=20, sigma=10

37. A number between 0 and 1, that is used to measure uncertainty is called Probability

38. Histogram is a graph of frequency distribution

39. Given sigma = 80, n = 625, u0 = 350 and x-bar = 356, find Z: 1.88

40. The grouped data are called : Difficult to tell

41. u and sigma are parameters z distribution

42. The values that separate the acceptance region from the rejection region is called Critical values

43. The test statistics is equal to: None of these

44. If a fair coin is tossed three times, what is the probability that at least one head appears? 7/8

45. Economists use regression analysis and base their predictions of the annual gross domestic product (GDP) on the final consumption spending within the economy. What are the dependent and independent variables for the analysis? Dependent: GDP; Independent: final consumption spending

46. Square root of variance have only values: non Negative

47. A frequency distribution that contains a class with limits of “10 and under 20” would have a

midpoints: 15

48. Z = _____: \(z = (\bar{x} - \mu)/(\sigma/\sqrt{n})\)

49. Which of the following represents the probability of mutually exclusive events A and B? P(A)+P(B)

50. In testing hypothesis: alpha + beta is always equal to difficult to tell

51. The standard deviation of a binomial distribution depends upon: 1=success, 2= failure, 2= trial,

3= b and c but not a, 4= a,b and c

52. Suppose we want to test whether the population mean is significantly larger or smaller than 10. What should our alternative hypothesis be? u not equal to 10

53. An arrangement in which the order of the objects selected from a specific pool of objects is important is called a permutation

54. For an upper tailed test of the difference of two means based on dependent samples of size 6

and alpha =.05, the critical value for the test statistic is : 2.015

55. What is the probability that a value chosen at random from a particular population is larger than

the median of the population? 0.5

56. The accuracy of prediction of a variable can be improved by adding more independent variables to the regression model

57. The power of test is equal to : 1-beta

58. Six white balls and four black balls, Which are indistinguishable apart from colour, are placed in

a bag. If six balls are taken from the bag, Find the probability of their being three white and

three black balls : 8/21

59. The probability of an event occurring given that another event had occurred is called:

Conditional probability

60. The Largest and the smallest values of any given class of a frequency distribution are called:

Class limits

61. If a coin is tossed thrice the sample space consist of 8 elements

62. If two dices are rolled, the possible outcomes are : 36

63. Which of the following is an example of a parameter? n or u

64. For two tailed test of hypothesis at alpha=0.10, the acceptance region is the entire region: Between the two critical values

65. In Lower tail alpha=0.05 then z tabulated is 1.65

66. A two tailed test of a difference between two proportions led to z=1.85 for its standardized difference of sample proportions. For which of the following significance levels would you reject H0? Alpha=0.10 (z = 1.85 exceeds the two-tailed critical value 1.645 at alpha = 0.10, but not 1.96 at alpha = 0.05)

67. Given u0=130, x =150, sigma=25 and n=4. What test statistic is appropriate? 1= t, 2 = z, 3= x2,

4=f

68. The average of lower and upper class limits is called the class mark (midpoint)

69. If total number of data points are 120, then we can make a total of __ number of classes. 8

70. Fisher test nm

71. Alpha + beta = 1

72. Upper tailed test

73. Which test will be used if the population is normal and the standard deviation is known: Z test

74. Which one of the following is discrete variable: Number of rooms

75. If the total number of data points are 120, then we can make a total of --------number of classes:

6.

76. If the dependent variable decreases as the independent variable increase: Negative linear

relationship.

77. A numerical quantity that describes a population is called: Parameter

78. Degree of freedom of t distribution is. a). N+1, b). n-1, c). n. d). n-1/2

79. Given x-bar = 100, σx̄ = 16 and u0 = 90, find Z: 0.63 (10/16 = 0.625)

80. When n dice are rolled, the possible outcomes are: 6^n

81. The simple probability of an occurrence of an event is called: marginal probability.

82. When a null hypothesis is H0: u=42, then the alternative hypothesis can be : H1: u less than 42

83. If A is any event in S and A' its complement, then P(A') is equal to: 1-P(A)

84. In regression analysis the variable we would like to predict or explain is called : dependent

variable

85. Histogram is a graph of : frequency distribution.

86. Given H0: u=12, H1: u greater than 12, n=64, x-bar=15, sigma=10, alpha=0.05, find Z and make the statistical decision: Z=2.4, reject H0

87. Given x =120, u0=100,s=34.75 and n=25, find t : 2.88

88. An arrangement in which the order of the objects selected from a specific pool of objects is

important called : permutation

89. With referring to a curve that tails off to the left end, you would call it : none of these

90. What does the term regression mean: the general process of predicting one variable from another variable.

91. Which of the following is the first step in calculating the median of data set. Array the data.

92. Which of the following is not a measure of central tendency? Geometric mean

93. f(x) represents the -------Variable : dependent.

94. The frequency divided by the total number of observations is called. Relative frequency

95. How does the computation of a sample variance differ from the computation of a population variance? a). u is replaced by x, b). N is replaced by n-1, c) N is replaced by n, d) a and c but not b, e) a and b but not c

96. If a sample of size m is drawn from one population and size n from another population, what are the respective degrees of freedom if one has to apply Fisher’s test. a= n-1,m-1, b= n,m. c= n-1,m+1. d= n-1,m. e= m-1,n
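
Items 39, 86 and 87 in the list above all use the same form of test statistic, (sample mean minus hypothesised mean) divided by the standard error. The sketch below reproduces their arithmetic. The function name is made up for illustration, and it assumes the garbled symbols in those items stand for sigma = 80 (item 39) and sigma = 10 (item 86):

```python
import math

def standardised_statistic(sample_mean, hypothesised_mean, sd, n):
    """(x_bar - mu_0) / (sd / sqrt(n)).

    Read the result as z when sd is the known population sigma,
    and as t (with n - 1 degrees of freedom) when sd is the sample s.
    """
    return (sample_mean - hypothesised_mean) / (sd / math.sqrt(n))

z_39 = standardised_statistic(356, 350, 80, 625)    # item 39, assumed sigma = 80
z_86 = standardised_statistic(15, 12, 10, 64)       # item 86, assumed sigma = 10
t_87 = standardised_statistic(120, 100, 34.75, 25)  # item 87, s = 34.75
print(z_39, z_86, t_87)
```

The three values come out at about 1.88, 2.4 and 2.88, which match the answers recorded for those items. For item 86, 2.4 is above the upper-tail critical value 1.645 at alpha = 0.05, which is why H0 is rejected.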

Previous quiz.

1. Which of the following is a criterion for selecting a regression line which best represents the data.

A= the mean of the data must agree with the line. B= the sum of squared differences between

the dependent variable must be minimized. C= the sum of the squared horizontal differences in

the independent variable must be minimized. D= the line must agree with at least half of the

data points

2. Which is the probability that a value at random from a particular population is larger than the

median of the population. A= 0.25, B= 0.5, C= 1.0, D= 0.67, E= none of these

3. P(A)=? A= number of favorable outcomes/total number of possible outcomes. B= total

number of possible outcomes/number of favorable outcomes, C= both a and b. D= none of

these

4. The weights in grams of 10 male and 10 female ring-necked pheasants are obtained. The variances for each are different. In order to test the hypothesis that the variance of the different genders favors males over females, which of the following tests may be used? F-test, one-tailed (an F-test compares two variances).

5. In an un paired sample t-test with sample size n1=11 & n2= 11, the value of tabulated should be

obtain from. A= 10 degree of freedom, B= 21 degree of freedom, C= 22 degree of freedom, D=

20 degree of freedom.

6. Σ(x − x̄)(y − ȳ) = 0, Σ(x − x̄)² = 10 and n = 5; find the coefficient of correlation. A= 1, B= 2, C= 0, D= 0.5

7. A time series has. A= two components, B= three components, C= four components, D= five components.

8. If the regression lines of y on x and x on y are respectively given by 2x-3y=0 and 4y-5x=8 find out

values of two regression coefficients of y on x and x on y. A= 3/2 and 5/4, B= ½ and 1/5, C= 2/3

and 4/5, D= 2/5 and ¾.

9. For an r x c contingency table, the number of cells in the table are. A= r.c, B= (r-1)(r-c). C= r+c,

D= (r+1)(c-1)

10. Given \(\chi^2\) = 20.178, d.f. = 4 and alpha = 0.01, find the table value of \(\chi^2\) and make the statistical decision. A= \(\chi^2_{0.01}(4)\) = 13.277, reject H0. B= \(\chi^2_{0.01}(4)\) = 14.277, reject H1. C= \(\chi^2_{0.01}(4)\) = 13.277, reject H1. D= \(\chi^2_{0.01}(4)\) = 14.277, reject H0.

11. Moving average is. A= given the trend in a straight line, B= measure the seasonal variation, C=

smooth-out the time series. D= none of the above.

12. Suppose that y= 1, when x= 0, then y=2 where x= 2. Find the least square estimate b. A= 2.0, B=

1.0, C= 1.5, D= 2.5.

13. Given x= 0.6-0.5y and y = 0.8, find x= ?. A= 0.1, B= 0.3, C= 0.2, D= 0.4.

14. Suppose that y= 1, when x= 0, then y=0 when x= 1 and that y=3 where is x=2. In this case find

the sample correlation, A= 1, B=2, C= 3, D= 4.

15. In semi averages method, if the number of values is odd then we drop: A= first value, B= third

value, C= last value, D= middle value, E=middle two value.

16. Suppose that y= 1, and when x= 0, that y=2 when x= 1. And that y=3 when x=2. Find the least

square estimate a. A= 2, B= 3, C= 4, D= 1.

17. Which of the following normal curves looks most like the curve for u=10, sigma=5?

A= Curve for u=10, sigma=10. B= Curve for u=20, sigma=10. C= Curve for u=20, sigma=5. D= Curve for u=13, sigma=3. E= None of these.

18. Degree of freedom of t-distribution is. A=n+1. B=n-1. C=N. D=n-1/2

19. Which of the following tests could be based on the normal distribution. A=Difference

between Independent means. B=Difference between dependent means.C=Difference between

proportions. D=All of the above. E=a and c but not b

20. If the null hypothesis is rejected then we may be making.A=a correct decision. B=a Type I error.

C=a type II error.D=either A or B. E=either A or C

21. In lower tail \(\alpha\)=0.05, then z tabulated is. A=.96. B=1.45. C=2.03. D=1.65.

22. The normal distribution is the appropriate distribution to use in testing hypotheses about. A= A proportion, when npH0 > 5 and nqH0 > 5. B= A mean, when sigma is known and the population is normal. C= A mean, when sigma is unknown but n is large. D= All of above. E= None of these

23. Which of the following is true regarding the acceptance and rejection region. A=The acceptance

region is the range of values of the sample statistics within which if values of the sample

statistics falls then the null hypothesis is accepted. B=The rejection region is the range of values

of the sample statistics within which if values of the sample statistic falls then the null

hypothesis is rejected. C=The rejection region is the range of values of the sample statistic

within which if values of the sample statistic falls then the alternative hypothesis is accepted.

D=All of the above. E=only A and B.

24. If the total number of data points are 120 then we can make a total of-------- number of classes.

A=8. B=7. C=6. D=5

25. r=+1. A= no correlation. B=Negative correlation. C=Perfect correlation. D=None of above.

26. \(\mu\)and \(\sigma\) are the parameters------------. A=f distribution. B=z distribution. C= t

distribution. D= None of above.

27. Which of the following represents the probability of mutually exclusive events A and B.

A=P(A)+p(B). B=P(A)+P(B)+P(A^B). C=P(A)+P(B)-P(A^B). D=P(A)-P(B).

28. Specify which probability distribution to use in a hypothesis test and whether it will be one

tailed or two tailed if the following information is given. Ho:u=15,H1:u not equal to 15, x-

bar=14.8, n=20. A= z-test ;one- tailed. B= z- test; two tailed. C=t-test;one tailed. D= t-test;two

tailed. E= a and b but not c.

29. The probability of an event occurring given that another event has occurred is called. A= joint probability. B= Conditional probability. C= Binomial probability. D= Discrete probability.

30. Z=. A= \(z = (\bar{x} - \mu)/(\sigma/\sqrt{n})\), B= \(z = (\bar{x} - \mu)/(\sigma^2/\sqrt{n})\), C= \(z = (x - \mu)/(\sigma/\sqrt{n})\)

31. The simple probability of an occurrence of an event is called the. A=Bayesian probability. B=Joint

probability. C=Marginal probability. D=Conditional probability.E=None of these.

32. When referring to a curve that tails off to the left end, you would call it. A=Symmetrical.

B=Skewed right. C=Positively skewed. D=All of these. E=None of these.

33. How does the computation of a sample variance differ from the computation of a population variance. A=u is replaced by x. B=N is replaced by n-1. C= N is replaced by n. D= a and c but not b. E= a and b but not c.

34. Find \(\bar{x}\). x=1,7,3,4,6,8,10. A= 49. B=23. C=39 . D=59.

35. A pair of dice thrown,find the probability of getting a total of 5 or 11. A=2/6. B=6/6. C=3/6.

D=1/6.

36. If the sample of size m is drawn from one population and size of n from another

population,what are the respective degrees of freedom if one has to apply fisher's test. A=n-1,

m-1 . B=n,m. C= n-1,m+1. D=n-1,m. E=m-1,n.

37. Which of the following is an example of a parameter. A=x. B=n. C=u. D= All of these. E= b and c

but not a.

38. When a null hypothesis is Ho,u=42 then the alternative hypothesis can be. A=H1,u>42.

B=H1;u<42. C=H1;u=40. D=H1;u=40.

39. Quantitative variables are variables measured on a---------scale. A=theoretical. B= numeric. C=

ordinary. D= ratio

40. For an upper tailed test of the difference of two means based on dependent samples of size 6

and a=0.05 the critical value for the test statistic is.A=2.015. B=1.645. C=1.812. D=1.782. E=None

of these.

41. F(x) represent the-------- variable. A=Independent. B=Dependent. C=a and b. D= none of these.

42. If we want to test whether the proportions of more than two populations are equal,we

use. A=Analysis of variance. B= Estimation. C=The variance. D=Interval estimates. E=none of

these.

43. In the case that two events a and b are mutually exclusive P(AUB). A=P(A)+P(B). B=P(A)+P(B)-P(A

intersection B). C=P(A)xP(B). D=P(A intersection B)/P(B).

44. For two tailed test of hypothesis at a=0.10 the acceptance region is the entire region.A=To the

right of the negative critical values.B=Between the two critical values.C=Outside of the two

critical values. D=To the left of the positive critical value. E=None of these.

45. Which of the following is true for any regression model. A=The y- intercept of the model must

agree with the y-intercept of the data. B=There will always be a linear relationship between a

regression model and data.C=The choice of regression model to the best represent the data is

based on observing the trend in data. D=The standard deviation of the regression model is

always exactly the same as the standard deviation of the data.

46. Specify which probability distribution to use in a hypothesis test and whether it will be one

tailed or two tailed given the following information. Ho,u< and equal to 27. H1:u>27. X- bar = 33,

standard deviation=4, n=50.A= z-test:one tailed. B=z- test: two tailed. C= t-test one tailed. D= t-

test; two tailed. E=None of these.

47. The square of the variance of a distribution is the. A= Standard deviation. B=Mean. C=Range.

D=Absolute deviation. E= None of these.

48. With a lower significance level, the probability of rejecting a null hypothesis that is actually true.

A=Decreases. B=Remains the same. C=Increases. D=Increases as the mean changes. E=None of

these.

49. The number of road accidents is a--------- random variable. A= Discrete. B= Continuous. C=Both.

D=None of above.

50. Six white balls and four black balls, which are indistinguishable apart from color, are placed in a bag. If six balls are taken from the bag, find the probability of their being three white and three black.

A=8/10. B=8/21. C=10/21. D=21/8.

51. Square root of variance have only values. A=Less than 10. B= Greater than 10. C= Less than 0.

D=Greater than 0. E=Non negative.

52. Suppose we want to test whether a population mean is significantly larger or smaller than 10. What should our alternative hypothesis be? A=u<10. B=u>10. C=u=10. D=u not equal to 10.

E=None of these.

53. A two tailed test of a difference between two proportions led to z= 1.85 for its standardized

difference of sample proportions. for which of the following significance level would you reject

H0?. A= a= 0.05, B= a= 0.10, C= a=0.02, D= a=(a) and (b), but not (c). E = none of these.

54. For a normal curve with u=55 and sigma=10, how much area will be found under the curve to the

right of the value 55? A= 1.0, B= 0.68, C= 0.5, D= 0.32, E= none of these
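
Question 50 in this quiz (three white and three black balls when six are drawn from six white and four black) is a counting problem for draws without replacement. A minimal sketch of the count follows; the function name is made up for illustration:

```python
from fractions import Fraction
from math import comb

def prob_white_black(white, black, draw_white, draw_black):
    """P(exactly draw_white white and draw_black black), drawing without replacement."""
    ways = comb(white, draw_white) * comb(black, draw_black)
    total = comb(white + black, draw_white + draw_black)
    return Fraction(ways, total)

# Six white and four black balls, six drawn, three of each colour wanted.
p = prob_white_black(6, 4, 3, 3)
print(p)  # 8/21
```

There are 20 x 4 = 80 favourable selections out of 210 possible ones, which reduces to 8/21.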

Final paper

1. If the null hypothesis is rejected, then we may be making

a. Correct decision

b. type I error

c. type II error

d. either A or B

e. either A or C

2. Given rxy = -0.75, Sy = 5 and Σ(x − x̄)(y − ȳ) = -15n, find Sx. A= 5, b= 3, c= 2, d= 4

3. Moving average is. A= given the trend in a straight line, B= measure the seasonal variation, C=

smooth-out the time series. D= none of the above.

4. Given x= 0.6-0.5y and y = 0.8, find x= ?. A= 0.1, B= 0.3, C= 0.2, D= 0.4.

5. Given x =120, u0=100,s=34.75 and n=25, find t : a= 3, b= 4, c= 2.88, d= 2

6. Given x=1, y=8 and b=2 find the value of intercept a. a= 7, b= 6, c= 8, d=10

7. Specify which probability distribution to use in a hypothesis test and whether it will be one

tailed or two tailed if the following information is given. Ho:u=15,H1:u not equal to 15, x-

bar=14.8, n=20. A= z-test ;one- tailed. B= z- test; two tailed. C=t-test;one tailed. D= t-test;two

tailed. E= a and b but not c

8. Given sigma = 80, n = 625, u0 = 350 and x-bar = 356, find Z? a= 3, b= 1, c= 2, d= 1.88

9. Given \(\chi^2\) = 20.178, d.f. = 4 and alpha = 0.01, find the table value of \(\chi^2\) and make the statistical decision. A= \(\chi^2_{0.01}(4)\) = 13.277, reject H0. B= \(\chi^2_{0.01}(4)\) = 14.277, reject H1. C= \(\chi^2_{0.01}(4)\) = 13.277, reject H1. D= \(\chi^2_{0.01}(4)\) = 14.277, reject H0.

10. Any statement whose validity is tested on the basis of a sample is called. A= null hypothesis, b=

alternative hypothesis, c= statistical hypothesis, d= simple hypothesis

11. X2 curve ranges from. A= – infinity to + infinity, B= 0 to infinity, C= - infinity to 0, D= 0 to 1.

12. Semi-average method is used for measurement of trend values when. A=trend is linear, b=

observed data contains yearly values, c= the given time series contains odd number of values,

d= none of the above

13. Given sigma = 80, n = 625, u0 = 350 and x-bar = 356, find Z? a= 1.88, b= 1.99, c= 1.77, d= 1.66

14. Given the equation of the straight line Y=a+bx, and the values a=45, b=-10 and x=3, find the value of y. a=15, b=16, c=17, d=18

15. r=+1. A= no correlation. B=Negative correlation. C=Perfect correlation. D=None of above.

16. For an r x c contingency table, the number of cells in the table are. A= r.c, B= (r-1)(r-c). C= r+c,

D= (r+1)(c-1)

17. The hypothesis u less than 10 is a. a= simple hypothesis, b= composite hypothesis, c=

alternative hypothesis, d= difficult to tell

18. Suppose that y= 1, and when x= 0, that y=2 when x= 1. And that y=3 when x=2. Find the least

square estimate a. A= 2, B= 1, C= 1.5, D= 2.5.

19. If the respective values of f0= 21,38,32,29,36,25,41,23 and

fe=31.31,27.69,32.37,28.63,32.37,28.63,33.96,30.64,then find x2, a=12, b=13, c=11, d=11.22

20. P(type II error) is equal to. A=alpha, b= beta, c= 1-alpha, d= 1-beta.

21. Given H0: u=12, H1: u greater than 12, n=64, x-bar=15, sigma=10, alpha=0.05, find Z and make the statistical decision: a= Z=2.4, reject H0, b= z=2.4, accept H0, c= z=3.4, reject H1, d= z=3.4, reject H0

22. Given f0=30, 75, 45, 30, 75, 45 fe=52.5, 52.5, 37.5, 60.0, 60.0, df=2 and alpha = 0.05 find x2. A=

29.786, b= 30.0, c=26.99, d=23.

23. Given u0=130, x =150, sigma=25 and n=4. What test statistic is appropriate? a= t, b = z, c= f,

d=X2

24. Given x-bar = 100, σx̄ = 16 and u0 = 90, find Z. A= 0.6, B= 0.63, C= 0.62, D= 0.5.

25. If X2=13.95, df=4, X20.05(4)=13.227, we make the following statistical decision. A= We accept H0 at

alpha = 0.01 and a=0.05, b= we reject H0 at alpha = 0.05 but not at a= 0.01, C= We reject H0 at

a=0.01 but not at a=0.05, d= we reject H0 at a=0.01 and a=0.05.

26. Suppose that y= 1, and when x= 0, that y=2 when x= 1. And that y=3 when x=2. Find the sample

correlation coefficient r? . A= 2, B= 1, C= 1.5, D= 2.5.

27. The alternative hypothesis is called, a= null hypothesis, b= statistical hypothesis, c= research

hypothesis, d= single hypothesis

28. A time series has. A= two components, B= three components, C= four components, D= five components.

29. Suppose that y= 1, and when x= 0, that y=2 when x= 1. And that y=3 when x=2. Find the least

square estimate b. A= 2, B= 1, C= 1.5, D= 2.5.

30. P(type I error) is equal to. A=alpha, b= beta, c= 1-alpha, d= 1-beta.

31. In the semi averages method, if the number of values is odd then we drop: A= first value, B=

third value, C= last value, D= middle value, E=middle two value.

32. __ analysis is the statistical tool we can use to describe the degree to which one variable is linearly

related to another, a= regression, b= correlation, c= variances, d= none of the above.

33. A statement which is tested for the purpose of rejection under the assumption that it is true is

called. A= null hypothesis, B= alternative hypothesis, c= simple hypothesis, d= composite

hypothesis.

34. Which of the following is a criterion for selecting a regression line which best represents the data. A= the mean of the data must agree with the line. B= the sum of squared differences between the dependent variable must be minimized. C= the sum of the squared horizontal differences in the independent variable must be minimized. D= the line must agree with at least half of the data points

36. The choice of a one-tailed or two-tailed test depends upon: A = null hypothesis, B = alternative hypothesis, C = none of these, D = composite hypothesis.

37. Given χ² = 20.178, d.f. = 4 and α = 0.01, find the table value of χ² and make the statistical decision. A = χ²0.01(4) = 13.277, rejected H0. B = χ²0.01(4) = 14.277, rejected H1. C = χ²0.01(4) = 13.277, rejected H1. D = χ²0.01(4) = 14.277, rejected H0.

38. In regression analysis the variable we would like to predict or explain is called: A = independent variable, B = dependent variable, C = regression coefficient, D = residual error.

39. The degree of confidence is equal to . a= alpha, b= beta, c= 1-alpha, d= 1-beta.

40. Given µ = 100, X̄ = 120, n = 25 and s = 35.5, find t. Answer: t = 2.82.

41. Rejecting the null hypothesis when it is true is known as: A = type I error, and its probability is β. B = type I error, and its probability is α. C = type II error, and its probability is α. D = type II error, and its probability is β.

42. The degrees of freedom of the t distribution are: a) n + 1, b) n − 1, c) n, d) n − 1/2.
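The numeric items in this list (questions 24, 26, 29 and 40) can be checked with a few lines of Python. This is only a minimal sketch and not part of the original paper: the helper names `z_statistic`, `t_statistic` and `slope_and_r` are illustrative, and the value 16 in question 24 is assumed to be the standard error of the mean.

```python
import math

def z_statistic(sample_mean, mu0, std_error):
    """Z = (sample mean - hypothesised mean) / standard error."""
    return (sample_mean - mu0) / std_error

def t_statistic(sample_mean, mu, s, n):
    """t = (sample mean - population mean) / (s / sqrt(n)), with n - 1 d.f."""
    return (sample_mean - mu) / (s / math.sqrt(n))

def slope_and_r(xs, ys):
    """Least squares slope b and sample correlation coefficient r."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sxx, sxy / math.sqrt(sxx * syy)

z = z_statistic(100, 90, 16)               # question 24 (16 assumed to be the standard error)
t = t_statistic(120, 100, 35.5, 25)        # question 40
b, r = slope_and_r([0, 1, 2], [1, 2, 3])   # questions 26 and 29

print(f"Z = {z:.3f}")
print(f"t = {t:.2f}")
print(f"b = {b:.1f}, r = {r:.1f}")
```

The t value rounds to 2.82, which agrees with the answer stated for question 40.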

Page 21: Statistical Inference-(MGT601) Mid FA2015

Basic Probability

Library, Teaching and Learning 2014

Page 22: Statistical Inference-(MGT601) Mid FA2015


Basic definitions, with formulas and symbols

Probability of an event, A, occurring:

P(A) = (number of successful outcomes) / (total number of possible outcomes)

Complementary Events

Events in the whole sample space but not one of the outcomes included in A are complementary.

P(~A) = P(not A) = 1 − P(A)

Limits of P

P(any event occurring) lies between 0 and 1: 0 ≤ P(A) ≤ 1

If event A is certain not to happen: P(A) = 0

If event A is certain to happen: P(A) = 1

Union or General Addition Rule

Probability that either one or other event occurs: P(A or B) = P(A ∪ B)

Mutually Exclusive Events

Events that cannot both occur at the same time have no intersection.

P(A or B) = P(A ∪ B) = P(A) + P(B)

For events that are mutually exclusive: P(A and B) = 0

Non-Mutually Exclusive Events

Events that can both occur at the same time have some intersection.

P(A or B) = P(A) + P(B) − P(A and B)

Note that this rule applies regardless: if there is no intersection, zero will be subtracted.

Statistically Independent Events

Events where the occurrence of one event does not influence the likelihood of the other occurring.

If A and B are independent, then P(A and B) = P(A) × P(B)

Note the reverse of this is also true: if P(A and B) = P(A) × P(B), then A and B are independent.

This is a specifically mathematical definition. Do not rely on “gut feeling” or instinct to tell you whether two events are statistically independent or not.

Basic rules for calculating simple probability

Page 23: Statistical Inference-(MGT601) Mid FA2015


Conditional Probability

This arises when we are calculating the probability of a particular event, A, given that we know the condition of another event, B. It is the probability that an event occurs given that another event has occurred.

P(A|B) = P(A and B) / P(B) = n(A and B) / n(B)

P(A|B) means: “The probability that A will occur given that B has already occurred.”

Also note: if the events are independent, then

P(A|B) = P(A and B) / P(B) = P(A) × P(B) / P(B) = P(A)

i.e., if events A and B are independent then the conditional probability that A occurs, given that event B has occurred, is simply the probability that event A occurs.
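The conditional probability formula above can be sketched in Python. This is an illustrative example only: it assumes a sample space of the ten digits 0 to 9, each equally likely, and the function names `prob` and `cond_prob` are hypothetical.

```python
from fractions import Fraction

# Assumed sample space: the ten digits, each equally likely.
sample_space = set(range(10))
prime = {2, 3, 5, 7}
odd = {1, 3, 5, 7, 9}

def prob(event):
    """P(event) = successful outcomes / total possible outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))

def cond_prob(a, b):
    """P(A|B) = P(A and B) / P(B); requires P(B) > 0."""
    return prob(a & b) / prob(b)

p_prime_given_odd = cond_prob(prime, odd)   # n(prime and odd) / n(odd)
independent = prob(prime & odd) == prob(prime) * prob(odd)

print(p_prime_given_odd)   # 3/5
print(independent)         # False, since 3/10 != 4/10 * 5/10
```

Because P(prime | odd) = 3/5 differs from P(prime) = 4/10, knowing that a digit is odd changes the chance that it is prime, so the two events are not independent.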

Expected Value

The expected value of a random variable is the mean of the random variable.

E(X) = x1 p(x1) + x2 p(x2) + x3 p(x3) + ... + xn p(xn)

That is, to work out the expected value of a random variable, multiply each possible value of X by its probability and add these products.
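As a minimal sketch of the expected value rule, the following Python snippet multiplies each value by its probability and adds the products. The fair die and fair coin are assumed examples, and `expected_value` is an illustrative name.

```python
from fractions import Fraction

def expected_value(distribution):
    """E(X) = sum of x * p(x); `distribution` maps each value x to p(x)."""
    assert sum(distribution.values()) == 1, "probabilities must sum to 1"
    return sum(x * p for x, p in distribution.items())

# Assumed example: one roll of a fair six-sided die.
die = {face: Fraction(1, 6) for face in range(1, 7)}
e_die = expected_value(die)

# Assumed example: number of heads in one toss of a fair coin.
coin = {0: Fraction(1, 2), 1: Fraction(1, 2)}
e_coin = expected_value(coin)

print(e_die)    # 7/2
print(e_coin)   # 1/2
```

Exact fractions are used so that the check that the probabilities sum to 1 is not affected by floating-point rounding.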

Presentation of information

As well as just being written out, information can be presented in a table or as a diagram. Examine the following information.

The set of digits, D, contains the numbers {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
The set of even numbers, E, is {2, 4, 6, 8}
The set of odd numbers, O, is {1, 3, 5, 7, 9}
The set of prime numbers, P, is {2, 3, 5, 7}

This Venn diagram shows the relationships between the sets. Note there are some numbers in more than one grouping and zero is all on its own.

[Venn diagram of D containing the sets O, E and P: 3, 5 and 7 lie in both O and P; 2 lies in both E and P; 4, 6 and 8 lie in E only; 1 and 9 lie in O only; 0 lies outside O, E and P.]

D = digits, E = even numbers, O = odd numbers, P = prime numbers

Page 24: Statistical Inference-(MGT601) Mid FA2015


A summary of this information could have been written in table form, showing the number of digits in each category:

            Odd numbers   Even numbers   Neither   Total
Prime            3             1            0         4
Not Prime        2             3            1         6
Total            5             4            1        10

That is, there are ‘3’ digits that are both odd and prime, ‘2’ digits that are both odd and not prime and ‘5’ odd digits in total etc. Study how the following probabilities are calculated, using the rules given above.

P(even) = 4/10 = 0.4

P(odd) = 5/10 = 0.5

P(neither odd nor even) = 1/10 = 0.1

P(prime and even) = 1/10 = 0.1

P(prime or even) = 7/10 (from Venn Diagram)

or using formula: P(A or B) = P(A) + P(B) − P(A and B)

P(prime or even) = 4/10 + 4/10 − 1/10 = 7/10

P(prime and odd) = 3/10

P(prime or odd) = 4/10 + 5/10 − 3/10 = 6/10

P(odd, even or neither) = 1

P(odd and even) = 0

Probabilities are very easy to calculate if data is given in table form. If you are given information not in table form, try to tabulate it before you start your calculations. The table is sometimes referred to as a contingency table.
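The table-based calculations above can be sketched in Python. This is a hedged illustration, not part of the handout: the dictionary layout and the helper names `p_row`, `p_col`, `p_and` and `p_or` are assumptions made for the example.

```python
from fractions import Fraction

# Counts from the digits table: rows are prime / not prime,
# columns are odd / even / neither.
table = {
    "prime":     {"odd": 3, "even": 1, "neither": 0},
    "not prime": {"odd": 2, "even": 3, "neither": 1},
}
total = sum(sum(row.values()) for row in table.values())

def p_row(row):
    """Marginal probability of a row category."""
    return Fraction(sum(table[row].values()), total)

def p_col(col):
    """Marginal probability of a column category."""
    return Fraction(sum(row[col] for row in table.values()), total)

def p_and(row, col):
    """Joint probability read directly from one cell."""
    return Fraction(table[row][col], total)

def p_or(row, col):
    """Addition law: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_row(row) + p_col(col) - p_and(row, col)

p_prime_or_even = p_or("prime", "even")
p_prime_or_odd = p_or("prime", "odd")

print(p_prime_or_even)   # 7/10
print(p_prime_or_odd)    # 3/5, i.e. 6/10
```

Reading joint probabilities from single cells and marginal probabilities from row or column totals is the reason tabulating the data first makes the later calculations straightforward.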

In some texts you will see: ∩ used for “intersection” and defined by the word “and”; ∪ used for “union” and defined by the word “or”.

PROBABILITY

Addition law (events not mutually exclusive):

P(A or B) = P(A) + P(B) − P(A and B)

For mutually exclusive events: P(A or B) = P(A) + P(B)

P(A and B) = 0

Multiplication law: P(A and B) = P(A)P(B|A) = P(B)P(A|B)

If statistically independent:

P(A|B) = P(A) and P(B|A) = P(B)

P(A and B) = P(A)P(B)

DO NOT ABANDON YOUR OWN LOGIC – think about the questions and the likely answer.

Page 25: Statistical Inference-(MGT601) Mid FA2015


PROBABILITY - PRACTICE QUESTIONS

1. If two events are mutually exclusive, the probability that they both occur is:

A 0.00 B 0.50 C 1.00 D Cannot be determined from the information given

2. When using the general multiplication rule, P(A and B) is equal to:

A P(A|B).P(B) B P(A).P(B) C P(B)/P(A) D P(A)/P(B)

3. A recent survey of banks revealed the following distribution for the interest rate being charged on a home loan (based on a 15-year mortgage with a 20% deposit):

Interest rate   7.0%   7.5%   8.0%   8.5%   > 8.5%
Probability     0.12   0.23   0.24   0.35   0.06

If a bank is selected at random from this distribution, what is the chance that the interest rate charged on a home loan will exceed 8.0%?

A 0.06 B 0.41 C 0.59 D 1.00

Use the following information for the next two questions.

Mothers Against Drunk Driving is a very visible group whose main focus is to educate the public about the harm caused by drunk drivers. A study was recently done that emphasised the problem we all face with drinking and driving. Four hundred accidents that occurred on a Saturday night were analysed. Two items noted were the number of vehicles involved and whether alcohol played a role in the accident. The numbers are shown below:

                             Number of Vehicles Involved
Did alcohol play a role?       1       2       3     Totals
Yes                           50     100      20      170
No                            25     175      30      230
Totals                        75     275      50      400

4. What proportion of accidents involved alcohol and a single vehicle?

A 25/400 B 50/400 C 195/400 D 245/400

5. Given that alcohol was not involved, what proportion of the accidents were multiple vehicle?

A 50/170 B 120/170 C 205/230 D 25/230

6. The connotation ‘expected value’ or ‘expected gain’ from playing Roulette at a casino means:

A the amount you expect to ‘gain’ on a single play

B the amount you expect to ‘gain’ in the long run over many plays

C the amount you need to ‘break even’ over many plays

D the amount you should expect to ‘gain’ if you are lucky

Page 26: Statistical Inference-(MGT601) Mid FA2015


7. If two events are collectively exhaustive, the probability that one or the other occurs is

A 0 B 0.50 C 1.00 D Cannot be determined from the information given

8. There are 100 female students and 230 male students in a class. The probability that a randomly picked student is a female is:

A 0 B 0.50 C 0.30 D Cannot be determined from the information given

9. According to a survey of American households, the probability that the residents own two cars IF annual household income is over $25,000 is 80%. Of the households surveyed, 60% had incomes over $25,000 and 70% had two cars. The probability that the residents of a household own two cars AND have an income less than or equal to $25,000 a year is:

A 0.12 B 0.18 C 0.22 D 0.48

10. A company has two machines that produce widgets. An older machine produces 23% defective widgets, while the new machine produces only 8% defective widgets. In addition, the new machine produces three times as many widgets as the older machine does. Given a randomly chosen widget was tested and found to be defective, what is the probability it was produced by the new machine?

A 0.08 B 0.15 C 0.489 D 0.511

Use the following information for the next two questions.

A certain sales company has both male and female employees. These employees either worked overtime (extra hours) or did not. The probability that an employee chosen at random was male was 0.60. The probability that a randomly chosen employee worked overtime was 0.45.

11. What is the probability that an employee chosen at random will be female?

12. The probability that an employee chosen at random is both male AND works overtime is 0.25. What is the probability that a randomly chosen employee is male OR works overtime? Hint: to answer this question it could help to construct a 2x2 contingency table.

Use the following information for the next three questions.

The marks (pass or fail) of 100 QMET103 students were summarised according to student gender:

          Passed   Failed
Male        20       20
Female      45       15

13. If a student is selected at random, what is the probability that the student passed QMET103?

14. If a student is selected at random, what is the probability that the student failed QMET103 AND is male?

15. Given that the selected student had passed, what is the probability that the student was male?

Page 27: Statistical Inference-(MGT601) Mid FA2015


16. A local retail store surveyed 1000 people and asked whether they intended to purchase a large television over the next 12 months. Twelve months later, the same respondents were contacted and asked whether they actually purchased the television.

Their responses are summarized in the following table:

                        Actually Purchased
Planned to purchase       Yes      No
Yes                       200      50
No                        100     650

a) What is the probability that a randomly selected person planned to purchase a large television?

b) What is the probability that a randomly selected person planned to purchase a television AND actually purchased a television?

c) What is the probability that a randomly selected person planned to purchase a television OR actually purchased a television?

d) Given that a randomly selected person planned to purchase a television, what is the probability that he/she actually purchased a television?

e) Are the two events, planning to purchase a television and actually purchasing a television, statistically independent? (Show working).

17. 300 students were sampled to determine attitudes to internal assessment workloads. Students from both Commerce and Science Divisions were sampled and the following table produced:

            Workload too light   Workload about right   Workload too much
Science             20                    30                    50
Commerce           100                    20                    80

a) What is the probability that a randomly selected person in the sample considers the workload too light?

b) What is the probability that a randomly selected person in the sample considers the workload about right AND too light?

c) What is the probability that a randomly selected person in the sample is a commerce student OR considers the workload too much?

d) Given that a randomly selected student is from the Commerce Division, what is the probability that the student considers the workload about right?

e) What is the probability that a randomly selected student is not a science student AND they think the workload is too light?

Page 28: Statistical Inference-(MGT601) Mid FA2015


18. There are 50 students in the Lincoln University Rugby Club and 20 of them take vitamin C daily. 30% of Rugby Club students catch a cold each year. 20% of students who take Vitamin C every day caught a cold last year.

a) Prepare a contingency table for the above information.

b) What is the probability that a randomly selected student who does not take Vitamin C every day caught a cold last year?

c) Given that the randomly selected student caught a cold last year, what is the probability that he takes Vitamin C?

d) Are taking Vitamin C and catching a cold independent events? Support your answer with appropriate mathematical calculations.

19. A soft drink company is interested in introducing a new Cola brand to the market. Initially they developed three different flavours and want to select the flavour which would be the most popular one. Their research department randomly selected 100 males and 100 females and asked them to choose the best flavour between the three flavours (say A, B and C). The results are summarised in the following table:

Flavour Male Female

A 25 30

B 35 50

C 40 20

a) What is the probability that a person likes flavour A?

b) What is the probability that a randomly selected person is a female and likes flavour C?

c) Given that the randomly selected person is male, what is the probability that he likes flavour C?

d) What is the probability that a randomly selected person is female or likes the flavour A?

e) If two persons are randomly selected without replacement, what is the probability that both persons selected will like flavour C?

SOLUTIONS

Questions 1-6

1 A 2 A 3 B 4 B 5 C 6 B

Questions 7-10

7 C 8 C 9 C 10 D

Questions 11-15

11 0.4 12 0.8 13 0.65 14 0.2 15 0.31

Question 16

A 0.25 B 0.2 C 0.35 D 0.8 E NO

Page 29: Statistical Inference-(MGT601) Mid FA2015


Question 17

a 0.4 b 0 c 0.83 d 0.1 e 0.33

Question 18

a
              Took Vit C   NO Vit C   Total
Caught cold       4           11        15
NO Cold          16           19        35
Total            20           30        50

b 0.3667

c 0.2667

d P(Cold) × P(vit C) = 0.3 × 0.4 = 0.12; P(cold and vit C) = 0.08 ≠ 0.12. Hence not Statistically Independent.

Question 19

a 0.275 b 0.1 c 0.4 d 0.625 e 0.0889
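The Question 18 working can be reproduced with a short Python sketch. It is illustrative only: the variable names are assumptions, and the counts are derived from the percentages stated in the question (30% of 50 students and 20% of the 20 Vitamin C takers).

```python
from fractions import Fraction

n_students = 50
n_vit_c = 20
n_cold = n_students * 30 // 100       # 30% of the club caught a cold
n_cold_vit_c = n_vit_c * 20 // 100    # 20% of Vitamin C takers caught a cold

# Remaining cells of the contingency table follow by subtraction.
n_cold_no_vit_c = n_cold - n_cold_vit_c
n_no_vit_c = n_students - n_vit_c

p_cold_given_no_vit_c = Fraction(n_cold_no_vit_c, n_no_vit_c)   # part b
p_vit_c_given_cold = Fraction(n_cold_vit_c, n_cold)             # part c

# Part d: independence requires P(cold and vit C) = P(cold) * P(vit C).
p_cold = Fraction(n_cold, n_students)
p_vit_c = Fraction(n_vit_c, n_students)
p_both = Fraction(n_cold_vit_c, n_students)
independent = p_both == p_cold * p_vit_c

print(p_cold_given_no_vit_c, p_vit_c_given_cold, independent)
```

The fractions 11/30 and 4/15 correspond to the decimal answers 0.3667 and 0.2667 given for parts b and c.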

Page 30: Statistical Inference-(MGT601) Mid FA2015

MCQ TESTING OF HYPOTHESIS

MCQ 13.1 A statement about a population developed for the purpose of testing is called:

(a) Hypothesis (b) Hypothesis testing (c) Level of significance (d) Test-statistic

MCQ 13.2 Any hypothesis which is tested for the purpose of rejection under the assumption that it is true is called:

(a) Null hypothesis (b) Alternative hypothesis (c) Statistical hypothesis (d) Composite hypothesis

MCQ 13.3 A statement about the value of a population parameter is called:

(a) Null hypothesis (b) Alternative hypothesis (c) Simple hypothesis (d) Composite hypothesis

MCQ 13.4 Any statement whose validity is tested on the basis of a sample is called:

(a) Null hypothesis (b) Alternative hypothesis (c) Statistical hypothesis (d) Simple hypothesis

MCQ 13.5 A quantitative statement about a population is called:

(a) Research hypothesis (b) Composite hypothesis (c) Simple hypothesis (d) Statistical hypothesis

MCQ 13.6 A statement that is accepted if the sample data provide sufficient evidence that the null hypothesis is false is called:

(a) Simple hypothesis (b) Composite hypothesis (c) Statistical hypothesis (d) Alternative hypothesis

MCQ 13.7 The alternative hypothesis is also called:

(a) Null hypothesis (b) Statistical hypothesis (c) Research hypothesis (d) Simple hypothesis

MCQ 13.8 A hypothesis that specifies all the values of parameter is called:

(a) Simple hypothesis (b) Composite hypothesis (c) Statistical hypothesis (d) None of the above

MCQ 13.9 The hypothesis µ ≤ 10 is a:

(a) Simple hypothesis (b) Composite hypothesis (c) Alternative hypothesis (d) Difficult to tell.

MCQ 13.10 A hypothesis that specifies the population distribution is called:

(a) Simple hypothesis (b) Composite hypothesis (c) Alternative hypothesis (d) None of the above

MCQ 13.11 A hypothesis may be classified as:

(a) Simple (b) Composite (c) Null (d) All of the above

MCQ 13.12 The probability of rejecting the null hypothesis when it is true is called:

(a) Level of confidence (b) Level of significance (c) Power of the test (d) Difficult to tell

Page 31: Statistical Inference-(MGT601) Mid FA2015

MCQ 13.13 The dividing point between the region where the null hypothesis is rejected and the region where it is not rejected is said to be:

(a) Critical region (b) Critical value (c) Acceptance region (d) Significant region

MCQ 13.14 If the critical region is located equally in both sides of the sampling distribution of test-statistic, the test is called:

(a) One tailed (b) Two tailed (c) Right tailed (d) Left tailed

MCQ 13.15 The choice of one-tailed test and two-tailed test depends upon:

(a) Null hypothesis (b) Alternative hypothesis (c) None of these (d) Composite hypotheses

MCQ 13.16 Test of hypothesis Ho: µ = 50 against H1: µ > 50 leads to:

(a) Left-tailed test (b) Right-tailed test (c) Two-tailed test (d) Difficult to tell

MCQ 13.17 Test of hypothesis Ho: µ = 20 against H1: µ < 20 leads to:

(a) Right one-sided test (b) Left one-sided test (c) Two-sided test (d) All of the above

MCQ 13.18 Testing Ho: µ = 25 against H1: µ ≠ 25 leads to:

(a) Two-tailed test (b) Left-tailed test (c) Right-tailed test (d) None of (a), (b) and (c)

MCQ 13.19 A rule or formula that provides a basis for testing a null hypothesis is called:

(a) Test-statistic (b) Population statistic (c) Both of these (d) None of the above

MCQ 13.20 The range of test statistic-Z is:

(a) 0 to 1 (b) -1 to +1 (c) 0 to ∞ (d) -∞ to +∞

MCQ 13.21 The range of test statistic-t is:

(a) 0 to ∞ (b) 0 to 1 (c) -∞ to +∞ (d) -1 to +1

MCQ 13.22 If Ho is true and we reject it, this is called:

(a) Type-I error (b) Type-II error (c) Standard error (d) Sampling error

MCQ 13.23 The probability associated with committing type-I error is:

(a) β (b) α (c) 1 – β (d) 1 – α

MCQ 13.24 A failing student is passed by an examiner, it is an example of:

(a) Type-I error (b) Type-II error (c) Unbiased decision (d) Difficult to tell

Page 32: Statistical Inference-(MGT601) Mid FA2015

MCQ 13.25 A passing student is failed by an examiner, it is an example of:

(a) Type-I error (b) Type-II error (c) Best decision (d) All of the above

MCQ 13.26 1 – α is also called:

(a) Confidence coefficient (b) Power of the test (c) Size of the test (d) Level of significance

MCQ 13.27 1 – α is the probability associated with:

(a) Type-I error (b) Type-II error (c) Level of confidence (d) Level of significance

MCQ 13.28 Area of the rejection region depends on:

(a) Size of α (b) Size of β (c) Test-statistic (d) Number of values

MCQ 13.29 Size of critical region is known as:

(a) β (b) 1 - β (c) Critical value (d) Size of the test

MCQ 13.30 A null hypothesis is rejected if the value of a test statistic lies in the:

(a) Rejection region (b) Acceptance region (c) Both (a) and (b) (d) Neither (a) nor (b)

MCQ 13.31 The test statistic is equal to:

MCQ 13.32 Level of significance is also called:

(a) Power of the test (b) Size of the test (c) Level of confidence (d) Confidence coefficient

MCQ 13.33 Level of significance α lies between:

(a) -1 and +1 (b) 0 and 1 (c) 0 and n (d) -∞ to +∞

MCQ 13.34 Critical region is also called:

(a) Acceptance region (b) Rejection region (c) Confidence region (d) Statistical region

MCQ 13.35 The probability of rejecting Ho when it is false is called:

(a) Power of the test (b) Size of the test (c) Level of confidence (d) Confidence coefficient

MCQ 13.36 Power of a test is related to:

(a) Type-I error (b) Type-II error (c) Both (a) and (b) (d) Neither (a) nor (b)

Page 33: Statistical Inference-(MGT601) Mid FA2015

MCQ 13.37 In testing hypothesis α + β is always equal to:

(a) One (b) Zero (c) Two (d) Difficult to tell

MCQ 13.38 The significance level is the risk of:

(a) Rejecting Ho when Ho is correct (b) Rejecting Ho when H1 is correct

(c) Rejecting H1 when H1 is correct (d) Accepting Ho when Ho is correct.

MCQ 13.39 An example of a two-sided alternative hypothesis is:

(a) H1: µ < 0 (b) H1: µ > 0 (c) H1: µ ≥ 0 (d) H1: µ ≠ 0

MCQ 13.40 If the magnitude of calculated value of t is less than the tabulated value of t and H1 is two-sided, we should:

(a) Reject Ho (b) Accept H1 (c) Not reject Ho (d) Difficult to tell

MCQ 13.41 Accepting a null hypothesis Ho:

(a) Proves that Ho is true (b) Proves that Ho is false

(c) Implies that Ho is likely to be true (d) Proves that µ ≤ 0

MCQ 13.42 The chance of rejecting a true hypothesis decreases when sample size is:

(a) Decreased (b) Increased (c) Constant (d) Both (a) and (b)

MCQ 13.43 The equality condition always appears in:

(a) Null hypothesis (b) Simple hypothesis (c) Alternative hypothesis (d) Both (a) and (b)

MCQ 13.44 Which hypothesis is always in an inequality form?

(a) Null hypothesis (b) Alternative hypothesis (c) Simple hypothesis (d) Composite hypothesis

MCQ 13.45 Which of the following is composite hypothesis?

(a) µ ≥ µo (b) µ ≤ µo (c) µ = µo (d) µ ≠ µo

MCQ 13.46 P (Type I error) is equal to:

(a) 1 – α (b) 1 – β (c) α (d) β

MCQ 13.47 P (Type II error) is equal to:

(a) α (b) β (c) 1 – α (d) 1 – β

MCQ 13.48 The power of the test is equal to:

(a) α (b) β (c) 1 – α (d) 1 – β

Page 34: Statistical Inference-(MGT601) Mid FA2015

MCQ 13.49 The degree of confidence is equal to:

(a) α (b) β (c) 1 – α (d) 1 – β

MCQ 13.50 α / 2 is called:

(a) One tailed significance level (b) Two tailed significance level

(c) Left tailed significance level (d) Right tailed significance level

MCQ 13.51 Student’s t-test is applicable only when:

(a) n≤30 and σ is known (b) n>30 and σ is unknown (c) n=30 and σ is known (d) All of the above

MCQ 13.52 Student’s t-statistic is applicable in case of:

(a) Equal number of samples (b) Unequal number of samples (c) Small samples (d) All of the above

MCQ 13.53 Paired t-test is applicable when the observations in the two samples are:

(a) Equal in number (b) Paired (c) Correlation (d) All of the above

MCQ 13.54 The degree of freedom for paired t-test based on n pairs of observations is:

(a) 2n - 1 (b) n - 2 (c) 2(n - 1) (d) n - 1

MCQ 13.55

The test-statistic

has d.f = ________:

(a) n (b) n - 1 (c) n - 2 (d) n1 + n2 - 2

MCQ 13.56 In an unpaired samples t-test with sample sizes n1 = 11 and n2 = 11, the value of tabulated t should be obtained for:

(a) 10 degrees of freedom (b) 21 degrees of freedom

(c) 22 degrees of freedom (d) 20 degrees of freedom

MCQ 13.57 In analyzing the results of an experiment involving seven paired samples, tabulated t should be obtained for:

(a) 13 degrees of freedom (b) 6 degrees of freedom

(c) 12 degrees of freedom (d) 14 degrees of freedom

MCQ 13.58 The mean difference between 16 paired observations is 25 and the standard deviation of differences is 10. The value of statistic-t is:

(a) 4 (b) 10 (c) 16 (d) 25

MCQ 13.59

Statistic-t is defined as deviation of sample mean from population mean µ expressed in terms of:

(a) Standard deviation (b) Standard error

(c) Coefficient of standard deviation (d) Coefficient of variation

Page 35: Statistical Inference-(MGT601) Mid FA2015

MCQ 13.60 Student’s t-distribution has (n-1) d.f. when all the n observations in the sample are:

(a) Dependent (b) Independent (c) Maximum (d) Minimum

MCQ 13.61 The number of independent values in a set of values is called:

(a) Test-statistic (b) Degree of freedom (c) Level of significance (d) Level of confidence

MCQ 13.62 The purpose of statistical inference is:

(a) To collect sample data and use them to formulate hypotheses about a population

(b) To draw conclusion about populations and then collect sample data to support the conclusions

(c) To draw conclusions about populations from sample data

(d) To draw conclusions about the known value of population parameter

MCQ 13.63 Rejecting the null hypothesis when it is true is known as:

(a) A type-I error, and its probability is β

(b) A type-I error, and its probability is α

(c) A type-II error, and its probability is α

(d) A type-II error, and its probability is β

MCQ 13.64 An advertising agency wants to test the hypothesis that the proportion of adults in Pakistan who read a Sunday Magazine is 25 percent. The null hypothesis is that the proportion reading the Sunday Magazine is:

(a) Different from 25% (b) Equal to 25% (c) Less than 25 % (d) More than 25 %

MCQ 13.65

If the mean of a particular population is µo,

is distributed:

(a) As a standard normal variable, if the population is non-normal

(b) As a standard normal variable, if the sample is large

(c) As a standard normal variable, if the population is normal

(d) As the t-distribution with v = n - 1 degrees of freedom

MCQ 13.66

If µ1 and µ2 are means of two populations,

is distributed:

(a) As a standard normal variable, if both samples are independent and less than 30

(b) As a standard normal variable, if both populations are normal

(c) As both (a) and (b) state

(d) As the t-distribution with n1 + n2 - 2 degrees of freedom

MCQ 13.67

If the population proportion equals po, then

is distributed:

(a) As a standard normal variable, if n > 30

(b) As a Poisson variable

(c) As the t-distribution with v = n - 1 degrees of freedom

(d) As a distribution with v degrees of freedom

Page 36: Statistical Inference-(MGT601) Mid FA2015

MCQ 13.68 When σ is known, the hypothesis about population mean is tested by:

(a) t-test (b) Z-test (c) χ2-test (d) F-test

MCQ 13.69

Given µo = 130, X̄ = 150, σ = 25 and n = 4; what test statistic is appropriate?

(a) t (b) Z (c) χ2 (d) F

MCQ 13.70 Given Ho: µ = µo, H1: µ ≠ µo, α = 0.05 and we reject Ho; the absolute value of the Z-statistic must have equalled or been beyond what value?

(a) 1.96 (b) 1.65 (c) 2.58 (d) 2.33

MCQ 13.71 If p1 and p2 are not identical, then standard error of the difference of proportions (p1 – p2) is:

MCQ 13.72 Under the hypothesis Ho: p1 = p2, the formula for the standard error of the difference between proportions (p1 – p2) is:

Page 37: Statistical Inference-(MGT601) Mid FA2015


CORRELATION & REGRESSION

MULTIPLE CHOICE QUESTIONS

In the following multiple-choice questions, select the best answer.

1. The correlation coefficient is used to determine:
a. A specific value of the y-variable given a specific value of the x-variable
b. A specific value of the x-variable given a specific value of the y-variable
c. The strength of the relationship between the x and y variables
d. None of these

2. If there is a very strong correlation between two variables then the correlation coefficient must be
a. any value larger than 1
b. much smaller than 0, if the correlation is negative
c. much larger than 0, regardless of whether the correlation is negative or positive
d. None of these alternatives is correct.

3. In regression, the equation that describes how the response variable (y) is related to the explanatory variable (x) is:
a. the correlation model
b. the regression model
c. used to compute the correlation coefficient
d. None of these alternatives is correct.

4. The relationship between number of beers consumed (x) and blood alcohol content (y) was studied in 16 male college students by using least squares regression. The following regression equation was obtained from this study:

ŷ = -0.0127 + 0.0180x

The above equation implies that:
a. each beer consumed increases blood alcohol by 1.27%
b. on average it takes 1.8 beers to increase blood alcohol content by 1%
c. each beer consumed increases blood alcohol by an average amount of 1.8%
d. each beer consumed increases blood alcohol by exactly 0.018

5. SSE can never be
a. larger than SST
b. smaller than SST
c. equal to 1
d. equal to zero

Page 38: Statistical Inference-(MGT601) Mid FA2015

6. Regression modeling is a statistical framework for developing a mathematical equation that describes how
a. one explanatory and one or more response variables are related
b. several explanatory and several response variables are related
c. one response and one or more explanatory variables are related
d. All of these are correct.

7. In regression analysis, the variable that is being predicted is the
a. response, or dependent, variable
b. independent variable
c. intervening variable
d. is usually x

8. Regression analysis was applied to return rates of sparrowhawk colonies. Regression analysis was used to study the relationship between return rate (x: % of birds that return to the colony in a given year) and immigration rate (y: % of new adults that join the colony per year). The following regression equation was obtained.

ŷ = 31.9 − 0.34x

Based on the above estimated regression equation, if the return rate were to decrease by 10% the rate of immigration to the colony would:
a. increase by 34%
b. increase by 3.4%
c. decrease by 0.34%
d. decrease by 3.4%

9. In least squares regression, which of the following is not a required assumption about the error term ε?
a. The expected value of the error term is one.
b. The variance of the error term is the same for all values of x.
c. The values of the error term are independent.
d. The error term is normally distributed.

10. Larger values of r2 (R2) imply that the observations are more closely grouped about the
a. average value of the independent variables
b. average value of the dependent variable
c. least squares line
d. origin

11. In a regression analysis if r2 = 1, then
a. SSE must also be equal to one
b. SSE must be equal to zero
c. SSE can be any positive value
d. SSE must be negative

Page 39: Statistical Inference-(MGT601) Mid FA2015

12. The coefficient of correlation
a. is the square of the coefficient of determination
b. is the square root of the coefficient of determination
c. is the same as r-square
d. can never be negative

13. In regression analysis, the variable that is used to explain the change in the outcome of an experiment, or some natural process, is called
a. the x-variable
b. the independent variable
c. the predictor variable
d. the explanatory variable
e. all of the above (a-d) are correct
f. none are correct

14. In the case of an algebraic model for a straight line, if a value for the x variable is specified, then
a. the exact value of the response variable can be computed
b. the computed response to the independent value will always give a minimal residual
c. the computed value of y will always be the best estimate of the mean response
d. none of these alternatives is correct.

15. A regression analysis between sales (in $1000) and price (in dollars) resulted in the following equation:

ŷ = 50,000 - 8X

The above equation implies that an
a. increase of $1 in price is associated with a decrease of $8 in sales
b. increase of $8 in price is associated with an increase of $8,000 in sales
c. increase of $1 in price is associated with a decrease of $42,000 in sales
d. increase of $1 in price is associated with a decrease of $8000 in sales

16. In a regression and correlation analysis if r2 = 1, then
a. SSE = SST
b. SSE = 1
c. SSR = SSE
d. SSR = SST

17. If the coefficient of determination is a positive value, then the regression equation
a. must have a positive slope
b. must have a negative slope
c. could have either a positive or a negative slope
d. must have a positive y intercept

Page 40: Statistical Inference-(MGT601) Mid FA2015

18. If two variables, x and y, have a very strong linear relationship, then
a. there is evidence that x causes a change in y
b. there is evidence that y causes a change in x
c. there might not be any causal relationship between x and y
d. None of these alternatives is correct.

19. If the coefficient of determination is equal to 1, then the correlation coefficient
a. must also be equal to 1
b. can be either -1 or +1
c. can be any value between -1 to +1
d. must be -1

20. In regression analysis, if the independent variable is measured in kilograms, the dependent variable
a. must also be in kilograms
b. must be in some unit of weight
c. cannot be in kilograms
d. can be any units

21. The data are the same as for question 4 above. The relationship between number of beers consumed (x) and blood alcohol content (y) was studied in 16 male college students by using least squares regression. The following regression equation was obtained from this study:

ŷ = -0.0127 + 0.0180x

Suppose that the legal limit to drive is a blood alcohol content of 0.08. If Ricky consumed 5 beers the model would predict that he would be:
a. 0.09 above the legal limit
b. 0.0027 below the legal limit
c. 0.0027 above the legal limit
d. 0.0733 above the legal limit

22. In a regression analysis if SSE = 200 and SSR = 300, then the coefficient of determination is
a. 0.6667
b. 0.6000
c. 0.4000
d. 1.5000

23. If the correlation coefficient is 0.8, the percentage of variation in the response variable explained by the variation in the explanatory variable is
a. 0.80%
b. 80%
c. 0.64%
d. 64%

24. If the correlation coefficient is a positive value, then the slope of the regression line

a. must also be positive b. can be either negative or positive c. can be zero d. cannot be zero

25. If the coefficient of determination is 0.81, the correlation coefficient

a. is 0.6561 b. could be either + 0.9 or - 0.9 c. must be positive d. must be negative

26. A fitted least squares regression line

a. may be used to predict a value of y if the corresponding x value is given b. is evidence for a cause-effect relationship between x and y c. can only be computed if a strong linear relationship exists between x and y d. None of these alternatives is correct.

27. Regression analysis was applied between $ sales (y) and $ advertising (x) across all the branches of a major international corporation. The following regression function was obtained.

ŷ = 5000 + 7.25x

If the advertising budgets of two branches of the corporation differ by $30,000, then what will be the predicted difference in their sales?

a. $217,500 b. $222,500 c. $5000 d. $7.25

28. Suppose the correlation coefficient between height (as measured in feet) versus weight (as measured in pounds) is 0.40. What is the correlation coefficient of height measured in inches versus weight measured in ounces? [12 inches = one foot; 16 ounces = one pound]

a. 0.40 b. 0.30 c. 0.533 d. cannot be determined from information given e. none of these

29. Assume the same variables as in question 28 above; height is measured in feet and weight is measured in pounds. Now, suppose that the units of both variables are converted to metric (meters and kilograms). The impact on the slope is:

a. the sign of the slope will change b. the magnitude of the slope will change c. both a and b are correct d. neither a nor b are correct

30. Suppose that you have carried out a regression analysis where the total variance in the response is 133452 and the correlation coefficient was 0.85. The residual sums of squares is:

a. 37032.92 b. 20017.8 c. 113434.2 d. 96419.07 e. 15% f. 0.15

31. This question is related to questions 4 and 21 above. The relationship between number of beers consumed (x) and blood alcohol content (y) was studied in 16 male college students by using least squares regression. The following regression equation was obtained from this study:

ŷ = -0.0127 + 0.0180x

Another guy, named Dudley, has the regression equation written on a scrap of paper in his pocket. Dudley goes out drinking and has 4 beers. He calculates that he is under the legal limit (0.08) so he decides to drive to another bar. Unfortunately Dudley gets pulled over and confidently submits to a road-side blood alcohol test. He scores a blood alcohol of 0.085 and gets himself arrested. Obviously, Dudley skipped the lecture about residual variation. Dudley’s residual is:

a. +0.005 b. -0.005 c. +0.0257 d. -0.0257

32. You have carried out a regression analysis; but, after thinking about the relationship between variables, you have decided you must swap the explanatory and the response variables. After refitting the regression model to the data you expect that:

a. the value of the correlation coefficient will change b. the value of SSE will change c. the value of the coefficient of determination will change d. the sign of the slope will change e. nothing changes

33. Suppose you use regression to predict the height of a woman’s current boyfriend by using her own height as the explanatory variable. Height was measured in feet from a sample of 100 women undergraduates, and their boyfriends, at Dalhousie University. Now, suppose that the height of both the women and the men are converted to centimeters. The impact of this conversion on the slope is:

a. the sign of the slope will change b. the magnitude of the slope will change c. both a and b are correct d. neither a nor b are correct

34. A residual plot:

a. displays residuals of the explanatory variable versus residuals of the response variable. b. displays residuals of the explanatory variable versus the response variable. c. displays explanatory variable versus residuals of the response variable. d. displays the explanatory variable versus the response variable. e. displays the explanatory variable on the x axis versus the response variable on the y axis.

35. When the error terms have a constant variance, a plot of the residuals versus the independent variable x has a pattern that

a. fans out b. funnels in c. fans out, but then funnels in d. forms a horizontal band pattern e. forms a linear pattern that can be positive or negative

36. You studied the impact of the dose of a new drug treatment for high blood pressure. You think that the drug might be more effective in people with very high blood pressure. Because you expect a bigger change in those patients who start the treatment with high blood pressure, you use regression to analyze the relationship between the initial blood pressure of a patient (x) and the change in blood pressure after treatment with the new drug (y). If you find a very strong positive association between these variables, then:

a. there is evidence that the higher the patient's initial blood pressure, the bigger the impact of the new drug.

b. there is evidence that the higher the patient's initial blood pressure, the smaller the impact of the new drug.

c. there is evidence for an association of some kind between the patient's initial blood pressure and the impact of the new drug on the patient's blood pressure

d. none of these are correct, this is a case of regression fallacy

Question 37: A variety of summary statistics were collected for a small sample (10) of bivariate data, where the dependent variable was y and an independent variable was x.

ΣX = 90     Σ(X − X̄)(Y − Ȳ) = 466

ΣY = 170    Σ(X − X̄)² = 234

n = 10      Σ(Y − Ȳ)² = 1434

SSE = 505.98

37.1 Use the formula to the right to compute the sample correlation coefficient:

a. 0.8045 b. -0.8045 c. 0 d. 1

37.2 The least squares estimate of b1 equals

a. 0.923 b. 1.991 c. -1.991 d. -0.923

37.3 The least squares estimate of b0 equals

a. 0.923 b. 1.991 c. -1.991 d. -0.923

37.4 The sum of squares due to regression (SSR) is

a. 1434 b. 505.98 c. 50.598 d. 928.02

37.5 The coefficient of determination equals

a. 0.6471 b. -0.6471 c. 0 d. 1

37.6 The point estimate of y when x = 0.55 is

a. 0.17205 b. 2.018 c. 1.0905 d. -2.018 e. -0.17205

MULTIPLE CHOICE ANSWERS

1. c 2. b 3. b 4. c 5. a 6. c 7. a 8. b 9. a 10. c

11. b 12. b 13. e 14. a 15. d 16. d 17. c 18. c 19. b 20. d

21. b 22. b 23. d 24. a 25. b 26. a 27. a 28. a 29. b 30. a

31. c 32. b 33. d 34. c 35. d 36. d

37.1 a 37.2 b 37.3 d 37.4 d 37.5 a 37.6 a
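The answers to Question 37 can be checked from the summary statistics alone. The sketch below assumes the standard least squares formulas (r = Sxy / sqrt(Sxx · Syy), b1 = Sxy / Sxx, b0 = Ȳ − b1·X̄, SSR = SST − SSE); the variable names are illustrative and do not come from the question.

```python
import math

# Summary statistics given in Question 37
n = 10
sum_x, sum_y = 90, 170
sxy = 466      # sum of (X - Xbar)(Y - Ybar)
sxx = 234      # sum of (X - Xbar)^2
syy = 1434     # sum of (Y - Ybar)^2, which is also SST
sse = 505.98

r = sxy / math.sqrt(sxx * syy)        # 37.1: about 0.8045
b1 = sxy / sxx                        # 37.2: about 1.991
b0 = sum_y / n - b1 * (sum_x / n)     # 37.3: about -0.923
ssr = syy - sse                       # 37.4: 928.02
r_squared = ssr / syy                 # 37.5: about 0.6471
y_hat = b0 + b1 * 0.55                # 37.6: about 0.172

print(round(r, 4), round(b1, 3), round(b0, 3),
      round(ssr, 2), round(r_squared, 4), round(y_hat, 3))
```

With unrounded coefficients the point estimate comes out near 0.172; the listed option 0.17205 appears to use the rounded values b0 = -0.923 and b1 = 1.991.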

Chapter 15 Multiple Choice Questions

(The answers are provided after the last question.)

1. What is the median of the following set of scores? 18, 6, 12, 10, 14

a. 10 b. 14 c. 18 d. 12

2. Approximately what percentage of scores fall within one standard deviation of the mean in a normal distribution?

a. 34% b. 95% c. 99% d. 68%

3. The denominator (bottom) of the z-score formula is

a. The standard deviation b. The difference between a score and the mean c. The range d. The mean

4. Let's suppose we are predicting score on a training posttest from number of years of education and the score on an aptitude test given before training. Here is the regression equation

Y = 25 + .5X1 + 10X2,

where X1 = years of education and X2 = aptitude test score. What is the predicted score for someone with 10 years of education and an aptitude test score of 5?

a. 25 b. 50 c. 35 d. 80

5. The standard deviation is:

a. The square root of the variance b. A measure of variability c. An approximate indicator of how numbers vary from the mean d. All of the above

6. Hypothesis testing and estimation are both types of descriptive statistics.

a. True b. False

7. A set of data organized in a participants(rows)-by-variables(columns) format is known as a “data set.”

a. True b. False

8. A graph that uses vertical bars to represent data is called a ____.

a. Line graph b. Bar graph c. Scatterplot d. Vertical graph

9. The goal of ___________ is to focus on summarizing and explaining a specific set of data.

a. Inferential statistics b. Descriptive statistics c. None of the above d. All of the above

10. The most frequently occurring number in a set of values is called the ____.

a. Mean b. Median c. Mode d. Range

11. As a general rule, the _______ is the best measure of central tendency because it is more precise.

a. Mean b. Median c. Mode d. Range

12. Focusing on describing or explaining data versus going beyond immediate data and making inferences is the difference between _______.

a. Central tendency and common tendency b. Mutually exclusive and mutually exhaustive properties c. Descriptive and inferential d. Positive skew and negative skew

13. Why are variance and standard deviation the most popular measures of variability?

a. They are the most stable and are foundations for more advanced statistical analysis b. They are the most simple to calculate with large data sets c. They provide nominally scaled data d. None of the above

14. ____________ is the set of procedures used to explain or predict the values of a dependent variable based on the values of one or more independent variables.

a. Regression analysis b. Regression coefficient c. Regression equation d. Regression line

15. The ______ is the value you calculate when you want the arithmetic average.

a. Mean b. Median c. Mode d. All of the above

16. ___________ are used when you want to visually examine the relationship between two quantitative variables.

a. Bar graphs b. Pie graphs c. Line graphs d. Scatterplots

17. The _______ is often the preferred measure of central tendency if the data are severely skewed.

a. Mean b. Median c. Mode d. Range

18. Which of the following is the formula for range?

a. H + L b. L x H c. L - H d. H – L

19. Which is a raw score that has been transformed into standard deviation units?

a. z score b. SDU score c. t score d. e score

20. Which of the following is NOT a measure of variability?

a. Median b. Variance c. Standard deviation d. Range

21. Which of the following is NOT a common measure of central tendency?

a. Mode b. Range c. Median d. Mean

22. What is the median of this set of numbers: 4, 6, 7, 9, 2000000?

a. 7.5 b. 6 c. 7 d. 4

23. What is the mean of this set of numbers: 4, 6, 7, 9, 2000000?

a. 7.5 b. 400,005.2 c. 7 d. 4

24. Which of the following is interpreted as the percentage of scores in a reference group that falls below a particular raw score?

a. Standard scores b. Percentile rank c. Reference group d. None of the above

25. The median is ______.

a. The middle point b. The highest number c. The average d. Affected by extreme scores

26. Which measure of central tendency takes into account the magnitude of scores?

a. Mean b. Median c. Mode d. Range

27. If a test was generally very easy, except for a few students who had very low scores, then the distribution of scores would be _____.

a. Positively skewed b. Negatively skewed c. Not skewed at all d. Normal

28. How many dependent variables are used in multiple regression?

a. One b. One or more c. Two or more d. Two

29. Which of the following represents the fiftieth percentile, or the middle point in a set of numbers arranged in order of magnitude?

a. Mode b. Median c. Mean d. Variance

30. If a distribution is skewed to the left, then it is __________.

a. Negatively skewed b. Positively skewed c. Symmetrically skewed d. Symmetrical

31. In a grouped frequency distribution, the intervals should be what?

a. Mutually exclusive b. Exhaustive c. Both A and B d. Neither A nor B

32. When a set of numbers is heterogeneous, you can place more trust in the measure of central tendency as representing the typical person or unit.

a. True b. False

33. Non-overlapping categories or intervals are known as ______.

a. Inclusive b. Exhaustive c. Mutually exclusive d. Mutually exclusive and exhaustive

34. To interpret the relationship between two categorical variables, a contingency table should be constructed with either column or row percentages, and ____.

a. If the percentages are calculated down the columns, then comparisons should be made across the rows

b. If the percentages are calculated across the rows, comparisons should be made down the columns

c. Both a and b are correct

d. Neither a nor b is correct

Answers:

1. d 2. d 3. a 4. d 5. d 6. b 7. a 8. b 9. b 10. c

11. a 12. c 13. a 14. a 15. a 16. d 17. b 18. d 19. a 20. a

21. b 22. c 23. b 24. b 25. a 26. a 27. b 28. a 29. b 30. a

31. c 32. b 33. c 34. c
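Questions 4, 22 and 23 of this chapter are quick to verify numerically. The following is a minimal sketch using Python's standard `statistics` module; the helper name `predict_posttest` is made up for illustration and simply encodes the regression equation from Question 4.

```python
import statistics

# Questions 22 and 23: one extreme score moves the mean but not the median
scores = [4, 6, 7, 9, 2000000]
median_value = statistics.median(scores)   # middle value of the sorted list
mean_value = statistics.mean(scores)       # arithmetic average

# Question 4: Y = 25 + .5*X1 + 10*X2 (hypothetical helper name)
def predict_posttest(years_education, aptitude_score):
    return 25 + 0.5 * years_education + 10 * aptitude_score

predicted = predict_posttest(10, 5)

print(median_value, mean_value, predicted)
```

The median stays at 7 while the mean jumps to 400,005.2, which is why the median is usually preferred for severely skewed data (Question 17).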

STATISTICS 8 CHAPTERS 1 TO 6, SAMPLE MULTIPLE CHOICE QUESTIONS

Correct answers are in bold italics.

This scenario applies to Questions 1 and 2: A study was done to compare the lung capacity of coal miners to the lung capacity of farm workers. The researcher studied 200 workers of each type. Other factors that might affect lung capacity are smoking habits and exercise habits. The smoking habits of the two worker types are similar, but the coal miners generally exercise less than the farm workers.

1. Which of the following is the explanatory variable in this study?

a. Exercise b. Lung capacity c. Smoking or not d. Occupation

2. Which of the following is a confounding variable in this study?

a. Exercise b. Lung capacity c. Smoking or not d. Occupation

This scenario applies to Questions 3 to 5: A randomized experiment was done by randomly assigning each participant either to walk for half an hour three times a week or to sit quietly reading a book for half an hour three times a week. At the end of a year the change in participants' blood pressure over the year was measured, and the change was compared for the two groups.

3. This is a randomized experiment rather than an observational study because:

a. Blood pressure was measured at the beginning and end of the study. b. The two groups were compared at the end of the study. c. The participants were randomly assigned to either walk or read, rather than choosing their own activity. d. A random sample of participants was used.

4. The two treatments in this study were:

a. Walking for half an hour three times a week and reading a book for half an hour three times a week.

b. Having blood pressure measured at the beginning of the study and having blood pressure measured at the end of the study.

c. Walking or reading a book for half an hour three times a week and having blood pressure measured.

d. Walking or reading a book for half an hour three times a week and doing nothing.

Scenario for Questions 3 to 5, continued

5. If a statistically significant difference in blood pressure change at the end of a year for the two activities was found, then:

a. It cannot be concluded that the difference in activity caused a difference in the change in blood pressure because in the course of a year there are lots of possible confounding variables.

b. Whether or not the difference was caused by the difference in activity depends on what else the participants did during the year.

c. It cannot be concluded that the difference in activity caused a difference in the change in blood pressure because it might be the opposite, that people with high blood pressure were more likely to read a book than to walk.

d. It can be concluded that the difference in activity caused a difference in the change in blood pressure because of the way the study was done.

6. What is one of the distinctions between a population parameter and a sample statistic?

a. A population parameter is only based on conceptual measurements, but a sample statistic is based on a combination of real and conceptual measurements.

b. A sample statistic changes each time you try to measure it, but a population parameter remains fixed.

c. A population parameter changes each time you try to measure it, but a sample statistic remains fixed across samples.

d. The true value of a sample statistic can never be known but the true value of a population parameter can be known.

7. A magazine printed a survey in its monthly issue and asked readers to fill it out and send it in. Over 1000 readers did so. This type of sample is called

a. a cluster sample. b. a self-selected sample. c. a stratified sample. d. a simple random sample.

8. Which of the following would be most likely to produce selection bias in a survey?

a. Using questions with biased wording. b. Only receiving responses from half of the people in the sample. c. Conducting interviews by telephone instead of in person. d. Using a random sample of students at a university to estimate the proportion of people who think the legal drinking age should be lowered.

9. Which one of the following variables is not categorical? a. Age of a person. b. Gender of a person: male or female. c. Choice on a test item: true or false. d. Marital status of a person (single, married, divorced, other)

10. A polling agency conducted a survey of 100 doctors on the question “Are you willing to treat women patients with the recently approved pill RU-486”? The conservative margin of error associated with the 95% confidence interval for the percent who say 'yes' is a. 50% b. 10% c. 5% d. 2%

11. Which one of these statistics is unaffected by outliers? a. Mean b. Interquartile range c. Standard deviation d. Range

12. A list of 5 pulse rates is: 70, 64, 80, 74, 92. What is the median for this list? a. 74 b. 76 c. 77 d. 80

13. Which of the following would indicate that a dataset is not bell-shaped?

a. The range is equal to 5 standard deviations. b. The range is larger than the interquartile range. c. The mean is much smaller than the median. d. There are no outliers.

14. A scatter plot of number of teachers and number of people with college degrees for cities in California reveals a positive association. The most likely explanation for this positive association is:

a. Teachers encourage people to get college degrees, so an increase in the number of teachers is causing an increase in the number of people with college degrees.

b. Larger cities tend to have both more teachers and more people with college degrees, so the association is explained by a third variable, the size of the city.

c. Teaching is a common profession for people with college degrees, so an increase in the number of people with college degrees causes an increase in the number of teachers.

d. Cities with higher incomes tend to have more teachers and more people going to college, so income is a confounding variable, making causation between number of teachers and number of people with college degrees difficult to prove.

15. The value of a correlation is reported by a researcher to be r = −0.5. Which of the following statements is correct?

a. The x-variable explains 25% of the variability in the y-variable. b. The x-variable explains −25% of the variability in the y-variable. c. The x-variable explains 50% of the variability in the y-variable. d. The x-variable explains −50% of the variability in the y-variable.

16. What is the effect of an outlier on the value of a correlation coefficient?

a. An outlier will always decrease a correlation coefficient. b. An outlier will always increase a correlation coefficient. c. An outlier might either decrease or increase a correlation coefficient, depending on where it is in relation to the other points. d. An outlier will have no effect on a correlation coefficient.

17. One use of a regression line is

a. to determine if any x-values are outliers. b. to determine if any y-values are outliers. c. to determine if a change in x causes a change in y. d. to estimate the change in y for a one-unit change in x.

18. Past data has shown that the regression line relating the final exam score and the midterm exam score for students who take statistics from a certain professor is:

final exam = 50 + 0.5 × midterm

One interpretation of the slope is

a. a student who scored 0 on the midterm would be predicted to score 50 on the final exam.

b. a student who scored 0 on the final exam would be predicted to score 50 on the midterm exam.

c. a student who scored 10 points higher than another student on the midterm would be predicted to score 5 points higher than the other student on the final exam.

d. students only receive half as much credit (.5) for a correct answer on the final exam compared to a correct answer on the midterm exam.

Questions 19 to 21: A survey asked people how often they exceed speed limits. The data are then categorized into the following contingency table of counts showing the relationship between age group and response.

Exceed Limit if Possible?

Age        Always   Not Always   Total
Under 30   100      100          200
Over 30    40       160          200
Total      140      260          400

19. Among people with age over 30, what's the "risk" of always exceeding the speed limit?

a. 0.20 b. 0.40 c. 0.33 d. 0.50

20. Among people with age under 30 what are the odds that they always exceed the speed limit?

a. 1 to 2 b. 2 to 1 c. 1 to 1 d. 50%

21. What is the relative risk of always exceeding the speed limit for people under 30 compared to people over 30?

a. 2.5 b. 0.4 c. 0.5 d. 30%
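The three quantities asked for in Questions 19 to 21 (risk, odds, relative risk) all come straight from the table counts. The sketch below assumes the usual definitions: risk = count / row total, odds = count with the outcome / count without it, relative risk = ratio of the two risks. The dictionary layout and names are illustrative only.

```python
# Counts from the contingency table for Questions 19 to 21
table = {
    "under_30": {"always": 100, "not_always": 100},
    "over_30": {"always": 40, "not_always": 160},
}

def risk(row):
    # Proportion of the row that always exceeds the limit
    return row["always"] / (row["always"] + row["not_always"])

def odds(row):
    # Count with the outcome relative to count without it
    return row["always"] / row["not_always"]

risk_over_30 = risk(table["over_30"])                     # Question 19
odds_under_30 = odds(table["under_30"])                   # Question 20
relative_risk = risk(table["under_30"]) / risk_over_30    # Question 21

print(risk_over_30, odds_under_30, relative_risk)
```

Odds of 1.0 correspond to "1 to 1", and a relative risk of 2.5 means the under-30 group's risk is two and a half times that of the over-30 group.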

Questions 22 and 23: A newspaper article reported that "Children who routinely compete in vigorous after-school sports on smoggy days are three times more likely to get asthma than their non-athletic peers." (Sacramento Bee, Feb 1, 2002, p. A1)

22. Of the following, which is the most important additional information that would be useful before making a decision about participation in school sports?

a. Where was the study conducted? b. How many students in the study participated in after-school sports? c. What is the baseline risk for getting asthma? d. Who funded the study?

23. The newspaper also reported that "The number of children in the study who contracted asthma was relatively small, 265 of 3,535." Which of the following is represented by 265/3535 = .075?

a. The overall risk of getting asthma for the children in this study.

b. The baseline risk of getting asthma for the “non-athletic peers” in the study.

c. The risk of getting asthma for children in the study who participated in sports.

d. The relative risk of getting asthma for children who routinely participate in vigorous after-school sports on smoggy days and their non-athletic peers.

Questions 24 to 26: The following histogram shows the distribution of the difference between the actual and “ideal” weights for 119 female students. Notice that percent is given on the vertical axis. Ideal weights are responses to the question “What is your ideal weight”? The difference = actual − ideal. (Source: idealwtwomen dataset on CD.)

24. What is the approximate shape of the distribution?

a. Nearly symmetric. b. Skewed to the left. c. Skewed to the right. d. Bimodal (has more than one peak).

25. The median of the distribution is approximately

a. −10 pounds. b. 10 pounds. c. 30 pounds. d. 50 pounds.

Scenario for Questions 24 to 26, continued

26. Most of the women in this sample felt that their actual weight was

a. about the same as their ideal weight. b. less than their ideal weight. c. greater than their ideal weight. d. no more than 2 pounds different from their ideal weight.

27. A chi-square test of the relationship between personal perception of emotional health and marital status led to rejection of the null hypothesis, indicating that there is a relationship between these two variables. One conclusion that can be drawn is:

a. Marriage leads to better emotional health. b. Better emotional health leads to marriage. c. The more emotionally healthy someone is, the more likely they are to be married. d. There are likely to be confounding variables related to both emotional health and marital status.

28. A chi-square test involves a set of counts called “expected counts.” What are the expected counts?

a. Hypothetical counts that would occur if the alternative hypothesis were true. b. Hypothetical counts that would occur if the null hypothesis were true. c. The actual counts that did occur in the observed data. d. The long-run counts that would be expected if the observed counts are representative.

29. Pick the choice that best completes the following sentence. If a relationship between two variables is called statistically significant, it means the investigators think the variables are

a. related in the population represented by the sample. b. not related in the population represented by the sample. c. related in the sample due to chance alone. d. very important.

30. Simpson's Paradox occurs when

a. No baseline risk is given, so it is not known whether or not a high relative risk has practical importance.

b. A confounding variable rather than the explanatory variable is responsible for a change in the response variable.

c. The direction of the relationship between two variables changes when the categories of a confounding variable are taken into account.

d. The results of a test are statistically significant but are really due to chance.
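The reversal described in Question 30 is easier to see with numbers. The sketch below uses invented (successes, trials) counts for two hypothetical treatments, A and B, split by a hypothetical confounder (case severity); none of these figures come from the source. Within each severity group A has the higher success rate, yet B looks better once the groups are pooled.

```python
# Hypothetical (successes, trials) counts, invented purely for illustration
data = {
    "mild": {"A": (9, 10), "B": (80, 100)},
    "severe": {"A": (30, 100), "B": (2, 10)},
}

def rate(successes, trials):
    return successes / trials

# Success rate of each treatment within each severity group
within = {
    group: {arm: rate(*counts) for arm, counts in arms.items()}
    for group, arms in data.items()
}

# Success rate of each treatment after pooling the groups
overall = {}
for arm in ("A", "B"):
    successes = sum(data[group][arm][0] for group in data)
    trials = sum(data[group][arm][1] for group in data)
    overall[arm] = rate(successes, trials)

print(within)
print(overall)
```

The direction flips because treatment A was mostly given to severe cases and treatment B mostly to mild ones, so severity is tangled up with the comparison.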

Sample Multiple Choice Questions for the material since Midterm 2. Sample questions from Midterms 1 and 2 are also representative of questions that may appear on the final exam.

1. A randomly selected sample of 1,000 college students was asked whether they had ever used the drug Ecstasy. Sixteen percent (16% or 0.16) of the 1,000 students surveyed said they had. Which one of the following statements about the number 0.16 is correct?

A. It is a sample proportion. B. It is a population proportion. C. It is a margin of error. D. It is a randomly chosen number.

2. In a random sample of 1000 students, p̂ = 0.80 (or 80%) were in favor of longer hours at the school library. The standard error of p̂ (the sample proportion) is

A. .013 B. .160 C. .640 D. .800

3. For a random sample of 9 women, the average resting pulse rate is x̄ = 76 beats per minute, and the sample standard deviation is s = 5. The standard error of the sample mean is

A. 0.557 B. 0.745 C. 1.667 D. 2.778

4. Assume the cholesterol levels in a certain population have mean µ = 200 and standard deviation σ = 24. The cholesterol levels for a random sample of n = 9 individuals are measured and the sample mean x̄ is determined. What is the z-score for a sample mean x̄ = 180?

A. –3.75 B. –2.50 C. −0.83 D. 2.50
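Questions 2 to 4 each reduce to a one-line formula. As a rough check, the sketch below assumes the standard expressions SE(p̂) = sqrt(p̂(1 − p̂)/n), SE(x̄) = s/sqrt(n), and z = (x̄ − µ)/(σ/sqrt(n)); the variable names are illustrative.

```python
import math

# Question 2: standard error of a sample proportion
p_hat, n_students = 0.80, 1000
se_proportion = math.sqrt(p_hat * (1 - p_hat) / n_students)

# Question 3: standard error of a sample mean
s, n_women = 5, 9
se_mean = s / math.sqrt(n_women)

# Question 4: z-score of a sample mean under known mu and sigma
mu, sigma, n_people, x_bar = 200, 24, 9, 180
z_score = (x_bar - mu) / (sigma / math.sqrt(n_people))

print(round(se_proportion, 3), round(se_mean, 3), round(z_score, 2))
```

Note that the z-score divides by the standard deviation of the sample mean (24/3 = 8), not by σ itself; dividing by 24 gives the distractor −0.83.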

5. In a past General Social Survey, a random sample of men and women answered the question “Are you a member of any sports clubs?” Based on the sample data, 95% confidence intervals for the population proportion who would answer “yes” are .13 to .19 for women and .247 to .33 for men. Based on these results, you can reasonably conclude that

A. At least 25% of American men and American women belong to sports clubs.

B. At least 16% of American women belong to sports clubs.

C. There is a difference between the proportions of American men and American women who belong to sports clubs.

D. There is no conclusive evidence of a gender difference in the proportion belonging to sports clubs.

6. Suppose a 95% confidence interval for the proportion of Americans who exercise regularly is 0.29 to 0.37. Which one of the following statements is FALSE?

A. It is reasonable to say that more than 25% of Americans exercise regularly. B. It is reasonable to say that more than 40% of Americans exercise regularly. C. The hypothesis that 33% of Americans exercise regularly cannot be rejected. D. It is reasonable to say that fewer than 40% of Americans exercise regularly.

7. In hypothesis testing, a Type 2 error occurs when A. The null hypothesis is not rejected when the null hypothesis is true. B. The null hypothesis is rejected when the null hypothesis is true. C. The null hypothesis is not rejected when the alternative hypothesis is true. D. The null hypothesis is rejected when the alternative hypothesis is true.

8. Null and alternative hypotheses are statements about:

A. population parameters. B. sample parameters. C. sample statistics. D. it depends - sometimes population parameters and sometimes sample statistics.

9. A hypothesis test is done in which the alternative hypothesis is that more than 10% of a population is left-handed. The p-value for the test is calculated to be 0.25. Which statement is correct?

A. We can conclude that more than 10% of the population is left-handed. B. We can conclude that more than 25% of the population is left-handed. C. We can conclude that exactly 25% of the population is left-handed. D. We cannot conclude that more than 10% of the population is left-handed.

10. Which of the following is NOT true about the standard error of a statistic?

A. The standard error measures, roughly, the average difference between the statistic and the population parameter.

B. The standard error is the estimated standard deviation of the sampling distribution for the statistic. C. The standard error can never be a negative number. D. The standard error increases as the sample size(s) increases.

11. A prospective observational study on the relationship between sleep deprivation and heart disease was done by Ayas et al. (Arch Intern Med 2003). Women who slept at most 5 hours a night were compared to women who slept for 8 hours a night (reference group). After adjusting for potential confounding variables like smoking, a 95% confidence interval for the relative risk of heart disease was (1.10, 1.92). Based on this confidence interval, a consistent conclusion would be

A. Sleep deprivation is associated with a modestly increased risk of heart disease. B. Sleep deprivation is associated with a modestly decreased risk of heart disease. C. There was no evidence of an association between sleep deprivation and heart disease. D. Lack of sleep causes the risk of heart disease to increase by 10% to 92%.

12. Consider a random sample of 100 females and 100 males. Suppose 15 of the females are left-handed and 12 of the males are left-handed. What is the estimated difference between population proportions of females and males who are left-handed (females − males)? Select the choice with the correct notation and numerical value.

A. p1 − p2 = 3 B. p1 − p2 = 0.03 C. p̂1 − p̂2 = 3 D. p̂1 − p̂2 = 0.03

13. A result is called “statistically significant” whenever

A. The null hypothesis is true. B. The alternative hypothesis is true. C. The p-value is less than or equal to the significance level. D. The p-value is larger than the significance level.

14. The confidence level for a confidence interval for a mean is
A. the probability the procedure provides an interval that covers the sample mean.
B. the probability of making a Type 1 error if the interval is used to test a null hypothesis about the population mean.
C. the probability that individuals in the population have values that fall into the interval.
D. the probability the procedure provides an interval that covers the population mean.

For the next two questions: It is known that for right-handed people, the dominant (right) hand tends to be stronger. For left-handed people who live in a world designed for right-handed people, the same may not be true. To test this, muscle strength was measured on the right and left hands of a random sample of 15 left-handed men and the difference (left − right) was found. The alternative hypothesis is one-sided (left hand stronger). The resulting t-statistic was 1.80.

15. This is an example of:
A. A two-sample t-test.
B. A paired t-test.
C. A pooled t-test.
D. An unpooled t-test.

16. Assuming the conditions are met, based on the t-statistic of 1.80 the appropriate conclusion for this test using α = .05 is: (Table would be provided with exam.)
A. Df = 14, so p-value < .05 and the null hypothesis can be rejected.
B. Df = 14, so p-value > .05 and the null hypothesis cannot be rejected.
C. Df = 28, so p-value < .05 and the null hypothesis can be rejected.
D. Df = 28, so p-value > .05 and the null hypothesis cannot be rejected.
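
Without a printed t-table, the one-sided p-value for question 16 can be approximated numerically. The sketch below integrates the Student's t density with Simpson's rule using only the standard library; the function names `t_pdf` and `upper_tail_p` are illustrative helpers, not part of the exam or of any library.

```python
import math

def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail_p(t_stat: float, df: int, steps: int = 10_000) -> float:
    """One-sided p-value P(T >= t_stat) for t_stat >= 0.

    Uses Simpson's rule on [0, t_stat] (steps must be even) and the
    symmetry of the t distribution: P(T >= 0) = 0.5.
    """
    h = t_stat / steps
    total = t_pdf(0.0, df) + t_pdf(t_stat, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    return 0.5 - area_0_to_t

n = 15          # paired differences, one per left-handed man
df = n - 1      # a paired t-test has n - 1 degrees of freedom
p_value = upper_tail_p(1.80, df)
print(f"df = {df}, one-sided p-value = {p_value:.4f}")
print("reject H0 at alpha = .05" if p_value < 0.05 else "do not reject H0 at alpha = .05")
```

Because the data are paired, the degrees of freedom come from the 15 differences rather than from 30 separate measurements.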

17. A test of H0: µ = 0 versus Ha: µ > 0 is conducted on the same population independently by two different researchers. They both use the same sample size and the same value of α = 0.05. Which of the following will be the same for both researchers?
A. The p-value of the test.
B. The power of the test if the true µ = 6.
C. The value of the test statistic.
D. The decision about whether or not to reject the null hypothesis.

18. Which of the following is not a correct way to state a null hypothesis?
A. H0: p̂1 − p̂2 = 0 (Sample statistics do not go into hypotheses)
B. H0: µd = 10
C. H0: µ1 − µ2 = 0
D. H0: p = .5

19. A test to screen for a serious but curable disease is similar to hypothesis testing, with a null hypothesis of no disease, and an alternative hypothesis of disease. If the null hypothesis is rejected treatment will be given. Otherwise, it will not. Assuming the treatment does not have serious side effects, in this scenario it is better to increase the probability of:
A. making a Type 1 error, providing treatment when it is not needed.
B. making a Type 1 error, not providing treatment when it is needed.
C. making a Type 2 error, providing treatment when it is not needed.
D. making a Type 2 error, not providing treatment when it is needed.

20. A random sample of 25 college males was obtained and each was asked to report their actual height and what they wished as their ideal height. A 95% confidence interval for µd = average difference between their ideal and actual heights was 0.8" to 2.2". Based on this interval, which one of the null hypotheses below (versus a two-sided alternative) can be rejected?
A. H0: µd = 0.5
B. H0: µd = 1.0
C. H0: µd = 1.5
D. H0: µd = 2.0
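
Question 20 relies on the link between a 95% confidence interval and a two-sided test at α = .05: a null value is rejected exactly when it falls outside the interval. A minimal sketch of that check follows; the dictionary of candidate values simply mirrors the four options.

```python
# 95% confidence interval for µd (ideal minus actual height), in inches.
ci_low, ci_high = 0.8, 2.2

# Null values from options A through D.
null_values = {"A": 0.5, "B": 1.0, "C": 1.5, "D": 2.0}

# A two-sided test at alpha = .05 rejects H0: µd = mu0 when mu0 lies outside the 95% CI.
rejected = [label for label, mu0 in null_values.items()
            if not (ci_low <= mu0 <= ci_high)]
print("Rejected null hypotheses:", rejected)
```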

21. The average time in years to get an undergraduate degree in computer science was compared for men and women. Random samples of 100 male computer science majors and 100 female computer science majors were taken. Choose the appropriate parameter(s) for this situation.
A. One population proportion p.
B. Difference between two population proportions p1 − p2.
C. One population mean µ1
D. Difference between two population means µ1 − µ2

22. If the word significant is used to describe a result in a news article reporting on a study,
A. the p-value for the test must have been very large.
B. the effect size must have been very large.
C. the sample size must have been very small.
D. it may be significant in the statistical sense, but not in the everyday sense.

23. A random sample of 5000 students were asked whether they prefer a 10 week quarter system or a 15 week semester system. Of the 5000 students asked, 500 students responded. The results of this survey ________
A. can be generalized to the entire student body because the sampling was random.
B. can be generalized to the entire student body because the margin of error was 4.5%.
C. should not be generalized to the entire student body because the non-response rate was 90%.
D. should not be generalized to the entire student body because the margin of error was 4.5%.
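
The two figures quoted in the options of question 23 can be reproduced as follows. The exam does not say how the 4.5% margin of error was obtained; this sketch assumes the conservative approximation 1/√n applied to the 500 respondents, which is an assumption on our part.

```python
import math

sampled = 5000    # students who were asked
responded = 500   # students who actually answered

# Share of the sample that did not respond.
non_response_rate = 1 - responded / sampled

# Conservative 95% margin of error for a proportion, approximately 1 / sqrt(n),
# computed on the respondents only (assumed source of the 4.5% figure).
conservative_moe = 1 / math.sqrt(responded)

print(f"non-response rate: {non_response_rate:.0%}")
print(f"conservative margin of error: {conservative_moe:.1%}")
```

The margin of error only describes sampling variability among those who answered; it says nothing about bias introduced when most of the sample does not respond.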

24. In a report by ABC News, the headlines read “City Living Increases Men’s Death Risk.” The headlines were based on a study of 3,617 adults who lived in the United States and were more than 25 years old. One researcher said, “Elevated levels of tumor deaths suggest the influence of physical, chemical and biological exposures in urban areas… Living in cities also involves potentially stressful levels of noise, sensory stimulation and overload, interpersonal relations and conflict, and vigilance against hazards ranging from crime to accidents.” Is a conclusion that living in an urban environment causes an increased risk of death justified?
A. Yes, because the study was a randomized study.
B. Yes, because many of the men in the study were under stress.
C. No, because the study was a retrospective study.
D. No, because the study was an observational study.

25. A significance test based on a small sample may not produce a statistically significant result even if the true value differs substantially from the null value. This type of result is known as
A. the significance level of the test.
B. the power of the study.
C. a Type 1 error.
D. a Type 2 error.

For the next two questions: An observational study found a statistically significant relationship between regular consumption of tomato products (yes, no) and development of prostate cancer (yes, no), with lower risk for those consuming tomato products.

26. Which of the following is not a possible explanation for this finding?
A. Something in tomato products causes lower risk of prostate cancer.
B. There is a confounding variable that causes lower risk of prostate cancer, such as eating vegetables in general, that is also related to eating tomato products.
C. A large number of food products were measured to test for a relationship, and tomato products happened to show a relationship just by chance.
D. A large sample size was used, so even if there were no relationship, one would almost certainly be detected.

27. Which of the following is a valid conclusion from this finding?
A. Something in tomato products causes lower risk of prostate cancer.
B. Based on this study, the relative risk of prostate cancer, for those who do not consume tomato products regularly compared with those who do, is greater than one.
C. If a new observational study were to be done using the same sample size and measuring the same variables, it would find the same relationship.
D. Prostate cancer can be prevented by eating the right diet.

28. The best way to determine whether a statistically significant difference in two means is of practical importance is to
A. find a 95% confidence interval and notice the magnitude of the difference.
B. repeat the study with the same sample size and see if the difference is statistically significant again.
C. see if the p-value is extremely small.
D. see if the p-value is extremely large.

29. A large company examines the annual salaries for all of the men and women performing a certain job and finds that the means and standard deviations are $32,120 and $3,240, respectively, for the men and $34,093 and $3,521, respectively, for the women. The best way to determine if there is a difference in mean salaries for the population of men and women performing this job in this company is
A. to compute a 95% confidence interval for the difference.
B. to subtract the two sample means.
C. to test the hypothesis that the population means are the same versus that they are different.
D. to test the hypothesis that the population means are the same versus that the mean for men is higher.

30. One problem with hypothesis testing is that a real effect may not be detected. This problem is most likely to occur when
A. the effect is small and the sample size is small.
B. the effect is large and the sample size is small.
C. the effect is small and the sample size is large.
D. the effect is large and the sample size is large.
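
How effect size and sample size together drive the chance of missing a real effect can be illustrated with a power calculation. The exam does not specify a test for question 30; the sketch below assumes a one-sided z-test of H0: µ = 0 versus Ha: µ > 0 with known standard deviation, and the effect sizes and sample sizes are made up for illustration.

```python
from statistics import NormalDist

def z_test_power(effect: float, sigma: float, n: int, alpha: float = 0.05) -> float:
    """Power of a one-sided z-test of H0: mu = 0 vs Ha: mu > 0
    when the true mean equals `effect` and sigma is known."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)        # rejection cutoff for the z statistic
    shift = effect * n ** 0.5 / sigma    # mean of the z statistic under the true effect
    return 1 - z.cdf(z_crit - shift)

# Hypothetical small and large effects, small and large samples (sigma = 1).
for effect in (0.2, 1.0):
    for n in (10, 1000):
        power = z_test_power(effect, sigma=1.0, n=n)
        print(f"effect={effect}, n={n}: power={power:.3f}, P(Type 2 error)={1 - power:.3f}")
```

The probability of a Type 2 error is one minus the power, so the combination with the lowest power is the one where a real effect is most likely to go undetected.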