Statistics Recapitulation II

VI. PROBABILITY

1. Sample space is the collection of all possible outcomes of a trial

2. An event is an outcome (or set of outcomes) of a trial that is of interest for research

3. Simple probability refers to the probability of occurrence of a simple event A

4. The probability of an event lies within the range 0 to 1

5. The sum of the probabilities of all possible outcomes in the sample space has to be equal to 1

6. If events A and B are independent then the outcome of one event does not affect the probability of occurrence of another event – p(A and B) = p(A ⋅ B) = p(A) ∙ p(B)

7. What is missing (in the empty box) in the formula for calculating the probability of a joint event (A and B) if events A and B are independent: p(A and B) = p(A) ⋅ p(B)?

8. If two events are mutually exclusive, the probability of their simultaneous realization equals 0.

9. Complete next definition:

Probability is the limit value of the RELATIVE frequency of realization of the event in n repetitions of the experiment.

10. If p(A) = 1, event A is a certain event

11. If p(A) = 0, event A is an impossible event

12. If 0 < p(A) < 1, event A is an uncertain event

13. 100 first year students of the Sarajevo Faculty of Economy took the Statistics exam. 70 of them passed. One student was randomly selected. The event A is "the selected student passed the exam". The event opposite or complement to event A is: the selected student did not pass the exam.

14. Complement of event A is the event that includes all outcomes that are not part of event A.

15. Joint event is an event that has two or more characteristics; it can be an “or” event (union) or an “and” event (intersection)

16. The probability of the event B, on condition that the event A was already realized, is marked with p(B/A) and is calculated using the following formula: $p(B/A) = \frac{p(A/B) \cdot p(B)}{p(A)}$

17. Conditional probability p(A/B) for two independent events A and B equals: since $p(A \cap B) = p(A) \cdot p(B)$, we get $p(A/B) = p(A)$

18. What is missing (in the empty box) in the formula for calculating conditional probability:

$p(B/A) = \frac{p(A \cap B)}{p(A)}$

19. According to general addition rule, there is: P(A or B) = P(A∪B) = P(A) + P(B) – P(A∩B)
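The general addition rule can be checked on a small example. The sketch below assumes one roll of a fair six-sided die; the events A and B are hypothetical choices made only for illustration.

```python
from fractions import Fraction

# Sample space: one roll of a fair six-sided die (illustrative assumption).
sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # hypothetical event A: the roll is even
B = {4, 5, 6}   # hypothetical event B: the roll is greater than 3

def prob(event):
    """Classical probability: favourable outcomes / all possible outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))

# General addition rule: P(A or B) = P(A) + P(B) - P(A and B)
p_union = prob(A) + prob(B) - prob(A & B)

print(p_union)       # 2/3
print(prob(A | B))   # 2/3, the same value counted directly from the union
```

Subtracting P(A∩B) removes the outcomes 4 and 6, which would otherwise be counted twice.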

20. Bayes' theorem defines the probability of event $B_i$ occurring given that event A has occurred
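A minimal numeric sketch of Bayes' theorem follows. All numbers are made up for illustration: two machines B1 and B2 with assumed output shares and defect rates, and A is the event "a product is defective".

```python
# Hypothetical inputs (not from the notes): output shares and defect rates.
prior = {"B1": 0.60, "B2": 0.40}        # p(B_i)
likelihood = {"B1": 0.02, "B2": 0.05}   # p(A / B_i)

# Total probability of A: p(A) = sum over i of p(A / B_i) * p(B_i)
p_A = sum(likelihood[b] * prior[b] for b in prior)

# Bayes' theorem: p(B_i / A) = p(A / B_i) * p(B_i) / p(A)
posterior = {b: likelihood[b] * prior[b] / p_A for b in prior}

print(round(p_A, 3))               # 0.032
print(round(posterior["B1"], 3))   # 0.375
print(round(posterior["B2"], 3))   # 0.625
```

The posterior probabilities add up to 1, because B1 and B2 are assumed to be mutually exclusive and to cover the whole sample space.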

21. If two events are mutually exclusive, then: P(A∪B) = P(A) + P(B)

22. If two events are mutually exclusive, then: …

23. “Randomly selected student will get the grade 6 in the Statistics exam” and “Randomly selected student will get the grade 8 in the Statistics exam” are two events that are: exclusive

VII. PROBABILITY DISTRIBUTIONS

VII. 1. DISCRETE PROBABILITY DISTRIBUTIONS

24. Which of the following is a valid probability value for a discrete random variable?

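25. The law of probability of the random variable X is:

$X: \begin{pmatrix} -1 & 0 & 1 \\ 0.2 & 0.5 & 0.4 \end{pmatrix}$

What is wrong with this formula? The sum of the probabilities of the random variable X is different from 1 (0.2 + 0.5 + 0.4 = 1.1)

26. The law of probability of the random variable X is:

$X: \begin{pmatrix} -1 & 0 & 1 \\ 0.2 & 0.5 & 0.3 \end{pmatrix}$

Probability that the random variable X will take a value less than or equal to 1 is: $F(1) = P(X \le 1) = \sum_{x_i \le 1} p(x_i) = 0.2 + 0.5 + 0.3 = 1$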
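The two checks used in items 25 and 26 (probabilities must sum to 1, and F(a) adds the probabilities of all values up to a) can be sketched as follows. The function names `is_valid_law` and `cdf` are hypothetical helpers introduced here.

```python
import math

def is_valid_law(law):
    """A law is valid if every p(x) lies in [0, 1] and the p(x) sum to 1."""
    probs = law.values()
    return all(0 <= p <= 1 for p in probs) and math.isclose(sum(probs), 1.0)

def cdf(law, a):
    """F(a) = P(X <= a): add p(x) over all values x that are <= a."""
    return sum(p for x, p in law.items() if x <= a)

law_25 = {-1: 0.2, 0: 0.5, 1: 0.4}   # law from item 25
law_26 = {-1: 0.2, 0: 0.5, 1: 0.3}   # law from item 26

print(is_valid_law(law_25))   # False, the probabilities sum to 1.1
print(is_valid_law(law_26))   # True
print(math.isclose(cdf(law_26, 1), 1.0))   # True
print(math.isclose(cdf(law_26, 0), 0.7))   # True, this is P(X < 1)
```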

27. Discrete distributions include: Binomial, Poisson and Hypergeometric distributions

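28. Random variable “the number of letters in a word picked at random out of the dictionary” is a discrete random variable. TRUE

In general, the law of probability of a discrete random variable X is written as:

$X: \begin{pmatrix} x_1 & x_2 & \dots & x_k \\ p(x_1) & p(x_2) & \dots & p(x_k) \end{pmatrix}$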

29. What mathematical operator do we use to calculate the expected value for discrete random variables? Sum operator

30. A binomial distribution requires: - sampling without replacement for infinite populations or with replacement for finite populations - each observation is classified into one of two exclusive categories: failure or success - there are two outcomes - the outcome is fixed

31. The probability of a faulty product appearing in the production process is 0.15. What theoretical distribution would you use to model the random variable "the number of faulty products in the production series made up of 1,000 units of products "? Binomial

32. If p = 0.5 and n = 4, then the corresponding binomial distribution is:

$p(x) = \binom{4}{x} \cdot 0.5^{x} \cdot 0.5^{4-x}$

33. Suppose that a quiz consists of 20 True and False questions. A student hasn’t studied for the exam and will just randomly guess all answers (assume True and False answers are equally likely). How would you find the probability that the student will get 8 or fewer answers correct?
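One possible approach to item 33, offered as a sketch and not as the official answer: treat the number of correct guesses as a binomial random variable with n = 20 and p = 0.5, and add the probabilities p(0) + p(1) + ... + p(8). The helper name `binom_pmf` is hypothetical.

```python
from math import comb

def binom_pmf(x, n, p):
    """Binomial probability of exactly x successes in n trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 20, 0.5   # 20 questions, each guessed correctly with probability 0.5

# P(X <= 8) is the cumulative sum of p(x) for x = 0, 1, ..., 8
p_at_most_8 = sum(binom_pmf(x, n, p) for x in range(0, 9))

print(round(p_at_most_8, 4))   # 0.2517
```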

34. The requirement that the probability of success remains constant from trial to trial is a property of the Binomial distribution.

35. The goal keeper scores 3 out of 1000 goals scored in football matches. What theoretical distribution would you use to approximate the binomial random variable "the number of goals scored by the goal-keeper per 500 goals scored "? Poisson distribution
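The quality of the Poisson approximation in item 35 can be inspected numerically. The sketch below assumes p = 3/1000 per goal and n = 500 goals, so that λ = n ⋅ p = 1.5; the helper names are hypothetical.

```python
from math import comb, exp, factorial

n, p = 500, 3 / 1000   # 500 goals, goalkeeper scores with probability 0.003
lam = n * p            # Poisson parameter: lambda = n * p = 1.5

def binom_pmf(x):
    """Exact binomial probability of x goalkeeper goals out of n."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x):
    """Poisson approximation with the same mean."""
    return exp(-lam) * lam**x / factorial(x)

# Print both probabilities side by side for small x
for x in range(4):
    print(x, round(binom_pmf(x), 4), round(poisson_pmf(x), 4))
```

With large n and small p the two columns agree closely, which is why the Poisson distribution is a convenient approximation here.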

36. Which of the following statements about the binomial distribution is not correct? …

37. Mean for the Binomial distribution is equal: µ = E(X) = n ⋅ p

38. Variance for the Binomial distribution is equal to: $\sigma^2 = n \cdot p \cdot q$

39. The Binomial distribution will be symmetric if: $p = 0.5$
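The statements about the mean, variance and symmetry of the binomial distribution can be verified directly from the probabilities. This sketch uses n = 4 and p = 0.5 (the values from item 32) and computes the mean and variance with the sum operator.

```python
from math import comb

n, p = 4, 0.5
q = 1 - p

# p(x) for x = 0, 1, ..., n
pmf = [comb(n, x) * p**x * q**(n - x) for x in range(n + 1)]

# Expected value and variance from their definitions
mean = sum(x * px for x, px in enumerate(pmf))
variance = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))

print(pmf)        # [0.0625, 0.25, 0.375, 0.25, 0.0625]
print(mean)       # 2.0, equal to n * p
print(variance)   # 1.0, equal to n * p * q
```

The list of probabilities reads the same forwards and backwards, which is the symmetry that appears when p = 0.5.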

40. Formula for calculating probability for a binomial random variable is:

$p(x) = \binom{n}{x} \cdot p^{x} \cdot (1 - p)^{n-x}$

41. For Binomial distribution, p(x) is: the probability that exactly x successes will be realized among n trials

42. What is missing (in the empty box) in the formula for calculating probability for a Binomial random variable:

$p(x) = \binom{n}{x} \cdot p^{x} \cdot (1 - p)^{n-x}$

43. What is missing (in the empty box) in the formula for calculating probability for a Binomial random variable:

$p(x) = \binom{n}{x} \cdot p^{x} \cdot (1 - p)^{n-x}$

44. The number of cars that arrive at the petrol station is a Poisson's random variable with the mathematical expectancy 2. The parameter 𝜆 of the Poisson's distribution equals: 2

45. Formula for calculating probability for a Poisson random variable is:

$p(x) = \frac{e^{-\lambda} \cdot \lambda^{x}}{x!}$

46. What is missing (in the empty box) in the formula for calculating probability for a Poisson random variable:

$p(x) = \frac{e^{-\lambda} \cdot \lambda^{x}}{x!}$

47. What is missing (in the empty box) in the formula for calculating probability for a Poisson random variable:

$p(x) = \frac{e^{-\lambda} \cdot \lambda^{x}}{x!}$

48. For a Poisson distribution, the mean is: $\mu = E(X) = \lambda$

49. For a Poisson distribution, the variance is: $\sigma^2 = E[(X - \mu)^2] = \lambda$
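That both the mean and the variance of a Poisson variable equal λ can be checked numerically. The sketch uses λ = 2 (the petrol station example in item 44) and truncates the infinite sum at x = 59, an arbitrary cut-off beyond which the probabilities are negligible.

```python
from math import exp, factorial, isclose

lam = 2.0   # Poisson parameter from item 44

def poisson_pmf(x, lam):
    """p(x) = e^(-lambda) * lambda^x / x!"""
    return exp(-lam) * lam**x / factorial(x)

xs = range(60)   # truncation point chosen for illustration only
total = sum(poisson_pmf(x, lam) for x in xs)
mean = sum(x * poisson_pmf(x, lam) for x in xs)
variance = sum((x - mean) ** 2 * poisson_pmf(x, lam) for x in xs)

print(isclose(total, 1.0))                       # True
print(isclose(mean, lam, abs_tol=1e-9))          # True
print(isclose(variance, lam, abs_tol=1e-9))      # True
```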

VII. 2. CONTINUOUS PROBABILITY DISTRIBUTIONS

50. The total area under the curve of f(x) for continuous random variable is equal to: 1

51. A uniform distribution has constant probability, that is, there is the same probability for each outcome

52. On the next graph we have the probability density function for: uniform probability curve

53. What characteristic of a continuous random variable is described by its “expected value”? The real parameter 𝜇 - MEAN

54. What mathematical operator do we use to calculate the expected value for continuous random variables? Integral operator

55. The cumulative distribution function of the random variable X in point a is defined as a probability that the random variable X will take the following value: lower or equal to some number a

56. The normal distribution has the following characteristics: - symmetry about its mean - the mode and median are both equal to the mean - the inflection points of the curve occur one standard deviation away from the mean, i.e. at $\mu - \sigma$ and $\mu + \sigma$

57. The standardized normal distribution is the normal distribution with a mean of zero and a variance of one.

[Figure for item 52: density function f(x) of the uniform distribution, constant at height 1/(b − a) for x between a and b]

$\mu = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx, \quad -\infty < x < \infty$

58. “The time taken for a computer to boot up” is a random variable that follows a normal distribution with mean 30 seconds and standard deviation 5 seconds. The standardized (z) score for a boot-up time of x = 30 seconds is: $z_i = \frac{30 - 30}{5} = 0$
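The standardization in item 58 is a one-line calculation. In the sketch below the function name `z_score` is a hypothetical helper, and the second call (x = 40) is an extra illustrative value that does not come from the notes.

```python
def z_score(x, mu, sigma):
    """Standardized value: how many standard deviations x lies from the mean."""
    return (x - mu) / sigma

print(z_score(30, 30, 5))   # 0.0, the boot-up time equals the mean
print(z_score(40, 30, 5))   # 2.0, two standard deviations above the mean
```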

59. For the normal distribution relation between mean, median and mode is: The mode and median are both equal to the mean.

60. The mean and median are the same for a normal distribution. TRUE

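61. What is missing (in the empty box) in the formula for the probability density function of the normal distribution:

$f(x_i) = \frac{1}{\sigma \cdot \sqrt{2 \cdot \pi}} \cdot e^{-\frac{1}{2} \cdot \left(\frac{x_i - \mu}{\sigma}\right)^2}$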

62. Which of the following is not true about the normal distribution:

63. In the standardized normal distribution the value of the cumulative distribution function F(z) for z = 0 is: 0.5

64. What is missing (in the empty box) in the formula for the cumulative distribution function:

$F(z_i) = p(Z \le z_i) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z_i} e^{-\frac{z^2}{2}}\,dz$

65. What mathematical operator do we use to determine the cumulative distribution function for continuous random variables? INTEGRAL

66. According to the rules for determining probability for different kinds of cases with the standardized normal distribution, the probability for $(Z \le -z_i)$ is equal to: $1 - F(z_i)$

67. According to the rules for determining probability for different kinds of cases with the standardized normal distribution, the probability for $(-z_i < Z \le z_i)$ is equal to:

$F(z_i) - F(-z_i) = 2F(z_i) - 1$
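The rules for the standardized normal distribution can be checked with Python's standard library, where `statistics.NormalDist()` with default arguments is the normal distribution with mean 0 and standard deviation 1. The value z = 1.96 is an arbitrary illustrative choice.

```python
from math import isclose
from statistics import NormalDist

F = NormalDist().cdf   # cumulative distribution function of Z ~ N(0, 1)
z = 1.96               # illustrative value, not taken from the notes

print(F(0.0))          # 0.5, half of the area lies below the mean

left_tail = F(-z)              # P(Z <= -z)
central = F(z) - F(-z)         # P(-z < Z <= z)

print(isclose(left_tail, 1 - F(z)))      # True, by symmetry
print(isclose(central, 2 * F(z) - 1))    # True
print(round(central, 3))                 # 0.95
```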

68. If a variable X is equal to the sum of squares of independent standard normal variables $X_i$, i = 1, …, n, we say that variable X follows: Chi-square distribution with n degrees of freedom.

69. The mean for chi-square distribution is equal to: number of degrees of freedom
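A small simulation illustrates items 68 and 69: sums of squares of n independent standard normal values have an average close to n. The seed, n = 5 and the number of draws are arbitrary assumptions chosen for this sketch.

```python
import random

rng = random.Random(0)   # fixed seed so that the run is repeatable
n = 5                    # degrees of freedom (illustrative choice)
draws = 20000            # number of simulated chi-square values

# Each value is a sum of squares of n independent N(0, 1) variables
values = [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n)) for _ in range(draws)]

sample_mean = sum(values) / draws
print(round(sample_mean, 2))   # close to n = 5
```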

70. In cases where we need to make a decision on the significant difference of actual (observed) and theoretical (expected) frequency, we will apply: Chi-square distribution

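71. The random variable T that follows Student's distribution with n degrees of freedom is given by: $T = \frac{Z}{\sqrt{\chi^2_n / n}}$, where Z is a standard normal variable and $\chi^2_n$ is an independent chi-square variable with n degrees of freedom

72. The mean for Student distribution with n degrees of freedom is equal to: $\mu = E(T) = 0$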

73. The Student distribution is: continuous probability distribution

74. F follows the Fisher-Snedecor distribution with m and n degrees of freedom:

$F = \frac{x / m}{y / n}$

75. To test hypotheses about the equality of two variances, we will use the F distribution

VIII. SAMPLING

76. The shape of the distribution of an unbiased sample is: similar to shape of population it comes from

77. How many different samples of size n can we choose from the basic set of N elements?

78. If a sample is unbiased, then it can be representative of the population. TRUE

79. An unbiased sample has similar characteristics to the population, and we can use these to make inferences about the population. TRUE

80. A biased sample is representative of the target population. FALSE

81. If we try to predict the shape of the population distribution from information about a biased sample, we will: end up with the wrong result

82. If we try to predict the shape of the population distribution from information about an unbiased sample, we will: get the right result

83. If we know the shape of unbiased sample distribution, we can use it to predict that of the population to a reasonable level of confidence. TRUE

Number of different samples of size n from a basic set of N elements (item 77):

$C_N^n = \binom{N}{n} = \frac{N!}{n! \cdot (N - n)!}$
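The count of possible samples can be computed three ways that should agree: from the factorial formula, with `math.comb`, and by listing the samples. The population of N = 5 elements and sample size n = 2 are hypothetical values.

```python
from itertools import combinations
from math import comb, factorial

N, n = 5, 2   # hypothetical population size and sample size

# Factorial formula: N! / (n! * (N - n)!)
by_formula = factorial(N) // (factorial(n) * factorial(N - n))

# Explicit list of all samples (without replacement, order ignored)
population = ["a", "b", "c", "d", "e"]
all_samples = list(combinations(population, n))

print(by_formula)         # 10
print(comb(N, n))         # 10
print(len(all_samples))   # 10
```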

84. We can’t use biased sample to make inferences about the population because the sample and population have different characteristics. TRUE

85. Probability sampling is where every item has a calculable chance of selection

86. We will need to take a larger sample if: - population is large - population is heterogeneous - reliable data are required - variability in population is high - we use higher level of significance

87. A sampling frame lists: the entire relevant population

88. A list of all of the units in a population is called a sampling frame.

89. Non-probability sampling means that we don’t know the probability of selection of items

90. For simple random sample: - every element of population has an equal chance of being selected for the sample - population is homogeneous - best suits situations where not much information is available

91. For stratified sample: - population is heterogeneous - final sample is composed of samples selected from each group - sample size taken from each stratum is proportional to the relative size of the strata

92. For systematic sample: each element in population has a known and equal probability of selection

93. To ensure that particular different groups within a population are adequately represented in the sample, we will create: STRATIFIED SAMPLE

94. Type of samples that is free of classification error and that requires minimum advance knowledge of the population is: SIMPLE RANDOM SAMPLE

95. Sampling with replacement means that: when you have selected each unit and recorded relevant information about it, you put it back into the population.

96. Sampling without replacement means that: the sampling unit isn’t replaced back into the population

97. Stratification is the process of: grouping members of the population into relatively homogeneous subgroups before sampling

98. A professor who is conducting some research might use student volunteers to constitute a sample. In that case he will work with: CONVENIENCE SAMPLING

99. If the person most knowledgeable on the subject of the study selects elements of the population that he or she feels are most representative of the population, we work with: JUDGEMENT SAMPLING

100. The Central limit theorem says: No matter what probability distribution describes the population, if the sample size n is large enough (more than 30), then the population of all possible sample means is approximately normal with mean $\mu$ and standard deviation $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$

101. Theorem: “No matter what probability distribution describes the population, if the sample size n is large enough, then the population of all possible sample means is approximately normal with mean $\mu$ and standard deviation $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$” is: CENTRAL LIMIT THEOREM
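The theorem can be illustrated with a simulation. This is only a sketch under stated assumptions: the population is taken to be exponential with mean 1 and standard deviation 1 (clearly not normal), and the seed, n = 36 and the number of samples are arbitrary choices.

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(42)   # fixed seed so that the run is repeatable
n = 36                    # sample size, above the "more than 30" rule of thumb
num_samples = 5000        # number of random samples drawn

# Draw many random samples from a skewed population and keep each sample mean
sample_means = [
    mean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

mean_of_means = mean(sample_means)   # should be close to mu = 1
sd_of_means = stdev(sample_means)    # should be close to sigma / sqrt(n)
theoretical_se = 1.0 / sqrt(n)       # sigma / sqrt(n) = 1/6

print(round(mean_of_means, 3), round(sd_of_means, 3), round(theoretical_se, 3))
```

The standard deviation of the simulated sample means lands near σ/√n even though the individual observations are strongly skewed.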

102. The Central limit theorem says that the sampling distribution of the sample mean is approximately normal under certain conditions. Which of following is a necessary condition for the Central limit Theorem to be used?

103. The Central limit theorem is based on the sample size being large (> 30). TRUE

104. The central limit theorem is not based on the samples being drawn on a random basis. FALSE

105. Which one of the following sampling examples would generally lead to the least reliable statistical inferences about the population from which the sample has been selected?

106. The sample was taken by randomly selecting one student from the administration’s official list of students and then choosing every 100th. This is an example of what kind of sampling? SYSTEMATIC
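The procedure in item 106 (a random start, then every 100th unit) can be sketched as follows. The function name `systematic_sample` and the list of 1,000 student numbers are hypothetical.

```python
import random

def systematic_sample(frame, step, rng):
    """Pick a random start among the first `step` units, then take every `step`-th unit."""
    start = rng.randrange(step)
    return frame[start::step]

rng = random.Random(1)          # fixed seed so that the run is repeatable
frame = list(range(1, 1001))    # hypothetical sampling frame: students 1..1000

sample = systematic_sample(frame, 100, rng)

print(len(sample))   # 10
print(sample[:3])    # first three selected students, 100 apart
```

Only the starting point is random; once it is fixed, the rest of the sample is determined by the step.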

107. We make an inference if: we generalize from the sample to the population

IX. CONFIDENCE INTERVALS

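108. In the general form for a confidence interval, $P(\varphi - h \le \theta \le \varphi + h) = 1 - \alpha$, θ represents: parameter from population

109. In the general form for a confidence interval, $P(\varphi - h \le \theta \le \varphi + h) = 1 - \alpha$, ϕ represents: statistic from sample

110. In the general form for a confidence interval, $P(\varphi - h \le \theta \le \varphi + h) = 1 - \alpha$, h represents: surroundings (margin of error)

111. In the general form for a confidence interval, $P(\varphi - h \le \theta \le \varphi + h) = 1 - \alpha$, (1 − α) represents: confidence

112. In the general form for a confidence interval, $P(\varphi - h \le \theta \le \varphi + h) = 1 - \alpha$, α represents: type I error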

113. The rule “a larger sample leads to a larger sampling error” is: FALSE (a larger sample leads to a smaller sampling error)

114. In point estimation we use the data from the sample to compute a value of sample statistic that serves as an estimate of a population parameter. TRUE

115. Sampling error is: (chance, random error) the absolute difference between an unbiased point estimate and the corresponding population parameter.

116. Sample bias is: constant error due to inadequate design

117. Unbiased estimation for population parameter means the expected value (arithmetic mean) for statistics in the sample is equal to the value of the parameter from the population. TRUE

118. In determining confidence intervals for the population arithmetic mean on the basis of a small sample, in case that the population variance is unknown, we use the following distribution: STUDENT’S DISTRIBUTION

119. In determining confidence intervals for the population arithmetic mean on the basis of a large sample, in case that the population variance is unknown, we use the following distribution: NORMAL PROBABILITY

120. In determining confidence intervals for the population arithmetic mean, in case that the population variance is known, we always use the following distribution: NORMAL PROBABILITY

121. In determining confidence intervals for the population arithmetic mean on the basis of a large sample, in case that the population variance is unknown, we will use formula:

122. In determining confidence intervals for the population arithmetic mean on the basis of a small sample, in case that the population variance is unknown, we will use formula:

123. What is missing (in the empty box) in the formula for the confidence interval for the population arithmetic mean on the basis of a small sample, in case that the population variance is unknown:

t

124. If confidence interval for the population arithmetic mean looks like

then we work on: small sample, unknown standard deviation, t distribution

125. The t distribution, in determining confidence intervals for the population arithmetic mean, is used when: the standard deviation unknown and the sample is small

126. In determining confidence intervals for the population arithmetic mean, in case that the population variance is known, we will use formula:

$P(\bar{x} - z \cdot \sigma_{\bar{x}} \le \mu \le \bar{x} + z \cdot \sigma_{\bar{x}}) = 1 - \alpha$

then we work on: known standard deviation, z distribution

128. What is missing (in empty box) in formula for confidence interval for the population arithmetic mean on the basis of a large sample, in case that the population variance is unknown:

? z

129. What is missing (in the empty box) in the formula for the confidence interval for the population arithmetic mean in case that the population variance is known:

$\frac{\sigma}{\sqrt{n}}$

130. What is missing (in the empty box) in the formula for the confidence interval for the population arithmetic mean on the basis of a large sample, in case that the population variance is unknown: ? $\frac{S}{\sqrt{n}}$

then we work on: z distribution (normal)

132. In determining confidence intervals for the population variance on the basis of a large sample, we use the following distribution: NORMAL DISTRIBUTION

133. In determining confidence intervals for the population variance on the basis of a small sample, we use the following distribution: Chi-square

134. In determining confidence intervals for the variance on the basis of a large sample, we will use formula:

135. What is missing (in the empty box) in the formula for the confidence interval for the variance on the basis of a large sample:

$2 \cdot n \cdot S^2$

136. In determining confidence intervals for the variance on the basis of a small sample, we will use formula:

137. What is missing (in the empty box) in the formula for the confidence interval for the variance on the basis of a small sample:

$\chi^2_{n-1,\,1-\alpha/2}$

138. The formula for determining the standard error of means, in case that the population variance is known, is:

$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$

139. What is missing (in the empty box) in the formula for calculating the standard error of means $\sigma_{\bar{x}}$, in case that the population variance is known: n

140. What is missing (in the empty box) in the formula for calculating the estimate of the standard error of means $S_{\bar{x}}$, in case that the population variance is unknown: ? n

141. The estimated value for the standard error of a proportion is equal to:

142. The standard error of a proportion is equal to:

$\sigma_{\hat{p}} = \sqrt{\frac{p \cdot q}{n}}$

143.What is missing (in empty box) in formula for calculating estimation for the standard error of a proportion

𝑞!

144. What is missing (in empty box) in formula for calculating of the standard error of a proportion

𝑞!

145. The formula for determining the estimated value for the standard error of the mean $S_{\bar{x}}$, in case that the population variance is unknown

146. In the sample of 42 elements, we calculated the mean 54 and the variance of 24.8. We wish to determine the interval in which the population mean would be, with 99% certainty. What frequency distribution do we need to apply? NORMAL DISTRIBUTION

147. If we decide to use 99% confidence interval rather than 95% confidence interval, we would expect the confidence interval to become: Wider
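The widening of the interval can be seen on the data from item 146 (n = 42, mean 54, variance 24.8). This is a sketch that assumes the standard error is estimated as S/√n and that the normal (z) model applies because the sample is large; the function name `z_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(sample_mean, sample_variance, n, confidence):
    """Large-sample interval: mean +/- z * S / sqrt(n) (assumed estimator)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    h = z * sqrt(sample_variance / n)         # margin of error
    return sample_mean - h, sample_mean + h

ci95 = z_interval(54, 24.8, 42, 0.95)
ci99 = z_interval(54, 24.8, 42, 0.99)

width95 = ci95[1] - ci95[0]
width99 = ci99[1] - ci99[0]

print([round(v, 2) for v in ci95])
print([round(v, 2) for v in ci99])
print(width99 > width95)   # True, higher confidence gives a wider interval
```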

148. In determining confidence intervals for the proportion, we use formula:

151. In the sample of 20 elements, we calculated the mean 50 and the variance of 12. We wish to determine the interval in which the population variance would be, with 95% certainty. What model for confidence interval do we need to apply?

chi-square

152. In the sample of 50 elements, we calculated the mean 50 and the variance of 12. We wish to determine the interval in which the population mean would be, with 95% certainty. What model for confidence interval do we need to apply?

153. In the sample of 40 elements, we calculated the mean 50 and the variance of 12. We wish to determine the interval in which the population variance would be, with 99% certainty. What model for confidence interval do we need to apply?

NORMAL

154. In determining confidence intervals for the population proportion, we always use the following distribution: Z DISTRIBUTION
