statistics recapitulation ii
Questions for recapitulation
VI. PROBABILITY
1. The sample space is the collection of all possible outcomes of a trial
2. An event is an outcome (or set of outcomes) of a trial that is of interest for research
3. Simple probability refers to the probability of occurrence of a simple event A
4. The probability of an event lies within the range 0 to 1
5. The sum of the probabilities of all possible outcomes from the sample space has to be equal to 1
6. If events A and B are independent then the outcome of one event does not affect the probability of occurrence of another event – p(A and B) = p(A ⋅ B) = p(A) ∙ p(B)
7. What is missing (in the empty box) in the formula for calculating the probability of the joint event (A and B) if events A and B are independent: p(A and B) = p(A) ⋅ p(B)
8. If two events are mutually exclusive, the probability of their simultaneous realization equals 0.
9. Complete next definition:
Probability is the limit value of the RELATIVE frequency of realization of the event in n repetitions of the experiment.
10. If p(A) = 1, event A is a certain event
11. If p(A) = 0, event A is an impossible event
12. If 0 < p(A) < 1, event A is an uncertain event
13. 100 first-year students of the Sarajevo Faculty of Economy took the Statistics exam. 70 of them passed. One student was randomly selected. The event A is "the selected student passed the exam". The event opposite or complementary to event A is: the selected student did not pass the exam.
14. The complement of event A is the event that includes all outcomes that are not part of event A.
15. A joint event is an event that has two or more characteristics; it can be an “or” event or an “and” event
16. The probability of the event B, on condition that the event A was already realized, is marked with p(B/A) and is calculated using the following formula: $p(B/A) = \frac{p(A/B) \cdot p(B)}{p(A)}$
17. Conditional probability p(A/B) for two independent events A and B equals p(A): since $p(A \cap B) = p(A) \cdot p(B)$, it follows that $p(A/B) = p(A)$
18. What is missing (in the empty box) in the formula for calculating conditional probability: $p(B/A) = \frac{p(A \cap B)}{p(A)}$
19. According to general addition rule, there is: P(A or B) = P(A∪B) = P(A) + P(B) – P(A∩B)
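The general addition rule can be checked on a small example. The sketch below assumes a fair six-sided die and two illustrative events that are not taken from the questions: A = "the roll is even" and B = "the roll is greater than 3".

```python
from fractions import Fraction

# Sample space of one roll of a fair die (hypothetical example).
sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}  # event A: the roll is even
B = {4, 5, 6}  # event B: the roll is greater than 3

def prob(event):
    """Classical probability: favourable outcomes / all possible outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))

p_union = prob(A | B)                     # P(A or B), counted directly
p_rule = prob(A) + prob(B) - prob(A & B)  # P(A) + P(B) - P(A and B)
print(p_union, p_rule)  # 2/3 2/3
```

Subtracting P(A∩B) removes the outcomes 4 and 6, which would otherwise be counted twice.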
20. Bayes' theorem defines the probability of event $B_i$ occurring given that event A has occurred
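A short numeric sketch of Bayes' theorem may help here. All numbers are assumed for illustration only: two machines B1 and B2 make 60% and 40% of the products, and A is the event "the product is faulty" with p(A/B1) = 0.02 and p(A/B2) = 0.05.

```python
# Hypothetical prior probabilities p(B_i) and conditional probabilities p(A/B_i).
prior = {"B1": 0.6, "B2": 0.4}
likelihood = {"B1": 0.02, "B2": 0.05}

# Total probability of A: sum of p(A/B_i) * p(B_i) over all B_i.
p_A = sum(likelihood[b] * prior[b] for b in prior)

# Bayes' theorem: p(B_i/A) = p(A/B_i) * p(B_i) / p(A).
posterior = {b: likelihood[b] * prior[b] / p_A for b in prior}

for b, value in posterior.items():
    print(b, round(value, 3))
```

The posterior probabilities sum to 1 because B1 and B2 are assumed to be mutually exclusive and to cover all products.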
21. If two events are mutually exclusive, then: P(A∪B) = P(A) + P(B)
22. If two events are mutually exclusive, then: …
23. “Randomly selected student will get the grade 6 in the Statistics exam” and “Randomly selected student will get the grade 8 in the Statistics exam” are two events that are: exclusive
VII. PROBABILITY DISTRIBUTIONS
VII. 1. DISCRETE PROBABILITY DISTRIBUTIONS
24. Which of the following is a valid probability value for a discrete random variable?
25. The law of probability of the random variable X is:
$X: \begin{pmatrix} -1 & 0 & 1 \\ 0.2 & 0.5 & 0.4 \end{pmatrix}$
What is wrong with this formula? The sum of the probabilities of the random variable X is different from 1 (0.2 + 0.5 + 0.4 = 1.1)
26. The law of probability of the random variable X is:
$X: \begin{pmatrix} -1 & 0 & 1 \\ 0.2 & 0.5 & 0.3 \end{pmatrix}$
Probability that the random variable X will take a value less than or equal to 1 is: $F(1) = P(X \le 1) = \sum_{x_i \le 1} p(x_i)$
27. Discrete distributions include: Binomial, Poisson and Hypergeometric distributions
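28. Random variable “the number of letters in a word picked at random out of the dictionary” is a discrete random variable. TRUE
General form of the law of probability of a discrete random variable X:
$X: \begin{pmatrix} x_1 & x_2 & \dots & x_k \\ p(x_1) & p(x_2) & \dots & p(x_k) \end{pmatrix}$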
29. What mathematical operator do we use to calculate the expected value for discrete random variables? Sum operator
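As a minimal sketch of the sum operator at work, the snippet below takes the values −1, 0, 1 with probabilities 0.2, 0.5, 0.3 (the law from question 26) and computes the expected value, the variance and the cumulative distribution function. The helper name `cdf` is only illustrative.

```python
from fractions import Fraction

# Law of probability from question 26: values and their probabilities.
values = [-1, 0, 1]
probs = [Fraction(2, 10), Fraction(5, 10), Fraction(3, 10)]

total = sum(probs)  # must equal 1 for a valid law of probability

# Expected value: sum of x_i * p(x_i).
expected = sum(x * p for x, p in zip(values, probs))

# Variance: sum of (x_i - mu)^2 * p(x_i).
variance = sum((x - expected) ** 2 * p for x, p in zip(values, probs))

def cdf(a):
    """F(a) = P(X <= a): sum of p(x_i) over all x_i <= a."""
    return sum(p for x, p in zip(values, probs) if x <= a)

print(total, expected, variance, cdf(0), cdf(1))  # 1 1/10 49/100 7/10 1
```

Exact fractions are used so that the check "sum of probabilities equals 1" is not disturbed by floating-point rounding.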
30. A binomial distribution requires: - sampling without replacement for an infinite population or with replacement for finite populations - each observation is classified into one of two exclusive categories: failure or success - there are two outcomes - the probability of each outcome is fixed from trial to trial
31. The probability of a faulty product appearing in the production process is 0.15. What theoretical distribution would you use to model the random variable "the number of faulty products in the production series made up of 1,000 units of products "? Binomial
32. If p = 0.5 and n = 4, then the corresponding binomial distribution is:
$p(x) = \binom{4}{x} \cdot 0.5^x \cdot 0.5^{4 - x}$
33. Suppose that a quiz consists of 20 True and False questions. A student hasn’t studied for the exam and will just randomly guess at all answers (assume True and False answers are equally likely). How would you find the probability that the student will get 8 or fewer answers correct?
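One possible way to approach the quiz question is to treat the number of correct answers as a binomial random variable with n = 20 and p = 0.5 and to add up p(0) through p(8). The function names below are illustrative, not part of the original material.

```python
from math import comb

def binom_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binom_cdf(x, n, p):
    """Probability of x or fewer successes: p(0) + p(1) + ... + p(x)."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

p_at_most_8 = binom_cdf(8, 20, 0.5)
print(round(p_at_most_8, 4))  # 0.2517
```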
34. The requirement that the probability of success remains constant from trial to trial is a property of the Binomial distribution.
35. The goal keeper scores 3 out of 1000 goals scored in football matches. What theoretical distribution would you use to approximate the binomial random variable "the number of goals scored by the goal-keeper per 500 goals scored "? Poisson distribution
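The quality of the Poisson approximation in the goalkeeper example can be sketched numerically. The snippet assumes n = 500 trials with success probability p = 3/1000, so the Poisson parameter is λ = n ∙ p = 1.5.

```python
from math import comb, exp, factorial

n, p = 500, 0.003  # 500 goals, probability 3/1000 that the goalkeeper scored
lam = n * p        # Poisson parameter, about 1.5

def binom_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    return exp(-lam) * lam**x / factorial(x)

# Compare the exact binomial probabilities with the Poisson approximation.
for x in range(5):
    print(x, round(binom_pmf(x, n, p), 4), round(poisson_pmf(x, lam), 4))
```

With a large n and a small p the two columns differ only in the third or fourth decimal place, which is why the Poisson distribution is a convenient substitute here.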
36. Which of the following statements about the binomial distribution is not correct? …
37. Mean for the Binomial distribution is equal: µ = E(X) = n ⋅ p
38. Variance for the Binomial distribution is equal: $\sigma^2 = n \cdot p \cdot q$
39. The Binomial distribution will be symmetric if: p = 0.5
40. Formula for calculating probability for a binomial random variable is:
$p(x) = \binom{n}{x} \cdot p^x \cdot (1 - p)^{n - x}$
41. For Binomial distribution, p(x) is: probability that among n trials will be realized exactly x successes
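The binomial formula, together with the mean n ∙ p, the variance n ∙ p ∙ q and the symmetry at p = 0.5, can be illustrated for the case n = 4, p = 0.5. This is only a sketch; the variable names are illustrative.

```python
from math import comb

n, p = 4, 0.5
q = 1 - p

# p(x) = C(n, x) * p^x * (1 - p)^(n - x) for x = 0, 1, ..., n
pmf = [comb(n, x) * p**x * q ** (n - x) for x in range(n + 1)]

mean = sum(x * px for x, px in enumerate(pmf))
variance = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))

print(pmf)             # [0.0625, 0.25, 0.375, 0.25, 0.0625]
print(mean, n * p)     # 2.0 2.0
print(variance, n * p * q)  # 1.0 1.0
```

The list of probabilities reads the same forwards and backwards, which is the symmetry expected for p = 0.5.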
42. What is missing (in the empty box) in the formula for calculating probability for a Binomial random variable:
$p(x) = \binom{n}{x} \cdot p^x \cdot (1 - p)^{n - x}$
43. What is missing (in the empty box) in the formula for calculating probability for a Binomial random variable:
$p(x) = \binom{n}{x} \cdot p^x \cdot (1 - p)^{n - x}$
44. The number of cars that arrive at the petrol station is a Poisson random variable with the mathematical expectation 2. The parameter λ of the Poisson distribution equals: 2
45. Formula for calculating probability for a Poisson random variable is:
$p(x) = \frac{e^{-\lambda} \cdot \lambda^x}{x!}$
46. What is missing (in the empty box) in the formula for calculating probability for a Poisson random variable:
$p(x) = \frac{e^{-\lambda} \cdot \lambda^x}{x!}$
47. What is missing (in the empty box) in the formula for calculating probability for a Poisson random variable:
$p(x) = \frac{e^{-\lambda} \cdot \lambda^x}{x!}$
48. For a Poisson distribution, the mean is: $\mu = E(X) = \lambda$
49. For a Poisson distribution, the variance is: $\sigma^2 = E[(X - \mu)^2] = \lambda$
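The statement that the mean and the variance of a Poisson distribution both equal λ can be checked numerically. The sketch below uses λ = 2 (the petrol station example) and truncates the infinite sum at x = 59, which is an assumption that is harmless here because the remaining probabilities are negligible.

```python
from math import exp, factorial

lam = 2  # parameter of the Poisson distribution

def poisson_pmf(x, lam):
    """p(x) = e^(-lambda) * lambda^x / x!"""
    return exp(-lam) * lam**x / factorial(x)

xs = range(60)  # truncation of the infinite support (assumption)
total = sum(poisson_pmf(x, lam) for x in xs)
mean = sum(x * poisson_pmf(x, lam) for x in xs)
variance = sum((x - mean) ** 2 * poisson_pmf(x, lam) for x in xs)

print(round(total, 6), round(mean, 6), round(variance, 6))
```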
VII. 2. CONTINUOUS PROBABILITY DISTRIBUTIONS
50. The total area under the curve of f(x) for continuous random variable is equal to: 1
51. A uniform distribution has constant probability, i.e. there is the same probability for each outcome
52. On next graph:
We have probability density function for: uniform probability curve
53. What characteristic of a continuous random variable is described by its “expected value”? The real parameter 𝜇 - MEAN
54. What mathematical operator do we use to calculate the expected value for continuous random variables? Integral operator
55. The cumulative distribution function of the random variable X in point a is defined as a probability that the random variable X will take the following value: lower or equal to some number a
56. The normal distribution has the following characteristics: - symmetry about its mean - the mode and median both equal to the mean - the inflection points of the curve occur one standard deviation away from the mean, i.e. at μ − σ and μ + σ
57. The standardized normal distribution is the normal distribution with a mean of zero and a variance of one.
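[Figure for question 52: density function f(x) of the uniform distribution, constant at height 1/(b − a) between a and b on the x axis]
Expected value of a continuous random variable (question 54): $\mu = E(X) = \int_{-\infty}^{\infty} x f(x) \, dx, \quad -\infty < x < \infty$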
58. “The time taken for a computer to boot up” is a random variable that follows a normal distribution with mean 30 seconds and standard deviation 5 seconds. Standardized (z) score for a boot up time of x = 30 seconds is: $z = \frac{30 - 30}{5} = 0$
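The standardization step can be sketched with Python's standard library. The boot-up parameters (mean 30 seconds, standard deviation 5 seconds) come from the question; the second value x = 40 is an extra illustrative input.

```python
from statistics import NormalDist

mu, sigma = 30, 5  # boot-up time: mean and standard deviation in seconds

def z_score(x):
    """Standardized value z = (x - mu) / sigma."""
    return (x - mu) / sigma

boot_time = NormalDist(mu, sigma)

print(z_score(30))        # 0.0
print(z_score(40))        # 2.0 (illustrative extra value)
print(boot_time.cdf(30))  # 0.5, half of the boot-up times are at most the mean
```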
59. For the normal distribution relation between mean, median and mode is: The mode and median are both equal to the mean.
60. The mean and median are the same for a normal distribution. TRUE
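61. What is missing (in the empty box) in the formula for the probability density function of the normal distribution:
$f(x_i) = \frac{1}{\sigma \cdot \sqrt{2 \cdot \pi}} \cdot e^{-\frac{1}{2} \cdot \left(\frac{x_i - \mu}{\sigma}\right)^2}$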
62. Which of the following is not true about the normal distribution:
63. In the standardized normal distribution the value of the cumulative distribution function F(z) for z = 0 is: 0.5
64. What is missing (in the empty box) in the formula for the cumulative distribution function:
$F(z_0) = p(z \le z_0) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z_0} e^{-\frac{z^2}{2}} \, dz$
65. What mathematical operator do we use to determine the cumulative distribution function in continuous random variables? INTEGRAL
66. According to the rules for determining probability for different kinds of cases with the standardized normal distribution, the probability for $(Z \le -z_0)$ is equal: $1 - F(z_0)$
67. According to the rules for determining probability for different kinds of cases with the standardized normal distribution, the probability for $(-z_0 < Z \le z_0)$ is equal:
$F(z_0) - F(-z_0) = 2F(z_0) - 1$
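The two rules for the standardized normal distribution follow from the symmetry of the curve and can be checked numerically. The value 1.96 used for the cut-off point is an assumed example, not a value from the questions.

```python
from statistics import NormalDist

Z = NormalDist()  # standardized normal distribution: mean 0, variance 1
z0 = 1.96         # illustrative cut-off point (assumption)

# Rule for the left tail: P(Z <= -z0) = 1 - F(z0)
left_tail_direct = Z.cdf(-z0)
left_tail_rule = 1 - Z.cdf(z0)

# Rule for the symmetric interval: P(-z0 < Z <= z0) = 2 * F(z0) - 1
central_direct = Z.cdf(z0) - Z.cdf(-z0)
central_rule = 2 * Z.cdf(z0) - 1

print(round(left_tail_direct, 4), round(left_tail_rule, 4))
print(round(central_direct, 4), round(central_rule, 4))
```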
68. If a variable X is equal to the sum of squares of independent standard normal variables $X_i$, $i = 1, \dots, n$, we say that variable X follows: Chi-square distribution with n degrees of freedom.
69. The mean for chi-square distribution is equal to: number of degrees of freedom
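A small simulation can make the chi-square construction concrete: each draw is a sum of n squared standard normal values, and the average of many draws should be close to the number of degrees of freedom. The choices n = 5, 20000 repetitions and the random seed are assumptions made only for this sketch.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible
n = 5           # degrees of freedom (illustrative choice)

def chi_square_draw(n):
    """Sum of squares of n independent standard normal values."""
    return sum(random.gauss(0, 1) ** 2 for _ in range(n))

draws = [chi_square_draw(n) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)

print(round(sample_mean, 2))  # close to n = 5
```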
70. In cases where we need to make a decision on the significant difference of actual (observed) and theoretical (expected) frequency, we will apply: Chi-square distribution
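71. The random variable T that follows Student's distribution with n degrees of freedom is given by: $T = \frac{X}{\sqrt{Y / n}}$, where X is a standard normal variable and Y is an independent chi-square variable with n degrees of freedom
72. The mean for Student distribution with n degrees of freedom is equal to: $\mu = E(T) = 0$
73. The Student distribution is: continuous probability distribution
74. F follows the Fisher-Snedecor distribution with m and n degrees of freedom: $F = \frac{X / m}{Y / n}$, where X and Y are independent chi-square variables with m and n degrees of freedom
75. To test hypotheses about the equality of two sample variances, we will use the F distribution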
VIII. SAMPLING
76. The shape of the distribution of an unbiased sample is: similar to shape of population it comes from
77. How many different samples of size n can we choose from the basic set of N elements?
78. If a sample is unbiased, then it can be representative of the population. TRUE
79. An unbiased sample has similar characteristics to the population, and we can use these to make inferences about the population. TRUE
80. A biased sample is representative of the target population. FALSE
81. If we try to predict the shape of the population distribution from information about biased sample, we will: end up with wrong result
82. If we try to predict the shape of the population distribution from information about unbiased sample, we will: get right result
83. If we know the shape of unbiased sample distribution, we can use it to predict that of the population to a reasonable level of confidence. TRUE
Number of different samples of size n from a basic set of N elements (question 77): $C_N^n = \binom{N}{n} = \frac{N!}{n! \cdot (N - n)!}$
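The count of possible samples of size n from N elements, N! / (n! ∙ (N − n)!), can be verified by listing the samples. The values N = 5 and n = 2 are assumed for illustration.

```python
from itertools import combinations
from math import comb, factorial

N, n = 5, 2  # illustrative sizes of the basic set and of the sample

# Formula: N! / (n! * (N - n)!)
by_formula = factorial(N) // (factorial(n) * factorial(N - n))

# Direct enumeration of all samples without replacement, order ignored.
all_samples = list(combinations(range(1, N + 1), n))

print(by_formula, comb(N, n), len(all_samples))  # 10 10 10
```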
84. We can’t use biased sample to make inferences about the population because the sample and population have different characteristics. TRUE
85. Probability sampling is where every item has a calculable chance of selection
86. We will need to take a larger sample if: - population is large - population is heterogeneous - reliable data are required - variability in population is high - we use higher level of significance
87. A sampling frame lists: the entire relevant population
88. A list of all of the units in a population is called a sampling frame.
89. In non-probability sampling we don’t know the probability of selection of items
90. For simple random sample: - every element of population has an equal chance of being selected for the sample - population is homogeneous - best suits situations where not much information is available
91. For stratified sample: - population is heterogeneous - final sample is composed of samples selected from each group - sample size from each stratum is proportional to relative size of the strata
92. For systematic sample: each element in population has a known and equal probability of selection
93. To ensure that particular different groups within a population are adequately represented in the sample, we will create: STRATIFIED SAMPLE
94. Type of samples that is free of classification error and that requires minimum advance knowledge of the population is: SIMPLE RANDOM SAMPLE
95. Sampling with replacement means that: when you have selected each unit and recorded relevant information about it, you put it back into the population.
96. Sampling without replacement means that: the sampling unit isn’t replaced back into the population
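The difference between the two ways of sampling can be sketched with the standard library: `random.sample` never returns the same unit twice, while `random.choices` may. The population of ten numbered units and the seed are assumptions for the example.

```python
import random

random.seed(42)  # fixed seed so the sketch is reproducible
population = list(range(1, 11))  # ten numbered units (illustrative)

# Without replacement: a selected unit is not put back, so no repeats.
without_replacement = random.sample(population, k=5)

# With replacement: every unit is put back, so repeats are possible.
with_replacement = random.choices(population, k=5)

print(without_replacement)
print(with_replacement)
```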
97. Stratification is the process of: grouping members of the population into relatively homogeneous subgroups before sampling
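A proportional stratified sample can be sketched as follows. The three strata, their sizes and the total sample size of 50 are hypothetical; the only idea taken from the questions is that a simple random sample is drawn inside each homogeneous subgroup.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical population of 1000 students split into three strata.
strata = {
    "first_year": list(range(0, 600)),
    "second_year": list(range(600, 900)),
    "third_year": list(range(900, 1000)),
}
sample_size = 50
population_size = sum(len(units) for units in strata.values())

# Draw from each stratum in proportion to its share of the population.
stratified_sample = {
    name: random.sample(units, round(sample_size * len(units) / population_size))
    for name, units in strata.items()
}

for name, units in stratified_sample.items():
    print(name, len(units))  # 30, 15 and 5 units
```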
98. A professor conducting some research might use student volunteers to constitute a sample. In that case he will work with: CONVENIENCE SAMPLING
99. If the person most knowledgeable on the subject of the study selects elements of the population that he or she feels are most representative of the population, we work with: JUDGEMENT SAMPLING
100. The Central limit theorem says: No matter what probability distribution describes the population, if the sample size n is large enough (more than 30), then the population of all possible sample means is approximately normal with mean μ and standard deviation $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
101. Theorem: “No matter what probability distribution describes the population, if the sample size n is large enough, then the population of all possible sample means is approximately normal with mean μ and standard deviation $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$” is: CENTRAL LIMIT THEOREM
102. The Central limit theorem says that the sampling distribution of the sample mean is approximately normal under certain conditions. Which of following is a necessary condition for the Central limit Theorem to be used?
103. The Central limit theorem is based on the sample size being large (> 30). TRUE
104. The central limit theorem is not based on the samples being drawn on a random basis. FALSE
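The Central limit theorem can be illustrated by simulation. The sketch assumes a clearly non-normal population (exponential with mean 1 and standard deviation 1), samples of size n = 36 and 5000 repetitions; all of these are choices made for the example.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)  # fixed seed so the sketch is reproducible

mu, sigma = 1.0, 1.0  # mean and standard deviation of the exponential population
n = 36                # sample size, larger than 30
num_samples = 5000    # number of random samples drawn

# Mean of each random sample of size n.
sample_means = [
    sum(random.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

print(round(mean(sample_means), 3), mu)               # close to mu
print(round(stdev(sample_means), 3), sigma / sqrt(n))  # close to sigma / sqrt(n)
```

Even though single observations are strongly skewed, the sample means cluster around μ with a spread close to σ/√n.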
105. Which one of the following sampling examples would generally lead to the least reliable statistical inferences about the population from which the sample has been selected?
106. The sample was taken by randomly selecting one student from the administration’s official list of students and then choosing every 100th. This is an example of what kind of sampling? SYSTEMATIC
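The systematic procedure can be sketched in a few lines: pick a random starting point and then take every k-th unit from the list. The list of 1000 student identifiers and the step of 100 are assumed for illustration.

```python
import random

random.seed(7)  # fixed seed so the sketch is reproducible

# Hypothetical official list of 1000 students.
population = [f"student_{i:04d}" for i in range(1, 1001)]
step = 100  # take every 100th student

start = random.randrange(step)   # random starting position within the first 100
sample = population[start::step]  # then every 100th student on the list

print(len(sample))  # 10
print(sample[:3])
```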
107. We make an inference if: we generalize from the sample to the population
IX. CONFIDENCE INTERVALS
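108. In the general form for a confidence interval, $P(\varphi - h \le \theta \le \varphi + h) = 1 - \alpha$, θ represents: parameter from population
109. In the general form for a confidence interval, $P(\varphi - h \le \theta \le \varphi + h) = 1 - \alpha$, ϕ represents: statistic from sample
110. In the general form for a confidence interval, $P(\varphi - h \le \theta \le \varphi + h) = 1 - \alpha$, h represents: surroundings (margin of error)
111. In the general form for a confidence interval, $P(\varphi - h \le \theta \le \varphi + h) = 1 - \alpha$, (1 − α) represents: confidence
112. In the general form for a confidence interval, $P(\varphi - h \le \theta \le \varphi + h) = 1 - \alpha$, α represents: type I error
113. Rule “a larger sample leads to a larger sampling error” is: FALSE (a larger sample leads to a smaller sampling error)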
114. In point estimation we use the data from the sample to compute a value of sample statistic that serves as an estimate of a population parameter. TRUE
115. Sampling error is: (chance, random error) the absolute difference between an unbiased point estimate and the corresponding population parameter.
116. Sample bias is: constant error due to inadequate design
117. Unbiased estimation for population parameter means the expected value (arithmetic mean) for statistics in the sample is equal to the value of the parameter from the population. TRUE
118. In determining confidence intervals for the population arithmetic mean on the basis of a small sample, in case that the population variance is unknown, we use the following distribution: STUDENT’S DISTRIBUTION
119. In determining confidence intervals for the population arithmetic mean on the basis of a large sample, in case that the population variance is unknown, we use the following distribution: NORMAL DISTRIBUTION
120. In determining confidence intervals for the population arithmetic mean, in case that the population variance is known, we always use the following distribution: NORMAL DISTRIBUTION
121. In determining confidence intervals for the population arithmetic mean on the basis of a large sample, in case that the population variance is unknown, we will use formula:
122. In determining confidence intervals for the population arithmetic mean on the basis of a small sample, in case that the population variance is unknown, we will use formula:
123. What is missing (in the empty box) in the formula for the confidence interval for the population arithmetic mean on the basis of a small sample, in case that the population variance is unknown: t
124. If confidence interval for the population arithmetic mean looks like
then we work on: small sample, unknown standard deviation, t distribution
125. The t distribution, in determining confidence intervals for the population arithmetic mean, is used when: the standard deviation unknown and the sample is small
126. In determining confidence intervals for the population arithmetic mean, in case that the population variance is known, we will use formula:
$P(\bar{X} - z \cdot \sigma_{\bar{X}} \le \mu \le \bar{X} + z \cdot \sigma_{\bar{X}}) = 1 - \alpha$
127. If confidence interval for the population arithmetic mean looks like
then we work on: known standard deviation, z distribution
128. What is missing (in the empty box) in the formula for the confidence interval for the population arithmetic mean on the basis of a large sample, in case that the population variance is unknown: z
129. What is missing (in the empty box) in the formula for the confidence interval for the population arithmetic mean in case that the population variance is known: $\frac{\sigma}{\sqrt{n}}$
130. What is missing (in the empty box) in the formula for the confidence interval for the population arithmetic mean on the basis of a large sample, in case that the population variance is unknown: $\frac{S}{\sqrt{n}}$
131. If confidence interval for the population arithmetic mean looks like
then we work on: z distribution (normal)
132. In determining confidence intervals for the population variance on the basis of a large sample, we use the following distribution: NORMAL DISTRIBUTION
133. In determining confidence intervals for the population variance on the basis of a small sample, we use the following distribution: Chi-square
134. In determining confidence intervals for the variance on the basis of a large sample, we will use formula:
135. What is missing (in the empty box) in the formula for the confidence interval for the variance on the basis of a large sample: $2 \cdot n \cdot S^2$
136. In determining confidence intervals for the variance on the basis of a small sample, we will use formula:
137. What is missing (in the empty box) in the formula for the confidence interval for the variance on the basis of a small sample: $\chi^2_{n-1,\,1-\alpha/2}$
138. The formula for determining the standard error of the mean, in case that the population variance is known, is: $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
139. What is missing (in the empty box) in the formula for calculating the standard error of the mean $\sigma_{\bar{X}}$, in case that the population variance is known: $\sqrt{n}$
140. What is missing (in the empty box) in the formula for calculating the estimate of the standard error of the mean $S_{\bar{X}}$, in case that the population variance is unknown: $\sqrt{n}$
141. The estimated value for the standard error of a proportion is equal:
142. The standard error of a proportion is equal:
$\sigma_{\hat{p}} = \sqrt{\frac{\hat{p} \cdot \hat{q}}{n}}$
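As a rough numeric sketch of the standard error of a proportion, the snippet reuses the exam figures mentioned earlier (70 of 100 students passed) as an assumed sample; the formula used is the square root of p ∙ q / n with q = 1 − p.

```python
from math import sqrt

n = 100      # sample size (assumed: the 100 students from the exam example)
passed = 70  # number of "successes" in the sample

p_hat = passed / n  # sample proportion
q_hat = 1 - p_hat   # proportion of the complementary outcome

# Estimated standard error of the proportion: sqrt(p * q / n)
std_error = sqrt(p_hat * q_hat / n)

print(round(std_error, 4))  # 0.0458
```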
143. What is missing (in the empty box) in the formula for calculating the estimate of the standard error of a proportion: $\hat{q}$
144. What is missing (in the empty box) in the formula for calculating the standard error of a proportion: $\hat{q}$
145. The formula for determining the estimated value for the standard error of the mean $S_{\bar{X}}$, in case that the population variance is unknown
146. In the sample of 42 elements, we calculated the mean 54 and the variance of 24.8. We wish to determine the interval in which the population mean would be, with 99% certainty. What frequency distribution do we need to apply? NORMAL DISTRIBUTION
147. If we decide to use 99% confidence interval rather than 95% confidence interval, we would expect the confidence interval to become: Wider
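The effect of the confidence level on the interval width can be sketched with the figures from question 146 (n = 42, mean 54, variance 24.8). The sketch assumes the standard error is estimated as S/√n and uses the normal distribution because the sample is large; some textbooks divide by √(n − 1) instead, which would change the numbers slightly.

```python
from math import sqrt
from statistics import NormalDist

n, sample_mean, sample_variance = 42, 54, 24.8
std_error = sqrt(sample_variance / n)  # assumed estimate S / sqrt(n)

def z_interval(confidence):
    """Interval sample_mean -/+ z * std_error for the given confidence 1 - alpha."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    h = z * std_error                        # margin of error
    return sample_mean - h, sample_mean + h

low95, high95 = z_interval(0.95)
low99, high99 = z_interval(0.99)

print(round(low95, 2), round(high95, 2))
print(round(low99, 2), round(high99, 2))
```

The 99% interval is wider than the 95% interval because a larger critical value z is needed to cover the population mean with higher confidence.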
148. In determining confidence intervals for the proportion, we use formula:
151. In the sample of 20 elements, we calculated the mean 50 and the variance of 12. We wish to determine the interval in which the population variance would be, with 95% certainty. What model for confidence interval do we need to apply? chi-square
152. In the sample of 50 elements, we calculated the mean 50 and the variance of 12. We wish to determine the interval in which the population mean would be, with 95% certainty. What model for confidence interval do we need to apply?
153. In the sample of 40 elements, we calculated the mean 50 and the variance of 12. We wish to determine the interval in which the population variance would be, with 99% certainty. What model for confidence interval do we need to apply?