QBM117 Business Statistics
Probability Distributions
Random variables and probability distributions
Objectives
• To define a random variable.
• To define the probability distribution for a random variable.
• To distinguish between a discrete random variable and a continuous random variable.
• To introduce discrete probability distributions.
• To calculate the mean, variance and standard deviation of a discrete probability distribution.
Random Variables
• A random variable is a variable whose numerical value is determined by the outcome of a random experiment.
• It is random because the value it assumes depends on chance.
Examples of Random Variables
• Imagine drawing a student at random from the student body.
• The student’s height, weight, weekly income and grade point average are all numerical values describing properties of the randomly selected student.
• They are all random variables.
Random Experiment
Draw a student at random from the student body
Random Variable
Height (meters) of the randomly selected student
Possible values for the random variable
Any value between about 1.5 m and 2 m
Random Experiment
Toss two coins
Random Variable
The number of heads
Possible values for the random variable
0, 1 or 2
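As a small illustration of how a random variable attaches a number to each outcome of an experiment, the sketch below lists the outcomes of tossing two coins and counts the heads in each. The names `outcomes` and `number_of_heads` are made up for this example.

```python
from itertools import product

# Every outcome of tossing two coins, e.g. ('H', 'T')
outcomes = list(product("HT", repeat=2))

def number_of_heads(outcome):
    """The random variable X: maps an outcome to its number of heads."""
    return outcome.count("H")

# The distinct values X can assume
possible_values = sorted({number_of_heads(o) for o in outcomes})
print(possible_values)  # [0, 1, 2]
```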
Random Experiment
Audit 50 tax returns
Random Variable
The number of returns containing errors
Possible values for the random variable
0, 1, 2,…,50
Random Experiment
Weigh a shipment of goods
Random Variable
The weight of the shipment
Possible values for the random variable
Any value greater than or equal to 0
Notation
• We distinguish between a random variable and the values it can assume by following the convention of using capital letters such as X and Y to denote random variables, and lower-case letters such as x and y to denote their values.
Discrete and Continuous Random Variables
• There are two types of random variables
- discrete
- continuous
• They are distinguished from one another by the number of possible values they can assume.
Discrete Random Variables
• A discrete random variable has a countable number of possible values: either finitely many, or a list that can be counted off such as 0, 1, 2, …
• For example
- the number of defective items in a production batch
- the number of telephone calls received in a given hour
- the number of customers served in a hotel reception on a given day
Continuous Random Variables
• A continuous random variable can assume any value in an interval, so it has an uncountably infinite number of possible values.
• For example
- the duration of long-distance telephone calls
- the lifetime of a certain brand of tyres
- the total annual sales of a firm
- the rate of return of a particular stock
Examples revisited
Random Experiment
Draw a student at random from the student body
Random Variable
Height (meters) of the randomly selected student
Possible values for the random variable
Any value between about 1.5 m and 2 m
Continuous or Discrete?
Continuous
Random Experiment
Toss two coins
Random Variable
The number of heads
Possible values for the random variable
0, 1 or 2
Continuous or Discrete?
Discrete
Random Experiment
Audit 50 tax returns
Random Variable
The number of returns containing errors
Possible values for the random variable
0, 1, 2,…,50
Continuous or Discrete?
Discrete
Random Experiment
Weigh a shipment of goods
Random Variable
The weight of the shipment
Possible values for the random variable
Any value greater than or equal to 0
Continuous or Discrete?
Continuous
Probability Distributions
• A probability distribution of a random variable X tells us what the possible values of X are and the associated probabilities P(X=x) or p(x).
• There are two types of probability distributions
- discrete probability distribution
- continuous probability distribution
Discrete Probability Distributions
• The probability distribution of a discrete random variable is a table, formula or graph that lists all the possible values of the random variable and their associated probabilities.
X x1 x2 … xn
P(X=x) p1 p2 … pn
Requirements of Discrete Probability Distributions
If a discrete random variable X can take values
x1, x2,…, xn with probabilities p(x1), p(x2),…, p(xn) , the probabilities must satisfy two requirements:
1. Every probability p(xi) is a number between 0 and 1:
\[ 0 \le p(x_i) \le 1 \quad \text{for } i = 1, 2, \ldots, n \]
2. The probabilities must add to 1:
\[ \sum_{i=1}^{n} p(x_i) = 1 \]
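The two requirements can be checked mechanically. The following is a minimal sketch, assuming the distribution is stored as a Python dictionary that maps each value x to p(x); the function name `is_valid_distribution` and the tolerance are choices made for this illustration.

```python
import math

def is_valid_distribution(pmf, tol=1e-9):
    """Check both requirements for a dict mapping each value x to p(x)."""
    # Requirement 1: every probability lies between 0 and 1
    in_range = all(0 <= p <= 1 for p in pmf.values())
    # Requirement 2: the probabilities add to 1 (allowing for rounding error)
    sums_to_one = math.isclose(sum(pmf.values()), 1.0, abs_tol=tol)
    return in_range and sums_to_one

print(is_valid_distribution({0: 0.5, 1: 0.5}))  # True
print(is_valid_distribution({0: 0.7, 1: 0.5}))  # False
```

The tolerance is there because decimal probabilities are stored as binary floating-point numbers, so a sum that is exactly 1 on paper may differ from 1.0 by a tiny amount.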
Example 1
Consider a study of 300 households in a town on the coast of Queensland. As part of this study, data were collected showing the number of children in each household. The following results were obtained: 54 of the households had no children, 117 had 1 child, 72 had 2 children, 42 had 3 children, 12 had 4 children, and 3 had 5 children.
Consider the experiment of randomly selecting one of these households to participate in a follow-up study.
Let X = number of children in the household selected.
The possible values of X are 0, 1, 2, 3, 4, and 5.
The probability that the selected household has no children is 54/300 = 0.18.
Hence P(X=0) = 0.18
The probability that the selected household has 1 child is 117/300 = 0.39.
Hence P(X=1) = 0.39
The probability that the selected household has 2 children is 72/300 = 0.24.
Hence P(X=2) = 0.24
The probability that the selected household has 3 children is 42/300 = 0.14.
Hence P(X=3) = 0.14
The probability that the selected household has 4 children is 12/300 = 0.04.
Hence P(X=4) = 0.04
The probability that the selected household has 5 children is 3/300 = 0.01.
Hence P(X=5) = 0.01
The probability distribution of X can be presented in tabular form.
Note that each of the probabilities is between 0 and 1, and that the probabilities add to 1.
X 0 1 2 3 4 5
P(X=x) 0.18 0.39 0.24 0.14 0.04 0.01
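The table for Example 1 can be rebuilt from the household counts by dividing each count by the total. This is a sketch only; the variable names are illustrative, and `Fraction` is used so that the check that the probabilities add to 1 is exact.

```python
from fractions import Fraction

# Number of households with 0, 1, ..., 5 children (from Example 1)
counts = {0: 54, 1: 117, 2: 72, 3: 42, 4: 12, 5: 3}
total = sum(counts.values())  # 300 households

# Relative frequency of each value gives P(X = x)
pmf = {x: Fraction(count, total) for x, count in counts.items()}

for x, p in pmf.items():
    print(x, float(p))

print(sum(pmf.values()) == 1)  # True
```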
The probability distribution of X can also be presented in terms of the following formula
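\[
p(x) =
\begin{cases}
0.18 & \text{if } x = 0 \\
0.39 & \text{if } x = 1 \\
0.24 & \text{if } x = 2 \\
0.14 & \text{if } x = 3 \\
0.04 & \text{if } x = 4 \\
0.01 & \text{if } x = 5
\end{cases}
\]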
It can also be presented in the form of a graph.
[Figure: graph of P(X=x) against X for X = 0, 1, …, 5; the vertical axis runs from 0 to 0.45.]
Using a Probability Distribution
• A primary advantage of defining a random variable and its probability distribution is that once the probability distribution is known, it is relatively easy to determine the probability of a variety of events that may be of interest to a decision maker.
• We interpret the probabilities the same way we did last week when we were looking at probability.
Consider Example 1:
P(X=4) = 0.04 implies that the probability that a randomly selected household has 4 children is 0.04
• We can also apply the addition rule for mutually exclusive events.
Consider Example 1:
The values of X are mutually exclusive; a household can have 0, 1, 2, 3, 4 or 5 children.
The probability that a randomly selected household has 3 or more children is
\[ P(X \ge 3) = P(X=3) + P(X=4) + P(X=5) = 0.14 + 0.04 + 0.01 = 0.19 \]
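The addition rule amounts to summing the probabilities of the values that make up the event. A short sketch using the Example 1 distribution (the dictionary layout and variable names are assumptions of this illustration):

```python
# Probability distribution of X from Example 1
pmf = {0: 0.18, 1: 0.39, 2: 0.24, 3: 0.14, 4: 0.04, 5: 0.01}

# Event "3 or more children": add the probabilities of x = 3, 4 and 5
p_three_or_more = sum(p for x, p in pmf.items() if x >= 3)

print(round(p_three_or_more, 2))  # 0.19
```

Changing the condition `x >= 3` to any other condition on x gives the probability of the corresponding event.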
Example 2
Using historical records, the personnel manager of a plant has determined the probability distribution of X, the number of employees absent per day. It is
X 0 1 2 3 4 5 6 7
P(X=x) 0.005 0.025 0.310 0.340 0.220 0.080 0.019 0.001
What is the probability that there are no absent employees on any given day?
What is the probability that there are no more than 2 employees absent on any given day?
What is the probability that there are no absent employees on any given day?
P(X=0) = 0.005
What is the probability that there are at most 2 absent employees on any given day?
\[ P(X \le 2) = P(X=0) + P(X=1) + P(X=2) = 0.005 + 0.025 + 0.310 = 0.340 \]
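"At most" questions like this one use running totals of the probabilities. The sketch below accumulates the Example 2 probabilities so that P(X ≤ x) can be read off for any x in the table; the variable names are illustrative only.

```python
from itertools import accumulate

# Example 2: number of employees absent per day
values = [0, 1, 2, 3, 4, 5, 6, 7]
probs = [0.005, 0.025, 0.310, 0.340, 0.220, 0.080, 0.019, 0.001]

# Running totals: cumulative[x] is P(X <= x)
cumulative = dict(zip(values, accumulate(probs)))

print(round(cumulative[0], 3))  # 0.005
print(round(cumulative[2], 3))  # 0.34
```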
Expected Value and Variance
• In Topic 1 we calculated sample and population means and variances for frequency distributions.
• A probability distribution is the distribution of a population.
• We can calculate the population mean and variance for probability distributions.
Expected Value
The mean, or expected value, of a discrete random variable X is obtained by
• multiplying each possible value of X by its associated probability
• and then summing the resulting products.
\[ E(X) = \mu = \sum_{i=1}^{n} x_i \, p(x_i) \]
Example 1 revisited
X 0 1 2 3 4 5
P(X=x) 0.18 0.39 0.24 0.14 0.04 0.01
The expected number of children per household is
\[ E(X) = 0(0.18) + 1(0.39) + 2(0.24) + 3(0.14) + 4(0.04) + 5(0.01) = 1.5 \]
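The same calculation can be written as a one-line sum over the table. A minimal sketch, with the helper name `expected_value` invented for this illustration:

```python
# Probability distribution of X from Example 1
pmf = {0: 0.18, 1: 0.39, 2: 0.24, 3: 0.14, 4: 0.04, 5: 0.01}

def expected_value(pmf):
    """Multiply each value by its probability and add up the products."""
    return sum(x * p for x, p in pmf.items())

mean_children = expected_value(pmf)
print(round(mean_children, 2))  # 1.5
```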
Variance
The variance of a discrete random variable X is found by
• subtracting the mean from each value and squaring this difference,
• multiplying each squared difference by the associated probability,
• and then summing the resulting products.
\[ V(X) = \sigma^2 = \sum_{i=1}^{n} (x_i - \mu)^2 \, p(x_i), \quad \text{where } \mu = E(X) \]
• A more computationally efficient method of calculating the variance of a discrete random variable is to use the following formula
\[ V(X) = \sigma^2 = \sum_{i=1}^{n} x_i^2 \, p(x_i) - \mu^2, \quad \text{where } \mu = E(X) \]
• This is just a rearrangement of the formula on the previous slide.
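For readers who want to see why the two variance formulas agree, the rearrangement can be sketched by expanding the square and using two facts stated earlier: the probabilities add to 1, and the sum of each value times its probability is the mean μ.

```latex
\begin{aligned}
V(X) &= \sum_{i=1}^{n} (x_i - \mu)^2 \, p(x_i) \\
     &= \sum_{i=1}^{n} x_i^2 \, p(x_i) \;-\; 2\mu \sum_{i=1}^{n} x_i \, p(x_i) \;+\; \mu^2 \sum_{i=1}^{n} p(x_i) \\
     &= \sum_{i=1}^{n} x_i^2 \, p(x_i) \;-\; 2\mu \cdot \mu \;+\; \mu^2 \cdot 1 \\
     &= \sum_{i=1}^{n} x_i^2 \, p(x_i) \;-\; \mu^2
\end{aligned}
```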
Example 1 revisited
X 0 1 2 3 4 5
P(X=x) 0.18 0.39 0.24 0.14 0.04 0.01
The variance of the number of children per household is
\[
\begin{aligned}
V(X) &= 0^2(0.18) + 1^2(0.39) + 2^2(0.24) + 3^2(0.14) + 4^2(0.04) + 5^2(0.01) - 1.5^2 \\
     &= 3.5 - 1.5^2 \\
     &= 1.25
\end{aligned}
\]
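As a check that the definition and the computational formula give the same answer, the sketch below computes the Example 1 variance both ways. The variable names are illustrative.

```python
# Probability distribution of X from Example 1
pmf = {0: 0.18, 1: 0.39, 2: 0.24, 3: 0.14, 4: 0.04, 5: 0.01}

mu = sum(x * p for x, p in pmf.items())  # expected value

# Definition: probability-weighted squared deviations from the mean
var_definition = sum((x - mu) ** 2 * p for x, p in pmf.items())

# Computational formula: sum of x^2 p(x), minus the squared mean
var_shortcut = sum(x ** 2 * p for x, p in pmf.items()) - mu ** 2

print(round(var_definition, 2), round(var_shortcut, 2))  # 1.25 1.25
```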
Standard Deviation
• Following on from Topic 1, the standard deviation can be found by taking the square root of the variance
Example 1 revisited
X 0 1 2 3 4 5
P(X=x) 0.18 0.39 0.24 0.14 0.04 0.01
The standard deviation of X, the number of children per household, is
\[ \sqrt{V(X)} = \sqrt{1.25} = 1.12 \text{ (2 d.p.)} \]
Example 2 revisited
X = the number of employees absent per day
Determine the mean and standard deviation of the number of employees absent per day.
X 0 1 2 3 4 5 6 7
P(X=x) 0.005 0.025 0.310 0.340 0.220 0.080 0.019 0.001
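The mean number of employees absent per day is
\[ E(X) = 0(0.005) + 1(0.025) + 2(0.310) + 3(0.340) + 4(0.220) + 5(0.080) + 6(0.019) + 7(0.001) = 3.066 \]
The variance of the number of employees absent per day is
\[
\begin{aligned}
V(X) &= 0^2(0.005) + 1^2(0.025) + 2^2(0.310) + 3^2(0.340) + 4^2(0.220) + 5^2(0.080) + 6^2(0.019) + 7^2(0.001) - (3.066)^2 \\
     &= 10.578 - (3.066)^2 \\
     &= 1.178 \text{ (3 d.p.)}
\end{aligned}
\]
so the standard deviation of the number of employees absent per day is
\[ \sqrt{V(X)} = \sqrt{1.178} = 1.09 \text{ (2 d.p.)} \]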
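The Example 2 figures can be reproduced with a few lines of Python. This is a sketch using the computational formula for the variance; the variable names are chosen for this illustration.

```python
import math

# Example 2: number of employees absent per day
pmf = {0: 0.005, 1: 0.025, 2: 0.310, 3: 0.340,
       4: 0.220, 5: 0.080, 6: 0.019, 7: 0.001}

mean = sum(x * p for x, p in pmf.items())
second_moment = sum(x ** 2 * p for x, p in pmf.items())  # sum of x^2 p(x)
variance = second_moment - mean ** 2
std_dev = math.sqrt(variance)

print(round(mean, 3))      # 3.066
print(round(variance, 3))  # 1.178
print(round(std_dev, 2))   # 1.09
```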
Example 3 (Exercise 5.19)
The owner of a small firm has just purchased a personal computer, which she expects will serve her for the next two years. The owner has just been told that she must buy a surge suppressor to provide protection for her new hardware against possible surges or variations in the electrical current. Her son David, a recent university graduate, advises that an inexpensive suppressor could be purchased that would provide protection against one surge only. He notes that the amount of damage done without a suppressor would depend on the extent of the surge. David conservatively estimates that, over the next two years, there is a 1% chance of incurring $400 damage and a 2% chance of incurring $200 damage. But the probability of incurring $100 damage is 0.1.
1. How much should the owner be willing to pay for a surge suppressor?
2. Determine the standard deviation of the possible amounts of damage.
To answer these questions we need to construct the probability distribution for the amount of damage incurred.
Let X = the amount of damage incurred.
David conservatively estimates that, over the next two years, there is a 1% chance of incurring $400 damage and a 2% chance of incurring $200 damage. But the probability of incurring $100 damage is 0.1.
X 0 100 200 400
P(X=x) 0.87 0.10 0.02 0.01
1. To determine how much the owner should be willing to pay for a surge suppressor we need to work out the expected amount of damage to be incurred.
\[ E(X) = 0(0.87) + 100(0.10) + 200(0.02) + 400(0.01) = 18 \]
The expected amount of damage to be incurred is $18, therefore the owner should be willing to pay up to $18.
2. To determine the standard deviation of the possible amounts of damage we need to calculate the variance and then take the square root of the variance to obtain the standard deviation.
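\[
\begin{aligned}
V(X) &= 0^2(0.87) + 100^2(0.10) + 200^2(0.02) + 400^2(0.01) - 18^2 \\
     &= 3400 - 18^2 \\
     &= 3076
\end{aligned}
\]
\[ \sqrt{V(X)} = \sqrt{3076} = 55.46 \]
Hence the standard deviation of the possible amounts of damage is $55.46.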
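One way to build intuition for these two numbers is to simulate many two-year periods and look at the average and spread of the simulated damage. This is a simulation sketch, not part of the exercise: the seed and the number of trials are arbitrary choices, and the results only approximate the exact values of 18 and 55.46.

```python
import math
import random
import statistics

# Example 3: amount of damage incurred and its probability
damages = [0, 100, 200, 400]
probs = [0.87, 0.10, 0.02, 0.01]

random.seed(0)      # fixed seed so the run is repeatable (arbitrary choice)
n_trials = 100_000  # number of simulated two-year periods (arbitrary choice)
sample = random.choices(damages, weights=probs, k=n_trials)

# Long-run average damage and its spread across the simulated periods
sim_mean = statistics.fmean(sample)
sim_sd = math.sqrt(sum((d - sim_mean) ** 2 for d in sample) / n_trials)

print(sim_mean, sim_sd)
```

With this many trials the simulated mean and standard deviation should land close to the calculated values, which is what interpreting E(X) as a long-run average suggests.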
Reading for next lecture
• Chapter 5 Section 5.4
Exercises
• 5.1
• 5.5
• 5.11
• 5.22 a and b only