© D.J.Dunn www.freestudy.co.uk 1
MATHEMATICS FOR ENGINEERS
STATISTICS
TUTORIAL 4 – PROBABILITY DISTRIBUTIONS
CONTENTS
Sample Space
Accumulative Probability
Probability Distributions
Binomial Distribution
Normal Distribution
Poisson Distribution
You will find a useful calculation aid for all probability distributions at this web address:
http://www.stat.vt.edu/~sundar/java/applets/Distributions.html#POISSON
This tutorial is a continuation of outcome 4 tutorial 1
SAMPLE SPACE AND ACCUMULATIVE FREQUENCY
Consider the case when two dice are rolled and the aim is to guess
the resulting score. n = 6 and x = 2 so there are $6^2 = 36$ permutations.
There are only 11 possible scores (2 to 12). You would only need 11 guesses to
be sure of getting it right. If we arrange all the possible results into a
table we get the sample space shown as the shaded region.
             Second Die
First Die    1    2    3    4    5    6
1            2    3    4    5    6    7
2            3    4    5    6    7    8
3            4    5    6    7    8    9
4            5    6    7    8    9    10
5            6    7    8    9    10   11
6            7    8    9    10   11   12
There are 36 events with equal probability so p = 1/36. The probability of getting any score is hence
the product of frequency and probability. In cases like this we should use a frequency table as
shown and this is called a probability distribution.
Score            2      3      4      5      6      7      8      9      10     11     12
Frequency f      1      2      3      4      5      6      5      4      3      2      1
P                1/36   2/36   3/36   4/36   5/36   6/36   5/36   4/36   3/36   2/36   1/36
Accumulative P   1/36   3/36   6/36   10/36  15/36  21/36  26/36  30/36  33/36  35/36  36/36
We can see that the probability of getting any given sum is not the same. The only way to get a sum
of 2 is to roll a 1 on both dice, but you can get a sum of 7 in six different ways.
Plotting shows this is a triangular distribution: the probability rises linearly to a peak at the
middle value (7) and falls symmetrically on the other side.
The accumulative probability always adds up to 1.0. The probability value can only be between 0
and 1. An event that is certain has a probability of 1 and an event that is impossible has a
probability of 0.
We can see that the rule for the probability of a given sum is $P = f p$
f is the frequency of the event and p is the equal probability of any single result, this being $p = 1/n^r$
n = number of possibilities for each event (6) and r = the number of events (2).
Hence in this case $P = f \times 1/n^r = f \times 1/6^2 = f/36$
It is important to note that the distribution is not a continuous function but a set of discrete values
based on the integers 1, 2, 3 ...... n
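The frequency table can be checked by letting a computer list the sample space. The following is only a minimal sketch in Python; the function name `two_dice_distribution` is made up for this illustration and is not part of the tutorial.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def two_dice_distribution(sides=6):
    """Return {score: probability} for the sum of two fair dice."""
    # Each (first die, second die) pair is one equally likely event.
    frequency = Counter(a + b for a, b in product(range(1, sides + 1), repeat=2))
    events = sides ** 2  # n^r = 6^2 = 36
    # P = f * p with p = 1/36 (Fraction reduces, so 2/36 is stored as 1/18)
    return {score: Fraction(f, events) for score, f in sorted(frequency.items())}

dist = two_dice_distribution()
for score, probability in dist.items():
    print(score, probability)
```

Exact fractions are used so that the results can be compared directly with the table rather than with rounded decimals.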
If we turned the probability distribution into a
bar graph with bars of width = 1, the
accumulative frequency would be the area of
the graph between 0 and the given score.
The probability of getting a score of 5 or less is 10/36
from the table and from the graph it is the pink
area this being:
(1/36 + 2/36 + 3/36 + 4/36) = 10/36
The green area represents the probability of getting a score of 6 or more and is simply found by
subtracting the pink area from the total area. The total area is always one (36/36).
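The idea of adding up the bars and subtracting from the total area can be sketched in Python as follows. The variable names are chosen for this illustration only; the frequencies are those in the table for scores 2 to 12.

```python
from fractions import Fraction
from itertools import accumulate

scores = range(2, 13)
frequencies = [1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1]  # from the frequency table
probabilities = [Fraction(f, 36) for f in frequencies]

# Running total of the bar areas (each bar has width 1).
cumulative = dict(zip(scores, accumulate(probabilities)))

p_5_or_less = cumulative[5]      # area up to and including the bar for 5
p_6_or_more = 1 - p_5_or_less    # the rest of the total area of 1

print(p_5_or_less, p_6_or_more)
```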
WORKED EXAMPLE No. 1
When two dice are rolled 12 times, what is the probability of guessing correctly a score of 4?
What is the probability of guessing a score of 4 or less?
SOLUTION
From the plots or table created previously, the probability of guessing a score of exactly 4 is
3/36. The probability of guessing a score of four or less is 6/36.
BINOMIAL DISTRIBUTION
The Binomial distribution only applies to events where there are two outcomes, say win and
lose or heads and tails (tossing a coin).
The Binomial Distribution was covered in outcome 1 and was written as:
$$(1 + x)^n = 1 + {}^nC_1\,x + {}^nC_2\,x^2 + {}^nC_3\,x^3 + {}^nC_4\,x^4 + \dots + x^n$$
The key part is the Binomial coefficient ${}^nC_r$. This may be considered as a way of counting how
many different ways there are of getting 'r' successes when you repeat the event 'n' times.
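Let's revise how to evaluate ${}^nC_r$
On the top line we put the first r factors of n! (counting down from n)
and on the bottom line we put r!
e.g. ${}^4C_3 = \frac{4 \times 3 \times 2}{1 \times 2 \times 3} = 4$ and ${}^4C_2 = \frac{4 \times 3}{1 \times 2} = 6$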
And if we evaluate these for all values of r we
get a symmetrical distribution. The plot shows
the result for n = 4.
The mean is always the middle value so the mean r is always n/2
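The rule for evaluating the coefficient can be written as a short Python function. This is a sketch only; the name `n_c_r` is invented here, and the standard library function `math.comb` gives the same result directly.

```python
import math

def n_c_r(n, r):
    """Binomial coefficient: the first r factors of n! divided by r!."""
    top = 1
    for k in range(r):
        top *= n - k  # n, n-1, ..., n-r+1
    return top // math.factorial(r)

# Coefficients for n = 4, r = 0 to 4 (the symmetrical distribution in the plot).
row = [n_c_r(4, r) for r in range(5)]
print(row)
```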
Let's consider the probability distribution for tossing a coin where the probabilities of a head and of a
tail are both ½ or 0.5. Consider tossing the coin 4 times (n = 4). The sample space is like this.
FLIP 1 FLIP 2 FLIP 3 FLIP 4 HEADS
H H H H 4
H H H T 3
H H T H 3
H H T T 2
H T H H 3
H T H T 2
H T T H 2
H T T T 1
T H H H 3
T H H T 2
T H T H 2
T H T T 1
T T H H 2
T T H T 1
T T T H 1
T T T T 0
Note there are $2^4 = 16$ possible results. Now we can build the frequency distribution.
Note that $P = {}^nC_r / 16$ so we didn't need to construct the sample space.
Heads (r)    0      1      2      3      4
nCr          1      4      6      4      1
P            1/16   4/16   6/16   4/16   1/16
Acc P        1/16   5/16   11/16  15/16  16/16
Suppose we want to know the chances of getting exactly 2 correct. Using r = 2 we see we have a
probability of 6/16 = 0.375.
Suppose we want to know the chances of getting 2 or less guesses correct. Using r = 2 we get a
probability of 11/16 = 0.6875
Suppose we want the probability of getting more than 2 guesses correct (that is 3 or more). This would
be found by subtracting the last answer from 1 to give 5/16 = 0.3125.
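The sample space for four tosses can be generated and counted rather than written out by hand. The sketch below is an illustration with made-up variable names; it compares the counted frequencies with the Binomial coefficients.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import comb

n = 4
sample_space = list(product("HT", repeat=n))  # 2^4 = 16 outcomes
heads = Counter(outcome.count("H") for outcome in sample_space)

# Probability of r heads = frequency / 16
P = {r: Fraction(heads[r], len(sample_space)) for r in range(n + 1)}

p_exactly_2 = P[2]
p_2_or_less = P[0] + P[1] + P[2]
p_more_than_2 = 1 - p_2_or_less

print(p_exactly_2, p_2_or_less, p_more_than_2)
```

The counted frequencies agree with nCr for every r, which is why the sample space does not need to be constructed in practice.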
WORKED EXAMPLE No. 2
What is the probability of correctly calling four heads when a coin is tossed ten times?
SOLUTION
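The number of possible permutations is $2^{10} = 1024$
The probability of calling correctly 4 times is ${}^nC_r / 1024$
$$P = \frac{{}^{10}C_4}{1024} = \frac{10 \times 9 \times 8 \times 7}{1 \times 2 \times 3 \times 4} \times \frac{1}{1024} = \frac{210}{1024} = 0.2051$$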
UNEQUAL PROBABILITY
Without proof, when the probability of an event is not 0.5 the probability of getting r results correct
out of n events is:
$$P = {}^nC_r\,p^r (1-p)^{n-r}$$
p is the probability of each event.
In the case p = 0.5 this reduces to $P = {}^nC_r\,(0.5)^n$ which is the same formula already used.
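The general formula is easy to evaluate numerically. The following Python sketch uses an illustrative function name, `binomial_probability`, and checks that with p = 0.5 it gives the same answer as the simpler formula.

```python
from math import comb, isclose

def binomial_probability(r, n, p):
    """Probability of exactly r successes in n events, each with probability p."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

# With p = 0.5 the formula collapses to nCr * 0.5^n = nCr / 2^n.
general = binomial_probability(4, 10, 0.5)
simple = comb(10, 4) / 2**10
print(general, simple, isclose(general, simple))
```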
If the number of tosses is large (n is large) the frequency distribution resembles a continuous graph
and it is tempting to join the points as shown but we should remember that the values of r are
integers (whole numbers) and so we can never have values in between. The plot below is for n = 50.
For cases where p ≠ ½ the distribution becomes skewed. Consider the following case.
A bag contains three balls numbered 1 to 3. A single ball is drawn from the bag at random and then
replaced. If this is repeated 3 times we get the following sample space. Note how the pattern is
constructed in three groups of 9 giving 27 permutations.
Ball drawn   Number of ones     Ball drawn   Number of ones     Ball drawn   Number of ones
1 1 1        3                  2 1 1        2                  3 1 1        2
1 1 2        2                  2 1 2        1                  3 1 2        1
1 1 3        2                  2 1 3        1                  3 1 3        1
1 2 1        2                  2 2 1        1                  3 2 1        1
1 2 2        1                  2 2 2        0                  3 2 2        0
1 2 3        1                  2 2 3        0                  3 2 3        0
1 3 1        2                  2 3 1        1                  3 3 1        1
1 3 2        1                  2 3 2        0                  3 3 2        0
1 3 3        1                  2 3 3        0                  3 3 3        0
Number of times drawn r 0 1 2 3
frequency 8 12 6 1
Probability P 8/27 12/27 6/27 1/27
If we make n = 50 we get a curve with a peak at n/3 and if p =2/3 the
peak is at 2n/3.
All results are predicted by the equation $P_r = {}^nC_r\,p^r (1-p)^{n-r}$
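The three-ball sample space can also be generated and compared with the equation. This is a sketch for illustration only; the variable names are not from the tutorial.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import comb

draws = list(product([1, 2, 3], repeat=3))      # 3^3 = 27 permutations
ones = Counter(draw.count(1) for draw in draws)  # how often ball 1 appears

p = Fraction(1, 3)  # probability of drawing ball 1 each time
counted = {r: Fraction(ones[r], len(draws)) for r in range(4)}
predicted = {r: comb(3, r) * p**r * (1 - p)**(3 - r) for r in range(4)}

for r in range(4):
    print(r, ones[r], counted[r], predicted[r])
```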
You might try the animation at this web address to see this in action.
http://www.stat.wvu.edu/SRS/Modules/Binomial/binomial.html
WORKED EXAMPLE No. 3
Verify the previous four results for n = 3, p = 1/3
SOLUTION
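r = 3: $P = {}^nC_r\,p^r (1-p)^{n-r} = {}^3C_3\,(1/3)^3 (2/3)^0 = \frac{(3)(2)}{(3)(2)} (1/3)^3 (2/3)^0 = 1/27$
r = 2: $P = {}^nC_r\,p^r (1-p)^{n-r} = {}^3C_2\,(1/3)^2 (2/3)^1 = \frac{(3)(2)}{(2)} (1/3)^2 (2/3)^1 = 6/27$ or 2/9
r = 1: $P = {}^nC_r\,p^r (1-p)^{n-r} = {}^3C_1\,(1/3)^1 (2/3)^2 = \frac{(3)}{(1)} (1/3)^1 (2/3)^2 = 12/27$ or 4/9
r = 0: $P = {}^nC_r\,p^r (1-p)^{n-r} = {}^3C_0\,(1/3)^0 (2/3)^3 = (1)(1/3)^0 (2/3)^3 = 8/27$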
WORKED EXAMPLE No. 4
A bag contains 3 balls numbered 1, 2 and 3. One ball is removed at random and noted and then
replaced. This is repeated 5 times. What is the probability of guessing the number correctly
three times out of 5?
SOLUTION
p = 1/3 n = 5 and r = 3
$$P = {}^nC_r\,p^r (1-p)^{n-r} = {}^5C_3\,(1/3)^3 (2/3)^2 = \frac{(5)(4)(3)}{(3)(2)} (1/3)^3 (2/3)^2 = 0.1646$$
MEAN AND VARIANCE OF THE BINOMIAL DISTRIBUTION
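In statistics the variance is defined as
$$\sigma^2 = \frac{\sum f x^2}{\sum f} - \left(\frac{\sum f x}{\sum f}\right)^2$$
In the terminology used here x becomes r and P is the probability of r correct guesses.
$$\sigma^2 = \frac{\sum P r^2}{\sum P} - \left(\frac{\sum P r}{\sum P}\right)^2$$
The following example shows that ∑P = 1 so this reduces to $\sigma^2 = \sum P r^2 - \left(\sum P r\right)^2$
The standard deviation is $\sigma = \sqrt{\sigma^2}$
Without proof -
It can be shown that this reduces to $\sigma^2 = np(1 - p)$ and when p = ½, $\sigma^2 = n/4$
The mean of the Binomial distribution when p = ½ is clearly the middle value so $\bar{r} = n/2$
When p ≠ ½ we can see from the graphs that the mean is $\bar{r} = pn$.
When p ≠ ½ the variance is $\sigma^2 = np(1-p)$ and the standard deviation is $\sigma = \sqrt{np(1-p)}$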
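The two routes to the mean and variance can be compared numerically. The Python sketch below is an illustration only: the function name is invented, and the values n = 15 and p = 0.2 are simply chosen as an example with p ≠ ½.

```python
from math import comb, isclose, sqrt

def binomial_mean_variance(n, p):
    """Mean and variance found by summing over the whole distribution."""
    probs = [comb(n, r) * p**r * (1 - p)**(n - r) for r in range(n + 1)]
    sum_p = sum(probs)                                    # should be 1
    sum_pr = sum(P * r for r, P in enumerate(probs))      # sum of P r
    sum_pr2 = sum(P * r**2 for r, P in enumerate(probs))  # sum of P r^2
    mean = sum_pr / sum_p
    variance = sum_pr2 / sum_p - mean**2
    return mean, variance

n, p = 15, 0.2  # illustrative values
mean, variance = binomial_mean_variance(n, p)
print(mean, n * p)                  # compare with mean = pn
print(variance, n * p * (1 - p))    # compare with variance = np(1 - p)
print(sqrt(variance))               # standard deviation
```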
WORKED EXAMPLE No. 5
A coin is flipped six times. Show that the resulting frequency distribution for correct tosses has
a standard deviation of 1.225 by use of both formulae.
SOLUTION
First by the simple method $\sigma^2 = n/4 = 6/4 = 1.5$ so $\sigma = \sqrt{1.5} = 1.225$
Next by the full method
$$\sigma^2 = \frac{\sum P r^2}{\sum P} - \left(\frac{\sum P r}{\sum P}\right)^2$$
r              0      1      2      3       4       5       6
nCr            1      6      15     20      15      6       1
P = nCr/2^6    1/64   6/64   15/64  20/64   15/64   6/64    1/64    ∑P = 1
P r^2          0      6/64   60/64  180/64  240/64  150/64  36/64   ∑P r^2 = 672/64
P r            0/64   6/64   30/64  60/64   60/64   30/64   6/64    ∑P r = 192/64
$$\sigma^2 = \frac{672/64}{1} - \left(\frac{192/64}{1}\right)^2 = 10.5 - 9 = 1.5$$
σ = 1.225
WORKED EXAMPLE No. 6
In the last example, what is the probability of guessing correctly exactly four times and four
times or fewer?
SOLUTION
From the table we see the probability of guessing four correct is 15/64 but the probability of
guessing four or fewer is (1 + 6 + 15 + 20 + 15)/64 = 57/64. This is the accumulative value.
(The probability of guessing at least four is (15 + 6 + 1)/64 = 22/64.)
WORKED EXAMPLE No. 7
Samples of a product are tested to a certain standard and it is found that there is a probability of
0.2 that they fail. What is the probability of selecting 5 failures from a selection of 15? What is
the mean and standard deviation for this sample?
SOLUTION
p = 0.2 n = 15 r = 5
$$P = {}^nC_r\,p^r (1-p)^{n-r} = {}^{15}C_5\,(0.2)^5 (0.8)^{10} = \frac{15!}{5!\,10!} (0.2)^5 (0.8)^{10} = 0.103$$
Mean = pn = 0.2 x 15 = 3 $\sigma = \sqrt{(15)(0.2)(1-0.2)} = 1.549$
You will find a useful calculating aid at the following web address
http://hyperphysics.phy-astr.gsu.edu/hbase/math/disfcn.html
SELF ASSESSMENT EXERCISE No. 1
1. If a coin is tossed 20 times, what is the probability of getting the call correct 5 times?
(0.0148)
2. If a six sided die is tossed 10 times, what is the probability of getting the call right five times?
(0.013)
3. A lottery system consists of drawing one numbered ball from a bag containing nine. This is
repeated with six separate bags. What is the probability of guessing all the numbers drawn?
(1/531441)
4. 20 coins are flipped each with a probability of 0.5 that it will be heads. What is the standard
deviation for the frequency distribution? (2.236)
5. A machine making electrical resistors has a probability of 0.1 that the values will fall outside
the target range. What is the probability of randomly picking 20 from a batch of 100 that will be
outside the target? (0.00117)
What are the mean and the standard deviation for this distribution? (10 and 3)
NORMAL DISTRIBUTION CURVES
In statistics, the normal distribution is often used. In terms of probability the equation without
explanation is given as:
$$P = \frac{e^{-(r - \bar{r})^2 / 2\sigma^2}}{\sigma\sqrt{2\pi}}$$
The normal distribution curve is not used exclusively for events with a win/lose or yes/no result but
it does give similar results to the Binomial distribution when n is large. The same mean and
standard deviation must be used in the comparison. Even for low values of n the curves are well
matched as shown in the plotted examples below with n = 10. The normal distribution is not
normally used for win/lose situations unless n is 50 or larger.
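The comparison between the two distributions can be tabulated with a few lines of Python. This is a sketch under stated assumptions: n = 10 as in the plotted examples, and p = 0.5 is assumed here because the value used for the plots is not given. The normal curve is given the same mean (np) and standard deviation as the Binomial distribution.

```python
from math import comb, exp, pi, sqrt

def normal_density(r, mean, sigma):
    """Height of the normal distribution curve at r."""
    return exp(-((r - mean) ** 2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

n, p = 10, 0.5  # p = 0.5 is an assumption for this illustration
mean = n * p
sigma = sqrt(n * p * (1 - p))

for r in range(n + 1):
    binomial = comb(n, r) * p**r * (1 - p)**(n - r)
    print(r, round(binomial, 4), round(normal_density(r, mean, sigma), 4))
```

Remember that the Binomial values exist only at whole numbers of r, while the normal curve is continuous.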
You can compare the Binomial and normal distribution at this web address
http://www.ruf.rice.edu/~lane/stat_sim/binom_demo.html
The normal distribution is more widely used for cases where the standard deviation and mean are
known as a result of many measurements. We then use it to predict the probability of a given value
or range of values.
The normal distribution curve can be made into one that fits all eventualities. This is done by
changing the mean to zero by subtracting $\bar{r}$ and making the standard deviation 1 by dividing by σ.
Instead of plotting r we plot $z = \frac{r - \bar{r}}{\sigma}$. As this is a standard graph, the area of the graph can be
tabulated and used to solve problems. The table given here covers the area from −∞ to the value of z.
Because the graph is symmetrical, other areas can be worked out as appropriate. The total area is
1.0 so the total either side of the mean is 0.5.
Tables of the Normal Distribution
Probability Content from −∞ to z
Note red area = 1 – green area
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
0.1 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
0.2 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
0.3 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
0.4 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
0.5 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224
0.6 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549
0.7 0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852
0.8 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133
0.9 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389
1.0 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621
1.1 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830
1.2 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015
1.3 0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177
1.4 0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319
1.5 0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441
1.6 0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545
1.7 0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633
1.8 0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706
1.9 0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767
2.0 0.9772 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817
2.1 0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.9857
2.2 0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890
2.3 0.9893 0.9896 0.9898 0.9901 0.9904 0.9906 0.9909 0.9911 0.9913 0.9916
2.4 0.9918 0.9920 0.9922 0.9925 0.9927 0.9929 0.9931 0.9932 0.9934 0.9936
2.5 0.9938 0.9940 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9951 0.9952
2.6 0.9953 0.9955 0.9956 0.9957 0.9959 0.9960 0.9961 0.9962 0.9963 0.9964
2.7 0.9965 0.9966 0.9967 0.9968 0.9969 0.9970 0.9971 0.9972 0.9973 0.9974
2.8 0.9974 0.9975 0.9976 0.9977 0.9977 0.9978 0.9979 0.9979 0.9980 0.9981
2.9 0.9981 0.9982 0.9982 0.9983 0.9984 0.9984 0.9985 0.9985 0.9986 0.9986
3.0 0.9987 0.9987 0.9987 0.9988 0.9988 0.9989 0.9989 0.9989 0.9990 0.9990
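The values in the table can be reproduced without a printed table. The sketch below uses the error function `math.erf` from the Python standard library; the function name `normal_cdf` is chosen for this illustration.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Area under the standard normal curve from minus infinity to z."""
    return 0.5 * (1 + erf(z / sqrt(2)))

print(round(normal_cdf(0.33), 4))  # row 0.3, column 0.03 of the table
print(round(normal_cdf(1.00), 4))  # row 1.0, column 0.00 of the table
print(round(normal_cdf(-1.00), 4)) # negative z, by symmetry 1 - area for +1.00
```

Because the function accepts negative z directly, the symmetry argument is only needed when working from the printed table.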
WORKED EXAMPLE No. 8
The maximum number of people that can occupy a lift is set at 8. The total weight of 8 people
chosen at random follows a normal distribution with a mean of 550 kg and a standard deviation
of 150 kg. What is the probability that the total weight of 8 people exceeds 600kg?
SOLUTION
$\bar{r} = 550$ σ = 150 $z = \frac{r - \bar{r}}{\sigma} = \frac{600 - 550}{150} = 0.33$
Look in the table down the left hand column for z = 0.3
and across under 0.03.
The number in the table for z = 0.33 is 0.6293
The green area to the right is 1 – 0.6293 = 0.3707
This is the probability that the weight will exceed 600kg.
WORKED EXAMPLE No. 9
The lifetime in hours of a mass produced product is represented by the normal distribution
curve with a mean of 1400 and a standard deviation of 300. What is the probability that a
component taken at random will have a lifetime between 1400 and 1450 hours?
SOLUTION
$\bar{r} = 1400$ σ = 300
First find the probability for 1450 hours
$z = \frac{r - \bar{r}}{\sigma} = \frac{1450 - 1400}{300} = 0.17$
From the table P = 0.5675
Next find the probability for 1400 hours
$z = \frac{r - \bar{r}}{\sigma} = \frac{1400 - 1400}{300} = 0$
From the table P = 0.500 as expected for the mean.
The probability of the component having a lifetime between 1400 and 1450 hours is :
0.5675 – 0.5 = 0.0675
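The method of subtracting one area from another can be sketched in Python as below. The function names and the `decimals` parameter are invented for this illustration; rounding z to two decimal places imitates a table look-up, while leaving it unrounded gives a slightly smaller answer because z is really 0.1667 rather than 0.17.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Area under the standard normal curve from minus infinity to z."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def probability_between(low, high, mean, sigma, decimals=None):
    """Probability that a normally distributed value lies between low and high."""
    z_low = (low - mean) / sigma
    z_high = (high - mean) / sigma
    if decimals is not None:
        # Imitate a table look-up, where z is rounded before reading the area.
        z_low, z_high = round(z_low, decimals), round(z_high, decimals)
    return normal_cdf(z_high) - normal_cdf(z_low)

table_style = probability_between(1400, 1450, 1400, 300, decimals=2)
unrounded = probability_between(1400, 1450, 1400, 300)
print(round(table_style, 4), round(unrounded, 4))
```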
SELF ASSESSMENT EXERCISE No. 2
1. The height of adult males is normally distributed with a mean of 1.78 m and a standard
deviation of 0.076 m. What is the probability of a randomly selected man having a height of
less than 1.6 m? (0.0089)
2. A grinding machine produces components with a mean diameter of 30 mm. All the components
are measured and the actual size logged. The standard deviation over a period of time is 0.05
mm. Assuming the normal distribution represents the actual distribution, what is the probability
of a component being between 29.95 mm and 30.05 mm diameter? (0.6826)
(Note this is the standard figure for the range between z = -1 and z = +1, that is one standard deviation either side of the mean)
3. The breaking strengths of 150 spot welds were measured in newtons and grouped into bands of
20 N as shown.
Range f
160-180 2
180-200 6
200-220 10
220-240 28
240-260 50
260-280 31
280-300 15
300-320 8
Calculate the mean and the standard deviation. (Answers 251.47 N and 29.04 N)
Calculate the probability that a sample taken at random will have strength of less than 200 N
based on the normal distribution.
(Answer about 4%)
Calculate the probability based on the raw data above. (Answer 5.3%)
POISSON DISTRIBUTION
The proof and derivation are not given at this level of study but students will find the derivation of this
formula at the following web address.
http://en.wikipedia.org/wiki/Poisson_distribution
This is a distribution representing discrete samples (same as the Binomial) but it brings the time
element into the equation. The probability distribution is given by:
$$P = \frac{\lambda^r e^{-\lambda}}{r!}$$
r = number of occurrences λ = average occurrences/time interval
You will find another useful aid to calculation at this web address.
http://hyperphysics.phy-astr.gsu.edu/hbase/math/poiex.html
WORKED EXAMPLE No. 10
A business receives orders at an average rate of 1 per minute. What is the probability of getting
three orders in one minute?
SOLUTION
λ = 1 r = 3
$$P = \frac{\lambda^r e^{-\lambda}}{r!} = \frac{1^3\,e^{-1}}{(3)(2)} = 0.0613 \text{ or } 6\%$$
WORKED EXAMPLE No.11
An emergency service receives an average of 2.1 false alarms per day. What is the probability
of getting four false alarms in a given day?
SOLUTION
λ = 2.1 r = 4
$$P = \frac{\lambda^r e^{-\lambda}}{r!} = \frac{2.1^4\,e^{-2.1}}{(4)(3)(2)} = 0.0992 \text{ or } 10\%$$
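The Poisson formula used in the two worked examples above can be evaluated with a short Python function. This is only a sketch; the name `poisson_probability` is made up for this illustration.

```python
from math import exp, factorial

def poisson_probability(r, lam):
    """Probability of exactly r occurrences when the average is lam per interval."""
    return lam**r * exp(-lam) / factorial(r)

print(round(poisson_probability(3, 1.0), 4))  # worked example 10
print(round(poisson_probability(4, 2.1), 4))  # worked example 11
```

For questions that ask for "r or more" occurrences, the same function can be summed over the smaller values of r and the total subtracted from 1.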
SELF ASSESSMENT EXERCISE No. 3
Solve all the following on the assumption that Poisson's distribution applies.
1. On average the demand for a certain product is four per week. If the stock at the beginning of
each week is renewed so that there are always 6 in store, what is the probability of running out
of stock in any week? (13.4%)
2. A call centre has a capacity to deal with 25 calls per minute on average. What is the probability
of getting 30 calls in any minute period? (4.5%)
3. The average time taken for a worker to assemble a certain product is 45 minutes. There are 10 workers employed to make these assemblies. What is the probability of assembling 10 units in
an hour? (8%)