TRANSCRIPT
UNENE Math Refresher Course
Probability and Statistics
Probability and Statistics — Statistical Measures © Wei-Chau Xie
Statistical Measures
Given a sample of n observations x1, x2, . . . , xn of random variable X.
Location:
Central tendency: mean, median, mode
Position: percentiles
Sample Mean (Average, Expected Value): X̄ = (1/n) ∑_{i=1}^{n} xi
Median m is the value above and below which an equal number of observations lie, i.e., P{X < m} = P{X > m} = 0.5
Mode is the most frequently occurring value.
[Figure: uni-modal and bi-modal relative-frequency distributions]
pth percentile Qp is the point below which p-percent of the observations lie.
Median is the 50th percentile, m = Q0.5
Procedure for finding Qp (for 1/n < p < (n−1)/n):
Sort the data in ascending order: x1, x2, . . . , xn
k = integral part of (n+1)p
d = decimal part of (n+1)p
Qp lies between xk and xk+1, i.e., Qp = xk + d (xk+1 − xk)
Variability (dispersion) characterizes individual differences.
Variance measures variation or spread in the data
Var(X) = s² = (1/(n−1)) ∑_{i=1}^{n} (xi − X̄)²
Standard deviation s = √Var(X)
[Figure: two frequency distributions with the same mean X̄1 = X̄2 but different spreads, s2 > s1]
Coefficient of variation δ expresses the standard deviation in relative terms
δ = s/|X̄|, X̄ ≠ 0
The coefficient of variation is useful because the standard deviation of data
must always be understood in the context of the mean of the data.
When comparing data sets with different units or widely different means, use the coefficient of variation for comparison instead of the standard deviation.
Example
We have a batch of 1000 I-beams used for building construction. A sample of 10
tensile strength measurements is obtained:
126, 128, 135, 146, 137, 142, 125, 131, 139, 141
Sample mean X̄ = (1/10) ∑_{i=1}^{10} xi = (1/10)(126 + 128 + 135 + ··· + 141) = 135.0
Sample standard deviation
s² = (1/9) ∑_{i=1}^{10} (xi − X̄)²
= (1/9)[(126−135)² + (128−135)² + (135−135)² + ··· + (141−135)²]
= 472/9 = 52.44 ⟹ s = 7.24
Coefficient of variation δ = s/X̄ = 7.24/135.0 = 0.054
Sort the data in ascending order
125, 126, 128, 131, 135, 137, 139, 141, 142, 146
25th percentile Q0.25
p = 0.25, 0.1< p<0.9
(n+1)p = 11×0.25 = 2.75 =⇒ k = 2, d = 0.75 =⇒ x2<Q0.25<x3
Q0.25 = x2 + d (x3−x2) = 126 + 0.75×(128 − 126) = 127.5
50th percentile Q0.50 = Median
p = 0.50, 0.1< p<0.9
(n+1)p = 11×0.50 = 5.5 =⇒ k = 5, d = 0.5 =⇒ x5<Q0.50<x6
Q0.50 = x5 + d (x6−x5) = 135 + 0.50×(137 − 135) = 136
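These two slides can be cross-checked with a short Python sketch using only the standard library; the `percentile` helper below implements the (n+1)p rule above, with k the integer part and d the decimal part (variable names are ours):

```python
import math

def percentile(data, p):
    """p-th percentile by the (n+1)p rule: k = integer part, d = decimal part."""
    x = sorted(data)
    k, d = divmod((len(x) + 1) * p, 1)
    k = int(k)
    return x[k - 1] + d * (x[k] - x[k - 1])  # x_k + d (x_{k+1} - x_k), 1-based

data = [126, 128, 135, 146, 137, 142, 125, 131, 139, 141]
n = len(data)
mean = sum(data) / n                                 # 135.0
s2 = sum((xi - mean) ** 2 for xi in data) / (n - 1)  # 472/9
s = math.sqrt(s2)                                    # ~7.24
delta = s / abs(mean)                                # coefficient of variation
q25 = percentile(data, 0.25)                         # 127.5
q50 = percentile(data, 0.50)                         # 136.0
```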
Probability and Statistics — Probability © Wei-Chau Xie
Probability
Consider an experiment with a finite number of mutually exclusive outcomes
which are equiprobable (equally likely due to the nature of the experiment).
The probability of event A is the fraction of outcomes in which A occurs.
P(A) = N(A)/N
where N = total number of outcomes, and N(A) = number of outcomes leading to the occurrence of event A.
Example
In rolling a single unbiased dice, what is the probability of getting an even
number of spots?
N = 6 mutually exclusive equiprobable outcomes (getting 1, 2, 3, 4, 5, 6)
A = getting an even number of spots, N(A) = 3 (getting 2, 4, 6)
∴ P(A) = N(A)/N = 3/6 = 1/2
Basics of Combinatorial Analysis
Given n1 elements a(1)_1, a(1)_2, . . . , a(1)_{n1},
n2 elements a(2)_1, a(2)_2, . . . , a(2)_{n2},
. . .,
nr elements a(r)_1, a(r)_2, . . . , a(r)_{nr},
there are precisely n1 n2 ··· nr distinct ordered r-tuples (a(1)_{i1}, a(2)_{i2}, . . . , a(r)_{ir}) containing one element of each kind.
A population of n elements has precisely
C(n, r) = n!/(r!(n−r)!)
sub-populations of size r ≤ n.
C(n, r) is called the number of combinations of n things taken r at a time (without regard for order).
Example
In rolling a pair of unbiased dice, what is the probability that both dice show the
same number of spots?
In rolling a pair of dice, each outcome is an ordered pair (a, b),
where a is the number on the first dice, n1 = 6
b is the number on the second dice, n2 = 6
∴ There are N = n1n2 = 6×6 = 36 mutually exclusive equiprobable events.
A = both dice show the same number of spots
∴ A occurs whenever a = b =⇒ N(A) = 6
P(A) = N(A)/N = 6/36 = 1/6
Example (Quality Control)
A batch of 100 manufactured items is checked by an inspector, who examines
10 items selected at random. If none of the 10 items is defective, he accepts the
whole batch. Otherwise, the batch is subjected to further inspection. What is
the probability that a batch containing 10 defective items will be accepted?
Draw 10 items from 100 items: N = C(100, 10) = 100!/(10! 90!)
A = the batch is accepted ⟹ draw 10 items from the 90 non-defective items:
N(A) = C(90, 10) = 90!/(10! 80!)
∴ P(A) = N(A)/N = [90!/(10! 80!)] / [100!/(10! 90!)] = (90! 90!)/(80! 100!) = (81·82···90)/(91·92···100) = 0.3305
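A quick numerical check of this count-based argument, using `math.comb` from the Python standard library:

```python
from math import comb

# Accept the batch iff all 10 drawn items come from the 90 non-defective ones.
N = comb(100, 10)   # total ways to draw 10 items from 100
NA = comb(90, 10)   # draws that avoid all 10 defective items
p_accept = NA / N   # ~0.3305
```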
The mutually exclusive outcomes of an experiment are called elementary
events (sample points).
The set of all elementary events associated with a given experiment is the sample space, denoted by Ω.
Venn Diagram
A sample space Ω is represented by a rectangle; an event A is then represented symbolically by a closed region within this rectangle.
If the area of the rectangle is taken as 1, the area of the closed region A is the probability that event A will occur.
[Figure: rectangle Ω containing a closed region A]
Rolling a balanced dice, Ω = {1, 2, 3, 4, 5, 6}
The Venn diagram is a rectangle of area 1, divided into six equal rectangles of area 1/6.
P{Getting 1} = 1/6
P{Getting an odd number of spots} = P{Getting 1 or 3 or 5} = 1/2
[Figure: Venn diagrams shading the corresponding cells among 1–6]
Combination of Events
A1 and A2 are mutually exclusive if A1 and A2 cannot occur simultaneously.
The union of events A1 and A2, A1 ∪ A2, is the event consisting of the
occurrence of at least one of the events A1 and A2.
The intersection of events A1 and A2, A1∩A2 or A1A2, is the event consisting
of the occurrence of both events A1 and A2.
The difference of events A1 and A2, A1−A2, is the event in which A1 occurs
but not A2.
The complementary event of A, denoted Ā, is the event that A does not occur, i.e., Ā = Ω − A.
If the occurrence of event A1 implies the occurrence of event A2, then
A1 ⊂ A2 (event A1 is contained in event A2)
[Figure: Venn diagrams of mutually exclusive events A1 and A2, A1 ∪ A2, A1 ∩ A2, A1 − A2, A1 ⊂ A2, and the complement Ā = Ω − A]
Statistical Independence
If the outcome of one experiment has no influence on the outcome of the other,
these two experiments are called statistically independent.
Let
A1 = an event associated with only the first experiment
A2 = an event associated with only the second experiment
Then the occurrence of A1 has no influence on the occurrence of A2, and
conversely.
Two events A1 and A2 are said to be (statistically) independent if they satisfy
P(A1 ∩ A2) = P(A1 A2) = P(A1) P(A2)
The Addition Law for Probabilities
If A1, A2, . . . , An are mutually exclusive, then
P(A1 ∪ A2 ∪ · · · ∪ An) = P(A1) + P(A2) + · · · + P(An)
[Figure: Venn diagrams for mutually exclusive A1, A2 and for the complement Ā = Ω − A]
P(A1 ∪ A2) = P(A1) + P(A2), P(Ā) = 1 − P(A)
A ∪ Ā = Ω, and A and Ā are mutually exclusive ⟹ P(A) + P(Ā) = 1
P(Ā) = 1 − P(A) or P(A) = 1 − P(Ā)
The Addition Law for Probabilities
For arbitrary A1 and A2: P(A1 ∪ A2) = P(A1) + P(A2) − P(A1 A2)
[Figure: Venn diagrams illustrating P(A1 ∪ A2) = P(A1) + P(A2) − P(A1 ∩ A2)]
For arbitrary A1, A2, and A3,
P(A1 ∪ A2 ∪ A3) = P(A1) + P(A2) + P(A3) − [P(A1 A2) + P(A2 A3) + P(A3 A1)] + P(A1 A2 A3)
Example
The water supply for a city C comes from two sources A and B. The water
is transported by a pipeline consisting of branches 1, 2, and 3; the probability
of failure of each branch is 0.01 and the failures of the individual branches are
statistically independent. Assume that either source alone is sufficient to supply
the water for the city. What is the probability of water shortage for city C?
[Figure: branches 1 (from source A) and 2 (from source B) operate in parallel and feed branch 3, which supplies city C]
S = Water shortage in city C; Ai = Branch i fails, P(Ai) = 0.01, i = 1, 2, 3
Method 1: S = A3 ∪ A1 A2
P(S) = P(A3 ∪ A1 A2) = P(A3) + P(A1 A2) − P(A3 A1 A2)
= P(A3) + P(A1)P(A2) − P(A3)P(A1)P(A2)
= 0.01 + 0.01² − 0.01³ = 0.010099
Method 2: S̄ = A1 Ā2 Ā3 ∪ Ā1 A2 Ā3 ∪ Ā1 Ā2 Ā3
These events are mutually exclusive.
P(S̄) = P(A1 Ā2 Ā3) + P(Ā1 A2 Ā3) + P(Ā1 Ā2 Ā3)
= P(A1)P(Ā2)P(Ā3) + P(Ā1)P(A2)P(Ā3) + P(Ā1)P(Ā2)P(Ā3)
= 0.01×0.99×0.99 + 0.99×0.01×0.99 + 0.99×0.99×0.99 = 0.989901
∴ P(S) = 1 − P(S̄) = 1 − 0.989901 = 0.010099
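Both methods can be verified numerically; a minimal Python sketch (variable names are ours):

```python
p = 0.01  # failure probability of each branch
q = 1 - p

# Method 1: shortage iff branch 3 fails, or branches 1 and 2 both fail
p_m1 = p + p * p - p * p * p  # P(A3) + P(A1 A2) - P(A3 A1 A2)

# Method 2: complement over the three mutually exclusive "no shortage" cases
p_no_shortage = p * q * q + q * p * q + q * q * q
p_m2 = 1 - p_no_shortage
```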
Dependent Events
P(A | B) = conditional probability of A given that event B has occurred.
P(A | B) = P(AB)/P(B) ⟹ P(AB) = P(A | B) P(B) = P(B | A) P(A)
[Figure: Venn diagram — A, B, and their overlap AB; P(A | B) = P(AB)/P(B)]
Dependent Events
P(A | B) = P(AB)/P(B) ⟹ P(AB) = P(A | B) P(B) = P(B | A) P(A)
If A and B are statistically independent,
P(A | B) = P(A), P(B | A) = P(B) ⟹ P(AB) = P(A)P(B)
If B implies A, B ⊂ A ⟹ AB = B, P(AB) = P(B) ⟹ P(A | B) = 1.
If A1, A2, . . . , An are mutually exclusive events with A = ∪_{k=1}^{n} Ak, then
P(A | B) = ∑_{k=1}^{n} P(Ak | B)
P(A | B) + P(Ā | B) = 1
Example
The foundation of a building may fail either from bearing capacity or by
excessive settlement, with B and S denoting the respective failure mode. If
P(B) = 0.001, P(S) = 0.008, and P(B | S) = 0.1, determine
1. the probability of failure of the foundation;
2. the probability that the building has excessive settlement but no failure in
bearing capacity.
1. F = failure of the foundation = B ∪ S
P(F) = P(B ∪ S) = P(B) + P(S) − P(BS) = P(B) + P(S) − P(B | S)P(S)
= 0.001 + 0.008 − 0.1×0.008 = 0.0082
2. P(B̄S) = P(B̄ | S) P(S) = [1 − P(B | S)] P(S) = (1−0.1)×0.008 = 0.0072
Example
Two power generating units a and b operate in parallel to supply the power
requirements of a small city. The demand for power is subject to considerable
fluctuation, and it is known that each unit has a capacity so that it can supply
the city’s full power requirement 75% of the time in case the other unit fails. The
probability of failure of each unit is 0.1, whereas the probability that both units
will fail is 0.02. If there is failure in the power generation, what is the probability
that the city will have its supply of full power?
A = unit a fails, P(A) = 0.1; B = unit b fails, P(B) = 0.1; P(AB) = 0.02
E = only one unit fails, given that there is a failure = (AB̄ ∪ ĀB) | (A ∪ B)
S = the city has its full power supply given that there is a failure, P(S) = 0.75 P(E)
P(E) = P[(AB̄ ∪ ĀB) | (A ∪ B)] = P[AB̄ | (A ∪ B)] + P[ĀB | (A ∪ B)]
= P[AB̄(A ∪ B)]/P(A ∪ B) + P[ĀB(A ∪ B)]/P(A ∪ B)
∵ P[AB̄(A ∪ B)] = P[(A ∪ B) | AB̄] P(AB̄) = 1·P(AB̄)
P[ĀB(A ∪ B)] = P[(A ∪ B) | ĀB] P(ĀB) = 1·P(ĀB)
∴ P(E) = [P(AB̄) + P(ĀB)]/P(A ∪ B) = [P(B̄ | A)P(A) + P(Ā | B)P(B)]/[P(A) + P(B) − P(AB)]
P(A | B) = P(AB)/P(B) = 0.02/0.1 = 0.2, P(Ā | B) = 1 − P(A | B) = 0.8
P(B | A) = P(BA)/P(A) = 0.02/0.1 = 0.2, P(B̄ | A) = 1 − P(B | A) = 0.8
∴ P(E) = (0.8×0.1 + 0.8×0.1)/(0.1 + 0.1 − 0.02) = 0.889
∴ P(S) = 0.75 P(E) = 0.75×0.889 = 0.667
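This result can be checked numerically. The sketch below uses the equivalent identity P(AB̄) + P(ĀB) = P(A ∪ B) − P(AB) rather than expanding each conditional term as the slide does (variable names are ours):

```python
pA, pB, pAB = 0.1, 0.1, 0.02  # P(A), P(B), P(A and B)

p_union = pA + pB - pAB       # P(A or B): at least one unit fails
# P(exactly one fails) = P(A or B) - P(A and B); condition on some failure
p_one_given_failure = (p_union - pAB) / p_union  # = P(E) ~ 0.889
p_supply = 0.75 * p_one_given_failure            # ~ 0.667
```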
Total Probability Formula
Suppose B1, B2, . . . , Bn are mutually exclusive and collectively exhaustive (i.e., B1 ∪ B2 ∪ ··· ∪ Bn = Ω) events, in the sense that one and only one of the events B1, B2, . . . , Bn always occurs. Then
P(A) = P(AB1) + P(AB2) + ··· + P(ABn) = ∑_{k=1}^{n} P(ABk)
P(A) = P(A | B1)P(B1) + P(A | B2)P(B2) + ··· + P(A | Bn)P(Bn) = ∑_{k=1}^{n} P(A | Bk)P(Bk)
[Figure: partition of Ω into B1, B2, B3, B4, with event A overlapping each]
P(A) = P(AB1) + P(AB2) + P(AB3) + P(AB4)
= P(A | B1)P(B1) + P(A | B2)P(B2) + P(A | B3)P(B3) + P(A | B4)P(B4)
Example
Suppose that in any given year, the probability of damaging storms in the city
of Toronto is 0.20. During such a storm, if not accompanied by tornadoes, the
probability of structural failures in the city of Toronto is 0.10. When a storm
occurs in the region, the probability that it will be accompanied by a tornado
is 0.25, and the probability that this tornado will hit the city of Toronto is 0.05.
Assume that tornadoes occur only during a storm, and when the city is hit by
a tornado it is certain to cause structural failures, whereas the probability of
structural failure in the city when a tornado occurs in the region but does not hit
the city is 0.10. What is the probability of structural failure in the city of Toronto
in a period of one year?
S = having storm, P(S) = 0.2;
T = having tornado; H = tornado hits the city;
F = having structural failure
Since ST, ST̄, S̄T, S̄T̄ are mutually exclusive and collectively exhaustive, apply the Total Probability Formula:
P(F) = P(F | ST)P(ST) + P(F | ST̄)P(ST̄) + P(F | S̄T)P(S̄T) + P(F | S̄T̄)P(S̄T̄)
The last two terms vanish: tornadoes occur only during a storm, and there is no structural failure without a storm. Hence
P(F) = P(F | ST)P(ST) + P(F | ST̄)P(ST̄)
Again, since H and H̄ are mutually exclusive and collectively exhaustive,
P(F | ST) = P[(F | ST) | H] P(H) + P[(F | ST) | H̄] P(H̄) = 1.0×0.05 + 0.1×0.95 = 0.145
P(ST) = P(T | S) P(S) = 0.25×0.2 = 0.05
P(ST̄) = P(T̄ | S) P(S) = 0.75×0.2 = 0.15
P(F | ST̄) = 0.1
∴ P(F) = 0.145×0.05 + 0.1×0.15 = 0.02225
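The same computation in a short Python sketch (variable names are ours; the probabilities are those given in the problem statement):

```python
p_storm = 0.20
p_tornado_given_storm = 0.25
p_hit_given_tornado = 0.05
p_fail_hit = 1.0             # a tornado hitting the city surely causes failure
p_fail_tornado_miss = 0.10
p_fail_storm_only = 0.10

# Condition on the tornado hitting or missing the city
p_fail_given_ST = (p_fail_hit * p_hit_given_tornado
                   + p_fail_tornado_miss * (1 - p_hit_given_tornado))  # 0.145
p_ST = p_tornado_given_storm * p_storm             # storm with tornado: 0.05
p_STbar = (1 - p_tornado_given_storm) * p_storm    # storm, no tornado: 0.15
p_fail = p_fail_given_ST * p_ST + p_fail_storm_only * p_STbar  # 0.02225
```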
Bayes’s Theorem
∵ P(AB) = P(A | B)P(B) = P(B | A)P(A)
∴ P(B | A) = P(A | B)P(B)/P(A)
If B1, B2, . . . , Bn are mutually exclusive and collectively exhaustive events of which one must occur, then
∵ P(ABi) = P(A | Bi)P(Bi) = P(Bi | A)P(A)
∴ P(Bi | A) = P(A | Bi)P(Bi)/P(A) = P(A | Bi)P(Bi) / ∑_{k=1}^{n} P(A | Bk)P(Bk)
Bayes's theorem provides a formula for finding the probability that the "effect" A was "caused" by the event Bi.
Example
The computing system at the University of Waterloo is currently undergoing
shutdown for repairs. Previous shutdowns have been due to hardware failure,
software failure, or power failure. The system is forced to shut down 73%
of the time when it experiences hardware problems, 12% of the time when it
experiences software problems, and 88% of the time when it experiences power
problems. Maintenance engineers have determined that the probabilities of
hardware, software, and power problems are 0.01, 0.05, and 0.02, respectively.
The probability that any two types of failure occur simultaneously is negligible.
1. What is the probability that the current shutdown is due to hardware failure?
2. What is the probability that the current shutdown is due to software failure?
3. What is the probability that the current shutdown is due to power failure?
F = Computing system shutdown
H = hardware failure, P(H) = 0.01, P(F | H) = 0.73
S = software failure, P(S) = 0.05, P(F | S) = 0.12
P = power failure, P(P) = 0.02, P(F | P) = 0.88
When computing system shutdown occurs, one and only one of H, S, P must
occur. From the Total Probability Formula:
P(F) = P(F | H) P(H) + P(F | S) P(S) + P(F | P) P(P)
= 0.73×0.01 + 0.12×0.05 + 0.88×0.02 = 0.0309
1. P(H | F) = P(HF)/P(F) = 0.0073/0.0309 = 0.236
2. P(S | F) = P(SF)/P(F) = 0.0060/0.0309 = 0.194
3. P(P | F) = P(PF)/P(F) = 0.0176/0.0309 = 0.570
Hence, if there is a computing system failure, the maintenance engineers should check the power system first, the hardware second, and the software last.
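The total-probability and Bayes steps can be reproduced in a few lines of Python (the dictionary keys are our labels for the three causes):

```python
priors = {"hardware": 0.01, "software": 0.05, "power": 0.02}
p_shutdown_given = {"hardware": 0.73, "software": 0.12, "power": 0.88}

# Total probability of a shutdown
p_shutdown = sum(p_shutdown_given[c] * priors[c] for c in priors)  # 0.0309

# Bayes: posterior probability of each cause, given the shutdown
posterior = {c: p_shutdown_given[c] * priors[c] / p_shutdown for c in priors}
```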
Probability and Statistics — Discrete Probability Distributions © Wei-Chau Xie
Discrete Probability Distributions
A random variable X is discrete if X takes only a finite or countably infinite number of distinct values x, with corresponding probabilities (probability mass function)
pX(x) = P(X = x), 0 ≤ pX(x) ≤ 1
Normalization condition: ∑ pX(x) = 1
Expected value: E[X] = ∑ x pX(x)
Variance: Var(X) = ∑ (x − E[X])² pX(x) = E[X²] − {E[X]}²
Standard deviation: σ = √Var(X)
Some Properties:
E[c] = c, Var(c) = 0, where c is an arbitrary constant.
E[c1 X1 + c2 X2] = c1 E[X1] + c2 E[X2]
If X1 and X2 are independent random variables:
E[X1 X2] = E[X1] E[X2], Var(c1 X1 + c2 X2) = c1² Var(X1) + c2² Var(X2)
Bernoulli Trials and the Binomial Distribution
Bernoulli trials are identical, independent experiments, in each of which
an event A may occur with probability p or fail to occur with probability
q = 1−p.
In n consecutive Bernoulli trials, each elementary event ω can be described
by a sequence like 1011· · · 01001 (n digits), where success (or failure) in the
ith trial is denoted by 1 (or 0) in the ith digit.
The probability of an elementary event ω in which there are precisely k successes and (n−k) failures is P(ω) = p^k q^{n−k}.
X = total number of successes in n Bernoulli trials
X = k if there are precisely k successes in the elementary event ω.
The number of distinct such sequences is C(n, k) (selecting k of the n trials for successes).
P(X = k) = probability of having k successes (n−k failures) in n Bernoulli trials:
P(X = k) = C(n, k) p^k q^{n−k}, k = 0, 1, . . . , n — the Binomial Distribution
Example
Tossing a coin, X = Getting heads
1: Heads, p; 0: Tails, q = 1 − p
Tossing a coin three times, what is the probability of getting one heads?
Sequence 1 0 0: probability = p·q·q = p¹q²
Sequence 0 1 0: probability = q·p·q = p¹q²
Sequence 0 0 1: probability = q·q·p = p¹q²
Number of sequences = C(3, 1) = 3
P(X = 1) = P{Getting one heads in three tosses} = p·q·q + q·p·q + q·q·p = 3p¹q² = C(3, 1) p¹q²
Example
Suppose the probability of hitting a target with a single shot is 0.001. What is the
probability of hitting the target 2 or more times in 5000 shots?
p = probability of hitting a target with a single shot = 0.001, q = 1−p = 0.999
Hk = hitting the target k times in 5000 shots
A = hitting the target 2 or more times in 5000 shots = H2 ∪ H3 ∪ · · · ∪ H5000
Ā = H0 ∪ H1
P(Ā) = P(H0 ∪ H1) = P(H0) + P(H1)
= C(5000, 0) p⁰ q^5000 + C(5000, 1) p¹ q^4999 = 1·1·q^5000 + 5000·p·q^4999
= q^4999 (q + 5000p)
= 0.999^4999 (0.999 + 5000×0.001) = 0.0404
∴ P(A) = 1 − P(Ā) = 0.9596
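A direct numerical check with `math.comb` (exact binomial, no approximation):

```python
from math import comb

p, q, n = 0.001, 0.999, 5000

# P(0 or 1 hits), then the complement
p0 = comb(n, 0) * p**0 * q**n
p1 = comb(n, 1) * p**1 * q**(n - 1)
p_two_or_more = 1 - (p0 + p1)  # ~0.9596
```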
Poisson Distribution
Consider an event A with the following assumptions
The event can occur at random at any time.
The occurrence of an event in a given time interval is independent of that in
any other non-overlapping intervals.
λ = mean rate of occurrence of event A
a = λt = average number of occurrences of event A in time interval t
Random variable X = number of occurrences of event A in time interval t
P(X = k) = a^k e^{−a}/k!, k = 0, 1, 2, . . . — the Poisson Distribution
Expected value: E[X] = a = λt
Variance: Var(X) = a = λt
Return period = 1/λ = mean time between two consecutive occurrences
Example
A building is located in a region where earthquakes may occur, which may be
modelled as a Poisson process. From past record, the mean rate of occurrence of
a large earthquake that may cause damage to the building is 1 in 50 years. Assume
that during a strong earthquake, the probability of damage to the building is 0.1.
Consider a 10-year period.
1. Determine the probability of the building being subjected to earthquakes.
2. What is the probability that the building will be damaged?
Q = number of earthquakes in a 10-year period
t = 10, λ = 1/50, a = λt = 0.2
P(Q = 0) = a⁰e^{−a}/0! = e^{−a} = e^{−0.2} = 0.819
∴ Probability of the building being subjected to earthquakes = P(Q ≥ 1) = 1 − P(Q = 0) = 1 − 0.819 = 0.181
D = the building will be damaged by earthquakes in a 10-year period
p = probability of survival of the building in an earthquake = 0.9
Since Q = 0, 1, 2, . . . are mutually exclusive and collectively exhaustive, from the Total Probability Formula
P(D̄) = P(D̄ | Q = 0) P(Q = 0) + P(D̄ | Q = 1) P(Q = 1) + ··· + P(D̄ | Q = n) P(Q = n) + ···
= 1×(0.2⁰e^{−0.2}/0!) + 0.9¹×(0.2¹e^{−0.2}/1!) + ··· + 0.9ⁿ×(0.2ⁿe^{−0.2}/n!) + ···
= e^{−0.2}[1 + (0.9×0.2)¹/1! + ··· + (0.9×0.2)ⁿ/n! + ···]
= e^{−0.2} e^{0.9×0.2} = e^{−0.02} = 0.9802
P(D) = 1 − P(D̄) = 0.0198
(Using the series e^x = 1 + x/1! + x²/2! + ··· + xⁿ/n! + ···)
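The series can be summed numerically and compared with the closed form e^{−(1−p)λt}; a short Python sketch (truncating the series at k = 50, far beyond any significant term):

```python
from math import exp, factorial

lam_t = 0.2      # a = λt, expected number of earthquakes in 10 years
p_survive = 0.9  # building survives a given earthquake

# P(no damage) = sum over k of P(survive all k quakes) * P(Q = k)
p_no_damage = sum(p_survive**k * lam_t**k * exp(-lam_t) / factorial(k)
                  for k in range(50))
p_damage = 1 - p_no_damage                         # ~0.0198
closed_form = 1 - exp(-(1 - p_survive) * lam_t)    # 1 - e^{-0.02}
```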
Probability and Statistics — Continuous Probability Distributions © Wei-Chau Xie
Continuous Probability Distributions
A random variable X is continuous if P(x1 < X ≤ x2) = ∫_{x1}^{x2} fX(x) dx.
fX(x) ≥ 0 is the probability density function (PDF).
Normalization condition: ∫_{−∞}^{+∞} fX(x) dx = 1
If X is a continuous random variable, P(X = x) = 0.
[Figure: PDF fX(x) with shaded area P(x1 < X < x2), and the distribution function FX(x)]
FX(x) = P(X ≤ x) = ∫_{−∞}^{x} fX(x) dx, −∞ < x < +∞
is the probability distribution function.
P(x1 < X ≤ x2) = FX(x2) − FX(x1)
fX(x) = dFX(x)/dx
Expected value: E[X] = ∫_{−∞}^{+∞} x fX(x) dx
Variance: Var(X) = E[{X − E[X]}²] = E[X²] − {E[X]}² = ∫_{−∞}^{+∞} {x − E[X]}² fX(x) dx
Normal Distribution
Normal distribution is the most important distribution in statistical analysis.
Probability density function of the normally distributed random variable X:
fX(x) = (1/(√(2π) σ)) e^{−(x−µ)²/(2σ²)}, −∞ < x < +∞
The expected value (mean) of X is µ and the standard deviation is σ; X ∼ N(µ, σ²).
[Figure: normal PDFs for (µ=0, σ=1), (µ=0, σ=2), (µ=0, σ=3), (µ=2, σ=2.5), (µ=−4, σ=1.5)]
Standard Normal Random Variable
A normally distributed random variable with µ = 0, σ = 1 is called a standard normal random variable, Z ∼ N(0, 1²).
Probability density function: fZ(z) = (1/√(2π)) e^{−z²/2}
Probability distribution function: Φ(z) = P(Z ≤ z) = ∫_{−∞}^{z} (1/√(2π)) e^{−y²/2} dy
P(Z < z) = Φ(z), P(Z > z) = 1 − Φ(z), P(z1 < Z < z2) = Φ(z2) − Φ(z1)
[Figure: standard normal PDF with the corresponding shaded areas]
Φ(z) is tabulated in Table 4.
For any normal random variable X ∼ N(µ, σ²):
P(X ≤ x) = Φ((x−µ)/σ), P(X > x) = 1 − Φ((x−µ)/σ)
P(x1 < X ≤ x2) = Φ((x2−µ)/σ) − Φ((x1−µ)/σ)
Example
A structure is resting on three supports, A, B, and C. Although the loads on
the three supports can be estimated accurately, the soil conditions under A, B,
and C are not completely predictable. Assume that the settlements SA, SB, and
SC are independent normally distributed random variables with means 2, 2.5,
3 cm and coefficients of variation 20%, 20%, 25%, respectively. What is the
probability that the maximum settlement will exceed 4 cm?
µA = 2, σA = 0.2 µA = 0.4
µB = 2.5, σB = 0.2 µB = 0.5
µC = 3, σC = 0.25 µC = 0.75
P(max S > 4) = 1 − P(max S ≤ 4) = 1 − P(SA ≤ 4 ∩ SB ≤ 4 ∩ SC ≤ 4)
= 1 − P(SA ≤ 4) P(SB ≤ 4) P(SC ≤ 4)
= 1 − Φ((4−2)/0.4) Φ((4−2.5)/0.5) Φ((4−3)/0.75)
= 1 − Φ(5.0) Φ(3.0) Φ(1.33) = 1 − 1.0×0.9986×0.9082 = 0.093
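Since Φ(z) = (1/2)(1 + erf(z/√2)), the result can be checked with the Python standard library alone. The slide rounds table values; the unrounded answer is ≈ 0.0924 (variable names are ours):

```python
from math import erf, sqrt

def phi(z):
    """Standard normal distribution function Φ(z), via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

supports = [(2.0, 0.4), (2.5, 0.5), (3.0, 0.75)]  # (mean, std dev) of SA, SB, SC

p_all_below = 1.0
for mu, sigma in supports:
    p_all_below *= phi((4.0 - mu) / sigma)  # independence: probabilities multiply

p_max_exceeds_4 = 1 - p_all_below  # ~0.092-0.093
```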
Example
The diameters of rotor shafts DS in a lot and the inner diameters of bearings DB in another lot are DS ∼ N(0.249, 0.003²) and DB ∼ N(0.255, 0.002²).
1. Determine the mean and standard deviation of the clearance c = (1/2)DB − (1/2)DS between shafts and bearings selected from these lots.
2. If a shaft and a bearing are selected at random, what is the probability that the shaft will not fit inside the bearing?
1. E[c] = E[(1/2)DB − (1/2)DS] = (1/2)E[DB] − (1/2)E[DS] = (1/2)(0.255) − (1/2)(0.249) = 0.003
Var(c) = Var((1/2)DB − (1/2)DS) = (1/2)² Var(DB) + (−1/2)² Var(DS)
= (1/4)(0.002²) + (1/4)(0.003²) = 3.25×10⁻⁶
σc = √Var(c) = √(3.25×10⁻⁶) = 0.0018
2. P(DS > DB) = P(c < 0) = Φ((0 − 0.003)/0.0018) = Φ(−1.67) = 0.0475
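A numerical check (the slide rounds z to −1.67, giving 0.0475; the unrounded value is ≈ 0.048):

```python
from math import erf, sqrt

def phi(z):
    """Standard normal distribution function Φ(z), via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

mu_B, sd_B = 0.255, 0.002  # bearing inner diameter
mu_S, sd_S = 0.249, 0.003  # shaft diameter

# c = DB/2 - DS/2: means subtract, variances add with coefficient (1/2)^2
mu_c = 0.5 * mu_B - 0.5 * mu_S                # 0.003
sd_c = sqrt(0.25 * sd_B**2 + 0.25 * sd_S**2)  # ~0.0018
p_no_fit = phi((0 - mu_c) / sd_c)             # P(c < 0)
```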
Lognormal Distribution
If X ∼ N(µ, σ²), then Y = e^X is lognormal, Y ∼ LN(µ, σ²), with PDF
fY(y) = (1/(√(2π) σ y)) e^{−(ln y − µ)²/(2σ²)} for y > 0, and fY(y) = 0 for y ≤ 0
A random variable Y has a lognormal probability distribution if ln Y is normal.
[Figure: lognormal PDFs for µ = 1 and σ = 2, 1, 0.5, 0.25]
µ and σ are not the mean and standard deviation of the lognormal distribution Y.
Expected value: E[Y] = e^{µ+σ²/2}
Variance: Var(Y) = e^{2µ+σ²}(e^{σ²} − 1)
Given the mean E[Y] and the variance Var(Y) of the lognormal distribution, the parameters µ and σ are
σ² = ln[1 + Var(Y)/{E[Y]}²] = ln(1 + δY²), µ = ln{E[Y]} − (1/2)σ²
P(Y ≤ y) = Φ((ln y − µ)/σ)
The lognormal distribution is useful in those applications where the values of the
random variable are known to be strictly positive; for example, the strength and
fatigue life of material, the intensity of rainfall, the time for project completion, and
the volume of air traffic.
Example
The operational life T of a certain equipment is lognormally distributed with
a mean life of 1500 hours and a coefficient of variation of 30%. What is
the probability that two of the five equipments used, which are statistically
independent, will malfunction in less than 900 hours of operation?
E[T] = 1500, δT = 0.3
σ² = ln[1 + Var(T)/{E[T]}²] = ln(1 + δT²) = ln(1 + 0.3²) = 0.086, σ = 0.294
µ = ln{E[T]} − (1/2)σ² = ln 1500 − (1/2)(0.086) = 7.270
p = P(T < 900) = Φ((ln 900 − µ)/σ) = Φ((ln 900 − 7.270)/0.294) = Φ(−1.59) = 0.0559
P(2 of the 5 equipments will malfunction in less than 900 hours of operation)
= C(5, 2) p²(1−p)³ = 10×0.0559²×(1−0.0559)³ = 0.026
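The whole chain — lognormal parameters, Φ, then the binomial count — in one short Python sketch (variable names are ours):

```python
from math import comb, erf, log, sqrt

def phi(z):
    """Standard normal distribution function Φ(z), via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

mean_T, delta_T = 1500.0, 0.3

sigma2 = log(1 + delta_T**2)      # ~0.086
sigma = sqrt(sigma2)              # ~0.294
mu = log(mean_T) - 0.5 * sigma2   # ~7.270

p = phi((log(900) - mu) / sigma)           # P(T < 900), ~0.056
p_2_of_5 = comb(5, 2) * p**2 * (1 - p)**3  # binomial: exactly 2 of 5 fail
```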
Exponential Distribution
If events occur according to a Poisson process with mean rate of occurrence λ, then the time T between two consecutive occurrences of the event has an exponential distribution.
fT(t) = λe^{−λt} for t ≥ 0, and fT(t) = 0 for t < 0
FT(t) = P(T ≤ t) = 1 − e^{−λt} ⟹ P(T > t) = e^{−λt}
[Figure: exponential PDF fT(t) and distribution function FT(t)]
Expected value: E[T] = 1/λ = mean recurrence time = return period
Variance: Var(T) = 1/λ²
Example
Historical records of earthquakes in San Francisco show that, during the period
1836-1961, there were 16 earthquakes of intensity VI or more. Assume that the
occurrence of such earthquakes in this region follows a Poisson process.
1. What is the probability that such earthquakes will occur in the next 2 years?
2. What is the probability that no earthquake of this high intensity will occur
in the next 10 years?
3. What are the return period of an intensity VI or higher earthquake and the
probability of such earthquakes occurring within the return period?
Mean rate of occurrence λ = 16/126 = 0.127 earthquakes per year
(1) P(T ≤ 2) = 1 − e^{−λt} = 1 − e^{−0.127×2} = 0.224
(2) P(no earthquake of this high intensity will occur in the next 10 years) = P(T > 10) = e^{−λt} = e^{−0.127×10} = 0.281
(3) Return period = 1/λ = 126/16 = 7.875 years
P(T ≤ 7.875) = 1 − e^{−λt} = 1 − e^{−0.127×7.875} = 1 − e^{−1} = 0.632
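The three parts in a few lines of Python:

```python
from math import exp

lam = 16 / 126  # mean rate: 16 earthquakes in 126 years

p_within_2yr = 1 - exp(-lam * 2)   # ~0.224
p_none_10yr = exp(-lam * 10)       # ~0.281
return_period = 1 / lam            # 7.875 years
p_within_return = 1 - exp(-lam * return_period)  # 1 - e^{-1} ~ 0.632
```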
Probability and Statistics — Sampling Distribution © Wei-Chau Xie
Consider a population with mean µ and standard deviation σ.
Take a sample of size n: x1, x2, . . . , xn.
Sample mean: X̄ = (1/n) ∑_{i=1}^{n} xi
Central Limit Theorem
If X̄ is the mean of a random sample of size n taken from a population having mean µ and standard deviation σ, then
z = (X̄ − µ)/(σ/√n)
is the value of a random variable whose distribution function approaches that of the standard normal distribution as n → ∞.
For n ≥ 30, X̄ is a value of a random variable approximately N(µ, σX̄²).
σX̄ = σ/√n is called the standard error.
Probability and Statistics — Confidence Intervals © Wei-Chau Xie
Confidence Intervals
Confidence interval estimation of µ (σ known)
X − zα/2 · σ√n
< µ < X + zα/2 · σ√n
with (1−α)100% confidence
Confidence interval estimation of µ (σ unknown)
X − zα/2 · s√n
< µ < X + zα/2 · s√n
, Large sample (n>30)
X − tα/2, (n−1)
· s√n
< µ < X + tα/2, (n−1)
· s√n
, Small sample
with (1−α)100% confidence
Confidence interval for σ 2
(n−1)s2
χ2α/2, (n−1)
< σ 2 <(n−1)s2
χ21−α/2, (n−1)
with (1−α)100% confidence
Confidence interval estimation of the difference between means (µ1 − µ2):
Large sample (n1, n2 ≥ 30), with (1−α)100% confidence:
X̄1 − X̄2 ± z_{α/2} √(s1²/n1 + s2²/n2)
Small sample, with (1−α)100% confidence:
X̄1 − X̄2 ± t_{α/2,(n1+n2−2)} √{[(n1−1)s1² + (n2−1)s2²]/(n1 + n2 − 2) · (n1 + n2)/(n1 n2)}
Example
The daily dissolved oxygen (DO) concentration for a stream at a station has been recorded for 30 days, giving a sample mean X̄ = 2.52 mg/L and sample standard deviation s = 4.2 mg/L. Determine the 95% and 99% confidence intervals for the population mean µ.
95% confidence interval ⟹ 1−α = 0.95, α = 0.05
t_{α/2, n−1} = t_{0.025, 29} = 2.045
µ = X̄ ± t_{0.025, 29} s/√n = 2.52 ± 2.045×4.2/√30 = 2.52 ± 1.57
∴ 0.95 < µ < 4.09
99% confidence interval ⟹ 1−α = 0.99, α = 0.01
t_{α/2, n−1} = t_{0.005, 29} = 2.756
µ = X̄ ± t_{0.005, 29} s/√n = 2.52 ± 2.756×4.2/√30 = 2.52 ± 2.11
∴ 0.41 < µ < 4.63
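A sketch of the interval computation; the t-values are taken from the slide's t-table rather than computed (Python's standard library has no t-distribution):

```python
from math import sqrt

n, xbar, s = 30, 2.52, 4.2            # sample size, sample mean, sample std dev
t_table = {0.05: 2.045, 0.01: 2.756}  # t_{alpha/2, 29} from a t-table

cis = {}
for alpha, t in t_table.items():
    hw = t * s / sqrt(n)              # half-width of the interval
    cis[alpha] = (xbar - hw, xbar + hw)
# cis[0.05] ~ (0.95, 4.09), cis[0.01] ~ (0.41, 4.63)
```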
Probability and Statistics — Linear Regression © Wei-Chau Xie
Linear Regression
The fitting of a straight line through a given set of points according to some
specified goodness-of-fit criterion is called linear regression.
The Method of Least Squares
Suppose the true linear relationship between X and Y is Y(X) = A + BX.
We have n paired observations (xi, yi), i = 1, 2, . . . , n, which are used to obtain a and b, estimated values of A and B.
For the ith observation, the independent variable is xi, and the predicted value of Y is Y(xi) = a + b xi.
The goal of least-squares regression is to find the values of a and b that minimize the error
E = ∑_{i=1}^{n} [yi − Y(xi)]² = ∑_{i=1}^{n} [yi − (a + b xi)]²
Estimated Linear Regression Coefficients
a = Ȳ − bX̄, b = SXY/SXX
where
X̄ = (1/n) ∑_{i=1}^{n} xi, Ȳ = (1/n) ∑_{i=1}^{n} yi
SXY = ∑_{i=1}^{n} (xi − X̄)(yi − Ȳ) = ∑_{i=1}^{n} xi yi − n X̄Ȳ
SXX = ∑_{i=1}^{n} (xi − X̄)² = ∑_{i=1}^{n} xi² − n X̄²
SYY = ∑_{i=1}^{n} (yi − Ȳ)² = ∑_{i=1}^{n} yi² − n Ȳ²
[Figure: scatter of Y about the sample mean Ȳ and about the estimated regression line Y(X)]
sY = √(SYY/(n−1)), sY|X = √((SYY − b SXY)/(n−2))
Total variability sY measures the total variability in Y about the sample mean Ȳ without considering the effect of X.
The standard error of estimate sY|X measures the variability about the estimated regression line Y(X) = a + bX.
With (1−α)100% Confidence
Confidence interval for the conditional mean given X:
µY|X = (a + bX) ± t_{α/2,(n−2)}·sY|X √(1/n + (X − X̄)²/SXX)
[Figure: (1−α)100% confidence band for µY|X about the regression line Y(X) = a + bX]
Confidence interval of prediction for an individual Y given X:
Y(X) = (a + bX) ± t_{α/2,(n−2)}·sY|X √(1/n + (X − X̄)²/SXX + 1)
Correlation
Regression analysis is concerned with predicting the value of the dependent
variable Y for a known value of the independent variable X.
Correlation analysis is concerned with the nature of the relationship between
the two variables, focusing on the strength of that relationship.
The magnitude of the correlation coefficient ρ (−1 ≤ ρ ≤ +1) is a measure of the degree of linear inter-relationship between two variables.
When ρ = ±1.0, X and Y are linearly related.
When ρ = 0, there is no linear relationship between X and Y.
Sample correlation coefficient: r = SXY/√(SXX·SYY)
[Figure: scatter plots illustrating positive correlation (0 < ρ < 1), negative correlation (−1 < ρ < 0), perfect positive correlation (ρ = +1), and perfect negative correlation (ρ = −1)]
Example
A stress-strain relationship is to be established for steel beams by measuring
the strain of a specimen Y under the given stress X. The following sample
measurements are obtained:
X 14 23 9 17 10 22 5 12 6 16
Y 68 105 40 79 81 95 31 72 45 93
∑_{i=1}^{10} xi = 134, ∑_{i=1}^{10} yi = 709, ∑_{i=1}^{10} xi yi = 10747, ∑_{i=1}^{10} xi² = 2140, ∑_{i=1}^{10} yi² = 55895
X̄ = (1/n) ∑ xi = (1/10)×134 = 13.4, Ȳ = (1/n) ∑ yi = (1/10)×709 = 70.9
SXX = ∑ xi² − nX̄² = 2140 − 10×13.4² = 344.4
SYY = ∑ yi² − nȲ² = 55895 − 10×70.9² = 5626.9
SXY = ∑ xi yi − nX̄Ȳ = 10747 − 10×13.4×70.9 = 1246.4
Use the method of least squares to determine the expression for the estimated
regression line.
b = SXY/SXX = 1246.4/344.4 = 3.619
a = Ȳ − bX̄ = 70.9 − 3.619×13.4 = 22.405
∴ Y(x0) = a + b x0 = 22.405 + 3.619 x0
Compute the sample standard deviation for the strain Y.
sY = √(SYY/(n−1)) = √(5626.9/9) = 25.004
Compute the standard error of the estimate for strain Y.
sY|X = √((SYY − b SXY)/(n−2)) = √((5626.9 − 3.619×1246.4)/8) = 11.812
Determine the predicted strain when the stress is 10.
Y(x0) = 22.405 + 3.619 x0 = 22.405 + 3.619×10 = 58.595
Find a 95% confidence interval for the predicted strain when the stress is 10.
Y(10) = (a + bX) ± t_{α/2,(n−2)}·sY|X √(1/n + (X − X̄)²/SXX + 1)
From Table 6, t_{0.025, 8} = 2.306
Y(10) = 58.595 ± 2.306×11.812×√(1/10 + (10−13.4)²/344.4 + 1) = 58.595 ± 29.009
∴ 29.586 ≤ Y(10) ≤ 87.604 with 95% confidence
Find a 95% confidence interval for the mean strain when the stress is 10.
µY|X = (a + bX) ± t_{α/2,(n−2)}·sY|X √(1/n + (X − X̄)²/SXX)
= 58.595 ± 2.306×11.812×√(1/10 + (10−13.4)²/344.4) = 58.595 ± 9.956
∴ 48.639 ≤ µY|X ≤ 68.551 with 95% confidence
Determine the sample correlation coefficient.
r = SXY/√(SXX·SYY) = 1246.4/√(344.4×5626.9) = 0.895
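The entire regression example can be reproduced from the raw data in a short Python sketch (standard library only; variable names are ours):

```python
from math import sqrt

X = [14, 23, 9, 17, 10, 22, 5, 12, 6, 16]   # stress
Y = [68, 105, 40, 79, 81, 95, 31, 72, 45, 93]  # strain
n = len(X)

xbar, ybar = sum(X) / n, sum(Y) / n
Sxx = sum(x * x for x in X) - n * xbar**2
Syy = sum(y * y for y in Y) - n * ybar**2
Sxy = sum(x * y for x, y in zip(X, Y)) - n * xbar * ybar

b = Sxy / Sxx                            # slope, ~3.619
a = ybar - b * xbar                      # intercept, ~22.405
s_y = sqrt(Syy / (n - 1))                # ~25.004
s_yx = sqrt((Syy - b * Sxy) / (n - 2))   # standard error of estimate, ~11.812
r = Sxy / sqrt(Sxx * Syy)                # ~0.895

# 95% prediction interval at x0 = 10 (t_{0.025, 8} = 2.306 from a t-table)
x0, t = 10.0, 2.306
y0 = a + b * x0                                         # ~58.595
hw = t * s_yx * sqrt(1 / n + (x0 - xbar)**2 / Sxx + 1)  # half-width, ~29
```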
[Figure: scatter plot of strain Y versus stress X with the regression line Y(X) = 22.405 + 3.619X, showing the predicted strain at stress 10 and the 95% confidence intervals for the mean strain and for the predicted strain at stress 10]