EE401 (Semester 1) Jitkomut Songsiri
1. Review of Probability
• Random Experiments
• The Axioms of Probability
• Conditional Probability
• Independence of Events
• Sequential Experiments
1-1
Random Experiments
An experiment in which the outcome varies in an unpredictable fashion when the experiment is repeated under the same conditions
Examples:
• Select a ball from an urn containing balls numbered 1 to n
• Toss a coin and note the outcome
• Roll a die and note the outcome
• Measure the time between page requests in a Web server
• Pick a number at random between 0 and 1
Review of Probability 1-2
Sample space
Sample space is the set of all possible outcomes, denoted by S
• obtained by listing all the elements, e.g., S = {H,T}, or
• giving a property that specifies the elements, e.g., S = {x | 0 ≤ x ≤ 3}
Same experimental procedure may have different sample spaces
(figure: two sample spaces on the x–y plane: S1, the unit square, for experiment 1, and S2, the triangle 0 ≤ y ≤ x ≤ 1, for experiment 2)
• Experiment 1: Pick two numbers at random between zero and one
• Experiment 2: Pick a number X at random between 0 and 1, then pick a number Y at random between 0 and X
Review of Probability 1-3
Three possibilities for the number of outcomes in sample spaces
finite, countably infinite, uncountably infinite
Examples:
S1 = {1, 2, 3, . . . , 10}
S2 = {HH, HT, TT, TH}
S3 = {x ∈ Z | 0 ≤ x ≤ 10}
S4 = {1, 2, 3, . . .}
S5 = {(x, y) ∈ R × R | 0 ≤ y ≤ x ≤ 1}
S6 = set of functions X(t) for which X(t) = 0 for t ≥ t0
Discrete sample space: if S is countable (S1, S2, S3, S4)
Continuous sample space: if S is not countable (S5, S6)
Review of Probability 1-4
Events
An event is a subset of a sample space; it occurs when the outcome satisfies certain conditions
Examples: Ak denotes an event corresponding to the experiment Ek
E1 : Select a ball from an urn containing balls numbered 1 to 10
A1 : An even-numbered ball (from 1 to 10) is selected
S1 = {1, 2, 3, . . . , 10}, A1 = {2, 4, 6, 8, 10}
E2 : Toss a coin twice and note the sequence of heads and tails
A2 : The two tosses give the same outcome
S2 = {HH,HT,TT,TH}, A2 = {HH,TT}
Review of Probability 1-5
E3 : Count # of voice packets containing only silence from 10 speakers
A3 : No active packets are produced
S3 = {x ∈ Z | 0 ≤ x ≤ 10}, A3 = {0}
Two events of special interest:
• Certain event, S, which consists of all outcomes and hence always occurs
• Impossible event or null event, ∅, which contains no outcomes and never occurs
Review of Probability 1-6
Review of Set Theory
• A = B if and only if A ⊂ B and B ⊂ A
• A ∪B (union): set of outcomes that are in A or in B
• A ∩B (intersection): set of outcomes that are in A and in B
• A and B are disjoint or mutually exclusive if A ∩ B = ∅
• Ac (complement): set of all elements not in A
• A ∪B = B ∪A and A ∩B = B ∩A
• A ∪ (B ∪ C) = (A ∪B) ∪ C and A ∩ (B ∩ C) = (A ∩B) ∩ C
• A∪ (B ∩C) = (A∪B)∩ (A∪C) and A∩ (B ∪C) = (A∩B)∪ (A∩C)
• DeMorgan’s Rules
(A ∪B)c = Ac ∩Bc, (A ∩B)c = Ac ∪Bc
Review of Probability 1-7
Axioms of Probability
Probabilities are numbers assigned to events indicating how likely it is that the events will occur

A probability law is a rule that assigns a number P (A) to each event A

P (A) is called the probability of A and satisfies the following axioms
Axiom 1 P (A) ≥ 0
Axiom 2 P (S) = 1
Axiom 3 If A ∩B = ∅ then P (A ∪B) = P (A) + P (B)
Review of Probability 1-8
Probability Facts
• P (Ac) = 1− P (A)
• P (A) ≤ 1
• P (∅) = 0
• If A1, A2, . . . , An are pairwise mutually exclusive then
P (A1 ∪ A2 ∪ · · · ∪ An) = P (A1) + P (A2) + · · · + P (An)
• P (A ∪B) = P (A) + P (B)− P (A ∩B)
• If A ⊂ B then P (A) ≤ P (B)
Review of Probability 1-9
Conditional Probability
The probability of event A given that event B has occurred

The conditional probability, P (A|B), is defined as

P (A|B) = P (A ∩ B)/P (B), for P (B) > 0

If B is known to have occurred, then A can occur only if A ∩ B occurs

Conditioning simply renormalizes the probability of events that occur jointly with B
Useful in finding probabilities in sequential experiments
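As a quick numerical sketch (not part of the slides), the definition can be checked on a small finite sample space, here a fair six-sided die with uniform outcomes:

```python
from fractions import Fraction

def prob(event, space):
    # uniform probability of an event (a subset of outcomes) over a finite sample space
    return Fraction(len(event & space), len(space))

def cond_prob(A, B, space):
    # P(A|B) = P(A ∩ B) / P(B), defined only when P(B) > 0
    return prob(A & B, space) / prob(B, space)

S = set(range(1, 7))   # roll a fair die
A = {2, 4, 6}          # even outcome
B = {4, 5, 6}          # outcome greater than 3

print(cond_prob(A, B, S))   # 2/3
```

Conditioning on B shrinks the sample space to B and renormalizes, which is exactly what the ratio computes.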
Review of Probability 1-10
Example: Tree diagram of picking balls
Selecting two balls at random without replacement
(tree diagram: the first draw gives B1 with probability 2/5 or W1 with probability 3/5; the second-draw branches have probabilities 1/4, 3/4, 2/4, 2/4, giving path probabilities 1/10, 3/10, 3/10, 3/10)
B1, B2 are the events of getting a black ball in the first and second draw
P (B2|B1) = 1/4, P (W2|B1) = 3/4, P (B2|W1) = 2/4, P (W2|W1) = 2/4

The probability of a path is the product of the transition probabilities along it

P (B1 ∩ B2) = P (B2|B1)P (B1) = (1/4)(2/5) = 1/10
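As a sketch outside the slides, the path probability can be verified by brute-force enumeration; an urn with 2 black and 3 white balls is a hypothetical composition consistent with P (B1) = 2/5 and P (B2|B1) = 1/4:

```python
from fractions import Fraction
from itertools import permutations

# hypothetical urn contents consistent with the tree's probabilities:
# 2 black and 3 white balls, drawn without replacement
balls = ['B', 'B', 'W', 'W', 'W']

# all ordered draws of two distinct balls are equally likely
draws = list(permutations(balls, 2))
p_bb = Fraction(sum(1 for d in draws if d == ('B', 'B')), len(draws))
print(p_bb)   # 1/10, matching P(B2|B1)P(B1) = (1/4)(2/5)
```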
Review of Probability 1-11
Example: Tree diagram of Binary Communication
(tree diagram: the input is 1 with probability p and 0 with probability 1 − p; each input is received incorrectly with probability ε, giving leaf probabilities (1 − p)(1 − ε), (1 − p)ε, pε, p(1 − ε))
Ai: event the input was i, Bi: event the receiver output was i
P (A0 ∩B0) = (1− p)(1− ε)
P (A0 ∩B1) = (1− p)ε
P (A1 ∩B0) = pε
P (A1 ∩B1) = p(1− ε)
Review of Probability 1-12
Theorem on Total Probability
Let B1, B2, . . . , Bn be mutually exclusive events such that
S = B1 ∪B2 ∪ · · · ∪Bn
(their union equals the sample space)
Event A can be partitioned as
A = A ∩ S = (A ∩B1) ∪ (A ∩B2) ∪ · · · ∪ (A ∩Bn)
Since A ∩Bk are disjoint, the probability of A is
P (A) = P (A ∩B1) + P (A ∩B2) + · · ·+ P (A ∩Bn)
or equivalently,
P (A) = P (A|B1)P (B1) + P (A|B2)P (B2) + · · ·+ P (A|Bn)P (Bn)
Review of Probability 1-13
Example: revisit the tree diagram of picking two balls
(tree diagram as before: P (B1) = 2/5, P (W1) = 3/5, with second-draw transition probabilities 1/4, 3/4, 2/4, 2/4)
Find the probability of the event that the second ball is white
P (W2) = P (W2|B1)P (B1) + P (W2|W1)P (W1)
= (3/4)(2/5) + (1/2)(3/5) = 3/5
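The total-probability computation above can be sketched numerically with exact fractions (not from the slides):

```python
from fractions import Fraction

F = Fraction
P_B1, P_W1 = F(2, 5), F(3, 5)                    # first-draw probabilities
P_W2_given_B1, P_W2_given_W1 = F(3, 4), F(2, 4)  # transition probabilities

# theorem on total probability with the partition {B1, W1} of the first draw
P_W2 = P_W2_given_B1 * P_B1 + P_W2_given_W1 * P_W1
print(P_W2)   # 3/5
```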
Review of Probability 1-14
Bayes’ Rule
The conditional probability of event A given B is related to the inverse conditional probability of event B given A by

P (A|B) = P (B|A)P (A)/P (B)
• P (A) is called a priori probability
• P (A|B) is called a posteriori probability
Let A1, A2, . . . , An be a partition of S
P (Ai|B) = P (B|Ai)P (Ai) / ∑_{k=1}^n P (B|Ak)P (Ak)
Review of Probability 1-15
Example: Binary Channel
(channel diagram: inputs 0 and 1 cross over to the opposite output with probability ε and pass through with probability 1 − ε)

Ai: event the input was i
Bi: event the receiver output was i

Input is equally likely to be 0 or 1
P (B1) = P (B1|A0)P (A0)+P (B1|A1)P (A1) = ε(1/2)+(1−ε)(1/2) = 1/2
Applying Bayes’ Rule, we obtain
P (A0|B1) = P (B1|A0)P (A0)/P (B1) = (ε/2)/(1/2) = ε
If ε < 1/2, input 1 is more likely than 0 when 1 is observed
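The Bayes' rule step can be sketched in code (not from the slides); with equally likely inputs the posterior reduces to ε, as derived above:

```python
from fractions import Fraction

def posterior_A0_given_B1(p, eps):
    # Bayes' rule: P(A0|B1) = P(B1|A0)P(A0) / P(B1),
    # with P(B1) expanded by total probability over the two inputs
    p_a0, p_a1 = 1 - p, p
    p_b1 = eps * p_a0 + (1 - eps) * p_a1
    return eps * p_a0 / p_b1

# equally likely inputs (p = 1/2): the posterior collapses to eps
print(posterior_A0_given_B1(Fraction(1, 2), Fraction(1, 10)))   # 1/10
```

The same function also handles unequal priors, where the posterior no longer equals ε.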
Review of Probability 1-16
Independence of Events
Events A and B are independent if
P (A ∩B) = P (A)P (B)
• Knowledge of event B does not alter the probability of event A
• This implies P (A|B) = P (A)
(figure: events in the unit square; A and B are independent, while A and C are not independent)
Review of Probability 1-17
Example: System Reliability
Controller Unit 1
Unit 2
Unit 3
• System is 'up' if the controller and at least two units are functioning
• Controller fails with probability p
• Peripheral unit fails with probability a
• All components fail independently
• A: event the controller is functioning
• Bi: event unit i is functioning
• F : event two or more peripheral units are functioning
Find the probability that the system is up
Review of Probability 1-18
The event F can be partitioned as

F = (B1 ∩ B2 ∩ B3ᶜ) ∪ (B1 ∩ B2ᶜ ∩ B3) ∪ (B1ᶜ ∩ B2 ∩ B3) ∪ (B1 ∩ B2 ∩ B3)

Thus,

P (F ) = P (B1)P (B2)P (B3ᶜ) + P (B1)P (B2ᶜ)P (B3) + P (B1ᶜ)P (B2)P (B3) + P (B1)P (B2)P (B3)
= 3(1 − a)²a + (1 − a)³

P (system is up) = P (A ∩ F ) = P (A)P (F ) = (1 − p){3(1 − a)²a + (1 − a)³}
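The closed form can be cross-checked against a brute-force enumeration of all component states (a sketch, not part of the slides):

```python
from fractions import Fraction
from itertools import product

def p_system_up(p, a):
    # enumerate all 2^4 controller/unit states;
    # independence lets us multiply the component probabilities
    total = Fraction(0)
    for ctrl, *units in product([True, False], repeat=4):
        prob = (1 - p) if ctrl else p
        for u in units:
            prob *= (1 - a) if u else a
        if ctrl and sum(units) >= 2:   # controller up and at least two units up
            total += prob
    return total

p, a = Fraction(1, 10), Fraction(1, 5)
closed_form = (1 - p) * (3 * (1 - a)**2 * a + (1 - a)**3)
print(p_system_up(p, a) == closed_form)   # True
```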
Review of Probability 1-19
Sequential Independent Experiments
• Consider a random experiment consisting of n independent experiments
• Let A1, A2, . . . , An be events of the experiments
• We can compute the probability of events of the sequential experiment
P (A1 ∩A2 ∩ · · · ∩An) = P (A1)P (A2) . . . P (An)
• Example: Bernoulli trial
– Perform an experiment and note if the event A occurs
– The outcome is “success” or “failure”
– The probability of success is p and failure is 1 − p
Review of Probability 1-20
Binomial Probability
• Perform n Bernoulli trials and observe the number of successes
• Let X be the number of successes in n trials
• The probability of X is given by the Binomial probability law
P (X = k) = C(n, k) p^k (1 − p)^(n−k), for k = 0, 1, . . . , n

• The binomial coefficient

C(n, k) = n! / (k!(n − k)!)

is the number of ways of picking k out of n trials for the successes
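A minimal sketch of the binomial law (not from the slides), using Python's built-in binomial coefficient:

```python
from math import comb

def binom_pmf(k, n, p):
    # binomial probability law: C(n, k) p^k (1 - p)^(n - k)
    return comb(n, k) * p**k * (1 - p)**(n - k)

# the PMF sums to 1 over k = 0, ..., n
total = sum(binom_pmf(k, 10, 0.3) for k in range(11))
print(binom_pmf(2, 3, 0.5), round(total, 10))   # 0.375 1.0
```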
Review of Probability 1-21
Example: Error Correction Coding
(channel diagram: binary channel with crossover probability ε)

• Transmit each bit three times
• Decoder takes a majority vote of the received bits
Compute the probability that the receiver makes an incorrect decision
• View each transmission as a Bernoulli trial
• Let X be the number of wrong bits from the receiver
P (X ≥ 2) = C(3, 2) ε²(1 − ε) + C(3, 3) ε³
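The majority-vote error probability can be sketched for any odd number of repetitions (a generalization beyond the slides' n = 3, written as an illustration):

```python
from math import comb

def p_decoding_error(eps, n=3):
    # probability that a majority of the n transmitted copies are received in error
    return sum(comb(n, k) * eps**k * (1 - eps)**(n - k)
               for k in range(n // 2 + 1, n + 1))

eps = 0.1
expected = 3 * eps**2 * (1 - eps) + eps**3   # C(3,2) eps^2 (1-eps) + C(3,3) eps^3
print(abs(p_decoding_error(eps) - expected) < 1e-12)   # True
```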
Review of Probability 1-22
Multinomial Probability

• Generalize the binomial probability law to the occurrence of more than one event
• Let B1, B2, . . . , Bm be possible events with
P (Bk) = pk, and p1 + p2 + · · ·+ pm = 1
• Suppose n independent repetitions of the experiment are performed
• Let Xj be the number of times each Bj occurs
• The probability of the vector (X1,X2, . . . ,Xm) is given by
P (X1 = k1, X2 = k2, . . . , Xm = km) = [n! / (k1! k2! · · · km!)] p1^k1 p2^k2 · · · pm^km
where k1 + k2 + · · ·+ km = n
Review of Probability 1-23
Geometric Probability
• Repeat independent Bernoulli trials until the first success occurs
• Let X be the number of trials until the occurrence of the first success
• The probability of this event is called the geometric probability law
P (X = k) = (1 − p)^(k−1) p, for k = 1, 2, . . .
• The geometric probabilities sum to 1:
∑_{k=1}^∞ P (X = k) = p ∑_{k=1}^∞ q^(k−1) = p/(1 − q) = 1

where q = 1 − p
• The probability that more than n trials are required before a success
P (X > n) = (1 − p)^n
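The tail formula can be checked against the PMF numerically (a sketch, not from the slides):

```python
def geom_pmf(k, p):
    # geometric probability law: first success on trial k
    return (1 - p)**(k - 1) * p

p, n = 0.3, 5
tail = 1 - sum(geom_pmf(k, p) for k in range(1, n + 1))
print(abs(tail - (1 - p)**n) < 1e-9)   # True: P(X > n) = (1 - p)^n
```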
Review of Probability 1-24
Example: Error Control by Retransmission
• A sends a message to B over a radio link
• B can detect if the messages have errors
• The probability of transmission error is q
• Find the probability that a message needs to be transmitted more than two times
Each transmission is a Bernoulli trial with probability of success p = 1− q
The probability that more than 2 transmissions are required is
P (X > 2) = q²
Review of Probability 1-25
Sequential Dependent Experiments
Sequence of subexperiments in which the outcome of a given subexperiment determines which subexperiment is performed next
Example: Select the urn for the first draw by flipping a fair coin
(figure: two urns containing balls labeled 0 and 1)
Draw a ball, note the number on the ball and replace it back in its urn
The urn used in the next experiment depends on # of the ball selected
Review of Probability 1-26
Trellis Diagram
Sequence of outcomes

(trellis diagram: paths of 0s and 1s, with initial probabilities 1/2, 1/2 and transition probabilities 2/3 and 1/3 out of state 0, and 5/6 and 1/6 out of state 1)

The probability of a sequence of outcomes is the product of the probabilities along the path
Review of Probability 1-27
Markov Chains
Let A1, A2, . . . , An be a sequence of events from n sequential experiments
The probability of a sequence of events is given by
P (A1A2 · · ·An) = P (An|A1A2 · · ·An−1)P (A1A2 · · ·An−1)
If the outcome of An−1 only determines the nth experiment and An then
P (An|A1A2 · · ·An−1) = P (An|An−1)
and the sequential experiments are called Markov Chains
Thus,
P (A1A2 · · ·An) = P (An|An−1)P (An−1|An−2) · · ·P (A2|A1)P (A1)
Find P (0011) in the urn example
Review of Probability 1-28
The probability of the sequence 0011 is given by
P (0011) = P (1|1)P (1|0)P (0|0)P (0)
where the transition probabilities are
P (1|1) = 5/6, P (1|0) = 1/3, P (0|0) = 2/3

and the initial probability is given by

P (0) = 1/2

Hence,

P (0011) = (5/6)(1/3)(2/3)(1/2) = 5/54
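The chain-rule computation generalizes to any outcome sequence; a sketch with the urn example's transition probabilities (not part of the slides):

```python
from fractions import Fraction

F = Fraction
trans = {('0', '0'): F(2, 3), ('0', '1'): F(1, 3),
         ('1', '0'): F(1, 6), ('1', '1'): F(5, 6)}
init = {'0': F(1, 2), '1': F(1, 2)}

def seq_prob(seq):
    # Markov chain: initial probability times the transition probabilities along the path
    p = init[seq[0]]
    for prev, cur in zip(seq, seq[1:]):
        p *= trans[(prev, cur)]
    return p

print(seq_prob('0011'))   # 5/54
```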
Review of Probability 1-29
References
Chapter 2 in A. Leon-Garcia, Probability, Statistics, and Random Processes for Electrical Engineering, 3rd edition, Pearson Prentice Hall, 2009
Review of Probability 1-30
EE401 (Semester 1) Jitkomut Songsiri
2. Random Variables
• definition
• probability measures: CDF, PMF, PDF
• expected values and moments
• examples of RVs
2-1
Definition
(figure: a random variable maps an outcome ζ ∈ S to a real number X(ζ) = x on the real line; the range is SX)
a random variable X is a function mapping an outcome to a real number
• the sample space, S, is the domain of the random variable
• SX is the range of the random variable
example: toss a coin three times and note the sequence of heads and tails
S = {HHH,HHT,HTH,THH,HTT,THT,TTH,TTT}
Let X be the number of heads in the three tosses
SX = {0, 1, 2, 3}
Random Variables 2-2
Types of Random Variables
Discrete RVs take values from a countable set
example: let X be the number of times a message needs to be transmitted until it arrives correctly

SX = {1, 2, 3, . . .}

Continuous RVs take values from an uncountable set, such as an interval of the real line

example: let X be the time it takes before receiving the next phone call

Mixed RVs take values over an interval, like typical continuous variables, for part of their range, with probability concentrated on particular values, like discrete variables, elsewhere
Random Variables 2-3
Probability measures
Cumulative distribution function (CDF)
F (a) = P (X ≤ a)
Probability mass function (PMF) for discrete RVs
p(k) = P (X = k)
Probability density function (PDF) for continuous RVs
f(x) = dF (x)/dx
Random Variables 2-4
Cumulative Distribution Function (CDF)
Properties
0 ≤ F (a) ≤ 1
F (a) → 1 as a → ∞, and F (a) → 0 as a → −∞
(figure: example CDFs of a continuous RV and a discrete RV)

F (b) − F (a) = ∫_a^b f(x) dx (continuous),  F (a) = ∑_{k≤a} p(k) (discrete)
Random Variables 2-5
Probability Density Function
Probability Density Function (PDF)
• f(x) ≥ 0
• P (a ≤ X ≤ b) = ∫_a^b f(x) dx

• F (x) = ∫_{−∞}^x f(u) du

Probability Mass Function (PMF)

• p(k) ≥ 0 for all k

• ∑_{k∈S} p(k) = 1
Random Variables 2-6
Expected values
let g(X) be a function of random variable X
E[g(X)] = ∑_{x∈S} g(x)p(x) if X is discrete, or ∫_{−∞}^∞ g(x)f(x) dx if X is continuous

Mean

µ = E[X] = ∑_{x∈S} x p(x) if X is discrete, or ∫_{−∞}^∞ x f(x) dx if X is continuous

Variance

σ² = var[X] = E[(X − µ)²]

nth Moment

E[X^n]
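For a discrete RV these expectations are finite sums; a sketch with a fair die as the PMF (an example not in the slides):

```python
from fractions import Fraction

def mean(pmf):
    # E[X] = sum of x p(x) over the support
    return sum(x * p for x, p in pmf.items())

def var(pmf):
    # var[X] = E[(X - mu)^2]
    mu = mean(pmf)
    return sum((x - mu)**2 * p for x, p in pmf.items())

die = {k: Fraction(1, 6) for k in range(1, 7)}   # fair six-sided die
print(mean(die), var(die))   # 7/2 35/12
```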
Random Variables 2-7
Facts
Let Y = g(X) = aX + b, a, b are constants
• E[Y ] = aE[X ] + b
• var[Y ] = a2 var[X ]
• var[X ] = E[X2]− (E[X ])2
Random Variables 2-8
Example of Random Variables
Discrete RVs
• Bernoulli
• Binomial
• Geometric
• Negative binomial
• Poisson
• Uniform
Continuous RVs
• Uniform
• Exponential
• Gaussian (Normal)
• Gamma
• Rayleigh
• Cauchy
• Laplacian
Random Variables 2-9
Bernoulli random variables
let A be an event of interest
a Bernoulli random variable X is defined as
X = 1 if A occurs and X = 0 otherwise
it can also be given by the indicator function for A
X(ζ) = 0 if ζ ∉ A, and X(ζ) = 1 if ζ ∈ A
PMF: p(1) = p, p(0) = 1− p, 0 ≤ p ≤ 1
Mean: E[X ] = p
Variance: var[X ] = p(1− p)
Random Variables 2-10
Example of Bernoulli PMF: p = 1/3
(figure: PMF p(x) and CDF F(x) of a Bernoulli RV with p = 1/3)
Random Variables 2-11
Binomial random variables
• X is the number of successes in a sequence of n independent trials
• each experiment yields success with probability p
• when n = 1, X is a Bernoulli random variable
• SX = {0, 1, 2, . . . , n}
• ex. Transmission errors in a binary channel: X is the number of errors in n independent transmissions
PMF
p(k) = P (X = k) = C(n, k) p^k (1 − p)^(n−k), k = 0, 1, . . . , n

Mean E[X] = np

Variance var[X] = np(1 − p)
Random Variables 2-12
Example of Binomial PMF: p = 1/3, n = 10
(figure: PMF p(x) and CDF F(x) of a binomial RV with p = 1/3, n = 10)
Random Variables 2-13
Geometric random variables
• repeat independent Bernoulli trials, each has probability of success p
• X is the number of experiments required until the first success occurs
• SX = {1, 2, 3, . . .}
• ex. Message transmissions: X is the number of times a message needs to be transmitted until it arrives correctly
PMF

p(k) = P (X = k) = (1 − p)^(k−1) p

Mean E[X] = 1/p

Variance var[X] = (1 − p)/p²
Random Variables 2-14
Example of Geometric PMF: p = 1/3
(figure: geometric PMFs p(x))
• parameters: p = 1/4, 1/3, 1/2
Random Variables 2-15
Negative binomial (Pascal) random variables
• repeat independent Bernoulli trials until observing the rth success
• X is the number of trials required until the rth success occurs
• X can be viewed as the sum of r geometric RVs
• SX = {r, r + 1, r + 2, . . .}
PMF

p(k) = P (X = k) = C(k − 1, r − 1) p^r (1 − p)^(k−r), k = r, r + 1, . . .

Mean E[X] = r/p

Variance var[X] = r(1 − p)/p²
Random Variables 2-16
Example of negative binomial PMF: p = 1/3, r = 5
(figure: PMF p(x) and CDF F(x) of a negative binomial RV with p = 1/3, r = 5)
Random Variables 2-17
Poisson random variables
• X is a number of events occurring in a certain period of time
• events occur with a known average rate
• the expected number of occurrences in the interval is λ
• SX = {0, 1, 2, . . .}
• examples:
– number of emissions of a radioactive mass during a time interval
– number of queries arriving in t seconds at a call center
– number of packet arrivals in t seconds at a multiplexer
PMF
p(k) = P (X = k) = (λ^k / k!) e^(−λ), k = 0, 1, 2, . . .
Mean E[X ] = λ
Variance var[X ] = λ
Random Variables 2-18
Example of Poisson PMF: λ = 2
(figure: PMF p(x) and CDF F(x) of a Poisson RV with λ = 2)
Random Variables 2-19
Derivation of Poisson distribution
• approximate a binomial RV when n is large and p is small
• define λ = np, in 1898 Bortkiewicz showed that
p(k) = C(n, k) p^k (1 − p)^(n−k) ≈ (λ^k / k!) e^(−λ)

Proof.

p(0) = (1 − p)^n = (1 − λ/n)^n ≈ e^(−λ), n → ∞

p(k + 1)/p(k) = (n − k)p / ((k + 1)(1 − p)) = (1 − k/n)λ / ((k + 1)(1 − λ/n))

take the limit n → ∞

p(k + 1) = (λ/(k + 1)) p(k) = (λ/(k + 1))(λ/k) · · · (λ/1) p(0) = (λ^(k+1)/(k + 1)!) e^(−λ)
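The approximation can be checked numerically for a large n and small p (a sketch outside the slides; the particular n and p are illustrative):

```python
from math import comb, exp, factorial

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    return lam**k / factorial(k) * exp(-lam)

# n large, p small: binomial(k; n, p) approaches Poisson(k; lam = n p)
n, p = 1000, 0.002
lam = n * p
max_err = max(abs(binom_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(20))
print(max_err < 1e-3)   # True for this n and p
```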
Random Variables 2-20
Comparison of Poisson and Binomial PMFs
(figure: two PMF panels, left with p = 1/2, n = 10 and right with p = 1/10, n = 100)
• red: Poisson
• blue: Binomial
Random Variables 2-21
Uniform random variables
Discrete Uniform RVs
• X has n possible values, x1, . . . , xn that are equally probable
• PMF

p(x) = 1/n if x ∈ {x1, . . . , xn}, and 0 otherwise
Continuous Uniform RVs
• X takes any values on an interval [a, b] that are equally probable
f(x) = 1/(b − a) for x ∈ [a, b], and 0 otherwise
• Mean: E[X ] = (a+ b)/2
• Variance: var[X ] = (b− a)2/12
Random Variables 2-22
Example of Discrete Uniform PMF: X ∈ {0, 1, 2, . . . , 10}

(figure: PMF p(x) and CDF F(x))

Example of Continuous Uniform PDF: X ∈ [0, 2]

(figure: PDF f(x) and CDF F(x))
Random Variables 2-23
Exponential random variables
• arise when describing the time between occurrence of events
• examples:
– the time between customer demands for call connections
– the time used for a bank teller to serve a customer
• λ is the rate at which events occur
• a continuous counterpart of the geometric random variable
f(x) = λ e^(−λx) for x ≥ 0, and 0 for x < 0

Mean E[X] = 1/λ

Variance var[X] = 1/λ²
Random Variables 2-24
Example of Exponential PDF
(figure: PDF f(x) and CDF F(x) of exponential RVs)
• parameters: λ = 1, 1/2, 1/3
Random Variables 2-25
Memoryless property
P (X > t+ h|X > t) = P (X > h)
• P (X > t + h|X > t) is the probability of having to wait at least an additional h seconds given that one has already been waiting t seconds

• P (X > h) is the probability of waiting at least h seconds when one first begins to wait

• thus, the probability of waiting at least an additional h seconds is the same regardless of how long one has already been waiting
Proof.
P (X > t + h|X > t) = P{(X > t + h) ∩ (X > t)} / P (X > t), for h > 0
= P (X > t + h)/P (X > t) = e^(−λ(t+h)) / e^(−λt) = e^(−λh)
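The identity can be verified numerically using the exponential tail P (X > x) = e^(−λx); the parameter values below are arbitrary illustrations:

```python
from math import exp

def exp_tail(x, lam):
    # P(X > x) = e^(-lam x) for an exponential RV
    return exp(-lam * x)

lam, t, h = 0.5, 3.0, 2.0
lhs = exp_tail(t + h, lam) / exp_tail(t, lam)   # P(X > t+h | X > t)
rhs = exp_tail(h, lam)                          # P(X > h)
print(abs(lhs - rhs) < 1e-12)   # True: memoryless
```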
Random Variables 2-26
this is not the case for other non-negative continuous RVs
in fact, the conditional probability
P (X > t + h|X > t) = (1 − P (X ≤ t + h)) / (1 − P (X ≤ t)) = (1 − F (t + h)) / (1 − F (t))
depends on t in general
Random Variables 2-27
Gaussian (Normal) random variables
• arise as the outcome of the central limit theorem
• the sum of a large number of RVs is distributed approximately normally
• many results involving Gaussian RVs can be derived in analytical form
• let X be a Gaussian RV with parameters mean µ and variance σ2
Notation X ∼ N (µ, σ2)
f(x) = (1/√(2πσ²)) exp(−(x − µ)²/(2σ²)), −∞ < x < ∞
Mean E[X ] = µ
Variance var[X ] = σ2
Random Variables 2-28
let Z ∼ N (0, 1) be the normalized Gaussian variable
CDF of Z is
FZ(z) = (1/√(2π)) ∫_{−∞}^z e^(−t²/2) dt ≜ Φ(z)

then CDF of X ∼ N (µ, σ²) can be obtained by

FX(x) = Φ((x − µ)/σ)

in MATLAB, the error function is defined as

erf(x) = (2/√π) ∫_0^x e^(−t²) dt

hence, Φ(z) can be computed via the erf command as

Φ(z) = (1/2)[1 + erf(z/√2)]
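The same relation holds for Python's `math.erf` (a sketch of the erf-based formula, not from the slides):

```python
from math import erf, sqrt

def Phi(z):
    # standard normal CDF via the error function: (1 + erf(z / sqrt(2))) / 2
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def normal_cdf(x, mu, sigma):
    # CDF of N(mu, sigma^2) by standardizing: F_X(x) = Phi((x - mu) / sigma)
    return Phi((x - mu) / sigma)

print(Phi(0.0), round(Phi(1.96), 3))   # 0.5 0.975
```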
Random Variables 2-29
Example of Gaussian PDF
(figure: PDF f(x) and CDF F(x) of Gaussian RVs)
• parameters: µ = 1, σ = 1, 2, 3
Random Variables 2-30
Gamma random variables
• appears in many applications:
– the time required to service customers in a queuing system
– the lifetime of devices in reliability studies
– the defect clustering behavior in VLSI chips
• let X be a Gamma variable with parameters α, λ
f(x) = λ(λx)^{α−1} e^{−λx}/Γ(α), x ≥ 0; α, λ > 0

where Γ(z) is the gamma function, defined by

Γ(z) = ∫_0^∞ x^{z−1} e^{−x} dx, z > 0

Mean E[X] = α/λ, Variance var[X] = α/λ²
Random Variables 2-31
![Page 62: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/62.jpg)
Properties of the gamma function
Γ(1/2) = √π
Γ(z + 1) = zΓ(z) for z > 0
Γ(m+ 1) = m!, for m a nonnegative integer
Special cases
a Gamma RV becomes
• exponential RV when α = 1
• m-Erlang RV when α = m, a positive integer
• chi-square RV with k DOF when α = k/2, λ = 1/2
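The exponential special case is easy to verify numerically; a small Python sketch using `math.gamma` (note Γ(1) = 1):

```python
import math

def gamma_pdf(x, alpha, lam):
    # f(x) = lam * (lam*x)**(alpha-1) * exp(-lam*x) / Gamma(alpha)
    return lam * (lam * x) ** (alpha - 1) * math.exp(-lam * x) / math.gamma(alpha)

def exp_pdf(x, lam):
    # the alpha = 1 special case: f(x) = lam * exp(-lam*x)
    return lam * math.exp(-lam * x)
```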
Random Variables 2-32
![Page 63: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/63.jpg)
Example of Gamma PDF
(figure: Gamma pdf f(x) plotted for x ∈ [0, 20])
• blue: α = 0.2, λ = 0.2 (long tail)
• green: α = 1, λ = 0.5 (exponential)
• red: α = 3, λ = 1/2 (Chi square with 6 DOF)
• black: α = 5, 20, 50, 100 and α/λ = 10 (α-Erlang with mean 10)
Random Variables 2-33
![Page 64: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/64.jpg)
m-Erlang random variables
(timeline: event times t1, t2, . . . , tm, tm+1, with interarrival times X1, X2, . . . , Xm, Xm+1)
• the kth event occurs at time tk
• the times X1, X2, . . . , Xm between events are exponential RVs
• N(t) denotes the number of events in t seconds, which is a Poisson RV
• Sm = X1 + X2 + · · · + Xm is the elapsed time until the mth event occurs
we can show that Sm is an m-Erlang random variable
Random Variables 2-34
![Page 65: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/65.jpg)
Proof. Sm ≤ t iff m or more events occur in t seconds
F(t) = P(Sm ≤ t) = P(N(t) ≥ m) = 1 − Σ_{k=0}^{m−1} ((λt)^k/k!) e^{−λt}
to get the density function of Sm, we take the derivative of F (t):
f(t) = dF(t)/dt = Σ_{k=0}^{m−1} (e^{−λt}/k!) (λ(λt)^k − kλ(λt)^{k−1}) = λ(λt)^{m−1} e^{−λt}/(m − 1)! ⟹ Erlang distribution with parameters m, λ
the sum of m exponential RVs with rate λ is an m-Erlang RV
if m becomes large, the m-Erlang RV should approach the normal RV
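This can be checked by simulation; the sketch below (Python, with hypothetical parameter values) compares the empirical CDF of a sum of m exponentials against the Erlang CDF derived above:

```python
import math
import random

def erlang_cdf(t, m, lam):
    # F(t) = 1 - sum_{k=0}^{m-1} (lam*t)**k * exp(-lam*t) / k!
    return 1.0 - sum((lam * t) ** k * math.exp(-lam * t) / math.factorial(k)
                     for k in range(m))

def empirical_cdf(t, m, lam, n=200_000, seed=1):
    # fraction of trials in which S_m = X_1 + ... + X_m falls at or below t
    rng = random.Random(seed)
    hits = sum(sum(rng.expovariate(lam) for _ in range(m)) <= t
               for _ in range(n))
    return hits / n

m, lam = 3, 2.0
t = m / lam  # evaluate near the mean of S_m
emp, theo = empirical_cdf(t, m, lam), erlang_cdf(t, m, lam)
```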
Random Variables 2-35
![Page 66: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/66.jpg)
Rayleigh random variables
• arise when observing the magnitude of a vector
• ex. the absolute value of a random complex number whose real and imaginary parts are i.i.d. Gaussian
PDF

f(x) = (x/σ²) e^{−x²/(2σ²)}, x ≥ 0, σ > 0

Mean

E[X] = σ√(π/2)

Variance

var[X] = ((4 − π)/2) σ²
the Chi square distribution is a generalization of Rayleigh distribution
Random Variables 2-36
![Page 67: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/67.jpg)
Example of Rayleigh PDF
(figure: Rayleigh pdf f(x), left, and CDF F(x), right, plotted for x ∈ [0, 10])
• parameters: σ = 1, 2, 3
Random Variables 2-37
![Page 68: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/68.jpg)
Cauchy random variables
f(x) = (1/π)/(1 + x²), −∞ < x < ∞
• Cauchy distribution does not have any moments
• no mean, variance or higher moments defined
• Z = X/Y is the standard Cauchy if X and Y are independent standard Gaussians
(figure: Cauchy pdf f(x) plotted for x ∈ [−5, 5])
Random Variables 2-38
![Page 69: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/69.jpg)
Laplacian random variables
PDF

f(x) = (α/2) e^{−α|x−µ|}, −∞ < x < ∞

Mean

E[X] = µ

Variance

var[X] = 2/α²
• arise as the difference between two i.i.d exponential RVs
• unlike the Gaussian, the Laplacian density is expressed in terms of the absolute difference from the mean
Random Variables 2-39
![Page 70: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/70.jpg)
Example of Laplacian PDF
(figure: Laplacian pdf f(x) plotted for x ∈ [−2, 2])
• parameters: µ = 1, α = 1, 2, 3, 4, 5
Random Variables 2-40
![Page 71: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/71.jpg)
Related MATLAB commands
• cdf returns the values of a specified cumulative distribution function
• pdf returns the values of a specified probability density function
• randn generates random numbers from the standard Gaussian distribution
• rand generates random numbers from the standard uniform distribution
• random generates random numbers drawn from a specified distribution
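For readers working outside MATLAB, rough Python counterparts (a sketch; `scipy.stats` provides the closest analogue of `cdf`/`pdf`/`random`, while the standard library covers the basic generators):

```python
import random

rng = random.Random(0)
z = rng.gauss(0.0, 1.0)   # like randn: one standard Gaussian sample
u = rng.random()          # like rand: one standard uniform sample on [0, 1)
x = rng.expovariate(2.0)  # like random with an exponential choice: rate 2 here
```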
Random Variables 2-41
![Page 72: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/72.jpg)
References
Chapters 3 and 4 in A. Leon-Garcia, Probability, Statistics, and Random Processes for Electrical Engineering, 3rd edition, Pearson Prentice Hall, 2009
Random Variables 2-42
![Page 73: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/73.jpg)
EE401 (Semester 1) Jitkomut Songsiri
3. Functions of random variables
• linear and quadratic transformations
• general transformations
• characteristic function
• Markov and Chebyshev inequalities
• Chernoff bound
3-1
![Page 74: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/74.jpg)
Functions of random variables
let X be an RV and g(x) be a real-valued function defined on the real line
• Y = g(X), Y is also an RV
• CDF of Y will depend on g(x) and CDF of X
Example: define g(x) as
g(x) = (x)+ = x if x ≥ 0, and 0 if x < 0
• an input voltage X passes through a half-wave rectifier

• A/D converter: a uniform quantizer maps the input to the closest point

• Y is # of active speakers in excess of M, i.e., Y = (X − M)+
Functions of random variables 3-2
![Page 75: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/75.jpg)
CDF of Y = g(X)
probability of equivalent events:
P (Y in C) = P (g(X) in C) = P (X in B)
where B is the equivalent event, the set of values of X such that g(X) is in C
Functions of random variables 3-3
![Page 76: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/76.jpg)
Example: Voice Transmission System
• X is # of active speakers in a group of N speakers
• let p be the probability that a speaker is active
• a voice transmission system can transmit up to M signals at a time
• let Y be the number of signals discarded, so Y = (X − M)+

Y takes values in the set SY = {0, 1, . . . , N − M}
we can compute PMF of Y as
P(Y = 0) = P(X in {0, 1, . . . , M}) = Σ_{k=0}^{M} pX(k)

P(Y = k) = P(X = M + k) = pX(M + k), 0 < k ≤ N − M
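If each of the N speakers is active independently with probability p, then X is binomial(N, p) and the PMF of Y can be computed directly; a sketch (the parameter values are hypothetical):

```python
import math

def binom_pmf(k, n, p):
    # PMF of a binomial(n, p) random variable
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def discarded_pmf(k, n, p, m):
    # PMF of Y = (X - M)+ for X ~ binomial(n, p) and capacity m
    if k == 0:
        return sum(binom_pmf(j, n, p) for j in range(m + 1))
    return binom_pmf(m + k, n, p)

n, p, m = 8, 0.3, 5
pmf = [discarded_pmf(k, n, p, m) for k in range(n - m + 1)]
```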
Functions of random variables 3-4
![Page 77: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/77.jpg)
Affine functions
define Y = aX + b, a > 0. Find CDF and PDF of Y
(figure: the line Y = aX + b; the event {Y ≤ y} corresponds to {X ≤ (y − b)/a})
P(Y ≤ y) = P(aX + b ≤ y) = P(X ≤ (y − b)/a)

thus,

FY(y) = FX((y − b)/a)
PDF of Y is obtained by differentiating the CDF with respect to y
fY(y) = (1/a) fX((y − b)/a)
Functions of random variables 3-5
![Page 78: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/78.jpg)
Example: Affine function of a Gaussian
let X ∼ N (m,σ2) :
fX(x) = (1/√(2πσ²)) exp(−(x − m)²/(2σ²))
let Y = aX + b, with a > 0
from page 3-5,
fY(y) = (1/a) fX((y − b)/a) = (1/√(2π(aσ)²)) exp(−(y − b − am)²/(2(aσ)²))
• Y also has a Gaussian distribution, with mean b + am and variance (aσ)²
• thus, a linear function of a Gaussian is also a Gaussian
Functions of random variables 3-6
![Page 79: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/79.jpg)
Example: Quadratic functions
define Y = X2. find CDF and PDF of Y
for a positive y, we have
{Y ≤ y} ⟺ {−√y ≤ X ≤ √y}

thus,

FY(y) = 0 for y < 0, and FY(y) = FX(√y) − FX(−√y) for y > 0

(figure: the parabola Y = X²; {Y ≤ y} corresponds to −√y ≤ X ≤ √y)
differentiating with respect to y gives

fY(y) = fX(√y)/(2√y) + fX(−√y)/(2√y)
for X ∼ N (0, 1), Y is a Chi-square random variable with one DOF
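For X ∼ N(0, 1) the formula above reduces to the chi-square density with one DOF, fY(y) = e^{−y/2}/√(2πy); a quick Python check:

```python
import math

def std_normal_pdf(x):
    # density of N(0, 1)
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def pdf_of_square(y):
    # f_Y(y) = [f_X(sqrt(y)) + f_X(-sqrt(y))] / (2*sqrt(y)) for Y = X^2
    r = math.sqrt(y)
    return (std_normal_pdf(r) + std_normal_pdf(-r)) / (2.0 * r)

def chi2_1_pdf(y):
    # closed form: chi-square density with one degree of freedom
    return math.exp(-y / 2.0) / math.sqrt(2.0 * math.pi * y)
```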
Functions of random variables 3-7
![Page 80: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/80.jpg)
General functions of random variables
suppose Y = g(X) is a transformation (could be many-to-one)
to find fY (y) for a fixed y, we solve the equation y = g(x) to get x
this may have multiple roots:
y = g(x1) = g(x2) = · · · = g(xn) = · · ·
it can be shown that
fY(y) = fX(x1)/|g′(x1)| + · · · + fX(xn)/|g′(xn)| + · · ·
where g′(x) is the derivative (Jacobian) of g(x)
Functions of random variables 3-8
![Page 81: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/81.jpg)
Affine transformation: Y = aX + b, g′(x) = a
the equation y = ax+ b has a single solution x = (y − b)/a for every y, so
fY(y) = (1/|a|) fX((y − b)/a)
Quadratic transformation: Y = aX2, a > 0, g′(x) = 2ax
if y ≤ 0, then the equation y = ax2 has no real solutions, so fY (y) = 0
if y > 0, then it has two solutions
x1 = √(y/a), x2 = −√(y/a)
and therefore
fY(y) = (1/(2√(ay))) (fX(√(y/a)) + fX(−√(y/a)))
Functions of random variables 3-9
![Page 82: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/82.jpg)
Log of uniform variables
verify that if X has a standard uniform distribution U(0, 1), then
Y = − ln(X)/λ
has an exponential distribution with parameter λ
for Y = y, we can solve X = x = e−λy ⇒ unique root
• the Jacobian is g′(x) = −1/(λx) = −e^{λy}/λ
• fY (y) = 0 when x = e−λy /∈ [0, 1] or when y < 0
• when y ≥ 0 (or e−λy ∈ [0, 1]), we will have
fY(y) = fX(e^{−λy}) / |−1/(λx)| = λ e^{−λy}
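This is the inverse-transform method for generating exponential samples; a minimal Python sketch with a sample-mean sanity check (the sample size and seed are arbitrary):

```python
import math
import random

def exp_from_uniform(lam, rng):
    # Y = -ln(U)/lam is exponential with rate lam when U ~ U(0, 1)
    u = 1.0 - rng.random()  # in (0, 1], avoids log(0)
    return -math.log(u) / lam

rng = random.Random(42)
lam = 2.0
samples = [exp_from_uniform(lam, rng) for _ in range(100_000)]
mean = sum(samples) / len(samples)  # should approach 1/lam = 0.5
```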
Functions of random variables 3-10
![Page 83: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/83.jpg)
Amplitude samples of a sinusoidal waveform
let Y = cosX where X ∼ U(0, 2π], find the pdf of Y
for |y| > 1 there is no solution of x ⇒ fY (y) = 0
for |y| < 1 the equation y = cosx has two solutions:
x1 = cos−1(y), x2 = 2π − x1
the Jacobians are
g′(x1) = −sin(x1) = −sin(cos⁻¹(y)) = −√(1 − y²), g′(x2) = √(1 − y²)
since fX(x) = 1/(2π) on the interval (0, 2π],

fY(y) = 1/(π√(1 − y²)), for −1 < y < 1
note that although fY(±1) = ∞, the probability that Y = ±1 is 0
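This arcsine-type density integrates to the CDF FY(y) = 1 − arccos(y)/π, which a short Monte Carlo run confirms (the sample size and evaluation point are arbitrary):

```python
import math
import random

def cos_cdf(y):
    # CDF of Y = cos(X), X ~ U(0, 2*pi]: integral of 1/(pi*sqrt(1-y^2))
    return 1.0 - math.acos(y) / math.pi

rng = random.Random(7)
n, y0 = 200_000, 0.3
emp = sum(math.cos(rng.uniform(0.0, 2.0 * math.pi)) <= y0
          for _ in range(n)) / n
```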
Functions of random variables 3-11
![Page 84: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/84.jpg)
Transform Methods
• moment generating function
• characteristic function
Functions of random variables 3-12
![Page 85: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/85.jpg)
Moment generating functions
for a random variable X , the moment generating function (MGF) of X is
Φ(t) = E[e^{tX}]

Continuous

Φ(t) = ∫_{−∞}^{∞} e^{tx} f(x) dx

Discrete

Φ(t) = Σ_k e^{tx_k} p(x_k)
• except for a sign change, Φ(t) is the two-sided Laplace transform of the pdf
• knowing Φ(t) is equivalent to knowing f(x)
Functions of random variables 3-13
![Page 86: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/86.jpg)
Moment theorem
computing any moments of X is easily obtained by
E[X^n] = d^nΦ(t)/dt^n |_{t=0}
because
E[e^{tX}] = E[1 + tX + (tX)²/2! + · · · + (tX)^n/n! + · · ·]

= 1 + tE[X] + (t²/2!)E[X²] + · · · + (t^n/n!)E[X^n] + · · ·
note that Φ(0) = 1
Functions of random variables 3-14
![Page 87: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/87.jpg)
MGF of Gaussian variables
the MGF of X ∼ N (µ, σ2) is given by
Φ(t) = e^{µt + σ²t²/2}
it can be derived by completing square in the exponent:
Φ(t) = (1/√(2πσ²)) ∫_{−∞}^{∞} e^{−(x−µ)²/(2σ²)} e^{tx} dx

where

−(x − µ)²/(2σ²) + tx = −(x − (µ + σ²t))²/(2σ²) + µt + σ²t²/2
from the MGF, we obtain
Φ′(0) = µ, Φ′′(0) = µ2 + σ2
Functions of random variables 3-15
![Page 88: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/88.jpg)
Characteristic functions
the characteristic function (CF) of a random variable X is defined by
Continuous

Φ(ω) = E[e^{jωX}] = ∫_{−∞}^{∞} f(x) e^{jωx} dx

Discrete

Φ(ω) = E[e^{jωX}] = Σ_k e^{jωx_k} p(x_k)
• Φ(ω) is obtained from the moment generating function when t = jω
• Φ(ω) is simply the Fourier transform of the PDF or PMF of X
• every pdf and its characteristic function form a unique Fourier pair:
Φ(ω) ⇐⇒ f(x)
Functions of random variables 3-16
![Page 89: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/89.jpg)
the characteristic function is maximum at origin because f(x) ≥ 0:
|Φ(ω)| ≤ Φ(0) = 1
Linear transformation: if Y = aX + b , then
Φy(ω) = ejbωΦx(aω)
Gaussian variables: let X ∼ N (µ, σ2)
the characteristic function of X is
Φ(ω) = e^{jµω} e^{−σ²ω²/2}
Binomial variables: parameters are n, p and q = 1− p
Φ(ω) = (pejω + q)n
Functions of random variables 3-17
![Page 90: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/90.jpg)
Markov and Chebyshev Inequalities
Markov inequality
let X be a nonnegative RV with mean E[X ]
P(X ≥ a) ≤ E[X]/a, a > 0
Chebyshev inequality
let X be an RV with mean µ and variance σ2
P(|X − µ| ≥ a) ≤ σ²/a²
Functions of random variables 3-18
![Page 91: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/91.jpg)
example: manufacturing of low grade resistors
• assume the average resistance is 100 ohms (measured by a statistical analysis)
• some of resistors have different values of resistance
if all resistors over 200 ohms are discarded, what is the maximum fraction of resistors that will be discarded?
using Markov inequality with µ = 100 and a = 200
P(X ≥ 200) ≤ 100/200 = 0.5
the percentage of discarded resistors cannot exceed 50% of the total
Functions of random variables 3-19
![Page 92: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/92.jpg)
if the variance of the resistance is known to equal 100, find the probability that the resistance values are between 50 and 150
P (50 ≤ X ≤ 150) = P (|X − 100| ≤ 50)
= 1− P (|X − 100| ≥ 50)
by Chebyshev inequality
P(|X − 100| ≥ 50) ≤ σ²/50² = 100/2500 = 1/25
hence,
P(50 ≤ X ≤ 150) ≥ 1 − 1/25 = 24/25
Functions of random variables 3-20
![Page 93: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/93.jpg)
Chernoff bound
the Chernoff bound is given by
P(X ≥ a) ≤ inf_{t≥0} E[e^{t(X−a)}]

which can be expressed as

log P(X ≥ a) ≤ inf_{t≥0} {−ta + log E[e^{tX}]}
• E[e^{tX}] is the moment generating function

• log E[e^{tX}] is called the cumulant generating function

• the Chernoff bound is useful when E[e^{tX}] has an analytical expression
Functions of random variables 3-21
![Page 94: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/94.jpg)
Example: X is Gaussian with zero mean and unit variance
the cumulant generating function is
log E[e^{tX}] = t²/2

hence,

log P(X ≥ a) ≤ inf_{t≥0} {−ta + t²/2} = −a²/2

and the Chernoff bound gives

P(X ≥ a) ≤ e^{−a²/2}

which is tighter than the Chebyshev inequality:

P(|X| ≥ a) ≤ 1/a² ⟹ P(X ≥ a) ≤ 1/(2a²)
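The two bounds are easy to compare against the exact Gaussian tail (computable via the complementary error function); a sketch:

```python
import math

def true_tail(a):
    # exact P(X >= a) for X ~ N(0, 1), via erfc
    return 0.5 * math.erfc(a / math.sqrt(2.0))

def chernoff(a):
    # Chernoff bound: P(X >= a) <= exp(-a^2/2)
    return math.exp(-a * a / 2.0)

def chebyshev(a):
    # Chebyshev-derived bound: P(X >= a) <= 1/(2*a^2)
    return 1.0 / (2.0 * a * a)
```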
Functions of random variables 3-22
![Page 95: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/95.jpg)
(figure: the exact tail P(X ≥ a), the Chernoff bound, and the Chebyshev bound plotted for a ∈ [0, 2])
when a is small, the Chebyshev bound is useless while the Chernoff bound is tighter
Functions of random variables 3-23
![Page 96: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/96.jpg)
References
Chapters 3 and 4 in A. Leon-Garcia, Probability, Statistics, and Random Processes for Electrical Engineering, 3rd edition, Pearson Prentice Hall, 2009
Functions of random variables 3-24
![Page 97: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/97.jpg)
EE401 (Semester 1) Jitkomut Songsiri
4. Pairs of Random Variables
• probabilities
• conditional probability and expectation
• independence
• joint characteristic function
• functions of random variables
4-1
![Page 98: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/98.jpg)
Definition
let ζ be an outcome in the sample space S
a pair of RVs Z(ζ) is a function that maps ζ to a pair of real numbers
Z(ζ) = (X(ζ), Y (ζ))
example: a web page provides the user with a choice either to watch an ad or move directly to the requested page
let ζ be the patterns of user arrivals to a webpage
• N1(ζ) be the number of times the webpage is directly requested
• N2(ζ) be the number of times the ad is chosen
(N1(ζ), N2(ζ)) assigns a pair of nonnegative integers to each outcome ζ
Pairs of Random Variables 4-2
![Page 99: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/99.jpg)
Events of interest
events involving a pair of RVs (X, Y ) can be represented by regions
Example:
(figure: three regions A, B, C in the (x, y) plane, with marked points (4, 0), (0, 4), and (5, 5))
A = {X + Y ≤ 4}, B = {min(X,Y ) ≤ 5}, C = {X2 + Y 2 ≤ 25}
• A: total revenue from two sources is less than 4
• C: total noise power is less than r²
Pairs of Random Variables 4-3
![Page 100: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/100.jpg)
Events and Probabilities
we consider events of the product form:
C = {X in A} ∩ {Y in B}
the probability of product-form events is
P(C) = P({X in A} ∩ {Y in B}) = P(X in A, Y in B)
Pairs of Random Variables 4-4
![Page 101: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/101.jpg)
Probability for pairs of random variables
Joint cumulative distribution function
FXY (a, b) = P (X ≤ a, Y ≤ b)
Properties:
(figure: the semi-infinite region below and to the left of the point (a, b))
• a joint CDF is a nondecreasing function of x and y:
FXY (x1, y1) ≤ FXY (x2, y2), if x1 ≤ x2 and y1 ≤ y2
• FXY (x1,−∞) = 0, FXY (−∞, y1) = 0, FXY (∞,∞) = 1
• P(x1 < X ≤ x2, y1 < Y ≤ y2) = FXY(x2, y2) − FXY(x2, y1) − FXY(x1, y2) + FXY(x1, y1)
Pairs of Random Variables 4-5
![Page 102: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/102.jpg)
Joint PMF for discrete RVs
pXY (x, y) = P (X = x, Y = y), (x, y) ∈ S
Joint PDF for continuous RVs
fXY(x, y) = ∂²FXY(x, y)/∂x∂y
Marginal PMF
pX(x) = Σ_{y∈S} pXY(x, y),  pY(y) = Σ_{x∈S} pXY(x, y)
Marginal PDF
fX(x) = ∫_{−∞}^{∞} fXY(x, z) dz,  fY(y) = ∫_{−∞}^{∞} fXY(z, y) dz
Pairs of Random Variables 4-6
![Page 103: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/103.jpg)
Example 1: Jointly Gaussian Random Variables
if X,Y are jointly Gaussian, a joint pdf of X and Y can be given by
fXY(x, y) = (1/(2π√(1 − ρ²))) exp(−(x² − 2ρxy + y²)/(2(1 − ρ²))), −∞ < x, y < ∞

with |ρ| < 1

(figure: surface plot of fXY(x, y))
find the marginal PDF’s
Pairs of Random Variables 4-7
![Page 104: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/104.jpg)
the marginal pdf of X is found by integrating fXY (x, y) over y:
fX(x) = (e^{−x²/(2(1−ρ²))}/(2π√(1 − ρ²))) ∫_{−∞}^{∞} e^{−(y² − 2ρxy)/(2(1−ρ²))} dy

= (e^{−x²/2}/√(2π)) ∫_{−∞}^{∞} (e^{−(y−ρx)²/(2(1−ρ²))}/√(2π(1 − ρ²))) dy

= e^{−x²/2}/√(2π)
• the second step follows from completing the square in (y − ρx)2
• the last integral equals 1 since its integrand is a Gaussian pdf withmean ρx and variance 1− ρ2
• the marginal pdf of X is also a Gaussian with mean 0 and variance 1
• from the symmetry of fXY(x, y) in x and y, the marginal pdf of Y is the same as that of X
Pairs of Random Variables 4-8
![Page 105: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/105.jpg)
Example 2
consider X and Y with a joint PDF
fXY (x, y) = ce−xe−y, 0 ≤ y ≤ x < ∞
find the constant c, the marginal PDFs and P (X + Y ≤ 1)
(figure: the region 0 ≤ y ≤ x where the joint PDF is nonzero, and the line x + y = 1 intersecting it)
the constant c is found from the normalization condition:
1 = ∫_0^∞ ∫_0^x c e^{−x} e^{−y} dy dx ⟹ c = 2
Pairs of Random Variables 4-9
![Page 106: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/106.jpg)
the marginal PDFs are obtained by
fX(x) = ∫_0^∞ fXY(x, y) dy = ∫_0^x 2 e^{−x} e^{−y} dy = 2e^{−x}(1 − e^{−x}), 0 ≤ x < ∞

fY(y) = ∫_0^∞ fXY(x, y) dx = ∫_y^∞ 2 e^{−x} e^{−y} dx = 2e^{−2y}, 0 ≤ y < ∞
P(X + Y ≤ 1) can be found by integrating over the intersection of the region where the joint PDF is nonzero and the event {X + Y ≤ 1}
P(X + Y ≤ 1) = ∫_0^{1/2} ∫_y^{1−y} 2 e^{−x} e^{−y} dx dy = ∫_0^{1/2} 2e^{−y}[e^{−y} − e^{−(1−y)}] dy = 1 − 2e^{−1}
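The answer can be cross-checked by Monte Carlo: the factorization fXY = fY · fY|X-style conditioning with fY(y) = 2e^{−2y} gives fX|Y(x|y) = e^{−(x−y)} for x ≥ y, so a joint sample is Y ~ exponential(2) plus an independent exponential(1) increment (a sketch; the sample size is arbitrary):

```python
import math
import random

rng = random.Random(3)
n, hits = 400_000, 0
for _ in range(n):
    y = rng.expovariate(2.0)      # marginal f_Y(y) = 2*exp(-2y)
    x = y + rng.expovariate(1.0)  # conditional f_{X|Y}(x|y) = exp(-(x-y)), x >= y
    hits += (x + y <= 1.0)
estimate = hits / n
exact = 1.0 - 2.0 / math.e        # 1 - 2*exp(-1)
```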
Pairs of Random Variables 4-10
![Page 107: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/107.jpg)
Conditional Probability
Discrete RVs
the conditional PMF of Y given X = x is defined by
pY|X(y|x) = P(Y = y|X = x) = P(X = x, Y = y)/P(X = x) = pXY(x, y)/pX(x)
Continuous RVs
the conditional PDF of Y given X = x is defined by
fY|X(y|x) = fXY(x, y)/fX(x)
Pairs of Random Variables 4-11
![Page 108: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/108.jpg)
Example: Number of defects in a region
• let X be the total number of defects on a chip
X ∼ Poisson(α)
• let Y be the number of defects falling in region R
• if X = n (given), then Y is binomial with (n, p)
pY|X(k|n) = (n choose k) p^k (1 − p)^{n−k} for 0 ≤ k ≤ n, and pY|X(k|n) = 0 for k > n
• we can show that Y ∼ Poisson(αp)
Pairs of Random Variables 4-12
![Page 109: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/109.jpg)
P(Y = k) = Σ_{n=0}^{∞} P(Y = k|X = n) P(X = n)

= Σ_{n=k}^{∞} (n choose k) p^k (1 − p)^{n−k} (α^n e^{−α}/n!)

= ((αp)^k e^{−α}/k!) Σ_{n=k}^{∞} ((1 − p)α)^{n−k}/(n − k)!

= (αp)^k e^{−α} e^{(1−p)α}/k! = ((αp)^k/k!) e^{−αp}
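This Poisson thinning result can be verified by simulation; the sketch below draws X via Knuth's multiplicative method (a standard Poisson sampler suitable for small α) and keeps each defect independently with probability p:

```python
import math
import random

def poisson_sample(alpha, rng):
    # Knuth's multiplicative method for one Poisson(alpha) draw
    limit, k, prod = math.exp(-alpha), 0, 1.0
    while True:
        prod *= rng.random()
        if prod <= limit:
            return k
        k += 1

rng = random.Random(5)
alpha, p, n = 4.0, 0.3, 100_000
ys = []
for _ in range(n):
    x = poisson_sample(alpha, rng)
    ys.append(sum(rng.random() < p for _ in range(x)))  # binomial thinning
mean_y = sum(ys) / n                                   # should approach alpha*p
var_y = sum((y - mean_y) ** 2 for y in ys) / n         # Poisson: variance = mean
```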
Pairs of Random Variables 4-13
![Page 110: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/110.jpg)
Example: Customers arrive at a service station
• let N be # of customers arriving at a station during time t
N ∼ Poisson(βt)
• let T be the service time for each customer
T ∼ exponential(α)
• we can show that the # of customers that arrive during the service time is a geometric RV with probability of success α/(α + β)
Pairs of Random Variables 4-14
![Page 111: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/111.jpg)
P(N = k) = ∫_0^∞ P(N = k|T = t) fT(t) dt

= ∫_0^∞ ((βt)^k/k!) e^{−βt} α e^{−αt} dt

= (αβ^k/k!) ∫_0^∞ t^k e^{−(α+β)t} dt

let r = (α + β)t, then

P(N = k) = (αβ^k/(k!(α + β)^{k+1})) ∫_0^∞ r^k e^{−r} dr = (α/(α + β)) (β/(α + β))^k
(the last integral is a gamma function and is equal to k!)
Pairs of Random Variables 4-15
![Page 112: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/112.jpg)
Conditional Expectation
the conditional expectation of Y given X = x is defined by
Continuous RVs
E[Y|X] = ∫_{−∞}^{∞} y fY|X(y|x) dy

Discrete RVs

E[Y|X] = Σ_y y pY|X(y|x)
• E[Y |X ] is the center of mass associated with the conditional pdf or pmf
• E[Y |X ] can be viewed as a function of random variable X
• E[E[Y |X ]] = E[Y ]
Pairs of Random Variables 4-16
![Page 113: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/113.jpg)
in fact, we can show that
E[h(Y )] = E[E[h(Y )|X ]]
for any function h(·) such that E[|h(Y)|] < ∞

Proof.
E[E[h(Y)|X]] = ∫_{−∞}^{∞} E[h(Y)|x] fX(x) dx

= ∫_{−∞}^{∞} ∫_{−∞}^{∞} h(y) fY|X(y|x) dy fX(x) dx

= ∫_{−∞}^{∞} h(y) ∫_{−∞}^{∞} fXY(x, y) dx dy

= ∫_{−∞}^{∞} h(y) fY(y) dy = E[h(Y)]
Pairs of Random Variables 4-17
![Page 114: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/114.jpg)
Example: Average defects of a chip in a region
From the example on page 4-12,
E[Y] = E[E[Y|X]] = Σ_{n=0}^{∞} np P(X = n) = p Σ_{n=0}^{∞} n P(X = n) = pE[X] = αp
Pairs of Random Variables 4-18
![Page 115: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/115.jpg)
Example: Average arrivals in a service time
from the example on page 4-14,
E[N] = E[E[N|T]] = ∫_0^∞ E[N|T = t] fT(t) dt = ∫_0^∞ βt fT(t) dt = βE[T] = β/α
Pairs of Random Variables 4-19
![Page 116: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/116.jpg)
Example: Variance of arrivals in a service time
same example as on page 4-14
N is a Poisson RV with parameter βt when T = t is given, so

E[N|T = t] = βt, E[N²|T = t] = βt + (βt)²
the second moment of N can be calculated by
E[N²] = E[E[N²|T]] = ∫_0^∞ E[N²|T = t] fT(t) dt = ∫_0^∞ (βt + β²t²) fT(t) dt = βE[T] + β²E[T²]
Pairs of Random Variables 4-20
![Page 117: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/117.jpg)
therefore,
var(N) = E[N²] − (E[N])² = β²E[T²] + βE[T] − β²(E[T])² = β² var(T) + βE[T]
• if T is not random (E[T] is constant and var(T) = 0), the mean and variance of N are those of a Poisson RV with parameter βE[T]

• when T is random, the mean of N remains the same, but var(N) increases by the term β² var(T)

• note that the above result holds for any distribution fT(t)
• if T is exponential with parameter α, then E[T] = 1/α and var(T) = 1/α², so

E[N] = β/α, var(N) = β²/α² + β/α
Pairs of Random Variables 4-21
![Page 118: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/118.jpg)
Independence of two random variables
X and Y are independent if and only if
FXY (x, y) = FX(x)FY (y), ∀x, y
this is equivalent to
Discrete Random Variables
pXY (x, y) = pX(x)pY (y)
pY |X(y|x) = pY (y)
Continuous Random Variables
fXY (x, y) = fX(x)fY (y)
fY|X(y|x) = fY(y)
If X and Y are independent, so are any pair of functions g(X) and h(Y )
Pairs of Random Variables 4-22
![Page 119: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/119.jpg)
Example
let X and Y be Gaussian RVs with zero mean and unit variance
the product of the marginal pdf’s of X and Y is
fX(x)fY(y) = (1/2π) exp(−(x² + y²)/2), −∞ < x, y < ∞
from the example on page 4-7, the joint pdf of X and Y is
fXY(x, y) = (1/(2π√(1 − ρ²))) exp(−(x² − 2ρxy + y²)/(2(1 − ρ²))), −∞ < x, y < ∞
therefore the jointly Gaussian X and Y are independent if and only if ρ = 0
ρ is called correlation coefficient between X and Y
Pairs of Random Variables 4-23
![Page 120: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/120.jpg)
Expected Values and Covariance
the expected value of Z = g(X,Y ) is defined as
E[Z] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) fXY(x, y) dx dy (X, Y continuous)

E[Z] = Σ_x Σ_y g(x, y) pXY(x, y) (X, Y discrete)
• E[X + Y ] = E[X ] +E[Y ]
• E[XY ] = E[X ]E[Y ] if X and Y are independent
Covariance of X and Y
cov(X,Y ) = E[(X −E[X ])(Y −E[Y ])]
• cov(X,Y ) = E[XY ]−E[X ]E[Y ]
• cov(X,Y ) = 0 if X and Y are independent (the converse is NOT true)
Pairs of Random Variables 4-24
![Page 121: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/121.jpg)
Correlation Coefficient
denote by σX = √var(X), σY = √var(Y) the standard deviations of X and Y

the correlation coefficient of X and Y is defined by

ρXY = cov(X, Y)/(σX σY)
• −1 ≤ ρXY ≤ 1
• ρXY gives the linear dependence between X and Y : for Y = aX + b,
ρXY = 1 if a > 0 and ρXY = −1 if a < 0
• X and Y are said to be uncorrelated if ρXY = 0
Pairs of Random Variables 4-25
![Page 122: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/122.jpg)
if X and Y are independent then X and Y are uncorrelated
but the converse is NOT true
example: uncorrelated but dependent random variables
let θ be a uniform RV in the interval (0, 2π) and let
X = cos θ, Y = sin θ
(figure: the point (X, Y) = (cos θ, sin θ) on the unit circle)

• the marginals of X and Y are arcsine pdf's

• the product of the marginals of X and Y is nonzero in the square region

• (X, Y) is a point on the unit circle, so X and Y are dependent
E[XY] = (1/2π) ∫_0^{2π} sin φ cos φ dφ = (1/4π) ∫_0^{2π} sin 2φ dφ = 0
since E[X ] = E[Y ] = 0, the above eq. implies X and Y are uncorrelated
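A short simulation illustrates the point: the sample covariance of (cos θ, sin θ) vanishes even though X² + Y² = 1 ties the two variables together deterministically (a sketch; the sample size is arbitrary):

```python
import math
import random

rng = random.Random(11)
n = 200_000
sx = sy = sxy = 0.0
for _ in range(n):
    theta = rng.uniform(0.0, 2.0 * math.pi)
    x, y = math.cos(theta), math.sin(theta)  # always on the unit circle
    sx += x
    sy += y
    sxy += x * y
sample_cov = sxy / n - (sx / n) * (sy / n)   # should be near 0
```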
Pairs of Random Variables 4-26
![Page 123: EE401 (Semester 1) Jitkomut Songsiri 1. Review of Probability](https://reader031.vdocuments.us/reader031/viewer/2022022007/621094adeaeaf13f257eaffa/html5/thumbnails/123.jpg)
Joint Characteristic Function

the joint characteristic function of X and Y is defined by

Φ_XY(λ, ω) = E[e^{j(λX + ωY)}] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} e^{j(λx + ωy)} f_XY(x,y) dx dy

the joint characteristic function is a 2D Fourier transform of the joint pdf

if X and Y are independent,

Φ_XY(λ, ω) = E[e^{jλX}] E[e^{jωY}] = Φ_X(λ)Φ_Y(ω)
Pairs of Random Variables 4-27
Example

let U ∼ N(0, 1) and V ∼ N(0, 1) be independent RVs

define X = U + V, Y = U − V

the joint characteristic function of X, Y is obtained by

Φ_XY(λ, ω) = E[e^{j(λ(U+V) + ω(U−V))}]
  = E[e^{j(λ+ω)U + j(λ−ω)V}]
  = E[e^{j(λ+ω)U}] E[e^{j(λ−ω)V}]
  = Φ_U(λ+ω) Φ_V(λ−ω)
  = e^{−(λ+ω)²/2} e^{−(λ−ω)²/2}
  = e^{−(λ²+ω²)} = Φ_X(λ)Φ_Y(ω)

X and Y are also Gaussian with zero mean and variance 2
Pairs of Random Variables 4-28
from the identity

∂²Φ_XY(λ,ω)/∂λ∂ω = ∂²E[e^{j(λX+ωY)}]/∂λ∂ω = j² E[XY e^{j(λX+ωY)}]

the joint characteristic function is also useful for finding E[XY], since

E[XY] = (1/j²) ∂²Φ_XY(λ,ω)/∂λ∂ω |_{λ=0, ω=0}

for the example above, Φ_XY(λ,ω) = e^{−(λ²+ω²)}, so

E[XY] = −∂²e^{−(λ²+ω²)}/∂λ∂ω |_{λ=0, ω=0} = −(4λω)e^{−(λ²+ω²)} |_{λ=0, ω=0} = 0

thus X and Y are uncorrelated
(note that X and Y have zero mean)
Pairs of Random Variables 4-29
Function of Multiple Random Variables
• sum of random variables: Z = X + Y
• division of random variables: Z = X/Y
• linear transformation
Pairs of Random Variables 4-30
Sum of Random Variables

let Z = X + Y, so that

P(Z ≤ z) = P(X + Y ≤ z)

integrate the joint pdf f_XY over the region x + y ≤ z (the half-plane below the line x + y = z)

CDF of Z: F_Z(z) = P(Z ≤ z) = ∫_{−∞}^{∞} ∫_{−∞}^{z−x} f_XY(x,y) dy dx

PDF of Z: f_Z(z) = dF_Z(z)/dz = ∫_{−∞}^{∞} f_XY(x, z−x) dx

when X, Y are independent, the pdf of Z has the form of a convolution:

f_Z(z) = ∫_{−∞}^{∞} f_X(x) f_Y(z−x) dx
Pairs of Random Variables 4-31
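the convolution formula can be checked numerically; a sketch for the (assumed, not from the slides) case X, Y ∼ Uniform(0,1), whose sum has the triangular pdf on [0, 2]:

```python
import numpy as np

# sample f_X = f_Y = 1 on [0,1] with spacing dx, then convolve numerically
n = 1000
dx = 1.0 / n
f = np.full(n, 1.0)

fz = np.convolve(f, f) * dx   # f_Z(z) on the grid z = 0, dx, ..., ~2

peak = fz.max()               # triangular pdf peaks at f_Z(1) = 1
area = fz.sum() * dx          # total probability should be 1
```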
Example

find the pdf of the sum Z = X + Y, where

X, Y are jointly Gaussian with zero mean, unit variance, and correlation coefficient ρ = −1/2

f_Z(z) = ∫_{−∞}^{∞} f_XY(x, z−x) dx
  = (1/(2π√(1−ρ²))) ∫_{−∞}^{∞} e^{−(x² − 2ρx(z−x) + (z−x)²)/(2(1−ρ²))} dx
  = (1/(2π√(3/4))) ∫_{−∞}^{∞} e^{−(x² − xz + z²)/(2(3/4))} dx
  = e^{−z²/2}/√(2π)

the sum of these two dependent Gaussians is also a Gaussian RV
Pairs of Random Variables 4-32
Characteristic function of a sum

let X and Y be independent RVs and define

Z = X + Y

then the CF of Z is the product of the CFs of X and Y:

Φ_Z(ω) = Φ_X(ω)Φ_Y(ω)

proof.

Φ_Z(ω) = E[e^{jωZ}] = E[e^{jω(X+Y)}]
  = E[e^{jωX}] E[e^{jωY}]   (since X and Y are independent)
  = Φ_X(ω)Φ_Y(ω)
Pairs of Random Variables 4-33
Example: sum of independent binomials

let X and Y be i.i.d. binomial RVs with parameters n, p:

P_X(k) = P_Y(k) = C(n,k) p^k q^{n−k},   q = 1 − p

first compute the CF of X and Y:

Φ_X(ω) = Φ_Y(ω) = ∑_{k=0}^{n} e^{jωk} C(n,k) p^k q^{n−k} = (pe^{jω} + q)^n

the CF of Z = X + Y is then given by

Φ_Z(ω) = Φ_X(ω)Φ_Y(ω) = (pe^{jω} + q)^{2n}

conclusion: Z is also a binomial with parameters 2n and p
Pairs of Random Variables 4-34
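a simulation sketch of this conclusion (n = 10, p = 0.3 are arbitrary illustrative parameters), comparing the empirical CF of Z against (pe^{jω} + q)^{2n}:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 10, 0.3          # illustrative parameters
q = 1 - p

x = rng.binomial(n, p, size=400_000)
y = rng.binomial(n, p, size=400_000)
z = x + y

# empirical CF of Z at a test frequency vs the Binomial(2n, p) CF
w = 0.7
cf_emp = np.mean(np.exp(1j * w * z))
cf_theory = (p * np.exp(1j * w) + q) ** (2 * n)

mean_z = z.mean()        # Binomial(2n, p) mean is 2np = 6
```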
Division of Random Variables

let Z = X/Y

if Y = y (given), then Z = X/y, a scaled version of X

therefore, given Y = y, the distribution of Z is a scaled version of that of X:

f_Z|Y(z|y) = |y| f_X|Y(yz|y)

use this result to find the pdf of Z:

f_Z(z) = ∫_{−∞}^{∞} f_Z|Y(z|y) f_Y(y) dy
  = ∫_{−∞}^{∞} |y| f_X|Y(yz|y) f_Y(y) dy
  = ∫_{−∞}^{∞} |y| f_XY(yz, y) dy
Pairs of Random Variables 4-35
Division of Exponential RVs

let X and Y be exponential RVs with mean 1:

f_X(x) = e^{−x}, x ≥ 0,   f_Y(y) = e^{−y}, y ≥ 0

assume that X, Y are independent, so

f_XY(x,y) = f_X(x)f_Y(y) = e^{−(x+y)}

the pdf of Z = X/Y can be determined by

f_Z(z) = ∫_0^{∞} y e^{−yz} e^{−y} dy = 1/(z+1)²,   z > 0
Pairs of Random Variables 4-36
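a Monte Carlo sanity check: f_Z(z) = 1/(z+1)² integrates to the CDF F_Z(z) = z/(z+1), which can be compared with empirical frequencies:

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.exponential(1.0, size=1_000_000)
y = rng.exponential(1.0, size=1_000_000)
z = x / y

# F_Z(z) = z/(z+1), so F_Z(1) = 1/2 and F_Z(3) = 3/4
p1 = np.mean(z <= 1.0)
p3 = np.mean(z <= 3.0)
```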
Linear Transformation

let A be an invertible linear transformation such that

[U; V] = A[X; Y]   ⇐⇒   [X; Y] = A⁻¹[U; V]

(figure: A maps the rectangle with sides dx, dy at the point (x,y) to a parallelogram of area dM at the point (u,v))

P(X ∈ dx, Y ∈ dy) = f_XY(x,y) dx dy,   P(U ∈ du, V ∈ dv) = f_UV(u,v) dM

it can be shown that

dM = |det A| dx dy,   f_UV(u,v) = (1/|det A|) f_XY(x,y)
Pairs of Random Variables 4-37
Example: Linear Transformation of a Gaussian

let X and Y be jointly Gaussian RVs with the joint pdf

f_XY(x,y) = (1/(2π√(3/4))) exp( −2(x² − xy + y²)/3 )

let U and V be obtained from (X,Y) by

[U; V] = (1/√2)[1 1; −1 1][X; Y]   ⇐⇒   [X; Y] = (1/√2)[1 −1; 1 1][U; V]

therefore the pdf of U and V is

f_UV(u,v) = (1/(π√3)) exp( −(u²/3 + v²) )

U and V become independent, zero-mean Gaussian RVs
Pairs of Random Variables 4-38
References
Chapter 5 in A. Leon-Garcia, Probability, Statistics, and Random Processes for Electrical Engineering, 3rd edition, Pearson Prentice Hall, 2009
Pairs of Random Variables 4-39
EE401 (Semester 1) Jitkomut Songsiri
5. Random Vectors
• probabilities
• characteristic function
• cross correlation, cross covariance
• Gaussian random vectors
• functions of random vectors
5-1
Random vectors

we denote by X a random vector

X is a function that maps each outcome ζ to a vector of real numbers

an n-dimensional random vector has n components:

X = (X_1, X_2, ..., X_n)ᵀ

also called a multivariate or multiple random variable
Random Vectors 5-2
Probabilities
Joint CDF

F(x) ≜ F_X(x_1, x_2, ..., x_n) = P(X_1 ≤ x_1, X_2 ≤ x_2, ..., X_n ≤ x_n)

Joint PMF

p(x) ≜ p_X(x_1, x_2, ..., x_n) = P(X_1 = x_1, X_2 = x_2, ..., X_n = x_n)

Joint PDF

f(x) ≜ f_X(x_1, x_2, ..., x_n) = ∂ⁿF(x)/(∂x_1 ⋯ ∂x_n)
Random Vectors 5-3
Marginal PMF

p_Xj(x_j) = P(X_j = x_j) = ∑_{x_1} ⋯ ∑_{x_{j−1}} ∑_{x_{j+1}} ⋯ ∑_{x_n} p_X(x_1, x_2, ..., x_n)

Marginal PDF

f_Xj(x_j) = ∫_{−∞}^{∞} ⋯ ∫_{−∞}^{∞} f_X(x_1, x_2, ..., x_n) dx_1 ⋯ dx_{j−1} dx_{j+1} ⋯ dx_n

Conditional PDF: the PDF of X_n given X_1, ..., X_{n−1} is

f(x_n | x_1, ..., x_{n−1}) = f_X(x_1, ..., x_n) / f_{X_1,...,X_{n−1}}(x_1, ..., x_{n−1})
Random Vectors 5-4
Characteristic Function

the characteristic function of an n-dimensional RV is defined by

Φ(ω) = Φ(ω_1, ..., ω_n) = E[e^{j(ω_1X_1 + ⋯ + ω_nX_n)}] = ∫_x e^{jωᵀx} f(x) dx

where

ω = (ω_1, ω_2, ..., ω_n)ᵀ,   x = (x_1, x_2, ..., x_n)ᵀ

Φ(ω) is the n-dimensional Fourier transform of f(x)
Random Vectors 5-5
Independence

the random variables X_1, ..., X_n are independent if
the joint pdf (or pmf) is equal to the product of their marginals

Discrete:   p_X(x_1, ..., x_n) = p_X1(x_1) ⋯ p_Xn(x_n)

Continuous:   f_X(x_1, ..., x_n) = f_X1(x_1) ⋯ f_Xn(x_n)

we can specify an RV by the characteristic function in place of the pdf;
X_1, ..., X_n are independent if

Φ(ω) = Φ_1(ω_1) ⋯ Φ_n(ω_n)
Random Vectors 5-6
Example: White noise signal in communication

the n samples X_1, ..., X_n of a noise signal have the joint pdf:

f_X(x_1, ..., x_n) = e^{−(x_1² + ⋯ + x_n²)/2} / (2π)^{n/2}   for all x_1, ..., x_n

the joint pdf is the n-fold product of one-dimensional Gaussian pdf's

thus, X_1, ..., X_n are independent Gaussian random variables
Random Vectors 5-7
Expected Values

the expected value of a function g(X) = g(X_1, ..., X_n) of a vector random variable X is defined by

E[g(X)] = ∫_x g(x) f(x) dx   (continuous)

E[g(X)] = ∑_x g(x) p(x)   (discrete)

Mean vector

μ = E[X] = E[(X_1, X_2, ..., X_n)ᵀ] ≜ (E[X_1], E[X_2], ..., E[X_n])ᵀ
Random Vectors 5-8
Correlation and Covariance matrices

Correlation matrix has the second moments of X as its entries:

R ≜ E[XXᵀ] = [E[X_1X_1] E[X_1X_2] ⋯ E[X_1X_n]; E[X_2X_1] E[X_2X_2] ⋯ E[X_2X_n]; ⋮ ⋮ ⋱ ⋮; E[X_nX_1] E[X_nX_2] ⋯ E[X_nX_n]]

with R_ij = E[X_iX_j]

Covariance matrix has the second-order central moments as its entries:

C ≜ E[(X − μ)(X − μ)ᵀ]

with C_ij = cov(X_i, X_j) = E[(X_i − μ_i)(X_j − μ_j)]
Random Vectors 5-9
Symmetric matrix

A ∈ R^{n×n} is called symmetric if A = Aᵀ

Facts: if A is symmetric

• all eigenvalues of A are real

• eigenvectors of A corresponding to distinct eigenvalues are orthogonal (and a full orthonormal set of eigenvectors can always be chosen)

• A admits a decomposition

A = UDUᵀ

where UᵀU = UUᵀ = I (U is unitary) and D is diagonal

(of course, the diagonal entries of D are the eigenvalues of A)
Random Vectors 5-10
Unitary matrix

a matrix U ∈ R^{n×n} is called unitary if

UᵀU = UUᵀ = I

examples: (1/√2)[1 −1; 1 1],   [cos θ −sin θ; sin θ cos θ]

Facts:

• a real unitary matrix is also called orthogonal

• a unitary matrix is always invertible and U⁻¹ = Uᵀ

• the column vectors of U are mutually orthogonal

• norm is preserved under a unitary transformation:

y = Ux   =⇒   ‖y‖ = ‖x‖
Random Vectors 5-11
Positive definite matrix

a symmetric matrix A is positive semidefinite, written A ⪰ 0, if

xᵀAx ≥ 0 for all x ∈ Rⁿ

and positive definite, written A ≻ 0, if

xᵀAx > 0 for all nonzero x ∈ Rⁿ

Facts: A ⪰ 0 if and only if

• all eigenvalues of A are non-negative

• all principal minors of A are non-negative
Random Vectors 5-12
example: A = [1 −1; −1 2] ⪰ 0 because

xᵀAx = x_1² + 2x_2² − 2x_1x_2 = (x_1 − x_2)² + x_2² ≥ 0

or we can check from

• the eigenvalues of A are approximately 0.38 and 2.62 (real and positive)

• the principal minors are 1 and det[1 −1; −1 2] = 1 (all positive)

note: A ⪰ 0 does not mean all entries of A are positive!
Random Vectors 5-13
Properties of correlation and covariance matrices

let X be a (real) n-dimensional random vector with mean μ

Facts:

• R and C are n × n symmetric matrices

• R and C are positive semidefinite

• if X_1, ..., X_n are independent, then C is diagonal

• the diagonal entries of C are the variances of the X_k

• if X has zero mean, then R = C

• C = R − μμᵀ
Random Vectors 5-14
Cross Correlation and Cross Covariance
let X,Y be vector random variables with means µX, µY respectively
Cross Correlation
cor(X,Y) = E[XYT ]
if cor(X,Y) = 0 then X and Y are said to be orthogonal
Cross Covariance

cov(X,Y) = E[(X − μ_X)(Y − μ_Y)ᵀ] = cor(X,Y) − μ_Xμ_Yᵀ

if cov(X,Y) = 0 then X and Y are said to be uncorrelated
Random Vectors 5-15
Affine transformation

let Y be an affine transformation of X:

Y = AX + b

where A is a deterministic matrix and b a deterministic vector

• μ_Y = Aμ_X + b, since

μ_Y = E[AX + b] = A E[X] + b = Aμ_X + b

• C_Y = A C_X Aᵀ, since

C_Y = E[(Y − μ_Y)(Y − μ_Y)ᵀ] = E[A(X − μ_X)(X − μ_X)ᵀAᵀ] = A E[(X − μ_X)(X − μ_X)ᵀ] Aᵀ = A C_X Aᵀ
Random Vectors 5-16
Diagonalization of covariance matrix

suppose a random vector Y is obtained via a linear transformation of X:  Y = AX

• the covariance matrices of X, Y are C_X, C_Y respectively

• A may represent a linear filter, system gain, etc.

• the covariance of Y is C_Y = A C_X Aᵀ

Problem: choose A such that C_Y becomes diagonal;
in other words, the variables Y_1, ..., Y_n are required to be uncorrelated
Random Vectors 5-17
since C_X is symmetric, it has the decomposition

C_X = UDUᵀ

where

• D is diagonal and its entries are the eigenvalues of C_X

• U is unitary and the columns of U are eigenvectors of C_X

diagonalization: pick A = Uᵀ to obtain

C_Y = A C_X Aᵀ = Uᵀ(UDUᵀ)U = D

as desired
Random Vectors 5-18
one can write X in terms of Y as

X = UUᵀX = UY = [U_1 U_2 ⋯ U_n](Y_1, Y_2, ..., Y_n)ᵀ = ∑_{k=1}^{n} Y_k U_k

this equation is called the Karhunen-Loève expansion

• X can be expressed as a weighted sum of the eigenvectors U_k

• the weighting coefficients are uncorrelated random variables Y_k
Random Vectors 5-19
example: X has the covariance matrix

C_X = [4 2; 2 4]

design a transformation Y = AX s.t. the covariance of Y is diagonal

the eigenvalues of C_X and the corresponding eigenvectors are

λ_1 = 6, u_1 = (1, 1)ᵀ,   λ_2 = 2, u_2 = (1, −1)ᵀ

u_1 and u_2 are orthogonal, so if we normalize the u_k so that ‖u_k‖ = 1 then

U = [u_1/√2  u_2/√2]

is unitary

therefore, C_X = UDUᵀ where

U = [1/√2 1/√2; 1/√2 −1/√2],   D = [6 0; 0 2]

thus, if we choose A = Uᵀ then C_Y = D, which is diagonal as desired
Random Vectors 5-20
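the example above can be reproduced with NumPy (note that `numpy.linalg.eigh` returns eigenvalues in ascending order, so here D comes out as diag(2, 6)):

```python
import numpy as np

cx = np.array([[4.0, 2.0],
               [2.0, 4.0]])

# eigendecomposition C_X = U D U^T
d, U = np.linalg.eigh(cx)

A = U.T                  # pick A = U^T
cy = A @ cx @ A.T        # covariance of Y = A X, should equal diag(d)
```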
Whitening transformation

we wish to find a transformation Y = AX such that

C_Y = I

• a white-noise property: the covariance is the identity matrix

• all components of Y are uncorrelated

• the variances of the Y_k are normalized to 1

from C_Y = A C_X Aᵀ and the eigenvalue decomposition of C_X,

C_Y = AUDUᵀAᵀ = AUD^{1/2}D^{1/2}UᵀAᵀ

where D^{1/2} denotes the square root of D ⪰ 0, i.e., D^{1/2}D^{1/2} = D:

D = diag(d_1, ..., d_n)   =⇒   D^{1/2} = diag(√d_1, ..., √d_n)

can you find A that makes C_Y the identity matrix?
Random Vectors 5-21
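one choice that answers the question is A = D^{−1/2}Uᵀ, since then C_Y = D^{−1/2}Uᵀ(UDUᵀ)UD^{−1/2} = I; a numerical sketch reusing the covariance from the previous example:

```python
import numpy as np

cx = np.array([[4.0, 2.0],
               [2.0, 4.0]])
d, U = np.linalg.eigh(cx)    # C_X = U D U^T

# whitening transformation A = D^{-1/2} U^T
A = np.diag(1.0 / np.sqrt(d)) @ U.T
cy = A @ cx @ A.T            # should be the identity
```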
Gaussian random vector

X_1, ..., X_n are said to be jointly Gaussian if their joint pdf is given by

f(x) ≜ f_X(x_1, x_2, ..., x_n) = (1/((2π)^{n/2} det(Σ)^{1/2})) exp( −(1/2)(x − μ)ᵀΣ⁻¹(x − μ) )

where μ is the mean vector (n × 1) and Σ ≻ 0 is the covariance matrix (n × n), with entries

μ_k = E[X_k],   Σ_ij = E[(X_i − μ_i)(X_j − μ_j)]
Random Vectors 5-22
example: the joint density function of x (not normalized) is given by

f(x_1, x_2, x_3) = exp( −(x_1² + 3x_2² + 2(x_3 − 1)² + 2x_1(x_3 − 1))/2 )

• f is an exponential of a negative quadratic in x, so x must be a Gaussian:

f(x_1, x_2, x_3) = exp( −(1/2)(x_1, x_2, x_3 − 1)ᵀ [1 0 1; 0 3 0; 1 0 2] (x_1, x_2, x_3 − 1) )

• the mean vector is (0, 0, 1) and the covariance matrix is

C = [1 0 1; 0 3 0; 1 0 2]⁻¹ = [2 0 −1; 0 1/3 0; −1 0 1]

• the variance of x_1 is the highest while that of x_2 is the smallest

• x_1 and x_2 are uncorrelated, and so are x_2 and x_3
Random Vectors 5-23
examples of Gaussian density contours (level sets of the exponent):

(x_1, x_2)ᵀ [Σ_11 Σ_12; Σ_12 Σ_22]⁻¹ (x_1, x_2) = 1

(figure: three contour plots in the (x_1, x_2)-plane)

• uncorrelated, equal variances: Σ = [1 0; 0 1] (circular contour)

• uncorrelated, different variances: Σ = [1/2 0; 0 1] (axis-aligned ellipse)

• correlated: Σ = [1 −1; −1 2] (tilted ellipse)
Random Vectors 5-24
Properties of Gaussian variables

many results on Gaussian RVs can be obtained analytically:

• the marginals of X are also Gaussian

• the conditional pdf of X_k given the other variables is a Gaussian distribution

• uncorrelated Gaussian random variables are independent

• any affine transformation of a Gaussian is also a Gaussian

these are well-known facts, and more can be found in the areas of estimation, statistical learning, etc.
Random Vectors 5-25
Characteristic function of Gaussian

Φ(ω) = Φ(ω_1, ω_2, ..., ω_n) = e^{jμᵀω} e^{−ωᵀΣω/2}

Proof. by definition, and rearranging the quadratic term in the exponent,

Φ(ω) = (1/((2π)^{n/2}|Σ|^{1/2})) ∫_x e^{jxᵀω} e^{−(x−μ)ᵀΣ⁻¹(x−μ)/2} dx
  = (e^{jμᵀω} e^{−ωᵀΣω/2} / ((2π)^{n/2}|Σ|^{1/2})) ∫_x e^{−(x−μ−jΣω)ᵀΣ⁻¹(x−μ−jΣω)/2} dx
  = exp(jμᵀω) exp(−(1/2)ωᵀΣω)

(the remaining integral equals (2π)^{n/2}|Σ|^{1/2}, since the integrand has the form of an unnormalized Gaussian density, canceling the prefactor)

for a one-dimensional Gaussian with zero mean and variance Σ = σ²,

Φ(ω) = e^{−σ²ω²/2}
Random Vectors 5-26
Affine Transformation of a Gaussian is Gaussian

let X be an n-dimensional Gaussian, X ∼ N(μ, Σ), and define

Y = AX + b

where A is m × n and b is m × 1 (so Y is m × 1)

Φ_Y(ω) = E[e^{jωᵀY}] = E[e^{jωᵀ(AX+b)}]
  = E[e^{jωᵀAX}] e^{jωᵀb} = e^{jωᵀb} Φ_X(Aᵀω)
  = e^{jωᵀb} e^{jμᵀAᵀω} e^{−ωᵀAΣAᵀω/2}
  = e^{jωᵀ(Aμ+b)} e^{−ωᵀAΣAᵀω/2}

we read off that Y is Gaussian with mean Aμ + b and covariance AΣAᵀ
Random Vectors 5-27
Marginal of Gaussian is Gaussian

the kth component of X is obtained by

X_k = [0 ⋯ 0 1 0 ⋯ 0]X ≜ e_kᵀX

(e_k is the standard unit column vector; all entries are zero except the kth position)

hence, X_k is simply a linear transformation (in fact, a projection) of X

X_k is then a Gaussian with mean

e_kᵀμ = μ_k

and variance

e_kᵀ Σ e_k = Σ_kk
Uncorrelated Gaussians are independent

suppose (X, Y) is a jointly Gaussian vector (with X, Y each n-dimensional) with

mean μ = (μ_x, μ_y)ᵀ and covariance [C_X 0; 0 C_Y]

in other words, X and Y are uncorrelated Gaussians:

cov(X,Y) = E[XYᵀ] − E[X]E[Y]ᵀ = 0

the joint density can be written as

f_XY(x,y) = (1/((2π)ⁿ|C_X|^{1/2}|C_Y|^{1/2})) exp( −(1/2)[(x−μ_x)ᵀC_X⁻¹(x−μ_x) + (y−μ_y)ᵀC_Y⁻¹(y−μ_y)] )
  = (1/((2π)^{n/2}|C_X|^{1/2})) e^{−(x−μ_x)ᵀC_X⁻¹(x−μ_x)/2} · (1/((2π)^{n/2}|C_Y|^{1/2})) e^{−(y−μ_y)ᵀC_Y⁻¹(y−μ_y)/2}

proving the independence
Random Vectors 5-29
we can also see this from the characteristic function

Φ(ω_1, ω_2) = E[ exp( j(ω_1ᵀX + ω_2ᵀY) ) ]
  = exp( j(ω_1ᵀμ_x + ω_2ᵀμ_y) ) · exp( −(1/2)(ω_1ᵀC_Xω_1 + ω_2ᵀC_Yω_2) )
  = exp(jω_1ᵀμ_x) exp(−ω_1ᵀC_Xω_1/2) · exp(jω_2ᵀμ_y) exp(−ω_2ᵀC_Yω_2/2)
  ≜ Φ_1(ω_1) · Φ_2(ω_2)

proving the independence
Random Vectors 5-30
Conditional of Gaussian is Gaussian

let Z be an n-dimensional Gaussian which can be decomposed as

Z = (X, Y)ᵀ ∼ N( (μ_x, μ_y)ᵀ, [Σ_xx Σ_xy; Σ_xyᵀ Σ_yy] )

the conditional pdf of X given Y is also Gaussian with conditional mean

μ_X|Y = μ_x + Σ_xy Σ_yy⁻¹ (Y − μ_y)

and conditional covariance

Σ_X|Y = Σ_xx − Σ_xy Σ_yy⁻¹ Σ_xyᵀ
Random Vectors 5-31
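a simulation sketch of these formulas for a bivariate Gaussian (the mean, covariance, and conditioning point below are illustrative assumptions, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(4)

# hypothetical bivariate case: scalar X, scalar Y
mu = np.array([1.0, -2.0])
sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])
samples = rng.multivariate_normal(mu, sigma, size=1_000_000)
x, y = samples[:, 0], samples[:, 1]

# conditional mean and variance of X given Y = y0
y0 = -1.5
mu_cond = mu[0] + sigma[0, 1] / sigma[1, 1] * (y0 - mu[1])     # 1.4
var_cond = sigma[0, 0] - sigma[0, 1] ** 2 / sigma[1, 1]        # 1.36

# empirical estimates from samples with Y close to y0
mask = np.abs(y - y0) < 0.05
emp_mean = x[mask].mean()
emp_var = x[mask].var()
```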
Proof:

from the matrix inversion lemma, Σ⁻¹ can be written as

Σ⁻¹ = [S⁻¹  −S⁻¹ΣxyΣyy⁻¹; −Σyy⁻¹ΣxyᵀS⁻¹  Σyy⁻¹ + Σyy⁻¹ΣxyᵀS⁻¹ΣxyΣyy⁻¹]

where S is the Schur complement of Σ_yy in Σ:

S = Σ_xx − Σ_xy Σ_yy⁻¹ Σ_xyᵀ,   det Σ = det S · det Σ_yy

we can show that Σ ≻ 0 if and only if S ≻ 0 and Σ_yy ≻ 0
Random Vectors 5-32
from f_X|Y(x|y) = f_XY(x,y)/f_Y(y), we calculate the exponent terms:

((x−μ_x), (y−μ_y))ᵀ Σ⁻¹ ((x−μ_x), (y−μ_y)) − (y−μ_y)ᵀΣ_yy⁻¹(y−μ_y)
  = (x−μ_x)ᵀS⁻¹(x−μ_x) − (x−μ_x)ᵀS⁻¹Σ_xyΣ_yy⁻¹(y−μ_y)
    − (y−μ_y)ᵀΣ_yy⁻¹Σ_xyᵀS⁻¹(x−μ_x) + (y−μ_y)ᵀ(Σ_yy⁻¹Σ_xyᵀS⁻¹Σ_xyΣ_yy⁻¹)(y−μ_y)
  = [x − μ_x − Σ_xyΣ_yy⁻¹(y−μ_y)]ᵀ S⁻¹ [x − μ_x − Σ_xyΣ_yy⁻¹(y−μ_y)]
  ≜ (x − μ_X|Y)ᵀ Σ_X|Y⁻¹ (x − μ_X|Y)

f_X|Y(x|y) is an exponential of a quadratic function in x,
so it has the form of a Gaussian
Random Vectors 5-33
Standard Gaussian vectors

for an n-dimensional Gaussian vector X ∼ N(μ, C) with C ≻ 0,

let A be an n × n invertible matrix such that

AAᵀ = C

(A is called a factor of C)

then the random vector Z = A⁻¹(X − μ) is a standard Gaussian vector, i.e.,

Z ∼ N(0, I)

(obtain A via eigenvalue decomposition or Cholesky factorization)
Random Vectors 5-34
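a numerical sketch of the Cholesky route (the mean and covariance below are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(5)

mu = np.array([1.0, 2.0, 3.0])
C = np.array([[2.0, 0.5, 0.0],
              [0.5, 1.0, 0.3],
              [0.0, 0.3, 1.5]])    # positive definite by construction

A = np.linalg.cholesky(C)          # A A^T = C
x = rng.multivariate_normal(mu, C, size=500_000)

# Z = A^{-1}(X - mu): solve A z = (x - mu) for each sample
z = np.linalg.solve(A, (x - mu).T).T

mz = z.mean(axis=0)                # should be near 0
cz = np.cov(z.T)                   # should be near I
```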
Functions of random vectors
• minimum and maximum of random variables
• general transformation
• affine transformation
Random Vectors 5-35
Minimum and Maximum of RVs

let X_1, X_2, ..., X_n be independent RVs with common CDF F_X

define the minimum and maximum of the RVs by

Y = min(X_1, X_2, ..., X_n),   Z = max(X_1, X_2, ..., X_n)

the maximum of X_1, ..., X_n is at most z iff X_i ≤ z for all i, so

F_Z(z) = P(X_1 ≤ z)P(X_2 ≤ z) ⋯ P(X_n ≤ z) = (F_X(z))ⁿ

the minimum of X_1, ..., X_n is greater than y iff X_i > y for all i, so

1 − F_Y(y) = P(X_1 > y)P(X_2 > y) ⋯ P(X_n > y) = (1 − F_X(y))ⁿ

and F_Y(y) = 1 − (1 − F_X(y))ⁿ
Random Vectors 5-36
Example: Merging of independent Poisson arrivals

(figure: n sources send web page requests to a single server)

• T_i denotes the interarrival times for source i

• T_i has an exponential distribution with rate λ_i

• find the distribution of the interarrival times between consecutive requests at the server
Random Vectors 5-37
each T_i satisfies the memoryless property, so the time that has elapsed since the last arrival is irrelevant

the time until the next arrival at the multiplexer is

Z = min(T_1, T_2, ..., T_n)

therefore, the cdf of Z can be computed by:

1 − F_Z(z) = P(min(T_1, T_2, ..., T_n) > z)
  = P(T_1 > z)P(T_2 > z) ⋯ P(T_n > z)
  = (1 − F_T1(z))(1 − F_T2(z)) ⋯ (1 − F_Tn(z))
  = e^{−λ_1z} ⋯ e^{−λ_nz} = e^{−(λ_1 + λ_2 + ⋯ + λ_n)z}

the interarrival time is an exponential RV with rate λ_1 + λ_2 + ⋯ + λ_n
Random Vectors 5-38
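a quick simulation of the merged-arrivals result (three sources with assumed rates 0.5, 1.0, 2.0, so the merged rate is 3.5):

```python
import numpy as np

rng = np.random.default_rng(6)
rates = np.array([0.5, 1.0, 2.0])        # illustrative λ_i for three sources

# one interarrival per source, repeated many times; the server sees the minimum
t = rng.exponential(1.0 / rates, size=(1_000_000, 3))
z = t.min(axis=1)

lam_total = rates.sum()                  # merged rate λ1 + λ2 + λ3 = 3.5
emp_mean = z.mean()                      # theory: 1/λ_total
tail = np.mean(z > 0.5)                  # theory: e^{-λ_total * 0.5}
```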
General transformation

let X be a vector random variable

define Z = g(X), g: Rⁿ → Rⁿ, and assume that g is invertible,

so that for Z = z we can solve for x uniquely:

x = g⁻¹(z)

then the joint pdf of Z is given by

f_Z(z) = f_X(g⁻¹(z)) / |det J|

where det J is the determinant of the Jacobian matrix, evaluated at x = g⁻¹(z):

J = [∂g_1/∂x_1 ⋯ ∂g_1/∂x_n; ⋮ ⋱ ⋮; ∂g_n/∂x_1 ⋯ ∂g_n/∂x_n]
Random Vectors 5-39
Affine transformation

if X is a continuous random vector and A is an invertible matrix,

then Y = AX + b has pdf

f_Y(y) = (1/|det A|) f_X(A⁻¹(y − b))

Gaussian case: let X ∼ N(0, Σ), then

f_Y(y) = (1/((2π)^{n/2}|det A||Σ|^{1/2})) exp( −(1/2)(y−b)ᵀA⁻ᵀΣ⁻¹A⁻¹(y−b) )
  = (1/((2π)^{n/2}|AΣAᵀ|^{1/2})) exp( −(1/2)(y−b)ᵀ(AΣAᵀ)⁻¹(y−b) )

we read off that Y is also Gaussian with mean b and covariance AΣAᵀ

this agrees with the results on pages 5-16 and 5-27
Random Vectors 5-40
Example: Sum of jointly Gaussian

a special case of linear transformation is

Z = a_1X_1 + a_2X_2 + ⋯ + a_nX_n

where X_1, ..., X_n are jointly Gaussian

Z can be written as

Z = [a_1 ⋯ a_n](X_1, ..., X_n)ᵀ ≜ AX

Z is simply a linear transformation of a Gaussian
Random Vectors 5-41
therefore, Z is Gaussian with mean

E[Z] = Aμ = ∑_{i=1}^{n} a_i E[X_i]

and variance

var(Z) = AΣAᵀ = ∑_{i=1}^{n} ∑_{j=1}^{n} a_i a_j cov(X_i, X_j)

if X_1, ..., X_n are independent Gaussians, i.e.,

cov(X_i, X_j) = 0 for i ≠ j,

then the variance of Z reduces to

var(Z) = ∑_{i=1}^{n} a_i² cov(X_i, X_i) = ∑_{i=1}^{n} a_i² var(X_i)
Random Vectors 5-42
References
Chapter 6 in A. Leon-Garcia, Probability, Statistics, and Random Processes for Electrical Engineering, 3rd edition, Pearson Prentice Hall, 2009
Random Vectors 5-43
EE401 (Semester 1) Jitkomut Songsiri
6. Sums of Random Variables
• mean and variance
• PDF of sums of independent RVs
• laws of large numbers
• central limit theorems
6-1
Mean and Variance

let X_1, X_2, ..., X_n be a sequence of RVs

regardless of statistical dependence, we have

E[X_1 + X_2 + ⋯ + X_n] = E[X_1] + E[X_2] + ⋯ + E[X_n]

the variance of a sum of RVs is, however, NOT equal to the sum of the variances in general:

var(X_1 + X_2 + ⋯ + X_n) = ∑_{k=1}^{n} var(X_k) + ∑_{j=1}^{n} ∑_{k=1, k≠j}^{n} cov(X_j, X_k)

if X_1, X_2, ..., X_n are uncorrelated, then

var(X_1 + X_2 + ⋯ + X_n) = var(X_1) + var(X_2) + ⋯ + var(X_n)
Sums of Random Variables 6-2
PDF of sums of independent RVs

consider the sum of n independent RVs

S_n = X_1 + X_2 + ⋯ + X_n

the characteristic function of S_n is

Φ_S(ω) = E[e^{jωS_n}] = E[e^{jω(X_1 + X_2 + ⋯ + X_n)}]
  = E[e^{jωX_1}] ⋯ E[e^{jωX_n}]
  = Φ_X1(ω) ⋯ Φ_Xn(ω)

thus the pdf of S_n is found by taking the inverse Fourier transform of Φ_S(ω):

f_S(s) = F⁻¹[Φ_X1(ω) ⋯ Φ_Xn(ω)]
Sums of Random Variables 6-3
Example

find the pdf of a sum of n independent exponential RVs,

where all the exponential variables have parameter α

the characteristic function of a single exponential RV is

Φ_X(ω) = α/(α − jω)

the characteristic function of the sum is

Φ_S(ω) = (α/(α − jω))ⁿ

we see that S_n is an n-Erlang RV
Sums of Random Variables 6-4
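a simulation check of the Erlang conclusion via its first two moments (α = 2 and n = 5 are illustrative choices; an n-Erlang RV with rate α has mean n/α and variance n/α²):

```python
import numpy as np

rng = np.random.default_rng(7)
alpha, n = 2.0, 5                   # illustrative parameters

# sum of n independent Exponential(α) variables, repeated many times
s = rng.exponential(1.0 / alpha, size=(1_000_000, n)).sum(axis=1)

emp_mean = s.mean()                 # theory: n/α = 2.5
emp_var = s.var()                   # theory: n/α² = 1.25
```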
Sample mean

let X be an RV with E[X] = μ (unknown)

let X_1, X_2, ..., X_n denote n independent, repeated measurements of X

the X_j's are independent, identically distributed (i.i.d.) RVs

the sample mean of the sequence is used to estimate E[X]:

M_n = (1/n) ∑_{j=1}^{n} X_j

two statistical quantities characterize the sample mean's properties:

• E[M_n]: we say M_n is unbiased if E[M_n] = μ

• var(M_n): we examine this value when n is large
Sums of Random Variables 6-5
the sample mean is an unbiased estimator for µ:
E[Mn] = E[ (1/n) Σ_{j=1}^{n} Xj ] = (1/n) Σ_{j=1}^{n} E[Xj] = µ
suppose var(X) = σ2 (true variance)
since the Xj’s are i.i.d., the variance of Mn is

var(Mn) = (1/n²) Σ_{j=1}^{n} var(Xj) = nσ²/n² = σ²/n
hence, the variance of the sample mean approaches zero as the number of samples increases
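The σ²/n decay can be seen directly by simulating many independent sample means; a sketch where the measurement distribution (Gaussian with σ² = 4), the seed, and the trial counts are assumed values:

```python
import numpy as np

rng = np.random.default_rng(2)
sigma2 = 4.0
for n in [10, 100, 1000]:
    # 5000 independent sample means, each computed from n measurements
    M = rng.normal(0.0, np.sqrt(sigma2), size=(5000, n)).mean(axis=1)
    print(n, M.var(), sigma2 / n)   # empirical var(Mn) vs. sigma^2 / n
```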
Weak Law of Large Numbers
let X1, X2, . . . , Xn be a sequence of iid RVs with finite mean E[X] = µ and variance σ²
for any ε > 0,

lim_{n→∞} P[ |Mn − µ| < ε ] = 1
• for large enough n, the sample mean will be close to the true mean withhigh probability
• Proof. apply the Chebyshev inequality:

P[ |Mn − µ| ≥ ε ] ≤ σ²/(nε²)  =⇒  P[ |Mn − µ| < ε ] ≥ 1 − σ²/(nε²)

and the right-hand side tends to 1 as n → ∞
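A sketch comparing the empirical deviation probability with the Chebyshev bound, using Uniform(0,1) measurements (so µ = 1/2 and σ² = 1/12); ε, the seed, and the trial count are assumed values:

```python
import numpy as np

rng = np.random.default_rng(3)
mu, sigma2, eps = 0.5, 1.0 / 12.0, 0.05
for n in [100, 1000, 10000]:
    M = rng.uniform(0.0, 1.0, size=(500, n)).mean(axis=1)
    p_dev = np.mean(np.abs(M - mu) >= eps)   # empirical P[|Mn - mu| >= eps]
    bound = sigma2 / (n * eps ** 2)          # Chebyshev bound
    print(n, p_dev, min(bound, 1.0))         # p_dev shrinks toward 0 as n grows
```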
scattergram of 1000 realizations of the sample mean
[figure: scatter plots of sample means (x1, x2), four panels for n = 100, n = 1000, n = 10000, n = 100000]
• the Mn’s are computed from a 2-dimensional Gaussian with zero mean

• as n increases, the Mn’s concentrate around zero with high probability
Strong Law of Large Numbers
let X1, X2, . . . , Xn be a sequence of iid RVs with finite mean E[X] = µ and finite variance, then
P[ lim_{n→∞} Mn = µ ] = 1
• Mk is the sample mean computed using X1 through Xk

• with probability 1, every sequence of sample-mean calculations will eventually approach and stay close to E[X] = µ
• the strong law implies the weak law
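A single-path illustration of the statement: the running sample mean of one iid sequence settles near µ and stays there; exponential measurements with µ = 3, the seed, and the path length are assumed values:

```python
import numpy as np

rng = np.random.default_rng(4)
mu = 3.0
x = rng.exponential(mu, 100_000)             # one iid sequence with E[X] = 3
M = np.cumsum(x) / np.arange(1, x.size + 1)  # running sample means M1, M2, ...

# the tail of the path stays close to mu (with probability 1 in the limit)
tail = M[10_000:]
print(np.abs(tail - mu).max())               # small maximal deviation over the tail
```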
Central Limit Theorem
let X1, X2, . . . , Xn be a sequence of iid RVs with finite mean E[X] = µ and finite variance σ²

let Sn be the sum of the first n RVs in the sequence:
Sn = X1 +X2 + · · ·+Xn
and define
Zn = (Sn − nµ) / (σ√n)
then
lim_{n→∞} P(Zn ≤ z) = (1/√(2π)) ∫_{−∞}^{z} e^{−x²/2} dx
as n becomes large, the CDF of the normalized sum Sn approaches the Gaussian distribution
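A simulation sketch of the theorem with Uniform(0,1) summands (n = 30, the seed, and the sample count are assumed values):

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(5)
n = 30
mu, sigma = 0.5, sqrt(1.0 / 12.0)     # mean and std of a Uniform(0,1) summand
S = rng.uniform(0.0, 1.0, size=(200_000, n)).sum(axis=1)
Z = (S - n * mu) / (sigma * sqrt(n))  # normalized sums

def Phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

for z in [-1.0, 0.0, 1.0]:
    print(z, np.mean(Z <= z), Phi(z))  # empirical CDF tracks the Gaussian CDF
```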
Proof of Central Limit Theorem
first note that
Zn = (Sn − nµ) / (σ√n) = (1/(σ√n)) Σ_{k=1}^{n} (Xk − µ)
the characteristic function of Zn is given by
Φ_Zn(ω) = E[e^{jωZn}] = E[ exp( (jω/(σ√n)) Σ_{k=1}^{n} (Xk − µ) ) ]
        = E[ Π_{k=1}^{n} e^{jω(Xk−µ)/(σ√n)} ]
        = ( E[e^{jω(X−µ)/(σ√n)}] )^n
(using the fact that Xk’s are iid)
expanding the exponential expression gives
E[e^{jω(X−µ)/(σ√n)}] = E[ 1 + (jω/(σ√n))(X − µ) + ((jω)²/(2! nσ²))(X − µ)² + · · · ]
                     ≈ 1 − ω²/(2n)
(the first-order term vanishes since E[X − µ] = 0, the second-order term gives −ω²/(2n) since E[(X − µ)²] = σ², and the higher-order terms can be neglected as n becomes large)
then we obtain
Φ_Zn(ω) ≈ ( 1 − ω²/(2n) )^n → e^{−ω²/2}, as n → ∞
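The scalar limit (1 − ω²/(2n))^n → e^{−ω²/2} behind this step is easy to check numerically (ω = 1.5 is an arbitrary test point):

```python
from math import exp

w = 1.5
for n in [10, 100, 10000]:
    # (1 - w^2/(2n))^n should approach exp(-w^2/2) as n grows
    print(n, (1 - w ** 2 / (2 * n)) ** n, exp(-w ** 2 / 2))
```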
[figure: CDFs F(x) of the sum of n exponential RVs (blue) vs. the Gaussian CDF (red dashed), for n = 5 (left) and n = 50 (right)]
• blue lines are the CDFs of the sum of n exponential RVs with λ = 1, where n = 5 (left) and n = 50 (right)

• red dashed lines are the CDFs of Gaussian RVs with the same mean (n/λ) and variance (n/λ²)

• as n increases, the CDF approaches that of the Gaussian distribution
[figure: CDFs F(x) of the sum of n Bernoulli RVs (blue) vs. the Gaussian CDF (red dashed), for n = 5 (left) and n = 25 (right)]
• blue lines are the CDFs of the sum of n Bernoulli RVs with p = 1/2, where n = 5 (left) and n = 25 (right)

• red dashed lines are the CDFs of Gaussians with mean np and variance np(1 − p)
Example
the times between events are iid exponential RVs with mean m sec

find the probability that the 1000th event occurs in the time interval (1000 ± 50)m
• Xj is the time between events
• Sn is the time of the nth event (then Sn = X1 +X2 + · · ·+Xn)
• E[Sn] = nm and var(Sn) = nm2
the CLT gives
P(950m ≤ S1000 ≤ 1050m) = P( (950m − 1000m)/(m√1000) ≤ Z1000 ≤ (1050m − 1000m)/(m√1000) )
                        ≈ Φ(1.58) − Φ(−1.58) ≈ 0.886
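The numerical value, plus a Monte Carlo check using the n-Erlang form of Sn from earlier in this chapter (taking m = 1 without loss of generality, since m cancels in the normalization; the seed and sample count are assumed values):

```python
import numpy as np
from math import erf, sqrt

def Phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

z = 50.0 / sqrt(1000.0)            # (1050m - 1000m)/(m*sqrt(1000)) ≈ 1.58
clt_prob = Phi(z) - Phi(-z)
print(clt_prob)                    # ≈ 0.886

# Monte Carlo: S1000 is a 1000-Erlang RV (gamma with shape 1000, scale m = 1)
rng = np.random.default_rng(6)
S = rng.gamma(1000.0, 1.0, size=100_000)
print(np.mean((S >= 950) & (S <= 1050)))   # close to the CLT approximation
```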
References
Chapter 7 in A. Leon-Garcia, Probability, Statistics, and Random Processes for Electrical Engineering, 3rd edition, Pearson Prentice Hall, 2009