
Lecture Notes 3

Multiple Random Variables

• Joint, Marginal, and Conditional pmfs

• Bayes Rule and Independence for pmfs

• Joint, Marginal, and Conditional pdfs

• Bayes Rule and Independence for pdfs

• Functions of Two RVs

• One Discrete and One Continuous RVs

• More Than Two Random Variables

Corresponding pages from B&T textbook: 110-111, 158-159, 164-170, 173-178, 186-190, 221-225.

EE 178/278A: Multiple Random Variables Page 3 – 1

Two Discrete Random Variables – Joint PMFs

• As we have seen, one can define several r.v.s on the sample space of a random experiment. How do we jointly specify multiple r.v.s, i.e., how can we determine the probability of any event involving multiple r.v.s?

• We first consider two discrete r.v.s

• Let X and Y be two discrete random variables defined on the same experiment. They are completely specified by their joint pmf

  p_{X,Y}(x, y) = P{X = x, Y = y}  for all x ∈ X, y ∈ Y

• Clearly, p_{X,Y}(x, y) ≥ 0, and

  Σ_{x∈X} Σ_{y∈Y} p_{X,Y}(x, y) = 1

• Notation: We use (X, Y) ∼ p_{X,Y}(x, y) to mean that the two discrete r.v.s have the specified joint pmf


• The joint pmf can be described by a table

  Example: Consider X, Y with the following joint pmf p_{X,Y}(x, y)

                       x
    p_{X,Y}     1      2      3      4
        1     1/16     0     1/8    1/16
    y   2     1/32   1/32    1/4     0
        3      0     1/8    1/16    1/16
        4     1/16   1/32   1/16    1/32


Marginal PMFs

• Consider two discrete r.v.s X and Y. They are described by their joint pmf p_{X,Y}(x, y). We can also define their marginal pmfs p_X(x) and p_Y(y). How are these related?

• To find the marginal pmf of X, we use the law of total probability

  p_X(x) = Σ_{y∈Y} p_{X,Y}(x, y)  for x ∈ X

  Similarly, to find the marginal pmf of Y, we sum over x ∈ X

• Example: Find the marginal pmfs for the previous example

                       x
    p_{X,Y}     1      2      3      4    |  p_Y(y)
        1     1/16     0     1/8    1/16  |   1/4
    y   2     1/32   1/32    1/4     0    |   5/16
        3      0     1/8    1/16    1/16  |   1/4
        4     1/16   1/32   1/16    1/32  |   3/16
    ------------------------------------------------
    p_X(x)    5/32   3/16    1/2    5/32
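  The marginal values above are just the row and column sums of the table. A minimal sketch (not part of the notes; plain NumPy, with the table entered row by row in the order y = 1, ..., 4) checks them:

```python
import numpy as np

# Joint pmf from the table: rows are y = 1..4, columns are x = 1..4
P = np.array([
    [1/16, 0,    1/8,  1/16],
    [1/32, 1/32, 1/4,  0   ],
    [0,    1/8,  1/16, 1/16],
    [1/16, 1/32, 1/16, 1/32],
])

assert np.isclose(P.sum(), 1.0)   # the joint pmf sums to 1
p_X = P.sum(axis=0)               # sum over y -> marginal of X
p_Y = P.sum(axis=1)               # sum over x -> marginal of Y
print("p_X(1..4):", p_X)          # [5/32, 3/16, 1/2, 5/32]
print("p_Y(1..4):", p_Y)          # [1/4, 5/16, 1/4, 3/16]
```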


Conditional PMFs

• The conditional pmf of X given Y = y is defined as

  p_{X|Y}(x|y) = p_{X,Y}(x, y) / p_Y(y)  for p_Y(y) ≠ 0 and x ∈ X

  Also, the conditional pmf of Y given X = x is

  p_{Y|X}(y|x) = p_{X,Y}(x, y) / p_X(x)  for p_X(x) ≠ 0 and y ∈ Y

• For fixed x, p_{Y|X}(y|x) is a pmf for Y

• Example: Find p_{Y|X}(y|2) for the previous example
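  A minimal sketch of this calculation (not part of the notes): take the x = 2 column of the table and divide by the marginal p_X(2) = 3/16.

```python
import numpy as np

# Joint pmf table again: rows y = 1..4, columns x = 1..4
P = np.array([[1/16, 0,    1/8,  1/16],
              [1/32, 1/32, 1/4,  0   ],
              [0,    1/8,  1/16, 1/16],
              [1/16, 1/32, 1/16, 1/32]])

col = P[:, 1]                    # p_{X,Y}(2, y) for y = 1..4
p_Y_given_X2 = col / col.sum()   # divide by p_X(2) = 3/16
print(p_Y_given_X2)              # [0, 1/6, 2/3, 1/6]
```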


• Chain rule: Can write

  p_{X,Y}(x, y) = p_X(x) p_{Y|X}(y|x)

• Bayes rule for pmfs: Given p_X(x) and p_{Y|X}(y|x) for all (x, y) ∈ X × Y, we can find

  p_{X|Y}(x|y) = p_{X,Y}(x, y) / p_Y(y)

               = [p_{Y|X}(y|x) / p_Y(y)] p_X(x)

               = [p_{Y|X}(y|x) / Σ_{x'∈X} p_{X,Y}(x', y)] p_X(x),  by total probability

  Using the chain rule, we obtain another version of Bayes rule

  p_{X|Y}(x|y) = [p_{Y|X}(y|x) / Σ_{x'∈X} p_X(x') p_{Y|X}(y|x')] p_X(x)


Independence

• The random variables X and Y are said to be independent if for any events A ⊂ X and B ⊂ Y

  P{X ∈ A, Y ∈ B} = P{X ∈ A} P{Y ∈ B}

• Can show that the above definition is equivalent to saying that the r.v.s X and Y are independent if

  p_{X,Y}(x, y) = p_X(x) p_Y(y)  for all (x, y) ∈ X × Y

• Independence implies that p_{X|Y}(x|y) = p_X(x) for all (x, y) ∈ X × Y


Example: Binary Symmetric Channel

• Consider the following binary communication channel

  [Diagram: the bit sent X ∈ {0, 1} is added mod 2 to the noise Z ∈ {0, 1} to give the bit received Y ∈ {0, 1}]

  The bit sent X ∼ Bern(p), the noise Z ∼ Bern(ε), the bit received Y = (X + Z) mod 2 = X ⊕ Z, and X and Z are independent. Find

  1. p_{X|Y}(x|y),
  2. p_Y(y), and
  3. the probability of error P{X ≠ Y}


1. To find p_{X|Y}(x|y) we use Bayes rule

   p_{X|Y}(x|y) = [p_{Y|X}(y|x) / Σ_{x'∈X} p_{Y|X}(y|x') p_X(x')] p_X(x)

   We know p_X(x). To find p_{Y|X}, note that

   p_{Y|X}(y|x) = P{Y = y | X = x}
                = P{X ⊕ Z = y | X = x}
                = P{Z = y ⊕ X | X = x}
                = P{Z = y ⊕ x | X = x}
                = P{Z = y ⊕ x},  since Z and X are independent
                = p_Z(y ⊕ x)

   So we have

   p_{Y|X}(0|0) = p_Z(0 ⊕ 0) = p_Z(0) = 1 − ε,   p_{Y|X}(0|1) = p_Z(0 ⊕ 1) = p_Z(1) = ε
   p_{Y|X}(1|0) = p_Z(1 ⊕ 0) = p_Z(1) = ε,       p_{Y|X}(1|1) = p_Z(1 ⊕ 1) = p_Z(0) = 1 − ε


Plugging in the Bayes rule equation, we obtain

   p_{X|Y}(0|0) = [p_{Y|X}(0|0) / (p_{Y|X}(0|0) p_X(0) + p_{Y|X}(0|1) p_X(1))] p_X(0)
                = (1 − ε)(1 − p) / [(1 − ε)(1 − p) + εp]

   p_{X|Y}(1|0) = [p_{Y|X}(0|1) / (p_{Y|X}(0|0) p_X(0) + p_{Y|X}(0|1) p_X(1))] p_X(1) = 1 − p_{X|Y}(0|0)
                = εp / [(1 − ε)(1 − p) + εp]

   p_{X|Y}(0|1) = [p_{Y|X}(1|0) / (p_{Y|X}(1|0) p_X(0) + p_{Y|X}(1|1) p_X(1))] p_X(0)
                = ε(1 − p) / [ε(1 − p) + (1 − ε)p]

   p_{X|Y}(1|1) = [p_{Y|X}(1|1) / (p_{Y|X}(1|0) p_X(0) + p_{Y|X}(1|1) p_X(1))] p_X(1) = 1 − p_{X|Y}(0|1)
                = (1 − ε)p / [ε(1 − p) + (1 − ε)p]


2. We already found p_Y(y) in the denominators above: p_Y(0) = (1 − ε)(1 − p) + εp and p_Y(1) = ε(1 − p) + (1 − ε)p

3. Now to find the probability of error P{X ≠ Y}, consider

   P{X ≠ Y} = p_{X,Y}(0, 1) + p_{X,Y}(1, 0)
            = p_{Y|X}(1|0) p_X(0) + p_{Y|X}(0|1) p_X(1)
            = ε(1 − p) + εp
            = ε

   An interesting special case is when ε = 1/2

   Here, P{X ≠ Y} = 1/2, which is the worst possible (no information is sent), and

   p_Y(0) = (1/2)p + (1/2)(1 − p) = 1/2 = p_Y(1),

   i.e., Y ∼ Bern(1/2), independent of the value of p!

   Also in this case, the bit sent X and the bit received Y are independent (check this)
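   A quick Monte Carlo sketch of the binary symmetric channel (not part of the notes; the values p = 0.3 and ε = 0.1 are arbitrary choices) checks that the empirical error rate matches ε and that the empirical posterior p_{X|Y}(1|1) matches the formula derived above:

```python
import numpy as np

rng = np.random.default_rng(0)
p, eps, n = 0.3, 0.1, 1_000_000            # arbitrary bias, crossover probability, sample size

X = rng.random(n) < p                      # bit sent, Bern(p)
Z = rng.random(n) < eps                    # noise, Bern(eps)
Y = X ^ Z                                  # bit received, X xor Z

print(np.mean(X != Y))                     # ~ eps = 0.1
print(np.mean(X[Y == 1]))                  # empirical p_{X|Y}(1|1)
print((1 - eps) * p / ((1 - eps) * p + eps * (1 - p)))   # formula from the slides
```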


Two Continuous Random Variables – Joint PDFs

• Two continuous r.v.s defined over the same experiment are jointly continuous if they take on a continuum of values, each with probability 0. They are completely specified by a joint pdf f_{X,Y} such that for any event A ⊂ (−∞, ∞)²,

  P{(X, Y) ∈ A} = ∫∫_{(x,y)∈A} f_{X,Y}(x, y) dx dy

  For example, for a rectangular area

  P{a < X ≤ b, c < Y ≤ d} = ∫_c^d ∫_a^b f_{X,Y}(x, y) dx dy

• Properties of a joint pdf f_{X,Y}:

  1. f_{X,Y}(x, y) ≥ 0

  2. ∫_{−∞}^{∞} ∫_{−∞}^{∞} f_{X,Y}(x, y) dx dy = 1

• Again, a joint pdf is not a probability. We can relate it to probability as

  P{x < X ≤ x + Δx, y < Y ≤ y + Δy} ≈ f_{X,Y}(x, y) Δx Δy


Marginal PDF

• The marginal pdf of X can be obtained from the joint pdf by integrating the joint over the other variable y

  f_X(x) = ∫_{−∞}^{∞} f_{X,Y}(x, y) dy

  This follows by the law of total probability. To see this, consider the thin vertical strip between x and x + Δx in the (x, y) plane:


  f_X(x) = lim_{Δx→0} P{x < X ≤ x + Δx} / Δx

         = lim_{Δx→0} (1/Δx) lim_{Δy→0} Σ_{n=−∞}^{∞} P{x < X ≤ x + Δx, nΔy < Y ≤ (n+1)Δy}

         = lim_{Δx→0} (1/Δx) [∫_{−∞}^{∞} f_{X,Y}(x, y) dy] Δx = ∫_{−∞}^{∞} f_{X,Y}(x, y) dy


Example

• Let (X, Y) ∼ f(x, y), where

  f(x, y) = { c,  if x, y ≥ 0 and x + y ≤ 1
            { 0,  otherwise

  1. Find c
  2. Find f_Y(y)
  3. Find P{X ≥ (1/2)Y}

• Solution:

  1. To find c, note that

     ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) dx dy = 1,

     and the support is a triangle of area 1/2, thus (1/2)c = 1, or c = 2


  2. To find f_Y(y), we use the law of total probability

     f_Y(y) = ∫_{−∞}^{∞} f(x, y) dx = ∫_0^{1−y} 2 dx = 2(1 − y)  for 0 ≤ y ≤ 1,  and 0 otherwise

     [Plot: f_Y(y) decreases linearly from 2 at y = 0 to 0 at y = 1]


  3. To find the probability of the set {X ≥ (1/2)Y} we first sketch it

     [Sketch: the triangle x, y ≥ 0, x + y ≤ 1 with the line x = y/2; the line meets x + y = 1 at y = 2/3]

     From the figure we find that

     P{X ≥ (1/2)Y} = ∫∫_{(x,y): x ≥ y/2} f_{X,Y}(x, y) dx dy

                   = ∫_0^{2/3} ∫_{y/2}^{1−y} 2 dx dy = 2/3
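     The constant c and the probability P{X ≥ Y/2} can be spot-checked by Monte Carlo rejection sampling from the triangle (a sketch, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2_000_000
x = rng.random(n)
y = rng.random(n)
inside = x + y <= 1                 # uniform points in the unit square that land in the triangle

# Area of the support is ~ 1/2, so c = 1/area ~ 2
print(1 / inside.mean())            # ~ 2

# P{X >= Y/2} under the uniform density on the triangle
print(np.mean(x[inside] >= y[inside] / 2))   # ~ 2/3
```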


Example: Buffon’s Needle Problem

• The plane is ruled with equidistant parallel lines at distance d apart. Throw a needle of length l < d at random. What is the probability that it will intersect one of the lines?

  [Figure: parallel lines spaced d apart; a needle of length l at distance X from the nearest line, making angle Θ with it]

• Solution:

  Let X be the distance from the midpoint of the needle to the nearest of the parallel lines, and Θ be the acute angle determined by a line through the needle and the nearest parallel line


  (X, Θ) are uniformly distributed within the rectangle [0, d/2] × [0, π/2], thus

  f_{X,Θ}(x, θ) = 4/(πd),  x ∈ [0, d/2], θ ∈ [0, π/2]

  The needle intersects a line iff X < (l/2) sin Θ

  The probability of intersection is:

  P{X < (l/2) sin Θ} = ∫∫_{(x,θ): x < (l/2) sin θ} f_{X,Θ}(x, θ) dx dθ

                     = (4/(πd)) ∫_0^{π/2} ∫_0^{(l/2) sin θ} dx dθ

                     = (4/(πd)) ∫_0^{π/2} (l/2) sin θ dθ

                     = 2l/(πd)
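  Buffon's formula is easy to check by simulation. A sketch (not in the notes; d = 2 and l = 1 are arbitrary choices) draws (X, Θ) uniformly from [0, d/2] × [0, π/2] and counts intersections:

```python
import numpy as np

rng = np.random.default_rng(2)
d, l, n = 2.0, 1.0, 1_000_000            # line spacing, needle length (l < d), number of throws

x = rng.uniform(0, d / 2, n)             # distance from needle midpoint to the nearest line
theta = rng.uniform(0, np.pi / 2, n)     # acute angle with the lines

hit = x < (l / 2) * np.sin(theta)        # intersection condition from the slides
print(hit.mean(), 2 * l / (np.pi * d))   # simulated frequency vs 2l/(pi d)
```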


Example: Darts

• Throw a dart on a disk of radius r. Probability on the coordinates (X, Y) is described by a uniform pdf on the disk:

  f_{X,Y}(x, y) = { 1/(πr²),  if x² + y² ≤ r²
                  { 0,         otherwise

  Find the marginal pdfs

• Solution: To find the pdf of Y (same as X), consider

  f_Y(y) = ∫_{−∞}^{∞} f_{X,Y}(x, y) dx = (1/(πr²)) ∫_{x: x²+y² ≤ r²} dx

         = (1/(πr²)) ∫_{−√(r²−y²)}^{√(r²−y²)} dx

         = { (2/(πr²)) √(r² − y²),  if |y| ≤ r
           { 0,                      otherwise
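  A sanity check of this semicircle-shaped marginal (a sketch, not from the notes; r = 1 and the interval (0.2, 0.6] are arbitrary choices): throw uniform darts on the disk by rejection and compare an interval probability computed from the formula with the empirical frequency:

```python
import numpy as np
from scipy.integrate import quad

rng = np.random.default_rng(3)
r, n = 1.0, 2_000_000

# Uniform points on the disk, obtained by rejection from the bounding square
x = rng.uniform(-r, r, n)
y = rng.uniform(-r, r, n)
y = y[x**2 + y**2 <= r**2]

f_Y = lambda t: 2 / (np.pi * r**2) * np.sqrt(r**2 - t**2)   # marginal from the slides
a, b = 0.2, 0.6
print(np.mean((y > a) & (y <= b)))   # empirical P{a < Y <= b}
print(quad(f_Y, a, b)[0])            # same probability from the marginal pdf
```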


Conditional pdf

• Let X and Y be continuous random variables with joint pdf f_{X,Y}(x, y). We define the conditional pdf of Y given X as

  f_{Y|X}(y|x) = f_{X,Y}(x, y) / f_X(x)  for f_X(x) ≠ 0

• Note that, for a fixed X = x, f_{Y|X}(y|x) is a legitimate pdf on Y – it is nonnegative and integrates to 1

• We want the conditional pdf to be interpreted as:

  f_{Y|X}(y|x) Δy ≈ P{y < Y ≤ y + Δy | X = x}

  The RHS can be interpreted as a limit

  P{y < Y ≤ y + Δy | X = x} = lim_{Δx→0} P{y < Y ≤ y + Δy, x < X ≤ x + Δx} / P{x < X ≤ x + Δx}

                            ≈ lim_{Δx→0} [f_{X,Y}(x, y) Δx Δy] / [f_X(x) Δx] = [f_{X,Y}(x, y) / f_X(x)] Δy


• Example: Let

  f(x, y) = { 2,  x, y ≥ 0, x + y ≤ 1
            { 0,  otherwise

  Find f_{X|Y}(x|y).

  Solution: We already know that

  f_Y(y) = { 2(1 − y),  0 ≤ y ≤ 1
           { 0,          otherwise

  Therefore

  f_{X|Y}(x|y) = f_{X,Y}(x, y) / f_Y(y) = { 1/(1 − y),  x, y ≥ 0, x + y ≤ 1, y < 1
                                          { 0,           otherwise

  [Plot: for fixed y, f_{X|Y}(x|y) is uniform on [0, 1 − y] with height 1/(1 − y)]


• Chain rule: f_{X,Y}(x, y) = f_X(x) f_{Y|X}(y|x)


Independence and Bayes Rule

• Independence: Two continuous r.v.s are said to be independent if

  f_{X,Y}(x, y) = f_X(x) f_Y(y)  for all x, y

  It can be shown that this definition is equivalent to saying that X and Y are independent if for any two events A, B ⊂ (−∞, ∞)

  P{X ∈ A, Y ∈ B} = P{X ∈ A} P{Y ∈ B}

• Example: Are X and Y in the previous example independent?

• Bayes rule for densities: Given f_X(x) and f_{Y|X}(y|x), we can find

  f_{X|Y}(x|y) = [f_{Y|X}(y|x) / f_Y(y)] f_X(x)

               = [f_{Y|X}(y|x) / ∫_{−∞}^{∞} f_{X,Y}(u, y) du] f_X(x)

               = [f_{Y|X}(y|x) / ∫_{−∞}^{∞} f_X(u) f_{Y|X}(y|u) du] f_X(x)


• Example: Let Λ ∼ U[0, 1], and let the conditional pdf of X given Λ = λ be f_{X|Λ}(x|λ) = λe^{−λx}, x ≥ 0, for 0 ≤ λ ≤ 1, i.e., X | {Λ = λ} ∼ Exp(λ). Now, given that X = 3, find f_{Λ|X}(λ|3)

  Solution: We use Bayes rule

  f_{Λ|X}(λ|3) = f_{X|Λ}(3|λ) f_Λ(λ) / ∫_0^1 f_Λ(u) f_{X|Λ}(3|u) du

               = { λe^{−3λ} / [(1/9)(1 − 4e^{−3})],  0 ≤ λ ≤ 1
                 { 0,                                 otherwise

  [Plot: the prior f_Λ(λ) is flat at 1 on [0, 1]; the posterior f_{Λ|X}(λ|3) peaks at λ = 1/3 with value ≈ 1.378 and falls to ≈ 0.56 at λ = 1]
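  The normalizing constant (1/9)(1 − 4e⁻³) is just ∫₀¹ λe^{−3λ} dλ, which can be verified numerically; the sketch below (not from the notes) also reproduces the posterior values marked on the plot:

```python
import numpy as np
from scipy.integrate import quad

# Denominator of Bayes rule: integral of the prior times the likelihood at x = 3
Z, _ = quad(lambda lam: lam * np.exp(-3 * lam), 0, 1)
print(Z, (1 - 4 * np.exp(-3)) / 9)       # both ~ 0.0890

post = lambda lam: lam * np.exp(-3 * lam) / Z   # posterior f_{Lambda|X}(lam|3) on [0, 1]
print(post(1/3), post(1.0))              # ~ 1.378 and ~ 0.56, as in the plot
```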


Joint cdf

• If X and Y are two r.v.s over the same experiment, they are completely specified by their joint cdf

  F_{X,Y}(x, y) = P{X ≤ x, Y ≤ y}  for x, y ∈ (−∞, ∞)

  [Figure: F_{X,Y}(x, y) is the probability of the quadrant to the lower-left of the point (x, y)]


• Properties of the joint cdf:

  1. F_{X,Y}(x, y) ≥ 0

  2. F_{X,Y}(x_1, y_1) ≤ F_{X,Y}(x_2, y_2) whenever x_1 ≤ x_2 and y_1 ≤ y_2

  3. lim_{x,y→∞} F_{X,Y}(x, y) = 1

  4. lim_{y→∞} F_{X,Y}(x, y) = F_X(x) and lim_{x→∞} F_{X,Y}(x, y) = F_Y(y)

  5. lim_{y→−∞} F_{X,Y}(x, y) = 0 and lim_{x→−∞} F_{X,Y}(x, y) = 0

  6. The probability of any set can be determined from the joint cdf, for example, for the rectangle (a, b] × (c, d]:

     P{a < X ≤ b, c < Y ≤ d} = F(b, d) − F(a, d) − F(b, c) + F(a, c)


• If X and Y are continuous random variables having a joint pdf f_{X,Y}(x, y), then

  F_{X,Y}(x, y) = ∫_{−∞}^{y} ∫_{−∞}^{x} f_{X,Y}(u, v) du dv  for x, y ∈ (−∞, ∞)

  Moreover, if F_{X,Y}(x, y) is differentiable in both x and y, then

  f_{X,Y}(x, y) = ∂²F(x, y)/(∂x ∂y) = lim_{Δx,Δy→0} P{x < X ≤ x + Δx, y < Y ≤ y + Δy} / (Δx Δy)

• Two random variables are independent if

  F_{X,Y}(x, y) = F_X(x) F_Y(y)  for all x, y


Functions of Two Random Variables

• Let X and Y be two r.v.s with known pdf f_{X,Y}(x, y) and let Z = g(X, Y) be a function of X and Y. We wish to find f_Z(z)

• We use the same procedure as before: first calculate the cdf of Z, then differentiate it to find f_Z(z)

• Example: Max and Min of Independent Random Variables

  Let X ∼ f_X(x) and Y ∼ f_Y(y) be independent, and define

  U = max{X, Y}  and  V = min{X, Y}

  Find the pdfs of U and V

• Solution: To find the pdf of U, we first find its cdf

  F_U(u) = P{U ≤ u} = P{X ≤ u, Y ≤ u} = F_X(u) F_Y(u),  by independence


  Now, to find the pdf, we take the derivative w.r.t. u

  f_U(u) = f_X(u) F_Y(u) + f_Y(u) F_X(u)

  For example, if X and Y are uniformly distributed between 0 and 1,

  f_U(u) = 2u  for 0 ≤ u ≤ 1

  Next, to find the pdf of V, consider

  F_V(v) = P{V ≤ v} = 1 − P{V > v} = 1 − P{X > v, Y > v} = 1 − (1 − F_X(v))(1 − F_Y(v))
         = F_X(v) + F_Y(v) − F_X(v) F_Y(v),

  thus

  f_V(v) = f_X(v) + f_Y(v) − f_X(v) F_Y(v) − f_Y(v) F_X(v)

  For example, if X ∼ Exp(λ_1) and Y ∼ Exp(λ_2), then V ∼ Exp(λ_1 + λ_2)
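  Both special cases are easy to spot-check by simulation (a sketch, not from the notes; λ_1 = 1, λ_2 = 2 are arbitrary): the max of two U[0,1] r.v.s has pdf 2u, hence mean 2/3, and the min of Exp(1) and Exp(2) should behave like Exp(3):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 1_000_000

# Max of two independent U[0,1]: pdf 2u on [0,1], so E[U] = 2/3
U = np.maximum(rng.random(n), rng.random(n))
print(U.mean())                          # ~ 0.667

# Min of Exp(1) and Exp(2): should be Exp(3), so the mean is ~ 1/3
lam1, lam2 = 1.0, 2.0
V = np.minimum(rng.exponential(1 / lam1, n), rng.exponential(1 / lam2, n))
print(V.mean())                          # ~ 0.333
```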


Sum of Independent Random Variables

• Let X and Y be independent r.v.s with known distributions. We wish to find the distribution of their sum W = X + Y

• First assume X ∼ p_X(x) and Y ∼ p_Y(y) are independent integer-valued r.v.s, then for any integer w, the pmf of their sum is

  p_W(w) = P{X + Y = w}
         = Σ_{(x,y): x+y=w} P{X = x, Y = y}
         = Σ_x P{X = x, Y = w − x}
         = Σ_x P{X = x} P{Y = w − x},  by independence
         = Σ_x p_X(x) p_Y(w − x)

  This is the discrete convolution of the two pmfs


  For example, let X ∼ Poisson(λ_1) and Y ∼ Poisson(λ_2) be independent, then the pmf of their sum is

  p_W(w) = Σ_{x=−∞}^{∞} p_X(x) p_Y(w − x)

         = Σ_{x=0}^{w} p_X(x) p_Y(w − x)  for w = 0, 1, . . .

         = Σ_{x=0}^{w} (λ_1^x / x!) e^{−λ_1} (λ_2^{w−x} / (w − x)!) e^{−λ_2}

         = ((λ_1 + λ_2)^w / w!) e^{−(λ_1+λ_2)} Σ_{x=0}^{w} (w choose x) (λ_1/(λ_1 + λ_2))^x (λ_2/(λ_1 + λ_2))^{w−x}

         = ((λ_1 + λ_2)^w / w!) e^{−(λ_1+λ_2)},

  since the last sum is a Binomial(w, λ_1/(λ_1 + λ_2)) pmf summed over its entire support, hence equal to 1

  Thus W = X + Y ∼ Poisson(λ_1 + λ_2)

  In general a Poisson r.v. with parameter λ can be written as the sum of any number of independent Poisson(λ_i) r.v.s, so long as Σ_i λ_i = λ. This property of a distribution is called infinite divisibility
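  The discrete convolution can be checked directly against the Poisson(λ_1 + λ_2) pmf (a sketch, not from the notes; λ_1 = 2 and λ_2 = 3 are arbitrary choices):

```python
import numpy as np
from scipy.stats import poisson

lam1, lam2 = 2.0, 3.0
w = np.arange(0, 30)

# Discrete convolution of the two pmfs, truncated to the range of w
conv = np.convolve(poisson.pmf(w, lam1), poisson.pmf(w, lam2))[:len(w)]

# Maximum deviation from the Poisson(lam1 + lam2) pmf is ~ 0
print(np.max(np.abs(conv - poisson.pmf(w, lam1 + lam2))))
```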


• Now, let's assume that X ∼ f_X(x) and Y ∼ f_Y(y) are independent continuous r.v.s. We wish to find the pdf of their sum W = X + Y. To do so, first note that

  P{W ≤ w | X = x} = P{X + Y ≤ w | X = x}
                   = P{x + Y ≤ w | X = x}
                   = P{x + Y ≤ w},  by independence
                   = F_Y(w − x)

  Thus f_{W|X}(w|x) = f_Y(w − x), a very useful result

  Now, to find the pdf of W, consider

  f_W(w) = ∫_{−∞}^{∞} f_{W,X}(w, x) dx = ∫_{−∞}^{∞} f_X(x) f_{W|X}(w|x) dx = ∫_{−∞}^{∞} f_X(x) f_Y(w − x) dx

  This is the convolution of f_X(x) and f_Y(y)


• Example: Assume that X ∼ U[0, 1] and Y ∼ U[0, 1] are independent r.v.s. Find the pdf of their sum W = X + Y

  Solution: To find the pdf of the sum, we convolve the two pdfs

  f_W(w) = ∫_{−∞}^{∞} f_X(x) f_Y(w − x) dx

         = { ∫_0^w dx = w,             if 0 ≤ w ≤ 1
           { ∫_{w−1}^1 dx = 2 − w,     if 1 < w ≤ 2
           { 0,                         otherwise
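  The triangular shape is visible in a quick simulation (a sketch, not from the notes): from f_W above, P{W ≤ 0.5} = 0.5²/2 = 0.125 and P{W ≤ 1.5} = 1 − 0.5²/2 = 0.875, which match the empirical frequencies:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 1_000_000
W = rng.random(n) + rng.random(n)    # sum of two independent U[0,1] draws

print(np.mean(W <= 0.5))             # ~ 0.125
print(np.mean(W <= 1.5))             # ~ 0.875
```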

• Example: If X ∼ N(μ_1, σ_1²) and Y ∼ N(μ_2, σ_2²) are independent, then their sum W ∼ N(μ_1 + μ_2, σ_1² + σ_2²), i.e., the Gaussian is also an infinitely divisible distribution – any Gaussian r.v. can be written as the sum of any number of independent Gaussians as long as their means sum to its mean and their variances sum to its variance (we will prove this using transforms later)


One Discrete and One Continuous RVs

• Let Θ be a discrete random variable with pmf p_Θ(θ)

• For each Θ = θ such that p_Θ(θ) ≠ 0, let Y be a continuous random variable with conditional pdf f_{Y|Θ}(y|θ)

• The conditional pmf of Θ given Y can be defined as a limit

  p_{Θ|Y}(θ|y) = lim_{Δy→0} P{Θ = θ, y < Y ≤ y + Δy} / P{y < Y ≤ y + Δy}

               = lim_{Δy→0} [p_Θ(θ) f_{Y|Θ}(y|θ) Δy] / [f_Y(y) Δy]

               = [f_{Y|Θ}(y|θ) / f_Y(y)] p_Θ(θ)

• So we obtain yet another version of Bayes rule

  p_{Θ|Y}(θ|y) = [f_{Y|Θ}(y|θ) / Σ_{θ'} p_Θ(θ') f_{Y|Θ}(y|θ')] p_Θ(θ)


Example: Additive Gaussian Noise Channel

• Consider the following communication channel model

  [Diagram: the signal Θ is added to noise Z ∼ N(0, N) to produce the observation Y]

  where the signal sent is

  Θ = { +1,  with probability p
      { −1,  with probability 1 − p,

  the signal received (also called the observation) is Y = Θ + Z, and Θ and Z are independent

  Given Y = y is received (observed), find the a posteriori pmf of Θ, p_{Θ|Y}(θ|y)


• Solution: We use Bayes rule

  p_{Θ|Y}(θ|y) = f_{Y|Θ}(y|θ) p_Θ(θ) / Σ_{θ'} p_Θ(θ') f_{Y|Θ}(y|θ')

  We know p_Θ(θ). To find f_{Y|Θ}(y|θ), consider

  P{Y ≤ y | Θ = 1} = P{Θ + Z ≤ y | Θ = 1}
                   = P{Z ≤ y − Θ | Θ = 1}
                   = P{Z ≤ y − 1 | Θ = 1}
                   = P{Z ≤ y − 1},  by independence of Θ and Z

  Therefore, Y | {Θ = +1} ∼ N(+1, N). Also, Y | {Θ = −1} ∼ N(−1, N)

  Thus

  p_{Θ|Y}(1|y) = (p/√(2πN)) e^{−(y−1)²/2N} / [ (p/√(2πN)) e^{−(y−1)²/2N} + ((1−p)/√(2πN)) e^{−(y+1)²/2N} ]

               = p e^{y/N} / [p e^{y/N} + (1 − p) e^{−y/N}]  for −∞ < y < ∞


• Now, let p = 1/2. Suppose the receiver decides that the signal transmitted is 1 if Y > 0; otherwise it decides that a −1 was sent. What is the probability of decision error?

• Solution: First we plot the conditional pdfs

  [Plot: f_{Y|Θ}(y|−1) and f_{Y|Θ}(y|1), Gaussian densities of variance N centered at −1 and +1, crossing at y = 0]

  This decision rule makes sense, since you decide that the signal transmitted is 1 if f_{Y|Θ}(y|1) > f_{Y|Θ}(y|−1)

  An error occurs if

  Θ = +1 is transmitted and Y ≤ 0, or
  Θ = −1 is transmitted and Y > 0


But if Θ = 1, then Y ≤ 0 iff Z < −1, and if Θ = −1, then Y > 0 iff Z > 1

  Thus the probability of error is

  P{error} = P{Θ = 1, Y ≤ 0 or Θ = −1, Y > 0}
           = P{Θ = 1, Y ≤ 0} + P{Θ = −1, Y > 0}
           = P{Θ = 1} P{Y ≤ 0 | Θ = 1} + P{Θ = −1} P{Y > 0 | Θ = −1}
           = (1/2) P{Z < −1} + (1/2) P{Z > 1}
           = Q(1/√N)
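  The error probability Q(1/√N) can be checked by simulation (a sketch, not from the notes; N = 0.25 is an arbitrary noise variance, and Q is computed with the standard normal survival function scipy.stats.norm.sf):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(6)
N, n = 0.25, 1_000_000                               # noise variance, number of trials

theta = np.where(rng.random(n) < 0.5, 1.0, -1.0)     # equally likely +/-1 signal
y = theta + rng.normal(0.0, np.sqrt(N), n)           # observation Y = Theta + Z
decision = np.where(y > 0, 1.0, -1.0)                # decide +1 iff Y > 0

print(np.mean(decision != theta))                    # empirical error probability
print(norm.sf(1 / np.sqrt(N)))                       # Q(1/sqrt(N))
```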


Summary: Total Probability and Bayes Rule

• Law of total probability:

  events:  P(B) = Σ_i P(A_i ∩ B),  where the A_i partition Ω
  pmf:     p_X(x) = Σ_y p_{X,Y}(x, y)
  pdf:     f_X(x) = ∫ f_{X,Y}(x, y) dy
  mixed:   f_Y(y) = Σ_θ p_Θ(θ) f_{Y|Θ}(y|θ),   p_Θ(θ) = ∫ f_Y(y) p_{Θ|Y}(θ|y) dy

• Bayes rule:

  events:  P(A_j | B) = [P(B | A_j) / Σ_i P(B | A_i) P(A_i)] P(A_j),  where the A_i partition Ω
  pmf:     p_{X|Y}(x|y) = [p_{Y|X}(y|x) / Σ_{x'} p_{Y|X}(y|x') p_X(x')] p_X(x)
  pdf:     f_{X|Y}(x|y) = [f_{Y|X}(y|x) / ∫ f_X(x') f_{Y|X}(y|x') dx'] f_X(x)
  mixed:   p_{Θ|Y}(θ|y) = [f_{Y|Θ}(y|θ) / Σ_{θ'} p_Θ(θ') f_{Y|Θ}(y|θ')] p_Θ(θ),
           f_{Y|Θ}(y|θ) = [p_{Θ|Y}(θ|y) / ∫ f_Y(y') p_{Θ|Y}(θ|y') dy'] f_Y(y)


More Than Two RVs

• Let X1, X2, . . . , Xn be random variables (defined over the same experiment)

• If the r.v.s are discrete then they can be jointly specified by their joint pmf

  p_{X_1,X_2,...,X_n}(x_1, x_2, . . . , x_n),  for all (x_1, x_2, . . . , x_n) ∈ X_1 × X_2 × · · · × X_n

• If the r.v.s are jointly continuous, then they can be specified by the joint pdf

  f_{X_1,X_2,...,X_n}(x_1, x_2, . . . , x_n),  for all (x_1, x_2, . . . , x_n)

• A marginal pdf (or pmf) is the joint pdf (or pmf) for a subset of X_1, . . . , X_n; e.g. for three r.v.s X_1, X_2, X_3, the marginals are f_{X_i}(x_i) and f_{X_i,X_j}(x_i, x_j) for i ≠ j

• The marginals can be obtained from the joint in the usual way, e.g. for the n = 3 example

  f_{X_1,X_2}(x_1, x_2) = ∫_{−∞}^{∞} f_{X_1,X_2,X_3}(x_1, x_2, x_3) dx_3


• Conditional pmfs or pdfs can be defined in the usual way, e.g. the conditional pdf of (X_{k+1}, X_{k+2}, . . . , X_n) given (X_1, X_2, . . . , X_k) is

  f_{X_{k+1},...,X_n | X_1,...,X_k}(x_{k+1}, . . . , x_n | x_1, . . . , x_k) = f_{X_1,...,X_n}(x_1, . . . , x_n) / f_{X_1,...,X_k}(x_1, . . . , x_k)

• Chain rule: We can write

  f_{X_1,...,X_n}(x_1, . . . , x_n) = f_{X_1}(x_1) f_{X_2|X_1}(x_2|x_1) f_{X_3|X_1,X_2}(x_3|x_1, x_2) · · · f_{X_n|X_1,...,X_{n−1}}(x_n|x_1, . . . , x_{n−1})

• In general X_1, X_2, . . . , X_n are completely specified by their joint cdf

  F_{X_1,...,X_n}(x_1, . . . , x_n) = P{X_1 ≤ x_1, . . . , X_n ≤ x_n},  for all (x_1, . . . , x_n)


Independence and Conditional Independence

• Independence is defined in the usual way: X_1, X_2, . . . , X_n are said to be independent iff

  f_{X_1,X_2,...,X_n}(x_1, x_2, . . . , x_n) = ∏_{i=1}^{n} f_{X_i}(x_i)  for all (x_1, x_2, . . . , x_n)

• Important special case, i.i.d. r.v.s: X_1, X_2, . . . , X_n are said to be independent and identically distributed (i.i.d.) if they are independent and have the same marginal, e.g. if we flip a coin n times independently we can generate X_1, X_2, . . . , X_n i.i.d. Bern(p) r.v.s

• The r.v.s X_1 and X_3 are said to be conditionally independent given X_2 iff

  f_{X_1,X_3|X_2}(x_1, x_3|x_2) = f_{X_1|X_2}(x_1|x_2) f_{X_3|X_2}(x_3|x_2)  for all (x_1, x_2, x_3)

• Conditional independence neither implies nor is implied by independence, i.e., X_1 and X_3 being independent given X_2 does not necessarily mean that X_1 and X_3 are independent (or vice versa)


• Example: Series Binary Symmetric Channels:

  [Diagram: two binary symmetric channels in series, X_1 → ⊕(Z_1) → X_2 → ⊕(Z_2) → X_3]

  Here X_1 ∼ Bern(p), Z_1 ∼ Bern(ε_1), and Z_2 ∼ Bern(ε_2) are independent, X_2 = (X_1 + Z_1) mod 2, and X_3 = (X_1 + Z_1 + Z_2) mod 2

  In general X_1 and X_3 are not independent

  X_1 and X_3 are conditionally independent given X_2

  Also X_1 and Z_1 are independent, but not conditionally independent given X_2

• Example: Coin with Random Bias: Consider a coin with random bias P ∼ f_P(p). Flip it n times independently to generate the r.v.s X_1, X_2, . . . , X_n (X_i = 1 if the i-th flip is heads, 0 otherwise)

  The r.v.s X_1, X_2, . . . , X_n are not independent

  However, X_1, X_2, . . . , X_n are conditionally independent given P — in fact, for any P = p, they are i.i.d. Bern(p)
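  A simulation illustrates this last point (a sketch, not from the notes; the bias is taken uniform on [0, 1]): unconditionally the flips are positively dependent, but for a (nearly) fixed bias they behave independently:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 1_000_000

# Random bias P ~ U[0,1], then two flips of the same coin
P = rng.random(n)
X1 = rng.random(n) < P
X2 = rng.random(n) < P

# Unconditionally: P{X1=1, X2=1} = E[P^2] = 1/3 > 1/4 = P{X1=1} P{X2=1}
print(np.mean(X1 & X2), np.mean(X1) * np.mean(X2))

# Conditionally on a nearly fixed bias (P ~ 0.7) the flips look independent
sel = np.abs(P - 0.7) < 0.01
print(np.mean(X1[sel] & X2[sel]), np.mean(X1[sel]) * np.mean(X2[sel]))   # both ~ 0.49
```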
