ECE 302: Lecture 5.3 Covariance - Purdue University
TRANSCRIPT
c©Stanley Chan 2020. All Rights Reserved.
ECE 302: Lecture 5.3 Covariance
Prof. Stanley Chan
School of Electrical and Computer Engineering, Purdue University
Joint Expectation
Recall:
E[X] = ∫_Ω x f_X(x) dx.
How about the expectation for two variables?
Definition
Let X and Y be two random variables. The joint expectation is
E[XY] = ∑_{y∈Ω_Y} ∑_{x∈Ω_X} xy · p_{X,Y}(x, y)   (1)
if X and Y are discrete, or
E[XY] = ∫_{Ω_Y} ∫_{Ω_X} xy · f_{X,Y}(x, y) dx dy   (2)
if X and Y are continuous. Joint expectation is also called correlation.
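As a sanity check on Eq. (1), the double sum can be evaluated directly for a small discrete PMF. The PMF below is a made-up illustration, not one from the lecture:

```python
# Compute E[XY] for a small made-up joint PMF p_{X,Y}(x, y)
# on the support {1, 2} x {1, 2}, uniform over the four points.
p = {
    (1, 1): 0.25, (1, 2): 0.25,   # keys are (x, y), values are p_{X,Y}(x, y)
    (2, 1): 0.25, (2, 2): 0.25,
}
assert abs(sum(p.values()) - 1.0) < 1e-12   # valid PMF

# Double sum of Eq. (1): E[XY] = sum_y sum_x x * y * p(x, y)
E_XY = sum(x * y * prob for (x, y), prob in p.items())
print(E_XY)   # (1 + 2 + 2 + 4) / 4 = 2.25
```

Since this particular PMF factorizes, E[XY] = E[X]E[Y] = 1.5 × 1.5 = 2.25, consistent with the independence theorem later in the lecture.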
Outline
Joint PDF and CDF
Joint Expectation
Conditional Distribution
Conditional Expectation
Sum of Two Random Variables
Random Vectors
High-dimensional Gaussians and Transformation
Principal Component Analysis
Today’s lecture
What is joint expectation?
Why define joint expectation as E[XY ]?
Independence
Covariance
Correlation
Covariance
Definition
Let X and Y be two random variables. Then the covariance of X and Y is
Cov(X, Y) = E[(X − µ_X)(Y − µ_Y)],   (3)
where µ_X = E[X] and µ_Y = E[Y].
Remark: Cov(X, X) = E[(X − µ_X)(X − µ_X)] = Var[X].
Covariance
Theorem
Let X and Y be two random variables. Then,
Cov(X, Y) = E[XY] − E[X]E[Y].   (4)
Proof. Just apply the definition of covariance and use linearity of expectation:
Cov(X, Y) = E[(X − µ_X)(Y − µ_Y)]
= E[XY − Xµ_Y − Yµ_X + µ_Xµ_Y]
= E[XY] − µ_Y E[X] − µ_X E[Y] + µ_Xµ_Y
= E[XY] − µ_Xµ_Y.
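A quick numerical sketch of identity (4), assuming NumPy is available: with the 1/N ("population") convention, the covariance matches the plain sample averages exactly, so the two sides agree up to floating-point rounding.

```python
import numpy as np

# Check Cov(X, Y) = E[XY] - E[X]E[Y] on samples, replacing expectations
# with sample means. bias=True makes np.cov use the 1/N convention,
# which is the same estimator as the plain averages below.
rng = np.random.default_rng(0)
x = rng.standard_normal(1000)
y = 0.5 * x + rng.standard_normal(1000)   # correlated with x by construction

lhs = np.cov(x, y, bias=True)[0, 1]              # Cov(X, Y)
rhs = np.mean(x * y) - np.mean(x) * np.mean(y)   # E[XY] - E[X]E[Y]
print(abs(lhs - rhs))   # tiny: the identity holds up to rounding
```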
Properties
Theorem
For any X and Y ,
a. E[X + Y ] = E[X ] + E[Y ].
Proof. Recall the definition of joint expectation:
E[X + Y] = ∑_y ∑_x (x + y) p_{X,Y}(x, y)
= ∑_y ∑_x x p_{X,Y}(x, y) + ∑_y ∑_x y p_{X,Y}(x, y)
= ∑_x x (∑_y p_{X,Y}(x, y)) + ∑_y y (∑_x p_{X,Y}(x, y))
= ∑_x x p_X(x) + ∑_y y p_Y(y) = E[X] + E[Y].
Properties
Theorem
For any X and Y ,
b. Var[X + Y ] = Var[X ] + 2Cov(X ,Y ) + Var[Y ].
Proof.
Var[X + Y] = E[(X + Y)²] − (E[X + Y])² = E[(X + Y)²] − (µ_X + µ_Y)²
= E[X² + 2XY + Y²] − (µ_X² + 2µ_Xµ_Y + µ_Y²)
= (E[X²] − µ_X²) + (E[Y²] − µ_Y²) + 2(E[XY] − µ_Xµ_Y)
= Var[X] + 2Cov(X, Y) + Var[Y].
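Property (b) can also be verified numerically on synthetic data (a made-up example, assuming NumPy): as long as both sides use the same 1/N convention, the identity holds exactly, not just in expectation.

```python
import numpy as np

# Check Var[X + Y] = Var[X] + 2 Cov(X, Y) + Var[Y] on samples.
# np.var defaults to the 1/N convention; bias=True makes np.cov match it.
rng = np.random.default_rng(1)
x = rng.standard_normal(2000)
y = rng.standard_normal(2000) + 0.3 * x   # make X and Y dependent

lhs = np.var(x + y)
rhs = np.var(x) + 2 * np.cov(x, y, bias=True)[0, 1] + np.var(y)
print(abs(lhs - rhs))   # ~0 up to floating-point rounding
```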
Correlation Coefficient
Definition
Let X and Y be two random variables. The correlation coefficient is
ρ = Cov(X, Y) / √(Var[X] Var[Y]).   (5)
ρ is always between −1 and 1, i.e., −1 ≤ ρ ≤ 1. This is due to the cosine-angle definition.
When X = Y (fully correlated), ρ = +1.
When X = −Y (negatively correlated), ρ = −1.
When X and Y are uncorrelated, ρ = 0.
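The two extreme cases can be seen directly with `np.corrcoef`, which returns the matrix of pairwise ρ values (the data here is a made-up illustration):

```python
import numpy as np

# Extreme cases of the correlation coefficient.
rng = np.random.default_rng(2)
x = rng.standard_normal(500)

rho_pos = np.corrcoef(x, x)[0, 1]    # X = Y   -> rho = +1
rho_neg = np.corrcoef(x, -x)[0, 1]   # X = -Y  -> rho = -1
print(rho_pos, rho_neg)
```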
Independence
Theorem
If X and Y are independent, then
E[XY] = E[X]E[Y].   (6)
Proof.
E[XY] = ∑_y ∑_x xy p_{X,Y}(x, y)
= ∑_y ∑_x xy p_X(x) p_Y(y)
= (∑_x x p_X(x)) (∑_y y p_Y(y)) = E[X]E[Y].
Properties
Theorem
Consider the following two statements:
a. X and Y are independent;
b. Cov(X ,Y ) = 0.
It holds that (a) implies (b), but (b) does not imply (a). Thus, independence is a stronger condition than being uncorrelated.
Proof: See ebook.
What is the relationship between independence and being uncorrelated?
Independence ⇒ uncorrelated.
Uncorrelated ⇏ independence.
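One standard counterexample for "(b) does not imply (a)" can be checked by hand; the specific choice below is an illustration, not taken from the slides. Let X be uniform on {−1, 0, 1} and Y = X², so Y is a deterministic function of X (clearly not independent), yet Cov(X, Y) = 0.

```python
# Counterexample: X uniform on {-1, 0, 1}, Y = X^2.
# Then Cov(X, Y) = E[X^3] - E[X] E[X^2] = 0 - 0 * (2/3) = 0,
# even though Y is completely determined by X.
support = [-1, 0, 1]
p = 1 / 3                                    # P(X = x) for each x in the support

E_X  = sum(x * p for x in support)           # E[X]   = 0
E_Y  = sum(x**2 * p for x in support)        # E[X^2] = 2/3
E_XY = sum(x * x**2 * p for x in support)    # E[X^3] = 0
cov = E_XY - E_X * E_Y
print(cov)   # 0.0 -> uncorrelated, yet X and Y are dependent
```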
Computation
Theory:
ρ = (E[XY] − µ_Xµ_Y) / (σ_X σ_Y).
Practice:
ρ̂ = ((1/N) ∑_{n=1}^{N} x_n y_n − x̄ ȳ) / (√((1/N) ∑_{n=1}^{N} (x_n − x̄)²) · √((1/N) ∑_{n=1}^{N} (y_n − ȳ)²)).   (7)
[Figure: three scatter plots of samples with increasing correlation: (a) ρ = −0.0038, (b) ρ = 0.5321, (c) ρ = 0.9656]
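The sample estimate (7) can be checked against `np.corrcoef` on synthetic data (made up here, assuming NumPy); the 1/N factors cancel between numerator and denominator, so the two agree up to floating point.

```python
import numpy as np

# Sample correlation coefficient exactly as written in Eq. (7),
# compared against NumPy's built-in estimator.
rng = np.random.default_rng(3)
x = rng.standard_normal(1000)
y = 0.8 * x + 0.2 * rng.standard_normal(1000)   # strongly correlated pair

num = np.mean(x * y) - np.mean(x) * np.mean(y)                 # (1/N) sum x_n y_n - xbar ybar
den = (np.sqrt(np.mean((x - np.mean(x)) ** 2))
       * np.sqrt(np.mean((y - np.mean(y)) ** 2)))              # product of 1/N standard deviations
rho_hat = num / den

print(rho_hat, np.corrcoef(x, y)[0, 1])   # the two values match
```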
Summary
Covariance is:
Cov(X, Y) = E[(X − µ_X)(Y − µ_Y)].   (8)
Correlation coefficient is
ρ = Cov(X, Y) / √(Var[X] Var[Y]).   (9)
What is the relationship between independence and being uncorrelated?
Independence ⇒ uncorrelated.
Uncorrelated ⇏ independence.
Questions?