probtheo4

Upload: yulisarebecca

Post on 02-Jun-2018

TRANSCRIPT

  • 8/10/2019 probtheo4

    1/78

    4. Joint Distributions of Two Random Variables

    4.1 Joint Distributions of Two Discrete Random Variables

Suppose the discrete random variables X and Y have supports S_X and S_Y, respectively.

The joint distribution of X and Y is given by the set {P(X=x, Y=y)} for x ∈ S_X, y ∈ S_Y. This joint distribution satisfies

\[
P(X=x, Y=y) \ge 0 \quad \forall x \in S_X,\ y \in S_Y; \qquad \sum_{x \in S_X} \sum_{y \in S_Y} P(X=x, Y=y) = 1.
\]
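These two conditions are easy to verify mechanically. A minimal Python sketch (the dict layout is our own; the table is the joint distribution used in the examples below, stored with exact fractions to avoid rounding error):

```python
from fractions import Fraction
from itertools import product

S_X = [0, 1]
S_Y = [-1, 0, 1]
third = Fraction(1, 3)

# Joint pmf stored as a dict keyed by (x, y).
joint = {(0, -1): Fraction(0), (0, 0): third, (0, 1): Fraction(0),
         (1, -1): third, (1, 0): Fraction(0), (1, 1): third}

# Condition 1: every probability is non-negative.
all_nonneg = all(joint[x, y] >= 0 for x, y in product(S_X, S_Y))

# Condition 2: the probabilities sum to 1 over the whole support.
total = sum(joint[x, y] for x, y in product(S_X, S_Y))

print(all_nonneg, total)  # True 1
```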


    Joint Distributions of Two Discrete Random Variables

When S_X and S_Y are small sets, the joint distribution of X and Y is usually presented in a table, e.g.

            Y = -1   Y = 0   Y = 1
    X = 0     0       1/3      0
    X = 1    1/3       0      1/3
    2 / 7 8

    http://find/
  • 8/10/2019 probtheo4

    3/78

    4.1.1 Marginal Distributions of Discrete Random Variables

Note that by summing along a row (i.e. summing over all the possible values of Y for a particular realisation of X) we obtain the probability that X takes the appropriate value, i.e.

\[
P(X=x) = \sum_{y \in S_Y} P(X=x, Y=y).
\]

Similarly, by summing along a column (i.e. summing over all the possible values of X for a particular realisation of Y) we obtain the probability that Y takes the appropriate value, i.e.

\[
P(Y=y) = \sum_{x \in S_X} P(X=x, Y=y).
\]
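Row and column sums translate directly into code. A short sketch using the table above (dict layout our own):

```python
from fractions import Fraction

third = Fraction(1, 3)
joint = {(0, -1): Fraction(0), (0, 0): third, (0, 1): Fraction(0),
         (1, -1): third, (1, 0): Fraction(0), (1, 1): third}
S_X, S_Y = [0, 1], [-1, 0, 1]

# Marginal of X: sum each row over y in S_Y.
p_X = {x: sum(joint[x, y] for y in S_Y) for x in S_X}
# Marginal of Y: sum each column over x in S_X.
p_Y = {y: sum(joint[x, y] for x in S_X) for y in S_Y}

print(p_X[0], p_X[1])           # 1/3 2/3
print(p_Y[-1], p_Y[0], p_Y[1])  # 1/3 1/3 1/3
```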


    Marginal Distributions of Discrete Random Variables

The distributions of X and Y obtained in this way are called the marginal distributions of X and Y, respectively.

These marginal distributions satisfy the standard conditions for a discrete distribution, i.e.

\[
P(X=x) \ge 0 \quad \forall x \in S_X; \qquad \sum_{x \in S_X} P(X=x) = 1.
\]


    4.1.2 Independence of Two Discrete Random Variables

Two discrete random variables X and Y are independent if and only if

\[
P(X=x, Y=y) = P(X=x)P(Y=y) \quad \forall x \in S_X,\ y \in S_Y.
\]
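The "if and only if" means the product rule must hold in every cell of the table, so a single failing cell proves dependence. A sketch of the check (our own code, applied to the table above):

```python
from fractions import Fraction

third = Fraction(1, 3)
joint = {(0, -1): Fraction(0), (0, 0): third, (0, 1): Fraction(0),
         (1, -1): third, (1, 0): Fraction(0), (1, 1): third}
S_X, S_Y = [0, 1], [-1, 0, 1]
p_X = {x: sum(joint[x, y] for y in S_Y) for x in S_X}
p_Y = {y: sum(joint[x, y] for x in S_X) for y in S_Y}

# Independent iff P(X=x, Y=y) == P(X=x)P(Y=y) in EVERY cell.
independent = all(joint[x, y] == p_X[x] * p_Y[y] for x in S_X for y in S_Y)

# Here P(X=0, Y=-1) = 0 while P(X=0)P(Y=-1) = 1/9, so the check fails.
print(independent)  # False
```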


4.1.3 The Expected Value of a Function of Two Discrete Random Variables

The expected value of a function, g(X,Y), of two discrete random variables is defined as

\[
E[g(X,Y)] = \sum_{x \in S_X} \sum_{y \in S_Y} g(x, y) P(X=x, Y=y).
\]

In particular, the expected value of X is given by

\[
E[X] = \sum_{x \in S_X} \sum_{y \in S_Y} x\, P(X=x, Y=y).
\]

It should be noted that if we have already calculated the marginal distribution of X, then it is simpler to calculate E[X] using this.
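The double sum is one line of code once the joint pmf is a dict. A sketch (the helper name `expect` is our own):

```python
from fractions import Fraction

third = Fraction(1, 3)
joint = {(0, -1): Fraction(0), (0, 0): third, (0, 1): Fraction(0),
         (1, -1): third, (1, 0): Fraction(0), (1, 1): third}

def expect(g, joint):
    """E[g(X,Y)]: sum g(x, y) * P(X=x, Y=y) over the whole support."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

E_X = expect(lambda x, y: x, joint)       # 2/3, matching the marginal route
E_Y = expect(lambda x, y: y, joint)       # 0
E_XY = expect(lambda x, y: x * y, joint)  # 0
print(E_X, E_Y, E_XY)
```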


    4.1.4 Covariance and the Correlation Coefficient

The covariance between X and Y, Cov(X,Y), is given by

\[
\mathrm{Cov}(X,Y) = E(XY) - E(X)E(Y).
\]

Note that by definition Cov(X,X) = E(X²) − E(X)² = Var(X).

Also, for any constants a, b,

\[
\mathrm{Cov}(X, aY + b) = a\,\mathrm{Cov}(X,Y).
\]

The coefficient of correlation between X and Y is given by

\[
\rho(X,Y) = \frac{\mathrm{Cov}(X,Y)}{\sqrt{\mathrm{Var}(X)\mathrm{Var}(Y)}}.
\]
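These definitions compose directly from the expectation sum above. A sketch (names our own; the joint pmf is the example table):

```python
import math
from fractions import Fraction

third = Fraction(1, 3)
joint = {(0, -1): Fraction(0), (0, 0): third, (0, 1): Fraction(0),
         (1, -1): third, (1, 0): Fraction(0), (1, 1): third}

def expect(g):
    return sum(g(x, y) * p for (x, y), p in joint.items())

E_X, E_Y = expect(lambda x, y: x), expect(lambda x, y: y)
cov = expect(lambda x, y: x * y) - E_X * E_Y
var_X = expect(lambda x, y: x * x) - E_X ** 2
var_Y = expect(lambda x, y: y * y) - E_Y ** 2
rho = cov / math.sqrt(var_X * var_Y)

print(cov, var_X, var_Y, rho)  # 0 2/9 2/3 0.0
```

For this particular table the covariance (and hence the correlation) is zero, which the worked Example 4.1 below derives by hand.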


    Properties of the Correlation Coefficient

1. −1 ≤ ρ(X,Y) ≤ 1.

2. If |ρ(X,Y)| = 1, then there is a deterministic linear relationship between X and Y, Y = aX + b. When ρ(X,Y) = 1, then a > 0. When ρ(X,Y) = −1, then a < 0.

3. For a > 0, ρ(X, aY + b) = ρ(X,Y), i.e. the correlation coefficient is independent of the units a variable is measured in. If a < 0, then ρ(X, aY + b) = −ρ(X,Y).


    Example 4.1

Discrete random variables X and Y have the following joint distribution:

            Y = -1   Y = 0   Y = 1
    X = 0     0       1/3      0
    X = 1    1/3       0      1/3

a) Calculate i) P(X > Y), ii) the marginal distributions of X and Y, iii) the expected values and variances of X and Y, iv) the coefficient of correlation between X and Y.

b) Are X and Y independent?


a) i) To calculate the probability of such an event, we sum over all the cells which correspond to that event. Hence,

\[
P(X > Y) = P(X=0, Y=-1) + P(X=1, Y=-1) + P(X=1, Y=0) = 0 + \tfrac{1}{3} + 0 = \tfrac{1}{3}.
\]

ii) The marginal distribution of X is given by P(X=0) = 1/3, P(X=1) = 2/3. Similarly,

\[
P(Y=-1) = \sum_{x=0}^{1} P(X=x, Y=-1) = 0 + \tfrac{1}{3} = \tfrac{1}{3}
\]
\[
P(Y=0) = \sum_{x=0}^{1} P(X=x, Y=0) = \tfrac{1}{3} + 0 = \tfrac{1}{3}
\]
\[
P(Y=1) = 1 - P(Y=-1) - P(Y=0) = \tfrac{1}{3}.
\]


iii) We can calculate the expected values and variances of X and Y from these marginal distributions.

\[
E[X] = \sum_{x=0}^{1} x P(X=x) = 0 \cdot \tfrac{1}{3} + 1 \cdot \tfrac{2}{3} = \tfrac{2}{3}
\]
\[
E[Y] = \sum_{y=-1}^{1} y P(Y=y) = -1 \cdot \tfrac{1}{3} + 0 \cdot \tfrac{1}{3} + 1 \cdot \tfrac{1}{3} = 0.
\]


To calculate Var(X) and Var(Y), we use the formula

\[
\mathrm{Var}(X) = E(X^2) - E(X)^2.
\]

\[
E[X^2] = \sum_{x=0}^{1} x^2 P(X=x) = 0^2 \cdot \tfrac{1}{3} + 1^2 \cdot \tfrac{2}{3} = \tfrac{2}{3}
\]
\[
E[Y^2] = \sum_{y=-1}^{1} y^2 P(Y=y) = (-1)^2 \cdot \tfrac{1}{3} + 0^2 \cdot \tfrac{1}{3} + 1^2 \cdot \tfrac{1}{3} = \tfrac{2}{3}.
\]

Thus,

\[
\mathrm{Var}(X) = \tfrac{2}{3} - \left(\tfrac{2}{3}\right)^2 = \tfrac{2}{9}, \qquad \mathrm{Var}(Y) = \tfrac{2}{3} - 0^2 = \tfrac{2}{3}.
\]


iv) To calculate the coefficient of correlation, we first calculate the covariance between X and Y. We have

\[
\mathrm{Cov}(X,Y) = E[XY] - E[X]E[Y],
\]

where

\[
E[XY] = \sum_{x=0}^{1} \sum_{y=-1}^{1} xy\, P(X=x, Y=y) = 0 \cdot 0 \cdot \tfrac{1}{3} + 1 \cdot (-1) \cdot \tfrac{1}{3} + 1 \cdot 1 \cdot \tfrac{1}{3} = 0.
\]

It follows that Cov(X,Y) = 0 − (2/3) · 0 = 0, and hence ρ(X,Y) = 0.


Note that if ρ(X,Y) ≠ 0, then we can immediately state that X and Y are dependent.

When ρ(X,Y) = 0, we must check whether X and Y are independent from the definition. X and Y are independent if and only if

\[
P(X=x, Y=y) = P(X=x)P(Y=y) \quad \forall x \in S_X,\ y \in S_Y.
\]

To show that X and Y are dependent it is sufficient to show that this equation does not hold for some pair of values x and y.


We have

\[
P(X=0, Y=-1) = 0 \neq P(X=0)P(Y=-1) = \left(\tfrac{1}{3}\right)^2 = \tfrac{1}{9}.
\]

It follows that X and Y are dependent. It can be seen that X = |Y|.


4.1.5 Conditional Distributions and Conditional Expected Values

Define the conditional distribution of X given Y = y, where y ∈ S_Y, by {P(X=x | Y=y)} for x ∈ S_X, where

\[
P(X=x \mid Y=y) = \frac{P(X=x, Y=y)}{P(Y=y)}.
\]

Analogously, the conditional distribution of Y given X = x, where x ∈ S_X, is given by {P(Y=y | X=x)} for y ∈ S_Y, where

\[
P(Y=y \mid X=x) = \frac{P(X=x, Y=y)}{P(X=x)}.
\]

Note: This definition is analogous to the definition of conditional probability given in Chapter 1.


    Conditional Expected Value

The expected value of the function g(X) given Y = y is denoted E[g(X) | Y=y]. We have

\[
E[g(X) \mid Y=y] = \sum_{x \in S_X} g(x) P(X=x \mid Y=y).
\]

In particular,

\[
E[X \mid Y=y] = \sum_{x \in S_X} x\, P(X=x \mid Y=y).
\]

The expected value of the function g(Y) given X = x is defined analogously.
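Both the conditional pmf and the conditional expectation can be sketched in a few lines (helper names our own; the joint pmf is the table from Example 4.1):

```python
from fractions import Fraction

third = Fraction(1, 3)
joint = {(0, -1): Fraction(0), (0, 0): third, (0, 1): Fraction(0),
         (1, -1): third, (1, 0): Fraction(0), (1, 1): third}
S_X, S_Y = [0, 1], [-1, 0, 1]
p_Y = {y: sum(joint[x, y] for x in S_X) for y in S_Y}

def cond_X_given_Y(y0):
    """The conditional pmf {P(X=x | Y=y0)} as a dict over S_X."""
    return {x: joint[x, y0] / p_Y[y0] for x in S_X}

def E_X_given_Y(y0):
    """E[X | Y=y0] = sum of x * P(X=x | Y=y0) over S_X."""
    return sum(x * p for x, p in cond_X_given_Y(y0).items())

print(cond_X_given_Y(-1))                              # X is 1 with certainty
print(E_X_given_Y(-1), E_X_given_Y(0), E_X_given_Y(1))  # 1 0 1
```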


    Probabilistic Regression

The graph of the probabilistic regression of X on Y is given by a scatter plot of the points {(y, E[X | Y=y])} for y ∈ S_Y. Note that the variable we are conditioning on appears on the x-axis.

Analogously, the graph of the probabilistic regression of Y on X is given by a scatter plot of the points {(x, E[Y | X=x])} for x ∈ S_X.

These graphs illustrate the nature of the (probabilistic) relation between the variables X and Y.


    Probabilistic Regression

The graph of the probabilistic regression of Y on X, E[Y | X=x], may be

1. Monotonically increasing in x. This indicates a positive dependency between X and Y. ρ(X,Y) will be positive.

2. Monotonically decreasing in x. This indicates a negative dependency between X and Y. ρ(X,Y) will be negative.

3. Flat. This indicates that ρ(X,Y) = 0. It does not, however, indicate that X and Y are independent.

4. None of the above (e.g. oscillatory). In this case, X and Y are dependent, but this dependency cannot be described in simple terms such as positive or negative. The correlation coefficient can be 0, positive or negative in such a case (but cannot be −1 or 1).


    Example 4.2

Discrete random variables X and Y have the following joint distribution:

            Y = -1   Y = 0   Y = 1
    X = 0     0       1/3      0
    X = 1    1/3       0      1/3

a) Calculate the conditional distributions of X given i) Y = −1, ii) Y = 0, iii) Y = 1.

b) Calculate the conditional distributions of Y given i) X = 0, ii) X = 1.

c) Draw the graph of the probabilistic regression of i) Y on X, ii) X on Y.


a) i) The distribution of X conditioned on Y = −1 is given by

\[
P(X=x \mid Y=-1) = \frac{P(X=x, Y=-1)}{P(Y=-1)}, \quad x \in S_X = \{0, 1\}.
\]

From Example 4.1, P(Y=−1) = 1/3. Thus

\[
P(X=0 \mid Y=-1) = \frac{P(X=0, Y=-1)}{1/3} = 0, \qquad P(X=1 \mid Y=-1) = \frac{P(X=1, Y=-1)}{1/3} = 1.
\]

Note that since X only takes the values 0 and 1,

\[
P(X=1 \mid Y=-1) = 1 - P(X=0 \mid Y=-1).
\]


ii) Similarly, the distribution of X conditioned on Y = 0 is given by

\[
P(X=0 \mid Y=0) = \frac{P(X=0, Y=0)}{P(Y=0)} = \frac{1/3}{1/3} = 1, \qquad P(X=1 \mid Y=0) = \frac{P(X=1, Y=0)}{P(Y=0)} = \frac{0}{1/3} = 0.
\]

iii) The distribution of X conditioned on Y = 1 is given by

\[
P(X=0 \mid Y=1) = \frac{P(X=0, Y=1)}{P(Y=1)} = \frac{0}{1/3} = 0, \qquad P(X=1 \mid Y=1) = \frac{P(X=1, Y=1)}{P(Y=1)} = \frac{1/3}{1/3} = 1.
\]


b) i) Since P(X=0) = 1/3, the distribution of Y conditioned on X = 0 is given by P(Y=−1 | X=0) = 0, P(Y=0 | X=0) = 1, P(Y=1 | X=0) = 0.

ii) Similarly, the distribution of Y conditioned on X = 1 is given by

\[
P(Y=-1 \mid X=1) = \frac{P(X=1, Y=-1)}{P(X=1)} = \frac{1/3}{2/3} = \tfrac{1}{2}
\]
\[
P(Y=0 \mid X=1) = \frac{P(X=1, Y=0)}{P(X=1)} = \frac{0}{2/3} = 0
\]
\[
P(Y=1 \mid X=1) = 1 - P(Y=-1 \mid X=1) - P(Y=0 \mid X=1) = \tfrac{1}{2}.
\]


c) i) To plot the probabilistic regression of Y on X, we first need to calculate E[Y | X=x], x ∈ S_X = {0, 1}. We have

\[
E[Y \mid X=0] = \sum_{y \in S_Y} y P(Y=y \mid X=0) = -1 \cdot 0 + 0 \cdot 1 + 1 \cdot 0 = 0
\]
\[
E[Y \mid X=1] = \sum_{y \in S_Y} y P(Y=y \mid X=1) = -1 \cdot \tfrac{1}{2} + 0 \cdot 0 + 1 \cdot \tfrac{1}{2} = 0.
\]

Since E[Y | X=x] is constant, it follows that ρ(X,Y) = 0.


[Graph: the probabilistic regression of Y on X. The points (0, 0) and (1, 0) lie on the flat line E[Y | X=x] = 0.]


[Graph: the probabilistic regression of X on Y. The points (−1, 1), (0, 0) and (1, 1) form a V shape: E[X | Y=y] = |y|, an oscillatory regression even though ρ(X,Y) = 0.]


4.2 Joint Distributions of Two Continuous Random Variables

The joint distribution of two continuous random variables can be described by the joint density function f_{X,Y}(x, y).

The joint density function f_{X,Y} satisfies the following two conditions:

\[
f_{X,Y}(x, y) \ge 0, \qquad \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{X,Y}(x, y)\, dx\, dy = 1.
\]

In practice, we integrate over the joint support of the random variables X and Y. This is the set of pairs (x, y) for which f_{X,Y}(x, y) > 0 (see Example 4.4).


    Joint Distributions of Two Continuous Random Variables

The definitions used for continuous random variables are analogous to the ones used for discrete random variables. Summations are replaced by integrations.

Note that the volume under the density function is equal to 1.

The probability that (X,Y) ∈ A is given by

\[
P[(X,Y) \in A] = \iint_A f_{X,Y}(x, y)\, dx\, dy.
\]


4.2.1 Marginal Density Functions for Continuous Random Variables

The marginal density function of X can be obtained by integrating the joint density function with respect to y:

\[
f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\, dy.
\]

Similarly, the marginal density function of Y can be obtained by integrating the joint density function with respect to x:

\[
f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\, dx.
\]

If the joint support of X and Y is not a rectangle, some care should be taken regarding the appropriate form of the joint density function according to the variable we are integrating with respect to (see Example 4.4).


    4.2.2 Independence of Two Continuous Random Variables

These marginal density functions satisfy the standard conditions for the density function of a single random variable.

The continuous random variables X and Y are independent if and only if

\[
f_{X,Y}(x, y) = f_X(x) f_Y(y) \quad \forall x, y.
\]


    4.2.4 Covariance and Coefficient of Correlation

The definitions of the covariance and coefficient of correlation, Cov(X,Y) and ρ(X,Y), are identical to the definitions used for discrete random variables.

The properties of these measures and the interpretation of the coefficient of correlation are unchanged.


    Example 4.3

The pair of random variables X and Y has a uniform joint distribution on the triangle A with vertices (0,0), (2,0) and (2,2), i.e. f_{X,Y}(x, y) = c if (x, y) ∈ A, otherwise f_{X,Y}(x, y) = 0. The density function is illustrated on the following slide.

a) Find the constant c.

b) Calculate P(X > 1, Y > 1).

c) Find the marginal densities of X and Y.

d) Find the expected values and variances of X and Y.

e) Find the coefficient of correlation, ρ(X,Y).

f) Are X and Y independent?


[Figure: the triangle A with vertices (0, 0), (2, 0) and (2, 2) in the (x, y)-plane; f_{X,Y}(x, y) = c inside A and f_{X,Y}(x, y) = 0 outside.]


a) Method 1: The clever way

Note that if the density function is constant on some set A, then the volume under the density function (which must be equal to 1) is simply the height of the density, c, times the area of the set A.

The set A is a triangle with base and height both of length 2. The area of this triangle is thus 0.5 · 2 · 2 = 2.

It follows that 2c = 1 ⇒ c = 0.5.


Method 2: The joint support of X and Y is the triangle A. Hence,

\[
\iint_A f_{X,Y}(x, y)\, dy\, dx = 1.
\]

We must be careful to get the boundaries of integration correct.

Note that in A, x varies from 0 to 2. Fixing x, y can vary from 0 to x. Hence,

\[
\iint_A f_{X,Y}(x, y)\, dy\, dx = \int_{x=0}^{2} \left( \int_{y=0}^{x} c\, dy \right) dx = \int_{x=0}^{2} [cy]_{y=0}^{x}\, dx = \int_{0}^{2} cx\, dx = c\left[\frac{x^2}{2}\right]_{0}^{2} = 2c \ \Rightarrow\ c = 0.5.
\]
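The normalisation can also be cross-checked numerically. A sketch using a midpoint Riemann sum over a grid on [0,2] × [0,2] (the grid size n is our own arbitrary choice):

```python
# Midpoint Riemann sum of f(x, y) = 0.5 over the triangle 0 <= y <= x <= 2.
n = 400            # grid cells per axis
h = 2.0 / n        # cell width
c = 0.5
mass = 0.0
for i in range(n):
    x = (i + 0.5) * h
    for j in range(n):
        y = (j + 0.5) * h
        if y <= x:             # cell midpoint lies inside the triangle A
            mass += c * h * h

print(mass)  # approximately 1, confirming c = 0.5
```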


It is possible to do the integration in the opposite order. Note that in A, y varies from 0 to 2. Fixing y, x can vary from y to 2. Hence,

\[
\iint_A f_{X,Y}(x, y)\, dy\, dx = \int_{y=0}^{2} \int_{x=y}^{2} c\, dx\, dy.
\]

Doing the integration this way obviously leads to the same answer, but the calculation is slightly longer.


b) Method 1: The Clever Way. If (X,Y) has a uniform distribution over the set A, then P[(X,Y) ∈ B], where B ⊆ A, is given by

\[
P[(X,Y) \in B] = \frac{|B|}{|A|},
\]

where |A| is the area of the set A.

    Example 4.3

The area of A was calculated to be 2.

If both X and Y are greater than 1, then (X,Y) must belong to the interior of the triangle with vertices (1,1), (2,1) and (2,2). The area of this triangle is 0.5. Hence,

\[
P(X > 1, Y > 1) = \frac{0.5}{2} = \frac{1}{4}.
\]


Method 2: By integration. When both X and Y are greater than 1, there is only a positive density on the triangle with vertices (1,1), (2,1) and (2,2).

In this triangle, 1 ≤ x ≤ 2. Fixing x, 1 ≤ y ≤ x. Hence,

\[
P(X > 1, Y > 1) = \int_{1}^{2} \int_{1}^{x} f_{X,Y}(x, y)\, dy\, dx.
\]


Thus

\[
P(X > 1, Y > 1) = \int_{1}^{2} \int_{1}^{x} 0.5\, dy\, dx = \int_{1}^{2} [0.5y]_{y=1}^{x}\, dx = \int_{1}^{2} (0.5x - 0.5)\, dx = \left[ \frac{x^2}{4} - \frac{x}{2} \right]_{1}^{2} = (1 - 1) - \left( \tfrac{1}{4} - \tfrac{1}{2} \right) = \tfrac{1}{4}.
\]
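The same midpoint Riemann sum used for the normalisation check confirms this probability numerically (grid size our own choice):

```python
# Midpoint Riemann sum of the density over the event {X > 1, Y > 1}.
n = 400
h = 2.0 / n
prob = 0.0
for i in range(n):
    x = (i + 0.5) * h
    for j in range(n):
        y = (j + 0.5) * h
        if y <= x and x > 1 and y > 1:  # inside A and inside the event
            prob += 0.5 * h * h

print(prob)  # approximately 0.25
```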


c) Note that both X and Y take values in [0, 2]. To get the marginal density function at x, f_X(x), we integrate the joint density function with respect to y, i.e. we fix x and sum over y.

Given x, we should consider for what values of y the joint density is positive. From the graph of the density function, there is a positive density as long as 0 < y < x. Hence,

\[
f_X(x) = \int_{0}^{x} 0.5\, dy = \frac{x}{2}, \quad 0 \le x \le 2.
\]


To get the marginal density function at y, f_Y(y), we integrate the joint density function with respect to x, i.e. we fix y and sum over x.

For a given y, we should consider for what values of x the joint density is positive. From the graph of the density function, there is a positive density as long as y < x < 2. Hence,

\[
f_Y(y) = \int_{y}^{2} 0.5\, dx = 1 - \frac{y}{2}, \quad 0 \le y \le 2.
\]


d) Using the marginal density f_X(x) = x/2,

\[
E(X) = \int_{0}^{2} x f_X(x)\, dx = \int_{0}^{2} \frac{x^2}{2}\, dx = \left[ \frac{x^3}{6} \right]_{0}^{2} = \frac{4}{3}, \qquad E(X^2) = \int_{0}^{2} \frac{x^3}{2}\, dx = \left[ \frac{x^4}{8} \right]_{0}^{2} = 2,
\]

so Var(X) = E(X²) − E(X)² = 2 − 16/9 = 2/9.

Similarly,

\[
E(Y) = \int_{0}^{2} y f_Y(y)\, dy = \int_{0}^{2} 0.5y(2 - y)\, dy = \int_{0}^{2} (y - 0.5y^2)\, dy = \left[ \frac{y^2}{2} - \frac{y^3}{6} \right]_{0}^{2} = \frac{2}{3}
\]
\[
E(Y^2) = \int_{0}^{2} y^2 f_Y(y)\, dy = \int_{0}^{2} 0.5y^2(2 - y)\, dy = \int_{0}^{2} (y^2 - 0.5y^3)\, dy = \left[ \frac{y^3}{3} - \frac{y^4}{8} \right]_{0}^{2} = \frac{2}{3}.
\]

It follows that Var(Y) = E(Y²) − E(Y)² = 2/3 − (2/3)² = 2/9.


e) First, we find the covariance, Cov(X,Y) = E(XY) − E(X)E(Y). We have

\[
E(XY) = \iint_A xy\, f_{X,Y}(x, y)\, dx\, dy = \int_{x=0}^{2} \int_{y=0}^{x} 0.5xy\, dy\, dx = \int_{x=0}^{2} \left[ \frac{xy^2}{4} \right]_{y=0}^{x} dx = \int_{0}^{2} \frac{x^3}{4}\, dx = \left[ \frac{x^4}{16} \right]_{0}^{2} = 1.
\]

It follows that Cov(X,Y) = 1 − (4/3) · (2/3) = 1/9, and hence

\[
\rho(X,Y) = \frac{\mathrm{Cov}(X,Y)}{\sqrt{\mathrm{Var}(X)\mathrm{Var}(Y)}} = \frac{1/9}{\sqrt{(2/9)(2/9)}} = \frac{1}{2}.
\]
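All of the moments in parts d) and e) can be cross-checked in one numerical pass over the triangle, again with a midpoint Riemann sum (grid size our own choice):

```python
n = 500
h = 2.0 / n
E_X = E_Y = E_XY = E_X2 = E_Y2 = 0.0
for i in range(n):
    x = (i + 0.5) * h
    for j in range(n):
        y = (j + 0.5) * h
        if y <= x:                  # inside the triangle A
            w = 0.5 * h * h         # f(x, y) * dA
            E_X += x * w
            E_Y += y * w
            E_XY += x * y * w
            E_X2 += x * x * w
            E_Y2 += y * y * w

cov = E_XY - E_X * E_Y              # close to 1/9
var_X = E_X2 - E_X ** 2             # close to 2/9
var_Y = E_Y2 - E_Y ** 2             # close to 2/9
rho = cov / (var_X * var_Y) ** 0.5  # close to 1/2
print(E_X, E_Y, cov, rho)
```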


f) Without calculating the correlation coefficient, we can show that X and Y are dependent from the definition.

Intuitively, X and Y are dependent, since the values Y can take depend on the realisation of X.

In such cases, it is easiest to show dependence by choosing x and y such that x ∈ S_X and y ∈ S_Y, but (x, y) is not in the joint support, i.e. 0 < x < y < 2. For such a pair, f_{X,Y}(x, y) = 0, while f_X(x)f_Y(y) > 0, so X and Y are dependent.

Conditional Density Functions and Conditional Expectations

The conditional density function of X given Y = y, where y ∈ S_Y, is given by f_{X|Y=y}, where

\[
f_{X|Y=y}(x) = \frac{f_{X,Y}(x, y)}{f_Y(y)}.
\]

The conditional density function of Y given X = x, where x ∈ S_X, is given by f_{Y|X=x}, where

\[
f_{Y|X=x}(y) = \frac{f_{X,Y}(x, y)}{f_X(x)}.
\]

Conditional Expectations

The conditional expectation of g(X) given Y = y, where y ∈ S_Y, is given by E[g(X) | Y=y], where

\[
E[g(X) \mid Y=y] = \int_{-\infty}^{\infty} g(x) f_{X|Y=y}(x)\, dx.
\]

In particular,

\[
E[X \mid Y=y] = \int_{-\infty}^{\infty} x f_{X|Y=y}(x)\, dx.
\]

In practice, we integrate over the interval where f_{X|Y=y} is positive. The conditional expected value of g(Y) given X = x is defined analogously.

Probabilistic Regression for Continuous Random Variables

In the case of discrete random variables, the probabilistic regression of Y on X is given by a scatter plot.

In the case of continuous random variables, the probabilistic regression of Y on X is given by a curve.

The x-coordinate in this plot is given by x, the y-coordinate is given by E[Y | X=x].


The probabilistic regression of X on Y is defined analogously. Since we are conditioning on y, the x-coordinate is given by y. The y-coordinate is given by E(X | Y=y).

A probabilistic regression curve can be interpreted in an analogous way to the interpretation of probabilistic regression for discrete random variables (see Example 4.4).

Example 4.4

For the joint distribution given in Example 4.3:

i) Calculate the conditional density function of X given Y = y. Derive E(X | Y=y).

ii) Calculate the conditional density function of Y given X = x. Derive E(Y | X=x).


i) We have f_Y(y) = 1 − y/2, 0 ≤ y ≤ 2. Thus S_Y = [0, 2). Hence, for y ∈ [0, 2),

\[
f_{X|Y=y}(x) = \frac{f_{X,Y}(x, y)}{f_Y(y)}.
\]

Since the joint support of X and Y is not a rectangle, we must be careful with the form of f_{X,Y}(x, y). Here, we are fixing y; there is a positive joint density if y ≤ x ≤ 2. Hence, for y ≤ x ≤ 2,

\[
f_{X|Y=y}(x) = \frac{1/2}{1 - y/2} = \frac{1}{2 - y}.
\]

Otherwise, f_{X|Y=y}(x) = 0.


We have f_{X|Y=y}(x) = 1/(2 − y) for y ≤ x ≤ 2. Hence,

\[
E[X \mid Y=y] = \int_{y}^{2} x f_{X|Y=y}(x)\, dx = \int_{y}^{2} \frac{x}{2-y}\, dx = \left[ \frac{x^2}{2(2-y)} \right]_{x=y}^{2} = \frac{4 - y^2}{2(2-y)} = \frac{(2-y)(2+y)}{2(2-y)} = 1 + \frac{y}{2}.
\]

Since this regression curve is increasing, it follows that a) X and Y are dependent, b) ρ(X,Y) > 0.
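The closed form E[X | Y=y] = 1 + y/2 is easy to confirm numerically at any fixed y. A sketch using the midpoint rule (the test points are arbitrary):

```python
def E_X_given_Y(y, n=10000):
    """Midpoint-rule approximation of the integral of x/(2 - y) over [y, 2]."""
    h = (2.0 - y) / n
    return sum(((y + (k + 0.5) * h) / (2.0 - y)) * h for k in range(n))

print(E_X_given_Y(0.5))  # close to 1.25 = 1 + 0.5/2
print(E_X_given_Y(1.0))  # close to 1.5  = 1 + 1.0/2
```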


ii) We have f_X(x) = x/2, 0 ≤ x ≤ 2. Thus S_X = (0, 2]. Hence, for x ∈ (0, 2],

\[
f_{Y|X=x}(y) = \frac{f_{X,Y}(x, y)}{f_X(x)}.
\]

Again, we must be careful with the form of f_{X,Y}(x, y). Here, we are fixing x; there is a positive joint density if 0 ≤ y ≤ x. Hence, for 0 ≤ y ≤ x,

\[
f_{Y|X=x}(y) = \frac{1/2}{x/2} = \frac{1}{x}.
\]

Otherwise, f_{Y|X=x}(y) = 0.


We have f_{Y|X=x}(y) = 1/x for 0 ≤ y ≤ x. Hence,

\[
E[Y \mid X=x] = \int_{0}^{x} y f_{Y|X=x}(y)\, dy = \int_{0}^{x} \frac{y}{x}\, dy = \left[ \frac{y^2}{2x} \right]_{y=0}^{x} = \frac{x^2}{2x} = \frac{x}{2}.
\]

Again, since this regression curve is increasing, it follows that a) X and Y are dependent, b) ρ(X,Y) > 0.

4.3 The Multivariate Normal Distribution

Let X = (X₁, X₂, …, X_k) be a vector of k random variables.

Let E[X_i] = μ_i and μ = (μ₁, μ₂, …, μ_k), i.e. μ is the vector of expected values.

Let Σ be a k × k matrix where the entry in row i, column j, Σ_{i,j}, is given by Cov(X_i, X_j).

Σ is called the covariance matrix of the random variables X₁, X₂, …, X_k.

Properties of the Covariance Matrix, Σ

1. Σ is symmetric [since Cov(X_i, X_j) = Cov(X_j, X_i)].

2. Σ_{i,i} = Cov(X_i, X_i) = Var(X_i) = σ_i², i.e. the i-th entry on the leading diagonal is the variance of the i-th random variable.

3. Σ is positive semi-definite, i.e. if y is any 1 × k vector, yΣyᵀ ≥ 0.

4. Dividing each element Σ_{i,j} by σ_i σ_j (where σ_i is the standard deviation of the i-th variable), we obtain the correlation matrix R. ρ_{i,j} (the element in row i, column j of this matrix) gives the coefficient of correlation between X_i and X_j. Note ρ_{i,i} = 1 by definition.
The Form of the Multivariate Normal Distribution

X = (X₁, X₂, …, X_k) has a multivariate normal distribution if and only if the joint density function is given by (in matrix form)

\[
f_{\mathbf{X}}(\mathbf{x}) = (2\pi)^{-k/2} |\Sigma|^{-1/2} \exp\left[ -\tfrac{1}{2} (\mathbf{x} - \boldsymbol{\mu}) \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})^T \right],
\]

where x = (x₁, x₂, …, x_k) and |Σ| is the determinant of the covariance matrix.

If X = (X₁, X₂, …, X_k) has a multivariate normal distribution, then any linear combination of the X_i has a (univariate) normal distribution. In particular, each X_i has a (univariate) normal distribution.

The Bivariate Normal distribution

In this case k = 2, X = [X₁, X₂], μ = [μ₁, μ₂]. The covariance matrix is given by

\[
\Sigma = \begin{pmatrix} \sigma_1^2 & \mathrm{Cov}(X_1,X_2) \\ \mathrm{Cov}(X_1,X_2) & \sigma_2^2 \end{pmatrix} = \begin{pmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{pmatrix},
\]

where σ_i is the standard deviation of X_i and ρ is the coefficient of correlation between X₁ and X₂. The correlation matrix is given by

\[
R = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}.
\]


In this case, the determinant of the covariance matrix is given by

\[
|\Sigma| = \sigma_1^2\sigma_2^2 - \rho^2\sigma_1^2\sigma_2^2 = \sigma_1^2\sigma_2^2(1 - \rho^2).
\]

Hence, |Σ|^{1/2} = σ₁σ₂√(1 − ρ²).


Also,

\[
\Sigma^{-1} = \frac{1}{\sigma_1^2\sigma_2^2(1 - \rho^2)} \begin{pmatrix} \sigma_2^2 & -\rho\sigma_1\sigma_2 \\ -\rho\sigma_1\sigma_2 & \sigma_1^2 \end{pmatrix}.
\]

It follows that

\[
-\tfrac{1}{2}(\mathbf{x} - \boldsymbol{\mu})\Sigma^{-1}(\mathbf{x} - \boldsymbol{\mu})^T = -\tfrac{1}{2}(x_1 - \mu_1,\ x_2 - \mu_2)\,\Sigma^{-1}\begin{pmatrix} x_1 - \mu_1 \\ x_2 - \mu_2 \end{pmatrix}
\]
\[
= -\frac{1}{2(1 - \rho^2)} \left[ \frac{(x_1 - \mu_1)^2}{\sigma_1^2} + \frac{(x_2 - \mu_2)^2}{\sigma_2^2} - \frac{2\rho(x_1 - \mu_1)(x_2 - \mu_2)}{\sigma_1\sigma_2} \right].
\]


Hence, the joint density function of X = [X₁, X₂] is given by

\[
f_{\mathbf{X}}(x_1, x_2) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1 - \rho^2}} \exp\left\{ -\frac{1}{2(1 - \rho^2)} \left[ \frac{(x_1 - \mu_1)^2}{\sigma_1^2} + \frac{(x_2 - \mu_2)^2}{\sigma_2^2} - \frac{2\rho(x_1 - \mu_1)(x_2 - \mu_2)}{\sigma_1\sigma_2} \right] \right\}.
\]


Note that if ρ = 0, then

\[
f_{\mathbf{X}}(x_1, x_2) = \frac{1}{2\pi\sigma_1\sigma_2} \exp\left\{ -\frac{1}{2} \left[ \frac{(x_1 - \mu_1)^2}{\sigma_1^2} + \frac{(x_2 - \mu_2)^2}{\sigma_2^2} \right] \right\} = f_{X_1}(x_1) f_{X_2}(x_2).
\]

Hence, it follows that if [X₁, X₂] has a bivariate normal distribution, then ρ = 0 is a necessary and sufficient condition for X₁ and X₂ to be independent. Note that, in general, in order to prove independence it is not sufficient to show that ρ = 0 (see Example 4.2).
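The factorisation at ρ = 0 can be checked directly from the formulas above. A sketch (function names and test points are our own):

```python
import math

def bvn_pdf(x1, x2, mu1, mu2, s1, s2, rho):
    """Bivariate normal density, in the form derived above."""
    q = ((x1 - mu1) ** 2 / s1 ** 2 + (x2 - mu2) ** 2 / s2 ** 2
         - 2 * rho * (x1 - mu1) * (x2 - mu2) / (s1 * s2))
    norm = 2 * math.pi * s1 * s2 * math.sqrt(1 - rho ** 2)
    return math.exp(-q / (2 * (1 - rho ** 2))) / norm

def norm_pdf(x, mu, s):
    """Univariate normal density."""
    return math.exp(-(x - mu) ** 2 / (2 * s ** 2)) / (s * math.sqrt(2 * math.pi))

# With rho = 0 the joint density equals the product of the two marginals.
joint0 = bvn_pdf(1.0, -0.5, 0.0, 0.0, 1.0, 2.0, 0.0)
prod = norm_pdf(1.0, 0.0, 1.0) * norm_pdf(-0.5, 0.0, 2.0)
print(abs(joint0 - prod) < 1e-12)  # True

# With rho != 0 it does not.
joint6 = bvn_pdf(1.0, -0.5, 0.0, 0.0, 1.0, 2.0, 0.6)
print(abs(joint6 - prod) > 1e-6)   # True
```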

Example 4.5

Suppose the height of males has a normal distribution with mean 170cm and standard deviation 15cm, and the correlation of the height of a father and son is 0.6.

1. Derive the joint distribution of the height of two males chosen at random.

2. Calculate the probability that the average height of two such individuals is greater than 179cm.

3. Assuming the joint distribution of the height of father and son is a bivariate normal distribution, derive the joint distribution of the height of a father and son.

4. Calculate the probability that the average height of a father and son is greater than 179cm.


1. Since the individuals are picked at random, we may assume that the two heights are independent. The univariate density function of the i-th height (i = 1, 2) is

\[
f_{X_i}(x_i) = \frac{1}{\sigma_i\sqrt{2\pi}} \exp\left( -\frac{(x_i - \mu_i)^2}{2\sigma_i^2} \right) = \frac{1}{15\sqrt{2\pi}} \exp\left( -\frac{(x_i - 170)^2}{2 \cdot 15^2} \right).
\]

The joint density function of the two heights is the product of these density functions, i.e.

\[
f_{X_1,X_2}(x_1, x_2) = f_{X_1}(x_1) f_{X_2}(x_2) = \frac{1}{15^2 \cdot 2\pi} \exp\left( -\frac{(x_1 - 170)^2 + (x_2 - 170)^2}{2 \cdot 15^2} \right).
\]


2. Note that the sum of the two heights also has a normal distribution. Let S = X₁ + X₂. Then E[S] = E[X₁] + E[X₂] = 340. Since X₁ and X₂ are independent,

\[
\mathrm{Var}(S) = \mathrm{Var}(X_1) + \mathrm{Var}(X_2) = 2 \cdot 15^2 = 450.
\]

We must calculate

\[
P([X_1 + X_2]/2 > 179) = P(X_1 + X_2 > 358) = P(S > 358) = P\left( Z > \frac{358 - 340}{\sqrt{450}} \right) = P(Z > 0.85) = 0.1977.
\]


3. The joint distribution of the height of a father and his son is given by

\[
f_{\mathbf{X}}(x_1, x_2) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1 - \rho^2}} \exp\left\{ -\frac{1}{2(1 - \rho^2)} \left[ \frac{(x_1 - \mu_1)^2}{\sigma_1^2} + \frac{(x_2 - \mu_2)^2}{\sigma_2^2} - \frac{2\rho(x_1 - \mu_1)(x_2 - \mu_2)}{\sigma_1\sigma_2} \right] \right\}
\]
\[
= \frac{1}{360\pi} \exp\left\{ -\frac{1}{1.28 \cdot 225} \left[ (x_1 - 170)^2 + (x_2 - 170)^2 - 1.2(x_1 - 170)(x_2 - 170) \right] \right\}.
\]


4. If the joint distribution of the height of father and son is a bivariate normal distribution, then the sum of their heights has a normal distribution. In this case, E[S] = E[X₁] + E[X₂] = 340 and Var(S) = Var(X₁) + Var(X₂) + 2Cov(X₁, X₂). We have

\[
\rho = \frac{\mathrm{Cov}(X_1, X_2)}{\sigma_1\sigma_2} \ \Rightarrow\ \mathrm{Cov}(X_1, X_2) = \rho\sigma_1\sigma_2 = 0.6 \cdot 15 \cdot 15 = 135.
\]


It follows that

\[
\mathrm{Var}(S) = \mathrm{Var}(X_1) + \mathrm{Var}(X_2) + 2\mathrm{Cov}(X_1, X_2) = 225 + 225 + 2 \cdot 135 = 720.
\]

By standardising, we obtain

\[
P(S > 358) = P\left( \frac{S - E(S)}{\sigma_S} > \frac{358 - 340}{\sqrt{720}} \right) = P(Z > 0.67) = 0.2514.
\]
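Both tail probabilities in this example (parts 2 and 4) can be reproduced without tables, since Python's math module exposes the standard normal tail via the complementary error function (a sketch; the helper name is our own):

```python
import math

def normal_tail(z):
    """P(Z > z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Part 2: independent heights, Var(S) = 450.
p_independent = normal_tail((358 - 340) / math.sqrt(450))
# Part 4: correlated heights, Var(S) = 720.
p_correlated = normal_tail((358 - 340) / math.sqrt(720))

print(p_independent, p_correlated)
# The positive correlation inflates Var(S), so the tail probability is larger.
```

The small difference from the slide's 0.2514 comes from rounding z to 0.67 before the table lookup.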