probtheo4

Upload: yulisarebecca

Post on 02-Jun-2018

TRANSCRIPT

  • 8/10/2019 probtheo4

    1/78

    4. Joint Distributions of Two Random Variables

    4.1 Joint Distributions of Two Discrete Random Variables

Suppose the discrete random variables X and Y have supports S_X and S_Y, respectively.

The joint distribution of X and Y is given by the set {P(X=x, Y=y)} for x ∈ S_X, y ∈ S_Y. This joint distribution satisfies

\[
P(X=x, Y=y) \ge 0 \quad \forall x \in S_X,\ y \in S_Y; \qquad \sum_{x \in S_X} \sum_{y \in S_Y} P(X=x, Y=y) = 1.
\]
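These two conditions are easy to verify mechanically. A minimal Python sketch (the dict layout is our own; the table is the joint distribution used in the examples below, stored with exact fractions to avoid rounding error):

```python
from fractions import Fraction
from itertools import product

S_X = [0, 1]
S_Y = [-1, 0, 1]
third = Fraction(1, 3)

# Joint pmf stored as a dict keyed by (x, y).
joint = {(0, -1): Fraction(0), (0, 0): third, (0, 1): Fraction(0),
         (1, -1): third, (1, 0): Fraction(0), (1, 1): third}

# Condition 1: every probability is non-negative.
all_nonneg = all(joint[x, y] >= 0 for x, y in product(S_X, S_Y))

# Condition 2: the probabilities sum to 1 over the whole support.
total = sum(joint[x, y] for x, y in product(S_X, S_Y))

print(all_nonneg, total)  # True 1
```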


    Joint Distributions of Two Discrete Random Variables

When S_X and S_Y are small sets, the joint distribution of X and Y is usually presented in a table, e.g.

            Y = -1   Y = 0   Y = 1
    X = 0     0       1/3      0
    X = 1    1/3       0      1/3
    2 / 7 8

    http://find/
  • 8/10/2019 probtheo4

    3/78

    4.1.1 Marginal Distributions of Discrete Random Variables

Note that by summing along a row (i.e. summing over all the possible values of Y for a particular realisation of X) we obtain the probability that X takes the appropriate value, i.e.

\[
P(X=x) = \sum_{y \in S_Y} P(X=x, Y=y).
\]

Similarly, by summing along a column (i.e. summing over all the possible values of X for a particular realisation of Y) we obtain the probability that Y takes the appropriate value, i.e.

\[
P(Y=y) = \sum_{x \in S_X} P(X=x, Y=y).
\]
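Row and column sums translate directly into code. A short sketch using the table above (dict layout our own):

```python
from fractions import Fraction

third = Fraction(1, 3)
joint = {(0, -1): Fraction(0), (0, 0): third, (0, 1): Fraction(0),
         (1, -1): third, (1, 0): Fraction(0), (1, 1): third}
S_X, S_Y = [0, 1], [-1, 0, 1]

# Marginal of X: sum each row over y in S_Y.
p_X = {x: sum(joint[x, y] for y in S_Y) for x in S_X}
# Marginal of Y: sum each column over x in S_X.
p_Y = {y: sum(joint[x, y] for x in S_X) for y in S_Y}

print(p_X[0], p_X[1])           # 1/3 2/3
print(p_Y[-1], p_Y[0], p_Y[1])  # 1/3 1/3 1/3
```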


    Marginal Distributions of Discrete Random Variables

The distributions of X and Y obtained in this way are called the marginal distributions of X and Y, respectively.

These marginal distributions satisfy the standard conditions for a discrete distribution, i.e.

\[
P(X=x) \ge 0 \quad \forall x \in S_X; \qquad \sum_{x \in S_X} P(X=x) = 1.
\]


    4.1.2 Independence of Two Discrete Random Variables

Two discrete random variables X and Y are independent if and only if

\[
P(X=x, Y=y) = P(X=x)P(Y=y) \quad \forall x \in S_X,\ y \in S_Y.
\]
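The "if and only if" means the product rule must hold in every cell of the table, so a single failing cell proves dependence. A sketch of the check (our own code, applied to the table above):

```python
from fractions import Fraction

third = Fraction(1, 3)
joint = {(0, -1): Fraction(0), (0, 0): third, (0, 1): Fraction(0),
         (1, -1): third, (1, 0): Fraction(0), (1, 1): third}
S_X, S_Y = [0, 1], [-1, 0, 1]
p_X = {x: sum(joint[x, y] for y in S_Y) for x in S_X}
p_Y = {y: sum(joint[x, y] for x in S_X) for y in S_Y}

# Independent iff P(X=x, Y=y) == P(X=x)P(Y=y) in EVERY cell.
independent = all(joint[x, y] == p_X[x] * p_Y[y] for x in S_X for y in S_Y)

# Here P(X=0, Y=-1) = 0 while P(X=0)P(Y=-1) = 1/9, so the check fails.
print(independent)  # False
```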


4.1.3 The Expected Value of a Function of Two Discrete Random Variables

The expected value of a function, g(X,Y), of two discrete random variables is defined as

\[
E[g(X,Y)] = \sum_{x \in S_X} \sum_{y \in S_Y} g(x, y) P(X=x, Y=y).
\]

In particular, the expected value of X is given by

\[
E[X] = \sum_{x \in S_X} \sum_{y \in S_Y} x\, P(X=x, Y=y).
\]

It should be noted that if we have already calculated the marginal distribution of X, then it is simpler to calculate E[X] using this.
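The double sum is one line of code once the joint pmf is a dict. A sketch (the helper name `expect` is our own):

```python
from fractions import Fraction

third = Fraction(1, 3)
joint = {(0, -1): Fraction(0), (0, 0): third, (0, 1): Fraction(0),
         (1, -1): third, (1, 0): Fraction(0), (1, 1): third}

def expect(g, joint):
    """E[g(X,Y)]: sum g(x, y) * P(X=x, Y=y) over the whole support."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

E_X = expect(lambda x, y: x, joint)       # 2/3, matching the marginal route
E_Y = expect(lambda x, y: y, joint)       # 0
E_XY = expect(lambda x, y: x * y, joint)  # 0
print(E_X, E_Y, E_XY)
```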


    4.1.4 Covariance and the Correlation Coefficient

The covariance between X and Y, Cov(X,Y), is given by

\[
\mathrm{Cov}(X,Y) = E(XY) - E(X)E(Y).
\]

Note that by definition Cov(X,X) = E(X²) − E(X)² = Var(X).

Also, for any constants a, b,

\[
\mathrm{Cov}(X, aY + b) = a\,\mathrm{Cov}(X,Y).
\]

The coefficient of correlation between X and Y is given by

\[
\rho(X,Y) = \frac{\mathrm{Cov}(X,Y)}{\sqrt{\mathrm{Var}(X)\mathrm{Var}(Y)}}.
\]
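These definitions compose directly from the expectation sum above. A sketch (names our own; the joint pmf is the example table):

```python
import math
from fractions import Fraction

third = Fraction(1, 3)
joint = {(0, -1): Fraction(0), (0, 0): third, (0, 1): Fraction(0),
         (1, -1): third, (1, 0): Fraction(0), (1, 1): third}

def expect(g):
    return sum(g(x, y) * p for (x, y), p in joint.items())

E_X, E_Y = expect(lambda x, y: x), expect(lambda x, y: y)
cov = expect(lambda x, y: x * y) - E_X * E_Y
var_X = expect(lambda x, y: x * x) - E_X ** 2
var_Y = expect(lambda x, y: y * y) - E_Y ** 2
rho = cov / math.sqrt(var_X * var_Y)

print(cov, var_X, var_Y, rho)  # 0 2/9 2/3 0.0
```

For this particular table the covariance (and hence the correlation) is zero, which the worked Example 4.1 below derives by hand.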


    Properties of the Correlation Coefficient

1. −1 ≤ ρ(X,Y) ≤ 1.

2. If |ρ(X,Y)| = 1, then there is a deterministic linear relationship between X and Y, Y = aX + b. When ρ(X,Y) = 1, then a > 0. When ρ(X,Y) = −1, then a < 0.

3. For a > 0, ρ(X, aY + b) = ρ(X,Y), i.e. the correlation coefficient is independent of the units a variable is measured in. If a < 0, then ρ(X, aY + b) = −ρ(X,Y).


    Example 4.1

Discrete random variables X and Y have the following joint distribution:

            Y = -1   Y = 0   Y = 1
    X = 0     0       1/3      0
    X = 1    1/3       0      1/3

a) Calculate i) P(X > Y), ii) the marginal distributions of X and Y, iii) the expected values and variances of X and Y, iv) the coefficient of correlation between X and Y.

b) Are X and Y independent?


a) i) To calculate the probability of such an event, we sum over all the cells which correspond to that event. Hence,

\[
P(X > Y) = P(X=0, Y=-1) + P(X=1, Y=-1) + P(X=1, Y=0) = 0 + \tfrac{1}{3} + 0 = \tfrac{1}{3}.
\]

ii) The marginal distribution of X is given by P(X=0) = 1/3, P(X=1) = 2/3. Similarly,

\[
P(Y=-1) = \sum_{x=0}^{1} P(X=x, Y=-1) = 0 + \tfrac{1}{3} = \tfrac{1}{3}
\]
\[
P(Y=0) = \sum_{x=0}^{1} P(X=x, Y=0) = \tfrac{1}{3} + 0 = \tfrac{1}{3}
\]
\[
P(Y=1) = 1 - P(Y=-1) - P(Y=0) = \tfrac{1}{3}.
\]


iii) We can calculate the expected values and variances of X and Y from these marginal distributions.

\[
E[X] = \sum_{x=0}^{1} x P(X=x) = 0 \cdot \tfrac{1}{3} + 1 \cdot \tfrac{2}{3} = \tfrac{2}{3}
\]
\[
E[Y] = \sum_{y=-1}^{1} y P(Y=y) = -1 \cdot \tfrac{1}{3} + 0 \cdot \tfrac{1}{3} + 1 \cdot \tfrac{1}{3} = 0.
\]


To calculate Var(X) and Var(Y), we use the formula

\[
\mathrm{Var}(X) = E(X^2) - E(X)^2.
\]

\[
E[X^2] = \sum_{x=0}^{1} x^2 P(X=x) = 0^2 \cdot \tfrac{1}{3} + 1^2 \cdot \tfrac{2}{3} = \tfrac{2}{3}
\]
\[
E[Y^2] = \sum_{y=-1}^{1} y^2 P(Y=y) = (-1)^2 \cdot \tfrac{1}{3} + 0^2 \cdot \tfrac{1}{3} + 1^2 \cdot \tfrac{1}{3} = \tfrac{2}{3}.
\]

Thus,

\[
\mathrm{Var}(X) = \tfrac{2}{3} - \left(\tfrac{2}{3}\right)^2 = \tfrac{2}{9}, \qquad \mathrm{Var}(Y) = \tfrac{2}{3} - 0^2 = \tfrac{2}{3}.
\]


iv) To calculate the coefficient of correlation, we first calculate the covariance between X and Y. We have

\[
\mathrm{Cov}(X,Y) = E[XY] - E[X]E[Y],
\]

where

\[
E[XY] = \sum_{x=0}^{1} \sum_{y=-1}^{1} xy\, P(X=x, Y=y) = 0 \cdot 0 \cdot \tfrac{1}{3} + 1 \cdot (-1) \cdot \tfrac{1}{3} + 1 \cdot 1 \cdot \tfrac{1}{3} = 0.
\]

It follows that Cov(X,Y) = 0 − (2/3) · 0 = 0, and hence ρ(X,Y) = 0.


Note that if ρ(X,Y) ≠ 0, then we can immediately state that X and Y are dependent.

When ρ(X,Y) = 0, we must check whether X and Y are independent from the definition. X and Y are independent if and only if

\[
P(X=x, Y=y) = P(X=x)P(Y=y) \quad \forall x \in S_X,\ y \in S_Y.
\]

To show that X and Y are dependent it is sufficient to show that this equation does not hold for some pair of values x and y.


We have

\[
P(X=0, Y=-1) = 0 \neq P(X=0)P(Y=-1) = \left(\tfrac{1}{3}\right)^2 = \tfrac{1}{9}.
\]

It follows that X and Y are dependent. It can be seen that X = |Y|.


4.1.5 Conditional Distributions and Conditional Expected Values

Define the conditional distribution of X given Y = y, where y ∈ S_Y, by {P(X=x | Y=y)} for x ∈ S_X, where

\[
P(X=x \mid Y=y) = \frac{P(X=x, Y=y)}{P(Y=y)}.
\]

Analogously, the conditional distribution of Y given X = x, where x ∈ S_X, is given by {P(Y=y | X=x)} for y ∈ S_Y, where

\[
P(Y=y \mid X=x) = \frac{P(X=x, Y=y)}{P(X=x)}.
\]

Note: This definition is analogous to the definition of conditional probability given in Chapter 1.


    Conditional Expected Value

The expected value of the function g(X) given Y = y is denoted E[g(X) | Y=y]. We have

\[
E[g(X) \mid Y=y] = \sum_{x \in S_X} g(x) P(X=x \mid Y=y).
\]

In particular,

\[
E[X \mid Y=y] = \sum_{x \in S_X} x\, P(X=x \mid Y=y).
\]

The expected value of the function g(Y) given X = x is defined analogously.
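Both the conditional pmf and the conditional expectation can be sketched in a few lines (helper names our own; the joint pmf is the table from Example 4.1):

```python
from fractions import Fraction

third = Fraction(1, 3)
joint = {(0, -1): Fraction(0), (0, 0): third, (0, 1): Fraction(0),
         (1, -1): third, (1, 0): Fraction(0), (1, 1): third}
S_X, S_Y = [0, 1], [-1, 0, 1]
p_Y = {y: sum(joint[x, y] for x in S_X) for y in S_Y}

def cond_X_given_Y(y0):
    """The conditional pmf {P(X=x | Y=y0)} as a dict over S_X."""
    return {x: joint[x, y0] / p_Y[y0] for x in S_X}

def E_X_given_Y(y0):
    """E[X | Y=y0] = sum of x * P(X=x | Y=y0) over S_X."""
    return sum(x * p for x, p in cond_X_given_Y(y0).items())

print(cond_X_given_Y(-1))                              # X is 1 with certainty
print(E_X_given_Y(-1), E_X_given_Y(0), E_X_given_Y(1))  # 1 0 1
```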


    Probabilistic Regression

The graph of the probabilistic regression of X on Y is given by a scatter plot of the points {(y, E[X | Y=y])} for y ∈ S_Y. Note that the variable we are conditioning on appears on the x-axis.

Analogously, the graph of the probabilistic regression of Y on X is given by a scatter plot of the points {(x, E[Y | X=x])} for x ∈ S_X.

These graphs illustrate the nature of the (probabilistic) relation between the variables X and Y.


    Probabilistic Regression

The graph of the probabilistic regression of Y on X, E[Y | X=x], may be

1. Monotonically increasing in x. This indicates a positive dependency between X and Y. ρ(X,Y) will be positive.

2. Monotonically decreasing in x. This indicates a negative dependency between X and Y. ρ(X,Y) will be negative.

3. Flat. This indicates that ρ(X,Y) = 0. It does not, however, indicate that X and Y are independent.

4. None of the above (e.g. oscillatory). In this case, X and Y are dependent, but this dependency cannot be described in simple terms such as positive or negative. The correlation coefficient can be 0, positive or negative in such a case (but cannot be −1 or 1).


    Example 4.2

Discrete random variables X and Y have the following joint distribution:

            Y = -1   Y = 0   Y = 1
    X = 0     0       1/3      0
    X = 1    1/3       0      1/3

a) Calculate the conditional distributions of X given i) Y = −1, ii) Y = 0, iii) Y = 1.

b) Calculate the conditional distributions of Y given i) X = 0, ii) X = 1.

c) Draw the graph of the probabilistic regression of i) Y on X, ii) X on Y.


a) i) The distribution of X conditioned on Y = −1 is given by

\[
P(X=x \mid Y=-1) = \frac{P(X=x, Y=-1)}{P(Y=-1)}, \quad x \in S_X = \{0, 1\}.
\]

From Example 4.1, P(Y=−1) = 1/3. Thus

\[
P(X=0 \mid Y=-1) = \frac{P(X=0, Y=-1)}{1/3} = 0, \qquad P(X=1 \mid Y=-1) = \frac{P(X=1, Y=-1)}{1/3} = 1.
\]

Note that since X only takes the values 0 and 1,

\[
P(X=1 \mid Y=-1) = 1 - P(X=0 \mid Y=-1).
\]


ii) Similarly, the distribution of X conditioned on Y = 0 is given by

\[
P(X=0 \mid Y=0) = \frac{P(X=0, Y=0)}{P(Y=0)} = \frac{1/3}{1/3} = 1, \qquad P(X=1 \mid Y=0) = \frac{P(X=1, Y=0)}{P(Y=0)} = \frac{0}{1/3} = 0.
\]

iii) The distribution of X conditioned on Y = 1 is given by

\[
P(X=0 \mid Y=1) = \frac{P(X=0, Y=1)}{P(Y=1)} = \frac{0}{1/3} = 0, \qquad P(X=1 \mid Y=1) = \frac{P(X=1, Y=1)}{P(Y=1)} = \frac{1/3}{1/3} = 1.
\]


b) i) Since P(X=0) = 1/3, the distribution of Y conditioned on X = 0 is given by P(Y=−1 | X=0) = 0, P(Y=0 | X=0) = 1, P(Y=1 | X=0) = 0.

ii) Similarly, the distribution of Y conditioned on X = 1 is given by

\[
P(Y=-1 \mid X=1) = \frac{P(X=1, Y=-1)}{P(X=1)} = \frac{1/3}{2/3} = \tfrac{1}{2}
\]
\[
P(Y=0 \mid X=1) = \frac{P(X=1, Y=0)}{P(X=1)} = \frac{0}{2/3} = 0
\]
\[
P(Y=1 \mid X=1) = 1 - P(Y=-1 \mid X=1) - P(Y=0 \mid X=1) = \tfrac{1}{2}.
\]


c) i) To plot the probabilistic regression of Y on X, we first need to calculate E[Y | X=x], x ∈ S_X = {0, 1}. We have

\[
E[Y \mid X=0] = \sum_{y \in S_Y} y P(Y=y \mid X=0) = -1 \cdot 0 + 0 \cdot 1 + 1 \cdot 0 = 0
\]
\[
E[Y \mid X=1] = \sum_{y \in S_Y} y P(Y=y \mid X=1) = -1 \cdot \tfrac{1}{2} + 0 \cdot 0 + 1 \cdot \tfrac{1}{2} = 0.
\]

Since E[Y | X=x] is constant, it follows that ρ(X,Y) = 0.


[Graph: the probabilistic regression of Y on X. The points (0, 0) and (1, 0) lie on the flat line E[Y | X=x] = 0.]


[Graph: the probabilistic regression of X on Y. The points (−1, 1), (0, 0) and (1, 1) form a V shape: E[X | Y=y] = |y|, an oscillatory regression even though ρ(X,Y) = 0.]


4.2 Joint Distributions of Two Continuous Random Variables

The joint distribution of two continuous random variables can be described by the joint density function f_{X,Y}(x, y).

The joint density function f_{X,Y} satisfies the following two conditions:

\[
f_{X,Y}(x, y) \ge 0, \qquad \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{X,Y}(x, y)\, dx\, dy = 1.
\]

In practice, we integrate over the joint support of the random variables X and Y. This is the set of pairs (x, y) for which f_{X,Y}(x, y) > 0 (see Example 4.4).


    Joint Distributions of Two Continuous Random Variables

The definitions used for continuous random variables are analogous to the ones used for discrete random variables. Summations are replaced by integrations.

Note that the volume under the density function is equal to 1.

The probability that (X,Y) ∈ A is given by

\[
P[(X,Y) \in A] = \iint_A f_{X,Y}(x, y)\, dx\, dy.
\]


4.2.1 Marginal Density Functions for Continuous Random Variables

The marginal density function of X can be obtained by integrating the joint density function with respect to y:

\[
f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\, dy.
\]

Similarly, the marginal density function of Y can be obtained by integrating the joint density function with respect to x:

\[
f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\, dx.
\]

If the joint support of X and Y is not a rectangle, some care should be taken regarding the appropriate form of the joint density function according to the variable we are integrating with respect to (see Example 4.4).


    4.2.2 Independence of Two Continuous Random Variables

These marginal density functions satisfy the standard conditions for the density function of a single random variable.

The continuous random variables X and Y are independent if and only if

\[
f_{X,Y}(x, y) = f_X(x) f_Y(y) \quad \forall x, y.
\]


    4.2.4 Covariance and Coefficient of Correlation

The definitions of the covariance and coefficient of correlation, Cov(X,Y) and ρ(X,Y), are identical to the definitions used for discrete random variables.

The properties of these measures and the interpretation of the coefficient of correlation are unchanged.


    Example 4.3

The pair of random variables X and Y has a uniform joint distribution on the triangle A with vertices (0,0), (2,0) and (2,2), i.e. f_{X,Y}(x, y) = c if (x, y) ∈ A, otherwise f_{X,Y}(x, y) = 0. The density function is illustrated on the following slide.

a) Find the constant c.

b) Calculate P(X > 1, Y > 1).

c) Find the marginal densities of X and Y.

d) Find the expected values and variances of X and Y.

e) Find the coefficient of correlation, ρ(X,Y).

f) Are X and Y independent?


[Figure: the triangle A with vertices (0, 0), (2, 0) and (2, 2) in the (x, y)-plane; f_{X,Y}(x, y) = c inside A and f_{X,Y}(x, y) = 0 outside.]


a) Method 1: The clever way

Note that if the density function is constant on some set A, then the volume under the density function (which must be equal to 1) is simply the height of the density, c, times the area of the set A.

The set A is a triangle with base and height both of length 2. The area of this triangle is thus 0.5 · 2 · 2 = 2.

It follows that 2c = 1 ⇒ c = 0.5.


Method 2: The joint support of X and Y is the triangle A. Hence,

\[
\iint_A f_{X,Y}(x, y)\, dy\, dx = 1.
\]

We must be careful to get the boundaries of integration correct.

Note that in A, x varies from 0 to 2. Fixing x, y can vary from 0 to x. Hence,

\[
\iint_A f_{X,Y}(x, y)\, dy\, dx = \int_{x=0}^{2} \left( \int_{y=0}^{x} c\, dy \right) dx = \int_{x=0}^{2} [cy]_{y=0}^{x}\, dx = \int_{0}^{2} cx\, dx = c\left[\frac{x^2}{2}\right]_{0}^{2} = 2c \ \Rightarrow\ c = 0.5.
\]
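The normalisation can also be cross-checked numerically. A sketch using a midpoint Riemann sum over a grid on [0,2] × [0,2] (the grid size n is our own arbitrary choice):

```python
# Midpoint Riemann sum of f(x, y) = 0.5 over the triangle 0 <= y <= x <= 2.
n = 400            # grid cells per axis
h = 2.0 / n        # cell width
c = 0.5
mass = 0.0
for i in range(n):
    x = (i + 0.5) * h
    for j in range(n):
        y = (j + 0.5) * h
        if y <= x:             # cell midpoint lies inside the triangle A
            mass += c * h * h

print(mass)  # approximately 1, confirming c = 0.5
```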


It is possible to do the integration in the opposite order. Note that in A, y varies from 0 to 2. Fixing y, x can vary from y to 2. Hence,

\[
\iint_A f_{X,Y}(x, y)\, dy\, dx = \int_{y=0}^{2} \int_{x=y}^{2} c\, dx\, dy.
\]

Doing the integration this way obviously leads to the same answer, but the calculation is slightly longer.


b) Method 1: The Clever Way. If (X,Y) has a uniform distribution over the set A, then P[(X,Y) ∈ B], where B ⊆ A, is given by

\[
P[(X,Y) \in B] = \frac{|B|}{|A|},
\]

where |A| is the area of the set A.

    Example 4.3

The area of A was calculated to be 2.

If both X and Y are greater than 1, then (X,Y) must belong to the interior of the triangle with vertices (1,1), (2,1) and (2,2). The area of this triangle is 0.5. Hence,

\[
P(X > 1, Y > 1) = \frac{0.5}{2} = \frac{1}{4}.
\]


Method 2: By integration. When both X and Y are greater than 1, there is only a positive density on the triangle with vertices (1,1), (2,1) and (2,2).

In this triangle, 1 ≤ x ≤ 2. Fixing x, 1 ≤ y ≤ x. Hence,

\[
P(X > 1, Y > 1) = \int_{1}^{2} \int_{1}^{x} f_{X,Y}(x, y)\, dy\, dx.
\]


Thus

\[
P(X > 1, Y > 1) = \int_{1}^{2} \int_{1}^{x} 0.5\, dy\, dx = \int_{1}^{2} [0.5y]_{y=1}^{x}\, dx = \int_{1}^{2} (0.5x - 0.5)\, dx = \left[ \frac{x^2}{4} - \frac{x}{2} \right]_{1}^{2} = (1 - 1) - \left( \tfrac{1}{4} - \tfrac{1}{2} \right) = \tfrac{1}{4}.
\]
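The same midpoint Riemann sum used for the normalisation check confirms this probability numerically (grid size our own choice):

```python
# Midpoint Riemann sum of the density over the event {X > 1, Y > 1}.
n = 400
h = 2.0 / n
prob = 0.0
for i in range(n):
    x = (i + 0.5) * h
    for j in range(n):
        y = (j + 0.5) * h
        if y <= x and x > 1 and y > 1:  # inside A and inside the event
            prob += 0.5 * h * h

print(prob)  # approximately 0.25
```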


c) Note that both X and Y take values in [0, 2]. To get the marginal density function at x, f_X(x), we integrate the joint density function with respect to y, i.e. we fix x and sum over y.

Given x, we should consider for what values of y the joint density is positive. From the graph of the density function, there is a positive density as long as 0 < y < x. Hence,

\[
f_X(x) = \int_{0}^{x} 0.5\, dy = \frac{x}{2}, \quad 0 \le x \le 2.
\]


To get the marginal density function at y, f_Y(y), we integrate the joint density function with respect to x, i.e. we fix y and sum over x.

For a given y, we should consider for what values of x the joint density is positive. From the graph of the density function, there is a positive density as long as y < x < 2. Hence,

\[
f_Y(y) = \int_{y}^{2} 0.5\, dx = 1 - \frac{y}{2}, \quad 0 \le y \le 2.
\]


d) Using the marginal density f_X(x) = x/2,

\[
E(X) = \int_{0}^{2} x f_X(x)\, dx = \int_{0}^{2} \frac{x^2}{2}\, dx = \left[ \frac{x^3}{6} \right]_{0}^{2} = \frac{4}{3}, \qquad E(X^2) = \int_{0}^{2} \frac{x^3}{2}\, dx = \left[ \frac{x^4}{8} \right]_{0}^{2} = 2,
\]

so Var(X) = E(X²) − E(X)² = 2 − 16/9 = 2/9.

Similarly,

\[
E(Y) = \int_{0}^{2} y f_Y(y)\, dy = \int_{0}^{2} 0.5y(2 - y)\, dy = \int_{0}^{2} (y - 0.5y^2)\, dy = \left[ \frac{y^2}{2} - \frac{y^3}{6} \right]_{0}^{2} = \frac{2}{3}
\]
\[
E(Y^2) = \int_{0}^{2} y^2 f_Y(y)\, dy = \int_{0}^{2} 0.5y^2(2 - y)\, dy = \int_{0}^{2} (y^2 - 0.5y^3)\, dy = \left[ \frac{y^3}{3} - \frac{y^4}{8} \right]_{0}^{2} = \frac{2}{3}.
\]

It follows that Var(Y) = E(Y²) − E(Y)² = 2/3 − (2/3)² = 2/9.


e) First, we find the covariance, Cov(X,Y) = E(XY) − E(X)E(Y). We have

\[
E(XY) = \iint_A xy\, f_{X,Y}(x, y)\, dx\, dy = \int_{x=0}^{2} \int_{y=0}^{x} 0.5xy\, dy\, dx = \int_{x=0}^{2} \left[ \frac{xy^2}{4} \right]_{y=0}^{x} dx = \int_{0}^{2} \frac{x^3}{4}\, dx = \left[ \frac{x^4}{16} \right]_{0}^{2} = 1.
\]

It follows that Cov(X,Y) = 1 − (4/3) · (2/3) = 1/9, and hence

\[
\rho(X,Y) = \frac{\mathrm{Cov}(X,Y)}{\sqrt{\mathrm{Var}(X)\mathrm{Var}(Y)}} = \frac{1/9}{\sqrt{(2/9)(2/9)}} = \frac{1}{2}.
\]
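All of the moments in parts d) and e) can be cross-checked in one numerical pass over the triangle, again with a midpoint Riemann sum (grid size our own choice):

```python
n = 500
h = 2.0 / n
E_X = E_Y = E_XY = E_X2 = E_Y2 = 0.0
for i in range(n):
    x = (i + 0.5) * h
    for j in range(n):
        y = (j + 0.5) * h
        if y <= x:                  # inside the triangle A
            w = 0.5 * h * h         # f(x, y) * dA
            E_X += x * w
            E_Y += y * w
            E_XY += x * y * w
            E_X2 += x * x * w
            E_Y2 += y * y * w

cov = E_XY - E_X * E_Y              # close to 1/9
var_X = E_X2 - E_X ** 2             # close to 2/9
var_Y = E_Y2 - E_Y ** 2             # close to 2/9
rho = cov / (var_X * var_Y) ** 0.5  # close to 1/2
print(E_X, E_Y, cov, rho)
```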


f) Without calculating the correlation coefficient, we can show that X and Y are dependent from the definition.

Intuitively, X and Y are dependent, since the values Y can take depend on the realisation of X.

In such cases, it is easiest to show dependence by choosing x and y such that x ∈ S_X and y ∈ S_Y, but (x, y) is not in the joint support, i.e. 0 < x < y < 2. For such a pair, f_{X,Y}(x, y) = 0, while f_X(x)f_Y(y) > 0, so X and Y are dependent.

Conditional Density Functions and Conditional Expectations

The conditional density function of X given Y = y, where y ∈ S_Y, is given by f_{X|Y=y}, where

\[
f_{X|Y=y}(x) = \frac{f_{X,Y}(x, y)}{f_Y(y)}.
\]

The conditional density function of Y given X = x, where x ∈ S_X, is given by f_{Y|X=x}, where

\[
f_{Y|X=x}(y) = \frac{f_{X,Y}(x, y)}{f_X(x)}.
\]

Conditional Expectations

The conditional expectation of g(X) given Y = y, where y ∈ S_Y, is given by E[g(X) | Y=y], where

\[
E[g(X) \mid Y=y] = \int_{-\infty}^{\infty} g(x) f_{X|Y=y}(x)\, dx.
\]

In particular,

\[
E[X \mid Y=y] = \int_{-\infty}^{\infty} x f_{X|Y=y}(x)\, dx.
\]

In practice, we integrate over the interval where f_{X|Y=y} is positive. The conditional expected value of g(Y) given X = x is defined analogously.

Probabilistic Regression for Continuous Random Variables

In the case of discrete random variables, the probabilistic regression of Y on X is given by a scatter plot.

In the case of continuous random variables, the probabilistic regression of Y on X is given by a curve.

The x-coordinate in this plot is given by x, the y-coordinate is given by E[Y | X=x].


The probabilistic regression of X on Y is defined analogously. Since we are conditioning on y, the x-coordinate is given by y. The y-coordinate is given by E(X | Y=y).

A probabilistic regression curve can be interpreted in an analogous way to the interpretation of probabilistic regression for discrete random variables (see Example 4.4).

Example 4.4

For the joint distribution given in Example 4.3:

i) Calculate the conditional density function of X given Y = y. Derive E(X | Y=y).

ii) Calculate the conditional density function of Y given X = x. Derive E(Y | X=x).


i) We have f_Y(y) = 1 − y/2, 0 ≤ y ≤ 2. Thus S_Y = [0, 2). Hence, for y ∈ [0, 2),

\[
f_{X|Y=y}(x) = \frac{f_{X,Y}(x, y)}{f_Y(y)}.
\]

Since the joint support of X and Y is not a rectangle, we must be careful with the form of f_{X,Y}(x, y). Here, we are fixing y; there is a positive joint density if y ≤ x ≤ 2. Hence, for y ≤ x ≤ 2,

\[
f_{X|Y=y}(x) = \frac{1/2}{1 - y/2} = \frac{1}{2 - y}.
\]

Otherwise, f_{X|Y=y}(x) = 0.


We have f_{X|Y=y}(x) = 1/(2 − y) for y ≤ x ≤ 2. Hence,

\[
E[X \mid Y=y] = \int_{y}^{2} x f_{X|Y=y}(x)\, dx = \int_{y}^{2} \frac{x}{2-y}\, dx = \left[ \frac{x^2}{2(2-y)} \right]_{x=y}^{2} = \frac{4 - y^2}{2(2-y)} = \frac{(2-y)(2+y)}{2(2-y)} = 1 + \frac{y}{2}.
\]

Since this regression curve is increasing, it follows that a) X and Y are dependent, b) ρ(X,Y) > 0.
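The closed form E[X | Y=y] = 1 + y/2 is easy to confirm numerically at any fixed y. A sketch using the midpoint rule (the test points are arbitrary):

```python
def E_X_given_Y(y, n=10000):
    """Midpoint-rule approximation of the integral of x/(2 - y) over [y, 2]."""
    h = (2.0 - y) / n
    return sum(((y + (k + 0.5) * h) / (2.0 - y)) * h for k in range(n))

print(E_X_given_Y(0.5))  # close to 1.25 = 1 + 0.5/2
print(E_X_given_Y(1.0))  # close to 1.5  = 1 + 1.0/2
```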


ii) We have f_X(x) = x/2, 0 ≤ x ≤ 2. Thus S_X = (0, 2]. Hence, for x ∈ (0, 2],

\[
f_{Y|X=x}(y) = \frac{f_{X,Y}(x, y)}{f_X(x)}.
\]

Again, we must be careful with the form of f_{X,Y}(x, y). Here, we are fixing x; there is a positive joint density if 0 ≤ y ≤ x. Hence, for 0 ≤ y ≤ x,

\[
f_{Y|X=x}(y) = \frac{1/2}{x/2} = \frac{1}{x}.
\]

Otherwise, f_{Y|X=x}(y) = 0.


We have f_{Y|X=x}(y) = 1/x for 0 ≤ y ≤ x. Hence,

\[
E[Y \mid X=x] = \int_{0}^{x} y f_{Y|X=x}(y)\, dy = \int_{0}^{x} \frac{y}{x}\, dy = \left[ \frac{y^2}{2x} \right]_{y=0}^{x} = \frac{x^2}{2x} = \frac{x}{2}.
\]

Again, since this regression curve is increasing, it follows that a) X and Y are dependent, b) ρ(X,Y) > 0.

4.3 The Multivariate Normal Distribution

Let X = (X₁, X₂, …, X_k) be a vector of k random variables.

Let E[X_i] = μ_i and μ = (μ₁, μ₂, …, μ_k), i.e. μ is the vector of expected values.

Let Σ be a k × k matrix where the entry in row i, column j, Σ_{i,j}, is given by Cov(X_i, X_j).

Σ is called the covariance matrix of the random variables X₁, X₂, …, X_k.

Properties of the Covariance Matrix, Σ

1. Σ is symmetric [since Cov(X_i, X_j) = Cov(X_j, X_i)].

2. Σ_{i,i} = Cov(X_i, X_i) = Var(X_i) = σ_i², i.e. the i-th entry on the leading diagonal is the variance of the i-th random variable.

3. Σ is positive semi-definite, i.e. if y is any 1 × k vector, yΣyᵀ ≥ 0.

4. Dividing each element Σ_{i,j} by σ_i σ_j (where σ_i is the standard deviation of the i-th variable), we obtain the correlation matrix R. ρ_{i,j} (the element in row i, column j of this matrix) gives the coefficient of correlation between X_i and X_j. Note ρ_{i,i} = 1 by definition.
The Form of the Multivariate Normal Distribution

X = (X₁, X₂, …, X_k) has a multivariate normal distribution if and only if the joint density function is given by (in matrix form)

\[
f_{\mathbf{X}}(\mathbf{x}) = (2\pi)^{-k/2} |\Sigma|^{-1/2} \exp\left[ -\tfrac{1}{2} (\mathbf{x} - \boldsymbol{\mu}) \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})^T \right],
\]

where x = (x₁, x₂, …, x_k) and |Σ| is the determinant of the covariance matrix.

If X = (X₁, X₂, …, X_k) has a multivariate normal distribution, then any linear combination of the X_i has a (univariate) normal distribution. In particular, each X_i has a (univariate) normal distribution.

The Bivariate Normal distribution

In this case k = 2, X = [X₁, X₂], μ = [μ₁, μ₂]. The covariance matrix is given by

\[
\Sigma = \begin{pmatrix} \sigma_1^2 & \mathrm{Cov}(X_1,X_2) \\ \mathrm{Cov}(X_1,X_2) & \sigma_2^2 \end{pmatrix} = \begin{pmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{pmatrix},
\]

where σ_i is the standard deviation of X_i and ρ is the coefficient of correlation between X₁ and X₂. The correlation matrix is given by

\[
R = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}.
\]


In this case, the determinant of the covariance matrix is given by

\[
|\Sigma| = \sigma_1^2\sigma_2^2 - \rho^2\sigma_1^2\sigma_2^2 = \sigma_1^2\sigma_2^2(1 - \rho^2).
\]

Hence, |Σ|^{1/2} = σ₁σ₂√(1 − ρ²).


Also,

\[
\Sigma^{-1} = \frac{1}{\sigma_1^2\sigma_2^2(1 - \rho^2)} \begin{pmatrix} \sigma_2^2 & -\rho\sigma_1\sigma_2 \\ -\rho\sigma_1\sigma_2 & \sigma_1^2 \end{pmatrix}.
\]

It follows that

\[
-\tfrac{1}{2}(\mathbf{x} - \boldsymbol{\mu})\Sigma^{-1}(\mathbf{x} - \boldsymbol{\mu})^T = -\tfrac{1}{2}(x_1 - \mu_1,\ x_2 - \mu_2)\,\Sigma^{-1}\begin{pmatrix} x_1 - \mu_1 \\ x_2 - \mu_2 \end{pmatrix}
\]
\[
= -\frac{1}{2(1 - \rho^2)} \left[ \frac{(x_1 - \mu_1)^2}{\sigma_1^2} + \frac{(x_2 - \mu_2)^2}{\sigma_2^2} - \frac{2\rho(x_1 - \mu_1)(x_2 - \mu_2)}{\sigma_1\sigma_2} \right].
\]


Hence, the joint density function of X = [X₁, X₂] is given by

\[
f_{\mathbf{X}}(x_1, x_2) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1 - \rho^2}} \exp\left\{ -\frac{1}{2(1 - \rho^2)} \left[ \frac{(x_1 - \mu_1)^2}{\sigma_1^2} + \frac{(x_2 - \mu_2)^2}{\sigma_2^2} - \frac{2\rho(x_1 - \mu_1)(x_2 - \mu_2)}{\sigma_1\sigma_2} \right] \right\}.
\]


Note that if ρ = 0, then

\[
f_{\mathbf{X}}(x_1, x_2) = \frac{1}{2\pi\sigma_1\sigma_2} \exp\left\{ -\frac{1}{2} \left[ \frac{(x_1 - \mu_1)^2}{\sigma_1^2} + \frac{(x_2 - \mu_2)^2}{\sigma_2^2} \right] \right\} = f_{X_1}(x_1) f_{X_2}(x_2).
\]

Hence, it follows that if [X₁, X₂] has a bivariate normal distribution, then ρ = 0 is a necessary and sufficient condition for X₁ and X₂ to be independent. Note that, in general, in order to prove independence it is not sufficient to show that ρ = 0 (see Example 4.2).
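The factorisation at ρ = 0 can be checked directly from the formulas above. A sketch (function names and test points are our own):

```python
import math

def bvn_pdf(x1, x2, mu1, mu2, s1, s2, rho):
    """Bivariate normal density, in the form derived above."""
    q = ((x1 - mu1) ** 2 / s1 ** 2 + (x2 - mu2) ** 2 / s2 ** 2
         - 2 * rho * (x1 - mu1) * (x2 - mu2) / (s1 * s2))
    norm = 2 * math.pi * s1 * s2 * math.sqrt(1 - rho ** 2)
    return math.exp(-q / (2 * (1 - rho ** 2))) / norm

def norm_pdf(x, mu, s):
    """Univariate normal density."""
    return math.exp(-(x - mu) ** 2 / (2 * s ** 2)) / (s * math.sqrt(2 * math.pi))

# With rho = 0 the joint density equals the product of the two marginals.
joint0 = bvn_pdf(1.0, -0.5, 0.0, 0.0, 1.0, 2.0, 0.0)
prod = norm_pdf(1.0, 0.0, 1.0) * norm_pdf(-0.5, 0.0, 2.0)
print(abs(joint0 - prod) < 1e-12)  # True

# With rho != 0 it does not.
joint6 = bvn_pdf(1.0, -0.5, 0.0, 0.0, 1.0, 2.0, 0.6)
print(abs(joint6 - prod) > 1e-6)   # True
```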

Example 4.5

Suppose the height of males has a normal distribution with mean 170cm and standard deviation 15cm, and the correlation of the height of a father and son is 0.6.

1. Derive the joint distribution of the height of two males chosen at random.

2. Calculate the probability that the average height of two such individuals is greater than 179cm.

3. Assuming the joint distribution of the height of father and son is a bivariate normal distribution, derive the joint distribution of the height of a father and son.

4. Calculate the probability that the average height of a father and son is greater than 179cm.


1. Since the individuals are picked at random, we may assume that the two heights are independent. The univariate density function of the i-th height (i = 1, 2) is

\[
f_{X_i}(x_i) = \frac{1}{\sigma_i\sqrt{2\pi}} \exp\left( -\frac{(x_i - \mu_i)^2}{2\sigma_i^2} \right) = \frac{1}{15\sqrt{2\pi}} \exp\left( -\frac{(x_i - 170)^2}{2 \cdot 15^2} \right).
\]

The joint density function of the two heights is the product of these density functions, i.e.

\[
f_{X_1,X_2}(x_1, x_2) = f_{X_1}(x_1) f_{X_2}(x_2) = \frac{1}{15^2 \cdot 2\pi} \exp\left( -\frac{(x_1 - 170)^2 + (x_2 - 170)^2}{2 \cdot 15^2} \right).
\]


2. Note that the sum of the two heights also has a normal distribution. Let S = X₁ + X₂. Then E[S] = E[X₁] + E[X₂] = 340. Since X₁ and X₂ are independent,

\[
\mathrm{Var}(S) = \mathrm{Var}(X_1) + \mathrm{Var}(X_2) = 2 \cdot 15^2 = 450.
\]

We must calculate

\[
P([X_1 + X_2]/2 > 179) = P(X_1 + X_2 > 358) = P(S > 358) = P\left( Z > \frac{358 - 340}{\sqrt{450}} \right) = P(Z > 0.85) = 0.1977.
\]


3. The joint distribution of the height of a father and his son is given by

\[
f_{\mathbf{X}}(x_1, x_2) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1 - \rho^2}} \exp\left\{ -\frac{1}{2(1 - \rho^2)} \left[ \frac{(x_1 - \mu_1)^2}{\sigma_1^2} + \frac{(x_2 - \mu_2)^2}{\sigma_2^2} - \frac{2\rho(x_1 - \mu_1)(x_2 - \mu_2)}{\sigma_1\sigma_2} \right] \right\}
\]
\[
= \frac{1}{360\pi} \exp\left\{ -\frac{1}{1.28 \cdot 225} \left[ (x_1 - 170)^2 + (x_2 - 170)^2 - 1.2(x_1 - 170)(x_2 - 170) \right] \right\}.
\]


4. If the joint distribution of the height of father and son is a bivariate normal distribution, then the sum of their heights has a normal distribution. In this case, E[S] = E[X₁] + E[X₂] = 340 and Var(S) = Var(X₁) + Var(X₂) + 2Cov(X₁, X₂). We have

\[
\rho = \frac{\mathrm{Cov}(X_1, X_2)}{\sigma_1\sigma_2} \ \Rightarrow\ \mathrm{Cov}(X_1, X_2) = \rho\sigma_1\sigma_2 = 0.6 \cdot 15 \cdot 15 = 135.
\]


It follows that

\[
\mathrm{Var}(S) = \mathrm{Var}(X_1) + \mathrm{Var}(X_2) + 2\mathrm{Cov}(X_1, X_2) = 225 + 225 + 2 \cdot 135 = 720.
\]

By standardising, we obtain

\[
P(S > 358) = P\left( \frac{S - E(S)}{\sigma_S} > \frac{358 - 340}{\sqrt{720}} \right) = P(Z > 0.67) = 0.2514.
\]
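Both tail probabilities in this example (parts 2 and 4) can be reproduced without tables, since Python's math module exposes the standard normal tail via the complementary error function (a sketch; the helper name is our own):

```python
import math

def normal_tail(z):
    """P(Z > z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Part 2: independent heights, Var(S) = 450.
p_independent = normal_tail((358 - 340) / math.sqrt(450))
# Part 4: correlated heights, Var(S) = 720.
p_correlated = normal_tail((358 - 340) / math.sqrt(720))

print(p_independent, p_correlated)
# The positive correlation inflates Var(S), so the tail probability is larger.
```

The small difference from the slide's 0.2514 comes from rounding z to 0.67 before the table lookup.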