lect6a

Upload: shraddha-saran

Post on 03-Jun-2018

219 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/11/2019 lect6a

    1/50

    1

    6. Mean, Variance, Moments and

    Characteristic Functions

    For a r.v X , its p.d.f represents complete information

    about it, and for any Borel set B on the x-axis

     Note that represents very detailed information, and

    quite often it is desirable to characterize the r.v in terms ofits average behavior. In this context, we will introduce two

     parameters - mean and variance - that are universally used

    to represent the overall properties of the r.v and its p.d.f.

    ( )   ∫=∈  B   X    dx x f  B X  P  .)()(ξ (6-1)

    )( x f  X 

    )( x f  X 

    PILLAI

  • 8/11/2019 lect6a

    2/50

    2

    Mean or the Expected Value of a r.v X is defined as

    If X is a discrete-type r.v, then using (3-25) we get

    Mean represents the average (mean) value of the r.v in avery large number of trials. For example if ∼ then

    using (3-31) ,

    is the midpoint of the interval (a,b).

    ∫  ∞+

    ∞−===

     

    .)()(   dx x f  x X  E  X   X  X η (6-2)

    .)( 

    )()()(1

    ∑∑∑   ∫∫   ∑

    ===

    −=−===

    i

    ii

    i

    ii

    iiii

    iii X 

     x X  P  x p x

    dx x x p xdx x x p x X  E  X      δδη

    (6-3)

    ),,(  baU  X 

    (6-4)∫  +

    =−

    −=

    −=

    −=

      b

    a

    b

    a

    ba

    ab

    ab x

    abdx

    ab

     x X  E 

    2)(22

    1)(

    222

    PILLAI

  • 8/11/2019 lect6a

    3/50

    3

    On the other hand if X is exponential with parameter as in

    (3-32), then

    implying that the parameter in (3-32) represents the meanvalue of the exponential r.v.

    Similarly if X is Poisson with parameter as in (3-45),

    using (6-3), we get

    Thus the parameter in (3-45) also represents the mean of

    the Poisson r.v.

    ∫∫  ∞ −−∞ ===

     

    0

    /

    0,)(   λλ

    λλ dy yedxe

     x X  E    y x (6-5)

    λ

    λ

    λ

    .!)!1(

     

    !!)()(

    01

    100

    λλλλλ

    λλ

    λλλλ

    λλ

    ===−

    =

    ====

    −∞

    =

    −∞

    =

    =

    −∞

    =

    −∞

    =

    ∑∑

    ∑∑∑

    eei

    ek 

    e

    k k e

    k kek  X kP  X  E 

    i

    i

    (6-6)

    λ

    PILLAI

  • 8/11/2019 lect6a

    4/50

    4

    In a similar manner, if X is binomial as in (3-44), then its

    mean is given by

    Thus np represents the mean of the binomial r.v in (3-44).For the normal r.v in (3-29),

    .)(!)!1(

    )!1(

    )!1()!(

    !

     

    !)!(

    !)()(

    111

    01

    100

    npq pnpq piin

    n

    npq pk k n

    n

    q pk k n

    nk q p

    nk k  X kP  X  E 

    ninin

    i

    k nk n

    k nk n

    k nk n

    n

    =+=−−

    =−−=

    −=

     

      

     ===

    −−−−

    =

    =

    =

    ==

    ∑∑

    ∑∑∑

    (6-7)

    .2

    1

    2

    )(2

    1

    2

    1)(

    1

     2/

    2

    0

     2/

    2

     2/

    2

     2/)(

    2

    2222

    2222

    µπσ

    µπσ

    µπσπσ

    σσ

    σσµ

    =⋅+=

    +==

    ∫∫

    ∫∫∞+

    ∞−

    −∞+

    ∞−

    ∞+

    ∞−

    −∞+

    ∞−

    −−

         

       

    dyedy ye

    dye ydx xe X  E 

     y y

     y x

    (6-8)

    PILLAI

  • 8/11/2019 lect6a

    5/50

    5

    Thus the first parameter in ∼ is infact the mean of

    the Gaussian r.v X . Given ∼ suppose defines a

    new r.v with p.d.f Then from the previous discussion,

    the new r.v Y has a mean given by (see (6-2))

    From (6-9), it appears that to determine we need to

    determine However this is not the case if only isthe quantity of interest. Recall that for any y,

    where represent the multiple solutions of the equation

    But(6-10) can be rewritten as

    ),(  2σµ N  X 

    ),(   x f  X   X  )( X  g Y  =).( y f Y 

    Y µ

    ∫  ∞+

    ∞−== 

    .)()(   dy y f  yY  E  Y Y µ (6-9)

    ),(Y  E 

    ).( y f Y  )(Y  E 0>∆ y

    ( ) ( ) ,∑   ∆+≤

  • 8/11/2019 lect6a

    6/50

    6

    where the terms form nonoverlapping intervals.Hence

    and hence as ∆ y covers the entire y-axis, the corresponding

    ∆ x’s are nonoverlapping, and they cover the entire x-axis.Hence, in the limit as integrating both sides of (6-12), we get the useful formula

    In the discrete case, (6-13) reduces to

    From (6-13)-(6-14), is not required to evaluate

    for We can use (6-14) to determine the mean of where X is a Poisson r.v. Using (3-45)

    ( )iii   x x x   ∆+ ,

    ,)()()()( ii

    i X ii

    i

    i X Y    x x f  x g  x x f  y y y f  y   ∆=∆=∆   ∑∑ (6-12)

    ( ) ∫∫  ∞+

    ∞−

    ∞+

    ∞− === 

    .)()()()()(   dx x f  x g dy y f  y X  g  E Y  E   X Y  (6-13)

    ).()()( ii

    i   x X  P  x g Y  E    == ∑ (6-14))( y f Y 

    ,2 X Y  =

    )(Y  E 

    ).( X  g Y  =

    ,0→∆ y

    PILLAI

  • 8/11/2019 lect6a

    7/50

    7

    ( )

    ( ) . !)!1(

     

    !!! 

    !)1(

    )!1( 

    !!)(

    2

    0

    1

    1

    10 0

    0

    1

    1

    1

    2

    0

    2

    0

    22

    λλλλ

    λλ

    λλ

    λλλλλ

    λλ

    λλ

    λλλ

    λλλλ

    λλλ

    λλ

    λλ

    +=+= 

     

     

     +=

     

     

     

     +

    −=

      

       +=

      

       +=

    +=−

    =

    ====

    =

    +−

    =

    =

    =

    =

    =

    +−

    =

    =

    −∞

    =

    −∞

    =

    ∑∑

    ∑∑ ∑

    ∑∑

    ∑∑∑

    eee

    em

    eei

    e

    ei

    ieii

    ie

    iie

    k k e

    k k e

    k ek k  X  P k  X  E 

    m

    m

    i

    i

    i

    i

    i i

    ii

    i

    i

    k k 

    (6-15)

    In general, is known as the k th moment of r.v X .

    Thus if ∼ its second moment is given by (6-15).,)(  λ P  X 

    ( )k  X  E 

    PILLAI

  • 8/11/2019 lect6a

    8/50

    8

    Mean alone will not be able to truly represent the p.d.f of

    any r.v. To illustrate this, consider the following scenario:

    Consider two Gaussian r.vs ∼ and ∼

    Both of them have the same mean However, as

    Fig. 6.1 shows, their p.d.fs are quite different. One is moreconcentrated around the mean, whereas the other one

    has a wider spread. Clearly, we need atleast an additional

     parameter to measure this spread around the mean!

    (0,1) 1   N  X  (0,10). 2   N  X 

    .0=µ

    Fig.6.1

    )( 11  x f  X 

    1 x

    1

    2

    =σ(a)

    )( 22  x f  X 

    2 x

    102 =σ(b)

    )( 2 X 

    PILLAI

  • 8/11/2019 lect6a

    9/50

    9

    For a r.v X with mean represents the deviation of

    the r.v from its mean. Since this deviation can be either

     positive or negative, consider the quantity and itsaverage value represents the average mean

    square deviation of X around its mean. Define

    With and using (6-13) we get

    is known as the variance of the r.v X , and its squareroot is known as the standard deviation of

     X . Note that the standard deviation represents the root mean

    square spread of the r.v X around its mean

    µµ − X  ,

    ( ) ,2µ− X 

    ( ) ][ 2µ− X  E 

    ( ) .0][ 22 >−= µσ   X  E  X 

    (6-16)

    2)()(   µ−=   X  X  g 

    .0)()( 

    22 >−= ∫  ∞+

    ∞−dx x f  x  X  X  µσ

    (6-17)

    2

     X 

    σ2)(   µσ −=   X  E  X 

    PILLAI

  • 8/11/2019 lect6a

    10/50

    10

    Expanding (6-17) and using the linearity of the integrals, we

    get

    Alternatively, we can use (6-18) to computeThus , for example, returning back to the Poisson r.v in (3-

    45), using (6-6) and (6-15), we get

    Thus for a Poisson r.v, mean and variance are both equal

    to its parameter

    ( )

    ( ) ( )   [ ] .)( 

    )(2)( 

    )(2)(

    22 ___ 

    2222

    2

     

    2

     222

     X  X  X  E  X  E  X  E 

    dx x f  xdx x f  x

    dx x f  x x X Var 

     X  X 

     X  X 

    −=−=−=

    +−=

    +−==

    ∫∫

    ∫∞+

    ∞−

    ∞+

    ∞−

    ∞+

    ∞−

    µ

    µµ

    µµσ

    (6-18)

    .2 X 

    σ

    ( ) .22 ___ 

    22 2 λλλλσ =−+=−=   X  X  X 

    (6-19)

    .λPILLAI

    T d t i th i f th l )( 2N

  • 8/11/2019 lect6a

    11/50

    11

    To determine the variance of the normal r.v we

    can use (6-16). Thus from (3-29)

    To simplify (6-20), we can make use of the identity

    for a normal p.d.f. This gives

    Differentiating both sides of (6-21) with respect to we

    get

    or

    ( ) .21])[()(

     2/)(

    222

    22

    ∫  ∞+

    ∞−−−−=−=   dxe x X  E  X Var    x   σµπσµµ (6-20)

    ),,( 2σµ N 

    ∫∫   ∞+∞−−−∞+

    ∞−==   2/)(

    2

     1

    2

    1)(22

    dxedx x f    x X σµ

    πσ

    ∫  ∞+

    ∞−

    −− = 

    2/)( .222

    σπσµ dxe   x (6-21)

    ∫  ∞+

    ∞−

    −− =−

     

    2/)(

    3

    2

    2)( 22

    πσ

    µ   σµ dxe x   x

    ( ) ,2

    1 2 

    2/)(

    2

    2 22 σπσ

    µ   σµ =−∫  ∞+

    ∞−

    −− dxe x   x (6-22)

    PILLAI

  • 8/11/2019 lect6a

    12/50

    12

    which represents the in (6-20). Thus for a normal r.v

    as in (3-29)

    and the second parameter in infact represents the

    variance of the Gaussian r.v. As Fig. 6.1 shows the larger

    the the larger the spread of the p.d.f around its mean.

    Thus as the variance of a r.v tends to zero, it will begin to

    concentrate more and more around the mean ultimately

     behaving like a constant.

    Moments: As remarked earlier, in general

    are known as the moments of the r.v X , and

    ),( 2σµ N 

    )( X Var 

    2)(   σ= X Var  (6-23)

    1 ),( ___ 

    ≥==   n X  E  X m   nnn(6-24)

    PILLAI

  • 8/11/2019 lect6a

    13/50

    13

    ])[(   nn   X  E    µµ −= (6-25)

    are known as the central moments of X . Clearly, the

    mean and the variance It is easy to relate

    and Infact

    In general, the quantities

    are known as the generalized moments of X about a, and

    are known as the absolute moments of X .

    ,1m=µ .22 µσ =   nm

    .nµ

    ( ) .)( )( 

    )(])[(

    00

    0

    k n

    n

    k nk n

    k nk 

    n

    nn

    m

    n X  E 

    n

     X k n E  X  E 

    =

    =

    =

     

     

     

     =−

     

     

     

     =

      

       −

      

      =−=

    ∑∑

    µµ

    µµµ

    (6-26)

    ])[(  n

    a X  E    − (6-27)

    ]|[|   n X  E  (6-28)

    PILLAI

  • 8/11/2019 lect6a

    14/50

    14

    For example, if ∼ then it can be shown that

    Direct use of (6-2), (6-13) or (6-14) is often a tedious procedure to compute the mean and variance, and in this

    context, the notion of the characteristic function can be

    quite helpful.

    Characteristic Function

    The characteristic function of a r.v X is defined as

    −⋅=

    even. ,)1(31

    odd, ,0)(

    nn

    n X  E 

    n

    n

    σ

    +=−⋅=

    + odd. ),12(,/2!2

    even, ,)1(31)|(|

    12 k nk 

    nn X  E 

    k k 

    nn

    πσσ

    (6-29)

    (6-30)

    ),,0(  2σ N  X 

    PILLAI

  • 8/11/2019 lect6a

    15/50

    15

    Thus and for all

    For discrete r.vs the characteristic function reduces to

    Thus for example, if ∼ as in (3-45), then its

    characteristic function is given by

    Similarly, if X is a binomial r.v as in (3-44), itscharacteristic function is given by

    ( ) ∫+∞

    ∞−==Φ .)()(   dx x f ee E   X 

     jx jX 

     X 

    ωωω (6-31)

    ,1)0(   =Φ  X  1)(   ≤Φ ω X  .ω

    ∑   ==Φk 

     jk 

     X    k  X  P e ).()(  ωω (6-32)

    )(  λ P  X 

    .!

    )(

    !)( )1(

    0 0

    −−∞

    =

    =

    −− ====Φ   ∑ ∑  ωω λλλ

    ωλλω   λλω

      j j ee

    k k 

    k  jk  jk 

     X    eeek 

    ee

    k ee (6-33)

    .)()()(00

    n j

    n

    k nk  j

    n

    k nk  jk  X    q peq pe

    k nq p

    k ne   +=

      

      =

      

      =Φ   ∑∑ =

    =

    −   ωωωω (6-34)

    PILLAI

  • 8/11/2019 lect6a

    16/50

    16

    To illustrate the usefulness of the characteristic function of a

    r.v in computing its moments, first it is necessary to derive

    the relationship between them. Towards this, from (6-31)

    Taking the first derivative of (6-35) with respect to ω , and

    letting it to be equal to zero, we get

    Similarly, the second derivative of (6-35) gives

    ( )

    .!

    )(

    !2

    )()(1 

    !

    )(

    !

    )()(

    22

    2

    00

      +++++=

    =

    ==Φ   ∑∑

      ∞

    =

    =

    k k 

    k k 

    k  jX 

     X 

     X  E  j

     X  E  j X  jE 

     X  E  j

     X  j E e E 

    ωωω

    ωω

    ω   ω

    (6-35)

    .)(1

    )(or)()(

    00   ==   ∂Φ∂

    ==∂Φ∂

    ωω   ωω

    ωω

      X  X 

     j X  E  X  jE  (6-36)

    ,)(1)(0

    2

    2

    2

    2

    =∂Φ∂=

    ωω

    ω X  j

     X  E  (6-37)

    PILLAI

  • 8/11/2019 lect6a

    17/50

    17

    and repeating this procedure k times, we obtain the k th

    moment of X to be

    We can use (6-36)-(6-38) to compute the mean, variance

    and other higher order moments of any r.v X . For example,

    if ∼ then from (6-33)

    so that from (6-36)

    which agrees with (6-6). Differentiating (6-39) one more

    time, we get

    .1 ,)(1

    )(0

    ≥∂Φ∂=

    =

    k  j

     X  E k 

     X k 

    ωω

    ω (6-38)

    ,)(   ωλλ λωω  ω

     je X   jeee j

    −=∂Φ∂ (6-39)

    ,)(   λ= X  E  (6-40)

    ),(  λ P  X 

    PILLAI

  • 8/11/2019 lect6a

    18/50

    18

    ( ),)()( 222

    2ωλωλλ λλ

    ωω   ωω  je je X  e je jeee

     j j

    +=∂Φ∂   − (6-41)

    so that from (6-37)

    which again agrees with (6-15). Notice that compared to the

    tedious calculations in (6-6) and (6-15), the efforts involved

    in (6-39) and (6-41) are very minimal.

    We can use the characteristic function of the binomial r.v

     B(n, p) in (6-34) to obtain its variance. Direct differentiation

    of (6-34) gives

    so that from (6-36), as in (6-7).

    ,)( 22 λλ += X  E  (6-42)

    1)()(   −+=

    ∂Φ∂   n j j X  q pe jnpe   ωω

    ωω

    (6-43)

    np X  E    =)(PILLAI

    O diff i i f (6 43) i ld

  • 8/11/2019 lect6a

    19/50

    19

    One more differentiation of (6-43) yields

    and using (6-37), we obtain the second moment of the

     binomial r.v to be

    Together with (6-7), (6-18) and (6-45), we obtain the

    variance of the binomial r.v to be

    To obtain the characteristic function of the Gaussian r.v, we

    can make use of (6-31). Thus if ∼ then

    ( )

    2212

    2

    2

    )()1()()(   −− +−++=

    Φ∂   n j jn j j X  q pe penq peenp j   ωωωω

    ω

    ω

    (6-44)

    ( ) .)1(1)( 222 npq pn pnnp X  E    +=−+= (6-45)

    [ ] .)()( 2222222 npq pnnpq pn X  E  X  E  X    =−+=−=σ (6-46)

    ),,(  2σµ N  X 

    PILLAI

    1 22∞+∫

  • 8/11/2019 lect6a

    20/50

    20

    .2

    2

    )thatso (Let

    2

    1

    2

    )(Let2

    1)(

    )2/(

     

    2/

    2

    2/

     2/))((

    2

    22

     )2(2/

    2

     2/

    2

     2/)(

    2

    222222

    222

    2222

    22

    ωσµωσωσµω

    σωσωσµω

    ωσσµωσωµω

    σµω

    πσ

    πσ

    ωσωσ

    πσπσ

    µπσ

    ω

    −∞+

    ∞−

    −−

    ∞+

    ∞−

    −+−

    ∞+

    ∞−

    −−∞+

    ∞−

    ∞+

    ∞−

    −−

    ==

    =

    +==−

    ==

    =−=Φ

    ∫∫

     ju j

     ju ju j

     j y y j y y j j

     x x j

     X 

    edueee

    duee

     ju yu j y

    dyeedyeee

     y xdxee

    (6-47)

     Notice that the characteristic function of a Gaussian r.v itself 

    has the “Gaussian” bell shape. Thus if ∼ then

    and

    ),,0(  2σ N  X 

    ,2

    1)(

    22 2/

    2

    σ

    πσ x

     X    e x f   −= (6-48)

    (6-49).)( 2/22ωσω   −=Φ   e X 

    PILLAI

    2/2222 2/ σx

  • 8/11/2019 lect6a

    21/50

    21

    2/22ωσ−e

    ω(b)

    2/   σ xe−

     x

    (a)

    Fig. 6.2

    From Fig. 6.2, the reverse roles of in and arenoteworthy

    In some cases, mean and variance may not exist. Forexample, consider the Cauchy r.v defined in (3-39). With

    clearly diverges to infinity. Similarly

    2σ )( x f  X  )(ω X Φ

    ,

    )/(

    )( 22  x x f  X  += α

    πα

    ∫∫  ∞+

    ∞−

    ∞+

    ∞−∞=

     

      

     

    +−=

    +=

     

    22

    22

    22 ,1)(   dx

     xdx

     x

     x X  E 

    αα

    πα

    απα

    (6-50)

    .)1

     vs(2

    2

    σσ

    PILLAI

  • 8/11/2019 lect6a

    22/50

    22

    .)( 

    22∫  ∞+

    ∞− +=   dx

     x

     x X  E 

    απα

    (6-51)

    To compute (6-51), let us examine its one sided factor 

    With

    indicating that the double sided integral in (6-51) does not

    converge and is undefined. From (6-50)-(6-52), the mean

    and variance of a Cauchy r.v are undefined.We conclude this section with a bound that estimates the

    dispersion of the r.v beyond a certain interval centered

    around its mean. Since measures the dispersion of 

    0 22∫  ∞+

    +  dx

     x

     x

    α   θα tan= x

      / 2 / 22

    2 2 2 2  0 0 0

      / 2 / 2

    0  0

    tan sin  secsec cos

    (cos )  log cos log cos ,

    cos 2

     x dx d d   x

    π π

    π   π

    α θ θα θ θ θα α θ θ

    θ πθ

    θ

    + ∞= =

    +

    = − = − = − = ∞

    ∫ ∫ ∫

    ∫ (6-52)

    2σPILLAI

    h d i hi b d

  • 8/11/2019 lect6a

    23/50

    23

    the r.v X around its mean µ , we expect this bound to

    depend on as well.

    Chebychev Inequality

    Consider an interval of width 2ε symmetrically centered

    around its mean µ as in Fig. 6.3. What is the probability that

     X falls outside this interval? We need

    ( ) ? ||   εµ ≥− X  P  (6-53)

    µε2

    εµ −   εµ + X 

    Fig. 6.3

     X 

    PILLAI

    To compute this probability we can start with the definition

  • 8/11/2019 lect6a

    24/50

    24

    To compute this probability, we can start with the definition

    of

    From (6-54), we obtain the desired probability to be

    and (6-55) is known as the chebychev inequality.

    Interestingly, to compute the above probability bound the

    knowledge of is not necessary. We only need thevariance of the r.v. In particular with in (6-55) we

    obtain

    )( x f  X 

    ( ) ,||2

    2

    ε

    σεµ ≤≥− X  P 

    (6-54)

    [ ]

    ( ) .||)()( 

    )()()()()(

    2

    ||

    2

    ||

    2

    ||

    2

     

    222

    εµεεε

    µµµσ

    εµεµ

    εµ

    ≥−≥≥≥

    −≥−=−=

    ∫∫∫∫

    ≥−≥−

    ≥−∞+

    ∞−

     X  P dx x f dx x f 

    dx x f  xdx x f  x X  E 

     x  X 

     x  X 

     x  X  X 

    (6-55)

    .2σ

    ,2

    σσε   k =

    ( ) .1|| 2k k  X  P    ≤≥− σµ (6-56)

    PILLAI

  • 8/11/2019 lect6a

    25/50

    25

    Thus with we get the probability of X  being outside

    the 3σ interval around its mean to be 0.111 for any r.v.

    Obviously this cannot be a tight bound as it includes all r.vs.

    For example, in the case of a Gaussian r.v, from Table 4.1

    which is much tighter than that given by (6-56). Chebychevinequality always underestimates the exact probability.

    ( ) .0027.03|| =≥ σ X  P  (6-57)

    ,3=k 

    )1,0(   == σµ

    PILLAI

  • 8/11/2019 lect6a

    26/50

    26

    Moment Identities :

    Suppose  X  is a discrete random variable that takesonly nonnegative integer values. i.e.,

    Then

    similarly

    ,2,1,0 ,0)(   =≥==   k  pk  X  P  k 

    ∑∑ ∑ ∑ ∑∑∞

    =

    =

    =

    +=

    =

    =

    ===

    ====>

    0

    0 0 1

    1

    01

    )()( 

    1)()()(

    i

    k k k i

    i

    k i

     X  E i X  P i

    i X  P i X  P k  X  P 

    2

    )}1({)(

    2

    )1()()(

    1

    1

    010

    −==

    −===>

      ∑∑∑∑

      ∞

    =

    =

    =

    =

     X  X  E i X  P 

    iik i X  P k  X  P k 

    i

    i

    k ik 

    PILLAI

    (6-58)

  • 8/11/2019 lect6a

    27/50

    27

    which gives

    Equations (6-58) – (6-59) are at times quite useful insimplifying calculations. For example, referring to the

    Birthday Pairing Problem [Example 2-20., Text], let X 

    represent the minimum number of people in a group for a birthday pair to occur. The probability that “the first

    n people selected from that group have different

     birthdays” is given by [ P ( B) in page 39, Text]

    But the event the “the first n people selected have

    .)()12()()(1 0

    22

    ∑ ∑∞

    =

    =>+===

    i k 

    k  X  P k i X  P i X  E  (6-59)

    .)1( 2/)1(1

    1

     N nnn

    k n   e N 

    k  p   −−−

    =≈−∏=

    PILLAI

  • 8/11/2019 lect6a

    28/50

    28

    different birthdays” is the same as the event “ X > n.”

    Hence

    Using (6-58), this gives the mean value of X to be

    Similarly using (6-59) we get

    ( 1) / 2( ) .n n N  P X n e− −> ≈

    { }

    2

    2 2

     

    ( 1) / 2 ( 1/ 4) / 2

      1/ 2

    0 0

      1/ 2(1/8 ) / 2 (1/ 8 ) / 2

      1/ 2 0

    1

    2

    ( ) ( )

    2

    1/ 2 24.44.2

    n n N x N  

    n n

     N x N N x N 

     E X P X n e e dx

    e e dx e N e dx

     N 

    π

    π

    ∞ ∞ ∞− − − −

    = =∞ − −

    = > ≈ ≈

    = = +

    ≈ + =

    ∑ ∑   ∫

    ∫ ∫

    (6-60)

    PILLAI

  • 8/11/2019 lect6a

    29/50

    29

    Thus

    2

    2 2 2

     

    2

    0

    ( 1) / 2 ( 1/ 4) / 2 

    0 1/ 2

    1/ 2

    (1/8 ) / 2 / 2 ( 1/ 4) / 2

    0 0 1/ 2

    ( ) (2 1) ( )

    (2 1) 2 ( 1)

    2 2

    2 2 1

    2 2 ( )2 8

    1 52 2 1 2 2

    4 4

    779.139.

    n

    n n N x N  

    n

     N x N x N x N 

     E X n P X n

    n e x e dx

    e xe dx xe dx e dx

     N 

     N E X 

     N N N N 

    π

    π

    π π

    =

    ∞∞− − − −

    =   −

    ∞ ∞

    − − − −

    = + >

    = + = +

    = + +

    = + +

    = + + + = + +

    =

    ∑   ∫

    ∫ ∫ ∫

    82.181))(()()( 22 =−=   X  E  X  E  X Var 

    PILLAI

    which gives

  • 8/11/2019 lect6a

    30/50

    30

    g

    Since the standard deviation is quite high compared to themean value, the actual number of people required for a

     birthday coincidence could be anywhere from 25 to 40.

    Identities similar to (6-58)-(6-59) can be derived in the

    case of continuous random variables as well. For example,

    if X is a nonnegative random variable with density function

     f  X ( x) and distribution function F  X ( X ), then

    .48.13 ≈ X 

    σ

    PILLAI

    (   )(   )

     

    0 0 0

     

    0 0 0

     

    0 0

    { } ( ) ( )

    ( ) ( ) ( )

    {1 ( )} ( ) ,

     X X 

     X 

     X 

     x

     y

     E X x f x dx dy f x dx

     f x dx dy P X y dy P X x dx

     F x dx R x dx

    ∞ ∞

    ∞ ∞ ∞ ∞

    ∞ ∞

    = =

    = = > = >

    = − =

    ∫ ∫ ∫

    ∫ ∫ ∫ ∫

    ∫ ∫ (6-61)

    where

  • 8/11/2019 lect6a

    31/50

    31

    where

    Similarly

    ( ) 1 ( ) 0, 0. X 

     R x F x x= − ≥ >

    ( )( )

     

    2 2

     0 0 0

     

    0

    0 .

    { } ( ) 2 ( )

    2 ( )

    2 ( )

     X X 

     X 

     x

     y

     E X x f x dx ydy f x dx

     f x dx ydy

     x R x dx

    ∞ ∞

    ∞ ∞

    = =∫ ∫ ∫=   ∫ ∫

    =   ∫

    A B b ll T i i (P t R d Di i )

  • 8/11/2019 lect6a

    32/50

    32

    A Baseball Trivia (Pete Rose and Dimaggio):

    In 1978 Pete Rose set a national league record by

    hitting a string of 44 games during a 162 game baseball

    season. How unusual was that event?As we shall see, that indeed was a rare event. In that context,

    we will answer the following question: What is the

     probability that someone in major league baseball willrepeat that performance and possibly set a new record in

    the next 50 year period? The answer will put Pete Rose’s

    accomplishment in the proper perspective.

    Solution: As example 5-32 (Text) shows consecutive

    successes in n trials correspond to a run of length r in n

    PILLAI

    t i l F (5 133) (5 134) t t t th b bilit f

  • 8/11/2019 lect6a

    33/50

    33

    trials. From (5-133)-(5-134) text, we get the probability of 

    r successive hits in n games to be

    where

    and p represents the probability of a hit in a game. PeteRose’s batting average is 0.303, and on the average since

    a batter shows up about four times/game, we get

    r r n

    r nn   p p ,,1 −+−= αα (6-62)

    k r k 

    r n

    k kr n

    r n   qp )()1()1(/

    0

     ,   −=   ∑+

    =

      

        −α

    PILLAI

    (6-63)

    76399.00.303)-(1-1

    game)hit /P(no-1

    game)hit /oneleastat(

    4

    ==

    == P  p

    (6-64)

    Substituting this value for p into the expressions

  • 8/11/2019 lect6a

    34/50

    34

    Subs u g s v ue o p o e e p ess o s

    (6-62)-(6-63) with r = 44 and n = 162, we can compute the

    desired probability  pn. However since n is quite largecompared to r , the above formula is hopelessly time

    consuming in its implementation, and it is preferable to

    obtain a good approximation for  pn.Towards this, notice that the corresponding moment

    generating function for in Eq. (5-130) Text,

    is rational and hence it can be expanded in partial fraction as

    where only r roots (out of r +1) are accounted for, since the

    root  z = 1/ p is common to both the numerator and thedenominator of Here

    PILLAI

    )( z φ

    ,

    1

    1)(

    11   ∑

    =+

    −=

    +−

    −=

    k    k 

    r r 

    r r 

     z  z 

    a

     z qp z 

     z  p z φ

    ).( z φ

    (6-65)

    nn   pq   −=1

    r r  zzzp ))(1(

  • 8/11/2019 lect6a

    35/50

    35

    r r 

    r r r r 

     z  z 

    r r 

     z  z k 

     z qpr 

     z  z  z rp z  p

     z qp z 

     z  z  z  pa

    )1(1

    )()1(lim

    1

    ))(1( lim

    1

    1

    ++−−−−

    =

    +−−−

    =

    +→

    PILLAI

    From (6-65) – (6-66)

    where

    r k  z qpr 

     z  pa

    k  ,,2,1 ,)1(1

    1=

    +−−=

    (6-66)

    (6-67)∑∑ ∑∑  ∞

    =

    = =

    +−

    =

    ==

    −−

    =00 1

     )1(

     

    1

     )(

    /1

    )(

    )(n

    n

    n

    n

    n

    q

    n

    k k 

    k    k 

    k   z q z  z  A

     z  z  z 

    a z 

    n

       

    φ

    k k   z qpr 

     z  p

    a A )1(1

    1

    +−

    =−=

    or

    and

  • 8/11/2019 lect6a

    36/50

    36

    PILLAI

    and

    However (fortunately), the roots in

    (6-65)-(6-67) are all not of the same importance (in terms

    of their relative magnitude with respect to unity). Notice

    that since for large n, for only the roots

    nearest to unity contribute to (6-68) as n becomes larger.To examine the nature of the roots of the denominator 

    in (6-65), note that (refer to Fig 6.1)

    implying that

    for increases from –1 and reaches a positivemaximum at z 0 given by

    .11

    )1(∑=

    +−=−=  r 

    n

    k k nn   z  A pq (6-68)

    r k  z k 

    ,1,2, ,   =

    0)1( →+−   nk 

     z  ,1||   >k  z 

    11)(   +−−=   r r  z qp z  z  A

     ,01)0(  

  • 8/11/2019 lect6a

    37/50

    37

    which gives

    There onwards A( z ) decreases to Thus there are two

     positive roots for the equation given by

    and Since but negative, bycontinuity has the form (see Fig 6.1)

    PILLAI

    ,0)1(1)(

    0

    0

    =+−=− z  z 

     z r qpdz 

    .)1(

    10 +

    =r qp

     z r 

    (6-69)

    .∞−0)(   = z  A

    01  z  z   <

    .1/12   >=   p z  0)1(   ≈−=  r 

    qp A1

     z   .0 ,11

      >+= εε z 

    Fig 6.1 A( z ) for r odd

    )( z  A

     z 1−

    1  z 2=1/ p

     z 1   z 0

    It is possible to obtain a bound for in (6 69) Whenz

  • 8/11/2019 lect6a

    38/50

    38

    It is possible to obtain a bound for in (6-69). When

     P varies from 0 to 1, the maximum of is

    attained for and it equals Thus

    and hence substituting this into (6-69), we get

    Hence it follows that the two positive roots of  A( z ) satisfy

    Clearly, the remaining roots of are complex if r is

    0 z 

    PILLAI

    r r   p pqp )1(   −=)1/(   +=   r r  p .)1/( 1++   r r  r r 

    1

    )1(  +

    +≤

    r qp (6-70)

    (6-72)

    (6-71).1110r r 

    r  z    +=+≥

    .11

     1

    11 21   >=

  • 8/11/2019 lect6a

    39/50

    39

    Fig 6.2). It is easy to show that the absolute value of every

    such complex or negative root is greater than 1/ p >1.

    To show this when r is even, suppose represents the

    negative root. Then

    α−

    0)1()(1

    =−+−=−  +r r 

    qp A   αααPILLAI

    Fig 6.2 A( z ) for r even

    )( z  A

     z  z 1   z 0   z 2

    α−

    so that the function

  • 8/11/2019 lect6a

    40/50

    40

    so a e u c o

    starts positive, for x > 0 and increases till it reaches once

    again maximum at and then decreases to

    through the root Since  B(1/ p) = 2, we

    get > 1/ p > 1, which proves our claim.

    r  z  /110   +≥∞− .10 >>=   z  x   α

    2)(1)(1

    +=−+=  +

     x A xqp x x B  r r 

    PILLAI

    (6-73)

    α

    Fig 6.3 Negative root

    α

    1

    )( x B

    0)(   =α B

     z 0 1/ p

    Finally if is a complex root of A(z) thenθρ   jez =

  • 8/11/2019 lect6a

    41/50

    41

    Finally if is a complex root of A( z ), then

    so that

    or 

    Thus from (6-72), belongs to either the interval (0,  z 1)

    or the interval in Fig 6.1. Moreover , by equating

    the imaginary parts in (6-74) we get

    ρ e z   =

    01)( )1(1 =−−=   ++ θθθ ρρρ   r  jr r  j j eqpee A (6-74)

    1)1(1

    1 |1|+++

    +≤+=  r r r  jr r 

    qpeqp   ρρρ  θ

    .1)1(

    sin

    sin=

    +

    θ

    θρ

      r qp   r r 

    ),( 1 ∞ p

    ρ

    .01)( 1

  • 8/11/2019 lect6a

    42/50

    42

    equality being excluded if Hence from (6-75)-(6-76)

    and (6-70)

    or 

    But As a result lies in the interval only.

    Thus

    ρ.01

      z  z   < ),( 1 ∞ p

    ,1

    )1(

      sin

    sin

    +≤

    +r 

    θ

    θ

    .0≠

    r r r 

    r  z 

    qpr 

    qpr   

     

     

     

       +>=

    +

    >⇒>+1

    )1(

    1 1)1( 0ρρ

    .1

    10r 

     z    +≥>ρ

    .11

    >>  pρPILLAI

    (6-76)

    (6-77)

    T i th t l t f th l i l

  • 8/11/2019 lect6a

    43/50

    43

    To summarize the two real roots of the polynomial

     A( z ) are given by

    and all other roots are (negative or complex) of the form

    Hence except for the first root z 1 (which is very close tounity), for all other roots

    As a result, the most dominant term in (6-68) is the first

    term, and the contributions from all other terms to qn

    in

    (6-68) can be bounded by

    ,11

     ;0 ,1 21   >=>+= p

     z  z    εε

    .11

     where >>= p

    e z    jk    ρρ  θ

    .allforrapidly0 

    )1(

    k  z 

      n

    k    →

    +−

    PILLAI

    (6-78)

    (6-79)

    |||| )1()1( ≤ +−+− ∑∑ zAzAr 

    nr 

    n

  • 8/11/2019 lect6a

    44/50

    44

    Thus from (6-68), to an excellent approximation

    This gives the desired probability to be

    .)1(11

    +−=   nn

      z  Aq (6-81)

    0. 11

     |)|()1(

    |)|( 

    |)|()1(1

    |)|(1 

    ||||

    11

    1

    2

    1

    2

    2

    )(

    2

    )(

    →≤+−=

    +≤

    +−−

    ++

    +

    =

    +

    =

    ==

    ∑∑

    q p

    q p

    r r 

     p z  pqr 

     z  p

     p z  pqr 

     z  p

     z  A z  A

    nn

    nr 

    k r 

    r k 

    nr 

    k r 

    k k k 

    k k k 

    (6-80)

    PILLAI

    )(1 r pz

  • 8/11/2019 lect6a

    45/50

    45

     Notice that since the dominant root z 1 is very close to

    unity, an excellent closed form approximation for z 1 can

     be obtained by considering the first order Taylor seriesexpansion for A( z ). In the immediate neighborhood of  z =1

    we get

    so that gives

    or 

    .

    )()1(1

    )(111 )1(1

    1

    1   +−

     

     

     

     

    +−

    −−=−=   n

    r nn  z 

     pz qr 

     pz q p (6-82)

    PILLAI

    εεε ))1(1()1()1()1(   r r  qpr qp A A A   +−+−=′+=+

    0)1()(1

      =+= ε A z  A

    ,

    )1(1

      r 

    qpr 

    qp

    +−

    .)(

    11 r

    r qp z  +≈ (6-83)

  • 8/11/2019 lect6a

    46/50

    46

    Returning back to Pete Rose’s case,  p = 0.763989, r = 44gives the smallest positive root of the denominator

     polynomial

    to be

    (The approximation (6-83) gives ).

    Thus with n = 162 in (6-82) we get

    to be the probability for scoring 44 or more consecutive

    45441)(   z qp z  z  A   −−=

    .05490000016936.11 = z 05480000016936.11 = z 

    0002069970.0162

     = p (6-84)

    )1(11   r qpr +−

    (6 83)

    hits in 162 games for a player of Pete Rose’s caliber a

    ll b bili i d d! I h i i

  • 8/11/2019 lect6a

    47/50

    47

    very small probability indeed! In that sense it is a very

    rare event.Assuming that during any baseball season there are

    on the average about (?) such players over

    all major league baseball teams, we obtain [use Lecture #2,Eqs.(2-3)-(2-6) for the independence of 50 players]

    to be the probability that one of those players will hit the

    desired event. If we consider a period of 50 years, then the

     probability of some player hitting 44 or more consecutive

    games during one of these game seasons turns out to be

    PILLAI

    50252   =×

    .40401874.0)1(1 501   =−−   P 

    0102975349.0)1(150

    1621   =−−=   p P 

    (6-85)

    (We have once again used the independence of the 50

  • 8/11/2019 lect6a

    48/50

    48

    (We have once again used the independence of the 50

    seasons.)

    Thus Pete Rose’s 44 hit performance has a 60-40

    chance of survival for about 50 years.From (6-85), rareevents do indeed occur. In other words, some unlikely

    event is likely to happen.

    However, as (6-84) shows a particular unlikely eventsuch as Pete Rose hitting 44 games in a sequence is

    indeed rare.

    Table 6.1 lists  p162 for various values of r. From there,

    every reasonable batter should be able to hit at least 10

    to 12 consecutive games during every season!

    −−

    p ; n = 162r

  • 8/11/2019 lect6a

    49/50

    49

    0.9525710

    0.4893315

    0.1493720

    0.03928250.00020744

     pn ; n 162r 

    Table 6.1 Probability of r runs in n trials for p=0.76399.

    As baseball fans well know, Dimaggio holds the record of

    consecutive game hitting streak at 56 games (1941). With

    a lifetime batting average of 0.325 for Dimaggio, the above

    calculations yield [use (6-64), (6-82)-(6-83)] the probability

    for that event to be

    .0000504532.0=p (6-86)

  • 8/11/2019 lect6a

    50/50

    50

    Even over a 100 year period, with an average of 50excellent hitters / season, the probability is only

    (where ) that someone

    will repeat or outdo Dimaggio’s performance.Remember,60 years have already passed by, and no one has done it yet!

    .0000504532.0n

     p

    PILLAI

    (6-86)

    (6-87)2229669.0)1(1100

    0   =−−   P 00251954.0)1(1 500   =−−=   n p P