206 hwk3 solutions

Upload: askazy007

Post on 06-Jul-2018


  • 8/18/2019 206 Hwk3 Solutions


    AMS 206: Classical and Bayesian Inference (Winter 2015)

    Homework 3 solutions

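    1. Exercise 7.2.1
    Solution: By definition, $\Pr(X_6 > 3000 \mid x) = \int_{3000}^{\infty} f(x_6 \mid x)\,dx_6$. We therefore require $f(x_6 \mid x) = \int_0^{\infty} f(x_6 \mid \theta)\,\xi(\theta \mid x)\,d\theta$, for which we need the posterior distribution, $\xi(\theta \mid x)$. Under the gamma(1, 5000) prior for $\theta$, the posterior distribution for $\theta$ can be recognized as a gamma$(6,\ 5000 + \sum_{i=1}^{5} x_i)$ distribution. Now, letting $\alpha = 6$ and $\beta = 5000 + \sum_{i=1}^{5} x_i$, we have

    \[
    f(x_6 \mid x) = \frac{\beta^{\alpha}}{\Gamma(\alpha)} \int_0^{\infty} \theta^{\alpha} \exp(-\theta(x_6 + \beta))\,d\theta
    = \frac{\beta^{\alpha}\,\Gamma(\alpha + 1)}{\Gamma(\alpha)\,(x_6 + \beta)^{\alpha+1}}
    = \frac{\beta^{\alpha}\alpha}{(x_6 + \beta)^{\alpha+1}}.
    \]

    Therefore,

    \[
    \Pr(X_6 > 3000 \mid x) = \int_{3000}^{\infty} \frac{\beta^{\alpha}\alpha}{(x_6 + \beta)^{\alpha+1}}\,dx_6
    = \Big[-\beta^{\alpha}(x_6 + \beta)^{-\alpha}\Big]_{3000}^{\infty}
    = \left(\frac{\beta}{3000 + \beta}\right)^{\alpha},
    \]

    and with $\beta = 5000 + 16178 = 21178$ and $\alpha = 6$, we get 0.4516 for this probability.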
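
    The closed form above can be checked numerically. The following is a minimal Python sketch, not part of the original solutions; the function names, the integration cutoff, and the number of midpoint steps are arbitrary illustrative choices.

```python
def predictive_tail(alpha, beta, t):
    """Closed-form Pr(X6 > t | x) = (beta / (t + beta)) ** alpha."""
    return (beta / (t + beta)) ** alpha


def predictive_density(x6, alpha, beta):
    """Posterior predictive density alpha * beta**alpha / (x6 + beta)**(alpha + 1)."""
    return alpha * beta ** alpha / (x6 + beta) ** (alpha + 1)


def midpoint_integral(func, lower, upper, steps):
    """Midpoint-rule approximation of the integral of func over [lower, upper]."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (k + 0.5) * width) for k in range(steps))


alpha = 6
beta = 5000 + 16178  # prior rate plus the sum of the five observed lifetimes

closed_form = predictive_tail(alpha, beta, 3000)

# Cross-check: integrate the density up to a large cutoff (arbitrary choice)
# and add the closed-form probability mass beyond the cutoff.
cutoff = 2_000_000
numeric = midpoint_integral(
    lambda x6: predictive_density(x6, alpha, beta), 3000, cutoff, 200_000
) + predictive_tail(alpha, beta, cutoff)

print(round(closed_form, 4))
print(abs(numeric - closed_form) < 1e-6)
```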

    2. Exercise 7.2.2
    Solution: We observe 2 defective items out of 8, and the number of defectives follows a binomial distribution, so $f_n(x \mid \theta) \propto \theta^2(1 - \theta)^6$. The prior is a discrete distribution supported on only two values, and therefore the posterior is also a discrete distribution supported on these same values (since the prior equals 0 at all other values). The posterior probabilities satisfy $\xi(\theta = 0.1 \mid x) \propto 0.1^2(1 - 0.1)^6(0.7) = 0.00372$ and $\xi(\theta = 0.2 \mid x) \propto 0.2^2(1 - 0.2)^6(0.3) = 0.00315$, which are normalized to give $\xi(\theta = 0.1 \mid x) = 0.54$ and $\xi(\theta = 0.2 \mid x) = 0.46$. Note that the denominator of $\xi(\theta \mid x)$ is given by $\xi(0.1) f_n(x \mid 0.1) + \xi(0.2) f_n(x \mid 0.2)$.

    3. Exercise 7.2.10
    Solution: The prior for $\theta$ has density $\xi(\theta) = 1/10$ for $10 \le \theta \le 20$, and $f(x \mid \theta) = 1$ for $\theta - 0.5 \le x \le \theta + 0.5$. The posterior density $\xi(\theta \mid x)$ is therefore proportional to a constant, i.e., it is a uniform distribution, on the interval where both $\xi(\theta)$ and $f(x \mid \theta)$ are nonzero. The likelihood is nonzero when $x - 0.5 \le \theta \le x + 0.5$, that is, $\theta \in [11.5, 12.5]$, since $x = 12$. The prior is nonzero throughout this range, and thus $\theta \mid x \sim \text{Uniform}[11.5, 12.5]$.
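
    As a quick numerical check of the two-point posterior in Exercise 7.2.2, the prior-times-likelihood weights can be normalized directly. This is a minimal Python sketch with hypothetical helper names, not part of the original solutions.

```python
def discrete_posterior(prior, likelihood):
    """Return (posterior, unnormalized weights) for a prior on finitely many values."""
    weights = {theta: p * likelihood(theta) for theta, p in prior.items()}
    total = sum(weights.values())  # the denominator of the posterior
    posterior = {theta: w / total for theta, w in weights.items()}
    return posterior, weights


# Prior mass 0.7 on theta = 0.1 and 0.3 on theta = 0.2; 2 defectives out of 8.
prior = {0.1: 0.7, 0.2: 0.3}
posterior, weights = discrete_posterior(prior, lambda t: t ** 2 * (1 - t) ** 6)

for theta in prior:
    print(theta, round(weights[theta], 5), round(posterior[theta], 2))
```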

    4. Exercise 7.2.11
    Solution: Now, $f_n(x \mid \theta) = 1$ for $\theta \in [\max\{x_1, \ldots, x_n\} - 0.5,\ \min\{x_1, \ldots, x_n\} + 0.5]$, and 0 otherwise. Plugging in the values from the data, this interval becomes [11.2, 11.4], and again, the prior is constant in this range. Therefore, $\theta \mid x \sim \text{Uniform}[11.2, 11.4]$.
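
    The support of the posterior in Exercises 7.2.10 and 7.2.11 is the intersection of the prior interval with the interval allowed by the likelihood. A minimal Python sketch follows; the function name is hypothetical, and because the data are not listed here, the second call uses only the sample extremes (10.9 and 11.7) implied by the stated interval [11.2, 11.4].

```python
def uniform_posterior_support(observations, prior_low=10.0, prior_high=20.0, half_width=0.5):
    """Interval on which both the Uniform prior and the likelihood are nonzero."""
    low = max(max(observations) - half_width, prior_low)
    high = min(min(observations) + half_width, prior_high)
    if low > high:
        raise ValueError("observations are incompatible with the model")
    return low, high


# Exercise 7.2.10: a single observation x = 12.
print(uniform_posterior_support([12.0]))

# Exercise 7.2.11: only the smallest and largest observations matter
# (extremes inferred from the stated interval, not the full data set).
print(uniform_posterior_support([10.9, 11.7]))
```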

    5. Exercise 7.3.7
    Solution: We know that the normal distribution is a conjugate prior for the mean of a normal distribution with known variance. Using the formulas for updating the mean and variance of the posterior distribution (Theorem 7.3.3), we have $\theta \mid x \sim N(\mu_1, v_1^2)$, with

    \[
    \mu_1 = \frac{(4)(68) + 10(1)(69.5)}{4 + 10(1)} = 69.07
    \quad\text{and}\quad
    v_1^2 = \frac{4}{4 + 10} = 0.286.
    \]
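
    The update in Exercise 7.3.7 can be reproduced with a short function. This is a minimal Python sketch of the standard normal-normal conjugate update; the function and argument names are illustrative, not from the original solutions.

```python
def normal_posterior(mu0, v0_sq, sigma_sq, n, xbar):
    """Posterior mean and variance for a normal mean with known variance sigma_sq.

    mu0 and v0_sq are the prior mean and variance; xbar is the mean of n observations.
    """
    denom = sigma_sq + n * v0_sq
    mu1 = (sigma_sq * mu0 + n * v0_sq * xbar) / denom
    v1_sq = sigma_sq * v0_sq / denom
    return mu1, v1_sq


# Values used in the solution: prior N(68, 1), sigma^2 = 4, n = 10, sample mean 69.5.
mu1, v1_sq = normal_posterior(mu0=68, v0_sq=1, sigma_sq=4, n=10, xbar=69.5)
print(round(mu1, 2), round(v1_sq, 3))
```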


    6. Exercise 7.3.8
    Solution: The normal distribution is symmetric and unimodal around its mean, and therefore the interval of fixed length with the highest probability is the interval centered at the mean. Therefore, for part (a) the interval of length 1 inch with highest probability is (67.5, 68.5), and for part (b) it is (68.57, 69.57). The values of these probabilities are obtained using the fact that $(\theta - 68)/1 \sim N(0, 1)$ under the prior for part (a), and $(\theta - 69.07)/\sqrt{0.286} \sim N(0, 1)$ under the posterior for part (b). That is, for part (a) we compute $\Pr(\theta < 68.5) - \Pr(\theta < 67.5) = \Phi(0.5) - \Phi(-0.5) = 0.383$. For part (b), the same procedure gives $\Phi(0.935) - \Phi(-0.935) \approx 0.650$ (a table lookup with the z-value rounded to 0.94 gives 0.6528). Note that the probability from part (b) is higher, and thus the uncertainty about $\theta$ is reduced in the posterior distribution as compared to the prior distribution.
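
    The two interval probabilities in Exercise 7.3.8 can be evaluated without a table by writing $\Phi$ in terms of the error function. This is a minimal Python sketch using only the standard library; the helper names are illustrative. Evaluating $\Phi$ without rounding the z-value gives about 0.650 for part (b), slightly below a table lookup at z = 0.94.

```python
import math


def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def centered_interval_prob(sd, length=1.0):
    """Probability that a normal variable lies within length/2 of its mean."""
    z = (length / 2.0) / sd
    return normal_cdf(z) - normal_cdf(-z)


prior_prob = centered_interval_prob(sd=1.0)                       # part (a)
posterior_prob = centered_interval_prob(sd=math.sqrt(4.0 / 14.0))  # part (b)
print(round(prior_prob, 3), round(posterior_prob, 3))
```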

    7. Exercise 7.3.11
    Solution: The standard deviation of the posterior distribution is $v_1 = \sigma v_0/(\sigma^2 + n v_0^2)^{1/2}$. Plugging in $n = 100$, $\sigma = 2$, we have $v_1 = 2v_0/(4 + 100v_0^2)^{1/2} = v_0/(1 + 25v_0^2)^{1/2}$. For any $v_0 > 0$, the denominator is larger than $(25v_0^2)^{1/2} = 5v_0$, which means $v_1 < 1/5$.
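
    The bound in Exercise 7.3.11 can be illustrated numerically: however diffuse the prior, the posterior standard deviation stays below 1/5. A minimal Python sketch follows; the grid of prior standard deviations is an arbitrary illustrative choice.

```python
import math


def posterior_sd(v0, sigma=2.0, n=100):
    """Posterior standard deviation sigma * v0 / sqrt(sigma^2 + n * v0^2)."""
    return sigma * v0 / math.sqrt(sigma ** 2 + n * v0 ** 2)


# Arbitrary prior standard deviations, from very tight to very diffuse.
prior_sds = (0.01, 0.1, 1.0, 10.0, 1000.0)
posterior_sds = [posterior_sd(v0) for v0 in prior_sds]

for v0, v1 in zip(prior_sds, posterior_sds):
    print(v0, v1, v1 < 0.2)
```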

    8. Exercise 7.3.12
    Solution: The prior for $\theta$ is a gamma(0.04, 0.2) distribution, since then $E(\theta) = 0.04/0.2 = 0.2$ and $\mathrm{SD}(\theta) = 0.2/0.2 = 1$ as required. With conditionally i.i.d. exponential random variables, the likelihood is $f_n(x \mid \theta) \propto \theta^n \exp(-\theta \sum_{i=1}^{n} x_i)$, and the posterior is $\xi(\theta \mid x) \propto \theta^{20 + 0.04 - 1} \exp(-\theta(0.2 + \sum_{i=1}^{n} x_i))$. Using $\sum_{i=1}^{n} x_i = 20(3.8) = 76$, we have $\theta \mid x \sim$ gamma(20.04, 76.2).

    9. Exercise 7.3.13
    Solution: The mean of a gamma distribution with parameters $\alpha$ and $\beta$ is $\alpha/\beta$ and the standard deviation is $\alpha^{1/2}/\beta$, so the coefficient of variation is $\alpha^{-1/2}$. If the prior coefficient of variation is 2, then $\alpha = 1/4$ in the prior distribution. From the previous problem, we have that the coefficient of variation in the posterior distribution is $(n + (1/4))^{-1/2}$. This is at most 0.1 when $n + 1/4 \ge 100$, i.e., $n \ge 99.75$, which means that at least 100 customers are needed.
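
    The gamma update of Exercise 7.3.12 and the sample-size calculation of Exercise 7.3.13 can be sketched in a few lines of Python. The function names are hypothetical; the second function simply solves $(\alpha_0 + n)^{-1/2} \le c$ for the smallest integer $n$.

```python
import math


def gamma_posterior(alpha0, beta0, n, total):
    """Gamma(alpha0, beta0) prior with n exponential observations summing to total."""
    return alpha0 + n, beta0 + total


def min_sample_size(alpha0, target_cv):
    """Smallest n with posterior coefficient of variation (alpha0 + n)**-0.5 <= target_cv."""
    return math.ceil(1.0 / target_cv ** 2 - alpha0)


# Exercise 7.3.12: 20 customers with average service time 3.8.
alpha1, beta1 = gamma_posterior(0.04, 0.2, 20, 20 * 3.8)
print(round(alpha1, 2), round(beta1, 2))

# Exercise 7.3.13: prior coefficient of variation 2 means alpha0 = 1/4.
print(min_sample_size(0.25, 0.1))
```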

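    10. Exercise 7.3.15
    Solution: Part (a): We need to evaluate the integral $\frac{\beta^{\alpha}}{\Gamma(\alpha)} \int_0^{\infty} \theta^{-(\alpha+1)} \exp(-\beta/\theta)\,d\theta$. Let $\phi = 1/\theta$. Then $d\theta/d\phi = -\phi^{-2}$, so $d\theta = -\phi^{-2}\,d\phi$. Note that when $\theta \to 0$, $\phi \to \infty$, and when $\theta \to \infty$, $\phi \to 0$. Since $\theta^{-(\alpha+1)} = \phi^{\alpha+1}$ and the minus sign reverses the limits of integration, the integral becomes

    \[
    \frac{\beta^{\alpha}}{\Gamma(\alpha)} \int_0^{\infty} \phi^{\alpha-1} \exp(-\beta\phi)\,d\phi = 1,
    \]

    since this is the integral of the density of a gamma distribution with parameters $\alpha > 0$ and $\beta > 0$.

    Part (b): With $f_n(x \mid \theta) \propto \theta^{-n/2} \exp\left(-\sum_{i=1}^{n}(x_i - \mu)^2/(2\theta)\right)$, the posterior distribution for $\theta$ is proportional to

    \[
    f_n(x \mid \theta)\,\xi(\theta) \propto \theta^{-(\alpha + (n/2) + 1)} \exp\left(-\theta^{-1}\Big(\beta + 0.5\sum_{i=1}^{n}(x_i - \mu)^2\Big)\right).
    \]

    This is an inverse-gamma distribution with revised parameters $\alpha + (n/2)$ and $\beta + 0.5\sum_{i=1}^{n}(x_i - \mu)^2$.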
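
    The parameter update in part (b) is easy to express in code. This is a minimal Python sketch; the function name and the three data values are hypothetical and serve only to illustrate the arithmetic.

```python
def inverse_gamma_posterior(alpha0, beta0, observations, mu):
    """Update inverse-gamma(alpha0, beta0) for a normal variance with known mean mu."""
    n = len(observations)
    sum_sq = sum((x - mu) ** 2 for x in observations)
    return alpha0 + n / 2, beta0 + 0.5 * sum_sq


# Hypothetical data (not from the exercise): three observations with known mean 2.
alpha1, beta1 = inverse_gamma_posterior(alpha0=3.0, beta0=2.0,
                                        observations=[1.0, 2.0, 3.0], mu=2.0)
print(alpha1, beta1)
```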

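    11. Exercise 7.3.17
    Solution: The prior density is $\xi(\theta) \propto \theta^{-4}$ for $\theta \ge 4$, and 0 otherwise. The likelihood is $f_n(x \mid \theta) \propto \theta^{-3}$ for $\max\{x_1, x_2, x_3\} \le \theta$. Multiplying the prior times the likelihood gives $\xi(\theta \mid x) \propto \theta^{-7}$ for $\theta \ge \max\{x_1, x_2, x_3\} = 8$ and $\theta \ge 4$. Therefore, $\xi(\theta \mid x) \propto \theta^{-7}$ for $\theta \ge 8$, and 0 elsewhere. To find the normalizing constant, compute $\int_8^{\infty} \theta^{-7}\,d\theta = 6^{-1}\,8^{-6}$, and thus $\xi(\theta \mid x) = (6 \cdot 8^6)\,\theta^{-7}$ for $\theta \ge 8$, and 0 elsewhere.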
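
    The bookkeeping in Exercise 7.3.17 (add the exponents, take the larger lower bound, then normalize) can be sketched as follows. This is a minimal Python illustration with a hypothetical function name; it takes only the sample size and sample maximum, since those are the only features of the data the solution uses.

```python
def power_law_posterior(prior_exponent, prior_lower, n, sample_max):
    """Posterior for a prior proportional to theta**(-prior_exponent) on theta >= prior_lower,
    combined with n Uniform[0, theta] observations whose maximum is sample_max.

    Returns (exponent, lower, constant) so that the posterior density is
    constant * theta**(-exponent) for theta >= lower.
    """
    exponent = prior_exponent + n
    lower = max(prior_lower, sample_max)
    # Integral of theta**(-exponent) over [lower, inf) is lower**(1 - exponent) / (exponent - 1).
    constant = (exponent - 1) * lower ** (exponent - 1)
    return exponent, lower, constant


exponent, lower, constant = power_law_posterior(prior_exponent=4, prior_lower=4,
                                                n=3, sample_max=8)
print(exponent, lower, constant)
```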


    12. Exercise 7.3.18
    Solution: The Pareto distribution as a prior for $\theta$ has density $\xi(\theta) \propto \theta^{-(\alpha+1)}$ for $\theta \ge x_0$, and 0 elsewhere, where $\alpha > 0$ and $x_0 > 0$. The likelihood for $n$ conditionally i.i.d. Uniform$[0, \theta]$ random variables is $f_n(x \mid \theta) \propto \theta^{-n}$ for $\theta \ge \max\{x_1, \ldots, x_n\}$. Therefore, $\xi(\theta \mid x) \propto \theta^{-(\alpha+n+1)}$ for $\theta \ge \max\{x_0, x_1, \ldots, x_n\}$. Up to the normalizing constant, this is the density of a Pareto distribution with revised parameters $\alpha' = \alpha + n$ and $x_0' = \max\{x_0, x_1, \ldots, x_n\}$.

    13. Exercise 7.3.21
    Solution: The likelihood of $n$ conditionally i.i.d. random variables from an exponential distribution is $f_n(x \mid \theta) = \theta^n \exp(-\theta \sum_{i=1}^{n} x_i)$. Combining the likelihood with the improper prior $\xi(\theta) \propto \theta^{-1}$, we obtain $\xi(\theta \mid x) \propto \theta^{n-1} \exp(-\theta \sum_{i=1}^{n} x_i)$. This is proportional to a gamma density for $\theta$ with parameters $n$ and $\sum_{i=1}^{n} x_i = n\bar{x}$; therefore, $E(\theta \mid x) = n/(n\bar{x}) = 1/\bar{x}$.

    14. Exercise 7.3.23
    Note: Although minimal restrictions are typically needed for the functions $b(x)$, $c(\theta)$ and $d(x)$ entering the definition of the exponential family p.d.f. or p.f. $f(x \mid \theta)$, the function $a(\theta)$ cannot be arbitrary as stated in the exercise. This function provides the normalizing term for $f(x \mid \theta)$; in particular, if $X$ is continuous with values in sample space $S$, from $1 = \int_S f(x \mid \theta)\,dx$, we obtain that $a(\theta) = \left\{\int_S b(x) \exp(c(\theta)d(x))\,dx\right\}^{-1}$, that is, the function $a(\theta)$ is specified from the functions $b(x)$, $c(\theta)$ and $d(x)$. Refer to exercise 7.3.24 for a list of distributions that belong to the exponential family of distributions, and to exercises 7.3.25 and 7.3.26 for counterexamples.

    Solution: Part (a): The product of $\xi_{\alpha,\beta}(\theta)$ and $f(x \mid \theta)$ can be written proportional to $a(\theta)^{\alpha+1} \exp\{c(\theta)(\beta + d(x))\}$, which is of the form $\xi_{\alpha+1,\,\beta+d(x)}(\theta)$.

    Part (b): With $n$ random variables assumed to arise conditionally i.i.d. from an exponential family with p.d.f. $f(x \mid \theta)$, the likelihood is $f_n(x \mid \theta) \propto (a(\theta))^n \exp\{c(\theta)\sum_{i=1}^{n} d(x_i)\}$. Combining the likelihood with the prior $\xi_{\alpha_0,\beta_0}(\theta)$, the posterior density for $\theta$ is proportional to $(a(\theta))^{n+\alpha_0} \exp\{c(\theta)(\beta_0 + \sum_{i=1}^{n} d(x_i))\}$, and therefore the prior hyperparameters $\alpha_0$ and $\beta_0$ are updated to $\alpha_0 + n$ and $\beta_0 + \sum_{i=1}^{n} d(x_i)$, respectively.