
Upload: buihanh

Post on 10-May-2018


Page 1: Lecture 5 and 6: Bootstrap - TU Delft (dutiosb.twi.tudelft.nl/~cai/AS2015/slides-bootstrap.pdf)

Introduction Two stages approach Consistency Confidence Interval Assignments

Lecture 5 and 6: Bootstrap

Applied Statistics 2015

1 / 37

The bootstrap, first introduced in Efron (1979), is a resampling method often used to find

1 standard errors for estimators

2 confidence intervals for unknown parameters

3 p-values for test statistics under a null hypothesis

2 / 37

An example

Let $X_1, \dots, X_n$ be a random sample from a distribution $F$ with mean $\theta$. Let $\bar{X}_n$ be an estimator of $\theta$.

Question: What is the sampling distribution of $\bar{X}_n$?

If we knew that $F = N(\theta, 1)$, then $\bar{X}_n \overset{d}{=} N(\theta, 1/n)$.

If we don't know the distribution but could draw many samples of size $n$ from $F$, then we would have $\{\bar{X}_{n1}, \dots, \bar{X}_{nm}\}$, which can be considered a random sample of $\bar{X}_n$. The empirical distribution function based on this sample is then a good approximation of the distribution of $\bar{X}_n$.

What if we don't know the distribution and can only afford one random sample?

3 / 37


The bootstrap idea

The sample stands in for the population, and we resample from the sample many times.

4 / 37

[Figure 14.4, three panels. (a) POPULATION with unknown mean $\mu$: many SRSs of size $n$, each giving an $\bar{x}$, form the sampling distribution. (b) NORMAL POPULATION with unknown mean $\mu$: theory gives the sampling distribution, with standard deviation $\sigma/\sqrt{n}$. (c) POPULATION with unknown mean $\mu$: one SRS of size $n$, then many resamples of size $n$, each giving an $\bar{x}$, form the bootstrap distribution.]

FIGURE 14.4 (a) The idea of the sampling distribution of the sample mean $\bar{x}$: take very many samples, collect the $\bar{x}$-values from each, and look at the distribution of these values. (b) The theory shortcut: if we know that the population values follow a normal distribution, theory tells us that the sampling distribution of $\bar{x}$ is also normal. (c) The bootstrap idea: when theory fails and we can afford only one sample, that sample stands in for the population, and the distribution of $\bar{x}$ in many resamples stands in for the sampling distribution.

5 / 37

Stage 1: A functional point of view

Let $X_1, \dots, X_n$ be i.i.d. from $F$ and let $T_n = g(X_1, \dots, X_n)$ be an estimator of $\theta$. Often, $\theta$ is a functional of $F$. Some examples follow.

Quantities

1 $\theta = E(X_1) = \int x \, dF(x)$

2 $\theta = \mathrm{med}(X_1) = F^{-1}(1/2)$

3 $\theta = \sup_{x \in \mathbb{R}} |F(x) - F_0(x)|$, for a given cdf $F_0$.

Estimators

1 $\hat\theta_n = \frac{1}{n} \sum_{i=1}^{n} X_i = \int x \, dF_n(x)$

2 $\hat\theta_n = X_{(m)} = F_n^{-1}(1/2)$, where $n = 2m - 1$, for convenience.

3 $\hat\theta_n = \sup_{x \in \mathbb{R}} |F_n(x) - F_0(x)|$

Put $\theta = h(F)$. The estimator is obtained by plugging $F_n$ into the functional $h$: $\hat\theta_n = h(F_n)$.
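As a rough illustration of the plug-in principle, the following Python sketch evaluates the three example functionals at the empirical distribution $F_n$. The function names and the toy data are assumptions made for illustration, and $F_0$ is taken to be the standard normal cdf.

```python
from statistics import NormalDist


def plug_in_mean(xs):
    """h(F_n) for h(F) = integral of x dF(x): the sample mean."""
    return sum(xs) / len(xs)


def plug_in_median(xs):
    """h(F_n) for h(F) = F^{-1}(1/2), assuming odd n = 2m - 1: returns X_(m)."""
    n = len(xs)
    if n % 2 == 0:
        raise ValueError("this sketch assumes an odd sample size")
    return sorted(xs)[n // 2]


def plug_in_sup_distance(xs, cdf0):
    """h(F_n) for h(F) = sup_x |F(x) - F0(x)|, with F0 a continuous cdf."""
    xs_sorted = sorted(xs)
    n = len(xs_sorted)
    best = 0.0
    for i, x in enumerate(xs_sorted):
        f0 = cdf0(x)
        # F_n jumps at x: compare F0 with F_n just before and at the jump.
        best = max(best, abs((i + 1) / n - f0), abs(i / n - f0))
    return best


data = [0.3, -1.2, 0.8, 2.1, -0.4]  # toy data (assumption)
print(plug_in_mean(data), plug_in_median(data))
print(plug_in_sup_distance(data, NormalDist().cdf))
```

Each function takes only the sample as input, which reflects that $h(F_n)$ depends on the data alone.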

6 / 37


Stage 1: A functional point of view

Measures of performance of $\hat\theta_n$

(A) $\lambda_n(F) = P_F\big(\sqrt{n}(\hat\theta_n - h(F)) \le a\big)$

(B) $\lambda_n(F) = P_F\left(\frac{\sqrt{n}(\hat\theta_n - h(F))}{\tau(F)} \le a\right)$, for some scaling factor $\tau(F)$.

(C) $\lambda_n(F) = E_F(\hat\theta_n - h(F))$

(D) $\lambda_n(F) = \mathrm{Var}_F(\hat\theta_n - h(F))$

The task is to develop a procedure for estimating $\lambda_n(F)$. The idea is similar to how we derive the estimator of $\theta$: the estimator of $\lambda_n(F)$ is obtained by plugging in $F_n$. For instance,

(A) $\lambda_n(F_n) = P_{F_n}\big(\sqrt{n}(\hat\theta_n^* - h(F_n)) \le a\big)$,

where $\hat\theta_n^* = T(X_1^*, \dots, X_n^*)$ and the $X_i^*$ are i.i.d. from $F_n$. Here, $h(F_n) = \hat\theta_n$ is a parameter in bootstrap space.

7 / 37


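An example: estimating the error distribution of the mean

Let $\theta = h(F) = E_F(X)$ and $\hat\theta_n = h(F_n) = \bar{X}_n$. Consider a very simple scenario, $n = 2$. Suppose that the realization of $(X_1, X_2)$ is $(c, d)$. How do we compute the estimated cdf of the error, $\lambda_n(F_n) = P_{F_n}\big(\sqrt{n}(\hat\theta_n^* - h(F_n)) \le a\big)$?

Let $(X_1^*, X_2^*)$ be a random sample from $F_n$. Then

$P(X_i^* = c) = P(X_i^* = d) = 1/2$, for $i = 1, 2$.

Note that $h(F_n) = \frac{c+d}{2}$ and $\hat\theta_n^* = \frac{X_1^* + X_2^*}{2}$.

Prob.                                1/4          1/2                   1/4
$(X_1^*, X_2^*)$                     (c, c)       (c, d) or (d, c)      (d, d)
$\hat\theta_n^*$                     c            (c + d)/2             d
$\hat\theta_n^* - \hat\theta_n$      (c - d)/2    0                     (d - c)/2

The first and last rows give the distribution of $\hat\theta_n^* - \hat\theta_n$; scaling its values by $\sqrt{n} = \sqrt{2}$ gives the cdf $\lambda_n(F_n)$.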
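For very small $n$ the distribution under $F_n$ can be enumerated exactly, as in the table for $n = 2$. The sketch below does this by brute force over all $n^n$ ordered resamples; the function name and the values chosen for $c$ and $d$ are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import sqrt


def exact_bootstrap_cdf(xs, a):
    """Exact lambda_n(F_n) = P_{F_n}(sqrt(n) (mean* - mean) <= a) by enumeration."""
    n = len(xs)
    theta_hat = sum(xs) / n
    hits = 0
    # Each of the n^n ordered resamples has probability n^(-n) under F_n.
    for resample in product(xs, repeat=n):
        theta_star = sum(resample) / n
        if sqrt(n) * (theta_star - theta_hat) <= a:
            hits += 1
    return Fraction(hits, n ** n)


c, d = 1.0, 3.0  # hypothetical realisation of (X1, X2)
print(exact_bootstrap_cdf([c, d], a=-0.5))  # counts only the (c, c) resample
print(exact_bootstrap_cdf([c, d], a=0.0))   # also counts (c, d) and (d, c)
```

The loop visits $n^n$ resamples, which is why exact evaluation is only practical for very small samples.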

8 / 37

Stage 2: Resampling

In theory, $\lambda_n(F_n)$ is known to us because the randomness is governed completely by $F_n$, which depends only on the data $(X_1, \dots, X_n)$.

In practice, the exact calculation of $\lambda_n(F_n)$ is not feasible, except for small values of $n$.

Efron (1979) provides a way to estimate $\lambda_n(F_n)$ by sampling from $F_n$.

9 / 37

Stage 2: Sampling from $F_n$

We aim to estimate

$\lambda_n(F_n) = P_{F_n}\big(\sqrt{n}(T(X_1^*, \dots, X_n^*) - \hat\theta_n) \le a\big) = P_{F_n}\big(\sqrt{n}(\hat\theta_n^* - \hat\theta_n) \le a\big).$

Draw $B$ samples, each of size $n$, from $F_n$.

Note that $F_n$ is a discrete uniform df, assigning probability mass $1/n$ to each $X_i$.

for i in 1:B
    draw X*_1, ..., X*_n with replacement from {X_1, ..., X_n}
    compute theta*_i = T(X*_1, ..., X*_n)

The resulting vector $\{\hat\theta^*_{1n}, \dots, \hat\theta^*_{Bn}\}$ is a random sample of $\hat\theta_n^*$.

Now $\lambda_n(F_n)$ can be approximated by its empirical counterpart:

$\lambda^*_{B,n} = \frac{1}{B} \sum_{i=1}^{B} I\big(\sqrt{n}(\hat\theta^*_{in} - \hat\theta_n) \le a\big)$
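The resampling loop and the empirical counterpart $\lambda^*_{B,n}$ can be sketched in Python as follows. This is a minimal sketch using only the standard library; the function names, the default $B$, the fixed seed and the toy data are assumptions for illustration.

```python
import random
from math import sqrt


def bootstrap_replicates(xs, statistic, B, rng):
    """Return B replicates statistic(X*_1, ..., X*_n), with X*_i i.i.d. from F_n."""
    n = len(xs)
    # rng.choices draws with replacement, i.e. samples from F_n.
    return [statistic(rng.choices(xs, k=n)) for _ in range(B)]


def bootstrap_cdf(xs, statistic, a, B=2000, seed=0):
    """Monte Carlo approximation of P_{F_n}(sqrt(n) (theta* - theta_hat) <= a)."""
    rng = random.Random(seed)
    n = len(xs)
    theta_hat = statistic(xs)
    reps = bootstrap_replicates(xs, statistic, B, rng)
    return sum(sqrt(n) * (t - theta_hat) <= a for t in reps) / B


def mean(xs):
    return sum(xs) / len(xs)


data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7]  # toy data (assumption)
print(bootstrap_cdf(data, mean, a=0.0))
```

Fixing the seed makes the approximation reproducible; a different seed gives a slightly different value, and the difference shrinks as $B$ grows.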

10 / 37

Stage 2: Sampling from $F_n$

Similarly, $\lambda_n(F_n) = E_{F_n}\big(T(X_1^*, \dots, X_n^*) - \hat\theta_n\big)$ can be estimated by

$\lambda^*_{B,n} = \frac{1}{B} \sum_{i=1}^{B} \big(\hat\theta^*_{in} - \hat\theta_n\big).$
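The bias approximation above can be sketched in the same way. As an illustrative choice (not taken from the slides), the statistic here is the plug-in variance with divisor $n$, whose bootstrap bias is known to be negative; the function names and toy data are assumptions.

```python
import random


def bootstrap_bias(xs, statistic, B=2000, seed=0):
    """Approximate E_{F_n}(theta* - theta_hat) by averaging over B resamples."""
    rng = random.Random(seed)
    n = len(xs)
    theta_hat = statistic(xs)
    reps = [statistic(rng.choices(xs, k=n)) for _ in range(B)]
    return sum(t - theta_hat for t in reps) / B


def plug_in_variance(xs):
    """Variance with divisor n, i.e. the variance functional evaluated at F_n."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7]  # toy data (assumption)
print(bootstrap_bias(data, plug_in_variance))
```

For this statistic the exact value of $\lambda_n(F_n)$ is minus the plug-in variance divided by $n$, so the Monte Carlo output can be checked against it.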

11 / 37

Remarks

This is a two-stage procedure, called bootstrapping:

(i) estimating $\lambda_n(F)$ by $\lambda_n(F_n)$;
(ii) approximating $\lambda_n(F_n)$ by $\lambda^*_{B,n}$.

The approximation in the second stage can be made highly accurate by choosing $B$ sufficiently large. Thus $\lambda^*_{B,n}$ is an approximator rather than an estimator of $\lambda_n(F_n)$.

There are many alternatives available for each of the two stages:

- Plugging in a different estimator of $F$ to estimate $\lambda_n(F)$, for instance a parametric estimator when dealing with a parametric family.
- Resampling $m$ observations from $F_n$, with $m = o(n)$. This is called the $m$ out of $n$ bootstrap.

12 / 37

Remarks

There are two sources of random variation in bootstrap distributions or bootstrap samples:

- Choosing an original sample at random from the population (Stage 1).
- Choosing bootstrap resamples at random from the original sample (Stage 2).

Again, the variation due to the first stage dominates.

13 / 37

[Figure 14.12, panels: population distribution (population mean $\mu$) and sampling distribution; bootstrap distributions for Samples 1 to 5, each marked with its sample mean $\bar{x}$; bootstrap distributions 2 to 6 for Sample 1.]

FIGURE 14.12 Five random samples (n = 50) from the same population, with a bootstrap distribution for the sample mean formed by resampling from each of the five samples. At the right are five more bootstrap distributions from the first sample.

14 / 37

Remarks

One may feel that bootstrapping achieves the impossible: providing additional information (about $\lambda_n(F)$) without acquiring more data. This is NOT true. What $\lambda^*_{B,n}$ does is provide a simple and accurate approximation to $\lambda_n(F_n)$ when the latter is too complicated to compute directly.

15 / 37

Does bootstrapping work?

Definition. Let $\rho$ be a metric on the space of cdfs, so that it measures the distance between two cdfs. We say that the bootstrap is consistent under $\rho$ if, as $n \to \infty$,

$\rho(\lambda_n(F), \lambda^*_{B,n}) \overset{P}{\to} 0.$

Kolmogorov metric. Let $F_1$ and $F_2$ be two cdfs. Then

$K(F_1, F_2) = \sup_{x \in \mathbb{R}} |F_1(x) - F_2(x)|.$
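For two empirical cdfs the Kolmogorov metric can be computed exactly, because the supremum is attained at one of the pooled jump points. The sketch below assumes both arguments are finite samples; the helper names are hypothetical.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return the empirical cdf of `sample` as a function of x."""
    s = sorted(sample)
    return lambda x: bisect_right(s, x) / len(s)


def kolmogorov_distance(sample1, sample2):
    """sup_x |F1(x) - F2(x)| for the empirical cdfs of two samples.

    Both cdfs are right-continuous step functions, so it suffices to
    evaluate the difference at every pooled observation.
    """
    f1, f2 = ecdf(sample1), ecdf(sample2)
    return max(abs(f1(x) - f2(x)) for x in set(sample1) | set(sample2))


print(kolmogorov_distance([1, 2, 3, 4], [1, 2, 3, 4]))
print(kolmogorov_distance([1, 2, 3, 4], [3, 4, 5, 6]))
```

In a simulation study one could apply this to replicates of $\sqrt{n}(\hat\theta_n - \theta)$ drawn from a known $F$ and to bootstrap replicates from one sample, to get a feel for how the distance behaves as $n$ grows.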

16 / 37

Consistency

Theorem (functions of mean)

Write $\mu_1 = E(X_1)$. Let $\theta = g(\mu_1)$ and $\hat\theta_n = g(\bar{X}_n)$. Suppose that $E(X_1^2) < \infty$ and $g$ is continuously differentiable at $\mu_1$. Then, as $n \to \infty$,

$K(\lambda_n(F), \lambda^*_{B,n}) \overset{a.s.}{\to} 0,$

where $\lambda_n(F) = P_F\big(\sqrt{n}(g(\bar{X}_n) - \theta) \le x\big)$ and

$\lambda^*_{B,n} = \frac{\#\{i : \sqrt{n}(g(\bar{X}_i^*) - \hat\theta_n) \le x\}}{B},$

with $\bar{X}_i^*$ the average of the $i$-th bootstrap sample, i.e. $\bar{X}_i^* = \frac{1}{n} \sum_{j=1}^{n} X^*_{i,j}$.

17 / 37

Consistency

Theorem (quantiles)

For $0 < p < 1$, let $\theta = F^{-1}(p)$ and $\hat\theta_n = F_n^{-1}(p)$. Suppose that $F$ has a positive derivative at $\theta$. Then, as $n \to \infty$,

$K(\lambda_n(F), \lambda^*_{B,n}) \overset{a.s.}{\to} 0,$

where $\lambda_n(F) = P_F\big(\sqrt{n}(F_n^{-1}(p) - \theta) \le x\big)$ and

$\lambda^*_{B,n} = \frac{\#\{i : \sqrt{n}(X^*_{i,(np)} - \hat\theta_n) \le x\}}{B},$

with $X^*_{i,(np)}$ the $p$-th sample quantile of the $i$-th bootstrap sample $(X^*_{i1}, \dots, X^*_{in})$.

18 / 37

Failure of the bootstrap

Suppose $X_1, \dots, X_n$ is a random sample from a distribution $F$ and that $X_1$ has mean $\mu$ and unit variance. Let $\theta = |\mu|$ and $\hat\theta_n = |\bar{X}_n|$. If $\mu = 0$, then the bootstrap is not consistent for estimating the distribution of $E_n = \sqrt{n}(|\bar{X}_n| - |\mu|)$.

- If $\mu = 0$, then $E_n \overset{d}{\to} |Z|$, where $Z \sim N(0, 1)$.

- It can be shown that

  $\big(\sqrt{n}(\bar{X}_n - \mu), \sqrt{n}(\bar{X}_n^* - \bar{X}_n)\big) \overset{d}{\to} (Z_1, Z_2),$

  where $Z_1$ and $Z_2$ are independent $N(0, 1)$ random variables.

- Hence, for $\mu = 0$,

  $E_n^* = \sqrt{n}(|\bar{X}_n^*| - |\bar{X}_n|) = |\sqrt{n}(\bar{X}_n^* - \bar{X}_n) + \sqrt{n}\bar{X}_n| - |\sqrt{n}\bar{X}_n| \overset{d}{\to} |Z_1 + Z_2| - |Z_1|.$

- $E_n^*$ does not converge in distribution to the absolute value of a standard normal random variable.
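The mismatch between the two limits can be seen in a small simulation. The sketch below draws directly from the two limit laws stated above rather than running a bootstrap; the function name, number of draws and seed are assumptions. The point it illustrates is that $|Z_1 + Z_2| - |Z_1|$ is negative with substantial probability, whereas $|Z|$ never is.

```python
import random


def simulate_limits(num_draws=20000, seed=0):
    """Draw from the two limit laws: |Z| and |Z1 + Z2| - |Z1|."""
    rng = random.Random(seed)
    true_limit, bootstrap_limit = [], []
    for _ in range(num_draws):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        true_limit.append(abs(z1))                      # limit law of E_n
        bootstrap_limit.append(abs(z1 + z2) - abs(z1))  # limit law of E*_n
    return true_limit, bootstrap_limit


true_limit, bootstrap_limit = simulate_limits()
frac_negative = sum(v < 0 for v in bootstrap_limit) / len(bootstrap_limit)
print(min(true_limit) >= 0, frac_negative)
```

A short geometric argument gives $P(|Z_1 + Z_2| < |Z_1|) = \arctan(2)/\pi \approx 0.35$, which the simulated fraction should be close to.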

19 / 37


Failure of the bootstrap

[Figure: density curves of the limit distributions of T and T*.]

Here T should be read as $E_n$ and T* as $E_n^*$.

20 / 37

Failure of the bootstrap

The bootstrap does not always work. The following are a few situations where the simple bootstrap fails to estimate the CDF of $E_n$ consistently:

- $E_n = \sqrt{n}(\bar{X}_n - \mu)$ when $\mathrm{Var}(X_1) = \infty$.
- $E_n = \sqrt{n}(g(\bar{X}_n) - g(\mu))$ and $g'(\mu) = 0$.
- $E_n = \sqrt{n}(g(\bar{X}_n) - g(\mu))$ and $g'(\mu)$ does not exist.
- $E_n = \sqrt{n}(F_n^{-1}(p) - F^{-1}(p))$ and $F'(F^{-1}(p)) = 0$, or $F$ has unequal right and left derivatives at $F^{-1}(p)$.
- The underlying population $F_\theta$ is indexed by a parameter $\theta$, and the support of $F_\theta$ depends on the value of $\theta$.

Some of these problems might be solved by more advanced bootstrap procedures.

21 / 37

There are several ways to construct bootstrap confidence intervals for $\theta$:

- Normal interval
- Pivotal interval (or bootstrap basic interval)
- Percentile interval
- Bias-corrected interval
- Studentised pivotal interval (or bootstrap-t interval)

Let $\{\hat\theta^*_{1n}, \dots, \hat\theta^*_{Bn}\}$ be the bootstrap sample; see Slide 10. Denote the ordered sample by $\hat\theta^*_{(1)} \le \hat\theta^*_{(2)} \le \dots \le \hat\theta^*_{(B)}$.

In this section, $\Phi$ denotes the CDF of $N(0, 1)$ and $\Phi(z_\alpha) = \alpha$.

22 / 37

Normal interval

Assumption: $\frac{\hat\theta_n - \theta}{\sigma_n} \overset{d}{\to} N(0, 1)$, where $\sigma_n^2 = \mathrm{Var}(\hat\theta_n)$.

- If $\sigma_n$ is known, then the $(1 - \alpha)$ CI of $\theta$ is given by $[\hat\theta_n \pm z_{\alpha/2}\sigma_n]$, where $z_{\alpha/2}$ is the $\alpha/2$ quantile of $N(0, 1)$.

- If $\sigma_n$ is unknown, then we need to estimate it by bootstrapping. Note that this corresponds to (D) on Slide 7. That is,

  $\lambda_n(F) = \mathrm{Var}_F(\hat\theta_n) = E_F\big(\hat\theta_n - E_F(\hat\theta_n)\big)^2.$

  Stage 1: $\lambda_n(F_n) = E_{F_n}\big(\hat\theta_n^* - E_{F_n}(\hat\theta_n^*)\big)^2,$

  Stage 2: $\lambda^*_{B,n} = \frac{1}{B} \sum_{i=1}^{B} (\hat\theta^*_{in} - \bar\theta^*_n)^2$, where $\bar\theta^*_n = \frac{1}{B} \sum_{i=1}^{B} \hat\theta^*_{in}$.

- The bootstrap normal interval is given by

  $\hat\theta_n \pm z_{\alpha/2} \sqrt{\frac{1}{B} \sum_{i=1}^{B} (\hat\theta^*_{in} - \bar\theta^*_n)^2}.$
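A minimal Python sketch of the bootstrap normal interval, under the normality assumption stated on this slide. The function name, defaults and toy data are assumptions; the normal quantile comes from `statistics.NormalDist` in the standard library.

```python
import random
from math import sqrt
from statistics import NormalDist


def bootstrap_normal_interval(xs, statistic, alpha=0.05, B=2000, seed=0):
    """theta_hat -/+ z * se, with se the bootstrap standard error (divisor B)."""
    rng = random.Random(seed)
    n = len(xs)
    theta_hat = statistic(xs)
    reps = [statistic(rng.choices(xs, k=n)) for _ in range(B)]
    rep_mean = sum(reps) / B
    se = sqrt(sum((t - rep_mean) ** 2 for t in reps) / B)  # Stage 2 estimate
    z = NormalDist().inv_cdf(1 - alpha / 2)  # equals -z_{alpha/2}
    return theta_hat - z * se, theta_hat + z * se


def mean(xs):
    return sum(xs) / len(xs)


data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7]  # toy data (assumption)
print(bootstrap_normal_interval(data, mean))
```

By construction the interval is symmetric around $\hat\theta_n$, which is appropriate only when the normal approximation is reasonable.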

23 / 37


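Pivotal interval (Bootstrap basic interval)

- Define the pivot $R_n = \hat\theta_n - \theta$. Let $H(r)$ denote the CDF of $R_n$. Let $r_{\alpha/2} = H^{-1}(\alpha/2)$ and $r_{1-\alpha/2} = H^{-1}(1 - \alpha/2)$. Then

  $P(r_{\alpha/2} \le \hat\theta_n - \theta \le r_{1-\alpha/2}) = 1 - \alpha.$

- We need to estimate $H$. This corresponds to (A) on Slide 7. A bootstrap estimator of $H$ is given by

  $\lambda^*_{B,n} = \frac{1}{B} \sum_{i=1}^{B} I(R^*_{in} \le r) =: \hat{H}_{B,n}(r),$

  where $R^*_{in} = \hat\theta^*_{in} - \hat\theta_n$.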

24 / 37


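Pivotal interval

The estimators of $r_{\alpha/2}$ and $r_{1-\alpha/2}$ are given by

$\hat{r}_{\alpha/2} = \hat{H}^{-1}_{B,n}(\alpha/2) = R^*_{(\alpha B/2)}$ and $\hat{r}_{1-\alpha/2} = R^*_{((1-\alpha/2)B)},$

where $R^*_{(i)}$ denotes the $i$-th order statistic of the sample $\{R^*_{in}, i = 1, \dots, B\}$. Thus, the bootstrap basic confidence interval is

$\big[\hat\theta_n - R^*_{((1-\alpha/2)B)}, \; \hat\theta_n - R^*_{(\alpha B/2)}\big]$

or equivalently

$\big[2\hat\theta_n - \hat\theta^*_{((1-\alpha/2)B)}, \; 2\hat\theta_n - \hat\theta^*_{(\alpha B/2)}\big].$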
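The basic interval can be sketched in Python as below. How to pick an order statistic when $\alpha B/2$ is not an integer is not specified on the slide; rounding up and clamping to $[1, B]$ is a convention assumed here, and the function names and toy data are likewise hypothetical.

```python
import math
import random


def order_stat(sorted_vals, k):
    """k-th order statistic (1-indexed); k is rounded up and clamped to [1, B]."""
    B = len(sorted_vals)
    idx = min(max(math.ceil(round(k, 9)), 1), B)
    return sorted_vals[idx - 1]


def bootstrap_basic_interval(xs, statistic, alpha=0.05, B=2000, seed=0):
    """[2 theta_hat - theta*_((1-alpha/2)B), 2 theta_hat - theta*_(alpha B/2)]."""
    rng = random.Random(seed)
    n = len(xs)
    theta_hat = statistic(xs)
    reps = sorted(statistic(rng.choices(xs, k=n)) for _ in range(B))
    lower_star = order_stat(reps, alpha / 2 * B)
    upper_star = order_stat(reps, (1 - alpha / 2) * B)
    return 2 * theta_hat - upper_star, 2 * theta_hat - lower_star


def mean(xs):
    return sum(xs) / len(xs)


data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7]  # toy data (assumption)
print(bootstrap_basic_interval(data, mean))
```

Note that the upper bootstrap quantile determines the lower endpoint and vice versa, which is the reflection around $\hat\theta_n$ implied by the pivot.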

25 / 37

Percentile interval

Assumption: $R_n = \hat\theta_n - \theta$ has a symmetric distribution around 0.

- Because of the symmetry, $r_{\alpha/2} = -r_{1-\alpha/2}$. Hence

  $P(\hat\theta_n + r_{\alpha/2} \le \theta \le \hat\theta_n + r_{1-\alpha/2}) = 1 - \alpha.$

- Plugging in the bootstrap estimators of $r_{\alpha/2}$ and $r_{1-\alpha/2}$, the percentile interval is given by

  $\big[\hat\theta_n + R^*_{(\alpha B/2)}, \; \hat\theta_n + R^*_{((1-\alpha/2)B)}\big]$

  or equivalently

  $\big[\hat\theta^*_{(\alpha B/2)}, \; \hat\theta^*_{((1-\alpha/2)B)}\big].$

- Homework: The assumption can be relaxed as follows. There exists an unknown increasing transformation $h$ such that $h(\hat\theta_n) - h(\theta)$ has a symmetric distribution around 0.
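A sketch of the percentile interval in Python: it simply reads off two order statistics of the sorted bootstrap replicates. The rounding convention for non-integer indices (round up, clamp to $[1, B]$), the function names and the toy data are assumptions.

```python
import math
import random


def bootstrap_percentile_interval(xs, statistic, alpha=0.05, B=2000, seed=0):
    """[theta*_(alpha B/2), theta*_((1-alpha/2)B)] from B sorted replicates."""
    rng = random.Random(seed)
    n = len(xs)
    reps = sorted(statistic(rng.choices(xs, k=n)) for _ in range(B))

    def order_stat(k):
        # 1-indexed order statistic; k is rounded up and clamped to [1, B].
        return reps[min(max(math.ceil(round(k, 9)), 1), B) - 1]

    return order_stat(alpha / 2 * B), order_stat((1 - alpha / 2) * B)


def mean(xs):
    return sum(xs) / len(xs)


data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7]  # toy data (assumption)
print(bootstrap_percentile_interval(data, mean))
```

Unlike the normal interval, the endpoints need not be symmetric around $\hat\theta_n$, and they always lie within the range of the replicates.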

26 / 37

The BC (bias-corrected) percentile method

Assumption: there exists an unknown increasing transformation $h$ such that $h(\hat\theta_n) - h(\theta)$ is (asymptotically) $N(w, 1)$ distributed. The constant $w$ is unknown, so we first estimate it.

$P(\hat\theta_n \le \theta) = P(h(\hat\theta_n) \le h(\theta)) = P(h(\hat\theta_n) - h(\theta) - w \le -w) = \Phi(-w).$

Then $w = -\Phi^{-1}(\beta) = -z_\beta$, where $\beta = P(\hat\theta_n \le \theta)$. $\beta$ can be estimated by its bootstrap analogue

$\frac{1}{B} \sum_{i=1}^{B} I(\hat\theta^*_{in} \le \hat\theta_n).$

Thus

$\hat{w} = -\Phi^{-1}\left(\frac{1}{B} \sum_{i=1}^{B} I(\hat\theta^*_{in} \le \hat\theta_n)\right).$

27 / 37

The BC (bias-corrected) percentile method

From the normality, we have

$P(z_{\alpha/2} \le h(\hat\theta_n) - h(\theta) - w \le z_{1-\alpha/2}) = 1 - \alpha,$

equivalently

$P\big(h^{-1}(h(\hat\theta_n) - w + z_{\alpha/2}) \le \theta \le h^{-1}(h(\hat\theta_n) - w - z_{\alpha/2})\big) = 1 - \alpha.$

Denote the lower and upper bounds of $\theta$ by $\theta_l$ and $\theta_u$, respectively. We need to estimate the bounds because $h$ is unknown.

$P_{F_n}(\hat\theta_n^* \le \theta_l) = P_{F_n}\big(h(\hat\theta_n^*) \le h(\theta_l)\big)$

$= P_{F_n}\big(h(\hat\theta_n^*) \le h(\hat\theta_n) - w + z_{\alpha/2}\big)$

$= P_{F_n}\big(h(\hat\theta_n^*) - h(\hat\theta_n) - w \le z_{\alpha/2} - 2w\big)$ (bootstrap world)

$\approx P_F\big(h(\hat\theta_n) - h(\theta) - w \le z_{\alpha/2} - 2w\big)$ (real world)

$= \Phi(z_{\alpha/2} - 2w).$

28 / 37


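The BC (bias-corrected) percentile method

We have $P_{F_n}(\hat\theta_n^* \le \theta_l) \approx \Phi(z_{\alpha/2} - 2w)$. This means that $\theta_l$ is approximately the quantile of $\hat\theta_n^*$ at probability level $\Phi(z_{\alpha/2} - 2w)$. Hence it can be estimated by the corresponding empirical quantile of $\hat\theta_n^*$:

$\hat\theta^*_{(B\Phi(z_{\alpha/2} - 2\hat{w}))}.$

In the same manner, we obtain the estimator of $\theta_u$. The BC bootstrap interval of $\theta$ is given by

$\big[\hat\theta^*_{(B\Phi(z_{\alpha/2} - 2\hat{w}))}, \; \hat\theta^*_{(B\Phi(z_{1-\alpha/2} - 2\hat{w}))}\big],$

where $\hat{w}$ is as previously defined.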
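The BC interval can be sketched in Python as follows. Here $\hat{w}$ is obtained by solving $\Phi(-w) = \beta$ with $\beta$ replaced by the bootstrap proportion of replicates not exceeding $\hat\theta_n$, i.e. $\hat{w} = -\Phi^{-1}(\hat\beta)$. The clamping of $\hat\beta$ away from 0 and 1, the rounding of quantile indices, the function names and the toy data are all assumptions made for this sketch.

```python
import math
import random
from statistics import NormalDist


def bc_interval(xs, statistic, alpha=0.05, B=2000, seed=0):
    """Bias-corrected percentile interval from B bootstrap replicates."""
    rng = random.Random(seed)
    n = len(xs)
    std_normal = NormalDist()
    theta_hat = statistic(xs)
    reps = sorted(statistic(rng.choices(xs, k=n)) for _ in range(B))
    # beta_hat: bootstrap analogue of P(theta_hat <= theta).
    beta_hat = sum(t <= theta_hat for t in reps) / B
    # Clamp so that inv_cdf stays finite (a choice made for this sketch).
    beta_hat = min(max(beta_hat, 1 / (B + 1)), B / (B + 1))
    w_hat = -std_normal.inv_cdf(beta_hat)  # solves Phi(-w) = beta
    z_lo = std_normal.inv_cdf(alpha / 2)
    z_hi = std_normal.inv_cdf(1 - alpha / 2)

    def quantile(level):
        # Order statistic with index B * level, rounded up and clamped to [1, B].
        return reps[min(max(math.ceil(B * level), 1), B) - 1]

    return (quantile(std_normal.cdf(z_lo - 2 * w_hat)),
            quantile(std_normal.cdf(z_hi - 2 * w_hat)))


def mean(xs):
    return sum(xs) / len(xs)


data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7]  # toy data (assumption)
print(bc_interval(data, mean))
```

When $\hat\beta = 1/2$ the correction vanishes ($\hat{w} = 0$) and the interval reduces to the percentile interval.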

The BC (bias-corrected) percentile method

What if we only know that $h(\hat\theta_n) - h(\theta)$ is (asymptotically) $N(w, \sigma^2)$ distributed?

If $\sigma$ does not depend on $h(\theta)$, then the bias-corrected percentile method can still be used. Why?

If $\sigma$ depends on $h(\theta)$, then we should use the BCa (accelerated bias-corrected bootstrap percentile) method. We do not discuss the BCa method here.

The studentised interval

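Consider a studentised pivot $R_n = \dfrac{\hat\theta_n - \theta}{\sigma_n}$, where $\sigma_n^2 = \mathrm{Var}(\hat\theta_n)$. Let $r_{\alpha/2}$ and $r_{1-\alpha/2}$ be the corresponding quantiles of $R_n$. Then,

$$P\big(\sigma_n r_{\alpha/2} \le \hat\theta_n - \theta \le \sigma_n r_{1-\alpha/2}\big) = 1 - \alpha.$$

Suppose $\sigma_n$ is known, or we are able to find a consistent estimator of $\sigma_n$: $\hat\sigma_n = \sigma(X_1, \dots, X_n)$ with a known function $\sigma$.

We estimate $r_{\alpha/2}$ and $r_{1-\alpha/2}$ by bootstrapping. The bootstrap-t interval is given by

$$\Big[\hat\theta_n - \hat\sigma_n R^*_{((1-\alpha/2)B)},\ \hat\theta_n - \hat\sigma_n R^*_{(\alpha B/2)}\Big],$$

where $R^*_{(j)}$ denotes the $j$-th order statistic of the sample

$$\Big\{R^*_{in} = \frac{\hat\theta^*_{in} - \hat\theta_n}{\hat\sigma^*_{in}},\ i = 1, \dots, B\Big\} \quad \text{and} \quad \hat\sigma^*_{in} = \sigma(X^*_{i1}, \dots, X^*_{in}).$$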
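
As an illustration of the bootstrap-t interval with a known function $\sigma$, the sketch below takes $\hat\theta_n$ to be the sample mean and $\sigma(x_1, \dots, x_n) = s/\sqrt{n}$ with $s$ the sample standard deviation. This choice of estimator, the names `se_mean` and `bootstrap_t_interval`, the rank rounding, and the simulated toy data are assumptions made for the example and do not come from the slides.

```python
import math
import random
from statistics import mean, stdev


def se_mean(x):
    """Known function sigma(.) for the sample mean: s / sqrt(n)."""
    return stdev(x) / math.sqrt(len(x))


def bootstrap_t_interval(data, alpha=0.05, B=2000, seed=0):
    """Bootstrap-t interval for the mean (sketch)."""
    rng = random.Random(seed)
    n = len(data)
    theta_hat = mean(data)
    sigma_hat = se_mean(data)

    r_star = []
    for _ in range(B):
        xs = rng.choices(data, k=n)        # one bootstrap sample
        s_star = se_mean(xs)               # sigma evaluated on that sample
        if s_star > 0:                     # skip degenerate resamples
            r_star.append((mean(xs) - theta_hat) / s_star)
    r_star.sort()

    m = len(r_star)
    j_lo = max(math.ceil(m * alpha / 2), 1)        # rank alpha*B/2
    j_hi = min(math.ceil(m * (1 - alpha / 2)), m)  # rank (1 - alpha/2)*B
    # The upper quantile of R* gives the lower end of the interval.
    return (theta_hat - sigma_hat * r_star[j_hi - 1],
            theta_hat - sigma_hat * r_star[j_lo - 1])


# Toy data (simulated, not from the lecture): 40 draws from Exp(1).
toy_rng = random.Random(1)
toy_sample = [toy_rng.expovariate(1.0) for _ in range(40)]
t_lower, t_upper = bootstrap_t_interval(toy_sample)
print(f"Bootstrap-t 95% interval for the mean: [{t_lower:.3f}, {t_upper:.3f}]")
```

For right-skewed data such as this toy sample, the interval is typically asymmetric around $\hat\theta_n$, because the quantiles of $R^*$ are not symmetric around zero.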

The studentised interval

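If $\sigma_n$ is unknown, we estimate it by

$$\hat\sigma_{B,n} = \sqrt{\frac{1}{B} \sum_{i=1}^{B} \big(\hat\theta^*_{in} - \bar\theta^*_n\big)^2};$$

see the slide on normal intervals.

Note that the way of estimating the quantiles of $R_n$ differs from that used when $\sigma_n$ is known. We consider

$$R^*_{in} = \frac{\hat\theta^*_{in} - \hat\theta_n}{\mathrm{se}^*_{in}},$$

where $\mathrm{se}^*_{in}$ needs to be computed for each bootstrap sample, which might require a second bootstrap within each bootstrap. The resulting CI is

$$\Big[\hat\theta_n - \hat\sigma_{B,n} R^*_{((1-\alpha/2)B)},\ \hat\theta_n - \hat\sigma_{B,n} R^*_{(\alpha B/2)}\Big],$$

with $R^*_{(j)}$ the $j$-th order statistic of $\{R^*_{in},\ i = 1, \dots, B\}$.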
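
The "second bootstrap within each bootstrap" can be sketched as follows. This is a minimal illustration, assuming the sample median as the estimator (a statistic without a simple standard-error formula). The names `boot_se` and `nested_bootstrap_t`, the replication counts `B` and `B_inner`, the rank rounding, and the simulated toy data are hypothetical choices for the example.

```python
import math
import random
from statistics import median, pstdev


def boot_se(sample, estimator, B, rng):
    """Bootstrap standard error: sd (divisor B) of B replicates."""
    n = len(sample)
    return pstdev([estimator(rng.choices(sample, k=n)) for _ in range(B)])


def nested_bootstrap_t(data, estimator, alpha=0.05, B=400, B_inner=30, seed=0):
    """Bootstrap-t interval with se* obtained from an inner bootstrap (sketch)."""
    rng = random.Random(seed)
    n = len(data)
    theta_hat = estimator(data)

    theta_star, r_star = [], []
    for _ in range(B):
        xs = rng.choices(data, k=n)                      # outer bootstrap sample
        t_star = estimator(xs)
        theta_star.append(t_star)
        se_star = boot_se(xs, estimator, B_inner, rng)   # inner bootstrap
        if se_star > 0:                                  # skip degenerate cases
            r_star.append((t_star - theta_hat) / se_star)

    sigma_B = pstdev(theta_star)   # sigma_{B,n}: sd of the outer replicates
    r_star.sort()
    m = len(r_star)
    j_lo = max(math.ceil(m * alpha / 2), 1)
    j_hi = min(math.ceil(m * (1 - alpha / 2)), m)
    return (theta_hat - sigma_B * r_star[j_hi - 1],
            theta_hat - sigma_B * r_star[j_lo - 1])


# Toy data (simulated, not from the lecture): 30 draws from Exp(1).
toy_rng = random.Random(2)
toy_sample = [toy_rng.expovariate(1.0) for _ in range(30)]
n_lower, n_upper = nested_bootstrap_t(toy_sample, median)
print(f"Nested bootstrap-t 95% interval for the median: [{n_lower:.3f}, {n_upper:.3f}]")
```

The cost is of order `B * B_inner` evaluations of the estimator, which is why the inner replication count is usually kept much smaller than the outer one.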

Accuracy

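A confidence interval $\mathrm{CI}_{1-\alpha}$ with nominal level $1 - \alpha$ is said to be first-order accurate if $P(\theta \in \mathrm{CI}_{1-\alpha}) = 1 - \alpha + O(n^{-1/2})$, and second-order accurate if $P(\theta \in \mathrm{CI}_{1-\alpha}) = 1 - \alpha + O(n^{-1})$.

Under regularity conditions ("when the bootstrap works"), the normal interval, the basic bootstrap interval, the percentile interval, and the BC interval are first-order accurate. The BCa and bootstrap-t intervals are second-order accurate.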

Group Presentation (March 9)

Group 8

Consider 31 measurements of polished window strength data for a glass airplane window. In reliability tests, researchers often rely on parametric assumptions to characterize observed lifetimes. Please implement a composite goodness-of-fit (GoF) test to see whether a Weibull distribution is appropriate. Use the Cramér-von Mises test. The data are as follows.

18.830, 20.800, 21.657, 23.030, 23.230, 24.050,

24.321, 25.500, 25.520, 25.800, 26.690, 26.770,

26.780, 27.050, 27.670, 29.900, 31.110, 33.200,

33.730, 33.760, 33.890, 34.760, 35.750, 35.910,

36.980, 37.080, 37.090, 39.580, 44.045, 45.290, 45.381

The Weibull distribution function is given by

$$F_{\beta,\gamma}(x) = \begin{cases} 0, & x < 0, \\ 1 - \exp\big(-[x/\gamma]^{\beta}\big), & x \ge 0. \end{cases}$$

This is a scale-shape family.

What is your test statistic?

Implement the parametric bootstrap to compute the p-value.

Group Presentation (March 9)

Group 9

Why is this method called 'bootstrap'? Please do some literature review.

In a controlled clinical trial, participants were randomly assigned to two groups: (i) Aspirin and (ii) Placebo, where the aspirin group took 325 mg of aspirin every second day. At the end of the trial, the number of participants who had suffered a myocardial infarction (MyoInf) was assessed. The counts are given in the following table:

          MyoInf   No MyoInf   Total
Aspirin      104       10933   11037
Placebo      189       10845   11034

The risk ratio (RR), defined as the ratio of the proportions of cases (risks) in the two groups, is a popular measure for assessing results in clinical trials. From the table,

$$RR = \frac{R_a}{R_p} = \frac{104/11037}{189/11034} = 0.55.$$

Construct a bootstrap estimate for the variability of RR.
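
One possible starting point, offered as a sketch rather than a model answer: resample the 0/1 outcomes within each group independently and recompute RR each time. Resampling $n$ binary outcomes with replacement is the same as drawing $n$ Bernoulli variables with the observed proportion, which is what the code does. The group sizes are taken as the sums of the two count columns (104 + 10933 and 189 + 10845). The function name `bootstrap_rr` and the small replication count `B=300` (chosen to keep pure-Python run time short) are assumptions of this example.

```python
import random
from statistics import mean, pstdev


def bootstrap_rr(cases_a, n_a, cases_p, n_p, B=300, seed=0):
    """Bootstrap replicates of the risk ratio, resampling within each group."""
    rng = random.Random(seed)
    p_a, p_p = cases_a / n_a, cases_p / n_p
    reps = []
    for _ in range(B):
        # Resampling n binary outcomes with replacement is equivalent to
        # n Bernoulli draws with the observed proportion of cases.
        a_star = sum(rng.random() < p_a for _ in range(n_a))
        p_star = sum(rng.random() < p_p for _ in range(n_p))
        if p_star > 0:                      # RR undefined without placebo cases
            reps.append((a_star / n_a) / (p_star / n_p))
    return reps


# Group sizes are the row sums of the count table.
rr_reps = bootstrap_rr(104, 104 + 10933, 189, 189 + 10845)
rr_se = pstdev(rr_reps)
print(f"mean of RR* = {mean(rr_reps):.3f}, bootstrap sd of RR* = {rr_se:.3f}")
```

The standard deviation of the replicates is one measure of variability; the sorted replicates could equally be used for a percentile interval.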

Group Presentation (March 16)

Group 10

Consider the uniform distribution on $[0, \theta]$. Suppose that $X_1, \dots, X_n$ are a random sample from $U[0, \theta]$. Estimate $\theta$ with $X_{(n)}$, the maximum of the sample. Use the bootstrap to estimate the distribution of $E_n = n(X_{(n)} - \theta)$. Choose $n = 30$.

What is the limit distribution of $E_n$?

Implement non-parametric bootstrapping: draw bootstrap samples from $F_n$. Make a histogram to depict the distribution of $E^*_n = n(X^*_{(n)} - X_{(n)})$. Is it close to the limit distribution of $E_n$?

Implement parametric bootstrapping: draw bootstrap samples from $F_{\hat\theta}$, that is, from $U[0, X_{(n)}]$. Make a histogram to depict the distribution of $E^*_n = n(X^*_{(n)} - X_{(n)})$. Is it close to the limit distribution of $E_n$?

Group Presentation (March 16)

Group 11

Suppose that $X_1, \dots, X_{50}$ are iid from $F$. How do you construct a 95% confidence interval for the mean $E(X_1)$? Consider at least the following methods: the CLT, and the bootstrap confidence intervals we discussed.

5.67 5.04 2.23 2.30 1.32 0.49 0.00 0.11 0.22 1.07 3.90

2.66 0.17 1.01 1.64 3.81 2.01 1.94 0.70 0.01 0.89 0.08

0.67 2.21 1.14 0.51 0.52 0.10 4.44 1.80 0.05 0.06 0.22

0.99 0.21 0.61 1.06 6.56 0.42 1.49 1.10 1.04 3.27 0.73

3.01 5.06 0.36 0.56 1.75 5.87
