![Page 1: Bootstrap Methods: Recent Advances and New …mescal.imag.fr/membres/yves.denneulin/LASCAR/page2/files/Bootstrap...Bootstrap Methods: Recent Advances and New Applications 2007 ICTP](https://reader034.vdocuments.us/reader034/viewer/2022051723/5ab468207f8b9a0f058bc912/html5/thumbnails/1.jpg)
Bootstrap Methods: RecentBootstrap Methods: RecentAdvances and New ApplicationsAdvances and New Applications
2007 ICTP Lectures2007 ICTP LecturesAnestis Anestis AntoniadisAntoniadis
UJF, LJK LabUJF, LJK LabTrieste October, 2007Trieste October, 2007
11
![Page 2: Bootstrap Methods: Recent Advances and New …mescal.imag.fr/membres/yves.denneulin/LASCAR/page2/files/Bootstrap...Bootstrap Methods: Recent Advances and New Applications 2007 ICTP](https://reader034.vdocuments.us/reader034/viewer/2022051723/5ab468207f8b9a0f058bc912/html5/thumbnails/2.jpg)
Bootstrap TopicsBootstrap Topics• Introduction to Bootstrap
• Wide Variety of Applications
• Confidence regions and hypothesis tests
• Examples where bootstrap is not consistent and remedies:(1) infinite variance case for a population mean and(2) extreme values
• Available Software
22
![Page 3: Bootstrap Methods: Recent Advances and New …mescal.imag.fr/membres/yves.denneulin/LASCAR/page2/files/Bootstrap...Bootstrap Methods: Recent Advances and New Applications 2007 ICTP](https://reader034.vdocuments.us/reader034/viewer/2022051723/5ab468207f8b9a0f058bc912/html5/thumbnails/3.jpg)
IntroductionIntroduction
• The bootstrap is a general method for doingstatistical analysis without making strongparametric assumptions.
• Efron’s nonparametric bootstrap, resamplesthe original data.
• It was originally designed to estimate bias andstandard errors for statistical estimates muchlike the jackknife.
33
![Page 4: Bootstrap Methods: Recent Advances and New …mescal.imag.fr/membres/yves.denneulin/LASCAR/page2/files/Bootstrap...Bootstrap Methods: Recent Advances and New Applications 2007 ICTP](https://reader034.vdocuments.us/reader034/viewer/2022051723/5ab468207f8b9a0f058bc912/html5/thumbnails/4.jpg)
Introduction (continued)Introduction (continued)
• The bootstrap is similar to earliertechniques which are also calledresampling methods:– (1) jackknife,– (2) cross-validation,– (3) delta method,– (4) permutation methods, and– (5) subsampling..
44
![Page 5: Bootstrap Methods: Recent Advances and New …mescal.imag.fr/membres/yves.denneulin/LASCAR/page2/files/Bootstrap...Bootstrap Methods: Recent Advances and New Applications 2007 ICTP](https://reader034.vdocuments.us/reader034/viewer/2022051723/5ab468207f8b9a0f058bc912/html5/thumbnails/5.jpg)
Introduction (continued)Introduction (continued)
The technique was extended, modified and
refined to handle a wide variety of problems
including:– (1) confidence intervals and hypothesis tests,
– (2) linear and nonlinear regression,
– (3) time series analysis and other problems
55
![Page 6: Bootstrap Methods: Recent Advances and New …mescal.imag.fr/membres/yves.denneulin/LASCAR/page2/files/Bootstrap...Bootstrap Methods: Recent Advances and New Applications 2007 ICTP](https://reader034.vdocuments.us/reader034/viewer/2022051723/5ab468207f8b9a0f058bc912/html5/thumbnails/6.jpg)
Introduction (continued)Introduction (continued)
Definition of Efron’s nonparametric bootstrap.Given a sample of n independent identicallydistributed (i.i.d.) observations X1, X2, …, Xn froma distribution F and a parameter θ of thedistribution F with a real valued estimatorθ(X1, X2, …, Xn ), the bootstrap estimates the
accuracy of the estimator by replacing F with Fn,the empirical distribution, where Fn placesprobability mass 1/n at each observation Xi.
66
![Page 7: Bootstrap Methods: Recent Advances and New …mescal.imag.fr/membres/yves.denneulin/LASCAR/page2/files/Bootstrap...Bootstrap Methods: Recent Advances and New Applications 2007 ICTP](https://reader034.vdocuments.us/reader034/viewer/2022051723/5ab468207f8b9a0f058bc912/html5/thumbnails/7.jpg)
Introduction (continued)Introduction (continued)
• Let X1*, X2
*, …, Xn* be a bootstrap sample, that
is a sample of size n taken with replacementfrom Fn .
• The bootstrap, estimates the variance of
θ(X1, X2, …, Xn ) by computing orapproximating the variance of
θ* = θ(X1*, X2
*, …, Xn* ).
77
![Page 8: Bootstrap Methods: Recent Advances and New …mescal.imag.fr/membres/yves.denneulin/LASCAR/page2/files/Bootstrap...Bootstrap Methods: Recent Advances and New Applications 2007 ICTP](https://reader034.vdocuments.us/reader034/viewer/2022051723/5ab468207f8b9a0f058bc912/html5/thumbnails/8.jpg)
Schematic of Bootstrap ProcessSchematic of Bootstrap Process
88
![Page 9: Bootstrap Methods: Recent Advances and New …mescal.imag.fr/membres/yves.denneulin/LASCAR/page2/files/Bootstrap...Bootstrap Methods: Recent Advances and New Applications 2007 ICTP](https://reader034.vdocuments.us/reader034/viewer/2022051723/5ab468207f8b9a0f058bc912/html5/thumbnails/9.jpg)
Introduction (continued)Introduction (continued)
• Statistical Functionals - A functional is amapping that takes functions into realnumbers.
• Parameters of a distribution can usually beexpressed as functionals of the populationdistribution.
• Often the standard estimate of aparameter is the same functional appliedto the empirical distribution.
99
![Page 10: Bootstrap Methods: Recent Advances and New …mescal.imag.fr/membres/yves.denneulin/LASCAR/page2/files/Bootstrap...Bootstrap Methods: Recent Advances and New Applications 2007 ICTP](https://reader034.vdocuments.us/reader034/viewer/2022051723/5ab468207f8b9a0f058bc912/html5/thumbnails/10.jpg)
Introduction (continued)Introduction (continued)
• Statistical Functionals and the bootstrap.• A parameter θ is a functional T(F) where T
denotes the functional and F is apopulation distribution.
• An estimator of θ is θh = T(Fn) where Fn isthe empirical distribution function.
• Many statistical problems involveproperties of the distribution of θ - θh , itsmean (bias of θh ), variance, median etc.
1010
![Page 11: Bootstrap Methods: Recent Advances and New …mescal.imag.fr/membres/yves.denneulin/LASCAR/page2/files/Bootstrap...Bootstrap Methods: Recent Advances and New Applications 2007 ICTP](https://reader034.vdocuments.us/reader034/viewer/2022051723/5ab468207f8b9a0f058bc912/html5/thumbnails/11.jpg)
Introduction (continued)Introduction (continued)• Bootstrap idea: Cannot determine the distribution of
θ - θh but through the bootstrap we can determine, orapproximate through Monte Carlo, the distribution ofθh - θ*, where θ* = T(Fn
*) and Fn* is the empirical
distribution for a bootstrap sample X1*, X2
*,…,Xn*
(θ* is a bootstrap estimate of θ).• Based on k bootstrap samples the Monte Carlo
approximation to the distribution of θh - θ* is used toestimate bias, variance etc. for θh .
• In bootstrapping θh substitutes for θ and θ* substitutes forθh . Called the bootstrap principle.
1111
![Page 12: Bootstrap Methods: Recent Advances and New …mescal.imag.fr/membres/yves.denneulin/LASCAR/page2/files/Bootstrap...Bootstrap Methods: Recent Advances and New Applications 2007 ICTP](https://reader034.vdocuments.us/reader034/viewer/2022051723/5ab468207f8b9a0f058bc912/html5/thumbnails/12.jpg)
Introduction (continued)Introduction (continued)
1212
![Page 13: Bootstrap Methods: Recent Advances and New …mescal.imag.fr/membres/yves.denneulin/LASCAR/page2/files/Bootstrap...Bootstrap Methods: Recent Advances and New Applications 2007 ICTP](https://reader034.vdocuments.us/reader034/viewer/2022051723/5ab468207f8b9a0f058bc912/html5/thumbnails/13.jpg)
Introduction (continued)Introduction (continued)
• Basic Theory: Mathematical results show thatbootstrap estimates are consistent in particularcases.
• Basic Idea: Empirical distributions behave inlarge samples like population distributions.Glivenko-Cantelli Theorem tells us this.
• The smoothness condition is needed to transferconsistency to functionals of Fn, such as theestimate of the parameter θ.
1313
![Page 14: Bootstrap Methods: Recent Advances and New …mescal.imag.fr/membres/yves.denneulin/LASCAR/page2/files/Bootstrap...Bootstrap Methods: Recent Advances and New Applications 2007 ICTP](https://reader034.vdocuments.us/reader034/viewer/2022051723/5ab468207f8b9a0f058bc912/html5/thumbnails/14.jpg)
Wide Variety of ApplicationsWide Variety of Applications
• Efron and others recognized that throughthe power of fast computing the MonteCarlo approximation could be used toextend the bootstrap to many differentstatistical problems .
1414
![Page 15: Bootstrap Methods: Recent Advances and New …mescal.imag.fr/membres/yves.denneulin/LASCAR/page2/files/Bootstrap...Bootstrap Methods: Recent Advances and New Applications 2007 ICTP](https://reader034.vdocuments.us/reader034/viewer/2022051723/5ab468207f8b9a0f058bc912/html5/thumbnails/15.jpg)
Wide Variety of ApplicationsWide Variety of Applications(continued)(continued)
• It can estimate process capability indices fornon-Gaussian data.
• It is used to adjust p-values in a variety ofmultiple comparison situations.
• It can be extended to problems involvingdependent data including multivariate, spatialand time series data and in sampling fromfinite populations.
1515
![Page 16: Bootstrap Methods: Recent Advances and New …mescal.imag.fr/membres/yves.denneulin/LASCAR/page2/files/Bootstrap...Bootstrap Methods: Recent Advances and New Applications 2007 ICTP](https://reader034.vdocuments.us/reader034/viewer/2022051723/5ab468207f8b9a0f058bc912/html5/thumbnails/16.jpg)
Wide Variety of ApplicationsWide Variety of Applications(continued)(continued)
• It also has been applied to problems involvingmissing data.
• In many cases, the theory justifying the use ofbootstrap (e.g. consistency theorems) has beenextended to these non i.i.d. settings.
• In other cases, the bootstrap has been modifiedto “make it work.” The general case ofconfidence interval estimation is a notableexample.
1616
![Page 17: Bootstrap Methods: Recent Advances and New …mescal.imag.fr/membres/yves.denneulin/LASCAR/page2/files/Bootstrap...Bootstrap Methods: Recent Advances and New Applications 2007 ICTP](https://reader034.vdocuments.us/reader034/viewer/2022051723/5ab468207f8b9a0f058bc912/html5/thumbnails/17.jpg)
Confidence regions and hypothesisConfidence regions and hypothesisteststests
• The percentile method and other bootstrapvariations may require 1000 or morebootstrap replications to be very useful.
• The percentile method only works underspecial conditions.
• Bias correction and other adjustments aresometimes needed to make the bootstrap“accurate” and “correct” when the samplesize n is small or moderate.
1717
![Page 18: Bootstrap Methods: Recent Advances and New …mescal.imag.fr/membres/yves.denneulin/LASCAR/page2/files/Bootstrap...Bootstrap Methods: Recent Advances and New Applications 2007 ICTP](https://reader034.vdocuments.us/reader034/viewer/2022051723/5ab468207f8b9a0f058bc912/html5/thumbnails/18.jpg)
Confidence regions and hypothesisConfidence regions and hypothesistests (continued)tests (continued)
• Confidence intervals are accurate or nearlyexact when the stated confidence level for theintervals is approximately the long run probabilitythat the random interval contains the “true” valueof the parameter.
• Accurate confidence intervals are said to becorrect if they are approximately the shortestlength confidence intervals possible for the givenconfidence level.
1818
![Page 19: Bootstrap Methods: Recent Advances and New …mescal.imag.fr/membres/yves.denneulin/LASCAR/page2/files/Bootstrap...Bootstrap Methods: Recent Advances and New Applications 2007 ICTP](https://reader034.vdocuments.us/reader034/viewer/2022051723/5ab468207f8b9a0f058bc912/html5/thumbnails/19.jpg)
Examples where the bootstrap failsExamples where the bootstrap fails
• Athreya (1987) shows that the bootstrapestimate of the sample mean isinconsistent when the populationdistribution has an infinite variance.
• Angus (1993) provides similarinconsistency results for the maximumand minimum of a sequence ofindependent identically distributedobservations.
1919
![Page 20: Bootstrap Methods: Recent Advances and New …mescal.imag.fr/membres/yves.denneulin/LASCAR/page2/files/Bootstrap...Bootstrap Methods: Recent Advances and New Applications 2007 ICTP](https://reader034.vdocuments.us/reader034/viewer/2022051723/5ab468207f8b9a0f058bc912/html5/thumbnails/20.jpg)
Bootstrap RemediesBootstrap Remedies
• In the past decade many of the problemswhere the bootstrap is inconsistentremedies have been found by researchersto give good modified bootstrap solutionsthat are consistent.
• For both problems describe thus far asimple procedure called the m-out-nbootstrap has been shown to lead toconsistent estimates .
2020
![Page 21: Bootstrap Methods: Recent Advances and New …mescal.imag.fr/membres/yves.denneulin/LASCAR/page2/files/Bootstrap...Bootstrap Methods: Recent Advances and New Applications 2007 ICTP](https://reader034.vdocuments.us/reader034/viewer/2022051723/5ab468207f8b9a0f058bc912/html5/thumbnails/21.jpg)
The The m-out-of-n m-out-of-n BootstrapBootstrap
• This idea was proposed by Bickel and Ren (1996) forhandling doubly censored data.
• Instead of sampling n times with replacement from asample of size n they suggest to do it only m timeswhere m is much less than n.
• To get the consistency results both m and n need to getlarge but at different rates. We need m=o(n). That ism/n→0 as m and n both → ∞.
• This method leads to consistent bootstrap estimates inmany cases where the ordinary bootstrap has problems,particularly (1) mean with infinite variance and (2)extreme value distributions.
2121
![Page 22: Bootstrap Methods: Recent Advances and New …mescal.imag.fr/membres/yves.denneulin/LASCAR/page2/files/Bootstrap...Bootstrap Methods: Recent Advances and New Applications 2007 ICTP](https://reader034.vdocuments.us/reader034/viewer/2022051723/5ab468207f8b9a0f058bc912/html5/thumbnails/22.jpg)
Available SoftwareAvailable Software• Resampling Stats from Resampling Stats Inc.
(provides basic bootstrap tools in easy to usesoftware and is good as an elementary teaching tool).
• SPlus from Insightful Corporation ( good foradvanced bootstrap techniques such as BCa, easy touse in new Windows based version). The currentmodule Resample is what I use in my bootstrap classat statistics.com.
• S functions provided by Tibshirani (see Appendix inEfron and Tibshirani text or visit Rob Tibshirani’s website http:/www.stat-stanford.edu/~tibs)
2222
![Page 23: Bootstrap Methods: Recent Advances and New …mescal.imag.fr/membres/yves.denneulin/LASCAR/page2/files/Bootstrap...Bootstrap Methods: Recent Advances and New Applications 2007 ICTP](https://reader034.vdocuments.us/reader034/viewer/2022051723/5ab468207f8b9a0f058bc912/html5/thumbnails/23.jpg)
Available Software (continued)Available Software (continued)
• Stata has a bootstrap algorithm available thatsome users rave about.
• Mathworks and other examples (see SusanHolmes web page):
http:/www-stat.stanford.edu/~susan) or contact herby email
• SAS macros are available and Proc MULTTESTdoes bootstrap sampling.
2323
![Page 24: Bootstrap Methods: Recent Advances and New …mescal.imag.fr/membres/yves.denneulin/LASCAR/page2/files/Bootstrap...Bootstrap Methods: Recent Advances and New Applications 2007 ICTP](https://reader034.vdocuments.us/reader034/viewer/2022051723/5ab468207f8b9a0f058bc912/html5/thumbnails/24.jpg)
References on confidence intervalsReferences on confidence intervalsand hypothesis testsand hypothesis tests
(1) Chernick, M.R. (1999). Bootstrap Methods: APractitioner’s Guide. Wiley, New York.(2) Chernick, M.R. (2007). Bootstrap Methods: AGuide for Practitioners and Researchers, 2nd
Edition. Wiley, New York.(3) Hall, P. (1992). The Bootstrap andEdgeworth Expansion. Springer-Verlag, NewYork.(4) Efron, B. (1982) The Jackknife, the Bootstrapand Other Resampling Plans. Society forIndustrial and Applied Mathematics CBMS-NSFRegional Conference Series 38, Philadelphia.
24
![Page 25: Bootstrap Methods: Recent Advances and New …mescal.imag.fr/membres/yves.denneulin/LASCAR/page2/files/Bootstrap...Bootstrap Methods: Recent Advances and New Applications 2007 ICTP](https://reader034.vdocuments.us/reader034/viewer/2022051723/5ab468207f8b9a0f058bc912/html5/thumbnails/25.jpg)
(5) Carpenter, J. and Bithell, J. (2000).Bootstrap confidence intervals: when, which,what? A practical guide for medical statisticians.Statistics in Medicine 19, 1141-1164.(6) Bahadur, R.R. and Savage, L.J. (1956). Thenonexistence of certain statistical procedures innonparametric problems. Annals ofMathematical Statistics 27, 1115-1122.(7) Ewens, W.J. and Grant, G.R. (2001).Statistical Methods in Bioinformatics AnIntroduction.
References on confidence intervals andReferences on confidence intervals andhypothesis tests (continued)hypothesis tests (continued)
25
![Page 26: Bootstrap Methods: Recent Advances and New …mescal.imag.fr/membres/yves.denneulin/LASCAR/page2/files/Bootstrap...Bootstrap Methods: Recent Advances and New Applications 2007 ICTP](https://reader034.vdocuments.us/reader034/viewer/2022051723/5ab468207f8b9a0f058bc912/html5/thumbnails/26.jpg)
References on when bootstrap failsReferences on when bootstrap fails(1) Angus, J. E. (1993). Asymptotic theory forbootstrapping the extremes. Communs. Statist.Theory and Methods 22, 15-30.(2) Athreya, K. B. (1987). Bootstrap estimationof the mean in the infinite variance case. Ann.Statist. 15, 724-731.(3) Bickel, P. J. and Freedman, D. A. (1981).Some asymptotic theory for the bootstrap. Ann.Statist. 9, 1196-1217.
26
![Page 27: Bootstrap Methods: Recent Advances and New …mescal.imag.fr/membres/yves.denneulin/LASCAR/page2/files/Bootstrap...Bootstrap Methods: Recent Advances and New Applications 2007 ICTP](https://reader034.vdocuments.us/reader034/viewer/2022051723/5ab468207f8b9a0f058bc912/html5/thumbnails/27.jpg)
References on when bootstrap failsReferences on when bootstrap fails(continued)(continued)
(4) Chernick, M.R. (1999). Bootstrap Methods: APractitioner’s Guide. Wiley, New York.
(5) Chernick, M.R. (2007). Bootstrap Methods: AGuide for Practitioners and Researchers, 2nd
Edition. Wiley, New York.
(6) Cochran, W. (1977). Sampling Techniques.3rd ed., Wiley, New York
27
![Page 28: Bootstrap Methods: Recent Advances and New …mescal.imag.fr/membres/yves.denneulin/LASCAR/page2/files/Bootstrap...Bootstrap Methods: Recent Advances and New Applications 2007 ICTP](https://reader034.vdocuments.us/reader034/viewer/2022051723/5ab468207f8b9a0f058bc912/html5/thumbnails/28.jpg)
References on when bootstrap failsReferences on when bootstrap fails(continued)(continued)
(7) Knight, K. (1989). On the bootstrap of thesample mean in the infinite variance case. Ann.Statist. 17, 1168-1175.
(8) LePage, R., and Billard, L. (editors). (1992).Exploring the Limit of Bootstrap. Wiley, NewYork.(9) Mammen, E. (1992). When Does theBootstrap Work? Asymptotic Results andSimulations Springer-Verlag, Heidelberg.
(10) Singh, K. (1981). On the asymptoticaccuracy of Efron’s bootstrap. Ann. Statist. 9,1187-1195.
28