Statistical Data Analysis
2011/2012
M. de Gunst
Lecture 4
Statistical Data Analysis: Introduction
Topics
- Summarizing data
- Exploring distributions
- Bootstrap (continued)
- Robust methods
- Nonparametric tests
- Analysis of categorical data
- Multiple linear regression
Today's topics: Bootstrap (Chapter 4: 4.3, 4.4)
4. Bootstrap
4.1. Simulation (read yourself) (last week)
4.2. Bootstrap estimators for distribution (last week)
4.3. Bootstrap confidence intervals
4.4. Bootstrap tests
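Bootstrap: recap (1)
Situation: $x_1, \dots, x_n$ realizations of $X_1, \dots, X_n$, independent, unknown distr. $P$
Bootstrap: to estimate the distribution $Q_P$ of $T_n = T_n(X_1, \dots, X_n)$, an estimator or test statistic
Which steps?
Step 1. Estimate $Q_P$ by $Q_{\hat P_n}$, with $\hat P_n$ an estimate of $P$ (first error)
Step 2. Estimate $Q_{\hat P_n}$ by the empirical distribution of the bootstrap values $T^*_{n,1}, \dots, T^*_{n,B}$ (second error)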
Bootstrap: recap (2)
Step 1: Determine theoretical bootstrap estimator
i) Estimate $P$ by $\hat P_n$: the empirical distribution, or a parametric distribution with the parameter estimated. $\hat P_n$ is stochastic: estimator
ii) Estimate the distribution $Q_P$ of $T_n$ by $Q_{\hat P_n}$. $Q_{\hat P_n}$ is stochastic: bootstrap estimator
First error
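Bootstrap: recap (3)
Step 2: From estimator to estimate: $x_1, \dots, x_n$ fixed
i) If the theoretical bootstrap estimator $Q_{\hat P_n}$ has an explicit expression, then done
ii) If not, then estimate the estimate: use bootstrap (sampling) scheme to estimate $Q_{\hat P_n}$ by the empirical distribution of $T^*_{n,1}, \dots, T^*_{n,B}$, where $T^*_{n,i} = T_n(X^*_1, \dots, X^*_n)$ and $X^*_1, \dots, X^*_n$ are drawn from $\hat P_n$
The empirical distribution of $T^*_{n,1}, \dots, T^*_{n,B}$ is stochastic: estimator; the empirical distr. of the simulated realizations of $T^*_{n,1}, \dots, T^*_{n,B}$ is the estimate
Second error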
Bootstrap: recap (4)
Obtain empirical distr. of simulated realizations of $T^*_n$ with bootstrap (sampling) scheme.
With the B bootstrap values get impression of (characteristics of) unknown distribution of Tn:
- draw histogram
- compute sample variance
- compute sample sd
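The sampling scheme recapped above can be sketched in R as follows. This is only an illustration under stated assumptions: the data `x` are simulated stand-ins, the sample median is picked as an arbitrary example of Tn, and the names `B` and `tstar` are chosen here, not taken from the slides.

```r
# Empirical bootstrap for an estimator T_n (assumed here: the sample median).
set.seed(1)                       # reproducible illustration
x <- rexp(50)                     # stand-in data; replace with your own sample
B <- 1000                         # number of bootstrap replications

# Draw B bootstrap samples from the empirical distribution of x
# (sampling with replacement) and compute T_n* on each of them.
tstar <- replicate(B, median(sample(x, replace = TRUE)))

hist(tstar, prob = TRUE)          # impression of the distribution of T_n
var(tstar)                        # sample variance of the bootstrap values
sd(tstar)                         # sample sd of the bootstrap values
```

For a parametric bootstrap, `sample(x, replace = TRUE)` would be replaced by a draw from the fitted parametric distribution.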
4.3. Bootstrap confidence intervals (1)
Tn : estimator of unknown parameter θ
Seen: accuracy of estimator Tn : variance of estimator’s distribution
Now: accuracy of estimator Tn : confidence interval
(1 - 2α)×100% confidence interval for θ is an interval around Tn that contains the 'true' θ with probability at least 1 - 2α
If interval is [Tn - b1, Tn + b2], how to determine b1 and b2?
(blackboard)
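One way to answer this question, written out as a short derivation. The notation $G_n$ for the distribution function of $T_n - \theta$ and $G_n^{-1}(p) = \inf\{z : G_n(z) \ge p\}$ for its quantile function is an assumption made here for the sketch; the blackboard argument itself is not part of these notes.

$$
P\bigl(G_n^{-1}(\alpha) \le T_n - \theta \le G_n^{-1}(1-\alpha)\bigr) \ge 1 - 2\alpha
$$

$$
G_n^{-1}(\alpha) \le T_n - \theta \le G_n^{-1}(1-\alpha)
\iff
T_n - G_n^{-1}(1-\alpha) \le \theta \le T_n - G_n^{-1}(\alpha)
$$

So one valid choice is $b_1 = G_n^{-1}(1-\alpha)$ and $b_2 = -G_n^{-1}(\alpha)$: the upper quantile of $T_n - \theta$ determines the lower endpoint of the interval, and the lower quantile determines the upper endpoint.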
Bootstrap confidence intervals (2)
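(1 - 2α)×100% confidence interval for θ is an interval around Tn that contains the 'true' θ with probability at least 1 - 2α
If interval is [Tn - b1, Tn + b2], then b1 and b2 determined by
[Tn - b1, Tn + b2] = $[T_n - G_n^{-1}(1-\alpha),\; T_n - G_n^{-1}(\alpha)]$
with $G_n$ the distribution function of Tn – θ, and $G_n^{-1}(\alpha)$ its α-quantile
So b1 and –b2 are quantiles of the unknown distribution $G_n$: b1 = $G_n^{-1}(1-\alpha)$ and –b2 = $G_n^{-1}(\alpha)$
How to estimate the quantiles b1 and –b2?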
Bootstrap confidence intervals (3)
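Interval is [Tn - b1, Tn + b2] = $[T_n - G_n^{-1}(1-\alpha),\; T_n - G_n^{-1}(\alpha)]$, with $G_n$ the distribution function of Tn – θ
How to estimate quantiles b1 and –b2 of unknown distribution of Tn – θ?
Estimate $G_n$ with $\hat G_n$, use bootstrap
Gives estimate of conf interval: $[T_n - \hat G_n^{-1}(1-\alpha),\; T_n - \hat G_n^{-1}(\alpha)]$ (4.1)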
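Bootstrap confidence intervals (4)
Estimate of conf interval: $[T_n - \hat G_n^{-1}(1-\alpha),\; T_n - \hat G_n^{-1}(\alpha)]$ (4.1), with $\hat G_n$ the estimate of the distribution of Tn – θ
In practice, determine in steps:
1. Estimate unknown distribution of Tn – θ with $\hat G_n$: use bootstrap
Same as before? No: for Tn – θ we need the bootstrap values $Z^*_{n,i} = T^*_{n,i} - T_n$, $i = 1, \dots, B$
2. Estimate the quantiles $\hat G_n^{-1}(\alpha)$ by the empirical quantiles $Z^*_{([\alpha B])}$ of the bootstrap values
3. Bootstrap confidence interval: $[T_n - Z^*_{([(1-\alpha)B])},\; T_n - Z^*_{([\alpha B])}]$ (4.2)
(You have to know this formula!!)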
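The three steps can be put together in a few lines of R. This is a minimal sketch, not the course's reference code: the function name `boot_ci` and its arguments are made up for illustration, the data are simulated, and `quantile()` with its default settings interpolates between order statistics, so the result can differ slightly from the order-statistic form of (4.2).

```r
# Bootstrap confidence interval following (4.2):
# [T_n - Z*_(1-alpha), T_n - Z*_(alpha)]  with  Z* = T_n* - T_n.
boot_ci <- function(x, estimator, B = 1000, alpha = 0.05) {
  tn    <- estimator(x)                                 # T_n on the data
  tstar <- replicate(B, estimator(sample(x, replace = TRUE)))
  zstar <- tstar - tn                                   # bootstrap values of T_n - theta
  q     <- quantile(zstar, c(1 - alpha, alpha), names = FALSE)
  c(lower = tn - q[1], upper = tn - q[2])               # (1 - 2*alpha) interval
}

set.seed(1)
x <- rexp(50)                 # stand-in data (hypothetical)
boot_ci(x, mean)              # 90% interval for the mean
```

Note that the upper quantile of `zstar` is subtracted for the lower endpoint and the lower quantile for the upper endpoint.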
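Bootstrap confidence intervals (5)
We just discussed:
Estimate of confidence interval: $[T_n - \hat G_n^{-1}(1-\alpha),\; T_n - \hat G_n^{-1}(\alpha)]$ (4.1)
Corresponding bootstrap confidence interval: $[T_n - Z^*_{([(1-\alpha)B])},\; T_n - Z^*_{([\alpha B])}]$ (4.2)
Here $\hat G_n$ estimates the distribution $G_n$ of Tn – θ, and $Z^*_{n,i} = T^*_{n,i} - T_n$ are the bootstrap values.
This is the original bootstrap confidence interval, also called the reflection method. We will use this one!!
Other method: percentile method
Estimate of confidence interval: $[T_n + \hat G_n^{-1}(\alpha),\; T_n + \hat G_n^{-1}(1-\alpha)]$
Corresponding bootstrap confidence interval: $[T^*_{([\alpha B])},\; T^*_{([(1-\alpha)B])}]$
Only suitable if $G_n$ is symmetric around 0. (Asymptotically the two methods give the same result)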
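To see how the two intervals on this slide relate, both can be computed from the same bootstrap values. The sketch below assumes simulated skewed data and the sample median as Tn; all variable names are illustrative.

```r
# Reflection (4.2) versus percentile interval from the same bootstrap values.
set.seed(2)
x     <- rexp(40)                        # stand-in skewed data (hypothetical)
B     <- 2000
alpha <- 0.05
tn    <- median(x)
tstar <- replicate(B, median(sample(x, replace = TRUE)))
zstar <- tstar - tn                      # Z* = T_n* - T_n

reflection <- c(tn - quantile(zstar, 1 - alpha, names = FALSE),
                tn - quantile(zstar, alpha,     names = FALSE))
percentile <- quantile(tstar, c(alpha, 1 - alpha), names = FALSE)

rbind(reflection, percentile)            # the two 90% intervals side by side
```

The reflection interval is the percentile interval mirrored around Tn, which is why the two coincide only when the bootstrap distribution of Tn* – Tn is symmetric around 0.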
Bootstrap confidence intervals (5)
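How to obtain the (sample) α-quantile $Z^*_{([\alpha B])}$?
R: if zstar contains the bootstrap values
> quantile(zstar, alpha)
Note: $T^*_n$ is always the same function of $X^*_1, \dots, X^*_n$ as $T_n$ is of $X_1, \dots, X_n$
For two samples $X_1, \dots, X_n$ and $Y_1, \dots, Y_m$ the method is the same
Example: if $T_{n,m} = \bar X_n - \bar Y_m$, then $T^*_{n,m} = \bar X^*_n - \bar Y^*_m$
and $Z^*_{n,m} = \bar X^*_n - \bar Y^*_m - (\bar X_n - \bar Y_m)$ (cf. Example 4.4 in Reader)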
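The two-sample example can be sketched in R as below. It assumes that Xn and Ym on the slide denote the two sample means, that the two samples are independent and resampled separately, and that simulated normal data stand in for real measurements; none of the variable names come from the Reader.

```r
# Two-sample bootstrap interval for the difference of means,
# T = mean(x) - mean(y), using Z* = T* - T as in the one-sample case.
set.seed(3)
x     <- rnorm(30, mean = 1)             # first sample (hypothetical data)
y     <- rnorm(25, mean = 0)             # second sample (hypothetical data)
B     <- 1000
alpha <- 0.05
tnm   <- mean(x) - mean(y)

# Resample each sample separately, with replacement, keeping its own size.
zstar <- replicate(B, mean(sample(x, replace = TRUE)) -
                      mean(sample(y, replace = TRUE)) - tnm)

ci <- c(tnm - quantile(zstar, 1 - alpha, names = FALSE),
        tnm - quantile(zstar, alpha,     names = FALSE))
ci                                       # 90% interval for the difference
```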
4.4. Bootstrap Tests (1)
Remember last week’s slide:
From lecture 3: Kolmogorov-Smirnov test (5)
Example
Data: y
H0: F is normal ← composite null hypothesis
H1: F is not normal
Test statistic: Dadj
R:
> ks.test(y,pnorm)
D = 0.6922, p-value = 6.661e-16
Correct? Incorrect: this is test for H0: F = N(0,1), H1: F ≠ N(0,1)
> ks.test(y,pnorm,mean=mean(y),sd=sd(y))
D = 0.1081, p-value = 0.5655
> mean(y)
[1] 3.62158
> sd(y)
[1] 3.043356
Correct? Incorrect: this is test for H0: F = N(3.62158,(3.04335)²), H1: F ≠ N(3.62158,(3.04335)²), with the mean and sd of y plugged in
We have not used Dadj!! p-value should be 0.126 (next week)
Bootstrap Tests (2)
Solve this with bootstrap test!
General idea on blackboard
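A possible R sketch of such a bootstrap test for the composite null hypothesis of normality follows. It rests on assumptions that are not taken from the slides: the adjusted statistic is taken to be the Kolmogorov-Smirnov distance between the data, standardised with its own mean and sd, and N(0,1); the helper names `ks_adj` and `boot_ks_normality` are made up; and the data are simulated.

```r
# Kolmogorov-Smirnov distance to the normal with estimated mean and sd.
ks_adj <- function(y) {
  unname(ks.test((y - mean(y)) / sd(y), "pnorm")$statistic)
}

# Bootstrap p-value: simulate the statistic under H0 and compare with the data.
boot_ks_normality <- function(y, B = 1000) {
  d_obs  <- ks_adj(y)
  # Under H0 the distribution of this statistic does not depend on the
  # true mean and sd, so samples from N(0,1) suffice.
  d_star <- replicate(B, ks_adj(rnorm(length(y))))
  c(D = d_obs, p.value = mean(d_star >= d_obs))
}

set.seed(4)
y <- rnorm(40, mean = 3.6, sd = 3)       # stand-in data (hypothetical)
boot_ks_normality(y)
```

The p-value is the fraction of simulated statistics at least as large as the observed one, so the estimation of mean and sd is accounted for in the reference distribution, unlike in the plain `ks.test` calls.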
Bootstrap Tests (3) Example
Bootstrap Tests (4) Example
> hist(dprec, prob=TRUE)
> qqnorm(dprec)
[Figures: histogram of dprec and normal QQ-plot of dprec]
Bootstrap Tests (5) Example
Bootstrap Tests (6) Example
Bootstrap Tests (7) Example
Bootstrap Tests (8) Example
Bootstrap Tests (9) Example
Recap
4. Bootstrap
4.3. Bootstrap confidence intervals
4.4. Bootstrap tests
Bootstrap
The end