Bootstrap
Econometrics A Ass. Prof. Amine Ouazad
Mice
• Diabetic mice get a treatment. Their sugar level is measured after the treatment.
• The numbers are the following:
– 2.3, 4.1, 1.2, 2.6, 4.4, 1.9 in the control group.
– 2.1, 2.0, 1.9, 1.6, 2.2, 0.7 in the treatment group.
• Exercise:
– Estimate the effect of the treatment and the standard error of the effect of the treatment.
– Discuss the assumptions needed to estimate the standard error of the effect of the treatment.
Outline
1. Problemo
2. Mice: the Bootstrap principle
3. Implementation in Stata
4. Theory:
   1. Estimation of the C.D.F.
   2. Estimation of confidence intervals by bootstrap
   3. Improvement over the Central Limit Theorem
5. Tricky
Problemo
• We used two tricks to find confidence intervals:
– Either we used the Central Limit Theorem when the number of observations is large.
– Or we assumed normally distributed residuals when the number of observations is fixed (and small).
• What if neither of these is true?
– The number of observations is small, and the residuals are not normally distributed.
• Possibilities:
– Assuming another distribution for the residuals. Theoretically possible but super rare and nonstandard.
– Or using bootstrap.
Mice
• Diabetic mice get a treatment. Their sugar level is measured after the treatment.
• The numbers are the following:
– 2.3, 4.1, 4.1, 1.2, 2.6, 4.4, 1.9 in the control group.
– 2.1, 2.0, 1.2, 1.9, 1.6, 2.2, 0.7 in the treatment group.
• Exercise:
– Estimate the effect of the treatment and the standard error of the effect of the treatment using bootstrap.
2.3, C
4.1, C
4.1, C
1.2, C
2.6, C
2.2, T
0.7, T
2.1, T
2.0, T
1.2, T
1.9, T
1.6, T
4.4, C
1.9, C
Mice: The Bootstrap Principle
• Sample with replacement from the set of observations of mice, drawing as many observations as the original sample contains.
• Calculate an estimate b1 of the effect of the medication on mice.
• Repeat this step K times. At each step k=1,2,…,K, calculate bk.
• The 2.5th percentile of the bks provides the lower bound of a 95% confidence interval on b.
• The 97.5th percentile of the bks provides the upper bound of a 95% confidence interval on b.
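The steps above can be sketched in a few lines. The following is only an illustration in Python (not the Stata implementation used in the course), applied to the mice data listed on the slide. The number of replications K=2000, the fixed seed, and the choice to redraw a resample in which one group happens to be empty are assumptions made for this sketch.

```python
import random
import statistics

# Data from the slide: sugar levels by group (C = control, T = treatment).
control = [2.3, 4.1, 4.1, 1.2, 2.6, 4.4, 1.9]
treatment = [2.1, 2.0, 1.2, 1.9, 1.6, 2.2, 0.7]
mice = [(y, "C") for y in control] + [(y, "T") for y in treatment]


def effect(sample):
    """Difference in mean sugar level, treatment minus control."""
    treated = [y for y, group in sample if group == "T"]
    untreated = [y for y, group in sample if group == "C"]
    return statistics.mean(treated) - statistics.mean(untreated)


def bootstrap_effects(sample, K, rng):
    """Return K bootstrap estimates b_1, ..., b_K of the effect."""
    estimates = []
    while len(estimates) < K:
        # Draw len(sample) mice with replacement.
        resample = rng.choices(sample, k=len(sample))
        if {group for _, group in resample} != {"C", "T"}:
            continue  # one group is empty, the effect is undefined: redraw
        estimates.append(effect(resample))
    return estimates


rng = random.Random(0)  # fixed seed, only so that the sketch is reproducible
b_hat = effect(mice)
b_k = bootstrap_effects(mice, K=2000, rng=rng)

# statistics.quantiles with n=40 returns cut points at 2.5%, 5%, ..., 97.5%.
cuts = statistics.quantiles(b_k, n=40)
lower, upper = cuts[0], cuts[-1]
se_boot = statistics.stdev(b_k)

print(f"estimated effect: {b_hat:.3f}")
print(f"bootstrap s.e.:   {se_boot:.3f}")
print(f"95% interval:     [{lower:.3f}, {upper:.3f}]")
```

The standard deviation of the bks is reported as a bootstrap standard error alongside the percentile interval; both come from the same K replications.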
Implementation in Stata
• Use of the bootstrap command in Stata.
• bootstrap, reps(20): regress y x
• Issues:
– Only works with i.i.d. residuals (more on this later).
– Some bootstrap replications may fail because A2 is not satisfied for these samples.
• Upside:
– Very versatile. Works with almost all estimation procedures in Stata.
– Improves confidence intervals for i.i.d. residuals in small samples, better than the normal approximation.
THEORY, PART 1: ESTIMATION OF THE EMPIRICAL CDF
Theory: Estimation of the C.D.F.
• Recap: the c.d.f. of a random variable X is the function F(x) = P(X <= x), the probability that the random variable is at most a given threshold x.
• i.i.d. observations of X, {X1,X2,…,XN}
• C.d.f. of X is F0(x).
Examples of c.d.f.s • The c.d.f. of firms’ earnings
[Figure: c.d.f. of firms' earnings. Vertical axis: cumulative probability, from 0 to 1. Horizontal axis: earnings over book value of equity, from -2 to 2.]
Empirical c.d.f.
• Empirical c.d.f.: FN(x) = (1/N) Σi I(Xi <= x), the fraction of observations at or below x.
• Using the law of large numbers, the empirical c.d.f. converges point by point to the true c.d.f. of X.
• Using the central limit theorem, FN(x) is approximately normal around F0(x), with variance F0(x)(1-F0(x))/N.
• (Glivenko-Cantelli theorem: the empirical c.d.f. converges uniformly almost surely to the true c.d.f. F0).
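As a small illustration of the definition, the empirical c.d.f. can be coded as the share of observations at or below x. The sketch below is in Python, with function names (`empirical_cdf`, `cdf_variance`) that are hypothetical, and an illustrative sample borrowed from a later slide of the deck. The variance is a plug-in estimate that replaces the unknown F0 by FN.

```python
def empirical_cdf(sample):
    """Return the function F_N(x) = (1/N) * #{i : X_i <= x}."""
    n = len(sample)

    def F_N(x):
        return sum(1 for x_i in sample if x_i <= x) / n

    return F_N


def cdf_variance(F_N, x, n):
    """Plug-in estimate of F0(x)(1 - F0(x))/N, replacing F0 by F_N."""
    p = F_N(x)
    return p * (1 - p) / n


# Illustrative sample, borrowed from a later slide of the deck.
sample = [8, 9, 10, 2, 1, 8, 9, 5]
F = empirical_cdf(sample)

for x in (0, 5, 8, 10):
    print(x, F(x), cdf_variance(F, x, len(sample)))
```

With this sample, three of the eight observations are at or below 5, so F(5) = 0.375, and F reaches 1 at the largest observation.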
The Empirical C.d.f.
• Use the observations of firms' earnings X1,X2,…,XN.
• Using these draws, create the empirical c.d.f.
• Result:
– For each x, the empirical c.d.f. converges to the true c.d.f. of X as the number of draws becomes infinitely large.
THEORY, PART 2: USING THE EMPIRICAL CDF TO APPROXIMATE THE DISTRIBUTION OF A STATISTIC
Drawing from the sample
• Drawing a sample with replacement from the sample is identical to drawing from a random variable whose c.d.f. is the empirical c.d.f.
• Indeed, the probability of picking a number lower than Xi is exactly equal to the fraction of observations below Xi.
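This claim can be checked numerically. The Python sketch below draws many times with replacement from a small illustrative sample (taken from a later slide) and compares the share of draws at or below a threshold with the empirical c.d.f. at that threshold. The number of draws and the seed are arbitrary choices.

```python
import random

sample = [8, 9, 10, 2, 1, 8, 9, 5]  # illustrative sample from a later slide
rng = random.Random(1)              # fixed seed for reproducibility

# 100,000 draws with replacement from the sample.
draws = rng.choices(sample, k=100_000)

comparison = {}
for x in (2, 5, 8):
    ecdf_at_x = sum(x_i <= x for x_i in sample) / len(sample)
    freq_at_x = sum(d <= x for d in draws) / len(draws)
    comparison[x] = (ecdf_at_x, freq_at_x)
    print(f"x={x}: empirical c.d.f. = {ecdf_at_x:.3f}, "
          f"frequency among draws = {freq_at_x:.3f}")
```

The two columns should agree up to simulation noise, which shrinks as the number of draws grows.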
Estimation of confidence intervals by bootstrap
• Example: 8,9,10,2,1,8,9,5.
– Calculate the mean, and give an estimate of the variance of the mean.
• Mean of X.
– Empirical mean is m = (1/N) Σi Xi.
– The c.d.f. of the mean is either approximated using the normal distribution (asymptotic approximation),
– Or using the empirical c.d.f.
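For the example sample, the two approximations can be put side by side. The Python sketch below computes the standard error and a 95% interval for the mean, first with the normal approximation and then by resampling from the empirical c.d.f. The number of replications K=5000 and the seed are assumptions of the sketch, not values from the slides.

```python
import math
import random
import statistics

sample = [8, 9, 10, 2, 1, 8, 9, 5]  # example from the slide
n = len(sample)
m = statistics.mean(sample)

# Asymptotic (normal) approximation: s.e. = s / sqrt(N).
se_clt = statistics.stdev(sample) / math.sqrt(n)
z = statistics.NormalDist().inv_cdf(0.975)  # about 1.96
ci_clt = (m - z * se_clt, m + z * se_clt)

# Bootstrap approximation: resample, recompute the mean, K times.
rng = random.Random(0)  # fixed seed for reproducibility
K = 5000                # hypothetical number of replications
boot_means = [statistics.mean(rng.choices(sample, k=n)) for _ in range(K)]
se_boot = statistics.stdev(boot_means)
cuts = statistics.quantiles(boot_means, n=40)  # cut points at 2.5%, ..., 97.5%
ci_boot = (cuts[0], cuts[-1])

print(f"mean = {m}")
print(f"normal approximation: s.e. = {se_clt:.3f}, CI = {ci_clt}")
print(f"bootstrap:            s.e. = {se_boot:.3f}, CI = {ci_boot}")
```

The bootstrap standard error is typically slightly smaller here, because resampling from the empirical c.d.f. uses the variance with divisor N rather than N-1.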
Tricky
• Bootstrap requires i.i.d. draws from the same random variable.
– If the observations are correlated (clustering or autocorrelation), bootstrap is not valid.
– If the observations do not have the same distribution, bootstrap is not valid.
Theory of the bootstrap
• Key questions:
– Is the bootstrap estimator of the distribution of a statistic a consistent estimator of that distribution?
– Is the bootstrap estimator better than the approximation provided by the Central Limit Theorem?
Theory of the bootstrap
• Denote, as before, Fn the empirical c.d.f. of the observations, and F0 the true c.d.f. of the observations.
• We are interested in the distribution of a statistic Tn(X1,X2,…,Xn) of the observations. This statistic is either: (i) an estimator, (ii) a test statistic, (iii) a quantity of interest (a ratio for instance).
• Denote Gn(.,F0) the c.d.f. of the statistic.
Asymptotic approximations
• The usual technique used so far.
– For instance, we use the asymptotic normality of the OLS estimator to estimate confidence intervals.
• Principle: replace Gn(.,F0) with G∞(.,F0), which typically does not depend on the underlying distribution of the observations.
– For instance, the asymptotic distribution of the OLS estimator does not depend on the specific distribution of the residuals and the Xs (only on their variance-covariance matrix).
Bootstrap approximation
• The bootstrap approximation uses Gn(.,Fn) as the approximation to Gn(.,F0).
• This is equivalent to:
– Drawing a sample of the same size from the sample, using draws with replacement.
– Calculating K values of the statistic by repeating the procedure K times.
– Calculating the empirical c.d.f. of the statistic.
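The three steps apply to any statistic, so they can be written once as a generic routine. The Python sketch below returns the empirical c.d.f. of K bootstrap values of a statistic, which is the simulated counterpart of Gn(.,Fn). The function name `bootstrap_cdf`, the choice of the median as the statistic, and K=2000 are assumptions made for illustration.

```python
import bisect
import random
import statistics


def bootstrap_cdf(sample, statistic, K, rng):
    """Approximate G_n(., F_n) by the empirical c.d.f. of K bootstrap values."""
    n = len(sample)
    values = sorted(
        statistic(rng.choices(sample, k=n))  # same size, with replacement
        for _ in range(K)
    )

    def G(t):
        # Fraction of the K bootstrap values that are <= t.
        return bisect.bisect_right(values, t) / K

    return G


sample = [8, 9, 10, 2, 1, 8, 9, 5]  # sample from an earlier slide
rng = random.Random(0)              # fixed seed for reproducibility
G = bootstrap_cdf(sample, statistics.median, K=2000, rng=rng)
print(G(5), G(8), G(9))
```

Passing a different function as `statistic` (a mean, a ratio, a regression coefficient) leaves the rest of the routine unchanged.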
Informal statement of the properties of bootstrap
• For formal proofs see Horowitz (1999).
• Bootstrap is consistent for:
– OLS/IV/Panel estimators.
– Maximum likelihood and GMM estimators.
– t statistics, F statistics.
• Fails to be a consistent estimator of the c.d.f. for:
– The distribution of the maximum/minimum of a sample.
– Heavy-tailed distributions (such as Cauchy distributions).
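The failure for the maximum has a simple intuition: a resample of size n contains the largest observation with probability 1-(1-1/n)^n, which is close to 1-1/e ≈ 0.632 for large n. The bootstrap distribution of the maximum therefore puts a large mass exactly on the sample maximum, whereas the true distribution of the maximum of a continuous variable has no such mass point. The Python sketch below checks this by simulation; the sample size, the uniform distribution, and K are hypothetical choices.

```python
import random

n = 50                  # hypothetical sample size
rng = random.Random(0)  # fixed seed for reproducibility
sample = [rng.random() for _ in range(n)]  # i.i.d. uniform draws on [0, 1)
sample_max = max(sample)

K = 5000  # number of bootstrap replications
hits = sum(max(rng.choices(sample, k=n)) == sample_max for _ in range(K))
share = hits / K

# Probability that a resample of size n contains the sample maximum.
exact = 1 - (1 - 1 / n) ** n

print(f"share of resamples whose maximum is the sample maximum: {share:.3f}")
print(f"1 - (1 - 1/n)^n = {exact:.3f}")
```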
Bootstrap and the dangers of outliers
• Exercise:
– Calculate the mean and the s.e. of the mean of the sample -870, 1, 8, 0.5, 3, 4 using the central limit theorem approximation and the bootstrap approximation.
– What is the probability that the outlier -870 is drawn more than 3 times for all 10 replications of a bootstrap calculation?
• Moral:
– Bootstrap results depend on the specific draws. Be careful with outliers.
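A possible way to approach the exercise, sketched in Python. In one replication the outlier is drawn a Binomial(6, 1/6) number of times, since each of the 6 draws picks it with probability 1/6. The wording "for all 10 replications" admits two readings, and the sketch computes both; which one was intended is not stated on the slide.

```python
import math
import statistics

sample = [-870, 1, 8, 0.5, 3, 4]  # sample from the slide
n = len(sample)

mean = statistics.mean(sample)
se_clt = statistics.stdev(sample) / math.sqrt(n)  # s / sqrt(N)


def binomial_tail(trials, p, k):
    """P(X > k) for X ~ Binomial(trials, p)."""
    return sum(
        math.comb(trials, j) * p**j * (1 - p) ** (trials - j)
        for j in range(k + 1, trials + 1)
    )


# One replication = n draws, each hitting the outlier with probability 1/n.
p_one = binomial_tail(n, 1 / n, 3)

# Reading 1: more than 3 times in each of the 10 (independent) replications.
p_each_of_10 = p_one ** 10

# Reading 2: more than 3 times in total over the 10 * n draws.
p_total = binomial_tail(10 * n, 1 / n, 3)

print(f"mean = {mean}, CLT s.e. = {se_clt:.2f}")
print(f"P(outlier drawn > 3 times in one replication) = {p_one:.6f}")
print(f"reading 1 (each of 10 replications)           = {p_each_of_10:.3e}")
print(f"reading 2 (total over 10 replications)        = {p_total:.4f}")
```

With only 10 replications, how often -870 happens to be drawn moves the bootstrap estimates a lot, which is the point of the moral above.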
BLOCK BOOTSTRAP
Block bootstrap
• If you believe there is correlation of the observations within firms, within an area, within an industry, simple bootstrap fails since observations are not i.i.d.
• For this, draw blocks (e.g. firms) of observations rather than observations.
• The blocks will be independent.
Block bootstrap
• Divide the dataset into blocks j=1,2,…,J, so that each block j has M observations, and observations across blocks are not correlated.
• (x11,…,x1M),(x21,…,x2M), …, (xJ1,…,xJM) are i.i.d. draws.
• Draw a sample of J blocks with replacement. Estimate your effect b1.
• Perform the previous step K times to get estimates b1,b2,…,bK.
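A minimal Python sketch of these steps, on made-up data: the four blocks below are hypothetical and only meant to show strong correlation within blocks. Whole blocks are drawn with replacement, the estimator (here the mean) is recomputed on the pooled observations, and the standard deviation of the K estimates serves as the standard error. A simple bootstrap that ignores the blocks is run for comparison only.

```python
import random
import statistics

# Hypothetical data: J = 4 blocks (say, firms) with M = 3 observations each.
blocks = [
    [1.0, 1.2, 0.9],
    [2.1, 2.4, 2.2],
    [0.3, 0.1, 0.4],
    [1.5, 1.7, 1.6],
]


def block_bootstrap(blocks, estimator, K, rng):
    """Return K estimates, each from J blocks drawn with replacement."""
    J = len(blocks)
    estimates = []
    for _ in range(K):
        drawn = rng.choices(blocks, k=J)  # whole blocks, with replacement
        pooled = [x for block in drawn for x in block]
        estimates.append(estimator(pooled))
    return estimates


rng = random.Random(0)  # fixed seed for reproducibility
b = block_bootstrap(blocks, statistics.mean, K=4000, rng=rng)
se_block = statistics.stdev(b)

# For comparison only: simple bootstrap that ignores the block structure.
flat = [x for block in blocks for x in block]
naive = [statistics.mean(rng.choices(flat, k=len(flat))) for _ in range(4000)]
se_naive = statistics.stdev(naive)

print(f"block bootstrap s.e.:  {se_block:.3f}")
print(f"simple bootstrap s.e.: {se_naive:.3f}")
```

On data like these, where observations within a block are close to each other, the simple bootstrap reports a smaller standard error because it treats the 12 observations as independent.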
Block bootstrap exercise
• Take the dataset of firms' earnings, calculate the mean of firms' dividends, and the standard error of the mean of firms' dividends, assuming that dividends are correlated within industries.
– Either using the bootstrap command.
– Or by drawing blocks yourself.
THEORY, PART 3: DOES THE BOOTSTRAP IMPROVE OVER THE ASYMPTOTIC APPROXIMATION?
Exercise
• Generate a sample of Xs of size N with Pareto distribution.
– Pareto distribution typical for income distributions.
• Calculate the mean of X, and the standard error of the mean of X, using bootstrap and the central limit approximation.
• Generate many samples with the same size, generated from the same Pareto distribution.
– Estimate the error of the bootstrap approximation, the error of the central limit theorem approximation, and the difference between the bootstrap approximation and the central limit theorem approximation.
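One possible reading of this exercise, sketched in Python. The shape parameter, sample size, and numbers of samples and replications are all hypothetical. The benchmark "true" standard error is itself a Monte Carlo estimate: the standard deviation of the sample mean across the simulated samples. The errors are measured as average absolute deviations from that benchmark, which is only one of several reasonable definitions.

```python
import math
import random
import statistics

ALPHA = 3.0  # hypothetical Pareto shape; the variance is finite only if > 2
N = 30       # hypothetical sample size
S = 200      # number of simulated samples
K = 200      # bootstrap replications per sample

rng = random.Random(0)  # fixed seed for reproducibility
means, se_clt, se_boot = [], [], []
for _ in range(S):
    x = [rng.paretovariate(ALPHA) for _ in range(N)]
    means.append(statistics.fmean(x))
    # Central limit approximation: s / sqrt(N).
    se_clt.append(statistics.stdev(x) / math.sqrt(N))
    # Bootstrap approximation: s.d. of K resampled means.
    boot = [statistics.fmean(rng.choices(x, k=N)) for _ in range(K)]
    se_boot.append(statistics.stdev(boot))

# Benchmark: standard deviation of the mean across the S simulated samples.
true_se = statistics.stdev(means)

err_clt = statistics.fmean([abs(s - true_se) for s in se_clt])
err_boot = statistics.fmean([abs(s - true_se) for s in se_boot])
gap = statistics.fmean([abs(c - b) for c, b in zip(se_clt, se_boot)])

print(f"benchmark s.e. of the mean: {true_se:.4f}")
print(f"average error, CLT:         {err_clt:.4f}")
print(f"average error, bootstrap:   {err_boot:.4f}")
print(f"average |CLT - bootstrap|:  {gap:.4f}")
```

Lowering ALPHA towards 2 makes the tail heavier, which is a natural way to explore how both approximations deteriorate.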
CONCLUSION
Practical advice
• The practical use of bootstrap is super simple.
– Bootstrap was discovered before the theory of bootstrap was written. It works so well that people started writing the theory to explain why.
• The theory is very hard, but you need only to remember:
– Bootstrap works for i.i.d. draws; for clustered samples, use block bootstrap.
– Bootstrap fails for non-continuous distributions and for heavy-tailed distributions (beware of outliers).
– If you have written an econometric procedure where it is hard to write the closed-form formula for the standard error of the coefficients, use bootstrap.