
Bootstrap

Econometrics A Ass. Prof. Amine Ouazad

Mice

•  Diabetic mice get a treatment. Their sugar level is measured after the treatment.

•  The numbers are the following:
   –  2.3, 4.1, 1.2, 2.6, 4.4, 1.9 in the control group.
   –  2.1, 2.0, 1.9, 1.6, 2.2, 0.7 in the treatment group.

•  Exercise:
   –  Estimate the effect of the treatment and the standard error of that effect.
   –  Discuss the assumptions needed to estimate the standard error of the treatment effect.

Outline

1.  Problemo
2.  Mice: the Bootstrap principle
3.  Implementation in Stata
4.  Theory:
    1.  Estimation of the C.D.F.
    2.  Estimation of confidence intervals by bootstrap
    3.  Improvement over the Central Limit Theorem
5.  Tricky

Problemo

•  We used two tricks to find confidence intervals:
   –  Either we used the Central Limit Theorem when the number of observations is large.
   –  Or we assumed normally distributed residuals when the number of observations is fixed (and small).

•  What if neither holds?
   –  The number of observations is small, and the residuals are not normally distributed.

•  Possibilities:
   –  Assuming another distribution for the residuals. Theoretically possible but super rare and nonstandard.
   –  Or using bootstrap.

Mice

•  Diabetic mice get a treatment. Their sugar level is measured after the treatment.

•  The numbers are the following:
   –  2.3, 4.1, 4.1, 1.2, 2.6, 4.4, 1.9 in the control group.
   –  2.1, 2.0, 1.2, 1.9, 1.6, 2.2, 0.7 in the treatment group.

•  Exercise:
   –  Estimate the effect of the treatment and the standard error of that effect using bootstrap.

Pooled observations (sugar level, group; C = control, T = treatment):

2.3 C, 4.1 C, 4.1 C, 1.2 C, 2.6 C, 2.2 T, 0.7 T, 2.1 T, 2.0 T, 1.2 T, 1.9 T, 1.6 T, 4.4 C, 1.9 C

Mice: The Bootstrap Principle

•  Sample with replacement from the set of observations of mice, drawing as many observations as the original sample contains.

•  Calculate an estimate b1 of the effect of the medication on mice.

•  Repeat this step k=1,2,…,K times. At each step, calculate bk.

•  The 2.5th percentile of the bk provides the lower bound of a 95% confidence interval on b.

•  The 97.5th percentile of the bk provides the upper bound of a 95% confidence interval on b.
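The steps above can be sketched in Python for the mice data. This is a minimal illustration, not the course's reference solution: it assumes the effect is the difference in group means (treatment minus control), that (value, group) pairs are resampled from the pooled sample, and that a resample with an empty group is simply redrawn. The helper names, the seed and K = 2000 are choices made for this example.

```python
import random
import statistics

# Sugar levels from the slide above: (value, group), "C" = control, "T" = treatment.
mice = [(2.3, "C"), (4.1, "C"), (4.1, "C"), (1.2, "C"), (2.6, "C"), (4.4, "C"), (1.9, "C"),
        (2.1, "T"), (2.0, "T"), (1.2, "T"), (1.9, "T"), (1.6, "T"), (2.2, "T"), (0.7, "T")]


def effect(sample):
    """Difference in mean sugar level, treatment minus control (assumed estimator)."""
    treated = [v for v, g in sample if g == "T"]
    control = [v for v, g in sample if g == "C"]
    return statistics.mean(treated) - statistics.mean(control)


def bootstrap_effects(sample, reps, rng):
    """Resample (value, group) pairs with replacement and re-estimate the effect."""
    draws = []
    while len(draws) < reps:
        resample = rng.choices(sample, k=len(sample))
        if {g for _, g in resample} != {"C", "T"}:
            continue  # one group is empty: the effect is undefined, so redraw
        draws.append(effect(resample))
    return sorted(draws)


def percentile(sorted_values, p):
    """Nearest-rank percentile of an already sorted list, p in [0, 1]."""
    return sorted_values[round(p * (len(sorted_values) - 1))]


rng = random.Random(0)  # fixed seed so the run is reproducible
b_hat = effect(mice)
b_k = bootstrap_effects(mice, reps=2000, rng=rng)
se_boot = statistics.stdev(b_k)  # bootstrap standard error of the effect
ci_low, ci_high = percentile(b_k, 0.025), percentile(b_k, 0.975)
print(b_hat, se_boot, ci_low, ci_high)
```

The standard deviation of the bk serves as the bootstrap standard error, and the two percentiles give the 95% interval described on the slide.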

Implementation in Stata

•  Use of the bootstrap command in Stata.

•  bootstrap, reps(20): regress y x

•  Issues:
   –  Only works with i.i.d. residuals (more on this later).
   –  Some bootstrap replications may fail because A2 is not satisfied for these samples.

•  Upside:
   –  Very versatile. Works with almost all estimation procedures in Stata.
   –  Improves confidence intervals for i.i.d. residuals in small samples: better than the normal approximation.

THEORY, PART 1: ESTIMATION OF THE EMPIRICAL CDF

Theory: Estimation of the C.D.F.

•  Recap: the c.d.f. of a random variable is a function: the probability that the random variable is no greater than a given threshold.

•  i.i.d. observations of X: {X1,X2,…,XN}.

•  The c.d.f. of X is F0(x) = P(X ≤ x).

Examples of c.d.f.s

•  The c.d.f. of firms’ earnings

[Figure: cumulative probability (0 to 1) plotted against earnings over book value of equity (-2 to 2).]

Empirical c.d.f.

•  Empirical c.d.f.: FN(x) = (1/N) Σi I(Xi ≤ x), the fraction of observations no greater than x.

•  By the law of large numbers, the empirical c.d.f. converges point by point to the true c.d.f. of X.

•  The variance of FN(x) is F0(x)(1-F0(x))/N, and by the central limit theorem FN(x) is approximately normal around F0(x).

•  (Glivenko-Cantelli theorem: the empirical c.d.f. converges uniformly, almost surely, to the true c.d.f. F0.)

The Empirical C.d.f.

•  Use the observations of firms’ earnings X1,X2,…,XN.

•  Using these draws, create the empirical c.d.f.

•  Result:
   –  For each x, the empirical c.d.f. converges to the true c.d.f. of X as the number of draws becomes infinitely large.
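A minimal Python sketch of the empirical c.d.f., computed as the share of observations no greater than x. The earnings figures below are hypothetical values invented for illustration; they are not the course dataset.

```python
def empirical_cdf(observations):
    """Return F_N, where F_N(x) is the share of observations that are <= x."""
    data = list(observations)
    n = len(data)

    def F_N(x):
        return sum(1 for value in data if value <= x) / n

    return F_N


# Hypothetical earnings-to-book-value ratios, for illustration only.
earnings = [-0.4, 0.1, 0.3, 0.3, 0.8, 1.2]
F_N = empirical_cdf(earnings)
print(F_N(0.3))  # share of firms with earnings ratio <= 0.3
```

F_N is a step function: it jumps by 1/N at each observation (by 2/N at the tied value 0.3).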

THEORY, PART 2: USING THE EMPIRICAL CDF TO APPROXIMATE THE DISTRIBUTION OF A STATISTIC

Drawing from the sample

•  Drawing a sample with replacement from the sample is identical to drawing from a random variable whose c.d.f. is the empirical c.d.f.

•  Indeed, the probability of picking a number no greater than x is exactly equal to the fraction of observations no greater than x, which is FN(x).
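A quick numerical check of this equivalence, as a sketch: it uses the control-group values from the mice slides, an arbitrary threshold x = 2.6, and a fixed seed, all chosen only for this illustration.

```python
import random

control = [2.3, 4.1, 4.1, 1.2, 2.6, 4.4, 1.9]  # control group from the mice slide
x = 2.6                                        # arbitrary threshold for the check

# Empirical c.d.f. at x: share of observations no greater than x.
F_N_x = sum(1 for v in control if v <= x) / len(control)

# Draw many values with replacement and measure how often a draw is <= x.
rng = random.Random(0)
draws = rng.choices(control, k=20000)
share = sum(1 for v in draws if v <= x) / len(draws)
print(F_N_x, share)
```

With many draws, the share of resampled values at or below x settles close to FN(x).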

Estimation of confidence intervals by bootstrap

•  Example: 8,9,10,2,1,8,9,5.
   –  Calculate the mean, and give an estimate of the variance of the mean.

•  Mean of X.
   –  The empirical mean is m = (1/N) Σi Xi.
   –  The c.d.f. of the mean is either approximated using the normal distribution (asymptotic approximation),
   –  Or using the empirical c.d.f.
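Both approaches can be compared on the example sample. The sketch below assumes the asymptotic variance of the mean is estimated by s²/N (with s² the sample variance) and that the bootstrap variance is the variance of K = 5000 resampled means; the seed and K are arbitrary choices.

```python
import random
import statistics

x = [8, 9, 10, 2, 1, 8, 9, 5]
n = len(x)

m = statistics.mean(x)                # empirical mean
var_clt = statistics.variance(x) / n  # asymptotic approximation: s^2 / N

# Bootstrap: resample N values with replacement, recompute the mean, repeat.
rng = random.Random(0)
boot_means = [statistics.mean(rng.choices(x, k=n)) for _ in range(5000)]
var_boot = statistics.pvariance(boot_means)

print(m, var_clt, var_boot)
```

The bootstrap variance is centred on (1/N) times the variance of the empirical distribution, which divides by N rather than N-1, so it comes out slightly below s²/N in a sample this small.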

Tricky

•  Bootstrap requires i.i.d. draws from the same random variable.
   –  If the observations are correlated (clustering or autocorrelation), the simple bootstrap is not valid.
   –  If the observations do not have the same distribution, the simple bootstrap is not valid.

Theory of the bootstrap

•  Key questions:
   –  Is the bootstrap estimator of a statistic a consistent estimator of that statistic?
   –  Is the bootstrap estimator better than the approximation provided by the Central Limit Theorem?

Theory of the bootstrap

•  Denote, as before, by Fn the empirical c.d.f. of the observations, and by F0 the true c.d.f. of the observations.

•  We are interested in the distribution of a statistic Tn(X1,X2,…,Xn) of the observations. This statistic is either: (i) an estimator, (ii) a test statistic, or (iii) a quantity of interest (a ratio, for instance).

•  Denote by Gn(.,F0) the c.d.f. of the statistic when the observations are drawn from F0.

Asymptotic approximations

•  The usual technique used so far.
   –  For instance, we use the asymptotic normality of the OLS estimator to estimate confidence intervals.

•  Principle: replace Gn(.,F0) with G∞(.,F0), which typically does not depend on the underlying distribution of the observations.
   –  For instance, the asymptotic distribution of the OLS estimator does not depend on the specific distribution of the residuals and the Xs (only on their variance-covariance matrix).

Bootstrap approximation

•  The bootstrap approximation uses Gn(.,Fn) as the approximation to Gn(.,F0).

•  This is equivalent to:
   –  Drawing a sample of the same size from the sample, using draws with replacement.
   –  Calculating K values of the statistic by repeating the procedure K times.
   –  Calculating the empirical c.d.f. of the statistic.
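The three steps can be written once for any statistic. The function below is a sketch of the Monte Carlo approximation to Gn(.,Fn); its name and signature are made up for this illustration, and the median is used only as an example of Tn.

```python
import random
import statistics


def bootstrap_cdf(sample, statistic, reps, rng):
    """Approximate G_n(., F_n): the c.d.f. of `statistic` under resampling."""
    n = len(sample)
    # Steps 1 and 2: K resamples of size n, one value of the statistic per resample.
    values = [statistic(rng.choices(sample, k=n)) for _ in range(reps)]

    # Step 3: empirical c.d.f. of the K values.
    def G(t):
        return sum(1 for v in values if v <= t) / reps

    return G


rng = random.Random(0)
sample = [8, 9, 10, 2, 1, 8, 9, 5]
G = bootstrap_cdf(sample, statistics.median, reps=2000, rng=rng)
print(G(5), G(8))  # bootstrap estimates of P(median <= 5) and P(median <= 8)
```

Passing a different function as `statistic` (a mean, a ratio, a regression coefficient) reuses the same procedure unchanged.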

Informal statement of the properties of bootstrap

•  For formal proofs see Horowitz (1999).

•  Bootstrap is consistent for:
   –  OLS/IV/Panel estimators.
   –  Maximum likelihood and GMM estimators.
   –  t statistics, F statistics.

•  Bootstrap fails to give a consistent estimate of the c.d.f. for:
   –  The distribution of the maximum/minimum of a sample.
   –  Heavy-tailed distributions (such as Cauchy distributions).
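The failure for the maximum can be seen numerically. In the sketch below (uniform draws, N = 50 and the seed are assumptions made for this illustration), a bootstrap sample contains the original maximum with probability 1-(1-1/N)^N, about 0.63, so the bootstrap distribution of the maximum puts a large point mass on the sample maximum. The true distribution of the maximum of a continuous variable has no such mass point.

```python
import random

rng = random.Random(0)
n = 50
sample = [rng.random() for _ in range(n)]  # uniform(0, 1) draws, an assumption
sample_max = max(sample)

reps = 5000
# Count the resamples whose maximum equals the original sample maximum.
hits = sum(1 for _ in range(reps) if max(rng.choices(sample, k=n)) == sample_max)
share = hits / reps

theory = 1 - (1 - 1 / n) ** n  # probability the maximum is drawn at least once
print(share, theory)
```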

Bootstrap and the dangers of outliers

•  Exercise:
   –  Calculate the mean and the s.e. of the mean of the sample -870, 1, 8, 0.5, 3, 4 using the central limit theorem approximation and the bootstrap approximation.
   –  What is the probability that the outlier -870 is drawn more than 3 times for all 10 replications of a bootstrap calculation?

•  Moral:
   –  Bootstrap results depend on the specific draws. Be careful with outliers.
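A sketch of the calculations behind this exercise, under one reading of the question: within a single bootstrap sample of size 6, the number of times the outlier is drawn is Binomial(6, 1/6), and the 10 replications are independent. The bootstrap s.e. uses K = 2000 replications and a fixed seed, both arbitrary.

```python
import math
import random
import statistics

x = [-870, 1, 8, 0.5, 3, 4]
n = len(x)

m = statistics.mean(x)
se_clt = statistics.stdev(x) / math.sqrt(n)  # central limit approximation

rng = random.Random(0)
boot_means = [statistics.fmean(rng.choices(x, k=n)) for _ in range(2000)]
se_boot = statistics.stdev(boot_means)       # bootstrap approximation

# Outlier count in one resample ~ Binomial(n, 1/n): probability of 4, 5 or 6 draws.
p_more_than_3 = sum(
    math.comb(n, j) * (1 / n) ** j * (1 - 1 / n) ** (n - j) for j in range(4, n + 1)
)
p_all_10 = p_more_than_3 ** 10   # the event occurs in each of 10 independent replications
p_never = (1 - 1 / n) ** n       # the outlier is absent from a given resample

print(m, se_clt, se_boot, p_more_than_3, p_all_10, p_never)
```

About one resample in three contains no outlier at all, while others contain it several times, which is why a bootstrap with only 10 replications can swing widely from one run to the next.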

BLOCK BOOTSTRAP

Block bootstrap

•  If you believe there is correlation of the observations within firms, within an area, or within an industry, simple bootstrap fails since observations are not i.i.d.

•  For this, draw blocks (e.g. firms) of observations rather than observations.

•  The blocks will be independent.

Block bootstrap

•  Divide the dataset into blocks j=1,2,…,J, so that each block j has M observations, and observations across blocks are not correlated.

•  (x11,…,x1M), (x21,…,x2M), …, (xJ1,…,xJM) are i.i.d. draws.

•  Draw a sample of J blocks with replacement. Estimate your effect b1.

•  Perform the previous step k=1,2,…,K times to get estimates b1,b2,…,bK.
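A minimal sketch of these steps in Python. The function name and the data (J = 4 firms with M = 3 observations each) are hypothetical, and the statistic is the pooled mean.

```python
import random
import statistics


def block_bootstrap(blocks, statistic, reps, rng):
    """Draw J whole blocks with replacement, pool them, and re-estimate."""
    J = len(blocks)
    estimates = []
    for _ in range(reps):
        drawn = rng.choices(blocks, k=J)  # J blocks, with replacement
        pooled = [obs for block in drawn for obs in block]
        estimates.append(statistic(pooled))
    return estimates


# Hypothetical data: J = 4 firms (blocks), M = 3 observations each.
blocks = [[1.0, 1.2, 0.9], [2.1, 2.0, 2.3], [0.4, 0.5, 0.3], [1.6, 1.5, 1.8]]

rng = random.Random(0)
b = block_bootstrap(blocks, statistics.mean, reps=1000, rng=rng)
se_block = statistics.stdev(b)  # block-bootstrap standard error of the mean
print(se_block)
```

Because whole blocks are drawn, any correlation inside a block is carried into every resample, which is what the observation-level bootstrap loses.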

Block bootstrap exercise

•  Take the dataset of firms’ earnings, calculate the mean of firms’ dividends and the standard error of that mean, assuming that dividends are correlated within industries.
   –  Either using the bootstrap command.
   –  Or by drawing blocks yourself.

THEORY, PART 3: DOES THE BOOTSTRAP IMPROVE OVER THE ASYMPTOTIC APPROXIMATION?

Exercise

•  Generate a sample of Xs of size N with Pareto distribution.
   –  Pareto distribution typical for income distributions.

•  Calculate the mean of X, and the standard error of the mean of X, using bootstrap and the central limit approximation.

•  Generate many samples with the same size, generated from the same Pareto distribution.
   –  Estimate the error of the bootstrap approximation, the error of the central limit theorem approximation, and the difference between the bootstrap approximation and the central limit theorem approximation.
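One way to set up this simulation is sketched below. It measures the "error" of each approximation by how far the coverage of a nominal 95% confidence interval falls from 0.95, which is an interpretation made here and not stated in the exercise. The shape parameter 3, N = 30, 200 simulated samples and 200 bootstrap replications are all arbitrary choices.

```python
import math
import random
import statistics

ALPHA = 3.0                      # assumed Pareto shape parameter (scale 1)
TRUE_MEAN = ALPHA / (ALPHA - 1)  # mean of a Pareto(ALPHA) variable with scale 1


def clt_interval(x):
    """95% interval from the normal approximation: mean +/- 1.96 * s.e."""
    m = statistics.fmean(x)
    half = 1.96 * statistics.stdev(x) / math.sqrt(len(x))
    return m - half, m + half


def bootstrap_interval(x, reps, rng):
    """95% percentile interval from `reps` resampled means."""
    means = sorted(statistics.fmean(rng.choices(x, k=len(x))) for _ in range(reps))
    return means[round(0.025 * (reps - 1))], means[round(0.975 * (reps - 1))]


def coverage(n, samples, reps, rng):
    """Share of simulated samples whose interval contains the true mean."""
    clt_hits = boot_hits = 0
    for _ in range(samples):
        x = [rng.paretovariate(ALPHA) for _ in range(n)]
        lo, hi = clt_interval(x)
        clt_hits += lo <= TRUE_MEAN <= hi
        lo, hi = bootstrap_interval(x, reps, rng)
        boot_hits += lo <= TRUE_MEAN <= hi
    return clt_hits / samples, boot_hits / samples


rng = random.Random(0)
clt_cov, boot_cov = coverage(n=30, samples=200, reps=200, rng=rng)
print(clt_cov, boot_cov, boot_cov - clt_cov)
```

Rerunning with other values of N or of the shape parameter shows how each approximation behaves as the tail gets heavier.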

CONCLUSION

Practical advice

•  The practical use of bootstrap is super simple.
   –  Bootstrap was discovered before the theory of bootstrap was written. It works so well that people started writing the theory to explain why.

•  The theory is very hard, but you need only to remember:
   –  Bootstrap works for i.i.d. draws; for clustered samples, use block bootstrap.
   –  Bootstrap fails for non-continuous distributions and for heavy-tailed distributions (beware of outliers).
   –  If you have written an econometric procedure where it is hard to write the closed-form formula for the standard error of the coefficients, use bootstrap.
