1
Regression Discontinuity Design
Saralyn Miller
Southern Methodist University
Paper presented at the annual meeting of the Southwest Educational Research Association, San Antonio, TX, February 2-4, 2011.
2
Presentation Outline
• What is RDD?
• Example of RDD.
• History of RDD.
• Why was RDD developed?
• Assumptions of RDD.
• When to use RDD.
• How to know if discontinuity occurred.
• RDD limitations.
• Computing RDD in R.
• Conclusions.
3
What is RDD?
• RDD is an alternative to treatment/control experimental research.
• RDD is a quasi-experimental research design that allows the researcher to use a selection criterion, rather than randomization of groups, to determine treatment effects.
• RDD determines the effectiveness of a program by comparing the scores of a treatment group, selected according to the cutoff criterion, to the values expected for that group based on the comparison group, which was not selected.
4
RDD Example
• Students are given the DIBELS ORF reading assessment.
• Students who do not meet the benchmark requirements are placed in a tier 2 intervention (treatment group).
• Students above the benchmark are not given an intervention (control group).
• After the intervention is complete, students are post-tested using DIBELS ORF.
• Control student scores are used to predict the scores that students in the treatment condition would have earned if the intervention had not occurred.
• Actual treatment student scores are compared to the values predicted from control student scores.
• If the actual treatment student scores are statistically significantly different (in slope or intercept) from the values predicted from control student scores, a treatment effect is reported.
5
History of RDD
• RDD was first used by Thistlethwaite and Campbell (1960).
• Developed for psychology and education.
– Compensatory education programs (e.g., Head Start).
• Campbell’s students, Sween and Trochim, studied RDD further.
– Non-linearity
– Multiple cutoffs
– Fuzzy discontinuity
• Several studies surfaced comparing RDD to experimental designs (e.g., in the medical community).
6
Why was RDD developed?
• RDD was developed to control for selection bias. It provided an alternative to matching subjects in a treatment/control (T/C) experimental design.
– In experimental research, researchers attempt to equate groups on all variables other than the actual treatment. Campbell and Stanley argue that this is extremely difficult to do.
• RDD behaves like an experiment randomized at the cutoff value.
– RDD allows for a more detailed description of the selection process.
• In some settings, randomizing would be unethical or politically unworkable.
7
RDD Assumptions
• The Cutoff Criterion
– The cutoff value must be followed.
• The Pre-Post Distribution (cannot be curved)
– The relationship between the pretest and posttest cannot be better explained with a curve.
• Comparison Group Pretest Variance
– The comparison group must have a sufficient number of pretest values.
• Continuous Pretest Distribution
– Both groups must come from a single continuous pretest distribution.
• Program Implementation
– The program is delivered to all recipients in the same manner.
8
When do I use RDD?
• Pretest/posttest design
– There are no matched treatment and control groups; comparisons are made against a predicted regression line.
– If posttest scores for those in the treatment condition are better predicted by a new regression line than by the extended regression line of the control group, then a treatment effect is reported.
• Single subject research
– Baseline serves as the control.
– If intervention scores are better predicted by a new regression line than by the extended regression line of the baseline, then a treatment effect is reported.
9
How do I use RDD?
• Pretest/Posttest Design
– The use of a separate control group is not always practical.
– Students are pre-tested and a cutoff criterion is used to determine treatment and control groups. (Example: students who do not meet a certain benchmark are put into a treatment condition.)
– The intervention is implemented.
– Students are post-tested.
10
Continued
– A regression model is calculated using pretest scores and a dummy-coded independent variable for treatment and control conditions to serve as predictors of posttest scores.
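Written out, the model on this slide might look like the following. The symbols are chosen here for illustration and do not appear in the presentation: c is the cutoff score, T is the dummy-coded treatment variable, and the interaction term is included on the assumption that slopes will be compared as well as intercepts.

```latex
\[
\text{post}_i = \beta_0 + \beta_1(\text{pre}_i - c) + \beta_2 T_i
              + \beta_3(\text{pre}_i - c)\,T_i + \varepsilon_i
\]
```

Under this parameterization, \(\beta_2\) is the difference between the two regression lines at the centering point (the intercept difference) and \(\beta_3\) is the difference between their slopes.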
11
Continued
• Single Subject Research
– Baseline serves as comparison data.
– Time is your independent variable, and the time point at which the intervention begins is your cutoff criterion.
– All data points after the cutoff criterion serve as your treatment data.
– A regression model is calculated using time and a dummy-coded independent variable as predictors of the dependent variable.
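A minimal sketch of the single-subject setup in R follows. The data are simulated for illustration (10 baseline sessions followed by 10 intervention sessions), and the variable names time, phase, and score are assumptions, not names from the presentation.

```r
# Hypothetical single-subject series: 10 baseline points, then 10 intervention points.
# All values are simulated for illustration; they are not data from the presentation.
set.seed(1)
time  <- 1:20
phase <- ifelse(time <= 10, 0, 1)   # 0 = baseline, 1 = intervention
score <- 20 + 0.2 * time + 8 * phase + rnorm(20, sd = 1.5)

time_c <- time - 10                 # center time at the cutoff point
fit    <- lm(score ~ time_c * phase)

# Coefficient table: "phase" is the jump at the cutoff,
# "time_c:phase" is the change in slope after the intervention begins.
round(coef(summary(fit)), 3)
```

Because the simulated series has a jump built in at the cutoff, the phase coefficient should come out clearly positive, while the slope change is simulated as zero.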
12
Continued
– If a discontinuity is found at the cutoff criterion, then a treatment effect is reported.
– A discontinuity can be a statistically significant change in either the slope or the y-intercept.
13
How do I know if there is a discontinuity?
• Compare slopes (interaction) and intercepts (main effects).
– Treatment effects are reported if the regression line for the treatment group predicts the treatment students' scores better than the control group's regression line does.
• If the interaction term shows statistical significance, the slope of the treatment group is statistically significantly different (SSD) from that of the control group.
• If the main effect term for the dummy-coded variable shows statistical significance, the intercept of the treatment group is SSD from that of the control group.
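As an illustration of reading those two terms, the sketch below pulls their p-values out of a fitted model. The data are simulated with a deliberate 15-point jump at the cutoff, and the 0.05 threshold and all variable names are assumptions made for this example.

```r
# Simulated pretest/posttest data (illustration only, not from the presentation).
pre   <- 1:40
treat <- ifelse(pre <= 20, 0, 1)                 # dummy code: 0 = control, 1 = treatment
post  <- 10 + 0.5 * pre + 15 * treat + sin(pre)  # built-in 15-point jump at the cutoff

pre_c <- pre - 20                                # center the pretest at the cutoff
fit   <- lm(post ~ pre_c * treat)

# Pull the p-values for the two terms described above.
p_vals <- coef(summary(fit))[, "Pr(>|t|)"]
intercept_differs <- unname(p_vals["treat"] < 0.05)        # main effect of the dummy
slope_differs     <- unname(p_vals["pre_c:treat"] < 0.05)  # interaction term
c(intercept = intercept_differs, slope = slope_differs)
```

The intercept flag reflects the jump built into the simulated data; the slope flag depends on the small deterministic wobble added by sin(pre) and should not be read as meaningful here.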
15
Limitations of RDD
• The number of subjects needed is almost 3 times that of an experimental T/C design. One group is usually very small (usually those struggling or those exceeding).
• Power decreases.
• Some relationships that appear to be a discontinuity are actually better explained by a curve.
• Fuzzy discontinuity: this occurs when the cutoff criterion is not strictly adhered to.
16
Curvilinearity Problem
http://socialresearchmethods.net/kb/statrd.htm (Trochim, 2006)
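One possible way to probe for the curvilinearity problem in R is to add a squared pretest term and test whether it improves on the straight-line model. This is a sketch under assumptions: the data are simulated from a smooth curve with no true jump, and the model comparison with anova() is one reasonable choice rather than the procedure used in the presentation.

```r
# Simulated data following a smooth curve with NO true jump at the cutoff
# (illustration only, not data from the presentation).
set.seed(42)
pre   <- 1:40
treat <- ifelse(pre <= 20, 0, 1)
pre_c <- pre - 20
post  <- 30 + 0.8 * pre_c + 0.03 * pre_c^2 + rnorm(length(pre), sd = 1)

linear_fit <- lm(post ~ pre_c * treat)               # separate straight lines per group
curved_fit <- lm(post ~ pre_c * treat + I(pre_c^2))  # adds one shared squared term

# F-test of the nested models: does the squared term improve the fit?
curve_check <- anova(linear_fit, curved_fit)
curve_check
```

If the squared term improves the fit, an apparent change in slope or intercept in the straight-line model may simply reflect curvature in the pre-post relationship.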
17
Steps for Computing Regression Discontinuity in R
• Subtract the cutoff score from the pretest value.
• Create a dummy-coded variable for the 2 groups.
• Run a regression on posttest scores given the new pretest scores and the new dummy-coded variable.
• The main effect of the dummy-coded variable indicates a statistically significant difference (SSD) in the intercepts.
• The interaction effect indicates SSD for the slopes.
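The steps above can be sketched as a small helper function. The name rdd_fit and the data are hypothetical, made up for illustration; the helper centers the pretest exactly at the cutoff, as the first step describes.

```r
# Hypothetical helper wrapping the steps above; the name rdd_fit is not from the presentation.
rdd_fit <- function(pre, post, cutoff) {
  pre_c <- pre - cutoff                 # step 1: subtract the cutoff from the pretest
  group <- ifelse(pre <= cutoff, 0, 1)  # step 2: dummy code the two groups
  lm(post ~ pre_c * group)              # step 3: main effects plus the interaction
}

# Small made-up data set, for illustration only.
pre  <- c(5, 8, 11, 14, 17, 20, 23, 26, 29, 32, 35, 38)
post <- c(12, 14, 17, 18, 21, 23, 35, 36, 39, 40, 43, 44)
fit  <- rdd_fit(pre, post, cutoff = 20)

# Row "group" tests the intercepts; row "pre_c:group" tests the slopes.
summary(fit)$coefficients
```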
18
Example 1: Slope and Intercept are SSD
pre<-c(3,4,7,8,9,12,15,17,18,19,7,5,10,12,16,17,15,4,9,16,22,40,30,24,25,32,53,68,29,32,24,52,69,34,55,47,33,34,37,60,58,52,50,44,44)
post<-c(6,5,14,12,20,30,35,40,41,44,20,10,20,33,29,34,40,12,20,43,30,45,30,45,36,44,53,75,40,41,34,55,72,40,55,50,40,41,42,65,65,55,52,47,49)
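preT <- ifelse(pre <= 20, 0, 1)   # 0 = at or below the cutoff of 20, 1 = above it
pre2 <- pre - 21                  # center the pretest at 21, the lowest score above the cutoff
m1 <- lm(post ~ pre2 * preT)      # main effects plus the pre2:preT interaction
summary(m1)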
19
Example 1: Slope and Intercept are SSD
Call:
lm(formula = post ~ pre2 * preT)

Residuals:
    Min      1Q  Median      3Q     Max
-8.3602 -2.0447 -0.6085  2.2779 11.5122

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  48.8699     1.9240  25.400  < 2e-16 ***
pre2          2.3827     0.1736  13.726  < 2e-16 ***
preT        -17.8182     2.4055  -7.407 4.41e-09 ***
pre2:preT    -1.5707     0.1830  -8.585 1.06e-10 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.945 on 41 degrees of freedom
Multiple R-squared: 0.9483, Adjusted R-squared: 0.9445
F-statistic: 250.6 on 3 and 41 DF, p-value: < 2.2e-16
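library(lattice)                    # provides xyplot()
plot(m1)                            # regression diagnostic plots
xyplot(post ~ pre, groups = preT)   # scatterplot with the two groups marked
plot(post ~ pre)
abline(lm(post ~ pre), lwd = 4)                                       # one line fitted to all students
abline(lm(post[preT == 0] ~ pre[preT == 0]), col = "blue", lwd = 3)   # group with pre <= 20
abline(lm(post[preT == 1] ~ pre[preT == 1]), col = "red", lwd = 3)    # group with pre > 20
abline(v = 20)                                                        # cutoff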
23
Example 2: Slope is SSD
pre<-c(9,10,11,12,10,12,15,17,18,19,9,10,11,12,16,17,15,10,12,16,22,40,30,24,25,32,53,68,29,32,24,52,69,34,55,47,33,34,37,60,58,52,50,44,44)
post<-c(15,16,17,18,17,19,26,27,28,30,20,21,20,25,29,23,25,17,20,29,30,45,30,45,36,44,53,75,40,41,34,55,72,40,55,50,40,41,42,65,65,55,52,47,49)
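preT <- ifelse(pre <= 20, 0, 1)   # 0 = at or below the cutoff of 20, 1 = above it
pre2 <- pre - 21                  # center the pretest at 21, the lowest score above the cutoff
m1 <- lm(post ~ pre2 * preT)      # main effects plus the pre2:preT interaction
summary(m1)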
24
Example 2: Slope is SSD
Call:
lm(formula = post ~ pre2 * preT)

Residuals:
   Min     1Q Median     3Q    Max
-8.360 -1.864 -0.702  1.969 11.512

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  32.6853     2.0203  16.178  < 2e-16 ***
pre2          1.3315     0.2362   5.637 1.42e-06 ***
preT         -1.6337     2.3597  -0.692   0.4926
pre2:preT    -0.5194     0.2412  -2.153   0.0372 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.332 on 41 degrees of freedom
Multiple R-squared: 0.9599, Adjusted R-squared: 0.957
F-statistic: 327.4 on 3 and 41 DF, p-value: < 2.2e-16
27
Example 3: Intercept is SSD
pre<-c(14,15,7,8,9,12,15,17,18,19,7,5,10,12,16,17,15,4,9,16,22,40,30,24,25,32,53,68,29,32,24,52,69,34,55,47,33,34,37,60,58,52,50,44,44)
post<-c(64,64,63,66,63,65,66,66,67,68,63,62,65,66,66,60,61,62,59,59,47,52,49,55,47,50,59,65,50,51,45,51,57,49,51,58,49,54,52,60,60,61,60,58,49)
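preT <- ifelse(pre <= 20, 0, 1)   # 0 = at or below the cutoff of 20, 1 = above it
pre2 <- pre - 21                  # center the pretest at 21, the lowest score above the cutoff
m1 <- lm(post ~ pre2 * preT)      # main effects plus the pre2:preT interaction
summary(m1)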
28
Example 3: Intercept is SSD
Call:
lm(formula = post ~ pre2 * preT)

Residuals:
    Min      1Q  Median      3Q     Max
-6.4227 -1.5633  0.1772  2.1679  6.7320

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  65.2954     1.5364  42.499  < 2e-16 ***
pre2          0.1766     0.1564   1.129    0.265
preT        -17.9134     1.9142  -9.358  9.9e-12 ***
pre2:preT     0.1187     0.1630   0.728    0.471
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.12 on 41 degrees of freedom
Multiple R-squared: 0.7976, Adjusted R-squared: 0.7828
F-statistic: 53.85 on 3 and 41 DF, p-value: 2.806e-14
31
Example 4: Slope and Intercept are not SSD
pre<-c(14,15,7,8,9,12,15,17,18,19,7,5,10,12,16,17,15,4,9,16,22,40,30,24,25,32,53,68,29,32,24,52,69,34,55,47,33,34,37,60,58,52,50,44,44)
post<-c(44,44,43,46,43,45,46,46,47,48,43,42,45,46,46,47,45,45,48,49,47,52,49,55,47,50,59,65,50,51,45,51,57,49,51,58,49,54,52,60,60,61,60,58,49)
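preT <- ifelse(pre <= 20, 0, 1)   # 0 = at or below the cutoff of 20, 1 = above it
pre2 <- pre - 21                  # center the pretest at 21, the lowest score above the cutoff
m1 <- lm(post ~ pre2 * preT)      # main effects plus the pre2:preT interaction
summary(m1)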
32
Example 4: Slope and Intercept are not SSD
Call:
lm(formula = post ~ pre2 * preT)

Residuals:
    Min      1Q  Median      3Q     Max
-6.4227 -1.5633 -0.1071  1.6471  6.7320

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 47.55588    1.38092  34.438   <2e-16 ***
pre2         0.24639    0.14061   1.752   0.0872 .
preT        -0.17386    1.72049  -0.101   0.9200
pre2:preT    0.04893    0.14649   0.334   0.7401
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.804 on 41 degrees of freedom
Multiple R-squared: 0.784, Adjusted R-squared: 0.7682
F-statistic: 49.61 on 3 and 41 DF, p-value: 1.052e-13
35
Justin
maximize <- read.table("C://Users/sjmiller/Desktop/RDD Justin.txt", header = TRUE)
attach(maximize)
library(MASS)
library(lattice)
Book2 <- ifelse(Book == 0, 0, 1)   # 0 when Book is 0, 1 otherwise
Time2 <- Time - 11                 # center time at 11
m1 <- lm(WRC ~ Time2 * Book2, na.action = na.omit)
summary(m1)
36
Call:
lm(formula = WRC ~ Time2 * Book2, na.action = na.omit)

Residuals:
     Min       1Q   Median       3Q      Max
-18.5048  -3.1323   0.5714   3.7263  11.0159

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   5.8571     9.0062   0.650   0.5183
Time2        -0.1429     1.2371  -0.115   0.9085
Book2        22.1235     9.1842   2.409   0.0195 *
Time2:Book2   0.9049     1.2381   0.731   0.4681
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 6.546 on 53 degrees of freedom
  (16 observations deleted due to missingness)
Multiple R-squared: 0.9078, Adjusted R-squared: 0.9026
F-statistic: 174 on 3 and 53 DF, p-value: < 2.2e-16
38
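plot(m1)                          # regression diagnostic plots
plot(WRC ~ Time)
abline(lm(WRC ~ Time), lwd = 4)                                      # one line fitted to all points
abline(lm(WRC[Book == 0] ~ Time[Book == 0]), col = "blue", lwd = 3)  # points where Book is 0
abline(lm(WRC[Book >= 1] ~ Time[Book >= 1]), col = "red", lwd = 3)   # points where Book is 1 or more
abline(v = 10)                                                       # cutoff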
39
Kristen
maximize <- read.table("C://Users/sjmiller/Desktop/RDD Kristen.txt", header = TRUE)
attach(maximize)
Book2 <- ifelse(Book.1 == 0, 0, 1)   # 0 when Book.1 is 0, 1 otherwise
Time2 <- Time.1 - 22                 # center time at 22
m2 <- lm(WRC.1 ~ Time2 * Book2, na.action = na.omit)
summary(m2)
40
Call:
lm(formula = WRC.1 ~ Time2 * Book2, na.action = na.omit)

Residuals:
     Min       1Q   Median       3Q      Max
-8.55110 -1.75967 -0.06843  1.62443 10.21171

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   8.9154     1.9400   4.596 2.96e-05 ***
Time2         0.2164     0.1537   1.407   0.1655
Book2        10.0195     2.2934   4.369 6.30e-05 ***
Time2:Book2   0.4259     0.1591   2.676   0.0100 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.937 on 50 degrees of freedom
  (19 observations deleted due to missingness)
Multiple R-squared: 0.9439, Adjusted R-squared: 0.9406
F-statistic: 280.6 on 3 and 50 DF, p-value: < 2.2e-16
41
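plot(m2)                          # regression diagnostic plots
plot(WRC.1 ~ Time.1)
abline(lm(WRC.1 ~ Time.1), lwd = 4)                                            # one line fitted to all points
abline(lm(WRC.1[Book.1 == 0] ~ Time.1[Book.1 == 0]), col = "blue", lwd = 3)    # points where Book.1 is 0
abline(lm(WRC.1[Book.1 >= 1] ~ Time.1[Book.1 >= 1]), col = "red", lwd = 3)     # points where Book.1 is 1 or more
abline(v = 21)                                                                 # cutoff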
43
Grace
maximize <- read.table("C://Users/sjmiller/Desktop/RDD Grace.txt", header = TRUE)
attach(maximize)
Book2 <- ifelse(Book.2 == 0, 0, 1)   # 0 when Book.2 is 0, 1 otherwise
Time2 <- Time.2 - 18                 # center time at 18
m3 <- lm(WRC.2 ~ Time2 * Book2, na.action = na.omit)
summary(m3)
44
Call:
lm(formula = WRC.2 ~ Time2 * Book2, na.action = na.omit)

Residuals:
     Min       1Q   Median       3Q      Max
-10.7003  -3.5374  -0.0953   3.2246   8.6488

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 20.95084    2.57369   8.140 3.17e-11 ***
Time2        0.62559    0.24477   2.556  0.01319 *
Book2        8.33048    2.88043   2.892  0.00535 **
Time2:Book2 -0.02093    0.24848  -0.084  0.93314
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 4.78 on 59 degrees of freedom
  (10 observations deleted due to missingness)
Multiple R-squared: 0.9182, Adjusted R-squared: 0.9141
F-statistic: 220.9 on 3 and 59 DF, p-value: < 2.2e-16
45
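plot(m3)                          # regression diagnostic plots
plot(WRC.2 ~ Time.2)
abline(lm(WRC.2 ~ Time.2), lwd = 4)                                            # one line fitted to all points
abline(lm(WRC.2[Book.2 == 0] ~ Time.2[Book.2 == 0]), col = "blue", lwd = 3)    # points where Book.2 is 0
abline(lm(WRC.2[Book.2 >= 1] ~ Time.2[Book.2 >= 1]), col = "red", lwd = 3)     # points where Book.2 is 1 or more
abline(v = 17)                                                                 # cutoff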
46
Conclusions: RDD
• RDD is an alternative to experimental research when a control group is not accessible.
• Provides an alternative approach when selection bias is prevalent.
• Treatment effects are reported if either the intercepts or slopes are statistically significantly different.
47
References
• Cook, T. D. (2008). “Waiting for life to arrive”: A history of the regression-discontinuity design in psychology, statistics and economics. Journal of Econometrics, 142, 636-654.
• Trochim, W. M. K. (1984). Research design for program evaluation: The regression-discontinuity approach. Sage, Beverly Hills, CA.
• Trochim, W. M. K. (2007). Regression-discontinuity analysis. Retrieved April 10, 2010, from http://www.socialresearchmethods.net/kb/statrd.htm