Meta-analysis Using Multilevel and Bayesian Models
David Rindskopf, CUNY Graduate Center
Why should I stay awake for this talk?
Advantages of Bayesian methods in meta-analysis
Realistic assumptions about unexplained variance
Borrow strength to get better estimates of effect sizes, and more accurate confidence intervals for them
More useful interpretation of results, including pretty pictures
Examples comparing usual, empirical Bayes, and fully Bayesian models
Why stay awake, continued (if you’re still awake)
How to do a Bayesian meta-analysis
Where to find software
hblm
Meta-Analyst
BUGS (Win and Open)
Where to read more about these methods
Meta-Analysis: Summarizing Results from Many Studies
Suppose a large number of studies have been done on a topic
Effectiveness of psychotherapy
Effect of class size on student achievement
Effectiveness of SAT coaching
Pygmalion effect (teacher expectancy)
How do we
Summarize the results
Reconcile discrepancies
o Due to sampling error
o Due to substantive differences among studies
o Due to methodological (artifactual) differences among studies
Meta-analysis: Typical Procedure
Translate results of each study into an effect size
Put all results on a “common scale”
Most typical: difference in means between two groups, divided by standard deviation of control group (not standard error of mean)
Find mean, standard error of mean, of effect sizes
Test homogeneity of effect sizes
See if study characteristics are related to effect sizes
e.g., therapist’s type/length of training
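The first step above can be sketched in a few lines of Python. This is only an illustration of the "difference in means divided by the control-group standard deviation" definition; the function name `effect_size` and the scores are hypothetical, invented for the example, and are not data from any study in this talk.

```python
from statistics import mean, stdev

def effect_size(treatment, control):
    """Standardized mean difference: (mean_T - mean_C) / SD of the control group.

    Uses the sample standard deviation of the control group,
    not the standard error of the mean.
    """
    return (mean(treatment) - mean(control)) / stdev(control)

# Hypothetical scores, invented purely for illustration.
treatment_scores = [12, 15, 14, 10, 14]
control_scores = [10, 12, 11, 9, 13]

d = effect_size(treatment_scores, control_scores)
print(f"effect size d = {d:.3f}")
```

Dividing by the standard deviation rather than the standard error keeps the effect size on a scale that does not depend on the sample size of the study.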
Multilevel framework includes meta-analysis
Usual multilevel examples
kids within classes within schools …
patients within doctors and/or hospitals
Why meta-analysis fits
participants nested within studies
study characteristics related to effect sizes
participant characteristics related to effect sizes
estimates of true effect sizes (EB estimates)
Types of Models (Verbal Description)
Model                 Verbal Description
Fixed                 All effect sizes (ES) are equal
Fixed w/predictors    All true (nonsampling) ES variation is accounted for by observed predictors
Random                True (nonsampling) ES variation cannot be accounted for by predictors
Random w/predictors   Some true variation in ES is accounted for by predictors
Empirical Bayes       Assumes no error in estimate of residual true variation of ES
Fully Bayesian        Assumes error in estimate of residual true variation of ES
Ex 1: SAT Coaching Studies
SAT Coaching Data
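$$\bar{d} = \frac{\sum_i w_i d_i}{\sum_i w_i} = \frac{.4536}{.05763} = 7.87$$
$$se_{\bar{d}} = \sqrt{\frac{1}{\sum_i w_i}} = \sqrt{\frac{1}{.05763}} = 4.17$$
where $w_i = 1/se_i^2$ is the weight for study $i$.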
Meta-Analysis, SAT Coaching: Homogeneity Analysis
Study   Observed effect size   Standard error of d     DIFF   DIFF.Z   DIFF.Z.2
1             28.39                  14.9              20.52     1.38     1.90
2              7.94                  10.2                .07      .01      .00
3             -2.75                  16.3             -10.62     -.65      .42
4              6.82                  11.0              -1.05     -.10      .01
5              -.64                   9.4              -8.51     -.91      .82
6               .63                  11.4              -7.24     -.64      .40
7             18.01                  10.4              10.14      .98      .95
8             12.16                  17.6               4.29      .24      .06
Mean = 7.87
Sum of DIFF.Z.2 = 4.56
Critical value of chi-square = 14.07, df = 7
(DIFF = observed effect size minus the mean; DIFF.Z = DIFF divided by the standard error; DIFF.Z.2 = DIFF.Z squared.)
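The homogeneity analysis can be reproduced from the eight effect sizes and standard errors listed above. The sketch below assumes the usual inverse-variance weights (1 / se squared), which is consistent with the mean of 7.87 and the sum of 4.56 reported on the slide; the variable names are chosen here for illustration.

```python
from math import sqrt

# Observed effect sizes and standard errors for the eight SAT coaching studies.
d = [28.39, 7.94, -2.75, 6.82, -0.64, 0.63, 18.01, 12.16]
se = [14.9, 10.2, 16.3, 11.0, 9.4, 11.4, 10.4, 17.6]

# Inverse-variance weights (assumed weighting scheme).
w = [1 / s**2 for s in se]

# Weighted mean effect size and its standard error.
d_bar = sum(wi * di for wi, di in zip(w, d)) / sum(w)
se_d_bar = sqrt(1 / sum(w))

# Homogeneity statistic: sum of squared standardized deviations from the mean.
q = sum(((di - d_bar) / si) ** 2 for di, si in zip(d, se))

CRITICAL_VALUE = 14.07  # chi-square critical value for df = 7, taken from the slide
print(f"mean = {d_bar:.2f}, se = {se_d_bar:.2f}, Q = {q:.2f}")
print("reject homogeneity" if q > CRITICAL_VALUE else "cannot reject homogeneity")
```

Because the statistic falls well below the critical value, the observed spread among the eight studies is no larger than sampling error alone would produce.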
Estimating Effect Sizes: SAT Coaching Data
Suppose we want to estimate the effect of coaching for School A. What is the best estimate?
Method 1: School A had an observed effect size of about 28 points. That is the best estimate for that school.
Method 2: The average effect size for all schools was about 8 points. There is no reason to believe that School A is different than any of the other schools in the study. Therefore, the estimate for School A should be 8 points.
Estimating Effect Sizes
What are the advantages and disadvantages of each approach?
Method 1: Treats each school independently.
Makes no use of information from other (probably similar) schools.
Does not adjust for sampling error.
Method 2: Treats schools as identical.
Gets more accurate estimate, but maybe of the wrong quantity.
Estimating Effect Sizes
Can we find a compromise between the methods that is better than each?
Yes: (Empirical) Bayes Estimates
EB Estimates: For each school, the estimated true effect size is a weighted average of that school’s observed ES, and the average ES over all schools.
The weights depend on
the sampling variability of the ES for that school
the estimated true variability
EB Estimates: Conceptual Diagram
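[Diagram: the EB estimate $\hat{d}_i^{EB}$ lies between the overall mean $\bar{d}$ and the observed effect size $d_i$.]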
• A school with a large standard error has an observed effect size estimate that cannot be trusted; therefore, its estimated true ES will be closer to the overall mean.
• A school with a small standard error has an observed ES estimate that can be trusted; therefore its estimated true ES will be closer to the observed ES.
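The weighted average described above can be sketched as follows. This assumes the standard normal-normal shrinkage weight, true variance / (true variance + sampling variance); the function name `eb_estimate` and the value used for the true variance `tau2` are hypothetical, chosen only to show the behavior, and are not estimates from the SAT data.

```python
def eb_estimate(d_i, se_i, d_bar, tau2):
    """Empirical Bayes estimate of a study's true effect size.

    d_i   : observed effect size for the study
    se_i  : standard error of d_i
    d_bar : average effect size over all studies
    tau2  : estimated true (nonsampling) variance of effect sizes
    """
    # Weight on the study's own result: near 1 when se_i is small,
    # near 0 when se_i is large relative to the true variability.
    weight = tau2 / (tau2 + se_i**2)
    return weight * d_i + (1 - weight) * d_bar

# School A from the SAT example, with a hypothetical tau2 of 100.
school_a = eb_estimate(d_i=28.39, se_i=14.9, d_bar=7.87, tau2=100.0)

# The same observed effect with a much smaller (hypothetical) standard error
# is shrunk far less toward the overall mean.
precise_school = eb_estimate(d_i=28.39, se_i=5.0, d_bar=7.87, tau2=100.0)

print(f"large SE: {school_a:.1f}, small SE: {precise_school:.1f}")
```

When `tau2` is zero the estimate collapses to the overall mean (Method 2), and as `tau2` grows it approaches the observed effect size (Method 1), so both earlier methods are limiting cases of this compromise.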
SAT Data: ES estimates
Bayesian confidence intervals: Shrunk to better point estimate, narrower CI, by borrowing strength
[Figure: plot from hblm(EffSize ~ 1, s.e. = std.err). Vertical axis: studies 1 to 8; horizontal axis: ESTIMATE, from -20 to 40. Legend: X = Prior Mn. (with Prior +/- Tau); --O-- = Post.Mn +/- Post.SD; --Y-- = Obs.Y +/- SE.]
Bayesian Interpretation: More natural and more informative for decisions
Results for Bayesian model fit
(using hblm)
Coefficients:
Mean S.D. Prob > 0
(Intercept) 8.0056 4.7052 0.9575
But maybe we’re not interested in Prob(Effect > 0)
What is the probability that the average effect is large enough to matter?
Let’s find Prob(Ave effect > 20)
Only has meaning in Bayesian framework
z = (20 – 8)/4.7 = 2.55
Prob(z < 2.55) = .995
Prob(z > 2.55) = .005
So it’s unlikely this type of training raises SAT verbal scores by a useful amount (on average)
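The calculation above can be checked with a short script. It follows the slide in treating the posterior of the average effect as approximately normal, using the posterior mean and S.D. from the hblm output; that normal approximation is an assumption of this sketch, and the helper `normal_cdf` is defined here rather than taken from any package.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1 + erf(z / sqrt(2)))

post_mean = 8.0056   # posterior mean of the intercept, from the hblm output
post_sd = 4.7052     # posterior S.D. of the intercept, from the hblm output
threshold = 20.0     # smallest average effect considered large enough to matter

z = (threshold - post_mean) / post_sd
prob_exceeds = 1 - normal_cdf(z)

print(f"z = {z:.2f}, Prob(average effect > {threshold:g}) = {prob_exceeds:.3f}")
```

Changing `threshold` answers the same question for any other definition of a meaningful effect.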
Ex 2: How to Incorporate Predictors of ES
Open Ed, Null Model
Call: hblm(eff.size ~ 1, s.e. = std.err)
DF for Resids= 6 RSS = 12.3639
Coefficients:
Mean S.D. Prob > 0
(Intercept) 0.4205 0.1049 0.9992
RSS Estimate of Tau = 0.1904
exp(post.mode(log(Tau))) = 0.1864 (s.d. = 0.089 )
Posterior Mean of Tau = 0.1723 (s.d. = 0.1084 )
[Figure, two panels plotted against Tau (grid: 0.022, 0.040, 0.069, 0.114, 0.186, 0.304, 0.503, 0.862, 1.610). Panel 1: Posterior Probability of Tau. Panel 2: Estimates Conditional on Tau (vertical axis: Conditional Mean, from -0.1 to 0.7), with curves A=(Intercept), B=1, C=2, D=3, E=4, F=5, G=6, H=7.]
Open Ed, Main Effects Model hblm Output
Call: hblm(eff.size ~ obs + higher.grade, s.e. = std.err)
DF for Resids= 4 RSS = 7.1349
Coefficients:
Mean S.D. Prob > 0
(Intercept) 0.5138 0.1574 0.9959
obs -0.2838 0.2779 0.1258
higher.grade -0.0229 0.2523 0.4578
RSS Estimate of Tau = 0.1666
exp(post.mode(log(Tau))) = 0.1699 (s.d. = 0.1187 )
Posterior Mean of Tau = 0.1488 (s.d. = 0.1359 )
Open Ed, Main Effects Model
Study        Y   Prior Mn   (Y-Prior)/SE   Post.Mn   Post.SD   Prob > 0
1       0.6490     0.2071         1.9270    0.3170    0.2110     0.9572
2      -0.0430     0.2071        -1.6782    0.0971    0.1441     0.7544
3       0.5030     0.5138        -0.0583    0.5103    0.1372     0.9996
4       0.4580     0.5138        -0.2985    0.4956    0.1389     0.9992
5       0.5770     0.5138         0.3524    0.5354    0.1363     0.9999
6       0.5880     0.4909         0.5069    0.5221    0.1582     0.9996
7       0.3920     0.4909        -0.5093    0.4597    0.1594     0.9967
Open Ed, Main Effects Model
[Figure: plot from hblm(eff.size ~ obs + higher.grade, s.e. = std.err). Vertical axis: studies 1 to 7; horizontal axis: ESTIMATE, from -0.2 to 0.8. Legend: X = Prior Mn. (with Prior +/- Tau); --O-- = Post.Mn +/- Post.SD; --Y-- = Obs.Y +/- SE.]
Open Ed, Main Effects Model
[Figure, two panels plotted against Tau (grid: 0.007, 0.018, 0.040, 0.083, 0.170, 0.347, 0.725, 1.596, 3.981). Panel 1: Posterior Probability of Tau. Panel 2: Estimates Conditional on Tau (vertical axis: Conditional Mean, from -0.1 to 0.7), with curves A=(Intercept), B=1, C=2, D=3, E=4, F=5, G=6, H=7.]
Open Ed, study 2 as outlier
Call: hblm(eff.size ~ study2, s.e. = std.err)
DF for Resids= 5 RSS = 1.0955
Coefficients:
Mean S.D. Prob > 0
(Intercept) 0.5230 0.0875 1.0000
study2 -0.5660 0.1964 0.0037
RSS Estimate of Tau = 0
exp(post.mode(log(Tau))) = 0.0686 (s.d. = 0.0616 )
Posterior Mean of Tau = 0.0668 (s.d. = 0.0653 )
[Figure, two panels plotted against Tau (grid: 0.001, 0.004, 0.011, 0.027, 0.069, 0.172, 0.444, 1.223, 3.960). Panel 1: Posterior Probability of Tau. Panel 2: Estimates Conditional on Tau (vertical axis: Conditional Mean, from -0.1 to 0.7), with curves A=(Intercept), B=1, C=2, D=3, E=4, F=5, G=6, H=7.]
Ex 3: Teacher Expectancy Data
“Continuous” Predictor: Weeks of Contact
Statistical Model
With no predictors:
$$\delta_j = \gamma_0 + u_j$$
$$d_j = \delta_j + e_j$$
Teacher Expectancy Data: Random Effects HLM (EB) Model, no predictors
Teacher Expectancy Data: Plot of Observed ES by Weeks of Contact
Statistical Model:
Teacher Expectancy Data: HLM Model, Weeks as Predictor
Regression (OLS) Predictions
Weeks   OLS Prediction
0         .41
1         .25
2         .09
3        -.06
Predicted ES = .407 - .157*Weeks
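The tabulated predictions follow from the fitted line, read here as ES = .407 - .157 × Weeks; the minus sign is inferred from the fact that the predictions fall as weeks increase. A minimal sketch that regenerates the table (the function name `predicted_es` is chosen for illustration):

```python
INTERCEPT = 0.407   # predicted ES with no prior contact
SLOPE = -0.157      # change in predicted ES per week of contact (sign inferred from the table)

def predicted_es(weeks):
    """OLS prediction of effect size from weeks of prior teacher-student contact."""
    return INTERCEPT + SLOPE * weeks

# Rounded to two decimals, as on the slide.
predictions = {weeks: round(predicted_es(weeks), 2) for weeks in range(4)}

for weeks, es in predictions.items():
    print(f"{weeks} weeks: predicted ES = {es:+.2f}")
```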
Estimated True ES as a Weighted Average of Observed ES and Conditional Mean
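model
{
  for (j in 1:n)
  {
    prec[j] <- 1/v[j]                      # precision = 1 / sampling variance of es[j]
    b0[j] <- ga00 + ga01 * w1[j] + u0[j]   # true ES: intercept + slope * predictor + random effect
    es[j] ~ dnorm(b0[j], prec[j])          # observed ES, given the true ES
    u0[j] ~ dnorm(0, tauinv)               # random effect; dnorm takes a precision
  }
  tau <- 1/tauinv                          # residual variance of the true effect sizes
  tauinv ~ dgamma(.001, .001)              # prior distribution
  ga00 ~ dnorm(0, .001)                    # prior distribution
  ga01 ~ dnorm(0, .001)                    # prior distribution
}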
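One way to read the model above is as a recipe for generating data. The Python sketch below draws one set of observed effect sizes from the same two-stage structure; it is a forward simulation only, not a replacement for the WinBUGS fit. The function name, the week values, and the sampling variances are hypothetical, and the parameter values are rounded from the posterior means reported in the WinBUGS output.

```python
import random

def simulate_effect_sizes(weeks, variances, ga00, ga01, tau, seed=0):
    """Draw observed effect sizes from the two-level model.

    weeks     : predictor value for each study (w1 in the BUGS code)
    variances : sampling variance of each study's ES (v in the BUGS code)
    tau       : variance of the study-level random effects (1 / tauinv)
    """
    rng = random.Random(seed)
    observed = []
    for w, v in zip(weeks, variances):
        u0 = rng.gauss(0.0, tau ** 0.5)            # study-level random effect
        b0 = ga00 + ga01 * w + u0                  # true effect size for this study
        observed.append(rng.gauss(b0, v ** 0.5))   # add sampling error
    return observed

# Hypothetical design: four studies with 0 to 3 weeks of contact and equal variances.
weeks = [0, 1, 2, 3]
variances = [0.02, 0.02, 0.02, 0.02]

sim = simulate_effect_sizes(weeks, variances, ga00=0.42, ga01=-0.16, tau=0.005)
print([round(es, 2) for es in sim])
```

Note that Python's `gauss` takes a standard deviation, whereas `dnorm` in BUGS takes a precision, which is why the sketch uses square roots of the variances.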
Fully Bayesian Modeling Using WinBUGS
node mean sd MC error 2.5% median 97.5% start sample
b0[1] 0.0818 0.0653 7.57E-4 -0.05326 0.08387 0.2076 10000 10001
b0[2] -0.0374 0.07597 0.001263 -0.1751 -0.04216 0.1249 10000 10001
b0[3] -0.07634 0.07543 8.636E-4 -0.2308 -0.07506 0.06993 10000 10001
b0[4] 0.4401 0.1172 0.001982 0.227 0.4335 0.6886 10000 10001
b0[5] 0.4101 0.1119 0.00124 0.1902 0.4095 0.6277 10000 10001
b0[6] -0.06481 0.06516 6.489E-4 -0.1951 -0.06507 0.06211 10000 10001
b0[7] -0.0543 0.06573 6.451E-4 -0.1838 -0.05487 0.07582 10000 10001
b0[8] -0.08677 0.08237 0.001189 -0.2682 -0.08189 0.06497 10000 10001
b0[9] 0.3966 0.1002 0.001034 0.1989 0.3978 0.5919 10000 10001
b0[10] 0.2896 0.09369 0.001865 0.1291 0.2809 0.5005 10000 10001
b0[11] 0.422 0.1117 0.001397 0.2041 0.4197 0.6496 10000 10001
b0[12] 0.3971 0.1063 0.00119 0.1842 0.3983 0.602 10000 10001
b0[13] 0.2407 0.08777 9.281E-4 0.05859 0.242 0.4089 10000 10001
b0[14] 0.1018 0.07637 8.85E-4 -0.04571 0.1 0.2649 10000 10001
b0[15] -0.08216 0.07495 9.984E-4 -0.2406 -0.07982 0.06029 10000 10001
b0[16] -0.06543 0.07606 7.27E-4 -0.2156 -0.06578 0.08336 10000 10001
b0[17] 0.2618 0.07717 9.501E-4 0.1139 0.2596 0.4201 10000 10001
b0[18] 0.08719 0.05891 6.253E-4 -0.03258 0.08769 0.204 10000 10001
b0[19] -0.06675 0.07586 7.727E-4 -0.2234 -0.0663 0.08335 10000 10001
ga00 0.4163 0.0939 0.001267 0.2352 0.4152 0.602 10000 10001
ga01 -0.161 0.03917 5.416E-4 -0.2391 -0.1608 -0.08579 10000 10001
tau 0.00492 0.006204 2.223E-4 4.419E-4 0.002861 0.02171 10000 10001
[Figure: WinBUGS posterior density plots for ga00, ga01, and tau, each based on a sample of 10001.]
References
Draper, D., Gaver, D. P., Goel, P. K., Greenhouse, J. B., Hedges, L. V., Morris, C. N., Tucker, J. R., & Waternaux, C. M. (1992). Combining information: Statistical issues and opportunities for research. Washington, DC: National Academy Press. http://www.ams.ucsc.edu/~draper/draper-etal-1993b.pdf
Efron, B., & Morris, C. (1977). Stein's paradox in statistics. Scientific American, 236(5), 119-127.
Raudenbush, S. W., & Bryk, A. S. (1985). Empirical Bayes meta-analysis. Journal of Educational Statistics, 10,
DuMouchel, W. (1994). Hierarchical Bayes linear models for meta-analysis. Technical Report 27, National Institute of Statistical Sciences. www.niss.org/technicalreports/tr27.pdf
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.25.3067&rep=rep1&type=pdf (DuMouchel & Lise-Normand)
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.17.6750&rep=rep1&type=ps (DuMouchel et al, hblm for weather and schizophrenia)
Spiegelhalter, D. J., Abrams, K. R. and Myles, J. P. (2004). Bayesian Approaches to Clinical Trials and Health-care Evaluation. Chichester: John Wiley and Sons Limited.
Sutton, A. J., Abrams, K. R., Jones, D. R., Sheldon, T. A. and Song, F. (2000). Methods for Meta-Analysis in Medical Research. Wiley, New York.
Bayesian Meta-Analysis Software
Meta-Analyst: http://tuftscaes.org/meta_analyst/
http://www.biomedcentral.com/content/pdf/1471-2288-9-80.pdf
hblm (runs only on Splus): ftp://ftp.research.att.com/dist/bayes-meta
WinBUGS: http://www.mrc-bsu.cam.ac.uk/bugs/
http://www.openbugs.info/w/
Summary
Multilevel models (Empirical Bayes) approach allows the analyst to
• Correct for sampling error
• Estimate effects of study characteristics on the outcome (substantive and methodological)
• Estimate true effect size distribution
• Estimate true effect size for each study
Fully Bayesian methods (hblm and WinBUGS)
• Allow inferences that take into account uncertainty about residual variance estimate
• Allow useful inferences, such as Prob(effect size > meaningful value)
• Allow prediction for new study, to see whether it is consistent with results in literature so far