meta-analysis using multilevel and bayesian modelsnycasa.org/meta analysis lecture bayesian...

Meta-analysis Using Multilevel and Bayesian Models

David Rindskopf, CUNY Graduate Center

Why should I stay awake for this talk?

Advantages of Bayesian methods in meta-analysis

Realistic assumptions about unexplained variance

Borrow strength to get better estimates of effect sizes and more accurate confidence intervals for them

More useful interpretation of results, including pretty pictures

Examples comparing usual, empirical Bayes, and fully Bayesian models

Why stay awake, continued (if you’re still awake)

How to do a Bayesian meta-analysis

Where to find software

hblm

Meta-Analyst

BUGS (Win and Open)

Where to read more about these methods

Meta-Analysis: Summarizing Results from Many Studies

Suppose a large number of studies have been done on a topic

Effectiveness of psychotherapy

Effect of class size on student achievement

Effectiveness of SAT coaching

Pygmalion effect (teacher expectancy)

How do we

Summarize the results

Reconcile discrepancies

o Due to sampling error

o Due to substantive differences among studies

o Due to methodological (artifactual) differences among studies

Meta-analysis: Typical Procedure

Translate results of each study into an effect size

Put all results on a “common scale”

Most typical: difference in means between two groups, divided by standard deviation of control group (not standard error of mean)

Find mean, standard error of mean, of effect sizes

Test homogeneity of effect sizes

See if study characteristics are related to effect sizes

e.g., therapist’s type/length of training
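The effect size described in the procedure above (difference in means divided by the control-group standard deviation) can be sketched as follows. The function name and the example numbers are hypothetical and are not taken from any study in these slides.

```python
def standardized_mean_difference(mean_treatment, mean_control, sd_control):
    """Effect size: difference in group means divided by the control-group SD.

    Note that the divisor is the standard deviation of the control group,
    not the standard error of the mean.
    """
    if sd_control <= 0:
        raise ValueError("sd_control must be positive")
    return (mean_treatment - mean_control) / sd_control


# Hypothetical study: treatment mean 105, control mean 100, control SD 10.
example_es = standardized_mean_difference(105.0, 100.0, 10.0)
print(example_es)
```

Because every study is divided by its own control-group standard deviation, results measured on different instruments end up on the same scale.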

Multilevel framework includes meta-analysis

Usual multilevel examples

kids within classes within schools ….

patients within doctors and/or hospitals

Why meta-analysis fits

participants nested within studies

study characteristics related to effect sizes

participant characteristics related to effect sizes

estimates of true effect sizes (EB estimates)

Types of Models (Verbal Description)

Model                 Verbal Description
Fixed                 All effect sizes (ES) are equal
Fixed w/predictors    All true (nonsampling) ES variation is accounted for by observed predictors
Random                True (nonsampling) ES variation cannot be accounted for by predictors
Random w/predictors   Some true variation in ES is accounted for by predictors
Empirical Bayes       Assumes no error in estimate of residual true variation of ES
Fully Bayesian        Assumes error in estimate of residual true variation of ES

Ex 1: SAT Coaching Studies

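SAT Coaching Data

$$\bar{d} = \frac{\sum_j d_j / se_j^2}{\sum_j 1 / se_j^2} = \frac{.4536}{.05763} = 7.87$$

$$se_{\bar{d}} = \sqrt{\frac{1}{\sum_j 1 / se_j^2}} = \sqrt{\frac{1}{.05763}} = 4.17$$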

Meta-Analysis: SAT Coaching, Homogeneity Analysis

Study   Observed effect size (d)   Standard error of d   DIFF     DIFF.Z   DIFF.Z.2
1        28.39                     14.9                   20.52    1.38    1.90
2         7.94                     10.2                     .07     .01     .00
3        -2.75                     16.3                  -10.62    -.65     .42
4         6.82                     11.0                   -1.05    -.10     .01
5         -.64                      9.4                   -8.51    -.91     .82
6          .63                     11.4                   -7.24    -.64     .40
7        18.01                     10.4                   10.14     .98     .95
8        12.16                     17.6                    4.29     .24     .06

Mean = 7.87

Sum of DIFF.Z.2 = 4.56

Critical value of chi-square = 14.07, df = 7
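The homogeneity analysis can be reproduced from the eight effect sizes and standard errors. This is a minimal sketch that assumes DIFF is the deviation from the precision-weighted mean, DIFF.Z is DIFF divided by the standard error, and the test statistic is the sum of squared DIFF.Z values; the variable names are illustrative, and the critical value is copied from the slide rather than computed.

```python
import math

# Observed effect sizes (d) and standard errors for the eight SAT coaching studies.
d = [28.39, 7.94, -2.75, 6.82, -0.64, 0.63, 18.01, 12.16]
se = [14.9, 10.2, 16.3, 11.0, 9.4, 11.4, 10.4, 17.6]

# Precision weights: studies with small standard errors count more.
weights = [1.0 / s**2 for s in se]

# Precision-weighted mean effect size and its standard error.
mean_d = sum(w * x for w, x in zip(weights, d)) / sum(weights)
se_mean = math.sqrt(1.0 / sum(weights))

# Homogeneity statistic: sum of squared standardized deviations from the mean.
q = sum(((x - mean_d) / s) ** 2 for x, s in zip(d, se))

CRITICAL_VALUE = 14.07  # chi-square critical value for df = 7, as given on the slide
homogeneous = q < CRITICAL_VALUE

print(round(mean_d, 2), round(se_mean, 2), round(q, 2), homogeneous)
```

Since the statistic falls well below the critical value, the data give no evidence against the hypothesis that all eight studies share one true effect size.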

Estimating Effect Sizes: SAT Coaching Data

Suppose we want to estimate the effect of coaching for School A. What is the best estimate?

Method 1: School A had an observed effect size of about 28 points. That is the best estimate for that school.

Method 2: The average effect size for all schools was about 8 points. There is no reason to believe that School A is different from any of the other schools in the study. Therefore, the estimate for School A should be 8 points.

Estimating Effect Sizes

What are the advantages and disadvantages of each approach?

Method 1: Treats each school independently.

Makes no use of information from other (probably similar) schools.

Does not adjust for sampling error.

Method 2: Treats schools as identical.

Gets more accurate estimate, but maybe of the wrong quantity.

Estimating Effect Sizes

Can we find a compromise between the methods that is better than each?

Yes: (Empirical) Bayes Estimates

EB Estimates: For each school, the estimated true effect size is a weighted average of that school’s observed ES, and the average ES over all schools.

The weights depend on

the sampling variability of the ES for that school

the estimated true variability
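The weighted average described above can be sketched in a few lines. This is an illustration under stated assumptions, not the hblm or HLM algorithm: the true-effect variance `tau2` is passed in as a known number (the value 25 below is purely illustrative and is not an estimate from these data), and the function name `eb_estimates` is hypothetical.

```python
def eb_estimates(d, se, tau2):
    """Shrink each observed effect size toward the overall mean.

    d    : observed effect sizes
    se   : their standard errors
    tau2 : assumed variance of the true effect sizes (treated as known here)
    """
    # Overall mean, weighting each study by 1 / (sampling variance + tau2).
    weights = [1.0 / (s**2 + tau2) for s in se]
    grand_mean = sum(w * x for w, x in zip(weights, d)) / sum(weights)

    estimates = []
    for x, s in zip(d, se):
        # Weight on the school's own ES: near 1 when se is small relative
        # to tau2, near 0 when the sampling variance dominates.
        own_weight = tau2 / (tau2 + s**2)
        estimates.append(own_weight * x + (1.0 - own_weight) * grand_mean)
    return grand_mean, estimates


d = [28.39, 7.94, -2.75, 6.82, -0.64, 0.63, 18.01, 12.16]
se = [14.9, 10.2, 16.3, 11.0, 9.4, 11.4, 10.4, 17.6]

grand_mean, shrunk = eb_estimates(d, se, tau2=25.0)  # tau2 is illustrative only
for study, (observed, estimate) in enumerate(zip(d, shrunk), start=1):
    print(study, observed, round(estimate, 2))
```

Setting `tau2` to zero reproduces Method 2 (every school gets the overall mean), while a very large `tau2` reproduces Method 1 (every school keeps its observed effect size).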

EB Estimates: Conceptual Diagram

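[Diagram: the EB estimate $\hat{d}_i^{EB}$ lies between the overall mean $\bar{d}$ and the school's observed effect size $d_i$.]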

• A school with a large standard error has an observed effect size estimate that cannot be trusted; therefore, its estimated true ES will be closer to the overall mean.

• A school with a small standard error has an observed ES estimate that can be trusted; therefore its estimated true ES will be closer to the observed ES.

SAT Data: ES estimates

Bayesian confidence intervals: Shrunk to better point estimate, narrower CI, by borrowing strength

[Figure: hblm(EffSize ~ 1, s.e. = std.err). One row per study (1 to 8); horizontal axis ESTIMATE, from -20 to 40. Legend: X = prior mean (Prior +/- Tau); --O-- = posterior mean +/- posterior SD; --Y-- = observed Y +/- SE.]

Bayesian Interpretation: More natural and more informative for decisions

Results for Bayesian model fit (using hblm)

Coefficients:

              Mean     S.D.     Prob > 0
(Intercept)   8.0056   4.7052   0.9575

But maybe we’re not interested in Prob(Effect > 0)

What is the probability that the average effect is large enough to matter?

Let’s find Prob(Ave effect > 20)

Only has meaning in Bayesian framework

z = (20 – 8)/4.7 = 2.55

Prob(z < 2.55) = .995

Prob(z > 2.55) = .005

So it’s unlikely this type of training raises SAT verbal scores by a useful amount (on average)
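The calculation above treats the posterior of the average effect as approximately normal, with the mean and S.D. reported by hblm. A minimal sketch of that approximation follows; the function name is hypothetical, and probabilities computed this way will not necessarily match the values hblm itself reports.

```python
import math


def prob_greater_than(threshold, post_mean, post_sd):
    """P(effect > threshold) under a normal approximation to the posterior."""
    z = (threshold - post_mean) / post_sd
    # Upper-tail probability of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Posterior mean and S.D. of the intercept from the hblm output.
p_large = prob_greater_than(20.0, 8.0056, 4.7052)
print(round(p_large, 3))
```

The threshold of 20 points is a judgment about what counts as a useful gain; changing it only requires calling the function with a different first argument.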

Ex 2: How to Incorporate Predictors of ES

Open Ed, Null Model

Call: hblm(eff.size ~ 1, s.e. = std.err)

DF for Resids= 6 RSS = 12.3639

Coefficients:

Mean S.D. Prob > 0

(Intercept) 0.4205 0.1049 0.9992

RSS Estimate of Tau = 0.1904

exp(post.mode(log(Tau))) = 0.1864 (s.d. = 0.089 )

Posterior Mean of Tau = 0.1723 (s.d. = 0.1084 )

[Figure: posterior probability of Tau (upper panel) and estimates conditional on Tau (lower panel; vertical axis: conditional mean, from -0.1 to 0.7). Tau axis values: 0.022, 0.040, 0.069, 0.114, 0.186, 0.304, 0.503, 0.862, 1.610. Curves: A=(Intercept) B=1 C=2 D=3 E=4 F=5 G=6 H=7.]

Open Ed, Main Effects Model hblm Output

Call: hblm(eff.size ~ obs + higher.grade, s.e. = std.err)


Coefficients:

Mean S.D. Prob > 0

(Intercept) 0.5138 0.1574 0.9959

obs -0.2838 0.2779 0.1258

higher.grade -0.0229 0.2523 0.4578

RSS Estimate of Tau = 0.1666

exp(post.mode(log(Tau))) = 0.1699 (s.d. = 0.1187 )


Open Ed, Main Effects Model

Study   Y         Prior Mn   (Y-Prior)/SE   Post.Mn   Post.SD   Prob > 0
1        0.6490   0.2071      1.9270        0.3170    0.2110    0.9572
2       -0.0430   0.2071     -1.6782        0.0971    0.1441    0.7544
3        0.5030   0.5138     -0.0583        0.5103    0.1372    0.9996
4        0.4580   0.5138     -0.2985        0.4956    0.1389    0.9992
5        0.5770   0.5138      0.3524        0.5354    0.1363    0.9999
6        0.5880   0.4909      0.5069        0.5221    0.1582    0.9996
7        0.3920   0.4909     -0.5093        0.4597    0.1594    0.9967
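The Prior Mn column is the regression prediction from the main effects coefficients. The sketch below assumes 0/1 codings for `obs` and `higher.grade`; the slides do not list each study's predictor values, so the codings shown are inferred from the Prior Mn column and should be read as an assumption.

```python
# Posterior mean coefficients from the hblm main effects output.
INTERCEPT = 0.5138
B_OBS = -0.2838
B_HIGHER_GRADE = -0.0229


def prior_mean(obs, higher_grade):
    """Predicted (prior) mean effect size for a study with the given predictors."""
    return INTERCEPT + B_OBS * obs + B_HIGHER_GRADE * higher_grade


# Assumed codings (obs, higher.grade), inferred rather than given in the slides.
assumed_codings = {
    "studies 1-2": (1, 1),
    "studies 3-5": (0, 0),
    "studies 6-7": (0, 1),
}

for label, (obs, higher_grade) in assumed_codings.items():
    print(label, round(prior_mean(obs, higher_grade), 4))
```

Each study's posterior mean then falls between this prior mean and its observed Y, as the Post.Mn column shows.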


[Figure: hblm(eff.size ~ obs + higher.grade, s.e. = std.err). One row per study (1 to 7); horizontal axis ESTIMATE, from -0.2 to 0.8. Legend: X = prior mean (Prior +/- Tau); --O-- = posterior mean +/- posterior SD; --Y-- = observed Y +/- SE.]


[Figure: posterior probability of Tau (upper panel) and estimates conditional on Tau (lower panel; vertical axis: conditional mean, from -0.1 to 0.7), with curves labeled A through H. Tau axis values: 0.007, 0.018, 0.040, 0.083, 0.170, 0.347, 0.725, 1.596, 3.981.]

Open Ed, study 2 as outlier

Call: hblm(eff.size ~ study2, s.e. = std.err)


Coefficients:

              Mean      S.D.     Prob > 0
(Intercept)    0.5230   0.0875   1.0000
study2        -0.5660   0.1964   0.0037

RSS Estimate of Tau = 0

exp(post.mode(log(Tau))) = 0.0686 (s.d. = 0.0616 )


[Figure: posterior probability of Tau (upper panel) and estimates conditional on Tau (lower panel; vertical axis: conditional mean, from -0.1 to 0.7), with curves labeled A through H. Tau axis values: 0.001, 0.004, 0.011, 0.027, 0.069, 0.172, 0.444, 1.223, 3.960.]

Ex 3: Teacher Expectancy Data

“Continuous” Predictor: Weeks of Contact

Teacher Expectancy Data: Random Effects HLM (EB) Model, no predictors

Statistical model with no predictors:

$$\delta_j = \gamma_0 + u_j$$

$$d_j = \delta_j + e_j$$

Teacher Expectancy Data: Plot of Observed ES by Weeks of Contact

Statistical Model:

Teacher Expectancy Data: HLM Model, Weeks as Predictor

Regression (OLS) Predictions

$$\hat{d} = .407 - .157 \times \text{Weeks}$$

Weeks   OLS Prediction
0        .41
1        .25
2        .09
3       -.06
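The OLS predictions in the table follow directly from the fitted line. A small sketch, using the intercept and slope from the slide (the function name is illustrative):

```python
def ols_prediction(weeks, intercept=0.407, slope=-0.157):
    """Predicted effect size for a study with the given weeks of prior contact."""
    return intercept + slope * weeks


# Predictions for 0 to 3 weeks of contact, rounded as on the slide.
predictions = {weeks: round(ols_prediction(weeks), 2) for weeks in range(4)}
print(predictions)
```

The negative slope means the predicted expectancy effect shrinks as teachers have more prior contact with students, and is close to zero by about three weeks.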

Estimated True ES as a Weighted Average of Observed ES and Conditional Mean
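One common way to write this weighted average is shown below, as a sketch rather than the exact formula used by any particular program. It assumes $v_j$ is the sampling variance of the observed effect size $d_j$, $\tau$ is the residual variance of the true effect sizes, and $W_j$ is weeks of contact; the weight $\lambda_j$ is a symbol introduced here for illustration.

```latex
\hat{\delta}_j^{EB} = \lambda_j \, d_j + (1 - \lambda_j)\,(\gamma_0 + \gamma_1 W_j),
\qquad
\lambda_j = \frac{\tau}{\tau + v_j}
```

When $v_j$ is large relative to $\tau$, $\lambda_j$ is near 0 and the estimate sits close to the conditional mean $\gamma_0 + \gamma_1 W_j$; when $v_j$ is small, the estimate stays close to the observed $d_j$.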

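model
{
  for (j in 1:n)
  {
    prec[j] <- 1/v[j]                      # precision = 1 / sampling variance of es[j]
    b0[j] <- ga00 + ga01 * w1[j] + u0[j]   # true ES: intercept + slope * predictor + residual
    es[j] ~ dnorm(b0[j], prec[j])          # observed ES varies around the true ES
    u0[j] ~ dnorm(0, tauinv)               # residual true variation (tauinv is a precision)
  }
  tau <- 1/tauinv                          # residual variance of true ES
  tauinv ~ dgamma(.001, .001)              # prior distribution
  ga00 ~ dnorm(0, .001)                    # prior distribution
  ga01 ~ dnorm(0, .001)                    # prior distribution
}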

Fully Bayesian Modeling Using WinBUGS

node mean sd MC error 2.5% median 97.5% start sample

b0[1] 0.0818 0.0653 7.57E-4 -0.05326 0.08387 0.2076 10000 10001

b0[2] -0.0374 0.07597 0.001263 -0.1751 -0.04216 0.1249 10000 10001

b0[3] -0.07634 0.07543 8.636E-4 -0.2308 -0.07506 0.06993 10000 10001

b0[4] 0.4401 0.1172 0.001982 0.227 0.4335 0.6886 10000 10001

b0[5] 0.4101 0.1119 0.00124 0.1902 0.4095 0.6277 10000 10001

b0[6] -0.06481 0.06516 6.489E-4 -0.1951 -0.06507 0.06211 10000 10001

b0[7] -0.0543 0.06573 6.451E-4 -0.1838 -0.05487 0.07582 10000 10001

b0[8] -0.08677 0.08237 0.001189 -0.2682 -0.08189 0.06497 10000 10001

b0[9] 0.3966 0.1002 0.001034 0.1989 0.3978 0.5919 10000 10001

b0[10] 0.2896 0.09369 0.001865 0.1291 0.2809 0.5005 10000 10001

b0[11] 0.422 0.1117 0.001397 0.2041 0.4197 0.6496 10000 10001

b0[12] 0.3971 0.1063 0.00119 0.1842 0.3983 0.602 10000 10001

b0[13] 0.2407 0.08777 9.281E-4 0.05859 0.242 0.4089 10000 10001

b0[14] 0.1018 0.07637 8.85E-4 -0.04571 0.1 0.2649 10000 10001

b0[15] -0.08216 0.07495 9.984E-4 -0.2406 -0.07982 0.06029 10000 10001

b0[16] -0.06543 0.07606 7.27E-4 -0.2156 -0.06578 0.08336 10000 10001

b0[17] 0.2618 0.07717 9.501E-4 0.1139 0.2596 0.4201 10000 10001

b0[18] 0.08719 0.05891 6.253E-4 -0.03258 0.08769 0.204 10000 10001

b0[19] -0.06675 0.07586 7.727E-4 -0.2234 -0.0663 0.08335 10000 10001

ga00 0.4163 0.0939 0.001267 0.2352 0.4152 0.602 10000 10001

ga01 -0.161 0.03917 5.416E-4 -0.2391 -0.1608 -0.08579 10000 10001

tau 0.00492 0.006204 2.223E-4 4.419E-4 0.002861 0.02171 10000 10001
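As a rough reading of this output, the posterior means can be plugged into the model to describe the true effect size expected at a given number of weeks. This sketch assumes that `w1` is weeks of contact, and it ignores the posterior uncertainty in `ga00`, `ga01`, and `tau`, so it understates the spread a full predictive simulation in WinBUGS would give. The function name is hypothetical.

```python
import math

# Posterior means from the WinBUGS output.
GA00 = 0.4163   # intercept
GA01 = -0.161   # slope for the predictor (assumed to be weeks of contact)
TAU = 0.00492   # residual variance of true effect sizes (tau <- 1/tauinv)


def plug_in_true_es(weeks):
    """Return (center, sd) of the true ES at `weeks`, using posterior means only."""
    center = GA00 + GA01 * weeks
    sd_true = math.sqrt(TAU)  # tau is a variance, so take the square root
    return center, sd_true


for weeks in range(4):
    center, sd_true = plug_in_true_es(weeks)
    print(weeks, round(center, 3), round(sd_true, 3))
```

The residual standard deviation of about 0.07 is small compared with the change across weeks, which is consistent with weeks of contact accounting for most of the true variation in effect sizes.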

[Figures: WinBUGS posterior density plots for ga00, ga01, and tau (sample: 10001 each).]

References

Draper, D., Gaver, D. P., Goel, P. K., Greenhouse, J. B., Hedges, L. V., Morris, C. N., Tucker, J. R., & Waternaux, C. M. (1992). Combining information: Statistical issues and opportunities for research. Washington, DC: National Academy Press. http://www.ams.ucsc.edu/~draper/draper-etal-1993b.pdf

Efron, B., & Morris, C. (1977). Stein's paradox in statistics. Scientific American, 236(5), 119-127.

Raudenbush, S. W., & Bryk, A. S. (1985). Empirical Bayes meta-analysis. Journal of Educational Statistics, 10,

DuMouchel, W. (1994). Hierarchical Bayes linear models for meta-analysis. Technical Report 27, National Institute of Statistical Sciences. http://www.niss.org/technicalreports/tr27.pdf

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.25.3067&rep=rep1&type=pdf (DuMouchel & Lise-Normand)

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.17.6750&rep=rep1&type=ps (DuMouchel et al, hblm for weather and schizophrenia)


References

Spiegelhalter, D. J., Abrams, K. R. and Myles, J. P. (2004). Bayesian Approaches to Clinical Trials and Health-care Evaluation. Chichester: John Wiley and Sons Limited.

Sutton, A. J., Abrams, K. R., Jones, D. R., Sheldon, T. A. and Song, F. (2000). Methods for Meta-Analysis in Medical Research. Wiley, New York.

Bayesian Meta-Analysis Software

Meta-Analyst: http://tuftscaes.org/meta_analyst/

http://www.biomedcentral.com/content/pdf/1471-2288-9-80.pdf

hblm (runs only on Splus): ftp://ftp.research.att.com/dist/bayes-meta

WinBUGS: http://www.mrc-bsu.cam.ac.uk/bugs/

OpenBUGS: http://www.openbugs.info/w/


Summary

The multilevel model (Empirical Bayes) approach allows the analyst to

• Correct for sampling error

• Estimate effects of study characteristics on the outcome (substantive and methodological)

• Estimate true effect size distribution

• Estimate true effect size for each study

Fully Bayesian methods (hblm and WinBUGS)

• Allow inferences that take into account uncertainty about residual variance estimate

• Allow useful inferences, such as Prob(effect size > meaningful value)

• Allow prediction for new study, to see whether it is consistent with results in literature so far
