meta-analysis

Funded through the ESRC’s Researcher Development Initiative

Department of Education, University of Oxford

Session 2.1 – Revision of Day 1

What are the 3 primary types of effect sizes?

What sort of information can be used to calculate effect sizes?

What software is available for calculating effect sizes?

Standardized mean difference

  Group contrasts

    Treatment groups

    Naturally occurring groups

  Inherently continuous construct

Odds-ratio

  Group contrasts

    Treatment groups

    Naturally occurring groups

  Inherently dichotomous construct

Correlation coefficient

  Association between variables

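Standardised mean difference: $ES = \dfrac{\bar{X}_{\text{Males}} - \bar{X}_{\text{Females}}}{SD_{\text{pooled}}}$

Correlation coefficient: $ES = r$

Odds ratio: $ES = \dfrac{ad}{bc}$, where $a$, $b$, $c$ and $d$ are the cell frequencies of the 2 × 2 table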

Standardised mean difference effect sizes indicate the amount of improvement of treatment group over control, or the difference between 2 groups.

Odds ratio effect sizes indicate how much more (or less) likely an outcome is in one group than in the other, e.g., not catching an illness after inoculation.

Correlation effect sizes indicate the strength of the relationship between 2 variables
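As a rough illustration of the three effect size types, the sketch below computes each one in plain Python. The function names, the degrees-of-freedom weighted pooled SD, and all input values are assumptions made for the example; they are not taken from the slides or from any particular software package.

```python
import math


def standardised_mean_difference(mean_1, mean_2, sd_1, sd_2, n_1, n_2):
    """Difference between two group means expressed in pooled-SD units."""
    # Pooled variance weights each group's variance by its degrees of freedom
    # (an assumed, conventional definition of the pooled SD).
    pooled_var = ((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / (n_1 + n_2 - 2)
    return (mean_1 - mean_2) / math.sqrt(pooled_var)


def odds_ratio(a, b, c, d):
    """Odds ratio ad / bc from the four cell frequencies of a 2 x 2 table."""
    return (a * d) / (b * c)


def correlation(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
d_es = standardised_mean_difference(105.0, 100.0, 10.0, 10.0, 50, 50)
or_es = odds_ratio(30, 20, 15, 35)
r_es = correlation([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])

print(f"d = {d_es:.2f}, odds ratio = {or_es:.2f}, r = {r_es:.2f}")
```

In practice the inputs would come from the coded study reports rather than raw scores, and small-sample corrections may also be applied.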

[Figure: Effect size as the proportion of the Treatment group doing better than the average Control group person. Three panels of overlapping Control and Treatment distributions: d = .20 (57% of T above the Control mean), d = .50 (69% of T above the Control mean), d = .80 (79% of T above the Control mean).]

Effect sizes can be thought of as the average percentile standing of the average treated participant relative to the average untreated participant.

Cohen's (1988) Standard   Effect Size   Percentile Standing   Percent of Nonoverlap

                          2.0           97.7                  81.1%
                          1.9           97.1                  79.4%
                          1.8           96.4                  77.4%
                          1.7           95.5                  75.4%
                          1.6           94.5                  73.1%
                          1.5           93.3                  70.7%
                          1.4           91.9                  68.1%
                          1.3           90                    65.3%
                          1.2           88                    62.2%
                          1.1           86                    58.9%
                          1.0           84                    55.4%
                          0.9           82                    51.6%
LARGE                     0.8           79                    47.4%
                          0.7           76                    43.0%
                          0.6           73                    38.2%
MEDIUM                    0.5           69                    33.0%
                          0.4           66                    27.4%
                          0.3           62                    21.3%
SMALL                     0.2           58                    14.7%
                          0.1           54                    7.7%
                          0.0           50                    0%
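The two right-hand columns of the table can be reproduced from the effect size alone if both groups are assumed to be normally distributed with equal variance. The sketch below uses that assumption: the percentile standing is the normal cumulative probability at d, and the percent of nonoverlap is computed with Cohen's U1 formula. The function names are illustrative only.

```python
import math


def normal_cdf(z):
    """Cumulative probability of the standard normal distribution at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def percentile_standing(d):
    """Percentile of the control distribution reached by the average treated person."""
    return 100.0 * normal_cdf(d)


def percent_nonoverlap(d):
    """Cohen's U1: percentage of the two distributions' combined area not shared."""
    p = normal_cdf(abs(d) / 2.0)
    return 100.0 * (2.0 * p - 1.0) / p


# Cohen's small, medium and large benchmarks.
for d in (0.2, 0.5, 0.8):
    print(f"d = {d:.1f}: percentile standing = {percentile_standing(d):.1f}, "
          f"nonoverlap = {percent_nonoverlap(d):.1f}%")
```

The values agree with the table after rounding; they describe idealised normal distributions rather than any observed data set.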

What are the key statistical assumptions of the 3 meta-analytic methods?

Fixed effects model:

Includes the entire population of studies to be considered; we do not want to generalise to other studies not included (including future studies).

All of the variability between effect sizes is due to sampling error alone. Thus, the effect sizes are only weighted by the within-study variance.

Assumes that the collected studies all represent random samples from the same population.

Effect sizes are independent.

$d_j = \delta + e_j$

Where

$d_j$ is the observed effect size in study $j$

$\delta$ is the ‘true’ population effect

and $e_j$ is the residual due to sampling variance in study $j$

In this and following formulae, we will use the symbols d and δ to refer to any measure for the observed and the true effect size, which is not necessarily the standardized mean difference.
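Under the fixed effects model, the single true effect δ is usually estimated by a weighted mean in which each study's weight is the inverse of its within-study (sampling) variance. The sketch below shows that calculation; the function name and the three studies are hypothetical, and inverse-variance weighting is stated here as the conventional choice rather than something spelled out on the slide.

```python
import math


def fixed_effect_mean(effect_sizes, variances):
    """Inverse-variance weighted mean effect size and its standard error."""
    # Each study is weighted only by its within-study variance.
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    mean_es = sum(w * d for w, d in zip(weights, effect_sizes)) / total_weight
    standard_error = math.sqrt(1.0 / total_weight)
    return mean_es, standard_error


# Hypothetical observed effect sizes d_j and their sampling variances.
observed_d = [0.2, 0.5, 0.8]
sampling_var = [0.04, 0.01, 0.04]

fe_mean, fe_se = fixed_effect_mean(observed_d, sampling_var)
print(f"fixed effects mean = {fe_mean:.3f}, SE = {fe_se:.3f}")
```

The middle study has the smallest variance, so it contributes most to the estimate of δ.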

Random effects model:

Includes only a sample of studies from the entire population of studies to be considered. As a result, we do want to generalise to other studies not included in the sample (e.g., future studies).

Variability between effect sizes is due to sampling error plus variability in the population of effects.

In contrast to fixed effects models, there are 2 sources of variance.

Assumes that the studies are random samples of some population in which the underlying (infinite-sample) effect sizes have a distribution rather than having a single value.

Effect sizes are independent.

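$d_j = \delta + u_j + e_j$

Where

$d_j$ is the observed effect size in study $j$

$\delta$ is the mean ‘true’ population effect size

$u_j$ is the deviation of the true study effect size from the mean true effect size

and $e_j$ is the residual due to sampling variance in study $j$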

Multilevel model:

Meta-analytic data is inherently hierarchical (i.e., effect sizes nested within studies) and has random error that must be accounted for.

Effect sizes are not necessarily independent.

Allows for multiple effect sizes per study.

The model combines fixed and random effects (often called a mixed effects model).

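$d_j = \gamma_0 + u_j + e_j$

Where

$d_j$ is the observed effect size in study $j$

$\gamma_0$ is the mean ‘true’ population effect size

$u_j$ is the deviation of the true study effect size from the mean true effect size

and $e_j$ is the residual due to sampling variance in study $j$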

If between-study variance = 0, the multilevel model simplifies to the fixed effects regression model

If no predictors are included, the model simplifies to the random effects model

If, in addition, the level 2 (between-study) variance = 0, the model simplifies to the fixed effects model

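Multilevel model with predictors: $d_j = \gamma_0 + \sum_{s=1}^{S} \gamma_s X_{sj} + u_j + e_j$

Fixed effects regression model: $d_j = \gamma_0 + \sum_{s=1}^{S} \gamma_s X_{sj} + e_j$

Random effects model: $d_j = \delta + u_j + e_j$

Fixed effects model: $d_j = \delta + e_j$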

Many meta-analysts use an adaptive (or “conditional”) approach

IF between-study variance is found in the homogeneity test

THEN use random effects model

OTHERWISE use fixed effects model

Fixed effects models are very common, even though the assumption of homogeneity is “implausible” (Van den Noortgate & Onghena, 2003)

There is a considerable lag in the uptake of new methods by applied meta-analysts

Meta-analysts need to stay on top of these developments by:

Attending courses

Wide reading across disciplines

What is the first step in the analysis of meta-analytic data in fixed or random effects models?

What 2 common statistical techniques have been adapted for use in fixed and random effects meta-analytic modelling?

What common statistical technique is multilevel modelling analogous to?

Usually start with a Q-test to determine the overall mean effect size and the homogeneity of the effect sizes (MeanES.sps macro)

If there is significant heterogeneity, then:

1) should probably conduct random effects analyses instead

2) model moderators of the effect sizes (determine the source/s of variance)

$Q = \sum_{i=1}^{k} w_i \left( ES_i - \overline{ES} \right)^2$

The homogeneity (Q) test asks whether the different effect sizes are likely to have all come from the same population (an assumption of the fixed effects model). Are the differences among the effect sizes no bigger than might be expected by chance?

$ES_i$ = effect size for each study (i = 1 to k)

$\overline{ES}$ = mean effect size

$w_i$ = a weight for each study based on the sample size

However, this (chi-square) test is heavily dependent on sample size. It is almost always significant unless the numbers (studies and people in each study) are VERY small. This means that the fixed effect model will almost always be rejected in favour of a random effects model.
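The Q statistic can be sketched in a few lines of Python. The example below assumes inverse-variance weights and refers Q to a chi-square distribution with k - 1 degrees of freedom; the small `chi2_sf` helper is a hand-rolled survival function for whole-number degrees of freedom, written only so that the sketch needs no external library. All names and data are hypothetical and this is not the MeanES.sps macro itself.

```python
import math


def q_statistic(effect_sizes, weights):
    """Return (Q, weighted mean effect size) for a set of studies."""
    total_weight = sum(weights)
    mean_es = sum(w * es for w, es in zip(weights, effect_sizes)) / total_weight
    q = sum(w * (es - mean_es) ** 2 for w, es in zip(weights, effect_sizes))
    return q, mean_es


def chi2_sf(x, df):
    """Upper-tail chi-square probability for integer df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive whole number")
    half_x = x / 2.0
    # Start from the closed form for df = 2 (even df) or df = 1 (odd df) ...
    if df % 2 == 0:
        a, tail = 1.0, math.exp(-half_x)
    else:
        a, tail = 0.5, math.erfc(math.sqrt(half_x))
    # ... then step the shape parameter up by 1 until it reaches df / 2.
    while a < df / 2.0:
        tail += half_x ** a * math.exp(-half_x) / math.gamma(a + 1.0)
        a += 1.0
    return tail


# Hypothetical effect sizes with inverse-variance weights (variances .04, .01, .04).
effect_sizes = [0.2, 0.5, 0.8]
weights = [25.0, 100.0, 25.0]

q, mean_es = q_statistic(effect_sizes, weights)
p_value = chi2_sf(q, len(effect_sizes) - 1)
print(f"mean ES = {mean_es:.2f}, Q = {q:.2f}, p = {p_value:.3f}")
```

With only three small hypothetical studies, Q does not reach significance here, which is consistent with the point that the test depends heavily on the number and size of the studies.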

The Q-test is easy to conduct using the MeanES.sps macro from David Wilson’s website

MeanES ES=d /W=weight.

Significant heterogeneity in the effect sizes; therefore a random effects model is more appropriate and/or moderators need to be modelled


The analogue to the ANOVA homogeneity analysis is appropriate for categorical variables

Looks for systematic differences between groups of responses within a variable

Easy to implement using MetaF.sps macro

MetaF ES = d /W = Weight /GROUP = TXTYPE /MODEL = FE.
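The logic of the ANOVA analogue can be sketched as a partition of the total Q into a between-groups part and a pooled within-groups part. The Python below is an illustrative sketch under fixed effects weighting, not a reimplementation of MetaF.sps; the group labels stand in for a categorical moderator such as TXTYPE and all values are hypothetical.

```python
def weighted_q(effect_sizes, weights):
    """Q statistic of a set of effect sizes around their weighted mean."""
    total_weight = sum(weights)
    mean_es = sum(w * es for w, es in zip(weights, effect_sizes)) / total_weight
    return sum(w * (es - mean_es) ** 2 for w, es in zip(weights, effect_sizes))


def partition_q(effect_sizes, weights, groups):
    """Split total Q into between-group and within-group components."""
    q_total = weighted_q(effect_sizes, weights)
    q_within = 0.0
    for label in set(groups):
        # Q computed separately inside each category of the moderator.
        es_g = [es for es, g in zip(effect_sizes, groups) if g == label]
        w_g = [w for w, g in zip(weights, groups) if g == label]
        q_within += weighted_q(es_g, w_g)
    q_between = q_total - q_within
    return q_total, q_between, q_within


# Hypothetical studies coded on a two-category moderator.
effect_sizes = [0.1, 0.3, 0.6, 0.8]
weights = [10.0, 10.0, 10.0, 10.0]
groups = ["A", "A", "B", "B"]

q_total, q_between, q_within = partition_q(effect_sizes, weights, groups)
print(f"Q total = {q_total:.2f}, Q between = {q_between:.2f}, Q within = {q_within:.2f}")
```

Q between is referred to a chi-square distribution with (number of groups - 1) degrees of freedom, and Q within with (number of studies - number of groups) degrees of freedom.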

Multiple regression homogeneity analysis is more appropriate for continuous variables and/or when there are multiple variables to be analysed

Tests the ability of groups within each variable to predict the effect size

Can include categorical variables in multiple regression as dummy variables

Easy to implement using MetaReg.sps macro

MetaReg ES = d /W = Weight /IVS = IV1 IV2 /MODEL = FE.
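For a single continuous moderator, the fixed effects regression reduces to a weighted least squares line through the effect sizes. The sketch below assumes one hypothetical predictor (standing in for IV1) and inverse-variance weights; it is meant to show the idea, not to mirror the output of MetaReg.sps.

```python
import math


def weighted_regression(effect_sizes, weights, predictor):
    """Weighted least squares of effect size on one predictor (fixed effects)."""
    total_weight = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, predictor)) / total_weight
    mean_es = sum(w * es for w, es in zip(weights, effect_sizes)) / total_weight
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, predictor))
    sxy = sum(w * (x - mean_x) * (es - mean_es)
              for w, x, es in zip(weights, predictor, effect_sizes))
    slope = sxy / sxx
    intercept = mean_es - slope * mean_x
    # With inverse-variance weights, the slope's sampling variance is 1 / sxx.
    slope_se = math.sqrt(1.0 / sxx)
    return intercept, slope, slope_se


# Hypothetical data: four studies, a coded study characteristic, equal weights.
effect_sizes = [0.1, 0.3, 0.5, 0.7]
weights = [20.0, 20.0, 20.0, 20.0]
iv1 = [0.0, 1.0, 2.0, 3.0]

intercept, slope, slope_se = weighted_regression(effect_sizes, weights, iv1)
print(f"intercept = {intercept:.2f}, slope = {slope:.2f}, SE(slope) = {slope_se:.2f}")
```

Dividing the slope by its standard error gives a z-test of whether the moderator predicts the effect size; with several predictors the same idea is expressed in matrix form.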

If the homogeneity test is rejected (it almost always will be), it suggests that there are larger differences than can be explained by chance variation (at the individual participant level). There is more than one “population” in the set of different studies.

The random effects model determines how much of this between-study variation can be explained by study characteristics that we have coded.

The total variance associated with the effect sizes has two components, one associated with differences within each study (participant level variation) and one between study variance:

$v_{Ti} = v_\theta + v_i$

The weighting for each effect size consists of the within-study variance (vi) and between-study variance (vθ)

The new weighting for the random effects model (wiRE) is given by the formula:

$w_{iRE} = \dfrac{1}{v_\theta + v_i}$


Thus, larger studies receive proportionally less weight in the RE model than in the FE model.

This is because a constant (vθ) is added to the denominator of every weight, so the relative effect of sample size will be smaller in the RE model.
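A short numerical sketch makes the shift in weights concrete. The two within-study variances and the between-study variance below are hypothetical values chosen only for illustration.

```python
def relative_weights(within_variances, v_theta=0.0):
    """Share of the total weight given to each study.

    v_theta = 0 gives fixed effects weights; v_theta > 0 gives random effects weights.
    """
    weights = [1.0 / (v + v_theta) for v in within_variances]
    total_weight = sum(weights)
    return [w / total_weight for w in weights]


# Hypothetical: a large study (small variance) and a small study (large variance).
within_variances = [0.01, 0.04]

fe_share = relative_weights(within_variances)
re_share = relative_weights(within_variances, v_theta=0.05)

print("FE shares:", [round(s, 2) for s in fe_share])
print("RE shares:", [round(s, 2) for s in re_share])
```

In this example the large study's share of the weight is smaller under the random effects model than under the fixed effects model, while the small study's share grows.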


Like the FE model, RE uses ANOVA and multiple regression to model potential moderators/predictors of the effect sizes, if the Q-test reveals significant heterogeneity

Easy to implement using MetaF.sps macro (ANOVA) or MetaReg.sps (multiple regression).

MetaF ES = d /W = Weight /GROUP = TXTYPE /MODEL = ML.

MetaReg ES = d /W = Weight /IVS = IV1 IV2 /MODEL = ML.

Significant heterogeneity in the effect sizes; therefore moderators need to be modelled

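The between-study variance is estimated by:

$v_\theta = \dfrac{Q - (k - 1)}{\sum w_i - \dfrac{\sum w_i^2}{\sum w_i}}$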
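The method-of-moments estimate of the between-study variance, and the random effects mean built from it, can be sketched as follows. The data are hypothetical, the weights are assumed to be inverse within-study variances, and truncating negative estimates to zero is a common convention that is assumed here rather than taken from the slide.

```python
def between_study_variance(effect_sizes, within_variances):
    """Method-of-moments estimate: (Q - (k - 1)) / (sum w - sum w^2 / sum w)."""
    weights = [1.0 / v for v in within_variances]
    sum_w = sum(weights)
    sum_w_sq = sum(w ** 2 for w in weights)
    mean_es = sum(w * es for w, es in zip(weights, effect_sizes)) / sum_w
    q = sum(w * (es - mean_es) ** 2 for w, es in zip(weights, effect_sizes))
    k = len(effect_sizes)
    v_theta = (q - (k - 1)) / (sum_w - sum_w_sq / sum_w)
    # Assumed convention: a negative estimate is set to zero.
    return max(0.0, v_theta)


def random_effects_mean(effect_sizes, within_variances, v_theta):
    """Weighted mean using the random effects weights 1 / (v_theta + v_i)."""
    weights = [1.0 / (v_theta + v) for v in within_variances]
    return sum(w * es for w, es in zip(weights, effect_sizes)) / sum(weights)


# Hypothetical effect sizes and within-study variances.
effect_sizes = [0.2, 0.5, 0.8]
within_variances = [0.04, 0.01, 0.04]

v_theta = between_study_variance(effect_sizes, within_variances)
re_mean = random_effects_mean(effect_sizes, within_variances, v_theta)
print(f"v_theta = {v_theta:.4f}, random effects mean = {re_mean:.3f}")
```

When Q is smaller than its degrees of freedom, the estimate is zero and the random effects weights reduce to the fixed effects weights.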


Similar to multiple regression, but corrects the standard errors for the nesting of the data

Start with an intercept-only (no predictors) model, which incorporates both the outcome-level and the study-level components. This tells us the overall mean effect size and is similar to a random effects model.

Then expand the model to include predictor variables, to explain systematic variance between the study effect sizes


(MLwiN screenshot)

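Model with predictors: $d_j = \gamma_0 + \sum_{s=1}^{S} \gamma_s X_{sj} + u_j + e_j$

Intercept-only model: $d_j = \gamma_0 + u_j + e_j$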

Using the same simulated data set with n = 15

Multilevel models:

build on the fixed and random effects models

account for between-study variance (like random effects)

are similar to multiple regression, but correct the standard errors for the nesting of the data. Improved modelling of the nesting of levels within studies increases the accuracy of the estimation of standard errors on parameter estimates and the assessment of the significance of explanatory variables (Bateman and Jones, 2003).

Multilevel modelling is more precise when there is greater between-study heterogeneity

Also allows flexibility in modelling the data when one has multiple moderator variables (Raudenbush & Bryk, 2002)

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates.

Lipsey, M. W., & Wilson, D. B. (2001). Practical meta-analysis. Thousand Oaks, CA: Sage Publications.

Van den Noortgate, W., & Onghena, P. (2003). Multilevel meta-analysis: A comparison with traditional meta-analytical procedures. Educational and Psychological Measurement, 63, 765-790.

Wilson’s “meta-analysis stuff” website: http://mason.gmu.edu/~dwilsonb/ma.html

Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models (2nd ed.). Thousand Oaks, CA: Sage Publications.
