meta-analysis
Funded through the ESRC’s Researcher Development Initiative
Department of Education, University of Oxford
Session 2.1 – Revision of Day 1
What are the 3 primary types of effect sizes?
What sort of information can be used to calculate effect sizes?
What software is available for calculating effect sizes?
Standardized mean difference
  Group contrasts
    Treatment groups
    Naturally occurring groups
  Inherently continuous construct
Odds-ratio
  Group contrasts
    Treatment groups
    Naturally occurring groups
  Inherently dichotomous construct
Correlation coefficient
  Association between variables
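Standardized mean difference: $ES = \frac{\bar{X}_{\text{Females}} - \bar{X}_{\text{Males}}}{SD_{\text{pooled}}}$
Correlation coefficient: $ES = r$
Odds-ratio: $ES = \frac{ad}{bc}$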
Standardised mean difference effect sizes indicate the amount of improvement of treatment group over control, or the difference between 2 groups.
Odds ratio effect sizes indicate the likelihood of something occurring, e.g., not catching an illness after inoculation
Correlation effect sizes indicate the strength of the relationship between 2 variables
Effect size as the proportion of the Treatment group doing better than the average Control group person
[Figure: three panels of overlapping Control and Treatment distributions, each marking the Control mean. d = .20: 57% of T above the Control mean; d = .50: 69% of T above; d = .80: 79% of T above.]
Effect sizes can be thought of as the average percentile standing of the average treated participant relative to the average untreated participant.
Cohen's (1988) standard   Effect size   Percentile standing   Percent of nonoverlap
                          2.0           97.7                  81.1%
                          1.9           97.1                  79.4%
                          1.8           96.4                  77.4%
                          1.7           95.5                  75.4%
                          1.6           94.5                  73.1%
                          1.5           93.3                  70.7%
                          1.4           91.9                  68.1%
                          1.3           90                    65.3%
                          1.2           88                    62.2%
                          1.1           86                    58.9%
                          1.0           84                    55.4%
                          0.9           82                    51.6%
LARGE                     0.8           79                    47.4%
                          0.7           76                    43.0%
                          0.6           73                    38.2%
MEDIUM                    0.5           69                    33.0%
                          0.4           66                    27.4%
                          0.3           62                    21.3%
SMALL                     0.2           58                    14.7%
                          0.1           54                    7.7%
                          0.0           50                    0%
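The two right-hand columns of the table can be reproduced from the effect size alone if both groups are assumed to be normally distributed with equal variance. The sketch below is in Python and is not part of the original slides. The function names are illustrative, and the formulas are Cohen's U3 (percentile standing) and U1 (percent of nonoverlap).

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def percentile_standing(d):
    """Cohen's U3: percentage of the control group scoring below
    the average treated participant (assumes normality, equal variances)."""
    return 100.0 * normal_cdf(d)

def percent_nonoverlap(d):
    """Cohen's U1: percentage of the combined area of the two
    distributions that is not shared (same assumptions as above)."""
    p = normal_cdf(abs(d) / 2.0)
    return 100.0 * (2.0 * p - 1.0) / p

# Cohen's small, medium and large benchmarks
for d in (0.2, 0.5, 0.8):
    print(f"d = {d:.1f}: percentile standing {percentile_standing(d):.1f}, "
          f"nonoverlap {percent_nonoverlap(d):.1f}%")
```

For d = 0.8 this gives a percentile standing of about 79 and a nonoverlap of about 47.4%, matching the LARGE row.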
What are the key statistical assumptions of the 3 meta-analytic methods?
Fixed effects model: includes the entire population of studies to be considered; we do not want to generalise to other studies not included (including future studies).
All of the variability between effect sizes is due to sampling error alone. Thus, the effect sizes are only weighted by the within-study variance.
Assumes that the collected studies all represent random samples from the same population
Effect sizes are independent.
$d_j = \delta + e_j$
Where
$d_j$ is the observed effect size in study j
$\delta$ is the ‘true’ population effect
and $e_j$ is the residual due to sampling variance in study j
In this and following formulae, we will use the symbols d and δ to refer to any measure for the observed and the true effect size, which is not necessarily the standardized mean difference.
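Under the fixed effects model, weighting by within-study variance alone amounts to an inverse-variance weighted mean. The Python sketch below is an illustration rather than the presenters' own code. The function name and the three studies are hypothetical.

```python
import math

def fixed_effect_mean(effect_sizes, variances):
    """Inverse-variance weighted mean effect size and its standard error.

    Each study is weighted by 1 / (within-study sampling variance),
    the only source of variance the fixed effects model allows.
    """
    weights = [1.0 / v for v in variances]
    total_w = sum(weights)
    mean_es = sum(w * d for w, d in zip(weights, effect_sizes)) / total_w
    se = math.sqrt(1.0 / total_w)
    return mean_es, se

# Hypothetical data: three studies with known sampling variances
d = [0.10, 0.30, 0.50]
v = [0.01, 0.02, 0.04]
mean_es, se = fixed_effect_mean(d, v)
print(f"mean ES = {mean_es:.3f}, SE = {se:.3f}")
```

The most precise study (smallest variance) pulls the mean towards its own estimate, which is why the result sits closer to 0.10 than the unweighted average of 0.30.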
Random effects model: the included studies are only a sample of studies from the entire population of studies to be considered. As a result, we do want to generalise to other studies not included in the sample (e.g., future studies).
Variability between effect sizes is due to sampling error plus variability in the population of effects.
In contrast to fixed effects models, there are 2 sources of variance
Assumes that the studies are random samples of some population in which the underlying (infinite-sample) effect sizes have a distribution rather than having a single value.
Effect sizes are independent.
$d_j = \delta + u_j + e_j$
Where
$d_j$ is the observed effect size in study j
$\delta$ is the mean ‘true’ population effect size
$u_j$ is the deviation of the true study effect size from the mean true effect size
and $e_j$ is the residual due to sampling variance in study j
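A small simulation can make the two sources of variance concrete. The Python sketch below draws observed effect sizes as the mean true effect plus a study-level deviation plus sampling error. It assumes both deviations are normally distributed and that every study has the same sampling variance. Those are simplifications for illustration, and all parameter values are hypothetical.

```python
import random
import statistics

def simulate_random_effects(n_studies, delta, tau2, v, seed=0):
    """Simulate observed effect sizes d_j = delta + u_j + e_j.

    tau2 is the between-study variance of u_j and v is the
    (assumed constant) sampling variance of e_j.
    """
    rng = random.Random(seed)
    observed = []
    for _ in range(n_studies):
        u_j = rng.gauss(0.0, tau2 ** 0.5)  # true study effect minus mean true effect
        e_j = rng.gauss(0.0, v ** 0.5)     # sampling error within the study
        observed.append(delta + u_j + e_j)
    return observed

d = simulate_random_effects(20000, delta=0.3, tau2=0.04, v=0.02)
print("mean of observed effect sizes:", round(statistics.mean(d), 3))
print("variance of observed effect sizes:", round(statistics.variance(d), 3))
```

With many simulated studies, the variance of the observed effect sizes approaches tau2 + v, the sum of the between-study and sampling components.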
Meta-analytic data is inherently hierarchical (i.e., effect sizes nested within studies) and has random error that must be accounted for
Effect sizes are not necessarily independent
Allows for multiple effect sizes per study
The model combines fixed and random effects (often called a mixed effects model)
$d_j = \gamma_0 + u_j + e_j$
Where
$d_j$ is the observed effect size in study j
$\gamma_0$ is the mean ‘true’ population effect size
$u_j$ is the deviation of the true study effect size from the mean true effect size
and $e_j$ is the residual due to sampling variance in study j
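The multilevel model with predictors, where $X_{sj}$ is the value of predictor s for study j and $\gamma_0$ is the mean ‘true’ population effect size:
$d_j = \gamma_0 + \sum_{s=1}^{S} \gamma_s X_{sj} + u_j + e_j$
If between-study variance = 0, the multilevel model simplifies to the fixed effects regression model:
$d_j = \gamma_0 + \sum_{s=1}^{S} \gamma_s X_{sj} + e_j$
If no predictors are included, the model simplifies to the random effects model:
$d_j = \delta + u_j + e_j$
If the level 2 variance = 0, the model simplifies to the fixed effects model:
$d_j = \delta + e_j$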
Many meta-analysts use an adaptive (or “conditional”) approach:
IF between-study variance is found in the homogeneity test
THEN use random effects model
OTHERWISE use fixed effects model
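The conditional rule can be sketched in Python as follows. This is an illustration only: the function names and the 0.05 threshold are assumptions, and the chi-square tail probability is computed with a plain series expansion rather than a statistics library, so it is meant for moderate values of Q.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with df degrees
    of freedom, via the series for the regularised incomplete gamma function."""
    if x <= 0:
        return 1.0
    a, y = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > total * 1e-12:
        term *= y / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-y + a * math.log(y) - math.lgamma(a))
    return max(0.0, 1.0 - lower)

def choose_model(q, k, alpha=0.05):
    """Adaptive choice: random effects if the homogeneity test on
    k effect sizes (df = k - 1) is significant, otherwise fixed effects."""
    p_value = chi2_sf(q, k - 1)
    model = "random effects" if p_value < alpha else "fixed effects"
    return model, p_value

# Hypothetical Q values for a meta-analysis of 10 effect sizes
print(choose_model(30.0, 10))
print(choose_model(5.0, 10))
```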
Fixed effects models are very common, even though the assumption of homogeneity is “implausible” (Van den Noortgate & Onghena, 2003)
There is a considerable lag in the uptake of new methods by applied meta-analysts
Meta-analysts need to stay on top of these developments by:
Attending courses
Wide reading across disciplines
What is the first step in the analysis of meta-analytic data in fixed or random effects models?
What 2 common statistical techniques have been adapted for use in fixed and random effects meta-analytic modelling?
What common statistical technique is multilevel modelling analogous to?
Usually start with a Q-test to determine the overall mean effect size and the homogeneity of the effect sizes (MeanES.sps macro)
If there is significant heterogeneity, then:
1) should probably conduct random effects analyses instead
2) model moderators of the effect sizes (determine the source/s of variance)
$Q = \sum_{i=1}^{k} w_i (ES_i - \overline{ES})^2$
The homogeneity (Q) test asks whether the different effect sizes are likely to have all come from the same population (an assumption of the fixed effects model). Are the differences among the effect sizes no bigger than might be expected by chance?
$ES_i$ = effect size for each study (i = 1 to k)
$\overline{ES}$ = mean effect size
$w_i$ = a weight for each study based on the sample size
However, this (chi-square) test is heavily dependent on sample size. It is almost always significant unless the numbers (studies and people in each study) are VERY small. This means that the fixed effect model will almost always be rejected in favour of a random effects model.
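As a worked illustration of the Q statistic, the Python sketch below computes the weighted mean effect size and Q for three hypothetical studies. It is not the MeanES.sps macro. The function name and data are made up for the example, and the weights are taken to be inverse sampling variances.

```python
def q_statistic(effect_sizes, weights):
    """Homogeneity statistic Q, the weighted mean effect size,
    and the degrees of freedom (k - 1)."""
    total_w = sum(weights)
    mean_es = sum(w * es for w, es in zip(weights, effect_sizes)) / total_w
    q = sum(w * (es - mean_es) ** 2 for w, es in zip(weights, effect_sizes))
    return q, mean_es, len(effect_sizes) - 1

# Hypothetical effect sizes with inverse-variance weights
es = [0.10, 0.30, 0.50]
w = [100.0, 50.0, 25.0]
q, mean_es, df = q_statistic(es, w)
print(f"Q = {q:.3f} on {df} df, mean ES = {mean_es:.3f}")
```

Q is compared against a chi-square distribution with k - 1 degrees of freedom. Here Q is about 3.71 on 2 df, below the .05 critical value of 5.991, so homogeneity would not be rejected for these made-up data.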
The Q-test is easy to conduct using the MeanES.sps macro from David Wilson’s website
MeanES ES=d /W=weight.
Significant heterogeneity in the effect sizes, therefore a random effects model is more appropriate and/or moderators need to be modelled
The analogue to the ANOVA homogeneity analysis is appropriate for categorical variables
Looks for systematic differences between groups of responses within a variable
Easy to implement using MetaF.sps macro
MetaF ES = d /W = Weight /GROUP = TXTYPE /MODEL = FE.
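The logic of the ANOVA analogue is that the total Q is partitioned into a between-groups part and a pooled within-groups part. The Python sketch below illustrates that partition under the fixed effects model. It is not the MetaF.sps macro, and the function names, group labels and data are hypothetical.

```python
from collections import defaultdict

def weighted_q(effect_sizes, weights):
    """Weighted sum of squared deviations around the weighted mean."""
    total_w = sum(weights)
    mean_es = sum(w * es for w, es in zip(weights, effect_sizes)) / total_w
    return sum(w * (es - mean_es) ** 2 for w, es in zip(weights, effect_sizes))

def analog_anova(effect_sizes, weights, groups):
    """Partition Q_total into Q_between and Q_within for a categorical moderator.

    Returns (q_between, df_between, q_within, df_within).
    """
    by_group = defaultdict(lambda: ([], []))
    for es, w, g in zip(effect_sizes, weights, groups):
        by_group[g][0].append(es)
        by_group[g][1].append(w)
    q_total = weighted_q(effect_sizes, weights)
    q_within = sum(weighted_q(es_g, w_g) for es_g, w_g in by_group.values())
    q_between = q_total - q_within
    return q_between, len(by_group) - 1, q_within, len(effect_sizes) - len(by_group)

# Hypothetical data: two treatment types, two studies each
es = [0.1, 0.2, 0.6, 0.7]
w = [50.0, 50.0, 50.0, 50.0]
txtype = ["A", "A", "B", "B"]
print(analog_anova(es, w, txtype))
```

A large Q_between relative to its degrees of freedom suggests the grouping variable explains differences among effect sizes. A large Q_within suggests heterogeneity remains inside the groups.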
Multiple regression homogeneity analysis is more appropriate for continuous variables and/or when there are multiple variables to be analysed
Tests the ability of groups within each variable to predict the effect size
Can include categorical variables in multiple regression as dummy variables
Easy to implement using MetaReg.sps macro
MetaReg ES = d /W = Weight /IVS = IV1 IV2 /MODEL = FE.
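The regression analogue is a weighted least squares regression of the effect sizes on the moderators. The Python sketch below handles the simplest case of a single moderator under the fixed effects model, assuming the weights are inverse sampling variances. It is not the MetaReg.sps macro, and the function name and data are hypothetical.

```python
import math

def meta_regression_fe(effect_sizes, weights, x):
    """Weighted least squares regression of effect size on one moderator.

    With inverse-variance weights, the slope's standard error is
    sqrt(1 / weighted sum of squares of x); no residual mean square is used.
    """
    sw = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / sw
    y_bar = sum(w * yi for w, yi in zip(weights, effect_sizes)) / sw
    sxx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    sxy = sum(w * (xi - x_bar) * (yi - y_bar)
              for w, xi, yi in zip(weights, x, effect_sizes))
    slope = sxy / sxx
    q_total = sum(w * (yi - y_bar) ** 2 for w, yi in zip(weights, effect_sizes))
    q_model = slope * sxy
    return {
        "intercept": y_bar - slope * x_bar,
        "slope": slope,
        "se_slope": math.sqrt(1.0 / sxx),
        "q_model": q_model,                 # variability explained by the moderator
        "q_residual": q_total - q_model,    # variability left unexplained
    }

# Hypothetical data: a dummy-coded moderator (0 = type A, 1 = type B)
result = meta_regression_fe([0.1, 0.2, 0.6, 0.7], [50.0] * 4, [0, 0, 1, 1])
print(result)
```

With a single dummy-coded moderator, the model Q equals the between-groups Q from the ANOVA analogue, which is why categorical variables can be entered into the regression as dummy variables.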
If the homogeneity test is rejected (it almost always will be), it suggests that there are larger differences than can be explained by chance variation (at the individual participant level). There is more than one “population” in the set of different studies.
The random effects model determines how much of this between-study variation can be explained by study characteristics that we have coded.
The total variance associated with the effect sizes has two components, one associated with differences within each study (participant level variation) and one between study variance:
$v_{Ti} = v_\theta + v_i$
The weighting for each effect size consists of the within-study variance (vi) and between-study variance (vθ)
The new weighting for the random effects model (wiRE) is given by the formula:
$w_{iRE} = \frac{1}{v_i + v_\theta}$
Thus, larger studies receive proportionally less weight in RE model than in FE model.
This is because a constant is added to the denominator, so the relative effect of sample size will be smaller in RE model
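The effect of adding a constant to the denominator can be seen with two hypothetical studies, one large (small sampling variance) and one small. The Python sketch below compares each study's share of the total weight under the two models. The function name and the between-study variance of 0.05 are assumptions for illustration.

```python
def relative_weights(variances, v_theta=0.0):
    """Each study's share of the total weight.

    v_theta = 0 gives fixed effects weights 1 / v_i;
    v_theta > 0 gives random effects weights 1 / (v_i + v_theta).
    """
    weights = [1.0 / (v + v_theta) for v in variances]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical sampling variances: a large study and a small study
variances = [0.01, 0.04]
print("FE shares:", relative_weights(variances))
print("RE shares:", relative_weights(variances, v_theta=0.05))
```

The large study carries 80% of the weight under FE but only 60% under RE, because the same between-study variance is added to both denominators.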
Like the FE model, RE uses ANOVA and multiple regression to model potential moderators/predictors of the effect sizes, if the Q-test reveals significant heterogeneity
Easy to implement using MetaF.sps macro (ANOVA) or MetaReg.sps (multiple regression).
MetaF ES = d /W = Weight /GROUP = TXTYPE /MODEL = ML.
MetaReg ES = d /W = Weight /IVS = IV1 IV2 /MODEL = ML.
Significant heterogeneity in the effect sizes, therefore moderators need to be modelled
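The between-study variance component is estimated as:
$v_\theta = \frac{Q - (k - 1)}{\sum w_i - \frac{\sum w_i^2}{\sum w_i}}$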
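A method-of-moments estimate of the between-study variance can be sketched in Python as follows. This assumes fixed effects weights equal to the inverse sampling variances and follows the common convention of setting negative estimates to zero. The function name and data are hypothetical, and the ML option in the macros uses a different (iterative) estimator.

```python
def between_study_variance(effect_sizes, variances):
    """Method-of-moments estimate of v_theta from Q, k and the FE weights."""
    w = [1.0 / v for v in variances]
    sw = sum(w)
    mean_es = sum(wi * es for wi, es in zip(w, effect_sizes)) / sw
    q = sum(wi * (es - mean_es) ** 2 for wi, es in zip(w, effect_sizes))
    k = len(effect_sizes)
    denominator = sw - sum(wi ** 2 for wi in w) / sw
    # A negative value means Q is below its expectation; truncate at zero
    return max(0.0, (q - (k - 1)) / denominator)

# Hypothetical effect sizes and sampling variances
v_theta = between_study_variance([0.10, 0.30, 0.50], [0.01, 0.02, 0.04])
print(f"estimated between-study variance = {v_theta:.4f}")
```

The resulting value is the constant added to each study's sampling variance when forming the random effects weights.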
Similar to multiple regression, but corrects the standard errors for the nesting of the data
Start with an intercept-only (no predictors) model, which incorporates both the outcome-level and the study-level components
This tells us the overall mean effect size
Is similar to a random effects model
Then expand the model to include predictor variables, to explain systematic variance between the study effect sizes
(MLwiN screenshot)
$d_j = \gamma_0 + \sum_{s=1}^{S} \gamma_s X_{sj} + u_j + e_j$
$d_j = \gamma_0 + u_j + e_j$
Using the same simulated data set with n = 15
Multilevel models:
build on the fixed and random effects models
account for between-study variance (like random effects)
are similar to multiple regression, but correct the standard errors for the nesting of the data. Improved modelling of the nesting of levels within studies increases the accuracy of the estimation of standard errors on parameter estimates and the assessment of the significance of explanatory variables (Bateman and Jones, 2003).
Multilevel modelling is more precise when there is greater between-study heterogeneity
Also allows flexibility in modelling the data when one has multiple moderator variables (Raudenbush & Bryk, 2002)
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates.
Lipsey, M. W., & Wilson, D. B. (2001). Practical meta-analysis. Thousand Oaks, CA: Sage Publications.
Van den Noortgate, W., & Onghena, P. (2003). Multilevel meta-analysis: A comparison with traditional meta-analytical procedures. Educational and Psychological Measurement, 63, 765-790.
Wilson’s “meta-analysis stuff” website: http://mason.gmu.edu/~dwilsonb/ma.html
Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models (2nd ed.). Thousand Oaks, CA: Sage Publications.