How to Build a Better Systematic Review in Education
David K. Evans (presenting co-authored work with Anna Popova)
World Bank
April 21, 2016 Building Evidence in Education (BE2) – Washington DC, USA
Divergent Findings in Systematic Reviews + A Few Proposals
Motivation
• Recent years have seen an explosion in evidence on learning
• Six reviews over the last two years address the same topic: how to improve learning outcomes for children in low- and middle-income countries
• Although the reviews are not identical, they share a common goal yet often reach different conclusions
[Figure: Cumulative learning studies by year, 1980 to 2015 (vertical axis 0 to 250). Chart annotations: “32 total studies” and “227 total studies”.]
The reviews & their recommendations to improve learning (2013-2014)
Conn
• Pedagogical interventions
• Student incentives
Glewwe et al.
• Desks, tables, & chairs
• Teacher subject knowledge
• Teacher presence
Kremer et al.
• Pedagogical interventions to match teaching to student learning
• Accountability
• Incentives
Krishnaratne et al.
• Materials
McEwan
• Computers or instructional technology
• Teacher training
• Smaller classes or ability grouping
Murnane & Ganimian
• Provide info about school quality & returns to schooling
• Teacher incentives (in low performance settings)
• Specific guidance to low-skilled teachers
Since we did our analysis, several more have come out! (2015 alone)
Asim et al. (South Asia only)
• Teachers & schools (not households & communities)
• Provide resources
• Provide incentives
Masino & Niño-Zarazúa
• Combine 2 or 3 of:
  - Material and human resources
  - Behaviors and intertemporal choices of teachers and students
  - Participatory and community management reforms
Glewwe & Muralidharan
• Pedagogy
• School governance
• Teacher accountability
Snilstveit et al. (900 pages!)
• Structured pedagogy
Not all reviews are systematic… What is a systematic review anyway?
Ask the Campbell Collaboration:
“A systematic review must have:
1. Clear inclusion / exclusion criteria
2. An explicit search strategy
3. Systematic coding and analysis of included studies
4. Meta-analysis (where possible)”
What is meta-analysis anyway? Meta-analysis actually combines the estimates from different studies to see their size and significance when taken together.
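To make "combines the estimates" concrete, here is a minimal sketch of one common way to pool results: inverse-variance (fixed-effect) weighting. This is an illustration only; the slides do not say which pooling model any of the reviews used, and the effect sizes and standard errors below are hypothetical.

```python
import math


def pool_fixed_effect(estimates, standard_errors):
    """Inverse-variance (fixed-effect) pooling of study estimates.

    Each study is weighted by 1 / SE^2, so more precise studies count more.
    Returns the pooled estimate and its standard error.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


# Hypothetical effect sizes (in standard deviations) and standard errors.
effects = [0.10, 0.30, 0.20]
ses = [0.10, 0.10, 0.10]

pooled, pooled_se = pool_fixed_effect(effects, ses)
z_score = pooled / pooled_se  # size and significance "when taken together"
```

With equal standard errors the pooled estimate is simply the average of the three effects, but its standard error is smaller than that of any single study, which is where the gain in statistical power comes from.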
What types of review are there?
Meta-analysis (Conn; McEwan; Krishnaratne et al.)
Pros
• Increase statistical power
• Objective weighting of the evidence
Cons
• Excludes good studies without particular data reported
• Average out bimodal outcomes
• Can tend to over-aggregation
Vote counting (Glewwe et al.)
Pros
• Includes all quantitative studies
Cons
• Ignores sample size & effect size
• Misleading if studies are underpowered
Narrative (Kremer, Brannen, & Glennerster; Murnane & Ganimian)
Pros
• Discussion of mechanisms
• Can include every study
Cons
• Subjective weighting of the evidence
• Not transparent
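The point that vote counting is "misleading if studies are underpowered" can be illustrated with a small sketch. The four studies below are hypothetical, and the pooling uses simple inverse-variance weights as an assumption; none of this reproduces a calculation from the reviews themselves.

```python
import math

# Hypothetical studies: (effect size in SD, standard error).
# Each one is too imprecise to reach significance on its own.
studies = [(0.15, 0.10), (0.18, 0.11), (0.12, 0.09), (0.20, 0.12)]


def vote_count(studies, critical_z=1.96):
    """Count studies with a positive, individually significant estimate."""
    return sum(1 for effect, se in studies if effect / se > critical_z)


def pooled_z(studies):
    """z-score of the inverse-variance pooled estimate across all studies."""
    weights = [1.0 / se ** 2 for _, se in studies]
    pooled = sum(w * effect for w, (effect, _) in zip(weights, studies)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled / pooled_se


significant_votes = vote_count(studies)  # no study is significant alone
combined_z = pooled_z(studies)           # but the pooled estimate is
```

Here vote counting would report no significant positive findings, while the pooled estimate clears the conventional 1.96 threshold.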
Characteristics of a systematic review (Campbell Collaboration)
[Figure: bar chart of the proportion of reviews (0% to 100%) that meet each characteristic fully, partially, or not at all]
• Clear inclusion/exclusion criteria
• Explicit search strategy
• Systematic coding and analysis of included studies
• Meta-analysis (where possible)
What drives different conclusions?
Why do 6 reviews that ostensibly cover largely the same literature reach different conclusions?
Differing compositions
Out of 227 total studies with learning outcomes, how many are included in most systematic reviews about improving learning?
[Figure: bar chart. Horizontal axis: number of reviews in which a study is included. Vertical axis: number of individual education studies.]
• Included in 1 review: 159 studies
• Included in 2 reviews: 32 studies
• Included in 3 reviews: 23 studies
• Included in 4 reviews: 6 studies
• Included in 5 reviews: 4 studies
• Included in 6 reviews: 3 studies
Only 3/227 studies are included in all 6 reviews
Only 13 studies are included in the majority of reviews
159 of the studies are included in only 1 of the 6 reviews
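The composition tallies can be reproduced with a short sketch. The function below counts, for any set of reviews, how many studies appear in exactly k of them; the toy review names and study identifiers are hypothetical, while `slide_counts` holds the bar values read off the chart.

```python
from collections import Counter


def inclusion_histogram(reviews):
    """Map k -> number of studies that appear in exactly k reviews.

    `reviews` maps a review name to the identifiers of the studies it includes.
    """
    per_study = Counter(
        study for studies in reviews.values() for study in set(studies)
    )
    return Counter(per_study.values())


# Hypothetical toy input: s1 is in 3 reviews, s2 in 2, s3 and s4 in 1 each.
toy_reviews = {"A": {"s1", "s2", "s3"}, "B": {"s1", "s2"}, "C": {"s1", "s4"}}
toy_hist = inclusion_histogram(toy_reviews)

# Bar values from the chart: number of reviews -> number of studies.
slide_counts = {1: 159, 2: 32, 3: 23, 4: 6, 5: 4, 6: 3}
total_studies = sum(slide_counts.values())                       # 227
in_majority = sum(n for k, n in slide_counts.items() if k >= 4)  # 13
in_all_six = slide_counts[6]                                     # 3
```

Here "majority" is taken to mean at least 4 of the 6 reviews, which matches the 13 studies quoted on the slide.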
Differing categorizations
How are the most frequently cited papers categorized in different reviews?
Categorization across the 6 systematic reviews
Kremer et al. (2009), total citations: 6
• Conn 2014: Student incentives
• Glewwe et al. 2014: Merit-based scholarships
• Kremer et al. 2013: Merit scholarships
• Krishnaratne et al. 2013: School fees
• McEwan 2014: Performance incentives
• Murnane & Ganimian 2014: Cash transfers
Banerjee et al. (2007), total citations: 5
• Conn 2014: not included
• Glewwe et al. 2014: Computers & electronic games
• Kremer et al. 2013: Reducing class size / Computer-assisted learning / Contract teachers
• Krishnaratne et al. 2013: Materials
• McEwan 2014: Instructional materials / Computers or technology / Teacher training / Class size or composition / Contract or volunteer teachers
• Murnane & Ganimian 2014: Computer-assisted learning
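So, what drives the different conclusions?
How many of the studies in one review’s recommended category of intervention are included in other reviews?
Percentage of driving studies included in other reviews
• Conn 2014, pedagogical interventions: Glewwe et al. 2014 6%; Kremer et al. 2013 0%; Krishnaratne et al. 2013 6%; McEwan 2014 6%; Murnane & Ganimian 2014 18%
• Kremer et al. 2013, matching teaching to students’ learning: Conn 2014 50%; Glewwe et al. 2014 50%; Krishnaratne et al. 2013 50%; McEwan 2014 100%; Murnane & Ganimian 2014 50%
• Krishnaratne et al. 2013, materials provision: Conn 2014 17%; Glewwe et al. 2014 67%; Kremer et al. 2013 50%; McEwan 2014 100%; Murnane & Ganimian 2014 67%
• McEwan 2014, computers or instructional technology: Conn 2014 0%; Glewwe et al. 2014 30%; Kremer et al. 2013 30%; Krishnaratne et al. 2013 40%; Murnane & Ganimian 2014 70%
• Murnane & Ganimian 2014, information provision: Conn 2014 11%; Glewwe et al. 2014 0%; Kremer et al. 2013 11%; Krishnaratne et al. 2013 33%; McEwan 2014 33%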
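Each percentage in this comparison is the share of the studies behind one review's recommendation that also appear in another review. A minimal sketch of that calculation, with hypothetical study identifiers (the actual study lists are not given on the slide):

```python
def share_included(driving_studies, other_review_studies):
    """Percent of a recommendation's driving studies that another review includes."""
    driving = set(driving_studies)
    if not driving:
        raise ValueError("no driving studies given")
    overlap = driving & set(other_review_studies)
    return 100.0 * len(overlap) / len(driving)


# Hypothetical example: six studies drive a recommendation,
# and another review includes only one of them.
driving = {"d1", "d2", "d3", "d4", "d5", "d6"}
other_review = {"d1", "x1", "x2"}

pct = share_included(driving, other_review)  # 1 of 6 driving studies
```

A low share means the other review could hardly have reached the same recommendation, because it never examined most of the evidence behind it.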
How can we conduct better individual systematic reviews?
1. Conduct an exhaustive search, and maybe replicate it
• About 50 studies should be in 5-6 reviews, but only 8 are
2. Combine methods
• Meta-analyses exclude too many studies, while narrative reviews aren’t systematic enough; combining methods addresses both problems
3. Maintain low aggregation of intervention categories so that the categories can actually be useful
• Variation within intervention type means “computers work” is less useful than “computer software that adapts to student learning levels works”
What works?
Reviews compared: Conn (2014); Glewwe et al. (2014); Kremer, Brannen, & Glennerster (2013); Krishnaratne, White, & Carpenter (2013); McEwan (2014); Murnane & Ganimian (2014)
Tally of reviews recommending each intervention
• Pedagogical interventions that match teaching to students’ learning: 4
• Individualized teacher training: 4
• Teacher incentives: 3
• Materials: 3
• Student incentives: 2
• Accountability: 1
• Contract or volunteer teachers: 1
• Providing information about school quality and returns to schooling: 1
• Smaller classes or ability grouping: 1
• Teacher presence: 1
What works: (1) Pedagogical interventions that match teaching to individual student learning levels
- In Kenya, assign students to separate classes based on initial ability so that teachers can focus instruction at their students’ level of learning (Duflo, Dupas & Kremer 2011)
- Use math software to help students learn at their own pace (Banerjee et al. 2007)
• But just giving out laptops or desktop computers won’t guarantee the gains
What works: (2) Individualized, repeated teacher training, associated with a specific method or task
- Train teachers and provide them with regular mentoring to implement early grade reading instruction in local language in Kenya (Lucas et al. 2014)
- Help teachers learn to use storybooks and flash cards in India (He et al. 2009)
• As opposed to a similar (not identical) program introduced without teacher preparation (He et al. 2008)
And this is already out of date!
The 2015 studies
• Teacher performance pay in Pakistan
• Public-private partnerships in Uganda
• Cash transfers in Honduras
• Malaria control + literacy instruction in Kenya
• Literacy innovations in Kenya (RTI)
• Teacher training + text messages in Kenya
• Teacher training for literacy in Uganda
• Teacher performance pay in China
Systematic reporting of results to permit aggregation
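\[ \text{Effect} = \frac{\bar{Y}_T - \bar{Y}_C}{s_{\text{pooled}}} \]
\[ \text{Standard error} = \sqrt{\frac{n_T + n_C}{n_T \, n_C} + \frac{\text{Effect}^2}{2\,(n_T + n_C)}} \]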
Statistics wanted!
• Sample standard deviation across treatment and control
• Student sample size in treatment group
• Student sample size in control group
Sources: McEwan 2015; Borenstein 2009
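As a sketch of why these statistics are wanted, the function below computes the standardized effect and its standard error from the three reported quantities plus the treatment and control means. The formula is the standard one for a standardized mean difference; the function name and the numbers are hypothetical.

```python
import math


def standardized_effect(mean_t, mean_c, sd_pooled, n_t, n_c):
    """Standardized mean difference and its standard error.

    effect = (mean_t - mean_c) / sd_pooled
    se     = sqrt((n_t + n_c) / (n_t * n_c) + effect^2 / (2 * (n_t + n_c)))
    """
    effect = (mean_t - mean_c) / sd_pooled
    se = math.sqrt((n_t + n_c) / (n_t * n_c) + effect ** 2 / (2 * (n_t + n_c)))
    return effect, se


# Hypothetical study: test scores with a pooled SD of 20 points
# and 400 students in each arm.
effect, se = standardized_effect(
    mean_t=55.0, mean_c=50.0, sd_pooled=20.0, n_t=400, n_c=400
)
# effect is 0.25 standard deviations
```

If a paper omits the pooled standard deviation or either sample size, neither quantity can be computed, and the study drops out of any meta-analysis.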
Systematic reporting on programs
Each intervention is unique.
For 24 teacher training evaluations, we sought evidence on 43 potential indicators.
[Figure: bar chart of reported indicators, horizontal axis 0% to 100%]
We have developed an instrument to improve systematic reporting on interventions with teacher training.
Similar instruments are needed for other interventions.
Database of results (Do we really need another database?)
Don’t we already have lots of databases?
Database: # of studies
• IE2 – Impact Evaluations in Education: 288
• 3ie Impact Evaluation database (Education): 773
• AEA RCT Registry: 55*
• Evans-Popova collection: 322
• Others…
* Completed only, both developing & rich countries
What would make it worthwhile?
• Real-time updating
• Systematic reporting of standardized effect sizes
• Systematic reporting of implementation details
What would it allow?
• Just-in-time queries
• Auto-updating meta-analysis
• What do the 10 most effective pedagogical interventions look like?
• What do the 50 most effective programs overall look like?
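A "just-in-time query" of this kind could be as simple as the sketch below, assuming each database record carries a standardized effect size and an intervention category. The record layout, field names, and entries are all hypothetical; no such schema is described in the presentation.

```python
from dataclasses import dataclass


@dataclass
class Result:
    study: str          # hypothetical study label
    category: str       # hypothetical intervention category
    effect_size: float  # standardized effect, in standard deviations


def most_effective(results, n, category=None):
    """Return the n results with the largest effect sizes, optionally within one category."""
    pool = [r for r in results if category is None or r.category == category]
    return sorted(pool, key=lambda r: r.effect_size, reverse=True)[:n]


# Hypothetical records.
records = [
    Result("Study A", "pedagogy", 0.30),
    Result("Study B", "materials", 0.05),
    Result("Study C", "pedagogy", 0.12),
    Result("Study D", "incentives", 0.20),
]

top_pedagogy = most_effective(records, 1, category="pedagogy")
top_overall = most_effective(records, 2)
```

Ranking on effect size alone ignores precision and cost, so a real query would probably also need standard errors and implementation details, which is why systematic reporting of both matters.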
Conclusions
• Systematic reviews in education could be much better
• They need more exhaustive searches, combined methodologies, and de-aggregation
• A coordinating body could help with systematic cataloguing
Sources
This presentation is based largely on
Evans, David, and Anna Popova. 2015. “What really works to improve learning in developing countries? An analysis of divergent findings in systematic reviews,” World Bank Policy Research Working Paper 7203.
It also draws on work from the ongoing project:
Popova, Anna, David Evans, and Violeta Arancibia. 2016. “Inside In-Service Teacher Training: What Works and How Do We Measure It?” Work in progress. World Bank.
The database of learning studies on which the analysis is based is available here.
Review references
Conn, K. (2014). “Identifying Effective Education Interventions in Sub-Saharan Africa: A meta-analysis of rigorous impact evaluations.” Unpublished manuscript, Columbia University, New York, NY.
Glewwe, P. W., Hanushek, E. A., Humpage, S. D., & Ravina, R. (2014). “School resources and educational outcomes in developing countries: a review of the literature from 1990 to 2010.” in Education Policy in Developing Countries, ed. Glewwe, P. University of Chicago Press: Chicago and London.
Kremer, M., Brannen, C., & Glennerster, R. (2013). “The challenge of education and learning in the developing world.” Science, 340(6130), 297-300.
Krishnaratne, S., White, H., & Carpenter, E. (2013). “Quality education for all children? What works in education in developing countries.” 3ie Working Paper 20, International Initiative for Impact Evaluation.
McEwan, P. (2014). “Improving Learning in Primary Schools of Developing Countries: A Meta-Analysis of Randomized Experiments.” Review of Educational Research.
Murnane, R. J., & Ganimian, A.J. (2014). “Improving Educational Outcomes in Developing Countries: Lessons from Rigorous Evaluations.” Unpublished manuscript.