TRANSCRIPT
11th of October, 2010
University of Cape Town
Kamilla Gumede
Martin Abel
Introduction to Randomised Evaluations
• New research programme within SALDRU
• Regional office of a global network
• Specialise in RANDOMISED IMPACT EVALUATIONS
• Do 3 things:
– Run evaluations
– Disseminate results – a public good
– Train others to run evaluations
J-PAL Africa
Fight poverty
• I. Why do we evaluate social programmes?
• II. What is an IMPACT?
• III. Impact evaluation methodologies
• IV. How to run an RCT:
– Advantages of randomised evaluations
– Theory of Change
– Randomisation Design
– External vs. Internal Validity
Overview
• Surprisingly little hard evidence on what works
• Need #1: With better evidence, we can do more with a given budget.
• Need #2: If people knew money was going to programs that worked, it could help increase the pot for anti-poverty programs.
• Instead of asking “do aid/development programs work?”, we should be asking:
– Which work best, why, and when?
– How can we scale up what works?
Evidence-based policy making
Example Aid: Optimists
“I have identified the specific investments that are needed [to end poverty]; found ways to plan and implement them; [and] shown that they can be affordable.”
Jeffrey Sachs End of Poverty
“After $2.3 trillion over 5 decades, why are the desperate needs of the world's poor still so tragically unmet?
Isn't it finally time for an end to the impunity of foreign aid?”
Bill Easterly The White Man’s Burden
Example Aid: Pessimists
• Accountability
• Lesson learning
– Program
– Organization
– Beneficiaries
– World
• So that we can reduce poverty through more effective programs
• Different types of evaluation contribute to these different objectives of evaluation
Objective of evaluation
The different types of evaluation

[Diagram: nested scope, from broadest to narrowest – Evaluation (M&E) > Program Evaluation > Impact Evaluation > Randomized Evaluation]
Evaluating Social Programmes
• What is the outcome after the programme?
• What would have happened in the absence of the programme?
• Take the difference:
what happened (with the program)
– what would have happened (without the program)
= IMPACT of the program
How to measure impact? (I)
Impact is defined as a comparison between:
1. the outcome some time after the program has been introduced
2. the outcome at that same point in time had the program not been introduced (the ”counterfactual”)
How to measure impact? (II)
Impact: What is it?

[Figure: primary outcome plotted over time. After the intervention, the observed outcome diverges from the counterfactual; the gap between the two curves is the impact]
• The counterfactual represents the state of the world that program participants would have experienced in the absence of the program (i.e. had they not participated in the program)
• Problem: Counterfactual cannot be observed
• Solution: We need to “mimic” or construct the counterfactual
Counterfactual
• The counterfactual is often constructed by selecting a group not affected by the program
• Randomized:
– Use random assignment of the program to create a control group which mimics the counterfactual.
• Non-randomized:
– Argue that a certain excluded group mimics the counterfactual.
Constructing the counterfactual
• Experimental:
– Randomized Evaluations
• Quasi-experimental:
– Instrumental Variables
– Regression Discontinuity Design
• Non-experimental:
– Pre-post
– Difference in differences
– Cross-Sectional Regression
– Fixed Effects Analysis
– Statistical Matching
Methodologies in impact evaluation
Effect of the South African Old Age Pension (OAP) on labour supply
Non-experimental evaluations – Cross Sectional Regression
Bertrand et al. (2003) Posel et al. (2006)
• We can control for observable differences (age, gender, education, ...)
• There are also unobservable characteristics we cannot control for (motivation, etc.)
What kinds of people does a household with a pension attract?
Non-experimental evaluations – Panel Data Analysis with Fixed Effects
Ardington et al. (2009)
• Fixed effects analysis limits the sample to households that changed pension status over time
• We can control for unobservable characteristics that do not change
• Unobservable characteristics may change over time
• Data requirements: panel data, a sizeable proportion of households switching
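The within-household (fixed effects) logic above can be sketched on simulated data. Everything here – sample size, a pension effect of 1.5, the noise levels – is illustrative, not taken from the studies cited:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated two-period panel: 500 households, some of which gain pension
# status between periods (all numbers are illustrative assumptions).
n = 500
hh_effect = rng.normal(0, 2, n)                    # unobserved, time-invariant heterogeneity
pension_t1 = rng.random(n) < 0.2
pension_t2 = pension_t1 | (rng.random(n) < 0.3)    # some households switch into pension status
true_effect = 1.5                                  # assumed effect on the outcome

y1 = hh_effect + true_effect * pension_t1 + rng.normal(0, 1, n)
y2 = hh_effect + true_effect * pension_t2 + rng.normal(0, 1, n)

# Within (fixed-effects) estimator: for a two-period panel this is just
# first-differencing, which removes hh_effect. Only households that
# switched pension status contribute to identification.
dy = y2 - y1
dx = pension_t2.astype(float) - pension_t1.astype(float)
fe_estimate = np.sum(dx * dy) / np.sum(dx * dx)
print(round(fe_estimate, 2))   # close to the assumed true effect of 1.5
```

Note that the time-invariant `hh_effect` drops out entirely, but any unobservable that changed between the two periods would still bias the estimate, which is the caveat stated in the slide.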
How to randomise
A. The basics
• Randomly assign them to either:
– Treatment Group – is offered the treatment
– Control Group – not allowed to receive the treatment (during the evaluation period)
[Diagram: Target Population → Evaluation Sample (rest not in evaluation) → Random Assignment → Treatment group / Control group]
A. Why randomize? – Conceptual Argument
• If properly designed and conducted, randomized experiments provide the most credible method to estimate the impact of a program.
• Because members of the treatment and control groups do not differ systematically at the outset of the experiment, any difference that subsequently arises between them can be attributed to the program rather than to other factors.
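A minimal simulation of this argument (all names and numbers are illustrative): random assignment balances the two groups, so a simple difference in mean outcomes recovers the program's effect:

```python
import numpy as np

rng = np.random.default_rng(42)

# Evaluation sample of 1,000 people; assume a true program effect of 2.0
# on some outcome (both numbers are illustrative).
n = 1000
baseline = rng.normal(10, 3, n)

# Random assignment: shuffle indices, first half treatment, rest control.
idx = rng.permutation(n)
treated = np.zeros(n, dtype=bool)
treated[idx[: n // 2]] = True

true_effect = 2.0
outcome = baseline + true_effect * treated + rng.normal(0, 1, n)

# Because assignment is random, the control group mimics the counterfactual,
# so the difference in means estimates the program's impact.
impact = outcome[treated].mean() - outcome[~treated].mean()
print(round(impact, 2))   # close to the assumed effect of 2.0
```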
Example: Primary vs Secondary
Returns to Secondary Education (?)
• Standard way to measure this:
– Equation
• But are people who complete school “the same” as those who don’t?
– They may be more patient and ambitious, come from better-resourced families, and have lower immediate economic opportunities.
• 1,200 teens, qualified but cannot afford secondary school:
– 300 boys and 300 girls get a 4-year scholarship
– Followed for 10 years
In class test
BREAK
Basic setup of a randomized evaluation
[Diagram: Target Population → Evaluation Sample (rest not in evaluation) → Baseline survey → Random Assignment → Treatment group / Control group → Endline survey]
Roadmap to Randomized Evaluations

1. Environment / Context: willing partner; sufficient time; interesting policy question / theory; sufficient resources
2. Theory of Change: mechanism of change (log frame); state assumptions; identify research hypothesis; identify target population; identify indicators; identify threats to validity
3. Randomization Design:
– Intervention: competing interventions; simple program; packages
– Unit of randomization: individual; cluster design; block randomization
– Randomization mechanism: encouragement; gradual rollout; simple lottery; rotation design
4. Sufficient Sample Size: statistical validity; cluster correlation
5. Strategy to Manage Threats: spillovers; discouragement; attrition; political interference
Check and revise at each step.
• Willing partner
• Sufficient time
• Interesting policy question / theory
• Sufficient resources
B. Environment / Context
[Diagram: Programs / Policies are shaped by:
• Knowledge – evidence; experience (personal, collective)
• Ideology – own; external
• Support – budget; political; capacity]
II. Evaluations: Providing evidence for policymaking
• What are the possible chains of outcomes in the case of the intervention?
• What are the assumptions underlying each chain of causation?
• What are the critical intermediary steps needed to obtain the final results?
• What variables should we try to obtain at every step of the way to discriminate between various models?
C. Theory of Change (I)
30
C. Theory of Change (II) – SA Pension System
31
Bertrand et al. (2003) Posel et al. (2006)
Different theories of change determine what indicators we measure and whom we include in our evaluation.
• Based on the Theory of Change, we identify indicators to test the different lines of causation and measure outcomes
...room for creativity…
• How to measure women’s empowerment?
– Measure the fraction of time they speak during village council meetings
• How to measure corruption in infrastructure projects?
– Drill holes in the asphalt of newly built roads and measure the difference between actual and official thickness
C. Indicators
Roadmap to Randomized Evaluations

1. Environment / Context: willing partner; sufficient time; interesting policy question / theory; sufficient resources
2. Theory of Change: mechanism of change (log frame); state assumptions; identify research hypothesis; identify target population; identify indicators; identify threats to validity
3. Randomization Design:
– Intervention: competing interventions; simple program; packages
– Unit of randomization: individual; cluster design; block randomization
– Randomization mechanism: encouragement; gradual rollout; simple lottery; rotation design
4. Sufficient Sample Size: statistical validity; cluster correlation
5. Strategy to Manage Threats: spillovers; discouragement; attrition; political interference
Check and revise at each step.
D. Basic setup of a randomized evaluation
[Diagram: Target Population → Evaluation Sample (rest not in evaluation) → Random Assignment → Treatment group / Control group]
• Evidence on the effectiveness of providing microfinance loans to the poor has been mixed. Some argue that financial literacy training is more effective, while others propose that both loans and training need to be provided to alleviate poverty.
How can you design a randomised evaluation to assess which of these claims is true?
Case Study: Microfinance and/or Financial Literacy Training
D. Forms of Intervention

• Simple Treatment / Control: random assignment to Microfinance vs. Control group
• Multiple Treatment: random assignment to Microfinance vs. Financial Literacy vs. Control group
• Cross-cutting Design: random assignment to Microfinance only, Financial Literacy only, Financial Literacy AND Microfinance, or Control group
• Varying levels of Treatment: random assignment to 6-month Financial Literacy vs. 1-month Financial Literacy vs. Control group
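A sketch of how the cross-cutting (2x2 factorial) design for the microfinance / financial literacy case study might be assigned in practice. The sample of 400 people and all labels are illustrative:

```python
import itertools
import random
from collections import Counter

random.seed(1)

# Cross-cutting design: each person is assigned on two dimensions at once
# (microfinance yes/no, financial literacy yes/no), giving four cells.
people = [f"person_{i}" for i in range(400)]
random.shuffle(people)

cells = list(itertools.product(["microfinance", "no_microfinance"],
                               ["fin_literacy", "no_fin_literacy"]))

# Cycle through the four cells over the shuffled list: because the order
# is random, this is a random assignment with exactly balanced cell sizes.
assignment = {p: cells[i % 4] for i, p in enumerate(people)}

counts = Counter(assignment.values())
print(counts)   # 100 people in each of the four cells
```

The appeal of this design, as the slide suggests, is that one experiment can estimate the effect of each program alone and of the two combined.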
• Individual
• Cluster (Class room, school, district,…)
• Generally, best to randomize at the level at which the treatment is administered.
• Ethical and practical concerns
E. Unit of Randomization
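A sketch of cluster-level assignment for a case like the extra-teachers program: because the treatment is administered at the school level, we randomize schools, and every pupil inherits their school's status. School names and counts are hypothetical:

```python
import random

random.seed(7)

# 40 hypothetical schools (clusters); randomize half into treatment.
schools = [f"school_{i}" for i in range(40)]
random.shuffle(schools)

treatment_schools = set(schools[:20])
control_schools = set(schools[20:])

def pupil_status(school):
    """A pupil's status is determined entirely by their school (cluster)."""
    return "treatment" if school in treatment_schools else "control"

print(len(treatment_schools), len(control_schools))   # 20 20
```

Randomizing at the cluster level also helps with the ethical and practical concerns noted above (e.g. it avoids treating some pupils but not their classmates), at the cost of a larger required sample size.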
• Confronted with overcrowded schools and a shortage of teachers, in 2005 the NGO ICS offered to provide funds to hire 140 extra teachers each year.
What is the best unit of randomisation for our RCT?
Case Study: Extra Teachers in Kenya
• Lottery
• Pull out of a hat/bucket
• Use a random number generator in a spreadsheet or Stata

• Phase-in design
• Rotation design
• Encouragement design
F. Method of Randomization
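The spreadsheet/Stata approach mentioned above can be sketched in Python: give every unit a uniform random draw, rank the draws, and treat the top half. Fixing the seed makes the assignment reproducible and auditable (all names and sizes are illustrative):

```python
import random

random.seed(2025)

# 100 hypothetical unit IDs; each gets an independent uniform draw.
ids = list(range(1, 101))
draws = {i: random.random() for i in ids}

# Rank units by their random draw; the first 50 form the treatment group.
ranked = sorted(ids, key=lambda i: draws[i])
treatment = set(ranked[:50])
control = set(ranked[50:])

print(len(treatment), len(control))   # 50 50
```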
Random assignment through lottery
[Figure, 2006: income per person per month, in rupees – Treatment group: 1457; Comparison group: 1442]
Alternative Mechanism: Phase-in design
[Figure: groups labelled 1, 2, and 3 are phased into treatment over successive rounds]
Round 1 – Treatment: 1/3, Control: 2/3
Round 2 – Treatment: 2/3, Control: 1/3
Round 3 – Treatment: 3/3, Control: 0 (randomized evaluation ends)
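A sketch of how phase-in assignment could be coded, with illustrative group sizes (90 units in three equal groups). Group g enters treatment in round g, so by round 3 no control group remains:

```python
import random

random.seed(3)

# Randomly split 90 hypothetical units into three equal phase-in groups.
units = list(range(90))
random.shuffle(units)
groups = {1: units[:30], 2: units[30:60], 3: units[60:]}

def treated_in_round(round_no):
    """Units treated in a given round: group g is phased in from round g on."""
    return {u for g, members in groups.items() if g <= round_no for u in members}

print(len(treated_in_round(1)),
      len(treated_in_round(2)),
      len(treated_in_round(3)))   # 30 60 90
```

As the figure shows, valid treatment–control comparisons are only available in rounds 1 and 2; once everyone is treated, the randomized evaluation ends.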
Roadmap to Randomized Evaluations

1. Environment / Context: willing partner; sufficient time; interesting policy question / theory; sufficient resources
2. Theory of Change: mechanism of change (log frame); state assumptions; identify research hypothesis; identify target population; identify indicators; identify threats to validity
3. Randomization Design:
– Intervention: competing interventions; simple program; packages
– Unit of randomization: individual; cluster design; block randomization
– Randomization mechanism: encouragement; gradual rollout; simple lottery; rotation design
4. Sufficient Sample Size: statistical validity; cluster correlation
5. Strategy to Manage Threats: spillovers; sample bias; attrition
Check and revise at each step.
• Internal Validity: Can we estimate the treatment effect for our particular sample?
– Fails when there are differences between the two groups (other than the treatment itself) that affect the outcome
• External Validity: Can we extrapolate our estimates to other populations?
– Fails when, outside our evaluation environment, the treatment has a different effect
G. Internal vs. External Validity
• Threats to Internal Validity: the control group is different from the counterfactual
– Spillovers
– Sample Selection Bias
– Attrition
• Examples:
– Individuals assigned to the comparison group could attempt to move into the treatment group (cross-over), and vice versa
– Individuals assigned to the treatment group could drop out of the program (attrition)

G. Threats to Internal Validity
Depends on three factors:
• Program Implementation: can it be replicated at a large scale?
• Study Sample: is it representative?
– Does de-worming have the same effects in Kenya and South Africa?
• Sensitivity of results: would a similar, but slightly different, program have the same impact?
G. External Validity: Generalisability of results
Interested? Become part of the J-PAL research team!
“You get to spend a year in Siberia, while I have to stay here in Hawaii, to apply for grants to extend your research time there.”