impact monitoring evaluation of the effectiveness of work and income employment assistance
Impact monitoring
Evaluation of the effectiveness of Work and Income employment assistance
Introduction
Internal evaluation role
Impact evaluation
• the evaluation question
• impact evaluation: the counterfactual approach
Impact monitoring
• advantages over impact evaluation
• why is it possible today: falling cost of data
Example: Work and Income employment assistance
• propensity matching: workhorse impact method
• application of impact monitoring: Training Opportunities
Summary
Questions
Internal evaluation
• Organisations usually have a range of functions and interventions
• Internal evaluation is about building an evidence base on how best an organisation can deliver effective services
• Emphasis is not on one intervention, but on all interventions delivered by the organisation
• Additional challenges:
– continuous process of reform and adjustment
– operating within a changing policy and social context
– often with very short development cycles
• Goal of alignment: need to accelerate the evidence generation and slow down the development cycle
Counterfactual approach to IMPACT EVALUATION
Evaluation is a question
What works, for whom and why?
What is the intervention logic of the intervention?
What are the causal links?
How does it work in practice?
Does practice vary?
Who participates and who is affected?
Revise the intervention logic to be consistent with the evidence
Are the causal links supported by the evidence?
Does the new logic achieve the original intervention goal?
What impact does it have on outcomes?
Immediate outcomes
Long term outcomes
Unintended outcomes
Do impacts vary across participants?
Impact evaluation framework
Pawson and Tilley (1997) Context-Mechanism-Outcome
Outcomes tell you about the state of the world and how it changes over time
- in employment or not
- educational achievement
- sense of well being
Mechanisms are the actions of organisations designed to change outcomes for the better
- regulations and taxation
- programmes and services
- social marketing
Context is everything else that exists in the space that the mechanism operates within
- social, cultural, community
- physical space
- legislative/policy settings
- delivery organisation
Impact is about understanding how the mechanism influences outcomes within its context
- theory of change
- testing the theory against evidence
Counterfactual designs
• Counterfactual: the outcomes that would have occurred in the absence of the intervention
[Figure: Outcome axis with the Intervention marked; lines for Observed outcomes and Counterfactual; the gap between them labelled Impact]
• Impact: the intervention’s contribution to the outcome
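The definition above can be written as simple arithmetic: impact is the observed outcome minus the counterfactual outcome at each point in time. The sketch below is illustrative only; the figures are hypothetical, and in practice the counterfactual series is itself an estimate rather than something observed.

```python
# Hypothetical figures: share of a group off benefit (%), one value per year.
observed = [20.0, 35.0, 45.0, 50.0]        # outcomes with the intervention
counterfactual = [20.0, 30.0, 38.0, 44.0]  # estimated outcomes without it


def impact(observed, counterfactual):
    """Impact at each time point: observed outcome minus counterfactual outcome."""
    if len(observed) != len(counterfactual):
        raise ValueError("both series must cover the same time points")
    return [o - c for o, c in zip(observed, counterfactual)]


impacts = impact(observed, counterfactual)
print(impacts)  # percentage point differences, one per year
```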
Counterfactual designs
Counterfactual designs contrast two CMO scenarios: one with the mechanism being evaluated and one without
The reality is that either the counterfactual involves an alternative intervention or the absence of the intervention changes the context
For this reason, understanding what happens in the counterfactual CMO is as important as understanding the intervention CMO
[Diagram: two CMO scenarios. Intervention: Mechanism and Outcomes. Counterfactual: Alternative and Outcomes]
The counterfactual black box
• Counterfactual designs give evidence for causal links
• But, they do not explain them
[Diagram: Intervention → Black box → Outcome, with alternative numbered causal paths through the box]
Counterfactual establishes: intervention increased outcome
We know it works
But we do not know why it works
Cannot distinguish between equivalent causal explanations
Unpacking the black box
There are many ways to unpack the black box
The counterfactual approach is to examine intermediate outcomes
[Diagram: Intervention → Outcome 1 (+) → Outcome 2 (+) → Outcome 3 (+), with alternative numbered causal paths between outcomes]
Counterfactual establishes causal relationships
Reduces the range of alternative causal explanations
Causal black boxes remain
Context matters
• Context is often forgotten when using impact evidence
– but causal mechanisms are contextually based
• The trick is:
– knowing if context has changed
– working out how this changes the causal mechanism
• Counterfactual evidence is most often presented without context
• Why:
– practitioners of counterfactual methods are often themselves removed from the context
– counterfactual evidence can easily be abstracted from its context
Comparable results
• Independent evaluations are difficult to use when comparing the impact of interventions
• Impact evidence will depend on the design, the outcome measures used, and what the counterfactual represents
• Cannot be certain whether differences in intervention impacts stem from:
– real differences in causal mechanisms
– differences in impact method
The development of IMPACT MONITORING
Impact monitoring
• Impact evaluation needs to be:
– Robust: decision makers have confidence about the difference interventions make
• Impact monitoring has the additional features of being:
– Consistent: enable direct comparison of intervention impacts
– Comprehensive: cover the bulk of interventions
– Up to date: relevant to current decisions
• These additional features also require impact monitoring to be:
– Efficient: at a low per intervention cost
Impact monitoring as a solution
• Impact monitoring has the potential to address many of the challenges of impact evaluation
Impact evaluation | Impact monitoring
Infrequent | On-going
Single intervention studies | Multi intervention studies
Outcomes based on survey instruments | Outcomes tracked using administrative data
Few outcomes | Many outcomes
Low efficiency | High efficiency
Differences in method complicate intervention comparison | Consistent methods simplify comparisons
Long time lags in reporting findings | Real time reporting
Making impact monitoring possible
• Impact monitoring is becoming feasible because of:
– machine readable administrative information
– increasing computing power
– falling cost of data storage
– administrative data linking
• The main implications of these changes are:
– lower cost of measuring outcomes
– outcomes are measured in the same way across populations
– increased range of outcomes
– rich profile information on individuals
Linked data
[Figure: linked client records across the client’s life stages (0-5, 6-17, 18-24, 25-40). Data sources shown: Welfare, Tax (PAYE), Child protection, Education (ECE, School), Justice (YJ, Prison)]
• Linked data means we can look at many dimensions of a person’s life.
• Cross agency linked data is of particular value.
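As a rough illustration of what linking on a common client id involves, the sketch below merges records from two sources into one profile per client. The source names, field names and values are all hypothetical; real linkage (for example across agencies) typically needs far more care around identity matching and privacy than this toy example suggests.

```python
from collections import defaultdict

# Hypothetical extracts from two agencies that share a common client id.
welfare = [
    {"client_id": 1, "months_on_benefit": 14},
    {"client_id": 2, "months_on_benefit": 3},
]
tax = [
    {"client_id": 1, "annual_earnings": 12000},
    {"client_id": 3, "annual_earnings": 41000},
]


def link_records(**sources):
    """Merge records from several sources into one profile per client id."""
    linked = defaultdict(dict)
    for source_name, records in sources.items():
        for record in records:
            fields = {k: v for k, v in record.items() if k != "client_id"}
            linked[record["client_id"]][source_name] = fields
    return dict(linked)


linked = link_records(welfare=welfare, tax=tax)
print(linked[1])  # client 1 appears in both sources
```

Clients that appear in only one source keep a partial profile, which is one reason isolated systems limit what can be measured.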
Impact monitoring of EMPLOYMENT ASSISTANCE
MSD impact monitoring
• Impact monitoring of Work and Income assistance for the last 10 years
• Work and Income employment assistance:
– training, job search, wage subsidy programmes
• Outcomes:
– Off benefit, tertiary study, part time work, subsequent assistance, Work and Income expenditure
– Next (IDI outcomes): earnings and employment, educational achievement, justice, migration
• Duration:
– current longest outcome period is 13 years
Propensity matching
• Workhorse method for impact monitoring:
– highly automated method
  • efficient
  • reduces influence of analyst bias
– easy to maintain and store results
– independent of post participation outcomes
– easy to explain to decision makers
• Propensity matching works well with administrative data:
– information on large numbers of non-participants
– rich profile information (especially prior outcomes)
– consistent measurement between participants and non-participants
Propensity matching: Short version
[Diagram: participant characteristics split into Observed and Unobserved*. Characteristics shown: Demographics, Labor market, Previous outcomes, Motivation, Skills / Education, Attitude, Networks]
Based on participant’s observed profile, propensity matching selects a comparison group with the same average profile
Assumes the profile of unobserved characteristics will also be the same
*: uncorrelated with observed characteristics
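One common way to check the "same average profile" claim for observed characteristics is a standardised difference in means between participants and the comparison group. The sketch below is an assumption about how such a check could look, not a description of the method used here; the ages are hypothetical and the 0.1 threshold is a widely used rule of thumb rather than anything stated in this material.

```python
from statistics import mean, variance


def standardised_difference(participants, comparison):
    """Difference in means divided by the pooled standard deviation."""
    pooled_sd = ((variance(participants) + variance(comparison)) / 2) ** 0.5
    if pooled_sd == 0:
        return 0.0
    return (mean(participants) - mean(comparison)) / pooled_sd


# Hypothetical ages for participants, all non-participants, and a matched group.
participant_age = [22, 25, 31, 40]
non_participant_age = [35, 42, 50, 55]
matched_age = [23, 26, 30, 39]

before = standardised_difference(participant_age, non_participant_age)
after = standardised_difference(participant_age, matched_age)
print(round(before, 2), round(after, 2))

# Rule of thumb (an assumption): an absolute value below 0.1 suggests balance.
balanced = abs(after) < 0.1
```

A check like this can only be run on observed characteristics; it says nothing about whether motivation, attitude or networks are balanced.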
Propensity matching: Long version
• Propensity score matching was proposed by Rosenbaum & Rubin (1983)
• Propensity (P) is the likelihood of participating in an intervention based on an observed profile (X)
P = f(X), with P between 0 and 1
• Both participants and non-participants have a propensity to participate
• A comparison group matched on propensity score will have the same average profile as the participants
• Enforces ‘common support’:
– matching only works if there are non-participants with a similar propensity to participate as participants
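A minimal sketch of these steps follows, under stated assumptions: f(X) is taken to be a logistic regression fitted by plain gradient ascent, matching is nearest neighbour with replacement, and common support is enforced with a caliper (a maximum allowed score gap). None of these choices is confirmed by the slides, and the function names, data and caliper value are hypothetical.

```python
import math


def predict(weights, x):
    """Propensity score: logistic function of the observed profile x."""
    z = weights[0] + sum(w * xj for w, xj in zip(weights[1:], x))
    return 1.0 / (1.0 + math.exp(-z))


def fit_propensity(X, d, steps=2000, rate=0.5):
    """Fit a logistic regression of participation d on profile X by gradient ascent."""
    weights = [0.0] * (len(X[0]) + 1)  # intercept first, then one weight per feature
    for _ in range(steps):
        gradient = [0.0] * len(weights)
        for x, di in zip(X, d):
            error = di - predict(weights, x)
            gradient[0] += error
            for j, xj in enumerate(x):
                gradient[j + 1] += error * xj
        weights = [w + rate * g / len(X) for w, g in zip(weights, gradient)]
    return weights


def match(scores, d, caliper):
    """Pair each participant with the nearest non-participant on propensity score.

    Matching is with replacement. Participants with no non-participant within
    the caliper are left unmatched, which is how common support is enforced here.
    """
    controls = [i for i, di in enumerate(d) if di == 0]
    pairs = {}
    for i, di in enumerate(d):
        if di != 1:
            continue
        nearest = min(controls, key=lambda c: abs(scores[c] - scores[i]))
        if abs(scores[nearest] - scores[i]) <= caliper:
            pairs[i] = nearest
    return pairs


# Hypothetical profile: share of the prior year spent on benefit (0 to 1).
X = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.7], [0.8], [0.9], [0.95]]
d = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]  # 1 = participant, 0 = non-participant

weights = fit_propensity(X, d)
scores = [predict(weights, x) for x in X]
pairs = match(scores, d, caliper=0.2)
for participant, comparison in sorted(pairs.items()):
    print(participant, comparison)
```

Production work would normally use an established statistics library and many more profile variables; the point here is only the sequence of estimating a score, then matching on it.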
Propensity matching: Long version
[Figures: two panels showing frequency (0.0% to 5.0%) by propensity score (0.0 to 1.0) for participants, non-participants and the comparison group]
Differences in propensity score distribution reflect differences in observed average profiles
Common support problem: can only match where participant and non-participant scores overlap
• Matching ensures we compare like groups
– preferred over multivariate regression estimates
– can be combined with difference-in-differences to further reduce bias in the impact estimate
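The difference-in-differences idea mentioned above can be sketched as follows: compare the change in outcomes for participants with the change for the matched comparison group, so that any fixed gap between the groups that remains after matching is netted out. The figures below are hypothetical.

```python
from statistics import fmean


def difference_in_differences(part_before, part_after, comp_before, comp_after):
    """Change for participants minus change for the comparison group."""
    participant_change = fmean(part_after) - fmean(part_before)
    comparison_change = fmean(comp_after) - fmean(comp_before)
    return participant_change - comparison_change


# Hypothetical % off benefit, before and after the programme start.
part_before, part_after = [10.0, 12.0], [40.0, 42.0]
comp_before, comp_after = [8.0, 10.0], [30.0, 32.0]

simple_difference = fmean(part_after) - fmean(comp_after)
did = difference_in_differences(part_before, part_after, comp_before, comp_after)
print(simple_difference, did)
```

In this toy case the simple after-only difference overstates the impact because participants were already slightly ahead before the programme began.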
Impact monitoring of TRAINING OPPORTUNITIES
Training Opportunities
• Training Opportunities (TOPs) was a programme to assist clients without basic school qualifications to gain basic skills and qualifications
– qualifications were at NQF 2 and below
– free to participate
– up to two years in duration, usually around six months
– targeted at clients at risk of long term benefit receipt
– contracted out to external providers
• In 2011, TOPs funding was around 80 million a year
• Nominally an MSD programme, but administered by the Tertiary Education Commission (TEC)
TOPs impact
[Figure, first panel: % off main benefit (0 to 60) by years from starting TOPs (-1 to 7), Participants (N: 10,038) vs Comparison (1, N: 8,433)]
[Figure, second panel: impact on off main benefit (ppt, -14 to 4) by years from starting TOPs (-1 to 8)]
Participants starting between 2000 and 2002
Matched at participation start
% off main benefit each month from start date
Impact profile is the difference in outcomes between the participant and comparison group
Large lock in effect
Modest post participation effect
Impact after 7.5 years: 12.5 days
Advice: modest impact possible in the long term (10+ years)
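An impact profile of this kind can be sketched as the month-by-month gap between the two groups. The second step below, converting the accumulated gap into days, is an assumption about how a cumulative figure in days might be derived; the slides do not state the calculation. All numbers are hypothetical and are not the TOPs results.

```python
# Hypothetical % off main benefit in each month from the start date.
participants = [20.0, 18.0, 25.0, 35.0]
comparison = [20.0, 28.0, 30.0, 32.0]

DAYS_PER_MONTH = 365.25 / 12  # assumed average month length


def impact_profile(participants, comparison):
    """Monthly impact in percentage points: participant minus comparison outcome."""
    return [p - c for p, c in zip(participants, comparison)]


def cumulative_days(profile):
    """Accumulated impact as average extra days off benefit per participant.

    Assumption: a gap of g percentage points sustained for one month is worth
    g / 100 of that month per participant.
    """
    return sum(gap / 100 * DAYS_PER_MONTH for gap in profile)


profile = impact_profile(participants, comparison)
days = cumulative_days(profile)
print(profile, round(days, 2))
```

Negative values early in the profile correspond to a lock in effect: participants are less likely to be off benefit while they are still on the programme.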
TOPs impact for 2000 to 2007 participants
[Figure: impact on off main benefit (ppt, -20 to 5) by years from starting TOPs (-1 to 8), for the 2000-2002, 2004, 2006 and 2007 participant cohorts]
Comparing impact profiles for participant cohorts helps provide early predictions of long term impacts
Over the 2000s TOPs impact decreased
- increased lock in
- no post-participation effect
• Why? CMO - Changing context
– strong labour demand especially for unskilled labour
– falling eligible population, increasing numbers of low risk participants
• Changing context alters impact through:
– increased opportunity cost of participating
– lower labour market value of skills gained
TOPs ends
• In response to the above findings, Ministers decided to split TOPs into two programmes
• Foundation Focused Training Opportunities (FFTO)
– restricted to high risk clients (based on a statistical risk profiling tool)
– no more than six months in duration
– foundation skills (literacy and numeracy)
• Training for Work (TfW)
– no more than three months in duration
– medium risk clients
– work focused training
• Programmes introduced in 2011
FFTO and TfW impact in 2012
• Reported on the early impact of TfW and FFTO in 2012
[Figure: impact on off main benefit (ppt, -30 to 5) by years from starting programme (-1 to 2), for TOPs, TfW and FFTO]
FFTO impact profile was similar to TOPs
TfW showing a shorter lock in effect and positive post participation impact
• In response to this evidence, Ministers decided to end FFTO in 2013
– funding transferred to the Ministry of Education to fund free education to NQF 2
– MSD no longer monitors the impact of this funding
Training for Work cohorts
• Training for Work impacts continue to improve
[Figure: impact on off main benefit (ppt, -20 to 10) by years from programme start (-1.0 to 2.0), for the 2011, 2012 and 2013 cohorts]
Smaller lock in effect
Positive post participation effect
Likely explanations (read conjecture):
• better targeting → fewer low risk participants → lower lock in
• tighter contract performance → higher post participation impact
Summary
• Impact monitoring increases the utility of impact information for decision makers
– more likely that we see evidence based decisions
• Combined with measures of diverse outcomes, impact monitoring enables more precise testing of intervention logic
– can better target qualitative research
• This is only possible through investment in data:
– developing good electronic administrative systems
– avoiding isolated systems (eg common client ids)
– agencies having proper data warehousing
– linking and sharing of agency data (eg SNZ IDI)
Any QUESTIONS?