impact monitoring evaluation of the effectiveness of work and income employment assistance

Impact monitoring

Evaluation of the effectiveness of Work and Income employment assistance

Introduction

Internal evaluation role

Impact evaluation
• the evaluation question
• impact evaluation: the counterfactual approach

Impact monitoring
• advantages over impact evaluation
• why is it possible today: falling cost of data

Example: Work and Income employment assistance
• propensity matching: workhorse impact method
• application of impact monitoring: Training Opportunities

Summary

Questions

Internal evaluation

• Organisations usually have a range of functions and interventions

• Internal evaluation is about building an evidence base on how best an organisation can deliver effective services

• Emphasis is not on one intervention, but on all interventions delivered by the organisation

• Additional challenges:
– continuous process of reform and adjustment
– operating within a changing policy and social context
– often with very short development cycles

• Goal of alignment: need to accelerate the evidence generation and slow down the development cycle

Counterfactual approach to IMPACT EVALUATION

Evaluation is a question

What works, for whom and why?

What is the intervention logic of the intervention?

What are the causal links?

How does it work in practice?

Does practice vary?

Who participates and who is affected?

Revise the intervention logic to be consistent with the evidence

Are the causal links supported by the evidence?

Does the new logic achieve the original intervention goal?

What impact does it have on outcomes?

Immediate outcomes

Long term outcomes

Unintended outcomes

Do impacts vary across participants?

Impact evaluation framework

Pawson and Tilley (1997) Context-Mechanism-Outcome (CMO)

Outcomes tell you about the state of the world and how it changes over time
- in employment or not
- educational achievement
- sense of well being

Mechanisms are the actions of organisations designed to change outcomes for the better
- regulations and taxation
- programmes and services
- social marketing

Context is everything else that exists in the space that the mechanism operates within
- social, cultural, community
- physical space
- legislative/policy settings
- delivery organisation

Impact is about understanding how the mechanism influences outcomes within its context
- theory of change
- testing the theory against evidence

Counterfactual designs

• Counterfactual: the outcomes that would have occurred in the absence of the intervention

[Figure: outcome plotted from the start of the intervention, showing observed outcomes and the counterfactual; the impact is the gap between the two]

• Impact: the intervention’s contribution to the outcome

Counterfactual designs

[Diagram: two CMO scenarios. Intervention: mechanism and outcomes. Counterfactual: alternative and outcomes.]

Counterfactual designs contrast two CMO scenarios: one with the mechanism being evaluated and one without

The reality is that either the counterfactual involves an alternative intervention or the absence of the intervention changes the context

For this reason, understanding what happens in the counterfactual CMO is as important as understanding the intervention CMO

The counterfactual black box

• Counterfactual designs give evidence for causal links
• But, they do not explain them

[Diagram: intervention, black box, outcome; inside the box, alternative numbered causal paths joined by "or"]

Counterfactual establishes: intervention increased outcome

We know it works

But we do not know why it works

Cannot distinguish between equivalent causal explanations

Unpacking the black box

There are many ways to unpack the black box

The counterfactual approach is to examine intermediate outcomes

[Diagram: intervention leading to Outcome 1, Outcome 2 and Outcome 3, each link marked +, with alternative numbered causal paths joined by "or"]

Counterfactual establishes causal relationships

Reduces the range of alternative causal explanations

Causal black boxes remain

Context matters

• Context is often forgotten when using impact evidence
– but causal mechanisms are contextually based

• The trick is:
– knowing if context has changed
– working out how this changes the causal mechanism

• Counterfactual evidence is most often presented without context

• Why:
– practitioners of counterfactual methods are often themselves removed from the context
– counterfactual evidence can easily be abstracted from its context

Comparable results

• Independent evaluations are difficult to use when comparing the impact of interventions

• Impact evidence will depend on the design, the outcome measures used, what the counterfactual represents

• Cannot be certain whether differences in intervention impacts stem from:
– real differences in causal mechanisms
– differences in impact method

The development of IMPACT MONITORING

Impact monitoring

• Impact evaluation needs to be:
– Robust: decision makers have confidence about the difference interventions make

• Impact monitoring has the additional features of being:
– Consistent: enable direct comparison of intervention impacts
– Comprehensive: cover the bulk of interventions
– Up to date: relevant to current decisions

• These additional features also require impact monitoring to be:
– Efficient: at a low per intervention cost

Impact monitoring as a solution

• Impact monitoring has the potential to address many of the challenges of impact evaluation

Impact evaluation | Impact monitoring
Infrequent | On-going
Single intervention studies | Multi intervention studies
Outcomes based on survey instruments | Outcomes tracked using administrative data
Few outcomes | Many outcomes
Low efficiency | High efficiency
Differences in method complicate intervention comparison | Consistent methods simplify comparisons
Long time lags in reporting findings | Real time reporting

Making impact monitoring possible

• Impact monitoring is becoming feasible because of:
– machine readable administrative information
– increasing computing power
– falling cost of data storage
– administrative data linking

• The main implications of these changes are:
– lower cost of measuring outcomes
– outcomes are measured in the same way across populations
– increased range of outcomes
– rich profile information on individuals

Linked data

[Diagram: linked client records across agencies (Welfare, Tax, Child protection, Education, Justice), laid out along the client's life stages (0-5, 6-17, 18-24, 25-40). Record labels shown include S2W, PAYE, Findings, Care, ECE, School, Poly, YJ and Prison.]

• Linked data means we can look at many dimensions of a person’s life.

• Cross agency linked data is of particular value.
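As a rough illustration of what linking on a common client id involves, the sketch below groups records from several agencies under one key. The function link_records and the field names (client_id, on_benefit, paye_earnings, highest_qualification) are hypothetical and are not taken from MSD or IDI systems; real linkage usually also has to cope with imperfect identifiers.

```python
from collections import defaultdict

def link_records(*sources):
    """Group records from several agencies by a shared client id.

    Each source is an (agency_name, list_of_records) pair; every record
    is a dict that carries a 'client_id' key (a hypothetical field name).
    """
    linked = defaultdict(dict)
    for agency, records in sources:
        for record in records:
            fields = {k: v for k, v in record.items() if k != "client_id"}
            linked[record["client_id"]][agency] = fields
    return dict(linked)

# Made-up records for two clients across three agencies.
welfare = [{"client_id": 1, "on_benefit": True},
           {"client_id": 2, "on_benefit": False}]
tax = [{"client_id": 1, "paye_earnings": 0},
       {"client_id": 2, "paye_earnings": 3200}]
education = [{"client_id": 2, "highest_qualification": "NQF 2"}]

linked = link_records(("welfare", welfare), ("tax", tax), ("education", education))
```

Each client then has one profile that spans agencies, which is what allows outcomes to be measured in the same way for participants and non-participants.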

Impact monitoring of EMPLOYMENT ASSISTANCE

MSD impact monitoring

• Impact monitoring of Work and Income assistance for the last 10 years

• Work and Income employment assistance:
– training, job search, wage subsidy programmes

• Outcomes:
– Off benefit, tertiary study, part time work, subsequent assistance, Work and Income expenditure
– Next (IDI outcomes): earnings and employment, educational achievement, justice, migration

• Duration:
– current longest outcome period is 13 years

Propensity matching

• Workhorse method for impact monitoring:
– highly automated method
  • efficient
  • reduces influence of analyst bias
– easy to maintain and store results
– independent of post participation outcomes
– easy to explain to decision makers

• Propensity matching works well with administrative data:
– information on large numbers of non-participants
– rich profile information (especially prior outcomes)
– consistent measurement between participants and non-participants

Propensity matching: Short version

[Diagram: participant characteristics, divided into observed and unobserved*: Demographics, Labor market, Previous outcomes, Motivation, Skills / Education, Attitude, Networks]

Based on the participants’ observed profile, propensity matching selects a comparison group with the same average profile

Assumes the profile of unobserved characteristics will also be the same (?)

* Unobserved: uncorrelated with observed characteristics

Propensity matching: Long version

• Propensity score matching was proposed by Rosenbaum & Rubin (1983)

• Propensity (P) is the likelihood of participating in an intervention based on an observed profile (X)

P = f(X), with 0 < P < 1

• Both participants and non-participants have a propensity to participate

• A comparison group matched on propensity score will have the same average profile as the participants

• Enforces ‘common support’:
– matching only works if there are non-participants with a propensity to participate similar to that of participants
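The steps above can be sketched in a few lines of Python. This is a minimal illustration on simulated data, not MSD's implementation: the logistic model fitted by gradient ascent, the single profile variable, the 0.05 caliper and matching with replacement are all assumptions made for the example.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_propensity(X, d, steps=500, lr=1.0):
    """Fit P = f(X) as a logistic model using batch gradient ascent."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)  # intercept followed by one weight per profile variable
    for _ in range(steps):
        grad = [0.0] * (k + 1)
        for xi, di in zip(X, d):
            err = di - sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w

def propensity(w, xi):
    """Propensity score for one observed profile."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))

def match(scores, d, caliper=0.05):
    """Pair each participant with the nearest-scoring non-participant.

    Matching is with replacement; participants with no non-participant
    inside the caliper are dropped (a simple common support rule).
    """
    controls = [i for i, di in enumerate(d) if di == 0]
    pairs = {}
    for i, di in enumerate(d):
        if di == 1:
            j = min(controls, key=lambda c: abs(scores[c] - scores[i]))
            if abs(scores[j] - scores[i]) <= caliper:
                pairs[i] = j
    return pairs

def mean(values):
    return sum(values) / len(values)

# Simulated data: one profile variable that raises the chance of participating.
random.seed(0)
X = [[random.random()] for _ in range(400)]
d = [1 if random.random() < 0.2 + 0.5 * x[0] else 0 for x in X]

w = fit_propensity(X, d)
scores = [propensity(w, x) for x in X]
pairs = match(scores, d)

# Gap in the average profile before and after matching.
matched_participants = mean([X[i][0] for i in pairs])
raw_gap = matched_participants - mean([X[i][0] for i, di in enumerate(d) if di == 0])
matched_gap = matched_participants - mean([X[j][0] for j in pairs.values()])
```

Comparing raw_gap with matched_gap is a basic balance check: the matched comparison group should sit much closer to the participants' average profile than non-participants as a whole.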

Propensity matching: Long version

[Figure: two histograms of frequency (0% to 5%) against propensity score (0.0 to 1.0), showing participants, non-participants and the comparison group]

Differences in propensity score distribution reflect differences in observed average profiles

Common support problem: matching is only possible where participant and non-participant scores overlap

• Matching ensures we compare like groups
– preferred over multivariate regression estimates
– can be combined with difference-in-differences to further reduce bias in the impact estimate
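Two of the ideas on this slide, common support and difference-in-differences, are simple enough to sketch. The code below is illustrative only: the min/max overlap rule is one common way of defining common support (an assumption, not necessarily the rule used here), and all scores and outcome values are made up.

```python
def common_support(participant_scores, non_participant_scores):
    """Return the (low, high) score range in which both groups are observed."""
    low = max(min(participant_scores), min(non_participant_scores))
    high = min(max(participant_scores), max(non_participant_scores))
    return low, high

def on_support(score, bounds):
    low, high = bounds
    return low <= score <= high

def diff_in_diff(pre_participants, post_participants, pre_comparison, post_comparison):
    """Change for participants minus change for the comparison group."""
    return (post_participants - pre_participants) - (post_comparison - pre_comparison)

# Hypothetical propensity scores for each group.
participants = [0.30, 0.55, 0.80, 0.95]
non_participants = [0.05, 0.20, 0.45, 0.70]

bounds = common_support(participants, non_participants)
kept = [s for s in participants if on_support(s, bounds)]

# Hypothetical % off benefit before and after the programme.
did = diff_in_diff(20.0, 45.0, 22.0, 40.0)
```

Here the two highest-scoring participants fall outside the overlap and cannot be matched, and the difference-in-differences estimate removes the pre-programme gap that remained between the groups.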

Impact monitoring of TRAINING OPPORTUNITIES

Training Opportunities

• Training Opportunities (TOPs) was a programme to assist clients without basic school qualifications gain basic skills and qualifications
– qualifications were at NQF 2 and below
– free to participate
– up to two years in duration, usually around six months
– targeted at clients at risk of long term benefit receipt
– contracted out to external providers

• In 2011, TOPs funding was around 80 million a year

• Nominally an MSD programme, but administered by the Tertiary Education Commission (TEC)

TOPs impact

[Figure, first panel: % off main benefit (0 to 60) by years from starting TOPs (-1 to 7), for Participants (N: 10,038) and Comparison (1, N: 8,433). Matched at participation start; % off main benefit each month from start date.]

[Figure, second panel: impact on off main benefit (ppt, -14 to 4) by years from start (-1 to 8)]

Impact profile is the difference in outcomes between the participant and comparison group

Large lock in effect

Modest post participation effect

Impact after 7.5 years: 12.5 days

Advice: modest impact possible in the long term (10+ years)

Participants starting between 2000 and 2002
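The impact profile described above is a period-by-period subtraction, which the sketch below illustrates with made-up monthly figures. The conversion of the profile into cumulative days is an assumption about how a figure such as "12.5 days" could be derived; the slides do not state the method, and the 30-day period length is arbitrary.

```python
def impact_profile(participants_pct, comparison_pct):
    """Period-by-period impact in percentage points (ppt)."""
    if len(participants_pct) != len(comparison_pct):
        raise ValueError("both series must cover the same periods")
    return [p - c for p, c in zip(participants_pct, comparison_pct)]

def cumulative_days(profile_ppt, days_per_period=30.0):
    """Net days off benefit per participant implied by a ppt profile (assumed conversion)."""
    return sum(ppt / 100.0 * days_per_period for ppt in profile_ppt)

def negative_periods(profile_ppt):
    """Number of periods with a negative impact, a rough indicator of lock in."""
    return sum(1 for ppt in profile_ppt if ppt < 0)

# Hypothetical % off main benefit in four consecutive months from the start date.
participants = [20.0, 22.0, 30.0, 42.0]
comparison = [20.0, 30.0, 34.0, 40.0]

profile = impact_profile(participants, comparison)
days = cumulative_days(profile)
lock_in = negative_periods(profile)
```

In this made-up case the impact is negative while participants are on the programme and turns positive afterwards, the same lock in and post participation pattern as in the profile above.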

TOPs impact for 2000 to 2007 participants

[Figure: impact on off main benefit (ppt, -20 to 5) by years from start (-1 to 8), one line per participant cohort: 2000-2002, 2004, 2006 and 2007]

Comparing impact profiles for participant cohorts helps provide early predictions of long term impacts

Over the 2000s TOPs impact decreased
- increased lock in
- no post-participation effect

• Why? CMO - Changing context
– strong labour demand, especially for unskilled labour
– falling eligible population, with increasing numbers of low risk participants

• Changing context alters impact through:
– increased opportunity cost of participating
– lower labour market value of skills gained

TOPs ends

• In response to the above findings, Ministers decided to split TOPs into two programmes

• Foundation Focused Training Opportunities (FFTO)
– restricted to high risk clients (based on a statistical risk profiling tool)
– no more than six months in duration
– foundation skills (literacy and numeracy)

• Training for Work (TfW)
– no more than three months in duration
– medium risk clients
– work focused training

• Programmes introduced in 2011

FFTO and TfW impact in 2012

• Reported on the early impact of TfW and FFTO in 2012

[Figure: impact on off main benefit (ppt, -30 to 5) by years from starting programme (-1 to 2), for TOPs, TfW and FFTO]

FFTO impact profile was similar to TOPs

TfW showed a shorter lock in effect and a positive post participation impact

• In response to this evidence, Ministers decided to end FFTO in 2013
– funding transferred to the Ministry of Education to fund free education to NQF 2
– MSD no longer monitors the impact of this funding

Training for Work cohorts

• Training for Work impacts continue to improve

[Figure: impact on off main benefit (ppt, -20 to 10) by years from programme start (-1.0 to 2.0), one line per cohort: 2011, 2012 and 2013]

Smaller lock in effect

Positive post participation effect

Likely explanations (read conjecture):
• better targeting → fewer low risk participants → lower lock in
• tighter contract performance → higher post participation impact

Summary

• Impact monitoring increases the utility of impact information for decision makers
– more likely that we see evidence based decisions

• Combined with measures of diverse outcomes, impact monitoring enables more precise testing of intervention logic
– can better target qualitative research

• This is only possible through investment in data:
– developing good electronic administrative systems
– avoiding isolated systems (eg common client ids)
– agencies having proper data warehousing
– linking and sharing of agency data (eg SNZ IDI)

Any QUESTIONS?
