Impact evaluation of PES action for the unemployed in Flanders
International Evaluation Conference, Vilnius, Lithuania
4-5 July 2013
Benedict Wauters, ESF Agency Flanders
Research questions and challenges
1. Questions
• What is the effect of ESF modules on job seekers when they are not randomly assigned to treatment?
• Who is prone to get what: (self-)selection?
• What works for whom?
• Are job outcomes determined by intermediate outcomes?
2. Challenges
• No “untreated” group in Flanders
• Allocation to modules is not a random process
• Modules can have differentiated effects depending on the characteristics of job seekers
6 technical annexes to this presentation!
Data sources
• Survey of 2005 persons in June-July 2010 (4 to 9 months after the ESF action ended)
• Follow-up survey in June-November 2011 (21 months after the action)
• Administrative data from the client follow-up system of the PES
• Administrative data from Dimona (national databank in which employers confirm whether someone starts or stops working for them) for 1411 persons out of the original 2005 (for the multivariate analysis)
[Figure: participants (total)]
A causal model
[Diagram: X is a cause of Y; Z, etc. are further causes]
The ideal(ised) case of an RCT (randomised controlled trial)
[Diagram: Group 1 (will get T) and Group 2 (will not get T), each with personal characteristics X at t0 and barriers H at t0. Treatment → fraction of months worked in a period of 24 months, with Bt (effect of treatment).]
• X and H: no need to control for these in a regression, hence NO arrow
• e: in the error term there is nothing systematic that could influence treatment or outcome, hence no arrow
An RCT does not answer whether something works, but whether something works better than doing nothing (or some other comparison base) if you were to allocate people to it randomly.
In real life, no one actually wants to allocate anyone to a treatment at random (a lottery). You want to allocate purposefully, because you think certain persons will react better than others to an action, or because they need it more!
The effect in an RCT shows the effect due only to the treatment, without the “possible” benefit and fairness of being purposeful. “Possible”, because we can of course also get it wrong when we are purposeful. But an RCT does not tell you anything about that.
Causal model if there is NO random assignment…
[Diagram: personal characteristics X at t0 and barriers H at t0 influence both treatment T (modules) and the outcome (fraction of months worked in a period of 24 months); T → outcome with coefficient B; error term e.]
• Selection into treatment based on observables = controlled for
• Effect of observables on outcome = controlled for
• B = effect of treatment? No… we need to separate out the influence of unobserved variables (in the error term e) in B and find Bt, where Bt = B - Be!
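The contrast between the RCT case and non-random allocation can be illustrated with a small simulation. This is only a sketch under invented assumptions: the data are simulated, the true effect (0.10), the role of an unobserved "eagerness" and all other numbers are hypothetical, and nothing here comes from the Flemish data.

```python
import random
import statistics

def naive_effect(selective, n=20000, true_effect=0.10, seed=1):
    """Difference in mean outcome between treated and untreated persons."""
    rng = random.Random(seed)
    treated, untreated = [], []
    for _ in range(n):
        eagerness = rng.gauss(0.0, 1.0)  # unobserved, ends up in the error term e
        if selective:
            # (Self-)selection: eager persons are more likely to get the treatment
            gets_t = eagerness + rng.gauss(0.0, 1.0) > 0
        else:
            # Lottery, as in an RCT: allocation ignores eagerness
            gets_t = rng.random() < 0.5
        outcome = 0.40 + 0.05 * eagerness + rng.gauss(0.0, 0.1)
        if gets_t:
            outcome += true_effect
        (treated if gets_t else untreated).append(outcome)
    return statistics.fmean(treated) - statistics.fmean(untreated)

rct_estimate = naive_effect(selective=False)
biased_estimate = naive_effect(selective=True)
print(f"random allocation:    {rct_estimate:.3f}")
print(f"selective allocation: {biased_estimate:.3f}")
```

Under random allocation the simple difference in means stays close to the assumed effect; under selective allocation it also picks up the contribution of eagerness, which is the Be component in Bt = B - Be.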
SEQUENTIAL MODEL: reduce “bias” on unobservables
[Diagram: personal characteristics X at t0, barriers H at t0 and an exclusion restriction Z feed into treatment T (modules); T affects the fraction of months worked in a period of 24 months with effect Bt; the error term e (e.g. eagerness to work, unobserved) affects both treatment and outcome, with effect Be.]
Step 1: treatment allocation
Step 2: effect of treatment on outcome, controlling for personal and context characteristics, selection into outcome on observables, and selection on unobservables (Bt = B - Be)
• If there are unobservables (e) that affect treatment allocation and outcomes, then the estimated B is biased, as it reflects the influence of T AND e
• Sequential model: eliminate the bias caused by omitted unobservables e (but Be is also biased if Z has not been identified)
  • Step 1: treatment = f(X, H) + e, or T = a + CX + DH + e
  • Step 2: outcomes = f(T, X, H, ê) + u, or Y = A + BT + CX + DH + Eê + u
• The estimated ê is a better proxy for the omitted variables in the presence of Z (exclusion restriction)
  • Step 1: treatment = f(X, H, Z) + e2, or T = a + CX + DH + FZ + e2
  • Step 2: outcomes = f(T, X, H, ê2) + u, or Y = A + BT + CX + DH + Eê2 + u
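The two steps with an exclusion restriction can be sketched on simulated data. This is a deliberately simplified, linear illustration and not the evaluation's actual estimator: T is treated as continuous, H is left out, all coefficients are invented, and `ols` is a small helper written for this example.

```python
import random

def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * w for v, w in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

rng = random.Random(42)
n, true_b = 20000, 0.10          # hypothetical sample size and treatment effect
x, z, t, y = [], [], [], []
for _ in range(n):
    xi, zi, e = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
    ti = 0.5 * xi + 0.8 * zi + e                   # Z affects treatment only
    yi = 0.3 + true_b * ti + 0.2 * xi + 0.15 * e + rng.gauss(0, 0.5)
    x.append(xi); z.append(zi); t.append(ti); y.append(yi)

# Naive regression Y = A + B*T + C*X + u: B absorbs the influence of e
b_naive = ols([[1.0, ti, xi] for ti, xi in zip(t, x)], y)[1]

# Step 1: T = a + C*X + F*Z + e2, keep the estimated residual
g = ols([[1.0, xi, zi] for xi, zi in zip(x, z)], t)
resid = [ti - (g[0] + g[1] * xi + g[2] * zi) for ti, xi, zi in zip(t, x, z)]

# Step 2: Y = A + B*T + C*X + E*resid + u
b_corrected = ols([[1.0, ti, xi, ri] for ti, xi, ri in zip(t, x, resid)], y)[1]

print(f"naive B: {b_naive:.3f}, corrected B: {b_corrected:.3f}")
```

In this sketch the correction only works because Z moves the treatment without entering the outcome equation; without Z, T would be an exact linear combination of X and the residual and step 2 could not be estimated.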
Conclusion step 1
Step 1 indicates non-random selection on: gender, age, presence of children in the household, attainment of tertiary education, being long-term unemployed, being of foreign origin, having a handicap, having a work-related problem, having care duties or medical problems…
None of the above are necessarily a problem: they could reflect that PES staff try to choose the best module for the person, taking into account aspects we are not measuring, as well as other good reasons!
A notable one of a different kind is however the distance to the nearest PES-shop: there is no obvious good reason why this should influence selection
Step 1 and a half
We use a fractional logit model, as work intensity is defined as the fraction of months worked out of 24 months (the data only show whether someone worked in a month, not how many days). The same issues in interpreting coefficients apply as before, so we use the same method to make them easier to grasp.
In this model, no account has been taken of unobservables (e and Z)
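As an illustration of the outcome variable, the sketch below computes work intensity from 24 monthly indicators and shows the logistic mean function that a fractional logit uses to keep predictions between 0 and 1. The function names and the example person are hypothetical.

```python
import math

def work_intensity(worked_by_month):
    """Fraction of the 24 months in which at least one day was worked."""
    if len(worked_by_month) != 24:
        raise ValueError("expected 24 monthly indicators")
    return sum(1 for worked in worked_by_month if worked) / 24

def predicted_intensity(linear_predictor):
    """Logistic mean function: maps any linear predictor into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-linear_predictor))

# Hypothetical person: no work in months 1-6, work in months 7-18, none afterwards
example = [False] * 6 + [True] * 12 + [False] * 6
intensity = work_intensity(example)
print(intensity)
print(predicted_intensity(0.0))
```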
Module: marginal effect on work intensity (pptn = percentage points)
• Diagnosis, … (M2): REF CAT
• Persons oriented training (M5): 0.2 pptn (not significant)
• Pathway support (M7): 7.6 pptn
• Job search training (M3): 7.8 pptn
• Training and coaching on the work floor (M6): 9.5 pptn
• Profession oriented training (M4): 15.4 pptn
Step 2: what works for whom
Module: work intensity relatively more positive or negative (at the 10% significance level)
• Diagnosis, … (M2): positive for 16-25-year-olds and non-EU; negative for (very) long-term unemployed
• Job search training (M3): negative for 40-49-year-olds; positive for mid-level schooled and for high pre-action job search behaviour
• Profession oriented training (M4): negative for (very) long-term unemployed; positive for non-EU
• Persons oriented training (M5): negative for 40-plus and long-term unemployed
• Training and coaching on the work floor (M6): negative for mid-level schooled
• Pathway support (M7): negative for 50-plus and (very) long-term unemployed
These are results while also controlling for unobservables!
Step 2: the role of unobservables
Module: marginal effect on work intensity (not controlling for unobservables); correction on the marginal effect when unobserved variables are taken into account* (= added value of unobservable selection)
• Diagnosis, … (M2): REF CAT
• Persons oriented training (M5): 0.2 pptn (NS); correction: no effect
• Pathway support (M7): 7.6 pptn; correction: +2 pptn
• Job search training (M3): 7.8 pptn; correction: -3 pptn
• Training and coaching on the work floor (M6): 9.5 pptn; correction: +8 pptn
• Profession oriented training (M4): 15.4 pptn; correction: +3 pptn
* +: the (self-)selection on unobservables in this module has a positive effect on the outcome; -: the (self-)selection on unobservables in this module has a negative effect on the outcome
To obtain the marginal effect on work intensity (relative to module 2) of being RANDOMLY allocated to a particular module, we would subtract the correction from the uncorrected marginal effect.
Except for job search training, it is a good thing there is “bias” / selectivity based on unobservables! Random allocation on unobservables would not be beneficial!
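The subtraction described above can be written out as follows. The figures are those reported on the slide; treating "no effect" for M5 as a correction of 0 is an assumption made for this sketch.

```python
# Marginal effects relative to M2, in percentage points (from the slide)
uncorrected = {"M5": 0.2, "M7": 7.6, "M3": 7.8, "M6": 9.5, "M4": 15.4}
# Correction for selection on unobservables ("no effect" for M5 assumed to be 0)
correction = {"M5": 0.0, "M7": 2.0, "M3": -3.0, "M6": 8.0, "M4": 3.0}

# Effect under random allocation = uncorrected effect minus correction
random_allocation = {
    module: round(uncorrected[module] - correction[module], 1)
    for module in uncorrected
}

for module, effect in random_allocation.items():
    print(f"{module}: {effect:+.1f} pptn")
```

Only job search training (M3) has a larger effect after the correction, which matches the remark that selectivity on unobservables is beneficial everywhere except in that module.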
Predicted outcomes of random allocation (on observables AND unobservables) versus actual allocation
Module: fraction of months worked with the actual participants allocated to it; fraction of months worked under random allocation
• Diagnosis, … (M2): actual 0,308; random 0,414 (third best)
• Persons oriented training (M5): actual 0,335; random 0,168 (weakest)
• Pathway support (M7): actual 0,393; random 0,341 (second weakest)
• Job search training (M3): actual 0,393; random 0,380 (third weakest)
• Training and coaching on the work floor (M6): actual 0,393; random 0,558 (second best)
• Profession oriented training (M4): actual 0,495; random 0,764 (best)
If we were to allocate persons randomly to, e.g., module 6, the module would yield better average results. However, this is just because the actual participants of this module had relatively weak observable characteristics. Random allocation would increase the share of participants with a stronger profile and hence would result in a better performance of the module.
Another question to address
We were also interested in knowing HOW work intensity is affected.
• Specification of intermediate outcomes = theory-based evaluation
• Use of a structural equation model
  • no fractional logit; OLS was used (a different model for this data would result in higher coefficients than now estimated)
  • not possible to control for selection via unobservables
[Diagram: structural equation model. Observable characteristics* → modules (pathway support, training / coaching on the work floor, persons oriented training, profession oriented training, job search training, diagnosis) → intermediate outcomes (soft skills, job search) → hard outcome (work intensity). Arrows distinguish effects on work intensity from effects on job search.]
* The influence of observable characteristics on selection into modules is taken into account, but not shown.
• Modest effect of soft skills
• Modest negative effect of searching!? Those who work most probably need to search less!
Subsample of only those participants who were not yet at work at the time of the first survey
[Diagram: the same structural equation model (modules → soft skills and job search → work intensity), estimated on the subsample.]
• Even more modest effect of soft skills
• Modest positive effect of searching
• The negative effect of searching on work disappears, but there is still a negative effect of persons oriented training on job searching!?
Policy recommendations
Consider to…
• Put non-EU persons more into profession oriented training (M4) and into diagnosis (M2)
• Put mid-schooled persons more into training / coaching on the work floor (M6)
• Put the young more into diagnosis (M2)
• Put positive action in place for the 50-plus
• Emphasise the quality of search behaviour, rather than the quantity
• Review the selectivity in terms of distance to a PES shop
Some final “bewares”
1. Work intensity = number of months in which at least 1 day was worked
• A month with one day of work counts the same as a full month of working!
2. Initially, mistakes were made by regressing variables of the type “The action helped you because…” on variables of the type “fraction of months worked”
• Statements regarding causality cannot be used in a regression to quantify presumed causal relations
• This was due to too much initial “copy-pasting” of measurement instruments without thinking through how they would be used in the analysis (also, many variables were never used)
3. We are evaluating separate modules, but in reality Flemish job seekers get a pathway with a customised sequence of modules
• The unit of analysis is wrong; but at least the evaluated module was always the most intensive one of a pathway, so there is some value to the evaluation
NOT…
[Diagram: ACTIVITIES and Outcome boxes]
BUT…
[Diagram: ACTIVITIES and Outcome boxes]
As soon as some outcomes become visible, other actions are offered to build on this progress. The “theory of change” can therefore be very different for different people!
Some final “bewares”
4. We did not control for the “intensity” of a module: obviously, if more time / effort is spent on / in modules, they will get more results due to the sheer difference in effort
5. What did we really evaluate?
• There is no “model” of what e.g. a “module 3” action is supposed to look like
• In fact, “module 3” hides a variety of actions, executed in different ways
• In theory, we should also pay for a process evaluation of all this variety to ensure it conforms to a “standard”
• 800 000 EUR evaluation budget for the whole period = 2,13% of the total technical assistance budget, AND Flanders will get 30% less ESF in the next period!
• This evaluation alone cost 320 000 EUR (about half for the surveys): Phase 1: 150 000 EUR; Phase 2: 170 000 EUR
Some final “bewares”
6. Statistically very complex
• It is nice to get the statistics right, but no policy-maker can really understand (although they may pretend to) where the recommendations come from…
• …hence they do not trust them!
7. Actions for the unemployed are the responsibility of the PES in Flanders, where ESF just ensures that more of the same can be provided than would otherwise be possible
• There is no inherent difference between ESF-financed and non-ESF-financed actions
• As ESF pays for only a fraction of the various modules executed by the PES, they could easily satisfy the recommendations of the evaluation…
• …by submitting to ESF only particular combinations of participant profiles and modules…
• …without changing anything for the PES as a whole
• We are not allowed to request information from the PES other than that pertaining to the actions ESF finances
• This kind of evaluation should NOT be done by an ESF MA but by the national / regional government
Some final “bewares”
8. Finally… the elephant in the room that no one talks about: COST!
• Sure, redirecting some persons to some modules can be a good thing…
• …but for training / coaching on the work floor and for profession oriented training this is not so obvious, as these also cost a lot more
• A societal cost/benefit analysis would be required
• It is not straightforward to quantify the benefit side
Thank you for your attention Questions?
Contacts: [email protected]
Contractor for the evaluation [email protected]
Lead contracted researcher at the University of Leuven
Technical annexes
Annex 1
Results from descriptive analysis
Static results: labour market position 24 months after the action
54,1% at work 21 months after the first survey (versus 37,7%, a difference of 16,3%*)
1. Mod 7 (pathway support) and follow-up = 45,4%
2. Mod 2 (diagnosis and pathway definition) = 48,7%
3. Mod 3 (job search training) = 51,3%
4. Mod 5 (persons oriented training) = 53,9%
5. Mod 4 (profession oriented training) = 62,2%
6. Mod 6 (training and coaching on the work floor) = 62,7%
* Possibly biased by attrition of the sample from 2005 to 1153 persons
This is a snapshot: people could have worked in the period before, but not at the moment measured (24-month period)
Static vs dynamic results: months with at least one day of work in the 24-month period
54,1% at work 21 months after the first survey (versus 37,7%, a difference of 16,3%*)
Ranked by months worked (rank in the static results in brackets):
1. (2) Mod 2 (diagnosis and pathway definition) = 48,7% AND 8 M(onths) AND 1,9 tr(ansitions)**
2. (4) Mod 5 (persons oriented training) = 53,9% AND 9,2 M AND 2,1 tr
3. (1) Mod 7 (pathway support) and follow-up = 45,4% AND 9,6 M AND 2,2 tr
4. (3) Mod 3 (job search training) = 51,3% AND 9,6 M AND 2,3 tr
5. (6) Mod 6 (training and coaching on the work floor) = 62,7% AND 11 M AND 2,7 tr
6. (5) Mod 4 (profession oriented training) = 62,2% AND 12 M AND 2,4 tr
Longest time at work is also accompanied by more transitions
** 0 tr = never got a job, 1 = got a job, 2 = lost it again, …
* Possibly biased by attrition of the sample from 2005 to 1153 persons
[Figure annotations: about 500 never worked; after 6 months: about 1000 never worked; people that never fell back into unemployment; curve labels: shallow, most hollow]
Temporary conclusions from descriptive analysis
Mod 6 and 4 come out on top consistently in this sample of
…but perhaps this is because of the participant characteristics and other factors, not the action itself?
How to account for these selection effects?
• Heckman 2-step procedure (instrumental variable / control function approach)
Annex 2
Regression analysis to estimate treatment effects
Straight line equation with only 2 variables: Y = intercept + angle*X
[Figure: straight line in the X-Y plane, with the intercept marked on the Y axis]
Angle = slope of the equation (by how much Y increases or decreases when X increases or decreases)
Estimated regression equation: Y = intercept + B*X + error
[Figure: observations scattered around the fitted line in the X-Y plane]
Error = reflects the fact that the observations are not actually ON the line. It is only a model that fits the data to some extent. The error term will contain some random noise, but it will also contain non-random structure due to “omitted variables”.
The case of a control group versus only 1 treatment group
Zi is 0 for the control group and 1 for the treatment group.
The coefficient of the treatment reflects the effect of the treatment relative to the control group.
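A small worked example shows why the coefficient of a 0/1 treatment variable is the difference between the treatment and control group means. The outcome values are invented for illustration, and `simple_ols` is a helper written for this sketch.

```python
import statistics

def simple_ols(x, y):
    """Intercept and slope of the least-squares line Y = intercept + B*X."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical data: Z = 0 for the control group, 1 for the treatment group
z = [0, 0, 0, 0, 1, 1, 1, 1]
y = [0.25, 0.30, 0.35, 0.30, 0.45, 0.50, 0.40, 0.45]

intercept, b = simple_ols(z, y)
control_mean = statistics.fmean(yi for zi, yi in zip(z, y) if zi == 0)
treated_mean = statistics.fmean(yi for zi, yi in zip(z, y) if zi == 1)

print(f"intercept = {intercept:.2f} (control mean = {control_mean:.2f})")
print(f"B = {b:.2f} (treated mean - control mean = {treated_mean - control_mean:.2f})")
```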
Annex 3
Step 1 regression
Step 1
Estimating the likelihood that a person would participate in a module, given their characteristics
Mod 2 =0, Mod 3=1, etc. for “pes-module”.
However, selection into a range of discrete alternatives (like PES modules) is a non-linear function that requires complicated regressions (NOT ordinary least squares)
For a linear model, Y = intercept + Beta*X + Delta*D, the effect (coefficient) of X does not depend on the actual value of X (e.g. x1 or x2). Also, a change in the value of a dichotomous variable (e.g. a dummy taking the values 0 and 1) does not affect the slope of the line, only the value of the intercept, by Delta.
For a non-linear model, the effect (coefficient) of X on the probability of being in a particular module depends 1) on the level of the Xi of interest as well as 2) on the values of the other variables. E.g. at value x1, the increase in Y is different from the increase due to the same change in X at x2. Also, the same increase in X at x2 leads to a different increase of Y depending on whether the dummy variable is 0 or 1!
For equations with many variables on the right-hand side, this means that the coefficient of any variable depends on the value of that variable AND the values of all other variables!
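The point can be checked numerically with a logistic curve. The intercept, Beta and Delta below are arbitrary illustrative values, not estimates from the evaluation.

```python
import math

def prob(x, d, intercept=-2.0, beta=0.5, delta=1.0):
    """Logistic model: probability as a non-linear function of X and dummy D."""
    return 1.0 / (1.0 + math.exp(-(intercept + beta * x + delta * d)))

def effect_of_one_unit(x, d):
    """Change in probability when X increases by one unit, starting from x."""
    return prob(x + 1, d) - prob(x, d)

at_x1 = effect_of_one_unit(0, 0)          # low starting value of X, dummy off
at_x2 = effect_of_one_unit(4, 0)          # higher starting value of X, dummy off
at_x2_dummy_on = effect_of_one_unit(4, 1) # same X, dummy on

print(f"effect at x1 = 0, D = 0: {at_x1:.3f}")
print(f"effect at x2 = 4, D = 0: {at_x2:.3f}")
print(f"effect at x2 = 4, D = 1: {at_x2_dummy_on:.3f}")
```

The same one-unit increase in X produces three different changes in probability, whereas in the linear model it would always equal Beta.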
We use a multinomial (more than 2 possible outcomes) logit model (to take account of the non-linearity), where:
$$P(Y = M_j \mid X_i) = \frac{\exp(\gamma_j X_i)}{\sum_{k} \exp(\gamma_k X_i)}$$
with the sum running over all possible PES modules [hence we establish a %].
With M_j = the possible modules, X_i = the 13 variables, and γ_j = the intercept and 13 coefficients.
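A minimal sketch of how such probabilities are computed is given below. It uses two explanatory variables instead of 13, takes module 2 as the reference category with all coefficients set to zero, and every coefficient value is invented for illustration.

```python
import math

def module_probabilities(x, coefficients):
    """Multinomial logit: exp(score of module) / sum of exp(scores of all modules)."""
    scores = {
        module: sum(c * v for c, v in zip(coefs, [1.0] + x))  # intercept + slopes
        for module, coefs in coefficients.items()
    }
    top = max(scores.values())  # subtract the maximum for numerical stability
    exps = {module: math.exp(s - top) for module, s in scores.items()}
    total = sum(exps.values())
    return {module: e / total for module, e in exps.items()}

# Hypothetical coefficients: [intercept, man (0/1), distance to PES shop in km]
coefficients = {
    "M2": [0.0, 0.0, 0.0],     # reference category
    "M3": [0.2, -0.1, -0.02],
    "M4": [-0.5, 0.6, 0.01],
    "M7": [-0.1, 0.1, -0.03],
}

p_man = module_probabilities([1.0, 10.0], coefficients)
p_woman = module_probabilities([0.0, 10.0], coefficients)
print({m: round(p, 3) for m, p in p_man.items()})
print("marginal effect of being a man on P(M4):", round(p_man["M4"] - p_woman["M4"], 3))
```

Because the probabilities of all modules must sum to 1, a higher probability for one module necessarily lowers the probability of at least one other.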
Most of the right-hand variables are dummy variables (0 or 1 values, “on” versus “off”). This means we can look at the predicted probability of getting a particular module if you are, e.g., a man versus if you are a woman, holding everything else equal (at the sample means). The difference between these probabilities is the marginal effect.
For continuous right-hand side variables (e.g. distance to the local PES shop), we have to check the effect on probabilities along the entire range of values of the variable.
E.g. the average person with a labour market (LM) disability has approx. 23 percentage points more chance of being selected into pathway support. As the sample average probability of being selected into pathway support is 24% and people with LM disabilities represent 20% of the sample, this means that persons without LM disabilities have an estimated probability of 18% and persons with LM disabilities of 42% of being selected into pathway support.
E.g. the average man has approx. 9 percentage points more chance of being selected into profession oriented training. As the sample average probability of being selected into profession oriented training is 16,5% and men and women are equally represented in the sample, this means that women have an estimated probability of 12% and men of 21% of being selected into profession oriented training.
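The arithmetic behind the second example can be reproduced with a simple weighted-average shortcut: the sample average is the share-weighted mean of the two group probabilities. The helper name is hypothetical, and the shortcut is only an approximation of what the model does.

```python
def split_probabilities(sample_average, share_of_group, gap):
    """Split a sample-average probability into two group probabilities.

    Solves: (1 - share) * p_other + share * (p_other + gap) = sample_average
    """
    p_other = sample_average - share_of_group * gap
    return p_other, p_other + gap

# Men vs women, profession oriented training (figures from the slide)
p_women, p_men = split_probabilities(0.165, 0.5, 0.09)
print(f"women: {p_women:.2f}, men: {p_men:.2f}")

# LM disability, pathway support (figures from the slide)
p_without, p_with = split_probabilities(0.24, 0.20, 0.23)
print(f"without LM disability: {p_without:.3f}, with: {p_with:.3f}")
```

For the first example the shortcut gives values in the neighbourhood of, though not identical to, the rounded figures on the slide, which come from the estimated model rather than from this back-of-the-envelope calculation.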
For a continuous variable, e.g. distance to a PES shop, we explore the entire range of values.
[Figure: predicted probability P at distances 10, 12 and 14]
In search of exclusion restrictions
Test exclusion restrictions: variables that affect treatment but not outcomes (conditional on treatment), capturing differences in local labour markets and proximity of the PES service delivery
• Distance to nearest job centre
• Degree of morphological and functional urbanisation (defined at city level)
• Regional labour market dummies (n=13)
• Regional unemployment rate by age and gender, unemployment rate squared, first-differenced unemployment rate
Joint significance of instruments in the selection equation
• These variables are jointly significant in the first stage of the selection model
Test of overidentifying restrictions in the outcome equation
• These variables are jointly insignificant in the outcome equation
Coefficients, odds ratios and marginal effects
Example: assume only 2 treatment choices; 60% of males and 30% of females (are) select(ed) into training.

         Diagnosis   Training   TOTAL
Men          40         60       100
Women        70         30       100
TOT         110         90       200

Definition: odds = p/(1-p)
Odds that males are selected into training: odds(M) = .6/.4 = 1.5
Odds that females are selected into training: odds(F) = .3/.7 = 0.43
The odds ratio for males vs. females to be selected into training is then odds(M)/odds(F) = 1.5/0.43 = 3.5; log OR = 1.25
The odds of being selected into training are about 3.5 times greater for males than for females.
Next you estimate the following logistic regression model: Training = 1.25 Gender - .051 Age - 1.056 Unemployment duration
The effect of gender on the odds is exp(1.25) = 3.5, meaning the odds increase by 250% (= 3.5 - 1).
So B = effect on the logit = log(p/(1-p)), and exp(B) = effect on the odds = p/(1-p).
Still hard to grasp! However, it is possible to calculate p = exp(XB)/(1 + exp(XB)) at various levels of the X's.
We can then report marginal effects, evaluated at the mean, as these are useful, informative and easy to understand.
Reported marginal effect = dy/dx = P(Training | men, all x at mean) - P(Training | women, all x at mean) = 30 percentage points
For 2 hypothetical individuals with average values on all other X, the predicted probability of being selected into training is 30 percentage points higher for men than for women (e.g. p men = 60%, p women = 30%).
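The numbers in this example can be reproduced in a few lines, using only the 60% and 30% selection rates from the table; the function names are chosen for this sketch.

```python
import math

def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1.0 - p)

def probability(logit):
    """Back-transform a logit (log odds) into a probability."""
    return math.exp(logit) / (1.0 + math.exp(logit))

p_men, p_women = 0.60, 0.30            # shares selected into training
odds_men, odds_women = odds(p_men), odds(p_women)
odds_ratio = odds_men / odds_women
log_odds_ratio = math.log(odds_ratio)  # the logit coefficient B for gender

print(f"odds(M) = {odds_men:.2f}, odds(F) = {odds_women:.2f}")
print(f"odds ratio = {odds_ratio:.2f}, log OR = {log_odds_ratio:.2f}")
print(f"marginal effect = {p_men - p_women:.2f}")
print(f"back-transformed p for men = {probability(math.log(odds_men)):.2f}")
```

The marginal effect (a difference of 0.30 in probability) is the easiest of the three quantities to communicate, which is why it is the one reported.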
Annex 4
Step 2 regression
Step 2
Ml = transformed insertion of the estimated e2 from step 1 (with exclusion restrictions)*.
*see technical paper Bourguignon et al 2011
We have the same number (M=6) of estimations (in the form described above) as we have modules (we estimate with the subsample of participants for each module). In fact, all the coefficients are different in each of these equations. That includes the coefficients of Ml, which are different for each module relative to another module.
So we have 6 values for […]; this is the total effect of the various “propensities” to be allocated to various modules, for people (with average characteristics) allocated to a specific module.
[Figure: estimations 1 to 6]
Annex 5
Operationalising intermediate outcomes
[Diagram: observable characteristics* → modules → intermediate outcomes → hard outcome]
Intermediate outcomes:
• Self-knowledge and self-efficacy (confidence that one can succeed), based on ordinal factor analysis of 19 five-point Likert items
• Job searching after the ESF action, based on ordinal factor analysis of 12 channels, number of channels and number of applications
Operationalising self-knowledge and efficacy
All items begin with “Thanks to taking part in the action, …”. Each item is followed by its factor loading.

1. Job related self-efficacy
• …I have more confidence in myself and my abilities: 0,78
• …I am more confident that I can make contact with people who can help me find a job: 0,82
• …I am more confident that I can take the right steps to find a job: 0,82
• …I am more confident that I will do well in a job interview: 0,81
• …I am more confident that I can call on the help and support of others to help me find a job: 0,76
• …I am more confident that I can write a good application letter and CV: 0,69
• …I am more confident that I can contact employers in the right way and convince them to consider me for a job: 0,78
Cronbach's alpha: 0,92. Proportion of explained variance: 66,74%

2. Knowledge of labour market
• …I have gained a better view of the functions and jobs I would like to do: 0,81
• …I have a better view of the functions and jobs for which I am suited: 0,82
• …I have a better view of the companies / organisations in which I could work: 0,77
• …I have a better view of my chances of finding work: 0,79
• …I have a better view of the job opportunities that exist: 0,78
• …I am better informed about training courses that might be of interest to me: 0,68
• …I have got to know more people who can help me find a job: 0,66
Cronbach's alpha: 0,90. Proportion of explained variance: 64,01%

3. Awareness of own job related potential and preferences
• …I know my strong and weak points better: 0,83
• …I know better how I can work on my weak points: 0,81
• …I know better what I find important in a job: 0,79
• …I know better in which jobs I would feel good: 0,78
• …I know better what I handled well and what I handled badly in the past in relation to work: 0,75
Cronbach's alpha: 0,89. Proportion of explained variance: 70,22%
All items in 1, 2, 3 were 5-point Likert scales. They were each reduced to one variable using ordinal factor analysis. But the cross-loadings were too high to justify separate constructs so all scales were combined into one variable.
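For readers unfamiliar with Cronbach's alpha, the reliability statistic reported for each scale, a minimal sketch of the standard formula follows. The response data are invented; the function is an illustration, not the software used in the evaluation.

```python
import statistics

def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    items: list of k lists, one per item, each holding one score per respondent.
    alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    sum_item_variances = sum(statistics.variance(item) for item in items)
    return k / (k - 1) * (1 - sum_item_variances / statistics.variance(totals))

# Hypothetical answers of 6 respondents to 3 five-point Likert items
responses = [
    [4, 5, 2, 3, 4, 1],
    [4, 4, 2, 3, 5, 2],
    [5, 5, 1, 3, 4, 2],
]
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.2f}")

# Sanity check: identical items are perfectly consistent, so alpha equals 1
identical = cronbach_alpha([[1, 2, 3, 4, 5]] * 3)
print(f"alpha for identical items = {identical:.2f}")
```

Values close to 1, such as the 0,89 to 0,92 reported above, indicate that the items of a scale move together closely.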
Annex 6
What if there was random allocation on both observables and unobservables?
Outcomes if random allocation
We can use the equations with the estimated coefficients to generate predictions of work intensity:
• Predict the work intensity using the model with the actual participants that took the module
• Predict the work intensity using the same model but with randomly allocated participants (here random allocation will occur on both observable and unobservable characteristics)