2011 - impact evaluation 3 (dfid talk)
TRANSCRIPT
-
7/31/2019 2011 - Impact Evaluation 3 (DFID Talk)
1/16
Three years ago I gave a talk at DFID called Impact Evaluation 2.0
I didn't think the talk would get around much, which is why I gave it such an arrogant title. To my enduring surprise and dread, some people actually read it.
My point then: to talk about how impact evaluations could better serve the needs of policymakers, and accelerate learning.
Frankly, the benefits of the simple randomized control trial have been (in my opinion) overestimated. But with the right design and approach, they hold even more potential than has been promised or realized.
I've learned this the hard way.
Many of my colleagues have more experience than I, and have learned these lessons already. What I have to say is not new to them. But I don't think the lessons are widely recognized just yet.
This talk is an attempt to draw out what more I've learned in the three years since that first talk. An impact evaluation 3.0? You be the judge.
-
These have been early days of RCTs
We kind of take opportunities where we see them
And we have learned a lot. I think the summaries sent around illustrate a lot of interesting findings.
But the evidence so far is a grab bag of different interventions
Take elections work. I look at the different results from RCTs, and it's really hard to draw out the general lessons. Trying to influence people with information seems to influence them a little.
But if I were a program manager in Liberia right now, thinking "what have I learned that affects the way I program around the 2011 election?", I'm not sure I have many answers.
Another way of asking this question: What do I know about human behavior now that I didn't know before seeing this work?
In fact, the answers raise more questions. People seem to be influenced by small cues or info campaigns. Does this mean that the gains are easily reversed? Are we such fickle creatures?
That, in some sense, might be the important question
-
Question I would have for DFID: What are the strategic lessons you need to learn as an organization?
You are one of the biggest donors in the world: you can set strategic objectives and answer questions
More importantly: if not you, who else?
If you leave it to the academics like me, you'll get lots of answers but not necessarily to questions that help you do your job better. A lot of the questions we ask are directly relevant to your work, and making aid better. Others answer more arcane questions. Maybe these are important as basic research, but there's a good chance they are not questions that are immediately relevant.
I think you should support both: basic research in aid is important. But to tip the balance to relevant work you are going to have to drive the strategy yourself.
So: Fund agendas and big questions, not project evaluations
Where is this happening? Microfinance. Savings. Cash transfers.
-
Don't evaluate things that have very little generalizability
Evaluations can take years. Will cost hundreds of thousands if not more
Need to be able to learn something more than impact of program X in place Y at time Z
Need to learn something fundamental about human behavior, or a quantity or relationship that applies broadly
As an example, let me use civic education programs
-
This is a photo of a civic education campaign I have been working on in Liberia
What does a civic education program or information campaign tell you?
The initial design ideas had us capturing the impact on attitudes, knowledge and participation in the community. Compare outcomes in T&C
It's not clear what we learn from this. We see how well the program worked in this context. What does this tell me about whether I should push this agenda more broadly?
The discussions we had at the beginning turned to "What's your theory of change?"
Why does information seem to change people's behavior? Are people so info-starved or easily influenced?
Personally I'm quite suspicious of this theory of change. It's common to a lot of governance programs: just give people information. As if information is the limiting constraint.
Let me come back to this example in a moment
-
First I want to make a general point
It really helps to articulate the assumptions on which the program is based
Whether it's implicit or explicit, every program is rooted in a theory of change
Assumptions about the way humans work, or organizations work, or the political systems work
Some are good, some are bad
Sometimes, when you start to articulate that in a detailed or formal way, you realize that the theory of change is not as clear as you thought
Many NGOs and programs are very clear about this, but I would say more are not. It's often superficial.
-
Test the idea, not the program
The theory of change or the assumptions that underlie your program. Chances are this generalizes much more than the impact of program X in place Y at time Z
This means developing and testing theories
It might even mean putting theory and idea testing ahead of program objectives, at least in the short term
-
Let me come back to the civic education example
As we articulated the theory of change, it evolved from "elites telling poor people what to think" to something much more nuanced
To put it briefly, the theory of change is that, in the absence of formal justice or admin systems, the rules and practices followed in the village derive from accepted norms
These norms tend to reflect the interests of older males of the dominant ethnic group
The civic education program aimed to create a new set of norms, a reference point for administration and justice that people could point to
To do so, it needed to create common knowledge
And it would have to stick.
Ideally I would have liked to experiment with different content and curricula
But what we were able to do is randomize the intensity of treatment to assess impact on common knowledge
We looked for a difference in spillovers to nonparticipants
We also randomized timing, so that we could look at growth or decay
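To make the design idea concrete, here is a minimal sketch of how treatment intensity and rollout timing could be randomized together across villages. It is an illustration only, not the procedure the Liberia study used: the function name `assign_villages`, the arm labels ("low", "high"), the two rollout waves and the seed are all hypothetical.

```python
import random


def assign_villages(villages, intensities=("low", "high"), waves=(1, 2), seed=0):
    """Randomly assign villages to a control arm or to an (intensity, wave) arm.

    All labels are hypothetical. Control villages get no rollout wave.
    """
    rng = random.Random(seed)  # fixed seed so the assignment can be reproduced
    # One pure control arm, plus every combination of intensity and timing.
    arms = [("control", None)] + [(i, w) for i in intensities for w in waves]
    shuffled = list(villages)
    rng.shuffle(shuffled)
    # Deal the shuffled villages round-robin so arm sizes differ by at most one.
    return {village: arms[k % len(arms)] for k, village in enumerate(shuffled)}


if __name__ == "__main__":
    assignment = assign_villages([f"village_{n}" for n in range(20)], seed=1)
    for village, (intensity, wave) in sorted(assignment.items()):
        print(village, intensity, wave)
```

Comparing "low" with "high" arms speaks to whether more intense treatment creates more common knowledge, and comparing waves at a single survey date gives a rough look at growth or decay over time.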
-
Let me give another example, from a post-conflict stabilization program still underway: The Sustainable Transformation of Youth in Liberia (STYL) program.
My basic objective: Work with high-risk street youth (street hawkers, beggars, homeless, petty criminals, drug dealers, junkies) in Monrovia to mitigate poverty and violence and potential for political instability
Want to test role of poverty in criminal activity, aggression and political violence
We also think there may be cognitive and behavioral roots to aggression and crime and other antisocial behavior
So we are overlaying two treatments. One is a grant for starting an income-generating activity (if they want). One is a short program of cognitive behavior therapy designed to essentially instill self-discipline and a future focus, to see if preferences and personality traits can be changed, especially the ones that can make people a harm to themselves and others
Some people talk about nudges. Think of this as a shove.
We randomly evaluate the impact of the two interventions alone as well as in combination.
We are varying the size of each treatment in order to understand the responsiveness of aggression and crime to these forces, and to test different models of poverty alleviation and assumptions about what drives antisocial behavior
What we learn, we hope, is generalizable beyond this specific group.
Idea is that these tendencies may be generalizable to high-risk youth across the region
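The "alone as well as in combination" design above is a two-by-two factorial. As a rough sketch of the logic, and not the STYL team's actual code, the snippet below assigns participants to the four cells and compares cell means. The names `assign_factorial` and `factorial_effects` are hypothetical, outcomes are whatever measure the evaluator supplies, and the sketch ignores the varying treatment sizes and any standard errors.

```python
import random
from statistics import mean

# Each arm is (got_grant, got_therapy); (0, 0) is the pure control cell.
ARMS = [(0, 0), (1, 0), (0, 1), (1, 1)]


def assign_factorial(participants, seed=0):
    """Randomly deal participants into the four cells of a 2x2 design."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    return {person: ARMS[k % len(ARMS)] for k, person in enumerate(shuffled)}


def factorial_effects(assignment, outcomes):
    """Compare cell means: each treatment alone, both together, and the interaction."""
    cells = {arm: [] for arm in ARMS}
    for person, arm in assignment.items():
        cells[arm].append(outcomes[person])
    m = {arm: mean(values) for arm, values in cells.items()}
    grant_alone = m[(1, 0)] - m[(0, 0)]
    therapy_alone = m[(0, 1)] - m[(0, 0)]
    combined = m[(1, 1)] - m[(0, 0)]
    return {
        "grant_alone": grant_alone,
        "therapy_alone": therapy_alone,
        "combined": combined,
        # Extra effect of receiving both, beyond the sum of the two separate effects.
        "interaction": combined - grant_alone - therapy_alone,
    }
```

A non-zero interaction term is what would suggest that cash and therapy reinforce (or undercut) each other rather than simply adding up.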
-
This last program is run in a very tightly controlled way. It's not M&E, it's R&D.
We have been using the wrong letters.
I used to say to people: look, Microsoft does not go big on the market before it has really tested its product. They are the biggest player in the industry, that would be crazy.
Then I realized: actually that's EXACTLY what Microsoft does, and we suffer for it every time we boot up our computer, or get the blue screen of death.
So now I use Google as an example. They are constantly coming up with new ideas, they empower their employees to innovate and experiment, they constantly try new things in beta. They are tinkering, perfecting.
All the while, they are learning something fundamental about how human beings search and get and process info. They know more than anyone how their users tick.
They scale gradually.
DFID: You are big, your money will get spent, the programs will be implemented, most will be pretty good. You, DFID, can be the Microsoft of aid.
The question is: do you want to be better?
For an example of R&D, let me go back to the street youth example.
-
STYL is a tightly controlled project
We are self-consciously running this whole program as a pilot for scale-up in the region
We started with 100 people. That's all we needed for proof of concept. We showed no adverse effect from giving cash to drug dealers and petty criminals and beggars.
In fact we saw huge drops in homelessness, drug use and crime
We've just scaled to 400 more. We will be able to get very good precision on impacts because we measure people at 2-week intervals three times a year. We have full-time ethnographers.
We are now planning a third and fourth phase, where we will test out different sizes of economic assistance, see what happens if you add in skills training, see what happens if you provide long-term behavioral reinforcement.
When we are finished, what we hope to have is a cheap, scalable, proven intervention that is ready to go to scale in Liberia or outside it.
Will it make sense to evaluate the scale-up? Why not? Replication is important.
But, if it meant crowding out the time or money or intellectual energy to do R&D on the next project, on the next idea, then I would say: Do the minimum you need to test replication and then focus your scarce resources elsewhere
-
Some people are thinking "that won't work for my kind of project"
And you are partly right. But probably only partly.
It will work where you can get precise answers without going to huge scale, so anything that works with individuals or groups
If the impact is large enough, it's easy to measure in a small group as well.
This makes it harder to do with a community-level experiment, or something where timing around an event like an election is key
But I guarantee you: in every one of those cases, if you wrote out your assumptions about the way that humans or organizations or groups or systems work, I could find CRUCIAL assumptions that could be tested on a small scale
That is, important aspects of your idea and program could be honed and perfected until we know something more about how aid works, and also have a terrific program
-
My fifth and last message is that you must embrace failure
I read some of the reviews that were circulated. Everyone said "this program had this impact and that program had that impact"
Not one said "and this program had no effect at all" or even "this program affected this and not that"
I exaggerate a little, but not much
What this probably means is that
a) People are not finishing failing studies
b) People are not publishing failed studies
c) People selectively report what works
d) We are all attuned to remember what works, not what fails
I think the answer is e) all of the above. This is getting better.
But frankly I can't imagine anything more damaging to a program designer than not knowing what hasn't worked elsewhere
-
The implication is an organizational shift that will be very, very difficult to overcome
Small things you can do to make it better
a) Prespecify your theory, tests and outcomes
b) Actually report which ones had impact and which did not
c) Celebrate and communicate failures
Do remember that absence of evidence is not evidence of absence. But take absence of evidence seriously.
Resist the natural human urge to focus on the positive impacts. That is the path of Microsoft.
-
Ultimately, do not focus on RCTs, the methodology. They are a means to an end.
-
My parting shot:
You are a big organization. You have a lot of money to spend. Most of your programs are fine. You are going to put aid and governance programs out there and they are going to be impactful, even if they don't work perfectly.
That is, you have it within your power to be the Microsoft of the aid industry.
I'm suggesting that you could actually do better.
Maybe you won't be the Google of aid, but surely you want to move in that direction?