What Works? Targeting, Testing & Tracking
(Triple-T)
AMIMB, Glazier’s Hall, 29 October 2013
Lawrence W. Sherman
Institute of Criminology
Cambridge University
Forecasting Storms
• 1987: Big False Negative
• 2013: Big True Positive
• Week in advance
• Protective actions taken
• Huge reduction in harm
• Same in Eastern India 2013
• Who was the hero?
Who Was Vice-Admiral Robert Fitzroy?
1819: Weather Was All Words
• Fitzroy changed the business model
• ADDED numbers into it
• Sharpened the words with scales, %
• Improved accuracy over “experience”
• Transformed qualitative forecasts
• Into quantitatively-supported statements
• Now we call them “algorithms”
• See things coming much sooner
Meteorological Office of the Board of Trade 1854
• Job: Publish Weather Data
• Not Predict It
• Fitzroy invented term “forecasting”
• Controversial idea
• Not 100% accurate
• Royal Society Attacked It
• Budget Cut; forecasts stopped
• Fitzroy committed suicide
Prevention Finally Won
• Falling Barometer Gale Warnings
• Fishing ships told to stay in port
• Owners objected; Practice discontinued
• Deaths from gales; Prevention revived
What’s YOUR Job?
• Measuring Harm?
• Preventing Harm?
• Reducing Harm?
• Correcting Harm?
• Redressing Wrongs?
What is your business model?
Not for making money.
Just for doing the job.
Words--not many Numbers
• Every prison is different
• Every human life is important
• All rights must be protected
BUT....
• Which prisons are worst?
• Which are about to explode in violence?
• Which will have the most suicides?
• How can algorithms help prevent harm?
AMIMB Website Einstein Quote
• Using same method
• Expect different result?
• Is there a new method IMBs could use?
• Are there other business models?
• Could they get better results?
• Could they ADD numbers to add VALUE?
Evidence-Based Practice
1. An Idea:
“...Practices should be based on
scientific evidence about what works best.”
(Sherman, 1998)
2. An Analytic Framework
“A standard for all....strategies: using scientific evidence to target, test and track ...[all] practices” (Sherman, 2013)
Words AND Numbers
• AMIMB Practical Guide Has Many Words
• Many numbers requested
• Not clear how they are to be gathered
• Whether comparisons could be reliable
• What is the logic model by which EACH IMB REPORT CAN REDUCE HARM?
Same Problem for HM Inspectors
• Police since 1856
• Prisons? Probation?
• “Efficiency and Effectiveness”
• Case by case
• Word by word
• HMI Police developed a “report card”
• But measures were not scientific
• Not “risk-adjusted”
Daniel Kahneman
QUICK! What emotion does this person feel?
QUICK! How Much is This Product?
17 X 24 = ???
Thinking, Fast and….Slower
• Reading the face—fast!
• Multiplying two numbers—slow (408)
• Why is that important to monitoring?
• It adds theory about how we think, decide, act
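As a worked illustration of the slow, System II route to the product on the slide, one way (among several) to break the multiplication into effortful steps is:

```latex
\begin{align*}
17 \times 24 &= 17 \times (20 + 4) \\
             &= 340 + 68 \\
             &= 408
\end{align*}
```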
Two Cognitive Systems: Overlapping
System I FAST
Intuitive
Automatic
Effortless
Associative
Rapid
Opaque Process
Skilled
System II SLOW
Reflective
Controlled
Effortful
Deductive
Slow
Self-Aware
Rule-Following
Good News, Bad News
Good News:
• Most decisions are made with System I
• System I conserves energy, time
• Most System I decisions are right--driving
Bad News:
• Many important decisions are “wrong”
• Many could be “right” if we used System II
• We resist System II because it is “costly,” tiring
Slow Thinking About Strategy
Targeting: Aiming for biggest impact
Tracking: Measuring BOTH policing and crime
Testing: Deciding what works
Targeting---PREDICT
Focus
•Issues, situations, processes
•Units, managers, problem leaders
Classify
•Concentrations
•Causes
Prioritize
•Greatest impact
•Best chance of success
Clinical vs. Statistical: Kahneman Chapter 21
Began Career with Paul Meehl (1954)
200 Replications Since Then
Formulas always beat (60%) or equal (40%) human judgment; the latter is far more costly
Crime, parole, air pilot errors, credit risk
Wine value forecasts
Cot Death risk
Paul Meehl
3 ways of Predicting
Qualitative:
1. Clinical (System 1~2?; not transparent)
Statistical:
2. Checklist with validation
3. Supercomputer data mining
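To make "checklist with validation" concrete, here is a minimal Python sketch. It is not the method used in any study cited in the talk. The checklist items, their point weights, the threshold and the demo cases are all invented for illustration. The idea is only that a checklist gives each case a score, and validation checks against past cases with known outcomes whether high scorers really were harmed more often.

```python
from dataclasses import dataclass, field

# Hypothetical checklist items and point weights, invented for illustration.
CHECKLIST = {"prior_incident": 2, "recent_transfer": 1, "under_21": 1}


@dataclass
class Case:
    answers: dict = field(default_factory=dict)  # item name -> True/False
    harmed: bool = False                         # outcome observed later


def score(case):
    """Add up the points for every checklist item answered True."""
    return sum(pts for item, pts in CHECKLIST.items() if case.answers.get(item))


def validate(cases, threshold):
    """Return (harm rate at or above threshold, harm rate below it)."""
    flagged = [c for c in cases if score(c) >= threshold]
    rest = [c for c in cases if score(c) < threshold]

    def rate(group):
        # None signals an empty group rather than a zero rate.
        return sum(c.harmed for c in group) / len(group) if group else None

    return rate(flagged), rate(rest)


# Made-up past cases with known outcomes, used only to exercise the code.
DEMO_CASES = [
    Case({"prior_incident": True, "recent_transfer": True}, True),
    Case({"prior_incident": True}, True),
    Case({"prior_incident": True, "under_21": True}, False),
    Case({"under_21": True}, False),
    Case({}, False),
    Case({"recent_transfer": True}, True),
]

if __name__ == "__main__":
    high, low = validate(DEMO_CASES, threshold=2)
    print(f"harm rate, flagged: {high:.2f}; not flagged: {low:.2f}")
```

If the flagged group's rate is not clearly higher than the rest on cases the checklist was not built from, the checklist has not been validated, however sensible its items look.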
High Risk (2%)
Neither High nor Low Risk (38%)
Low Risk (60%)
High Risk 2% vs. Bottom 60%: Two Years From Forecast Date
Charges for Any Offence: 8 X more
Charges Serious Offence: 10 X more
Charges Murder or Attempt: 75 X more
Testing
Comparing two methods
Same kind of problems
THEN,.... ASKING:
Which one works better?
Which one costs less?
Which one gets best result for same cost?
Testing Defined
A Fair Comparison
Between two different methods
E = Experimental Method (new)
C = Control Method (current)
All else equal
Which one is better?
By what criteria?
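One common way to approximate "all else equal" is to let chance decide which units get the Experimental method and which get the Control method. The Python sketch below assumes a simple 50/50 split and a single numeric outcome per unit. The function names, the seed and the example numbers are hypothetical and not drawn from the talk.

```python
import random
from statistics import mean


def assign(units, seed=0):
    """Randomly split units into (experimental, control) halves."""
    rng = random.Random(seed)   # fixed seed so the split can be audited
    shuffled = list(units)      # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def compare(outcomes_e, outcomes_c):
    """Mean outcome under E minus mean outcome under C."""
    return mean(outcomes_e) - mean(outcomes_c)


if __name__ == "__main__":
    e_units, c_units = assign(range(10), seed=1)
    print("E:", e_units, "C:", c_units)
    # Invented outcome counts (e.g. incidents per unit); lower is better.
    print("E minus C:", compare([1, 2, 3], [2, 3, 4]))
```

Because chance rather than judgment picks the groups, any difference found by `compare` is harder to attribute to how the units were selected.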
What Kind of Evidence?
Not This Kind
But This Kind
But what is good evidence?
What constitutes a good test?
Many bad tests
More intuitive than quantitative
“Illusion of Validity”
Restorative Justice Experiment: Did Program Cause Crime Drop?
[Figure: line chart. Series: DC Offenders (n = 62). X-axis: Yr (-2), Yr (-1), Yr (+1), Yr (+2). Y-axis: 20% to 140%. Data labels shown: 71%, 71%, 49%, 100%.]
Inferring Cause From Trend?
• Post Hoc Ergo Propter Hoc?
• After this, then because of this?
Or would it have dropped anyway?
• “Natural” Trend
• “History”
• Other factors
• “Spurious” explanations
• That can be eliminated, giving confidence
All ruled out by randomized controlled trials (RCTs)
Randomized Controlled Trial (RCT): COMPARISON or NET difference
[Figure: line chart. Series: Court Offenders (n = 59) and DC Offenders (n = 62). X-axis: Yr (-2), Yr (-1), Yr (+1), Yr (+2). Y-axis: 20% to 140%. Data labels shown: 101%, 121%, 71%, 71%, 28%, 49%, 100%.]
CONTROL Group
• A sample that measures what would happen at the same time and place without the intervention being introduced to an otherwise identical EXPERIMENTAL Group
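The "NET difference" idea can be sketched in a few lines of Python: take the change in the experimental group and subtract the change the control group shows over the same period. The figures below are invented for illustration and are not read off the chart, whose data labels cannot be reliably matched to years in this transcript. The function names are hypothetical.

```python
def pct_change(before, after):
    """Percentage change from before to after (negative means a drop)."""
    return (after - before) * 100 / before


def net_difference(exp_before, exp_after, ctl_before, ctl_after):
    """Experimental group's change minus the control group's change.

    The control group's change stands in for what would have happened
    anyway (trend, history, other factors).
    """
    return pct_change(exp_before, exp_after) - pct_change(ctl_before, ctl_after)


if __name__ == "__main__":
    # Invented offending rates: experimental falls 100 -> 70,
    # control falls 100 -> 90 over the same period.
    print(net_difference(100, 70, 100, 90))
```

In this made-up case a naive before-and-after reading would credit the program with a 30% drop, while the net difference credits it with 20 percentage points, because the control group fell by 10% without the program.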
AFP/ACT ANU Experiment: Did Program Cause Crime Drop?
[Figure: line chart. Series: DC Offenders (n = 62). X-axis: Yr (-2), Yr (-1), Yr (+1), Yr (+2). Y-axis: 20% to 140%. Data labels shown: 71%, 71%, 49%, 100%.]
Tracking
Crime Where When
Policing Patrol, POP Arrests
Match? Ratios Trends
Tracking in Politics
• Obama’s “Cave”
• Tracking volunteer activity: 500 offices
• Contacting voters
• Determining who supports Obama
• Determining when & how they will vote
• Arranging transport to the polls
• Scheduling election day drivers
• “Ground Game” cost $100 Million
3. Tracking in Prisons
• Liebling’s Quality of Prison Life Measures
• What are the trends in
--inputs (resources)
--outputs (activities)
--outcomes (results)
AMIMB-NOMS COMPSTAT?
Tracking With Evidence
• Discuss
• Criticize
• Problem-Solve
• Use and misuse of data
• Refinement through trial and error
• Technology making change easier
Sherman: 1998
The Rise of Evidence?
• More Evidence is Available
• More Evidence is Demanded
• Budget cutbacks make evidence relevant
• But there are only early adopters
• A tipping point may come soon
• The pace is quickening
How Can IMBs Be Evidence-Based?
1. Quantify current reports
2. Design new—fewer—measures
3. Test different models of IMB work
What Works? Targeting, Testing & Tracking
(Triple-T)
THANK YOU
Lawrence W. Sherman
Institute of Criminology
Cambridge University