When Is a Program Ready for Rigorous Impact Analysis?

Page 1: When Is a Program Ready for  Rigorous Impact Analysis?

Abt Associates |pg 1

[G]overnment should be seeking out creative, results-oriented programs like the ones here today and helping them replicate their efforts across America.

President Barack Obama, 6/30/2009
http://www.nationalservice.gov/about/newsroom/statements_detail.asp?tbl_pr_id=1828

Page 2: When Is a Program Ready for  Rigorous Impact Analysis?

When Is a Program Ready for Rigorous Impact Analysis?

Diana Epstein (CAP/Center for American Progress) and Jacob Alex Klerman (Abt Associates)

APPAM/HSE Conference “Improving the Quality of Public Services”, Moscow, June 2011

Page 3: When Is a Program Ready for  Rigorous Impact Analysis?

Outline

The Basic Argument

On Logic Models

Some Examples

Some Broader Implications

Discussion

Page 4: When Is a Program Ready for  Rigorous Impact Analysis?

The Goal

• Identify program ideas that can successfully address pressing social problems

• Roll them out nationally

[Diagram: Program Idea → Broad Rollout]

Page 5: When Is a Program Ready for  Rigorous Impact Analysis?

Require Rigorous Impact Evaluation

• Many apparently plausible programs “don’t work”

• Many that work in one site don’t work in another site

• So, require an Impact Evaluation “tollgate”
– Usually random assignment
– Saving money

• This is the “New Orthodoxy”
– Coalition for Effective Policy
– OMB (2009)

[Diagram: Program Idea → Efficacy Trial → Replication → Effectiveness Trial → Broad Rollout]

Page 6: When Is a Program Ready for  Rigorous Impact Analysis?

Many Programs Fail RA, So Pilot

• We argue that a rush to “random assignment evaluation” has two problems
1. Some programs clearly will not pass the Impact Evaluation “tollgate”
2. Some of those programs would pass the Impact Evaluation with more “development”

• A “pilot” would help with both problems
– i.e., run the program for a while
– Then, if the program is promising …
– Start the Impact Analysis

[Diagram: Program Idea → Pilot → Efficacy Trial → Replication → Effectiveness Trial → Broad Rollout]

Page 8: When Is a Program Ready for  Rigorous Impact Analysis?

Formative Evaluation/Process Evaluation

• Formative Evaluation to improve the program

• Process Evaluation to screen out programs that are unlikely to show impact

• But, how do you do that?
– New Orthodoxy: Only random assignment can reliably detect impact
– So, how can a Process Evaluation screen?

[Diagram: Program Idea → Formative Evaluation → Process Evaluation → Efficacy Trial → Replication → Effectiveness Trial → Broad Rollout]

Page 12: When Is a Program Ready for  Rigorous Impact Analysis?

“Falsifiable Logic Models” Can Screen

[Diagram: Program Idea → Formative Evaluation → Process Evaluation → Efficacy Trial → Replication → Effectiveness Trial → Broad Rollout, annotated as follows]

• Require Falsifiable Logic Model

• Revise program and Falsifiable Logic Model

• Only proceed if program satisfies its own Falsifiable Logic Model

Page 14: When Is a Program Ready for  Rigorous Impact Analysis?

Why Might this Work?

• Logic Models explicate the path from resources to impacts

• All but the “impact” step occur
– In the treatment group
– During (or at the end of) treatment

• So, verifying the logic model does not require
– Random assignment
– Or even a control group
– Long program follow-up
– And expensive post-program survey tracking efforts

[Diagram: Program Idea → Formative Evaluation → Process Evaluation → Efficacy Trial → Replication → Effectiveness Trial → Broad Rollout]

Page 16: When Is a Program Ready for  Rigorous Impact Analysis?

But Will this Screen? Need Examples of …

Intermediate benchmarks
– Resources/inputs, Activities, Outputs, Outcomes

… that were (should have been/could have been) specified in a Falsifiable Logic Model

And, that could be detected
– Using only the treatment group
– Without an expensive follow-up survey
– Before (or perhaps shortly after) the end of treatment

Here goes …

Page 17: When Is a Program Ready for  Rigorous Impact Analysis?

Forms of Logic Model Failures: 1-3

1. Acquire Resources: Form partnerships, acquire and retain staff (with target qualifications)
• Salem ERA: Very high staff turnover

2. Recruit Cases: Fill the program
• Portland ERA: Recruited only a third of target enrollees

3. Sustain Participation
• Rural WTW Strategies Evaluation; SC Moving Up ERA; Cleveland Achieve ERA

Page 18: When Is a Program Ready for  Rigorous Impact Analysis?

Forms of Logic Model Failures: 4-5

4. Implement with Fidelity
• Mathematica Supplemental Reading Evaluation; Abt Mentoring Evaluation

5. Pre/Post Progress
• MDRC NEWWS HCD program’s academic testing (but see GED)

Page 22: When Is a Program Ready for  Rigorous Impact Analysis?

Inducing “Truth Telling”

Currently, program developers have an incentive to over-promise
– More likely to be funded
– But, underpowered Impact Evaluations and null results

Process Evaluation tollgate gives an incentive to under-promise
– More likely to pass the Process Evaluation tollgate, but
– Less likely to fund Pilot
– And, less likely to fund Impact Evaluation

And if developing a Falsifiable Logic Model forces program developers to more thoroughly and realistically think through their program models, that’s good too!
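
The link between over-promising and underpowered Impact Evaluations can be illustrated with a standard two-arm sample-size approximation. The sketch below is not from the slides: the function name `n_per_arm`, the effect sizes (0.40 promised versus 0.20 delivered, in standard-deviation units), and the 5% two-sided test with 80% power are all assumed values chosen for illustration.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(effect_sd, alpha=0.05, power=0.80):
    """Approximate sample size per arm for a two-arm trial with equal arms,
    a continuous outcome, and a two-sided z-test.

    effect_sd is the expected impact in standard-deviation units.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = z.inv_cdf(power)          # quantile for the desired power
    return ceil(2 * (z_alpha + z_power) ** 2 / effect_sd ** 2)


promised = 0.40   # hypothetical effect size the developer promises
realistic = 0.20  # hypothetical effect size the program can deliver

n_planned = n_per_arm(promised)   # trial sized for the promised effect
n_needed = n_per_arm(realistic)   # trial size the realistic effect requires
print(n_planned, n_needed)
```

Because the required sample scales with the inverse square of the effect size, a trial sized for the promised effect has roughly a quarter of the sample it would need if the true effect is half as large, which is one route to the null results mentioned above.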

Page 24: When Is a Program Ready for  Rigorous Impact Analysis?

Key Innovation: Separate Contracts

For Program Operator
– Otherwise, an implicit expectation of proceeding
– E.g., ED i3, CNCS SIF, Orszag (2009)

For Evaluator
– And probably different contractors
– Otherwise, an implicit expectation of proceeding
– And contractual considerations lean towards doing so

Current practice often runs Process Evaluation simultaneously with Impact Evaluation

Page 26: When Is a Program Ready for  Rigorous Impact Analysis?

Approach Seems Infeasible: Timeline

Evaluation timelines are already long

Inconsistent with
– Pressing problems
– Short-term attention to (and funding for) specific problems

This approach would make evaluation timelines much longer
– Additional piloting
– Additional contracting between the steps

Page 27: When Is a Program Ready for  Rigorous Impact Analysis?

Approach Seems Infeasible: Willingness

Implicit Assumption: Programs are willing to subject themselves to:
– Long and burdensome evaluation
– Possibility (likelihood) of failure

Plausible if:
– Program’s goal is broad-scale rollout
– Rigorous evaluation is the only way to get there
– Programs are confident of “passing”

Some positive examples (Nurse-Family Partnership; Teen Pregnancy Prevention Program; Orszag, 2009)

But they are the exception, rather than the rule

Page 29: When Is a Program Ready for  Rigorous Impact Analysis?

When Is a Program Ready for Rigorous Impact Analysis?

When Is a Program Ready for Rigorous Impact Evaluation?

Diana Epstein (CAP/Center for American Progress) and Jacob Alex Klerman (Abt Associates)

APPAM/HSE Conference “Improving the Quality of Public Services”, Moscow, June 2011

Page 30: When Is a Program Ready for  Rigorous Impact Analysis?

The Need for Impact Evaluation

• Many apparently plausible programs “don’t work”

• So, require an Impact Evaluation “tollgate”
– Usually random assignment
– Saving money

[Diagram: Program Idea → Random Assignment Trial → Broad Rollout]

Page 31: When Is a Program Ready for  Rigorous Impact Analysis?

Efficacy Trial/Replication/Effectiveness Trial

• Some programs that work in one site don’t work in other sites

• So:
– Efficacy Evaluation (small trial, ideal conditions)
– Replicate to other (and more) sites
– Effectiveness trial at the replicated sites (larger trial, real-world conditions)

[Diagram: Program Idea → Efficacy Trial → Replication → Effectiveness Trial → Broad Rollout]

Page 33: When Is a Program Ready for  Rigorous Impact Analysis?

This Is Hardly New

• It’s the “New Orthodoxy”
– Coalition for Effective Policy
– OMB (2009)

• And, we think that’s a problem

[Diagram: Program Idea → Efficacy Trial → Replication → Effectiveness Trial → Broad Rollout]

Page 35: When Is a Program Ready for  Rigorous Impact Analysis?

Random Assignment Has Lots of Problems

• Random assignment fits Winston Churchill’s description of “democracy”
– “The worst form of government [evaluation], except for all the others that have been tried from time to time.”

• Random assignment
– Is expensive
– Has long timelines
– Subjects people to programs that don’t work

• Can we do better?
– Avoid evaluating programs with no impact
– Improve programs so that they will have impact

[Diagram: Program Idea → Efficacy Trial → Replication → Effectiveness Trial → Broad Rollout]

Page 37: When Is a Program Ready for  Rigorous Impact Analysis?

Thus, Tollgate Is Implementable

1. Acquire Resources
2. Recruit Cases
3. Sustain Participation
4. Implement with Fidelity
5. Pre/Post Progress

• Falsifiable and specifiable in Logic Model
• Measured in Treatment Group only
• No expensive follow-up survey needed
• Occurs during or shortly after program activities

… with a Pilot Implementation and a Process Evaluation

Page 39: When Is a Program Ready for  Rigorous Impact Analysis?

Logic Model Definition

“The program logic model is defined as a picture of how your organization does its work – the theory and assumptions underlying the program. A program logic model links outcomes (both short- and long-term) with program activities/processes and the theoretical assumptions/principles of the program.”

Source: W.K. Kellogg Foundation Logic Model Guide http://www.wkkf.org/~/media/6E35F79692704AA0ADCC8C3017200208.ashx

Page 40: When Is a Program Ready for  Rigorous Impact Analysis?

Logic Model: Your Planned Work …

YOUR PLANNED WORK describes what resources you think you need to implement your program and what you intend to do.
– 1. Resources include the human, financial, organizational, and community resources a program has available to direct toward doing the work. Sometimes this component is referred to as Inputs.
– 2. Program Activities are what the program does with the resources. Activities are the processes, tools, events, technology, and actions that are an intentional part of the program implementation. These interventions are used to bring about the intended program changes or results.

YOUR INTENDED RESULTS include all of the program’s desired results (outputs, outcomes, and impact).

Page 41: When Is a Program Ready for  Rigorous Impact Analysis?

Logic Model: Your Intended Results …

YOUR PLANNED WORK describes what resources you think you need to implement your program and what you intend to do.

YOUR INTENDED RESULTS include all of the program’s desired results (outputs, outcomes, and impact).
– 3. Outputs are the direct products of program activities and may include types, levels and targets of services to be delivered by the program.
– 4. Outcomes are the specific changes in program participants’ behavior, knowledge, skills, status and level of functioning. Short-term outcomes should be attainable within 1 to 3 years, while longer-term outcomes should be achievable within a 4 to 6 year timeframe.
– 5. Impact is the fundamental intended or unintended change occurring in organizations, communities or systems as a result of program activities within 7 to 10 years.

Page 42: When Is a Program Ready for  Rigorous Impact Analysis?

If you don’t know where you’re going, how are you gonna’ know when you get there?

Yogi Berra, New York Yankees Player and Manager, 1925-

Happy families are all alike; every unhappy family is unhappy in its own way.

Anna Karenina, Chapter 1, first line
Leo Tolstoy, Russian mystic & novelist (1828 – 1910)

Page 43: When Is a Program Ready for  Rigorous Impact Analysis?

Paper Status

Paper is nights-and-weekends work
– In reaction to evaluation experience, both positive and negative

Has been presented and read internally (Abt JASG) and externally (N. Campbell/ACF, Burt Barnow, Demetra Nightingale; we hope soon B. Kelly/ACF)
– Probably going to try to present at ACF

Your comments (on presentation, on paper, and on ideas) much appreciated
– In particular, more and better examples

(We hope) to a journal “soon”

Page 44: When Is a Program Ready for  Rigorous Impact Analysis?

Goal: Effective Programs

[Diagram: Program Idea → Efficacy Evaluation → Replication → Effectiveness Evaluation → Broad-scale Rollout]

Page 45: When Is a Program Ready for  Rigorous Impact Analysis?

Question: How to Get There?

[Diagram: Program Idea, Efficacy Evaluation, Replication, Effectiveness Evaluation, Broad-scale Rollout, with a “?” on the path]

Page 46: When Is a Program Ready for  Rigorous Impact Analysis?

Random Assignment is Necessary

Most rigorously evaluated programs “fail”
– Even programs that pass initial efficacy trial often “fail” follow-on effectiveness trial

And the more rigorous the evaluation, the more likely is “failure”

=> Evaluate before roll-out
– Otherwise implement ineffective programs

Page 47: When Is a Program Ready for  Rigorous Impact Analysis?

Suggests a Random Assignment “Tollgate”

[Diagram: Program Idea → Efficacy Evaluation → Replication → Effectiveness Evaluation → Broad-scale Rollout]

Page 49: When Is a Program Ready for  Rigorous Impact Analysis?

Random Assignment is Necessary, but Expensive

• In dollars

• In calendar time

• In the lives of clients/participants who waste time in programs that don’t work

=> Don’t evaluate programs that will fail. Duh!

• Most rigorously evaluated programs “fail”
– Even programs that pass initial efficacy trial often “fail” follow-on effectiveness trial

• And the more rigorous the evaluation, the more likely is “failure”

=> Evaluate before roll-out
– Otherwise implement ineffective programs

Page 51: When Is a Program Ready for  Rigorous Impact Analysis?

Hold a Program to Its Own Logic Model

It seems reasonable to require a Logic Model as a condition of funding
– If a program can’t describe its “planned work” and its “intended results”, it is not ready for (implementation) funding
– And that Logic Model should be detailed and falsifiable: “When?” and “How many?/What fraction?” (though this is unconventional)

According to its own Logic Model, a program that cannot “satisfy” the explicit falsifiable goals will not succeed

=> Don’t proceed to rigorous impact evaluation (i.e., random assignment) until you have checked the Logic Model in a pilot
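
One way to read “detailed and falsifiable” is that each benchmark names a measure, a target (“How many?/What fraction?”), and a deadline (“When?”), so that a pilot can mechanically pass or fail it. The Python sketch below is illustrative only: the `Benchmark` class, the measure names, and every number are hypothetical, loosely patterned on the forms of Logic Model failure listed earlier, and are not taken from any of the evaluations cited.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Benchmark:
    step: str            # logic-model step the benchmark belongs to
    measure: str         # what is measured in the treatment group
    target: float        # "How many?/What fraction?" (minimum acceptable)
    deadline_month: int  # "When?" (months after pilot start)


def failed_benchmarks(model, observed):
    """Return the benchmarks whose target was not met.

    observed maps measure -> value recorded at that benchmark's deadline;
    a measure with no recorded value counts as a failure.
    """
    return [b for b in model
            if observed.get(b.measure, float("-inf")) < b.target]


# Hypothetical Falsifiable Logic Model, written before the pilot starts.
model = [
    Benchmark("Acquire Resources", "staff_retention_rate", 0.80, 6),
    Benchmark("Recruit Cases", "enrolled_share_of_target", 0.90, 6),
    Benchmark("Sustain Participation", "completion_rate", 0.70, 12),
    Benchmark("Implement with Fidelity", "sessions_as_designed", 0.85, 12),
    Benchmark("Pre/Post Progress", "share_with_score_gain", 0.50, 12),
]

# Hypothetical pilot results, measured in the treatment group only.
pilot = {
    "staff_retention_rate": 0.85,
    "enrolled_share_of_target": 0.33,  # only a third of target enrolled
    "completion_rate": 0.75,
    "sessions_as_designed": 0.90,
    "share_with_score_gain": 0.55,
}

failures = failed_benchmarks(model, pilot)
ready_for_impact_evaluation = not failures
for b in failures:
    print(f"{b.step}: {b.measure} below {b.target} at month {b.deadline_month}")
```

Nothing in the check needs a control group or a follow-up survey; every input is something a pilot can record during or shortly after treatment.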

Page 52: When Is a Program Ready for  Rigorous Impact Analysis?

… And this Idea Is Not Vacuous

Only the last step (“Impact”) requires a control/comparison group

The first three (and often the fourth) steps occur during the program, so no need for expensive follow-up surveys

i.e., a conventional Process Evaluation

[Comment from Diana Epstein: I think this slide needs a different title. “And this idea is not vacuous” doesn’t really get the point across. How about “And this idea has inherent merit.”]

Page 53: When Is a Program Ready for  Rigorous Impact Analysis?

Possibility of Program Refinement

Programs that fail don’t (necessarily) need to be discarded
– Program refinement, sometimes called “Formative Evaluation”, is an option

Two caveats
– What is simply “moving the goalposts”? Consider whether we are interested in the program with more modest expectations
– When and how many times to refine? Not sure; refinement comes at the expense of developing other (potentially) beneficial programs

Page 54: When Is a Program Ready for  Rigorous Impact Analysis?

Recall the Earlier Program Development Model …

[Diagram: Program Idea → Efficacy Evaluation → Replication → Effectiveness Evaluation → Broad-scale Rollout]

Page 55: When Is a Program Ready for  Rigorous Impact Analysis?

Suggest Adding a Process Evaluation “Tollgate”