High Quality Program Evaluation in Nonprofit Organizations
February 2015 DCPNI – Isaac Castillo - @Isaac_outcomes 1
Isaac D. Castillo
Deputy Director
DC Promise Neighborhood Initiative
@Isaac_outcomes
February 19, 2015
Why Bother With All of This?
Ultimately, you should be measuring outcomes or effectiveness for a single reason:
To better serve your clients / population.
Learning Objectives
• Understand methods to determine if your program is really leading to positive change (as opposed to change happening due to chance)
• Learn best practices in using and analyzing surveys and how to avoid common mistakes
• Identify best ways to balance costs and quality when doing program evaluation
Outputs vs Outcomes
• Output measures assess what you do and who you serve. Examples include:
– Served 100 youth during summer camp
– Provided 2,250 hours of tutoring during the academic year
– 9 out of 10 youth attended at least 75% of available art instruction classes offered
• Outcome measures assess changes in your target population. Examples include:
– 75% of youth increased their knowledge of local history during the summer camp
– 50% of youth increased math grades by one grade level during the academic year
– 25% fewer youth reported being involved in bullying over the last year
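The distinction can be sketched with a few lines of Python. This is only an illustration: the records, the field names (`sessions_attended`, `pre_score`, `post_score`) and the 10-session schedule are hypothetical, not data from a real program.

```python
# Hypothetical participant records; every name and number is illustrative.
participants = [
    {"id": "A", "sessions_attended": 9, "pre_score": 4, "post_score": 7},
    {"id": "B", "sessions_attended": 8, "pre_score": 6, "post_score": 6},
    {"id": "C", "sessions_attended": 3, "pre_score": 5, "post_score": 8},
    {"id": "D", "sessions_attended": 10, "pre_score": 7, "post_score": 7},
]
SESSIONS_OFFERED = 10  # assumed number of sessions on the schedule

# Output measure: share of youth who attended at least 75% of sessions.
attended_enough = [
    p for p in participants
    if p["sessions_attended"] / SESSIONS_OFFERED >= 0.75
]
output_pct = 100 * len(attended_enough) / len(participants)

# Outcome measure: share of youth whose score went up from pre to post.
improved = [p for p in participants if p["post_score"] > p["pre_score"]]
outcome_pct = 100 * len(improved) / len(participants)

print(f"Output: {output_pct:.0f}% attended at least 75% of sessions")
print(f"Outcome: {outcome_pct:.0f}% increased their score")
```

In this made-up data, participation looks strong while only half of the youth improved, which is why the two kinds of measure are reported separately.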
Outputs
• Outputs DO:
– Tell you whether your program was implemented well. For example, they indicate whether a program:
   • delivered the intended number of sessions
   • reached its intended population
   • resulted in adequate participation levels
• Outputs DO NOT:
– Tell you if participants benefited from your program
– Serve as indicators of program success or effectiveness
Outcomes
• Outcomes DO:
– Tell you if participants benefited from your program
– Serve as indicators of program success or effectiveness
• Outcomes DO NOT:
– Tell you whether your program was implemented well (or provide clues about how your program improved participant outcomes)
What is Program Evaluation?
• Process to determine if your program / intervention / approach is effective.
• Need to define what is ‘success’ for your program first.
• Program evaluation does NOT need to be done by specialists or outsiders – but those people do add credibility and rigor (in most cases)
The Basics of Program Evaluation – An Example
• The concept of dieting – if you understand dieting, you understand the basics of program evaluation.
• What is the goal of dieting (how do you define dieting ‘success’)?
• How do you know if your diet ‘works’?
Data and Dieting
Person weighs 200 Pounds (90 Kilograms)
• Does that data point alone tell us anything?
• Context Matters – what if person is 4 feet tall and 10 years old?
• Timing Matters – is this at beginning, end, or middle of diet?
Could Be About More Than Weight
• Other things that could be measured:
– Body Mass Index (BMI)
– Physical fitness
– Blood measures (cholesterol levels)
– Own perceptions of health / feeling
– Appearance / muscle tone
Outcomes vs. Impact
• “Impact” gets used loosely. It has a precise meaning in the evaluation world: impact = the difference between the outcomes of program participants and those of a comparison group (usually established through a randomized controlled trial, or RCT).
• “Outcomes” focus on changes in your own participants and help determine the effectiveness of your program.
• Be aware of the differences in terms and who your audience is.
How Can Nonprofits Measure Change?
• Easiest thing to do is to measure before and after for your participants.
• Can also compare to other groups.
• How and what you measure is just as important.
Traditional (Time Series)
• Most common type of program evaluation.
• Looking to see if things have changed over time.
• What was situation before program, then what was situation after program.
• Must measure same things, in same ways, at both points in time.
[Diagram: Before Program → Program Delivered → After Program]
Comparison Group
• A time series study that compares to another group (that does not receive programming).
• More rigorous, but more challenging
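[Diagram: two groups measured at the same two points in time]
Program group: Before Program → Program Delivered → After Program
Comparison group: Before Program → No (or minimal) programming → After Program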
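A rough sketch of the arithmetic behind a comparison group design follows. The scores are hypothetical, and subtracting the two average changes is one simple way to compare the groups, not necessarily the analysis the presenter has in mind.

```python
from statistics import mean

# Hypothetical (pre, post) scores for matched individuals; illustrative only.
program_group = [(10, 16), (12, 15), (8, 14)]
comparison_group = [(11, 13), (9, 10), (10, 13)]


def average_change(scores):
    """Mean of (post - pre) across individuals in one group."""
    return mean(post - pre for pre, post in scores)


program_change = average_change(program_group)
comparison_change = average_change(comparison_group)

# Change in the program group beyond what the comparison group
# experienced with no (or minimal) programming.
estimated_difference = program_change - comparison_change

print(program_change, comparison_change, estimated_difference)
```

A time series study on its own would report only `program_change`; the comparison group shows how much of that change might have happened anyway.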
Who / What Will You Evaluate?
• Need to define the population that will be evaluated.
• Need to define ‘success measures’ (outcomes) – what are you trying to achieve?
• Once these questions are answered, then need to consider which participants will be part of the evaluation (and maybe who gets programming).
In Time Series, This is Simple
• Usually just serve and evaluate those that enroll in the program:
• First come, first served is what is frequently used if there are too many potential participants.
[Diagram: Self-selection → Before Program → Program Delivered → After Program]
Comparison Groups Are More Complicated
• Can select by randomizing participants into groups:
[Diagram: Random Selection assigns participants to one of two groups]
Program group: Before Program → Program Delivered → After Program
Comparison group: Before Program → No (or minimal) programming → After Program
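Random assignment itself takes very little code. The sketch below assumes you already have a list of unique participant identifiers; the function name `randomize_into_groups`, the ID format, and the seed value are all hypothetical choices for illustration.

```python
import random


def randomize_into_groups(participant_ids, seed=None):
    """Shuffle participants and split them into two groups.

    Returns (program_ids, comparison_ids). With an odd number of
    participants, the comparison group gets the extra person.
    """
    rng = random.Random(seed)  # a fixed seed makes the assignment repeatable
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical identifiers P01 through P10.
ids = [f"P{n:02d}" for n in range(1, 11)]
program_ids, comparison_ids = randomize_into_groups(ids, seed=2015)

print("Program group:", sorted(program_ids))
print("Comparison group:", sorted(comparison_ids))
```

Recording the seed alongside the results lets someone else reproduce exactly who was assigned to which group.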
Compare across high/low dosage
• Can use self-selection:
[Diagram: Self-selection sorts participants into two groups]
High Attendance: Before Program → Program Delivered → After Program
Low Attendance: Before Program → No (or minimal) programming → After Program
But How Do You Measure Change?
• Most common ways:
– Use data that someone else has collected (report cards, health status, etc.)
– Pre/post-tests or surveys – at least two points in time.
– Focus groups or interviews.
• Can (and should) combine these.
How can you quickly analyze data?
• You can do a lot in Excel.
• Think about assumptions and questions ahead of time.
• Think about your analysis before your program starts.
• Open-ended and text responses are time-consuming to analyze…
• But you can put numbers to a lot of things.
Importance of identified data
• Try to avoid use of anonymous or grouped data.
• Ideally, you would be able to match (and track) data at individual level.
• That means you need names or unique identifiers.
• Your analysis would then focus on those that have data at multiple points in time, where that data can be matched to the same individual.
• Different from whole group analysis (compare whole group at point 1 to whole group at point 2 – even though there are different people in groups).
Group vs. Individual Analysis
Participant      Pre-Test Score   Post-Test Score   Difference
Participant 1    10               No post-test      ??
Participant 2    20               No post-test      ??
Participant 3    10               10                0
Participant 4    20               20                0
Participant 5    No pre-test      20                ??
Participant 6    No pre-test      30                ??
Average:         15               20                +5
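The table's point can be reproduced in a few lines. The scores below are copied from the table; representing a missing test as `None` and the variable names are assumptions made for this sketch.

```python
from statistics import mean

# (pre, post) scores from the table; None marks a missing test.
scores = {
    "Participant 1": (10, None),
    "Participant 2": (20, None),
    "Participant 3": (10, 10),
    "Participant 4": (20, 20),
    "Participant 5": (None, 20),
    "Participant 6": (None, 30),
}

# Whole-group analysis: average everyone who took each test,
# even though different people are in each average.
pre_scores = [pre for pre, _ in scores.values() if pre is not None]
post_scores = [post for _, post in scores.values() if post is not None]
group_change = mean(post_scores) - mean(pre_scores)

# Individual analysis: keep only people with both scores, then average
# each person's own change.
matched = {
    name: (pre, post)
    for name, (pre, post) in scores.items()
    if pre is not None and post is not None
}
individual_change = mean(post - pre for pre, post in matched.values())

print(f"Whole-group change: {group_change:+}")
print(f"Matched individuals: {len(matched)}, average change: {individual_change:+}")
```

The whole-group comparison suggests a 5-point gain, while the two participants who can actually be matched did not change at all.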
Are There Some Things That Can’t be Measured?
• The key is properly defining what success looks like.
• Large and fuzzy concepts ARE difficult to measure.
• But their component parts can usually be measured.
• Let’s start with an example….
Your engagement in workshop
Category    Numerical Value   Description
Poor        1                 Openly not paying attention to presentation. Not in room, or on unrelated internet sites (Facebook). Are you playing Candy Crush now? If so, you are in this category.
Fair        2                 Not taking notes, but at least listening. Asking questions or making comments that are distracting or do not contribute positively to learning.
Good        3                 Taking notes and listening actively to content, but not participating in any other way (no questions, no comments)
Excellent   4                 Active listening / note-taking and asking questions. Questions push discussion in positive ways.
How can you analyze this?
[Chart: Graphing One Person’s Engagement; vertical axis 0 to 5, horizontal axis 0 to 9]
Does adding a line help?
[Chart: Graphing One Person’s Engagement, with a line added; vertical axis 0 to 5, horizontal axis 0 to 9]
What about a Trendline?
[Chart: Graphing One Person’s Engagement, with a trendline added; vertical axis 0 to 5, horizontal axis 0 to 9]
Comparing multiple points in time
• Easy to compare changes between two points in time (pre/post), but what if you have multiple data points?
• If you have data at four points in time, do you only compare first and last? What do you do with middle two points?
• What about 20 points of data? Still first and last, or do you want something that more accurately captures what happens over the entire time (like regressions / trendlines)?
• Is using the first/last data point even the best thing to do (will they be the most accurate)?
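To make the first/last versus trendline question concrete, here is a small sketch. The eight engagement scores (on the 1 to 4 scale) are hypothetical, and an ordinary least-squares line is used as one common way to fit a trendline; a spreadsheet's built-in trendline would serve the same purpose.

```python
# Hypothetical engagement scores (1 = Poor ... 4 = Excellent) at eight points.
scores = [2, 1, 2, 3, 2, 3, 4, 3]
times = list(range(1, len(scores) + 1))

# Option 1: compare only the first and last data points.
first_last_change = scores[-1] - scores[0]


def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Option 2: use every data point by fitting a trendline.
slope = least_squares_slope(times, scores)
trend_change = slope * (times[-1] - times[0])

print(f"First vs. last: {first_last_change:+}")
print(f"Trendline: {slope:.2f} per time point, {trend_change:+.1f} overall")
```

With these made-up scores the trendline suggests roughly twice as much growth as the first/last comparison, because the first point happened to be higher than the ones right after it.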
Avoiding common mistakes
• Collecting different data in different ways over time (post test is different from pre test).
• Should you even be giving pre-tests? (Retrospective post-then-pre-tests and normalization of skill over time).
• Are there things that shouldn’t be self-reported (too much bias)?
• Is a very complex outcome oversimplified?
How Detailed or Rigorous Does the Evaluation Need to Be?
• What do you want to do with the results?
– Prove to yourself the program works?
– Use the results to market/fundraise?
– Publish the results through your own materials?
– Publish the results in peer-reviewed journals?
• How ‘certain’ do you want to be about the results?
– Are you fine with some doubt?
– Will you be comfortable answering concerns and criticisms?
• Are you willing to live with negative results?
Costs and Rigor
• The more you want to do with the results, the more you need to spend on evaluation.
• Approximately 3% of organization’s budget should be spent on evaluation activities.
• Can grow capacity over time – start small.
• Very little cost to do simple data collection – don’t overcomplicate at the beginning.
Some More Complex Data Questions
• Let’s try to delve into some deeper questions.
– What does your target population look like, and is it different from what you anticipated?
– Do you have a way to know if participants are re-enrolling in programming?
– How do you define a program participant? And what does it take to get a person ‘enrolled’?
Assessing your service population
• Basic demographics are an easy place to start
• Can you include other characteristics to measure need/risk levels of populations?
– Income level (or proxies)
– Education level (or proxies)
– Other characteristics that are important
• Where do you get this data?
– Administrative sources (someone else collects)
– Screening tools (you collect)
How do you know if you have ‘repeat customers’?
• Does your data system have unique identifiers for participants?
• Does your data system have a way to track multiple enrollments in the same program at different periods of time?
• Does your program have distinct end points (and criteria for exit) and are those trackable?
• Big question: Are repeat customers a good thing? In some instances, participants repeating a program could actually be a negative outcome.
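If your data system does store unique identifiers and enrollment periods, finding repeat customers is a short counting exercise. The enrollment log below, its three-field layout, and the term labels are hypothetical.

```python
from collections import Counter

# Hypothetical enrollment log: (participant_id, program, term).
enrollments = [
    ("P01", "tutoring", "2014-fall"),
    ("P02", "tutoring", "2014-fall"),
    ("P01", "tutoring", "2015-spring"),
    ("P03", "mentoring", "2015-spring"),
    ("P02", "mentoring", "2015-spring"),
]

# Count distinct terms for each (participant, program) pair.
# set() drops exact duplicate rows so a double entry is not a "repeat".
terms_per_pair = Counter()
for participant_id, program, _term in set(enrollments):
    terms_per_pair[(participant_id, program)] += 1

# A repeat customer enrolled in the same program in more than one term.
repeat_customers = sorted(
    pair for pair, n_terms in terms_per_pair.items() if n_terms > 1
)

print(repeat_customers)
```

Here P01 repeated tutoring, while P02 moved from tutoring to mentoring and is not counted as a repeat of either program. Whether that definition fits depends on how your program defines exit and re-entry.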
At What Point is Someone “In” Your Program?
• Do you have defined criteria as to when a participant is officially enrolled in your program?
– Is it when they fill out an intake form? Or when they have signed a consent form?
– Or is it when they attend their first (or second) event?
• What is the process to make this happen?
– What paperwork or other things do prospective participants have to go through?
– Do you have any idea of how long this process takes (with data – not just guesses)?
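Answering the "how long does this take" question with data can be as simple as subtracting two dates per participant. In the sketch below, the dates are hypothetical, and measuring from intake form to first event attended is just one possible definition of enrollment.

```python
from datetime import date
from statistics import median

# Hypothetical dates per participant: (intake form received, first event attended).
enrollment_dates = {
    "P01": (date(2015, 1, 5), date(2015, 1, 12)),
    "P02": (date(2015, 1, 6), date(2015, 2, 3)),
    "P03": (date(2015, 1, 20), date(2015, 1, 30)),
}

# Days between intake and first event for each participant.
days_to_enroll = {
    pid: (first_event - intake).days
    for pid, (intake, first_event) in enrollment_dates.items()
}

# The median is less distorted than the mean by one unusually slow case.
median_days = median(days_to_enroll.values())

print(days_to_enroll)
print(f"Median days from intake to first event: {median_days}")
```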
How Can You Use This Data?
• Are you serving the ‘right’ population?
• Are your participants getting ‘enough’ service to obtain outcomes?
• Should you change who you serve?
• Should you change what you do?
Fictional Mentoring Program – Dosage by Quartile
Target: 50 hours of mentoring per year
30 hours
15 hours (also median)
6 hours
5% hit the 50 hour target
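A dosage-by-quartile summary like this one can be produced from a plain list of hours per participant. The hours below are hypothetical and are not the data behind the slide. The sketch uses `statistics.quantiles` (Python 3.8 or later) with the "inclusive" method, which is one of several reasonable ways to compute quartiles.

```python
from statistics import quantiles

TARGET_HOURS = 50  # annual mentoring target from the slide

# Hypothetical hours of mentoring received by 12 participants.
hours = [2, 4, 6, 8, 10, 14, 16, 20, 28, 32, 40, 55]

# Quartile cut points: 25th percentile, median, 75th percentile.
q1, q2, q3 = quantiles(hours, n=4, method="inclusive")

# Share of participants who reached the target.
share_hitting_target = sum(h >= TARGET_HOURS for h in hours) / len(hours)

print(f"25th percentile: {q1} hours")
print(f"Median: {q2} hours")
print(f"75th percentile: {q3} hours")
print(f"Hit the {TARGET_HOURS}-hour target: {share_hitting_target:.0%}")
```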
What if we combine dosage and outcomes?
Target: 50 hours of mentoring per year
‘Best outcomes’ – avg of 48 hours
‘Moderate outcomes’ – avg of 12 hours
‘No change’ – avg of 6 hours
‘Negative change’ – avg of 2 hours
How does this help redefine targets?
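Combining dosage and outcomes amounts to grouping participants by outcome category and averaging their hours. The individual records below are hypothetical; they were chosen so that the group averages match the fictional figures on the slide.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (hours of mentoring, outcome category).
records = [
    (50, "best"), (46, "best"),
    (14, "moderate"), (10, "moderate"),
    (7, "no change"), (5, "no change"),
    (3, "negative"), (1, "negative"),
]

# Group hours by outcome category.
hours_by_outcome = defaultdict(list)
for hours, outcome in records:
    hours_by_outcome[outcome].append(hours)

# Average dosage within each outcome category.
avg_hours = {outcome: mean(hrs) for outcome, hrs in hours_by_outcome.items()}

for outcome, avg in avg_hours.items():
    print(f"{outcome}: avg of {avg} hours")
```

If the group with the best outcomes averages close to the target while the other groups sit far below it, that supports keeping the target; if strong outcomes show up at much lower dosage, the target may be set too high.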
We could add in population factors...
Outcome level                 Females    Males      Transgender
Strong positive outcomes      36 days    52 days    56 days
Moderate positive outcomes    14 days    12 days    10 days
No or negative outcomes       2 days     4 days     4 days
Average number of program days attended by each subpopulation
…or risk factors
Outcome level                 Very Low Income   Low Income   Moderate Income
Strong positive outcomes      36 days           52 days      56 days
Moderate positive outcomes    14 days           12 days      10 days
No or negative outcomes       2 days            4 days       4 days
Average number of program days attended by each subpopulation
And then you can delve into cells to ask and answer questions.
• It took moderate income participants far more days than very low income participants to see effects. Why?
• Does this mean we should exclude moderate income (and low income) participants?
• Should we change our dosage target?
• Should we change how we define and measure the outcomes?