Predicting Completed Interviews in a National Panel Survey
Travis Pape & James Lawrence
U.S. Census Bureau
American Association for Public Opinion Research
Annual Conference, May 14 – 17, 2015
Hollywood, FL
Disclaimer: This presentation is released to inform interested parties of research and to encourage discussion. The views expressed are those of the authors and not necessarily those of the Census Bureau.
Motivation for Paradata Metrics
Improve survey performance to meet strategic survey objectives
• Cost, Time and Quality
• Productivity
• Efficiency
• Level of Effort
• Data Completeness
• Balanced Respondent Sample
Background on Propensity Models
• Propensity modeling aims at generating probability scores which reflect the likelihood of a particular outcome on a given contact attempt.
• Propensity modeling research has been ongoing for some time.
• Useful in nonresponse adjustment and in data collection, particularly in CAPI surveys.
• With declining response rates and growing interest in responsive designs, interest is increasing.
• Census and other survey organizations have been developing and researching propensity models.
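The slides do not state the functional form of the model. Assuming a logistic regression, a common choice for response propensity models, the score for case i on contact attempt t could be written as follows. The symbols are illustrative and do not come from the slides.

```latex
\[
p_{it} \;=\; \Pr(y_{it} = 1 \mid x_{it})
       \;=\; \frac{1}{1 + \exp\!\left(-\left(\beta_0 + \sum_{k} \beta_k x_{itk}\right)\right)}
\]
```

Here y_{it} = 1 denotes the outcome of interest (for example, a completed interview), x_{itk} is the k-th predictor for the case at that attempt, and the betas are the fitted coefficients.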
Census Research
• Erdman et al. have developed several propensity models for seven major demographic household surveys
• In 2013, a model was developed for the American Housing Survey
• High/medium/low (H/M/L) response propensities were delivered via the Unified Tracking System (UTS)
• No guidelines were provided on how to use the scores
• The L.A. Regional Office (RO) used the scores toward the end of data collection
Objectives of Propensity Research
• Understand the statistical properties of the propensity model and its scores
  • Propensity scores over time
  • Actual vs. predicted outcomes
  • Model stability
  • Case subgroups based on propensity scores
• Identify the survey management implications
  • Establish guidelines for field supervisory staff
  • Recommend actions to manage cases with propensities
• Clearly define reports/tools to assist field staff
National Crime Victimization Survey (NCVS)
• Conducted by the Census Bureau for the Bureau of Justice Statistics
• Nation’s primary source of criminal victimization information
• Rotating panel design (sample divided into 6 panels interviewed at 6-month intervals)
• Nationally representative sample: ~10,500 cases per month
• Selected households remain in sample for 3 years
• Includes group quarters (GQs) in sample
Predicting Completed Interview on Next Contact Attempt
Variables Evaluated for Inclusion in the Model
Known characteristics of case before data collection
• Field Strata: binary, 1 = most difficult strata; else, easier strata combined
• Time in sample: # of survey periods case has been in sample
• Days until closeout: # of days left in the current data collection period
• Census mail return: block group percentage of 2010 mailed return
Housing unit characteristics discovered during data collection
• Access barrier: unable to reach / locked gate / buzzer entry
• Language barrier: unable to conduct at time of contact
• Vacation: on vacation, away from home / at second home
Variables Evaluated for Inclusion in the Model (cont.)
Concerns:
• Time constraints on an earlier attempt in the current data collection period: CHI concerns
• Privacy issues on an earlier attempt in the current data collection period: CHI concerns
• Firm reluctance on an earlier attempt in the current data collection period: CHI concerns
• Prior interview issues (interview concerns) on an earlier attempt in the current data collection period: CHI concerns; longitudinal survey in which R voices concern about past survey periods
Contact Attempt Outcomes:
• Contact with eligible respondent
• Negative reluctance
• Questions about survey
• Number of telephone contacts
Strategies:
• Phone message left on answering machine
• Notice of Visit or promotional packet left
• Appointment was scheduled in prior contact
• Peak hours: current attempt is made during peak hours (M-F: 3pm – 9pm, Sat: 9am – 6pm, Sun: 12pm – 6pm)
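As an illustration of how the peak-hours indicator could be derived from a contact attempt's timestamp, here is a minimal Python sketch. The function name and the choice to treat the end of each window as exclusive are assumptions; only the windows themselves come from the slide.

```python
from datetime import datetime, time

# Peak-hour windows from the slide. Treating the end of each window as
# exclusive (e.g. 9:00pm itself is not peak) is an assumption.
WEEKDAY_WINDOW = (time(15, 0), time(21, 0))   # M-F: 3pm - 9pm
SATURDAY_WINDOW = (time(9, 0), time(18, 0))   # Sat: 9am - 6pm
SUNDAY_WINDOW = (time(12, 0), time(18, 0))    # Sun: 12pm - 6pm


def is_peak_hours(attempt: datetime) -> bool:
    """Return True if the contact attempt falls inside a peak-hour window."""
    day = attempt.weekday()  # Monday = 0 ... Sunday = 6
    if day <= 4:
        start, end = WEEKDAY_WINDOW
    elif day == 5:
        start, end = SATURDAY_WINDOW
    else:
        start, end = SUNDAY_WINDOW
    return start <= attempt.time() < end
```

For example, `is_peak_hours(datetime(2014, 10, 1, 16, 0))` evaluates a Wednesday 4pm attempt, which falls inside the weekday window.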
Defining the Model
Data: March – October 2014
The model's best predictors were identified by running the groups of candidate variables for each month and selecting the statistically significant predictors to define the “final” model
The final model was fit on the March – October data and will be used moving forward
The coefficients of the predictors are re-estimated at the beginning of each month using the data from the previous month as a training set (e.g., May coefficients are based on April data)
A daily propensity score is calculated for each remaining open case
Any case that has not been worked is assigned the average propensity for that day
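The daily scoring step described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the model is taken to be logistic (the slides do not say), the day's average is taken over the worked open cases, and all function and variable names are hypothetical. The coefficients passed in would be the ones re-estimated from the previous month's data; the fitting step is not shown.

```python
import math
from typing import Dict, Optional


def logistic_score(coefs: Dict[str, float], intercept: float,
                   features: Dict[str, float]) -> float:
    """Propensity for one case under an assumed logistic model."""
    logit = intercept + sum(
        beta * features.get(name, 0.0) for name, beta in coefs.items()
    )
    return 1.0 / (1.0 + math.exp(-logit))


def daily_scores(open_cases: Dict[str, Optional[Dict[str, float]]],
                 coefs: Dict[str, float],
                 intercept: float) -> Dict[str, float]:
    """Score every open case for one day.

    A case whose features are None has not been worked yet and receives
    the average propensity of the worked cases for that day (assumption:
    the average is taken over worked open cases only).
    """
    scored = {
        case_id: logistic_score(coefs, intercept, features)
        for case_id, features in open_cases.items()
        if features is not None
    }
    if not scored:
        raise ValueError("no worked cases to average over")
    average = sum(scored.values()) / len(scored)
    return {case_id: scored.get(case_id, average) for case_id in open_cases}
```

A call such as `daily_scores({"A": {"access_barrier": 1.0}, "B": {"access_barrier": 0.0}, "C": None}, {"access_barrier": -1.165}, -1.036)` gives the unworked case "C" the mean of the scores for "A" and "B".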
NCVS Propensity Model Coefficients
                                              June      July    August September   October
Intercept                                   -1.177    -1.216    -1.110    -1.159    -1.036
Strata                                       0.047     0.051     0.051     0.050     0.039
# of attempts                                0.090     0.161     0.175     0.163     0.225
# of appointments                            0.134     0.105     0.205     0.139     0.193
Prior Contact                                0.360     0.333     0.322     0.331     0.271
Question raised on previous contact          0.454     0.370     0.367     0.300     0.233
NCVS Propensity Model Coefficients
                                              June      July    August September   October
Access Indicator                            -0.938    -1.048    -1.004    -1.104    -1.165
Days Until Closeout                         -0.078    -0.020    -0.134    -0.060    -0.093
Language Barrier                            -0.804    -0.457    -0.628    -0.430    -0.500
Vacation Indicated                          -0.981    -1.276    -1.150    -0.813    -1.429
# of phone attempts                         -0.140    -0.223    -0.240    -0.232    -0.291
# of personal visit attempts                -0.167    -0.243    -0.267    -0.218    -0.322
Prior Reluctance                            -0.477    -0.296    -0.373    -0.360    -0.378
Negative reluctance on previous contact     -0.168    -0.111    -0.304    -0.292    -0.157
Roster size on previous contact                  *         *         *         *         *
Time in Sample                                   *         *         *         *         *
Key Points:
• Model coefficients are consistent from month to month
• Negative effects are logical
• Access has the largest influence
• Strata difference has a minimal but significant effect
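One way to inspect month-to-month consistency is to look at the spread of each coefficient and, if the model is logistic (an assumption, since the slides do not state the link function), to convert coefficients to odds ratios. The sketch below uses a few rows transcribed from the tables; the helper names are hypothetical.

```python
import math
from typing import Dict, List

MONTHS = ["June", "July", "August", "September", "October"]

# A subset of rows transcribed from the coefficient tables.
COEFFICIENTS: Dict[str, List[float]] = {
    "Intercept": [-1.177, -1.216, -1.110, -1.159, -1.036],
    "Access Indicator": [-0.938, -1.048, -1.004, -1.104, -1.165],
    "Language Barrier": [-0.804, -0.457, -0.628, -0.430, -0.500],
    "Prior Contact": [0.360, 0.333, 0.322, 0.331, 0.271],
}


def spread(values: List[float]) -> float:
    """Range of a coefficient across months (max minus min)."""
    return max(values) - min(values)


def odds_ratios(values: List[float]) -> List[float]:
    """exp(beta) per month; meaningful only under the logistic assumption."""
    return [math.exp(v) for v in values]


def summarize() -> List[str]:
    """One text line per predictor: spread plus monthly odds ratios."""
    lines = []
    for name, values in COEFFICIENTS.items():
        ratios = ", ".join(f"{r:.2f}" for r in odds_ratios(values))
        lines.append(f"{name}: spread={spread(values):.3f}; odds ratios={ratios}")
    return lines


if __name__ == "__main__":
    print("\n".join(summarize()))
```

Under that assumption, an odds ratio below 1 means the predictor lowers the odds of the outcome on the next attempt, and a small spread indicates a stable estimate across months.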
Average Propensity Scores by Day of Data Collection: October
[Figure: Average Propensity (October, Open Cases). x-axis: Date, 1-Oct to 31-Oct; y-axis: Propensity Percentage, 14.5% to 19.0%.]
Average Scores: June - September
[Figure: four panels of average propensity by day, open cases; y-axis: Propensity Percentage.]
• Average Propensity (September, Open Cases): 1-Sep to 29-Sep; y-axis 14.5% to 17.5%
• Average Propensity (August, Open Cases): 1-Aug to 31-Aug; y-axis 14.5% to 18.0%
• Average Propensity (July, Open Cases): 1-Jul to 31-Jul; y-axis 12.5% to 15.5%
• Average Propensity Score (June, Open Cases): 1-Jun to 29-Jun; y-axis 13.5% to 16.0%
Validity Check: Do high propensity cases complete at a higher rate?
[Figure: October Completion Rates, Top 25% vs. Bottom 75% of cases; y-axis 0% to 45%.]
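A comparison of this kind can be sketched as follows: rank cases by propensity score, split off the top 25%, and compare observed completion rates between the two groups. This is an illustrative sketch, not the authors' procedure; the function name, the handling of ties, and the rounding of the group size are assumptions.

```python
from typing import List, Tuple


def completion_rates_by_group(cases: List[Tuple[float, bool]],
                              top_share: float = 0.25) -> Tuple[float, float]:
    """Return (completion rate of top group, completion rate of the rest).

    cases: (propensity score, completed on next attempt?) pairs.
    Ties at the cutoff are broken by sort order (an assumption).
    """
    if len(cases) < 2:
        raise ValueError("need at least two cases to form two groups")
    ranked = sorted(cases, key=lambda c: c[0], reverse=True)
    # Keep at least one case in each group.
    n_top = min(max(1, round(len(ranked) * top_share)), len(ranked) - 1)
    top, rest = ranked[:n_top], ranked[n_top:]

    def rate(group: List[Tuple[float, bool]]) -> float:
        return sum(1 for _, done in group if done) / len(group)

    return rate(top), rate(rest)
```

If the scores carry signal, the first returned rate should exceed the second, which is the pattern the validity check looks for.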
Actual Completion Rates of Top Quartile Cases by Day of Data Collection: June - September
[Figure: four panels of daily completion rates for top quartile cases.]
• September Completion Rates: 1-Sep to 29-Sep; y-axis 0% to 45%
• August Completion Rates: 1-Aug to 31-Aug; y-axis 0% to 40%
• July Completion Rates: 1-Jul to 31-Jul; y-axis 0% to 50%
• June Completion Rates: y-axis 0% to 40%
How to Develop Field Guidelines?
• How to use these scores and their impact…
  • Increase response rate?
  • Target low probability cases with selected characteristics more aggressively? (e.g., presence of age 18-24 adult?)
• Develop business rules
  • E.g., on Day 10 assess # of contact attempts to high propensity cases and prioritize those with fewer than X attempts
  • E.g., on Day 21 prioritize high propensity cases and close out low propensity cases quickly
• Decide how to implement / enforce
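The Day 10 example rule could be expressed as a simple filter, sketched below. The slides leave X and the definition of "high propensity" open, so `max_attempts` and `high_cutoff` are hypothetical parameters, as are the class and function names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class OpenCase:
    case_id: str
    propensity: float  # current daily propensity score
    attempts: int      # contact attempts so far in this collection period


def day10_priority_list(cases: List[OpenCase],
                        high_cutoff: float,
                        max_attempts: int) -> List[str]:
    """IDs of high-propensity cases with fewer than max_attempts attempts.

    high_cutoff and max_attempts (the slide's X) are placeholders that a
    survey manager would have to set; results are ordered by score,
    highest first.
    """
    flagged = [
        c for c in cases
        if c.propensity >= high_cutoff and c.attempts < max_attempts
    ]
    flagged.sort(key=lambda c: c.propensity, reverse=True)
    return [c.case_id for c in flagged]
```

Keeping the thresholds as parameters makes it easy to compare alternative rules in a field test before fixing them as guidelines.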
How to Test and Measure the Benefit of Using Propensity Scores?
Who?
• All NCVS survey staff within one or two Regional Offices
• Monitored by regional Survey Statisticians
System Changes?
• Push to the Unified Tracking System or Case Management System
• Reports and their interpretation
Test vs. Control?
• Survey areas assigned to test vs. control, with some matching by demographics of the areas
• Manipulate case assignments through case management?
Moving Forward
• Continue to improve the model
  • Adding contact strategies and their interaction effects for optimal contact on a case-by-case basis
• Analyze differences in completion rates and propensity scores between experienced FRs and newly hired FRs
• Test in the field
To be continued…