
Predicting Completed Interviews in a National Panel Survey

Travis Pape & James Lawrence
U.S. Census Bureau

American Association for Public Opinion Research Annual Conference
May 14 – 17, 2015

Hollywood, FL

Disclaimer: This presentation is released to inform interested parties of research and to encourage discussion. The views expressed are those of the authors and not necessarily those of the Census Bureau.

Motivation for Paradata Metrics

Improve survey performance to meet strategic survey objectives

• Cost, Time and Quality
• Productivity
• Efficiency
• Level of Effort
• Data Completeness
• Balanced Respondent Sample

Background on Propensity Models

• Propensity modeling generates probability scores that reflect the likelihood of a particular outcome on a given contact attempt.

• Propensity modeling research has been ongoing for some time.

• It is useful in nonresponse adjustment and in data collection, particularly in CAPI surveys.

• Interest is increasing as response rates decline and interest in responsive designs grows.

• Census and other survey organizations have been developing and researching propensity models.

Census Research

• Erdman et al. have developed several propensity models for seven major demographic household surveys

• In 2013, a model was developed for the American Housing Survey

• High/medium/low (H/M/L) response propensities were delivered via the Unified Tracking System (UTS)

• No guidelines were provided on how to use the scores

• The L.A. RO used scores during the end of the data collection

Objectives of Propensity Research

• Understand the statistical properties of the propensity model and its scores
  - Propensity scores over time
  - Actual vs. predicted outcomes
  - Model stability
  - Case subgroups based on propensity scores

• Identify the survey management implications
  - Establish guidelines for field supervisory staff
  - Recommend actions to manage cases with propensities

• Clearly define reports/tools to assist field staff

National Crime Victimization Survey (NCVS)

• Conducted by the Census Bureau for the Bureau of Justice Statistics

• Nation’s primary source of criminal victimization information

• Rotating panel design (sample divided into 6 panels interviewed at 6-month intervals)

• Nationally representative sample: ~10,500 cases per month

• Selected households remain in sample for 3 years

• Includes group quarters (GQs) in sample

Predicting Completed Interview on Next Contact Attempt

Variables Evaluated for Inclusion in the Model

Known characteristics of case before data collection
• Field Strata: binary, 1 = most difficult strata; else, easier strata combined
• Time in sample: # of survey periods case has been in sample
• Days until closeout: # of days left in the current data collection period
• Census mail return: block group percentage of 2010 mailed return

Housing unit characteristics discovered during data collection
• Access barrier: unable to reach/locked gate/buzzer entry
• Language barrier: unable to conduct at time of contact
• Vacation: on vacation, away from home / at second home

Variables Evaluated for Inclusion in the Model (cont.)

Concerns (Time Constraints, Privacy Issues, Firm Reluctance, Prior Interview Issues):
• Time constraints on earlier attempt in current data collection period: CHI concerns
• Privacy issues on earlier attempt in current data collection period: CHI concerns
• Firm reluctance on earlier attempt in current data collection period: CHI concerns
• Interview concerns on earlier attempt in current data collection period: CHI concerns; longitudinal survey in which R voices concern about past survey periods

Contact Attempt Outcomes:
• Contact with eligible respondent
• Negative Reluctance
• Questions about survey
• Number of Telephone Contacts

Strategies:
• Phone message left on answering machine
• Notice of Visit left or promotional packet
• Appointment was scheduled in prior contact
• Peak hours: current attempt is made during peak hours (M-F: 3pm – 9pm, Sat: 9am – 6pm, Sun: 12pm – 6pm)
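The peak-hours indicator can be derived from the timestamp of a contact attempt. The following is a minimal sketch of one way to code it, not the authors' implementation; the function name `is_peak_hour` and the choice to include the start of each window and exclude its end are assumptions.

```python
from datetime import datetime, time

# Peak windows as listed on the slide; weekday() uses Monday=0 ... Sunday=6.
WEEKDAY_PEAK = (time(15, 0), time(21, 0))   # M-F: 3pm to 9pm
SATURDAY_PEAK = (time(9, 0), time(18, 0))   # Sat: 9am to 6pm
SUNDAY_PEAK = (time(12, 0), time(18, 0))    # Sun: 12pm to 6pm


def is_peak_hour(attempt: datetime) -> bool:
    """Return True if the attempt starts inside that day's peak window.

    Assumption (not stated on the slide): the window includes its start
    time and excludes its end time.
    """
    day = attempt.weekday()
    if day <= 4:
        start, end = WEEKDAY_PEAK
    elif day == 5:
        start, end = SATURDAY_PEAK
    else:
        start, end = SUNDAY_PEAK
    return start <= attempt.time() < end
```

For example, an attempt at 4pm on Wednesday, October 1, 2014 would be flagged as peak, while an attempt at 10am the same day would not.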

Defining the Model

Data: March – October 2014

The model's best predictors were identified by running the groups of candidate variables for each month and selecting statistically significant predictors to define the “final” model

The final model was fit on the March – October data and will be used moving forward

The coefficients of the predictors are re-estimated at the beginning of each month using the data from the previous month as a training set (e.g., May coefficients based on April data)

A daily propensity score is calculated for each remaining open case

Any case that has not been worked is assigned the average propensity for that day
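The rule for unworked cases can be sketched as a small fill step applied to each day's scores. This is an illustration under assumptions: the slides do not say which cases the daily average is taken over, so the sketch averages the cases that received a model score that day, and the function name and dictionary layout are hypothetical.

```python
from statistics import mean
from typing import Dict, Optional


def fill_unworked_scores(scores: Dict[str, Optional[float]]) -> Dict[str, float]:
    """Give every unworked case (score None) the day's average propensity.

    Assumption: the average is taken over the cases that have a model
    score on that day; the source does not specify the averaging base.
    """
    worked = [s for s in scores.values() if s is not None]
    if not worked:
        raise ValueError("no scored cases to average on this day")
    day_average = mean(worked)
    return {case: (day_average if s is None else s) for case, s in scores.items()}
```

With made-up scores of 0.10 and 0.30 for two worked cases, a third unworked case would receive their average.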

NCVS Propensity Model Coefficients

Predictor                                June     July     August   September  October
Intercept                                -1.177   -1.216   -1.110   -1.159     -1.036
Strata                                    0.047    0.051    0.051    0.050      0.039
# of attempts                             0.090    0.161    0.175    0.163      0.225
# of appointments                         0.134    0.105    0.205    0.139      0.193
Prior Contact                             0.360    0.333    0.322    0.331      0.271
Question raised on previous contact       0.454    0.370    0.367    0.300      0.233
Access Indicator                         -0.938   -1.048   -1.004   -1.104     -1.165
Days Until Closeout                      -0.078   -0.020   -0.134   -0.060     -0.093
Language Barrier                         -0.804   -0.457   -0.628   -0.430     -0.500
Vacation Indicated                       -0.981   -1.276   -1.150   -0.813     -1.429
# of phone attempts                      -0.140   -0.223   -0.240   -0.232     -0.291
# of personal visit attempts             -0.167   -0.243   -0.267   -0.218     -0.322
Prior Reluctance                         -0.477   -0.296   -0.373   -0.360     -0.378
Negative reluctance on previous contact  -0.168   -0.111   -0.304   -0.292     -0.157
Roster size on previous contact           *        *        *        *          *
Time in Sample                            *        *        *        *          *

Key Points:
• Model coefficients are consistent from month to month
• Negative effects are logical
• Access has largest influence
• Strata difference has minimal but significant effect
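To see how coefficients like these translate into a propensity score, the sketch below applies a logistic link to a few of the June values. The slides do not state the link function or how each predictor is coded, so the logistic form, the 0/1 coding, and the dictionary keys are all assumptions made for illustration.

```python
import math

# June coefficients from the table (a subset of predictors, for brevity).
# The key names are hypothetical labels, not variable names from the source.
JUNE_COEFFICIENTS = {
    "intercept": -1.177,
    "strata": 0.047,
    "prior_contact": 0.360,
    "access_barrier": -0.938,
    "language_barrier": -0.804,
    "vacation": -0.981,
}


def propensity(case: dict, coefficients: dict) -> float:
    """Score one case, ASSUMING a logistic regression model.

    `case` maps predictor names to values (assumed 0/1 indicators here).
    """
    logit = coefficients["intercept"]
    for name, value in case.items():
        logit += coefficients[name] * value
    # Inverse logit: converts the linear predictor to a probability.
    return 1.0 / (1.0 + math.exp(-logit))


baseline = propensity({}, JUNE_COEFFICIENTS)
with_barrier = propensity({"access_barrier": 1}, JUNE_COEFFICIENTS)
```

Under these assumptions, a case with an access barrier scores well below the intercept-only baseline, which matches the direction of the negative coefficient.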

Average Propensity Scores by Day of Data Collection: October

[Figure: line chart titled "Average Propensity (October, Open Cases)". X-axis: Date, 1-Oct to 31-Oct. Y-axis: Propensity Percentage, 14.5% to 19.0%.]

Average Scores: June - September

[Figure: four line charts of average propensity for open cases by day of data collection. Y-axis: Propensity Percentage.
 Average Propensity Score (June, Open Cases): 1-Jun to 29-Jun, 13.5% to 16.0%
 Average Propensity (July, Open Cases): 1-Jul to 31-Jul, 12.5% to 15.5%
 Average Propensity (August, Open Cases): 1-Aug to 31-Aug, 14.5% to 18.0%
 Average Propensity (September, Open Cases): 1-Sep to 29-Sep, 14.5% to 17.5%]

Validity Check: Do high propensity cases complete at a higher rate?

[Figure: chart titled "October Completion Rates" comparing the top 25% of cases with the bottom 75%. Y-axis: 0% to 45%.]
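A validity check of this kind can be sketched as a comparison of completion rates between the highest-scoring quarter of cases and the rest. The code below is an illustration only: the function name, the tuple layout, and the toy data in the usage note are invented, and the slides do not describe how the authors formed the groups.

```python
from typing import List, Tuple


def completion_rates_by_group(
    cases: List[Tuple[float, bool]], top_share: float = 0.25
) -> Tuple[float, float]:
    """Return (top-group completion rate, remaining-group completion rate).

    `cases` holds (propensity_score, completed) pairs. Cases are ranked by
    score, and the top `top_share` of them form the high-propensity group.
    """
    ranked = sorted(cases, key=lambda c: c[0], reverse=True)
    n_top = max(1, round(len(ranked) * top_share))
    top, rest = ranked[:n_top], ranked[n_top:]

    def rate(group: List[Tuple[float, bool]]) -> float:
        if not group:
            return 0.0
        return sum(1 for _, completed in group if completed) / len(group)

    return rate(top), rate(rest)
```

If the scores are informative, the first rate returned should exceed the second, which is what the chart on this slide is checking.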

Actual Completion Rates of Top Quartile Cases by Day of Data Collection: June - September

[Figure: four charts of daily completion rates for top quartile cases.
 June Completion Rates: y-axis 0% to 40%
 July Completion Rates: 1-Jul to 31-Jul, y-axis 0% to 50%
 August Completion Rates: 1-Aug to 31-Aug, y-axis 0% to 40%
 September Completion Rates: 1-Sep to 29-Sep, y-axis 0% to 45%]

How to Develop Field Guidelines?

• How to use these scores and their impact…
  - Increase response rate?
  - Target low probability cases with selected characteristics more aggressively? (e.g., presence of age 18-24 adult?)

• Develop business rules
  - E.g., on Day 10 assess # of contact attempts to high propensity cases and prioritize those with fewer than X attempts
  - E.g., on Day 21 prioritize high propensity cases and close out low propensity cases quickly

• Decide how to implement / enforce
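The Day 10 example rule could be expressed as a simple filter over open cases. This is a hypothetical sketch: the slides leave the attempt threshold X unspecified and do not define "high propensity", so both appear here as parameters (`max_attempts`, `high_cutoff`) that a survey manager would have to choose, and the `OpenCase` record is invented for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class OpenCase:
    case_id: str
    propensity: float   # daily propensity score for the case
    attempts: int       # contact attempts made so far this period


def day10_priority_list(
    cases: List[OpenCase], high_cutoff: float, max_attempts: int
) -> List[str]:
    """List high-propensity cases with fewer than `max_attempts` attempts.

    `high_cutoff` and `max_attempts` (the slide's "X") are not given in
    the source; callers must supply them. Highest scores come first.
    """
    flagged = [
        c for c in cases
        if c.propensity >= high_cutoff and c.attempts < max_attempts
    ]
    flagged.sort(key=lambda c: c.propensity, reverse=True)
    return [c.case_id for c in flagged]
```

Keeping the thresholds as arguments makes it easy to try several candidate rules against past months before committing to one in the field.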

How to Test and Measure the Benefit of Using Propensity Scores?

Who?
• All NCVS survey staff within one or two Regional Offices
• Monitored by regional Survey Statisticians

System Changes?
• Push to the Unified Tracking System or Case Management System
• Reports and their interpretation

Test vs. Control?
• Survey areas assigned to test vs. control, with some matching by demographics of the areas
• Manipulate case assignments through case management?

Moving Forward

• Continue to improve the model
  - Add contact strategies and their interaction effects for optimal contact on a case-by-case basis

• Analyze differences in completion rates and propensity scores between experienced FRs and newly hired FRs

• Test in the field

To be continued…


Thank You

