Nonresponse Bias Correction in Telephone Surveys Using Census Geocoding: An Evaluation of Error Properties
Paul Biemer, RTI International and University of North Carolina
Andy Peytchev, RTI International

Estimating the Population Mean in an RDD Survey
The total sample consists of respondents {R} and nonrespondents {NR}.
$$\bar{y} = p_{\{R\}}\,\bar{y}_{\{R\}} + p_{\{NR\}}\,\bar{y}_{\{NR\}}$$
The mean of the nonrespondents, $\bar{y}_{\{NR\}}$, is unknown.
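
As a worked step, the population mean written as a weighted combination of the respondent and nonrespondent means implies the familiar expression for the bias of the respondent mean. The derivation below is a sketch that assumes $p_{\{R\}} + p_{\{NR\}} = 1$ and treats the sample quantities as fixed; it is not taken from the slides.

```latex
\begin{align*}
\bar{y} &= p_{\{R\}}\,\bar{y}_{\{R\}} + p_{\{NR\}}\,\bar{y}_{\{NR\}} \\
\bar{y}_{\{R\}} - \bar{y}
  &= (1 - p_{\{R\}})\,\bar{y}_{\{R\}} - p_{\{NR\}}\,\bar{y}_{\{NR\}} \\
  &= p_{\{NR\}} \left( \bar{y}_{\{R\}} - \bar{y}_{\{NR\}} \right)
\end{align*}
```

The bias of the respondent mean therefore depends on the nonresponse rate and on the unknown nonrespondent mean, which is the quantity the census geocoding method tries to approximate.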

Methods for Adjusting and Evaluating for RDD Nonresponse
- Very limited information on NRs in RDD surveys
- Post-stratification adjustments are the norm
  - Effectiveness at reducing bias is questionable at best
- Bias is sometimes evaluated using
  - Maximum follow-up effort approaches
    - Only evaluates reduction in bias due to slight elevations in response rates
  - Comparison to external gold standard estimates
    - Limited in scope to a few characteristics
  - Census block group geocoding
    - Error properties are largely unknown
    - The focus of this research

Census Geocoding (CG) Method
- Obtain the addresses for nonrespondents (50-60% "success" rate)
- Geocode addresses
- Link to census aggregate data
  - Address matched: link unit to census block group (CBG) via geocode
  - Exchange matched: link to census tract (CT) via telephone number
- Substitute the corresponding CBG or CT mean for the nonrespondent's characteristic
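
The substitution step can be illustrated with a minimal Python sketch. The class `Nonrespondent`, the function `impute_from_census`, the identifiers `CBG-1` and `CT-A`, and all values are hypothetical and chosen only for illustration. The sketch assumes a CBG mean is used whenever an address match exists and the CT mean is used otherwise.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Nonrespondent:
    # Hypothetical record layout, not taken from the slides.
    cbg_id: Optional[str]  # census block group, known only if the address was geocoded
    ct_id: str             # census tract linked through the telephone exchange


def impute_from_census(unit: Nonrespondent,
                       cbg_means: Dict[str, float],
                       ct_means: Dict[str, float]) -> float:
    """Return the CBG mean for address-matched units, else the CT mean."""
    if unit.cbg_id is not None and unit.cbg_id in cbg_means:
        return cbg_means[unit.cbg_id]
    return ct_means[unit.ct_id]


# Toy aggregate data: proportion with some characteristic in each area.
cbg_means = {"CBG-1": 0.62}
ct_means = {"CT-A": 0.55}

units = [
    Nonrespondent(cbg_id="CBG-1", ct_id="CT-A"),  # address matched
    Nonrespondent(cbg_id=None, ct_id="CT-A"),     # exchange matched only
]
imputed = [impute_from_census(u, cbg_means, ct_means) for u in units]
```

Falling back from the CBG to the tract mirrors the two match types on the slide; a production version would also need to handle units that match neither level.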
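
Impute Nonrespondent Characteristics from Census Aggregate Data
The total sample consists of respondents {R} and nonrespondents {NR}.
$$\tilde{y} = p_{\{R\}}\,\bar{y}_{\{R\}} + p_{\{NR\}}\,\tilde{y}_{\{NR\}}$$
The imputed nonrespondent mean, $\tilde{y}_{\{NR\}}$, is obtained from the nonrespondents' CBG or CT.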
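
A small numeric sketch of the imputed estimator may help fix ideas. It assumes the estimator is the response-rate-weighted combination of the respondent mean and the census-imputed nonrespondent mean, and that the estimated bias of the respondent mean is the respondent mean minus the imputed mean. The function name `imputed_mean` and all numbers are hypothetical.

```python
def imputed_mean(p_r: float, ybar_r: float, ytilde_nr: float) -> float:
    """Combine the respondent mean with the census-imputed nonrespondent mean."""
    p_nr = 1.0 - p_r  # assumes respondents and nonrespondents exhaust the sample
    return p_r * ybar_r + p_nr * ytilde_nr


# Hypothetical inputs: 40% response rate, respondent mean 0.70,
# imputed nonrespondent mean 0.60.
ybar_r = 0.70
y_tilde = imputed_mean(p_r=0.40, ybar_r=ybar_r, ytilde_nr=0.60)

# Assumed form of the estimated bias in the respondent mean.
estimated_bias = ybar_r - y_tilde
```

With these inputs the imputed mean is about 0.64 and the estimated bias about 0.06; whether that estimate can be trusted is the question the presentation examines.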

Questions Addressed by this Research
- What is the bias in $\tilde{y}$?
- Is $\hat{B} = \bar{y}_{\{R\}} - \tilde{y}$ a valid estimate of the bias in the unadjusted (or post-stratified) estimator of the mean?
- Does the CG method provide useful data for modeling response propensity?
The first two questions will be addressed in today's presentation.
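
Decomposition of the Bias in $\tilde{y}$
The total sample is divided into respondents and nonrespondents, and the nonrespondents into four subsets:
- {CA}: correctly matched addresses
- {IA}: incorrectly matched addresses
- {CE}: correctly matched exchanges
- {IE}: incorrectly matched exchanges
Each subset has a size and an expected difference:

Subset   Size                               Expected difference
{CA}     $\pi_{\{CA\}} = E(p_{\{CA\}})$     $\delta_{\{CA\}} = \tilde{y}_{\{CA\}} - \bar{y}_{\{CA\}}$
{IA}     $\pi_{\{IA\}} = E(p_{\{IA\}})$     $\delta_{\{IA\}} = \tilde{y}_{\{IA\}} - \bar{y}_{\{IA\}}$
{CE}     $\pi_{\{CE\}} = E(p_{\{CE\}})$     $\delta_{\{CE\}} = \tilde{y}_{\{CE\}} - \bar{y}_{\{CE\}}$
{IE}     $\pi_{\{IE\}} = E(p_{\{IE\}})$     $\delta_{\{IE\}} = \tilde{y}_{\{IE\}} - \bar{y}_{\{IE\}}$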

Components of the Bias in $\tilde{y}$
$$\mathrm{Bias}(\tilde{y}) = E(\tilde{y} - \bar{y}) = \pi_{\{NR\}}\,E(\tilde{y}_{\{NR\}} - \bar{y}_{\{NR\}})$$
where
$$E(\tilde{y}_{\{NR\}} - \bar{y}_{\{NR\}}) = \pi_{\{CA\}}\delta_{\{CA\}} + \pi_{\{IA\}}\delta_{\{IA\}} + \pi_{\{CE\}}\delta_{\{CE\}} + \pi_{\{IE\}}\delta_{\{IE\}}$$
Each term is a subset's size times its expected difference. In order, the terms correspond to correct CBG matches {CA}, incorrect CBG matches {IA}, correct CT matches {CE}, and incorrect CT matches {IE}.
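
The decomposition can be checked numerically with a short sketch. It assumes the bias of the imputed mean equals the expected nonresponse share times the sum, over the four subsets, of each subset's share of nonrespondents times its expected difference between the imputed and true means. The names `pi`, `delta`, `pi_nr` and every number below are hypothetical, not results from the study.

```python
SUBSETS = ("CA", "IA", "CE", "IE")


def nr_expected_difference(pi: dict, delta: dict) -> float:
    """Sum of size times expected difference over the four nonrespondent subsets."""
    return sum(pi[s] * delta[s] for s in SUBSETS)


def bias_of_imputed_mean(pi_nr: float, pi: dict, delta: dict) -> float:
    """Bias of the imputed mean under the assumed decomposition."""
    return pi_nr * nr_expected_difference(pi, delta)


# Hypothetical shares of nonrespondents in each subset (they sum to 1).
pi = {"CA": 0.30, "IA": 0.20, "CE": 0.10, "IE": 0.40}
# Hypothetical expected differences between imputed and true subset means.
delta = {"CA": 0.02, "IA": 0.01, "CE": -0.01, "IE": 0.03}

contributions = {s: pi[s] * delta[s] for s in SUBSETS}
bias = bias_of_imputed_mean(pi_nr=0.60, pi=pi, delta=delta)
```

Keeping the per-subset `contributions` separate makes it easy to see whether components offset each other or accumulate.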

Estimation of the Bias Components
- National Comorbidity Survey Replication (NCS-R)
  - National probability sample of 18+ in households
  - Face-to-face survey with 71% response rate
  - All addresses were geocoded
- CG was applied to 8,178 responding households that provided a telephone number (88% of NCS sample)
- CG bias components estimated based on 41% response rate (response after 3 callbacks)
- Sensitivity analysis based on three response rates:
  - 2 callbacks: 26% response rate
  - 3 callbacks: 40% response rate
  - 5 callbacks: 60% response rate
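
The evaluation design can be sketched as follows: units that responded within a callback limit are treated as respondents, the rest as simulated nonrespondents whose true values are still known. This is a minimal, unweighted illustration (the study reports weighted means). The function `evaluate_cg`, the record fields `y`, `callbacks`, and `census_mean`, and the toy data are all hypothetical.

```python
from statistics import mean


def evaluate_cg(records, max_callbacks):
    """Compare respondent, imputed, and true means for a given callback limit."""
    resp = [r for r in records if r["callbacks"] <= max_callbacks]
    nonresp = [r for r in records if r["callbacks"] > max_callbacks]
    if not resp or not nonresp:
        raise ValueError("need at least one respondent and one nonrespondent")

    p_r = len(resp) / len(records)
    true_mean = mean(r["y"] for r in records)
    resp_mean = mean(r["y"] for r in resp)
    # Nonrespondents contribute their census area mean instead of their own value.
    imputed_nr_mean = mean(r["census_mean"] for r in nonresp)
    imputed_mean = p_r * resp_mean + (1 - p_r) * imputed_nr_mean

    return {
        "response_rate": p_r,
        "bias_respondent_mean": resp_mean - true_mean,
        "bias_imputed_mean": imputed_mean - true_mean,
    }


# Toy data: y is the true value, callbacks is the number of calls needed to respond.
records = [
    {"y": 1, "callbacks": 1, "census_mean": 0.8},
    {"y": 1, "callbacks": 2, "census_mean": 0.7},
    {"y": 0, "callbacks": 4, "census_mean": 0.5},
    {"y": 0, "callbacks": 6, "census_mean": 0.3},
]
result = evaluate_cg(records, max_callbacks=3)
```

Rerunning the function with different `max_callbacks` values corresponds to the sensitivity analysis across response rates.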

Why is it reasonable to use a face-to-face survey to evaluate the CG bias in an RDD survey?
- The nonresponse mechanism is not a critical factor in the assessment of the CG bias.
- A survey with a relatively high response rate is needed to evaluate the bias.
- Addresses are known for all sample members and can therefore be geocoded to their correct CBGs.
- Sensitivity analysis can be performed to assess the effect on CG bias of increasing response rates.

Weighted Respondent Mean, True Mean, and CG Imputed Mean for Available Characteristics
[Figure: bar chart, vertical axis 0 to 80. Series: Respondent Mean, True Mean, Imputed Mean. Categories: White, Black, Asian, Other, Hisp, Male, Female, 1 pers HH, 2+ pers HH, < 18 HH, Other HH, Age 18–24, Age 25–34, Age 35–49, Age 50–59, Age 60+, < $15,000, $15K–$30K, $30K–$50K, $50K–$75K, ≥ $75K.]

Relative Bias (Relbias) for Demographic Characteristics by Response Rate
[Figure: bar chart, vertical axis -50% to 20%. Series: 26% RR, 41% RR, 60% RR. Categories: White, Black, Asian, Other, Hisp, Male, Female, 1 HH, 1+ HH, 1+ child, No child, 18-24, 25-34, 35-49, 50-59, 60+, <15k, 15k-30k, 30k-50k, 50k-75k, 75k+.]

Average Estimates of $\pi_{\{s\}}\delta_{\{s\}}$ for {s} = {CA}, {IA}, {CE}, and {IE}
[Figure: bar chart of the bias component in percentage points, vertical axis 0.00% to 1.00%. Categories: {CA}, {IA}, {CE}, {IE}. Series: 26% RR, 41% RR, 60% RR.]

Average Relative Size of the Bias Components
[Figure: two bar-chart panels, $\pi_{\{s\}}$ and $|\delta_{\{s\}}|$, each shown at 26% RR, 41% RR, and 60% RR; vertical axis 0% to 50%. Series: {CA}, {IA}, {CE}, {IE}.]

Conclusions
- Bias in the CG estimates of NR bias is unacceptably large
  - Race, age, and income were the most biased
- Major source of bias: {IE} followed by {CA} (surprisingly)
  - Approximately 75% of the cases fall into these subsets
- Correctly matching to CBGs reduces the bias, but minimally
- Biases tend to build across components rather than netting out
- Increasing the survey response rate reduces bias in the CG approach; relative importance of each component is stable

Next Steps
- Further characterize the CG bias by its components
- Consider the use of CBG and CT information obtained from the CG method for:
  - modeling of response propensities
  - adjusting for nonresponse bias