1
Merging Administrative Records in the Absence of a Register: Data Quality Concerns and Outcomes of an Experiment in Administrative Records Use
Dean H. Judson
Planning, Research and Evaluation Division
2
Basic Conclusions and Recommendations
• An administrative records census is not currently feasible, but might be for 2020
• Numerous applications of administrative records research for 2010:
  • Nonresponse Follow-up (NRFU) substitution
  • Imputation methods improvement
  • Master Address File (MAF) targeting
  • Census unduplication confirmation
  • Large scale data processing and record linkage improvements
  • Social Security Number (SSN) verification and search
  • Population estimation, survey improvement
• Four major improvements needed for an Administrative Records Experiment in 2010 (AREX 2010):
  • Race and Hispanic origin improvements (underway)
  • Timeliness of AR data (harder, but doable)
  • Greater geographic representativeness
  • Experimental variation on key design dimensions
3
What Was the Purpose of AREX 2000?
• Test the feasibility of an administrative records census
  • two counties in Maryland
    • 1.4M persons in 558,000 households
  • three counties in Colorado
    • 1.2M persons in 459,000 households
• Test two methods for conducting an administrative records census
  • top-down method
  • bottom-up method (match to address list, additional operations)
4
The Foundation of AREX: The Statistical Administrative Records System (StARS)
• Prototype based on 1998 vintage files
• Census-like structure and content
• Final database comparable to Census 2000
(Flowchart, page A)
5
The Statistical Administrative Records System-1999
[Flowchart, Page A: record counts by processing stage]

Source files:
• TY98 IRS 1040: 119,946,193
• TY98 IRS 1099: 598,075,971
• Medicare: 56,837,022
• Selective Service: 13,176,234
• HUD TRACS: 3,342,234
• Indian Health Service: 3,106,821
• NUMIDENT: 676,589,439
• Census NUMIDENT: 396,185,872

Edited files:
• Edited IRS 1040: 243,260,776
• Edited IRS 1099
• Edited Medicare
• Edited Selective Service
• Edited HUD TRACS
• Edited Indian Health Service

Address Processing: 795,742,702
• Hygiene & Unduplication: 136,154,293
• Geocoding: 102,965,122 (75.6% coded); 33,189,171 (24.4% uncoded)

Person Processing: 875,750,973
• SSN Validation (PVS): 844,945,296 valid (96.5%); invalid SSNs: 30,805,677 (3.5%)
• Unduplication: 279,601,038
• Remove Deceased/Create Composite Record: 257,764,909
• Person Characteristics File (PCF): 396,185,872

Extraction of AREX Test Site Records:
• 1,459,760 in Baltimore Site
• 1,229,274 in Colorado Site

Other flowchart elements: Race Model, Gender Model, Mortality Model, TIGER, Code 1, ABI, ? Research
6
Administrative Source Files (StARS)
• Internal Revenue Service tax files
• Medicare enrollment database
• Public housing assistance file
• Selective Service registration file
• Indian Health Service file
• Social Security Number master file (NUMIDENT lookup file)
(Refer to Tables, pages B, C)
7
Creating Final StARS Database
• Select best address and demographics based on
  • geocodability
  • currency
  • quality
• Impute missing demographics
• Flag records for deceased people
• Final database is like the census
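The "select best address" step can be pictured as a ranking over candidate addresses from different source files. The sketch below is illustrative only: the `AddressCandidate` fields, the sample records, and the priority order (geocodability, then currency, then quality) are assumptions, not the actual StARS selection rules.

```python
from dataclasses import dataclass


@dataclass
class AddressCandidate:
    source: str          # source file name, e.g. "IRS 1040" (sample values only)
    geocodable: bool     # hypothetical flag: could the address be geocoded?
    reference_year: int  # hypothetical currency measure
    quality_score: int   # hypothetical quality score, higher is better


def select_best_address(candidates):
    """Pick one address: geocodable first, then most recent, then highest quality."""
    return max(
        candidates,
        key=lambda c: (c.geocodable, c.reference_year, c.quality_score),
    )


# Made-up candidates for one person, purely to exercise the ranking.
candidates = [
    AddressCandidate("IRS 1040", geocodable=False, reference_year=1998, quality_score=3),
    AddressCandidate("Medicare", geocodable=True, reference_year=1997, quality_score=2),
    AddressCandidate("Selective Service", geocodable=True, reference_year=1998, quality_score=1),
]
best = select_best_address(candidates)
print(best.source)
```

A tuple key keeps the priority order explicit, so changing the assumed ranking only means reordering the tuple.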
8
Address Processing Results (StARS)
• Almost 800 million addresses at start
• About 6 percent identified as potential businesses
• 136 million address records after unduplication
• About 75 percent geocoded
  • 85 percent geocoding rate for city-style addresses
9
Person Processing Results (StARS)
• 875 million records at start
• 845 million have valid SSN record (96.5%)
• 280 million after unduplication by SSN
• 261 million after removal of known deceased
• 257 million after removal of known deceased and persons residing in outlying territories (vs. Census 2000 population of 281 million)
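As a quick arithmetic check, the rounded figures on this slide can be recomputed from the unrounded counts on the StARS flowchart (Page A). The variable names are illustrative, and the records-per-SSN ratio assumes that unduplication operated on the valid-SSN records, which the slides do not state explicitly.

```python
# Record counts copied from the StARS flowchart (Page A).
person_records_start = 875_750_973
valid_ssn_records = 844_945_296
after_unduplication = 279_601_038
composite_records = 257_764_909
census_2000_population = 281_000_000  # rounded figure quoted on this slide


def pct(part, whole):
    """Return part as a percentage of whole, rounded to one decimal."""
    return round(100 * part / whole, 1)


valid_ssn_pct = pct(valid_ssn_records, person_records_start)
# Assumption: unduplication by SSN is applied to the valid-SSN records.
records_per_ssn = round(valid_ssn_records / after_unduplication, 2)
coverage_pct = pct(composite_records, census_2000_population)

print(f"Valid SSN share: {valid_ssn_pct}%")
print(f"Records per unduplicated SSN: {records_per_ssn}")
print(f"Final StARS persons as % of Census 2000: {coverage_pct}%")
```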
10
Additional Operations of AREX 2000
• Clerical geocoding
• Request for physical address (for P.O. Boxes, etc.)
• Match to Decennial Master Address File
• Field address verification
(Refer to AREX flowchart, page D)
11
County Results: Total Population
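| County | Census 2000 | Top-Down A-C Diff | Top-Down % | Bottom-Up A-C Diff | Bottom-Up % |
|---|---|---|---|---|---|
| Jefferson County | 519,326 | -45,831 | 91 | -15,704 | 97 |
| El Paso County | 501,533 | -44,642 | 91 | -7,280 | 99 |
| Douglas County | 175,300 | -27,030 | 85 | -5,660 | 97 |
| Baltimore City | 625,401 | -54,753 | 91 | 11,328 | 102 |
| Baltimore County | 736,652 | -40,469 | 95 | -8,447 | 99 |
| Total | 2,558,212 | -212,725 | 92 | -25,763 | 99 |

Note: A=AREX count; C=Census count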
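The "%" columns are not defined on the slide. Reading them as the AREX count expressed as a percentage of the Census count, that is 100 × A/C, reproduces every value in the table. The sketch below checks that reading; the dictionary and function names are illustrative.

```python
# Census 2000 counts and A-C differences copied from the county table.
census = {
    "Jefferson County": 519_326,
    "El Paso County": 501_533,
    "Douglas County": 175_300,
    "Baltimore City": 625_401,
    "Baltimore County": 736_652,
}
top_down_diff = {
    "Jefferson County": -45_831,
    "El Paso County": -44_642,
    "Douglas County": -27_030,
    "Baltimore City": -54_753,
    "Baltimore County": -40_469,
}
bottom_up_diff = {
    "Jefferson County": -15_704,
    "El Paso County": -7_280,
    "Douglas County": -5_660,
    "Baltimore City": 11_328,
    "Baltimore County": -8_447,
}


def arex_pct_of_census(census_count, diff):
    """AREX count (Census + diff) as a whole-number percent of the Census count."""
    return round(100 * (census_count + diff) / census_count)


top_down_pct = {c: arex_pct_of_census(census[c], top_down_diff[c]) for c in census}
bottom_up_pct = {c: arex_pct_of_census(census[c], bottom_up_diff[c]) for c in census}

for county in census:
    print(f"{county}: top-down {top_down_pct[county]}%, bottom-up {bottom_up_pct[county]}%")
```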
12
County Results: Race and Hispanic Algebraic Percent Error (A-C)/C
Size of minority population impacted ALPE results.
[Figure: Bottom-up race and Hispanic results by county. Y-axis: ALPE, from -35% to 35%. Counties: Baltimore County, Baltimore City, Douglas County, El Paso County, Jefferson County. Series: White, Black, AI, API, Hispanic.]
13
County Results: AREX Race Imputation
AREX race imputation was greater in the Colorado counties, especially for Blacks.
[Figure: Persons with imputed race by county. Y-axis: percent of persons, from 0% to 35%. Counties: Baltimore County, Baltimore City, Douglas County, El Paso County, Jefferson County. Series: Any race, Blacks.]
Cause: Very high imputation rates
14
Additional County Results
• Bottom-up performed better than top-down
• Children undercounted
• Aged overcounted
• Evidence of imputation problems
15
Tracts and Blocks
• MD site: 404 tracts, 17,041 blocks
• CO site: 283 tracts, 22,945 blocks
…
• Emphasis on ALPE { (A-C)/C } distributions
  • 5% criterion: AREX is within +/-5% of Census
  • 25% criterion: AREX is within +/-25% of Census
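The two criteria amount to asking what share of geographic units have |ALPE| at or below a threshold. The deck does not include tract or block counts, so this sketch applies the same calculation to the five county totals from the County Results slide, purely as an illustration. The function names are assumptions.

```python
# County-level Census counts and A-C differences from the County Results slide.
# Order: Jefferson, El Paso, Douglas, Baltimore City, Baltimore County.
census_counts = [519_326, 501_533, 175_300, 625_401, 736_652]
top_down_diffs = [-45_831, -44_642, -27_030, -54_753, -40_469]
bottom_up_diffs = [-15_704, -7_280, -5_660, 11_328, -8_447]


def alpe(diff, census_count):
    """Algebraic percent error (A - C) / C, given the difference A - C."""
    return diff / census_count


def share_meeting(alpes, threshold):
    """Fraction of units whose |ALPE| is within the threshold (e.g. 0.05)."""
    return sum(abs(a) <= threshold for a in alpes) / len(alpes)


top_down_alpes = [alpe(d, c) for d, c in zip(top_down_diffs, census_counts)]
bottom_up_alpes = [alpe(d, c) for d, c in zip(bottom_up_diffs, census_counts)]

for label, alpes in (("top-down", top_down_alpes), ("bottom-up", bottom_up_alpes)):
    print(
        f"{label}: {share_meeting(alpes, 0.05):.0%} of counties meet the 5% criterion, "
        f"{share_meeting(alpes, 0.25):.0%} meet the 25% criterion"
    )
```

Because ALPE is signed, the sign of each value also shows whether AREX fell below or above the Census count. Only the criteria use the absolute value.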
16
Tract results: Total Population
More than 75% of tracts met the 5% criterion, except in Baltimore City; 95% met the 25% criterion.
[Figure: Tract ALPE distributions by county. Y-axis: percent of tracts, from 0% to 100%. Counties: Baltimore County, Baltimore City, Douglas County, El Paso County, Jefferson County. Series: 5% criterion, 25% criterion.]
Results are more stable in Colorado; highly variable in Baltimore County and City
~50% of AREX tract counts within 5% of Census in Baltimore City
17
Block Results: Total Population
18-39% of blocks met the 5% criterion; 85% met the 25% criterion.
[Figure: Block ALPE distributions by county. Y-axis: percent of blocks, from 0% to 100%. Counties: Baltimore County, Baltimore City, Douglas County, El Paso County, Jefferson County. Series: 5% criterion, 25% criterion.]
Results notably less stable at the block level
~25% of AREX block counts within 5% of Census in Baltimore City
18
Household Evaluation Questions
• Can we use administrative records data to contribute to census operations at the household level?
• Analytic questions:
  • Do AREX addresses (computer) link to different kinds of Census addresses at varying rates?
  • Do AREX addresses that (computer) link to Census addresses contain the same number of people?
  • Do AREX addresses that link to Census addresses contain similar demographic distributions (demographically match)?
  • Can we predict, using non-Census information, when an address will demographically match?
19
Link Rates
• How often did AREX addresses link to Census addresses?
  • 81.4% of census addresses linked
    • ~5% more “imperfectly” linked
    • higher (84.0%) in occupied
    • lower (46.4%) in vacant
20
Link Rates and Coverage: By NRFU Status
Coverage by AREX of Census housing units, by NRFU status*

| Type of Census housing unit | Total | Linked with AREX housing units |
|---|---|---|
| NRFU | 360,914 | 70.9% |
| non-NRFU | 716,450 | 88.4% |
| Occupied NRFU | 289,224 | 76.7% |
| Occupied non-NRFU | 715,115 | 88.5% |
| Vacant NRFU | 71,690 | 47.6% |
| Vacant non-NRFU | 1,335 | 58.7% |
21
Link Rates and Coverage: By Imputation Status
Coverage by AREX of Census housing units, by imputation status

| Type of Census housing unit | Total | Linked with AREX housing units |
|---|---|---|
| Imputed | 24,584 | 62.3% |
| Non-imputed | 1,067,876 | 81.9% |
| Imputed occupied | 23,811 | 63.2% |
| Non-imputed, occupied | 993,462 | 84.5% |
| Imputed vacant | 773 | 34.7% |
| Non-imputed, vacant | 74,414 | 46.5% |
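The subgroup rates in this table can be pooled, weighting each by its number of housing units, to recover the headline rates on the Link Rates slide (81.4% overall, 84.0% occupied, 46.4% vacant). The sketch below works from the rounded percentages in the table, so the pooled values are approximate. The names are illustrative.

```python
# (total housing units, link rate) copied from the imputation-status table.
cells = {
    ("imputed", "occupied"): (23_811, 0.632),
    ("imputed", "vacant"): (773, 0.347),
    ("non-imputed", "occupied"): (993_462, 0.845),
    ("non-imputed", "vacant"): (74_414, 0.465),
}


def pooled_rate(selected):
    """Unit-weighted average link rate over (total, rate) pairs."""
    total_units = sum(n for n, _ in selected)
    linked_units = sum(n * rate for n, rate in selected)
    return linked_units / total_units


overall = pooled_rate(cells.values())
occupied = pooled_rate([v for (_, occ), v in cells.items() if occ == "occupied"])
vacant = pooled_rate([v for (_, occ), v in cells.items() if occ == "vacant"])

print(f"overall {overall:.1%}, occupied {occupied:.1%}, vacant {vacant:.1%}")
```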
22
Number of People in Linked Addresses
• Do AREX addresses that link to Census addresses contain the same number of people?
  • Equal number: 51.1% of all linked addresses
  • Plus/minus one: 79.4% of all linked addresses
23
Matching Demographic Distributions in Linked Households
• How do the demographic properties of linked households compare?
• Demographic categories:
  • Sex
  • Race (4 groups, with Census multirace allocated)
  • Hispanic origin
  • Age (5 year categories and 0-17, 18-64, 65+)
  • ARSH: age, race, sex and Hispanic origin
24
Demographic Distributions: Overall
Comparisons between AREX and Census for demographic groups, for linked households with the same number of people only.

| HH Size | Total linked, of equal size | Equal for all sex groups | Equal for all race groups | Equal for all Hisp. groups | Equal for all 5-year age groups | Equal for age groups 0-17, 18-64, 65+ | Equal for all demographic groups |
|---|---|---|---|---|---|---|---|
| All sizes | 445,426 | 91.2% | 93.4% | 94.8% | 81.3% | 93.1% | 80.5% |

By household size (only one percentage per size appears in the transcript; its column is not identified):
• 1: 85.4%
• 2: 84.3%
• 3: 72.2%
• 4: 74.0%
• 5: 69.5%
• 6: 59.2%
• 7+: 28.7%
25
Prediction
• Predicting where an AREX household will be similar to a Census household
  • Goal: Use only information available prior to Census MO/MB and NRFU operations to predict which addresses will match accurately
• If we can predict well, then we can use that predictive model to tell us which addresses are good candidates for NRFU substitution or imputation
• Exploratory logistic regression model
26
Prediction: Simple Relationships
Difference of proportions
Address contains only persons 65 and older versus demographic match/nonmatch status.

| Demographic match status | All AREX persons age 65 or older? No | All AREX persons age 65 or older? Yes | Total |
|---|---|---|---|
| Nonmatch | 513,926 (66.56%) | 33,418 (28.43%) | 547,344 |
| Match | 258,150 (33.44%) | 84,144 (71.57%) | 342,294 |
| Total | 772,076 (86.79%) | 117,562 (13.21%) | 889,638 (100%) |
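The difference of proportions can be recomputed directly from the four cell counts. The sketch below also derives the unadjusted odds ratio from the same table. That ratio is a simple cross-tabulation figure and is not one of the adjusted odds ratios from the logistic regression model. The variable names are illustrative.

```python
# Cell counts copied from the 2x2 table (rows: match status, columns: all persons 65+?).
nonmatch_not_all_65 = 513_926
nonmatch_all_65 = 33_418
match_not_all_65 = 258_150
match_all_65 = 84_144

# Match rate within each column of the table.
match_rate_all_65 = match_all_65 / (match_all_65 + nonmatch_all_65)
match_rate_other = match_not_all_65 / (match_not_all_65 + nonmatch_not_all_65)

# Difference of proportions, expressed in percentage points.
difference_points = 100 * (match_rate_all_65 - match_rate_other)

# Unadjusted odds ratio from the cross-product of the 2x2 table.
odds_ratio = (match_all_65 * nonmatch_not_all_65) / (nonmatch_all_65 * match_not_all_65)

print(f"Match rate, all persons 65+: {match_rate_all_65:.2%}")
print(f"Match rate, other addresses: {match_rate_other:.2%}")
print(f"Difference: {difference_points:.1f} percentage points")
print(f"Unadjusted odds ratio: {odds_ratio:.2f}")
```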
27
Logistic Regression Model Results
• Estimated odds ratios
  • Single unit: 2.6
  • One or two persons in HH: 3.5
  • No AREX imputed race: 2.1
  • AREX one or more white: 2.1
  • All AREX 65 and older: 1.7
• Interaction effects:
  • Total effect of 65+, nonmulti, nonimputed: 5.2
  • Total effect of 65+, 1+ white, 1-2 persons: 19.2
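The slide does not give the model specification. One rough reading assumes a standard logit model, in which odds ratios multiply. It further assumes that "nonmulti" means the single-unit term and "nonimputed" the no-imputed-race term. Under those assumptions, the total effects can be compared with the product of the main-effect odds ratios to see what the interaction terms would have to contribute. The names below are illustrative.

```python
from math import prod

# Main-effect odds ratios copied from the slide.
main_effects = {
    "single_unit": 2.6,
    "one_or_two_persons": 3.5,
    "no_imputed_race": 2.1,
    "one_or_more_white": 2.1,
    "all_65_plus": 1.7,
}

# Total effects from the slide; the mapping to main-effect terms is an assumption.
total_effects = {
    ("all_65_plus", "single_unit", "no_imputed_race"): 5.2,
    ("all_65_plus", "one_or_more_white", "one_or_two_persons"): 19.2,
}

implied_interaction = {}
for terms, total in total_effects.items():
    # Product of main effects alone, i.e. the total effect with no interaction.
    main_only = prod(main_effects[t] for t in terms)
    # Multiplier the interaction terms would need to supply under this reading.
    implied_interaction[terms] = total / main_only
    print(f"{terms}: main effects alone {main_only:.2f}, "
          f"reported total {total}, implied interaction {implied_interaction[terms]:.2f}")
```

On this reading the first combination is dampened by its interaction terms (multiplier below 1) and the second is amplified (multiplier above 1).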
28
Conclusions, continued
• Top-down results are not sufficient for enumeration at the block level
• Bottom-up results were better
• Tract, block results differential and predictably less accurate
• AREX covered the universe of Census HUs well
• Comparisons of household size and demographic composition were relatively promising
• AREX and Census household level characteristics were less similar for NRFU HUs
• Open questions: Role of vacant addresses, quality of NRFU data.
29
Recommendations
• Continue development now
  • Build/train team; build acquisition relationships
• Improve race and Hispanic origin data
  • Underway: NUMIDENT race enhancement
  • Continue to update NUMIDENT with ACS (others?)
• Improve timeliness of AR data
  • Obtain IRS/other agencies’ data on a flow basis
  • SSA W-2’s
  • Birth/death data
  • Additional data sources
30
Recommendations
• Improve geographical representativeness of AREX 2010
• Implement experimental variation on key processing dimensions