Validating the Inputs Used to Create Geo-Referenced Census Data

Deirdre Dalpiaz Bishop
Chief, Geography Division
U.S. Census Bureau

United Nations Economic Commission for Europe
Conference of European Statisticians
Workshop on Population and Housing Censuses
Palais des Nations, Geneva
26-28 September 2018




The Decennial Census

Purpose: To conduct a census of population and housing and disseminate the results to the President, the States, and the American People.

Primary Uses of Decennial Census Data:
• Apportion representation among states.
• Draw congressional and state legislative districts, school districts, and voting precincts.
• Distribute federal dollars to states.
• Inform federal, tribal, state, and local government planning decisions.
• Inform business and nonprofit organization decisions.
• Provide a population benchmark for nearly every other United States survey.


The 2020 Census
A Complete and Accurate Count of the Population and Housing

Count everyone once, only once, and in the right place.

• ESTABLISH WHERE TO COUNT
• MOTIVATE PEOPLE TO RESPOND
• SELF-RESPONSE
• NONRESPONSE FOLLOWUP
• TABULATE DATA AND RELEASE CENSUS RESULTS


The 2020 Census
Establish Where to Count

Identify all addresses where people could live

• Minimize in-field work with in-office updating.
• Use imagery to review addresses in the office to substantially cut field work.
• Use multiple data sources to identify areas with address and spatial changes.
• Get local government input, e.g., through the Local Update of Census Addresses (LUCA).


The 2020 Census
A New Design for the 21st Century – Detecting Change


Detecting Change
United States Postal Service Delivery Sequence File

| Year | Number of DSF Residential Addresses | New DSF Residential Addresses | New DSF Residential Addresses Matched to the MAF (Number) | New DSF Residential Addresses Matched to the MAF (Percent) | New DSF Residential Addresses Added to the MAF (Number) | New DSF Residential Addresses Added to the MAF (Percent) |
| --- | --- | --- | --- | --- | --- | --- |
| 2018 | 128,586,346 | 248,054 | 42,991 | 17.3 | 205,063 | 82.7 |
| 2017 | 128,674,735 | 894,077 | 148,326 | 16.6 | 745,751 | 83.4 |
| 2016 | 127,228,165 | 1,681,773 | 745,131 | 44.3 | 936,642 | 55.7 |
| 2015 | 125,109,354 | 719,481 | 138,559 | 19.3 | 580,922 | 80.7 |
| 2014 | 124,093,239 | 1,074,855 | 223,033 | 20.7 | 851,822 | 79.3 |
| 2013 | 122,165,387 | 323,958 | 87,031 | 26.9 | 236,927 | 73.1 |
| 2012 | 122,319,744 | 626,504 | 183,387 | 29.3 | 443,117 | 70.7 |
| 2011 | 121,591,753 | 625,513 | 220,290 | 35.2 | 405,223 | 64.8 |
| 2010 | 121,209,943 | 873,434 | 420,416 | 48.1 | 453,018 | 51.9 |
| Total 2010-2018 | | 7,067,649 | 2,209,164 | 31.3 | 4,858,485 | 68.7 |
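The percentage columns follow directly from the counts: each new DSF address is either matched to an existing MAF record or added as a new one, so the two shares are complementary. A minimal Python sketch of that arithmetic, using three rows copied from the table (the function name `dsf_shares` is illustrative and not a Census Bureau tool):

```python
def dsf_shares(new_total, matched, added):
    """Return (matched %, added %) of new DSF addresses, rounded to one decimal."""
    if matched + added != new_total:
        raise ValueError("matched and added counts must sum to the new-address total")
    return (round(100 * matched / new_total, 1),
            round(100 * added / new_total, 1))

# (new DSF addresses, matched to MAF, added to MAF), copied from the table
rows = {
    "2017": (894_077, 148_326, 745_751),
    "2010": (873_434, 420_416, 453_018),
    "2010-2018": (7_067_649, 2_209_164, 4_858_485),
}

for label, counts in rows.items():
    matched_pct, added_pct = dsf_shares(*counts)
    print(label, matched_pct, added_pct)
```

The consistency check inside the function is a design choice for this sketch: it fails loudly if a row's matched and added counts do not partition the new-address total.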


Detecting Change
GSS Partner File Processing

| | Total Number of Records | Percent of Total Records |
| --- | --- | --- |
| Total Number of Records Received | 133,940,091 | 100% |
| Total Number of Records Accepted | 106,602,993 | 79.59% |
| Total Number of Records Matched to the MAF | 106,082,021 | 99.51% |
| Total Number of New Records Created | 520,972 | 0.49% |
| Total Number of Records Rejected | 27,337,098 | 20.41% |
| Records Received with a Distinct XY Coordinate | 125,646,340 | 93.81% |
| Total Number of New MSPs Created | 75,127,071 | 56.09% |
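The percentages in this table use two different denominators, which the arithmetic makes explicit: the accepted and rejected counts are shares of records received, while the matched and newly created counts are shares of records accepted. A short Python check using the counts above (variable and function names are illustrative):

```python
received = 133_940_091
accepted = 106_602_993
rejected = 27_337_098
matched_to_maf = 106_082_021
new_records = 520_972

def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to two decimals."""
    return round(100 * part / whole, 2)

# Accepted and rejected records partition the records received.
assert accepted + rejected == received
# Matched and newly created records partition the accepted records.
assert matched_to_maf + new_records == accepted

print(pct(accepted, received))        # share of received records that were accepted
print(pct(matched_to_maf, accepted))  # share of accepted records matched to the MAF
print(pct(new_records, accepted))     # share of accepted records that were new
```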


Detecting Change
Commercial File Processing

| Vendor | Usable Addresses | Number of Usable Addresses Matched to MAF Addresses | Percentage of Usable Addresses Matched to MAF Addresses |
| --- | --- | --- | --- |
| 1 | 120,270,430 | 119,529,128 | 99.4 |
| 2 | 102,313,410 | 95,822,185 | 93.6 |
| 3 | 152,581,321 | 148,730,349 | 97.5 |
| 4 | 98,037,776 | 90,919,679 | 92.7 |
| 5 | 111,040,589 | 109,148,391 | 98.3 |


Innovation – Validating Inputs
In-Office Address Canvassing – Initial Pass of the Nation

• The goal of In-Office Address Canvassing is to manage as much of the review, validation, and updating of the address list as possible in the office, allowing resources to be focused on areas in which fieldwork is necessary to assure a complete and accurate address list.
• 100 percent review in the office
• Started Execution: September 2015
• Completed Initial Pass of the Nation: June 8, 2017
• Status in June 2017: all 11,155,486 U.S. blocks reviewed

| Status | Block Counts | Percent of Blocks |
| --- | --- | --- |
| Active | 1,893,310 | 17.0% |
| Passive | 7,921,288 | 71.0% |
| On Hold | 1,340,888 | 12.0% |
| Total | 11,155,486 | 100% |


Innovation – Validating Inputs
In-Office Address Canvassing – Block Assessment, Research and Classification Application (BARCA)

[Figure: BARCA screen with a slide bar between two vintages of imagery]


Innovation – Validating Inputs
In-Office Address Canvassing – Identifying Stability

[Figure: 2008 imagery compared with current imagery]


Innovation – Validating Inputs
In-Office Address Canvassing – Under-coverage

Imagery review identifies a discrepancy between the MAF and the imagery; updates are clustered in a portion of the block.


Innovation – Validating Inputs
In-Office Address Canvassing – Current Status

| Status | Block Counts | Percent of Blocks |
| --- | --- | --- |
| Active | 1,703,731 | 15.3% |
| Passive | 8,898,238 | 79.8% |
| On Hold | 496,651 | 4.5% |
| Triggered | 56,866 | 0.4% |
| TOTAL | 11,155,486 | 100% |

Triggered Blocks:
• A trigger is an “event” that provides information and/or data suggesting the need to send a block, or area of blocks, back through Interactive Review (IR).

Triggers:
• To date, 37 trigger events have resulted in blocks returning to IR. For example:
o Ungeocoded trigger: previously ungeocoded addresses are geocoded to blocks, and any block whose number of addresses changes as a result returns to IR.
o Boundary and Annexation Survey (BAS) trigger: changes to a city’s boundary result in blocks returning to IR.
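The ungeocoded trigger described above can be read as a comparison of per-block address counts before and after a geocoding run. The sketch below is a hypothetical illustration of that rule, not the Census Bureau's implementation; the block identifiers, the status labels as Python strings, and the function name `apply_ungeocoded_trigger` are all assumptions.

```python
def apply_ungeocoded_trigger(status_by_block, counts_before, counts_after):
    """Mark blocks whose address count changed after geocoding as 'Triggered'.

    Returns a new status mapping; the input mapping is left unmodified.
    """
    updated = dict(status_by_block)
    for block_id, after in counts_after.items():
        before = counts_before.get(block_id, 0)
        if after != before:
            # A changed count sends the block back through Interactive Review.
            updated[block_id] = "Triggered"
    return updated

# Hypothetical blocks and address counts, for illustration only.
status = {"block-A": "Passive", "block-B": "Passive", "block-C": "Active"}
before = {"block-A": 40, "block-B": 12, "block-C": 7}
after = {"block-A": 40, "block-B": 15, "block-C": 7}

print(apply_ungeocoded_trigger(status, before, after))
```

Only `block-B` changes status here, because it is the only block whose address count differs after geocoding.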


Innovation – Validating Inputs
Local Update of Census Addresses

The Local Update of Census Addresses (LUCA) is the only opportunity offered to tribal, state, and local governments to review and comment on the Census Bureau's residential address list for their jurisdiction prior to the 2020 Census.

| Phase | Description |
| --- | --- |
| Registration and Review Materials | 11,550 entities have registered to participate in LUCA and require registration packages |
| LUCA Registration Coverage | 98.1% of the population and 98.1% of the housing covered by at least one LUCA participant; 98.8% of the population and 98.7% of the housing covered in tracts with the lowest response scores in the hardest-to-count areas |
| LUCA Submissions | 8,225 total LUCA responses received: 6,751 responses with changes and 1,474 responses with no changes |
| LUCA Processing | 2,914 entities processed; 431,616 potentially new addresses to Census |


Future Innovation – Validating Inputs
Change Detection

• DSF: Address inventory triggers from the DSF, using the enhanced Line of Travel (eLOT) to locate new addresses.
• Imagery & LiDAR: Remote sensing triggers for areas with new development, such as structures and roads.
• Data Mining & Integration: Integrate data mining from local partners; identify change in partner files through metadata.


Future Innovation – Validating Inputs
DSF-USPS eLOT®

| SEQUENCE | DSF_LOW_HN | DSF_STREET | DSF_SUBTYPE |
| --- | --- | --- | --- |
| 55532C05100220416 | 3900 | KIERAN | ST |
| 55532C05100220417 | 3912 | KIERAN | ST |
| 55532C05100220418 | 3916 | KIERAN | ST |
| 55532C05100220419 | 3920 | KIERAN | ST |
| 55532C05100220420 | 3930 | KIERAN | ST |
| 55532C05100220421 | 3934 | KIERAN | ST |
| 55532C05100220422 | 4000 | KIERAN | ST |
| 55532C05100220423 | 4008 | KIERAN | ST |
| 55532C05100220424 | 4010 | KIERAN | ST |
| 55532C05100220425 | 4022 | KIERAN | ST |
| 55532C05100220426 | 4026 | KIERAN | ST |
| 55532C05100220427 | 4034 | KIERAN | ST |
| 55532C05100220428 | 4036 | KIERAN | ST |


Future Innovation – Validating Inputs
DSF-USPS eLOT® – Uncovering New Development

Question: Where is 1 Hawks Nest Ln?

Answer: Somewhere between 19 Crimson King Dr. and 21 Crimson King Dr.
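The question and answer above rely on the fact that eLOT orders addresses along the carrier's line of travel, so an address missing from the MAF can be bracketed by the known addresses delivered just before and after it. A minimal Python sketch of that idea follows, assuming each DSF record carries a sortable sequence value and that MAF membership is available as a set. The sequence numbers and the function name `bracket_unknown_addresses` are made up for illustration.

```python
def bracket_unknown_addresses(dsf_records, maf_addresses):
    """For each DSF address missing from the MAF, find its nearest known neighbors.

    dsf_records: iterable of (sequence, address) tuples.
    maf_addresses: set of addresses already in the MAF.
    Returns {unknown_address: (previous_known, next_known)}; a neighbor is
    None when no known address exists on that side.
    """
    ordered = [addr for _, addr in sorted(dsf_records)]
    brackets = {}
    for i, addr in enumerate(ordered):
        if addr in maf_addresses:
            continue
        prev_known = next((a for a in reversed(ordered[:i]) if a in maf_addresses), None)
        next_known = next((a for a in ordered[i + 1:] if a in maf_addresses), None)
        brackets[addr] = (prev_known, next_known)
    return brackets

# Hypothetical sequence values; the street addresses come from the slide.
dsf = [
    (101, "19 CRIMSON KING DR"),
    (102, "1 HAWKS NEST LN"),
    (103, "21 CRIMSON KING DR"),
]
maf = {"19 CRIMSON KING DR", "21 CRIMSON KING DR"}

print(bracket_unknown_addresses(dsf, maf))
```

The result narrows the search area for the new address to the stretch of road between its two known neighbors, which is the kind of location hint the slide describes.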


Future Innovation – Validating Inputs
Imagery - LiDAR + Imagery = Change Detection

[Figure panels: 2011 NAIP Imagery; 2016 NAIP Imagery; 2016 LiDAR Point Cloud]

Building footprints created and compared to address database.
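Comparing derived building footprints with the address database amounts to asking which footprints contain no known address point. The following Python sketch illustrates that comparison with a standard ray-casting point-in-polygon test. The coordinates, identifiers, and function names are hypothetical, and a production workflow would more likely rely on a GIS library.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if `point` lies inside `polygon` (list of (x, y))."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def footprints_without_addresses(footprints, address_points):
    """Return ids of footprints that contain no address point."""
    return [
        fid for fid, poly in footprints.items()
        if not any(point_in_polygon(p, poly) for p in address_points)
    ]

# Hypothetical footprints (unit squares) and address coordinates.
footprints = {
    "bldg-1": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "bldg-2": [(3, 0), (4, 0), (4, 1), (3, 1)],
}
addresses = [(0.5, 0.5)]

print(footprints_without_addresses(footprints, addresses))
```

A footprint returned by this check is a candidate for new development: a structure visible in the imagery or LiDAR with no corresponding address record.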


Future Innovation – Validating Inputs
Bringing it All Together: Data Integration

• Address with Coordinates
• Parcel linked to Address via coordinates
• Address from Parcel is transferred to Structure

Title 13
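The three items above form a chain: an address point falls inside a parcel, and the parcel's address is then carried over to the structure on that parcel. A hedged Python sketch of that chain follows. It simplifies parcels to axis-aligned bounding boxes and structures to centroid points, and every identifier and coordinate is invented for illustration.

```python
def in_box(point, box):
    """True if (x, y) lies within box = (min_x, min_y, max_x, max_y)."""
    x, y = point
    min_x, min_y, max_x, max_y = box
    return min_x <= x <= max_x and min_y <= y <= max_y

def transfer_addresses(addresses, parcels, structures):
    """Link address -> parcel -> structure and return {structure_id: address}."""
    # Attach each address to the first parcel containing its coordinates.
    parcel_address = {}
    for address, point in addresses.items():
        for parcel_id, box in parcels.items():
            if in_box(point, box):
                parcel_address[parcel_id] = address
                break
    # Pass the parcel's address to each structure located on that parcel.
    structure_address = {}
    for structure_id, centroid in structures.items():
        for parcel_id, box in parcels.items():
            if parcel_id in parcel_address and in_box(centroid, box):
                structure_address[structure_id] = parcel_address[parcel_id]
                break
    return structure_address

# Hypothetical inputs.
addresses = {"12 EXAMPLE RD": (2.0, 2.0)}
parcels = {"parcel-1": (0, 0, 5, 5), "parcel-2": (6, 0, 10, 5)}
structures = {"structure-a": (3.0, 1.0), "structure-b": (8.0, 2.0)}

print(transfer_addresses(addresses, parcels, structures))
```

In this sketch `structure-b` receives no address because its parcel has none linked to it. Parcels with several addresses would need a richer mapping than the one-address-per-parcel dictionary assumed here.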


Future Innovation – Validating Inputs
Gold Plate

Geospatial Data Quality Integration

Title 13


Future Innovation – Validating Inputs
The National Address Database


Thank You!
