Comparisons of Synthetic Populations Generated From Census 2000 and American Community Survey (ACS) Public Use Microdata Sample (PUMS)
13th TRB Application Conference, Reno, NV
May 11th, 2011
Wu Sun, Clint Daniels & Ziying Ouyang, SANDAG
Peter Vovsha & Joel Freedman, PB Americas
Presentation Outline
Project Background
SANDAG PopSyn
– Features
– Scenarios
– Methodology
– Geographies
– Key steps
– Control variables
Data Sources
Validations
Results Analysis
Conclusions
Project Background
SANDAG & SANDAG Travel Models
SANDAG PopSyn & ABM
– What is a PopSyn?
– What role does a PopSyn play in an ABM?
SANDAG PopSyn Development
PopSyn I
• Based on Atlanta PopSyn
• Updated controls and programming
• No person level controls
PopSyn II
PopSyn II Features
Formulated as an entropy-maximization problem
Balances person and household controls simultaneously
Applicable to both Census 2000 and ACS data
Updated household weight discretizing step
Added household allocation from TAZ to small geography
Database-driven and object-oriented design (OOD)
PopSyn Scenarios
Year 2000 PopSyn
Year 2008 PopSyn
Future year PopSyn(s)
Timeline: 2000 Census base year; 2008 ACS base year; future years 2010 to 2050
Methodology
An entropy-maximization problem by Peter Vovsha (objective function not recovered from the slide image)
Subject to constraints (constraint equations not recovered from the slide image)
Where
– i = 1, 2, …, I: household and person controls
– Set of households in the PUMA
– A priori weights assigned in the PUMA
– Zonal controls
– αi: coefficients of contribution of household to each control
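The equations on this slide did not survive as text. A common way to write an entropy-maximization (list-balancing) problem that fits the definitions listed on the slide is sketched below. The symbols $x_n$ (balanced weight of household $n$), $w_n$ (a priori weight), $N$ (set of households in the PUMA), $A_i$ (zonal control) and the household superscript on $\alpha_i^n$ are assumed notation, not taken from the slide. Minimizing the relative-entropy objective is the usual equivalent of maximizing entropy relative to the a priori weights.

```latex
\min_{x_n \ge 0} \; \sum_{n \in N} x_n \ln\frac{x_n}{w_n}
\quad \text{subject to} \quad
\sum_{n \in N} \alpha_i^{n} \, x_n = A_i , \qquad i = 1, 2, \ldots, I
```

Under these assumptions, setting the derivative of the Lagrangian to zero gives weights of the form $x_n = w_n \exp\!\big(\sum_i \lambda_i \alpha_i^{n} - 1\big)$, where $\lambda_i$ is the multiplier on control $i$. This may be what the deck means by a closed form formulation.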
PopSyn Geographies
MGRA (33,000)
TAZ (4,605)
PUMA (16)
SANDAG PopSyn Key Steps
Create Sample HHs
Balance HH Weights
Discretize HH Weights
Allocate HHs
Validate PopSyn
Create control targets
Create validation measures
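The "Balance HH Weights" and "Discretize HH Weights" steps can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the four sample households, the a priori weights, and the targets are made up, the balancing loop is a generic iterative scaling scheme rather than the SANDAG implementation, and largest-remainder rounding stands in for the deck's "updated discretizing step", whose details are not given.

```python
import math

# Hypothetical sample households: each row lists a household's contribution
# to the controls [size-1 households, size-2+ households, persons].
incidence = [
    [1, 0, 1],  # one-person household
    [0, 1, 2],  # two-person household
    [0, 1, 3],  # three-person household
    [0, 1, 5],  # five-person household
]
initial_weights = [25.0, 25.0, 25.0, 25.0]  # made-up a priori weights
targets = [30.0, 70.0, 230.0]               # made-up zonal controls


def solve_factor(weights, column, target):
    """Bisect on log(g) so that sum(a * x * g**a) over households equals target."""
    lo, hi = -10.0, 10.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        g = math.exp(mid)
        total = sum(a * x * g ** a for a, x in zip(column, weights))
        if total < target:
            lo = mid
        else:
            hi = mid
    return math.exp((lo + hi) / 2.0)


def balance(weights, incidence, targets, sweeps=200):
    """Cycle through the controls, rescaling weights until every target is met."""
    x = list(weights)
    for _ in range(sweeps):
        for i, target in enumerate(targets):
            column = [row[i] for row in incidence]
            g = solve_factor(x, column, target)
            # Households contributing more to control i are scaled more strongly.
            x = [xn * g ** a for xn, a in zip(x, column)]
    return x


def discretize(weights):
    """Largest-remainder rounding that preserves the rounded household total."""
    floors = [math.floor(w) for w in weights]
    shortfall = round(sum(weights)) - sum(floors)
    by_fraction = sorted(range(len(weights)),
                         key=lambda n: weights[n] - floors[n], reverse=True)
    for n in by_fraction[:shortfall]:
        floors[n] += 1
    return floors


balanced = balance(initial_weights, incidence, targets)
integer_weights = discretize(balanced)
fitted = [sum(row[i] * x for row, x in zip(incidence, balanced))
          for i in range(len(targets))]
print("balanced weights:", [round(x, 2) for x in balanced])
print("fitted controls:", [round(f, 2) for f in fitted])
print("integer weights:", integer_weights)
```

Because each update multiplies a weight by a factor raised to the household's contribution, household and person controls are handled in the same loop, and the balanced weights keep the multiplicative form associated with entropy-maximizing solutions.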
Control Variables
Household level controls
– Household size (1, 2, 3, 4+)
– Household income (5 categories)
– Number of workers per household (0, 1, 2, 3+)
– Number of children in household (0, 1+)
– Dwelling unit type (3 categories)
– Group quarter status (4 categories)
Person level controls
– Age (7 categories)
– Gender (2 categories)
– Race (8 categories)
Data Sources
Census and ACS PUMS
– Household and person level microdata
Census and ACS summary data
– Source for base year control targets
– Source for base year validation data
SANDAG estimates and forecasts
– Source for future year control targets
ACS vs. Census

| | ACS | Census |
|---|---|---|
| Frequency | Every year | Every 10 years |
| Data collected | Both SF1 and SF3 data | SF1: number of people, age, race, gender, etc.; SF3: income, education, disability status, etc. |
| Estimates | Period estimates | "Point-in-time" estimates |
| Sample size | 1 in 40 households | Short form (SF1): 100% count; long form (SF3): 1 in 6 households |
| PUMS | 1-year PUMS: 1%; 3-year PUMS: 3%; 5-year PUMS: 5% | 5% sample |
Why ACS?
Advantages
• Timeliness: a new set of data every year for areas that are large enough (population > 65,000).
Disadvantages
• Based on a smaller sample, with increased error compared with the decennial Census.
• ‘Period estimates’ vs. ‘point in time’: which year does the ACS PUMS data represent?
Validations
Objectives
– Compare PopSyn against Census or ACS
Number of validation measures
– Year 2000: 96
– Year 2008: 86
Variables used as universes
– Number of households
– Number of persons
Controlled variables
Non-controlled variables
Validation Statistics
Mean percentage difference
Standard deviations
Absolute values vs. percentage values
Geography: PUMA
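One plausible reading of these statistics is sketched below: compute the percentage difference between the synthetic and observed value in each PUMA, then take the mean and standard deviation of those differences across PUMAs. The slide does not give the exact formulas, so the definition, the choice of the population standard deviation, and the four PUMA counts are all assumptions for illustration.

```python
from statistics import mean, pstdev

# Made-up household counts for four PUMAs, for illustration only.
popsyn_counts = [980, 1010, 495, 2040]
census_counts = [1000, 1000, 500, 2000]


def percentage_differences(synthetic, observed):
    """Percentage difference of the synthetic value from the observed value, per PUMA."""
    return [100.0 * (s - o) / o for s, o in zip(synthetic, observed)]


diffs = percentage_differences(popsyn_counts, census_counts)
mean_diff = mean(diffs)   # average signed difference across PUMAs
std_dev = pstdev(diffs)   # spread of the differences across PUMAs
print(f"mean diff = {mean_diff:.1f}%, standard dev. = {std_dev:.1f}%")
```

Signed differences can cancel across PUMAs, so a mean near zero does not by itself indicate a close fit. Reporting the standard deviation alongside the mean shows how much individual PUMAs deviate.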
Results
Allocated Household Table: HHID, HH Serial #, GeoType, GeoZone, Version, SourceID, …
PUMS Household Table: HH Serial #, PUMA, Attributes
PUMS Person Table: PerID, HH Serial #, Attributes
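The three tables appear to be linked through the HH Serial # key, so a synthetic person list can be obtained by joining allocated households back to the PUMS person records. The sketch below uses an in-memory SQLite database as a stand-in (the deck does not name the database engine). The column names, the `hh_size` and `age` attributes, and all rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pums_household (hh_serial INTEGER PRIMARY KEY, puma INTEGER, hh_size INTEGER);
CREATE TABLE pums_person (per_id INTEGER PRIMARY KEY, hh_serial INTEGER, age INTEGER);
CREATE TABLE allocated_household (
    hhid INTEGER PRIMARY KEY, hh_serial INTEGER, geo_type TEXT,
    geo_zone INTEGER, version INTEGER, source_id INTEGER);
""")

# Two sample PUMS households: a one-person and a two-person household.
conn.executemany("INSERT INTO pums_household VALUES (?, ?, ?)",
                 [(101, 1, 1), (102, 1, 2)])
conn.executemany("INSERT INTO pums_person VALUES (?, ?, ?)",
                 [(1, 101, 34), (2, 102, 41), (3, 102, 9)])
# Household 102 is drawn twice and allocated to two different zones.
conn.executemany("INSERT INTO allocated_household VALUES (?, ?, ?, ?, ?, ?)",
                 [(1, 101, "MGRA", 5001, 1, 1),
                  (2, 102, "MGRA", 5001, 1, 1),
                  (3, 102, "MGRA", 5002, 1, 1)])

# Expand each allocated household into its persons via the shared serial number.
synthetic_persons = conn.execute("""
    SELECT a.hhid, a.geo_zone, p.per_id, p.age
    FROM allocated_household AS a
    JOIN pums_person AS p ON p.hh_serial = a.hh_serial
    ORDER BY a.hhid, p.per_id
""").fetchall()

for row in synthetic_persons:
    print(row)
```

Storing only the serial number and geography in the allocated table keeps it small, because household and person attributes are looked up from the PUMS tables when needed.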
Results-Validation Excerpt

| Label | Description | PopSyn | Census | Mean Diff. | Standard Dev. |
|---|---|---|---|---|---|
| 1 | number of HHs | 985938 | 992681 | -0.6% | 0.9% |
| 6 | size 1 | 24.2% | 24.2% | -0.4% | 1.5% |
| 7 | size 2 | 32.3% | 32.0% | 0.8% | 1.0% |
| 8 | size 3 | 15.9% | 16.1% | -1.8% | 2.0% |
| 9 | size 4 | 27.7% | 27.7% | -0.7% | 3.3% |
Census 2000 Population Density
Results-Examples (I)
Results-Examples (II)
Results-Examples (III)
Results-Examples (IV)
Results-Household Characteristics
Results-Person Characteristics
Results-Summary (I)

| Mean Diff. Range by PUMA | Census 2000 | ACS 2005-2009 |
|---|---|---|
| >-2% & <2% | 40/96 | 28/86 |
| >-5% & <5% | 59/96 | 50/86 |
| >-10% & <10% | 78/96 | 67/86 |
| >-20% & <20% | 87/96 | 84/86 |
Results-Summary (II)
ACS-Based vs. Census-Based PopSyn(s)
– Both produced acceptable results
– Census PopSyn performed better than ACS PopSyn in validation measures
– Consistency between targets and validation data
• Census PopSyn: both from Census summary
• ACS PopSyn: targets from estimates, validation data from ACS summary
– Target accuracy at small geography is the key
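The comparison is easier to read once the counts from the Results-Summary (I) slide are converted to shares, since the two runs use different numbers of validation measures (96 and 86). The short sketch below does that arithmetic. The counts are copied from that slide, and the variable names are illustrative.

```python
# Counts from the Results-Summary (I) slide: measures whose mean difference by
# PUMA falls within each band, out of 96 (Census 2000) and 86 (ACS 2005-2009).
TOTAL_CENSUS, TOTAL_ACS = 96, 86
within_band = {2: (40, 28), 5: (59, 50), 10: (78, 67), 20: (87, 84)}

shares = {
    band: (100.0 * census / TOTAL_CENSUS, 100.0 * acs / TOTAL_ACS)
    for band, (census, acs) in within_band.items()
}
for band, (census_share, acs_share) in shares.items():
    print(f"within +/-{band}%: Census {census_share:.1f}%, ACS {acs_share:.1f}%")
```

On these counts the Census-based run has the larger share of measures inside the 2%, 5%, and 10% bands, while the ACS-based run has the larger share inside the 20% band.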
Results-Software Performance
Test environment
– Dell Intel Xeon PC with dual 2.69 GHz processors and 3.5 GB of RAM
Performance

| | Year 2000 | Year 2008 |
|---|---|---|
| Runtime | 11.8 min | 14.1 min |
| SynPop Pop | 2.77 mil | 2.95 mil |
| SynPop HHs | 0.99 mil | 1.05 mil |
Issues and Future Work
Issues
– Consistency of various geographies
• Census/ACS geography
• Transportation modeling geography
• Land use modeling geography
– Accuracy of land use estimates and forecasts at small geographies
Future Work
– Add worker occupations as controls
– Improve control target accuracy
– Automate control target generation
Conclusions
Closed form formulation provides a sound theoretical basis
Balances household and person controls simultaneously
Applicable to both ACS and Census data
An early application using 2009 ACS 5-year data
Database-driven and OOD make the software easy to maintain, expand, and transfer
Acknowledgements
The authors thank SANDAG staff Daniel Flyte, Ed Schafer, and Eddie Janowicz for their help in this project, especially in providing control target data.
Questions & Contacts
Questions?
Contacts
– Wu Sun: [email protected]
– Ziying Ouyang: [email protected]
– Clint Daniels: [email protected]