TRANSCRIPT
Background
Data harmonization
Data output
Web: Variable documentation system
Web: Data extract system
IPUMS Dissemination System
Variable Harmonization
MARST Marital Status
code  label                    CN82A403         CO73A411            KN89A413      MX70A402               US90A425
100   SINGLE/NEVER MARRIED     1=never married  4=single            1=single      9=single               6=never married
200   MARRIED/IN UNION         -                -                   -             -                      -
210   Married (not specified)  2=married        2=married           3=monogamous  -                      1=married
211   Civil                    -                -                   -             3=only civil           -
212   Religious                -                -                   -             4=only religious       -
213   Civil and religious      -                -                   -             2=civil and religious  -
214   Polygamous               -                -                   3=polygamous  -                      -
220   Consensual union         -                1=free union        -             5=free union           -
300   SEPARATED/DIVORCED       -                3=sep. or divorced  -             -                      -
310   Separated                -                -                   6=separated   8=separated            3=separated
321   Legally separated        -                -                   -             -                      -
322   De facto separated       -                -                   -             -                      -
330   Divorced                 4=divorced       -                   5=divorced    7=divorced             4=divorced
400   WIDOWED                  3=widowed        5=widowed           4=widowed     6=widowed              5=widowed
999   UNKNOWN/MISSING          0=missing        6=unknown           B=blank       1=unknown              -
China 1982
Colombia 1973
Kenya 1989
Mexico 1970
U.S.A. 1990
(Marital status)
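The MARST crosswalk can be sketched as one lookup table per source variable. This is a minimal illustration, not the IPUMS implementation: the names `MARST_CROSSWALK`, `harmonize_marst`, and `general_category` are hypothetical, only three of the five samples are shown, the assignment of source codes to samples is read from the table, and sending unmapped codes to 999 is an assumption.

```python
# Hypothetical crosswalk: source code -> harmonized MARST code, per source variable.
# Values are read from the MARST table; all names here are illustrative only.
MARST_CROSSWALK = {
    "CN82A403": {"1": 100, "2": 210, "3": 400, "4": 330, "0": 999},
    "MX70A402": {"9": 100, "2": 213, "3": 211, "4": 212, "5": 220,
                 "8": 310, "7": 330, "6": 400, "1": 999},
    "US90A425": {"6": 100, "1": 210, "3": 310, "4": 330, "5": 400},
}

UNKNOWN = 999


def harmonize_marst(source_variable: str, source_code: str) -> int:
    """Return the harmonized code; unmapped codes become 999 (an assumption)."""
    return MARST_CROSSWALK[source_variable].get(source_code, UNKNOWN)


def general_category(harmonized_code: int) -> int:
    """Collapse a detailed code to its general category (the first digit)."""
    if harmonized_code == UNKNOWN:
        return UNKNOWN
    return (harmonized_code // 100) * 100


print(harmonize_marst("MX70A402", "3"))  # Mexico 1970, "only civil"
print(general_category(211))             # general "married/in union" group
```

Keeping the detail in the trailing digits lets a researcher compare samples at the general level (100, 200, 300, 400) while retaining distinctions, such as civil versus religious marriage, where a sample records them.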
IPUMS Microdata
Home Ownership
Relation to Head
Age
Marital Status
Occupation
Data extract
3. Submit extract
19212023311
19212023311
19211212211
19211214400
17051612211
17051212211
17051223310
17051214400
03241214400
03242014400
03242023310
03242013310
Pooled Data Extracts
sample water sex education
Argentina 2001: 3.6 million
Chile 2002: 1.5 million
Cuba 2002: 1.1 million
Extract Engine
Argentina 2001
Chile 2002
Cuba 2002
Water supply
Sex
Education
1. Select samples
2. Select variables
1 dataset
3 censuses
4 variables
6.2 million records
Harmonized codes
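The pooling step (several censuses in, one dataset out) can be sketched as follows. This is an illustrative sketch rather than the actual extract engine: `pool_extract` is a hypothetical name, and the records are tiny made-up rows standing in for millions of real ones.

```python
from typing import Any, Dict, Iterable, List

Record = Dict[str, Any]


def pool_extract(samples: Dict[str, Iterable[Record]],
                 variables: List[str]) -> List[Record]:
    """Stack records from every sample, keeping only the requested variables."""
    pooled = []
    for sample_name, records in samples.items():
        for record in records:
            row = {"sample": sample_name}
            for var in variables:
                row[var] = record.get(var)  # None if absent in this sample
            pooled.append(row)
    return pooled


# Made-up records for illustration; codes are assumed to be harmonized already.
samples = {
    "Argentina 2001": [{"water": 1, "sex": 2, "education": 3, "age": 40}],
    "Chile 2002": [{"water": 2, "sex": 1, "education": 1, "age": 12}],
    "Cuba 2002": [{"water": 1, "sex": 1, "education": 2, "age": 67}],
}
extract = pool_extract(samples, ["water", "sex", "education"])
print(len(extract), sorted(extract[0]))
```

Each pooled row carries four fields (sample, water, sex, education), and stacking is only meaningful because the codes are harmonized across samples.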
Q: How can we give researchers the information they need without overwhelming them?
Q: How can we best encourage comparative research?
A: Organize information by variable, not sample
A: Ability to filter out unnecessary information
A: Access to full detail when that is desired
Variable Documentation System
1. Exploring the Database
Variables Page
159 samples
Sample Filtering
Variables Page – Filtered
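Sample filtering can be thought of as restricting the variable list to what exists in the samples a user has selected. The sketch below assumes a simple availability table; `AVAILABILITY`, `filter_variables`, and every variable and sample name in it are placeholders, not the real IPUMS inventory.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical availability table: variable -> samples that include it.
AVAILABILITY: Dict[str, Set[str]] = {
    "VAR_A": {"sample_1", "sample_2", "sample_3"},
    "VAR_B": {"sample_2"},
    "VAR_C": {"sample_3"},
}


def filter_variables(availability: Dict[str, Set[str]],
                     selected_samples: Iterable[str]) -> Dict[str, List[str]]:
    """Keep only variables present in at least one selected sample."""
    selected = set(selected_samples)
    filtered = {}
    for variable, samples in availability.items():
        present = sorted(samples & selected)
        if present:
            filtered[variable] = present
    return filtered


print(filter_variables(AVAILABILITY, ["sample_1", "sample_2"]))
```

With only two samples selected, the placeholder `VAR_C` drops out of the list, which is the kind of reduction the filtered page is meant to give.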
2. Variables – Codes
Variable Codes (Marital status)
3. Variable Descriptions
Variable Description (Marital status)
Comparability Discussion (Marital status)
4. Variables – Deep Documentation
Enumeration Text
Enumeration Text (Marital status, Cambodia)
Variable Description (Unharmonized source variables)
Unharmonized Variables (Source data for marital status)
Make it easy to get only the variables and samples that a user needs.
Pool the data across time and countries.
Provide tools to help users manage the size of the data.
Provide advanced features to empower researchers to do new kinds of research.
Data Extract System
Extract – Select Samples
Extract – Select Variables
1. Case selection
2. Customized sample size
3. Attached characteristics
4. Extract revisions
Advanced Extract Features
Case Selection
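Case selection amounts to keeping only the records that satisfy a user-defined condition. A minimal sketch, assuming person records held as dictionaries; `select_cases` is a hypothetical helper, and the rows are borrowed from the household example in this presentation.

```python
from typing import Any, Callable, Dict, List

Person = Dict[str, Any]

# A few rows from the presentation's household example.
ROSTER: List[Person] = [
    {"pernum": 1, "age": 53, "sex": "female", "marst": "separated"},
    {"pernum": 5, "age": 25, "sex": "female", "marst": "married"},
    {"pernum": 6, "age": 28, "sex": "male", "marst": "married"},
    {"pernum": 9, "age": 32, "sex": "female", "marst": "separated"},
    {"pernum": 11, "age": 5, "sex": "female", "marst": "single"},
]


def select_cases(records: List[Person],
                 keep: Callable[[Person], bool]) -> List[Person]:
    """Return only the records for which the selection rule is true."""
    return [record for record in records if keep(record)]


# Example rule (an assumption for illustration): women aged 15 or older.
adult_women = select_cases(
    ROSTER, lambda p: p["sex"] == "female" and p["age"] >= 15
)
print([p["pernum"] for p in adult_women])
```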
Customize Sample Sizes
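One way to shrink an extract is to draw a random subset of households and keep every person in them. The sketch below is an assumption about how such a feature could work, not a description of the actual system: `subsample_households` and the `serial` household identifier are hypothetical names.

```python
import random
from typing import Any, Dict, List

Person = Dict[str, Any]


def subsample_households(records: List[Person], n_households: int,
                         seed: int = 0) -> List[Person]:
    """Keep all persons from a random draw of n_households households."""
    household_ids = sorted({r["serial"] for r in records})
    rng = random.Random(seed)  # fixed seed makes the draw repeatable
    k = min(n_households, len(household_ids))
    keep = set(rng.sample(household_ids, k))
    return [r for r in records if r["serial"] in keep]


# Made-up data: 10 households with 3 persons each.
records = [{"serial": h, "pernum": p} for h in range(1, 11) for p in (1, 2, 3)]
smaller = subsample_households(records, n_households=4)
print(len(smaller), len({r["serial"] for r in smaller}))
```

Sampling whole households rather than individual persons keeps household members together, which matters for any analysis that relates people within a household.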
Pernum  Relationship  Age  Sex     Marst      Chborn  Spouse's Location  Father's Location  Mother's Location
1       head          53   female  separated  6       0                  0                  0
2       child         28   male    single     n/a     0                  0                  1
3       child         22   male    single     n/a     0                  0                  1
4       child         21   male    single     n/a     0                  0                  1
5       child         25   female  married    2       6                  0                  1
6       child-in-law  28   male    married    n/a     5                  0                  0
7       grandchild    3    male    single     n/a     0                  6                  5
8       grandchild    1    male    single     n/a     0                  6                  5
9       non-relative  32   female  separated  2       0                  0                  0
10      non-relative  10   male    single     n/a     0                  0                  9
11      non-relative  5    female  single     n/a     0                  0                  9
Constructed “Pointer” Variables
Attached Characteristics
Age of spouse
Employment status of father
Occupation of father
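An attached characteristic can be computed by following a pointer variable to another person in the same household and copying one of that person's values. The sketch below uses the presentation's household example; the field names `sploc` and `poploc`, the helper `attach`, and the output names `age_sp` and `age_pop` are assumptions for illustration. Age stands in for the father's employment status and occupation, which the example household does not record.

```python
from typing import Any, Dict, List

Person = Dict[str, Any]

# Household example: person number, age, and spouse/father location pointers.
# A pointer of 0 means no such person is present in the household.
ROSTER: List[Person] = [
    {"pernum": 1, "age": 53, "sploc": 0, "poploc": 0},
    {"pernum": 2, "age": 28, "sploc": 0, "poploc": 0},
    {"pernum": 3, "age": 22, "sploc": 0, "poploc": 0},
    {"pernum": 4, "age": 21, "sploc": 0, "poploc": 0},
    {"pernum": 5, "age": 25, "sploc": 6, "poploc": 0},
    {"pernum": 6, "age": 28, "sploc": 5, "poploc": 0},
    {"pernum": 7, "age": 3, "sploc": 0, "poploc": 6},
    {"pernum": 8, "age": 1, "sploc": 0, "poploc": 6},
    {"pernum": 9, "age": 32, "sploc": 0, "poploc": 0},
    {"pernum": 10, "age": 10, "sploc": 0, "poploc": 0},
    {"pernum": 11, "age": 5, "sploc": 0, "poploc": 0},
]


def attach(roster: List[Person], pointer: str, variable: str,
           new_name: str) -> List[Person]:
    """Copy `variable` from the person each pointer refers to."""
    by_pernum = {p["pernum"]: p for p in roster}
    attached = []
    for person in roster:
        target = by_pernum.get(person[pointer])  # None when the pointer is 0
        row = dict(person)
        row[new_name] = target[variable] if target else None
        attached.append(row)
    return attached


with_spouse_age = attach(ROSTER, "sploc", "age", "age_sp")
with_father_age = attach(ROSTER, "poploc", "age", "age_pop")
print(with_spouse_age[4]["age_sp"], with_father_age[6]["age_pop"])
```

Because the pointers are precomputed, attaching any characteristic of a spouse or parent reduces to a lookup within the household.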
Attached Characteristics
Download or Revise Extract