2016 ce survey microdata users’ workshop sample design …jul 14, 2016  · sample design and...

37
1 U.S. BUREAU OF LABOR STATISTICS bls.gov 2016 CE Survey Microdata Users’ Workshop Sample Design and Weights Brian T. Nix Mathematical Statistician Division of Price Statistical Methods Bureau of Labor Statistics U.S. Department of Labor July 14, 2016

Upload: others

Post on 07-Oct-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 2016 CE Survey Microdata Users’ Workshop Sample Design …Jul 14, 2016  · Sample Design and Weights Brian T. Nix Mathematical Statistician. Division of Price Statistical Methods

1 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

2016 CE Survey MicrodataUsers’ Workshop

Sample Design and Weights

Brian T. NixMathematical Statistician

Division of Price Statistical MethodsBureau of Labor StatisticsU.S. Department of Labor

July 14, 2016

Page 2: 2016 CE Survey Microdata Users’ Workshop Sample Design …Jul 14, 2016  · Sample Design and Weights Brian T. Nix Mathematical Statistician. Division of Price Statistical Methods

2 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

Overview

• History and Concepts• Sample Selection

• Define PSUs• Stratify and select a Sample of PSUs• Stratify and Select a Sample of Households• The Future (2010 Census-based Sample Design)

• Weighting the households (CUs)

2

Page 3: 2016 CE Survey Microdata Users’ Workshop Sample Design …Jul 14, 2016  · Sample Design and Weights Brian T. Nix Mathematical Statistician. Division of Price Statistical Methods

3 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

History of Sample Redesigns

• New sample of geographic areas and addresses selected every decade (2010)

• 1980 Census-Based Sample Design (1986–1995)

• 1990 Census-Based Sample Design (1995–2004)

• 2000 Census-Based Sample Design (2005–2014)

• 2010 Census-Based Sample Design (2015–2024?)

3

Page 4: 2016 CE Survey Microdata Users’ Workshop Sample Design …Jul 14, 2016  · Sample Design and Weights Brian T. Nix Mathematical Statistician. Division of Price Statistical Methods

4 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

Concepts Target Population=U.S. non-institutional

civilian population Consumer Unit

person or a group of persons in a household related by blood, marriage, adoption, or other legal arrangements

OR are unrelated but pool their incomes to make joint expenditure decisionsSame as households approximately 98% of time

4

Page 5: 2016 CE Survey Microdata Users’ Workshop Sample Design …Jul 14, 2016  · Sample Design and Weights Brian T. Nix Mathematical Statistician. Division of Price Statistical Methods

5 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

Concepts (continued) Sampling Frame – List of Households from

which we draw our sample Unit Frame: Regular households (80%) Area Frame: Rural households (10%) Permit Frame: New construction (9%) Group Quarters: (1%)

Will Change to Census Bureau’s Master Address File (MAF) in 2015 (2010 Census with updates twice per year by US Postal Service)

5

Page 6: 2016 CE Survey Microdata Users’ Workshop Sample Design …Jul 14, 2016  · Sample Design and Weights Brian T. Nix Mathematical Statistician. Division of Price Statistical Methods

6 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

Sample Selection – Overview

• Geographic areas are randomly selected to represent the total U.S.

• Households are randomly selected to represent the geographic areas

• Guiding principle:“Randomness ensures representativeness.”

6

Page 7: 2016 CE Survey Microdata Users’ Workshop Sample Design …Jul 14, 2016  · Sample Design and Weights Brian T. Nix Mathematical Statistician. Division of Price Statistical Methods

7 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

PSU Definitions

• PSU: Primary Sampling Unit Counties are geographically grouped together to become

units for sample selection• CBSA: Core Based Statistical Areas (~old MSA)

Counties are grouped together into geographic entities called core based statistical areas (CBSA’s) by Office of Management and Budget

Metropolitan – one or more counties centered around urban area of > 50,000 people

Micropolitan – one or more counties centered around urban area of 10,000 - 50,000 people

• Over 3,000 county and county equivalents in the U.S.• Over 900 CBSAs defined by OMB

7

Page 8: 2016 CE Survey Microdata Users’ Workshop Sample Design …Jul 14, 2016  · Sample Design and Weights Brian T. Nix Mathematical Statistician. Division of Price Statistical Methods

8 — U.S. BUREAU OF LABOR STATISTICS • bls.gov 8

Page 9: 2016 CE Survey Microdata Users’ Workshop Sample Design …Jul 14, 2016  · Sample Design and Weights Brian T. Nix Mathematical Statistician. Division of Price Statistical Methods

9 — U.S. BUREAU OF LABOR STATISTICS • bls.gov 9

Page 10: 2016 CE Survey Microdata Users’ Workshop Sample Design …Jul 14, 2016  · Sample Design and Weights Brian T. Nix Mathematical Statistician. Division of Price Statistical Methods

10 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

Selection of PSUs(2000-Census Design)

10

PSU size

Self-Repre-

senting?

CBSA/Non-CBSA

PopulationExamplesTotal

A YES Metropolitan(urban)

More Than 2,700,000

A103A423

Boston MASeattle WA

X NO Metropolitan(urban)

Less Than 2,700,000 Topcoded

Y NO Micropolitan(urban) Topcoded

Z NO Non CBSA(rural) Topcoded

Page 11: 2016 CE Survey Microdata Users’ Workshop Sample Design …Jul 14, 2016  · Sample Design and Weights Brian T. Nix Mathematical Statistician. Division of Price Statistical Methods

11 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

Selection of PSUs(2010-Census Design)

11

PSU class

Description CBSA/Non-CBSA

PopulationExamplesTotal

S Self-Representing

Metropolitan(urban)

More Than 2,500,000

S11AS49D

Boston MASeattle WA

N Non-Self-Representing

Metro- or Micropolitan(urban)

Less Than 2,500,000 Topcoded

R Rural (also Not Self-Representing)

Non-CBSA (rural) Topcoded

Page 12: 2016 CE Survey Microdata Users’ Workshop Sample Design …Jul 14, 2016  · Sample Design and Weights Brian T. Nix Mathematical Statistician. Division of Price Statistical Methods

12 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

2000 Census-based Sample SelectionCPI – 75 PSUs; CE – 91 PSUs

12

PSU Size

Region TotalNortheast Midwest South West

A 5 4 6 6 21

X 4 10 16 8 38

Y 2 4 6 4 16

Z 2 4 6 4 16

Total 13 22 34 22 91

Page 13: 2016 CE Survey Microdata Users’ Workshop Sample Design …Jul 14, 2016  · Sample Design and Weights Brian T. Nix Mathematical Statistician. Division of Price Statistical Methods

13 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

The Four Census Regions

Page 14: 2016 CE Survey Microdata Users’ Workshop Sample Design …Jul 14, 2016  · Sample Design and Weights Brian T. Nix Mathematical Statistician. Division of Price Statistical Methods

14 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

The Nine Census Divisions

Page 15: 2016 CE Survey Microdata Users’ Workshop Sample Design …Jul 14, 2016  · Sample Design and Weights Brian T. Nix Mathematical Statistician. Division of Price Statistical Methods

15 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

2000 Census-based Sample SelectionCPI – 75 PSUs; CE – 91 PSUs

15

PSU Size

Region TotalNortheast Midwest South West

A 5 4 6 6 21

X 4 10 16 8 38

Y 2 4 6 4 16

Z 2 4 6 4 16

Total 13 22 34 22 91

Page 16: 2016 CE Survey Microdata Users’ Workshop Sample Design …Jul 14, 2016  · Sample Design and Weights Brian T. Nix Mathematical Statistician. Division of Price Statistical Methods

16 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

2010 Census-based Sample SelectionCPI – 75 PSUs; CE – 91 PSUs

(same numbers as before!)

16

PSU Size

Region/Division TotalNortheast Midwest South West01 02 03 04 05 06 07 08 09

S 1 2 2 2 5 0 2 2 7 23

N 2 4 8 4 12 6 8 4 4 52

R 1 1 2 2 2 2 2 3 1 16

Total 4 7 12 8 19 8 12 9 12 91

Page 17: 2016 CE Survey Microdata Users’ Workshop Sample Design …Jul 14, 2016  · Sample Design and Weights Brian T. Nix Mathematical Statistician. Division of Price Statistical Methods

17 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

Hypothetical PSU Selection

17

Page 18: 2016 CE Survey Microdata Users’ Workshop Sample Design …Jul 14, 2016  · Sample Design and Weights Brian T. Nix Mathematical Statistician. Division of Price Statistical Methods

18 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

Hypothetical PSU Selection (continued)

CBSA 2000 Population

Probability of Selection

Augusta-Aiken, GA-SC 523,162 0.54746Columbus, GA-AL 274,624 0.28738Albany, GA 157,833 0.16516Total 955,619 1.00000

CBSA2000

PopulationProbability

of SelectionSavannah, GA 293,000 0.30906Macon, GA 222,368 0.23456Athens, GA 207,668 0.21905Warner Robins, GA 134,433 0.14180Rome, GA 90,565 0.09553Total 948,034 1.00000

Page 19: 2016 CE Survey Microdata Users’ Workshop Sample Design …Jul 14, 2016  · Sample Design and Weights Brian T. Nix Mathematical Statistician. Division of Price Statistical Methods

19 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

Hypothetical PSU Selection (continued)

CBSA 2000 Population

Probability of Selection

Augusta-Aiken, GA-SC 523,162 0.54746Columbus, GA-AL 274,624 0.28738Albany, GA 157,833 0.16516Total 955,619 1.00000

CBSA2000

PopulationProbability

of SelectionSavannah, GA 293,000 0.30906Macon, GA 222,368 0.23456

Athens, GA 207,668 0.21905Warner Robins, GA 134,433 0.14180Rome, GA 90,565 0.09553Total 948,034 1.00000

Page 20: 2016 CE Survey Microdata Users’ Workshop Sample Design …Jul 14, 2016  · Sample Design and Weights Brian T. Nix Mathematical Statistician. Division of Price Statistical Methods

20 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

Number of Households

Allocate Target Sample to PSUs Target size: ~7,000 interviewed households

• Based on Finite Budget• For Diary Survey per year• For Interview Survey per quarter

6,600 to Households used jointly by CE and CPI• for CPI cost-weight calculations• 21 A-size PSUs • 54 X-size and Y-size PSUs

400 to CE Households• 16 Z-Size PSUs

20

Page 21: 2016 CE Survey Microdata Users’ Workshop Sample Design …Jul 14, 2016  · Sample Design and Weights Brian T. Nix Mathematical Statistician. Division of Price Statistical Methods

21 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

Number of Households(continued)

Target Sample Size 7,000 interviewed households per year (Diary) 7,000 interviewed households per quarter (Interview,

interviews #2-5 only)

Target Sample Yield 14,000 weekly diaries per year (=7,000 x 2) 28,000 quarterly interviews per year (=7,000 x 4)

21

Page 22: 2016 CE Survey Microdata Users’ Workshop Sample Design …Jul 14, 2016  · Sample Design and Weights Brian T. Nix Mathematical Statistician. Division of Price Statistical Methods

22 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

Number of Households(continued)

Local Target Sample Size Allocate 7,000 interviewed households to individual

PSUs, proportional to each stratum’s population

Minimizes CE’s nationwide variance

22

Page 23: 2016 CE Survey Microdata Users’ Workshop Sample Design …Jul 14, 2016  · Sample Design and Weights Brian T. Nix Mathematical Statistician. Division of Price Statistical Methods

23 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

Translate Addresses into Interviewed Households

80% “eligibility” rate: (most of the missing 20% are unoccupied)

70% response rate 56% “participation” rate (0.56 = 0.80×0.70)

23

Page 24: 2016 CE Survey Microdata Users’ Workshop Sample Design …Jul 14, 2016  · Sample Design and Weights Brian T. Nix Mathematical Statistician. Division of Price Statistical Methods

24 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

Translate Interviewed Households into Addresses (continued)

24

PSUInterviewedhouseholds Addresses %

A102 Philadelphia 169 322 52

A103 Boston 195 286 68A109 New York City 220 420 52A110 NY-Conn suburbs 212 335 63A111 NJ suburbs 182 291 63

etc. etc. etc.Total 7,000 12,000

Page 25: 2016 CE Survey Microdata Users’ Workshop Sample Design …Jul 14, 2016  · Sample Design and Weights Brian T. Nix Mathematical Statistician. Division of Price Statistical Methods

25 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

Select a Random Sample of Households (Mechanics)

• Sort households from poor to rich based on information from Decennial Census and ACS:

• Number of people in household• Tenure (owner, renter)• Market value of home (owners)• Monthly rent (renters)

25

Page 26: 2016 CE Survey Microdata Users’ Workshop Sample Design …Jul 14, 2016  · Sample Design and Weights Brian T. Nix Mathematical Statistician. Division of Price Statistical Methods

26 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

Select a Random Sample of Households (Continued)

• Compute the sampling interval for each PSU

• Sampling interval = (# addresses in sampling frame) ÷ (# addresses in CE sample)

• Typical sampling intervals:• Every 1,000th address (X+Y, Z PSUs)• Every 5,000th address (A PSUs)

26

Page 27: 2016 CE Survey Microdata Users’ Workshop Sample Design …Jul 14, 2016  · Sample Design and Weights Brian T. Nix Mathematical Statistician. Division of Price Statistical Methods

27 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

Select a Random Sample of Households (Continued)

• --- D --- I --- D --- I --- D --- I --- D --- I --- D --- I --- D --- I --- etc.

• D=Diary, I=Interview

• Each “D” and “I” has enough sample to cover the next 10 years

27

Page 28: 2016 CE Survey Microdata Users’ Workshop Sample Design …Jul 14, 2016  · Sample Design and Weights Brian T. Nix Mathematical Statistician. Division of Price Statistical Methods

28 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

The Future

• New Sample Selection• Based on 2010 Census

• Updated Metropolitan/Micropolitan Statistical Area definitions from OMB

• Updated populations used for selection probabilities• Already in Production (began in 2015)• Data Being Released This Year

28

Page 29: 2016 CE Survey Microdata Users’ Workshop Sample Design …Jul 14, 2016  · Sample Design and Weights Brian T. Nix Mathematical Statistician. Division of Price Statistical Methods

29 — U.S. BUREAU OF LABOR STATISTICS • bls.gov 29

??????

The Future

Page 30: 2016 CE Survey Microdata Users’ Workshop Sample Design …Jul 14, 2016  · Sample Design and Weights Brian T. Nix Mathematical Statistician. Division of Price Statistical Methods

30 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

Weighting Process

• Usable Interviews• Diary Survey

• Good interviews • Diary complete enough to be counted

• Interview Survey• Good interviews

• Interviews 2 – 5• Completed enough to be counted

30

Page 31: 2016 CE Survey Microdata Users’ Workshop Sample Design …Jul 14, 2016  · Sample Design and Weights Brian T. Nix Mathematical Statistician. Division of Price Statistical Methods

31 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

Weighting Process (Continued)Base Weight Calculation: Real-World Example

(Self-Representing PSU) S49A (Los Angeles): population 12,828,837

MAF counts 4,500,000 housing units 470 addresses needed for each survey

based on estimated response rates specific to L.A. proportional to 12,000 addresses needed for all PSUs

“Take Every” = 4,500,000 / 470 ≈ 9,575(every 4,788th address when considering both surveys)

Stratum population also 12,828,837(self-representing PSU)

PSU Weight = 1 (for any self-representing PSU) Base Weight = “Take Every” ∗ PSU Weight

≈ 9,575 ∗ 1 ≈ 9,575

31

Page 32: 2016 CE Survey Microdata Users’ Workshop Sample Design …Jul 14, 2016  · Sample Design and Weights Brian T. Nix Mathematical Statistician. Division of Price Statistical Methods

32 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

Weighting Process (Continued)Base Weight Calculation: Hypothetical Example

(Non-Self-Representing PSU) PSU Population 538,200

MAF counts 224,250 housing units 115 addresses needed for each survey

based on estimated response rates specific to this PSU proportional to 12,000 addresses needed for all PSUs

“Take Every” = 224,250 / 115 ≈ 1,950(every 975th address when considering both surveys)

Stratum population 2,800,000 PSU Weight = 2,800,000 / 538,200 ≈ 5.2025 Base Weight = “Take Every” ∗ PSU Weight

≈ 1,950 ∗ 5.2025 = 10,145

32

Page 33: 2016 CE Survey Microdata Users’ Workshop Sample Design …Jul 14, 2016  · Sample Design and Weights Brian T. Nix Mathematical Statistician. Division of Price Statistical Methods

33 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

Weighting Process (Continued)

Base Weight (~10,000)9,999 CUs + Self

Weighting Control Factor (~1.00)Apartment Building instead of a House

Non-interview Adjustment Factor (~1.3)Type A: Refusal to Participate

Calibration Adjustment Factor (~1.15)Adjusts sample estimate to CPS Totals

33

Page 34: 2016 CE Survey Microdata Users’ Workshop Sample Design …Jul 14, 2016  · Sample Design and Weights Brian T. Nix Mathematical Statistician. Division of Price Statistical Methods

34 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

Weighting Process (Continued)

• Final Weight

• Variable FINLWT21

• Base Weight * Weighting Control Factor *Non-interview Adjustment Factor *Calibration Adjustment Factor

• ~15,000 to 18,000

34

Page 35: 2016 CE Survey Microdata Users’ Workshop Sample Design …Jul 14, 2016  · Sample Design and Weights Brian T. Nix Mathematical Statistician. Division of Price Statistical Methods

35 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

ConclusionBoth Sample Design and Weighting

Work Together to Produce: Best Estimates of U.S. Expenditures Subject to Allotted CE Budget

35

Page 36: 2016 CE Survey Microdata Users’ Workshop Sample Design …Jul 14, 2016  · Sample Design and Weights Brian T. Nix Mathematical Statistician. Division of Price Statistical Methods

36 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

Any Questions?

36

Page 37: 2016 CE Survey Microdata Users’ Workshop Sample Design …Jul 14, 2016  · Sample Design and Weights Brian T. Nix Mathematical Statistician. Division of Price Statistical Methods

37 — U.S. BUREAU OF LABOR STATISTICS • bls.gov

Contact Information

Brian T. NixMathematical Statistician

Statistical Methods Divisionwww.bls.gov/cex

[email protected]