vital and health statisticsdesign and estimation for the national health interview survey,...

41
Series 2 No. 130 Design and Estimation for the National Health Interview Survey, 1995–2004 June 2000 Vital and Health Statistics From the CENTERS FOR DISEASE CONTROL AND PREVENTION / National Center for Health Statistics U.S. DEPARTMENT OF HEALTH AND HUMAN SERVICES Centers for Disease Control and Prevention National Center for Health Statistics

Upload: others

Post on 08-Mar-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

Series 2No. 130

Design and Estimation forthe National Health InterviewSurvey, 1995–2004

June 2000

Vital andHealth StatisticsFrom the CENTERS FOR DISEASE CONTROL AND PREVENTION / National Center for Health Statistics

U.S. DEPARTMENT OF HEALTH AND HUMAN SERVICESCenters for Disease Control and Prevention

National Center for Health Statistics

Page 2: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

Copyright Information

All material appearing in this report is in the public domain and may bereproduced or copied without permission; citation as to source, however, isappreciated.

Suggested citation

Botman SL, Moore TF, Moriarity CL, and Parsons VL. Design and estimation forthe National Health Interview Survey, 1995–2004. National Center for HealthStatistics. Vital Health Stat 2(130). 2000.

Library of Congress Cataloging-in-Publication Data

Design and estimation for the National Health Interview Survey, 1995-2004 /Steven L. Botman. . . [et al.].

p.; cm. — (Vital and health statistics. Series 2, Data evaluation andmethods research ; no. 130) (DHHS publication ; no. (PHS) 2000-1330)

‘‘March 2000.’’Includes bibliographical references.ISBN 0-8406-0562-51. Health surveys—United States. I. Botman, Steven. II. National Center for

Health Statistics (U.S.) III. Series. IV. Series: DHHS publication ; no. (PHS)2000-1330

[DNLM: 1. National Health Interview Survey (U.S.) 2. HealthSurveys—United States. 3. Data Collection—United States. 4. ResearchDesign—United States. W2 A N148bv no. 130 2000]RA409 .D475 2000614.4'273'0723—dc21 00-037092

For sale by the U.S. Government Printing OfficeSuperintendent of DocumentsMail Stop: SSOPWashington, DC 20402-9328Printed on acid-free paper.

Page 3: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

Design and Estimation forthe National Health InterviewSurvey, 1995–2004

Series 2:Data Evaluation andMethods ResearchNo. 130

Hyattsville, MarylandJune 2000DHHS Publication No. (PHS) 2000-1330

Vital andHealth Statistics

U.S. DEPARTMENT OF HEALTH AND HUMAN SERVICESCenters for Disease Control and PreventionNational Center for Health Statistics

Page 4: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

National Center for Health Statistics

Edward J. Sondik, Ph.D., Director

Jack R. Anderson, Deputy Director

Jack R. Anderson, Acting Associate Director forInternational Statistics

Jennifer H. Madans, Ph.D., Associate Director for Science

Lester R. Curtin, Ph.D., Acting Associate Director forResearch and Methodology

Jennifer H. Madans, Ph.D., Acting Associate Director forAnalysis, Epidemiology, and Health Promotion

P. Douglas Williams, Acting Associate Director for DataStandards, Program Development, and Extramural Programs

Edward L. Hunter, Associate Director for Planning, Budget,and Legislation

Jennifer H. Madans, Ph.D., Acting Associate Director forVital and Health Statistics Systems

Douglas L. Zinn, Acting Associate Director forManagement

Charles J. Rothwell, Associate Director for DataProcessing and Services

Office of Research and Methodology

Lester R. Curtin, Ph.D., Acting Associate Director

Lester R. Curtin, Ph.D., Chief, Statistical Methods Staff

Trena Ezzati-Rice, Chief, Survey Design Staff

Jimmie D. Givens, Acting Chief, Statistical Technology Staff

Kenneth W. Harris, Acting Manager, Questionnaire DesignResearch Laboratory Staff

Page 5: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

to

s

h

all

or

c,d

,

a

,

d

nd

y

,

ge

e

the

d’

dr

g

s

en

Preface

This report presents a detaileddescription of the sample designfeatures for the National Health

Interview Survey (NHIS) for the period1995–2004. The report is a successoran earlier publication,Design andEstimation for the National HealthInterview Survey, 1985–94. (5)

NHIS is one of the major datacollection programs of the NationalCenter for Health Statistics (NCHS).Through NHIS, information concerningthe health of the U.S. civiliannoninstitutionalized population iscollected in household interviewsthroughout the United States. NHIS habeen in continuous operation since1957, and its sample design has beenreevaluated and modified following eacof the last four decennial censuses ofthe U.S. population.

The 1995–2004 redesign of NHISwas a major undertaking that involvednumber of government agencies as weas several private contractors. TheSurvey Design Staff in the Office ofResearch and Methodology (ORM),NCHS, in collaboration with theDivision of Health Interview Statistics(DHIS), NCHS, had overallresponsibility for the development andimplementation of the 1995–2004 NHISredesign. Monroe Sirken, formerly theORM Associate Director, played a majrole in the conceptualization andplanning of the research program. Thelate James Massey, former Chief of theSurvey Design Staff, and Donald Maleformerly of the Survey Design Staff, hathe primary responsibility for directingthe redesign research and coordinatingthe research activities conducted byNCHS, the U.S. Bureau of the Censusand Westat, Inc. The late StevenBotman, formerly of the Survey DesignStaff, had the lead responsibility inhelping to implement the 1995–2004NHIS redesign, and started developingmanuscript that was the genesis of thisreport. Van Parsons of the StatisticalMethods Staff, ORM, contributed to thedevelopment of subdesigns for the

1995–2004 NHIS redesign. The lateOwen Thornberry, formerly the DirectorDHIS, and John Horm, formerly theacting Chief of the Survey Planning anDevelopment Branch, DHIS, were theleading DHIS participants in the1995–2004 NHIS redesign work.

Under contract with NCHS betwee1989 and 1993, Westat, Inc., conductea major portion of the research for the1995–2004 NHIS redesign. The primarresearchers at Westat included DavidJudkins, David Marker, and JosephWaksberg. As part of Westat’s researcha sample design referred to as the‘‘alpha’’ design was developed. Itassumed a 50-percent data collectionbudget increase to permit oversamplingof racial and ethnic minorities. TheWestat researchers also developed the‘‘beta’’ design, a modification of thealpha design, which assumed no chanin the NHIS data collection budget. Thebeta design, which became the designimplemented in 1995, is the designdescribed in this report. Another NCHSreport,National Health InterviewSurvey: Research for the 1995–2004Redesign, (6) provides a detaileddescription of Westat’s research work,including a description of the alphadesign.

Additional NHIS redesign researchwas conducted by the U.S. Bureau ofthe Census and coordinated through thTask Force on Household SurveyRedesign, assembled and directed bylate Maria Gonzalez of the Office ofManagement and Budget (OMB). Thetask force was formed to coordinate anmonitor the U.S. Bureau of the Censusredesigns of those household surveysconducted by the Bureau for itself andother Federal government agencies, anthat are simultaneously redesigned afteeach decennial census. The task forceplayed an important role in coordinatinboth the technical and fundingrequirements for the redesigns of thesurveys.

The U.S. Bureau of the Census habeen the primary data collector forNHIS since the inception of NHIS. TheU.S. Bureau of the Census also has beinvolved with the research and

implementation of the NHIS redesigns.For the 1995–2004 redesign, theDemographic Statistical MethodsDivision (DSMD) at the U.S. Bureau ofthe Census had the primaryresponsibility for evaluating alternativeprimary sampling unit (PSU) definitionsfor NHIS and for implementing theredesigned sample. Persons at the U.S.Bureau of the Census deserving specialrecognition for their contribution to the1995–2004 NHIS redesign effort includePreston Jay Waite, formerly Chief ofDSMD; Thomas Moore, Chief of theHealth Surveys and SupplementsBranch, DSMD; Robert Mangold,formerly Chief of the Health SurveysBranch, Demographic Surveys Division;Patricia Wilson of DSMD; and LloydHicks, formerly of DSMD.

This report is organized into threechapters so researchers and other usersof the NHIS data can refer to differentparts of the report for different levels ofdetail about the NHIS design.Chapter 1provides a general overview of NHIS,chapter 2provides a detailed descriptionof the NHIS sample design, andchapter 3presents a description of theestimators used for analyzing NCHSdata.

iii

Page 6: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

This report is dedicated to the memory of our late colleagues

Steven L. Botman and James T. Massey,both formerly in the Survey Design Staff,

Offıce of Research and Methodology,National Center for Health Statistics.

Both men made significant contributions to theNational Health Interview Survey.

Steven L. Botman James T. Massey

Page 7: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

Contents

Preface. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii

Abstract. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

Chapter 1. Overview of the National Health Interview Survey. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1Background. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1Sample Design and Basic Subsamples. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2Data Collection Instruments. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2Reuse of the Survey Sample. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3Sample Redesign. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

Chapter 2. 1995–2004 NHIS Sample Design. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5Redesign Objectives Achieved. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5Description of the Sample. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5PSU and Stratum Formation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5PSU Stratification. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6Special Situations Arising From PSU Definitions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6PSU Sample Allocation and Selection. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6Rotating PSU’s. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6Panels of PSU’s. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6Substrata Within PSU’s. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7SSU Formation and Sampling. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

Example of SSU Sampling. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8SSU Clustering Characteristics. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8Example of Within-Substratum Sampling Rates. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9Sampling in the Permit Frame. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10Sampling of Group Quarters. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10Sampling of Persons Within Households. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10Interviewer Assignments. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10Balancing the Sample. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10Additional Subsampling Situations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

Assignment of Screening Code. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11Sampling Rule Implementation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

Chapter 3. Design and Estimation Structures for the 1995–2004 NHIS. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14Methods for Creating Weights and Estimators. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14National Weights . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14Base Weight Estimator. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15Factors Contributing to Annual Weight Fluctuations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15Household Nonresponse Adjustment. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

Methodology for 1995 and 1996 Surveys. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16Methodology for 1997 Survey and Beyond. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

Estimator Based on the Product ofWI andWnr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17Ratio Adjustments for Person-Level Weights. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

First-Stage Ratio Adjustment. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18Second-Stage Ratio Adjustment (Poststratification). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

Creation of Other Weights. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

v

Page 8: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

Variance Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19Simplified Design Structures for Variance Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19Variance Estimators for National Estimators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

SR PSU’s: Conceptual NHIS Within PSU Sampling and Estimation Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20NSR PSU’s: Variance Estimator that Accounts for PSU Sampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20Variance Estimators for National Estimators: Combining Across NSR and SR PSU’s . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

Estimation Variances for Poststratified Totals and Nonlinear Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21Public Use Data and Limitations on Design Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22Precision Comparisons Between Surveys . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31Glossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

Acronyms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31Definition of Terms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

Figures

1. Housing unit distribution by measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92. Housing unit distribution by annual secondary sampling unit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

Tables1. National Health Interview Survey designs: 1973–84, 1985–94, 1995–2004 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122. Sampling strata characteristics: National Health Interview Survey, 1995–2004 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123. Primary sampling units by census region: National Health Interview Survey, 1995–2004 . . . . . . . . . . . . . . . . . . . . . . . . . 124. Race and ethnicity density strata and within primary sampling unit parameters: National Health Interview

Survey, 1995–2004 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135. Components of conceptual sampling design: National Health Interview Survey, 1995–2004 . . . . . . . . . . . . . . . . . . . . . . 246. First-stage ratio adjustment factors: National Health Interview Survey, 1997 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247. The 88 age-sex-race/ethnicity classes used for poststratification: National Health Interview Survey, 1995–2004 . . . . . . 248. Impact of poststratification on variance estimation: National Health Interview Survey, 1995 . . . . . . . . . . . . . . . . . . . . . . 259. Sample size and weighted size of persons by domain: National Health Interview Survey, 1995 . . . . . . . . . . . . . . . . . . . . 2610. Comparison of variance estimates obtained by two methods: National Health Interview Survey, 1995 . . . . . . . . . . . . . . 2711. Precision comparisons between two survey designs: National Health Interview Survey, 1985–94 and 1995–2004 . . . . . 2812. Sample size and percent change by domain: National Health Interview Survey . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

vi

Page 9: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

Design and Estimation for theNational Health InterviewSurvey, 1995–2004

y

t

ernt

nsf

s

nn

The authors appreciate the input of several NCHS staff who reviewed the report at various stages ofdevelopment. Deborah D. Ingram, Ph.D., Office of Analysis, Epidemiology, and Health Promotion, servedas peer reviewer and made many useful comments. The authors also thank other NCHS staff who reviewedthe report at various stages of development and made helpful comments and suggestions, particularly IrisM. Shimizu, Ph.D., and Myron J. Katzoff, Ph.D., Office of Research and Methodology. In addition, theauthors thank Alfred G. Meier of the U.S. Bureau of the Census, who reviewed the report and suppliedmuch useful material. The report was edited by Klaudia M. Cox and typeset by Zung T. N. Le of thePublications Branch, Division of Data Services.

ObjectivesThis report presents an overview, a

detailed description of the sampledesign features, and estimationstructures for the 1995–2004 NationalHealth Interview Survey (NHIS). It isintended to serve the same role for thecurrent (1995–2004) National HealthInterview Survey design as the NCHSpublication, Series 2, No. 110, Designand Estimation for the National HealthInterview Survey, 1985–94, did for theprevious design.

MethodsThe 1995–2004 NHIS sample design

uses cost-effective complex-samplingtechniques including stratification,clustering, and differential samplingrates to achieve several objectives.These objectives include improvedreliability of racial, ethnic, andgeographical domains. This reportprovides a description of thosemethods.

ResultsThis report presents the operating

characteristics of the 1995–2004 NHIS.The general sampling structure ispresented along with a discussion ofthe weighting and variance estimationtechniques. This report is intended forthe general users of NHIS datasystems. A companion report, Series 2,No. 126, National Health InterviewSurvey: Research for the 1995–2004Redesign, provides a finer level ofdetail on the redesign process.

Keywords: sampling c weighting cnonresponse adjustment c varianceestimation

Chapter 1.Overview of theNational HealthInterview Survey

Lead authors: Steven L. Botmanand Christopher L. Moriarity,Offıce of Research andMethodology, National Center forHealth Statistics

BackgroundThe National Health Interview Surve

(NHIS) is the Nation’s primary source ofgeneral health information for the residencivilian noninstitutionalized population.NHIS is conducted by the National Centfor Health Statistics (NCHS), a componeof the Centers for Disease Control andPrevention, U.S. Public Health Service,Department of Health and HumanServices. In accordance with specificatioestablished by NCHS, the U.S. Bureau othe Census, under a contractualrelationship, participates in the planningfor NHIS and the collection of data. NHIShas continuously collected data since1957. This continuous data collection haadministrative, operational, and dataquality advantages because fieldwork cabe handled on a continuous basis with aexperienced, stable staff.

NHIS provides estimates on healthindicators, health care utilization andaccess, and health-related behaviors forthe U.S. resident civiliannoninstitutionalized population.Summary reports and reports on specialtopics for each year’s data are preparedby NCHS’ Division of Health InterviewStatistics for publication in Series 10 ofthe Vital and Health Statisticspublications series. In these reports,basic NHIS survey estimates arepublished annually for variouspopulation subgroups. These subgroupsinclude those defined by age, sex, race,family income, census region(Northeast, Midwest, South, and West),place of residence (central city of ametropolitan area, metropolitan areabut not in a central city, andnonmetropolitan areas), and otherdomains covered on the particular NHISstatistic. Statistics from NHIS alsoappear in other NCHS reports, inprofessional journal articles, and inmany other publications.

NCHS releases annual NHISmicrodata files in several forms. Since1969, public use data files have beenprepared for each year of datacollection. Public use microdata also areavailable on compact disk read-onlymemory (CD-ROM) for data collectionyears starting in 1987. Recently, publicuse microdata also have been madeavailable via the Internet.

Page 1

Page 10: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

Page 2 [ Series 2, No. 130

These microdata files usuallycontain information that allows users todevelop direct estimates of samplingerrors that are consistent with thecomplex survey design of NHIS. Since1985, NCHS has based estimates ofNHIS sampling error on the Taylorseries linearization method. This methodcan produce estimates that take intoaccount the sample design, using aconceptual model of the NHIS sampledesign that captures many of its keyfeatures. In particular, for NHISvariance estimation, NCHS has used theSUDAAN software (1), which includesthe Taylor series linearization method.To facilitate the use of software thatpermits analysis of NHIS data in whichthe sample design is taken into account,NCHS has coordinated with otheragencies within the U.S. Public HealthService to support enhancements for theSUDAAN software.

To maintain respondentconfidentiality in NHIS, NCHSwithholds variables from the NationalHealth Interview Survey’s public usedata files that could permit explicit orimplicit identification of surveyrespondents. One of the major risks forinadvertent respondent identification isthe inclusion of identifiers on surveyfiles that place respondents in smallgeographic areas (for example, censusblock, census block group, county, orstate). Thus, in these data releases,variables identifying specific geographicareas smaller than one of the fourcensus regions usually are withheld toprotect respondent confidentiality.However, NCHS has releasedinformation that allows identification ofthe largest metropolitan areas.

Sample Design and BasicSubsamples

The National Health InterviewSurvey is based on a stratifiedmultistage sample design. The specificparameters of the design, however, havechanged over time; a new sample designis implemented following each decennialcensus. For example, the 1973–84survey design had 386 sample primarysampling units (PSU’s), the 1985–94survey design had 198 sample PSU’s,

and the current 1995–2004 surveydesign has 358 sample PSU’s. For moredetails, refer to table 1 in chapter 2.

The 1995–2004 NHIS has beendesigned to produce estimates for theNation, for each of the four censusregions, and within census regions byareas determined by metropolitan andnonmetropolitan status. Although the1995–2004 survey samples from all ofthe States and the District of Columbia,it is not designed to produce reliableState-level estimates for every State.

For the 1995–2004 survey design,the NHIS sample is partitioned into anumber of subsamples. First, the samplefor the 10-year survey period ispartitioned into subsamples that areassigned for data collection each yearfrom 1995 through 2004. Moreover, forsurvey administration, data collection,and processing, the NHIS annual sampleis assigned to four calendar quarters.Within each quarter, the sample isassigned to individual weeks. Theportion of the survey sample assigned toeach calendar quarter of the year isrepresentative of the target population.The portion of the survey sampleassigned to each week also isrepresentative of the target population,although estimates based on weeklysamples tend to be unstable. (Tworeasons for the instability are a smallsample size and the fact that eachsample PSU represented in a quarterlysample is not necessarily represented inthe weekly sample. This reduces theprecision of variance estimators.) Theassignment of subsamples to specificdata collection periods permits the NHISsample to be used to estimate highfrequency measures or to obtainestimates for large population groupsfrom a short period of data collection.Other measures sometimes can beobtained by accumulating the surveysample for longer periods of time,including more than 1 year.

The assignment of the NHIS sampleto weekly and quarterly subsamples alsohas a number of operational andadministrative benefits. For example, theassignment of NHIS to weeklysubsamples results in the field staff havinga continuous workload, which enhancesthe quality of the resultant data.

Beginning with the 1985–94 surveydesign, the annual NHIS samples havebeen partitioned into four panels(subsamples) each having approximatelythe same number of sample householdsand conceptually similar statisticalfeatures. Each panel can be consideredas a probability sample of the U.S.population. (Note that ‘‘ panels’’ and‘‘ calendar quarters’’ are not the same;the set of four panels is a partition ofthe annual survey samples thatencompasses the four calendar quarters.)The panels have several anticipateduses. They provide a mechanism tomake large cuts in the survey samplesize in case insufficient funds areavailable for data collection. Becausethe sample is reused as a samplingframe for other surveys, the panels alsoprovide a mechanism for NHIS toprovide nonoverlapping samples forreuse as sampling frames for otherstudies. After the 1995–2004 surveydesign sample was selected, NCHSpartitioned the new sample into a newset of four panels.

Data CollectionInstruments

The NHIS data collectioninstrument for 1995 and 1996 wassimilar to the instrument used before1995. Prior to 1995, the last majorrevision of the instrument occurred in1982. A major revision of theinstrument occurred in 1997. The datacollection instruments for 1995 and1996, and 1997 and beyond, aredescribed below.

NHIS data collection is conducted bythe U.S. Bureau of the Census under aninteragency agreement with NCHS. NHISinterviewers are employees of the U.S.Bureau of the Census. These interviewersreceive extensive training, and their workis monitored through a quality assuranceprogram. Data are collected from eachfamily in the survey sample using aface-to-face interview. If a sampledhousehold contains more than one family,many aspects of the interview are repeatedfor each family in the household.

For 1995 and 1996, the datacollection instrument had threecomponents: a basic health and

Page 11: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

Series 2, No. 130 [ Page 3

demographic ‘‘ core’’ questionnaire, acondition list, and one or moresupplemental questionnaires thataddressed health topics of special publichealth interest. The NHIS basic healthand demographic questionnaire consistedof a fixed set of health andsociodemographic questions.Additionally, each household wasrandomly assigned one of six conditionlists. The ‘‘ core’’ questionnaire and thecondition list responses are used fordeveloping annual estimates of varioushealth variables, including acute andchronic conditions, hospital stays,medical visits, and limitations ofactivities.

The supplemental questionnairesaddressed health issues or topicsidentified as being of particular interestwithin the U.S. Public Health Service.From 1985 through 1994, thesesupplemental questionnaires on topicssuch as cancer control and cancerepidemiology, AIDS knowledge andawareness, and health promotion anddisease prevention, were administered toa randomly selected adult in each familyof the survey sample. For several yearsin the 1990’s, the supplementalquestionnaires on health insurance andaccess to health care were administeredto the entire survey sample. In 1994 and1995, NHIS included a supplement ondisability that was administered to theentire sample. For the last few yearsprior to 1997, several supplements, suchas ‘‘ Family Resources’’ and ‘‘ ChildhoodImmunization,’’ were included in thesurvey on an annual basis. Householddata collection for the supplementalquestionnaires averaged about 30–90minutes.

Prior to 1997, the basic strategy fordata collection was for the interviewerto assemble all available householdmembers aged 19 years and older at asample address. A knowledgeable adultcould report for absent adults. In mostcases, proxy reporting by aknowledgeable adult was used forpersons under age 19 years, althoughpersons aged 18 years could report forthemselves and persons aged 17 yearscould report for themselves under somecircumstances. As part of theenumeration of each household in thesurvey sample, persons in an individual

household were partitioned into separatefamilies if multiple families werepresent. A separate basic health anddemographic questionnaire wasadministered to members of eachindividual family in the household.

In 1996, NHIS began testing theuse of computer-assisted personalinterviewing (CAPI) that was based on arevised data collection instrument.Beginning in 1997, the survey switchedto a CAPI system and a revised datacollection instrument that wasrestructured and shortened in an effort toreduce respondent burden. Many of thequestions formerly asked of all personsin the ‘‘ core’’ questionnaire areadministered on a sample basis. Theredesigned questionnaire has three partsor modules: a basic module, a periodicmodule, and a topical module.

The basic module functions as thenew ‘‘ core’’ questionnaire. Plans are forit to remain largely unchanged fromyear to year. The basic module containsthree components: the family core, thesample adult core, and the sample childcore. The family core componentcollects information on everyone in thefamily. Information collected in thefamily core component includeshousehold composition andsociodemographic characteristics. It alsoincludes basic indicators of health statusand utilization of health care services.From each family in the survey, onesample adult and one sample child (ifany children under age 18 are present)are randomly selected, and informationon each is collected with the sampleadult core and the sample child corequestionnaires. Because some healthissues are different for children than foradults, these two questionnaires differ insome items, but both collect basicinformation on health status, health careservices, and behavior.

One important change implementedin 1997 is that random selection of oneof six condition lists was eliminated.Instead a checklist of conditions appearsin the sample adult core and samplechild core.

The purpose of a periodic module isto collect more detailed information onsome of the topics included in the basicmodule from the sample persons. It wasused for the first time in 1998.

The role of a topical module isanalogous to the supplementalquestionnaires of the 1982–96 survey;that is, it is used to respond to publichealth data needs as necessary. In 1999,the first year a topical module was used,the topical module focused on diseaseprevention issues.

The family core component of thebasic module is administered in amanner similar to the ‘‘ core’’questionnaire prior to 1997. All adultmembers of the household 17 years ofage and over who are at home at thetime of the interview are invited toparticipate and to respond forthemselves. For children and adults notat home or unable to respond forthemselves during the interview,information is provided by a responsibleadult family member (18 years of age orover) residing in the household. For thesample adult core component, one adultper family is randomly selected, and thisindividual responds for him/herself tothe questions in this section (i.e., noproxy response is allowed). Informationfor the sample child core component isobtained from a knowledgeable adult inthe household.

Reuse of the SurveySample

During the 1985–94 survey designperiod, the sample was reused forseveral surveys with smaller samplesizes. These surveys were for studiessponsored in part by NCHS’ Division ofHealth Interview Statistics or by otherunits in NCHS. Such linkages are mostcost-effective when the smaller survey isbased on a subdomain of the populationor if it oversamples domains whosemembership can be determined from theNHIS interview.

The National Survey of FamilyGrowth (NSFG), Cycles IV and Vsamples, were selected from respondentsin the survey who met the NSFGeligibility criteria. Cycles IV and V ofNSFG, sponsored by NCHS and otherU.S. Public Health Service agencies,were surveys of women aged 15–44years and focused on their reproductivehistory and intentions.

Page 12: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

Page 4 [ Series 2, No. 130

Phase 2 of the NHIS DisabilitySupplement was based on a reuse ofcertain respondents in the 1994–95Phase 1 Disability Supplement.

The 1996 Medical ExpenditurePanel Survey (MEPS), sponsored by theAgency for Healthcare Research andQuality, was based on a reuse of aportion of the 1995 NHIS sample.Subsequent years of the MEPS arebased on reuse of a portion of theprevious year’s NHIS sample.

Sample RedesignSince its inception in 1957, the

NHIS sample has been redesignedfollowing each decennial census of thepopulation to accommodate changes insurvey requirements and to take intoaccount the changes in the populationand its distribution (2–5). For the1995–2004 NHIS sample designimplemented in 1995, the U.S. Bureauof the Census and Westat, Inc.,conducted research on issues related tothe sample design using specificationsprovided by NCHS (6).

Although the main goals in the1995–2004 NHIS sample design wereimproving the reliability of statistics forracial, ethnic, economic, and geographicdomains, other issues also wereaddressed in the research. The resultsoften led to conflicting sampleallocations; that is, a sample allocationthat would be optimal for one type ofdomain would be far from ideal foranother type of domain. The final sampleallocation was a compromise betweenideal allocations for the various domains.

The primary features in the 1995–2004 NHIS sample design implementedin January 1995 are as follows:

1. Use of an all area samplingframe—The 1995–2004 survey isbased on an area sampling framefor housing units in place at thetime of the 1990 Census. Each ofthe other current demographicsurvey samples conducted by theU.S. Bureau of the Census(including the Current PopulationSurvey, the National CrimeVictimization Survey, and theSurvey of Income and ProgramParticipation) uses a combination of

frames. The combination consists ofan address sampling frame (i.e.,addresses compiled for thepreceding decennial census) and anarea sampling frame, where theaddress information is incomplete.The use of an all area-frame samplepermits NCHS to release the surveysample addresses to its contractorsfor additional data collection. U.S.Bureau of the Census confidentialityconstraints do not permit the releaseof addresses that were obtainedthrough listings compiled for thepreceding decennial census. For thisreason, the survey sample has beenbased on an all-area sampling framestarting with the 1985–94 design.

In parts of the country wherelocal governments issue buildingpermits, the U.S. Bureau of theCensus supplements the area samplewith a sample of permits forresidential housing units built afterApril 1990. In the rest of thecountry, the units constructed afterApril 1990 are included in the areaframe.

2. State stratification and an increasein the number of PSU’s— In mostcases, the first-stage sampling stratado not straddle State boundaries.The exception is the largestmetropolitan areas. These areself-representing PSU’s and maystraddle State boundaries. In thesecases, the survey second-stagesamples were drawn independentlywithin each State component of thePSU. This State stratification, takentogether with an almost doubling ofthe number of PSU’s in the surveysample, enhances the ability of thesurvey to make reliable State-levelestimates for the largest States. TheState stratification and the increase inthe number of PSU’s also will alloweasier integration of a random digitdialing telephone survey with thesurvey as a dual frame survey tomake reliable subnational estimates ifthe decision is made in the future todo this type of integration.

The largest increase in thenumber of sample PSU’s occurs inthose representing nonmetropolitanareas. A contributing factor to this

increase is separate strata formetropolitan areas versusnonmetropolitan areas in all States(except New Jersey, which isentirely metropolitan). To obtainthis increase, the average number ofsample households assigned to suchprimary sampling units wassubstantially reduced. For thesurvey, two sample PSU’s usuallywere selected from eachnonself-representing first-stagestratum. However, there were 21nonself-representing strata whereonly one sample primary samplingunit was selected.

3. Oversampling of black andHispanic persons—The sampledesign implemented in 1995oversamples black and Hispanicpersons. This is accomplished withtwo features. First, the U.S. Bureauof the Census selected for NHIS alarger initial sample than wouldotherwise be required. A subsampleof the initial sample was selectedfor interviewing with housing unitsin areas having higher concentrationsof black and Hispanic persons beingretained at higher rates. Second, allhouseholds in the subsample with oneor more black or Hispanic eligible(i.e., civilian) members are retained inthe survey, and only a subsample ofother households are retained. Thedetermination of a household’s raceand/or ethnicity status is accomplishedthrough the administration of a briefscreening interview. (The screeninginterview consists of the initialsteps of the regular interview. Afterthe household roster is determined,the decision is made to ‘‘ screen in’’or ‘‘ screen out.’’ )

Approximate oversampling ratesare 2:1 for Hispanic persons and1.5:1 for black persons. Note thatoversampling of black persons wasdone during the 1985–94 surveydesign. Oversampling of Hispanicpersons is new to the current design.

Additional detail on the NHIS1995–2004 sample design is included inchapter 2. Chapter 3 provides a detaileddescription of estimation procedures forthe NHIS 1995–2004 design.

Page 13: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

Series 2, No. 130 [ Page 5

Chapter 2.1995–2004 NHISSample Design

Lead author: Thomas F. Moore,Demographic Statistical MethodsDivision, U.S. Bureau of theCensus

The 1995–2004 National HealthInterview Survey (NHIS) sample designwas implemented in 1995, and thisdesign will be used through 2004. Thecurrent design will yield more reliableestimates for health-relatedcharacteristics of Hispanic persons thanthose based on the 1985–94 design.

In forming strata for the primarysampling units (PSU’s), State was used asa stratification variable. Although thesurvey is not designed to produce reliableestimates for every State, this Statestratification of PSU’s enhances thesurvey’s ability to produce data for a fewof the most populous States and enhancesthe potential for using NHIS data withother data to produce State estimates.

Redesign ObjectivesAchieved

The following NHIS redesignobjectives were achieved by the1995–2004 sample design:

(a) Improving the reliability ofestimates for Hispanic persons,

(b) Improving the reliability ofestimates for subnational areas,including States, and

(c) Continuing to have NHIS serveas a sampling frame for follow-onsurveys.

Other objectives also wereconsidered as part of the redesignresearch (6).

Description of the SampleNHIS is based on a stratified,

multistage sampling plan. In the UnitedStates, primary sampling units areindividual counties or contiguous groupsof counties. (‘‘ County’’ is usedgenerically to include county equivalentssuch as parishes in Louisiana,

independent cities in Virginia, Maryland,Missouri, and Nevada, etc.) Outside ofNew England, metropolitan areas(MA’s) are defined at the county level,and were used as primary samplingunits. In New England, MA’s aredefined at a subcounty level, so NewEngland County Metropolitan Areas(NECMA’s), which are defined at thecounty level, were used as primarysampling units instead of MA’s.Secondary sampling units (SSU’s) arenoncompact clusters of housing units.

Table 1 summarizes the majorfeatures of the designs since 1973. Someof the terms used in the table aredefined in the glossary and/or aredescribed later in this chapter. Thefigures given for designated housingunits per year, screened households peryear, interviewed households per year,and interviewed persons per year are allestimates. ‘‘ Designated’’ refers to theinitial designation of units forinterviewing and includes vacant andother ineligible units and householdswhere no interviews occur (e.g.,refusals). ‘‘ Screened’’ refers tohouseholds where a successful interviewhas taken place, up to the point wherescreening can occur for those cases tobe screened (described in more detaillater in this chapter). ‘‘ Interviewed’’refers to households retained after thescreening step (if applicable) and forwhich the interview process has beencompleted.

Note that table 1 shows informationfor the 1995–2004 design for both thealpha and the beta samples. The alphasample is the sample originally selectedfor the 1995–2004 survey. The betasample is a subsample of the alphasample and is the sample actuallyfielded for the 1995–2004 survey. Thealpha sample, with an expected annualinterview size of 55,000 households,was designed and selected under theassumption of an increase in the NHISbudget for the 1995–2004 design. Whenthe budget increase did not occur, thebeta sample, with an expected annualinterview size of 41,000 households,was selected. All the subsampling toselect the beta sample was done withinprimary sampling units (i.e., all thePSU’s selected for the alpha samplewere retained for the beta sample).

Additional details on how the betasample was selected from the alphasample are given later in this chapter.Additional information on the alphasample is available in anotherpublication (6).

PSU and StratumFormation

For the 1995–2004 NHIS design,the United States was partitioned into1,995 primary sampling units. The 1,995PSU’s are individual counties or groupsof adjacent counties. Outside of NewEngland, MA’s were used as primarysampling units. In New England,NECMA’s were used as primarysampling units instead of metropolitanareas. The PSU’s for the 52 largestmetropolitan areas were assigned toself-representing (SR) strata, eachcontaining only one PSU. After theinitial 52 SR strata were formed, the1,995 PSU’s were then partitioned into atotal of 194 design strata. Because noindividual PSU was allowed to accountfor more than 50 percent of a stratum’stotal size (estimated 1995 population),an additional 43 PSU’s becameself-representing after the initialstratification process. These 43 PSU’swere assigned to self-representing strata,giving a final total of 237 design strata.All of the remaining 1,900 primarysampling units were nonself-representing(NSR), and each was assigned to one ofthe 142 NSR strata. In addition to theselection of the PSU in each of the 95self-representing strata, two sampleprimary sampling units were selectedfrom 121 of the NSR strata. Theremaining 21 nonself-representing stratawere so small, in terms of totalestimated 1995 population, that only onesample PSU was selected from each ofthese strata.

Metropolitan areas includeconsolidated metropolitan statisticalareas (CMSA’s), metropolitan statisticalareas (MSA’s), and primarymetropolitan statistical areas (PMSA’s).PMSA’s are used as building blocks forCMSA’s, while MSA’s are‘‘ stand-alone.’’ In areas with PMSA’sand CMSA’s, CMSA’s were used asprimary sampling units.

Page 14: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

Page 6 [ Series 2, No. 130

Table 2 displays information aboutseveral characteristics of the samplingstrata for the 1995–2004 survey andtable 3 shows the distribution of theself-representing and nonself-representing primary sampling units bycensus region.

PSU StratificationExcept for some of the 52 largest

self-representing strata, no stratumstraddles a State boundary. Several ofthe largest self-representing stratastraddle State boundaries because thosemetropolitan areas straddle Stateboundaries. Even for these SR strata,because of the way the second-stagesample is drawn, one could considerState as a stratification variable for eachstratum. Within a State, the approximatenumber of strata and sample PSU’s wasfixed in the design. PSU’s in a Statewere partitioned by metropolitan and/ornonmetropolitan status.

If additional strata were required,the U.S. Bureau of the Census stratifiedthe PSU’s by the poverty rate. Thestratification computer program waswritten for the redesign of severalCensus sample surveys and was basedon a clustering algorithm developed byFriedman and Rubin (7). The procedurebegan with a random stratification. Itsystematically created a large number ofstratifications by moving PSU’s fromone stratum to another or by exchangingpairs of PSU’s in different strata. Theresulting between-PSU variance for eachstratification variable was calculated.The criterion for comparing possiblestratifications was the sum ofbetween-PSU variances for thestratification variables; the smaller thesum, the better the stratification. Aconstraint that limited the variability ofthe stratum size to plus or minus25 percent of the average stratum size ina State also was used. Because thesurvey used only poverty to stratifywithin State and metropolitan and/ornonmetropolitan status, the stratificationgenerally grouped PSU’s with thelowest poverty rates in one stratum, thenext lowest in the next stratum, and soforth. The size constraint caused someexceptions to this pattern. There isvariability in stratum definitions from

State to State due to variations inpoverty and the size constraint.

Special Situations ArisingFrom PSU Definitions

To coordinate sampling with theCurrent Population Survey and otherongoing census sample surveys, the U.S.Bureau of the Census partitions somePSU’s, especially the self-representingPSU’s for the largest metropolitan areasthat have components in several States,into finer units (Basic Primary SamplingUnit Components) prior to selection ofsecondary sampling units. Also, onedesign PSU has components in twodifferent census regions. For thesesituations, partitioning the PSU’s intoBasic Sampling Unit Components priorto second-stage sampling results inintroducing State-level stratification. Itoccurs because these components alwaysrespect State boundaries.

For in-house variance estimation,NCHS uses the basic conceptual modelwith 237 strata and 95 self-representingPSU’s, with some collapsing asnecessary. Additional information onvariance estimation is in chapter 3.

PSU Sample Allocation andSelection

Within a nonself-representingstratum, two PSU’s usually wereselected for the survey sample. In 21nonself-representing strata, however,only one PSU was selected for thesample. A total of 358 PSU’s wasselected for the sample. Onenonself-representing PSU is so smallthat it will be ‘‘ rotated’’ out of thesample in 2000 and another PSU will berotated in to take its place.

Primary sampling units wereselected without replacement withprobability proportional to the estimated1995 PSU population size usingDurbin’s procedure (8).

Rotating PSU’sDuring the partition of the United

States into PSU’s, any that were toosmall (less than 3,000 housing units, asenumerated in the 1990 Census) to

provide the minimum sample needed forthe survey was defined as a potentialrotating PSU. It was placed in a rotationcluster for sample selection. The rotationcluster included any other PSU in thesame stratum that was too small toprovide the minimum sample size. If thecombined size of the rotation clusterwas still too small, the next smallestPSU was added to the cluster.

The rotation cluster was treated as asingle primary sampling unit forsampling. If it was selected, the PSU’sin the rotation cluster were randomlyordered, and a random start was chosento determine in which primary samplingunit the sample would start. A sequenceof samples was designated based on therandom start and the size of the PSUand continued on to the next primarysampling unit.

One such rotation cluster wasselected for NHIS. The initial PSUincludes the sample for 1995 through1999. In 2000, it will be rotated out ofthe sample and another PSU in therotation cluster will be rotated in to takeits place for the remainder of the design.

Because of their small size, nooversampling is done in rotating primarysampling units.

Panels of PSU’sNCHS partitioned the survey

sample into four panels of primarysampling units. Each panel is asubsample of the full set of 358 samplePSU’s with the following properties:

+ Each panel has the same samplingproperties.

+ Each panel produces unbiasedestimates for the U.S. population.

+ Each nonself-representing PSU isassigned to only one panel.

In addition to the properties listedabove, the panels are approximatelyequal in the number of sample housingunits. The assignment of PSU’s topanels was intended to keep samplingvariability small for a single-panelestimator.

The panels enable NCHS to restrictthe portion of the NHIS sample that isreused as the sampling frame for othersurveys to a subset of the full NHISdesign. The panels also allow large

Page 15: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

Series 2, No. 130 [ Page 7

sample reductions to be made, ifnecessary, by dropping one or morepanels. This is more efficient thandropping weeks of interviewing orrandomly chosen sample units.

To produce unbiased estimates forthe U.S. population, the largerself-representing PSU’s had to be in allpanels. The U.S. Bureau of the Censusdivided the set of secondary samplingunits in the 13 largest self-representingPSU’s into four parts and assigned eachpart to a panel. The set of SSU’s in the20 medium SR PSU’s was divided intotwo parts and assigned either to panels 1and 3 or to panels 2 and 4. The smallself-representing primary sampling unitsand the nonself-representing primarysampling units were assigned to singlepanels defined by NCHS.

The U.S. Bureau of the Censusdesignated the self-representing PSU’sas ‘‘ large,’’ ‘‘ medium,’’ or ‘‘ small.’’ Ingeneral, PSU’s with populations greaterthan 3,000,000 were designated as‘‘ large,’’ PSU’s with populationsbetween 1,600,000 and 3,000,000 wereclassified as ‘‘ medium,’’ and PSU’s withpopulations less than 1,600,000 werelabeled as ‘‘ small.’’ Some exceptionswere made to reduce the variability inthe sample sizes of the panels.

Substrata Within PSU’sWithin each PSU, the survey uses

an area frame. In some PSU’s, a permitframe also is used. For the area sample,NHIS partitions the 1990 Census blocks(and in some cases ‘‘ combined blocks’’ )in the primary sampling unit into 20substrata based on the 1990 Censusconcentration of black and Hispanicpersons. The substrata definitions areconsistent across PSU’s. Hence, in someprimary sampling units, some of the 20substrata may be empty. In all PSU’s,addresses corresponding to dwellingunits built prior to April 1, 1990, aresubject to sampling from these 20substrata. In areas where governmentalunits issue and maintain buildingpermits, dwelling units built since April1, 1990, are subject to sampling onlyfrom the list of building permits. If adwelling unit built since April 1, 1990,is encountered in the area frame, theinterview is terminated and the unit is

considered to be out of scope. When thepermit frame is used in a PSU, it isconsidered to be the 21st substratum inthat primary sampling unit. However, inareas that do not issue building permitsand in rotating PSU’s, all dwelling unitsregardless of when they were built aresubjected to sampling in the 20 substratain the area sample. Table 4 providesmore information about the 21 substrata.

Units in higher density black orHispanic areas are sampled at higher ratesto improve the reliability of estimates forblack and Hispanic persons. This is thefirst component of the oversamplingstrategy. The second component isdiscussed in the ‘‘ Assignment ofScreening Code’’ section.

It should be noted that densitysubstratum 20 was so small that no surveysample cases were selected from it.

SSU Formation andSampling

Using the U.S. Bureau of theCensus’ usual procedure in areasegmentation, clusters of housing unitswere formed within the substrata tocreate secondary sampling units(SSU’s). The sampling was done inconjunction with other census surveys,and operational techniques of implicitstratification and systematic samplingwere used. An in-depth discussion of themethods used has been published (9). Asimplified structural framework is givenbelow. The ultimate sampling unitconsists of a cluster containing anexpected four housing units orequivalents, based on 1990 Censusinformation. These clusters may beempty or may include ineligible units(e.g., vacant housing units) at the timeof interview. For cost and statisticalefficiency, the expected number ofhousing units to be covered annually byscreening and sampling within an SSUwas planned to be 8 or 12 for the areaframe substrata and 4 for the permitframe substratum (see table 4, ‘‘ BetaSSU size’’ column). For the area framesubstrata, cluster sizes of expected size 8or 12 were created by joining together 2or 3 adjacent ultimate sampling units ofexpected size 4. However, the actualnumber of units may vary from the

expected number, especially in PSU’swhere the survey does not use a permitframe. Moreover, about 20 percent of theaddresses in a national area sampletypically do not include any personseligible for the survey.

Within a substratum, eachsecondary sampling unit is part of a‘‘ super-SSU’’ consisting of 12‘‘ annual-SSU’s’’ (the terms SSU orannual-SSU refer to the annual samplingunit). The 12 annual-SSU’s in asuper-SSU are intended to serve thesurvey for the entire 10-year designperiod and to provide 2 years of‘‘ reserve sample’’ as a contingency (e.g.,if there was a delay in theimplementation of the next NHISdesign). Thus, any housing unit within asuper-SSU can be in sample at mostonce in the 10-year design period.

The sampling process used to formthe super-SSU’s within the primarysampling unit substrata is outlinedbelow. The following simplifications aremade for ease of presentation:

Selection of a 10-year sample(without the reserve sample) isdiscussed.

The subsampling is described as‘‘ within-PSU,’’ rather than ‘‘ withinBasic Primary Sampling UnitComponents,’’ which were discussed inthe ‘‘ Special Situations Arising FromPSU Definitions’’ section.

The generic term ‘‘ block’’ is used todenote both single 1990 Census blocksand ‘‘ combined blocks.’’ Combinedblocks are formed by joining 1990Census blocks with a small number ofhousing units, as enumerated in the1990 Census, with adjacent blocks.

The steps in the sampling processare:

1. All blocks within a PSU’s densitysubstrata are sorted in geographicorder using the following 1990Census variables: district officenumber, address register areanumber, and block number. Eachblock is assigned an associatedmeasure of size defined as thenumber of housing units enumeratedin the 1990 Census.

2. Each block contains an integralnumber of ‘‘ measures’’ where eachmeasure contains about four housing

Page 16: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

Page 8 [ Series 2, No. 130

units. The total number of measuresvaries from block to block, and apartition of the block into realmeasures does not occur unless theblock is actually sampled.

3. Using systematic samplingprocedures, a sequence of ‘‘ hit’’measures is selected. Each ‘‘ hit’’measure defines the first measure ina super secondary sampling unit.The super-SSU will contain the hitmeasure plus the next consecutive19 or 29 measures (to yield a totalof 8–12 expected housing units peryear). The number of measures inthe super-SSU varies by type ofdensity substratum; it is thesubstratum’s entry in table 4, ‘‘ BetaSSU size’’ column, multiplied by{10/4} (the {10/4} comes from{10-year sample/measure of sizeequal to 4}). At this stage ofsampling, every measure has thesame probability of being ‘‘ hit.’’

One could characterize thesampling process as a systematicsample of sets of measures, orsuper-SSU’s, from a hypotheticaluniverse listing. Two possibleconceptualizations of systematicsampling for either measures orsuper-SSU’s can be made.

A sampling interval SImeaon apopulation Mmeaof measures yields asample of (Mmea/SImea) super-SSU’sof size k measures each.

One may also think of acorresponding population ofsuper-SSU’s of size MSSU=(Mmea/k). Here, a sampling intervalof SISSU= SImea/k on MSSUyieldsthe same sample size.

A representation of systematicsampling as a single samplinginterval for either measures orsuper-SSU’s will satisfy thefollowing self-weighting criterion:

For density substratum type (j),within PSU(i) sampled withprobability πi, πi / SISSU(j,i) =constant ( j) (i.e., a constant thatdepends only on j), whereSISSU( j,i) is the sampling intervalfor density substratum type ( j)within PSU(i).

By the definition of a samplinginterval, the conditional probability

that a super-SSU is selected fromsubstratum type ( j), given thatPSU(i) was selected, is either1/SISSU( j,i) or k/SImea( j,i), where kis the number of consecutivemeasures.

Thus, the unconditionalprobability that a super-SSU fromsubstratum type ( j) is in sample isπi c [1/SISSU( j,i)], which simplifiesto constant ( j).

4. The annual secondary sampling unitswithin each selected super-SSU areconstructed by combining every twoor three consecutive measures,depending upon classification ofdensity substratum.

Example of SSU Sampling

The following example illustratesthe formation of super-SSU’s asoutlined in steps 1–4.

(Step 1) Let blocks A–H representeight blocks in a substratum of a PSUthat are in consecutive order as a resultof sorting on selected characteristics(figure 1). The column ‘‘ Housing unitcount’’ shows the 1990 Census housingunit count for each block. In this figure,a number is assigned to each housingunit in the block to illustrate howhousing units would be assigned todifferent measures within the block.

(Step 2) The column ‘‘ Measurecount’’ in figure 1 shows the number ofmeasures each block contains. Forexample, block B consists of 19 housingunits that are partitioned into 5measures.

(Step 3) Suppose that measure 2 inblock B is a sample ‘‘ hit’’ measure fromthe population of measures. If thissubstratum requires an expected 80housing units for the 10-year surveydesign period (i.e., the expectedannual-SSU size is eight housing units),then the next 20 measures will be usedto form the super-SSU. Starting withmeasure 2 in block B, the super-SSUwill include measures from blocks B–G.

(Step 4) A well-defined U.S.Bureau of the Census listing processpartitions the selected blocks B–G intothe specified number of measures. Thehousing units within these selectedblocks are listed by some adjacent or

neighbor order. (Note: in this example,it is not necessary for the housingunits in blocks A or H to be listed.)The assignment of the actual measurelabels will look somewhat like thosepresented in figure 1 (e.g., the first sixadjacently listed housing units inblock B are assigned to measures1,2,3,4,5,1, respectively). To form theannual-SSU’ s, the 20 measures labeled(B,2) to (G,1) are consecutivelypaired. Thus,SSU (year 1) = measures (B,2),(B,3)SSU (year 2) = measures (B,4),(B,5)SSU (year 3) = measures (C,1),(C,2)SSU (year 4) = measures (C,3),(C,4)SSU (year 5) = measures (D,1),(D,2)SSU (year 6) = measures (E,1),(E,2)SSU (year 7) = measures (E,3),(F,1)SSU (year 8) = measures (F,2),(F,3)SSU (year 9) = measures (F,4),(F,5)SSU (year 10) = measures (F,6),(G,1).

Figure 2 shows the year eachhousing unit in the super-SSU will be ina sample.

SSU Clustering Characteristics

There are several clustering aspectsof the sample housing units withinsecondary sampling units that should benoted.

1. An examination of the sample housingunits in an annual-SSU in figure 2shows that in any given year, somehousing units may be neighbors.However, the sample housing unitswill tend to be spread through thegeography of a given block.

2. From one year to the next, there is ageographical clustering of samplehousing units (i.e., sample housingunits of a given year usually areneighbors of sample housing units ofthe next year). However, thegeographical clustering weakensover time.

3. The annual-SSU’s may cross blocksin any given year (e.g., years 7 and10 in the example). The blocks in anSSU may not be adjacent.

4. The super-SSU’s are notwell-defined clustered populationunits like PSU’s, but are artifacts ofa block sort and random systematicsampling.

Page 17: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

Figure 1. Housing unit distribution by measure

Block

Housingunit

countMeasure

count Housing unit measure identification

A . . . . . . . . 8 2 1 2 1 2 1 2 1 2B . . . . . . . . 19 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4C . . . . . . . . 14 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2D . . . . . . . . 9 2 1 2 1 2 1 2 1 2 1E . . . . . . . . 12 3 1 2 3 1 2 3 1 2 3 1 2 3F . . . . . . . . 22 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4G . . . . . . . . 17 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1H . . . . . . . . 9 2 1 2 1 2 1 2 1 2 1

NOTES: Blocks ordered by a sorted list are a subset of eight blocks shown for illustration. Housing units ordered by adjacent units within block.

Figure 2. Housing unit distribution by annual secondary sampling unit

Block

Housingunit

countMeasure

count Year1 housing unit is in sample

A . . . . . . . . . . 8 2 x x x x x x x xB . . . . . . . . . . 19 5 x 1 1 2 2 x 1 1 2 2 x 1 1 2 2 x 1 1 2C . . . . . . . . . . 14 4 3 3 4 4 3 3 4 4 3 3 4 4 3 3D . . . . . . . . . . 9 2 5 5 5 5 5 5 5 5 5E . . . . . . . . . . 12 3 6 6 7 6 6 7 6 6 7 6 6 7F . . . . . . . . . . 22 6 7 8 8 9 9 10 7 8 8 9 9 10 7 8 8 9 9 10 7 8 8 9G . . . . . . . . . . 17 4 10 x x x 10 x x x 10 x x x 10 x x x 10H . . . . . . . . . . 9 2 x x x x x x x x x

11 is first year, 2 is second year, 10 is 10th year, etc. An x represents a housing unit not in the sample.

NOTES: Blocks ordered by a sorted list are a subset of eight blocks shown for illustration. Housing units ordered by adjacent units within block.

Series 2, No. 130 [ Page 9

The within-substratum SSUsampling parameters needed to formselection probabilities and weights areprovided in table 4. This table providesthe alpha-design parameters along withthe subsampling rates to achieve thebeta-design that was implemented forthe 1995–2004 design period.

Example of Within-SubstratumSampling Rates

Consider SSU sampling in densitysubstratum 12 of a self-representingprimary sampling unit from table 4. Thissubstratum consists of all 1990 Censusblocks in the PSU that contained30–60 percent black persons and5–10 percent Hispanic persons accordingto the 1990 Census. The reference groupwill be the permit substratum 21, whichis not oversampled.

For simplicity, suppose that eachsubstratum’s universe can be defined as aset of population alpha annual-SSU’s of asize defined by the ‘‘ Alpha SSU size’’column of table 4 (e.g., substratum 12contains population secondary samplingunits of size 12 housing units (3measures) and substratum 21 contains

SSU’s of size 4 housing units(1 measure)).

Prior to any oversampling, abaseline sampling interval (referred topreviously as SISSU) for the alpha-designwas defined as 1,748 in all substrata in aself-representing PSU.

1. In substratum 21, 1 in every 1,748SSU’s is sampled.

2. In substratum 12, 1 in every1,748/2 = 874 SSU’s are sampled (theoversampling rate for the alpha-designin substratum 12 is 2; refer to the‘‘ Original SSU unit oversamplingrate’’ column of table 4).

Then, to reduce the alpha-designSSU’s to the NHIS beta-design, thefollowing steps are taken:

1. In substratum 21, the ‘‘ SSU unitreduction’’ column of table 4indicates that a reduction of 30 outof 101 SSU’s occurred. Hence,71/101 of the SSU’s are retained.Thus, the base weight is(101/71) c 1,748 = 2,486.5915

2. In substratum 12, the ‘‘ SSU unitreduction’’ column of table 4

indicates that a reduction of 25 outof 101 SSU’s occurred. Hence,76/101 of the SSU’s are retained.Also, referring to the ‘‘ Alpha SSUsize’’ and ‘‘ Beta SSU size’’ columnsof table 4, the expected SSU sizewas reduced from 12 to 8. Thus, thebase weight is (101/76) c (12/8) c 874= 1,742.25

These base weights are the inversesof the probability of annual-SSUselection.

Continuing with the step ofoversampling black and Hispanichouseholds after the secondary samplingunit has been selected (refer to the‘‘ Nonminority household subsamplerate’’ of table 4):

1. In substratum 21, there is nooversampling. All housing units inthe SSU are targeted for interview.Thus all target housing unitsreceive the SSU base weight,2486.5915.

2. In substratum 12, a proportion(0.7032) of all households aretargeted for interview. The remainingare screened and interviewed only if

Page 18: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

Page 10 [ Series 2, No. 130

one or more eligible (i.e., civilian)black or Hispanic residents arefound. Thus, households containingone or more eligible black orHispanic residents are in samplewith certainty; their multiplicativebase weight is

SSU weight c 1 = 1,742.25Nonminority households are

subsampled and their base weightsare

(1,742.25 / 0.7032) = 2,477.60

An example for density substratumtype (j), within nonself-reporting PSU(i)sampled with probability πi, would besimilar. The one difference would bethat the original sampling intervalwould be adjusted from 1,748 to satisfythe self-weighting requirementπi /SISSU( j,i) = constant ( j) (in thisexample, constant ( j) is the reciprocal ofthe value in table 4, the ‘‘ Alpha SSUbase weight’’ column) , where SISSU( j,i)is the sampling interval for densitysubstratum type ( j) within PSU(i).

Sampling in the Permit Frame

The proportion of the survey sampleselected from the permit frame is about5 percent of the total sample at thebeginning of the 10-year design cycleand increases by about 1 percent peryear. Hypothetical measures are selectedduring within-PSU sampling inanticipation of the construction of newhousing units. Identifying the addressesfor these permit measures involves alisting operation conducted at buildingpermit offices, clustering of addresses toform measures, and associating theseaddresses with the hypotheticalmeasures in the sample.

The U.S. Bureau of the Censusconducts the Building Permit Survey,which collects information each monthfrom all building permit offices in theUnited States about the number ofhousing units authorized to be built. Thesurvey results are converted to measureswith an expected size of four housingunits. These measures are continuouslyaccumulated and linked with the frameof hypothetical measures used to selectthe survey sample. This linkingidentifies which building permit officescontain measures that are in sample.

U.S. Bureau of the Census fieldrepresentatives then visit these buildingpermit offices to obtain lists of addressesof units that were authorized to be built.These lists are used to identify thesample units. To the extent possible, theU.S. Bureau of the Census attempts toform groups of sample units that havesome geographic clustering.

Sampling of Group Quarters

Noninstitutional, nonmilitary groupquarters contain persons eligible forinclusion in NHIS. The sampling ofgroup quarters is done exclusively fromthe area frame. If a group quarters unitis encountered during the address listingoperation that occurs prior to NHISinterviewing, the NHIS address listeruses reference materials to determinewhether the group quarters isnoninstitutional and nonmilitary. If so, itis included in the address list. Thepermit frame never is used to selectgroup quarters; any group quarters unitencountered among permit frame samplecases is considered to be out of scope.Interviewing within noninstitutional,nonmilitary group quarters is similar tohousehold interviewing.

Sampling of Persons WithinHouseholds

In 1995 and 1996, persons withinhouseholds were selected for NHISquestionnaire supplements usingprobability sampling methods. Forexample, in 1995 the ‘‘ Year 2000Objectives’’ supplement wasadministered to one sample adultrandomly chosen from each familyresiding in half of the samplehouseholds. Beginning in 1997, onesample adult and one sample child(if children are present) are randomlyselected from each family residing in asample household for administration ofa large portion of the survey interview.

Interviewer Assignments

The NHIS sample areas have beendivided into 170 assignment areas ofone or more counties. The areas weredefined by the U.S. Bureau of theCensus’ Field Division for the survey.

Generally there is a single interviewerfor each assignment area, with severalinterviewers assigned to the largerassignment areas. The sample for eachquarter is divided into approximately930 weekly interviewer assignments.The number of weekly interviewerassignments in each assignment area ispart of the input to the balancingprocedure discussed below.

The survey samples secondarysampling units in each assignment areaand quarter, and they are grouped intoweekly interviewer assignments.Interviewers have an expected 10–12completed interviews per week. Tominimize travel in a weekly interviewerassignment, secondary sampling unitswere grouped in the same countywhenever possible. SSU’s from adjacentcounties were used to complete weeklyassignments that could not be formedwithin a single county.

Balancing the Sample

For operational reasons, the specificweeks that each interviewer receives anassignment should be as evenly spacedas possible throughout all 52 weeks ofthe year, not simply within the 13 weeksof each quarter. Because someassignment areas have more than 13weekly interviewer assignments andmore than one interviewer, the weeklyinterviewer assignments in eachassignment area were distributed evenlyamong the weeks. The census regionaloffice staffs then distribute the weeklyinterviewer assignments among theinterviewers in multi-interviewerassignment areas.

For estimation purposes, theweekly interviewer assignments aredistributed among the 13 weeks in aquarter. This minimizes the variationamong the weeks in the number ofmeasures and number of expectedcompleted interviews in the followingcategories:

+ Each census regional office+ Each census region (Northeast,

Midwest, South, West) and the totalUnited States

+ Each census region and the totalUnited States by type of PSU(SR and NSR)

Page 19: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

Series 2, No. 130 [ Page 11

+ Each census region and the totalUnited States by the geographiccategories C, B, U, R (C is thecentral cities of a metropolitan area;B is the urbanized area not incategory C; U is the urban placesnot in an urbanized area and not incategory C; R is all other areas), andby new construction

The U.S. Bureau of the Censusdeveloped software to balance thesample within these operational andestimation constraints.

Additional SubsamplingSituations

During the address listingoperation that occurs prior tointerviewing, sometimes a larger thanexpected number of housing units isidentified in a secondary samplingunit. If the address lister finds morethan twice the number of expectedunits in an SSU, the lister subsamplesthe units. This causes an adjustment inthe probability of sample selection forall units in that SSU.

Sometimes the NHIS interviewerencounters extra units at a sampleaddress that were not identified duringthe listing operation. Often when threeor more additional units are identified,the interviewer subsamples one of theunits. This causes an adjustment in theprobability of sample selection for theparticular unit that is selected. Thistype of subsampling is rare and iscarried out to maintain reasonableinterviewer workloads.

Assignment of ScreeningCode

The second component of theoversampling strategy involves theelimination of some samplehouseholds that do not contain anyblack or Hispanic persons. First,screening codes are assigned to sampleaddresses. Then screening interviewseliminate some households.

Prior to data collection, but afterthe address listing operation and SSUlevel subsampling (if necessary), someof the addresses from each SSU arerandomly assigned a screening code of

I (interview) by the U.S. Bureau ofthe Census. Others are assigned ascreening code of S (screen). Theproportion of addresses assigned toeither I or S depends on the samplingsubstratum from which the SSU wasdrawn (see the ‘‘ Nonminorityhousehold subsample rate’’ column oftable 4). The S cases are spread out asmuch as possible within the SSU. Asindicated in the ‘‘ Nonminority householdsubsample rate’’ column of table 4, allsample cases drawn from the permit frameare assigned an I code.

For 1995 and 1996, theassignment of screening codes wascarried out by using a systematicsampling procedure using integer-length sampling intervals. Forexample, if 75 percent of the addressesin a particular substratum were to beassigned a screening code of I, all ofthe addresses in half of the SSU’ s inthis substratum would be selected, andthe addresses in the other half of theSSU’ s would be selected at a rate of 1of every 2. All the selected unitswould be assigned a code of I and theremainder a code of S. This systematicsampling procedure was necessarybecause the subsampling was asystematic sampling clerical procedurecarried out in the Census regionaloffices, and the U.S. Bureau of theCensus headquarters staff wanted theregional office staff to useinteger-length sampling intervals.

Starting in 1997, the assignmentof screening codes was automated.Integer-length sampling intervals wereno longer needed, and a singlesampling interval is used in eachsubstratum. In the example above,using a uniform systematic selectionrate of 3 out of every 4, selected unitswould be assigned a code of I and theremainder would be assigned the Scode.

Sampling RuleImplementation

The interviewers conduct the usualNHIS basic health and demographicinterview for every address thatcontains a household and has ascreening code of I .

For every household at addresseswith a code of S, the interviewersconduct the interview by collecting thehousehold roster and the race andethnicity for each household member.If the S household contains one ormore eligible (i.e., civilian) black orHispanic persons, then the householdis retained in the sample and theinterviewer completes the remainder ofthe interview. If the S householdcontains neither an eligible blackperson nor an eligible Hispanicperson, then the household is notretained in the sample, and theinterviewer does not complete theremainder of the interview.

Because I and S codes areassigned prior to interviewing, therecan be discrepancies between expectedsubsampling rates and actualsubsampling rates. For example, allthe I codes in a secondary samplingunit could be assigned to addressesthat contain households, while all ofthe S codes in the SSU are assigned toaddresses that are vacant housingunits. In this case, no subsamplingwould occur in the secondarysampling unit. Another example iswhen all households in an SSU that donot include any eligible black orHispanic persons are assigned I codes.In this case, no subsampling wouldoccur. Presumably, subsamplefluctuations at the secondary samplingunit level balance out over the entiresample.

During 1995 and 1996, if theinterviewer had made twounsuccessful attempts to conductscreening at an occupied sample unitwith a code of S, the interviewer couldcontact two neighbors to ask if thesample household contained any blackor Hispanic persons. If both neighborsagreed the household did not containany black or Hispanic persons, thehousehold did not need to beinterviewed. Otherwise, theinterviewer was supposed to continuetrying to contact the sample householdto determine whether a completeinterview was required.

Beginning in 1997, informationfrom neighbors no longer was used forscreening purposes.

Page 20: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

Table 1. National Health Interview Survey designs: 1973–84, 1985–94, 1995–2004

Design period 1973–84 1985–94

1995–2004

Alpha Beta

Sampling frame . . . . . . . . . . . . . . . . . . . . . . . . . Address, area,permit, groupquarters

Area, permit Area, permit Area, permit

PSU definitions . . . . . . . . . . . . . . . . . . . . . . . . . One or morecounties

One or morecounties

One or morecounties

One or morecounties

1970 SMSA’s 1983 MSA’s 1990 MSA’s 1990 MSA’sSR sample PSU’s . . . . . . . . . . . . . . . . . . . . . . . 156 52 95 95NSR sample PSU’s . . . . . . . . . . . . . . . . . . . . . . 220 146 263 263Total sample PSU’s . . . . . . . . . . . . . . . . . . . . . . 376 198 358 358Sample PSU’s per NSR stratum . . . . . . . . . . . . . . 1 2 Normally 2,

sometimes 1Normally 2,sometimes 1

First level of stratification . . . . . . . . . . . . . . . . . . . 4 censusregions

4 censusregions

50 States and theDistrict of Columbia

50 States and theDistrict of Columbia

Designated housing units per year . . . . . . . . . . . . . 51,000 61,400 124,000 70,000Screened households per year . . . . . . . . . . . . . . . Not applicable Not applicable 99,000 57,000Interviewed households per year . . . . . . . . . . . . . . 40,000 49,000 55,000 41,000Interviewed persons per year . . . . . . . . . . . . . . . . 108,000 132,000 159,000 107,000SSU size (except permit frame) — expected number

of housing units . . . . . . . . . . . . . . . . . . . . . . . . 4 8 8, 12, 16 8, 12Permit frame SSU size — expected number of

housing units . . . . . . . . . . . . . . . . . . . . . . . . . . 4 4 4 4Number of panels . . . . . . . . . . . . . . . . . . . . . . . Not defined 4 Not defined 4Minority sampling techniques . . . . . . . . . . . . . . . . None Oversample for

black personsOversample andscreen for black andHispanic persons

Oversample andscreen for black andHispanic persons

NOTE: PSU is primary sampling unit, SSU is secondary sampling unit, SR is self-representing, NSR is nonself-representing, SMSA is Standard Metropolitan Statistical Area as defined by the U.S.Office of Management and Budget in 1959, MSA is Metropolitan Statistical Area as defined by the U.S. Office of Management and Budget in 1983, and MA is Metropolitan Area as defined by the U.S.Office of Management and Budget in 1990.

Table 2. Sampling strata characteristics: National Health Interview Survey, 1995–2004

Stratum type

Numberof

strata

Universecoverage

(population)SamplePSU’s

Size of strata(population)

SR . . . . . . . . . . . . . . . . . . . . . 95 64% 95 90,000–18,000,000NSR, 2 PSU’s . . . . . . . . . . . . . . 121 34% 242 380,000–1,000,000NSR, 1 PSU . . . . . . . . . . . . . . . 21 2% 21 90,000–360,000Total . . . . . . . . . . . . . . . . . . . . 237 100% 358 . . .

. . . Category not applicable.

NOTE: PSU is primary sampling unit; SR is self-representing; NSR is nonself-representing.

Table 3. Primary sampling units by census region: National Health Interview Survey, 1995–2004

Type of primarysampling unit

Census region

TotalNortheast Midwest South West

Self-representing . . . . . . . . . . . . . . . . . . . . . . . . 22 19 34 20 95Large . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 2 4 2 13Medium . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 7 4 5 20Small . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 10 26 13 62Nonself-representing . . . . . . . . . . . . . . . . . . . . . . 29 76 119 39 263Total . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51 95 153 59 358

NOTES: The designation of self-representing primary sampling units as ‘‘large,’’ ‘‘medium,’’ or ‘‘small’’ was done by the U.S. Bureau of the Census. In general, primary sampling units with populationsgreater than 3,000,000 were designated as ‘‘large,’’ PSU’s with populations of 1,600,000–3,000,000 were designated as ‘‘medium,’’ and PSU’s with populations less than 1,600,000 were designated as‘‘small.’’

Page 12 [ Series 2, No. 130

Page 21: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

Table 4. Race and ethnicity density strata and within primary sampling unit sampling parameters: National Health Interview Survey,1995–2004

Substratumlabel

1990 Census densityPercentof U.S.

populationin 1990

OriginalSSU unit

over-sampling

rate3

RevisedSSU unit

over-sampling

rate4

SSUunit

reduction5

AlphaSSUsize6

BetaSSUsize7

AlphaSSUbase

weight8

BetaSSUbase

weight9

Nonminority household1

Percentblack2

PercentHispanic2

Subsamplerate10

Baseweight11

1 . . . . . . . . . . . . <10 <5 55.3 1.60 1.00 17 16 12 1,092.5000 1,751.4683 0.7032 2,490.71142 . . . . . . . . . . . . <10 5–10 7.7 2.00 1.50 0 16 12 874.0000 1,165.3333 0.4688 2,485.77933 . . . . . . . . . . . . <10 10–30 8.0 3.20 1.50 38 16 12 546.2500 1,167.6455 0.4688 2,490.71144 . . . . . . . . . . . . <10 30–60 3.1 4.00 1.60 61 12 12 437.0000 1,103.4250 0.4395 2,510.63715 . . . . . . . . . . . . <10 60+ 3.5 4.00 2.30 43 8 8 437.0000 760.9828 0.3057 2,489.31236 . . . . . . . . . . . . 10–30 <5 4.6 1.60 1.00 38 12 12 1,092.5000 1,751.4683 0.7032 2,490.71147 . . . . . . . . . . . . 10–30 5–10 1.4 2.00 1.50 25 12 12 874.0000 1,161.5000 0.4688 2,477.60248 . . . . . . . . . . . . 10–30 10–30 1.9 3.50 1.50 58 12 12 499.4286 1,173.0764 0.4688 2,502.29619 . . . . . . . . . . . . 10–30 30–60 0.9 4.00 1.50 63 8 8 437.0000 1,161.5000 0.4688 2,477.6024

10 . . . . . . . . . . . . 10–30 60+ 0.7 4.00 2.00 51 8 8 437.0000 882.7400 0.3516 2,510.637111 . . . . . . . . . . . . . 30–60 <5 2.3 1.70 1.00 12 12 8 1,028.2353 1,750.3106 0.7032 2,489.065212 . . . . . . . . . . . . 30–60 5–10 0.5 2.00 1.00 25 12 8 874.0000 1,742.2500 0.7032 2,477.602413 . . . . . . . . . . . . 30–60 10–30 0.8 3.50 1.00 58 12 8 499.4286 1,759.6146 0.7032 2,502.296114 . . . . . . . . . . . . 30–60 30–60 0.7 4.00 1.50 63 8 8 437.0000 1,161.5000 0.4688 2,477.602415 . . . . . . . . . . . . 30–60 60+ 0.2 4.00 2.00 51 8 8 437.0000 882.7400 0.3516 2,510.637116 . . . . . . . . . . . . 60+ <5 6.6 1.95 1.05 49 8 8 896.4103 1,741.1045 0.7032 2,475.973517 . . . . . . . . . . . . 60+ 5–10 0.6 2.00 1.00 51 8 8 874.0000 1,765.4800 0.7032 2,510.637118 . . . . . . . . . . . . 60+ 10–30 0.9 3.50 1.20 66 8 8 499.4286 1,441.2082 0.5860 2,459.399619 . . . . . . . . . . . . 60+ 30–60 0.2 4.00 1.50 63 8 8 437.0000 1,161.5000 0.4688 2,477.602420 . . . . . . . . . . . . 60+ 60+ 120.0 4.00 1.50 63 8 8 437.0000 1,161.5000 0.4688 2,477.602421 . . . . . . . . . . . . permit permit . . . 1.00 1.00 30 4 4 1,748.0000 2,486.5915 1.0000 2,486.5915

. . . Category not applicable.1Nonminority includes everyone except black and Hispanic persons.2For intervals of form ‘‘x-y,’’ the lower endpoint is included and the upper endpoint is not included.3Alpha-design is the oversampling rate for secondary sampling units.4Beta-design is the oversampling rate for secondary sampling units.5The number of alpha-design secondary sampling units that are removed per 101 alpha-design secondary sampling units to obtain the beta-design secondary sampling units.6Expected number of housing units in each alpha-design secondary sampling unit.7Expected number of housing units in each beta-design secondary sampling unit.8Annualized base weight of alpha-design secondary sampling units. A given entry in this column is equal to 1,748 divided by the corresponding entry in the ‘‘Original SSU unit oversampling rate’’column. For example, the first entry in this column, 1092.5000, is equal to 1,748/1.6.9Annualized base weight of beta-design secondary sampling units. A given entry in this column is equal to the corresponding entry in the ‘‘Alpha SSU base weight’’ column, multiplied by a factor tocompensate for SSU unit reductions (see the ‘‘SSU unit reduction’’ column), and multiplied by a factor to compensate for change in the SSU size (see the ‘‘alpha SSU size’’ and ‘‘beta SSU size’’columns). For example, the first entry in this column, 1751.4683, is equal to 1092.5000 c [101/(101–17)] c (16/12).10Expected fraction of nonminority households sampled in NHIS secondary sampling units.11Base weight of nonminority household in NHIS SSU. A given entry in this column is equal to the ratio of the corresponding entries in the ‘‘Beta SSU base weight’’ column and the ‘‘nonminorityhousehold subsample rate’’ column. For example, the first entry in this column, 2490.7114, is equal to (1751.4683/0.7032).12Quantity more than zero but less than 0.05.

NOTE: SSU is secondary sampling unit.

Series 2, No. 130 [ Page 13

Page 22: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

Page 14 [ Series 2, No. 130

Chapter 3.Design and EstimationStructures for the1995–2004 NHIS

Lead author: Van L. Parsons,Ph.D., Offıce of Research andMetholodogy, National Center forHealth Statistics

IntroductionThe National Health Interview

Survey (NHIS) is designed to makeinferences about the civiliannoninstitutionalized population of theUnited States. While a generaldescription of the NHIS sample designis presented in chapter 2 of this report,this chapter focuses on the designstructures needed in making statisticalinference. The NHIS program at theNational Center for Health Statistics(NCHS) focuses its attention on makingdesign-based inferences about the healthof persons and households in the targetpopulation. This is accomplished byinflating the responses of each surveyedperson or household in NHIS, referredto as elementary units, by a nationalweight factor. This is called a nationalweight because it permits an(approximately) unbiased design-basedestimator of any U.S. target populationtotal. With this weight an unbiasedestimator, X, for any given truepopulation characteristic total, X, can beexpressed as a weighted sum over allelementary units:

X = ∑u

Wf (u) x(u)(1)

where

u represents an elementary unit ofNHIS

x(u) is the characteristic or responsefor unit u

Wf (u) is the final national weight forunit u

This estimator of total is used togenerate the NHIS estimates of totals,percents, and rates that appear in NCHSofficial publications. A national weight is

provided on the NHIS Public-Usedatabases that allows users to directlycreate estimators of the form in equation(1). In the sections that follow, thetechnical aspects of the procedures usedto create national weights and theprocedures used to estimate thevariances of NHIS estimators arediscussed.

Methods for CreatingWeights and Estimators

Complex estimation techniques arerequired for the survey because it isbased on a highly stratified multistageprobability sample. The true samplingdistributions of any surveyimplementing complex clusteringstructures, implicit stratification, andsystematic sampling tend to bemathematically intractable. Therefore,the survey design will be conceptualizedin a somewhat simplified framework.This conceptual design will provide atractable structural model that capturesthe most important design features. Theprimary sampling unit (PSU) andwithin-PSU sampling steps discussed inchapter 2 can be expressed as ahierarchical sampling design with levelsand probabilities shown in table 5.

It should be noted that the numberof super-secondary sampling units(SSU’s) in the survey is a randomvariable, but with very little variability;hence the super-SSU sample sizes willbe treated as fixed.

The number of eligible householdsin an SSU also is a random variable. Ahousing unit (HU) may be classified asan eligible household or an ineligibleunit. Nationally, about 20 percent of alladdresses yield an ineligibleclassification (e.g., vacant). Newconstruction and/or destruction ofhousing units within a block in the5 years or more after the 1990 Censusresults in a degradation of the originalframe composition information. It ispossible, although a rare event, thatextreme conditions can occur within asecondary sampling unit at the time ofsampling. Either no eligiblehouseholds exist or too many newHU’s exist (use of a permit frameand/or field subsampling, as discussed

in chapter 2, dampen the influence ofthe latter condition). The annualnumber of persons in the sample alsois a random variable. As aconsequence, the annual surveysexhibit different sample counts ofhouseholds and persons.

The three broad estimation criteriathat NCHS applies when deciding onestimation strategies to use for surveydata are:

1. The estimation methods must bedesign-basedfor finite populations.That is, the randomness of the datais a result of sampling finiteuniverses having no imposeddistributional assumptions. This is incontrast to a model-based approachwhere the data typically haveimposed distributional assumptions.The design-based methods may bethought of as nonparametric androbust.

2. The design-based methods should bepractical and should permit(approximately) unbiased estimatorsof population totals.

3. The design-based methods shouldpermit practical variance estimationstrategies to assess the stability ofthe estimator.

To satisfy these criteria, NCHS, aswell as many other sponsors of largegovernment surveys, has been usingstandard accepted design-based methodsdiscussed in such classic references asCochran (10), Kish (11), and Hansen,Hurwitz and Madow (12).

National WeightsThe NHIS estimator of a

characteristic total as presented inequation (1) uses methodology based onthe features of the complex multistageprobability sample to define a nationalweight, Wf , for each elementary unit.This national weight is the product ofup to four of the following weightingfactors:

1. Inverse of the probability ofselection

2. Household nonresponse adjustment

3. First-stage ratio adjustment

Page 23: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

Series 2, No. 130 [ Page 15

4. Second-stage ratio adjustment(poststratification)

When the elementary unit is anindividual, all four weighting factorsdefine the person’s final weight. Becausethe NHIS ratio adjustments are based onperson characteristics, only the first twoweighting factors are used to define anational household weight.

NCHS creates weights for eachcalendar quarter of the NHIS sample,using information provided by the U.S.Bureau of the Census. These weightspermit national estimates to be made foreach quarter. Quarterly weights aredivided by 4 to create annual weights,which are used when making nationalestimates from a calendar year of thesurvey sample.

Base Weight EstimatorThe overall probability that a unit is

in sample is the product of theconditional selection probabilitiespresented in table 5. This basic inflationweight is defined as:

WI (u) = 1/Probability (unit u is in sample).

Roughly speaking, based onprobabilistic sampling, unit u representsWI (u) population units. This weightdepends in part on the minority status ofthe household and substratum class towhich the unit u belongs. WI is the firstcomponent weight of Wf in equation (1).Table 4 in chapter 2 presents the targethousehold inflation weights in the ‘‘ BetaSSU base weight’’ and ‘‘ Nonminorityhousehold base weight’’ columns.

Infrequently, this base samplingweight, WI , will be modified. If, duringthe HU selection process of step 4 intable 5, a secondary sampling unit isdetermined to contain too many housingunits for interview, then a subsamplewill be selected. If the subsampleconsists of less than 1/4 of all HU’s inthe secondary sampling unit, then theconditional probability of selection willbe truncated by the U.S. Bureau of theCensus at 1/4. This is a rare occurrence,and the biases introduced by such amodification should be small.

In an ideal hypothetical samplingsituation having no nonsampling errorcomponents (e.g., frame problems,

nonresponse, and interviewer effects),equation (1) with WI substituted for Wf

becomes

X0 = ∑u

WI (u) x(u)(2)

which will provide an unbiasedestimator for the true population total,X. Such an estimator is referred to as abase weight estimatoror as aHorvitz-Thompson estimator.

Factors Contributing toAnnual WeightFluctuations

Complex sampling implemented byspecified guidelines is difficult toachieve in the real world. Some types ofspecial situations that may occur duringthe survey and will have an influence onweighting procedures for a given yearare:

1. Initial start-up problems during thefirst year of the survey may result inminor deviations from the samplingplan.

2. NHIS budgetary changes may resultin additional subsampling to reducethe sample. For example, in 1985and 1986 the NHIS samples werereduced by 25 and 50 percent,respectively, but 1987–94 were fullbudget years. Similar contingencyplans exist for the 1995–2004 design(see the section ‘‘ Panels of PSUs’’ inchapter 2).

3. Phase-in of new field operations maymodify the sample. In 1996, thetransition from paper and pencil tocomputer-assisted interview resultedin only 5/8 of the full survey samplebeing targeted for eventual publicrelease.

4. In recent years, the survey hasused one or more weeks at thebeginning of the year (duringquarter 1) for interviewer training.During this period, no data arecollected for release. In thissituation, all sampled units forquarter 1 have their basic inflationweight increased by an appropriatefactor to inflate to 13 weeks. For

example, if 1 week was used for

training, the factor is 13/12; if 2weeks were used for training, thefactor is 13/11.

5. Unexpected one-time events mayalter the design. For example,interviewing for the 1995 surveywas not completed due to agovernment shutdown in December1995. As a result, the within-PSUsampling came up 3 weeks ‘‘ short’’for several PSU’s that year. Forquarter 4 of 1995, the basic inflationweight was increased by a factor of13/10.

In this report, only the anticipatedweighting adjustments are consideredfor a full sample of the survey.

Household NonresponseAdjustment

During 1995–97, the first 3 years ofthe current survey design, the householdnonresponse rates for the core surveywere about 6.2, 6.2, and 8.2 percent,respectively. Although small, thishousehold nonresponse will most likelybias an estimator of the form shown inequation (2), and consequently, aweighting adjustment for householdnonresponse is justified. This needresults in the creation of the secondweight factor, the householdnonresponse adjustment.

The standard householdnonresponse adjustment is done byinflating the sampling weights for allresponding households within an SSU tocompensate for the nonrespondinghouseholds within that same secondarysampling unit. A special situation doesoccur. Typically, 5–15 SSU’s in aquarter have 100 percent nonresponse.In such a situation, no nonresponseadjustment is made. It is assumed thatthe poststratification ratio adjustmentwill compensate for the nonresponse.

In the 1985–94 NHIS design, whenall eligible households (i.e., householdswith one or more civilian members) in asecondary sampling unit were sampledwith certainty (i.e., no screeningoccurred), NCHS used the simplenonresponse adjustment

� all eligible households in SSUΣ all responding households in SSU

Page 24: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

Page 16 [ Series 2, No. 130

(Note that this unweighted sum usuallyis equivalent to the sum obtained usingthe base weights because the baseweight is constant within an SSU. Theonly exception is the rare occurrencewhen the interviewer discovers three ormore additional units at a sampleaddress. Additional information isavailable in ‘‘ Additional SubsamplingSituations’’ in chapter 2.

In the 1995–2004 NHIS design,households that have no black orHispanic persons are subsampled at theSSU level in the area frame, asdescribed in chapter 2. The subsamplingoccurs after all housing units in the SSUhave been randomly divided by the U.S.Bureau of the Census into two groups,coded I and S, prior to interviewing. Allsampled households that have no blackor Hispanic persons must come from theI group. Those in the S group arescreened out. All black and Hispanichouseholds in either group areinterviewed. This within-SSUsubsampling creates two issues that didnot have to be addressed in previoussurvey designs:

1. The race/ethnicity of somehouseholds cannot be determinedbecause the interviewer neversucceeds in making contact with thehousehold. If the true status is blackor Hispanic, then these arenonresponding households. If thetrue status is ‘‘ other,’’ then thehousehold may be nonsampled orsampled, but nonresponding.

2. Even with no nonresponse, the baseweight estimated total number ofhouseholds within the SSU, basedon the interviewed sample, may notbe equal to the true total number ofhouseholds within the secondarysampling unit. (The expected value,however, is unbiased.) A simplehypothetical example is: Suppose theI/S sampling rule requires that withinevery substratum of type a, asubsample of 1 of every 2 householdunits within an SSU be taken. Allhouseholds that have no black orHispanic persons in this I subsampleare targeted for interview, but allblack and Hispanic households inthe entire SSU are targeted. Thus,every sampled black and Hispanic

household will receive a componentweight of 1, while every sampled‘‘ other’’ household will have aweight of 2. The sum of the ‘‘ other’’household weights will be an evennumber that may not equal the truevalue.

Methodology for 1995 and1996 Surveys

In 1995 and 1996, NCHS used thefollowing method to adjust fornonresponse:

Each household in a given SSU wasclassified as belonging to 1 of 3 groups:

M = black or Hispanic householdO = ‘‘ other race’’ householdU = unknown race and ethnicity status

household

The class O households weresubsampled, but all class M householdswere selected for interview.

Let WH = conditional inflationweight restricted to step 5 of table 5(and any other special inflation factorsas discussed in the ‘‘ Base WeightEstimator’’ section above).

The following weighted sums (withrespect to WH) were computed acrossthe SSU:

WH (M) = weighted sum of sampleclass M households

WH (O) = weighted sum of sampleclass O households

WH (Mres) = weighted sum of respondingsample class M households

WH (Ores) = weighted sum of respondingsample class O households

The nonresponse adjustment foridentified class M or O households inthe SSU was defined as

NROM =[WH (M) + WH(O)]

[WH(Mres) + WH(Ores)]

This does not include the unknownclass U. To compensate, let

N(M,O,U) = unweighted sum of allhouseholds (M,O,U) in the SSU, andN(M,O) = unweighted sum of M and Ohouseholds in the SSU.

Note, here the unweighted sumsN(M,O,U) and N(M,O) are over theentire SSU, while the weighted sums,WH( ) , are computed only on the

sampled component.The nonresponse adjustment factor

for the SSU was defined as

NR= [N(M,O,U) / N(M,O)] c NROM

If all households were successfullyscreened with respect to race/ethnicity, i.e., if N(M,O,U) = N(M,O),then

NR = NROM

The final household nonresponseadjustment factor for the SSU, Wnr, wasdefined as

Wnr = minimum (NR,2.0)

That is, the final factor was truncated to2 to control the variability in theweights due to this factor. Typically,fewer than 0.5 percent of SSU’s usedthis truncated factor.

Methodology for 1997 Surveyand Beyond

Beginning in 1997, NCHS has useda different method to adjust fornonresponse. This method accounts forthe difference in eligibility for inclusionin the survey, based on use of the I andS screening codes, as described inchapter 2.

All of the households in a givenSSU that belong to one of the followingfour groups are eligible:

MI = black or Hispanic household,I screening code

MS = black or Hispanic household,S screening code

OI = ‘‘ other race’’ household,I screening code

UI = unknown status household,I screening code

None of the households in the SSUin the following group are eligible:

OS = ‘‘ other race’’ household,S screening code

Some, none, or all of thehouseholds in the SSU in the followinggroup are eligible:

US = unknown status household,Sscreening code

If the number of households in theUS group is not zero, the proportion ofeligible households in the SSU isestimated using information from

Page 25: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

Series 2, No. 130 [ Page 17

households with known race and/orethnicity. The eligible proportion isestimated by summing the number ofMI, MS, OI, and OS households in theSSU and then computing:

MINPROP =(MI + MS)

(MI + MS + OI + OS)

assuming the denominator is not zero;otherwise, MINPROP is set equal to 0.Once MINPROP is defined, thecomplementary proportion OTHPROP isdefined as [1−MINPROP] if thedenominator of MINPROP is not zero;OTHPROP is set equal to 0 if thedenominator of MINPROP is zero.

Let WH = conditional inflationweight restricted to step 5 of table 5(and any other special inflation factorsas discussed in the ‘‘ Base WeightEstimator’’ section).

The following weighted sums (withrespect to WH) are computed across theSSU:

WH (MI) = weighted sum of sampleclass MI households

WH (MS) = weighted sum of sampleclass MS households

WH (OI) = weighted sum of sampleclass OI households

WH (MI, res) = weighted sum ofresponding sample classMI households

WH (MS,res) = weighted sum ofresponding sample classMS households

WH (OI, res) = weighted sum ofresponding sample class OI

households

Then, the nonresponse factor for theSSU is computed as:

NR= [WH(MI) + WH(MS) + WH(OI)+ f1(UI) + f2(US)] / [WH(MI, res)+ WH(MS,res) + WH(OI, res)]

where f1(UI) denotes a partition of thehouseholds in UI using MINPROP andOTHPROP, with appropriate WH factorsapplied to each piece, and f2 (Us)denotes estimation of the black and/orHispanic households in US usingMINPROP, and the appropriate WH

factor applied to that piece.

More specifically,

f1(UI) = WH(OTH) (OTHPROP c UI)

+ WH(MIN) (MINPROP c UI)

f2(US) = WH (MIN) (MINPROP c US)

whereWH (OTH) (OTHPROP c UI) =weighted sum of the ‘‘ other race’’estimated proportion of UI

households, where the weightWH (OTH) is the weight applied to‘‘ other race’’ households in the SSU.

WH (MIN) (MINPROP × UI) = weightedsum of the black and Hispanicestimated proportion of UI

households, where the weight WH(MIN)

is the weight applied to black andHispanic households in the SSU.

WH (MIN) (MINPROP × US) =weighted sum of the black andHispanic estimated proportion of US

households, where the weightWH(MIN) is the weight applied toblack and Hispanic households inthe SSU.

In essence, the nonresponse factorNR consists of a numerator that is anestimate of the total number of eligiblehouseholds in a given SSU, and adenominator that is the total number ofinterviewed households. Weight factorsthat account for subsampling within theSSU are included as appropriate.

The final household nonresponseadjustment factor for the SSU, Wnr , isdefined as

Wnr = minimum (NR, 2.0)

That is, the final factor is truncatedto 2 to control the variability in theweights due to this factor. Typically,fewer than 0.5 percent of SSU’s use thistruncated factor.

Estimator Based on theProduct of WI and Wnr

The estimator produced bysubstituting the product of WI and Wnr

for Wf in equation (1)

X′ = ∑u

WI (u) c Wnr(u) c x(u)(3)

should produce approximately unbiasedestimators for the population total X, aslong as the true nonresponding

population does not significantly differfrom the responding population.

Beginning in 1997, the weightWI c Wnr is used to define a final nationalweight for households, and it is used toproduce estimates of householdcharacteristics. (In 1995 and 1996, theweight WI was used to define a finalnational weight for households.)

Ratio Adjustments forPerson-Level Weights

The third and fourth weightingfactors to be defined are ratioadjustments. Statistical sampling theoryhas demonstrated that in manysituations, the estimators obtained usinga ratio estimation procedure often havesmaller mean squared error (MSE) thanthe base weight estimators expressed byequation (2). More precisely, if X′ andY′ are base weight estimators of twopopulation characteristic totals, X and Y,respectively, and the ‘‘ true’’ total Y isknown, then the ratio estimatorX′′ = (X′/Y′ ) Y for X will have smallerMSE than the estimator X′ when thereis a high positive correlation between X′and Y′ and the sample size is large.

The ratio adjustment also is used tohelp correct survey bias due tosystematic undercoverage. It should benoted that an observed survey estimatorof the form shown in equation (3) maybe larger or smaller than the true valuejust by chance alone. There are somepopulations that the U.S. Bureau of theCensus has identified as ‘‘ difficult tosample.’’ For example, historically, therehas been survey undercoverage of theyoung black male population in theNHIS, and the estimator of equation (3)may be negatively biased in estimatingyoung black male populationcharacteristic totals. Such a bias due toundercoverage is often reduced by theuse of the ratio adjustments.

Note that the ratio adjustments areapplied at the person level, which canintroduce variation in the person-levelweights within a given sampledhousehold. The previous two componentsof the weights, the inverse of theprobability of selection and the householdnonresponse adjustment, are equal for allpersons in a given sampled household.

Page 26: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

Page 18 [ Series 2, No. 130

First-Stage Ratio Adjustment

The first-stage ratio adjustment isused in an attempt to reduce thebetween-PSU variance component ofsampling variation among thenonself-representing (NSR) PSU’s.For each of six residence and/orrace-ethnicity classes within thenonself-reporting strata of each of thefour census regions, the first-stage ratioadjustments are defined using thefollowing methodology and are alsopresented in table 6.

The first-stage ratio adjustmentfactors are created by the U.S. Bureauof the Census. The 24 residence and/orrace-ethnicity classes in the UnitedStates defined within the NSR PSU’swill be indexed by the letter c,

c = 1,2,...,24.

Let

Zc= the projected 1995 populationtotal for class c over all NSR PSU’sin the population.

Considering only the first stage ofsample selection as presented in theconceptual design (refer to step 1 intable 5), the sample of NSR PSU’s canproduce an unbiased estimator of Zc,say, Zc,

Zc = ∑NSR PSU’s(si)

Zsic / πsi

where Zsic = projected 1995 populationtotal for class c in sample NSR PSU iof stratum s, and πsi = the selectionprobability of PSU i of stratum s (referto step 1 in table 5).

Note, Zsic is based on U.S. Bureauof the Census projections and not on thefielded sample.

The first-stage ratio adjustment factorassociated with class c is defined as

Fc = Zc / Zc

with adjustment (i.e., collapsing withanother class and/or truncation)occurring if the factor falls outside theinterval [0.8125, 1.3].

The U.S. Bureau of the Censussuggested that the lower and upperbounds should satisfy the symmetricequations L = 1/(2−(1/U)) andU = 1/(2−(1/L)). With the upper boundspecified at U = 1.3, the lower bound isdetermined to be L = 0.8125.

A universal first stage ratioadjustment, Wr 1, can be defined for eachsample person by defining a new classindex, c = 0, to denote all persons notreceiving the Fc ratio adjustment:

Wr1 = Wr1(c) = Fc

if c = 1,2,...,24 for NSR PSUs

Wr1 = Wr1(c) = 1if c = 0 for SR PSU’s

A first-stage ratio-adjusted nationalestimator, X′′ , of a population total, X, isdefined by equation (1) when the weightWf is replaced with the product of thefirst three component weights:WI c Wnrc Wr1 ,

X′′ = ∑u

WI (u) c Wnr(u) c Wr1(y) c x(u)(4)

As shown in table 2, about 64 percentof the U.S. population resides in theself-representing strata, which do notreceive the first-stage ratio adjustment.The research of Parsons andCasady (13) on the previous 1985–94NHIS design has shown that theinclusion of the first-stage ratioadjustment factor had very littleimpact on NHIS estimates.

Table 6 shows the first-stage ratioadjustment factors for 1997. These aretypical values for the entire 1995–2004NHIS design. However, they weredifferent in 1996 because of the reducedsample, and there will be a small changestarting in 2000 when one nonself-representing PSU rotates out andanother nonself-representing PSU rotatesin (refer to ‘‘ Rotating PSU’s’’ inchapter 2).

Second-Stage Ratio Adjustment(Poststratification)

The main advantages of theratio-estimation process are exploitedby the introduction of a second ratiofactor, the poststratification adjustmentweight. This weight assures that theNHIS estimates for 88 age-sex-raceand/or ethnicity classes of the civiliannoninstitutionalized population of theUnited States (shown in table 7) agreewith independently determinedpopulation controls prepared by theU.S. Bureau of the Census.Furthermore, these independent

controls also include an adjustment fornet undercoverage in the 1990 Census,and are the same controls used for theCurrent Population Survey (CPS).Thus, the national population estimatesfor any combination of the age-sex-race and/or ethnicity groups from thetwo surveys are the same, whichgreatly enhances comparability of thetwo surveys.

Each month, the U.S. Bureau of theCensus produces national estimates forthe 88 age-sex-race and/or ethnicityclasses. While the survey is conductedweekly, the poststratification adjustmentis only computed for quarterly estimates.The quarters and the dates of thepopulation estimates used as the controlsare:

NHIS Quarter Population Estimates

January-MarchApril-JuneJuly-SeptemberOctober-December

February 1May 1August 1November 1

For each quarter, 88 age-sex-raceand/or ethnicity adjustment weights arecomputed, for a total of 352 adjustmentweights annually. If a represents one ofthe 88 age-sex-race and/or ethnicclasses, Y(a) represents the U.S. Bureauof the Census population estimate forclass a, and Y′′ (a) represents the NHISfirst-stage ratio-adjusted national totalfor class a, that is:

Y′′ (a) = ∑u

WI (u) c Wnr(u) c Wr1(u) c Ia(u),

where

Ia(u) = 1 if person u is in class a= 0 otherwise

then the second-stage ratio adjustmentfor class a, Wr2 , is defined as:

Wr2(a) = Y(a) / Y′′ (a)

In implementing the abovesecond-stage ratio adjustment, NCHSgenerally requires each class a tocontain at least 30 sample persons. If aclass contains too few sample persons,that class will be pooled with anadjacent age class. The two-stage ratioadjusted national estimator, X, of apopulation total, X, is defined byformula (1) with the weight Wf definedby the product of the four component

Page 27: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

Series 2, No. 130 [ Page 19

weights: WI c Wnr c Wr 1 c Wr 2, i.e.,

X = ∑u

WI (u) c Wnr(u) c Wr1(u) c Wr2(u) c x(u)(5)

In the 1973–84 and 1985–94 NHISdesigns, both the first-stage andpoststratification ratio adjustments werestructured similarly to the structurespresented in tables 6 and 7, but theracial classes were black and nonblack.The introduction of oversamplingHispanic households and the use ofHispanic class ratio adjustments in 1995have improved the precision of Hispanicdomain statistics.

Creation of Other WeightsThe preceding discussion has

outlined the procedure for creatinghousehold-level and person-levelweights for the survey. Other weightsalso were created for 1995 and 1996NHIS supplements, and beginning in1997, for sample adult, sample child,and family-level files. The basic strategyfor creating these other weights issimilar to the preceding discussion.Some form of the household-levelweight or the person-level weightalways is the starting point forcreating the other weights. Asappropriate, additional levels ofsampling are accounted for. In someinstances, an additional nonresponseadjustment is done (e.g., someone mayhave responded to the main survey,but refused to respond to asupplement). Usually, some form ofpoststratification is done; for example,the sample adult weight uses a smallerset of poststrata than the 88 poststratalisted in table 7.

Variance EstimationMost of the estimates produced by

NCHS from the survey are totals andratios of totals, such as means, percents,and rates. All such totals are producedusing the final national weight asdescribed in the preceding sections.These estimators are subject to bothsampling and nonsampling errors. Thenonsampling errors such as responseerrors, defective sample frames,nonresponse, and undercoverage are

difficult quantities to measure, but everyeffort is made to minimize such errors ateach step of the NHIS operation. Thesampling error, however, can bemeasured by the variance of theestimator.

While equation (1) provides afunctional form that permits simplecomputation of point estimates, thevariances of such estimators are moredifficult to compute. The functionalform of a variance estimator dependson the nature of the survey design andmethods used to adjust the weights.Some complexities in the surveydesign that require some specialtechniques are:

1. The higher levels of sampling (e.g.,selection of SSU’s within a PSU)are the result of a very complicatedprocess involving the partitioning ofblock groups, estimating measures ofsize, and applying systematicsampling techniques. Even given thecensus information about the PSU, itwould be extremely difficult todefine a ‘‘ user friendly’’ samplingmechanism that captured all the truestochastic structure of the systemand could be implemented with astandard variance estimationprocedure.

2. Although there are 21 densitysubstrata definitions, most PSU’s haveonly a few such substrata within thePSU that are nonempty. Many densitystrata within a primary sampling unithave just one sampled SSU.

3. Some nonself-representing stratahave only one sampled PSU.

4. Some self-representing strata aresmall. They are part of largemulti-State metropolitan areas, butsampled as distinct areas.

5. To protect the confidentiality ofsurvey respondents, NCHS cannotrelease design information thatcould be used to identify smallergeographical areas in which thesurvey was conducted. Smallsample areas with raresocioeconomic-demographiccharacteristics must not beexplicitly or implicitly identifiableby design information.

6. With weighting adjustmentsapplied to the base samplingweight, estimators of total becomenonlinear in nature. Thiscomplicates the variance estimationprocedure.

7. In practice, data analysts who studyNHIS data use large-sample theorywhen making inferences aboutpopulations. Variance estimationprocedures suitable for largesubpopulations may be unstable forsmaller subpopulations. NCHStargets stable, all-purpose varianceestimation structures that should beeasy to implement with existingcomputer software.

Simplified DesignStructures for VarianceEstimation

Wolter (14) and Rust (15) wroteexcellent reviews of design-basedvariance estimation for complexsurveys. Of the available methods, thetwo most commonly used are theTaylor series linearization method andthe Balanced Repeated Replication(BRR) method. Software for analysisof complex surveys include PC CARP,STATA, SUDAAN, VPLX, andWESVAR. A comparison of thesesoftware packages is beyond the scopeof this report, but an Internet worldwide web page, Summary of SurveyAnalysis Software, currently located athttp://www.fas.harvard.edu/~stats/survey-soft/survey-soft.html ,provides references and discussion.

Currently, NHIS public use filescontain design information suitable forthe Taylor series linearization method(also suitable for some replicationsoftware).

Variance Estimators forNational Estimators

First, a variance estimationstructure is developed forself-representing PSU’ s. Then, astructure is developed that accountsfor the PSU sampling fornonself-representing PSU’ s. The twostructures are then combined to give a

Page 28: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

Page 20 [ Series 2, No. 130

variance estimator for nationalestimators. These structures aredescribed in more detail below.

SR PSU’s: Conceptual NHISWithin PSU Sampling andEstimation Structures

1. The super-SSU will be consideredas a well-defined population clusterwhere within-super-SSU samplinginflation weights, steps 3 to 6 oftable 5, produce an unbiasedestimator of super-SSU total.

2. The super-SSU sampling, step 2 oftable 5, can be treated as atraditional ‘‘ with replacementsampling’’ from an infinitepopulation of super-SSU’s within asubstratum. All populationsuper-SSU’s within a substratumhave the same selection probability.Sampling is independent oversubstrata.

Under assumptions 1 and 2 above,one can develop a variance estimator forthe estimated total at the substratumlevel. The following indices shall beused to denote the levels of nestingwithin the survey design:

s stratumi PSUj substratumk sampled annual-SSUu sampled elementary unit within

annual-SSU k.

For substratum j nested within PSUi, nested within stratum s, let

Nsij = the number of populationsuper-SSUs in substratum j

the number of super-SSUssampled in substratum j

the probability in step 2 oftable 5 times Nsij

nsij ={Wjlsi = Nsij / nsij = super-SSU selection

weight within substratum j

Wuk|sij = within super-SSUconditional selection weight for unit u inannual-SSU k computed using theinverses of probabilities of selection asspecified in steps 3–6 of table 5.

An unbiased estimator of the totalof a characteristic X for substratum jmay be expressed as:

Xsij = ∑k = 1

nsij

( ∑uεSSUk

Wj | si c Wuk | sij c xu)

= �k = 1

nsij

Xsijk

and an unbiased estimator of itsvariance is:

Var(Xsij) = nsij Ssij2

where

Ssij2 = ∑

k = 1

nsij

(Xsijk − Xsij.)2 / (nsij−1)

Xsij. = ∑k = 1

nsij

Xsijk / nsij

These functional forms can beextended over all the substrata withinthe primary sampling unit to obtain anunbiased estimator of the PSU totaland its corresponding varianceestimator:

Xsi = ∑j = 1

21

∑k = 1

nsij

Xsijk and

Var(Xsi) = ∑j = 1

21

nsij Ssij2

(6)

Treating the sampling as ‘‘ withreplacement’’ when it actually is‘‘ without replacement’’ results in apositive, but not dominating, varianceestimator bias. Furthermore, becausemany substrata may be empty or havefew sampled SSU’s, the substrata maybe collapsed within PSU by substratacharacteristics to form fewer substrata.Using these new substrata will typicallyresult in a more stable varianceestimator, but greater positive varianceestimation bias.

Typically, the variance estimatorwith a reduced set of substrata will be

�jεCsi

nCsi(j) SCsi(j)2

(7)

where Csi is some collapsing of theoriginal 21 substrata of PSU i in stratums, with the n and S2 terms defined asin equation (6), but on the new collapsedsubstrata. For example, one possiblecollapsing would be Csi = {Csi(1), Csi(2)},where

Csi(1) = {1,2,3,4,5,6,7,8,9,10,11,12,13,16,17,18,21}

Csi(2) = {14,15,19,20}

NSR PSU’s: Variance Estimatorthat Accounts for PSUSampling

The variance estimators presentedabove in equations (6) and (7) appearreasonable for totals restricted toself-representing strata. In thenonself-representing strata, the varianceestimator should also reflect thefirst-stage selection of PSU’s. First,consider a hypothetical NHIS having100 percent response and using only theweights determined from steps 1–6 intable 5. The basic inflation estimator ofequation (2) is:

X0 = ∑u

WI (u) c x(u)

which can also be expressed as

X0 = ∑s:stratum

∑i:PSU

Xsi..

πsi= ∑

s:stratum

∑i:PSU

Xwsi..

Here, Xwsi.. is the estimator of PSU totalin equation (6), inflated by 1/πsi. Thevariance estimator of Xwsi ..

corresponding to equation (6) will bedenoted as

�j = 1

21

nsij Swsij2

where

Swsij2 = ∑

k = 1

nsij

(Xwsijk−Xwsij.)2 / (nsij−1)

using the SSU totals based on the entireinflation weight, WI.

Page 29: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

Series 2, No. 130 [ Page 21

Variance Estimators forNational Estimators:Combining Across NSR andSR PSU’s

An estimator for the variance of X0

is

Var(X0) = ∑sεNSR2

[(πs1πs2−πs12)πs12

(Xws1..−Xws2..)2

+ ∑i = 1

2

πsi ∑jεCsi

nCsi (j) SwCsi(j)2 ]

+ ∑CstreNSR1

∑ieCstr

m(Cstr(i ))

SwCstr( i )

2 + ∑seSR

∑jeCsi

nCsi(j )

SwCsi( i )2

(8)

Here, the set NSR2 is a set of NSRstrata with two sampled PSU’s. Thevariance estimators for these strata arethe so-called Yates-Grundy-Sen 2-stageforms with the Csi( j) representing thecollapsed PSU substrata as discussed inequation (7). The set NSR1 is a union ofcollapsed original NSR strata (denotedby Cstr) defined in a manner analogousto the example that was given in thediscussion that follows equation (7).Here, m(Cstr(i)) is the number of PSU’sin a given set, and the PSU’s are treatedas being sampled with replacement fromthis collapsed stratum. Only thefirst-stage unit will be used for variancecomputation. The S2 form is thevariation of PSU totals, Xwsi.. , within aCstr(i) collapsed set.

The set SR is a set of SR strata,with possibly collapsed original strataand/or substrata, defined in a manneranalogous to the example given in thediscussion following equation (7).

As mentioned earlier, about94 percent of NHIS householdsrespond. Therefore, it is assumed thatthe nonresponse-adjusted weight canbe treated as an inflation weightwithin the SSU sampling level withlittle bias. Furthermore, for nationalNHIS estimators, previous research byParsons and Casady (13) has shownthat the first-stage adjustment seems tohave little impact on the magnitude of

the estimated variances. With this inmind, equation (8) is extended tocover the nonresponse and first-stageratio weighting adjustments. For thenonresponse-adjusted estimator X ′ ofequation (3) or the first-stageratio-adjusted national estimator, X ′′ ,of equation (4) an approximatevariance estimator is given byequation (8), but with the WI weightreplaced by the Wnr and Wr1 weights,respectively.

Estimating Variances forPoststratified Totals andNonlinear Statistics

The final national weight estimatorX of equations (1) and (5) incorporates apoststratification adjustment. This is theform of the estimator that is presentedin official NCHS publications and is theform that most analysts study. Thisestimator is nonlinear because of thepoststratification adjustment. Acommonly used method for estimatingthe variance of a nonlinear statistic is to‘‘ linearize’’ the statistic using Taylorseries methods and then to useequation (8) applied to the linearizedform. Estimators of totals usingpoststratification weights can belinearized. Basically, equation (8) isused to estimate the variance of thelinearized total, but a prepost-stratification inflation weight is used inthe computation of equation (8).

Several of the variance estimationmethods just discussed will be examinedusing NHIS data in tables 8–10. A set ofNHIS variables used in these tables are:

+ Binary variables:ACT = 1 if person has an activitylimitationLDR = 1 if person has not seen adoctor in 2 or more yearsHLT_FP = 1 if person reports fair orpoor healthBIN10 is a randomly generatedvariable with a Bernoulli distributionand a probability of success 0.10 foreach person

+ Event count variables over pastyear:HDI = number of hospital dischargeincidents

HED = number of hospital episodedays

+ Event count variables over past twoweeks:AIC = number of acute incidenceconditionsBED = number of bed daysEPI = number of injury episodesRAD = number of restricted activitydaysTDV = number of doctor visits

The tables that use these ‘‘ two-weekrecall’’ variables present all tabulationsin an annualized time frame. Furtherdiscussion of these variables will bepresented as needed and will focus onspecific tables.

In practice, implementation ofcomputer software packages based onlinearization often requires the finalweight, Wf , which may include apoststratification adjustment, to betreated as an inflation weight. Forexample, in the SUDAAN version 7.5software, regression statistics can belinearized, but not with a simultaneouslinearization of the poststratificationweights. Thus, SUDAAN regressioncomputations for variance assume thefinal poststratification weight is aninflation weight. For estimated totals,this practice tends to lead to somewhatinflated variance estimators. For ratiosof totals (e.g., means or percents) theimpact varies. Table 8 presents somecomparisons of variance estimates fromthe 1995 survey obtained by treatingfinal weights as inflation weights versusa linearization of the final weight.Table 9 supplies auxiliary information tothe domains shown in table 8. Thepresentation in table 8 is not intended tobe a comprehensive study of this issue.However, one can see that the impact ofthe linearization of the poststratificationweights on the estimated standard errorsof totals can be substantial, while thecorresponding impact on the standarderrors of means is somewhat marginal,in general. For many health variables,empirical evidence suggests that theinflation in the estimated standard errorsof means may be of little practicalimportance. The treatment of the finalweight, Wf , as an inflation weight, maybe reasonable if software limitationswarrant such a simplification. It should

Page 30: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

Page 22 [ Series 2, No. 130

be noted that the population domainsthat are aggregates of several componentpoststratification classes should beexpected to have a greater variancereduction than those population domainscovered by few poststratification classes.In general, economic-type variables mayexhibit a greater impact than health-typevariables. For regression-type analysis,the inclusion of age-sex-race and/orethnic predictors tends to reduce theimpact of treating the final weight as aninflation weight.

Public Use Data andLimitations on DesignStructures

NCHS forbids the disclosure ofinformation that may compromise theconfidentiality promised to surveyrespondents. These concerns aboutconfidentiality require the omission orconcealing of some design informationfrom public-use databases. Thefollowing types of information havebeen subject to omission or concealmenton public-use databases. Policies ondesign information release often change,and NHIS data users should checkdatabase documentation for the availabledesign information.

1. Most of the distinct probabilities ofselection of table 5 are not released,although some products of sequentialweights are released. In particular,the πsi and πsij probabilities have notbeen released.

2. Original strata, density substrata, andPSU’s may have been collapsed withothers to avoid implicit or explicitgeographical disclosure or collapsedto create convenient forms forvariance estimation. For example,the PSU counts in table 2 will notagree with tabulations from public-use databases.

Obviously, without knowledge ofthe πsi and πsij, equation (8) must bereplaced with a reasonable substitution.Here, all NSR2-type strata may betreated as the NSR1-type strata (i.e., thestrata having two sampled PSU’s aretreated as being sampled withreplacement). No second-stage

component would be included in thefunctional form for these NSR1-typestrata because overestimating thevariance on the average is expected. Inthe SR strata, the second-stage variationwould still be used. Thus, with thislimited information, an approximationfor equation (8) will be

Var (X0) = ∑sεNSR2

(Xws1..−Xws2..)2

+ ∑CstrεNSR1

∑iεCstr

m(Cstr(i)) SwCstr(i)

2

+ �sεSR

∑jεCsi

nCsi(j) SwCsi(j)2

(9)

where,NSR2 is a set of original NSR

strata, each with two sampled primarysampling units,

NSR1 is the set of collapsed strata,Cstr, defined in a manner analogous tothe discussion that follows equation (7),and

SR is a set of SR strata, possiblycollapsed original strata.

The above form can be expressed ina condensed form:

Var (X0) = ∑c

mc Swc2

(10)

where c represents a (collapsed) stratum,either NSR or SR, and S2

wc is the samplevariance of the mc weighted PSU totalswithin a nonself-representing stratum orthe mc weighted SSU totals within aself-representing stratum. This functionalform is easily implemented. SampleSUDAAN code for equations (9) or (10)and additional explanations of NHISpublic-use variance estimation aredocumented (16) and are provided withreleased data. Table 10 provides somecomparisons for these two designstructures. SUDAAN software, using thedesign structures of equation (8) bothwith and without a linearization of thefinal poststratification weight and usingequation (9) on public-use designstructures, are implemented in this table.For both means and totals, there is atendency for standard errors usingequation (8) to be slightly smaller thanthose produced by equation (9). It alsoappears that the poststratificationimplementation has a greater impact thanthe choice of equation (8) or (9),especially for the standard errors of totals.

Beginning with the 1997 survey,some geographical disclosure concernsresulted in a further coarsening of thereleased public-use design information.The techniques of stratum collapse,stratum partitioning and SSU mixingwere used to coarsen the self-representing design structures with littleanticipated bias, but at the expense ofloss of degrees of freedom. (Thesetechniques are discussed briefly inEltinge (17) and Parsons and Eltinge(18)). The result was a public-use designstructure with an imposed 2-PSU’s perstratum design with over 300 nominaldegrees of freedom. The varianceestimator takes the generic form of thefirst term of equation (9) and isimplemented by many softwarepackages.

Precision ComparisonsBetween Surveys

As discussed in anotherpublication (6), the funding for theoriginally proposed alpha-design surveywas reduced, and the beta version wasimplemented for the 1995–2004 NHIS.In planning the general cost andprecision, objectives were:

1. The 1995–2004 design would havecomparable funding to the previous1985–94 design.

2. The precision of estimators for blackdomains would be comparable to theprevious design.

3. The precision of estimators forHispanic domains would beimproved over the previous designand comparable to those for blackdomains.

4. The precision for white and totaldomains may be allowed to drop tocompensate for meeting objectives 1,2, and 3.

In table 11, some precisionmeasures were estimated for the 1995and 1994 NHIS samples for some selectvariables discussed earlier in thischapter. Table 12 presents someauxiliary information for the domainsshown in table 11. For table 11,SUDAAN software, along with a design

Page 31: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

Series 2, No. 130 [ Page 23

corresponding to equation (8), was used.The final weight was treated as aninflation weight. It should be kept inmind that the coefficients of variation(CV’s) presented in table 11 are randomvariables subject to sampling variability,and one should consider trends and/orpatterns as opposed to looking at singlesample realizations. Furthermore, thevariables TDV and AIC are subject tooutlier response, which will increaseestimated CV instability for any givenyear. BIN10, a randomly generatedbinary response for each sampled personhaving probability of success 0.10, hasbeen included as a standard. Thisvariable is independent of the design.One may attribute its CV to a ‘‘ pure’’clustering and sampling process havingno interaction with the variable itself.

One caveat about this table is that1995 was the start-up data collectionyear of the current NHIS design. Severalsampling changes were made before thesampling parameters were actuallyfinalized. Furthermore, 3 weeks ofintended interviews were lost due to thegovernment shutdown in December1995. To make table 11 values looksomewhat more representative of a‘‘ full’’ NHIS, the 1995 sample sizeswere inflated by 51/48 and CV’s weredeflated by (48/51)1/2. Some generalobservations are:

1. Objectives 2 and 3 appear to havebeen met. For black persons, someof the 1995 CV’s are smaller thanthe 1994 CV’s, some are larger;there is no distinct pattern. Samplesizes for black domains tend to besmaller in 1995 than in the previousdesign. The 1995 sample tends tohave at least an 80 percent increasein Hispanic domain sample sizes.

2. The Hispanic domain CV’s for 1995show great improvement over 1994.A comparison of correspondingblack and Hispanic domains byCV’s does not show a consistentorder relation pattern.

3. For Objective 4, there seems to beno great loss of precision. Again,there is no consistent order relationpattern when comparing 1994 to1995.

Based on this limited study, there isno reason to believe that generalobjectives 1–4 were not achieved.

References

1. Shah BV, Barnwell BG, Bieler GS.SUDAAN User’s Manual, Release 7.5.Research Triangle Park, NC: ResearchTriangle Institute. 1997.

2. National Center for Health Statistics.The statistical design of the HealthHousehold-Interview Survey. HealthStatistics. PHS Pub. No. 584-A2.Public Health Service. Washington:U.S. Government Printing Office. 1958.

3. National Center for Health Statistics.Health Interview Survey procedure,1957–1974. National Center for HealthStatistics. Vital Health Stat 1(11). 1975.

4. Kovar MG, Poe GS. The NationalHealth Interview Survey design,1973–1984, and procedures, 1975–83.Vital Health Stat 1(18). 1985.

5. Massey JT, Moore TF, Parsons VL,Tadros W. Design and estimation forthe National Health Interview Survey,1985–94. Vital Health Stat 2(110).1989.

6. Judkins D, Marker D, Waksberg J, etal. National Health Interview Survey:Research for the 1995–2004 Redesign.Vital Health Stat 2(126). 1999.

7. Friedman HP, Rubin J. On someinvariant criteria for grouping data. J.Am. Stat. Assoc. 62:1159–1178. 1967.

8. Durbin J. Design of multi-stage surveysfor the estimation of sampling errors.Appl. Stat. 16:152–164. 1967.

9. U.S. Bureau of the Census. The currentpopulation survey, design andmethodology. Technical Paper 40.Washington: U.S. Government PrintingOffice. 1978.

10. Cochran WG. Sampling techniques(3rd ed). New York: Wiley. 1977.

11. Kish L. Survey sampling. New York:Wiley. 1965.

12. Hansen MH, Hurwitz WM, MadowWG. Sample survey methods andtheory, Vol. I, Methods andApplications; Vol. II, Theory. NewYork: Wiley. 1953.

13. Parsons VL, Casady RJ. Varianceestimation and the redesigned HealthInterview Survey. In: Proceedings ofthe section on survey research methods.American Stat Assn. 406–411. 1987.

14. Wolter KM. Introduction to varianceestimation. New York: Springer-Verlag.1985.

15. Rust K. Variance estimation forcomplex estimators in sample surveys.J. Offic. Stat. 1:381–397. 1985.

16. National Center for Health Statistics.Variance estimation for person datausing the NHIS public use person datatape, 1995–2004 National Center forHealth Statistics. NHIS Public-UseData document. Available at the NCHSNHIS Internet WWW sitehttp://www.cdc.gov/nchs/nhis.htm.1998.

17. Eltinge JL. Evaluation and reduction ofcluster-level identification risk forpublic-use survey microdata files. In:Proceedings of the section on surveyresearch methods. American Stat Assn,1999. To be published.

18. Parsons VL, Eltinge JL. Stratumpartition, collapse and mixing inconstruction of balanced repeatedreplication variance estimators. In:Proceedings of the section on surveyresearch methods. American Stat Assn,1999. To be published.

Page 32: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

Table 5. Components of conceptual sampling design: National Health Interview Survey, 1995–2004

Samplestep

Sampleunit

Within levelunit

Conditional probabilityof selection

1. PSU1 Stratum πsi , for PSU i from stratum sπsij , joint selection probability for PSU’si and j from stratum s

2. Super-SSU1 PSU’s substratum Pr(super-SSU | substratum)3. Annual-SSU1 Super-SSU 1/104. HU2 SSU Pr(HU | SSU)5. Household2 HU 1 if black/Hispanic household,

Pr (‘‘other’’) if nonblack/non-Hispanic household6. Person(s) Household Pr (person | household)

1See chapter 2 for details.2HU is the residential dwelling selected, without regard to its occupants (if any). The household is the collection of occupants selected by characteristics within the HU (see chapter 2).

NOTE: PSU is primary sampling unit, SSU is secondary sampling unit, and Pr is probability.

Table 6: First-stage ratio adjustment factors: National Health Interview Survey, 1997

Residence,race, and ethnicity

Census region

Northeast Midwest South West

Metropolitan area

Hispanic . . . . . . . . . . . . . . . . . . 0.941306 1.120916 0.906019 0.851367Non-Hispanic black . . . . . . . . . . . 0.894335 1.033701 1.038407 0.879097Non-Hispanic other . . . . . . . . . . . 1.001781 0.995499 0.985160 1.043640

Nonmetropolitan area

Hispanic . . . . . . . . . . . . . . . . . . 0.853849 1.194508 0.812500 0.956770Non-Hispanic black . . . . . . . . . . . 0.812500 0.814337 1.063537 1.121297Non-Hispanic other . . . . . . . . . . . 0.995420 0.997996 1.009349 1.010929

Table 7. The 88 age-sex-race/ethnicity classes used for poststratification: National Health Interview Survey, 1995–2004

Age

Hispanic Non-Hispanic black Non-Hispanic other

Male Female Male Female Male Female

Under 1 year . . . . . . . . . . . . . . . . . . . . . . . . . . . X X X X X X1–4 years . . . . . . . . . . . . . . . . . . . . . . . . . . . . . X X X X X X5–9 years . . . . . . . . . . . . . . . . . . . . . . . . . . . . . X X X X X X10–14 years . . . . . . . . . . . . . . . . . . . . . . . . . . . X X X X X X15–17 years . . . . . . . . . . . . . . . . . . . . . . . . . . . X X X X X X18–19 years . . . . . . . . . . . . . . . . . . . . . . . . . . . X X X X X X20–24 years . . . . . . . . . . . . . . . . . . . . . . . . . . . X X X X X X25–29 years . . . . . . . . . . . . . . . . . . . . . . . . . . . X X X X X X30–34 years . . . . . . . . . . . . . . . . . . . . . . . . . . . X X X X X X35–44 years . . . . . . . . . . . . . . . . . . . . . . . . . . . X X X X X X45–49 years . . . . . . . . . . . . . . . . . . . . . . . . . . . X X X X X X50–54 years . . . . . . . . . . . . . . . . . . . . . . . . . . . X X X X X X55–64 years . . . . . . . . . . . . . . . . . . . . . . . . . . . X X X X X X65–74 years1 . . . . . . . . . . . . . . . . . . . . . . . . . . . X X X X X X75 years and older . . . . . . . . . . . . . . . . . . . . . . . X X X X X X

1Age categories 65–74 years and over were collapsed into one category, 65 years and over, for Hispanic persons.

Page 24 [ Series 2, No. 130

Page 33: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

Table 8. Impact of poststratification on variance estimation: National Health Interview Survey, 1995[Treatment of the final poststratification weight as a base weight versus a linearization of the final poststratification weight]

Domain andvariable

Estimated totals Estimated means

Number inthousands1 CVpost

2

RatioCVbase/CVpost

3 Mean4 CVpost5

RatioCVbase/CVpost

6

All persons

ACT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38,521 1.09 1.26 0.15 1.09 1.08AIC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 456,872 1.65 1.15 1.74 1.65 1.00BED . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1,593,023 2.09 1.06 6.08 2.09 1.01EPI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61,304 3.70 1.01 0.23 3.70 0.99HDI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27,505 1.82 1.10 0.11 1.82 1.03HED . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134,271 2.80 1.01 0.51 2.80 1.01LDR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36,133 1.23 1.26 0.14 1.23 1.00RAD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4,097,080 1.55 1.11 15.64 1.55 1.02TDV . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1,547,136 1.27 1.17 5.91 1.27 1.02

All females

ACT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20,237 1.22 1.25 0.15 1.22 1.12AIC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246,914 2.14 1.10 1.84 2.14 1.01BED . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 915,213 2.58 1.06 6.81 2.58 1.01EPI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28,212 5.12 1.01 0.21 5.12 1.00HDI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16,070 2.25 1.07 0.12 2.25 1.01HED . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71,556 3.15 1.03 0.53 3.15 1.02LDR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13,154 1.86 1.14 0.10 1.86 1.01RAD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2,349,449 1.86 1.09 17.49 1.86 1.01TDV . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 921,550 1.63 1.13 6.86 1.63 1.01

Currently employed persons

ACT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11,729 1.80 1.10 0.09 1.76 1.00AIC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180,163 2.44 1.07 1.44 2.39 1.00BED . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 397,326 3.47 1.03 3.18 3.45 1.00EPI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27,538 5.48 1.01 0.22 5.46 1.00HDI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8,951 3.12 1.04 0.07 3.10 1.00HED . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31,347 3.78 1.01 0.25 3.75 1.00LDR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22,505 1.29 1.25 0.18 1.25 1.00RAD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1,324,768 2.41 1.06 10.61 2.38 1.00TDV . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 602,286 2.18 1.09 4.82 2.15 1.00

Family income over $35,000

ACT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4,518 3.02 1.03 0.11 2.66 1.01AIC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75,804 4.33 1.03 1.78 3.99 1.00BED . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173,114 5.79 1.01 4.06 5.58 1.00EPI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9,528 9.92 1.01 0.22 9.85 1.00HDI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3,311 5.43 1.03 0.08 5.20 1.01HED . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13,004 6.94 1.00 0.31 6.78 1.01LDR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5,396 3.27 1.05 0.13 2.86 1.00RAD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 483,943 4.29 1.02 11.35 4.02 1.00TDV . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237,663 4.52 1.03 5.58 4.28 1.00

College graduate, ages 35–44 years

ACT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 956 5.73 1.02 0.08 5.59 1.00AIC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18,612 7.06 1.02 1.60 6.81 1.00BED . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41,005 10.57 1.00 3.52 10.43 1.00EPI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2,340 18.32 1.00 0.20 18.29 1.00HDI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 809 9.79 1.02 0.07 9.62 1.00HED . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2,372 11.80 1.01 0.20 11.72 1.01LDR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1,698 4.52 1.03 0.15 4.14 1.00RAD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123,529 7.64 1.01 10.61 7.36 1.00TDV . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61,898 5.56 1.03 5.32 5.33 1.00

Black, ages 65–74 years

ACT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 737 4.90 1.51 0.44 4.82 1.00AIC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1,367 20.67 1.01 0.81 20.68 1.00BED . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24,154 14.48 1.05 14.34 14.50 0.99EPI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215 47.88 1.03 0.13 47.89 1.03HDI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 476 11.96 1.18 0.28 11.99 1.04

Series 2, No. 130 [ Page 25

Page 34: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

Table 8. Impact of poststratification on variance estimation: National Health Interview Survey, 1995—Con.[Treatment of the final poststratification weight as a base weight versus a linearization of the final poststratification weight]

Domain andvariable

Estimated totals Estimated means

Number inthousands1 CVpost

2

RatioCVbase/CVpost

3 Mean4 CVpost5

RatioCVbase/CVpost

6

Black, ages 65–74 years—Continued

HED . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2,573 13.40 1.08 1.53 13.43 1.03LDR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146 11.92 1.03 0.09 11.89 0.99RAD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57,212 10.93 1.08 33.98 10.95 1.00TDV . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16,664 8.92 1.20 9.90 8.93 1.02

1Estimated total is based upon poststratification weight.2Estimated CV of total using linearized final weight, shown as a percent.3A given entry in this column is the ratio of the CV of total using the final weight as an inflation weight, divided by the corresponding entry in the ‘‘Estimated totals CVpost’’ column.4Estimated mean based upon poststratification weight.5Estimated CV of mean using linearized final weight, shown as a percent.6A given entry in this column is the ratio of the CV of mean using the final weight as an inflation weight, divided by the corresponding entry in the ‘‘Estimated means CVpost’’ column.

NOTE: CV is coefficient of variation; ACT is persons with activity limitation; AIC is number of acute incidence conditions, based on 2-week recall of event; BED is number of bed days, based on 2-weekrecall of event; EPI is number of injury episodes, based on 2-week recall of event; HDI is number of hospital discharge incidents; HED is number of hospital episode days; LDR is persons who havenot seen a doctor in 2 or more years; RAD is restricted activity days, based on 2-week recall of event; and TDV is total doctor visits, based on 2-week recall of event.

Table 9. Sample size and weighted size of persons by domain: National Health Interview Survey, 1995

Domain Sample size Weighted size1

All persons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102,467 261,890,000All females . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53,658 134,319,000Currently employed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46,730 124,900,000Family income over $35,000 . . . . . . . . . . . . . . . . . . . . . . . . 15,707 42,622,000College graduate, ages 35–44 years . . . . . . . . . . . . . . . . . . . 4,162 11,644,000Black, ages 65–74 years . . . . . . . . . . . . . . . . . . . . . . . . . . 772 1,684,000

1Numbers rounded to the nearest thousand.

Page 26 [ Series 2, No. 130

Page 35: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

Table 10. Comparison of variance estimates obtained by two methods: National Health Interview Survey, 1995[Yates-Grundy-Sen variance estimator with poststratification versus public-use, collapsed strata, single-stage, with-replacement design]

VariableAll

persons

Sex Age

Male Female0–17years

18–44years

45–64years

65 yearsand over

LDR1

Sample size . . . . . . . . . . . . . . . . . . . . . . . . . . . 102,467 48,809 53,658 29,711 40,801 20,000 11,955Weighted size . . . . . . . . . . . . . . . . . . . . . . . . . . 261,889,549 127,570,237 134,319,312 70,670,755 108,040,689 51,713,265 31,464,840Mean . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 0.13784 0.17991 0.09789 0.08845 0.18484 0.14484 0.07587Standard error (full-p) . . . . . . . . . . . . . . . . . . . . . . 0.00170 0.00221 0.00182 0.00258 0.00248 0.00298 0.00268Standard error (full) . . . . . . . . . . . . . . . . . . . . . . . 0.00170 0.00222 0.00184 0.00265 0.00249 0.00299 0.00269Standard error (public-p) . . . . . . . . . . . . . . . . . . . . 0.00174 0.00228 0.00183 0.00265 0.00253 0.00301 0.00269Standard error (public) . . . . . . . . . . . . . . . . . . . . . 0.00174 0.00231 0.00184 0.00271 0.00254 0.00303 0.00269Total . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36,098,739 22,950,542 13,148,198 6,250,515 19,970,576 7,490,307 2,387,341Standard error (full-p) . . . . . . . . . . . . . . . . . . . . . . 44,4710 282,405 244,714 182,641 268,084 154,026 84,414Standard error (full) . . . . . . . . . . . . . . . . . . . . . . . 560,398 357,742 280,044 210,839 329,031 182,239 88,117Standard error (public-p) . . . . . . . . . . . . . . . . . . . . 455,090 291,298 245,310 187,526 273,660 155,834 84,523Standard error (public) . . . . . . . . . . . . . . . . . . . . . 578,577 369,213 283,769 212,932 340,961 187,949 89,183

TDV2

Sample size . . . . . . . . . . . . . . . . . . . . . . . . . . . 102,467 48,809 53,658 29,711 40,801 20,000 11,955Weighted size . . . . . . . . . . . . . . . . . . . . . . . . . . 261,889,549 127,570,237 134,319,312 70,670,755 108,040,689 51,713,265 31,464,840Mean . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.90468 4.89733 6.86141 4.29787 4.87876 7.08472 11.09687Standard error (full-p) . . . . . . . . . . . . . . . . . . . . . . 0.07511 0.08320 0.11217 0.09066 0.11858 0.16180 0.27613Standard error (full) . . . . . . . . . . . . . . . . . . . . . . . 0.07693 0.08483 0.11368 0.09241 0.11917 0.16290 0.27573Standard error (public-p) . . . . . . . . . . . . . . . . . . . . 0.07587 0.08303 0.11331 0.09007 0.11749 0.16023 0.28469Standard error (public) . . . . . . . . . . . . . . . . . . . . . 0.07704 0.08503 0.11403 0.09116 0.11805 0.16109 0.28387Total . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1,546,373,944 624,753,629 921,620,316 303,733,537 527,105,033 366,374,024 349,161,350Standard error (full-p) . . . . . . . . . . . . . . . . . . . . . . 19,671,625 10,613,283 15,067,049 6,406,983 12,811,560 8,367,336 8,688,269Standard error (full) . . . . . . . . . . . . . . . . . . . . . . . 23,120,033 11,875,947 16,950,069 7,296,215 13,916,041 9,418,063 10,273,218Standard error (public-p) . . . . . . . . . . . . . . . . . . . . 19,870,304 10,592,542 15,219,637 6,365,507 12,693,608 8,285,799 8,957,873Standard error (public) . . . . . . . . . . . . . . . . . . . . . 24,081,902 12,019,642 17,611,040 7,264,252 13,955,259 9,457,535 10,601,351

HLT_FP3

Sample size . . . . . . . . . . . . . . . . . . . . . . . . . . . 101,277 48,266 53,011 29,183 40,423 19,834 11,837Weighted size . . . . . . . . . . . . . . . . . . . . . . . . . . 258,974,266 126,232,939 132,741,328 69,441,900 107,059,972 51,315,313 31,157,082Mean . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 0.10127 0.09125 0.11080 0.02555 0.06624 0.16633 0.28322Standard error (full-p) . . . . . . . . . . . . . . . . . . . . . . 0.00137 0.00159 0.00165 0.00116 0.00153 0.00342 0.00487Standard error (full) . . . . . . . . . . . . . . . . . . . . . . . 0.00148 0.00166 0.00179 0.00117 0.00153 0.00348 0.00496Standard error (public-p) . . . . . . . . . . . . . . . . . . . . 0.00140 0.00164 0.00167 0.00116 0.00155 0.00343 0.00493Standard error (public) . . . . . . . . . . . . . . . . . . . . . 0.00152 0.00174 0.00182 0.00118 0.00157 0.00351 0.00501Total . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26,225,462 11,518,295 14,707,167 1,774,227 7,091,475 8,535,450 8,824,311Standard error (full-p) . . . . . . . . . . . . . . . . . . . . . . 356,321 201,393 218,719 80,234 163,364 176,090 152,043Standard error (full) . . . . . . . . . . . . . . . . . . . . . . . 419,927 227,337 255,322 83,095 173,960 199,162 207,802Standard error (public-p) . . . . . . . . . . . . . . . . . . . . 362,920 207,377 222,266 80,641 166,380 176,367 154,171Standard error (public) . . . . . . . . . . . . . . . . . . . . . 428,842 234,288 260,217 83,466 178,137 201,110 209,683

1LDR is last doctor visit over 2 years ago.2TDV is number of doctor visits in past year.3HLT_FP is health classified fair or poor.

NOTE: The following NHIS design structures and SUDAAN were used for variance estimation:(full): use equation (8) of this report, with final weight treated as inflation weight(full-p): use equation (8) of this report, along with linearization for poststratification

(public): use equation (9) of this report, with final weight treated as inflation weight(public-p): use equation (9) of this report, along with linearization for poststratification.

Series 2, No. 130 [ Page 27

Page 36: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

Table 11. Precision comparisons between two survey designs: National Health Interview Survey, 1985–94 and 1995–2004

Domain and variable1Mean1994

Mean19952

Percent ofCV 19953

Percent ofCV change

Deft19944

Deft1995

All persons

LDR . . . . . . . . . . . . . . . . . . . . 0.14 0.14 1.23 3 1.58 1.58ACT . . . . . . . . . . . . . . . . . . . . 0.15 0.15 1.18 9 1.51 1.57HLT_FP . . . . . . . . . . . . . . . . . . 0.10 0.10 1.46 0 1.62 1.56TDV . . . . . . . . . . . . . . . . . . . . 6.09 5.91 1.30 3 1.25 1.27AIC . . . . . . . . . . . . . . . . . . . . . 1.71 1.74 1.66 −3 1.38 1.32BIN10 . . . . . . . . . . . . . . . . . . . 0.10 0.10 0.98 0 1.09 1.05

Age 65 years and over:LDR . . . . . . . . . . . . . . . . . . . 0.08 0.08 3.53 10 1.09 1.11ACT . . . . . . . . . . . . . . . . . . . 0.38 0.37 1.44 2 1.30 1.21HLT_FP . . . . . . . . . . . . . . . . . 0.28 0.28 1.75 −8 1.38 1.20TDV . . . . . . . . . . . . . . . . . . . 11.30 11.10 2.48 −9 1.16 1.09AIC . . . . . . . . . . . . . . . . . . . 1.10 1.03 5.36 2 1.19 1.09BIN10 . . . . . . . . . . . . . . . . . . 0.10 0.10 2.81 1 1.10 1.03

Black

LDR . . . . . . . . . . . . . . . . . . . . 0.13 0.13 3.21 −3 1.57 1.44ACT . . . . . . . . . . . . . . . . . . . . 0.16 0.16 2.75 −14 1.76 1.41HLT_FP . . . . . . . . . . . . . . . . . . 0.15 0.14 2.97 −15 1.79 1.40TDV . . . . . . . . . . . . . . . . . . . . 5.41 5.18 3.34 1 1.13 1.14AIC . . . . . . . . . . . . . . . . . . . . . 1.54 1.39 4.74 −2 1.39 1.26BIN10 . . . . . . . . . . . . . . . . . . . 0.10 0.11 2.64 0 1.09 1.07

Age 18–44 years:LDR0 . . . . . . . . . . . . . . . . . . 0.17 0.17 3.70 −6 1.36 1.25ACT . . . . . . . . . . . . . . . . . . . 0.12 0.12 4.59 0 1.28 1.25HLT_FP . . . . . . . . . . . . . . . . . 0.11 0.12 4.41 −7 1.30 1.17TDV . . . . . . . . . . . . . . . . . . . 4.96 4.52 5.23 −13 1.15 1.04AIC . . . . . . . . . . . . . . . . . . . 1.54 1.33 7.32 19 1.08 1.18BIN10 . . . . . . . . . . . . . . . . . . 0.10 0.11 4.43 6 1.06 1.13

Age 65 years and over:LDR . . . . . . . . . . . . . . . . . . . 0.07 0.08 10.37 14 0.99 1.04ACT . . . . . . . . . . . . . . . . . . . 0.48 0.46 3.78 11 1.26 1.18HLT_FP . . . . . . . . . . . . . . . . . 0.43 0.42 4.42 2 1.46 1.27TDV . . . . . . . . . . . . . . . . . . . 11.97 10.42 7.08 −9 1.10 0.99AIC . . . . . . . . . . . . . . . . . . . 0.94 0.80 16.17 −29 1.54 0.95BIN10 . . . . . . . . . . . . . . . . . . 0.09 0.10 8.41 −24 1.37 0.98

Hispanic

LDR . . . . . . . . . . . . . . . . . . . . 0.18 0.19 2.16 −31 1.56 1.53ACT . . . . . . . . . . . . . . . . . . . . 0.11 0.11 3.08 −20 1.43 1.57HLT_FP . . . . . . . . . . . . . . . . . . 0.11 0.11 3.09 −21 1.41 1.57TDV . . . . . . . . . . . . . . . . . . . . 4.83 4.43 3.49 −3 1.07 1.35AIC . . . . . . . . . . . . . . . . . . . . . 1.68 1.62 4.11 −15 1.23 1.41BIN10 . . . . . . . . . . . . . . . . . . . 0.11 0.10 2.25 −22 1.04 1.08

Age 18–44 years:LDR . . . . . . . . . . . . . . . . . . . 0.25 0.26 2.28 −30 1.29 1.28ACT . . . . . . . . . . . . . . . . . . . 0.09 0.07 4.59 −19 1.19 1.21HLT_FP . . . . . . . . . . . . . . . . . 0.09 0.09 4.57 −13 1.17 1.36TDV . . . . . . . . . . . . . . . . . . . 4.29 3.38 5.46 −7 1.01 1.17AIC . . . . . . . . . . . . . . . . . . . 1.31 1.25 5.83 −19 1.09 1.17BIN10 . . . . . . . . . . . . . . . . . . 0.11 0.10 3.52 −16 1.02 1.09

Age 65 years and over:LDR . . . . . . . . . . . . . . . . . . . 0.09 0.10 10.05 −26 1.05 1.07ACT . . . . . . . . . . . . . . . . . . . 0.42 0.40 4.55 −14 1.08 1.22HLT_FP . . . . . . . . . . . . . . . . . 0.35 0.35 4.97 −31 1.27 1.19TDV . . . . . . . . . . . . . . . . . . . 9.78 11.05 8.58 −9 1.12 1.04AIC . . . . . . . . . . . . . . . . . . . 1.40 0.99 18.27 −9 1.10 1.09BIN10 . . . . . . . . . . . . . . . . . . 0.08 0.10 10.34 −25 1.01 1.16

Page 28 [ Series 2, No. 130

Page 37: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

Table 11. Precision comparisons between two survey designs: National Health Interview Survey, 1985–94 and 1995–2004—Con.

Domain and variable1Mean1994

Mean19952

Percent ofCV 19953

Percent ofCV change

Deft19944

Deft1995

All females

Age 18–44 years:LDR . . . . . . . . . . . . . . . . . . . 0.10 0.11 2.41 6 1.15 1.22ACT . . . . . . . . . . . . . . . . . . . 0.10 0.10 2.38 5 1.16 1.16HLT_FP . . . . . . . . . . . . . . . . . 0.08 0.08 2.65 −3 1.18 1.11TDV . . . . . . . . . . . . . . . . . . . 6.39 6.38 3.17 45 1.10 1.37AIC . . . . . . . . . . . . . . . . . . . 1.83 1.79 3.17 10 1.10 1.19BIN10 . . . . . . . . . . . . . . . . . . 0.10 0.10 2.15 2 1.07 1.06

Age 65 years and over:LDR . . . . . . . . . . . . . . . . . . 0.07 0.07 4.51 3 1.07 1.03ACT . . . . . . . . . . . . . . . . . . . 0.39 0.38 1.64 −2 1.21 1.06HLT_FP . . . . . . . . . . . . . . . . . 0.28 0.28 2.12 1 1.16 1.10TDV . . . . . . . . . . . . . . . . . . . 11.77 11.61 3.25 −13 1.24 1.09AIC . . . . . . . . . . . . . . . . . . . 1.29 1.12 6.80 10 1.14 1.08BIN10 . . . . . . . . . . . . . . . . . . 0.10 0.10 3.64 1 1.07 1.02

White female

Age 18–44 years:LDR . . . . . . . . . . . . . . . . . . 0.10 0.10 2.81 11 1.13 1.25ACT . . . . . . . . . . . . . . . . . . . 0.10 0.10 2.64 6 1.13 1.16HLT_FP . . . . . . . . . . . . . . . . . 0.07 0.07 3.14 −3 1.15 1.11TDV . . . . . . . . . . . . . . . . . . . 6.67 6.67 3.64 51 1.11 1.41AIC . . . . . . . . . . . . . . . . . . . 1.87 1.86 3.42 9 1.08 1.18BIN10 . . . . . . . . . . . . . . . . . . 0.10 0.10 2.38 4 1.05 1.06

Age 65 years and over:LDR . . . . . . . . . . . . . . . . . . 0.07 0.07 4.81 2 1.07 1.02ACT . . . . . . . . . . . . . . . . . . . 0.38 0.37 1.77 −3 1.19 1.05HLT_FP . . . . . . . . . . . . . . . . . 0.26 0.26 2.30 0 1.13 1.07TDV . . . . . . . . . . . . . . . . . . . 11.62 11.67 3.50 −14 1.24 1.09AIC . . . . . . . . . . . . . . . . . . . 1.31 1.11 7.35 15 1.10 1.08BIN10 . . . . . . . . . . . . . . . . . . 0.10 0.10 3.88 4 1.03 1.03

Black female

LDR . . . . . . . . . . . . . . . . . . . . 0.09 0.09 5.03 10 1.33 1.37ACT . . . . . . . . . . . . . . . . . . . . 0.16 0.16 3.21 −10 1.48 1.22HLT_FP . . . . . . . . . . . . . . . . . . 0.16 0.16 3.20 −15 1.54 1.20TDV . . . . . . . . . . . . . . . . . . . . 6.21 5.86 3.97 6 1.07 1.11AIC . . . . . . . . . . . . . . . . . . . . . 1.64 1.53 5.56 −4 1.28 1.14BIN10 . . . . . . . . . . . . . . . . . . . 0.10 0.10 3.48 −2 1.10 1.01

Age 18–44 years:LDR . . . . . . . . . . . . . . . . . . . 0.09 0.11 6.20 3 1.12 1.19ACT . . . . . . . . . . . . . . . . . . . 0.11 0.11 5.46 −1 1.13 1.07HLT_FP . . . . . . . . . . . . . . . . . 0.13 0.13 5.06 −2 1.18 1.08TDV . . . . . . . . . . . . . . . . . . . 5.50 5.33 5.60 1 1.06 1.02AIC . . . . . . . . . . . . . . . . . . . 1.78 1.58 8.06 1 1.13 1.06BIN10 . . . . . . . . . . . . . . . . . . 0.09 0.11 5.54 −2 1.07 1.07

Age 65 years and over:LDR . . . . . . . . . . . . . . . . . . . 0.06 0.06 15.55 31 0.97 1.07ACT . . . . . . . . . . . . . . . . . . . 0.48 0.48 4.15 10 1.13 1.04HLT_FP . . . . . . . . . . . . . . . . . 0.42 0.43 5.21 2 1.34 1.20TDV . . . . . . . . . . . . . . . . . . . 14.26 11.26 8.76 −13 1.15 0.99AIC . . . . . . . . . . . . . . . . . . . 1.09 1.02 18.45 −41 1.76 0.96BIN10 . . . . . . . . . . . . . . . . . . 0.09 0.08 11.61 −7 1.26 0.92

Hispanic female

Age 18–44 years:LDR . . . . . . . . . . . . . . . . . . . 0.15 0.16 3.74 −30 1.10 1.09ACT . . . . . . . . . . . . . . . . . . . 0.09 0.08 5.75 −20 1.13 1.15HLT_FP . . . . . . . . . . . . . . . . . 0.11 0.11 4.98 −20 1.11 1.19TDV . . . . . . . . . . . . . . . . . . . 5.68 4.25 5.64 −13 1.06 1.16AIC . . . . . . . . . . . . . . . . . . . 1.47 1.36 7.58 −21 1.10 1.14BIN10 . . . . . . . . . . . . . . . . . . 0.11 0.10 4.60 −20 1.00 1.06

Series 2, No. 130 [ Page 29

Page 38: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

Table 11. Precision comparisons between two survey designs: National Health Interview Survey, 1985–94 and 1995–2004—Con.

Domain and variable1Mean1994

Mean19952

Percent ofCV 19953

Percent ofCV change

Deft19944

Deft1995

Age 65 years and over:LDR . . . . . . . . . . . . . . . . . . . 0.08 0.09 13.33 −31 1.07 1.04ACT . . . . . . . . . . . . . . . . . . . 0.43 0.40 5.46 −12 1.00 1.12HLT_FP . . . . . . . . . . . . . . . . . 0.35 0.35 6.09 −27 1.13 1.11TDV . . . . . . . . . . . . . . . . . . . 9.87 11.55 10.69 −18 1.18 1.00AIC . . . . . . . . . . . . . . . . . . . 1.48 1.17 22.02 −4 1.01 1.08BIN10 . . . . . . . . . . . . . . . . . . 0.09 0.10 12.83 −24 0.99 1.08

1LDR is proportion of persons with no doctor visit in 2 years or more, ACT is proportion of persons with an activity limitation status, HLT_FP is proportion of persons with fair or poor health, TDV ismean number of doctor visits per person, AIC is mean number of acute incidence conditions per person, and BIN10 is proportion of persons with a 10 percent characteristic, independent of design.2Values are adjusted to account for 1995 reduced sample.3 (CV(1995) - CV(1994))/CV(1994) * 100.4Deft is sqrt (Variance (Mean | NHIS) / Variance (Mean | SRS)), SRS based on observed domain sample size.

Table 12. Sample size and percent change by domain: National Health Interview Survey

Domain

Adjusted sample sizein thousands

for 1995

Percent changefrom 1994

to 1995

All persons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108.9 −6Age 65 years and over . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12.7 −13

Black . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14.7 −11Age 18–44 years . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.7 −10Age 65 years and over . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.2 −23

Hispanic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22.1 +89Age 18–44 years . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9.4 +86Age 65 years and over . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.1 +86

Female . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . (1) (1)Age 18–44 years . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22.8 −5Age 65 years and over . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.4 −1White . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . (1) (1)

Age 18–44 years . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18.5 −4Age 65 years and over . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.5 −12

Black . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8.1 −13Age 18–44 years . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.3 −1Age 65 years and over . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 0.7 −29

Hispanic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . (1) (1)Age 18–44 years . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.9 +85Age 65 years and over . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 0.7 +84

1Data not available.

Page 30 [ Series 2, No. 130

Page 39: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

ye

Series 2, No. 130 [ Page 31

Appendix

Glossary

Acronyms

CAPI computer-assisted personalinterviewing

CD-ROM Compact Disk-Read OnlyMemory

CPS Current Population SurveyCMSA consolidated metropolitan

statistical areasCV coefficient of variationDHIS Division of Health Interview

StatisticsDSMD Demographic Statistical

Methods DivisionHU Housing unitMA metropolitan areaMEPS Medical Expenditure Panel

SurveyMSA metropolitan statistical areaMSE mean square errorNCHS National Center for Health

StatisticsNECMA New England County

Metropolitan AreaNHIS National Health Interview

SurveyNSFG National Survey of Family

GrowthNSR nonself-representingOMB Office of Management and

BudgetORM Office of Research and

MethodologyPMSA primary metropolitan

statistical areaPSU primary sampling unitSI sampling intervalSR self-representingSSU secondary sampling unitWWW world wide web

Definition of Terms

Address frame—is a portion of the1973–84 NHIS sample frame consistingof addresses compiled during the 1970decennial census.

Alpha sample—is the sampleoriginally selected for the 1995–2004NHIS. This sample is described in moredetail in another publication (6).

Area frame—is a portion of the1985–94 and 1995–2004 NHIS sampleframes consisting of geographic areaswhere address listing operations areconducted to obtain a list of addressesfrom which NHIS sample cases areselected.

Beta sample—is the sample actuallyused for the 1995–2004 NHIS. It is asubsample of the alpha sample.

Condition list—is a portion of theNHIS data collection instrument used in1995–96 consisting of a list of medicalconditions (e.g., arthritis, deafness,gallstones, diabetes, heart disease,asthma). There were six differentcondition lists; one of the six lists waschosen randomly for each interviewedhousehold.

Group quarters frame—is a portionof the 1973–84 NHIS sample frameconsisting of a list of group quarterscompiled during the 1970 decennialcensus. ‘‘Group quarters’’ are defined bthe U.S. Bureau of the Census as a typof residential quarters where theresidents share common facilities orreceive authorized care or custody.

Household—is an occupiedresidential housing unit.

Nonself-representing primarysampling unit—is a primary samplingunit selected from a sampling stratumcontaining other primary sampling units( i.e., a primary sampling unit selectedwith probability less than 1).

Oversample—is a samplingprocedure designed to give ademographic or geographic population alarger proportion of representation in thesample than the population’s proportionof representation in the overallpopulation.

Permit frame—is a portion of the1985–94 and 1995–2004 NHIS sampleframes consisting of residential buildingpermits.

Screen—is an interviewingprocedure whereby households who donot meet specified criteria (e.g.,households that do not contain anycivilian black or Hispanic members) arenot administered a full-length interview.In NHIS, the screening procedureconsists of the initial portion of theNHIS interview, up to and including thepoint where the household compositionis determined.

Self-representing primary samplingunit—is a primary sampling unit that isthe only member of its sampling stratum(i.e., a primary sampling unit selectedwith certainty).

Page 40: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

Vital and Health Statisticsseries descriptions

SERIES 1. Programs and Collection Procedures—These reportsdescribe the data collection programs of the National Centerfor Health Statistics. They include descriptions of the methodsused to collect and process the data, definitions, and othermaterial necessary for understanding the data.

SERIES 2. Data Evaluation and Methods Research—These reports arestudies of new statistical methods and include analyticaltechniques, objective evaluations of reliability of collected data,and contributions to statistical theory. These studies alsoinclude experimental tests of new survey methods andcomparisons of U.S. methodology with those of othercountries.

SERIES 3. Analytical and Epidemiological Studies—These reportspresent analytical or interpretive studies based on vital andhealth statistics. These reports carry the analyses further thanthe expository types of reports in the other series.

SERIES 4. Documents and Committee Reports—These are finalreports of major committees concerned with vital and healthstatistics and documents such as recommended model vitalregistration laws and revised birth and death certificates.

SERIES 5. International Vital and Health Statistics Reports—Thesereports are analytical or descriptive reports that compare U.S.vital and health statistics with those of other countries orpresent other international data of relevance to the healthstatistics system of the United States.

SERIES 6. Cognition and Survey Measurement—These reports arefrom the National Laboratory for Collaborative Research inCognition and Survey Measurement. They use methods ofcognitive science to design, evaluate, and test surveyinstruments.

SERIES 10. Data From the National Health Interview Survey—Thesereports contain statistics on illness; unintentional injuries;disability; use of hospital, medical, and other health services;and a wide range of special current health topics coveringmany aspects of health behaviors, health status, and healthcare utilization. They are based on data collected in acontinuing national household interview survey.

SERIES 11. Data From the National Health Examination Survey, theNational Health and Nutrition Examination Surveys, andthe Hispanic Health and Nutrition Examination Survey—Data from direct examination, testing, and measurement onrepresentative samples of the civilian noninstitutionalizedpopulation provide the basis for (1) medically defined totalprevalence of specific diseases or conditions in the UnitedStates and the distributions of the population with respect tophysical, physiological, and psychological characteristics, and(2) analyses of trends and relationships among variousmeasurements and between survey periods.

SERIES 12. Data From the Institutionalized Population Surveys—Discontinued in 1975. Reports from these surveys areincluded in Series 13.

SERIES 13. Data From the National Health Care Survey—These reportscontain statistics on health resources and the public’s use ofhealth care resources including ambulatory, hospital, and long-term care services based on data collected directly fromhealth care providers and provider records.

SERIES 14. Data on Health Resources: Manpower and Facilities—Discontinued in 1990. Reports on the numbers, geographicdistribution, and characteristics of health resources are nowincluded in Series 13.

SERIES 15. Data From Special Surveys—These reports contain statisticson health and health-related topics collected in specialsurveys that are not part of the continuing data systems of theNational Center for Health Statistics.

SERIES 16. Compilations of Advance Data From Vital and HealthStatistics—Advance Data Reports provide early release ofinformation from the National Center for Health Statistics’health and demographic surveys. They are compiled in theorder in which they are published. Some of these releasesmay be followed by detailed reports in Series 10–13.

SERIES 20. Data on Mortality—These reports contain statistics onmortality that are not included in regular, annual, or monthlyreports. Special analyses by cause of death, age, otherdemographic variables, and geographic and trend analysesare included.

SERIES 21. Data on Natality, Marriage, and Divorce—These reportscontain statistics on natality, marriage, and divorce that arenot included in regular, annual, or monthly reports. Specialanalyses by health and demographic variables andgeographic and trend analyses are included.

SERIES 22. Data From the National Mortality and Natality Surveys—Discontinued in 1975. Reports from these sample surveys,based on vital records, are now published in Series 20 or 21.

SERIES 23. Data From the National Survey of Family Growth—Thesereports contain statistics on factors that affect birth rates,including contraception, infertility, cohabitation, marriage,divorce, and remarriage; adoption; use of medical care forfamily planning and infertility; and related maternal and infanthealth topics. These statistics are based on national surveysof women of childbearing age.

SERIES 24. Compilations of Data on Natality, Mortality, Marriage,Divorce, and Induced Terminations of Pregnancy—These include advance reports of births, deaths, marriages,and divorces based on final data from the National VitalStatistics System that were published as supplements to theMonthly Vital Statistics Report (MVSR). These reports providehighlights and summaries of detailed data subsequentlypublished in Vital Statistics of the United States. Othersupplements to the MVSR published here provide selectedfindings based on final data from the National Vital StatisticsSystem and may be followed by detailed reports in Series 20or 21.

For answers to questions about this report or for a list of reports published inthese series, contact:

Data Dissemination BranchNational Center for Health StatisticsCenters for Disease Control and Prevention6525 Belcrest Road, Room 1064Hyattsville, MD 20782-2003

(301) 458–4636E-mail: [email protected]: www.cdc.gov/nchs/

Page 41: Vital and Health StatisticsDesign and Estimation for the National Health Interview Survey, 1995–2004 Series 2: Data Evaluation and Methods Research No. 130 Hyattsville, Maryland

DEPARTMENT OFHEALTH & HUMAN SERVICES

Centers for Disease Control and PreventionNational Center for Health Statistics6525 Belcrest RoadHyattsville, Maryland 20782-2003

OFFICIAL BUSINESSPENALTY FOR PRIVATE USE, $300

DHHS Publication No. (PHS) 2000-1330, Series 2, No. 1300-0255 (6/00)

STANDARD MAIL (A)POSTAGE & FEES PAID

CDC/NCHSPERMIT NO. G-284