vital and health statistics; series 2, no. 82 (12/79)5. age group: under 17 years, 17-44 years,...

27
DATA EVALUATION AND METHODS RESEARCH SmallAreaEstimation: An Empirical Comparison of Conventional and Synthetic Estimators for States This report describes the results of a large empirical investigation into the appropriateness of several competing methods for making estimates for States from the National Health Interview Survey. Be- cause no method is superior, a strategy that should improve exist- ing methodology is suggested. DHEW Publication No. (PHS) 80-1356 Series 2 Number 82 U.S. DEPARTMENT OF HEALTH, EDUCATION, AND WELFARE Public Health Service Office of Health Research, Statistics, and Technology National Center for Health Statistics Hyattsville, Md. December 1979

Upload: others

Post on 09-Mar-2021

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Vital and Health Statistics; Series 2, No. 82 (12/79)5. Age group: under 17 years, 17-44 years, 45-64 years, 65 years and over Family size: fewer than five members, five members or

DATA EVALUATION AND METHODS RESEARCH

SmallAreaEstimation:An Empirical Comparison of Conventional

and Synthetic Estimators for States

This report describes the results of a large empirical investigationinto the appropriateness of several competing methods for makingestimates for States from the National Health Interview Survey. Be-cause no method is superior, a strategy that should improve exist-ing methodology is suggested.

DHEW Publication No. (PHS) 80-1356

Series 2Number 82

U.S. DEPARTMENT OF HEALTH, EDUCATION, AND WELFAREPublic Health Service

Office of Health Research, Statistics, and TechnologyNational Center for Health StatisticsHyattsville, Md. December 1979

Page 2: Vital and Health Statistics; Series 2, No. 82 (12/79)5. Age group: under 17 years, 17-44 years, 45-64 years, 65 years and over Family size: fewer than five members, five members or

Library of Congress Cataloging in Publication Data

United States. National Center for Health Statistics. Small area estimation.

(Vital and health statistics : Series 2, Data evaluation and methods research 2 ; no. 82)(DHEW publication ; no. (PHS) 80-1 356)

Bibliography: p. 9.1. Health surveys-Statistical methods. 2. Estimation theory. 3. Health surveys-United

States–Statistical methods. I. Schaible, Wesley L. II. Title. III. Series: United States. Na-tional Center for Health Statistics. Vital and health statistics : Series 2, Data evaluation andmethods research ; no. 82. IV. Series: United States. Dept. of Health, Education, and Wel-fare. DHEW publication ; no. (PHS) 80-1356.RA409.U45 no. 82 312’.07’23s [614.4’2’0182] 79-22509ISBN 0-8406-01 76-X

Page 3: Vital and Health Statistics; Series 2, No. 82 (12/79)5. Age group: under 17 years, 17-44 years, 45-64 years, 65 years and over Family size: fewer than five members, five members or

NATIONAL CENTER FOR HEALTH STATISTICS

DOROTHY P. RICE, Director

ROBERT A. ISRAEL, Deputy Director

JACOB J. FELDMAN, Ph.D., Associate Director for Analysis

GAIL F. FISHER, Ph.D., Associate Director for the Cooperative Health Statktics System

ROBERT A. ISRAEL, A cting Associate Director for Data Systems

ROBERT M. THORNER, SC.D., A ctirzgAssociate Director for International Statistics

ROBERT C. HUBER, Associate Director for Marzagement

MONROE G. SIRKEN, Ph.D., Associate Director for ik(athematical S tatistics

PETER L. HURLEY, Associate Director for Operations

JAMES M. ROBEY, Ph.D., Associate Director for Program Development

PAUL E. LEAVERTON, Ph.D., Associate Director for Research

ALICE HAYWOOD, Information Officer

OFFICE OF STATISTICAL RESEARCH

PAUL E. LEAVERTON, Ph.D., Associate Director

GEORGE A. SCHNACK, Deputy Associate Director

DWIGHT B. BROCK, Ph.D., Acting chief Statistical Research Branch

ROY E. HEATWOLE, Acting Chiej Research Technology ~ranch

PAUL E. LEAVERTON, Ph.D., Acting Chiej .Epidemiology ~ranch

Vital and Health Statistics-Series 2-No. 82

DHEW Publication No. (PHS) 80-1356Library of Congress Catalog Card Number 79-22509

Page 4: Vital and Health Statistics; Series 2, No. 82 (12/79)5. Age group: under 17 years, 17-44 years, 45-64 years, 65 years and over Family size: fewer than five members, five members or

PREFACE

This report presents the results of a large empirical investigation of severalcompeting techniques for making small area estimates from the Health InterviewSurvey of the National Center for Health Statistics. The project was conducted bythe staff of the Statistical Research Branch of the Office of Statistical Research,under the direction of Dr. Wesley L. Schaible, formerly the acting chief of theStatistical Research Branch.

In addition to internal reviews (by Dr. Paul E. Leaverton, Office of StatisticalResearch, and Mr. Dwight K. French, Office of Statistical Research), NationalCenter for Health Statistics policy requires that methodological reports undergoan external review for technical merit and clarity. Dr. Steven Cohen of the Na-tional Center for Health Services Research performed this review and providednumerous constructive comments on an early draft of this report. Finally, theauthors acknowledge the outstanding work of Mr. Barry W. Peyton of the Statisti-cal Research Branch in preparing the computation for this project and Mr. EugeneDiggs for computer graphics.

...Ill

Page 5: Vital and Health Statistics; Series 2, No. 82 (12/79)5. Age group: under 17 years, 17-44 years, 45-64 years, 65 years and over Family size: fewer than five members, five members or

CONTENTS

Preface .............................................................................................................. .. ...................... .............

Introduction ........................................................................................................ ...................................

Approach ...............................................................................................................................................

Estimators ..............................................................................................................................................Simple Direct Estimator ....................................................................................................................Poststratified Estimator ................................................................................................. ...................Synthetic Estimators ........................................................................... ..............................................Ratio-Adjusted Estimators ................................................................................................................

Results ................................................................................................................................... ................

Summary .................................................................................................................................. ..............

References ......................................................................................... .....................................................

AppendixSupplemental Tables and Figures ......................................................................................................

...Ilr

1

2

4

8

9

12

LIST OF TEXT FIGURES

1. Percent of the population who have completed high scbool-simple direct estimates and actualvalues for50States,ad fie District of Colubia: Hedfiktefiew Sumey, 1969 -71 .. ............... 5

2. Percent of the popdation whohave completed high school-poststiatified estimates md actidvalues for50States adtie District of Colubia: Hedti Intefiew Suwey, l969-7l .. ............... 5

3. Percent of the population who have completed high school–16-cell synthetic estimates andacturd values for 50 States and the District of Columbia: Health Interview Survey, 1969-71 ....... 6

4. Percent of the population who have completed high school–64+ell synthetic estimates andactual values for 50 States and the District of Columbia: Health Interview Survey, 1969-71 ....... 6

50 Squared errors of simple direct estimates of the percent of State population who completed highschool, by sample size: United States, 1969-71 .................................................. ......................... 7

6. Squared errors of poststratified estimates of the percent of State population who completed highschool, by sample size: United States, 1969-71 ................................... ............................ ............ 7

LIST OF TEXT TABLES

A. Average squared errors of estimates for 50 States and the District of Columbia by selected attri-bute variables and small area estimators: Health Interview Survey, 1969-71 ................................ 4

B. Correlation coefficients between actual and estimated State values for 2 direct and 2 syntheticestimators, by selected attribute variables: Health Interview Survey, 1969-71 ............... .............. 8

v

Page 6: Vital and Health Statistics; Series 2, No. 82 (12/79)5. Age group: under 17 years, 17-44 years, 45-64 years, 65 years and over Family size: fewer than five members, five members or

SYMBOLS

Data not available ------------------------------------ ---

Category not applicable ----------------------------- . . .

Quantity zero ---------------------------------- -

Quantity more than Obut less than 0.05 ------ 0.0

Figure does not meet standards ofreliability or precision ------------------------ *

vi

Page 7: Vital and Health Statistics; Series 2, No. 82 (12/79)5. Age group: under 17 years, 17-44 years, 45-64 years, 65 years and over Family size: fewer than five members, five members or

SMALL AREA ESTIMATION: AN EMPIRICAL COMPARISON OF

CONVENTIONAL AND SYNTHETIC ESTIMATORS FOR STATES

Wesley L. Schaible, Ph.D.,a Dwight B. Brock, Ph.D.,b Robert J. Casady, Ph.D.:and George A

INTRODUCTION

Most large samples, for example, those ofthe Current Population Survey and Health Inter-view Survey, were originally designed to givenational and regional estimates. However, esti-mates also are needed for States, Health ServiceAreas, counties, and other small areas. One wayto meet this demand is to redesign these surveys,but this process can be both expensive and timeconsuming. If operational considerations makeredesign difficult, then the immediate questionbecomes “How and under what conditions canreasonably accurate estimates for small areas beobtained from a large survey designed for na-tional estimates?”

One approach, synthetic estimation, has re-ceived considerable attention. It was introducedin 1968 in a National Center for Health Statisticspublication Synthetic State Estimates of Disabil-ity. 1 The authors stated that the sample size(and design) of the Health Interview Surveywere inadequate to make direct State estimatesby conventional procedures and suggested a syn-thetic approach that has since been extensivelyinvestigated. Levy2 used mortality data to com-pute average relative errors of synthetic esti-mates for States. Gonzalez and Waksberg3 cal-

QU.S.Bureauof Labor Statistics.boffice of stati5tic~ Re5earch, Nation~ center for

HealthStatistics.COffice of Mathematical Statistics, National Center

for ElealthStatistics.

Schnackb

culated mean square errors averaged over. allsmall areas and compared synthetic and directestimates for ‘selected standard metropolitanstatistical areas. Gonzalez and Hoza4 investi-gated synthetic-estimate errors by using unem-ployment data for counties from the CPS andthe 1970 census. Namekata, Levy, and0’Rourke5 investigated synthetic State estimatesof work-loss disability in a similar manner.Schaible, Brock, and Schnack6 compared theaverage squared errors of synthetic and directestimates of unemployment rates for countygroups in Texas. Levy and French7 discussedproperties of the nearly unbiased, synthetic, andregression-adjusted synthetic estimators andcompared different synthetic estimators.Finally, Purcell and Kish8 have given an excel-lent summary of the most recent literature onsmall area estimation.

If information from a national sample isused to make estimates for small areas and thereare no sample units in the small area of interest,then, obviously, conventional estimation meth-ods cannot be used and a synthetic approach isnecessary. However, the sample size in a smallarea can be rather large; for example, the HealthInterview Survey sample size in California isgreater than 10,000 persons each year. There-fore, when estimating for areas with large sam-ple sizes, should one ignore traditional proce-dures and use a synthetic approach? When thesample size of a small area approaches the size ofthe area’s population, a conventional directestimator becomes more desirable than a syn-thetic estimator. This statement is true whether

1

Page 8: Vital and Health Statistics; Series 2, No. 82 (12/79)5. Age group: under 17 years, 17-44 years, 45-64 years, 65 years and over Family size: fewer than five members, five members or

or not the sample was designed to produce esti-mates for small areas.

The purpose of this report is to compare a,variety of direct and synthetic estimators formaking State estimates and to investigate therelative performance of these estimators overdifferent State sample sizes.

APPROACH

The appropriateness of an estimation pro-cedure often depends on the sample design.Thus a brief description of the Health InterviewSurvey (HIS) design used from 1969 through1971 is given below. A more complete descrip-tion is given in a separate National Center forHealth Statistics (NCHS) publication} This HISsample design is a multistage probability designwhich permits a continuous sampling of house-holds from the civilian noninstitutionalizedpopulation of the United States. The first stageconsists of a sample of 357 primary samplingunits (PSU’S) drawn from approximately 1,900geographically defined PSU’S that cover the 50States and the District of Columbia. A PSU con-sists of a county, a small group of contiguouscounties, or a standard metropolitan statisticalarea. Within PSU’S smaller units called segments,each containing an expected six households, areselected. The usual HIS sample consists ofapproximately 8,000 segments, which yield aprobability sample of about 134,000 persons in42,000 households interviewed in a year. Thenumber of sample persons residing in each Stateand the District of Columbia in 1970 and1969-71 is given in table III.

The usual HIS estimation procedure is elabo-rate. Each responding sample person is assignedan estimation weight that is the product of thereciprocal of the probability of selection, twononresponse factors, and two poststratificationfactors. A weighted estimate of a populationaggregate is then made by weighting the observa-tion of interest for each responding sample per-son by the corresponding estimation weight andthen summing over all persons.

For the purposes of this study, a variety ofdirect and synthetic estimators are used withHIS sample data to estimate know’ populationvalues for each State and the District of

Columbia. This procedure allows calculation ofactual errors to compare the estimators’ per-formance. The known population values wereobtained from data taken from the 5-percentsample questionnaire of the 1970 census. Theactual data from the Public Use Sample Tapescontain a one-in-one-hundred sample of the totalU.S. population. The variances of “estimates”from a sample this large are negligible, and thequantities computed from these tapes aretreated as population values. Comparable vari-ables were selected from the HIS, and estimateswere calculated from the 1970 and 1969-71data. The variables studied are (1) percent of thepopulation less than 1 year old, (2) percent ofthe population married, (3) percent of the popu-lation separated, (4) percent of the populationcompleting high school, and (5) percent of thepopulation completing college. Hereafter, thesevariables will be referred to as “less than one,”“married,” “separated,” “high school,” and“college,” respectively.

Sixteen estimates were calculated for eachvariable: four basic and four ratio adjusted, andeach estimate was calculated with and withoutthe HIS estimation weights. The four basic esti-mators are (1) the simple direct, (2) a posts trati-fied, (3) a 16-cell synthetic, and (4) a 64-cellsynthetic. The four basic estimators and an ex-planation of the ratio adjustments are given inthe following section.

ESTIMATORS

Let ytia denote the observation of intereston the iti unit in the j* State in the o!* post-stratification cell (or demographic class):

i=l,2, . . .. Zjaja

j=l,2, . . ..5l

a=l, z,. ... k

where nja is the HIS sample size in the@* cell,and let N.a denote the population size in the P*cell. NO~E: When nja = O, we define

2

Page 9: Vital and Health Statistics; Series 2, No. 82 (12/79)5. Age group: under 17 years, 17-44 years, 45-64 years, 65 years and over Family size: fewer than five members, five members or

The usual dot summation convention will be oused here; for example,

Simple Direct Estimator

The simple direct estimator for the jti Stateis

Poststratified Estimator

The sample mean of the c@ poststratifica-tion cell in the jth State is

Thus the usual poststratified estimator for Statej is

The variables used to define the 16 poststratifi-cation cells were age, sex, and color, as describedfor the synthetic estimator.

Synthetic Estimators

The sample mean of the CYfi demographicclass (poststratification cell) for the total U .S. is

The simple synthetic estimator for State j is

The variables that define the demographicclasses are as follows:

1, Color: white, all other.

2. Sex: male, female.

3.

4.

5.

Age group: under 17 years, 17-44 years,45-64 years, 65 years and over

Family size: fewer than five members,five members or more.

Industry of head of family: Standard In-ternational Classifications: (1) forestryand fisheries, a~iculture, construction,mining and manufacturing; (2) and allother industries.

The 64 classes produced by these variableswere used in the 64-cell synthetic estimator. The16 classes defined by the age, sex, and colorgroups were used for the 16-cell synthetic esti-mator.

Ratio-Adjusted Estimators

An additional four estimators were createdby modifying each basic estimator by a regionalratio adjustment. The ratio adjustment for thesimple direct estimator (~jo) illustrates the ad-justment procedure.

Let

and let ~R be the usual weighted HIS estimator,where R denotes the HIS region that includesthe State of interest. The ratio-adjusted simpledirect estimator for the jfi State is then:

The ratio adjustment for each of the remainingestimators is the same except in the denomina-tor of the ratio where the simple direct esti-mators are replaced by the particular estimatorsbeing adjusted. This ratio adjustment forces theweighted sum of the State estimates, when eachState estimate is weighted by the proportion ofthe regional population in that State, to be con-sistent with the usual HIS estimate for eachregion.

Page 10: Vital and Health Statistics; Series 2, No. 82 (12/79)5. Age group: under 17 years, 17-44 years, 45-64 years, 65 years and over Family size: fewer than five members, five members or

RESULTS

The average squared difference between eachestimate and its corresponding State censusvalue, that is, the average squared error, is shownfor the 5 variables and 16 estimators in table A.The estimates used to compute these errors werebased on the 1969-71 HIS sample data. Theaverage squared error obtained when the HISregional estimate is used for each State is alsoshown in this table. This average squared errorof the regional estimator is an indicator of thevariability of the State values within regions.This indicator is important because syntheticestimators tend to perform better when the esti-mated characteristic does not vary substantiallyamong small areas. For example, the variable“married” shows little variability among smallareas, and has a regional average squared error of5.71 (see table A). On the other hand “highschool,” which vanes considerably more fromState to State, has a regional average squarederror of 13.43. This average squared error for

the regional estimator also can be used for com-parison with the mean square errors of otherestimators. A minimum expectation of a Stateestimator is to outperform this simplisticmethod of estimation.

Relatively large differences in averagesquared errors occur among the eight direct andeight synthetic estimators. This finding is par-tially explained by the size of the statistic. Thesynthetic estimators have smaller averagesquared errors than direct estimators when theyare used to estimate “less than one” because thzkvariable differs lit tle between the States within aregz”on. The average squared error of the regionalestimate is 0.04, and in 1970 this percent rangedfrom 1.5 in Rhode Island to 2.7 in Alaska. Forthe variables “married” and “separated,” thedirect estimators have smaller average squarederrors than do the synthetic. For the remainingtwo variables neither group of estimators hasconsistently smaller average squared errors.

Although major differences in averagesquared errors occur among the direct and syn-

Table A. Average squared errors of estimates for 50 States and the District of Columbia, by selected attribute variables and small areaestimators: Health Interview Suway, 1969-71

Estimator

Simple direct ... .... ... . ... . ... . .. .. .. .... ... . ... .. . .. . ..... .... . ... .. .. . .... . ... ... .. .. .. .. .. ... . .Weighted ... ... .... ... .. .. .... .. .. .. ... .. ... .. ... .. .. ... ... .. .. .. .. .... .. ... .. ... .. .. .. ..... .. ..

Ratio adjusted .. .... .. ... . .. .. . ... .. .. .... .. .... . ... .. ... . . .. .. ... . ..... .. . .... .. ... . ... ... ..Weighted, ratio adjusted . .... . ... .... .... .. .. .... .. .. .. .. .. .... .. .... .... .. .. .. .... .. .. .

Poststratifiad .. .. .... ... .. ... ... .. .. .. .. ... ... .. ... .. . .. .. . .... ... .. ... .. .. ... . .. .. .. ... . ... . .. ... .

Waighted ...... .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .... .. ... ... .Ratio adjusted .. .... ... .. . .... ... .. ... .. ... .. .. ... .. .. .. .... ..... .. ... ... . .... . ... .. .. .. .. ...Weighted, ratio adjusted .. ... .. .. ... . .... .. .. .. .. .. .. .. .. .. .. ... . .. .. .. .. .. .. .. .. .. .. .. .

Synthetic (16) ...... .. . .. .. . .. .. ... . .... .. .. .. ... .... . .. .. . . .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ... . ..Weighted .. .. .... . ... .. ... .. . .. .. . .. .. .. .. .. .... .. .... .. . ... .. .. .. .. .. ... . .. .. .. .. .. .. .. .. .. .. ..Ratio adjusted .. .. ... ... . .. ... ... .... .. .. ... ... . ... .. ... ... .. .. . ... .. . ... . .... .. ... .. .... .. ..Weighted, ratio adjusted .. .. .... .. ... ... . .. ... ... .. .. .. .... . .... .. ... . ... .. ... .. .. . .... .

Synthetic (64) ... .... .. .. . ... .. .. .. ... ... .. .. .. ... . .... .... .. .. .. .. . .. .. .. ... . .... .. .. .. .. ..... ...Weighted ... ..... . .. .. .. ... .. ... .. ... ... .. .. .... .. .. .. ... ... .. .. .. .. .. . . .. .. .. .. .. .. .. ... ... .. ..Ratio adjusted ... ... .. .... .. .. .. .. .. .. .. .. .... .. .. .. .. .. .. .. .. .. ... ... .. .. .. .. .. .. ... . .. .. .. .Weighted, ratio adjusted .. .. ... .. .. .. ... . .. .. ... . ... .. ... .. ... ... ... . .... . .. ... .. .... ...

Regional estimate .. ... ... ... . .... .. .. .. . .. .. .. .. .. .. .. .. .. .... .. .. .. .. .. .. . ... ... ... .. .. .... ... .

Variable

Percent Percent PercantParcent Percentless than

marriad separatedcompleting completing

1 year old high school college

0.160.15

0.170.16

0.14

0.140.160.15

0.020.010.020.02

0.020.020.02

0.02

0.04

1.812.27

1.942.23

2.412.342.062.26

3.813.813.233.21

3.833.81

3.213.19

5.71

Average squared error

0.05

0.05

0.050.05

0.050.050.050.05

0.310.310.230.23

0.300.30

0.220.22

0.49

12.36

12.5912.7212.48

7.45

6.837.957.18

19.5419.5111.3711.27

16.4216.42

8.948.89

13.43

1.811.86

1.841.86

1.70

1.721.711.74

2.322.332.342.32

1.681.69

1.781.76

1.95

Page 11: Vital and Health Statistics; Series 2, No. 82 (12/79)5. Age group: under 17 years, 17-44 years, 45-64 years, 65 years and over Family size: fewer than five members, five members or

thetic estimators, some within-group differencesalso exist. The synthetic estimators that have theregional ratio adjustment generally producesmaller average squared errors than those that donot. The synthetic estimators that use the HISestimation weights have average squared errorsalmost identical to those that do not. The use ofthe ratio adjustment with the direct estimatorstends to increase the average squared errorrather than decrease it. Furthermore, the use ofestimation weights does not improve the per-formance of the direct estimators. In fact,when the 1970 data were used to produce esti-mates, the addition of estimation weights in-creased the average squared errors of the directestimators (table I). In general, these results indi-cate that if direct estimators are used for pro-ducing State estimates, perhaps they should notbe weighted or ratio adjusted and if syntheticestimators are used, they should be ratio ad-justed. As a result of this evidence four estima-tors were chosen for further investigation: thesimple direct and poststratified estimators,neither weighted nor ratio adjusted, and the 16-and 64-cell synthetic estimators, both weightedand ratio adjusted. The weighted synthetic esti-mators were chosen instead of the unweighedestimators because the weighted a-ceil meansused are generally more readily available thanthe unweighed means.

The plots of State estimates and State censusvalues in figures 1-4 show the results when thetwo direct and two synthetic estimators wereused to estimate the percent completing highschool. Plots of results obtained by using thesefour estimators for each of the four remainingvariables are presented in figures I-XVI. In eachplot the census value is shown on the horizontal

axis and the estimate on the vertical axis. EachState (and the District of Columbia) is repre-sented by a point. The error in an estimate for aState is the vertical distance from the point forthat State to the 45° line bisecting the plot.

All four estimators generally produce esti-mates that approximate the State census values.The 16- and 64-ceil synthetic estimators (figures3 and 4) both tend to overestimate State valuesthat m-e low and underestimate those that arehigh, In the estimates produced by both syn-thetic estimators the largest error occurs in the

60.0 –

500 - 0

00

040.0 -

30.0

20.0 -

10,0 -

0,00,0 10.0 20.0 30.0 40.0 50.0 60.0

ACTUAL

Figure 1. Percent of the population who have completed highschool—simple direct estimates and actual values for 50States and the District of Columbia: Health lnterviaw

Survey, 1969-71

600 -

50.0

0

40.0 -‘$0

30.0

20,0 -

100

0.00.0 100 200 30.0 40.0 50.0 ao.o

ACTUAL

Figure 2. Percent of the population who have complated highschool —poststratified estimates and actual values for 50States and the District of Columbia: Health Interview

Survey, 1969-71

5

Page 12: Vital and Health Statistics; Series 2, No. 82 (12/79)5. Age group: under 17 years, 17-44 years, 45-64 years, 65 years and over Family size: fewer than five members, five members or

60.0 ~

500 -

400 -

300 -0

20.0 -

10.0 -

0.00.0 10.0 20,0 30.0 40,0 50.0 60,0

Figure 3. Percent of the population who have completed highschool—16-cell synthetic estimates and actual values for 50States and the District of Columbia: Health interviewSurvey, 1969-71

60.0 -

50,0 -

0

40,0

30.0 -0

20.0 -

10.0 -

0.00,0 10.0 20.0 30.0 40.0 50,0 60.0

ACTUAL

Figure 4. Percent of the population who have completed highschool —64-cell synthetic estimates and actual values for 50States and the District of Columbia: Health InterviewSurvey, 1969-71

estimate for the District of Columbia and thenext largest error occurs in the estimate forHawaii. These observations are generally true forthe other four variables. In fact, the syntheticestimates for the District of Columbia andHawaii are in error to the extent that, except forthe variable “less than one ,“ they dominate theaverage squared errors of the synthetic estima-tors presented in table A. If these two errors areremoved from the average squared error calcula-tions, the synthetic estimators generally havesmaller average squared errors than the directestimators. For example, excluding the Districtof Columbia and Hawaii reduces the averagesquared error of the 64-cell ratio adjusted syn-thetic estimator from 3.19 to 1.08 for “mar-ried,” from 0.22 to 0.08 for “separated,” from8.89 to 6.72 for “high school,” and from 1.76to 1.15 for “college. ” If these results indicatewhat to expect when estimating other character-istics, then special care should be taken when in-terpreting synthetic estimates for these twolocations.

In the simple direct estimates in figure 1, the

three largest errors occur for States with thesmallest sample sizes. One large error for thepoststratified estimates in figure 2 occurs in Ver-mont, the State with the second smallest samplesize. Figures 5 and 6 show the squared errors ofthe simple direct and poststratified estimatorsfor each State by the combined 1969-71 HISState sample size. Similar graphs for these twoestimators and the remaining variables are shownin figures XVII-XXIV. As expected with anyconventional estimator, the States with largesample sizes generally have smaller errors thanthe States with small sample sizes. This result istrue for both estimators regardless of the varia-ble considered.

The fitted curves were generated by themodel, yj =-4 + B/ni + ei, where yi is the squarederror of the estimate for State i, q. is the samplesize, ei is a random error term, and A and B areunknown parameters. One imposed restrictionwas that A must equal zero when the sample sizeequals the average State population size. Theparameters A and B were estimated from thedata by minimizing the sum of the absolute val-ues of the ei’s (table II).

Page 13: Vital and Health Statistics; Series 2, No. 82 (12/79)5. Age group: under 17 years, 17-44 years, 45-64 years, 65 years and over Family size: fewer than five members, five members or

Because the expected errors of the simpledirect and poststratified estimators decrease asState sample size increases, it is interesting toascertain at what sample size(s) these errors willbecome smaller than those expected from a syn-thetic estimator. The preceding model fits thesquared errors of the direct estimates reasonablywell; however, some difficulty arises in finding amodel that adequately describes the squarederrors of the State synthetic estimates as a func-tion of State sample size. This difficulty is pri-marily due to the fact that, as noted bySchaible, Brock, and Schnack,G the squarederror of the synthetic estimator is subject onlyto a small sampling variance inherent in theestimated large area mean but is usually domi-nated by a bias component which is independentof sample size.

Thus the squared error of the State syntheticestimates is described by a constant function,specifically the squared error averaged overStates. However, in the variables investigatedthis model tended to overestimate the squarederrors of those States with larger sample sizes.This overestimation was due to three interre-lated factors. First, the State synthetic estimateshave a smaller bias component when the Statecell values approach the regional or national cellvalues. Second, the contribution of a State cellvalue to a regional or national value is in directproportion to the size of the State’s cell popula-tion, so that States with large populations tendto have actual cell values near the actual regionalor national cell values. Third, the HIS sample isdesigned so that States with large populationshave large sample sizes.

With the assumption that a more accuratefunction can be approximated by this simpleone, the average squared errors (omitting theDistrict of Columbia and Hawaii) of the 64-cellratio adjusted synthetic estimator can be com-pared with the appropriate curves in figures 5and 6 and figures XVII-XXIV. The sample sizesat which the two expected squared error func-tions intersect are approximately 8,000 for thevariable “less than one ,“ 300 for the variable“separated,” and approximately 2,000 for thethree remaining variables. Many State samplesizes are large compared with the values where

220.0

200.0

180.0

I160,0

a:

lx 1400

:ua 120,0<

2 100.0.

60.0

60.0

40.0 -*

20.0

0.00 5,000 10,000 15,000 20,000 25.000 30,000 35,0CQ 110,000

SAMPLE SIZE

Figure 5. Squared errors of simple direct estimates of the per-

cent of State population who completed high school, by

sample size: United States, 1969-71

2200

2000

1s00

16001

a0 1400aCc

: 120.0.K~ 100.0 -0m

80.0

60.0

40.0

1’ v200

0.00 5,OOO 10,000 15.000 20,000 25,000 30,000 35,000 40,000

SAMPLE SIZE

Figure 6. Squared errors of poststratified estimates of the per-

cent of State population who completed high school, by

sample siza: United States, 1969-71

7

Page 14: Vital and Health Statistics; Series 2, No. 82 (12/79)5. Age group: under 17 years, 17-44 years, 45-64 years, 65 years and over Family size: fewer than five members, five members or

these functions intersect (table III). In fact, theaverage State sample size for the combined1969-71 sample is 7,500. This observation sug-gests that in many States the direct estimatorsare expected to perform, as well as, if notslightly better than, the synthetic estimators.

Although the average squared error is usedto evaluate synthetic estimators, this measure isnot appropriate for all purposes. Generally, theavqrage squared error is a measure of how wellthe estimate agrees with the parameter beingestimated. However, in some cases, rather thanestimating only one parameter, the differencebetween the parameters of two domains is esti-mated. In other cases, information is providedon the relative position of the parameters fromseveral domains. When comparing estimators,the one with the smallest average squared errormight be expected to perform best in estimatingboth relative positions and the level of a singleparameter. This expectation is not necessarilyjustified. Table I shows that for the variable“college” the simple direct estimator has anaverage squared error of 3.50, which is muchlarger than the 1.72 of the 64-cell ratio adjustedsynthetic estimator. However, the simple directestimator for this variable is more highly corre-lated with the actual values than the syntheticestimator is (table B). In fact, the direct esti-mators correlate better with census values thanthe synthetic estimators do for all the variablesexcept “less than one .“

SUMMARY

Sixteen different estimators, eight direct andeight synthetic, were used with HIS data to esti-mate five different known census values for eachof the 50 States and the District of Columbia.The effect of several different estimation tech-niques on the average squared errors of theseestimators was noted. Estimators that used theHIS estimation weight designed for national esti-mates did not outperform unweighed esti-mators, in fact, in some instances the use of HISestimation weights increased the average squarederrors of direct estimators. A ratio adjustment toregional HIS estimates improved synthetic esti-mators but did not improve direct estimators. A64-cell synthetic estimator outperformed a 16-cell synthetic estimator for two of the five vari-ables considered. The average squared errors ofthese two estimators were essentially identicalfor the remaining three variables. A poststrati-fied estimator generally produced smaller aver-age squared errors than did the simple directestimator. Moreover, as expected, the squarederrors of the direct estimators decreased as thesample sizes in the States increased.

Although the preceding differences werenoticeable, the major differences in the 16 esti-mators occurred between the direct and syn-thetic estimators. The direct estimators hadsmaller average squared errors with some vari-

Table B. Correlation coefficients between actual and estimated State values for 2 direct and 2 synthetic estimators, by selected

attribute variables: Health Interview Survey, 1970 and 1969-71

Data vears and estimator

1970

Simple direct . ...... . ... ... .. . .. . .... .. .... .. ... . .. . ... .. ... .. ... ... . ... .. .. ... .... .. . ... . ... .. .. ..Poststratif ied ... .... .. .... . ... .. ... ... ... . ... ... . ... .. .. ... .... . .... . ... .. .. ..... . .. .. . ... .. .. .. .. .

Synthetic (16)–weighted, ratio adjusted . .. .. ... ... .. . .. .. ... ... .. .. .. .. .. .. .. ... . ..Synthetic (64)–weighted, ratio adjusted . ..... ... ... .. ... . ... .... . ... ... .. .. ... ... ..

1969-71

Simple direct .. .... .. .. ... ... ... . ... . .. .... .. .. .... ... ... .. . .... .. ... .. ... .. .... . ... ... . .... ... ... .

Poststratified ... .... . ..... ... ... .. .. ... ... .. ... . .... .. ... .. ... ... .. ... . .. ... .. .. . .... .. .. .. .. ... ...

Synthetic (16)–weighted, ratio adjusted .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... .. ...Synthetic (64)–weighted, ratio adjusted ...... .. ... .. .. .. ... . ... . ... ... .. ... .... .. ..

Varieble

I I 1 1

Percent Percent PercentPercent Percentless than

married separatedcompleting completing

1 year old high school college

0.360.360.710.69

0.44

0.400.790.76

Correlation coefficient

0.770.720.680.68

0.88

0.B60.670.67

0.930.940.810.82

0.96

0.960.800.81

0.650.780.720.78

0.79

0.860.740.79

0.550.600.250.42

0.69

0.710.270.45

8

Page 15: Vital and Health Statistics; Series 2, No. 82 (12/79)5. Age group: under 17 years, 17-44 years, 45-64 years, 65 years and over Family size: fewer than five members, five members or

abIes, and the synthetic estimators had smalleroverage squared errors with others. However, thesynthetic estimators produced estimates for theDistrict of Columbia and Hawaii with such un-usually large errors that, when these two placeswere omitted from the comparisons, the syn-thetic estimators generally had smaller averagesquared errors than the direct estimators.

When the correlation between an estimateand its corresponding census value was used asan evaluation criterion instead of the averagesquared error, the results were different becausethe direct estimators generally outperformed thesynthetic estimators. These results suggest thatthe direct estimators might serve better when

estimating differences in characteristics betweenStates, or, similarly, when determining relativepositions among States.

These considerations and previous stud-ieslJ10 indicate that the selection of a singleestimator to produce State estimates from HISmay not be the best choice. A superior estimatormight be obtained by using a linear combina-tion of estimators where the components areweighted according to their expected perform-ance. Recently, these types of estimators, calledcomposite estimators, were studied theoreti-Ca]]yl1,12 and, on a limited basis, empiri-cally.1 3Y14A more extensive empirical investiga-tion will be the subject of future research.

1National Center for Health Statistics: SyntheticState Estimates of Disability. PHS Pub. No. 1759. Pub-lic Health Service. Washington. U.S. Government Print-ing Office, 1968.

2Levy, P. S.: The use of mortality data in evaluatingsynthetic estimates, in Proceedings of the AmericanStatistical Association 1971, Social Statistics Section.Washington. American Statistical Association, 1974. pp.3~~.331.

~Gcmzalez, M. E., and Waksberg, J. E.: Estimation ofthe Error of Synthetic Estimates. Presented at the firstmeeting of the International Association of Survey Sta-tisticians, Vienna, Austria, Aug. 1973.

~Gonz~ez, M, E., ~d Hoza, CO: Small ~ea estfia-tion with application to unemployment and housingestimates, J. Am. Stat. Assoc. 73:7-15, 1978.

5 Namekata, T., Levy, P. S., and O’Rourke, T. W.:Synthetic estimates of work loss disability for each Stateand the District of Columbia. Pub. Health Rep. 90:532-538, 1975.

6Sch~bIe, w. LO, Brock, D. B,, ~d Schnack, G. A.:

An Empirical Comparison of Two Estimators for SmallAreas, Presented at the Second Annual Data Use Confer-ence of the National Center for Health Statistics, Dallas,Tex., 1977.

7Nation~ Center for Health S tati~tic~: Synthetic

estimation of State health characteristics based on the

Health Interview Survey, by P. S. Levy and D. K.French, Vital and Health Statistics. Series 2-No. 75.DHEW Pub. No. (HRA) 78-1349. Health Resources Ad-ministration. Washington. U.S. Government PrintingOffice, Oct. 1977.

8purcell, N. J., ~d Kish, L.: Estimation for sm~l

domains. Biometrics 35:365-384, 1979.9Nation~ Center for Heal~ Statistics: Statistical

design of the Health Household Interview Survey. HealthStatistics. Series A-2. PHS Pub. No. 584-A2. PublicHealth Service. Washington. U.S. Government PrintingOffice, 1958.

10 Royall, R. M.: Discussion of two papers on recentdevelopments in estimation for local areas, in Proceed-ings of the American Statistical Asso ciation 1973, So cialStatistics Section. Washington. American StatisticalAssociation, 1974. pp. 4344.

11 Royall, R. M.: Prediction Models in Small AreaEstimation. Presented at the NIDA-NCHS Workshop onSynthetic Estimates, Princeton, N.J., 1978.

12 Schtible, W. L.: Choosing weights for compositeestimators for small area statistics, to appear in Proceed-ings of the American Statistical Association 1978, Sur-vey Research Section. Washington. American StatisticalAssociation, 1979. pp. 741-746.

1 ?’sch~ble, ~. L., Brock, D. B., and Schnack, G. A.:

An empirical comparison of the simple inflation, syn-thetic and composite estimators for small area statistics,in Proceedings of the American Statistical Association1977, Social Statistics Section. Washington. AmericanStatistical Association, 1978. pp. 1017-1021.

14 Brock, D. B., and Peyton, B. W.: Small Area Esti-mation: An Application of Three Methods to the U.S.National Health Interview Survey. Presented at the 36thAnnual Meeting of the United States–Mexico BorderHealth Association, Reynosa, Tamaulipas, Mexico, 1978.

9

Page 16: Vital and Health Statistics; Series 2, No. 82 (12/79)5. Age group: under 17 years, 17-44 years, 45-64 years, 65 years and over Family size: fewer than five members, five members or

APPENDIX

CONTENTS

Supplemental Tables and Figures ...........................................................................................................

LIST OF APPENDIX TABLES

I. Average squared errors of estimates for 50 States and the Dktrict of Columbia, by selected attri-bute variables and small area estimators: Health Interview Survey, 1970 .....................................

II. Estimated parameters of curves fitted to plots of squared errors of the simple direct and post-stratified estimators for 5 selected attibute variables: Health Intemiew Survey, 1970 and1969-71 .......................................................................................................................................

III. Health Interview Survey sample sizes by State for 1970 and 1969-71 ............................................

I.

II.

HI.

IV.

v.

VI.

VII.

VIII.

IX.

x.

XI.

XII.

LIST OF APPENDIX FIGURES

Percent of the population under 1 year of age–simple direct estimates and actual values for50 States and the District of Columbia: Health Interview Survey, 1969-71 ............................

Percent of the population under 1 year of age–poststratified estimates and actuaf values for50 States and the District of Columbia: Health Interview Survey, 1969-71 ............................

Percent of the population under 1 year of age–1 6-cell synthetic estimates and actual valuesfor 50 States and the District of Columbia: Health Interview Survey, 1969-71 ......................

Percent of the population under 1 year of age–64well synthetic estimates and acturd valuesfor 50 States and the District of Columbia: Health Interview Survey, 1969-71 ......................

Percent of the population who are married-simple direct estimates and actual values for 50States and the District of Columbia: Health Interview Survey, 1969-71 .................................

Percent of the population who are married-poststratified estimates and acturd values for 50States and the District of Columbia: Health Interview Survey, 1969-71 .................................

Percent of the population who are married-16-cell synthetic estimates and actual values for50 States and the District of Columbia: HealtA Interview Survey, 1969-71 ............................

Percent of the population who are married-64 -eelf synthetic estimates and actnaI vrdues for50 States and the District of Columbia: Health Interview Survey, 1969-71 ............................

Percent of the population who are separated-simple dwect estimates and actual values for 50States and the District of Columbia: Health Interview Survey, 1969-71 .................................

Percent of the population who are separated-poststratified estimates and actual vrdues for 50States and the District of Columbia: Health Survey, 1969-71 ................................................

Percent of the population who are separated-16-cell synthetic estimates and actual wdues for50 States and the District of Columbia: HeaItb Interview Survey, 1969-71 ............................

Percent of tbe pops.dation who are separated-64-eefl synthetic estimates and actual vahses for50 States and the District of Columbia: Health Interview Survey, 1969-71 ............................

12

12

13

13

14

14

14

14

15

15

15

15

16

16

16

16

10

Page 17: Vital and Health Statistics; Series 2, No. 82 (12/79)5. Age group: under 17 years, 17-44 years, 45-64 years, 65 years and over Family size: fewer than five members, five members or

XIII.

Xw,

xv.

XVI.

XVII.

Percent of the population who have completed college-simple direct estimates and actualvalues for 50 States and the District of Columbia: Health Interview Survey, 1969-71 ... .........

Percent of the population who have completed college-poststratified estimates and actualvalues for 50 States and the District of Columbia: Health Interview Survey, 1969- 71 ............

Percent of the population who have completedcoflege-16-cefl synthetic estimates and actualvalues for 50 States and the District of Columbia: Health Interview Survey, 1969-71 ............

Percent of the population who have completed college-64-cell synthetic estimates and actuidvalues for 50 States and the District of Columbia: Health Interview Survey, 1969-71 ............

Squared errors of simple direct estimates of the percent of State population less than 1 yearof age, by sarnpIe size: United States, 1969-71 .~..........................~..~ ........... ...................~......

XVIII. Squared errors of poststratified estimates of the percent of State poprdation less than 1 yearof age, by sample size: United States, 1969-71 .............. .................................................... .....

XIX. Squared errors of simple direct estimates of the percent of State population who are married,by sample size: United States, 1969-71 ................................................................ ..................

xx. Squared errors of poststratified estimates of the percent of State population who are married,by sample size: United States, 1969-71 ..................................................................................

XXI. Squared errors of simple direct estimates of the percent of State population who are sepa-rated, by sample size: United States, 1969-71 ...................................................................... ..

XXII. Squared errors of poststratified estimates of the percent of State population who are sepa-rated, by sample size: United States, 1969-71 ............................... .........................................

XXIII. Squared errors of simple direct estimates of the percent of State population who completedcollege, by sample size: United States, 1969-71 .....................................................................

XXIV. Squared errors of poststratified estimates of the percent of State population who completedcollege, by sample size: United States, 1969-71 .....................................................................

17

17

17

17

18

18

18

18

19

19

19

19

11

Page 18: Vital and Health Statistics; Series 2, No. 82 (12/79)5. Age group: under 17 years, 17-44 years, 45-64 years, 65 years and over Family size: fewer than five members, five members or

APPENDIX

SUPPLEMENTAL TABLES AND FIGURES

Table 1. Average squared errors of estimates for 50 States and the District of Columbia, by selected attribute variables and small areaestimators: Health Interview Survey, 1970

Estimator

Simple direct ..... .... . .. ... . ... .. .. .. . ... .. .. .. ... .. .. . ... .. ... . .. ... .... .... ... ... .. .. .... .. .. .. .Weighted .. .. ... .... .... . . ... .. .. .. ... .. .. . .. ... .. .. .... . ... .. ... . .. ... . ... . .. . ...... .... .... . ..Ratio adjusted .. ... .. .. ... ... .. ... .. ... ... ... ... .. .. .. ... . .. .. .... ... .. . .. . .... . .. .. .. .. .. .. .Weighted, ratio adjusted .. .... .. ... .. ... .. ...... .. ... . .... ..... ... .. .... .. .. ... ... .. .. ..

Poststratifiad ..... .. .... . ... .. .. .... .. .... . ... .. ... .. ... ... . .. .. .....fl . ... .. . .. . ..... ... .. .. .... ..Weighted .. ... ... . .. ..... .. .. ... .... .. .. ... ... .. .. ... ... ... ... .... . ..... . ... ... ... ... . ..... ... ..Ratio adjusted .. .. .... .. .. .. .. ... .. ... .. ... .. ... .. ... .... .. ... .. ... . ... .... .. . .. ... .... .. ... .Weighted, ratio adjusted ... ... .. ..... .. . ... ... ... ... .. ... .. ... .. ..... ... ... .. ... ... .. .. .

Synthetic (16) . ....... .. .. ..... I. .. . ... .. ...~.. .... .. .. ... ... .. ... . .... ....~. .. ... .. ... ... . .. .. ...Weighted ... ... ... .. .. ... ... .. .. .. .. ... . ..... . ... . ... .. . .. .. .. ... ... .. .. ... ... .. . .. .. . .... .. ... .Ratio adjusted .. ... ... ... . .... .. ... ... .. ... ... .. .. .. .... .. .... . .. .. ... . ... .. ... .... .. . .... ...Weighted, ratio adjusted .... ... ... . .. .. .... .. ... ... .. .. .. .... ... .. ... . .... . .. .. .... .. ...

Synthetic (64) ... ... . .... ... .. .. ... .. .. . ... .. . .. .. .. .. .... .. .. . ... .. ... ... .. ... ... .. .... .. .. ... ...Weighted .... . ... .. ... .... .. .. .... ... .... ... .. .. .. .. .. ... .. ... .... .. .. .. .... . . .. .. .. ... ..... .. ..Ratio adjusted ... ... .. .. .... ... . .... .. ... ... .. .. .... .. .... ... . .. .. ... .. . ... .... .. ... .. ..... . .Weighted, ratio adjusted . ... ... . ... . .. .. ... ... .. .... .. ... ... .. .. .. .... .. ... ... .. ... ... ..

Regional estimate . ..... .. ... .. .. .. .. .. ... ... .. ... .. ... .. ... ... .. .. . .... . ... .. .. .. .. .. ... .... .. ..

Variable

PercantPercent Percent Percent Percant

less thanmarried separated

completing completing1 year old high school college

0.520.410.51

0.45

0.460.360.480.38

0.020.010.030.02

0.020.020.020.02

0.04

5.216.735.74

6.56

5.636.20

4.805.88

3.833.763.14

3.08

3.643.753.14

3.06

5.85

Average squared error

0.110.16

0.110.16

0.090.12

0.090.11

0.310.310.23

0.22

0.310.300.22

0.22

0.47

30.8733.68

32.2032.64

13.8412.94

14.9013.80

20.2220.2513.52

13.50

17.1917.1210.97

10.86

16.69

3.503.50

3.653.49

3.003.11

3.053.22

2.392.432.47

2.47

1.721.761.891.90

2.02

12

Page 19: Vital and Health Statistics; Series 2, No. 82 (12/79)5. Age group: under 17 years, 17-44 years, 45-64 years, 65 years and over Family size: fewer than five members, five members or

Table 11, Estimated parameters of curvesl fitted to plots of squared errors of the simple direct and poststratified estimators for selected~ attribute variables: Health interview Survey, 1970 and 1969-71

Data years, estimator,and parameter

1970

Simple direct

A ................................................B ,,, ,..,,,,,,,,.. m..,, .,, ....... .................

Poststratified:A ................................................B .................................................

1969-71

Simple direct:

A ................................................B ,,, .,,,,,,,..,., .................................

Poststratified:A ................................................B .................................................

Percentless than

1 year old

–0.4200x 10-40.0168 x 104

–0.4199 x 10-40.0168 x 104

-0.3446 X 10-40.0138 X 104

–0.4360 X 10–40.0174x 104

Variable

Percent PercentPercent Percent

married separatedcompleting completing

high school college

Estimated parameters

-7.5800 X 10-40.3033 x 104

–9.5559 x 10–40.3822 X 104

-5.8840 X 10-40.2354 X 104

–6.0494 X 10-40.2420 X 104

-0.0684 X 10-40.0027 X 104

–0.041 6 X 10-4

0.0017 x 104

-0.0486 X 10-40.0019 x 104

–0.0573 x 10-40.0023 X 104

–24.2600 X 10 +0.9705 x 104

–6.4205 X 10–40.2568 X 104

-33.9500 x 10-41.3580 X 104

–17.0790 x 10-40.6832 X 104

1 Model: Y;= A + B/ni + ei, see text for further explanation.

Table II 1. Health Interview Survey sample sizes by State for 1970 and 1969-71

State

Total .. .... ... .. ... ... .. .. ... ... ... .. .. ... ... .. ... .

California .... ... ... .... .. ... .. ... ... .. . .. . .... . .... ... .. . .New York .... .. .. .. .. .. .. .. ... ... .. .... . ... .. .. .. .... .. .. .

Pennsylvania .. .. ..... .. ... ... .. .. .. .. .. .. .. . ..... .. .. . ...

Texas ... .. .. .. .. .. .. .. ... . .. .. ... . .. ... . ... .. ... . ... . .. .. .. ..Ohio .... .. ...... ... .. .. .. ... ... .. ... .. ... ... ... . .... . .. .. .. ..

Illinois ... ...... .. .. . ... . .... .. .. .... .. .. .. .... .. ... ... .. ... .Michigan . .... ..... . ... .. .. .. .... .. .. .. .. .. .. .. .... .. .. .. ...New Jersey ... .. .... .. .. .. .. .... .. .. .. .. .. .. .. .. .. .. .... ..Mtissachusetts .. ...... .... .. .. .. ... . .. .. ... .. ... .. ... .. ..

Florida ... ... .... . .. .. . .. . .. .. .. .... . .. ... ... . .... ... .. .. .. .

lndiana ... .... .. ... . .. .... .. .. .... .. .. .. .... .. .. .. . .... . ... .Virginia ... ... ... .. .. .. .. . ... .. .. ... ... .. ... .. .. . .. .. ... .. ..Georgia .. .... .. .. ... .... . .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .Missouri .. ..... ..i . ... . .. .. .... .. .. .. .. .. .. .. .. ... .... . .. .. .North Carolina .... .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .

Wisconsin ... ....... ... . .. .. .. .. .. .. .. .. .. .. .. .. .. .. ... . .. .Tennessee .. ... .. .. .. . ... . ... .. .. .. .. .. .. .. .. .. .. ... . .. .. ..Louisiana . .. . .... .. ... ... .. .. . ... .. .. . .. ... .... .. .. .... . .. .Maryland . .... .. .. .. ... . .... .. .. .. .. .. .. .. ... .. . .. .. ... . .. .Minnesota . ... .... ... .. ... .. .... .. .. .. .. .... .. .. ... . .. .. .. .

Connecticut . ..... .. .. .. .. . ..... .... .... .. ... . .. . ... .. ... .

Kentucky ... .... .. . ... ... .. .. .. .. .. ... . .. .. .. .. .... ... . ...Alabama .... .. ..... .. ..... .. .. .. .. ... .. .... . ... . ..... .. .. ..Washington .... .. ..... ... .. .. ... . .. . . .. ... ... .. ... ... .. .. .South Carolina ... ... .. .. .. ... . .... .. ... . .. .. .. .... .. .. .

1970sample

size

116,401

11,49710,0176,9676,6536,433

6,2745,2614,5813,6333,156

2,8162,7942,7782,6102,601

2,5852,5152,4032,2262,061

1,9561,9011,8441,7051,643

1969-71sample

size

382,543

37,50932,78923,00322,32820,941

20,55117,02314,57611,37811,035

9,6548,8468,4758,5698,802

9,1818,4707,6627,6826,676

6,5225,6495,8856,2295,308

State

Mississippi . .. . ..... .. .. .. .... ... . .. .. .. .. .. .. .. .. .. ... . .. .lowa .. .. .. .. ... .. .... . .. .. .. .... .... . ... .... . ... .. ..... .... . .Oklahoma ... ... .... .. .. .... ... . .. .. .. .. .. .. .. .. .. .. .. .. .. .Colorado .. .. .. ... .. .. .. .... .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .

Nebraska .. . ...... .. .. .. .. .... .. .... .. .. .... .. .. .. .. .. .. .. .

Oregon ... .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .... .. .. .. ..Kansas ... ... .. .. . .. ... ... .. .. .. .... .. .... .. ... .. . ... .. ... .. .Arizona ... .. ... ... .. ... .. ..... . .. .. .. .. .. .. .. .. .. .. .. .. .. ..West Virginia ..... . ... . .. .. .. .. .. .. .. .. .. .. .. .. .. .. .... ..Arkanses .. . .. .. .. .. .. .. .. .. ...%.... .. .. .. .. .. .... .. .... .. ..

Rhode island . ... .... .. .. .... .. .. ... ... .. ... . .. .. .... .. ..

Maine .... .. .. .. ... .. .. .. ... .... . ... ... .. ... .. ... .... .. .. .. ..Utah .. . ... .. .. .. .. .. .. .. .. ... . .. ... . ... .. ... .. .. .. .... .. .. .. .New Mexico ... .. . ... ... .. ... ... .. .... .. .. ... . .. .. ... ... .Hawaii ... .. .. .. .. .. .. .. .. .. .... .. .. .. .. .. .. .. .. .. .. .. .. .. .. .

Delaware . .. .. .. .. .. .. .. .. .... .. .... .. .. .. .. .. .. .... .. .. .. .

District of Columbia ... ... .... . .... . .. .. . .. .. ... ... ..ldaho . . ... ..... .. .. .. .. .... .. ... ... .. .. ... ... ... . .... .. .. .. ..North Dakota ..... .. .. .. .. .. ... . .... .. .. .... .. ... . .... . .New Hampshire .. .. .. .. . .. .. .. .. .. .. .. .... .... .. .... .. .

South Dakota .... .. ... . ... .. ... .. .... .. .. .. ... ... ... .. ..Montana .. .. .. .. .. .. .. .. .. .. .. .. ... . .. .. .. ... . .. .. .. ... .. ..Alaska ... .. . .. .. .. .. .. .. .. .. .. .. ... .. ... .. .. .. ... .... .. .. . ..

Wyoming .. .. ... .. .. .. ... ... ... ... . .. .. ... ... .. .. .. .. ... .. .Vermont . .. ... ..... .. .. .. ... ... .. ... ... .. .. ... .. ... . .... .. .Nevada .. ... .. .. . .. .. ..... .. .. .. . .. .. ... ... .. ... ... .. .. ... ..

-3.3700 x 10-40.1349X 104

-1.3076 X 10–40.0523 X 104

‘?

-9.3670 X 10-40.3747 x 104

–5.1814 X 10–4

0.2073 X 104

1970sample

size

1,5221,4531,2601,106

1,045

1,0301,025

904901886

773

761

700581510

447

429376332328

269267224

177

13649

1969-71sample

size

5,1485,1903,9033,617

3,275

3,3923,2213,5013,0862,846

2,448

2,5072,2801,8441,466

1,319

1,463

1,2271,131

968

9321,015

635

602463321

13

Page 20: Vital and Health Statistics; Series 2, No. 82 (12/79)5. Age group: under 17 years, 17-44 years, 45-64 years, 65 years and over Family size: fewer than five members, five members or

6,0 -

5.0 -

4.0 -

~o3.0 - 0

00

2.0 - 8

1.0 -

0.000 1.0 2.0 3.0 4.0 5.0 6.0

ACTUAL

6.0 -

5.0 -

4.0 -

3.0 -

2.0 -

1.0 -

0.00.0 1.0 2.0 3.0 4.0 5.0 6.0

ACTUAL

Figure 1. Percent of the population under 1 year of aga–simple Figure I I 1. Percent of tha population under 1 yaar of age–1 6-direct estimates and actual values for 50 Statas and theDistrict of Columbia: Haalth Interview Sutvey, 1669-71

cell synthetic estimates and actual values for 50 Statas andthe D“istrict of Columbia: Health Interviaw Survey, 1969-71

6.0 - .

5.0 -

4.0 -

3.0 - 000

2.0 - 0

1.0 .

0,00,0 1.0 2.0 3.0 4.0 5.0 6.0

ACTUAL

. Figure II. Percent of the population under, 1 year of age–post-stratified estimates and actual values for 50 States and the

District of Columbia: Health Interview Survey, 1969-71

a.o –

5.0 –

4.0 -

3.0 -

0

2.0 -

1.0 –

0.00.0 1.0 2.0 3.0 4.0 5.0 a.o

ACTUAL

Figure IV. Percent of the population undar 1 year of age++cell synthetic estimates and actual valuas for 50 States and

the District of Columbia: Health Interview Survay, 1969-71

14

Page 21: Vital and Health Statistics; Series 2, No. 82 (12/79)5. Age group: under 17 years, 17-44 years, 45-64 years, 65 years and over Family size: fewer than five members, five members or

60,0 -

60.0 -

40.0 -

30.0 -

20.0 -

10.0

0.00.0 10.0 20.0 30.0 40,0 50.0 60.0

ACTUAL

60.0

5000

0

40.0 –

30.0 -

20.0 -

10.0 -

0.00.0 10.0 20.0 300 40.0 50.0 60.0

ACTUAL

Figure V. Percent of the population who are married-simple Figure VII. Percent of the population who are married–l6-

direct estimates and actual values for 50 States and the cell svnthetic estimates and actual values for 50 States and

District of Columbia: Haalth Interview Survey, 1969-71 the D’istrict of Columbia: Health Interview Survey, 1969-71

0.00.0 10.0 20.0 30,0 40.0 50,0 60,0

ACTUAL

600 –

500 -e

o

40,0 -

30.0 -

200 –

100 –

0.00.0 10.0 20.0 30.0 40,0 50.0 60.0

ACTUAL

stratified estimates and actual values for 50 States and theDistrict of Columbia: Health I ntewiew Survey, 1969-71

Figure V1. Parcent of the population who are married–post- Figure VI 11. Percent of the population who are married-64-

cell synthetic estimates and actual values for 50 States andthe District of Columbia: Health Interview Survey, 1969-71

15

Page 22: Vital and Health Statistics; Series 2, No. 82 (12/79)5. Age group: under 17 years, 17-44 years, 45-64 years, 65 years and over Family size: fewer than five members, five members or

6,0 –

5.0 -

4.0

3,0 -

00

20 - 0

0

1,0 -

0,0 1.0 2.0 3.0 4.0 5.0 6.0

ACTUAL

6.0 –

5.0 -

4.0 -

3.0 - 0

2.0 – 4

0.00.0 1.0 2.0 3,0 4,0 5.0 6,0

ACTUAL

dir’ect estimates and actual values for 50” States and ~heDistrict of Columbia: Health Interview Survey, 1969-71

F;gure IX. Parcant of the population who are separated–simrzle Figure Xl. Percent of the population who are separated–l6-cell svnthetic estimates and actual values for 50 Statas andthe &strict of Columbia: Health Interview Survey, 1969-71

6,0 -

5,0 -

4.0 -

3.0 -

00

2.0 -

0

01.0 -0

0.00.0 1.0 2.0 3,0 4.0 5.0 6,0

ACTUAL ACTUAL

Figure X. Percent of the population who are separated–post- Figure Xl 1. Percent of the population who are separated–64-stratified estimates and actual values for 50 States and theDistrict of Columbia: Health [ntarview su~ey, IW9-71

cell synthetic estimates and actual values for 50 States andthe District of Columbia: Health Interview Survey, 1969-71

16

Page 23: Vital and Health Statistics; Series 2, No. 82 (12/79)5. Age group: under 17 years, 17-44 years, 45-64 years, 65 years and over Family size: fewer than five members, five members or

14,0

12.0

L 10.0 100

0

0

Y“/

2.0

t/

oo~0,0 2.0 4.0 6,0 8,0 10.0 12.0 14,0

ACTUAL

14.0!-

12.0 -

10.0 -

6.0 -

6.0 -

0

4,0 -

2.0 -

0,00.0 2.0 4.0 6.0 6.0 10.0 12.0 14.0

ACTUAL

Figure Xl I I. Percent of the population who have completed col- Figure XV. Parcent of the population who have completed col-

lege–simple direct estimates and actual values for 50 States lege–16-cell synthetic estimates and actual values for 50

end the District of Columbie: Health Interview Survey, States and the District of Columbia: Health 1nterview

1969-71 Survey, 1969-71

1(J,U

$

n 8,0[

2,0

L0.0.

/

0

/>

/.-36

/

I I ! 1 1 10.0 2.0 4,0 6.0 8,0 10.0 12.0 140

ACTUAL

Figure XIV. Percent of the population who have completed col-

Iege–poststratified estimates and actual values for 50 States

and the District of Columbia: Heelth Interview Suwey,

1969-71

14.0~

12.0 -

E

2 10,0 -1- 0Evc 6.0 -.f ~o;o

z~

6.0 -A ~:

iiv 0s 4.0 -

2.0 -

0.00.0 2.0 4.0 6.0 a.o 10.0 12.0 14.0

ACTUAL

Figure XVI. Percent of the population who have completed col-lege–64-cell synthetic estimates and actual values for 50

States and the District of Columbia: Health Intewiew

Suwey, 1969-71

17

Page 24: Vital and Health Statistics; Series 2, No. 82 (12/79)5. Age group: under 17 years, 17-44 years, 45-64 years, 65 years and over Family size: fewer than five members, five members or

1,6

1.4

I

o

1.2 0

Kg 1.0KUJ

n0.6

%!<3:

0.6

0.4 ~~ ~

*

0.00 5,0S+2 10,000 15,000 20,000 25,00Q 30,000 35,000 40,030

SAMPLE SIZE

16.0

16,0 0

14.0

= 12.0

0Ka. 10.0 40u

2 S.o>

~

6.o ,* Q

2.0 -

0.00 5,0W 10,000 15,000 20,000 25,00+3 30,000 35,OOO 40,000

SAMPLE SIZE

Figure XVI 1. Squarad errors of simple direct estimates of the Figure XIX. Squared errors of simple direct estimates of the

percent of State population less than 1 year of aga, by sam- percent of Stata population who are married, by sampleple size: Unitad States, 1889-71 size: United States, 1889-71

1f

1.4

1.2

cco 1.0KcIll0. 0.8

;

aIn 0,6

0.4

0.2

0.00 5.o130 10,000 15,oao 20.000 25,000 30,00+2 35,000 40,co0

SAMPLE SIZE

1a.oo

16.0

14.0

12.0Kou v

2 10.0 -

~

cc~ 6.o

g

a.o - Q

2.0 -

0.00 6,000 10.OW 15,0M 20,000 25,000 30,00+3 35,0W 40,000

SAMPLE SIZE

percent of State population less than 1 year of age, by sam-

ple size: United States, 1869-71

Figure XVI II. .%ruared errors of Doststratified estimates of the Figura XX. Squarad errors of poststratified estimatas of thepercent of State population who are married, by samplesize: United Statas, 1889-71

18

Page 25: Vital and Health Statistics; Series 2, No. 82 (12/79)5. Age group: under 17 years, 17-44 years, 45-64 years, 65 years and over Family size: fewer than five members, five members or

0,6

0,4

zCc 0.3Ew0.

2

2 0.2.

0.1

0,0

0

0~o o

) %

o 0

5,000 10,000 15,000 20,000 25,000 30,000 35,000 40,000

140

12.0

[

1e

10.0

80

6,0

40

I 10

::~o 5,000 10,000 15,000 20,000 25.000 30.000 35,000 40,000

SAMPLE SIZE

Figure XXI. Squared errors of simple direct estimetes of the Figure XXI II. Squered errors of simple direct estimates of the

percent of State population who ara separated, by sampla

size: United States, 1969-71percent of State population who completed college, by

sample size: United States, 1969-71

I

0.4

c0

:. 0.3

g

c<

2“, 0.2

01

0 000

0 5,000 10,00015,000 20,000 25,000 30,000 35,000 40,000

SAMPLE SIZE

Figure XXI 1. Squared errors of poststratified estimates of the

percent of State population who are separated, by sample

size: United States, 1%9-71

140

120

CC 10.0~

:0 80.u<30m 6,0

4.0

2.0

, “!

0 5,oOO 10,000 15,000 20.000 25,000 30,000 35,00040,000

SAMPLE SIZE

Figure XXIV. Squared errors of poststratified astimates of the

percent of State population who completed collega, by

sample size: United States, 1969-71

*U.S. Q3VESNMENT PR2NTING OFFICE : 1979 0.311-240/3e19

Page 26: Vital and Health Statistics; Series 2, No. 82 (12/79)5. Age group: under 17 years, 17-44 years, 45-64 years, 65 years and over Family size: fewer than five members, five members or

VITAL AND HEALTH STATISTICS Series

Series 1. Programs and Collection Procedures. –Reports which describe the general programs of the NationalCenter for Heafth Statistics and its offices and divisions and data collection methods used and includedefinitions and other material necessary for understanding the data.

Series 2. Data Evaluation and Methods Research, –Studies of new statistical methodology including experi-mental tests of new survey methods, studies of vital statistics collection methods, new analyticaltechniques, objective evaluations of reliability of collected data, and contributions to statistical theory.

Series 3. Analytical Studies. –Reports presenting analytical or interpretive studies based on vital and healthstatistics, carrying the analysis further than the expository types of reports in the other series.

Series 4. Documents and Committee Reports. –Final reports of major committees concerned with vital andheafth statistics and documents such as recommended model vital registration laws and revised birthand death certificates.

Series 10. Data Fro m the Health Interview Survey. –Statistics on illness, accidental injuries, disability, use ofhospital, medical, dental, and other services, and other health-related topics, all based on data collectedin a continuing national household interview survey.

.$erics 11. Data From the Health Examination Survey and the Health and Nutrition Examination Survey .–Datafrom direct examination, testing, and measurement of national samples of the civilian noninstitu-tionalized population provide the basis for two types of reports: (1) estimates of the medically definedprevalence of specific diseases in the United States and the distributions of the population with respectto physicaf, physiological, and psychological characteristics and ( 2) analysis of relationships among thevarious measurements without reference to an explicit finite universe of persons.

Series 12. Data Fro m the Institutionalized Population Survey s.–Discontinued effective 1975. Future reports fromthese surveys will be in Series 13.

Series 13. Data on Health Resources Utilization. –Statistics on the utilization of health manpower and facilitiesproviding long-term care, ambulatory care, hospital care, and family planning services.

Series 14. Data on Health Resources: Manpower and Facilities. –Statistics on the numbers, geographic distri-bution, and characteristics of health resources including physicians, dentists, nurses, other healthoccupations, hospitals, nursing homes, and outpatient facilities.

Series 20. Data on Mortality. –Various statistics on mortality other than as included in regular annual or monthlyreports. Special analyses by cause of death, age, and other demographic variables; geographic and timeseries anafyses; and statistics on characteristics of deaths not available from the vital records based onsample surveys of those records.

Series 21. Data on Natality, Marriage, and Divorce. –Various statistics on natafity, marriage, and divorce otherthan as included in regular annual or monthly reports. Special analyses by demographic variables;geographic and time series analyses; studies of fertility; and statistics on characteristics of births notavailable from the vital records based on sample surveys of those records.

Series 22. Data From the National Mortality and Natality Survey s.–Discontinued effective 1975. Future reportsfrom these sample surveys based on vital records will be included in Series 20 and 21, respectively.

Sen”es 23. Data From the National Survey of Family Gro zoth. -Statistics on fertility, family formation and dis-solution, famify planning, and related maternal and infant health topics derived from a bienniaI surveyof a nationwide probability sample of ever-married women 15-44 years of age.

For a list of titles of reports published in these series, write to: Scientific and Technical Information BranchNational Center for Health StatisticsPublic Health ServiceHyattsville, Md. 20782

Page 27: Vital and Health Statistics; Series 2, No. 82 (12/79)5. Age group: under 17 years, 17-44 years, 45-64 years, 65 years and over Family size: fewer than five members, five members or