unido.org/ statistics International workshop on industrial statistics 8 – 10 July, Beijing Non response in industrial surveys Shyam Upadhyaya


International workshop on industrial statistics

8 – 10 July, Beijing

Non response in industrial surveys

Shyam Upadhyaya

What is non-response?

• Failure to obtain data from some units of the target population of a survey

• Unlike in surveys of the human population, which is relatively homogeneous, non-response may create serious problems in industrial surveys

• Larger establishments account for a higher share of the estimated totals, as well as of the variance, of key variables

• A certain amount of non-response is always expected. Thus, a plan for non-response treatment should be prepared a priori.

How does non-response affect the survey? Conceptual framework


Response rate

                    In the frame              Not in the frame           Total

Within scope        A1                        A2                         In scope = A1 + A2

Outside the scope   B1                        B2                         Outside the scope = B1 + B2

Total               In the frame = A1 + B1    Missing units = A2 + B2

The response rate is the ratio of the number of statistical units actually observed to the number of units eligible for the survey.

This ratio cannot be determined exactly when the frame is imperfect, because the number of in-scope units missing from the frame (A2) is not known.


[Diagram: classification of the total number of statistical units in the register]

Statistical units in the survey frame: units within the scope of survey (A1); units outside the scope of survey

Statistical units not in the frame: units within the scope of survey (A2); units outside the scope of survey

Other categories shown in the diagram: respondents; non-respondents; refusals; no contacts; units identified as other activity; unidentified units – non-existent; permanently closed units; temporarily closed units

The existing frame should be updated with additional information from listing operations or administrative sources.

Measurement of response rates

• Unit response rate

Particularly important for monitoring the progress of survey

• Weighted response rate

Share of respondents in the total value of a key variable of the survey (in a sample survey, w denotes the design weights)

For survey estimates, WRR carries more value because it reflects the actual coverage, and thus the representativeness, of the survey

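$$URR = \frac{R}{M}, \qquad M = A_1 + A_2$$

$$WRR = \frac{\sum_{i=1}^{R} w_i y_i}{Y}$$

Here R is the number of responding units, M the number of eligible (in-scope) units, y_i the value of the key variable for respondent i, w_i its weight, and Y the total value of the key variable.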
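The two rates described above can be sketched in a few lines of Python. This is only an illustration: the function names and the figures are made up, and the total Y is assumed to be known for all eligible units (for example from the frame or a previous period), which will not always hold in practice.

```python
def unit_response_rate(n_respondents, a1, a2):
    """URR = R / M, where M = A1 + A2 is the number of eligible units."""
    return n_respondents / (a1 + a2)


def weighted_response_rate(respondent_values, total_value, weights=None):
    """WRR = sum of w_i * y_i over respondents, divided by the total Y.

    Weights default to 1 for every unit (census case).
    """
    if weights is None:
        weights = [1.0] * len(respondent_values)
    weighted_sum = sum(w * y for w, y in zip(weights, respondent_values))
    return weighted_sum / total_value


# Hypothetical census of five eligible establishments (made-up output values).
output = {"A": 500.0, "B": 300.0, "C": 100.0, "D": 60.0, "E": 40.0}
responded = ["A", "B", "C"]  # the three largest establishments responded

urr = unit_response_rate(len(responded), a1=5, a2=0)
wrr = weighted_response_rate([output[k] for k in responded], sum(output.values()))
print(f"URR = {urr:.2f}, WRR = {wrr:.2f}")
```

With these figures three of five units respond, but they hold 900 of the 1000 units of output, so WRR is well above URR.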

Variation of URR and WRR by sub-population

URR and WRR are rarely equal because establishments vary in size. If a better response is achieved from larger establishments, WRR is higher than URR.


Types of non-response

1. Unit non-response – when there is no response from some statistical units

2. Item non-response – when some statistical units provide incomplete data (data missing for some variables within the unit)

3. Wave non-response – may occur in panel surveys, when some statistical units respond in one round but not in another


How to handle non-response?

Treatment of non-response depends on the type of non-response as well as the type of survey

                 Unit non-response                                                                        Item non-response

Sample survey    Weight adjustment to reflect the reduction of sample size                                Imputation

Census           No internal solution; external sources such as administrative data or past survey data   Imputation


Unit non-response in sample survey

Weight adjustment:

design weight: $$w = \frac{N}{n}$$

estimation weight: $$w' = \frac{1}{URR} \cdot \frac{N}{n}$$

Non-response in a sample survey is treated as a reduction of the sample size. The design weight is therefore inflated, on the assumption that non-response has occurred at random.

In census: there are no weights to adjust. Other ways to compensate for unit non-response are:

• administrative data, or

• earlier survey data adjusted with applicable growth rates
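A minimal sketch of the weight adjustment for a sample survey, assuming simple random sampling of n units from a population of N, and assuming every sampled unit is eligible so that URR is the number of respondents divided by n. The function names and figures are illustrative only.

```python
def design_weight(population_size, sample_size):
    """Design weight w = N / n."""
    return population_size / sample_size


def estimation_weight(population_size, sample_size, n_respondents):
    """Design weight inflated by 1 / URR, with URR = respondents / n."""
    urr = n_respondents / sample_size
    return design_weight(population_size, sample_size) / urr


# Hypothetical survey: 100 units sampled out of 1000, 80 of them respond.
w = design_weight(1000, 100)
w_adj = estimation_weight(1000, 100, 80)
print(f"design weight = {w:.2f}, estimation weight = {w_adj:.2f}")

# The adjusted weights of the 80 respondents again add up to the population size.
print(f"sum of adjusted weights = {80 * w_adj:.1f}")
```

The adjustment is equivalent to computing N divided by the number of respondents, which is why it only removes bias if non-respondents do not differ systematically from respondents.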


Imputation for non-response

• Imputation is a technique for finding artificial values to replace data that are missing due to non-response

• The basic consideration is that the imputed value is taken from the observed value of a statistical unit that is quite similar to the non-respondent

• Imputation is particularly effective for item non-response.

Many variables in an industrial survey are highly correlated; therefore the means and ratios of observed units may serve as predictors for non-respondents
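As an illustration of using a ratio of observed units as a predictor, the sketch below imputes a missing output value from the number of employees. The record layout, field names and figures are hypothetical; the ratio is taken over all responding units, which is one possible choice among several.

```python
def ratio_impute(records, target, auxiliary):
    """Replace missing `target` values by ratio * auxiliary value.

    The ratio is sum(target) / sum(auxiliary) over units that reported `target`.
    """
    observed = [r for r in records if r[target] is not None]
    ratio = sum(r[target] for r in observed) / sum(r[auxiliary] for r in observed)

    completed = []
    for rec in records:
        rec = dict(rec)  # copy, so the input records are not modified
        rec["imputed"] = rec[target] is None
        if rec["imputed"]:
            rec[target] = ratio * rec[auxiliary]
        completed.append(rec)
    return completed


# Hypothetical establishments; unit 3 reported employees but not output.
records = [
    {"id": 1, "employees": 120, "output": 2400.0},
    {"id": 2, "employees": 80, "output": 1500.0},
    {"id": 3, "employees": 100, "output": None},
    {"id": 4, "employees": 50, "output": 1100.0},
]

completed = ratio_impute(records, target="output", auxiliary="employees")
for rec in completed:
    print(rec["id"], rec["output"], "(imputed)" if rec["imputed"] else "")
```

Keeping an `imputed` flag on each record makes it possible to report later how much of an estimate rests on imputed values.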

Some imputation methods

• Imputation based on mean value

Missing data are estimated by the mean value of observed units

Effective for homogeneous statistical units, for example within a size class of an industry group at the 4-digit level of ISIC

• Hot deck imputation

Missing data are replaced by the values of observed units. For this purpose a pool of "donors" is created. Under the random hot-deck method, "donors" are selected at random

Alternatively, a "donor" can be the nearest neighbour. This method is called the deterministic hot-deck method.
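Mean imputation and random hot-deck imputation within homogeneous classes might look as follows. This is a sketch under stated assumptions: the classes, field names and values are invented, and the random generator is seeded only to make the example repeatable.

```python
import random
from statistics import mean


def impute_by_class(records, target, class_key, method="mean", seed=0):
    """Fill missing `target` values from observed units in the same class."""
    rng = random.Random(seed)  # fixed seed so the random hot-deck is repeatable
    donors = {}
    for rec in records:
        if rec[target] is not None:
            donors.setdefault(rec[class_key], []).append(rec[target])

    completed = []
    for rec in records:
        rec = dict(rec)  # copy, so the input list is left untouched
        if rec[target] is None:
            pool = donors[rec[class_key]]  # raises KeyError if a class has no donor
            rec[target] = mean(pool) if method == "mean" else rng.choice(pool)
        completed.append(rec)
    return completed


# Hypothetical records; the size classes and values are made up.
records = [
    {"id": 1, "size_class": "small", "output": 10.0},
    {"id": 2, "size_class": "small", "output": 14.0},
    {"id": 3, "size_class": "small", "output": None},
    {"id": 4, "size_class": "large", "output": 200.0},
    {"id": 5, "size_class": "large", "output": 240.0},
    {"id": 6, "size_class": "large", "output": None},
]

mean_filled = impute_by_class(records, "output", "size_class", method="mean")
hot_filled = impute_by_class(records, "output", "size_class", method="hot_deck")
print([r["output"] for r in mean_filled])
print([r["output"] for r in hot_filled])
```

Mean imputation gives every missing unit in a class the same value, while the random hot-deck draws an actually observed value from the donor pool of that class.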


Example: Imputation with nearest neighbour method

Est_ID   Number of employees   Sale     Distance [Abs]   Replacing value

4781     989                   144560

4782     895                   147675   109

4783     786                   …                         ← 128589

4784     771                   128589   15

4785     653                   101868

4786     554                   84762

4787     321                   68150

4788     205                   30135    7

4789     198                   …                         ← 30135

4790     106                   25946    92
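The nearest neighbour example can be reproduced with a short Python sketch. It uses the figures from the table and assumes, as the distance column suggests, that similarity is measured by the absolute difference in the number of employees; the function name and data layout are illustrative.

```python
# (Est_ID, number of employees, sale); None marks a missing sale value.
establishments = [
    (4781, 989, 144560),
    (4782, 895, 147675),
    (4783, 786, None),
    (4784, 771, 128589),
    (4785, 653, 101868),
    (4786, 554, 84762),
    (4787, 321, 68150),
    (4788, 205, 30135),
    (4789, 198, None),
    (4790, 106, 25946),
]


def nearest_neighbour_impute(units):
    """Return {Est_ID: sale}, filling gaps from the donor closest in employees."""
    donors = [(emp, sale) for _, emp, sale in units if sale is not None]
    result = {}
    for est_id, emp, sale in units:
        if sale is None:
            # Donor with the smallest absolute distance in number of employees.
            _, sale = min(donors, key=lambda d: abs(d[0] - emp))
        result[est_id] = sale
    return result


filled = nearest_neighbour_impute(establishments)
print(filled[4783], filled[4789])
```

Establishment 4783 (786 employees) takes its value from 4784 at distance 15 rather than from 4782 at distance 109, and 4789 takes its value from 4788 at distance 7 rather than from 4790 at distance 92.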

Imputation methods (cont.)

• Cold deck methods

As opposed to hot-deck (the term refers to punch cards), the cold deck method is based on past data

• Post stratification

Statistical units are further stratified to create homogeneous groups from which the mean, median or ratio is computed to replace the missing value

• Statistical modelling

Regression or similar models are constructed, where the regression coefficients (or parameters) may serve as predictors of the missing value
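For the modelling approach, a simple linear regression of the variable with missing values on a fully reported variable is one possible sketch. The choice of a one-variable least-squares line, the variable names and the figures are assumptions made for illustration, not a method prescribed by the slides.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical respondents: employees and reported output (made-up values).
employees = [10, 20, 30, 40]
output = [25.0, 45.0, 65.0, 85.0]

intercept, slope = fit_line(employees, output)

# A unit with 25 employees did not report output; predict it from the model.
predicted = intercept + slope * 25
print(f"intercept = {intercept:.2f}, slope = {slope:.2f}, imputed = {predicted:.2f}")
```

The made-up data lie exactly on a line, so the fit is perfect here; with real survey data the residual scatter would need to be examined before relying on the prediction.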

Imputation for unit non-response using external data sources

• Administrative data

In case of unit non-response, there is no information from the survey. Alternatively, data for some key variables might be obtained from administrative sources.

• Data from the previous survey

Often termed carry-forward: the major values are replaced by results from the earlier survey. Effective for quarterly/monthly surveys

For annual surveys and surveys with a longer interval, growth adjustment is necessary
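Carry-forward with an optional growth adjustment can be sketched as below. The function name, the compound-growth form and the 5% rate are assumptions chosen for illustration; the source of the growth rate (for example the growth observed among respondents in the same industry) is left open.

```python
def carry_forward(previous_value, growth_rate=0.0, periods=1):
    """Project an earlier reported value forward by compound growth.

    growth_rate=0.0 gives a plain carry-forward of the earlier value.
    """
    return previous_value * (1.0 + growth_rate) ** periods


# Hypothetical non-respondent that reported an output of 1000 in the last survey.
plain = carry_forward(1000.0)                       # monthly/quarterly case
adjusted = carry_forward(1000.0, growth_rate=0.05)  # annual case, assumed 5% growth
print(f"plain = {plain:.1f}, growth-adjusted = {adjusted:.1f}")
```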


Some other points on non-response

• Imputation does not necessarily reduce the bias; in a sample survey it may even increase the standard error

• Unlike household surveys, where ratio and mean estimates are important, industrial surveys are supposed to produce total measures, such as industrial output and employment

• Imputation for missing data helps to improve the coverage of the survey estimates

• Imputation for a large database requires a carefully developed software application