Quality indicators for statistics based on multiple sources

Page 1: Quality indicators  for statistics based on multiple sources

Eurostat

Quality indicators for statistics based on multiple sources

Mihaela Agafiţei, Fabrice Gras, Wim Kloek, Sorina Vâju

Eurostat, European Commission

Page 2: Quality indicators for statistics based on multiple sources

Content

1. Introduction

2. Quality of statistics – general discussion

3. Output quality assessment – input and process

4. Direct output quality assessment

5. Conclusions

Q2014 – Vienna – 5th of June, 2014

Session No 32 - Statistics beyond survey and administrative data

Page 3: Quality indicators for statistics based on multiple sources

1. Introduction

Challenges

• Reduce response burden

• Reduce cost of raw data collection

• Increase ability to face new demands

• Increase ability to produce more detailed statistics

Increase use of administrative data sources

• Direct use

• Use in sampling frame

• Auxiliary information

• Calibration

Output quality assessment

• Can it take the integration effect into account?

• Can it take the variety of statistical approaches into account?

• Can the advantages be offset by possible decreases in quality?

Page 4: Quality indicators for statistics based on multiple sources

2. Quality of statistics – general discussion

Input

• Quality of raw data

• Whether and how a given data source can be used on a regular basis to produce statistics

Process

• Whether final data is “real”

• Magnitude of errors introduced in the processing stage

• Analysis of the statistical process

Output

• Easy-to-understand information for users on the quality of the final data

Page 5: Quality indicators for statistics based on multiple sources

2. Quality of statistics – general discussion

Reference documents:

• The ESS Code of Practice

• The ESS Standard for Quality Reports

• ESS Handbook for Quality Reports

Quality dimensions:

• Relevance

• Accuracy & Reliability

• Timeliness & Punctuality

• Coherence & Comparability

• Accessibility & Clarity

Page 6: Quality indicators for statistics based on multiple sources

3. Output quality assessment: input and process

Not feasible:

• multiple sources

• multiple uses

• large and complex processes

• certainly at the European level

Page 7: Quality indicators for statistics based on multiple sources

3. Output quality assessment: input and process

| Process step | Risk | Impacted quality dimension | Error measurement |
|---|---|---|---|
| Linkage and determination of the target population | Missed link, wrong link: under/over coverage | Accuracy, comparability | Bias, confidence range of the target population |
| Concept/definition | Aggregation of different concepts/definitions | Relevance, accuracy, comparability | Bias, variance error, qualitative assessment |
| Imputation/estimation | Estimation error | Accuracy | Bias, variance error |
| Classification | Wrong classification | Relevance, accuracy, comparability below a certain level of aggregation | Bias, variance error |
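To make the linkage row concrete, here is a minimal sketch of how missed and wrong links could translate into a coverage correction. It is not a method from the presentation: the audited sample, the function names and the figures are all hypothetical, and it assumes the error rates observed in a small hand-checked sample apply to the whole linked file.

```python
def linkage_error_rates(audited_pairs):
    """Return (missed_link_rate, wrong_link_rate) from an audited sample.

    audited_pairs is a list of (linked, is_true_match) booleans, one per
    record pair checked by hand (a hypothetical clerical review).
    """
    links = sum(1 for linked, _ in audited_pairs if linked)
    true_matches = sum(1 for _, is_true in audited_pairs if is_true)
    missed = sum(1 for linked, is_true in audited_pairs if is_true and not linked)
    wrong = sum(1 for linked, is_true in audited_pairs if linked and not is_true)
    # Missed links are a share of the true matches (undercoverage risk);
    # wrong links are a share of the links made (overcoverage risk).
    missed_rate = missed / true_matches if true_matches else 0.0
    wrong_rate = wrong / links if links else 0.0
    return missed_rate, wrong_rate


def corrected_population_size(n_linked, missed_rate, wrong_rate):
    """Adjust a raw count of linked units for both kinds of linkage error."""
    if missed_rate >= 1.0:
        raise ValueError("missed-link rate must be below 1")
    # Correct links = n_linked * (1 - wrong_rate); these are only a share
    # (1 - missed_rate) of the units that should have been linked.
    return n_linked * (1.0 - wrong_rate) / (1.0 - missed_rate)


# Hypothetical audit: 90 correct links, 10 wrong links, 5 missed links.
audit = [(True, True)] * 90 + [(True, False)] * 10 + [(False, True)] * 5
missed_rate, wrong_rate = linkage_error_rates(audit)
estimate = corrected_population_size(1000, missed_rate, wrong_rate)
print(round(missed_rate, 4), round(wrong_rate, 4), round(estimate, 1))
```

The gap between the raw linked count and the corrected figure gives a rough indication of the coverage bias; a confidence range would additionally require the sampling variability of the audit.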

Page 8: Quality indicators for statistics based on multiple sources

4. Direct output quality assessment

• Direct assessment of output quality from the output itself

• Assessment of output quality with a common reference data source

• Bootstrapping

These methods do not replace the input + process approach

Page 9: Quality indicators for statistics based on multiple sources

4. Direct output quality assessment

Direct assessment of output quality from the output itself

• time series or cross-sectional data

• breaks in series are a direct indication of bias

• revisions

• outliers
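As an illustration of screening the output itself, the sketch below flags unusually large period-to-period changes in a time series using a robust z-score (median and median absolute deviation). The technique, the threshold of 3.5 and the example series are assumptions chosen for illustration, not part of the presentation.

```python
from statistics import median


def flag_unusual_changes(series, threshold=3.5):
    """Return the positions in `series` reached by an unusually large change.

    Works on first differences: a break in the series shows up as one
    extreme difference, an isolated outlier as two opposite extremes.
    """
    diffs = [b - a for a, b in zip(series, series[1:])]
    if not diffs:
        return []
    med = median(diffs)
    mad = median([abs(d - med) for d in diffs])
    if mad == 0:
        # No spread in the typical changes: the score cannot be scaled.
        return []
    flagged = []
    for position, d in enumerate(diffs, start=1):
        # 0.6745 rescales the MAD to be comparable with a standard deviation.
        score = 0.6745 * (d - med) / mad
        if abs(score) > threshold:
            flagged.append(position)
    return flagged


# Hypothetical series with a level shift between positions 4 and 5.
values = [100, 102, 101, 103, 102, 120, 121, 119, 122, 121]
print(flag_unusual_changes(values))
```

A flagged position only signals that something changed; deciding whether it reflects a real development, a source change or a processing error still needs subject-matter review.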

Assessment of output quality with a common reference data source

• quality survey

• additional statistics or administrative sources with considerable conceptual harmonisation
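A comparison against a common reference source can be summarised with a few simple discrepancy measures. The sketch below assumes both sources publish estimates for the same, conceptually harmonised domains; the domain labels, figures and function name are hypothetical.

```python
def compare_with_reference(output, reference):
    """Compare estimates with a reference source over their shared domains.

    Both arguments map a domain label to an estimate. Returns the relative
    difference per domain, the mean relative difference (a rough bias
    indicator) and the mean absolute relative difference.
    """
    common = sorted(set(output) & set(reference))
    rel_diffs = {
        d: (output[d] - reference[d]) / reference[d]
        for d in common
        if reference[d] != 0  # skip domains where the ratio is undefined
    }
    if not rel_diffs:
        raise ValueError("no comparable domains")
    n = len(rel_diffs)
    mean_rel = sum(rel_diffs.values()) / n
    mean_abs_rel = sum(abs(v) for v in rel_diffs.values()) / n
    return rel_diffs, mean_rel, mean_abs_rel


# Hypothetical figures: domain "C" and "D" are not shared and are ignored.
multi_source = {"A": 105.0, "B": 190.0, "C": 50.0}
quality_survey = {"A": 100.0, "B": 200.0, "D": 10.0}
per_domain, bias, spread = compare_with_reference(multi_source, quality_survey)
print(per_domain, bias, spread)
```

Differences that cancel out on average but are large in absolute terms point to domain-level problems rather than an overall bias, which is why both summaries are returned.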

Page 10: Quality indicators for statistics based on multiple sources

4. Direct output quality assessment

Methods derived from bootstrapping

| Possible use | Application | Remarks | Main practical problem |
|---|---|---|---|
| As primary and/or complementary data | Yes | Existence of overlapping survey data is welcome and can significantly increase the feasibility and relevance of the method | Inference on the distribution and/or generating process of the administrative data. Detection of breaks and outliers in time series. |
| Support sampling surveys | Partially | Uncertainty can be inserted by estimating false positive and false negative probabilities | How to simulate the addition of a previously non-selected unit in the replication of the sample |
| Auxiliary information | Yes | Modelling of how randomness is channelled through the production process | Simulation of the error caused by the imputation/estimation methods |
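The last row, simulating the error caused by imputation, can be sketched with a naive bootstrap: resample the units, repeat the imputation on each replicate and look at the spread of the resulting totals. This is only an illustration under strong assumptions (units treated as independent draws, values missing at random, mean imputation as a stand-in for the real method); the data and function names are hypothetical.

```python
import random
from statistics import mean, stdev


def estimate_total(values):
    """Total after mean imputation; None marks a missing value."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to impute from")
    fill = mean(observed)
    return sum(v if v is not None else fill for v in values)


def bootstrap_standard_error(values, n_replicates=500, seed=12345):
    """Bootstrap standard error of the imputed total.

    Each replicate resamples units with replacement and re-runs the
    imputation, so the imputation step contributes to the measured spread.
    """
    estimate_total(values)  # fail early if nothing is observed
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    n = len(values)
    replicates = []
    while len(replicates) < n_replicates:
        sample = [values[rng.randrange(n)] for _ in range(n)]
        if all(v is None for v in sample):
            continue  # cannot impute in a replicate with no observed value
        replicates.append(estimate_total(sample))
    return stdev(replicates)


# Hypothetical administrative values with two missing records.
records = [10.0, 12.0, None, 9.0, 11.0, None, 13.0, 10.0]
print(estimate_total(records), bootstrap_standard_error(records))
```

In a production setting the resampling scheme would have to mirror how the data were actually generated or selected, which is the practical difficulty the table points to.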

Page 11: Quality indicators for statistics based on multiple sources

5. Conclusions

Output quality assessment through input and process quality becomes too complex in processes that combine several sources, especially at the European level

Alternative solutions should be found:

• direct output assessment

• a common reference source

• bootstrapping

Output quality assessment:

• internal use: to monitor and improve the statistical production process

• external use: a coherent summary of information on output quality

Assessing quality comes at a cost
