The Use of Administrative Sources for Economic Statistics
An Overview
Steven Vale
Office for National Statistics, UK
Contents
• Definitions
• Advantages of using administrative data
• Common problems
• Quality of administrative data
• Using administrative data in practice
• Conclusions
Narrow Definition
[Diagram: Data Sources, classified as Primary (Statistical) or Secondary (Non-statistical), and as Public Sector or Private Sector]
Wider Definition
[Diagram: Data Sources, classified as Primary (Statistical) or Secondary (Non-statistical), and as Public Sector or Private Sector]
Administrative sources are sources containing information which is not primarily collected for statistical purposes.
Reasons for this Definition
• Privatisation of some government functions
• Growth of private sector “value-added re-sellers”
• User interest in new types of data
Benefits of Administrative Data
• Cost
– Surveys / censuses are expensive, administrative data are often “free”
• Response burden
– Reduced burden on data suppliers
– Statistics can be compiled more frequently with no extra burden
Benefits of Administrative Data
• Coverage
– Full coverage of target population
– No survey errors and lower non-response
– Better small-area data
• Timeliness (sometimes!)
• Public image
– Making use of existing data can enhance the prestige of a statistical organisation by making it seem more efficient
Population Census Costs 2000-2001
• UK, €367m, €6.2 per person
• Austria, €56m, €6.9 per person
• Finland, €0.8m, €0.2 per person
Source: Eurostat – Documentation of the 2000 Round of Population and Housing Censuses in the EU, EFTA and Candidate Countries; Table 22
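As a rough illustration of the scale of these differences, the per-person figures quoted above can be compared directly. The sketch below uses only the numbers on the slide; the dictionary layout and the function name are illustrative choices, not part of the original presentation.

```python
# Figures as quoted on the slide (Eurostat, 2000 census round).
census_costs = {
    "UK": {"total_eur_m": 367.0, "per_person_eur": 6.2},
    "Austria": {"total_eur_m": 56.0, "per_person_eur": 6.9},
    "Finland": {"total_eur_m": 0.8, "per_person_eur": 0.2},
}


def per_person_ratio(costs, country, baseline):
    """Return how many times higher the per-person cost is in `country` than in `baseline`."""
    return costs[country]["per_person_eur"] / costs[baseline]["per_person_eur"]


for country in ("UK", "Austria"):
    ratio = per_person_ratio(census_costs, country, "Finland")
    print(f"{country}: about {ratio:.1f} times the Finnish cost per person")
```

Because the published per-person figures are rounded to one decimal place, the resulting ratios are indicative only.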
Common Problems
• Administrative units do not always coincide with statistical units
• Conversion via automatic rules for simple cases
• Profiling for more complex cases
– Gives a better understanding of complex business structures
– Expensive and needs trained staff
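The split between automatic conversion and profiling could be sketched as follows. This is a minimal illustration, not the method used by any particular statistical office: the `LegalUnit` structure, the "at most one VAT and one PAYE registration" rule, and the `ENT-` identifier format are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class LegalUnit:
    """A hypothetical administrative grouping: one legal unit and its registrations."""
    legal_id: str
    vat_refs: list = field(default_factory=list)
    paye_refs: list = field(default_factory=list)


def is_simple_case(unit):
    """Assumed rule: at most one VAT and one PAYE registration counts as 'simple'."""
    return len(unit.vat_refs) <= 1 and len(unit.paye_refs) <= 1


def convert(units):
    """Split units into automatically converted enterprises and a profiling queue."""
    enterprises, profiling_queue = [], []
    for unit in units:
        if is_simple_case(unit):
            # One administrative unit maps directly to one statistical unit.
            enterprises.append(
                {"enterprise_id": f"ENT-{unit.legal_id}", "legal_id": unit.legal_id}
            )
        else:
            # Complex structures are left for trained profilers to resolve.
            profiling_queue.append(unit.legal_id)
    return enterprises, profiling_queue


units = [
    LegalUnit("A1", vat_refs=["V100"], paye_refs=["P100"]),
    LegalUnit("B2", vat_refs=["V200", "V201"], paye_refs=["P200"]),
]
enterprises, profiling_queue = convert(units)
print(enterprises)
print(profiling_queue)
```

Keeping the rule in a single small function makes it easy to tighten or relax what counts as a "simple" case without touching the rest of the process.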
Common Problems
• Different definitions and classifications
– Administrative and statistical priorities are often different
– Conversion matrices needed for different classifications
• Timeliness
– Data arrive too late
– Data relate to a different time period
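A conversion matrix of the kind mentioned above can be thought of as a table of shares that redistributes values from administrative codes to statistical codes. The sketch below assumes invented codes (`ADM-01`, `STAT-A`, and so on) and invented shares; real matrices would be derived from the classifications concerned.

```python
# Hypothetical conversion matrix: each administrative code maps to one or more
# statistical codes, with shares that sum to 1 for each administrative code.
conversion_matrix = {
    "ADM-01": {"STAT-A": 1.0},
    "ADM-02": {"STAT-A": 0.25, "STAT-B": 0.75},
}


def convert_totals(admin_totals, matrix):
    """Redistribute totals by administrative code onto statistical codes."""
    stat_totals = {}
    for admin_code, value in admin_totals.items():
        for stat_code, share in matrix[admin_code].items():
            stat_totals[stat_code] = stat_totals.get(stat_code, 0.0) + value * share
    return stat_totals


# Invented turnover totals by administrative code.
admin_turnover = {"ADM-01": 100.0, "ADM-02": 200.0}
stat_turnover = convert_totals(admin_turnover, conversion_matrix)
print(stat_turnover)
```

Because the shares for each administrative code sum to 1, the overall total is preserved by the conversion.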
VAT Birth Lags
[Chart: frequency (thousands, 0 to 200) against lag in days (0 to 1000)]
VAT Birth Lags
• 2/3 of businesses are on the register within 2 months of start-up
• Mean lag = 4 months due to “outliers”
• Median = Approx. 40 days
• Some businesses pre-register, giving negative lags
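The gap between the mean and the median comes from a small number of very long lags. The sketch below shows the effect with invented lag values; they are not taken from the VAT data and are chosen only to include a couple of outliers and one negative lag.

```python
from statistics import mean, median

# Invented lags in days between start-up and VAT registration.
# The negative value stands for a business that registered before start-up.
lags_days = [-10, 15, 25, 35, 40, 45, 55, 60, 400, 900]

mean_lag = mean(lags_days)
median_lag = median(lags_days)

print("mean lag (days):", mean_lag)
print("median lag (days):", median_lag)
```

With these values the mean is several times the median, which is why the median is usually the more informative summary for skewed lag distributions.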
Common Problems
• Change management
– Risk of changes in government policy, thresholds, definitions, coverage etc.
– Need contingency plans
• Data from multiple sources
– Matching / linking issues
– Data conflicts – priority rules
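A priority rule for resolving data conflicts can be as simple as an ordered list of sources. The sketch below assumes a hypothetical ordering (survey, then VAT, then PAYE) for a single variable; in practice the order would be set per variable and based on knowledge of each source.

```python
# Assumed priority order for one variable: earlier sources win.
SOURCE_PRIORITY = ["survey", "vat", "paye"]


def resolve(values_by_source, priority=SOURCE_PRIORITY):
    """Return (value, source) from the highest-priority source that has a value."""
    for source in priority:
        value = values_by_source.get(source)
        if value is not None:
            return value, source
    # No source supplied a value.
    return None, None


# Two sources disagree and no survey value is available.
record = {"vat": 120, "paye": 125}
print(resolve(record))
```

Returning the winning source alongside the value keeps a record of where each figure came from, which helps when the rules are reviewed later.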
Quality of Administrative Data
• There are many aspects to quality
• Administrative data will be better than survey data in some aspects but not in others
• It is important to look at overall quality
• Do the data meet the needs of users?
Three Aspects of Quality
• Quality of incoming data
• Quality of processing (matching, merging, ...)
• Quality of outputs - likely to be different to survey-based outputs, but are they better?
Quality Measurement
• How to measure the quality of data from administrative sources?
– Comparing sources
– Quality check surveys
– Knowledge of source (metadata)
– Quality reports / templates
Quality Templates
Companies House Data
• Framework: Contract
• Frequency: Quarterly updates, continuous on-line access
• Timeliness: Good
• Quality: Good
• Delivery: CD-ROM / Internet
• Key content: Legal name, company number
Using Administrative Data
• Conversion to statistical concepts and definitions
• Linking / Matching
– Exact Matching - linking records from two or more sources, often using common identifiers
– Probabilistic Matching - determining the probability that records from different sources should match, using a combination of variables
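The two approaches to matching can be sketched as below. Note that the second function is a simplified stand-in: it combines string similarities into a weighted score rather than estimating true match probabilities, as a full probabilistic linkage model would. The field names, weights, threshold and sample records are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def exact_match(record, candidates, key="company_number"):
    """Exact matching: link on a shared identifier, if the record has one."""
    for candidate in candidates:
        if record.get(key) and record.get(key) == candidate.get(key):
            return candidate
    return None


def similarity(a, b):
    """Case-insensitive string similarity between 0 and 1."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_score(record, candidate, weights):
    """Weighted similarity over several variables (a stand-in for a match probability)."""
    return sum(w * similarity(record[f], candidate[f]) for f, w in weights.items())


def best_match(record, candidates, weights, threshold=0.85):
    """Return the best-scoring candidate, or None if it falls below the threshold."""
    scored = [(match_score(record, c, weights), c) for c in candidates]
    score, candidate = max(scored, key=lambda pair: pair[0])
    return (candidate, score) if score >= threshold else (None, score)


# Invented register entries and an incoming record with no identifier.
register = [
    {"company_number": "00012345", "name": "Acme Widgets Ltd", "postcode": "AB1 2CD"},
    {"company_number": "00067890", "name": "Bolt Engineering Ltd", "postcode": "EF3 4GH"},
]
incoming = {"company_number": "", "name": "ACME Widgets Limited", "postcode": "AB1 2CD"}

weights = {"name": 0.7, "postcode": 0.3}  # assumed weights
print(exact_match(incoming, register))
print(best_match(incoming, register, weights))
```

A common pattern is to try exact matching first and fall back to the score-based approach only for records left unmatched, with borderline scores sent for clerical review.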
UK Business Register
[Diagram: the Business Register and its related sources: VAT, PAYE, survey inputs, geographic information systems, company registrations, Dun and Bradstreet, and satellite registers]
Satellite Registers
Examples of Satellite Registers
• Tourism - hotel register (category, number of beds)
• Transport - vehicle or ship register (type, capacity)
• Distributive trades - buildings register (building size, sales area)
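One way to picture a satellite register is as a set of sector-specific variables held alongside the core business register. The sketch below assumes the two are linked by a shared enterprise identifier; the slides do not say how the link is made, and the identifiers, names and values are invented.

```python
# Hypothetical core business register, keyed by enterprise identifier.
business_register = {
    "ENT-001": {"name": "Seaview Hotel Ltd", "activity": "hotels"},
    "ENT-002": {"name": "Harbour Ferries Ltd", "activity": "water transport"},
}

# Hypothetical tourism satellite register: extra variables for hotels only,
# keyed by the same identifier as the core register (an assumption).
hotel_register = {
    "ENT-001": {"category": "3 star", "beds": 48},
}


def with_satellite(enterprise_id, core, satellite):
    """Return the core record extended with any satellite variables."""
    merged = dict(core[enterprise_id])
    merged.update(satellite.get(enterprise_id, {}))
    return merged


print(with_satellite("ENT-001", business_register, hotel_register))
print(with_satellite("ENT-002", business_register, hotel_register))
```

Holding the sector-specific variables separately keeps the core register compact while still allowing them to be joined on when needed.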
Conclusions
• Administrative sources should be defined in the widest sense
• There are many benefits in using administrative data, particularly reduced costs
• There are problems when using administrative data, but usually someone has found a solution
Conclusions
• Most problems can be reduced by effective planning and detailed knowledge of the source
• The benefits are often greater than the costs