
Improving Quality in the Office for National Statistics’ Annual Earnings Statistics

Pete Brodie & Kevin Moore

UK Office for National Statistics

Outline

• What is being measured?
• Background
• Improvements introduced for the Annual Survey of Hours and Earnings (ASHE)
  – weighting
  – output statistic
  – coverage
  – variance estimation
• Future – improved sample design
• Conclusions

What is being measured?

• The Annual Survey of Hours and Earnings (ASHE) is the main vehicle for measuring wage levels and working hours in the UK

– For government taxation and wage policy
– Measuring the effect of the National Minimum Wage
– Measuring gender pay differences
– Measuring compliance with maximum hours directives
– For compensation cases to estimate loss of earnings or future care costs
– For regional and local planning purposes
– Also used for pay bargaining

Background

• Formerly the “New Earnings Survey” (NES)

• Survey largely unchanged since the early 1970s
• Receive a 1 in 100 sample from the tax office
• All employees who have one or more jobs as part of a Pay as You Earn (PAYE) scheme
• No weighting carried out
• Only crude sample variance estimates produced

Improvements introduced for ASHE

• The Annual Survey of Hours and Earnings (ASHE) recently (2004) replaced the NES

• Changes introduced
  – weighting
  – outputs
  – coverage
  – variance estimation

Weighting (1/2)

• The Labour Force Survey (LFS), which is a household survey, also measures the labour market in the UK

• The LFS is calibrated to the mid-year population estimates

• Since wages, hours and response rates are highly correlated with demographic factors, we calibrate to LFS outputs

Weighting (2/2)

• Analysis determined which factors were most associated with key ASHE variables.

• The final factors to be included in the model were (most significant first):

– Main Occupational Category (9 groups)
– Sex (2 groups)
– Age: less than 25, 25-49 and 50+ (3 groups)
– Region: London & South East, remainder (2 groups)

• Giving a total of 108 cross groups
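The 108 cross groups follow from 9 × 2 × 3 × 2. A minimal sketch of how such cells could be enumerated and used for weighting is given below. It assumes simple post-stratification (cell weight = LFS total for the cell divided by the number of ASHE respondents in the cell); the slides do not say which calibration method was used, and the labels, function name and totals here are hypothetical.

```python
from collections import Counter
from itertools import product

# Hypothetical labels for the four calibration factors named on the slide.
OCCUPATIONS = tuple(range(1, 10))               # 9 main occupational categories
SEXES = ("F", "M")                              # 2
AGE_BANDS = ("<25", "25-49", "50+")             # 3
REGIONS = ("London & South East", "Remainder")  # 2

# Every combination of the four factors is one calibration cell.
CELLS = list(product(OCCUPATIONS, SEXES, AGE_BANDS, REGIONS))


def poststratification_weights(respondent_cells, lfs_totals):
    """Return one weight per occupied cell.

    respondent_cells: the cell (a 4-tuple) of each responding employee.
    lfs_totals: dict mapping cell -> LFS estimate of jobs in that cell
                (made-up numbers in the example below).
    """
    counts = Counter(respondent_cells)
    return {cell: lfs_totals[cell] / n for cell, n in counts.items()}


# Tiny illustration with invented numbers.
cell_a = (1, "F", "<25", "Remainder")
cell_b = (9, "M", "50+", "London & South East")
weights = poststratification_weights(
    [cell_a, cell_a, cell_b],
    {cell_a: 500.0, cell_b: 120.0},
)
```

In this toy case each of the two respondents in `cell_a` would stand for 250 jobs, and the single respondent in `cell_b` for 120.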

Outputs (1/2)

• The NES output focussed on means
• We are actually interested in distributions
• The ASHE output focusses on medians and includes ten other percentile outputs
• Means are also published
• Also publish year on year change for every variable
• Every output has a sampling variance estimate published

Outputs (2/2)

• Number of low paid: 148,605 (c.v. = 8.74%)
• Average wage: £387.19 (c.v. = 0.37%)
• Lower decile: £80.80 (c.v. = 1.50%)
• Median: £317.64 (c.v. = 0.36%)
• Upper decile: £713.96 (c.v. = 0.50%)
• Average pay of females in Wales: £257.18 (c.v. = 1.43%)
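A coefficient of variation (c.v.) is the standard error expressed as a percentage of the estimate, so each published c.v. can be turned back into a standard error. The sketch below does this for the median quoted on this slide. The interval assumes the estimate is approximately normally distributed, which is an assumption of this example rather than something the slides state; the function names are illustrative.

```python
def standard_error(estimate, cv_percent):
    """Standard error implied by an estimate and its c.v. in per cent."""
    return estimate * cv_percent / 100.0


def approx_interval(estimate, cv_percent, z=1.96):
    """Approximate 95% interval, assuming a normal sampling distribution."""
    se = standard_error(estimate, cv_percent)
    return estimate - z * se, estimate + z * se


# Median from the slide: £317.64 with c.v. = 0.36%.
median_se = standard_error(317.64, 0.36)        # roughly £1.14
median_low, median_high = approx_interval(317.64, 0.36)
```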

Coverage (1/3)

• Initial sample drawn in January
• Questionnaires sent out April
• Use responses to this first questionnaire and updated admin data for a second phase
  – those who have changed employer (the movers)
  – those who have recently joined a PAYE scheme (joiners)

• To compensate for the difference in coverage of the LFS and ASHE we also took a sample of companies outside the PAYE scheme

Coverage (2/3)

[Diagram: coverage of ASHE compared with the LFS. Labels: PAYE employees (movers, joiners, stayers); employees of VAT-only companies; LFS.]

Coverage (3/3)

• First phase ≈ 250,000 employees
• Returned details ≈ 160,000 employees
• Number of employees no longer with the same employer but still working ≈ 26,000

Variance estimation (1/3)

• We use GES software to calculate simple outputs with variance estimates

• We treat our calibration totals as fixed (this underestimates the variance slightly)

• For the percentile outputs we use indicator variables to estimate approximate variances
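One common way to use indicator variables for percentile variances is a Woodruff-style calculation: estimate the variance of the proportion of employees at or below the estimated percentile, then translate the resulting interval for the proportion back onto the earnings scale. The sketch below is an illustration under stated assumptions only (with-replacement sampling, weights treated as fixed); it is not the GES implementation and the function names are hypothetical.

```python
import math


def weighted_quantile(values, weights, p):
    """Smallest value whose cumulative weight share reaches p."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= p * total:
            return value
    return pairs[-1][0]


def proportion_se(values, weights, cutoff):
    """Weighted share of units at or below cutoff, with a linearised SE."""
    n = len(values)
    total = sum(weights)
    indicator = [1.0 if v <= cutoff else 0.0 for v in values]
    p_hat = sum(w * i for w, i in zip(weights, indicator)) / total
    residuals = [w * (i - p_hat) / total for w, i in zip(weights, indicator)]
    variance = n / (n - 1) * sum(r * r for r in residuals)
    return p_hat, math.sqrt(variance)


def quantile_se(values, weights, p, z=1.96):
    """Estimated p-quantile and an approximate (Woodruff-style) SE."""
    q = weighted_quantile(values, weights, p)
    p_hat, se_p = proportion_se(values, weights, q)
    low = weighted_quantile(values, weights, max(0.0, p_hat - z * se_p))
    high = weighted_quantile(values, weights, min(1.0, p_hat + z * se_p))
    return q, (high - low) / (2 * z)


# Invented data: 100 equally weighted weekly wages of £1 to £100.
wages = [float(v) for v in range(1, 101)]
median_estimate, median_std_error = quantile_se(wages, [1.0] * 100, 0.5)
```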

Variance estimation (2/3)

[Figure: Distribution of Earnings. Horizontal axis: weekly wage in £s; vertical axis: percentage; series: unweighted and weighted.]

Variance estimation (3/3)

• For year on year changes we use a repeated sampling method

• Have to be careful when sampling
  – year one only
  – both years
  – year two only
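The care needed comes from the overlap: employees present in both years contribute to both estimates, so the two years are not independent. The sketch below uses a plain bootstrap as one example of a resampling method; the slides do not say which method was used. Each employee is resampled as a whole record, so anyone in both years carries both values into a replicate. The data are invented and unweighted, and all names are hypothetical.

```python
import random
import statistics


def change_in_mean(people):
    """Year-two mean minus year-one mean; None marks a missing year."""
    year_one = [a for a, b in people if a is not None]
    year_two = [b for a, b in people if b is not None]
    return statistics.fmean(year_two) - statistics.fmean(year_one)


def bootstrap_se_of_change(people, replicates=500, seed=1):
    """Bootstrap SE of the change, resampling whole employee records."""
    rng = random.Random(seed)
    n = len(people)
    estimates = []
    for _ in range(replicates):
        resample = [people[rng.randrange(n)] for _ in range(n)]
        try:
            estimates.append(change_in_mean(resample))
        except statistics.StatisticsError:
            continue  # a replicate with one year entirely missing is skipped
    return statistics.stdev(estimates)


# Invented weekly wages as (year one, year two) pairs.
people = (
    [(300.0 + i, 310.0 + i) for i in range(40)]    # both years
    + [(250.0 + 2 * i, None) for i in range(20)]   # year one only
    + [(None, 280.0 + 3 * i) for i in range(20)]   # year two only
)
change = change_in_mean(people)
change_se = bootstrap_se_of_change(people)
```

Resampling the two years separately would ignore the positive correlation in the "both years" group and would tend to overstate the variance of the change.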

Future (1/7)

• Sample design is unchanged and so still quite inefficient

• No auxiliary information used
• Simple random sample everywhere
• Looked at options for using extra information
  – Information about the rest of the frame
  – Additional auxiliary information
  – Options for sub-sampling

Future (2/7)

• Currently have a simple Bernoulli 1% sample
  – Details of their current employer only

• We have additional information from our own business register (the IDBR) which holds details of size and industry of employing business

• Sample variance of returned values is correlated with industry

• Too much sample in some industries and too little in others!

Future (3/7)

• Easy to reduce sample sizes
• Looked at the effect on the overall variance when we removed sample from the “good” industries
• Stratified the returned sample by industry and removed sample from the “worst” industry then the second worst etc. until full reduction achieved
• Could impose restrictions too
• Compared with removal at random
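The comparison between targeted and random removal can be illustrated with the usual variance formula for a stratified mean. The sketch below is a unit-by-unit greedy variant, not necessarily the industry-by-industry procedure described on the slide: at each step it drops one unit from whichever industry's loss increases the variance least, subject to a minimum stratum size as an example restriction. Finite population corrections are ignored, and the shares, variances and sample sizes are invented.

```python
def stratified_variance(shares, variances, sizes):
    """Variance of a stratified mean: sum of W_h^2 * S_h^2 / n_h."""
    return sum(w * w * s2 / n for w, s2, n in zip(shares, variances, sizes))


def greedy_reduction(shares, variances, sizes, remove, min_size=2):
    """Remove `remove` units one at a time where it costs least variance."""
    sizes = list(sizes)
    for _ in range(remove):
        candidates = [h for h, n in enumerate(sizes) if n > min_size]
        if not candidates:
            break  # the restriction prevents any further removal

        def cost(h):
            c = shares[h] ** 2 * variances[h]
            return c / (sizes[h] - 1) - c / sizes[h]

        sizes[min(candidates, key=cost)] -= 1
    return sizes


# Three invented industries: population shares, wage variances, sample sizes.
shares = [0.5, 0.3, 0.2]
variances = [100.0, 2500.0, 10000.0]
sizes = [400, 300, 300]

# Halve the sample: evenly (what random removal gives on average) or greedily.
proportional = [200, 150, 150]
greedy = greedy_reduction(shares, variances, sizes, remove=500)

variance_proportional = stratified_variance(shares, variances, proportional)
variance_greedy = stratified_variance(shares, variances, greedy)
```

With these made-up inputs the greedy cut takes most of the reduction from the low-variance industry, which is why it loses less precision than an even cut of the same size.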

Future (4/7)

Future (5/7)

• Considered increasing sample in some industries

• Postulated that we start with a 2% sample of admin data

Future (6/7)

Future (7/7)

• There is the possibility of getting auxiliary information

• One of the opportunities arising out of Independence for National Statistics is more sharing of administrative data within government

• There may be a suitable variable

Conclusions

• Substantial improvements have been made to UK earnings statistics

• Efficiency savings could be made: costs could be cut substantially with little loss in quality

• There is some scope for improving quality of high level outputs while reducing sample sizes

• There is a possibility of making vast improvements with access to more detailed administrative data (Independence might bring this)

Questionnaire Issues

• Talk by Jacqui Jones of the ONS

• Improved Questionnaire Design yields better data: Experiences from the UK ASHE

• Tomorrow morning (Wednesday) Session 36: A Global Path to Standards in Questionnaire Design

Any questions?

Contact details:

[email protected]