An R vs SAS Experiment Megan Pope and Gareth Clews Office for National Statistics

Post on 21-Jan-2016

Page 1:

An R vs SAS Experiment

Megan Pope and Gareth Clews

Office for National Statistics

Page 2:

R at ONS

• Open source software in ONS

• Supporting the government IT strategy

• Development of training for GSS

• R Development Group
    i. Support use of R within ONS
    ii. Increase user base
    iii. Aim for incorporation in production systems

• Teaching R to a SAS audience

• Increasing usage

Page 3:

SAS at ONS

• Designated standard software

• Statistics Canada Generalised Estimation System (GES)

• Suite of SAS macros

• Calibration weights, domain estimates, variance estimates

Page 4:

ReGenesees

• Free R package

• Name stands for: R Evolved Generalised Software for Sampling Estimates and Errors in Surveys

• Developed by Italian Statistics Office (Istat)

Page 5:

R vs SAS

• Comparative study of complex survey estimation software

• Quality Improvement Fund (QIF)

• SAS (GES) vs R (ReGenesees)

• Investigating open source in line with GSS strategy

Page 6:

Calibration

• Used if there is a relationship between auxiliary data and response variable

• An estimation procedure which constrains sample-based estimates of auxiliary variables to known totals (or accurate estimates)
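As a rough illustration of the constraint described above, the following sketch shows the simplest case: one auxiliary variable and a single adjustment factor applied to every design weight (ratio-type calibration). The function name, the variable names and all the numbers are hypothetical and do not come from the slides; GES and ReGenesees support far more general calibration models.

```python
def calibrate_ratio(design_weights, aux_values, known_total):
    """Scale design weights so the weighted auxiliary total equals known_total.

    Simplest (ratio-type) calibration with one auxiliary variable.
    Illustrative only; not the GES or ReGenesees implementation.
    """
    weighted_aux_total = sum(d * x for d, x in zip(design_weights, aux_values))
    if weighted_aux_total == 0:
        raise ValueError("weighted auxiliary total is zero; cannot calibrate")
    g = known_total / weighted_aux_total  # common adjustment factor
    return [d * g for d in design_weights]


# Hypothetical sample of four businesses, each with design weight 10
design_weights = [10.0, 10.0, 10.0, 10.0]
employment = [5.0, 12.0, 8.0, 20.0]       # auxiliary variable x
turnover = [50.0, 130.0, 70.0, 210.0]     # response variable y
known_employment_total = 500.0            # assumed known population total of x

cal_weights = calibrate_ratio(design_weights, employment, known_employment_total)

# The calibrated weights reproduce the known auxiliary total ...
calibrated_x_total = sum(w * x for w, x in zip(cal_weights, employment))
# ... and the same weights are then used to estimate the response total.
estimated_turnover = sum(w * y for w, y in zip(cal_weights, turnover))
```

The design-weighted employment total is 450 here, so every weight is scaled up by 500/450. The gain in precision depends on how closely turnover tracks employment, which is why calibration is used when the auxiliary and response variables are related.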

Page 7:

Surveys chosen and why ...

• Business surveys

• QSI – Cut-off sample

• BRES – Separate calibration totals; set thresholds for Winsorisation

• ABS – Biggest survey with 4,000 strata; externally calibrated weights
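The slide does not say which form of Winsorisation BRES uses. As a hedged illustration, the sketch below implements one commonly described one-sided form, in which a value above the threshold K is pulled back towards K by an amount that depends on the unit's weight. The function name, the threshold and the values are assumptions for illustration only.

```python
def winsorise(y, weight, threshold):
    """One-sided Winsorisation of a single reported value.

    Values at or below the threshold are left unchanged. A value above it
    is replaced by threshold + (y - threshold) / weight, so heavily
    weighted outliers are pulled back the most.
    Assumed form for illustration; not taken from the slides.
    """
    if weight < 1:
        raise ValueError("weight is expected to be at least 1")
    if y <= threshold:
        return y
    return threshold + (y - threshold) / weight


# Hypothetical values: threshold K = 200, estimation weight 10
unchanged = winsorise(150.0, 10.0, 200.0)    # below K, left as reported
pulled_back = winsorise(1000.0, 10.0, 200.0)  # above K, reduced towards K
self_representing = winsorise(1000.0, 1.0, 200.0)  # weight 1, no change
```

A unit with weight 1 represents only itself, so its value is never altered under this form, while an outlier with a large weight has most of its excess over K removed.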

Page 8:

Surveys chosen and why ...

• Social surveys

• LFS – biggest survey; resource intensive

• LOS – longitudinal

• IPS – 2-stage calibration

Page 9:

Quarterly Stock Inquiry

• Cut-off sampling

• Combined ratio estimation

• Calibration to one auxiliary

• Estimates and variance estimates

• GES – Seven separate input files

• ReGenesees – Six simple commands
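To make "combined ratio estimation" concrete, here is a minimal sketch under stated assumptions: simple random sampling within each stratum (design weight N/n), one auxiliary variable with a known population total, and a single ratio formed across all strata. The data structure, names and numbers are hypothetical. The sketch does not show how units below the cut-off are treated, and it produces no variance estimate.

```python
def combined_ratio_estimate(strata, aux_population_total):
    """Combined ratio estimator of the population total of y.

    Each stratum is a dict with the stratum population size "N" and the
    sampled values "y" (response) and "x" (auxiliary). The expansion
    estimates of y and x are summed over strata first, and a single
    ratio is then applied to the known auxiliary total.
    """
    y_expanded = 0.0
    x_expanded = 0.0
    for stratum in strata:
        n = len(stratum["y"])
        design_weight = stratum["N"] / n  # assumes simple random sampling
        y_expanded += design_weight * sum(stratum["y"])
        x_expanded += design_weight * sum(stratum["x"])
    return aux_population_total * y_expanded / x_expanded


# Hypothetical two-stratum sample
strata = [
    {"N": 100, "x": [10.0, 20.0], "y": [30.0, 50.0]},
    {"N": 50, "x": [40.0, 60.0], "y": [100.0, 140.0]},
]
known_x_total = 4400.0  # assumed known population total of the auxiliary

estimate = combined_ratio_estimate(strata, known_x_total)
```

In this example the expansion totals are 10,000 for y and 4,000 for x, giving a ratio of 2.5 and an estimate of 11,000. A separate ratio estimator would instead form one ratio per stratum, which needs a known auxiliary total for every stratum.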

Page 10:

Quarterly Stock Inquiry - GES

Page 11:

Quarterly Stock Inquiry - ReGenesees

• design<-e.svydesign(data= ids= strata= weights= fpc=)

• template<-pop.template(data= calmodel= partition=)

• pop<-fill.template(universe= template=)

• population.check(df.population= data= calmodel= partition=)

• cal<-e.calibrate(design= df.population= sigma2=)

• est<-svystatTM(design= y= by=)

Page 12:

What we found ...

• Software comparison

• Time

• Missing values

• Programming

Page 13:

Conclusions/Recommendations

• ReGenesees successfully used in place of GES

• ReGenesees easier – less risk!

• GES more capable for some aspects and vice versa

• Recommend to explore further!

Page 14:

Questions