Realizing the statistical potential of administrative data Paper presented at the U.N.E.C.E. Seminar on New Frontiers for Statistical Data Collection, 31 October-02 November, 2012, Geneva John Dunne, John Hayes Central Statistics Office, Ireland


Realizing the statistical potential of administrative data

Paper presented at the U.N.E.C.E. Seminar on New Frontiers for Statistical Data Collection,

31 October-02 November, 2012, Geneva

John Dunne, John Hayes

Central Statistics Office, Ireland


Introduction

• This paper describes the progression towards an Irish Statistical System: a holistic system based on the exploitation of administrative data, encompassing linkages to survey data and to other administrative data.

• The paper focuses on the role of the CSO’s Administrative Data Centre, which has the dual purpose of acting as a clearing house for administrative data and promoting the development of the Irish Statistical System.


The National Statistics Board

• In 2009, the National Statistics Board (NSB) laid out a strategy¹ for achieving an Irish Statistical System. Amongst the implementation priorities identified is: developing systems to ensure that the statistical value of existing survey and administrative data is maximized.

• The NSB paper also identified three critical infrastructural requirements in developing the Irish Statistical System:
  - A unique business identifier and a central business register;
  - A unique personal identifier;
  - Spatial and geographic data capture.

1 Strategy for Statistics, 2009-2014, http://www.nsb.ie/media/nsbie/pdfdocs/StrategyforStatistics2009-2014.pdf


Policy progression

• In 2011, an NSB position paper¹ elaborated on some of the core objectives of the earlier document, advocating, in particular:
  - The development of the infrastructure to maximise the use of data sources, including the compilation of registers of persons, businesses, and buildings, with linkage between each such register – “joined-up” data.

• The government Public Sector Reform Plan², published in 2011, further supports the development of the Irish Statistical System with the following stated objectives:
  - Improved sharing of data on businesses across the Public Service, including the development of business registers linkable to that of the Revenue Commissioners;
  - Developing a code of practice for data gathering and its use for statistical purposes across the Public Service, including promoting consistent approaches to identifiers, classifications, and geo-spatial/postcode data.

1 Double paper: The Irish Statistical System: The Way Forward and Joined Up Government Needs Joined Up Data, http://www.nsb.ie/media/nsbie/pdfdocs/NSB%20ISS%20Position%20Papers.pdf

2 http://per.gov.ie/wp-content/uploads/Public-Service-Reform-pdf3.pdf


The Statistics Act, 1993

• The CSO was established statutorily under the Statistics Act, 1993¹. This legislation assigns certain powers to the Director General of the CSO with respect to data held by public authorities:
  - The Director General may require a public body to provide copies of any records in its charge for statistical purposes;
  - The Director General may require a public body to co-operate with him on assessing the statistical potential of its records and in developing its recording methods and systems for statistical purposes;
  - A public body shall consult with the Director General, and accept his reasonable recommendations, if it proposes to introduce or revise any system for the storage and retrieval of information or to make a statistical survey.

1 http://www.irishstatutebook.ie/1993/en/act/pub/0021/print.html


A joined-up data system (after Thygesen¹)


1 The importance of the archive statistical idea for the development of social statistics and population and housing censuses in Denmark, Thygesen, Lars, 2011 http://ww4.dst.dk/upload/nordbotten_and_denmark_final_draft_4.pdf


Joined-up data and the CSO

• The CSO’s Business Register is fully aligned with administrative sources from the Revenue Commissioners.

• Linkage between persons and businesses is available to the CSO from employer tax returns to the Revenue Commissioners.

• There exists in Ireland a comprehensive buildings database for the state, called the Geodirectory¹, available on a commercial basis.

• Ireland does not yet have a postcode system; one is planned for 2013.

• The Department of Social Protection maintains the master list of official Personal Public Service Numbers (PPSN) in the state. This list is the basis of the CSO’s Person Activity Register, which identifies each person’s engagement with key administrative systems.

1 http://www.geodirectory.ie/
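The idea behind the Person Activity Register, a master list of identifiers annotated with each person's engagement with administrative systems, can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration only: the function name, the system names ("employment", "welfare"), and the made-up identifiers are hypothetical and do not describe the CSO's actual implementation.

```python
from collections import defaultdict


def build_activity_register(master_ids, flows):
    """Record, for each identifier on the master list, which systems it appears in.

    master_ids: iterable of identifiers from the master list.
    flows: dict mapping a system name to the identifiers seen in that system.
    Returns (register, unmatched): the register maps every master identifier
    to a sorted list of systems; unmatched collects identifiers that appear
    in a flow but not on the master list.
    """
    register = {pid: set() for pid in master_ids}
    unmatched = defaultdict(set)
    for system, ids in flows.items():
        for pid in ids:
            if pid in register:
                register[pid].add(system)
            else:
                unmatched[system].add(pid)
    return (
        {pid: sorted(systems) for pid, systems in register.items()},
        {system: sorted(ids) for system, ids in unmatched.items()},
    )


# Hypothetical identifiers and system names, for illustration only.
master = ["P001", "P002", "P003"]
flows = {
    "employment": ["P001", "P002"],
    "welfare": ["P002", "P999"],
}
register, unmatched = build_activity_register(master, flows)
```

In this toy example, a person with no recorded activity still appears in the register with an empty list, and identifiers that do not match the master list are kept aside for review rather than silently dropped.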


The CSO’s Administrative Data Centre

• The Administrative Data Centre (ADC) is the CSO unit designated as the conduit for data transfers from other government bodies and is the central repository for received data from those bodies.

• This unit currently maintains over fifty different administrative data flows serving the statistical production systems in the CSO.

• ADC controls access to the data in accordance with confidentiality obligations under national and EU legislation.

• Subject to these confidentiality obligations, ADC may also make anonymized data available as Research Micro Files (RMFs) to external researchers.


Following the setting-up of the ADC...


ADC interaction with other public bodies

• ADC policy is to implement institutional-level Memorandums of Understanding (MoUs) to underpin the flow of administrative data to the CSO, as distinct from having data flow-specific MoUs.

• In the case of the Office of the Revenue Commissioners, the MoU¹ has led to a relationship which has allowed the CSO to adopt a business register that is based on the Revenue Commissioners’ registration system and to use the Revenue Customer Number as a common business identifier between the two bodies.

• The government has charged the CSO with developing a statistical code of practice for the Irish public service. The ADC is progressing this objective through its chairing of the Statistician Liaison Group, a forum of statistical units across the public service.

1 http://www.cso.ie/en/aboutus/descriptionsandfunctions/memorandumofunderstandingbetweenthecsoandrevenue/


ADC – technical aspects

• Data received from other government bodies are converted to SAS datasets and held in a warehouse environment having Source, Analysis, and External Researcher tiers.

• In the case of person-based administrative data, ADC anonymizes such files before making them available to CSO users, as Analysis tier data flows.  

• All CSO staff have access, via a data portal, to core metadata and summary statistics on all administrative data held.  

• The data model for the administrative data held in the ADC domain is a hierarchical model:

Data flow → Data flow instance → Instance version → Datasets
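The four-level hierarchy can be sketched as nested Python dataclasses. This is only an illustration of the containment relationship named on the slide: the class names, the fields (such as `period` and `number`), the path format, and the example dataset names are assumptions, not the ADC's actual schema.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Dataset:
    name: str


@dataclass
class InstanceVersion:
    number: int  # hypothetical version counter, e.g. 1 = first delivery
    datasets: List[Dataset] = field(default_factory=list)


@dataclass
class DataFlowInstance:
    period: str  # hypothetical label for one delivery, e.g. a reference year
    versions: List[InstanceVersion] = field(default_factory=list)

    def latest(self) -> InstanceVersion:
        """Return the version with the highest number."""
        return max(self.versions, key=lambda v: v.number)


@dataclass
class DataFlow:
    name: str
    instances: List[DataFlowInstance] = field(default_factory=list)


def dataset_paths(flow: DataFlow) -> Iterator[str]:
    """Walk the hierarchy top-down and yield one path per dataset."""
    for instance in flow.instances:
        for version in instance.versions:
            for dataset in version.datasets:
                yield f"{flow.name}/{instance.period}/v{version.number}/{dataset.name}"


# Illustrative content only; the dataset names and period are made up.
flow = DataFlow(
    name="P35",
    instances=[
        DataFlowInstance(
            period="2011",
            versions=[
                InstanceVersion(1, [Dataset("employees")]),
                InstanceVersion(2, [Dataset("employees"), Dataset("employers")]),
            ],
        )
    ],
)
paths = list(dataset_paths(flow))
```

Keeping versions under each instance means a revised delivery for the same period can sit alongside the original, and `latest()` picks the most recent one.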


ADC – technical aspects

• An example of an Analysis tier data flow is the P35 (employee) dataset, which links person- and business-based registers as illustrated here:
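The slide's figure is not reproduced in this transcript. As a rough stand-in, the kind of linkage described might be sketched in Python as follows. The slides do not say how ADC anonymizes person identifiers; the keyed hash (HMAC-SHA256) used here is an assumption swapped in for illustration, and is strictly a form of pseudonymization. The field names, the demo key, and all records are hypothetical.

```python
import hashlib
import hmac

# Hypothetical demo key; a real key would be held securely by the data custodian.
SECRET_KEY = b"demo-key-not-for-production"


def protect(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Replace a raw person identifier with a keyed hash (assumed technique)."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def link_p35(p35_rows, persons, businesses):
    """Join employee records to the person and business registers.

    p35_rows: dicts with 'person_key', 'business_id' and 'pay'.
    persons: dict keyed by protected person key.
    businesses: dict keyed by business identifier.
    Rows without a match in both registers are skipped.
    """
    linked = []
    for row in p35_rows:
        person = persons.get(row["person_key"])
        business = businesses.get(row["business_id"])
        if person is None or business is None:
            continue
        linked.append(
            {
                "person_key": row["person_key"],
                "age_group": person["age_group"],
                "sector": business["sector"],
                "pay": row["pay"],
            }
        )
    return linked


# Made-up source records: raw identifiers are replaced before analysis.
raw_returns = [
    {"person_id": "1234567A", "business_id": "B100", "pay": 30000},
    {"person_id": "7654321B", "business_id": "B200", "pay": 42000},
    {"person_id": "0000000Z", "business_id": "B100", "pay": 15000},
]
p35_rows = [
    {"person_key": protect(r["person_id"]), "business_id": r["business_id"], "pay": r["pay"]}
    for r in raw_returns
]
persons = {
    protect("1234567A"): {"age_group": "25-34"},
    protect("7654321B"): {"age_group": "45-54"},
}
businesses = {"B100": {"sector": "Retail"}, "B200": {"sector": "Construction"}}

linked = link_p35(p35_rows, persons, businesses)
```

Because the same key produces the same protected value for a given identifier, records can still be joined across files without exposing the raw identifier to analysts. The third made-up record has no entry in the person register and is therefore left out of the linked output.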


The future – concrete objectives

• The key challenge for the CSO will be to avail of the increasing opportunities for joining up available administrative data sources. Steps to complete a fully joined-up data system in Ireland might include:

• The implementation in public administration systems of a link between a person and a residence, where the residence is itself identified by a location or (x,y)-based identifier;

• The mandatory use of the PPSN in the engagement of persons with the state through the different life stages;

• The implementation of a unique business identifier for businesses interacting with the state, and the linking of this identifier with a building identification number.


The future – critical success factors

• Statistical code of practice for the Irish public sector

• Partnership approach to development of joined-up data

• Delivery of projects that provide value for policy purposes


Conclusion

The Irish Statistical System continues to face significant challenges in the years ahead; however, in the words of W. Edwards Deming, “It is not necessary to change. Survival is not mandatory.”
