Post on 19-Oct-2014

DESCRIPTION

Census Analysis Workshop, 17 July, London, Analysis of Aggregate Outputs - Richard Wiseman

TRANSCRIPT

Page 1: Analysis of Aggregate Outputs - Richard Wiseman

Analysis of the aggregate outputs from the 2011 Census to develop alternative integrated multidimensional conceptual models of the data and geographies for easier management and dissemination

Richard Wiseman

Socio-Economic Data Specialist

UK Data Service

ONS Census Analysis workshop

July 2014

Page 2: Analysis of Aggregate Outputs - Richard Wiseman

Overview

• Background
• Casweb
• InFuse
• Integrated descriptive model
• Integrated model of geographies

Page 3: Analysis of Aggregate Outputs - Richard Wiseman

What is the UK Data Service?

• a comprehensive resource funded by the ESRC

• a single point of access to a wide range of secondary social science data

• support, training and guidance

Page 4: Analysis of Aggregate Outputs - Richard Wiseman

UK Data Service Census Support

• Specialist function of UK Data Service

• Access and support services for outputs from recent UK censuses

• Add value by making census outputs easy to find, understand and use

• Engagement with UK census agencies

• Long history of technological innovation in service development

• census.ukdataservice.ac.uk

Page 5: Analysis of Aggregate Outputs - Richard Wiseman

census.ukdataservice.ac.uk

Page 6: Analysis of Aggregate Outputs - Richard Wiseman

Census Support at Manchester

• Aggregate component of census outputs

Justin Hayes

Rob Dymond-Green

Richard Wiseman

Jamey Hart

Page 8: Analysis of Aggregate Outputs - Richard Wiseman

Casweb

Page 9: Analysis of Aggregate Outputs - Richard Wiseman

Casweb

• UK-wide aggregate data from 1971 to 2001
• Revolutionary when first launched in 1997
• First graphical interface to (UK?) census data

• Representations of published census tables allowing selection of cells, with basic table search

• Drill-down geography selection
• Integrated digital boundary data in GIS formats
• Heavyweight and inflexible
• All intelligence built into the application

Page 10: Analysis of Aggregate Outputs - Richard Wiseman

InFuse

Page 11: Analysis of Aggregate Outputs - Richard Wiseman

InFuse

• Open access
• Aggregate data from 2011 census across the UK
• Makes data easy to find, understand and use

• Global query using variable combinations
• No tables!
• “No data” fast!
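As a rough illustration of what a "global query using variable combinations" could look like in place of browsing tables, here is a minimal Python sketch. The index contents, the variable names and the `find_combinations` helper are all hypothetical; nothing here reflects the actual InFuse implementation or its data.

```python
from typing import Dict, FrozenSet, List

# Hypothetical index: each key is a combination of variables for which
# counts exist. The labels are illustrative only, not real InFuse content.
COMBINATIONS: Dict[FrozenSet[str], str] = {
    frozenset({"Age"}): "counts by age",
    frozenset({"Age", "Sex"}): "counts by age and sex",
    frozenset({"Age", "Sex", "Economic activity"}):
        "counts by age, sex and economic activity",
}


def find_combinations(selected: List[str]) -> List[FrozenSet[str]]:
    """Return every indexed combination containing all selected variables.

    An empty list means no data exists for that selection, so the user
    finds out immediately rather than after searching through tables.
    """
    wanted = frozenset(selected)
    matches = [combo for combo in COMBINATIONS if wanted <= combo]
    # Smallest combinations first, then alphabetical, for a stable order.
    return sorted(matches, key=lambda c: (len(c), sorted(c)))


if __name__ == "__main__":
    for combo in find_combinations(["Sex"]):
        print(sorted(combo), "->", COMBINATIONS[combo])
    print(find_combinations(["Tenure"]))  # nothing indexed for this variable
```

The point of the design is that the user starts from the variables of interest and the system reports which combinations exist, instead of the user needing to know which published table holds them.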

Page 12: Analysis of Aggregate Outputs - Richard Wiseman

Variable combination selection

Page 14: Analysis of Aggregate Outputs - Richard Wiseman

Category combination selection

Page 15: Analysis of Aggregate Outputs - Richard Wiseman

Area selection

Page 16: Analysis of Aggregate Outputs - Richard Wiseman

Data download

Page 17: Analysis of Aggregate Outputs - Richard Wiseman

Under the bonnet

• Integrated multidimensional descriptive model
• Integrated model of geographies
• The really important bits!

Page 18: Analysis of Aggregate Outputs - Richard Wiseman

InFuse 2011 release 2: Raw data

• England and Wales Local and Detailed Characteristics to output area level

• UK harmonised data to local authority level
• 422 tables, mainly multivariate
• 31 geography types
• 241,334 areas
• 11,311 files
• 15 GB volume

Page 19: Analysis of Aggregate Outputs - Richard Wiseman

Integrated descriptive model

• Processing of raw metadata
• Deconstruction, rationalisation and re-integration
• Library of variables and categories
• Re-insertion of data values
• Attachment of associated metadata

• Global description using standards
• Global operations via Web service API

• Data is self-describing
• Enables lightweight, generic applications
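The steps above (deconstruct the published tables, build one shared library of variables and categories, then re-insert the values) can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the raw cell layout, the area codes, the counts and the `integrate` and `describe` helpers are all invented for illustration and are not the InFuse data model.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Set, Tuple


@dataclass(frozen=True)
class Category:
    variable: str  # e.g. "Sex"
    label: str     # e.g. "Females"


# Hypothetical raw cell layout: (area code, {variable: category label}, count).
RawCell = Tuple[str, Dict[str, str], int]
ValueKey = Tuple[str, FrozenSet[Category]]


def integrate(tables: List[List[RawCell]]):
    """Merge several tables into one variable library and one value store."""
    library: Dict[str, Set[str]] = {}   # variable -> category labels
    values: Dict[ValueKey, int] = {}    # (area, category combination) -> count
    for table in tables:
        for area, cell, count in table:
            cats = frozenset(Category(v, lab) for v, lab in cell.items())
            for cat in cats:
                # A variable shared by many tables is stored only once.
                library.setdefault(cat.variable, set()).add(cat.label)
            values[(area, cats)] = count
    return library, values


def describe(key: ValueKey, count: int) -> dict:
    """Return a value together with the metadata needed to interpret it."""
    area, cats = key
    ordered = sorted(cats, key=lambda c: c.variable)
    return {"area": area,
            "categories": {c.variable: c.label for c in ordered},
            "value": count}


if __name__ == "__main__":
    # Made-up tables and counts, purely for demonstration.
    table_a = [("AREA_1", {"Sex": "Males"}, 120),
               ("AREA_1", {"Sex": "Females"}, 130)]
    table_b = [("AREA_1", {"Sex": "Females", "Age": "0 to 15"}, 25)]
    library, values = integrate([table_a, table_b])
    print(library)
    for key, count in values.items():
        print(describe(key, count))
```

Because every value carries its own area and category description, a generic client can display it without table-specific logic, which is the sense in which such data is "self-describing".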

Page 20: Analysis of Aggregate Outputs - Richard Wiseman

Benefits of this work

• Data producers
  • Efficient data management
  • Flexible output production
  • Best value

• Application developers
  • Easy access to self-describing web services
  • Lightweight generic applications

• End users
  • Quick and easy global search
  • Context along with data

Page 21: Analysis of Aggregate Outputs - Richard Wiseman

InFuse 2011 release 2: Processed data

• 97 variables
• 2,501 categories
• 281 variable combinations
• 140 thousand category combinations
• 4.6 billion values

• A 460 km high stack of sticky notes!
• Anticipating approximately 10 billion values in all
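The sticky-note figure can be checked with simple arithmetic. The sketch below assumes one value per note and a note thickness of 0.1 mm (100 micrometres); the slide does not state the thickness, so that number is an assumption chosen because it reproduces the quoted height.

```python
VALUES_RELEASE_2 = 4_600_000_000      # values in InFuse 2011 release 2 (from the slide)
VALUES_ANTICIPATED = 10_000_000_000   # approximate anticipated total (from the slide)
NOTE_THICKNESS_UM = 100               # ASSUMED: 0.1 mm per sticky note

UM_PER_KM = 1_000_000_000             # micrometres in one kilometre


def stack_height_km(notes: int, thickness_um: int = NOTE_THICKNESS_UM) -> float:
    """Height in kilometres of a stack with one note per data value."""
    return notes * thickness_um / UM_PER_KM


if __name__ == "__main__":
    print(stack_height_km(VALUES_RELEASE_2))     # height for release 2
    print(stack_height_km(VALUES_ANTICIPATED))   # height for the anticipated total
```

Under the same assumption, the anticipated 10 billion values would make a stack of roughly 1,000 km.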

Page 22: Analysis of Aggregate Outputs - Richard Wiseman

Integrated model of UK census geographies

• Assembly of raw information on geographies
• 31 geography types
• 241,334 areas (anticipating ~ 2 million including postcodes)
• Direct and indirect hierarchies

• Simplified presentational model
• 11 composite geography layers
• Simplification of merged geographies in England and Wales

• Calculation of ‘missing’ data
• Linkage between descriptive and geography models

• Partial availability of data for geographies and extents
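One plausible reading of "direct and indirect hierarchies" and "calculation of 'missing' data" is sketched below in Python: each area points to a direct parent, ancestors further up are reached indirectly, and a count missing for a parent area is derived by summing its children. The slide does not say how the calculation is actually done, so the summing rule, the area codes and the helper names are all assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Optional

# Hypothetical direct hierarchy: child area -> parent area.
PARENT: Dict[str, str] = {
    "OA1": "LA1", "OA2": "LA1", "OA3": "LA2",
    "LA1": "REGION", "LA2": "REGION",
}


def ancestors(area: str, parent: Dict[str, str]) -> List[str]:
    """Follow direct links upwards to list direct and indirect ancestors."""
    chain = []
    while area in parent:
        area = parent[area]
        chain.append(area)
    return chain


def fill_missing(counts: Dict[str, int], parent: Dict[str, str]) -> Dict[str, int]:
    """Derive counts for parent areas by summing their children (assumed rule).

    A parent is left without a value if any of its children has none,
    which mirrors partial availability of data.
    """
    totals = dict(counts)
    children = defaultdict(list)
    for child, par in parent.items():
        children[par].append(child)

    def total(area: str) -> Optional[int]:
        if area in totals:
            return totals[area]
        kids = children.get(area, [])
        if not kids:
            return None
        parts = [total(kid) for kid in kids]
        if any(part is None for part in parts):
            return None
        totals[area] = sum(parts)
        return totals[area]

    for area in set(parent.values()):
        total(area)
    return totals


if __name__ == "__main__":
    print(ancestors("OA1", PARENT))
    print(fill_missing({"OA1": 10, "OA2": 5, "OA3": 7}, PARENT))
    print(fill_missing({"OA1": 10, "OA2": 5}, PARENT))  # OA3 unavailable
```

In the last call the values for `LA2` and `REGION` stay absent, because summing only the available children would silently understate them.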

Page 23: Analysis of Aggregate Outputs - Richard Wiseman

Raw admin and statistical geographies

Page 24: Analysis of Aggregate Outputs - Richard Wiseman

Admin and statistical geography layers

infuse.mimas.ac.uk/help/definitions/2011geographies

Page 25: Analysis of Aggregate Outputs - Richard Wiseman

What’s next for InFuse

• Interface improvements
  • Geography first option
  • Fine-tune interface features
  • Select categories from more than one category combination
  • ‘Select all’ categories
  • Back button
  • Geography tree improvements (multiple hierarchies)

• User testing

Page 26: Analysis of Aggregate Outputs - Richard Wiseman

What’s next?

• More data
  • More comparable data

• Different data
  • Boundary and flow data

• More functionality
  • Personalisation, analysis and visualisation

• Public InFuse API
• Work with statistical agencies?

• Machine-friendly data from source
• Flexible generation with automated disclosure control?
• Information on usage and contact with users

Page 27: Analysis of Aggregate Outputs - Richard Wiseman

Give InFuse a go!

infuse.mimas.ac.uk

• Comments, questions and ideas welcome
• [email protected]