Open GSBPM compliant data processing system in Statistics Estonia (VAIS). 2011 MSIS Conference. Maia Ennok, Head of Data Warehouse Service, Data Processing Systems Department, Statistics Estonia. 23 May 2011


Open GSBPM compliant data processing system in Statistics Estonia (VAIS)

2011 MSIS Conference

Maia Ennok

Head of Data Warehouse Service

Data Processing Systems Department

Statistics Estonia

23 May 2011

Strategy of Statistics Estonia 2008–2011

“From data collector to information service provider”

Objective: High-quality information service

Standardise the process of data processing:

Indicator: Introduction of unified data processing software

Development and introduction of a universal data processing information system

Architecture of the information system

[Figure: architecture diagram. Labels: Persons, Economic entities, Administrative registers, Users; Data collection, Processing, Statistical analysis, Dissemination; Metadata system, Statistical registers, Data Warehouse; iMETA, VVIS, ADAM, eGeostat, SRS, VAIS, eSTAT, PX-Web, Census-HUB, KUNDE.]

Data processing system (VAIS)

VAIS is a collection of tools and technologies aimed at automating data processing (Phase 5 in GSBPM).

In essence, checking, cleaning and transforming the data of a statistical activity means taking raw data from one or more sources and transforming it into the input database structures of the analytical system (the observation registry).

Framework for …

Integrate data
Classify & code
Review, validate and edit
Impute
Derive new variables & statistical units
Calculate weights
Calculate aggregates
Finalize data files

Metadata driven template based tool

The template-driven approach provides a universal solution for the three main goals of the VAIS project:
Create an easy-to-use statistical data processing tool that requires minimal programming skills to create transformation packages.
Create a metadata-driven, process-oriented and automated statistical data processing tool.
Create an extendable data transformation tool.
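As a minimal sketch of the template-driven idea, the example below renders a validation query from a metadata record. It uses Python's standard-library `string.Template` purely as a stand-in for a template engine; the rule fields (`table`, `column`, `min_value`, `max_value`), the table name and the SQL shape are hypothetical and are not taken from VAIS.

```python
from string import Template

# Hypothetical template for a range check; the SQL shape is an assumption.
RANGE_CHECK = Template(
    "SELECT id FROM $table WHERE $column < $min_value OR $column > $max_value"
)


def render_rule(rule: dict) -> str:
    """Render one metadata record into an executable statement."""
    # substitute() raises KeyError if the metadata record lacks a field,
    # so an incomplete rule fails when rendered instead of producing bad SQL.
    return RANGE_CHECK.substitute(rule)


# Hypothetical metadata record describing a validation rule.
rule = {
    "table": "survey_raw",
    "column": "turnover",
    "min_value": 0,
    "max_value": 1000000,
}
statement = render_rule(rule)
print(statement)
```

The point of the design is that a new check needs only a new metadata record, not new program code; the template itself is written once.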

Design Phase

[Figure: design phase diagram. The Common Metadata Repository is shown with, for each Statistical Activity N: Data Sources, Validation Rules, Imputation Method, Aggregation Definition and Target Dataset. Processing steps shown: INTEGRATE DATA, VALIDATE, IMPUTE, AGGREGATE, LOAD DATA.]

Data processing with VAIS

Automating and speeding up data transformation
Raw data, transformation metadata and source data audit trails
Metadata driven template based tool
Balancing automation and manual intervention

VAIS architecture

Balancing automation and manual intervention

[Figure: flow diagram. Elements: RAW data; Metadata (validation and transformation rules); Automated data processing; Manual data processing; "OK?" decision; Data Warehouse.]
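One way to read the diagram is as a routing step: records that pass every validation rule continue automatically, and the rest become tasks for manual processing. The sketch below illustrates that reading only; the rule format, record fields and function names are hypothetical and do not describe the VAIS implementation.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]
Rule = Tuple[str, Callable[[Record], bool]]  # (rule name, check function)


def route(records: List[Record], rules: List[Rule]):
    """Split records into those ready to load and tasks for manual review."""
    ready, tasks = [], []
    for record in records:
        failed = [name for name, check in rules if not check(record)]
        if failed:
            # The "OK?" decision came out negative: keep the failed rule
            # names so an operator can see why the record needs attention.
            tasks.append({"record": record, "failed_rules": failed})
        else:
            ready.append(record)
    return ready, tasks


# Hypothetical rules and records, for illustration only.
rules = [
    ("turnover_not_negative", lambda r: r["turnover"] >= 0),
    ("employees_present", lambda r: r.get("employees") is not None),
]
records = [
    {"id": 1, "turnover": 120, "employees": 4},
    {"id": 2, "turnover": -5, "employees": None},
]
ready, tasks = route(records, rules)
```

Here record 1 would go straight on, while record 2 would be queued as a task listing both failed rules.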

VAIS applications and roles

Role

VAIS Designer

VAIS Operator

VAIS Administrator

URMA

Designer x

Data Warehouse programmer x

Chief operator x

Operator x

Administrator x x

URMA

User rights management application
Allows existing user accounts to be used for authorization
Allows roles to be created and users to be linked to roles
Allows rights to be set according to the domain of the statistical work

VAIS Designer

Application for designing data processing
User interfaces for designing each processing procedure
Procedures are grouped into packages
Package setup follows the ETL policy
Packages are designed for each version of a statistical work
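The grouping of procedures into packages can be pictured as an ordered list of steps applied in turn. The following sketch assumes each procedure is a function from a list of rows to a list of rows; the `Package` class, its fields and the example procedures are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Rows = List[Dict[str, float]]
Procedure = Callable[[Rows], Rows]


@dataclass
class Package:
    """An ordered group of procedures for one version of a statistical work."""
    statistical_work: str
    version: str
    procedures: List[Procedure] = field(default_factory=list)

    def run(self, rows: Rows) -> Rows:
        # Apply the procedures in the order in which they were designed.
        for procedure in self.procedures:
            rows = procedure(rows)
        return rows


def drop_negative(rows: Rows) -> Rows:
    """Validation-style step: keep only rows with a non-negative value."""
    return [r for r in rows if r["value"] >= 0]


def total(rows: Rows) -> Rows:
    """Aggregation-style step: collapse the rows into a single sum."""
    return [{"value": sum(r["value"] for r in rows)}]


package = Package("example_work", "2011", [drop_negative, total])
result = package.run([{"value": 10.0}, {"value": -3.0}, {"value": 5.0}])
```

Keeping the work name and version on the package mirrors the idea that each version of a statistical work gets its own package.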

VAIS Operator

Allows the user to intervene manually in data processing.
Allows the user to resolve tasks created by data validation.
The data processing report gives an overview of the data in process.
Gives users the information they need to decide how to resolve tasks.

Technical platform

VAIS is built on open-source, freely available technological components:
XDTL (eXtensible Data Transformation Language, an XML-based descriptive language designed for specifying data transformations, see http://xdtl.org) run-time engine (XDTL RT).
MMX Metadata Repository, part of Metadata Framework (a MOF-compliant metadata management environment designed with a wide variety of metadata-driven applications in mind, see http://mmframework.org).
The Apache Software Foundation's Velocity template engine (http://velocity.apache.org) is used as the template engine; it combines excellent template rendering functionality with a very easy-to-use template language.
The user applications are programmed in Java, based on the Wicket MVC framework (http://wicket.apache.org).
The Quartz scheduling framework (http://www.quartz-scheduler.org) is used for execution scheduling.

Implementation

VAIS development: 05.2010 to 10.2011
Data processing of the Population and Housing Census 2011 (31.12.2011)
Reuse of administrative data (2012)
Development of the data collection system for administrative data (ADAM) and of eSTAT, so that questionnaires in eSTAT can be prefilled with administrative data from the annual bookkeeping report (31.08.2011). VAIS is used to convert administrative data into the statistical data format (for 2012, i.e. for data collection for reference year 2011).
Data processing of other statistical activities (first pilots 2013)
Data processing of the next register-based Population and Housing Census (pilot 2014)

Questions?

Thank you!