Daejeon, 26-29 April 2010

Restricted

An SDMX based unified data catalogue (UDC)

MSIS – Meeting on the Management of Statistical Information Systems

Gabriele Becker / Massimo Bruschi

Statistical Information Systems

Monetary & Economic Department

Bank for International Settlements

The SDMX vision

Need: up-to-date numbers, data documentation, good quality data

Data can be offered by: NSOs, CBs, IOs

How to choose, filter out duplication, get the “freshest” data?

Data providers (originators) offer their data “in SDMX”

Dissemination = reporting = data sharing … single storage!

SDMX registries help users and organisations to find data

How “real” is this SDMX vision? What do we still need to learn?

The Unified Data Catalogue (UDC) concept

Can we “implement” the vision?

UDC: a single data catalogue that allows users to discover, select and retrieve statistical data from all registered data sources

Discovery implies access to metadata:
• DSD – data structure definitions
• concepts and code-lists
• category schemes

An SDMX registry is a natural repository

Unified Data Catalogue feasibility study to analyse this
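
As a rough illustration of why discovery depends on metadata, the sketch below models DSDs, code-lists and a category scheme as plain Python objects and looks up every data flow whose DSD uses a given concept. All class names, identifiers and codes are hypothetical; they are not taken from the study or from any SDMX library.

```python
from dataclasses import dataclass, field


@dataclass
class DataStructureDefinition:
    """Hypothetical, heavily simplified stand-in for an SDMX DSD."""
    dsd_id: str
    # Maps each dimension's concept to the code-list of allowed codes.
    dimensions: dict[str, set[str]] = field(default_factory=dict)


@dataclass
class Catalogue:
    """Toy catalogue: DSDs plus a category scheme grouping data flows."""
    dsds: dict[str, DataStructureDefinition] = field(default_factory=dict)
    # category name -> list of (dataflow id, dsd id)
    categories: dict[str, list[tuple[str, str]]] = field(default_factory=dict)

    def find_dataflows_by_concept(self, concept: str) -> list[str]:
        """Return the data flows whose DSD has a dimension for `concept`."""
        hits = []
        for flows in self.categories.values():
            for dataflow_id, dsd_id in flows:
                if concept in self.dsds[dsd_id].dimensions:
                    hits.append(dataflow_id)
        return sorted(hits)


# Made-up example content (assumption, for illustration only).
catalogue = Catalogue(
    dsds={
        "DSD_RATES": DataStructureDefinition(
            "DSD_RATES", {"FREQ": {"M", "Q"}, "REF_AREA": {"CH", "KR"}}
        ),
        "DSD_PRICES": DataStructureDefinition(
            "DSD_PRICES", {"FREQ": {"M"}, "PRODUCT": {"OIL", "GOLD"}}
        ),
    },
    categories={
        "Financial": [("RATES_FLOW", "DSD_RATES")],
        "Commodities": [("PRICES_FLOW", "DSD_PRICES")],
    },
)

print(catalogue.find_dataflows_by_concept("REF_AREA"))
```

Without the DSDs and the category scheme in one place, the lookup would have to open every data source in turn, which is the argument for keeping this metadata in a registry.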

UDC study: Objectives

Provide centralised access to a variety of internal and external data sources

Generic search facilities against “registered” data sources

Directly retrieve data and metadata from all data sources

Use SDMX technical standards, SDMX registry, web services

Broaden SDMX knowledge within BIS (business area and IT colleagues)

User stories

Registrations
Constraints
GUI features
Navigation / Search
Query & retrieval
Output handling
Automation
Security

UDC prototype architecture

Simplistic approach: to search and retrieve data from a data source, all we need to know are the data structures and the source query language

If a source follows the SDMX-IM, we also need a (web) service connected to it that is able to respond to an SDMX Query

SDMX-enabled data source: “native” or “adaptable”

SDMX-ML file + DSD + “file-query-handler” = simplest SDMX enabled source
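
The "file + DSD + file-query-handler" combination can be sketched as follows. This is a minimal illustration under stated assumptions, not the handler built for the study: the XML is a simplified stand-in for SDMX-ML (namespaces and the message header are omitted), and `query_file`, `DSD_DIMENSIONS` and the sample data are hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for an SDMX-ML data file: real SDMX-ML uses
# namespaces and a message header, both omitted here (assumption).
SAMPLE_FILE = """
<DataSet>
  <Series FREQ="M" REF_AREA="CH">
    <Obs TIME_PERIOD="2009-07" OBS_VALUE="1.5"/>
    <Obs TIME_PERIOD="2009-08" OBS_VALUE="1.6"/>
  </Series>
  <Series FREQ="M" REF_AREA="KR">
    <Obs TIME_PERIOD="2009-07" OBS_VALUE="2.0"/>
  </Series>
</DataSet>
""".strip()

# Hypothetical DSD: only the ordered list of dimension names is needed here.
DSD_DIMENSIONS = ["FREQ", "REF_AREA"]


def query_file(xml_text, selection):
    """Return {series key: [(period, value), ...]} for matching series.

    `selection` maps a dimension name to the set of accepted codes;
    dimensions missing from `selection` are not filtered.
    """
    unknown = set(selection) - set(DSD_DIMENSIONS)
    if unknown:
        raise ValueError(f"not in DSD: {sorted(unknown)}")

    result = {}
    for series in ET.fromstring(xml_text).iter("Series"):
        if all(series.get(dim) in codes for dim, codes in selection.items()):
            # Build the key in DSD order so it is stable across files.
            key = ".".join(series.get(dim, "") for dim in DSD_DIMENSIONS)
            result[key] = [
                (obs.get("TIME_PERIOD"), float(obs.get("OBS_VALUE")))
                for obs in series.iter("Obs")
            ]
    return result


print(query_file(SAMPLE_FILE, {"REF_AREA": {"CH"}}))
```

The DSD is what lets the handler reject a query on a dimension the file does not have, instead of silently returning nothing.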

Plan: schematic architecture

[Diagram. Recoverable labels: SDMX Registry (web appl.); SDMX UDC GUI; Registrations; internal or external sources: mappable data source with SDMX query adapter web service, SDMX data source web service, SDMX files web service]

Components of the UDC prototype

SDMX Registry (“off the shelf” SDMX Tool)
• Data structure definitions of all “connected” data sources
• Registrations for all data flows for all connected data sources
• URLs to SDMX-files and SDMX query services
• Updated via SDMX-ML messages or interactively (“KeyMaster”)

UDC (developed for the study)
• GUI to navigate the registry information
• Queries the data sources
• Retrieves data and presents them to the user

SDMX query web services (developed for the study)
• For the different types of data sources

Data query services (partly existing, partly developed)
• For each of the connected queryable data sources
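
The division of labour between these components can be sketched as follows: the registry resolves a data flow to a URL and a source type, and the catalogue then calls the matching handler. This is an illustrative sketch only; `Registration`, `Registry`, `retrieve`, the handler functions and the `example.org` URLs are all hypothetical, and the handlers return descriptive strings instead of contacting a real service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Registration:
    """Hypothetical registry entry linking a data flow to where it lives."""
    dataflow_id: str
    dsd_id: str
    url: str
    kind: str  # "file" or "query_service"


class Registry:
    """Toy in-memory registry; a real SDMX registry is a separate service."""

    def __init__(self, registrations):
        self._by_flow = {r.dataflow_id: r for r in registrations}

    def lookup(self, dataflow_id):
        return self._by_flow[dataflow_id]


def fetch_from_file(url, selection):
    # Placeholder: a real handler would download and filter the file.
    return f"read {url} and filter locally by {selection}"


def fetch_from_service(url, selection):
    # Placeholder: a real handler would send an SDMX query message.
    return f"send query {selection} to {url}"


HANDLERS = {"file": fetch_from_file, "query_service": fetch_from_service}


def retrieve(registry, dataflow_id, selection):
    """Resolve the data flow in the registry, then call the right handler."""
    registration = registry.lookup(dataflow_id)
    handler = HANDLERS[registration.kind]
    return handler(registration.url, selection)


registry = Registry([
    Registration("FLOW_A", "DSD_A", "http://example.org/files/a.xml", "file"),
    Registration("FLOW_B", "DSD_B", "http://example.org/b/query", "query_service"),
])

print(retrieve(registry, "FLOW_B", {"FREQ": "M"}))
```

Keeping the source type in the registration means a new kind of data source only needs a new handler entry; the catalogue logic itself does not change.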

What we did: detailed architecture

[Diagram. Recoverable labels:
• Data sources and adapters: BIS Data Bank, DBQL output, SDMX-ML proxy daemon; MarkIT SQL database, SQL stored procedures; MSTAT Cubes, TS web service; SDMX-ML data files (.xml)
• Hosts: medts.a (Linux), mstat.s (Win), mstat.a (Win), v.ds03 (Linux), PC (Win)
• SDMX-ML query web services: /databank/query, /mstat/query, /markit/query
• Registry and catalogue: SDMX Registry web appl., R/O Registry, UDC web appl., SDMX-ML file
• Client: browser (Internet Explorer), UDC GUI]

UDC GUI key features

Browse the Categories / Data-flows / Provision registrations

Browse selected DSD: dimensions, attributes, code-lists

Build queries based on DSD (code selection)

Run query and view results (simple table)

Download results and DSDs in SDMX-ML format

Search by Concept / Codelist
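
Building a query from a code selection can be pictured as picking codes per dimension and expanding them into series keys. The sketch below is an assumption-laden illustration, not the UDC GUI logic: the `DSD` contents, the `build_series_keys` helper and the dot-separated key notation are all hypothetical.

```python
from itertools import product

# Hypothetical DSD: ordered dimensions, each with its code-list.
DSD = {
    "FREQ": ["M", "Q", "A"],
    "REF_AREA": ["CH", "KR", "US"],
    "INDICATOR": ["CPI", "GDP"],
}


def build_series_keys(dsd, selection):
    """Expand a per-dimension code selection into full series keys.

    A dimension absent from `selection` means "all codes" for it.
    Codes not in the DSD's code-list are rejected.
    """
    per_dimension = []
    for dimension, codelist in dsd.items():
        chosen = selection.get(dimension, codelist)
        invalid = [code for code in chosen if code not in codelist]
        if invalid:
            raise ValueError(f"{dimension}: unknown codes {invalid}")
        per_dimension.append(chosen)
    # One key per combination, dimensions joined in DSD order.
    return [".".join(combo) for combo in product(*per_dimension)]


keys = build_series_keys(DSD, {"FREQ": ["M"], "REF_AREA": ["CH", "KR"]})
print(keys)
```

Because the selection is validated against the code-lists before anything is sent, a mistyped code fails in the catalogue and never reaches the data source.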

Search by Concept/Codelist - 1

[Screenshot with numbered callouts 1 to 3]

Search by Concept/Codelist - 2

[Screenshot with numbered callouts 4 to 6]
UDC Prototype: some results

UDC can provide (unsecured) access to
• BIS Data Bank: time series repository, SDMX-EDI IM, Linux, FAME, Sybase, own query language + query adapter
• MSTAT OLAP: IBFS data multi-dimensional cubes, MS Windows, SQL Server, SDMX Query to OLAP / MDX adapter
• MSTAT Sandbox: research data in relational database, MS Windows, SQL Server, DSD on unstructured dataset + SDMX / SQL adapter
• SDMX-ML generic files + generic file adapter

Practical use of registration, provisioning, constraints processing, …

SDMX vision is real … with some practical issues

Issues found (Aug. 2009, SDMX 2.0)

Not possible to register compact or utility files in the registry used

Not possible to register files using message groups and annotations, as these are not supported in the registry used

Missing functionality in SDMX Query message

Some issues with the registry implementation used

Constraints processing on the registry did not work

ECB does not provide DSDs on their website (files are OK)

Cross-platform communication with security not solved

In general: access authorisation to queryable data sources is unresolved

Conclusions

SDMX vision is real: the UDC works

Enhancements to standards already part of SDMX 2.1

Enhancements to registry implementation (e.g. industrial strength required)

Non-SDMX issues (cross-platform connectivity and access authentication) exist and need to be looked into

Current SDMX offerings from other organisations are rather diverse (message types, features used, version implemented)

Diverse offerings make requirements for a UDC more complex

Next steps for the BIS

UDC can be a central part of the future BIS environment

Road to UDC will take a few years

Continue the feasibility study in the next year

Refine UDC
• More data sources
• More user facilities for search and navigation

Work with SDMX standards experts on issues found

Work with other SDMX data providers

Thank you!

[email protected]

[email protected]