
ESSnet

CORE COmmon Reference Environment

Partner’s name: Istat

WP number and name: WP3: Generic interface design for interconnecting GSBPM sub-processes

Deliverable 3.3

Lessons learned on the use of SDMX within data processing

Partner in charge Istat

Version 4.0

Date March 08 2012

Version  Changes        Changed by  Date
1.0      First draft    Istat       26-01-2012
2.0      Second draft   Istat       17-02-2012
3.0      Third draft    Istat       05-03-2012
4.0      Final version  Istat       08-03-2012

Date of dissemination Version Page

March, 08 2012 4.0 1


This document is distributed under the Creative Commons licence "Attribution-Share Alike - 3.0", available at the Internet site: http://creativecommons.org/licenses/by-sa/3.0


Summary

This document presents feedback on the use of SDMX in a CORE test scenario.

Keywords: CORE, SDMX


Contents

1 Introduction
2 CORE data format
3 SDMX information model and data format
4 CORE and SDMX
4.1 Test-bed scenario: a brief description
4.2 The CORE and SDMX experience
4.2.1 The Data Structure Wizard
4.2.2 The SDMX Converter tool
4.2.3 Data translation from CORE format to SDMX format
4.2.4 Converting data using SDMX Converter
5 Conclusions
6 Annexes
6.1 Annex A – Codelists used in the DSD
6.2 Annex B – extract of the SDMX DSD file
6.3 Annex C – Minutes of CORE-SDMX Meeting 09-05-2011
6.4 Annex D – Minutes of CORE-SDMX Meeting 11-07-2011


1 Introduction

The principal aim of the CORE project is the detailed design and prototype implementation of an architecture supporting the execution of statistical business processes. Such processes are defined in terms of services calling each other, according to the principles of modern information systems design. In this architecture, the way data and metadata are exchanged between services is highly relevant.

A specific task of WP3 (Generic interface design for interconnecting GSBPM sub-processes) was dedicated to designing the model of the data and metadata to be exchanged. From a technological perspective, CORE data are represented by means of standard XML technologies. A further activity was to study how to translate output data expressed in the CORE model into the SDMX (Statistical Data and Metadata eXchange) format.

SDMX is an international initiative by several international organizations (BIS, ECB, EUROSTAT, IMF, OECD, UN, World Bank) that aims to define a common standard for the exchange of statistical data and metadata, together with content guidelines and IT tools.

This document reports the results obtained by studying the use of SDMX in the CORE model. Specifically, in the context of a test-bed scenario used to validate the whole implementation cycle of the CORE environment, we designed a service called Convert to SDMX that transforms CORE output data into the SDMX XML format. This service encapsulates the SDMX Converter tool and relies on manual intervention by domain experts.


2 CORE data format

Data exchanged by CORE-compliant services are XML data and must conform to a set of defined schemas. Specifically, three XSD schemas are defined in the CORE model:

CORE Data Model. This schema contains the definition of the CORE model structure; it is specified once and is valid for all processes. The schema is extensible: for example, core tags (data set level, column level and row level), data set kinds or column kinds can be modified to meet new needs.

CORE Domain Descriptor. The Domain Descriptor can optionally be specified and reports domain concepts (such as persons, enterprises, etc.) with their related properties. The purpose of this schema is to define a very simple meta-model for the representation of domain knowledge.

CORE Mapping. This schema is intended to be specified with respect to the Domain Descriptor. It contains information about the correspondence between the CORE Model and the Domain Descriptor concepts if a Domain Descriptor is present; otherwise, between the CORE Model and the specific format of the tool encapsulated by the CORE service. In more detail, the CORE Mapping file specifies how to translate data from a tool-specific format to CORE and vice versa. Considering the principal data formats managed by tools internal to NSIs, CORE Mapping should support at least CSV/CORE and Relational/CORE transformations. More transformations can be defined, but these two meet most of the requirements of the tools used by NSIs.

For a complete description of these topics, we refer to Deliverable 3.2: Technical Environment Specification.


3 SDMX information model and data format

The SDMX standard is based on a conceptual meta-model called the SDMX Information Model (SDMX-IM), consisting of a set of functional entities organized as packages, which makes the model easier to understand, reuse and maintain. In version 2.0 of the standard, these entities are organized on three conceptual levels:

I - SDMX Base

II - Structural Definitions

III - Reporting and Dissemination.

For the description and documentation of data, we refer to the entities in level II (Structural Definitions), also called "artefacts", which are based on the "Item Scheme" belonging to the "SDMX Base" level.

Data and referential metadata are represented in the "Reporting and Dissemination" level (Data Set and Metadata Set).

The main artefacts used for describing data are the following:

CategoryScheme: The descriptive information for an arrangement or division of categories into groups based on characteristics which the objects have in common. It is an artefact for organizing categories (CategoryScheme items), which themselves link to dataflow definitions (*).

ConceptScheme: The descriptive information for an arrangement or division of concepts into groups based on characteristics, which the objects have in common. A concept scheme is a maintained list of concepts (ConceptScheme items) that are used in key family and metadata structure definitions.

KeyFamily: A Key Family or Data Structure Definition (DSD) specifies a set of concepts which describe and identify a set of data. It tells us which concepts are dimensions (identification and description), which are attributes (just description) and which are measures, and it gives the attachment level for each of these concepts on the basis of the packaging structure (dataset, group, series or observation), as well as their status (mandatory or conditional). It also specifies which code lists provide possible values for the coded concepts and gives possible values for the attributes, either as code lists or free text fields (**). In a DSD there can be many such concept schemes (*).

CodeList: Every possible value for a coded concept is defined in a code list. Each value on that list is given a language-independent abbreviation (code or CodeList item) and a language-specific description.

DataFlow: A structure which describes, categorizes and constrains the allowable content of a data set that providers will supply for different reference periods. In SDMX, data sets are reported or disseminated according to a data flow definition. The data flow definition identifies the data structure definition (key family) and may be associated with one or more subject matter domains; this facilitates the search for data according to organized category schemes. A “data flow”, in this context, is an abstract concept of the data sets, i.e. a structure without any data(**).
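The relationship between code lists, concepts and a key family can be made concrete with a small data model. The following is a minimal Python sketch and not part of the SDMX standard or of any SDMX library: the class names CodeList, Component and KeyFamily, the method validate_key and the sample content are all hypothetical and only mirror the artefacts described above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CodeList:
    """A code list: language-independent codes mapped to descriptions."""
    id: str
    codes: Dict[str, str]  # code -> description


@dataclass
class Component:
    """A concept used in a key family, in one of three roles."""
    concept: str
    role: str                               # "dimension", "attribute" or "measure"
    attachment_level: str = "observation"   # dataset, group, series or observation
    mandatory: bool = True
    codelist: Optional[CodeList] = None     # None means uncoded / free text


@dataclass
class KeyFamily:
    """A Data Structure Definition (DSD): an ordered set of components."""
    id: str
    components: List[Component] = field(default_factory=list)

    def dimensions(self) -> List[Component]:
        return [c for c in self.components if c.role == "dimension"]

    def validate_key(self, key: Dict[str, str]) -> List[str]:
        """Return the problems found in a dimension key (empty list if valid)."""
        problems = []
        for dim in self.dimensions():
            value = key.get(dim.concept)
            if value is None:
                problems.append(f"missing dimension {dim.concept}")
            elif dim.codelist is not None and value not in dim.codelist.codes:
                problems.append(f"{value!r} is not in code list {dim.codelist.id}")
        return problems


# Illustrative content only: a two-code list and a DSD with one dimension.
cl_sex = CodeList("CL_SEX", {"1": "Male", "2": "Female"})
dsd = KeyFamily("EXAMPLE_DSD", [
    Component("SEX", "dimension", codelist=cl_sex),
    Component("OBS_STATUS", "attribute", mandatory=False),
])
```

Calling dsd.validate_key with a key whose values all belong to the referenced code lists returns an empty list; otherwise the returned messages name the offending dimension or code.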


The SDMX data message types for data sharing and exchange are the following:

Generic Data Message: It conveys data in a form independent of a data structure definition message, as the structure is embedded in the message; it is used when applications receiving the data do not have a detailed understanding of the data set structure before they obtain the data set itself (***).

Compact Data Message: It allows the exchange of large data sets in a data structure definition-dependent form. In terms of XML syntax, all codes and observation values are attributes. The permissible values of the codes are defined in the schema (which is specific to the data structure definition), so that a generic XML parser can be used to validate a data file against its structural definition. It is used in particular to share and exchange short-term statistics data (***).

Cross-Sectional Data Message: It allows the exchange of more than one observation type in a data structure definition-dependent form; it is intended for situations where the statistical data consist of multiple observations at a single point in time, or for each combination of dimension members in the multidimensional table. An example is foreign trade statistics, where for each combination of reporting country, partner country, commodity and time period there may be three observations: a value, a weight and a quantity (***).

(*) OECD Glossary of statistical terms
(**) SDMX-ML: Schema and documentation
(***) SDMX User guide version 2009.1


4 CORE and SDMX

4.1 Test-bed scenario: a brief description

In this section we briefly describe the process scenario that we used as an empirical test-bed during the whole implementation cycle of the CORE environment. For more details, we refer to Deliverable 3.1: Technical Environment Specification.

Figure 1. Process scenario

Figure 1 shows a traditional flow chart representation of the process scenario.

The process flows in a simple waterfall fashion, with services being activated sequentially. Only the services identified by yellow rectangles have been implemented in the scenario, whereas those enclosed in the dashed central container are shown only for conceptual completeness. The scenario breaks down into two disconnected sequences, both covering a portion of the typical processing steps of sample surveys carried out by NSIs.

The first sequence covers the phases of sample allocation and selection.

To realize the sample allocation, two services have been implemented:

i) Compute Strata Statistics: we assumed this to be a simple SQL aggregate query with a GROUP BY clause. Thus, we implemented an API supporting Relational/CORE transformations.

ii) Allocate the Sample: we wrapped the MAUSS-R [1] (Multivariate Allocation of Units in Sampling Surveys – R version) system. Thus, we implemented an API supporting CSV/CORE transformations.

The Select the Sample service wraps a simple SAS script to be executed in batch mode.

The second sequence of the scenario deals with the computation of estimates and sampling errors, as well as with their storage (e.g. for later dissemination) and subsequent conversion to SDMX format (e.g. for later bilateral exchange).

To carry out the estimation phase, two services must be invoked:

i) Calibrate Survey data: we wrapped the ReGenesees [2] (R Evolved Generalized Software for Sampling Estimates and Errors in Surveys) system.

ii) Compute Estimates and Sampling Errors: we wrapped ReGenesees [2].

In both services, we decided to export the SAS datasets, which are in a closed proprietary format, to CSV files and to reuse the standard CSV/CORE transformations.

The Store Estimates and Sampling Errors service simply executes a set of SQL statements, so that previously computed survey estimates can be persistently stored in a relational DB. Again, we needed an API supporting Relational/CORE transformations.

Finally, the Convert to SDMX service, which is the service we focus on, converts data expressed in the CORE XML format to the SDMX XML format.

4.2 The CORE and SDMX experience

In this section we describe the work done to implement the Convert to SDMX service, which converts data expressed in the CORE XML format to the SDMX XML format.

[1] http://www.istat.it/it/strumenti/metodi-e-software/software/mauss-r
[2] http://www.istat.it/it/strumenti/metodi-e-software/software/regenesees


As a first step, we converted the input file from the CORE XML format to a CSV file, so that the SDMX conversion tools could be used. The converter is able to deal with “generic” aggregated data (i.e. data stemming from an arbitrary application domain), provided the schema of the CSV dataset is as follows:

K+2N columns, where:
- the first K columns represent the dimensions (categorical variables whose intersections define the estimation domains);
- the next N columns are measures (estimates computed on N variables of interest in the domains mentioned above);
- the last N columns are measures (the sampling errors of the estimates referred to above).

The proof of concept we have performed has been carried out on data coming from the Italian Time-Use Survey. Specifically, the file had K=3 and N=44.

The dimensions were: Rip: a territorial division of Italy in 5 classes; S5: sex; S11: professional status.

The measures were: Mean.y45, …, Mean.y88: the daily average time spent in 44 different life activities; SE.Mean.y45, …, SE.Mean.y88: the estimated standard errors for the average times above.
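The K+2N layout can be checked mechanically before any conversion is attempted. The following Python sketch is only an illustration under stated assumptions: the function name split_columns is hypothetical, and the header is rebuilt from the column names listed above rather than read from the actual survey file.

```python
from typing import List, Tuple


def split_columns(header: List[str], k: int) -> Tuple[List[str], List[str], List[str]]:
    """Split a K+2N header into (dimensions, measures, sampling errors)."""
    rest = len(header) - k
    if k < 0 or rest < 0 or rest % 2 != 0:
        raise ValueError("header does not have K+2N columns")
    n = rest // 2
    return header[:k], header[k:k + n], header[k + n:]


# Header rebuilt from the description above (K=3, N=44).
dimensions = ["Rip", "S5", "S11"]
measures = [f"Mean.y{i}" for i in range(45, 89)]
errors = [f"SE.{name}" for name in measures]
header = dimensions + measures + errors

dims, means, ses = split_columns(header, k=3)
```

With K=3 and N=44 the header has 91 columns, and each measure Mean.yNN is paired with the error column SE.Mean.yNN found N positions further to the right.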

The following tools were used in the CORE and SDMX experience: the Data Structure Wizard and the SDMX Converter. They are briefly described in the following paragraphs.

4.2.1 The Data Structure Wizard

The Data Structure Wizard (DSW) is a tool developed by Eurostat in Java that allows the creation of DSDs and MSDs. In order to create a DSD, some steps must be carried out sequentially. The steps followed in our case study are shown below.

Step 1: Definition of codelists. In the first step, all the codelists used in the DSD were defined, including the codelist used in the measure dimension. For every codelist the following information was included: the version (1.1), the agency (ESTAT) and a status (isfinal=true); this status is mandatory in order to reuse the codelists in the DSD creation. The codelists defined were:
- CL_Rip: geographical codelist (see Annex A, Table 1);
- CL_S5: sex codelist (see Annex A, Table 2);
- CL_S11: professional status codelist (see Annex A, Table 3);
- CL_MEAS: codelist for the measure dimension (see Annex A, Table 4).
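In the case study the codelists were entered by hand in the Data Structure Wizard. Purely as an illustration of what such a codelist looks like in SDMX-ML 2.0, the sketch below builds a CodeList element with Python's standard xml.etree.ElementTree module. The function build_codelist is hypothetical, and the code values and English descriptions used for CL_S5 are assumptions made for the example; only the element names, the attributes (id, agencyID, version, isFinal) and the namespace are taken from the DSD extract in Annex B.

```python
import xml.etree.ElementTree as ET

# Namespace of the SDMX-ML 2.0 structure schema, as it appears in the DSD extract.
STRUCTURE_NS = "http://www.SDMX.org/resources/SDMXML/schemas/v2_0/structure"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
ET.register_namespace("structure", STRUCTURE_NS)


def build_codelist(codelist_id, name, codes, agency="ESTAT", version="1.1"):
    """Return a structure:CodeList element; `codes` maps code value -> description."""
    codelist = ET.Element(
        f"{{{STRUCTURE_NS}}}CodeList",
        {"id": codelist_id, "agencyID": agency, "version": version, "isFinal": "true"},
    )
    name_el = ET.SubElement(codelist, f"{{{STRUCTURE_NS}}}Name", {XML_LANG: "en"})
    name_el.text = name
    for value, description in codes.items():
        code = ET.SubElement(codelist, f"{{{STRUCTURE_NS}}}Code", {"value": value})
        desc = ET.SubElement(code, f"{{{STRUCTURE_NS}}}Description", {XML_LANG: "en"})
        desc.text = description
    return codelist


# Assumed content: two codes for the sex codelist.
cl_s5 = build_codelist("CL_S5", "Sex", {"1": "Male", "2": "Female"})
xml_text = ET.tostring(cl_s5, encoding="unicode")
```

The isFinal attribute is always written as "true" here because, as noted above, the Data Structure Wizard only allows final codelists to be reused when the DSD is created.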


Figure 2. Codelists definition

Step 2: Definition of the concept scheme and related concepts. A concept scheme, CPTS_CORE, was created and all the concepts belonging to it were defined. For the concept scheme the following information was included: the version (1.1), the agency (ESTAT) and a status (isfinal=true); this status is mandatory in order to reuse the concept scheme in the DSD creation. The concepts defined were:
- cpt_rip;
- cpt_s5;
- cpt_s11;
- core_md;
- cpt_y45, cpt_y46, …, cpt_y87, cpt_y88: one for every CrossSectional measure;
- err_md: the concept defining the measure’s error.

Step 3: DSD creation. The last step consists of creating the DSD using the information previously entered. For every component (dimension, measure and attribute) the following information was included: the version (1.1), the agency (ESTAT) and a status (isfinal=true). For all the components except cpt_rip, a “CrossSectionalAttachmentObservation = true” XML attribute was defined, while for cpt_rip a “CrossSectionalAttachSection=true” XML attribute was defined. The err_md component had an XML attribute “AttachmentLevel”=”Obs”. The components defined were:

- three dimensions, using the concepts cpt_rip, cpt_s5 and cpt_s11;
- a measure dimension, using the concept core_md;
- an attribute, using the concept err_md.

The DSD created (see Annex B, Table 1) was used as input to the converter tool.

Figure 3. DSD creation

4.2.2 The SDMX Converter tool

The SDMX Converter tool is a standalone desktop application used either to convert a CSV or a GESMES file into one of the SDMX data formats (compact, generic, cross-sectional) or to convert between SDMX data formats (e.g. from generic to compact). The application is produced by Eurostat and has been developed in Java.

It has a single window for configuring and running a conversion, in which the following basic information can be specified:

Input file path and name;


Output file path and name;

Input data format;

Output data format;

The specifications of the output file’s key family and dataflow (path and file name, ID, version and agency);

The characteristics of the header of the SDMX output file (edited or default);

A possible mapping between the fields in the input file and the dimensions or attributes of the keyfamily.

Once the configuration has been entered, clicking the "Convert" button starts the file conversion.

The conversion generates an XML file containing the data in the SDMX data format chosen for the conversion (Generic, Compact or Cross-Sectional).

4.2.3 Data translation from CORE format to SDMX format

The conversion from a data file produced by the CORE environment to a data file in SDMX format was applied at the “Compute Estimates and Sampling Errors” phase of the statistical process (as the aggregated data dissemination phase).

The experiment therefore aimed to convert into one of the SDMX data formats a CSV data file generated by the estimation processes implemented in the CORE environment, containing the values of the estimates and the sampling error of each estimate. The structure of the input file is shown in Table 1.

Field name    Description
Rip           Geographical repartitions dimension
S5            Sex dimension
S11           Professional status dimension
Mean.y45      Mean value for the y45 variable
Mean.y46      Mean value for the y46 variable
………           A column for every mean value from y47 to y87
Mean.y88      Mean value for the y88 variable
SE.Mean.y45   Error value for the y45 variable
SE.Mean.y46   Error value for the y46 variable
………           A column for every error value from y47 to y87
SE.Mean.y88   Error value for the y88 variable

Table 1 – Structure of the csv input file

In order to carry out this conversion, an SDMX DSD (Data Structure Definition) describing the data was created starting from the CORE file structure (see Table 2). The SDMX data format chosen to represent this type of data is the SDMX cross-sectional format: this choice was due to the fact that the CORE file contained not a single measure but a set of measures with the corresponding sampling errors, and had no time dimension.

The discrete qualitative variables of the file were transformed into dimensions, the set of measures was modelled as a measure dimension, and the first idea was to define each sampling error as an attribute of the corresponding measure. This latter operation, however, is not compatible with the SDMX data modelling rules, since attributes can be defined only for the whole set of measures and not for a single measure. To solve this problem, the input file structure was "verticalized": the dimension columns were kept and three new fields were defined:

one representing the measure dimension and containing the names of the CrossSectional measures (Mean.Name);

one containing the values of the CrossSectional measures (Mean.Value);

one containing the sampling error of every CrossSectional measure (SE.Mean.Value).

Field name      Description
Rip             Geographical repartitions dimension
S5              Sex dimension
S11             Professional status dimension
Mean.Name       Measure dimension
Mean.Value      Value of the mean for every CrossSectional measure
SE.Mean.Value   Sampling error for every CrossSectional measure

Table 2 – Structure of the “verticalized” csv input file
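The verticalization step can be sketched in a few lines of Python using only the standard csv module. This is not the procedure actually used in the experiment, whose implementation is not described here; the function name verticalize and the two-row sample data are made up for illustration, and the sample uses N=2 instead of N=44 to keep it short.

```python
import csv
import io


def verticalize(rows, k):
    """Turn wide rows (K dimensions, N means, N sampling errors) into long rows.

    `rows` is a list of lists whose first element is the header. Each output
    row holds the K dimension values, the measure name, its value and the
    matching sampling error.
    """
    header, data = rows[0], rows[1:]
    n = (len(header) - k) // 2
    if k + 2 * n != len(header):
        raise ValueError("header does not have K+2N columns")
    out = [header[:k] + ["Mean.Name", "Mean.Value", "SE.Mean.Value"]]
    for row in data:
        for j in range(n):
            out.append(row[:k] + [header[k + j], row[k + j], row[k + n + j]])
    return out


# Tiny made-up sample with K=3 and N=2 (the real file has N=44).
wide_csv = """Rip,S5,S11,Mean.y45,Mean.y46,SE.Mean.y45,SE.Mean.y46
1,1,3,10.5,20.0,0.4,0.9
2,2,1,11.0,19.5,0.5,0.8
"""
wide_rows = list(csv.reader(io.StringIO(wide_csv)))
long_rows = verticalize(wide_rows, k=3)

# Serialize the long table back to CSV text.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(long_rows)
long_csv = buffer.getvalue()
```

Each input row produces N output rows, so a file with R data rows and N measures becomes a file with R×N data rows and K+3 columns, in which every sampling error sits on the same row as the measure value it refers to.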

In this way it was possible to define in the DSD, in addition to the artefacts previously indicated, an attribute related to the sampling errors and attached at observation level (i.e. to each value of the measure).

An extract of the DSD is shown in Annex B.


4.2.4 Converting data using SDMX Converter

Once the DSD had been prepared, we converted the CORE file using the SDMX Converter tool. To do this, we entered all the information relating to the conversion (input data format: csv; output data format: SDMX Cross-sectional; name and path of the input and output files; ID, version and agency of the DSD, and its file path and file name), and finally we specified the mapping of the dimensions and attributes of the DSD to the corresponding fields of the file (identified by their ordinal position).

The figure below shows the SDMX Converter interface containing the conversion settings:

Figure 4. File conversion settings in the SDMX Converter tool’s interface
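The idea of mapping DSD components to file fields by ordinal position can be illustrated as follows. This Python sketch does not reproduce the SDMX Converter's own mapping format, which is configured through its interface; the dictionary COLUMN_OF, the function apply_mapping and the label OBS_VALUE for the observation value are hypothetical, and the assignment of err_md to the sampling error column is an assumption based on the DSD description above.

```python
# Hypothetical mapping: DSD component id -> 1-based column position in the
# "verticalized" CSV (Rip, S5, S11, Mean.Name, Mean.Value, SE.Mean.Value).
COLUMN_OF = {
    "cpt_rip": 1,
    "cpt_s5": 2,
    "cpt_s11": 3,
    "core_md": 4,    # measure dimension (name of the measure)
    "OBS_VALUE": 5,  # hypothetical label for the observation value
    "err_md": 6,     # sampling error, attached at observation level
}


def apply_mapping(row, column_of):
    """Return {component id: value} for one CSV row, given ordinal positions."""
    missing = [c for c, pos in column_of.items() if not 1 <= pos <= len(row)]
    if missing:
        raise ValueError(f"no column for components: {missing}")
    return {component: row[pos - 1] for component, pos in column_of.items()}


record = apply_mapping(["1", "1", "3", "Mean.y45", "10.5", "0.4"], COLUMN_OF)
```

Positions are 1-based here to match the way fields are usually counted in a user interface; a row that is too short for the mapping is rejected rather than silently truncated.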

At the end of the conversion, the tool generated an XML file containing the representation of the input data in the SDMX cross-sectional format.


Figure 5. Output file in Cross Sectional SDMX format

4.2.5 Final remarks

The experiment has shown the feasibility of the conversion to SDMX format of a data file obtained as a CORE output.

This activity, as can be seen from the description and the results of the experiment, was not fully automated: a series of evaluations and subsequent operations was necessary. First, the fields of the CORE output had to be mapped to the dimensions and attributes of the SDMX DSD. Second, as SDMX does not manage attributes attached to individual measures when there is more than one measure, the CORE output file (which was a normal output of a generalized Istat software system) had to be verticalized in order to convert it to the SDMX cross-sectional format, since the SDMX DSD had to describe specific attributes (sampling errors) that apply to each of the different measures.


5 Annexes

5.1 Annex A – Codelists used in the DSD

Cod  Description
1    North-west
2    North-east
3    Center
4    South
5    Islands

Table 1 – cl_rip

Cod  Description
1    S5_1 Male
2    S5_2 Female

Table 2 – cl_S5

Cod  Description
1    S11_1
2    S11_2
3    S11_3
4    S11_4
5    S11_5
6    S11_6
7    S11_7
8    S11_8
9    S11_9
10   S11_10

Table 3 – cl_s11


Cod        Description
Mean.y46   Mean.y46
Mean.y47   Mean.y47
Mean.y48   Mean.y48
Mean.y49   Mean.y49
Mean.y50   Mean.y50
Mean.y51   Mean.y51
Mean.y52   Mean.y52
Mean.y53   Mean.y53
Mean.y54   Mean.y54
Mean.y55   Mean.y55
Mean.y56   Mean.y56
Mean.y57   Mean.y57
Mean.y58   Mean.y58
Mean.y59   Mean.y59
Mean.y60   Mean.y60
Mean.y61   Mean.y61
Mean.y62   Mean.y62
Mean.y63   Mean.y63
Mean.y64   Mean.y64
Mean.y65   Mean.y65
Mean.y66   Mean.y66
Mean.y67   Mean.y67
Mean.y68   Mean.y68
Mean.y69   Mean.y69
Mean.y70   Mean.y70
Mean.y71   Mean.y71
Mean.y72   Mean.y72
Mean.y73   Mean.y73
Mean.y74   Mean.y74
Mean.y75   Mean.y75
Mean.y76   Mean.y76
Mean.y77   Mean.y77
Mean.y78   Mean.y78
Mean.y79   Mean.y79
Mean.y80   Mean.y80
Mean.y81   Mean.y81
Mean.y82   Mean.y82
Mean.y83   Mean.y83
Mean.y84   Mean.y84
Mean.y85   Mean.y85
Mean.y86   Mean.y86
Mean.y87   Mean.y87
Mean.y88   Mean.y88

Table 4 – cl_meas

5.2 Annex B – extract of the SDMX DSD file

To make the DSD (released together with the other documentation) easier to read, only an extract is reported. The main sections into which the DSD can be divided (concepts, codes and codelists) are indicated in bold, and the dotted lines indicate the parts of the DSD that have been omitted.

<?xml version="1.0" encoding="UTF-8"?><Structure xmlns="http://www.SDMX.org/resources/SDMXML/schemas/v2_0/message" xmlns:structure="http://www.SDMX.org/resources/SDMXML/schemas/v2_0/structure" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.SDMX.org/resources/SDMXML/schemas/v2_0/message SDMXMessage.xsd"> <Header> <ID>REGISTRY_RESPONSE</ID> <Test>true</Test> <Truncated>false</Truncated> <Prepared>2011-11-22T12:53:50</Prepared> <Sender id="EUROSTAT" />

    <Extracted>2011-11-22T12:53:50</Extracted>
  </Header>
  <CodeLists>
    <structure:CodeList id="CL_RIP" agencyID="ESTAT" version="1.1" isFinal="true">
      ……………….
    </structure:CodeList>
    <structure:CodeList id="CL_S5" agencyID="ESTAT" version="1.1" isFinal="true">
      ……………….
    </structure:CodeList>
    <structure:CodeList id="CL_S11" agencyID="ESTAT" version="1.1" isFinal="true">
      ……………….
    </structure:CodeList>
    <structure:CodeList id="CL_MEAS" agencyID="ESTAT" version="1.1" isFinal="true">
      <structure:Name xml:lang="en">Measure</structure:Name>
      <structure:Description xml:lang="en">Measure</structure:Description>
      <structure:Code value="Mean.y45" parentCode="" urn="">
        <structure:Description xml:lang="en">Mean.y45</structure:Description>
      </structure:Code>
      <structure:Code value="Mean.y46" parentCode="" urn="">
        <structure:Description xml:lang="en">Mean.y46</structure:Description>
      </structure:Code>
      ……………….
      <structure:Code value="Mean.y88" parentCode="" urn="">
        <structure:Description xml:lang="it">Mean.y88</structure:Description>
        <structure:Description xml:lang="en">Mean.y88</structure:Description>
      </structure:Code>
    </structure:CodeList>
  </CodeLists>
  <Concepts>
    <structure:ConceptScheme id="CPTS_CORE" agencyID="ESTAT" version="1.1" isFinal="true">
      <structure:Concept id="cpt_rip">
        <structure:Name xml:lang="en">ripartition</structure:Name>
        <structure:Description xml:lang="en">ripartition</structure:Description>
      </structure:Concept>
      <structure:Concept id="cpt_s5">
        <structure:Name xml:lang="en">s5</structure:Name>
        <structure:Description xml:lang="en">s5</structure:Description>
      </structure:Concept>
      <structure:Concept id="cpt_s11">
        <structure:Name xml:lang="en">s11</structure:Name>
        <structure:Description xml:lang="en">s11</structure:Description>
      </structure:Concept>
      <structure:Concept id="core_md">
        <structure:Name xml:lang="en">core measure</structure:Name>
        <structure:Description xml:lang="en">core</structure:Description>
      </structure:Concept>
      <structure:Concept id="time">
        <structure:Name xml:lang="en">time dimension</structure:Name>
        <structure:Description xml:lang="en">time dimension</structure:Description>
      </structure:Concept>
      <structure:Concept id="cpt_y45">
        <structure:Name xml:lang="en">cpt_Mean.y45</structure:Name>

        <structure:Description xml:lang="en">cpt_Mean.y45</structure:Description>
      </structure:Concept>
      <structure:Concept id="cpt_y46">
        <structure:Name xml:lang="en">cpt_Mean.y46</structure:Name>
        <structure:Description xml:lang="en">cpt_Mean.y46</structure:Description>
      </structure:Concept>
      ………..
      <structure:Concept id="cpt_y88">
        <structure:Name xml:lang="en">cpt_Mean.y88</structure:Name>
        <structure:Description xml:lang="en">cpt_Mean.y88</structure:Description>
      </structure:Concept>
      <structure:Concept id="err_md">
        <structure:Name xml:lang="en">error measure dimension</structure:Name>
        <structure:Description xml:lang="en">error measure dimension</structure:Description>
      </structure:Concept>
      <structure:Concept id="obs_value">
        <structure:Name xml:lang="en">observation value</structure:Name>
        <structure:Description xml:lang="en">observation value</structure:Description>
      </structure:Concept>
    </structure:ConceptScheme>
  </Concepts>
  <KeyFamilies>
    <structure:KeyFamily id="DSD_CORE" agencyID="ESTAT" version="1.0" isFinal="false">
      <structure:Name xml:lang="en">dsd core</structure:Name>
      <structure:Description xml:lang="en">dsd core</structure:Description>
      <structure:Components>
        <structure:Dimension conceptRef="cpt_rip" conceptSchemeRef="CPTS_CORE" conceptSchemeVersion="1.1" conceptSchemeAgency="ESTAT" codelist="CL_RIP" codelistVersion="1.1" codelistAgency="ESTAT" crossSectionalAttachSection="true" />
        <structure:Dimension conceptRef="cpt_s5" conceptSchemeRef="CPTS_CORE" conceptSchemeVersion="1.1" conceptSchemeAgency="ESTAT" codelist="CL_S5" codelistVersion="1.1" codelistAgency="ESTAT" crossSectionalAttachObservation="true" />
        <structure:Dimension conceptRef="cpt_s11" conceptSchemeRef="CPTS_CORE" conceptSchemeVersion="1.1" conceptSchemeAgency="ESTAT" codelist="CL_S11" codelistVersion="1.1" codelistAgency="ESTAT" crossSectionalAttachObservation="true" />
        <structure:Dimension conceptRef="core_md" conceptSchemeRef="CPTS_CORE" conceptSchemeVersion="1.1" conceptSchemeAgency="ESTAT" codelist="CL_MEAS" codelistVersion="1.1" codelistAgency="ESTAT" isMeasureDimension="true" />
        <structure:Attribute conceptRef="err_md" conceptSchemeRef="CPTS_CORE" conceptSchemeVersion="1.1" conceptSchemeAgency="ESTAT" attachmentLevel="Observation" assignmentStatus="Mandatory" crossSectionalAttachObservation="true" isEntityAttribute="true" />
        <structure:PrimaryMeasure conceptRef="obs_value" conceptSchemeRef="CPTS_CORE" conceptSchemeVersion="1.1" conceptSchemeAgency="ESTAT" />
        <structure:CrossSectionalMeasure conceptRef="cpt_y45" conceptSchemeRef="CPTS_CORE" conceptSchemeAgency="ESTAT" conceptSchemeVersion="1.1" measureDimension="core_md" code="Mean.y45" />
        <structure:CrossSectionalMeasure conceptRef="cpt_y46" conceptSchemeRef="CPTS_CORE" conceptSchemeAgency="ESTAT" conceptSchemeVersion="1.1" measureDimension="core_md" code="Mean.y46" />
        ..............
        <structure:CrossSectionalMeasure conceptRef="cpt_y88" conceptSchemeRef="CPTS_CORE" conceptSchemeAgency="ESTAT" conceptSchemeVersion="1.1" measureDimension="core_md" code="Mean.y88" />
      </structure:Components>
    </structure:KeyFamily>
  </KeyFamilies>
</Structure>

Table 5 - extract of the DSD
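The way the sections of the extract relate to each other (codelists, dimensions that reference them, and cross-sectional measures tied to codes of the measure dimension) can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name summarize_dsd and the reduced SAMPLE_DSD string are hypothetical, the sample keeps only a few elements and attributes from the extract above, and the parsing relies solely on the Python standard library rather than on any SDMX tool.

```python
import xml.etree.ElementTree as ET

# Namespaces as declared in the DSD extract (SDMX-ML 2.0).
NS = {
    "message": "http://www.SDMX.org/resources/SDMXML/schemas/v2_0/message",
    "structure": "http://www.SDMX.org/resources/SDMXML/schemas/v2_0/structure",
}

# Hypothetical, reduced stand-in for the DSD file: omitted parts are left out
# so that the string is well-formed XML.
SAMPLE_DSD = """
<Structure xmlns="http://www.SDMX.org/resources/SDMXML/schemas/v2_0/message"
           xmlns:structure="http://www.SDMX.org/resources/SDMXML/schemas/v2_0/structure">
  <CodeLists>
    <structure:CodeList id="CL_MEAS" agencyID="ESTAT" version="1.1" isFinal="true">
      <structure:Code value="Mean.y45" parentCode="" urn="">
        <structure:Description xml:lang="en">Mean.y45</structure:Description>
      </structure:Code>
      <structure:Code value="Mean.y46" parentCode="" urn="">
        <structure:Description xml:lang="en">Mean.y46</structure:Description>
      </structure:Code>
    </structure:CodeList>
  </CodeLists>
  <KeyFamilies>
    <structure:KeyFamily id="DSD_CORE" agencyID="ESTAT" version="1.0" isFinal="false">
      <structure:Components>
        <structure:Dimension conceptRef="cpt_rip" codelist="CL_RIP"
                             crossSectionalAttachSection="true" />
        <structure:Dimension conceptRef="core_md" codelist="CL_MEAS"
                             isMeasureDimension="true" />
        <structure:CrossSectionalMeasure conceptRef="cpt_y45"
                                         measureDimension="core_md" code="Mean.y45" />
      </structure:Components>
    </structure:KeyFamily>
  </KeyFamilies>
</Structure>
"""


def summarize_dsd(xml_text):
    """Return (codelists, dimensions, measures) found in an SDMX-ML 2.0 structure message."""
    root = ET.fromstring(xml_text)

    # Codelist id -> list of code values.
    codelists = {
        cl.get("id"): [code.get("value") for code in cl.findall("structure:Code", NS)]
        for cl in root.findall("message:CodeLists/structure:CodeList", NS)
    }

    # (concept, codelist, is measure dimension) for every dimension of the key family.
    dimensions = [
        (dim.get("conceptRef"), dim.get("codelist"), dim.get("isMeasureDimension") == "true")
        for dim in root.findall(
            ".//structure:KeyFamily/structure:Components/structure:Dimension", NS
        )
    ]

    # Code of the measure dimension -> concept of the cross-sectional measure.
    measures = {
        m.get("code"): m.get("conceptRef")
        for m in root.findall(".//structure:CrossSectionalMeasure", NS)
    }
    return codelists, dimensions, measures


if __name__ == "__main__":
    codelists, dimensions, measures = summarize_dsd(SAMPLE_DSD)
    print("Codelists:", codelists)
    print("Dimensions:", dimensions)
    print("Cross-sectional measures:", measures)
```

In the sketch, each cross-sectional measure is keyed by its code attribute, which is how the extract links a concept such as cpt_y45 to the code Mean.y45 of the CL_MEAS codelist through the core_md measure dimension.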

5.3 Annex C – Minutes of CORE-SDMX Meeting 09-05-2011

Participants:

Francesco Rizzo (SDMX ESSnet)

Laura Vignola (SDMX ESSnet)

Alessio Cardacino (SDMX and CORE ESSnet)

Mauro Bruno (CORE ESSnet)

Monica Scannapieco (CORE ESSnet)

The meeting started at 10 a.m. The main topics discussed were:

1. CORE scenario implementation;

2. CORE environment implementation;

3. SDMX usage for micro data;

4. SDMX usage for macro data.

With respect to point 1, Scannapieco and Bruno illustrated the selected scenario, which implements a statistical process covering the sample allocation, calibration and estimate computation steps. The Istat tools selected to be made CORA compliant are MAUSS, ad-hoc SAS procedures and GENESEES. The inputs and outputs of the tools were described, and the SDMX participants gave feedback on possible issues in translating them to SDMX.

With respect to point 2, Scannapieco and Bruno illustrated the ongoing activities on the design and implementation of the CORE environment, specifically: (i) GUI realization; (ii) workflow engine selection; and (iii) API implementation. On the API implementation, a discussion started on the format and the model for representing the data to be exchanged within CORA processes. In particular, Francesco Rizzo suggested consulting “SDMX Information Model: UML Conceptual Design”, sections “11: Process and Transitions” and “12: Transformations and Expressions”. Moreover, Rizzo stressed the importance of having a global schema for data reconciliation, and Scannapieco answered that such a reconciliation step is taken into account in the CORE environment, though not explicitly supported by a dedicated tool.

In relation to point 3, the group discussed the relationships between SDMX concepts and CORE layers, and a point about terminology emerged. Scannapieco and Bruno illustrated the idea of performing a translation to SDMX after the CORE output has been produced. Rizzo pointed out that the output could be produced directly in SDMX. Scannapieco noted that, although SDMX compatibility must be checked, a first step of output production in CORE format is probably needed.

In relation to point 4, Rizzo illustrated the schemata supported by the mapping assistant tool, and a discussion started on the translation of CORE data into a dissemination database conforming to such schemata. From the discussion, such a translation appeared difficult to implement. Nevertheless, it was decided to perform an in-depth analysis to clarify this point.

The meeting ended around 12:00 p.m.

5.4 Annex D – Minutes of CORE-SDMX Meeting 11-07-2011

Participants:

Francesco Rizzo (SDMX ESSnet)

Laura Vignola (SDMX ESSnet)

Mauro Bruno (CORE ESSnet)

Monica Scannapieco (CORE ESSnet)

Giulia Vaste (CORE ESSnet)

The meeting started at 14:30. The main topics discussed were:

1. Illustration of CORE status of advancement

2. Usage of SDMX inside CORE implementation scenario

With respect to point 1, the CORE team illustrated the design of the CORE XML schemas. The CORE team also described the general CORE architecture and its various components (GUIs, process engine, etc.). The SDMX team asked for more details on the XML schemas, and it was agreed that technical material would be sent later for a better understanding.

With respect to point 2, the CORE team illustrated the current scenario implementation. The CORE team then proposed integrating an SDMX component, namely the SDMX converter, into the scenario. Rizzo remarked that this solution could be more didactic than viable in practice. Hence, he suggested adopting a solution oriented towards dissemination databases. Following the suggestion made at the first meeting, it was agreed to study the feasibility of mapping CORE to such dissemination schemata.

Then, a detailed discussion on the mapping between the SDMX and CORE models started.

Rizzo outlined the need to map the DSD to the CORE global schema. The CORE team pointed out that CORE is probably at the level of the MSD. Rizzo explained that the MSD and the DSD are exchanged together, but that there is no strictly supported connection between them. It was then discussed how to generate a DSD starting from CORE XML files. The SDMX team gave some indications on how to produce DSDs.

A further discussion followed on the level of automation of translations from CORE to SDMX. It emerged that a fully automated translation would not be possible and that human intervention would be necessary.

The following to-do list was proposed by the CORE team.

1) Addition to the CORE scenario of a service (SERV) producing macro data;

2) Tentative mapping of SERV CORE output (and related schemata) into DSD/MSD;

3) Addition to the CORE implementation scenario of a dissemination service that produces final SDMX data (enhanced by CORE information). This dissemination service could be, for example, I.stat, which is Istat's current solution for web dissemination.

The meeting ended around 16:30.
