SDMX Advanced Topics on Technical Standards Arofan Gregory and Chris Nelson SDMX Capacity Building Workshop Washington January 11 2007

SDMX Advanced Topics on Technical Standards

Arofan Gregory and Chris Nelson

SDMX Capacity Building Workshop Washington January 11 2007

Advanced Topics (1)

• Many of these will be presented in the context of a live prototype system
  – Data structures
  – Provisioning metadata
  – Registry interfaces
    • Submit structure and provisioning metadata
    • Query for structure
    • Register data and metadata set
    • Query for registered data and metadata sets
  – Alignment with other standards

Advanced Topics (2)

• Others will be presented by explanation and example
  – Hierarchical Code Set
  – Structure Set
  – Reporting Taxonomy
  – Services based architecture
    • Notification
    • RSS feed

SDMX Technical Standards

Information Model and Technical Specifications: High Level

Overview (Reminder)

Information Model: High Level Schematic

• Category Scheme: comprises subject or reporting categories
• Category: can have child categories
• Data or Metadata Flow: can be linked to categories in multiple category schemes; uses a specific data/metadata structure (Structure Definition); can get data/metadata from multiple data/metadata providers
• Data Provider: can provide data/metadata for many data/metadata flows using an agreed data/metadata structure; publishes/reports data sets or metadata sets
• Provision Agreement: pairs a Data Provider with a Data or Metadata Flow
• Data Set or Metadata Set: conforms to business rules of the data/metadata flow
• Registration of a Data or Metadata Set (URL, registration date etc.): registers existence of data and metadata sets
• Structure Maps: Structure and Code List maps

SDMX Technical Standards

SDMX Registry

SDMX Registry Interfaces

SDMX Registry/Repository

• REPOSITORY, Structural Metadata (Submit, Query): describes data and metadata structures
• REPOSITORY, Provisioning Metadata (Submit, Query): describes data and metadata sources and reporting processes
• REGISTRY, Data Set/Metadata Set (Register, Query): indexes data and metadata
• Subscription/Notification: applications can subscribe to notification of new or changed objects

SDMX Artefacts: Registry Contents

The artefacts and relationships are those of the high level schematic of the information model, grouped by the kind of registry content:

• Structural Metadata: Category Scheme, Category, Data or Metadata Flow, Structure Definition, Structure Maps (Structure and Code List maps)
• Provisioning Metadata: Data Provider, Provision Agreement
• Registered Data and Metadata: Data Set (URL, registration date etc.), which registers existence of data and metadata sets

Registry Interfaces

SDMX Technical Standards

Practical Examples

SDMX in Action: Prototype System

Flow of FAO CountrySTAT-RegionSTAT Implementation

[Diagram: CountrySTAT National Publication Server(s), FAO SDMX Registry and RegionSTAT Regional Publication Server, linked by steps 1, 2, 3a, 3b and 4]

FOOD AND AGRICULTURE ORGANIZATION OF THE UNITED NATIONS

Slide courtesy of the FAO

Prototype System: Explanation

1 CountryStat National Publication Server
• The web site is published from the files in CountryStat

SDMX Publication
• The new CountryStat files are converted to SDMX-ML data sets and made web accessible on the CountryStat web site
• These files are registered in the FAO SDMX Registry

RegionStat Regional Publication Server
• Queries the registry for new registrations; the registry responds with registration details, including the URL of the new data sets
• Retrieves the new data sets from the CountryStat web site
• Converts the SDMX-ML files to an internal format and integrates the new data sets with existing RegionStat data sets
• Re-publishes the RegionStat web site

Diagram step labels: 2, 3a, 3b, 4

FOOD AND AGRICULTURE ORGANIZATION OF THE UNITED NATIONS

Slide courtesy of the FAO
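The RegionStat side of this flow (query for new registrations, retrieve, convert) can be sketched roughly as below. This is only an illustration, not the prototype's code: the `Registration` record, the `harvest` function, the example URLs and the stand-in `fetch`/`convert` callables are all hypothetical, and a real system would use HTTP retrieval and an SDMX-ML parser in their place.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Registration:
    """Hypothetical registry record: metadata about a data set, not the data."""
    provision_agreement: str
    url: str
    registered_at: datetime


def harvest(registrations: List[Registration],
            since: datetime,
            fetch: Callable[[str], str],
            convert: Callable[[str], dict]) -> Dict[str, dict]:
    """Return converted data sets for registrations made after `since`."""
    harvested = {}
    for reg in registrations:
        if reg.registered_at > since:              # query: only new registrations
            sdmx_ml = fetch(reg.url)               # retrieve from the national site
            harvested[reg.url] = convert(sdmx_ml)  # convert to an internal format
    return harvested


# Stand-ins for the national web site and the SDMX-ML conversion (assumptions).
fake_site = {"http://example.org/benin.xml": "<DataSet country='BJ'/>"}
registry = [
    Registration("PA-1", "http://example.org/benin.xml", datetime(2007, 1, 10)),
    Registration("PA-2", "http://example.org/old.xml", datetime(2006, 6, 1)),
]
result = harvest(registry, datetime(2007, 1, 1),
                 fetch=fake_site.__getitem__,
                 convert=lambda xml: {"raw_length": len(xml)})
```

The registry only hands back URLs; the data themselves stay on the national publication server until the regional server fetches them.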

SDMX Technical Standards

Data Structure Definitions: Registration and Query

Data Set and Structure

Reference Region

Commodity

Frequency and Time

Observation Value

Measure Type

Unit and Unit Multiplier

Measurement = 1,000 Kg

Data Set: Structure

• Comprises
  – Concepts that identify the observation value (Dimensions)
  – Concepts that add additional metadata about the observation value (Attributes)
  – Concept that is the observation value (Measure)
  – Any of these may be (Representation)
    • coded
    • text
    • date/time
    • number
    • etc.

Data Set and Structure

Reference Region

Commodity

Frequency and Time

Observation Value

Measure Type

Unit and Unit Multiplier

Measurement = 1,000 Kg

Dimensions: Frequency, Reference Region, Commodity, Time

Measure: Observation Value

Attributes: Unit, Unit Multiplier

Data Structure Definition

• Key (Dimensions): concepts that identify the observation
• Group Key: concepts that identify groups of keys
• Attributes: concepts that add metadata
• Measures: concepts that are the observed phenomenon
• Each Dimension, Attribute and Measure takes its semantic from a Concept and has a format given by a Representation

Registry Contents - DSD

Data Structure Definition: AGRICULTURE_COMMODITY

• Dimensions (Key): FREQ, REF_AREA_REG, COMMODITY, TIME
• Representation of the dimensions (code lists): CL_FREQ, CL_AREA_CTY, CL_COMMODITY
• Attributes: UNIT, UNIT_MULT
• Representation of the attributes (code lists): CL_MEASURE_UNIT, CL_UNIT_MULT
• Measure: OBS_VALUE
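As a rough illustration of how these pieces fit together, the DSD can be modelled as a small data structure. The class names and the `missing_dimensions` helper are hypothetical, and the pairing of each concept with a code list is an assumption read off the order of the labels on the slide; the code values in the example key are invented.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Component:
    concept_id: str                 # concept the component takes its semantic from
    codelist: Optional[str] = None  # coded representation, None if uncoded


@dataclass
class DataStructureDefinition:
    id: str
    dimensions: List[Component]
    attributes: List[Component]
    measure: Component

    def missing_dimensions(self, key: Dict[str, str]) -> List[str]:
        """List the dimensions that a series/observation key fails to supply."""
        return [d.concept_id for d in self.dimensions if d.concept_id not in key]


# Concept/code list pairing below is assumed, not stated in the slides.
dsd = DataStructureDefinition(
    id="AGRICULTURE_COMMODITY",
    dimensions=[
        Component("FREQ", "CL_FREQ"),
        Component("REF_AREA_REG", "CL_AREA_CTY"),
        Component("COMMODITY", "CL_COMMODITY"),
        Component("TIME"),
    ],
    attributes=[
        Component("UNIT", "CL_MEASURE_UNIT"),
        Component("UNIT_MULT", "CL_UNIT_MULT"),
    ],
    measure=Component("OBS_VALUE"),
)

# Invented code values, for illustration only.
partial_key = {"FREQ": "A", "REF_AREA_REG": "29", "COMMODITY": "WHEAT"}
full_key = dict(partial_key, TIME="2005")
```

Because the dimensions are the concepts that identify an observation, a key that omits one of them cannot identify a value; attributes carry no such requirement in this sketch.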

Registry Interfaces: Submit Structure

Data Structure Definition Artefacts

Registry Interfaces: Submit Structure

Registry Interfaces: Query Structure

Query for KeyFamily with resolveReferences set to “true” will return all related Concepts and Code Lists

Registry Interfaces: Query Structure

The registry will respond with all DSDs maintained by the FAOSTAT agency
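The effect of resolveReferences can be sketched with a small in-memory store standing in for the registry. This is not the SDMX-ML query message itself: the `STORE` layout and the `query_structures` function are hypothetical, and treating the concepts and code lists as maintained by FAOSTAT is an assumption made only for the example.

```python
from typing import List

# artefact id -> (maintenance agency, kind, ids of artefacts it references)
STORE = {
    "AGRICULTURE_COMMODITY": ("FAOSTAT", "KeyFamily", ["FREQ", "CL_FREQ", "OBS_VALUE"]),
    "FREQ": ("FAOSTAT", "Concept", []),
    "OBS_VALUE": ("FAOSTAT", "Concept", []),
    "CL_FREQ": ("FAOSTAT", "CodeList", []),
    "CL_OTHER": ("OTHER_AGENCY", "CodeList", []),
}


def query_structures(agency: str,
                     kind: str = "KeyFamily",
                     resolve_references: bool = False) -> List[str]:
    """Return ids of matching artefacts, optionally with everything they reference."""
    matches = [aid for aid, (ag, k, _) in STORE.items() if ag == agency and k == kind]
    if not resolve_references:
        return sorted(matches)
    seen = set()
    stack = list(matches)
    while stack:                       # follow references transitively
        aid = stack.pop()
        if aid in seen:
            continue
        seen.add(aid)
        stack.extend(STORE[aid][2])
    return sorted(seen)


only_dsds = query_structures("FAOSTAT")
with_refs = query_structures("FAOSTAT", resolve_references=True)
```

Without resolution the caller gets just the DSDs maintained by the agency; with it, the related concepts and code lists come back in the same response, so no follow-up queries are needed.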

SDMX Technical Standards

Dataflows, Data Providers, Category Scheme

Registry Contents – Other Structures

• Structure Definition: FAOSTAT:AGRICULTURE_COMMODITY
• Data Flows: FAOSTAT:AGRICULTURE_AREA, FAOSTAT:AGRICULTURE_PRODUCTION
• Data Providers: FAOSTAT:OS_FAO_DATA_PROVIDER.29 (Bénin), FAOSTAT:OS_FAO_DATA_PROVIDER.42 (Burkina Faso), FAOSTAT:OS_FAO_DATA_PROVIDER.66 (Côte d’Ivoire), FAOSTAT:OS_FAO_DATA_PROVIDER.217 (Sénégal)
• Category Scheme: SDMX:SDMXStatSubMatDomainsWD1 (adoption of UNECE Classification of International Statistical Activities)
• Category: SDMX:SDMXStatSubMatDomainsWD1.Domain_2.C4.C1 (Economic Statistics.Sectoral Statistics.Agriculture, forestry, fisheries)

The data flows are connected to the relevant Category in the Category Scheme

Registry Interface: Submit Structure

Artefacts

Registry Interface: Submit Structure

Category Scheme

Registry Interface: Submit Structure

Links the Dataflow to the (Subject Matter Domain) Category

Data Providers

Dataflow

SDMX Technical Standards

Submit Provision Agreements

Registry Contents – Structure and Provisioning

• Structure Definition: FAOSTAT:AGRICULTURE_COMMODITY
• Data Flows: FAOSTAT:AGRICULTURE_AREA, FAOSTAT:AGRICULTURE_PRODUCTION
• Data Providers: FAOSTAT:OS_FAO_DATA_PROVIDER.29 (Bénin), FAOSTAT:OS_FAO_DATA_PROVIDER.42 (Burkina Faso), FAOSTAT:OS_FAO_DATA_PROVIDER.66 (Côte d’Ivoire), FAOSTAT:OS_FAO_DATA_PROVIDER.217 (Sénégal)
• Category Scheme: SDMX:SDMXStatSubMatDomainsWD1 (adoption of UNECE Classification of International Statistical Activities)
• Category: SDMX:SDMXStatSubMatDomainsWD1.Domain_2.C4.C1 (Economic Statistics.Sectoral Statistics.Agriculture, forestry, fisheries)
• Provision Agreements: there are eight provision agreements, one for each combination of Data Provider and Data Flow
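The count of eight follows from pairing every Data Provider with every Data Flow. A minimal sketch, assuming the two data flows are AGRICULTURE_AREA and AGRICULTURE_PRODUCTION (as the later query slide suggests) and using plain tuples as a hypothetical representation of a provision agreement:

```python
from itertools import product

DATA_PROVIDERS = [
    "FAOSTAT:OS_FAO_DATA_PROVIDER.29",   # Bénin
    "FAOSTAT:OS_FAO_DATA_PROVIDER.42",   # Burkina Faso
    "FAOSTAT:OS_FAO_DATA_PROVIDER.66",   # Côte d'Ivoire
    "FAOSTAT:OS_FAO_DATA_PROVIDER.217",  # Sénégal
]
DATAFLOWS = [
    "FAOSTAT:AGRICULTURE_AREA",
    "FAOSTAT:AGRICULTURE_PRODUCTION",
]

# One provision agreement per (data provider, data flow) combination.
provision_agreements = list(product(DATA_PROVIDERS, DATAFLOWS))
```

Four providers times two flows gives the eight agreements; the same pairing with three metadata flows would give twelve.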

Registry Interface: Submit Provision Agreement

Unique Id. of the Dataflow

Unique Id. of the Data Provider

Registry Interface: Submit Provision Agreement

Unique Id. of the Dataflow

Unique Id. of the Data Provider

Registry Interface: Submit Provision Agreement Response

The status indicates success or failure

Registry Interface: Submit Provision Agreement Response

The response returns the URN as well as confirmation of the provisioning details submitted

SDMX Structured URNs

• The URNs in SDMX are compound identifiers which reflect the relationships described in the information model
  – They are unique and predictable
  – They can be easily validated
  – They function exactly like URLs for the registry
• Each identifier tells you which organization maintains the identified object
• Each identifier tells you which agency maintains the scheme from which the identifier comes

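URN Structure

urn:sdmx:org.sdmx.infomodel.registry.ProvisionAgreement=FAOSTAT:OS_FAO_DATA_PROVIDER.29.FAOSTAT:AGRICULTURE_PRODUCTION

• ProvisionAgreement: the class of the identified object
• FAOSTAT: Maintenance Agency of the Data Provider Scheme
• OS_FAO_DATA_PROVIDER: Data Provider Scheme
• 29: Data Provider
• FAOSTAT: Maintenance Agency of the Dataflow
• AGRICULTURE_PRODUCTION: Dataflow

The Provision Agreement links the Data Provider and the Data Flow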
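Because the URN is compound and predictable, it can be assembled from its parts and taken apart again. The sketch below handles only the Provision Agreement pattern shown on this slide; the function names and the regular expression are illustrative assumptions, not part of the SDMX specification.

```python
import re
from typing import Dict

PREFIX = "urn:sdmx:org.sdmx.infomodel.registry.ProvisionAgreement="

# agency:scheme.provider.agency:dataflow, with ids free of ':' and '.'
_PATTERN = re.compile(
    r"^" + re.escape(PREFIX)
    + r"(?P<provider_agency>[^:.]+):(?P<provider_scheme>[^:.]+)"
    + r"\.(?P<data_provider>[^:.]+)"
    + r"\.(?P<dataflow_agency>[^:.]+):(?P<dataflow>[^:.]+)$"
)


def build_urn(provider_agency: str, provider_scheme: str, data_provider: str,
              dataflow_agency: str, dataflow: str) -> str:
    """Assemble a Provision Agreement URN from its component identifiers."""
    return (f"{PREFIX}{provider_agency}:{provider_scheme}.{data_provider}"
            f".{dataflow_agency}:{dataflow}")


def parse_urn(urn: str) -> Dict[str, str]:
    """Split a Provision Agreement URN into its parts, or raise ValueError."""
    match = _PATTERN.match(urn)
    if match is None:
        raise ValueError(f"not a Provision Agreement URN: {urn!r}")
    return match.groupdict()


example = PREFIX + "FAOSTAT:OS_FAO_DATA_PROVIDER.29.FAOSTAT:AGRICULTURE_PRODUCTION"
parts = parse_urn(example)
```

Round-tripping `parts` through `build_urn(**parts)` reproduces the original string, which is one simple way to check the "easily validated" property.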

SDMX Technical Standards

Register a Data Set

Data Set Registration

[Diagram: Data Flow, Data Provider, Provision Agreement, Structure Definition and Data Set; the registration (URL, registration date etc.) registers existence of data set]

• The data set is “registered” against the provision agreement
• The registry stores metadata (e.g. URL) about the data set: it does not store the data set

Registry Contents – Data Set Registrations

• Structure Definition: FAOSTAT:AGRICULTURE_COMMODITY
• Data Flows: FAOSTAT:AGRICULTURE_AREA, FAOSTAT:AGRICULTURE_PRODUCTION
• Data Providers: FAOSTAT:OS_FAO_DATA_PROVIDER.29 (Bénin), FAOSTAT:OS_FAO_DATA_PROVIDER.42 (Burkina Faso), FAOSTAT:OS_FAO_DATA_PROVIDER.66 (Côte d’Ivoire), FAOSTAT:OS_FAO_DATA_PROVIDER.217 (Sénégal)
• Category Scheme and Category: as in the structure and provisioning contents
• Data Set Metadata: URL, registration date etc.

There can be eight data sets registered, one for each Provision Agreement

Registry Interface: Data Set Registration

Action is “replace”, “append” etc.

An SDMX-ML file is a simple datasource

Identifies the Provision Agreement either by URN or by Dataflow and Data Provider
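A registration request therefore carries three things: an action, the location of the datasource, and the Provision Agreement, given either directly or as a Dataflow plus Data Provider. The sketch below is a hypothetical in-memory model of that idea, not the SDMX-ML message; the class and parameter names and the example URLs are assumptions. The action is stored as metadata only, and the data set itself is never held.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List, Optional

URN_PREFIX = "urn:sdmx:org.sdmx.infomodel.registry.ProvisionAgreement="


@dataclass(frozen=True)
class DataSetRegistration:
    provision_agreement_urn: str
    datasource_url: str      # where the SDMX-ML file lives
    action: str              # e.g. "replace" or "append"
    registered_at: datetime


class Registry:
    """Hypothetical registry: keeps registrations, never the data sets."""

    def __init__(self) -> None:
        self._by_agreement: Dict[str, List[DataSetRegistration]] = {}

    def register(self, datasource_url: str, action: str, *,
                 urn: Optional[str] = None,
                 data_provider: Optional[str] = None,
                 dataflow: Optional[str] = None) -> DataSetRegistration:
        if urn is None:
            if data_provider is None or dataflow is None:
                raise ValueError("give a URN, or both data_provider and dataflow")
            urn = f"{URN_PREFIX}{data_provider}.{dataflow}"
        reg = DataSetRegistration(urn, datasource_url, action,
                                  datetime.now(timezone.utc))
        self._by_agreement.setdefault(urn, []).append(reg)
        return reg

    def lookup(self, urn: str) -> List[DataSetRegistration]:
        return list(self._by_agreement.get(urn, []))


registry = Registry()
by_urn = registry.register(
    "http://example.org/29/production-2005.xml", "replace",
    urn=URN_PREFIX + "FAOSTAT:OS_FAO_DATA_PROVIDER.29.FAOSTAT:AGRICULTURE_PRODUCTION")
by_parts = registry.register(
    "http://example.org/29/production-2006.xml", "append",
    data_provider="FAOSTAT:OS_FAO_DATA_PROVIDER.29",
    dataflow="FAOSTAT:AGRICULTURE_PRODUCTION")
```

Both ways of identifying the Provision Agreement resolve to the same URN, so the two registrations end up filed together.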

Registry Interface: Data Set Registration

URL of the SDMX-ML file

URN of the Provision Agreement

SDMX Technical Standards

Query for a Data Set

Query for Data Sets

[Diagram: Data Flows AGRICULTURE_AREA and AGRICULTURE_PRODUCTION; Data Providers 29 - Bénin, 42 - Burkina Faso, 66 - Côte d’Ivoire, 217 - Sénégal; Provision Agreements; Structure Definition; Data Sets and Data Set Metadata]

Query for Data Sets
• for all Provision Agreements linked to a Data Flow, or
• linked to a specific Provision Agreement
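The two query scopes differ only in whether the Data Provider is fixed. A minimal sketch over an invented list of registrations (the `Registration` tuple, the URLs and the function name are hypothetical):

```python
from typing import List, NamedTuple, Optional


class Registration(NamedTuple):
    data_provider: str
    dataflow: str
    url: str


REGISTRATIONS = [
    Registration("29", "AGRICULTURE_AREA", "http://example.org/29/area.xml"),
    Registration("29", "AGRICULTURE_PRODUCTION", "http://example.org/29/production.xml"),
    Registration("42", "AGRICULTURE_PRODUCTION", "http://example.org/42/production.xml"),
]


def query_data_sets(dataflow: str, data_provider: Optional[str] = None) -> List[str]:
    """URLs registered for a data flow, optionally narrowed to one provider.

    With only `dataflow`, every Provision Agreement linked to that flow matches;
    adding `data_provider` pins the query to a single Provision Agreement.
    """
    return [r.url for r in REGISTRATIONS
            if r.dataflow == dataflow
            and (data_provider is None or r.data_provider == data_provider)]


all_production = query_data_sets("AGRICULTURE_PRODUCTION")
benin_production = query_data_sets("AGRICULTURE_PRODUCTION", data_provider="29")
```

Either way the response is a list of locations; retrieving the data is a separate step against those URLs.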

Registry Interface: Query for Data Sets

QueryType is “DataSets”, “MetadataSets” etc.

Registry Interface: Query for Data Sets

Could be done with URN or, as shown here, with explicit fields

Registry Interface: Data Set Query Response

URL of the SDMX-ML file

Identification of the Provision Agreement

Registry Interface: Data Set Query Response

Note that the URN of the registered data set includes the date and time of registration

SDMX Technical Standards

Metadata Structure Definition

Metadata – Reported according to a Quality Framework

Metadata pertaining to a Quality Framework are reported in a Metadata Set, whose structure is defined by a Metadata Structure Definition

Metadata Attribute Metadata Attribute: Metadata Content

Metadata Reporting

“Quality” metadata about published or reported data sets are linked to the Provision Agreement, or the Data Flow, or the Data Provider

[Diagram: Metadata Report attached to Data Flow (AGRICULTURE_AREA, AGRICULTURE_PRODUCTION), Data Provider (29 - Bénin, 42 - Burkina Faso, 66 - Côte d’Ivoire, 217 - Sénégal) or Provision Agreement]

Metadata Structure Definition (MSD)

Provision Agreement

Identify Structure
• Concepts
• Hierarchies
• Representation (e.g. code list)

Metadata Structure Definition

Elements: Full Target Identifier, Partial Target Identifier, Identifier Components, Item Scheme, Report Structure, Target Object Type

Relationships:
• uses defined concepts
• defines “keys” of object types to which metadata can be “attached”
• specifies the identifier components (“key”) of the target object
• identifies the code list from which the value of the (key) component must be taken when metadata is reported
• identifies target object type of the component

MSD – Defining the Metadata Report

Elements: Metadata Structure Definition, Report Structure, Metadata Attributes, Concept, Concept Scheme, Format and Permitted Value List

Relationships:
• can comprise the specification of one or more reports
• can have hierarchy
• takes semantic and context from
• concept defined in
• definition of format and permitted values

MSD – Complete Picture

Elements: Metadata Structure Definition, Full Target Identifier, Partial Target Identifier, Identifier Components, Item Scheme, Target Object Type, Report Structure, Metadata Attributes, Concept, Concept Scheme, Format and Permitted Value List

Relationships:
• uses defined concepts
• defines “keys” of object types to which metadata can be “attached”
• specifies to which object types the Report can be “attached”
• specifies the identifier components (“key”) of the target object
• identifies the code list from which the value of the (key) component must be taken when metadata is reported
• identifies target object type of the component
• can have hierarchy
• takes semantic and context from
• concept defined in
• definition of format and permitted values

Full Target Identifier

Partial Target Identifier

Metadata Structure Definition

Identifier Components

Item Scheme

Target Object Type

QUALITY_METADATA

P_AGREEMENT

AGENCY

DATAFLOW

Dataflow

DataProvider

FAOSTAT:OS_FAO_DATA_PROVIDER

FAOSTAT:DATAFLOWS

MSD – Identification of the Target

MSD Metadata Concepts: Data Quality

Concepts (Concept Id: Description)

• DISSEMINATION_FORMATS*: Refers to the various means of dissemination used for making the data available to the public. It would include a description of the various formats available, including where and how to get the information (paper, electronic formats, longer time series).
• FREQUENCY_PERIODICITY*: Frequency refers to the time interval between the observations of a time series. Periodicity refers to the frequency of compilation of the data (e.g., a time series could be available at annual frequency but the underlying data are compiled monthly, thus have a monthly periodicity).
• PERIODICITY: The frequency of compilation of the data.
• FREQUENCY: The time interval between the observations of a time series.
• RELEASE_CALENDAR*: Describes the policy regarding the release of statistics according to a preannounced schedule and its availability. It also contains the release calendar information.

MSD – Data Quality Report

• Report Structure: DATA_QUALITY_REPORT
• Concept Scheme: FAOSTAT:METADATA_CONCEPTS
• Metadata Attributes (Concepts): SCOPE_COVERAGE, FREQUENCY_PERIODICITY, PERIODICITY, DISSEMINATION_FORMATS, SOURCE_DATA, REFERENCE_PERIOD, ADVANCE_RELEASE_CALENDAR, TIMELINESS
• Format and Permitted Value List: this varies depending on the Metadata Attribute: Scope_Coverage and Source_Data are text, Reference_Period is Date/Time, and the remainder are linked to a Code List
• Hierarchy: the reporting hierarchy must respect the concept hierarchy. No additional reporting hierarchy is specified
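The per-attribute formats can be read as a small validation table. The sketch below checks a reported value against the format of its Metadata Attribute; the `REPORT_STRUCTURE` layout and `validate` function are hypothetical, only a few attributes are shown, and the permitted values for PERIODICITY are invented because the slides do not name the code lists.

```python
from datetime import datetime

TEXT, DATETIME, CODED = "text", "datetime", "coded"

# attribute id -> (format, permitted codes or None)
REPORT_STRUCTURE = {
    "SCOPE_COVERAGE": (TEXT, None),
    "SOURCE_DATA": (TEXT, None),
    "REFERENCE_PERIOD": (DATETIME, None),
    "PERIODICITY": (CODED, {"A", "Q", "M"}),  # invented code list
}


def validate(attribute: str, value: str) -> bool:
    """True if `value` fits the format declared for `attribute`."""
    fmt, codes = REPORT_STRUCTURE[attribute]
    if fmt == CODED:
        return value in codes
    if fmt == DATETIME:
        try:
            datetime.fromisoformat(value)
            return True
        except (TypeError, ValueError):
            return False
    return isinstance(value, str)


checks = {
    "text": validate("SCOPE_COVERAGE", "All agricultural holdings"),
    "good_date": validate("REFERENCE_PERIOD", "2006-12-31"),
    "bad_date": validate("REFERENCE_PERIOD", "not a date"),
    "good_code": validate("PERIODICITY", "A"),
    "bad_code": validate("PERIODICITY", "weekly"),
}
```

Keeping the format with the attribute definition, rather than with each report, is what lets a receiver check any Metadata Set against its MSD.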

MSD Metadata Concepts: Contact

Concepts (Concept Id: Description)

• CONTACT*: An instance of a role of an individual or an organization (or organization part or organization person) to whom an information item(s), a material object(s) and/or person(s) can be sent to or from in a specified context.
• NAME: The identity, expressed in natural language, of a person or organisation
• PERSON_NAME: The identity, expressed in natural language, of a person
• ORGANISATION_NAME: The identity, expressed in natural language, of an organisation
• ADDRESS: The identity of a building, a house or other structure.
• BUSINESS_ADDRESS: The address at which a business is located.
• E-MAIL_ADDRESS: The address of an electronic mailbox.
• TELEPHONE_NUMBER: The number by which a natural person or organisation can be contacted by telephone

MSD – Contact Report

• Report Structure: CONTACT_REPORT
• Concept Scheme: FAOSTAT:METADATA_CONCEPTS
• Metadata Attributes (Concepts): CONTACT, NAME, PERSON_NAME, ORGANISATION_NAME, ADDRESS, BUSINESS_ADDRESS, E-MAIL_ADDRESS, TELEPHONE_NUMBER
• Format and Permitted Value List: all Metadata Attributes are text
• Hierarchy: the reporting hierarchy must respect the concept hierarchy but may also introduce an additional hierarchy. In this report the Contact Metadata Attribute is the parent of all other Metadata Attributes

MSD Metadata Concepts: Advance Release Calendar

Concepts (Concept Id: Description)

• REFERENCE_PERIOD: The time period to which a variable refers
• RELEASE_DATE_TIME: The specific point in time that data or metadata are made available
• DATE_TOLERANCE: The possible or permissible variance of a time period relative to a known point in time.
• RELEASE_STATUS: The state of preparedness of a statement on the availability of data or metadata
• ANNOTATION: Additional metadata

MSD – Advance Release Calendar

• Report Structure: ARC_REPORT
• Concept Scheme: FAOSTAT:METADATA_CONCEPTS
• Metadata Attributes (Concepts): REFERENCE_PERIOD, RELEASE_DATE_TIME, RELEASE_STATUS, ANNOTATION, DATE_TOLERANCE
• Format and Permitted Value List: this varies depending on the Metadata Attribute: Reference_Period and Release_Date_Time are Date/Time, Release_Status is linked to a Code List, Date_Tolerance and Annotation are text
• Hierarchy: the reporting hierarchy must respect the concept hierarchy but may also introduce an additional hierarchy.

MSD - Identifiers

MSD – Report Structure

Metadata Set

Metadata Set: Quality Report

Metadata Set: Contact Report

SDMX Technical Standards

Metadata Provisioning

Registry Contents - Metadata Provisioning

• Metadata Structure Definition: FAOSTAT:QUALITY_METADATA
• Metadata Flows: FAOSTAT:QUALITY_REPORT, FAOSTAT:ARC_REPORT, FAOSTAT:CONTACT_REPORT
• Data Providers: FAOSTAT:OS_FAO_DATA_PROVIDER.29 (Bénin), FAOSTAT:OS_FAO_DATA_PROVIDER.42 (Burkina Faso), FAOSTAT:OS_FAO_DATA_PROVIDER.66 (Côte d’Ivoire), FAOSTAT:OS_FAO_DATA_PROVIDER.217 (Sénégal)
• Category Scheme: SDMX:SDMXStatSubMatDomainsWD1 (adoption of UNECE Classification of International Statistical Activities)
• Category: SDMX:SDMXStatSubMatDomainsWD1.Domain_2.C4.C1 (Economic Statistics.Sectoral Statistics.Agriculture, forestry, fisheries)
• Provision Agreements: there are 12 provision agreements, one for each combination of Data Provider and Metadata Flow

Submit Provision Agreement to the Registry

• This is identical in form to that submitted for Data except the Data Provider is paired with a Metadataflow instead of a Dataflow

SDMX Technical Standards

Register and Query for Metadata

Metadata Registration and Query

[Diagram: Metadata Flows QUALITY_REPORT, ARC_REPORT and CONTACT_REPORT; Data Providers 29 - Bénin, 42 - Burkina Faso, 66 - Côte d’Ivoire, 217 - Sénégal; Provision Agreements; Structure Definition; Metadata Sets and Metadata Set Metadata]

Register and Query for Metadata

• This is identical in form to the query and response for data except the artefact is a metadata set conforming to the business rules of a metadata flow instead of a data set conforming to the business rules of a data flow

SDMX Technical Standards

Hierarchical Code Lists

Hierarchical Code Lists – Example Scenario

• France is a country
• France is part of the continent of Europe
• France is a member of NATO
• France is a member of the EU
• France is a member of the G10
• When I analyse statistics I might want to see totals by
  – continent
  – trading bloc
  – military alliance
  – financial grouping
• France will be grouped with different sets of countries depending on the “view” required
• How do we express these groupings?

Reference Area (Code List)

6B NATO
B0 EU
B1 NAFTA
BE Belgium
BG Bulgaria
CA Canada
CH Switzerland
CZ Czech Republic
DE Germany
DK Denmark
E1 Europe
E8 North America
EE Estonia
ES Spain
FI Finland
FR France
GB United Kingdom
GR Greece
HU Hungary
JP Japan
I2 Euro 12
IT Italy
NE Netherlands
US United States

Code Associations (Code, Parent), grouped into one Code Composition per hierarchy:

• Europe (parent E1): BE, BG, CH, CZ, DE, DK, EE, ES, FI, FR, GB, etc
• EU countries (parent E0): BE, CZ, DE, DK, EE, ES, FI, FR, GB, etc
• NATO countries (parent 6B): BE, BG, CA, CZ, DE, DK, EE, ES, FR, GB, etc
• NAFTA countries (parent B1): CA, US, MX
• North America (parent B1): CA, US
• G10 countries (parent G0): BE, CA, CH, DE, FR, GB, JP, IT, NL, SE, US

Diagram labels: Code List, Code, Code Association, Code Composition, Hierarchy-1, Hierarchy-2, Hierarchy-3, Hierarchy-4

Schematic of the Hierarchical Code Scheme

Diagram boxes: Hierarchical Code Scheme, Hierarchy, Level, Code Composition, Code Association, Code, Code List, Property

Relationship labels:
– comprises hierarchies
– comprises code groups (appears twice)
– groups codes with the same parent
– relates a code to a parent code (roles: parent code, code)
– level based hierarchy has formal levels
– value based hierarchy has code groups
– belongs to
– Properties of the association

The codes may be in a variety of code lists.
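
One possible reading of the schematic, written as Python data classes. This is a sketch under assumptions rather than the SDMX information model itself: the class and field names, the identifiers CL_AREA, CL_GROUPINGS and HCS_AREA, and the choice to derive code compositions by grouping associations on their parent are all made up for illustration. Levels are left out.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Code:
    code_list: str  # the code list the code belongs to
    code_id: str


@dataclass
class CodeAssociation:
    """Relates a code to a parent code; may carry properties."""
    code: Code
    parent: Code
    properties: dict = field(default_factory=dict)


@dataclass
class Hierarchy:
    hierarchy_id: str
    associations: list = field(default_factory=list)

    def compositions(self):
        """Group codes with the same parent (one group per parent code)."""
        groups = {}
        for assoc in self.associations:
            groups.setdefault(assoc.parent, []).append(assoc.code)
        return groups


@dataclass
class HierarchicalCodeScheme:
    scheme_id: str
    hierarchies: list = field(default_factory=list)


# The codes may come from different code lists (identifiers are hypothetical).
FR = Code("CL_AREA", "FR")
DE = Code("CL_AREA", "DE")
EUROPE = Code("CL_AREA", "E1")
G10 = Code("CL_GROUPINGS", "G0")

SCHEME = HierarchicalCodeScheme("HCS_AREA", [
    Hierarchy("CONTINENT", [CodeAssociation(FR, EUROPE), CodeAssociation(DE, EUROPE)]),
    Hierarchy("FINANCIAL", [CodeAssociation(FR, G10)]),
])
```

Here the same code FR appears in two hierarchies of one scheme, each time under a different parent, without the code list itself being changed.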

Page 76

Item Scheme Maps

• Many types of “item scheme” use the same fundamental structure
– Code list
– Category scheme
– Concept scheme
• Two Item Schemes can be mapped

Page 77

Schematic of the “Code” Mapping

Boxes and relationship labels recovered from the diagram:
• Item Scheme Association: source item scheme, target item scheme; has item associations
• Item Association: source item, target item
• Association Role
• Item Scheme: Code List, Category Scheme, Concept Scheme
• Item: Code, Category, Concept
• Maps: Code List Map, Category Scheme Map, Concept Scheme Map
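
A code list map can be pictured as a set of source-item/target-item pairs between two schemes. The sketch below is an assumption-laden illustration, not an SDMX structure: the class names, the scheme identifiers CL_AREA_ISO2 and CL_AREA_ISO3, and the choice of mapping ISO two-letter to ISO three-letter country codes are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ItemAssociation:
    """Pairs one item of the source scheme with one item of the target scheme."""
    source_item: str
    target_item: str


@dataclass
class ItemSchemeMap:
    source_scheme: str
    target_scheme: str
    associations: list = field(default_factory=list)

    def translate(self, source_item):
        """Return the mapped target item, or None when the item is unmapped."""
        for assoc in self.associations:
            if assoc.source_item == source_item:
                return assoc.target_item
        return None


# Hypothetical code list map between two country code lists.
AREA_MAP = ItemSchemeMap(
    source_scheme="CL_AREA_ISO2",
    target_scheme="CL_AREA_ISO3",
    associations=[
        ItemAssociation("FR", "FRA"),
        ItemAssociation("DE", "DEU"),
        ItemAssociation("GB", "GBR"),
    ],
)
```

The same two classes would serve for a category scheme map or a concept scheme map, since only the kind of item being paired changes.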

Page 78

Structure Maps

• Structures can also be mapped
– Data structures
– Metadata structures

Page 79

Structure Sets

Page 80

Structure Map

Page 81

Code List Map

Page 82

Information Model: Summary

• Supports data and metadata reporting and exchange
– Data and metadata structure definitions
– Data and metadata sets
• Supports the process of reporting and exchange
– Data/metadata providers
– Data/metadata flows
– Provision agreements
• Supports registration
– Data and metadata sets
– Data and metadata can be linked
• Supports query
– Categories linked to data and metadata
– Constraints for finer grained queries
– Retrieval of metadata linked to data
• Supports data analysis, comparison and conversion
– Hierarchical code schemes
– Structure, Concept, Code, Category maps

Page 83

Data/Metadata Reporting, Query, Analysis, Mapping

Diagram boxes: Category Scheme, Category, Data or Metadata Flow, Data Provider, Provision Agreement, Structure Definition, Data Set or Metadata Set, Content Constraint, Structure and Item Scheme Maps, Registered Data Set or Metadata Set, Attachment Constraint

Page 84

SDMX Technical Standards

Reporting Taxonomy

Page 85

Reporting Taxonomy

• An SDMX Reporting Taxonomy is a group of data flows and/or metadata flows which form the basis of a single real-world document or report
• They can be organized into groups and sub-groups as needed
• They can be named and identified
• Useful for managing various types of reports over time
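
As a rough sketch of the idea, a reporting taxonomy can be pictured as a named tree whose nodes group data flows and metadata flows. Everything here is hypothetical: the class name ReportingCategory, the identifiers and the flow ids (DF_GDP and so on) are invented, and the actual SDMX structure is not reproduced.

```python
from dataclasses import dataclass, field


@dataclass
class ReportingCategory:
    """A named, identified group of flows, optionally with sub-groups."""
    category_id: str
    name: str
    flows: list = field(default_factory=list)     # ids of data/metadata flows
    children: list = field(default_factory=list)  # nested ReportingCategory objects

    def all_flows(self):
        """Collect the flows of this group and of every sub-group, depth first."""
        found = list(self.flows)
        for child in self.children:
            found.extend(child.all_flows())
        return found


# Hypothetical report made of two chapters.
REPORT = ReportingCategory(
    "ANNUAL_REPORT",
    "Annual statistical report",
    children=[
        ReportingCategory("ECONOMY", "Economy", flows=["DF_GDP", "DF_CPI"]),
        ReportingCategory("POPULATION", "Population", flows=["DF_POP", "MDF_POP_NOTES"]),
    ],
)
```

Calling REPORT.all_flows() walks the groups and sub-groups and returns every flow that feeds the report, which is the kind of lookup that helps when the same report has to be reassembled period after period.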

Page 86

SDMX Technical Standards

Processes

Page 87

Processes

• SDMX 2.0 provides the ability to document the steps and logic of a process flow

• This is not executable, but serves as documentation to describe the processes which produce data and metadata

• It is useful as a target for the attachment of reference metadata describing processing

Page 88

SDMX Technical Standards

Services Based Architecture

Page 89

What is Services-Based Architecture?

• A “services-based architecture” (or service-oriented architecture, SOA) is an architecture that supports distributed applications
– Each service, or component, can exist elsewhere on the network – typically the Internet
– The services are coordinated by the use of registries and event notifications
– They communicate using XML messages (like SDMX-ML)

• This type of architecture can be very powerful when data sources and metadata sources are available in standard formats, using standard protocols
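
The coordination pattern described above (a registry, subscriptions and XML notifications) can be sketched in memory as follows. This is an illustration only: the Registry class, its method names, the flow ids, the example.org URLs and the XML element names are all invented and do not follow the SDMX-ML registry interfaces.

```python
import xml.etree.ElementTree as ET


def build_notification(flow_id, url):
    """Build a small XML message (element names are hypothetical, not SDMX-ML)."""
    root = ET.Element("Notification")
    ET.SubElement(root, "Flow").text = flow_id
    ET.SubElement(root, "DataSource").text = url
    return ET.tostring(root, encoding="unicode")


class Registry:
    """In-memory stand-in for a registry with a subscription service."""

    def __init__(self):
        self._subscribers = {}    # flow id -> list of callbacks
        self.registrations = []   # (flow id, url) pairs registered so far

    def subscribe(self, flow_id, callback):
        """Ask to be notified when a data set is registered for flow_id."""
        self._subscribers.setdefault(flow_id, []).append(callback)

    def register_data_set(self, flow_id, url):
        """Record where a data set lives and notify the subscribers of its flow."""
        self.registrations.append((flow_id, url))
        message = build_notification(flow_id, url)
        for callback in self._subscribers.get(flow_id, []):
            callback(message)


# A consumer subscribes to one flow and collects the messages it receives.
received = []
registry = Registry()
registry.subscribe("DF_GDP", received.append)
registry.register_data_set("DF_GDP", "http://example.org/gdp.xml")
registry.register_data_set("DF_CPI", "http://example.org/cpi.xml")  # no subscriber
```

The registry never holds the data itself, only where it can be found; the consumer learns about the new data set from the notification and fetches it from the provider's own service.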

Page 90

Registry: Subscription Service

Page 91

Registry: Notification Service

Page 92

RSS Feed

Page 93

SDMX Technical Standards

Alignment with Other Standards

Page 94

Other Statistical Standards

• There are many statistical standards which are potentially used by SDMX systems:
– Data Documentation Initiative (metadata for microdata)
– ISO/IEC 11179 (for semantic models and definitions)
– eXtensible Business Reporting Language (for business reporting)
– ISO 19115 (for geographic metadata and maps)

• Typically, these standards represent the source information of aggregate SDMX data, or represent additional metadata

• SDMX has been aligned with these standards to support such systems

Page 95

SDMX Standards Alignment Example: The Data Documentation Initiative (DDI)

XML specification for microdata
http://www.ddialliance.org

Page 96

What is the DDI?

• Purpose
– Capture extensive metadata for archiving, dissemination and use of microdata
– 5 sections (document, survey, files, variables, documentation), hundreds of elements
• DDI Alliance
– Expert Committee, Steering Committee and Working Groups
– http://www.ddialliance.org
• DDI Users
– US & European academics and statistical agencies
– International Household Survey Network (IHSN) & developing countries

Page 97

What is the DDI?

• DDI is a mature product with a long history
– ISR OSIRIS (1970), The IASSIST Codebook Action Group (SGML, DTD) (1993), Draft DDI (1997), Beta-testing (1999), DDI 1.0 (2000), DDI 2.0 (2003), DDI 3.0 (2007)

• DDI 1/2.x model: single survey

• DDI 3.0 model: the survey life cycle

Page 98

Tools for DDI

• International Household Survey Network
– Objectives: Improve the availability, quality and use of survey data in developing countries
– Members: International organizations & national agencies supporting survey programs in developing countries
– Management: DFID, ILO, PARIS21, UNICEF, UNSD, WHO, World Bank
– Activities: Coordinating survey programs, Harmonizing concepts & methods, Maintaining a survey catalog, Developing data dissemination tools
– http://www.surveynetwork.org
• Microdata Management Toolkit
– DDI based user friendly package for archiving and preservation of surveys

Page 100

Microdata Management Toolkit

• Status
– Available in English, French, Spanish, Russian
– http://www.surveynetwork.org/toolkit
• Roll-out program
– Completed training / pilot in about 20 countries, mainly in the Africa region
– Expected use by UNICEF for next round of Multiple Indicator Cluster Surveys (MICS, 55 countries)
– Asia: Partnership with United Nations Economic and Social Commission for Asia and the Pacific (ESCAP)
– Latin America: partnership with Inter-American Development Bank (IADB)
– Used by IHSN member agencies (WHO, ILO, etc.)
– Component of World Bank Accelerated Data Program (ADP)

Page 101

DDI and SDMX

SDMX                      DDI
Aggregated data           Microdata
Indicators, Time Series   Low level observations
Across time               Single time period
Across geography          Single geography
Open Access               Controlled access
Easy to use               Expert Audience

• Microdata is an important source of aggregated data
• Crucial overlaps and mappings exist between both worlds (but are commonly undocumented)
• Interoperability provides users with a full picture of the production process

Page 102

Demo: SDMX – DDI Integration

• Aggregates and microdata on the website of the Nigerian statistical office

Page 103

Questions?