SDMX Advanced Topics on Technical Standards

Arofan Gregory and Chris Nelson

SDMX Capacity Building Workshop Washington January 11 2007

Advanced Topics (1)

• Many of these will be presented in the context of a live prototype system
– Data structures
– Provisioning metadata
– Registry interfaces
  • Submit structure and provisioning metadata
  • Query for structure
  • Register data and metadata set
  • Query for registered data and metadata sets
– Alignment with other standards

Advanced Topics (2)

• Others will be presented by explanation and example
– Hierarchical Code Set
– Structure Set
– Reporting Taxonomy
– Services based architecture
  • Notification
  • RSS feed

SDMX Technical Standards

Information Model and Technical Specifications: High Level

Overview (Reminder)

CategoryScheme

Category

can have child categories

comprises subject or reporting categories

Data or Metadata

Flow

Data Provider

Provision Agreement

can get data/metadata from multiple data/metadata providers

can provide data/metadata for many data/metadata flows using agreed data/metadata structure

Structure Definition

uses specific data/metadata structure

can be linked to categories in multiple category schemes

Data Set or

Metadata Set

publishes/reports data sets or metadata sets

conforms to business rules of the data/metadata flow

Information Model: High Level Schematic

Data or Metadata

Set

URL, registration date etc.

Registers existence of data and metadata sets

Structure Maps

Structure and Code List maps

SDMX Technical Standards

SDMX Registry

REPOSITORY Provisioning

Metadata

REGISTRY Data Set/

Metadata Set

REPOSITORY Structural Metadata

Subscription/Notification

Applications can subscribe to notification of new or changed objects

Register

Query

Submit

Query

Submit

Query

SDMX Registry/Repository

Describes data and metadata structures

Describes data and metadata sources and reporting processes

Indexes data and metadata

SDMX Registry Interfaces

CategoryScheme

Category

can have child categories

comprises subject or reporting categories

Data or Metadata

Flow

Data Provider

Provision Agreement

can get data/metadata from multiple data/metadata providers

can provide data/metadata for many data/metadata flows using agreed data/metadata structure

Structure Definition

uses specific data/metadata structure

can be linked to categories in multiple category schemes

SDMX Artefacts: Registry Contents

Data

Set

URL, registration date etc.

registers existence of data and metadata sets

Structure Maps

Structure and Code List maps

CategoryScheme

Category

Data or Metadata

Flow

Structure Definition

Structure Maps

Structural Metadata

Provisioning Metadata

Registered Data and Metadata

Registry Interfaces

SDMX Technical Standards

Practical Examples

CountrySTAT

RegionSTAT

National Publication Server(s)

Regional Publication Server

FAO SDMX Registry

Flow of FAO CountrySTAT-RegionSTAT Implementation

(Diagram steps: 1, 2, 3a, 3b, 4)

SDMX in Action: Prototype System

FOOD AND AGRICULTURE ORGANIZATION OF THE UNITED NATIONS

Slide courtesy of the FAO

1 CountryStat National Publication Server

•The web site is published from the files in CountryStat

SDMX Publication

•The new CountryStat files are converted to SDMX-ML data sets and made web accessible on the CountryStat web site

•These files are registered in the FAO SDMX Registry

RegionStat Regional Publication Server

•Queries the registry for new registrations; the registry responds with registration details, including the URLs of the new data sets

•Retrieves the new data sets from the CountryStat web site

•Converts the SDMX-ML files to an internal format and integrates the new data sets with existing RegionStat data sets

•Re-publishes the RegionStat web site
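The publication flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Registry` class, the `harvest` function and the example URL are hypothetical stand-ins, not the interfaces of the FAO prototype, and the data set is fetched from an in-memory dictionary rather than from the web.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Registration:
    """Registry entry: metadata about a data set, not the data itself."""
    provision_agreement: str
    url: str
    registered_at: datetime


class Registry:
    """Hypothetical in-memory stand-in for the FAO SDMX Registry."""

    def __init__(self) -> None:
        self._registrations: List[Registration] = []

    def register(self, registration: Registration) -> None:
        self._registrations.append(registration)

    def query_since(self, since: datetime) -> List[Registration]:
        """Return registrations made after the given point in time."""
        return [r for r in self._registrations if r.registered_at > since]


def harvest(registry: Registry, since: datetime,
            fetch: Callable[[str], str]) -> Dict[str, str]:
    """Regional server step: query the registry, then retrieve each new data set by URL."""
    return {r.url: fetch(r.url) for r in registry.query_since(since)}


# National server side: the SDMX-ML file is web accessible and gets registered.
registry = Registry()
registry.register(Registration(
    "FAOSTAT:OS_FAO_DATA_PROVIDER.29.FAOSTAT:AGRICULTURE_PRODUCTION",
    "http://countrystat.example/benin/production.xml",  # made-up URL
    datetime(2007, 1, 10, 12, 0),
))

# A fake fetcher keeps the sketch runnable without network access.
fake_web = {"http://countrystat.example/benin/production.xml": "<DataSet/>"}
harvested = harvest(registry, datetime(2007, 1, 1), fake_web.__getitem__)
```

Conversion to an internal format and re-publication would follow the `harvest` call; they are left out because the slides do not describe them in detail.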

Prototype System: Explanation

SDMX Technical Standards

Data Structure Definitions: Registration and Query

Data Set and Structure

Reference Region

Commodity

Frequency and Time

Observation Value

Measure Type

Unit and Unit Multiplier

Measurement = 1,000 Kg

Data Set: Structure

• Comprises
– Concepts that identify the observation value (Dimensions)
– Concepts that add additional metadata about the observation value (Attributes)
– Concept that is the observation value (Measure)
– Any of these may be coded, text, date/time, number, etc. (Representation)

Data Set and Structure

Reference Region

Commodity

Frequency and Time

Observation Value

Measure Type

Unit and Unit Multiplier

Measurement = 1,000 Kg

(Dimensions)

(Dimension)

(Dimension)

(Attributes)

(Dimension)

(Measure)

Dimensions: Frequency, Reference Region, Commodity, Time

Measure: Observation Value

Attributes: Unit, Unit Multiplier

Data Structure Definition

Key: Dimensions (concepts that identify the observation)

Group Key (concepts that identify groups of keys)

Attributes (concepts that add metadata)

Measures (concepts that are the observed phenomenon)

Each Dimension, Attribute and Measure takes its semantic from a Concept and has a format given by a Representation

Data Structure Definition

Data Structure Definition: AGRICULTURE_COMMODITY

Key (Dimensions): FREQ, REF_AREA_REG, COMMODITY, TIME

Representation of the Dimensions (code lists): CL_FREQ, CL_AREA_CTY, CL_COMMODITY

Attributes: UNIT, UNIT_MULT

Representation of the Attributes (code lists): CL_MEASURE_UNIT, CL_UNIT_MULT

Measure: OBS_VALUE

Registry Contents - DSD
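As a rough illustration, the AGRICULTURE_COMMODITY structure can be written down as plain Python data and used to check an observation key. The pairing of each concept with a code list is inferred from the names on the slide, and the code values in `CODE_LISTS` are invented for the example; neither should be read as the actual FAO content.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Component:
    concept: str                      # takes its semantic from a concept
    code_list: Optional[str] = None   # coded representation, if any


DIMENSIONS: Tuple[Component, ...] = (
    Component("FREQ", "CL_FREQ"),
    Component("REF_AREA_REG", "CL_AREA_CTY"),
    Component("COMMODITY", "CL_COMMODITY"),
    Component("TIME"),                # date/time, so no code list
)
ATTRIBUTES = (Component("UNIT", "CL_MEASURE_UNIT"),
              Component("UNIT_MULT", "CL_UNIT_MULT"))
MEASURE = Component("OBS_VALUE")

# Illustrative code values only; the real code lists are not in the slides.
CODE_LISTS: Dict[str, Set[str]] = {
    "CL_FREQ": {"A", "M"},
    "CL_AREA_CTY": {"29", "42"},
    "CL_COMMODITY": {"WHEAT", "RICE"},
}


def key_errors(key: Dict[str, str]) -> List[str]:
    """Return the problems found in an observation key (empty list if valid)."""
    errors = []
    for dim in DIMENSIONS:
        if dim.concept not in key:
            errors.append(f"missing dimension {dim.concept}")
        elif dim.code_list and key[dim.concept] not in CODE_LISTS[dim.code_list]:
            errors.append(f"{dim.concept}={key[dim.concept]} not in {dim.code_list}")
    return errors


good = {"FREQ": "A", "REF_AREA_REG": "29", "COMMODITY": "RICE", "TIME": "2005"}
bad = {"FREQ": "Q", "REF_AREA_REG": "29", "COMMODITY": "RICE"}
```

Every dimension must be present for the key to identify an observation, which is why a missing `TIME` is reported as an error rather than ignored.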

Registry Interfaces: Submit Structure

Data Structure Definition Artefacts

Registry Interfaces: Submit Structure

Registry Interfaces: Query Structure

Query for KeyFamily with resolveReferences set to “true” will return all related Concepts and Code Lists

Registry Interfaces: Query Structure

The registry will respond with all DSDs maintained by the FAOSTAT agency
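The effect of resolveReferences can be pictured with a small in-memory model. The function and dictionary below are hypothetical and only mimic the behaviour described on the slides; they are not the SDMX-ML query message itself.

```python
from typing import Dict, List

# Hypothetical registry content: each DSD lists the artefacts it references.
DSDS: Dict[str, Dict[str, List[str]]] = {
    "FAOSTAT:AGRICULTURE_COMMODITY": {
        "concepts": ["FREQ", "REF_AREA_REG", "COMMODITY", "TIME", "OBS_VALUE"],
        "code_lists": ["CL_FREQ", "CL_AREA_CTY", "CL_COMMODITY"],
    },
}


def query_structure(agency: str,
                    resolve_references: bool = False) -> Dict[str, List[str]]:
    """Return the DSDs maintained by an agency, plus referenced artefacts if asked."""
    ids = sorted(i for i in DSDS if i.split(":")[0] == agency)
    result = {"key_families": ids}
    if resolve_references:
        # Collect every concept and code list used by the matching DSDs.
        result["concepts"] = sorted({c for i in ids for c in DSDS[i]["concepts"]})
        result["code_lists"] = sorted({c for i in ids for c in DSDS[i]["code_lists"]})
    return result


only_dsds = query_structure("FAOSTAT")
with_references = query_structure("FAOSTAT", resolve_references=True)
```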

SDMX Technical Standards

Dataflows, Data Providers, Category Scheme

Data

Flow

Data Provider

Structure Definition

FAOSTAT:AGRICULTURE_COMMODITY

FAOSTAT:AGRICULTURE_AREA

FAOSTAT:AGRICULTURE_PRODUCTION

FAOSTAT:OS_FAO_DATA_PROVIDER.29 (Bénin)

FAOSTAT:OS_FAO_DATA_PROVIDER.42 (Burkina Faso)

FAOSTAT:OS_FAO_DATA_PROVIDER.66 (Côte d’Ivoire)

FAOSTAT:OS_FAO_DATA_PROVIDER.217 (Sénegal)

CategoryScheme

Category

SDMX:SDMXStatSubMatDomainsWD1

(adoption of UNECE Classification of

International Statistical Activities)

(Economic Statistics.Sectoral Statistics.Agriculture, forestry, fisheries)

SDMX:SDMXStatSubMatDomainsWD1.

Domain_2.C4.C1

Registry Contents – Other Structures

The data flows are connected to the relevant Category in the Category Scheme

Registry Interface: Submit Structure

Artefacts

Registry Interface: Submit Structure

Category Scheme

Registry Interface: Submit Structure

Links the Dataflow to the (Subject Matter Domain) Category

Data Providers

Dataflow

SDMX Technical Standards

Submit Provision Agreements

Data

Flow

Data Provider

Provision Agreement

Structure Definition

FAOSTAT:AGRICULTURE_COMMODITY

FAOSTAT:AGRICULTURE_AREA

FAOSTAT:AGRICULTURE_PRODUCTION

FAOSTAT:OS_FAO_DATA_PROVIDER.29 (Bénin)

FAOSTAT:OS_FAO_DATA_PROVIDER.42 (Burkina Faso)

FAOSTAT:OS_FAO_DATA_PROVIDER.66 (Côte d’Ivoire)

FAOSTAT:OS_FAO_DATA_PROVIDER.217 (Sénegal)

CategoryScheme

Category

SDMX:SDMXStatSubMatDomainsWD1

(adoption of UNECE Classification of

International Statistical Activities)

(Economic Statistics.Sectoral Statistics.Agriculture, forestry, fisheries)

There are eight provision agreements, one for each combination of Data Provider and Data Flow

SDMX:SDMXStatSubMatDomainsWD1.

Domain_2.C4.C1

Registry Contents – Structure and Provisioning
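The count of eight follows from pairing four Data Providers with two Data Flows. The sketch below assumes, based on the later query slides, that the two Data Flows are AGRICULTURE_AREA and AGRICULTURE_PRODUCTION, and it composes identifiers in the same shape as the Provision Agreement URN shown in this deck.

```python
from itertools import product

AGENCY = "FAOSTAT"
PROVIDER_SCHEME = "OS_FAO_DATA_PROVIDER"
PROVIDERS = {"29": "Bénin", "42": "Burkina Faso",
             "66": "Côte d’Ivoire", "217": "Sénegal"}
# Assumption: these two are the Data Flows (AGRICULTURE_COMMODITY is the DSD).
DATAFLOWS = ["AGRICULTURE_AREA", "AGRICULTURE_PRODUCTION"]

# One provision agreement per (Data Provider, Data Flow) combination.
provision_agreements = [
    f"{AGENCY}:{PROVIDER_SCHEME}.{provider}.{AGENCY}:{flow}"
    for provider, flow in product(PROVIDERS, DATAFLOWS)
]
```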

Registry Interface: Submit Provision Agreement

Unique Id. of the Dataflow

Unique Id. of the Data Provider

Registry Interface: Submit Provision Agreement

Unique Id. of the Dataflow

Unique Id. of the Data Provider

Registry Interface: Submit Provision Agreement Response

The status indicates success or failure

Registry Interface: Submit Provision Agreement Response

The response returns the URN as well as confirmation of the provisioning details submitted

SDMX Structured URNs
• The URNs in SDMX are compound identifiers which reflect the relationships described in the information model
– They are unique and predictable
– They can be easily validated
– They function exactly like URLs for the registry
• Each identifier tells you which organization maintains the identified object
• Each identifier tells you which agency maintains the scheme from which the identifier comes

URN Structure

urn:sdmx:org.sdmx.infomodel.registry.ProvisionAgreement=FAOSTAT:OS_FAO_DATA_PROVIDER.29.FAOSTAT:AGRICULTURE_PRODUCTION

FAOSTAT: Maintenance Agency

OS_FAO_DATA_PROVIDER: Data Provider Scheme

29: Data Provider

FAOSTAT: Maintenance Agency

AGRICULTURE_PRODUCTION: Dataflow

Data Provider

Provision Agreement

Data Flow
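Because the URN is predictable, it can be validated and split into its parts mechanically. The parser below is a minimal sketch that handles only the Provision Agreement shape shown on this slide; the class and function names are made up for the example, and other SDMX URN forms are not covered.

```python
import re
from typing import NamedTuple


class ProvisionAgreementUrn(NamedTuple):
    provider_agency: str    # agency maintaining the Data Provider Scheme
    provider_scheme: str
    provider_id: str
    dataflow_agency: str    # agency maintaining the Dataflow
    dataflow_id: str


_PREFIX = "urn:sdmx:org.sdmx.infomodel.registry.ProvisionAgreement="
_PATTERN = re.compile(
    r"(?P<pa>[^:.]+):(?P<ps>[^:.]+)\.(?P<pid>[^:.]+)\.(?P<da>[^:.]+):(?P<df>[^:.]+)"
)


def parse_urn(urn: str) -> ProvisionAgreementUrn:
    """Split a Provision Agreement URN into its five identifying parts."""
    if not urn.startswith(_PREFIX):
        raise ValueError("not a Provision Agreement URN")
    match = _PATTERN.fullmatch(urn[len(_PREFIX):])
    if match is None:
        raise ValueError("malformed Provision Agreement identifier")
    return ProvisionAgreementUrn(*match.groups())


parsed = parse_urn(
    "urn:sdmx:org.sdmx.infomodel.registry.ProvisionAgreement="
    "FAOSTAT:OS_FAO_DATA_PROVIDER.29.FAOSTAT:AGRICULTURE_PRODUCTION"
)
```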

SDMX Technical Standards

Register a Data Set

Data

Flow

Data Provider

Provision Agreement

Structure Definition

Data

Set

Data Set Registration

•The data set is “registered” against the provision agreement

•The registry stores metadata (e.g. URL) about the data set: it does not store the data set

URL, registration date etc.

registers existence of data set

Data

Flow

Data Provider

Provision Agreement

Structure Definition

FAOSTAT:AGRICULTURE_COMMODITY

FAOSTAT:AGRICULTURE_AREA

FAOSTAT:AGRICULTURE_PRODUCTION

FAOSTAT:OS_FAO_DATA_PROVIDER.29 (Bénin)

FAOSTAT:OS_FAO_DATA_PROVIDER.42 (Burkina Faso)

FAOSTAT:OS_FAO_DATA_PROVIDER.66 (Côte d’Ivoire)

FAOSTAT:OS_FAO_DATA_PROVIDER.217 (Sénegal)

CategoryScheme

Category

Data Set

Metadata

URL, registration date etc.

There can be eight data sets registered, one for each Provision Agreement

Registry Contents – Data Set Registrations

Registry Interface: Data Set Registration

Action is “replace”, “append” etc.

An SDMX-ML file is a simple datasource

Identifies the Provision Agreement either by URN or by Dataflow and Data Provider

Registry Interface: Data Set Registration

URL of the SDMX-ML file

URN of the Provision Agreement
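A registration request therefore carries three things: an action, the location of the SDMX-ML file, and a reference to the Provision Agreement. The sketch below shows the two ways of identifying the agreement converging on the same record. The function name and dictionary keys are hypothetical, and only the two actions named on the slide are accepted.

```python
from typing import Dict, Optional

URN_PREFIX = "urn:sdmx:org.sdmx.infomodel.registry.ProvisionAgreement="
ACTIONS = {"replace", "append"}  # the slide says "etc."; other values are not listed


def registration_request(url: str, action: str, *,
                         urn: Optional[str] = None,
                         data_provider: Optional[str] = None,
                         dataflow: Optional[str] = None) -> Dict[str, str]:
    """Build a registration record that points at an SDMX-ML file by URL."""
    if action not in ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    if urn is None:
        if data_provider is None or dataflow is None:
            raise ValueError("give a URN, or both a data provider and a dataflow")
        # Compose the URN from the explicit Data Provider and Dataflow ids.
        urn = f"{URN_PREFIX}{data_provider}.{dataflow}"
    return {"provision_agreement": urn, "datasource": url, "action": action}


FILE_URL = "http://countrystat.example/benin/production.xml"  # made-up URL
by_urn = registration_request(
    FILE_URL, "replace",
    urn=URN_PREFIX + "FAOSTAT:OS_FAO_DATA_PROVIDER.29.FAOSTAT:AGRICULTURE_PRODUCTION",
)
by_fields = registration_request(
    FILE_URL, "replace",
    data_provider="FAOSTAT:OS_FAO_DATA_PROVIDER.29",
    dataflow="FAOSTAT:AGRICULTURE_PRODUCTION",
)
```

Only the URL is recorded; the data set itself stays on the provider's web site.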

SDMX Technical Standards

Query for a Data Set

Data

Set

Data

Flow

Data Provider

Provision Agreement

Structure Definition

Data

Set

Query for Data Sets

AGRICULTURE_AREA

AGRICULTURE_PRODUCTION

29 - Bénin

42 - Burkina Faso

66 - Côte d’Ivoire

217 - Sénegal

Provision Agreement

Provision Agreement

Data Set

Metadata

Query for Data Sets

•for all Provision Agreements linked to Data Flow or

•linked to a specific Provision Agreement

Registry Interface: Query for Data Sets

QueryType is “DataSets”, “MetadataSets” etc.

Registry Interface: Query for Data Sets

Could be done with URN or, as shown here, with explicit fields

Registry Interface: Data Set Query Response

URL of the SDMX-ML file

Identification of the Provision Agreement

Registry Interface: Data Set Query Response

Note that the URN of the registered data set includes the date and time of registration
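The two query scopes (everything registered for a Data Flow, or only what is registered against one Provision Agreement) amount to a filter with an optional second condition. The records and URLs below are invented for illustration.

```python
from typing import List, NamedTuple, Optional


class Registered(NamedTuple):
    data_provider: str
    dataflow: str
    url: str


# Made-up registrations; the registry holds only these pointers, not the data.
REGISTERED = [
    Registered("29", "AGRICULTURE_AREA", "http://example.org/29/area.xml"),
    Registered("29", "AGRICULTURE_PRODUCTION", "http://example.org/29/production.xml"),
    Registered("42", "AGRICULTURE_PRODUCTION", "http://example.org/42/production.xml"),
]


def query_data_sets(dataflow: str, data_provider: Optional[str] = None) -> List[str]:
    """URLs for a whole data flow, or for one provision agreement if a provider is given."""
    return [r.url for r in REGISTERED
            if r.dataflow == dataflow
            and (data_provider is None or r.data_provider == data_provider)]


whole_flow = query_data_sets("AGRICULTURE_PRODUCTION")
one_agreement = query_data_sets("AGRICULTURE_PRODUCTION", data_provider="42")
```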

SDMX Technical Standards

Metadata Structure Definition

Metadata – Reported according to a Quality Framework

Metadata pertaining to a Quality Framework are reported in a Metadata Set, whose structure is defined by a Metadata Structure Definition

Metadata Attribute Metadata Attribute: Metadata Content

Data Flow

Data Provider

Provision Agreement

MetadataReport

Metadata Reporting

“Quality” metadata about published or reported data sets are linked to the Provision Agreement, or the Data Flow, or the Data Provider

AGRICULTURE_AREA

AGRICULTURE_PRODUCTION

29 - Bénin

42 - Burkina Faso

66 - Côte d’Ivoire

217 - Sénegal

Provision Agreement

Identify Structure

•Concepts

•Hierarchies

•Representation (e.g. code list)

Metadata Structure Definition (MSD)

Full Target Identifier

Partial Target Identifier

Metadata Structure Definition

Identifier Components Item Scheme

uses defined concepts

defines “keys” of object types to which metadata can be “attached”

specifies the identifier components (“key”) of the target object

identifies the code list from which the value of the (key) component must be taken when metadata is reported

Report Structure

Target Object Type

identifies target object type of the component

Metadata Structure Definition

Metadata Attributes

Format and Permitted Value List

Report Structure

Concept Scheme

concept defined in

Concept

takes semantic and context from

definition of format and permitted values

Metadata Structure Definition

can comprise the specification of one or more reports

can have hierarchy

MSD – Defining the Metadata Report

Metadata Attributes

Full Target Identifier

Partial Target Identifier

Metadata Structure Definition

Identifier Components

Format and Permitted Value List

Item Scheme

uses defined concepts

defines “keys” of object types to which metadata can be “attached”

specifies to which object types the Report can be “attached”

specifies the identifier components (“key”) of the target object

identifies the code list from which the value of the (key) component must be taken when metadata is reported

Report Structure

Concept Scheme

concept defined in

Concept

takes semantic and context from

Target Object Type

identifies target object type of the component

can have hierarchy

definition of format and permitted values

MSD – Complete Picture

Full Target Identifier

Partial Target Identifier

Metadata Structure Definition

Identifier Components Item Scheme

Target Object Type

QUALITY_METADATA

P_AGREEMENT

AGENCY

DATAFLOW

Dataflow

DataProvider

FAOSTAT:OS_FAO_DATA_PROVIDER

FAOSTAT:DATAFLOWS

MSD – Identification of the Target

MSD Metadata Concepts: Data Quality

Concepts

Concept Id Description

DISSEMINATION_FORMATS* Refers to the various means of dissemination used for making the data available to the public. It would include a description of the various formats available, including where and how to get the information (paper, electronic formats, longer time series)

FREQUENCY_PERIODICITY* Frequency refers to the time interval between the observations of a time series. Periodicity refers to the frequency of compilation of the data (e.g., a time series could be available at annual frequency while the underlying data are compiled monthly and thus have a monthly periodicity).

PERIODICITY The frequency of compilation of the data

FREQUENCY The time interval between the observations of a time series

RELEASE_CALENDAR* Describes the policy regarding the release of statistics according to a preannounced schedule and its availability. It also contains the release calendar information

Metadata Attributes

Format and Permitted Value List

Report Structure

Concept Scheme

Concept

SCOPE_COVERAGE

DATA_QUALITY_REPORT

FREQUENCY_PERIODICITY

PERIODICITY

DISSEMINATION_FORMATS

SOURCE_DATA

REFERENCE_PERIOD

ADVANCE_RELEASE_CALENDAR

FAOSTAT:METADATA_CONCEPTS

This varies depending on the Metadata Attribute: Scope_Coverage, Source_Data are text, Reference_Period is Date/Time, and the remainder are linked to a Code List

The reporting hierarchy must respect the concept hierarchy. No additional reporting hierarchy is specified

TIMELINESS

MSD – Data Quality Report

MSD Metadata Concepts: Contact

Concepts

Concept Id Description

CONTACT* An instance of a role of an individual or an organization (or organization part or organization person) to whom an information item(s), a material object(s) and/or person(s) can be sent to or from in a specified context.

NAME The identity, expressed in natural language, of a person or organisation

PERSON_NAME The identity, expressed in natural language, of a person

ORGANISATION_NAME The identity, expressed in natural language, of an organisation

ADDRESS The identity of a building, a house or other structure.

BUSINESS_ADDRESS The address at which a business is located.

E-MAIL_ADDRESS The address of an electronic mailbox.

TELEPHONE_NUMBER The number by which a natural person or organisation can be contacted by telephone

Metadata Attributes

Format and Permitted Value List

Concept Scheme

Concept

E-MAIL_ADDRESS

CONTACT_REPORT

CONTACT

NAME

ADDRESS

TELEPHONE_NUMBER

BUSINESS_ADDRESS

PERSON_NAME

FAOSTAT:METADATA_CONCEPTS

All Metadata Attributes are text

The reporting hierarchy must respect the concept hierarchy but may also introduce an additional hierarchy. In this respect the Contact Metadata Attribute is the parent of all other Metadata Attributes

ORGANISATION_NAME

Report Structure

MSD – Contact Report

MSD Metadata Concepts: Advance Release Calendar

Concepts

Concept Id Description

REFERENCE_PERIOD The time period to which a variable refers

RELEASE_DATE_TIME The specific point in time that data or metadata are made available

DATE_TOLERANCE The possible or permissible variance of a time period relative to a known point in time.

RELEASE_STATUS The state of preparedness of a statement on the availability of data or metadata

ANNOTATION Additional metadata

Metadata Attributes

Format and Permitted Value List

Concept Scheme

Concept

ARC_REPORT

REFERENCE_PERIOD

RELEASE_DATE_TIME

RELEASE_STATUS

ANNOTATION

DATE_TOLERANCE

FAOSTAT:METADATA_CONCEPTS

This varies depending on the Metadata Attribute: Reference_Period and Release_Date_Time are Date/Time, Release_Status is linked to a Code List, Date_Tolerance and Annotation are text

The reporting hierarchy must respect the concept hierarchy but may also introduce an additional hierarchy.

Report Structure

MSD – Advance Release Calendar
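The format rules for the Advance Release Calendar report can be expressed as a small lookup table and used to check a reported Metadata Set. This is a sketch only: the status codes are hypothetical (the slides do not show the code list), and ISO 8601 strings are assumed for date/time values.

```python
from datetime import datetime
from typing import Dict, List

RELEASE_STATUS_CODES = {"PLANNED", "CONFIRMED"}  # hypothetical code list values

# Format of each Metadata Attribute in ARC_REPORT, as stated on the slide.
ARC_REPORT = {
    "REFERENCE_PERIOD": "datetime",
    "RELEASE_DATE_TIME": "datetime",
    "RELEASE_STATUS": "coded",
    "DATE_TOLERANCE": "text",
    "ANNOTATION": "text",
}


def _is_datetime(value: str) -> bool:
    try:
        datetime.fromisoformat(value)
        return True
    except ValueError:
        return False


def report_errors(report: Dict[str, str]) -> List[str]:
    """Check reported values against the format of each Metadata Attribute."""
    errors = []
    for attribute, value in report.items():
        fmt = ARC_REPORT.get(attribute)
        if fmt is None:
            errors.append(f"{attribute}: not in report structure")
        elif fmt == "datetime" and not _is_datetime(value):
            errors.append(f"{attribute}: not a date/time")
        elif fmt == "coded" and value not in RELEASE_STATUS_CODES:
            errors.append(f"{attribute}: not in code list")
    return errors


valid = {"REFERENCE_PERIOD": "2006-12-31",
         "RELEASE_DATE_TIME": "2007-01-11T09:00:00",
         "RELEASE_STATUS": "CONFIRMED"}
invalid = {"REFERENCE_PERIOD": "2006-12-31",
           "RELEASE_DATE_TIME": "soon",
           "RELEASE_STATUS": "MAYBE",
           "ANNOTATION": "first release"}
```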

MSD - Identifiers

MSD – Report Structure

Metadata Set

Metadata Set: Quality Report

Metadata Set: Contact Report

SDMX Technical Standards

Metadata Provisioning

Metadata

Flow

Data Provider

Provision Agreement

Metadata Structure Definition

FAOSTAT:QUALITY_METADATA

FAOSTAT:QUALITY_REPORT

FAOSTAT:ARC_REPORT

FAOSTAT:CONTACT_REPORT

FAOSTAT:OS_FAO_DATA_PROVIDER.29 (Bénin)

FAOSTAT:OS_FAO_DATA_PROVIDER.42 (Burkina Faso)

FAOSTAT:OS_FAO_DATA_PROVIDER.66 (Côte d’Ivoire)

FAOSTAT:OS_FAO_DATA_PROVIDER.217 (Sénegal)

CategoryScheme

Category

SDMX:SDMXStatSubMatDomainsWD1

(adoption of UNECE Classification of

International Statistical Activities)

(Economic Statistics.Sectoral Statistics.Agriculture, forestry, fisheries)

There are 12 provision agreements, one for each combination of Data Provider and Metadata Flow

SDMX:SDMXStatSubMatDomainsWD1.

Domain_2.C4.C1

Registry Contents - Metadata Provisioning

Submit Provision Agreement to the Registry

• This is identical in form to that submitted for Data except the Data Provider is paired with a Metadataflow instead of a Dataflow

SDMX Technical Standards

Register and Query for Metadata

Data

Set

Metadata

Flow

Data Provider

Provision Agreement

Structure Definition

Data

Set

Metadata Registration and Query

QUALITY_REPORT

ARC_REPORT

CONTACT_REPORT

29 - Bénin

42 - Burkina Faso

66 - Côte d’Ivoire

217 - Sénegal

Provision Agreement

Provision Agreement

Metadata Set

Metadata

Register and Query for Metadata

• This is identical in form to the query and response for data except the artefact is a metadata set conforming to the business rules of a metadata flow instead of a data set conforming to the business rules of a data flow

SDMX Technical Standards

Hierarchical Code Lists

Hierarchical Code Lists – Example Scenario

• France is a country
• France is part of the continent of Europe
• France is a member of NATO
• France is a member of the EU
• France is a member of the G10
• When I analyse statistics I might want to see totals by
– continent
– trading bloc
– military alliance
– financial grouping
• France will be grouped with different sets of countries depending on the “view” required
• How do we express these groupings?

Reference Area

6B NATO

B0 EU

B1 NAFTA

BE Belgium

BG Bulgaria

CA Canada

CH Switzerland

CZ Czech Republic

DE Germany

DK Denmark

E1 Europe

E8 North America

EE Estonia

ES Spain

FI Finland

FR France

GB United Kingdom

GR Greece

HU Hungary

JP Japan

I2 Euro 12

IT Italy

NL Netherlands

US United States

Code Parent

BE E1

BG E1

CH E1

CZ E1

DE E1

DK E1

EE E1

ES E1

FI E1

FR E1

GB E1

etc

Code Parent

BE E0

CZ E0

DE E0

DK E0

EE E0

ES E0

FI E0

FR E0

GB E0

etc

Europe EU countries

Code Parent

BE 6B

BG 6B

CA 6B

CZ 6B

DE 6B

DK 6B

EE 6B

ES 6B

FR 6B

GB 6B

etc

NATO countries

Code Parent

CA B1

US B1

MX B1

NAFTA countries

Code Parent

CA E8

US E8

North America

Code Composition

Code Parent

BE G0

CA G0

CH G0

DE G0

FR G0

GB G0

JP G0

IT G0

NL G0

SE G0

US G0

G10 countries

Code Association

Code List

Code

Hierarchy-1

Code Composition

Hierarchy-2 Hierarchy-3

Code Composition

Hierarchy-4

Code Composition

Hierarchical Code Scheme

Code Code Association

Code Composition

Level

Hierarchy

parent code

code

relates a code to a parent code

groups codes with the same parent

comprises code groups

comprises hierarchies

comprises code groups

level based hierarchy has formal levels

value based hierarchy has code groups

Property

Code List

belongs to

Properties of the association

The codes may be in a variety of code lists.

Schematic of the Hierarchical Code Scheme
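One way to picture the scheme: the codes live in a single code list, and each hierarchy is a separate mapping from code to parent. The sketch below uses a few of the code/parent pairs from the tables above, with invented observation values, to show how the same data roll up differently per “view”.

```python
from typing import Dict

# Each hierarchy relates codes from the Reference Area list to a parent code.
# Only a subset of the slide's pairs is used here.
HIERARCHIES: Dict[str, Dict[str, str]] = {
    "Europe": {"BE": "E1", "CH": "E1", "FR": "E1"},
    "NATO":   {"BE": "6B", "CA": "6B", "FR": "6B"},
    "G10":    {"BE": "G0", "CA": "G0", "CH": "G0", "FR": "G0"},
}


def totals(view: str, values: Dict[str, float]) -> Dict[str, float]:
    """Sum the values of the child codes under each parent of the chosen hierarchy."""
    result: Dict[str, float] = {}
    for code, parent in HIERARCHIES[view].items():
        if code in values:
            result[parent] = result.get(parent, 0.0) + values[code]
    return result


# Invented observation values per country code.
values = {"BE": 1.0, "CA": 2.0, "CH": 4.0, "FR": 8.0}
by_continent = totals("Europe", values)
by_alliance = totals("NATO", values)
by_financial_group = totals("G10", values)
```

France contributes to all three totals without being duplicated in the code list, which is the point of keeping hierarchies separate from the codes.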

Item Scheme Maps

• Many types of “item scheme” use the same fundamental structure
– Code list
– Category scheme
– Concept scheme
• Two Item Schemes can be mapped

Item Scheme

Item Item Association

has item associations

source item

Item Scheme

Item

target item

Item Scheme Association

source item scheme

target item scheme

Code List Category Scheme

Concept Scheme

Code Category Concept

Code List Map

Category Scheme

Map

Concept Scheme

Map

Association Role

Code List Category Scheme

Concept Scheme

Code Category Concept

Schematic of the “Code” Mapping
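In code, an Item Scheme Association reduces to a source scheme, a target scheme and a set of item-to-item associations. The example below is hypothetical: `CL_AREA_ISO` is an invented code list name, and the pairing of ISO country codes with the provider codes 29 and 42 is assumed purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ItemSchemeMap:
    """Associates items of a source scheme with items of a target scheme."""
    source_scheme: str
    target_scheme: str
    associations: Dict[str, str] = field(default_factory=dict)

    def translate(self, source_item: str) -> str:
        """Return the target item associated with a source item."""
        if source_item not in self.associations:
            raise KeyError(
                f"{source_item} has no association from "
                f"{self.source_scheme} to {self.target_scheme}"
            )
        return self.associations[source_item]


# Hypothetical Code List Map between two area code lists.
code_list_map = ItemSchemeMap("CL_AREA_ISO", "CL_AREA_CTY",
                              {"BJ": "29", "BF": "42"})
translated = [code_list_map.translate(code) for code in ("BJ", "BF")]
```

The same class would serve for a Category Scheme Map or a Concept Scheme Map, since all three share the item scheme structure.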

Structure Maps

• Structures can also be mapped
– Data structures
– Metadata structures

Structure Sets

Structure Map

Code List Map

Information Model: Summary
• Supports data and metadata reporting and exchange
– Data and metadata structure definitions
– Data and metadata sets
• Supports the process of reporting and exchange
– Data/metadata providers
– Data/metadata flows
– Provision agreements
• Supports registration
– Data and metadata sets
– Data and metadata can be linked
• Supports query
– Categories linked to data and metadata
– Constraints for finer grained queries
– Retrieval of metadata linked to data
• Supports data analysis, comparison and conversion
– Hierarchical code schemes
– Structure, Concept, Code, Category maps

CategoryScheme

Category

Data or Metadata

Flow

Data Provider

Provision Agreement

Structure Definition

Data Set or

Metadata Set

Content

Constraint

Data/Metadata Reporting, Query, Analysis, Mapping

Structure and Item Scheme

Maps

Registered Data Set or Metadata

Set

Attachment

Constraint

SDMX Technical Standards

Reporting Taxonomy

Reporting Taxonomy

• An SDMX Reporting Taxonomy is a group of data flows and/or metadata flows which form the basis of a single real-world document or report

• They can be organized into groups and sub-groups as needed

• They can be named and identified
• Useful for managing various types of reports over time

SDMX Technical Standards

Processes

Processes

• SDMX 2.0 provides the ability to document the steps and logic of a process flow

• This is not executable, but serves as documentation to describe the processes which produce data and metadata

• It is useful as a target for the attachment of reference metadata describing processing

SDMX Technical Standards

Services Based Architecture

What is Services-Based Architecture?

• A “services-based architecture” (or services-oriented architecture, SOA) is an architecture that supports distributed applications
– Each service, or component, can exist elsewhere on the network, typically the Internet
– The services are coordinated by the use of registries and event notifications
– They communicate using XML messages (like SDMX-ML)

• This type of architecture can be very powerful when data sources and metadata sources are available in standard formats, using standard protocols

Registry: Subscription Service

Registry: Notification Service

RSS Feed
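Subscription and notification follow the familiar publish/subscribe pattern: an application registers interest in an object, and the registry calls it back when that object is new or changed. The class below is a hypothetical in-process sketch; a real registry would deliver the notification as a message (or expose it as an RSS feed) rather than call a Python function.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List, Tuple

Listener = Callable[[str, str], None]


class NotifyingRegistry:
    """Hypothetical registry that notifies subscribers of new or changed objects."""

    def __init__(self) -> None:
        self._listeners: DefaultDict[str, List[Listener]] = defaultdict(list)

    def subscribe(self, object_id: str, listener: Listener) -> None:
        self._listeners[object_id].append(listener)

    def submit(self, object_id: str, event: str) -> None:
        # Notify only the applications subscribed to this object.
        for listener in self._listeners.get(object_id, []):
            listener(object_id, event)


received: List[Tuple[str, str]] = []
notifying_registry = NotifyingRegistry()
notifying_registry.subscribe(
    "FAOSTAT:AGRICULTURE_PRODUCTION",
    lambda object_id, event: received.append((object_id, event)),
)
notifying_registry.submit("FAOSTAT:AGRICULTURE_PRODUCTION", "new registration")
notifying_registry.submit("FAOSTAT:AGRICULTURE_AREA", "new registration")
```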

SDMX Technical Standards

Alignment with Other Standards

Other Statistical Standards

• There are many statistical standards which are potentially used by SDMX systems:
– Data Documentation Initiative (metadata for microdata)
– ISO/IEC 11179 (for semantic models and definitions)
– eXtensible Business Reporting Language (for business reporting)
– ISO 19115 (for geophysical metadata and maps)

• Typically, these standards represent the source information of aggregate SDMX data, or represent additional metadata

• SDMX has been aligned with these standards to support such systems

SDMX Standards Alignment Example: The Data Documentation Initiative (DDI)

XML specification for microdata

http://www.ddialliance.org

What is the DDI?
• Purpose
– Capture extensive metadata for archiving, dissemination and use of microdata
– 5 sections (document, survey, files, variables, documentation), hundreds of elements
• DDI Alliance
– Expert Committee, Steering Committee and Working Groups
– http://www.ddialliance.org
• DDI Users
– US & European academics and statistical agencies
– International Household Survey Network (IHSN) & developing countries

What is the DDI?
• DDI is a mature product with a long history
– ISR OSIRIS (1970), The IASSIST Codebook Action Group (SGML, DTD) (1993), Draft DDI (1997), Beta-testing (1999), DDI 1.0 (2000), DDI 2.0 (2003), DDI 3.0 (2007)
• DDI 1/2.x model: single survey
• DDI 3.0 model: the survey life cycle

Tools for DDI
• International Household Survey Network
– Objectives: Improve the availability, quality and use of survey data in developing countries
– Members: International organizations & national agencies supporting survey programs in developing countries
– Management: DFID, ILO, PARIS21, UNICEF, UNSD, WHO, World Bank
– Activities: Coordinating survey programs, Harmonizing concepts & methods, Maintaining a survey catalog, Developing data dissemination tools
– http://www.surveynetwork.org
• Microdata Management Toolkit
– DDI based user friendly package for archiving and preservation of surveys

Microdata Management Toolkit
• Status
– Available in English, French, Spanish, Russian
– http://www.surveynetwork.org/toolkit
• Roll-out program
– Completed training / pilot in about 20 countries, mainly in the Africa region
– Expected use by UNICEF for next round of Multiple Indicators Cluster Surveys (MICS, 55 countries)
– Asia: Partnership with United Nations Economic and Social Commission for Asia and the Pacific (ESCAP)
– Latin America: partnership with Inter-American Development Bank (IADB)
– Used by IHSN member agencies (WHO, ILO, etc.)
– Component of World Bank Accelerated Data Program (ADP)

DDI and SDMX

SDMX
– Aggregated data
– Indicators, Time Series
– Across time
– Across geography
– Open Access
– Easy to use

DDI
– Microdata
– Low level observations
– Single time period
– Single geography
– Controlled access
– Expert Audience

• Microdata are an important source of aggregated data
• Crucial overlaps and mappings exist between both worlds (but are commonly undocumented)
• Interoperability provides users with a full picture of the production process

Demo: SDMX – DDI Integration

• Aggregates and microdata on the website of the Nigerian statistical office

Questions?
