Transcript
Page 1: LSD Dimensions: Use and Reuse of Linked Statistical Data as RDF Data Cube

LSD Dimensions: Use and Reuse of Linked Statistical Data as RDF Data Cube

Albert Meroño-Peñuela (@albertmeronyo)

WAI meeting 06-10-2014

Page 2

Statistics!

Page 3

Data integration – 220 years ago

Page 4

Data integration - nowadays

Page 5

Data integration - nowadays

Page 6

Towards 5-star Linked Statistical Data

Page 7

Towards 5-star Linked Statistical Data

Page 8

Towards 5-star Linked Statistical Data

DFT

Page 9

Towards 5-star Linked Statistical Data

DFT

Eurostat TSV

Page 10

RDF Data Cube

• 4-star LSD: use URIs to denote (statistical) things

• 5-star LSD: link own (statistical) things to other (statistical) things

“There are many situations where it would be useful to be able to publish multi-dimensional data, such as statistics, on the web in such a way that they can be linked to related data sets and concepts.”

Page 11
Page 12
Page 13

RDF Data Cube vocabulary (QB)

• SDMX compatible

• Defines cubes as a set of observations that consist of dimensions, measures and attributes

• Dimensions: time period, region, sex (qb:DimensionProperty)

• Measure: population life expectancy (qb:MeasureProperty)

• Attribute: unit of measure = years, metadata status = measured (qb:AttributeProperty)

Observation: “the measured life expectancy of males in Newport in the period 2004-2006 is 76.7 years”
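
The observation above can be sketched in code to show how the three kinds of component fit together. This is a minimal illustration, not the QB specification's own example: the `Observation` class, the `ex:` property names, and the subject `ex:obs1` are all hypothetical, and the `qb:` and `ex:` prefixes are assumed to be declared elsewhere.

```python
from dataclasses import dataclass, field


@dataclass
class Observation:
    """One cell of a cube: dimensions locate it, the measure holds its value."""
    dimensions: dict  # property typed as qb:DimensionProperty -> value
    measure: tuple    # (property typed as qb:MeasureProperty, value)
    attributes: dict = field(default_factory=dict)  # qb:AttributeProperty -> value

    def to_turtle(self, subject: str) -> str:
        """Render the observation as one Turtle statement (prefixes not included)."""
        lines = [f"{subject} a qb:Observation"]
        for prop, value in self.dimensions.items():
            lines.append(f"    {prop} {value}")
        prop, value = self.measure
        lines.append(f"    {prop} {value}")
        for prop, value in self.attributes.items():
            lines.append(f"    {prop} {value}")
        return " ;\n".join(lines) + " ."


# Hypothetical ex: terms standing in for real dimension, measure and attribute URIs.
obs = Observation(
    dimensions={
        "ex:refPeriod": '"2004-2006"',
        "ex:refArea": "ex:Newport",
        "ex:sex": "ex:male",
    },
    measure=("ex:lifeExpectancy", "76.7"),
    attributes={"ex:unitMeasure": "ex:years", "ex:obsStatus": "ex:measured"},
)
print(obs.to_turtle("ex:obs1"))
```

The dimensions identify which cell of the cube is meant, the measure carries the number, and the attributes say how to read it.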

Page 14

5-star LSD: 270a.info

Sarven Capadisli, Sören Auer, Reinhard Riedl. “Linked Statistical Data Analysis”. 1st Int. Workshop on Semantic Statistics (SemStats) ISWC 2013.

Page 15

Are we done?

• P1: Comparability? Can we arbitrarily combine any pair of these datasets/dimensions?

• P2: Reusability? How often are dimensions reused? Can we reuse dimensions created by others?

• P3: Discoverability? How to discover dimensions created by others?

• P4: Relevance? What’s the size of LSD?

Page 16

P1: Comparability of LSD: SSCLSDA

Sarven Capadisli, Albert Meroño-Peñuela, Sören Auer, Reinhard Riedl. “Semantic Similarity and Correlation of Linked Statistical Data Analysis”. 2nd Int. Workshop on Semantic Statistics (SemStats) ISWC 2014.

Page 17

P2+P3+P4: LSD Dimensions

Need for an intelligent system that helps us with (1) discovering, (2) reusing, and (3) analyzing dimensions in LSD
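
As a rough sketch of the discovery and reuse steps, one could ask each SPARQL endpoint for its declared dimensions and then count how many endpoints share each one. This is an assumption about how such a system might work, not the actual LSD Dimensions implementation: `DIMENSIONS_QUERY`, `count_reuse`, the endpoint URLs, and the `crawl` data are all made up for illustration.

```python
from collections import Counter

# A plain SPARQL query listing every resource typed as a QB dimension.
DIMENSIONS_QUERY = """
PREFIX qb: <http://purl.org/linked-data/cube#>
SELECT DISTINCT ?dimension WHERE {
  ?dimension a qb:DimensionProperty .
}
"""


def count_reuse(dimensions_by_endpoint):
    """Count in how many endpoints each dimension URI appears."""
    usage = Counter()
    for dimensions in dimensions_by_endpoint.values():
        usage.update(set(dimensions))  # count each endpoint at most once
    return usage


REF_AREA = "http://purl.org/linked-data/sdmx/2009/dimension#refArea"
REF_PERIOD = "http://purl.org/linked-data/sdmx/2009/dimension#refPeriod"

# Hypothetical crawl results: endpoint URL -> dimension URIs returned by the query.
crawl = {
    "http://example.org/a/sparql": [REF_AREA, REF_PERIOD],
    "http://example.org/b/sparql": [REF_AREA, "http://example.org/b/dim/age"],
    "http://example.org/c/sparql": [REF_AREA, REF_PERIOD, REF_PERIOD],
}

usage = count_reuse(crawl)
reused = sorted(uri for uri, n in usage.items() if n > 1)
print(usage.most_common())
print(reused)
```

Counting endpoints rather than raw occurrences keeps a single large dataset from making a locally defined dimension look widely reused.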

Page 18

http://lsd-dimensions.org/

Page 19

http://lsd-dimensions.org/

Page 20
Page 21
Page 22
Page 23

Are we done?

• P1: Comparability? Can we arbitrarily combine any pair of these datasets/dimensions? Answer: unclear

• P2: Reusability? How often are dimensions reused? Can we reuse dimensions created by others? Answer: logarithmic law / probably yes

• P3: Discoverability? How to discover dimensions created by others? Answer: LSD Dimensions

• P4: Relevance? What’s the size of LSD? Answer: ~8.5% of the LOD cloud

Page 24

Future Work

• Monitor additional metadata (rdfs:subPropertyOf, rdfs:range)

• Generate PROV during crawling

• Modeling of formulas in RDF Data Cube

• Plug to LOD Laundromat

• Crawl dimensions and codes from qb:Observation

• SPARQL endpoint and API

– Suggest dimensions and codes to users

Page 25

Thank you

Questions, suggestions, comments most welcome

@albertmeronyo

http://lsd-dimensions.org/

https://github.com/albertmeronyo/LSD-Dimensions

https://github.com/csarven/sense-of-lsd-analysis

http://www.cedar-project.nl

