DM2E Project Meeting Bergen: WP2 Presentation, Kai Eckert (University of Mannheim)
co-funded by the European Union
Work Package 2
All WP Meeting, 12th June 2014, Bergen
Kai Eckert
Timetable
16.04.2013 DM2E Review: Work Package 2 2
Q8
• Final Version of the Interoperability Infrastructure
• Integration of OmNom, MINT, Silk
Q9
• Review II
• DM2E Model 1.1
• Data Ingestions, Contextualisation
Q10
• DM2E Model 1.2, RDF Validation, Application Profiles
• Europeana Ingestion
• Contextualisation
Old next steps (Athens)
• Final Version (January 2014)
– Complete transformation and ingestion infrastructure
– Integrated contextualization
– Connection with scholarly environment (WP3)
– Documentation Drafts
• Data ingestions(!)
• Maintenance phase
– Bug fixing and performance tuning in OmNom
– Documentation and introductory materials with WP4
– Data provision for Europeana
– Search and browse interface
Data Model
DM2E All WP Meeting: Work Package 2, 11.06.2013
Iterative Process (Ingestion Model)
1. Issues are tracked based on validation and test reports.
2. Changes to the model are collected and included in the current draft version of the model.
3. Providers adjust their mappings.
4. A new draft is published on a regular basis for additional feedback.
5. Based on the feedback, a new version of the model is released.
[Figure: cycle of mapping creation, automatic validation and ingestion tests (T 2.1)]
DM2E Model
• Reviewer recommendation: Stop changing the model
• Open issues mentioned in Review:
– EAD requirements
– Evaluation
– Error fixing
• Current state: Finalisation of the last model version 1.2
– Final release in July 2014
• Model specification
– Now working on Revision 1.2_RC1
– Check: DM2E model compared to EDM Mapping Guidelines v2.1 (May 2014)
  • Removed edm:isDerivativeOf from edm:WebResource
  • Removed edm:wasPresentAt from edm:WebResource
– Corrected CIDOC CRM namespace
– Added section “Technical Details”
Evaluation
• The model evaluation took place in April/May 2014
• Base:
– 10 datasets
– Delivered by eight data providers
– Mapped by six different institutions
– Altogether 61,365,146 triples
Evaluation: Some Insights
• Many classes and properties are not mapped
– For example: edm:Event, dm2e:misattributed, edm:happenedAt, skos:hiddenLabel
– Some of these were asked for by providers!
– Conclusion: Unused classes and properties can be removed to simplify the model
• Different providers have different mapping styles
– Conclusion: Mapping recommendations are important!
• Evaluation advantages
– Helpful to discover mapping errors that were not detected by the validator (e.g. no usage of rdf:type)
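The kind of mapping error mentioned above (resources without any rdf:type statement) can be found with a simple pass over the triples. The following is only a minimal sketch: the function name, the tuple representation of triples and the sample data are assumptions made for illustration and are not part of the DM2E validator or of any provider dataset.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def subjects_without_type(triples):
    """Return the sorted subjects that never occur with an rdf:type statement."""
    subjects = set()
    typed = set()
    for s, p, _o in triples:
        subjects.add(s)
        if p == RDF_TYPE:
            typed.add(s)
    return sorted(subjects - typed)


# Hypothetical sample data, not taken from any DM2E dataset.
sample_triples = [
    ("ex:item1", RDF_TYPE, "bibo:Book"),
    ("ex:item1", "dc:title", "A title"),
    ("ex:item2", "dc:title", "Untyped resource"),
]

untyped = subjects_without_type(sample_triples)
print(untyped)
```

Here only ex:item2 is reported, because it carries a title but no type.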
Evaluation: Property Usage
• A few properties were used very often
• Most properties were rarely used (Long tail phenomenon)
• About a third of all properties were never used
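A usage count of this kind can be approximated by tallying predicates and comparing them with the list of properties defined in the model. This is a sketch under stated assumptions: the function name, the short property list and the sample triples are hypothetical and do not reproduce the actual evaluation procedure or its data.

```python
from collections import Counter


def property_usage(model_properties, triples):
    """Split the model's properties into used (with counts) and never used."""
    counts = Counter(p for _s, p, _o in triples)
    used = {p: counts[p] for p in model_properties if counts[p] > 0}
    unused = sorted(p for p in model_properties if counts[p] == 0)
    return used, unused


# Hypothetical excerpt of a model and of a dataset, for illustration only.
model_properties = ["dc:title", "dc:creator", "edm:happenedAt", "skos:hiddenLabel"]
sample_triples = [
    ("ex:item1", "dc:title", "First"),
    ("ex:item2", "dc:title", "Second"),
    ("ex:item2", "dc:creator", "Someone"),
]

used, unused = property_usage(model_properties, sample_triples)
print(used)
print(unused)
```

Sorting the used counts in descending order would then show the long tail directly.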
The DM2E Model: an Application Profile of the EDM, 15.04.2014
Evaluation: CHO Types
Dataset     bibo:Series  bibo:Book  dm2e:Manuscript  dm2e:Paragraph  bibo:Journal  bibo:Issue  fabio:Article  bibo:Letter  dm2e:Page
Dataset 1   -            -          24               -               -             -           -              -            10,427
Dataset 2   -            1,251      10               -               -             -           -              -            530,314
Dataset 3   4,552        39,873     -                -               -             -           -              -            -
Dataset 4   -            -          175              -               -             -           -              -            46,006
Dataset 5   -            -          1,012            -               -             -           -              -            307,202
Dataset 6   -            2,916      -                -               -             -           -              -            472,994
Dataset 7   -            1,295      -                -               -             -           -              -            416,172
Dataset 8   -            -          -                -               -             -           -              3,630        34,596
Dataset 9   -            -          -                -               1             346         42,173         -            159,277
Dataset 10  -            -          20               9,635           -             -           -              -            -
Total       4,552        45,335     1,241            9,635           1             346         42,173         3,630        1,976,988
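The Total row can be cross-checked by summing each column. In the sketch below, dashes are read as zero, and the three values of Dataset 2 are assigned to bibo:Book, dm2e:Manuscript and dm2e:Page. That assignment is an assumption; it is the placement under which the column sums agree with the Total row.

```python
TYPES = [
    "bibo:Series", "bibo:Book", "dm2e:Manuscript", "dm2e:Paragraph",
    "bibo:Journal", "bibo:Issue", "fabio:Article", "bibo:Letter", "dm2e:Page",
]

# Counts per dataset in the column order of TYPES; "-" is read as 0.
ROWS = {
    "Dataset 1": [0, 0, 24, 0, 0, 0, 0, 0, 10427],
    "Dataset 2": [0, 1251, 10, 0, 0, 0, 0, 0, 530314],  # assumed column placement
    "Dataset 3": [4552, 39873, 0, 0, 0, 0, 0, 0, 0],
    "Dataset 4": [0, 0, 175, 0, 0, 0, 0, 0, 46006],
    "Dataset 5": [0, 0, 1012, 0, 0, 0, 0, 0, 307202],
    "Dataset 6": [0, 2916, 0, 0, 0, 0, 0, 0, 472994],
    "Dataset 7": [0, 1295, 0, 0, 0, 0, 0, 0, 416172],
    "Dataset 8": [0, 0, 0, 0, 0, 0, 0, 3630, 34596],
    "Dataset 9": [0, 0, 0, 0, 1, 346, 42173, 0, 159277],
    "Dataset 10": [0, 0, 20, 9635, 0, 0, 0, 0, 0],
}

REPORTED_TOTAL = [4552, 45335, 1241, 9635, 1, 346, 42173, 3630, 1976988]

# Sum every column across all datasets and compare with the reported totals.
computed_total = [sum(column) for column in zip(*ROWS.values())]
mismatches = {
    t: (c, r)
    for t, c, r in zip(TYPES, computed_total, REPORTED_TOTAL)
    if c != r
}
print(mismatches or "all column totals match")
```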
Evaluation: Number of statements
• Average number of statements per dataset
Consequences
• Goal: Remove unnecessary elements from the model
• All unused classes and properties are marked in orange in the current draft revision
• Please provide feedback to UBER on whether they can be removed from the model description
• Feedback until 22nd of June 2014
Konstantin Baierer, Evelyn Dröge, Vivien Petras, Violeta Trkulja:
Linked Data Mapping Cultures: An Evaluation of Metadata Usage and Distribution in a Linked Data Environment
Submitted to Dublin Core Conference 2014
Infrastructure
UI Integration
T 2.5
MINT
Silk
OmNom
Access to the data
Search and browse the data
View and access the data
Kai Eckert, Dominique Ritze, Konstantin Baierer, Christian Bizer:
RESTful Open Workflows for Data Provenance and Reuse
Poster at WWW Conference, 2014
Contextualisation
RDF Validation
Europeana Ingestion
See dedicated presentations.
Outreach and Sustainability
• DCMI RDF-Application Profiles
– Establishing RDF Application Profiles
– DM2E: UBER, UMA, ONB
– External: W3C, Europeana, DNB, …
• JudaicaLink
– Provide web encyclopediae as Linked Data for contextualisation resources
– DM2E: UMA, EAJC
– Encyclopediae: RUJEN, YIVO, …
• MarineLives
– Transcription of depositions from High Court of Admiralty, 17th century.
– DM2E: UMA, NET7
– External: MarineLives, University Bath Spa, …
• COST Action Linked Open Data across the Humanities
– Proposal for trans-domain action to advance collaboration
– DM2E: UMA, …
– External: University of Oxford, Università del Piemonte Orientale, UNED, …
Dominique Ritze, Caecilia Zirn, Colin Greenstreet, Kai Eckert, Simone Paolo Ponzetto: Named Entities in Court: The Marine Lives Corpus, Paper at LRT4HDA Workshop at LREC 2014
Thomas Bosch, Kai Eckert: Towards Description Set Profiles for RDF using SPARQL as Intermediate Language, Submitted to Dublin Core Conference 2014
Thomas Bosch, Kai Eckert: Requirements on RDF Constraint Formulation and Validation, Submitted to International Semantic Web Conference 2014
Thank you.