Berner Fachhochschule | Haute école spécialisée bernoise | Bern University of Applied Sciences
Towards a Linked Data Publishing Methodology
Krems, 18 May 2016
Eduard Klein / E-Government Institute
BUAS – Bern University of Applied Sciences / Faculty of Business
Context: EC funded "Fusepool"
projects (2012-2016)
▶ Lecturer @ BUAS – Bern Univ. of Applied Sciences
▶ Teaching areas: Software Engineering, (Semantic) Web Technology,
Linked (Open) Data
▶ Research projects since 2009, EC funded (FP7)
▶ Ambient Assisted Living (AAL call): Third Age Online (TAO), 2009-2011: digital
inclusion of senior citizens (web accessibility)
▶ Fusepool SME, 2012-2014: integrating and semantically enriching
heterogeneous data for product development and protection of intellectual
property
▶ Fusepool P3, 2014-2016: efficient data publishing through a highly automated
data life cycle and tooling support in tourism use cases (Tuscany, Trentino)
▶ The Fusepool projects are Linked Open Data (LOD) projects
Eduard Klein / Profile
▶ Linked Data (LD) approach in general (at least in many
projects…)
▶ Realization within a large-scale research project ("Fusepool")
▶ Focus on Methodology
▶ Analysis and Externalization
▶ Practical Template for use in LD projects
▶ Experiences
Outline
Linked Data (LD) Approach in General
Events, POIs, GLAM data
Linked Data:
enriched and integrated
Legacy Data:…
lod-cloud.net
- Tools & Techniques
- LD life cycle
- Ease of use?
- Sustainability?
▶ Sustainability:
▶ Compliant with W3C's Linked Data Platform (LDP)
▶ Loosely coupled components (RESTful API)
Architecture: Linking Data in "Fusepool"
[Figure: Fusepool P3 architecture diagram.
Key: Clients, Transformers, LDP Transforming Proxy, Backends.
Components shown: Client Applications / Fusepool P3 Dashboard; LDP Transforming Proxy; Single Transformers; Pipeline Transformer; Transformer Registry; Transformer Factory Registry; User Interaction Request Registry; LDP 1.0 Server; SPARQL Endpoint; RDF Triple Store; Custom Services.
Interfaces labeled: LDP 1.0, SPARQL, REST, Transformer API, Transforming Container API (extension of LDP 1.0), User Interaction Request API.]
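To illustrate what "LDP-compliant" and "RESTful" mean for a client, here is a minimal sketch of the kind of HTTP request an LDP 1.0 container accepts for creating a resource. It is not taken from the Fusepool code base: the container URL, the Turtle payload and the function name `build_ldp_post` are assumptions made for illustration, and the request is only built, not sent.

```python
import urllib.request

# Hypothetical container URL (assumption); a real deployment exposes its own.
CONTAINER_URL = "http://example.org/ldp/events/"

# Made-up Turtle payload; "<>" refers to the resource that is to be created.
TURTLE_BODY = """\
@prefix schema: <http://schema.org/> .
<> a schema:Event ;
   schema:name "Summer Concert" .
"""


def build_ldp_post(container_url: str, turtle: str, slug: str) -> urllib.request.Request:
    """Build (but do not send) a POST asking an LDP container to create a resource."""
    return urllib.request.Request(
        container_url,
        data=turtle.encode("utf-8"),
        headers={
            "Content-Type": "text/turtle",  # RDF serialization of the body
            "Slug": slug,                   # naming hint the server may honor or ignore
        },
        method="POST",
    )


request = build_ldp_post(CONTAINER_URL, TURTLE_BODY, "summer-concert")
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; according to the LDP 1.0 specification, a server that creates the resource answers with `201 Created` and a `Location` header pointing at it.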
▶ Focus on Re-Use of Linked Data Publishing Process:
▶ Suitable starting point for LD Use Case planning
▶ Completeness (of planning), necessary project skills,
duration of completed projects
▶ Documentation of essential tasks helps to answer:
▶ "How long will it take to develop a use case with this
platform?"
▶ "Necessary technical skills?"
▶ Shortening the learning curve
▶ Better estimation of future projects based on documented
experiences
Methodology
Linked Data Publishing Methodology (LIDAPUME)
1) Typical key stakeholders: end users and data owners
2) Interviews and (field) research
3) Starting point for conceptual and functional test models
4) Identifying (available and missing) data sources
5) Identification of appropriate taxonomies, vocabularies,
ontologies
6) Specify mapping of non-RDF data to RDF data (e.g. through
XSLT)
7) Definition of transformation steps (Fusepool: configuration
front-end)
Linked Data Publishing Methodology (cont.)
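Step 6 (mapping non-RDF data to RDF) can be made concrete with a small sketch. The slide names XSLT as one option; the example below uses plain Python instead and maps CSV rows to N-Triples. The namespace `http://example.org/event/`, the column names `id`, `name`, `start`, the choice of schema.org terms and the sample row are all assumptions for illustration, not the mapping used in Fusepool.

```python
import csv
import io
from typing import List
from urllib.parse import quote

# Hypothetical namespace for minted resource URIs (assumption).
BASE = "http://example.org/event/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SCHEMA = "http://schema.org/"


def escape_literal(value: str) -> str:
    """Escape backslash, double quote and line breaks for an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def rows_to_ntriples(csv_text: str) -> List[str]:
    """Map each CSV row (assumed columns: id, name, start) to three N-Triples lines."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Percent-encode the id so the minted URI stays syntactically valid.
        subject = f"<{BASE}{quote(row['id'], safe='')}>"
        triples.append(f"{subject} <{RDF_TYPE}> <{SCHEMA}Event> .")
        triples.append(f'{subject} <{SCHEMA}name> "{escape_literal(row["name"])}" .')
        triples.append(f'{subject} <{SCHEMA}startDate> "{escape_literal(row["start"])}" .')
    return triples


# Made-up sample row, for illustration only.
SAMPLE_CSV = "id,name,start\ne1,Summer Concert,2016-07-01\n"

for line in rows_to_ntriples(SAMPLE_CSV):
    print(line)
```

Writing the mapping as an explicit function keeps the vocabulary decisions from step 5 visible in one place, which makes them easy to document in the template.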
A template for documentation of essential
activities
▶ Shape of template heavily discussed,
e.g. "too general?" "too unspecific?"
▶ Archival Data from the Federal Archives of 4 Swiss cantons
▶ SPARQL endpoint
Validation of the framework ("Swiss Archive" Use Case)
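As a rough illustration of how a client would consume a SPARQL endpoint such as the one mentioned for this use case, the sketch below builds a query URL following the SPARQL 1.1 Protocol (query via HTTP GET, passed in the `query` parameter). The endpoint address and the query are placeholders; the actual endpoint of the use case is not given in the slides.

```python
from urllib.parse import urlencode

# Hypothetical endpoint address (assumption).
ENDPOINT = "http://example.org/sparql"

# Generic placeholder query: the first ten triples of the default graph.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"


def build_query_url(endpoint: str, query: str) -> str:
    """Return a GET URL carrying the URL-encoded query in the 'query' parameter."""
    return f"{endpoint}?{urlencode({'query': query})}"


print(build_query_url(ENDPOINT, QUERY))
```

A client would typically send this URL with an `Accept: application/sparql-results+json` header to receive the result set as JSON.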
(1D=1 effort-day)
▶ FU Berlin library content
▶ GND and DBpedia loaded and pre-processed
Validation of the framework ("Library Keyword Clustering")
▶ Events of touristic regions
▶ Interlinked with POIs, historical characters etc.
▶ Hackathon outcome: LD based web application
Validation of the framework ("Event Explorer")
▶ Several adaptations of the 7-step model across project phases
▶ Number of phases
▶ Type and granularity of documented information (detail of
description)
▶ Formalism of notation: not too formal, semi-structured
▶ Detail of description: too detailed would not be of value for
"average" user
▶ Columns "Activities", "Skills", "Effort" helped a lot for planning
of future Linked Data Use Cases
Experiences with Methodology & Template
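The template columns "Activities", "Skills" and "Effort" lend themselves to a simple semi-structured record per phase. The sketch below shows one possible representation and how it could support the planning questions raised earlier (total duration, required skills). The class and function names are hypothetical, and the sample entries and effort figures are made-up placeholders, not values from the Fusepool use cases.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PhaseRecord:
    """One template row: a phase with its activities, skills and effort."""
    phase: str
    activities: List[str] = field(default_factory=list)
    skills: List[str] = field(default_factory=list)
    effort_days: float = 0.0  # 1D = 1 effort-day


def total_effort(records: List[PhaseRecord]) -> float:
    """Sum the documented effort over all phases, in effort-days."""
    return sum(r.effort_days for r in records)


def required_skills(records: List[PhaseRecord]) -> List[str]:
    """Collect the distinct skills over all phases, sorted alphabetically."""
    return sorted({skill for r in records for skill in r.skills})


# Placeholder entries (assumed values, for illustration only).
use_case = [
    PhaseRecord("Identify data sources",
                activities=["Review available data sets"],
                skills=["Domain knowledge"],
                effort_days=2.0),
    PhaseRecord("Specify mapping to RDF",
                activities=["Write mapping rules"],
                skills=["RDF", "XSLT"],
                effort_days=1.5),
]

print(total_effort(use_case), required_skills(use_case))
```

Keeping the record this loose matches the "not too formal, semi-structured" notation noted above, while still allowing documented use cases to be compared.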
▶ Evaluation and Validation of Publishing Framework LIDAPUME
and Template
▶ In our ongoing and future projects
▶ (hopefully) by other projects
▶ possible evolution of methodology based on evaluation and
validation feedback
▶ Comparability of approaches based on documentation with the
same template
Outlook / Further Research