Thomas Vestdam (PhD), Head of Product Technology, Elsevier (Aalborg, Denmark)
Closing the Loop - Technology Implementations
Thorsten Höllrigl, David Baker, Thomas Vestdam
My Background
• (PhD) Doctorate in Computer Science
• (Associate) Professor in Computer Science
• Developer, Architect, Technical Lead and Product Manager on a modern (C)RIS (Pure) that we rebuilt from the ground up - 5 years
• Later: Head of Pure Platform
• Active in euroCRIS, in particular on CERIF-XML
• Responsible for our “technical” activities within euroCRIS, CASRAI, ORCID, VIVO,…
• Now: Head of Product Technology
The Grand Promises of a Standard for Interoperability
• Reduction of formats we need to support = true interoperability
• Commoditisation (and standardisation)
• Common and shared vocabularies, semantics & use-cases = “Enter once, use everywhere”
• Preservation of knowledge - we can reap the fruits of already established best practices
[Diagram of data formats and systems. Labels: CCCV, BioSketch, REF2014, Mods’, DC, Scopus XML, WoS XML, Pure CV, Finance Systems, HR Systems, Assessment Systems, Web-Service/API/OAI, CVs, PubMed XML, ORCID, DataCite, CERIF XML’, CERIF XML’’, Pure XML, BFI, DOI, Online Sources, SciVal XML, SHERPA RoMEO, DNFDBFI, OpenAire, Eprints XML, DSpace, Equella XML, Fedora, Mods, Repositories, SEP, ERA, VIVO RDF; several items are marked “?”.]
[Diagram: stakeholders and their concerns]
• Librarian: LTP, Open Access
• Funder: Applications, CV, Feedback
• Publisher: Citations, Metrics, Submission Workflow
• Research Administrator: Costs, Metrics, Performance
[Diagram: three layers - Business and Policies (subject matter experts); Modeling and Exchange Experts; Software Implementation and Developers. Labels placed on the layers: CASRAI, Governance, ORCID, CERIF, DC, VIVO, C#, MODS, REST API, SWORD.]
[Diagram: the three layers - Business and Policies (subject matter experts); Modeling and Exchange Experts; Software Implementation and Developers - shown together with: Terminology, Profiles and Objects; Model(s); Data Formats and Exchange Protocols.]
So what is the developer's perspective?
• [B1] Defines and manages semantics
  Helps implementors understand how to use the meta-data model
• [B2] Defines and manages use-cases
  Use-cases help to understand and define business rules - i.e. the fundamentals
Business and Policies
So what is the developer's perspective?
• [M1] Defines a meta-data model for research information
  Helps implementors in getting their meta-data model right, based on the domain knowledge embedded in the standard model
• Comprehensive and fine-grained
• Explicit and relevant entities
• Rich and temporal relations
• Easy to implement
Modeling and Exchange
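To illustrate what “explicit entities” and “rich and temporal relations” in [M1] can mean for an implementor, here is a minimal Python sketch. The class names (`Person`, `OrgUnit`, `Affiliation`), the role value and the dates are hypothetical; they are not taken from CERIF or any other standard.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Person:
    id: str
    name: str


@dataclass(frozen=True)
class OrgUnit:
    id: str
    name: str


@dataclass(frozen=True)
class Affiliation:
    """A relation modelled as its own object, with a role and a validity period."""
    person: Person
    org_unit: OrgUnit
    role: str
    start: date
    end: Optional[date] = None  # None means the relation is still open

    def active_on(self, day: date) -> bool:
        # The relation holds from start to end, both inclusive.
        return self.start <= day and (self.end is None or day <= self.end)


# Hypothetical example data.
alice = Person("p1", "Alice Example")
dept = OrgUnit("o1", "Department of Examples")
link = Affiliation(alice, dept, role="professor",
                   start=date(2010, 1, 1), end=date(2014, 12, 31))
```

Because the relation carries its own dates, a question such as “who was affiliated with this unit on a given day?” becomes a filter over `Affiliation` objects rather than a lookup in a history table bolted on later.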
[Diagram: the three layers - Business and Policies (subject matter experts); Modeling and Exchange Experts; Software Implementation and Developers - with Governance and Stakeholders added.]
So what is the developer's perspective?
• [G1] Defines a vision, strategy and clear goals
  Sets the scope and direction for a given standard - and acts as a champion
• [G2] Coordinates and steers activities
  Coordinates internal and external activities and stakeholders
• [G3] Ensures that the standard is agnostic
  A standard should encourage best practice, not dictate specific technologies or other matters that are internal concerns of a given system
• [G4] Implements version control and management
  Not in the technical sense, but releases of the standard must be carefully coordinated and managed centrally
Governance
So what is the developer's perspective?
• [T1] Specifies a Data Format for exchange
  Specification of the structure of data
• [T2] Protocol for exchanging data
  Specification of how to exchange data
• [T3] Defines how to be compliant
  A technical and concise definition
Software Implementation
Data Format and Protocol for exchanging data
[Diagram: systems A and B exchange data through an exchange format plus a protocol; the Business and Policies (subject matter experts) and Modeling and Exchange Experts layers are also shown.]
Querying vs Harvesting vs File download
• Do you want to search and investigate data?
• Do you want live-searches?
• Do you just want all data?
• OAI-PMH
• OData
• What are the use-cases?
JSON vs XML
• Standard tools
• Versioning of format definitions
• Schema => datatype and structure validation
• Format specification
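What “datatype and structure validation” and “versioning of format definitions” buy an implementor can be sketched in a few lines of Python. This is a hand-rolled stand-in for a real schema language (XML Schema or JSON Schema), using only the standard library; the field names, the `formatVersion` key and the version string are hypothetical.

```python
import json
from typing import Any, Dict, List

# Hypothetical format definition: field name -> required Python type.
FORMAT_VERSION = "1.0"
PUBLICATION_FIELDS: Dict[str, type] = {
    "formatVersion": str,
    "title": str,
    "year": int,
    "authors": list,
}


def validate(record: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the record passed."""
    errors = []
    for name, expected in PUBLICATION_FIELDS.items():
        if name not in record:
            errors.append(f"missing field: {name}")  # structure check
        elif type(record[name]) is not expected:
            errors.append(f"wrong type for {name}: expected {expected.__name__}")
    version = record.get("formatVersion")
    if isinstance(version, str) and version != FORMAT_VERSION:
        errors.append(f"unsupported format version: {version}")
    return errors


good = json.loads(
    '{"formatVersion": "1.0", "title": "A paper", "year": 2014, "authors": ["A. Author"]}'
)
bad = json.loads('{"formatVersion": "1.0", "title": "A paper", "year": "2014"}')

good_errors = validate(good)
bad_errors = validate(bad)
```

With a published schema, this checking comes from standard tools on both sides of the exchange instead of being rewritten by every implementor.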
[T1/T2] Specifies a Data Format for exchange / Protocol for exchanging data
• The typical data format of the times is XML (or JSON)
  • The format needs to be well-structured and clearly defined
  • The format must “facilitate” validation
  • It is preferable if the format facilitates data types
• The typical transport protocol of the times is REST
  • However, this is less important
• The real concerns in the context of the protocol are
  • Meta-data: versions, how data should be interpreted, etc.
  • API: which “handles” should be available?
  • API: how to facilitate harvesting?
• Only use “technologies” that are de facto industry standards and supported by off-the-shelf open source tools
[T3] Defines how to be compliant
• Requires a formal definition of what it means to be in compliance with a given standard
• The compliance definition must be supplied by the standardisation organisation
• Such definitions will differ from standard to standard
• In the ideal case
  • Should be concise and unambiguous
  • Should include some kind of benchmark “test” to help implementers assess or validate whether a given implementation is compliant
  • The standardisation organisation should supply compliance certificates
  • Must not limit systems that implement the standard - i.e. it must be possible to do “more”
  • Must not dictate a certain “technology” or fundamental architecture
  • The compliance “statement” must be accepted by the community - specifically by the implementors
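A benchmark “test” of the kind described above could be as simple as a list of named yes/no checks that any implementation can run against its own export. The Python sketch below is one possible shape, not a definition from any standard; the required fields, the check names and the `example_export` function are all hypothetical.

```python
from typing import Any, Callable, Dict, List, Tuple

Record = Dict[str, Any]
Check = Tuple[str, Callable[[Record], bool]]

# Hypothetical benchmark: each entry is a named, unambiguous yes/no check.
REQUIRED_FIELDS = ("id", "title", "year")
BENCHMARK: List[Check] = [
    ("has all required fields", lambda r: all(f in r for f in REQUIRED_FIELDS)),
    ("id is a non-empty string", lambda r: isinstance(r.get("id"), str) and r["id"] != ""),
    ("year is an integer", lambda r: type(r.get("year")) is int),
]


def run_benchmark(export: Callable[[], List[Record]]) -> Dict[str, bool]:
    """Run every check against every exported record; report pass/fail per check."""
    records = export()
    return {name: all(check(r) for r in records) for name, check in BENCHMARK}


def is_compliant(report: Dict[str, bool]) -> bool:
    return all(report.values())


def example_export() -> List[Record]:
    # A hypothetical system that exports more than the benchmark requires.
    return [{"id": "pub-1", "title": "A paper", "year": 2014, "internalNote": "extra"}]


report = run_benchmark(example_export)
compliant = is_compliant(report)
```

The checks only look at what the benchmark requires, so the extra `internalNote` field does not affect the result: the system can do “more” and still pass. The benchmark also takes a plain callable, so it says nothing about the technology or architecture behind the export.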