paving the way to open and interoperable research data service workflows
TRANSCRIPT
![Page 1: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/1.jpg)
Paving the way to open and interoperable research data service workflows Progress from 3 perspectives
Angus Whyte, Digital Curation Centre Rory Macneil, Research SpaceStuart Lewis, University of Edinburgh
Repository Fringe, Edinburgh, 2nd August 2016
![Page 2: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/2.jpg)
Paving the way to open and interoperable research data service workflows Progress from 3 perspectives
Angus Whyte, Digital Curation CentreResearch data service models, new DCC guidance and some draft ‘design principles’ for institutions integrating research data service workflows
Rory Macneil, Research SpaceIntegrating the RSpace ELN with University of Edinburgh’s DataShare and Harvard’s Dataverse repositories
Stuart Lewis, University of EdinburghDataVault –Jisc Research Data Spring prototype for packaging data to be archived
![Page 3: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/3.jpg)
Guidance from Digital Curation Centre
Aims to effectively support organisations with-
• Providing effective research data services
• Promoting reusability of their research data
• and reproducibility of their research
![Page 4: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/4.jpg)
Models that reflect and help shape reality
Higgins, S. (2008) The DCC Curation Lifecycle Model. Int. Journal Digital Curation 2008, Vol. 3, No. 1, pp. 134-140 doi:10.2218/ijdc.v3i1.48
Key element of DCC guidancee.g. Curation Lifecycle Model (2008)
![Page 5: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/5.jpg)
DCC research data service model (2016)
RDM policy & strategyBusiness plans &
sustainability
Training
Data Management Planning
Active data management
Appraisal & risk assessment
Preservation
Access & publishing
Discovery
Ref. DCC How-to Guide on Evaluating Data Repository and Catalogue Platforms (forthcoming, 2016)
Advisory services
![Page 6: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/6.jpg)
Some working definitions
Research data service – “a means of delivering value to the producers and users of digital objects by facilitating outcomes they want to achieve without the ownership of specific costs or risks” (derived from ITIL definition of a service)
Sub-types -
Active data management “services used to create or transform digital objects for the purposes of research”
Preservation: “services offering to ensure digital objects meet a defined level of FAIRness - findability, accessibility, interoperability, and reusability -for a designated community and period of time”
Publication: “services offering to enhance digital objects FAIRness by reviewing their quality on specified criteria, or connecting them to additional metadata”
Guidance: “services offering practical guidance on choosing or using the above services”
![Page 7: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/7.jpg)
Draft design principles for integration of research data service workflows
1. Active data management services should use openstandards to express and expose the objects and metadata they offer to downstream services, including their access and reuse terms
![Page 8: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/8.jpg)
Draft design principles for integration of research data service workflows
2. Preservation and publication services should publishpolicies stating what digital object types they accept, for what communities, and on what terms and conditions
![Page 9: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/9.jpg)
Draft design principles for integration of research data service workflows
3. Preservation and publication services should makeopenly available sufficient metadata to enable reuseof their outputs, including all terms and conditionsfor third-party access and reuse
![Page 10: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/10.jpg)
Draft design principles for integration of research data service workflows
4. Active data management, preservation and publication services should make sufficient detail of their workflows available to support thereproducibility of research that produced the digitalobjects they act upon
![Page 11: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/11.jpg)
Draft design principles for integration of research data service workflows
5. Guidance services should support users of otherservices to make an informed choice of downstreamservice capabilities, informed by best practices forreuse and reproducibility
![Page 12: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/12.jpg)
Draft design principles for integration of research data service workflows
6. Guidelines 1-5 should be implemented usingmachine-actionable content
![Page 13: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/13.jpg)
DCC research data service model (2016)
RDM policy & strategyBusiness plans &
sustainability
Training Advisory services
Data Management Planning
Active data management
Appraisal & risk assessment
Preservation
Access & publishing
Discovery
Ref. DCC How-to Guide on Evaluating Data Repository and Catalogue Platforms (forthcoming, 2016)
![Page 14: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/14.jpg)
Data Management Planning
Active data management
Appraisal & risk assessment
Preservation
Access & publishing
Discovery
Real services are made up of many more partstools and services at researchers disposal are increasing
![Page 15: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/15.jpg)
‘lifecycles’ are often non-linear
Integration use cases do not follow neat sequences
One size does not fit all
![Page 16: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/16.jpg)
What about Functional Models? OAIS foundation for
Trustworthy Digital Repository standards
IngestData
management
AIP AIP
Preservation PlanningContext Context
Description Description
RDM
SIP DIP
UsersProducer
Archival storage
Access
Administration
![Page 17: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/17.jpg)
RDM platforms can be grafted on, to relate OAIS functions to context infosources
IngestData
management
AIP AIP
CRIS
Open access Repository
Preservation PlanningContext Context
Description Description
RDM
SIP DIP
UsersProducer
Data Catalogue
Archival storage
Collection
Appraisal
Access
Data Preservation
Active Data Management
Data MgmtPlanning
Data Publication
Administration
![Page 18: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/18.jpg)
What best practice models apply ‘upstream’?
Q. How can institutions ensure researchers have informed choice of trustworthy services before they need a repository
• “Whole lifecycle” service models are one solution
• But how should they be governed?
– Commercial services?
– Commons?
– Hybrid?
![Page 19: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/19.jpg)
Commercial services modele.g. Elsevier
“On Wednesday 1 June, Elsevier acquired Hivebench to help further streamline the workflow of researchers – putting research data management at their fingertips. The added value of the integration lies in linking Hivebench with Elsevier’s existing Research Data Management portfolio for products and services.
The research data that researchers have stored in the Hivebench notebook are linked to the Mendeley Data repository, which will be linked to Pure. This ….adds instant value to the datasets because they become far more suitable for reuse.”
![Page 20: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/20.jpg)
Commons approach e.g. Principles for Open Scholarly Infrastructures
“…What should a shared infrastructure look like? Infrastructure at its best is invisible. We tend to only notice it when it fails. If successful, it is stable and sustainable. Above all, it is trusted and relied on by the broad community it serves. Trust must run strongly across each of the following areas: running the infrastructure (governance), funding it (sustainability), and preserving community ownership of it (insurance).” (emphasis added)
Cameron Neylon, Science in the Open, 25 Feb 2015
![Page 21: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/21.jpg)
Commons approach e.g. Open Science Framework
Workflows in effect governed by a ‘mixed economy’of platforms
So what principles should apply?
Mathew Spitzer An Open Science Framework for Solving Institutional Challenges: Supporting the Institutional Research Method Research Data Access and Preservation Summit, 2016 Atlanta, GA May 4-7, 2016
![Page 22: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/22.jpg)
Background - RDA Working Group on Data Publishing Workflows
Reviewed 25 examples of repository data publishing workflows, including some integrating with ‘downstream’ services e.g. data journals for peer review
Austin, Claire C et al.. (2015). Key components of data publishing: Using current best practices to develop a reference model for data publishing. Zenodo. http://dx.doi.org/10.5281/zenodo.34542
Drafted a reference model and made best practice recommendations-1. Start small, building modular, open source and shareable
components2. Follow standards that facilitate interoperability and permit
extensions 3. Facilitate data citation, e.g. through use of digital object PIDs,
data/article linkages, researcher PIDs 4. Document roles, workflows and services
![Page 23: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/23.jpg)
Background - RDA Working Group on Data Publishing Workflows
• Follow up call (Dec 15) for examples of repositories connecting with upstream research workflows e.g. to gather metadata earlier
• Aiming to identify whether recommendations apply, and how the intention to publish data is changing research practices.
• Collected 12 cases - mix of concrete examples, prototypes (e.g. Dendro), conceptual models (e.g. Science 2.0 repositories)
• Report in preparation
![Page 24: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/24.jpg)
Review of upstream workflow examples
Are the services components underpinning research workflows loosely coupled?
– Modular design of workflow components? yes, plenty evidence
– Standard vocabularies and protocols to describe components some evidence
– Significant investment in building trust-based relationships among participants some evidence
– Standardized ways of specifying capabilities and performance requirements limited (e.g. WDS-DSA Catalogue of Requirements)
* ‘The Joy of Flex’ Hagel and Seely-Brown, 2005 see e.g. http://www.cio.com.au/article/29148/joy_flex
![Page 25: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/25.jpg)
So do we need more …
E.g. DCC How-to describe research data service workflows (forthcoming)
• Definitions – to describe services ?
• Design principles – to articulate best practice?
• Capability models - to articulate mutual expectations of service owners?
• Case studies of repository workflow integration?
• Understanding of how tools are actually being integrated, and effect on data publication practices
![Page 26: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/26.jpg)
Draft design principles for integration of research data service workflows
1. Active data management services should use open standards to express and expose the objects and metadata they offer to downstream services, includingtheir access and reuse terms
2. Preservation and publication services should publish policies stating what digitalobject types they accept, for what communities, and on what terms and conditions
3. Preservation and publication services should make openly available sufficientmetadata to enable reuse of their outputs, including all terms and conditions forthird-party access and reuse
4. Active data management, preservation and publication services should makesufficient detail of their workflows available to support the reproducibility of research that produced the digital objects they act upon
5. Guidance services should support users of other services to make an informedchoice of downstream service capabilities, informed by best practices for reuseand reproducibility
6. Guidelines 1-5 should be implemented using machine-actionable content
Thanks for comments to Suenje Dalmeier Tiessen and Amy Nurnberger, co-chairs of the RDA Working Group on Data Publishing Workflows
![Page 27: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/27.jpg)
Maintaining trust across the research cycle
Q. How do these principles apply in real cases?
Case Study 1 – Rory Macneill, Research Space
Integrating the RSpace ELN with University of Edinburgh’s DataShareand Harvard’s Dataverse repositories
![Page 28: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/28.jpg)
Export data and metadata
Archive or Repository
Current Repository Paradigm
![Page 29: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/29.jpg)
The reality is (a lot) more complex
DataFilesLinks
RepositoriesArchivesELNsFile systemsDatabases
Capture and structure dataGenerate metadata
Export data and metadataEstablish links to file(s) and databases
Track file locations
ActionsVehiclesResearch units
IssuesData versus file linksForms of data export
Data from intermediary vehiclesFile location and integrity of linksPost deposit access - permissionsPost deposit access - capabilities
Pre-deposit impact on post-deposit capabilities
Repositories need the ability to easily ingest diverse data types and formats, and links to files,
in a structured manner, directly and from other tools
![Page 30: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/30.jpg)
The Dataverse – Starfish – RSpace project @Harvard Medical School
Towards a new paradigm
Data, files and research Capture and structure dataGenerate metadata
Track file locations Make data and files available for public access and query
![Page 31: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/31.jpg)
Access hyperlinks √Track file locations √
Export data and metadataIn various formats
Using open standardsPDF √
Word √HTML √XML √
On premises orCloud file
system
Database
Capture and structure data √Generate metadata √
Track file locations √
![Page 32: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/32.jpg)
Draft design principles for integration of research data service workflows
1. Active data management services should use open standards to express and expose the objects and metadata they offer to downstream services, includingtheir access and reuse terms
2. Preservation and publication services should publish policies stating what digitalobject types they accept, for what communities, and on what terms and conditions
3. Preservation and publication services should make openly available sufficientmetadata to enable reuse of their outputs, including all terms and conditions forthird-party access and reuse
4. Active data management, preservation and publication services should makesufficient detail of their workflows available to support the reproducibility of research that produced the digital objects they act upon
5. Guidance services should support users of other services to make an informedchoice of downstream service capabilities, informed by best practices for reuseand reproducibility
6. Guidelines 1-5 should be implemented using machine-actionable content
![Page 33: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/33.jpg)
Case study 2
Stuart Lewis, University of Edinburgh
DataVault –Jisc Research Data Spring prototype for packaging data to be archived
![Page 34: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/34.jpg)
Stuart LewisDeputy Director, Library &
University CollectionsThe University of Edinburgh
What is a Data Vault?
@JiscDataVault
![Page 35: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/35.jpg)
Research Data Management Services
Data Management Support
Data Management
Planning
Active Data Infrastructure
Data Stewardship
![Page 36: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/36.jpg)
Data Stewardship
• DataVault
– Long term archival storage
– First envisaged a few years ago…
![Page 37: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/37.jpg)
What is the DataVault - Analogies
https://www.flickr.com/photos/brookward/8457736952
![Page 38: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/38.jpg)
What is the DataVault - Analogies
https://www.flickr.com/photos/timshelyn/4125480767
![Page 39: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/39.jpg)
Where does it sit?
Active Storage
Lab Equipment
Other Media
Archival Storage
![Page 40: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/40.jpg)
Where does it sit?
Active Storage
Lab Equipment
Other Media
Archival Storage
PURE / CRIS / Data Catalogue
![Page 41: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/41.jpg)
Where does could it sit?
Active Storage
Lab Equipment
Other Media
Archival Storage
PURE / CRIS / Data Catalogue
Open Data Repository
![Page 42: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/42.jpg)
Phase 1 (3 months)
Phase 2 (4 months)
Phase 3 (6 months)
![Page 43: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/43.jpg)
![Page 44: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/44.jpg)
![Page 45: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/45.jpg)
![Page 46: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/46.jpg)
![Page 47: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/47.jpg)
![Page 48: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/48.jpg)
![Page 49: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/49.jpg)
![Page 50: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/50.jpg)
![Page 51: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/51.jpg)
![Page 52: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/52.jpg)
![Page 53: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/53.jpg)
![Page 54: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/54.jpg)
Make available online
Metadata already captured from CRIS, plus files from the Vault
![Page 55: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/55.jpg)
![Page 57: Paving the way to open and interoperable research data service workflows](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a67b5877f8b9a2f638b5f55/html5/thumbnails/57.jpg)
Draft design principles for integration of research data service workflows
1. Active data management services should use open standards to express and expose the objects and metadata they offer to downstream services, includingtheir access and reuse terms
2. Preservation and publication services should publish policies stating what digitalobject types they accept, for what communities, and on what terms and conditions
3. Preservation and publication services should make openly available sufficientmetadata to enable reuse of their outputs, including all terms and conditions forthird-party access and reuse
4. Active data management, preservation and publication services should makesufficient detail of their workflows available to support the reproducibility of research that produced the digital objects they act upon
5. Guidance services should support users of other services to make an informedchoice of downstream service capabilities, informed by best practices for reuseand reproducibility
6. Guidelines 1-5 should be implemented using machine-actionable content