
Tung Tung Chan July – 2019

EN

Research Data Alliance

Open Science Monitor Case Study


Research Data Alliance - Open Science Monitor Case Study

European Commission Directorate-General for Research and Innovation Directorate G — Research and Innovation Outreach Unit G.4 — Open Science E-mail [email protected] [email protected] European Commission B-1049 Brussels Manuscript completed in July 2019.

This document has been prepared for the European Commission; however, it reflects the views only of the authors, and the Commission cannot be held responsible for any use which may be made of the information contained therein.

More information on the European Union is available on the internet (http://europa.eu).

Luxembourg: Publications Office of the European Union, 2019

EN PDF ISBN 978-92-76-12112-1 doi: 10.2777/261887 KI-04-19-652-EN-N

© European Union, 2019. Reuse is authorised provided the source is acknowledged. The reuse policy of European Commission documents is regulated by Decision 2011/833/EU (OJ L 330, 14.12.2011, p. 39).

For any use or reproduction of photos or other material that is not under the EU copyright, permission must be sought directly from the copyright holders.




Table of Contents

Acknowledgements

1 Background

2 Drivers

3 Barriers

4 Impact

5 Lessons Learnt

6 Policy conclusions

References


Acknowledgements

Disclaimer: The information and views set out in this study report are those of the author(s) and do not necessarily reflect the official opinion of the Commission. The Commission does not guarantee the accuracy of the data included in this case study. Neither the Commission nor any person acting on the Commission’s behalf may be held responsible for the use which may be made of the information contained therein.

The case study is part of the Open Science Monitor, led by the Lisbon Council together with CWTS, ESADE and Elsevier.

Author

Tung Tung Chan – CWTS

Acknowledgements

The study team would like to thank Hilary Hanahoe, Secretary General of the RDA, for sharing her valuable time and experiences.


STUDY ON OPEN SCIENCE: MONITORING TRENDS AND DRIVERS (Reference: PP-05622-2017)


1 Background

Sharing research data is central to advancing the aims of Open Science, as sharing creates opportunities for reuse, increases collaboration, and promotes transparent research practices. Open sharing of research data on a global scale requires not only the cooperation of researchers, but also an interoperable data infrastructure with international frameworks and standards (Wittenburg et al., 2010). However, the global research data infrastructure needed to enable research data sharing and exchange is still far from realised. To address this gap, the Research Data Alliance (RDA) was established in 2013 with support and funding from the European Commission, the Australian Government Department of Education and Training, and the United States National Science Foundation (NSF) and National Institute of Standards and Technology (NIST) (Berman, Wilkinson & Wood, 2014).

An international and community-driven organisation, RDA builds the social and technical bridges which enable open research data sharing on a global scale through a neutral forum (Parsons, 2013). It facilitates discussion and exchange by identifying best practices and standards for research data, tools and infrastructure across scientific disciplines (Parsons, 2013; Treloar, 2014). This is achieved through concrete outputs and recommendations developed by Working Groups and Interest Groups, formed by expert members from academia, the private sector, and government (Berman, Wilkinson & Wood, 2014).

As of May 2019, the RDA includes more than 8,400 members representing 137 countries, having grown more than sixfold since its inception six years ago (when it had 1,300 members from 53 countries). The following actors and programs are instrumental in realising RDA's goals: to create, develop and adopt the social, organisational, and technical infrastructure solutions needed to reduce barriers to research data sharing and exchange.

• Individuals: As primary research data producers and users, researchers currently make up the largest group of RDA stakeholders, with 1,898 members as of May 2019.

• Research Performing Organisations (RPOs): Organisations can become sponsors, supporters or organisational members through various forms of financial contribution.

• Libraries: Library and information service professionals represent the second largest group of stakeholders, with more than 1,000 members. Within universities, they are among the key providers of awareness-raising, training and support for researchers managing data at every step of the research lifecycle.

• The European Open Science Cloud (EOSC): Launched in 2015 by the European Commission, the EOSC will operate as a federation of research data infrastructures, to create a trusted environment for hosting and processing research data to support open science (Giannoutakis & Tzovaras, 2016). The EOSC implementation roadmap outlines six action lines: (1) architecture, (2) data, (3) services, (4) access and interfaces, (5) rules and (6) governance, where the first four action lines fall within the scope of RDA.

• Regions: RDA US and RDA Europe are official regional groups within RDA global which help their members coordinate activities, events and exchanges in their research and data management communities at the national level. Members may volunteer to be a national contact point and form a national group. As of May 2019, there are 13 RDA Europe national nodes (Austria, Denmark, Ireland, Netherlands, United Kingdom, Germany, Greece, Spain, Italy, France, Finland, Slovenia, Portugal) and six national groups: Norway, Brazil, North America, United States, Asia and South Eastern Europe.

• Students and Early Career Professionals: The RDA/US Data Share Program and the RDA EU Early Career Support Programme introduce their research fellows to RDA’s work through attendance at the bi-annual plenaries and participation in RDA’s Working Groups and Interest Groups, under the mentorship of RDA advisory committee members.

2 Drivers

In 2010, the Final report of the High-Level Expert Group on Scientific Data highlighted the importance of international partnerships and global governance in building open e-infrastructures to collect, curate, preserve and share the ever-increasing amounts of scientific data (Wittenburg et al., 2010). Inspired by the success of the Internet Engineering Task Force (IETF), the Data Access Interoperability Task Force (DAITF) from Europe and the Data Web Forum proposed by NSF and NIST from the United States led to the bottom-up establishment of RDA (Treloar, 2014). The emergence of RDA was a result of the following scientific, industrial and societal drivers.

• For science, open research data infrastructures that support seamless access, use, and re-use, would enable anyone to find, access and process the data they need. This will encourage all researchers to deposit their data, foster collaboration among scientists, generate new insights and new forms of scientific inquiry to address the grand challenges of society.

• For industry, public research data can be used for commercial purposes. Ideally, cross-fertilisation and knowledge exchange between the public and private sectors will remove adversarial attitudes between the two, which will generate new discoveries, new companies and new jobs. Open research data infrastructures would help academics and industrialists engage in a virtuous cycle that amplifies the impact of innovation and advances the national economy.

• For society, open research data infrastructures that protect data ownership and integrity will create trust, increase confidence in our abilities to use and understand data, and evaluate the degree to which that data was collected in a responsible manner. This will empower citizens to contribute more easily and creatively to the scientific process.

3 Barriers

The challenges encountered by RDA members seeking to engage in research data management issues are found at the individual level and at the organisational level. On the individual level, a large percentage of RDA members come from the US and Western Europe, and they are overrepresented in Working and Interest Groups (Treloar, 2014). Members from other regions may not have the financial means to be present, or may lack the skills and confidence to contribute to events and meetings. Further, the institutions which RDA members represent may not have the resources or capacities to create a supportive environment for research data management. It would be difficult for all RDA members to adopt the outputs and recommendations and so bring about a wider change in behaviour among researchers within their organisations. Therefore, the success of RDA in removing data sharing barriers depends on both the upskilling of members across disciplines and regions, and the willingness of researchers from all scientific communities to engage in the process. It is thus essential that RDA keeps growing in membership through attractive institutional subscription models to maintain its inclusive and international position.

On the organisational level, community support for and interest in research data infrastructures and management require consensus and regular discussions. The rise in membership will also bring new topics and shared interests, which will increase the number of Working Groups and Interest Groups. The RDA forum might turn into a complex environment that will be difficult for new members to navigate. Finally, the RDA organising committee will have to be more creative in developing processes and activities that encourage all members to bring the RDA outputs and recommendations to their organisations. This is the most challenging of all, as policies come only after practices have stabilised and become accepted, which is not yet the case for research data management (Asher et al., 2013).


Moreover, RDA’s organisational visions and missions may not be well aligned with those of universities. This makes it difficult for RDA to translate or demonstrate its values to the academic institutions of which researchers are a part. Research culture rewards publications, and puts research data sharing and reuse far down the list of institutional priorities (McNutt et al., 2016). The contributions of researchers, data stewards and data professionals may be invisible if universities do not recognize research data stewardship and data management as scientific excellence, for they build trust and promote transparency and reproducibility in science.

4 Impact

The Research Data Alliance (RDA) plays a significant role in developing a consolidated global research data infrastructure. There is an urgent need to identify the necessary technical aspects, governance issues, and best practices required to support more coordinated approaches to make research data sharing a reality. Specifically, RDA creates concrete pieces of social, organisational, and technical infrastructure that accelerate data sharing and exchange for a target community, uses and adopts the infrastructure within the target community, and recommends the infrastructure to other communities. These efforts are being developed by Working Groups (WGs) and Interest Groups (IGs).

WGs generate outputs and recommendations which develop and implement data infrastructure, including tools, policy, practices and products, within a timespan of 12 to 18 months. WG members are RDA individuals who will endorse, adopt, and use these outputs in their projects, organisations and communities. IGs operate without a time limit, as they provide discussion forums in topical areas that address a specific data sharing problem and identify the kind of research data infrastructure solutions that need to be developed in WGs.

Table 1 below contains the 12 endorsed outputs officially recognised by the RDA as of May 2019. In this table, the scientific, industrial and societal impact of each output is examined through its solution (impact statements provided by RDA), its target community (researchers, libraries, funders, policymakers, etc.), and the disciplinary domain it addresses.

Table 1. RDA Endorsed Recommendations (RDA, May 2019).

| Recommendation title | Target community | Disciplinary domain | Solution | Impact |
|---|---|---|---|---|
| Scalable Dynamic Data Citation Methodology | Researchers, developers, data centres | Data science and research data management (RDM) | Supports accurate citation of data subject to change, for the efficient processing of data and linking from publications. | Scientific: technological aspects of data infrastructure. |
| Data Description Registry Interoperability Model | Researchers | RDM for cross-disciplinary and cross-platform research data discovery | Provides researchers with a mechanism to connect datasets in various data repositories based on various models such as co-authorship, joint funding, grants, etc. | Scientific: interoperability of data infrastructure. |
| Basic Vocabulary of Foundational Terminology Query Tool | Researchers | All disciplines | Ensures researchers apply a common core data model when organising their data, thus making data accessible and re-usable. | Scientific: ICT technical specifications for common standards and harmonisation of terminologies. |
| Data Type Model and Registry | Researchers, developers, RPOs, governmental agencies | All disciplines | Ensures data producers classify their data sets in standard data types, allowing data users to automatically identify instruments to process and visualise the data. | Scientific: ICT technical specifications for interoperability of data infrastructure. Societal: remove technical hurdles on data sharing. |
| FAIRsharing: standards, databases, repositories and policies – Final Recommendation | Researchers, RPOs, developers, funders, policy makers, librarians, journal publishers and learned societies | All disciplines | Guides producers of standards, databases and repositories, and creates a registry to make their resources discoverable to prospective users. This registry tracks the development and evolution of standards, and their implementation and adoption in data policies from funders, journals and RPOs. | Scientific: reduce the knowledge gap across stakeholders and encourage FAIR practices through information curation and implementation of common standards. Societal: remove technical and social hurdles on data sharing, provide education and training. |
| Persistent Identifier Type Registry | Developers | Data science and RDM | Defines standard core PID information types to enable simplified verification of data identity and integrity. | Scientific: ICT technical specification for semantic interoperability of data. |
| Machine Actionable Policy Templates | Policy makers, RPOs, governmental agencies | All disciplines | A standardised template which can be used to enforce management, automate administrative tasks, validate assessment criteria, and automate scientific analyses. | Scientific: ICT technical specification for common standards in automated process management. |
| Repository Audit and Certification Catalogues | Data centres, data communities and services | Data science and RDM | Creates harmonized common procedures for certification of repositories at the basic level, drawing from the procedures already put in place by the Data Seal of Approval (DSA) and the ICSU World Data System (ICSU-WDS). | Scientific: certification of data repositories. |
| Recommendation on Research Data Collections | Libraries, data centres, data communities, repositories and services | RDM | Provides a comprehensive model for actionable collections and a technical interface specification to enable client-server interaction for research data collections. | Scientific: technological aspects of data collections. |
| Workflows for Research Data Publishing: Models and Key Components | Researchers, publishers, libraries, data centres | All disciplines | Assists research communities in understanding options for data publishing workflows and increases awareness of emerging standards and best practices. | Societal: remove technical and social hurdles on data publication to enable data sharing. |
| Research Data Repository Interoperability WG Final Recommendations | Libraries, data centres, data communities, repositories and services | Research data management | Provides an interoperable packaging and exchange format for digital content in data repositories. Once implemented, compliant packages can be used to migrate or replicate digital content between research data repository platforms or for preservation purposes. | Scientific: common standards and exchange format which enable the interoperability of research data repositories. |
| Wheat Data Interoperability Guidelines, Ontologies and User Cases | Researchers | Agriculture | An interactive cookbook that helps researchers create, manage and exchange wheat data. | Scientific: the standardization and harmonization of wheat data. |

Of the 12 official RDA recommendations, four were ICT technical specifications endorsed by the European Commission. At a glance, the generated outputs are non-domain-specific and primarily focused on research data management through common standards and best practices. RDA delivered predominantly scientific impact, followed by societal impact that enables research data sharing, exchange and interoperability. Considering its industrial impact, publishers such as Elsevier and Wiley do adopt and endorse the above outputs. However, there is hardly any evidence of collaboration or engagement with global corporations, SMEs, commercial software and data service providers. This is logical given RDA’s focus on reaching a critical mass of individual and institutional members from the public sector during its start-up phase. Without the financial contribution of funders and the bottom-up support of academic researchers and librarians, RDA would not have achieved such prolific outputs.

While there have been one or two corporate representatives on the RDA advisory board, such appointments are insufficient to build ‘bridges’ to industry. Attracting industrial RDA members may be an important next step for RDA, to enable information and knowledge exchange between the public and private sectors. With their participation, innovative education and training programmes as well as unforeseen opportunities may arise between academics and industrialists, which will help accelerate the development of research data infrastructure in the public domain.

5 Lessons Learnt

RDA is a crucial platform in advancing the field of research data management and accelerating the development of open data research infrastructure. While there are many technical and social hurdles that await its members, the structure of WGs and IGs is useful in generating concrete outputs and recommendations. These have provided the research communities with abundant opportunities for reflection, identification of best practice and analysis of beneficial ways forward. RDA is well on its way in developing areas that advance science gateways, which Barker et al. (2018) refer to as community-driven digital environments that meet the particular needs of a research community:

• “Technical solutions for the development of science gateways, including interoperability, standards, software registries, and data management.

• Best practices and policies for the valuation of science gateways, including incentives for open science, reproducibility, data and software citation.

• Sustainability models for the maintenance, development, and exploitation of science gateways, including development of skills, training, career paths and funding.”

Barker et al. (2018, p.17)

Six years after its establishment, RDA is well beyond its start-up phase. It is time to review its organisational structure and reflect on whether RDA is generating necessary and sufficient impact at the scientific, industrial and societal levels for a diverse set of target communities. Currently, RDA outputs are mostly technical solutions for research data infrastructure issues, best practices and policies. They have not yet addressed the sustainability models outlined by Barker et al. (2018), which are fundamental in fostering a bottom-up cultural change for research data sharing and reuse. New and creative ways of engaging with individual members beyond the WGs, IGs and bi-annual plenary meetings would increase awareness of, and generate momentum and engagement in, these issues. Further, RDA should consider giving its website a more user-friendly and forward-looking design and layout, given its tremendous amount of content and the various sub-groups embedded within larger groups. Finally, the RDA organisational body may consider working on a local level with other regions, to obtain a diverse and sustainable array of funding sources, further replicating the success of RDA US and RDA Europe.

6 Policy conclusions

In conclusion, it is important that RDA continues to evolve and to increase international collaboration and global sharing mechanisms, in order to remove social and technical barriers to research data sharing and reuse. Public funding of RDA is crucial to its survival and operational evolution. The ongoing investment in national and international programs, in tandem with community and disciplinary initiatives, is facilitating the public debate on research data issues across scientific communities. However, the lack of cooperation with industry and of developed sustainability models may hinder the ability of research data to increase industrial and societal impact, and to improve research career paths and funding at the individual and institutional levels. Increased coordination across varied initiatives in the Working Groups and Interest Groups will continue to improve identification of best practice and development of policies and standards. New members need to be trained and mentored by more experienced RDA members across regions, to realise their full potential in demonstrating the value of research data sharing and reuse.


References

Asher, A., Deards, K., Esteva, M., Halbert, M., Jahnke, L., Jordan, C., ... & Urban, T. (2013). Research data management: Principles, practices, and prospects. Washington, DC: Council on Library and Information Resources.

Barker, M., Olabarriaga, S. D., Wilkins-Diehr, N., Gesing, S., Katz, D. S., Shahand, S., ... & Treloar, A. (2019). The global impact of science gateways, virtual research environments and virtual laboratories. Future Generation Computer Systems, 95, 240-248.

Berman, F., Wilkinson, R., & Wood, J. (2014). Building Global Infrastructure for Data Sharing and Exchange through the Research Data Alliance. D-Lib Magazine, 20(1). Retrieved from www.dlib.org/dlib/january14/01guest_editorial.html

Giannoutakis, K. M., & Tzovaras, D. (2016, October). The European Strategy in Research Infrastructures and Open Science Cloud. In International Conference on Data Analytics and Management in Data Intensive Domains (pp. 207-221). Springer, Cham.

McNutt, M., Lehnert, K., Hanson, B., Nosek, B. A., Ellison, A. M., & King, J. L. (2016). Liberating field science samples and data. Science, 351(6277), 1024-1026.

Parsons, M. A. (2013). The research data alliance: Implementing the technology, practice and connections of a data infrastructure. Bulletin of the American Society for Information Science and Technology, 39(6), 33-36.

RDA. (2019, May 20). All Recommendations & Outputs. Retrieved June 11, 2019, from https://rd-alliance.org/recommendations-and-outputs/all-recommendations-and-outputs

Treloar, A. (2014). The Research Data Alliance: globally co-ordinated action against barriers to data publishing and sharing. Learned Publishing, 27(5), S9-S13.

Wittenburg, P., Van de Sompel, H., Vigen, J., Bachem, A., Romary, L., Marinucci, M., ... & Lopez, D. R. (2010). Riding the wave: How Europe can gain from the rising tide of scientific data.




This case study reports activities performed by the Research Data Alliance (RDA), an international grassroots organisation that promotes international collaboration and global sharing mechanisms to remove social and technical barriers to research data sharing and reuse. Drivers, impacts and barriers of the RDA were identified and discussed using secondary data sources, close reading of the RDA website, an interview, and attendance at the RDA 13th Plenary Meeting. The report concludes with lessons learnt and policy conclusions, calling for increased industrial impact, development of sustainability models for open research data management, and induction and mentorship programs for RDA members.
