Overview of Emerging Requirements for Data Management of Federally Funded Research
Richard Huffine
Leveraging Data to Lead: SLA Maryland Premier Event
November 5, 2015
Overview
• Recent History
• Presidential Directive
• Pending Legislation
• Data Management
• Open Data
• Public Access vs. Public Domain
• Creative Commons
• Impact on Researchers
• Options for Sharing Data
• Questions?
• Contact Information
Recent History
• Federal agencies fund research in a variety of ways:
– Grant solicitations
– Partnership agreements
– Student internships
– Contracts with universities
– Contracts with companies
• The products of those activities have rarely been claimed to be the property of the government
– New requirements do not change the ownership of scholarship and data
– They do, however, require that the results of research be made publicly available
• The National Institutes of Health established the first U.S. Public Access policy in 2004. That policy included sharing data.
Presidential Directive
• On February 22, 2013, the White House issued a directive requiring that the results of taxpayer-funded research – both articles and data – be made freely available to the general public. The goal of the directive was to accelerate scientific discovery and fuel innovation.
• U.S. Government agencies with annual research and development expenditures over $100 million are required to develop a plan for accomplishing the goals of the directive.
• Agencies covered by the directive had until August 22, 2013 to develop a plan for implementation. As of now, 24 Agencies have self-identified as meeting the requirements of the Directive and 15 of them have released at least draft plans.
Presidential Directive
• Public access means:
– Free to read
– Defined rights for re-use, reproduction, acknowledgement
• In addition to new federal requirements, many other funders or partners are establishing expectations for transparency, access, and utility of research and the data used to produce it
– College and university policies
– Grantmaking foundations
– Publishers
– Societies and associations
• Even some research disciplines are beginning to coalesce around standards of practice for data availability and use.
Pending Legislation
Currently Pending:
• H.R.1426 - Public Access to Public Science Act
– Referred to the Subcommittee on Research and Technology on August 18, 2015
• S.779 - Fair Access to Science and Technology Research Act
– Ordered to be reported favorably, with an amendment in the nature of a substitute, on July 29, 2015
• H.R.1477 - Fair Access to Science and Technology Research Act
– Referred to the House Committee on Oversight and Government Reform on March 19, 2015
Data Management
• All of the plans either require or encourage the creation of, and adherence to, Data Management Plans.
• Plans may be required submissions as part of a grant request or may be incorporated into the partnership agreement between federal agencies and external partners.
• The DMPTool (https://dmptool.org/) is a free tool that:
– Helps draft data management plans that comply with the requirements of specific funding agencies
– Provides easy-to-follow instructions for building a data management plan
– Informs users of resources and services available to help them follow the plan they draft
• A data management plan is required at the beginning of a research effort, but it should be reviewed, revised, and followed throughout the life of a project
Data Management
• The data-management-planning process ensures access to data that supports published research. It also ensures that research data is documented, deposited, and made available for other researchers.
• Each agency plan addresses different aspects of the challenge of making research data available.
– NASA is planning to “explore the development of a research data commons, a federated system of research databases.”
– The U.S. Department of Energy is planning to provide digital object identifiers (DOIs) to datasets resulting from its funded research in order to improve the discoverability and future citation of datasets.
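As a rough illustration of why DOIs help with discoverability and citation, the sketch below turns a DOI into a resolvable link through the public https://doi.org/ resolver and formats a simple dataset citation. The `DatasetRecord` class, the citation layout, and the example DOI are all hypothetical and do not come from any agency plan.

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass
class DatasetRecord:
    """Hypothetical minimal description of a dataset that has a DOI."""
    creators: str
    year: int
    title: str
    publisher: str
    doi: str


# Prefixes people commonly paste in front of a bare DOI.
_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(raw: str) -> str:
    """Return the bare DOI (for example '10.1234/abcd') without any prefix."""
    doi = raw.strip()
    for prefix in _PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    # Every DOI starts with '10.' and has a '/' between prefix and suffix.
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI: {raw!r}")
    return doi


def doi_url(raw: str) -> str:
    """Build a link that the doi.org resolver can redirect to the dataset."""
    return "https://doi.org/" + quote(normalize_doi(raw), safe="/")


def cite(record: DatasetRecord) -> str:
    """Format a simple, illustrative dataset citation that ends with the DOI link."""
    return (
        f"{record.creators} ({record.year}). {record.title} [Data set]. "
        f"{record.publisher}. {doi_url(record.doi)}"
    )


if __name__ == "__main__":
    example = DatasetRecord(
        creators="Doe, J.",
        year=2015,
        title="Example measurements",
        publisher="Example Repository",
        doi="doi:10.5555/example.dataset",  # made-up DOI for illustration
    )
    print(cite(example))
```

Because the identifier stays the same even if the hosting repository moves the files, a citation built this way keeps pointing at the dataset.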
Open Data
• This effort is part of the Administration’s Open Government Initiative and it places an emphasis on data availability and reuse as well as the ability for people to reproduce the results of research.
• The various agency-specific plans that have been released also require the release of data sets upon publication of articles or as part of the completion of a project.
• The approach to releasing data is rarely prescribed and varies from:
– encouraging the use of “approved external repositories,”
– prescription to use existing data centers,
– use of an “interoperable data infrastructure,” or
– no specific requirement, just to use a public repository.
• The plans speak of developing “Enterprise Data Inventories” or creating data catalogs or locator services for the public to find data based on the source of funding.
Public Access vs. Public Domain
• It is important to note that in the United States, data cannot be copyrighted. The control or protection of rights for an aggregation of data is also limited.
• The copyright and control of data, datasets, and databases is very different in other parts of the world.
• Creators can, however, license data, datasets, or databases. Releasing research data under a license is one approach that could potentially meet both the funders' requirements and the interests of their research partners.
See:
Miller, Arthur R. "Copyright Protection for Computer Programs, Databases, and Computer-Generated Works: Is Anything New Since CONTU?" Harvard Law Review (1993): 977-1073.
Ginsburg, Jane C. "Copyright, Common Law, and Sui Generis Protection of Databases in the United States and Abroad." University of Cincinnati Law Review 66 (1997): 151.
Creative Commons
• Many producers grant narrow permissions to use data via a “terms of service” agreement. A lot of data sharing also occurs among researchers in an ad hoc manner.
• Copyright and similar restrictions may otherwise limit dissemination or reuse of data but data sharing can be facilitated by distribution under standard, public legal tools used to manage those terms.
• Creative Commons licenses and the CC0 (C-C-zero) public domain dedication can facilitate data sharing while maintaining specific permissions for the use of the data.
See: https://wiki.creativecommons.org/wiki/Data
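One way to make the chosen terms unambiguous is to record the license in the metadata that travels with the dataset. The sketch below is only an illustration: the `with_license` helper and the metadata field names are hypothetical, while the two URLs are the public Creative Commons pages for CC0 1.0 and CC BY 4.0.

```python
import json

# Public Creative Commons pages for two tools often used with data.
LICENSES = {
    "CC0-1.0": "https://creativecommons.org/publicdomain/zero/1.0/",
    "CC-BY-4.0": "https://creativecommons.org/licenses/by/4.0/",
}


def with_license(metadata: dict, license_id: str) -> dict:
    """Return a copy of the metadata with a license entry added.

    The field layout here is hypothetical, not a repository standard.
    """
    if license_id not in LICENSES:
        raise KeyError(f"unknown license: {license_id}")
    tagged = dict(metadata)  # leave the caller's dictionary untouched
    tagged["license"] = {"id": license_id, "url": LICENSES[license_id]}
    return tagged


if __name__ == "__main__":
    record = with_license({"title": "Example survey responses"}, "CC0-1.0")
    print(json.dumps(record, indent=2))
```

A record tagged this way tells a later user which terms apply without having to contact the original researcher.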
Impact on Researchers
• The impact of this emerging patchwork of requirements is difficult to determine today. One thing is certain: if you work with multiple agencies, you will need to navigate conflicting guidance and reporting requirements.
• Many institutions (large research Universities, Federal laboratories, consortia, etc.) are working to get in front of the curve and develop standard practices that exceed any specific requirement.
• There are a number of federal research initiatives that are partnering with funders and recipients to ensure that the requirements don’t add too much of a burden to the work of researchers:
– DataOne - https://www.dataone.org/
– Data Conservancy - https://dataconservancy.org
Options for Sharing Data
• NIH Data Sharing Repositories
– 69 different providers listed
– https://www.nlm.nih.gov/NIHbmic/nih_data_sharing_repositories.html
• Dryad Digital Repository
– Prices start at $75/submission for a non-member and go down based on volume and affiliation.
– http://datadryad.org/
Questions?
Richard Huffine (orcid.org/0000-0002-8974-2750)
https://www.linkedin.com/in/[email protected]