brown bag talk with micah altman integrating open data into open access journals
TRANSCRIPT
Integrating Open Data into Open Access Journals
Micah Altman
Director of Research
MIT Libraries
Prepared for
Program on Information Science Brown Bag Series
MIT
October 2015
Roadmap
Motivation
• Reproducibility
Intervention
• Integrating journal and data publication workflow
Future
• Changing policies & uses
Credits & Disclaimers
DISCLAIMER
These opinions are my own; they are not the opinions of MIT, Brookings, any of the project funders, nor (with the exception of co-authored previously published work) my collaborators.
Secondary disclaimer:
“It’s tough to make predictions, especially about the future!”
-- Attributed to Woody Allen, Yogi Berra, Niels Bohr, Vint Cerf, Winston Churchill, Confucius, Disreali [sic], Freeman Dyson, Cecil B. Demille, Albert Einstein, Enrico Fermi, Edgar R. Fiedler, Bob Fourer, Sam Goldwyn, Allan Lamport, Groucho Marx,
Dan Quayle, George Bernard Shaw, Casey Stengel, Will Rogers, M. Taub, Mark Twain, Kerr L. White, etc.
Collaborators & Co-Conspirators
IQSS, Harvard: Eleni Castro, Mercè Crosas, Phil Durbin
PKP Project: Alex Garnett, Jenn Whitney
Research Support
Supported by the Sloan Foundation
Related Work
Project Website
projects.iq.harvard.edu/ojs-dvn
Related publications (reprints available from: informatics.mit.edu):
• Altman M, Castro E, Crosas M, Durbin P, Garnett A, Whitney J. Open Journal Systems and Dataverse Integration -- Helping Journals to Upgrade Data Publication for Reusable Research. Code4Lib Journal. Forthcoming.
• Altman M, Avery M. Information wants someone else to pay for it: laws of information economics and scholarly publishing. Information Services and Use. 2015;35(1-2):57-70.
• Altman M, Borgman C, Crosas M, Martone M. An Introduction to the Joint Principles for Data Citation. Bulletin of the Association for Information Science and Technology [Internet]. 2015;41(3):43-44.
• Brand A, Allen L, Altman M, Hlava M, Scott J. Beyond authorship: attribution, contribution, collaboration, and credit. Learned Publishing. 2015;28(2):151-155.
Concerns for Reliable Science
New Initiatives to Improve Scientific Reliability
• Retraction monitoring
• Data citation
• Clinical trial preregistration
• Registered replication
• Open data
• Badges
What can go wrong?
Misconduct & Lies
Irreproducible Results
Many journals have no replication policy
Even in journals with clear policy, success rate is low
The File Drawer Problem
Daniel Shechtman's Lab Notebook Providing Initial Evidence of Quasicrystals
• Null results are less likely to be published, so published results as a whole are biased toward positive findings
• Outliers are routinely discarded, so unexpected patterns of evidence across studies remain hidden
Potential Interventions
Trustworthy Science
• Access to Scholarly Record
• Reproducible processes
• Attribution & Provenance
• Management and Governance of the Evidence Base
• Measurement and Evaluation
The Project
Citation to Data
Citation to Article
• Technical Integration
• Socio-Technical Intervention
Integrating Open Journals and Data Publishing
Who? Address the needs of journals, publishers, and editors.
What? Enable journals to seamlessly manage the submission, review, citation, and publication of data associated with published articles.
How? Integrate existing technologies and workflows; promote adoption through outreach and involvement.
Why? Increase replicability of science, facilitate peer review of data, promote long-term access to the scientific evidence base.
Introduction to Dataverse
Provides incentives for researchers to share:
• Recognition & credit via data citations
• Control over data & branding
• Fulfill journal data availability and funder requirements
Software framework for publishing, citing and preserving research data (open source on GitHub for others to install)
1290 Dataverses
Harvard Dataverse (open to all; repository instance at Harvard) has:
• 59,346 Datasets
• 248,351 Files
• > 1 Million Downloads
Open Journal System (OJS)
Open source journal management and publishing system created by the Public Knowledge Project (PKP) to expand & improve access to research.
About OJS
OJS Software
• Hosts almost 10,000 active journals
• Used on all continents
• Particularly popular in developing countries
OJS Model
• Open Access Publication
• Open Software
Services
• Journal hosting
• Crossref integration
• PLOS Article Level Metrics
• LOCKSS integration
OJS is part of a suite of products for automating workflow:
• Journal Publishing
• Monograph Publishing
• Conference Hosting
Integrating Workflow Across the Lifecycle
Actor Roles
• The Author submits their article and research data to the journal's OJS article submission system. (Note that the article and data do not have to be submitted at the same time. Authors can also submit data at a later time, or they can just provide a persistent link with a data citation pointing to the repository that their data is currently in.)
• Editors and/or Peer Reviewers review the article and data.
• If the article and corresponding research data are approved for publication, the Author's research data and its corresponding metadata are automatically deposited from OJS into the Dataverse through the API. No redundant information need be entered. A permanent identifier (DOI) will be automatically included that allows the data to be cited and tracked. There will be a data citation included in the journal article page in OJS (and ideally within the Reference section of the article), enabling readers of the article to quickly access the data.
• The Dataverse stores the dataset metadata and files (including raw data, documentation, code, etc.). There will also be a permanent publication citation link within the Dataverse for researchers to access the article in OJS that corresponds to this research data.
Developing a Data Submission & Review Workflow
OJS Plugin Architecture
• Plugins extend back-end functionality and user interface
• Data publication now part of OJS distribution
• Can target any Dataverse repository
Author Submission
• Extends supplementary file submission
• Can provide extended metadata
• Can provide data citation
Data Publication
• Data published through Dataverse
• Provides on-line exploration, reformatting, etc.
• Linked through citations, DOIs and author IDs
• Data updates managed in repository
Integration Through SWORD
Full SWORD 2 deposit interface
The core supported functions:
• Retrieve SWORD service document
• Create a dataset with an Atom entry, using the Dublin Core Terms (DC Terms) qualified mapping (Dataverse DB element crosswalk)
• List datasets in a dataverse
• Add files to a dataset with a zip file
• Display a dataset Atom entry
• Display a dataset statement
• Delete a file by database id
• Replace metadata for a dataset
• Delete a dataset
• Determine if a dataverse has been published
• Publish a dataverse
• Publish a dataset
Complementary Dataverse APIs
• Search
• Data Access
• Data Analysis
• Native
• Harvesting
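As an illustration of the "create a dataset" function listed above, here is a minimal sketch of preparing an Atom entry that carries DC Terms metadata. The helper name write_atom_entry, the temp-file path, and all metadata values are hypothetical placeholders; which DC Terms fields a given Dataverse installation requires is not covered in the talk. The commented-out deposit call assumes the usual SWORD v2 convention of POSTing the entry to the collection URL that appears elsewhere in this deck.

```shell
#!/bin/sh
# Hypothetical helper: write a minimal Atom entry with DC Terms metadata.
# Values are inserted verbatim and are NOT XML-escaped; a real script would
# need to escape characters such as & and < before writing them.
write_atom_entry() {
  # $1 = output path, $2 = title, $3 = creator, $4 = description
  cat > "$1" <<EOF
<?xml version="1.0"?>
<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:dcterms="http://purl.org/dc/terms/">
  <dcterms:title>$2</dcterms:title>
  <dcterms:creator>$3</dcterms:creator>
  <dcterms:description>$4</dcterms:description>
</entry>
EOF
}

# Placeholder metadata for illustration only.
ENTRY="${TMPDIR:-/tmp}/atom-entry-example.xml"
write_atom_entry "$ENTRY" \
  "Replication data for: Example Article" \
  "Doe, Jane" \
  "Placeholder description"
cat "$ENTRY"

# Deposit step (assumed; needs a live server and an API token, so it is
# left commented out):
# curl -u $API_TOKEN: --data-binary "@$ENTRY" \
#   -H "Content-Type: application/atom+xml" \
#   https://$HOSTNAME/dvn/api/data-deposit/v1.1/swordv2/collection/dataverse/$DATAVERSE_ALIAS
```

In the integrated workflow this metadata would come from the OJS submission record rather than being typed by hand, which is how the "no redundant information need be entered" goal is met.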
REST-ful-ness
List datasets in a dataverse
curl -u $API_TOKEN: https://$HOSTNAME/dvn/api/data-deposit/v1.1/swordv2/collection/dataverse/$DATAVERSE_ALIAS
Add files to a dataset with a zip file
curl -u $API_TOKEN: --data-binary @path/to/example.zip -H "Content-Disposition: filename=example.zip" -H "Content-Type: application/zip" -H "Packaging: http://purl.org/net/sword/package/SimpleZip" https://$HOSTNAME/dvn/api/data-deposit/v1.1/swordv2/edit-media/study/doi:TEST/12345
Display a dataset atom entry
curl -u $API_TOKEN: https://$HOSTNAME/dvn/api/data-deposit/v1.1/swordv2/edit/study/doi:TEST/12345
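The three commands above share one base path and differ only in the resource segment (collection, edit-media, edit). A small sketch of that URL structure follows; the function names and the host dataverse.example.org are hypothetical, and the script only prints URLs, so it makes no network calls.

```shell
#!/bin/sh
# Base path common to the three commands on this slide.
SWORD_BASE_PATH="dvn/api/data-deposit/v1.1/swordv2"

# URL for listing the datasets in a dataverse.
# $1 = host name, $2 = dataverse alias
collection_url() {
  printf 'https://%s/%s/collection/dataverse/%s\n' "$1" "$SWORD_BASE_PATH" "$2"
}

# URL for adding files (as a zip) to a dataset.
# $1 = host name, $2 = persistent identifier, e.g. doi:TEST/12345
edit_media_url() {
  printf 'https://%s/%s/edit-media/study/%s\n' "$1" "$SWORD_BASE_PATH" "$2"
}

# URL for displaying a dataset's Atom entry.
# $1 = host name, $2 = persistent identifier
edit_url() {
  printf 'https://%s/%s/edit/study/%s\n' "$1" "$SWORD_BASE_PATH" "$2"
}

# Print the three URLs for a placeholder host, alias, and identifier.
collection_url dataverse.example.org myjournal
edit_media_url dataverse.example.org doi:TEST/12345
edit_url dataverse.example.org doi:TEST/12345
```

With a real host and token, the listing call would then read: curl -u $API_TOKEN: "$(collection_url "$HOSTNAME" "$DATAVERSE_ALIAS")".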
Results
• First complete journal + data publishing workflow
• Successful integration of two major OSS systems
• Leverages a standard, open protocol; plugin architecture
• Released and supported by existing development communities
The Future
Policy Changes Since we Started
Preregistration Requirements
Data Sharing Requirements
Open Access Requirements
Data Citation Requirements
Evaluating Current Open Journal Policies
• Self-selected sample of OJS publishers
• Random samples of OJS journals and DOAJ journals
• Coding of data sharing policy by strength
Substantial Interest
(In a self-selected sample) >200 OJS Journals
• 95% -- Data citation is important
• 75% -- Data sharing is important
• 72% -- Replicability is important
Limited Adoption of Data Policies in Open Access Journals
Comparison to Other Fields
Future Integrations
PKP
• Push into Archivematica (experimental)
• Deposit into other SWORD endpoints
Dataverse
• Accept deposits from OSF
• Accept deposits from other SWORD suppliers
Additional References
● Crosas M. "A Data Sharing Story." Journal of eScience Librarianship. 1(3), 173-179. 2013.
● Crosas, M. "The Dataverse Network™: An Open-Source Application for Sharing, Discovering and Preserving Data," D-lib Magazine 17(1/2). 2011.
● Willinsky, J. "Open Journal Systems: An example of open source software for journal management and publishing." Library Hi-Tech 23 (4), 504-519. 2005.
Questions?
E-mail: [email protected]
Web: informatics.mit.edu
Creative Commons License
This work. Managing Confidential information in research, by Micah Altman (http://redistricting.info) is licensed under the Creative Commons Attribution-Share Alike 3.0 United States License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/us/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.