mark patterson - store & retrieve data anywhere · “the visualization shows a structure of...

Unlocking references from the literature: The Initiative for Open Citations Dario Taraborelli • Mark Patterson FORCE 2017 • Berlin, 27 October 2017

Unlocking references from the literature: The Initiative for Open Citations

Dario Taraborelli • Mark Patterson • FORCE 2017 • Berlin, 27 October 2017

DEAN MORLEY [CC BY ND] • flickr.com/photos/33465428@N02/4490667565

BLUESTAR FRUIT RIPENING GAS • indiamart.com/cold-room-engineers/

The Initiative for Open Citations (I4OC)

The Initiative for Open Citations (I4OC)

How it came together

http://blogs.plos.org/biologue/2017/04/07/setting-your-cites-on-open/

How it came together

The starting point

Most publishers already deposit their reference data with Crossref

The default state for the data is closed

The challenge

Could we persuade a group of influential publishers to release their data all at once?

Making the case

It’s easy and doesn’t cost anything

All you need to do is to send an email to [email protected]

The goal cannot be achieved alone

A comprehensive network of all scholarship can only be achieved if data is pooled

Publishers also benefit

Better discovery tools mean that content will be found and used more

Making it happen

Focus on publishers depositing the most data

Contacted the top 20 publishers asking for agreement in principle and permission to share their decision

Agree a deadline

Everyone has time to prepare their comms and to be part of a big splash

Leverage the early adopters

As soon as we had a few publishers on board, others quickly followed

Progress so far

Progress

Progress

Stakeholders

A coalition of major funders, scholarly platforms, open data organizations and publishers supporting the unrestricted availability of scholarly citation data.

STAKEHOLDERS OF THE INITIATIVE FOR OPEN CITATIONS • https://i4oc.org/#stakeholders

Data reuse

The Wikidata Citation Graph

36 million citation links using the cites (P2860) property in Wikidata

PARTIAL CITATION GRAPH FOR ULRICH K. LAEMMLI (1970) • http://tinyurl.com/y7acpqzd
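A citation graph like the one on this slide can be explored by asking Wikidata which items point at a given work through P2860. The following is only a minimal sketch: the function name `citing_works_query` and the QID `Q1234567` are illustrative placeholders rather than anything from the talk, and the query relies on the `wd:`, `wdt:`, and label-service prefixes that the Wikidata Query Service predefines.

```python
import re


def citing_works_query(qid: str, limit: int = 10) -> str:
    """Build a SPARQL query listing works that cite the given Wikidata item.

    `qid` is the Wikidata identifier of the cited work (for example a paper).
    """
    # Reject anything that is not a plain QID so it cannot alter the query.
    if not re.fullmatch(r"Q[1-9]\d*", qid):
        raise ValueError(f"not a Wikidata QID: {qid!r}")
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return (
        "SELECT ?citing ?citingLabel WHERE {\n"
        # P2860 is the "cites" property named on the slide.
        f"  ?citing wdt:P2860 wd:{qid} .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }\n'
        "}\n"
        f"LIMIT {limit}"
    )


# Q1234567 is a placeholder, not the identifier of any paper in the talk.
print(citing_works_query("Q1234567", limit=5))
```

The resulting string can be pasted into the Wikidata Query Service to list up to five citing works for whichever item is substituted for the placeholder.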

Data reuse

Tools to create profiles

Scholia uses data from Wikidata

PROFILE INFORMATION FOR EGON WILLIGHAGEN • https://tools.wmflabs.org/scholia/author/Q20895241

Data reuse

The Open Citations Corpus

A broad and open collection of citation information from many sources

David Shotton and Silvio Peroni

THE OPEN CITATIONS CORPUS • http://opencitations.net/corpus

Data reuse

VISUALIZING FREELY AVAILABLE CITATION DATA USING VOSVIEWER • https://www.cwts.nl/blog?article=n-r2r294

The road ahead

Lessons learned

A single, measurable goal

Low cost

Agnostic to business model

Amplification

Towards a (truly) open scholarly graph

“The visualization shows a structure of science that is well known from earlier large-scale bibliometric visualizations, which were based on Web of Science or Scopus data.”

VISUALIZING FREELY AVAILABLE CITATION DATA USING VOSVIEWER • https://www.cwts.nl/blog?article=n-r2r294

Who will benefit from this

OPENING UP RESEARCH CITATIONS: A Q&A WITH DARIO TARABORELLI • http://bit.ly/2hfnC3b

Challenges: coverage

41% of Crossref records have reference data

47% of those have open reference data

Acknowledgment: Daniel Ecer, Data Scientist, eLife. See https://elifesci.org/crossref-data-notebook
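The two coverage figures compound: the 47% applies only to the 41% of records that carry reference data at all. A minimal worked example follows; the constant and function names are illustrative, and the only inputs are the two percentages from the slide.

```python
# Share of Crossref records that have reference data (from the slide).
SHARE_WITH_REFERENCES = 0.41
# Share of those records whose reference data is open (from the slide).
SHARE_OPEN_GIVEN_REFERENCES = 0.47


def open_reference_share(with_refs: float, open_given_refs: float) -> float:
    """Fraction of all records that have open reference data."""
    return with_refs * open_given_refs


overall = open_reference_share(SHARE_WITH_REFERENCES, SHARE_OPEN_GIVEN_REFERENCES)
print(f"{overall:.1%} of all Crossref records have open reference data")
```

On these figures, roughly one Crossref record in five had open reference data at the time of the talk.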

Challenges: data quality

Over 1 billion references

49% are open

53% have DOIs (and can be linked to another record)

Some cleanup required

Acknowledgment: Daniel Ecer, Data Scientist, eLife. See https://elifesci.org/crossref-data-notebook
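To illustrate what "have DOIs" and "some cleanup required" can mean in practice, here is a hedged sketch that measures the share of linkable references in a list. It assumes each reference is a dictionary with an optional `DOI` key, which resembles the reference entries in Crossref metadata but is treated here as an assumption. The helper names, the cleanup rules, and the sample records are invented for illustration and are not the method used in the eLife notebook.

```python
from typing import Iterable, Optional

# Common ways a DOI is written as a link or label (illustrative list).
DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(raw: Optional[str]) -> Optional[str]:
    """Return a bare lowercase DOI, or None if the value is not usable."""
    if not raw:
        return None
    doi = raw.strip().lower()
    for prefix in DOI_PREFIXES:
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
            break
    # Every DOI starts with the directory indicator "10.".
    return doi if doi.startswith("10.") else None


def linkable_share(references: Iterable[dict]) -> float:
    """Fraction of references carrying a usable DOI."""
    refs = list(references)
    if not refs:
        return 0.0
    linked = sum(1 for ref in refs if normalize_doi(ref.get("DOI")))
    return linked / len(refs)


# Made-up sample records; the DOIs are placeholders, not real citations.
sample = [
    {"key": "ref1", "DOI": "10.1000/EXAMPLE.1"},
    {"key": "ref2", "unstructured": "A reference with no DOI."},
    {"key": "ref3", "DOI": " https://doi.org/10.1000/example.2 "},
    {"key": "ref4", "DOI": "not-a-doi"},
]
print(f"{linkable_share(sample):.0%} of the sample references can be linked")
```

In this sample, two of the four references survive normalization, so only half of them could be connected to another record.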

The road to 100%

The road to 100%

Major publishers among the top 20 DOI depositors not distributing open references (as of October 2017)

Elsevier

IEEE

Wolters Kluwer Health

IOP Publishing

Oxford University Press

American Chemical Society

The road to 100%

CROSSREF MEMBERS WITH OPEN REFERENCES • https://www.crossref.org/reports/members-with-open-references/

A list of all Crossref members with open references and statistics on their open reference coverage

Getting involved

https://twitter.com/i4oc_org/status/894934190625402880

Thank you

D. Taraborelli, M. Patterson (2017) Unlocking references from the literature: The Initiative for Open Citations. FORCE 2017 [CC BY 4.0]

Acknowledgments

The I4OC founders: OpenCitations, Wikimedia Foundation, PLOS, eLife, DataCite, the Center for Culture and Technology at Curtin University.

The I4OC instigators: Jonathan Dugan, Martin Fenner, Jan Gerlach, Catriona MacCallum, Daniel Mietchen, Cameron Neylon, Michelle Paulson, Silvio Peroni, David Shotton.

The I4OC stakeholders (i4oc.org/#stakeholders) and participating publishers (i4oc.org/#publishers)