NPG Scientific Data Overview for GBIF - TDWG meeting Oct 2013


Upload: susanna-assunta-sansone

Post on 27-Jan-2015


Page 1: NPG Scientific Data Overview for GBIF - TDWG meeting Oct 2013

Honorary Academic Editor Susanna-Assunta Sansone, PhD (University of Oxford, UK) Managing Editor Andrew L Hufton, PhD Advisory Panel and Editorial Board including senior researchers, funders, librarians and curators

Visit nature.com/scientificdata

Email [email protected]

Tweet @ScientificData

Page 2: NPG Scientific Data Overview for GBIF - TDWG meeting Oct 2013

Supported by

Now open for submissions!

Launching May 2014

Advisory Panel

● Michael Huerta, National Institutes of Health, USA
● Mark Thorley, Natural Environment Research Council, UK
● Patricia Cruse, University of California, USA
● Susan Gregurick, Office of Biological and Environmental Research, Department of Energy, USA
● Ioannis Xenarios, Swiss Institute of Bioinformatics, Switzerland
● Chris Bowler, IBENS, France
● Mark Forster, Syngenta, UK
● Anthony Rowe, Johnson & Johnson, USA
● Stephen Chanock, National Cancer Institute, USA
● Weida Tong, National Center for Toxicological Research, FDA, USA
● Albert J. R. Heck, Utrecht University, The Netherlands
● Johanna McEntyre, EMBL-EBI, European Bioinformatics Institute, UK
● Simon Hodson, CODATA, France
● Joseph R. Ecker, Howard Hughes Medical Institute & Salk Institute, USA
● Stephen Friend, Sage Bionetworks, USA
● Jessica Tenenbaum, Duke Translational Medicine Institute, USA
● Anne-Claude Gavin, EMBL, Germany
● David Carr, Wellcome Trust, UK
● Wolfram Horstmann, University of Oxford, UK
● Piero Carninci, RIKEN Omics Science Center, Japan
● Pascale Gaudet, Swiss Institute of Bioinformatics, Switzerland
● Judith A. Blake, The Jackson Laboratory, USA
● Richard H. Scheuermann, J. Craig Venter Institute, USA
● Caroline Shamu, Harvard Medical School, USA

Susanna-Assunta Sansone

Honorary Academic Editor

Andrew L Hufton

Managing Editor

Ruth Wilson

Publisher

Page 3: NPG Scientific Data Overview for GBIF - TDWG meeting Oct 2013


Introducing a new content type:

Data Descriptor

Now open for submissions!

Launching May 2014

Page 4: NPG Scientific Data Overview for GBIF - TDWG meeting Oct 2013

Data Descriptor vs. Traditional Article

[Figure: diagram contrasting the Data Descriptor with the journal article. Labels: Synthesis, Analysis, Conclusions, Interpretation; Facts; Summary of DD.]

Questions shown in the diagram:

● What is the sample?

● What did I do to generate the data?

● Where is the data?

● How was the data processed?

● Who did what when?

● The data descriptor is only concerned with the facts behind the methodology of data generation/collection and processing

● A data descriptor can be:

– submitted prior to journal article

– submitted at the same time as the journal article

– submitted after journal article

Page 5: NPG Scientific Data Overview for GBIF - TDWG meeting Oct 2013

Prior Publication Policy

“Nature-titled journals will not consider prior Data Descriptor publications to compromise the novelty of new manuscript submissions as long as those manuscripts go substantially beyond a descriptive analysis of the data, and report important new scientific findings appropriate for the journal. This policy does not necessarily extend to subsequent journal articles whose primary purpose is to describe a new dataset or resource.”

See the full text in our Editorial Policies online

Page 6: NPG Scientific Data Overview for GBIF - TDWG meeting Oct 2013

Barriers to data sharing and reuse

● Datasets are not released

● Datasets are not reusable or discoverable

● Lack of credit for sharing data and making it reusable

Page 7: NPG Scientific Data Overview for GBIF - TDWG meeting Oct 2013

Two sample Data Descriptors now online


Page 8: NPG Scientific Data Overview for GBIF - TDWG meeting Oct 2013

Data Descriptor has 2 components

● Article, or narrative component (PDF and HTML)

● Experimental metadata, or structured component (in-house curated, machine-readable formats)


Page 9: NPG Scientific Data Overview for GBIF - TDWG meeting Oct 2013

Data Descriptor - article

Sections:

• Title

• Abstract

• Background & Summary

• Methods

• Technical Validation

• Data Records

• Usage Notes

• Figures & Tables

• References

In traditional publications this is not provided in a sufficiently detailed manner.

However, this information is essential for understanding, reusing, and reproducing datasets.
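
As a rough illustration of the section list above, a draft's headings could be checked against it before submission. This is a minimal sketch: the section names are copied from the slide, but the helper `missing_sections` and the sample draft are hypothetical, and the slide does not say whether every section is mandatory.

```python
# Section names as listed on the slide; everything else here is illustrative.
DD_SECTIONS = [
    "Title", "Abstract", "Background & Summary", "Methods",
    "Technical Validation", "Data Records", "Usage Notes",
    "Figures & Tables", "References",
]


def missing_sections(draft_headings):
    """Return the listed sections absent from a draft, in slide order."""
    present = {heading.strip().lower() for heading in draft_headings}
    return [name for name in DD_SECTIONS if name.lower() not in present]


# Hypothetical draft that still lacks several sections.
draft = ["Title", "Abstract", "Methods", "Data Records", "References"]
print(missing_sections(draft))
```

The comparison ignores case and surrounding whitespace, so a heading such as "usage notes " would still count as present.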

Page 10: NPG Scientific Data Overview for GBIF - TDWG meeting Oct 2013

Data Descriptor – experimental metadata

Submit ISA-Tab* files directly

OR

Submission tools and simple templates help authors provide the information without special tools

In-house curator standardizes the structured content

*Sansone et al., Nature Genetics, 2012
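
To give a feel for what tab-delimited experimental metadata looks like to a program, here is a minimal sketch that reads a small study-style table with Python's standard `csv` module. The table is hypothetical and heavily simplified: the column headers follow ISA-Tab naming conventions as I understand them, the values are invented, and real ISA-Tab submissions consist of separate investigation, study, and assay files with far more detail.

```python
import csv
import io

# Hypothetical, simplified study table (tab-delimited, one row per sample).
# All values are invented for illustration.
STUDY_TABLE = (
    "Source Name\tCharacteristics[organism]\tProtocol REF\tSample Name\n"
    "plot-1\tQuercus robur\tleaf collection\tsample-1\n"
    "plot-2\tQuercus robur\tleaf collection\tsample-2\n"
)


def read_study_table(text):
    """Parse tab-delimited text into a list of dicts keyed by column header."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


rows = read_study_table(STUDY_TABLE)
organisms = sorted({row["Characteristics[organism]"] for row in rows})
print(len(rows), organisms)
```

Because every value sits under a named column, a curator or a script can standardize terms (for example organism names) without reading the narrative article.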

Page 11: NPG Scientific Data Overview for GBIF - TDWG meeting Oct 2013

Discover similar datasets

Structured content allows users to link, with one click, to other datasets studying the same tissue, disease, organism, or using the same experimental platform

[Figure: several Scientific Data Data Descriptors (SciData DD), each with structured content, linked to one another by same tissue, same organism, and same assay]
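
The linking idea on this slide can be sketched as a lookup over shared metadata values. The example below is an illustration under stated assumptions, not the journal's implementation: the descriptor identifiers, the facet names (`tissue`, `organism`, `assay`), and the functions `build_index` and `similar` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical structured content for three Data Descriptors (invented values).
DESCRIPTORS = {
    "dd-1": {"tissue": "liver", "organism": "Mus musculus", "assay": "RNA-seq"},
    "dd-2": {"tissue": "liver", "organism": "Homo sapiens", "assay": "microarray"},
    "dd-3": {"tissue": "brain", "organism": "Mus musculus", "assay": "RNA-seq"},
}


def build_index(descriptors):
    """Map each (facet, value) pair to the set of descriptors that carry it."""
    index = defaultdict(set)
    for dd_id, facets in descriptors.items():
        for facet, value in facets.items():
            index[(facet, value)].add(dd_id)
    return index


def similar(dd_id, facet, descriptors, index):
    """Return the other descriptors sharing this descriptor's value for a facet."""
    value = descriptors[dd_id][facet]
    return sorted(index[(facet, value)] - {dd_id})


INDEX = build_index(DESCRIPTORS)
print(similar("dd-1", "tissue", DESCRIPTORS, INDEX))
print(similar("dd-1", "organism", DESCRIPTORS, INDEX))
```

Such a lookup only works when values are standardized, which is why consistent curation of the structured content matters: "liver" and "Liver tissue" would otherwise land in different index entries.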

Page 12: NPG Scientific Data Overview for GBIF - TDWG meeting Oct 2013

Get Credit for Sharing Your Data

Publications will be listed in the major indexes and will be citeable

Open-access

Authors select from three Creative Commons licences for the main Data Descriptor. Each publication is supported by curated CC0 metadata

Focused on Data Reuse

All the information others need to reuse the data; no interpretative analysis or hypothesis testing

Peer-reviewed

Rigorous peer-review managed by our Editorial Board of academic researchers ensures data quality and standards

Promoting Community Data Repositories

Data stored in community data repositories

Page 13: NPG Scientific Data Overview for GBIF - TDWG meeting Oct 2013

Complementary to both journal articles and data repositories

Export to various formats (ISA-Tab, RDF, etc.)

Page 14: NPG Scientific Data Overview for GBIF - TDWG meeting Oct 2013

Scientific Data and GBIF: Roadmap

Partnership between GBIF and NPG Scientific Data

The two components of the Data Descriptor (DD):

• DD article or narrative component

• DD experimental metadata or structured component (ISA-Tab format, progressively others e.g. RDF)

Roadmap items (PHASE 1 and PHASE 2):

• Mapping the DD article and GBIF Metadata Profile

• Enhancement to GBIF IPT to export the DD article

• Call for manuscript submissions

• 1st set of Data Descriptors published

• Mapping the DD experimental metadata and GBIF Metadata Profile

• Further enhancements to GBIF IPT

Timeline labels: Q4 2013, Q43 2014, Q42 2014, Q4 2013, Q4 2014

Vishwas Chavan