
PUBLISH YOUR DATA

Nature Publishing Group's new initiatives to promote and credit open data sharing

Andrew L. Hufton

Managing Editor, Scientific Data

Nature Publishing Group

andrew.hufton@nature.com

HKU, September 18th, 2014

Since 1869 Nature’s mission has been:

To communicate the world’s best and most important science to scientists across the world and to the wider community interested in science.

Nature Publishing Group shares a similar mission:

Nature

• Publishes the most significant advances with the widest implications.

• Significance should be readily apparent to anyone from any field.

Nature Research Journals

• Publish the most significant advances across the discipline each covers.

• Significance should be apparent to anyone in that discipline.

Traditional Nature journals

NPG’s other in-house journals

High impact.

Targeted at specialists.

High quality.

Rapid publication.

Helping you publish, discover and reuse research data

• An introduction to the editorial process

• Open access at Nature Publishing Group

• Publish your data: helping scientists publish important datasets, and ensuring they get credit for sharing.

• Other initiatives to promote and credit open, reproducible research

The Editorial Process

From Submission to Publication

At the Nature-titles

Submitted Manuscript

Editor

Reject without review (50-80%)

Referees

Evaluations

Decision: Reject, Accept, or Revise and reconsider

From Submission to Publication

At the Scientific-titles

Submitted Manuscript

Editorial Board Member

Reject without review (Out of Scope only)

Referees

Evaluations

Decision: Reject, Accept, or Revise and reconsider

Managing Editor

Open access & Nature Publishing Group

What is Open Access?

• Part of a global trend to encourage wider and easier access to research ideas and information.

• Wider dissemination of scientific knowledge speeds the pace of scientific progress.

• The Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities (22nd October 2003) defined Open Access as the ability of others to “copy, use, distribute, transmit and display the work publicly and to make and distribute derivative works, in any digital medium for any responsible purpose, subject to proper attribution of authorship.”

Growth of gold Open Access by region

Laakso and Björk BMC Medicine 2012 10:124 doi:10.1186/1741-7015-10-124

(Reproduced under CC-BY http://creativecommons.org/licenses/by/2.0)

Why publish Open Access?

Because it’s better for science.

• Scientific knowledge belongs to everyone.

• Science progresses more rapidly when new ideas, new results and new understanding are shared most freely.

• Public understanding of science is improved by public access to primary research.

Why publish Open Access?

Because HKU wants you to!

Prof Tsui, Vice Chancellor of HKU, signed the Berlin Declaration in November 2009.

From the HKU OA policy: “By sharing our intellectual output, we and our community can realize greater benefits economically, socially, and intellectually.”

Why publish Open Access?

Because it’s better for YOU!

“We found strong evidence that, even in a journal that is widely available in research libraries, OA articles are more immediately recognized and cited by peers than non-OA articles published in the same journal.”

Within the same journal, open access papers are noticed sooner by your peers and so are cited sooner.

Eysenbach, G. Citation Advantage of Open Access Articles. PLoS Biology 4, e157 (2006). http://dx.doi.org/10.1371/journal.pbio.0040157

Green versus Gold

Green Open Access

• Free but usually with restrictions.

• arXiv.org, PubMed Central, …

• Allowed by all Nature journals under the following terms:

Submitted version can be posted at any time.

Final refereed version can be posted 6 months after publication.

Published version (that is, the published PDF) should never be posted.

Green versus Gold

Gold Open Access

• Fewer restrictions:

Full rights to do whatever YOU want to do with the final paper.

Limited rights for others (depending on the license) to do what THEY want with the paper.

• Author pays Article Processing Charge.

• Free for all to download in perpetuity.

Nature Communications

● Publishes significant advances that have the potential to influence the thinking of specialists in a field.

● Broad appeal isn’t a prerequisite for publication… but great science is!

● 2013 Impact Factor = 10.742.

● Choice of subscription access or Open Access!

● Specialist scope means the chances of being published are more than twice those at other Nature journals.

Scientific Reports

● Impact Factor: 5.078.

● Speed: Scientific Reports is committed to providing rapid publication service.

● Acceptance rate: Over 60%.

● Scope: Publishes technically sound, original research papers in all areas of the natural and clinical sciences.

● International Editorial Board: 1600 experts across all disciplines.

● Visibility: Over 800,000 article page views per month.

http://www.nature.com/srep

Open Access means anybody can download, read, and cite your paper

Publish your data

In 1953 a scientific work could change the world with…

• One page of text.

• Two authors.

• One figure.

• No raw data.

Watson, J. D. & Crick, F. H. C. Molecular structure of nucleic acids. Nature 171, 737-738 (1953).

… but in the 21st century scientific discovery is more about data and collaboration.

In 2012 the Encyclopedia of DNA Elements (ENCODE) generated

• Thirty papers.

• Across three different journals.

• From thirty-two different research institutes.

The Data Deluge

Photo by Shalom Jacobovitz, via Wikipedia

Data, data, data

Depositions of datasets in archives continue to grow, surpassing journal articles in biomedical research

Growth of biomedical research publications (red; current total >19 million), alongside the accumulation of research data, including nucleic acid sequences (black; current total ~163 million), computer-annotated protein sequences (magenta; current total 9 million), manually annotated protein sequences (green; current total 500,000) and protein structures (blue; current total 60,000).

Source: Biochemical Journal 2009 424, 317-333 - Teresa K. Attwood, Douglas B. Kell and others.

The Data Journal concept

• Data must be well described before others can use it and benefit from it.

• Scientists who share data in a reusable manner deserve credit through citable publications.

• Several journals now offer “data paper” article-types, including GigaScience, F1000Research, Earth System Science Data, and Biodiversity Data Journal.

Now launched!

Visit nature.com/scientificdata

Email scientificdata@nature.com

Tweet @ScientificData

Honorary Academic Editor: Susanna-Assunta Sansone

Advisory Panel and Editorial Board including senior researchers, funders, librarians and curators

Helping you publish, discover and reuse research data

Now Live!

Get Credit for Sharing Your Data

Publications will be indexed and citable.

Open-access

Authors select from three Creative Commons licenses for the main Data Descriptor. Each publication is supported by CC0 metadata.

Focused on Data Reuse

All the information others need to reuse the data; no interpretative analysis or hypothesis testing.

Peer-reviewed

Rigorous peer-review focused on technical data quality and reuse value.

Promoting Community Data Repositories

Not a new data repository; data are stored in community data repositories.

Data Descriptor: relation with traditional articles

Traditional article: Synthesis, Analysis, Conclusions

Data Descriptor: What did I do to generate the data? How was the data processed? Where is the data? Who did what, and when?

Data Descriptors contain methods and technical analyses supporting the quality of the measurements. They do not contain tests of new scientific hypotheses.

When should you submit a manuscript to Scientific Data?

• Alongside your article at a Nature journal.

• To describe standalone datasets that don’t fit in your other publications.

• To release data used in your previous research articles.

Focus on RNA sequencing quality control (SEQC)

In the September issue of Nature Biotechnology

• A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the Sequencing Quality Control Consortium. SEQC/MAQC-III Consortium | doi:10.1038/nbt.2957

• The concordance between RNA-seq and microarray data depends on chemical treatment and transcript abundance. Wang et al. | doi:10.1038/nbt.3001

• Cross-platform ultradeep transcriptomic profiling of human reference RNA samples by RNA-Seq. Xu et al. | doi:10.1038/sdata.2014.20

• Transcriptomic profiling of rat liver samples in a comprehensive study design by RNA-Seq. Gong et al. | doi:10.1038/sdata.2014.21
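The DOIs listed above can be turned into resolvable links. The following is a minimal sketch, assuming the public resolver at https://doi.org/ and treating the `10.1038/sdata.` prefix as a marker for Scientific Data publications. That prefix rule is only an inference from the two DOIs shown here, and the function names are illustrative.

```python
from urllib.parse import quote

# DOIs exactly as listed on the slide.
SEQC_DOIS = [
    "10.1038/nbt.2957",
    "10.1038/nbt.3001",
    "10.1038/sdata.2014.20",
    "10.1038/sdata.2014.21",
]


def doi_url(doi: str) -> str:
    """Build a resolver link for a DOI (assumes the doi.org resolver)."""
    return "https://doi.org/" + quote(doi, safe="/")


def is_data_descriptor(doi: str) -> bool:
    """Heuristic only: assumes Data Descriptor DOIs start with 10.1038/sdata."""
    return doi.startswith("10.1038/sdata.")


for doi in SEQC_DOIS:
    kind = "Data Descriptor" if is_data_descriptor(doi) else "research article"
    print(f"{kind}: {doi_url(doi)}")
```

Keeping the article and Data Descriptor DOIs in one list mirrors how the slide presents them as a single linked collection.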

Stem Cells

• Associated Nature Article

• Data at figshare & NCBI GEO

• Integrated figshare data viewer

Neuroscience

• New Dataset

• Data in OpenfMRI

• Source code in GitHub

• Big Data

Environmental

• New Dataset

• Data in figshare

• Code in figshare

• Integrated figshare data viewer

• Cited in Science

Linking between research papers, Data Descriptors, and data records

Making data discoverable

The Data Descriptor article-type

Data Descriptor

• Experimental metadata or structured component (in-house curated, machine-readable formats)

• Article or narrative component (PDF and HTML)

Focus on data reuse

Sections:

• Title

• Abstract

• Background & Summary

• Methods

• Technical Validation

• Data Records

• Usage Notes

• Figures & Tables

• References

• Data Citations
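One way to use the section list above is as a pre-submission checklist. The following is a minimal sketch, assuming the manuscript headings have already been extracted into a list of strings and that every listed section is expected. `missing_sections` and the sample draft are hypothetical and not part of any Scientific Data tooling.

```python
# Section names as listed on the slide.
REQUIRED_SECTIONS = [
    "Title", "Abstract", "Background & Summary", "Methods",
    "Technical Validation", "Data Records", "Usage Notes",
    "Figures & Tables", "References", "Data Citations",
]


def missing_sections(headings):
    """Return listed sections absent from `headings` (case-insensitive match)."""
    present = {h.strip().lower() for h in headings}
    return [s for s in REQUIRED_SECTIONS if s.lower() not in present]


# Hypothetical draft that is still missing several sections.
draft = ["Title", "Abstract", "Background & Summary", "Methods",
         "Data Records", "References"]
print(missing_sections(draft))
```

The result preserves the order of the slide's list, so the gaps read in the same order an author would fill them.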

Data Descriptor

Detailed descriptions of the methods and technical analyses supporting the quality of the measurements.

Does not contain tests of new scientific hypotheses.

Joint Declaration of Data Citation Principles by the Data Citation Synthesis Group, including CODATA, Research Data Alliance, and Force11

In-house curation team:

• assists users to submit the structured content via simple templates and an internal authoring tool

• performs value-added semantic annotation of the experimental metadata

For advanced users/service providers willing to export ISA-Tab for direct submission, we have released a technical specification:

[Figure: Data Descriptor structured metadata (CC0), with elements labelled analysis, method, script, and data file or record in a database]

The right licence for the right content

Data: the primary datasets will reside in public repositories. Partnering with figshare and Dryad, which both use the CC0 waiver.

Metadata: released under the CC0 waiver to maximize reuse and aid data miners.

Data Descriptor article: Licensed under one of three Creative Commons licenses, by author choice:

Editorial process & policies

Editorial Board

Active scientists oversee peer-review

Peer-review assesses

• The completeness of the description

• Alignment with community standards

• Data deposition in an appropriate repository

• Technical quality of the measurements

• Reuse value

Clear data sharing policies

• Data must be deposited to an approved data repository before manuscript submission, prior to peer-review.

• If datasets are private, they must be made accessible to editors and referees in a secure and confidential manner.

• Authors must agree to release data to the public, without undue restrictions, at the time of publication.

• Reasonable controls are allowed for datasets with human privacy restrictions.

Our recommended repositories

• We currently recognize over 60 public data repositories.

• We have integrated systems with both figshare and Dryad.

• Earth sciences repositories include: Pangaea, ORNL DAAC, NERC Data Centres, and more.

• We work with you to find the best place to archive your data.

Other initiatives to promote open science at the Nature Publishing Group

Promoting Reproducible Science

• Strong data deposition requirements in fields with well-established repositories, across all Nature-titles.

• New life sciences reproducibility checklist helps ensure that key information is included in each manuscript.

• Collaboration between the Nature-titles and Scientific Data to promote wider data sharing.

Data Citations

Formally link Data Descriptor to external data records

Joint Declaration of Data Citation Principles by the Data Citation Synthesis Group, including CODATA, Research Data Alliance, and Force11

NPG supports ORCIDs

ORCIDs are open, unique personal identifiers for researchers.

ORCIDs help you get credit for your scientific output

• A persistent digital identifier that distinguishes you from every other researcher

• Register for your own at www.orcid.org

• Include your ORCID when you are an author on a manuscript, in grants, and on your website
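ORCID iDs end in a check character, so a mistyped iD can be caught before it goes into a manuscript or grant. The following is a minimal sketch of the ISO 7064 MOD 11-2 check that ORCID describes for its identifiers. The function names are illustrative, and the sample iD is the fictitious test researcher used in ORCID's own documentation, not a real author.

```python
def orcid_check_char(base15: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base15:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check length, digits and check character of a hyphenated ORCID iD."""
    compact = orcid.replace("-", "").upper()
    base = compact[:15]
    if len(compact) != 16 or not (base.isascii() and base.isdigit()):
        return False
    return orcid_check_char(base) == compact[15]


print(is_valid_orcid("0000-0002-1825-0097"))  # fictitious sample iD
print(is_valid_orcid("0000-0002-1825-0098"))  # same iD with a wrong last digit
```

This only verifies that the iD is well formed; it does not confirm that the iD is registered or belongs to a particular person.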

Concluding points

• Make the most out of your research

• Share your data, get the credit you deserve

• Register for an ORCID today

Now launched!

Visit nature.com/scientificdata

Email scientificdata@nature.com

Tweet @ScientificData

Managing Editor, Scientific Data: Andrew L. Hufton, andrew.hufton@nature.com

Honorary Academic Editor: Susanna-Assunta Sansone

Advisory Panel and Editorial Board including senior researchers, funders, librarians and curators

Thanks!
