PUBLISH YOUR DATA Nature Publishing Group's new initiatives to promote
and credit open data sharing
Andrew L. Hufton Managing Editor, Scientific Data Nature Publishing Group [email protected]
HKU, September 18th, 2014
Since 1869 Nature’s mission
has been:
To communicate the world’s
best and most important
science to scientists across
the world and to the wider
community interested in
science.
Nature Publishing Group
shares a similar mission:
Nature
• Publishes the most significant advances
with the widest implications.
• Significance should be readily apparent
to anyone from any field.
Nature Research Journals
• Publishes the most significant
advances across the discipline
each covers.
• Significance should be apparent
to anyone in that discipline.
Traditional Nature journals
NPG’s other in-house journals
High impact.
Targeted at specialists.
High quality.
Rapid publication.
Helping you publish,
discover and reuse
research data
• An introduction to the editorial process
• Open access at Nature Publishing Group
• Publish your data: helping scientists publish important datasets, and ensuring they get credit for sharing
• Other initiatives to promote and credit open, reproducible research
The Editorial Process
From Submission to Publication
At the Nature-titles
Submitted Manuscript
Editor
Reject without review (50-80%)
Referees
Evaluations
Decision: Reject / Accept / Revise and reconsider
From Submission to Publication
At the Scientific-titles
Submitted Manuscript
Editorial Board Member
Reject without review (Out of Scope only)
Referees
Evaluations
Decision: Reject / Accept / Revise and reconsider
Managing Editor
Open access & Nature Publishing Group
What is Open Access?
● Part of a global trend to encourage wider and easier access to
research ideas and information.
• Wider dissemination of scientific knowledge speeds the pace
of scientific progress.
• The Berlin Declaration on Open Access to Knowledge in the
Sciences and Humanities (22nd October 2003) defined Open
Access as the ability of others to
“copy, use, distribute, transmit and display the work
publicly and to make and distribute derivative works,
in any digital medium for any responsible purpose,
subject to proper attribution of authorship.”
Growth of gold Open Access by region
Laakso and Björk BMC Medicine 2012 10:124 doi:10.1186/1741-7015-10-124
(Reproduced under CC-BY http://creativecommons.org/licenses/by/2.0)
Why publish Open Access?
Because it’s better for science.
• Scientific knowledge belongs to everyone.
• Science progresses more rapidly when new
ideas, new results and new understanding
are shared most freely.
• Public understanding of science is improved
by public access to primary research.
Why publish Open Access?
Because HKU wants you to!
Prof Tsui, Vice Chancellor of HKU,
signed the Berlin Declaration in
November 2009
From the HKU OA policy: “By
sharing our intellectual output, we and
our community can realize greater
benefits economically, socially, and
intellectually.”
Why publish Open Access?
Because it’s better for YOU!
“We found strong evidence that, even in a journal that is widely available in research libraries, OA articles are more immediately recognized and cited by peers than non-OA
articles published in the same journal.”
Within the same journal, open access papers are noticed sooner by your peers and are therefore cited sooner.
Eysenbach, G. Citation Advantage of Open Access Articles.
PLoS Biology 4, e157 (2006).
http://dx.doi.org/10.1371/journal.pbio.0040157
Green versus Gold
Green Open Access
• Free but usually with restrictions.
• arXiv.org, PubMed Central, …
• Allowed by all Nature journals under the
following terms:
Submitted version can be posted at any time.
Final refereed version can be posted 6 months
after publication.
Published version (that is, the published PDF)
should never be posted.
Green versus Gold
Gold Open Access
• Fewer restrictions:
Full rights to do whatever YOU want to do with
the final paper.
Limited rights for others (depending on the
license) to do what THEY want with the paper.
• Author pays Article Processing Charge.
• Free for all to download in perpetuity.
Nature Communications
● Publishes significant advances that have
the potential to influence the thinking of
specialists in a field.
● Broad appeal isn’t a prerequisite for
publication… but great science is!
● 2013 Impact Factor = 10.742.
● Choice of subscription access or Open
Access!
● Specialist scope means the chances of
being published are more than twice
those at other Nature journals.
Scientific Reports
● Impact Factor: 5.078.
● Speed: Scientific Reports is committed
to providing a rapid publication service.
● Acceptance rate: Over 60%
● Scope: Publishes technically sound,
original research papers in all areas of
the natural and clinical sciences.
● International Editorial Board: 1600
experts across all disciplines.
● Visibility: Over 800,000 article page
views per month. http://www.nature.com/srep
Open Access means anybody can
download, read, and cite your paper
Publish your data
In 1953 a scientific work could
change the world with…
• One page of text.
• Two authors.
• One figure.
• No raw data.
Watson, J. D. & Crick, F. H.
C. Molecular structure of
nucleic acids. Nature 171,
737-738 (1953).
… but in the 21st century scientific
discovery is more about data and
collaboration.
In 2012 the Encyclopedia
of DNA Elements
(ENCODE) generated
• Thirty papers.
• Across three different
journals.
• From thirty-two different
research institutes.
The Data Deluge
Photo by Shalom Jacobovitz, via Wikipedia
Data, data, data
Depositions of datasets in archives continue to grow, surpassing
journal articles in biomedical research
Growth of biomedical research
publications (red; current total >19
million), alongside the accumulation
of research data, including nucleic
acid sequences (black; current total
~163 million), computer-annotated
protein sequences (magenta; current
total 9 million), manually annotated
protein sequences (green; current
total 500,000) and protein structures
(blue; current total 60,000)
Source: Biochemical Journal 2009 424, 317-333 - Teresa K. Attwood, Douglas B. Kell and others.
The Data Journal concept
• Data must be well described before others can use it
and benefit from it.
• Scientists who share data in a reusable manner deserve
credit through citable publications.
• Several journals now offer “data paper” article-types,
including GigaScience, F1000Research, Earth System
Science Data, and Biodiversity Data Journal.
Now launched! Visit nature.com/scientificdata Email [email protected] Tweet @ScientificData
Honorary Academic Editor: Susanna-Assunta Sansone
Advisory Panel and Editorial Board including senior researchers, funders, librarians and curators
Helping you publish, discover
and reuse research data
Now Live!
Get Credit for Sharing Your Data
Publications will be indexed and citable.
Open-access
Authors select from three Creative Commons licenses for the main Data
Descriptor. Each publication is supported by CC0 metadata.
Focused on Data Reuse
All the information others need to reuse the data; no interpretative analysis
or hypothesis testing
Peer-reviewed
Rigorous peer-review focused on technical data quality and reuse value
Promoting Community Data Repositories
Not a new data repository; data stored in community data repositories
Data Descriptor: relation with traditional articles
Traditional article:
• Synthesis
• Analysis
• Conclusions
Data Descriptor:
• What did I do to generate the data?
• How was the data processed?
• Where is the data?
• Who did what, and when?
Data Descriptors contain methods and technical analyses supporting the quality of the measurements.
They do not contain tests of new scientific hypotheses.
When should you submit
a manuscript to Scientific Data?
• Alongside your article at a Nature journal.
• To describe standalone datasets that don’t
fit in your other publications.
• To release data used in your previous
research articles.
Focus on RNA sequencing quality control (SEQC)
In the September issue of Nature Biotechnology:
• A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the Sequencing Quality Control Consortium. SEQC/MAQC-III Consortium | doi:10.1038/nbt.2957
• The concordance between RNA-seq and microarray data depends on chemical treatment and transcript abundance. Wang et al. | doi:10.1038/nbt.3001
In Scientific Data:
• Cross-platform ultradeep transcriptomic profiling of human reference RNA samples by RNA-Seq. Xu et al. | doi:10.1038/sdata.2014.20
• Transcriptomic profiling of rat liver samples in a comprehensive study design by RNA-Seq. Gong et al. | doi:10.1038/sdata.2014.21
Stem Cells
• Associated Nature Article
• Data at figshare & NCBI GEO
• Integrated figshare data viewer
Neuroscience
• New Dataset
• Data in OpenfMRI
• Source code in GitHub
• Big Data
Environmental
• New Dataset
• Data in figshare
• Code in figshare
• Integrated figshare data viewer
• Cited in Science
Linking between research papers, Data Descriptors, and data records
Making data discoverable
The Data Descriptor article-type
A Data Descriptor has two components:
• Experimental metadata, or structured component
(in-house curated, machine-readable formats)
• Article, or narrative component
(PDF and HTML)
Focus on data reuse
Sections:
• Title
• Abstract
• Background & Summary
• Methods
• Technical Validation
• Data Records
• Usage Notes
• Figures & Tables
• References
• Data Citations
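The section list above can be turned into a quick pre-submission check. The following is only a minimal sketch: the section names come from the slide, but the function name `missing_sections` and the decision to treat every listed section as expected are assumptions made for illustration, not journal tooling.

```python
# Section names copied from the slide. Treating every one of them as
# expected in a draft is an assumption made for this sketch.
EXPECTED_SECTIONS = [
    "Title",
    "Abstract",
    "Background & Summary",
    "Methods",
    "Technical Validation",
    "Data Records",
    "Usage Notes",
    "Figures & Tables",
    "References",
    "Data Citations",
]


def missing_sections(headings):
    """Return the expected sections that do not appear in `headings`.

    The comparison ignores case and surrounding whitespace, and the
    result keeps the order used on the slide.
    """
    present = {h.strip().lower() for h in headings}
    return [s for s in EXPECTED_SECTIONS if s.lower() not in present]


# Hypothetical draft that still lacks several sections.
draft = ["Title", "Abstract", "Methods", "Data Records", "References"]
print(missing_sections(draft))
```

For the hypothetical draft shown, the check reports Background & Summary, Technical Validation, Usage Notes, Figures & Tables, and Data Citations as still to be written.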
Data Descriptor
Detailed descriptions of the methods and technical analyses
supporting the quality of the measurements.
Does not contain tests of new scientific hypotheses
Joint Declaration of Data Citation Principles by the Data Citation Synthesis Group, including:
• CODATA
• Research Data Alliance
• Force11
In-house curation team:
• helps users submit the
structured content via simple
templates and an internal
authoring tool
• performs value-added semantic
annotation of the experimental
metadata
For advanced users/service
providers willing to export ISA-Tab
for direct submission, we have
released a technical specification:
[Figure: Data Descriptor structured metadata (CC0), linking analysis, method, script, and data file or record in a database]
The right licence for the right content
Data: the primary datasets will reside in public
repositories. We are partnering with figshare and Dryad,
which both use the CC0 waiver.
Metadata: released under the CC0 waiver to
maximize reuse and aid data miners.
Data Descriptor article: licensed under one of
three Creative Commons licenses, by author
choice.
Editorial process & policies
Editorial Board: active scientists oversee peer-review
Peer-review assesses
• The completeness of the description
• Alignment with community standards
• Data deposition in an appropriate repository
• Technical quality of the measurements
• Reuse value
Clear data sharing policies
• Data must be deposited in an approved data repository
before manuscript submission, prior to peer-review.
• If datasets are private, they must be made accessible to
editors and referees in a secure and confidential manner.
• Authors must agree to release data to the public, without undue
restrictions, at the time of publication.
• Reasonable controls are allowed for datasets with human privacy
restrictions.
Our recommended repositories
• We currently recognize over 60 public data repositories.
• We have integrated systems with both figshare and
Dryad
• Earth sciences repositories include: Pangaea, ORNL
DAAC, NERC Data Centres, and more
• We work with you to find the best place to archive your
data.
Other initiatives to promote open science at
Nature Publishing Group
Promoting Reproducible Science
• Strong data deposition requirements in fields
with well-established repositories, across all
Nature-titles.
• A new life sciences reproducibility checklist
helps ensure that key information is included in
each manuscript.
• Collaboration between the Nature-titles and
Scientific Data to promote wider data sharing.
Data Citations
Formally link Data Descriptor to external data records
Joint Declaration of Data Citation Principles by the Data Citation Synthesis Group, including:
• CODATA
• Research Data Alliance
• Force11
NPG supports ORCIDs
ORCIDs are open, unique personal identifiers for researchers.
ORCIDs help you get credit for your scientific output
• A persistent digital identifier that distinguishes you from every other researcher
• Register for your own at
www.orcid.org
• Include your ORCID whenever you are an author on a manuscript, in grants, and on your website
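Because an ORCID is meant to travel with you across manuscripts, grants, and websites, it is worth checking that it was typed correctly before including it. The sketch below is illustrative only. It assumes the check-character scheme described in ORCID's public documentation (ISO 7064 MOD 11-2 over the first 15 digits), which the slides do not cover, and the function names are hypothetical. The first test value is a sample iD commonly used in ORCID documentation.

```python
def orcid_check_character(base_digits):
    """Compute the final character of an ORCID from its first 15 digits.

    Assumes the ISO 7064 MOD 11-2 scheme; a result of 10 is written "X".
    """
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid):
    """Return True if `orcid` has 16 characters and a matching check character."""
    compact = orcid.replace("-", "").upper()
    if len(compact) != 16:
        return False
    if any(ch not in "0123456789" for ch in compact[:15]):
        return False
    return orcid_check_character(compact[:15]) == compact[15]


print(is_valid_orcid("0000-0002-1825-0097"))  # sample iD, check digit 7
print(is_valid_orcid("0000-0002-1825-0098"))  # last digit altered
```

A check like this catches typing mistakes only. It does not confirm that the identifier is registered or that it belongs to a particular researcher.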
• Make the most out of your research
• Share your data, get the credit you
deserve
• Register for an ORCID today
Concluding points
Managing Editor, Scientific Data: Andrew L. Hufton [email protected]
Honorary Academic Editor: Susanna-Assunta Sansone
Advisory Panel and Editorial Board including senior researchers, funders, librarians and curators
Thanks!