data peer review workshop

Show me the data! Data peer review at Scientific Data Varsha Khodiyar, Scientific Data 30.03.2017

Upload: varsha-khodiyar

Post on 11-Apr-2017

TRANSCRIPT

Page 1: Data peer review workshop

Show me the data! Data peer review at Scientific Data

Varsha Khodiyar, Scientific Data

30.03.2017

Page 2: Data peer review workshop

Scientific Data, a Nature Research journal

Data Descriptor: Primary article type; sound science and facilitates data reuse

Analysis: New analyses or meta-analyses of existing data

Article: Original reports on advances in data sharing & reuse

Comment: Announcements of broad interest; usually invited

www.nature.com/scientificdata

Page 3: Data peer review workshop

Under the hood of a Data Descriptor

• Context for data generation (background)

• How was data generated?

• How was data processed?

• Where is the data?

• Synthesis

• Analysis

• Conclusions

Page 4: Data peer review workshop

A key principle of publishing at Scientific Data

Wilkinson M.D., et al. The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data 3, 160018 (2016). doi:10.1038/sdata.2016.18

Findable – (meta)data is uniquely and persistently identifiable.

Accessible – data is reachable and accessible by humans and machines, using standard formats and protocols.

Interoperable – (meta)data is machine readable and annotated with resolvable vocabularies and ontologies.

Reusable – (meta)data is sufficiently well-described to allow integration with compatible data.
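
As a small illustration of the "Findable" principle, the sketch below turns the DOI cited on this slide into a resolvable https://doi.org/ URL. The helper name `doi_to_url` and the loose pattern check are assumptions made for illustration; they are not part of the FAIR paper or of any Scientific Data tooling.

```python
import re

# Loose, illustrative pattern: "10.", a 4-9 digit registrant code, "/", then a suffix.
# This is an assumption for the sketch, not a complete DOI validator.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_to_url(doi: str) -> str:
    """Return the https://doi.org/ URL for a DOI string (hypothetical helper)."""
    cleaned = doi.strip()
    # Accept the "doi:" prefix commonly used in reference lists.
    if cleaned.lower().startswith("doi:"):
        cleaned = cleaned[4:].strip()
    if not DOI_PATTERN.match(cleaned):
        raise ValueError(f"not a recognisable DOI: {doi!r}")
    return "https://doi.org/" + cleaned


print(doi_to_url("doi:10.1038/sdata.2016.18"))
```

A persistent identifier of this kind lets both a person and a script reach the same record without relying on a page title or a file name.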

Page 5: Data peer review workshop

Data Descriptors have human and machine understandable components

Human readable representation of study, i.e. article (HTML & PDF)

Page 6: Data peer review workshop

Data Descriptors have human and machine understandable components

Machine accessible representation of study, i.e. metadata
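
To make the idea of a machine accessible representation concrete, here is a minimal sketch of a metadata record serialised as JSON. The class `DatasetRecord`, its field names and all values are hypothetical placeholders; the slides do not specify the metadata format Scientific Data actually uses.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DatasetRecord:
    """Hypothetical, minimal metadata record; field names are illustrative only."""
    title: str
    repository: str
    identifier: str  # persistent identifier, e.g. a DOI
    measurement_types: list = field(default_factory=list)


def to_json(record: DatasetRecord) -> str:
    """Serialise the record with stable key order so that diffs stay readable."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)


# All values below are placeholders, not a real dataset.
record = DatasetRecord(
    title="Example dataset (placeholder)",
    repository="example-repository",
    identifier="doi:10.0000/placeholder",
    measurement_types=["blood urea nitrogen"],
)
text = to_json(record)
print(text)

# A program can read the record back without parsing the article text.
restored = json.loads(text)
print(restored["identifier"])
```

The article remains the human readable view; a structured record like this is what search tools and aggregators could consume.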

Page 7: Data peer review workshop

What types of data can be published?

Decades old dataset

Standalone dataset

Data that has been used in an analysis article

Large consortium dataset

Data from a single experiment

Any data that the researcher finds valuable and that others might find useful too

Data associated with a high impact analysis article

Page 8: Data peer review workshop

When can a Data Descriptor be published?

Before the analysis has been published

Publication alongside analysis article

After data analysis has been published

Authors not intending to analyse data

Data Descriptors can be submitted and published at any point in the research workflow, i.e. whenever it makes most sense for your data

Page 9: Data peer review workshop

Why peer review data?

Page 10: Data peer review workshop

Researchers are sharing and reusing data

• Direct contact between researchers (on request) is the most common way of sharing data

• Repositories are the second most common method of sharing

Why might direct contact be the most preferred method?

Fig 2A & C; Kratz and Strasser, PLOS ONE (2015) doi: 10.1371/journal.pone.0117619

Page 11: Data peer review workshop

Researchers see peer review as a mark of data quality

• Respondents trust peer review above all else: 72% (n = 175) say peer review confers high or complete confidence in the data

Figure 6B; Kratz and Strasser, PLOS ONE (2015) doi: 10.1371/journal.pone.0117619

Page 12: Data peer review workshop

How is data peer reviewed at Scientific Data?

Page 13: Data peer review workshop

Editorial office

Susanna-Assunta Sansone, Honorary Academic Editor

Andrew L. Hufton, Managing Editor

Varsha K. Khodiyar, Data Curation Editor

Page 14: Data peer review workshop

Selection of Editorial Board members

Experts in their discipline

AND

Demonstrable experience of data standards, data reuse or data analysis in their discipline

www.nature.com/sdata/about/editorial-board#eb

Page 15: Data peer review workshop

Data peer review

www.nature.com/sdata/policies/for-referees

Experimental Rigor and Technical Data Quality

Were data produced in a sound manner?

Technical quality of data – appropriate statistical analyses?

Experimental rigor - appropriate depth, coverage?

Completeness of the Description

Sufficient detail to allow others to reproduce these steps?

Sufficient detail to allow others to reuse this data?

Consistent with relevant minimum reporting standards?

Integrity of the Data Files and Repository Record

Do data files appear complete and match manuscript descriptions?

Are data archived to the most appropriate repository?
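
One way an author could pre-check the "complete and match manuscript descriptions" question before submission is sketched below. It compares the files named in a manuscript with the files present in a local copy of the archive and verifies checksums where they are known. The function names, the input format and the file names are assumptions for illustration, not a description of the journal's or the referees' actual procedure.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_archive(archive_dir: Path, described: dict) -> dict:
    """Compare files described in a manuscript with files found in a folder.

    `described` maps file name -> expected SHA-256 digest, or None when no
    checksum was reported (hypothetical input format).
    """
    present = {p.name for p in archive_dir.iterdir() if p.is_file()}
    mismatched = [
        name
        for name, expected in described.items()
        if name in present
        and expected is not None
        and sha256_of(archive_dir / name) != expected
    ]
    return {
        "missing": sorted(set(described) - present),
        "undescribed": sorted(present - set(described)),
        "checksum_mismatch": sorted(mismatched),
    }


# Small demonstration with throwaway files; all names are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "measurements.csv").write_text("sample_id,value\ns1,12\n")
    (folder / "notes.txt").write_text("not mentioned in the manuscript\n")
    described = {
        "measurements.csv": sha256_of(folder / "measurements.csv"),
        "readme.txt": None,
    }
    print(check_archive(folder, described))
```

The report separates three different problems (files promised but absent, files present but undescribed, and files whose content changed), since each calls for a different fix.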

Page 16: Data peer review workshop

Metadata curation and final data checking

We capture metadata about the dataset being described in each Data Descriptor.

During the metadata curation process:

• Manuscript re-read

• Data archive checked

• Minor issues with the data and/or manuscript often identified

Page 17: Data peer review workshop

Why a Data Descriptor may be rejected

Reject without review

• Out of scope or no data present

Reject after review

• Serious flaws in the study design, e.g. lack of crucial controls

• Serious issues identified in the data files by the peer reviewers

After rejection

• Address concerns and resubmit to Scientific Data

• Resubmit to another data journal

• Withdraw data from Scientific Data integrated repositories

Data should be technically reliable and suitable for use by others

Page 18: Data peer review workshop

Ensuring your data is peer review ready

Page 19: Data peer review workshop

Create a data management plan

• Can avoid problems later

• Increasingly required by funders

• Critically evaluate existing practices – you may be setting standards for your field

• Some aspects of best practice may incur costs

• Find people and resources that can help you

[Figure labels: Datasets, Code, Metadata, Research paper]

Nature Genetics

Page 20: Data peer review workshop

Archive your data to the most appropriate repository

We currently list around 90 repositories, across biological, medical, physical and social sciences

www.nature.com/sdata/policies/repositories

Considerations:

1. Is there a discipline or data-specific repository for your data?

2. If no discipline or data-specific repository for your data exists, does your funder or institution mandate deposition to a particular repository?

Page 21: Data peer review workshop

Spot the mistakes

Unhelpful document name

Formatting used to convey information

Special characters can cause text mining errors

Meaningless column titles

Undefined abbreviation

No units are given

Page 22: Data peer review workshop

Increasing intelligibility

Self-explanatory document name

Removed cell formatting

Removed special characters

Meaningful column titles

Defined ‘BUN’

Page 23: Data peer review workshop

Increasing assessability

Information which was asterisked is now added to results section

Added Units column
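
The fixes listed on the last three slides (meaningful column titles, no special characters carrying meaning, an explicit units column) can be sketched in a few lines. The input table, the column mapping and the units below are invented for illustration and are not the spreadsheet shown on the slides; BUN is taken here to mean blood urea nitrogen.

```python
import csv
import io

# Hypothetical raw table showing the problems from "Spot the mistakes":
# meaningless column titles, no units, and an asterisk used to convey information.
RAW_TABLE = "ID,c1,c2\ns1,12*,24\ns2,15,26\n"

# Assumed mapping from unclear headers to (meaningful name, units).
COLUMNS = {
    "c1": ("blood_urea_nitrogen", "mg/dL"),
    "c2": ("blood_carbon_dioxide", "mmol/L"),
}


def tidy(raw_csv: str) -> list:
    """Return one row per measurement with explicit names, units and flags."""
    rows = []
    for record in csv.DictReader(io.StringIO(raw_csv)):
        for old_name, (new_name, units) in COLUMNS.items():
            cell = record[old_name].strip()
            rows.append({
                "sample_id": record["ID"].strip(),
                "variable": new_name,
                "value": float(cell.rstrip("*")),
                "units": units,
                # The asterisk becomes an explicit column instead of formatting.
                "flagged": cell.endswith("*"),
            })
    return rows


for row in tidy(RAW_TABLE):
    print(row)
```

What the flag means would still need to be stated in the manuscript; the point of the sketch is only that the information no longer depends on a special character inside a numeric cell.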

Page 24: Data peer review workshop

Increasing re-usability

Additional information to be added to methods section or table legend

Page 25: Data peer review workshop

Increasing reproducibility

• Include any additional information needed to understand the data, methods, parameters, e.g. which instrument (make and model) was used to measure blood carbon dioxide levels?

• Include availability statements for any code that was used to view, parse or analyse the data, in support of the conclusions.

Page 26: Data peer review workshop

Reporting Guidelines

Page 27: Data peer review workshop

What happens when data is shared well?

Page 28: Data peer review workshop

Data reuse by other researchers in the same field

“The Data Descriptor made it easier to use the data, for me it was critical that everything was there…all the technical details like voxel size.”

Professor Daniele Marinazzo

Page 29: Data peer review workshop

Data reuse by the non-research community

www.bbc.co.uk/news/science-environment-33057402

Page 30: Data peer review workshop

Data reuse by the non-research community

http://www.nytimes.com/interactive/2014/12/30/science/history-of-ebola-in-24-outbreaks.html

Page 31: Data peer review workshop

Data peer review at Scientific Data

Data Archive

• Checked multiple times

• Scientific reasoning underlying data reviewed by active researchers

• Technical validity reviewed by discipline experts

Data Citations

• Citation accuracy confirmed by specialist editor

• Citation format checked by editorial team

• Data linkage tested by production team

Data Peer Review

• Does not have to be onerous

• Can save overall reviewing time

• Results in data that is reusable and useful!

Page 32: Data peer review workshop

Thank you!

Visit nature.com/scientificdata

Email [email protected]

@ScientificData