Winning the Tour de France, Research Data and Data Stewardship

Presentation to Sport Data Valley meeting, May 2016. Alastair Dunning, 3TU.Datacentrum, hosted at TU Delft Library (@alastairdunning, [email protected]).

Upload: alastair-dunning

Post on 12-Apr-2017

TRANSCRIPT

Page 1: Winning the Tour de France, Research Data and Data Stewardship

Presentation to Sport Data Valley meeting, May 2016

Alastair Dunning, 3TU.Datacentrum, hosted at TU Delft Library
@alastairdunning, [email protected]

Winning the Tour de France, Research Data and Data Stewardship

Page 2: Winning the Tour de France, Research Data and Data Stewardship

In the 2015 Tour de France, Chris Froome won Stage 10 on Bastille Day, which finished on a 1,610 m hors catégorie climb, by 59 seconds.

Page 3: Winning the Tour de France, Research Data and Data Stewardship

Critics immediately questioned Froome’s dominance over other riders, accusing him of doping.

Page 4: Winning the Tour de France, Research Data and Data Stewardship

Such criticism has followed Froome since he shot to fame in 2012 and then won the Tour de France in 2013.

Page 5: Winning the Tour de France, Research Data and Data Stewardship

In response, Froome’s team, Team Sky, published the ‘power data’ behind his performance.

Page 6: Winning the Tour de France, Research Data and Data Stewardship

Later in the year, Froome underwent more testing and the lab data was released

Page 7: Winning the Tour de France, Research Data and Data Stewardship

Results showed that much of Froome’s improvement was down to weight loss (more than 5 kg).

Since then, criticism of Froome has diminished.

Page 8: Winning the Tour de France, Research Data and Data Stewardship

What happened to Team Sky and Chris Froome is happening across scientific research.

Page 9: Winning the Tour de France, Research Data and Data Stewardship

How does any scientist look after their data? Not just to prove arguments to others but to themselves at a later time.

Page 10: Winning the Tour de France, Research Data and Data Stewardship

In a digital age, with data readily available, how does science verify and reproduce the claims it makes?

Page 11: Winning the Tour de France, Research Data and Data Stewardship

This has led to the fields of research data management and data stewardship

Page 12: Winning the Tour de France, Research Data and Data Stewardship

I would urge anybody creating or using data as evidence to start thinking about these issues

Page 13: Winning the Tour de France, Research Data and Data Stewardship

Shared values behind Data Stewardship

• The safe storage and protection of intellectual capital developed by scientists
• Best practice in ensuring scientific arguments are replicable in the long term
• Better exposure of work of scientists and improved citation rates
• Improved practices for meeting the demands of funders, publishers and others in respect to research data

Page 14: Winning the Tour de France, Research Data and Data Stewardship

Around 1 in 6 researchers at Erasmus University had no idea whether their data was backed up.

56 professors in the USA agreed to have their data practices analysed: “a majority of them had experienced the loss of at least one work-related digital object that they considered to be important in the course of their professional career.”

Safe storage and protection of intellectual capital

Page 15: Winning the Tour de France, Research Data and Data Stewardship

Safe storage and protection of intellectual capital

Study in Current Biology: The Availability of Research Data Declines Rapidly with Article Age

“We examined the availability of data from 516 studies between 2 and 22 years old”

“The odds of a data set being reported as extant fell by 17% per year”

“Policies mandating data archiving at publication are clearly needed”
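
As a rough illustration of what a 17% yearly fall in odds implies, the sketch below compounds the decline over several years and converts the odds back to a probability. The starting probability of 0.9 is an assumed figure chosen purely for illustration, not a value from the study, and the function names are hypothetical.

```python
def probability_to_odds(p: float) -> float:
    """Convert a probability p into odds, p / (1 - p)."""
    return p / (1.0 - p)


def odds_to_probability(odds: float) -> float:
    """Convert odds back into a probability."""
    return odds / (1.0 + odds)


def odds_after(initial_odds: float, years: int, yearly_fall: float = 0.17) -> float:
    """Compound a constant yearly fall in odds (17% in the quoted study)."""
    return initial_odds * (1.0 - yearly_fall) ** years


if __name__ == "__main__":
    start_probability = 0.9  # ASSUMPTION: illustrative starting point only
    start_odds = probability_to_odds(start_probability)
    for years in (0, 5, 10, 20):
        p = odds_to_probability(odds_after(start_odds, years))
        print(f"after {years:2d} years: chance the data still exists is about {p:.2f}")
```

Note that the fall applies to the odds, not to the probability itself, so the probability does not simply drop by 17 percentage points each year.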

Page 16: Winning the Tour de France, Research Data and Data Stewardship

Safe storage and protection of intellectual capital

Page 17: Winning the Tour de France, Research Data and Data Stewardship

Disproving Einstein’s Theory of Locality - Professor Ronald Hanson and his team, including featured Ph.D. student Bas Hensen. Published in Nature

Best practice in ensuring scientific arguments are replicable in the long term

Hanson and Hensen knew they were working on a high-impact paper, so they expected requests for the raw data so that the experiment could be validated and the data checked for consistency. Scientists had been using this experimental method since the 1960s and the results had always been contested, so there was already a tradition of sharing data related to this experiment. They therefore knew from the start that they would open up the data.

A couple of months after publication, the dataset is already attracting interest. In the first six months since its deposit, the first dataset has been viewed 650 times. The second dataset has been viewed 56 times in its first three weeks. This is in line with Hensen’s expectations: he reckons it shows that nearly all of the world’s other research groups involved in experimental quantum mechanics have accessed the dataset.

Page 18: Winning the Tour de France, Research Data and Data Stewardship

“The Citation Advantage presently (at the least since 2009) amounts to papers with links to data receiving on the average 50% more citations per paper per year, than the papers without links to data.”

(Astrophysics, 2012)

“Publicly available data was significantly (p = 0.006) associated with a 69% increase in citations, independently of journal impact factor, date of publication.”

(Cancer microarray trials, 2007)

“Findings suggest that all three data sets are highly cited, with estimated citation counts in most cases higher than 99% of all the journal articles published in Oceanography during the same years”

(Oceanography, 2014)

Better exposure of academic work of scientists

Page 19: Winning the Tour de France, Research Data and Data Stewardship

Improved practices for meeting the demands of funders, publishers and others in respect to research data

Page 20: Winning the Tour de France, Research Data and Data Stewardship

The 3TU.Datacentrum (soon to become 4TU) exists to help with these issues

Page 21: Winning the Tour de France, Research Data and Data Stewardship

Services of 3TU.Datacentrum data repository

http://data.3tu.nl/repository/

• ‘Frozen’ dataset (version) for future use & long term storage
• ‘Published’ data: visible
• Open (max. 2 years embargo): shareable
• Persistent digital object identifier (DOI): findable and citable
• Sustainable formats: readable
• Data Seal of Approval: safe and secure

Page 22: Winning the Tour de France, Research Data and Data Stewardship

Page 23: Winning the Tour de France, Research Data and Data Stewardship

Every researcher can upload up to 10 GB of data a year to 3TU.Datacentrum free of charge. Additional data can be deposited for a one-off cost of €4.50 per GB.
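
The pricing above can be sketched as a small calculation. This is a minimal illustration, assuming the charge is applied pro rata to every GB above the yearly free allowance with no rounding; the function and constant names are hypothetical and not part of any 3TU.Datacentrum service.

```python
FREE_ALLOWANCE_GB = 10.0   # free deposit per researcher per year (from the slide)
PRICE_PER_EXTRA_GB = 4.50  # one-off cost in euros per additional GB (from the slide)


def deposit_cost_eur(total_gb_this_year: float) -> float:
    """Estimate the one-off cost in euros of depositing data in a single year.

    ASSUMPTION: only the volume above the free allowance is charged,
    in proportion to its size.
    """
    if total_gb_this_year < 0:
        raise ValueError("deposit size cannot be negative")
    extra_gb = max(0.0, total_gb_this_year - FREE_ALLOWANCE_GB)
    return extra_gb * PRICE_PER_EXTRA_GB


if __name__ == "__main__":
    for size_gb in (5, 10, 12, 50):
        print(f"{size_gb} GB deposited -> EUR {deposit_cost_eur(size_gb):.2f}")
```

Under these assumptions a deposit within the allowance costs nothing, and only the excess is charged once.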

Page 24: Winning the Tour de France, Research Data and Data Stewardship

3TU.Datacentrum would be happy to discuss options with Sport Data Valley partners for hosting their data

Page 25: Winning the Tour de France, Research Data and Data Stewardship

Presentation to Sport Data Valley meeting, May 2016

Alastair Dunning, Research Data, TU Delft & 3TU.Datacentrum

@alastairdunning, [email protected]

Winning the Tour de France, Research Data and Data Stewardship

Page 26: Winning the Tour de France, Research Data and Data Stewardship

Slide 2 - https://en.wikipedia.org/wiki/2015_Tour_de_France,_Stage_1_to_Stage_11#Stage_10

Slide 3 - http://www.independent.co.uk/sport/cycling/tour-de-france-2015-doping-claims-dampen-the-mood-as-chris-froome-triumphs-10417336.html

Slide 5 - http://www.teamsky.com/teamsky/home/article/59618#vYKyzhBzAIYy7BKH.97

Slide 6 - http://chrisfroome.esquire.co.uk/

Slide 14 - https://www.fosteropenscience.eu/sites/default/files/pdf/919.pdf (Erasmus); http://www.ijdc.net/index.php/ijdc/article/view/10.2.96 (Intellectual Capital at Risk, US Study); https://www.flickr.com/groups/2121762@N23/

Slide 15 - http://www.cell.com/current-biology/abstract/S0960-9822(13)01400-0; https://www.flickr.com/groups/2121762@N23/

Slide 16 - various. Type ‘Fire Lab University’ into Google!

Slide 17 - http://datacentrum.3tu.nl/en/researchers-about-3tudatacentrum/ (forthcoming); http://www.nature.com/nature/journal/v526/n7575/full/nature15759.html

Slide 18 - Belter CW (2014) Measuring the Value of Research Data: A Citation Analysis of Oceanographic Data Sets. PLoS ONE 9(3): e92590. doi:10.1371/journal.pone.0092590; Piwowar HA, Day RS, Fridsma DB (2007) Sharing Detailed Research Data Is Associated with Increased Citation Rate. PLoS ONE 2(3): e308. doi:10.1371/journal.pone.0000308; Bertil Dorch. On the Citation Advantage of linking to data: Astrophysics. 2012. <hprints-00714715v2>

Slide 19 - http://ec.europa.eu/newsroom/dae/document.cfm?doc_id=15266 (EU); http://www.nwo.nl/en/policies/open+science/data+management (NWO)

Slide 21 - http://data.3tu.nl/repository/

Citations