Reproducible Science - Panel at iEvoBio 2014
DESCRIPTION
My introduction and position slides for the Reproducible Science panel at the 2014 iEvoBio conference, held June 24-25, 2014, in Raleigh, NC.
TRANSCRIPT
Reproducible Science
Panel at iEvoBio 2014
Hilmar Lapp
National Evolutionary Synthesis Center (NESCent)
Disclosure (or call it perspective)
• Funded by NSF through BEACON for a series of workshops on developing reproducible science curriculum
• 3 principal questions:
• What practices, tools, and resources are available now?
• How best to teach these?
• What are the gaps faced by biologist users?
An experiment with sobering results
https://storify.com/hlapp/reproducibility-repeatability-bigthink
Main takeaways (distilled to tweets)
• software with many dependencies -> exponentially lower prob that all install
• holes or errors in docs -> harmless for experts, often fatal for "method novice"
• software evolution & rot -> parameters that worked 1 year ago now throw an error
• harder for non-domain reproducers: baseline software and packages differ #dependencyhell
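The first takeaway can be made concrete with a small back-of-the-envelope sketch. It assumes, purely for illustration, that each dependency installs cleanly with the same probability and independently of the others. Neither assumption comes from the experiment itself, and the function name and the 0.95 figure are hypothetical.

```python
def prob_all_install(n_deps: int, p_each: float) -> float:
    """Chance that every one of n_deps dependencies installs.

    Assumption (not from the slides): installs succeed independently,
    each with the same probability p_each.
    """
    if n_deps < 0:
        raise ValueError("n_deps must be non-negative")
    if not 0.0 <= p_each <= 1.0:
        raise ValueError("p_each must be between 0 and 1")
    # Independent events multiply, so the joint probability is p_each ** n_deps.
    return p_each ** n_deps


if __name__ == "__main__":
    # 0.95 is an illustrative per-dependency success rate, not measured data.
    for n in (1, 5, 10, 20, 50):
        print(f"{n:>3} dependencies -> {prob_all_install(n, 0.95):.3f}")
```

Under these assumptions, even a 95% per-dependency success rate leaves only about a 60% chance that ten dependencies all install, which is the exponential decay the tweet refers to.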
http://ropensci.org/blog/2014/06/09/reproducibility/
“arguing that reproducibility is laudable in general glosses
over the fact that for each research group it is a
significant amount of work to make their research
(easily) reproducible for independent scientists”
“Any work you do to make your analysis more reproducible pays dividends for
colleagues and your future self.”
Jeremy Leipzig
For research to be reproducible, the parts need
to be available to start
Collberg et al. (2014), Measuring Reproducibility in Computer Systems Research.
http://reproducibility.cs.arizona.edu/tr.pdf
A huge tech soup
• Vagrant
• Ansible
• Docker
• Drone
• Travis
• knitr
• packrat
• VM memory limits
• VM storage limits
• VM uptime limits
• firewalls
• protected data
• data snapshotting
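To give a flavor of what the dependency-pinning and data-snapshotting items in this list are after, here is a minimal sketch in Python. It is not how packrat or any other tool named above works internally. It only illustrates the idea using the standard library, and the function names `snapshot_environment` and `sha256_of_file` are made up for this example.

```python
import hashlib
import json
import platform
import sys
from importlib import metadata


def snapshot_environment() -> dict:
    """Record the interpreter, platform, and installed package versions."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with broken or missing metadata
            packages[name] = dist.version
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": dict(sorted(packages.items())),
    }


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Checksum a data file so a later run can verify it is unchanged."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large data files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    record = snapshot_environment()
    print(json.dumps({k: record[k] for k in ("python", "platform")}, indent=2))
    print(f"{len(record['packages'])} packages recorded")
```

Storing such a record next to the analysis output would at least document which versions and which data an analysis ran against. Rebuilding that environment later is the harder part that the tools in the list address.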
Reproducible science is a huge opportunity
for Research IT to enable & accelerate
science.