Results May Vary: Collaborations Workshop, Oxford 2014
DESCRIPTION
Thoughts on computational science reproducibility with a focus on software. Given at the Software Sustainability Institute's 2014 Collaborations Workshop.
TRANSCRIPT
icanhascheezburger.com
Results may vary
reproducibility. science. software.
Professor Carole Goble, The University of Manchester, UK
The Software Sustainability Institute, [email protected]
@caroleannegoble
Collaborations Workshop, Oxford, 26 March 2014
“An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment, [the complete data] and the complete set of instructions which generated the figures.” David Donoho, “Wavelab and Reproducible Research,” 1995
datasets, data collections, algorithms, configurations, tools and apps, codes, workflows, scripts, code libraries, services, system software infrastructure, compilers, hardware
Morin et al., Shining Light into Black Boxes, Science 13 April 2012: 336(6078), 159-160
Ince et al., The case for open computer programs, Nature 482, 2012
http://www.nature.com/nature/focus/reproducibility/index.html
fraud
Corbyn, Nature, Oct 2012
“I can’t immediately reproduce the research in my own laboratory. It took an estimated 280 hours for an average user to approximately reproduce the paper. Data/software versions. Workflows are maturing and becoming helpful”
disorganisation
Phil Bourne
Garijo et al. 2013, Quantifying Reproducibility in Computational Biology: The Case of the Tuberculosis Drugome, PLOS ONE, DOI: 10.1371/journal.pone.0080278
inherent
Reporting (publishing)
availability
documentation
Replication Gap
1. Ioannidis et al., 2009. Repeatability of published microarray gene expression analyses. Nature Genetics 41: 142.
2. Science publishing: The trouble with retractions http://www.nature.com/news/2011/111005/full/478026a.html
3. Björn Brembs: Open Access and the looming crisis in science https://theconversation.com/open-access-and-the-looming-crisis-in-science-14950
Out of 18 microarray papers, results from 10 could not be reproduced.
Stodden V, Guo P, Ma Z (2013) Toward Reproducible Computational Research: An Empirical Analysis of Data and Code Policy Adoption by Journals. PLoS ONE 8(6): e67111. doi:10.1371/journal.pone.0067111
[Figure: journal data-sharing and code-sharing policy charts. Categories: required as condition of publication; required but may not affect decisions; explicitly encouraged, may be reviewed and/or hosted; implied; no mention]
170 journals, 2011-2012
10 Simple Rules for Reproducible Computational Research
1. For Every Result, Keep Track of How It Was Produced
2. Avoid Manual Data Manipulation Steps
3. Archive the Exact Versions of All External Programs Used
4. Version Control All Custom Scripts
5. Record All Intermediate Results, When Possible in Standardized Formats
6. For Analyses That Include Randomness, Note Underlying Random Seeds
7. Always Store Raw Data behind Plots
8. Generate Hierarchical Analysis Output, Allowing Layers of Increasing Detail to Be Inspected
9. Connect Textual Statements to Underlying Results
10. Provide Public Access to Scripts, Runs, and Results
(a small code sketch illustrating several of these rules follows the citation below)
Citation: Sandve GK, Nekrutenko A, Taylor J, Hovig E (2013) Ten Simple Rules for Reproducible Computational Research. PLoS Comput Biol 9(10): e1003285. doi:10.1371/journal.pcbi.1003285
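Purely as an illustration (not from the talk), a minimal Python sketch of how one analysis step could follow several of these rules; the file names, data and helper values are hypothetical, and the sketch assumes git is on the PATH.

```python
# Hypothetical sketch: one analysis step that records how its result was produced.
# Touches rule 1 (track how the result was made), rule 3 (external versions),
# rule 6 (random seeds), rule 7 (raw data behind the plot) and rule 10 (open files).
import json
import random
import subprocess
import sys
from datetime import datetime, timezone

SEED = 42                      # rule 6: note the underlying random seed
random.seed(SEED)

raw_data = [random.gauss(0.0, 1.0) for _ in range(1000)]   # stand-in for real measurements
result = sum(raw_data) / len(raw_data)                      # the number behind a claim

provenance = {                                              # rule 1: how the result was produced
    "script": sys.argv[0],
    "python_version": sys.version,
    "external_tools": {
        # rule 3: record exact versions of external programs (git used as an example)
        "git": subprocess.run(["git", "--version"],
                              capture_output=True, text=True).stdout.strip(),
    },
    "random_seed": SEED,
    "run_at": datetime.now(timezone.utc).isoformat(),
    "result_mean": result,
}

with open("raw_data.json", "w") as f:                       # rule 7: store raw data behind plots
    json.dump(raw_data, f)
with open("provenance.json", "w") as f:                     # rule 10: an inspectable record
    json.dump(provenance, f, indent=2)

print(f"mean = {result:.4f} (seed {SEED})")
```

The point is that the seed, the tool versions and the raw data are written out as a matter of course during the run, not reconstructed afterwards.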
Record Everything
Automate Everything
republic of science*
regulation of science
institution core facilities, libraries
*Merton’s four norms of scientific behaviour (1942)
public services
recomputation.org
sciencecodemanifesto.org
meta-manifesto
• all X should be available and assessable forever and ever
• the copyright of X should be clear
• X should have citable, versioned identifiers
• researchers using X should visibly credit X's creators
• credit should be assessable and count in all assessments
• X should be curated, available, linked to all necessary materials, and intelligible
re-compute, replicate, rerun, repeat, re-examine, repurpose, recreate, reuse, restore, reconstruct, review, regenerate, revise, recycle
conceptual replication: “show A is true by doing B rather than doing A again”; verify but not falsify [Yong, Nature 485, 2012]
regenerate the figure
redo
Scientific publications have at least two goals: (i) to announce a result and (ii) to convince readers that the result is correct
… papers in experimental science should describe the results and provide a clear enough protocol to allow successful repetition and extension.
Jill Mesirov, Accessible Reproducible Research, Science 22 Jan 2010: 327(5964): 415-416, DOI: 10.1126/science.1179653
Virtual Witnessing*
*Leviathan and the Air-Pump: Hobbes, Boyle, and the Experimental Life (1985) Shapin and Schaffer.
Computational Research Virtual Witnessing
Methods (techniques, algorithms, specification of the steps)
Instruments (codes, services, scripts, underlying libraries)
Laboratory (software and hardware infrastructure, systems software, integrative platforms)
Materials (datasets, parameters, algorithm seeds)
Experiment Setup
repeat: same experiment, same lab
replicate: same experiment, different lab
reproduce: same experiment, different set up
reuse: different experiment, some of same
test
Drummond C, Replicability is not Reproducibility: Nor is it Good Science, online; Peng RD, Reproducible Research in Computational Science, Science 2 Dec 2011: 1226-1227.
Design
Execution
Result Analysis
Collection
Publish
Peer Review
Peer Reuse
Prediction
Can I repeat & defend my method?
Can I review / reproduce and compare my results and method with your results and method?
Can I review / replicate and certify your method?
Can I transfer your results into my research and reuse this method?
* Adapted from Mesirov, J. Accessible Reproducible Research, Science 327(5964), 415-416 (2010)
A Reproducibility Framework [Adapted from Freire, 2013]
Reporting dimension; Archive dimension
portability; variability / sameness; availability / open; description / intelligibility; preservation / packaging; versioning
gather dependencies; capture steps; track & keep results
BioSTIF
method
instruments and laboratory
Workflows: capture the steps
standardised pipelines; repetition & comparison; record experiment & set-up; provenance collection; reporting; embedded player; variant reuse
infrastructure shield; localised / distributed; in-house / external; multi-code experiments
materials
http://www.taverna.org.uk
(a minimal scripted-pipeline sketch follows below)
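The slide points to Taverna; the following is only a hypothetical Python sketch of the underlying idea of "capture the steps": the pipeline is declared as data, so the same run can be repeated, compared and reported step by step. It is not Taverna and uses made-up step names.

```python
# Hypothetical sketch of "capture the steps": a tiny pipeline whose stages are
# declared explicitly, so a run can be replayed and every intermediate reported.
from typing import Callable, Dict, List, Tuple

def clean(data: List[float]) -> List[float]:
    """Step 1: drop obviously invalid values (assumes a non-empty result)."""
    return [x for x in data if x >= 0]

def normalise(data: List[float]) -> List[float]:
    """Step 2: scale values into [0, 1]."""
    top = max(data) or 1.0
    return [x / top for x in data]

def summarise(data: List[float]) -> Dict[str, float]:
    """Step 3: reduce to the numbers that end up in the paper."""
    return {"n": len(data), "mean": sum(data) / len(data)}

# The pipeline is an ordered list of named steps: the method is explicit, not manual.
PIPELINE: List[Tuple[str, Callable]] = [
    ("clean", clean),
    ("normalise", normalise),
    ("summarise", summarise),
]

def run(data, pipeline=PIPELINE):
    trace = []                       # record of every step and its output
    value = data
    for name, step in pipeline:
        value = step(value)
        trace.append({"step": name, "output": value})
    return value, trace

if __name__ == "__main__":
    result, trace = run([3.0, -1.0, 7.0, 2.0])
    print(result)
    for entry in trace:
        print(entry["step"], "->", entry["output"])
```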
Provenance: the link between computation and results
Record: static verifiable record; partially repeat/reproduce
Track: track changes; carry citations; select data to keep/release
Analytics: repair; calculate data quality/trust; compare diffs/discrepancies
W3C PROV standard; PDIFF: comparing provenance traces to diagnose divergence across experimental results [Woodman et al, 2011]
(a small trace-diff sketch follows below)
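To make "compare diffs/discrepancies" concrete, here is a small hypothetical sketch (it is neither PDIFF nor a W3C PROV implementation) that diffs two simple run records to point at where results diverged; the tool names and values are invented.

```python
# Hypothetical sketch: compare two simple provenance records of the "same" run
# to see which recorded facts differ, and so where a discrepancy might come from.
def diff_traces(run_a: dict, run_b: dict) -> dict:
    """Return the keys whose values differ between two run records."""
    keys = set(run_a) | set(run_b)
    return {k: (run_a.get(k), run_b.get(k))
            for k in keys if run_a.get(k) != run_b.get(k)}

original = {"tool": "blast", "tool_version": "2.2.25", "db_release": "2012-01", "score": 0.91}
rerun    = {"tool": "blast", "tool_version": "2.2.28", "db_release": "2014-03", "score": 0.87}

# The diff points at candidate causes of the changed score: tool and database versions.
print(diff_traces(original, rerun))
# e.g. {'tool_version': ('2.2.25', '2.2.28'), 'db_release': ('2012-01', '2014-03'), 'score': (0.91, 0.87)}
```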
http://nbviewer.ipython.org/urls/raw.githubusercontent.com/myGrid/DataHackLeiden/alan/Player_example.ipynb?create=1
Workflows: sharing and reporting
Open, citable workflows
[Scott Edmunds]
Integrative Framework: galaxyproject.org/
A Reproducibility Framework [Adapted from Freire, 2013]
Reporting dimension; Archive dimension
portability; variability / sameness; availability / open; description / intelligibility; preservation / packaging; versioning
gather dependencies; capture steps; track & keep results
Reporting dimension
Authoring: Exec. Papers; link docs to experiment (Sweave)
Provenance: track, version, replay (workflows, makefiles)
Service: Sci as a Service; integrative frameworks; read & run, co-location, no installation
Host: Open Store; descriptive read, White Box, archived record
Aggregated Assets Infrastructures
Sharing and interlinking multi-stewarded Methods, Models, Data…
Data
Model
Article
External Databases
http://www.seek4science.org
Metadata
http://www.isatools.org
made reproducible
[Pettifer, Attwood]
http://getutopia.com
A Reproducibility Framework [Adapted from Freire, 2013]
Reporting dimension; Archive dimension
portability; variability / sameness; availability / open; description / intelligibility; preservation / packaging; versioning
gather dependencies; capture steps; track & keep results
Archiving & Porting Dimension
Host: Open Store; descriptive read, White Box, archived record
Service: Sci as a Service; integrative frameworks; read & run, co-location, no installation
Preservation: recompute, limited installation, Black Box, byte execution
ReproZip: packaging, porting; White Box, installation, archived record
specialist codes, libraries, platforms, tools
services
(cloud) hosted services
commodity platforms
data collections, catalogues, software repositories
my data, my process, my codes
integrative frameworks
gateways
“let's copy the box that the internet is in”
Archive / Isolation
• Independent
• Self contained
• Single ownership
• Freehold
• Fixed
• Self described
Active / Ecosystem
• Dependent
• Distributed
• Multi-ownership
• Tenancy
• Changeable / variable
• Multi-described
Closed codes/services, method obscurity, manual steps
Joppa et al., Science 340, 2013; Morin et al., Science 336, 2012
Mitigate, Detect, Repair
Zhao, Gomez-Perez, Belhajjame, Klyne, Garcia-Cuesta, Garrido, Hettne, Roos, De Roure and Goble. Why workflows break - Understanding and combating decay in Taverna workflows, 8th Intl Conf e-Science 2012
The Reproducibility Window: all experiments become less reproducible over time
• the how, why and what
• plan to preserve
• prepare to repair
• description persists
• common frameworks
• partial replication
• approximate reproduction
• verification
• benchmarks for codes
Reproducibility by Invocation: Run It
Reproducibility by Inspection: Read It
The Reproducibility Window: the explicit documentation of designed-in and anticipated variation
Reproducibility = Hard Work
Data sets and analyses, linked to each other via DOIs
Open-Paper: DOI:10.1186/2047-217X-1-18, >11,000 accesses
Open-Review: 8 reviewers tested data in FTP server & named reports published
Open-Code: code in SourceForge under GPLv3: http://soapdenovo2.sourceforge.net/, >5,000 downloads; enabled the code to be picked apart by bloggers in a wiki: http://homolog.us/wiki/index.php?title=SOAPdenovo2
Open-Pipelines / Open-Workflows and Open-Data: DOI:10.5524/100044, DOI:10.5524/100038; 78GB CC0 data
[Scott Edmunds]
Design
Execution
Result Analysis
Collection
Publish
Peer Review
Peer Reuse
Prediction
* Adapted from Mesirov, J. Accessible Reproducible Research Science 327(5964), 415-416 (2010)
Reproducible Research Environment
Integrated infrastructure for producing and working with reproducible research.
Reproducible Research Publication Environment Distributing and reviewing; credit; licensing etc.
From make reproducible to born reproducible
Software sustainability
Software practices
Software deposition
Long term access to software
Credit for software
Software Journals
Licensing
Open Source Software
Best Practices for Scientific Computing http://arxiv.org/abs/1210.0530
Stodden, Reproducible Research Standard, Intl J Comm Law & Policy, 13, 2009
From make reproducible to born reproducible
From make reproducible to born reproducible
The Neylon Equation
Process = (Interest / Friction) × (number of people reached)
(a back-of-envelope illustration follows below)
Cameron Neylon, BOSC 2013, http://cameronneylon.net/
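One reading of that relation, with invented numbers, is sketched below; it only illustrates the design point that lowering friction raises uptake as effectively as raising interest does.

```python
# Hypothetical illustration of the slide's relation, read as:
#   uptake ∝ (interest / friction) × number of people reached
# The numbers are invented purely to show why lowering friction matters.
def uptake(interest: float, friction: float, reach: int) -> float:
    return (interest / friction) * reach

before = uptake(interest=0.2, friction=10.0, reach=1000)   # clunky tooling
after  = uptake(interest=0.2, friction=2.0,  reach=1000)   # same interest, less friction

print(before, after)   # 20.0 vs 100.0: a 5x drop in friction gives 5x the uptake
```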
From make reproducible to born reproducible
productivity: personal
reproducibility: public side effect
From make reproducible to born reproducible
ramps
Research is like software. Release research.
Jennifer Schopf, Treating Data Like Software: A Case for Production Quality Data, JCDL 2012
From make reproducible to born reproducible
Research Objects
• Bundle and relate multi-hosted digital resources of a scientific experiment or investigation using standard mechanisms
• Exchange and releasing paradigm for publishing
http://www.researchobject.org/
Research Objects for…
Preservation, Archiving
Exchange & Communication, Release-based Publishing, Credit, Recombination/Remix, Reproducibility, Computation Training
identification
aggregation
annotation
dependencies
provenance
checklists
versioning
RO Core Conventions, encoded using standards:
Minim Information Model Ontology
W3C PROV, PAV, VoID
Git
OAI-ORE
W3C OAM
DOI, ORCID, PURL
(a toy manifest sketch follows below)
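Purely as an illustrative sketch, and not the official Research Object serialization, a toy manifest combining the ideas above (identification, aggregation, annotation, provenance, versioning) might look like this; every URI, DOI and the ORCID below is a placeholder.

```python
# Hypothetical sketch only: a toy "research object" manifest that aggregates the
# parts of a study and annotates them with identifiers and provenance pointers.
# It borrows the vocabulary of the standards named above (ORE-style aggregation,
# PROV-style attribution, DOIs/ORCIDs) but is NOT the official RO format.
import json

research_object = {
    "id": "https://example.org/ro/my-study",                # identification
    "aggregates": [                                         # aggregation (ORE-style)
        {"uri": "https://example.org/data/dataset-001", "type": "data"},
        {"uri": "https://example.org/workflows/analysis.t2flow", "type": "workflow"},
        {"uri": "https://example.org/code/plots.py", "type": "code"},
    ],
    "annotations": [                                        # annotation
        {"about": "https://example.org/workflows/analysis.t2flow",
         "creator": "https://orcid.org/0000-0000-0000-0000",  # placeholder ORCID
         "body": "Workflow that regenerates Figure 2 from the raw data."},
    ],
    "provenance": {                                         # provenance (PROV-style pointer)
        "wasAttributedTo": "https://orcid.org/0000-0000-0000-0000",
        "generatedAtTime": "2014-03-26T00:00:00Z",
    },
    "version": "1.0.0",                                     # versioning
}

print(json.dumps(research_object, indent=2))
```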
RO Extensions
Asset type: code, workflows, data, experiments
Discipline: biology, astronomy, NGS, SysBio, Mass Spec
Howard Ratner, Chair STM Future Labs Committee, CEO EVP Nature Publishing Group, Director of Development for CHORUS
(Clearinghouse for the Open Research of US) STM Innovations Seminar 2012
http://www.youtube.com/watch?v=p-W4iLjLTrQ&list=PLC44A300051D052E5
Victoria Stodden, AMP 2011 http://www.stodden.net/AMP2011/; Special Issue Reproducible Research, Computing in Science and Engineering, July/August 2012, 14(4); Howison and Herbsleb (2013) "Incentives and Integration In Scientific Software Production" CSCW 2013.
http://sciencecodemanifesto.org/
http://matt.might.net/articles/crapl/
Technical stuff is the easy stuff
Social Matters
Organisation
Metrics
Culture
Process
[Adapted, Daron Green]
meta-manifesto
• all X should be available and assessable forever
• the copyright of X should be clear
• X should have citable, versioned identifiers
• researchers using X should visibly credit X's creators
• credit should be assessable and count in all assessments
• X should be curated, available, linked to all necessary materials, and intelligible
• reproducibility spectrum
• descriptive reproducibility
• papers -> research objects
• make reproducible -> born reproducible
• ramp up tools -> working practice
• adapt and train -> researchers
• cost & responsibility -> transparent, accountable and collective
• dominants -> society, culture and policy
• take action, be imperfect
• myGrid – http://www.mygrid.org.uk
• Taverna – http://www.taverna.org.uk
• myExperiment – http://www.myexperiment.org
• BioCatalogue – http://www.biocatalogue.org
• Biodiversity Catalogue – http://www.biodiversitycatalogue.org
• Seek – http://www.seek4science.org
• Rightfield – http://www.rightfield.org.uk
• Open PHACTS – http://www.openphacts.org
• Wf4ever – http://www.wf4ever-project.org
• Software Sustainability Institute – http://www.software.ac.uk
• BioVeL – http://www.biovel.eu
• Force11 – http://www.force11.org
Acknowledgements: David De Roure, Tim Clark, Sean Bechhofer, Robert Stevens, Christine Borgman, Victoria Stodden, Marco Roos, Jose Enrique Ruiz del Mazo, Oscar Corcho, Ian Cottam, Steve Pettifer, Magnus Rattray, Chris Evelo, Katy Wolstencroft, Robin Williams, Pinar Alper, C. Titus Brown, Greg Wilson, Kristian Garza
The Wf4ever, SysMO, BioVel, UTOPIA and myGrid teams
Juliana Freire, Jill Mesirov, Simon Cockell, Paolo Missier, Paul Watson, Gerhard Klimeck, Matthias Obst, Jun Zhao, Pinar Alper, Daniel Garijo, Yolanda Gil, James Taylor, Alex Pico, Sean Eddy, Cameron Neylon, Barend Mons, Kristina Hettne, Stian Soiland-Reyes, Rebecca Lawrence