bot strata uk 2012-10-02
DESCRIPTION
Brian Bot, Oct 2, 2012. Strata Conference, London, UK
TRANSCRIPT
dragging scientific communication into the information age
Brian M. Bot | Senior Scientist | Sage Bionetworks
clearScience
Deception at Duke
“I will help you. Trust me.” - Anil Potti to Juliet Jacobs (pictured above)
open
open source
open data
open access
access2research
accessible
clear
research scandals represent merely the extreme of a continuum in the culture of academic research
the status quo tolerates poor communication of findings
6%
21%
8%
11%
54% cannot reproduce
can reproduce in principle
can reproduce w/discrepancies
can reproduce from processed data w/discrepancies
can reproduce partially
Ioannidis J.P.A. et al. Repeatability of published microarray gene expression analyses. Nature Genetics 41, 149-155 (2009) | doi:10.1038/ng.295
208,294,724 data points
124 pages supplemental material
?? lines unobtainable source code
?? version or architecture of statistical analysis program (R)
enumerable R packages and package dependencies
key R package “ClaNC” no longer available
442 citations
often what is reproducible in principle is not reproducible in practice
unidentified publication
‣ from journal with 5 year impact factor of 28
‣ article freely available for download
‣ data freely available for download
how are we to move science forward
if we cannot understand what was done previously?
Nature 483, 509 (29 March 2012) | doi:10.1038/483509a
let’s go back to basics
scientific method
1. define a question
2. gather information and resources (background research)
3. form a hypothesis
4. test hypothesis experimentally
5. analyze experimental data
6. draw conclusions based on data
7. publish results
8. retest (frequently done by other scientists)
7. publish results
finitein
∞...
experimentally generate data @ the bench or from a clinical cohort
store on local server
analyze on local machine
write a document
submit to journal
sent to reviewers as pdf
accepted & digitally typeset
static pdf representation
static html representation
printed on paper
scientific claims are being artificially uncoupled from science itself
science is hard
communication is hard
(especially for scientists)
getting harder in an era of ‘BIG DATA’
clearScience
re-imagining scientific communication
allow consumption of content at a variety of levels of complexity and abstraction
leverage (open) RESTful APIs
allow users to reassemble an entire analysis environment
clearScience
RESTful APIs
analysis environment
‣ hardware
‣ software
‣ data
‣ code
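One way to picture the four parts of an analysis environment is as a manifest recorded alongside a result. The sketch below is only an illustration in Python, not the clearScience implementation: the function names `sha256_of` and `capture_environment`, the manifest layout, and the example file names are all hypothetical. It records the machine type, the interpreter and installed package versions, and checksums of the data and code that were used.

```python
import hashlib
import json
import platform
from importlib import metadata


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of a byte string (e.g. a file's contents)."""
    return hashlib.sha256(data).hexdigest()


def capture_environment(data_blobs, code_blobs):
    """Build a manifest with one entry per part: hardware, software, data, code.

    data_blobs and code_blobs map a (hypothetical) file name to its raw bytes.
    """
    packages = sorted(
        f"{dist.metadata['Name']}=={dist.version}"
        for dist in metadata.distributions()
    )
    return {
        "hardware": {"machine": platform.machine(), "system": platform.system()},
        "software": {"python": platform.python_version(), "packages": packages},
        "data": {name: sha256_of(blob) for name, blob in data_blobs.items()},
        "code": {name: sha256_of(blob) for name, blob in code_blobs.items()},
    }


# Example with made-up in-memory content standing in for real files.
manifest = capture_environment(
    data_blobs={"expression.csv": b"gene,value\nTP53,1.0\n"},
    code_blobs={"analysis.py": b"print('hello')\n"},
)
print(json.dumps(manifest, indent=2))
```

A manifest like this does not rebuild the environment by itself, but it would answer the "?? version" and "?? lines" questions that made the earlier publication hard to reproduce in practice.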
scientific communication needs to evolve along with science
make it easy to do good science
make it easy to do clearScience
Acknowledgements
Sage Bionetworks
David Burdick - Rockstar Engineer
Stephen Friend - President and CEO
Erich S. Huang - Director of Cancer Research
Mike Kellen - Director of Technology
External Partners
Myles Axton - Nature Genetics
Phil Bourne - PLoS Computational Biology
Josh Greenberg - Alfred P. Sloan Foundation
Kelly LaMarco - Science Translational Medicine
Ian Mulvany - eLife Sciences
Eric Schadt - Open Network Biology