www.dreamchallenges.org. a crowdsourcing effort that poses questions (challenges) about biology,...


Upload: jessie-powell

Post on 25-Dec-2015


TRANSCRIPT

Page 1: Www.dreamchallenges.org. A crowdsourcing effort that poses questions (Challenges) about biology, modeling and data analysis: – Transcriptional networks

www.dreamchallenges.org

Page 2:

DREAM: What is it?

DIALOGUE FOR REVERSE ENGINEERING ASSESSMENT AND METHODS

• A crowdsourcing effort that poses questions (Challenges) about biology, modeling and data analysis:
– Transcriptional networks
– Signaling networks
– Predictions of response to perturbations
– Translational research

Page 3:

Benefits of crowd-sourcing

1. Performance Evaluation
– Unbiased, consistent, and rigorous method assessment
– Discover the best methods
– Determine the solvability of a scientific question

2. Sampling of the space of methods
– Understand the diversity of methodologies presently being used to solve a problem

Page 4:

Benefits of crowd-sourcing, cont’d

3. Acceleration of Research
– The community of participants can do in 4 months what would take any single group 10 years

4. Community Building
– Make high-quality, well-annotated data accessible.
– Foster community collaborations on fundamental research questions.
– Determine robust solutions through community consensus: “The Wisdom of the Crowds.”
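One simple way to picture "community consensus" is to average, for each item, the rank it receives across all team submissions. The slides do not say how DREAM aggregates submissions, so the sketch below is only an illustration under that assumption; the team names, scores, and function names are hypothetical.

```python
from statistics import mean


def rank_values(scores):
    """Return 1-based ranks (highest score gets rank 1); ties share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the window over all entries tied with the current score.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg_rank
        pos = end + 1
    return ranks


def consensus_ranks(submissions):
    """Average each item's rank across all team submissions (lower = stronger consensus)."""
    per_team = [rank_values(s) for s in submissions.values()]
    n_items = len(per_team[0])
    return [mean(team[i] for team in per_team) for i in range(n_items)]


# Hypothetical prediction scores from three teams for the same four items.
submissions = {
    "team_a": [0.9, 0.2, 0.5, 0.1],
    "team_b": [0.8, 0.3, 0.6, 0.2],
    "team_c": [0.4, 0.9, 0.7, 0.1],
}
consensus = consensus_ranks(submissions)
print(consensus)
```

Averaging ranks rather than raw scores keeps teams with different score scales comparable, which is one reason rank-based aggregation is a common starting point.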

Page 5:

DREAM Challenges: Building communities of data experts since 2006

• Six Years of DREAM Challenge Seasons
– 34 DREAM Challenges opened
– More than 500 team submissions
– 1,000 cumulative conference attendees
– 60 papers written using DREAM Challenges, two edited books and a Special paper in PLoS One
– Community email list includes > 7,000 participants

Page 6:

How Sage/DREAM Nurtures Challenge Communities

• Challenge webinars for live interaction between participants and organizers

• Community forums where participants can learn from each other

• Leaderboards on Synapse to motivate continuous participation

• Incentives to code-share: evolving models never before possible (machine learning + clinical insights)

• Annual DREAM Conference to celebrate and discuss Challenge outcomes

DREAM Challenge Leaderboard

Page 7:

Structure of a Challenge

Page 8:

Synapse and DREAM Challenges

• Cloud-based (Amazon)

• IRB-approved data repository

• Central hub for all DREAM Challenges

• Registration and messaging services

• Real-time Challenge leaderboards

• Provenance tools for data reproducibility

• Living archive of DREAM methods and winning source code

… beyond a data repository …

Page 9:

CASE STUDY: Breast Cancer Prognosis Challenge

Goal: use crowdsourcing to forge a computational model that accurately predicts breast cancer survival

• Training data set: genomic and clinical data from 2000 women with breast cancer

• Data access and analysis tools: Synapse

• Compute resources: each participant provided with a standardized virtual machine donated by Google

• Model scoring: models submitted to Synapse for scoring on a real-time leaderboard
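The scoring step can be sketched as a small leaderboard that ranks teams by how well their predicted risks order patients by survival. The slides do not name the metric, so the concordance index used here is an assumption, and the function names, team names, and patient data are all hypothetical. This is a toy illustration, not the Synapse scoring code.

```python
def concordance_index(times, events, risks):
    """Fraction of comparable patient pairs ordered correctly by predicted risk.

    A pair is comparable when the patient with the shorter follow-up time
    had an observed event (events[i] is True). Tied risks count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def leaderboard(times, events, submissions):
    """Score every team's risk predictions and sort from best to worst."""
    scores = {
        team: concordance_index(times, events, risks)
        for team, risks in submissions.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical held-out data: follow-up time in years and whether death was observed.
times = [2, 5, 7, 10]
events = [True, True, False, True]

# Hypothetical predicted risk scores (higher = shorter expected survival).
submissions = {
    "team_a": [0.9, 0.6, 0.4, 0.1],
    "team_b": [0.1, 0.6, 0.4, 0.9],
}
board = leaderboard(times, events, submissions)
for team, score in board:
    print(f"{team}: {score:.2f}")
```

Because the score depends only on the ordering of predicted risks, teams can submit risk scores on any scale and still be compared on the same leaderboard.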

Page 10:

Unique Attributes of the Sage Bionetworks/DREAM Breast Cancer Prognosis Challenge

Open source with code-sharing:
– Synapse’s computational infrastructure enables participants to use code submitted by others in their own model building
– Winning code must be reproducible

New dataset for validation of winning model:
– Derived from approx. 200 breast cancer samples
– Data generation funded by Avon
– Winning model: the one that, having been trained using METABRIC data, is most accurate for survival prediction when applied to a brand new dataset

Challenge-assisted peer review:
– Overall winner submitted a pre-accepted article to Science Translational Medicine

Page 11:

[Figure: DREAM Participation. Chart of Number of Teams (axis 0 to 400) for DREAM 2, DREAM 3, DREAM 4, DREAM 5, DREAM 6, DREAM 7, DREAM 8 and DREAM 8.5+9.]

Challenges: DREAM 8.5 + 9

      | Registered Users | Leaderboard | Forum Entries | Unique Submissions | Unique Teams
Total | 1,780 | 11,459 | 669 | 368 | 159

2014: DREAM Challenge participation continues to increase

Page 12:

2015 DREAM9.5 and 10 Challenges… So Far

Page 13:

WHAT WILL HAPPEN BEYOND 2015

Challenges with clinical impact

Ensemble methods that make use of best submissions to be tested in the clinic (grant under review)

Digital Mammography Challenge: Reduce the false negative rate in mammography screening

Modeling and simulation based Challenges??

Page 14:

Acknowledgements

Sage Bionetworks: Stephen Friend, Thea Norman, Andrew Trister, Lara Mangravite, Mike Kellen, Mette Peters, Arno Klein, Solly Sieberts, Abhi Pratap, Chris Bare, Bruce Hoff

IBM: Erhan Bilal, Kely Norel, Elise Blaese, Pablo Meyer Rojas, Kahn Rhrissorrakrai

EBI: Julio Saez Rodriguez, Thomas Cokelaer, Federica Eduati, Michael Menden

L. Maximilians University: Robert Kueffner

Univ Colorado, Denver: Jim Costello

OHSU: Joe Gray, Adam Margolin, Mehmet Gonen, Laura Heiser

Prize4Life: Melanie Leitner, Neta Zach

NCI: Dinah Singer, Dan Gallahan

ISMMS: Eli Stahl, Gaurav Pandey

Columbia University: Andrea Califano, Mukesh Bansal, Chuck Karan

Rice University: Amina Qutub, David Noren, Byron Long

MD Anderson: Steven Kornblau

Broad Institute: Bill Hahn, Barbara Weir, Aviad Tsherniak

Merck: Robert Plenge

BYU: Keoni Kauwe

OICR: Paul Boutros

UCSC: Josh Stuart