TRANSCRIPT
Computational Grand Challenges for 21st Century
Biomedical Science
Daniel Masys, MD
Affiliate Professor, Biomedical and Health Informatics
University of Washington, Seattle, WA
November 8, 2012, NCBC Showcase Meeting
Computational Grand Challenges
Topics
• Big data in perspective
• Turning the promise of advanced computation into reality
• The road ahead
Characteristics of “Big Data”
• Exceeds the capacity of unaided human cognition for its comprehension
• Strains current technology capacity in one or more ways:
  – CPU-bound: computational and algorithmic complexity
  – Bandwidth-limited: network communication capacity
  – Storage-limited: voluminous bits & bytes
NIH and successive eras of “Big Data”
1960s: Electronic Medical Records
• Sparse matrix data needing compact storage and rapid retrieval
• NIH support of the MGH Laboratory of Computer Science (Octo Barnett) leads to the MUMPS programming language
The Hospital Computer Project’s time-shared DEC PDP-1D. It featured a 50-Mbyte specially built Fastrand drum for storing medical data files and 64 simultaneously usable telecommunication ports, many of which were connected to Teletype terminals operating at the Massachusetts General Hospital in 1966. (Photo courtesy of BBN Technologies.)
NIH and successive eras of “Big Data”
1970s: Artificial Intelligence
• Pattern detection in high-volume complex datasets
• Rule-based expert systems emerge
• NIH support of the SUMEX-AIM resource at Stanford leads to DENDRAL, MYCIN, ONCOCIN, and Protégé
Joshua Lederberg
Ed Feigenbaum
Ted Shortliffe
NIH and successive eras of “Big Data”
1980s: Data mining and molecular complexity
• “Tower of Babel” proliferation of molecular resources leads to the creation of NCBI at NLM
• Bill Raub champions the PROPHET workstation
NIH and successive eras of “Big Data”
1990s: Scalar-vector and massively parallel computing technologies come of age; ubiquitous bandwidth arrives
• NIH joins the Federal High Performance Computing, Communications, and Information Technology (HPCCIT) program
• The Human Genome Project becomes the poster child for molecular volume and complexity; high-throughput “omics” technologies arrive
NIH and successive eras of “Big Data”
Biomedical research in transition: the Biomedical Information Science and Technology Initiative (BISTI) report of 1999
• Bottleneck is no longer data production; it is now data analysis
• Biology as an information science
• A new breed of 21st-century scientists
• Emerging science careers at the intersection of biology, health, informatics, computer science, and quantitative methods
• A vision for interdisciplinary Centers: the NCBCs
The road ahead: Computational Grand Challenges in Biomedical Science
• Computationally tractable
• Hard but not impossible
• Evidence from current technologies or applications that similar problems have been at least partially solved
The road ahead: Grand challenge examples
• Molecular structure-function prediction
• Biomedical imaging
• Simulation
• A systems infrastructure for evidence-driven ‘individualized healthcare’
Molecular structure-function prediction
• Perennial holy grail (and an informatics Nobel Prize): solving the protein folding problem
• Deciphering the noncoding but biologically active genome
• Epigenomics: how do 25K genes make 400K proteins?
Spectrum of “NIH-relevant” imaging
Source: Jim Clark, UW Dept. Biol. Structure
Image Analysis & Interpretation
• Image segmentation: automated detection of boundaries of objects of interest within single images and related sets of images
• Image semantics: automated linkage of volumes of interest to knowledge about those biological objects and processes
• Image quantitation: volumetric change of objects of interest over time
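As an illustration of the segmentation and quantitation tasks (not part of the talk), here is a minimal threshold-and-label sketch in pure Python; the synthetic image, the threshold value, and the choice of 4-connectivity are all assumptions made for the example:

```python
from collections import deque

def segment(image, threshold):
    """Threshold a 2-D image (list of lists of floats), then label
    4-connected foreground components. Returns (labels, sizes) where
    sizes maps each component label to its pixel count (quantitation)."""
    h, w = len(image), len(image[0])
    labels = [[0] * w for _ in range(h)]
    sizes = {}
    next_label = 0
    for y in range(h):
        for x in range(w):
            if image[y][x] > threshold and labels[y][x] == 0:
                next_label += 1
                labels[y][x] = next_label
                size, queue = 0, deque([(y, x)])
                while queue:                       # breadth-first flood fill
                    cy, cx = queue.popleft()
                    size += 1
                    for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and image[ny][nx] > threshold
                                and labels[ny][nx] == 0):
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
                sizes[next_label] = size
    return labels, sizes

# Tiny synthetic "image": two bright objects on a dark background.
img = [[0.0] * 8 for _ in range(8)]
for y in range(1, 3):
    for x in range(1, 4):
        img[y][x] = 1.0          # object A: 2 x 3 = 6 pixels
for y in range(5, 7):
    for x in range(5, 7):
        img[y][x] = 1.0          # object B: 2 x 2 = 4 pixels

labels, sizes = segment(img, 0.5)
print(len(sizes), sorted(sizes.values()))   # 2 [4, 6]
```

Real biomedical segmentation is vastly harder (noise, anisotropic voxels, touching structures), but repeating this measurement across a time series of registered images is the essence of the quantitation task above.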
Simulation
• In silico cells, tissues, organisms: a true computable systems biology
• Assembling a complete structural and functional model of the human body, at many levels between molecular and whole organism (e.g., organelle and cell assembly, tissues, organs), linking structure to functions and processes when known
Simulation, cont’d
• Change over time: modeling the structural and biochemical processes of aging
• Simulating, from “molecular first principles”, common diseases in a continuum from molecular changes to visible clinical manifestations
Meningococcal rash
Computational infrastructure for 21st century healthcare and research
• Development of a robust, ubiquitous, interoperable electronic infrastructure for the appropriate linking of person-specific health data
  – that protects confidentiality
  – that is responsive to the needs of patients, providers, payers, and other healthcare-related organizations
  – that supports discovery science based on common and rare variant molecular patterns and corresponding health states
Computational infrastructure for 21st century healthcare
• Universal standards for the content of Electronic Health Records that support human interpretation, computer-based reasoning and course guidance, and translational science
[Chart: “Facts per Decision” on a log scale (10 to 1000), rising from 1990 to 2020 and exceeding a flat “Human Cognitive Capacity” line.]
The need for systems-level approaches to clinical decision support for “personalized medicine”
[Diagram labels:]
• Structural genetics: e.g., SNPs, haplotypes
• Functional genetics: gene expression profiles
• Proteomics and other effector molecules
• Decisions by clinical phenotype, i.e., traditional health care
Be diligent. Read two journal articles every night. At the end of a year you will be 1275 years behind the literature.
Assume only 1% of the new literature is relevant to what any individual care provider does. At the end of a year you are 12 years behind.
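The arithmetic behind these two figures is internally consistent; a quick check (the annual publication count below is inferred from the slide’s numbers, not stated in the talk):

```python
# Reading rate: two journal articles every night.
read_per_year = 2 * 365                     # 730 articles/year

# "1275 years behind" after one year of reading implies this
# yearly publication volume (an inference, not a quoted figure):
published_per_year = 1275 * read_per_year
print(published_per_year)                   # 930750 new articles/year

# If only 1% of new literature is relevant to one provider:
relevant_per_year = 0.01 * published_per_year
print(relevant_per_year / read_per_year)    # 12.75 -> the slide's "12 years"
```

So even under the generous 1% relevance assumption, one reader falls more than a decade behind every year, which is the talk’s case for systems-level decision support.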
Getting There
The functional and structural appeal of Centers as an engine of innovation and problem solving. Communities of scholars, users, builders, evaluators:
• Who advance the state of the art in computational methods
• Who build and/or ‘harden’ and distribute tools that bridge the gap between advanced computational techniques and the needs and capabilities of ‘rank-and-file’ scientists
Getting There, cont’d
• Centers as communities of scholars, users, builders, evaluators:
  – Have interdisciplinary critical mass
  – Can take advantage of economies of scale
  – Develop and maintain software frameworks that make tools interoperable
  – Know how to avoid ‘reinventing wheels’
  – Are focal points for training and education for this and future generations
The elevated creative bandwidth of face-to-face brainstorming over a cup of coffee…