coming to terms with the biomedical tower of...

37
1 CENTER FOR Biomedical Informatics Coming to Terms with the Biomedical Tower of Babel Implications for the design of a biomedical knowledge network Alexa T. McCray Center for Biomedical Informatics Harvard Medical School Board on Research Data and Information National Academy of Sciences February 26, 2013

Upload: dinhthu

Post on 10-May-2018

219 views

Category:

Documents


1 download

TRANSCRIPT

1 CENTER FOR

Biomedical Informatics

Coming to Terms with the Biomedical Tower of Babel

Implications for the design of a biomedical knowledge network

Alexa T. McCray

Center for Biomedical Informatics

Harvard Medical School

Board on Research Data and Information

National Academy of Sciences

February 26, 2013

2 CENTER FOR

Biomedical Informatics

“No aspect of human life has escaped the

impact of the Information Age, and

perhaps in no area of life is information

more critical than in health and medicine.”

U.S. National Academy of Engineering

Grand Challenges 2008

http://www.engineeringchallenges.org

3 CENTER FOR

Biomedical Informatics

1990-2003 Human Genome Project

2004-2010

2011-2020

Beyond 2020

Green ED. Nature 2011; 470(10):203-13

4 CENTER FOR

Biomedical Informatics

1990-2003 Human Genome Project

2004-2010

2011-2020

Beyond 2020

Green ED. Nature 2011; 470(10):203-13

Understanding the structure of genomes

5 CENTER FOR

Biomedical Informatics

1990-2003 Human Genome Project

2004-2010

2011-2020

Beyond 2020

Green ED. Nature 2011; 470(10):203-13

Understanding the biology of genomes

Understanding the biology of disease

6 CENTER FOR

Biomedical Informatics

1990-2003 Human Genome Project

2004-2010

2011-2020

Beyond 2020

Green ED. Nature 2011; 470(10):203-13

Advancing the science of medicine

7 CENTER FOR

Biomedical Informatics

1990-2003 Human Genome Project

2004-2010

2011-2020

Beyond 2020

Green ED. Nature 2011; 470(10):203-13

Improving the effectiveness of healthcare

8 CENTER FOR

Biomedical Informatics

An Inflection Point

National Research Council, 2011:9.

Toward Precision Medicine: Building a Knowledge Network

for Biomedical Research and a New Taxonomy of Disease

9 CENTER FOR

Biomedical Informatics

Precision Medicine

Toward Precision Medicine National Research Council, 2011

10 CENTER FOR

Biomedical Informatics

Precision Medicine

Toward Precision Medicine National Research Council, 2011

11 CENTER FOR

Biomedical Informatics

Precision Medicine

Toward Precision Medicine National Research Council, 2011

12 CENTER FOR

Biomedical Informatics

Precision Medicine

Toward Precision Medicine National Research Council, 2011

13 CENTER FOR

Biomedical Informatics

Precision Medicine

Toward Precision Medicine National Research Council, 2011

14 CENTER FOR

Biomedical Informatics

A New Taxonomy of Disease

“Could it be that something as fundamental as our current system for

classifying diseases is actually inhibiting progress?”

Toward Precision Medicine.

National Research Council, 2011:10

Symptomology New

Taxonomic

Classification

15 CENTER FOR

Biomedical Informatics

Wide Range of Data Sources Needed

Public Databases

Clinical Systems

Social Networking Sites

Laboratory Data

Biosensor Data

All have their own

Some use home-grown

terminologies

Syntax

Semantics

Some use standard

terminologies/ontologies

16 CENTER FOR

Biomedical Informatics

Terminology Use

Coding clinical data

Indexing and retrieving the literature

Annotating genomic data

Statistical reporting, epidemiologic studies

Outcomes measurement

Public health surveillance

Cost analysis

Information exchange and data integration

Data mining, aggregation

Natural language processing

17 CENTER FOR

Biomedical Informatics

Common Terminologies Varying scope, coverage,

rigor, and update schedules

18 CENTER FOR

Biomedical Informatics

How do we come to terms with this Biomedical Tower of Babel?

Unified Medical Language System (UMLS) is one large-

scale effort that has made progress toward facilitating

semantic interoperability of biomedical data

Recognize that a variety of communities of practice exist

Encourage consistency within those communities of practice

Discourage development of “redundant” terminologies/ontologies

Map terminologies to each other for maximum semantic interoperability

Recognize the value of curating data with standard terminologies

Develop robust NLP tools that take advantage of existing terminologies

19 CENTER FOR

Biomedical Informatics

Integrates > 100 existing terminologies/ontologies

UMLS

Includes a higher level ontology

Includes natural language processing tools and resources

Metathesaurus

UMLS Knowledge Sources

Semantic Network

SPECIALIST NLP Tools

20 CENTER FOR

Biomedical Informatics

UMLS Metathesaurus Mapping Example

21 CENTER FOR

Biomedical Informatics

UMLS Two-Level Structure

22 CENTER FOR

Biomedical Informatics

23 CENTER FOR

Biomedical Informatics

24 CENTER FOR

Biomedical Informatics

Terminologies in Use

ClinicalTrials.gov

Harvard Catalyst

Correlating Phenotype & Genotype in Autism Research

25 CENTER FOR

Biomedical Informatics

26 CENTER FOR

Biomedical Informatics

27 CENTER FOR

Biomedical Informatics

28 CENTER FOR

Biomedical Informatics

29 CENTER FOR

Biomedical Informatics

30 CENTER FOR

Biomedical Informatics

31 CENTER FOR

Biomedical Informatics

32 CENTER FOR

Biomedical Informatics

33 CENTER FOR

Biomedical Informatics

34 CENTER FOR

Biomedical Informatics

35 CENTER FOR

Biomedical Informatics

Phenotypic and Genetic Factors in Autism Spectrum Disorders

Correlating Phenotype & Genotype

Aligned and integrated terminology represented in phenotypic instruments

~ 500 families participated

Data collected

Biological samples

Phenotypic data from 25 behavioral instruments

Developed autism-specific ontology

Linked to UMLS whenever possible

Ontology will be openly available

36 CENTER FOR

Biomedical Informatics

Autism Ontology

Scope

Use

~5,000 questions from 25 instruments mapped into ~300 concepts

Three high-level groupings

Personal Traits

Social Competence

Medical History

Concept-based search of Autism Consortium phenotype data

Query, aggregate, and integrate data for hypothesis-driven research

Genotypic data correlated with phenotypic characteristics

37 CENTER FOR

Biomedical Informatics

Concluding Remarks

Precision Medicine is a Grand Challenge that is worth pursuing.

Its realization will involve addressing many social, legal, ethical,

economic, and technical issues.

If we put our minds to it, the technical issues can be among the first to

be solved, and may even show the way to the resolution of some of the

even thornier issues.

Semantic interoperability is one technical issue that can be solved.