clinical and translational informatics overview ppayne...

INSTITUTE FOR INFORMATICS | WASHINGTON UNIVERSITY SCHOOL OF MEDICINE From Lab to Laptop and Beyond: Clinical and Translational Research Informatics Philip R.O. Payne, PhD, FACMI Robert J. Terry Professor and Director, Institute for Informatics Washington University School of Medicine Professor of Computer Science and Engineering Washington University School of Engineering and Applied Science


Post on 24-May-2020


Page 1: Clinical and Translational Informatics Overview PPayne 10-2017genetics.wustl.edu/ggdpathway/files/2016/08/Clinical_and... · 2017-10-02 · challenge for these two processes.1 Many

INSTITUTE FOR INFORMATICS | WASHINGTON UNIVERSITY SCHOOL OF MEDICINE

From Lab to Laptop and Beyond: Clinical and Translational Research Informatics

Philip R.O. Payne, PhD, FACMI
Robert J. Terry Professor and Director, Institute for Informatics
Washington University School of Medicine
Professor of Computer Science and Engineering
Washington University School of Engineering and Applied Science

Page 2

Learning Objectives

1) Become familiar with the fields of Clinical Informatics (CI), Translational Bioinformatics (TBI), and Clinical Research Informatics (CRI)

2) Understand how TBI and CRI can drive research and practice in the era of Precision Medicine

3) Identify open research opportunities in the TBI and CRI domains

Page 3

Readings

TBI-Focused:
• “Translational informatics: enabling high-throughput research paradigms” (Payne, Phys Genom, 2009)

CRI-Focused:
• “Clinical Research Informatics: Challenges, Opportunities and Definition for an Emerging Domain” (Embi, JAMIA, 2009)

Informatics and Precision Medicine:
• “An informatics research agenda to support precision medicine: seven key areas.” (Tenenbaum, JAMIA, 2016)

RECEIVED 31 August 2015
REVISED 25 November 2015
ACCEPTED 24 December 2015
PUBLISHED ONLINE FIRST 23 April 2016

An informatics research agenda to support precision medicine: seven key areas

Jessica D Tenenbaum,1 Paul Avillach,2 Marge Benham-Hutchins,3 Matthew K Breitenstein,4 Erin L Crowgey,5 Mark A Hoffman,6 Xia Jiang,7 Subha Madhavan,8 John E Mattison,9 Radhakrishnan Nagarajan,10 Bisakha Ray,11 Dmitriy Shin,12 Shyam Visweswaran,13 Zhongming Zhao,14 and Robert R Freimuth4

ABSTRACT

The recent announcement of the Precision Medicine Initiative by President Obama has brought precision medicine (PM) to the forefront for healthcare providers, researchers, regulators, innovators, and funders alike. As technologies continue to evolve and datasets grow in magnitude, a strong computational infrastructure will be essential to realize PM’s vision of improved healthcare derived from personal data. In addition, informatics research and innovation affords a tremendous opportunity to drive the science underlying PM. The informatics community must lead the development of technologies and methodologies that will increase the discovery and application of biomedical knowledge through close collaboration between researchers, clinicians, and patients. This perspective highlights seven key areas that are in need of further informatics research and innovation to support the realization of PM.

Keywords: precision medicine, informatics, biomarkers, data sharing

The recent announcement of the Precision Medicine (PM) Initiative by President Obama1 has brought PM to the forefront for healthcare providers, researchers, regulators, and funders alike. In order for PM to be fully realized, we must move toward a Learning Healthcare System model that extends evidence-based practice to practice-based evidence by using data generated through clinical care to inform research (Figure 1).2 The leadership and members of the American Medical Informatics Association Genomics and Translational Bioinformatics Working Group have identified seven key areas that informatics research should explore to enable PM’s vision.

PATIENTS: PAST, PRESENT, AND FUTURE

Stakeholders in the biomedical enterprise include researchers, providers, payers, and patients. But nearly everyone has been or will be a patient at some point. Patients thus are, and must remain, at the heart of the biomedical enterprise.

Key Area One: Facilitate Electronic Consent and Specimen Tracking

In the era of PM, research studies produce more data than they can possibly use and, paradoxically, would benefit from more data than they can possibly generate. As genomic sequencing becomes increasingly available, using de-identified biospecimens for research becomes more nuanced.3 Research participants may be asked to give broad consent to the future use of their data and biospecimens, and to acknowledge the possible, though unlikely, prospect of sequence-based re-identification.4,5 To maximize data and biospecimen reuse while protecting study participants’ privacy and adhering to their wishes, it is essential to develop machine-readable consent forms that enable electronic queries.6

As large biorepositories linked to electronic health records (EHRs) become more common, informatics will enable researchers to identify cohorts – both intra- and interinstitutionally – that meet their study criteria and have given the requisite consent. Proper local management of specimens and derived samples enables accurate tracking of chain of custody, sample derivations, processing/handling, and quality control – all of which are key elements of rigorous and reproducible research.7

Structured and electronically available consent forms can empower study participants by allowing them to access, review, and modify their preferences. A number of large-scale initiatives, including Sage Bionetworks, the Genetic Alliance, and the Global Alliance for Genomic Health, are making progress in this area.

Areas of informatics that can facilitate study participant consent and sample tracking include the development of structured consent forms and the adoption of relevant ontologies,6,8 user interface design, and infrastructure to enable participant engagement after the point of enrollment. Developing an infrastructure to perform role-based distributed queries over cohorts and sample collections, such as those provided by OpenSpecimen, the Shared Health Research Information Network (SHRINE), and PopMedNet, will also be important.9–11
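The idea of a machine-readable consent form that supports electronic queries can be illustrated with a small sketch. It is not taken from the article or from any of the systems named above: the `ConsentRecord` structure, the use labels such as `genomic_sequencing`, and the participant IDs are all hypothetical, chosen only to show how structured consent might be queried and later modified by a participant.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set


@dataclass
class ConsentRecord:
    """Hypothetical structured consent record for one study participant."""
    participant_id: str
    # Uses the participant has agreed to; the labels are illustrative only.
    permitted_uses: Set[str] = field(default_factory=set)


def consenting_participants(records: Iterable[ConsentRecord], required_use: str) -> List[str]:
    """Return the sorted IDs of participants whose consent covers required_use."""
    return sorted(r.participant_id for r in records if required_use in r.permitted_uses)


def withdraw(record: ConsentRecord, use: str) -> None:
    """Let a participant remove one permitted use after enrollment."""
    record.permitted_uses.discard(use)


records = [
    ConsentRecord("P001", {"genomic_sequencing", "data_sharing"}),
    ConsentRecord("P002", {"data_sharing"}),
    ConsentRecord("P003", {"genomic_sequencing"}),
]

# Electronic query: who has consented to sequencing?
print(consenting_participants(records, "genomic_sequencing"))

# A participant changes a preference, and the same query reflects it.
withdraw(records[2], "genomic_sequencing")
print(consenting_participants(records, "genomic_sequencing"))
```

A production system would presumably express the permitted uses with a shared ontology rather than free-text labels, which is the role the article assigns to structured consent forms and relevant ontologies.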

DATA TO KNOWLEDGE

The promise of PM can only be realized by aggregating (virtually or otherwise) and analyzing data from multiple sources. A recent report by the National Academy of Sciences calls for the development of an information commons (IC) that amasses medical, molecular, social, environmental, and health outcomes data for large numbers of individual patients.12 The IC would be continuously updated, enable data analyses, and serve as the foundation for a knowledge base (KB) (see Key Area Five). Creating an IC would require informatics expertise to develop data standards, ensure data security, standardize processing pipelines, and establish data provenance.

Correspondence to Jessica D Tenenbaum, Box 2721, Durham, NC 27710, USA; [email protected]; Tel: +1 (919) 684-7308; For numbered affiliations see end of article.

© The Author 2016. Published by Oxford University Press on behalf of the American Medical Informatics Association. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected]


Tenenbaum JD, et al. J Am Med Inform Assoc 2016;23:791–795. doi:10.1093/jamia/ocv213, Perspective


Page 4

Let’s Start with a Few Definitions…

Page 5

Defining Biomedical Informatics (BMI)

• “The field that is concerned with the optimal use of information, often aided by the use of technology and people (researchers, practitioners, users, etc.), to improve individual health, health care, public health, and biomedical research” (Bill Hersh, 2010)

Page 6

A Brief History of BMI: What’s in a Name?

• Basic Science
  • Standards and data representation
  • Knowledge engineering
  • Cognitive and decision science
  • Human factors and usability
  • Computational biology

• Applied Science
  • Clinical Decision Support Systems (CDSS)
  • Clinical Information Systems (incl. EHRs)
  • Consumer-facing tools (incl. PHRs)
  • Bio-molecular data analysis “pipelines”

• At the Intersection of Basic and Applied Science
  • Information Retrieval (IR)
  • Text Mining and Natural Language Processing (NLP)
  • Visualization
  • Image Analysis

An Evolving Nomenclature…
• AI in Medicine
• Computers in Medicine
• Medical Informatics
• Biomedical Informatics

Page 7

A “Working” Central Dogma for BMI: Enabling Translation and Systems Thinking

Data (+ Context) → Information (+ Application) → Knowledge

Data: 6.5%
Context: MRN = 1234; Test = HbA1c; Date = 6/15/2016
Application: Normal is < 5.7%; Send Alert to PCP via EHR
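The data, information, knowledge progression on this slide can be sketched in a few lines of code. This is a minimal illustration built from the slide's own example (6.5%, MRN 1234, HbA1c, 6/15/2016, normal < 5.7%); the `LabResult` class, the `interpret` function, and the rule table are hypothetical names, and the returned string only stands in for an actual EHR alert.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class LabResult:
    """Information: a raw value plus the context that makes it meaningful."""
    mrn: str
    test: str
    value_pct: float
    collected: date


# Knowledge: a rule applied to information. The threshold is the one on the slide.
NORMAL_UPPER_BOUND_PCT = {"HbA1c": 5.7}


def interpret(result: LabResult) -> Optional[str]:
    """Return an alert message if the result is at or above the normal bound."""
    limit = NORMAL_UPPER_BOUND_PCT.get(result.test)
    if limit is None or result.value_pct < limit:
        return None
    return (
        f"Send alert to PCP via EHR: MRN {result.mrn}, {result.test} = "
        f"{result.value_pct}% on {result.collected.isoformat()} (normal is < {limit}%)"
    )


datum = 6.5  # Data: a bare number with no meaning on its own
info = LabResult(mrn="1234", test="HbA1c", value_pct=datum, collected=date(2016, 6, 15))
print(interpret(info))
```

The bare number carries no meaning until the identifier, test name, and date are attached, and it only becomes actionable once a reference range and a response are applied to it.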

Page 8

Why Is Now the Time for Precision Medicine?

Page 9

Critical “Drivers” for Basic, Clinical and Translational Research in the Healthcare Information Age

[Figure labels: Data Generation at Scale; Discovery Science; TBI; CRI; Clinical Informatics; Evaluation; Overcoming the T1 Barrier; Overcoming the T2+ Barrier]

Page 10

T-What? Contemporary Barriers to Research

Source: “Practice-Based Research—‘Blue Highways’ on the NIH Roadmap”, John M. Westfall, MD, MPH; James Mold, MD, MPH; Lyle Fagnan, MD. JAMA. 2007;297(4):403-406. doi:10.1001/jama.297.4.403

Page 11

Translational Bioinformatics (TBI)

The development of storage, analytic, and interpretive methods to optimize the transformation of increasingly voluminous biomedical data into proactive, predictive, preventative, and participatory health. (Source: JAMIA)

[Figure labels: Bioinformatics; Health Informatics; Translational Bioinformatics; Molecules; Populations; Bench; Bedside; T1 Translational Barrier; Clinical Genomics; Genomic Medicine; Pharmacogenomics; Genetic Epidemiology]

Sarkar IN, Butte AJ, Lussier YA, Tarczy-Hornoch P, Ohno-Machado L. “Translational Bioinformatics: Linking Knowledge Across Biological and Clinical Realms” Journal of the American Medical Informatics Association. 2011. Jul-Aug;18(4):354-7.

Page 12

Clinical Research Informatics (CRI)

Clinical Research Informatics (CRI) is the sub-domain of biomedical informatics concerned with the development, application, and evaluation of theories, methods and systems to optimize the design and conduct of clinical research and the analysis, interpretation and dissemination of the information generated. (Source: JAMIA)

Page 13

Do We Need a New Approach to Research and Practice in Order to Accelerate Research and Deliver Precision Medicine?

6 Trends Driving This Evolving Paradigm

Page 14

1. EHR Adoption Is Pervasive

Page 15

2. Health Data Will Become Ubiquitous and Extend Beyond the Clinic and Hospital

Page 16

3. Genome Sequencing Will Become the Norm and No Different Than Any Other Lab Test

Page 17

4. Data-driven Approaches to Population Health Are Gaining Traction

Page 18

5. Technology-Driven Disruption of the Health and Life Sciences Is (Finally) Happening

Page 19

6. Open Data and Patient-Centered Paradigms Are Empowering Novel Approaches to Research, Education, and Care

Page 20

A Quick “Tour” of the TBI Literature

Page 21

Defining TBI (Butte, JAMIA, 2008)

JAMIA Perspectives on Informatics

Viewpoint Paper

Translational Bioinformatics: Coming of Age

ATUL J. BUTTE, MD, PHD

Abstract

The American Medical Informatics Association (AMIA) recently augmented the scope of its activities to encompass translational bioinformatics as a third major domain of informatics. The AMIA has defined translational bioinformatics as “. . . the development of storage, analytic, and interpretive methods to optimize the transformation of increasingly voluminous biomedical data into proactive, predictive, preventative, and participatory health.” In this perspective, I will list eight reasons why this is an excellent time to be studying translational bioinformatics, including the significant increase in funding opportunities available for informatics from the United States National Institutes of Health, and the explosion of publicly-available data sets of molecular measurements. I end with the significant challenges we face in building a community of future investigators in Translational Bioinformatics.

J Am Med Inform Assoc. 2008;15:709–714. DOI 10.1197/jamia.M2824.

Introduction

Translational Medicine has been described as the effective transformation of information gained from the past fifty years of biomedical research into knowledge that can improve [the state of] human health and disease.1 This transformation requires two processes to work effectively: first, taking basic biological findings and applying them to human biology, and second, taking clinical research findings and actually improving the health of populations. The specific development of information systems is a rate-limiting challenge for these two processes.1 Many healthcare institutions are expanding the role of their operational information technology systems, such as electronic health record, decision support, and computerized provider-order-entry systems to include the mission of translational research.2

Achieving the impact of translational medicine requires expanding the role and scope of bioinformatics just as much as those for clinical informatics. In 1999, the Advisory Committee to the Director, National Institutes of Health (NIH) Working Group on Biomedical Computing, co-chaired by David Botstein and Larry Smarr, released the Biomedical Information Science and Technology Initiative (BISTI) report, which recommended that NIH should be responsive to the growth in biological data and should apply funding resources to accelerate the development and application of computational tools to science. While the BISTI report certainly led to increased funding for bioinformatics research, in retrospect, the subsequent initiatives often led to the development of novel tools, perhaps at the expense of identifying novel questions. Perhaps there was no way for the BISTI authors to predict that a generation of scientists, asking medical questions at a molecular level solely using computational resources, could appear so quickly.

The circumstances are now such that it is time to recognize this new area of inquiry called Translational Bioinformatics. The American Medical Informatics Association (AMIA) recently added translational bioinformatics as one of its three major domains of informatics. The AMIA has defined translational bioinformatics as:

“. . . the development of storage, analytic, and interpretive methods to optimize the transformation of increasingly voluminous biomedical data into proactive, predictive, preventative, and participatory health. Translational bioinformatics includes research on the development of novel techniques for the integration of biological and clinical data and the evolution of clinical informatics methodology to encompass biological observations. The end product of translational bioinformatics is newly found knowledge from these integrative efforts that can be disseminated to a variety of stakeholders, including biomedical scientists, clinicians, and patients.”3

Translational Bioinformatics involves the development and use of computational methods that can reason over the enormous amounts of life science data being collected and stored for the purpose of creating new tools for medicine. While bioinformatics methodologies have been used to

Affiliations of the author: Stanford Center for Biomedical Informatics, Department of Medicine and Department of Pediatrics, Stanford University School of Medicine, Stanford, CA; Lucile Packard Children’s Hospital, Palo Alto, CA.

The author thanks Drs. Russ Altman and Isaac Kohane for critical comments and suggestions for the manuscript. Portions of this manuscript were presented at the 2008 Summit on Translational Bioinformatics in San Francisco. The work was supported by grants from the Lucile Packard Foundation for Children’s Health, National Library of Medicine (K22 LM008261), National Institute of General Medical Sciences (R01 GM079719), Howard Hughes Medical Institute, and the Pharmaceutical Research and Manufacturers of America Foundation.

Correspondence: Atul Butte, MD, PhD, Stanford Center for Biomedical Informatics, 251 Campus Drive, Room X-215 MS-5479, Stanford, CA 94305-5479; e-mail: [email protected].

Received for review: 04/10/08; accepted for publication: 08/15/08.

Journal of the American Medical Informatics Association Volume 15 Number 6 November / December 2008 709

• Introduced the first widespread use of the term TBI

• Emphasized its complementary role to other informatics sub-disciplines

• Made an argument that the biggest challenge to the field was adequate awareness and education

Page 22

TBI Meets Genomics (Butte, Genome Medicine, 2009)

• Four opportunities:
  • Public availability of molecular data
  • Potential of “intersecting experiments” from which data is derived
  • “Commoditization” of bioinformatics methods
  • Recognition of need to move beyond “making lists of potential biomarkers and causal factors”

• Four recommendations:
  • Informaticians need to ask and answer important questions
  • Informaticians should not only develop tools, but be the first to use them
  • Embrace “messy” data
  • Pursue new or correlative data sources in order to increase scale of experiments and impact

Genome Medicine 2009, 1:64

Commentary

Translational bioinformatics applications in genome medicine

Atul J Butte

Addresses: Stanford Center for Biomedical Informatics, Department of Medicine and Department of Pediatrics, Stanford University School of Medicine, Stanford, CA 94305, USA, and Lucile Packard Children’s Hospital, Palo Alto, CA 94304, USA. Email: [email protected]

Abstract

Although investigators using methodologies in bioinformatics have always been useful in genomic experimentation in analytic, engineering, and infrastructure support roles, only recently have bioinformaticians been able to have a primary scientific role in asking and answering questions on human health and disease. Here, I argue that this shift in role towards asking questions in medicine is now the next step needed for the field of bioinformatics. I outline four reasons why bioinformaticians are newly enabled to drive the questions in primary medical discovery: public availability of data, intersection of data across experiments, commoditization of methods, and streamlined validation. I also list four recommendations for bioinformaticians wishing to get more involved in translational research.

Published: 29 June 2009

Genome Medicine 2009, 1:64 (doi:10.1186/gm64)

The electronic version of this article is the complete one and can be found online at http://genomemedicine.com/content/1/6/64

© 2009 BioMed Central Ltd

Introduction

Over the past decade, a large amount of individual-level molecular data has come from the use of gene expression microarrays [1,2], proteomics [3], and DNA sequencing [4,5]. Although high-throughput measurement modalities such as these have been used in biomedical research for over a decade, the role of the bioinformatician has often been relegated to that of data analyst, librarian, database manager, distribution specialist, or software engineer. Occasionally, with introductions made early enough, bioinformaticians have been included in the early design phases of experiments, and their role noted as such on manuscripts and publications. These engineering and infrastructure roles, although important, evolved under the assumption that the scientists making these measurements already know good questions to ask but lack the specific skills to analyze, store, retrieve, and disseminate their data. Engineering roles in bioinformatics are important and are reasonably well funded today (such as in the Cancer Bioinformatics Grid (caBIG), Bioinformatics Research Network (BIRN), and the National Centers for Biomedical Computing (NCBC), all in the United States).

But considering and funding solely the engineering roles in bioinformatics understates the potential function of bioinformaticians as scientists - here defined as those who come up with questions - and, even more importantly, it limits the vision for bioinformaticians to ask questions that no other scientists can ask or answer today. It has become increasingly rare for the bioinformatician to take the role of questioner, especially with regard to research that has an impact on medical care or research that yields tools for clinicians or patients. Here, I argue that the next steps needed for the field of bioinformatics are a shift in role towards asking questions and a shift in focus to medicine. The field of translational bioinformatics, defined as ‘…the development of storage, analytic and interpretive methods to optimize the transformation of increasingly voluminous biomedical data into proactive, predictive, preventative, and participatory health’ [6], is the mechanism for this shift. I outline below four reasons why bioinformaticians are newly enabled to drive the questions in primary medical discovery, and provide four recommendations for bioinformaticians who would like to get more involved in translational research.

Page 23

Hunting Disease Genes (Kann, Briefings in Bioinformatics, 2009)

Advances in translational bioinformatics: computational approaches for the hunting of disease genes

Maricel G. Kann

Submitted: 11th August 2009; Received (in revised form): 15th September 2009

Abstract

Over a 100 years ago, William Bateson provided, through his observations of the transmission of alkaptonuria in first cousin offspring, evidence of the application of Mendelian genetics to certain human traits and diseases. His work was corroborated by Archibald Garrod (Archibald AE. The incidence of alkaptonuria: a study in chemical individuality. Lancet 1902;ii:1616–20) and William Farabee (Farabee WC. Inheritance of digital malformations in man. In: Papers of the Peabody Museum of American Archaeology and Ethnology. Cambridge, Mass: Harvard University, 1905; 65–78), who recorded the familial tendencies of inheritance of malformations of human hands and feet. These were the pioneers of the hunt for disease genes that would continue through the century and result in the discovery of hundreds of genes that can be associated with different diseases. Despite many ground-breaking discoveries during the last century, we are far from having a complete understanding of the intricate network of molecular processes involved in diseases, and we are still searching for the cures for most complex diseases. In the last few years, new genome sequencing and other high-throughput experimental techniques have generated vast amounts of molecular and clinical data that contain crucial information with the potential of leading to the next major biomedical discoveries. The need to mine, visualize and integrate these data has motivated the development of several informatics approaches that can broadly be grouped in the research area of ‘translational bioinformatics’. This review highlights the latest advances in the field of translational bioinformatics, focusing on the advances of computational techniques to search for and classify disease genes.

Keywords: translational bioinformatics; disease genes; computational biology

INTRODUCTION

More than 100 years ago, Archibald Garrod confirmed, with his study of the incidence of alkaptonuria in men, the Mendelian laws of inheritance of this disorder. Dr William Bateson, a keen follower of Mendel, had previously hypothesized that alkaptonuria in offspring resulting from mating of first cousins might be due to the fact that ‘first cousins will frequently be the bearer of similar gametes’ dispelling the previous notion that mating of first cousins in general might lead to the diseases, and hypothesizing that the disease follows similar inheritance laws observed by Mendel in plants. Just after the terms genotype and phenotype were coined [1], in 1905, William Farabee [2], a recognized anthropologist, recorded the familial tendencies of inheritance for malformations of human hands and feet and also recognized the Mendelian patterns of inheritance for those anomalies.

It would take over 90 more years of genetic research to identify mutations in the BRCA1 gene with clear relationships to familial breast cancer [3]. This breakthrough knowledge has had important implications for the diagnosis and prognosis of

Maricel G. Kann is an assistant professor at the University of Maryland, Baltimore County. Her research interests include methods for alignment of protein sequences, predictors of protein–protein interactions and the study of protein domains and their associations with disease. She has co-chaired several sessions at international bioinformatics conferences related to the field of translational bioinformatics.

Corresponding author. Maricel G. Kann, University of Maryland, Baltimore County 1000 Hilltop Circle, Baltimore, MD 21250, USA. Tel: +1-410-455-2258; Fax: +1-410-455-3875; E-mail: [email protected]

BRIEFINGS IN BIOINFORMATICS. VOL 11. NO 1. 96–110 doi:10.1093/bib/bbp048
Advance Access published on 10 December 2009

© The Author 2009. Published by Oxford University Press. For Permissions, please email: [email protected]

• Defines the characteristics of “disease genes”

• Complex interplay of bio-molecular and clinical phenotypes

• Surveys types of public data and methods that can be used to identify such markers

• Desirable features: publicly available, advanced interfaces, understandable and actionable output

Page 24

High Throughput Research Paradigms (Payne, Physiol Genomics, 2009)

Review

Translational informatics: enabling high-throughput research paradigms

Philip R. O. Payne,1,2 Peter J. Embi,4,5 and Chandan K. Sen2,3

1Department of Biomedical Informatics, 2Center for Clinical and Translational Science, and 3Department of Surgery,The Ohio State University, Columbus; and 4Center for Health Informatics and 5 Department of Medicine, Universityof Cincinnati, Cincinnati, Ohio

Submitted 11 March 2009; accepted in final form 1 September 2009

Payne PRO, Embi PJ, Sen CK. Translational informatics: enabling high-throughput research paradigms. Physiol Genomics 39: 131–140, 2009. First published September 8, 2009; doi:10.1152/physiolgenomics.00050.2009.

A common thread throughout the clinical and translational research domains is the need to collect, manage, integrate, analyze, and disseminate large-scale, heterogeneous biomedical data sets. However, well-established and broadly adopted theoretical and practical frameworks and models intended to address such needs are conspicuously absent in the published literature or other reputable knowledge sources. Instead, the development and execution of multidisciplinary, clinical, or translational studies are significantly limited by the propagation of “silos” of both data and expertise. Motivated by this fundamental challenge, we report upon the current state and evolution of biomedical informatics as it pertains to the conduct of high-throughput clinical and translational research and will present both a conceptual and practical framework for the design and execution of informatics-enabled studies. The objective of presenting such findings and constructs is to provide the clinical and translational research community with a common frame of reference for discussing and expanding upon such models and methodologies.

biomedical research

THE MODERN BIOMEDICAL RESEARCH domain has experienced a fundamental shift toward integrative and translational methodologies and frameworks over the past several years. This shift has been manifested in a number of ways, including the launch of the National Institutes of Health (NIH) Roadmap initiative (82–84), which has resulted in the creation of the Clinical and Translational Science Award (CTSA) program (83), as well as the rapid growth and increasing availability of high-throughput biomolecular technologies and corresponding bio-marker-to-phenotype mapping efforts (11). A commonly reported thread in a broad variety of reports and commentaries concerned with this evolution focuses on the challenges and requirements related to the collection, management, integration, analysis, and dissemination of large-scale, heterogeneous biomedical data sets (19, 25, 58, 72). However, well-established and broadly adopted theoretical and practical frameworks intended to address such needs are still conspicuously lacking in the published literature or other reputable knowledge sources (14, 46, 58). Instead, the development and execution of integrative clinical or translational research are significantly limited by the propagation of “silos” of both data and expertise. Motivated by this fundamental challenge, the remainder of this manuscript will present the findings of a four-phase approach to define the current state and practice of clinical/translational science and its intersection with biomedical informatics.

METHODOLOGY

As noted in the introduction and illustrated in Fig. 1, the phases and associated findings of the four-phase approach used to develop this manuscript can be broadly divided into the following four categories: 1) a review of the current state of biomedical informatics as it pertains to the conduct of high-throughput clinical and translational research, with an emphasis on key definitions and critical information management challenges; 2) the definition of a conceptual framework for translational informatics that is intended to foster greater integration of the biomedical informatics and the clinical or translational research domains, informed by the exemplary experiences of the authors and a number of contributory literature reviews; 3) the definition of a practical model for the design and implementation of translational informatics projects; and 4) a synthesis of the preceding research products and an associated set of recommendations concerning how to fully realize the potential benefits afforded by systematic approaches to translational informatics in the contemporary biomedical research environment. Our objectives in presenting these findings are to: 1) introduce researchers who are new to the clinical and translational science domains to the basic concepts, challenges, and informatics-related tools and methods incumbent to their domain; and 2) provide experienced clinical, translational, and informatics researchers with a broad framework in which to situate their current work and to identify potentially novel linkages between their efforts and emerging challenges and opportunities in the translational informatics domain. This work is not intended to serve as a comprehensive review of the current state of knowledge in the clinical or translational research informatics domains, an area recently addressed in a

Address for reprint requests and other correspondence: P. R. O. Payne, The Ohio State Univ., Dept. of Biomedical Informatics, 3190 Graves Hall, 333 W. 10th Ave., Columbus, OH 43210 (e-mail: [email protected]).


1094-8341/09 $8.00 Copyright © 2009 the American Physiological Society 131



Linking Knowledge Across Biological and Clinical Realms (Sarkar, JAMIA, 2011)

• “The exponential growth of genomic data, along with parallel achievements in acquiring and analyzing clinical data position the biomedical research enterprise to deliver on the promise of the Human Genome Project.”

• “TBI is accordingly positioned to enable a systems view of complex disease.”

Translational bioinformatics: linking knowledge across biological and clinical realms

Indra Neil Sarkar,1,2,3 Atul J Butte,4 Yves A Lussier,5,6,7 Peter Tarczy-Hornoch,8,9,10,11 Lucila Ohno-Machado12

ABSTRACT

Nearly a decade since the completion of the first draft of the human genome, the biomedical community is positioned to usher in a new era of scientific inquiry that links fundamental biological insights with clinical knowledge. Accordingly, holistic approaches are needed to develop and assess hypotheses that incorporate genotypic, phenotypic, and environmental knowledge. This perspective presents translational bioinformatics as a discipline that builds on the successes of bioinformatics and health informatics for the study of complex diseases. The early successes of translational bioinformatics are indicative of the potential to achieve the promise of the Human Genome Project for gaining deeper insights to the genetic underpinnings of disease and progress toward the development of a new generation of therapies.

INTRODUCTION

The study of complex diseases requires the effective integration and analysis of disparate features that originate from genotypic, phenotypic, and environmental sources. In contrast to microscopic approaches that focus on detailed analyses of a single data type, a macroscopic approach offers a holistic view for exploring systems of relationships.1 Meaningful insights from a systems theory approach require the coalescence of many, often intractable, heterogeneous data types.2 Traditionally, biomedical informatics innovations have focused (‘microscopically’) on innovations constrained to particular domains3 (eg, clinical innovations in health informatics; biological innovations in bioinformatics). This has led to a perceived gulf between bioinformatics and health informatics, thus decreasing the potential impact of a ‘macroscopic’ approach. Recent years have seen recognition of the growing need to bridge these domains through the development of trans-disciplinary training programs and curricula4 as well as venues specifically designed to share innovations that span the laboratory and clinical spaces (eg, the AMIA Summit on Translational Bioinformatics). Translational bioinformatics (TBI) has thus emerged as a systems theory approach to bridge the biological and clinical divide through a combination of innovations and resources across the entire spectrum of biomedical informatics.5 Along with complementary areas of emphasis, such as those focused on developing systems and approaches within clinical research contexts,6 insights from TBI may enable a new paradigm for the study and treatment of disease.

The rapid escalation of activity in TBI can be attributed to parallel advancements in the biological and clinical realms. In biology, we have seen unprecedented advances in technology, such as those associated with generation of molecular sequences.7 In healthcare, we are observing a new era of clinical data acquisition and decision support that is driven by Federal legislation fostering adoption of electronic health records and enablement of seamless exchange of health information.8 9 The challenges have been paralleled in the biological and clinical realms, where there are common challenges in heterogeneous data integration, missing data, and semantic mapping. Nonetheless, opportunities to develop linkages between genetic and clinical information are also increasing as a result of participatory initiatives, such as those promoted by some direct-to-consumer genetic test vendors.10 Furthermore, there is great opportunity to leverage complementary approaches to address these common challenges (eg, some of the tools developed by clinical research informatics researchers6).

The promise of the $2.7 billion Human Genome Project was to enable scientists to understand the genetic basis of human disease.11 However, nearly a decade since the completion of the first draft of the human genome,12 there is still much to be elucidated. Through technological and computational advances, the $1000 genome is becoming a very real possibility.13 The availability of a large number of complete human genomes with clinical, phenotype, and environmental information may enable a new paradigm for the development of new sets of hypotheses pertaining to complex diseases, such as those that involve multiple genes and environmental parameters.14 A major goal of TBI is thus to develop informatics approaches for linking across traditionally disparate data and knowledge sources enabling both the generation and testing of new hypotheses.15 As large volumes of linked biological and clinical data become available, the complexity of disease may be dissected using novel TBI approaches designed in silico, but validated in traditional in vitro or even in vivo interventions.

BUILDING ON PREVIOUS SUCCESSES

TBI is built on the successes of research that have evolved in the 30 years since the first use16 of the term ‘bioinformatics.’ Four notable areas germane to the present discourse are clinical genomics, genomic medicine, pharmacogenomics, and genetic epidemiology (figure 1). The acceptance of clinical genomics (which has the purpose of identifying clinically relevant molecular biomarkers) by the clinical community can be measured by the growing number of clinically relevant genetic tests.17 Genomic medicine, or ‘personalized

For numbered affiliations see end of article.

Correspondence to Dr Indra Neil Sarkar, Center for Clinical and Translational Science, University of Vermont, 89 Beaumont Avenue, Given Courtyard N309, Burlington, VT 05405, USA; [email protected]

Received 14 March 2011; Accepted 19 April 2011; Published Online First 10 May 2011

This paper is freely available online under the BMJ Journals unlocked scheme, see http://jamia.bmj.com/site/about/unlocked.xhtml

J Am Med Inform Assoc 2011;18:354–357. doi:10.1136/amiajnl-2011-000245

Perspective


Data Driven Drug Discovery (Butte, Clinical Pharm and Therapeutics, 2012)

• Positions TBI as an alternative to the slowing “traditional” drug discovery enterprise

• Slow pace of discovery

• Drugs being removed from market due to safety issues

• Mining increasing volumes of public data to find new uses for existing drugs is faster, less costly, and more efficient

• New tools and methods need to be created in order to pursue such computational drug discovery in a reproducible and rigorous fashion

• Training of investigators must also evolve to meet this need (e.g., computationally prepared discovery science)

CLINICAL PHARMACOLOGY & THERAPEUTICS | VOLUME 91 NUMBER 6 | JUNE 2012 949

EDITORIAL | nature publishing group

1Division of Systems Medicine, Department of Pediatrics, Stanford University School of Medicine, Stanford, California, USA; 2Lucile Packard Children’s Hospital, Palo Alto, California, USA; 3Division of Clinical Pharmacology and Toxicology, Hospital for Sick Children, Toronto, Ontario, Canada. Correspondence: AJ Butte ([email protected])

doi:10.1038/clpt.2012.55

Translational Bioinformatics: Data-driven Drug Discovery and Development

AJ Butte1,2 and S Ito3

Internet-accessible computing power and data-sharing mandates now enable researchers to interrogate thousands of publicly available databases containing molecular, clinical, and epidemiological data. With emerging new approaches, translational bioinformatics can now provide answers to previously untouchable questions, ranging from detecting population signals of adverse drug reactions to clinical interpretation of the whole genome. There are challenges, including lack of access to some data sources and software, but there are also overwhelming doses of hopes and expectations.

Fifty years ago, some of the first reports were published indicating how a computer system could help in drug discovery. When the National Cancer Institute started to systematically test compounds against animal and cellular cancer models in 1955, the identity of these tens of thousands of compounds and results of these tests were soon kept using computer records. The raw test results were later published in supplements to Cancer Research (the state of the art in public dissemination at the time) starting in 1959, and in 1962 Leiter et al. indicated that they used an IBM 1401 to create their seventh report.1 The conventional IBM 1401 received input using punch cards and had 4 kilobytes of random access memory installed.2 To put this in perspective, the average icon for an application on the Apple iPhone (not the application itself) uses more than 10 times this amount of memory. The IBM 1401 could perform 410 multiplication operations per second; in the same time, an iPhone can perform 130 million such operations.

Fast forward to today. Today’s laptops and desktop computers are well known to equal the previous decade’s supercomputers, and Internet (“cloud”)-based computing now enables researchers around the world to temporarily purchase the services of an enormous virtual cluster of computers, with just a credit card.3 Tens of terabytes, or even petabytes (a thousand terabytes), of storage can also be provisioned for investigators.

Of course, today’s computationally enabled researcher also does not search for raw molecular data in printed journals. One current listing of publicly available databases available for biomedical research now spans 1,380 databases,4 with 23 databases containing information on small molecules, and another 29 listed under drugs and drug design. Just one of these 1,380 listed databases, NCBI PubChem, can be visualized as a massive


Linking the Molecular and Clinical Worlds (Altman, Clinical Pharm and Therapeutics, 2012)

• Many resources available to help link and reason across molecular and clinical data

• Emerging foci in this regard include:

• Discovery of “signatures” in molecular diagnostic and prognostic measurements

• Computational analysis of textual resources (NLP)

• Network-based approaches to “systems medicine”

• Integration of public health and environmental data types with the aforementioned analyses

• Critical issues to consider moving forward:

• Privacy and confidentiality concerns surrounding genomic data

• The ability to use a systems approach to pharmacology and how it changes the drug discovery and monitoring paradigm

994 VOLUME 91 NUMBER 6 | JUNE 2012 | www.nature.com/cpt

STATE OF THE ART | nature publishing group

Simply put, translational bioinformatics research integrates information about molecular entities (DNA, RNA, proteins, small molecules, and lipids) with information about clinical entities (patients, diseases, symptoms, laboratory tests, pathology reports, clinical images, and drugs) to improve patient care and our understanding of biology.

Three important technical capabilities enable translational bioinformatics. First, the revolution in computing has brought advanced capabilities in data storage, analysis, and visualization. Second, the revolution in high-throughput biological measurements allows us to sequence entire genomes, and characterize the transcriptome, the proteome, and the metabolome. Finally, the revolution in population-based health data and electronic medical records (EMRs) offers access to clinical data on a large scale. Researchers in translational bioinformatics must therefore have expertise in the algorithms for managing basic biological and cellular data as well as a good understanding of how to connect the molecular scale to the clinical scale. In the past five years, a cadre of specialists has begun to create a scholarly community with the required infrastructure, emerging journals, and conferences. One such conference is the annual American Medical Informatics Association Summit on Translational Bioinformatics.

As for all informatics, data form the basic raw material for translational bioinformatics. The past few years have seen the introduction of a powerful network of data resources. Although each has its own characteristic noise and bias, together they make it possible to find the links between molecular/cellular entities and clinical entities. Table 1 presents some of the key data resources containing clinical information. Table 2 presents the key data resources containing molecular information. Importantly, some resources include both clinical data and biological data, thereby providing important links between these worlds, as shown in Table 3. Without these linkages, it would be impossible to connect clinical to basic data. This review covers seven active areas in bioinformatics, as well as recent representative work within each of these areas.
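The kind of linkage described above can be illustrated with a minimal sketch. It assumes two tiny in-memory tables, one clinical and one molecular, that share a drug name as the common key; the table layouts and the function name `link_by_drug` are hypothetical and do not correspond to the schema of any real resource.

```python
from collections import defaultdict

# Hypothetical toy records; real resources have far richer schemas.
clinical = [  # (drug, clinical indication)
    ("imatinib", "chronic myeloid leukemia"),
    ("warfarin", "thrombosis"),
]
molecular = [  # (drug, gene encoding a protein target)
    ("imatinib", "ABL1"),
    ("warfarin", "VKORC1"),
    ("aspirin", "PTGS1"),
]


def link_by_drug(clinical_rows, molecular_rows):
    """Join the two tables on the shared drug name.

    Returns a sorted list of (gene, disease) pairs, one for every drug
    that appears in both tables.
    """
    targets = defaultdict(set)
    for drug, gene in molecular_rows:
        targets[drug].add(gene)

    links = set()
    for drug, disease in clinical_rows:
        for gene in targets.get(drug, ()):
            links.add((gene, disease))
    return sorted(links)


links = link_by_drug(clinical, molecular)
for gene, disease in links:
    print(f"{gene} <-> {disease}")
```

Aspirin is dropped because it has no row in the clinical table, which mirrors the point in the text: without a resource that carries both kinds of information, the two worlds cannot be connected.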

INFORMATICS FOR HEALTH-RELATED GENOMICS

Our ability to measure DNA sequences has undergone remarkable changes in the past decade. Not only can we measure genetic variations accurately and cheaply (through expression arrays, for example), but we can also sequence entire genomes (6 billion bases) and entire exomes (the 2% of the genome that is transcribed), with continually dropping costs (currently ~$4,000) and increasing levels of accuracy. This revolution in our ability to measure human genetic information creates opportunities for translational bioinformatics. The most basic tasks include creating the information infrastructure to store and process the raw data from these technologies. The ability to sequence DNA has actually outpaced the growth in computer power for the past several years, and therefore DNA sequencing has begun to stress the infrastructure for storage and analysis. There are also opportunities to create algorithms for associating genetic variations with clinically important outcomes such as disease risk, disease prognosis, and therapeutic response. This can be a straightforward task if the variations are sufficiently common to allow cohorts to be enrolled in studies with good statistical power.
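To give a rough sense of the storage pressure mentioned above, the following back-of-the-envelope sketch uses the 6 billion bases and 2% exome figures quoted in the text. The 2-bits-per-base packing is an assumption made for illustration; it ignores quality scores, read redundancy (coverage), and metadata, all of which make real sequencing files considerably larger.

```python
BASES_PER_GENOME = 6_000_000_000  # figure quoted in the text
EXOME_PERCENT = 2                 # figure quoted in the text
BITS_PER_BASE = 2                 # assumption: A/C/G/T only, no quality scores


def packed_size_bytes(n_bases, bits_per_base=BITS_PER_BASE):
    """Bytes needed to store n_bases at a fixed number of bits per base."""
    return n_bases * bits_per_base // 8


exome_bases = BASES_PER_GENOME * EXOME_PERCENT // 100
genome_bytes = packed_size_bytes(BASES_PER_GENOME)
exome_bytes = packed_size_bytes(exome_bases)

print(f"genome: {genome_bytes / 1e9:.2f} GB")
print(f"exome:  {exome_bytes / 1e6:.0f} MB")
```

Under these assumptions a single finished genome fits in about 1.5 GB and an exome in about 30 MB; the strain on infrastructure comes from the raw reads and from storing many individuals, not from one consensus sequence.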

1Department of Bioengineering, Stanford University, Stanford, California, USA; 2Department of Genetics, Stanford University, Stanford, California, USA; 3Department of Medicine, Stanford University, Stanford, California, USA. Correspondence: RB Altman ([email protected])

Received 14 February 2012; accepted 8 March 2012; advance online publication 2 May 2012. doi:10.1038/clpt.2012.49

Translational Bioinformatics: Linking the Molecular World to the Clinical World

RB Altman1–3

Translational bioinformatics represents the union of translational medicine and bioinformatics. Translational medicine moves basic biological discoveries from the research bench into the patient-care setting and uses clinical observations to inform basic biology. It focuses on patient care, including the creation of new diagnostics, prognostics, prevention strategies, and therapies based on biological discoveries. Bioinformatics involves algorithms to represent, store, and analyze basic biological data, including DNA sequence, RNA expression, and protein and small-molecule abundance within cells. Translational bioinformatics spans these two fields; it involves the development of algorithms to analyze basic molecular and cellular data with an explicit goal of affecting clinical care.


TBI Embraces Big Data (Shah, Yearbook of Medical Informatics, 2012)

• “Translational informatics is ready to revolutionize human health and healthcare using large-scale measurements on individuals.”

• “Data–centric approaches that compute on massive amounts of data (often called “Big Data”) to discover patterns and to make clinically relevant predictions will gain adoption.”

• “Research that bridges the latest multimodal measurement technologies with large amounts of electronic healthcare data is increasing; and is where new breakthroughs will occur.”

• Example applications:

• Predicting Adverse Drug Reactions (ADRs)

• Interpreting GWAS derived datasets

• Translating genomics into the clinic

Survey: Translational Bioinformatics embraces Big Data

Nigam H. Shah

Stanford Center for Biomedical Informatics Research, Stanford University School of Medicine, Stanford, California, United States of America

Summary

We review the latest trends and major developments in translational bioinformatics in the year 2011–2012. Our emphasis is on highlighting the key events in the field and pointing at promising research areas for the future. The key take-home points are:

• Translational informatics is ready to revolutionize human health and healthcare using large-scale measurements on individuals.

• Data–centric approaches that compute on massive amounts of data (often called “Big Data”) to discover patterns and to make clinically relevant predictions will gain adoption.

• Research that bridges the latest multimodal measurement technologies with large amounts of electronic healthcare data is increasing; and is where new breakthroughs will occur.

Introduction

Summarizing an entire research field is an intrinsically hard problem and for the purpose of this survey, I rely on discussions among the Scientific Program Committee of the 2012 AMIA Summit on Translational Bioinformatics (TBI), the focus areas of the excellent submissions received at the 2012 Summit [1] and the year-in-review presentations of the past two years at the TBI Summit [2].

The key areas of activity at the 2012 Summit were focused on research that takes us from base pairs to the bedside [3], with a particular emphasis on clinical implications of mining massive datasets, and bridging the latest multimodal measurement technologies with large amounts of electronic healthcare data that are increasingly available. Among the submissions to TBI, those that stood out for their innovation were invited into a special issue of the Journal of the American Medical Informatics Association. These capture some of the trends underway in translational bioinformatics. For example, Liu et al. [4] demonstrated how the ability to predict Adverse Drug Reactions (ADRs) can be increased by integrating chemical, biological, and phenotypic properties of drugs. They demonstrated that data fusion approaches are promising for large-scale ADR predictions in both preclinical and post-marketing phases. Similarly, for advancing the state of the art on interpreting GWAS data, Russu et al. [5] introduced a novel Bayesian model search algorithm, Binary Outcome

Corresponding Author: Nigam H Shah, MBBS, PhD Stanford University School of Medicine 1265 Welch Road Room X-229 Stanford, CA 94305 Phone: (650) 725-6236 Fax: (650) 725-7944 [email protected].

HHS Public Access. Author manuscript. Yearb Med Inform. Author manuscript; available in PMC 2015 March 24.

Published in final edited form as: Yearb Med Inform. 2012; 7(1): 130–134.



TBI “Coming of Age” (Shah, JAMIA, 2012)

• Highlighting the range and types of submissions at the AMIA Summit on Translational Bioinformatics

• Emergent “crosstalk” with CRI community (more on that later…)

The coming age of data-driven medicine: translational bioinformatics’ next frontier

Nigam H Shah,1 Jessica D Tenenbaum2

Last year, in 2011, we argued that biomedical informatics stands ready to revolutionize human health and healthcare using large-scale measurements on a large number of individuals.1 We anticipated that, with the coming changes in the amount and diversity of datasets, data-centric approaches that compute on massive amounts of data (often called ‘Big Data’2 3) to discover patterns and to make clinically relevant predictions would be increasingly common in translational bioinformatics.

Given these trends, we programmed the 2012 Summit on Translational Bioinformatics to focus on research that takes us from base pairs to the bedside,4 with a particular emphasis on clinical implications of mining massive datasets, and bridging the latest multimodal measurement technologies with the large amounts of electronic healthcare data that are increasingly available.

The coming year did turn out to be the year of Big Data for the Summit, with multiple submissions on managing and interpreting large datasets (figure 1). Among the 35 full paper submissions to the Summit, four stood out for their innovation, and hence the authors were invited to expand the work for this special issue of JAMIA, adding to the growing presence of translational bioinformatics in the journal.5–9

Liu et al10 demonstrated how the ability to predict adverse drug reactions can be increased by integrating chemical, biological, and phenotypic properties of drugs. They demonstrated that prediction accuracy increased from 0.9054 (when only chemical structures were used) to 0.9524 (when chemical structures along with biological and phenotypic features were used). They conclude that data fusion approaches are promising for large-scale adverse drug reaction predictions in both preclinical and post-marketing phases.

Bhavnani et al11 assert that existing methods to analyze ancestral informative single-nucleotide polymorphisms (SNPs) (ie, SNPs that have large differences in genotype frequencies between two or more ancestral populations) identify a parsimonious set of SNPs that can identify distinct population clusters. However, existing methods do not directly visualize which clusters of subjects are related to which clusters of SNPs, or allow visualization of the genotypes that determine the cluster memberships. In an attempt to reveal such hidden relationships, they used three bipartite analytical representations (a bipartite network, a heat map with dendrograms, and a Circos ideogram) to simultaneously visualize clusters of subjects, SNPs, and the attributes that cause them to cluster.

Seeking to maximize the utility of the abundance of available genome-wide association study (GWAS) data, Russu et al12 introduced a novel Bayesian model search algorithm, binary outcome stochastic search, for model selection when the number of predictors (eg, SNPs) far exceeds the number of observations. They propose an innovative stochastic model search technique where the relationship between the observed responses and the available predictors is described by a latent variable model with a probit link. They compare binary outcome stochastic search with three established methods (stepwise regression, logistic lasso, and elastic net) in a simulated study and in two real world studies to demonstrate higher precision (while preserving recall) in identifying SNPs associated with the observed outcome than the one obtained from established methods.

Morgan et al,13 recipient of the Marco Ramoni Best Paper Award, constructed genomic disease risk summaries for 55 common diseases using reported gene–disease associations in the research literature. They constructed risk profiles based on the SNPs as well as on 187 whole-genome sequences and show that risk predictions derived from sequencing differ substantially from those obtained from the SNPs for several different non-monogenic diseases. When a large fraction of associated variants for a given disease is not covered by the genotyping array, the overall risk predictions can vary dramatically, by as much as a factor of 20 times in some instances.

Beyond this year’s conference papers, in the larger informatics community, researchers have demonstrated that GWAS can now be performed by leveraging large amounts of electronic medical record (EMR) data. For example, Kho et al showed that, by using commonly available data from five different EMRs, it is possible to accurately identify type 2 diabetes cases and controls for genetic study across multiple institutions.14 In addition, genomic sequencing has moved out of the research realm and established itself in the clinic. For example, at the Medical College of Wisconsin, Dr Howard Jacob’s team used genome sequencing to identify a novel causal mutation that led to successful treatment of a 6-year-old boy with an extreme form of inflammatory bowel disease.15 16

Currently, the discussion of Big Data in translational informatics often connotes next-generation sequencing data.3 17 18 However, this is beginning to change: in 2011, the use of large public datasets of various kinds increased dramatically. The research activity around data mining for predicting adverse drug events (ADEs) using public data is an excellent example.19 Drug safety surveillance is currently based on spontaneous reporting systems, which contain reports of suspected ADEs seen in clinical practice. In the USA, the primary database for such reports is the Adverse Event Reporting System (AERS) database at the Food and Drug Administration. This resource has been successfully mined using ‘disproportionality measures’, which quantify the magnitude of difference between observed and expected rates of particular drug–ADE pairs.20 21
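As a minimal sketch of what a disproportionality measure computes, the example below derives two commonly used statistics from a 2×2 table of spontaneous reports: the proportional reporting ratio (PRR) and the ratio of observed to expected report counts. The counts are invented for illustration, the function names are hypothetical, and the editorial does not say which specific measures the cited studies applied.

```python
def proportional_reporting_ratio(a, b, c, d):
    """PRR for one drug-event pair from a 2x2 table of report counts.

    a: reports with the drug and the event
    b: reports with the drug, without the event
    c: reports with other drugs and the event
    d: reports with other drugs, without the event
    """
    rate_with_drug = a / (a + b)
    rate_other_drugs = c / (c + d)
    return rate_with_drug / rate_other_drugs


def observed_to_expected(a, b, c, d):
    """Observed count divided by the count expected if the drug and the
    event were reported independently of each other."""
    total = a + b + c + d
    expected = (a + b) * (a + c) / total
    return a / expected


# Invented counts for a single hypothetical drug-event pair.
a, b, c, d = 20, 980, 100, 98_900

prr = proportional_reporting_ratio(a, b, c, d)
ratio = observed_to_expected(a, b, c, d)
print(f"PRR = {prr:.2f}, observed/expected = {ratio:.2f}")
```

With these invented counts the event is reported about 20 times more often with the drug than with all other drugs (PRR ≈ 19.8, observed/expected ≈ 16.7). Values near 1 would indicate no disproportionality; in practice such signals are screened with additional thresholds and confidence intervals before any follow-up.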

Given the amount of data available in AERS,22 researchers are developing methods for detecting new or latent multi-drug adverse events. Examples include using side effect profiles from AERS’ reports to infer the presence of unreported adverse events,23–25 and

1Stanford Center for Biomedical Informatics Research, Stanford University School of Medicine, Stanford, California, USA; 2Duke Translational Medicine Institute, Duke University, Durham, North Carolina, USA

Correspondence to Dr Nigam H Shah, Stanford University School of Medicine, 1265 Welch Road, Room X-229, Stanford, CA 94305, USA; [email protected]

e2 J Am Med Inform Assoc June 2012 Vol 19 No e1

Editorial

Page 30

TBI: Past, Present, Future (Tenenbaum, Genomics Proteomics Bioinformatics, 2016)

REVIEW

Translational Bioinformatics: Past, Present, and Future

Jessica D. Tenenbaum *,a

Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC 27710, USA

Received 16 December 2015; accepted 20 January 2016
Available online 11 February 2016

Handled by Luonan Chen

KEYWORDS

Translational bioinformatics; Biomarkers; Genomics; Precision medicine; Personalized medicine

Abstract Though a relatively young discipline, translational bioinformatics (TBI) has become a key component of biomedical research in the era of precision medicine. Development of high-throughput technologies and electronic health records has caused a paradigm shift in both healthcare and biomedical research. Novel tools and methods are required to convert increasingly voluminous datasets into information and actionable knowledge. This review provides a definition and contextualization of the term TBI, describes the discipline’s brief history and past accomplishments, as well as current foci, and concludes with predictions of future directions in the field.

Introduction

Though a relatively young field, translational bioinformatics has become an important discipline in the era of personalized and precision medicine. Advances in biological methods and technologies have opened up a new realm of possible observations. The invention of the microscope enabled doctors and researchers to make observations at the cellular level. The advent of the X-ray, and later of magnetic resonance and other imaging technologies, enabled visualization of tissues and organs never before possible. Each of these technological advances necessitates a companion advance in the methods and tools used to analyze and interpret the results. With the increasingly common use of technologies like DNA and RNA sequencing, DNA microarrays, and high-throughput proteomics and metabolomics, comes the need for novel methods to turn these new types of data into new information and that new information into new knowledge. That new knowledge, in turn, gives rise to action, providing insights regarding how to treat disease and ideally how to prevent it in the first place.

Translational bioinformatics

Defining translational bioinformatics

According to the American Medical Informatics Association (AMIA), translational bioinformatics (hereafter “TBI”) is “the development of storage, analytic, and interpretive methods to optimize the transformation of increasingly voluminous biomedical data, and genomic data, into proactive, predictive, preventive, and participatory health” (http://www.amia.org/

* Corresponding author.
E-mail: [email protected] (Tenenbaum JD).

a ORCID: 0000-0003-3532-565X.

Peer review under responsibility of Beijing Institute of Genomics, Chinese Academy of Sciences and Genetics Society of China.

Genomics Proteomics Bioinformatics 14 (2016) 31–41

HOSTED BY

Genomics Proteomics Bioinformatics

www.elsevier.com/locate/gpb
www.sciencedirect.com

http://dx.doi.org/10.1016/j.gpb.2016.01.003
1672-0229 © 2016 The Author. Production and hosting by Elsevier B.V. on behalf of Beijing Institute of Genomics, Chinese Academy of Sciences and Genetics Society of China.
This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Page 31

And Now Onto the CRI Literature...

Page 32

The Translational Blocks (Sung, JAMA, 2003)

• Critical challenges:

1) Enhancing research participation

2) Developing information systems

3) Workforce development

4) Funding

Page 33

Integrating Informatics and Translational Research (Payne, JIM, 2005)

192 JOURNAL OF INVESTIGATIVE MEDICINE • volume 53 number 4 • May 2005

ABSTRACT

The conduct of translational health research has become a vital national enterprise. However, multiple barriers prevent the effective translation of basic science discoveries into clinical and community practice. New information technology (IT) applications could help address these barriers. Unfortunately, owing to a combination of organizational, technical, and social factors, neither physician-investigators and research staff nor their clinical and community counterparts have harnessed such applications. Recently, at the request of the Institute of Medicine’s Clinical Research Roundtable, a qualitative study of these factors was conducted at several leading academic medical centers. We explore the current status of IT in the translational research domain, describe the qualitative results, and conclude with a proposed set of initiatives to further increase the integration of IT into translational research.
Key Words: clinical research, translational research, biomedical informatics, information technology

The translation of scientific discoveries into mainstream therapies has been one of the key challenges to our clinical research, medical practice, health care, and public health delivery systems for the past 20 years.1 Recently, members of the Institute of Medicine’s Clinical Research Roundtable classified these challenges as falling into two different “translational blocks,” one from the bench to the conduct of clinical studies and the other from the evidence derived from studies into implementation in health practice.2 The inability to break through these barriers has a significant impact on the ability of health professionals to provide safe, effective patient care and to reduce the cost of health care delivery.3 The urgency of finding practicable solutions to these barriers is magnified by the emergence of translational research as a national enterprise. To address these deficiencies, steps must be taken to improve the inherent clinical research capacity, dissemination ability, and information management facilities of the health care system, with the objective of providing additional support, guidance, and infrastructure for both the conduct of translational research and the translation of new findings into patient care or community health.2 Numerous efforts in the biomedical informatics domain have attempted to address these areas, most notably the development of electronic medical record (EMR) systems.4–6 Unfortunately, these systems have not been widely adopted, and their potential is universally underrealized. To examine the role of biomedical informatics in the context of translational research and assess what steps are necessary to address these challenges, a qualitative study was undertaken to examine the way in which physician-investigators and informaticians interacted. This article discusses the current application of informatics to translational research from the point of view of published studies and from interview data. Based on these observations, recommendations to improve the integration of informatics and translational research are proposed.

BACKGROUND

Physician-investigators are presented with a number of challenges throughout the translational research process. These challenges include the design of hypotheses and protocols, subject identification and recruitment, implementation of data collection instruments, training participating research staff, ensuring regulatory compliance, and generating timely, meaningful reports.2,7,8 Information technology (IT) has been applied to every phase of the clinical research enterprise.9 The use of IT, especially incorporating Internet technology, has demonstrated benefits in many areas, including initial study design, data collection and analysis, and study monitoring, particularly in

ORIGINAL INVESTIGATION

Breaking the Translational Barriers: The Value of Integrating Biomedical Informatics and Translational Research
Philip R. O. Payne, Stephen B. Johnson, Justin B. Starren, Hugh H. Tilson, David Dowdy

From the Department of Biomedical Informatics (P.R.O.P., S.B.J., J.B.S.), Columbia University, New York, NY; Department of Radiology (J.B.S.), Columbia University, New York, NY; University of North Carolina School of Public Health (H.H.T.), Chapel Hill, NC; Johns Hopkins University School of Medicine (D.D.), Baltimore, MD.

The preparation of the manuscript was supported by the National Academies.

The views presented in this article are those of the authors and not the Institute of Medicine, the Institute of Medicine’s Clinical Research Roundtable, or the Roundtable’s sponsoring organizations.

Address correspondence to: Mr. Philip R.O. Payne, Department of Biomedical Informatics, Columbia University, 622 West 168th Street, VC5, New York, NY 10032; e-mail: [email protected].


Page 34

Data Standards in Clinical Research (Richesson, JAMIA, 2007)

• Major challenges to the modelling and harmonization of data for clinical research purposes:
  • Lack of adequately defined clinical research focused standards
  • Divergent data and information models
  • Lack of evaluation of competing standards
  • Unmet technology needs

• Future needs
  • “Interlocking” set of clinical research relevant data models and terminologies
  • Focus on syntax AND semantics
  • Rigorous evaluation and selection of best-of-breed approaches

JAMIA Perspectives on Informatics

Viewpoint

Data Standards in Clinical Research: Gaps, Overlaps, Challenges and Future Directions

RACHEL L. RICHESSON, PHD, MPH, JEFFREY KRISCHER, PHD

Abstract Current efforts to define and implement health data standards are driven by issues related to the quality, cost and continuity of care, patient safety concerns, and desires to speed clinical research findings to the bedside. The President’s goal for national adoption of electronic medical records in the next decade, coupled with the current emphasis on translational research, underscore the urgent need for data standards in clinical research. This paper reviews the motivations and requirements for standardized clinical research data, and the current state of standards development and adoption (including gaps and overlaps) in relevant areas. Unresolved issues and informatics challenges related to the adoption of clinical research data and terminology standards are mentioned, as are the collaborations and activities the authors perceive as most likely to address them.
J Am Med Inform Assoc. 2007;14:687–696. DOI 10.1197/jamia.M2470.

Introduction

Efforts to build a national health information infrastructure (NHII) and supporting data standards must address the needs of clinical research.1 Clinical research, as defined by the National Institutes of Health (NIH) is patient-oriented research conducted with human subjects (or on material of human origin that can be linked to an individual).2 Clinical research includes investigation of the mechanisms of human disease, therapeutic interventions, clinical trials, development of new technologies, epidemiology, behavioral studies, and outcomes and health services research. The broad scope of clinical research, coupled with the infusion of technology, has generated increasing amounts of data, and the scientific community needs to identify strategies to share it in meaningful ways. The NIH policy on the sharing of research data3 is bringing forth questions about how data should be represented for data sharing, and making the need for clinical research data standards critical and immediate.

Data standards are defined here as consensual specifications for the representation of data from different sources or settings. Standards are necessary for the sharing, portability, and reusability of data.4–7 The notion of standardized data includes specifications for both data fields (variables) and value sets (codes) that encode the data within these fields. Although the current data standards focus is on regulated research (often the narrower context of clinical trials) and their business activities (e.g., safety reporting, study reporting to regulatory bodies), it is important to mention that clinical research includes many other types of research, including observational, epidemiological, and outcomes research, as well as molecular and biology research (e.g., genetics and biomarkers for disease). Although important, this discussion does not address the “-omics” standards,8 but rather clinical, laboratory, procedure and observation data collected in the context of clinical research subject visits.

The permeation of clinical research data standards that are harmonious with clinical care standards is required for the sharing of patient data between healthcare and research, one ambition of the NHII.9 The goals for the NHII include the seamless integration of clinical research data to/from patient care data to/from population data and existing medical knowledge bases, making standardized data in clinical research a high priority.6,9,10 Interoperability between healthcare and clinical research data can create opportunities for increased subject enrollment, evidence-based medicine, and population monitoring. This paper describes data standards requirements for subject data in the clinical research domain, the nature of overlaps and gaps in current standards coverage, and highlights key informatics challenges that remain.

Affiliations of the authors: Pediatrics Epidemiology Center (RLR, JK), University of South Florida, Tampa, FL

The project described was supported by Grant Number RR019259 from the National Center for Research Resources (NCRR), a component of the National Institutes of Health (NIH).

The authors thank the members of HL7 and CDISC vocabulary and terminology teams whose hard work and dedication provided inspiration for this paper, as well as the members of the RDCRN Standards Committee. The authors also thank the Office of Rare Diseases for their support. Contents of the project are solely the responsibility of the authors and do not necessarily represent the official views of NCRR or NIH. The authors are grateful for the thorough reviews and insightful comments of the two anonymous reviewers, whose contributions have strengthened the value and accuracy of this manuscript.

Correspondence: Rachel L Richesson, PhD, Department of Pediatrics, College of Medicine, University of South Florida, 3650 Spectrum Blvd., Suite 100, Tampa FL; e-mail: [email protected].

Received for review: 04/03/07; accepted for publication: 08/07/07

Journal of the American Medical Informatics Association Volume 14 Number 6 Nov / Dec 2007 687

Page 35

Defining CRI (Embi and Payne, JAMIA, 2009)

• Community-generated definition:
  • Clinical Research Informatics (CRI) is the subdomain of biomedical informatics concerned with the development, application, and evaluation of theories, methods, and systems to optimize the design and conduct of clinical research and the analysis, interpretation, and dissemination of the information generated.

• Application areas:
  • evaluation and modeling of clinical and translational research workflow
  • social and behavioral studies involving clinical research professionals and participants
  • designing optimal human-computer interaction models for clinical research applications
  • improving and evaluating information capture and data flow in clinical research
  • optimizing research site selection, investigator, and subject recruitment
  • knowledge engineering and standards development as applied to clinical research
  • facilitating and improving research reporting to regulatory agencies
  • enhancing clinical and research data mining, integration, and analysis
  • integrating research findings into individual and population level healthcare
  • knowledge integration across clinical and research information systems
  • defining and promoting ethical standards in CRI practice
  • educating researchers, informaticians, and organizational leaders about CRI
  • driving public policy around clinical and translational research informatics

JAMIA Original Investigations

Research Paper

Clinical Research Informatics: Challenges, Opportunities and Definition for an Emerging Domain

PETER J. EMBI, MD, MS, PHILIP R.O. PAYNE, PHD

Abstract
Objectives: Clinical Research Informatics, an emerging sub-domain of Biomedical Informatics, is currently not well defined. A formal description of CRI including major challenges and opportunities is needed to direct progress in the field.
Design: Given the early stage of CRI knowledge and activity, we engaged in a series of qualitative studies with key stakeholders and opinion leaders to determine the range of challenges and opportunities facing CRI. These phases employed complementary methods to triangulate upon our findings.
Measurements: Study phases included: 1) a group interview with key stakeholders, 2) an email follow-up survey with a larger group of self-identified CRI professionals, and 3) validation of our results via electronic peer-debriefing and member-checking with a group of CRI-related opinion leaders. Data were collected, transcribed, and organized for formal, independent content analyses by experienced qualitative investigators, followed by an iterative process to identify emergent categorizations and thematic descriptions of the data.
Results: We identified a range of challenges and opportunities facing the CRI domain. These included 13 distinct themes spanning academic, practical, and organizational aspects of CRI. These findings also informed the development of a formal definition of CRI and supported further representations that illustrate areas of emphasis critical to advancing the domain.
Conclusions: CRI has emerged as a distinct discipline that faces multiple challenges and opportunities. The findings presented summarize those challenges and opportunities and provide a framework that should help inform next steps to advance this important new discipline.
J Am Med Inform Assoc. 2009;16:316–327. DOI 10.1197/jamia.M3005.

Introduction

Clinical research is critical to the advancement of medical science and public health. Conducting such research is a complex, resource intensive endeavor comprised of a multitude of actors, workflows, processes, and information resources. Ongoing large-scale efforts have explicitly focused on increasing the clinical research capacity of the biomedical sector and have served to increase attention on clinical research and related biomedical informatics activities throughout the governmental, academic, and private sectors.1–8 Such programs and initiatives have served as significant catalysts for the emergence of a new sub-discipline of biomedical informatics focused on clinical research referred to as Clinical Research Informatics (CRI). The CRI space is growing rapidly and has already enabled significant improvements in the quality and efficiency of clinical re-

Affiliations of the authors: Department of Medicine and Center for Health Informatics, University of Cincinnati (PJE), Cincinnati, OH; Department of Biomedical Informatics and Center for Clinical and Translational Science, The Ohio State University (PROP), Columbus, OH.

Both authors contributed equally to the preparation of this manuscript. The authors acknowledge the contributions of those who participated in our face-to-face session at the 2006 AMIA annual symposium and of members of the AMIA CRI working group who participated in phases of this research. In particular, the authors thank the following individuals listed alphabetically for their additional comments and other contributions to aspects of the preparation of this manuscript: Barbara Alving, MD; Suzanne Bakken, DNSc, RN; Charles Barr, MD, MPH; Tara Borlawsky, MA; Amar Chahal, MD, MBA; Christopher Chute, MD, Dr.P.H.; Milton Corn, MD; Don Detmer, MD; Bill Hersh, MD; Charles Jaffe, MD, PhD; Stephen Johnson, PhD; Srini Kalluri; Stan Kaufman, MD; Rebecca Kush, PhD; Judith Logan, MD, MS; Daniel R. Masys, MD; Shawn Murphy, MD, PhD; Ricardo Pietroban, MD, PhD, MBA; and Ida Sim, MD, PhD; Justin Starren, MD, PhD.

Preliminary analysis and findings of phase one of this study were presented at the AMIA 2007 Annual Symposium and published in the AMIA 2007 Annual Symposium Proceedings.

Dr. Embi’s efforts in this research were supported in part by grants from the NIH/NLM (K22-LM008534, R01-LM009533). Dr. Payne’s efforts in this research were supported in part by grants from the NIH/NCI (P01-CA081534, R01CA134232) and NIH/NCRR (U54-RR024384).

Correspondence: Peter J. Embi, MD, MS, Center for Health Informatics, University of Cincinnati Academic Health Center, 231 Albert Sabin Way, PO Box 670840, Cincinnati, OH, 45267-0840; e-mail: [email protected].

Received for review: 09/17/08; accepted for publication 02/12/09.

316 Embi and Payne, Clinical Research Informatics

Page 36

REDCap (Harris, JBI, 2009)

• First “mass market” and open-source CRI platform

• Focus on flexible, scalable, knowledge-worker driven electronic data capture (EDC)

Research electronic data capture (REDCap): A metadata-driven methodology and workflow process for providing translational research informatics support

Paul A. Harris a,*, Robert Taylor b, Robert Thielke c, Jonathon Payne d, Nathaniel Gonzalez e, Jose G. Conde e

a Department of Biomedical Informatics, Vanderbilt University, 2525 West End Avenue, Suite 674, Nashville, TN 37212, USA
b Office of Research Informatics, Vanderbilt University, 2525 West End Avenue, Suite 600, Nashville, TN 37212, USA
c General Clinical Research Center, Medical College of Wisconsin, 9200 West Wisconsin Avenue, Milwaukee, WI 53226, USA
d Biomedical Research Education and Training, Vanderbilt University, 340 Light Hall, Nashville, TN 37232, USA
e Center for Information Architecture in Research, University of Puerto Rico, P.O. Box 365067, San Juan, PR 00936, USA

Article info

Article history:
Received 3 July 2008
Available online 30 September 2008

Keywords:
Medical informatics
Electronic data capture
Clinical research
Translational research

Abstract

Research electronic data capture (REDCap) is a novel workflow methodology and software solution designed for rapid development and deployment of electronic data capture tools to support clinical and translational research. We present: (1) a brief description of the REDCap metadata-driven software toolset; (2) detail concerning the capture and use of study-related metadata from scientific research teams; (3) measures of impact for REDCap; (4) details concerning a consortium network of domestic and international institutions collaborating on the project; and (5) strengths and limitations of the REDCap system. REDCap is currently supporting 286 translational research projects in a growing collaborative network including 27 active partner institutions.

© 2008 Elsevier Inc. All rights reserved.

1. Introduction

The R01 funding mechanism may be the cornerstone of America’s biomedical research program, but individual scientists often require informatics and other multidisciplinary team expertise that cannot easily be obtained or developed in the independent research environment [1]. The National Center for Research Resources has stated that the future of biomedical research will involve collaborations by many scientists in diverse locations linked through high-speed computer networks that enable submission, analysis, and sharing of data [2]. However, the need to collect and share data in a secure manner with numerous collaborators across academic departments or even institutions is a formidable challenge. This manuscript presents a metadata-driven software application and novel metadata-gathering workflow used to successfully support translational research projects in the academic research environment. REDCap (Research Electronic Data Capture) was initially developed and deployed at Vanderbilt University, but now has collaborative support from a wide consortium of domestic and international partners.

2. Methods

The REDCap project was developed to provide scientific research teams intuitive and reusable tools for collecting, storing and disseminating project-specific clinical and translational research data. The following key features were identified as critical components for supporting research projects: (1) collaborative access to data across academic departments and institutions; (2) user authentication and role-based security; (3) intuitive electronic case report forms (CRFs); (4) real-time data validation, integrity checks and other mechanisms for ensuring data quality (e.g. double-data entry options); (5) data attribution and audit capabilities; (6) protocol document storage and sharing; (7) central data storage and backups; (8) data export functions for common statistical packages; and (9) data import functions to facilitate bulk import of data from other systems. Given the quantity and diversity of research projects within academic medical centers, we also determined two additional critical features for the REDCap project: (10) a software generation cycle sufficiently fast to accommodate multiple concurrent projects without the need for custom project-specific programming; and (11) a model capable of meeting disparate data collection needs of projects across a wide array of scientific disciplines.
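
As an illustration of how feature (4), real-time data validation, can be driven by study metadata rather than project-specific programming, here is a minimal Python sketch. It is not REDCap's implementation or its data dictionary format; the dictionary layout, field names, and rules are hypothetical and chosen only to show the general pattern of one generic validator reading per-field metadata.

```python
from typing import Any

# Hypothetical data dictionary: one metadata row per form field.
# This layout is an assumption for illustration, not REDCap's actual format.
DATA_DICTIONARY: list[dict[str, Any]] = [
    {"name": "subject_id", "type": "text", "required": True},
    {"name": "age_years", "type": "integer", "required": True,
     "min": 0, "max": 120},
    {"name": "smoker", "type": "choice", "required": False,
     "choices": {"yes", "no", "unknown"}},
]


def validate_record(record: dict[str, Any],
                    dictionary: list[dict[str, Any]]) -> list[str]:
    """Check one case report form record against the field metadata."""
    errors: list[str] = []
    for field in dictionary:
        name = field["name"]
        value = record.get(name)
        if value is None or value == "":
            if field.get("required"):
                errors.append(f"{name}: required value is missing")
            continue
        if field["type"] == "integer":
            # bool is a subclass of int in Python, so exclude it explicitly.
            if not isinstance(value, int) or isinstance(value, bool):
                errors.append(f"{name}: expected an integer")
                continue
            if "min" in field and value < field["min"]:
                errors.append(f"{name}: below minimum {field['min']}")
            if "max" in field and value > field["max"]:
                errors.append(f"{name}: above maximum {field['max']}")
        elif field["type"] == "choice":
            if value not in field["choices"]:
                errors.append(f"{name}: not an allowed choice")
    return errors


clean_errors = validate_record(
    {"subject_id": "S001", "age_years": 42, "smoker": "no"}, DATA_DICTIONARY)
bad_errors = validate_record(
    {"subject_id": "", "age_years": 150, "smoker": "maybe"}, DATA_DICTIONARY)
print(clean_errors)
print(bad_errors)
```

Under this design, adding a new study means writing a new dictionary, not new validation code, which is the general appeal of a metadata-driven approach.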

REDCap accomplishes key functions through use of a single study metadata table referenced by presentation-level operational modules. Based on this abstracted programming model, studies are developed in an efficient manner with little resource investment beyond the creation of a single data dictionary. The concept of metadata-driven application development is well established, so we realized early in the project that the critical factor for success would lie in creating a simple workflow methodology allowing research teams to autonomously develop study-related metadata in

1532-0464/$ - see front matter © 2008 Elsevier Inc. All rights reserved.
doi:10.1016/j.jbi.2008.08.010

* Corresponding author. Fax: +1 615 936 8545.
E-mail address: [email protected] (P.A. Harris).

Journal of Biomedical Informatics 42 (2009) 377–381

Contents lists available at ScienceDirect

Journal of Biomedical Informatics

journal homepage: www.elsevier.com/locate/yjbin

Page 37

Leveraging EHRs for Research (Weiskopf, JAMIA, 2012)

• Systematic review of measures and methods that can be used to define “fitness for use” of data in EHRs when conducting clinical research

• Critical dimensions of this domain enumerated:

• Completeness: Is a truth about a patient present in the EHR?

• Correctness: Is an element that is present in the EHR true?

• Concordance: Is there agreement between elements in the EHR, or between the EHR and another data source?

• Plausibility: Does an element in the EHR make sense in light of other knowledge about what that element is measuring?

• Currency: Is an element in the EHR a relevant representation of the patient state at a given point in time?
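
Three of the dimensions above can be approximated with simple proportions over a dataset. The Python sketch below is one possible reading, not the method of the reviewed studies: the records, the field names, the 50 to 300 mmHg plausibility range, and the one-year currency window are all made-up assumptions. Correctness and concordance are left out because they need a gold standard or a second data source.

```python
from datetime import date

# Hypothetical EHR extract; values are invented for illustration.
records = [
    {"patient_id": "p1", "systolic_bp": 118, "recorded_on": date(2012, 3, 1)},
    {"patient_id": "p2", "systolic_bp": None, "recorded_on": None},
    {"patient_id": "p3", "systolic_bp": 400, "recorded_on": date(2009, 6, 15)},
    {"patient_id": "p4", "systolic_bp": 135, "recorded_on": date(2012, 4, 20)},
]


def completeness(rows, field):
    """Fraction of rows in which the field has any recorded value."""
    if not rows:
        return 0.0
    return sum(r[field] is not None for r in rows) / len(rows)


def plausibility(rows, field, low, high):
    """Fraction of recorded values that fall inside an expected range."""
    present = [r[field] for r in rows if r[field] is not None]
    if not present:
        return 0.0
    return sum(low <= v <= high for v in present) / len(present)


def currency(rows, field, date_field, as_of, max_age_days):
    """Fraction of recorded values captured within max_age_days of as_of."""
    present = [r for r in rows if r[field] is not None]
    if not present:
        return 0.0
    recent = sum(0 <= (as_of - r[date_field]).days <= max_age_days
                 for r in present)
    return recent / len(present)


bp_completeness = completeness(records, "systolic_bp")
bp_plausibility = plausibility(records, "systolic_bp", low=50, high=300)
bp_currency = currency(records, "systolic_bp", "recorded_on",
                       as_of=date(2012, 5, 1), max_age_days=365)
print(bp_completeness, bp_plausibility, bp_currency)
```

Whether a given score is good enough depends on the research question, which is the “fitness for use” point made on the slide.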

Methods and dimensions of electronic health record data quality assessment: enabling reuse for clinical research
Nicole Gray Weiskopf, Chunhua Weng

ABSTRACT
Objective To review the methods and dimensions of data quality assessment in the context of electronic health record (EHR) data reuse for research.
Materials and methods A review of the clinical research literature discussing data quality assessment methodology for EHR data was performed. Using an iterative process, the aspects of data quality being measured were abstracted and categorized, as well as the methods of assessment used.
Results Five dimensions of data quality were identified, which are completeness, correctness, concordance, plausibility, and currency, and seven broad categories of data quality assessment methods: comparison with gold standards, data element agreement, data source agreement, distribution comparison, validity checks, log review, and element presence.
Discussion Examination of the methods by which clinical researchers have investigated the quality and suitability of EHR data for research shows that there are fundamental features of data quality, which may be difficult to measure, as well as proxy dimensions. Researchers interested in the reuse of EHR data for clinical research are recommended to consider the adoption of a consistent taxonomy of EHR data quality, to remain aware of the task-dependence of data quality, to integrate work on data quality assessment from other fields, and to adopt systematic, empirically driven, statistically based methods of data quality assessment.
Conclusion There is currently little consistency or potential generalizability in the methods used to assess EHR data quality. If the reuse of EHR data for clinical research is to become accepted, researchers should adopt validated, systematic methods of EHR data quality assessment.

As the adoption of electronic health records (EHRs) has made it easier to access and aggregate clinical data, there has been growing interest in conducting research with data collected during the course of clinical care.1 2 The National Institutes of Health has called for increasing the reuse of electronic records for research, and the clinical research community has been actively seeking methods to enable secondary use of clinical data.3 EHRs surpass many existing registries and data repositories in volume, and the reuse of these data may diminish the costs and inefficiencies associated with clinical research. Like other forms of retrospective research, studies that make use of EHR data do not require patient recruitment or data collection, both of which are expensive and time-consuming processes. The data from EHRs also offer a window into the medical care, status, and outcomes of a diverse population that is representative of actual patients. The secondary use of data collected in EHRs is a promising step towards decreasing research costs, increasing patient-centered research, and speeding the rate of new medical discoveries.

Despite these benefits, reuse of EHR data has been limited by a number of factors, including concerns about the quality of the data and their suitability for research. It is generally accepted that, as a result of differences in priorities between clinical and research settings, clinical data are not recorded with the same care as research data.4 Moreover, Burnum5 stated that the introduction of health information technology like EHRs has led not to improvements in the quality of the data being recorded, but rather to the recording of a greater quantity of bad data. Due to such concerns about data quality, van der Lei6 warned specifically against the reuse of clinical data for research and proposed what he called the first law of informatics: ‘[d]ata shall be used only for the purpose for which they were collected’.

Although such concerns about data quality have existed since EHRs were first introduced, there remains no consensus as to the quality of electronic clinical data or even agreement as to what ‘data quality’ actually means in the context of EHRs. One of the most broadly adopted conceptualizations of quality comes from Juran,7 who said that quality is defined through ‘fitness for use’. In the context of data quality, this means that data are of sufficient quality when they serve the needs of a given user pursuing specific goals.

Past study of EHR data quality has revealed highly variable results. Hogan and Wagner,8 in their 1997 literature review, found that the correctness of data ranged between 44% and 100%, and completeness between 1.1% and 100%, depending on the clinical concepts being studied. Similarly, Thiru et al,9 in calculating the sensitivity of different types of EHR data in the literature, found values ranging between 0.26 and 1.00. In a 2010 review, Chan et al10 looked at the quality of the same clinical concepts across multiple institutions, and still found a great deal of variability. The completeness of blood pressure recordings, for example, fell anywhere between 0.1% and 51%. Due to differences in measurement, recording, information systems, and clinical focus, the quality of EHR data is highly variable. Therefore, it is generally inadvisable to make assumptions about one EHR-derived dataset based on another. We need systematic methods that will allow us to assess the

Department of Biomedical Informatics, Columbia University, New York, New York, USA

Correspondence to Nicole Gray Weiskopf, Department of Biomedical Informatics, Columbia University, 622 W 168th Street, VC-5, New York, NY 10032, USA; [email protected]

Received 3 November 2011
Accepted 3 May 2012
Published Online First 25 June 2012

Review

J Am Med Inform Assoc 2013;20:144–151. doi:10.1136/amiajnl-2011-000681

Page 38

A Conceptual Model for CRI (Kahn, JAMIA, 2012)

Clinical research informatics: a conceptual perspective
Michael G Kahn,1 Chunhua Weng2

ABSTRACT
Clinical research informatics is the rapidly evolving sub-discipline within biomedical informatics that focuses on developing new informatics theories, tools, and solutions to accelerate the full translational continuum: basic research to clinical trials (T1), clinical trials to academic health center practice (T2), diffusion and implementation to community practice (T3), and ‘real world’ outcomes (T4). We present a conceptual model based on an informatics-enabled clinical research workflow, integration across heterogeneous data sources, and core informatics tools and platforms. We use this conceptual model to highlight 18 new articles in the JAMIA special issue on clinical research informatics.

Clinical research informatics (CRI) is the rapidly evolving sub-discipline within biomedical informatics that focuses on developing new informatics theories, tools, and solutions to accelerate the full translational continuum [1, 2]: basic research to clinical trials (T1), clinical trials to academic health center practice (T2), diffusion and implementation to community practice (T3), and ‘real world’ outcomes (T4) [3]. Two recent factors accelerating CRI research and development efforts are (1) the extensive and diverse informatics needs of the NIH Clinical and Translational Sciences Awards (CTSAs) [4–6], and (2) the growing interest in sustainable, large-scale, multi-institutional distributed research networks for comparative effectiveness research [7–9]. Given the large landscape that comprises translational science, CRI scientists are asked to conceive innovative informatics solutions that span biological, clinical, and population-based research. It is therefore not surprising that the field has simultaneously borrowed from and contributed to many related informatics disciplines.

Paralleling the growth in CRI prominence, JAMIA has received an increasing number of CRI submissions. In 2010, five published articles were completely focused on CRI [10–14], while in 2011 this number rose to 23 [15–37], accounting for 11.5% of all JAMIA articles for that year. There was a special section focused on CRI papers in the December 2011 supplement issue. Much of the increase can be attributed to publications from awardees of the CTSA, since publication rate is related to funding [38]. JAMIA publications acknowledging CTSA funding rose from three in 2009 [39–41] to four in 2010 [14, 42–44] and 15 in 2011 [15, 17, 19, 36, 45–55]. Some of the articles were not exclusively focused on CRI, but were directly related, covering many different topics that are highly relevant to CRI: data models and terminologies [27, 56–68], natural language processing (NLP) [16, 50, 61, 69–99], surveillance systems [48, 65, 80, 100–110], and privacy technology and policy [33, 111–117]. This 2012 CRI supplement adds 18 new publications to this growing field.

A CONCEPTUAL MODEL OF CLINICAL RESEARCH INFORMATICS
To provide guidance on the CRI innovations represented in this special supplement, we developed the conceptual model in figure 1. This figure illustrates how CRI integrates clinical and translational research workflows in addition to core informatics methodologies and principles into a framework that reflects the unique informatics needs of translational investigators. The model is organized around three conceptual components: workflows; data sources and platforms; and informatics core methods and topics.

The central structure that establishes the unique context for CRI is the informatics-enabled clinical research workflow. The elements and sequence of this workflow should be familiar as it reflects the key phases in the scientific model of knowledge discovery [118]. Unlike diagrams that appear in traditional research methodology textbooks, figure 1 applies an informatics-centric perspective to each step and contains two translational workflow cycles, which reflect the use of CRI technologies in both early (‘T1–T2’) and later (‘T3–T4’) translational phases [119, 120]. The ‘inner’ cycle represents translational discoveries within carefully controlled study conditions in a limited number of clinical trial sites. The ‘outer’ cycle represents the later stages of clinical translational research, where implementation and dissemination tasks become more prominent across community practices. The later stages of clinical translational research are represented by implementation-oriented translational activities such as evidence generation and synthesis, personalized evidence application, and population surveillance.

New scientific knowledge, both hypothesis-generating and hypothesis-testing, begins with a research question that drives the investigative process. While previous studies may suggest possible new research questions, ultimately this step reflects the creative insight of a well-trained translational investigator. During the early planning phases, study feasibility assessment and cohort identification are important tasks for ensuring that sufficient study participants and data exist to move the proposed study forward. Eligibility alerting, which leverages the growing use of electronic health records (EHRs) to notify physicians of their patients’ eligibility for clinical trials, is one of the major informatics solutions to address the leading cause of failures in clinical studies: the inability to recruit sufficient study

1 Department of Pediatrics, University of Colorado, Aurora, Colorado, USA
2 Department of Biomedical Informatics, Columbia University, New York, New York, USA

Correspondence to Dr Michael G Kahn, c/o Children’s Hospital Colorado, 13123 East 16th Avenue, B400, Aurora, CO 80045, USA; [email protected]

Received 23 March 2012
Accepted 26 March 2012
Published Online First 20 April 2012

This paper is freely available online under the BMJ Journals unlocked scheme, see http://jamia.bmj.com/site/about/unlocked.xhtml

J Am Med Inform Assoc 2012;19:e36–e42. doi:10.1136/amiajnl-2012-000968

Perspective

Page 39

CRI + Big Data + Precision Medicine (Weng, Yearbook of Medical Informatics, 2016)

Page 40

Data Standards in Clinical Research (Richesson, JAMIA, 2007)

• Major challenges to the modelling and harmonization of data for clinical research purposes:
  • Lack of adequately defined clinical research focused standards
  • Divergent data and information models
  • Lack of evaluation of competing standards
  • Unmet technology needs
• Future needs
  • “Interlocking” set of clinical research relevant data models and terminologies
  • Focus on syntax AND semantics
  • Rigorous evaluation and selection of best-of-breed approaches

JAMIA Perspectives on Informatics

Viewpoint

Data Standards in Clinical Research: Gaps, Overlaps, Challenges and Future Directions

RACHEL L. RICHESSON, PHD, MPH, JEFFREY KRISCHER, PHD

Abstract Current efforts to define and implement health data standards are driven by issues related to the quality, cost and continuity of care, patient safety concerns, and desires to speed clinical research findings to the bedside. The President’s goal for national adoption of electronic medical records in the next decade, coupled with the current emphasis on translational research, underscore the urgent need for data standards in clinical research. This paper reviews the motivations and requirements for standardized clinical research data, and the current state of standards development and adoption, including gaps and overlaps, in relevant areas. Unresolved issues and informatics challenges related to the adoption of clinical research data and terminology standards are mentioned, as are the collaborations and activities the authors perceive as most likely to address them.
J Am Med Inform Assoc. 2007;14:687–696. DOI 10.1197/jamia.M2470.

Introduction
Efforts to build a national health information infrastructure (NHII) and supporting data standards must address the needs of clinical research [1]. Clinical research, as defined by the National Institutes of Health (NIH) is patient-oriented research conducted with human subjects (or on material of human origin that can be linked to an individual) [2]. Clinical research includes investigation of the mechanisms of human disease, therapeutic interventions, clinical trials, development of new technologies, epidemiology, behavioral studies, and outcomes and health services research. The broad scope of clinical research, coupled with the infusion of technology, has generated increasing amounts of data, and the scientific community needs to identify strategies to share it in meaningful ways. The NIH policy on the sharing of research data [3] is bringing forth questions about how data should be represented for data sharing, and making the need for clinical research data standards critical and immediate.

Data standards are defined here as consensual specifications for the representation of data from different sources or settings. Standards are necessary for the sharing, portability, and reusability of data [4–7]. The notion of standardized data includes specifications for both data fields (variables) and value sets (codes) that encode the data within these fields. Although the current data standards focus is on regulated research (often the narrower context of clinical trials) and their business activities (e.g., safety reporting, study reporting to regulatory bodies), it is important to mention that clinical research includes many other types of research, including observational, epidemiological, and outcomes research, as well as molecular and biology research (e.g., genetics and biomarkers for disease). Although important, this discussion does not address the “-omics” standards [8], but rather clinical, laboratory, procedure and observation data collected in the context of clinical research subject visits.

The permeation of clinical research data standards that are harmonious with clinical care standards is required for the sharing of patient data between healthcare and research, one ambition of the NHII [9]. The goals for the NHII include the seamless integration of clinical research data to/from patient care data to/from population data and existing medical knowledge bases, making standardized data in clinical research a high priority [6, 9, 10]. Interoperability between healthcare and clinical research data can create opportunities for increased subject enrollment, evidence-based medicine, and population monitoring. This paper describes data standards requirements for subject data in the clinical research domain, the nature of overlaps and gaps in current standards coverage, and highlights key informatics challenges that remain.
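The excerpt's distinction between data fields (variables) and value sets (codes) can be made concrete with a small sketch. The field names and codes below are invented for illustration and do not come from HL7, CDISC, or any other real standard; the point is only that a standard constrains both which fields exist and which values each field may take.

```python
# Hypothetical standard: each data field (variable) maps to its permitted value set (codes).
STANDARD = {
    "sex": {"F", "M", "U"},
    "smoking_status": {"never", "former", "current"},
}


def validate(record, standard=STANDARD):
    """Return a list of (field, problem) pairs; an empty list means the record conforms."""
    problems = []
    for field, value in record.items():
        if field not in standard:
            # The field itself is not defined by the standard.
            problems.append((field, "unknown field"))
        elif value not in standard[field]:
            # The field is defined, but the value is outside its value set.
            problems.append((field, "value not in value set"))
    return problems


conforming = validate({"sex": "F", "smoking_status": "never"})
nonconforming = validate({"sex": "X", "gender": "F"})
```

Two sites that share the same field definitions but different value sets would still fail this check against each other's data, which is one way to picture the "syntax AND semantics" point on the slide.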

Affiliations of the authors: Pediatrics Epidemiology Center (RLR,JK), University of South Florida, Tampa, FL

The project described was supported by Grant Number RR019259from the National Center for Research Resources (NCRR), a com-ponent of the National Institutes of Health (NIH).

The authors thank the members of HL7 and CDISC vocabulary andterminology teams whose hard work and dedication providedinspiration for this paper, as well as the members of the RDCRNStandards Committee. The authors also thank the Office of RareDiseases for their support. Contents of the project are solely theresponsibility of the authors and do not necessarily represent theofficial views of NCRR or NIH. The authors are grateful for thethorough reviews and insightful comments of the two anonymousreviewers, whose contributions have strengthened the value andaccuracy of this manuscript.

Correspondence: Rachel L Richesson, PhD, Department of Pediatrics, College of Medicine, University of South Florida, 3650 Spectrum Blvd., Suite 100, Tampa FL; e-mail: [email protected].

Received for review: 04/03/07; accepted for publication: 08/07/07

Journal of the American Medical Informatics Association Volume 14 Number 6 Nov / Dec 2007 687

Page 41

3 Common Themes Spanning TBI and CRI

Page 42

• Systems Thinking: Moving away from reductionism and towards complex systems
• Informatics as an Interventional Discipline: Asking AND Answering Questions
• Big Data: Discovering, integrating, and understanding emerging and heterogeneous data sources

Page 43

Putting It All Together: TBI, CRI and the Evidence Generating Medicine (EGM) System

Payne, Philip RO, and Peter J. Embi, eds. Translational Informatics: Realizing the Promise of Knowledge-Driven Healthcare. Springer, 2014.

Page 44

Biomedical Informatics Reimagined as an Interventional Discipline

Asking AND Answering Questions Using Novel Theories, Methods, and Data Assets

Page 45

Behaving Like A High Performance System and Evolving Gracefully Requires A Systems Approach

• Three characteristics of a high performance system:
  1) Leverage data to identify problems and opportunities
  2) Design reproducible solutions
  3) Implement those solutions

Mastering the art of designing and implementing solutions is the greatest challenge facing the fields of TBI and CRI

Page 46

A Hypothesis About Why It All Works…

[Diagram labels: Improved Translation; Systems Thinking; Evidence Generation; Learning Healthcare System(s); Advances in Human Health. Enabled By TBI and CRI Theories and Methods]

Page 47

Anticipating and Embracing Evolution in Technologies and Information Needs Is Critical… But We Must Focus on Methods First!

Page 48

Scientific Evolution in TBI and CRI: What Should Our Scholarly Foci Be?

1) Fully embrace interdisciplinary:
  • Structure
  • Function
  • Competency-based Training

2) Pursue emerging (or re-emerging) research foci:
  • Data science
  • Health services and quality improvement
  • Decision science and support (in the context of “Big Data”)
  • Human factors and workflow
  • Integrating patients and communities into the healthcare and research “fabric”
  • Understanding and applying multi-scale modelling

3) Engage with health system(s) and industry:
  • Analytics
  • Workflow and human factors
  • Transformation
  • Discovery and translation

4) Adapt strategies from the private sector:
  • Identify and place disproportionate emphasis on “blue oceans”
  • Behave like a start-up (speed, agility, “real artists ship”)
  • Rapid translation of technologies to the “market”

Page 49

An Evolving Research Agenda for Informatics in the Era of Precision Medicine (According to Tenenbaum et al)

1) Facilitate Electronic Consent and Specimen Tracking

2) Develop, Deploy, and Adopt Data Standards to Ensure Data Privacy, Security, and Integrity, and to Facilitate Data Integration and Exchange

3) Advance Methods for Biomarker Discovery and Translation

4) Implement and Enforce Protocols and Provenance

5) Build a Precision Medicine Knowledge Base

6) Enhance EHRs to Promote Precision Medicine

7) Facilitate Consumer Engagement

Source: Tenenbaum, Jessica D., et al. "An informatics research agenda to support precision medicine: seven key areas." Journal of the American Medical Informatics Association 23.4 (2016): 791-795.

Page 50

Review of Learning Objectives

1) Become familiar with the fields of Clinical Informatics (CI), Translational Bioinformatics (TBI), and Clinical Research Informatics (CRI)

2) Understand how TBI and CRI can drive research and practice in the era of Precision Medicine

3) Identify open research opportunities in the TBI and CRI domains

Page 51

Philip R.O. Payne, PhD, [email protected]
@prpayne5
www.slideshare.net/prpayne5

Questions or Comments?