georgetown innovation center for biomedical informatics symposium precision oncology and big data...
DESCRIPTION
A discussion of Big Data for precision oncology. Mobile applications and eMERGE.
TRANSCRIPT
Precision Oncology and Big Data
Warren A. Kibbe, PhD [email protected] http://wiki.bioinformatics.northwestern.edu/index.php/Warren_Kibbe
Opportunities
• Big Data in Cancer
  – Mobility and pervasive computing
  – Social data
  – NGS
  – Imaging (fMRI, CT scans)
• EHR integration
  – Analytics based on clinical data
  – Decision support
Challenges
• EHR – we need synoptic and semantic data to support precision medicine
• EHR – Truly automated and useful decision support
• Handling and analyzing big data
• Appropriate, open access to patient-derived big data
• Incorporating social data
• Mobile computing
Getting Big
• Big Data is about emergent properties
• Big Data changes the statistical paradigm – rather than modeling whether the sample is representative of the population, you have all the data from the population
• How do we combine systems biology and social data with therapeutics and big data from healthcare providers?
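One way to make the statistical point concrete is the finite population correction from survey sampling: the standard error of a sample mean shrinks to zero as the sample approaches the whole population. The sketch below is illustrative only; the helper name and all numbers are assumptions, not taken from the talk.

```python
import math

def standard_error(sd: float, n: int, population: int) -> float:
    """Standard error of a sample mean drawn without replacement
    from a finite population (hypothetical helper, for illustration)."""
    if not 1 <= n <= population:
        raise ValueError("need 1 <= n <= population")
    if population == 1:
        return 0.0
    # Finite population correction: equals 0 when n == population.
    fpc = math.sqrt((population - n) / (population - 1))
    return sd / math.sqrt(n) * fpc

# Illustrative numbers only: 10,000 patients, standard deviation of 15 units.
for n in (100, 1_000, 10_000):
    print(n, round(standard_error(15.0, n, 10_000), 3))
```

This covers sampling error only. Measurement error and missing data remain even when every patient is included.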
-omics, clinical, nutrition, exposure
• Teasing apart the factors contributing to risk and therapeutic efficacy is complicated!
• Sources of data we would like to have across all patients:
  Genomic data, microbiome data, treatment, metabolomics, outcomes, exposure data, nutrition, behavior, labs, medical history
• And of course we would like all these data consistent and reliable!
• We aren’t there yet!
• What can we do now?
Examples of current solutions
• Mobile ePROs, either at home or in the clinic
• Care diaries on tablets – response, recovery
• Integration of NLP and phenotype algorithms at the point of care
• Integration of clinically actionable genomic variants into EHR (think Herceptin and HER2)
• Decision support for infectious diseases based on social network and GPS – not just for MRSA
Mobile computing
• Measuring depression, pain and anxiety directly with the patient
• Center for Behavioral Intervention Technologies at Northwestern – David Mohr, Director and PI
Mobilyze (P20 MH090318 PI: Mohr)
Burns, M. N., Mohr, D. C. (2011). J Med Internet Res, 13(3), e55.
Mobilyze is a mobile application aimed at treating major depression. It includes
• Didactic Content (text, video, audio) aimed at providing education about behavior change strategies
• Interactive Tools that assist in implementing changes
• Notifications that provide reminders
• Feedback that provides insight.
www.cbits.northwestern.edu
Context Awareness
• Context awareness refers to the idea that computers can both sense and react based on their environment.
• A second aim of the Mobilyze project is to harness sensor data from user phones to develop models that can detect treatment-relevant states in real time, which can then be used to positively reinforce treatment-congruent behavior and provide assistance when need is detected.
Context Inference System
• The Machine Learner is “trained” using EMA (queries)
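The training idea on this slide can be sketched as supervised learning: sensor readings are the features, and the patient's answers to EMA queries are the labels. The sketch below uses a nearest-centroid classifier in plain Python purely for illustration. The feature names, labels, and data are invented, and the slides do not say which learning algorithm Mobilyze uses.

```python
from collections import defaultdict
from math import dist

def train(samples):
    """samples: list of (feature_vector, ema_label). Returns label -> centroid."""
    grouped = defaultdict(list)
    for features, label in samples:
        grouped[label].append(features)
    # Average each feature column within a label to get that label's centroid.
    return {
        label: tuple(sum(col) / len(rows) for col in zip(*rows))
        for label, rows in grouped.items()
    }

def predict(centroids, features):
    """Return the label whose centroid is closest to the new sensor reading."""
    return min(centroids, key=lambda label: dist(centroids[label], features))

# Hypothetical features: (hours at home today, outgoing calls today).
# Hypothetical labels: the patient's answers to EMA prompts.
training = [
    ((22.0, 0.0), "low_mood"),
    ((20.0, 1.0), "low_mood"),
    ((12.0, 5.0), "ok"),
    ((10.0, 7.0), "ok"),
]
model = train(training)
print(predict(model, (21.0, 0.0)))
```

Once trained, a model like this can label new sensor readings without prompting the patient, which is what would let an app react in real time.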
Purple Robot
• A full real-time sensor data acquisition platform for collecting information about the user and their immediate surroundings. Purple Robot provides
  – Full access to the Android sensor framework (e.g. the accelerometer, gyroscope, pressure sensor, light sensor, etc.)
  – Access to other device information (e.g. battery level, running software & apps, and hardware information).
  – Options to scan for external devices such as wireless access points and visible Bluetooth devices.
  – Location sensors that use the built-in GPS and cellular triangulation options to map the user’s location.
  – Local environmental data sources such as solar event timing (sunrise & sunset) and weather conditions.
  – Communication patterns, including phone logs and text-message transcripts.
  – Cryptographic anonymization of personally-identifiable information before it leaves the device.
• Purple Robot has been open sourced.
• http://tech.cbits.northwestern.edu/purple-robot/
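As a rough illustration of anonymizing identifiers before data leaves a device, the sketch below replaces selected fields with a keyed hash (HMAC-SHA256) from the Python standard library. This is an assumption about one reasonable approach, not a description of Purple Robot's actual scheme; the field names and key are hypothetical.

```python
import hashlib
import hmac

def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed SHA-256 digest (hex string)."""
    normalized = identifier.strip().lower().encode("utf-8")
    return hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()

def scrub(record: dict, secret_key: bytes,
          pii_fields=("phone_number", "contact_name")) -> dict:
    """Return a copy of record with the named string PII fields pseudonymized."""
    return {
        key: pseudonymize(value, secret_key) if key in pii_fields else value
        for key, value in record.items()
    }

# Hypothetical key and call-log entry, for illustration only.
key = b"hypothetical-per-study-secret"
call_log_entry = {"phone_number": "+1 312 555 0100", "duration_s": 42}
print(scrub(call_log_entry, key))
```

A keyed hash keeps the same contact mapped to the same token, so communication patterns stay analyzable, while anyone without the key cannot recompute tokens from a list of known phone numbers.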
EHR Integration
• EHR systems are relatively closed
• Implementing two-way integration is difficult
• Innovation using EHRs is difficult
• Example: Lightweight coupling of electronic patient reported outcomes (ePROs)
Mobile Devices in the Clinic
Andrew Gawron, MD, PhD, Center for Healthcare Studies
John Pandolfino, MD, Division of Gastroenterology and Hepatology
Northwestern University
Outpatient care
Inpatient care
Procedures
Our approach
https://enotis.northwestern.edu/login
• eNOTIS: Open source web-based subject registration system
• Meets federal guidelines for electronic reporting and addresses a mandate that accrual information be tracked, validated, and reported.
https://github.com/NUBIC/eNOTIS
Informatics
Integration with eNOTIS
• eCapture is a web-based system that delivers forms for administrator- or patient-facing data collection and is linked to eNOTIS.
• Information collected through these systems can be linked with other clinical information available in the EHR
https://github.com/NUBIC/surveyor
Delivery on an iPad
Dashboard view by study
Results
[Figure: number of patients by month, Aug through the following Aug; y-axis from 0 to 450]
>4000 ePROs collected in 1 year in 2 clinics
• 482 patients recruited
• 434 patients completed at least one measure
• Mean age 48 yrs
• 52.5% female, 87.7% white
Results: Time burden

Patient Reported Outcome Measure     # of items   Patients (N)   Median time, min (IQR)
Disease Specific
  GerdQ                                   6           413          1.0 (1.6)
  Heartburn Symptom/Experience           13           432          1.3 (1.7)
  Heartburn Vigilance/Awareness          16           424          1.8 (1.8)
  Impaction Dysphagia Questionnaire       6           426          1.3 (1.8)
  Visceral Sensitivity Index             15           432          1.9 (2.0)
Not Disease Specific
  Discomfort Tolerance Scale              7           391          1.4 (1.4)
  Anxiety Sensitivity Index              16           432          1.8 (1.7)
  BSI-18                                 18           430          1.4 (1.4)
  PANAS                                  20           432          1.8 (1.5)
  Perceived Stress Scale                  4           434          0.8 (0.8)
• Most patients required ≤ 2 minutes for each ePRO measure
• Average time to complete all measures: ~20 minutes
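As a quick cross-check of the time-burden figures, the sketch below totals the item counts and per-measure median times reported in the table. The values are copied from the slide; the variable names are only illustrative.

```python
# (measure, number of items, median minutes), copied from the time-burden table
measures = [
    ("GerdQ", 6, 1.0),
    ("Heartburn Symptom/Experience", 13, 1.3),
    ("Heartburn Vigilance/Awareness", 16, 1.8),
    ("Impaction Dysphagia Questionnaire", 6, 1.3),
    ("Visceral Sensitivity Index", 15, 1.9),
    ("Discomfort Tolerance Scale", 7, 1.4),
    ("Anxiety Sensitivity Index", 16, 1.8),
    ("BSI-18", 18, 1.4),
    ("PANAS", 20, 1.8),
    ("Perceived Stress Scale", 4, 0.8),
]

total_items = sum(items for _, items, _ in measures)
sum_of_medians = sum(median for _, _, median in measures)
print(total_items, round(sum_of_medians, 1))
```

The medians sum to 14.5 minutes across 121 items, below the reported average of about 20 minutes for all measures. That gap would be consistent with right-skewed completion times, though the slides do not say how the average was computed.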
Results: Usability
N=93 patients
• ~90% of patients reported the system easy to use
Results: Satisfaction and Patient Recall
N=93 patients
• 95.7% would recommend the system to other patients
• 46.2% reported that the system helped them remember symptom occurrence
• 35.5% said that it encouraged them to discuss medical issues with their doctor
Decision Support and EHRs
• eMERGE project – NHGRI-funded study to examine the validity of the EHR for identifying disease cohorts for genetic studies
  – Automated disease / phenotype algorithms
  – Generation of SNP variants from a patient cohort
• Integration of the phenotype algorithms with EpicCare
• Integration of genomic variants with care
eMERGE I Questions
• Technical
  – Is the information in the EHR?
  – How to get it out?
  – Does it work across institutions?
• Ethics, Legal, Social (ELSI)
  – Recruiting (Purposeful / Opportunistic)
  – Consenting (Opt in / Opt out)
  – Privacy
  – Data Use
EHRs and Genomic Discovery
[Figure: map of eMERGE sites and their EHR systems]
• Vanderbilt: Internal + Epic
• Group Health Cooperative: Epic
• Mayo Clinic: GE Centricity + Cerner
• Marshfield Clinic: Internal
• Northwestern: Epic, Cerner, eClinicalWorks
• Geisinger: Epic
• Mt. Sinai/Columbia: Epic/Allscripts
Coordinating center
• Rex Chisholm • Maureen Smith
• Jennifer Pacheco • Will Thompson • Arun Muthalagu • Anna Roberts • Tony Miqueli
• Geoff Hayes • Laura Rasmussen-Torvik • Loren Armstrong • Doug Scheftner
• Justin Starren • Abel Kho • Steve Persell • Phil Greenland • Bill Lowe • Mark Graves • Sharon Aufox
[Figure: phenotype algorithm development cycle. Steps: Define Phenotype; Translate Definition to Data; Identify Subjects; Validate Algorithm; Analyze Data; Statistical Modeling; Validated Phenotype. Resources: Data Warehouse; Multi-disciplinary Team; 80+ Years of Clinical Notes.]
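The define, identify, and validate steps in this cycle can be sketched in a few lines: a rule flags candidate cases from structured EHR data, and chart review measures how often the flags are right. The rule below is an invented example for type 2 diabetes, not an eMERGE algorithm; the code lists, threshold, record layout, and patients are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical rule inputs, chosen only for illustration.
T2D_CODES = {"250.00", "250.02"}       # example ICD-9 diagnosis codes
T2D_MEDS = {"metformin", "glipizide"}  # example medications
HBA1C_THRESHOLD = 6.5                  # percent

@dataclass
class PatientRecord:
    patient_id: str
    diagnosis_codes: set = field(default_factory=set)
    medications: set = field(default_factory=set)
    max_hba1c: Optional[float] = None

def is_case(p: PatientRecord) -> bool:
    """Flag a patient when a diagnosis code is backed by a medication or a lab."""
    has_dx = bool(p.diagnosis_codes & T2D_CODES)
    has_med = bool(p.medications & T2D_MEDS)
    has_lab = p.max_hba1c is not None and p.max_hba1c >= HBA1C_THRESHOLD
    return has_dx and (has_med or has_lab)

def positive_predictive_value(flags: dict, chart_review: dict) -> Optional[float]:
    """Share of algorithm-flagged patients that chart review confirmed."""
    flagged = [pid for pid, flag in flags.items() if flag]
    if not flagged:
        return None
    return sum(1 for pid in flagged if chart_review[pid]) / len(flagged)

# Made-up cohort and made-up chart review results.
cohort = [
    PatientRecord("p1", {"250.00"}, {"metformin"}),
    PatientRecord("p2", {"250.02"}, set(), 7.1),
    PatientRecord("p3", {"250.00"}),
    PatientRecord("p4", {"250.00"}, {"metformin"}),
]
chart_review = {"p1": True, "p2": True, "p3": False, "p4": False}

flags = {p.patient_id: is_case(p) for p in cohort}
ppv = positive_predictive_value(flags, chart_review)
print(flags, ppv)
```

In this toy cohort the rule flags three patients and chart review confirms two of them. A low positive predictive value at one institution would send the team back to the definition step.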
[Figure: genomic findings arranged by actionability and by complexity / maturity. Categories: Universal Omics; Disease Risk: Order Test; Disease Risk: Change Behavior; Disease Risk: Watchful Waiting; Low Disease Association; Genetic Variation of Unknown Clinical Significance (GVUCS); Not Computable. Primary actors: Clinician; Patient; Consumer; Researcher; Informatician.]
EHR Integration – Overview
[Figure: paths for genomic results into the EHR. Labels: Results from CLIA certified lab; Simple Data; Complex Data; ‘Omic Repository; Ancillary ‘Omics; Analytics; Actionable Attributes; Actionable Results; “Lab” Results; Observations; Patient Information; EHR; EHR DSS; External DSS; Option 1; Option 2.]
[Figure: bench-to-bedside pipeline from Raw “Omics” Data to Information, Knowledge, and Action. Labels: High-Throughput Sequence Data, Methylation, Tissue Array, Tertiary Structure, etc.; SNP calls, Regulatory Network Analysis, etc.; SNPs, Network Activation, Indels, CNVs, Rearrangements, etc.; Filter for Actionable Clinical Significance; National DB of Clinically Significant Variants; Clinically Relevant “Omic” Findings; EHR Integration; Patient Specific Clinical and Environmental Data; Personalized Health Care; Scientific Literature; National DB of ‘Omic CDS Rules.]
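The “Filter for Actionable Clinical Significance” step in this pipeline can be sketched as a lookup of each patient finding against a curated knowledge base. The dictionary below is a tiny stand-in for such a database, with a single entry based on the HER2 example mentioned earlier in the talk. The structure, names, and note text are assumptions for illustration, not clinical guidance.

```python
from typing import NamedTuple

class Finding(NamedTuple):
    gene: str
    alteration: str

# Hypothetical stand-in for a database of clinically significant variants.
KNOWLEDGE_BASE = {
    Finding("ERBB2", "amplification"):
        "HER2-positive: flag for HER2-targeted therapy review",
}

def actionable_findings(findings):
    """Keep only findings that have an entry in the knowledge base."""
    return [(f, KNOWLEDGE_BASE[f]) for f in findings if f in KNOWLEDGE_BASE]

# Made-up patient findings; GENE_X is a placeholder, not a real gene.
patient_findings = [
    Finding("ERBB2", "amplification"),
    Finding("GENE_X", "variant of unknown significance"),
]

for finding, note in actionable_findings(patient_findings):
    print(finding.gene, finding.alteration, "->", note)
```

Only the filtered findings would move on to the EHR. Everything else stays in the ancillary repository, where it can be re-filtered as the knowledge base grows.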
Open Questions
• What data goes “in the EHR” and what data is stored in ancillary systems?
• What clinical decision support is internal to the EHR and what is external?
• How do we incorporate mobile data?
We need to speed up our ability to transform findings into actionable calls
Issues
• Appropriate access to precision oncology data – big data in cancer needs innovation
• How do we promote pipeline innovation – data handling, mining, analytics?
• How do we build true healthcare learning systems, where every patient contributes to our knowledge of cancer?
CI4CC
Cancer Informatics For Cancer Centers
www.ci4cc.org
Thank you
Warren A. Kibbe [email protected]