Big Medical Data – Challenge or Potential?

Big Medical Data – Challenge or Potential? Personalized Medicine Conference, Berlin, March 7th, 2014. Dr. Matthieu-P. Schapranow, Hasso Plattner Institute

Upload: matthieu-schapranow

Post on 22-Jan-2015


DESCRIPTION

What are today's challenges of big medical data, and how can we turn this immense volume of data into potential, e.g. for precision medicine? Get insights into application examples that incorporate big medical data and learn how in-memory database technology can enable its instantaneous analysis.

TRANSCRIPT

  • 1. Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Berlin, March 7th, 2014. Dr. Matthieu-P. Schapranow, Hasso Plattner Institute

  • 2. Hasso Plattner Institute: Key Facts
      - Founded as a public-private partnership in 1998 in Potsdam near Berlin, Germany
      - Institute belongs to the University of Potsdam
      - Ranked 1st in CHE since 2009
      - 500 B.Sc. and M.Sc. students
      - 10 professors, 150 PhD students
      - Course of study: IT Systems Engineering
      Big Medical Data: Challenge or Potential? Personalized Medicine Conference, Dr. Schapranow, Mar 7, 2014

  • 3. Hasso Plattner Institute: Programs
      - Full university curriculum
      - Bachelor (6 semesters)
      - Master (4 semesters)
      - Orthogonal activities: E-Health Consortium, School of Design Thinking, Research School

  • 4. Hasso Plattner Institute: Enterprise Platform and Integration Concepts Group
      - Prof. Dr. h.c. Hasso Plattner
      - Research focuses on the technical aspects of enterprise software and the design of complex applications
          - In-Memory Data Management for Enterprise Applications
          - Enterprise Application Programming Model
          - Scientific Data Management
          - Human-Centered Software Design and Engineering
      - Industry cooperations, e.g. SAP, Siemens, Audi, and EADS
      - Research cooperations, e.g. Stanford, MIT, and Berkeley
      - Partner of Stanford Center for Design Research
      - Partner of MIT in Supply Chain Innovation and CSAIL
      - Partner at UC Berkeley RAD / AMP Lab
      - Partner of SAP AG

  • 5. Our Motivation: Personalized Medicine
      - Motivation: Can we analyze the entire data of a patient, incl. Electronic Health Records (EHR) and genome data, during a doctor's visit?
      - Genome data analysis may take up to weeks, covering biopsy, biological preparation, sequencing, alignment, variant calling, full analysis, and evaluation
      - Issue: Complex and time-consuming data processing tasks
      - In-memory technology accelerates genome data processing
          - Highly parallel alignment / variant calling
          - Real-time analysis of individual patient or cohort data
          - Combined search in structured / unstructured data

  • 6. Our Challenge: Distributed Big Data Sources
      - Human genome/biological data: 600 GB per full genome; 15 PB+ in databases of leading institutes
      - Prescription data: 1.5B records from 10,000 doctors and 10M patients (100 GB)
      - Clinical trials: currently more than 30k recruiting on ClinicalTrials.gov
      - Human proteome: 160M data points (2.4 GB) per sample; >3 TB raw proteome data in ProteomicsDB
      - PubMed database: >23M articles
      - Hospital information systems: often more than 50 GB
      - Medical sensor data: scan of a single organ in 1 s creates 10 GB of raw data
      - Cancer patient records: >160k records at NCT

  • 7. Our Approach: In-Memory Technology
      - Combined column and row store
      - Map/Reduce
      - Single and multi-tenancy
      - Lightweight compression
      - Insert only for time travel
      - Real-time replication
      - Working on integers
      - SQL interface on columns and rows
      - Active/passive data store
      - Minimal projections
      - Group key
      - Reduction of software layers
      - Dynamic multi-threading
      - Bulk load of data
      - Object-relational mapping
      - Text retrieval and extraction engine
      - No aggregate tables
      - Data partitioning
      - Any attribute as index
      - No disk
      - On-the-fly extensibility
      - Analytics on historical data
      - Multi-core/parallelization

  • 8. Our Vision: Personalized Medicine

  • 9. Our Vision: Personalized Medicine
      - Desirability
          - Leveraging directed customer services
          - Portfolio of integrated services for clinicians, researchers, and patients
          - Include latest research results, e.g. most effective therapies
      - Viability
          - Enable personalized medicine also in far-off regions and developing countries
          - Share data via the Internet to get feedback from worldwide experts (cost-saving)
          - Combine research data (publications, annotations, genome data) from international databases in a single knowledge base
      - Feasibility
          - HiSeq 2500 enables high-coverage whole genome sequencing in 1d
          - IMDB enables allele frequency determination of 12B records within