Imputation-enhanced Prediction of Septic Shock in ICU Patients

Joyce C. Ho, Cheng H. Lee and Joydeep Ghosh, University of Texas at Austin. HI-KDD 2012: ACM SIGKDD Workshop on Health Informatics. Presenter: Kiyana Zolfaghar


Page 1: Imputation-enhanced Prediction of Septic Shock In ICU Patients

Imputation-enhanced Prediction of Septic Shock in ICU Patients

Joyce C. Ho, Cheng H. Lee and Joydeep Ghosh

University of Texas at Austin

HI-KDD 2012: ACM SIGKDD Workshop on Health Informatics

Presenter: Kiyana Zolfaghar

Page 2: Imputation-enhanced Prediction of Septic Shock In ICU Patients

Outline

Motivation

Challenges of Clinical Data

Predictive models for sepsis risk and septic shock

Impact of imputation methods on prediction

Results

Page 3: Imputation-enhanced Prediction of Septic Shock In ICU Patients

Sepsis and Septic Shock

• Sepsis: a severe, systemic inflammatory response with a presumed or identified source of infection.

• Severe sepsis: sepsis with one or more organ dysfunctions, hypoperfusion, or hypotension.

• Septic shock: a complication characterized by low blood pressure despite treatment with >600 mL of fluid inputs in the last hour.

Page 4: Imputation-enhanced Prediction of Septic Shock In ICU Patients

Motivation

Septic shock as a severe illness
• The most common cause of death in western societies
• 25% of ICU bed utilization in western countries
• Mortality rates range from 12.8% for sepsis to 45.7% for septic shock

Motivation for prediction of septic shock in ICU patients
• Early intervention and therapy can improve patient outcomes
• Treatment transition: from treatment by critical care physicians in later phases to proactive treatment in early phases

Page 5: Imputation-enhanced Prediction of Septic Shock In ICU Patients

Prediction of Sepsis and Septic Shock

Data mining approach for identifying patients at risk of developing sepsis

Predictive models
• Regression methods
• Support vector machines
• Decision trees
• Bayesian classification
• …

Issues regarding classification and prediction: data preparation
• Feature selection
• Data cleaning: remove or reduce noise, treat missing values

Page 6: Imputation-enhanced Prediction of Septic Shock In ICU Patients

Challenges of Clinical Data

Typically noisy and inconsistently gathered
• Patient data are recorded manually at irregular intervals
• Accurate measurement of physiological variables requires invasive techniques

Naïve solution: simply ignore subjects or features with missing data
• Clinical studies contain large amounts of missing data
• Dramatic decrease in sample size or feature space
• Bias in the results
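To make the cost of the naïve solution concrete, the following is a minimal Python sketch on invented toy records (the field names and values are hypothetical and not taken from MIMIC-II). It drops every subject with a missing feature, then every feature with a missing value, and reports what is left.

```python
# Toy records; None marks a missing measurement. All values are invented.
records = [
    {"hr": 88,   "temp": 37.1, "lactate": None},
    {"hr": 102,  "temp": None, "lactate": 2.4},
    {"hr": 95,   "temp": 38.2, "lactate": 3.1},
    {"hr": None, "temp": 36.9, "lactate": None},
    {"hr": 110,  "temp": 38.5, "lactate": 4.0},
]

def drop_incomplete_rows(rows):
    """Keep only the subjects with no missing feature (complete-case analysis)."""
    return [r for r in rows if all(v is not None for v in r.values())]

def drop_incomplete_features(rows):
    """Keep only the features that are observed for every subject."""
    keep = [k for k in rows[0] if all(r[k] is not None for r in rows)]
    return [{k: r[k] for k in keep} for r in rows]

complete_rows = drop_incomplete_rows(records)
complete_features = drop_incomplete_features(records)

print(f"subjects kept: {len(complete_rows)} of {len(records)}")
print(f"features kept: {len(complete_features[0])} of {len(records[0])}")
```

In this toy case only two of five subjects survive row-wise deletion, and no feature survives column-wise deletion, which illustrates why neither option scales to sparse clinical data.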

Page 7: Imputation-enhanced Prediction of Septic Shock In ICU Patients

The Paper's Contribution

Investigates the role and impact of imputation methods while building predictive models for
• Sepsis risk
• Septic shock

Research Methodology
• Data selection
• Building predictive models for sepsis and septic shock
• Applying different imputation methods to the data
• Results

Page 8: Imputation-enhanced Prediction of Septic Shock In ICU Patients

Dataset Description

MIMIC-II database (Multiparameter Intelligent Monitoring in Intensive Care)
• Publicly and freely available
• Includes a very large population of ICU patients
• Contains high temporal resolution data, including lab results, electronic documentation, and monitor trends and waveforms
• Funded by the National Institute of Biomedical Imaging and Bioengineering

Page 9: Imputation-enhanced Prediction of Septic Shock In ICU Patients

Overview of the Data Categories: Clinical Records in MIMIC-II

General
• Patient demographics
• Hospital admission and discharge information
• Room tracking, death dates
• ICD-9 codes

Physiological measures
• Hourly vital sign metrics

Medication records

Lab test results

Fluid balance
• Input and output records

Notes and reports
• Discharge summaries, nursing progress notes
• Radiology and echo reports

Page 10: Imputation-enhanced Prediction of Septic Shock In ICU Patients

Data Selection and Target Classes

Dataset size: 12,179 patients
• Patients younger than 18 at the time of admission are excluded
• Patients need at least ten observations of BP, TEMP, HR…

Target classes

Sepsis risk prediction
• Patients identified by ICD-9 codes ("995.91" or "995.92")
• ~10.8% of the dataset (1,310 patients)

Septic shock prediction
• Patients with hypotension and total fluid intake >600 mL
• ~44.7% of sepsis patients (586 patients)
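The two target classes can be sketched in Python as follows. This is only an illustration: the record layout (`icd9_codes`, `hypotensive`, `fluid_intake_ml`) is hypothetical, the hypotension flag is assumed to be precomputed because the slides give no blood pressure threshold, and restricting septic shock to sepsis patients is an assumption based on the "of sepsis patients" wording. The last two lines recheck the reported percentages from the patient counts.

```python
SEPSIS_ICD9 = {"995.91", "995.92"}   # ICD-9 codes quoted on the slide
FLUID_THRESHOLD_ML = 600             # total fluid intake threshold from the slide

def target_classes(patient):
    """Return (has_sepsis, has_septic_shock) for one hypothetical patient record."""
    has_sepsis = bool(SEPSIS_ICD9 & set(patient["icd9_codes"]))
    has_shock = bool(
        has_sepsis                                   # assumption: shock only among sepsis patients
        and patient["hypotensive"]                   # assumed precomputed flag
        and patient["fluid_intake_ml"] > FLUID_THRESHOLD_ML
    )
    return has_sepsis, has_shock

# Invented example records.
patients = [
    {"icd9_codes": ["995.92", "584.9"], "hypotensive": True,  "fluid_intake_ml": 750},
    {"icd9_codes": ["995.91"],          "hypotensive": True,  "fluid_intake_ml": 400},
    {"icd9_codes": ["428.0"],           "hypotensive": False, "fluid_intake_ml": 0},
]
labels = [target_classes(p) for p in patients]
print(labels)

# Recompute the class proportions from the counts given on the slide.
sepsis_share = round(100 * 1310 / 12179, 1)   # sepsis patients within the dataset
shock_share = round(100 * 586 / 1310, 1)      # septic shock patients within sepsis patients
print(sepsis_share, shock_share)
```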

Page 11: Imputation-enhanced Prediction of Septic Shock In ICU Patients

Predictive Model for Sepsis Risk

Features
• Patient's clinical history: demographic data (gender and age), medical history, basic health data (weight ..)
• Measurements of physiological variables

Logistic regression as the prediction model, with four feature sets
• Only the clinical history features
• Clinical history features after stepwise regression
• All available features
• All available features after stepwise regression

Page 12: Imputation-enhanced Prediction of Septic Shock In ICU Patients

Stepwise Logistic Regression Model

• Logistic regression
  • A type of regression analysis used for predicting the outcome of a categorical target variable

• Stepwise regression
  • The choice of predictive variables is carried out by an automatic procedure:
    1. Start with no variables in the model
    2. Test the addition of each variable using a chosen model comparison criterion
    3. Add the variable (if any) that improves the model the most
    4. Repeat this process until no variable improves the model
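The four steps of forward stepwise selection can be sketched with NumPy as below. This is a minimal illustration, not the authors' implementation: the slides do not name the model comparison criterion, so AIC is used here as an assumption, the logistic fit is a plain Newton iteration with a tiny ridge term for numerical stability, and the data are synthetic.

```python
import numpy as np

def fit_logistic(X, y, n_iter=50, ridge=1e-6):
    """Fit logistic regression by Newton's method; return (weights, log-likelihood)."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])          # prepend intercept column
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        grad = Xb.T @ (y - p) - ridge * w
        hess = (Xb * (p * (1 - p))[:, None]).T @ Xb + ridge * np.eye(Xb.shape[1])
        w = w + np.linalg.solve(hess, grad)
    p = np.clip(1.0 / (1.0 + np.exp(-Xb @ w)), 1e-12, 1 - 1e-12)
    loglik = float(np.sum(y * np.log(p) + (1 - y) * np.log(1 - p)))
    return w, loglik

def aic(X, y, cols):
    """AIC of a logistic model restricted to the given feature columns."""
    _, loglik = fit_logistic(X[:, cols], y)
    return 2 * (len(cols) + 1) - 2 * loglik

def forward_stepwise(X, y):
    selected, remaining = [], list(range(X.shape[1]))
    best = aic(X, y, selected)                     # step 1: no variables (intercept only)
    while remaining:
        scores = {j: aic(X, y, selected + [j]) for j in remaining}   # step 2
        j_best = min(scores, key=scores.get)
        if scores[j_best] >= best:                 # step 4: stop when nothing improves
            break
        selected.append(j_best)                    # step 3: add the best variable
        remaining.remove(j_best)
        best = scores[j_best]
    return selected

# Synthetic data: only columns 0 and 2 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 4))
logits = 2.0 * X[:, 0] - 1.5 * X[:, 2]
y = (rng.random(400) < 1 / (1 + np.exp(-logits))).astype(float)
selected = forward_stepwise(X, y)
print(selected)
```

Backward stepwise regression follows the same pattern in reverse: start from all variables and repeatedly drop the one whose removal improves the criterion the most.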

Page 13: Imputation-enhanced Prediction of Septic Shock In ICU Patients

Septic Shock Prediction Model

Features
• Physiologic and laboratory values

Importance of time in septic shock
• Feature matrices are created at reference times of 30, 60, 90, and 120 minutes prior to the onset of septic shock

Prediction models
• Logistic regression
• Support vector machine
• Classification tree

Feature sets
• All available features
• Feature set after forward stepwise regression
• Feature set after backward stepwise regression
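One way to picture the reference-time feature matrices is the Python sketch below. The slides do not say how a feature value is chosen at each reference time, so taking the most recent measurement at or before that time is an assumption here, and the feature names, timestamps and values are invented.

```python
REFERENCE_OFFSETS_MIN = (30, 60, 90, 120)   # minutes before onset, from the slide

def feature_row(observations, onset_min, offset_min, features):
    """Latest value of each feature recorded at or before (onset - offset).

    observations: list of (time_min, feature_name, value) tuples.
    A feature with no measurement before the reference time stays None.
    """
    cutoff = onset_min - offset_min
    row = {name: None for name in features}
    latest = {name: None for name in features}
    for t, name, value in observations:
        if name in row and t <= cutoff and (latest[name] is None or t >= latest[name]):
            row[name], latest[name] = value, t
    return row

# Invented measurements for one patient whose septic shock onset is at minute 300.
observations = [
    (100, "hr", 90), (200, "hr", 104), (250, "hr", 118),
    (190, "map", 62), (260, "lactate", 3.8),
]
features = ("hr", "map", "lactate")
rows = {
    offset: feature_row(observations, 300, offset, features)
    for offset in REFERENCE_OFFSETS_MIN
}
for offset, row in rows.items():
    print(offset, row)
```

The earlier reference times see fewer measurements, so their rows contain more missing entries, which is where imputation comes in.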

Page 14: Imputation-enhanced Prediction of Septic Shock In ICU Patients

Decision Tree Learning

Goal
• Create a model that predicts the value of a target variable based on several input variables

Learning a decision tree
• Recursive partitioning based on a selected attribute
• Stopping condition: all samples for a given node belong to the same class

Decision tree types
• Classification trees
• Regression trees

[Figure: example classification tree that splits on Sex (Male / Female), Age (<= 9.5 / > 9.5) and sibsp (<= 2.5 / > 2.5), with leaves labeled Survived or died; leaf shares shown are 36%, 61%, 2% and 2%]
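Recursive partitioning can be sketched in a few lines of Python. The slides do not say how the splitting attribute is selected, so Gini impurity is used here as an assumption. The toy rows (is_male, age, sibsp) loosely echo the attributes of the example tree but are invented.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature_index, threshold) that lowers weighted Gini the most, or None."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels):
    """Partition recursively; a leaf is a class label, a node is (f, t, left, right)."""
    split = None if len(set(labels)) == 1 else best_split(rows, labels)
    if split is None:                                   # pure node or no useful split
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree(*map(list, zip(*left))),
            build_tree(*map(list, zip(*right))))

def predict(tree, row):
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

# Invented rows: (is_male, age, sibsp).
rows = [(0, 29, 0), (0, 4, 1), (0, 52, 1),
        (1, 35, 0), (1, 60, 1), (1, 22, 0),
        (1, 5, 1), (1, 8, 0), (1, 7, 4)]
labels = ["survived"] * 3 + ["died"] * 3 + ["survived", "survived", "died"]

tree = build_tree(rows, labels)
print(tree)
print(predict(tree, (1, 6, 5)))
```

A regression tree has the same structure, but its leaves hold a numeric value and splits are scored by variance reduction instead of class impurity.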

Page 15: Imputation-enhanced Prediction of Septic Shock In ICU Patients

Missing Value Imputation

Missing data in MIMIC-II
• Excluding records with missing values leads to a 47.2% reduction in dataset size

Page 16: Imputation-enhanced Prediction of Septic Shock In ICU Patients

Imputation Methods

1) Mean feature values (mean for subgroup)
• Derived from the patient's gender and age group
• Accounts for fundamental physiological differences between genders and among age groups

Challenges
• Mean substitution is especially problematic when there are many missing values
• It distorts the distribution and variance
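Subgroup mean imputation can be sketched in Python as follows. The age bands, field names and values are hypothetical (the slides do not give the actual age groups), and the fallback to the overall mean for an empty subgroup is an added assumption. The final lines compare the variance before and after imputation to illustrate the distortion mentioned above.

```python
from collections import defaultdict
from statistics import mean, pvariance

def age_group(age):
    """Hypothetical age bands; the slides do not state the actual cut-offs."""
    return "18-44" if age < 45 else "45-64" if age < 65 else "65+"

def impute_group_means(patients, feature):
    """Fill missing values of `feature` with the mean of the patient's
    (gender, age group) subgroup and return the completed list of values."""
    groups = defaultdict(list)
    for p in patients:
        if p[feature] is not None:
            groups[(p["gender"], age_group(p["age"]))].append(p[feature])
    overall = mean(v for vals in groups.values() for v in vals)   # fallback value
    completed = []
    for p in patients:
        key = (p["gender"], age_group(p["age"]))
        if p[feature] is not None:
            completed.append(p[feature])
        else:
            completed.append(mean(groups[key]) if groups[key] else overall)
    return completed

# Invented heart-rate values; None marks a missing measurement.
patients = [
    {"gender": "F", "age": 30, "hr": 70},
    {"gender": "F", "age": 35, "hr": 80},
    {"gender": "F", "age": 40, "hr": None},
    {"gender": "M", "age": 70, "hr": 90},
    {"gender": "M", "age": 72, "hr": 110},
    {"gender": "M", "age": 80, "hr": None},
    {"gender": "M", "age": 68, "hr": None},
]
observed = [p["hr"] for p in patients if p["hr"] is not None]
imputed = impute_group_means(patients, "hr")

print(imputed)
print(pvariance(observed), pvariance(imputed))   # variance before vs. after imputation
```

Every imputed value sits exactly at its subgroup mean, so the more values are missing, the more the filled-in data cluster around a few points.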

Page 17: Imputation-enhanced Prediction of Septic Shock In ICU Patients

Imputation Methods

2) Matrix factorization-based approaches (very popular in bioinformatics)

SVDImpute
• Uses a linear combination of the k most significant eigenvectors to predict the missing value

Probabilistic Principal Component Analysis (PPCA)
• Combines an Expectation-Maximization (EM) approach to Principal Component Analysis (PCA) with a probabilistic model
• Uses a likelihood function to penalize data far from the training set

Bayesian PCA (BPCA)
• EM approach + Bayesian model to calculate the likelihood for the reconstructed data
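The idea behind the matrix factorization approaches can be illustrated with a simplified iterative low-rank SVD imputation in NumPy. This is a sketch in the spirit of SVDImpute, not the exact published algorithm or the one used in the paper: missing entries start at the column mean and are repeatedly replaced by their rank-k reconstruction until the values stop changing. The function name and parameters are hypothetical.

```python
import numpy as np

def svd_impute(X, k=1, n_iter=500, tol=1e-8):
    """Fill NaN entries of X with values from an iteratively refined rank-k SVD."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    col_means = np.nanmean(X, axis=0)
    filled = np.where(missing, col_means, X)           # initial guess: column means
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        low_rank = (U[:, :k] * s[:k]) @ Vt[:k]          # rank-k reconstruction
        updated = np.where(missing, low_rank, X)        # observed entries stay fixed
        change = np.max(np.abs(updated - filled))
        filled = updated
        if change < tol:
            break
    return filled

# Toy rank-1 matrix with one entry removed; the true value is 2 * 3 = 6.
X = np.outer([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0])
X[1, 2] = np.nan
completed = svd_impute(X, k=1)
print(completed[1, 2])
```

PPCA and BPCA replace the plain SVD step with a probabilistic model fitted by EM, which lets them weigh how plausible each reconstructed value is instead of taking the low-rank reconstruction at face value.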

Page 18: Imputation-enhanced Prediction of Septic Shock In ICU Patients

Sepsis Risk Prediction Results

• No baseline model to compare the results with
• Evaluation metric: AUC (area under the ROC curve)
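For reference, AUC can be computed directly from labels and predicted scores. The minimal Python sketch below uses the rank-based identity: AUC equals the fraction of (positive, negative) pairs in which the positive case gets the higher score, with ties counted as half. The labels and scores are invented.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc(labels, scores))   # 3 of the 4 positive/negative pairs are ranked correctly: 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking of patients by risk.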

Page 19: Imputation-enhanced Prediction of Septic Shock In ICU Patients

Septic Shock Prediction Results

• The septic shock EWS as baseline
  • Prediction model: logistic regression
  • Predicts the onset of septic shock one hour in advance
  • Uses invasively gathered data from the MIMIC waveform data

Page 20: Imputation-enhanced Prediction of Septic Shock In ICU Patients

Imputation-enhanced Prediction of Septic Shock

• Impact of various imputation methods at different reference times
• Compared with the baseline, using a logistic regression model

Page 21: Imputation-enhanced Prediction of Septic Shock In ICU Patients

AUC Curves for predicting septic shock 60 minutes before onset

Page 22: Imputation-enhanced Prediction of Septic Shock In ICU Patients

Septic shock prediction 60 minutes before onset for three types of models:

Page 23: Imputation-enhanced Prediction of Septic Shock In ICU Patients

Effect of imputation on logistic regression coefficients for predicting septic shock

• Coefficients are consistent across different imputation methods
• Values obtained with and without imputation are inconsistent
• The non-imputed model suffers from over-fitting

Page 24: Imputation-enhanced Prediction of Septic Shock In ICU Patients

Conclusion

• Imputing missing data can improve model performance, especially when dealing with larger, noisier, and more incomplete datasets
• Matrix factorization imputation methods like BPCA lead to models with better predictive accuracy than simpler approaches like group means
