
JACC: Cardiovascular Imaging, Vol. -, No. -, 2017
© 2017 by the American College of Cardiology Foundation
Published by Elsevier
ISSN 1936-878X/$36.00
https://doi.org/10.1016/j.jcmg.2017.07.024

Prognostic Value of Combined Clinical and Myocardial Perfusion Imaging Data Using Machine Learning
Julian Betancur, PHD,a Yuka Otaki, MD,a Manish Motwani, MB, CHB, PHD,a Mathews B. Fish, MD,b
Mark Lemley, CNMT,b Damini Dey, PHD,a Heidi Gransar, MS,a Balaji Tamarappoo, MD, PHD,a Guido Germano, PHD,a

Tali Sharir, MD,c Daniel S. Berman, MD,a Piotr J. Slomka, PHDa

ABSTRACT


OBJECTIVES This study evaluated the added predictive value of combining clinical information and myocardial perfusion single-photon emission computed tomography (SPECT) imaging (MPI) data using machine learning (ML) to predict major adverse cardiac events (MACE).

BACKGROUND Traditionally, prognostication by MPI has relied on visual or quantitative analysis of images without objective consideration of the clinical data. ML permits a large number of variables to be considered in combination and at a level of complexity beyond the human clinical reader.

METHODS A total of 2,619 consecutive patients (48% men; 62 ± 13 years of age) who underwent exercise (38%) or pharmacological stress (62%) with high-speed SPECT MPI were monitored for MACE. Twenty-eight clinical variables, 17 stress test variables, and 25 imaging variables (including total perfusion deficit [TPD]) were recorded. Areas under the receiver-operating characteristic curve (AUC) for MACE prediction were compared among: 1) ML with all available data (ML-combined); 2) ML with only imaging data (ML-imaging); 3) 5-point scale visual diagnosis (physician [MD] diagnosis); and 4) automated quantitative imaging analysis (stress TPD and ischemic TPD). ML involved automated variable selection by information gain ranking, model building with a boosted ensemble algorithm, and 10-fold stratified cross validation.

RESULTS During follow-up (3.2 ± 0.6 years), 239 patients (9.1%) had MACE. MACE prediction was significantly higher for ML-combined than ML-imaging (AUC: 0.81 vs. 0.78; p < 0.01). ML-combined also had higher predictive accuracy compared with MD diagnosis, automated stress TPD, and automated ischemic TPD (AUC: 0.81 vs. 0.65 vs. 0.73 vs. 0.71, respectively; p < 0.01 for all). Risk reclassification for ML-combined compared with visual MD diagnosis was 26% (p < 0.001).

CONCLUSIONS ML combined with both clinical and imaging data variables was found to have high predictive accuracy for 3-year risk of MACE and was superior to existing visual or automated perfusion assessments. ML could allow integration of clinical and imaging data for personalized MACE risk computations in patients undergoing SPECT MPI. (J Am Coll Cardiol Img 2017;-:-–-) © 2017 by the American College of Cardiology Foundation.

From the aDepartments of Imaging, Medicine, and Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, California; bOregon Heart and Vascular Institute, Sacred Heart Medical Center, Springfield, Oregon; and the cDepartment of Nuclear Cardiology, Assuta Medical Centers, Tel Aviv, Israel. This research was supported in part by grant R01HL089765 from the National Heart, Lung, and Blood Institute/National Institutes of Health (PI: Piotr Slomka). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. Drs. Betancur and Otaki contributed equally to this work. Drs. Berman, Germano, and Slomka have received royalties from Cedars-Sinai Medical Center. All other authors have reported that they have no relationships relevant to the contents of this paper to disclose.

Manuscript received March 27, 2017; revised manuscript received July 5, 2017, accepted July 5, 2017.

Traditionally, the prognostic value of myocardial perfusion single-photon emission computed tomography (SPECT) imaging (MPI) has been studied with semiquantitative visual and quantitative analysis of image data (1–3). A number of previous studies have shown that clinical demographics, functional parameters, and hemodynamic and stress results all affect the evaluation of

ABBREVIATIONS AND ACRONYMS

CAD = coronary artery disease
CT = computed tomography
MACE = major adverse cardiac events
MD = physician
ML = machine learning
MPI = myocardial perfusion imaging
SPECT = single-photon emission computed tomography
TID = transient ischemic dilation
TPD = total perfusion deficit


MPI (4–7). This integration of clinical information and imaging data into a final impression is currently performed subjectively by physicians when they assess the MPI test, often in a nonstandardized manner.

Machine learning (ML) is a field of computer science that uses computer algorithms to identify patterns in large multivariable datasets and can be used to predict outcomes. In recent years, ML has been used for prediction and decision-making in a multitude of disciplines, including internet search engines, customized advertising, natural language processing, finance trending, and robotics (8–10). For MPI, a large number of parameters, including clinical variables, stress test results, and imaging data variables, could be considered by ML for outcome prediction. We evaluated the benefits of combining all of these variables using an ML algorithm to predict major adverse cardiac events (MACE) (8). ML prediction using combined data was also compared with physician (MD) diagnosis (based on a visual read with awareness of clinical data) and with automated perfusion quantification indexes (stress and ischemic total perfusion deficit [TPD]).

METHODS

STUDY POPULATION. A total of 2,689 consecutive patients who were referred for clinically indicated exercise or pharmacological stress MPI at Sacred Heart Medical Center between January 2010 and December 2011 were included. The study was approved by the institutional review board, including a waiver for informed consent. After excluding 70 patients with early revascularization within 90 days, 2,619 patients were included for further analysis.

CLINICAL DATA. Clinical data were derived from patients’ medical records and included age, sex, and risk factors. Recorded risk factors were hypertension, diabetes mellitus, dyslipidemia, smoking (defined as current smoking or cessation within 3 months of testing), and family history of premature clinical coronary artery disease (CAD). The presence and type of chest pain, and shortness of breath, were assessed by the stress testing MD.

MPI AND STRESS PROTOCOLS. Resting and/or stress 1-day technetium-99m sestamibi imaging was performed using a high-efficiency, solid-state SPECT scanner (D-SPECT, Spectrum-Dynamics, Haifa, Israel) (11). Weight-adjusted doses of 353 ± 151 MBq (9.5 ± 4.1 mCi) for rest and 1,252 ± 196 MBq (34 ± 5.3 mCi) for stress (recommended by vendor) were used (12), equivalent to a total average effective dose of 10.7 mSv based on the latest International Commission on Radiological Protection 103 estimates (13). Patients underwent symptom-limited Bruce protocol exercise testing (38%) or pharmacological stress (62%; regadenoson 0.4 mg) with injection at peak stress. Resting image acquisition was performed supine with 6- to 10-min acquisition time, based on patient body mass index. Upright and supine stress imaging (4 to 6 min) began 15 to 30 min after stress.

Transaxial images were generated from list-mode data by maximum likelihood expectation maximization reconstruction (11). No attenuation or scatter correction was applied. Images were automatically re-oriented into short-axis, and vertical and horizontal long-axis slices with Quantitative Perfusion SPECT (QPS)/Quantitative Gated SPECT (QGS) software (Cedars-Sinai Medical Center, Los Angeles, California).

VISUAL PERFUSION ANALYSIS. The visual analysis was done by multiple MDs who were aware of patient clinical information and quantitative assessment at the time of the study. Reader scan interpretation (MD diagnosis) was scored as 0 = normal, 1 = equivocal, 2 = probably abnormal, 3 = abnormal, or 4 = definitely abnormal. A 3-step scale probability of CAD was also reported (0 = low, 1 = intermediate, 2 = high).

AUTOMATED QUANTIFICATION. All image datasets were de-identified, transferred to Cedars-Sinai Medical Center, and checked for quality by a single experienced core laboratory technologist without knowledge of clinical data. Myocardial contours generated automatically by QPS/QGS software were evaluated and, when necessary, adjusted to correspond to the myocardium. Upright and supine images were quantified as previously described (14). We used automatic TPD, a quantitative perfusion variable that reflects a combination of defect extent and severity, and produces stress, rest, and ischemic (stress – rest) TPD values. Ejection fraction, and systolic and diastolic volumes at stress and rest were quantified separately for each acquisition using standard QGS software with 8 frames per cardiac cycle. Transient ischemic dilation (TID) was computed as previously described (15). Counts in the left ventricle were obtained by planar projections of the left ventricular region defined during the first step of data reconstruction (16).

OUTCOME AND FOLLOW-UP DATA COLLECTION. The endpoint was MACE, which consisted of all-cause mortality, nonfatal myocardial infarction, unstable angina, or late coronary revascularization (percutaneous coronary intervention or coronary artery bypass grafting). All-cause mortality was

FIGURE 1 Machine Learning Pathway

[Flowchart: data (2,619 cases with imaging, stress test, and clinical data); stratified 10-fold cross validation (90% for training, 10% holdout for testing; repeated 10 times); variable selection by information gain ratio ranking; model building with LogitBoost; MACE probability scores derived for the entire population from the 10 models; overall prediction estimated by combining all probability scores.]

The overall population is divided into 10 equally sized groups (1, 2, ..., 10) with approximately the same incidence of major adverse cardiac events (MACE) (stratified). Of the 10 groups, 1 (10%) is retained as the test set (holdout set), and the others (90%) are used as the training set. To estimate the machine learning (ML) performance for all the data, the cross-validation procedure loops 10 times over these groups, each time performing variable selection and model building with a different training set, and then testing this model on the unseen test set. Therefore, each data point is used once for testing and 9 times for training, and the result is 10 experimental LogitBoost models trained on 90% fractions. Once finished, the estimates of MACE probability for each of the 10 holdout sets derived by the corresponding 10 models are concatenated to provide an overall expected estimate of ML performance with unseen (holdout) data.


determined from the Social Security Death Index and combined with MACE obtained from the hospital electronic medical records, including all clinics, as well as cardiology group and hospital visits. Nonfatal myocardial infarction was defined based on the criteria of hospital admission for chest pain, elevated cardiac enzyme levels, and typical changes on the electrocardiogram (17). The first event in each patient was used as the outcome. Patients with early revascularization ≤90 days after MPI were excluded.

MACHINE LEARNING. Figure 1 illustrates the ML pathway, which involved automated variable selection by information gain ratio ranking and model building with a boosted ensemble algorithm, both worked into a stratified 10-fold cross validation procedure, as reported in our previous work (8). ML techniques were implemented in the open-source Waikato Environment for Knowledge Analysis (WEKA) platform 3.8.0 (University of Waikato, Hamilton, New Zealand) (18).

VARIABLE SELECTION. Twenty-five imaging data variables, 17 stress test variables, and 28 clinical variables were available for variable selection by the information gain ratio (18). Information gain ratio offers a measure of the effectiveness of a variable in classifying the training data. Only variables that resulted in an information gain ratio >0 were subsequently used in model building (Figure 2B).
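As an illustration of the selection criterion described above, the following is a minimal sketch of the information gain ratio for a categorical variable. It is not the WEKA implementation used in the study; the function names and the toy data are hypothetical, and continuous variables would first need to be discretized, a step not shown here.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def gain_ratio(variable, labels):
    """Information gain of `variable` about `labels`, divided by the variable's own entropy."""
    n = len(labels)
    groups = {}
    for v, y in zip(variable, labels):
        groups.setdefault(v, []).append(y)
    # Entropy of the labels that remains after splitting on the variable
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(variable)
    return info_gain / split_info if split_info > 0 else 0.0

# Hypothetical toy data: 1 = MACE, 0 = no MACE
mace = [1, 1, 0, 0, 1, 1, 0, 0]
informative = [1, 1, 0, 0, 1, 1, 0, 0]    # separates the outcome perfectly
uninformative = [1, 0, 1, 0, 1, 0, 1, 0]  # unrelated to the outcome

candidates = {"informative": informative, "uninformative": uninformative}
# Keep only variables with a gain ratio above 0, as in the text
selected = [name for name, v in candidates.items() if gain_ratio(v, mace) > 0]
```

In this toy case only the informative variable is retained, which mirrors the rule that variables with an information gain ratio of 0 are not passed on to model building.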

MODEL BUILDING. Predictive classifiers for MACE scoring were developed by an ensemble (“boosting”) LogitBoost algorithm. The principle behind ML ensemble boosting is to combine the prediction of simple classifiers with weak performances to create a single strong classifier (19). These weak predictions are then combined in an ensemble (weighted majority voting) to derive an overall classifier, the ML score.
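To make the boosting idea concrete, here is a minimal sketch of two-class LogitBoost in plain Python. It is not the WEKA code used in the study. The weak learner (a one-split regression stump), the number of rounds, the clipping constants, and the toy data are all assumptions made for illustration; the paper does not state these details.

```python
import math

def fit_stump(X, z, w):
    """Weighted least-squares regression stump: returns (feature, threshold, left value, right value)."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [i for i, row in enumerate(X) if row[j] <= t]
            right = [i for i, row in enumerate(X) if row[j] > t]
            if not left or not right:
                continue
            cl = sum(w[i] * z[i] for i in left) / sum(w[i] for i in left)
            cr = sum(w[i] * z[i] for i in right) / sum(w[i] for i in right)
            err = (sum(w[i] * (z[i] - cl) ** 2 for i in left)
                   + sum(w[i] * (z[i] - cr) ** 2 for i in right))
            if best is None or err < best[0]:
                best = (err, j, t, cl, cr)
    if best is None:
        raise ValueError("no feature offers a valid split")
    return best[1:]

def logitboost(X, y, n_rounds=10):
    """Fit an additive model F(x) by repeatedly fitting stumps to a weighted working response."""
    F = [0.0] * len(y)
    stumps = []
    for _ in range(n_rounds):
        p = [1.0 / (1.0 + math.exp(-2.0 * f)) for f in F]
        w = [max(pi * (1.0 - pi), 1e-10) for pi in p]          # case weights
        z = [max(-4.0, min(4.0, (yi - pi) / wi))               # clipped working response
             for yi, pi, wi in zip(y, p, w)]
        j, t, cl, cr = fit_stump(X, z, w)
        stumps.append((j, t, cl, cr))
        F = [f + 0.5 * (cl if row[j] <= t else cr) for f, row in zip(F, X)]
    return stumps

def predict_proba(stumps, row):
    """Combine all weak predictions into a single probability-like score."""
    F = sum(0.5 * (cl if row[j] <= t else cr) for j, t, cl, cr in stumps)
    return 1.0 / (1.0 + math.exp(-2.0 * F))

# Hypothetical toy data: columns are stress TPD (%) and age (yrs); 1 = MACE
X = [[2, 55], [3, 60], [1, 48], [12, 71], [15, 66], [9, 80]]
y = [0, 0, 0, 1, 1, 1]
model = logitboost(X, y)
high_risk = predict_proba(model, [14, 70])
low_risk = predict_proba(model, [1, 50])
```

Each round reweights the cases toward those the current model predicts poorly, and the final score is the sum of all stump outputs passed through a logistic function.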

CROSS VALIDATION. The performance and general error estimation of the entire ML process (variable selection and LogitBoost) were assessed using stratified 10-fold cross validation (Figure 1), which is currently the preferred validation technique in machine learning (18). The main advantages of this technique, compared with the conventional split-sample approach, are: 1) it reduces the variance in prediction error; 2) it maximizes the use of data for

FIGURE 2 Variable Selection

[Bar charts. Panel A: variables ranked by information gain ratio (x-axis: information gain ratio, 0 to 0.1). Panel B: the same variables ranked by individual AUC (x-axis: AUC, 0.48 to 0.78). Legend: information gain ratio > 0 (solid bars); information gain ratio = 0 (open bars).]

(A) Twenty-five imaging data (gray bars: 22 selected), 17 stress test (pink bars: 8 selected) and 28 clinical (green bars: 17 selected) variables ranked by their mean (95% confidence interval [CI]) information gain ratio within 10-fold cross-validation. (B) Same variables ranked by their individual area under the receiver-operating characteristic curve (AUC) [95% CI] for MACE prediction. Variables selected by information gain ratio are shown as solid bars. Nonselected variables are shown by open bars. BP = blood pressure; beats/min = beats per minute; CABG = coronary artery bypass graft; ECG = electrocardiography; EDV = end-diastolic volume; EF = ejection fraction; ESV = end-systolic volume; LV = left ventricular; MET = metabolic equivalent; PCI = percutaneous coronary intervention; TAVR = transcatheter aortic valve replacement; TPD = total perfusion deficit; other abbreviations as in Figure 1.




both training and validation, without overfitting or overlap between the test and validation data; and 3) it guards against testing hypotheses suggested by arbitrarily split data (20).
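The cross-validation procedure described above can be sketched as follows. This is an illustrative outline, not the WEKA procedure used in the study. The function names are hypothetical, and the `fit` callback stands in for the whole training pipeline (variable selection plus model building), which must only see the training cases of each fold.

```python
import random

def stratified_folds(labels, k=10, seed=0):
    """Assign every case index to one of k folds with a similar event rate in each fold."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)   # deal cases of each class round-robin
    return folds

def cross_validate(labels, fit, predict, k=10):
    """Return one held-out score per case; each score comes from a model that never saw that case."""
    folds = stratified_folds(labels, k)
    scores = [None] * len(labels)
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        model = fit(train_idx)         # variable selection and model building on 90% of cases
        for i in test_idx:
            scores[i] = predict(model, i)
    return scores

# Hypothetical cohort with the same size and event count as the study
labels = [1] * 239 + [0] * 2380
folds = stratified_folds(labels)

# Placeholder "model": the training-set event rate, used only to show the data flow
scores = cross_validate(
    labels,
    fit=lambda train_idx: sum(labels[i] for i in train_idx) / len(train_idx),
    predict=lambda model, i: model,
)
```

Concatenating the held-out scores in this way yields one prediction per patient, which is what allows a single AUC to be computed over the whole population.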

STATISTICAL ANALYSIS. Using receiver-operating characteristic analysis and pairwise comparisons according to DeLong et al. (21), the predictive accuracy for MACE was compared among: 1) ML with all

TABLE 1 Patient Characteristics

                           All Patients     MACE+          MACE−
                           (N = 2,619)      (n = 239)      (n = 2,380)     p Value
Age, yrs                   62 ± 13          70 ± 12        62 ± 12         <0.0001
Men                        1,247 (48)       128 (54)       1,119 (47)      0.054
Body mass index, kg/m2     31 ± 8           30 ± 9         32 ± 8          <0.01
CAD risk factors
  Diabetes                 691 (26)         100 (42)       591 (25)        <0.001
  Hypercholesterolemia     1,491 (57)       141 (59)       1,350 (57)      0.5
  Hypertension             1,692 (65)       181 (76)       1,511 (63)      <0.001
  Family history of CAD    1,006 (38)       66 (28)        940 (40)        <0.001
  Smoker                   662 (25)         65 (27)        597 (25)        0.474
  Typical angina           301 (11)         38 (16)        263 (11)        <0.05
History of CAD
  Previous MI              130 (5)          31 (13)        99 (4)          <0.001
  Previous PCI             231 (9)          52 (22)        179 (8)         <0.001
  Previous CABG            172 (7)          36 (15)        136 (6)         <0.001

Values are mean ± SD or n (%).

CABG = coronary artery bypass graft; CAD = coronary artery disease; MACE = major adverse cardiac event; MI = myocardial infarction; PCI = percutaneous coronary intervention.

TABLE 2 Pharmacologic and Exercise Stress Test Results

Pharmacologic stress (n = 1,614)              MACE+          MACE−
                                              (n = 217)                      p Value
Resting heart rate, beats/min                 75 ± 14        73 ± 13         <0.05
Peak heart rate at stress, beats/min          95 ± 19        103 ± 20        <0.0001
Resting SBP, mm Hg                            132 ± 22       132 ± 20        0.577
Resting DBP, mm Hg                            73 ± 12        77 ± 12         <0.001
Peak SBP, mm Hg                               131 ± 27       143 ± 27        <0.0001
Peak DBP, mm Hg                               70 ± 12        76 ± 13         <0.0001

Exercise stress (n = 1,005)                   MACE+          MACE−
                                              (n = 22)       (n = 983)       p Value
Resting heart rate, beats/min                 81 ± 13        76 ± 13         0.072
Peak heart rate at stress, beats/min          142 ± 13       148 ± 13        <0.05
Resting SBP, mm Hg                            128 ± 19       126 ± 17        0.647
Resting DBP, mm Hg                            74 ± 9         79 ± 10         <0.05
Peak SBP, mm Hg                               179 ± 27       181 ± 25        0.703
Peak DBP, mm Hg                               84 ± 10        83 ± 12         0.700
Ischemic ST change during exercise stress     7 (32)         175 (18)        0.091

Values are mean ± SD or n (%).

DBP = diastolic blood pressure; SBP = systolic blood pressure; other abbreviation as in Table 1.



available data (ML-combined); 2) ML with only imaging data (ML-imaging); 3) a 5-point scale visual diagnosis (MD diagnosis); and 4) automated quantitative imaging analysis (stress TPD and ischemic TPD). Brier score and Pearson correlation were computed between predicted and observed MACE (22). For all analyses, MACE-free patients were censored to their follow-up date. To define the low-risk limit for MACE prediction by ML-combined, we used clinical diagnosis = 0, which is considered as definitely normal scans, as a well-established, low-risk limit. Then, low-risk cutoffs for ML-combined and TPD were calculated for approximately the same population percentile as for the MD diagnosis = 0 (87th percentile). Subsequently, improvement in risk classification using ML-combined compared with the MD diagnosis was assessed with a 5-category reclassification. Statistical calculations were performed using R software version 3.3.1 (R Foundation, Vienna, Austria) and PredictABEL package (R Foundation) for the reclassification.
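For readers less familiar with the AUC used throughout these comparisons, the sketch below computes it as the probability that a randomly chosen patient with MACE receives a higher score than a randomly chosen MACE-free patient, with ties counted as one-half. The data are hypothetical, and the DeLong test for comparing two AUCs is not implemented here.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of event and event-free cases."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: 1 = MACE, 0 = no MACE
mace = [0, 0, 1, 0, 1, 0]
ml_score = [0.05, 0.10, 0.35, 0.20, 0.40, 0.02]   # continuous risk score
md_diagnosis = [0, 0, 3, 3, 4, 0]                 # 5-point visual scale (0 to 4)

auc_ml = auc(ml_score, mace)
auc_md = auc(md_diagnosis, mace)
```

The same function applies to a continuous ML score and to an ordinal visual score, which is why the two can be compared on one scale; coarse ordinal scores produce more ties and therefore tend to lose discrimination.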

RESULTS

STUDY POPULATION AND OUTCOME. Table 1 shows the baseline clinical characteristics of the studied population. When the first event per patient was considered, there were 239 (9.1%) 3-year MACE, with 150 (5.7%) all-cause deaths, 11 (0.4%) nonfatal MIs, 24 (0.9%) unstable anginas, and 54 (2.1%) late target revascularizations. The observed annual MACE rate was 3%.

HEMODYNAMIC AND MPI RESULTS. Table 2 shows hemodynamic and stress results separately for pharmacological stress and for exercise stress. The frequency of exercise stress was lower among patients with MACE compared with those without MACE (9% with MACE vs. 41% without MACE; p < 0.0001). Table 3 shows quantitative and visual MPI results. For the quantitative evaluation of perfusion and function, 9.8% of myocardial contours were corrected by the core laboratory technologist.

VARIABLE SELECTION. Figure 2A shows the average information gain ratio within 10-fold cross validation. On average, 22 imaging data, 8 stress test, and 17 clinical variables were selected. All perfusion and functional variables from MPI had an information gain ratio >0, including left ventricular counts and injected dose. The top 9 selected variables were all imaging data variables.

MACE PREDICTION BY INDIVIDUAL VARIABLES. Figure 2B shows the area under the receiver-operating characteristic curve (AUC) for the prediction of MACE by each individual variable. Stress TPD, stress heart rate, ischemic TPD, stress systolic blood pressure, resting TPD, and age were the best individual predictors. Compared with the information gain ratio in Figure 2A, there were some variables for which individual AUCs were predictive, yet they did not offer incremental information gain for predicting MACE (white bars). Furthermore, the variables with highest AUCs did not always have the highest information gain ratio.

MACE PREDICTION BY COMBINED VARIABLES. MACE prediction was significantly higher for ML-combined

TABLE 3 Perfusion and Functional Results

                                                   MACE+           MACE−
                                                   (n = 239)       (n = 2,380)     p Value
MD-diagnosis: normal                               142 (59)        2,138 (90)      <0.001
MD-diagnosis: abnormal or definitely abnormal      89 (37)         217 (9)         <0.001
Stress TPD, %                                      9 ± 11          3 ± 5           <0.0001
Ischemic TPD, %                                    4 ± 4           2 ± 3           <0.0001
Resting TPD, %                                     5 ± 9           1 ± 3           <0.0001
Stress EDV, ml                                     112 ± 57        91 ± 36         <0.0001
Stress ESV, ml                                     96 ± 57         73 ± 33         <0.0001
Stress EF, %                                       46 ± 9          49 ± 3          <0.0001
Rest EDV, ml                                       105 ± 52        89 ± 34         <0.0001
Rest ESV, ml                                       89 ± 52         71 ± 31         <0.0001
Rest EF, %                                         46 ± 8          49 ± 3          <0.0001
Transient ischemic dilation                        1.09 ± 0.16     1.03 ± 0.14     <0.0001

Values are n (%) or mean ± SD.

EDV = end-diastolic volume; EF = ejection fraction; ESV = end-systolic volume; MD = physician; TPD = total perfusion deficit; other abbreviation as in Table 1.

than ML-imaging (AUC: 0.81, 95% confidence interval [CI]: 0.78 to 0.83 vs. AUC: 0.78, 95% CI: 0.75 to 0.81; p < 0.01). ML-combined also had a higher AUC compared with the AUCs of automated stress TPD and automated ischemic TPD (Figure 3), and compared with the AUCs for probability of CAD (0.64; 95% CI: 0.61 to 0.66) or MD diagnosis (0.65; 95% CI: 0.62 to 0.68), as reported by the MD (all p < 0.001). When stress test variables were added to image variables for ML integration, AUC did not change significantly (AUC: 0.79, 95% CI: 0.76 to 0.82 vs. AUC: 0.78, 95% CI: 0.75 to 0.81; p = 0.4).

FIGURE 3 ROC Curves for Prediction of 3-Year MACE (239 of 2,619 Events)

[ROC plot: x-axis, specificity (1.0 to 0.0); y-axis, sensitivity (0.0 to 1.0). Inset of AUC (bars) and 95% CI (whiskers): ML-combined 0.81; ML-imaging 0.78; stress TPD 0.73; ischemic TPD 0.72.]

ML combining all variables using variable selection and LogitBoost algorithm (ML-combined) had a significantly higher AUC for MACE prediction than ML combining imaging data variables only (ML-imaging), and standard image analysis. *p < 0.01; **p < 0.001, in AUC comparison by DeLong test. ROC = receiver-operating characteristic; other abbreviations as in Figures 1 and 2.

The Brier score for ML-combined prediction of MACE was 0.07, which indicated good calibration between ML scores (estimated predicted risk) and observed 3-year risk. The plot of observed MACE versus predicted MACE over percentiles of ML-combined risk is shown in Figure 4. High correlation of ML-combined predicted MACE versus observed MACE was found (r = 0.97; p < 0.0001).
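As a reminder of what the Brier score measures, the sketch below computes it as the mean squared difference between predicted probability and observed outcome. The reference value shown is an illustrative calculation, not a result reported in the paper: it is the score of a hypothetical predictor that assigns every patient the cohort event rate (239 of 2,619).

```python
def brier_score(predicted, observed):
    """Mean squared difference between predicted probabilities and 0/1 outcomes (lower is better)."""
    return sum((p - y) ** 2 for p, y in zip(predicted, observed)) / len(observed)

# Cohort composition taken from the text: 239 events among 2,619 patients
observed = [1] * 239 + [0] * 2380
prevalence = 239 / 2619

# Hypothetical uninformative predictor: everyone receives the cohort event rate
reference = brier_score([prevalence] * len(observed), observed)

# Closed form for the same quantity: prevalence * (1 - prevalence)
closed_form = prevalence * (1 - prevalence)
```

Under these assumptions the reference score is about 0.083, which gives some context for the reported value of 0.07; because the Brier score depends on event prevalence, it is best read against such a baseline.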

RISK RE-CATEGORIZATION. To allow categorical comparison, a low-risk, ML-combined score (<0.15) was determined as the cutoff that defined the same percentile as visual MD diagnosis = 0 (87th percentile). This percentile also approximately corresponded to the stress TPD threshold of <5% (14). For patients within the 95th to 100th percentile of the ML-combined score, 19% (25 of 131) of patients had a normal MD diagnosis and 10% (13 of 131) had stress TPD of <5% (Figure 5). Finally, a 5-category risk reclassification was 26% for ML-combined scores compared with a 5-category MD diagnosis (p < 0.001) (Table 4), with 30.5% improved identification of patients with MACE and −5% decreased identification of MACE-free patients (all p < 0.001).
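The reclassification figures above can be reproduced from the counts in Table 4. The sketch below assumes the standard categorical net reclassification improvement (events moved up minus events moved down, plus non-events moved down minus non-events moved up); the paper used the PredictABEL package, and this hand calculation is only meant to show where the numbers come from.

```python
# Rows: MD diagnosis (normal, equivocal, probably abnormal, abnormal, definitely abnormal)
# Columns: ML risk category (low, equivocal, mild, moderate, severe). Counts from Table 4.
events = [
    [99, 19, 9, 7, 8],
    [1, 0, 1, 0, 2],
    [2, 0, 0, 1, 1],
    [11, 5, 8, 7, 55],
    [1, 1, 0, 1, 0],
]
non_events = [
    [1959, 95, 35, 16, 33],
    [5, 1, 0, 2, 3],
    [8, 0, 0, 3, 3],
    [69, 29, 21, 23, 67],
    [3, 0, 1, 1, 3],
]

def moved(table):
    """Count cases placed in a higher (up) or lower (down) category by ML than by the MD."""
    up = sum(v for r, row in enumerate(table) for c, v in enumerate(row) if c > r)
    down = sum(v for r, row in enumerate(table) for c, v in enumerate(row) if c < r)
    total = sum(sum(row) for row in table)
    return up, down, total

up_e, down_e, n_e = moved(events)        # patients with MACE
up_n, down_n, n_n = moved(non_events)    # MACE-free patients

event_nri = (up_e - down_e) / n_e        # net gain among patients with MACE
non_event_nri = (down_n - up_n) / n_n    # net gain among MACE-free patients (negative here)
nri = event_nri + non_event_nri
```

With these counts, 103 patients with MACE were moved up and 30 moved down (net 30.5%), while 257 MACE-free patients were moved up and 137 moved down (net about −5%), giving the overall reclassification of about 26%.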

DISCUSSION

We developed and validated a highly accurate, personalized method for post-MPI risk computation that used ML. This approach allowed the combination of all available clinical, stress test, and automatically derived imaging data variables without a priori assumptions about the influence or weighting of individual factors, or how they may interact. The method was used to evaluate the added value of clinical and stress test information for the prediction of MACE after MPI. The observed 3% annual MACE rate was similar to previous studies that assessed the prognostic value of SPECT MPI (4). The only human input required for the derivation of the ML-combined MACE risk score was the collation of clinical data from health records (conceivably a task fulfilled by advanced text mining in the future) and the adjustment of contours by the technologists in a minority (<10%) of the cases. Figure 6 illustrates how the proposed ML model would allow prediction of the risk of MACE for an individual unknown case by

FIGURE 4 Observed Versus Predicted 3-Year Risk of MACE

[Plot: x-axis, percentile of ML score (5 to 100, in steps of 5); y-axes, observed proportion of events (%, 0 to 60) and predicted ML score (0.0 to 0.6).]

Observed proportion of events (pink bars) and predicted ML score (green points) grouped by every fifth percentile of risk. Abbreviations as in Figure 1.

FIGURE 5 Frequency of Normal Clinical Diagnosis and Low Perfusion Scores by Predicted ML Risk Percentile

[Bar chart: x-axis, percentile of ML score (<25, 25–49, 50–74, 75–94, ≥95); y-axis, frequency (%, 0 to 100); series, normal clinical diagnosis and stress TPD <5%.]

The frequency of patients with normal clinical diagnosis and low automated perfusion score (TPD <5%) across percentiles of the ML score. Abbreviations as in Figures 1 and 2.



automatically integrating the clinical data with the imaging data.

The performance of the ML-combined score was superior to image risk metrics that are traditionally used to study prognostic outcomes after MPI (1–7). The AUC estimate, derived in a rigorous manner with test and training data separated within 10-fold cross validation (preventing overfitting), was substantially higher than that for ML-imaging, as well as visual or automated MPI assessment. Furthermore, risk reclassification analysis demonstrated that the ML-combined risk allowed better classification of high-risk patients than visual clinical diagnosis. Risk reclassification revealed that the ML-combined score could increase the risk score for >30% of patients with MACE, but also increased the risk score for 5% of MACE-free patients. At the same time, we found that 19% of the patients in the highest ML-combined risk category (top 5%), with a MACE incidence of 38%, were still read as normal scans with an MD diagnosis = 0. These observations highlight the difficulty in finding the appropriate thresholds for the multicategory risk scores. The low-risk threshold in this study was derived for the same population percentile as “normal” visual scans, and subsequent higher risk thresholds were defined at 5% increments of increasing ML risk score. Furthermore, we found that automatically derived stress and/or ischemic TPD had better predictive value for MACE than a clinical diagnosis, which was in line with our previous reports (9,23), but has not been previously reported in prognostic studies.

To our knowledge, this was the first study thatapplied ML to predict MACE in patients who

FIGURE 6 Illustrat

PatientStre

MPerf

QGS ¼ quantitative g

single-photon emiss

TABLE 4 Risk Reclassification by ML Versus MD Diagnosis

MD Diagnosis

ML-Boosting Risk Category

TotalLow<0.15

Equivocal0.15–0.2

Mild0.2–0.25

Moderate0.25–0.3

Severe$0.3

MACE (n ¼ 239)

Normal 99 19* 9* 7* 8* 142

Equivocal 1† 0 1* 0* 2* 4

Probably abnormal 2† 0† 0 1* 1* 4

Abnormal 11† 5† 8† 7 55* 86

Definitely abnormal 1† 1† 0† 1† 0 3

Total 114 25 18 16 66 239

No MACE (n ¼ 2,380)

Normal 1,959 95* 35* 16* 33* 2,138

Equivocal 5† 1 0* 2* 3* 11

Probably abnormal 8† 0† 0 3* 3* 14

Abnormal 69† 29† 21† 23 67* 209

Definitely abnormal 3† 0† 1† 1† 3 8

Total 2,044 125 57 45 109 2,380

Reclassification 26%

*Up-risking by machine learning (ML). †De-risking by ML.

Abbreviations as in Tables 1 and 3.



8

underwent MPI. Recently, our group assessed thefeasibility and accuracy of ML to predict 5-yearall-cause mortality in 10,030 patients who under-went coronary computed tomography (CT) angiog-raphy (8). In this analysis, ML exhibited a higher AUCcompared with the Framingham risk score or visualCT severity scores alone (8). Automated processing ofCT images was not used. In contrast, the presentstudy capitalized on established automated process-ing software tools that were validated in nuclear

ion of Prognostic Risk Computation in an Individual Patient by the Propo

ss, Rest Scans(QPS/QGS orEquivalent)

ImageQuantification

yocardialusion SPECTImaging

ImagingData

Variables

ated single-photon emission computed tomography; QPS ¼ quantitative pe

ion computed tomography; other abbreviation as in Figure 1.

cardiology to provide multiple imaging data variableswith limited manual interaction. The intent was todemonstrate the feasibility of edging us closer to acompletely automated computer-powered imaginganalysis and risk assessment. A future direction andpotential next step will be to develop tools that arealso capable of automatically extracting clinical vari-ables, for example, by text mining electronic healthrecords.

The ML approach provides a computational integration of all available information that is not feasible for subjective analysis by the reporting physician. As part of the clinical decision-making, physicians take into account clinical and stress testing data; however, this is done subjectively without a systematic way of integrating information. Furthermore, although including these variables as part of the MPI report is recommended by guidelines, integration of these findings in the report is not yet part of standardized reporting guidelines (24,25). Intuitive patient-specific weighting of all individual clinical and imaging factors for assessing risk could not be expected to be precise, or consistent among different medical centers, whether performed by the interpreting physician or the physician managing the patient.

Although the average patient radiation dose (10.7 mSv) used in this study was higher than that specified in current guideline recommendations (26), the data were collected before the latest guidelines were adopted, using the same-day rest-first protocol optimized for the acquisition speed rather than for the radiation dose. Furthermore, a weight-based protocol was used, and most of the patients were obese (body mass index ≥30 kg/m²). It is likely that at least a 50% lower effective radiation dose could be achieved with longer acquisition times without any effect on image quality, as previously studied (16). Further dose reductions could be achieved with stress-first and/or stress-only protocols.

PERSPECTIVES

COMPETENCY IN MEDICAL KNOWLEDGE: Combining clinical and imaging information by an ML algorithm exhibited significantly better MACE prediction than using only imaging information or performing visual and automated perfusion assessment alone in SPECT MPI.

TRANSLATIONAL OUTLOOK: Adding clinical information to imaging data by ML will aid comprehensive MPI assessment to improve clinical patient management.

IMPLICATIONS. The ability to optimally assess risk in individual patients remains a major challenge in cardiology. With MPI, visual image analysis itself is subjective, and the overall risk assessment that incorporates clinical, stress test, and imaging results is highly variable, based on physician knowledge and experience, and limited by the complexity of appropriately assigning weight to individual factors. The presented ML score provides an automated, precise, and objective risk estimate that combines imaging, clinical, and stress testing variables. The same optimal method for risk computation would be readily available to all imaging centers, including less experienced centers. The practical implementation will depend on the ability to interface the MPI reporting workstation with electronic patient records, to access the clinical variables. Such a tool could perhaps be interfaced with large registry data (e.g., the ImageGuide registry of the American Society of Nuclear Cardiology [25]), which could collect clinical variables similar to those used in this study. The implementation will depend on the availability of the interface to the electronic health records.

STUDY LIMITATIONS. This was a single-center study, and further multicenter and external validation of the derived risk score will be required. Future work should include the definition of the optimal ML threshold, to validate prospective practical clinical implementation. The sample size was modest and follow-up was only 3 years; however, all results were significant. Although training data were always separated from test data within the 10-fold cross-validation, it is not yet known how well such an ML score can extrapolate among different centers, patient populations, and follow-up times. Although we included key perfusion and function imaging variables in this study, the list was not exhaustive. The derived ML score was generic and could be applied to both exercise and pharmacological stress protocols, because the ML technique uses the information about the type of test internally. However, further evaluation of ML risk stratification for MACE prediction in specific subpopulations, for example, in patients with suspected disease, patients with early revascularization, or patients undergoing adenosine protocols, may be appropriate in multicenter studies. Risk reclassification metrics have limitations such as dependence on the choice of cutoff values of the continuous probability risk score. It is likely that more appropriate threshold selection in future studies may optimize the reclassification patterns for specific clinical risks. Alternatively, the MACE risk score without any categories could also be used clinically to indicate the probability of events for a given patient. Finally, we selected a LogitBoost approach for automatic ML variable integration, as in our previous work (8), but the LogitBoost approach we used is only one of many possible ML approaches to combine multiple variables for prediction. It is possible that different approaches such as deep learning may provide a better risk score derivation. However, a larger multicenter data set is required to evaluate possible advantages of other ML approaches.
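The dependence of reclassification on cutoff choice can be illustrated with a short sketch. It bins a continuous MACE probability into the five ordered risk categories using the boundaries listed in the column headers of Table 4, then repeats the binning with a shifted set of cutoffs. The example scores, the shifted cutoffs, and the function name `risk_category` are hypothetical, and treating each lower boundary as inclusive is an assumption (the table states only "<0.15" and "≥0.3" explicitly).

```python
from bisect import bisect_right

# Category boundaries taken from the column headers of Table 4.
TABLE4_CUTOFFS = (0.15, 0.20, 0.25, 0.30)
LABELS = ("Low", "Equivocal", "Mild", "Moderate", "Severe")


def risk_category(score, cutoffs=TABLE4_CUTOFFS):
    """Map a predicted MACE probability to an ordered risk category.

    bisect_right counts how many cutoffs are <= score, so a score equal
    to a cutoff falls in the higher category (assumed convention).
    """
    return LABELS[bisect_right(cutoffs, score)]


# Hypothetical ML scores for five patients (not study data).
scores = [0.05, 0.17, 0.22, 0.27, 0.40]

# Categories under the Table 4 cutoffs.
print([risk_category(s) for s in scores])

# The same scores under hypothetical cutoffs shifted down by 0.05:
# three of the five patients move to a higher category.
shifted = (0.10, 0.15, 0.20, 0.25)
print([risk_category(s, shifted) for s in scores])
```

Because every off-diagonal cell of a reclassification table is defined by such boundaries, any metric built on those cells changes when the cutoffs move, whereas the underlying continuous score does not.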

CONCLUSIONS

ML combining both clinical and imaging data variables was found to have high predictive accuracy for the 3-year risk of MACE, and was superior to existing visual or automated perfusion assessments in isolation. This computational method could allow integration of the clinical data with imaging results for the optimal evaluation of MACE risk in patients undergoing MPI.

ADDRESS FOR CORRESPONDENCE: Dr. Piotr J. Slomka, Artificial Intelligence in Medicine Program, Cedars-Sinai Medical Center, 8700 Beverly Boulevard, Suite A047N, Los Angeles, California 90048. E-mail: [email protected].


REFERENCES

1. Gimelli A, Rossi G, Landi P, et al. Stress/rest myocardial perfusion abnormalities by gated SPECT: still the best predictor of cardiac events in stable ischemic heart disease. J Nucl Med 2009;50:546–53.

2. Hachamovitch R, Kang X, Amanullah AM, et al. Prognostic implications of myocardial perfusion single-photon emission computed tomography in the elderly. Circulation 2009;120:2197–206.

3. Shaw LJ, Berman DS, Maron DJ, et al. Optimal medical therapy with or without percutaneous coronary intervention to reduce ischemic burden: results from the Clinical Outcomes Utilizing Revascularization and Aggressive Drug Evaluation (COURAGE) trial nuclear substudy. Circulation 2008;117:1283–91.

4. Shaw LJ, Iskandrian AE. Prognostic value of gated myocardial perfusion SPECT. J Nucl Cardiol 2004;11:171–85.

5. Kang X, Berman DS, Lewin HC, et al. Incremental prognostic value of myocardial perfusion single photon emission computed tomography in patients with diabetes mellitus. Am Heart J 1999;138:1025–32.

6. Hachamovitch R, Berman DS, Kiat H, et al. Exercise myocardial perfusion SPECT in patients without known coronary artery disease: incremental prognostic value and use in risk stratification. Circulation 1996;93:905–14.

7. Sharir T, Germano G, Kang X, et al. Prediction of myocardial infarction versus cardiac death by gated myocardial perfusion SPECT: risk stratification by the amount of stress-induced ischemia and the poststress ejection fraction. J Nucl Med 2001;42:831–7.

8. Motwani M, Dey D, Berman DS, et al. Machine learning for prediction of all-cause mortality in patients with suspected coronary artery disease: a 5-year multicentre prospective registry analysis. Eur Heart J 2017;38:500–7.

9. Arsanjani R, Dey D, Khachatryan T, et al. Prediction of revascularization after myocardial perfusion SPECT by machine learning in a large population. J Nucl Cardiol 2015;22:877–84.

10. Betancur J, Rubeaux M, Fuchs T, et al. Automatic valve plane localization in myocardial perfusion SPECT/CT by machine learning: anatomical and clinical validation. J Nucl Med 2017;58:961–7.

11. Gambhir SS, Berman DS, Ziffer J, et al. A novel high-sensitivity rapid-acquisition single-photon cardiac imaging camera. J Nucl Med 2009;50:635–43.

12. Sharir T, Slomka PJ, Hayes SW, et al. Multicenter trial of high-speed versus conventional single-photon emission computed tomography imaging: quantitative results of myocardial perfusion and left ventricular function. J Am Coll Cardiol 2010;55:1965–74.

13. Andersson M, Johansson L, Minarik D, Leide-Svegborn S, Mattsson S. Effective dose to adult patients from 338 radiopharmaceuticals estimated using ICRP biokinetic data, ICRP/ICRU computational reference phantoms and ICRP 2007 tissue weighting factors. EJNMMI Physics 2014;1:9.

14. Nakazato R, Tamarappoo BK, Kang X, et al. Quantitative upright–supine high-speed SPECT myocardial perfusion imaging for detection of coronary artery disease: correlation with invasive coronary angiography. J Nucl Med 2010;51:1724–31.

15. Xu Y, Arsanjani R, Clond M, et al. Transient ischemic dilation for coronary artery disease in quantitative analysis of same-day sestamibi myocardial perfusion SPECT. J Nucl Cardiol 2012;19:465–73.

16. Nakazato R, Berman DS, Hayes SW, et al. Myocardial perfusion imaging with a solid-state camera: simulation of a very low dose imaging protocol. J Nucl Med 2013;54:373–9.

17. Thygesen K, Alpert JS, White HD. Universal definition of myocardial infarction. Circulation 2007;116:2634–53.

18. Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH. The WEKA data mining software: an update. SIGKDD Explor Newsl 2009;11:10–8.

19. Friedman J, Hastie T, Tibshirani R. Additive logistic regression: a statistical view of boosting (with discussion and a rejoinder by the authors). Ann Statist 2000;28:337–407.

20. Kanamori T, Takenouchi T, Eguchi S, Murata N. Robust loss functions for boosting. Neural Comput 2007;19:2183–244.

21. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 1988;44:837–45.

22. Brier GW. Verification of forecast expressed in terms of probability. Monthly Weather Rev 1950;78:1–3.

23. Arsanjani R, Xu Y, Dey D, et al. Improved accuracy of myocardial perfusion SPECT for detection of coronary artery disease by machine learning in a large population. J Nucl Cardiol 2013;20:553–62.

24. Tragardh E, Hesse B, Knuuti J, et al. Reporting nuclear cardiology: a joint position paper by the European Association of Nuclear Medicine (EANM) and the European Association of Cardiovascular Imaging (EACVI). Eur Heart J Cardiovasc Imaging 2015;16:272–9.

25. Tilkemeier PL, Mahmarian JJ, Wolinsky DG, Denton EA. ImageGuide Update. J Nucl Cardiol 2015;22:994–7.

26. Henzlova MJ, Duvall WL, Einstein AJ, Travin MI, Verberne HJ. ASNC imaging guidelines for SPECT nuclear cardiology procedures: stress, protocols, and tracers. J Nucl Cardiol 2016;23:606–39.

KEY WORDS machine learning, major adverse cardiac events, SPECT myocardial imaging
