![Page 1: Lucila Ohno-Machado, MD, PhD Division of Health Sciences and Technology Harvard Medical School Massachusetts Institute of Technology](https://reader033.vdocuments.us/reader033/viewer/2022051303/5a4d1b547f8b9ab0599a894b/html5/thumbnails/1.jpg)
Lucila Ohno-Machado, MD, PhD
Division of Health Sciences and Technology
Harvard Medical School
Massachusetts Institute of Technology
Introduction to HST 951: Medical Decision Support
Welcome
Objectives
• Provide a practical approach to medical decision support
• Put a strong emphasis on computer-based applications that utilize concepts from the fields of artificial intelligence and statistics
• Focus on principled predictive modeling in biomedicine
Audience
• Background in quantitative methods is desirable
• Undergraduates
• Graduate students and post-doctoral fellows (MDs) in medical informatics
Goals
[Figure: Decision Support Cycle linking Model Selection, Data Pre-Processing, Model Construction and System Evaluation]
Types of Models
What type of support is needed?
• “Exploratory analysis”
• “Confirmatory analysis” (gold-standard)
• Clustering
• Classification
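Models
[Figure: Neural Networks. Inputs (Age = 34, Gender = 2, Mitoses = 4) are connected through weighted links to a hidden layer and then to the output, the “Probability of Cancer” (0.6)]
[Figure: Logistic Regression. Inputs (independent variables: Age = 34, Gender = 1, Mitoses = 4) are combined through coefficients (.5, .8, .4) into the output (prediction), the “Probability of cancer” (0.6)]
$p = \frac{1}{1 + e^{-(\sum_i \beta_i x_i + \text{cte})}}$, where the $x_i$ are the independent variables and the $\beta_i$ their coefficients
Other models shown: CART, Rough Sets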
Requirements, Strengths and Weaknesses, Application Examples
• Naïve Bayes
• Bayesian Networks
• Logistic Regression
• Neural Networks
• Classification Trees
• Rough Set Models
• Support Vector Machines
• Clustering (Hierarchical and Partitioning)
Evaluation and Comparisons
Classification
• Calibration (plots, goodness-of-fit)
• Discrimination (ROC areas)
• Explanation (variable selection)
• Outliers, influential observations (case selection)
Clustering
• Distance metrics
• Homogeneity
• Inter-cluster distance
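[Figure: overlapping distributions of test values for normal (nl) and disease cases on an axis from 1.0 to 3.0; a threshold at 1.7 separates the TN and FN regions from the FP and TP regions]
Confusion table at the threshold (rows: test result; columns: true state)
        nl   D    total
“nl”    40   10   50
“D”     10   40   50
total   50   50
Sensitivity = 40/50 = .8
Specificity = 40/50 = .8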
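The arithmetic on this slide can be sketched in a few lines of Python. This is only an illustration: the function name and argument names are assumptions, and the counts are the ones read from the confusion table (TP = 40, FN = 10, TN = 40, FP = 10).

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-table counts."""
    sensitivity = tp / (tp + fn)  # fraction of diseased cases called "D"
    specificity = tn / (tn + fp)  # fraction of normal cases called "nl"
    return sensitivity, specificity


# Counts taken from the slide's table.
sens, spec = sensitivity_specificity(tp=40, fn=10, tn=40, fp=10)
print(sens, spec)
```

Both values depend on where the threshold is placed, which is what the ROC curve on the next slide explores.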
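ROC curve
Confusion tables at three thresholds (rows: test result; columns: true state)
Threshold 1.4
        nl   D    total
“nl”    30   0    30
“D”     20   50   70
total   50   50
Threshold 1.7
        nl   D    total
“nl”    40   10   50
“D”     10   40   50
total   50   50
Threshold 2.0
        nl   D    total
“nl”    50   10   60
“D”     0    40   40
total   50   50
[Figure: ROC curve, Sensitivity against 1 - Specificity (both from 0 to 1), with points labeled Threshold 1.4, Threshold 1.7 and Threshold 2.0]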
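As a rough sketch of how the three confusion tables turn into ROC points, the Python below computes (1 - specificity, sensitivity) for each table and a trapezoidal area under the resulting curve. The pairing of tables with thresholds 1.4, 1.7 and 2.0 is an assumption based on the order of the slide; the function and variable names are illustrative.

```python
def roc_point(tp, fn, tn, fp):
    """Return (1 - specificity, sensitivity) for one confusion table."""
    return fp / (fp + tn), tp / (tp + fn)


# (TP, FN, TN, FP) read from the slide; threshold pairing is assumed.
tables = {
    1.4: (50, 0, 30, 20),
    1.7: (40, 10, 40, 10),
    2.0: (40, 10, 50, 0),
}

# Add the trivial end points (0, 0) and (1, 1), then order along the x axis.
points = sorted([(0.0, 0.0), (1.0, 1.0)] + [roc_point(*c) for c in tables.values()])

# Trapezoidal rule over consecutive points.
auc = sum((x1 - x0) * (y0 + y1) / 2
          for (x0, y0), (x1, y1) in zip(points, points[1:]))
print(points, auc)
```

With only three thresholds the area is a coarse estimate; more thresholds give a smoother curve.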
ROC Curves
[Figure: ROC curves for the LR, NN and RS models; axes: Sensitivity and 1-Specificity, both from 0 to 1]
Calibration Curves
[Figure: Sum of system’s estimates (vertical axis, 0 to 1) against Sum of real outcomes (horizontal axis, 0 to 1); one region of the plot is labeled overestimation]
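One way to obtain the points of such a calibration curve is to sort cases by estimated probability, split them into groups, and compare the sum of the system's estimates with the sum of real outcomes in each group. The sketch below does this in plain Python; the grouping scheme, function name and toy data are assumptions made for illustration, not values from the course.

```python
def calibration_groups(estimates, outcomes, n_groups):
    """Return one (sum of estimates, sum of outcomes) pair per group."""
    pairs = sorted(zip(estimates, outcomes))  # order cases by estimate
    size = len(pairs) // n_groups
    groups = []
    for g in range(n_groups):
        start = g * size
        # The last group absorbs any leftover cases.
        end = len(pairs) if g == n_groups - 1 else start + size
        chunk = pairs[start:end]
        groups.append((sum(e for e, _ in chunk), sum(o for _, o in chunk)))
    return groups


# Hypothetical estimates and 0/1 outcomes for eight cases.
estimates = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
outcomes = [0, 0, 0, 1, 0, 1, 1, 1]
groups = calibration_groups(estimates, outcomes, n_groups=2)
print(groups)
```

A group whose summed estimates exceed its summed outcomes is being overestimated by the system.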
[Figure: three calibration plots, one each for the RS Model, LR Model and NN Model; horizontal axis Observed (0 to 0.8), vertical axis 0 to 1]
Important Topics
• Decision Analysis
• Cost-effectiveness analysis
• Design of Experiments
• Real-World Applications
• Blocking inferences: quantifying anonymity
Examples of Projects
Students have worked in the past in different domains
• Diagnosis of
– Coronary Artery Disease
– Breast Cancer
– Melanoma
• Prognosis in
– Interventional Cardiology
– Spinal Cord Injury
– AIDS
– Pregnancy
Data Mining and Predictive Modeling in (Bio) Medical Databases
We emphasize comparison of different models
[Figure: Area under ROC (0.75 to 0.91) and balance (0.05 to 0.45) by year (1 to 6) for the Logistic and Neural Net models]
[Figure: Logistic Regression diagram, labeled 0.8 and y = e-(X)]
Modeling the Risk of Major In-Hospital Complications Following Percutaneous Coronary Interventions
Frederic S. Resnic, Lucila Ohno-Machado, Gavin J. Blake, Jimmy Pavliska, Andrew Selwyn, Jeffrey J. Popma
ACC, 2000
Methods
• Consecutive BWH patients, 1/97 through 2/99, randomly divided into training (n = 1,877) and test (n = 927) sets
• Outcomes: death and combined death, CABG or MI (MACE)
• Validation using independent dataset: 3/99 - 12/99 (n = 1,460)
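The random division into training and test sets could look roughly like the sketch below. The slides do not say how the randomization was done, so the shuffle, the seed and the function name are assumptions; only the set sizes (1,877 and 927) come from the slide.

```python
import random


def split_cases(cases, n_train, seed=0):
    """Shuffle a copy of the cases and cut it into training and test sets."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(cases)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Stand-in case identifiers for the 2,804 development cases.
case_ids = list(range(2804))
train, test = split_cases(case_ids, n_train=1877)
print(len(train), len(test))
```

The later, independent validation set (3/99 - 12/99) is kept entirely separate from this split.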
Dataset: Attributes
History: age, gender, diabetes, iddm, history CABG, Baseline creatinine, CRI, ESRD, hyperlipidemia
Presentation: acute MI, primary, rescue, CHF class, angina class, Cardiogenic shock, failed CABG
Angiographic: occluded, lesion type (A,B1,B2,C), graft lesion, vessel treated, ostial
Procedural: number lesions, multivessel, number stents, stent types (8), closure device, gp 2b3a antagonists, dissection post, rotablator, atherectomy, angiojet, max pre stenosis, max post stenosis, no reflow
Operator/Lab: annual volume, device experience, daily volume, lab device experience, unscheduled case
Data Source: Medical Record, Clinician Derived
Study Population
                          Development Set   Validation Set
                          1/97-2/99         3/99-12/99
Cases                     2,804             1,460
Women                     909 (32.4%)       433 (29.7%)      p=.066
Age > 74yrs               595 (21.2%)       308 (22.5%)      p=.340
Acute MI                  250 (8.9%)        144 (9.9%)       p=.311
  Primary                 156 (5.6%)        95 (6.5%)        p=.214
  Shock                   62 (2.2%)         20 (1.4%)        p=.058
Class 3/4 CHF             176 (6.3%)        80 (5.5%)        p=.298
gp IIb/IIIa antagonist    1,005 (35.8%)     777 (53.2%)      p<.001
Death                     67 (2.4%)         24 (1.6%)        p=.110
Death, MI, CABG (MACE)    177 (6.3%)        96 (6.6%)        p=.739
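Logistic regression
[Figure: Logistic Regression. Inputs (independent variables: Age = 34, Gender = 1, Mitoses = 4) are combined through coefficients (.5, .8, .4) into the output (prediction), the “Probability of cancer” (0.6)]
$p = \frac{1}{1 + e^{-(\sum_i \beta_i x_i + \text{cte})}}$, where the $x_i$ are the independent variables and the $\beta_i$ their coefficients
These models are based on statistics and can only discover linear relationships among the data (linear in the log-odds of the outcome, unless interaction or transformed terms are added)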
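The logistic formula can be evaluated directly, as in this minimal Python sketch. The function name, the example inputs and the coefficient values are hypothetical; they are not fitted values from the course material.

```python
import math


def logistic_probability(inputs, coefficients, cte):
    """p = 1 / (1 + exp(-(sum_i beta_i * x_i + cte)))."""
    linear = sum(b * x for b, x in zip(coefficients, inputs)) + cte
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical case with three binary inputs and made-up coefficients.
p = logistic_probability(inputs=[1, 0, 1], coefficients=[0.9, 0.7, 2.0], cte=-4.0)
print(round(p, 3))
```

Because the inputs enter only through a weighted sum, each coefficient shifts the log-odds by a fixed amount per unit of its variable.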
Complications in Coronary Intervention
[Figure: model with inputs age, IDDM, CHF class, type, number and procedure, and output Probability of complication (0.6)]
Logistic and Score Models for Death
                        Logistic Regression Model   Prognostic Risk Score Model
                        Odds Ratio   p-value        beta coefficient   Risk Value
Age > 74yrs             2.51         0.02           0.921              2
B2/C Lesion             2.12         0.05           0.752              1
Acute MI                2.06         0.13           0.724              1
Class 3/4 CHF           8.41         0.00           2.129              4
Left main PCI           5.93         0.03           1.779              3
IIb/IIIa Use            0.57         0.20           -0.554             -1
Stent Use               0.53         0.12           -0.626             -1
Cardiogenic Shock       7.53         0.00           2.019              4
Unstable Angina         1.70         0.17           0.531              1
Tachycardic             2.78         0.04           1.022              2
Chronic Renal Insuf.    2.58         0.06           0.948              2
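The two models on this slide are related: each beta coefficient is approximately the natural logarithm of the corresponding odds ratio, and a patient's risk score is the sum of the risk values of the factors present. The Python sketch below checks the first relation and computes a score for a hypothetical patient. The variable-to-row pairing follows the order on the slide; the function name and the example patient are assumptions.

```python
import math

# factor: (odds ratio, beta coefficient, risk value), as listed on the slide
model = {
    "Age > 74yrs": (2.51, 0.921, 2),
    "B2/C Lesion": (2.12, 0.752, 1),
    "Acute MI": (2.06, 0.724, 1),
    "Class 3/4 CHF": (8.41, 2.129, 4),
    "Left main PCI": (5.93, 1.779, 3),
    "IIb/IIIa Use": (0.57, -0.554, -1),
    "Stent Use": (0.53, -0.626, -1),
    "Cardiogenic Shock": (7.53, 2.019, 4),
    "Unstable Angina": (1.70, 0.531, 1),
    "Tachycardic": (2.78, 1.022, 2),
    "Chronic Renal Insuf.": (2.58, 0.948, 2),
}

# Largest difference between ln(odds ratio) and the listed beta coefficient.
max_gap = max(abs(math.log(odds) - beta) for odds, beta, _ in model.values())


def risk_score(present_factors):
    """Sum the integer risk values of the factors present in a patient."""
    return sum(model[name][2] for name in present_factors)


# Hypothetical patient: older than 74 and in cardiogenic shock.
score = risk_score(["Age > 74yrs", "Cardiogenic Shock"])
print(max_gap, score)
```

The integer risk values make the score easy to add up at the bedside, at the cost of some precision relative to the full logistic model.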
Neural networks
[Figure: Neural Network. Inputs (independent variables: Age = 34, Gender = 2, Mitoses = 4) are connected through weights to a hidden layer, and through a second set of weights to the output (dependent variable, prediction), the “Probability of Cancer” (0.6)]
These are mathematical models that can discover non-linear relationships among the data
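A forward pass through a network of this shape (inputs, one hidden layer, one output) can be sketched as follows. This is a simplified illustration: it uses sigmoid units without bias terms, and all weights and inputs are hypothetical rather than taken from the diagram.

```python
import math


def sigmoid(z):
    """Squash any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(inputs, hidden_weights, output_weights):
    """One hidden layer of sigmoid units feeding a single sigmoid output."""
    hidden = [sigmoid(sum(w * x for w, x in zip(weights, inputs)))
              for weights in hidden_weights]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)))


# Hypothetical scaled inputs and weights: 3 inputs, 2 hidden units, 1 output.
inputs = [0.34, 1.0, 0.4]
hidden_weights = [[0.6, 0.5, 0.8], [0.2, 0.1, 0.3]]
output_weights = [0.7, 0.2]
prediction = forward(inputs, hidden_weights, output_weights)
print(round(prediction, 3))
```

The non-linearity comes from passing the hidden-layer sums through the sigmoid before they are combined again at the output.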
Neural networks for predicting death and complications
[Figure: neural network with inputs age, IDDM, CHF class, type, number and procedure, and outputs disease free, death and other complications]
Death Models
Validation Set: 1460 Cases
[Figure: ROC curves (Sensitivity against 1 - Specificity, both from 0.00 to 1.00) for the LR, Score and aNN models, with a reference line at ROC = 0.50]
ROC Area
LR: 0.840
Score: 0.855
aNN: 0.835
Risk Score of Death: BWH Experience
Unadjusted Overall Mortality Rate = 2.1%
[Figure: Number of Cases (0 to 3000) and Mortality Risk (0% to 60%) by Risk Score Category]
Risk Score Category   Share of Cases   Mortality Risk
0 to 2                62%              0.4%
3 to 4                26%              1.4%
5 to 6                7.6%             2.2%
7 to 8                2.9%             12.4%
9 to 10               1.6%             21.5%
>10                   1.3%             53.6%
CART (Classification and Regression Trees)
These are models that partition the data using one variable at a time, and can model non-linear relationships among data
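Partitioning on one variable at a time can be illustrated with a single split search: try every variable and every observed value as a threshold, and keep the split that leaves the purest pair of groups. The sketch below uses Gini impurity as the purity measure, which is a common choice but an assumption here; the toy data and function names are hypothetical. A full tree would apply the same search recursively to each resulting group.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(rows, labels):
    """Return (weighted impurity, variable index, threshold) of the best split."""
    best = None
    for var in range(len(rows[0])):
        for threshold in sorted({row[var] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[var] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[var] > threshold]
            if not left or not right:
                continue  # a split must put cases on both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, var, threshold)
    return best


# Hypothetical cases: (age, binary finding) with a 0/1 outcome.
rows = [(30, 0), (40, 1), (60, 0), (70, 1)]
labels = [0, 0, 1, 1]
split = best_split(rows, labels)
print(split)
```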
Diagnosis of Melanoma
(Michael Binder, Greg Sharp et al., 1999)
Dermatoscopy
[Figure: printout of a pruned classification tree. Each node lists its TEST variable (breath, CWtender, age, Duration, prevang, Epis, Worsening, OldMI, Smokes, Nausea, Back, post, Diabetes, BP, Lipids, Sweating, Rarm, Larm), the split VALUE, Num Cases and Num Dsrd; the root node has Num Cases: 700 and Num Dsrd: 241, and several branches are marked PRUNED]
[Figure: classification tree for dermatoscopy with tests on asymmetry, border, detail and color (cut points such as < 2 and > 10) and leaves “benign” and “malig”]
Performance using ABCD rule
[Figure: two ROC panels, “ROC CURVES ABCD RULE” and “ROC CURVES OVERALL DIAGNOSIS”; axes SENSITIVITY against 1 - SPECIFICITY, both from 0 to 1]
Rough Sets
These are mathematical models that derive rules for grouping cases based on boolean logic
Multiple subsamples of a large table are created and combined for rule extraction
# Sex T3 FTI TT4 TSH Med Status
1 F 1.05 49.9 48 3.8 N OK
2 M 1.10 50.1 49 4.7 Y sick
3 F 1.3 170 51 5.8 N OK
4 M 1.4 175 200 0.4 N sick
Rules
If [(number>2) and …] then Complication = true
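To make the idea of boolean rules over subsamples concrete, the sketch below draws subsamples from the small thyroid table on this slide and checks how often a hand-written rule agrees with the Status column. It does not implement rough-set rule extraction itself; the rule, the subsample size, the seed and the function names are all hypothetical and chosen only for illustration.

```python
import random

# The four example rows shown on the slide.
table = [
    {"Sex": "F", "T3": 1.05, "FTI": 49.9, "TT4": 48, "TSH": 3.8, "Med": "N", "Status": "OK"},
    {"Sex": "M", "T3": 1.10, "FTI": 50.1, "TT4": 49, "TSH": 4.7, "Med": "Y", "Status": "sick"},
    {"Sex": "F", "T3": 1.3, "FTI": 170, "TT4": 51, "TSH": 5.8, "Med": "N", "Status": "OK"},
    {"Sex": "M", "T3": 1.4, "FTI": 175, "TT4": 200, "TSH": 0.4, "Med": "N", "Status": "sick"},
]


def rule(row):
    """Hypothetical boolean rule: If [(Med = Y) or (TT4 > 100)] then sick."""
    return row["Med"] == "Y" or row["TT4"] > 100


def rule_accuracy(rows):
    """Fraction of rows where the rule agrees with the recorded Status."""
    hits = sum(rule(row) == (row["Status"] == "sick") for row in rows)
    return hits / len(rows)


rng = random.Random(0)  # fixed seed for reproducibility
subsamples = [rng.sample(table, 3) for _ in range(5)]
accuracies = [rule_accuracy(sample) for sample in subsamples]
print(accuracies)
```

In a real application the rules would be derived from the subsamples rather than written by hand, and the table would be far larger.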
Comparison of Practical Prediction Models for Ambulation Following Spinal Cord Injury
(Rowland et al, 1998)
Study Population
Spinal Cord Injury Model Systems of Care Database
• Admitted to one of 24 federally funded designated regional SCI care systems
• 17,861 patients who sustained a spinal cord injury between 1973 and 1997
• 1755 patients had data for LEMS scores, 1993 to 1997
• 1138 had complete data for variables of interest
SCI Mortality NN Design: Input & Output
Admission Info (9 items)
• system days
• injury days
• age
• gender
• racial/ethnic group
• level of neurologic fxn
• ASIA impairment index
• UEMS
• LEMS
Ambulation (1 item)
• yes
• no
Results: ROC Curve Area
Model ROC Curve Area Standard Error
Logistic Regression 0.925 0.016
Neural Network 0.923 0.015
Rough Set 0.914 0.016
Results: ROC Curves
[Figure: ROC curves for the LR, NN and RS models; axes: Sensitivity and 1-Specificity, both from 0 to 1]
Other methods
Support Vector Machines, multiple variations of the nearest neighbor algorithm, etc.
Heart Attack Alert Program
(Wang et al., 2001)
Cox’s Models for Prediction
[Figure: plot with horizontal axis time (years)]
Genetic Algorithms
Search mechanism
• Used for variable selection (model construction)
• Case selection (regression diagnostics)
• Multidisorder diagnosis
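As an illustration of a genetic algorithm used as a search mechanism for variable selection, the sketch below evolves 0/1 masks over candidate variables. It is a toy under stated assumptions: the fitness function is a stand-in that rewards two arbitrarily chosen "useful" variables and penalizes the rest, whereas in practice fitness would come from evaluating a model built on the selected variables. The operators (truncation selection, one-point crossover, bit-flip mutation, elitism) and all parameter values are illustrative choices.

```python
import random

USEFUL = {0, 2}  # hypothetical: variables that actually help prediction


def toy_fitness(mask):
    """Stand-in fitness: reward useful variables, penalize the others."""
    return sum(1.0 if i in USEFUL else -0.5 for i, bit in enumerate(mask) if bit)


def run_ga(fitness, n_vars, pop_size=20, generations=30, mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_vars)] for _ in range(pop_size)]
    history = []  # best fitness at the start of each generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 2]  # truncation selection
        children = [population[0][:]]          # elitism: keep the best mask
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randint(1, n_vars - 1)   # one-point crossover
            child = mother[:cut] + father[cut:]
            # Bit-flip mutation on each position.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = children
    best = max(population, key=fitness)
    return best, history + [fitness(best)]


best_mask, history = run_ga(toy_fitness, n_vars=6)
print(best_mask, history[-1])
```

Because the best mask is always carried over unchanged, the best fitness never decreases from one generation to the next.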
People
• Brigham and Women’s Hospital
• Children’s Hospital
• EECS MIT
• School of Public Health
• Partners Information Systems
Administrivia
Grading based on
• 30% homeworks (almost every week)/participation
• 30% midterm, open notes
• 40% project (no final exam)
Lectures on the WWW for reference
Handouts with Prof. Szolovits’ assistant at NE-43 r416