integrating machine learning and physician knowledge to improve the accuracy of breast biopsy
DESCRIPTION
Integrating Machine Learning and Physician Knowledge to Improve the Accuracy of Breast Biopsy. Inês Dutra University of Porto, CRACS & INESC-Porto LA Houssam Nassif , David Page , Jude Shavlik , Roberta Strigel , Yirong Wu , Mai Elezabi and Elizabeth Burnside UW-Madison. - PowerPoint PPT PresentationTRANSCRIPT
![Page 1: Integrating Machine Learning and Physician Knowledge to Improve the Accuracy of Breast Biopsy](https://reader036.vdocuments.us/reader036/viewer/2022070405/56813db0550346895da7770e/html5/thumbnails/1.jpg)
Integrating Machine Learning and Physician Knowledge to Improve the Accuracy of Breast Biopsy
Inês DutraUniversity of Porto, CRACS & INESC-Porto LA
Houssam Nassif, David Page, Jude Shavlik, Roberta Strigel, Yirong Wu, Mai Elezabi and Elizabeth Burnside
UW-Madison
![Page 2: Integrating Machine Learning and Physician Knowledge to Improve the Accuracy of Breast Biopsy](https://reader036.vdocuments.us/reader036/viewer/2022070405/56813db0550346895da7770e/html5/thumbnails/2.jpg)
AMIA 2011 2
Intro and Motivation
•USA:
▫1 woman dies of breast cancer every 13 minutes
▫In 2011: estimated 230,480 new cases of invasive cancer
39,520 women are expected to die• Source: U.S. Breast Cancer Statistics, 2011
![Page 3: Integrating Machine Learning and Physician Knowledge to Improve the Accuracy of Breast Biopsy](https://reader036.vdocuments.us/reader036/viewer/2022070405/56813db0550346895da7770e/html5/thumbnails/3.jpg)
AMIA 2011 3
Intro and Motivation
•Portugal:▫Per year:▫4,500 new cases▫1,500 deaths (33%)
– Source: Liga Portuguesa Contra o Cancro - September 2011
![Page 4: Integrating Machine Learning and Physician Knowledge to Improve the Accuracy of Breast Biopsy](https://reader036.vdocuments.us/reader036/viewer/2022070405/56813db0550346895da7770e/html5/thumbnails/4.jpg)
AMIA 2011 4
Intro and Motivation
•Mammography
•MRI
•Ultrasound
•Biopsies
![Page 5: Integrating Machine Learning and Physician Knowledge to Improve the Accuracy of Breast Biopsy](https://reader036.vdocuments.us/reader036/viewer/2022070405/56813db0550346895da7770e/html5/thumbnails/5.jpg)
AMIA 2011 5
Intro and Motivation
• Image-guided percutaneous core needle biopsy– standard for the diagnosis of suspicious findings
• Over 700,000 women undergo breast core biopsy in the US
• Imperfect: biopsies canbe inconclusive in5-15% of cases– ≈ 35,000-105,000 of the above will require
additional biopsies
![Page 6: Integrating Machine Learning and Physician Knowledge to Improve the Accuracy of Breast Biopsy](https://reader036.vdocuments.us/reader036/viewer/2022070405/56813db0550346895da7770e/html5/thumbnails/6.jpg)
AMIA 2011 6
Intro and Motivation
• Breast biopsies:–Fine needle breast biopsy–Stereotactic breast biopsy–Ultrasound-guided core biopsy–Excisional biopsy–MRI-guided
![Page 7: Integrating Machine Learning and Physician Knowledge to Improve the Accuracy of Breast Biopsy](https://reader036.vdocuments.us/reader036/viewer/2022070405/56813db0550346895da7770e/html5/thumbnails/7.jpg)
AMIA 2011 7
Objectives
• Hypotheses–Can we learn characteristics of
inconclusive biopsies?–Can we improve the classifer by
adding expert knowledge?
![Page 8: Integrating Machine Learning and Physician Knowledge to Improve the Accuracy of Breast Biopsy](https://reader036.vdocuments.us/reader036/viewer/2022070405/56813db0550346895da7770e/html5/thumbnails/8.jpg)
AMIA 2011 8
Terminology
• Inconclusive biopsies: “non-definitive”:–Discordant biopsies–High risk lesions
• Atypical ductal hyperplasia• Lobular carcinoma in situ• Radial scar • etc.
– Insufficient sampling
![Page 9: Integrating Machine Learning and Physician Knowledge to Improve the Accuracy of Breast Biopsy](https://reader036.vdocuments.us/reader036/viewer/2022070405/56813db0550346895da7770e/html5/thumbnails/9.jpg)
AMIA 2011 9
Materials and Methods
• Data collected – Oct 1st, 2005 to Dec 31st, 2008
• Tools– WEKA– Aleph
• Validation: leave-one-out cross validation• Results tested for statistical significance
![Page 10: Integrating Machine Learning and Physician Knowledge to Improve the Accuracy of Breast Biopsy](https://reader036.vdocuments.us/reader036/viewer/2022070405/56813db0550346895da7770e/html5/thumbnails/10.jpg)
AMIA 2011 10
Data source
1384 image guided core biopsies
349 M 1035 B
925 (C) 110 (NC)
43 (D) 51 (ARS) 16 (I)
94After eliminating redundancies
![Page 11: Integrating Machine Learning and Physician Knowledge to Improve the Accuracy of Breast Biopsy](https://reader036.vdocuments.us/reader036/viewer/2022070405/56813db0550346895da7770e/html5/thumbnails/11.jpg)
AMIA 2011 11
Data distribution
• 94 non-concordant cases went through additional procedures (e.g., excision)
• 15 were “upgraded” from benign to malignant • Our population: 15+/79-• Task: find a distinction between upgrades and
non-upgrades
![Page 12: Integrating Machine Learning and Physician Knowledge to Improve the Accuracy of Breast Biopsy](https://reader036.vdocuments.us/reader036/viewer/2022070405/56813db0550346895da7770e/html5/thumbnails/12.jpg)
AMIA 2011 12
Data Featuresbiopsy procedure abn. disappearedpathology priority mass presentpathology description asymmettryconcordance architectural distortionmammography birads calcsultrasound birads mass shapeMRI birads mass marginscombined birads mass densitybiopsy needle type calcificationsbiopsy needle gauge calcification distributionclip in site of biopsy mass size
![Page 13: Integrating Machine Learning and Physician Knowledge to Improve the Accuracy of Breast Biopsy](https://reader036.vdocuments.us/reader036/viewer/2022070405/56813db0550346895da7770e/html5/thumbnails/13.jpg)
AMIA 2011 13
WEKA and Aleph
• Data cleaning, single table• 15-fold stratified cross-validation
(leave-one-out)• WEKA (Bayes TAN) and Aleph
• Without expert knowledge• With expert knowledge
![Page 14: Integrating Machine Learning and Physician Knowledge to Improve the Accuracy of Breast Biopsy](https://reader036.vdocuments.us/reader036/viewer/2022070405/56813db0550346895da7770e/html5/thumbnails/14.jpg)
AMIA 2011 14
WEKA and Aleph
• Waikato Environment for Knowledge Analysis– Collection of machine learning algorithms and
data mining tasks• use “propositionalized” data (flat table)
• A Learning Engine for Proposing Hypotheses– Inductive Logic Programming (ILP) system– Knowledge representation: first order rules
![Page 15: Integrating Machine Learning and Physician Knowledge to Improve the Accuracy of Breast Biopsy](https://reader036.vdocuments.us/reader036/viewer/2022070405/56813db0550346895da7770e/html5/thumbnails/15.jpg)
AMIA 2011 15
Method
Learn rules about how and why non-definitive cases
are “upgraded”
Other rules (simple) provided by a specialist
Combine
New rules
![Page 16: Integrating Machine Learning and Physician Knowledge to Improve the Accuracy of Breast Biopsy](https://reader036.vdocuments.us/reader036/viewer/2022070405/56813db0550346895da7770e/html5/thumbnails/16.jpg)
AMIA 2011 16
Modelling upgrades without expert knowledge
• Aleph learning on the 94 cases (15+/79-)• Example of rule learnedupgrade(A) IF
bxneedlegauge(A,9) ANDabndisappeared10(A,0) ANDmass01(A,1) ANDanotherlesionbxatsametime01(A,0).
[4+/0-]
![Page 17: Integrating Machine Learning and Physician Knowledge to Improve the Accuracy of Breast Biopsy](https://reader036.vdocuments.us/reader036/viewer/2022070405/56813db0550346895da7770e/html5/thumbnails/17.jpg)
AMIA 2011 17
Examples of Expert rules(1) Upgrade increases on stereo if calcifications and
ADH are presentupgrade(A) IF
pathdx(A,atypical_ductal_hyperplasia) AND calcifications(A,’a,f’) AND
biopsyprocedure(A, stereo).(2) Upgrade increases on US(Ultra-Sound) if high
BIRADS categoryupgrade(A) IF
concordance(A,d) AND mammobirads(A,4B/4C/5).
![Page 18: Integrating Machine Learning and Physician Knowledge to Improve the Accuracy of Breast Biopsy](https://reader036.vdocuments.us/reader036/viewer/2022070405/56813db0550346895da7770e/html5/thumbnails/18.jpg)
AMIA 2011 18
Modelling upgrades with expert knowledge
• Example of rule learnedupgrade(A) IF
numOfSpecimens(A,gt6) ANDrule1_2(A) ANDrule1_3(A).
[3+/1-]rule1_2(A) IF pathdxabbr(A,adh) AND
calcifications(A,f). [3+/4-]rule1_3(A) IF pathdxabbr(A,adh) AND
calcification_distribution(A,g). [3+/6-]
![Page 19: Integrating Machine Learning and Physician Knowledge to Improve the Accuracy of Breast Biopsy](https://reader036.vdocuments.us/reader036/viewer/2022070405/56813db0550346895da7770e/html5/thumbnails/19.jpg)
AMIA 2011 19
Results
Classif. TP FP FN TN R P F
Human Rules alone
9 45 6 34 .60 .16 .26
ILP 5 11 10 68 .33 .31 .32
Rules+ILP 8 13 7 66 .53 .38 .44
Weka 3 9 12 70 .20 .25 .22
Rules+Weka 3 6 12 73 .20 .33 .25
![Page 20: Integrating Machine Learning and Physician Knowledge to Improve the Accuracy of Breast Biopsy](https://reader036.vdocuments.us/reader036/viewer/2022070405/56813db0550346895da7770e/html5/thumbnails/20.jpg)
AMIA 2011 20
Results
Classif. TP FP FN TN R P F
Human Rules alone
9 45 6 34 .60 .16 .26
ILP 5 11 10 68 .33 .31 .32
Rules+ILP 8 13 7 66 .53 .38 .44
Weka 3 9 12 70 .20 .25 .22
Rules+Weka 3 6 12 73 .20 .33 .25
![Page 21: Integrating Machine Learning and Physician Knowledge to Improve the Accuracy of Breast Biopsy](https://reader036.vdocuments.us/reader036/viewer/2022070405/56813db0550346895da7770e/html5/thumbnails/21.jpg)
AMIA 2011 21
Results:WEKA versus
Aleph and human rules
![Page 22: Integrating Machine Learning and Physician Knowledge to Improve the Accuracy of Breast Biopsy](https://reader036.vdocuments.us/reader036/viewer/2022070405/56813db0550346895da7770e/html5/thumbnails/22.jpg)
AMIA 2011 22
Conclusions
• Both WEKA and Aleph can improve results when combining original data with expert knowledge (without tuning)
• WEKA improves on false positives and can be adjusted to achieve Recall of 100% with the need to re-examine around 60 women (out of 79)
• Aleph improves on false negatives and produces a better classifier with the advantage of using a language that is easily understandable by the specialist
![Page 23: Integrating Machine Learning and Physician Knowledge to Improve the Accuracy of Breast Biopsy](https://reader036.vdocuments.us/reader036/viewer/2022070405/56813db0550346895da7770e/html5/thumbnails/23.jpg)
AMIA 2011 23
Future Work• NLM project started this month
![Page 24: Integrating Machine Learning and Physician Knowledge to Improve the Accuracy of Breast Biopsy](https://reader036.vdocuments.us/reader036/viewer/2022070405/56813db0550346895da7770e/html5/thumbnails/24.jpg)
AMIA 2011 24
Future Work• NLM project started this month• Short term tasks:
– Augment the current dataset– Review consistently misclassified examples– Possibily use association rules to produce a
first subset of rules
![Page 25: Integrating Machine Learning and Physician Knowledge to Improve the Accuracy of Breast Biopsy](https://reader036.vdocuments.us/reader036/viewer/2022070405/56813db0550346895da7770e/html5/thumbnails/25.jpg)
AMIA 2011 25
Future Work• NLM project started this month• Short term tasks:
– Augment the current dataset– Review consistently misclassified examples– Possibily use association rules to produce a
first subset of rules
• Medium term tasks:– learn from the cases that were not upgraded– Develop the iterative cycle of learning and
expert rule revision
![Page 26: Integrating Machine Learning and Physician Knowledge to Improve the Accuracy of Breast Biopsy](https://reader036.vdocuments.us/reader036/viewer/2022070405/56813db0550346895da7770e/html5/thumbnails/26.jpg)
AMIA 2011 26
Future Work• NLM project started this month• Short term tasks:
– Augment the current dataset– Review consistently misclassified examples– Possibily use association rules to produce a first subset of rules
• Medium term tasks:– learn from the cases that were not upgraded– Develop the iterative cycle of learning and expert rule revision
• Long term tasks: – Use more sophisticated learning methods to produce better
rules– To integrate biopsy attributes and the classifiers obtained to
MammoClass (http://cracs.fc.up.pt/pt/mammoclass/)
![Page 27: Integrating Machine Learning and Physician Knowledge to Improve the Accuracy of Breast Biopsy](https://reader036.vdocuments.us/reader036/viewer/2022070405/56813db0550346895da7770e/html5/thumbnails/27.jpg)
AMIA 2011 27
Acknowledgments• UW-Madison Medical School• FCT-Portugal and Horus and DigiScope projects• AMIA reviewers and organizers• Special ack to the UW-Madison Medical
School people that were never “scared” of computer technology and have been always very enthusiastic about using computational methodologies to help their work
![Page 29: Integrating Machine Learning and Physician Knowledge to Improve the Accuracy of Breast Biopsy](https://reader036.vdocuments.us/reader036/viewer/2022070405/56813db0550346895da7770e/html5/thumbnails/29.jpg)
AMIA 2011 29
Experiment 1Significance tests - p-valuesRules+ILP Weka Rules+Weka
ILP P: 0.3437R: 0.0824F: 0.2007
P: 0.74R: 0.43F: 0.63
P: 0.74R: 0.43F: 0.63
Rules+ILP P: 0.32R: 0.06F: 0.18
P: 0.32R: 0.06F: 0.18
Weka P: 1R : 1F: 0.80
![Page 30: Integrating Machine Learning and Physician Knowledge to Improve the Accuracy of Breast Biopsy](https://reader036.vdocuments.us/reader036/viewer/2022070405/56813db0550346895da7770e/html5/thumbnails/30.jpg)
AMIA 2011 30
Experiment 2tuning
• 5x4 stratified cross validation• Parameters tuned:
– Minacc: 0.05 and 0.1– Minpos: 1, 2, 3– Noise: 0, 1, 2
![Page 31: Integrating Machine Learning and Physician Knowledge to Improve the Accuracy of Breast Biopsy](https://reader036.vdocuments.us/reader036/viewer/2022070405/56813db0550346895da7770e/html5/thumbnails/31.jpg)
AMIA 2011 31
Experiment 2tuning, best parameters per fold
Folds ILP rules alone ILP + human rules
minacc minp noise minacc minp noise 1 0.05 3 2 0.05 3 2 2 0.05 3 0 0.05 1 1 3 0.05 1 0 0.05 1 1 4 0.05 1 2 0.05 1 2 5 0.05 1 0 0.05 1 2
![Page 32: Integrating Machine Learning and Physician Knowledge to Improve the Accuracy of Breast Biopsy](https://reader036.vdocuments.us/reader036/viewer/2022070405/56813db0550346895da7770e/html5/thumbnails/32.jpg)
AMIA 2011 32
Experiment 2 - Results
TP FP FN TN R P F1 ILP rules alone
4 13 11 66 0.26 0.16 0.20
Human + ILP
7 13 9 67 0.46 0.38 0.42
0.3 0.03 0.1p=
![Page 33: Integrating Machine Learning and Physician Knowledge to Improve the Accuracy of Breast Biopsy](https://reader036.vdocuments.us/reader036/viewer/2022070405/56813db0550346895da7770e/html5/thumbnails/33.jpg)
AMIA 2011 33
Summary
0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
Precision-Recall Curve
ILP rules alone Human Rules + ILP WEKA-TAN WEKA-TAN + Human RulesHuman Rules alone ILP+tuning ILP+tuning+Human rules
Recall
Prec
ision
(PPV
)
![Page 34: Integrating Machine Learning and Physician Knowledge to Improve the Accuracy of Breast Biopsy](https://reader036.vdocuments.us/reader036/viewer/2022070405/56813db0550346895da7770e/html5/thumbnails/34.jpg)
AMIA 2011 34
UW upgrades data
• 94 benign cases after biopsies were discussed• 15 considered to be, in fact, malignant
upgrades• Tasks:
– learn characteristics of upgrades that do not appear in non-upgrades interpretable rules
– Can we improve the classifer by adding expert knowledge?