TRANSCRIPT
AAPM Annual Meeting, July, 2020
AI in Radiotherapy of Multicenter Clinical Trials
Ying Xiao, PhD
Topics
• AI in Clinical Trials
• AI Applications in NRG/CIRO/IROC
The State of Clinical Trials: Success Rate
Woo M. An AI boost for clinical trials. Vol. 573, Nature. Nature Publishing Group; 2019. p. S100–2.
Factors Associated with Failure
Fogel DB. Factors associated with clinical trials that fail and opportunities for improving the likelihood of success: A review. Vol. 11, Contemporary Clinical Trials Communications. Elsevier Inc; 2018. p. 156–64.
Failure in Efficacy or Safety
Finance
Eligibility Criteria
Patient Recruitment and Retention
◦ Pre-conceived notions and preferences
◦ Resource commitments: finance and time
Applications of AI
AI in Enrichment Strategies
Trials.ai
Clinical Trials of the Future
Randomized Trial the Gold Standard
Targeted Therapies
Patient Centered
In Silico Studies
NCTN Radiotherapy and Imaging Quality
Center for Innovation in Radiation Oncology (CIRO)
Imaging and Radiation Oncology Core (IROC)
IROC’s Five Core Services
1. Site Qualification (SQ) (FQs, ongoing QA, proton approval, resources)
2. Trial Design Support/Assistance (TD) (protocol review, templates, help desk, key contact QA centers)
3. Credentialing (CD) (tiered system to minimize institution effort)
4. Data Management (DM) (pre-review, use of TRIAD, post-review for analysis)
5. Case Review (CR) (Pre-, On-, Post-Treatment, facilitate review logistics for clinical reviews)
Continuous Improvements
Improve Quality & Accuracy
◦ Are our quality assurance (QA) processes of the highest standard?
Encourage Safety & Consistency
◦ Are our QA processes encouraging safe clinical practice?
Increase Efficiency
◦ Can we perform our QA functions more efficiently?
Demonstrate Efficacy
◦ Are we performing clinically meaningful QA and improving clinical trial quality?
Evaluate Cost Effectiveness
◦ Can we reduce the overall clinical trial expenditure through uncertainty reduction (accrual number and time)? How does the amount of resources affect quality?
Xiao Y, Rosen M. The role of Imaging and Radiation Oncology Core for precision medicine era of clinical trial. Transl Lung Cancer Res. 2017;6(6):621-624. doi:10.21037/tlcr.2017.09.06
AI/Machine Learning Implementation
• Automated Plan QA
• Automated Image Segmentation QA
• Outcome Driven QA
• The figures compare box plots of the original and re-optimized OAR dose parameters for 50 treatment plans submitted to the NRG-HN001 clinical trial.
• If a feasible improvement of 5% in the maximum dose to the brain stem, spinal cord, or optic structures, or in the mean dose to any parotid gland, can be achieved, the submitting institution is sent a report that includes screenshots of the DVH comparison and a dosimetric comparison spreadsheet.
Geng H, Giaddui TG, Radden M, Lee N, Xia P, Xiao Y. Knowledge-based model for the evaluation of NRG-HN001 radiotherapy treatment plan quality. Int J Radiat Oncol Biol Phys [Internet]. 2019 Sep 1;105(1):E616. Available from: https://doi.org/10.1016/j.ijrobp.2019.06.1144
HN001 Model-Based Plan Quality Evaluation
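The reporting rule for HN001 plans can be sketched in a few lines of Python. This is a minimal illustration, not the trial's actual tooling: the structure names, the dose values, and the reading of "5%" as a relative dose reduction are all assumptions made for the example.

```python
# Hypothetical structure names and dose values, for illustration only.
MAX_DOSE_OARS = {"BrainStem", "SpinalCord", "OpticNerves", "OpticChiasm"}
MEAN_DOSE_OARS = {"Parotid_L", "Parotid_R"}
THRESHOLD = 0.05  # assumption: "5%" is read as a relative dose reduction


def relative_improvement(original_gy, reoptimized_gy):
    """Fractional dose reduction achieved by re-optimization."""
    if original_gy <= 0:
        return 0.0
    return (original_gy - reoptimized_gy) / original_gy


def structures_to_report(original, reoptimized):
    """Return structures whose monitored dose metric improves by >= 5%.

    Both arguments map (structure, metric) -> dose in Gy,
    where metric is "max" or "mean".
    """
    flagged = []
    for (structure, metric), dose in sorted(original.items()):
        monitored = (
            (metric == "max" and structure in MAX_DOSE_OARS)
            or (metric == "mean" and structure in MEAN_DOSE_OARS)
        )
        if not monitored:
            continue
        gain = relative_improvement(dose, reoptimized[(structure, metric)])
        if gain >= THRESHOLD:
            flagged.append(structure)
    return flagged


original = {
    ("BrainStem", "max"): 50.0,
    ("SpinalCord", "max"): 40.0,
    ("Parotid_L", "mean"): 26.0,
}
reoptimized = {
    ("BrainStem", "max"): 46.0,
    ("SpinalCord", "max"): 39.5,
    ("Parotid_L", "mean"): 24.0,
}
print(structures_to_report(original, reoptimized))
```

A non-empty result would correspond to the case in which the submitting institution receives a comparison report.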
Identify Discrepancies
Structure      Dose Point    Multi-Institutions  Single-Institution  P value
PTV_6996       V69.96Gy[%]   0.02±0.04           0.00±0.02           0.06418
BrainStem      D0.03cc[Gy]   -4.38±4.90          -3.83±2.60          0.34579
SpinalCord     D0.03cc[Gy]   -4.07±4.08          0.23±2.09           0.00531
OpticNerves    D0.03cc[Gy]   -5.66±8.59          -8.86±6.94          0.09623
OpticChiasm    D0.03cc[Gy]   -4.15±6.67          -6.18±6.41          0.38013
TemporalLobes  D0.03cc[Gy]   -2.41±4.75          -4.07±4.51          0.10950
Parotids       Mean[Gy]      0.25±6.17           -11.33±4.89         <0.00001
Cochlea_L      Mean[Gy]      -4.11±11.95         -17.04±6.25         <0.00001
LarynxGSL      Mean[Gy]      -0.48±9.39          -6.5±3.96           0.02121
Protocol Compliance Overview
After Model-Guided Re-optimization
Automated Quality Assurance of Contouring
The current manual review process is time-consuming and subjective. An automated and objective QA method is needed. We have developed a fully automated method for the QA of OAR contouring based on deep active learning.
NRG Oncology/RTOG 1308 clinical trial based on automated segmentation with deep active learning. Kuo Men, PhD, Huaizhi Geng, PhD, Tithi Biswas, MD, Zhongxing Liao, MD, and Ying Xiao, PhD. ASTRO 2019 Annual Meeting, Sep. 15-18, 2019.
Lung CT Segmentation Challenge 2017
• 36 lung cases with expert-reviewed contours
• A small gold standard atlas
RTOG 1308 clinical trial
• 110 patients
• Contours with noise
OARs
• RTOG-1106 contouring atlas guidelines
• Heart, esophagus, spinal cord, and lungs
Data Input to the Model
QA Procedure
• Gold Atlas: 36 cases from Lung CT Segmentation Challenge
• Candidate Set: the first 70 cases from RTOG-1308
• Validation Set: 20 cases from RTOG-1308
• Test Set: 20 cases from RTOG-1308
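The partition of the 110 RTOG-1308 cases could be expressed as follows. The slide states only that the candidate set is the first 70 cases; taking the next 20 for validation and the last 20 for testing is an assumption made for this sketch, as are the case identifiers.

```python
def split_cases(rtog1308_case_ids):
    """Partition the 110 trial cases into candidate, validation, and test sets."""
    if len(rtog1308_case_ids) != 110:
        raise ValueError("expected 110 RTOG-1308 cases")
    candidate = rtog1308_case_ids[:70]      # first 70 cases (stated on the slide)
    validation = rtog1308_case_ids[70:90]   # assumed ordering
    test = rtog1308_case_ids[90:]           # assumed ordering
    return candidate, validation, test


# Hypothetical case identifiers, for illustration only.
case_ids = [f"case_{i:03d}" for i in range(1, 111)]
candidate_set, validation_set, test_set = split_cases(case_ids)
print(len(candidate_set), len(validation_set), len(test_set))
```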
Four Major Components
• A high-performance CNN for segmentation
• An uncertainty estimation strategy
• A strategy for selecting noisy annotations for fine-tuning the CNN
• Decision criteria
Segmentation Model
• Input: 2D CT images
• Output: 2D Segmentation probability maps
Men K, Boimel P, Janopaul-Naylor J, et al. Physics in Medicine & Biology, 2018
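To compare a submitted contour with the model output, the probability map has to be turned into a binary mask and an overlap score computed. The sketch below uses the Dice similarity coefficient (DSC) on small nested lists; the 0.5 threshold and the toy 3×3 arrays are assumptions for illustration, not values from the presentation.

```python
def binarize(prob_map, threshold=0.5):
    """Convert a 2D probability map into a 0/1 mask (threshold is assumed)."""
    return [[1 if p >= threshold else 0 for p in row] for row in prob_map]


def dice(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks of equal shape."""
    overlap = sum(
        a & b
        for row_a, row_b in zip(mask_a, mask_b)
        for a, b in zip(row_a, row_b)
    )
    total = sum(map(sum, mask_a)) + sum(map(sum, mask_b))
    # Two empty masks are treated as a perfect match.
    return 1.0 if total == 0 else 2.0 * overlap / total


# Toy data: a model probability map and a submitted contour mask.
prob_map = [
    [0.1, 0.8, 0.9],
    [0.2, 0.7, 0.4],
    [0.0, 0.1, 0.2],
]
submitted = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 0, 0],
]
predicted = binarize(prob_map)
score = dice(predicted, submitted)
print(round(score, 3))
```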
Decision Criteria
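• Passing Criteria:
  $\mathrm{DSC}_{\text{test}} > \mathrm{mean}_{\mathrm{DSC}} - 1.96\,\sigma_{\mathrm{DSC}}$
  $\mathrm{HD}_{\text{test}} < \mathrm{mean}_{\mathrm{HD}} + 1.96\,\sigma_{\mathrm{HD}}$
• mean, σ: mean and standard deviation on validation set
• Contours of validation set have good quality
• Contours of test set that passed either criterion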
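A minimal sketch of these criteria: thresholds are derived from the validation-set Dice (DSC) and Hausdorff distance (HD) values, and a test contour passes if it meets either bound, following the wording above. The use of the sample standard deviation (`statistics.stdev`) and the validation values are assumptions; the presentation does not say which estimator was used.

```python
from statistics import mean, stdev

Z = 1.96  # multiplier used in the passing criteria


def thresholds(validation_dsc, validation_hd):
    """Derive the DSC lower bound and HD upper bound from validation scores."""
    dsc_min = mean(validation_dsc) - Z * stdev(validation_dsc)
    hd_max = mean(validation_hd) + Z * stdev(validation_hd)
    return dsc_min, hd_max


def passes(test_dsc, test_hd, dsc_min, hd_max):
    """A contour passes if it meets either criterion."""
    return test_dsc > dsc_min or test_hd < hd_max


# Hypothetical validation scores, for illustration only.
dsc_min, hd_max = thresholds([0.90, 0.95, 1.00], [5.0, 7.0, 9.0])
print(passes(0.93, 20.0, dsc_min, hd_max))  # meets the DSC bound only
print(passes(0.80, 18.0, dsc_min, hd_max))  # meets neither bound
```

Switching `or` to `and` would give the stricter rule in which a contour must satisfy both bounds.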
Model Performance and Passing Criteria
OAR          Dice mean±σ   Dice pass criterion (>mean-1.96*σ)  Hausdorff distance (pixel) mean±σ  HD pass criterion (<mean+1.96*σ)
Heart        0.95 ± 0.03   >0.89                               7.2 ± 4.1                          <15.2
Esophagus    0.69 ± 0.13   >0.44                               4.6 ± 2.4                          <9.3
Spinal cord  0.86 ± 0.06   >0.75                               2.0 ± 0.7                          <3.4
Lung left    0.96 ± 0.04   >0.88                               7.3 ± 6.1                          <19.3
Lung right   0.96 ± 0.04   >0.88                               7.3 ± 4.7                          <16.5
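As a worked check of how the pass thresholds follow from the validation statistics, the heart values can be substituted into the two criteria. Small differences in other rows presumably reflect rounding of the tabulated mean and σ.

```latex
\begin{align*}
\mathrm{DSC}_{\text{heart}} &> 0.95 - 1.96 \times 0.03 = 0.8912 \approx 0.89 \\
\mathrm{HD}_{\text{heart}}  &< 7.2 + 1.96 \times 4.1 = 15.236 \approx 15.2
\end{align*}
```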
OAR          Number of samples  Number of correct samples  Number of incorrect samples  BA    SEN   SPE   AUC
Heart        4121               3979                       142                          0.96  0.95  0.98  0.96
Esophagus    4121               4079                       42                           0.95  0.98  0.92  0.95
Spinal cord  4121               3894                       227                          0.96  0.96  0.97  0.96
Lung, left   4121               4098                       23                           0.97  1     0.94  0.97
Lung, right  4121               4094                       27                           0.97  1     0.94  0.97
Error Detection Results
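The BA, SEN, and SPE columns are presumably balanced accuracy, sensitivity, and specificity. A small sketch of how they relate is given below; it assumes that an incorrect contour is the positive class, and the confusion counts are hypothetical rather than taken from the trial data.

```python
def detection_metrics(tp, fn, tn, fp):
    """Return (balanced accuracy, sensitivity, specificity).

    tp/fn: incorrect contours that were / were not flagged (assumed positives).
    tn/fp: correct contours that were / were not passed.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes need at least one sample")
    sen = tp / (tp + fn)
    spe = tn / (tn + fp)
    ba = (sen + spe) / 2
    return ba, sen, spe


# Hypothetical counts chosen to give SEN = 0.95 and SPE = 0.98.
ba, sen, spe = detection_metrics(tp=95, fn=5, tn=98, fp=2)
print(f"BA={ba:.3f} SEN={sen:.2f} SPE={spe:.2f}")
```

Balanced accuracy averages the two rates, which keeps the score informative when incorrect contours are far rarer than correct ones, as in the table above.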
Outcome Driven Quality Assurance