Repeatability and reproducibility of 18F-NaF PET quantitative imaging biomarkers
Christie Lin, Tyler Bradshaw, Timothy Perk, Stephanie Harmon, Glenn Liu, Robert Jeraj
University of Wisconsin – Madison, Department of Medical Physics
NCCAAPM, Madison, WI | October 30, 2015
Introduction: NaF PET
18F-NaF PET, a surrogate of bone metabolism, was first introduced as an imaging agent for detecting bone lesions (Blau 1962)
18F-NaF exchanges with small hydroxyl ions (OH-) in the bone crystal, hydroxyapatite (Blau 1962)
– Within minutes, the ion passes from the plasma through the extracellular fluid (ECF)
• “localized” into the shell of bound water surrounding each crystal
Introduction: Imaging bone metastases
Detection of bone metastases drives the interest in identifying imaging biomarkers (Mick 2014)
– 18F-NaF PET has superior resolution and sensitivity compared with 99mTc bone scans (Even-Sapir 2006, Iagaru 2013)
The Quantitative Imaging Biomarkers Alliance (QIBA) states that quantifying repeatability and reproducibility is critical for accurate assessment of therapeutic response (Raunig 2014)
There has been one study to date on the repeatability of NaF PET imaging in humans (Kurdziel 2012)
– Conducted imaging at one center
Introduction: Metastasis to the bone
Prostate and breast cancers preferentially metastasize to the bone
– More than 90% of metastatic prostate cancer (mPCa) patients develop bone metastases (Bubendorf 2000)
– About 70% of metastatic breast cancer (mBC) patients develop bone metastases (Manders 2006)
Survival rates are low for metastatic cancer
– Patients with mPCa have a poor prognosis and a median survival of 18-24 months from initial progression (Huang 2012)
– Patients with mBC have a median survival of 55 months (Ahn 2013)
Because survival rates are higher with earlier detection, early diagnosis and treatment are crucial!
Research Objectives
Quantify the repeatability of 18F-NaF PET-derived standardized uptake value (SUV) imaging and texture features
– Identify NaF PET-derived imaging features which are repeatable
– Quantify variability between imaging sites in a multicenter trial
– Establish response criteria for 18F-NaF PET-based treatment assessment
Methods: Image acquisition
Scan Acquisition: Multicenter trial of 34 metastatic castrate-resistant prostate cancer patients
Center      Patients   Bone lesions
Center 1    18         264
Center 2    10         67
Center 3    6          68
All         34         399
Obtained test-retest whole-body NaF PET/CT scans (figure panels: Test, Retest)
Methods: Image acquisition
Osseous Lesion Segmentation:
SUVthreshold = 15
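The slide gives only the threshold value, so the following is a minimal sketch of threshold-based lesion segmentation rather than the authors' pipeline. It assumes a 2D grid for brevity (clinical data are 3D), a strict `>` comparison, 4-connectivity, and SUVtotal defined as the sum of voxel SUVs; the function names and the toy image are hypothetical.

```python
SUV_THRESHOLD = 15.0  # value from the slide; everything else is illustrative


def segment_lesions(suv, threshold=SUV_THRESHOLD):
    """Group above-threshold voxels of a 2D SUV grid into 4-connected lesions."""
    rows, cols = len(suv), len(suv[0])
    seen = [[False] * cols for _ in range(rows)]
    lesions = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or suv[r][c] <= threshold:
                continue
            # Flood fill outward from this seed voxel.
            stack, voxels = [(r, c)], []
            seen[r][c] = True
            while stack:
                i, j = stack.pop()
                voxels.append(suv[i][j])
                for ni, nj in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
                    inside = 0 <= ni < rows and 0 <= nj < cols
                    if inside and not seen[ni][nj] and suv[ni][nj] > threshold:
                        seen[ni][nj] = True
                        stack.append((ni, nj))
            lesions.append(voxels)
    return lesions


def suv_metrics(voxels):
    """SUVmax, SUVmean and SUVtotal (assumed here to be the voxel sum)."""
    total = sum(voxels)
    return {"SUVmax": max(voxels), "SUVmean": total / len(voxels), "SUVtotal": total}


# Toy image (made-up values): two separate regions exceed the threshold.
image = [
    [2.0, 18.0, 20.0, 1.0],
    [1.0, 16.0, 3.0, 1.0],
    [1.0, 2.0, 3.0, 30.0],
]
lesions = segment_lesions(image)
metrics = [suv_metrics(v) for v in lesions]
```

Each entry of `metrics` corresponds to one segmented lesion, which matches the lesion-level analysis used in the rest of the talk.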
Methods: Image feature extraction
Feature basis: Features
SUV: SUVmax, SUVmean, SUVtotal
First-order: Max, TLG, Volume, Stdev, Variance, CoV, Skewness, Kurtosis, Energy, Entropy
Co-occurrence matrix: Angular Moment, Contrast-GLCM, Correlation, Sum of Squares Variance, Inverse Difference Moment, Sum Average, Sum Variance, Sum Entropy, Entropy-GLCM, Difference Variance, Difference Entropy, Information Measure of Correlation 1, Information Measure of Correlation 2, Maximal Correlation Coefficient, Maximum Probability, Diagonal Moment, Dissimilarity, Difference Energy, Inertia, Inverse Difference Moment, Sum Energy, Cluster Shade, Cluster Prominence
Gray level run length: Small Run Emphasis, Long Run Emphasis, Gray-Level Nonuniformity, Run Length Nonuniformity, Run Percentage, Low Gray-Level Emphasis, High Gray-Level Emphasis, Short Run Low Gray-Level Emphasis, Short Run High Gray-Level Emphasis, Long Run Low Gray-Level Emphasis, Long Run High Gray-Level Emphasis
Neighboring gray level: Small Number Emphasis, Large Number Emphasis, Number Nonuniformity, Second Moment, Entropy-NGL
Neighborhood gray tone difference matrix: Coarseness, Contrast-NGL, Busyness
(Galavis 2010)
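As an illustration of the first-order family listed above, here is a small sketch that computes several of those features from the SUVs of one lesion. The exact definitions used in the study are not given on the slides, so the choices here are assumptions: population variance, standardized-moment skewness and kurtosis, and entropy and energy taken from a fixed-bin histogram. The function name, bin count and voxel values are hypothetical.

```python
import math
from collections import Counter


def first_order_features(voxels, n_bins=4):
    """First-order statistics of the SUVs inside one lesion (illustrative)."""
    n = len(voxels)
    mean = sum(voxels) / n
    variance = sum((v - mean) ** 2 for v in voxels) / n  # population variance
    stdev = math.sqrt(variance)
    if stdev > 0:
        skewness = sum((v - mean) ** 3 for v in voxels) / (n * stdev ** 3)
        kurtosis = sum((v - mean) ** 4 for v in voxels) / (n * stdev ** 4)
    else:
        # Undefined for a perfectly flat lesion; reported as 0.0 in this sketch.
        skewness = kurtosis = 0.0
    # Histogram the SUVs into equal-width bins to estimate probabilities.
    lo, hi = min(voxels), max(voxels)
    width = (hi - lo) / n_bins if hi > lo else 1.0
    counts = Counter(min(int((v - lo) / width), n_bins - 1) for v in voxels)
    probs = [count / n for count in counts.values()]
    return {
        "Max": hi,
        "Stdev": stdev,
        "Variance": variance,
        "CoV": stdev / mean,
        "Skewness": skewness,
        "Kurtosis": kurtosis,
        "Entropy": -sum(p * math.log2(p) for p in probs),
        "Energy": sum(p * p for p in probs),
    }


# Made-up voxel SUVs for a single lesion.
features = first_order_features([16.0, 18.0, 20.0, 22.0])
```

Skewness and kurtosis depend on third and fourth powers of the deviations, so a few voxels changing between scans can move them considerably; this is consistent with their low repeatability reported in the summary.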
Lesion-level SUV quantification
(Figure: test and retest images of two example lesions; scale 15 to 50)
High repeatability
SUV feature   Test    Retest
SUVmax        64.5    63.7
SUVmean       29.7    28.9
SUVtotal      453     478
…
Low repeatability
SUV feature   Test    Retest
SUVmax        48.2    28.8
SUVmean       22.8    19.4
SUVtotal      286.4   92.7
…
Methods: Statistical analysis
Transforming measurements: Distributions of measurements were skewed, warranting a natural-log
transformation
Measurement difference between scans, within lesion
Measures of repeatability: Coefficient of variation (CV)
Intraclass correlation coefficient (ICC)
b: between lesions; w: within lesions
(Bland 1996, Raunig 2014)
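The formulas on this slide did not survive in the transcript, so the sketch below shows one common way to obtain both measures from paired test-retest data, not necessarily the authors' exact computation. It assumes a one-way random-effects ANOVA on natural-log values with two measurements per lesion, a within-lesion CV of sqrt(exp(σw²) − 1), and ICC = σb² / (σb² + σw²). The function name and the input values are hypothetical.

```python
import math


def repeatability(test, retest):
    """Within-lesion CV and ICC from paired measurements on the log scale."""
    n = len(test)
    log_test = [math.log(x) for x in test]
    log_retest = [math.log(x) for x in retest]
    # Measurement difference between scans, within lesion.
    diffs = [b - a for a, b in zip(log_test, log_retest)]
    lesion_means = [(a + b) / 2 for a, b in zip(log_test, log_retest)]
    grand_mean = sum(lesion_means) / n
    # One-way ANOVA mean squares for k = 2 measurements per lesion.
    ms_within = sum(d * d for d in diffs) / (2 * n)
    ms_between = 2 * sum((m - grand_mean) ** 2 for m in lesion_means) / (n - 1)
    var_within = ms_within
    var_between = max((ms_between - ms_within) / 2, 0.0)
    cv = math.sqrt(math.exp(var_within) - 1)  # CV for log-transformed data
    icc = var_between / (var_between + var_within)
    return cv, icc


# Made-up paired SUVmax values for four lesions.
cv, icc = repeatability([20.0, 35.0, 50.0, 80.0], [22.0, 33.0, 55.0, 76.0])
```

CV reflects only the within-lesion spread, while ICC compares it with the spread between lesions, so a feature can have a small CV and still a modest ICC if lesions differ little from one another.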
Coefficient of variation varies by feature
ICC varies by feature
b: between lesions; w: within lesions
Repeatability of NaF PET/CT imaging features
ICC vs CV
Features of high repeatability
ICC vs CV
Inter-site CV is generally consistent
Repeatability across sites
X-bars: range(ICC); Y-bars: range(CV)
Metrics of high repeatability across sites
X-bars: range(ICC); Y-bars: range(CV)
Determining confidence intervals
95% confidence intervals developed from test-retest measurements can be applied to untransformed data for establishing response criteria
Log-transformed measurement difference
95% confidence intervals (CI95%)
(Bland 1996)
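The interval formula itself is missing from the transcript. A minimal sketch of the back-transformation idea, assuming the interval is mean ± 1.96 standard deviations of the log-transformed differences, exponentiated to give a retest/test ratio (the function name and data are hypothetical):

```python
import math
from statistics import mean, stdev


def ci95_ratio(test, retest):
    """Back-transformed 95% interval for the retest/test ratio."""
    # Log-transformed measurement difference for each lesion.
    diffs = [math.log(r / t) for t, r in zip(test, retest)]
    center, spread = mean(diffs), stdev(diffs)
    lower = math.exp(center - 1.96 * spread)
    upper = math.exp(center + 1.96 * spread)
    return lower, math.exp(center), upper


# Made-up paired SUVmax values for four lesions.
lower, center, upper = ci95_ratio([20.0, 35.0, 50.0, 80.0], [22.0, 33.0, 55.0, 76.0])
```

Because the limits are exponentiated, they are ratios that can be applied directly to untransformed SUVs; they are symmetric on the log scale but not on the ratio scale.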
Confidence intervals of SUV metrics by site
(Figure: 95% confidence interval (ratio), axis from 0 to 2.5, plotted by site; labels: SUVmax, SUVmean, SUVtotal, Pooled)
e.g., a CI95% of 1.00 [0.80, 1.20] indicates a 95% confidence interval of ±20%
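One way such an interval could be used as a response criterion is sketched below. The limits are the illustrative 0.80 and 1.20 from the example above, not the study's results, and the function name and category labels are hypothetical.

```python
def classify_change(baseline, follow_up, ci_low=0.80, ci_high=1.20):
    """Compare a follow-up/baseline ratio with test-retest limits."""
    ratio = follow_up / baseline
    if ratio < ci_low:
        return "decrease beyond test-retest variability"
    if ratio > ci_high:
        return "increase beyond test-retest variability"
    return "within test-retest variability"


# Made-up SUVmax values for one lesion at baseline and at follow-up.
small_change = classify_change(50.0, 55.0)  # ratio 1.10
large_drop = classify_change(50.0, 30.0)    # ratio 0.60
```

A change is attributed to treatment or progression only when the ratio falls outside the limits; anything inside could be explained by measurement variability alone.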
Summary: Repeatability of 18F-NaF PET
Quantified the repeatability of 54 18F-NaF PET-derived standardized uptake value (SUV) metrics and PET-derived texture features for individual lesions
– High repeatability:
• SUV metrics: SUVmean, SUVtotal, SUVmax
• First-order: energy, entropy, median, variance
• Neighborhood gray-tone difference matrix: coarseness, contrast-NGL
– Low repeatability: kurtosis, skewness
Evaluated the variability of 18F-NaF PET imaging across multiple centers
– Metrics with high repeatability were consistent between sites
Established response criteria for 18F-NaF PET-based treatment assessment
Future work:
– Determine repeatability of 18F-NaF PET by the spatial location of the metastasis
Contact: Christie Lin
References
American Cancer Society. Cancer Facts & Figures 2014. Atlanta, GA: American Cancer Society; 2014.
Bland JM, Altman DG. Statistics notes: Transformations, means, and confidence intervals. BMJ. 1996;312(7038):1079.
Bubendorf L, Schöpfer A, Wagner U, et al. Metastatic patterns of prostate cancer: an autopsy study of 1,589 patients. Hum Pathol. 2000;31(5):578-583.
Galavis P, et al. Variability of textural features in FDG PET images due to different acquisition modes and reconstruction parameters. Acta Oncologica. 2010;49:1012-16.
Huang X, Chau CH, Figg WD. Challenges to improved therapeutics for metastatic castrate resistant prostate cancer: from recent successes and failures. J Hematol Oncol. 2012;5:35.
Kurdziel K, et al. The Kinetics and Reproducibility of 18F-Sodium Fluoride for Oncology Using Current PET Camera Technology. J Nucl Med. 2012.
Leijenaar R, et al. Stability of FDG-PET Radiomics features: An integrated analysis of test-retest and inter-observer variability. Acta Oncologica. 2013;52:1391-1397.
Mick C, et al. Molecular Imaging in Oncology: 18F-Sodium Fluoride PET Imaging of Osseous Metastatic Disease. AJR. 2014.
Raunig D, et al. Quantitative Imaging Biomarkers: a Review of Statistical Methods for Technical Performance Assessment. SMMR. 2014.
Schirrmeister H, et al. Prospective evaluation of the clinical value of planar bone scans, SPECT, and (18)F-labeled NaF PET in newly diagnosed lung cancer. J Nucl Med. 2001;42(12):1800-4.
Tixier F, et al. Reproducibility of tumor uptake heterogeneity characterization through textural feature analysis in 18F-FDG PET. J Nucl Med. 2012;53(5):693-700. doi:10.2967/jnumed.111.099127.
Vaz S, et al. The Case for Using the Repeatability Coefficient When Calculating Test–Retest Reliability. doi:10.1371/journal.pone.0073990.
Yip S, Jeraj R. Use of articulated registration for response assessment of individual metastatic bone lesions. Phys Med Biol. 2014;59:1501. doi:10.1088/0031-9155/59/6/1501.
Repeatability of SUVmax: distribution
95% LOA = [-0.27, +0.27]
RC=
Method: Image Acquisition
Scanner
– Centers 1 and 2: scans were acquired on the General Electric Discovery VCT scanner
– Center 3: scans were acquired on the Philips Gemini scanner
Acquisition
– 60 minutes post injection
– Whole-body scan: 3 minutes per bed position
– From the base of skull to the proximal femora
Reconstruction
– Centers 1 and 2: 3D ordered subset expectation maximization (OSEM), 256 × 256 grid size, 14 subsets, 2 iterations, and 4 mm post-reconstruction filter
– Center 3: 3D OSEM, 144 × 144 grid size, 33 subsets, and 2 iterations
Articulated Registration Algorithm
(Yip, Jeraj 2014)
Results: coefficient of variation
SUVmax
Kurtosis
Statistical analysis: measurement error indices
Log transform to approximate normal distribution (Bland 1996)
Relative mean difference (RMD): relative difference between the paired measurements
Bland-Altman (B-A) plots: to show trends in variability over the measuring interval
Repeatability Coefficient (RC): least significant difference between two repeated measurements
95% Limits of agreement (LOA): 95% interval in which difference is expected to lie
Coefficient of variation (CV): within lesion variance
Intraclass correlation coefficient (ICC): relative variance
(Vaz et al 2013, QIBA 2014)
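The slide names these indices without formulas. For orientation, the following are commonly used definitions on the log scale, given as an assumption rather than as the exact expressions used in the study. Here x_{i,1} and x_{i,2} denote the test and retest values of lesion i, n the number of lesions, and s_d the sample standard deviation of the differences; the RMD form in particular is only one of several conventions.

```latex
\begin{align*}
d_i &= \ln x_{i,2} - \ln x_{i,1}
  && \text{log-transformed difference for lesion } i \\
\bar{d} &= \frac{1}{n}\sum_{i=1}^{n} d_i
  && \text{mean difference} \\
\sigma_w^2 &= \frac{1}{2n}\sum_{i=1}^{n} d_i^2
  && \text{within-lesion variance} \\
\mathrm{RMD} &= \frac{1}{n}\sum_{i=1}^{n}
  \frac{x_{i,2} - x_{i,1}}{\tfrac{1}{2}\,(x_{i,1} + x_{i,2})}
  && \text{relative mean difference (assumed form)} \\
\mathrm{LOA}_{95\%} &= \bar{d} \pm 1.96\, s_d
  && \text{95\% limits of agreement} \\
\mathrm{RC} &= 1.96\,\sqrt{2}\,\sigma_w \approx 2.77\,\sigma_w
  && \text{repeatability coefficient}
\end{align*}
```

With these definitions, RC and the LOA half-width nearly coincide when the mean difference is close to zero, since s_d then approaches sqrt(2) σ_w.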
SUVmax: Inter-site measurement error indices suggest high repeatability
b: between site; w: within site
Relative mean difference (RMD) varies significantly by feature
RMD(SUVmax)=0.05%