

Evaluating the risk of ovarian cancer before surgery using the ADNEX model: a prospective multicenter external validation study

Sayasneh A*1,2, Ferrara L*2,3, De Cock B4, Saso S1,2, Al-Memar M2, Johnson S5, Kaijser K4, Carvalho J2, Husicka R2, Smith A6, Stalder C2, Ettore G3, Van Calster B4, Timmerman D4, Bourne T1,2,4

*: The authors consider that the first two authors should be regarded as joint First Authors.

1: Department of Surgery and Cancer, Hammersmith Campus, Imperial College London, Du Cane Road, London W12 0HS, UK.

2: Early Pregnancy and Acute Gynecology Unit, Queen Charlottes and Chelsea Hospital, Imperial College London, Du Cane Road, London W12 0HS, UK.

3: Department of Obstetrics and Gynecology - Garibaldi Nesima Hospital, Catania, Italy.

4: Department of Development and Regeneration, KU Leuven, Leuven, Belgium.

5: Southampton University Hospitals, Princess Anne Hospital, Southampton, UK, SO16 5YA.

6: Ultrasound Scan Department, Queen Charlottes and Chelsea Hospital, Imperial College London, Du Cane Road, London W12 0HS, UK

Corresponding author: Mr. Ahmad Sayasneh
Locum Consultant Gynecological Oncologist, Guys and St Thomas’ Hospital, and Honorary Senior Clinical Lecturer, Imperial College London.
Department of Surgery and Cancer
Hammersmith Campus
Imperial College London
Du Cane Road
London W12 0HS
Email: [email protected]
Tel: 00442083835131
Fax: 00442083835115

Running title: characterizing ovarian masses using ADNEX multiclass model

Key words: Diagnostic imaging, ovarian neoplasm, statistical models, ultrasonography

Abstract

PURPOSE: To externally validate the International Ovarian Tumor Analysis (IOTA) ADNEX

(Assessment of Different NEoplasias in the adneXa) model for the multiclass characterization of

ovarian masses. The secondary aim was to assess the performance of the ADNEX model when used by level II

ultrasound examiners with varied training and experience.

EXPERIMENTAL DESIGN: This was a cross-sectional multicenter cohort study for diagnostic

accuracy. Patients were recruited from three cancer centers (two in the UK and one in Italy). Patients

with an ovarian mass underwent transvaginal ultrasonography. Only patients who had a histological

diagnosis of surgically removed tissue were included. The diagnostic performance of the ADNEX

model with and without CA125 was calculated.

RESULTS: 610 women were included in the final analysis. The prevalence of malignancy was 30%

(182) with 7% borderline tumors, 8% stage I primary ovarian cancers, 11% stage II-IV primary

ovarian cancers and 4% secondary metastatic cancers. The area under the curve (AUC) for the

diagnostic performance of the ADNEX model to differentiate between benign and malignant masses

was 0.937 (95% CI: 0.915-0.954) when CA125 was included, and 0.925 (95% CI: 0.902-0.943) when

CA125 was excluded. The ADNEX model showed good discrimination between the different

subtypes (benign, borderline, stage I primary cancer, stages II-IV primary cancers and metastatic

secondary cancers).

CONCLUSION: The ADNEX model retains its performance on external

validation. Furthermore, the model performs well in the hands of ultrasound examiners with varied

training and experience.

Introduction

According to the latest statistics from the National Cancer Institute in the USA, there were 12.1 new

cases of ovarian cancer per 100,000 women per year between 2008 and 2012, with a mortality of

7.7 per 100,000 women (1). The overall five year survival is estimated to be around 45.6 % for all

stages of the disease (1). However, for early localized ovarian cancers the five year survival exceeds

90% (1). Early diagnosis and centralized management are thought to be key factors

in optimizing survival (1-3). For early diagnosis, trials to evaluate ovarian cancer screening have not

been successful (4, 5). Recently, the United Kingdom Collaborative Trial of Ovarian Cancer

Screening showed that screening using the risk of ovarian cancer algorithm (ROCA) doubled the

number of detected primary invasive epithelial ovarian or tubal cancers (iEOCs) compared with a

fixed cutoff of CA 125 (6). However, until the follow up of these patients is complete, the impact of

screening on ovarian cancer mortality will not be known (6).

A further important aspect of clinical management is that an accurate diagnosis is made when a

woman presents with an ovarian mass. The International Ovarian Tumor Analysis (IOTA) group has

developed and validated models and rules to characterize ovarian masses as benign or malignant (7-

11). These models and rules have also been validated in the hands of less experienced (level II)

ultrasound examiners (12).

The IOTA group has developed the multiclass ADNEX model which can differentiate between benign

tumors, borderline tumors, early stage primary cancers, late stage primary cancers (stage II to IV) and

metastatic cancers (validation area under the receiver operating characteristic curves (AUCs) between

0.85 and 0.99). This model should make the management of ovarian masses more efficient, as it

allows patients to be triaged to the correct management pathway, whether for conservative follow up,

surgery at a general gynecology unit, or management at high volume specialized cancer centers.

Correctly classifying the subtype of malignancy is also of critical importance as borderline ovarian

tumors and early stage ovarian cancers can be treated less aggressively, leading to the possibility of


fertility preservation in younger women (13, 14). On the other hand, metastatic ovarian cancers should

be managed according to the origin of the primary cancer (14).

ADNEX is based on three clinical and six ultrasound parameters (15). The model was developed and

temporally validated using parameters collected by experienced (or level III) ultrasound examiners

(15, 16). The primary aim of this project was to externally validate the ADNEX model. The secondary

aim was to assess the performance of the model by level II examiners with varied training (MDs and

sonographers) (15, 16).

Methods

Settings and design

This was a cross-sectional multicenter cohort study for diagnostic accuracy. Data were collected

prospectively, including the ultrasound variables required for the ADNEX model, from transvaginal

ultrasound examinations performed by level II ultrasound examiners (ref for level II). Results using

the ADNEX model were calculated by a single investigator (AS) using a dedicated Excel spreadsheet.

The final histological outcome was then added to the same spreadsheet at a later date when results

became available. Accordingly, the ultrasound examiners and the investigator calculating the result of the

ADNEX model were blinded to the results of the reference test. Patients were recruited from three

cancer centers (Queen Charlotte’s and Chelsea Hospital (QCCH), London, UK; Princess Anne Hospital

(PAH), Southampton, UK; Garibaldi Nesima Hospital (GNH), Catania, Italy). The study was

approved as a service evaluation audit at the UK centers and as a validation study by the hospital

authority at the Italian center. The guidelines of the STARD (Standards for Reporting of Diagnostic

Accuracy) initiative were used (17). Patients were recruited consecutively from September 2010 to

November 2014 at QCCH, May 2012 to May 2014 at PAH and September 2012 to February 2015 at

GNH. All patients from GNH and 12 patients from QCCH were recruited into the IOTA 5 study

(https://clinicaltrials.gov/ct2/show/NCT01698632). Patients at QCCH and PAH were also recruited to

the IOTA 4 study (12). All ultrasound examiners received a half-day theoretical training session on

IOTA terminology and the ultrasound variables included in IOTA models. Transvaginal


ultrasonography was performed using the standardized approach previously published by the IOTA

group (11, 18). Transabdominal ultrasonography was undertaken when a large mass could not be fully

evaluated transvaginally (11).

Participants and data collection

The inclusion criteria were patients presenting with at least one adnexal mass who underwent

transvaginal ultrasonography at one of the participating centers. For bilateral adnexal masses, the

mass with the most complex ultrasound features was included (11, 18). If both masses had similar

ultrasound morphology, the largest mass or the one most easily accessible by ultrasonography was

included (11).

The exclusion criteria were (i) pregnancy, (ii) examination by a consultant, (iii) refusal of

transvaginal ultrasonography, (iv) cytology rather than histology as an outcome, and (v) failure to

undergo surgery within 120 days of the ultrasound examination.

The NHS Caldicott report guidelines were followed in all steps of data handling (19). At QCCH and

GNH, a secure electronic data-collection system was used (Astraia Software, Munich, Germany). A

unique identifier was generated automatically for each patient’s record. Dedicated data collection

forms and Excel sheets were used at PAH. Serum CA125 was measured at the clinician’s discretion or

according to clinical practice in each center, using the Abbott Architect CA125 II immunoassay kit

(Abbott Park, IL, USA) at QCCH and GNH, and the UniCel DxI Immunoassay System (Beckman Coulter Inc.,

Brea, CA, USA) at PAH.

The ADNEX model

The Assessment of Different NEoplasias in the adneXa (ADNEX) model contains three clinical and

six ultrasound predictors: age (in years), serum CA-125 level (U/mL), type of center (oncology

centers vs. other hospitals), maximum diameter of lesion (in millimeters), proportion of solid tissue,

more than 10 cyst locules (yes or no), number of papillary projections (0, 1, 2, 3 or more than 3),

acoustic shadows (yes or no), and ascites (yes or no) (15). The ADNEX model is available online and

in mobile applications (www.iotagroup.org/adnexmodel/) (15). The ADNEX model can still be

calculated without including the serum CA125 value. In this study we calculated the performance of

the model with and without CA125.
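As an illustration of the inputs listed above, the nine predictors could be held in a small validated record before being passed to a risk calculator. This is only a sketch: the class name, field names, the 0 to 1 scale for the proportion of solid tissue, and the coding of "more than 3" papillary projections as 4 are assumptions made for the example, and no model coefficients are included.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AdnexInputs:
    """Hypothetical container for the nine ADNEX predictors (illustrative names)."""
    age_years: float
    oncology_center: bool             # True = oncology center, False = other hospital
    max_lesion_diameter_mm: float
    proportion_solid: float           # assumed to be expressed on a 0-1 scale
    more_than_10_locules: bool
    papillations: int                 # 0, 1, 2, 3, or 4 (assumed code for "more than 3")
    acoustic_shadows: bool
    ascites: bool
    ca125_u_per_ml: Optional[float] = None   # None when CA-125 was not measured

    def __post_init__(self) -> None:
        # Basic range checks so that obviously invalid records are rejected early.
        if self.age_years < 0:
            raise ValueError("age must be non-negative")
        if self.max_lesion_diameter_mm <= 0:
            raise ValueError("lesion diameter must be positive")
        if not 0.0 <= self.proportion_solid <= 1.0:
            raise ValueError("proportion of solid tissue must lie in [0, 1]")
        if self.papillations not in (0, 1, 2, 3, 4):
            raise ValueError("papillations must be 0, 1, 2, 3 or 4")
        if self.ca125_u_per_ml is not None and self.ca125_u_per_ml <= 0:
            raise ValueError("CA-125 must be positive when provided")

    @property
    def has_ca125(self) -> bool:
        """True when the record can be used with the CA-125 version of the model."""
        return self.ca125_u_per_ml is not None
```

Leaving CA-125 optional mirrors the fact that the model can be calculated with or without it.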

Reference tests

The reference standard was the histopathological diagnosis of the mass after surgical removal. The

excised tissues underwent histological examination at the local center. Tumors were classified

according to the WHO (World Health Organization) classification of tumors and malignant tumors

were staged according to the FIGO (International Federation of Gynecology and Obstetrics) criteria

(20, 21). Histological classification was performed without knowledge of the ADNEX results. The

final diagnosis was categorized into five types: benign, borderline, stage I invasive, stage II-IV

invasive, and secondary metastatic cancer.

Statistical Analysis

There were missing values for serum CA-125 and whether there were more than 10 cyst locules

(loc10). Missing values were handled differently for serum CA-125 and loc10. The number of

missing values for the latter variable was small (2%), so these were dealt with using single stochastic

imputation based on logistic regression. Missing loc10 values were predicted by a logistic regression

model with Firth correction with the following predictors: age, maximum diameter of the lesion,

proportion of solid tissue, number of papillations, presence of acoustic shadows, ascites, type of

ovarian tumor and type of operator.
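The "stochastic" part of single stochastic imputation can be sketched as follows: rather than rounding the predicted probability to 0 or 1, the missing binary value is drawn at random with that probability. The function name is hypothetical, and the predicted probability is taken as given here; fitting the Firth-corrected logistic regression itself is not shown.

```python
import random


def impute_binary_stochastic(predicted_prob: float, rng: random.Random) -> int:
    """Draw a 0/1 value with P(1) = predicted_prob (hypothetical helper).

    predicted_prob is assumed to come from an already fitted logistic
    regression model; this sketch only covers the random draw.
    """
    if not 0.0 <= predicted_prob <= 1.0:
        raise ValueError("predicted_prob must lie in [0, 1]")
    # rng.random() is uniform on [0, 1), so the draw is 1 with probability predicted_prob.
    return 1 if rng.random() < predicted_prob else 0


# Usage: with a predicted probability of 0.3, about 30% of draws are 1.
rng = random.Random(0)
draws = [impute_binary_stochastic(0.3, rng) for _ in range(10_000)]
share_of_ones = sum(draws) / len(draws)
```

Drawing rather than rounding keeps the imputed values as variable as the observed ones, which thresholding at 0.5 would not.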

The missing serum CA-125 values were handled with multiple stochastic imputation using predictive

mean matching regression. Since the distribution of serum CA-125 was heavily skewed, the log-log

transformation of CA-125 was used (i.e. log(log(CA-125))). In this imputation model, age, maximum

diameter of the lesion, proportion solid tissue, loc10, number of papillations, presence of acoustic

shadows, ascites, type of ovarian tumor, hospital and operator type were used as predictors. Using this

approach, the missing values were replaced by 100 plausible values, leading to 100 completed data

sets. Imputed values were back-transformed to the original scale. For the ADNEX model with CA-125,

each of the 100 completed datasets was analyzed separately and the results were combined using

Rubin’s Rules (22). Supplementary table 1 illustrates the numbers of missing values for each of the

study centers.
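Two pieces of this procedure lend themselves to a short sketch: the log-log transformation with its back-transformation, and the pooling of per-dataset results with Rubin's rules. The function names are hypothetical, the estimates in the usage lines are made-up illustrative numbers rather than study results, and the sketch assumes CA-125 values above 1 so that log(log(CA-125)) is defined. It does not show the predictive mean matching step.

```python
import math
from statistics import mean, variance


def loglog(ca125: float) -> float:
    """log(log(CA-125)); only defined for values greater than 1 (assumption)."""
    if ca125 <= 1:
        raise ValueError("log-log transform requires CA-125 > 1")
    return math.log(math.log(ca125))


def inv_loglog(z: float) -> float:
    """Back-transform an imputed value to the original CA-125 scale."""
    return math.exp(math.exp(z))


def pool_rubin(estimates, variances):
    """Combine m per-dataset estimates and their variances with Rubin's rules."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two estimates with matching variances")
    pooled = mean(estimates)                       # pooled point estimate
    within = mean(variances)                       # average within-imputation variance
    between = variance(estimates)                  # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between         # total variance
    return pooled, total


# Usage with made-up numbers (three completed datasets instead of 100).
pooled_estimate, total_variance = pool_rubin([1.0, 2.0, 3.0], [0.1, 0.1, 0.1])
```

The between-imputation term is what makes the pooled variance reflect the uncertainty due to the missing values.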

External validation of the ADNEX model with and without CA-125 was performed by evaluating the

model’s performance for discrimination and by evaluating its calibration performance. The area under

the receiver operating characteristic curve (AUC) was calculated for the basic discrimination between

benign and malignant tumors, as well as for each pair of tumor types using the conditional risk

method (23). In addition, the polytomous discrimination index was calculated (24), which estimates

the average proportion of correctly classified patients by the model when presented with five patients,

one with each tumor type. Sensitivity and specificity were calculated using 1%, 5%, 10%, 15%,

20% and 30% cutoffs for the total risk of malignancy (i.e. the sum of the estimated risks of the

four malignant subtypes). Calibration of the predicted probabilities was assessed through use of

calibration plots. These plots show the relation between the observed and predicted probabilities for

malignant tumors.
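The benign versus malignant measures described above can be sketched in a few lines. The total risk of malignancy is the sum of the four malignant subtype risks, sensitivity and specificity follow from a cutoff on that total, and the AUC is the proportion of malignant/benign pairs in which the malignant case has the higher risk. The function and key names are hypothetical, the data are toy values, and treating a risk equal to the cutoff as test positive is an assumption.

```python
def total_malignancy_risk(risks: dict) -> float:
    """Sum of the estimated risks of the four malignant subtypes."""
    return (risks["borderline"] + risks["stage_I"]
            + risks["stage_II_IV"] + risks["metastatic"])


def sensitivity_specificity(risk, malignant, cutoff):
    """Sensitivity and specificity when risk >= cutoff counts as test positive (assumption)."""
    tp = sum(1 for r, y in zip(risk, malignant) if y and r >= cutoff)
    fn = sum(1 for r, y in zip(risk, malignant) if y and r < cutoff)
    tn = sum(1 for r, y in zip(risk, malignant) if not y and r < cutoff)
    fp = sum(1 for r, y in zip(risk, malignant) if not y and r >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def auc(risk, malignant):
    """AUC as the share of (malignant, benign) pairs ranked correctly; ties count 0.5."""
    pos = [r for r, y in zip(risk, malignant) if y]
    neg = [r for r, y in zip(risk, malignant) if not y]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: six patients, three malignant (1) and three benign (0).
toy_risk = [0.02, 0.15, 0.40, 0.90, 0.12, 0.03]
toy_malignant = [0, 0, 1, 1, 1, 0]
sens_10, spec_10 = sensitivity_specificity(toy_risk, toy_malignant, 0.10)
sens_30, spec_30 = sensitivity_specificity(toy_risk, toy_malignant, 0.30)
toy_auc = auc(toy_risk, toy_malignant)
```

In the toy data, raising the cutoff from 10% to 30% lowers sensitivity and raises specificity, which is the trade-off the cutoff analysis is designed to show.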

Results

During the study period 751 women underwent ultrasonography for a pelvic mass and went through

the surgical management pathway. 141 women were excluded from the final analysis for the

following reasons: 65 women were examined by a consultant, 26 women had no histology result (14

only cytology, 12 no cytology or histology), 24 women had surgery >120 days from the

characterizing ultrasound scan, 15 women were pregnant, 5 women only had a transabdominal scan, 5

women had no surgery performed (declined or were not medically fit), and 1 woman, who had a

recurrence of cervical cancer in the pelvis a few years after radical hysterectomy and underwent a

bilateral salpingo-oophorectomy, was excluded as the tumor was not considered adnexal. In the final

analysis 610 women were included (figure 1). Supplementary table 2 illustrates the detailed numbers

of excluded and included cases for each center. The prevalence of malignancy was 30% (182) with

7% (42) borderline tumors, 8% (47) stage I primary ovarian cancers, 11% (69) stage II-IV primary

ovarian cancers and 4% (24) secondary metastatic cancers. Supplementary table 3 illustrates the

prevalence of all tumor subtypes for each center. The median age was 47 (IQR: 34-61) with 352

(58%) premenopausal and 258 (42%) postmenopausal women. Table 1 illustrates the distribution of

the ADNEX descriptive parameters among the tumor subtypes for all patients. Supplementary tables

4, 5 and 6 illustrate the distribution of the ADNEX descriptive parameters among the tumor subtypes

in each study center.
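The counts reported in this paragraph can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the text; the variable names are illustrative.

```python
# Exclusion reasons and counts as reported in the text.
exclusions = {
    "examined by a consultant": 65,
    "no histology result": 26,
    "surgery >120 days after scan": 24,
    "pregnant": 15,
    "transabdominal scan only": 5,
    "no surgery performed": 5,
    "tumor not considered adnexal": 1,
}
screened = 751
excluded = sum(exclusions.values())
included = screened - excluded

# Malignant subtypes as reported in the text.
malignant_subtypes = {
    "borderline": 42,
    "stage I": 47,
    "stage II-IV": 69,
    "secondary metastatic": 24,
}
malignant = sum(malignant_subtypes.values())

# Percentages of the included cohort, rounded to whole numbers as in the text.
prevalence_pct = round(100 * malignant / included)
subtype_pct = {k: round(100 * v / included) for k, v in malignant_subtypes.items()}
premenopausal_pct = round(100 * 352 / included)
postmenopausal_pct = round(100 * 258 / included)
```

The totals reproduce the 141 exclusions, the 610 included women, the 182 malignant tumors and the rounded percentages given above.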

The calibration plots for the ADNEX model with and without CA-125 are presented in figures 2 & 3.

The results indicate that there is a near perfect agreement between observed and predicted

probabilities for the model with CA-125. Hence, the predicted probabilities of a malignant tumor

almost perfectly correspond to the observed probabilities. In comparison, the model without CA-125

is less well calibrated. As can be seen in figure 2, small risks are underestimated and high risks are

overestimated. In addition, averaged over all patients, both models overestimate the risk of

malignancy, although the overestimation is relatively small.
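The idea behind a calibration plot can be sketched with a grouped comparison of predicted and observed risks: patients are sorted by predicted risk, split into groups, and the mean predicted risk in each group is set against the observed proportion of malignant tumors. This is a simplified illustration with toy data and a hypothetical function name; it is not necessarily how the calibration curves in the figures were produced.

```python
def calibration_groups(predicted, observed, n_groups):
    """Return (mean predicted risk, observed malignancy rate) per risk group."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not 1 <= n_groups <= len(predicted):
        raise ValueError("n_groups must be between 1 and the number of patients")
    pairs = sorted(zip(predicted, observed))       # order patients by predicted risk
    n = len(pairs)
    rows = []
    for g in range(n_groups):
        # Nearly equal-sized consecutive chunks of the sorted patients.
        chunk = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        obs_rate = sum(y for _, y in chunk) / len(chunk)
        rows.append((mean_pred, obs_rate))
    return rows


# Toy data: eight patients with predicted risks and observed outcomes (1 = malignant).
toy_pred = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
toy_obs = [0, 0, 0, 1, 0, 1, 1, 1]
toy_rows = calibration_groups(toy_pred, toy_obs, 2)
```

A well calibrated model gives groups whose mean predicted risk is close to the observed rate; a mean prediction above the observed rate corresponds to overestimation of risk.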

A high AUC for the diagnostic performance of the ADNEX model to differentiate between benign

and malignant masses was observed whether CA125 was included (0.937, 95% CI: 0.915-0.954) or

excluded from the model calculation (0.925, 95% CI: 0.902-0.943) (see figure 4 and table 2). The

model with CA-125 showed slightly better performance (a difference of 0.012 (95% CI: 0.011-0.013)).

Subgroup analyses of the AUC by center, for pre- vs. postmenopausal women,

and for doctors compared to sonographers are shown in table 2.

Table 3 presents the specificity and sensitivity of both models when different cut-offs were used.

When a cutoff of 1% was used, the models both with and without CA125 correctly classified all

patients with malignant tumors, although at the cost of the specificity being extremely low. When

higher cutoff values were used, sensitivity became slightly lower and there was a sharp increase in

specificity. With a cutoff of 30% and CA125 included, a relatively high sensitivity

was still achieved (86.3%) with a specificity of 83.9% (table 3).

When tumors were classified into benign, borderline, stage I invasive, stage II-IV invasive, and

secondary metastatic, the model showed good discrimination between the different subtypes although

this varied depending on how they were paired (table 4). For example, discrimination between benign

and Stage II-IV tumors was near perfect for the model with CA-125. In comparison, the model had

more difficulties discriminating between borderline and stage I tumors though its performance is still

good. The polytomous discrimination index (PDI) showed that the model, when presented with five

patients, one with each tumor type, correctly identified a fair proportion of patients. The performance

of the model with CA-125 was approximately 3 times better than random performance (PDI=0.2=1/k

with k being the number of categories) (table 4).
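The polytomous discrimination index can be illustrated with a brute-force sketch that follows the description above: every possible set of k patients, one from each tumor type, is formed, and for each type the patient of that type counts as correctly identified when she has the highest predicted probability for that type within the set. The function name, the toy data with k = 3 and the equal splitting of ties are assumptions made for the example; enumerating all sets is only practical for small samples.

```python
from itertools import product


def polytomous_discrimination_index(probs, labels, k):
    """Brute-force PDI for k categories.

    probs: one tuple of k predicted probabilities per patient.
    labels: the true category (0 .. k-1) of each patient.
    """
    by_cat = [[p for p, y in zip(probs, labels) if y == c] for c in range(k)]
    if any(len(cases) == 0 for cases in by_cat):
        raise ValueError("every category needs at least one patient")
    total = 0.0
    n_sets = 0
    for case_set in product(*by_cat):              # one patient from each category
        n_sets += 1
        for c in range(k):
            column = [case[c] for case in case_set]   # predicted risks of category c
            top = max(column)
            if case_set[c][c] == top:
                # Ties share the credit equally (assumption).
                total += 1.0 / column.count(top)
    return total / (n_sets * k)


# Toy data with k = 3 categories.
labels = [0, 1, 2, 0]
perfect = [(0.8, 0.1, 0.1), (0.1, 0.8, 0.1), (0.1, 0.1, 0.8), (0.7, 0.2, 0.1)]
uninformative = [(1 / 3, 1 / 3, 1 / 3)] * 4
pdi_perfect = polytomous_discrimination_index(perfect, labels, 3)
pdi_random = polytomous_discrimination_index(uninformative, labels, 3)
```

With uninformative predictions the index equals 1/k, which is the random-performance benchmark referred to in the text (0.2 for five categories).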

Discussion

In this study, we have shown that in the hands of level II ultrasound examiners, the ADNEX model

was able to discriminate between benign and malignant masses with a very similar level of

performance to that achieved by experienced ultrasound examiners in the original ADNEX temporal

validation study published by the IOTA group (15). In our external validation study using a 10% cut-

off to define malignancy, the ADNEX model achieved a sensitivity of 97.3% and a specificity of

67.7% compared to 96.5% and 71.3% in the original study (15). We also found in the current study,

that the ADNEX model discriminated well between benign tumors and each of four subtypes of

malignancy, and test performance was very similar to the original publication (15) (table 4).

To the best of our knowledge, this is the first external validation study of the IOTA ADNEX model.

Furthermore the validation was carried out by level II ultrasound examiners, whereas in the previous

IOTA development and temporal validation study (15), the ultrasound scan parameters were collected

by experienced level III examiners. A strength of our study is that it is multicenter, and as it includes

level II examiners with varied training and experience (sonographers and medical doctors), we think

the performance of the ADNEX model in this study is likely to be generalizable. Another strength of

our study is the robust selection of the reference test, as only cases with a histological outcome were

included. A potential weakness in the study is that all three participating hospitals are referral centers

of gynecological malignancies, resulting in there being a relatively high prevalence of disease in the

study population. However, in the original ADNEX study the prevalence of malignancy ranged from 0

to 66% in the twenty-four participating centers. Whilst the use of the histological examination of

surgically removed tissue in all cases as a reference test may be seen as a strength of the study, it may

also be seen as a weakness in relation to the potential performance of the ADNEX model for masses

that are selected for conservative management as these were not included in the study. The use of

different assay kits for serum CA-125 measurements is a further study limitation; however, the

inconsistency in CA 125 levels resulting from this is thought to be minimal (26). Furthermore the

variance in CA 125 assay kits used in the study is a reflection of clinical reality and again means

results are more likely to be reproducible (15). Finally, having no centralized histopathology review

in our study may have led to bias. For example, distinguishing borderline tumors from benign tumors

or even stage I cancer may be challenging for pathologists, where disagreement can occur and this

may give inaccurate diagnostic performance results for the ADNEX model in these cases (15).

However, as all the histopathology departments involved in this study were tertiary referral centers for

gynecological cancers, in the event of a discrepancy (including discrepancies in the referring units) a

local review at the tertiary center would have been held to resolve the disagreement. Furthermore,

centralized review of pathology was discontinued in IOTA studies as it was shown in initial studies

that there were no significant differences between local and central reports (27).

In our study, the classification of the level of experience of the ultrasound examiners (level II) was

based on the recommendations published by the European Federation of Societies for Ultrasound in

Medicine and Biology (16) and by the Royal College of Radiologists (28). The boundaries between

the three levels can be difficult to distinguish and may overlap, but as guidance: a level III examiner

in the United Kingdom equates to a consultant with a special interest in gynecological

ultrasonography (28). In our study, similar to previous findings when the IOTA model LR2 was

validated in the hands of level II examiners (12), we found the AUC for the ADNEX model was

slightly higher when the scans were performed by doctors compared to sonographers.

By characterizing the type of malignancy (borderline, primary stage I cancer, primary stage II-IV

cancer or metastatic) the ADNEX model offers the possibility of a more personalized diagnosis in the

event of an ovarian mass. This potentially may enable fertility preserving surgery in some women,

help plan the most appropriate surgical approach (laparoscopy or laparotomy) in others, or direct

attention to the primary site of malignancy in the event of metastasis. Although the ADNEX model

gives absolute risks, relative risks can be computed to give a comparison with the

background risk for an individual patient (25). External validation is a critical step for any diagnostic test

before it can be introduced into clinical practice. We have shown that the performance of the ADNEX

model is retained in units with different patient populations to the original study, and that it performs

well in the hands of examiners with different levels of experience and background training. Our

findings suggest that the ADNEX model has the potential to improve management decisions in daily

clinical practice for women with adnexal tumors.

Acknowledgments

TB is supported by the National Institute for Health Research (NIHR) Biomedical Research Centre

based at Imperial College Healthcare NHS Trust and Imperial College London. The views expressed

are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of

Health. DT is Senior Clinical Investigator of the Research Foundation - Flanders (Belgium) (FWO).

References

1. Howlader N, Noone AM, Krapcho M, Garshell J, Miller D, Altekruse SF, et al. SEER Cancer

Statistics Review, 1975-2012. 2015 [cited 2015; Available from:

http://seer.cancer.gov/csr/1975_2012

2. Bristow RE, Chang J, Ziogas A, Randall LM, Anton-Culver H. High-volume ovarian cancer

care: survival impact and disparities in access for advanced-stage disease. Gynecologic oncology.

2014;132:403-10.

3. Bristow RE, Chang J, Ziogas A, Anton-Culver H. Adherence to treatment guidelines for

ovarian cancer as a measure of quality care. Obstetrics and gynecology. 2013;121:1226-34.

4. Buys SS, Partridge E, Black A, Johnson CC, Lamerato L, Isaacs C, et al. Effect of screening

on ovarian cancer mortality: the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening

Randomized Controlled Trial. JAMA : the journal of the American Medical Association.

2011;305:2295-303.

5. Kobayashi H, Yamada Y, Sado T, Sakata M, Yoshida S, Kawaguchi R, et al. A randomized

study of screening for ovarian cancer: a multicenter study in Japan. International journal of

gynecological cancer : official journal of the International Gynecological Cancer Society.

2008;18:414-20.

6. Menon U, Ryan A, Kalsi J, Gentry-Maharaj A, Dawnay A, Habib M, et al. Risk Algorithm

Using Serial Biomarker Measurements Doubles the Number of Screen-Detected Cancers Compared

With a Single-Threshold Rule in the United Kingdom Collaborative Trial of Ovarian Cancer

Screening. Journal of clinical oncology : official journal of the American Society of Clinical

Oncology. 2015.

7. Timmerman D, Ameye L, Fischerova D, Epstein E, Melis GB, Guerriero S, et al. Simple

ultrasound rules to distinguish between benign and malignant adnexal masses before surgery:

prospective validation by IOTA group. BMJ. 2010;341.

8. Timmerman D, Testa A, Bourne T, Ferrazzi E, Ameye L, Konstantinovic M, et al. A logistic

regression model to distinguish between the benign and malignant adnexal mass before surgery: a

multicenter study by the International Ovarian Tumor Analysis (IOTA) group. Journal of clinical

oncology : official journal of the American Society of Clinical Oncology. 2005;23:8794 - 801.

9. Van Holsbeke C, Ameye L, Testa AC, Mascilini F, Lindqvist P, Fischerova D, et al.

Development and external validation of (new) ultrasound based mathematical models for preoperative

prediction of high-risk endometrial cancer. Ultrasound in obstetrics & gynecology : the official

journal of the International Society of Ultrasound in Obstetrics and Gynecology. 2013.

10. Van Holsbeke C, Van Calster B, Bourne T, Ajossa S, Testa AC, Guerriero S, et al. External

validation of diagnostic models to estimate the risk of malignancy in adnexal masses. Clinical cancer

research : an official journal of the American Association for Cancer Research. 2012;18:815-25.

11. Timmerman D, Van Calster B, Testa AC, Guerriero S, Fischerova D, Lissoni AA, et al.

Ovarian cancer prediction in adnexal masses using ultrasound-based logistic regression models: a

temporal and external validation study by the IOTA group. Ultrasound in obstetrics & gynecology :

the official journal of the International Society of Ultrasound in Obstetrics and Gynecology.

2010;36:226-34.

12. Sayasneh A, Wynants L, Preisler J, Kaijser J, Johnson S, Stalder C, et al. Multicentre external

validation of IOTA prediction models and RMI by operators with varied training. British journal of

cancer. 2013;108:2448-54.

13. Darai E, Fauvet R, Uzan C, Gouy S, Duvillard P, Morice P. Fertility and borderline ovarian

tumor: a systematic review of conservative management, risk of recurrence and alternative options.

Human reproduction update. 2013;19:151-66.

14. Hennessy BT, Coleman RL, Markman M. Ovarian cancer. Lancet. 2009;374:1371-82.

15. Van Calster B, Van Hoorde K, Valentin L, Testa AC, Fischerova D, Van Holsbeke C, et al.

Evaluating the risk of ovarian cancer before surgery using the ADNEX model to differentiate between

benign, borderline, early and advanced stage invasive, and secondary metastatic tumours: prospective

multicentre diagnostic study. BMJ. 2014;349:g5920.

16. Education and Practical Standards Committee, European Federation of Societies for Ultrasound in Medicine and Biology. Minimum training

recommendations for the practice of medical ultrasound. Ultraschall in der Medizin. 2006;27:79-105.

17. Bossuyt PM, Reitsma JB, Bruns DE, Gatsonis CA, Glasziou PP, Irwig LM, et al. Towards

complete and accurate reporting of studies of diagnostic accuracy: The STARD Initiative. Annals of

internal medicine. 2003;138:40-4.

18. Timmerman D, Valentin L, Bourne TH, Collins WP, Verrelst H, Vergote I, et al. Terms,

definitions and measurements to describe the sonographic features of adnexal tumors: a consensus

opinion from the International Ovarian Tumor Analysis (IOTA) Group. Ultrasound in obstetrics &

gynecology : the official journal of the International Society of Ultrasound in Obstetrics and

Gynecology. 2000;16:500-5.

19. Great Britain. Department of Health. The Caldicott Committee report on the review of patient-identifiable

information: Great Britain, Department of Health; 1997.

20. Tavassoli FA, Devilee P, International Agency for Research on Cancer. Pathology and

genetics of tumours of the breast and female genital organs. Lyon: International Agency for Research

on Cancer; 2003.

21. Heintz AP, Odicino F, Maisonneuve P, Quinn MA, Benedet JL, Creasman WT, et al.

Carcinoma of the ovary. FIGO 26th Annual Report on the Results of Treatment in Gynecological

Cancer. International journal of gynaecology and obstetrics: the official organ of the International

Federation of Gynaecology and Obstetrics. 2006;95 Suppl 1:S161-92.

22. Rubin DB. Multiple imputation for nonresponse in surveys: Hoboken.

23. Van Calster B, Vergouwe Y, Looman CW, Van Belle V, Timmerman D, Steyerberg EW.

Assessing the discriminative ability of risk models for more than two outcome categories. European

journal of epidemiology. 2012;27:761-70.

24. Van Calster B, Van Belle V, Vergouwe Y, Timmerman D, Van Huffel S, Steyerberg EW.

Extending the c-statistic to nominal polytomous outcomes: the Polytomous Discrimination Index.

Statistics in medicine. 2012;31:2610-26.

25. Van Calster B, Van Hoorde K, Froyman W, Kaijser J, Wynants L, Landolfo C, et al. Practical

guidance for applying the ADNEX model from the IOTA group to discriminate between different

subtypes of adnexal tumors. Facts, views & vision in ObGyn. 2015;7:32-41.

26. Davelaar EM, van Kamp GJ, Verstraeten RA, Kenemans P. Comparison of seven immunoassays for the quantification of CA 125 antigen in serum. Clinical chemistry. 1998;44:1417-22.

27. Timmerman D, Testa AC, Bourne T, Ferrazzi E, Ameye L, Konstantinovic ML, et al. Logistic

regression model to distinguish between the benign and malignant adnexal mass before surgery: a

multicenter study by the International Ovarian Tumor Analysis Group. Journal of clinical oncology :

official journal of the American Society of Clinical Oncology. 2005;23:8794-801.

28. The Royal College of Radiologists (RCR), Board of the Faculty of Clinical Radiology. Ultrasound Training Recommendations for Medical and Surgical Specialties. 2 ed. London; 2012.

751 patients underwent an ultrasonography examination and were booked for surgery for an adnexal mass

A total of 141 patients were excluded:
65 scanned by consultant
26 no histology
24 had surgery > 120 days
15 pregnancy
5 only transabdominal scan
5 had no surgery
1 recurrence of cervical carcinoma

610 patients included:
QCCH 318
PAH 175
GNH 117

Figure 1: A flow chart illustrating the final sample size and the numbers of excluded cases.
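The arithmetic behind the flow chart can be checked with a short sketch. The counts below are transcribed from Figure 1; the variable names are illustrative only and are not part of the study.

```python
# Counts transcribed from the Figure 1 flow chart (names are illustrative).
assessed = 751

exclusions = {
    "scanned by consultant": 65,
    "no histology": 26,
    "had surgery > 120 days": 24,
    "pregnancy": 15,
    "only transabdominal scan": 5,
    "had no surgery": 5,
    "recurrence of cervical carcinoma": 1,
}

included_by_center = {"QCCH": 318, "PAH": 175, "GNH": 117}

# Sum the exclusion reasons and the per-center inclusions.
total_excluded = sum(exclusions.values())
total_included = sum(included_by_center.values())

print("excluded:", total_excluded)
print("included:", total_included)
print("consistent:", assessed - total_excluded == total_included)
```

With these counts the exclusions sum to 141 and the three centers sum to 610, which matches 751 minus 141.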

Figure 2: Calibration plot for the ADNEX model without CA-125

Figure 3: Calibration plot for the ADNEX model with CA-125

Figure 4. Receiver Operator Curves for benign vs. malignant masses for the ADNEX model with and without serum CA125 levels

Supplementary Table 1. Missing values for variables according to each center

Variable | All (n=751) | QCCH (n=318) | PAH (n=232) | GNH (n=156)
Age, years | 0 | | |
CA-125, IU/L | 143 (19%) | 58 (18%) | 19 (8%) | 66 (42%)
Max lesion diameter | 0 | | |
Max solid diameter | 0 | | |
Proportion solid tissue | 0 | | |
More than 10 locules | 17 (2%) | 17 (5%) | |
Number of papillations | 0 | | |
Acoustic shadows | 0 | | |
Ascites | 0 | | |

QCCH: Queen Charlottes and Chelsea Hospital, PAH: Princess Anne Hospital, Southampton, GNH: Garibaldi Nesima Hospital

Supplementary Table 2. Inclusion and exclusion of patients according to each center in the study.

Category | QCCH | PAH | GNH | All
All | 363 | 232 | 156 | 751
Included | 318 (88%) | 175 (75%) | 117 (75%) | 611 (81%)
Excluded | 45 (12%) | 57 (25%) | 39 (25%) | 140 (19%)
Scanned by consultant | 27 (7%) | | 38 (24%) | 65 (9%)
Only cytology | 4 (1%) | 10 (4%) | | 14 (2%)
No cyto/histo | 5 (1%) | 7 (3%) | | 12 (2%)
Surgery >120d | 3 (1%) | 26 (11%) | | 29 (4%)
Pregnancy | 5 (1%) | 9 (4%) | 1 (1%) | 15 (2%)
Abdominal scan | | 5 (2%) | | 5 (1%)
Recurrence of cervical carcinoma | 1 (<1%) | | | 1 (<1%)

QCCH: Queen Charlottes and Chelsea Hospital, PAH: Princess Anne Hospital, Southampton, GNH: Garibaldi Nesima Hospital

Supplementary Table 3. Frequencies and percentages of type of tumors according to each center

Tumor type | All (n=611) | QCCH (n=318) | PAH (n=175) | GNH (n=117)
Benign | 428 (70%) | 214 (67%) | 119 (68%) | 95 (81%)
Malignant | 182 (30%) | 104 (33%) | 56 (32%) | 22 (19%)
Borderline | 42 (7%) | 28 (9%) | 14 (8%) | 0 (0%)
Stage I OC | 47 (8%) | 33 (10%) | 10 (6%) | 4 (3%)
Stage II-IV OC | 69 (11%) | 35 (11%) | 18 (10%) | 16 (14%)
Secondary metastatic cancer | 24 (4%) | 8 (3%) | 14 (8%) | 2 (2%)

QCCH: Queen Charlottes and Chelsea Hospital, PAH: Princess Anne Hospital, Southampton, GNH: Garibaldi Nesima Hospital

Supplementary Table 4: Distribution of the ADNEX model descriptives in the tumor subtypes for Queen Charlottes and Chelsea Hospital

QCCH | Statistic | Benign (n=214) | Borderline (n=28) | Stage I OC (n=33) | Stage II-IV OC (n=35) | Secondary metastasis (n=8)
Age, years | Median (IQR) | 40 (30-54) | 38 (26-54) | 57 (48-69) | 64 (52-72) | 50 (40-58)
CA-125, IU/L | Median (IQR) | 22 (14-43) | 29 (22-86) | 130 (54-246) | 593 (200-1279) | 138 (46-407)
Max lesion diameter | Median (IQR) | 69 (50-92) | 132 (80-203) | 140 (114-186) | 118 (77-155) | 95 (77-125)
Presence of solid parts | N (%) | 65 (30%) | 17 (61%) | 32 (97%) | 35 (100%) | 8 (100%)
Proportion solid tissue, if present | Median (IQR) | 0.32 (0.16-0.66) | 0.40 (0.19-0.46) | 0.45 (0.32-0.60) | 0.56 (0.38-0.97) | 0.70 (0.43-0.95)
More than 10 locules | N (%) | 13 (6%) | 10 (36%) | 9 (27%) | 7 (20%) | 1 (50%)
Number of papillations: 0 | N (%) | 194 (91%) | 20 (71%) | 25 (76%) | 27 (77%) | 7 (88%)
Number of papillations: 1 | N (%) | 13 (6%) | 3 (11%) | 0 (0%) | 5 (14%) | 0 (0%)
Number of papillations: 2 | N (%) | 3 (1%) | 1 (4%) | 2 (6%) | 0 (0%) | 1 (13%)
Number of papillations: 3 | N (%) | 1 (0%) | 1 (4%) | 2 (6%) | 0 (0%) | 0 (0%)
Number of papillations: >3 | N (%) | 3 (1%) | 3 (11%) | 4 (12%) | 3 (9%) | 0 (0%)
Acoustic shadows | N (%) | 44 (21%) | 0 (0%) | 5 (15%) | 1 (3%) | 1 (13%)
Ascites | N (%) | 1 (0%) | 0 (0%) | 3 (9%) | 10 (29%) | 3 (38%)

QCCH: Queen Charlottes and Chelsea Hospital, OC: Ovarian cancer

Supplementary Table 5: Distribution of the ADNEX model descriptives in the tumor subtypes for Princess Anne Hospital

PAH | Statistic | Benign (n=119) | Borderline (n=14) | Stage I OC (n=10) | Stage II-IV OC (n=18) | Secondary metastasis (n=14)
Age, years | Median (IQR) | 50 (39-63) | 55 (50-65) | 55 (50-64) | 61 (58-71) | 56 (49-69)
CA-125, IU/L | Median (IQR) | 15 (10-30) | 27 (19-54) | 41 (11-72) | 174 (57-967) | 61 (21-86)
Max lesion diameter | Median (IQR) | 79 (58-119) | 122 (99-171) | 152 (105-178) | 86 (64-132) | 88 (71-151)
Presence of solid parts | N (%) | 55 (46%) | 13 (93%) | 10 (100%) | 18 (100%) | 13 (93%)
Proportion solid tissue, if present | Median (IQR) | 0.35 (0.19-0.73) | 0.32 (0.24-0.47) | 0.26 (0.13-0.65) | 0.91 (0.53-1.0) | 1.0 (1.0-1.0)
More than 10 locules | N (%) | 14 (12%) | 4 (29%) | 3 (30%) | 0 (0%) | 2 (14%)
Number of papillations: 0 | N (%) | 91 (76%) | 6 (43%) | 6 (60%) | 14 (78%) | 12 (86%)
Number of papillations: 1 | N (%) | 11 (9%) | 3 (21%) | 1 (10%) | 2 (11%) | 0 (0%)
Number of papillations: 2 | N (%) | 7 (6%) | 1 (7%) | 3 (30%) | 1 (6%) | 1 (7%)
Number of papillations: 3 | N (%) | 2 (2%) | 1 (7%) | 0 (0%) | 0 (0%) | 1 (7%)
Number of papillations: >3 | N (%) | 8 (7%) | 3 (21%) | 0 (0%) | 1 (6%) | 0 (0%)
Acoustic shadows | N (%) | 29 (24%) | 0 (0%) | 1 (10%) | 0 (0%) | 0 (0%)
Ascites | N (%) | 5 (4%) | 1 (7%) | 0 (0%) | 8 (44%) | 4 (29%)

PAH: Princess Anne Hospital, Southampton, OC: Ovarian cancer

Supplementary Table 6: Distribution of the ADNEX model descriptives in the tumor subtypes for Garibaldi Nesima Hospital

GNH | Statistic | Benign (n=95) | Borderline (n=0) | Stage I OC (n=4) | Stage II-IV OC (n=16) | Secondary metastasis (n=2)
Age, years | Median (IQR) | 39 (30-48) | / | 66 (53-70) | 61 (47-74) | 70 (65-74)
CA-125, IU/L | Median (IQR) | 22 (11-45) | / | 59 (22-248) | 507 (154-1525) | 137 (62-212)
Max lesion diameter | Median (IQR) | 61 (43-90) | / | 126 (75-174) | 100 (82-132) | 144 (44-243)
Presence of solid parts | N (%) | 22 (23%) | / | 4 (100%) | 16 (100%) | 1 (50%)
Proportion solid tissue, if present | Median (IQR) | 0.38 (0.16-0.87) | / | 0.53 (0.28-0.84) | 0.52 (0.36-0.78) | 1
More than 10 locules | N (%) | 4 (4%) | / | 1 (25%) | 4 (25%) | 1 (50%)
Number of papillations: 0 | N (%) | 86 (91%) | / | 2 (50%) | 11 (69%) | 2 (100%)
Number of papillations: 1 | N (%) | 7 (7%) | / | 0 (0%) | 1 (6%) | 0 (0%)
Number of papillations: 2 | N (%) | 2 (2%) | / | 0 (0%) | 0 (0%) | 0 (0%)
Number of papillations: 3 | N (%) | 0 (0%) | / | 0 (0%) | 0 (0%) | 0 (0%)
Number of papillations: >3 | N (%) | 0 (0%) | / | 2 (50%) | 4 (25%) | 0 (0%)
Acoustic shadows | N (%) | 21 (22%) | / | 0 (0%) | 0 (0%) | 0 (0%)
Ascites | N (%) | 0 (0%) | / | 0 (0%) | 5 (31%) | 0 (0%)

GNH: Garibaldi Nesima Hospital, OC: Ovarian cancer

Table 1: Descriptive information about the patients and masses included in the study according to tumor subtype

All patients | Statistic | Benign (n=428) | Borderline (n=42) | Stage I OC (n=47) | Stage II-IV OC (n=69) | Secondary metastasis (n=24)
Age, years | Median (IQR) | 43 (31-55) | 47 (30-56) | 57 (48-68) | 62 (53-72) | 55 (49-69)
CA-125, IU/L | Median (IQR) | 20 (12-39) | 28 (21-64) | 92 (35-209) | 485 (136-1083) | 66 (33-129)
Max lesion diameter | Median (IQR) | 72 (51-95) | 128 (91-174) | 146 (109-180) | 110 (76-140) | 90 (73-135)
Presence of solid parts | N (%) | 142 (33%) | 30 (71%) | 46 (98%) | 69 (100%) | 22 (92%)
Proportion solid tissue, if present | Median (IQR) | 0.36 (0.18-0.78) | 0.37 (0.19-0.47) | 0.43 (0.30-0.67) | 0.59 (0.41-1) | 1 (0.58-1)
More than 10 locules | N (%) | 31 (7%) | 14 (33%) | 13 (28%) | 11 (16%) | 7 (29%)
Number of papillations: 0 | N (%) | 371 (87%) | 26 (62%) | 33 (70%) | 52 (75%) | 21 (88%)
Number of papillations: 1 | N (%) | 31 (7%) | 6 (14%) | 1 (2%) | 8 (12%) | 0 (0%)
Number of papillations: 2 | N (%) | 12 (3%) | 2 (5%) | 5 (11%) | 1 (1%) | 2 (8%)
Number of papillations: 3 | N (%) | 3 (1%) | 2 (5%) | 2 (4%) | 0 (0%) | 1 (4%)
Number of papillations: >3 | N (%) | 11 (3%) | 6 (14%) | 6 (13%) | 8 (12%) | 0 (0%)
Acoustic shadows | N (%) | 94 (22%) | 0 (0%) | 6 (13%) | 1 (1%) | 1 (4%)
Ascites | N (%) | 6 (1%) | 1 (2%) | 3 (6%) | 23 (33%) | 7 (29%)

OC: Ovarian cancer

Table 2. The area under the receiver operator curve for the discrimination between benign and malignant lesions for ADNEX with and without CA 125 according to type of center and sonographer

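Model | Group | Area under the ROC curve | Lower confidence limit | Upper confidence limit
ADNEX without CA-125 | All patients | 0.925 | 0.902 | 0.943
ADNEX without CA-125 | Profession operator: MD | 0.924 | 0.900 | 0.943
ADNEX without CA-125 | Profession operator: Sonographer | 0.916 | 0.818 | 0.964
ADNEX without CA-125 | Center: GNH | 0.983 | 0.950 | 0.995
ADNEX without CA-125 | Center: QCCH | 0.931 | 0.900 | 0.953
ADNEX without CA-125 | Center: PAH | 0.889 | 0.828 | 0.930
ADNEX without CA-125 | Menopausal status: Premenopausal | 0.935 | 0.901 | 0.958
ADNEX without CA-125 | Menopausal status: Postmenopausal | 0.873 | 0.824 | 0.910
ADNEX with CA-125 | All patients | 0.937 | 0.915 | 0.954
ADNEX with CA-125 | Profession operator: MD | 0.939 | 0.917 | 0.956
ADNEX with CA-125 | Profession operator: Sonographer | 0.912 | 0.809 | 0.962
ADNEX with CA-125 | Center: GNH | 0.990 | 0.959 | 0.998
ADNEX with CA-125 | Center: QCCH | 0.942 | 0.913 | 0.962
ADNEX with CA-125 | Center: PAH | 0.900 | 0.841 | 0.938
ADNEX with CA-125 | Menopausal status: Premenopausal | 0.939 | 0.901 | 0.963
ADNEX with CA-125 | Menopausal status: Postmenopausal | 0.899 | 0.855 | 0.931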

ROC: Receiver operator curve, MD: medically qualified doctor, QCCH: Queen Charlottes and Chelsea Hospital, PAH: Princess Anne Hospital, Southampton, GNH: Garibaldi Nesima Hospital
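As an illustration of what the area under the ROC curve in Table 2 measures, the sketch below computes it as the probability that a randomly chosen malignant mass receives a higher predicted risk than a randomly chosen benign mass, with ties counted as one half. The risk values are made up for the example and are not study data; the function name `auc` is likewise hypothetical, and this is not necessarily how the authors computed their estimates or confidence limits.

```python
def auc(risks_malignant, risks_benign):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve."""
    wins = 0.0
    for m in risks_malignant:
        for b in risks_benign:
            if m > b:
                wins += 1.0   # malignant case ranked above benign case
            elif m == b:
                wins += 0.5   # ties count as half a concordant pair
    return wins / (len(risks_malignant) * len(risks_benign))


# Hypothetical predicted risks of malignancy, for illustration only.
malignant = [0.92, 0.61, 0.35, 0.08]
benign = [0.02, 0.05, 0.08, 0.40, 0.11]

result = auc(malignant, benign)
print(round(result, 3))  # 16.5 concordant pairs out of 20 -> 0.825
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of benign from malignant masses.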

Table 3. The overall sensitivity and specificity (Benign vs. Malignant) of the ADNEX model with and without the inclusion of serum CA125

Cutoff | Measure | Without CA-125 | With CA-125 (CL)
1% | Sensitivity | 100.0% | 100.0% (97.4-100.0)
1% | Specificity | 12.4% | 11.9% (9.1-15.5)
5% | Sensitivity | 98.9% | 99.0% (94.9-99.8)
5% | Specificity | 54.7% | 53.2% (48.2-58.1)
10% | Sensitivity | 96.7% | 97.3% (93.5-98.9)
10% | Specificity | 67.1% | 67.7% (63.0-72.0)
15% | Sensitivity | 94.5% | 94.4% (90.0-97.0)
15% | Specificity | 72.7% | 75.2% (70.7-79.2)
20% | Sensitivity | 90.7% | 90.6% (85.2-94.1)
20% | Specificity | 77.6% | 79.3% (75.1-83.0)
30% | Sensitivity | 84.6% | 86.3% (80.4-90.6)
30% | Specificity | 83.4% | 83.9% (80.1-87.2)

CL: confidence limits
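To make the cutoff-based figures in Table 3 concrete, the following minimal sketch shows how sensitivity and specificity follow from predicted risks once a risk cutoff is chosen. The risks and outcomes are invented for illustration and are not study data. Treating a risk at or above the cutoff as test-positive is an assumption of this sketch, and the function name is hypothetical.

```python
def sensitivity_specificity(risks, is_malignant, cutoff):
    """Classify risk >= cutoff as test-positive (an assumption) and return (sens, spec)."""
    tp = sum(1 for r, m in zip(risks, is_malignant) if m and r >= cutoff)
    fn = sum(1 for r, m in zip(risks, is_malignant) if m and r < cutoff)
    tn = sum(1 for r, m in zip(risks, is_malignant) if not m and r < cutoff)
    fp = sum(1 for r, m in zip(risks, is_malignant) if not m and r >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical predicted risks and final histology (True = malignant).
risks = [0.92, 0.61, 0.35, 0.08, 0.02, 0.05, 0.08, 0.40, 0.11]
is_malignant = [True, True, True, True, False, False, False, False, False]

sens, spec = sensitivity_specificity(risks, is_malignant, cutoff=0.10)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Lowering the cutoff can only move cases from test-negative to test-positive, so sensitivity rises or stays the same while specificity falls or stays the same, which is the pattern seen across the rows of Table 3.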
