Online biomarker validation of survival-associated biomarkers in breast and ovarian cancer using...
AIMS
The pre-clinical validation of prognostic gene candidates in large independent patient cohorts is a prerequisite for the development of robust biomarkers. In the present study, we expanded our online Kaplan-Meier plotter tool to assess the effect of genes on ovarian cancer prognosis.
CONCLUSIONS
We extended our global biomarker validation platform to assess the prognostic power of 22,277 genes in 2,977 breast and 1,346 ovarian cancer patients.
Online access at: http://www.kmplot.com/.
METHODS
Gene expression data and survival information for breast and ovarian cancer patients were downloaded from GEO and TCGA. To analyze the prognostic value of a selected gene in the various cohorts, the patients are divided into two groups according to a quantile of the gene's expression. Patients can be filtered by stage, grade, and histological subtype. A follow-up threshold can be applied to exclude long-term effects. A Kaplan-Meier survival plot is generated and significance is computed in the R statistical environment using Bioconductor packages. Several probe sets can be combined, using the mean of their expression as a multigene predictor of survival.
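The grouping and plotting steps described in the methods can be illustrated with a minimal sketch. It is written in Python for illustration only (the platform itself computes in R), and every function name, the toy data, and the choice to censor patients at the follow-up threshold are assumptions rather than details taken from the tool.

```python
from statistics import mean

def quantile(values, q):
    """Quantile q (0..1) of the values, using linear interpolation."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def multigene_score(probe_values):
    """Mean expression of several probe sets for one patient."""
    return mean(probe_values)

def split_by_quantile(scores, q=0.5):
    """Split patient ids into (low, high) groups at quantile q of the scores."""
    cutoff = quantile(scores.values(), q)
    low = [pid for pid, s in scores.items() if s <= cutoff]
    high = [pid for pid, s in scores.items() if s > cutoff]
    return low, high

def censor_at(time, event, threshold):
    """Assumed follow-up rule: censor patients followed beyond the threshold."""
    if time > threshold:
        return threshold, 0
    return time, event

def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each observed event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Handle all patients sharing the same time point together.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve

# Hypothetical toy data: two probe-set values per patient, (months, event) follow-up.
expression = {"pt1": [5.1, 6.0], "pt2": [7.2, 7.9], "pt3": [4.0, 4.4], "pt4": [8.5, 9.1]}
follow_up = {"pt1": (12, 1), "pt2": (80, 1), "pt3": (30, 0), "pt4": (45, 1)}

scores = {pid: multigene_score(vals) for pid, vals in expression.items()}
low, high = split_by_quantile(scores, q=0.5)
for name, group in (("low", low), ("high", high)):
    censored = [censor_at(*follow_up[pid], threshold=60) for pid in group]
    times, events = zip(*censored)
    print(name, kaplan_meier(times, events))
```

Passing a single probe-set value per patient reduces the multigene score to ordinary single-gene grouping, and a different `q` moves the cutoff away from the median.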
RESULTS
Altogether, 1,346 ovarian cancer patients and 2,977 breast cancer patients were entered into the database. The two expression-based patient groups can be compared using relapse-free survival or overall survival. We used this integrative data analysis tool to validate the prognostic power of 37 biomarkers identified in the literature. Of these, CA125 (p=3.7e-5, HR=1.4), CDKN1B (p=5.4e-5, HR=1.4), KLK6 (p=0.002, HR=0.79), IFNG (p=0.004, HR=0.81), P16 (p=0.02, HR=0.66) and BIRC5 (p=0.00017, HR=0.75) were associated with survival.
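The poster does not state which significance test produces these p-values. A common choice for comparing two Kaplan-Meier curves is the log-rank test, and the sketch below assumes that choice purely for illustration. The function name and the toy follow-up data are hypothetical, and the hazard ratio is not computed here.

```python
import math

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value).

    Assumption: this is a generic textbook log-rank test, not necessarily
    the exact procedure used by the online tool.
    """
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})

    observed_a = 0.0
    expected_a = 0.0
    variance = 0.0
    for t in event_times:
        # Patients still at risk just before time t in each group.
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        # Events occurring exactly at time t in each group.
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n = n_a + n_b
        d = d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical toy cohorts: follow-up in months, 1 = event, 0 = censored.
high_times, high_events = [1, 2, 3, 4, 5], [1, 1, 1, 1, 1]
low_times, low_events = [6, 7, 8, 9, 10], [1, 1, 1, 1, 1]
chi2, p = logrank_test(high_times, high_events, low_times, low_events)
print(f"chi2={chi2:.2f}, p={p:.4f}")
```

The statistic compares the observed number of events in one group with the number expected if both groups shared the same survival experience, summed over all event times.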
Figure 1. The online query pages. Panels: TOP2A in breast cancer; CA125 in ovarian cancer; distribution of CA125.
Figure 2. Overview of the system. Elements shown: data sources GEO and TCGA; raw data (n=5,032); 1. quality control, 2. normalization, 3. combination of platforms; remaining n=4,323; clinical data; PostgreSQL database; filtering for gene expression; real-time computation in R; graphical feedback (KM plot, hazard ratio and p-value); analysis at www.kmplot.com.
Symbol    Surv. Analyzed in                    Affymetrix ID HR    p
CA125     PFS   All patients                   220196_at     n.s.  n.s.
CA125     PFS   All patients                   201384_s_at   1.3   0.0003*
CA125     PFS   All patients                   201383_s_at   1.4   3.7e-05*
KRT19     PFS   Debulk = subopt.               201650_at     n.s.  n.s.
KLK6      PFS   All patients                   216699_s_at   0.79  0.002*
KLK6      PFS   All patients                   204733_at     n.s.  n.s.
KLK10     PFS   Stage = 3+4                    209792_s_at   n.s.  n.s.
IL6       OS    All patients                   205207_at     n.s.  n.s.
FAS       PFS   All patients                   204780_s_at   1.2   0.017
FAS       PFS   All patients                   204781_s_at   n.s.  n.s.
FAS       PFS   All patients                   212218_s_at   0.84  0.024
FAS       PFS   All patients                   215719_x_at   n.s.  n.s.
FAS       PFS   All patients                   216252_x_at   n.s.  n.s.
VEGFR     OS    All patients                   203934_at     1.2   0.064
CCND1     OS    Stage = 3+4                    208711_s_at   n.s.  n.s.
CCND1     OS    Stage = 3+4                    208712_at     n.s.  n.s.
CCND3     OS    All patients                   201700_at     n.s.  n.s.
CCNE      OS    Debulk = subopt.               213523_at     n.s.  n.s.
CCNE      OS    Debulk = subopt.               205034_at     n.s.  n.s.
P15       PFS   All patients                   204599_s_at   n.s.  n.s.
P15       PFS   All patients                   212857_x_at   1.3   0.0005*
P15       PFS   All patients                   214512_s_at   1.2   0.01
P15       PFS   All patients                   218708_at     n.s.  n.s.
P16       PFS   Debulk = subopt.               207039_at     0.66  0.002*
P16       PFS   Debulk = subopt.               209644_x_at   n.s.  n.s.
CDKN1A    PFS   Histology = serous             202284_s_at   n.s.  n.s.
CDKN1B    PFS   All patients                   209112_at     1.4   5.4e-05*
RB1       OS    Stage = 1                      203132_at     n.s.  n.s.
E2F1      PFS   All patients                   2028_s_at     0.83  0.017
E2F4      PFS   All patients                   38707_r_at    n.s.  n.s.
TP53      PFS   Stage = 3+4                    211300_s_at   n.s.  n.s.
TP53      PFS   Stage = 3+4                    201746_at     0.84  0.075
BAX       PFS   Therapy = contains Taxol       208478_s_at   n.s.  n.s.
BAX       PFS   Therapy = contains Taxol       211833_s_at   n.s.  n.s.
BCL2L1    PFS   All patients                   212312_at     0.86  0.04
BCL2L1    PFS   All patients                   215037_s_at   n.s.  n.s.
BIRC2     OS    Stage = 3+4                    202076_at     n.s.  n.s.
BIRC5     PFS   All patients                   210334_x_at   0.75  0.00017*
BIRC5     PFS   All patients                   202094_at     0.84  0.018
BIRC5     PFS   All patients                   202095_s_at   0.84  0.018
EGFR      PFS   Stage = 1+2                    201983_s_at   n.s.  n.s.
EGFR      PFS   Stage = 1+2                    201984_s_at   n.s.  n.s.
EGFR      PFS   Stage = 1+2                    211551_at     n.s.  n.s.
ERBB2     PFS   Histology = serous             216836_s_at   n.s.  n.s.
MET       OS    Stage = 3+4                    217828_at     n.s.  n.s.
MET       OS    Stage = 3+4                    203510_at     n.s.  n.s.
MET       OS    Stage = 3+4                    211599_x_at   n.s.  n.s.
MET       OS    Stage = 3+4                    213807_x_at   n.s.  n.s.
MMP2      PFS   Histology = endom.             201069_at     0.33  0.05
MMP9      OS    Stage = 1                      203936_s_at   n.s.  n.s.
MMP14     OS    Stage = 2+3+4                  160020_at     n.s.  n.s.
MMP14     OS    Stage = 2+3+4                  202828_s_at   n.s.  n.s.
MMP14     OS    Stage = 2+3+4                  202827_s_at   n.s.  n.s.
HE4       PFS   All patients                   203892_at     n.s.  n.s.
SERPINB5  PFS   Debulk = subopt.               204855_at     n.s.  n.s.
BRCA1     OS    All patients                   204531_s_at   n.s.  n.s.
ERCC1     PFS   Stage = 3; Therapy = Tax+Plat  203719_at     n.s.  n.s.
ERCC1     PFS   Stage = 3; Therapy = Tax+Plat  203720_s_at   n.s.  n.s.

Table 1. The association between prognostic markers and survival. The markers were analyzed in subsets of patients with equivalent clinical characteristics to the cohorts in which the association has previously been described.
GRANT SUPPORT: OTKA PD 83154; TAMOP-4.2.1.B-09/1/KMR-2010-0001; The PREDICT consortium (EU grant no. 259303)