Article
In Silico Prescription of An
ticancer Drugs to Cohortsof 28 Tumor Types Reveals Targeting OpportunitiesGraphical Abstract
Highlights
d Driver genes are comprehensively identified across a large
pan-cancer cohort
d In silico prescription links approved or experimental targeted
therapies to patients
d Up to 73.3% of patients could benefit from agents in clinical
stages
d 80 therapeutically unexploited targetable cancer driver genes
are identified
Rubio-Perez et al., 2015, Cancer Cell 27, 382–396March 9, 2015 ª2015 Elsevier Inc.http://dx.doi.org/10.1016/j.ccell.2015.02.007
Authors
Carlota Rubio-Perez,
David Tamborero, ...,
Abel Gonzalez-Perez,
Nuria Lopez-Bigas
In Brief
Using a large pan-cancer cohort, Rubio-
Perez et al. develop an in silico drug
prescription strategy based on driver
alterations in each tumor and their
druggability options and use it to identify
druggable targets and promising
repurposing opportunities.
Cancer Cell
Article
In Silico Prescription of AnticancerDrugs to Cohorts of 28 Tumor TypesReveals Targeting OpportunitiesCarlota Rubio-Perez,1,4 David Tamborero,1,4 Michael P. Schroeder,1 Albert A. Antolın,2 Jordi Deu-Pons,1
Christian Perez-Llamas,1 Jordi Mestres,2 Abel Gonzalez-Perez,1,5 and Nuria Lopez-Bigas1,3,5,*1Biomedical Genomics Lab, Research Program on Biomedical Informatics, IMIM Hospital del Mar Medical Research Institute and Universitat
Pompeu Fabra, Doctor Aiguader 88, 08003 Barcelona, Catalonia, Spain2Systems Pharmacology, Research Program on Biomedical Informatics, IMIM Hospital del Mar Medical Research Institute and Universitat
Pompeu Fabra, Doctor Aiguader 88, 08003 Barcelona, Catalonia, Spain3Institucio Catalana de Recerca i Estudis Avancats (ICREA), Passeig Lluıs Companys 23, 08010 Barcelona, Spain4Co-first author5Co-senior author
*Correspondence: [email protected]
http://dx.doi.org/10.1016/j.ccell.2015.02.007
SUMMARY
Large efforts dedicated to detect somatic alterations across tumor genomes/exomes are expected to pro-duce significant improvements in precision cancer medicine. However, high inter-tumor heterogeneity is amajor obstacle to developing and applying therapeutic targeted agents to treat most cancer patients.Here, we offer a comprehensive assessment of the scope of targeted therapeutic agents in a large pan-can-cer cohort. We developed an in silico prescription strategy based on identification of the driver alterations ineach tumor and their druggability options. Although relatively few tumors are tractable by approved agentsfollowing clinical guidelines (5.9%), up to 40.2% could benefit from different repurposing options, and up to73.3% considering treatments currently under clinical investigation. We also identified 80 therapeuticallytargetable cancer genes.
INTRODUCTION
Recent advances in DNA sequencing technologies provide
unprecedented capacity to comprehensively identify the alter-
ations, genes, and pathways involved in the tumorigenic pro-
cess, raising the hope of extending targeted therapies against
the drivers of cancer from a few successful examples to a
broader personalized medicine strategy (Garraway and Lander,
2013; Stratton, 2011). However, large-scale studies confronted
with the high degree of inter-tumor heterogeneity have uncov-
ered long catalogs of cancer driver genes (Davoli et al., 2013;
Kandoth et al., 2013; Lawrence et al., 2014; Tamborero et al.,
2013a; Vogelstein et al., 2013). The goal of using a tailored
Significance
The development of therapies targeting altered driver proteinscancer cells. Nevertheless, the extent of applicability of avalarge pan-cancer therapeutic landscape covering 6,792 tumoagents following their clinical guidelines could provide treatmwe also uncovered promising repurposing opportunities aimportant fraction of cancer patients upon approval. Furtherdevelopment.
382 Cancer Cell 27, 382–396, March 9, 2015 ª2015 Elsevier Inc.
approach to treat all tumor patients may thus require developing
a vast arsenal of anticancer targeted drugs. In addition, ad-
vances in our ability to precisely assign the most effective
targeted therapy to each patient based on the genomic events
driving the tumor are urgently needed. The present study offers
a comprehensive assessment of the potential scope of targeted
drugs in a large pan-cancer cohort. To pursue this goal, we
developed a three-step in silico drug prescription strategy (Fig-
ure 1), which included identifying the driver events acting across
the cohort, collecting all therapeutic agents targeting them, and
connecting each patient to all targeted therapies that could
benefit them, thus producing a landscape of the scope of tar-
geted therapeutic agents in the cohort.
holds the promise of selectively and efficiently eliminatingilable anticancer targeted agents is unclear. Exploiting ars, we found that the prescription of approved therapeuticent only to a small fraction of tumor patients. Nevertheless,nd found that agents in clinical trials could benefit anmore, we identified additional target genes for therapeutic
Figure 1. Graphical Summary of the Approach
The first step consists of identifying genomic driver events acting in the tumor cohort. Step two involves finding the drugs targeting the driver protein products,
and step three uses the generated information to in silico prescribe drugs to patients on the basis of their genomic driver events. This approach, applied to all
patients in the cohort, provides the therapeutic landscape of cancer drivers shown in the middle panel.
RESULTS
Step 1: Identifying Genes Driving Tumorigenesis in 28Cancer TypesThe first step of the personalized in silico drug prescription
strategy consists in identifying all the alterations driving tumori-
genesis in a patients’ tumor. To implement this step, we first
collected and analyzed somatic mutations (single-nucleotide
variants and short insertions and deletions), copy-number alter-
ations (CNAs), fusion genes, and RNA sequencing (RNA-seq)
expression data in 4,068 tumors of 16 The Cancer Genome Atlas
(TCGA) studies, representing the same number of tumor types
(hereafter, named core cohort). In addition, we collected somatic
mutations for 2,724 additional tumors included in 32 exome
sequencing studies covering 28 cancer types.
The following six sections of results describe this catalog
of cancer driver genes in each tumor type—totaling 475
genes that drive tumorigenesis via mutations, CNAs or gene fu-
sions—as well as their mode of action and whether their driver
mutations are present in the majority of the tumor clonal popula-
tion. All this information integrates the Cancer Drivers Database,
available to researchers via our IntOGen web discovery platform
(http://www.intogen.org; Gonzalez-Perez et al., 2013a).
We Detect 459 Mutational Driver Genes in the 6792Tumors of the Entire CohortThe identification of mutational drivers is challenging due to
both the high heterogeneity in the pattern of somatic mutations
observed across tumor samples and the long tail of cancer
genes mutated at very low frequency. We applied three methods
Cancer Cell 27, 382–396, March 9, 2015 ª2015 Elsevier Inc. 383
B
A
C D
475 drivers
437
9
72
190
1
mutational drivers
CNA drivers
driver fusion genes
E
Cell cycle & DNA damage Angiogenesis Apoptosis
Proliferation Cell adhesion
Chromatin regulatory factors Splicing factors UPS*
*Ubiquitin proteolisis system
Num
ber
of m
utat
ed s
ampl
esN
umbe
r of
mut
ated
sam
ples
Num
ber
of s
ampl
es
Num
ber
of s
ampl
es
Num
ber
of m
utat
ed s
ampl
es
CLASSICALPROCESSES
EMERGINGPROCESSES
AmplifiedDeleted
Tumor type Tumor Type description Projects Samples Mutational driver genes
ALL Acute lymphocytic leukemia 3 122 12
AML Acute myeloid leukemia 1 196 32
BLCA Bladder carcinoma 1 98 156
BRCA Breast carcinoma 6 1148 184
CLL Chronic lymphocytic leukemia 2 290 38
CM Cutaneous melanoma 2 369 250
COREAD Colorectal adenocarcinoma 2 229 95
DLBC Diffuse large B cell lymphoma 1 23 10
ESCA Esophageal carcinoma 1 146 98
GBM Glioblastoma multiforme 2 379 75
HC Hepatocarcinoma 2 90 30
HNSC Head and neck squamous cell carcinoma 2 375 167
LGG Lower grade glioma 1 169 50
LUAD Lung adenocarcinoma 2 391 181
Tumor type Tumor Type description Projects Samples Mutational driver genes
LUSC Lung squamous cell carcinoma 1 174 147
MB Medulloblastoma 2 210 24
MM Multiple myeloma 1 69 18
NB Neuroblastoma 1 210 27
NSCLC Non small cell lung carcinoma 1 31 11
OV Serous ovarian adenocarcinoma 1 316 83
PA Pylocytic astrocytoma 2 101 2
PAAD Pancreas adenocarcinoma 3 214 21
PRAD Prostate adenocarcinoma 1 243 88
RCCC Renal clear cell carcinoma 1 417 105
SCLC Small cell lung carcinoma 2 69 61
STAD Stomach adenocarcinoma 2 161 175
THCA Thyroid carcinoma 1 322 32
UCEC Uterine corpus endometrioid carcinoma 1 230 149
Total 48 6792 459
(legend on next page)
384 Cancer Cell 27, 382–396, March 9, 2015 ª2015 Elsevier Inc.
of analysis designed to detect complementary signals of positive
selection in the pattern of somatic mutations in genes across
samples (Gonzalez-Perez and Lopez-Bigas, 2012; Lawrence
et al., 2013; Tamborero et al., 2013b). The assumption is that mu-
tations in certain genes provide a selective advantage to tumor
cells that subsequently grow and proliferate faster; tracing the
signals left by the selection across a cohort of tumors identifies
the genes driving tumorigenesis (Gonzalez-Perez et al., 2013b).
Furthermore, the combination of methods detecting comple-
mentary positive selection signals produces a more comprehen-
sive and reliable list of cancer drivers (Tamborero et al., 2013a).
In the 48 cohorts under analysis (Table S1, part A), we identified
459 mutational cancer driver genes acting in one or more tumor
types, supporting recent reports that many new cancer genes
are yet to be identified (Lawrence et al., 2014; Vogelstein et al.,
2013) (Figure 2A). The list contains genes previously identified
as cancer drivers (Davoli et al., 2013; Futreal et al., 2004; Kan-
doth et al., 2013; Lawrence et al., 2014; Tamborero et al.,
2013a; Vogelstein et al., 2013), as well as many novel driver can-
didates (Figures 2B, S1A, and S1B and Table S1, part D), some
of which have been previously linked to tumor emergence, pro-
gression or metastasis through an array of different alterations
(e.g., NCOR2, MAP3K4, PIK3CB, HDAC9, and NEDD4L). All of
the identified drivers behaved similarly in several features in
which cancer genes are known to deviate from other genes,
such as their abnormally high connectivity in interactions net-
works, significant enrichment for rare germline variants, and
low baseline tolerance to germline variants with functional
impact (Figure S1C). Most of these cancer drivers exhibit low
mutational frequency in the cohort; 63% were mutated in less
than 1% of samples (Table S1, part B). Although some cancer
genes drive tumorigenesis across several tumor types, others
appear to be specific to certain malignancies (Figure 2, Table
S1, part C). Conversely, the number of genes able to drive the
tumorigenesis varies widely between tumor types, ranging
from the long catalogs detected in cutaneous melanoma and
breast carcinoma (250 and 184, respectively) to the short lists
identified in pilocytic astrocytoma and acute lymphocytic leuke-
mia (2 and 12, respectively) (Figure 2A).
A Quarter of Mutational Drivers Are Involved inCell-Regulatory MechanismsMost of the identified cancer drivers are involved in well-known
cancer pathways (Figure 2B) (Hanahan and Weinberg, 2011;
Vogelstein et al., 2013). Within manually selected modules of
genes acting in the same biological pathway, these drivers
tend to be mutated in a mutually exclusive pattern (Figure S1D
and the Supplemental Experimental Procedures). Interestingly,
about 25% of the drivers participate in general cell-regulatory
processes, which are emerging cancer pathways. Three types
Figure 2. The Cancer Drivers Database
(A) Table summarizing the number of cohorts and samples analyzed for each tum
(B) Frequency of somatic mutations of 177 selected mutational drivers of eight ca
long known to be involved in cancer; the bottom panel is dedicated to regulator
togram follow the legend in (A).
(C and D) Frequency of alterations in (C) CNA and (D) fusion drivers across the c
(E) Distribution of drivers according to their type of oncogenic alterations.
See also Figure S1 and Table S1.
of such regulators are widely represented among cancer drivers:
chromatin regulatory factors (CRFs), splicing and mRNA pro-
cessing regulators (SMRs), and members of the ubiquitin-
mediated proteolysis system (UPSs). Furthermore, some of the
cancer drivers belonging to these groups appear to be mutated
at relatively high frequency, such asMLL2, PBRM1, and ARID1A
among CRFs, SF3B1 and DDX3X among SMRs, and VHL and
FBXW7 among UPSs. With the exception of PBRM1 and VHL,
which are specifically linked to the oncogenic process in renal
clear cell carcinoma, cancer genes of these three functional
groups drive tumorigenesis in several malignancies (Figure 2B).
Their role in malignant transformation is most likely mediated
by the misregulation they cause in specific cancer genes that
are directly linked to oncogenic processes (Brooks et al., 2014;
Ferreira et al., 2013; Malhotra et al., 2010; Przychodzen et al.,
2013; Romero and Sanchez-Cespedes, 2014).
The Mode of Action of the 459 Mutational Driver GenesDriver mutations—and by extension the genes that bear them—
can be classified into three groups: (1) loss of function (LoF), typi-
cally tumor-suppressor genes whose disrupted function favors
tumorigenesis; (2) gain of function (GoF), typical of oncogenes
that, when abnormally activated, provide an advantage to the
cell; and (3) switch of function (SoF), which provides a new func-
tion to the encoded protein that promotes malignant transforma-
tion. Identifying the mode of action of drivers is crucial to
exploring the opportunities to therapeutically target drivers
because, in principle, only proteins with activating (Act) muta-
tions (GoF plus SoF) can be directly targeted with molecules
able to inhibit their function. In contrast, LoF genes should be tar-
geted through other strategies, such as the inhibition of a func-
tionally connected gene (Oza et al., 2011), the restoration of their
abolished function (Khoo et al., 2014), and the induction of syn-
thetic lethality like the paradigmatic PARP inhibition for tumors
with BRCA1/2 mutations (Fong et al., 2009).
We used a random forest classifier, OncodriveROLE, to iden-
tify the mode of action of cancer drivers (Schroeder et al., 2014)
based on mutation and CNA gene patterns. We found that 169
(36.8%) of the 459 cancer drivers identified in our multi-tumor
cohort are Act, while 207 (45%) are LoF. The pattern in the
remaining 83 (18.1%) drivers was not clear enough to achieve
unambiguous classification (Supplemental Experimental Proce-
dures, Figure 3A, and Table S2, part A).
A Limited Number of Drivers Clonally Dominate theTumorsNot all driver genes are equally important in the course of tumor-
igenesis. Tumors may be more addicted to mutations in certain
drivers, which provide basic capabilities to cancer cells—such
as the evasion of apoptosis or uncontrolled proliferation—than
or type and the number of mutational drivers detected in each.
ncer-related biological processes. The top panel contains graphs of processes
y processes only recently linked to tumorigenesis. Colors in row-stacked his-
ore cohort.
Cancer Cell 27, 382–396, March 9, 2015 ª2015 Elsevier Inc. 385
A
C
B
Figure 3. Ancillary Information in the Cancer Drivers Database
(A) Mode of action of mutational driver genes.
(B) Major mutational drivers detected in datasets of the core cohort. Each dot represents a gene placed in the y axis according to their median proportion of clonal
population containing protein affecting mutations in it. Orange dots show the major drivers in each cancer type.
(C) Distribution of the number of altered drivers across all samples of the core cohort (left) and across each of its individual datasets (right).
See also Figure S2 and Table S2.
to others, which render added capacities to the basic tumori-
genic phenotype that allow them to further progress and migrate
(Vogelstein et al., 2013). We hypothesized that mutations in the
former genes, which we call major cancer drivers, should domi-
nate the clonal population of these tumors. We found that muta-
tions in 73 of the driver genes were biased toward large clonal
frequencies across one or more of the 16 tumor samples cohorts
386 Cancer Cell 27, 382–396, March 9, 2015 ª2015 Elsevier Inc.
in which this analysis was possible (Figure 3B and Table S2,
part B). Major cancer drivers should include both genes respon-
sible for the onset of tumorigenesis and genes harboring
mutations that confer such powerful growth advantages to tu-
mor cells that, when sequenced, the clones containing them
have outgrown the founder clone. Although many major cancer
drivers (45) are already well-established cancer genes and are
therefore annotated in the Cancer Gene Census (CGC), others
are involved in core cellular regulatory processes related to
more recently described cancer processes, such as CRFs,
SMRs, and UPSs. Overall, these major cancer drivers, grouped
in several functional modules, appear to alter few cellular path-
ways upon mutation (Figure S2A). These groups of genes there-
fore constitute interesting targets for anticancer therapies.
We Identify 38 Drivers Acting via CNAs or Fusions in theCore CohortTo complete the landscape of driver genes acting in the cohort,
we also considered those actionable drivers acting via amplifica-
tions, deletions, or gene fusions in the 4,068 tumor samples from
16 cancer types in which these data were available (i.e., the core
cohort). In each of these datasets, we considered candidate
CNA drivers: (1) mutational driver genes in the corresponding tu-
mor type and (2) manually selected targeted genes located
within recurrently amplified of deleted genomic segments identi-
fied by the GISTIC analysis of the corresponding tumor type
(Mermel et al., 2011). In both cases, we only considered as
CNA drivers those genes that exhibited CNAs with coherent
expression changes —i.e., overexpressed Act drivers and
underexpressed LoF drivers (Supplemental Experimental Proce-
dures). As a result, we identified 29 targeted genes putatively
driving tumorigenesis in one ormore of thesemalignancies either
via amplification (15) or deletion (14) (Figure 2C and Table S1,
part E). Nine of these genes were driving cancer exclusively via
CNAs: AURKA, CCND2, CCND3, CCNE1, CDK6, CDKN2B,
IGF1R, MDM2, and MDM4. Moreover, 20 other genes were
also mutational drivers (i.e., PTEN and CDKN2A) (Figure 2E).
Finally, we collected a list of genes (ten) known to drive tumori-
genesis upon fusionsbymanually curating the literature (Figure2D
and Table S1, part F). Only high reliable fusion events per sample
for in-frame fusions of both partners were taken into account.
The Cancer Drivers Database Identifies TumorigenicAlterations in 90% of TumorsNinety percent of the 4,068 tumor samples in the core cohort
bear at least one driver alteration defined in the Cancer Drivers
Database. Tumor samples in the core cohort contain a median
of three driver events, although the number of events per sample
differs widely across tumor types (Figures 3C). These figures are
actually underestimated due to, among others, lowly recurrent
drivers that remain unidentified at the current number of tumor
re-sequenced genomes and unidentified driver non-coding mu-
tations not assessed by exome-sequencing data, as well as
other types of genomic or epigenomic alterations not considered
in the present analysis. When only mutations are considered, the
fraction of tumors with identified driver events drops from 90% to
86% in the core cohort and to 82% in the full cohort (Figure S2B).
Step 2: Collecting Drugs Targeting Driver GenesThe second step of the in silico prescription strategy consisted in
exhaustively searching for therapeutic options, either approved
by the Food and Drug Administration (FDA) or in clinical or
pre-clinical studies, to target mutational, CNA, and fusion driver
genes detected in step one. We considered three different plau-
sible approaches to target them: (1) inhibition of activated drivers
with therapeutic molecules (direct targeting), such as Vemurafe-
nib-targeting V600E-activating BRAF mutation; (2) inhibition of
non-altered proteins functionally connected to the altered driver
(indirect targeting), for example Temsirolimus, a MTOR inhibitor
undergoing a clinical trial (Oza et al., 2011) to treat cancer
patients bearing PTEN-inactivating mutations; and (3) gene
therapies designed to compensate for the loss of activity of a
tumor-suppressor driver. The process employed to identify
these interactions and ancillary information relevant to their pre-
scription to cancer patients, as well as its results, is described in
the following two sections. All this information composes the
Cancer Drivers Actionability Database (Supplemental Experi-
mental Procedures) made available to researchers through
IntOGen (http://www.intogen.org/downloads).
Drugs Targeting Cancer DriversOur thorough search for anticancer targeted therapeutic options
identified 57 FDA-approved agents targeting 51 drivers, 47
agents in clinical trials targeting 66 drivers—26 on top of the pre-
vious 51—and 20,470 ligands potently binding 77 drivers—19
non-reported with respect to the previous lists—totaling 96 tar-
geted drivers (Figures 4A and 4B and Table S3). Seventy-four
out of them were exclusively directly targeted, whereas 13
were only targeted indirectly; seven drivers could be targeted
both directly and indirectly. We also found two gene therapies
aimed at restoring the activity of two drivers (Figure 4C). Drivers
targeted by agents following each of the three strategies with
different levels of approval in the clinical practice are shown in
Figure 4D. Finally, we found that 35 other drivers had a protein
structure suitable for small molecule binding, named potentially
druggable, and 26 could be accessed by antibody and protein
therapies, named potentially biopharmable (Figure 4A).
Of special interest in the catalog of targeted therapies are
agents targeting multiple drivers, which could provide a wider
therapeutic scope (Hopkins, 2008; Mestres et al., 2008). We
observed that 32 out of 57 (56.14%) FDA-approved drugs target
multiple drivers (between 2 and 17), stressing the fact that a large
proportion of cancer drugs are poorly selective. Multi-target
options are likely to increase in the near future, as 23 therapeutic
agents in clinical trials and 4,746 preclinical ligands target multi-
ple drivers (Figure S3).
Cancer Drivers Actionability DatabaseWe complemented the list of therapeutic anticancer agents with
ancillary information on their clinical use and the options for re-
purposing them and integrated all these data within the Cancer
Drivers Actionability Database. The database contains informa-
tion on the tumor type for which the drug is prescribed (for
FDA-approved drugs) or is being tested (for drugs in clinical tri-
als). It also contains rules that take into account the type of driver
alteration considered actionable, according to clinical guidelines
(e.g., Afatinib is clinically prescribed for the EGFR L858R muta-
tion) and/or according to the specification in clinical trials or in
the literature (e.g., Imatinib is being tested for PDGFRA muta-
tions). Additionally, the database considers alterations known
to be causative of drug primary resistances (e.g., KRAS muta-
tions cause resistance to Cetuximab), according to clinical
guidelines or clinical or pre-clinical investigations.
The database also defines rules for the possible expansions of
the use of approved drugs in cancer treatment. In particular, we
Cancer Cell 27, 382–396, March 9, 2015 ª2015 Elsevier Inc. 387
A B C D
Figure 4. Drugs Targeting Cancer Drivers(A) Number of driver genes within each unique druggability category. Note that each driver gene is counted only once in this plot and is placed in the category
according to the ordered hierarchy.
(B) Number of driver genes that are targets of therapeutic agents with the three levels of approval in the clinical practice.
(C) Distribution of driver genes susceptible to the three types of targeting strategies with different levels of approval in the clinical practice.
(D) Distribution of driver genes, within each targeting strategy, targeted by agents with the three levels of approval in clinical practice.
See also Figure S3 and Table S3.
have considered six cases where the repurposing of FDA-
approved drugs could prove useful. We stratified them in three
tiers, according to their feasibility (Supplemental Experimental
Procedures). Tier 1 includes (1) tumor type repurposing (tumor
types different than the one considered in the clinical guidelines)
and (2) disease repurposing (agents approved for other diseases
that are however able to target cancer driver proteins). Tier 2 in-
cludes (1) alteration repurposing (agents targeting an alteration
of a driver that could be employed in the event of another type
of alteration of the same gene), (2) indirect targeting repurposing
(agents that target a specific oncoprotein but have proven useful
counteracting the alteration in another connected driver), and (3)
strong off-target repurposing (agents that bind more potently
than drug primary targets, the clinically intended ones). Tier 3
includes mild off-target repurposing (agents which bind to off
targets with an affinity less potent than the one for their primary
targets, but more potent than 1 mM). We have included in the
database rules to filter out 12 known cases of repurposing
that have rendered negative results in clinical trials (Table S4,
part B), obtained manually from the literature. Furthermore, we
have collected information from 98 currently ongoing clinical tri-
als that assay repurposing opportunities considered in our data-
base, some of which have proven successful (Table S4, part A).
Step 3: Connecting Targeted Therapeutic Agents toPatientsThe third step of the in silico prescription strategy consists in
determining which targeted therapies could benefit a patient
(Figure 1). To make this decision, we select the driver alterations
in the tumor on the basis of the Cancer Drivers Database. Then,
therapies are matched to those alterations taking into account
388 Cancer Cell 27, 382–396, March 9, 2015 ª2015 Elsevier Inc.
the interactions and rules in the Cancer Drivers Actionability
Database.
We applied this procedure to the 4,068 patients of the core
cohort, taking into account the observed somatic mutations,
CNAs, and fusion events (Table S4, part D), and to the full cohort
of 6,792 patients considering only somatic mutations (Table S4,
part C). The obtained therapeutic landscapes allowed us to
answer (1) what fraction of the patients of these tumor types
could benefit from currently available targeted therapies, (2)
what fraction of them might benefit from targeted therapies
currently under clinical trials or in pre-clinical stages, and (3)
what fraction of them may receive multiple agents aimed at
improving their response to therapy. The following five sections
describe these targeted therapeutic landscapes and answer
these questions.
Approved Targeted Therapies Could Benefit 5.9% of thePatients in the Core CohortOf the 4,068 samples of the core cohort, only 241 are in silico-pre-
scribed FDA-approved drugs following their primary indications.
In other words, only 5.9% of the patients represented by this
pan-cancer cohort could be treated with targeted therapies
directed to modulate the activity of mutated, amplified, deleted,
or fuseddriverproteins under current clinical guidelines (Figure 5).
All samples tractable with approved targeted approaches are
concentrated in six tumor types: cutaneousmelanoma, colorectal
adenocarcinoma, headandneck squamouscell carcinoma, renal
clear cell carcinoma, breast carcinoma, and acutemyeloid leuke-
mia (Figure 6A).Mostof these (101) arebreast carcinomasamples
bearing ERBB2 amplifications, with in silico prescriptions for
Lapatinib, Pertuzumab, and Trastuzumab (Figure 6B).
A B
Figure 5. Therapeutic Landscape of the Core Cohort
Sets of patients susceptible of different types of interventions are shown both in their overlaps (A) and as non-overlapping groups following a hierarchy (B)
(i.e., each patient is uniquely labeled in one category following the ordered hierarchy). The same classification for full cohort is available in Figure S4A. See also
Table S4.
When only drugs targeting mutational drivers in patients of the
full cohort are considered (Figure S4), there is a substantial
decrease in the proportion of tumors susceptible of treatment
with FDA-approved drugs according to clinical guidelines
(2.9%). The contrast between these two results highlights the
significance of exploring driver CNAs and fusions when making
clinical decisions on cancer patients. In any case, the proportion
of samples benefiting from FDA-approved drugs following clin-
ical guidelines is still slim.
Drug Repurposing Increases up to 40.2% the Fraction ofPatients that Could Benefit from FDA-Approved DrugsDrug repurposing is a promising strategy to maximize the bene-
fits of currently available drugs (Gupta et al., 2013). The in silico
drug prescription procedure assigned FDA-approved drugs ac-
cording to one or more of the repurposing strategies considered
here to 1,635 patients of the core cohort (40.2%) (Figure 5A).
As mentioned above, tumor type repurposing was divided in
three tiers according the feasibility of the therapy. Repurposing
opportunities in tier 1 were mostly tumor type repurposing cases
(835 out of the 838 patients benefiting from opportunities in the
tier). Moreover, 697 of these patients could not benefit from
approved drugs according guidelines (Figure 5B). Most of these
opportunities were concentrated in three cancer types, amount-
ing to 182 thyroid carcinoma, 168 glioblastoma, and 56 lung
adenocarcinoma tumors (Figures 6A and 6B). Kinase inhibitors
represent the highest number of tumor type repurposing oppor-
tunities among all drug families, including cases of multiple
in silico drug prescriptions for one tumor sample.
Repurposing opportunities in tier 2 could benefit 867 patients,
out of which 587 could not be linked to more feasible therapies.
FDA-approved drugs through indirect targeting repurposing
were assigned to 692 patients. Most of these samples were colo-
rectal (120), uterine carcinomas (112), and glioblastomas (99).
Deletions and/or truncating mutations of PTEN (377 samples),
NF1 (148), and APC (138) received the largest share of these
indirect opportunities (Figure 7A). On the other hand, alteration
repurposing opportunities were concentrated in glioblastoma
patients (76). Most of these patients (70) could receive Lapatinib
tosylate, prescribed for ERBB2 overexpression, to target EGFR
mutations.
Lastly, repurposing opportunities in tier 3 could be beneficial
for 1,153 patients, 110 of which could not be prescribed
more feasible therapies. In summary, the fraction of samples
benefiting from FDA-approved drugs prescribed in silico in-
creases markedly (from 5.9% to 40.2%) if repurposing opportu-
nities are considered, with important differences between tumor
types (Figure 6). Nevertheless, the usefulness of repurposing to
treat cancer patients is slimmer if only somatic mutations are
considered, from 2.9% to 24.7% in the full cohort (Figure S4).
1,346, or 33.1% Further Patients in the Cohort CouldBenefit from Drugs Currently Under Clinical TrialsWhen targeted therapies currently in clinical trials are also
considered, the number of tractable tumors increases to 2,981
(73.3%) in the core cohort (Figure 5). Among them, 1,672 sam-
ples were in silico-prescribed drugs in clinical trials directly
targeting their driver genes, 1,254 were found susceptible of
undergoing gene therapies correcting for the LoF of a tumor-
suppressor gene, and 1,284 could receive drugs in clinical trials
indirectly targeting their driver alterations (Figure 6B). The impor-
tance of these drugs will prove paramount to patients suffering
from certain malignancies, such as lower-grade glioma, ovarian
cancer, and head and neck carcinoma, 76.2%, 69.1%, and
55.6% of whom, respectively, could benefit from agents in
clinical trials, but not from any other FDA-approved targeted
therapies. These numbers are explained, to a great extent, by
TP53 (Figure 7B), which is the subject of a gene therapy that
can compensate for its LoF mutations (Buller et al., 2002) and
is altered in 1,250 samples. The overall fraction of patients
Cancer Cell 27, 382–396, March 9, 2015 ª2015 Elsevier Inc. 389
A
B
C
Figure 6. The Landscape of Targeted Therapies in Patients of the Core Cohort(A) Distribution of patients across tumor types of therapeutic interventions defined in the text (non-overlapping groups as in Figure 4C) in each cohort.
(B) Number of patients susceptible of each types of therapeutic intervention (overlapping as in Figure 4B) in each cohort. The bars at the right of the heatmap
illustrate the total number of patients benefiting from each type of therapeutic intervention.
(C) Distribution of patients susceptible to receive different number of targeted therapeutic agents—based on their altered drivers—in each cohort (left) and across
the entire core cohort (right). (Only drivers targeted by FDA-approved agents and agents in clinical trials were considered.)
See also Figure S4.
390 Cancer Cell 27, 382–396, March 9, 2015 ª2015 Elsevier Inc.
benefiting exclusively from drugs in clinical trials is almost
equally significant in the full cohort (34.9%; Figure S4).
Finally, 2,055 patients (50.5%) were assigned ligands currently
in preclinical stages, and only 58 of them are not prescribed
agents either in clinical guidelines or FDA approved; the tumors
of 68 and 34 other patients have protein-affecting mutations in
potentially druggable and biopharmable drivers, respectively
(Figure 5).
39% of Patients in the Core Cohort Could ReceiveCombination Therapies against Multiple DriversThe combination of multiple agents is probably the most intuitive
strategy to improve the efficacy of the response of tumors to
treatment and to diminish their resistance to targeted therapies
(Al-Lazikani et al., 2012; Pemovska et al., 2013). The in silico
drug prescription approach identified 1,598 patients (39%) of
the core cohort with sets of altered drivers susceptible of being
targeted with combinations of therapeutic agents both approved
and undergoing clinical trials (Figure 6C). In other words, an
important fraction of the patients in the cohort could benefit
from existing or future combination therapies, although the ma-
jority of patients could currently only be prescribed a single tar-
geted agent. The landscape of available combinatorial targeting
strategies varies per tumor type (Figure 6C).
Polypharmacology could, in principle, achieve results similar
to drug combinations targeting more than one altered driver in
a tumor, without the potential problem of added toxicity. Sixty-
nine samples presented mutations in at least two drivers that
could be simultaneously targeted by the same FDA-approved
drug. Up to 451 (11.1%) patients would benefit from polyphar-
macology if the drugs in clinical trials are also considered. These
results highlight the need to develop new and smarter anticancer
therapies and strategies to expand the landscape of combina-
tion therapies.
Priority Gene Candidates for Drug DevelopmentIn our thorough survey of anticancer drugs and their targets, we
detected 19 non-LoF drivers tightly bound by preclinical small
molecules that are not currently targeted by any drugs that are
FDA-approved or in clinical trials (Figure 4A). Moreover, 61 other
non-LoF drivers present properties that make them potentially
druggable or biopharmable (Figure 8). These 80 drivers are
altered in 2,730 (67.1%) samples across the cohort. Interest-
ingly, 63 of them are not well-established cancer genes
(not included in the CGC). Six of them, bearing protein affecting
mutations in 370 (5.4%) samples, are major drivers, a fact
that makes them particularly promising as targets for drug
development.
Focusing on the 25 drivers in this list that bear mutations in
more than5%of tumors of at least one cancer type,which consti-
tute probably the first-line candidates for drug development, we
discovered 13 cancer genes that are not well-established (Table
S5 describes ten of them in detail). At least four of these targets
(SVEP1, PCDH18, PTPRU, and FAT2) function in biological pro-
cesses related to cell adhesion and migration, while three others
(KALRN, MACF1, and TRIO) modulate changes in the organiza-
tion of the cytoskeleton. Another gene is involved in the regulation
of cell cycle, TAF1, encodes one component of the transcription
initiation complex and is thus essential for the progression of the
cell cycle into G1. The possibility to develop therapeutic agents
targeting cell-cycle mediators is exciting because most drivers
affecting this process lose their function upon mutation. Other
potentially actionable drivers encode a component of an ubiqui-
tin-mediated proteolysis complex (SPOP), a cell membrane re-
ceptor involved in exocytosis (LPHN2), and an enzyme involved
in the pyrimidine synthesis pathway (CAD) (Figure 8). CTNNB1
presents especially appealing opportunities to develop targeted
therapeutic agents to treat an important fraction of the patients
suffering from uterine carcinomas (28%) and medulloblastomas
(12%), two tumor typeswith fewcurrently available targeted ther-
apeutic options.
DISCUSSION
The availability of the sequences of thousands of tumor ge-
nomes has raised the hope of being able to identify all cancer
driver events that could be targeted by existing and novel
drugs. This in turn is expected to extend targeted therapies
from a few successful examples to a toolbox of tailored drugs
that deliver the promise of personalized cancer medicine. In
this work, exploiting one of the largest available tumor cohorts
(6,792 samples), we sought to (1) determine the scope of appli-
cability of anticancer targeted drugs and (2) propose ranked
lists of genes to develop novel targeted drugs. Pursuing these
goals, we devised an in silico prescription strategy of targeted
therapeutic agents. The strategy includes the development of
a Cancer Drivers Database, containing lists of genes driving
tumorigenesis upon mutations, CNAs, and fusion events in
different tumor types, and a Cancer Drivers Actionability Data-
base, containing a comprehensive set of current and prospec-
tive anticancer targeted agents and sets of rules to prescribe
them to patients. Connecting targeted agents to patients in
the cohort, we obtained its therapeutic landscape as a snap-
shot in time, which will be improved with more updated Cancer
Drivers Database and Cancer Drivers Actionability Database in-
formation as our knowledge of driver genes and anticancer
therapies progress.
The analysis presented here has some limitations inherent to
the nature of the data employed. The identification of driver
genes is limited by three main hurdles. First, our analysis does
not include non-coding mutations and methylation changes.
Second, lowly recurrent drivers not well represented in the
analyzed cohort remain unidentified. Third, the data employed
to detect drivers may contain intrinsic errors, like those intro-
duced by the calling of somatic mutations.
Also, we relied on proxy features to predict the mode of action
of each driver as Act or LoF, with remarkably good results for the
CGC; however, 83 drivers of our list could not be unambiguously
classified, especially those whose mutations pattern could
not be assessed due to their low frequency. Furthermore, we
acknowledge that some genes exhibit divergent modes of action
in different cancer types (Schroeder et al., 2014).
The analysis of clonal frequency is limited by the use of low-
coverage sequencing data. These were suitable, however, to
our endeavor of identifying major drivers, whose therapeutic
intervention could exhibit a larger benefit. The analysis is based
on detection of genes bearing mutations biased toward larger
clonal frequencies across each tumor cohort, which is less
Cancer Cell 27, 382–396, March 9, 2015 ª2015 Elsevier Inc. 391
A
B
(legend on next page)
392 Cancer Cell 27, 382–396, March 9, 2015 ª2015 Elsevier Inc.
ALL
AML
BLCA
BRCA
CLL
CM
COREAD
DLBCL
ESCA
GBM
HC
HNSC
LGG
LUAD
LUSC
MB
MM
NB
NSCLC
OV
PA
PAAD
PRAD
RCCC
SCLC
STAD
THCA
UCEC
MA
CF
1
FAT
2
SV
EP
1
RO
BO
2
PC
DH
18
GN
AS
CO
L1A
1
PT
CH
1
CU
X1
ZN
F63
8
LRP
6
EIF
4G3
PLX
NA
11
PLX
NB
2
FC
RL4
PO
M12
1
ZN
F81
4
DLG
1
NT
N4
GO
LGA
5
ND
RG
1
RH
OT
1
AC
SL3
CLC
C1
AC
AD
8
WN
T5A
CT
NN
B1
PC
SK
5
TR
IO
AC
AC
A
CA
D
AD
CY
1
NE
F2L
2
PT
PR
F
BLM
PT
PN
11
PC
SK
6
AH
R
XP
O1
LNP
EP
AD
AM
10
CA
RM
11
AT
F1
MG
MT
MD
M4
KA
LRN
F8
PLC
B1
LPH
N2
TAF
1
WN
K1
PT
PR
U
AR
FG
EF
2
PR
PF
8
US
P6
SO
S1
NU
P98
CN
TN
AP
1
STA
RD
13
DH
X9
AR
ID4A
PLC
G1
SP
OP
PP
P2R
1A
ITG
A9
NE
DD
4L
TN
PO
1
CA
T
RH
OA
PA
X5
RA
C1
TR
IP10
NR
2F2
NC
K1
CA
PN
7
PIK
3R3
PP
P2R
5C
AR
FG
AP
1
MA
T2A
PS
MA
6
Total mutatedsamples
Mutationfrequency
0
0.1
1
Potentially druggablePotentially biopharmablePre-clinical Ligand
CGC gene
Major driver gene
Potentially biopharmablePotentially druggablePre-clinical ligand
184 134 128 107 103 101 93 89 81 67 50 48 46 38 35 29 17 15 15 184 151 142 131 128 114 93 89 88 84 79 77 70 60 60 59 57 55 54 50 41 34 32 30 29 28 28 23 22 21 21 21 17 15 15 265 218 198 156 153 113 103 97 95 84 83 83 82 81 65 52 50 46 37 34 29 28 27 24 21 20
1 1 1 1 1
1 2 1 1
8 6 3 7 2 7 2 3 1 1 3 5 5 5 4 1 1 2 2 3 4 1 1 1 2 1
11 11 5 11 19 9 9 9 7 3 7 10 7 3 5 3 8 1 3 4 6 4 2 1 2 4 1 4 2 1 1 2 2
2 1 5 1 1 1 1 1 1 1 1
52 52 51 42 18 22 22 20 21 35 11 18 12 19 12 10 8 3 2 17 12 8 9 4 6 14 4 4 1 2 3 7 3 7 5
7 1 2 3 3 3 9 4 1 1 2 3 3 3 1 3 1 1 1 1 2 1
3 2 4 5 2 2 2 3 1 1 1 2 4 5 2 2 1 2 2 2 1
2 6 1 1 4 2 1 3 2 5 4 2 2 2 2 5 2 2 1 3 1 2 1 1
1 1 5 1 1 2 1 1 1 1 1 1 1
23 7 13 8 7 14 6 6 6 6 7 4 6 4 7 1 4 3 3 1 2 2 2 6 5 9 5 4 5 1 3 3 3 4 1
1 3 2 1 3 3 1 2 1 2 4 1 2 1
18 19 20 21 9 8 9 7 3 9 10 6 4 8 5 5 6 1 3 1 3 3 3 1 5 1 2 4 2 1 4 1 2
15 9 6 7 12 17 6 2 5 8 10 3 4 6 5 8 3 1 8 2 2 3 1 2 2 2 3 1
1 1 1 3 1 1 1 1 1
1 1 2 1 1 1 2
1 1 2 1 1 1 3 1 1 1 1 1 1
3 2 2 1 1 1 1
4 5 4 2 5 3 1 1 1 3 1 4 1 1 2 1 4 1 1 1 2 1 1 1 1 1
3 1 1 1 1 1 1 1 1 1
1 1 1 2 1 1 3 1 2 6 22 1 1 2 1 1 1 1 1
5 3 1 2 5 5 4 4 6 2 6 2 5 2 4 2 2 2 1 1 2 1 1 1 1 1 3
5 7 1 1 6 1 4 1 2 2 3 2 2 1 3 2 2 1 4 1 1
8 7 10 5 5 7 9 9 8 1 2 7 5 7 3 6 4 2 2 6 2 3 3 7 1 2 4 4 2
1 1 1 1 4 2 1 1 1 1 1 2 1 1 2 1 1
13 6 2 6 20 2 6 9 11 2 7 5 4 2 2 6 7 15 20 3 5 5 2 1 1 1 1 2 2 2 2 3
1 3 1 1 1
1 1 1 1
14 6 6 3 5 4 1 5 4 2 1 8 1 6 1 3 4 3 1 3 2 2 1 1
33 11 17 8 7 5 6 7 2 6 5 5 6 4 2 1 5 7 2 2 2 1 1 1 2 2
1 1 1 2 1 1 1 1 1 1 1
43 52 72 42 69 24 46 12 23 17 23 21 16 17 19 17 4 10 14 2 10 9 8 16 1 3
7 11 4 5 5 8 4 3 3 4 4 1 2 1 1 3 3 1 1
3 7 4 10 13 5 6 1 3 2 3 1 2 1 2 2 3 1 1 1 1 3
11 11 4 2 2 2 1 1 2 2 1 1 1 2 2 10 1 4 1 1 1 2
2 2 3 1 1 2 1 1 1 1
29 28 16 11 8 8 7 9 13 5 17 4 15 16 12 3 5 5 1 1 1 3 2
1 2 1 1 2 3 3 2 1
21 16 29 23 16 17 5 12 8 11 6 11 8 7 7 1 3 1 5 1 1 1 2 3 1
27 14 7 9 3 7 8 4 8 6 6 2 9 4 2 4 4 2 2 3 4 2 2 1
1 2 13 2 1 1
2 1 2 1 1 1 1 1 1
3 7 1 2 1 2 2
1 1 1 1
10 2 5 2 1 3 1 6 4 1 1 1 3 2 3 1 2 3 3 2 1
1 1 2 4 1 1 2 1
4 6 2 2 3 2 1 1 1 2 1 2 3 1 2 1 6 1 2
12 5 8 3 2 5 1 8 7 1 2 4 2 2 4 2 6 1 2 5 1 1 2 1
9 5 3 1 2 1 1 2 4 1 7 2 2 3 1 1 1 1
15 15 6 10 7 8 5 3 3 4 2 3 4 5 4 5 3 2 2 6 2 2
1 1 1 2 3 1 1 1 1 1 2 1 1
19 12 10 6 3 9 4 8 11 5 8 8 9 9 7 2 4 3 2 1 1 1 2
3
1 9
2 6 10 6 5 1 9 2 4 2 4 3 2 2 3 1 1
2 14 10 7 9 5 2 8 5 2 3 2 7 1 1 2 2 2 3
1 2 2 1 2 1 1 1 2 1 8 1 1
19 60 26 24 20 30 4 20 24 9 21 8 3 12 8 6 5 4 3
9 2 3 4 6 5 3 1 2 1 1 1 1
3 4 3 2 1 6 2 2 2 1 3 1 2 1
1 1 2 4 3 5 6 2 5 4 4 1 2 2 2 1 1
17 3 1 5 1 4 1 1
2 4 12 12 3 10 20 10 5 1 1 3 6 2 3 3 1 5 1
1 2 5 2 2 3 1 1 1 1
10 11 14 6 11 8 4 2 6 9 5 6 2 1 3 3
3 5 7 2 10 8 26 5 2 3 4 1 2 2 3 1 1
26 2 2 1
1 1 1 2 1 1 1
1 3 3 1 7 3
1 1 1 1 1
2 2 2 5 4 4 4 1 1 2 1 2
2
1 2 2 1
9 3 3 1 2 1 1 1 1
1 4 2 5 3 1 5 3 7 1 3 3 6 3 4 2
1 1 1 2 2 6 1 1 2 3 1 1
10 8 10 7 9 3 2 7 7 4 1 2 3 2 3 1 2 3
2 1 2 1 1 1
65 3 11 9 9 4 12 7 4 2 1 2 2 2 1 2 2
Figure 8. Mutation Frequency of Cancer Driver Genes that Are Bound by Pre-clinical Ligands or are Potentially Druggable or Biopharmable
See also Table S5.
sensitive to the accuracy of the data as compared to other ana-
lyses aimed to reconstruct the exact clonal architecture of each
tumor.
Finally, due to the inevitable incompleteness of drug-target
interaction databases, the results we present here are an under-
estimation of the actual number of targeted drivers (Mestres
et al., 2008).
Our in silico prescription approach, based solely on actionable
alterations in drivers, should be treated as an approach to esti-
mate the maximum applicability of direct targeted therapies in
the patients represented in our cohort. The rationale presents
five main limitations: (1) all coherent alterations in driver genes
(protein affecting mutations, amplifications, deletions, and
fusions) acting in the corresponding tumor type are considered
driver events, although some may well be passengers; (2)
intra-tumor heterogeneity in each individual sample is not taken
into account for the prescription of drugs due to limitations of low
coverage sequencing data, although it is conceivable that tar-
geting subclonal driver events may have lower efficacy or even
lead to paradoxical effects in some cases (McGranahan and
Swanton, 2015); (3) the Cancer Drivers Actionability Database
considers resistance biomarkers of targeted drugs, although
this is limited by the current landscape of already known bio-
markers of primary response; (4) although a set of known nega-
tive repurposing cases are eliminated by the in silico prescription
algorithm, these are limited to published results of clinical trials;
and (5) the approach does not take into account specificities
Figure 7. Examples of Targeted Drivers in the Core Cohort
(A) The top 20 genes ranking at the top of the number of patients tractable with FD
left illustrates the numbers of samples with alterations of each gene in each tum
genes.
(B) Same as (A) for genes ranking at the top of the number of patients tractable
targeted genes.
See also Figure S5.
pertaining to the mode of administration of each drug that may
limit their repurposing opportunities.
Despite the aforementioned limitations, the approach pre-
sented here constitutes the proof of principle of a straightforward
strategy to intelligently exploit cancer genome information to aid
toward precise cancer medicine (Garraway, 2013). This strategy,
which combines a Cancer Drivers Database and a Cancer
Drivers Actionability Database by means of an in silico prescrip-
tion method, presents two unique features.
First, our approach considers which alterations are drivers in
each cancer type. It therefore diverges from other studies that
start with a list of pre-defined cancer proteins and consider pre-
scribing all drugs targeting their observed alteration regardless
of their importance in the corresponding tumor type (Van Allen
et al., 2014). The intervention only on driver events avoids over-
prescription (Figure S4D). The information of all driver alterations
in the study cohort, provided via the IntOGen resource (Gonza-
lez-Perez et al., 2013a), can be exploited for other purposes by
the cancer research community, such as the support for inter-
pretation of newly sequenced tumor genomes or the design of
optimal sequencing panels.
Second, targeted therapies in the Cancer Drivers Actionability
Database comprise manually curated anticancer drugs currently
in use in the clinic or currently under clinical trials, as well as
automatically retrieved ligands that bind to cancer genes and
are candidates for prospective drug development. Another
feature of this database is the explicitly defined rules that guide
A-approved targeted agents across the entire core cohort. The heatmap at the
or type separately. See Figure S5A for the complete heatmap of the targeted
with agents in clinical trials. See Figure S5B for the complete heatmap of the
Cancer Cell 27, 382–396, March 9, 2015 ª2015 Elsevier Inc. 393
the different types of repurposing. In these two respects, it is
different from other existing resources that compile anticancer
drugs and their interactions with cancer proteins (Van Allen
et al., 2014; Halling-Brown et al., 2012).
The in silico drug prescription based on these two resources
was tested on one of the largest cohorts of tumor samples
currently collected for research. Themain result highlights the cur-
rent scope of targeted anticancer therapies and its prospects for
growth in view of the drugs that are currently in clinical trials or at
pre-clinical stages. The second important output of this work is a
ranked list of additional target opportunities for anticancer drug
development. Continuous update of drug-target interaction infor-
mation, as well as the application of the strategy to larger cohorts,
will improve the in silico prescription rules contained within the
Cancer Drivers and Cancer Drivers Actionability Databases, thus
enhancing its usefulness within personalized cancer medicine.
EXPERIMENTAL PROCEDURES
Somatic Alterations Data
We obtained the lists of somatic mutations detected across tumor samples of
48 individual cohorts from three main sources: TCGA pan-cancer (Weinstein
et al., 2013) (syn1710431 and syn300013), the ICGCData Coordination Centre
(Hudson et al., 2010), and supplementary material of other published large
cancer cohort publications. Details on all datasets are provided in Table S1,
part A. The datasets were filtered as previously published (Tamborero et al.,
2013a) to eliminate possible false-positive somatic variants. Gene expression
and copy-number data of the 16 TCGA cancer projects were obtained from the
Synapse platform (syn300013).
Identification of Mutational Cancer Drivers
We searched for three complementary signals of positive selection in the
mutational pattern of genes across the samples of each individual cohort
of tumors and across the TCGA pan-cancer cohort. To that effect, we em-
ployed three methods designed to detect (1) abnormal accumulation of
high-impact mutations across samples (OncodriveFM [Gonzalez-Perez and
Lopez-Bigas, 2012]), (2) abnormal clustering of mutations in regions of the
protein sequences (OncodriveCLUST [Tamborero et al., 2013b]), and (3)
the recurrence of mutations above the background mutation rate (MutSigCV
[Lawrence et al., 2013]). The lists of genes identified by these methods were
combined as described in Tamborero et al. (2013a) and the Supplemental
Experimental Procedures to obtain the final list of mutational drivers acting
in each tumor type.
Classification of Mutational Drivers According to their Mode of
Action
We used OncodriveROLE (Schroeder et al., 2014), a random forest classifier-
based tool trained on several parameters related with the alteration pattern of
genes across samples in the TCGA pan-cancer cohort, to predict whether
driver genes act as LoF or activating (GoF/SoF) drivers. The training, test,
and parameters employed in this classification are detailed in Schroeder
et al. (2014) and the Supplemental Experimental Procedures.
Identification of Major Mutational Drivers
We took the ratio between the number of reads supporting the mutant and the
reference alleles as an estimator of the variant allele frequency (VAF) of the mu-
tation.Geneswithin amplifiedchromosomal segmentswere excluded since it is
unknown to which extent they affect the mutant and wild-type allele counts.
Several measurements were applied to correct the VAF values to take into ac-
count heterozygousdeletions, copy-neutral loss-of-heterozygosity events, and
differences in tumor purity between samples.Wecomputed the observed trend
of each driver to bear mutations with higher VAFs. We as considered major
drivers of each tumor cohort those that exhibited a significant bias toward a
higher VAF compared to other genes (Supplemental Experimental Procedures).
394 Cancer Cell 27, 382–396, March 9, 2015 ª2015 Elsevier Inc.
Identification of CNA and Fusion Drivers
To identify genes putatively driving tumorigenesis in TCGA cohorts via CNAs,
we selected known cancer genes with targeted drugs found within recurrently
altered segments according to GISTIC (Mermel et al., 2011) and genes identi-
fied as mutational drivers by the aforementioned analysis per each cohort.
Only those LoF drivers exhibiting homozygous deletions and concomitant
downregulation, and non-LoF drivers with multi-copy amplifications and that
were upregulated, were included as putatively CNA drivers. LoF genes that ap-
peared to be heterozygously deleted and mutated in the same sample were
also included (Supplemental Experimental Procedures). In parallel, a list of
genes known to cause malignization via fusions was manually retrieved from
the literature. Then, we interrogated the TCGA Fusion Gene Data Portal (Yosh-
ihara et al., 2014) to obtain a list of the samples exhibiting in-frame fusions of
these genes with any partner in the core cohort (Supplemental Experimental
Procedures).
All driver genes in each tumor type of the cohort (Supplemental Experi-
mental Procedures), the evidence supporting their involvement in oncogenesis
and their roles in malignization—and consequently the type of their somatic
alterations relevant to treatment—and their relative importance within the
clones making the tumor were collected into the Cancer Drivers Database.
Curation of Drug-Target Interactions
We considered three different targeting strategies: gene therapies and direct
and indirect targeting. Moreover, we grouped therapeutic agents within the
categories in FDA approved, agents in cancer clinical trials, and agents in
cancer pre-clinical ligands.
Direct targeting interactions were mainly retrieved from ChEMBL (v18)
(Gaulton et al., 2012), a manually curated chemical database of bioactive
molecules. We took into consideration all interactions with affinity above
1 mM. For FDA-approved drugs, we distinguished three different types of tar-
gets: primary (as specified in FDA label, according ChEMBL), strong off target
(interacting more potently than the primary targets of the drug), and mild off
target (more potent than 1 mM but less than the primary target). We also
included some direct interactions from other sources (Supplemental Experi-
mental Procedures). Indirect targeting candidates were mainly retrieved
from the TARGET database (Van Allen et al., 2014). However, interactions
with molecules were obtained from literature, expert annotation, the drug’s
FDA label, or ClinicalTrials.gov (http://clinicaltrials.gov/). We also included in-
direct targeting of ligands by therapeutic agents targeting their receptors.
Gene therapies were included from records in The Journal of Gene Medicine
Clinical Trial site (Ginn et al., 2013), a comprehensive source of information
on worldwide gene therapy clinical trials.
Next we added information for the tumor type of clinical prescription (for
FDA-approved drugs, manually curated) and the tumor type of study (for
drugs in clinical trials). We incorporated rules that take into account specific
genomic dependencies for targeting, according clinical guidelines and/or ac-
cording clinical trials or literature, either in the targeted gene or others. Rules
were retrieved by compiling information from FDA labels, ClinicalTrials.gov,
and the literature. Additionally, we also considered biomarkers of primary
resistance from Gene Drug Knowledge Database (Dienstmann et al., 2014),
either in the FDA label or in NCCN guidelines, clinical trials, or case reports.
Moreover, we defined a set of rules to broaden the scope of application of
FDA-approved drugs as different types of repurposing (see the Results). We
manually searched the literature for negative and positive results of clinical
trials essaying the repurposing of opportunities in tier 1 (Table S4).
We also labeled genes as potentially druggable if they were likely to bind
small molecules according to the structure of their protein products and as
potentially biopharmable if they could be targeted with antibodies or protein
therapy (Supplemental Experimental Procedures). All of the information on
the three types of targeting strategies considered, the level of approval of
each agent, their clinical guidelines, and the possibilities to broaden their
scope of application were incorporated into the Cancer Drivers Actionability
Database.
Connecting Therapeutic Agents to Patients
Patients in the cohort were connected to all targeted therapies susceptible of
treating them. To do this, first we enumerated all relevant alterations—in a
sample of the core cohort or mutations in a sample of the full cohort—affecting
drivers of the tumor type of the sample, based on the information in the Cancer
Drivers Database. Then, all targeted therapies suitable to treat each driver
alteration in the sample were automatically retrieved from the Cancer Drivers
Actionability Database, respecting the specific requirements defined for
each therapeutic agent (of genomic alterations, tumor type prescription, and
drug primary resistances). In the absence of such specific information, we con-
nected the targeted drug to any relevant alteration of the target driver in the
sample in question.
SUPPLEMENTAL INFORMATION
Supplemental Information includes Supplemental Experimental Procedures,
five figures, and five tables and can be found with this article online at http://
dx.doi.org/10.1016/j.ccell.2015.02.007.
AUTHOR CONTRIBUTIONS
D.T. performed the driver detection analysis and identification of major drivers
and cowrote themanuscript. C.R.-P. collectedmost of the information of drug-
target interactions, performed the in silico drug prescription, and cowrote the
manuscript. A.A. and J.M. participated in the collection of drug-target interac-
tions. D.T., C.R.-P., and A.G.-P. did manual curation of drug-target interac-
tions and drug mechanisms. C.P.-L. and J.D.-P. contributed on the driver
detection analysis. M.P.-S. characterized driver genes according to their
role and performed the mutually exclusive analysis. A.G.-P. performed the
pathway analyses. J.D.-P. and M.P.-S. prepared the IntOGen web page to
present the results. A.G.-P. and N.L.-B. oversaw and designed the study
and cowrote the manuscript.
ACKNOWLEDGMENTS
We thank Elias Campo and Caroline Heckman for helpful comments on the
manuscript and Cyriac Kandoth for help in obtaining mutation data from
TCGA. We are grateful to the TCGA Research Network, the International Can-
cer Genome Consortium (ICGC), and members of each individual project for
the tumor genome resequencing data generated. We acknowledge funding
from the Spanish Ministry of Economy and Competitiveness (grant number
SAF2012-36199), La Fundacio la Marato de TV3, and the Spanish National
Institute of Bioinformatics (INB). C.R.-P. and M.P.S. are supported by an FPI
fellowship. D.T. is supported by the People Programme (Marie Curie Actions)
of the Seventh Framework Programme of the European Union (FP7/2007-
2013) under REA grant agreement number 600388 and by the Agency of
Competitiveness for Companies of the Government of Catalonia, ACCIO.
A.G.-P. is supported by a Ramon y Cajal contract.
Received: May 23, 2014
Revised: October 21, 2014
Accepted: February 17, 2015
Published: March 9, 2015
REFERENCES
Al-Lazikani, B., Banerji, U., and Workman, P. (2012). Combinatorial drug
therapy for cancer in the post-genomic era. Nat. Biotechnol. 30, 679–692.
Brooks, A.N., Choi, P.S., de Waal, L., Sharifnia, T., Imielinski, M., Saksena, G.,
Pedamallu, C.S., Sivachenko, A., Rosenberg,M., Chmielecki, J., et al. (2014). A
pan-cancer analysis of transcriptome changes associated with somatic muta-
tions inU2AF1 reveals commonly altered splicing events. PLoSONE9, e87361.
Buller, R.E., Runnebaum, I.B., Karlan, B.Y., Horowitz, J.A., Shahin, M.,
Buekers, T., Petrauskas, S., Kreienberg, R., Slamon, D., and Pegram, M.
(2002). A phase I/II trial of rAd/p53 (SCH 58500) gene replacement in recurrent
ovarian cancer. Cancer Gene Ther. 9, 553–566.
Davoli, T., Xu, A.W., Mengwasser, K.E., Sack, L.M., Yoon, J.C., Park, P.J., and
Elledge, S.J. (2013). Cumulative haploinsufficiency and triplosensitivity drive
aneuploidy patterns and shape the cancer genome. Cell 155, 948–962.
Dienstmann, R., Dong, F., Borger, D., Dias-Santagata, D., Ellisen, L.W., Le,
L.P., and Iafrate, A.J. (2014). Standardized decision support in next generation
sequencing reports of somatic cancer variants. Mol. Oncol. 8, 859–873.
Ferreira, P.G., Jares, P., Rico, D., Gomez, G., Martınez-Trillos, A., Villamor, N.,
Gonzalez-Perez, A., Knowles, D., Monlong, J., Johnson, R., et al. (2013).
Transcriptome characterization by RNA sequencing identifies a major
molecular and clinical subdivision in chronic lymphocytic leukemia. Genome
Res. 24, 212–226.
Fong, P.C., Boss, D.S., Yap, T.A., Tutt, A., Wu, P., Mergui-Roelvink, M.,
Mortimer, P., Swaisland, H., Lau, A., O’Connor, M.J., et al. (2009). Inhibition
of poly(ADP-ribose) polymerase in tumors from BRCA mutation carriers.
N. Engl. J. Med. 361, 123–134.
Futreal, P.A., Coin, L., Marshall, M., Down, T., Hubbard, T., Wooster, R.,
Rahman, N., and Stratton, M.R. (2004). A census of human cancer genes.
Nat. Rev. Cancer 4, 177–183.
Garraway, L.A. (2013). Genomics-driven oncology: framework for an emerging
paradigm. J. Clin. Oncol. 31, 1806–1814.
Garraway, L.A., and Lander, E.S. (2013). Lessons from the cancer genome.
Cell 153, 17–37.
Gaulton, A., Bellis, L.J., Bento, A.P., Chambers, J., Davies, M., Hersey, A.,
Light, Y., McGlinchey, S., Michalovich, D., Al-Lazikani, B., and Overington,
J.P. (2012). ChEMBL: a large-scale bioactivity database for drug discovery.
Nucleic Acids Res. 40, D1100–D1107.
Ginn, S.L., Alexander, I.E., Edelstein, M.L., Abedi, M.R., and Wixon, J. (2013).
Gene therapy clinical trials worldwide to 2012 - an update. J. Gene Med. 15,
65–77.
Gonzalez-Perez, A., and Lopez-Bigas, N. (2012). Functional impact bias
reveals cancer drivers. Nucleic Acids Res. 40, e169.
Gonzalez-Perez, A., Perez-Llamas, C., Deu-Pons, J., Tamborero, D.,
Schroeder, M.P., Jene-Sanz, A., Santos, A., and Lopez-Bigas, N. (2013a).
IntOGen-mutations identifies cancer drivers across tumor types. Nat.
Methods 10, 1081–1082.
Gonzalez-Perez, A., Mustonen, V., Reva, B., Ritchie, G.R.S., Creixell, P.,
Karchin, R., Vazquez, M., Fink, J.L., Kassahn, K.S., Pearson, J.V., et al.;
International Cancer Genome Consortium Mutation Pathways and
Consequences Subgroup of the Bioinformatics Analyses Working Group
(2013b). Computational approaches to identify functional genetic variants in
cancer genomes. Nat. Methods 10, 723–729.
Gupta, S.C., Sung, B., Prasad, S., Webb, L.J., and Aggarwal, B.B. (2013).
Cancer drug discovery by repurposing: teaching new tricks to old dogs.
Trends Pharmacol. Sci. 34, 508–517.
Halling-Brown, M.D., Bulusu, K.C., Patel, M., Tym, J.E., and Al-Lazikani, B.
(2012). canSAR: an integrated cancer public translational research and drug
discovery resource. Nucleic Acids Res. 40, D947–D956.
Hanahan, D., and Weinberg, R.A. (2011). Hallmarks of cancer: the next gener-
ation. Cell 144, 646–674.
Hopkins, A.L. (2008). Network pharmacology: the next paradigm in drug dis-
covery. Nat. Chem. Biol. 4, 682–690.
Hudson, T.J., Anderson, W., Artez, A., Barker, A.D., Bell, C., Bernabe, R.R.,
Bhan, M.K., Calvo, F., Eerola, I., Gerhard, D.S., et al.; International Cancer
Genome Consortium (2010). International network of cancer genome projects.
Nature 464, 993–998.
Kandoth, C., McLellan, M.D., Vandin, F., Ye, K., Niu, B., Lu, C., Xie, M., Zhang,
Q., McMichael, J.F., Wyczalkowski, M.A., et al. (2013). Mutational landscape
and significance across 12 major cancer types. Nature 502, 333–339.
Khoo, K.H., Verma, C.S., and Lane, D.P. (2014). Drugging the p53 pathway:
understanding the route to clinical efficacy. Nat. Rev. Drug Discov. 13,
217–236.
Lawrence, M.S., Stojanov, P., Polak, P., Kryukov, G.V., Cibulskis, K.,
Sivachenko, A., Carter, S.L., Stewart, C., Mermel, C.H., Roberts, S.A., et al.
(2013). Mutational heterogeneity in cancer and the search for new cancer-
associated genes. Nature 499, 214–218.
Lawrence, M.S., Stojanov, P., Mermel, C.H., Robinson, J.T., Garraway, L.A.,
Golub, T.R., Meyerson, M., Gabriel, S.B., Lander, E.S., and Getz, G. (2014).
Cancer Cell 27, 382–396, March 9, 2015 ª2015 Elsevier Inc. 395
Discovery and saturation analysis of cancer genes across 21 tumour types.
Nature 505, 495–501.
Malhotra, D., Portales-Casamar, E., Singh, A., Srivastava, S., Arenillas, D.,
Happel, C., Shyr, C., Wakabayashi, N., Kensler, T.W., Wasserman, W.W.,
and Biswal, S. (2010). Global mapping of binding sites for Nrf2 identifies novel
targets in cell survival response through ChIP-Seq profiling and network anal-
ysis. Nucleic Acids Res. 38, 5718–5734.
McGranahan, N., and Swanton, C. (2015). Biological and Therapeutic Impact
of Intratumor Heterogeneity in Cancer Evolution. Cancer Cell 27, 15–26.
Mermel, C.H., Schumacher, S.E., Hill, B., Meyerson, M.L., Beroukhim, R., and
Getz, G. (2011). GISTIC2.0 facilitates sensitive and confident localization of the
targets of focal somatic copy-number alteration in human cancers. Genome
Biol. 12, R41.
Mestres, J., Gregori-Puigjane, E., Valverde, S., and Sole, R.V. (2008). Data
completeness—the Achilles heel of drug-target networks. Nat. Biotechnol.
26, 983–984.
Oza, A.M., Elit, L., Tsao, M.-S., Kamel-Reid, S., Biagi, J., Provencher, D.M.,
Gotlieb, W.H., Hoskins, P.J., Ghatage, P., Tonkin, K.S., et al. (2011). Phase II
study of temsirolimus in women with recurrent or metastatic endometrial
cancer: a trial of the NCIC Clinical Trials Group. J. Clin. Oncol. 29, 3278–
3285.
Pemovska, T., Kontro, M., Yadav, B., Edgren, H., Eldfors, S., Szwajda, A.,
Almusa, H., Bespalov, M.M., Ellonen, P., Elonen, E., et al. (2013).
Individualized systems medicine strategy to tailor treatments for patients
with chemorefractory acute myeloid leukemia. Cancer Discov 3, 1416–
1429.
Przychodzen, B., Jerez, A., Guinta, K., Sekeres, M.A., Padgett, R.,
Maciejewski, J.P., and Makishima, H. (2013). Patterns of missplicing due to
somatic U2AF1 mutations in myeloid neoplasms. Blood 122, 999–1006.
396 Cancer Cell 27, 382–396, March 9, 2015 ª2015 Elsevier Inc.
Romero, O.A., and Sanchez-Cespedes, M. (2014). The SWI/SNF genetic
blockade: effects in cell differentiation, cancer and developmental diseases.
Oncogene 33, 2681–2689.
Schroeder, M.P., Rubio-Perez, C., Tamborero, D., Gonzalez-Perez, A., and
Lopez-Bigas, N. (2014). OncodriveROLE classifies cancer driver genes in
loss of function and activating mode of action. Bioinformatics 30, i549–i555.
Stratton, M.R. (2011). Exploring the genomes of cancer cells: progress and
promise. Science 331, 1553–1558.
Tamborero, D., Gonzalez-Perez, A., Perez-Llamas, C., Deu-Pons, J., Kandoth,
C., Reimand, J., Lawrence, M.S., Getz, G., Bader, G.D., Ding, L., and Lopez-
Bigas, N. (2013a). Comprehensive identification of mutational cancer driver
genes across 12 tumor types. Sci. Rep. 3, 2650.
Tamborero, D., Gonzalez-Perez, A., and Lopez-Bigas, N. (2013b).
OncodriveCLUST: exploiting the positional clustering of somatic mutations
to identify cancer genes. Bioinformatics 29, 2238–2244.
Van Allen, E.M., Wagle, N., Stojanov, P., Perrin, D.L., Cibulskis, K., Marlow, S.,
Jane-Valbuena, J., Friedrich, D.C., Kryukov, G., Carter, S.L., et al. (2014).
Whole-exome sequencing and clinical interpretation of formalin-fixed,
paraffin-embedded tumor samples to guide precision cancer medicine. Nat.
Med. 20, 682–688.
Vogelstein, B., Papadopoulos,N., Velculescu, V.E., Zhou,S., Diaz, L.A., Jr., and
Kinzler, K.W. (2013). Cancer genome landscapes. Science 339, 1546–1558.
Weinstein, J.N., Collisson, E.A., Mills, G.B., Shaw, K.R., Ozenberger, B.A.,
Ellrott, K., Shmulevich, I., Sander, C., Stuart, J.M., Butterfield, Y.S.N., et al.;
Cancer Genome Atlas Research Network (2013). The Cancer Genome Atlas
Pan-Cancer analysis project. Nat. Genet. 45, 1113–1120.
Yoshihara, K., Wang, Q., Torres-Garcia, W., Zheng, S., Vegesna, R., Kim, H.,
and Verhaak, R.G.W. (2014). The landscape and therapeutic relevance of can-
cer-associated transcript fusions. Oncogene. Published online December 15,
2014. http://dx.doi.org/10.1038/onc.2014.406.