ORIGINAL RESEARCH
Pharmacophore-based in silico high-throughput screening to identify novel topoisomerase-I inhibitors
Supriya Singh • Sucheta Das • Anubhuti Pandey •
Swapnil Sharma • Sarvesh Paliwal
Received: 22 June 2012 / Accepted: 31 January 2013
© Springer Science+Business Media New York 2013
Abstract Topoisomerase-I (TOP-I) has emerged as a
potential target for the design and development of anti-
cancer compounds. TOP-I inhibitors have shown promise
in the treatment of various cancers including renal cell
cancer, whose exact cause is yet to be known. Recent
studies indicate that indenoisoquinolines can provide
greater stability to drug-topoisomerase-DNA cleavage
complexes, which makes them a more appropriate anti-
cancer class of compounds compared to camptothecin. In
view of such significance, a three-dimensional pharmaco-
phore model has been developed using a training set of 36
indenoisoquinoline-based topoisomerase inhibitors. The
validated best model consists of three chemical features:
one hydrophobic, one positive ionizable, and one ring
aromatic, with good correlation values of r2(training) = 0.827
and r2(test) = 0.702. Furthermore, validation at the 98 % level by the
CatScramble method and a good r2 of 0.703 for 22 external
test set compounds support the broad applicability
of the generated model. The validated three-feature
pharmacophore model has been used to screen the chemical
database from the National Cancer Institute (NCI) leading
to the identification of 17 druggable TOP-I inhibitors
which can be raised into drug candidates after further
evaluation.
Keywords TOP-I · Cancer · Pharmacophore · NCI
Introduction
Cancer is a major health problem and one of the leading
causes of death worldwide. About 13 % of all human
deaths worldwide are caused by cancer. In India, about
0.9 million new cancer cases are detected every year. Renal
cancer is the third most common urologic malignancy
(Rekha et al., 2008) and the seventh most common cancer
overall (Thakur and Jain, 2011). It is considered a silent
cancer because it does not show any symptoms until it spreads
beyond the kidneys. Renal cell carcinoma accounts for
2–3 % of all cancers (Canamares et al., 2012) with the
highest incidence occurring in western countries. During
the last two decades, there has been an annual increase of
about 2 % in the occurrence of renal cancer both worldwide
and in Europe (Lindblad, 2004). In the last 30 years,
only a few drugs have shown some activity against advanced
renal cancer (Scherr et al., 2011).
DNA topoisomerase-I (TOP-I) has emerged as a popular
target for cancer treatment. Topoisomerases are universal
enzymes involved in diverse cellular processes, such as
replication, recombination, transcription, and repair (Wang,
1996, 2002; Fortune and Osheroff, 2000; Champoux, 2001;
Wilstermann and Osheroff, 2003). Camptothecin, isolated from the Chinese
tree Camptotheca acuminata, was the first agent identified as a TOP-I
inhibitor. However, its development was discontinued
in the 1970s because of severe side effects and a lack of
understanding of the drug's mechanism of action. Although the
camptothecin derivatives currently in the clinic possess
potent antitumor activity, they have a major limitation:
they are inactivated within minutes at physiological pH by
opening of the lactone E ring. In view of this, a variety of
heterocyclic aromatic and intercalating non-camptothecin
TOP-I inhibitors have been evaluated in clinical trials
(Jaxel et al., 1989).
S. Singh · S. Das · A. Pandey · S. Sharma · S. Paliwal (✉)
Department of Pharmacy, Banasthali University,
Banasthali 304022, Rajasthan, India
e-mail: [email protected]
Med Chem Res
DOI 10.1007/s00044-013-0526-3
MEDICINAL CHEMISTRY RESEARCH
Indenoisoquinolines have shown several advantages
over the camptothecin derivatives. In addition to possess-
ing high antiproliferative activity, indenoisoquinoline-
based compounds are chemically more stable. They also
have greater stability in their drug–enzyme–DNA cleavage
complexes (Kohlhagen et al., 1998; Yoshinari et al., 1999).
The present study has been conducted with the aim to
explore the structural requirements for potent TOP-I
inhibitors and to identify novel lead compounds with
potential as anticancer agents. A pharmacophore model has
been constructed employing renal cancer activity of
indenoisoquinoline series of compounds. The validated and
predictive pharmacophore model has been used to mine the
National Cancer Institute (NCI) chemical compound data-
base for identification of structurally diverse novel TOP-I
inhibitors.
Methods
Data compilation
The biological activity data, with sufficient structural
diversity and spanning over four orders of magnitude, reported as
GI50 (µM), have been obtained from the literature (Nagarajan
et al., 2004, 2006; Morrell et al., 2007a, b). The chemical
structures of all the inhibitors are given in Table 1. All
compounds have been built using ISIS Draw 2.5, imported
into Accelrys Discovery Studio 2.0 (DS 2.0), and energy
minimized to the closest local minimum using the generalized
CHARMM-like force field as implemented in the
software program.
Conformational analysis
The single conformer 3D structures have been used as the
starting point for conformational analysis. The conforma-
tional space of each inhibitor has been extensively sampled
using the poling algorithm. Catalyst provides three types of
conformational analysis: Fast, Best, and CAESAR. Both Best
and Fast use a version of the CHARMM force field for
energy calculations and a poling mechanism for forcing the
search into unexplored regions of conformer space. Fast
generation takes less time, but Best generation provides
more complete coverage of conformational space by
optimizing the conformations in both torsional and Cartesian
space. Moreover, Best searches the conformational space
more extensively than Fast, particularly ring conformations,
and it applies more stringent minimization procedures.
CAESAR is 5–20 times faster and slightly better at
reproducing the ligand conformations than Catalyst Fast.
Fast and CAESAR perform better in high-throughput
screening, while the Best method is recommended for
generating conformers that would be used as input for
automated hypothesis generation (Watts et al., 2010).
In the present study, diverse conformations of the
compounds have been generated using the Best conformational
approach, specifying 255 as the maximum number of
conformers and an energy threshold of 20 kcal/mol
above the estimated global minimum, as computed with
the CHARMM force field.
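The two limits quoted above (at most 255 conformers, all within 20 kcal/mol of the lowest-energy conformer) can be illustrated with a minimal sketch. It is not Catalyst's poling algorithm, which also selects for conformational diversity; the `Conformer` class and `select_conformers` function are hypothetical names introduced only for illustration, and the energies are assumed to be precomputed.

```python
from dataclasses import dataclass


@dataclass
class Conformer:
    conf_id: int
    energy: float  # kcal/mol, assumed to come from a force-field calculation


def select_conformers(conformers, energy_window=20.0, max_conformers=255):
    """Keep conformers within `energy_window` of the lowest energy, capped at `max_conformers`.

    Illustration only: real conformer generators also enforce diversity.
    """
    if not conformers:
        return []
    e_min = min(c.energy for c in conformers)
    # Apply the energy threshold relative to the estimated global minimum.
    kept = [c for c in conformers if c.energy - e_min <= energy_window]
    # Lowest-energy conformers first, then apply the cap.
    kept.sort(key=lambda c: c.energy)
    return kept[:max_conformers]


if __name__ == "__main__":
    demo = [Conformer(1, 0.0), Conformer(2, 5.0), Conformer(3, 25.0)]
    print([c.conf_id for c in select_conformers(demo)])
```

In this toy input the third conformer lies 25 kcal/mol above the minimum and is therefore discarded.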
Training and test set selection criteria
The most critical aspect in the generation of a pharmacophore
hypothesis is the selection of the training set. As a
minimum requirement, the training set should include at least
16 compounds to assure statistical significance and to avoid
any chance correlation in the pharmacophore model. A
training set consisting of 36 structurally diverse
indenoisoquinoline-based TOP-I inhibitors has been carefully
selected, with biological activities spanning over four orders of
magnitude. The remaining compounds, which also show
structural diversity and variation in biological activity, have been
used as a test set to validate the developed pharmacophore
model.
Pharmacophore generation methodology
A pharmacophore is described by a set of functional fea-
tures such as hydrophobic (HY), hydrogen bond donor
(HBD), hydrogen bond acceptor (HBA), hydrogen bond
acceptor lipid (HBA_L), and positively and negatively
ionizable sites distributed over a 3D space. The hydrogen-
bonding features are vectors, whereas all other functions
are points. The feature mapping protocol in Catalyst gen-
erates all possible pharmacophore features including HBA,
HBA_L, HBD, HY, HY (aliphatic), HY (aromatic), nega-
tive charge, negative ionizable (NI), positive charge,
positive ionizable (PI), and ring aromatic (RA).
The HypoGen module of Catalyst was used to generate
pharmacophore models. Pharmacophore generation was
carried out by setting function weight to 0.302, mapping
coefficient to 0, and resolution to 297 pm. The uncertainty
value was set to 3, which represents the ratio range of
uncertainty in the activity value based on the expected
statistical straggling of biological data collection. Six
compounds (1_43, 1_95, 2_13, 2_24, 3_37, and 4_13) detected
as outliers in the training and test sets were
excluded from the dataset. The implemented protocol
returned the top ten hypotheses, which were further analyzed for
their statistical significance on the basis of cost function
analysis, correlation coefficient, root mean-square deviation
(RMSD), CatScramble randomization, and internal and external test set
prediction. Out of the ten generated hypotheses, the best one was
chosen on the basis of statistical fitness.
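To make the meaning of the uncertainty value concrete: an uncertainty of 3 is usually read as saying that the true activity lies between one third of and three times the measured value. The sketch below applies that reading to one GI50 value from Table 1; the function name `activity_bounds` is a hypothetical helper, not part of Catalyst.

```python
def activity_bounds(activity, uncertainty=3.0):
    """Return the (low, high) range implied by a multiplicative uncertainty factor."""
    if activity <= 0:
        raise ValueError("activity must be positive")
    if uncertainty < 1:
        raise ValueError("uncertainty must be at least 1")
    return activity / uncertainty, activity * uncertainty


if __name__ == "__main__":
    # Compound 1_67 has a reported GI50 of 0.009 µM (Table 1).
    low, high = activity_bounds(0.009)
    print(f"assumed range: {low:.4f} to {high:.4f} µM")
```

Under this assumption, a predicted activity inside the range would not be counted as an error by the cost function.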
Table 1 Chemical structures of the indenoisoquinoline derivatives as TOP-I inhibitors
[Structure drawing not reproduced: indenoisoquinoline scaffold bearing substituents R1–R4]
Name of compound  R1  R2  R3  R4  Activity GI50 (µM)
1_35 –H –NO2 Br O 11.7
1_37 –H –NO2 Br –CH3 46.8
1_38 –H –NO2 Br S 0.891
1_40 –H –NO2 Cl –H 26.9
1_41 –H –NO2 Cl –F 27.5
1_42 –H –NO2 Cl –Cl 2.51
1_43 –H –NO2 Cl –Br 0.019
1_46 –H –NO2 Cl N 5.89
1_49 –H –NO2 BrS
O O 6.92
1_50 –H –NO2 N3 O 5.25
1_52 –H –NO2 N3 –CH3 58.9
1_53 –H –NO2 N3 S 1.05
1_55 –H –NO2 N3 –H 72.4
1_56 –H –NO2 N3 –F 0.302
1_59 –H –NO2 N3 –I 3.89
1_61 –H –NO2 N3 N 46.8
1_62 –H –NO2 NH2 O 0.055
1_63 –H –NO2 NH2 –C2H5 1.48
1_64 –H –NO2 NH2 –CH3 0.229
1_65 –H –NO2 NH2 –SMe 0.162
1_66 –H –NO2 NH2 0.437
1_67 –H –NO2 NH2 –H 0.009
1_68 –H –NO2 NH2 –F 0.034
1_86 –OCH3 –OCH3 BrNH
O
13.8
1_87 –OCH3 –OCH3 Cl O 5.5
1_90 –OCH3 –OCH3 Br –F 15.5
1_92 –OCH3 –OCH3 Cl
N 22.9
1_94 –OCH3 –OCH3 N3 N
H
O
50.1
1_95 –OCH3 –OCH3 N3
–NH2 0.148
1_97 –OCH3 –OCH3 N3
–OCH3 64.6
1_99 –OCH3 –OCH3 N3
–H 13.8
1_100 –OCH3 –OCH3 N3
–F 8.32
1_102 –OCH3 –OCH3 N3
N 15.5
1_104 –OCH3 –OCH3 NH2
–NH2 2.57
1_105 –OCH3 –OCH3 NH2
–N(CH3)2 17
1_106 –OCH3 –OCH3 NH2
–OCH3 0.141
1_107 –OCH3 –OCH3 NH2
–C2H5 1.74
1_108 –OCH3 –OCH3 NH2
–H 0.794
1_109 –OCH3 –OCH3 NH2
–F 4.36
1_110 –OCH3 –OCH3 NH2
O
O
15.1
1_111 –OCH3 –OCH3 NH2
N 16.6
1_112 –OCH3 –OCH3 NH2
–NO2 11.7
2_6 –H –H Br
–H 7
2_7 –H –H NH2
–H 0.16
2_8 –H –H
N
–H 0.91
2_9 –H –H
N
N
–H 1.66
2_10 –H –NO2 N3
–H 72.4
2_11 –H –NO2 NH2
–H 1.102
2_12 –H –NO2
N
–H 4.17
2_13 –H –NO2
N
N
–H 0.015
2_24 –H –NO2 I
–OCH3 0.309
2_27 –H –H
NO
–H 21.4
2_28 –H –H N3
–H 25.1
2_29 –H –H
NO
–NO2 0.309
2_31 –H –H
NO
–OCH3 4.07
2_32 –H –NO2 HN
OH
–H 0.229
2_33 –H –NO2 HN
OH –OCH3 0.012
2_34 –H –H HN
OH –OCH3 0.158
2_35 –H –H HN
OH –H 0.269
2_37 –H –H
N
–OCH3 0.245
2_39 –H –H N
N
–OCH3 0.676
3_20 –H –H HN
N
O
O
–H 1.55
3_21 –H –H N
N
O
O
–H 0.589
3_32 –H –H
NN
N
O
NHO
HNO
O
O
O
–H 39.8
3_34 –H –H
N
O
NHO
N
N O
OO
HN
O
–H 36.3
3_37 –H –H
N
HN
O
O
–H 11
3_38 –H –H
NNH
O
O
–H 0.132
3_39 –H –H
N
O
O
HN
NH
–H 0.178
3_42 –H –HN N
N
O
O
–H 15.5
3_45 –H –HN
O
O
HN
NH
HN
–H 0.017
3_49 –OCH3 –OCH3
N
O
O
HN
HN
–H 0.028
3_50 –OCH3 –OCH3
N
O
O
NH
NH
–H 0.24
3_51 –H –NO2
N
O
O
HN
HN
–H 0.123
3_52 –H –NO2
NNH
NH
O
O
–H 1.17
3_53 –OCH3 –OCH3 NH2NH
NH O O
33
4_5 –OCH3 –OCH3
NOH
H HO O
4_6 –H –H NH
OH–H 0.95
4_9 –H –H NH3 –H 0.49
4_10 –H –HN O
–H 93.3
4_11 –H –HN NH
–H 2.19
4_13 –H –H NH3 –H 0.16
4_15 –H –H HN NH2
NH2
–H 0.23
4_16 –H –HN
–H 0.91
4_17 –H –H
N
N
H –H 1.66
4_20 –H –HN
H2N
N
H
–H 2
4_22 –H –H NH3 –H 0.04
4_33 –H –HBr O O
0.23
4_35 –H –H NH3 O O0.19
4_36 –H –HCl
–H 4.39
Virtual screening
The best 3D pharmacophore was used as a search query to
screen the chemical database from the NCI to retrieve new
chemical entities as potent TOP-I inhibitors. Hits obtained were
filtered using Lipinski's rule of five (Lipinski et al., 1997). This
led to the retrieval of compounds having no violation of
Lipinski's rule of five and good estimated activity (less than 1 µM).
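The rule-of-five filter described above can be sketched as follows, assuming the four descriptors have already been computed for each hit. The `Descriptors` class, the helper names, and the two example compounds are hypothetical and do not correspond to actual NCI entries.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    name: str
    mol_weight: float       # molecular weight in Da
    logp: float             # calculated octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(d):
    """Count how many of the four rule-of-five criteria a compound violates."""
    checks = [
        d.mol_weight > 500,
        d.logp > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ]
    return sum(checks)


def passes_rule_of_five(d, max_violations=0):
    """The text keeps only hits with no violation, hence max_violations=0."""
    return lipinski_violations(d) <= max_violations


if __name__ == "__main__":
    hits = [
        Descriptors("hypothetical_hit_A", 342.4, 2.8, 2, 5),
        Descriptors("hypothetical_hit_B", 612.7, 6.1, 3, 11),
    ]
    print([h.name for h in hits if passes_rule_of_five(h)])
```

Setting `max_violations=1` would give the more permissive variant of the rule that is sometimes used, but the text specifies zero violations.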
Table 1 continued
4_37 –H –HOH
–H 16.4
4_38 –H –HOH
–H 6.49
4_39 –OCH3 –OCH3NH3 O O
0.36
4_40 –H –H NH3–OCH3 0.43
4_41 –OCH3 –OCH3 NH3 O O0.31
4_42 –OCH3 –OCH3
N O O0.9
4_43 –OCH3 –OCH3
O O3.6
Table 2 The cost values, correlation coefficients (r), RMSD, and features for the top ten hypotheses (Hypo1–Hypo10)
Hypothesis  Total cost  Cost difference (null cost − total cost)  RMSD  Correlation  Features
1 156.29 68.109 1.096 0.909 HY, PI, RA
2 171.252 53.147 1.472 0.789 HY, PI, RA
3 174.449 49.95 1.531 0.770 HY, PI, RA
4 177.974 46.425 1.604 0.743 HY, PI, RA
5 179.277 45.122 1.619 0.738 HY, PI, RA
6 184.184 40.215 1.688 0.713 HY, PI, RA
7 187.156 37.243 1.755 0.681 HY, PI, RA
8 193.013 31.386 1.837 0.643 HY, PI, RA
9 194.002 30.397 1.850 0.636 HY, PI, RA
10 194.424 29.975 1.863 0.629 HY, PI, RA
Results and discussion
Construction of pharmacophore model
HypoGen pharmacophore models were generated using 36
training set compounds with antiproliferative activity
against the human renal cancer cell line. Hypotheses were
generated using structural information, conformational
models, and chemical features. The first hypothesis
(Hypo1) was considered the best pharmacophore model
on the basis of its high correlation coefficient of 0.909 and
high cost difference of 68.109. Hypo1 consisted of
three features: one HY, one PI, and one RA.
Assessment of pharmacophore model
The HypoGen module in Catalyst performs two important
theoretical cost calculations that determine the success of
any pharmacophore hypothesis. One is known as the ‘‘fixed
cost,’’ representing the simplest model that fits all data
perfectly, and the second is known as ‘‘null cost,’’ which
represents the highest cost of a pharmacophore with no
features and which estimates activity to be the average of
the activity data of the training set molecules. Because the
null hypothesis is an ‘‘empty’’ hypothesis with no features,
there is no contribution of the weight and configuration
costs. All the analytical cost values represented in bits have
been calculated by the HypoGen module during pharma-
cophore generation.
The top-ranked pharmacophore model (Hypo1) showed
the best predictive power and statistical significance
described by the high squared correlation coefficient (r2 =
0.827), low root mean-square deviation (RMSD = 1.096),
weight (3.920), and error cost (142.703) satisfying the
acceptable range suggested in the cost analysis of the
Catalyst procedure (Lu et al., 2007). The low values of
error cost and RMSD represented the good quality of the
correlation between the estimated and the actual activity
data. The configuration cost was 9.667, well below the ceiling of 17
generally recommended for HypoGen, indicating that the hypothesis space
was small enough to be analyzed thoroughly. The cost
difference between total and fixed costs for the best
hypothesis was only 14.30 bits, indicating a high
probability of a true correlation in the data.
The smaller the difference between the total and fixed costs,
and the larger the difference between the null and total costs,
the higher the probability that the hypothesis reflects a true
correlation. The cost difference of 68.109 bits between the null cost
(224.399) and the total cost (156.29) exceeds the 60-bit threshold
that is conventionally taken to indicate a greater than 90 % chance of
representing a true correlation in the data. The cost values,
correlation coefficients (r), RMSD, and features for the top ten hypotheses
are listed in Table 2.
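The cost arithmetic for Hypo1 can be checked with a short sketch using only the values quoted in the text. The fixed cost is not stated explicitly, so it is derived here from the reported 14.30-bit difference; the probability bands in `null_difference_band` follow the commonly cited HypoGen rule of thumb and are an assumption, not output of the software.

```python
# Values quoted in the text for Hypo1 (all in bits).
NULL_COST = 224.399
TOTAL_COST = 156.29
TOTAL_MINUS_FIXED = 14.30
WEIGHT_COST = 3.920
ERROR_COST = 142.703
CONFIG_COST = 9.667


def null_difference_band(diff_bits):
    """Rule-of-thumb reading of (null cost - total cost); assumed convention."""
    if diff_bits > 60:
        return ">90 % chance of a true correlation"
    if diff_bits >= 40:
        return "75-90 % chance of a true correlation"
    return "correlation hard to trust"


null_diff = NULL_COST - TOTAL_COST
fixed_cost = TOTAL_COST - TOTAL_MINUS_FIXED  # derived, not reported directly
component_sum = WEIGHT_COST + ERROR_COST + CONFIG_COST

if __name__ == "__main__":
    print(f"null - total = {null_diff:.3f} bits: {null_difference_band(null_diff)}")
    print(f"derived fixed cost = {fixed_cost:.2f} bits")
    print(f"weight + error + configuration = {component_sum:.2f} bits")
```

The three cost components sum to the reported total cost, which is a useful consistency check on the tabulated values.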
Hypo1, identified as the best hypothesis, estimated the
activity of the training set molecules accurately. All the
compounds are classified by their activity as highly active
(<0.5 µM, +++), moderately active (0.5–10 µM, ++),
and inactive (>10 µM, +). Table 3 presents the actual
and predicted renal cancer cell line TOP-I inhibitory
activity of the 36 training set molecules based on the best
hypothesis. Out of the 36 training set compounds, two
Table 3 The actual and predicted TOP-I inhibitory activity of 36 training set molecules based on the best hypothesis
Name  Actual activity  Predicted activity  Fit value  Actual activity scale  Predicted activity scale
1_67 0.009 0.057 7.072 +++ +++
2_33 0.012 0.183 6.567 +++ +++
3_45 0.017 0.029 7.37 +++ +++
4_22 0.04 0.223 6.482 +++ +++
1_62 0.055 0.093 6.859 +++ +++
3_38 0.132 0.068 7 +++ +++
1_106 0.141 0.764 5.947 +++ ++
2_34 0.158 0.265 6.407 +++ +++
3_39 0.178 0.132 6.711 +++ +++
1_64 0.229 0.13 6.717 +++ +++
4_15 0.23 0.124 6.737 +++ +++
2_37 0.245 0.297 6.358 +++ +++
2_35 0.269 0.107 6.802 +++ +++
1_56 0.302 7.278 4.968 +++ ++
4_40 0.43 0.388 6.242 +++ +++
2_8 0.91 0.433 6.194 ++ +++
3_52 1.17 0.443 6.184 ++ +++
3_20 1.55 1.085 5.795 ++ ++
1_42 2.51 9.423 4.856 ++ ++
1_104 2.57 0.829 5.912 ++ ++
4_36 4.39 17.007 4.599 ++ +
1_50 5.25 11.939 4.753 ++ +
1_87 5.5 7.127 4.977 ++ ++
2_6 7 9.218 4.865 ++ ++
1_112 11.7 15.811 4.631 + +
1_35 11.7 12.053 4.749 + +
1_110 15.1 12.478 4.734 + +
1_90 15.5 8.524 4.899 + ++
1_92 22.9 7.271 4.968 + ++
1_40 26.9 9.618 4.847 + ++
3_53 33 4.07 5.22 + ++
3_34 36.3 14.547 4.667 + +
1_37 46.8 11.03 4.787 + +
1_61 46.8 14.321 4.674 + +
1_97 64.6 12.24 4.742 + +
4_10 93.3 12.744 4.725 + +
highly active compounds are predicted as moderately active, two
moderately active compounds are predicted as highly active and two as
inactive, and four inactive compounds are predicted as
moderately active. Consequently, for 26 of the 36 training set
compounds, the predicted GI50 (µM) values fall within the same
activity scale as the experimental values.
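The activity scale used for this comparison can be written as a small function. The sketch below applies the thresholds from the text (<0.5 µM, 0.5–10 µM, >10 µM) to six rows taken from Table 3; the function name `activity_class` is a hypothetical helper.

```python
def activity_class(gi50_um):
    """Map a GI50 value in µM onto the three-level activity scale."""
    if gi50_um < 0.5:
        return "+++"  # highly active
    if gi50_um <= 10:
        return "++"   # moderately active
    return "+"        # inactive


# (name, actual GI50, predicted GI50) for a few training set compounds from Table 3.
ROWS = [
    ("1_67", 0.009, 0.057),
    ("1_106", 0.141, 0.764),
    ("2_8", 0.91, 0.433),
    ("1_50", 5.25, 11.939),
    ("1_90", 15.5, 8.524),
    ("4_10", 93.3, 12.744),
]

# Names of compounds whose predicted class matches the experimental class.
agreeing = [
    name for name, actual, predicted in ROWS
    if activity_class(actual) == activity_class(predicted)
]

if __name__ == "__main__":
    for name, actual, predicted in ROWS:
        print(name, activity_class(actual), activity_class(predicted))
    print(f"{len(agreeing)} of {len(ROWS)} in the same class")
```

Running the same classification over all 36 rows is how the agreement count in the text can be reproduced.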
Fig. 1 The plot for training set
compounds showing correlation
between actual and predicted
activity
Fig. 2 The plot for internal test
set compounds showing
correlation between actual and
predicted activity
Fig. 3 Graph of 98 %
CatScrambled cost data. None
of the outcome hypotheses had a
lower cost score than the initial
(best) hypothesis, Hypo1
Table 4 Structures of 22 indenoisoquinoline derivatives as TOP-I inhibitors used for external validation
[Structure drawings not reproduced]
Name of compound  Actual activity GI50 (µM)  Predicted activity GI50 (µM)
6_5 35.7 19.549
6_6 16.4 16.167
6_7 6.49 13.446
6_8 4.39 17.015
6_9 7 9.217
6_15c 24 7.325
6_15d 5.83 7.464
6_15e 3.15 13.571
6_16b 4.53 18.113
6_17a 0.43 0.388
6_17b 0.31 1.284
6_18b 2.07 0.336
6_18c 0.01 0.162
6_18d 0.01 0.703
6_18f 0.15 0.764
6_18g 0.11 0.49
6_19b 0.11 0.4
6_19c 0.23 1.076
6_19d 0.28 1.076
6_25 0.15 0.105
6_27a 43.6 9.564
6_27c 21.7 10.615
The plot for training set compounds showing correlation
between actual and predicted renal cell line TOP-I inhibitor
activity is depicted in Fig. 1.
Pharmacophore validation
Internal test set
Test set validation is one of the obligatory steps to establish
the competency of the generated pharmacophore model for
prediction accuracy. In order to validate the pharmacophore
hypothesis, we have used a test set consisting of 53 molecules
with variation in anti-proliferative activity against renal
cancer cell line. All molecules in the test set have been built,
minimized, and subjected to conformational analysis like the
molecules in the training set. Test set prediction has been
observed in terms of the squared correlation coefficient (r2),
which is 0.702 (Fig. 2). The high r2 value indicates a good
correlation between the actual and estimated activities. The
agreement between the actual and predicted activity of test
Fig. 4 The plot for external test
set compounds showing
correlation between actual and
predicted activity
Fig. 5 Pharmacophoric
features obtained from the best
hypothesis, Hypo1:
a pharmacophoric features;
b inter-atomic distances
between pharmacophoric
features
Fig. 6 Mapping analysis of the
most active and the least active
compound on pharmacophore
model: a the most active
compound, 1_67, showed best
fit with all the three features;
b the least active compound,
4_10, showed poor fit with
mapping of two out of the three
pharmacophore features
Table 5 Mapping of pharmacophoric features with marketed and clinical trial drug candidates
[Mapping images of the drugs on the generated pharmacophore not reproduced]
Name of compound  Fit value
Afeletecan 4.965
Gimatecan 4.962
SN-38 4.962
Camptothecin 4.958
Irinotecan 4.919
9-amino-camptothecin 4.505
Topotecan 4.5
Rubitecan 4.474
Belotecan 4.461
Exatecan 3.888
set compounds testifies the soundness of Hypo1. This vali-
dation provides an added confidence in the usability of the
selected pharmacophore.
Fisher validation
To further evaluate the statistical relevance of the
pharmacophore hypotheses generated from the training set
molecules, the CatScramble module in Catalyst has been
used, which is based on the principle of Fisher's randomization
test. In this cross-validation test, the activity data of the
training set are thoroughly randomized to assess the
significance of the best generated model. These randomized
spreadsheets should yield hypotheses with lower statistical
significance than the original model if
the original hypothesis represents a true correlation. Fisher
validation at the 98 % confidence level has been applied to the developed model
to minimize the possibility of adopting fortuitous
pharmacophores. The results of the F-randomization test are
shown in Fig. 3. The cross-validation data clearly
indicate that the statistical values of Hypo1 are better than
those of the random hypotheses, as revealed by its lowest total
cost and highest correlation coefficient. This verifies
that Hypo1 has not been obtained by chance and supports, at the
98 % confidence level, that Hypo1 represents a true correlation
in the training set activity data.
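The confidence level of a randomization test of this kind is commonly expressed as 100 × [1 − (1 + x)/y], where x is the number of randomized hypotheses with a total cost lower than the original and y is the total number of hypotheses (randomized runs plus the original). The sketch below assumes 49 randomized runs, the usual setting for a 98 % level; the text does not state the number of runs, and the randomized costs used here are made-up placeholders.

```python
def randomization_confidence(original_cost, random_costs):
    """Confidence (%) that the original hypothesis was not obtained by chance."""
    better = sum(1 for cost in random_costs if cost < original_cost)
    total_runs = len(random_costs) + 1  # randomized runs plus the original
    return 100.0 * (1.0 - (1 + better) / total_runs)


if __name__ == "__main__":
    hypo1_cost = 156.29  # total cost of Hypo1 from Table 2
    # Placeholder costs for 49 scrambled spreadsheets (assumed, all above Hypo1).
    scrambled = [170.0 + i for i in range(49)]
    print(f"{randomization_confidence(hypo1_cost, scrambled):.1f} %")
```

If even one scrambled run had produced a lower cost than Hypo1, the confidence under this formula would drop to 96 %.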
External test set
A pharmacophore model is considered reliable when it not
only predicts the activity of the training and internal test set
compounds but also predicts the activities of external
molecules. The selected pharmacophore model has therefore been
further validated with an external test set consisting of known
TOP-I inhibitors with experimental GI50 values. In total, 22
compounds have been selected for the external set, with
activities ranging from 0.01 to 43.6 µM
(Table 4) (Cushman et al., 2000). The activities of all the
external test set compounds have been estimated using
Hypo1, giving a squared correlation coefficient (r2) of 0.703.
An r2 value of more than 0.5 between the actual and
estimated values indicates a good model (Frimayanti
et al., 2011). The graphical representation of actual versus
estimated activity of external test set compounds is
depicted in Fig. 4.
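A squared correlation coefficient of this kind can be computed from the actual and predicted values in Table 4 with a few lines of code. The sketch below works on log10-transformed GI50 values, which is an assumption about how the correlation was evaluated and may not reproduce the reported figure exactly; `r_squared` is a hypothetical helper implementing the squared Pearson correlation.

```python
import math


def r_squared(actual, predicted):
    """Squared Pearson correlation coefficient between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_a == 0 or var_p == 0:
        raise ValueError("correlation undefined for constant data")
    return cov * cov / (var_a * var_p)


# (actual, predicted) GI50 values in µM for the 22 external compounds (Table 4).
EXTERNAL = [
    (35.7, 19.549), (16.4, 16.167), (6.49, 13.446), (4.39, 17.015),
    (7, 9.217), (24, 7.325), (5.83, 7.464), (3.15, 13.571),
    (4.53, 18.113), (0.43, 0.388), (0.31, 1.284), (2.07, 0.336),
    (0.01, 0.162), (0.01, 0.703), (0.15, 0.764), (0.11, 0.49),
    (0.11, 0.4), (0.23, 1.076), (0.28, 1.076), (0.15, 0.105),
    (43.6, 9.564), (21.7, 10.615),
]

log_actual = [math.log10(a) for a, _ in EXTERNAL]
log_predicted = [math.log10(p) for _, p in EXTERNAL]
r2_external = r_squared(log_actual, log_predicted)

if __name__ == "__main__":
    print(f"r2 on log10 GI50 values: {r2_external:.3f}")
```

Working on a logarithmic scale keeps the most and least active compounds from dominating the statistic, since the activities span more than three orders of magnitude.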
Pharmacophore mapping
The obtained pharmacophoric features and their interfea-
ture distances are shown in Fig. 5a, b, respectively. Map-
ping of the most active compound, 1_67, shows the best fit
with all the three features (Fig. 6a). The HY feature is
mapped by the oxygen atom of the carbonyl group, PI is
mapped by the nitrogen atom of the amine group, and the
RA feature is mapped by benzene ring. On the contrary, the
least active compound, 4_10, shows a poor fit with map-
ping of two out of the three pharmacophore features
(Fig. 6b). In this case, the HY and RA features are mapped
on the benzene ring whereas PI feature is missing. The
most active compound in the dataset assumes a confor-
mation that allows proper mapping of all the features of the
generated hypothesis, whereas the least active compound is
unable to map PI.
In addition to this, the pharmacophore has also been
mapped on some of the clinically approved marketed drugs
and clinical trial candidates like Irinotecan, Topotecan,
Belotecan, SN-38, Lurtotecan, Rubitecan, Exatecan, Cam-
ptothecin, Afeletecan, Gimatecan, and 9-amino-camptothecin
Table 5 continued
Lurtotecan 2.055
Table 6 Hits obtained from NCI database
[Pharmacophore mapping images not reproduced]
Name  Estimated activity (µM)  Fit value
NSC 17153 0.04 7.228
NSC 3607 0.046 7.172
NSC 32583 0.051 7.125
NSC 11966 0.052 7.112
NSC 23679 0.056 7.08
NSC 24114 0.056 7.078
NSC 8582 0.057 7.071
NSC 23681 0.067 7.003
NSC 32480 0.073 6.969
(Table 5). It has been found that none of the compounds
mapped the PI feature; however, all the other features (one HY
and one RA) are mapped. Afeletecan is mapped with a
maximum fit value of 4.965.
Database screening
Database screening deals with the quick search of large
libraries of chemical structures in order to identify those
structures which are most likely to bind to target, typically
a protein receptor or enzyme (Kurogi and Guner, 2001;
Oloff et al., 2005). The best pharmacophore model
has been used to screen the 260,071 compounds of the NCI
database, which returned 295 hits. Lipinski's rule of five has
been applied to select druggable compounds, which led to
the selection of 64 compounds, among which 17 have
high estimated activities of 0.04–0.081 µM and fit values ranging
from 7.228 to 6.924, namely, NSC 17153, NSC 3607, NSC
32583, NSC 11966, NSC 23679, NSC 24114, NSC 8582,
NSC 23681, NSC 32480, NSC 13454, NSC 31334, NSC
18762, NSC 18418, NSC 15412, NSC 32478, NSC 33424,
and NSC 8571 (Table 6).
The most active compound NSC 17153 with estimated
activity of 0.04 and fit value of 7.228 showed mapping with
all the three features. PI was mapped to the amine group
whereas HY and RA mapped to the benzene ring. The
second most active compound NSC 3607 (estimated
activity = 0.046, fit value = 7.172) showed mapping of
NH group present in six membered ring with PI feature,
ethyl chain with HY feature, and benzene ring with RA
feature. Similarly, all the 17 compounds showed mapping
with all the features.
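The tabulated fit values and estimated activities appear to follow a simple log-linear relationship, log10(estimated activity) ≈ c − fit, which is the form HypoGen regression typically takes. The constant c ≈ 5.83 used below is inferred from the values in Tables 3 and 6 and is not stated in the text, so the sketch should be read as an illustration of that apparent relationship only; `estimated_activity` is a hypothetical helper.

```python
# Intercept inferred from tabulated (fit value, estimated activity) pairs; an assumption.
INFERRED_INTERCEPT = 5.83


def estimated_activity(fit_value, intercept=INFERRED_INTERCEPT):
    """Estimated GI50 in µM from a pharmacophore fit value, assuming a slope of -1."""
    return 10 ** (intercept - fit_value)


if __name__ == "__main__":
    # Fit values taken from Table 6 (NSC 17153, NSC 18762) and Table 3 (4_10).
    for name, fit in [("NSC 17153", 7.228), ("NSC 18762", 6.924), ("4_10", 4.725)]:
        print(f"{name}: fit {fit} -> about {estimated_activity(fit):.3f} µM")
```

Under this reading, each unit increase in fit value corresponds to a tenfold decrease in estimated GI50, which is why the hits with the highest fit values are also the ones with the lowest estimated activities.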
Conclusions
The generated pharmacophore model showed high corre-
lation values for both the training (r2 = 0.827) as well as
the internal test set (r2 = 0.702). The model was also
validated by external test set with an r2 of 0.703. The
results demonstrated that the HY, RA, and PI features
Table 6 continued
NSC 13454 0.075 6.954
NSC 31334 0.079 6.933
NSC 18762 0.081 6.924
contribute significantly to the TOP-I inhibitory activity. The
whole procedure of pharmacophore modeling along with
database screening carried out on the NCI database resulted
in the retrieval of 17 novel ligands with predicted TOP-I inhibitory
activity, which are potential subjects of further investigation.
Acknowledgments The authors thank the Department of Science
and Technology, New Delhi and the Vice Chancellor, Banasthali
University, for extending all the necessary facilities. The authors also
thank Dr. Monali Bhattacharya, Department of English, Banasthali
University, for her support.
References
Canamares I, Agustin MJ, Santander C, Gomez-Tijero N, de la
LLama N, Abad-Sazatornil MR (2012) Safety of sunitinib in
renal cell carcinoma. Eur J Hosp Pharm 19:166. doi:10.1136/
ejhpharm-2012-000074.215
Champoux JJ (2001) DNA topoisomerases: structure, function, and
mechanism. Annu Rev Biochem 70:369–413
Cushman M, Jayaraman M, Vroman JA, Fukunaga AK, Fox BM,
Kohlhagen G (2000) Synthesis of new indeno[1,2-c]isoquino-
lines: cytotoxic non-camptothecin topoisomerase I inhibitors.
J Med Chem 43:3688–3698
Fortune JM, Osheroff N (2000) Topoisomerase II as a target for
anticancer drugs: when enzymes stop being nice. Prog Nucleic
Acid Res Mol Biol 64:221–253
Frimayanti N, Yam ML, Lee HB, Othman R, Zain SM, Rahman NA
(2011) Validation of quantitative structure activity relationship
(QSAR) model for photosensitizer activity prediction. Int J Mol
Sci 12:8626–8644
Jaxel C, Kohn KW, Wani MC, Wall ME, Pommier Y (1989)
Structure-activity study of the actions of camptothecin deriva-
tives on mammalian topoisomerase I: evidence for a specific
receptor site and a relation to antitumor activity. Cancer Res
49:1465–1469
Kohlhagen G, Paull KD, Cushman M, Nagafuji P, Pommier Y (1998)
Protein-linked DNA strand breaks induced by NSC 314622, a
novel noncamptothecin topoisomerase I poison. Mol Pharmacol
54:50–58
Kurogi Y, Guner OF (2001) Pharmacophore modeling and three
dimensional database searching for drug design using catalyst.
Curr Med Chem 8:1035–1055
Lindblad P (2004) Epidemiology of renal cell carcinoma. Scand J
Surg 93:88–96
Lipinski CA, Lombardo F, Dominy BW, Feeney PJ (1997) Exper-
imental and computational approaches to estimate solubility and
permeability in drug discovery and development setting. Adv
Drug Deliv Rev 23:3–25
Lu A, Zhang J, Yin X, Luo X, Jiang H (2007) Farnesyltransferase
pharmacophore model derived from diverse classes of inhibitors.
Bioorg Med Chem Lett 17:243–249
Morrell A, Placzek M, Parmley S, Antony S, Dexheimer TS,
Pommier Y et al (2007a) Nitrated indenoisoquinolines as
topoisomerase I inhibitors: a systematic study and optimization.
J Med Chem 50:4419–4430
Morrell A, Placzek M, Parmley S, Grella B, Antony S, Pommier Y
et al (2007b) Optimization of the indenone ring of indenoiso-
quinoline topoisomerase I inhibitors. J Med Chem 50:4388–4404
Nagarajan M, Morrell A, Fort BC, Meckley MR, Antony S,
Kohlhagen G et al (2004) Synthesis and anticancer activity of
simplified indenoisoquinoline topoisomerase I inhibitors lacking
substituents on the aromatic rings. J Med Chem 47:5651–5661
Nagarajan M, Morrell A, Antony S, Kohlhagen G, Agama K,
Pommier Y et al (2006) Synthesis and biological evaluation of
bisindenoisoquinolines as topoisomerase I inhibitors. J Med
Chem 49:5129–5140
Oloff S, Mailman RB, Tropsha A (2005) Application of validated
QSAR models of D1 dopaminergic antagonists for database
mining. J Med Chem 48:7322–7332
Rekha PR, Rajendiran S, Rao S, Shroff S, Joseph LD, Prathiba D
(2008) Histological reclassification, histochemical characteriza-
tion and c-kit immunoexpression in renal cell carcinoma. Indian
J Urol 24:343–347
Scherr AJ, Lima JP, Sasse EC, Lima CS, Sasse AD (2011) Adjuvant
therapy for locally advanced renal cell cancer: a systematic
review with meta-analysis. BMC Cancer 11:115–122
Thakur A, Jain SK (2011) Kidney cancer: current progress in
treatment. World J Oncol 2:158–165
Wang JC (1996) DNA topoisomerases. Annu Rev Biochem
65:635–692
Wang JC (2002) Cellular roles of DNA topoisomerases: a molecular
perspective. Nat Rev Mol Cell Biol 3:430–440
Watts KS, Dalal P, Murphy RB, Sherman W, Friesner RA, Shelly JC
(2010) ConfGen: a conformational search method for efficient
generation of bioactive conformers. J Chem Inf Model 50:
534–546
Wilstermann AM, Osheroff N (2003) Stabilization of eukaryotic
topoisomerase II-DNA cleavage complexes. Curr Top Med
Chem 3:1349–1364
Yoshinari T, Ohkubo M, Fukasawa K, Egashira S, Hara Y,
Matsumoto M et al (1999) Mode of action of a new indoloc-
arbazole anticancer agent, J-107088, targeting topoisomerase I.
Cancer Res 59:4271–4275