

Research Signpost 37/661 (2), Fort P.O. Trivandrum-695 023 Kerala, India

QSPR-QSAR Studies on Desired Properties for Drug Design, 2010: 63-94 ISBN: 978-81-308-0404-0 Editor: Eduardo A. Castro

3. Molecular topology in QSAR and drug design studies

J. Gálvez and R. García-Domenech

Unidad de Diseño de Fármacos y Conectividad Molecular, Dept. Química Física, Facultad de Farmacia, Universitat de Valencia, Avd. V.A. Estellés, s/n 46100 Burjassot-Valencia, Spain

Abstract. The use of graph-theoretical approaches to describe the chemical structure of organic compounds has gained increasing relevance in recent years. From the early days, when this formalism was used to predict simple properties of simple molecules, such as the boiling points of alkanes, to the design of, for instance, novel lead anticancer drugs, significant progress has been achieved and a long path has been covered. The aim of this review is to depict some of the milestones along that path and to forecast the challenges and expectations for the future of the field.

Correspondence/Reprint request: Dr. J. Gálvez, Unidad de Diseño de Fármacos y Conectividad Molecular, Dept. Química Física, Facultad de Farmacia, Universitat de Valencia, Avd. V.A. Estellés, s/n 46100 Burjassot-Valencia, Spain. E-mail: [email protected] and [email protected]

1. Introduction

Since Corwin Hansch [1] introduced, at the beginning of the 1960s, his famous equation relating experimental properties of chemical compounds to certain electronic and steric parameters of molecules, thereby founding the field of quantitative structure-activity relationships (QSAR), the development of such methods has been dizzyingly fast. The introduction of ever more capable computers made that progress possible, to the point that the so-called in silico approaches are nowadays an essential tool.

The initial QSAR technique assumed that there is a relationship between the properties of a molecule and its structure, this relationship being captured by a set of physicochemical parameters. QSAR based on this original concept is regarded as the classic approach, as shown in the works of Hansch and Fujita [1]. A variety of modern QSAR methods are built on classic QSAR. These methods have become powerful tools for predicting chemical properties or possible biological functions of unknown compounds, thus accelerating the design and optimization of novel compounds.

Other approaches, such as that of Free and Wilson [2], built on the same concept and also yielded good results. Free-Wilson analysis relies on the additivity of the properties of molecular fragments, which does not always hold. However, in those cases in which it is applicable it yields very good results.

Another significant advance in the field of QSAR was the introduction by Cramer in 1988 of three-dimensional molecular parameters, which led to what later developed as 3D-QSAR [3]. This initiated a new era in both conceptual evolution and practical applications. Accounting for the effects of different conformers, stereoisomers or enantiomers of chemical compounds in 3D-QSAR models permitted the comparison of molecular structures, thereby establishing a representative structural group known as the pharmacophore [4]. The first, and currently one of the most widely used, models employing 3D-QSAR to derive pharmacophores was the Comparative Molecular Field Analysis (CoMFA) method proposed by Cramer et al. [3].
Other 3D-QSAR approaches, such as Comparative Molecular Similarity Indices Analysis (CoMSIA) [5] or Self Organizing Molecular Field Analysis (SomFA) [6], have also been developed, some of which incorporate comparisons of different sets of molecular descriptors. Along with the rapid development of computational science, a pool of new techniques based on formalisms such as molecular mechanics, molecular dynamics, docking, scoring and pharmacophore analysis is now widely used in drug discovery. These computational techniques have been shown to assist in the design of novel, more potent inhibitors because they can visualize the mechanism of ligand-receptor interactions.

Molecular topology (MT), a discipline usually considered part of the QSAR methods, has proven to be an excellent tool for the quick and accurate prediction of many physicochemical and biological properties [7-9]. One of its most interesting advantages is the straightforward calculation of the molecular descriptors. Within this mathematical formalism a molecule is represented as a graph, where each vertex represents one atom and each edge one bond. Starting from the interconnections between the vertices, an adjacency (topological) matrix can be built whose elements ij take the value one or zero, depending on whether vertex i is connected or not to vertex j, respectively (see Figure 1). Manipulation of this matrix gives rise to a set of topological indices, or topological descriptors, which characterize each graph and allow the development of QSPR [10–12] as well as QSAR [13–18] analyses.

The distinctive features of molecular topology can be summarized as follows:

a) It is a completely mathematical framework in which molecular structure is profiled.
b) It is a very efficient approach for drug discovery, either by screening large databases of compounds or by designing novel compounds through the inverse process (properties→structure). Furthermore, it is easily computerizable.

Item b) is a direct consequence of item a).

The use of molecular topology in QSAR has grown exponentially in recent years. Altogether, the topological approach covers over 20% of all papers on QSAR: a recent search carried out with the SciFinder Scholar database showed that about 3000 papers out of 15000 dealing with QSAR were devoted to topological descriptors (see Figure 2). In the current work, we focus on the contributions of molecular topology to QSAR studies obtained by our research group in recent years and on their application to drug design.
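The construction of the adjacency matrix described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the vertex numbering of isopentane is chosen arbitrarily here and need not match Figure 1, and the function names are hypothetical. As an example of "manipulating" the matrix, the sketch also derives the topological distances by breadth-first search and sums them into the Wiener number W defined in Section 2.1.

```python
from collections import deque

# Hydrogen-depleted graph of isopentane (2-methylbutane).
# Assumed numbering for this sketch: 1-2-3-4 is the main chain,
# vertex 5 is the methyl branch attached to vertex 2.
BONDS = [(1, 2), (2, 3), (3, 4), (2, 5)]
N_ATOMS = 5

def adjacency_matrix(bonds, n):
    """Return the n x n adjacency matrix: 1 if i and j are bonded, else 0."""
    a = [[0] * n for _ in range(n)]
    for i, j in bonds:
        a[i - 1][j - 1] = 1
        a[j - 1][i - 1] = 1
    return a

def distance_matrix(a):
    """Topological distances (bonds on the shortest path) by breadth-first search.

    Assumes a connected graph, as is the case for a single molecule.
    """
    n = len(a)
    d = [[0] * n for _ in range(n)]
    for start in range(n):
        seen = {start}
        queue = deque([(start, 0)])
        while queue:
            v, dist = queue.popleft()
            d[start][v] = dist
            for w in range(n):
                if a[v][w] and w not in seen:
                    seen.add(w)
                    queue.append((w, dist + 1))
    return d

A = adjacency_matrix(BONDS, N_ATOMS)
D = distance_matrix(A)
degrees = [sum(row) for row in A]
wiener = sum(D[i][j] for i in range(N_ATOMS) for j in range(i + 1, N_ATOMS))

for row in A:
    print(row)
print("vertex degrees:", degrees)
print("Wiener number :", wiener)
```

The row sums of the matrix give the vertex degrees, which are the starting point for the connectivity indices introduced later in the chapter.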

Figure 1. The chemical graph and adjacency matrix of isopentane.

Figure 2. Bibliographic search carried out with the SciFinder Scholar database. White bars: papers including the keyword QSAR; black bars: papers including the keywords QSAR + molecular topology.

2. Methods and applications

Molecular topology, MT, can be defined as the part of chemistry concerned with the topological description of molecular structures. Such a description deals basically with the connectivity of the atoms forming the molecule and should yield numerical descriptors that are invariant under deformation of the structure. Note that, although graph theory is usually the main source of the descriptors and concepts feeding MT, MT is a much broader concept and is not actually a part of graph theory.

Throughout this section several different molecular connectivity indices will be introduced. To outline the particular QSAR techniques used with this methodology, the descriptors are defined first, followed by the modeling tools applied to them. Diverse statistical and molecular techniques are sketched here.

2.1. Descriptors

The following types of indices, which have been the main ones used in this research, are described in increasing order of complexity.

Discrete invariants

These are natural numbers calculated from what chemists understand qualitatively as the chemical structure. N is the number of non-hydrogen atoms, i.e., the number of vertices of the molecular graph [19,20]. Vk, where k is 3 or 4, is the number of vertices of degree k, i.e., the number of atoms having k bonds, σ or π, to non-hydrogen atoms [20]. PRk, for k between 0 and 3, is the number of pairs of ramifications at distance k, i.e., the number of pairs of single branches separated by k bonds [20]. L is the length, i.e., the maximum distance between non-hydrogen atoms measured in bonds, and is thus the diameter of the molecular graph, defined as max(dij) [20]. W is the Wiener number, i.e., the sum of the distances between any two non-hydrogen atoms measured in bonds [21].

Connectivity indices

Throughout the present section the connectivity indices defined as in Eq. 1 will be used [22, 23]. Some of them are slightly different from the previously defined indices. The connectivity index of order k [23] may be derived from the adjacency matrix and is normally written as $^{k}\chi_{t}$. The order k is between 0 and 4 and is the number of connected non-hydrogen atoms which appear in a given sub-structure.

$$ {}^{k}\chi_{t} = \sum_{j=1}^{{}^{k}n_{t}} \left( \prod_{i \in S_j} \delta_i \right)^{-1/2} \qquad \text{Eq. 1a} $$

In Eq. 1a, δi is the number of simple (i.e., sigma) bonds of atom i to non-hydrogen atoms, Sj represents the jth sub-structure of order k and type t, and $^{k}n_{t}$ is the total number of sub-graphs of order k and type t that can be identified in the molecular structure. The types used are path (p), cluster (c), and path-cluster (pc). A sub-graph of type p is formed by a path and a sub-graph of type c by a star, while a pc sub-graph can be defined as any tree which is neither a path nor a star; alternatively, a pc sub-graph is any tree containing at least one star and one path. As an example, Table 1 displays all the p, c, and pc sub-graphs found in a simple molecular structure. The use of the valence delta, $\delta^{v}$, instead of δ enables the encoding of π and lone-pair electrons [22] in the form given in Eq. 1b.

$$ {}^{k}\chi_{t}^{v} = \sum_{j=1}^{{}^{k}n_{t}} \left( \prod_{i \in S_j} \delta_i^{v} \right)^{-1/2} \qquad \text{Eq. 1b} $$

Here $\delta^{v}$ is just the degree of a vertex in a pseudograph, and in this context the old definition, $\delta^{v} = Z^{v} / (Z - Z^{v} - 1)$, holds for the $\delta_i^{v}$ of higher-row atoms [22]. The values listed in Table 2 and used in this approach were adopted because of their general performance.
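As an illustration of Eqs. 1a and 1b, the following Python sketch enumerates the path-type sub-graphs of 2-methylpropanol (the molecule of Table 1) and sums their contributions. Several things here are assumptions made for the example rather than part of the original work: the vertex numbering, the function names, the usual Kier-Hall convention that a path of order k contains k bonds (so order 0 runs over single atoms), and the choice of valence deltas (the carbon values are taken equal to the simple δ because these saturated carbons carry no π or lone-pair electrons; the value 5 for -OH is the one listed in Table 2). Cluster and path-cluster sub-graphs are not covered.

```python
from math import prod, sqrt

# Hydrogen-depleted graph of 2-methylpropanol.
# Assumed numbering: C1 and C3 are the methyl carbons, C2 the CH,
# C4 the CH2 and O5 the hydroxyl oxygen.
BONDS = [(1, 2), (3, 2), (2, 4), (4, 5)]
DELTA = {1: 1, 2: 3, 3: 1, 4: 2, 5: 1}    # simple delta: bonds to non-hydrogen atoms
DELTA_V = {1: 1, 2: 3, 3: 1, 4: 2, 5: 5}  # valence delta; 5 for -OH as in Table 2

def paths(bonds, k):
    """Enumerate the paths with k bonds (k + 1 distinct atoms), each counted once."""
    neighbours = {}
    for i, j in bonds:
        neighbours.setdefault(i, set()).add(j)
        neighbours.setdefault(j, set()).add(i)
    found = set()

    def extend(path):
        if len(path) == k + 1:
            # A path and its reverse are the same sub-graph.
            found.add(min(tuple(path), tuple(reversed(path))))
            return
        for nxt in neighbours[path[-1]]:
            if nxt not in path:
                extend(path + [nxt])

    for start in neighbours:
        extend([start])
    return sorted(found)

def chi_path(bonds, delta, k):
    """Path-type connectivity index of order k, in the form of Eq. 1a / 1b."""
    return sum(1.0 / sqrt(prod(delta[i] for i in p)) for p in paths(bonds, k))

for k in range(3):
    print(f"order {k}: chi = {chi_path(BONDS, DELTA, k):.4f}, "
          f"chi_v = {chi_path(BONDS, DELTA_V, k):.4f}")
```

Passing `DELTA` gives the simple index of Eq. 1a and passing `DELTA_V` the valence index of Eq. 1b; the two differ only through the oxygen atom in this molecule.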


Table 1. Types of sub-graphs present in the 2-methylpropanol structure.

[Structure drawings of the path, cluster and path-cluster sub-graphs, arranged by order (1 to 5).]

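Table 2. Values of $\delta^{v}$ for the different heteroatoms present in the listed groups.

| Group | $\delta^{v}$ | Group | $\delta^{v}$ |
|---|---|---|---|
| NH4+ | 1 | H3O+ | 3 |
| NH3 | 2 | H2O | 4 |
| -NH2 | 3 | -OH | 5 |
| -NH- | 4 | -O- | 6 |
| =NH | 4 | =O | 6 |
| -N- | 5 | O (nitro) | 6 |
| =N- | 5 | O (carboxyl) | 6 |
| =N+= (azide) | 4 | -F | 7 |
| =N- (azide) | 6 | -Cl | 0.690 |
| -N= (nitro) | 6 | -Br | 0.254 |
| -S- | 1.33 | -I | 0.15 |
| =S | 0.99 | =P- | 0.560 |
| S (-SO2-) | 2.67 | P(5) | 2.22 |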

Topological Charge Indices (TCI)

The topological charge indices Gk and Jk of order k = 1-5 are defined for a given graph by Eq. 2 [24], in which N is the number of non-hydrogen atoms and cij = mij − mji is the charge term between vertices i and j. Here δ is the Kronecker delta, i.e., δ(a,b) = 1 if a = b and δ(a,b) = 0 if a ≠ b, and dij is the topological distance between vertices i and j.


$$ G_k = \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \left| c_{ij} \right| \, \delta(k, d_{ij}) \qquad \text{and} \qquad J_k = \frac{G_k}{N-1} \qquad \text{Eq. 2} $$

The variables mij are the elements of the NxN matrix M obtained as the product of two matrices, i.e., M = A·Q. The elements of M expanded in terms of the elements of A and Q are given in Eq. 3.

$$ m_{ij} = \sum_{h=1}^{N} a_{ih} \, q_{hj} \qquad \text{Eq. 3} $$

A is the adjacency matrix, whose elements aih are: 0 if either i = h or i is not linked to h; 1 if i is linked to h by a single bond; 1.5 if linked by an aromatic bond; 2 if linked by a double bond; and 3 if linked by a triple bond. Q is the inverse squared distance, or Coulombian, matrix. Its elements qhj are 0 if h = j and otherwise $q_{hj} = 1/d_{hj}^{2}$, where dhj is the topological distance between vertices h and j. Thus, Gk represents the overall sum of the cij charge terms for every pair of vertices i and j separated by a topological distance k.

The valence topological charge indices $G_k^{v}$ and $J_k^{v}$ are defined in a similar way, but using $A^{v}$, the electronegativity-modified adjacency matrix, instead of A. The elements of A and $A^{v}$ are identical except on the main diagonal, where A has zeroes and $A^{v}$ has, for each heteroatom, the corresponding Pauling electronegativity value, EN, scaled so that EN(Cl) = 2.

To illustrate the calculation of the topological charge indices, consider n-butane. Its hydrogen-depleted graph is •⎯•⎯•⎯•. Numbering the vertices 1-2-3-4, we can write the A, Q and M matrices, which are used to derive the following G values: G1 = |c12| + |c23| + |c34| = 1/4 + 0 + 1/4 = 0.500, G2 = |c13| + |c24| = 1/9 + 1/9 = 0.2222, and G3 = |c14| = 0.

$$ A = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix} \qquad Q = \begin{pmatrix} 0 & 1 & 1/4 & 1/9 \\ 1 & 0 & 1 & 1/4 \\ 1/4 & 1 & 0 & 1 \\ 1/9 & 1/4 & 1 & 0 \end{pmatrix} $$

$$ M = A \cdot Q = \begin{pmatrix} 1 & 0 & 1 & 1/4 \\ 1/4 & 2 & 1/4 & 10/9 \\ 10/9 & 1/4 & 2 & 1/4 \\ 1/4 & 1 & 0 & 1 \end{pmatrix} $$
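The n-butane example can be checked with a short Python sketch that follows Eqs. 2 and 3 literally. Exact fractions are used so that the values 1/4 and 1/9 appear as such. The variable and function names are chosen for this illustration only, and the shortcut used for the distance matrix is valid only for a linear chain.

```python
from fractions import Fraction

# n-butane, hydrogen-depleted graph 1-2-3-4 (all single bonds).
N = 4
A = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]

# Assumption valid for a linear chain only: the topological distance is |i - j|.
D = [[abs(i - j) for j in range(N)] for i in range(N)]

# Q: inverse squared distance matrix, with zeroes on the diagonal.
Q = [[Fraction(1, D[h][j] ** 2) if h != j else Fraction(0) for j in range(N)]
     for h in range(N)]

# M = A . Q, element by element as in Eq. 3.
M = [[sum(A[i][h] * Q[h][j] for h in range(N)) for j in range(N)]
     for i in range(N)]

def charge_index(k):
    """G_k of Eq. 2: sum of |c_ij| = |m_ij - m_ji| over pairs i < j with d_ij = k."""
    return sum(abs(M[i][j] - M[j][i])
               for i in range(N - 1) for j in range(i + 1, N)
               if D[i][j] == k)

G = {k: charge_index(k) for k in (1, 2, 3)}
J = {k: G[k] / (N - 1) for k in G}

for k in G:
    print(f"G{k} = {G[k]} ({float(G[k]):.4f})   J{k} = {J[k]} ({float(J[k]):.4f})")
```

For a branched or cyclic molecule the distance matrix would have to be computed from A, for example by breadth-first search, and the bond-order weights (1.5, 2, 3) would enter through A.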

Differences and quotients of connectivity indices

The differences of connectivity indices, $^{k}D_{t}$, with k = 0-4 and t = p, c, pc, are defined as follows [25]:

$$ {}^{k}D_{t} = {}^{k}\chi_{t} - {}^{k}\chi_{t}^{v} \qquad \text{Eq. 4} $$

The quotients of connectivity indices, $^{k}C_{t}$, with k = 0-4 and t = p, c, pc, are defined as follows [20]:

$$ {}^{k}C_{t} = \frac{{}^{k}\chi_{t}}{{}^{k}\chi_{t}^{v}} \qquad \text{Eq. 5} $$
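A small numerical illustration of Eqs. 4 and 5, restricted to the first-order path index of 2-methylpropanol, is sketched below. The numbering, the function name and the valence deltas are assumptions made for the example (carbon values equal to the simple δ; 5 for -OH as in Table 2); the values are not taken from the original work.

```python
from math import sqrt

# 2-methylpropanol, hydrogen-depleted; numbering chosen for this sketch.
BONDS = [(1, 2), (3, 2), (2, 4), (4, 5)]    # C1-C2, C3-C2, C2-C4, C4-O5
DELTA = {1: 1, 2: 3, 3: 1, 4: 2, 5: 1}      # simple delta
DELTA_V = {1: 1, 2: 3, 3: 1, 4: 2, 5: 5}    # valence delta, -OH = 5 (Table 2)

def chi1(bonds, delta):
    """First-order path index: sum over bonds of (delta_i * delta_j)^(-1/2)."""
    return sum(1.0 / sqrt(delta[i] * delta[j]) for i, j in bonds)

chi = chi1(BONDS, DELTA)
chi_v = chi1(BONDS, DELTA_V)
difference = chi - chi_v   # first-order path difference, Eq. 4
quotient = chi / chi_v     # first-order path quotient, Eq. 5

print(f"1chi = {chi:.4f}  1chi_v = {chi_v:.4f}  "
      f"1D = {difference:.4f}  1C = {quotient:.4f}")
```

Because the simple and valence deltas differ only on the heteroatom, both D and C isolate the contribution of the oxygen to the index, which is in line with the later remark that the difference indices encode electronic properties.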

All descriptors used in this work were obtained with the aid of the Desmol11 program, developed by us and available on request by e-mail.

2.2. QSAR algorithm

2.2.1. Multilinear regression analysis, MLR

Several multilinear descriptive functions of the form of Eq. 6 have been obtained by linear correlation of biological properties with the aforementioned descriptors.

$$ P_i = A_0 + \sum_i A_i X_i \qquad \text{Eq. 6} $$

where Pi is a property, Xi are the topological indices, and A0 and Ai are the regression coefficients of the equation obtained. The Furnival-Wilson algorithm [26] is used to obtain subsets of descriptors and equations with the lowest Mallows parameter, Cp [27]. This algorithm combines two methods of computing the residual sums of squares for all possible regressions to form a simple leap-and-bound technique for finding the best subsets without examining all possible ones. The result is a reduction by several orders of magnitude in the number of operations required to find the best subsets.

The predictive ability of the selected mathematical models was evaluated through cross-validation, using the leave-one-out method [28]. To do this, one compound in the set was removed and the model was recalculated using the remaining N-1 compounds as the training set. The property was then predicted for the removed compound. This process was repeated for all the compounds in the set to obtain a prediction for each. A plot of the residuals versus the cross-validation residuals (cv) allowed the detection of outliers.

To detect possible chance correlations, a randomization test is adopted in this paper. The values of the property of each compound are randomly permuted and linearly correlated with the aforementioned descriptors. This process is repeated as many times as needed. The usual way to represent the results of a randomization test is to plot the correlation coefficients against the prediction coefficients, r2 and Q2 respectively.
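The fitting, leave-one-out and randomization steps described above can be sketched as follows. This is a minimal illustration in plain Python, not the BMDP procedure used by the authors: it fits a fixed set of descriptors by ordinary least squares and does not implement the Furnival-Wilson subset search. All function names and the data set are hypothetical, and Q2 is assumed to be defined as 1 − PRESS/SS_tot.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit(X, y):
    """Least-squares coefficients [A0, A1, ...] of Eq. 6 via the normal equations."""
    rows = [[1.0] + list(x) for x in X]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)

def predict(coef, x):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))

def r2(X, y):
    coef = fit(X, y)
    mean = sum(y) / len(y)
    ss_res = sum((yi - predict(coef, x)) ** 2 for x, yi in zip(X, y))
    return 1.0 - ss_res / sum((yi - mean) ** 2 for yi in y)

def q2_loo(X, y):
    """Leave-one-out: refit without compound i, then predict compound i."""
    mean = sum(y) / len(y)
    press = 0.0
    for i in range(len(y)):
        coef = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - predict(coef, X[i])) ** 2
    return 1.0 - press / sum((yi - mean) ** 2 for yi in y)

def randomization(X, y, trials=20, seed=0):
    """Refit after shuffling the property values; returns one (r2, Q2) pair per trial."""
    rng = random.Random(seed)
    out = []
    for _ in range(trials):
        shuffled = y[:]
        rng.shuffle(shuffled)
        out.append((r2(X, shuffled), q2_loo(X, shuffled)))
    return out

# Hypothetical data: two made-up descriptors and a property that depends on them.
X = [[1.0, 0.2], [2.0, 0.1], [3.0, 0.4], [4.0, 0.3], [5.0, 0.6],
     [6.0, 0.5], [7.0, 0.9], [8.0, 0.7], [9.0, 0.8], [10.0, 1.0]]
noise = [0.1, -0.2, 0.05, 0.15, -0.1, 0.2, -0.05, -0.15, 0.1, -0.1]
y = [2.0 + 1.5 * a - 3.0 * b + e for (a, b), e in zip(X, noise)]

print("r2 =", round(r2(X, y), 3), " Q2 =", round(q2_loo(X, y), 3))
print("randomized (r2, Q2):",
      [(round(a, 2), round(b, 2)) for a, b in randomization(X, y, 5)])
```

In a randomization plot, the point for the real model is expected to lie well apart from the cloud of (r2, Q2) points obtained with the permuted property values.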


The predictive ability of the equations obtained can be better assessed by an external test. A random sub-set of molecules is initially set aside (the "external test set") and the modeling study is carried out with the remaining molecules (the "training set"). The predictive performance of the model is assessed by the results obtained when it is applied to the external test set.

MLR applications

Prediction of plasmatic protein binding for a group of antineoplastics

A set of 41 highly heterogeneous antineoplastic drugs was used in this study. The multilinear regression analysis was performed by means of the BMDP software [29]. The dependent variable, namely plasmatic protein binding (PPB), was correlated against the topological descriptors. The selected equation was:

$$ \mathrm{PPB}(\%) = 194.9 + 7.2\,{}^{4}\chi_{pc} + 3.6\,G_{1} - 10.6\,G_{5}^{v} - 124.3\,J_{1}^{v} - 704.0\,J_{4}^{v} - 8.5\,{}^{2}D_{p} \qquad \text{Eq. 7} $$

N=41 r2=0.805 Q2=0.732 SEE=15.5 F=23.3 p<0.0001

The J and G labels correspond to the topological charge indices, which take into account the charge transfers inside the molecule, whereas $\chi_{pc}$ and $^{2}D_{p}$ stand for the connectivity and difference of connectivity indices, respectively. The first encodes information about the topological assembly of the molecule and the second about electronic properties.

Table 3 and Figure 3 show the outcome of the prediction of PPB for each drug. Altogether the predictive performance is quite good, insofar as all the drugs with a high PPB (23 compounds with PPB above 80%) are predicted above 70% (except raltitrexed, which is a clear outlier). On the other hand, all drugs with values below 20% (namely six compounds) are predicted to lie under a 35% threshold. As with the top values, there is a clear outlier here: cyclophosphamide.

Table 3. Results of prediction of plasmatic protein binding obtained by multilinear regression analysis for a group of antineoplastics.

| Compound | PPBexp (%) | PPBcalc (%)* | PPBcalc(cv) (%) |
|---|---|---|---|
| Dacarbazine | 5.0 | 15.4 | 17.0 |
| Gemcitabine | 10.0 | 33.6 | 37.8 |
| Cytarabine | 13.0 | 32.1 | 33.9 |
| Cyclophosphamide | 15.0 | 48.7 | 51.2 |
| Temozolomide | 15.0 | -2.1 | -5.8 |
| Mercaptopurine | 19.0 | 18.6 | 18.5 |
| Aminoglutethimide | 25.0 | 53.1 | 57.2 |
| Cladribine | 25.0 | 9.9 | 7.3 |
| Busulfan | 30.0 | 53.0 | 62.4 |
| Nimustine | 34.0 | 26.6 | 26.0 |
| Topotecan | 35.0 | 37.7 | 38.0 |
| Anastrozole | 45.0 | 31.8 | 29.1 |
| Methotrexate | 50.0 | 71.0 | 74.7 |
| Capecitabine | 54.0 | 50.0 | 49.8 |
| Letrozole | 60.0 | 55.3 | 54.6 |
| Irinotecan | 65.0 | 87.6 | 90.3 |
| Vinblastine | 75.0 | 74.4 | 74.0 |
| Vincristine | 75.0 | 78.3 | 80.3 |
| Mitoxantrone | 76.0 | 58.9 | 56.8 |
| Doxorubicin | 80.0 | 75.1 | 74.7 |
| Melphalan | 80.0 | 84.7 | 85.1 |
| Hydroxycarbamide | 80.0 | 72.2 | 69.0 |
| Epirubicin | 85.0 | 75.1 | 74.1 |
| Exemestane | 90.0 | 90.1 | 90.1 |
| Formestane | 93.0 | 93.0 | 93.1 |
| Raltitrexed | 93.0 | 68.7 | 65.5 |
| Medroxyprogesterone | 94.0 | 92.7 | 92.4 |
| Cyproterone | 95.0 | 94.3 | 94.1 |
| Docetaxel | 95.0 | 95.5 | 95.7 |
| Etoposide | 95.0 | 88.4 | 86.5 |
| Flutamide | 95.0 | 80.6 | 77.5 |
| Imatinib | 95.0 | 103.7 | 107.0 |
| Paclitaxel | 95.0 | 98.9 | 99.9 |
| Idarubicin | 96.0 | 76.0 | 73.9 |
| Amsacrine | 97.0 | 99.8 | 100.2 |
| Bicalutamide | 98.0 | 88.7 | 86.9 |
| Chlorambucil | 99.0 | 72.4 | 70.0 |
| Estramustine | 99.0 | 90.7 | 89.9 |
| Tamoxifen | 99.0 | 102.5 | 103.2 |
| Thiotepa | 99.0 | 89.3 | 80.2 |
| Toremifene | 99.0 | 110.4 | 112.8 |

* Values obtained from Eq. 7

Figure 3. Observed versus calculated values of the PPB.

Figure 4. Cross-validated residuals versus residuals for the PPB model.
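As a consistency check, the statistics quoted for Eq. 7 can be recomputed from the three columns of Table 3. The sketch below assumes the usual definitions, which the text does not spell out: r2 = 1 − SS_res/SS_tot, Q2 = 1 − PRESS/SS_tot, and SEE computed with N − 7 degrees of freedom (six descriptors plus the intercept). Variable names are chosen for the illustration.

```python
from math import sqrt

# (PPB experimental, PPB calculated by Eq. 7, PPB cross-validated), in %, from Table 3.
TABLE3 = [
    (5.0, 15.4, 17.0), (10.0, 33.6, 37.8), (13.0, 32.1, 33.9), (15.0, 48.7, 51.2),
    (15.0, -2.1, -5.8), (19.0, 18.6, 18.5), (25.0, 53.1, 57.2), (25.0, 9.9, 7.3),
    (30.0, 53.0, 62.4), (34.0, 26.6, 26.0), (35.0, 37.7, 38.0), (45.0, 31.8, 29.1),
    (50.0, 71.0, 74.7), (54.0, 50.0, 49.8), (60.0, 55.3, 54.6), (65.0, 87.6, 90.3),
    (75.0, 74.4, 74.0), (75.0, 78.3, 80.3), (76.0, 58.9, 56.8), (80.0, 75.1, 74.7),
    (80.0, 84.7, 85.1), (80.0, 72.2, 69.0), (85.0, 75.1, 74.1), (90.0, 90.1, 90.1),
    (93.0, 93.0, 93.1), (93.0, 68.7, 65.5), (94.0, 92.7, 92.4), (95.0, 94.3, 94.1),
    (95.0, 95.5, 95.7), (95.0, 88.4, 86.5), (95.0, 80.6, 77.5), (95.0, 103.7, 107.0),
    (95.0, 98.9, 99.9), (96.0, 76.0, 73.9), (97.0, 99.8, 100.2), (98.0, 88.7, 86.9),
    (99.0, 72.4, 70.0), (99.0, 90.7, 89.9), (99.0, 102.5, 103.2), (99.0, 89.3, 80.2),
    (99.0, 110.4, 112.8),
]
N_DESCRIPTORS = 6  # number of topological indices in Eq. 7

observed = [row[0] for row in TABLE3]
mean = sum(observed) / len(observed)
ss_tot = sum((e - mean) ** 2 for e in observed)
ss_res = sum((e - c) ** 2 for e, c, _ in TABLE3)   # residuals of the fitted model
press = sum((e - cv) ** 2 for e, _, cv in TABLE3)  # leave-one-out residuals

r2 = 1.0 - ss_res / ss_tot
q2 = 1.0 - press / ss_tot
see = sqrt(ss_res / (len(observed) - N_DESCRIPTORS - 1))

print(f"N = {len(observed)}  r2 = {r2:.3f}  Q2 = {q2:.3f}  SEE = {see:.1f}")
```

Under these assumptions the tabulated values reproduce the reported r2, Q2 and SEE to the quoted precision, which suggests that the table was transcribed without loss.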


The uniform distribution of the points on both sides of the trend line (see Figure 3) indicates a well-balanced selection of variables in the model. The predictive ability of the selected mathematical model was evaluated through leave-one-out cross-validation. Table 3 and Figure 4 show the results obtained. The value of Q2 = 0.732 is considered satisfactory.

Prediction of the phenoloxidase inhibition by a group of benzaldehyde thiosemicarbazones

A topological-mathematical model has been developed to search for new derivatives of benzaldehyde thiosemicarbazone and related compounds acting as phenoloxidase inhibitors. Phenoloxidase, PO, also known as tyrosinase, is a key enzyme in different metabolic processes of microorganisms, animals and plants [30,31]. In insects, PO is involved in three important biochemical functions: cuticle sclerotization, defensive encapsulation and the melanization of foreign organisms [32]. By using multilinear regression analysis, a function with two descriptors, $^{1}\chi^{v}$ and $^{4}\chi_{p}^{v}$, and r2 = 0.940 was able to predict the IC50 of each compound adequately. The best linear regression equation obtained, including its statistical parameters, was:

$$ \mathrm{pIC}_{50} = -3.132 + 5.716\,{}^{1}\chi^{v} - 16.581\,{}^{4}\chi_{p}^{v} \qquad \text{Eq. 8} $$

N=44 r2=0.940 Q2=0.931 SEE=0.500 F=321.1 p<0.00001

The presence of the $^{1}\chi^{v}$ and $^{4}\chi_{p}^{v}$ indices in the equation reflects the influence of branching and of the position of the substituent on the aromatic ring, respectively. Figure 5 shows the predicted results obtained for each of the compounds in the training and test sets. The results of the randomization test, Figure 6, suggest a high stability of the model (all regressions were rather poor except for the selected equation). For more details about this study, see reference [33].

Other results of prediction of biological properties obtained by our research group

The most interesting results obtained recently with the methodology described in the preceding paragraphs are shown in Table 4. Additional details are available in the references cited.


Figure 5. Relationship of pIC50exp with pIC50calc from prediction function obtained using multilinear regression analysis, Eq. 8. Open circles represent predictions for the training set; solid circles represent predictions for the test set.

Figure 6. Validation of the mathematical model obtained for the pIC50. Correlation coefficients, r2, versus prediction coefficients, Q2, obtained by randomization test.


Table 4. QSAR topological models to predict pharmacological properties.

| Property | Predictive equation | Statistics | Ref. |
|---|---|---|---|
| IC50 (μM), L. donovani | $\log \mathrm{IC}_{50} = 13.32 + 0.814\,{}^{4}\chi_{p} + 1.381\,G_{3}^{v} - 32.16\,J_{3}^{v} + 0.0018\,W - 0.717\,N + 0.332\,PR_{1} - 0.263\,PR_{2}$ | N=48 r2=0.806 SEE=0.366 F=10.6 p<0.0001 | [34] |
| IC50 (μM), K1 strain of P. falciparum | $\log \mathrm{IC}_{50} = 3.154 - 0.338\,{}^{1}\chi^{v} + 0.381\,{}^{4}\chi_{pc}$ | N=54 r2=0.842 Q2=0.825 SEE=0.308 F=136.4 p<0.0001 | [35] |
| Log LD50 (oral in rat) | $\log \mathrm{LD}_{50} = 15.13 + 0.79\,G_{1} - 1.83\,{}^{4}C_{p} - 1.10\,{}^{1}\chi^{v} - 0.28\,G_{2} - 8.89\,{}^{0}C + 1.43\,G_{3}^{v} - 20.04\,J_{3}^{v} - 0.13\,{}^{4}\chi_{pc}^{v}$ | N=39 r2=0.821 Q2=0.701 SEE=0.40 F=17.2 p<0.0001 | [36] |
| Log LD50 (i.p. in rat) | $\log \mathrm{LD}_{50} = 8.40 + 9.35\,J_{1} - 2.03\,{}^{4}C_{p} - 10.11\,{}^{0}C + 1.20\,{}^{1}D + 0.83\,G_{5}^{v} - 0.39\,{}^{4}\chi_{pc}^{v}$ | N=39 r2=0.721 Q2=0.613 SEE=0.54 F=13.8 p<0.0001 | [36] |
| Toxicity to Chlorella vulgaris | $\mathrm{pC} = -4.494 + 0.568\,{}^{0}\chi^{v} - 0.113\,G_{1}^{v} - 1.161\,G_{5}^{v} + 10.071\,J_{4} + 0.188\,V_{3}$ | N=70 r2=0.928 Q2=0.918 SEE=0.405 F=180 p<0.0001 | [37] |

2.2.2. Linear discriminant analysis, LDA

The objective of linear discriminant analysis, LDA, is to find a linear combination of variables that allows discrimination between two or more categories of objects. Generally, two sets of compounds are considered in the analysis: first, a set of compounds with proven pharmacological activity, and second, a set of compounds known to be inactive. Introductory accounts of LDA are given by Kachigan [38] and McFarland and Gans [39]. The selection of the descriptors is based on the Fisher-Snedecor parameter, and the classification criterion is the shortest Mahalanobis distance (i.e., the distance of each case from the mean of all cases used in the regression equation). The variables used in computing the linear classification functions are chosen stepwise: at each step, either the variable that adds the most to the separation of the groups is entered into the discriminant function, or the variable that adds the least is removed from it. The quality of the discriminant function is evaluated by Wilks' λ, a multivariate analysis of variance parameter that tests the equality of group means for the variable(s) in the discriminant function. Minimization of Wilks' λ guides the selection of the predictors to be entered into or deleted from the discriminant function [40]. The technique is described in detail by Tabachnick and Fidell [41]. The discriminant ability of the selected function is assessed by:


- The Classification matrix, in which each case is classified into a group according to the classification function. The number of cases classified into each group and the percent of correct classifications are shown.

- The Jack-knifed classification matrix, in which each case is classified into a group according to the classification functions computed from all the data except the case being classified.

- Cross validation with random sub-samples and classifying new cases. Here, the cases in each group are randomly subdivided into two separate sets, the first of which is then used to estimate the classification function, and the second of which is classified according to the function. By observing the proportion of correct classifications produced for the second set, one obtains an empirical measure for the success of the discrimination.

- Use of an external test set, i.e., a set of compounds not used in deriving the model, to check the validity of the selected discriminant functions.
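The classification and jack-knifed classification steps listed above can be sketched for the two-group case as follows. This is an illustration only, under several assumptions: a single pooled covariance matrix and equal prior probabilities, so that the sign of the discriminant function DF is equivalent to choosing the group with the shortest Mahalanobis distance; a fixed set of descriptors (no stepwise selection and no Wilks' λ); and made-up descriptor values. All names are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def mean_vector(rows):
    return [sum(r[k] for r in rows) / len(rows) for k in range(len(rows[0]))]

def pooled_covariance(groups):
    """Within-group covariance matrix pooled over all groups."""
    p = len(groups[0][0])
    s = [[0.0] * p for _ in range(p)]
    dof = 0
    for rows in groups:
        mu = mean_vector(rows)
        dof += len(rows) - 1
        for r in rows:
            for i in range(p):
                for j in range(p):
                    s[i][j] += (r[i] - mu[i]) * (r[j] - mu[j])
    return [[v / dof for v in row] for row in s]

def discriminant(active, inactive):
    """Return (w, c) with DF(x) = w.x + c; DF > 0 means nearer the active mean."""
    mu_a, mu_i = mean_vector(active), mean_vector(inactive)
    s = pooled_covariance([active, inactive])
    w = solve(s, [a - b for a, b in zip(mu_a, mu_i)])
    c = -sum(wk * (a + b) / 2.0 for wk, a, b in zip(w, mu_a, mu_i))
    return w, c

def classify(model, x):
    w, c = model
    return "A" if sum(wk * xk for wk, xk in zip(w, x)) + c > 0 else "I"

def jackknife(active, inactive):
    """Fraction of cases classified correctly by a function computed without them."""
    correct = 0
    for i, x in enumerate(active):
        correct += classify(discriminant(active[:i] + active[i + 1:], inactive), x) == "A"
    for i, x in enumerate(inactive):
        correct += classify(discriminant(active, inactive[:i] + inactive[i + 1:]), x) == "I"
    return correct / (len(active) + len(inactive))

# Hypothetical descriptor values (two descriptors per compound).
ACTIVE = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.5], [1.2, 2.2], [1.8, 2.1]]
INACTIVE = [[4.0, 0.5], [4.5, 1.0], [5.0, 0.8], [4.2, 0.3], [4.8, 0.9]]

model = discriminant(ACTIVE, INACTIVE)
print("classification:", [classify(model, x) for x in ACTIVE + INACTIVE])
print("jack-knifed fraction correct:", jackknife(ACTIVE, INACTIVE))
```

The external test set check corresponds to calling `classify(model, x)` on compounds that were never passed to `discriminant`.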

LDA applications

Inhibition of Trypanosoma cruzi hexokinase by bisphosphonates

American trypanosomiasis, or Chagas disease, is a pathology whose causative agent is the protozoan parasite Trypanosoma cruzi. The infection is transmitted by blood-feeding hemipteran insect vectors of the subfamily Triatominae and affects humans as well as domestic and wild animals. An estimated 18 million people are infected and one hundred million are at high risk of infection in fifteen Latin American countries, from Mexico to Argentina [42]. At present the therapy for this pathology is based on two drugs, nifurtimox and benznidazole, which have a hazardous level of toxicity and are only effective in the acute stage of the disease. For this reason it is necessary to find new drugs against this disease.

Hexokinase is the first enzyme involved in glycolysis in most organisms, including the etiological agents of Chagas disease (Trypanosoma cruzi) and African sleeping sickness (Trypanosoma brucei). Recent studies have shown that bisphosphonate analogues are potent inhibitors of T. cruzi hexokinase, which may therefore represent a novel target for finding new compounds active against T. cruzi [43].

In this work, the inhibition of T. cruzi hexokinase, TcHK, by a group of bisphosphonate derivatives was investigated in order to obtain a predictive QSAR model using molecular topology and linear discriminant analysis. A group of 42 bisphosphonate derivatives that inhibit TcHK was selected.


Tables 5 and 6 show the structures and the inhibitory activity, expressed as IC50 (μM) values obtained in the in vitro assays, for each compound as reported in [43, 44]. To obtain the discriminant function, we applied LDA to a training group of 35 compounds and validated it on a test group of 7 compounds. The training series comprises two subgroups: an active group (compounds with values of IC50<20μM) and an inactive group (compounds

Table 5. Structures of the bisphosphonates studied, in order of decreasing potency in TcHK inhibition.

[Structure drawings of compounds 4-45; the diagrams are not recoverable from the extracted text.]


Table 6. Results of the prediction obtained by linear discriminant analysis, with IC50 values for TcHK inhibition by the bisphosphonate derivatives analysed.

| Compound | IC50 (μM) | 1χ | G4v | J2v | J5v | DF | Class. |
|---|---|---|---|---|---|---|---|
| Active group, training (IC50 < 20 μM) | | | | | | | |
| 5 | 0.81 | 9.847 | 1.869 | 0.252 | 0.055 | 1.19 | A |
| 6 | 0.95 | 6.661 | 0.782 | 0.187 | 0.029 | 2.25 | A |
| 7 | 1.45 | 9.721 | 1.278 | 0.193 | 0.045 | 3.28 | A |
| 8 | 1.82 | 8.500 | 2.024 | 0.383 | 0.045 | -1.85 | I |
| 9 | 1.95 | 8.321 | 0.959 | 0.197 | 0.038 | 2.23 | A |
| 10 | 2.29 | 10.761 | 2.222 | 0.321 | 0.045 | 3.07 | A |
| 11 | 2.29 | 9.933 | 1.884 | 0.326 | 0.046 | 0.88 | A |
| 12 | 2.4 | 7.604 | 0.776 | 0.171 | 0.035 | 2.41 | A |
| 13 | 2.75 | 8.561 | 1.126 | 0.191 | 0.040 | 2.81 | A |
| 14 | 2.75 | 9.149 | 1.286 | 0.206 | 0.040 | 3.31 | A |
| 15 | 3.47 | 8.821 | 0.967 | 0.254 | 0.038 | 1.04 | A |
| 16 | 3.63 | 8.738 | 1.267 | 0.212 | 0.039 | 2.91 | A |
| 17 | 4.07 | 7.721 | 1.248 | 0.230 | 0.037 | 1.67 | A |
| 18 | 12.6 | 6.618 | 0.813 | 0.338 | 0.036 | -4.08 | I |
| 19 | 14.8 | 7.221 | 0.976 | 0.230 | 0.031 | 1.56 | A |
| Inactive group, training (IC50 > 20 μM) | | | | | | | |
| 4 | 300 | 6.618 | 0.804 | 0.324 | 0.028 | -1.87 | I |
| 22 | 300 | 7.972 | 1.906 | 0.360 | 0.058 | -4.59 | I |
| 23 | 300 | 6.689 | 0.808 | 0.208 | 0.039 | -0.45 | I |
| 24 | 300 | 8.209 | 1.149 | 0.313 | 0.043 | -2.13 | I |
| 26 | 300 | 8.321 | 1.149 | 0.295 | 0.040 | -0.82 | I |
| 27 | 300 | 9.100 | 1.495 | 0.334 | 0.042 | -0.52 | I |
| 28 | 300 | 8.732 | 1.365 | 0.265 | 0.044 | 0.55 | A |
| 29 | 300 | 9.557 | 2.080 | 0.403 | 0.041 | -0.43 | I |
| 30 | 300 | 9.442 | 1.718 | 0.172 | 0.086 | -3.47 | I |
| 31 | 300 | 8.689 | 1.858 | 0.338 | 0.043 | -0.34 | I |
| 32 | 300 | 6.618 | 1.222 | 0.368 | 0.031 | -2.83 | I |
| 33 | 300 | 7.833 | 1.343 | 0.321 | 0.038 | -1.14 | I |
| 35 | 300 | 7.911 | 1.590 | 0.224 | 0.061 | -1.93 | I |
| 36 | 300 | 7.077 | 0.833 | 0.218 | 0.039 | -0.30 | I |
| 37 | 300 | 7.121 | 2.162 | 0.255 | 0.058 | -1.26 | I |
| 38 | 300 | 9.284 | 1.276 | 0.268 | 0.050 | -0.55 | I |
| 39 | 300 | 7.221 | 1.071 | 0.258 | 0.034 | 0.29 | A |
| 40 | 300 | 9.512 | 1.594 | 0.261 | 0.068 | -3.00 | I |
| 42 | 300 | 7.118 | 0.677 | 0.290 | 0.030 | -1.11 | I |
| 43 | 300 | 5.693 | 0.990 | 0.234 | 0.053 | -4.55 | I |
| Test group | | | | | | | |
| 20 | 34.7 | 8.232 | 0.957 | 0.233 | 0.030 | 2.69 | A |
| 21 | 45.7 | 6.667 | 0.692 | 0.349 | 0.067 | -11.10 | I |
| 25 | 300 | 8.735 | 2.027 | 0.392 | 0.059 | -4.83 | I |
| 34 | 300 | 8.581 | 2.115 | 0.305 | 0.059 | -1.83 | I |
| 41 | 300 | 7.141 | 2.180 | 0.241 | 0.081 | -5.51 | I |
| 44 | 300 | 5.710 | 0.913 | 0.233 | 0.047 | -3.53 | I |
| 45 | 300 | 8.078 | 1.551 | 0.180 | 0.080 | -4.33 | I |

The test series consists of compounds with IC50 > 20 μM selected at random from the inactive group. The discriminant function selected was:

$$DF = 5.261 + 1.013\,{}^{1}\chi + 3.011\,G_4^{v} - 32.49\,J_2^{v} - 207.4\,J_5^{v} \qquad \text{(Eq. 9)}$$

N = 35, F = 5.90, λ (Wilks' lambda) = 0.556

A compound is selected as a potential TcHK inhibitor if DF > 0; otherwise it is classified as inactive. The classification is good for the training set: 86.7% of the active group (13 of 15) and 90.0% of the inactive group (18 of 20) are correctly classified (see Table 6). A simple way to assess the quality of the selected discriminant function is to apply it to an external test group. Here we used 7 compounds that had not been included in the discriminant analysis, all with IC50 > 20 μM. Table 6 shows the result for each compound: all except compound 20 are correctly classified as inactive (DF < 0). For more details see ref. [45].
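As a worked illustration of how Eq. 9 is applied, the following minimal Python sketch evaluates DF for four compounds using the descriptor values of Table 6 and assigns the class with the DF > 0 rule. The dictionary keys and function names are illustrative and do not come from the original work. Because the tabulated descriptors are rounded to three decimals, the recomputed DF values agree with the tabulated ones only to within about 0.1.

```python
# Eq. 9: DF = 5.261 + 1.013*chi1 + 3.011*G4v - 32.49*J2v - 207.4*J5v
# (key names chi1, G4v, J2v, J5v are illustrative labels for the descriptors)
INTERCEPT = 5.261
COEFFS = {"chi1": 1.013, "G4v": 3.011, "J2v": -32.49, "J5v": -207.4}


def discriminant(descriptors):
    """Return DF (Eq. 9) for a dict of topological descriptor values."""
    return INTERCEPT + sum(c * descriptors[name] for name, c in COEFFS.items())


def classify(df):
    """Rule stated with Eq. 9: active (A) if DF > 0, otherwise inactive (I)."""
    return "A" if df > 0 else "I"


# Descriptor values and tabulated DF copied from Table 6.
TABLE6_ROWS = {
    5: ({"chi1": 9.847, "G4v": 1.869, "J2v": 0.252, "J5v": 0.055}, 1.19),
    8: ({"chi1": 8.500, "G4v": 2.024, "J2v": 0.383, "J5v": 0.045}, -1.85),
    4: ({"chi1": 6.618, "G4v": 0.804, "J2v": 0.324, "J5v": 0.028}, -1.87),
    20: ({"chi1": 8.232, "G4v": 0.957, "J2v": 0.233, "J5v": 0.030}, 2.69),
}

results = {}
for compound, (desc, df_table) in TABLE6_ROWS.items():
    df = discriminant(desc)
    results[compound] = (df, classify(df))
    print(f"compound {compound}: DF = {df:.2f} "
          f"(Table 6: {df_table:.2f}) -> {classify(df)}")
```

Compound 8 is experimentally active yet falls below the cut-off, and compound 20 is inactive yet falls above it; these are two of the misclassifications visible in Table 6.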


Inhibition of parasitic activity against Leishmania donovani

Leishmaniasis is a parasitic disease caused by protozoa of the genus Leishmania. It is transmitted by the bite of sandflies of the subfamily Phlebotominae; the main reservoirs are wild or domestic mammals, including humans for some species such as L. donovani and L. tropica [46]. The disease is endemic in 88 countries on four continents. It is a serious public health problem affecting 12 million people, with an incidence of 1.5-2 million new cases per year [47]. The WHO considers leishmaniasis one of the orphan diseases covered by the TDR programme (Special Programme for Research and Training in Tropical Diseases) [48].

Here we focus on a set of dinitrobenzene sulphonamides, all derivatives of the herbicide oryzalin. These compounds show proven anti-Leishmania activity: they inhibit tubulin polymerization in purified parasites and thereby arrest parasite growth in the G2/M phase of the cell cycle [49].

The study used discriminant analysis to obtain a topological model able to classify a compound correctly as active or inactive against Leishmania donovani. Once built, such a model can be applied to the search for new potentially active compounds. A set of 57 compounds was used: 37 were 3,5-dinitrobenzene derivatives and 20 were selected from the Maybridge Organics Compounds database [50], following the results of Werbovetz and co-workers [49]. Table 7 shows the chemical structure and the antiparasitic activity, expressed as IC50 (μM), of the selected compounds. To obtain the discriminant function, LDA was applied to a training set comprising all the active compounds (IC50 < 20 μM) plus 80% of the inactive compounds (IC50 > 20 μM). The remaining 20% of the molecules served as an external validation set to check the performance of the chosen discriminant function, DF:

$$DF = -7.74 - 5.32\,{}^{4}\chi_{pc} - 0.542\,G_1^{v} + 5.11\,G_3 + 5.16\,G_4^{v} + 18.52\,J_2^{v} - 141.58\,J_3^{v} + 2.51\,{}^{3}D_c \qquad \text{(Eq. 10)}$$

N = 48, F = 8.97, λ (Wilks' lambda) = 0.389

According to this equation, a compound is classified as active if DF > 0 and as inactive otherwise. The resulting classification matrix is very significant because 100% of the active compounds are correctly


Table 7. Chemical structure of the studied compounds and results of their classification by activity against L. donovani (A = active; I = inactive).

[Scaffold drawing: a ring bearing an NRR group, two NO2 groups and a substituent X. Structure drawings are not recoverable from the extracted text. Entries marked (drawn) were given only as drawings; where a drawn substituent carried text labels, the labels are listed as extracted.]

| Compound | R | X | IC50 exp (μM)a | DFb | Class. |
|---|---|---|---|---|---|
| Active group, training (IC50 < 20 μM) | | | | | |
| 46 | (drawn) | (drawn) | 0.5 | 0.87 | A |
| 48 | (drawn) | (drawn) | 2.3 | 5.23 | A |
| 33 | n-propyl | SO2HN, F | 2.5 | 3.09 | A |
| 11 | n-butyl | SO2HN | 2.6 | 3.66 | A |
| 02 | n-propyl | SO2HN, Cl, Cl | 3.7 | 5.72 | A |
| 01 | n-propyl | SO2HN | 5 | 1.42 | A |
| 32 | n-propyl | SO2HN, Cl, Cl | 5 | 4.55 | A |
| 17 | n-propyl | SO2HN, Cl | 5.5 | 3.53 | A |
| 12 | n-butyl | SO2HN, F | 5.6 | 4.69 | A |
| 23 | n-butyl | SO2HN, F | 5.7 | 5.29 | A |
| 30 | n-propyl | SO2HN, OCH3 | 8.1 | 2.69 | A |
| 36 | n-pentyl | SO2NH2 | 9 | 2.9 | A |
| 35 | n-ethyl | SO2HN | 11 | 1.32 | A |
| 37 | n-hexyl | SO2NH2 | 12 | 3.83 | A |
| 16 | n-propyl | SO2HN, Cl | 13 | 3.34 | A |
| 26 | n-butyl | SO2NH2 | 20 | 1.82 | A |
| Inactive group, training (IC50 > 20 μM) | | | | | |
| 04 | n-propyl | SO2N(Me) | 21 | -6.38 | I |
| 44 | (drawn) | (drawn) | 25 | -3.34 | I |
| 05 | n-propyl | SO2N(CH2CH3)2 | 27 | -7.46 | I |
| 15 | n-propyl | SO2HN, CH3 | 32 | 2.88 | I |
| 29 | n-propyl | SO2HN, OCH3 | 32 | 1.83 | I |
| 45 | (drawn) | (drawn) | 32 | -5.88 | I |
| 54 | (drawn) | (drawn) | 37 | -5.93 | I |
| 55 | (drawn) | (drawn) | 39 | -4.74 | I |
| 09 | n-propyl | SO2HN | 43 | -3.97 | I |
| 19 | n-propyl | SO2NH(CH2)4CH3 | 43 | -1.21 | I |
| 22 | (drawn) | (drawn) | 43 | -6.71 | I |
| 08 | n-propyl | SO2NH(CH2)3CH3 | 50 | -1.89 | I |
| 10 | n-propyl | SO2HN | 50 | -1.22 | I |
| 06 | n-propyl | SO2NH(CH2)2CH3 | 54 | -3.25 | I |
| 07 | n-propyl | SO2N(CH2CH2CH3)2 | 55 | -6.14 | I |
| 03 | n-propyl | SO2HN, N | 60 | 1.35 | I |
| Oryzalin | n-propyl | SO2NH2 | 65 | -0.7 | I |
| 43 | (drawn) | (drawn) | 66 | -4.05 | I |
| 24 | H, n-propyl | SO2NH2 | 67 | -1.28 | I |
| 25 | n-ethyl | SO2NH2 | 69 | -0.48 | I |
| 53 | (drawn) | (drawn) | 80 | -3.34 | I |
| 27 | (drawn) | (drawn) | 90 | -9.36 | I |
| 57 | (drawn) | (drawn) | 98 | -3.17 | I |
| 13 | n-propyl | C N | >100 | -4.55 | I |
| 14 | n-propyl | N, C, OH, NH2 | >100 | -1.45 | I |
| 34 | n-propyl | SO2N | >100 | -0.25 | I |
| 38 | (drawn) | (drawn) | >100 | -7.15 | I |
| 39 | (drawn) | (drawn) | >100 | -1.28 | I |
| 40 | (drawn) | (drawn) | >100 | -4.54 | I |
| 49 | (drawn) | (drawn) | >100 | -3.31 | I |
| 50 | (drawn) | (drawn) | >100 | -3.05 | I |
| 51 | (drawn) | (drawn) | >100 | -8.57 | I |
| Test group | | | | | |
| 42 | (drawn) | (drawn) | 21 | -9.01 | I |
| 31 | n-propyl | SO2HN, CH3 | 23 | 3.68 | A |
| 20 | n-propyl | SO2NH(CH2)5CH3 | 26 | -0.58 | I |
| 21 | n-propyl | SO2N | 47 | -8.72 | I |
| 18 | n-propyl | SO2N((CH2)3CH3)2 | 48 | -3.39 | I |
| 56 | (drawn) | (drawn) | 60 | -9.48 | I |
| 41 | (drawn) | (drawn) | 72 | -14.41 | I |
| 28 | n-propyl | CONH2 | 76 | -3.23 | I |
| 47 | (drawn) | (drawn) | >100 | -9.33 | I |
| 52 | (drawn) | (drawn) | >100 | -8.90 | I |

a Values of IC50 (μM) from reference [49]. b Values of the discriminant function, Eq. 10.

classified, whereas 90.6% of the inactive compounds (29 of 32) are also correctly classified. Overall, the rate of correct classification is 93.8%. To check how well the equation identifies novel structures, an external validation is necessary. For this purpose, a set of 10 randomly selected compounds not included in the training set, all with IC50 values above 20 μM, was evaluated with the model. As shown in Table 7 (column 5), all compounds except no. 31 are correctly classified as inactive, a success rate of 90%. For more details see ref. [34].

2.3. Molecular topology and drug design

The use of molecular topology as a tool for the design and selection of new drugs has been the main objective of our group for over twenty years [51-54]. This work has led to many new active compounds in different therapeutic areas, as illustrated in Table 8. Details of the assays and protocols can be found in the references cited there. Some of these compounds can be regarded as new leads and serve as starting points for the design of new, more effective drugs.

Table 8. New biological activities discovered through virtual screening. For details see the references in the last column.

| Activity found | Selected drugs | Ref. |
|---|---|---|
| Cytostatic | 6-Azauridine, quinine | [55] |
| Antibacterial | 1-Chloro-2,4-dinitrobenzene, 3-Chloro-5-nitroindazole, 1-Phenyl-3-methyl-2-pyrazolin-5-one, neohesperidin, amaranth, mordant brown 24, hesperidin, morine, niflumic acid, silymarine, fraxine | [56,57] |
| Antifungal | Neotetrazolium chloride, benzotropine mesilate, 3-(2-Bromoethyl)-indole, 1-Chloro-2,4-dinitrobenzene | [58] |
| Hypoglycaemic | 3-Hydroxybutyl acetate, 4-(3-Methyl-5-oxo-2-pyrazolin-1-yl) benzoic acid, 1-(Mesitylene-2-sulfonyl) 1H-1,2,3-triazole | [59] |
| Antivirals (anti-Herpes) | 3,5-dimethyl-4-nitroisoxazole, nitrofurantoin, (pyrrolidinocarbonylmethyl)piperazine, nebularine, cordycepin, adipic acid, thymidine, α-thymidine, inosine, 2,4-diamino-6-(hydroxymethyl)pteridine, 7-(carboxymethoxy)-4-methylcoumarin, 5-methylcytidine | [60] |
| Antineoplastic | Carminic acid, tetracycline, piromidic acid, doxycycline | [61] |
| Antimalarial | Monensin, nigericin, vinblastine, vincristine, vindesine, ethylhydrocupreine, quinacrine, salinomycin | [62] |
| Antitoxoplasma | Cefamandole nafate, Prazosin, Andrographolide, Dibenzothiophene sulfone, 2-Acetamido-4-methyl-5 thiazolesulfonyl chloride | [63] |
| Antihistaminic | Benzydamine, 4-(1-Butylpentyl)pyridine, N-(3-Bromopropyl)phthalimide, N-(3-Chloropropyl)phthalimide, N-(3-Chloropropyl)piperidine hydrochloride, 5-Bromoindole | [64] |
| Bronchodilator | Griseofulvin, anthrarobin, 9,10-Dihydro-2-methyl-4H-benzo [5,6] cyclohept [1,2-d] oxazol-4-ol, 2-Aminothiazole, Maltol, esculetin, fisetin, hesperetin, 4-methyl-umbellipheryl-4-guanidine benzoate | [65] |
| Analgesics | 2-(1-propenyl)phenol, 2',4' dimethylacetophenone, p-chlorobenzohydrazide, 1-(p-chlorophenyl) propanol, 4-benzoyl-3-methyl-1-phenyl-2-pyrazolin-5-one | [66,67] |


The overall process of topology-driven drug design can be summarized in the following steps:
- Step I: Selection of the therapeutic group and of its most representative drugs (training set).
- Step II: Search for the physicochemical, biological and pharmacological information on every drug in the training set.
- Step III: Calculation of suitable topological descriptors.
- Step IV: Derivation of the QSAR with different statistical techniques, such as multilinear regression (MLR), linear discriminant analysis (LDA), neural networks (NN), etc.
- Step V: Selection of the best topological model, which usually comprises both predictive equations and discriminant functions.
- Step VI: Application of the model to the search for new drugs through database screening, chemical libraries from combinatorial chemistry or, if necessary, de novo design.
- Step VII: Finally, laboratory testing of the selected compounds to confirm their predicted activity. The outcome is typically used to refine the candidate molecules further until the goals are met.
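Step III can be illustrated with the first-order connectivity index 1χ, which appears in Eq. 9. The sketch below computes the Randić form of this index [23] from a hydrogen-suppressed molecular graph given as a list of bonds. The function name and the two alkane examples are illustrative choices, and the valence-corrected and charge indices used elsewhere in this chapter are not covered.

```python
from collections import Counter
from math import sqrt


def chi1(bonds):
    """First-order connectivity (Randic) index of a hydrogen-suppressed graph.

    bonds: iterable of (i, j) atom-label pairs, one per bond between
    non-hydrogen atoms. Each bond contributes 1/sqrt(deg_i * deg_j),
    where deg is the number of non-hydrogen neighbours of the atom.
    """
    bonds = list(bonds)
    degree = Counter()
    for i, j in bonds:
        degree[i] += 1
        degree[j] += 1
    return sum(1.0 / sqrt(degree[i] * degree[j]) for i, j in bonds)


# Carbon skeletons of two C4H10 isomers (atom labels are arbitrary).
n_butane = [(1, 2), (2, 3), (3, 4)]
isobutane = [(1, 2), (2, 3), (2, 4)]

print(round(chi1(n_butane), 3))   # 1.914
print(round(chi1(isobutane), 3))  # 1.732
```

The two isomers have the same formula but different index values, which is the sense in which such descriptors encode branching.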

As an example, the most significant results recently achieved for antineoplastic and antimalarial agents are outlined below.

Antineoplastic agents

Particularly relevant in this field is the design of a novel lead compound named MT477 (see Figure 7), which has shown very potent activity in vivo against human cell carcinoma [68]. MT477 is a novel thiopyrano[2,3-c]quinoline identified by molecular topology screening as a potential anticancer drug with high activity against protein kinase C (PKC) isoforms. The objective of the study was to determine the mechanism of action of MT477 and its activity against human cancer cell lines. MT477 interfered with PKC activity and with the phosphorylation of Ras and ERK1/2 in H226 human lung carcinoma cells. It also induced poly-caspase-dependent apoptosis.

Figure 7. Chemical structure of MT477. The chemical name of MT477 is dimethyl 5,6-dihydro-7-methoxy-5,5-dimethyl-6-(2-(2,5-dioxopyrrolidin-1-yl)acetyl)-1H-1-(4,5-dimethoxycarbonyl-1,3-dithiolo-2-spiro)thiopyrano[2,3-c]quinoline-2,3-dicarboxylate.

Antimalarial agents against liver stages of Plasmodium

Each year, the malaria parasite Plasmodium falciparum infects 300 to 660 million people worldwide and causes several million deaths [69]. New antimalarial drugs are urgently needed, especially given the increasing prevalence of drug-resistant P. falciparum strains and the lack of effective vaccines and vector control measures. The Plasmodium liver stage is an interesting drug target because it precedes the emergence of the blood stages that cause the symptoms and complications of malaria. Drugs that inhibit parasite maturation within hepatocytes could be used for short-term prophylaxis in endemic areas (for refugees, travelers, etc.).

We conducted a quantitative structure-activity relationship (QSAR) study based on a database of 127 compounds previously tested against the liver stage of Plasmodium yoelii, in order to develop a model capable of predicting the in vitro antimalarial activity of new compounds. Topological indices were used as structural descriptors, and their relation to antimalarial activity was determined by linear discriminant analysis. A topological model consisting of two discriminant functions, DF1 and DF2, was obtained:

$$DF_1 = 3.17 + 1.00\,{}^{0}\chi^{v} + 1.28\,{}^{1}\chi^{v} - 14.04\,J_1^{v} - 22.94\,J_3^{v} + 96.23\,J_4^{v} - 65.98\,J_5 + 1.88\,{}^{0}D - 23.53\,{}^{4}D_c + 0.29\,{}^{4}C_c - 0.51\,\mathrm{PR3} + 0.38\,\mathrm{V3} \qquad \text{(Eq. 11)}$$

N = 76, F = 7.95, λ = 0.42

$$DF_2 = 81.90 - 3.45\,{}^{4}\chi_p^{v} + 7.41\,G_5^{v} - 70.21\,{}^{1}C - 1.44\,\mathrm{PR1} - 1.21\,\mathrm{V3} + 2.44\,\mathrm{V4} \qquad \text{(Eq. 12)}$$

N = 28, F = 12.4, λ = 0.21
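The N, F and λ values quoted with Eqs. 9-12 can be cross-checked against each other. For a two-group discriminant analysis with p descriptors and N compounds, Wilks' lambda and the overall F statistic are related by F = [(1 - λ)/λ]·[(N - p - 1)/p]. The sketch below applies this standard relation under two assumptions that the chapter does not state explicitly: that the reported F is this overall statistic, and that p is the number of descriptors appearing in each equation. Small deviations are expected because λ is quoted to only two or three figures.

```python
def f_from_wilks(lam, n, p):
    """Overall F for a two-group discriminant function from Wilks' lambda.

    lam: Wilks' lambda, n: number of compounds, p: number of descriptors.
    The statistic has (p, n - p - 1) degrees of freedom.
    """
    return (1.0 - lam) / lam * (n - p - 1) / p


# (lambda, N, number of descriptors counted in the equation, reported F)
REPORTED = {
    "Eq. 9": (0.556, 35, 4, 5.90),
    "Eq. 10": (0.389, 48, 7, 8.97),
    "Eq. 11": (0.42, 76, 11, 7.95),
    "Eq. 12": (0.21, 28, 6, 12.4),
}

recomputed = {}
for label, (lam, n, p, f_reported) in REPORTED.items():
    f = f_from_wilks(lam, n, p)
    recomputed[label] = f
    rel = abs(f - f_reported) / f_reported
    print(f"{label}: F from lambda = {f:.2f}, reported = {f_reported}, "
          f"relative difference = {rel:.1%}")
```

Under these assumptions the recomputed values fall within a few percent of the reported ones, with the largest gap for Eq. 12, where λ is given to two figures only.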


The first function, DF1, discriminates between active and inactive compounds, and the second, DF2, identifies the most active among the active compounds. The model was then applied sequentially to a large database of compounds with unknown activity against liver stages of Plasmodium. Seventeen drugs predicted to be either active or inactive were selected for in vitro testing against the hepatic stage of P. yoelii (see Table 9). Antiretroviral, antifungal, and cardiotonic drugs were found to be highly active (nanomolar 50% inhibitory concentrations), and two ionophores completely inhibited parasite development. The 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide (MTT) assay was performed on hepatocyte cultures for all compounds, and none was toxic in vitro. For both ionophores, the same in vitro assay used for P. yoelii confirmed their activity against Plasmodium falciparum. For more details see ref. [70].
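The sequential use of the two functions can be sketched as a two-stage filter. The cut-off of zero for both DF1 and DF2 is an assumption carried over from the DF > 0 rule of Eqs. 9 and 10, since this section does not state the cut-offs. The "non classified" ranges that appear in Table 9 are not modelled, and the function name is illustrative. The DF values are those listed for three drugs in Table 9.

```python
def screen(df1, df2=None):
    """Two-stage label: 'I' (inactive), 'A' (active) or 'HA' (highly active).

    Assumed cut-offs: DF1 > 0 for activity, DF2 > 0 for high activity.
    """
    if df1 <= 0:
        return "I"   # rejected at the first stage; DF2 is not evaluated
    if df2 is not None and df2 > 0:
        return "HA"  # passes both stages
    return "A"       # active by DF1 but not singled out by DF2


# (DF1, DF2) pairs as listed in Table 9; DF2 is not given for inactive drugs.
candidates = {
    "Mibefradil": (7.59, 6.66),
    "Ritonavir": (8.80, -5.38),
    "Fenbendazole": (-3.33, None),
}

labels = {name: screen(df1, df2) for name, (df1, df2) in candidates.items()}
for name, label in labels.items():
    print(f"{name}: {label}")
```

Applying the cheaper active/inactive filter first means the second function is evaluated only for compounds that pass it, which matters when the screened database is large.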

Table 9. Predicted drug activity on the liver stage of Plasmodium yoelii yoelii.

| Drug (therapeutic category) | DF1 (Class) | DF2 (Class) | IC50 exp (nM) |
|---|---|---|---|
| Active drugs | | | |
| Monensin (Antibacterial/Ionophore) | 4.22 (A) | -21.62 (NC) | < 10^-3 |
| Nigericin (Ionophore) | 3.36 (A) | -24.22 (NC) | < 10^-3 |
| Delavirdine (Antiviral) | -0.21 (NC) | 6.75 (HA) | 0.846 |
| Mibefradil (Antihypertensive) | 7.59 (A) | 6.66 (HA) | 0.873 |
| Licochalcone A (Estrogenic flavonoid) | 8.26 (A) | 7.99 (HA) | 0.927 |
| Miconazole (Antifungal) | 1.46 (A) | 1.77 (HA) | 2.03 |
| Dobutamine (Cardiotonic) | 5.74 (A) | 1.52 (HA) | 3.7 |
| Ritonavir (Antiviral) | 8.80 (A) | -5.38 (A) | 34.2 |
| Saquinavir (Antiviral) | 9.14 (A) | -6.76 (A) | 35.2 |
| Epoxomicin (Antineoplastic) | 8.17 (A) | 7.92 (HA) | 3.95 x 10^3 |
| Indinavir (Antiviral) | 8.89 (A) | -10.55 (A) | 5 x 10^3 |
| Vinblastine (Antineoplastic) | 1.21 (A) | 38.71 (NC) | 7.95 x 10^3 |
| Nordihydroguaiaretic acid (Antineoplastic) | 9.03 (A) | 2.88 (HA) | 3 x 10^4 |
| Inactive drugs | | | |
| Fenbendazole (Anthelminthic) | -3.33 (I) | / | 3 x 10^4 |
| Quinacrine (Anthelminthic/Antimalarial) | -0.93 (I) | / | 3 x 10^4 |
| Rimantadine (Antiviral) | -2.69 (I) | / | 35.6 |
| Thiophanate (Anthelminthic) | -4.04 (I) | / | 3 x 10^4 |
| Reference drugs | | | |
| Atovaquone (ref. drug) | 9.37 (A) | 3.63 (HA) | 57 |
| Primaquine (ref. drug) | 0.91 (A) | 1.79 (HA) | 75.7 |

a A = active; I = inactive; HA = highly active; NC = non classified.

3. Conclusions

The results outlined here show that QSAR based on molecular topology (MT) has become a powerful tool for the prediction of properties and for the design and selection of novel drugs. Furthermore, the fact that MT is a purely mathematical description of molecular structure, independent of any geometrical or physical profile, is an important asset of the approach. Why MT works so effectively remains an open question and is a worthwhile challenge for the near future.

4. Acknowledgements

The authors acknowledge financial support from the Fondo de Investigación Sanitaria, Ministerio de Sanidad, Spain (project: SAF2005-PI052128). We also thank Prof. Eduardo Castro for his help in our research and his contributions to the field.

5. References

1. Hansch, C., Fujita, T., 1964, J. Am. Chem. Soc., 86, 1616.
2. Free, S.M., Wilson, J.W., 1964, J. Med. Chem., 7, 395.
3. Cramer, R.D., Patterson, D.E., Bunce, J.D., 1988, J. Am. Chem. Soc., 110, 5959.
4. Güner, O.F., 2002, Curr. Top. Med. Chem., 2, 1321.
5. Klebe, G., Abraham, U., Mietzner, T., 1994, J. Med. Chem., 37, 4130.
6. Robinson, D.D., Winn, P.J., Lyne, P.D., Richards, W.G., 1999, J. Med. Chem., 42, 573.
7. Kier, L.B., Hall, L.H., 1976, Molecular Connectivity in Chemistry and Drug Research, Academic Press, London.
8. Devillers, J., 2000, Current Opinion in Drug Discovery and Development, 3(3), 275.
9. Diudea, M.V., Florescu, M.S., Khadikar, P.V., 2006, Molecular Topology and its Applications, EfiCon Press, Bucharest.
10. Pogliani, L., 2000, Chem. Rev., 100, 3827.
11. Ivanciuc, O., Balaban, A.T., 1998, Tetrahedron, 54, 9129.
12. Hosoya, H., Gotoh, M., Murakami, M., Ikeda, S., 1999, J. Chem. Inf. Comput. Sci., 39, 192.
13. Garcia-Domenech, R., Galvez, J., de Julian-Ortiz, J.V., Pogliani, L., 2008, Chem. Rev., 108(3), 1127.
14. Basak, S.C., Mills, D., Gute, B.D., Natarajan, R., 2006, Topics in Heterocyclic Chemistry, 3, 39.
15. Balaban, A.T., Motoc, I., Bonchev, D., Mekenyan, O., 1983, Topics in Current Chemistry, 114, 21.
16. Estrada, E., Uriarte, E., 2001, Current Medicinal Chemistry, 8(13), 1573.
17. Marrero-Ponce, Y., 2004, Bioorganic & Medicinal Chemistry, 12(24), 6351.
18. Dudek, A.Z., Arodz, T., Galvez, J., 2006, Combinatorial Chemistry & High Throughput Screening, 9(3), 213.
19. Galvez, J., Garcia-Domenech, R., 1994, Farmaindustria, 357.
20. Gálvez, J., García-Domenech, R., Julián-Ortiz, J.V. de, Soler, R., 1995, J. Chem. Inf. Comput. Sci., 35, 272.
21. Wiener, H., 1947, J. Am. Chem. Soc., 69, 17.
22. Kier, L.B., Hall, L.H., 1986, Molecular Connectivity in Structure-Activity Analysis, Wiley, New York.
23. Randić, M., 1975, J. Am. Chem. Soc., 97, 6609.
24. Gálvez, J., García-Domenech, R., Salabert, M.T., Soler, R., 1994, J. Chem. Inf. Comput. Sci., 34, 520.
25. Kier, L.B., Hall, L.M., 1989, Pharm. Res., 6, 497.
26. Furnival, G.M., Wilson, R.W., 1974, Technometrics, 16, 499.
27. Hocking, R.R., 1972, Technometrics, 14, 967.
28. Allen, D.M., 1974, Technometrics, 16, 125.
29. Dixon, W.J., Brown, M.B., Engelman, L., Jennrich, R.I., 1990, BMDP Statistical Software Manual, Vol. I, University of California Press, Berkeley, 3390-4358.

30. Sanchez-Ferrer, A., Rodriguez-Lopez, J.N., García-Cánovas, F., García-Carmona, F., 1995, Biochim. Biophys. Acta, 1247, 1.
31. Chase, M.R., Raina, K., Bruno, J., Sugumaran, M., 2000, Insect Biochem. Molec., 30, 953.
32. Ashida, M., Brey, P.T., 1995, Proc. Natl. Acad. Sci. U.S.A., 92, 10698.
33. García-Domenech, R., Calvo-Chamorro, M.L., Cuervo-Arias, A.Y., Gómez-Sucerquia, L.J., Ortega-Chávez, V., Pérez-Torrado, E., Gálvez, J., 2008, Afinidad, 538, 430.
34. García-Domenech, R., Domingo-Puig, C., Esteve-Martinez, M.A., Schmitt, J., Vera-Martinez, J., Chindemi, A.L., Galvez, J., 2008, Anales de la Real Academia de Farmacia, 74, 345.
35. Garcia-Domenech, R., Lopez-Peña, W., Sanchez-Perdomo, Y., Sanders, J.R., Sierra-Araujo, M.M., Zapata, C., Galvez, J., 2008, International Journal of Pharmaceutics, 363, 78.
36. Garcia-Domenech, R., Alarcon-Elbal, P., Bolas, G., Bueno-Mari, R., Chorda-Olmos, F.A., Delacour, S.A., Mourino, M.C., Vidal, A., Galvez, J., 2007, SAR and QSAR in Environmental Research, 18, 745.
37. García-Domenech, R., Villanueva, A., Gálvez, J., 2008, Organic Chemistry: An Indian Journal,
38. Kachigan, S.L., 1991, Multivariate Statistical Analysis, Radius Press, New York.
39. McFarland, J.W., Gans, D.J., 1986, Journal of Medicinal Chemistry, 30, 46.
40. Wold, S., Eriksson, L., 1995, Statistical validation of QSAR results. In: Van de Waterbeemd, H. (ed.), Chemometric Methods in Molecular Design, VCH, New York.
41. Tabachnick, B.G., Fidell, L.S., 1990, Using Multivariate Statistics, Harper Collins, New York.
42. World Health Organization. Burdens and Trends in Chagas disease. Available from: www.who.int/ctd/chagas/burdens
43. Hudock, M.P., Sanz-Rodriguez, C.E., Song, Y., Chan, J.M., Zhang, Y., Odeh, S., et al., 2006, J. Med. Chem., 49(1), 215.
44. Racagni, G.E., Machado de Doménech, E.E., 1983, Mol. Biochem. Parasitol., 9(2), 181.
45. García-Domenech, R., Espinoza, N., Galarza, R.F., Moreno-Padilla, M.J., Rojas-Ruiz, B., Roldan-Arroyo, LL., Sanchez-Lavado, M.I., Gálvez, J., 2008, Ars Pharmaceutica, 49(3).
46. Cheng, C., 1986, General Parasitology, Academic Press College Division, Orlando.
47. Weekly epidemiological record, WHO, 2002, (44), 77, 365-372.
48. Weekly epidemiological record, WHO, 2002, (25), 77, 205-212.
49. Delfín, D.A., Bhattacharjee, A.K., Yakovich, A.J., Werbovetz, K.A., 2006, J. Med. Chem., 49, 4196.
50. Database: Maybridge Organics Compounds, http://www.maybridge.com
51. Arviza, M.P., 1985, Predicción e interpretación de algunas propiedades fisicoquímicas y biológicas de un grupo de barbitúricos y sulfonamidas por el método de conectividad molecular. Doctoral Thesis, Universidad de Valencia, Spain.
52. Bernal, J., 1988, Desarrollo de un nuevo método de diseño molecular asistido por ordenador. Su aplicación a fármacos betabloqueantes y benzodiazepinas. Doctoral Thesis, Universidad de Valencia, Spain.
53. Gálvez, J., García-Domenech, R., Bernal, J.M., García-March, F.J., 1991, An. Real Acad. Farm., 57, 533.
54. Garcia-Domenech, R., Gálvez, J., Garcia-March, F.J., Moliner, R., 1991, Drug. Invest., 3(5), 344.
55. Gálvez, J., García-Domenech, R., Gómez-Lechón, M.J., Castell, J.V., 2000, J. Mol. Struc. (Theochem), 504, 241.
56. de Gregorio-Alapont, C., García-Domenech, R., Gálvez, J., Ros, M.J., Wolski, S., García, M.D., 2000, Bioorg. Med. Chem. Lett., 10, 2033.
57. Gálvez, J., García-Domenech, R., Gregorio Alapont, C. de, Julián-Ortiz, J.V. de, Salabert-Salvador, M.T., Soler-Roca, R., 1996, in Advances in Molecular Similarity, Vol. I, Carbó-Dorca, R., Mezey, P.G. (eds.), JAI Press Inc., London.
58. Pastor, L., García-Domenech, R., de Gregorio Alapont, C., Gálvez, J., 1998, Bioorg. Med. Chem. Lett., 8, 2577.
59. Antón-Fos, G.M., García-Domenech, R., Pérez-Giménez, F., Peris-Ribera, J.E., García-March, F., Salabert-Salvador, M.T., 1994, Arzneim-Forsch/Drug Res., 44, 821.
60. De Julian-Ortiz, J.V., Galvez, J., Muñoz-Collado, C., Garcia-Domenech, R., Jimeno-Cardona, C., 1999, J. Med. Chem., 42, 3308.
61. Gálvez, J., Gómez-Lechón, M.J., García-Domenech, R., Castell, J.V., 1996, Bioorg. Med. Chem. Lett., 6, 2301.
62. Mahmoudi, N., de Julian-Ortiz, J.V., Ciceron, L., Galvez, J., Mazier, D., Danis, D., Derouin, F., Garcia-Domenech, R., 2006, J. Antimic. Chemother., 57, 489.
63. Gozalbes, R., Gálvez, J., García-Domenech, R., Derouin, F., 1999, SAR QSAR Environ. Res., 10, 47.
64. Casabán-Ros, E., Antón-Fos, G.M., Gálvez, J., Duart, M.J., García-Domenech, R., 1999, Quant. Struct.-Act. Relat., 18, 35.
65. 65a: Rios-Santamarina, I., García-Domenech, R., Cortijo, J., Santamaria, P., Morcillo, E.J., Gálvez, J., 2002, Internet Electron. J. Mol. Des., 1, 70. 65b: Ríos-Santamarina, I., García-Domenech, R., Gálvez, J., Santamaría, P., Cortijo, J., Morcillo, E.J., 1998, Bioorg. Med. Chem. Lett., 8, 477.
66. García-Domenech, R., García-March, F.J., Soler, R.M., Gálvez, J., Antón-Fos, G.M., de Julián-Ortiz, J.V., 1996, Quant. Struct.-Act. Relat., 15, 201.
67. Gálvez, J., García-Domenech, R., de Julián-Ortiz, J.V., Soler, R., 1994, J. Chem. Inf. Comput. Sci., 34, 1198.
68. Jasinski, P., Welsh, B., Galvez, J., Land, D., Zwolak, P., Ghandi, L., Terai, K., Dudek, A.Z., 2008, Investigational New Drugs, 26, 223.
69. Snow, R.W., Guerra, C.A., Noor, A.M., Myint, H.Y., Hay, S.I., 2005, Nature, 434, 214.
70. Mahmoudi, N., Garcia-Domenech, R., Galvez, J., Farhati, K., Franetich, J.F., Sauerwein, R., Hannoun, L., Derouin, F., Danis, M., Mazier, D., 2008, Antimicrobial Agents and Chemotherapy, 52(4), 1215.