

Modeling of the Inhibition of the Intermediate-Conductance Ca2+-Activated K+ Channel (IKCa1) by Some Triarylmethanes Using Quantum Chemical Properties Derived From Ab Initio Calculations

Michael Fernández (a, b) and Julio Caballero (c)*
(a) Molecular Modeling Group, Center for Biotechnological Studies, University of Matanzas, Matanzas, Cuba
(b) Department of Bioscience and Bioinformatics, Kyushu Institute of Technology (KIT), Kawazu, Iizuka, Japan
(c) Centro de Bioinformática y Simulación Molecular, Universidad de Talca, 2 Norte 685, Casilla 721, Talca, Chile. E-mail: [email protected]; [email protected]. Fax: +56-71-201561

Keywords: Ab initio calculations, GA-PLS, IKCa1 channel inhibitors, QSAR, Quantum chemical properties, Triarylmethanes, WHIM descriptors

Received: November 9, 2007; Accepted: March 6, 2008

DOI: 10.1002/qsar.200760157

Abstract

Inhibition of the Intermediate-Conductance Ca2+-Activated K+ Channel (IKCa1) by some Triarylmethane (TRAM) derivatives has been successfully modeled by using quantum chemical properties derived from ab initio calculations and Weighted Holistic Invariant Molecular (WHIM) descriptors. The predictive model was built with the Partial Least Squares (PLS) method in combination with a Genetic Algorithm (GA). Models with good predictivity were obtained both in cross-validation procedures and in external test set predictions. Our results show that the Highest Occupied Molecular Orbital (HOMO) energy, some electronic properties, and topological distributions are important parameters influencing the binding of TRAMs to IKCa1. In addition, our model identified some relevant patterns that can be useful for understanding the IKCa1 inhibitory process and for the design of new blockers.

© 2008 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim. QSAR Comb. Sci. 27, 2008, No. 7, 866-875

Abbreviations: GA, genetic algorithm; HOMO, highest occupied molecular orbitals; IKCa1, intermediate-conductance Ca2+-activated K+ channel; LOO, leave-one-out; LUMO, lowest unoccupied molecular orbitals; PLS, partial least squares; QSAR, quantitative structure-activity relationship; TRAM, triarylmethane; WHIM, weighted holistic invariant molecular

Symbols

Kd: Potency of the blockade
ENp: The Nth component accessibility directional WHIM index weighted by atomic property p
G(N...F): Sum of geometrical distances between N and F atoms
QA: Net atomic Mulliken charge at specific atom A
Q2max: Net charge of the most positive atom at a topological distance 2 from the central atom
Q2min: Net charge of the most negative atom at a topological distance 2 from the central atom
Q²mS: Average of the squares of the charges on all atoms at substituents in the central atom
QSmax: Net charge of the most positive atom at substituents in the central atom
QSmin: Net charge of the most negative atom at substituents in the central atom
ΣQS: Sum of the absolute values of the charges on all atoms at substituents in the central atom
QmS: Average of the absolute values of the charges on all atoms at substituents in the central atom
ΣQ²S: Sum of the squares of the charges on all atoms at substituents in the central atom
μ: Molecular dipole moment
εHOMO, εLUMO: Energies of the highest occupied (HOMO) and lowest unoccupied (LUMO) molecular orbitals
χ: Electronegativity: -0.5 × (εHOMO - εLUMO)
η: Hardness: 0.5 × (εHOMO + εLUMO)
S: Softness: 1/η
ω: Electrophilicity: χ²/2η

Full Papers


1 Introduction

The intermediate-conductance calcium-activated potassium channel, IKCa1, is predominantly expressed in peripheral tissues including those of the hematopoietic system, colon, lung, placenta, and pancreas [1]. This channel is voltage-independent and steeply sensitive to rises in intracellular Ca2+, and has intermediate single-channel conductance values of 11-40 pS [2]. IKCa1 plays a central role in the physiology of erythrocytes, lymphocytes, and intestinal and airway epithelial cells [3, 4]. Furthermore, it regulates membrane potential and calcium signaling in mitogen-activated human lymphocytes.

Blockade of IKCa1 has proven therapeutic utility for the treatment of several diseases. Selective inhibitors of IKCa1 suppress lymphocyte proliferation and cytokine secretion by attenuating Ca2+ influx [5]. Erythrocyte dehydration in sickle cell disease has been attributed to excessive K+ loss through IKCa1 channels that are activated by a rise in intracellular Ca2+ during sickling [6]; for this reason, IKCa1 blockers [such as clotrimazole and triarylmethane (TRAM)-34] have been considered for the treatment of sickle cell disease [7]. In intestinal and airway epithelial cells, basolateral expression of the IKCa1 channel modulates apical water and Cl- secretion [8]; therefore, blockade of this channel can ameliorate secretory diarrhea [9]. Furthermore, blockade of IKCa1 by TRAMs resulted in inhibition of epidermal growth factor-stimulated vascular smooth muscle cell proliferation in vitro and in reduced neointima formation in vivo. For this reason, IKCa1 blockers might have therapeutic utility in the prevention of restenosis after angioplasty and in the treatment of other cardiovascular disorders characterized by abnormal vascular smooth muscle cell proliferation [10].

The correlation of biological data with various molecular descriptors constitutes an important and widely used field of application of Quantitative Structure-Activity Relationships (QSARs) [11]. A QSAR model expresses the relationship between chemical structure and activity in a mathematically quantified, computable form. In this sense, QSAR conserves resources and accelerates the development of new molecules for use as drugs. A crucial step in constructing a QSAR model is to find a set of molecular descriptors that represents the variation in the structural characteristics of the molecules tested. Structure-related descriptors reflect molecular properties, and can thus provide insight into the physicochemical nature of the activity under consideration.

Different kinds of properties have been used as molecular descriptors to determine structure-activity relationships, such as empirical parameters related to structural, electronic, and hydrophobic molecular properties [12], descriptors derived from topological counts [13-15], field approaches [16], etc. Quantum chemical calculations are a reliable source of molecular descriptors, since they can describe all of the electronic and geometric features of molecules [17]. In a QSAR analysis, quantum chemistry provides an accurate and detailed description of electronic effects. Traditionally, semiempirical molecular orbital methods have been adequate for calculating quantum chemical descriptors [18-20]. However, with the dramatic evolution of computational capabilities and new algorithms, ab initio molecular orbital methods can be expeditiously applied in current QSAR studies [21-23].

Starting with clotrimazole as a template, Wulff et al. [24] reported a set of TRAM analogs that potently block IKCa1. In a recent report [25], we modeled the structure-activity relationship of this class of compounds by using topological charge indexes and Bayesian-regularized genetic neural networks [26]. In that work, we concluded that molecular charge distribution plays an important role in IKCa1 inhibition. In the current work, we evaluated the relevance of three-dimensional (3D) features and charge distribution for TRAMs as IKCa1 blockers by using quantum chemical properties and Weighted Holistic Invariant Molecular (WHIM) descriptors [27]. We applied ab initio quantum chemistry to derive the quantum chemical properties. Optimum subsets of descriptors were selected using the Partial Least Squares (PLS) method in combination with a Genetic Algorithm (GA).

2 Methods

2.1 Dataset

Selective IKCa1 blocking activities of 76 TRAM analogs were taken from the literature [24]. In that report, the potency of the blockade was given as a Kd value (nM). For modeling, the Kd values were converted to logarithmic activities, log(1/Kd). The structural features and the biological activities of the compounds used in this study are shown in Table 1.
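The conversion can be illustrated with a minimal sketch. It assumes a base-10 logarithm and Kd kept in nM; the function name and the two Kd inputs (70 and 20 nM) are illustrative choices, picked because they reproduce the first two experimental entries of Table 1 (-1.85 and -1.30) to two decimals.

```python
import math


def log_inverse_kd(kd_nm: float) -> float:
    """Return log10(1/Kd) for a Kd value expressed in nM (base 10 is an assumption)."""
    if kd_nm <= 0:
        raise ValueError("Kd must be positive")
    return -math.log10(kd_nm)


# Illustrative Kd inputs (assumed, not quoted from the paper).
for kd in (70.0, 20.0):
    print(f"Kd = {kd:g} nM -> log(1/Kd) = {log_inverse_kd(kd):.2f}")
```

On this scale, a more potent blocker (smaller Kd) has a less negative log(1/Kd).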

2.2 Quantum Chemical Properties

The molecular structures of all the TRAMs were built using the HyperChem software [28]. The structures were first optimized with the semiempirical method AM1 [29] and then fully optimized at the RHF level with the 6-31G* basis set for all atoms using Gaussian 98 [30]. The calculated descriptors for each molecule are summarized in Table 2. A radial-distributed representation was used for each compound (Figure 1). The atom in the center was denoted as A0, atoms bound to the central atom were denoted as A1, and atoms located at a topological distance 2 from the central atom were denoted as A2. Substituents at the central atom were ordered by the importance of the groups according to their atomic masses, and were identified as S1, S2, S3, and S4 (from the least important to the most important).

Mulliken charges (QA) were calculated at A0 (QA0); in addition, they were calculated at the four A1 positions, one for each substituent: QA11, QA12, QA13, QA14. The most negative and most positive charges were calculated at a topological distance 2 from the central atom (QA2min, QA2max) for each substituent: QA21min, QA21max, QA22min, QA22max, QA23min, QA23max, QA24min, QA24max. Furthermore, the most negative and most positive charges were calculated at substituents in the central atom: QS1min, QS1max, QS2min, QS2max, QS3min, QS3max, QS4min, QS4max. Additionally, we calculated the sum of the absolute charges on all atoms at each substituent (ΣQS1, ΣQS2, ΣQS3, ΣQS4), the average of the absolute values of the charges on all atoms at each substituent (QmS1, QmS2, QmS3, QmS4), the sum of the squares of the charges on all atoms at each substituent (ΣQ²S1, ΣQ²S2, ΣQ²S3, ΣQ²S4), and the average of the squares of the charges on all atoms at each substituent (Q²mS1, Q²mS2, Q²mS3, Q²mS4), as well as the molecular dipole moment (μ) and the energies of the Highest Occupied (HOMO) and Lowest Unoccupied (LUMO) Molecular Orbitals (εHOMO, εLUMO). Finally, quantum chemical indices of electronegativity (χ), hardness (η), softness (S), and electrophilicity (ω) were calculated according to the definitions given in Table 2 [31].

Table 1. Experimental and predicted IKCa1 inhibitory activities, log(1/Kd), of the triarylmethane (TRAM) derivatives.

Compound | R | R1 | R2 | R3 | log(1/Kd) Exp | log(1/Kd) Calc
Training set
1 (Clotrimazole) | H | C6H5 | 2-ClC6H4 | 1H-imidazol-1-yl | -1.85 | -2.96
2 (TRAM-34) | H | C6H5 | 2-ClC6H4 | 1H-pyrazol-1-yl | -1.30 | -2.40
3 | H | C6H5 | 2-ClC6H4 | 4-Me-1H-imidazol-1-yl | -3.08 | -3.94
4 | H | C6H5 | 2-ClC6H4 | 2-Me-1H-imidazol-1-yl | -3.23 | -3.49
5 | H | C6H5 | 2-ClC6H4 | 1H-benzimidazol-1-yl | -4.30 | -3.75
6 | H | C6H5 | 2-ClC6H4 | 3-Me-1H-pyrazol-1-yl | -3.04 | -3.42
7 | H | C6H5 | 2-ClC6H4 | 3,5-diMe-1H-pyrazol-1-yl | -4.08 | -2.91
8 | H | C6H5 | 2-ClC6H4 | 3-CF3-1H-pyrazol-1-yl | -3.30 | -3.73
9 | H | C6H5 | 2-ClC6H4 | 1H-pyrrol-1-yl | -3.85 | -3.63
10 | H | C6H5 | 2-ClC6H4 | 1H-tetraazol-1-yl | -1.65 | -2.16
11 | H | C6H5 | 2-ClC6H4 | NH2 | -3.00 | -3.06
12 | H | C6H5 | 2-ClC6H4 | NHCOCH3 | -3.08 | -3.14
13 | H | C6H5 | 2-ClC6H4 | NHCONH2 | -3.70 | -3.19
14 | H | C6H5 | 2-ClC6H4 | 4-Pyridinylamino | -4.48 | -4.01
15 | H | C6H5 | 2-ClC6H4 | 2-Pyrimidinylamino | -2.95 | -3.51
16 | H | C6H5 | 2-ClC6H4 | 1,3-Thiazol-2-ylamino | -4.18 | -4.48
17 | H | C6H5 | 2-ClC6H4 | (5-Me-3-isoxazolyl)amino | -3.00 | -3.41
18 | H | C6H5 | 2-ClC6H4 | {5-[(4-nitrophenyl)sulfonyl]-1,3-thiazol-2-yl}amino | -4.18 | -4.29
19 | H | C6H5 | 2-ClC6H4 | 1,3-Dioxo-1,3-dihydro-2H-isoindol-2-yl | -4.30 | -4.18
20 | H | C6H5 | 2-ClC6H4 | CH(CO2Et)2 | -2.60 | -2.79
21 | H | C6H5 | C6H5 | 1H-imidazol-1-yl | -3.18 | -3.14
22 | H | C6H5 | C6H5 | 1H-pyrazol-1-yl | -3.40 | -2.73
23 | H | C6H5 | C6H5 | 1H-pyrrol-1-yl | -4.45 | -4.48
24 | H | C6H5 | C6H5 | 1-Pyrrolidinyl | -4.48 | -4.53
25 | H | C6H5 | C6H5 | NHCONH2 | -3.90 | -4.00
26 | H | C6H5 | C6H5 | CH2CO2H | -4.40 | -4.61
27 | H | C6H5 | C6H5 | CH(CO2Et)2 | -3.60 | -3.50
28 | H | C6H5 | 3-ClC6H4 | OH | -2.72 | -2.92
29 | H | C6H5 | 4-ClC6H4 | 1H-pyrazol-1-yl | -1.95 | -2.40
30 | H | C6H5 | 4-ClC6H4 | NHCONH2 | -4.18 | -3.45
31 | H | C6H5 | 4-ClC6H4 | 4-Pyridinylamino | -4.04 | -2.99
32 | H | C6H5 | 4-ClC6H4 | 1,3-Thiazol-2-ylamino | -4.30 | -3.88
33 | H | C6H5 | 4-ClC6H4 | CN | -2.88 | -2.36
34 | H | C6H5 | 2-FC6H4 | 1H-pyrazol-1-yl | -1.60 | -2.31
35 | H | C6H5 | 2-FC6H4 | NH2 | -3.30 | -3.10
36 | H | C6H5 | 2-FC6H4 | NHCOCH3 | -3.08 | -2.97
37 | H | C6H5 | 2-FC6H4 | 1,3-Thiazol-2-ylamino | -4.08 | -3.48
38 | H | C6H5 | 2-FC6H4 | CN | -1.85 | -2.12
39 | H | C6H5 | 2-FC6H4 | OH | -2.85 | -2.95
40 | H | C6H5 | 4-FC6H4 | 1H-pyrazol-1-yl | -2.30 | -2.85
41 | H | C6H5 | 4-FC6H4 | NH2 | -3.70 | -3.39
42 | H | C6H5 | 2-(CF3)C6H4 | 1H-pyrazol-1-yl | -3.58 | -3.25
43 | H | C6H5 | 2-(CF3)C6H4 | 3-CF3-1H-pyrazol-1-yl | -4.40 | -4.32
44 | H | C6H5 | 2-(CF3)C6H4 | NH2 | -3.40 | -3.46
45 | H | C6H5 | 2-(CF3)C6H4 | NHCOCH3 | -3.18 | -3.56
46 | H | C6H5 | 2-(CF3)C6H4 | 1,3-Thiazol-2-ylamino | -4.40 | -4.43
47 | H | C6H5 | 2-(CF3)C6H4 | OH | -2.85 | -3.16
48 | H | C6H5 | 3-(CF3)C6H4 | OH | -2.78 | -3.26
49 | H | C6H5 | 4-(CF3)C6H4 | 1H-pyrazol-1-yl | -3.18 | -3.04
50 | H | C6H5 | 4-(CF3)C6H4 | 1,3-Thiazol-2-ylamino | -4.54 | -4.57
51 | H | C6H5 | 4-(CF3)C6H4 | OH | -2.90 | -3.01
52 | H | C6H5 | 2-Pyrrolidinyl | OH | -4.70 | -4.05
53 | H | C6H5 | 4-Pyridinyl | OH | -4.70 | -3.60
54 | H | C6H5 | 2-Thienyl | 1H-imidazol-1-yl | -3.00 | -3.30

2.3 WHIM Descriptors

The WHIM indices are rototranslation-invariant descriptors obtained for each molecular geometry [27]. They were calculated by weighting the Cartesian coordinates by atomic properties and centering the coordinates to obtain invariance to translation. Then, a Principal Component Analysis (PCA) yields three principal component axes, and new coordinates are obtained by projecting the old ones onto the PCA axes, giving three score column vectors t1, t2, and t3. Four kinds of descriptors were calculated from the first- to fourth-order moments of the tm scores, related to molecular size, shape, symmetry, and atom distribution.
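The projection step can be sketched with NumPy as follows. This is a simplified illustration, not the WHIM software used in the study: the function names, the six-atom coordinates, and the plain fourth-moment ratio are assumptions, and the exact normalization of the published WHIM indices is not reproduced.

```python
import numpy as np


def whim_scores(coords, weights=None):
    """Center weighted coordinates and project them onto their principal axes."""
    coords = np.asarray(coords, dtype=float)
    n_atoms = coords.shape[0]
    w = np.ones(n_atoms) if weights is None else np.asarray(weights, dtype=float)
    center = (w[:, None] * coords).sum(axis=0) / w.sum()   # weighted centroid
    centered = coords - center                              # translation invariance
    cov = (w[:, None] * centered).T @ centered / w.sum()    # weighted covariance
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]                       # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = centered @ eigvecs                             # columns t1, t2, t3
    return eigvals, scores


def fourth_moment_ratio(eigvals, scores):
    """Kurtosis-type ratio per axis: sum(t^4) / (n * lambda^2)."""
    return (scores ** 4).sum(axis=0) / (scores.shape[0] * eigvals ** 2)


# Hypothetical six-atom geometry (arbitrary units), unweighted scheme.
coords = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [-0.7, 1.2, 0.1],
                   [-0.7, -1.2, 0.4], [2.2, 1.1, -0.9], [0.3, 0.2, 1.8]])
theta = np.deg2rad(40.0)
rot = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                [np.sin(theta), np.cos(theta), 0.0],
                [0.0, 0.0, 1.0]])
moved = coords @ rot.T + np.array([5.0, -3.0, 2.0])         # rotate and translate

eig_a, scores_a = whim_scores(coords)
eig_b, scores_b = whim_scores(moved)
print(np.allclose(eig_a, eig_b))
print(np.allclose(fourth_moment_ratio(eig_a, scores_a),
                  fourth_moment_ratio(eig_b, scores_b)))
```

Rotating and translating the molecule leaves both the eigenvalues and the fourth-moment ratios unchanged, which is the invariance property the descriptors rely on.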

2.4 Modeling Procedure

Table 1. (cont.)

Compound | R | R1 | R2 | R3 | log(1/Kd) Exp | log(1/Kd) Calc
55 | H | C6H5 | 2-Thienyl | 1H-pyrazol-1-yl | -3.04 | -2.85
56 | H | 2-ClC6H4 | 2-Thienyl | OH | -2.88 | -2.61
57 | H | 4-ClC6H4 | 4-ClC6H4 | OH | -4.70 | -4.84
58 | H | 4-FC6H4 | 2-Thienyl | OH | -2.90 | -3.39
59 | H | 4-MeOC6H4 | 4-MeOC6H4 | OH | -4.00 | -4.98
60 | H | 4-MeOC6H4 | 4-MeOC6H4 | CN | -4.54 | -4.01
61 | H | 2-Thienyl | 2-Thienyl | OH | -3.95 | -3.11
62 | OMe | 4-MeOC6H4 | 4-MeOC6H4 | 1H-pyrazol-1-yl | -4.30 | -3.74
63 | OMe | 4-MeOC6H4 | 4-MeOC6H4 | CN | -4.60 | -4.13
64 | OMe | 4-MeOC6H4 | 4-MeOC6H4 | OH | -4.00 | -4.64
Test set
65 | H | C6H5 | 2-ClC6H4 | 2-Pyridinylamino | -4.45 | -3.92
66 | H | C6H5 | 2-ClC6H4 | (4-Me-1,3-thiazol-2-yl)amino | -3.90 | -3.81
67 | H | C6H5 | 2-ClC6H4 | CN | -1.78 | -2.36
68 | H | C6H5 | 2-ClC6H4 | OH | -2.72 | -2.73
69 | H | C6H5 | 4-ClC6H4 | OH | -2.74 | -2.99
70 | H | C6H5 | 2-FC6H4 | NHCONH2 | -3.70 | -2.93
71 | H | C6H5 | 4-FC6H4 | NHCONH2 | -4.00 | -3.62
72 | H | C6H5 | 4-FC6H4 | CN | -2.90 | -2.71
73 | H | C6H5 | 4-FC6H4 | OH | -2.85 | -3.29
74 | H | C6H5 | 2-(CF3)C6H4 | 2-Pyrimidinylamino | -4.18 | -4.64
75 | H | C6H5 | 2-Thienyl | OH | -3.18 | -3.36
76 | H | 3-ClC6H4 | 3-ClC6H4 | OH | -4.70 | -4.68

Figure 1. Radial-distributed representation for TRAMs.

A data matrix was generated with the quantum chemical properties and WHIM descriptors calculated for each compound. Some geometrical descriptors accounting for the sum of geometrical distances between atoms were added to the data matrix. Afterwards, dimensionality reduction methods were employed for selecting the most relevant vector components for building the QSAR model. The total number of computed descriptors was 145. Descriptors with constant values were discarded. For the remaining descriptors, pairwise correlation analysis was performed in order to reduce, in a first step, the collinearity and correlation between descriptors. The procedure consists of eliminating the descriptor with the lower variance from each pair of descriptors whose pair correlation coefficient has a modulus higher than a predefined value (R²max = 0.95). After this step, 74 descriptors remained.

A regression model was developed using PLS [32]. Taking into account the high dimension of our dataset, a feature selection-based PLS was used in order to obtain valid results. We employed an approach combining GA with PLS (GA-PLS) proposed by Leardi and Gonzalez [33]. The dataset was divided into training and test sets. Twelve compounds were chosen randomly as a prediction set and were used for external validation. The compounds in the external prediction set were reserved for validating potential models. For the development of the GA-PLS model, the training set included all the remaining 64 compounds.

The GA-PLS model was built by using the PLS-GA toolbox for the MATLAB environment [34]. Beforehand, we applied two randomization tests (Figure 2). First, we carried out a test for evaluating the risk of overfitting. Then, we carried out a test for estimating the number of evaluations that should be done in each run so as to get a good model without overfitting. These tests provide a metric to assess the risk of overfitting and are conducted by randomizing the response values (Y's) relative to the predictor variables (X's) and then constructing a model to predict the Y's from the X's. No significant predictive ability should be found; if it is found, this is a sign of overfitting.

For the GA search, we kept the parameters used by Leardi [35]: 30 chromosomes; there were five variables per chromosome in the first population; the number of components was determined by cross-validation (highest Q²); the maximum number of variables selected in the same chromosome was 30; the probability of mutation was 1%; the optimal number of components determined by Leave-One-Out (LOO) cross-validation on the model was no higher than 15; the number of runs was 100; backward elimination was applied after every 100th evaluation and at the end. Variable selection was replicated five times to evaluate the variability of the results.
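The descriptor pre-filtering step described at the start of this section (discard constant descriptors, then drop the lower-variance member of each highly correlated pair) can be sketched as follows. This is an illustration only: the function name and the synthetic data are hypothetical, and applying the 0.95 threshold to the squared correlation coefficient is an assumption about how R²max was used.

```python
import numpy as np


def prefilter_descriptors(X, r2_max=0.95):
    """Return the column indices kept after the constant and correlation filters."""
    X = np.asarray(X, dtype=float)
    variances = X.var(axis=0)
    keep = [j for j in range(X.shape[1]) if variances[j] > 0.0]  # drop constants
    # Squared pairwise correlations among the non-constant columns.
    r2 = np.atleast_2d(np.corrcoef(X[:, keep], rowvar=False)) ** 2
    removed = set()
    for a in range(len(keep)):
        for b in range(a + 1, len(keep)):
            if a in removed or b in removed:
                continue
            if r2[a, b] > r2_max:
                # Discard the member of the pair with the lower variance.
                loser = a if variances[keep[a]] < variances[keep[b]] else b
                removed.add(loser)
    return [keep[i] for i in range(len(keep)) if i not in removed]


# Synthetic example: column 1 is almost a multiple of column 0, column 3 is constant.
rng = np.random.default_rng(0)
x0 = rng.normal(size=50)
x1 = 2.0 * x0 + 0.01 * rng.normal(size=50)
x2 = rng.normal(size=50)
X = np.column_stack([x0, x1, x2, np.ones(50)])
print(prefilter_descriptors(X))
```

In this example column 0 is dropped in favor of the higher-variance column 1, and the constant column 3 is dropped before any correlation is computed.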

2.5 Analysis of the Quality of the Models

The quality of the fit of the training set of a specific model was measured by its R²:

$$R^2 = 1 - \frac{\sum_{i=1}^{N} (Y_i - A_i)^2}{\sum_{i=1}^{N} (Y_i - \bar{A})^2} \tag{1}$$

where N is the number of compounds, Yi and Ai are the predicted and experimental biological activities of compound i, respectively, and Ā is the average experimental activity.

The predictive quality was measured by estimating the R² of LOO cross-validation (Q²) and the standard deviation (SCV). A data point was removed (left out) from the training set, and the model was refitted; the predicted value for that point was then compared to the experimental value. This was repeated until each datum had been omitted once; the sum of squares of these deletion residuals was used to calculate Q²:

$$Q^2 = 1 - \frac{\sum_{i=1}^{N} (Y_i - A_i)^2}{\sum_{i=1}^{N} (Y_i - \bar{A})^2} \tag{2}$$

where N is the number of compounds, Yi and Ai are the predicted and experimental biological activities of the left-out compound i, respectively, and Ā is the average experimental activity of the left-in compounds, that is, all compounds other than i.

In addition, the predictive power of the model was also measured by an external validation process that consists in predicting the activity of unknown compounds forming the test set. In this case, the residuals of predictions were evaluated. Such criteria have been formulated as the requirements for a QSAR model to have highly predictive power.

Table 2. Symbols of the calculated quantum chemical descriptors used in this study and their definitions.

Descriptor | Definition
QA | Net atomic Mulliken charge at each atom A defined by the radial-distributed representation. A = 0 for the central atom, A = 1 for an atom at topological distance 1 from the central atom
Q2min, Q2max | Net charges of the most negative and most positive atoms at a topological distance 2 from the central atom
QSmin, QSmax | Net charges of the most negative and most positive atoms at substituents in the central atom
ΣQS | Sum of the absolute values of the charges on all atoms at substituents in the central atom
QmS | Average of the absolute values of the charges on all atoms at substituents in the central atom
ΣQ²S | Sum of the squares of the charges on all atoms at substituents in the central atom
Q²mS | Average of the squares of the charges on all atoms at substituents in the central atom
μ | Molecular dipole moment
εHOMO, εLUMO | Energies of the Highest Occupied (HOMO) and Lowest Unoccupied (LUMO) Molecular Orbitals
χ | Electronegativity: -0.5 × (εHOMO - εLUMO)
η | Hardness: 0.5 × (εHOMO + εLUMO)
S | Softness: 1/η
ω | Electrophilicity: χ²/2η

Figure 2. Randomization tests for minimizing the probability of overfitting described in Ref. [34].
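The statistics of this section can be sketched in a few lines of NumPy. Two assumptions are made and should be read as such: the denominator is taken as the conventional sum of squared deviations of the experimental values from their mean, and ordinary least squares stands in for PLS in the leave-one-out loop. The function names are hypothetical; the twelve value pairs are the test-set rows of Table 1.

```python
import numpy as np


def r_squared(predicted, experimental):
    """1 - SSE/SST, with SST taken about the mean experimental activity."""
    y = np.asarray(predicted, dtype=float)
    a = np.asarray(experimental, dtype=float)
    return 1.0 - np.sum((y - a) ** 2) / np.sum((a - a.mean()) ** 2)


def q2_loo(X, experimental):
    """Leave-one-out Q2; least squares is a stand-in for the PLS model."""
    a = np.asarray(experimental, dtype=float)
    design = np.column_stack([np.ones(len(a)), np.asarray(X, dtype=float)])
    press, total = 0.0, 0.0
    for i in range(len(a)):
        left_in = np.arange(len(a)) != i
        coef, *_ = np.linalg.lstsq(design[left_in], a[left_in], rcond=None)
        press += (design[i] @ coef - a[i]) ** 2       # deletion residual
        total += (a[i] - a[left_in].mean()) ** 2      # mean of left-in compounds
    return 1.0 - press / total


# Test-set rows of Table 1 (compounds 65-76).
exp_test = [-4.45, -3.90, -1.78, -2.72, -2.74, -3.70,
            -4.00, -2.90, -2.85, -4.18, -3.18, -4.70]
calc_test = [-3.92, -3.81, -2.36, -2.73, -2.99, -2.93,
             -3.62, -2.71, -3.29, -4.64, -3.36, -4.68]
print(round(r_squared(calc_test, exp_test), 3))
```

With these assumptions the test-set rows give a value close to 0.77.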

3 Results and Discussion

The results of the first randomization test showed that there is no risk of overfitting (average % variance explained in cross-validation <10), and GA can be applied in the present paradigm. The second test indicated that 200 evaluations were necessary in each run. The GA-PLS methodology provided a model with R² = 0.647 and Q² = 0.558. This model included ten variables and three latent variables. The ten variables were Q2minS1, Q2maxS3, ΣQS2, ΣQS3, εHOMO, χ, E3u, E2m, E3e, and G(N...F), where Q2minS1 is the net charge of the most negative atom at a topological distance 2 from the central atom in substituent S1, Q2maxS3 is the net charge of the most positive atom at a topological distance 2 from the central atom in substituent S3, ΣQS2 is the sum of the absolute values of the charges on all atoms at substituent S2, ΣQS3 is the sum of the absolute values of the charges on all atoms at substituent S3, E3u is the unweighted third component accessibility directional WHIM index, E2m is the second component accessibility directional WHIM index weighted by atomic masses, and E3e is the third component accessibility directional WHIM index weighted by atomic Sanderson electronegativities. The values of the accumulated variance of the model with these variables, using 1-3 latent variables, are listed in Table 3.

The training and test set predictions of blockade of IKCa1 by TRAMs [log(1/Kd)] appear in Table 1. In turn, plots of training and test set predictions versus experimental log(1/Kd) values are shown in Figure 3. In general, the obtained model was able to explain the data variance and was quite stable to the inclusion-exclusion of compounds as measured by LOO correlation coefficients (Q² > 0.5). In addition, the model was able to describe the test set variance with R² = 0.767.

The resulting frequency of selection is plotted in Figure 4. It is obvious from this plot that εHOMO is the most relevant descriptor in our model. The HOMO energy is a very popular chemical descriptor for QSAR development. HOMO and LUMO are known to play a major role in governing many chemical reactions. The HOMO energy allows estimating the availability of the most energetic electrons in a compound. The presence of εHOMO in our model suggests that TRAM analogs contribute electron density to the interaction with IKCa1. Taken alone, εHOMO correlates only weakly with IKCa1 blockade (R² = 0.212); however, it is noteworthy that the most active compounds have small values of this property (εHOMO < -0.31).

We constructed a linear equation including the ten variables selected by the PLS model:

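$$\begin{aligned}
\log(1/K_d) ={} & -25.754\,\varepsilon_{\mathrm{HOMO}} - 6.190\,E2m - 2.722\,E3u + 21.689\,\chi - 2.493\,E3e \\
& - 0.017\,G(\mathrm{N \ldots F}) + 1.029\,Q_{2\max S3} - 0.087\,\Sigma Q_{S3} - 0.316\,\Sigma Q_{S2} + 7.516\,Q_{2\min S1} - 12.392
\end{aligned} \tag{3}$$

N = 64, R² = 0.650, Q² = 0.518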
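As a minimal sketch, Eq. (3) can be evaluated term by term to see how each descriptor shifts the predicted activity. The coefficients and intercept are those of Eq. (3); the dictionary keys, function names, and the descriptor values in the example are hypothetical placeholders, not data from the study.

```python
# Coefficients of Eq. (3); the keys are illustrative names for the ten descriptors.
EQ3_COEFFICIENTS = {
    "e_homo": -25.754, "E2m": -6.190, "E3u": -2.722, "chi": 21.689,
    "E3e": -2.493, "G_NF": -0.017, "Q2max_S3": 1.029,
    "sumQ_S3": -0.087, "sumQ_S2": -0.316, "Q2min_S1": 7.516,
}
EQ3_INTERCEPT = -12.392


def contributions(descriptors):
    """Return the coefficient * value term of every descriptor in Eq. (3)."""
    missing = set(EQ3_COEFFICIENTS) - set(descriptors)
    if missing:
        raise KeyError(f"missing descriptors: {sorted(missing)}")
    return {name: coef * descriptors[name] for name, coef in EQ3_COEFFICIENTS.items()}


def predict_log_inv_kd(descriptors):
    """Evaluate Eq. (3) for one compound."""
    return EQ3_INTERCEPT + sum(contributions(descriptors).values())


# Hypothetical descriptor values, for illustration only.
example = {"e_homo": -0.32, "E2m": 0.45, "E3u": 0.40, "chi": 0.21, "E3e": 0.42,
           "G_NF": 10.0, "Q2max_S3": 0.10, "sumQ_S3": 2.0, "sumQ_S2": 2.0,
           "Q2min_S1": -0.20}
lower_homo = dict(example, e_homo=example["e_homo"] - 0.01)
print(predict_log_inv_kd(lower_homo) - predict_log_inv_kd(example))
```

Because the εHOMO coefficient is negative, lowering εHOMO by 0.01 raises the predicted log(1/Kd) by about 0.26 with all other descriptors held fixed.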

This linear equation has R² and Q² values similar to those of the PLS model. In addition, εHOMO has a negative effect on the activity (its coefficient is negative in the equation), as we previously indicated. By means of this equation, we can evaluate the effect of each property on the IKCa1 blockade.

Table 3. Values of the accumulated variances of the PLS model with ten descriptors and three latent variables for the independent and dependent variable blocks.

Latent variable | Independent: Increase | Independent: Average | Dependent: Increase | Dependent: Average
1 | 18.13 | 18.13 | 60.10 | 60.10
2 | 15.56 | 33.69 | 4.26 | 64.36
3 | 10.02 | 43.71 | 0.46 | 64.82

Figure 3. Plot of predicted versus experimental log(1/Kd) values for blockade of IKCa1 by triarylmethane analogs (GA-PLS model). (*) Training set predictions; (*) test set predictions.

Three WHIM descriptors were extracted among the most relevant descriptors: E2m, E3u, and E3e. The three descriptors are based on the kurtosis, calculated from the fourth-order moments of the tm scores. E2m is related to the atom distribution along the second axis for the atomic mass-weighted scheme, E3u is related to the atom distribution along the third axis for the unweighted scheme, and E3e is related to the atom distribution along the third axis for the atomic Sanderson electronegativity-weighted scheme. The three descriptors have a negative effect (Eq. 3). Accordingly, these WHIM descriptors encode mass and electronic distributions in a 3D mathematically defined zone.

Another variable extracted by our model is the electronegativity (χ), which is related to the difference between εHOMO and εLUMO and reflects the electronic characteristics of the whole molecule. This variable has a positive coefficient in Eq. (3), which indicates a positive effect on the IKCa1 blockade when the gap between εHOMO and εLUMO is larger. Another global property included in our model is the sum of geometrical distances between N and F atoms (G(N...F)), which encodes the presence and the spatial disposition of CF3 and F substituents in molecules that contain N atoms. This variable has a negative effect and is sensitive to the number of F and N atoms in the molecule and the geometrical distances among them. Some patterns identified by this descriptor can be seen in the dataset: the CF3 substituent (compounds 42 and 49), instead of the F substituent (compounds 34 and 40), increases the number of F atoms and decreases the IKCa1 inhibition; when compound 42, which contains three F atoms, is substituted with a new CF3 group, the activity decreases (compound 43); when an F atom is moved from the ortho position in compound 34 to the para position in compound 40, the F-N distance increases and the activity decreases.

The first six relevant properties according to the frequency of selection in Figure 4 correspond to the electronic and spatial features of the whole molecules relevant for their IKCa1 inhibitory activity. The following four properties encode local electronic properties that are relevant to the IKCa1 inhibition according to our results. The local positions of Q2maxS3, ΣQS3, ΣQS2, and Q2minS1 are represented in Figure 5. Q2maxS3 has a positive influence (Eq. 3) and takes the highest values (around 0.5) for compounds having a 2-F phenyl as S3, values around 0.3 for compounds having a 2-thienyl group as S3, and values lower than 0.1 for compounds having other substituents. ΣQS3 has a negative influence and takes the highest values when CF3 groups are in S3 substituents, whereas it takes the lowest values in compounds having phenyl or Cl-phenyl groups as S3. On the other hand, ΣQS2 has a negative influence and takes the highest values when OMe groups are in S2 substituents (compounds 59, 60, 62-64), whereas it takes the lowest values in compounds having a phenyl group as the S2 substituent. The latter characteristics of ΣQS3 and ΣQS2 favor the IKCa1 inhibition. Finally, Q2minS1 has a positive effect according to Eq. (3), and it has little variance because S1 is a phenyl group in almost all compounds. The requirement of having a positive charge at C-2 in the phenyl group of S1 can be accomplished by modulating the substituents at the S2, S3, and S4 positions.

In a previous QSAR study, we identified the main characteristics of the TRAM analogs that made them more active by using topological charge indexes [25]. In this previous work, we concluded that the electronic content of these compounds can be used for deriving in silico structure-activity relationship models. However, deciphering the information content of topological charge indexes is very complex, as their computation involves integration of the structural fragments. In this sense, the previous report does not contain useful information for drug design and the synthesis of new candidates. The current report improves on the previous results [25], since the use of quantum chemical properties for deriving a QSAR model allows an

QSAR Comb. Sci. 27, 2008, No. 7, 866 – 875 www.qcs.wiley-vch.de C 2008 WILEY-VCH Verlag GmbH&Co. KGaA, Weinheim 873

Figure 4. Bar plot of the cumulative frequency of selection ofthe relevant variables by GA-PLS method. The 12 more fre-quent variables are represented.

Figure 5. Position of local descriptors obtained in PLS modelin compound 2.

Modeling of the Inhibition of the Intermediate-Conductance Ca2þ-Activated Kþ Channel (IKCa1)

Page 9: Modeling of the Inhibition of the Intermediate-Conductance Ca2+-Activated K+ Channel (IKCa1) by Some Triarylmethanes Using Quantum Chemical Properties Derived From Ab Initio Calculations

easy interpretation and offers clear guidelines for mole-cule optimization or design. Furthermore, 76 compoundswere included in the current model in comparison to the30 compounds included in the earlier reported model. Inthe previous report, compound 9 (compound 10 in Ref.[25]) was removed as an outlier; while the increase in thedata size and the use of quantum chemical properties andWHIM descriptors in the current report, allowed a goodprediction of this compound (Table 1).Our current model confirms that electronic content of

TRAM analogs is related to their IKCa1 inhibitory activi-ty. It possesses a high statistical quality, includes morecompounds, and indicates some local relevant characteris-tics for a high activity. In this sense, we propose it as a use-ful tool for predicting new active IKCa1 blockers.
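As an illustration of how two of the descriptor families discussed above could be evaluated, the following minimal Python sketch computes G(N...F) as the sum of Euclidean distances over all N/F atom pairs, together with the substituent charge summaries named in the symbol list (sum and average of absolute charges, sum of squared charges, most positive and most negative charge). This is only a sketch under stated assumptions, not the code used in this work: the function names, the input layout (element symbol plus Cartesian coordinates in Å, and a plain list of Mulliken charges for one substituent), and the convention that G(N...F) is zero when either element is absent are all hypothetical choices made here for illustration.

```python
import math
from itertools import product


def sum_nf_distances(atoms):
    """Sum of Euclidean distances over every N/F atom pair.

    `atoms` is a list of (element, x, y, z) tuples (hypothetical layout).
    Returns 0 when the molecule has no N or no F atoms (assumed convention).
    """
    n_coords = [a[1:] for a in atoms if a[0] == "N"]
    f_coords = [a[1:] for a in atoms if a[0] == "F"]
    return sum(math.dist(n, f) for n, f in product(n_coords, f_coords))


def substituent_charge_descriptors(charges):
    """Summaries of the net atomic charges on the atoms of one substituent.

    `charges` is a non-empty list of Mulliken charges (hypothetical input).
    """
    abs_q = [abs(q) for q in charges]
    return {
        "sum_abs": sum(abs_q),                 # sum of absolute charges
        "mean_abs": sum(abs_q) / len(abs_q),   # average of absolute charges
        "sum_sq": sum(q * q for q in charges), # sum of squared charges
        "q_max": max(charges),                 # most positive charge
        "q_min": min(charges),                 # most negative charge
    }


# Toy geometry (not taken from the dataset): one N, two F, one C atom.
toy_atoms = [("N", 0, 0, 0), ("F", 3, 4, 0), ("F", 0, 0, 2), ("C", 1, 1, 1)]
g_nf = sum_nf_distances(toy_atoms)

# Toy charges (not taken from the dataset) for a three-atom substituent.
toy_desc = substituent_charge_descriptors([0.5, -0.25, 0.25])
```

With this definition, adding F atoms or moving an F atom farther from the N atoms both increase G(N...F), which is the behavior described for the CF3-substituted and para-F compounds.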

4 Conclusions

A QSAR model was derived to study the inhibition of IKCa1 by some TRAM derivatives. Ab initio theory was used to optimize the 3D structures of the compounds and to calculate a diverse set of quantum chemical properties. Good relationships between the experimental log(1/Kd) values and quantum chemical properties were obtained through the GA-PLS method. Ab initio derived quantum chemical properties such as local Mulliken charges, molecular dipole moments, and HOMO and LUMO energies were used in combination with WHIM descriptors. The results showed that HOMO energy, some electronic properties, and topological distributions are important parameters influencing the binding of TRAMs with IKCa1. In addition, some local properties were included that define molecular regions responsible for the observed biological activity and that can be used for establishing some relevant aspects of the IKCa1 – blocker interaction.

Quantum chemical methods have been widely applied to QSAR studies. Global quantum chemical properties can accurately describe the electronic environment of the molecules, and local quantum chemical properties can locate molecular regions responsible for a given biological activity. With the increase in computational resources, there are new opportunities for applying ab initio calculations in drug design. This application can provide an extra dimension to QSAR studies, particularly in view of the more precise description of electronic effects.

5 References

[1] C. Brugnara, C. C. Armsby, L. De Franceschi, M. Crest, M. F. Euclaire, S. L. Alper, J. Membr. Biol. 1995, 147, 71 – 82.

[2] S. Grissmer, A. N. Nguyen, M. D. Cahalan, J. Gen. Physiol. 1993, 102, 601 – 630.

[3] T. M. Ishii, C. Silvia, B. Hirschberg, C. T. Bond, J. P. Adelman, J. Maylie, Proc. Natl. Acad. Sci. USA 1997, 94, 11651 – 11656.

[4] D. H. Vandorpe, B. E. Shmukler, L. Jiang, B. Lim, J. Maylie, J. P. Adelman, L. De Franceschi, M. D. Cappellini, C. Brugnara, S. L. Alper, J. Biol. Chem. 1998, 273, 21542 – 21553.

[5] S. Ghanshani, H. Wulff, M. J. Miller, H. Rohm, A. Neben, G. A. Gutman, M. D. Cahalan, J. Biol. Chem. 2000, 275, 37137 – 37149.

[6] C. Brugnara, Curr. Opin. Hematol. 1997, 4, 122 – 127.

[7] C. Brugnara, B. Gee, C. C. Armsby, S. Kurth, M. Sakamoto, N. Rifai, S. L. Alper, O. S. Platt, J. Clin. Invest. 1996, 97, 1227 – 1234.

[8] D. C. Devor, A. K. Singh, L. C. Lambert, A. DeLuca, R. A. Frizzel, R. J. Bridges, J. Gen. Physiol. 1999, 113, 743 – 760.

[9] P. A. Rufo, D. Merlin, M. Riegler, M. H. Ferguson-Maltzman, B. L. Dickinson, C. Brugnara, S. L. Alper, W. I. Lencer, J. Clin. Invest. 1997, 100, 3111 – 3120.

[10] R. Köhler, H. Wulff, I. Eichler, M. Kneifel, D. Neumann, A. Knorr, I. Grgic, D. Kämpfe, H. Si, J. Wibawa, R. Real, K. Borner, S. Brakemeier, H. D. Orzechowski, H. P. Reusch, M. Paul, K. G. Chandy, J. Hoyer, Circulation 2003, 108, 1119 – 1125.

[11] J. Gasteiger, Anal. Bioanal. Chem. 2006, 384, 57 – 64.

[12] C. Hansch, A. Kurup, R. Garg, H. Gao, Chem. Rev. 2001, 101, 619 – 672.

[13] C. Lu, W. Guo, X. Hu, Y. Wang, C. Yin, Chem. Phys. Lett. 2006, 417, 11 – 15.

[14] H. Gonzalez-Díaz, S. Vilar, L. Santana, E. Uriarte, Curr. Top. Med. Chem. 2007, 7, 1015 – 1029.

[15] E. Gregori-Puigjane, J. Mestres, J. Chem. Inf. Model. 2006, 46, 1615 – 1622.

[16] G. Klebe, U. Abraham, T. Mietzner, J. Med. Chem. 1994, 37, 4130 – 4146.

[17] M. Karelson, V. S. Lobanov, A. R. Katritzky, Chem. Rev. 1996, 96, 1027 – 1043.

[18] S. Dixon, K. M. Merz, Jr., G. Lauri, J. C. Ianni, J. Comput. Chem. 2005, 26, 23 – 34.

[19] G.-Z. Li, J. Yang, H.-F. Song, S.-S. Yang, W.-C. Lu, N.-Y. Chen, J. Chem. Inf. Comput. Sci. 2004, 44, 2047 – 2050.

[20] M. Fernandez, J. Caballero, J. Mol. Model. 2007, 13, 465 – 476.

[21] P. J. Smith, P. L. A. Popelier, J. Comput. Aided Mol. Des. 2004, 18, 135 – 143.

[22] W. Liu, P. Yi, Z. Tang, QSAR Comb. Sci. 2006, 25, 936 – 943.

[23] B. Hemmateenejad, M. A. Safarpour, R. Miri, F. Taghavi, J. Comput. Chem. 2004, 25, 1495 – 1503.

[24] H. Wulff, M. J. Miller, W. Hänsel, S. Grissmer, M. D. Cahalan, K. G. Chandy, Proc. Natl. Acad. Sci. USA 2000, 97, 8151 – 8156.

[25] J. Caballero, M. Garriga, M. Fernandez, J. Comput. Aided Mol. Des. 2005, 19, 771 – 789.

[26] M. Fernandez, J. Caballero, J. Mol. Graph. Model. 2006, 25, 410 – 422.

[27] R. Todeschini, M. Lansagni, E. Marengo, J. Chemometr. 1994, 8, 263 – 272.

[28] HyperChem 7.0. 2002, Hypercube, Gainesville.

[29] J. J. P. Stewart, J. Comput. Chem. 1989, 10, 210 – 220.

[30] M. J. Frisch, G. W. Trucks, H. B. Schlegel, G. E. Scuseria, M. A. Robb, J. R. Cheeseman, V. G. Zakrzewski, J. A. Montgomery Jr., R. E. Stratmann, J. C. Burant, S. Dapprich, J. M. Millam, A. D. Daniels, K. N. Kudin, M. C. Strain, O. Farkas, J. Tomasi, V. Barone, M. Cossi, R. Cammi, B. Mennucci, C. Pomelli, C. Adamo, S. Clifford, J. Ochterski, G. A. Petersson, P. Y. Ayala, Q. Cui, K. Morokuma, D. K. Malick, A. D. Rabuck, K. Raghavachari, J. B. Foresman, J. Cioslowski, J. V. Ortiz, A. G. Baboul, B. B. Stefanov, G. Liu, A. Liashenko, P. Piskorz, I. Komaromi, R. Gomperts, R. L. Martin, D. J. Fox, T. Keith, M. A. Al-Laham, C. Y. Peng, A. Nanayakkara, C. Gonzalez, M. Challacombe, P. M. W. Gill, B. Johnson, W. Chen, M. W. Wong, J. L. Andres, C. Gonzalez, M. Head-Gordon, E. S. Replogle, J. A. Pople, Gaussian 98, Revision A.7, Gaussian, Inc., Pittsburgh, PA 1998.

[31] P. Thanikaivelan, V. Subramanian, J. R. Rao, B. U. Nair, Chem. Phys. Lett. 2000, 323, 59 – 70.

[32] P. Geladi, B. R. Kowalski, Anal. Chim. Acta 1986, 185, 1 – 17.

[33] R. Leardi, A. L. Gonzalez, Chemom. Intell. Lab. Syst. 1998, 41, 195 – 207.

[34] PLS-Genetic Algorithm toolbox. http://www.models.life.ku.dk/source/GAPLS/.

[35] R. Leardi, J. Chemometr. 2000, 14, 643 – 655.
