theoretical studies of hiv-1 reverse transcriptase inhibition

11
12614 Phys. Chem. Chem. Phys., 2012, 14, 12614–12624 This journal is c the Owner Societies 2012 Cite this: Phys. Chem. Chem. Phys., 2012, 14, 12614–12624 Theoretical studies of HIV-1 reverse transcriptase inhibition Katarzyna S ´ widerek, a Sergio Martı´ b and Vicent Moliner* b Received 25th March 2012, Accepted 22nd June 2012 DOI: 10.1039/c2cp40953d Computational methods for accurately calculating the binding affinity of a ligand for a protein play a pivotal role in rational drug design. We herein present a theoretical study of the binding of five different ligands to one of the proteins responsible for the human immunodeficiency virus type 1 (HIV-1) cycle replication; the HIV-1 reverse transcriptase (RT). Two types of approaches are used based on molecular dynamics (MD) simulations within hybrid QM/MM potentials: the alchemical free energy perturbation method, FEP, and the pathway method, in which the ligand is physically pulled away from the binding site, thus rendering a potential of mean force (PMF) for the binding process. Our comparative analysis stresses their advantages and disadvantages and, although the results are not in quantitative agreement, both methods are capable of distinguishing the most and the less potent inhibitors of HIV-1 RT activity on an RNase H site. The methods can then be used to select the proper scaffold to design new drugs. A deeper analysis of these inhibitors through molecular electrostatic potential (MEP) and calculation of the binding contribution of the individual residues shows that, in a rational design, apart from the strong interactions established with the two magnesium cations present in the RNase H site, it is important to take into account interactions with His539 and with those residues that are anchoring the metals; Asp443, Glu478, Asp498 and Asp549. The MEPs of the active site of the protein and the different ligands show a better complementarity in those inhibitors that present higher binding energies, but there are still possibilities of improving the favourable interactions and decreasing those that are repulsive in order to design compounds with higher inhibitory activity. Introduction Drugs used in medicinal chemistry are small molecules that bind to a protein thus disrupting the protein–protein interactions or inhibiting the reaction carried out by a protein if bonding takes place in the active site. These potential medicines would bind tightly to active sites, with a large negative standard ‘free energy of binding’, DG 0 b , and exert their biological effect at low doses. A good inhibitor would also present selectivity for the target enzyme, to prevent undesirable side effects that might arise if other enzymes are inhibited. 1 As a consequence, predicting reliable absolute binding free energies should provide a guide for rational drug design with an enormous save of time and costs and, in a more general context, can be used to understand the correlation between the structure and function of proteins. 2 Computational techniques could be used to pose or dock small molecules into a receptor site of a protein structure with the most stable relative orientation and to predict a binding free energy. 3 The computationally discovered compound could be then experimentally tested to confirm whether it shows some activity in an assay measuring biological response and, subsequently, optimized to obtain greater potency and pharmacologically acceptable properties. 4 Protein–ligand ‘‘docking’’ and ‘‘scoring’’ has been an active field of research for the last decade. 3,5 Nevertheless, as pointed out by Leach et al., 6 accurate prediction of binding affinities turns out to be genuinely difficult. Even highly accurate computed binding free energies have errors that represent a large percentage of the target free energies of binding. 3 In fact, Mulholland and co-workers recently stressed that despite decades of intensive research, no method exists currently that allows, from first principles, the binding free energy of a ligand to a protein to be predicted reliably, accurately, and within a reasonable timescale. 7 Current computational chemistry methods devoted to the development of new drugs range from quantitative structure– activity relationship (QSAR) methods 8 to those based on Mole- cular Dynamics (MD) simulations that allow getting a deeper insight into the foundations of the full binding process. In recent years, a number of studies have reported computations of binding free energies of small molecules to proteins using MD simulations with explicit solvent molecules. 9 In order to get a Departamento de Quimica Fisica, Universitat de Valencia, 46100 Burjasot, Valencia, Spain b Departament de Quı´mica Fı´sica i Analı´tica, Universitat Jaume I, 12071 Castello ´n, Spain. E-mail: [email protected]; Fax: +34 964728066 PCCP Dynamic Article Links www.rsc.org/pccp PAPER Downloaded on 03 October 2012 Published on 25 June 2012 on http://pubs.rsc.org | doi:10.1039/C2CP40953D View Online / Journal Homepage / Table of Contents for this issue

Upload: vicent

Post on 05-Oct-2016

213 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Theoretical studies of HIV-1 reverse transcriptase inhibition

12614 Phys. Chem. Chem. Phys., 2012, 14, 12614–12624 This journal is c the Owner Societies 2012

Cite this: Phys. Chem. Chem. Phys., 2012, 14, 12614–12624

Theoretical studies of HIV-1 reverse transcriptase inhibition

Katarzyna Swiderek,aSergio Martı

band Vicent Moliner*

b

Received 25th March 2012, Accepted 22nd June 2012

DOI: 10.1039/c2cp40953d

Computational methods for accurately calculating the binding affinity of a ligand for a protein

play a pivotal role in rational drug design. We herein present a theoretical study of the binding of

five different ligands to one of the proteins responsible for the human immunodeficiency virus

type 1 (HIV-1) cycle replication; the HIV-1 reverse transcriptase (RT). Two types of approaches

are used based on molecular dynamics (MD) simulations within hybrid QM/MM potentials: the

alchemical free energy perturbation method, FEP, and the pathway method, in which the ligand

is physically pulled away from the binding site, thus rendering a potential of mean force (PMF)

for the binding process. Our comparative analysis stresses their advantages and disadvantages

and, although the results are not in quantitative agreement, both methods are capable of

distinguishing the most and the less potent inhibitors of HIV-1 RT activity on an RNase H site.

The methods can then be used to select the proper scaffold to design new drugs. A deeper

analysis of these inhibitors through molecular electrostatic potential (MEP) and calculation of the

binding contribution of the individual residues shows that, in a rational design, apart from the

strong interactions established with the two magnesium cations present in the RNase H site, it is

important to take into account interactions with His539 and with those residues that are

anchoring the metals; Asp443, Glu478, Asp498 and Asp549. The MEPs of the active site of

the protein and the different ligands show a better complementarity in those inhibitors that

present higher binding energies, but there are still possibilities of improving the favourable

interactions and decreasing those that are repulsive in order to design compounds with

higher inhibitory activity.

Introduction

Drugs used in medicinal chemistry are small molecules that bind

to a protein thus disrupting the protein–protein interactions or

inhibiting the reaction carried out by a protein if bonding takes

place in the active site. These potential medicines would bind

tightly to active sites, with a large negative standard ‘free energy

of binding’, DG0b, and exert their biological effect at low doses. A

good inhibitor would also present selectivity for the target

enzyme, to prevent undesirable side effects that might arise if

other enzymes are inhibited.1 As a consequence, predicting

reliable absolute binding free energies should provide a guide

for rational drug design with an enormous save of time and costs

and, in a more general context, can be used to understand the

correlation between the structure and function of proteins.2

Computational techniques could be used to pose or dock small

molecules into a receptor site of a protein structure with the most

stable relative orientation and to predict a binding free energy.3

The computationally discovered compound could be then

experimentally tested to confirm whether it shows some activity

in an assay measuring biological response and, subsequently,

optimized to obtain greater potency and pharmacologically

acceptable properties.4

Protein–ligand ‘‘docking’’ and ‘‘scoring’’ has been an active

field of research for the last decade.3,5 Nevertheless, as pointed

out by Leach et al.,6 accurate prediction of binding affinities turns

out to be genuinely difficult. Even highly accurate computed

binding free energies have errors that represent a large percentage

of the target free energies of binding.3 In fact, Mulholland and

co-workers recently stressed that despite decades of intensive

research, no method exists currently that allows, from first

principles, the binding free energy of a ligand to a protein to be

predicted reliably, accurately, and within a reasonable timescale.7

Current computational chemistry methods devoted to the

development of new drugs range from quantitative structure–

activity relationship (QSAR) methods8 to those based on Mole-

cular Dynamics (MD) simulations that allow getting a deeper

insight into the foundations of the full binding process. In recent

years, a number of studies have reported computations of

binding free energies of small molecules to proteins using MD

simulations with explicit solvent molecules.9 In order to get

aDepartamento de Quimica Fisica, Universitat de Valencia,46100 Burjasot, Valencia, Spain

bDepartament de Quımica Fısica i Analıtica, Universitat Jaume I,12071 Castellon, Spain. E-mail: [email protected];Fax: +34 964728066

PCCP Dynamic Article Links

www.rsc.org/pccp PAPER

Dow

nloa

ded

on 0

3 O

ctob

er 2

012

Publ

ishe

d on

25

June

201

2 on

http

://pu

bs.r

sc.o

rg |

doi:1

0.10

39/C

2CP4

0953

DView Online / Journal Homepage / Table of Contents for this issue

Page 2: Theoretical studies of HIV-1 reverse transcriptase inhibition

This journal is c the Owner Societies 2012 Phys. Chem. Chem. Phys., 2012, 14, 12614–12624 12615

reliable binding free energies with MD simulations, two types

of approaches are basically used: the alchemical free energy

perturbation methods, and the pathway methods. The former,

termed in such a way by Gao et al. because the kind of

simulations cannot be performed in the laboratory,10 are also

called the double annihilation method11 and are based on

thermodynamic cycles in which the interaction of the ligand

with its surroundings is progressively switched off in both the

active site and the bulk solution. This is done through a series

of nonphysical (alchemical) over-lapping states using thermo-

dynamic integration (TI) or free energy perturbation (FEP)

procedures. FEP/MD applications to biological processes, and

in particular to studies of proteins, were first introduced by

Warshel and co-workers,12 who obtained absolute binding free

energies by means of thermodynamic cycles.13

In the pathway methods the ligand is physically pulled away

from the binding site and then the binding free energy can be

computed as a potential of mean force (PMF). Van Gunsteren

and co-workers reviewed and compared up to twelve different

methods to compute a PMF, analyzing all combinations of the

type of sampling (unbiased, umbrella-biased or constraint-biased),

the way of computing free energies (from density of states or

force averaging) and the type of coordinate system (internal or

Cartesian) used for the PMF degree of freedom.14 The first

problem that arises when computing the binding PMF is the

need of transforming the configurational partition function

from Cartesian to internal coordinates, which is the case if we

are interested in applying restrains to the relative distance

between the protein and the ligand to get the PMF associated

with the binding process. The correction can be related to

Jacobian factors and then the final one dimension PMF can be

corrected for the missing Jacobian contribution to get the true

PMF energy profile.14–16 Intrinsically, the problem present

when using a PMF to determine the standard free energy of

binding is that configurational areas at each point along a

single reaction coordinate are not equal in size but increasing

as 4pr2. A corrected PMF along one-dimensional radial reaction

coordinate, r, can be obtained by adding extra terms accounting

for the 4pr2 increase in area as the ligand unbinds. Other

approaches have emerged to correct the limitations of one dimen-

sion PMFs such as the one proposed by Woo and Roux, in which

multiple restrains to the ligand are applied.17 Recently, Henchman

and co-workers have applied a similar approach to the one of

Woo and Roux but using a z-component reaction coordinate to

get the PMF and applying orthogonal restrains to this reaction

coordinate, thus limiting the sampling required in each window,18

and Colizzi et al. described a new method for predicting protein–

ligand binding affinities that works by modelling the force that is

required to pull inhibitors out of their complexes with proteins.19

Mulholland and co-workers7 presented amethod involving the use

of a new reaction coordinate, similar to the method proposed

by Helms andWade20 that swaps a ligand bound to a protein with

an equivalent volume of bulk water by using replica exchange

thermodynamic integration methods.21

In practice, a critical issue with FEP or PMF/MD simulations

is to achieve a sufficient sampling, which can be attacked by

means of different approaches such as the Replica Exchange

Molecular Dynamics (REMD) based strategies22 which has been

shown to improve the statistical convergence in calculations of

absolute binding free energy of ligands to proteins.23 Parrinello

and co-workers developed the Metadynamics method,24 and

successfully applied in docking ligands on flexible receptors in

water solution. Its added value is that it reconstructs the complete

free energy surface, including all the relevant minima and the

barriers between them.25 More recently, Warshel and co-workers

have shown an effective model for high level calculations for

configurational sampling, paradynamics,26 that is based on the

use of an empirical valence bond (EVB) reference potential.

An alternative to these rigorous but expensive methods is

those based on the microscopic linear response approximation

(LRA) and related approaches.13,27,28 These include the linear

interaction energy (LIE) method,29 where the nonelectrostatic term

is estimated by scaling the average van der Waals interactions, or

the less accurate semi-macroscopic methods such as the protein

dipoles Langevin dipoles (PDLD/S) in its LRA form (PDLD/

S-LRA),13,28 the molecular mechanics/Poisson–Boltzmann surface

area (MM/PBSA) or the molecular mechanics/generalized born

surface area (MM/GBSA).30 In a recent paper of Singh and

Warshel, where a comparative analysis of these virtual screening

scoring methods is carried out, it is concluded that although some

of these approaches such as the PDLP/S-LRA appear to offer an

appealing option, the accurate screening of ligands with similar

binding affinities is hard to approach.31 Genheden and Ulf have

also performed an interesting analysis of the efficiency of the LIE

and MM/GBSA methods to calculate ligand-binding energies

concluding that although reasonable binding affinities can be

obtained with LIEmethods,MM/GBSA gives a significantly better

predictive index and correlation to the experimental affinities.32 All

in all, it appears that, as mentioned before, further investigations to

predict accurate ligand–proteins binding energies are required in

order to improve the correlation between theoretical predictions

simulations and observed inhibitory activities.

One of the most devastating diseases of the early 21st century

is still the acquired immunodeficiency syndrome (AIDS), caused

by the human immunodeficiency virus type 1 (HIV-1). Reverse

transcriptase (RT), protease (PR) and integrase (IN) are the three

key enzymes of HIV-1 as drug targets due to their essential role

in HIV replication. Despite the shortcomings of chemical drugs

such as toxicity, lack of curative and multiple effects, the search

for more and better anti-HIV agents remains a challenge.

RT possesses two enzymatic activities, DNA polymerase and

ribonuclease H (RNase H), located in two distinct domains.

Inhibitors of RT which act at the polymerase active site have

received much attention in the drugs development,33 but more

recently, there has been considerable interest in developing a

specific inhibitor of the RNase H activity of RT by combinations

of nucleoside (NRTIs) and non-nucleoside inhibitors (NNRTIs)

of HIV-1 RT.34 In fact, the two type of RT inhibitors can block

the RNase H or polymerase activity and thus, as suggested by

Arnold and co-workers, a better understanding of the structure

and function(s) of RT, conjointly with the mechanism(s) of

inhibition, can be used to generate better drugs.35

Binding energies of different NNRTIs in the binding pocket

of HIV-1 RT have been the subject of previous theoretical

studies where interaction energies to the different residues of the

allosteric binding pocket have been identified by MM-PBSA

combined with molecular docking,36 by quantum methods37,38

and by structure energy optimizations with ONIOMmethods.39

Dow

nloa

ded

on 0

3 O

ctob

er 2

012

Publ

ishe

d on

25

June

201

2 on

http

://pu

bs.r

sc.o

rg |

doi:1

0.10

39/C

2CP4

0953

D

View Online

Page 3: Theoretical studies of HIV-1 reverse transcriptase inhibition

12616 Phys. Chem. Chem. Phys., 2012, 14, 12614–12624 This journal is c the Owner Societies 2012

More recently, Warshel and co-workers have evaluated the

performance of a PDLD/S-LRA/b method by computing the

absolute binding free energies of a set of NNRT inhibitors,31 and

Jorgensen, Anderson and co-workers have discovered potent

NNRTIs with guidance from molecular modeling including

FEP calculations for protein–inhibitor binding affinities.40

In this paper we are using two types of approaches based on

MD simulations within hybrid Quantum Mechanic/Molecular

Mechanics (QM/MM) potentials, the alchemical FEP and the

pathway method based on calculations of the PMF from

umbrella-biased simulations. Both methods will be applied

to study the binding energy of five different NRTI inhibitors of

the HIV-1 RT (see Scheme 1). These inhibitors are believed to

bind the RNase H active site of the enzyme, which contains

two magnesium ions that seem to be involved in the catalytic

mechanism of RNA phosphodiester bond hydrolysis.41 Never-

theless, as mentioned by Budihas et al.,42 since HIV-1 RT is a

bifunctional enzyme, inhibition of its activity might reflect

direct binding to the RNase H domain but, alternatively, an

allosteric effect through binding to the nonnucleoside binding

pocket close to a DNA polymerase domain, which has been

shown previously to modulate RNaseH function. The effect of

the different parameters selected to run the simulations will be

systematically studied. The results will stress that MD based

methods, together with analysis of electrostatic potentials and

ligand–protein specific interactions, reveal to be a promising

computational protocol to predict the structures and features

of new putative drugs.

Computational methods

Setup of system

The first step of our protocol was setting up the system for

performing the MD simulations. In order to carry out this kind

of study for such large molecular systems, hybrid QM/MM

potentials can be used, where a small part of the system, the

ligand, is described by quantum mechanics while the protein

and solvent environment was represented by classical force field.43

This hybrid methodology, by comparison with methods based on

just MM, avoids most of the work needed to obtain new force

field parameters for each new species. Treating the ligand quantum

mechanically and the protein molecular mechanically has the

additional advantage of the inclusion of quantum effects such as

ligand polarization upon binding.44 Moreover, as the largest part

of the system is described classically, sampling can be obtained at

reasonable computational cost although, obviously, it is much

more expensive than if based on just classical force fields.

The initial coordinates for the QM/MM MD calculations

were taken from the 3.0 A resolution structure of wild-type

HIV-1 RT in a complex with an RNA:DNA oligonucleotide

(PDB code 1HYS).45 After removing the RNA:DNA helix, the

missing atoms of side chain residues of the RNase H active site

were added using the SCit program (http://bioserv.rpbs.jus

sieu.fr/cgi-bin/SCit).

In order to get the optimal pose of the five ligands and the

two Mg2+ cations in the active site of the protein, their

coordinates were obtained by overlapping the initial structure

(1HYS) with the corresponding X-ray structures of the protein

complexed with the corresponding nucleoside inhibitor: (1)

2,7-dihydroxy-4-(propan-2-yl)cyclohepta-2,4,6-trien-1-one or

b-thujaplicinol (PDB code 3IG1), (2) 3-cyclopentyl-1,4-

dihydroxy-1,8-naphthyridin-2(1H)-one (PDB code 3LP1), (3)

ethyl 1,4-dihydroxy-2-oxo-1,2-dihydro-1,8-naphthyridine-3-

carboxylate (PDB code 3LP0), (4) 3-[4-(diethylamino)-

phenoxy]-6-(ethoxycarbonyl)-5,8-dihydroxy-7-oxo-7,8-dihydro-

1,8-naphthyridin-1-ium (PDB code 3LP3) and (5) 2-dihydroxy-

4-1(methylethyl)-2,4,6-cycloheptatrien-1-one or b-thujaplicin (PDB

code 3IG1). The DaliLite46 program, available online (http://www.

ebi.ac.uk/Tools/es/cgi-bin/dalilite/index.html), was used to overlap

the structures within the maximum similarity score S, defined as:

S ¼Xi2core

Xj2core

ðy� DðdAij ; d

Bij ÞoðdA

ij ; dBij ÞÞ ð1Þ

where core is the set of structurally equivalent residue pairs

(iA, iB) between A and B, D is the deviation of intramolecular

Ca–Ca distances between (iA, jA) and (iB, jB), relative to their

arithmetic mean (d), and y is the threshold of similarity, set

empirically to 0.2. The envelope function o= exp(�d2/r2), wherer = 20 A, downweights contributions from distant pairs.

Since the standard pKa values of ionizable groups can be

shifted by local protein environments,47 an accurate assign-

ment of the protonation states of all these residues at pH = 7

was carried out. Recalculation of the pKa values of the

titratable amino acids has been done using the empirical

PropKa program of Jensen et al.48 According to these results,

most residues were found at their standard protonation state,

except Asp-110, Glu-514 residues in chain A and Asp-218,

Glu-28 residues in chain B that should be protonated.

Additionally in chain B, His-221 and His-235 were protonated

at the d-position and His-208 at the e-position, and His-539

from chain A was protonated at both d- and e-position. Then,a total of 32 counter ions (Cl�) were placed into optimal

electrostatic positions around the protein, in order to

obtain electro neutrality. Afterwards, a series of optimization

algorithms (steepest descent conjugated gradient and

L-BFGS-B49) were applied. To avoid a denaturation of the

Scheme 1 Structures of five nucleoside inhibitors of HIV-1 RT: (1)

2,7-dihydroxy-4-1(methylethyl)-2,4,6-cycloheptatrien-1-one or b-thujapli-cinol, (2) 3-cyclopentyl-1,4-dihydroxy-1,8-naphthyridin-2(1H)-one, (3)

ethyl 1,4-dihydroxy-2-oxo-1,2-dihydro-1,8-naphthyridine-3-carboxylate,

(4) 3-[4-(diethylamino)phenoxy]-6-(ethoxycarbonyl)-5,8-dihydroxy-7-oxo-

7,8-dihydro-1,8-naphthyridin-1-ium, and (5) 2-dihydroxy-4-1(methylethyl)-

2,4,6-cycloheptatrien-1-one or b-thujaplicin.

Dow

nloa

ded

on 0

3 O

ctob

er 2

012

Publ

ishe

d on

25

June

201

2 on

http

://pu

bs.r

sc.o

rg |

doi:1

0.10

39/C

2CP4

0953

D

View Online

Page 4: Theoretical studies of HIV-1 reverse transcriptase inhibition

This journal is c the Owner Societies 2012 Phys. Chem. Chem. Phys., 2012, 14, 12614–12624 12617

protein structure during this step, all the heavy atoms

of the protein and the inhibitor were restrained by means of

a Cartesian harmonic umbrella with a force constant of

1000 kJ mol�1 A�2. Afterward, the system was fully relaxed,

but the peptidic backbone was restrained with a lower constant

of 100 kJ mol�1 A�2.

The optimized protein was placed in a box of pre-

equilibrated waters (140 � 80 � 80 A3), using the principal

axis of the protein–inhibitor complex as the geometrical

center. Any water with an oxygen atom lying in a radius of

2.8 A from a heavy atom of the protein was deleted.

The remaining water molecules were then relaxed using

optimization algorithms. Finally, 100 ps of hybrid QM/MM

Langevin–Verlet MD (NVT) at 300 K were used to equilibrate

the solvent.

During the MD simulations, the atoms of the ligands were

selected to be treated by QM, using a semiempirical AM1

hamiltonian.50 The rest of the system (protein plus water

molecules) were described using the OPLS-AA51 and TIP3P52

force fields, respectively, as implemented in the fDYNAMO

library.53

Due to the large amount of degrees of freedom, in order to

reduce the cost of the calculations, any water molecule 25 A

apart from any of the atoms of the Mg2+ ions located in the

active site was deleted and protein atoms and all water

molecules 20 A apart from any of the atoms of the initial

inhibitor were kept frozen in the remaining calculations.

Cut-offs for the non-bonding interactions were applied using

a switching scheme, within a range radius from 14.5 to 16 A.

Finally, the system was equilibrated by means of 1 ns of

QM/MM MD at temperature of 309 K.

The potential energy

The potential energy of our scheme is derived from the

standard QM/MM formulation:

EQM=MM ¼hCjH0jCi

þX

CqMM

re;MM

��������C

� �þXXZQMqMM

rQM;MM

� �

þ EvdWQM=MM þ EMM ð2Þ

EQM/MM = Evac + EelecQM/MM + EvdW

QM/MM + EMM (3)

where EMM is the energy of the MM subsystem, EvdWQM/MM the

van der Waals interaction energy between the QM and MM

subsystems, Evac is the gas phase energy of the polarized QM

subsystem and EelectQM/MM includes both the coulombic inter-

action of the QM nuclei (ZQM) and the electrostatic inter-

action of the polarized electronic wave-function (c) with the

charges of the protein (qMM).

The interaction energy between the inhibitor and the

environment, computed by residue, can be evaluated as

the difference between the QM/MM energy and the energies

of the separated, non-interacting, QM and MM subsystems

with the same geometry. Considering that the MM part is

described using a non-polarizable potential the interaction

energy contribution of each residue (i) of the protein is given

by the following expression:

EIntQM=MM;i ¼ Eelect

QM=MM þEvdWQM=MM;i

¼XMM2i

CqMM

re;MM

��������C

� �þXQM

ZQMqMM

rQM;MM

" #

þXX

4eQM;MMsQM;MM

rQM;MM

� �12

� sQM;MM

rQM;MM

� �6" #

ð4Þ

Alchemical free energy perturbation (FEP) methods

In order to evaluate the ligand–protein interaction free energy,

a series of QM/MM MD simulations have been carried out in

the active site of the protein and in a box of solvent water

molecules, introducing two parameters; l and g parameters in

the electrostatic and van der Waals QM/MM interaction

terms, respectively:

EQM=MMðlÞ ¼ hCjH0jCi

þ lX

CqMM

re;MM

��������C

� �þXXZQMqMM

rQM;MM

� �

þ gEvdWQM=MM þ EMM

ð5Þ

Then l smoothly changes between values of 1, corresponding

to a full MM charge–wave function interaction, and 0 whereas

no electrostatic interaction with the force field is introduced.

Once the electrostatic charges of the ligand (QM region) are

annihilated, g smoothly changes between values of 1, corre-

sponding to a full QM-MM van der Waals interaction, and 0

whereas no van der Waals interaction with the force field is

introduced. The calculation of the free energy difference of two

consecutive windows of the two stages, annihilation of charges

and van der Waals parameters, is performed by means of FEP

methods, and then the total free energy change is evaluated as

the sum of all the windows covering the full transformation

from the initial to the final state. In practice, this is done in two

steps (eqn (6) and (7)), once the charges are annihilated then

the van der Waals:

DGelecQM=MM ¼ �

1

b

Xlnhe�b½Eðliþ1Þ�EðliÞili

h ig¼1

ð6Þ

DGvdWQM=MM ¼ �

1

b

Xlnhe�b½Eðgiþ1Þ�EðgiÞigi

h il¼0

ð7Þ

We have used a total of 50 windows to evaluate the electrostatic

interaction term, from l = 0 (no electrostatic interaction) to

l = 1 (full interaction), which turns into a dl of 0.02, and the

same amount of windows have been used to calculate the van

der Waals interaction term, from g = 0 (no electrostatic

interaction) to g = 1 (full interaction), which turns into a

dg of 0.02. The procedure can be applied both forwards (i -

i + 1) and backwards (i + 1 - i) as long as the comparison

provides information about the convergence of the process.

In each window a total of 5 ps of relaxation followed by 10, 50

Dow

nloa

ded

on 0

3 O

ctob

er 2

012

Publ

ishe

d on

25

June

201

2 on

http

://pu

bs.r

sc.o

rg |

doi:1

0.10

39/C

2CP4

0953

D

View Online

Page 5: Theoretical studies of HIV-1 reverse transcriptase inhibition

12618 Phys. Chem. Chem. Phys., 2012, 14, 12614–12624 This journal is c the Owner Societies 2012

or 100 ps of production QM/MM MD, test dependent, have

been performed using the NVT ensemble at the reference

temperature of 309 K.

According to Scheme 2, it is possible to compute the binding

free energy of a particular ligand, which could be directly

compared with experimental data such as the inhibition constants,

KI. Thus, the predicted value would be expressed as:

DG = DGW � DGE (8)

where the two terms on the right side of eqn (8) are computed

by applying eqn (6) and (7), in water and in the enzyme,

respectively. It is important to point out that, in computing

DGW, water molecules can occupy the cavity left by the ligand

when all interactions between ligand and water environment

are switched off. In such a case, if the volume of full solvent

water molecules was not big enough, a decrease in the density

of the box could prevent closing of the thermodynamic cycle in

Scheme 2. A similar argument could be applied to the calculation

of DGE but, in this case, although the cavity created when

removing the ligand from the active site can be occupied by

water molecules of the solvent (keeping in mind that an RNase H

active site is quite flat and open to solvent), the cavity can be

also occupied by the protein, as long as flexibility to the protein

is allowed, and then the impact of this effect is minimized.

Free energy of binding from potential of mean force

(PMF)

The PMFs traced in order to obtain the free energy54,55 of the

process of pulling the ligand from the enzyme binding site to

solution has been calculated using the weighted histogram analysis

method (WHAM) combined with the umbrella sampling

approach55,56 as implemented in fDYNAMO. The procedure for

the PMF calculation is straightforward and requires a series of

molecular dynamics simulations in which the distinguished reaction

coordinate variable, x, is constrained around particular values.55 In

our case, the reaction coordinate x was chosen as the distance

between the center of mass of the ligand and the position of a fixed

water molecule 25 A away from the active site in a perpendicular

direction. The values of the variable sampled during the simula-

tions (which covers a distance of 12 A) are then pieced together to

construct a distribution function from which the PMF is obtained

as a function of the distinguished reaction coordinate (W(x)). ThePMF is related to the normalized probability of finding the system

at a particular value of the chosen coordinate by eqn (9):

W(x) = C � kT lnRr(rN)d(x(rN) � x)drN�1 (9)

The difference of free energy can be then expressed as:57

DG(x) = W(xw) � [W(xbs) + Gx(xbs)] (10)

where the superscripts indicate the value of the reaction

coordinate at the initial state (in the binding site, bs) and final

state (in water environment, w) and Gx(xbs) is the free energy

associated with setting the reaction coordinate to a specific

value at the initial state; in the active site. Normally this last

term makes a small contribution58 and the change in free

energy is directly estimated from the PMF change between the

initial and final states of the profile:

DG(x) = W(xw) � W(xbs) = DW(x) (11)

Results and discussion

Alchemical free energy perturbation methods

First step of our study was to evaluate the electrostatic and

van der Waals contributions to the free energy of binding of

the five ligands in the active site of HIV-1 RT and in aqueous

solution, as a function of the length of the MD sampling. The

results are reported in Table 1.

As can be observed in Table 1a, the values of the electro-

static free energy terms obtained in the enzyme active site

strongly depend on the time of simulation, being clearly over-

estimated if short MD, 10 ps, is used. Changes from 10 ps to 50 ps

range from 4.8 kcal mol�1 (in Lig2) to 11.5 kcal mol�1 (in Lig3),

while changes from 50 ps to 100 ps are dramatically reduced from

0.4 kcal mol�1 (in Lig2, thus virtually invariant) to 4.2 kcal mol�1

(Lig4). Accordingly, changes below 2 kcal mol�1 for all ligands

when enlarging the sampling up to 200 ps would be expected.

These results would not be qualitatively different from the

ones obtained with 100 ps sampling. Thus, after evaluating the

variations between 10, 50 and 100 ps, we can consider that

100 ps MD is a good compromise between computer resources

and quality of results, keeping in mind that we are interested in

a comparative analysis. As expected, the calculation of the

electrostatic term of the free energy of binding in water does

Scheme 2 Thermodynamic cycle to compute enzyme–ligands binding

free energies from alchemy free energy perturbation methods. E is the

enzyme with ligand (L) in its binding site, E0 is the apo form of

enzyme, (L)0 is the ligand in gas phase.

Table 1 Electrostatic, DGe (a) and van der Waals, DGvdW (b) terms ofthe protein–ligand and water–ligand interaction free energy as afunction of the MD simulations. All values are reported in kcal mol�1

Protein–ligand Water–ligand

10 ps 50 ps 100 ps 10 ps 50 ps

(a)Lig-1 �38.2 �30.1 �27.6 �5.1 �5.0Lig-2 �29.7 �24.9 �24.5 �11.6 �12.0Lig-3 �49.2 �37.7 �35.0 �8.5 �8.2Lig-4 �38.3 �29.8 �25.6 �14.0 �13.4Lig-5 �27.7 �20.8 �17.1 �5.4 �4.8

Protein–ligand Water–ligand

10 ps 50 ps 100 ps 10 ps 50 ps 100 ps

(b)Lig-1 �23.5 �19.6 �19.6 �17.3 �16.8 �16.4Lig-2 �27.9 �28.6 �26.7 �23.7 �23.7 �23.4Lig-3 �25.6 �23.8 �26.5 �21.7 �23.1 �23.0Lig-4 �41.9 �39.2 �36.9 �33.9 �36.5 �36.8Lig-5 �15.8 �15.6 �13.2 �15.4 �16.6 �16.5

Dow

nloa

ded

on 0

3 O

ctob

er 2

012

Publ

ishe

d on

25

June

201

2 on

http

://pu

bs.r

sc.o

rg |

doi:1

0.10

39/C

2CP4

0953

D

View Online

Page 6: Theoretical studies of HIV-1 reverse transcriptase inhibition

This journal is c the Owner Societies 2012 Phys. Chem. Chem. Phys., 2012, 14, 12614–12624 12619

not show such strong dependency and then 50 ps MD will be the

length of our sampling. Regarding the van der Waals interaction

term, the dependence on the length of the simulations is not so

dramatic (see Table 1b). The equilibration of the water molecules

or the protein during the annihilation of the ligands can be

considered as fully converged, in both media, after 100 ps of MD

and these are the values used for the FEP calculations. Thus,

according to Scheme 2, the binding free energy of the five ligands

is computed by means of eqn (6)–(8), and presented in Table 2.

The energy decomposition of the two terms, van der Waals and

electrostatics is also reported in Table 2. The first conclusion that

can be obtained is that the contribution of the van der Waals

interaction to the binding free energy is much smaller than the

contribution of the electrostatic term. Obviously this is due to the

fact that very close values are obtained in both environments for

all ligands, which is a reasonable result taking into account that

van der Waals interactions depend mostly on the molecular

surface of the ligand. As observed in Table 2, the largest value

corresponds to the biggest molecule, L4, while the smallest value

is obtained for the smallest molecules, L1 and L5, in both

aqueous solution and in the enzyme active site. Thus, the total

free energy of binding will depend mostly on the difference

between electrostatic interactions in solution and in the enzyme,

as previously suggested by Warshel and co-workers.59 By

comparing the different ligands, L5 shows the lowest electro-

static free energy of interaction in the RNase H active site, while

L3 is the one presenting the largest value. The same trend is

obtained when computing the final binding free energies.

The relative significance of the electrostatic (DGe) and van der

Waals (DGvdW) terms to the binding energy can be also shown

by the more computing demanding calculation of the electro-

static and van der Waals free energy interaction terms along the

binding pathway of the ligand from the water to the active site.

The result obtained for L1 is plotted in Fig. 1 and confirms the

previous predictions: the change of the Lennard-Jones inter-

actions from the aqueous environment to the active site of the

enzyme is significantly smaller than the variation of the electro-

static interactions that, therefore, is the term that will determine

the total binding free energy of the process. Nonetheless, the

evolution of both terms is favourable for the binding process.

Another important conclusion that can be derived from the

plot is that, according to the distance evolution of the electro-

static interaction term presented in Fig. 1a the ligand can be

considered as completely solvated when located at ca. 8 A

from the active site. According to this criterion, the variation

of the electrostatic and van der Waals interactions, from the

enzyme active site to the bulk, would be ca. 33 kcal mol�1 and

ca. 8 kcal mol�1, respectively. This result, although cannot be

considered at a quantitative level as long as longer dynamics is

carried out, is in good agreement with DGelec and DGvdW

obtained from eqn (6) and (7) and reported in Table 2, which

gives some credit to our results.

Pathway methods

In order to study the ligand–protein binding free energies by

means of the umbrella biased PMF along a coordinate defining

the distance from the ligand to the active site, it is necessary to

define some variables. These are, the force constant of the

umbrella potential energy term applied to properly overlap

the consecutive windows during the sampling, the window

width or step size where the restraining or umbrella potential

energy term is centred along the path, and the time of the MD

simulations carried out in each window. Taking the L1 as a

benchmark, the effect of the force constant and the step size

on the PMF profile for a defined time of the MD is presented

in Fig. 2.

As can be observed in Fig. 2, the free energy of binding, as

well as the shape of the profile, presents a strong dependency

on the value of the constant, as previously shown by Henchman

and co-workers18 and by Fabrittis and co-workers:60 the larger the

force constant is, the larger the free energy difference between the

ligand in the active site and completely solvated in the bulk. This

effect is reduced if the size of the step is reduced from 0.2 A to

0.1 A, as observed in Fig. 2a. Obviously, the force constant

dependency is due to an overlapping conformational sampling

deficiency, which is partially minimized when reducing the step size

of the umbrella function. Fig. 2a shows that, albeit reducing the

step size of the scan along the coordinate minimizes the problem of

sampling, it is not completely cancelled out (no convergence is

obtained with the force constant). Henchman and co-workers

already detected this problem when computing binding PMFs

without orthogonal restraints to the single reaction coordinate.18

Table 2 Electrostatic (DGe) and van der Waals (DGvdW) terms of thewater–ligand and protein–ligand interaction free energy, and differencesin interaction between water and the binding site of the protein (DGelec,DGvdW and DGbind). All values are reported in kcal mol�1

Water Enzyme

DGelec DGvdW DGbindDGe-w DGvdW-w DGe-e DGvdW-e

Lig-1 �5.0 �16.4 �27.6 �19.6 �22.6 �3.2 �25.8Lig-2 �12.0 �23.4 �24.5 �26.7 �12.5 �3.3 �15.8Lig-3 �8.2 �23.0 �35.0 �26.5 �26.8 �3.5 �30.3Lig-4 �13.4 �36.8 �25.6 �36.9 �12.2 �0.1 �12.3Lig-5 �4.8 �16.5 �17.1 �13.2 �12.3 3.3 �9.0

Fig. 1 Electrostatic (A) and van der Waals (B) ligand–environment

interaction energy terms along the binding path from aqueous to the

enzyme active site (distance in A). Results obtained for ligand 1.

Dow

nloa

ded

on 0

3 O

ctob

er 2

012

Publ

ishe

d on

25

June

201

2 on

http

://pu

bs.r

sc.o

rg |

doi:1

0.10

39/C

2CP4

0953

D

View Online

Page 7: Theoretical studies of HIV-1 reverse transcriptase inhibition

12620 Phys. Chem. Chem. Phys., 2012, 14, 12614–12624 This journal is c the Owner Societies 2012

Consequently, and considering that this is due to an unbiased

sampling of the conformational space along the path from the

water to the active site, an analysis of the effect of the force

constant and the length of the MDs has been carried out. The

results of increasing the time of the MDs performed in each of

the windows along the binding path, from 10 ps to 1 ns, is

depicted in Fig. 3. As can be observed, while the values

obtained with short MD are clearly dependent on the selected

umbrella force constant, when long dynamics are performed,

the effect is clearly minimized. Fig. 3 suggests that long MDs

are required to get reliable values of binding free energies with

reduced dependency on the selected force constant. Fabrittis

and co-workers have also demonstrated that long sampling

MD simulations are required for robust binding predictions.58

Thus, in order to study the binding of the five nucleoside

inhibitors of the HIV-1 RT by means of the pathway method,

we will apply the umbrella sampling with a step-size of 0.1 A to

cover the range from inside the active site to the bulk solvent,

with a force constant of 100 kJ mol�1 A�2 and MD simulations

of 1 ns. We must keep in mind that while previous binding

studies based on PMF methods were carried out by means of

classical force fields describing both ligand, protein and solvent,

our MD simulations are based on hybrid potentials and thus,

the calculations are much more CPU time demanding.

The final values of the binding free energies obtained by

means of the PMFs for the five ligands within a cartesian force

constant equal to 100 kJ mol�1 A�2, a step size of 0.1 A and

1 ns MD simulations for each window are listed in Table 3.

The values obtained by means of the alchemical FEPs methods

are also listed in the table for comparative purposes.

As observed in Table 3, both methods would present L3 as

the best candidate to inhibit the HIV-1 RT while L5 gives

the lowest negative binding energy. This one is, in fact, the

one that has experimentally shown the lowest activity,

while the other four ligands present similar activities.42,61

Nevertheless, as mentioned above, a direct comparison between

our binding free energies of the ligands to the RNase H site and

experimental data, IC50 (value which expresses the concentration

of an inhibitor required to produce 50 per cent inhibition of an

enzymic reaction at a specific substrate concentration), is proble-

matic since the experimental values can reflect other kind of

inhibition processes. Also, according to Cheng and Prusoff,62

the inhibitory binding constant of a competitive inhibitor to

the enzyme, or the reciprocal of the dissociation constant of

the enzyme–inhibitor complex, is the value that could be straight-

forward compared with binding free energy values. IC50 depends

on the concentration of the substrate and its dissociation

constant. If concentration of the substrate is the same in all

experiments and in any case much larger than the corre-

sponding substrate Michaelis constant, the experimentally

measured IC50 of all five residues would be related with free energies

of binding in the range from �5.5 (L5) to �9.5 kcal mol�1 (L2).

Then, the agreement between our theoretical predictions and the

experimental data has to be considered at a qualitative level, since

the results appear to be clearly overestimated.

There are, nevertheless, significant differences between the

absolute values, and trends, obtained by PMF and FEP based

methods for the rest of the inhibitors, especially for ligands L2

and L4. The origin of this discrepancies could be a deficiency

sampling during the simulations. Keeping in mind that these

two ligands are large and present more functional groups

(hydroxyl, carbonyl and ester groups) than the rest of ligands,

more specific interactions with the pocket can be established

and probably the effect of lack of sampling is more dramatic in

these two ligands. This effect is enhanced in the PMF calculations

along the path. As observed in Fig. 3, the longer the sampling the

smaller the binding free energy and the dependency on the

umbrella force constant. Correction of this possible error would

imply longer MD or the use of, for instance, perpendicular

constrains as the ones used by Henchman and co-workers.18

Another possible source of error can be the underestimation of

the cavity created in the solvent environment after ‘‘deleting’’

the ligands during the calculation of the DGvdW term with the

Fig. 2 (a) PMF dependency on the force constant k (in kJ mol�1 A�2)

and step size (0.1 and 0.2 A). (b) Free energy profile dependency on the

umbrella sampling force constant k with a step size equal to 0.2 A.

Fig. 3 Binding free energy (PMF) dependency on the time of the MD

simulations performed in the umbrella sampling, and on the harmonic

umbrella force constant. Results obtained for ligand 1.

Table 3 Ligand–protein binding free energies obtained by means ofthe FEP and PMF methods (in kcal mol�1)

Lig-1 Lig-2 Lig-3 Lig-4 Lig-5

DGb from PMF �30.4 �34.6 �57.2 �50.7 �16.4DGb from FEP �25.8 �15.8 �30.3 �12.3 �9.0

Dow

nloa

ded

on 0

3 O

ctob

er 2

012

Publ

ishe

d on

25

June

201

2 on

http

://pu

bs.r

sc.o

rg |

doi:1

0.10

39/C

2CP4

0953

D

View Online

Page 8: Theoretical studies of HIV-1 reverse transcriptase inhibition

This journal is c the Owner Societies 2012 Phys. Chem. Chem. Phys., 2012, 14, 12614–12624 12621

FEP method. As mentioned, this effect would not be so

dramatic in the protein environment since part of the cavity

can be occupied by the protein provided that it is flexible and

allowed to move during the simulations. We have computed

the electrostatic binding energy term of filling the cavity by the

required number of water molecules, in each case. The resulting

energies are �16.5, �18.2, �22.5, �53.2 and �22.2 kcal mol�1

for ligands L1, L2, L3, L4 and L5, respectively. Obviously these

numbers are overestimating the real effect since during the

original calculation of the DGvdW term, waters were allowed

to move inside the cavity and the change in the density was

negligible. Anyway, these values present the expected trend,

which is a dependency on the size of the ligand that would

partially close the gap between the PMF and FEP results of

binding free energies.

Molecular Electrostatic Potential (MEP)

The Molecular Electrostatic Potential (MEP) is defined as the

interaction energy between the charge distribution of a mole-

cule and a unit positive charge. The MEP is highly informative

of the nuclear and electronic charge distribution of a given

molecule. The MEP has been applied to a range of fields, such

as the study of biological interactions, topographical analysis

of the electronic structure of molecules and definition of mole-

cular reactivity patterns.63 The potential applications of theMEP

as a tool for interpretations and prediction of chemical reactivity,

as well as to probe the similarity of transition states and

inhibitors, has long been recognized.64 Three-dimensional MEP

surfaces for the five ligands, the active site of the enzyme and its

natural substrate are depicted in Fig. 4. The MEP surfaces were

derived from B3LYP/6-31+G (d,p) calculations, computed

under the effect of the protein environment, using Gaussian 0965

and visualised in Gaussview 3.0.66 These surfaces correspond to an

isodensity value of 0.002 a.u. In the figure, the most nucleophilic

regions (negative electronic potential) are in red, while the most

electrophilic regions (positive electrostatic potential) appear in

blue. The active site displays large positive (blue) regions matching

the positions of the magnesium cations, and nitrogen backbone

atoms of Asp443, Glu478, Asp549 and His539, while it display

large negative (red) regions matching the positions of oxygen

atoms of backbone belonging to His539, Asp443, Glu478,

Asp498, and Asp549. The negative electrostatic potential for

the fragment of DNA-chain can be found around the oxygen

atoms belonging to the phosphate groups. First conclusion

that can be derived from the figure is that there is a reasonable

complementarity between the active site of the enzyme and its

natural substrate. According to Pauling postulate,67 even a

better match could be expected for the TS structure of the

DNA polymerase HIV-1 RT catalyzed reaction. The analysis of

the five ligands reveals negative red regions around the hydroxyl

and carbonyl oxygen atoms while positive regions are located

mainly in hydroxyl hydrogen atoms. Nevertheless, the positive

charge on these proton atoms seems to be delocalized when

taking part of hydrogen bonds established with the contiguous

carbonyl group. This effect is dramatic in the case of L6 that, in

fact, does not present any blue region on the MEP.

This analysis can be supported by means of evaluation of the

interaction energy between the inhibitor and the environment,

computed by eqn (4). This interaction energy can be decomposed

in a sum over residues provided that the polarized wave function

(C) is employed to evaluate this energy contribution. The global

polarization effect can be obtained from the gas phase energy

difference between the polarized, C, and unpolarized, C0, wave

functions. Averaged protein–substrate interaction energies by

residue are displayed in Fig. 5.

In Fig. 5, where positive values correspond to unfavourable

interactions and negative values mean that the interaction is

stabilizing the binding ligand, only the main contributions, which

correspond to residues of the active site, have been presented.

As can be observed in Fig. 5, the strongest interaction is

established between the ligands and the Mg2+ cations. These

favourable interactions are, nevertheless, not completely the

determinant factor of the final ligand–protein binding energies,

since significant non-favourable interactions can be established

with negatively charged residues of the active site such as Asp443,

Glu478, Asp498 and Asp549. These residues are required to

anchor the two Mg2+ cations in the active site and appear to

destabilize the binding of the ligands (see Fig. 6). His539, a

residue on the second shell of the active site that interacts with

Asp549, is the only aminoacid that slightly stabilizes the

interaction with the ligand, particularly in the case of L2. This

can be explained by Fig. 4, as there is a positive charge in

the ethoxy group of this ligand that could interact with the

backbone oxygen atom of this residue. Thus, while L1 shows

the largest negative values with Mg2+ cations, it also presents

high positive values of interaction energies with aspartate

residues of an active site that seem to cancel out the initially

favourable effect. A similar picture is obtained for ligand 5.

Fig. 4 Molecular electrostatic potential (MEP) surfaces derived from

B3LYP/6-31G+(d,p) calculations for the active site of the HIV-1 RT,

together with the natural substrate (RNA) and the five tested ligands

polarized by the charges of the protein environment. The increase in

negative charges goes from blue (positive) to red (negative).

Fig. 5 Contributions of individual amino acids of an active site to

substrate–protein interaction energy (in kcal mol�1) computed for the

five ligands.

Dow

nloa

ded

on 0

3 O

ctob

er 2

012

Publ

ishe

d on

25

June

201

2 on

http

://pu

bs.r

sc.o

rg |

doi:1

0.10

39/C

2CP4

0953

D

View Online

Page 9: Theoretical studies of HIV-1 reverse transcriptase inhibition

12622 Phys. Chem. Chem. Phys., 2012, 14, 12614–12624 This journal is c the Owner Societies 2012

In contrast, while interactions of metals with L2, L3 and L4

are much weaker they do not show important destabilizing

interactions with the aminoacids of the protein, especially with

aspartate residues Asp443 and Asp498. Consequently, the total

interaction energies can be more favourable even presenting

lower interactions with the metals.

According to structure–activity relationship studies by

Grice and co-workers,59,60 three oxygen ligands of RNase H

inhibitors are critical. Their recent crystallographic analysis

indicates that manicol makes multiple contacts with the highly

conserved His539 residue,68 thus suggesting a bidentate inhibitor

as a putative improved drug. Modelling studies of Chung et al.69

indicating that since the vinylogous urea binding site is also in the

vicinity of His539, these kind of compounds would comprise

allosteric and RNase H inhibitory properties that would provide

improved selectivity.68 Our results show how the best inhibitors

we have tested (see Fig. 6) are presenting a favourable interaction

with divalent magnesium ions and with His539 (see Fig. 5), in

agreement with experimental studies, and small destabilizing

interactions with the rest of the aminoacids of the active site.

Conclusions

Two computational approaches, based on MD simulations

within hybrid QM/MM potentials, have been used to compute

the binding free energy of five known inhibitors of HIV-1 RT:

the alchemical free energy perturbation methods, FEP, and a

pathway method, in which the ligand is physically pulled away

from the binding site, that renders a PMF of the process. Our

comparative analysis suggests that just a qualitative agreement

between both strategies can be obtained although the former is

much less computing demanding. Both methods can be used to

distinguish ligands presenting large inhibitory activity differences

and to select the most promising scaffold for further develop-

ments. Nevertheless, as a general conclusion, long sampling MD

simulations should be required to get reliable results. This is

especially important for the pathway method when computing

binding free energies of large compounds with a big amount of

functional groups, capable of interacting with the protein pocket.

Nevertheless, it is important to take into account that our

simulations are based on hybrid QM/MM potentials in an

attempt to get more reliable interactions since treating the

ligand quantum mechanically has the additional advantage of

the inclusion of quantum effects such as ligand polarization

upon binding.

Despite differences between the HIVNRTIs tested previously by

different authors (tropolone derivatives identified by Budihas

et al.,42 pyrimidal carboxylic acid scaffolds described by Kirschberg

et al.,70 the betathujaplicinol described by Himmel et al.,71 and

the naphthyridinones described by Su et al.61), all of them

suggest the importance of the coordination with the two metal

ions. Our analysis of the binding contribution of the individual

residues shows that the design of an inhibitor based just

on favourable interactions with the two magnesium cations

present in the active site of HIV-1 RT can fail. Negatively

charged residues, such as Asp443, Asp549, Asp498 or Glu478,

that play a critic role in anchoring the metal atoms into the

protein, can destabilize the ligand binding. Also, large inhibitor-

like molecules could present the advantage of interacting with

residues of the second shell of the active site. This conclusion

would explain why some ligands present anti-HIV activity while

others, like b-thujaplicin (ligand 5), are almost inactive. In

general, the MEPs of the active site of the protein and the

different ligands show a better complementarity in those cases

for which higher binding energies have been computed. This

result is harmonizing with the analysis of the interaction

energies by residues that support the equilibrium of the absolute

values of favourable and non-favorable interactions that must

present an inhibitor to have significant activity. We are herein

showing how the best inhibitors we have tested are presenting a

favourable interaction with divalent magnesium ions and with

His539, in agreement with experimental studies.

Our results can be used to guide the addition of substituents

to an initial inhibitor scaffold to decrease non-favorable

interactions with negatively charged residues that anchor the

metals. According to our obtained binding energies and

the MEP and electrostatic interactions analysis, this could be

done from L2 or L3. This result would support the use of

naphthyridinone-based ligands, instead of drugs based on a

cycloheptatriene ring such as tropolone and its derivatives, as

well as the importance of properly located hydroxyl groups.

It is important to stress that an over estimation of the

importance of two hydroxyl groups for metal chelation could

render non-favourable interactions with other residues of the

active site. Keeping in mind that an RNase H active site is

open to solvent, as pointed out by Grice and co-workers,68 the

possibility of a variety of substituents could be also proposed

to reduce cellular toxicity by decreasing binding affinity for

other enzymes. Extensive binding and mechanistic studies,

using both experimental and computational tools, are needed

to finally design an optimal drug that efficiently inhibits all

possible mutated forms of the rapidly emerging HIV.

Acknowledgements

We thank the Spanish Ministry Ministerio de Ciencia e Inovacion

for project CTQ2009-14541-C1 and C2, Universitat Jaume

I–BANCAIXA Foundation for projects P1 1B2011-23,

Generalitat Valenciana for ACOMP2011/038, ACOMP/

2012/119 and Prometeo/2009/053 projects. The authors also

acknowledge the Servei d’Informatica, Universitat Jaume I for

generous allotment of computer time.

Fig. 6 Schematic representation of interactions established between

ligand 2 and the active site of HIV-1 RT according to structure after

1 ns MD relaxation.

Dow

nloa

ded

on 0

3 O

ctob

er 2

012

Publ

ishe

d on

25

June

201

2 on

http

://pu

bs.r

sc.o

rg |

doi:1

0.10

39/C

2CP4

0953

D

View Online

Page 10: Theoretical studies of HIV-1 reverse transcriptase inhibition

This journal is c the Owner Societies 2012 Phys. Chem. Chem. Phys., 2012, 14, 12614–12624 12623

Notes and references

1 W. L. Jorgensen, Nature, 2010, 466, 42–43.2 N. Singh and A. Warshel, Proteins: Struct., Funct., Genet., 2010,78, 1705–1723.

3 K. M. Merz Jr., J. Chem. Theory Comput., 2010, 6, 1769–1776.4 W. L. Jorgensen, Acc. Chem. Res., 2009, 42, 724–733.5 (a) N. Brooijmans and I. D. Kuntz, Annu. Rev. Biophys. Biomol.Struct., 2003, 32, 335–373; (b) A. R. Leach, B. K. Shoichet andC. E. Peishoff, J. Med. Chem., 2006, 49, 5851–5855; (c) G. Morra,A. Genoni, M. A. C. Neves, K. M. Merz and G. Colombo, Curr.Med. Chem., 2010, 17, 25–41; (d) P. Kolbo and J. J. Irwin, Curr.Top. Med. Chem., 2009, 9, 755–770; (e) O. Guvench andA. D. MacKerell, Curr. Opin. Struct. Biol., 2009, 19, 56–61.

6 A. R. Leach, B. K. Shoichet and C. E. Peishoff, J. Med. Chem.,2006, 49, 5851–5855.

7 C. J. Woods, M. Malaisree, S. Hannongbua and A. J. Mulholland,J. Chem. Phys., 2011, 134, 054114.

8 S. F. Sousa, P. A. Fernandes and M. Ramos, Proteins: Struct.,Funct., Bioinf., 2006, 65, 15–26.

9 (a) Y. Deng and B. Roux, J. Phys. Chem. B, 2009, 113, 2234–2246;(b) E. Stjernschantz and C. Oostenbrink, Biophys. J., 2010, 98,2682–2691; (c) A. de Ruiter and C. Oostenbrink, Curr. Opin.Chem. Biol., 2011, 15, 547–552; (d) S. E. Boyce, D. L. Mobley,G. J. Rocklin, A. P. Graves, K. A. Dill and B. K. Shoichet, J. Mol.Biol., 2009, 394, 747–763; (e) W. Jiang, M. Hodoscek and B. Roux,J. Chem. Theory Comput., 2009, 5, 2583–2588.

10 J. Gao, K. Kuczera, B. Tidor and M. Karplus, Science, 1989, 244,1069–1072.

11 (a) J. Hermans and S. Shankar, Isr. J. Chem., 1986, 27, 225–227;(b) W. L. Jorgensen, J. K. Buckner, S. Boudon and J. Tirado-Rives,J. Chem. Phys., 1988, 89, 3742–3746; (c) M. K. Gilson, J. A. Given,B. L. Bush and J. A. McCammon, Biophys. J., 1997, 72, 1047–1069.

12 A. Warshel, F. Sussman and G. King, Biochemistry, 1986, 25,8368–8372.

13 F. S. Lee, Z. T. Chu, M. B. Bolger and A. Warshel, Protein Eng.,1992, 5, 215–228.

14 D. Trzesniak, A. Pitschna, E. Kunz and W. F. van Gunsteren,ChemPhysChem, 2007, 8, 162–169.

15 I. V. Khavrutskii, J. Dzubiella and J. A. McCammon, J. Chem.Phys., 2008, 128, 044106.

16 (a) D. R. Herschbach, H. S. Johnston and D. Rapp, J. Chem. Phys.,1959, 31, 1652–1661; (b) S. Boresch and M. Karplus, J. Chem.Phys., 1996, 105, 5145–5154; (c) M. Sprik and G. Ciccotti, J. Chem.Phys., 1998, 109, 7737–7744; (d) D. A. Zichi, G. Ciccotti, J. T. Hynesand M. Ferrario, J. Phys. Chem., 1989, 93, 6261–6265.

17 H. J. Woo and B. Roux, Proc. Natl. Acad. Sci. U. S. A., 2005, 102,6825–6830.

18 S. Doudou, N. A. Burton and R. H. Henchman, J. Chem. TheoryComput., 2009, 5, 909–918.

19 F. Colizzi, R. Perozzo, L. Scapozza, M. Recanatini and A. Cavalli,J. Am. Chem. Soc., 2010, 132, 7361–7371.

20 V. Helms and R. C. Wade, J. Am. Chem. Soc., 1998, 120,2710–2713.

21 (a) C. J. Woods, M. A. King and J. W. Essex, J. Phys. Chem. B,2003, 107, 13703–13710; (b) C. J. Woods, M. A. King andJ. W. Essex, J. Phys. Chem. B, 2003, 107, 13711–13718.

22 R. H. Swendsen and J. Wang, Phys. Rev. Lett., 1986, 57, 2607–2609.23 W. Jiang, M. Hodoscek and B. Roux, J. Chem. Theory Comput.,

2009, 5, 2583–2588.24 A. Laio and M. Parrinello, Proc. Natl. Acad. Sci. U. S. A., 2002,

99, 12562–12566.25 F. L. Gervasio, A. Laio and M. Parrinello, J. Am. Chem. Soc.,

2005, 127, 2600–2607.26 N. V. Plotnikov, S. C. L. Kamerlin and A. Warshel, J. Phys. Chem.

B, 2011, 115, 7950–7962.27 F. S. Lee, Z. T. Chu and A. Warshel, J. Comput. Chem., 1993, 14,

161–185.28 Y. Y. Sham, Z. T. Chu, H. Tao and A. Warshel, Proteins: Struct.,

Funct., Genet., 2000, 39, 393–407.29 (a) T. Hansson, J. Marelius and J. Aqvist, J. Comput.-Aided Mol.

Des., 1998, 12, 27–35; (b) J. Aqvist and T. Hansson, Protein Eng.,1995, 8, 1137–1144.

30 (a) J. Srinivasan, T. E. Cheatham, P. Cieplak, P. A. Kollman andD. A. Case, J. Am. Chem. Soc., 1998, 120, 9401–9409;

(b) P. A. Kollman, I. Massova, C. Reyes, B. Kuhn, S. Huo,L. Chong, M. Lee, T. Lee, Y. Duan, W. Wang, O. Donini,P. Cieplak, J. Srinivasan and D. A. Case Cheatham III., Acc.Chem. Res., 2000, 33, 889–897.

31 N. Singh and A. Warshel, Proteins: Struct., Funct., Genet., 2010,78, 1705–1723.

32 S. Genheden and R. Ulf, J. Chem. Theory Comput., 2011, 7,3768–3778.

33 J. M. Molina, Drugs, 2008, 68, 567–578.34 K. Das, S. E. Martinez, J. D. Bauman and E. Arnold, Nat. Struct.

Mol. Biol., 2012, 19, 253–259.35 S. G. Sarafianos, B. Marchand, K. Das, D. M. Himmel,

M. A. Parniak, S. H. Hughes and E. Arnold, J. Mol. Biol., 2009,385, 693–713.

36 J. Wang, P. Morin, W. Wang and P. A. Kollman, J. Am. Chem.Soc., 2001, 123, 5221–5230.

37 X. He, Y. Mei, Y. Xiang, Da W. Zhang and J. Z. H. Zhang,Proteins: Struct., Funct., Bioinf., 2005, 61, 423–432.

38 J. Lameira, C. N. Alves, V. Moliner and E. Silla, Eur. J. Med.Chem., 2006, 41, 616–623.

39 (a) S. Saen-oon, M. Kuno and S. Hannongbua, Proteins: Struct.,Funct., Bioinf., 2005, 61, 859–869; (b) P. Nunrium, M. Kuno,S. Saen-oon and S. Hannongbua, Chem. Phys. Lett., 2005, 405,198–202.

40 W. L. Jorgensen, M. Bollini, V. V. Thakur, R. A. Domaoal,K. A. Spasov and K. S. Anderson, J. Am. Chem. Soc., 2011,133, 15686–15696.

41 M. Nowotny and W. Yang, EMBO J., 2006, 25, 1924–1933.42 S. R. Budihas, I. Gorshkova1, S. Gaidamakov, A. Wamiru,

M. K. Bona, M. A. Parniak, R. J. Crouch, J. B. McMahon,J. A. Beutler and S.F.J. Le Grice, Nucleic Acids Res., 2005, 33,1249–1256.

43 A. Warshel and M. Levitt, J. Mol. Biol., 1976, 103, 227–249.44 C. N. Alves, S. Martı, R. Castillo, J. Andres, V. Moliner, I. Tunon

and E. Silla, Chem.–Eur. J., 2007, 13, 7715–7724.45 S. G. Sarafianos, K. Das, C. Tantillo, A. D. Clark Jr., J. Ding,

J. M. Whitcomb, P. L. Boyer, S. H. Hughes and E. Arnold, EMBOJ., 2001, 20, 1449–1461.

46 L. Holm and J. Park, Bioinformatics, 2000, 16, 566–567.47 J. Antosiewicz, J. A. McCammon and M. K. Gilson, J. Mol. Biol.,

1994, 238, 415–436.48 L. Hui, A. D. Robertson and J. H. Jensen, Proteins, 2005, 61,

704–721.49 R. H. Byrd, P. Lu, J. Nocedal and C. Zhu, J. Sci. Comput., 1995,

16, 1190–1208.50 M. J. S. Dewar, E. G. Zoebisch and E. F. Healy, J. Am. Chem.

Soc., 1985, 107, 3902–3909.51 W. L. Jorgensen, D. S. Maxwell and J. Tirado-Rives, J. Am. Chem.

Soc., 1996, 118, 11225–11236.52 W. L. Jorgensen, J. Chandrasekhar, J. D. Madura, R. W. Impey

and M. L. Klein, J. Chem. Phys., 1983, 79, 926–935.53 (a) M. J. Field, A Practical Introduction to the Simulation of

Molecular Systems, Cambridge University Press, Cambridge,UK, 1999; (b) M. J. Field, M. Albe, C. Bret, F. Proust-de Martinand A. Thomas, J. Comput. Chem., 2000, 21, 1088–1100.

54 (a) J. G. Kirkwood, J. Chem. Phys., 1935, 3, 300–313;(b) D. A. McQuarrie, Statistical Mechanics, Harper & Row,New York, 1976, p. 266.

55 B. Roux, Comput. Phys. Commun., 1995, 91, 275–282.56 G. M. Torrie and J. P. Valleau, J. Comput. Phys., 1977, 23,

187–199.57 G. K. Schenter, B. C. Garrett and D. G. Truhlar, J. Chem. Phys.,

2003, 119, 5828–5833.58 T. D. Heightman and A. T. Vasella, Angew. Chem., Int. Ed., 1999,

38, 750–770.59 (a) J. D. Madura, Y. Nakajima, R. M. Hamilton, A. Wierzbicki

and A. Warshel, Struct. Chem., 1996, 7, 131–138; (b) J. Wang,P. Morin, W. Wang and P. A. Kollman, J. Am. Chem. Soc., 2001,123, 5221–5230; (c) B. O. Brandsdal, F. Osterberg, M. Almlof,I. Feierberg, V. B. Luzhkov and J. Aqvist, Adv. Protein Chem.,2003, 66, 123–158.

60 I. Buch, S. K. Sadiq and G. De Fabritiis, J. Chem. Theory Comput.,2011, 7, 1765–1772.

61 H.-P. Su, Y. Yan, G. S. Prasad, R. F. Smith, C. L. Daniels,P. D. Abeywickrema, J. C. Reid, H. M. Loughran, M. Kornienko,

Dow

nloa

ded

on 0

3 O

ctob

er 2

012

Publ

ishe

d on

25

June

201

2 on

http

://pu

bs.r

sc.o

rg |

doi:1

0.10

39/C

2CP4

0953

D

View Online

Page 11: Theoretical studies of HIV-1 reverse transcriptase inhibition

12624 Phys. Chem. Chem. Phys., 2012, 14, 12614–12624 This journal is c the Owner Societies 2012

S. Sharma, J. A. Grobler, B. Xu, V. Sardana, T. J. Allison,P. D. Williams, P. L. Darke, D. J. Hazuda and S. Munshi,J. Virol., 2010, 84, 7625–7633.

62 Y. Cheng and W. H. Prusoff, Biochem. Pharmacol., 1973, 22,3099–3108.

63 (a) A. Gavezotti, J. Am. Chem. Soc., 1983, 105, 5220–5224;(b) J. S. Murray, T. Brinck, P. Lane, K. Paulsen and P. Politzer,Theochem, 1994, 307, 55–64; (c) G. D. van Duyne, R. F. Standaert,P. A. Karplus, S. L. Schreiber and J. Clardy, Science, 1991, 252,839–842; (d) E. Cubero, F. J. Luque and M. Orozco, Proc. Natl.Acad. Sci. U. S. A., 1998, 95, 5976–5980; (e) J. Lameira,C. N. Alves, I. Tunon, S. Martı and V. Moliner, J. Phys. Chem.B, 2011, 115, 6764–6775.

64 C. K. Bagdassarian, V. L. Schramm and S. D. Schwartz, J. Am.Chem. Soc., 1996, 118, 8825–8836.

65 M. J. Frisch, G. W. Trucks, H. B. Schlegel, G. E. Scuseria,M. A. Robb, J. R. Cheeseman, G. Scalmani, V. Barone,B. Mennucci, G. A. Petersson, H. Nakatsuji, M. Caricato, X. Li,H. P. Hratchian, A. F. Izmaylov, J. Bloino, G. Zheng,J. L. Sonnenberg, M. Hada, M. Ehara, K. Toyota, R. Fukuda,J. Hasegawa, M. Ishida, T. Nakajima, Y. Honda, O. Kitao,H. Nakai, T. Vreven, J. A. Montgomery Jr., J. E. Peralta,F. Ogliaro, M. Bearpark, J. J. Heyd, E. Brothers, K. N. Kudin,V. N. Staroverov, R. Kobayashi, J. Normand, K. Raghavachari,A. Rendell, J. C. Burant, S. S. Iyengar, J. Tomasi, M. Cossi,N. Rega, J. M. Millam, M. Klene, J. E. Knox, J. B. Cross,

V. Bakken, C. Adamo, J. Jaramillo, R. Gomperts, R. E. Stratmann,O. Yazyev, A. J. Austin, R. Cammi, C. Pomelli, J. W. Ochterski,R. L. Martin, K. Morokuma, V. G. Zakrzewski, G. A. Voth,P. Salvador, J. J. Dannenberg, S. Dapprich, A. D. Daniels,O. Farkas, J. B. Foresman, J. V. Ortiz, J. Cioslowski and D. J. Fox,Gaussian 09, Revision A.1, Gaussian, Inc., Wallingford CT, 2009.

66 R. Dennington, T. Keith, J. Millam, K. Eppinnett, W. L. Hovelland R. Gilliland, GaussView, V3.0, Semichem Inc., ShawneeMission, KS, 2003.

67 L. Pauling, Chem. Eng. News, 1946, 24, 1375–1377.68 S. Chung, D. M. Himmel, J. Jiang, K. Wojtak, J. D. Bauman,

J. W. Rausch, J. A. Wilson, J. A. Beutler, C. J. Thomas, E. Arnoldand S. F. J. Le Grice, J. Med. Chem., 2011, 54, 4462–4473.

69 S. Chung, M. Wendeler, J. W. Rausch, G. Beilhartz, M. Gotte,B. R. O’Keefe, A. Bermingham, J. A. Beutler, S. X. Liu, X. W.Zhuang and S. F. Le Grice, Antimicrob. Agents Chemother., 2010,54, 3913–3921.

70 T. A. Kirschberg, M. Balakrishnan, N. H. Squires, T. Barnes,K. M. Brendza, X. Chen, E. J. Eisenberg, W. Jin, N. Kutty,S. Leavitt, A. Liclican, Q. Liu, X. Liu, J. Mak, J. K. Perry,M. Wang, W. J. Watkins and E. B. Lansdon, J. Med. Chem.,2009, 52, 5781–5784.

71 D. M. Himmel, K. A. Maegley, T. A. Pauly, J. D. Bauman,K. Das, C. Dharia, A. D. Clark, K. Ryan Jr., M. J. Hickey,R. A. Love, S. H. Hughes, S. Bergqvist and E. Arnold, Structure,2009, 17, 1625–1635.

Dow

nloa

ded

on 0

3 O

ctob

er 2

012

Publ

ishe

d on

25

June

201

2 on

http

://pu

bs.r

sc.o

rg |

doi:1

0.10

39/C

2CP4

0953

D

View Online