UEF // University of Eastern Finland

Part 1, Basics of molecular modeling

Integrate Summer School, Espoo, Finland, 7.6.2016
Prof. Antti Poso

Virtual Screening and Molecular Modeling

What is molecular modeling?

o Visualization
    o Graphical presentations
    o 3D glasses, virtual room
o Generation of realistic models of molecules
    o Plastic models
    o Electrostatic maps
o Prediction of properties
    o Reactivity
    o Spectrum
o Simulation
    o Movements
o Comparison with experiment

Molecular modeling methods are the theoretical methods and computational techniques used to simulate the behavior of molecules and molecular systems.

Why use molecular modeling? (And not deal directly with the real world?)

• Visualization
    – Easier to understand
• Fast, safe, accurate and cheap way to
    – Study molecular properties
    – Make predictions for yet unstudied systems
    – Design new molecules
    – Interpret experimental results

[Workflow diagram: database of small molecules + database of proteins → docking / QSAR / virtual screening → hits → experimental evaluation in vitro / in vivo]

Molecular modeling experiment

• Build structures and define charges
• Perform computations: force-field and quantum mechanical models offer sophisticated descriptions of molecules, both known and unknown; docking fits small molecules into proteins
• Visualize and interpret results: results include structures, energies, molecular orbitals, electron densities, vibrational modes, dynamics simulations, interactions, etc.
• Recycle: each answer will often lead to more questions and new calculations

General concepts of molecular modeling

• "Atomistic" modeling
    – Molecules are a collection of charged particles: electrons and nuclei
• Several properties of a molecule can be studied theoretically:
    – Geometry
    – Energetic properties
    – Conformations, charges, dipole moment, ...
    – Structure and function as a function of time
    – Interactions with proteins (docking and scoring)
    – Correlation of molecular structure with biological activity: QSAR (CoMFA)

Define charges

• Charges are needed to study the properties of molecules
• Usually point charges are used
    – The term 'point charge' is a mathematical abstraction
• The dimensions of a point charge are small compared with the distance between the charges

Partial charges

• Atom A and atom B have different electronegativities (as in H-F)
• When they form a covalent bond, the more electronegative atom A draws the shared electrons toward itself: atom A becomes partially negative and atom B partially positive
• The partial charge on an atom in a molecule depends on how the electron density is partitioned among the atoms
• Crucial for:
    – hydrogen bonding
    – ionic bonding
    – dipole-dipole interactions

[Figure: bonded atoms A and B]

[Figure: partial charges]

Partial charges

• Accurate determination of the molecular electron distribution in 3D space is needed
    – It is often described using point charges, which are generated for the atoms in the system
        • Positioned at the atomic centers
    – Charges are important when calculating molecular properties
        • Electrostatic potential maps etc.
• Two methodologically completely different approaches
    – Topological procedures
        • Gasteiger, Gasteiger-Hückel, MMFF94
    – Quantum chemical, wave-function-based methods
        • ESP, Mulliken

[Figure: examples of charges]

Computational approaches

[Classification diagram; labels as extracted]
• Atomistic / Continuum
• Finite / Periodic
• Quantum mechanical methods / Classical methods
• Semi-empirical / Ab initio
• Quantum MC, DFT, Hartree-Fock, QM/MM
• Deterministic / Stochastic
• Monte Carlo, Molecular dynamics
• Molecular mechanics

Level of theory

Computational method                          Molecule size
Molecular mechanics                           1 000 000 atoms
Semi-empirical methods                        up to 1000 atoms
Ab initio quantum mechanics                   up to 200 atoms
Correlated quantum mechanics                  up to 50 atoms
Correlated, relativistic quantum mechanics    up to 20 atoms

Arrows beside the table: decreasing time (toward the top), increasing accuracy (toward the bottom).

Quantum mechanics

– Most fundamental theoretical approach
– Schrödinger equation: HΨ = EΨ
    • H is the Hamiltonian operator
        – Electrons and nuclei
        – Kinetic and potential energy
    • Ψ is the wave function
        – Position (and momentum) of particles

Quantum mechanics

[Diagram: quantum mechanics → semi-empirical, ab initio, DFT]

Semi-empirical approach: The Hartree-Fock equations are solved iteratively, considering only the valence electrons. Common methods: MNDO, AM1 and PM3.

DFT (density functional theory): The electron density is used as the fundamental property. Using the electron density significantly speeds up the calculation. DFT is one of the most frequently used computational tools for studying and predicting the properties of isolated molecules, bulk solids, and material interfaces, including surfaces.

Ab initio ("from the beginning"): All results are calculated from a computational analysis of the Schrödinger equation (there is no exact solution). No parametrization.

Ab initio

• Ab initio is Latin for "from the beginning"
• Based on the Schrödinger equation
• Common type: Hartree-Fock
    – Coulombic electron-electron repulsion is not specifically included, only its net effect
    – The energy is in Hartrees (1 Hartree = 27.2114 eV)
    – Because of the approximation, the energy is always greater than the exact energy
• The wave function is described by a functional form:
    – Slater-type orbitals
    – Gaussian-type orbitals
• Basis set:
    – A basis set is a collection of functions that describe the spatial position of an electron
    – The basis set needs to approximate the actual wave function sufficiently well to give chemically meaningful results
• The simplest of these basis sets is STO-3G, an acronym for Slater-type orbitals simulated by 3 Gaussians added together
    – Reproduces the geometries of simple organic molecules well
    – Does not perform well on energies
    – Fails for carbocations and carbanions

Basis set

• To improve the description of molecular geometry and properties, split-valence basis sets are normally used
    – The valence AOs are split into two parts: an inner, compact orbital and an outer, more diffuse one
    – For simple molecules, the simplest split-valence basis set is sufficient
    – 3-21G: 3 Gaussian functions are used for the core orbitals, 2 for the inner part of the valence shell and 1 for the outer part
    – 6-31G is good for geometry optimization
• In order of increasing accuracy
    – Basis sets: 3-21G < 6-31G
    – Electron-correlation methods (not basis sets): MP2 < MP4 < CCSD < CCSD(T)
• For complex molecules, polarization and diffuse functions need to be added to the basis functions, for example
    – 6-31G*: polarization basis set
        • All non-hydrogen atoms are additionally represented with a set of d-type polarization functions (five or six, depending on the program's convention)
        • Must be employed to obtain good electron densities if delocalization, polarization or hyperconjugative effects play a role

DFT: density functional theory

• The electron density of any system determines all ground-state properties of the system
    – The exact form of the universal energy density functional is unknown. The functional form is APPROXIMATED by various models, including LDA (local density approximation), WDA (weighted density approximation), and GEA/GGA (gradient expansion approximation / generalized gradient approximation)
    – Extension to excited states is not obvious
• DFT is less expensive than correlated ab initio methods and often more accurate than Hartree-Fock, especially for solids
    – The wave function of an N-electron system has 3N variables, while the density has only three variables: x, y, and z
• Nowadays widely used in modeling
    – DFT provides some chemically important concepts, such as electronegativity (chemical potential), hardness (softness), the Fukui function and response functions
• The most popular functional in DFT is B3LYP, which is used to produce reliable molecular geometries

Semi-empirical methods

• Less computationally intensive than solving the Hartree-Fock equations
• These methods are not necessarily less accurate than some ab initio methods
• MNDO (Modified Neglect of Differential Overlap) can be applied primarily to molecules composed of atoms that have s and p orbitals
    – Not good at modeling systems with hydrogen bonds
    – Not good for 4-membered rings
    – Energies are too positive for sterically crowded molecules
• AM1 (Austin Model 1)
    – Generally performs much better than MNDO
    – Still limited primarily to atoms that have s and p orbitals
    – Many parameters obtained via chemical 'intuition'
• PM3 (Parametric Method 3)
    – Has parameters for a larger set of atoms than MNDO or AM1 (many transition elements included in PM3)
    – Performs very well for molecules similar to those used in the parameterization
    – Performance for other molecules can be better or worse than MNDO or AM1

When to use QM?

• Typical applications:
    – Chemical reactions
    – Spectra
    – Transition states
• Molecular structure
    – QC (ab initio, DFT) is more universal than MM
    – Ab initio methods are relatively reliable
    – Semi-empirical methods sometimes fail
• Electronic properties
    – Electron density distribution
    – Dipole moment
    – Electrostatic potentials
• Molecular orbitals
    – Frontier molecular orbitals: what are the most favorable orbital interactions (donor HOMO – acceptor LUMO)
    – Rationalization and generalization of chemical reactivity

Molecular mechanics / force field method

Based on the following simplifications:
• A molecule is a collection of spherical particles held together by simple springs
• The motions of the nuclei are studied (electrons are ignored)
• Limited flexibility due to the lack of electron treatment
• The potential energies are calculated with HOOKE'S LAW: the force needed to extend or compress a spring by some distance is proportional to that distance
• GOAL: to reproduce molecular geometries and RELATIVE energies

Molecular mechanics / force field methods

• Parameters fitted to experimental data from a small set of molecules are applied to a much larger set of molecules
• Fast method; can be applied to systems containing >10^6 atoms
• Predicts the energy associated with a given conformation of a molecule
    • The numerical value of a force field energy has no meaning as an absolute quantity
    • Only differences in energy between two or more conformations have meaning
• Typical applications:
    – Geometry optimization
    – Conformational search
    – Simulation

Why molecular mechanics / force field method?

Advantages:
• The greatest advantages: computational simplicity and speed
• In some cases it gives the same level of accuracy as high-level quantum mechanics
• Transferability (a force field developed for one set of molecules can be used for others)
• Can be applied to large molecules and systems (proteins, biomolecules, polymers)
• Molecules can be studied in vacuum or in implicit or explicit solvent environments
• Thermodynamic and kinetic properties
• Geometry optimization

Disadvantages:
• In general less accurate than quantum mechanics or semi-empirical methods
• No electronic transitions
• No electron transport
• No proton transfer
• Bond breaking/forming is not possible, so chemical reactions or the reactivity of molecules cannot be studied
• Parameters are not available for some compound types

Force field

• A force field is a simple mathematical equation with parameters that describes the energy cost of deviating from the ideal geometry
• E is the energy, defined as the difference in energy between a real molecule and an ideal molecule; $r_0$ is the ideal bond length, etc., derived from experimental values or ab initio calculations
• The force constants $k_b$, $k_\theta$, etc. are derived experimentally, usually from X-ray, NMR, IR and Raman spectroscopy

A general form of a force field

$E_{pot} = \sum E_{str} + \sum E_{bend} + \sum E_{tors} + \sum E_{oop} + \sum E_{vdw} + \sum E_{elec}$

There can also be cross-terms.

Force field (1): Bond stretching, $E_{str}$

$E_{str} = \frac{1}{2} k_b (r - r_0)^2$

• This is the approximation to the energy of a bond as a function of its displacement from the ideal bond length, $r_0$. The force constant, $k_b$, determines the strength of the bond. Both the ideal bond lengths $r_0$ and the force constants $k_b$ are specific to each pair of bonded atoms, i.e. they depend on the chemical types of the constituent atoms.
• Bond stretching can be described with the more accurate Morse equation (blue) or a simple quadratic potential (black).

[Figure: Morse (blue) and quadratic (black) bond-stretching potentials]
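
The difference between the quadratic and Morse descriptions can be sketched numerically. The snippet below is only an illustration and is not taken from the slides: the Morse form $D_e (1 - e^{-a(r - r_0)})^2$, the choice $a = \sqrt{k_b / (2 D_e)}$ (which gives both curves the same curvature at $r_0$), and all numeric parameter values are assumptions.

```python
import math

def harmonic_stretch(r, r0, kb):
    """Quadratic bond-stretching energy, E = 1/2 * kb * (r - r0)^2."""
    return 0.5 * kb * (r - r0) ** 2

def morse_stretch(r, r0, kb, de):
    """Morse bond-stretching energy with well depth de.

    The width parameter a is chosen so that the Morse curve has the
    same curvature kb at r0 as the quadratic potential.
    """
    a = math.sqrt(kb / (2.0 * de))
    return de * (1.0 - math.exp(-a * (r - r0))) ** 2

# Assumed, roughly C-C-like values: Å, kcal/mol/Å^2, kcal/mol
R0, KB, DE = 1.53, 600.0, 85.0

for r in (1.43, 1.53, 1.63, 2.50, 5.00):
    print(f"r = {r:4.2f} Å  harmonic = {harmonic_stretch(r, R0, KB):9.2f}  "
          f"Morse = {morse_stretch(r, R0, KB, DE):7.2f}")
```

Close to $r_0$ the two curves nearly coincide; at large $r$ the quadratic term keeps rising while the Morse energy levels off at the well depth, which is why the quadratic form is only appropriate near the ideal geometry.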

Force field (2): Bond angle, $E_{bend}$

$E_{bend} = \frac{1}{2} k_\theta (\theta - \theta_0)^2$

• $E_{bend}$ represents, with a harmonic potential, the deviation of bond angles $\theta$ from their ideal values $\theta_0$. The values of $\theta_0$ and $k_\theta$ depend on the chemical types of the atoms forming the angle.
• If the parabola is narrow ($k_\theta$ is large), more energy is needed to bend the bond angle away from its ideal geometry; if it is broad ($k_\theta$ is small), the angle bends easily.

[Figure: harmonic angle potential with its minimum at the ideal geometry]

Force field (3): Torsion angle function, $E_{tors}$

$E_{tors} = \frac{1}{2} k \, [1 + \cos(n\tau - \phi)]$

• Models the presence of steric barriers between atoms separated by 3 covalent bonds, A-B-C-D (1,4 pairs). The motion associated with this term is a rotation around the middle bond, described by a dihedral angle $\tau$, a coefficient of symmetry ($n$ = 1, 2, 3) and a phase $\phi$. This potential is assumed to be periodic and is often expressed as a cosine function.
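
To make the periodicity concrete, the torsion term can be evaluated over a range of dihedral angles. This is a minimal sketch; the barrier height and the threefold symmetry are assumed, ethane-like values chosen for illustration, not parameters from any particular force field.

```python
import math

def torsion_energy(tau_deg, k, n, phi_deg=0.0):
    """Torsion energy E = 1/2 * k * [1 + cos(n*tau - phi)], angles in degrees."""
    tau = math.radians(tau_deg)
    phi = math.radians(phi_deg)
    return 0.5 * k * (1.0 + math.cos(n * tau - phi))

# Assumed example: threefold term with a 3 kcal/mol barrier
K, N = 3.0, 3
for tau in range(0, 181, 30):
    print(f"tau = {tau:3d} deg  E = {torsion_energy(tau, K, N):.2f} kcal/mol")
```

With n = 3 the energy repeats every 120 degrees: the maxima (eclipsed arrangement) fall at 0, 120 and 240 degrees and the minima (staggered arrangement) at 60, 180 and 300 degrees.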

Force field (4): Out-of-plane torsion, $E_{oop}$

$E_{oop} = \frac{1}{2} k (\chi - \chi_0)^2$

[Figure: out-of-plane bending at a planar center, with atoms labeled H, R, R' and O]

Force field (5): Non-bonded interactions (van der Waals)

$E_{vdW} = \sum \left[ \frac{-A}{r^{6}} + \frac{B}{r^{12}} \right]$    (Lennard-Jones 12-6 equation)

• The repulsive force arises at short distances, where the electron-electron interaction is strong (red)
• The attractive force arises from fluctuations in the charge distribution of the electron clouds when the atoms are at intermediate distances
• Both effects go to zero as the atoms approach infinite separation
• Van der Waals interactions are among the most important interactions for the stability of biological macromolecules
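
The shape of the 12-6 curve follows directly from the two terms. The sketch below evaluates the pair energy and locates its minimum by setting dE/dr = 0, which gives $r_{min} = (2B/A)^{1/6}$ and $E_{min} = -A^2/(4B)$. The coefficients A and B are assumed values for illustration only.

```python
def lennard_jones(r, a, b):
    """Lennard-Jones 12-6 energy for one atom pair, E = -A/r^6 + B/r^12."""
    return -a / r ** 6 + b / r ** 12

def lj_minimum(a, b):
    """Distance and energy of the minimum, obtained from dE/dr = 0."""
    r_min = (2.0 * b / a) ** (1.0 / 6.0)
    return r_min, -a * a / (4.0 * b)

# Assumed coefficients (kcal/mol * Å^6 and kcal/mol * Å^12)
A, B = 600.0, 900000.0
r_min, e_min = lj_minimum(A, B)
print(f"minimum at r = {r_min:.3f} Å, E = {e_min:.3f} kcal/mol")
for r in (3.0, r_min, 5.0, 10.0):
    print(f"r = {r:6.3f} Å  E = {lennard_jones(r, A, B):8.4f} kcal/mol")
```

At 10 Å the energy is already a small fraction of the well depth, which is the reason a distance cut-off costs little accuracy for this term.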

Time-saving trick

• Usually a cut-off value of 8-10 Å is used for the Lennard-Jones term to speed up the calculation
• A cut-off value is also used when calculating electrostatic interactions. Electrostatic interactions decay more slowly, so their cut-off value is larger

Force field (6): Non-bonded interactions (electrostatic interactions)

$E_{elec} = \frac{q_1 q_2}{D \, r_{12}}$

• The electrostatic interaction between a pair of atoms is represented by the Coulomb potential; $D$ is the effective dielectric function of the medium and $r_{12}$ is the distance between the two atoms carrying charges $q_1$ and $q_2$
• Other non-bonded interactions: hydrogen-bonding interactions
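
A short numerical sketch shows how strongly the dielectric value scales the Coulomb term. Two things here are assumptions and not part of the slide: the unit-conversion constant of about 332.06 kcal·Å/(mol·e²), which is needed when charges are given in units of e and distances in Å, and the partial charges used in the example.

```python
COULOMB_CONSTANT = 332.0637  # kcal*Å/(mol*e^2); assumed unit-conversion factor

def coulomb_energy(q1, q2, r12, dielectric=1.0):
    """Electrostatic energy of two point charges, E = q1*q2 / (D * r12).

    Charges in units of e, distance in Å, result in kcal/mol.
    """
    return COULOMB_CONSTANT * q1 * q2 / (dielectric * r12)

# Assumed partial charges of +0.4 e and -0.4 e, 3 Å apart
for d in (1.0, 4.0, 80.0):
    e = coulomb_energy(+0.4, -0.4, 3.0, dielectric=d)
    print(f"D = {d:4.0f}  E = {e:8.3f} kcal/mol")
```

The energy falls off only as 1/r, much more slowly than the r^-6 attraction of the van der Waals term, so electrostatic interactions remain significant at larger distances.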

Components of a force field

• Any force field contains the necessary building blocks for calculating the energy:
    1. A list of atom types
    2. A list of atomic charges
    3. Rules for atom types
    4. Functional forms of the components of the energy expression
    5. Parameters for the function terms
• What is an atom type?
    • An atom type is a unique description of an element and its environment (such as C=O vs C-O) and its hybridization (for carbon: sp, sp2, sp3)
• If the atom type is not correct, the molecular geometry is not correct!

Molecular geometry

• Carbon atoms have different geometries
    – sp3 (tetrahedral)
    – sp2 (trigonal planar)
    – sp (linear)

Atom types

• Different force fields usually have different atom types
• Each atom type declaration must be unique
• Atom types must be correct to get the correct geometry
• Crystal structures may have "different" atom types and bonds
    – Check when using structures from different databases
        • Atom types correct?
        • Bond lengths, bond angles reasonable?
        • Bond orders correct?
        • Correct enantiomer?

[Table: Tripos atom types]

Create force field terms for propane

• What atom types?
• How many different bonds?
• How many different angles?
• How many torsion angles?
• What about non-bonded interactions?

[Figure: structure of propane with all hydrogens drawn]
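
One way to check the propane exercise is to count the terms from the connectivity. The sketch below is an illustration under stated assumptions: the atom labels are arbitrary, the counting rules are valid for acyclic molecules only, and non-bonded pairs are taken to be all pairs separated by three or more bonds (so 1,4 pairs are included, which individual force fields may treat or scale differently).

```python
from itertools import combinations

# Propane connectivity: C1-C2-C3 plus eight hydrogens (labels are arbitrary)
BONDS = [("C1", "C2"), ("C2", "C3"),
         ("C1", "H1"), ("C1", "H2"), ("C1", "H3"),
         ("C2", "H4"), ("C2", "H5"),
         ("C3", "H6"), ("C3", "H7"), ("C3", "H8")]

def count_terms(bonds):
    """Count bond, angle, torsion and non-bonded terms of an acyclic molecule."""
    neighbors = {}
    for a, b in bonds:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    # One angle for every pair of neighbors around a central atom
    angles = [(i, j, k) for j, nb in neighbors.items()
              for i, k in combinations(sorted(nb), 2)]
    # One torsion i-j-k-l for every choice of outer atoms around bond j-k
    torsions = [(i, j, k, l) for j, k in bonds
                for i in neighbors[j] - {k}
                for l in neighbors[k] - {j}]
    # Non-bonded pairs: all pairs except bonded (1,2) and angle (1,3) pairs
    n_atoms = len(neighbors)
    nonbonded = n_atoms * (n_atoms - 1) // 2 - len(bonds) - len(angles)
    return len(bonds), len(angles), len(torsions), nonbonded

print(count_terms(BONDS))
```

For propane this gives 10 bond terms, 18 angle terms, 18 torsion terms and 27 non-bonded pairs. If one sp3 carbon type and one hydrogen type are assumed, only two kinds of bond parameters (C-C and C-H) are needed for the 10 bonds.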

Where to get parameters for force fields:

– Empirical force field
    • Calibrated with experimental data on small molecules: structural data from X-ray crystallography and NMR, dynamic data from spectroscopy and inelastic neutron scattering, and thermodynamic data (e.g. cvff)
– Ab initio force field
    • Ab initio (QM) calculations are used to produce the data for calibrating the functional form and parameters (e.g. cff)

Force fields

They differ in:
– parameters and cross terms
– methods of parameterization

Force fields

• The choice of force field depends on the purpose it was designed for:
    – MMFF94: optimized for small organic compounds; wide structural variety
    – Tripos: general purpose; reasonable (but not excellent) parameters for a wide variety of atom environments
    – Amber94: optimized for proteins; parameters for other organic compounds are often missing
    – CFF95: for polymers
    – UFF: Universal Force Field; contains parameters even for metals
    – PEFSA95: optimized for carbohydrates
• A force field is usually a compromise between speed and accuracy

Energy minimization = geometry optimization

• After sketching or downloading from a database, a molecule usually has bond lengths, angles, etc. that are far from ideal
• Energy minimization is an approach that finds stable, low-energy conformations by changing the geometry of a structure
• During minimization, the Cartesian coordinates (x, y, z positions) of each atom are adjusted to obtain the optimal geometry and minimal energy
• The energy E is minimized on the assumption that entropy effects can be neglected
• Typically, only small movements in atom positions are made

Potential energy surface (PES)

o Geometry optimization / energy minimization is done using minimization algorithms
o A minimization algorithm is used to locate minimum points on the potential energy surface (PES)
o Most minimization algorithms only go downhill on the PES
    - It is important to have several starting structures
    - There are several local minima

Minimizing process:

I. Check the starting geometry (remove bad van der Waals contacts; the minimum-energy geometry depends on the starting geometry)
II. Select a suitable force field
III. Select a minimizing algorithm (steepest descent, conjugate gradient, Powell, Newton-Raphson, etc.)
IV. Choose parameters for the minimization (convergence criteria, maximum gradient, etc.)

Minimization algorithms

Minimization algorithms can be divided into:
I. Non-derivative algorithms (no functional form for E)
    o Simplex
II. Derivative-based algorithms
    1. First-derivative methods
        o Steepest descent
        o Conjugate gradient / Powell
        o Broyden-Fletcher-Goldfarb-Shanno (BFGS) (quasi-Newton)
    2. Second-derivative methods
        o Newton-Raphson (NR)
        o Truncated Newton (TN)
III. Multidimensional methods
    o Monte Carlo
    o Molecular dynamics
    o Simulated annealing
    o Genetic algorithm

Stationary point: $\frac{dV(\vec{r})}{d\vec{r}} = 0$
Minimum: $\frac{d^2 V(\vec{r})}{d\vec{r}^2} > 0$
Maximum: $\frac{d^2 V(\vec{r})}{d\vec{r}^2} < 0$

I. Non-derivative minimization algorithms

o Optimization algorithms that do not use derivatives of the energy function

SIMPLEX (in Sybyl): three basic strategies
    • Reflection
    • Expansion
    • Contraction

• Effective for bad geometries; very slow near minima; very crude
• May invert chiralities!
• Do not use for proteins!

II. Derivative-based minimization algorithms

• Require derivatives to be calculated (they can be obtained either analytically or numerically)
• The energy function is in a form that allows the first (and, if wanted, the second) derivatives to be calculated

I. First derivative
    o Indicates the slope of the energy surface = gradient
    o Gradient = 0 indicates minima, maxima and saddle points
II. Second derivative
    o Distinguishes the type of stationary point on the energy surface:
        Positive curvature in all directions = minimum
        Negative curvature in all directions = maximum
        Mixed curvature (positive in some directions, negative in others) = saddle point
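
In more than one dimension, the curvature test is applied to the eigenvalues of the matrix of second derivatives (the Hessian): all positive at a minimum, all negative at a maximum, and of mixed sign at a saddle point. The sketch below does this for a two-variable surface; the function name and the three test surfaces are assumptions made for illustration.

```python
import math

def classify_stationary_point(hxx, hxy, hyy, tol=1e-12):
    """Classify a stationary point of V(x, y) from its 2x2 Hessian.

    The eigenvalues of the symmetric matrix [[hxx, hxy], [hxy, hyy]]
    are the curvatures along the two principal directions.
    """
    mean = 0.5 * (hxx + hyy)
    spread = math.sqrt((0.5 * (hxx - hyy)) ** 2 + hxy ** 2)
    low, high = mean - spread, mean + spread
    if low > tol:
        return "minimum"
    if high < -tol:
        return "maximum"
    if low < -tol and high > tol:
        return "saddle point"
    return "undetermined (a curvature is zero)"

# Hessians of three simple test surfaces at the origin
print(classify_stationary_point(2.0, 0.0, 2.0))    # V = x^2 + y^2
print(classify_stationary_point(-2.0, 0.0, -2.0))  # V = -x^2 - y^2
print(classify_stationary_point(2.0, 0.0, -2.0))   # V = x^2 - y^2
```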

I. First-derivative minimization algorithms

Steepest descent
• Numerically calculated derivatives
• Proceeds along the direction of the forces
• Inefficient after a few iterations (use only if the gradient is extremely high)
• Works best when the molecule is far from a minimum; converges poorly close to a minimum because the gradient becomes smaller as the minimum is approached (it oscillates close to the minimum)
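
The behavior described above can be reproduced on a toy surface. This is a minimal sketch, not the algorithm of any particular modeling package: the two-variable energy function, the fixed step size and the gradient tolerance are all assumptions, and real implementations usually adjust the step with a line search.

```python
def energy(x, y):
    """Assumed test surface: a narrow valley with its minimum at (1, 2)."""
    return (x - 1.0) ** 2 + 10.0 * (y - 2.0) ** 2

def gradient(x, y):
    """Analytical first derivatives of the test surface."""
    return 2.0 * (x - 1.0), 20.0 * (y - 2.0)

def steepest_descent(x, y, step=0.09, gtol=1e-4, max_steps=10000):
    """Follow the negative gradient (the force) with a fixed step size."""
    for n in range(max_steps):
        gx, gy = gradient(x, y)
        if (gx * gx + gy * gy) ** 0.5 < gtol:
            return x, y, n
        x, y = x - step * gx, y - step * gy
    return x, y, max_steps

x, y, n_steps = steepest_descent(5.0, 5.0)
print(f"minimum near ({x:.4f}, {y:.4f}) after {n_steps} steps")
```

With this step size the y coordinate overshoots the valley floor on every step and changes sign relative to the minimum each time, which mirrors the oscillation close to the minimum, while progress along the shallow x direction is slow.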

I. First-derivative minimization algorithms, continued

Conjugate gradients / Powell
• Uses the gradients from two successive points to determine the direction after the first step
    - Oscillates less
• Powell is similar to conjugate gradients (sometimes torsion angles are modified too much)
• Powell is not a suitable algorithm after conformational analysis
• Efficient near the minimum (usually finds a minimum in fewer steps than steepest descent)
• May have problems if the initial conformation is far from a minimum
• A good choice for small molecules (the Powell method also for proteins, as it is the most efficient minimization method, ~3x faster)

II. Second-derivative minimization algorithms

o Newton-Raphson (NR)
    • Calculates the second derivatives of the energy function
    • Predicts the location of a minimum and heads in that direction
    • Fast convergence, but requires a lot of memory
    • Unstable if far from a minimum
    • For small molecules
o Truncated Newton (TN)
    • Similar to NR (the iterative linear equation solver is terminated after a small number of iterations)
    • Efficient when the gradient is reasonable
o Broyden-Fletcher-Goldfarb-Shanno (BFGS, quasi-NR method)
    • Approximates the second derivatives by iteration
    • Predicts the location of a minimum and goes in that direction
    • Slow method
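
A one-dimensional sketch shows both the fast convergence of Newton-Raphson and its instability away from a minimum. The routine and both test functions are assumptions for illustration; in a molecular system the second derivative becomes a 3N x 3N matrix, which is where the memory cost comes from.

```python
def newton_raphson(grad, hess, x, gtol=1e-8, max_steps=50):
    """Minimize a 1D function using its first and second derivatives."""
    for n in range(max_steps):
        g = grad(x)
        if abs(g) < gtol:
            return x, n
        x = x - g / hess(x)  # jump to the stationary point of the local quadratic model
    return x, max_steps

# Harmonic bond stretch E = 1/2 * k * (r - r0)^2 with assumed k and r0
K, R0 = 600.0, 1.53
r_min, steps = newton_raphson(lambda r: K * (r - R0), lambda r: K, 2.5)
print(f"harmonic: r = {r_min:.4f} Å after {steps} step(s)")

# Assumed anharmonic test function E = x^4 - 2x^2 with minima at x = -1 and x = 1
quartic_grad = lambda x: 4 * x ** 3 - 4 * x
quartic_hess = lambda x: 12 * x ** 2 - 4
x_min, steps = newton_raphson(quartic_grad, quartic_hess, 1.5)
print(f"quartic:  x = {x_min:.6f} after {steps} steps")
```

On the purely quadratic surface a single step lands on the minimum. For the quartic, a start at x = 1.5 converges quickly to the minimum at x = 1, but a start at x = 0.2, where the curvature is negative, converges to the maximum at x = 0 instead.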

Multidimensional methods

• Uphill movements allowed!
• Applications: macromolecules, such as proteins
• These methods are described in the next lesson

How to end minimization?

1. Number of minimization steps
    - Define how many minimization steps are taken
2. Gradient criterion
    o Minimization stops if the gradient is less than a selected value
    o The gradient can be given as the RMS derivative of the energy
    o A rough minimization gradient: 0.1 kcal/mol/Å
    o Fine minimization for a small molecule: 0.001 kcal/mol/Å
3. Delta energy
    o Minimization ends when the change in energy between the current step and the previous step is less than a set criterion (for example 0.05 kcal/mol)
4. Step size criterion
    - The size of the change in coordinates is monitored; when this change is smaller than a set parameter, minimization ends
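
The four criteria can be combined in a single check that a minimizer calls after every step. The function below is a hypothetical sketch: its name, argument layout and default thresholds are assumptions loosely based on the values quoted above, not the interface of any particular program.

```python
import math

def should_stop(step, gradient, energy, prev_energy, displacement,
                max_steps=1000, rms_grad_tol=0.001,
                delta_e_tol=0.05, step_size_tol=1e-4):
    """Return the name of the first termination criterion met, or None.

    gradient and displacement are flat lists of per-coordinate values.
    """
    rms_grad = math.sqrt(sum(g * g for g in gradient) / len(gradient))
    max_move = max(abs(d) for d in displacement)
    if step >= max_steps:
        return "maximum number of steps"
    if rms_grad < rms_grad_tol:
        return "gradient"
    if abs(energy - prev_energy) < delta_e_tol:
        return "delta energy"
    if max_move < step_size_tol:
        return "step size"
    return None

# Example: small energy change between two steps, gradient still large
print(should_stop(5, [1.0, -0.5], 10.00, 10.01, [0.1, 0.05]))
```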

Local vs global minimum

o A minimization algorithm finds only a local minimum!
o A deep, narrow minimum may be less populated than a broad minimum with higher energy
o Even a local minimum can be located inaccurately, because the methods slow down as they approach a minimum
o The initial structure determines the result of the minimization!

[Figure: energy curve with a global minimum and a local minimum]

How to find the global minimum?

• Do minimizations from several starting conformations
    • The starting conformation affects the minimum-energy structure
• Use MD or a simulated annealing approach to overcome barriers
• Systematic scanning of the molecular potential energy surface (PES)
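
The first strategy, minimizing from several starting points, can be illustrated on a one-dimensional surface with two minima. Everything in this sketch is assumed for illustration: the test function, the simple fixed-step downhill minimizer and the list of starting points.

```python
def energy(x):
    """Assumed 1D test surface with two minima of different depth."""
    return x ** 4 - 4.0 * x ** 2 + x

def gradient(x):
    return 4.0 * x ** 3 - 8.0 * x + 1.0

def minimize(x, step=0.01, gtol=1e-6, max_steps=100000):
    """Plain downhill minimizer: it can only reach the nearest minimum."""
    for _ in range(max_steps):
        g = gradient(x)
        if abs(g) < gtol:
            break
        x -= step * g
    return x

starts = [-2.5, -1.0, 0.5, 2.5]
minima = [minimize(x0) for x0 in starts]
for x0, xm in zip(starts, minima):
    print(f"start {x0:5.2f} -> minimum at x = {xm:6.3f}, E = {energy(xm):7.3f}")

best = min(minima, key=energy)
print(f"lowest minimum found: x = {best:.3f}")
```

Starts on the right-hand side of the barrier end in the shallower minimum near x = 1.35, and only starts on the left reach the deeper one near x = -1.47, so the lowest minimum is found only because several starting points were tried.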

Comparing steric energies?

• Be careful!
• Compare steric energies directly only for conformational isomers or geometric isomers that have the same number and types of bonds
• For hexane compared with pentane, the difference in steric energy is larger simply because hexane has more atoms
• Enthalpy of formation or bond enthalpy can be used reliably for comparing molecules with different numbers of atoms

[Table: MM2/MM3 energies of alkanes]

Fair structure comparisons?

• Use enthalpy if possible (usually not available)
• Use a reference structure and study the difference from it
• A reference structure helps to cancel out the effects of having different numbers of atoms and bonds
    – For example:
      Incorrect comparison: cyclopentane ↔ cyclohexane ↔ cycloheptane
      Better choice: (cyclopentane - pentane) ↔ (cyclohexane - hexane) ↔ (cycloheptane - heptane)

Table: Molecular mechanics steric energy comparisons

Comparison               Steric energy directly    Difference with reference
Conformational isomers   yes                       yes
Geometric isomers        if same environment       yes
Different formulas       never                     yes

Utilizing minimization methods:

[Table: first (1.) and second (2.) choice among steepest descent, conjugate gradient or Powell, and Newton-Raphson or BFGS, for small molecules (<200 atoms) and large molecules (>200 atoms), each far from and close to a minimum. The column positions of the rankings are not recoverable from this transcript.]

The choice of minimization method depends on 1) the size of the system and 2) the current state of the optimization.

Example: minimization of netropsin with steepest descents and conjugate gradients

                       Pre-minimization (<1 kcal/Å)    Minimization (<0.1 kcal/Å)
Method                 CPU time (s)    Iterations      CPU time (s)    Iterations
Steepest descents      67              98              1405            1893
Conjugate gradients    149             213             257             367

Source: Leach, A. R. Molecular Modelling: Principles and Applications.

Minimizing a part of a molecule:

o Only a part of the structure, or only certain atom types, are minimized
o Usage:
    • Hydrogens added to a crystal structure
        • Avoid bumps with other atoms
    • Point-mutated amino acids and their close environment
    • Generated loops in a protein

Conformational search and analysis

Conformational search:

• Definition: the purpose of a conformational search is to find all the different conformers that are possible
• In practice: search for a set of energetically accessible minima

AIM: make a representative sampling of conformational space with the smallest number of conformers that contains the bioactive conformation within the required accuracy

Page 60: UEF // University of Eastern Finland

Conformational search outline

1. Randomly or systematically generated conformations
2. Energy minimizing
3. Duplicates elimination
4. Representative structures for each potential minimum

Page 61: UEF // University of Eastern Finland

Conformational analysis in modeling is needed for:

• Pharmacophore modeling
• Rigid docking
• Shape fitting
• 3D QSAR
• Virtual screening
• Any in silico 3D drug discovery approach that depends on the accurate representation of low-energy conformations

Page 62: UEF // University of Eastern Finland

Conformational search strategies:

1. Deterministic methods
   – Systematic search
   – Molecular dynamics
     • Simulated annealing

2. Stochastic methods
   – Random search
   – Monte Carlo
   – Genetic algorithm
   – Distance geometry

Page 63: UEF // University of Eastern Finland

Combinatorial explosion:

Number of conformations $= \prod_{i=1}^{n} \frac{360^\circ}{\theta_i}$, where $n$ is the number of rotatable bonds and $\theta_i$ is the torsion increment used for bond $i$.

• Rotatable bonds: 3, increment: 30°, conformations: 1728, minimization: 1 conf/s, total time: 29 minutes

• Rotatable bonds: 5, increment: 30°, conformations: 248832, minimization: 1 conf/s, total time: 69 h

With 7 rotatable bonds, about 36 million conformations are generated, which take 415 days to minimize!
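The numbers on this slide can be checked with a few lines of arithmetic. The sketch below assumes the same increment for every bond and a minimization rate of 1 conformation per second; the function name is made up for this illustration.

```python
def systematic_search_cost(n_bonds, increment_deg=30, confs_per_second=1.0):
    """Return (number of conformations, minimization time in seconds)."""
    steps_per_bond = 360 // increment_deg
    n_conformations = steps_per_bond ** n_bonds
    return n_conformations, n_conformations / confs_per_second


for n_bonds in (3, 5, 7):
    n_conf, seconds = systematic_search_cost(n_bonds)
    print(f"{n_bonds} bonds: {n_conf} conformations, "
          f"{seconds / 60:.0f} min = {seconds / 3600:.1f} h = {seconds / 86400:.1f} days")
```

Each additional rotatable bond multiplies the cost by 360/30 = 12, which is why the search time grows from minutes to more than a year between 3 and 7 bonds.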

Page 64: UEF // University of Eastern Finland

Restricted systematic search method:

• Uses an energy cut-off value to decrease the number of conformations
  – Conformations with severe intramolecular clashes are removed
  – High-energy conformations are ignored

• Can be used to study even 10–15 rotatable bonds

[Figure labels: starting geometry; acceptable conformation minimum; high energy conformations]
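The energy cut-off idea can be sketched on a toy model. The example below assumes a hypothetical energy function made only of threefold torsion terms with a 3 kcal/mol barrier, and it filters after enumerating the full grid. A real restricted search would instead discard partial conformations early, which is what makes 10–15 bonds tractable.

```python
import itertools
import math


def torsion_energy(angles_deg, barrier=3.0):
    # Toy energy: sum of threefold torsion terms 0.5*V*(1 + cos(3*phi)),
    # with minima at 60, 180 and 300 degrees (hypothetical parameters).
    return sum(0.5 * barrier * (1.0 + math.cos(math.radians(3.0 * a)))
               for a in angles_deg)


def restricted_systematic_search(n_bonds, increment_deg=30, cutoff=1.0):
    grid = range(0, 360, increment_deg)
    scored = [(torsion_energy(conf), conf)
              for conf in itertools.product(grid, repeat=n_bonds)]
    lowest = min(energy for energy, _ in scored)
    # Keep only conformations within `cutoff` kcal/mol of the lowest energy
    kept = [(energy, conf) for energy, conf in scored if energy - lowest <= cutoff]
    return scored, kept


all_confs, accepted = restricted_systematic_search(n_bonds=2)
print(len(all_confs), "grid points,", len(accepted), "within the cut-off")
```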

Page 65: UEF // University of Eastern Finland

Advantages and disadvantages of the systematic search method

Advantages:
• Explores the whole conformational space systematically. All possible minimum-energy conformers can be found.

Disadvantages:
• Time consuming: the number of conformations is huge
• Cannot be used for large systems, and is severely limited for ring systems

Partial solution: restricted systematic search (energy cut-off)

Page 66: UEF // University of Eastern Finland

Random search:

• Generates conformers by random perturbation of the Cartesian coordinates or of the torsion angles of rotatable bonds; the structure is then minimized. The resulting conformation is compared with those already found and is registered if it differs from all of them. This cycle is repeated several times.

• Perturbation of Cartesian coordinates relies heavily on the minimization step, as the generated conformations can be very distorted and high in energy.

• In a random search, the conformation can move from one region of the energy surface to a completely unconnected region in a single step.
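The perturb, minimize, compare and register cycle can be sketched as follows. This is a minimal illustration under stated assumptions: torsion-angle perturbation only, a hypothetical threefold torsion energy, plain gradient descent as the minimizer, and made-up function names.

```python
import math
import random

BARRIER = 3.0  # hypothetical torsional barrier, kcal/mol


def energy(phi):
    # Toy energy: sum of 0.5*V*(1 + cos(3*phi)) over all torsions (radians)
    return sum(0.5 * BARRIER * (1.0 + math.cos(3.0 * p)) for p in phi)


def minimize(phi, step=0.05, n_steps=200):
    # Plain gradient descent on the toy energy
    phi = list(phi)
    for _ in range(n_steps):
        gradient = [-1.5 * BARRIER * math.sin(3.0 * p) for p in phi]
        phi = [p - step * g for p, g in zip(phi, gradient)]
    return [p % (2.0 * math.pi) for p in phi]


def is_same(a, b, tol=0.05):
    # Two conformers are duplicates if all torsions agree within `tol` radians
    for x, y in zip(a, b):
        diff = abs(x - y) % (2.0 * math.pi)
        if min(diff, 2.0 * math.pi - diff) > tol:
            return False
    return True


def random_search(n_bonds=2, n_cycles=200, seed=0):
    rng = random.Random(seed)
    current = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_bonds)]
    found = []
    for _ in range(n_cycles):
        # 1) random perturbation, 2) minimization
        trial = [p + rng.uniform(-math.pi, math.pi) for p in current]
        conformer = minimize(trial)
        # 3) register only if different from everything found so far
        if not any(is_same(conformer, known) for known in found):
            found.append(conformer)
        current = conformer
    return found


conformers = random_search()
print(len(conformers), "distinct minima found")
```

The number of cycles is an arbitrary choice here, which reflects the lack of a natural end point for this kind of search.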

Page 67: UEF // University of Eastern Finland

Random search advantages and disadvantages:

Advantages:
• Can explore conformations of ring systems
• Chiral centers can be kept in their original geometry or inverted during generation
• Fast and powerful method for large, flexible molecules with many chiral centers

Disadvantages:
• No real end point of the search
• One can never be sure that all minimum conformations have been found!

Page 68: UEF // University of Eastern Finland

What  is  molecular  dynamics?

•Molecular  dynamics  (MD)  is  a  computer  simulation  technique  that  allows  one  to  predict    the  time  evolution  of  a  system  of  interacting  particles  (atoms,  molecules,  granules,  etc.)

•Model  the  motion  of  some  group  of  particles  (e.g.,  atoms)  by  solving  the  classical  equations  of  motion

Page 69: UEF // University of Eastern Finland

MD basics

1. Specify the system
   o A set of initial conditions: initial positions and velocities of the particles in the system
   o The interaction potential for deriving the forces = a suitable force field

2. Follow the evolution of the system in time
   o Solve a set of classical equations of motion for all particles in the system

Typically MD simulations feature $10^2$–$10^8$ atoms, over times of 10 ps – 100 ns.
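The two steps above can be sketched for the simplest possible system. The example assumes a single particle on a harmonic spring in reduced units and uses the velocity Verlet scheme, one common integrator; the slides do not say which integrator is meant, so treat this as an illustration.

```python
import math


def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate Newton's equation of motion; return a list of (x, v) pairs."""
    trajectory = [(x, v)]
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # forces at the new position
        v = v_half + 0.5 * dt * f / mass   # second half-step velocity update
        trajectory.append((x, v))
    return trajectory


# 1. Specify the system: initial position and velocity, interaction potential
spring_constant, mass = 1.0, 1.0           # reduced units (assumption)
harmonic_force = lambda x: -spring_constant * x

# 2. Follow the evolution of the system in time
traj = velocity_verlet(x=1.0, v=0.0, force=harmonic_force, mass=mass,
                       dt=0.01, n_steps=1000)
x_end, v_end = traj[-1]
total_energy = 0.5 * mass * v_end ** 2 + 0.5 * spring_constant * x_end ** 2
print("final position:", x_end, "analytical:", math.cos(10.0))
print("total energy:", total_energy)
```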

Page 70: UEF // University of Eastern Finland

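• MD total energy: $E_{tot} = E_{pot} + E_{kin}$

– $E_{pot}$ comes from the force field (AMBER, CHARMM, GROMOS):

$E_{pot} = \sum E_{str}\,(1) + \sum E_{bend}\,(2) + \sum E_{tors}\,(3) + \sum E_{oop}\,(4) + \sum E_{vdw}\,(5) + \sum E_{elec}\,(6)$

– $E_{kin}$ is the kinetic part of the energy (from Newton's laws of motion)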
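The kinetic term is not written out on the slide. In classical mechanics it has the standard form below, and the forces that drive the motion follow from the force-field potential. The symbols $N$, $m_i$, $\mathbf{v}_i$ and $\mathbf{r}_i$ (number of atoms, atomic mass, velocity and position) are notation assumed for this sketch.

```latex
E_{kin} = \sum_{i=1}^{N} \tfrac{1}{2}\, m_i \,\lvert \mathbf{v}_i \rvert^{2},
\qquad
\mathbf{F}_i = m_i\, \mathbf{a}_i = -\nabla_{\mathbf{r}_i} E_{pot}
```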

Page 71: UEF // University of Eastern Finland

Why molecular dynamics (MD)?

• To explore the conformational space that a molecule can visit

• To get detailed information on the fluctuations and conformational changes of molecules (including proteins and nucleic acids)
  – Molecules are not static or rigid structures at room temperature
  – At a temperature of 0 K a molecule does not move

• To study complex, dynamic processes (protein stability, conformational changes, protein folding, molecular recognition, ion transport in biological systems)

Page 72: UEF // University of Eastern Finland

MD simulation

• An initial configuration of the system (the starting point, t = 0) is selected
  – It can be an X-ray crystal structure or an NMR structure.

• The initial configuration can influence the quality of the simulation => choose carefully. It is often good to choose a configuration close to the state that you wish to simulate.

• Minimize the energy of the structure to remove any strong van der Waals interactions, which might otherwise lead to local structural distortion and result in an unstable simulation

• Add solvent (explicit water molecules)
• Use periodic boundary conditions (PBC)
• Simulate your system over time under specific conditions (pressure, volume and temperature)

• Run time: 1 ns – 500 ns (can take weeks or months)

Page 73: UEF // University of Eastern Finland

Choosing a time step

• Too small: only a small part of conformational space is covered
• Too large: instability
• Suggested time steps
  – Translation: 10 fs
  – Flexible molecules with rigid bonds: 2 fs
  – Flexible molecules with flexible bonds: 1 fs
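Why too large a time step leads to instability can be seen on a single harmonic vibration. The sketch below assumes reduced units (angular frequency 1) and the velocity Verlet integrator, which for this system is stable only while frequency times time step stays below 2. The function name is made up for this illustration.

```python
def max_relative_energy_error(dt, n_steps=100, omega=1.0):
    """Largest relative deviation of the total energy from its initial value."""
    x, v = 1.0, 0.0                       # unit mass, reduced units
    e0 = 0.5 * v * v + 0.5 * omega ** 2 * x * x
    worst = 0.0
    f = -omega ** 2 * x
    for _ in range(n_steps):
        v += 0.5 * dt * f                 # velocity Verlet step
        x += dt * v
        f = -omega ** 2 * x
        v += 0.5 * dt * f
        e = 0.5 * v * v + 0.5 * omega ** 2 * x * x
        worst = max(worst, abs(e - e0) / e0)
    return worst


for dt in (0.01, 1.0, 2.1):
    print(f"dt = {dt}: max relative energy error = {max_relative_energy_error(dt):.3g}")
```

In a molecule the fastest motions are bond vibrations involving hydrogen, with periods of roughly 10 fs. This is consistent with the 1 fs suggestion for flexible bonds and with the larger step that becomes possible once bonds are kept rigid.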

Page 74: UEF // University of Eastern Finland

Periodic boundary conditions (PBC):

• With PBC, a small number of molecules can represent a bulk system with fewer surface effects

• When using PBC, the particles are enclosed in a box, and we can imagine that this box is replicated to infinity by rigid translation in all three Cartesian directions, completely filling space.

• When a particle leaves the box, its image enters from the opposite side.
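Both ideas, re-entering from the opposite side and interacting with the nearest periodic image, can be sketched in a few lines. The example assumes a rectangular box with its origin at one corner; the function names are made up for this illustration.

```python
def wrap(position, box):
    # Put a particle that left the box back in through the opposite face
    return [p % length for p, length in zip(position, box)]


def minimum_image_distance(a, b, box):
    # Distance between a and the nearest periodic image of b
    squared = 0.0
    for ai, bi, length in zip(a, b, box):
        d = ai - bi
        d -= length * round(d / length)   # shift by whole box lengths
        squared += d * d
    return squared ** 0.5


box = [10.0, 10.0, 10.0]                  # box edge lengths (arbitrary units)
print(wrap([10.5, -0.5, 3.0], box))
print(minimum_image_distance([0.5, 0.0, 0.0], [9.5, 0.0, 0.0], box))
```

Two particles near opposite faces of the box are therefore close neighbours through the boundary, which is how a small box can mimic bulk behaviour.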

Page 75: UEF // University of Eastern Finland

Analyzing results

• During an MD simulation, the coordinates and velocities of the system are saved; these are then used for the analysis. Time-dependent properties (energy, RMSD, etc.) can be displayed graphically.

• Average structures can be calculated and compared to experimental structures

[Figure: Potential energy as a function of time. x-axis: time (ps), 0 to 100; y-axis: potential energy (kcal/mol), -250 to 0]
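A time-dependent property such as RMSD can be computed directly from the saved coordinates. The sketch below assumes that every frame has already been superimposed on the reference structure, since no fitting is done here. The tiny two-atom "trajectory" is invented purely for illustration.

```python
def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized coordinate sets."""
    n_atoms = len(coords_a)
    total = sum((xa - xb) ** 2
                for atom_a, atom_b in zip(coords_a, coords_b)
                for xa, xb in zip(atom_a, atom_b))
    return (total / n_atoms) ** 0.5


reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],   # identical to the reference
    [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],   # every atom displaced by 1 unit
]
rmsd_series = [rmsd(frame, reference) for frame in frames]
print(rmsd_series)
```

Plotting such a series against simulation time gives the kind of graph described above.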

Page 76: UEF // University of Eastern Finland

MD can be used to:

• Visualize the movement of the system
• Study the dynamic behavior of the system
• Perform conformational analysis
• Perform simulated annealing
• Refine protein structures, such as side chains in homology modeling
• Predict protein folding

Page 77: UEF // University of Eastern Finland

Conformational analysis using MD

• MD can provide information about the conformational properties of a molecular system, as well as the way in which the conformation changes with time at a given temperature.

[Figure: energy as a function of simulation time]

Page 78: UEF // University of Eastern Finland

Advantages and disadvantages of MD conformational analysis

Advantages:
• Suitable for large molecules (proteins)
• Ring systems can be studied
• Higher temperatures can be used, so energy barriers can be overcome
• "Movie" of molecular motions

Disadvantages:
• Slow method. For large systems, long simulation times are needed (usually nanoseconds).
• Large structural rearrangements happen on the millisecond timescale, beyond the reach of typical simulations

Page 79: UEF // University of Eastern Finland

Thank  you!

uef.fi