Upload: tracey-atkins

Post on 13-Jan-2016

TRANSCRIPT

Page 1: Conformational Sampling Dragos Horvath Laboratoire d’InfoChimie – UMR 7177 horvath@chimie.u-strasbg.fr

Conformational Sampling

Dragos Horvath

Laboratoire d’InfoChimie – UMR 7177

horvath@chimie.u-strasbg.fr

Page 2

• Presentation Outline

– The Basics: Molecules have Geometries!

• Intramolecular energy calculation: the Empirical Force Field

– Sampling Methods: a brief overview

– Molecular Casino: Monte Carlo Simulations

– Really Difficult Problems: Darwinism, God’s Will and Massively Parallel Computing

Page 3

Different « conformations » or « geometries » of a molecule

Two degrees of freedom

Stable conformation: minimum of energy

Unstable conformation: high potential energy

Page 4

• The POTENTIAL ENERGY calculation is based on the EMPIRICAL FORCE FIELD APPROACH

– Quantum chemical calculations are too time-consuming: atoms and their interactions are approximated as “classical” objects

– Atoms need to be “parameterized” as a function of their chemical environment: a C atom in an alkane does not carry the same partial charge as the C atom of a carbonyl C=O!

– Covalent bonds are modeled as harmonic springs. The energy required to stretch or compress a bond by $\Delta b$ with respect to its natural length $b_0$ is expressed as $K_b \Delta b^2$

– Valence angle bending is modeled by the harmonic potential $K_\theta \Delta\theta^2$

– Atoms that are not directly bonded or do not form an angle interact “through space” by means of non-bonded interactions:

• Van der Waals interactions
• Electrostatic interactions – based on partial charges
• Continuum Solvent models

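Page 5

Non-bonded interactions:

Coulomb:

$E_{coulomb} = \sum_{i,j} \frac{q_i q_j}{4 \pi \varepsilon_0 \varepsilon \, d_{i,j}}$

Van der Waals:

$E_{VdW} = \sum_{i,j} \left( \frac{A_i A_j}{d_{i,j}^{12}} - \frac{B_i B_j}{d_{i,j}^{6}} \right)$

Desolvation & Hydrophobic Term:

$E_{Solv} = k_{solv} \sum_{i,j} \frac{Q_i^2 V_j + Q_j^2 V_i}{d_{i,j}^{4}}$ + a hydrophobic contribution weighted by $k_{hphob}$ (its functional form is not legible in this transcript)

[Figure: two interacting atoms, labeled a1 and a2]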
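The Coulomb and Van der Waals sums above can be sketched in a few lines. This is only an illustration under stated assumptions, not the force field used in the talk: the function name `nonbonded_energy`, the constant `COULOMB_K` (a commonly used value of 1/(4πε0) in kcal·Å/(mol·e²)), and the `excluded` set of bonded pairs are all hypothetical, and the desolvation term is left out.

```python
import math
from itertools import combinations

# Assumed constant: 1/(4*pi*eps0) in kcal*Angstrom/(mol*e^2).
COULOMB_K = 332.06

def nonbonded_energy(coords, charges, A, B, eps=1.0, excluded=frozenset()):
    """Return (E_coulomb, E_vdw) summed over all atom pairs i < j.

    coords   : list of (x, y, z) tuples
    charges  : partial charges q_i
    A, B     : per-atom repulsion / attraction parameters (illustrative)
    excluded : set of (i, j) pairs, i < j, skipped because the atoms are
               directly bonded or form a valence angle
    """
    e_coul = 0.0
    e_vdw = 0.0
    for i, j in combinations(range(len(coords)), 2):
        if (i, j) in excluded:
            continue
        d = math.dist(coords[i], coords[j])
        e_coul += COULOMB_K * charges[i] * charges[j] / (eps * d)
        e_vdw += A[i] * A[j] / d**12 - B[i] * B[j] / d**6
    return e_coul, e_vdw

# Two oppositely charged atoms, 2 Angstrom apart.
e_coul, e_vdw = nonbonded_energy(
    [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [1.0, -1.0], A=[2.0, 2.0], B=[1.0, 1.0]
)
```

The double loop costs O(N²) in the number of atoms, which is one reason real force field codes use distance cutoffs; no cutoff is applied here.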

Page 6

Global energy:

$E_{molecule} = E_b + E_\theta + E_{tors} + E_{nb}$

Torsional correction terms:

$E_{tors} = k_t \, (1 + \cos(3\tau))$

E=f(Geometry)

Page 7

Torsions: the gateway to conformational sampling

- Energy Profile with respect to a torsion....

Page 8

Torsions: the gateway to conformational sampling

- Energy Surface with respect to two torsions....

Page 9

Torsions: the gateway to conformational sampling

- Alternative Contour Plot representation

Page 10

The Ramachandran Plot

http://en.wikipedia.org/wiki/Ramachandran_plot

Page 11

Key points on the energy surface...

Page 12

Computing a 2D torsion plot... Not that easy!

[Figure: grid of energies E, colored from low to high and carrying a “?” label, over $\tau_1$ = 0, 36, 72, 108, 144, 180, 216, 252, 288, 324° and $\tau_2$ = 0, 60, 120, 180, 240, 300°]
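A systematic scan over the grid shown in the figure (10 values for the first torsion, 6 for the second) can be sketched as follows. The energy function `toy_energy` is a made-up stand-in, not a force field, and the sketch evaluates rigid geometries only: it does not relax the remaining degrees of freedom at each grid point.

```python
import math
from itertools import product

def toy_energy(t1_deg, t2_deg):
    """Hypothetical two-torsion energy: threefold terms plus a coupling."""
    t1, t2 = math.radians(t1_deg), math.radians(t2_deg)
    return (1 + math.cos(3 * t1)) + (1 + math.cos(3 * t2)) + 0.5 * math.cos(t1 - t2)

def grid_scan(energy, steps):
    """Evaluate `energy` on a regular grid; `steps` holds one step (degrees) per torsion."""
    grids = [range(0, 360, s) for s in steps]
    return {angles: energy(*angles) for angles in product(*grids)}

surface = grid_scan(toy_energy, steps=(36, 60))   # 10 x 6 = 60 evaluations
best = min(surface, key=surface.get)              # grid point of lowest energy
```

The number of grid points is the product of the per-torsion counts, so three torsions at a 36° step already need 1000 evaluations.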

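Page 13

• Energy Minimization is only [the easy] part of the problem

– Given a starting geometry, deterministic algorithms allow the discovery of the adjacent local minimum

– Descent methods follow the local gradient

Molecular geometry:

$X = (x_1, y_1, z_1, x_2, y_2, \ldots, x_N, y_N, z_N)$

$\nabla E(X) = \left( \frac{\partial E}{\partial x_1}, \frac{\partial E}{\partial y_1}, \frac{\partial E}{\partial z_1}, \frac{\partial E}{\partial x_2}, \frac{\partial E}{\partial y_2}, \ldots, \frac{\partial E}{\partial x_N}, \frac{\partial E}{\partial y_N}, \frac{\partial E}{\partial z_N} \right)$

Iteratively:

$X_{new} = X_{curr} - s \, \nabla E(X_{curr})$

[Figure: energy E plotted against geometry X]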
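The descent iteration on this slide can be sketched with a fixed step size s and a finite-difference gradient. Both choices are assumptions made for brevity (production minimizers use analytical gradients and line searches), and `double_well` is a hypothetical one-dimensional landscape chosen so that the dependence on the starting geometry is visible.

```python
def numerical_gradient(energy, x, h=1e-6):
    """Central-difference estimate of dE/dx_k for every coordinate."""
    grad = []
    for k in range(len(x)):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        grad.append((energy(xp) - energy(xm)) / (2 * h))
    return grad

def steepest_descent(energy, x0, step=0.1, tol=1e-6, max_iter=10_000):
    """Iterate X_new = X_curr - step * grad E(X_curr) until the gradient vanishes."""
    x = list(x0)
    for _ in range(max_iter):
        g = numerical_gradient(energy, x)
        if max(abs(gk) for gk in g) < tol:
            break
        x = [xk - step * gk for xk, gk in zip(x, g)]
    return x

def double_well(x):
    """Hypothetical landscape with two minima, at x = -1 and x = +1."""
    return (x[0] ** 2 - 1) ** 2

right = steepest_descent(double_well, [0.5])    # ends in the minimum near +1
left = steepest_descent(double_well, [-0.5])    # ends in the minimum near -1
```

Each run lands in the minimum adjacent to its starting point, which is why minimization alone does not solve the sampling problem.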

Page 14

Bad news: most molecules have more than 2 torsions...

- No visualization of the energy hypersurface is possible!

Page 15

• Why care about conformational sampling?

– Because experimental properties of a molecule are given by the Boltzmann Average of properties of populated conformers

Boltzmann’s probability distribution:

$P(\text{conformer of energy } E) \sim \exp\left( -\frac{E}{k_B T} \right)$

Boltzmann Averaging:

$\text{Observed Property} = \sum_{\text{populated conformers}} P(\text{conformer}) \times \text{Property}(\text{conformer})$

Objective: finding the most probable solutions

That is, the relevant minima

[Figure: Energy versus Geometry]
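Boltzmann averaging over a set of conformers can be sketched as below. The function names, the default temperature, and the value of `K_B` in kcal/(mol·K) are illustrative assumptions. Energies are shifted by the lowest one before exponentiation, which leaves the normalized weights unchanged but avoids numerical overflow.

```python
import math

K_B = 0.0019872  # Boltzmann constant in kcal/(mol*K), assumed unit system

def boltzmann_weights(energies, temperature=298.15):
    """Normalized populations P_i proportional to exp(-E_i / (k_B * T))."""
    e_min = min(energies)
    factors = [math.exp(-(e - e_min) / (K_B * temperature)) for e in energies]
    z = sum(factors)
    return [f / z for f in factors]

def boltzmann_average(energies, properties, temperature=298.15):
    """Observed property = sum over conformers of P_i * property_i."""
    weights = boltzmann_weights(energies, temperature)
    return sum(w * p for w, p in zip(weights, properties))

# Three hypothetical conformers: energies in kcal/mol, one property value each.
weights = boltzmann_weights([0.0, 0.5, 3.0])
observed = boltzmann_average([0.0, 0.5, 3.0], [1.0, 2.0, 10.0])
```

A conformer a few kcal/mol above the minimum contributes almost nothing at room temperature, which is why the objective is restricted to the relevant minima.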

Page 16

The Challenge…

[Figure: energy landscape, Energy=f(Geometry), defined by the Empirical Force Field. Labels: “Well”-docked (folded) zone; “Misdocked” (folded) conformers; E; E#PDB; PDB; Absolute Energy Minimum; Native-like: one local clash; Microstates contributing to macroscopic property]

Publisher’s Force Field: « Nice H bond »

My Force Field: « Bad Contact »

Page 17

• Presentation Outline

– The Basics: Molecules have Geometries!

• Intramolecular energy calculation: the Empirical Force Field

– Sampling Methods: a brief overview

– Molecular Casino: Monte Carlo Simulations

– Really Difficult Problems: Darwinism, God’s Will and Massively Parallel Computing

Page 18

• Sampling methods

– Systematic: <3…4 torsions

– Molecular Dynamics

• Solve Newton’s equations of motion, given the atomic forces calculated by the force field: simulate “Brownian motion”

– Stochastic sampling:

• Monte Carlo simulations
• Genetic Algorithms

Page 19

• Presentation Outline

– The Basics: Molecules have Geometries!

• Intramolecular energy calculation: the Empirical Force Field

– Sampling Methods: a brief overview

– Molecular Casino: Monte Carlo Simulations

– Really Difficult Problems: Darwinism, God’s Will and Massively Parallel Computing

Page 20

• The Monte Carlo Approach: win an Energy Optimum by Playing Dice!

– Take a random geometry

– Randomly choose a torsional axis

– Apply a random rotation around that axis

– Recalculate the energy of the resulting geometry

• If lower – or, at least, not too (!) high, accept: make the new conformer the new “default” geometry
• Otherwise, reject – restore the previous geometry

– Loop until no further energy drop is observed
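The loop above can be sketched as follows. Two points are assumptions rather than statements from the slide: “not too high” is implemented with the Metropolis criterion (accept an uphill move with probability exp(-ΔE/kT)), and the loop runs for a fixed number of steps instead of testing for a stalled energy. `toy_energy` is a hypothetical stand-in for a force field, and all parameter values are arbitrary.

```python
import math
import random

def toy_energy(torsions):
    """Hypothetical energy: one threefold term per torsion (minima at 60, 180, 300 degrees)."""
    return sum(1 + math.cos(math.radians(3 * t)) for t in torsions)

def monte_carlo(energy, n_torsions, n_steps=5000, kT=0.6, max_turn=60.0, seed=0):
    rng = random.Random(seed)
    current = [rng.uniform(0, 360) for _ in range(n_torsions)]   # random geometry
    e_current = energy(current)
    best, e_best = list(current), e_current
    for _ in range(n_steps):
        trial = list(current)
        k = rng.randrange(n_torsions)                            # random torsional axis
        trial[k] = (trial[k] + rng.uniform(-max_turn, max_turn)) % 360
        e_trial = energy(trial)
        delta = e_trial - e_current
        # Accept if lower, or if "not too high" (Metropolis test).
        if delta <= 0 or rng.random() < math.exp(-delta / kT):
            current, e_current = trial, e_trial
            if e_current < e_best:
                best, e_best = list(current), e_current
        # Otherwise the move is rejected: `current` is left untouched.
    return best, e_best

best, e_best = monte_carlo(toy_energy, n_torsions=3)
```

Keeping track of `best` separately matters because the walker is allowed to climb out of the minimum it has just found.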

Page 21

• Presentation Outline

– The Basics: Molecules have Geometries!

• Intramolecular energy calculation: the Empirical Force Field

– Sampling Methods: a brief overview

– Molecular Casino: Monte Carlo Simulations

– Really Difficult Problems: Darwinism, God’s Will and Massively Parallel Computing

Page 22

Data representation: an « individual » or « chromosome » = the list of its torsional angles $(\tau_1, \tau_2, \ldots, \tau_{n-1}, \tau_n)$

Population of individuals: [Figure: a stack of such chromosomes, one row per individual]

• Genetic Algorithm

– Applying a Darwinian Evolution Scenario to a population of vectors (“chromosomes”) encoding the solution to a problem

– Solution Quality is the “Fitness” score, and the fittest survive…

Page 23

Generation of new offspring:

Crossover:

parent1: $\tau_1 \ldots \tau_i \; \tau_{i+1} \ldots \tau_n$

parent2: $\tau'_1 \ldots \tau'_i \; \tau'_{i+1} \ldots \tau'_n$

child1: $\tau_1 \ldots \tau_i \; \tau'_{i+1} \ldots \tau'_n$

child2: $\tau'_1 \ldots \tau'_i \; \tau_{i+1} \ldots \tau_n$

Mutation:

Wild type: $\tau_1 \ldots \tau_i \; \tau_{i+1} \ldots \tau_n$

mutant: $\tau_1 \ldots \tau'_i \; \tau_{i+1} \ldots \tau_n$
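The two operators on this slide can be sketched on chromosomes stored as plain lists of torsion angles in degrees. The function names are illustrative, the cut point and the mutated position are drawn uniformly, and the mutant value is a fresh random angle; none of these choices is stated in the slides.

```python
import random

def one_point_crossover(parent1, parent2, rng):
    """Swap the tails of two chromosomes after a random cut point."""
    cut = rng.randrange(1, len(parent1))      # cut lies strictly inside the chromosome
    child1 = parent1[:cut] + parent2[cut:]
    child2 = parent2[:cut] + parent1[cut:]
    return child1, child2

def mutate(chromosome, rng):
    """Replace one randomly chosen torsion by a random angle."""
    mutant = list(chromosome)
    i = rng.randrange(len(mutant))
    mutant[i] = rng.uniform(0.0, 360.0)
    return mutant

rng = random.Random(1)
p1 = [10.0, 20.0, 30.0, 40.0]
p2 = [110.0, 120.0, 130.0, 140.0]
c1, c2 = one_point_crossover(p1, p2, rng)
m = mutate(p1, rng)
```

Crossover only recombines torsion values already present in the population; mutation is the operator that introduces new ones.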

Page 24

[Figure: chromosome populations at successive stages. Labels: random initial population; sorted; selection threshold (energies); intermediate population; final population, sorted]

[Plot labels: evolution of the average fitness; evolution of the fitness of the best; the algorithm converges]

Population Diversity Control is a Key Issue

> Discarding of redundant chromosomes (requires a metric defining how similar two encoded solutions are!)

> Multiple ‘Island’ models – parallel simulations occasionally swapping solutions
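Discarding redundant chromosomes needs a similarity metric, as the first point notes. One possible choice, sketched below, is the largest circular difference between corresponding torsions; this metric, the function names, and the 30° limit are assumptions made for illustration, not the ones used in the talk.

```python
def torsion_distance(a, b):
    """Largest circular difference (degrees) between matching torsions."""
    diffs = (abs(x - y) % 360 for x, y in zip(a, b))
    return max(min(d, 360 - d) for d in diffs)

def discard_redundant(population, energies, limit=30.0):
    """Keep the chromosomes, best energy first, that differ from every kept one by >= limit."""
    kept = []
    for idx in sorted(range(len(population)), key=lambda i: energies[i]):
        if all(torsion_distance(population[idx], population[k]) >= limit for k in kept):
            kept.append(idx)
    return [population[i] for i in kept]

population = [[0.0, 0.0], [5.0, 355.0], [180.0, 180.0]]
energies = [1.0, 0.5, 2.0]
diverse = discard_redundant(population, energies)
```

Here the first two chromosomes lie within 5° of each other once the 0°/360° wrap-around is taken into account, so only the lower-energy one survives.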

Page 25

or God??

Genetic Algorithms: Chance, Selection & the Coin Flipper’s bet!

• Any problem admitting a vector as a solution may be coded by a “chromosome” and left in the hands of Darwin…

• I bet (1M€) I can find a person who won a coin-flipping challenge 10 times in a row, at his/her first attempt!!

– In order to fulfill my promise, I need a total of 1024 coin flips to happen

• Spent naively, that is 1024/10 ≈ 102 contenders, each with a chance of $(1/2)^{10}$ to score 10 successive winning coin flips: ~90% chance to lose 1M€!

• If you read “Darwin’s Dangerous Idea” by D.C.Dennett, you are not allowed to bet !!
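The ~90% figure can be checked directly: with 102 independent contenders, each succeeding with probability (1/2)^10, the bet is lost when none of them succeeds. The helper name below is illustrative.

```python
def chance_of_a_winner(n_contenders, n_flips=10):
    """Probability that at least one contender wins n_flips coin flips in a row."""
    p_single = 0.5 ** n_flips
    return 1.0 - (1.0 - p_single) ** n_contenders

p_win = chance_of_a_winner(1024 // 10)   # 102 contenders, 10 flips each
p_lose = 1.0 - p_win                     # about 0.905
```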

Page 26

Selection is the Key!

1024 candidates / 512 flips

512 candidates / 256 flips
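The halving scheme on this slide amounts to a knockout tournament: candidates are paired, one flip decides each pair, and only winners go on. A minimal sketch, assuming the number of candidates is a power of two (the function name and the seed are arbitrary):

```python
import random

def knockout(n_candidates=1024, seed=0):
    """Pair candidates, flip one coin per pair, keep the winners; repeat until one is left.

    Assumes n_candidates is a power of two. Returns (champion, champion's wins, total flips).
    """
    rng = random.Random(seed)
    wins = {c: 0 for c in range(n_candidates)}
    alive = list(wins)
    flips = 0
    while len(alive) > 1:
        survivors = []
        for a, b in zip(alive[0::2], alive[1::2]):
            winner = a if rng.random() < 0.5 else b
            wins[winner] += 1
            survivors.append(winner)
            flips += 1
        alive = survivors
    return alive[0], wins[alive[0]], flips

champion, streak, flips = knockout()
```

Whoever the champion is, that person has won 10 flips in a row, and the whole tournament used 512 + 256 + … + 1 = 1023 flips: with selection the bet is won with certainty.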

Page 27

Hybrid strategies: (1) Selective Chromosome Initialization:

- Knowledge-based: favoring locally stable torsions…

- ‘Traditionalism’: favoring torsion values seen in previously visited samples

[Figure: two histograms of probability (0 to 0.4) versus angle (0 to 300°), titled “polycycle: torsion nr. 1” and “polycycle: torsion nr. 3”. Captions: biased torsion probabilities thanks to learning; biased torsion probabilities wrt local Hamiltonian]
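Drawing an initial torsion value from a biased histogram like the ones in the figure can be sketched as below. The bin width, the probabilities, and the function name are made-up illustrations; how the talk's software stores or learns these distributions is not specified in the slides.

```python
import random

def sample_torsion(bin_probabilities, rng, bin_width=30.0):
    """Pick a histogram bin according to its probability, then an angle inside that bin."""
    bins = range(len(bin_probabilities))
    k = rng.choices(bins, weights=bin_probabilities, k=1)[0]
    return rng.uniform(k * bin_width, (k + 1) * bin_width)

rng = random.Random(0)
# Hypothetical 12-bin histogram favoring the regions around 60, 180 and 300 degrees.
probabilities = [0.02, 0.10, 0.20, 0.02, 0.02, 0.14, 0.14, 0.02, 0.02, 0.15, 0.15, 0.02]
initial_chromosome = [sample_torsion(probabilities, rng) for _ in range(5)]
```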

Page 28

Hybrid Strategies (2): Directed or ‘Lamarckian’ Mutations

Evolution stalled in local minimum, Mutations will not help!

Add a constraint term forcing $\tau_1$ to adopt ‘mutant’ value $\tau'_1$

Gradient optimization, following the new energy landscape…

‘Lamarckian’ move towards next optimum

Process in parallel to main GA stream in order to avoid halting evolution!

Page 29

Hybrid Heuristics (3) The Taboo Search Dilemma

[Figure labels: Evolved Solution; “Taboo” phase space region; ??]

Page 30

Search for Optimal Sampling Setups in the Strategy Parameter Space…

p1 p2 p3 p4 p5 p6 p14 p15

Population management

Population size

Number of parallel processes

Migration rate between ‘islands’

Evolution management

Crossover rate

Mutation rate

One/two point crossover rate

Selection pressure

Dissimilarity limit

Maximal age

Convergence management

Apocalypse (population reset) frequency

Elitism

Global stop condition

$\mu\_Fitness$: built from $-k_b T \ln \sum_{\text{found minima}} \exp\left( -\frac{E_i}{k_b T} \right)$ and the CPU time (the way the two are combined is not legible in this transcript)

Page 31

[Flowchart labels: Meta-algorithm defines parameter setup; The Island Model (« Tabus », « Tradition », Directed Mutations); Run 1, Run 2, … Run n; 3-fold repeat; Postprocessing…; Base of diverse conformers [sampled at current setup]; µ-Fitness!!; Global Base of Diverse Conformers; News?? (yes: Meta-GA picks next set of configurations / no: GAME OVER)]

Page 32

GRID 5000-based ‘Planetary’ Model

www.grid5000.fr

[Flowchart labels:

If (free node) DEPLOY Island Model, shipping: Executables, Molecule File, Constraint Files, Seeds List, Taboo List, Operational Pars

Returned: Stablest Chromosomes, Sampling Success Score

Solution Merger & Clusterer

Conformer & Cluster Database

‘Panspermia’ policy center: ‘recent’ clusters: seeds; ‘old’ clusters: taboo

Sampling Success vs. Operational Pars

Operational Pars Selector

Stop: max. ‘Mission Nr.’; no new clusters since N ‘missions’]

Page 33

• Ab initio folding of Trp cage 1L2Y: native structure (reproducibly) found and ranked as most stable. D&C Planetary model: 20 nodes for 24 hours

[Figure label: PDB]

Page 34

• Ab initio folding of the Villin headpiece 1VII: helical parts are seen to fold in a matter of days (40 nodes) – although not properly oriented.

[Figure label: PDB]

Page 35

• Good news for the β-hairpin of Chignolin: out of the top 10 best ranked conformers, 8 are native-like

• Number one is not – but in this case, that may not be a problem

[Figure labels: PDB; #1, #5]

Page 36

• However, proper folding of 1LE1 could be achieved (though not reproducibly!) with previous force field versions – is the current setup too helix-specific?

• The 1LE1 β-sheet is not the absolute energy minimum according to the current setup!

[Figure label: PDB]

Page 37

• Docking simulations in presence of flexible loops, such as the hinge region of Casein Kinase 2 (3BQC)

– pose of ligand emodin and loop geometry are correctly predicted (3BQC not in FF training set).

[Figure labels: Flexible hinge region; PDB, #1]

Page 38

• Furthermore, a crystallographic water molecule can be simultaneously docked, being considered as another ligand – and is correctly placed.

[Figure labels: Flexible hinge region; Water location converges towards experimental position]

Page 39

• Docking into GPCRs: (1) Turkey β1-Adrenergic Receptor – Cyanopindolol complex 2VT4, 190 degrees of freedom (ligand and side chains) – 30 days/20 nodes**. Ligand RMS = 0.48 Å (best pose)

** total run time required to visit ~40000 phase space cells

Page 40

• Conclusions

– Conformational Sampling is the Key Element for Understanding Molecular Behavior

– It may range from very simple to extremely difficult, to impossible

– If you don’t do it well, it is better not to do it at all: empirical methods based on molecular topology only may be more accurate than 3D models based on wrong – or too few – conformations

– Two main sources of errors: A.) a wrongly calculated energy-geometry landscape (poor Force Field parameterization) and B.) insufficient sampling!

– Docking is just a specific case of conformational sampling, involving at least two molecules: a binding “site” and one or more “ligands”

– You will often hear that the knowledge of the “bioactive” conformer is paramount to understand binding. This is necessary, but sometimes not sufficient. Note: the “bioactive” conformer may sometimes be quite unstable and almost never populated in the free state.