
BioSolve IT GmbH • An der Ziegelei 75 • 53757 Sankt Augustin • Germany • www.biosolveit.de

BioSolveIT: Biology Problems Solved using Information Technology

A Combinatorial Docking Approach for Dealing with Protonation and Tautomer Ambiguities

Ingo Dramburg

26-Apr-04 • Ingo Dramburg • © BioSolve IT GmbH

Overview

Motivation, current state
Automatic protonation during docking
Benchmarking the method
Outlook: Isosteric replacement
Conclusions


A drug design workflow
Phase I: Building a receptor model

[Diagram: an X-ray structure (PDB) is split into protein and ligand; each passes through preparation and into DOCK; the docked complex with min. RMSD gives the receptor model]


A drug design workflow
Phase II: Building a screening model

[Diagram: compounds with activity data (active / inactive) pass through preparation and DOCK against the receptor model; enrichment over the docked complexes is maximized]


A drug design workflow
Phase III: Virtual screening

[Diagram labels: screening model; compounds (10^n compounds; lead-like: 10^6 compounds; similarity: 10^5 compds.); preparation; DOCK; docked complexes; hits (biol. assay); optimize; lead]


Ambiguities in X-ray structures

Gln, Asn, His are "flippable"

Glu-, Asp-, Lys+, His, (Arg+, Cys, Tyr) are "titratable"

[Figure: side-chain structures of Glu-, Asp-, Lys+, His, Arg+, Tyr, Cys]

A valid receptor model must take this into account


Ambiguities in compound libraries

Virtual compound libraries differ
  Sources (vendor, in-house, ...)
  File formats (MOL2, SDF, SMILES, PDB, ...)
  Conformations (stereoisomers, ...)
  Protonation, tautomers (neutral, charged)

Which protonation state is the right one?

[Figure: example compound with NH and COOH groups; combining the charged (+/-) and neutral states of its groups gives 16 compounds]


Typical data preparation steps

1. Complex structure: separate protein and ligand

2. Protein preparation
   structure validity
   ions, water, cofactor handling
   protonation states for titratable sites
   atom types (interaction properties)

3. Ligand preparation
   structure validity
   protonation state
   atom types (interaction properties)


The BioSolveIT tool family

[Diagram: the tools FlexX/E (protein ensembles, docking), FTrees (molecular similarity) and FlexS (flexible superposition), together with the modules Comb. Library (Clib), I/O and cheminformatics, Permute, and the SMARTS engine]


Protein preparation

FlexE: Ensemble of multiple conformations
  common structure
  H-orientation
  protonation

Claussen et al., J. Mol. Biol. (2001) 308: 377-395


Ligand preparation

[Diagram: input files (SYBYL MOL2, SDF, SMILES, PDB) pass through basic I/O and postprocessing with SMARTS rules into the FlexX molecule structure]

FlexX Molecule Structure
  Atoms (chemical element)
  Connectivity table
  Atomic coordinates (3D)
  Atom types (SYBYL, MOL2)
  Formal charges
  Hydrogens


SMILES and be SMARTS

Line notation for molecule and subgraph description

Initially developed by Daylight Inc.

SMILES → Molecule (unique description of molecules)
  SMILES "CC(=O)O" (acetic acid)

SMARTS → Subgraph (complex description of substructures)
  SMARTS "[P,S,C](=[OD1])[OD1]" ('acidic' groups)

[Figure: structure of acetic acid and examples of C-, S- and P-centered acidic groups]

Weininger D., J. Chem. Inf. Comp. Sci. (1988) 28: 31-36


The FlexX SMARTS engine

Subgraph matching
  Property assignment (aromaticity, atom types, charges, ...)
  Descriptor assignment (interaction types, torsion angles, ...)
  Structure checking (valences, bond types)

Structure transformation (similar to SMIRKS)
  Structure initialisation/correction (PDB setup, bond types, valences)
  Adjustment of ambiguities (mesomerism, tautomerism, ...)
  Chemical modifications, reactions

Combinatorial structure manipulation: Permutation
  Permutation of protonation states
  Generation of close analogues


Structure transformation
old-substructure >> new-substructure

Example molecule: NCCCCCC(=O)O

  formal charges     C(=O)O >> C(:[O-0.5]):[O-0.5;H0]
  (de)protonation    C(=O)O >> C(=O)[O-]
  bond types         NCCC >> N=CC=C
  reduction          C(=O)O >> C(-[OH]).O
  addition           C(=O)O >> C(=O)OCC
  disconnect         *C(=O)O >> *.C(-[OH])O
  ring closure       NCCCC >> N1CCCC1

[Figure: structure of the example molecule after each transformation]


Automatic complex preparation

FLEX/RECEPTOR> pdbinfo 1dwd
1dwd LIGAND MID-*-1-*, # (NAPAP – SEE REMARK 13. 27 H31 N5 O4 S1)
1dwd PEPTIDE *-*-*-I # (Residues: 11)
1dwd PEPTIDE *-*-*-H # (Residues: 258)
1dwd PEPTIDE *-*-*-L # (Residues: 29)

FLEX/LIGAND> frompdb 1dwd MID-*-1-*
Extract residue(s) as ligand

FLEX/RECEPTOR> read 1dwd.pdb 6.5 y
a) Read pdb-file as receptor with default settings (generic.rdf)
b) Use FlexE module to find optimal protonation

FLEX/DOCKING> complex all
Perform a complete docking run


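Combinatorial structure manipulation

Group identification
Scaffold decomposition
Transformation of R-groups
Combinatorial library
Combinatorial docking

[Diagram: the titratable groups of the example compound are identified (16 compounds from combining charged and neutral states); the compound is decomposed into a core with R-groups R1, R2, R3; the R-groups are transformed into a combinatorial library, which is docked combinatorially]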


Simple transformation rules

Rule 1: C(=[OD1])[OD1] >> C(=O)O; C(=O)[O-]

Rule 2: [ND2] >> [NH]; [NH2+]

[Figure: both rules applied to the R-groups of the core, giving the neutral and charged variant of each group (COOH / COO-, NH / NH2+)]
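
A minimal Python sketch of how such rules multiply into a combinatorial library. It only parses the rule text (pattern >> alternative; alternative) and enumerates the combinations. It performs no SMARTS matching, the function names are illustrative, and the list of four matched sites is a hypothetical example, not FlexX output.

```python
from itertools import product

# Rules in the notation used on the slide: "pattern >> alt1; alt2".
RULES = [
    "C(=[OD1])[OD1] >> C(=O)O; C(=O)[O-]",
    "[ND2] >> [NH]; [NH2+]",
]


def parse_rule(rule):
    """Split a rule into its match pattern and its list of alternatives."""
    pattern, _, alternatives = rule.partition(">>")
    return pattern.strip(), [alt.strip() for alt in alternatives.split(";")]


def enumerate_states(site_patterns, rules):
    """Return every combination of alternatives for the given sites."""
    table = dict(parse_rule(rule) for rule in rules)
    choices = [table[pattern] for pattern in site_patterns]
    return list(product(*choices))


# Hypothetical compound: two acid groups and two secondary amines,
# listed by the pattern that would match each site.
sites = ["C(=[OD1])[OD1]", "C(=[OD1])[OD1]", "[ND2]", "[ND2]"]
states = enumerate_states(sites, RULES)
print(len(states))  # 2 * 2 * 2 * 2 = 16 protonation variants
```

Each site contributes a factor equal to its number of alternatives, so four two-state groups give the 16 compounds shown for the example.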


The complete picture

[Diagram: a protein-ligand complex enters FlexX through the I/O module; the SMARTS engine applies the rules (C(=O)O >> C(=O)OH; C(=O)[O-] and [ND2] >> [NH2]; [NH3+], ...); the FlexE module handles the protein ensemble; the Comb. Library module and Permute handle the ligand variants (16 compounds); docking returns the ranked solutions 1, 2, 3, 4, ..., n]


Benchmarking: A Dataset XXL

Dataset from PDB: ~20000 complexes

Protein constraints
  X-ray structures from PDB
  Resolution < 3.5 Å
  No DNA/RNA

Ligand constraints
  MW 78-750
  Only C, H, N, P, S, F, Cl, Br, I
  No pure CH compounds
  Max. 10 rot. bonds, max. ring size 9
  Non-covalent
  Non-cofactor (242 HET groups, freq. > 10)

2300 selected complexes

Sadowski J., Buning C., Claussen H. (2004), in preparation
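
The constraints above can be read as one filter predicate. The following Python sketch assumes a hypothetical record type with precomputed properties; the field names are illustrative and not part of any FlexX or PDB API. The element set is copied as printed on the slide, where oxygen does not appear, so it is kept as a parameter.

```python
from dataclasses import dataclass

# Element list exactly as printed on the slide (no oxygen listed there).
SLIDE_ELEMENTS = frozenset({"C", "H", "N", "P", "S", "F", "Cl", "Br", "I"})


@dataclass(frozen=True)
class Complex:
    """Hypothetical summary of one protein-ligand complex."""
    resolution: float           # in Å
    has_nucleic_acid: bool      # DNA/RNA present
    ligand_mw: float
    ligand_elements: frozenset  # element symbols in the ligand
    rotatable_bonds: int
    max_ring_size: int
    covalent: bool              # ligand covalently bound
    cofactor: bool              # ligand is a known cofactor


def passes_filters(c, allowed=SLIDE_ELEMENTS):
    """Apply the protein and ligand constraints listed on the slide."""
    protein_ok = c.resolution < 3.5 and not c.has_nucleic_acid
    ligand_ok = (
        78 <= c.ligand_mw <= 750
        and c.ligand_elements <= allowed          # only allowed elements
        and not c.ligand_elements <= {"C", "H"}   # no pure CH compounds
        and c.rotatable_bonds <= 10
        and c.max_ring_size <= 9
        and not c.covalent
        and not c.cofactor
    )
    return protein_ok and ligand_ok


example = Complex(2.0, False, 300.0, frozenset({"C", "H", "N", "S"}),
                  5, 6, False, False)
print(passes_filters(example))  # True
```

Passing a different `allowed` set, for example one that includes oxygen, changes the element constraint without touching the rest of the filter.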


Redocking comparison

Generic dataset (2300 complexes)
  Automatic complex setup
  Redocking with default settings (6.5 Å around ligand as site, no water or cofactors included, chemscore)

Ligand is used in two states
  neutral state
  default protonation

Protonation selected by unified protein model

Datasets prepared by hand:
  Flex200, Kramer et al. 1) (200 complexes)
  Astex, Nissink et al. 2) (305 complexes)

1) Kramer et al., Proteins (1999) 37: 228-241
2) Nissink et al., Proteins (2002) 49: 457-471


RMSD of docking solutions (any rank)

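[Bar chart: percentage of complexes (0-100 %) with a docking solution at any rank within RMSD thresholds of 1.00, 1.50, 2.00, 2.50 and 3.50, for the datasets flex200, Astex305, 2300 neutral and 2300 default. Annotation: 1324 < 2.0]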


RMSD of solutions on rank 1

[Bar chart: percentage of complexes (0-70 %) whose rank-1 solution lies within RMSD thresholds of 1.00, 1.50, 2.00, 2.50 and 3.50, for the datasets flex200, Astex305, 2300 neutral and 2300 default. Annotation: 676 < 2.0 !]
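
The two charts differ only in which pose is compared with the crystal structure. The Python sketch below shows both readings under simplifying assumptions: poses and reference list the same atoms in the same order, coordinates share one frame (as in redocking), and no symmetry correction is applied. The function names are illustrative, and the 2.0 threshold follows the annotations in the charts.

```python
import math


def rmsd(pose, reference):
    """Root-mean-square deviation between two equally ordered atom lists."""
    if len(pose) != len(reference) or not pose:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (px - rx) ** 2 + (py - ry) ** 2 + (pz - rz) ** 2
        for (px, py, pz), (rx, ry, rz) in zip(pose, reference)
    )
    return math.sqrt(squared / len(pose))


def rmsd_any_rank(ranked_poses, reference):
    """Best RMSD among all returned solutions, whatever their rank."""
    return min(rmsd(pose, reference) for pose in ranked_poses)


def rmsd_rank1(ranked_poses, reference):
    """RMSD of the top-scored solution only."""
    return rmsd(ranked_poses[0], reference)


def within_threshold(value, threshold=2.0):
    return value <= threshold


# Toy data: the top-ranked pose is displaced by 3 along z, the second is exact.
crystal = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
poses = [[(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)], crystal]
print(within_threshold(rmsd_any_rank(poses, crystal)))  # True
print(within_threshold(rmsd_rank1(poses, crystal)))     # False
```

A complex can therefore count under "any rank" while failing on rank 1, which is why the rank-1 chart shows lower percentages.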


Permutation

Complexes with titratable ligand: 634
RMSD on rank 1 <= 2.0 is a hit

                        lig. setup   #     hits   hit rate
without permutation
  PDB2300               neutral      634   144    22 %
  PDB2300               charged      634   176    27 %
with permutation
  PDB2300               neutral      634   179    28 %
  PDB2300               charged      634   210    33 %

Permutation of isomers increases the hit rate by ~5 percentage points, independent of initial settings
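
The hit rates follow directly from the hit counts. This short Python sketch recomputes them from the numbers in the table; the printed percentages on the slide correspond to the computed values with the decimals dropped.

```python
TOTAL = 634  # complexes with a titratable ligand

# Hit counts from the table, keyed by (ligand setup, permutation).
HITS = {
    ("neutral", "without"): 144,
    ("charged", "without"): 176,
    ("neutral", "with"): 179,
    ("charged", "with"): 210,
}


def hit_rate(hits, total=TOTAL):
    """Hit rate in percent."""
    return 100.0 * hits / total


def gain(setup):
    """Gain from permutation in percentage points for one ligand setup."""
    return hit_rate(HITS[(setup, "with")]) - hit_rate(HITS[(setup, "without")])


for setup in ("neutral", "charged"):
    before = hit_rate(HITS[(setup, "without")])
    after = hit_rate(HITS[(setup, "with")])
    print(f"{setup}: {before:.1f} % -> {after:.1f} % (+{gain(setup):.1f} points)")
```

Both setups gain between 5 and 6 percentage points, which matches the ~5 quoted on the slide.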


Outlook: Isosteric replacement

Substitution ([CD1,$(cH)] >> X)
  F, Cl, Br, I, CF3, NO2, CH3, C2H5, isopropyl, t-butyl
  -OH, -SH, -NH2, -CH2OH, -OMe, N(Me)2

Insertions (*[CD2]* >> X)
  -CH2-, -NH-, -O-, -C2H4-, -C3H6-, ...
  -COCH2-, -CONH-, -COO-, ...
  >C=O, >C=S, >C=NH, >C=NOH, >C=NO-, ...

Rings:

[Figure: example ring replacements with nitrogen- and oxygen-containing rings]
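
As a rough illustration of the substitution list, the Python sketch below generates close analogues by plain text templating. This is not the SMARTS-based transformation of the talk: the template (a phenylethylamine with one open ring position), the SMILES fragments chosen for each substituent, and the function name are all assumptions made for this example.

```python
# SMILES fragments for the substituents named on the slide (assumed encodings).
SUBSTITUENTS = {
    "F": "F", "Cl": "Cl", "Br": "Br", "I": "I",
    "CF3": "C(F)(F)F", "NO2": "[N+](=O)[O-]",
    "CH3": "C", "C2H5": "CC", "isopropyl": "C(C)C", "t-butyl": "C(C)(C)C",
    "OH": "O", "SH": "S", "NH2": "N",
    "CH2OH": "CO", "OMe": "OC", "N(Me)2": "N(C)C",
}

# Hypothetical template: beta-phenylethylamine with a para position left open.
TEMPLATE = "NCCc1ccc({R})cc1"


def analogues(template, substituents):
    """Fill the open position with every substituent; returns name -> SMILES."""
    return {name: template.format(R=smiles) for name, smiles in substituents.items()}


library = analogues(TEMPLATE, SUBSTITUENTS)
print(len(library))   # 16 analogues, one per substituent
print(library["OH"])  # NCCc1ccc(O)cc1
```

A rule-based engine would locate the substitution site by pattern matching rather than by a fixed placeholder, so this sketch only conveys the size and shape of the resulting library.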


Example: β-phenylethylamine family

[Figure: structures of β-phenylethylamine and related compounds: ephedrine, octopamine, metoprolol, atenolol, fenoterol, terbutaline, salbutamol, clenbuterol, dopamine]


Conclusions

FlexX can perform automatic docking successfully
  Reasonable solution for ~3/4 of selected complexes in PDB

Manual setup does not always beat automatic setup
  Manual setup is impossible for screening of large datasets

Significant increase of top-ranked solutions with permutation of isomers (~5 %)

Combinatorial docking is fast (5-10x faster than sequential docking of all isomers)

Extension of the method allows generation and docking of close analogues on the fly


Acknowledgements

AstraZeneca: J. Sadowski

MPI Saarbrücken: A. Kämper

BioSolve IT: H. Claussen, M. Gastreich, M. Lilienthal, C. Lemmen

ThanxX for your attention.