Optimizing Drug Design, Leiden, 20-23 July 2009
Forming focused libraries and discovering active molecules with Iterative Stochastic Elimination
Amiram Goldblum, Anwar Rayan and David Marcus
Dept. of Medicinal Chemistry, School of Pharmacy, Ein Kerem Campus
http://www.md.huji.ac.il/models
Iterative Stochastic Elimination (ISE)
Our generic tool for optimizing highly complex combinatorial problems
Problem type: Systems with many variables, each variable having many discrete values, the variables interacting with each other, and each state of the system can be evaluated and given a score (transportation, communication, electronic devices, life sciences)
Method: ISE finds optimal system states (global and local minima/optima) by iteratively eliminating values of variables that contribute to worst results. Elimination is based on careful statistics of randomly picked states of the system
Why: ISE has been compared to Genetic Algorithms, Monte Carlo, Simulated annealing, Support Vector Machines and other optimization methods – on specific problems and found to do as well or better
Iterative Stochastic Elimination publications
1. Glick, M. & Goldblum, A. A novel energy-based stochastic method for positioning polar protons in protein structures from X-rays. Proteins-Structure Function and Genetics 38, 273-287 (2000).
2. Glick, M., Rayan, A. & Goldblum, A. A stochastic algorithm for global optimization and for best populations: A test case of side chains in proteins. Proceedings of the National Academy of Sciences of the United States of America 99, 703-708 (2002).
3. Noy, E., Gorelik, B., Rayan, A. & Goldblum, A. Stochastic path to form ensembles and to quantify flexibility in proteins. Abstracts of Papers of the American Chemical Society 225, U781-U781 (2003).
4. Rayan, A., Barasch, D., Brinker, G., Cycowitz, A., Geva-Dotan, I., Scaiewicz, A. & Goldblum, A. New stochastic algorithm to determine drug-likeness. Abstracts of Papers of the American Chemical Society 226, U297-U297 (2003).
5. Rayan, A., Scaiewicz, A., Geva-Dotan, I., Barasch, D. & Goldblum, A. Screening molecules for their drug-like index. Abstracts of Papers of the American Chemical Society 228, U358-U358 (2004).
6. Rayan, A., Senderowitz, H. & Goldblum, A. Exploring the conformational space of cyclic peptides by a stochastic search method. Journal of Molecular Graphics & Modelling 22, 319-333 (2004).
7. Rayan, A., Noy, E., Chema, D., Levitzki, A. & Goldblum, A. Stochastic algorithm for kinase homology model construction. Current Medicinal Chemistry 11, 675-692 (2004).
8. Rayan, A., Scaiewicz, A., Geva-Dotan, I., Marcus, D., Barasch, D. & Goldblum, A. Determining the drug-like character of molecules and prioritizing them by a drug-like index. ACS presentations 2005-8 (2007).
9. Noy, E., Tabakman, T. & Goldblum, A. Constructing ensembles of flexible fragments by ISE is relevant to protein-protein interfaces. Proteins 68, 702-711 (2007).
10. Gorelik, B. & Goldblum, A. High quality binding modes in docking ligands to proteins. Proteins 71, 1373-1386 (2008).
General Model System
[Figure: variables A, B, C, D, E, each with several discrete values (e.g. A1, A2, A6, A7; B4 to B8; C4 to C7), connected by interactions]
• Variables
• Values
• Interactions
The number of combinations: 7 (A) x 8 (B) x 7 (C) x n (D) x m (E) x … = a very large number
• An exhaustive calculation is not possible
(1) Randomly pick one value for each of the variables
[Figure: variables A to E with one value picked for each, e.g. A7, B4, C5]
This determines a single “conformation” or “configuration” of the system
(2) Employ the “cost function” to score the current configuration
(3) Repeat steps (1) and (2) for n conformations (n ~ 10^3 to 10^6), and calculate the total value of each
[Figure: sample 2 … sample n, each with its computed value (2nd value … nth value)]
(4) Construct a histogram of the distribution of values for all sampled conformations
[Figure: histogram of the distribution (0% to 10%) against function value (bins 1 to 30), with the low values region and the high values region marked]
[Figure: zoom on the high values region of the histogram (distribution against column number), showing example conformations 314, 220 and 715 with value sets (A3 B4 C6 D7 E8 F1), (A3 B8 C6 D2 E8 F2) and (A3 B4 C6 D6 E2 F9); A3 and C6 are common to all three]
(5) Examine the frequency of each variable value in worst results, compare to expected
(6) Evict values that contribute above expectation to worst scores, and less than expected to best
(7) Repeat the process iteratively until all remaining combinations can be evaluated exhaustively and sorted. We obtain a population
[Figure: conformations 314, 220 and 715 shown with their remaining values (B4 D7 E8 F1), (B8 D2 E8 F2) and (B4 D6 E2 F9), with the common values A3 and C6 set apart]
The total number of combinations is reduced
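Steps (1) to (7) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `ise_minimize`, the toy cost function, the sample size, the 10% tails and the exact eviction rule (evict, per variable, the value that is over-represented in the worst tail and under-represented in the best tail) are all hypothetical choices made for the example.

```python
import itertools
import math
import random
from collections import Counter


def ise_minimize(domains, cost, n_samples=2000, tail=0.1,
                 exhaustive_limit=500, seed=0):
    """Toy ISE loop; returns every surviving state, sorted lowest cost first."""
    rng = random.Random(seed)
    domains = [list(d) for d in domains]
    k = max(1, int(tail * n_samples))  # size of the "best" and "worst" tails

    while math.prod(len(d) for d in domains) > exhaustive_limit:
        # Steps 1-3: draw random states and score each of them.
        samples = [tuple(rng.choice(d) for d in domains)
                   for _ in range(n_samples)]
        samples.sort(key=cost)
        best, worst = samples[:k], samples[-k:]

        evicted = False
        fallback = None  # (excess, variable index, value) with largest excess
        for var, values in enumerate(domains):
            if len(values) < 2:
                continue
            # Steps 4-5: frequency of each value overall and in both tails.
            overall = Counter(s[var] for s in samples)
            in_worst = Counter(s[var] for s in worst)
            in_best = Counter(s[var] for s in best)
            value = max(values, key=lambda v: in_worst[v] - in_best[v])
            excess = in_worst[value] - in_best[value]
            if fallback is None or excess > fallback[0]:
                fallback = (excess, var, value)
            expected = k * overall[value] / n_samples
            # Step 6: evict if above expectation in worst, below in best.
            if in_worst[value] > expected > in_best[value]:
                values.remove(value)
                evicted = True
        if not evicted:
            # Assumption: force one eviction so the loop always terminates.
            domains[fallback[1]].remove(fallback[2])

    # Step 7: the remainder is small enough to enumerate and sort.
    return sorted(itertools.product(*domains), key=cost)


def toy_cost(state):
    """Hypothetical cost: each value's own cost plus a neighbour interaction."""
    return sum(state) + sum(abs(a - b) for a, b in zip(state, state[1:]))


# Five variables with six values each: 6**5 = 7776 states before elimination.
population = ise_minimize([range(6)] * 5, toy_cost)
print(population[0], toy_cost(population[0]))
```

The function returns the whole sorted population rather than a single optimum, mirroring step (7), where the remaining combinations are evaluated exhaustively and sorted.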
Acetylcholinesterase inhibitors with ISE
Inhibition measured by Marta Rosin (Novartis’ Exelon), Hebrew University School of Pharmacy
[Diagram: >2 million molecules enter the ISE “engine” (molecular chemical properties; ISE docking and scoring; target specificity), which yields 9 molecules, 5 measured, 3 active]
Bcr-Abl dimerization inhibition by peptides (64 aa)
Synthesized and measured by Martin Ruthardt, Goethe Univ. Frankfurt
[Diagram: ~10^80 sequences enter the ISE “engine” (properties of amino acids; ISE protein design; target specificity), which yields 10 peptides, 6 active]
Distinguishing between actives and inactives on a specific target
Classification – drugs vs. non-drugs, selectives vs. non-selectives
Huge combinatorial problem with more than 10^100 options
Optimization problem: find differences in molecular properties to distinguish between actives and inactives
Learning from known data
“Actives”: molecules with activity < 100 nM
“Selectives”: molecules with selectivity > 3:1
“Inactives”: MDDR (randomly picked), or less active molecules
Properties (“descriptors”, our variables) are produced by computer programs (MOE): molecular weight, number of H-bond donors & acceptors, partial charges, topological, polar surface, van der Waals, molar refraction, etc.
Optimization of property ranges by ISE to distinguish between the two databases
[Figure: distribution of Mol Weight values for the two databases; percentage (0 to 60) against MW (0 to 1500)]
Each property is separated into two “sub properties”:
Lower range: 0 to 800, ~80 values at intervals of 10
Upper range: 500 to 1200, ~70 values
Randomly picked range: 100 to 700
Overall there are 80 x 70 = 5.6 x 10^3 combinations for ranges of this variable
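The count of candidate ranges for one property can be checked with a short sketch. It assumes the upper bound is also stepped at intervals of 10 (the slide only says "~70 values") and that the right-hand endpoints are excluded; `random_range` is a hypothetical helper, and rejecting pairs where the lower bound is not below the upper bound is an assumption of this sketch, not something stated on the slide.

```python
import random

# Lower bound of the MW range: 0, 10, ..., 790 (80 values).
lower_bounds = range(0, 800, 10)
# Upper bound of the MW range: 500, 510, ..., 1190 (70 values, step assumed).
upper_bounds = range(500, 1200, 10)

n_combinations = len(lower_bounds) * len(upper_bounds)
print(n_combinations)  # 80 * 70 = 5600


def random_range(rng):
    """Pick one (lower, upper) pair at random, as in 'randomly picked range'."""
    while True:
        low = rng.choice(lower_bounds)
        high = rng.choice(upper_bounds)
        if low < high:  # assumption: discard empty ranges such as (790, 500)
            return low, high


low, high = random_range(random.Random(0))
```

Treating the lower and upper bounds as two separate variables is what lets ISE eliminate poor bounds independently of each other.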
Using properties to optimize the difference between actives (selectives) and inactives
If we construct a RANGE for each property, we get A FILTER:
2 < HD ≤ 6
-2 < logP ≤ 3
150 < M.W. ≤ 775
Then we test each of the molecules in the actives and each in the inactives
Determine if TP, TN, FP, FN (P, N, Pf, Nf)
Compute the fraction of each category in the full DB
Use the Matthews Correlation to score
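As an illustration of how such a filter sorts molecules into the four categories, here is a minimal sketch. The three ranges are the ones shown on the slide; treating the upper bounds as inclusive is an assumption (the comparison symbol did not survive), and the molecules, the dictionary layout and the function names are made up for the example.

```python
from typing import Dict, List, Tuple

Ranges = Dict[str, Tuple[float, float]]
Molecule = Dict[str, float]

# The example filter: one (low, high) range per property.
example_filter: Ranges = {"HD": (2, 6), "logP": (-2, 3), "MW": (150, 775)}


def passes(mol: Molecule, ranges: Ranges) -> bool:
    """A molecule passes when every property lies in low < value <= high."""
    return all(low < mol[name] <= high for name, (low, high) in ranges.items())


def confusion_counts(actives: List[Molecule], inactives: List[Molecule],
                     ranges: Ranges) -> Tuple[int, int, int, int]:
    """Return (P, N, Pf, Nf) for one filter."""
    p = sum(passes(m, ranges) for m in actives)     # actives that pass (TP)
    nf = len(actives) - p                           # actives that fail (FN)
    pf = sum(passes(m, ranges) for m in inactives)  # inactives that pass (FP)
    n = len(inactives) - pf                         # inactives that fail (TN)
    return p, n, pf, nf


# Hypothetical molecules, for illustration only.
actives = [{"HD": 3, "logP": 1.0, "MW": 300},
           {"HD": 1, "logP": 0.5, "MW": 400}]
inactives = [{"HD": 4, "logP": 2.5, "MW": 500},
             {"HD": 8, "logP": 5.0, "MW": 900},
             {"HD": 0, "logP": -3.0, "MW": 100}]

counts = confusion_counts(actives, inactives, example_filter)
print(counts)
```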
Scoring by the Matthews Correlation
Each given range is for ACTIVES, and actives can only be P or Nf
Databases: actives are P or Nf; inactives are Pf or N
$$\mathrm{MCC} = \frac{P N - P_f N_f}{\sqrt{(N + N_f)(N + P_f)(P + N_f)(P + P_f)}}$$
For a fully correct prediction C = 1
For a completely erroneous prediction C = -1
For a random prediction C ~ 0.00
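The Matthews correlation and its three reference values can be checked numerically with a small sketch. The function name `mcc` and the choice to return 0 when the denominator vanishes are assumptions of this example.

```python
import math


def mcc(p: float, n: float, pf: float, nf: float) -> float:
    """Matthews correlation from P (TP), N (TN), Pf (FP) and Nf (FN)."""
    denom = math.sqrt((n + nf) * (n + pf) * (p + nf) * (p + pf))
    if denom == 0:
        return 0.0  # convention assumed here for degenerate cases
    return (p * n - pf * nf) / denom


print(mcc(50, 50, 0, 0))     # fully correct prediction
print(mcc(0, 0, 50, 50))     # completely erroneous prediction
print(mcc(25, 25, 25, 25))   # no better than random
```

With per-class percentages such as P = 82 and N = 67 (so Nf = 18 and Pf = 33, assuming the percentages are taken within each class), the same function gives about 0.50, in line with the MCC of 0.49 reported for the filters in the exhaustive-step results.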
Applying ISE to discriminate between actives and inactives by optimizing descriptor ranges
• Construct filter i: randomly pick a value for each of the variables, i.e., low range MW, high range MW, etc.
• Pass all actives and inactives of the training set through filter i
• Get P, N, Pf, Nf and the MCC value for filter i
• Repeat until i = 10^6
• Histogram, Elimination, Iteration, Exhaustive, Test
Results of exhaustive step, before clustering
MCC    MW      ClogP    HDon   Hacc   % actives (P)   % inactives (N)
0.49   282<    -6<      0<     2      82              67
0.49   292<    -2.5<    0<     2      78              71
0.49   292<    < 9.5    0<     2      80              69
0.49   301<    -6<      0<     2      77              72
0.48   282<    -6<      0<     1      85              63
Best filter
Employing the “best sets of filters” to construct a Molecular Bioactivity Index (MBI)
$$\mathrm{MBI} = \frac{1}{n}\sum_{i=1}^{n}\left(\delta_{\mathrm{active}}\,\frac{P}{P_f} - \delta_{\mathrm{inactive}}\,\frac{N}{N_f}\right)$$
With good data, the range of MBI is large and we get a good “resolution”
We have shown that we can use MBI to “fish” a few active molecules out of a “sea” of inactive ones
http://www.md.huji.ac.il/models (look for “test MBI”)
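One way to read the index is sketched below. It assumes that each of the n best filters either classifies a molecule as active, contributing +P/Pf, or as inactive, contributing -N/Nf, and that the contributions are averaged over the filters. That reading of the formula, the `Filter` class and all the numbers are assumptions made for the example, not the group's code.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence

Molecule = Dict[str, float]


@dataclass
class Filter:
    passes: Callable[[Molecule], bool]  # True if the molecule is within all ranges
    p: float   # true positives of this filter on the training set
    pf: float  # false positives (assumed non-zero)
    n: float   # true negatives
    nf: float  # false negatives (assumed non-zero)


def mbi(mol: Molecule, filters: Sequence[Filter]) -> float:
    """Average over filters of +P/Pf (classified active) or -N/Nf (inactive)."""
    total = 0.0
    for f in filters:
        if f.passes(mol):
            total += f.p / f.pf
        else:
            total -= f.n / f.nf
    return total / len(filters)


# Two hypothetical filters with made-up statistics.
filters = [
    Filter(lambda m: m["MW"] > 282, p=82, pf=33, n=67, nf=18),
    Filter(lambda m: m["HD"] > 0, p=78, pf=29, n=71, nf=22),
]

likely_active = {"MW": 300.0, "HD": 1.0}
likely_inactive = {"MW": 200.0, "HD": 0.0}
print(mbi(likely_active, filters), mbi(likely_inactive, filters))
```

Filters with few false positives (small Pf) give a large positive contribution, which is why a set of good filters spreads the index over a wide range.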
Employing the “best sets of filters” to construct a Drug Likeness Index (DLI)
$$\mathrm{DLI} = \frac{1}{n}\sum_{i=1}^{n}\left(\delta_{\mathrm{active}}\,\frac{P}{P_f} - \delta_{\mathrm{inactive}}\,\frac{N}{N_f}\right)$$
Drug likeness is different from Lipinski’s Rule of Five (ROF)!
MBI and DLI can make a difference in:
• High Throughput Screening
• Combinatorial Synthesis
• Hit to lead development
• Lead optimization
• Construction of focused libraries
• Molecular scaffold optimization
• Selectivity optimization
Timeline for discovery, single processor
One target (enzyme, cells, organs…)
1. Model building: 2-3 days
2. ZINC scan: a few hours
3. Diversity, similarity, eliminate known actives: a few hours
4. SciFinder manual search: 4-5 days
5. Purchase/synthesize molecules
6. In vitro tests: 1-2 months
Input: VEGFR-2 (KDR) active inhibitors, < 100 nM
549 actives divided randomly into a training set of 412 and a test set of 137
Inactives are from MDDR
Output: example of a filter with 6 descriptors
One of the best (high MCC); there are others with higher MCC but many descriptors
Number of descriptors – 6
MCC of test set – 0.79
TP – 98.9
TN – 78.6
Bcut_SMR_3: 0.0 – 3.06
SMR_VSA4: 0.1 – 100.6
Vsa_pol: 0.1 – 102.4
Reactive: 0.0 – 0.999
balabanJ: 0.0 – 1.902
Q_RPC-: 0.0 – 0.267
A 6-property filter
Bcut_SMR_3: Molar refraction
SMR_VSA4: VdW surface area
Vsa_pol: Approx. VdW polar surface
Reactive: Reactive fragments
balabanJ: Topological variable
Q_RPC-: Relative negative partial charge
MBI MODEL for VEGFR
[Figure: enrichment in the training set of VEGFR2. x-axis: MBI threshold (-18 to 22); y-axes: % true positives/negatives (0 to 100) and enrichment factor (0 to 500). Green: % true positives above threshold; red: % true negatives below threshold; blue: enrichment factor]
Initial focused library from ZINC (2.1 million)
ZINC library screening gave 7826 molecules with top MBI
Similarity of focused library from ZINC against known VEGFR active compounds
[Figure: histogram of number of molecules (0 to 3000) against Tanimoto index; similarity of highest MBI to training set]
Tanimoto index    Number of molecules
0.025             0
0.075             37
0.125             858
0.175             2678
0.225             2655
0.275             1071
0.325             344
0.375             112
0.425             54
0.475             10
0.525             2
0.575             4
0.625             1
0.675 to 0.975    0
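For reference, the Tanimoto index between two binary fingerprints is the number of shared on-bits divided by the number of on-bits present in either. The sketch below represents fingerprints as sets of on-bit positions; the fingerprint type used for the plot is not stated in the slides, and the bit sets and function names here are made up for illustration.

```python
from typing import Iterable, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto index of two fingerprints given as sets of on-bit positions."""
    if not a and not b:
        return 0.0  # convention assumed here for two empty fingerprints
    return len(a & b) / len(a | b)


def max_similarity(candidate: Set[int], known: Iterable[Set[int]]) -> float:
    """Highest Tanimoto index of a candidate against a set of known actives."""
    return max(tanimoto(candidate, k) for k in known)


# Hypothetical fingerprints.
known_actives = [{1, 2, 3, 4}, {2, 4, 6, 8, 10}]
candidate = {3, 4, 5, 6}
print(max_similarity(candidate, known_actives))
```

A low maximum similarity against every known active is what marks a high-MBI molecule as structurally novel.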
BBB results
[Figure: BBB index (-1 to 1) for molecules with negative BBB pass and positive BBB pass]
ER-MBI “moving ensemble” (normalized MBI values)
[Figure: logRBA against ER-MBI, with high, moderate and low groups]
ER-MBI: combined high/low MBI
[Figure: logRBA (-5 to 3) against ER-MBI (-1 to 1) for low, moderate and high groups; R² = 0.75]
Molecular Bioactivity Index (MBI): fishing actives from a “bath” of “non-actives”
Mix 10 actives in 100,000 molecules:
find 9 in the best 100 (enrichment of 900)
find 5 in the best 10 (enrichment of 5000)
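The two enrichment values follow from the usual definition: the hit rate in the top-ranked subset divided by the hit rate in the whole mixture. A short sketch (the function name is hypothetical):

```python
def enrichment_factor(hits_in_top: int, top_size: int,
                      n_actives: int, n_total: int) -> float:
    """(hits_in_top / top_size) divided by (n_actives / n_total)."""
    return (hits_in_top * n_total) / (top_size * n_actives)


# 10 actives mixed into 100,000 molecules.
print(enrichment_factor(9, 100, 10, 100_000))  # 9 actives in the best 100
print(enrichment_factor(5, 10, 10, 100_000))   # 5 actives in the best 10
```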
Polypharmacology – with our indexing method
• We use several MBI (or MBI and DLI) to map activity onto multiple targets. This may be used to extract potential new poly-active compounds or selective compounds, depending on the behavior of the relevant disease
[Figure: molecules plotted by MBI for target 1 against MBI for target 2; regions: multitarget, target 2 selective, target 1 selective, non-actives]
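The four regions of the two-index map can be expressed as a simple classification. The cut-off values below are hypothetical placeholders; in practice they would be chosen from the MBI threshold curves of each target model.

```python
def classify(mbi_target1: float, mbi_target2: float,
             cutoff1: float = 0.0, cutoff2: float = 0.0) -> str:
    """Assign a molecule to one region of the MBI(target 1) / MBI(target 2) map."""
    active1 = mbi_target1 > cutoff1
    active2 = mbi_target2 > cutoff2
    if active1 and active2:
        return "multitarget"
    if active1:
        return "target1 selective"
    if active2:
        return "target2 selective"
    return "non-active"


print(classify(5.2, 4.1))    # high on both indexes
print(classify(5.2, -3.0))   # high on target 1 only
print(classify(-2.0, -3.0))  # low on both indexes
```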
Docking & Scoring
Requirement: 3D structure of the target (X-ray, NMR, homology model)
Do the molecules bind?
How strong is the binding affinity? (Score)
What does the complex look like? (Binding mode)
ISE-dock
• A new docking program from our lab that uses the ISE algorithm to produce large sets of optimal results for docking of ligands to their targets
• Better than AutoDock – the most cited docking program
• Much better in the main docking criteria than two other popular programs – Glide and GOLD
• Produces large near-optimal docking populations to study the nature of binding and to predict alternative binding modes
• Accounts for ligand and protein flexibility
• Correlation between ISE-dock populations and experimental multiple binding modes
Anti Alzheimer current main drug strategy
MBI MODEL for AChE inhibition
[Figure: x-axis: MBI threshold (-10 to 40); y-axes: % true/false positives (0 to 100) and enrichment factor (0 to 2000). Green: % true positives above threshold; red: % false positives above threshold; blue: enrichment factor]
Based on ~450 active molecules with IC50 < 10 micromolar
~8000 randomly picked molecules from ZINC assumed to be inactives
Docking with ISE-dock/AutoDock
We used the crystal structure of mouse AChE (PDB 1Q84) for docking.
Compounds in the protonated state were docked to AChE by AutoDock 3.0 and ISE-dock.
751 out of 755 compounds were docked in the active site by both methods
ISE-dock results
[Figure: 10 different conformations of one ligand in AChE. Each color represents a different pose]
[Fig 2: AChE with ACh; the red color represents the negatively charged gorge due to many side chain aromatic rings]
10 compounds from docking results (financial limitation)
The 10 compounds were picked by direct examination of each of these molecules in the active site, paying utmost attention to its conformation, H-bonds and other interactions.
Experimental Results
9 out of the 10 compounds were purchased
8 out of the 9 compounds reached our lab in sufficient quantity
5 out of the 8 compounds are soluble
3 out of the 5 compounds are active (IC50 = 3.25, 3.5, 3.75 µM)
Similarity to known active compounds is less than 0.35
The molecules are novel AChE inhibitors (not a single paper on any)
Conclusions
ISE is useful for solving extremely complex optimization problems
Provides large sets of graded results
Achieves high enrichments of “actives” vs. “inactives” by MBI, DLI, MSI etc.
Useful for developing multi-targeted drugs
Discovers new binders for known drug targets
Produces diverse sets of solutions
Molecular Modeling Group Partners
http://www.md.huji.ac.il/models
http://www.cancergrid.eu
Prof. Andrej Bohac: Comenius U, Bratislava, VEGFR2 (Angiokem)
DAC company, Milan, HDAC and HSP90 inhibition
Prof. Mart Saarma: U. Helsinki, RET kinase inhibition
Prof. Martin Ruthardt: U. Frankfurt, Bcr-Abl inhibition by peptides
Prof. Yousef Najajreh: Al Quds University, Bcr-Abl inhibitor synthesis
Prof. Yossi Schlessinger: Yale, FGFR inhibitors
Prof. David Varon: Hadassah, Jerusalem, ADAMTS-13 inhibition
Prof. Angelo Carotti: School of Pharmacy, Univ. of Bari, MMP inhibitors
Prof. Marta Rosin: HUJI, AChE inhibitors
Molecular Modeling Group, HUJI
http://www.md.huji.ac.il/models