fragment screening library workshop (iqpc 2008)
I also ran a workshop on selection of compounds for fragment screening just before the 2008 IQPC compound library conference, and these are the slides I used.
Design of compound libraries for fragment screening
IQPC Compound Libraries 2008, Workshop D
Peter W. Kenny
AstraZeneca, Alderley Park
Workshop outline
• Introduction to fragment based drug discovery (FBDD)
• Diversity, coverage and library design
• Fragment selection criteria
• An example: GFSL05 (AstraZeneca generic fragment screening library)
• Exercises
Introduction to fragment based drug discovery (FBDD)
FBDD Essentials
Scheme: Target → Screen fragments → Target & fragment hit → Synthetic elaboration → Target & lead
Why fragments?
• Leads are assembled from proven molecular recognition elements
• Access to larger chemical space
• Ability to control resolution at which chemical space is sampled.
Fragment screening requirements
• Assay capable of reliably quantifying weak (~mM) binding
• Library of compounds with low molecular complexity and good aqueous solubility
2D Protein-observe NMR: PTP1B
Figure: 2D 1H/15N spectrum (axes: 1H ppm, 15N ppm) recorded at ligand concentrations of 0, 0.5, 1.0, 2.0 and 4.0 mM. Labelled resonances: V49, F30, W125, Y46/T154, L83, G277, G283, T263, A278, D48.
Observation of protein resonances allows determination of Kd and can provide binding site information. These techniques require isotopically labelled protein and there are limits on the size of protein that can be studied. (Kevin Embrey)
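The link between a titration like this and Kd can be illustrated with a simple 1:1 binding isotherm. The sketch below assumes ligand in large excess over protein (so free ligand is approximately total ligand) and uses a hypothetical Kd of 1 mM; neither the Kd value nor the function name comes from the slides. In fast exchange the observed chemical shift change scales with the fraction bound.

```python
def fraction_bound(ligand_mM, kd_mM):
    """Fraction of protein bound for 1:1 binding with ligand in excess."""
    return ligand_mM / (kd_mM + ligand_mM)


# Ligand concentrations from the titration (mM).
concentrations = [0.0, 0.5, 1.0, 2.0, 4.0]
assumed_kd_mM = 1.0  # hypothetical value, for illustration only

for conc in concentrations:
    bound = fraction_bound(conc, assumed_kd_mM)
    print(f"{conc:4.1f} mM ligand: fraction bound = {bound:.2f}")
```

With a Kd near the middle of the concentration range, the titration samples both the rising part and the onset of the plateau, which is what makes a fit well determined.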
1D Ligand-observe NMR
Ligand in buffer
Ligand and target protein
After saturation with potent inhibitor
Isotopically labelled protein is not required when observing ligand resonances and there are no restrictions on protein molecular weight. However, competition experiments are necessary to quantify binding (Rutger Folmer).
Measurement of fragment binding by SPR
Figure: competition curve (x axis: [Inhibitor] µM, 0.001 to 1000 on a log scale; y axis: 0 to 1).
In these experiments, protein is first allowed to bind to a ligand (target definition compound, TDC) that has been immobilised on a sensor chip (Biacore). Test compounds that bind competitively with respect to the TDC effectively draw protein off the sensor, and the strength of binding can be quantified (Wendy VanScyoc).
The figure shows a ~200 MW fragment binding with similar affinities (102 mM & 145 mM) to different forms of the target protein.
Biochemical assay run at high concentration
Figure: % inhibition plotted against log [compound]/M; IC50 = 371 mM.
Inhibition of target enzyme by a ~200 MW fragment. When using a biochemical assay at high concentration it is necessary to check for non-specific binding and other potential artifacts. It is also possible to assess solubility under assay conditions. Compounds identified by biochemical assays are inhibitory, which may not always be the case when using affinity methods. (Adam Shapiro)
Crystal Structure of AZ10336676 bound to PTP1B
Figure labels: WPD loop, F182, catalytic loop, C215, Y46, Q266.
Crystallographic detection of fragment binding reveals the binding mode but does not allow affinity to be quantified. Crystallography can be challenging with weakly bound inhibitors (Andrew Pannifer & Jon Read).
PTP1B: Fragment elaboration
Figure: four structures labelled AZ10336676 (3 mM), conformational lock (150 mM), hydrophobic m-subst (130 mM) and AZ11548766 (3 mM). Two difluorophosphonate structures are labelled 15mM and inactive at 200mM.
Elaboration by hybridisation: literature SAR was mapped onto the fragment AZ10336676 (green). Note the overlay of the aromatic rings of the elaborated fragment AZ11548766 (blue) and the difluorophosphonate (red). See Bioorg. Med. Chem. Lett., 15, 2503-2507 (2005).
The Hann molecular complexity model
Hann et al, Molecular Complexity and Its Impact on the Probability of Finding Leads for Drug Discovery, J. Chem. Inf. Comput. Sci., 2001, 41, 856-864
Success landscape
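The matching part of the complexity model can be sketched in a few lines. This is a simplified toy version, not the published implementation: binding site and ligand are binary strings, a ligand matches at a position when every feature is complementary to the site, and the function name and the default site length of 12 are assumptions made here. The published model also considers the probability of measuring binding, which is omitted.

```python
from collections import Counter
from itertools import product


def match_probabilities(ligand_len, site_len=12):
    """Return (P(at least one match), P(exactly one match)).

    Probabilities are averaged over all binary sites of length site_len
    and all binary ligands of length ligand_len. A ligand matches at a
    position when it is the complement of the site window there, so
    counting matching ligands is the same as counting distinct windows.
    """
    n_windows = site_len - ligand_len + 1
    any_total = 0
    unique_total = 0
    n_sites = 0
    for site in product((0, 1), repeat=site_len):
        windows = Counter(site[i:i + ligand_len] for i in range(n_windows))
        any_total += len(windows)  # ligands with one or more matches
        unique_total += sum(1 for c in windows.values() if c == 1)
        n_sites += 1
    n_pairs = n_sites * 2 ** ligand_len
    return any_total / n_pairs, unique_total / n_pairs


for length in range(1, 13):
    p_any, p_unique = match_probabilities(length)
    print(f"ligand length {length:2d}: any = {p_any:.4f}, unique = {p_unique:.4f}")
```

Very short ligands match almost every site but rarely in a single, unambiguous way, while long ligands rarely match at all.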
Ligand Efficiency (Bang For Buck)
Does molecule punch its weight?
• Scale pIC50 or ΔG° by molecular weight or number of heavy atoms as surrogate for molecular surface area
– Rationale: Molecules interact by presenting molecular surfaces to each other. How effectively does a molecule make use of its molecular surface?
• Fragment hits tend to have high ligand efficiency…
– But then they need to!
• Is high ligand efficiency indicative of a hot spot on the protein surface?
A. L. Hopkins, C. R. Groom, A. Alex, Ligand efficiency: A useful metric for lead selection,
Drug Discov. Today 2004, 430-431.
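As a worked illustration of scaling ΔG° by heavy atom count, the sketch below converts a dissociation constant to a free energy and divides by the number of non-hydrogen atoms. The temperature (298.15 K), the 1 M standard state and the example fragment (Kd = 1 mM, 12 heavy atoms) are assumptions chosen for illustration.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)


def binding_free_energy(kd_molar, temp_k=298.15):
    """Standard binding free energy (kcal/mol) from Kd, relative to 1 M."""
    return R_KCAL * temp_k * math.log(kd_molar)


def ligand_efficiency(kd_molar, heavy_atoms, temp_k=298.15):
    """Ligand efficiency: -deltaG per heavy atom (kcal/mol per atom)."""
    return -binding_free_energy(kd_molar, temp_k) / heavy_atoms


# Hypothetical fragment: Kd = 1 mM, 12 heavy atoms.
print(round(ligand_efficiency(1e-3, 12), 2))  # 0.34
```

A 30 heavy atom compound would need a Kd of roughly 30 nM to reach the same efficiency, which is one way of reading "but then they need to!".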
Overview of fragment based lead discovery
Scheme for fragment based lead optimisation. Elements shown: Target-based compound selection; Analogues of known binders; Generic screening library; Screen fragments; Measure Kd or IC50; Synthetic elaboration of hits; SAR; Protein structures; Milestone achieved! Proceed to next project.
Literature
General
• Erlanson et al, Fragment-Based Drug Discovery, J. Med. Chem., 2004, 47, 3463-3482.
• Congreve et al, Recent Developments in Fragment-Based Drug Discovery, J. Med. Chem., 2008, 51, 3661-3680.
• Albert et al, An integrated approach to fragment-based lead generation: philosophy, strategy and case studies from AstraZeneca's drug discovery programmes, Curr. Top. Med. Chem., 2007, 7, 1600-1629.
• Hann et al, Molecular Complexity and Its Impact on the Probability of Finding Leads for Drug Discovery, J. Chem. Inf. Comput. Sci., 2001, 41, 856-864.
• Shuker et al, Discovering High Affinity Ligands for Proteins: SAR by NMR, Science, 1996, 274, 1531-1534.
Screening Libraries
• Schuffenhauer et al, Library Design for Fragment Based Screening, Curr. Top. Med. Chem., 2005, 5, 751-762.
• Baurin et al, Design and Characterization of Libraries of Molecular Fragments for Use in NMR Screening against Protein Targets, J. Chem. Inf. Comput. Sci., 2004, 44, 2157-2166.
• Colclough et al, High throughput solubility determination with application to selection of compounds for fragment screening, Bioorg. Med. Chem., 2008, 16, 6611-6616.
• Kenny & Sadowski, Structure modification in chemical databases, Methods and Principles in Medicinal Chemistry, 2005, 23, 271-285.
Diversity, coverage and library design
Screening Library Design Requirements
• Precise specification of substructure
– Count substructural elements (e.g. chlorine atoms; rotatable bonds; terminal atoms; reactive centres…)
– Define generic atom types (e.g. anionic centers; hydrogen bond donors)
• Meaningful measure of molecular similarity
– Structural neighbours likely to show similar response in assay
Measures of diversity & coverage
A 2-dimensional representation of chemical space is used here to illustrate the concepts of diversity and coverage. Stars indicate compounds selected to sample this region of chemical space. In this representation, similar compounds are close together.
Coverage & Diversity
Poor coverage of available chemical space by small set of mutually similar compounds
Reasonable coverage of available chemical space given small, diverse set of compounds
Good coverage of available chemical space by appropriate number of compounds
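One way to put numbers on coverage and diversity is sketched below. The definitions are assumptions made for illustration, not measures taken from the slides: coverage is the fraction of available compounds that have a selected compound within a similarity threshold, and diversity is the mean distance from each selected compound to its nearest selected neighbour. Fingerprints are represented as sets of 'on' bits.

```python
def tanimoto(a, b):
    """Tanimoto coefficient for two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def coverage(selected, available, threshold=0.5):
    """Fraction of available compounds with a selected neighbour at >= threshold."""
    covered = sum(
        1 for cpd in available
        if any(tanimoto(cpd, s) >= threshold for s in selected)
    )
    return covered / len(available)


def diversity(selected):
    """Mean of (1 - similarity to nearest other selected compound).

    Needs at least two selected compounds.
    """
    total = 0.0
    for i, a in enumerate(selected):
        nearest = max(tanimoto(a, b) for j, b in enumerate(selected) if j != i)
        total += 1.0 - nearest
    return total / len(selected)


# Toy chemical space: two clusters of two compounds each.
available = [
    frozenset({1, 2, 3, 4}), frozenset({1, 2, 3, 5}),
    frozenset({7, 8, 9}), frozenset({7, 8, 10}),
]
similar_pair = [available[0], available[1]]   # both from one cluster
diverse_pair = [available[0], available[2]]   # one from each cluster

print(coverage(similar_pair, available), diversity(similar_pair))
print(coverage(diverse_pair, available), diversity(diverse_pair))
```

In the toy data, two mutually similar compounds cover only their own cluster, while the same number of dissimilar compounds covers both.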
Neighborhoods and library design
Core and layer library design
Flowchart steps: Select core; Add layer to core; Incorporate layer; Acceptable diversity and coverage? (Yes/No); Assemble library in soluble form.
Compounds in a layer are selected to be diverse with respect to core compounds. The ‘outer’ layers typically contain compounds that are less attractive than the ‘inner’ layers. This approach to library design can be applied with Flush or BigPicker programs (David Cosgrove, AstraZeneca, Alderley Park) using molecular similarity measures calculated from molecular fingerprints. (See Curr. Top. Med. Chem. 2007, 7, 1600-1629).
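The core and layer idea can be sketched as a greedy selection. This is not how Flush or BigPicker work internally; it is a minimal illustration in which a candidate joins the library only if it is dissimilar to everything already chosen. The function names, the set-of-bits fingerprint representation and the 0.8 cut-off are assumptions.

```python
def tanimoto(a, b):
    """Tanimoto coefficient for two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def add_layer(library, candidates, max_sim=0.8):
    """Add candidates whose similarity to every library member is below max_sim."""
    grown = list(library)
    for cand in candidates:
        if all(tanimoto(cand, member) < max_sim for member in grown):
            grown.append(cand)
    return grown


def build_library(core, layers, max_sim=0.8):
    """Apply layers in order; later layers hold less attractive compounds."""
    library = list(core)
    for layer in layers:
        library = add_layer(library, layer, max_sim)
    return library


core = [frozenset({1, 2, 3, 4})]
layers = [
    [frozenset({1, 2, 3, 4, 5}), frozenset({7, 8, 9})],   # inner layer
    [frozenset({7, 8, 9, 10}), frozenset({20, 21})],      # outer layer
]
print(len(build_library(core, layers)))
```

Because inner layers are processed first, a less attractive compound is only selected when it samples a region that the more attractive compounds have left empty.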
Fragment selection criteria
Diagram groups: Sample availability; Molecular connectivity; Physical properties; 3D molecular structure.
Criteria shown: screening samples; close analogs; ease of synthetic elaboration; molecular complexity; ionisation; lipophilicity; solubility; molecular recognition elements; molecular shape; 3D pharmacophore; privileged substructures; undesirable substructures; molecular size.
Degree of substitution as measure of molecular complexity
Figure: ligands bound at two different sites.
The prototypical benzoic acid can be accommodated at both sites and, provided that binding can be observed, will deliver a hit against both targets (see Curr. Top. Med. Chem. 2007, 7, 1600-1629).
Hits, non-hits & lipophilicity: Survival of the fattest*
Comparison of ClogP for hits and non-hits from fragment screens run at AstraZeneca
            Mean    Std Err   Std Dev
Hits        2.05    0.08      1.10
Non-Hits    1.35    0.03      1.24
*Analysis of historic screening data & quote: Niklas Blomberg, AZ Molndal
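The summary statistics can be unpacked a little further. The sketch below recovers approximate sample sizes from the quoted standard deviations and standard errors and computes a Welch-style t statistic for the difference in mean ClogP. The sample sizes are only approximate because the published values are rounded, and the function names are illustrative.

```python
import math


def sample_size(std_dev, std_err):
    """Approximate n from std_err = std_dev / sqrt(n)."""
    return (std_dev / std_err) ** 2


def welch_t(mean_a, se_a, mean_b, se_b):
    """Difference in means divided by its standard error."""
    return (mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)


hits = {"mean": 2.05, "se": 0.08, "sd": 1.10}
non_hits = {"mean": 1.35, "se": 0.03, "sd": 1.24}

print(round(sample_size(hits["sd"], hits["se"])))          # about 189 hits
print(round(sample_size(non_hits["sd"], non_hits["se"])))  # about 1708 non-hits
print(round(welch_t(hits["mean"], hits["se"],
                    non_hits["mean"], non_hits["se"]), 1))
```

The difference in means (0.70 log units) is well resolved statistically, but it is smaller than either standard deviation, so the two ClogP distributions overlap substantially.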
Aqueous solubility: Percentiles for measured log(S/M) as function of ClogP
Figure: 10%, 20%, 30%, 40% and 50% percentile curves of log(S/M) plotted against ClogP.
The data set is partitioned by ClogP into bins, and the percentiles and mean ClogP are calculated for each bin. This way of plotting results is particularly appropriate when the dynamic range of the measurement is low. Beware of similar plots where only the mean or median value is shown for each bin, because this masks variation and makes weak relationships appear stronger than they actually are. (See Bioorg. Med. Chem. 2008, 16, 6611-6616).
Measure solubility for neutral (at pH 7.4) fragments for which ClogP > 2.2
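The binning procedure described for the percentile plot can be sketched as follows. Equal-count bins, the 'inclusive' percentile method and the synthetic data are assumptions made for illustration; the slides do not say how the bins were chosen.

```python
from statistics import mean, quantiles


def percentile_profile(records, n_bins=4, percents=(10, 20, 30, 40, 50)):
    """Bin (clogp, log_s) pairs by ClogP and summarise each bin.

    Returns one dict per bin holding the mean ClogP and the requested
    percentiles of log S. Bins hold equal numbers of records, with any
    remainder going into the last bin.
    """
    ordered = sorted(records)  # sorts by ClogP first
    bin_size = len(ordered) // n_bins
    profile = []
    for b in range(n_bins):
        start = b * bin_size
        end = len(ordered) if b == n_bins - 1 else start + bin_size
        chunk = ordered[start:end]
        cuts = quantiles([s for _, s in chunk], n=100, method="inclusive")
        profile.append({
            "mean_clogp": mean(c for c, _ in chunk),
            "percentiles": {p: cuts[p - 1] for p in percents},
        })
    return profile


# Synthetic data: log S falls weakly with ClogP, with scatter.
records = [(i / 10, -2.0 - 0.5 * (i / 10) + ((i * 7) % 5) / 10) for i in range(40)]
for row in percentile_profile(records):
    print(round(row["mean_clogp"], 2), row["percentiles"])
```

Plotting several percentiles per bin keeps the spread within each bin visible, which is the point made in the caption about plots that show only a mean or median.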
Solubility in DMSO: Salts
              Precipitate   Precipitate    All samples
              observed      not observed
Adduct        29            525            554
Not Adduct    89            4440           4529
All samples   118           4965           5083
Analysis of 5K solubilised samples showed that 5% of samples registered as ‘adduct’ (mainly salts) showed evidence of precipitation compared to 2% of the other samples.
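The quoted percentages can be checked directly from the counts. The sketch takes the precipitate counts as 29 (adduct) and 89 (not adduct), which is the reading consistent with the stated 5% and 2%; the dictionary layout and function name are illustrative.

```python
counts = {
    "adduct": {"precipitate": 29, "total": 554},
    "not_adduct": {"precipitate": 89, "total": 4529},
}


def precipitation_rate(group):
    """Fraction of samples in a group that showed precipitate."""
    return counts[group]["precipitate"] / counts[group]["total"]


adduct_rate = precipitation_rate("adduct")
other_rate = precipitation_rate("not_adduct")
relative_risk = adduct_rate / other_rate

print(f"adduct: {adduct_rate:.1%}, not adduct: {other_rate:.1%}")
print(f"relative risk: {relative_risk:.2f}")
```

On these counts, samples registered as adducts were a little over two and a half times as likely to show precipitate.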
#
# Generic fragment screening library
#
# SMARTS for restriction of substitution in fragments
#
# restrict_subs_1.smt
#
#-------------------------------------------------------------
# Some general size restrictions to set tone of search
#
Hev [A,a] 5-20
Arom a 5-12
Term [A;D1]-[A,a] 0-2
Fuse [c,A;R2] 0-2
#-------------------------------------------------------------
# Specific atom types: Explicit specification of what is
# permitted in molecule. If it's not allowed it's verboten!
#
CH2 [C;H2;!R] 0-2
O1 [OD2] 0-2
O2 [OH] 0-2
O3 O=C[OH] 0-1
O4 O=C[NX3] 0-2
O5 O=c[n&X3,o&X2] 0-2
O6 O=c1aa[n&X3,o&X2]cc1 0-2
O7 O=S 0-2
TerAm [N;!+;X3]([CX4])([CX4])[CX4] 0-2
N1 [N,n;!+;X3] 0-2
N2 [n;X2] 0-3
N3 [n;H;!+] 0-1
N4 [N;X3;!H0;!+] 0-2
S1 S(c)[C&X4,c] 0-1
CO C(=O)[N,O&H] 0-2
SO S(=O)=O 0-1
ArOS [o,s] 0-1
# Specific requirements
# Atoms providing polar interaction
Interact1 [$TerAm,$N2,$N3,$N4] *
Interact2 [$O2,$O3,$O4,$O5,$O6,$O7] *
Interact [$Interact1,$Interact2] 1-4
#
# Benzene ring
Benzene c1ccccc1 6-12
#-------------------------------------------------------------
#
# Decrapping SMARTS: Don't want these
#
AtmOK1 [c,$CH2,$O1,$O2,$O3,$O4,$O5,$O6,$O7] *
AtmOK2 [$N1,$N2,$N3,$N4,$TerAm,$S1] *
AtmOK3 [$CO,$SO,$ArOS,C&H3,F,Cl] *
CrpAtm [A,a;!$AtmOK1;!$AtmOK2;!$AtmOK3] 0
Cation [A,a;+] 0
ReactHal [F,Cl,Br,I][C&X4,$(c[nX2]),$(C=O),N,O,S] 0
SulfEster S(=O)O[CX4] 0
NAcyl NC=O *
NN1 [N;!$NAcyl]-[N;!$NAcyl] 0
NN2 [N,n]-N 0
NO [N,n;!$NAcyl]-O 0
AcycEst C(=O)O[a,A] 0
Anhydrid O=[C,c][o,O][C,c]=O 0
Formyl [CH]=O 0
Keto O=C(C)C 0
Quinon O=c1ccc(=O)cc1 0
Phenol [OH]c 0
Anilin1 [NH2]c1ccccc1 0
Anilin2 [NH]([CH3])c1ccccc1 0
Het2sp3c [O,N,n,S]-;!@[CX4]-[O,N,n,S] 0
#
# Groups to restrict: Not so bad in very small numbers
Amino [NH2] 0-1
Chloro [Cl] 0-1
Hydroxyl [OH] 0-1
#
# Combinations of groups to be restricted
AmHydrox [$Amino,$Hydroxyl] 0-1
Example of SMARTS used to select fragments
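The layout of the restriction file (name, SMARTS pattern, allowed count) lends itself to a small parser. The sketch below reads one rule per line and checks a match count against the allowed range. Treating `*` as "definition only, no limit" is an assumption about the file format, and the function names are illustrative. Counting the matches themselves needs a SMARTS-capable toolkit, with references such as `$O2` expanded first, and is not shown.

```python
def parse_rule(line):
    """Parse 'Name SMARTS limit' into (name, smarts, low, high).

    Returns None for blank lines and comments. A limit of '*' is taken
    to mean the pattern is a definition with no count restriction.
    """
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    name, smarts, limit = line.split()
    if limit == "*":
        return name, smarts, None, None
    if "-" in limit:
        low, high = (int(part) for part in limit.split("-"))
    else:
        low = high = int(limit)
    return name, smarts, low, high


def passes(rule, count):
    """True when a match count falls inside the rule's allowed range."""
    _, _, low, high = rule
    return low is None or low <= count <= high


example_lines = [
    "# Some general size restrictions to set tone of search",
    "Hev [A,a] 5-20",
    "Cation [A,a;+] 0",
    "NAcyl NC=O *",
]
rules = [r for r in map(parse_rule, example_lines) if r is not None]
print(rules)
print(passes(rules[0], 14), passes(rules[1], 1))
```

A molecule would be accepted only if every rule with a count restriction passes.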
An example: GFSL05 (AstraZeneca generic fragment screening library)
The GFSL05 project
• Rationale
– Strategic requirement: Readily accessible source of compounds for a range of fragment screening applications
– Tactical objective: Assemble 20K structurally diverse compounds with properties that are appropriate for fragment screening as 100mM DMSO stocks
• Design overview
– Core and layer design applied with successively more permissive filters (substructural, neighborhood, properties)
– Bias compound selection to cover unsampled chemical space
GFSL05: Design
• Molecular recognition considerations
– Requirement for at least one charged center or acceptably strong hydrogen bonding donor or acceptor
• Substructural requirements defined as SMARTS
– Progressively more permissive filters to apply core and layer design
– Restrict numbers of non-hydrogen atoms (size) and terminal atoms (complexity)
– Filters to remove undesirable functional groups (acyl chloride) and to restrict numbers of others (nitro, chloro)
– ‘Prototypical reaction products’ for easy follow up
• Control of lipophilicity (ClogP) dependent on ionisation state
– Solubility measurement for more lipophilic neutrals
• Tanimoto coefficient calculated using foyfi fingerprints (Dave Cosgrove) as primary similarity measure
– Requirement for neighbour availability in core and layer design
GFSL05: Size and lipophilicity profiles
Figure panels: ClogP (charged library compounds); ClogP (neutral library compounds); non-hydrogen atoms; rotatable bonds.
Breakdown of GFSL05 by charge type
Figure: breakdown into Neutral, Anion and Cation forms; values shown: 61, 17, 13, 4, 4, 1, 0.
Ionisation states are identified using the AZ ionisation and tautomer model. Multiple forms are generated for acids and bases where pKa is thought to be close to physiological pH (see Methods and Principles in Medicinal Chemistry 2005, 23, 271-285).
GFSL05: Numbers of neighbours within library as function of similarity (Tanimoto coefficient; foyfi fingerprints)
Figure: distributions at similarity thresholds of 0.90, 0.85 and 0.80.
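Neighbour counts of this kind can be computed as sketched below. Fingerprints are represented as sets of 'on' bits, which is an assumption made for illustration and says nothing about how foyfi fingerprints are encoded; the function names are also illustrative.

```python
def tanimoto(a, b):
    """Tanimoto coefficient for two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def neighbour_counts(library, thresholds=(0.90, 0.85, 0.80)):
    """For each compound, count other library members at or above each threshold."""
    counts = []
    for i, fp in enumerate(library):
        sims = [tanimoto(fp, other) for j, other in enumerate(library) if j != i]
        counts.append({t: sum(s >= t for s in sims) for t in thresholds})
    return counts


# Toy library of three closely related fingerprints.
library = [frozenset(range(10)), frozenset(range(9)), frozenset(range(8))]
for row in neighbour_counts(library):
    print(row)
```

The all-pairs comparison is quadratic in library size, which is manageable for a library of around 20K compounds but worth bearing in mind for larger collections.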
GFSL05: Numbers of available neighbours as function of similarity (Tanimoto coefficient; foyfi fingerprints) and sample weight
Figure: panels for >10mg and >20mg at similarity thresholds of 0.90, 0.85 and 0.80.
Exercises
Exercise 1:
Directed library using crystal structural information
You are selecting fragments for screening against an enzyme target. You have available the crystal structure of a complex with a stable substrate analog, further access to crystallography and a robust biochemical assay.
• What advantages and disadvantages do you see in using a biochemical assay?
• How would you select the compounds in the screening library?
• How would you follow up hits from the primary screen?
Exercise 2:
Generic library for screening by X-ray crystallography
You are selecting a single generic set of fragments for screening against multiple, unrelated targets using X-ray crystallography.
• How might the requirements of crystallography differ from those of other technologies for detecting binding?
• How would you select the library compounds?
• How would you partition the screening library into mixtures for screening?