Design of a Compound Screening Collection
![Page 1: Design of a Compound Screening Collection](https://reader036.vdocuments.us/reader036/viewer/2022070402/568137f8550346895d9fba1d/html5/thumbnails/1.jpg)
Design of a Compound
Screening Collection
Gavin Harper
Cheminformatics, Stevenage
![Page 2: Design of a Compound Screening Collection](https://reader036.vdocuments.us/reader036/viewer/2022070402/568137f8550346895d9fba1d/html5/thumbnails/2.jpg)
In the Past...
Scientists chose what molecules to make
They tested the molecules for relevant activity
![Page 3: Design of a Compound Screening Collection](https://reader036.vdocuments.us/reader036/viewer/2022070402/568137f8550346895d9fba1d/html5/thumbnails/3.jpg)
Now...
We often screen a whole corporate collection
– $10^5$ to $10^6$ compounds
But we choose what’s in the collection
If the collection doesn’t have the right molecules in it
– we fail
![Page 4: Design of a Compound Screening Collection](https://reader036.vdocuments.us/reader036/viewer/2022070402/568137f8550346895d9fba1d/html5/thumbnails/4.jpg)
“Screen MORE”
Everything’ll be fine
We’ll find lots of hits
Not borne out by our experience
![Page 5: Design of a Compound Screening Collection](https://reader036.vdocuments.us/reader036/viewer/2022070402/568137f8550346895d9fba1d/html5/thumbnails/5.jpg)
How do I design a collection? - 1
Pick the right kind of molecules
– molecules that hit similar biological targets
– a computational (in silico) model predicts activity at the right kind of target for a given class of molecules
– exclude molecules that fail simple chemical or property filters known to be important for “drugs”
FOCUS!
![Page 6: Design of a Compound Screening Collection](https://reader036.vdocuments.us/reader036/viewer/2022070402/568137f8550346895d9fba1d/html5/thumbnails/6.jpg)
How do I design a collection? - 2
Cover all the options
Pick as “diverse” a set of molecules as possible
If there’s an active region of chemical space, we should have it covered
DIVERSE SELECTION
– opposite extreme to focused selection
![Page 7: Design of a Compound Screening Collection](https://reader036.vdocuments.us/reader036/viewer/2022070402/568137f8550346895d9fba1d/html5/thumbnails/7.jpg)
Basic Idea of Our Model
Relate biological similarity to chemical similarity
Use a realistic objective
– maximize number of lead series found in HTS
Build a mathematical model on minimal assumptions
How does our collection perform now in HTS?
– relate this to our model
Learn what we need to make/purchase for HTS to find more leads
![Page 8: Design of a Compound Screening Collection](https://reader036.vdocuments.us/reader036/viewer/2022070402/568137f8550346895d9fba1d/html5/thumbnails/8.jpg)
A “simple” model
Chemical space is clustered (partitioned)
– there are various possible ways to do this
For a given screen, each cluster i has
– a probability $\pi_i$ that it contains a lead
If we sample a random compound from a cluster containing a lead, the compound has
– a probability $\alpha_i$ that it shows up as a hit in the screen
If we find a hit in the cluster, that’s enough to get us to the lead
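The model above can be turned into a small calculation. The sketch below assumes that compounds drawn from a lead-containing cluster are hits independently of one another, so that $N_i$ compounds give at least one hit with probability $1 - (1 - \alpha_i)^{N_i}$. The function names are illustrative and do not come from the presentation.

```python
from typing import Sequence


def cluster_hit_probability(pi_i: float, alpha_i: float, n_i: int) -> float:
    """Probability that screening n_i compounds from cluster i finds its lead.

    pi_i    : probability that the cluster contains a lead
    alpha_i : probability that a compound from a lead-containing cluster is a hit
    n_i     : number of compounds screened from the cluster
    """
    # Assumption: hits are independent draws, so the chance that none of the
    # n_i compounds is a hit is (1 - alpha_i) ** n_i.
    return pi_i * (1.0 - (1.0 - alpha_i) ** n_i)


def expected_leads(pi: Sequence[float], alpha: Sequence[float], n: Sequence[int]) -> float:
    """Expected number of lead series found, summed over all clusters."""
    return sum(cluster_hit_probability(p, a, k) for p, a, k in zip(pi, alpha, n))
```

A cluster with no compounds contributes nothing, and each extra compound in an already sampled cluster adds less than the one before it.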
![Page 9: Design of a Compound Screening Collection](https://reader036.vdocuments.us/reader036/viewer/2022070402/568137f8550346895d9fba1d/html5/thumbnails/9.jpg)
And in pictures...
clusters containing leads
$\pi_i$ = Pr(box i is orange)
![Page 10: Design of a Compound Screening Collection](https://reader036.vdocuments.us/reader036/viewer/2022070402/568137f8550346895d9fba1d/html5/thumbnails/10.jpg)
$\alpha_i$ = Pr(dot is green)
Hit, Non-Hit, Lead
![Page 11: Design of a Compound Screening Collection](https://reader036.vdocuments.us/reader036/viewer/2022070402/568137f8550346895d9fba1d/html5/thumbnails/11.jpg)
Constrained Optimization Problem
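Suppose that we want to construct a screening collection of fixed size M
To maximize the expected number of lead series found we have to solve

$$
\begin{aligned}
\text{Maximize} \quad & \sum_{i=1}^{p} \pi_i \left[ 1 - (1 - \alpha_i)^{N_i} \right] \\
\text{subject to} \quad & \sum_{i=1}^{p} N_i = M, \qquad N_i \ge 0 \quad (i = 1, \dots, p)
\end{aligned}
\tag{P}
$$

where $N_i$ is the number of compounds selected from cluster i and p is the number of clusters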
![Page 12: Design of a Compound Screening Collection](https://reader036.vdocuments.us/reader036/viewer/2022070402/568137f8550346895d9fba1d/html5/thumbnails/12.jpg)
Solution
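$$
N_i = \begin{cases}
\dfrac{\ln \lambda - \ln \pi_i - \ln\left(-\ln(1 - \alpha_i)\right)}{\ln(1 - \alpha_i)} & \text{whenever this is} \ge 0 \\
0 & \text{otherwise}
\end{cases}
$$

where $\lambda$ is the Lagrange multiplier fixed by the constraint $\sum_i N_i = M$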
If we know very little ($\pi_i$, $\alpha_i$ equal for all i)
– select the same number from each cluster - diversity solution
If e.g. we know some clusters are far more likely than others to contain leads for a target
– select compounds only from these clusters - focused solution (filters)
But we also have a solution for all the situations in between, where there is a balance between diversity and focus
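The balance between diversity and focus can be explored numerically. The sketch below assumes the stationarity condition $-\pi_i (1 - \alpha_i)^{N_i} \ln(1 - \alpha_i) = \lambda$ for every cluster with $N_i > 0$, treats the $N_i$ as continuous, and finds $\ln \lambda$ by bisection so that the cluster sizes sum to M. The bisection and the function names are choices made for this illustration, not necessarily how the authors computed their solution.

```python
import math
from typing import List, Sequence


def allocation(pi: Sequence[float], alpha: Sequence[float], log_lam: float) -> List[float]:
    """Cluster sizes N_i implied by a given value of ln(lambda)."""
    sizes = []
    for p, a in zip(pi, alpha):
        log_q = math.log1p(-a)  # ln(1 - alpha_i), negative for 0 < alpha_i < 1
        n = (log_lam - math.log(p) - math.log(-log_q)) / log_q
        sizes.append(max(n, 0.0))  # clusters with a negative value receive nothing
    return sizes


def optimal_allocation(pi: Sequence[float], alpha: Sequence[float], m: float) -> List[float]:
    """Continuous allocation of m compounds; requires pi_i > 0 and 0 < alpha_i < 1."""
    # At this upper value of ln(lambda) every cluster gets zero compounds.
    hi = max(math.log(p) + math.log(-math.log1p(-a)) for p, a in zip(pi, alpha))
    lo = hi - 1.0
    # Lower ln(lambda) until the total allocation reaches m.
    while sum(allocation(pi, alpha, lo)) < m:
        lo -= hi - lo
    # The total allocation decreases as ln(lambda) increases, so bisect.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if sum(allocation(pi, alpha, mid)) > m:
            lo = mid
        else:
            hi = mid
    return allocation(pi, alpha, 0.5 * (lo + hi))
```

With equal $\pi_i$ and $\alpha_i$ the result is the same number from every cluster. With one cluster far more likely to contain a lead, the others receive zero. In practice the continuous sizes would still need rounding to whole compounds.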
![Page 13: Design of a Compound Screening Collection](https://reader036.vdocuments.us/reader036/viewer/2022070402/568137f8550346895d9fba1d/html5/thumbnails/13.jpg)
Immediate Impact
$$
D\left(\{N_i\}_{i=1}^{p}\right) = \sum_{i=1}^{p} \pi_i \left[ 1 - (1 - \alpha_i)^{N_i} \right]
$$
Improved “diversity” score
Use in assessing collections for acquisition
We have integrated this score into our Multi-Objective Library Design Package
* Gillet et al., J. Chem. Inf. Comput. Sci. 2002, 42, 375-385.
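As an illustration of how such a score could be used to compare candidate acquisitions, the sketch below computes it from a list of cluster labels. It assumes the same $\pi$ and $\alpha$ for every cluster (with $\pi = 1$ the score is an effective number of clusters covered) and a hypothetical value $\alpha = 0.3$. The helper names are invented for this example and are not part of the library design package mentioned above.

```python
from collections import Counter
from typing import Hashable, Iterable, List


def diversity_score(cluster_labels: Iterable[Hashable],
                    alpha: float = 0.3, pi: float = 1.0) -> float:
    """Score of a collection given the cluster label of each compound.

    Assumption: every cluster shares the same pi and alpha.
    """
    counts = Counter(cluster_labels)  # N_i for each occupied cluster
    return sum(pi * (1.0 - (1.0 - alpha) ** n) for n in counts.values())


def acquisition_gain(current: List[Hashable], candidate: List[Hashable],
                     alpha: float = 0.3) -> float:
    """Increase in the score if the candidate compounds join the collection."""
    return diversity_score(current + candidate, alpha) - diversity_score(current, alpha)


# Toy example: two candidate purchases of two compounds each.
collection = ["A", "A", "B"]
more_of_the_same = ["A", "A"]  # falls into an already well-sampled cluster
new_chemotypes = ["C", "D"]    # opens two clusters the collection lacks
```

Under these assumptions, `acquisition_gain(collection, new_chemotypes)` is larger than `acquisition_gain(collection, more_of_the_same)`, which is the behaviour one would want from a score used to rank collections for purchase.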
![Page 14: Design of a Compound Screening Collection](https://reader036.vdocuments.us/reader036/viewer/2022070402/568137f8550346895d9fba1d/html5/thumbnails/14.jpg)
The Waldman Criteria (Sheffield, 1998)
1. Adding redundant molecules to a system does not change its diversity.
2. Adding non-redundant molecules to a system always increases its diversity.
3. Space-filling behaviour of diversity space should be preferred.
4. Perfect (i.e. infinite) filling of a finite descriptor space should result in a finite value for the diversity function.
5. If the dissimilarity or distance of one molecule to all others is increased, the diversity of the system should increase. However, as this distance increases to infinity, the diversity should asymptotically approach a constant value.
$$
D\left(\{N_i\}_{i=1}^{p}\right) = \sum_{i=1}^{p} \pi_i \left[ 1 - (1 - \alpha_i)^{N_i} \right]
$$

* Waldman et al., J. Mol. Graphics Modell. 2000, 18, 412-426.
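One way to see how the score responds when molecules are added is to look at the gain from the k-th compound in a single cluster, which works out to $\pi_i \alpha_i (1 - \alpha_i)^{k-1}$. The sketch below is only a numerical probe with hypothetical parameter values. It treats compounds in the same cluster as near-redundant, which is an interpretation, and it does not establish which of the criteria the score formally satisfies.

```python
def cluster_term(pi_i: float, alpha_i: float, n_i: int) -> float:
    """Contribution of one cluster holding n_i compounds to the score D."""
    return pi_i * (1.0 - (1.0 - alpha_i) ** n_i)


def marginal_gain(pi_i: float, alpha_i: float, k: int) -> float:
    """Increase in D when the k-th compound (k >= 1) is added to cluster i.

    Follows from cluster_term(k) - cluster_term(k - 1).
    """
    return pi_i * alpha_i * (1.0 - alpha_i) ** (k - 1)


# Hypothetical values: pi = 0.1, alpha = 0.3.
gains = [marginal_gain(0.1, 0.3, k) for k in range(1, 6)]
```

The gains shrink geometrically, so near-redundant additions change the score less and less, and the contribution of any one cluster can never exceed $\pi_i$ however many compounds it holds.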
![Page 15: Design of a Compound Screening Collection](https://reader036.vdocuments.us/reader036/viewer/2022070402/568137f8550346895d9fba1d/html5/thumbnails/15.jpg)
What value should $\alpha$ take?
Determining a value of $\alpha$ is important. We can cluster molecules using a variety of methods.
Fortunately, there is a recent paper from Abbott which answers this question
In 115 HTS assays, with a TIGHT 2-D clustering
– consistent: $\alpha$ mostly varies between 0.2 and 0.4
This agrees well with our experience
In practice we will use this (Taylor-Butina) clustering, with radius 0.85 and Daylight fingerprints
$\alpha \approx 0.3$
* Martin et al., J. Med. Chem. 2002, 45, 4350-4358.
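For readers unfamiliar with Taylor-Butina clustering, here is a minimal pure-Python sketch of the exclusion-sphere idea. It makes several assumptions: fingerprints are represented as sets of on-bit indices (real Daylight fingerprints would require the Daylight toolkit), similarity is the Tanimoto coefficient, and "radius 0.85" is read as a Tanimoto similarity threshold of 0.85. The all-pairs loop is quadratic and only suitable for toy inputs.

```python
from typing import FrozenSet, List, Sequence


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto similarity of two fingerprints given as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def taylor_butina(fps: Sequence[FrozenSet[int]], threshold: float = 0.85) -> List[List[int]]:
    """Cluster fingerprints; each cluster is [centroid, neighbour, ...] by index."""
    n = len(fps)
    neighbours = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if tanimoto(fps[i], fps[j]) >= threshold:
                neighbours[i].add(j)
                neighbours[j].add(i)

    # Visit molecules from most to fewest neighbours (ties broken by index).
    order = sorted(range(n), key=lambda i: (-len(neighbours[i]), i))
    assigned = set()
    clusters = []
    for centroid in order:
        if centroid in assigned:
            continue
        # The centroid claims every neighbour not already in a cluster.
        members = [centroid] + sorted(j for j in neighbours[centroid] if j not in assigned)
        assigned.update(members)
        clusters.append(members)
    return clusters
```

Each compound ends up in exactly one cluster, and compounds with no neighbour above the threshold become singletons, which gives the partition of chemical space that the model requires.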
![Page 16: Design of a Compound Screening Collection](https://reader036.vdocuments.us/reader036/viewer/2022070402/568137f8550346895d9fba1d/html5/thumbnails/16.jpg)
How Difficult Are Typical Screens?
[Figure: probability of one-or-more leads plotted against screen difficulty]
![Page 17: Design of a Compound Screening Collection](https://reader036.vdocuments.us/reader036/viewer/2022070402/568137f8550346895d9fba1d/html5/thumbnails/17.jpg)
Improving compound acquisition
For a sample to be eligible to form part of the collection
It has to be of a minimum purity
– determined by the QA project
It has to pass a set of agreed in silico filters
– good starting points
– developability
Multiple lead series per screen
– Multiple chemotypes => 2D representation
– Collection model provides rationale and design guidelines
Leads for all targets
– 3D Pharmacophore coverage
![Page 18: Design of a Compound Screening Collection](https://reader036.vdocuments.us/reader036/viewer/2022070402/568137f8550346895d9fba1d/html5/thumbnails/18.jpg)
Probability of finding a lead for different acquisition strategies
![Page 19: Design of a Compound Screening Collection](https://reader036.vdocuments.us/reader036/viewer/2022070402/568137f8550346895d9fba1d/html5/thumbnails/19.jpg)
Why don’t far more screens find leads as collections get bigger?
performance with a collection of singletons
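The singleton case can be worked through under the model. The sketch below assumes every cluster has the same $\pi$ and $\alpha$ and that clusters behave independently, so the probability of finding at least one lead with n sampled clusters is $1 - (1 - \pi[1 - (1 - \alpha)^{k}])^{n}$, where k is the number of compounds per cluster. The parameter values are hypothetical and are not taken from the figure.

```python
import math


def prob_at_least_one_lead(pi: float, alpha: float,
                           n_clusters: int, per_cluster: int = 1) -> float:
    """P(at least one lead found) for n_clusters equally sampled clusters.

    Assumptions: identical pi and alpha in every cluster, independent clusters.
    per_cluster = 1 corresponds to a collection of singletons.
    """
    p_cluster = pi * (1.0 - (1.0 - alpha) ** per_cluster)
    # 1 - (1 - p) ** n, written with log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(n_clusters * math.log1p(-p_cluster))


# Hypothetical hard screen: pi = 1e-6 per cluster, alpha = 0.3.
singletons_small = prob_at_least_one_lead(1e-6, 0.3, 100_000)
singletons_large = prob_at_least_one_lead(1e-6, 0.3, 1_000_000)
```

For a hard screen the probability per cluster is so small that even a tenfold larger singleton collection still leaves most screens without a lead under these assumed values.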
![Page 20: Design of a Compound Screening Collection](https://reader036.vdocuments.us/reader036/viewer/2022070402/568137f8550346895d9fba1d/html5/thumbnails/20.jpg)
The model assists with a wide range of strategic questions
What contribution did compounds from different sources make last year?
How big should a focused set of compounds for a biological system be?
– what should be in it?
How many compounds that fail the default filters should I include in the corporate collection?
A secure framework for investigating a wide range of strategic questions
Also easy to check sensitivity to changes in parameters
![Page 21: Design of a Compound Screening Collection](https://reader036.vdocuments.us/reader036/viewer/2022070402/568137f8550346895d9fba1d/html5/thumbnails/21.jpg)
The model generates new challenges
There appear to be fundamental limits in the performance of screening
Can I find a better way of describing molecules?
– can I increase $\alpha_i$ but keep $\pi_i$ at a similar level?
Are there assumptions in the model that I could breach?
– model assumes I can only find leads from “hits” in the same cluster
– how good can I get at jumping BETWEEN clusters from original hit compounds?
Can I get better at identifying high $\pi_i$ clusters?
– improve modelling and estimation of probability values
![Page 22: Design of a Compound Screening Collection](https://reader036.vdocuments.us/reader036/viewer/2022070402/568137f8550346895d9fba1d/html5/thumbnails/22.jpg)
Acknowledgements
Stephen Pickett
Darren Green
Jameed Hussain
Andrew Leach
Andy Whittington
* Harper et al., Combinatorial Chemistry and High Throughput Screening 2004, 7, 63-70.