Privileged Substructures Revisited: Target Community-Selective Scaffolds
Jürgen Bajorath, Life Science Informatics, University of Bonn
Privileged Substructures
First postulated by Evans et al. in 1988, based on the observation that many cholecystokinin antagonists contained conserved substructures not frequently seen in other active compounds
Since then, the search for target class-privileged chemotypes has continued in medicinal chemistry
Generally accepted definition:
- Recurrent fragments in ligands of a given target family
- Selective at the family level, but not for individual targets
Evans BE et al. J. Med. Chem. 1988, 31, 2235-2246
Privileged Substructures
Existence of truly target family-privileged substructures has remained controversial
Intrinsic limitation: Search for privileged substructures has been based on frequency of occurrence analysis of pre-selected substructures
Frequently drawn conclusion: a substructure may occur with high frequency among ligands of a particular target family but also act on other families
Schnur DM et al. J. Med. Chem. 2006, 49, 2000-2009
Privileged Substructures
Are target family-privileged substructures truly privileged?
Target Family Set                   # Compounds   # Substructures
GPCR class A                        21,620        1,190
Ligand gated ion channels           3,792         297
Nuclear hormone receptors (NHRs)    2,176         121
Protein kinases                     1,079         101
Serine proteases                    3,015         323
Schnur DM et al. J. Med. Chem. 2006, 49, 2000-2009
Privileged Substructures
Are target family-privileged substructures truly privileged?
Rows: target family substructure sets. Columns: ligand sets.
Substructure set                    GPCR   Ion channels   NHRs   Protein kinases   Serine proteases   Random cpd sets
GPCR class A                        -      26%            10%    11%               17%                46%
Ligand gated ion channels           47%    -              15%    19%               92%                99%
Nuclear hormone receptors (NHRs)    40%    30%            -      17%               15%                45%
Protein kinases                     48%    34%            16%    -                 20%                57%
Serine proteases                    25%    11%            7%     91%               -                  37%
Changing the Analysis Concept
Do molecular scaffolds exist that occur exclusively in ligands of individual target families?
- Bemis & Murcko framework (scaffold)
- Large-scale distribution in target families
[Figure: an example scaffold checked against target families: Peptidases, Kinases, GPCRs, ...]
Departing from frequency of occurrence analysis of pre-selected substructures
Systematic compound data mining taking all available activity annotations into account
Hierarchical Scaffolds
Bemis GW and Murcko MA. J. Med. Chem. 1996, 39, 2887-2893
[Figure: a compound is decomposed into R-groups and a framework; the framework is further decomposed into ring systems and linkers]
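The decomposition into R-groups and framework can be illustrated on a bare molecular graph. The following is a minimal sketch, not the procedure used in the talk: it assumes a molecule is given as a list of bonds between atom labels (a hypothetical input format), ignores atom and bond types, and repeatedly prunes terminal atoms so that only ring systems and the linkers between them remain.

```python
def framework_atoms(bonds):
    """Return the atoms left after repeatedly pruning terminal atoms.

    `bonds` is a list of (atom_a, atom_b) pairs describing a molecular graph.
    The remaining atoms (ring systems plus linkers between them) approximate
    the framework; the pruned atoms correspond to R-groups.
    """
    neighbors = {}
    for a, b in bonds:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    # Atoms with at most one neighbor cannot belong to a ring or a linker.
    terminal = [atom for atom, nbrs in neighbors.items() if len(nbrs) <= 1]
    while terminal:
        atom = terminal.pop()
        if atom not in neighbors:
            continue  # already pruned via another path
        for nbr in neighbors.pop(atom):
            neighbors[nbr].discard(atom)
            if len(neighbors[nbr]) <= 1:
                terminal.append(nbr)
    return set(neighbors)


# Toy graph: ring 1-2-3, linker atom 4, ring 5-6-7, side chain 8-9 on atom 7.
toy_bonds = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 5),
             (5, 6), (6, 7), (7, 5), (7, 8), (8, 9)]
toy_framework = framework_atoms(toy_bonds)
```

For the toy graph, the side chain (atoms 8 and 9) is pruned and atoms 1 to 7 remain. An acyclic molecule yields an empty set under this simplification.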
Public Data Source - BindingDB
BindingDB database:
- Public repository of activity information of small molecules
- ~31,000 compound entries with ~57,000 activity annotations
- 17,745 compounds active against human targets extracted
Analysis Strategy - Compound Sets
Target pair sets:
- Active compounds are organized into target pair sets
- A set contains all compounds active against two individual targets (i.e. compounds might belong to multiple sets)
BindingDB target pair sets:
- Sets obtained for 520 pairs of targets that share >= 5 compounds
- 6,343 compounds active against 259 human targets
PubChem confirmatory bioassays:
- Only 3 relevant human target pairs meet the >= 5 compound criterion
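The construction of target pair sets can be sketched as follows. This is a minimal illustration under assumed inputs: the `annotations` mapping from compound IDs to target names is a hypothetical data layout, and only the >= 5 shared-compound threshold comes from the slide.

```python
from itertools import combinations


def target_pair_sets(annotations, min_shared=5):
    """Map each target pair to the set of compounds active against both.

    `annotations` maps a compound ID to the set of targets it is active
    against. Pairs sharing fewer than `min_shared` compounds are dropped.
    """
    pair_sets = {}
    for compound, targets in annotations.items():
        # A compound with n targets contributes to n*(n-1)/2 pair sets.
        for pair in combinations(sorted(targets), 2):
            pair_sets.setdefault(pair, set()).add(compound)
    return {pair: cpds for pair, cpds in pair_sets.items()
            if len(cpds) >= min_shared}
```

Sorting the targets before pairing gives each pair a single canonical key, so a compound active against three targets lands in three sets, as the slide notes.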
Compound-Based Target Network
520 target pairs are visualized in a network representation
- Nodes: targets
- Edges: target pair sets
- Edge width: number of shared compounds
Densely connected communities:
- 18 communities
- >= 4 targets
- Different target families
[Figure: target network with communities numbered 1 to 18; labeled communities include Ser/Thr kinases, serine proteinases, caspases, tyrosine kinases, and MMPs & CAs]
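A rough sketch of the network representation follows. The slides do not say how the densely connected communities were identified, so the grouping step here uses plain connected components as a crude stand-in; it is an assumption for illustration, not the method of the talk. The `pair_sets` input (target pair mapped to shared compounds) is likewise a hypothetical layout.

```python
def build_network(pair_sets):
    """Adjacency map: target -> {neighbor: number of shared compounds}."""
    network = {}
    for (a, b), compounds in pair_sets.items():
        network.setdefault(a, {})[b] = len(compounds)  # edge width
        network.setdefault(b, {})[a] = len(compounds)
    return network


def connected_groups(network, min_targets=4):
    """Connected components containing at least `min_targets` targets."""
    seen, groups = set(), []
    for start in network:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(n for n in network[node] if n not in group)
        seen |= group
        if len(group) >= min_targets:
            groups.append(group)
    return groups
```

A proper community detection method would split a large component into several dense groups; connected components only give an upper bound on community extent.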
Community-Selective Scaffolds
520 human target pair sets (6,343 BDB compounds; 259 targets); 18 target communities
206 community-selective scaffolds:
- Exclusively act in a single community
- With 5 - 45 compounds/scaffold (av. ~12)
- Yielding 147 distinct carbon skeletons (topological diversity)
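Selecting community-selective scaffolds amounts to keeping scaffolds whose compounds are active in exactly one community. The sketch below assumes a hypothetical record layout of (compound, scaffold, community) tuples and treats the lower bound of 5 compounds per scaffold, taken from the range on the slide, as a filter.

```python
def community_selective_scaffolds(records, min_compounds=5):
    """Return {scaffold: community} for scaffolds confined to one community.

    `records` is an iterable of (compound_id, scaffold, community) tuples,
    one per compound/community membership.
    """
    communities = {}
    compounds = {}
    for compound, scaffold, community in records:
        communities.setdefault(scaffold, set()).add(community)
        compounds.setdefault(scaffold, set()).add(compound)
    return {
        scaffold: next(iter(comms))
        for scaffold, comms in communities.items()
        if len(comms) == 1 and len(compounds[scaffold]) >= min_compounds
    }
```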
Adding Selectivity Information
For each compound active against a target pair (A, B), its target selectivity (TS) is calculated as:
TS_{A/B} = pK_i(A) - pK_i(B)
Compound |TS| values range from 0 to 6.86:
- 0: equal potency, no selectivity
- 6.86: potency difference of nearly 7 orders of magnitude, i.e. highly selective for one target over the other
Selectivity profiles of scaffolds:
- Community-based
- Target-based
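As a worked example of the TS calculation, the sketch below assumes potency is available as Ki in nanomolar and that TS is the difference of the two pKi values; the function names are illustrative only.

```python
import math


def pki(ki_nanomolar):
    """Convert a Ki value in nM to pKi = -log10(Ki in mol/L)."""
    return -math.log10(ki_nanomolar * 1e-9)


def target_selectivity(ki_a_nm, ki_b_nm):
    """TS for target A over target B as the difference of pKi values."""
    return pki(ki_a_nm) - pki(ki_b_nm)


# A compound with Ki = 1 nM against A and 1000 nM against B is
# 1000-fold more potent against A, i.e. TS is about 3.
example_ts = target_selectivity(1.0, 1000.0)
```

Swapping the two targets flips the sign, which is why the community-based profiles use the absolute value |TS|.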
Selectivity Profiles
Community-based selectivity profile:
- For each scaffold found in a given community
- All corresponding compounds active against any target pair in this community pooled
- Median of their absolute TS values determined (median |TS|)
Target-based selectivity profile:
- For each scaffold active against a given target
- All corresponding compounds active against this target pooled
- Selectivity against any other target calculated
- Median of their TS values determined (median TS)
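Both profiles reduce to a median over pooled TS values per scaffold. The sketch below assumes the pooling has already been done and that the pooled values are supplied as a mapping from scaffold to a list of signed TS values (a hypothetical layout).

```python
from statistics import median


def community_profile(ts_by_scaffold):
    """Median |TS| per scaffold from TS values pooled over one community."""
    return {scaffold: median(abs(ts) for ts in values)
            for scaffold, values in ts_by_scaffold.items()}


def target_profile(ts_by_scaffold):
    """Median signed TS per scaffold from TS values pooled for one target."""
    return {scaffold: median(values)
            for scaffold, values in ts_by_scaffold.items()}
```

The community profile discards the sign because it only asks whether a scaffold yields selective compounds at all; the target profile keeps the sign to show in which direction the selectivity points.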
Community Selectivity of Scaffolds
Scaffold / Community heat map:
- Columns: target communities
- Rows: scaffolds
- Color spectrum: median |TS|
- Red: scaffold yields many compounds with different potency against individual targets
- Yellow: scaffold does not yield selective compounds
Non-selective scaffolds:
- Occur in multiple communities
Community-selective scaffolds:
- Exclusively occur in one community
Target Selectivity of Scaffolds
Scaffold / Target heat map:
- Columns: targets in a community
- Rows: scaffolds
- Cell: the scaffold represents >= 5 compounds active against the target
- Color spectrum: median TS
- Red (positive): more selective for the target over others in the community
- Yellow (negative): more selective for other members of the community
Target Selectivity of Scaffolds
Community 3: 16 serine proteases
Different scaffolds display the same selectivity profile:
- e.g. Factor Xa/Thrombin
Scaffolds with no apparent target selectivity
Number of scaffolds per target varies:
- Factor Xa: 17; Thrombin: 18
- Tryptase: 0; Hepsin: 0
Target Selectivity Ranking
Community-selective scaffolds are ranked according to median |TS|
[Figure: ranking plot of median |TS|; axis values 0, 1, 2, and 5.2]
- 111 scaffolds with target-selective tendency
- 37 scaffolds with at least half of their compounds having >= 100-fold potency differences against >= 2 community targets
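The ranking can be sketched as a sort on median |TS| plus a flag for the 100-fold rule. This is a simplified illustration: the input layout (scaffold mapped to a list of |TS| values) is hypothetical, and the additional condition on ">= 2 community targets" from the slide is not modeled.

```python
import math
from statistics import median


def rank_scaffolds(abs_ts_by_scaffold, fold=100.0):
    """Rank scaffolds by median |TS| and flag those meeting the fold rule.

    A scaffold is flagged when at least half of its compounds show a
    potency difference of at least `fold`, i.e. |TS| >= log10(fold).
    Returns (rank, scaffold, median |TS|, flagged) tuples, best first.
    """
    cutoff = math.log10(fold)
    ranked = sorted(abs_ts_by_scaffold.items(),
                    key=lambda item: median(item[1]), reverse=True)
    result = []
    for rank, (scaffold, values) in enumerate(ranked, start=1):
        selective = sum(v >= cutoff for v in values)
        result.append((rank, scaffold, median(values),
                       2 * selective >= len(values)))
    return result
```

A 100-fold potency difference corresponds to |TS| >= 2, which matches the value 2 on the ranking axis.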
Community-Selective Scaffolds
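[Figure: community-selective scaffolds mapped onto targets DPP4, DPP8, CA1, CA2, CA3, CA4, CA5A, CA5B, CA6, CA7, CA9, CA12, CA14; scaffold labels give rank: median |TS|, e.g. 98: 1.10 and 3: 4.03]
Color spectrum: median TS
- Red: high potential to yield target-selective compounds
- Yellow: low potential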
Selectivity Searching (MDDR)
[Figure: scaffolds with substituent positions R1 to R6, shown for thrombin and FXa]
Highly selective for FXa over other serine proteases
Selectivity Searching
[Figure: compound structures for caspase 3 and caspase 7]
Inhibit both caspase 3 and 7 with nM potency; ~200-fold selective over caspases 1, 6, 8
Extending the Analysis: ChemblDB
Recent public domain database: ChemblDB
- ~500,000 compounds with activity information
- 32,848 compounds with high-confidence annotations active against 671 human targets
High-confidence activity annotations:
- Target confidence level: 9
- Interaction type: D(irect)
ftp://ftp.ebi.ac.uk/pub/databases/chembl/latest/
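The high-confidence criteria translate into a simple record filter. The sketch below assumes each activity annotation has already been loaded into a dictionary; the keys `confidence` and `interaction_type` are hypothetical names chosen for illustration and are not the column names of the database.

```python
def high_confidence(records):
    """Keep annotations with target confidence 9 and a direct interaction.

    `records` is a list of dicts with the (hypothetical) keys
    'confidence' and 'interaction_type'.
    """
    return [r for r in records
            if r["confidence"] == 9 and r["interaction_type"] == "D"]
```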
ChemblDB vs. BindingDB
Comparison at different levels:
- Active compounds (human targets)
- Scaffolds
- Network
- Community-selective scaffolds
- Topologically distinct scaffolds
[Venn diagrams]
Compounds: ChemblDB 32,848; BDB 17,745; overlap 3,589
Scaffolds: ChemblDB 12,902; BDB 6,291; overlap 1,409
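The Venn diagram values can be turned into unique and combined counts with a few lines of arithmetic. The sketch assumes that the two larger numbers are the database totals and the third is the overlap between them, which is an interpretation of the figure rather than something the slide states.

```python
def overlap_summary(total_a, total_b, shared):
    """Split two set sizes and their overlap into unique and union counts."""
    return {
        "only_a": total_a - shared,
        "only_b": total_b - shared,
        "union": total_a + total_b - shared,
        "shared_fraction_of_b": shared / total_b,
    }


# Compound level: ChemblDB (a) vs. BDB (b), using the slide values.
compound_overlap = overlap_summary(32848, 17745, 3589)
```

Under this reading, 29,259 compounds occur only in ChemblDB and 14,156 only in BDB, for 47,004 distinct compounds in total; roughly 20% of the BDB compounds are also found in ChemblDB.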
ChemblDB vs. BindingDB
Comparison at the network level
[Figure: BDB and CDB target networks with shared targets and unique targets marked; labeled regions: tyrosine kinases, GPCRs]
ChemblDB vs. BindingDB
Comparison at the scaffold level
[Venn diagrams]
Community-selective scaffolds: ChemblDB 311; BDB 206; overlap 34
Topologically distinct scaffolds: ChemblDB 227; BDB 147; overlap 85
Community-Selective Scaffolds
Distribution in drugs?
- DrugBank: 1,247 approved drugs with 726 unique scaffolds
- Only 11 overlap with 206 community-selective BDB scaffolds
- Community-selective scaffolds currently underrepresented in drugs; opportunities for further chemical exploration
Conclusions
The existence of target class-privileged substructures has remained controversial over the years
From putative privileged substructures to confirmed target community-selective scaffolds through systematic data mining
Community-selective scaffolds are abundant and topologically diverse
A subset of community-selective scaffolds displays a notable tendency to produce compounds with different target selectivity
BDB and CDB contain complementary target and scaffold information