Chemical Data and Computer-Aided Drug Discovery
Mike Gilson, School of Pharmacy, [email protected], 2-0622
Outline
Overview of drug discovery
Structure-based computational methods: when we know the structure of the targeted protein
Ligand-based computational methods: when we don’t know the protein’s structure
What is a drug?
Small Molecule Drugs
Aspirin
Sildenafil (Viagra)
Glipizide (Glucotrol)
Taxol
Digoxin
Darunavir
Nanoparticles (e.g., packaged small-molecule drugs)
Doxil (liposome package, extended circulation time, milder toxicity)
Abraxane (albumin-packaged taxol)
http://www.doxil.com/about_doxil.html http://www.abraxane.com/professional/nab-technology.aspx
Biopharmaceuticals
Erythropoietin (EPO): stabilized variant of a natural protein hormone
Etanercept (Enbrel): protein with TNF receptor + Ab Fc domain; scavenges TNF, diminishes inflammation
http://www.ganfyd.org/index.php?title=Erythropoietin_beta http://en.wikipedia.org/wiki/File:Enbrel.jpg
How are drugs discovered?
Natural Products
Digoxin (foxglove)
Aspirin (willow)
Taxol (Pacific yew)
How Aspirin Works
[Figure: pathway diagram with labels: inflammation, platelet activation, aspirin, platelet inactivation]
lipidlibrary.aocs.org/lipids/eicintro/index.htm
Biomolecular Pathways and Target Selection
E.g., signaling pathways
http://www.isys.uni-stuttgart.de/forschung/sysbio/insulin/index.html
Target protein
Empirical Path to Ligand Discovery
Compound library (commercial, in-house, synthetic, natural)
High-throughput screening (HTS)
Hit confirmation
Lead compounds (e.g., µM Kd)
Lead optimization (medicinal chemistry)
Potent drug candidates (nM Kd)
Animal and clinical evaluation
Compound Libraries
Commercial (also in-house pharma)
Government (NIH)
Academia
Computer-Aided Ligand Design
Aims to reduce number of compounds synthesized and assayed
Lower costs
Less chemical waste
Faster progress
1. We Know the Structure of the Targeted Protein: Structure-Based Ligand Discovery
HIV Protease/KNI-272 complex
Protein-Ligand Docking: Structure-Based Ligand Design
Potential function: energy as a function of structure
[Figure: energy terms illustrated: van der Waals (VDW), dihedral, screened Coulombic (+ -)]
Docking software: search for the structure of lowest energy
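The idea of a potential function plus a search for the lowest-energy structure can be illustrated with a deliberately tiny sketch. It assumes a single pair of atoms, a Lennard-Jones form for the van der Waals term, and a constant dielectric of 4 for the screened Coulombic term; the parameter values and function names are illustrative assumptions, not taken from any particular docking program, and the dihedral term is omitted.

```python
def pair_energy(r, q1, q2, epsilon=0.2, sigma=3.5, dielectric=4.0):
    """Toy pair energy in kcal/mol for a separation r in Angstroms.

    Lennard-Jones (van der Waals) plus Coulomb screened by a constant
    dielectric. All parameter values are illustrative assumptions.
    """
    lj = 4.0 * epsilon * ((sigma / r) ** 12 - (sigma / r) ** 6)
    coulomb = 332.06 * q1 * q2 / (dielectric * r)  # 332.06: e^2/Angstrom to kcal/mol
    return lj + coulomb


def best_distance(q1, q2, r_min=2.0, r_max=10.0, step=0.01):
    """Brute-force 'docking' along one coordinate: the r of lowest energy."""
    n_steps = int(round((r_max - r_min) / step))
    candidates = [r_min + i * step for i in range(n_steps + 1)]
    return min(candidates, key=lambda r: pair_energy(r, q1, q2))


if __name__ == "__main__":
    print("neutral pair:      ", round(best_distance(0.0, 0.0), 2))
    print("oppositely charged:", round(best_distance(1.0, -1.0), 2))
```

Real docking software searches over many more degrees of freedom (ligand position, orientation, and torsions), but the structure is the same: an energy function and a search for its minimum.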
Energy Determines Probability (Stability)
Boltzmann distribution: $p(x) \propto e^{-E(x)/RT}$
[Figure: energy and probability plotted against the coordinate x]
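As a small numerical illustration of the Boltzmann relation $p(x) \propto e^{-E(x)/RT}$, the sketch below converts a list of state energies into normalized probabilities. The value RT = 0.593 kcal/mol (roughly room temperature) and the function name are assumptions made for the example.

```python
import math

RT = 0.593  # kcal/mol near 298 K (assumed for this example)


def boltzmann_probabilities(energies, rt=RT):
    """Return p_i = exp(-E_i/RT) / sum_j exp(-E_j/RT) for each energy."""
    e_min = min(energies)  # shifting by e_min avoids overflow and cancels out
    weights = [math.exp(-(e - e_min) / rt) for e in energies]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    for e, p in zip([0.0, 1.0, 2.0], boltzmann_probabilities([0.0, 1.0, 2.0])):
        print(f"E = {e:.1f} kcal/mol  p = {p:.3f}")
```

Lower-energy states come out more probable, which is the sense in which energy determines stability.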
Structure-Based Virtual Screening
Compound database
3D structure of target (crystallography, NMR, modeling)
Virtual screening (e.g., computational docking)
Candidate ligands
Experimental assay
Ligands
Ligand optimization (med chem, crystallography, modeling)
Drug candidates
Fragmental Structure-Based Screening
“Fragment” library
3D structure of target (crystallography, NMR, modeling)
Fragment docking
Compound design
Experimental assay and ligand optimization (med chem, crystallography, modeling)
Drug candidates
http://www.beilstein-institut.de/bozen2002/proceedings/Jhoti/jhoti.html
Potential Functions for Structure-Based Design
Energy as a function of structure
Physics-based
Knowledge-based
Physics-Based Potentials
Energy terms from physical theory
Van der Waals interactions (shape fitting)
Bonded interactions (shape and flexibility)
Coulombic interactions (charge-charge complementarity)
Hydrogen bonding
Common Simplifications Used in Physics-Based Docking
Quantum effects approximated classically
Protein typically held rigid
Configurational entropy neglected
Influence of water treated crudely
Proteins and Ligands Are Flexible
Protein + Ligand ⇌ Complex, with standard binding free energy ΔG°
Binding Energy and Entropy
[Figure: energy levels of the unbound states ($E_{Free}$) and the bound states ($E_{Bound}$)]
$\Delta G^\circ = -RT \ln K = \underbrace{E_{Bound} - E_{Free}}_{\text{energy part}} + \underbrace{RT \ln 3}_{\text{entropy part}}$
$K = \dfrac{2\,e^{-E_{Bound}/RT}}{6\,e^{-E_{Free}/RT}}$
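The split of the binding free energy into an energy part and an entropy part can be checked numerically. The sketch below assumes the reading of this slide in which 2 equal-energy bound states compete with 6 equal-energy unbound states, so that $K = 2e^{-E_{Bound}/RT} / 6e^{-E_{Free}/RT}$; the energies, RT = 0.593 kcal/mol, and the function name are assumptions for illustration.

```python
import math

RT = 0.593  # kcal/mol near 298 K (assumed)


def binding_free_energy(e_bound, e_free, n_bound, n_free, rt=RT):
    """Return (dG, energy_part, entropy_part) for degenerate two-level states.

    K is the ratio of Boltzmann-weighted state counts, dG = -RT ln K.
    """
    k = (n_bound * math.exp(-e_bound / rt)) / (n_free * math.exp(-e_free / rt))
    dg = -rt * math.log(k)
    energy_part = e_bound - e_free
    entropy_part = -rt * math.log(n_bound / n_free)  # positive when states are lost
    return dg, energy_part, entropy_part


if __name__ == "__main__":
    dg, energy, entropy = binding_free_energy(-5.0, 0.0, n_bound=2, n_free=6)
    print(f"dG = {dg:.3f} = {energy:.3f} (energy) + {entropy:.3f} (entropy)")
```

With 2 bound and 6 unbound states the entropy part equals $RT \ln 3$, a penalty that a docking score neglecting configurational entropy would miss.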
Structure-Based Discovery: Physics-Oriented Approaches
Weaknesses: fully physical detail becomes computationally intractable; approximations are unavoidable; parameterization still required
Strengths: interpretable, provides guides to design; broadly applicable, in principle at least; clear pathways to improving accuracy
Status: useful, far from perfect; multiple groups working on fewer, better approximations (force fields, quantum; flexibility, entropy; water effects)
Moore’s law: hardware improving
Knowledge-Based Docking Potentials
[Figure: example protein-ligand contacts: histidine with ligand carboxylate; aromatic stacking]
Probability and Energy
Boltzmann: $p(r) \propto e^{-E(r)/RT}$
Inverse Boltzmann: $E(r) = -RT \ln p(r)$
Example: ligand carboxylate O to protein histidine N
1. Find all protein-ligand structures in the PDB with a ligand carboxylate O
2. For each structure, histogram the distances from O to every histidine N
3. Sum the histograms over all structures to obtain $p(r_{\text{O-N}})$
4. Compute $E(r_{\text{O-N}})$ from $p(r_{\text{O-N}})$
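The histogram and inverse-Boltzmann steps above can be sketched as follows. The distances are assumed to have been extracted from structures already and pooled into one list; the bin width, RT value, and function names are illustrative assumptions. Published knowledge-based potentials also normalize against a reference distribution, which this minimal version leaves out.

```python
import math

RT = 0.593  # kcal/mol near 298 K (assumed)


def histogram(distances, r_max=10.0, n_bins=10):
    """Count distances (Angstroms) into n_bins equal bins on [0, r_max)."""
    counts = [0] * n_bins
    width = r_max / n_bins
    for d in distances:
        if 0.0 <= d < r_max:
            counts[int(d / width)] += 1
    return counts


def inverse_boltzmann(counts, rt=RT):
    """E(r) = -RT ln p(r) for each bin; never-observed bins get +infinity."""
    total = sum(counts)
    return [-rt * math.log(c / total) if c > 0 else math.inf for c in counts]


if __name__ == "__main__":
    pooled = [2.7, 2.8, 2.9, 3.1, 5.5, 8.2]  # made-up O-N distances
    for i, e in enumerate(inverse_boltzmann(histogram(pooled))):
        print(f"{i}-{i + 1} A: {e:.2f}")
```

Frequently observed distances map to low energies, and distances that never occur map to an infinite penalty, which is one reason sparse data is a practical concern.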
Knowledge-Based Docking Potentials
“PMF”, Muegge & Martin, J. Med. Chem. 42:791, 1999
[Figure: potentials versus atom-atom distance (Angstroms) for a few types of atom pairs, out of several hundred total]
$E_{prot\text{-}lig} = E_{vdw} + \sum_{pairs\,(ij)} E_{type(ij)}(r_{ij})$
Structure-Based Discovery: Knowledge-Based Potentials
Weaknesses: accuracy limited by availability of data; accuracy may also be limited by overall approach
Strengths: relatively easy to implement; computationally fast
Status: useful, far from perfect; may be at point of diminishing returns
Limitations of Knowledge-Based Potentials
1. Statistical limitations (e.g., to pairwise potentials)
2. Even if we had infinite statistics, would the results be accurate? (Is inverse Boltzmann quite right? Where is entropy?)
[Figure: 10 bins (r1, r2, …, r10) for a histogram of O-N distances; 100 bins for a joint histogram of O-N and O-C distances]
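The jump from 10 bins for one distance to 100 bins for two distances is the start of an exponential growth that limits the statistics. A short sketch, with hypothetical function names and a made-up number of observations:

```python
def n_bins_total(bins_per_coordinate, n_coordinates):
    """Number of bins in a joint histogram over n_coordinates distances."""
    return bins_per_coordinate ** n_coordinates


def mean_counts_per_bin(n_observations, bins_per_coordinate, n_coordinates):
    """Average number of observations available to populate each bin."""
    return n_observations / n_bins_total(bins_per_coordinate, n_coordinates)


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        print(n, n_bins_total(10, n), mean_counts_per_bin(10_000, 10, n))
```

With a fixed amount of structural data, each added coordinate divides the average population per bin by the number of bins per coordinate, which is why such potentials are usually restricted to pairwise terms.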
2. We Lack the Structure of the Targeted Protein: Ligand-Based Discovery
Using knowledge of existing inhibitors to discover more
E.g., MAP kinase inhibitors
Scenarios for Ligand-Based Discovery
Experimental screening generated some ligands, but they don’t bind tightly
A company wants to work around another company’s chemical patents
An otherwise promising compound is toxic, is not well-absorbed, etc.
Ligand-Based Virtual Screening
Compound library
Known ligands
Molecular similarity, machine learning, etc.
Candidate ligands
Assay
Actives
Optimization (med chem, crystallography, modeling)
Potent drug candidates
Sources of Data on Known Ligands
Journals, e.g., J. Med. Chem.
Some Binding and Chemical Activity Databases
PubChem (NIH): pubchem.ncbi.nlm.nih.gov
ChEMBL (EMBL): www.ebi.ac.uk/chembl
BindingDB (UCSD): www.bindingdb.org
BindingDB
www.bindingdb.org
Finding Protein-Ligand Data in BindingDB
E.g., by name of protein “target”
E.g., by ligand draw search
Sample Query Results
BindingDB to PDB
PDB to BindingDB
Sample Query Results
Download data in machine-readable format
Machine-Readable Chemical Format: Structure-Data File (SDF)
PDB format lacks chemical bonding
SDF format defines chemical bonds
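To make the point about bonds concrete, here is a minimal sketch of reading the atom and bond blocks of one SDF record in the fixed-width V2000 layout. The example record is hand-written for illustration, and the parser handles only the fields shown (no charges, properties, or data items); it is a sketch, not a replacement for a cheminformatics toolkit.

```python
EXAMPLE_MOLBLOCK = "\n".join([
    "water",
    "  hand-written example",
    "",
    "  3  2  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0",
    "    0.9572    0.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0",
    "   -0.2400    0.9266    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0",
    "  1  2  1  0  0  0  0",
    "  1  3  1  0  0  0  0",
    "M  END",
    "$$$$",
])


def parse_molblock(text):
    """Return (atoms, bonds) from a V2000 mol block.

    atoms: list of (element, (x, y, z)); bonds: list of (atom1, atom2, order)
    with 1-based atom numbers, as written in the file.
    """
    lines = text.splitlines()
    counts = lines[3]  # the fourth line holds the atom and bond counts
    n_atoms, n_bonds = int(counts[0:3]), int(counts[3:6])
    atoms = []
    for line in lines[4:4 + n_atoms]:
        xyz = (float(line[0:10]), float(line[10:20]), float(line[20:30]))
        atoms.append((line[31:34].strip(), xyz))
    bonds = []
    for line in lines[4 + n_atoms:4 + n_atoms + n_bonds]:
        bonds.append((int(line[0:3]), int(line[3:6]), int(line[6:9])))
    return atoms, bonds


if __name__ == "__main__":
    atoms, bonds = parse_molblock(EXAMPLE_MOLBLOCK)
    print(atoms)
    print(bonds)
```

The bond block is what lets software recover the chemical graph directly, rather than guessing connectivity from interatomic distances.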
There Are Many Other Chemical File Formats
Interconvert with Babel
Chemical Similarity: Ligand-Based Drug Discovery
Compounds (available/synthesizable)
Compare with known ligands
Similar: test experimentally
Different: don’t bother
Chemical Fingerprints: Binary Structure Keys
[Figure: bit strings for Molecule 1 and Molecule 2, one bit per structure key]
Keys: phenyl, methyl, ketone, carboxylate, amide, aldehyde, chlorine, fluorine, ethyl, naphthyl, S-S bond, alcohol, …
Chemical Similarity from Fingerprints
Tanimoto similarity or Jaccard index, T
$T = N_I / N_U = 0.25$
$N_I = 2$ (intersection), $N_U = 8$ (union)
[Figure: fingerprints of Molecule 1 and Molecule 2]
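The Tanimoto calculation, $T = N_I / N_U$, is short enough to write out. In this sketch each fingerprint is represented as the set of structure keys that are switched on; the two key sets are hypothetical, chosen only so that the intersection has 2 keys and the union has 8, matching the counts on the slide.

```python
def tanimoto(keys_a, keys_b):
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    a, b = set(keys_a), set(keys_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


if __name__ == "__main__":
    # Hypothetical key assignments, not read from the slide's figure.
    molecule_1 = {"phenyl", "methyl", "ketone", "carboxylate", "amide"}
    molecule_2 = {"phenyl", "methyl", "aldehyde", "chlorine", "fluorine"}
    print(tanimoto(molecule_1, molecule_2))
```

T ranges from 0 (no keys in common) to 1 (identical fingerprints).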
Hashed Chemical Fingerprints
Based upon paths in the chemical graph
1-atom paths: C F N H S O
2-atom paths: F-C C-C C-N C-S S-O C-H
3-atom paths: F-C-C C-C-N C-N-H C-S-O
etc.
Each path sets a pseudo-random bit pattern in a very long molecular fingerprint
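A path-based hashed fingerprint can be sketched in a few lines. This version assumes a molecule given as a list of element symbols plus a list of bonded index pairs, ignores bond orders and hydrogens, and uses CRC32 as a stand-in for the pseudo-random hash; the bit length, bits per path, and function names are illustrative assumptions rather than any toolkit's actual scheme.

```python
import zlib


def enumerate_paths(atoms, bonds, max_atoms=3):
    """Return the set of linear paths (e.g. 'C-C-N') containing up to max_atoms atoms."""
    neighbors = {i: [] for i in range(len(atoms))}
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)
    paths = set()

    def extend(path):
        symbols = [atoms[i] for i in path]
        # A path and its reverse are the same feature: keep one spelling.
        paths.add(min("-".join(symbols), "-".join(reversed(symbols))))
        if len(path) < max_atoms:
            for nxt in neighbors[path[-1]]:
                if nxt not in path:
                    extend(path + [nxt])

    for start in range(len(atoms)):
        extend([start])
    return paths


def hashed_fingerprint(atoms, bonds, n_bits=1024, bits_per_path=2, max_atoms=3):
    """Return the set of bit positions switched on by the molecule's paths."""
    on_bits = set()
    for path in enumerate_paths(atoms, bonds, max_atoms):
        for k in range(bits_per_path):
            on_bits.add(zlib.crc32(f"{k}:{path}".encode()) % n_bits)
    return on_bits


if __name__ == "__main__":
    atoms, bonds = ["F", "C", "C", "N"], [(0, 1), (1, 2), (2, 3)]
    print(sorted(enumerate_paths(atoms, bonds)))
    print(sorted(hashed_fingerprint(atoms, bonds)))
```

Because different paths can hash to the same bit, a set bit does not prove that a particular path is present; the fingerprint trades exactness for a fixed length that needs no predefined key list.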
Maximum Common Substructure
Ncommon=34
Potential Drawbacks of Plain Chemical Similarity
May miss good ligands by being overly conservative
Too much weight on irrelevant details
Scaffold Hopping
Zhao, Drug Discovery Today 12:149, 2007
Identification of synthetic statins by scaffold hopping
Abstraction and Identification of Relevant Compound Features
Ligand shape
Pharmacophore models
Chemical descriptors
Statistics and machine learning
Pharmacophore Models
Φάρμακο (drug) + Φορά (carry)
A 3-point pharmacophore
[Figure: three features (+1 charge, bulky hydrophobe, aromatic) separated by 5.0 ± 0.3 Å, 3.2 ± 0.4 Å, and 2.8 ± 0.3 Å]
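Checking a candidate against a 3-point pharmacophore reduces to comparing inter-feature distances with their tolerances. In the sketch below, which distance belongs to which pair of features is an assumption (the figure's layout is not recoverable from the text), as are the feature names and coordinates.

```python
import math

# Assumed pairing of the three distances (Angstroms) with feature pairs.
PHARMACOPHORE = {
    ("positive", "hydrophobe"): (5.0, 0.3),
    ("positive", "aromatic"): (3.2, 0.4),
    ("hydrophobe", "aromatic"): (2.8, 0.3),
}


def matches(features, model=PHARMACOPHORE):
    """True if every feature-pair distance lies within its tolerance.

    features maps a feature name to its (x, y, z) position in Angstroms
    for one conformation of the candidate molecule.
    """
    for (a, b), (target, tolerance) in model.items():
        if abs(math.dist(features[a], features[b]) - target) > tolerance:
            return False
    return True


if __name__ == "__main__":
    candidate = {
        "positive": (0.0, 0.0, 0.0),
        "hydrophobe": (5.0, 0.0, 0.0),
        "aromatic": (2.74, 1.653, 0.0),
    }
    print(matches(candidate))
```

In practice the check is repeated over many conformations of each compound, since a flexible molecule may satisfy the model in only some of them.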
Molecular Descriptors
More abstract than chemical fingerprints
Physical descriptors: molecular weight; charge; dipole moment; number of H-bond donors/acceptors; number of rotatable bonds; hydrophobicity (log P and clogP)
Topological: branching index; measures of linearity vs. interconnectedness
Etc.
[Figure: rotatable bonds]
A High-Dimensional “Chemical Space”
Each compound is at a point in an n-dimensional space
Compounds with similar properties are near each other
[Figure: a point representing a compound in descriptor space, with axes Descriptor 1, Descriptor 2, Descriptor 3]
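Nearness in descriptor space can be made concrete with a distance calculation. The sketch below standardizes each descriptor first, because raw descriptors have very different scales (molecular weight in the hundreds, log P near 1), and then finds a compound's nearest neighbor by Euclidean distance. The descriptor values are made up for illustration, and the choice of Euclidean distance on z-scores is one reasonable assumption among several.

```python
import math


def standardize(rows):
    """Scale each descriptor (column) to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [sum(c) / len(c) for c in columns]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0  # guard constant columns
        for c, m in zip(columns, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def nearest_neighbor(index, rows):
    """Index of the compound closest to rows[index] in standardized space."""
    scaled = standardize(rows)
    others = [i for i in range(len(rows)) if i != index]
    return min(others, key=lambda i: math.dist(scaled[index], scaled[i]))


if __name__ == "__main__":
    # Made-up descriptors: molecular weight, log P, number of H-bond donors.
    compounds = [[180.0, 1.2, 1], [182.0, 1.3, 1], [450.0, 4.5, 3]]
    print(nearest_neighbor(0, compounds))
```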
Statistics and Machine Learning: Some Examples
Partial least squares
Support vector machines
Genetic algorithms for descriptor-selection
Summary
Overview of drug discovery
Computer-aided methods: structure-based, ligand-based
Interaction potentials: physics-based, knowledge-based (data driven)
Ligand-protein databases, machine-readable chemical formats
Ligand similarity and beyond
Mike Gilson, School of Pharmacy, [email protected], 2-0622
Activities and Discussion Topics
BindingDB: Advil, machine-readable format, binding activities
PDB/BindingDB: 2ONY at PDB, BindingDB, substructure search, related data
Similarity search
Combined computational approaches: (physics + knowledge)-based docking potentials; (ligand + structure)-based computational discovery
Other data-driven methods where it may be hard to get enough statistics
Validation of computational methods
Protein-ligand databases: getting data and assessing data quality