bioinformatics drug design
TRANSCRIPT
-
8/21/2019 Bioinformatics Drug Design
1/62
Case Study #1
Use of bioinformatics in drug
development and diagnostics
Ashok Kolaskar
University of Pune, Pune 411 007,India.
http://www.dbbm.fiocruz.br/class/faculty.html -
8/21/2019 Bioinformatics Drug Design
2/62
Bringing a New Drug to Market
Review and approval by Food& Drug Administration
1compound
approvedPhase III: Confirms effectiveness and monitors
adverse reactions from long-term use in 1,000 to
5,000 patient volunteers.
Phase II: Assesses effectiveness andlooks for side effects in 100 to 500 patient
volunteers.
Phase I: Evaluates safety and dosage
in 20 to 100 healthy human volunteers.
5 compounds enter
clinical trials
Discovery and preclininal testing:Compounds are identified and evaluated
in laboratory and animal studies for
safety, biological activity, and formulation.
5,000 compoundsevaluated
0 2 4 6 8 10 12 14 Years 16
Source: Tufts Center for the Study of Drug Development
-
8/21/2019 Bioinformatics Drug Design
3/62
Biological Research in 21st
CenturyThe new paradigm, now emerging is that all
the 'genes' will be known (in the sense of
being resident in databases availableelectronically), and that the starting "point
of a biological investigation will be
theoretical.- Walter Gilbert
-
8/21/2019 Bioinformatics Drug Design
4/62
Identify target
Clone gene encoding target
Rational Approach to
Drug Discovery
Express target in recombinant form
-
8/21/2019 Bioinformatics Drug Design
5/62
Synthesize
modificationsof lead
compounds
Identify lead
compounds
Screen
recombinanttarget with
available
inhibitors
Crystal
structures oftarget and
target/inhibitor
complexes
-
8/21/2019 Bioinformatics Drug Design
6/62
Preclinical trials
Identify leadcompounds
Toxicity &
pharmacokinetic
studies
Synthesize
modificationsof lead
compounds
-
8/21/2019 Bioinformatics Drug Design
7/62
Case study: Malaria
P. falciparumpresents a fascinating modelsystem. The life-cycle complexity is both a
challenge and an opportunity.
Unfortunately at the molecular level much
remains unknown.
Its genome, currently being sequenced, is
already yielding valuable data.
-
8/21/2019 Bioinformatics Drug Design
8/62
Malarial Drugs
Up until now, nearly all prophylaxis and
therapeutic intervention has been based on
traditional medicines or their derivatives egquinine, paraquine, chloroquine etc.
Effective also have been type I antifolates eg
sulphones and sulphonamides which mimic
PABA, and the type II antifolates egpyrimethamine, trimethoprim and proguanil which
mimic dihydrofolate.
-
8/21/2019 Bioinformatics Drug Design
9/62
List of Drugs
ChloroquineAvloclor ; Nivaquine
Antifolate
Pyrimethamine (Daraprim)Proguanil (Paludrine)
Combination drugs
Pyrimethamine & Sulfadoxine combination eg.:Fansidar
Resistance to these drugs poses a threat to
control morbidity and mortality of malaria.
-
8/21/2019 Bioinformatics Drug Design
10/62
Drug Resistance (DR)
Mechanisms of drug resistance Formation of altered target
Target has decreased affinity for substrate/analogue
Decreased access to target
Mutations decrease membrane permeability Specific transport systems altered/deleted
Increased level of enzymes which cleave substrates
e.g. Increased expression of -lactamase
Gene amplification Decreased activation of drug
Drugs require activation by enzymes near site
Activation pathway suppressed/deleted
-
8/21/2019 Bioinformatics Drug Design
11/62
Malaria: Novel drug combinations
Structural formula: Atovaquone & Proguanil
-
8/21/2019 Bioinformatics Drug Design
12/62
Drugs targeting Dihydrofolate reductase
and Dihydroopteroate synthethase
Structural formula of Lumefantrine (Benflumethol)
-
8/21/2019 Bioinformatics Drug Design
13/62
Artemisinin derivatives as drugs
Structural formula of Arteminsinins
-
8/21/2019 Bioinformatics Drug Design
14/62
Quinolines
Structural formula of Pyronaridine
-
8/21/2019 Bioinformatics Drug Design
15/62
Novel drug combinations
Structural formula of Chlorproguanil and Dapsone
-
8/21/2019 Bioinformatics Drug Design
16/62
Dihydrofolate reductase inhibitors
Structural formula of Pyrimethamine & Cycloguanil
-
8/21/2019 Bioinformatics Drug Design
17/62
Inhibitors of phospholipid metabolism
Structural formula of E13 and G23
-
8/21/2019 Bioinformatics Drug Design
18/62
An Ideal Target
Is generally an enzyme/receptor in a pathway andits inhibition leads to either killing a pathogenicorganism (Malarial Parasite) or to modify someaspects of metabolism of body that is functioning
dormally. An ideal target
Is essential for the survival of the organism.
Located at a critical step in the metabolic pathway.
Makes the organism vulnerable.
Concentration of target gene product is low.
The enzyme amenable for simple HTS assays
-
8/21/2019 Bioinformatics Drug Design
19/62
How Bioinformatics can help in
Target Identification? Homologous & Orthologous genes
Gene Order
Gene Clusters
Molecular Pathways & Wire diagrams
Gene Ontology
Identification of Unique Genes of Parasite
as potential drug target.
-
8/21/2019 Bioinformatics Drug Design
20/62
Comparative Genomics of Malarial
Parasites: Source for identification of
new target molecules. Genome comparisons of malarial parasites of
human.
Genome comparisons of malarial parasites ofhuman and rodent.
Comparison of genomes of
Human
Malarial parasite
Mosquito
-
8/21/2019 Bioinformatics Drug Design
21/62
What one should look for?
Human
P.f
Mosquito
Proteins that are shared by
All genomes
Exclusively by Human & P.f.Exclusively by Human &
Mosquito
Exclusively by P.f. & Mosquito
Unique proteins in
Human
P.f. Targets for
anti-malarial drugs
Mosquito
-
8/21/2019 Bioinformatics Drug Design
22/62
What is Structural Genomics?
Organise all known proteins into families.
Determine structures of at least one member
of every family. Solve structures of more than 10,000 proteinin next 10 years.
Generate knowledge and rules from known
protein structures. Apply this knowledge to predict the structure
of each and every protein of knownorganisms.
-
8/21/2019 Bioinformatics Drug Design
23/62
Objectives of Structural Genomics
Selection of Targets for structure
determination to obtain maximum
information return on total efforts
Develop mechanism that facilitates
cooperation and prevent workduplication
-
8/21/2019 Bioinformatics Drug Design
24/62
Impact of Structural Genomics on
Drug Discovery
Dry, S. et. al. (2000) Nat. Struc.Biol. 7:976-949.
-
8/21/2019 Bioinformatics Drug Design
25/62
Drug Development Flowchart
Check if structure is known
If unknown, model it using
KNOWLEDGE-BASED HOMOLOGYMODELING APPROACH.
Search for small molecules/ inhibitors
Structure-based Drug Design Drug-Protein Interactions
Docking
-
8/21/2019 Bioinformatics Drug Design
26/62
Why Modeling?
Experimental determination of structureis still a time consuming and expensive
process.
Number of known sequences are more
than number of known structures.
Structure information is essential in
understanding function.
-
8/21/2019 Bioinformatics Drug Design
27/62
Sequence identities &Molecular Modeling methods
Methods Sequence Identitywith knownstructures
ab in i t io 0-20%
Fold recognition 20-35%
Homology Modeling >35%
-
8/21/2019 Bioinformatics Drug Design
28/62
What is Homology or ComparativeModeling?
Comparisons of the tertiary structures ofhomologous proteins have shown that 3-Dstructures have been better conserved duringevolution than protein primary structures.
In the absence of experimental data, model-building on the basis of the known threedimensional structure of homologous
protein(s) is the only reliable method to obtainstructural information.
-
8/21/2019 Bioinformatics Drug Design
29/62
Difference between Homology andSimilarity
Homology does not necessarily imply similarity.
Homology has a precise definition: having a
common evolutionary origin.
Since homology is a qualitative description ofthe relationship, the term % homologyhas no
meaning.
Supporting data for a homologous relationshipmay include sequence or structural similarities,
which can be described in quantitative terms.
-
8/21/2019 Bioinformatics Drug Design
30/62
100%
75%
50%
25%
Sequence Identity Model Accuracy
Rate limiting factors in modeling
CPU time to model
Quality of model & Loop modeling
Errors in the sequence alignment
Detection of homology
-
8/21/2019 Bioinformatics Drug Design
31/62
What is Remote HomologyModeling?
Modeling based on low levels of sequence
identity (
-
8/21/2019 Bioinformatics Drug Design
32/62
Steps in Homology Modeling
template recognition
alignment
backbone generation
generation of canonical loops (data based)
side chain generation & optimisation
model optimisation (energy minimisation)
model verification
Optional repeat of previous steps: Generating
more than one model.
STRUCTURE BASED DRUG
-
8/21/2019 Bioinformatics Drug Design
33/62
STRUCTURE-BASED DRUG
DESIGNCompound
databases,
Microbial broths,
Plants extracts,
Combinatorial
Libraries
3-D ligandDatabases
Docking
Linking or
Binding
Receptor-Ligand
Complex
Random
screening
synthesis
Lead molecule
3-D QSAR
Target Enzyme
OR Receptor
3-D structure by
Crystallography,
NMR, electronmicroscopy OR
Homology Modeling
Redesign
to improve
affinity,
specificity etc.
Testing
-
8/21/2019 Bioinformatics Drug Design
34/62
Binding Site Analysis
In the absence of a structure of Target-
ligand complex, it is not a trivial exercise to
locate the binding site!!!
This is followed by Lead optimization.
-
8/21/2019 Bioinformatics Drug Design
35/62
Lead Optimization
Lead Lead OptimizationActive site
-
8/21/2019 Bioinformatics Drug Design
36/62
Factors Affecting The Affinity Of A Small
Molecule For A Target Protein
LIGAND.wat n+PROTEIN.watn LIGAND.PROTEIN.watp+(n+m-p) wat
HYDROGEN BONDING
HYDROPHOBIC EFFECT
ELECTROSTATIC INTERACTIONS
VAN DER WAALS INTERACTIONS
STRAIN IN THE LIGAND ( BOUND)STRAIN IN THE PROTEIN
-
8/21/2019 Bioinformatics Drug Design
37/62
DIFFERENCE BETWEEN AN INHIBITOR AND DRUG
Extra requirement of a drug compared to an inhibitor
LIPINSKIS RULE OF FIVE
Poor absorption or permeation are more
likely when :
-There are more than five H-bond donors
-The mol.wt is over 500 Da-The MlogP is over 4.15(or CLOG P>5)
-The sums of Ns and Os is over 10
SelectivityLess Toxicity
Bioavailability
Slow ClearanceReach The Target
Ease Of Synthesis
Low Price
Slow Or No Development Of Resistance
Stability Upon Storage As Tablet Or Solution
Pharmacokinetic Parameters
No Allergies
THERMODYNAMICS OF RECEPTOR-LIGAND BINDING
-
8/21/2019 Bioinformatics Drug Design
38/62
O N CS O C O G N N NG
Proteins that interact with drugs are typically enzymes orreceptors.
Drug may be classified as: substrates/inhibitors(for enzymes)
agonists/antagonists(for receptors)
Ligands for receptors normally bind via a non-covalent reversiblebinding.
Enzyme inhibitors have a wide range of modes:non-covalentreversible,covalent reversible/irreversible or suicide inhibition.
Enzymes prefer to bind transition states (reaction intermediates)and may not optimally bind substrates as part of energy used forcatalysis.
In contrast, inhibitors are designed to bind with higher affinity:their affi nities often exceed the corresponding substrate affinities
by several orders of magnitude!
Agonists are analogous to enzyme substrates: part of the binding
energy may be used for signal transduction, inducing aconformation or aggregation shift.
To understand what forces are responsible for ligands binding to
-
8/21/2019 Bioinformatics Drug Design
39/62
To understand what forces are responsible for ligands binding to
Receptors/Enzymes,
It is worthwhile considering what forces drive protein folding they
share many common features.The observed structure of Protein is generally a consequence of the
hydrophobic effect!
Secondary amides form much stronger H-bonds to water than to other
sec. Amides hydrophobic collapse
Proteins generally bury hydrophobic residues inside the core,while
exposing hydrophilic residues to the exterior Salt-bridges inside
Ligand building clefts in proteins often expose hydrophobic residues tosolvent and may contain partially desolvated hydrophilic groups that are
not paired:
The desolvation penalty is paid for by favourable (hydrophobic)
interaction elsewhere in the structure.
-
8/21/2019 Bioinformatics Drug Design
40/62
Docking Methods
Docking of ligands to proteins is a
formidable problem since it entails
optimization of the 6 positional degrees offreedom.
Rigid vs Flexible
Speed vs Reliability
Manual Interactive Docking
-
8/21/2019 Bioinformatics Drug Design
41/62
GRID Based Docking Methods
Grid Based methods
GRID (Goodford, 1985, J. Med. Chem. 28:849)
GREEN (Tomioka & Itai, 1994, J. Comp.Aided. Mol. Des. 8:347)
MCSS (Mirankar & Karplus, 1991, Proteins,11:29).
Functional groups are placed at regularly spaced(0.3-0.5A) lattice points in the active site and theirinteraction energies are evaluated.
-
8/21/2019 Bioinformatics Drug Design
42/62
Automated Docking Methods
Basic Idea is to fill the active site of theTarget protein with a set of spheres.
Match the centre of these spheres as good aspossible with the atoms in the database ofsmall molecules with known 3-D structures.
Examples:
DOCK, CAVEAT, AUTODOCK, LEGEND,ADAM, LINKOR, LUDI.
-
8/21/2019 Bioinformatics Drug Design
43/62
Folate Biosynthetic
pathway
DHFR
CLUSTAL W (1.81) multiple sequence alignment
-
8/21/2019 Bioinformatics Drug Design
44/62
chabaudi -----------------------E--KAGCFSNKTFKGLGNEGGLPWKCNSVDMKHFSSV35
vinckei -----------AICACCKVLNSNE--KASCFSNKTFKGLGNAGGLPWKCNSVDMKHFVSV47
berghei MEDLSETFDIYAICACCKVLNDDE--KVRCFNNKTFKGIGNAGVLPWKCNLIDMKYFSSV58
yoelii -----------AICACCKVINNNE--KSGSFNNKTFNGLGNAGMLPWKYNLVDMNYFSSV47
vivax MEDLSDVFDIYAICACCKVAPTSEGTKNEPFSPRTFRGLGNKGTLPWKCNSVDMKYFSSV60
falciparum -------------------------KKNEVFNNYTFRGLGNKGVLPWKCNSLDMKYFCAV35
* *. **.*:********:**::*:*
chabaudi TSYVNETNYMRLKWKRDRYMEK---------NNVKLNTDGIPSVDKLQNIVVMGKASWES86
vinckei TSYVNENNYIRLKWKRDKYIKE---------NNVKVNTDGIPSIDKLQNIVVMGKTSWES98
berghei TSYINENNYIRLKWKRDKYMEKHNLK-----NNVELNTNIISSTNNLQNIVVMGKKSWES113
yoelii TSYVNENNYIRLQWKRDKYMGKNNLK-----NNAELNNGELN--NNLQNVVVMGKRNWDS100
vivax TTYVDESKYEKLKWKRERYLRMEASQGGGDNTSGGDNTHGGDNADKLQNVVVMGRSSWES120
falciparum TTYVNESKYEKLKYKRCKYLNKET----------VDNVNDMPNSKKLQNVVVMGRTNWES85
*:*::*.:*:*::**:*: * .:***:****: .*:*
chabaudi IPSKFKPLQNRINIILSRTLKKEDLAKEYN------NVIIINSVDDLFPILKCIKYYKCF140
vinckei IPSKFKPLENRINIILSRTLKKENLAKEYS------NVIIIKSVDELFPILKCIKYYKCF152berghei IPKKFKPLQNRINIILSRTLKKEDIVNENN--NENNNVIIIKSVDDLFPILKCTKYYKCF171
yoelii IPPKFKPLQNRINIILSRTLKKEDIANEDNKNNENGTVMIIKSVDDLFPILKAIKYYKCF160
vivax IPKQYKPLPNRINVVLSKTLTKEDVK---------EKVFIIDSIDDLLLLLKKLKYYKCF171
falciparum IPKKFKPLSNRINVILSRTLKKEDFD---------EDVYIINKVEDLIVLLGKLNYYKCF136
**::*******::**:**.**:. ***..:::*: :* :*****
chabaudi I----------------------------------------------------------- 141
vinckei IIGGASVYKEFLDRNLIKKIYFTRINNAYT------------------------------ 182
berghei IIGGSSVYKEFLDRNLIKKIYFTRINNSYNCDVLFPEINENLFKITSISDVYYSNNTTLD231
yoelii IIGGSYVYKEFLDRNLIKKIYFTRINNSYN------------------------------ 190vivax IIGGAQVYRECLSRNLIKQIYFTRINGAYPCDVFFPEFDESQFRVTSVSEVYNSKGTTLD231
falciparum I----------------------------------------------------------- 137
*
chabaudi ---------
vinckei ---------
berghei FIIYSKTKE240
yoelii ---------
vivax FLVYSKVGG240
falciparum ---------
Multiple alignment of DHFR of
Plasmodium species
-
8/21/2019 Bioinformatics Drug Design
45/62
Drug binding pocket of L . caseiDHFR
Antifolate drugs in the active site of DHFR L
-
8/21/2019 Bioinformatics Drug Design
46/62
Antifolate drugs in the active site of DHFR L.
casei to show hydrogen bonding with
surrounding residues
MTX
TMP
PYR
SO3
-
8/21/2019 Bioinformatics Drug Design
47/62
Prediction & Design of New Drugs
Prediction of 3-D PfDHFR using bacterial DHFR
and homology modeling approach.
Search for the compounds using bifunctional basic
groups that could form stable H-bonds in a planewith carboxyl group.
Optimize the structure of small molecules and
then dock them on PfDHFR model. Toyoda et. al. (1997). BBRC 235:515-519 could
identify two compounds.
-
8/21/2019 Bioinformatics Drug Design
48/62
How molecular modeling could be
used in identifying new leads These two compounds
a triazinobenzimidazole&
a pyridoindolewere found to
be active with highKi against
recombinant wild type
DHFR.
Thus demonstrate use of
molecular modeling in
malarial drug design.
Additional Drug Target: glutathione-GR
-
8/21/2019 Bioinformatics Drug Design
49/62
Additional Drug Target: glutathione GR
Glutathione-GR
-
8/21/2019 Bioinformatics Drug Design
50/62
Additional Drug
Target:
Thioredoxinreductase (TrxR)
Cancer cell growth appears to
-
8/21/2019 Bioinformatics Drug Design
51/62
Cancer cell growth appears to
be related to evolutionary
development of plump fruits
and vegetables
Large tomatoes can evolve from wild, blueberry-size
tomatoes. The genetic mechanism responsible for this is
similar to the one that proliferates cancer cells inmammalians.
This is a case where we found a connection between
agricultural research, in how plants make edible fruit
and how humans become susceptible to cancer. That's a
connection nobody could have made in the past.
Cornell University News, July 2000
Si f T t F it
-
8/21/2019 Bioinformatics Drug Design
52/62
Size of TomatoFruit
fw2.2: A Quantitative Trait Locus (QTL) key to the Evolution of Tomato Fruit
Size. Anne Frary (2000) Science, 289: 85-88
Single gene, ORFX, that is
responsible for QTL has asequence and structural
similarity to the human
oncogene c-H-rasp21.
Fruit size alterations, imparted
by fw2.2alleles, are most likely
to be due to the changes in
regulation rather than in
sequence/structure of protein.
G U d t P bli
-
8/21/2019 Bioinformatics Drug Design
53/62
Genome Update: Public
domain
Published Complete Genomes: 59
Archaeal 9
Bacterial 36
Eukaryal 14
Ongoing Genomes: 335
Prokaryotic 203
Eukaryotic 132
Private sector holds
data of more than 100
finished & unfinished
genomes.
-
8/21/2019 Bioinformatics Drug Design
54/62
Challenges in Post-Genomic era: Unlocking
Secretes of quantitative variation
For even after genomes have been sequencedand the functions of most genes revealed, we
will have no better understanding of the
naturally occurring variation that determineswhy one person is more disease prone than
another, or why one variety of tomato yields
more fruit than the next. Identifying genes like fw2.2 is a critical first
step toward attaining this understanding.
-
8/21/2019 Bioinformatics Drug Design
55/62
Value of Genome Sequence Data
Genome sequence data provides, in a rapidand cost effective manner, the primaryinformation used by each organism to carryon all of its life functions.
This data set constitutes a stable, primaryresource for both basic and applied research.
This resource is the essential link required to
efficiently utilize the vast amounts ofpotentially applicable data and expertiseavailable in other segments of the biomedicalresearch community.
-
8/21/2019 Bioinformatics Drug Design
56/62
Challenges
Genome databases have individual geneswith relatively limited functionalannotation (enzymatic reaction,structural role)
Molecular reactions need to be placed inthe context of higher level cellular
functions
Th i S i
-
8/21/2019 Bioinformatics Drug Design
57/62
The omics Series Genomics
Gene identification & charaterisation Transcriptomics
Expression profiles of mRNA
Proteomics
functions & interactions of proteins Structural Genomics
Large scale structure determination
Cellinomics
Metabolic PathwaysCell-cell interactions
Pharmacogenomics
Genome-based drug design
-
8/21/2019 Bioinformatics Drug Design
58/62
Different levels of function
Atomic: Binds ATP
Molecular reaction: adds phosphate
(phosphofructokinase)
Pathway: Gluconeogenesis
Network: Energy metabolism
Typically present in
genome database
Typically, little informatics support
for making these connections
Data Mining: Finding the Needle in
-
8/21/2019 Bioinformatics Drug Design
59/62
Data Mining: Finding the Needle in
the Haystack Data mining refers to the new genre of BI tools used to
sift through the mass of raw data.
DM applications should be able to process -
TEMPORAL(Time studied) and
SPATIAL(Organism, organ, cell type etc) data.The gained knowledgeto reprocess data.
Data using techniques beyond Bayesian (similaritysearch) methods.
An extension of DM is the concept ofKNOWLEDGEDISCOVERY, which open up newavenues of research with new questions and different
perspectives.
-
8/21/2019 Bioinformatics Drug Design
60/62
-
8/21/2019 Bioinformatics Drug Design
61/62
Commercial Structural Genomics
-
8/21/2019 Bioinformatics Drug Design
62/62
Commercial Structural Genomics
Initiatives
IBM (Blue Gene project: 2000) Computational protein folding
Geneformatics (1999)
Modeling for identifying active sites
Prospect Genomics (1999) Homology modeling
Protein Pathways (1999)
Phylogenetic profiling, domain analysis, expression
profiling Structural Bioinformatics Inc (1996)
Homology modeling, docking