bioinformatics drug design

8/21/2019 Bioinformatics Drug Design

1/62

Case Study #1

Use of bioinformatics in drug

development and diagnostics

Ashok Kolaskar

University of Pune, Pune 411 007,India.

[email protected]
http://www.dbbm.fiocruz.br/class/faculty.html


2/62

Bringing a New Drug to Market

Review and approval by Food& Drug Administration

1compound

approvedPhase III: Confirms effectiveness and monitors

adverse reactions from long-term use in 1,000 to

5,000 patient volunteers.

Phase II: Assesses effectiveness andlooks for side effects in 100 to 500 patient

volunteers.

Phase I: Evaluates safety and dosage

in 20 to 100 healthy human volunteers.

5 compounds enter

clinical trials

Discovery and preclininal testing:Compounds are identified and evaluated

in laboratory and animal studies for

safety, biological activity, and formulation.

5,000 compoundsevaluated

0 2 4 6 8 10 12 14 Years 16

Source: Tufts Center for the Study of Drug Development


3/62

Biological Research in 21st

CenturyThe new paradigm, now emerging is that all

the 'genes' will be known (in the sense of

being resident in databases availableelectronically), and that the starting "point

of a biological investigation will be

theoretical.- Walter Gilbert


4/62

Identify target

Clone gene encoding target

Rational Approach to

Drug Discovery

Express target in recombinant form


5/62

Synthesize

modificationsof lead

compounds

Identify lead

compounds

Screen

recombinanttarget with

available

inhibitors

Crystal

structures oftarget and

target/inhibitor

complexes


6/62

Preclinical trials

Identify leadcompounds

Toxicity &

pharmacokinetic

studies

Synthesize

modificationsof lead

compounds


7/62

Case study: Malaria

P. falciparumpresents a fascinating modelsystem. The life-cycle complexity is both a

challenge and an opportunity.

Unfortunately at the molecular level much

remains unknown.

Its genome, currently being sequenced, is

already yielding valuable data.


8/62

Malarial Drugs

Up until now, nearly all prophylaxis and

therapeutic intervention has been based on

traditional medicines or their derivatives egquinine, paraquine, chloroquine etc.

Effective also have been type I antifolates eg

sulphones and sulphonamides which mimic

PABA, and the type II antifolates egpyrimethamine, trimethoprim and proguanil which

mimic dihydrofolate.


9/62

List of Drugs

ChloroquineAvloclor ; Nivaquine

Antifolate

Pyrimethamine (Daraprim)Proguanil (Paludrine)

Combination drugs

Pyrimethamine & Sulfadoxine combination eg.:Fansidar

Resistance to these drugs poses a threat to

control morbidity and mortality of malaria.


10/62

Drug Resistance (DR)

Mechanisms of drug resistance Formation of altered target

Target has decreased affinity for substrate/analogue

Decreased access to target

Mutations decrease membrane permeability Specific transport systems altered/deleted

Increased level of enzymes which cleave substrates

e.g. Increased expression of -lactamase

Gene amplification Decreased activation of drug

Drugs require activation by enzymes near site

Activation pathway suppressed/deleted


11/62

Malaria: Novel drug combinations

Structural formula: Atovaquone & Proguanil


12/62

Drugs targeting Dihydrofolate reductase

and Dihydroopteroate synthethase

Structural formula of Lumefantrine (Benflumethol)


13/62

Artemisinin derivatives as drugs

Structural formula of Arteminsinins


14/62

Quinolines

Structural formula of Pyronaridine


15/62

Novel drug combinations

Structural formula of Chlorproguanil and Dapsone


16/62

Dihydrofolate reductase inhibitors

Structural formula of Pyrimethamine & Cycloguanil


17/62

Inhibitors of phospholipid metabolism

Structural formula of E13 and G23


18/62

An Ideal Target

Is generally an enzyme/receptor in a pathway andits inhibition leads to either killing a pathogenicorganism (Malarial Parasite) or to modify someaspects of metabolism of body that is functioning

dormally. An ideal target

Is essential for the survival of the organism.

Located at a critical step in the metabolic pathway.

Makes the organism vulnerable.

Concentration of target gene product is low.

The enzyme amenable for simple HTS assays


19/62

How Bioinformatics can help in

Target Identification? Homologous & Orthologous genes

Gene Order

Gene Clusters

Molecular Pathways & Wire diagrams

Gene Ontology

Identification of Unique Genes of Parasite

as potential drug target.


20/62

Comparative Genomics of Malarial

Parasites: Source for identification of

new target molecules. Genome comparisons of malarial parasites of

human.

Genome comparisons of malarial parasites ofhuman and rodent.

Comparison of genomes of

Human

Malarial parasite

Mosquito


21/62

What one should look for?

Human

P.f

Mosquito

Proteins that are shared by

All genomes

Exclusively by Human & P.f.Exclusively by Human &

Mosquito

Exclusively by P.f. & Mosquito

Unique proteins in

Human

P.f. Targets for

anti-malarial drugs

Mosquito


22/62

What is Structural Genomics?

Organise all known proteins into families.

Determine structures of at least one member

of every family. Solve structures of more than 10,000 proteinin next 10 years.

Generate knowledge and rules from known

protein structures. Apply this knowledge to predict the structure

of each and every protein of knownorganisms.


23/62

Objectives of Structural Genomics

Selection of Targets for structure

determination to obtain maximum

information return on total efforts

Develop mechanism that facilitates

cooperation and prevent workduplication


24/62

Impact of Structural Genomics on

Drug Discovery

Dry, S. et. al. (2000) Nat. Struc.Biol. 7:976-949.


25/62

Drug Development Flowchart

Check if structure is known

If unknown, model it using

KNOWLEDGE-BASED HOMOLOGYMODELING APPROACH.

Search for small molecules/ inhibitors

Structure-based Drug Design Drug-Protein Interactions

Docking


26/62

Why Modeling?

Experimental determination of structureis still a time consuming and expensive

process.

Number of known sequences are more

than number of known structures.

Structure information is essential in

understanding function.


27/62

Sequence identities &Molecular Modeling methods

Methods Sequence Identitywith knownstructures

ab in i t io 0-20%

Fold recognition 20-35%

Homology Modeling >35%


28/62

What is Homology or ComparativeModeling?

Comparisons of the tertiary structures ofhomologous proteins have shown that 3-Dstructures have been better conserved duringevolution than protein primary structures.

In the absence of experimental data, model-building on the basis of the known threedimensional structure of homologous

protein(s) is the only reliable method to obtainstructural information.


29/62

Difference between Homology andSimilarity

Homology does not necessarily imply similarity.

Homology has a precise definition: having a

common evolutionary origin.

Since homology is a qualitative description ofthe relationship, the term % homologyhas no

meaning.

Supporting data for a homologous relationshipmay include sequence or structural similarities,

which can be described in quantitative terms.


30/62

100%

75%

50%

25%

Sequence Identity Model Accuracy

Rate limiting factors in modeling

CPU time to model

Quality of model & Loop modeling

Errors in the sequence alignment

Detection of homology


31/62

What is Remote HomologyModeling?

Modeling based on low levels of sequence

identity (


32/62

Steps in Homology Modeling

template recognition

alignment

backbone generation

generation of canonical loops (data based)

side chain generation & optimisation

model optimisation (energy minimisation)

model verification

Optional repeat of previous steps: Generating

more than one model.

STRUCTURE BASED DRUG


33/62

STRUCTURE-BASED DRUG

DESIGNCompound

databases,

Microbial broths,

Plants extracts,

Combinatorial

Libraries

3-D ligandDatabases

Docking

Linking or

Binding

Receptor-Ligand

Complex

Random

screening

synthesis

Lead molecule

3-D QSAR

Target Enzyme

OR Receptor

3-D structure by

Crystallography,

NMR, electronmicroscopy OR

Homology Modeling

Redesign

to improve

affinity,

specificity etc.

Testing


34/62

Binding Site Analysis

In the absence of a structure of Target-

ligand complex, it is not a trivial exercise to

locate the binding site!!!

This is followed by Lead optimization.


35/62

Lead Optimization

Lead Lead OptimizationActive site


36/62

Factors Affecting The Affinity Of A Small

Molecule For A Target Protein

LIGAND.wat n+PROTEIN.watn LIGAND.PROTEIN.watp+(n+m-p) wat

HYDROGEN BONDING

HYDROPHOBIC EFFECT

ELECTROSTATIC INTERACTIONS

VAN DER WAALS INTERACTIONS

STRAIN IN THE LIGAND ( BOUND)STRAIN IN THE PROTEIN


37/62

DIFFERENCE BETWEEN AN INHIBITOR AND DRUG

Extra requirement of a drug compared to an inhibitor

LIPINSKIS RULE OF FIVE

Poor absorption or permeation are more

likely when :

-There are more than five H-bond donors

-The mol.wt is over 500 Da-The MlogP is over 4.15(or CLOG P>5)

-The sums of Ns and Os is over 10

SelectivityLess Toxicity

Bioavailability

Slow ClearanceReach The Target

Ease Of Synthesis

Low Price

Slow Or No Development Of Resistance

Stability Upon Storage As Tablet Or Solution

Pharmacokinetic Parameters

No Allergies

THERMODYNAMICS OF RECEPTOR-LIGAND BINDING


38/62

O N CS O C O G N N NG

Proteins that interact with drugs are typically enzymes orreceptors.

Drug may be classified as: substrates/inhibitors(for enzymes)

agonists/antagonists(for receptors)

Ligands for receptors normally bind via a non-covalent reversiblebinding.

Enzyme inhibitors have a wide range of modes:non-covalentreversible,covalent reversible/irreversible or suicide inhibition.

Enzymes prefer to bind transition states (reaction intermediates)and may not optimally bind substrates as part of energy used forcatalysis.

In contrast, inhibitors are designed to bind with higher affinity:their affi nities often exceed the corresponding substrate affinities

by several orders of magnitude!

Agonists are analogous to enzyme substrates: part of the binding

energy may be used for signal transduction, inducing aconformation or aggregation shift.

To understand what forces are responsible for ligands binding to


39/62

To understand what forces are responsible for ligands binding to

Receptors/Enzymes,

It is worthwhile considering what forces drive protein folding they

share many common features.The observed structure of Protein is generally a consequence of the

hydrophobic effect!

Secondary amides form much stronger H-bonds to water than to other

sec. Amides hydrophobic collapse

Proteins generally bury hydrophobic residues inside the core,while

exposing hydrophilic residues to the exterior Salt-bridges inside

Ligand building clefts in proteins often expose hydrophobic residues tosolvent and may contain partially desolvated hydrophilic groups that are

not paired:

The desolvation penalty is paid for by favourable (hydrophobic)

interaction elsewhere in the structure.


40/62

Docking Methods

Docking of ligands to proteins is a

formidable problem since it entails

optimization of the 6 positional degrees offreedom.

Rigid vs Flexible

Speed vs Reliability

Manual Interactive Docking


41/62

GRID Based Docking Methods

Grid Based methods

GRID (Goodford, 1985, J. Med. Chem. 28:849)

GREEN (Tomioka & Itai, 1994, J. Comp.Aided. Mol. Des. 8:347)

MCSS (Mirankar & Karplus, 1991, Proteins,11:29).

Functional groups are placed at regularly spaced(0.3-0.5A) lattice points in the active site and theirinteraction energies are evaluated.


42/62

Automated Docking Methods

Basic Idea is to fill the active site of theTarget protein with a set of spheres.

Match the centre of these spheres as good aspossible with the atoms in the database ofsmall molecules with known 3-D structures.

Examples:

DOCK, CAVEAT, AUTODOCK, LEGEND,ADAM, LINKOR, LUDI.


43/62

Folate Biosynthetic

pathway

DHFR

CLUSTAL W (1.81) multiple sequence alignment


44/62

chabaudi -----------------------E--KAGCFSNKTFKGLGNEGGLPWKCNSVDMKHFSSV35

vinckei -----------AICACCKVLNSNE--KASCFSNKTFKGLGNAGGLPWKCNSVDMKHFVSV47

berghei MEDLSETFDIYAICACCKVLNDDE--KVRCFNNKTFKGIGNAGVLPWKCNLIDMKYFSSV58

yoelii -----------AICACCKVINNNE--KSGSFNNKTFNGLGNAGMLPWKYNLVDMNYFSSV47

vivax MEDLSDVFDIYAICACCKVAPTSEGTKNEPFSPRTFRGLGNKGTLPWKCNSVDMKYFSSV60

falciparum -------------------------KKNEVFNNYTFRGLGNKGVLPWKCNSLDMKYFCAV35

* *. **.*:********:**::*:*

chabaudi TSYVNETNYMRLKWKRDRYMEK---------NNVKLNTDGIPSVDKLQNIVVMGKASWES86

vinckei TSYVNENNYIRLKWKRDKYIKE---------NNVKVNTDGIPSIDKLQNIVVMGKTSWES98

berghei TSYINENNYIRLKWKRDKYMEKHNLK-----NNVELNTNIISSTNNLQNIVVMGKKSWES113

yoelii TSYVNENNYIRLQWKRDKYMGKNNLK-----NNAELNNGELN--NNLQNVVVMGKRNWDS100

vivax TTYVDESKYEKLKWKRERYLRMEASQGGGDNTSGGDNTHGGDNADKLQNVVVMGRSSWES120

falciparum TTYVNESKYEKLKYKRCKYLNKET----------VDNVNDMPNSKKLQNVVVMGRTNWES85

*:*::*.:*:*::**:*: * .:***:****: .*:*

chabaudi IPSKFKPLQNRINIILSRTLKKEDLAKEYN------NVIIINSVDDLFPILKCIKYYKCF140

vinckei IPSKFKPLENRINIILSRTLKKENLAKEYS------NVIIIKSVDELFPILKCIKYYKCF152berghei IPKKFKPLQNRINIILSRTLKKEDIVNENN--NENNNVIIIKSVDDLFPILKCTKYYKCF171

yoelii IPPKFKPLQNRINIILSRTLKKEDIANEDNKNNENGTVMIIKSVDDLFPILKAIKYYKCF160

vivax IPKQYKPLPNRINVVLSKTLTKEDVK---------EKVFIIDSIDDLLLLLKKLKYYKCF171

falciparum IPKKFKPLSNRINVILSRTLKKEDFD---------EDVYIINKVEDLIVLLGKLNYYKCF136

**::*******::**:**.**:. ***..:::*: :* :*****

chabaudi I----------------------------------------------------------- 141

vinckei IIGGASVYKEFLDRNLIKKIYFTRINNAYT------------------------------ 182

berghei IIGGSSVYKEFLDRNLIKKIYFTRINNSYNCDVLFPEINENLFKITSISDVYYSNNTTLD231

yoelii IIGGSYVYKEFLDRNLIKKIYFTRINNSYN------------------------------ 190vivax IIGGAQVYRECLSRNLIKQIYFTRINGAYPCDVFFPEFDESQFRVTSVSEVYNSKGTTLD231

falciparum I----------------------------------------------------------- 137

*

chabaudi ---------

vinckei ---------

berghei FIIYSKTKE240

yoelii ---------

vivax FLVYSKVGG240

falciparum ---------

Multiple alignment of DHFR of

Plasmodium species


45/62

Drug binding pocket of L . caseiDHFR

Antifolate drugs in the active site of DHFR L


46/62

Antifolate drugs in the active site of DHFR L.

casei to show hydrogen bonding with

surrounding residues

MTX

TMP

PYR

SO3


47/62

Prediction & Design of New Drugs

Prediction of 3-D PfDHFR using bacterial DHFR

and homology modeling approach.

Search for the compounds using bifunctional basic

groups that could form stable H-bonds in a planewith carboxyl group.

Optimize the structure of small molecules and

then dock them on PfDHFR model. Toyoda et. al. (1997). BBRC 235:515-519 could

identify two compounds.


48/62

How molecular modeling could be

used in identifying new leads These two compounds

a triazinobenzimidazole&

a pyridoindolewere found to

be active with highKi against

recombinant wild type

DHFR.

Thus demonstrate use of

molecular modeling in

malarial drug design.

Additional Drug Target: glutathione-GR


49/62

Additional Drug Target: glutathione GR

Glutathione-GR


50/62

Additional Drug

Target:

Thioredoxinreductase (TrxR)

Cancer cell growth appears to


51/62

Cancer cell growth appears to

be related to evolutionary

development of plump fruits

and vegetables

Large tomatoes can evolve from wild, blueberry-size

tomatoes. The genetic mechanism responsible for this is

similar to the one that proliferates cancer cells inmammalians.

This is a case where we found a connection between

agricultural research, in how plants make edible fruit

and how humans become susceptible to cancer. That's a

connection nobody could have made in the past.

Cornell University News, July 2000

Si f T t F it


52/62

Size of TomatoFruit

fw2.2: A Quantitative Trait Locus (QTL) key to the Evolution of Tomato Fruit

Size. Anne Frary (2000) Science, 289: 85-88

Single gene, ORFX, that is

responsible for QTL has asequence and structural

similarity to the human

oncogene c-H-rasp21.

Fruit size alterations, imparted

by fw2.2alleles, are most likely

to be due to the changes in

regulation rather than in

sequence/structure of protein.

G U d t P bli


53/62

Genome Update: Public

domain

Published Complete Genomes: 59

Archaeal 9

Bacterial 36

Eukaryal 14

Ongoing Genomes: 335

Prokaryotic 203

Eukaryotic 132

Private sector holds

data of more than 100

finished & unfinished

genomes.


54/62

Challenges in Post-Genomic era: Unlocking

Secretes of quantitative variation

For even after genomes have been sequencedand the functions of most genes revealed, we

will have no better understanding of the

naturally occurring variation that determineswhy one person is more disease prone than

another, or why one variety of tomato yields

more fruit than the next. Identifying genes like fw2.2 is a critical first

step toward attaining this understanding.


55/62

Value of Genome Sequence Data

Genome sequence data provides, in a rapidand cost effective manner, the primaryinformation used by each organism to carryon all of its life functions.

This data set constitutes a stable, primaryresource for both basic and applied research.

This resource is the essential link required to

efficiently utilize the vast amounts ofpotentially applicable data and expertiseavailable in other segments of the biomedicalresearch community.


56/62

Challenges

Genome databases have individual geneswith relatively limited functionalannotation (enzymatic reaction,structural role)

Molecular reactions need to be placed inthe context of higher level cellular

functions

Th i S i


57/62

The omics Series Genomics

Gene identification & charaterisation Transcriptomics

Expression profiles of mRNA

Proteomics

functions & interactions of proteins Structural Genomics

Large scale structure determination

Cellinomics

Metabolic PathwaysCell-cell interactions

Pharmacogenomics

Genome-based drug design


58/62

Different levels of function

Atomic: Binds ATP

Molecular reaction: adds phosphate

(phosphofructokinase)

Pathway: Gluconeogenesis

Network: Energy metabolism

Typically present in

genome database

Typically, little informatics support

for making these connections

Data Mining: Finding the Needle in


59/62

Data Mining: Finding the Needle in

the Haystack Data mining refers to the new genre of BI tools used to

sift through the mass of raw data.

DM applications should be able to process -

TEMPORAL(Time studied) and

SPATIAL(Organism, organ, cell type etc) data.The gained knowledgeto reprocess data.

Data using techniques beyond Bayesian (similaritysearch) methods.

An extension of DM is the concept ofKNOWLEDGEDISCOVERY, which open up newavenues of research with new questions and different

perspectives.


60/62


61/62

Commercial Structural Genomics


62/62

Commercial Structural Genomics

Initiatives

IBM (Blue Gene project: 2000) Computational protein folding

Geneformatics (1999)

Modeling for identifying active sites

Prospect Genomics (1999) Homology modeling

Protein Pathways (1999)

Phylogenetic profiling, domain analysis, expression

profiling Structural Bioinformatics Inc (1996)

Homology modeling, docking

bioinformatics drug design

Documents