introduction to drug target identification

Identification and Validation of Drug Targets

Chapter I

INTRODUCTION

1.1 NEW DRUGS - WHY?

In the early days of antibiotic therapy, scientists and medical researchers did not know the targets on which antibiotics act. What fascinated them was simply that the newly discovered compounds exhibited reasonable antibacterial properties. These findings prompted efforts to isolate the compounds and use them to treat bacterial diseases. Alexander Fleming’s discovery of the antibiotic ‘penicillin’ is considered one of the historic milestones in medical research. The following are some of his words summarizing the findings (BMJ, 1955).

A certain type of penicillium produces in culture a powerful

antibacterial substance.

The active agent is readily filterable and the name 'penicillin' has

been given to filtrates of broth cultures of the mould.

The action is very marked on the pyogenic cocci and the diphtheria

group of bacilli.

Penicillin is non-toxic to animals in enormous doses and is not

irritant. It does not interfere with leucocytic function to a greater

degree than does ordinary broth.

"It is suggested that it may be an efficient antiseptic for application

to, or injection into, areas infected with penicillin-sensitive

microbes."

The discovery of penicillin in 1928 gave medical researchers confidence that any bacterial disease could be treated. Penicillin was one of the hallmark discoveries in the field of antibiotics, and it was effective against most of the infectious diseases of that time. Soon, however, its effect faded owing to the inherent capability of microbes to develop resistance (Watson, 1958). Resistance was found to be easily transmitted among bacterial species, and hence new molecules and antibiotics have always been needed to combat life-threatening diseases.

In the 20th century penicillin was one of the most widely used antibiotics. Today it is uncommon to find a person who has not received it at some point in their lifetime. Initially, almost every targeted organism responded well to the drug. Subsequent studies carried out in the 1940s explained its mode of action on the cell wall. At that stage scientists and medical researchers did not have a molecular-level understanding of exactly how the molecule binds, whereas modern methods of drug discovery explain how a drug molecule binds specifically to, and interacts with, the disease target.

Growing concern about antibiotic resistance and drug efficacy demands the discovery and development of new drugs to fight life-threatening diseases. Recent technological advances have enabled rapid sequencing of the genomes of various organisms. The completion of the Human Genome Project (HGP) brought about a paradigm shift in the drug discovery process, as it provided a molecular-level understanding of disease. With the genomes of humans and of various pathogenic microbes sequenced, researchers can now look for novel drug targets in these genome sequences. About 500 drug targets have been identified to date, while the drugs currently in use are based on only 120 of them (Hopkins and Groom, 2002). The majority of existing antibiotics use a limited number of core chemical structures and target only a few cellular functions, such as cell wall biosynthesis, DNA replication, transcription, and translation (Moir et al., 1999).

1.2 ANTIBACTERIAL DRUG DISCOVERY - A BRIEF HISTORY

The importance of new classes of antibiotics is clearly understood when we analyze the origin of antibacterial drug discovery and its present status. The pharmaceutical industry owes much of its early prosperity to the discovery of antibacterial agents. The early antibacterial agents were the sulfonamides, penicillin and streptomycin, and these were rapidly followed by tetracyclines, isoniazid, macrolides, glycopeptides, cephalosporins, nalidixic acid and other molecular classes. Although penicillin was discovered in 1928, it required a consortium of five pharmaceutical companies (Abbott, Lederle, Merck, Chas. Pfizer and ER Squibb & Sons) and the US Department of Agriculture to develop and produce it in the 1940s, mainly as part of the war effort during the Second World War. The cephalosporins became popular during the 1970s, with several ‘second’ and ‘third’ generation products entering the marketplace by the mid-1980s.

Coincident with the growing market dominance of the third-generation cephalosporins was the emergence of a pandemic of multidrug-resistant Staphylococcus aureus infections in US hospitals and Streptococcus pneumoniae infections in the community. At that time, in the early 1980s, the pharmaceutical industry began scaling back its antibacterial drug discovery efforts, with approximately half of the large US and Japanese pharmaceutical companies ending or curtailing them. Antibacterial drug discovery nevertheless continued at many major European and US pharmaceutical companies through the 1990s. Since 1999, however, the industry has once again pulled back from anti-infective research in an even more concerted manner, with 10 of the 15 largest companies ending or curtailing their discovery efforts. Over the same period the industry has experienced a series of mega-mergers leading to large-scale consolidation. This consolidation alone has resulted in a major decrease in the hunt for novel antibacterial agents.

The rise in antibacterial drug resistance among human pathogens is a widespread phenomenon. Resistant bacteria are defined as those that are not inhibited by the usually achievable systemic concentration of an agent under a normal dosage schedule and/or whose minimum inhibitory concentrations fall in the resistant range. Drug resistance is of major concern for severely ill and hospitalized patients, as the therapeutic efficacy of the drugs in current practice is declining. The first clear proof of resistance to penicillin was reported following an accidental observation in 1958 (Ley et al., 1958). The development of resistance to an antibacterial substance is an inherent capability of microorganisms. The widespread occurrence of microbial resistance, coupled with the declining efficacy of current antibiotics, demands the discovery and development of novel therapeutics.
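
The definition of resistance given above can be made concrete with a small sketch. The Python example below labels an isolate by comparing its measured minimum inhibitory concentration (MIC) with a pair of interpretive thresholds. The `Breakpoint` class, the `classify_isolate` function and the numeric thresholds are hypothetical illustrations; they are not taken from any clinical standard, and real breakpoints are specific to each drug and organism.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Breakpoint:
    """Hypothetical interpretive thresholds for one drug, in mg/L."""
    susceptible_max: float  # MIC at or below this value counts as susceptible
    resistant_min: float    # MIC at or above this value counts as resistant


def classify_isolate(mic_mg_per_l: float, bp: Breakpoint) -> str:
    """Label an isolate from its MIC; values between the thresholds are intermediate."""
    if mic_mg_per_l <= 0:
        raise ValueError("MIC must be positive")
    if mic_mg_per_l <= bp.susceptible_max:
        return "susceptible"
    if mic_mg_per_l >= bp.resistant_min:
        return "resistant"
    return "intermediate"


# Illustrative thresholds only; these are not real clinical breakpoints.
example_bp = Breakpoint(susceptible_max=2.0, resistant_min=8.0)

for mic in (1.0, 4.0, 16.0):
    print(mic, classify_isolate(mic, example_bp))
```

With these assumed thresholds, an isolate with an MIC of 16 mg/L is labelled resistant because the concentration needed to inhibit it exceeds what the thresholds treat as achievable.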

The Antimicrobial Availability Task Force identified six problematic pathogens as a potential threat to the community: the Gram-negative organisms Acinetobacter baumannii, extended-spectrum β-lactamase (ESBL)-producing Enterobacteriaceae and Pseudomonas aeruginosa; the Gram-positive pathogens methicillin-resistant Staphylococcus aureus (MRSA) and vancomycin-resistant Enterococcus faecium; and the filamentous fungi Aspergillus spp. Of these organisms, MRSA has received the most attention, largely driven by clinical need rather than by large sums of money. It is likely that interest in the other problematic pathogens will also be driven by clinical need and not by investment to increase awareness. Some experts consider two additional water-borne, non-fermenting Gram-negative pathogens, namely Stenotrophomonas maltophilia and Burkholderia cepacia, both of which are related to P. aeruginosa, to be problematic organisms.

Multidrug-resistant strains are particularly problematic, bringing increased mortality, longer hospital stays and higher hospital costs over and above those associated with susceptible strains of these pathogens. Successful treatment requires a ‘hit hard and hit fast’ approach with an antibiotic that provides coverage of these important Gram-negative organisms, including multidrug-resistant strains. Various studies have indicated that the frequency of multidrug-resistant isolates is increasing worldwide. Measured against the present need for novel antibiotics, discovery and development efforts are already overdue.

1.3 MULTIDRUG RESISTANCE - DRIVING THE NEED FOR NEW DRUGS

Increasing resistance to commonly used antibiotics, a growing prevalence of infections, and the emergence of new pathogenic organisms challenge the current use of antibiotic therapy (Rosamond and Allsop, 2000). Recent epidemiological studies suggest an increase in healthcare-associated infections caused by Gram-negative bacteria, particularly Klebsiella spp., Escherichia coli, Pseudomonas aeruginosa, and Acinetobacter spp. The rising incidence of drug resistance in these pathogens presents a challenge, given how few of the novel antimicrobial agents under development specifically target them. Developments concerning targets involved in bacterial virulence or in resistance to antibacterial agents have been reviewed previously (Schmid, 1998). Bacteria have developed a variety of resistance mechanisms, coupled with the ability to mobilize the respective genetic information between bacterial strains and species (Heinemann, 1999).

Gram-negative non-fermenters exhibit resistance to essentially all commonly used antibiotics, including anti-pseudomonal penicillins and cephalosporins, aminoglycosides, tetracyclines, fluoroquinolones, trimethoprim-sulfamethoxazole, and carbapenems. Polymyxins are the remaining antibiotic drug class with fairly consistent activity against multidrug-resistant strains of P. aeruginosa, Acinetobacter spp. and S. maltophilia. A variety of resistance mechanisms have been identified in P. aeruginosa and other Gram-negative non-fermenters, including enzyme production, overexpression of efflux pumps, porin deficiencies, and target-site alterations. Multiple resistance genes frequently coexist in the same organism. Multidrug resistance in Gram-negative non-fermenters makes treatment of infections caused by these pathogens both difficult and expensive. Improved antibiotic stewardship and infection-control measures will be needed to prevent or slow down the emergence and spread of multidrug-resistant, non-fermenting Gram-negative bacilli in the healthcare setting (Lautenbach and Polk, 2007; McGowan, 2006).

Knowledge of the clinical and economic impact of antimicrobial resistance is useful to influence programs and behavior in healthcare facilities, to guide policy makers and funding agencies, to define the prognosis of individual patients and to stimulate interest in developing new antimicrobial agents and therapies. A recent study showed an association between antimicrobial resistance in Staphylococcus aureus, enterococci and Gram-negative bacilli and increases in mortality, morbidity, length of hospitalization and cost of healthcare. Patients with infections due to antimicrobial-resistant organisms have higher costs (US $6,000-30,000) than do patients with infections due to antimicrobial-susceptible organisms; the difference in cost is even greater when patients infected with antimicrobial-resistant organisms are compared with patients without infection (Maragakis et al., 2008). Delivering healthcare at an affordable cost is the need of the hour, as healthcare costs are already rising owing to a variety of factors.

1.3.1 Molecular mechanism of drug resistance

The development of resistance limits the usefulness of otherwise effective drugs and hence poses a major threat to the pharmaceutical industry. Over the past two decades, understanding the mechanisms of drug resistance has become a central issue as its importance in medicine has assumed ever-increasing significance. Table 1 shows the various origins of antimicrobial resistance. Understanding the origin of resistance helps in avoiding potential pitfalls while developing a new drug for a specific disease.

Table 1

Origins of Intrinsic and Acquired Resistance

S. No. | Type | Duration of resistance | Frequency of resistance within the population

Intrinsic resistance
1. | Absence of target site | Permanent | All cells
2. | Species-specific structure of target site | Permanent | All cells
3. | High detoxication capacity, arising from: | |
   | a. tissue-specific function | Permanent | All cells
   | b. ontogenic variations | Variable | All cells
   | c. sex-specific differences | Permanent | All cells
   | d. population polymorphisms | Permanent | Variable
   | e. self defence | Permanent | All cells
   | f. high repair capacity | Permanent | All cells
4. | Low drug delivery | Variable | Variable
5. | Cell cycle effects | Variable | Variable
6. | Adaptive change | Temporary | All cells
7. | Stress response | Temporary | All cells

Acquired resistance
1. | Natural selection | Permanent | Rare
2. | Constitutive adaptive change | Permanent | Rare
3. | Constitutive stress response | Permanent | Rare
4. | Gene transfer | Requires continued selection | Rare
5. | Gene amplification | Requires continued selection | Rare

Source: John Hayes and Roland Wolf (1990)

Intrinsic drug resistance

The term ‘Intrinsic resistance’ is used to describe the situation where

an organism, or cell, possesses a characteristic 'feature' which allows all

normal members of the species to tolerate a particular drug or chemical

environment. In this case, the 'feature' responsible for resistance is an

inherent, or integral, property of the species that has arisen through the

processes of evolution.

Mechanisms of intrinsic resistance

The phenomenon of intrinsic resistance can be due to either the

presence or the absence of a biochemical 'feature' (Table 1). This may, for

example, be the structure of the cell envelope or membrane, the existence

of a drug transport protein, the absence of a metabolic pathway, the

presence of a drug-metabolizing enzyme, the structure of the drug target

site and the expression of specific stress response proteins or high repair

capacity.

Self protection mechanism associated with intrinsic drug resistance

Many organisms survive in the environment through their ability to produce chemicals which are toxic or distasteful to their predators or their competitors. As a consequence, they require their own defence against the noxious chemicals they produce. Studies on antibiotic-producing microorganisms, such as the various species of Streptomyces, provide good examples of this form of intrinsic drug resistance. The mechanisms used by organisms to protect themselves against their own antibiotic products have been divided into two types: first, resistance involving inactivation of antibiotics such as streptomycin and neomycin by phosphotransferases and acetyltransferases; and second, resistance resulting from modification of potential target sites within the organism (Cundliffe, 1984). For example, the ribosomal RNA of the erythromycin producer Streptomyces erythraeus is protected by methylation.

Chemically-induced adaptive change and intrinsic resistance

Drugs and a wide variety of toxic agents (e.g. radiation, osmotic

shock and heat shock) provoke many biochemical changes in cells that

allow them to overcome the toxic effects of either the same or other

compounds. In some circumstances this ability to resist chemical insult

arises immediately following administration of the drug or, alternatively,

there may be a significant time lag following exposure to the drug before

the adaptive process is manifest.

Physiological stress response and intrinsic resistance

Environmental factors other than drugs can, by stressing cells, elicit an adaptive response that confers resistance against chemicals. Phenomena such as heat, anoxia, viral infection, trauma, UV irradiation, pH, osmotic shock and oxidative stress stimulate a genetic reflex in all cells that is 'designed' to confer tolerance against subsequent exposure to the same physiological insult. Prokaryotes have at least four major regulons that are induced by stress, namely the SOS response (Walker, 1985), the adaptive response to alkylating agents (Samson and Cairns, 1977; Demple et al., 1985), the oxyR network (Christman et al., 1985; Storz et al., 1990) and the heat-shock response (Lindquist, 1986; Carper et al., 1987).

Acquired drug resistance

The term ‘acquired resistance’ is used to describe the case where a

resistant strain, or cell line, emerges from a population that was previously

drug-sensitive. Three major types of genetic change can be envisaged:

1. mutations and amplifications of specific genes directly involved in a protective pathway,

2. mutations in genes which regulate stress-response processes and

lead to the altered expression of large numbers of proteins, and

3. gene transfer.

These types of change are of course not mutually exclusive, and

examination of the multiple changes that are frequently seen in resistant

tumour cell lines suggests that several mechanisms can operate

simultaneously.

Natural selection and acquired resistance

The distinction between acquired resistance through natural selection

and intrinsic drug resistance lies in the frequency with which the mutated

gene is observed in the 'wild type' population.

Drug-mediated genetic changes and acquired resistance

Herbicides, insecticides and antimicrobials are generally not mutagenic. However, many drugs used in cancer chemotherapy are mutagens which, besides providing the selection pressure for resistance, can significantly increase the frequency of mutations that produce resistant cells. This effect is probably greatly potentiated by the inherent genetic instability of cancer cells. It is exemplified by the significant increase in the frequency of DNA amplification following the exposure of tumour cells to mutagens such as monofunctional and bifunctional alkylating agents and UV irradiation (Connors, 1984; Stark, 1986). It is technically difficult to demonstrate whether resistant cells in tumours arise from drug-mediated mutations or were present before chemotherapy was initiated.

Table 2

Examples of acquired drug resistance

Example | Organism | Resistance to | Procedure | Type of resistance
Bacterial drug resistance | Escherichia coli | Chloramphenicol, ampicillin | Exposure to drug | Gene transfer (+ natural selection)
Bacterial drug resistance | Serratia marcescens | Fosfomycin | Exposure to drug | Gene transfer (+ natural selection)
Preneoplastic hepatocyte nodules | Rat | Toxins, carcinogens | Carcinogen exposure | Carcinogen-induced stress response
Persistent hepatocyte nodules | Rat | Toxins, carcinogens | Carcinogen exposure | Natural selection: altered expression of drug metabolizing enzymes
oxyR1 network (adaptive response to oxidative stress) | Salmonella typhimurium | Peroxides, ethanol | In vitro selection of cell line | Constitutive overexpression of a stress response
ampC, R and D genes (adaptive response to cephalosporins) | Citrobacter freundii | Cefuroxime, cefotaxime, ceftazidime | In vitro selection of cell line | Constitutive overexpression of an adaptive response
Ada gene (adaptive response to alkylating agents) | Escherichia coli | N-Methyl-N-nitrosourea, N-methyl-N'-nitro-N-nitrosoguanidine | In vitro selection of cell line | Constitutive overexpression of an adaptive response
Multidrug resistance | Tumour cell lines | Adriamycin, vincristine, actinomycin D | Stepwise exposure to increasing concentrations of cytotoxic drug | Amplification of P-glycoprotein genes
Alkylating agent resistance | Tumour cell lines | Alkylating agents | Stepwise exposure to increasing concentrations of cytotoxic drug | Overexpression of drug metabolizing enzymes
DNA gyrase mutants | Escherichia coli | Nalidixic acid | In vitro exposure to drug | Natural selection
Penicillin binding protein mutants | Escherichia coli | Penicillin | Exposure to drug | Natural selection
Acetylcholinesterase mutants | House flies | Organophosphorus | Exposure to drug | Natural selection

Source: John Hayes and Roland Wolf (1990)

1.4 CONCERNS FOR DRUG DISCOVERY AND DEVELOPMENT

The process of drug development begins with target identification and eventually leads to the final medication. Drug discovery and development is an expensive, laborious and incremental process. The main objective of this effort is to identify a molecule with the desired effect against a specific disease. The molecule must also be shown to have the quality, safety and efficacy needed to treat patients without undesirable side effects (Snodin, 2002).

Currently, bringing a new molecule to market costs around US $800 million, and it takes nearly 12 years for a drug to progress from bench to market (EMBO Reports, 2004). The drug discovery process has numerous technical bottlenecks, and the molecule under investigation carries a high risk of failure at every stage of development. In spite of the growth in drug discovery technologies, the number of drugs that have gained FDA approval is very small. For example, of 5000 compounds that enter pre-clinical testing, approximately five are tested in human trials, of which only one receives approval for therapeutic use. Furthermore, no new chemical classes of active antibiotics have been successfully introduced into the clinic for over 30 years. As development costs have increased, the number of companies investing in R&D has decreased drastically. However, effective use of new genomic technologies and available data resources can accelerate the process of drug discovery and help avoid potential pitfalls in the drug discovery pipeline.
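
The attrition figures quoted above (5000 compounds entering pre-clinical testing, about five reaching human trials, one approved) can be turned into stage-wise success rates. The short Python sketch below does this arithmetic; the constant and function names are illustrative, and the only inputs are the numbers stated in the text.

```python
from fractions import Fraction

# Figures quoted in the text above.
ENTER_PRECLINICAL = 5000  # compounds entering pre-clinical testing
ENTER_HUMAN_TRIALS = 5    # compounds that go on to human trials
APPROVED = 1              # compounds that receive approval


def stage_success(passed: int, entered: int) -> Fraction:
    """Fraction of compounds that survive one stage."""
    if entered <= 0 or not 0 <= passed <= entered:
        raise ValueError("need entered > 0 and 0 <= passed <= entered")
    return Fraction(passed, entered)


preclinical_rate = stage_success(ENTER_HUMAN_TRIALS, ENTER_PRECLINICAL)
clinical_rate = stage_success(APPROVED, ENTER_HUMAN_TRIALS)
overall_rate = preclinical_rate * clinical_rate  # chained survival across both stages

print(f"pre-clinical -> human trials: {float(preclinical_rate):.2%}")
print(f"human trials -> approval:     {float(clinical_rate):.2%}")
print(f"overall:                      {float(overall_rate):.2%}")
```

On these figures roughly 1 compound in 1000 survives pre-clinical testing and 1 in 5 of those is approved, so the overall success rate is 1 in 5000, or 0.02%.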

1.4.1 Stages of drug discovery

The cost and time taken to design, develop and release new drugs to the market have continued to rise in recent times (Grabowski et al., 1990; Di Masi, 2002), and the number of new drug approvals has declined drastically (Frantz and Smith, 2003). The pharmaceutical industry is keen to reduce drug candidate attrition throughout the drug discovery and development process. Numerous drugs with reasonable biological activity fail in clinical studies. Earlier testing, especially through wet-laboratory or in silico protocols, can help avoid such pitfalls in drug development.

Fig. 1: Modern Day Drug Discovery Pipeline

The first step is to establish an assay for the receptor or target. An assay is a test that assesses whether a molecule (drug) binds to the target receptor. Usually a pharmaceutical company will first screen its entire corporate database of known compounds, since the compounds in the database are usually very well characterized. Synthetic methods will also be known for these compounds, and patent protection is often in place. This enables the company to rapidly prototype a candidate ligand whose chemistry is well known and which lies within the company's intellectual property. If none of the compounds in the database binds the target, the company may look elsewhere for a compound that fits the receptor. A molecule that successfully binds to the target is termed a lead compound. The next step is to study the receptor's interactions with the ligand molecule. This involves both in silico and in vitro analysis to find the binding residues involved in the ligand-receptor association. The 3D structure of the ligand-receptor complex provides a clear perspective on the ligand-receptor interaction.
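
The screening step described above can be sketched in a few lines of Python. The example assumes that the assay reports a dissociation constant for each compound (lower values meaning tighter binding) and that a compound counts as a hit when this value falls below a chosen cut-off. The `AssayResult` type, the compound identifiers, the measured values and the cut-off are all hypothetical; the text does not specify how binding is scored.

```python
from typing import List, NamedTuple, Optional


class AssayResult(NamedTuple):
    compound_id: str
    kd_nanomolar: float  # hypothetical assay readout; lower means tighter binding


def screen_library(results: List[AssayResult], max_kd_nanomolar: float) -> List[AssayResult]:
    """Return the compounds that bind at least as tightly as the cut-off, best first."""
    hits = [r for r in results if r.kd_nanomolar <= max_kd_nanomolar]
    return sorted(hits, key=lambda r: r.kd_nanomolar)


def pick_lead(results: List[AssayResult], max_kd_nanomolar: float) -> Optional[AssayResult]:
    """Pick the tightest binder as the lead, or None if nothing in the library binds."""
    hits = screen_library(results, max_kd_nanomolar)
    return hits[0] if hits else None


# Made-up corporate library results, for illustration only.
library = [
    AssayResult("CPD-001", 850.0),
    AssayResult("CPD-002", 42.0),
    AssayResult("CPD-003", 7.5),
    AssayResult("CPD-004", 12000.0),
]

lead = pick_lead(library, max_kd_nanomolar=100.0)
print(lead.compound_id if lead else "no hit: look beyond the corporate database")
```

Returning `None` when no compound passes the cut-off mirrors the case described in the text, where the company has to look for a compound outside its own database.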

1.5 DETERMINATION OF THE CRYSTAL STRUCTURE

If the receptor is water soluble, there is a chance that X-ray crystallographic analysis can be employed to determine the three-dimensional structure of the ligand bound to the receptor at the atomic level. X-ray crystallography is a very powerful tool because it allows scientists to directly visualize a snapshot of the individual atoms of the ligand as they reside within the receptor. This snapshot is referred to as a crystal structure of the ligand-receptor complex. Unfortunately, not all complexes can be analyzed in this manner. If a crystal structure can be determined, however, a strategy can be developed on the basis of this characterization to improve and optimize the binding of the lead compound. From this point onward, a cycle of iterative chemical refinement and testing continues until a drug is developed that can enter clinical trials. The techniques used to refine drugs are combinatorial chemistry and structure-based drug design.

1.5.1 X-ray crystallography and drug discovery

The concept of applying X-ray crystallography in drug discovery emerged more than 30 years ago, as the first 3D structures of proteins were determined. Typical early examples include the synthesis of ligands of haemoglobin to decrease sickling (Beddell et al., 1976; Goodford et al., 1980), the chemical modification of insulin to increase its half-life (Blundell, 1972), and the design of serine protease inhibitors to control blood clotting. In spite of the promising results, most pharmaceutical companies considered X-ray crystallography too expensive and time consuming to bring ‘in house’, and for a time most activity remained in academia. Within a decade, a radical change in drug design had begun, incorporating knowledge of the three-dimensional structures of target proteins into the design process. Although structures of the relevant drug targets were usually not available directly from X-ray crystallography, comparative models based on homologues proved useful in defining the topographies of the complementary surfaces of ligands and their protein targets, and began to be exploited in lead optimization in the 1980s (Blundell et al., 1983; Blundell, 1996; Campbell, 2000).

Soon, crystal structures of key drug targets became available. AIDS drugs such as Agenerase and Viracept were developed using the crystal structure of HIV protease (Lapatto et al., 1989), and the influenza drug Relenza was designed using the crystal structure of neuraminidase (Varghese, 1999). More than 40 drugs originating from structure-based design approaches have now entered clinical trials (Hardy and Malikayil, 2003), and seven of these had achieved regulatory approval and been marketed as drugs by mid-2003. These successes have led the pharmaceutical sector to explore the design and development of drugs using in silico approaches.

Protein structure can influence drug discovery at every stage of the design process. Classically it has been exploited in lead optimization, a process that uses structure to guide the chemical modification of a lead molecule to give an optimized fit in terms of shape, hydrogen bonds and other non-covalent interactions with the target. Protein structure can also be used in target identification and selection (the assessment of the ‘druggability’ or tractability of a target). Traditionally, this has involved homology recognition assisted by knowledge of protein structure, but structural genomics programs are now seeking to define representative structures of all protein families, allowing binding regions and molecular functions to be proposed. More recently, X-ray crystallography has been used to assist the identification of hits by virtual screening and, more directly, in the screening of chemical fragments. The key roles of structural biology and bioinformatics in lead optimization remain as important as ever (Whittle and Blundell, 1994; Lombardino and Lowe, 2004). For proteins that cannot be crystallized, the structure cannot be elucidated through X-ray crystallography. Such structures can often be predicted with useful accuracy using protein modeling methods. Protein modeling is a widely accepted approach because it can produce reliable 3D structures, and it is of high importance in the drug discovery industry today.

1.5.2 Protein Modeling

The process of evolution has resulted in the production of DNA

sequences that encode proteins with specific functions. In the absence of a

protein structure that has been determined by X-ray crystallography or

nuclear magnetic resonance (NMR) spectroscopy, researchers can predict

the three-dimensional structure using protein modeling. This method uses

experimentally determined protein structures (templates) to predict the

structure of another protein that has a similar amino acid sequence (target).

Although protein modeling may not be as accurate at determining a

protein's structure as experimental methods, it is still extremely helpful in

proposing and testing various biological hypotheses. This technique also

provides a starting point for researchers wishing to confirm a structure

through X-ray crystallography and NMR spectroscopy. Because the

different genome projects are producing more sequences and because

novel protein folds and families are being determined, protein modeling will

become an increasingly important tool for scientists working to understand

normal and disease-related processes in living organisms.

1.5.2.1 The Four Steps of Protein Modeling (Lorenza, 2009)

1. Identify the proteins with known three-dimensional structures that are related to the target sequence.

2. Align the related three-dimensional structures with the target sequence and determine those structures that will be used as templates.

3. Construct a model for the target sequence based on its alignment with the template structure(s).

4. Evaluate the model against a variety of criteria to determine whether it is satisfactory.
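
The four steps listed above can be illustrated with a deliberately simplified Python sketch. It assumes that candidate templates have already been aligned to the target (gaps written as '-'), ranks them by percent sequence identity, maps each target residue onto a template position as a stand-in for coordinate building, and accepts the model if enough of the target is covered. The `Template` class, the toy sequences, the 30% identity cut-off and the 80% coverage criterion are all hypothetical choices made for illustration; real modeling software builds and validates full atomic coordinates.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Template:
    name: str
    aligned_template: str  # template sequence after alignment, '-' marks a gap
    aligned_target: str    # target sequence in the same alignment, same length


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical residues over the aligned, gap-free columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)


def select_template(candidates: List[Template], min_identity: float = 30.0) -> Optional[Template]:
    """Steps 1-2: keep related structures and choose the most similar one."""
    scored = [(percent_identity(t.aligned_target, t.aligned_template), t) for t in candidates]
    scored = [(score, t) for score, t in scored if score >= min_identity]
    return max(scored, key=lambda item: item[0])[1] if scored else None


def build_model(template: Template) -> List[Optional[int]]:
    """Step 3 (toy): for each target residue, the 1-based template residue it is
    modelled on, or None where the template has a gap."""
    mapping: List[Optional[int]] = []
    template_pos = 0
    for target_res, template_res in zip(template.aligned_target, template.aligned_template):
        if template_res != "-":
            template_pos += 1
        if target_res != "-":
            mapping.append(template_pos if template_res != "-" else None)
    return mapping


def evaluate_model(mapping: List[Optional[int]], min_coverage: float = 0.8) -> bool:
    """Step 4 (toy): accept the model if enough target residues have a template equivalent."""
    if not mapping:
        return False
    return sum(p is not None for p in mapping) / len(mapping) >= min_coverage


# Toy alignments of one target against two hypothetical templates.
candidates = [
    Template("tplA", aligned_template="MKT-AYIAK", aligned_target="MKTLAYIGK"),
    Template("tplB", aligned_template="MQSLPWLAK", aligned_target="MKTLAYIGK"),
]
best = select_template(candidates)
model = build_model(best) if best else []
print(best.name if best else None, model, evaluate_model(model))
```

Keeping the four steps as separate functions mirrors the list: a model that fails evaluation can be rebuilt from a different template or alignment without changing the other steps.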

Fig. 2: Protein modeling steps

1.5.2.2 Comparative or homology protein structure modeling

Homology, or comparative, protein structure modeling constructs a three-dimensional model of a given protein sequence based on its similarity to one or more known structures. Protein structure prediction methods fall into two classes. The first class, which includes threading and comparative modeling, relies on detectable similarity spanning most of the modeled sequence and at least one known structure. The second class, de novo or ab initio methods, predicts the structure from sequence alone, without relying on similarity at the fold level between the modeled sequence and any of the known structures. Despite progress in ab initio protein structure prediction, comparative modeling remains the most reliable method of predicting the 3D structure of a protein, with an accuracy that can be comparable to that of a low-resolution, experimentally determined structure.

1.6 PROTEIN MODELING AND DRUG DISCOVERY

Advances in bioinformatics and protein modeling algorithms, in

addition to the enormous increase in experimental protein structure

information, have aided in the generation of databases that comprise

homology models of a significant portion of known genomic protein

sequences. Currently, 3D structure information can be generated for up to

56% of all known proteins. However, there is considerable controversy

concerning the real value of homology models for drug design. Despite the

numerous uncertainties that are associated with homology modeling, recent

research has shown that this can be used to significant advantage in the

identification and validation of drug targets, as well as for the identification

and optimization of lead compounds.

Homology model-based drug design has been applied to epidermal

growth factor receptor tyrosine kinase protein (Ghosh et al., 2001), Bruton’s

tyrosine kinase (Mahajan et al., 1999), Janus kinase 3 (Sudbeck et al.,

1996) and human aurora 1 and 2 kinases (Vankayalapati et al., 2003).

Traditionally, the crucial impasse in the industry’s search for new drug

targets was the availability of biological data. Now, with the advent of the

human genome sequence, bioinformatics offers several approaches for the

prediction of structure and function of proteins on the basis of sequence

and structural similarities. The protein sequence>structure>function

relationship is well established and reveals that the structural details at

atomic level help understand molecular function of proteins. Impressive

technological advances in areas such as structural characterization of

biomacromolecules, computer sciences and molecular biology have made

rational drug design feasible and present a holistic approach.

Protein modeling, being a computational approach, generates the 3D

structure of a receptor with high accuracy in a short time. It also makes it

possible to study the binding pockets of the receptor (protein) and their

interactions with ligands by molecular docking. These structures are of high

importance for screening new chemical entities by in silico methods.

1.6.1 Multidomain Protein Targets

One of the great internal contradictions of drug discovery in practice

is that most regulatory proteins in man, the obvious targets for new drugs,

are complex proteins that are often multidomain and very often

components of multiprotein systems. A domain represents a complete

functional unit. A protein may have one or more domains. Most of the focus

in the pharmaceutical industry is on the active sites of monomeric proteins.

Many proteins in the higher eukaryotes are large and contain multiple

domains. A typical example is the DNA-dependent protein kinase (DNA-PK), a key

molecule in non-homologous end joining, which signals the assembly of the

multiprotein system involved in the repair of double strand breaks (Smider

et al., 1994; Taccioli et al., 1994). This protein is composed of a large

catalytic subunit and a regulatory heterodimer of Ku70 and Ku80. DOMINANT

is a program written to deconvolute protein structures into their

constituent domains in order that domains and domain boundaries can be

classified (Brewerton, 2004). For an input protein structure, DOMINANT

checks the existing domain database using a structure comparison

procedure to identify any recurrent domains, and then uses a procedure to

identify domains from the spatial separation of secondary structures to

deconvolute the remaining structure. Programs like DOMINANT will be

helpful in identifying multidomain proteins and in further assessing them for

druggability.

1.7 IN SILICO - ITS ORIGIN AND REVOLUTION

The term ‘in silico’ is a modern word usually used to mean

experimentation performed by computer and is related to the more

commonly known biological terms in vivo and in vitro. The history of the ‘in

silico’ term is poorly defined, with several researchers claiming a role in

its origination. However, some of the earliest published examples of the

word include the use by Sieburg (1990) and Danchin et al. (1991).

Informatics is a real aid to discovery when analyzing biological

functions. We could reiterate this for drug discovery, which is a hugely

complex information handling and interpretation exercise. With so much

information to process, we need to be able to discover the shortcuts or the

rules that will point us as quickly as possible to the targets and molecules

that are likely to proceed to the clinic and then on to the market. It has also been

suggested that if we are to build on the advances of the human genome,

we need to integrate computational and experimental data, with the aim of

initiating in silico pharmacology linking all data types. This could change

the way the pharmaceutical industry discovers drugs using data to enable

simulations; however, there may still be significant gaps in our knowledge

beyond genes and proteins (Whittaker, 2003). Structure-based methods are

broadly used for drug discovery, but these are just a beginning. In

neuropharmacology, for example, it is expected that ligand-receptor interaction

kinetic models will need to be integrated with network approaches to fully

understand neurological disorders, and this could be applied more widely

to pharmacology in general (Aradi and Erdi, 2006). Basically, there are two outcomes

when bioactive compounds and biological systems interact (Testa and

Kramer, 2006). Note that ‘biological system’ is defined here very broadly

and includes functional proteins (for example, receptors), monocellular

organisms and cells isolated from multicellular organisms, isolated tissues

and organs, multicellular organisms and even populations of individuals, be

they unicellular or multicellular. As for the interactions between a drug and

a biological system, they may be simplified to ‘what the compound does to

the biosystem’ and ‘what the biosystem does to the compound.’ A drug that

acts on a biological system can elicit a pharmacological and/or toxic

response, in other words a pharmacodynamic (PD) event. Conversely, what

the biosystem does to the compound (its absorption, distribution, metabolism

and excretion) is a pharmacokinetic (PK) event. With computational

methods, decision making and the virtual simulation of every facet

of drug discovery and development have become a reality (Swaan and Ekins, 2005).

1.7.1 In silico drug discovery

The application of computational methods and techniques in the drug

discovery and development process is increasingly appreciated and is gaining

popularity among pharmaceutical companies. In silico methods

reduce the time and resource requirements of chemical synthesis and

biological testing. The uses of computational methods in drug

discovery include hit identification, lead identification and lead optimization.

Before the introduction of the genomic sciences, the drug discovery process

was guided mostly by chemistry and pharmacology. With the

completion of the Human Genome Project, coupled with a molecular-level

understanding of diseases, biology has become the major driving force of the

discovery process.

1.7.1.1 Chemogenomics approach

The chemogenomics approach aims to study the effect of a wide array of

small-molecule ligands on a wide array of macromolecular targets. The human

genome has approximately 3000 druggable targets, of which only 800

proteins are currently investigated by pharmaceutical companies. The

chemogenomic approach attempts to match these potential targets with the ligand

space. It depends on three components: a compound library, a

representative biological system and a reliable output (gene/protein

expression data). This approach rests on the premise that compounds sharing

some chemical similarity also share targets, and that targets sharing similar

ligands should share similar patterns or binding sites.
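The premise that chemically similar compounds tend to share targets can be illustrated with a nearest-neighbour lookup. The sketch below assumes that each compound is represented by a set of integer fingerprint features and uses the Tanimoto coefficient as the similarity measure. The function names, the 0.5 threshold and the example data are hypothetical and serve only as an illustration.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient between two fingerprint feature sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def predict_targets(query, library, threshold=0.5):
    """Suggest targets for a query compound from annotated neighbours.

    library is a list of (fingerprint, targets) pairs for compounds with
    known targets. threshold is an assumed similarity cut-off.
    """
    best = {}
    for fingerprint, targets in library:
        score = tanimoto(query, fingerprint)
        if score >= threshold:
            for target in targets:
                best[target] = max(best.get(target, 0.0), score)
    # Most similar annotated neighbour first; ties broken by target name.
    return sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical usage: one close neighbour and one unrelated compound.
library = [
    (frozenset({1, 2, 3, 4}), {"kinase_A"}),
    (frozenset({7, 8}), {"protease_B"}),
]
suggestions = predict_targets(frozenset({1, 2, 3}), library)
```

Real fingerprints would be computed by a cheminformatics toolkit; the set-based representation is used here only to keep the example self-contained.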

1.7.2 Virtual Screening and In silico Drug Targets

Assessment of 617 approved oral drugs in two-dimensional (2D)

molecular property space (molecular weight versus cLogP) showed that

many of them had cLogP > 5 and MW > 500. In spite of this, their associated

targets were potentially druggable but had yet to realize their potential

(Paolini et al., 2006). A recent analysis using 48 molecular 2D descriptors

followed by principal component analysis of over 12,000 anticancer

molecules representing cancer medicinal chemistry space, showed that

they populated a different space broader than hit-like space and orally

available drug-like space. This would indicate that in order to find

molecules for anticancer targets in commercially available databases,

different rules are required other than those widely used for drug-likeness,

as they may unfortunately filter out possible clinical candidates (Lloyd et

al., 2006).
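A drug-likeness filter in molecular weight versus cLogP space can be sketched as follows. The limits of 500 for molecular weight and 5 for cLogP are the widely used rule-of-thumb values and are assumed here as defaults. The class, the function names and the example molecules are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Molecule:
    name: str
    mol_weight: float  # molecular weight in Da
    clogp: float       # calculated logP


def property_violations(mol, max_mw=500.0, max_clogp=5.0):
    """Return the names of the property limits that a molecule exceeds."""
    violations = []
    if mol.mol_weight > max_mw:
        violations.append("MW")
    if mol.clogp > max_clogp:
        violations.append("cLogP")
    return violations


def partition_by_property_space(molecules, max_mw=500.0, max_clogp=5.0):
    """Split molecule names into those inside and outside the assumed limits."""
    inside, outside = [], []
    for mol in molecules:
        if property_violations(mol, max_mw, max_clogp):
            outside.append(mol.name)
        else:
            inside.append(mol.name)
    return inside, outside


# Hypothetical molecules, used only to exercise the filter.
examples = [
    Molecule("small", 320.0, 2.1),
    Molecule("greasy", 410.0, 6.3),
    Molecule("large", 780.0, 5.5),
]
inside, outside = partition_by_property_space(examples)
```

Because the limits are parameters, a different rule set (for example one tuned to anticancer chemistry space) can be applied without changing the code, which is the practical point made by the analyses cited above.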

In inverse docking, a single compound is docked against a panel of candidate

protein targets. A representative of this approach is INVDOCK, which

was recently applied for identifying potential adverse reactions using a

database of 147 proteins related to toxicities (DART). This method has

been recently demonstrated with 11 marketed anti-HIV drugs resulting in

reasonable accuracy against the DNA polymerase beta and DNA

topoisomerase I (Ji et al., 2006). The public availability of data on drugs

and drug-like molecules may make the analyses described above possible

for scientists outside the private sector. For example, chemical repositories

such as DrugBank (http://redpoll.pharmacy.ualberta.ca/drugbank/) (Wishart

et al., 2006), PubChem (http://pubchem.ncbi.nlm.nih.gov/), KiDB

(http://kidb.bioc.cwru.edu/) (Roth et al., 2004; Strachan et al., 2006) and

others consist of a wealth of target and small molecule data that can be

mined and used for computational pharmacology approaches.
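The inverse docking idea described above (one compound, many candidate proteins) reduces, at its final step, to ranking proteins by their docking score for that compound. The sketch below assumes that scores have already been produced by some docking program and that more negative scores indicate better predicted binding. It does not reproduce INVDOCK itself, and the protein names, scores and cut-off are hypothetical.

```python
def putative_targets(docking_scores, score_cutoff):
    """Return proteins whose docking score is at or below the cut-off.

    docking_scores maps a protein name to the score of one compound docked
    into that protein (assumption: more negative means better binding).
    """
    hits = [(protein, score) for protein, score in docking_scores.items()
            if score <= score_cutoff]
    # Best (most negative) score first; ties broken by protein name.
    return sorted(hits, key=lambda item: (item[1], item[0]))


# Hypothetical scores for a single compound against three proteins.
scores = {"protein_A": -9.2, "protein_B": -4.1, "protein_C": -7.5}
ranked = putative_targets(scores, score_cutoff=-7.0)
```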

Nuclear receptors: Nuclear receptors constitute a family of ligand-

activated transcription factors of paramount importance for the

pharmaceutical industry since many of its members are often considered as

double-edged swords (Shi, 2006). On the one hand, because of their

important regulatory role in a variety of biological processes, mutations in

nuclear receptors are associated with many common human diseases such

as cancer, diabetes and osteoporosis and thus, they are also considered

highly relevant therapeutic targets. On the other hand, nuclear receptors

act also as regulators of some of the CYP enzymes responsible for the

metabolism of pharmaceutically relevant molecules, as well as transporters

that can mediate drug efflux, and thus they are also regarded as potential

therapeutic antitargets.

Examples of the use of target-based virtual screening to identify

novel small molecule modulators of nuclear receptors have been recently

reported. Using the available structure of the oestrogen receptor subtype α

(ERα) in its antagonist conformation, a homology model of the retinoic acid

receptor α (RARα) was constructed. Using this homology model, virtual

screening of a compound library led to the identification of two novel

RARα antagonists in the micromolar range. The same approach was later

applied to discover 14 novel and diverse micromolar antagonists of the

thyroid hormone receptor (Schapira et al., 2000). By means of a procedure

designed particularly to select compounds fitting onto the LxxLL peptide-

binding surface of the oestrogen receptor, novel ERα antagonists were

identified (Shao et al., 2004). The discovery of three low micromolar hits for

ERβ displaying over 100-fold binding selectivity with respect to ERα was

also recently reported using database screening (Zhao and Brinton, 2005).

A final example reports the identification and optimization of a novel family

of peroxisome proliferator-activated receptor-γ partial agonists based

upon pyrazol-4-ylbenzenesulfonamide after employing structure-based

virtual screening, with good selectivity profile against the other subtypes of

the same nuclear receptor group (Lu et al., 2006).

Antibacterials

Twenty deoxythymidine monophosphate analogues were used along

with docking to generate a pharmacophore for Mycobacterium tuberculosis

thymidine monophosphate kinase inhibitors with the Catalyst software.

A final model was used to screen a large database spiked with known

inhibitors. In addition, the model was used to rapidly screen half a million

compounds in an effort to discover new inhibitors (Gopalakrishnan et al.,

2005).
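Screening a database spiked with known inhibitors is usually summarised by an enrichment factor, which compares the hit rate among the top-ranked compounds with the hit rate of the whole database. The following is a minimal sketch of that calculation. The function name and the example ranking are hypothetical, and the values of the cited study are not reproduced.

```python
def enrichment_factor(ranked_ids, active_ids, top_n):
    """Enrichment of known actives among the top_n ranked compounds.

    ranked_ids: compound identifiers ordered from best to worst score.
    active_ids: identifiers of the known inhibitors spiked into the database.
    A value of 1.0 corresponds to random selection.
    """
    if not 0 < top_n <= len(ranked_ids):
        raise ValueError("top_n must be between 1 and the database size")
    actives = set(active_ids) & set(ranked_ids)
    if not actives:
        raise ValueError("no known actives are present in the ranked list")
    hits_in_top = sum(1 for cid in ranked_ids[:top_n] if cid in actives)
    # (hits_in_top / top_n) divided by (n_actives / n_total)
    return (hits_in_top * len(ranked_ids)) / (top_n * len(actives))


# Hypothetical ranking: two actives (a1, a2) among eight decoys.
ranking = ["a1", "d1", "a2", "d2", "d3", "d4", "d5", "d6", "d7", "d8"]
ef_top2 = enrichment_factor(ranking, {"a1", "a2"}, top_n=2)
```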

Antivirals

Neuraminidase is a major surface protein of the influenza virus.

A structure-based approach was used to generate Catalyst

pharmacophores and these in turn were used for a database search and

aided the discovery of known inhibitors. The hit lists were also very

selective (Steindl and Langer, 2004). Utilizing this screening to design

antivirals could help in managing the major epidemics and pandemics.

During a pandemic outbreak there is usually very little opportunity for

surveillance, as the discovery process takes time. Screening for active

compounds allows rapid identification of candidates, so that appropriate

control measures can be started sooner.

Human rhinovirus 3C protease is an antirhinitis target. A structure-

based pharmacophore was developed initially around AG 7088 but this

proved too restrictive. A second pharmacophore was developed from seven

peptidic inhibitors using the Catalyst HIPHOP method. This hypothesis was

useful in searching the World Drug Index database to retrieve compounds

with known antiviral activity, and several novel compounds were selected

from other databases with good fits to the pharmacophore, indicating that

they would be worth testing, although the experimental validation data

were not presented (Steindl et al., 2005b).

Human rhinovirus coat protein is another target for antirhinitis.

A pharmacophore was generated from the structure and shape of a known

inhibitor and tested for its ability to find known inhibitors in a database.

Ultimately, after screening the Maybridge database, 10 compounds were

suggested that were then docked and scored. Six compounds were tested

and found to inhibit viral growth. However, the majority of them were found

to be cytotoxic or to have poor solubility (Steindl et al., 2005a). The

LigandScout approach was tested on the rhinovirus serotype 16 and was able to

find known inhibitors in the PDB (Wolber and Langer, 2005). The SARS

coronavirus 3C-like proteinase has been addressed as a potential drug

design target. A homology model was built and chemical databases were

docked into it. A pharmacophore model and drug-like rules were used to

narrow the hit list. Forty compounds were tested and three were found with

micromolar activity, the best being calmidazolium at 61 μM (Liu et al.,

2005), perhaps a starting point for further optimization.

A pharmacophore has also been developed to predict the hepatitis

C virus RNA-dependent RNA polymerase inhibition of diketo acid

derivatives. A Catalyst HypoGen model was derived with 40 molecules with

activities over three log orders to result in a five-feature pharmacophore

model. This was in turn tested with 19 compounds from the same data set

as well as nine diketo acid derivatives, for which the predicted and

experimental data were in good agreement (Di Santo et al., 2005).

1.7.3 Protein-protein interactions

Protein-protein interactions are key components of cellular signalling

cascades, the selective interruption of which would represent a sought after

therapeutic mechanism to modulate various diseases (Tesmer, 2006).

However, it has been difficult for in silico methods to derive small-molecule

inhibitors for such pharmacological targets, owing to their generally quite

shallow binding sites. The G-protein Gβγ complex can regulate a number of

signalling proteins via protein-protein interactions. The search for small

molecules to interfere with Gβγ protein-protein interactions has been

pursued using FlexX docking and consensus scoring of 1990 molecules

from the NCI diversity set database (Bonacci et al., 2006). After testing 85

compounds as inhibitors of the Gβ1γ2-SIRK peptide interaction, nine compounds were

identified with IC50 values from 100 nM to 60 μM. Further substructure

searching was used to identify similar compounds to one of the most potent

inhibitors to build a SAR. These efforts may eventually lead to more potent

lead compounds.

A structure-based Catalyst pharmacophore was developed for

acetylcholinesterase, which was subsequently used to search a natural

product database. The strategy identified scopoletin and scopolin as hits,

which were later shown to have moderate in vivo activity (Rollinger et al.,

2004). The same database was also screened against cyclooxygenase

(COX)-1 and (COX)-2 structure-based pharmacophores, leading to the

identification of known COX inhibitors. These represent examples where a

combination of ethnopharmacological and computational approaches may

aid drug discovery (Rollinger et al., 2005).

Homology models for the human 12-LOX and 15-LOX have also been

used with the flexible ligand docking programme Glide (Schrödinger Inc.) to

perform virtual screening of 50 000 compounds. Out of 20 compounds

tested, 8 had inhibitory activity and several were in the low micromolar

range (Kenyon et al., 2006).

1.7.4 Kinases

The kinases represent an attractive family of over 500 targets for the

pharmaceutical industry, with several drugs approved recently. Kinase

space has been mapped using selectivity data for small molecules to create

a chemogenomic dendrogram for 43 kinases that showed the highly

homologous kinases to be inhibited similarly by small molecules (Vieth et

al., 2004).

Drug-metabolizing enzymes and transporters: Mathematical

models describing quantitative structure-metabolism relationships were

pioneered by Hansch et al. (1968) using small sets of similar molecules

and a few molecular descriptors. Later, Lewis and co-workers provided

many QSAR and homology models for the individual human CYPs (Lewis,

2000). As more sophisticated computational modelling tools became

available, there has been steep growth in the number of available models (De

Groot and Ekins, 2002; De Graaf et al., 2005; De Groot, 2006) and the size

of the data sets they encompass. Some more recent methods are also

incorporating water molecules into the binding sites when docking

molecules into these enzymes and these may be important as hydrogen

bond mediators with the binding site amino acids (Lill et al., 2006). Docking

methods can also be useful for suggesting novel metabolites for drugs. A

recent example used a homology model of CYP2D6 and docked

metoclopramide as well as 19 other drugs to show a good correlation

between IC50 and docking score, r² = 0.61 (Yu et al., 2006).
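A correlation of this kind between measured inhibition and docking score is typically reported as the squared Pearson correlation coefficient. The sketch below shows one way to compute it. The function name is hypothetical, and the inputs would be paired docking scores and measured IC50 values (often log-transformed, which is an assumption here); the data of the cited study are not reproduced.

```python
def r_squared(x, y):
    """Squared Pearson correlation coefficient between two paired samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least 2 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return (sxy * sxy) / (sxx * syy)
```

A value close to 1 indicates that the docking score orders the compounds almost exactly as the experimental measurement does, while a value close to 0 indicates no linear relationship.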

A novel aromatic N-hydroxy metabolite was suggested as the major

metabolite and confirmed in vitro. Now that several crystal structures of the

mammalian CYPs are available, they have been found to compare quite

favourably to the prior computational models (Rowland et al., 2006).

However, for some enzymes like CYP3A4, where there is both ligand and

protein promiscuity, there may be difficulty in making reliable predictions

with some computational approaches such as docking with the available

crystal structures (Ekroos and Sjogren, 2006). Hence, multiple

pharmacophores or models may be necessary for this and other enzymes

(Ekins et al., 1999), as it has been indicated by others more recently (Mao

et al., 2006).

Sulfotransferases, a second class of conjugating enzymes, have been

crystallized (Dajani et al., 1999; Gamage et al., 2003) and a QSAR method

has also been used to predict substrate affinity to SULT1A3. The

computational modelling of drug transporters has been thoroughly reviewed

by numerous groups (Zhang et al., 2002a, b; Chang and Swaan, 2005).

Various transporter models have also been applied to database searching

to discover substrates and inhibitors (Langer et al., 2004; Pleban et al.,

2005; Chang et al., 2006b) and increase the efficiency of in vitro screening

or enrichment over random screening.

Receptors: There are more than 20 different families of receptors that

are present in the plasma membrane, altogether representing over 1000

proteins of the receptorome (Strachan et al., 2006). Receptors have been

widely used as drug targets and they have a wide array of potential ligands.

However, it should be noted that to date we have only characterized and

found agonists and antagonists for a small percentage of the receptorome.

1.8 DRUG TARGETS

Wikipedia defines a drug target as follows: "A biological target is a biopolymer

such as a protein or nucleic acid whose activity can be modified by an

external stimulus".

It has been estimated that current drug therapies are directed at fewer

than 500 targets. Despite unprecedented growth in medical sciences and

technology, only approximately 500 drug targets had been reported up to 2000.

Considering that the human genome contains some 30,000 genes, it is

possible that its study could lead to at least 3,000 to 5,000 potential new

targets for therapy. Currently, predominant candidates include G protein-

coupled receptor families and other receptors and related molecules, a

wide range of enzymes including proteases, kinases and phosphatases,

hormones, growth factors, chemokines, soluble receptors and related

molecules, and many others. Exactly the same principles are being applied

to the search for agents to interfere with key biochemical pathways in

pathogens, based on information which is being obtained from the

pathogen genome project (WHO Reports, 2002).

1.8.1 Characteristics of an ideal drug target (Pathogenic Organisms)

The genome data must be analyzed by in vitro and in silico means to

pin down drug targets for developing new drugs. An ideal target in a

pathogenic organism should fulfil the following four criteria.

Essentiality: The target should be essential for the growth,

replication and survival of the organism.

Selectivity: The target should not have clear orthologs in the human

host, so that inhibitors act selectively on the pathogen.

Spectrum: The target should be conserved in a number of

pathogens, providing adequate spectrum for any potential inhibitors.

Functionality: The function of the target has to be determined so that

inhibitors of the target can be detected.
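The four criteria above can be applied as a simple filter over a list of candidate proteins. The sketch below assumes that each criterion has already been evaluated for every candidate by experiment or sequence analysis. The class, its field names, the minimum of two pathogens for spectrum and the example candidates are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateTarget:
    name: str
    essential: bool               # essentiality: needed for growth or survival
    has_human_ortholog: bool      # selectivity: should be False
    pathogens_with_ortholog: int  # spectrum: number of pathogens sharing it
    function_known: bool          # functionality: enables an inhibitor assay


def is_ideal_target(candidate, min_pathogens=2):
    """True if the candidate meets all four criteria (assumed thresholds)."""
    return (candidate.essential
            and not candidate.has_human_ortholog
            and candidate.pathogens_with_ortholog >= min_pathogens
            and candidate.function_known)


def shortlist(candidates, min_pathogens=2):
    """Names of candidates that pass every criterion, in input order."""
    return [c.name for c in candidates if is_ideal_target(c, min_pathogens)]


# Hypothetical candidates, used only to exercise the filter.
candidates = [
    CandidateTarget("enzyme_1", True, False, 5, True),
    CandidateTarget("enzyme_2", True, True, 7, True),
    CandidateTarget("enzyme_3", False, False, 4, True),
    CandidateTarget("enzyme_4", True, False, 1, True),
]
selected = shortlist(candidates)
```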

1.8.2 Identifying Drug Targets

Virulence genes as drug targets

The complete genome data sets also spur early identification of

virulence genes. These genes can be identified either by in vivo expression

technology or by DNA microarrays. Extensive analysis, coupled with the

comparison of pathogenic and non-pathogenic microbes, will reveal the

pathogenicity islands that encode the virulence factors. Most often, these

islands differ from the rest of the genome in certain parameters such as GC

content, codon usage and gene density. The proteins encoded by these

pathogenicity islands are promising candidates for alternative targets.
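Since such islands often differ from the rest of the genome in GC content, a sliding-window scan is a simple first screen for them. The sketch below flags windows whose GC fraction deviates from the genome-wide value by at least a chosen amount. The function names, window settings and deviation threshold are illustrative assumptions, and codon usage and gene density are not considered.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def atypical_gc_windows(genome, window, step, min_deviation):
    """Return (start, gc) for windows whose GC content is atypical.

    A window is flagged when its GC fraction differs from the genome-wide
    GC fraction by at least min_deviation (an assumed threshold).
    """
    genome_gc = gc_content(genome)
    flagged = []
    for start in range(0, len(genome) - window + 1, step):
        window_gc = gc_content(genome[start:start + window])
        if abs(window_gc - genome_gc) >= min_deviation:
            flagged.append((start, window_gc))
    return flagged


# Toy sequence: a GC-rich block embedded in an AT-rich background.
toy_genome = "ATATATAT" + "GCGCGCGC" + "ATATATAT"
islands = atypical_gc_windows(toy_genome, window=8, step=8, min_deviation=0.5)
```

Flagged windows are only candidates; confirmation would still require the comparison with non-pathogenic relatives described above.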

Species specific genes as drug targets

Peer Bork and his coworkers devised an interesting approach for the

prediction of potential drug targets, which they designated

“differential genome display”. The approach relies on the fact that

pathogenic organisms code for fewer proteins than free-living organisms;

those proteins that are present in the pathogen and absent in free-living

organisms are considered potential drug targets.
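At its core, this comparison is a set difference between the protein families of the pathogen and those of the free-living organism. The sketch below assumes that every protein has already been assigned to an ortholog family by sequence comparison. The function name, protein names and family identifiers are hypothetical.

```python
def pathogen_specific_proteins(pathogen_proteins, reference_families):
    """Proteins of the pathogen whose family is absent from the reference.

    pathogen_proteins maps a protein name to its ortholog family identifier.
    reference_families is the collection of family identifiers found in the
    free-living organism.
    """
    reference = set(reference_families)
    return sorted(name for name, family in pathogen_proteins.items()
                  if family not in reference)


# Hypothetical family assignments.
pathogen = {"p1": "famA", "p2": "famB", "p3": "famC"}
free_living = {"famA", "famD"}
candidates_found = pathogen_specific_proteins(pathogen, free_living)
```

The same set difference, applied to whole pathways rather than individual protein families, gives a simple form of the subtraction idea used for bacteria-specific enzymes.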

Effective drug targets are selected based on several important

criteria: they must be necessary to bacterial survival or growth, highly

conserved in either a broad or narrow range of pathogens, absent or very

different in humans, and understood biochemically (Rosamond and Allsop,

2000).

Microbial genomics and drug discovery

Modern sequencing techniques enable rapid genome sequencing, assisted

by computational tools that perform automated annotation of the

freshly sequenced genome data. Researchers quickly mine these

data sets for exploring novel targets for both antimicrobial and vaccine

development.

Unique enzyme and drug targets

Since most of the known antibacterials act as inhibitors of bacterial

enzymes, all bacteria-specific enzymes can be considered potential drug

targets. These enzymes can be identified in organisms by genome subtraction

methods, followed by comprehensive analysis of the resultant proteins for

confirmation. Much easier and more efficient identification is possible by a

similar approach called “pathway subtraction”. This approach quickly

identifies enzyme pathways that are specific to bacteria, from

which drug targets can be easily identified. A typical example is isoprenoid

biosynthesis in lower and higher organisms. Since these two

groups use completely different enzyme systems for the biosynthesis of

isoprenoids, the enzymes of the pathway are obvious targets for

drug design. This has also led to the discovery of fosmidomycin, which

binds to one of the enzyme targets in this pathway. The ubiquitin

regulatory pathway, in which ubiquitin is conjugated and deconjugated with

substrate proteins, represents a source of many potential targets for

modulation of cancer and other diseases (Wong et al., 2003).

Membrane transporters as drug targets

Comparative analysis of bacterial genomes has shown that most

pathogenic microbes do not have well-developed biosynthetic capabilities

when compared to free-living or related non-pathogenic forms.

Hence most of these organisms depend completely on the host for their

essential nutrients. A metabolic pathway analysis will reveal substrates that

cannot be produced by the bacterium and hence need to be

transported. This leads to the identification of bacterial transport proteins,

which could be promising drug targets.

1.9 TARGET PREDICTION METHODS AND STRATEGIES - AN OVERVIEW

1.9.1 Protein interaction network strategy for drug target identification

Proteins are the principal targets of drug discovery. Knowing what

proteins are expressed and how is therefore the first step to generating

value from the knowledge of the human genome. High-throughput

proteomics, identifying potentially hundreds to thousands of protein

expression changes in model systems following perturbation by drug

treatment or disease, lends itself particularly well to target identification in

drug discovery. Protein-protein interaction is the basis of drug target

identification. Protein interaction maps can reveal novel pathways and

functional complexes, allowing ‘guilt by association’ annotation of

uncharacterized proteins. Once the pathways are mapped, these need to

be analyzed and validated functionally in a biological model. It is possible

that other proteins operating in the same pathway as a known drug target

could also represent appropriate drug targets.
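The "guilt by association" annotation mentioned above can be sketched as a majority vote over the annotated interaction partners of an uncharacterized protein. The sketch assumes an undirected interaction list and a single functional label per annotated protein. The function name and the example network are hypothetical.

```python
from collections import Counter


def guilt_by_association(protein, interactions, annotations):
    """Suggest a function for a protein from its annotated partners.

    interactions: iterable of (protein_a, protein_b) pairs (undirected).
    annotations: maps characterized proteins to a functional label.
    Returns the most common label among the partners, or None if no
    partner is annotated. Ties are broken alphabetically.
    """
    partners = set()
    for a, b in interactions:
        if a == protein and b != protein:
            partners.add(b)
        elif b == protein and a != protein:
            partners.add(a)
    votes = Counter(annotations[p] for p in partners if p in annotations)
    if not votes:
        return None
    best = max(votes.values())
    return sorted(label for label, n in votes.items() if n == best)[0]


# Hypothetical network: X is uncharacterized, A, B and C are annotated.
network = [("X", "A"), ("B", "X"), ("X", "C"), ("A", "B")]
labels = {"A": "DNA repair", "B": "DNA repair", "C": "transport"}
suggested = guilt_by_association("X", network, labels)
```

Such a suggestion is only a hypothesis; as the text notes, mapped pathways still have to be validated functionally in a biological model.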

Recent analyses of network properties of protein-protein interactions

and of metabolic maps have provided some insights into the structure of

these networks. So identifying protein-protein interactions can provide

insights into the function of important genes, elucidate relevant pathways,

and facilitate the identification of potential drug targets. Powerful

bioinformatics software enables rapid interpretation of protein-protein

interactions, accelerating functional assignment and drug target discovery.

No matter whether the number of actual drug targets is correct or not,

the available data strongly suggest that the present number of known and

well-validated drug targets is still relatively small. Bioinformatics is making

practical contributions in identifying a large number of potential drug targets;

however, target validation efforts are required to link them to the aetiology

of known diseases and/or to demonstrate that the novel targets have

relevant therapeutic potential. The biochemical pathways put a drug target

into context: one can chart those in which a target is seen, and thus make

educated guesses about the effects that blocking the target is likely to

have. Further, more complete knowledge of biological pathways should be

used to gain clues for potential target proteins. Despite the promising

results obtained in the different tests carried out by this strategy, there are

several potential problems in applications to drug target identification and

validation. First, it is yet unclear if the currently available genomic

databases, coupled with newly developed computational algorithms, can

offer sufficient information for automated in silico drug target identification.

For improving the biological accuracy of estimated gene networks, other

biological information such as sequence information on promoter regions

and protein-protein interactions should be integrated. Secondly, as real

biological processes are often condition specific, and gene expression data

tend to be noisy and often plagued by outliers, it is important to take

“conditions” or “environments” into account. The problem of capturing long-

run network behavior for large-size networks is difficult owing to the

exponential increase of the state spaces. Thirdly, an increasing population

of bioinformatics tools and the lack of an integrated and systematized

interface for their selection and utilization is becoming widely

acknowledged. Last and perhaps most important, understanding of how a

target protein works in the context of cellular pathways is rudimentary, and

linking diseases in humans to biochemical pathways studied in cells is also

difficult. Gene network identification is a really hard problem, and modeling

larger protein complexes will be an important challenge. The identification and

validation of drug targets depends critically on knowledge of the

biochemical pathways in which potential target molecules operate within

cells. This requires a restructuring of the classical linear progression from

gene identification, functional elucidation, target validation and screen

development. One of the major goals of pharmaceutical bioinformatics is to

develop computational tools for systematic in silico molecular target

identification.

One of the most important challenges for drug development, however,

is to rapidly identify target proteins most appropriate to further

development. Bioinformatics technology in the past decade has given birth

to the new paradigm of a biology-driven process. There are many exciting

developments to come in the field of target identification. Gene network

technology creates cell and organ-level computer models able to simulate

the clinical performance of drugs and drug candidates. By predicting how

and why specific compounds impact human biology, gene network

techniques may provide a glimpse of the signals and interactions within

regulatory pathways of the cell. In fact, it is now possible to think of the

whole pharmaceutical process as a computational approach, with

confirmatory experiments at each decision-point.

1.10 METHODS FOR DRUG TARGET IDENTIFICATION

The identification of disease-relevant phenotypes is followed by the

identification of novel drug targets that modulate or inhibit these responses.

Target identification can be broadly classified into three approaches:

Mechanism-driven approach

Physiological approach

Gene-driven approach

1.10.1 Mechanism-driven: Determining novel drug targets from network

structures

With the development of bioinformatics, a number of computational

techniques have been used to search for novel drug targets in

genomic information. The network-based strategy for drug

target identification attempts to reconstruct endogenous metabolic,

regulatory and signaling networks with which potential drug targets interact.

Once this information is provided by gene networks or protein

networks, the interactions between potential drug targets can

be explicitly revealed, so that the most suitable of these

potential drug targets can be determined, or at least the pool of candidate

drug targets can be narrowed considerably. For example, if a

potential drug target participates in many biological pathways of the

pathogen, inhibiting this target may interfere with many activities

associated with those pathways, and the target may therefore be a good

drug target candidate.
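
The pathway-participation heuristic described above can be illustrated with a minimal Python sketch. The pathway names, gene names and the function rank_by_pathway_count are hypothetical and serve only as an illustration; a real analysis would draw pathway membership from a curated pathway database and would also weigh factors such as essentiality and host homology.

```python
from collections import Counter

# Hypothetical pathway membership for a pathogen (illustrative names only).
pathways = {
    "cell_wall_synthesis": {"geneA", "geneB", "geneC"},
    "folate_metabolism": {"geneB", "geneD"},
    "dna_replication": {"geneB", "geneC", "geneE"},
}


def rank_by_pathway_count(pathways):
    """Return (gene, pathway count) pairs, most widely shared gene first."""
    counts = Counter(gene for members in pathways.values() for gene in members)
    # Sort by descending count, then by name so the order is deterministic.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


ranking = rank_by_pathway_count(pathways)
for gene, n_pathways in ranking:
    print(gene, n_pathways)
```

In this toy data set, the gene that appears in the most pathways is listed first and would be examined first as a candidate target.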

The mechanism-driven approach involves acquiring a molecular-level understanding of the function

of drug targets. On the molecular level, function is manifested in the

behavior of complex networks. It is necessary to know the cellular context

of the drug target and the impact of its inhibition or activation on multiple

signaling pathways. Graphical models are often used to describe genetic

networks. Generally, a gene network can be represented as a directed

graph, in which nodes indicate genes and edges represent regulatory relationships

between genes (e.g. activation or suppression). Analyzing the network

structures derived from large-scale interrogation of cellular processes holds promise

for the identification of essential mediators of signal transduction pathways

and potential drug targets. In order to find proper candidate target genes,

one needs biological knowledge of the pathways underlying the disease

process. The study of biochemical pathways is therefore the focus of numerous

researchers. However, owing to the complexity of pathway structures, many

potential drug targets have turned out to be worthless because the pathways in which

they participate were more complex than expected. A promising strategy is

to examine the functionality of different genes in the network and observe

the connectivity of different functional domains. Some researchers have

implemented this gene network-based strategy for drug target identification.
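
The directed-graph representation of a gene network, and the idea of inspecting the connectivity of its genes, can be sketched in Python as follows. The gene names, edge list and helper functions are assumptions made for illustration; they are not taken from any particular study.

```python
# Hypothetical regulatory edges: (regulator, target, effect).
edges = [
    ("g1", "g2", "activation"),
    ("g1", "g3", "suppression"),
    ("g2", "g3", "activation"),
    ("g3", "g4", "activation"),
]


def build_network(edges):
    """Store the directed graph as {regulator: {target: effect}}."""
    network = {}
    for regulator, target, effect in edges:
        network.setdefault(regulator, {})[target] = effect
        network.setdefault(target, {})  # make sure pure targets are nodes too
    return network


def connectivity(network):
    """Return {gene: (in_degree, out_degree)} for every gene in the graph."""
    in_degree = {gene: 0 for gene in network}
    for targets in network.values():
        for target in targets:
            in_degree[target] += 1
    return {gene: (in_degree[gene], len(network[gene])) for gene in network}


network = build_network(edges)
degrees = connectivity(network)
print(degrees)
```

A gene with a high out-degree regulates many others, whereas a gene with a high in-degree is controlled by many regulators; both properties may be relevant when short-listing candidate targets.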

First, using gene expression data obtained from expression

experiments at several doses and time points of drug exposure, the genes

affected by the drug (drug-affected genes) were identified by fold-change

analysis or a virtual gene technique. Because there is no guarantee

that the genes most affected by the drug are the genes that were "drugged" by

the drug agent, nor any guarantee that the drugged target

represents the most biologically available and advantageous molecular

target for intervention with new drugs, the researchers then searched for the

most suitable drug target genes upstream of the drug-affected genes in a

regulatory network. Using gene expression profiles obtained from 120 gene

disruptions, they employed a method based on a Bayesian network model to

construct a gene network. Then, by exploring the gene network, they found

the “druggable genes”, namely the drug targets that regulate the drug-affected

genes most strongly, and a novel drug target gene was identified and

validated.
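
The two steps described above, fold-change selection of drug-affected genes followed by a search for their upstream regulators, can be sketched roughly as follows. This is a simplified illustration and not the published method: the regulatory network is assumed to be already known (the study cited above inferred it with a Bayesian network model), "regulating most strongly" is approximated by the number of drug-affected genes reachable downstream, and all expression values, gene names and thresholds are hypothetical.

```python
import math
from collections import deque


def drug_affected_genes(control, treated, threshold=1.0):
    """Genes whose absolute log2 fold change is at least `threshold`."""
    affected = set()
    for gene, baseline in control.items():
        log2_fc = math.log2(treated[gene] / baseline)
        if abs(log2_fc) >= threshold:
            affected.add(gene)
    return affected


def upstream_regulators(network, affected):
    """Map each gene to the drug-affected genes it can reach downstream."""
    reach = {}
    for start in network:
        seen, queue = set(), deque([start])
        while queue:  # breadth-first search along regulatory edges
            node = queue.popleft()
            for target in network.get(node, ()):
                if target not in seen:
                    seen.add(target)
                    queue.append(target)
        hits = (seen & affected) - {start}
        if hits:
            reach[start] = hits
    return reach


# Hypothetical expression levels and regulatory network.
control = {"g1": 10.0, "g2": 8.0, "g3": 5.0, "g4": 6.0}
treated = {"g1": 10.5, "g2": 8.2, "g3": 20.0, "g4": 1.5}
network = {"g1": ["g3"], "g2": ["g3", "g4"], "g3": [], "g4": []}

affected = drug_affected_genes(control, treated)
reach = upstream_regulators(network, affected)
best = max(reach, key=lambda gene: len(reach[gene]))
print(affected, reach, best)
```

Here the candidate is the regulator that controls the largest number of drug-affected genes, even though its own expression barely changes under treatment.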

1.10.2 Gene-driven: Gene network strategy for drug target identification

The molecular interactions of genes and gene products underlie

fundamental questions of biology. Genetic interactions are central to the

understanding of molecular structure and function, cellular metabolism, and

response of organisms to their environments. If such interaction patterns

can be measured for various kinds of tissues and the corresponding data

can be interpreted, potential benefits are obvious for the identification of

candidate drug targets. It has already been demonstrated that it is possible

to infer a predictive model of a genetic network from time-series gene

expression data or from steady-state gene expression data of gene knockouts.

Using the inferred model, useful predictions can be made by mathematical

analysis and computer simulations. Recently several computational

methods have been proposed to reconstruct gene networks, such as

Boolean networks, differential equation models and Bayesian networks.

These quantitative approaches can be applied to natural gene networks

and used to generate a more comprehensive understanding of cellular

regulation, discover the underlying gene regulatory mechanisms and reveal

the interactions between drugs and the drug targets in cells.
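
As an illustration of how an inferred model can be used for prediction by computer simulation, the following minimal Python sketch simulates a small Boolean network and compares the wild type with an in silico gene knockout. The four genes and their update rules are hypothetical, and a synchronous update scheme is assumed. Since the number of possible states doubles with each added gene, exhaustive analysis of this kind is feasible only for small networks.

```python
def step(state, rules, knocked_out=frozenset()):
    """One synchronous update; knocked-out genes are held at False."""
    return {
        gene: False if gene in knocked_out else rule(state)
        for gene, rule in rules.items()
    }


def simulate(state, rules, steps, knocked_out=frozenset()):
    """Return the list of states visited, starting from `state`."""
    state = {g: (False if g in knocked_out else v) for g, v in state.items()}
    trajectory = [state]
    for _ in range(steps):
        state = step(state, rules, knocked_out)
        trajectory.append(state)
    return trajectory


# Hypothetical rules: a activates b; b activates c unless d suppresses it.
rules = {
    "a": lambda s: s["a"],                 # external input, keeps its value
    "b": lambda s: s["a"],                 # activated by a
    "c": lambda s: s["b"] and not s["d"],  # activated by b, suppressed by d
    "d": lambda s: s["d"],                 # external input, keeps its value
}
initial = {"a": True, "b": False, "c": False, "d": False}

wild_type = simulate(initial, rules, steps=3)
knockout_b = simulate(initial, rules, steps=3, knocked_out=frozenset({"b"}))
print(wild_type[-1])
print(knockout_b[-1])
```

In the wild type, gene c is switched on after two updates, whereas it stays off when b is knocked out; the simulated knockout thus mimics the effect of inhibiting a target upstream of c.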

1.10.3 Physiological approach: Protein interaction network strategy

for drug target identification

Proteins are the principal targets of drug discovery. Knowing which

proteins are expressed, and how, is therefore the first step toward generating

value from the knowledge of the human genome. Proteomics has unique

and significant advantages as an important complement to a genomics

approach. High-throughput proteomics, identifying potentially hundreds to

thousands of protein expression changes in model systems following

perturbation by drug treatment or disease, lends itself particularly well to

target identification in drug discovery. Protein-protein interactions provide a

basis for drug target identification. Protein interaction maps can reveal novel

pathways and functional complexes, allowing ‘guilt by association’

annotation of uncharacterized proteins. Once the pathways are mapped,

these need to be analyzed and validated functionally in a biological model.

It is possible that other proteins operating in the same pathway as a known

drug target could also represent appropriate drug targets. Recent analyses

of network properties of protein-protein interactions and of metabolic maps

have provided some insights into the structure of these networks.

Identifying protein-protein interactions can thus provide insights into the function

of important genes, elucidate relevant pathways, and facilitate the

identification of potential drug targets. Powerful bioinformatics software

enables rapid interpretation of protein-protein interactions, accelerating

functional assignment and drug target discovery.
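
The ‘guilt by association’ idea mentioned above can be sketched as a simple majority vote over the annotated interaction partners of an uncharacterized protein. The protein identifiers, interactions and functional labels below are hypothetical, and majority voting is only one of several possible scoring schemes.

```python
from collections import Counter

# Hypothetical undirected protein-protein interactions.
interactions = [
    ("P1", "P2"), ("P1", "P3"), ("P2", "P3"),
    ("P3", "P4"), ("P4", "P5"),
    ("PX", "P1"), ("PX", "P2"), ("PX", "P4"),
]

# Known functional annotations; PX is uncharacterized.
annotations = {
    "P1": "DNA repair",
    "P2": "DNA repair",
    "P3": "DNA repair",
    "P4": "transport",
    "P5": "transport",
}


def build_adjacency(interactions):
    """Return {protein: set of interaction partners}."""
    adjacency = {}
    for a, b in interactions:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def predict_function(protein, adjacency, annotations):
    """Most common annotation among the partners, or None if none is annotated."""
    votes = Counter(
        annotations[p] for p in adjacency.get(protein, ()) if p in annotations
    )
    if not votes:
        return None
    # Highest vote count wins; ties are broken alphabetically.
    return sorted(votes.items(), key=lambda item: (-item[1], item[0]))[0][0]


adjacency = build_adjacency(interactions)
predicted = predict_function("PX", adjacency, annotations)
print(predicted)
```

Such a prediction is only a hypothesis; as noted above, the mapped pathways still need to be validated functionally in a biological model.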

Whether or not current estimates of the number of actual drug targets are correct,

the available data strongly suggest that the present number of known and

well-validated drug targets is still relatively small. Bioinformatics is making

practical contributions in identifying a large number of potential drug targets;