

Applications of recombinant DNA technology in gastrointestinal medicine and hepatology: Basic paradigms of molecular cell biology. Part A: Eukaryotic gene structure and DNA replication

Gary E Wild MD CM PhD FRCPC¹, Patrizia Papalia BSc¹, Mark J Ropeleski MD CM FRCPC¹, Julio Faria MD CM FRCSC¹, Alan BR Thomson MD PhD FRCPC²

¹Department of Medicine, Division of Gastroenterology, McGill University Health Centre, and McGill University Inflammatory Bowel Disease Research Program, Montreal, Quebec; ²Department of Medicine, Division of Gastroenterology, University of Alberta, Edmonton, Alberta

Correspondence: Dr Gary E Wild, Montreal General Hospital, 1650 Cedar Avenue, Montreal, Quebec H3G 1A4. Telephone 514-934-8308, fax 514-934-8411, e-mail [email protected]

Received for publication March 23, 1999. Accepted July 15, 1999

Can J Gastroenterol Vol 14 No 2 February 2000

REVIEW

GE Wild, P Papalia, MJ Ropeleski, J Faria, ABR Thomson. Applications of recombinant DNA technology in gastrointestinal medicine and hepatology: Basic paradigms of molecular cell biology. Part A: Eukaryotic gene structure and DNA replication. Can J Gastroenterol 2000;14(2):99-110. Progress in the basic sciences of cell and molecular biology has provided an exciting dimension that has translated into clinically relevant information in every medical subspecialty. Importantly, the application of recombinant DNA technology has played a major role in unravelling the intricacies related to the molecular pathophysiology of disease. This series of review articles constitutes a framework for the integration of the database of new information into the core knowledge base of concepts related to the pathogenesis of gastrointestinal disorders and liver disease. The goal of this series of three articles is to review the basic principles of eukaryotic gene expression. The first article examines the role of DNA in directing the flow of genetic information in eukaryotic cells.

Key Words: DNA; Eukaryotic cells; Nucleic acids; RNA

Applications de la technologie de l’ADN recombinant en gastro-entérologie et en hépatologie : paradigmes fondamentaux de la biologie moléculaire. Partie A : structure et réplication de l’ADN du gène eucaryotique.

RÉSUMÉ : Les progrès de la recherche fondamentale en biologie cellulaire et moléculaire ont ouvert la porte à une nouvelle dimension et à de nouvelles données cliniques pertinentes pour toutes les sous-spécialités médicales. Fait important à noter, l’application de la technologie de l’ADN recombinant a joué un rôle de tout premier plan pour nous aider à comprendre la physiopathologie moléculaire complexe des maladies. Cette série d’articles de synthèse se veut un cadre d’intégration pour l’ensemble des données qui viennent d’enrichir le corpus de nos connaissances relativement aux concepts expliquant la pathogenèse des maladies gastro-intestinales et hépatiques. L’objectif de cette série de trois articles est de passer en revue les principes fondamentaux de l’expression du gène eucaryotique. Le premier article se penche sur le rôle de l’ADN dans la transmission de l’information génétique dans les cellules eucaryotiques.

For most gastroenterologists, the principles of cell and molecular biology have not traditionally played a major role in day to day clinical practice. However, tremendous advances in the discipline of molecular medicine have provided new insights into the cellular and molecular pathological basis of disease. This ever increasing expansion of the knowledge base has transformed the understanding and management of a diverse array of diseases. The cumulative research efforts in cell and molecular biology have provided an exciting dimension that has translated into clinically relevant information in every medical subspecialty. For example, hematologists have defined the molecular basis of the hemoglobinopathies. Endocrinologists have defined the cellular and molecular networks that mediate the action of hormones. Neurologists have identified a host of gene mutations that lead to neurodegenerative disorders. Finally, the identification of the cystic fibrosis transmembrane conductance regulator has facilitated the molecular diagnosis of the disease, and, as a result, gene therapy protocols are being conducted at several centres.

Many of the recent advances in molecular medicine have arisen through efforts driven by the Human Genome Project. It is apparent that molecular biology has accounted for a dramatic paradigm shift in both the teaching and the practice of medicine. This series of review articles constitutes a framework for the integration of the database of new information into the core knowledge base of concepts related to the pathogenesis of gastrointestinal disorders and liver disease. We hope to provide the reader with a set of tools to facilitate the understanding of some of the basic concepts of recombinant DNA technology and the role it has played in unravelling the intricacies related to the molecular pathophysiology of disease. As well, we wish to provide the reader with a flavour for the pervasive impact of molecular medicine in the areas of gastroenterology and hepatology. The goal of this first series of three articles is to review the basic principles of eukaryotic gene expression.

NUCLEIC ACIDS AND INFORMATION TRANSFER IN THE CELLS

DNA is the storage form of genetic information in cells. The structure of DNA was determined by Watson and Crick in 1953, and this discovery has revolutionized the thinking in modern cell biology. All DNA molecules consist of four types of nucleotides joined together by phosphodiester bonds to form polynucleotides. The nitrogenous bases found in DNA consist of purines (ie, adenine [A] and guanine [G]) and pyrimidines (ie, cytosine [C] and thymine [T]) (Figure 1). The nucleotides are linked together by covalent phosphodiester bonds that join the 5′ carbon of one deoxyribose to the 3′ carbon of the adjacent deoxyribose to form polynucleotide chains. The two polynucleotide strands of the double-stranded DNA helix run in an antiparallel orientation and are held together by hydrogen bonding between A and T residues, and between G and C residues. The antiparallel orientation in base pairing is an important concept in nucleic acid biochemistry. One strand runs in a 5′ to 3′ direction, and the complementary strand runs in the 3′ to 5′ direction (Figure 1). Thus, the two strands of the double helix are complementary. For example, the sequence CTGAAGCGCTTA (written 5′ to 3′) on one strand of DNA has the complementary sequence GACTTCGCGAAT (written 3′ to 5′) on the opposite strand of DNA in an antiparallel orientation. The variation of the sequence of nucleotides along the DNA strand determines the function of each section of the DNA molecule as well as its ability to transmit information to RNA and protein.
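The pairing rules and the antiparallel convention described above can be checked with a short sketch. The function names `complement` and `reverse_complement` are our own illustrative choices, not part of any established library; the example sequence is the one quoted in the text.

```python
# Watson-Crick pairing rules stated in the text: A pairs with T, G pairs with C.
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the paired bases in the same left-to-right order.

    If `strand` is written 5' to 3', the result reads 3' to 5'.
    """
    return "".join(DNA_COMPLEMENT[base] for base in strand)


def reverse_complement(strand: str) -> str:
    """Return the opposite strand rewritten in the conventional 5' to 3' order."""
    return complement(strand)[::-1]


top_strand = "CTGAAGCGCTTA"            # written 5' to 3'
print(complement(top_strand))          # opposite strand, read 3' to 5'
print(reverse_complement(top_strand))  # same opposite strand, read 5' to 3'
```

The first call reproduces the sequence GACTTCGCGAAT given in the text. The second shows why the reading direction must always be stated: the same physical strand is written TAAGCGCTTCAG when read 5′ to 3′.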

RNA molecules consist of nucleotides linked together by phosphodiester bonds. RNA generally occurs as single-stranded polynucleotides and contains ribose in place of the deoxyribose found in DNA. RNA is made up of four bases, including A, G and C, but contains uracil (U) in place of T. Because U pairs with A in the same way that T pairs with A, the four bases found in RNA (A, U, G and C) can form complementary pairs with other bases found in RNA as well as with the bases found in DNA. These biochemical properties highlight the major function of the RNA molecule in the transfer of information from DNA to protein in eukaryotic cells. RNA often contains intramolecular hydrogen bonding, which gives rise to secondary structures. Intrastrand base pairing creates structures known as stem loop structures, with the base pairing sections forming the stem and noncomplementary bases forming the loop.
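As a rough illustration of intrastrand pairing, the sketch below counts how many base pairs form when the 5′ end of an RNA folds back onto its 3′ end. It is a deliberately simplified model: the function name `stem_length`, the minimum loop size of three bases and the example sequence are all assumptions made for illustration, and only A-U and G-C pairs are considered.

```python
# RNA pairing rules from the text: A pairs with U, G pairs with C.
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def stem_length(rna: str, min_loop: int = 3) -> int:
    """Count consecutive base pairs between the two ends of `rna`.

    Pairing proceeds inward from both ends and stops at the first mismatch
    or when fewer than `min_loop` unpaired bases would remain for the loop.
    """
    pairs = 0
    i, j = 0, len(rna) - 1
    while j - i - 1 >= min_loop and RNA_PAIR.get(rna[i]) == rna[j]:
        pairs += 1
        i += 1
        j -= 1
    return pairs


hairpin = "GGCACUUCGGUGCC"  # hypothetical sequence: 5-base stem, 4-base loop
n = stem_length(hairpin)
print("stem pairs:", n)
print("loop bases:", hairpin[n:len(hairpin) - n])
```

For this sequence the five bases at each end pair to form the stem, and the four central bases (UUCG) remain unpaired as the loop.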

Eukaryotic cells contain five classes of RNA: messenger RNA (mRNA), transfer RNA (tRNA), ribosomal RNA (rRNA), heterogeneous nuclear RNA (hnRNA) and small nuclear RNA (snRNA). mRNA makes up a small percentage of the total RNA (1% to 5%) in eukaryotic cells, has a short half-life and demonstrates a large variation in base sequence from one mRNA molecule to another. mRNA is the chemical messenger that carries information from the DNA helix to the protein synthesizing machinery in the cytoplasm.

Figure 1) Base pairing and the antiparallel orientation of DNA. The two DNA strands in the helix have opposite polarity, with one strand running in a 5′ to 3′ direction and the other running in the 3′ to 5′ direction. Four bases (adenine [A], thymine [T], cytosine [C] and guanine [G]) reside on the inside of the helix to allow hydrogen bonding between purine and pyrimidine residues

tRNA molecules are polynucleotides made up of 75 to 95 nucleotides that carry specific amino acids to the ribosomes during protein synthesis. There is a unique tRNA that specifically recognizes each of the 20 amino acids. In some instances, there is more than one tRNA species for a single amino acid. rRNA is the most abundant of the RNA species in eukaryotic cells and is found associated with proteins in structures called ribosomes. These specific rRNAs of eukaryotic cells are designated by their sedimentation coefficients (S values). Human ribosomes contain 28S, 18S, 5.8S and 5S rRNA species.

hnRNA and snRNA species are located in the nucleus of eukaryotic cells. hnRNA is the immediate product of transcription and is complementary to one strand of the DNA helix. hnRNA is the precursor to mRNA before it undergoes further processing. snRNA is found associated with specific proteins that are involved in the processing of the hnRNA to mRNA before exit of the mRNA from the nucleus to the cytoplasm. The role of these RNA molecules in transcription and translation is discussed in detail in subsequent sections. These topics have been reviewed in detail (1-8).

MOLECULAR ANATOMY OF EUKARYOTIC GENES

Eukaryotic genomes are larger and more complex than those of primitive prokaryotes (ie, bacteria). For example, the human genome contains approximately 100,000 genes, and much of its complexity arises from the abundance of several different types of noncoding DNA sequences.

A gene can be defined as a segment of DNA that is expressed to yield a functional product that may be either an RNA or a peptide. The structural features that are common to all eukaryotic genes are illustrated in Figure 2. The sequence of base pairs confers gene specificity and determines the specificity of the product that it encodes. However, not all of the nucleotides present in the gene are expressed in the final product. Eukaryotic genes are often split into exons (sequences that remain in the final mature mRNA) and introns (sequences that are removed from the primary mRNA transcript early during processing, most of which have no known function). In addition to encoding sequence information that ultimately defines the protein product, exons contain other sequences that are essential for the organized function of mRNA. Thus, an exon is defined as a sequence in the primary RNA transcript that is conserved during the processing of the transcript into a mature mRNA molecule.

Unique sequences that signal the start of transcription are present in each gene. These sequences are promoter sequences, and they determine the site at which transcription is initiated on the DNA molecule. Transcription is initiated when RNA polymerase, along with transcription factors, binds to the promoter site and catalyzes the synthesis of RNA. RNA polymerase transcribes RNA by using the sequence of bases from one strand of the DNA double helix as a template. RNA is synthesized as a single-stranded molecule in the 5′ to 3′ direction.
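The directionality of transcription can be made concrete with a minimal sketch. It assumes the template strand is supplied in the conventional 5′ to 3′ orientation, so the polymerase is modelled as reading it backwards (3′ to 5′) while the RNA grows 5′ to 3′. The function name `transcribe` and the six-base example are illustrative only, and promoter recognition is not modelled.

```python
# DNA template base -> RNA base added opposite it (U replaces T in RNA).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_5to3: str) -> str:
    """Return the RNA (5' to 3') copied from a DNA template strand.

    The template is read 3' to 5', so it is traversed in reverse here.
    """
    return "".join(DNA_TO_RNA[base] for base in reversed(template_5to3))


# Hypothetical example: the non-template (coding) strand is 5'-ATGGCC-3',
# so the template strand, written 5' to 3', is GGCCAT.
rna = transcribe("GGCCAT")
print(rna)  # identical to the coding strand with U in place of T
```

The transcript matches the non-template strand base for base, apart from the substitution of U for T, which is why gene sequences are conventionally reported as the coding strand.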

Further processing of mRNA transcripts to yield a mature RNA product involves a series of steps, including the addition of a cap structure at the 5′ end of the mRNA and the addition of a poly A tail at the 3′ end. Untranslated regions (UTRs) are situated at both the 3′ and 5′ ends of the mRNA and are sequences in the exons that remain in the mRNA but are not translated into proteins. These regions contain signals required for mRNA processing and its subsequent translation into protein. For further details, see references 4 and 8.

ORGANIZATION OF EUKARYOTIC GENOMES

The average polypeptide is approximately 400 amino acids long; thus, the average size of the coding sequence of a gene is 1200 base pairs. Each amino acid is determined by a set of three nucleotides called a codon. In contrast to Escherichia coli and yeasts, the human genome contains large amounts of noncoding DNA. Thus, only a small proportion of the total 3×10⁹ base pairs of the human genome is expected to correspond to protein coding sequences. The average gene spans 10,000 to 20,000 base pairs (including introns) such that the human genome consists of approximately 100,000 genes that correspond to 3% of the total human DNA. This topic is covered extensively in references 4 and 8.
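The arithmetic behind these estimates can be laid out explicitly. The sketch below uses only the figures quoted in the paragraph above; the variable names are our own. With these round numbers the coding fraction works out to about 4%, the same order as the approximately 3% cited in the text.

```python
# Figures quoted in the text (the gene count is the article's estimate).
AMINO_ACIDS_PER_POLYPEPTIDE = 400
NUCLEOTIDES_PER_CODON = 3
GENES_IN_GENOME = 100_000
GENOME_SIZE_BP = 3_000_000_000

# One codon (three nucleotides) per amino acid gives the coding length.
coding_bp_per_gene = AMINO_ACIDS_PER_POLYPEPTIDE * NUCLEOTIDES_PER_CODON

# Total protein-coding sequence across all genes, and its share of the genome.
total_coding_bp = coding_bp_per_gene * GENES_IN_GENOME
coding_fraction = total_coding_bp / GENOME_SIZE_BP

print(f"coding sequence per gene: {coding_bp_per_gene} bp")
print(f"total coding sequence: {total_coding_bp:.1e} bp")
print(f"fraction of genome: {coding_fraction:.1%}")
```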

Several types of highly repeated sequences exist in eukaryotic genomes. One class, called simple-sequence DNA, contains tandem arrays of thousands of copies of short sequences ranging from five to 200 nucleotides. Such repeat sequence DNA accounts for approximately 10% to 20% of the DNA in higher eukaryotes and is called satellite DNA. Other repetitive DNA sequences are scattered throughout the genome rather than being clustered as tandem repeats. These sequences are classified as either short (SINEs) or long (LINEs) interspersed elements. The major SINEs in the mammalian genome are Alu sequences, which contain a recognition site for the restriction endonuclease AluI. These Alu sequences (300 base pairs long) are dispersed throughout the genome and account for nearly 10% of the total cellular DNA. The major human LINEs are about 6000 base pairs in length and repeat approximately 50,000 times in the human genome. In contrast to Alu sequences, LINE sequences are transcribed, and some encode proteins of unknown function.

Figure 2) Molecular anatomy of human genes. A typical human gene contains exon and intron sequences that are transcribed by RNA polymerase into the primary transcript. This primary transcript is subsequently processed by the addition of a cap structure at the 5′ end and the addition of a poly A tail to the 3′ end. The intron sequences are removed, and the exonic RNA sequences are spliced together. The mature mRNA contains only exonic RNA sequences that have information for protein sequences as well as signals for the initiation and termination of protein synthesis. UTR Untranslated regions

Eukaryotic DNA is tightly associated with small basic proteins (ie, rich in arginine and lysine) called histones. The complexes between eukaryotic DNA and proteins are called chromatin, which contains about twice as much protein as DNA. Five major types of histones have been identified: H1, H2A, H2B, H3 and H4. In addition, chromatin contains a variety of nonhistone chromosomal proteins, which are involved in DNA replication and gene expression. The association of DNA and protein to form chromatin is illustrated in Figure 3.

The basic structural unit of chromatin is the nucleosome, which recurs at intervals of approximately 200 base pairs. Nucleosomes contain a core particle that contains 146 base pairs of DNA wrapped 1.75 times around a histone core consisting of two molecules each of H2A, H2B, H3 and H4. The other structural feature of the nucleosome is the chromatosome, which contains two full turns of DNA (166 base pairs) held in place by one molecule of H1. The structure (ie, degree of condensation) of chromatin is intimately linked to the control of gene expression in eukaryotes. The extent of chromatin condensation varies during the life cycle of the cell. In interphase (ie, nondividing) cells, most of the chromatin, called euchromatin, is decondensed and distributed throughout the nucleus. Genes are transcribed during this period of the cell cycle, and the DNA is replicated in preparation for mitosis. By contrast, about 10% of interphase chromatin is in a very highly condensed state called heterochromatin. Heterochromatin is transcriptionally inactive and contains highly repeated DNA sequences.

The human genome is distributed among 24 chromosomes (22 autosomes and the two sex chromosomes), each containing between 5×10⁴ and 26×10⁴ kilobases of DNA. The chromosomes have three well defined structures that are essential for their replication: DNA replication origins, centromeres and telomeres. The DNA replication origins are considered in detail in the section on DNA synthesis. Centromeres consist of highly repetitive DNA sequences and are the site where the two sister chromatids are attached. The function of the centromere is to ensure the equal distribution of each chromosome to the daughter cells at cell division. The telomere is an important structure associated with the ends of all human chromosomes. Telomeric DNA consists of multiple tandem repeats of the sequence TTAGGG located at both ends of each chromatid. Telomeres perform a variety of functions in human cells, including the following.

• Telomeres maintain chromosomal stability and prevent the formation of end-to-end fusions. The presence of telomeric sequences protects chromosomal ends from nuclease degradation.

• Telomeres ensure the proper replication of the ends of chromosomes. DNA ends are not completely replicated during DNA replication and require the presence of the enzyme telomerase to add nucleotides to the extreme ends of the DNA molecule. The presence of noncoding telomeric sequences at the chromosomal ends protects the coding sequences of the DNA that might be located near the terminal ends of a chromosome from being lost during each cycle of replication.

• Telomeres serve as markers of chromosomal integrity. In the event that a chromosome is damaged, the cell cycle stops temporarily such that DNA repair mechanisms can repair the damage.
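The tandem-repeat organization of telomeric DNA lends itself to a simple check. The sketch below counts how many uninterrupted copies of TTAGGG sit at the 3′ end of a sequence. The function name `count_terminal_repeats` and the example sequence are hypothetical, and the sequence is assumed to be the G-rich strand written 5′ to 3′.

```python
TELOMERE_REPEAT = "TTAGGG"  # human telomeric repeat given in the text


def count_terminal_repeats(sequence: str, repeat: str = TELOMERE_REPEAT) -> int:
    """Count uninterrupted copies of `repeat` at the 3' end of `sequence`."""
    count = 0
    end = len(sequence)
    # Step backwards one repeat unit at a time until the pattern breaks.
    while end >= len(repeat) and sequence[end - len(repeat):end] == repeat:
        count += 1
        end -= len(repeat)
    return count


# Hypothetical chromosome end: unique sequence followed by four repeats.
chromosome_end = "ACGTACGT" + TELOMERE_REPEAT * 4
print(count_terminal_repeats(chromosome_end))
```

Because the repeats are noncoding, losing a few copies from the end during a round of replication shortens this count without touching the unique sequence upstream, which is the protective role described in the second point above.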

Figure 3) The packaging of DNA in the nucleus. A model is depicted for the progressive stages of DNA coiling and folding in the nucleus. The hierarchy of structural features arising from the DNA double helix includes nucleosomes, chromatin fibres and their looped domains, and heterochromatin, which makes up the arms of the chromosomes

FLOW OF GENETIC INFORMATION IN EUKARYOTIC CELLS

The expression of genetic information in all eukaryotic cells is largely a one way system of traffic. DNA directs the synthesis of RNA, and RNA specifies the synthesis of polypeptides that subsequently form proteins. Because of its universality, the DNA to RNA to protein flow of genetic information is called the ‘central dogma of molecular biology’. The synthesis of RNA using DNA as a template and RNA polymerase is called ‘transcription’. Transcription occurs in the nucleus of eukaryotic cells and to a limited extent in mitochondria. The second step involves polypeptide synthesis and is called ‘translation’. Translation occurs on ribosomes, which are large RNA protein complexes found in the cytoplasm. The RNA molecules that specify polypeptides are known as mRNAs.

Gene expression has traditionally followed a colinearity principle where the linear sequence of the nucleotides in DNA is decoded to give a linear sequence of nucleotides in RNA. In turn, this linear sequence can be decoded to give rise to a linear sequence of amino acids in the polypeptide product. A challenge to this concept has been made by recent findings that eukaryotic cells, including mammalian cells, contain nonviral chromosomal DNA sequences that encode cellular reverse transcriptases. Many different classes of viruses have a genome that consists of RNA. Retroviruses such as human immunodeficiency virus are a subclass of RNA viruses in which the RNA replicates via a DNA intermediate by using an RNA-dependent DNA polymerase called reverse transcriptase. Because some nonviral RNA sequences in eukaryotic cells are known to act as templates for cellular DNA synthesis, the principle of unidirectional flow of genetic information is no longer strictly valid. The overall flow of genetic information and gene expression in eukaryotic cells is illustrated in Figure 4, and is reviewed in references 1-8.
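The colinearity principle, and the reverse transcription step that runs against the usual direction, can both be sketched in a few lines. The codon table below is a small subset of the standard genetic code, included only to keep the example short; the function names `translate` and `reverse_transcribe` and the example mRNA are our own illustrative assumptions.

```python
# Subset of the standard genetic code; None marks a stop codon.
CODON_TABLE = {
    "AUG": "Met", "GCC": "Ala", "AAA": "Lys", "UGG": "Trp", "UUU": "Phe",
    "UAA": None, "UAG": None, "UGA": None,
}
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def translate(mrna: str) -> list[str]:
    """Read codons 5' to 3' from the first AUG until a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return []
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon not in CODON_TABLE:
            raise ValueError(f"codon {codon} is not in this partial table")
        amino_acid = CODON_TABLE[codon]
        if amino_acid is None:  # stop codon: release the polypeptide
            break
        peptide.append(amino_acid)
    return peptide


def reverse_transcribe(rna: str) -> str:
    """Return the cDNA strand (5' to 3') copied from an RNA template."""
    return "".join(RNA_TO_DNA[base] for base in reversed(rna))


mrna = "GGAUGGCCAAAUGGUAAGC"  # hypothetical mRNA with short untranslated ends
print(translate(mrna))
print(reverse_transcribe("AUGGCC"))
```

The order of amino acids follows the order of codons in the mRNA, which in turn follows the DNA, while `reverse_transcribe` shows the RNA to DNA step catalyzed by reverse transcriptase.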

Figure 4) Gene expression in the eukaryotic cell. The expression of genetic information in eukaryotic cells is largely a one way system. DNA specifies the synthesis of RNA, and RNA specifies the synthesis of polypeptides, which subsequently form proteins. A small proportion of nuclear RNA molecules can be converted to cDNA by reverse transcriptases and subsequently integrate into chromosomal DNA

THE CELL CYCLE

The cellular processes that determine DNA replication and mitosis are the keys to normal cell growth and development. These processes occur during a well regulated and orderly progression through the mammalian cell cycle (Figure 5). The regulation of the cell cycle ultimately determines how a cell will cycle among growth, differentiation and division phases. Cell cycle control is a key determinant of differentiation or the decision to stop cycling. The loss of control of the cell cycle leads to abnormal cell growth, which results in tumourigenesis, developmental defects or premature programmed cell death (ie, apoptosis). The topic of cell cycle control is covered in detail in references 9-13.

The mammalian cell cycle comprises four distinct phases: gap (G) 1 phase, synthetic (S) phase, G2 phase and mitotic (M) phase. The period between one M phase and the next is called interphase. Interphase is divided into the remaining three phases of the cell cycle (ie, G1, S, G2). The G1 phase is the interval between the completion of the M phase and the onset of the S phase. The G2 phase is the interval between the end of the S phase and the beginning of the M phase. DNA is replicated during the S phase and is distributed equally to two daughter cells during the M phase. The cells prepare for the S phase during G1 and for the M phase during G2 (the interval when proteins are synthesized in preparation for mitosis). Cells that do not undergo division, such as neurons, exit the cell cycle and enter a phase called G0. If cells in G0 are stimulated to grow, they move from G0 into the G1 phase.

Progression through the cell cycle is mediated by multiple cyclin-dependent kinases (Cdk) that are sequentially activated by the binding of cyclins. The activated Cdk-cyclin protein complex phosphorylates specific proteins that are required for the reactions unique to each distinct phase of the cell cycle. Cyclin levels vary dramatically during the cell cycle. For example, cyclin B levels increase during interphase and subsequently decline during the M phase. The changes in the level of cyclin B are correlated with the activity of a specific Cdk called Cdc2, which is active when cyclin B levels peak and becomes inactive as cyclin B declines. Thus, the phosphorylating activity of Cdc2 is modulated during the cell cycle by the availability of cyclin B. The activation of Cdc2 also depends on phosphorylation of a specific threonine residue, thus adding a second layer to the control of the kinase activity.

A variety of cell cycle ‘checkpoints’ monitor progression through the cell cycle. Deviation from the normal cell cycle impedes progression beyond the checkpoint, and the cell cycle is halted until the defect is corrected. Thus, the orderly progression through the cell cycle depends on both positive factors, which drive the cell cycle forward, and negative factors, which halt the cycle at a particular stage. Cdk and specific cyclins are the main positive factors, which function at each stage of the cell cycle. Negative factors block the activity of the specific Cdk and are called cyclin-dependent kinase inhibitors (CKI) (Table 1).

Figure 5) Eukaryotic cell cycle. Cyclin-dependent kinases (Cdk), cyclins and Cdk inhibitors (CKIs) interact during the cell cycle. Progression during the cell cycle is regulated by interaction of positive and negative regulatory factors. The positive progression is directed by multiple cyclin-cyclin-dependent kinase complexes, which act by phosphorylating various proteins at the different stages in the cycle. Negative regulatory factors include CKIs such as p16, p21 and p27, which inhibit phosphorylation of proteins by kinase and stop the cell cycle

The following mechanisms are responsible for the inactivation of an active Cdk-cyclin complex.

• The cyclin molecule can be degraded through the ubiquitin protein-degrading system.

• The critical phosphate required for activation of the kinase activity can be removed from the protein by a specific phosphatase.

• CKI molecules interact with Cdk or Cdk-cyclin complexes and inhibit the kinase activity. Two classes of CKIs have been described, the inhibitor of Cdk (INK) class and the kinase inhibitory protein (KIP) class (Table 1).
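The three inactivation routes listed above can be summarized as a toy logical model. This is a sketch under strong simplifying assumptions: kinase activity is treated as all-or-none, only the single activating phosphate is represented, and the class and function names (`CdkComplex`, `degrade_cyclin`, `remove_phosphate`, `bind_cki`) are hypothetical labels rather than established terminology.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CdkComplex:
    """Minimal state of one Cdk molecule (all-or-none simplification)."""
    cyclin_bound: bool = True
    activating_phosphate: bool = True
    cki_bound: bool = False

    def is_active(self) -> bool:
        # Active only with cyclin bound, the activating phosphate present
        # and no inhibitor attached.
        return (self.cyclin_bound
                and self.activating_phosphate
                and not self.cki_bound)


def degrade_cyclin(cdk: CdkComplex) -> CdkComplex:
    """Route 1: ubiquitin-mediated degradation of the cyclin partner."""
    return replace(cdk, cyclin_bound=False)


def remove_phosphate(cdk: CdkComplex) -> CdkComplex:
    """Route 2: a phosphatase removes the activating phosphate."""
    return replace(cdk, activating_phosphate=False)


def bind_cki(cdk: CdkComplex) -> CdkComplex:
    """Route 3: a Cdk inhibitor (INK or KIP class) binds the complex."""
    return replace(cdk, cki_bound=True)


active = CdkComplex()
for route in (degrade_cyclin, remove_phosphate, bind_cki):
    print(route.__name__, "->", route(active).is_active())
```

Each route alone is enough to switch the kinase off in this model, which mirrors the point that the cell has several independent brakes on the same complex.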

Thus, the interplay between the activation and deactivation of the Cdk activities at various stages of the cell cycle is the key determinant of the normal progression and regulation of the cell cycle.

The G1 phase: The G1 phase heralds the onset of the cell cycle. Resting cells that are stimulated to divide enter the G1 phase, and once the cell passes this point it is committed to entering the S phase and subsequently divides. The key positive regulators of the G1 phase are Cdk4 and cyclins of the D family, which form a complex capable of phosphorylating a host of proteins required for cell function in the G1 phase. The retinoblastoma protein (pRb) is a key protein phosphorylated by the Cdk4-cyclin D complex in G1. pRb exists in a nonphosphorylated form during the first two-thirds of the G1 phase and becomes phosphorylated just before the transition from the G1 to the S phase. Nonphosphorylated pRb functions by restricting cell growth, whereas phosphorylated pRb is associated with a loss of growth inhibitory function and allows the cell to proceed through the cell cycle. Thus, pRb functions as a regulator that represses or activates specific promoters through interaction with and modification of the activities of transcription factors that bind to DNA and regulate the expression of cell cycle genes. The phosphorylation of pRb by the Cdk4-cyclin D complex allows previously repressed genes to be transcribed and allows the cell to progress from the G1 to the S phase.

The Cdk inhibitor p27 is a second important control thatregulates the progression of a cell from the G1 to the S phase.This protein binds to the Cdk2-cyclin E complex and inacti-vates it. The cells are unable to proceed onto the S phase andremain arrested in G1. Growth-promoting factors result inthe degradation of p27, activation of the Cdk2-cyclin E com-plex and transition into the S phase. The ubiquitin protein-degrading system is responsible for the degradation of p27.The S phase: Entry into the S phase is determined by a puta-tive cytoplasmic signal that is most likely an active Cdk-cyclin complex. Entrance into the S phase from the G1phase and progression through the S phase to the G2 phasedepend on the function of specific Cdk-cyclin complexes.Cdk2 initially binds cyclin E as the cells proceed into the Sphase. Cyclin A activates Cdk2 and phosphorylation of pro-teins required for DNA replication.The G2/M phase: The G2/M phase is a critical checkpointwhere cells decide whether to enter mitosis. The critical pro-teins involved in the G2/M checkpoint include Cdc2 andcyclin B, which form a complex, and the Cdc2-cyclin B com-plex is essential for entrance into and exit from the M phase.This involves activation and deactivation of the Cdc2-cyclin B complex through a series of phosphorylation anddephosphorylation steps.The M phase: The sudden activation of the Cdc2-cyclin Bcomplex by dephosphorylation, which occurs at the G2/Mborder, results in the phosphorylation of a variety of proteinsrequired for mitosis. Three checkpoints are key to the orderlyentrance into and exit from mitosis, with each daughter cellreceiving an exact copy of the parental genome. 
These threecheckpoints are the transition from G2 to M concurrentwith the activation of the Cdc2-cyclin B complex; the Mphase checkpoint that occurs during metaphase (the pointthat regulates the timing of the separation of the chromatidsand the initiation of anaphase); and the immediate prote-olytic destruction of cyclin B at the onset of anaphase, withthe concomitant inactivation of Cdc2 (which allows the cellto exit the M phase and enter a new G1 phase). Thesecheckpoints are regulated by the ubiquitin pathway.The role of p53 and p21 in the control of cell damage: Theorderly progression within the cell cycle as well as the ability

104 Can J Gastroenterol Vol 14 No 2 February 2000

Wild et al

TABLE 1
Cyclin-dependent kinases (Cdk), cyclins and cyclin-dependent kinase inhibitors (CKIs) at different stages of the cell cycle

Cell cycle phase | Cdk  | Cyclin             | CKI: KIP* | CKI: INK†
G1               | Cdk4 | Cyclin D           | p21, p27  | p15, p16
G1/S             | Cdk2 | Cyclin E           | p21, p27  |
S                | Cdk2 | Cyclin A           | p21       |
G2/M             | Cdc2 | Cyclin B           | p21       |
M                | Cdc2 | Cyclin B, cyclin A |           |

*Kinase inhibitory proteins (KIP) (p21 and p27) bind multiple cyclin-Cdk complexes and prevent activation or inhibit kinase activity; †Inhibitor of Cdk (INK) proteins (p15 and p16) are specific for Cdk4/6 and cyclin D; they bind the Cdk and inhibit the binding of cyclin D
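For readers who prefer to query Table 1 rather than scan it, the table can be transcribed into a small lookup. This is only an illustrative sketch: the dictionary layout and the function name `phases_inhibited_by` are assumptions made for this example, and the entries simply repeat the rows of Table 1.

```python
# Table 1 transcribed as a lookup keyed by cell cycle phase.
# The layout and names are illustrative assumptions, not part of the source.
CELL_CYCLE_REGULATORS = {
    "G1":   {"cdk": "Cdk4", "cyclins": ["D"],      "kip": ["p21", "p27"], "ink": ["p15", "p16"]},
    "G1/S": {"cdk": "Cdk2", "cyclins": ["E"],      "kip": ["p21", "p27"], "ink": []},
    "S":    {"cdk": "Cdk2", "cyclins": ["A"],      "kip": ["p21"],        "ink": []},
    "G2/M": {"cdk": "Cdc2", "cyclins": ["B"],      "kip": ["p21"],        "ink": []},
    "M":    {"cdk": "Cdc2", "cyclins": ["B", "A"], "kip": [],             "ink": []},
}


def phases_inhibited_by(cki):
    """Return the phases in which Table 1 lists this CKI, in table order."""
    return [phase for phase, row in CELL_CYCLE_REGULATORS.items()
            if cki in row["kip"] or cki in row["ink"]]


print(phases_inhibited_by("p21"))  # the broadly acting KIP inhibitor
print(phases_inhibited_by("p16"))  # an INK inhibitor, listed for G1 only
```

The lookup makes the contrast in the table footnote explicit: p21 appears against several Cdk-cyclin complexes, whereas the INK proteins appear only against the Cdk4-cyclin D row.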

of the cell to sense any perturbation from its normal state is crucial to normal cell growth and development. Cells have evolved negative regulatory mechanisms that sense physiological disturbances, DNA damage, hypoxia, nutrient depletion and viral infection. The cell can arrest at a particular stage of the cell cycle, or in some instances the cell undergoes programmed cell death called apoptosis.

The DNA binding protein, p53, orchestrates the negative regulatory mechanisms that occur when the cell is damaged. The p53 protein is a tumour-suppressor protein and activates transcription of the gene encoding the Cdk inhibitor, p21. The p21 protein binds to multiple cyclin-Cdk complexes and blocks the kinase activity. This inhibits the phosphorylation of proteins required for the various stages of the cell cycle. The binding of p21 to the G1 cyclin-Cdk complexes is central to the arrest in the G1 phase that follows DNA damage by radiation. This allows time for the DNA repair mechanisms to correct the damage. Another function of p21 is to bind proliferating cell nuclear antigen (PCNA). PCNA is a cofactor required for full activity of DNA polymerase-delta. DNA replication is inhibited when p21 is bound to PCNA. The roles that p53 and p21 play in damage control in cells are illustrated in Figure 6.

Mutations that result in the loss or alteration of p53 activity can lead to cancer development. Abnormal p53 levels are associated with the loss of the cell's ability to halt the progression of the cell cycle under the aforementioned adverse conditions. Therefore, the cell continues to proliferate, resulting in a defective phenotype.

DNA REPLICATION
As described earlier, the replication of DNA occurs during the S phase of the cell cycle. The S phase occupies approximately 30% of the cell cycle time. The replication of DNA is a semiconservative process, wherein each parental strand of the DNA helix serves as a template for the synthesis of a new and complementary daughter strand. In human diploid cells, this involves the replication of six billion base pairs of DNA. The reader may consult references 14-19 for further details about DNA replication.

A diverse array of enzymes and proteins is important in the process of DNA replication. The key enzyme involved is DNA polymerase, which catalyzes the polymerization of the deoxyribonucleoside 5′-triphosphates (dNTPs) to generate the growing DNA chain. Eukaryotic cells contain five types of DNA polymerases: alpha, beta, gamma, delta and epsilon. The properties of the various human DNA polymerases are described in Table 2. DNA polymerase-gamma is restricted to the mitochondria, where it is responsible for mitochondrial DNA replication. The other four DNA polymerases are localized in the nucleus. DNA polymerase-delta is the major replicating enzyme in human cells.

The process of DNA replication on each chromosome is initiated at designated positions, referred to as origins of replication (ori). Each human chromosome has multiple ori placed at every 150 to 200 kilobase pairs. There are approximately 30,000 initiation sites in the entire human genome. Thus, multiple sections of the genome are replicated simultaneously. Each small replicating unit is called a replicon and has its own ori site where DNA synthesis is initiated. The process of DNA replication proceeds bidirectionally on the chromosome until each replicon comes in contact with the next one. Thus, an entire chromosome can be replicated completely during the S phase of the cell cycle.
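The figures quoted above can be checked against one another with simple arithmetic. The sketch below assumes that the 30,000 initiation sites are spread evenly over the six billion base pairs of the diploid genome, which is a simplification made only for this calculation.

```python
# Figures quoted in the text; the even spacing of origins is an assumption.
DIPLOID_GENOME_BP = 6_000_000_000   # six billion base pairs
INITIATION_SITES = 30_000           # approximate number of origins of replication

mean_replicon_bp = DIPLOID_GENOME_BP // INITIATION_SITES
mean_replicon_kb = mean_replicon_bp // 1_000

# Replication is bidirectional, so each of the two forks in a replicon
# copies roughly half of the replicon.
per_fork_kb = mean_replicon_kb / 2

print(mean_replicon_kb)  # mean origin spacing in kilobase pairs
print(per_fork_kb)       # distance covered by each fork in kilobase pairs
```

Under these assumptions the mean spacing comes to 200 kilobase pairs, which sits at the upper end of the 150 to 200 kilobase pair range given in the text.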

Figure 6) Control of damage by p53 and p21. Cellular damage results in increased p53 activity. p53 functions as a transcription factor and induces the transcription of p21, a cyclin-dependent kinase inhibitor. The p21 interacts with multiple cyclin-dependent kinase (Cdk)-cyclin complexes, inhibits the kinase activity and halts the cells in G1 phase. p21 also binds proliferating cell nuclear antigen (PCNA), inhibiting DNA synthesis

TABLE 2
Structural and functional properties of human DNA polymerases

DNA polymerase | Size, catalytic subunit (kD) | Location     | Function in the cell
Alpha          | 160–185                      | Nucleus      | Lagging strand replication
Beta           | 40                           | Nucleus      | DNA repair
Gamma          | 125                          | Mitochondria | Replication of mitochondrial DNA
Delta          | 125                          | Nucleus      | Leading and lagging strand replication
Epsilon        | 210–230                      | Nucleus      | DNA repair

As the two parent DNA strands unwind and separate, DNA replication begins at ori and proceeds down the two DNA strands (Figure 7). Because of the inherent properties of DNA polymerase, daughter strand synthesis can only proceed from the ori in the 5′ to 3′ direction. Thus, both daughter strands are synthesized in the 5′ to 3′ direction, even though the two template strands are antiparallel. Because there is no DNA polymerase that can synthesize DNA in a 3′ to 5′ direction, the template strand that runs 5′ to 3′ in the direction of fork movement cannot be copied continuously. Instead, this strand is copied as short fragments of DNA, called Okazaki fragments. The Okazaki fragments are approximately 200 nucleotides in length and are synthesized in a 5′ to 3′ direction. The resulting fragments are then joined by an enzyme called DNA ligase to give one continuous DNA strand. The DNA strand that is synthesized continuously in the 5′ to 3′ direction is called the ‘leading strand of DNA synthesis’ because it starts at a fixed point and dictates DNA synthesis. The strand of DNA that is synthesized in the 5′ to 3′ direction in short pieces (ie, discontinuously) is called the ‘lagging strand of DNA synthesis’.
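The difference between continuous and discontinuous synthesis can be sketched in a few lines. This is a deliberately simplified model: the template is given as a string read in the 3′ to 5′ direction, each product is written 5′ to 3′, and the sketch ignores RNA primers and the order in time in which fragments are made. The function names and the default fragment length of 200 (taken from the text) are assumptions of the example.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def copy_5_to_3(template_3_to_5):
    """Copy a template read 3'->5'; the product is written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in template_3_to_5)


def lagging_strand(template_3_to_5, fragment_len=200):
    """Copy the template in short pieces, then join them (the DNA ligase step)."""
    fragments = [copy_5_to_3(template_3_to_5[i:i + fragment_len])
                 for i in range(0, len(template_3_to_5), fragment_len)]
    return fragments, "".join(fragments)


template = "ATGC" * 150                       # a 600-nucleotide toy template
leading = copy_5_to_3(template)               # one continuous product
fragments, joined = lagging_strand(template)  # three Okazaki-like fragments

print(len(fragments), joined == leading)
```

Once the fragments are joined, the discontinuous product is indistinguishable from a continuously synthesized strand, which is the point made in the text about the role of DNA ligase.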

The replication fork is the part of the DNA molecule that is being replicated at a given time and is the region between the unreplicated segment of the DNA molecule and a newly replicated portion of DNA. Because DNA is synthesized bidirectionally, each replicon contains two replication forks. A specific initiator protein has the ability to recognize the origin sequence and signals the initiation of DNA synthesis. It has been hypothesized that this initiator protein binds the ori sequence and attracts the DNA replicating complex to this particular site on the DNA molecule.

All DNA polymerases must have a primer (ie, a free 3′ hydroxyl end of a polynucleotide). The primer in DNA replication is not DNA, but rather is a small segment of RNA measuring five to 10 nucleotides in length that is synthesized by the enzyme DNA primase. DNA primase initiates the synthesis of an RNA molecule at the ori, and DNA polymerase uses this RNA primer to add deoxyribonucleotides to the 3′ hydroxyl group of the RNA, and synthesizes a new DNA strand that is complementary to the template strand. After completion of DNA synthesis, the RNA molecule is removed from the DNA helix, and the resulting gap in the DNA is filled by a DNA polymerase.

The various proteins that play an important role in the process of DNA replication are listed in Table 3. The separation of the two strands of DNA is catalyzed by an enzyme called DNA helicase, which breaks the hydrogen bonds holding the DNA strands together. The DNA helix is subsequently unwound, and the strands remain separated through the action of a protein called replication protein A (RPA). RPA is a single-stranded DNA binding protein (Figure 8). The DNA helicase acts at the edge of the replication fork, opening and unwinding the DNA as replication proceeds along the DNA molecule. As the helicase unwinds the DNA at the replication fork, the DNA helix downstream becomes tightly wound and supercoiled. The tension on the DNA molecule is released by the action of DNA topoisomerase, which breaks phosphodiester bonds, unwinds the downstream DNA helix and then reseals it by forming new phosphodiester bonds. Both DNA helicases and DNA topoisomerases play a pivotal role in the process of DNA replication and transcription.

Figure 7) Replicon. DNA polymerase can only synthesize DNA in a 5′ to 3′ direction. For both strands of the DNA helix to serve as templates, one strand (ie, the leading strand) is synthesized continuously in a 5′ to 3′ direction, while the other strand (ie, the lagging strand) is synthesized discontinuously in short fragments but still in the 5′ to 3′ direction. The short DNA fragments (ie, Okazaki fragments) are subsequently joined together by DNA ligase. Ori Origins of replication

TABLE 3
Proteins involved in DNA replication

Protein                                   | Function
DNA helicase                              | Unwinds DNA and breaks hydrogen bonds
Single-stranded DNA-binding protein (RPA) | Binds single-stranded DNA to prevent hydrogen bonding
Proliferating cell nuclear antigen        | Stimulates DNA polymerase-delta activity
DNA polymerase-delta                      | Stimulates leading and lagging strand DNA replication and 3′ to 5′ exonuclease proofreading
DNA polymerase-alpha/DNA primase complex  | Promotes synthesis of RNA primers and lagging strand synthesis
DNA ligase                                | Seals 3′ terminal hydroxyl and 5′ terminal phosphate groups of adjacent nucleotides in DNA
Ribonuclease H1                           | Removes RNA from RNA-DNA hybrid
DNA topoisomerase                         | Relaxes DNA by breaking and resealing phosphodiester bonds

RPA Replication protein A

Figure 8) Replication of a DNA molecule illustrating the interaction of the helicase and DNA binding proteins at the replication fork. RPA Replication protein A

The DNA polymerases catalyze the formation of phosphodiester bonds between the adjacent deoxyribonucleotides in the DNA molecule. All DNA polymerases catalyze the synthesis of DNA only in the 5′ to 3′ direction. DNA polymerase-delta is the major replicating protein in human cells, and is involved in both leading and lagging strand replication. DNA polymerase-alpha is complexed with another protein, the DNA primase. Together, these proteins are involved in the replication of the lagging strand. DNA primase makes the small RNA primers, and DNA polymerase-alpha then adds deoxyribonucleotides to the 3′ terminus of the primer for a short distance of about 30 nucleotides. The DNA polymerase-alpha/DNA primase complex subsequently falls off the DNA molecule and is replaced with DNA polymerase-delta, which continues the synthesis of the growing DNA chain. The RNA primers used by DNA polymerases must be removed from the DNA molecule. This is accomplished by the action of the enzyme RNase H1, which specifically degrades RNA present in a DNA/RNA hybrid. DNA polymerase later completes the DNA synthesis of the lagging strand by filling in the gap. Then the ligation of the 3′ hydroxyl terminus of the DNA of one Okazaki fragment with the 5′ terminal phosphate of the DNA of the adjacent fragment occurs through the formation of a phosphodiester bond. This reaction is catalyzed by DNA ligase.

DNA polymerase-beta and -epsilon serve in the process of DNA repair and are not directly involved in replicating the entire genome. Finally, DNA polymerase-gamma is responsible for replicating the circular double-stranded DNA found in mitochondria.

An additional protein involved in the replication of DNA in human cells is PCNA. PCNA forms part of the DNA polymerase-delta complex and stimulates the activity of the DNA polymerase. The interactions of various proteins involved in DNA synthesis in the lagging strand are depicted in the model shown in Figure 9.

Some DNA polymerases (eg, DNA polymerase-delta) have intrinsic 3′ to 5′ exonuclease activity that removes bases sequentially from the end of the DNA molecule (ie, the 3′ end). This nuclease activity plays a critical role in preventing mistakes in base pairing during DNA replication. For example, if a C on the new DNA strand pairs with an A on the template strand, subsequent replication of this mistake results in a G-C base pair instead of an A-T base pair. Substitution of one base pair with another leads to a mutation in the DNA molecule that may affect cellular function. The 3′ to 5′ exonuclease recognizes these mispairs as soon as they occur and removes the newly inserted, albeit incorrect, base. The DNA polymerase then inserts the proper base into the growing DNA chain. This exonuclease component of DNA polymerase is termed the ‘proofreading function’.
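The consequence of an uncorrected mispair, and the effect of proofreading, can be followed in a short sketch. It is a toy model under stated assumptions: strands are compared position by position (strand orientation is ignored), the error is introduced deliberately at a chosen position, and the function names `replicate` and `proofread` are invented for this illustration. Real proofreading acts on the 3′ end during synthesis rather than on a finished strand.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def replicate(template, error_at=None, wrong_base=None):
    """Copy a template; optionally misincorporate wrong_base at one position."""
    new_strand = [PAIR[base] for base in template]
    if error_at is not None:
        new_strand[error_at] = wrong_base
    return "".join(new_strand)


def proofread(template, new_strand):
    """Replace any base that is not the Watson-Crick partner of the template base."""
    return "".join(n if PAIR[t] == n else PAIR[t]
                   for t, n in zip(template, new_strand))


template = "GATTACA"
faulty = replicate(template, error_at=1, wrong_base="C")  # C placed opposite A

# Without proofreading, copying the faulty strand puts a G where the A used to be,
# so the original A-T pair has become a G-C pair.
next_generation = replicate(faulty)

# With proofreading, the mispaired C is replaced by the correct T.
corrected = proofread(template, faulty)

print(template, next_generation, corrected)
```

In this example the second position of the template changes from A to G after one further round of replication, which is the A-T to G-C substitution described in the text.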

As mentioned above, the ends (ie, telomeres) of all chromosomes maintain the overall integrity of chromosomes. Telomeres consist of tandemly repeated base sequences, TTAGGG, which are repeated 100 to 1000 times. Because DNA polymerases function only in the 5′ to 3′ direction, they are unable to copy the extreme 5′ ends of linear DNA molecules. These sequences (ie, telomeres) are replicated by the action of the enzyme telomerase, which is a reverse transcriptase. Reverse transcriptases synthesize DNA from an RNA template. Telomerases carry their own template RNA complementary to the telomere repeat sequences. The RNA template allows telomerase to generate multiple copies of the telomeric repeat sequences, thus maintaining telomeres in the absence of a conventional DNA template to direct their synthesis.
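The way an RNA template can specify the TTAGGG repeat may be sketched as follows. The six-nucleotide template `CCCUAA` is a simplified stand-in chosen so that its reverse transcript is exactly one repeat; it is an assumption of this example and is not presented as the actual sequence of the telomerase RNA. The function names are likewise hypothetical.

```python
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna_5_to_3):
    """DNA copy of an RNA template, written 5'->3' (antiparallel to the template)."""
    return "".join(RNA_TO_DNA[base] for base in reversed(rna_5_to_3))


def extend_telomere(dna_3_end, template_rna="CCCUAA", rounds=3):
    """Append `rounds` copies of the templated repeat to the 3' end of a DNA strand."""
    repeat = reverse_transcribe(template_rna)
    return dna_3_end + repeat * rounds


print(reverse_transcribe("CCCUAA"))
print(extend_telomere("ACGT", rounds=2))
```

Because the template travels with the enzyme, the repeat can be added again and again without any DNA template at the chromosome end, which is the property the text emphasizes.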

Despite the accuracy of DNA replication, cellular genomes are far from static. Gene rearrangements and mutations are required to maintain genetic diversity among individuals. To this end, recombination between homologous chromosomes occurs during meiosis and allows parental genes to be rearranged in new combinations in the next generation of cells. The rearrangements of DNA sequences within the genome create novel combinations of genetic information. In some instances, DNA rearrangements are programmed to regulate gene expression during the cellular processes of differentiation and development. A striking example of this is the rearrangement of antibody genes during the development of the immune system. A key feature of both immunoglobulins and T cell receptors is their enormous diversity. This diversity allows different antibody or T

Figure 9) Model for DNA replication in human cells. Replication protein A (RPA), a single-stranded DNA-binding protein, separates the DNA strands to allow the DNA polymerase-alpha/DNA primase complex to bind to the DNA and initiate the synthesis of an RNA primer (indicated by the wavy line). DNA polymerase-alpha adds approximately 30 deoxyribonucleotides to the 3′ end of the RNA primer. The DNA polymerase-delta displaces the DNA polymerase-alpha/DNA primase complex and extends the DNA strand by adding deoxyribonucleotides to the 3′ end of the newly synthesized DNA strand. Upon completion of the DNA synthesis, RNase H1 removes the RNA primer. The DNA polymerase-delta fills in the gap using the opposite DNA strand as the template. Finally, the two Okazaki fragments are joined together. This reaction is catalyzed by DNA ligase

cell receptor molecules to recognize a variable array of foreign antigens. These diverse antibodies and T cell receptors are encoded by unique lymphocyte genes that are formed during the development of the immune system as a result of site-specific recombination between distinct segments of immunoglobulin and T cell receptor genes.

MUTATIONS AND DNA REPAIR MECHANISMS
Mutations are the result of permanent changes in the base sequence of the DNA molecule and are central to the pathogenesis of all human genetic diseases. The various classes of mutations that occur in DNA molecules are listed in Table 4. The reader may wish to consult references 20-26 for further details. Many of the concepts concerning the different types of mutations that occur in DNA, and the potential mechanisms associated with the production of these mutations, were originally developed in bacterial cell model systems. Recently, the knowledge base has expanded in the area of the molecular basis of mutations in eukaryotic cells. Studies of diseased human cells have established common mechanisms by which DNA undergoes mutation. More importantly, DNA repair mechanisms have been defined.

Many of the mutations that occur in DNA are the result of single base pair substitutions in which one base pair (eg, an A-T pair) is replaced with a second base pair (eg, a G-C pair). The substitution of one base pair with a second base pair elicits a change of codon that can lead either to a missense mutation (where one amino acid replaces another amino acid in a protein) or to a nonsense mutation (where one of the terminator codons appears in the middle of a gene). With a nonsense mutation, there is no transfer RNA molecule to recognize these codons, and protein synthesis terminates at the site of the nonsense codon. This leads to the production of a truncated polypeptide.
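The two outcomes of a single base substitution can be made concrete by translating a short coding sequence before and after the change. The sketch below uses the standard genetic code written on the coding DNA strand (T in place of U); the twelve-nucleotide sequence and the function name `translate` are invented for the example.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering of codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate a coding sequence from position 0, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":      # terminator codon: synthesis ends here
            break
        protein.append(amino_acid)
    return "".join(protein)


wild_type = "ATGGAATGGCTT"   # ATG GAA TGG CTT
missense = "ATGGTATGGCTT"    # GAA -> GTA: one amino acid is replaced
nonsense = "ATGGAATGACTT"    # TGG -> TGA: a terminator codon appears mid-sequence

print(translate(wild_type), translate(missense), translate(nonsense))
```

The missense change alters one residue and leaves the length of the product unchanged, whereas the nonsense change truncates the polypeptide at the new terminator codon.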

A mutation that alters the splice acceptor or splice donor sequences can result in aberrant splicing of an RNA transcript. This leads to the production of an mRNA that may be missing a substantial part of a particular exon and thus codes for a mutant protein. Other base pair substitutions can occur in regulatory sequences required for the binding of transcription factors or RNA polymerase. In this instance, the quantity of the product produced by the gene that is controlled by these sequences is dramatically altered. In the extreme case, base pair substitutions can lead to a complete absence of the gene product or to a dramatic increase in the amount of a particular gene product.

Frameshift mutations are caused by the addition or deletion of one or two base pairs within the coding sequence of a gene. This alters the reading frame of the mRNA. Thus, the mRNA is translated out of frame from the site of the insertion or deletion of the base pair. This results in the production of a protein that is altered in its amino acid sequence, starting from the point of the insertion or deletion of the base pair and continuing to the end of the protein. Often, the altered reading frame also leads to the production of a termination codon in the middle of the gene. This results in premature cessation of protein synthesis.
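The shift in reading frame is easiest to see by splitting a sequence into triplets before and after an insertion. In the sketch below the sequence, the inserted bases and the function name `codons` are all invented for illustration. The three-base insertion is included only as a contrast, to show that the frame is preserved when the number of inserted bases is a multiple of three.

```python
def codons(dna):
    """Split a coding sequence into complete triplets, starting at position 0."""
    return [dna[i:i + 3] for i in range(0, len(dna) - 2, 3)]


wild_type = "ATGGAATGGCTTAAC"

# One extra base inside the second codon shifts every downstream triplet.
frameshift = wild_type[:4] + "C" + wild_type[4:]

# Three extra bases add one codon but leave the downstream frame intact.
in_frame = wild_type[:3] + "CCC" + wild_type[3:]

print(codons(wild_type))
print(codons(frameshift))
print(codons(in_frame))
```

In this example the shifted frame happens to create the triplet TAA, a termination codon, which illustrates the premature cessation of protein synthesis mentioned in the text.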

The insertion and deletion of many base pairs can also occur with DNA molecules. Deletion mutations can occur in a chromosome with the loss of hundreds to thousands of base pairs from the DNA, and the deleted genetic material is permanently lost. Large insertions of DNA sequences have been described and are caused by transposon-like elements, often repetitive DNA sequences such as LINE repeats.

In summary, the possible changes in DNA that give rise to mutations may be illustrated by considering the following literary masterpiece.

Wild type: The cat sat on the mat.

Substitution: The rat sat on the mat.

Insertion (single): The cat shat on the mat.

Insertion (multiple): The cattle sat on the mat.

Deletion (single): The c.t sat on the mat.

Deletion (multiple): The cat … .. the mat.

Inversion (small): The tac sat on the mat.

Inversion (large): Tam eht no tas tac eht.

TABLE 4
Classes of mutations found in human DNA

Class | Result
Single base pair substitutions (point mutations) |
  Altered structure of gene product |
    Missense mutation | Single amino acid replacement in the protein
    Nonsense mutation | Termination codon in the middle of the gene results in premature termination of protein synthesis
    RNA splicing mutation | Protein may be missing part or all of an exon sequence
  Altered quantity of gene product |
    Mutations in regulatory sequences | Transcription of the gene is altered, which can reduce or eliminate the gene product
    Mutations in RNA processing and translation | Stability of mRNA is altered, which may reduce the amount of gene product
Insertions or deletions |
    One or two base pairs (frameshift mutations) | Addition or deletion of one or two base pairs can affect the reading frame of the gene, resulting in a grossly altered or absent gene product
    Large number of base pairs | Large pieces of the DNA may be lost or a large segment of DNA may insert into the middle of a gene, resulting in loss of function
Expansion of trinucleotide repeat sequences | Unstable trinucleotide repeats can suddenly expand in number, resulting in the alteration of production or structure of a particular gene product
Chromosomal alteration | Inversions, translocations, duplications or gene amplification may result

DNA polymerases catalyze the proper pairing of A to T and G to C with very high accuracy. However, mispairing occurs at a frequency of approximately 10⁻⁵ per base (ie, about one mispair for every 100,000 bases). For example, an A-C pair forms instead of an A-T pair. If such a mispair remains in the DNA molecule, the initial A-T pair that has become an A-C pair gives rise to a G-C pair during the next replication cycle. To keep the mutation rate at a low level, eukaryotic cells have devised mechanisms for correcting base mispairs before they become a permanent feature of the DNA.
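To give a sense of scale, the mispairing frequency can be combined with the genome size quoted earlier in the article. This is a back-of-envelope estimate only: it assumes a uniform frequency of one mispair per 100,000 bases across six billion base pairs, before any correction takes place.

```python
GENOME_BP = 6_000_000_000   # diploid human genome, as quoted in the text
MISPAIR_ONE_IN = 100_000    # a frequency of 10^-5 per base

# Expected number of mispairs per round of replication, before any correction.
expected_mispairs = GENOME_BP // MISPAIR_ONE_IN

print(expected_mispairs)
```

Under these assumptions a single round of replication would leave tens of thousands of mispairs, which is why the correction mechanisms described next are needed to keep the mutation rate low.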

Bases that are present in DNA molecules can undergo spontaneous damage or modification. One frequent form of modification occurs with the purine bases A and G. Purine residues may be lost from the DNA molecules by a process called depurination. The glycosidic bond between the deoxyribose and the base is hydrolyzed, which leads to a gap in one of the DNA strands. This damage must be corrected before the DNA is replicated, otherwise a mutation ensues. The bases C, A and G are capable of undergoing spontaneous deamination, wherein the base loses an amino group and its structure is changed. For example, when C is deaminated it becomes U. This leads to the presence of U in DNA instead of C. U pairs with an A residue during the next replication cycle. The original G-C pair, which after deamination is now a G-U pair, subsequently becomes an A-T pair. Finally, ultraviolet rays from sunlight are common mutagenic agents that cause bond formation between adjacent pyrimidines on the same DNA strand. The most frequent type of pyrimidine dimer is the T-T dimer. The presence of a T-T dimer in the DNA molecule blocks DNA replication and leads to death of the cell if it is not removed.

The 3′ to 5′ exonuclease activity associated with DNA polymerase-delta and -epsilon is responsible for cleaving mispaired nucleotides from the 3′ end of newly replicated DNA strands. This allows the polymerase a second opportunity to add the correct base. The entire process is known as the proofreading function.
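The route from a deaminated C to an A-T pair takes two rounds of replication, which a short sketch can trace. The model is deliberately minimal: a base pair is a two-element tuple, each strand templates a new partner, and U is treated as pairing with A. The function name `replicate_pair` is an assumption of the example.

```python
# Pairing partners, with U (deaminated C) behaving like T and pairing with A.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}


def replicate_pair(pair):
    """Each strand of a base pair templates a new partner; return both daughter pairs."""
    top, bottom = pair
    return [(top, PAIR[top]), (PAIR[bottom], bottom)]


original = ("G", "C")
deaminated = ("G", "U")                    # C has lost its amino group

first_round = replicate_pair(deaminated)   # one normal daughter, one carrying A opposite U
second_round = replicate_pair(first_round[1])

print(first_round)
print(second_round)
```

After the first round one daughter still carries the correct G-C pair, but the strand containing U has acquired an A partner; copying that A in the second round yields the A-T pair described in the text.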

If base mispairing remains in the DNA, it leads to a mutation at the next DNA replication cycle. However, eukaryotic cells have evolved a mechanism to deal specifically with persistent base mispairing immediately after replication. Human cells have a methyl-directed mismatch repair system that appears to be similar to that of bacteria. The methyl-directed mismatch repair systems scan the DNA molecule, and when base mispairs as well as insertions and deletions are detected, correction of the error occurs on the nonmethylated, newly synthesized DNA strand. This allows the repair system to correct the nascent strand that has a normal base in the wrong location and prevents the mispaired bases from giving rise to a permanent mutation.

DNA molecules are methylated at specific sites, either on an A or a C residue. In human cells, C residues located in CpG islands are methylated. Methylation is a postreplication event. During the initial period of DNA replication, one strand (ie, the template strand) is methylated, while the newly synthesized DNA strand is not methylated.

Mutator (Mut) proteins are involved in methyl-directed mismatch repair. Human homologues have been identified for MutS (hMSH2 and GTBP) and MutL (hMLH1 and hPMS2), but there are no known homologues for MutH. Methyl-directed mismatch repair appears to be similar in bacteria and humans. In human cells, mismatches are recognized by the protein hMSH2 or a dimer composed of hMSH2 and GTBP. Base mispairing creates a bulge in the DNA that is recognized and bound by the MutS protein. The MutS protein that is bound to the mismatch recruits the MutL homologue to the site. In the bacterial system, MutH then cleaves the nonmethylated DNA strand. This is followed by the stepwise removal of nucleotides by an exonuclease, and the resulting gap in the DNA molecule is repaired by DNA polymerase using the base sequence in the template strand. The final phosphodiester bond is sealed by DNA ligase.
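The logic of strand-directed repair (find the mispair, remove a stretch of the new strand, resynthesize from the template) can be sketched as follows. The sketch assumes that the template is the methylated parental strand and is therefore taken as correct, compares the strands position by position, and uses an arbitrary one-base flank for the excised region. The function names are hypothetical and do not correspond to individual Mut proteins.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def find_mismatches(template, nascent):
    """Positions where the nascent base is not the partner of the template base."""
    return [i for i, (t, n) in enumerate(zip(template, nascent)) if PAIR[t] != n]


def mismatch_repair(template, nascent, flank=1):
    """Excise the nascent strand around each mismatch and refill from the template."""
    repaired = list(nascent)
    for i in find_mismatches(template, nascent):
        start = max(0, i - flank)
        stop = min(len(nascent), i + flank + 1)
        for j in range(start, stop):          # exonuclease removal, then repair synthesis
            repaired[j] = PAIR[template[j]]
    return "".join(repaired)


template = "GATTACA"   # methylated parental strand (assumed correct)
nascent = "CTAGTGT"    # G misincorporated opposite the second T

print(find_mismatches(template, nascent))
print(mismatch_repair(template, nascent))
```

The key design point is that correction is always applied to the nascent strand, so the information in the parental strand is what survives.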

One of the most common hereditary cancers, hereditary nonpolyposis colon cancer (HNPCC), arises from mutations in the methyl-directed mismatch repair system. HNPCC affects one in 200 people in North America and accounts for approximately 15% of all colon cancers. There are at least five genetic loci involved in the human mismatch repair process. These include hMSH2, hMLH1, hPMS1 and hPMS2, and the GTBP gene. Cells from patients with HNPCC are characterized by microsatellite instability. Microsatellites are repetitive nucleotide sequences (di-, tri- or tetranucleotides) located throughout the human genome. The presence of these repeats in the DNA presents a ‘road block’ to the DNA polymerase molecule during DNA replication. When DNA polymerase is confronted with a long repetitive sequence of DNA, it produces a strand of DNA with extra bases that are not base paired with the template and that loop away from the DNA helix. The mismatch repair system recognizes these loops as defective and removes them. The loops remain if the repair system is defective. Microsatellite instability signals that the cell has developed a mutator (Mut) phenotype and has an increased rate of overall mutation. These cells also develop mutations in such genes as the p53 gene or other tumour suppressor genes at a much higher rate than do normal cells.
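Microsatellite instability amounts to a change in the number of repeat units at a given locus, and that comparison can be sketched with a regular expression. This is a toy illustration, not a clinical assay: the sequences, the CA repeat unit and the function names are all invented for the example.

```python
import re


def longest_repeat_run(sequence, unit):
    """Largest number of consecutive copies of `unit` found in `sequence`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", sequence)
    return max((len(run) // len(unit) for run in runs), default=0)


def is_unstable(normal_seq, tumour_seq, unit):
    """Flag a locus whose repeat count differs between normal and tumour DNA."""
    return longest_repeat_run(normal_seq, unit) != longest_repeat_run(tumour_seq, unit)


normal = "GGT" + "CA" * 10 + "TTG"   # ten CA repeats at a hypothetical locus
tumour = "GGT" + "CA" * 13 + "TTG"   # three extra repeats left uncorrected

print(longest_repeat_run(normal, "CA"), longest_repeat_run(tumour, "CA"))
print(is_unstable(normal, tumour, "CA"))
```

The extra repeat units in the tumour sequence correspond to the looped-out bases that a functioning mismatch repair system would have removed.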

Another type of DNA mutation is incurred through damage of bases of a DNA molecule that is not undergoing replication. Cells have evolved two major repair systems to deal with this type of DNA damage. The first system is called base excision repair. When a U residue occurs in a DNA molecule, it is recognized by U-DNA glycosylase and is removed from the DNA, leaving behind a gap. The lack of a base in the DNA helix is recognized by specific endonucleases known as apurinic/apyrimidinic (AP) endonucleases. The AP endonuclease cleaves the DNA at the site of the missing base. The resulting gap is repaired by DNA polymerase using the base present in the complementary strand as a template. This is followed by ligation via DNA ligase. If the U residue is not removed, the G-U mismatch persists, and the original G-C pair eventually becomes an A-T pair (ie, a mutation). A more general repair mechanism is known as nucleotide excision repair, which repairs bulky distortions in the DNA molecule. The overall scheme for nucleotide excision repair

resembles that of base excision repair and methyl-directedmismatch repair. All systems have specific proteins that rec-ognize the damaged area of DNA, as well as specific proteinsinvolved in the removal of the damage from the DNA. Fol-lowing removal of the damage, the gap is filled by repair syn-thesis, catalyzed by DNA polymerase and sealed by DNAligase.
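The shared logic of these repair systems (recognize the damage, remove it, resynthesize from the intact strand, seal) can be sketched for the base excision repair of uracil. The Python fragment below is a toy model under stated assumptions: the two strands are written so that position i of one pairs with position i of the other, which ignores their antiparallel orientation, and the endonuclease, polymerase and ligase steps are collapsed into a single pass. All function names and sequences are hypothetical.

```python
# Watson-Crick pairing used by the repair synthesis step.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def remove_uracil(strand):
    """Glycosylase step: replace each U with '_' to mark an abasic (AP) site."""
    return strand.replace("U", "_")


def fill_gaps(strand, template):
    """Fill each AP site with the base complementary to the template strand.

    Stands in for AP endonuclease cleavage, repair synthesis by DNA
    polymerase, and ligation, which are not modelled separately here.
    """
    if len(strand) != len(template):
        raise ValueError("strands must be the same length")
    return "".join(
        COMPLEMENT[t] if s == "_" else s
        for s, t in zip(strand, template)
    )


def base_excision_repair(damaged, template):
    """Run the two simplified steps in order and return the repaired strand."""
    return fill_gaps(remove_uracil(damaged), template)


# Invented example: a U sits opposite G, where a C belongs.
damaged = "ATGUCA"
template = "TACGGT"
print(remove_uracil(damaged))
print(base_excision_repair(damaged, template))
```

Because the template strand still carries the G, the repaired position is restored to C; this is the information that would be lost if the U persisted through replication.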

Xeroderma pigmentosum (XP) is a rare autosomal recessive disorder characterized by skin neoplasms. Skin cells from XP patients are unable to repair DNA damage caused by exposure to ultraviolet light. Ultraviolet light damages DNA and results in the formation of dimers between adjacent pyrimidines on the same DNA strand (eg, T-T dimer). These T-T dimers distort the DNA helix, and result in the cessation of replication and transcription at that point until the dimer is removed. The nucleotide excision repair system removes these T-T dimers. The initial step is the recognition of the damage by the XPA protein, which binds along with XPF-ERCC1 protein and the single-stranded DNA binding protein, RPA. Helicase activity unwinds the helix and stimulates the excision activity of two endonucleases, XPF and XPG, which cut the DNA. This creates a large gap in the DNA molecule, and the 3′ hydroxyl is recognized by DNA polymerase-delta or -epsilon, which carries out repair synthesis using the undamaged DNA strand as a template. The final nick is sealed by DNA ligase.
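The excision and resynthesis steps described above can be illustrated with a short Python sketch. It is a simplification under several assumptions: strands are aligned position by position (antiparallel orientation is ignored), the dimer is marked with the placeholder characters `XX`, and the size of the excised window (`flank`) is an arbitrary value chosen for the example, not a measured property of the XPF and XPG incisions. Function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
PYRIMIDINES = {"C", "T"}


def adjacent_pyrimidine_sites(strand):
    """Indices i where strand[i] and strand[i + 1] are both pyrimidines.

    These are the positions at which ultraviolet light could form a dimer.
    """
    return [
        i for i in range(len(strand) - 1)
        if strand[i] in PYRIMIDINES and strand[i + 1] in PYRIMIDINES
    ]


def nucleotide_excision_repair(strand, template, lesion_start,
                               lesion_len=2, flank=3):
    """Excise a window around the lesion and resynthesize it from the template.

    flank is a hypothetical number of bases removed on each side.
    Returns the repaired strand and the (start, end) of the excised window.
    """
    lo = max(0, lesion_start - flank)
    hi = min(len(strand), lesion_start + lesion_len + flank)
    # Repair synthesis: copy the undamaged strand across the whole gap.
    patch = "".join(COMPLEMENT[base] for base in template[lo:hi])
    # Ligation is implied by joining the patch to the flanking DNA.
    return strand[:lo] + patch + strand[hi:], (lo, hi)


# Invented example: the original strand was GCATTTGCAAC; positions 4 and 5
# now form a dimer, shown as XX. The template is the undamaged partner strand.
damaged = "GCATXXGCAAC"
template = "CGTAAACGTTG"
repaired, window = nucleotide_excision_repair(damaged, template, lesion_start=4)
print(adjacent_pyrimidine_sites("GCATTTGCAAC"))
print(repaired, window)
```

Unlike the single-base gap of base excision repair, the patch here replaces a stretch of nucleotides on both sides of the lesion, which is why the undamaged strand is needed as a template across the whole window.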

A new type of mutation that results in a number of human genetic diseases has been recently described. These mutations are the result of the expansion of trinucleotide repeats (CAG, CTG, CGG or GAA) found throughout the human genome. Long runs of these repeat triplets are found in exons at the 5′ or 3′ end of genes. Individuals affected with one of the expansion disorders have an increased number of copies of the trinucleotide repeats. Expansion of the repeat sequences can alter either the structure or function of a particular protein. One of the best characterized examples of this is the trinucleotide CAG, which codes for the amino acid glutamine. In Huntington’s disease the CAG repeat is located in the coding region of the first exon at the 5′ end of the gene. These repeats are translated and appear as a long stretch of glutamines within the structure of the protein such that the mutant protein has a range of 40 to 100 glutamines at that particular site. All of the CAG repeat diseases are autosomal dominant disorders that are characterized by late onset neuronal loss.
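The relationship between CAG copy number and the length of the glutamine tract can be shown with a brief Python sketch. It is illustrative only: the sequences are invented, the search ignores reading frame, and the threshold of 40 simply reuses the lower bound of the 40 to 100 glutamine range quoted above; it is not offered as a clinical cut-off. The function names are hypothetical.

```python
import re

# Lower bound of the mutant range quoted in the text; an assumption for
# this example, not a diagnostic threshold.
EXPANDED_MIN_COPIES = 40


def longest_cag_run(coding_seq):
    """Number of copies in the longest uninterrupted run of CAG triplets.

    Simplification: runs are found wherever they occur, without checking
    that they lie in the reading frame of the gene.
    """
    runs = re.findall(r"(?:CAG)+", coding_seq)
    return max((len(run) // 3 for run in runs), default=0)


def polyglutamine_tract(copies):
    """Each CAG codon is translated as one glutamine (single-letter code Q)."""
    return "Q" * copies


def is_expanded(copies, threshold=EXPANDED_MIN_COPIES):
    """True when the repeat count reaches the illustrative threshold."""
    return copies >= threshold


# Invented coding sequences with 20 and 45 CAG copies after the start codon.
normal_allele = "ATG" + "CAG" * 20 + "CCG"
mutant_allele = "ATG" + "CAG" * 45 + "CCG"

for name, allele in (("normal", normal_allele), ("mutant", mutant_allele)):
    copies = longest_cag_run(allele)
    print(name, copies, len(polyglutamine_tract(copies)), is_expanded(copies))
```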

CONCLUSIONS

The fundamental similarities among different types of cells constitute a unifying theme in cell and molecular biology. The basic principles derived from experiments with prokaryotic cells, coupled with the availability of a variety of experimental tools, provided the framework to define the molecular processes that determine the flow of genetic information in eukaryotic cells. The importance of DNA in providing a blueprint that coordinates all cellular activities is underscored in the present review. The subsequent reviews will examine the intricate array of cellular processes responsible for the orderly transcription and translation of genetic information into proteins – the major determinants of eukaryotic gene expression.

ACKNOWLEDGEMENTS: This work was supported by operating grants from the Medical Research Council of Canada and the Crohn’s and Colitis Foundation of Canada. Dr Gary E Wild is a chercheur boursier clinicien of Les Fonds de la Recherche en Santé du Québec. Dr Wild extends his appreciation to Drs David Fromson, John Southin, Howard Bussey and Bruce Brandhorst of the McGill Biology Department. Their tireless efforts in the area of undergraduate science education fostered a sense of inquiry and collegiality that guided a cohort of students through the early recombinant DNA era.

