the structure of dna and rna - wordpress.com · 2019-09-10 · different conformations of dna and...


Molecular Genetics and Genomics Department of Biology/Faculty of Sciences/LU

Dr. Fahd Nasr-All rights reserved


The structure of DNA and RNA

I. Opening remark

DNA was isolated for the first time in 1869 by Friedrich Miescher. In 1952 Hershey and Chase published data suggesting that only DNA is required for T2 bacteriophage replication; their results strongly argued in favor of DNA as the genetic material. In 1953 Watson and Crick proposed the double-helix structure of DNA. The proposed structure was based on X-ray diffraction studies on DNA performed by Rosalind Franklin and by Maurice Wilkins. Crick, Wilkins, and Watson were awarded the Nobel Prize in Physiology or Medicine in 1962.

In this chapter we present an overview of the structure and chemistry of nucleic acids, the polymers used for information storage. We focus on the structure of nucleotides and on how these subunits are joined together via phosphodiester bonds to form polynucleotides (DNA and RNA). In addition, we briefly describe the different conformations of DNA and the different classes of RNA.

II. Structure of nucleic acids

II.1. The structure of DNA

Most genomes, which specify the life of organisms, are made of DNA (deoxyribonucleic acid). DNA is a polymeric molecule made up of linear unbranched chains of monomeric subunits called nucleotides. The four different nucleotides can be linked together in any order to form chains thousands or millions of units in length. Each nucleotide in a DNA polymer is made up of three components:

1- 2'-deoxyribose, which is a pentose, a type of sugar composed of five carbon atoms. The name 2'-deoxyribose indicates that in this particular pentose the hydroxyl group (OH) attached to the 2'-carbon of ribose has been replaced by a hydrogen atom.

2- A nitrogenous base, either a single-ring pyrimidine (cytosine or thymine) or a double-ring purine (adenine or guanine). Each base is attached to the 1'-carbon of the sugar (Fig. 2).

3- A phosphate group, which is attached to the 5'-carbon of the sugar.

II.2. Nucleotides are called nucleoside phosphates

Nucleosides are compounds made up of just the sugar and a base, linked together via a glycosidic bond. Nucleosides are named by adding -idine to the root name of a pyrimidine or -osine to the root name of a purine. The full chemical names of the four nucleosides in DNA are thus 2'-deoxyadenosine (dA), 2'-deoxycytidine (dC), 2'-deoxyguanosine (dG) and 2'-deoxythymidine (dT). A nucleoside is converted to a nucleotide by adding a phosphate group. Biochemically, a nucleotide results when phosphoric acid is esterified to a sugar hydroxyl (-OH) group of a nucleoside. Although the nucleoside deoxyribose ring has two -OH groups available for esterification, at C-3' and C-5', most of the monomeric nucleotides in the cell are deoxyribonucleotides having 5'-phosphate groups. The full names of the four deoxyribonucleotides are thus 2'-deoxyadenosine 5'-monophosphate or deoxyadenylic acid [1] (dAMP), 2'-deoxycytidine 5'-monophosphate or deoxycytidylic acid (dCMP), 2'-deoxyguanosine 5'-monophosphate or deoxyguanylic acid (dGMP) and thymidine 5'-monophosphate or thymidylic acid (dTMP) (Table 1).

Table 1. DNA and RNA are each made up of four monomeric units (see text for further details).

DNA
Base class    Purines                            Pyrimidines
Base          Adenine          Guanine           Cytosine         Thymine
Nucleoside    Deoxyadenosine   Deoxyguanosine    Deoxycytidine    Deoxythymidine
Nucleotide    dAMP             dGMP              dCMP             dTMP (or TMP)

RNA
Base class    Purines                            Pyrimidines
Base          Adenine          Guanine           Cytosine         Uracil
Nucleoside    Adenosine        Guanosine         Cytidine         Uridine
Nucleotide    AMP              GMP               CMP              UMP

II.3. What about RNA

Like DNA, RNA is a polymeric molecule made up of monomeric subunits called nucleotides. Because DNA and RNA can be isolated from nuclei and because of their acidity, they are also called nucleic acids. The structure of RNA is similar to that of DNA but with two main differences: first, the sugar is ribose instead of deoxyribose; second, RNA contains uracil instead of thymine. As a consequence, the full names of the nucleosides in RNA are adenosine (A), cytidine (C), guanosine (G) and uridine (U). The four nucleotides in RNA are therefore adenosine 5'-monophosphate or adenylic acid (AMP), cytidine 5'-monophosphate or cytidylic acid (CMP), guanosine 5'-monophosphate or guanylic acid (GMP) and uridine 5'-monophosphate or uridylic acid (UMP) (Fig. 1).

Although cells contain nucleotides with one, two or three phosphate groups [2], only nucleoside triphosphates (NTP and dNTP) act as substrates for nucleic acid synthesis. The full names of deoxyribonucleotides and ribonucleotides follow the same pattern whatever the number of phosphate groups: instead of monophosphate, which indicates one phosphate unit, we use diphosphate for two phosphate units and triphosphate for three (e.g. AMP, ADP and ATP for adenosine 5'-monophosphate, adenosine 5'-diphosphate and adenosine 5'-triphosphate, respectively).
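The naming pattern described above is regular enough to be written as a small lookup. The following is only an illustrative sketch in Python: the function name `nucleotide_name` and its parameters are hypothetical, and it does not check whether a given base normally occurs with a given sugar (for example, it will happily name a deoxy form of uridine).

```python
# Root nucleoside names as given in sections II.2 and II.3.
NUCLEOSIDES = {"A": "adenosine", "G": "guanosine", "C": "cytidine",
               "U": "uridine", "T": "thymidine"}
PHOSPHATE_PREFIX = {1: "mono", 2: "di", 3: "tri"}


def nucleotide_name(base: str, phosphates: int = 1, deoxy: bool = False):
    """Return (abbreviation, full name) of a nucleoside 5'-phosphate.

    Hypothetical helper for illustration only.
    """
    if base not in NUCLEOSIDES:
        raise ValueError(f"unknown base: {base!r}")
    if phosphates not in PHOSPHATE_PREFIX:
        raise ValueError("phosphates must be 1, 2 or 3")
    prefix = PHOSPHATE_PREFIX[phosphates]
    # Abbreviation: optional "d", the base letter, M/D/T for the phosphate count, "P".
    abbreviation = ("d" if deoxy else "") + base + prefix[0].upper() + "P"
    # The text writes dTMP as "thymidine 5'-monophosphate", without "2'-deoxy".
    sugar = "2'-deoxy" if deoxy and base != "T" else ""
    full_name = f"{sugar}{NUCLEOSIDES[base]} 5'-{prefix}phosphate"
    return abbreviation, full_name


for args in [("A", 1, False), ("A", 2, False), ("A", 3, False), ("G", 1, True)]:
    print(*nucleotide_name(*args), sep=": ")
```

The loop prints the AMP, ADP, ATP series used as the example in the text, followed by dGMP.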

[1] Because the pKa value for the first dissociation of a proton from the phosphoric acid moiety is 1.0 or less, the nucleotides have acidic properties, which are implicit in the other names given to these substances: deoxyadenylic acid, etc. Moreover, the nucleic acids DNA and RNA, which are polymers of nucleoside monophosphates, derive their name from the acidity of these phosphate groups.

[2] As stated before, the phosphate group attached to the 5'-carbon of the sugar can comprise one, two or three linked phosphate units. These phosphates are designated α, β and γ, with the α-phosphate being the closest to the sugar.


Figure 1. Structure of the nitrogenous bases in nucleic acids. Purine and pyrimidine rings show the general structure of each type of base and the numbers indicate the positions on the rings. [Figure not reproduced: ring structures of the purine ring (Pu), the pyrimidine ring (Py), adenine (A), guanine (G), cytosine (C), uracil (U) and thymine (T).]

II.4. Nucleic acids are polynucleotides with directional sense

Nucleic acids are linear polymers of nucleotides in which individual nucleotides are linked together by phosphodiester bonds between their 5'- and 3'-carbons. The polymerization reaction involves removal of the β and γ phosphates from the nucleoside triphosphates and successive addition of the resulting nucleoside monophosphates to the 3'-OH group of the preceding nucleotide. Note that the phosphodiester bonds are strong, so that the repeated sugar-phosphate-sugar backbone is stable. Although the bases are not part of the backbone, they give the polymer its unique identity (Fig. 3). The polymerization process gives the polymer a directional sense because the two ends of the polynucleotide are chemically distinct: one has an unreacted triphosphate group attached to the 5'-carbon (the 5'-P terminus) and the other has an unreacted hydroxyl group attached to the 3'-carbon (the 3'-OH terminus). As a consequence of this polarity, the chemical reaction required to extend a DNA polymer in the 5'→3' direction is different from that needed for 3'→5' extension. Although the chemical synthesis of nucleic acids proceeds from 3' to 5', all natural polymerase enzymes are only able to carry out 5'→3' synthesis.

Chemical synthesis of nucleic acids is a multistep process. First, functional groups on the monomeric units or bases must be protected by blocking agents to avoid their reactivity during polymerization. Second, only a phosphodiester bond between the 5'-O of the first nucleotide and the 3'-O of the second nucleotide must be allowed to form.

Figure 2. Structure of the four ribonucleotides AMP, GMP, CMP and UMP.

Figure 3. Chemical structures of RNA and DNA polymers each showing the four specific nucleotides. 3'-5' phosphodiester bridges link nucleotides together to form polynucleotide chains.


Box 1- Every biological molecule is informational

All biological macromolecules are composed of building blocks or units that have a sense, directionality or structural polarity. The two ends of any biological macromolecule, e.g. DNA, are chemically different and the molecule is said to have a head and a tail (the 5' and 3' ends in DNA). Because of this structural polarity, the order of units within any biological macromolecule is unique and specifies information in the same fashion that the alphabet is used to arrange letters into meaningful words and sentences. Some polymers are not rich in information, as in polysaccharides such as cellulose and starch, which are composed of the same unit (glucose) repeated over and over. In contrast, proteins and polynucleotides are very rich in information because their different building blocks can be arranged in a countless number of combinations, each of which carries meaning.

Biological macromolecules are maintained by weak forces, which include hydrogen bonds, van der Waals forces, ionic bonds and hydrophobic interactions. These weak forces are either intramolecular, contributing to the characteristic three-dimensional structure of the macromolecule, or intermolecular, thus determining biomolecular interactions. Weak forces create bonds that form and break constantly at physiological temperature, their cumulative action being necessary to impart stability to structures. The attributes of each weak force are briefly described below:

a. Van der Waals forces result from the electrical interactions between two closely approaching atoms or molecules. These interactions occur between the positively charged nuclei and the electron cloud or density of the incoming atom.

b. Hydrogen bonds form between a hydrogen atom covalently bonded to an electronegative atom (acting as a hydrogen donor) and a second electronegative atom that serves as the hydrogen bond acceptor. For instance, O-H····O, N-H····O and O-H····N are hydrogen bonds. Hydrogen bonds are more specific than van der Waals bonds because they require the presence of hydrogen donor groups and hydrogen acceptor groups.

c. Ionic bonds are due to the attractive forces that occur between oppositely charged groups, such as a negative carboxyl group and a positive amino group.

d. Hydrophobic interactions are the result of the strong tendency of water to exclude nonpolar groups or molecules. This exclusion explains why nonpolar groups tend to cluster in aqueous solution (in the same manner that oil droplets coalesce upon addition to water) and why the nonpolar regions of a biological macromolecule are often buried inside the molecule.
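The contrast drawn in Box 1 between information-poor and information-rich polymers can be made concrete with a little arithmetic: a chain of n units built from k kinds of building block can be written in k^n different ways. The sketch below is illustrative only. The function name is hypothetical, the chain length of 10 is an arbitrary choice, and the figure of 20 amino acids is the standard count, which is not stated in this chapter.

```python
def distinct_sequences(alphabet_size: int, length: int) -> int:
    """Number of different linear polymers of `length` units that can be
    written with `alphabet_size` kinds of building block."""
    if alphabet_size < 1 or length < 0:
        raise ValueError("alphabet_size must be >= 1 and length >= 0")
    return alphabet_size ** length


# Chain length 10 is arbitrary; 20 amino acids is the standard count (assumption).
for name, kinds in [("cellulose (glucose only)", 1),
                    ("polynucleotide (4 nucleotides)", 4),
                    ("protein (20 amino acids)", 20)]:
    print(f"{name}: {distinct_sequences(kinds, 10)} possible sequences")
```

A homopolymer has a single possible sequence whatever its length, whereas even a ten-unit polynucleotide already has more than a million.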

II.5. DNA is a double helix

The DNA isolated from different cells consists of two polynucleotide strands wound together to form a helical molecule, the DNA double helix. The two strands run in opposite directions and are said to be antiparallel; they are held together in the double helix by interchain hydrogen bonds, which form between the bases of the nucleotides through base pairing (see below). It is hard to imagine how much time and effort it took to unravel the three-dimensional structure of DNA, but it is clear that it was the most significant breakthrough in biology during the 20th century.

II.6. Main features of the double helix

The double helix discovered by James D. Watson and Francis Crick on Saturday, 7 March 1953, working in the Cavendish Laboratory at Cambridge University, comprises two polynucleotides wound around one another with the two chains running in opposite directions. Watson and Crick took advantage of Chargaff's rules and of various observations from X-ray diffraction studies performed mainly by Rosalind Franklin, who was herself very close to solving the structure.

In the late 1940s, Erwin Chargaff made a major contribution to the understanding of base pairing in DNA through his analysis of the base composition of various DNAs. His results showed that the four bases A, C, G and T do not occur in equimolar amounts and that the relative amount of each base varies from species to species. However, Chargaff noticed that certain pairs of bases, adenine and thymine on the one hand and guanine and cytosine on the other, are always found in a 1:1 ratio; he also noticed that the number of pyrimidine residues is always equal to the number of purine residues (Table 2). These findings are known as Chargaff's rules: [A] = [T]; [G] = [C]; [pyrimidines] = [purines].

These and other findings led Watson and Crick to conclude that DNA is a complementary double helix. The two strands of deoxyribonucleic acid are held together by hydrogen bonds formed between unique base pairs, always consisting of a purine in one strand and a pyrimidine in the other. Thus, if an A occurs in one strand, the complementary position in the other strand must be occupied by T. Likewise, a G in one strand dictates a C in the other. The restriction that A can only base-pair with T, and G only with C, means that the information contained in the sequence of one strand is conserved in the sequence of the other. Therefore, replication of DNA results in perfect copies of a parent molecule because the pre-existing strands dictate the sequences of the new strands.
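Complementarity and Chargaff's rules can be checked on any sequence with a few lines of code. The following Python sketch is only an illustration: the function names and the ten-base example strand are made up, and only the four standard DNA bases are handled. It builds the antiparallel complementary strand and then counts bases over both strands of the resulting duplex.

```python
from collections import Counter

# Watson-Crick pairing rules from the text: A with T, G with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """Return the complementary strand, also written 5'->3'.

    The strands are antiparallel, so the input is read backwards
    while each base is replaced by its partner.
    """
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


def duplex_base_counts(strand: str) -> Counter:
    """Count each base over both strands of the duplex formed by `strand`."""
    return Counter(strand.upper()) + Counter(reverse_complement(strand))


strand = "ATGCGGATTC"  # arbitrary example sequence (assumption)
counts = duplex_base_counts(strand)
print(reverse_complement(strand))
print(counts["A"] == counts["T"], counts["G"] == counts["C"])
print(counts["A"] + counts["G"] == counts["C"] + counts["T"])  # purines vs pyrimidines
```

Note that the single example strand does not itself obey [A] = [T]; the equalities only hold once both strands of the duplex are counted, which is the situation Chargaff analysed.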

Table 2. Molar ratios of bases in DNA isolated from various organisms.

Source          Adenine      Thymine      Adenine      Guanine      Purines
                to Guanine   to Cytosine  to Thymine   to Cytosine  to Pyrimidines
Human           1.56         1.75         1.00         1.00         1.00
Yeast           1.67         1.92         1.03         1.20         0.99
E. coli K-12    1.05         0.95         1.09         0.99         1.00
H. influenzae   1.74         1.54         1.07         0.91         1.00
Hen             1.45         1.29         1.06         0.91         0.99

The double helix described by Crick and Watson was called the B-form of DNA. It has the following main features:

1- It has a helical diameter of 2.37 nm, a rise per base pair of 0.34 nm and a pitch [3] of 3.4 nm, which corresponds to ten base pairs per turn. Thus, each base pair is twisted 36° clockwise with respect to the previous base pair.

2- It consists of two chains wound around each other in a right-handed double helix; when viewed from either end, the two chains run around each other in a clockwise fashion.

3- The two chains are said to be antiparallel because they show opposite polarity. One strand is oriented in the 5' to 3' direction while the other strand has the opposite 3' to 5' orientation.

4- The sugar-phosphate backbones are on the outside of the double helix, while the bases are oriented toward the central axis.

5- The two strands of DNA are held together by H bonds that form between the complementary purines and pyrimidines (according to Chargaff's rules), two in an A:T pair and three in a G:C pair (these H bonds account for a part of the stability of the double helix). The weakness of hydrogen bonds makes it easy to separate the two strands of DNA during replication and transcription.

6- The core of the helix consists of the base pairs, which, in addition to being H-bonded, stack together through hydrophobic interactions and van der Waals forces that contribute significantly to the overall stabilizing energy.

7- The sugar-phosphate backbones of the helix are not equally spaced along the helix axis. Consequently, the intertwined chains create a major groove and a minor groove of different sizes. The edges of the base pairs have a specific relationship to these grooves, which are large enough to allow proteins to make contact with the bases. The functional implication of this observation is that some proteins that bind to DNA can recognize specific nucleotide sequences by reading the pattern of H-bonding possibilities presented by the edges of the bases in the grooves (Fig. 4).

[3] The pitch is the distance taken up by a complete turn of the helix.

III. Classes of nucleic acids

As we mentioned above, the two major classes of nucleic acids are DNA and RNA. DNA has a central biological role in the sense that it contains all the information required for an organism's function, development and reproduction. All the information needed to make the functional macromolecules of the cell, including the DNA itself, is preserved in DNA. This information is made accessible through the process of transcription into RNA copies. In simple organisms such as the bacterium E. coli there is only a single copy of DNA (one circular chromosome), which is consistent with its singular purpose. In contrast, eukaryotic cells have many chromosomes, each of which is composed of one DNA molecule. DNA is also found in mitochondria and in chloroplasts, where it encodes a restricted set of proteins and RNAs unique to these organelles.

On the other hand, RNA occurs in multiple copies and many forms. Cells contain up to eight times as much RNA as DNA. Like DNA, RNA can form base pairs, A pairing with U and G pairing with C. Although RNA can base-pair with a DNA molecule (e.g. during transcription, reverse transcription, replication, etc.) or with another RNA molecule (e.g. during the splicing process, RNA replication, translation, etc.), the predominant form of RNA base pairing is intramolecular. The resulting folded structures (as in ribosomal and transfer RNAs) enable these molecules to carry out their functions (see below). Note that within the folded structures, base-paired regions adopt helical conformations, but these usually extend for a few tens of base pairs at most. RNA performs several important biological functions. RNA molecules can be classified into three major types, which are found in both eukaryotic and prokaryotic organisms: messenger RNA (mRNA), ribosomal RNA (rRNA) and transfer RNA (tRNA).


Nevertheless, the story of the RNA world was extended in the last quarter of the twentieth century with the discovery of new RNA species. For example, small nuclear RNA (snRNA) and small nucleolar RNA (snoRNA) are two RNA types found in eukaryotic cells; they are involved in post-transcriptional processing and modification of RNAs (see chapter V).

Figure 4. The double helix structure of DNA. On the left the structure is shown with the sugar-phosphate backbones drawn in red (outside) and the base pairs in black (inside). On the right the chemical structure of three base pairs is given (C:G, G:C and T:A).
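The B-form dimensions quoted in section II.6 (a rise of 0.34 nm per base pair and ten base pairs per turn) allow quick estimates of the size of a duplex. The Python sketch below is illustrative only: the function names are hypothetical, the 1000 bp fragment is an arbitrary example, and the results assume an ideal, straight B-form helix.

```python
RISE_PER_BP_NM = 0.34  # rise per base pair in B-DNA (section II.6)
BP_PER_TURN = 10       # base pairs per helical turn (section II.6)


def helix_length_nm(n_bp: int) -> float:
    """Contour length of an ideal B-form duplex of n_bp base pairs, in nm."""
    return n_bp * RISE_PER_BP_NM


def helical_turns(n_bp: int) -> float:
    """Number of complete helical turns in n_bp base pairs."""
    return n_bp / BP_PER_TURN


def twist_per_bp_degrees() -> float:
    """Rotation between successive base pairs, in degrees."""
    return 360 / BP_PER_TURN


n = 1000  # arbitrary fragment size in base pairs (assumption)
print(helix_length_nm(n), "nm long,", helical_turns(n), "turns")
print("pitch:", RISE_PER_BP_NM * BP_PER_TURN, "nm; twist:", twist_per_bp_degrees(), "degrees")
```

The pitch and twist recovered this way agree with the 3.4 nm and 36° given in the list of features, up to floating-point rounding.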

IV. Significance of chemical differences between DNA and RNA

Two fundamental chemical differences distinguish DNA from RNA: i) DNA contains 2'-deoxyribose instead of ribose, and ii) DNA contains thymine instead of uracil. Both differences make DNA a more stable polymer than RNA.

Let us first consider why DNA contains thymine instead of uracil. Cytosine can deaminate to form uracil in vivo, albeit at a low rate. Since C pairs with G in the opposing strand, whereas U would pair with A, conversion of a C to a U could result in a mutation (a stable change in nucleotide sequence). To counteract this reaction, a cellular mechanism checks the DNA, and when a U arises from a C by deamination, it is treated as inappropriate and is replaced by C [4]. If DNA normally contained U rather than T, this repair system could not distinguish U formed by C deamination from U correctly paired with A. However, DNA contains thymine, which is 5-methyluracil; the 5-methyl group tells the repair mechanism that this specific "uracil" belongs to DNA and should not be replaced.

On the other hand, the ribose 2'-OH group of RNA is absent in DNA (which instead contains deoxyribose with 2'-H). Consequently, the ubiquitous 3'-O of polynucleotide backbones lacks a vicinal hydroxyl neighbor in DNA. This difference leads to a greater resistance of DNA to hydrolysis: RNA is less stable than DNA because its vicinal 2'-OH group makes the 3'-phosphodiester bond susceptible to nucleophilic cleavage. For this reason, it is advantageous for the heritable form of genetic material to be DNA rather than RNA.

[4] In Escherichia coli, uracil residues that result from the deamination of cytosine in DNA are removed by an enzyme termed uracil-DNA glycosylase. Removal of uracil residues is followed by apyrimidinic (AP) endonuclease-dependent excision repair of the AP sites. Uracil-DNA glycosylase is believed to protect the genome against mutations resulting from the deamination of cytosine to uracil in ss or dsDNA.

V. Physical features of DNA

V.1. Denaturation and renaturation of DNA

When double-stranded DNA molecules are subjected to conditions that disrupt hydrogen bonds (e.g. high temperature or low ionic strength [5]), the strands are no longer held together and separate from each other. The DNA molecule is then said to be denatured. When temperature is the denaturing agent, denaturation is also termed melting. The melting temperature, or Tm, is the temperature at the midpoint of the melting transition, at which half of the base pairing in a double-stranded nucleic acid has been disrupted. DNAs from different sources have different Tm values because they have different G+C contents. As G and C are held together by three hydrogen bonds whereas A and T pairs have only two, GC-rich DNAs have higher Tm values than AT-rich DNAs. That is, the higher the G+C content of a DNA duplex, the higher the melting temperature. Tm also depends on the ionic strength of the solution; the lower the ionic strength, the lower the Tm. At 0.2 M Na+, Tm (in °C) is calculated as follows: Tm = 69.3 + 0.41(%G+C) [6]

After their separation, the two strands can come together to reform the duplex structure when the denaturing conditions are removed (e.g. when the solution is cooled down). This process, termed renaturation of DNA, requires the re-association (also called annealing) of complementary strands through their complementary bases. During the first step of renaturation, the strands align themselves so that their sequences are in register. This step is slow and depends on the DNA concentration and on time. Once the sequences are aligned correctly, the second step, termed zippering, is fast and leads to the generation of DNA duplexes.

[5] pH extremes also denature DNA. For instance, at pH greater than 11.5 or below 2.5 hydrogen bonding is disrupted, causing the denaturation of DNA.

[6] To avoid damaging DNA at high temperature, formamide is usually added. This reagent destabilizes the hydrogen bonds, thus allowing the DNA strands to dissociate at much lower temperatures (~40 °C).

V.2. Thermal melting profile or hyperchromic shift

The separation of double-stranded DNA into single strands, as a result of a denaturing agent, can be followed spectrophotometrically because the UV absorbance at 260 nm increases by as much as 40% as the strands dissociate. This increase, termed the hyperchromic shift, can be explained by the fact that bases are stacked within the double helix, which favors the interaction of their electron clouds. The UV absorbance of bases results from electronic transitions, whose intensity decreases when the bases stack. As a consequence, the stacked bases absorb less 260 nm radiation than expected from their numbers. The thermal melting profile is a plot of UV absorbance against temperature for a given dsDNA. As shown in figure 5, the thermal melting profile rises, roughly linearly through the transition, between two extremes: the minimal absorbance of dsDNA and the maximum absorbance of ssDNA. The midpoint between these two extremes is termed the melting temperature.

Figure 5. Melting curve. Heat denaturation of DNA from Haemophilus influenzae, whose genome has a G+C content of 38.15%. As shown on the plot, the absorption of UV radiation (at 260 nm) is about 40% greater for single-stranded DNA than for double-stranded DNA. The midpoint of the curve corresponds to the melting temperature Tm [Tm = 69.3 + 0.41(38.15) = 85 °C].
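The empirical relation between Tm and G+C content used in the caption above is easy to evaluate for any genome or sequence. The following Python sketch is illustrative: the function names are hypothetical, the formula is the one given in the text for 0.2 M Na+, and the helper that derives %G+C from a sequence assumes the sequence contains only A, C, G and T.

```python
def gc_percent(sequence: str) -> float:
    """Percentage of G and C bases in a DNA sequence (A, C, G, T only)."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def melting_temperature_c(gc_pct: float) -> float:
    """Tm in degrees Celsius at 0.2 M Na+, from Tm = 69.3 + 0.41(%G+C)."""
    if not 0 <= gc_pct <= 100:
        raise ValueError("gc_pct must be between 0 and 100")
    return 69.3 + 0.41 * gc_pct


# H. influenzae value from the Figure 5 caption.
print(round(melting_temperature_c(38.15)))  # rounds to 85, as in the caption
```

The unrounded value is about 84.94 °C, which the caption reports as 85 °C.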

V.3. Buoyant density of DNA depends as well on G+C content

It was found that GC-rich DNAs have higher buoyant densities than AT-rich DNAs. The same kind of linear relationship that links the melting temperatures of DNAs to their G+C contents applies to buoyant density. As a function of GC, the buoyant density (ρ) is given by the following equation: ρ = 1.66 + 0.098(GC), where GC is the molar fraction of G+C in the DNA. Figure 6 depicts the relationship between the densities (in g/ml CsCl) of microbial DNAs (taken from table 3, chapter III) and their G+C content. Note that DNAs that differ in their densities by 0.005 g/ml can be resolved as two distinct bands on a CsCl density gradient (see isopycnic centrifugation, Annex 1).

Figure 6. Dependence of melting temperature (Tm) and buoyant density of DNA (ρ) on the G+C content. Tm and ρ values have been calculated for microbial DNAs from table 3, chapter III. Note that there is a linear relationship between the Tm and ρ of DNAs and their GC richness. The maximum values are obtained with Mycobacterium tuberculosis DNA, with an average G+C content of 65.61% [Tm = 69.3 + 0.41(%G+C) = 96.2, ρ = 1.66 + 0.098(GC) = 1.724]. [Plots not reproduced: Tm (axis from 80 to 100) against % G+C (axis from 20 to 70), and ρ (axis from 1.685 to 1.73) against % G+C (axis from 20 to 70).]
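The buoyant density relation can be evaluated in the same way as Tm, and combined with the 0.005 g/ml resolution limit mentioned above to estimate how different two DNAs must be in G+C content to form separate bands. This Python sketch is illustrative only; the function names are hypothetical, and the resolution estimate simply divides the density difference quoted in the text by the slope of the equation.

```python
DENSITY_INTERCEPT = 1.66     # g/ml, from rho = 1.66 + 0.098(GC)
DENSITY_SLOPE = 0.098        # g/ml per unit molar fraction of G+C
RESOLVABLE_DIFFERENCE = 0.005  # g/ml, density difference resolvable on a CsCl gradient


def buoyant_density(gc_fraction: float) -> float:
    """Buoyant density in g/ml CsCl for a given molar fraction of G+C (0 to 1)."""
    if not 0 <= gc_fraction <= 1:
        raise ValueError("gc_fraction must be between 0 and 1")
    return DENSITY_INTERCEPT + DENSITY_SLOPE * gc_fraction


def min_resolvable_gc_difference() -> float:
    """Smallest difference in G+C molar fraction giving two distinct bands."""
    return RESOLVABLE_DIFFERENCE / DENSITY_SLOPE


# M. tuberculosis value from the Figure 6 caption (65.61% G+C).
print(round(buoyant_density(0.6561), 3))
print(round(100 * min_resolvable_gc_difference(), 1), "% G+C")
```

Under these assumptions, two DNAs need to differ by roughly five percentage points of G+C to be resolved as separate bands.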

VI. Hydrolysis of nucleic acids

VI.1. Two major classes of nucleases

Various treatments lead to hydrolysis of a nucleic acid molecule by breakage of the polynucleotide backbone. Here we will concentrate on the hydrolysis reactions brought about by a class of enzymes termed nucleases. These are present in virtually all cells, where they are involved in the normal metabolism of nucleic acids. Figure 7 represents a 9-nucleotide RNA molecule with 5'-PO4 and 3'-OH ends. As shown, each internal phosphate group is involved in two ester linkages (a phosphodiester bond), so hydrolysis of the polynucleotide backbone can occur on either side of the phosphorus atom. By convention the 3'-side of the phosphodiester bond is termed a and the 5'-side is termed b. Cleavage of the a bonds (a-cleavage) generates phosphate products that are different from those that result from b-cleavage (Fig. 7). Each nuclease or hydrolysis reaction is characterized as acting on either a or b bonds.

Figure 7. Hydrolysis of nucleic acids. A 9-nucleotide RNA molecule of sequence AUCGGUACC is presented in the 5'→3' orientation with 5'-P and 3'-OH groups. By convention, the 3'-side of each phosphodiester bond is termed a and the 5'-side is termed b. Cleavage of the sugar-phosphate-sugar backbone leads to different phosphate products depending on whether the a or the b side is cut. (A) Hydrolysis of the a bonds generates a mixture of 5'-NMPs (5'-phosphate products), whereas (B) hydrolysis of the b bonds yields a 5',3'-diphosphate nucleotide (5',3'-ADP), a mixture of 3'-NMPs (3'-phosphate products), and a nucleoside (cytidine, C).
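The products described in Figure 7 follow mechanically from which side of each phosphate is cut. The sketch below is only an illustration under stated assumptions: the function name is hypothetical, hydrolysis is taken to be complete (every a bond or every b bond is cut), and products are written in the usual shorthand where "pN" carries a 5'-phosphate and "Np" a 3'-phosphate.

```python
def hydrolysis_products(sequence, bond):
    """Products of complete hydrolysis of a 5'-P ... 3'-OH strand.

    bond = "a": every bond on the 3' side of a phosphodiester linkage is cut,
                so each phosphate stays on the 5' carbon of the next residue.
    bond = "b": every bond on the 5' side is cut, so each phosphate stays
                on the 3' carbon of the preceding residue.
    """
    if len(sequence) < 2:
        raise ValueError("need at least two nucleotides (one phosphodiester bond)")
    if bond == "a":
        # The 5'-terminal phosphate is already present, so all products are 5'-NMPs.
        return ["p" + base for base in sequence]
    if bond == "b":
        first = "p" + sequence[0] + "p"                   # 5',3'-diphosphate
        middle = [base + "p" for base in sequence[1:-1]]  # 3'-NMPs
        last = sequence[-1]                               # free nucleoside
        return [first] + middle + [last]
    raise ValueError("bond must be 'a' or 'b'")


rna = "AUCGGUACC"  # sequence of Figure 7
print(hydrolysis_products(rna, "a"))
# ['pA', 'pU', 'pC', 'pG', 'pG', 'pU', 'pA', 'pC', 'pC']
print(hydrolysis_products(rna, "b"))
# ['pAp', 'Up', 'Cp', 'Gp', 'Gp', 'Up', 'Ap', 'Cp', 'C']
```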

Some nucleases sequentially remove nucleotides from one end (3' or 5') of nucleic acids. These nucleases are called exonucleases. Other nucleases, termed endonucleases, cut phosphodiester bonds within the strands (at internal positions). Moreover, nucleases show specificity or selectivity toward their substrates. Although nonspecific nucleases hydrolyze both DNA and RNA, some act only on DNA (DNases), whereas others are specific for RNA (RNases). Table 3 lists the specificity of some nucleases.

Table 3. Nuclease specificity. A non-exhaustive list of nucleases. Note that nucleases are also called phosphodiesterases because the reaction they catalyze uses H2O to cleave phosphodiester bonds. "Cuts after pyrimidine or purine" means that the 3'-PO4 belongs to the pyrimidine or purine residue, respectively. (py), pyrimidine; (pu), purine.

Enzyme                           Substrate    a or b        Specificity

Endonucleases
Pancreatic RNase (RNase A)       RNA          b-cleavage    Cuts after pyrimidine
Bacillus subtilis RNase          RNA          b-cleavage    Cuts after purine
RNase T1                         RNA          b-cleavage    Cuts after guanine
RNase T2                         RNA          b-cleavage    Cuts after adenine
Pancreatic DNase (DNase I)       DNA          a-cleavage    Cuts between py and pu
Nuclease S1                      DNA, RNA     a-cleavage    Acts on single-stranded nucleic acids

Exonucleases
Snake venom phosphodiesterase    DNA, RNA     a-cleavage    Starts at 3'-end
Spleen phosphodiesterase         DNA, RNA     b-cleavage    Starts at 5'-end

VI.2. Restriction enzymes as a specific class of DNases

DNases are nucleases that hydrolyze DNA. Some DNases are specific for single-stranded DNA (ssDNA), whereas others act on double-stranded duplexes (dsDNA). Restriction enzymes (REs) or restriction endonucleases form a class of DNases that bind to dsDNA at specific sites and cleave the DNA either at the recognition site or at another position in the molecule. REs with various specificities have been isolated chiefly from prokaryotic systems. The term restriction comes from the fact that REs serve to defend cells against foreign DNA that might gain access to their cytoplasm: the foreign DNA is degraded into small, non-infective fragments (footnote 7).

A restriction enzyme is usually designated by a three-letter abbreviation based on the name of the species in which it occurs, followed by the strain designation and/or a Roman numeral. The latter distinguishes different enzymes from the same species. The endonuclease recognition site is written in the 5' to 3' orientation for one strand, with an arrow to indicate the site of cleavage (see Table 1, Annex 1).

Restriction endonucleases are classified into three distinct classes: type I, type II and type III. Types I and III recognize specific sites in DNA and catalyze, in an ATP-dependent reaction, the cleavage of DNA at a site located 24-26 bp (for type III) or >1000 bp (for type I) from the recognition site. Type II REs are the most common type in bacteria and have found widespread application in molecular biology. Type II REs do not require ATP to hydrolyze DNA, and most of their recognition sequences are short palindromes of 4-6 bp.

7 Note that every RE is part of a restriction-modification system (R-M system). The latter modifies DNA in a specific fashion, thus protecting the cell's own DNA against degradation (restriction) by its own endonucleases. The modification enzyme, termed a methylase, binds to the same site that is recognized by the endonuclease. When modified, the recognition site cannot be cleaved by the corresponding RE.
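The statement that most type II recognition sequences are short palindromes can be checked computationally: a site is palindromic in the DNA sense when it equals its own reverse complement. The following is a minimal sketch. The function names are illustrative, and the example sites (GAATTC for EcoRI, GGATCC for BamHI, AGCT for AluI) are well-known recognition sequences that are not taken from this chapter.

```python
# Translation table for Watson-Crick complements (uppercase DNA only).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(site):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return site.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(site):
    """True if both strands read the same in the 5' to 3' direction."""
    return site.upper() == reverse_complement(site)


for name, site in [("EcoRI", "GAATTC"), ("BamHI", "GGATCC"), ("AluI", "AGCT")]:
    print(name, site, is_palindromic(site))
# EcoRI GAATTC True
# BamHI GGATCC True
# AluI AGCT True

print(is_palindromic("GAATTG"))  # False: reverse complement is CAATTC
```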


Cleavage by type II REs occurs within or close to the recognition sequence. Some REs cut both strands at equivalent positions, generating flush or blunt ends, whereas other REs make staggered breaks to generate sticky ends (see Table 1, Annex 1 for applications).

VII. Conformational variation in the DNA double helix

The DNA inside living cells is thought to be predominantly in the B-form (described above). However, it is now clear that genomic DNA molecules are not entirely uniform in structure and can occur naturally in other double-helical forms. The base-pairing arrangement remains the same, but the sugar-phosphate groups that constitute the backbone are inherently flexible and can adopt different conformations. It has been recognized since the 1950s (footnote 8) that the dimensions of the double helix change when fibers containing DNA molecules are exposed to different relative humidities. An alternative form of the right-handed double helix is A-DNA. The A-form usually occurs only when relatively little water is available to hydrate the double helix. It is a right-handed, modified version of the double helix with a diameter of 2.55 nm, a rise of 0.29 nm per base pair and a pitch of 3.2 nm, corresponding to 11 base pairs per turn. A more drastic reorganization of the double helix is also possible, leading to the left-handed Z-DNA form, a slimmer version of the double helix with a diameter of only 1.84 nm (Fig. 8). The features of the different conformations of the DNA double helix are summarized in Table 4.

Table 4. Structural properties of the A-, B- and Z-forms of the DNA double helix.

Feature                                   B-DNA             A-DNA             Z-DNA
Type of helix                             Right-handed      Right-handed      Left-handed
Helical diameter                          2.37 nm           2.55 nm           1.84 nm
Rise per base pair                        0.34 nm           0.29 nm           0.37 nm
Pitch (distance per complete turn)        3.4 nm            3.2 nm            4.5 nm
Number of base pairs per complete turn    10                11                12
Topology of major groove                  Wide, deep        Narrow, deep      Flat
Topology of minor groove                  Narrow, shallow   Broad, shallow    Narrow, deep
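The rows of Table 4 are related to one another: the pitch is the rise per base pair multiplied by the number of base pairs per turn, and the average rotation between adjacent base pairs is 360° divided by the number of base pairs per turn. The sketch below recomputes both from the tabulated rise and base pairs per turn. The dictionary layout, the function names and the sign convention (negative rotation for a left-handed helix) are assumptions made for illustration.

```python
# Rise and base pairs per turn as listed in Table 4; handedness is +1 for
# right-handed and -1 for left-handed (sign convention assumed here).
HELIX_FORMS = {
    "B": {"rise_nm": 0.34, "bp_per_turn": 10, "handedness": +1},
    "A": {"rise_nm": 0.29, "bp_per_turn": 11, "handedness": +1},
    "Z": {"rise_nm": 0.37, "bp_per_turn": 12, "handedness": -1},
}


def pitch_nm(form):
    """Length of one complete helical turn: rise per bp times bp per turn."""
    helix = HELIX_FORMS[form]
    return helix["rise_nm"] * helix["bp_per_turn"]


def rotation_per_bp_deg(form):
    """Average rotation about the helix axis between adjacent base pairs."""
    helix = HELIX_FORMS[form]
    return helix["handedness"] * 360 / helix["bp_per_turn"]


for form in HELIX_FORMS:
    print(f"{form}-DNA: pitch {pitch_nm(form):.2f} nm, "
          f"rotation {rotation_per_bp_deg(form):+.1f} deg per bp")
# B-DNA: pitch 3.40 nm, rotation +36.0 deg per bp
# A-DNA: pitch 3.19 nm, rotation +32.7 deg per bp
# Z-DNA: pitch 4.44 nm, rotation -30.0 deg per bp
```

The computed pitches are close to the tabulated 3.2 nm (A-DNA) and 4.5 nm (Z-DNA); the small differences presumably reflect rounding of the tabulated values.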

8 Rosalind Franklin, working at King's College London, distinguished the A- and B-forms of DNA.

VIII. Tertiary structure of DNA and linking number

So far, DNA has been treated as a regular, linear molecule, whichever of the three conformations the double helix adopts. However, a DNA molecule can also adopt regular structures of higher complexity. For instance, most bacterial chromosomes are double-stranded covalently closed circular DNA (ds cccDNA). Moreover, in mitochondria and chloroplasts, in most bacterial plasmids and in some viruses, the DNA is also double-stranded and covalently closed circular, i.e. it has no free 3' and 5' ends. Studies of E. coli, the best-known bacterium, showed that its circular genome is supercoiled, i.e. the genomic DNA carries additional turns that generate torsional stress (see DNA replication).

VIII.1. Supercoiling

In linear B-DNA, the two chains wind around each other once every 10 bp, corresponding to a complete turn of the helix. In a linear double helix, any torsional stress is released by rotation of the free ends, so supercoils do not persist. Supercoiling occurs when additional turns are introduced into the DNA duplex (positive supercoils) or when turns are removed (negative supercoils). In contrast to linear DNA, a circular DNA undergoing supercoiling cannot release the stress in the same way because no free ends are available to rotate. Supercoiling of DNA can be compared to the twisting of a double-stranded rope so that it is torsionally stressed. Negative supercoils introduce a torsional stress that favors the unwinding of a right-handed double helix (the DNA is said to be underwound), whereas positive supercoils overwind such a helix. A circular DNA therefore responds to supercoiling by winding around itself to form a more compact structure, which helps its packaging into a small space.

Figure 8. Different conformations of the DNA double helix in B-DNA (left), A-DNA (center) and Z-DNA (right) forms.


VIII.2. Linking number

The linking number or topological winding number (designated L, Lk or α) is the basic parameter of a supercoiled DNA. It is defined as the number of times one strand winds around the other in a double-stranded cccDNA. By convention, the linking number is positive in right-handed DNA. Relaxed cccDNAs are regarded as linear B-DNA with covalently joined ends and, provided both strands remain uncut, L cannot change. Nevertheless, any cccDNA can occur in distinct topoisomeric forms.

Let us consider a relaxed cccDNA of 600 bp with a pitch of 10 bp/turn (DNA in the B-form); in this duplex L is 60. Because this linking number is the same as that of an equivalent linear B-DNA, it is taken as the reference parameter and called the duplex winding number (designated L0, Lk0 or β). L is the sum of two components, L = T + W. The twist (T or Tw) is the number of helical turns and is related to the pitch (p) of the helix by T = N/p, where N is the number of base pairs in the molecule. The writhe (W) or writhing number (Wr) is the number of superhelical turns. Thus, in the relaxed 600-bp cccDNA duplex, W = 0 and L = T = 60.

Note that the linking number of any cccDNA molecule is a topological property of that molecule; as a consequence, it cannot change without strand breakage. L changes only when one or both strands are broken, wound tighter or looser, and rejoined covalently. Enzymes capable of such reactions are called topoisomerases because they modify the topological state of DNA. If six positive supercoils are introduced by a topoisomerase into the 600-bp cccDNA duplex, then W = +6, T remains unchanged and L = 60 + 6 = 66. The superhelix winding number, ΔL = τ = L − L0 (here ΔL = +6), indicates the number of superhelical turns in a given ds cccDNA molecule. An underwound molecule has fewer turns than the equivalent relaxed molecule, so that L < L0 and ΔL is negative, i.e. the molecule is said to be negatively supercoiled. If ΔL is positive, as in the example above, the molecule is overwound and positively supercoiled. In general, naturally occurring circular DNA is negatively supercoiled.

The equation L = T + W, as used above, assumes that the pitch remains constant at the value for the relaxed molecule, so that any supercoils (negative or positive) are accommodated by contortion of the DNA axis (writhe). However, supercoiled DNA is under strain, and part of this strain can be relieved by a change in the pitch, and hence in the twist T. In the extreme case, all supercoils are removed by changing the pitch, and the molecule then appears relaxed. For example, if ΔL = +6 in a 600-bp cccDNA molecule in the B-form and W = 0, then L = T = 66 and the pitch becomes 600/66 ≈ 9.1 bp per turn (instead of 10).

Another useful parameter, termed the superhelix density or specific linking difference, is defined as σ = ΔL/L0. In the previous example, σ = ΔL/L0 = +6/60 = +0.1. σ is a ratio, i.e. a measure of the degree of supercoiling that is independent of DNA length. Its sign (+ or −) corresponds to positive or negative supercoiling: a DNA molecule with a negative σ tends to unwind, whereas a positive σ indicates that the molecule is overwound. In other words, the superhelix density is the number of supercoils per helical turn (per 10 bp in B-DNA).
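The arithmetic of the 600-bp example can be collected in a few lines. This is a sketch only: the function and key names are illustrative, the twist is assumed to stay at its relaxed value (all supercoils taken up as writhe), and the last entry gives the pitch that would result if the same linking number were instead absorbed entirely by twist (W = 0).

```python
def supercoiling_parameters(n_bp, writhe, bp_per_turn=10.0):
    """Linking-number bookkeeping for a covalently closed circular duplex.

    n_bp        -- number of base pairs N
    writhe      -- number of superhelical turns W (sign gives the handedness)
    bp_per_turn -- pitch p of the relaxed helix in bp per turn (10 for B-DNA)
    """
    l0 = n_bp / bp_per_turn   # duplex winding number L0 (relaxed reference)
    twist = l0                # T = N/p, assumed unchanged by supercoiling
    linking = twist + writhe  # L = T + W
    delta_l = linking - l0    # superhelix winding number, delta L = L - L0
    sigma = delta_l / l0      # superhelix density, sigma = delta L / L0
    return {
        "L0": l0,
        "T": twist,
        "W": writhe,
        "L": linking,
        "delta_L": delta_l,
        "sigma": sigma,
        # Pitch if all of L were accommodated by twist alone (W = 0).
        "pitch_if_W_is_0": n_bp / linking,
    }


p = supercoiling_parameters(600, +6)
print(f"L0 = {p['L0']:.0f}, L = {p['L']:.0f}, delta L = {p['delta_L']:+.0f}")
# L0 = 60, L = 66, delta L = +6
print(f"sigma = {p['sigma']:+.2f}, pitch if W = 0: {p['pitch_if_W_is_0']:.2f} bp/turn")
# sigma = +0.10, pitch if W = 0: 9.09 bp/turn
```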


The positive free energy associated with negatively supercoiled DNA has many functional consequences. For instance, it facilitates the formation of secondary structures (cruciforms and localized regions of Z-DNA); it also favors the unwinding of DNA during processes such as replication and transcription.

VIII.3. Intercalating agents

Some hydrophobic molecules, called intercalating agents, have a flat structure composed of fused heterocyclic rings and can insert (intercalate) between the stacked base pairs (footnote 9) in double-stranded DNA or in base-paired regions of single-stranded DNA. These agents, such as ethidium bromide, acridine orange and actinomycin, force the base pairs apart and cause a local unwinding of the double helix, so that it is converted locally to a ladder-like structure. The unwinding increases the length of the DNA by extension of the deoxyribose-phosphate backbone. As a consequence, the rotational angle about the helix axis between adjacent base pairs is reduced (< 36°). In linear dsDNA, these effects cause an increase in the viscosity and a decrease in the sedimentation coefficient (s) of the DNA molecule in proportion to the amount of dye intercalated. Intercalative binding is restricted to one molecule per 2-3 bp, i.e. the binding of an intercalating agent at one site prevents the binding of others at the adjacent sites, a principle referred to as the neighbor exclusion principle.

In circular DNA, intercalation leads to a local increase in pitch and a decrease in twist. As the linking number (L = T + W) cannot change without strand breakage and reunion, any change in the twist must be compensated by an equal and opposite change in the writhe. As a result, when an intercalating dye inserts into a negatively supercoiled molecule, the twist decreases and the negative writhe is progressively reduced to zero. At this point the molecule behaves as if it were relaxed. A further increase in the amount of intercalated dye causes the writhe to increase again, as the molecule becomes positively supercoiled.

L = T + W = 60 − 6 = 54 (negatively supercoiled)

L = T + W = 54 + 0 = 54 (relaxed due to a change in pitch and, as a result, in twist)

L = T + W = 48 + 6 = 54 (positively supercoiled to accommodate further change in twist)

The change in writhe is accompanied by changes in the viscosity and the sedimentation coefficient of the DNA molecule: the viscosity increases to a maximum and then decreases, whereas the sedimentation coefficient first decreases to a minimum and then increases. Intercalation thus relieves the stress of negative supercoiling. For this reason, a negatively supercoiled DNA has a higher affinity for intercalating agents than the equivalent relaxed or positively supercoiled molecule.
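The three lines above are points on a continuous titration: L stays fixed at 54 while the dye lowers T, and W follows as W = L − T. The sketch below tabulates this for a few intermediate twist values. The function names and the chosen twist values (57 and 51) are illustrative assumptions; no particular unwinding angle per dye molecule is assumed.

```python
def writhe_at_constant_linking(linking_number, twist):
    """W = L - T for a closed circle whose linking number cannot change."""
    return linking_number - twist


def supercoil_state(writhe):
    """Describe the topological state from the sign of the writhe."""
    if writhe < 0:
        return "negatively supercoiled"
    if writhe > 0:
        return "positively supercoiled"
    return "relaxed"


L = 54  # fixed linking number of the example
for T in (60, 57, 54, 51, 48):  # twist falls as more dye intercalates
    W = writhe_at_constant_linking(L, T)
    print(f"T = {T}, W = {W:+d}, {supercoil_state(W)}")
# T = 60, W = -6, negatively supercoiled
# T = 57, W = -3, negatively supercoiled
# T = 54, W = +0, relaxed
# T = 51, W = +3, positively supercoiled
# T = 48, W = +6, positively supercoiled
```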

9 The intercalative ability of these agents indicates that the van der Waals interactions they form with the base pairs are more favorable than the similar interactions between the bases themselves.


VIII.4. Cruciforms or palindromes

Palindromes are words or sentences that read the same backward and forward, such as "radar". A palindromic sequence is a region of a nucleic acid that contains a pair of inverted repeats. Palindromes occur in DNA duplexes or in the double-stranded regions of single-stranded molecules. Such regions have the potential to form a tertiary structure known as a cruciform, in which each of the two strands forms a hairpin by intrastrand base pairing. Palindromes or cruciforms display two-fold rotational symmetry (dyad symmetry), or hyphenated dyad symmetry if the two inverted repeats are separated by another sequence (Fig. 9). Each palindromic sequence can adopt two possible conformations: a linear structure with interstrand hydrogen bonding, or a cruciform (cross-shaped) structure with intrastrand hydrogen bonding. Cruciforms are less stable than the normal DNA duplex. However, the formation of cruciforms tends to relieve the torsional stress of negatively supercoiled DNA. Palindromic sequences occur in many regulatory sequences and DNA replication origins, where they seem to be important; for example, they potentially produce distinctive recognition sites for specific DNA-binding proteins.

Figure 9. Cruciform loops. The formation of a cruciform structure from a palindromic sequence is usually promoted by negative supercoiling that causes a localized disruption of hydrogen bonding between base pairs in DNA. Note that the two inverted repeats are self-complementary and can rearrange to form cruciform loops by intrastrand pairing.

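[Figure 9 diagram. Linear form of the palindromic duplex:

GTTCTTGCATTGAGCGTAATGCAAGGCTT

CAAGAACGTAACTCGCATTACGTTCCGAA

The cruciform form, in which the inverted repeats of each strand pair with each other to give two hairpin loops flanked by GTT/CAA and GCTT/CGAA, is not reproduced here.]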
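The hyphenated dyad symmetry of the Figure 9 sequence can be located by searching one strand for a segment whose reverse complement occurs a short distance downstream. The following is a minimal sketch. The function names and the search limits (`min_arm`, `max_spacer`) are illustrative assumptions, and the brute-force search is meant only for short sequences such as this one.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand read in the opposite direction."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq, min_arm=6, max_spacer=10):
    """List (start, arm_length, spacer_length) for every inverted repeat.

    An inverted repeat is an arm followed, after a spacer of 0 to max_spacer
    bases, by the reverse complement of that arm on the same strand.
    """
    hits = []
    n = len(seq)
    for start in range(n):
        for arm in range(min_arm, (n - start) // 2 + 1):
            partner = reverse_complement(seq[start:start + arm])
            for spacer in range(max_spacer + 1):
                right = start + arm + spacer
                if seq[right:right + arm] == partner:
                    hits.append((start, arm, spacer))
    return hits


top_strand = "GTTCTTGCATTGAGCGTAATGCAAGGCTT"  # upper strand of Figure 9
hits = find_inverted_repeats(top_strand)
start, arm, spacer = max(hits, key=lambda hit: hit[1])  # longest arms

left_arm = top_strand[start:start + arm]
loop = top_strand[start + arm:start + arm + spacer]
right_arm = top_strand[start + arm + spacer:start + 2 * arm + spacer]
print(left_arm, loop, right_arm)
# CTTGCATT GAGCGT AATGCAAG
```

The two 8-base arms can pair with each other within the strand to form the stem of a hairpin, leaving the 6-base spacer as the loop.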