©2000 Timothy G. Standish
Structure and Analysis of DNA

Introduction: The Central Dogma of Molecular Biology
[Figure: within the cell, DNA is transcribed into mRNA, which is translated on the ribosome into a polypeptide (protein).]

Outline
1. How we know DNA is the genetic material
2. Basic structure of DNA and RNA
3. Ways in which DNA can be studied and what they tell us about genomes

Historical Events
• 1869 Friedrich Miescher identified DNA, which he called nuclein, from pus cells
• 1889 Richard Altmann renamed nuclein nucleic acid
• 1928 Griffith discovered that genetic information could be passed from one bacterium to another; this is known as the transforming principle
• 1944 Avery showed that the transforming material was pure DNA, not protein, lipid or carbohydrate
• 1952 Hershey and Chase used bacteriophage (virus) and E. coli to show that only viral DNA entered the host
• 1953 Watson and Crick discovered that the structure of DNA was a double helix

Transformation Of Bacteria: Two Strains Of Streptococcus
[Figure: the smooth strain (virulent) has capsules; the rough strain (harmless) does not.]

Transformation Of Bacteria: Griffith’s 1928 Experiment
[Figure: Griffith’s experiment, with panels labeled "- Control", "+ Control", "- Control" and "Experimental".]

Avery, MacLeod and McCarty
• 1944 Avery, MacLeod and McCarty repeated Griffith’s 1928 experiment with modifications designed to discover the “transforming factor”
• After extraction with organic solvents to eliminate lipids, the remaining extract from heat-killed cells was digested with hydrolytic enzymes specific for different classes of macromolecules:

Enzyme        Transformation?
Nuclease      No
Protease      Yes
Saccharase    Yes

The Hershey-Chase Experiment
• The Hershey-Chase experiment showed definitively that DNA is the genetic material
• Hershey and Chase took advantage of the fact that T2 phage is made of only two classes of macromolecules: protein and DNA
• Nucleotides contain phosphorus, thus DNA contains phosphorus, but not sulfur.
• Some amino acids contain sulfur, thus proteins contain sulfur, but not phosphorus.
[Figure: structure of a nucleotide with its phosphate group, and structures of the sulfur-containing amino acids cysteine and methionine.]

Using S35: Is protein the genetic material?
• Bacteria grown in normal non-radioactive media
• T2 grown in S35-containing media incorporate S35 into their proteins
• T2 attach to bacteria and inject genetic material
• Blending causes the phage protein coats to fall off
• When centrifuged, phage protein coats remain in the supernatant while bacteria form a pellet
• The supernatant is radioactive, but the pellet is not.
• Did protein enter the bacteria?

Using P32: Is DNA the genetic material?
• Bacteria grown in normal non-radioactive media
• T2 grown in P32-containing media incorporate P32 into their DNA
• T2 attach to bacteria and inject genetic material
• Blending causes the phage protein coats to fall off
• When centrifuged, phage protein coats remain in the supernatant while bacteria form a pellet
• The pellet is radioactive, but the supernatant is not.
• Did DNA enter the bacteria?

A Nucleotide: Adenosine Monophosphate (AMP)
[Figure: structure of AMP with the phosphate, the sugar (carbons numbered 1’ to 5’) and the base labeled. The sugar plus the base is the nucleoside; the nucleoside plus the phosphate is the nucleotide.]

[Figure: structures of the bases. Purines: adenine and guanine. Pyrimidines: cytosine, thymine (DNA) and uracil (RNA).]

Base Pairing: Guanine And Cytosine
[Figure: guanine paired with cytosine, with partial charges (+ and -) marked on the hydrogen-bonding groups.]

Base Pairing: Adenine And Thymine
[Figure: adenine paired with thymine, with partial charges (+ and -) marked on the hydrogen-bonding groups.]

Base Pairing: Adenine And Cytosine
[Figure: adenine placed opposite cytosine, with partial charges (+ and -) marked.]

Base Pairing: Guanine And Thymine
[Figure: guanine placed opposite thymine, with partial charges (+ and -) marked.]

Some minor purine and pyrimidine bases
[Figure: structures not recoverable.]

DNA
[Figure: nucleotides joined into DNA strands. Labels: sugar-phosphate backbone, bases, H2O, 5’ phosphate group, 3’ hydroxyl group.]

The Watson-Crick Model Of DNA
[Figure: the double helix with paired bases (A with T, G with C), the major groove and the minor groove. Dimension labels: 3.4 nm, 1 nm and 0.34 nm.]

Forms of the Double Helix
[Figure: side-by-side models of A DNA, B DNA and Z DNA, with the major and minor grooves marked on A DNA and B DNA. Labeled values:]

Form     Rotation/bp   bp/turn   Dimension labels
A DNA    +34.7°        11        0.26 nm, 2.8 nm, 1.2 nm
B DNA    +34.6°        10.4      0.34 nm, 3.9 nm, 1 nm
Z DNA    -30.0°        12        0.57 nm, 6.8 nm, 0.9 nm
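
The rotation per base pair and the number of base pairs per turn quoted for each form are two views of the same quantity, since one full turn is 360°. The Python sketch below works through that arithmetic; the function name is illustrative, and the sign convention (+ for a right-handed helix, - for a left-handed one) is an assumption passed in explicitly. Note that 11 bp/turn works out to about 32.7° per base pair, so the +34.7° label for A DNA may be a slip in the original figure.

```python
def rotation_per_bp(bp_per_turn: float, right_handed: bool = True) -> float:
    """Average helical rotation per base pair, in degrees."""
    magnitude = 360.0 / bp_per_turn
    # Assumed convention: positive for right-handed, negative for left-handed.
    return magnitude if right_handed else -magnitude


# bp/turn values as labeled for the three forms; Z DNA is the left-handed one.
for name, bp_per_turn, right_handed in [("A", 11, True), ("B", 10.4, True), ("Z", 12, False)]:
    angle = rotation_per_bp(bp_per_turn, right_handed)
    print(f"{name} DNA: {angle:+.1f} degrees per bp")
```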

A-DNA:
1. Large hole in center
2. Sugar-phosphate backbone is at the edge
3. Bases are displaced towards the edge
B-DNA:
1. Bases in center (no hole)
2. Phosphates at periphery
Z-DNA:
1. Bases present throughout the matrix of the helix
2. No exclusive domains for either bases or backbone
3. Left-handed helix

Biological Significance
• A-DNA occurs only in dehydrated samples of DNA, such as those used in crystallographic experiments; the A form is possibly also assumed by DNA-RNA hybrid helices and by regions of double-stranded RNA.
• Where Z-DNA has been found, it is commonly believed to relieve torsional strain (supercoiling) while DNA transcription occurs. The potential to form a Z-DNA structure also correlates with regions of active transcription.

Even More Forms Of DNA
C-DNA:
– Exists only under high dehydration conditions
– 9.3 bp/turn, 1.9 nm diameter and tilted bases
D-DNA:
– Occurs in helices lacking guanine
– 8 bp/turn
E-DNA:
– Like D-DNA, occurs in helices lacking guanine
– 7.5 bp/turn
P-DNA:
– Artificially stretched DNA with phosphate groups found inside the long thin molecule and bases closer to the outside surface of the helix
– 2.62 bp/turn
B-DNA appears to be the most common form in vivo. However, under some circumstances, alternative forms of DNA may play a biologically significant role.

Certain DNA sequences adopt unusual structures
• Palindrome: The term is applied to regions of DNA with inverted repeats of base sequence having twofold symmetry over two strands of DNA. Such sequences are self-complementary within each strand and therefore have the potential to form hairpin or cruciform (cross-shaped) structures.
• Mirror repeats: When the inverted repeat occurs within each individual strand of the DNA, the sequence is called a mirror repeat.
• Mirror repeats do not have complementary sequences within the same strand and cannot form hairpin or cruciform structures.
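
The difference between the two kinds of repeat can be stated as a simple test on one strand: a palindrome reads the same as its own reverse complement, while a mirror repeat reads the same as its plain reverse. The Python sketch below illustrates this; the function names and the two example sequences are illustrative choices, not taken from the slides.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, read in the same 5'-to-3' direction."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindrome(seq: str) -> bool:
    """True if the strand equals its reverse complement (inverted repeat)."""
    return seq.upper() == reverse_complement(seq)


def is_mirror_repeat(seq: str) -> bool:
    """True if the strand reads the same forwards and backwards."""
    return seq.upper() == seq.upper()[::-1]


# Illustrative sequences (hypothetical examples).
for seq in ["GAATTC", "TTAGCACGATT"]:
    print(seq, "palindrome:", is_palindrome(seq), "mirror repeat:", is_mirror_repeat(seq))
```

A strand that passes `is_palindrome` is self-complementary and so could fold back on itself; one that only passes `is_mirror_repeat` is not.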

Keto-enol tautomerism
• Keto-enol tautomerism refers to a chemical equilibrium between a keto form (a ketone or an aldehyde) and an enol (an alcohol)
• In DNA, the nucleotide bases are in the keto form.
• Rare enol tautomers of the bases G and T can lead to mutation because of their altered base-pairing properties.

Triplex DNA
• Nucleotides participating in a Watson-Crick base pair can form a number of additional hydrogen bonds, particularly with functional groups arrayed in the major groove. For example, a cytidine residue (if protonated) can pair with the guanosine residue of a G-C nucleotide pair.
• The N-7, O6, and N6 of purines, the atoms that participate in the hydrogen bonding of triplex DNA, are often referred to as Hoogsteen positions, and the non-Watson-Crick pairing is called Hoogsteen pairing.
• The triplexes form most readily within long sequences containing only pyrimidines or only purines in a given strand
• Four DNA strands can also pair to form a tetraplex

H-DNA
• A particularly exotic DNA structure, known as H-DNA, is found in polypyrimidine or polypurine tracts that also incorporate a mirror repeat. A simple example is a long stretch of alternating T and C residues.

Structure of RNA
• The single strand of RNA tends to assume a right-handed helical conformation dominated by base-stacking interactions, which are strongest between two purines
• The purine-purine interaction is so strong that a pyrimidine separating two purines is often displaced from the stacking pattern so that the purines can interact
• RNA can base-pair with complementary regions of either RNA or DNA. The pattern matches that for DNA: G pairs with C and A pairs with U (or with T in a DNA strand); however, base pairing between G and U is also fairly common in RNA.
• Where complementary sequences are present, the predominant double-stranded structure is an A-form right-handed double helix.
• Hairpin loops form between nearby self-complementary sequences.
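
The pairing rules for RNA can be written down as a small lookup. The Python sketch below classifies a pair of bases as Watson-Crick, as a G-U pair, or as unpaired, and applies that to the two arms of a hairpin stem. The function names and the example arms are hypothetical, and "G-U wobble" is the usual name for the G-U pair rather than a term used on the slide.

```python
WATSON_CRICK = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")}
WOBBLE = {("G", "U"), ("U", "G")}


def rna_pair_type(a: str, b: str) -> str:
    """Classify the pairing between two RNA bases."""
    pair = (a.upper(), b.upper())
    if pair in WATSON_CRICK:
        return "Watson-Crick"
    if pair in WOBBLE:
        return "G-U wobble"
    return "none"


def stem_pair_types(five_prime_arm: str, three_prime_arm: str) -> list[str]:
    """Pair the first base of the 5' arm with the last base of the 3' arm, and so on.

    Both arms are written 5' to 3', so the 3' arm is read backwards.
    """
    return [rna_pair_type(a, b) for a, b in zip(five_prime_arm, reversed(three_prime_arm))]


# Hypothetical stem: the third position is a G-U pair.
print(stem_pair_types("GGGC", "GUCC"))
```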

• Short base sequences (such as UUCG) are often found at the ends of RNA hairpins and are known to form particularly tight and stable loops.
• Additional structural contributions are made by hydrogen bonds that are not part of standard Watson-Crick base pairs. For example, the 2’-hydroxyl group of ribose can hydrogen-bond with other groups.
• rRNA has a characteristic secondary structure due to many intramolecular H-bonds

Structure Of t-RNA
[Figure: structure of tRNA.]

Denaturation and Renaturation
• Heating double-stranded DNA can overcome the hydrogen bonds holding it together and cause the strands to separate, resulting in denaturation of the DNA
• When cooled, the relatively weak hydrogen bonds between bases can reform and the DNA renatures

[Figure: denaturation converts double-stranded DNA into single-stranded (denatured) DNA, and renaturation converts it back. The two paired strands shown are:]
TACTCGACATGCTAGCAC
ATGAGCTGTACGATCGTG
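
The two strands in the figure are related base by base through the pairing rules (A with T, G with C). The Python sketch below derives the partner strand from the first one; the helper name `complement` is illustrative, and the strands are taken as written in the figure, aligned position by position.

```python
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Base-by-base partner of a DNA strand, in the same left-to-right order."""
    return "".join(PAIRS[base] for base in strand.upper())


top = "TACTCGACATGCTAGCAC"
bottom = complement(top)
print(top)
print(bottom)  # the strand that reanneals with `top` on cooling
```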

Denaturation and Renaturation
• DNA with a high guanine and cytosine content has relatively more hydrogen bonds between strands
• This is because for every GC base pair 3 hydrogen bonds are made, while for AT base pairs only 2 bonds are made
• Thus higher GC content is reflected in a higher melting or denaturation temperature

[Figure: three double-stranded examples; the two paired strands of each are listed.]
67 % GC content - High melting temperature
TGCTCGACGTGCTCG
ACGAGCTGCACGAGC
50 % GC content - Intermediate melting temperature
TACTCGACAGGCTAG
ATGAGCTGTCCGATC
33 % GC content - Low melting temperature
TACTAGACATTCTAG
ATGATCTGTAAGATC
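
The link between GC content and the number of hydrogen bonds can be checked directly on the three example sequences. The Python sketch below counts G and C bases and applies 3 bonds per GC pair and 2 per AT pair; the function names are illustrative. Counting one 15-base strand of each duplex, the middle example comes to 8 of 15 bases (about 53 %), which the slide labels as 50 %.

```python
def gc_content(strand: str) -> float:
    """Fraction of bases in the strand that are G or C."""
    s = strand.upper()
    return (s.count("G") + s.count("C")) / len(s)


def hydrogen_bonds(strand: str) -> int:
    """Hydrogen bonds holding the strand to its partner: 3 per GC pair, 2 per AT pair."""
    s = strand.upper()
    gc_pairs = s.count("G") + s.count("C")
    at_pairs = s.count("A") + s.count("T")
    return 3 * gc_pairs + 2 * at_pairs


# One strand from each duplex in the figure.
examples = {
    "high": "TGCTCGACGTGCTCG",
    "intermediate": "TACTCGACAGGCTAG",
    "low": "TACTAGACATTCTAG",
}
for label, strand in examples.items():
    print(f"{label}: {gc_content(strand):.0%} GC, {hydrogen_bonds(strand)} hydrogen bonds")
```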

Determination of GC Content
• Comparison of melting temperatures can be used to determine the GC content of an organism’s genome
• To do this it is necessary to be able to detect whether DNA is melted or not
• Absorbance at 260 nm of DNA in solution provides a means of determining how much is single stranded
• Single-stranded DNA absorbs 260 nm ultraviolet light more strongly than double-stranded DNA does, although both absorb at this wavelength
• Thus, increasing absorbance at 260 nm during heating indicates an increasing concentration of single-stranded DNA

Determination of GC Content
[Figure: melting curves. Vertical axis: OD260, from 0 to 1.0; horizontal axis: temperature, 65 to 95 °C. The DNA is double stranded at the low-temperature end of each curve and single stranded at the high-temperature end. The curve for DNA with relatively low GC content has Tm = 75 °C; the curve for DNA with relatively high GC content has Tm = 85 °C.]
Tm is the temperature at which half the DNA is melted
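
Given absorbance readings taken during heating, Tm can be estimated as the temperature at which OD260 is halfway between its double-stranded and single-stranded values. The Python sketch below does this by linear interpolation. It assumes a curve that rises with temperature, and the readings are invented for illustration, not measured data.

```python
def melting_temperature(temps: list[float], absorbances: list[float]) -> float:
    """Temperature at which absorbance is midway between its minimum and maximum."""
    half = (min(absorbances) + max(absorbances)) / 2
    points = list(zip(temps, absorbances))
    # Walk along neighbouring readings and interpolate where the curve crosses `half`.
    for (t0, a0), (t1, a1) in zip(points, points[1:]):
        if a0 <= half <= a1 and a1 != a0:
            return t0 + (half - a0) * (t1 - t0) / (a1 - a0)
    raise ValueError("curve never crosses the half-melted absorbance")


# Hypothetical readings: temperature in degrees C and relative OD260.
temps = [65, 70, 75, 80, 85, 90, 95]
od260 = [0.50, 0.52, 0.60, 0.80, 0.95, 0.99, 1.00]
print(f"Estimated Tm: {melting_temperature(temps, od260):.2f} C")
```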

GC Content Of Some Genomes

Organism                 % GC
Phage T7                 48.0 %
Homo sapiens             39.7 %
Sheep                    42.4 %
Hen                      42.0 %
Turtle                   43.3 %
Salmon                   41.2 %
Sea urchin               35.0 %
E. coli                  51.7 %
Staphylococcus aureus    50.0 %
Phage                    55.8 %

Hybridization
• The bases in DNA will only pair in very specific ways, G with C and A with T
• In short DNA sequences, imprecise base pairing will not be tolerated
• Long sequences can tolerate some mispairing only if the -ΔG of the majority of bases in a sequence exceeds the energy required to keep mispaired bases together
• Because the source of any single strand of DNA is irrelevant and only the sequence is important, DNA from different sources can form a double helix as long as the sequences are compatible
• Thus, this base pairing of single-stranded DNA strands to form a double helix is called hybridization, as it may be used to make hybrid DNA composed of strands which came from different sources

Hybridization
[Figure: single strands of DNA from source “X” and source “Y” hybridize. A 15-base strand, TACTCGACAGGCTAG, pairs with the complementary region of a 30-base strand, CTGATGGTCATGAGCTGTCCGATCGATCAT.]
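
Where the short strand binds can be found by taking its base-by-base complement and searching for it in the longer strand. The Python sketch below does this for the two sequences in the figure; the function names are illustrative, and the strands are assumed to be written as aligned in the figure, so a plain complement (not a reverse complement) is used.

```python
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Base-by-base partner of a DNA strand, in the same left-to-right order."""
    return "".join(PAIRS[base] for base in strand.upper())


def find_hybridization_site(probe: str, target: str) -> int:
    """Index in `target` where `probe` pairs perfectly, or -1 if it does not."""
    return target.upper().find(complement(probe))


probe = "TACTCGACAGGCTAG"
target = "CTGATGGTCATGAGCTGTCCGATCGATCAT"
site = find_hybridization_site(probe, target)
print(target)
print(" " * site + probe)  # probe drawn underneath the region it pairs with
```

Exact matching is the simplest case; tolerating a few mispaired bases in a long sequence would need a scoring scheme that is outside this sketch.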

Hybridization
• Because DNA sequences will seek out and hybridize with other sequences with which they base pair in a specific way, much information can be gained about unknown DNA using single-stranded DNA of known sequence
• Short sequences of single-stranded DNA can be used as “probes” to detect the presence of their complementary sequence in any number of applications, including:
– Southern blots
– Northern blots (in which RNA is probed)
– In situ hybridization
– Dot blots . . .
• In addition, the renaturation or hybridization of DNA in solution can tell much about the nature of organisms’ genomes

Reassociation Kinetics
• An organism’s DNA can be heated in solution until it melts, then cooled to allow DNA strands to reassociate, forming double-stranded DNA
• This is typically done after shearing the DNA to form many fragments a few hundred bases in length.
• The larger and more complex an organism’s genome is, the longer it will take for complementary strands to bump into one another and hybridize
• The rate of reassociation is proportional to the concentration of the two homologous dissociated strands.
• Reassociation follows second-order kinetics: $dC/dt = -kC^2$. Integrating this equation gives C as a function of time.
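
The integration can be sketched as follows, taking $C$ as the concentration of single-stranded DNA at time $t$, $C_0$ as its value at $t = 0$ and $k$ as the second-order rate constant. The symbols $C'$ and $t'$ are dummy integration variables introduced here for the derivation.

```latex
\begin{align*}
\frac{dC}{dt} &= -kC^{2} \\
\int_{C_0}^{C} \frac{dC'}{C'^{2}} &= -k \int_{0}^{t} dt' \\
\frac{1}{C_0} - \frac{1}{C} &= -kt \\
\frac{1}{C} &= \frac{1}{C_0} + kt \\
\frac{C}{C_0} &= \frac{1}{1 + k C_0 t}
\end{align*}
```

Setting $C/C_0 = 1/2$ in the last line gives $C_0 t_{1/2} = 1/k$, so the half-reassociation point depends only on the rate constant.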

Reassociation Kinetics
• The following equation describes the second-order rate kinetics of DNA reassociation:

$$\frac{C}{C_0} = \frac{1}{1 + k C_0 t}$$

  C = concentration of single-stranded DNA after time t
  C_0 (Co) = initial concentration of single-stranded DNA
  k = second-order rate constant (the important thing is that it is a constant)
  C_0 t (Cot) = Co (measured in moles/liter) x t (seconds); generally graphed on a log10 scale
• Cot1/2 is the point at which half the initial concentration of single-stranded DNA has annealed to form double-stranded DNA

Cot0.5 value
• The Cot0.5 value is proportional to the complexity of the genome.
• A plot of C/Co against Cot is called a Cot curve, and it provides information about the complexity of a genome.
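
A Cot curve can be tabulated from the second-order expression C/Co = 1/(1 + k·Cot). The Python sketch below does so on a log10 grid of Cot values and reports the half-reassociation point, which for this expression is Cot = 1/k. The two rate constants are hypothetical values chosen only to contrast a fast-reassociating and a slow-reassociating sample.

```python
def fraction_single_stranded(cot: float, k: float) -> float:
    """C/Co after a given Cot (mole x sec / l) for second-order reassociation."""
    return 1.0 / (1.0 + k * cot)


def cot_half(k: float) -> float:
    """Cot at which C/Co = 0.5; solving 1/(1 + k*Cot) = 1/2 gives Cot = 1/k."""
    return 1.0 / k


# Hypothetical rate constants; a smaller k means slower reassociation.
for label, k in [("fast sample", 100.0), ("slow sample", 0.01)]:
    print(f"{label}: Cot1/2 = {cot_half(k):g}")
    for exponent in range(-4, 5):
        cot = 10.0 ** exponent
        print(f"  Cot = 1e{exponent:+d}   C/Co = {fraction_single_stranded(cot, k):.3f}")
```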

Genome complexity
• Complexity is the minimum length of DNA that contains a single copy of every sequence, whether single-copy or reiterated, that is represented within the genome.
• The complexity of a genome is equal to its molecular mass only if the genome consists entirely of unique nucleotide sequences (repetitive sequences absent).

Example
# For a hypothetical DNA-1 having three nucleotide sequences, N1, N2 and N3:
Molecular mass = N1 + N2 + N3
Complexity = N1 + N2 + N3
# For a hypothetical DNA-2 having 10^3 copies of N1, 10^5 copies of N2 and 1 copy of N3:
Molecular mass = 10^3 N1 + 10^5 N2 + N3
Complexity = N1 + N2 + N3
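
The same bookkeeping can be written as a short calculation: total size counts every copy of every sequence, while complexity counts each distinct sequence once. In the Python sketch below the sequence lengths are hypothetical placeholders, and the copy numbers for DNA-2 are taken as 10^3 and 10^5.

```python
# Each entry maps a sequence name to (length, number of copies).
# The lengths are hypothetical; only the copy numbers follow the example.
dna_1 = {"N1": (1000, 1), "N2": (2000, 1), "N3": (3000, 1)}
dna_2 = {"N1": (1000, 10**3), "N2": (2000, 10**5), "N3": (3000, 1)}


def total_size(genome: dict[str, tuple[int, int]]) -> int:
    """Sum over all copies of all sequences (the 'molecular mass' of the example)."""
    return sum(length * copies for length, copies in genome.values())


def complexity(genome: dict[str, tuple[int, int]]) -> int:
    """Sum of the lengths of the distinct sequences, each counted once."""
    return sum(length for length, _copies in genome.values())


for name, genome in [("DNA-1", dna_1), ("DNA-2", dna_2)]:
    print(f"{name}: total size = {total_size(genome)}, complexity = {complexity(genome)}")
```

With these placeholder lengths the two genomes have the same complexity, while DNA-2 is far larger in total size because of its repeated sequences.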

Reassociation Kinetics
[Figure: Cot curve. Vertical axis: fraction remaining single-stranded (C/Co), from 0 to 1.0; horizontal axis: Cot (mole x sec./l), log scale from 10^-4 to 10^4. Cot1/2 is marked where the curve crosses 0.5.]
Higher Cot1/2 values indicate greater genome complexity

Reassociation Kinetics
[Figure: Cot curves on the same axes for prokaryotic DNA and eukaryotic DNA; the eukaryotic curve is labeled with a repetitive DNA portion and a unique sequence complex DNA portion.]

Repetitive DNA

Organism                    % Repetitive DNA
Homo sapiens                21 %
Mouse                       35 %
Calf                        42 %
Drosophila                  70 %
Wheat                       42 %
Pea                         52 %
Maize                       60 %
Saccharomyces cerevisiae    5 %
E. coli                     0.3 %