Post on 18-Dec-2015
Biological background: Molecular Biology
Class web site: http://statwww.epfl.ch/davison/teaching/Microarrays/
Statistics for Microarrays
Acknowledgements
• http://www.accessexcellence.org/AB/GG
• http://www.oup.co.uk/best.textbooks/biochemistry/genesvii
• Sandrine Dudoit, UC Berkeley Biostatistics
• Yee Hwa Yang, UC Berkeley Statistics
• Terry Speed, UC Berkeley Statistics and WEHI, Melbourne, Australia
http://www.stg.brown.edu/webs/MendelWeb/MWtoc.html
Mendelian Genetics
Nature (1953), 171:737
“We wish to suggest a structure for the salt of deoxyribose nucleic acid (D.N.A.). This structure has novel features which are of considerable biological interest.”
DNA Structure Discovery
DNA
• A deoxyribonucleic acid or DNA molecule is a double-stranded linear polymer composed of four molecular subunits called nucleotides
• Each nucleotide comprises a phosphate group, a deoxyribose sugar, and one of four nitrogen bases: adenine (A), guanine (G), cytosine (C), or thymine (T)
• The two strands are held together by weak hydrogen bonds between complementary bases
• Base-pairing occurs according to the rule: G pairs with C, and A pairs with T
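The pairing rule can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides; the function names `complement` and `reverse_complement` are chosen here for the example, and strands are assumed to be written as plain strings over A, C, G, T.

```python
# Watson-Crick pairing rule from the bullets above: G pairs with C, A pairs with T.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand."""
    return "".join(COMPLEMENT[base] for base in strand.upper())

def reverse_complement(strand: str) -> str:
    """Return the complementary strand read in the opposite direction."""
    return complement(strand)[::-1]

print(complement("GATTACA"))          # CTAATGT
print(reverse_complement("GATTACA"))  # TGTAATC
```

The reverse complement is the form usually wanted in practice, because the two strands of a duplex run in opposite directions.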
DNA A-type (140D)(low water content)
DNA B-type (7BNA)(Watson-Crick form)
DNA Z-type (2ZNA)(high salt concentration)
Polymorphic DNA Tertiary Structures
A nucleotide is a phosphate, a sugar, and a purine (A, G) or a pyrimidine (T, C) base.
The monomeric units of nucleic acids are called nucleotides.
DNA Structure
Purines: Adenine (A), Guanine (G)
Pyrimidines: Thymine (T) (DNA), Cytosine (C), Uracil (U) (RNA)
Nucleotide Bases
Nucleotide codes
A Adenine W Weak (A or T)
G Guanine S Strong (G or C)
C Cytosine M Amino (A or C)
T Thymine K Keto (G or T)
U Uracil B Not A (G or C or T)
R Purine (A or G) H Not G (A or C or T)
Y Pyrimidine (C or T) D Not C (A or G or T)
N Any nucleotide V Not T (A or G or C)
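As an illustration of how these ambiguity codes are used, the sketch below stores the table as a Python dictionary and checks whether a concrete sequence is compatible with a pattern written in the codes. The helper names `matches` and `expand` are hypothetical, and U is kept as its own symbol rather than being merged with T.

```python
from itertools import product

# Nucleotide codes from the table above: each code maps to the bases it allows.
IUPAC = {
    "A": "A", "G": "G", "C": "C", "T": "T", "U": "U",
    "R": "AG", "Y": "CT", "N": "ACGT",
    "W": "AT", "S": "CG", "M": "AC", "K": "GT",
    "B": "CGT", "H": "ACT", "D": "AGT", "V": "ACG",
}

def matches(pattern: str, sequence: str) -> bool:
    """True if every base of `sequence` is allowed by the code at that position."""
    if len(pattern) != len(sequence):
        return False
    return all(base in IUPAC[code] for code, base in zip(pattern, sequence))

def expand(pattern: str) -> list[str]:
    """List every concrete sequence described by an ambiguous pattern."""
    return ["".join(bases) for bases in product(*(IUPAC[code] for code in pattern))]

print(matches("GANTC", "GATTC"))  # True
print(matches("RR", "AC"))        # False
print(expand("AY"))               # ['AC', 'AT']
```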
Proteins
• Proteins: macromolecules composed of one or more chains of amino acids
• Amino acids: class of 20 different organic compounds containing a basic amino group (-NH2) and an acidic carboxyl group (-COOH)
• The order of amino acids is determined by the base sequence of nucleotides in the gene coding for the protein
• Proteins function as enzymes, antibodies, structures, etc.
Amino acid codes
Three-letter  One-letter  Amino acid
Ala           A           Alanine
Arg           R           Arginine
Asn           N           Asparagine
Asp           D           Aspartic acid
Cys           C           Cysteine
Gln           Q           Glutamine
Glu           E           Glutamic acid
Gly           G           Glycine
His           H           Histidine
Ile           I           Isoleucine
Leu           L           Leucine
Lys           K           Lysine
Met           M           Methionine
Phe           F           Phenylalanine
Pro           P           Proline
Ser           S           Serine
Thr           T           Threonine
Trp           W           Tryptophan
Tyr           Y           Tyrosine
Val           V           Valine
Asx           B           Asn or Asp
Glx           Z           Gln or Glu
Sec           U           Selenocysteine
Unk           X           Unknown
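A short sketch of converting between the two notations, assuming a peptide is written as hyphen-separated three-letter codes (that input convention and the function name `to_one_letter` are choices made for this example only):

```python
# Three-letter and one-letter amino acid codes, in the order listed above.
THREE = ("Ala Arg Asn Asp Cys Gln Glu Gly His Ile Leu Lys "
         "Met Phe Pro Ser Thr Trp Tyr Val Asx Glx Sec Unk").split()
ONE = "ARNDCQEGHILKMFPSTWYVBZUX"
THREE_TO_ONE = dict(zip(THREE, ONE))

def to_one_letter(peptide: str) -> str:
    """Convert e.g. 'Met-Ala-Trp' to 'MAW'."""
    return "".join(THREE_TO_ONE[residue] for residue in peptide.split("-"))

print(to_one_letter("Met-Ala-Trp"))  # MAW
```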
Nature (1953), 171:737
“It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material.”
DNA Replication
• The DNA strand that is copied to form a new strand is called a template
• In the replication of a double-stranded or duplex DNA molecule, both original (parental) DNA strands are copied
• When copying is finished, the two new duplexes, each consisting of one of the original strands plus its copy, separate from each other (semiconservative replication)
DNA Replication, ctd
• DNA synthesis occurs in the chemical direction 5’→3’
• Nucleic acid chains are assembled from 5’ triphosphates of deoxyribonucleosides (the triphosphates supply energy)
• DNA polymerases are enzymes that copy (replicate) DNA
• DNA polymerases require a short preexisting DNA strand (primer) to begin chain growth. With a primer base-paired to the template strand, a DNA polymerase adds nucleotides to the free hydroxyl group at the 3’ end of the primer.
• DNA replication requires assembly of many proteins (at least 30) at a growing replication fork: helicases to unwind, primases to prime, ligases to ligate (join), topoisomerases to remove supercoils, RNA polymerase, etc.
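Primer extension in the 5’→3’ direction can be mimicked with strings. The sketch below is a toy model only: it ignores the enzymes listed above, assumes both strands are written 5’→3’, and uses a hypothetical function name `extend_primer`.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def extend_primer(template_5to3: str, primer_5to3: str) -> str:
    """Return the new strand (written 5'->3') obtained by extending the primer."""
    # The new strand grows 5'->3', so the template is read from its 3' end.
    template_3to5 = template_5to3[::-1]
    if len(primer_5to3) > len(template_3to5):
        raise ValueError("primer is longer than the template")
    # The primer must be base-paired to the 3' end of the template.
    if any(COMPLEMENT[t] != p for t, p in zip(template_3to5, primer_5to3)):
        raise ValueError("primer is not base-paired to the template")
    new_strand = list(primer_5to3)
    for base in template_3to5[len(primer_5to3):]:
        # Each nucleotide is added at the free 3' end of the growing strand.
        new_strand.append(COMPLEMENT[base])
    return "".join(new_strand)

print(extend_primer("AAGCTTGC", "GCA"))  # GCAAGCTT
```

The result is the reverse complement of the template, which is what a completed copy of that strand should be.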
RNA
• RNA, or ribonucleic acid, is similar to DNA, but
-- RNA is single-stranded
-- the sugar is ribose rather than deoxyribose
-- uracil (U) is used instead of thymine
• RNA is important for protein synthesis and other cell activities
• There are several classes of RNA molecules, including messenger RNA (mRNA), transfer RNA (tRNA), ribosomal RNA (rRNA), and other small RNAs
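At the sequence level, the uracil-for-thymine difference amounts to a one-letter substitution. A minimal sketch, assuming the DNA sequence given is the strand with the same sense as the RNA (the function names are illustrative):

```python
def dna_to_rna(dna: str) -> str:
    """Rewrite a DNA sequence in the RNA alphabet (T becomes U)."""
    return dna.upper().replace("T", "U")

def rna_to_dna(rna: str) -> str:
    """Rewrite an RNA sequence in the DNA alphabet (U becomes T)."""
    return rna.upper().replace("U", "T")

print(dna_to_rna("ATGGCT"))  # AUGGCU
```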
The Genetic Code
• DNA: sequence of four different nucleotides
• Protein: sequence of twenty different amino acids
• The correspondence between the four-letter DNA alphabet and the twenty-letter protein alphabet is specified by the genetic code, which relates nucleotide triplets, or codons, to amino acids
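The codon-to-amino-acid mapping can be written as a lookup table. The sketch below builds the standard code from a 64-letter string of one-letter amino acid codes (with `*` for Stop), in the usual codon order UUU, UUC, UUA, UUG, UCU, and so on. It assumes the mRNA is already in frame and stops at the first stop codon; start-codon handling is deliberately left out.

```python
from itertools import product

# Standard genetic code; codons ordered with bases U, C, A, G, third base varying fastest.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa for codon, aa in zip(product("UCAG", repeat=3), AMINO)
}

def translate(mrna: str) -> str:
    """Translate an in-frame mRNA into one-letter amino acid codes."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = STANDARD_CODE[mrna[i:i + 3]]
        if aa == "*":  # stop codon: end of the protein
            break
        protein.append(aa)
    return "".join(protein)

print(translate("AUGGCUUGGUAA"))  # MAW
```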
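Variation of genetic codes
A dash (-) means the codon has the same meaning as in the standard code (T1).

Codon  T1    T2    T3    T4    T5    T6    T9    T10   T12   T13   T14   T15
CUU    Leu   -     Thr   -     -     -     -     -     -     -     -     -
CUC    Leu   -     Thr   -     -     -     -     -     -     -     -     -
CUA    Leu   -     Thr   -     -     -     -     -     -     -     -     -
CUG    Leu   -     Thr   -     -     -     -     -     Ser   -     -     -
AUU    Ile   -     -     -     -     -     -     -     -     -     -     -
AUC    Ile   -     -     -     -     -     -     -     -     -     -     -
AUA    Ile   Met   Met   -     Met   -     -     -     -     Met   -     -
AUG    Met   -     -     -     -     -     -     -     -     -     -     -
UAU    Tyr   -     -     -     -     -     -     -     -     -     -     -
UAC    Tyr   -     -     -     -     -     -     -     -     -     -     -
UAA    Stop  -     -     -     -     Gln   -     -     -     -     Tyr   -
UAG    Stop  -     -     -     -     Gln   -     -     -     -     -     Gln
AAU    Asn   -     -     -     -     -     -     -     -     -     -     -
AAC    Asn   -     -     -     -     -     -     -     -     -     -     -
AAA    Lys   -     -     -     -     -     Asn   -     -     -     Asn   -
AAG    Lys   -     -     -     -     -     -     -     -     -     -     -
UGU    Cys   -     -     -     -     -     -     -     -     -     -     -
UGC    Cys   -     -     -     -     -     -     -     -     -     -     -
UGA    Stop  Trp   Trp   Trp   Trp   -     Trp   Cys   -     Trp   Trp   -
UGG    Trp   -     -     -     -     -     -     -     -     -     -     -
AGU    Ser   -     -     -     -     -     -     -     -     -     -     -
AGC    Ser   -     -     -     -     -     -     -     -     -     -     -
AGA    Arg   Stop  -     -     Ser   -     Ser   -     -     Gly   Ser   -
AGG    Arg   Stop  -     -     Ser   -     Ser   -     -     Gly   Ser   -

T1: standard
T2: vertebrate mitochondrial
T3: yeast mitochondrial
T4: other mitochondrial
T5: invertebrate mitochondrial
T6: ciliate etc. nuclear
T9: echinoderm mitochondrial
T10: euplotid nuclear
T12: alternative yeast nuclear
T13: ascidian mitochondrial
T14: flatworm mitochondrial
T15: Blepharisma nuclear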
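One way to work with these variants is to store each table as a set of overrides on the standard code. The sketch below encodes only the differences listed in the table above, in one-letter amino acid codes with `*` for Stop; differences in start codons are not modelled, and the names `code_table` and `translate` are illustrative.

```python
from itertools import product

# Standard code (T1); codons ordered UUU, UUC, UUA, UUG, UCU, ... ('*' = Stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product("UCAG", repeat=3), AMINO)}

# Differences from T1, transcribed from the table above.
VARIANTS = {
    2: {"AUA": "M", "UGA": "W", "AGA": "*", "AGG": "*"},
    3: {"CUU": "T", "CUC": "T", "CUA": "T", "CUG": "T", "AUA": "M", "UGA": "W"},
    4: {"UGA": "W"},
    5: {"AUA": "M", "UGA": "W", "AGA": "S", "AGG": "S"},
    6: {"UAA": "Q", "UAG": "Q"},
    9: {"AAA": "N", "UGA": "W", "AGA": "S", "AGG": "S"},
    10: {"UGA": "C"},
    12: {"CUG": "S"},
    13: {"AUA": "M", "UGA": "W", "AGA": "G", "AGG": "G"},
    14: {"UAA": "Y", "AAA": "N", "UGA": "W", "AGA": "S", "AGG": "S"},
    15: {"UAG": "Q"},
}

def code_table(table_id: int = 1) -> dict[str, str]:
    """Return the codon table for T<table_id>: the standard code plus overrides."""
    table = dict(STANDARD)
    table.update(VARIANTS.get(table_id, {}))
    return table

def translate(mrna: str, table_id: int = 1) -> str:
    """Translate an in-frame mRNA, stopping at the first stop codon."""
    table = code_table(table_id)
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = table[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

mrna = "AUGAUAUGAAGA"
print(translate(mrna, 1))  # MI
print(translate(mrna, 2))  # MMW
print(translate(mrna, 5))  # MMWS
```

The same message yields three different proteins here, which is why the code table has to be chosen to match the organism and the compartment (nuclear or mitochondrial) the sequence comes from.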