The Human Genome Project
And how we got there…
• Sequencing technologies
• Sequencing strategies
• So what?
• What’s next
But before that…
• How do you find out the sequence of DNA?
◦ Sanger’s dideoxy sequencing method
Frederick Sanger
Won the Nobel Prize in Chemistry twice
1958 – for sequencing insulin
1980 – for inventing a method for sequencing DNA (prize shared with Walter Gilbert and Paul Berg)
All the high-throughput sequencing methods in use today are based on the Sanger dideoxy method
http://www.nlm.nih.gov/visibleproofs/media/detailed/vi_a_208b.jpg
Sanger’s recipe: Ingredients
• The DNA of interest – template
• An oligonucleotide primer to get the ball rolling
• A DNA polymerase
• dNTPs (deoxyribonucleotide triphosphates) – dATP, dCTP, dGTP, dTTP
• The special ingredient: ddNTPs
Revision from ATBMS: Nucleoside triphosphates/deoxyribonucleotide triphosphates
Concepts of Genetics 7th Ed, Klug and Cummings
• Phosphodiester bonds are formed between the 3’ carbon of one nucleotide and the 5’ carbon of the next nucleotide
Linkage of two nucleotides
Concepts of Genetics 7th Ed, Klug and Cummings
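What’s special about ddNTP?
• A ddNTP (dideoxyribonucleotide triphosphate) lacks the 3’ hydroxyl group, so no phosphodiester bond can form with the next nucleotide and the chain stops growing
• This method is therefore also known as the chain termination method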
What’s special about ddNTP?
• Fluorescent dye coupled to the nitrogenous base
• Each ddNTP – ddATP, ddCTP, ddGTP, ddTTP – is coupled to a different fluorescent dye, so each one absorbs a characteristic laser wavelength and emits a characteristic colour
Sanger’s recipe: Method
• Divide DNA into 4 tubes with dNTPs and a different ddNTP in each tube and incubate
• Polymerase catalyses addition of dNTPs
• ddNTPs will terminate reactions
• Form oligonucleotides of varying lengths terminated by fluorescent ddNTPs
• Denature DNA to produce single stranded oligonucleotides
• Load single stranded oligonucleotides and separate by electrophoresis – usually by capillary electrophoresis
• ‘Read’ DNA sequence
What would an agarose gel look like?
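The recipe above can be mimicked in a few lines of Python. This is only a toy sketch, not a model of the real chemistry: it assumes every possible terminated product is made exactly once, and the template sequence, the function names and the dye colours are all made up for illustration.

```python
# Toy model of chain termination: every position after the primer can
# end in a dye-labelled ddNTP, giving a ladder of fragments.
# Dye colours are illustrative placeholders, not the real fluorochromes.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
DYE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}


def terminated_fragments(template, primer_len):
    """Return (length, terminal base) for each chain-terminated product.

    `template` is written 3'->5', so the new strand (5'->3') is its
    base-by-base complement. The primer covers the first `primer_len`
    positions; each later position can end with a ddNTP.
    """
    new_strand = "".join(COMPLEMENT[b] for b in template)
    return [(i + 1, new_strand[i]) for i in range(primer_len, len(new_strand))]


def read_sequence(fragments):
    """Sort fragments by length (shortest first) and read the terminal bases."""
    return "".join(base for _, base in sorted(fragments))


template = "TACGGATCCA"  # hypothetical template, written 3'->5'
fragments = terminated_fragments(template, primer_len=3)
for length, base in sorted(fragments):
    print(f"{length:2d} nt  ends in dd{base}TP  ({DYE[base]})")
print("Read:", read_sequence(fragments))
```

Listing the fragments from shortest to longest is the same as reading the ladder from the fastest-moving band to the slowest: the colour of each successive band gives the next base of the newly made strand.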
Advances in technology…
• The use of fluorescently labelled ddNTPs (previously radioactive isotopes were used)
◦ Each ddNTP could be labelled with a different fluorochrome
◦ Sequencing could be done in a single tube
◦ Capillaries replaced large sheet gels
◦ Fluorescence could be read by a laser, leading to:
• Automation
• The human genome was sequenced using Sanger’s dideoxy method
Capillary electrophoresis (from Wikipedia)
• Capillary tube filled with agarose and buffer
• Electrical voltage applied across the capillary
• Oligonucleotides move through the capillary according to size
Typical Electropherogram
• But usually the first 10-20 bp are not reliable
• Reads are also limited to about 600-800 bp, as the peaks get broader and smaller
What’s in a genome?
• Genes that code for proteins – 2-3% – contain an open reading frame (ORF) that begins with a start codon and ends with a stop codon
• Many genes have multiple copies or have several closely related ‘family’ members
• Regions coding for structural RNA (not proteins) – e.g. ribosomal RNA, tRNA
• Regulatory regions – binding regions for regulatory proteins, transcription factors
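To make the ORF idea concrete, here is a minimal Python sketch that scans a DNA string for stretches running from an ATG start codon to the first in-frame stop codon. It is deliberately simplified: it looks at the forward strand only and ignores introns, so it is closer to how one would scan a bacterial sequence or a cDNA than a real human gene. The function name, the `min_codons` parameter and the example sequence are made up for illustration.

```python
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return (start, end, orf) for ORFs on the forward strand, all 3 frames.

    Scans each frame for ATG, then extends codon by codon to the first
    in-frame stop codon. `end` is exclusive and includes the stop codon.
    `min_codons` counts the codons before the stop (ATG included).
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(seq):
            if seq[i:i + 3] == START:
                for j in range(i + 3, len(seq) - 2, 3):
                    if seq[j:j + 3] in STOPS:
                        if (j - i) // 3 >= min_codons:
                            orfs.append((i, j + 3, seq[i:j + 3]))
                        i = j  # continue scanning after this ORF
                        break
            i += 3
    return orfs


for start, end, orf in find_orfs("CCATGAAATTTTAGGG"):
    print(start, end, orf)
```

Only one of the three reading frames of the example contains an ATG followed by an in-frame stop, which is why the frame matters when looking for genes.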
Moderately repetitive DNA
• Functional
◦ Gene families e.g. globin, actin
◦ Gene family arrays e.g. histone genes, rRNA genes (250 copies), tRNA genes
• Without known function
◦ Short interspersed elements (SINEs) e.g. Alu, 200-300 bp long, 100,000s of copies, 13% of genome
◦ Long interspersed elements (LINEs), 1-5 kb long, 10-10,000 copies per genome, 21% of genome
◦ Pseudogenes
Highly repetitive DNA
• About 15% of genome
• Minisatellites (variable number tandem repeats, VNTRs)
◦ Repeats of 14-500 bp segments
◦ Scattered throughout genome, number of repeats varies on different chromosomes
• Microsatellites (short tandem repeat polymorphisms, STRPs)
◦ Units of 2-5 bp repeated many times, 10-30 copies
◦ Hundreds of kb long
◦ E.g. heterochromatin
• Telomeres
◦ 6 bp repeat
◦ 250 – 1000 repeats at the end of each chromosome
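Tandem repeats like these are easy to spot in a sequence string. The sketch below uses a regular expression with a backreference to find a short unit followed by further copies of itself; the unit sizes follow the slide (2-5 bp for microsatellites, 6 bp for telomeres), while the function name, the default thresholds and the example sequences are made up for illustration. The telomere example uses TTAGGG, the human telomere repeat.

```python
import re


def find_tandem_repeats(seq, min_unit=2, max_unit=5, min_copies=10):
    """Return (start, unit, copies) for runs of a short unit repeated in tandem.

    The pattern captures a unit of min_unit..max_unit bases (shortest unit
    preferred) and requires at least min_copies - 1 further copies of it.
    Note that a run of a single base is reported as a 2-base unit.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_copies - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits


# A made-up sequence with a (CA)n microsatellite of 12 copies
print(find_tandem_repeats("GGT" + "CA" * 12 + "TTG"))

# A telomere-like 6 bp repeat needs a different unit size
print(find_tandem_repeats("TTAGGG" * 5, min_unit=6, max_unit=6, min_copies=4))
```

Because the number of copies at a given locus varies between chromosomes, the `copies` value is the part that would differ from person to person.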
The race to sequence the human genome
• 3 billion bases in the human genome
• In 22 pairs of chromosomes + 2 sex chromosomes
• Only about 30,000 genes
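A quick calculation shows why this was a race. The sketch below combines the 3 billion bases with a read length of 700 bp (an assumed midpoint of the 600-800 bp Sanger limit) to estimate how many reads are needed for a given coverage, where coverage = number of reads × read length / genome size. The coverage values are arbitrary examples, and the uncovered-fraction estimate assumes reads start at uniformly random positions, which real data only approximate.

```python
import math

GENOME_SIZE = 3_000_000_000  # bases, from the slide
READ_LENGTH = 700            # bp, assumed midpoint of the 600-800 bp limit


def reads_needed(coverage, genome_size=GENOME_SIZE, read_length=READ_LENGTH):
    """Number of reads needed so that reads * read_length / genome_size >= coverage."""
    return math.ceil(coverage * genome_size / read_length)


def expected_uncovered_fraction(coverage):
    """Expected fraction of bases hit by no read, assuming random read starts."""
    return math.exp(-coverage)


for c in (1, 5, 10):  # arbitrary example coverage values
    print(f"{c}x coverage: {reads_needed(c):,} reads, "
          f"about {expected_uncovered_fraction(c):.2%} of bases still unread")
```

Even 1x coverage takes over four million reads and still leaves a large share of the genome unread, which is why every base has to be sequenced several times over.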
2 competing approaches
• Hierarchical method
◦ Adopted by the publicly funded Human Genome Project
◦ Sequence of 12 individuals
• Whole genome shotgun (WGS) method
◦ Adopted by Celera, a for-profit company
◦ Sequence of 1 individual
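Both approaches end with the same puzzle: joining short overlapping reads back into longer contigs. The toy Python sketch below merges reads greedily by their largest suffix-prefix overlap. It is an illustration of the idea only, not the assembler used by either project: it assumes error-free reads from one strand and no repeats, and the function names, `min_len` and the example reads are made up.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best = (0, None, None)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        if k == 0:  # nothing left to merge
            return contigs
        merged = contigs[i] + contigs[j][k:]
        contigs = [c for n, c in enumerate(contigs) if n not in (i, j)] + [merged]


# Three made-up reads from a 14-base 'genome', given in shuffled order
reads = ["ATCGGA", "ACGTTGC", "TGCAATC"]
print(greedy_assemble(reads))
```

With repeats in the sequence, several reads overlap equally well and a greedy merge can join the wrong pieces. The repetitive DNA described earlier is what makes assembling a whole genome in one go so much harder than this toy case.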
Craig Venter
Founder of Celera
Applied whole genome shotgun sequencing method to human genome
Made the first synthetic chromosome
Assignment for next week
• You will be working in 4 groups of 4-5. Explain to the class (10-15 min):
• Groups 1 and 2 – explain the following terms related to genome sequencing:
◦ Group 1: mapping, STSs and ESTs, coverage, contigs, golden tiling path
◦ Group 2: library, BACs, finishing, annotation
• Group 3 – explain the hierarchical approach
• Group 4 – explain the whole genome shotgun approach