Genomics Part 1. Human Genome Project: goal is to identify the DNA sequence of every gene in humans...

Genomics Part 1

Upload: morgan-copeland

Post on 17-Jan-2018



TRANSCRIPT

Page 1:

Genomics Part 1

Page 2:

Human Genome Project

• Goal is to identify the DNA sequence of every gene in humans

• Genome: all the DNA in one cell of an organism

• Will provide scientists with an encyclopedia of information and a better understanding of humans

Page 3:

Genomes That Have Been Sequenced

• RNA virus MS2 (1976)

• DNA virus φX174 – 5386 base pairs (bp) (1977)

• Bacterium H. influenzae – 1.8 million bp (1995)

• Yeast S. cerevisiae (first eukaryote) – 12 million bp (1997)

• Fruit fly – 130 million bp (2000)

• First plant (Arabidopsis thaliana) – 120 million bp (2000)

• Human – 3 billion bp. “Working draft” announced 2000; 99% complete, 2003; complete 2006. DNA of a single individual (2007).

• Other animals: dog, horse, cat, mouse, chimpanzee, rat, chicken, pufferfish, mosquito, and many more

• Now about 900 genomes have been sequenced.

Page 4:

http://www.genomesonline.org/

Page 5:

Sequencing the genomes of many organisms:

• Provides information to help understand many aspects of biology

• Can help us understand human genes, which are usually similar to genes in other organisms

• Provides information on evolution

Page 6:

Sequencing DNA

• Various methods have been used over the years.

• Some newer methods involve copying the DNA to give pieces of different lengths, each ending in a fluorescently labeled nucleotide whose color is different for each base.

• When the pieces are separated by size, reading the sequence of colors gives the sequence of the DNA.

Page 7:

• Dideoxy chain-termination method for sequencing DNA

Figure 20.12: Dideoxy chain-termination method. Components shown: DNA (template strand) CTGACTTCGACAA; primer TGTT; DNA polymerase; deoxyribonucleotides (dATP, dCTP, dTTP, dGTP); fluorescently tagged dideoxyribonucleotides (ddATP, ddCTP, ddTTP, ddGTP).

Labeled strands, shortest to longest: ddCTGTT, ddGCTGTT, ddAGCTGTT, ddAAGCTGTT, ddGAAGCTGTT, ddTGAAGCTGTT, ddCTGAAGCTGTT, ddACTGAAGCTGTT, ddGACTGAAGCTGTT.

The strands move in one direction past a laser and a detector.

APPLICATION The sequence of nucleotides in any cloned DNA fragment up to about 800 base pairs in length can be determined rapidly with specialized machines that carry out sequencing reactions and separate the labeled reaction products by length.

TECHNIQUE This method synthesizes a nested set of DNA strands complementary to the original DNA fragment. Each strand starts with the same primer and ends with a dideoxyribonucleotide (ddNTP), a modified nucleotide. Incorporation of a ddNTP terminates a growing DNA strand because it lacks a 3′-OH group, the site for attachment of the next nucleotide (see Figure 16.12). In the set of strands synthesized, each nucleotide position along the original sequence is represented by strands ending at that point with the complementary ddNTP. Because each type of ddNTP is tagged with a distinct fluorescent label, the identity of the ending nucleotides of the new strands, and ultimately the entire original sequence, can be determined.

RESULTS The color of the fluorescent tag on each strand indicates the identity of the nucleotide at its end. The results can be printed out as a spectrogram, and the sequence, which is complementary to the template strand, can then be read from bottom to top. (Notice that the sequence here begins after the primer.)

GACTGAAGC
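The nested set of strands in Figure 20.12 can be reproduced with a short simulation. What follows is a minimal Python sketch and not part of the original slides. It assumes the template is written 5′ to 3′ (CTGACTTCGACAA) and that the primer, read 5′ to 3′, is TTGT (the figure prints it reversed, as TGTT, so that it lines up under ACAA). The function names are hypothetical.

```python
# Toy model of dideoxy chain termination (assumed orientation: see lead-in).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def chain_terminated_fragments(template, primer):
    """Every strand that starts with the primer and stops at one added ddNTP."""
    full_strand = reverse_complement(template)
    if not full_strand.startswith(primer):
        raise ValueError("primer does not anneal to the 3' end of the template")
    return [full_strand[:n] for n in range(len(primer) + 1, len(full_strand) + 1)]

def read_terminal_bases(fragments):
    """Sort by length (as the gel or capillary does) and read each last base."""
    return "".join(frag[-1] for frag in sorted(fragments, key=len))

template = "CTGACTTCGACAA"   # from Figure 20.12
primer = "TTGT"              # figure shows TGTT, written 3' to 5'

fragments = chain_terminated_fragments(template, primer)
new_strand = read_terminal_bases(fragments)

print(new_strand)         # bases after the primer, 5' to 3': CGAAGTCAG
print(new_strand[::-1])   # same bases as printed in the figure: GACTGAAGC
```

Reversing the read gives GACTGAAGC, the sequence shown under RESULTS, and each entry in `fragments` reversed matches one of the labeled strands listed in the figure.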

Page 8:

http://en.wikipedia.org/wiki/File:Sanger_sequencing_read_display.gif

Page 9:

Sequencing DNA

• An automated sequencing machine can analyze about 1000 samples in a day, determining sequences of 300 to 1000 bp for each.

• The cost of sequencing per nucleotide has dropped steadily, from about $10 per bp in 1990 to about 1/10 of a cent per bp today.

• In the future it is expected to drop even more, allowing affordable sequencing of individual genomes.
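To put these figures in perspective, the arithmetic can be worked through for a genome of 3 billion bp. The following Python sketch is an illustration only. It uses the numbers quoted on the slides and assumes, unrealistically, that every base is read exactly once with no overlap between reads; the function names are hypothetical.

```python
# Back-of-the-envelope arithmetic using the figures quoted on the slides.
GENOME_BP = 3_000_000_000   # human genome, about 3 billion bp
COST_1990_USD = 10.0        # about $10 per bp in 1990
COST_TODAY_USD = 0.001      # about 1/10 of a cent per bp

def genome_cost(bp, cost_per_bp):
    """Cost in dollars of reading every base once (assumed single pass)."""
    return bp * cost_per_bp

def machine_days(bp, samples_per_day, read_length_bp):
    """Days one machine needs to read bp bases once, with no overlap."""
    return bp / (samples_per_day * read_length_bp)

print(f"At 1990 prices: ${genome_cost(GENOME_BP, COST_1990_USD):,.0f}")
print(f"At 0.1 cent/bp: ${genome_cost(GENOME_BP, COST_TODAY_USD):,.0f}")
print(f"Machine-days at 300 bp reads:  {machine_days(GENOME_BP, 1000, 300):,.0f}")
print(f"Machine-days at 1000 bp reads: {machine_days(GENOME_BP, 1000, 1000):,.0f}")
```

Under these assumptions the price of one pass over the genome falls from about $30 billion to about $3 million, and a single machine would need roughly 3,000 to 10,000 days. Real projects read each region several times, so actual costs and times are higher.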

Page 10:
Page 11:

Two Strategies for Genome Sequencing

• Method 1: use genetics to find the locations of many genes on the chromosomes; cut chromosomes into pieces containing these genes; sequence small pieces; assemble the sequences

Page 12:

Method 1

Cytogenetic map: chromosome banding pattern and location of specific genes by fluorescence in situ hybridization (FISH). Shown: chromosome bands, genes located by FISH.

1. Genetic (linkage) mapping: ordering of genetic markers such as RFLPs, simple sequence DNA, and other polymorphisms (about 200 per chromosome). Shown: genetic markers.

2. Physical mapping: ordering of large overlapping fragments cloned in YAC and BAC vectors, followed by ordering of smaller fragments cloned in phage and plasmid vectors. Shown: overlapping fragments.

3. DNA sequencing: determination of nucleotide sequence of each small fragment and assembly of the partial sequences into the complete genome sequence. Shown: …GACTTCATCGGTATCGAACT…

Page 13:

Two Strategies for Genome Sequencing

• Method 2: “Shotgun” approach: entire chromosome is cut into random pieces; the pieces are sequenced; computer programs then assemble the resulting very large number of overlapping short sequences into a single continuous sequence.

• Two rival groups used these different strategies in sequencing the human genome.

Page 14:

Method 2

1. Cut the DNA from many copies of an entire chromosome into overlapping fragments short enough for sequencing.

2. Clone the fragments in plasmid or phage vectors.

3. Sequence each fragment.

4. Order the sequences into one overall sequence with computer software.

Fragments shown: CGCCATCAGT, AGTCCGCTATACGA, ACGATACTGGT

Assembled sequence: …ATCGCCATCAGTCCGCTATACGATACTGGTCAA…
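The last step of Method 2, ordering overlapping fragments by computer, can be illustrated with a toy greedy assembler. This Python sketch is an assumption-laden illustration, not the software used by either sequencing group: it assumes error-free reads from a single strand, no repeats, and a minimum overlap of 3 bases. The function names and the `min_len` parameter are hypothetical.

```python
# Toy greedy overlap assembly of the three fragments shown for Method 2.

def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(dict.fromkeys(fragments))   # drop exact duplicates
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break   # nothing left that overlaps; return separate contigs
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]
    return contigs

reads = ["ACGATACTGGT", "CGCCATCAGT", "AGTCCGCTATACGA"]
contigs = greedy_assemble(reads)
print(contigs[0])   # CGCCATCAGTCCGCTATACGATACTGGT
```

The single contig produced is a stretch of the assembled sequence shown on the slide (…ATCGCCATCAGTCCGCTATACGATACTGGTCAA…). Greedy merging is only one simple strategy; with real data, sequencing errors and repeated regions make the problem much harder.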

Page 15:

The Human Genome

• About 3 billion bp

• Current estimates are that the human genome contains about 25,000 genes

• Only about 1.5% of the genome codes for proteins.

• The rest is involved in regulation, or is “junk.”

• The number of genes is not much different than in many other “simpler” organisms.
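The figures on this slide imply a rough size for the coding portion of the genome. The short Python sketch below works through that arithmetic using only the slide's round numbers; the function name is hypothetical, and the per-gene value is a crude average, not a measured gene length.

```python
# Rough arithmetic from the slide's round numbers.

def coding_summary(genome_bp, coding_fraction, gene_count):
    """Return (coding bp, average coding bp per gene)."""
    coding_bp = round(genome_bp * coding_fraction)
    return coding_bp, coding_bp / gene_count

coding_bp, per_gene = coding_summary(3_000_000_000, 0.015, 25_000)
print(f"Coding DNA: about {coding_bp:,} bp")
print(f"Average coding sequence per gene: about {per_gene:,.0f} bp")
```

With these inputs, about 45 million bp are coding, or roughly 1,800 bp of coding sequence per gene on average.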

Page 16:
Page 17:

Genome sequences provide clues to important biological questions

• In genomics, scientists study whole sets of genes and their interactions

• Computer analysis of genome sequences helps researchers identify sequences that are likely to encode proteins

• Comparison of the sequences of “new” genes with those of known genes in other species may help identify new genes
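One simple form of the computer analysis mentioned above is a scan for open reading frames (ORFs): stretches that begin with a start codon and run, in the same reading frame, to a stop codon. The Python sketch below is a simplified illustration, not a real gene finder. It assumes the standard start codon ATG and stop codons TAA, TAG, and TGA, searches only the given strand, and ignores introns. The function name, the `min_codons` parameter, and the example sequence are made up for illustration.

```python
# Simplified ORF scan over the three forward reading frames.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=3):
    """Return (start, end, orf) for each ATG...stop stretch found.

    min_codons counts the codons before the stop codon, ATG included.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first ATG opens the frame
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, seq[start:i + 3]))
                start = None                   # look for the next ATG
    return orfs

example = "CCATGAAATTTGGGTAACC"   # made-up sequence for illustration
orfs = find_orfs(example)
print(orfs)   # [(2, 17, 'ATGAAATTTGGGTAA')]
```

Real gene prediction in eukaryotes also has to account for introns, both DNA strands, and regulatory signals, so a scan like this only flags candidate regions.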