GENOMES & THEIR EVOLUTION
Campbell & Reece Chapter 21
Genomics
study of a species’ whole set of genes & their interactions
bioinformatics: use of computers, software, & mathematical models to process & integrate biological information from large data sets
Human Genome Project
sequencing of the human genome, 1990 – 2003
20 large centers in 6 countries + many other small labs working on small parts of it
FISH
Cytogenetic Map: chromosome banding pattern & location of specific genes by fluorescence in situ hybridization (FISH)
before the Human Genome Project, the # of chromosomes & their banding patterns were known for many species
some human genes had already been located
FISH
method in which fluorescently labeled nucleic acid probes are allowed to hybridize to an immobilized array of whole chromosomes
maps generated this way were used as the starting point
3 Stages to Genome Sequencing
1. Linkage Mapping
2. Physical Mapping
3. DNA Sequencing
Linkage Mapping
ordering of genetic markers (1000’s) spaced throughout the chromosomes
order & spacing determined by recombination frequencies
markers: genes, RFLPs (restriction fragment length polymorphisms), or STRs (short tandem repeats)
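The step from recombination frequencies to marker order can be sketched in a few lines. This is only an illustration: the offspring counts are hypothetical, the function names are made up for this example, and 1% recombination is treated as 1 map unit (centimorgan), which is a reasonable approximation only for closely spaced markers.

```python
def recombination_frequency(recombinants, total_offspring):
    """Fraction of offspring that are recombinant for a pair of markers."""
    if total_offspring <= 0 or not 0 <= recombinants <= total_offspring:
        raise ValueError("need 0 <= recombinants <= total_offspring, total > 0")
    return recombinants / total_offspring


def map_distance_cm(recombinants, total_offspring):
    """Approximate map distance: 1% recombination is treated as 1 centimorgan."""
    return 100.0 * recombination_frequency(recombinants, total_offspring)


def order_three_markers(distances):
    """Order 3 markers from their pairwise distances.

    The pair that is farthest apart must be the two outer markers,
    so the remaining marker sits between them.
    """
    outer = max(distances, key=distances.get)
    markers = {m for pair in distances for m in pair}
    (middle,) = markers - set(outer)
    return [outer[0], middle, outer[1]]


# Hypothetical cross data: (recombinant offspring, total offspring) per marker pair.
crosses = {("A", "B"): (90, 1000), ("B", "C"): (95, 1000), ("A", "C"): (170, 1000)}
distances = {pair: map_distance_cm(*counts) for pair, counts in crosses.items()}
print(distances)
print(order_three_markers(distances))
```

In this made-up data set A and C are farthest apart, so B is placed between them. The A to C distance comes out slightly smaller than the sum of the two shorter intervals, which is what double crossovers tend to produce.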
RFLP
in gel electrophoresis, fragments of DNA are separated by length
(-) charge of phosphate groups moves DNA thru gel (acting like a sieve) toward (+) end
result: bands that each consist of thousands of DNA molecules of same length
RFLP
1 useful technique has been to apply restriction fragment analysis to these bands, which gives information about DNA sequences
restriction enzymes “cut” DNA at known nucleotide sequences
the fragments produced are then put thru gel electrophoresis
RFLP
DNA can be recovered undamaged from gel bands (so can be used to prepare pure sample of individual fragments)
can be used to compare 2 different DNA molecules (e.g., 2 alleles of same gene) if the nucleotide difference affects a restriction site
change in even 1 nucleotide w/in the site will prevent the “cut”
RFLP (restriction fragment length polymorphism)
polymorphisms: variations in DNA sequence among a population
this particular type of sequence change is called RFLP (“rif-lip”)
if 1 allele contains an RFLP, digestion with the enzyme will produce fragments of different lengths than the other allele gives
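A minimal sketch of why a single nucleotide change shows up as a different band pattern. The two allele sequences are invented for illustration and `digest` is a hypothetical helper, not a library function. The recognition site GAATTC with a cut after the first G is that of the enzyme EcoRI; only one strand is modeled.

```python
def digest(dna, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting dna at every occurrence of site.

    cut_offset is how many bases into the site the cut falls
    (1 for G^AATTC). Fragments are listed in order along the molecule.
    """
    cuts = []
    i = dna.find(site)
    while i != -1:
        cuts.append(i + cut_offset)
        i = dna.find(site, i + 1)
    bounds = [0] + cuts + [len(dna)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Hypothetical alleles: allele_2 differs by 1 nucleotide inside the 2nd site.
allele_1 = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TT"
allele_2 = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAGTTC" + "TT"

for name, allele in (("allele 1", allele_1), ("allele 2", allele_2)):
    # A gel would show one band per fragment length, longest nearest the well.
    print(name, sorted(digest(allele), reverse=True))
```

Allele 1 is cut twice and gives 3 fragments, while allele 2 has lost one site and gives 2 fragments, one of them longer. That difference in fragment lengths is what the gel reveals.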
Short Tandem Repeats: STR
technique used by forensic scientists
tandemly repeated units of 2 to 5 bases in specific regions of the genome
# repeats present is highly variable person to person (polymorphic)
1 individual’s 2 alleles at an STR site may also differ in # of repeats
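As a rough illustration of an STR genotype, the sketch below counts the longest run of back-to-back copies of a repeat unit in each of two alleles. The repeat unit GATA, the flanking bases, and the repeat counts are all hypothetical, and `longest_str_run` is a name invented for this example.

```python
import re


def longest_str_run(sequence, unit):
    """Largest number of back-to-back copies of unit found in sequence."""
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(sequence)]
    return max(runs, default=0)


# Hypothetical alleles at 1 STR site in 1 individual.
unit = "GATA"
allele_a = "TTC" + unit * 7 + "CCG"
allele_b = "TTC" + unit * 10 + "CCG"

genotype = tuple(sorted(longest_str_run(a, unit) for a in (allele_a, allele_b)))
print(genotype)
```

The pair of repeat counts is the genotype at that site. A forensic profile combines such pairs from several STR sites.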
STR
PCR (polymerase chain reaction) is used to amplify particular STRs
quicker technique than RFLP analysis
can be used with less pure samples of DNA or when only a minute sample is available
PCR
3 Stages to Genome Sequencing
1. Linkage Mapping
2. Physical Mapping
3. DNA Sequencing
Physical Mapping
ordering of large fragments cloned in YAC & BAC vectors
followed by ordering of smaller fragments cloned in phage & plasmid vectors
key is to make overlapping fragments & then use probes or automated nucleotide sequencing of ends to find the overlaps
YAC & BAC
1st cloning vector is a YAC or a BAC
Yeast Artificial Chromosome (YAC): carries inserted fragments a million base pairs (bp) long
Bacterial Artificial Chromosome (BAC): carries inserts of 100,000 – 300,000 bp
Physical Mapping
fragments from YAC & BAC put in order
each fragment cut into smaller pieces
which are then cloned in plasmids, ordered, & finally sequenced
DNA Sequencing
determination of nucleotide sequence of each small fragment & assembly of the partial sequences into the complete genome sequence
for the human genome, automated sequencing machines were used
sequencing of all 3 billion bps in haploid set of human chromosomes done at rate of 1,000 bp/s
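To get a feel for these numbers, here is the arithmetic as a short sketch. It takes the 1,000 bp/s rate at face value and assumes a single pass over the genome with no overlapping reads, no repeated coverage, and no assembly time, so it is a lower bound rather than an estimate of the real project.

```python
GENOME_BP = 3_000_000_000   # haploid human genome, from the notes
RATE_BP_PER_S = 1_000       # sequencing rate, from the notes

seconds = GENOME_BP / RATE_BP_PER_S
days = seconds / (60 * 60 * 24)
print(f"{seconds:,.0f} s = {days:.1f} days")
```

That works out to 3 million seconds, or about 35 days of continuous sequencing at that rate.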
Human Genome Project
took 13 yrs & cost $100 million
Sequencing an Entire Genome
Whole-Genome Shotgun Approach
essentially skips the linkage mapping & physical mapping stages & starts with sequencing of DNA fragments from randomly cut DNA
computers then assemble the resulting very large # of short sequences into a single continuous sequence
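The assembly step can be sketched with a toy greedy algorithm: repeatedly merge the 2 reads with the longest suffix-to-prefix overlap. This is a simplified stand-in, not the method real assemblers use. The reads are hypothetical, error-free, and all from the same strand, and the function names are invented for this example.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if < min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Merge the best-overlapping pair of reads until no overlaps remain."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # remaining reads do not overlap: leave them as separate contigs
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Hypothetical short reads from randomly cut copies of 1 sequence.
reads = ["GTTAGC", "ATGCGTAC", "GTACGTTA"]
print(greedy_assemble(reads))
```

With these 3 reads the result is a single continuous sequence. Real genomes contain repeats and sequencing errors, which is why a simple greedy merge like this is only a teaching device.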
Shotgun Approach
Application of Systems Biology to Medicine
2007 – 2010: project set out to find all the common mutations in 3 types of cancer (lung, ovarian, glioblastoma) by comparing gene sequences & patterns of gene expression in cancer cells with those in normal cells
Cancer Genes
a # of genes identified: some that had been suspected + some that had not
gives researchers a starting point to develop new treatments aimed specifically @ these genes
10 more cancers were then studied (most common/most lethal)
Microarray Chip
Genomes Vary in Size, # of Genes, & Gene Density
# of Genes
Prokaryotic cells < Eukaryotic cells
Humans: expected 50,000 – 100,000 but have found < 30,000
How do we get by with not many more genes than nematodes?
# proteins we have > # genes
vertebrates use alternative splicing of RNA transcripts
Gene Density
# genes in given length of DNA
eukaryotes generally have larger genomes but fewer genes in given # of bps
humans have 100’s – 1000’s times more bps than most bacteria but only 5 – 15 times as many genes
So: gene density is lower in humans than in bacteria
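A quick worked comparison of gene density, expressed as genes per million base pairs (Mb). The genome sizes and gene counts below are rounded, approximate values supplied for illustration and are not taken from these notes.

```python
def gene_density(genes, genome_mb):
    """Genes per million base pairs (Mb) of genome."""
    return genes / genome_mb


# Approximate, rounded figures used only for illustration.
e_coli = gene_density(genes=4_400, genome_mb=4.6)
human = gene_density(genes=21_000, genome_mb=3_000)

print(f"E. coli: {e_coli:.0f} genes/Mb")
print(f"Human:   {human:.0f} genes/Mb")
print(f"Genome size ratio: {3_000 / 4.6:.0f}x, gene count ratio: {21_000 / 4_400:.1f}x")
```

With these figures the human genome is several hundred times larger but has only about 5 times as many genes, so its gene density is more than 100 times lower.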
Noncoding DNA
includes most of eukaryotic DNA
some is introns; most is noncoding DNA between genes
only 1.5% of our genome codes for proteins or is transcribed into rRNA or tRNA
Pseudogenes
former genes that have accumulated mutations over a long time & no longer produce functional proteins
Repetitive DNA
sequences that are present in multiple copies in the genome
75% of this repetitive DNA (44% of entire genome) is made up of units called transposable elements & related sequences
Transposable Elements & Related Sequences
found in both prokaryotes & eukaryotes
stretches of DNA that can move from one location to another w/in the genome
transposition: process where 1 transposable element moves from 1 site to different target site by a type of recombination process
“Jumping” Genes
Transposons
a gene that is “jumping” never actually completely detaches from the cell’s DNA
original and new DNA sites are brought together by enzymes & other proteins that bind to DNA
1st evidence came from studying genetics of Indian corn
Movement of Transposons & Retrotransposons
2 types of eukaryotic transposable elements:
1. Transposons
move w/in genome by means of DNA intermediate
copy & paste or cut & paste
both mechanisms require enzyme transposase (encoded by transposon)
Retrotransposon
2nd type of eukaryotic transposable element
move by means of RNA intermediate that is a transcript of retrotransposon DNA
always leave copy @ original site during transposition
RNA intermediate is converted back to DNA by reverse transcriptase (enzyme encoded by retrotransposon)
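The difference between the two outcomes can be shown with a toy model in which the genome is just an ordered list of labels. The labels and function names are invented for illustration, and the model ignores the enzymes and the DNA or RNA intermediates entirely. It only tracks whether a copy stays at the original site.

```python
def cut_and_paste(genome, element, target):
    """Transposon leaves its original site and is inserted at target.

    target is an index into the genome after the element has been removed.
    """
    moved = list(genome)
    moved.remove(element)
    moved.insert(target, element)
    return moved


def copy_and_paste(genome, element, target):
    """A copy is inserted at target while the original stays in place.

    This is the only outcome for retrotransposons, which always leave
    a copy at the original site.
    """
    copied = list(genome)
    copied.insert(target, element)
    return copied


genome = ["geneA", "TE", "geneB", "geneC"]
print(cut_and_paste(genome, "TE", 3))
print(copy_and_paste(genome, "TE", 4))
```

After cut & paste the genome still holds 1 copy of the element, just at a new site. After copy & paste it holds 2, which is how such elements can come to make up a large share of a genome.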
Other Repeating DNA
probably arises due to mistakes made during DNA replication or recombination
~14% of human DNA
~1/3 of this is duplications of long stretches of DNA
segments copied from 1 chromosomal location to another on same or different chromosome
Simple Sequence DNA
contains many copies of tandemly repeated short sequences, e.g. ATTGCGATTGCGATTGCGATTGCG
repeated units can be 2 – 500 nucleotides long
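For the example sequence above, the repeated unit and its copy number can be recovered with a short sketch. The helper name is hypothetical, and the approach assumes the whole string is one clean tandem array with no flanking DNA or mutations.

```python
def smallest_repeat_unit(seq):
    """Return (unit, copies) for the shortest unit that tiles seq exactly."""
    n = len(seq)
    for size in range(1, n + 1):
        if n % size == 0 and seq[:size] * (n // size) == seq:
            return seq[:size], n // size


unit, copies = smallest_repeat_unit("ATTGCGATTGCGATTGCGATTGCG")
print(unit, copies)
```

Here the unit is the 6-nucleotide sequence ATTGCG, present in 4 tandem copies. A sequence with no internal repetition is returned as a single copy of itself.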
Short Tandem Repeat (STR)
repeating units that are 2 to 5 nucleotides long
found on telomeres & centromeres (so may play structural role)
# of repeating units can vary from site to site w/in same genome and between different alleles
this diversity means STRs can be used in preparing genetic profiles
Other Types of DNA
1.5% of genome: sequences that code for proteins, rRNA, tRNA
including introns & regulatory sequences associated with genes, total amt is 25% of the human genome
Multigene Families
<1/2 of genes are present in 1 copy
multigene families: collections of 2 or more identical or very similar genes
some families of identical genes are present in tandem repeats & code for an RNA or for histone proteins
rRNA genes
repeated tandemly 100’s to 1000’s times in 1 to several clusters in genomes of multicellular eukaryotes
helps cells quickly make millions of ribosomes necessary for protein synthesis
Multigene Families of Nonidentical Genes
Globins: group of proteins that includes the α and β polypeptide subunits of hemoglobin (hgb)
Chromosome 16 encodes forms of α-globin
Chromosome 11 encodes forms of β-globin
different forms are expressed @ different times in development, allowing hgb to function effectively in changing environment of developing animal
Fetal-Globin
fetal stage uses this globin because it has higher affinity for O2, ensuring efficient transfer of O2 from mother
Clues to Evolution
by looking at arrangement of genes in gene families get insight into evolution of genomes
genome w/ 4 gene families in 4 species
Genome Evolution
“accidents” in cell division can lead to extra copies of all or part of a chromosome; the copies can then diverge if 1 set accumulates nucleotide sequence changes
Genome Evolution
comparing chromosomal organization of genomes among species gives info about evolutionary relationships
w/in given species, rearrangements of genes are thought to contribute to emergence of new species
Globin Gene
1 common ancestral globin gene duplicated & diverged into α and β globin ancestral genes
subsequent duplications & random mutations gave rise to present-day globin genes
all genes along the way code for O2-binding proteins
Globin Genes
in other gene families, copies of a duplicated gene have diverged so much that their functions are now substantially different
examples:
lysozyme: enzyme in mammals that destroys bacterial cell walls, found in sweat, tears, & saliva
α-lactalbumin: protein found in milk, contains all a.a. (amino acids)
Gene Evolution
rearrangement of exons w/in & between genes during evolution has produced genes containing multiple copies of similar exons &/or several different exons derived from other genes
Gene Evolution
movement of transposable elements or recombination between copies of same element occasionally creates new sequence combinations that are beneficial to the organism
new combinations can alter function of genes or their patterns of expression & regulation
Comparing Genome Sequences
human & chimpanzee sequences show ~4% difference
most due to insertions, deletions, & duplications
FOXP2 Gene
gene that affects speech
human & chimp versions of the gene differ in nucleotide sequence
SNPs & CNVs
Single Nucleotide Polymorphisms (SNPs)
Copy Number Variations (CNVs)
variations of both w/in a species can yield information about the evolution of that species
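Both kinds of variation can be illustrated on toy sequences. The sketch assumes the two sequences being compared for SNPs are already aligned and of equal length, and it treats a CNV as a simple count of a short segment. Real CNVs involve much longer stretches of DNA. All sequences and function names here are hypothetical.

```python
def find_snps(seq_a, seq_b):
    """List (position, base_a, base_b) wherever two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


def copy_number(sequence, segment):
    """Count non-overlapping copies of segment in sequence."""
    return sequence.count(segment)


# Hypothetical aligned sequences from 2 individuals: 1 SNP at position 3.
person_1 = "ATGCCGTA"
person_2 = "ATGTCGTA"
print(find_snps(person_1, person_2))

# Hypothetical region where the 2 individuals carry different copy numbers.
segment = "GATTACA"
genome_1 = "TT" + segment * 2 + "CC"
genome_2 = "TT" + segment * 4 + "CC"
print(copy_number(genome_1, segment), copy_number(genome_2, segment))
```

Positions are 0-based, so the single SNP is reported at index 3.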
Evo-Devo Biologists
Evolutionary Developmental biologists have shown that homeotic genes (any of the master regulatory genes that control placement & spatial organization of body parts in animals) & other genes associated with animal development contain a homeobox region
homeobox has a sequence that is highly conserved among diverse species (animals, plants, yeast)
Homeobox Genes in Fruit Fly & Mouse
Hox Genes
genes or groups of genes that are responsible for the layout of basic body forms
set up the head-to-tail organization
are general purpose (work in many animal phyla)
small changes in them or the genes that control them could be a major source of evolutionary change
Expression of Hox Genes has changed over evolutionary time
Comparisons of Animal & Plant Development
last common ancestor of plants & animals probably a unicellular eukaryote (100s of millions of years ago)
morphogenesis in plants relies on differing planes of cell division & on selective cell enlargement
Comparing Development in Plants & Animals
development relies on a cascade of transcriptional regulators turning genes on/off
plants do not rely on Hox genes; their master regulators are another group of genes (MADS-box)
homeobox-containing genes can be found in plants & MADS-box genes in animals, but in neither case do they have same major role in development