


Kerstin Lindblad-Toh
Whitehead/MIT Center for Genome Research

Michael Kamal
Broad/MIT Center for Genome Research

A First Look at the Mouse Genome

• Preliminary mouse genome analysis

• Future directions (briefly)

Article available online: http://www.nature.com/nature/mousegenome/

Draft: BAC map, 6.5x shotgun coverage, genome assembly

Finished sequence: BAC-based coverage X, finishing

Whitehead Institute
Washington University St Louis
Sanger Institute
EBI

Mouse Genome Sequencing Consortium

C57BL/6J, female

- 41 M reads
- 2 and 4 kb plasmids (90%)
- 10 kb plasmids (5%)
- 40 kb fosmids (5%)
- 155 kb and 200 kb BACs (RPCI-23 & 24)
- WI: 54% of reads
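
The read figures above can be sanity-checked with a little arithmetic. The sketch below is only a back-of-envelope check: it assumes the percentages are shares of the 41 M reads, and it borrows the 6.5x coverage and 2.5 Gb genome size quoted on other slides. All variable names are illustrative.

```python
# Back-of-envelope check of the sequencing figures on the slides.
# Assumption: the percentages are shares of the 41 M reads.
TOTAL_READS = 41_000_000
LIBRARY_SHARE = {
    "2 and 4 kb plasmids": 0.90,
    "10 kb plasmids": 0.05,
    "40 kb fosmids": 0.05,
}

reads_per_library = {
    name: round(TOTAL_READS * share) for name, share in LIBRARY_SHARE.items()
}

# Coverage = reads x mean_read_length / genome_size, so the mean read
# length implied by the slide figures is coverage x genome_size / reads.
COVERAGE = 6.5        # "6.5 x shotgun coverage"
GENOME_SIZE = 2.5e9   # "2.5 vs 2.9 Gb"
implied_read_length = COVERAGE * GENOME_SIZE / TOTAL_READS

# Reads contributed by the Whitehead Institute ("WI 54% of reads").
whitehead_reads = round(TOTAL_READS * 0.54)

for name, count in reads_per_library.items():
    print(f"{name}: {count:,} reads")
print(f"implied mean read length: {implied_read_length:.0f} bp")
print(f"Whitehead reads: {whitehead_reads:,}")
```

The implied read length is only as good as the rounded slide figures it is derived from.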

Mouse Genome Sequencing Consortium

Assembly: 88 ultracontigs, covers 96% of genome

Contig: 25 kb
Supercontig: 17 Mb
Ultracontig: 50 Mb
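
The slide does not say how these lengths were summarised. Assuming they are N50 values (the usual assembly statistic), a minimal sketch of the calculation looks like this. The function name and the toy contig lengths are hypothetical and are not taken from the mouse assembly.

```python
def n50(lengths):
    """Return the N50: the length L such that pieces of length >= L
    together contain at least half of the total assembled bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("need at least one length")


# Made-up contig lengths (kb), purely to show the calculation.
toy_contigs_kb = [40, 30, 25, 20, 10, 5]
print(n50(toy_contigs_kb))
```

The same function applies unchanged to supercontig or ultracontig lengths.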

Regions of conserved synteny: ~95% of genome

Extremely high conservation: 560,000 anchors


Autosomes, Chromosome X

Genome size: Mouse < Human (2.5 vs 2.9 Gb)

Expansion ratio (M/H)

Genome size: Mouse < Human (2.5 vs 2.9 Gb)

Total transposon-derived repeat:
Human 46%
Mouse 37%
(scale bar: 400 Mb)
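
To put the percentages in absolute terms, the sketch below converts them to gigabases using the genome sizes quoted on the slide (2.9 Gb human, 2.5 Gb mouse). It is plain arithmetic on the rounded slide figures, with illustrative variable names.

```python
# Genome sizes and transposon-derived repeat shares as quoted on the slides.
GENOME_GB = {"human": 2.9, "mouse": 2.5}
REPEAT_SHARE = {"human": 0.46, "mouse": 0.37}

# Absolute repeat content and the remainder, in Gb.
repeat_gb = {sp: GENOME_GB[sp] * REPEAT_SHARE[sp] for sp in GENOME_GB}
non_repeat_gb = {sp: GENOME_GB[sp] - repeat_gb[sp] for sp in GENOME_GB}

# Mouse/human genome size ratio.
size_ratio = GENOME_GB["mouse"] / GENOME_GB["human"]

for sp in GENOME_GB:
    print(f"{sp}: {repeat_gb[sp]:.3f} Gb repeat, "
          f"{non_repeat_gb[sp]:.3f} Gb other")
print(f"mouse/human size ratio: {size_ratio:.2f}")
```

With these inputs the non-repeat remainder comes out close to 1.57 Gb in both species, so most of the size difference sits in the recognisable repeat fraction.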

Less Transposon Activity in Mouse Lineage?

[Figure: ancestral repeat vs lineage-specific repeat content in human and mouse; scale bar 100 Mb]

No!!!! More Transposon Activity

More deletion in mouse

Transposons: Accumulate in same regions

GC-content: human larger tails than mouse

Protein-coding gene count falling (<30,000)

Mouse-Human Comparison
• ~99% have homologs (maybe 100%)
• ~96% have homolog in region of conserved synteny
• ~80% have 1:1 ortholog
• ~22,500 evidence-based gene predictions

Gene family expansions: reproduction, immunity

25 mouse-specific gene family cluster expansions

• 14 reproduction
• 5 host defense, immunity

Large conserved elements (>100 bp)

[Figure: exons vs non-exons; labels 75% and 90%]

Large conserved elements: Coding, Non-coding

PPAR

How much of the genome is under selection?

Extremely high conservation: 560,000 anchors

Less than half are coding exons (~220,000)

Nucleotide-level alignment: ~40% of genomes

Why so much?

Given the neutral substitution rate between mouse and human, the vast majority of truly orthologous sequence can be aligned!

Alignable does NOT imply functional

Nucleotide-level alignment: ~40% of genomes

Why so little?

Suppose:
• Ancestral genome ~2.9 Gb
• New transposons are offset by deletion

Ancestral genome remaining:
• in human = 73%
• in mouse = 57%
• in both = 73% x 57% = 42%
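
The 42% figure is simply the product of the two retention fractions. The sketch below reproduces it. Like the slide, it assumes that loss of ancestral sequence is independent in the two lineages. Variable names are illustrative.

```python
# Figures from the slide.
ANCESTRAL_GB = 2.9        # assumed size of the ancestral genome
RETAINED_HUMAN = 0.73     # share of ancestral sequence still in human
RETAINED_MOUSE = 0.57     # share of ancestral sequence still in mouse

# Assumption (implicit in the slide's multiplication): deletions in the
# two lineages are independent, so the shares multiply.
retained_both = RETAINED_HUMAN * RETAINED_MOUSE
shared_gb = ANCESTRAL_GB * retained_both

print(f"retained in both: {retained_both:.1%}")
print(f"shared ancestral sequence: {shared_gb:.2f} Gb")
```

Under these assumptions the expected shared fraction sits close to the ~40% that actually aligns.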

Neutral substitution rate: ~0.46 per site

Mouse branch: 0.31
Human branch: 0.15

Mouse 2x faster over 75 Myr
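
The branch figures can be turned into approximate per-year rates. The sketch below assumes that 0.31 belongs to the mouse branch and 0.15 to the human branch (consistent with "mouse 2x faster"), and that both lineages have been diverging for the full 75 Myr. Names are illustrative.

```python
# Substitutions per site on each branch, as read from the slide.
MOUSE_BRANCH = 0.31
HUMAN_BRANCH = 0.15
DIVERGENCE_YEARS = 75e6   # "75 Myr"

# Total divergence between the two genomes and the mouse/human ratio.
total_divergence = MOUSE_BRANCH + HUMAN_BRANCH
mouse_to_human = MOUSE_BRANCH / HUMAN_BRANCH

# Average substitutions per site per year on each branch.
mouse_rate = MOUSE_BRANCH / DIVERGENCE_YEARS
human_rate = HUMAN_BRANCH / DIVERGENCE_YEARS

print(f"total: {total_divergence:.2f} substitutions per site")
print(f"mouse/human ratio: {mouse_to_human:.2f}")
print(f"mouse: {mouse_rate:.1e} per site per year")
print(f"human: {human_rate:.1e} per site per year")
```

These are averages over the whole period and say nothing about how the rate changed along each lineage.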

Substitutions in ancestral repeats: roughly normal distribution

Neutral substitution rate: ~0.46 per site

[Figure panels: introns, coding exons, 5'-UTR, 3'-UTR, upstream, downstream, CpG islands, known regulatory regions]

Proportion of genome under selection: ~5%

Neutral sequence: ancestral repeats

Whole genome: alignable portion

Excess conservation

Coding exons only ~1.5%

What is the rest? UTRs, regulatory elements, RNA genes, structural elements?
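
A quick calculation shows how much sequence the "rest" represents. The sketch below applies the ~5% and ~1.5% figures to the 2.5 Gb mouse genome size quoted elsewhere in the talk. Which genome the percentages refer to is an assumption here, and the names are illustrative.

```python
# Shares quoted on the slides.
SELECTED_SHARE = 0.05    # proportion of genome under selection
CODING_SHARE = 0.015     # coding exons only
GENOME_MB = 2500         # assumption: apply to the 2.5 Gb mouse genome

# Selected sequence that is not coding exon.
noncoding_selected_share = SELECTED_SHARE - CODING_SHARE
noncoding_fraction_of_selected = noncoding_selected_share / SELECTED_SHARE

selected_mb = GENOME_MB * SELECTED_SHARE
coding_mb = GENOME_MB * CODING_SHARE
noncoding_selected_mb = selected_mb - coding_mb

print(f"under selection: {selected_mb:.1f} Mb")
print(f"coding exons: {coding_mb:.1f} Mb")
print(f"selected but non-coding: {noncoding_selected_mb:.1f} Mb "
      f"({noncoding_fraction_of_selected:.0%} of selected sequence)")
```

On these figures, roughly two thirds of the sequence under selection is not protein-coding.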

TNFα enhancer

Conserved

RefSeq

Genscan

Human  ACCGCTTCCTCCACATGAGATCATGGTTTTCTCCACCAAGGAAGTTTTCCGAGGGTTGAATGAGAGCTTTTCCCCGCCC
       ||||||||||||| ||||| |||||| ||||||||||||||||||||||||  ||||||||||     |||||||||||
Mouse  ACCGCTTCCTCCAGATGAGCTCATGGGTTTCTCCACCAAGGAAGTTTTCCGCTGGTTGAATGA--TTCTTTCCCCGCCC

Annotated sites (marked with asterisks on the slide): NFat/Ets, CRE, k3-Nfat, Ets, Nfat, AP1, SP1
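
The degree of conservation in this stretch can be quantified directly from the two aligned rows. The sketch below counts matches, mismatches and gap columns. The function name is hypothetical, and the sequences are copied from the slide.

```python
# Aligned rows from the slide ("-" marks a gap).
HUMAN = ("ACCGCTTCCTCCACATGAGATCATGGTTTTCTCCACCAAGGAAGTTTTCC"
         "GAGGGTTGAATGAGAGCTTTTCCCCGCCC")
MOUSE = ("ACCGCTTCCTCCAGATGAGCTCATGGGTTTCTCCACCAAGGAAGTTTTCC"
         "GCTGGTTGAATGA--TTCTTTCCCCGCCC")


def alignment_stats(row_a, row_b, gap="-"):
    """Count (matches, mismatches, gap columns) in a pairwise alignment."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    matches = mismatches = gaps = 0
    for a, b in zip(row_a, row_b):
        if a == gap or b == gap:
            gaps += 1
        elif a == b:
            matches += 1
        else:
            mismatches += 1
    return matches, mismatches, gaps


matches, mismatches, gaps = alignment_stats(HUMAN, MOUSE)
identity_all_columns = matches / len(HUMAN)
identity_ungapped = matches / (matches + mismatches)

print(matches, mismatches, gaps)
print(f"identity over all columns: {identity_all_columns:.1%}")
print(f"identity over ungapped columns: {identity_ungapped:.1%}")
```

Whether gap columns count in the denominator is a convention, so both versions are reported.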

Genome evolving at non-uniform rate

Mouse Genome summary

• 2.5 Gb in size (smaller than human, due to deletion)

• More lineage-specific repeats

• < 30,000 genes (>99% with homologs in human)

• Evolves 2x faster than human

• 95% of genome in blocks of conserved synteny

• 5% under selection (1.5% coding, the rest is unknown)

• Large haplotype blocks of domesticus or musculus ancestry in inbred strains

Implications of mouse sequence

• Cloning of Classical mutations

• New Mutagenesis programs

• Identification of Quantitative Trait Loci (QTLs)

• Engineering Knock-outs, Knock-ins

• BAC transgenics

• Modeling human disease

• Understanding gene regulation

Future directions

• Finish mouse genome

• Sequence more mammals (dog, chimp, marsupial)

• “Genomic accounting”

• Identify regulatory elements

• Mouse haplotype map

Genomic Alignments for Multiple Species


… integrated with gene expression analysis

Acknowledgements

Whitehead Institute: Kerstin Lindblad-Toh, Michael C. Zody, David Jaffe, Claire Wade, Mark Daly, Jade Vinson, Elinor Karlsson, EJ Kulbokas, Nicole Stange-Thomann, Rob Nicol, Tim Holzer, Toby Bloom, Jill Mesirov, Chad Nusbaum, Bruce Birren, Eric Lander

Washington University: John McPherson, Bob Waterston

Sanger Institute: Jim Mullikin, Jane Rogers

Analysis Group: David Haussler, Jim Kent, Arian Smit, Chris Ponting, Webb Miller, Ross Hardison, Laura Elnitski, Inna Dubchak, Lior Pachter, Sean Eddy, Michael Brent, Roderic Guigo, Wayne Frankel, Carol Bult

Ensembl: Ewan Birney

• Mouse Liaison group
• University of Oklahoma
• Albert Einstein/Harvard
• NIH ISC
• TIGR
• CHORI

• Mouse Genome: http://www.ensembl.org/Mus_musculus/
• SNPs: http://aretha.jax.org/pub-cgi/phenome/mpdcgi?rtn=docs/snps