Drosophila melanogaster – dark-bellied dew-lover - not really fruit flies, originally called...


Upload: harry-cummings

Post on 04-Jan-2016


TRANSCRIPT

Page 1: Drosophila melanogaster – dark-bellied dew-lover - not really fruit flies, originally called vinegar or pomace flies. Live on yeast and bacteria in rotting

Drosophila melanogaster (the name means "dark-bellied dew-lover") is not really a fruit fly; these flies were originally called vinegar or pomace flies. They live on yeast and bacteria in rotting fruit and other vegetation. There are about 2000 species of Drosophila, and many more in the Drosophilidae.

True fruit flies are the Tephritidae, which live on and in fruit and cause economic damage, e.g. the apple maggot fly Rhagoletis pomonella (below) and the Mediterranean fruit fly or medfly. There are about 100 Rhagoletis species, and about 5000 species in 500 genera in the family.

Page 2

IB404 - Drosophila melanogaster 1 - Feb 15

D. melanogaster has been a premier genetic model organism since Thomas Hunt Morgan started using it at Columbia University in New York in 1910. His group not only isolated many mutants but also mapped them, worked out sex-linkage, used the larval salivary gland polytene chromosomes for mapping (Calvin Bridges), and made interspecific comparisons (Alfred Sturtevant). Hermann Muller later demonstrated X-ray mutagenesis, which led to the 1946 Nobel Prize in Physiology or Medicine and to concerns about carcinogenesis.

Christiane Nüsslein-Volhard, Eric Wieschaus, and Ed Lewis won the Nobel Prize in Physiology or Medicine in 1995 for their work on early developmental genes and the HOX complex.

Bridges, Sturtevant, and Morgan in 1920

Page 3

The basic features of the genome architecture were already fairly well known. The genome is partitioned into heterochromatin and euchromatin. Compact heterochromatin surrounds the centromeres; it is made up largely of satellite sequences (many tandem repeats of 100-500 bp stretches), so it is not easily cloned and sequenced, and it is not replicated in the larval salivary gland polytene chromosomes. The Y and the dot 4th chromosomes are almost entirely heterochromatic. Roughly 60 Mbp is heterochromatic and 120 Mbp is euchromatic (clonable, sequenceable, and containing most genes). It was also known that roughly 15% of the euchromatin is made up of transposons, primarily long retroviral-like retrotransposons, while many more flank, and lie within, the centromeric heterochromatin. About 1300 genes had been cloned and sequenced the old-fashioned way, in lambda phage (it used to be a PhD project to clone and sequence a gene).

Mitotic chromosomes

Page 4

The idea of sequencing the entire Drosophila genome had a rough start, with lots of scepticism in the 1980s. Several different groups made starts, including Ian Duncan and Dan Hartl at WashU. Eventually, in the early 1990s, two major groups got going: the Berkeley Drosophila Genome Project (BDGP), led by Gerry Rubin (now vice-president of the Howard Hughes Medical Institute), and the European Drosophila Genome Project (EDGP), led in part by Michael Ashburner at Cambridge. They first sequenced several well-known regions such as the Antennapedia and Bithorax HOX complexes, the Adh region, and the tip of the X chromosome. This whetted the appetite of the community, and the BDGP began a genome-wide BAC clone-by-clone approach, while the EDGP “walked” along the X chromosome using cosmids. Together they generated about 15% of the genome in a mixture of finished and draft sequence.

Gerry Rubin

Michael Ashburner

Page 5

In 1998 Craig Venter left TIGR and formed Celera with $300m from ABI, including 300 96-capillary sequencers, with a plan to sequence the human genome. As a “demonstration project”, they sequenced D. melanogaster using their whole-genome shotgun (WGS) strategy in 3 months (published in Science in 2000). The BDGP cleaned up this draft to a near-finished genome in 2003.

Celera’s WGS strategy was to sequence about 2 million reads from a 2 kb insert plasmid library to provide the bulk of the basic sequence (~7X coverage), about 1.3 million reads from a 10 kb insert plasmid library to bridge the many long transposons during scaffolding (~5X coverage), and 20,000 reads from a 130 kb insert BAC library (~0.1X coverage) to provide long-range scaffolding information. Total sequence coverage was around 13X.
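The coverage figures follow from simple arithmetic: fold coverage is roughly the number of reads times the average read length, divided by the size of the target. The sketch below is only an illustration. The 500 bp mean read length is an assumption (the slides do not state it), the 120 Mbp target is the euchromatic size quoted earlier, and the function and variable names are made up for this example.

```python
def sequence_coverage(n_reads, mean_read_len_bp, genome_size_bp):
    """Approximate fold coverage: total sequenced bases / target size."""
    return n_reads * mean_read_len_bp / genome_size_bp

# Read counts per library come from the text. The 500 bp mean read length
# is an ASSUMED value; 120 Mbp is the euchromatic size quoted earlier.
MEAN_READ_LEN_BP = 500
EUCHROMATIN_BP = 120_000_000
LIBRARIES = {
    "2 kb plasmid": 2_000_000,
    "10 kb plasmid": 1_300_000,
    "130 kb BAC": 20_000,
}

coverage = {
    name: sequence_coverage(n_reads, MEAN_READ_LEN_BP, EUCHROMATIN_BP)
    for name, n_reads in LIBRARIES.items()
}
total_coverage = sum(coverage.values())

for name, fold in coverage.items():
    print(f"{name}: {fold:.2f}X")
print(f"total: {total_coverage:.1f}X")
```

Under these assumptions the total lands in the same range as the figure quoted above; the per-library values shift in proportion to whatever read length is assumed.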

Assembly resulted in around 800 scaffolds totalling 118 Mbp, with about 1600 sequence gaps within the scaffolds and, of course, the 800 clone gaps between scaffolds. However, the vast majority of the euchromatic chromosome arm sequence was in a few long scaffolds, with many short scaffolds near the centromeric heterochromatin (still incomplete today).

Page 6

Drosophila genome (2000). Each chromosome arm is depicted: (A) transposable elements, (B) gene density, (C) scaffolds from the joint assembly, (D) scaffolds from the WGS-only assembly (clone gaps are bars), (E) polytene chromosome divisions, and (F) clone-based tiling path. Red clones were completely sequenced and blue clones were draft sequenced. Each chromosome arm is oriented left to right; the centromere is located at the right side of X, 2L, and 3L, and at the left side of 2R and 3R.

Page 7

FlyBase is the home for this genome, and the other Drosophila genomes, with an enormous body of connected information, including all the mutants and their phenotypes, nearby transposon insertions, results of microarray and RNAi experiments, plus the entire Drosophila literature. Here is white and a nearby gene (CG32795) in the Genome Browser.

Page 8

Like many genes in Drosophila, white is within an intron of another gene, kirre, which is an amazing ~400 kb gene with several ~100 kb introns, containing altogether 23 other annotated gene models, including a more normal paralog, rst. Note ESTs suggesting additional genes.

Page 9

But much of the genome looks more like this, with lots of genes, in either orientation, and remarkably short promoter regions between them. This region is just 40 kb and contains 13 genes. Overall, there are ~15,000 genes in ~120 Mbp, so roughly 8 kb per gene on average. Notice that even today most of these genes still have simple CG (originally meaning computed gene) numbers. per is the period gene, of circadian rhythm fame. eIF2B is an initiation factor involved in translation.
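The density figures can be checked directly by dividing region size by gene count. A small sketch (the function name is illustrative) using the rounded numbers from the text: it gives about 8 kb per gene genome-wide, and about 3 kb per gene for the 40 kb region shown.

```python
def kb_per_gene(region_bp, n_genes):
    """Average genomic span per gene, in kilobases."""
    return region_bp / n_genes / 1000

# Rounded inputs taken from the text.
genome_wide = kb_per_gene(120_000_000, 15_000)  # ~15,000 genes in ~120 Mbp
dense_region = kb_per_gene(40_000, 13)          # the 40 kb window with 13 genes

print(f"genome-wide: {genome_wide:.1f} kb per gene")
print(f"dense region: {dense_region:.1f} kb per gene")
```

The genome-wide average is pulled up by a minority of very large genes with long introns, so a typical gene-dense stretch sits well below it.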

Page 10

Perhaps the most outrageous gene in Drosophila is Dscam (Down syndrome cell adhesion molecule), which encodes an immunoglobulin superfamily trans-membrane protein that is involved in both brain development and the immune system. It has four variable exon positions that are spliced in a cassette fashion, with one alternative chosen at each, yielding 38,016 possible mRNAs and that many slightly different proteins, which mediate cell organization in the brain. This is the highest number of alternative splice forms known for a single gene in any organism, higher even than for Dscam in other insects. The human ortholog has none!
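The 38,016 figure is the product of the number of alternatives at each of the four variable exon positions. A short sketch, assuming the commonly cited counts of 12, 48, 33, and 2 alternatives for exons 4, 6, 9, and 17 (these per-exon counts are not given in the slides):

```python
from math import prod

# ASSUMED per-position counts (not stated in the slides); exactly one
# alternative from each position is included in any given mRNA.
DSCAM_ALTERNATIVES = {"exon 4": 12, "exon 6": 48, "exon 9": 33, "exon 17": 2}

# Independent choices multiply: 12 * 48 * 33 * 2.
n_isoforms = prod(DSCAM_ALTERNATIVES.values())
print(n_isoforms)  # 38016
```

Because the choices multiply, a fairly small number of alternative exons (95 in total under these counts) is enough to reach tens of thousands of isoforms.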

Page 11

Cropped view of the X annotation - 1st, 8th, & 16th Mbp

First line shows GC content, then transposons as black marks, then genes on each strand.

Page 12

Numbers of genes, paralogs, and families in H. influenzae, S. cerevisiae, C. elegans, and D. melanogaster

The first row shows the total number of genes predicted in each species. The second row shows the number of genes in each genome that appear to have arisen by gene duplication in each lineage (that is, they are paralogs of each other). The third row is the total number of distinct gene families for each genome. Note that flies have fewer genes than worms, despite seemingly increased complexity. Note also that the total number of “distinct” types of proteins, presumably doing significantly different things, in an animal approaches 10,000!

Species    H. influenzae   S. cerevisiae   C. elegans   D. melanogaster
Genes      1709            6241            18424        13601
Paralogs   284             1858            8971         5536
Families   1425            4383            9453         8065
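In this table the three rows appear to be related by families = genes - paralogs. The sketch below checks that relation for each species and computes the fraction of genes attributed to duplication. The counts are copied from the table; the dictionary layout and function name are illustrative only.

```python
# Counts copied from the table above.
GENOMES = {
    "H. influenzae":   {"genes": 1709,  "paralogs": 284,  "families": 1425},
    "S. cerevisiae":   {"genes": 6241,  "paralogs": 1858, "families": 4383},
    "C. elegans":      {"genes": 18424, "paralogs": 8971, "families": 9453},
    "D. melanogaster": {"genes": 13601, "paralogs": 5536, "families": 8065},
}

def paralog_fraction(counts):
    """Fraction of predicted genes that appear to come from duplication."""
    return counts["paralogs"] / counts["genes"]

for species, counts in GENOMES.items():
    # Check the row relation: families = genes - paralogs.
    consistent = counts["genes"] - counts["paralogs"] == counts["families"]
    print(f"{species}: {paralog_fraction(counts):.0%} paralogs, "
          f"rows consistent: {consistent}")
```

On these numbers the duplicated fraction rises from roughly a sixth in H. influenzae to nearly half in C. elegans, which is why the worm has more genes than the fly but only modestly more families.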

Page 13

Proteins involved in about 300 human diseases compared with fly, worm, and yeast proteins (one third shown here). Light to dark colors indicate increasing similarity. + indicates likely the same function; – indicates not.