Caenorhabditis elegans operons: form and function


© 2003 Nature Publishing Group

110 | FEBRUARY 2003 | VOLUME 4 www.nature.com/reviews/genetics

R E V I E W S

CAENORHABDITIS ELEGANS OPERONS: FORM AND FUNCTION

Thomas Blumenthal and Kathy Seggerson Gleason

Nematodes are unusual among animals in having a substantial proportion of their genes arranged in polycistronic clusters that are similar to bacterial operons. Recently, a nearly complete database of the Caenorhabditis elegans genes that are transcribed in operons has been produced. Analysis of this database has identified the types of genes that are contained in the operons and the extent to which operons co-regulate genes of related function.

Operons are a common form of gene organization in bacteria and ARCHAEA, where they often co-regulate genes encoding protein products that act in the same pathway (reviewed in REF. 1). Other bacterial operons result in co-expression of the components of the cellular transcription and translation machinery. These operons must be of ancient origin, because bacteria and archaea often contain equivalent operons with the same genes in the same order (REF. 2).

The operon was originally described by Jacob and Monod (REF. 3) and was defined as a cluster of genes that are under the control of a single regulatory signal. We now know that bacterial operons are generally transcribed from a single promoter, and that they result in the formation of a POLYCISTRONIC mRNA that is translated by ribosomes that re-initiate translation at the 5′ ends of downstream genes, having terminated translation at the 3′ ends of upstream genes. By contrast, eukaryotes are not considered to contain operons. Instead, eukaryotic genes are transcribed individually, each from its own promoter. Functional clustering on the chromosome does occur in eukaryotes, but it is relatively rare. For example, developmentally important homeobox genes are often clustered.

Surprisingly, it was discovered a decade ago that one member of the animal kingdom, the nematode worm Caenorhabditis elegans, has polycistronic gene clusters that are similar to bacterial operons (FIG. 1). However, whereas bacterial operons make a polycistronic mRNA, the C. elegans operons produce a polycistronic pre-mRNA that is subsequently processed by 3′-end formation and trans-splicing to make monocistronic mRNAs. Here, we present an overview of C. elegans operons and speculate on their evolutionary origins. We also consider the potential function of operons by examining the types of gene that lie within them. Finally, we discuss the extent to which the C. elegans operons co-express genes that produce proteins that function together.

Worm operons

The C. elegans operons were discovered as a consequence of the unique trans-splicing event that separates the polycistronic pre-mRNA into monocistronic mRNAs (REF. 4). These operons are not ancestrally related to their counterparts in bacteria and archaea. Instead, they have evolved separately, probably entirely within the nematode phylum. The operons are transcribed from a promoter that lies 5′ of the gene cluster, but often no polycistronic pre-mRNA can be detected because the pre-mRNA is processed co-transcriptionally (reviewed in REFS 5–7).

At the 3′ end of each gene in the operon, there is a conventional cleavage and polyadenylation signal, which results in the production of a mature mRNA from the upstream gene. When cleavage occurs, the pre-mRNA that specifies the remainder of the genes in the operon is left exposed to exonucleolytic degradation; so, how are the downstream mRNAs produced? Although the molecular details are not yet understood, it is clear that degradation is prevented by a U-rich stretch of bases in the short (~100 bases) intergenic regions in the operon (REF. 8). Whatever binds to this U-rich sequence is also required for trans-splicing at the 5′ end of the downstream genes.

ARCHAEA

An ancient group of organisms that have ribosomes and cell membranes that distinguish them from bacteria. They are often found in extreme environments, such as near deep-sea vents.

POLYCISTRONIC

Clusters that contain several adjacent cistrons (or genes).

Department of Biochemistry and Molecular Genetics, Box B-121, University of Colorado School of Medicine, Denver, Colorado 80262, USA. Correspondence to T.B. e-mail: [email protected]
doi:10.1038/nrg995

TRANS-SPLICING

A process closely related to intron removal, in which a short spliced leader is spliced onto the 5′ ends of mRNAs.

RNA INTERFERENCE

(RNAi). A process by which double-stranded RNA specifically silences the expression of homologous genes through degradation of their cognate mRNA.

In nematodes and several other groups, including the trypanosomes (protists; REFS 9,10), the flatworms (REF. 11), the cnidarian hydra (REF. 12) and the primitive chordate Ciona intestinalis (REF. 13), trans-splicing trims the 5′ ends of the mRNAs. This results in the removal of the portion of the RNA that begins at the promoter and ends at the first 3′ splice site. The sequence is replaced with a short 5′ leader (22 nucleotides (nt) long in nematodes), called the spliced leader (SL) (reviewed in REF. 14). In most cases, this replacement occurs near the 5′ capped end of the pre-mRNA. The donor in trans-splicing, known as SL RNA, is a short (~100 nt long in nematodes) small nuclear RNA (snRNA) that occurs as a small nuclear ribonucleoprotein (snRNP). The trans-splicing reaction is a variant of cis-splicing (intron removal) and is catalysed by mostly the same machinery (U2, U4, U5 and U6 snRNPs). Recently, it has been shown that in Ascaris (a parasitic nematode) the SL snRNP contains two new proteins that have, so far, not been found outside the nematode phylum. It seems that these two proteins are unique to the SL snRNP (REF. 15), and determining whether they are present in other species in which SL-type trans-splicing occurs could help to solve the enigma of the evolutionary relationship between cis- and trans-splicing (REF. 16).

C. elegans has two types of SL snRNP. The principal form, SL1 snRNP, is responsible for all of the trans-splicing events that occur near the 5′ ends of the pre-mRNAs. By contrast, the SL2 snRNP, and variants thereof, are never used at such sites. Instead, they are used to process polycistronic pre-mRNAs at trans-splice sites that lie in the intergenic regions in operons (REF. 4). It was this unique feature of operon processing that led to their discovery. Because all of the genes that had SL2 at their 5′ ends were located on the chromosome 100–300 bp downstream of another gene, it was proposed that they were co-transcribed with the gene immediately upstream of them, and also that the resulting polycistronic pre-mRNA was processed by trans-splicing with the rare SL2 snRNP.

[Figure 1 diagram labels: P; mai-1, gpd-2, gpd-3; SL (two sites); Poly(A) (three sites)]

Figure 1 | The first operon identified in Caenorhabditis elegans. The 5′-most gene is mitochondrial ATPase inhibitor-1 (mai-1). This gene is not TRANS-SPLICED. The two downstream genes, glyceraldehyde 3-phosphate dehydrogenase-2 (gpd-2) and glyceraldehyde 3-phosphate dehydrogenase-3 (gpd-3), encode isoforms of a glycolytic enzyme. The two intercistronic regions are ~100 bp long (bold). Exons and introns are drawn to scale. Coding regions are depicted by green, purple and orange boxes; non-coding regions are depicted by tan boxes. Arrows indicate the sites of 3′-end formation (poly(A) addition) and trans-splicing (addition of spliced leader (SL)). The single promoter is marked with a P.

Over the years, the correlation between SL2 trans-splicing and downstream position in a polycistronic cluster, originally based on just five examples, became more robust (REF. 17), but was this correlation also true on a global scale? To investigate this question, a probe specific to mRNAs that had the SL2 sequence at their 5′ ends was used to search whole-genome microarrays (REF. 18). Approximately 1,200 genes that are trans-spliced by the SL2 snRNP were found in this way. Significantly, more than 90% of the genes that were shown to be SL2 trans-spliced, on the basis of cDNA clone sequences, were among those identified by the microarray analysis, which indicates that most of the genes that are trans-spliced to SL2 were found by this method. Interestingly, careful analysis of gene location showed that the positions of most of these genes were consistent with them being downstream genes in operons. The 5′-most genes in these operons were not SL2 trans-spliced.

As a result of this analysis, a database of ~900 operons was compiled, and it was estimated to contain ~90% of the operons in the genome. Overall, at least 15% of C. elegans genes, corresponding to ~2,600 genes, are contained in operons. These operons contain an average of 2.6 genes, with the longest so far identified containing eight genes. The operons are concentrated in the gene-dense clusters in the recombination-low central regions of the autosomes, and are largely missing from both the X chromosome and the autosome arms. This pattern is similar to that which was recently reported for genes that give non-viable phenotypes when inactivated by RNA INTERFERENCE (RNAi) (REF. 19) (see below for further discussion).

There are typically only 100 bp between genes in operons, and, in fact, most intergenic regions are within a few base pairs of this length, which indicates a possible functional constraint. This distance constraint is probably due to a requirement for the positioning of the U-rich sequence at a relatively precise distance from both the site of 3′-end formation upstream and the site of trans-splicing downstream.

Evolution of the operons

Once the operons were discovered in C. elegans, it soon became evident that they were highly conserved in Caenorhabditis briggsae, a species that separated from C. elegans 50–100 million years ago. Analysis of the recently completed C. briggsae genomic sequence shows that 96% of the 900 known C. elegans operons are present in the same form in C. briggsae (C. elegans Sequencing Consortium, unpublished observations). Chromosomal breakpoints (the sites that mark rearrangements between C. elegans and C. briggsae chromosomes) occur no more frequently between genes in operons than they do within genes, which indicates that these gene assemblies are evolutionarily stable. Presumably, this stability results from the fact that the downstream genes in the operons generally lack promoters that are proximal to the genes, so that the
chromosomal rearrangements that separated them would be expected, in most cases, to result in their inactivation. However, there are some clear cases in which an operon has either been formed in C. elegans, or has been destroyed in C. briggsae, since the divergence of the two species (for example, C14B1.7/C14B1.8; T08A11.2/F43C1.3; glyceraldehyde 3-phosphate dehydrogenase-1 (gpd-1)/T09F3.4; Y47G6A.25/Y47G6A.24; R107.6/glutathione S-transferase-1 (gst-1)).

[Figure 2: bar charts, panels a–d. Axis: Percentage in operons. Each bar is a gene class labelled with its gene count, for example Homeobox (88), Ribosomal proteins (115) and RNA degradation (15).]

Figure 2 | Percentage of genes of various classes encoded in operons. a,b | Genes divided into functional classes. c | Enzymes. d | Genes divided into types of protein. Numbers in parentheses refer to the number of genes in the class. In most instances, the classes were defined using the categories or keywords in the Proteome database. In a few categories, the lists of genes were obtained from experts in the field or recent publications: golgi (K. Howell, University of Colorado), midbody (A. Skop, University of California, Berkeley), nucleolus (REF. 39), spliceosome (S. Mount web site and REF. 27), proteasome (W. Harper, Baylor University, Houston, Texas), cyclophilin (A. Page, University of Glasgow), nuclear pore (S. Adam, Northwestern University, Chicago, Illinois), RNA stability (A. van Hoof, University of Texas Medical School, Houston). cyt., cytoplasmic; EGF, epidermal growth factor; end., endocytic; enz., enzyme; GAP, GTPase activating protein; RNAP, RNA polymerase; RRM, RNA recognition motif; sec., secretory; TAF, TFIID-associated factor; TFII, transcription factor II.

At least one C. elegans operon has been found in a more distantly related rhabditid nematode species, Dolichorhabditis dolichura (REF. 20), which was also found to trans-splice the downstream genes with SL2. So, it seems that the operons are present throughout the rhabditid clade, a grouping that contains both free-living and parasitic species. Recent unpublished evidence (D. Giuliano and M. Blaxter, personal communication) shows that this operon, and possibly others, is also present in the distantly related parasitic nematode Brugia malayi, which indicates that operons might be present throughout the nematode phylum. Interestingly, the downstream genes in the putative operons in Brugia are trans-spliced with SL1, and no SL2 has been found; this indicates the possibility that SL2 is a more recent adaptation in the rhabditid nematodes, or one of their ancestors, in response to the presence of many
operons in the genome. Presumably, such a development would have allowed more efficient RNA processing than was possible with the SL1 snRNP.

We believe that operons might provide a selective advantage for any organism in which they could efficiently produce gene products. They might increase the efficiency of gene regulation, enabling the co-regulation of genes without the need for the duplication of regulatory sequences. Furthermore, because all of the genes that are contained in an operon use a single promoter, they can be much closer together than is usual, thereby reducing the genome size.

If the operons provide such selective advantages, why are they not found throughout the eukaryotes? Perhaps it is because, generally, there is no way to obtain expression of the downstream genes. Cleavage at the 3′ ends of the upstream gene creates a 5′ phosphate on the downstream portion of the pre-mRNAs. This would be expected to lead to degradation of the downstream RNA or to transcription termination. However, trans-splicing caps the downstream mRNAs, thereby allowing their transcription in a stable form. So, trans-splicing can be viewed as an enabling characteristic, allowing polycistronic gene expression and, therefore, any organism in which trans-splicing occurs should have the potential to evolve operons. So far, trans-splicing has been found in nematodes, flatworms, hydra, the primitive chordate Ciona and the trypanosomatid protists (reviewed in REF. 16). Although the genomic organization of Ciona and hydra has not yet been analysed in detail, the rest do seem to have polycistronic transcription. Trypanosomes were the first eukaryotes in which polycistronic transcription was discovered, and it clearly occurs throughout much of their genome (REF. 21). However, trypanosomes are not known to have regulated polycistronic gene clusters similar to the nematode operons. Although there has been relatively little analysis of the genome organization in flatworms, there is at least one report of probable operons in these organisms (REF. 22).

Polycistronic transcription has also been found sporadically in animals that do not trans-splice, including Drosophila (the stoned locus, Alcohol dehydrogenase (Adh) and Alcohol dehydrogenase-related (Adhr)), humans (growth differentiation factor 1 (GDF1) and longevity assurance homologue 1 (LASS1), also known as UOG1) and plants (γ-glutamyl kinase (GK) and glutamyl phosphate reductase (GPR)) (reviewed in REF. 23). These make true dicistronic mature mRNAs that are translated by internal ribosome entry or by re-starting translation of a second gene after termination at a first gene. Presumably, such operons are relatively rare because of the inherent complexity of internal translation initiation by the eukaryotic translation machinery. Nevertheless, these examples indicate that there can be sufficient selective advantage of polycistronic transcription that it is selected, even though it might be difficult to achieve the molecular mechanism by which the proteins are made.

What genes are in operons?

What types of gene are encoded in operons? FIG. 2 summarizes the results of a comparison between the list of 2,340 genes that are found in operons and lists of genes based on their functions, the motifs that are present in their protein products, and the processes in which they participate. It is immediately clear that the placement of genes in operons is not random. Genes that encode certain types of protein are likely to be present in operons, whereas others are never, or almost never, found in operons. For example, genes that encode the proteins that catalyse the various steps of gene expression (transcription, splicing and translation) have a strong tendency to be found in operons. By contrast, genes that encode tissue- or cell-type-specific proteins, such as transcription activators, collagens, vitellogenins and major sperm proteins, are virtually never found in operons. These genes presumably need to be regulated at the transcriptional level, and genes that encode cell-type-specific products must be regulated independently. Conversely, proteins that are destined for the mitochondria have a strong tendency to be transcribed in operons. This is well illustrated by the case of SECONDARY TRANSPORTERS: only 2 of the 83 cytoplasmic transporters are transcribed in operons, whereas 10 of the 17 mitochondrial transporters are contained in operons.
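The gene counts quoted in this review can be cross-checked with simple arithmetic. The sketch below, with hypothetical function and variable names, multiplies the ~900 catalogued operons by the reported average of 2.6 genes per operon, and then scales up under the stated estimate that the catalogue holds ~90% of all operons. It assumes that the 2.6 average also applies to operons not yet catalogued, which the text does not state.

```python
# Figures taken from the text; all names are illustrative only.
CATALOGUED_OPERONS = 900        # operons in the database
MEAN_GENES_PER_OPERON = 2.6     # reported average operon size
CATALOGUE_COMPLETENESS = 0.90   # estimated fraction of all operons catalogued


def genes_in_operons(n_operons: float, mean_size: float) -> int:
    """Return the expected number of operon genes, rounded to a whole gene."""
    return round(n_operons * mean_size)


catalogued_genes = genes_in_operons(CATALOGUED_OPERONS, MEAN_GENES_PER_OPERON)

# Assumption: uncatalogued operons have the same average size.
estimated_operons = CATALOGUED_OPERONS / CATALOGUE_COMPLETENESS
estimated_genes = genes_in_operons(estimated_operons, MEAN_GENES_PER_OPERON)

print(catalogued_genes)  # 2340
print(estimated_genes)   # 2600
```

The first result matches the list of 2,340 operon genes used for the comparison in FIG. 2, and the second matches the ~2,600 genes quoted for the genome as a whole.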

SECONDARY TRANSPORTER

A membrane transporter that does not require ATP cleavage for transport.
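The contrast between the two secondary-transporter classes can be made concrete. The sketch below computes the fraction of each class found in operons from the counts given in the text (2 of 83 cytoplasmic, 10 of 17 mitochondrial) and adds a one-sided hypergeometric (Fisher-type) tail probability. The statistical test is an illustrative addition, not an analysis reported in the review, and the function name is hypothetical.

```python
from math import comb


def hypergeom_tail(total: int, marked: int, drawn: int, observed: int) -> float:
    """P(X >= observed) when `drawn` items are sampled without replacement
    from `total` items, of which `marked` are of interest."""
    upper = min(marked, drawn)
    numerator = sum(
        comb(marked, k) * comb(total - marked, drawn - k)
        for k in range(observed, upper + 1)
    )
    return numerator / comb(total, drawn)


# Counts quoted in the text.
cyto_total, cyto_in_operons = 83, 2
mito_total, mito_in_operons = 17, 10

cyto_fraction = cyto_in_operons / cyto_total
mito_fraction = mito_in_operons / mito_total

# Chance of seeing at least 10 operon genes among the 17 mitochondrial
# transporters if the 12 operon-encoded transporters were spread at random
# over all 100 secondary transporters.
p_value = hypergeom_tail(
    total=cyto_total + mito_total,
    marked=cyto_in_operons + mito_in_operons,
    drawn=mito_total,
    observed=mito_in_operons,
)

print(f"cytoplasmic: {cyto_fraction:.1%}, mitochondrial: {mito_fraction:.1%}")
print(f"one-sided p-value: {p_value:.2e}")
```

Under these assumptions the mitochondrial class is far more operon-rich than random placement would predict, which is the point the text makes qualitatively.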

Box 1 | Cyclophilin operons

All of the genes in the operons that contain a gene in the cyclophilin gene class (marked by an asterisk), which encode peptidyl prolyl trans-isomerases, are listed along with their presumed functions or identifications on the basis of their homology to other known genes.

CDP alcohol phosphatidyltransferase, involved in phospholipid biosynthesis (Y49A3A.1)
ATP synthase α/β-subunit ATP-binding and phosphorylation-dependent chloride channel
Protein of unknown function (Y49A3A.3)
cyp-1, cyclophilin-related protein-1 (Y49A3A.5)*

Zn metal ion transporter (F31C3.4)
Glycosyl hydrolase (F31C3.3)
cyp-5, cyclophilin-related protein-5 (F31C3.1)*

pdi-1, protein disulphide isomerase-1 (C14B1.1)
cyp-9, cyclophilin-related protein-9 (T27D1.1)*

cyp-11, cyclophilin-related protein-11 (T01B7.4)*
rab-21 (T01B7.3)
G-protein-coupled receptor (T01B7.2)

Ubiquitin protein ligase–E3 enzyme (C34D4.14)
Protein of unknown function (C34D4.13)
cyp-12, cyclophilin-related protein-12 (C34D4.12)*

uaf-2, U2AF small subunit (Y116A8C.35)
cyp-13, cyclophilin-related protein-13 (Y116A8C.34)*
SF1 branchpoint binding protein (Y116A8C.32)
Protein of unknown function (Y116A8C.30)
Carbonic anhydrase/immunoglobulin domain (Y116A8C.28)

TCF9 transcription repressor that binds (C + G)-rich regions (F43G9.12)
SNARE, coiled coil (VF39H2L.1)
cyp-14, cyclophilin-related protein-14 (F39H2.2)*

tpi-1, triose phosphate isomerase-1 (Y17G7B.7)
cyp-16, cyclophilin-related protein-16 (Y17G7B.9)*

UDP-Gal acetylglucosamine galactosyl transferase (Y110A2AL.14)
PIN1, CTD interactor (Y110A2AL.13)*
Integral membrane protein (Y110A2AL.12)

Genes that encode proteins that are involved in transcription are divided into two groups. RNA polymerase subunits, the basic transcription machinery and TAFs (TFIID-associated factors) tend to be encoded in operons (FIG. 2). By contrast, homeodomain transcription factors are rarely encoded in operons: only 3 of the 88 homeodomain transcription factors in the survey were found in operons and they were all part of the same complex. The same is true for zinc-finger transcription factors, few of which occur in operons. Examples of those that do occur in operons include an operon that contains seven genes that encode nuclear hormone receptors, and an operon that contains three zinc-finger transcription factors (the lin-26 related-2 (lir-2)/lir-1/lineage defective-26 (lin-26) operon complex; REF. 24), which consists of two operons, both of which contain the middle gene. The functional significance of this complex arrangement of transcription factor genes is not known.

Another class of proteins that are often encoded in operons is the cyclophilins (BOX 1). These proteins are peptidyl prolyl trans-isomerases that help other proteins to fold. Because several other classes of enzymes tend not to be operon-encoded, it is tempting to speculate that the cyclophilins are expressed in operons with their protein substrates. An examination of the operons shown in BOX 1 shows many intriguing gene clusters. For example, cyclophilin-related protein-13 (cyp-13), which encodes a cyclophilin fused to an RNA recognition motif, is in an operon with genes that encode two 3′ splice-site recognition factors, U2AF35 and SF1, which are known to interact with one another (REFS 25,26). We hypothesize that the CYP-13 protein is also involved in splicing, and the recent discovery that the mammalian CYP-13 homologue (CypE) is associated with spliceosomes supports that idea (REFS 27–29).

The class of genes with the highest fraction transcribed in operons is the set of genes encoding proteins required for RNA degradation (FIG. 2; BOX 2). Twelve of the 15 genes in this class are operon-encoded. Other genes in these operons include three ribosomal proteins, an RNA helicase, an RNA methylase and an RNA polymerase subunit (BOX 2). It could be argued that co-transcription of the proteins that affect RNA stability with proteins that are involved in other aspects of gene expression coordinates the various steps in gene expression. Alternatively, RNA degradation proteins might be preferentially encoded in operons because they regulate accumulation of their own mRNA at the level of stability and, therefore, their co-transcription with genes from a ubiquitously expressed promoter might be irrelevant to the regulation of the other genes in these operons.

Recently, Kamath et al. (REF. 19) have studied RNAi phenotypes of ~15,000 C. elegans genes. Of these, 89% showed no obvious phenotype; interestingly, only 8% of these genes were found to be in operons. By contrast, 31% of the 1,649 genes that showed a phenotype were in operons. So, it seems that the genes that are important for viability are preferentially transcribed in operons. Kamath et al. (REF. 19) also categorized genes according to whether they are found in all eukaryotes (ancient genes), in animals only or in worms only. Using these lists we have found that, whereas 17% of the 6,755 ancient genes are transcribed in operons, only 5% of the 1,447 animal-specific genes and only 2% of the 1,812 worm-specific genes are located in operons. This result is consistent with the idea that operons preferentially contain the genes encoding proteins that are involved in fundamental processes in all eukaryotes.

So, why are those genes that encode elements of the gene expression machinery and proteins that are destined for the mitochondria contained in operons, whereas other types of gene are either excluded from operons or are present in operons at a much lower frequency (FIG. 2)? It is possible that genes are encoded in operons because they do not need to be regulated. In this view, they are transcribed from ubiquitously expressed promoters that could specify any number of genes as long as they are expressed in all cells at all times.
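The RNAi and conservation figures quoted above lend themselves to a quick back-of-envelope check. The sketch below, using hypothetical variable names and only the rounded percentages given in the text, converts them into approximate gene counts and an enrichment ratio. Because the inputs are rounded, the outputs are rough estimates rather than values reported by Kamath et al.

```python
# Rounded figures quoted in the text; all names are illustrative only.
SCREENED_GENES = 15_000            # approximate number of genes tested by RNAi
GENES_WITH_PHENOTYPE = 1_649
OPERON_RATE_WITH_PHENOTYPE = 0.31  # fraction of phenotype genes in operons
OPERON_RATE_NO_PHENOTYPE = 0.08    # fraction of no-phenotype genes in operons

# Conservation classes: (number of genes, fraction transcribed in operons).
CONSERVATION_CLASSES = {
    "ancient": (6_755, 0.17),
    "animal-specific": (1_447, 0.05),
    "worm-specific": (1_812, 0.02),
}

# Share of screened genes that showed any phenotype.
phenotype_fraction = GENES_WITH_PHENOTYPE / SCREENED_GENES

# How much more often phenotype genes sit in operons than the rest.
enrichment = OPERON_RATE_WITH_PHENOTYPE / OPERON_RATE_NO_PHENOTYPE

# Approximate number of operon genes in each conservation class.
approx_operon_genes = {
    name: round(n_genes * rate)
    for name, (n_genes, rate) in CONSERVATION_CLASSES.items()
}

print(f"genes with a phenotype: {phenotype_fraction:.0%}")
print(f"operon enrichment among phenotype genes: {enrichment:.2f}x")
print(approx_operon_genes)
```

The roughly 11% of screened genes with a phenotype is consistent with the 89% reported to show none, and the ratio of the two operon rates is a little under fourfold.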

Box 2 | RNA degradation operons

All of the genes in the operons that contain a worm homologue of an RNA degradation gene are listed, together with their presumed functions or identification on the basis of their homology to other known genes. The list of worm homologues of the RNA degradation genes (marked by an asterisk) was prepared by A. van Hoof.

Exosome subunit (F59C6.4)*
NADH ubiquinone oxidoreductase (F59C6.5)

Ribosomal protein L7Ae (Y48A6B.3)
Exosome subunit (Y48A6B.5)*

Exosome subunit (C14A4.4)*
Exosome subunit (C14A4.5)*

Tubulin prefoldin subunit (R151.9)
Exosome subunit (F37C12.13)*

rRNA methylase (T14B4.1)
Exosome subunit (F41G3.14)*
Serine/threonine kinase (F41G3.5)

Helicase (K02F3.1)
Protein of unknown function (K02F3.12)
Exosome subunit (Y6D11A.1)*

num-1, Zn finger/PID domain (T02D8.1)
Ribosomal protein S12 (T03D8.2)
Exosome subunit (F31D4.1)*

Scavenger decapping protein (Y113G7A.9)*
Protein of unknown function (Y113G7A.8)

Protein of unknown function (F52G2.2)
DCP2 decapping enzyme (F52G2.1)*

RNA polymerase II subunit (W01G7.3)
Protein of unknown function (W01G7.4)
Protein of unknown function (W01G7.5)
XRN1 cytoplasmic exonuclease (Y39G8C.1)*

Ribosomal protein L43 (Y48B6A.2)
RAT1 nuclear exonuclease (Y48B6A.3)*

mes-3, maternal effect sterile-3 (F54C1.3)
dom-3, RAI1 nuclear exonuclease subunit (F54C1.2)*

Arguing against this simple idea is the fact that many of the operons specify genes of related function, rather than just a random assembly of ubiquitously expressed genes (see below for a detailed discussion of this operon feature). Furthermore, several of the classes of genes that do not occur in operons are classified as housekeeping genes, which indicates that ubiquitous expression might not be the main criterion for operonic location.

Box 3 | Operons with clear functional relationships among proteins

Listed below are a selection of gene groups that have obvious functional relationships and are contained in the same operons in Caenorhabditis elegans.The examples shown were identified manually by examining the list of operons in REF. 18. Not all of the genes in each operon are shown, but only thosewith obvious functional relationships.

RNA processing operons

U1A snRNP protein (K08D10.4); U2B snRNP protein (K08D10.3)

uaf-2, U2AF35 3′ splice-site recognition (Y116A8C.35); cyp-13, RRM/cyclophilin spliceosome associated (Y116A8C.34); BBP/SF1 branchpoint recognition (Y116A8C.32)

RNA helicase (C41D11.7); Nuc-1, endonuclease-1 (C41D11.5)

Amino acid-tRNA synthetase (K07H8.10); RNA binding protein KH domain (K07H8.9)

npp-4, nucleoporin-4 (Y54E5A.4); dsRNA binding protein (Y54E5A.6)

RRM protein (EEED8.10); rsp-4, SR protein Sc35 (EEED8.7)

Exosome subunit (C14A4.5); Exosome subunit (C14A4.4)

Mitochondrial operons

ATP2 ATP synthase a-subunit (C34E10.6); Mitochondrial tRNA synthetase (C34E10.4)

ced-9, (BCL-2) cell death inhibitor (T07C4.8); cyt-1, cytochrome B560 (T07C4.7)

mai-1, mitochondrial ATPase inhibitor-1 (K10B3.9); gpd-2, glycolytic enzyme GAPDH-2 (K10B3.8); gpd-3, glycolytic enzyme GAPDH-3 (K10B3.7)

Succinate dehydrogenase flavoprotein (C34B2.7); PIM1 serine protease (C34B2.6)

FO ATPase subunit (F27C1.7); coq-1, hexaprenyl pyrophosphate synthetase-1 (C24A11.9)

NADH ubiquinone oxidoreductase (Y57G11C.12); coq-3, 2-demethylubiquinone-9 3-methyltransferase-3 (Y57G11C.11)

Phosphoglycerate kinase (T03F1.3); coq-4, ubiquinone biosynthesis (T03F1.2)

Mitochondrial proton channel (K07B1.3); coq-6, monooxygenase for ubiquinone biosynthesis (K07B1.2)

Succinate semi-aldehyde dehydrogenase (F45H10.3); Ubiquinol cytochrome c reductase (F45H10.2)

Cytochrome c haem lyase (T06D8.6); cox-15, cytochrome oxidase assembly protein-15 (T06D8.5)

Ubiquinol cytochrome c reductase (T27E9.2); ABC50 non-transporter ATP-binding cassette (T27E9.7)

Mitochondrial ribosomal protein S2 (T23B12.3); Mitochondrial ribosomal protein L4 (T23B12.2)

Membranes/Golgi operons

Vesicle docking and trafficking protein (C15C7.1); GRIP domain protein of the trans-Golgi (C15C7.2)

Nipsnap25 (K02D10.1); Snap25 (K02D10.5)

grp-1, PH and SEC7 domains (K06H7.4); Isopentenyl-diphosphate δ-isomerase (K06H7.3)

GRIP domain protein (T05G5.9); Golgi protein involved in protein sorting (T05G5.8); Enoyl-CoA hydratase (T05G5.6)

Channel operons

des-2, subunit of acetylcholine receptor-2 (T26H10.1); deg-3, subunit of acetylcholine receptor-3 (K03B8.9)

Ion channel subunit (C01F6.8); Modifier of voltage gating of the same channel (C01F6.9)

Longevity operon

sir-2.1, chromatin protein that regulates lifespan (R11A8.4); Glutathione S transferase implicated in detoxification, protects against DNA alkylation and cytotoxicity (R11A8.5)

Collagen modification operon

pdi-1, protein disulphide isomerase-1 (C14B1.1); cyp-9, cyclophilin peptidyl prolyl transisomerase-9 (T27D1.1)

Protein digestion operon

Proteasome subunit C3 (D1054.2); Ubiquitin ligase complex subunit (D1054.3)

Signalling operon

lin-15A, synmuv A required for vulval morphogenesis (ZK678.1); lin-15B, synmuv B required for vulval morphogenesis (ZK662.4)

Nucleolar operons

fib-1, fibrillarin nucleolar methylase-1 (T01C3.7); rps-16, small subunit ribosomal protein-16 (T01C3.6)

rpa-1, acidic ribosomal protein-1 (Y37E3.7); RPL-29 large subunit ribosomal protein (Y37E3.8)

rps-3, small subunit ribosomal protein-3 (C23G10.3); Translational inhibitor protein (C23G10.2)

RPS-27 with ubiquitin and Zn finger domains (H06I04.4); Methyl transferase required for 60S synthesis (H06I04.3)

rpa-0, acidic ribosomal protein P0 (F25H2.10); Translationally controlled tumour protein (F25H2.11); Ribosomal RNA adenine dimethylase (F25H2.12)

inf-1, DEAD box helicase (eIF4A homologue) (F57B9.6); Bystin-like protein (F57B9.5)

Novel nuclear protein 30 (F55F8.5); Novel nuclear protein 62 (F55F8.3)

Novel nuclear protein 35 (T07A9.8); Novel nuclear protein 51 (T07A9.9)


An interesting possibility might be that the operons contain genes that need to be regulated together in response to some sort of global signal. This idea was proposed many years ago to explain the existence of Escherichia coli operons that contain genes that encode elements of both the translation and transcription machinery30. In the case of C. elegans, it might, for example, be necessary to rapidly change the level of gene expression and the output of the energy generation machinery in response to a global signal such as starvation or stress. The genes involved might already have been co-regulated before operons developed, but their assembly into operons allowed this co-regulation to be accomplished with greater efficiency and precision, without the duplication of cis-acting regulatory elements.

Do operons co-regulate related proteins?

Bacterial operons often consist of all of the genes that are required for accomplishing an entire pathway, presumably because a strong advantage is provided by the ability of such gene clusters to be transferred to other bacteria as a unit1. There is no such selection in animals, so it is not surprising that the C. elegans operons do not contain genes for an entire pathway, or for all of the polypeptides that interact to form a complex. Nonetheless, genes that encode proteins that function together do sometimes occur in the same operons. Numerous cases have been published and many more are obvious from a casual examination of the list of 900 operons18. BOX 3 shows several instances. For example, two subunits of the acetylcholine receptor channel are encoded in the same operon31, as are the two LIN-15 proteins that act together to specify vulval determination32,33.

BOX 4 lists numerous operons that contain components of the basic transcription machinery. Notably, there are several instances of components of the transcription machinery that are known to function together (for example, a TFIIIC subunit and a Pol III subunit in one case, and a Pol I subunit and a regulator of ribosomal RNA synthesis in another). The lists of operons with several genes that encode apparently functionally related proteins are intriguing. They indicate that the organism might gain some advantage from expressing related genes from the same promoter. In most cases, it is not clear what that advantage might be, although in some cases it is more obvious than in others. For example, co-expressing the two polypeptides that make up the acetylcholine receptor from a single promoter allows coordinated regulation of receptor levels. Similarly, co-expressing the genes for cyclophilin and protein disulphide isomerase (two proteins required for modifying collagen so that it can trimerize) allows these two enzymes, which work together, to be expressed in hypodermal cells only at the time and place that they are needed34.

Box 4 | Transcription operons

The examples shown were identified manually by examining the list of operons in REF. 18. Not all of the genes in each operon are shown, only those with obvious functional relationships.

set-2, set domain protein-2 (C26E6.9); RNA polymerase I, II and III subunit RPB-8 (F24F4.11)

SNAPc activator complex subunit (T02C12.2); TFIIIC subunit (T02C12.3)

lin-53/rba-2, chromatin remodelling protein (K07A1.12); rba-1, chromatin remodelling protein-1 (K07A1.11)

RNA polymerase III subunit (ZK856.10); TFIIIC subunit (ZK856.13)

RRS1 regulator of rRNA synthesis (C15H11.9); RNA polymerase I subunit (C15H11.8)

leo-1 subunit of a basic transcription complex (B0035.11); SNF2 DNA helicase chromatin remodelling factor, related to ISWI (F54E12.2)

Chromodomain protein (T09A5.8); MED10 mediator complex (T09A5.6)

C2H2 Zn-finger protein (F44E2.7); Neisseria pilin B transcription regulator (F44E2.6)

Histone acetyl transferase 2PHD Zn fingers (Y51H1A.4); Histone deacetylation (Y51H1A.5)

C2H2 Zn finger (C08B11.3); Histone deacetylase HDAC2 (C08B11.2)

It is worth noting that the frequencies of functionally similar genes that are co-expressed in operons are much higher than would be expected by chance, at least for some classes of genes, such as those that encode splicing proteins and mitochondrial proteins18. Nevertheless, it is also true that many operons contain genes that are not obviously related. This could be because many operons do not function to co-regulate related genes, or because we do not yet know enough about the functions of all genes to have discerned the relationships between them.

A second question is whether the genes in operons are in fact co-regulated. Although, on the one hand, it seems obvious that they must be, as they are expressed from the same promoter, on the other hand, it is well documented that genes in bacterial operons are often differentially regulated. This could occur in worms owing to different efficiencies of RNA processing and/or due to different mRNA stabilities. All of the above seem to come into effect in C. elegans operon regulation. First, it has recently been shown that a large fraction of the co-regulation of gene pairs in the same chromosomal domains is due to their presence together in operons35. Gene pairs that lie in operons are more likely to be co-regulated than are randomly chosen gene pairs, even those that lie in close proximity to each other. However, it is by no means true that because two genes are members of the same operon they are necessarily co-expressed.

We have carried out a preliminary survey of the co-expression of operon genes using the data of Hill et al.36. Although there are many operons that contain genes with nearly identical expression profiles, numerous counter-examples can be found. In particular, often the first gene in an operon has high mRNA levels throughout development, whereas the levels of the downstream gene mRNAs are far lower. This type of profile could result from failure to process, or even to transcribe, the mRNA from the downstream genes. There are also many cases in which two genes in the same operon have different expression profiles. These could result from differentially regulated mRNA stabilities. Although it seems wasteful to make a large amount of mRNA only to degrade it, the story of evolution includes many similar examples of apparently wasteful mechanisms. The ease with which such operons can be found raises the interesting possibility that the operons might have accumulated genes with products that were already regulated at some level other than transcription, such as mRNA or protein stability. If this were true, many of the operons might simply be expressed from a ubiquitously active promoter because the genes are regulated at some downstream step in gene expression.

Clearly, there is no general rule. There are highly regulated operons in which the genes are co-expressed; presumably, these are the most likely to have gene products with a functional relationship to one another. Conversely, there are operons that contain genes that are expressed ubiquitously; in others the mRNAs accumulate to different levels, and in others still the mRNA levels are similar. Finally, there are operons in which the mRNAs from different genes seem to accumulate with distinct profiles. The frequency with which the operons with these varied expression profiles occur has yet to be determined. Nonetheless, it might be worthwhile examining the operons that contain differentially regulated genes to search for examples of regulation by mRNA stability.

The operon database: a gene finding tool?

It is clear from the above discussion that many operons contain functionally related genes, but many others do not seem to do so. The same was true for bacterial operons: in many cases, operons that seemed difficult to understand were eventually found to make sense once functional information was available about the protein products of their constituent genes. Is the frequency of the related genes that are present in the same C. elegans operon sufficiently high to warrant the investigation of genes that are contained in an operon with a gene of interest? We suggest that it is. Numerous examples can be cited in which this would have proved a useful line of research had the operons been known. In one case, this method has, in fact, been successfully used as a tool to find a related gene: an ICln ion channel was found to be encoded in an operon with an apparently unrelated gene; when the function of this second gene was tested in mammalian cells, its product was found to modify the activity of the channel37. As more examples accumulate in which this line of investigation proves useful, the C. elegans operons could become a significant new tool for discovering previously unknown functional relationships between genes. Many human disease genes have homologues that are found in the operons of C. elegans, and the other genes in these operons could turn out to be functionally related to disease-causing genes18,38.

Conclusions

We now know that operons are a common form of gene organization in C. elegans, and they might turn out to be used by other nematodes and other species in which trans-splicing is a form of mRNA processing. However, it is unlikely that operons will be found in organisms that do not trans-splice, including vertebrates, arthropods, plants and fungi. Nevertheless, the fact that these gene groupings, now catalogued in C. elegans, contain so many homologues of the genes that are found elsewhere means that they can be potentially useful to researchers in their attempts to identify genes of related function. We still do not understand the selective pressures that have allowed the evolution of operons in nematodes. However, the fact that the operons seem to be composed primarily of genes that encode proteins that are components of the basic gene expression machinery and of the energy generation machinery indicates that these gene assemblies might have been selected to allow the easy global control of fundamental cellular processes. These genes might have been co-regulated anyway, but their presence in a co-transcribed cluster might make their regulation better coordinated and, at the same time, eliminate the need for duplication of cis-acting control elements.
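The comparison described above, between gene pairs that share an operon and other gene pairs, can be made concrete with a small calculation. The Python sketch below is illustrative only and is not the analysis used in the cited studies: the gene names, the expression values and the operon assignments are all invented, and Pearson correlation across developmental stages is assumed here as one simple measure of co-expression.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


# Hypothetical mRNA levels at five developmental stages (made-up numbers).
profiles = {
    "geneA": [10.0, 20.0, 40.0, 20.0, 10.0],
    "geneB": [5.0, 10.0, 20.0, 10.0, 5.0],    # same shape as geneA, lower level
    "geneC": [50.0, 50.0, 48.0, 51.0, 49.0],  # roughly constant
    "geneD": [30.0, 22.0, 14.0, 9.0, 4.0],    # declines through development
}

# Hypothetical operon membership: geneA/geneB in one operon, geneC/geneD in another.
operon_pairs = [("geneA", "geneB"), ("geneC", "geneD")]

# Every other pair of genes serves as the background for comparison.
other_pairs = [
    pair for pair in combinations(sorted(profiles), 2) if pair not in operon_pairs
]


def mean_pair_correlation(pairs):
    """Average Pearson correlation over a list of gene pairs."""
    return mean(pearson(profiles[a], profiles[b]) for a, b in pairs)


operon_score = mean_pair_correlation(operon_pairs)
background_score = mean_pair_correlation(other_pairs)

print(f"mean correlation, operon pairs: {operon_score:.2f}")
print(f"mean correlation, other pairs:  {background_score:.2f}")
```

In this toy data set the first operon pair has profiles of identical shape but different absolute levels, whereas the second pair is only weakly correlated, which mirrors the point that membership of the same operon does not guarantee co-expression. A real analysis would use measured expression data and a properly sampled set of control pairs.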

1. Lawrence, J. G. Shared strategies in gene organization among prokaryotes and eukaryotes. Cell 110, 407–413 (2002).

2. Langer, D. et al. Transcription in archaea: similarity to that in eucarya. Proc. Natl Acad. Sci. USA 92, 5768–5772 (1995).
One of several papers that report the same operon architecture in bacteria and archaea, especially for operons that contain components of the gene expression machinery.

3. Jacob, F. & Monod, J. On the regulation of gene activity. Cold Spring Harb. Symp. Quant. Biol. 26, 193–211 (1962).

4. Spieth, J. et al. Operons in C. elegans: polycistronic mRNA precursors are processed by trans-splicing of SL2 to downstream coding regions. Cell 73, 521–532 (1993).
The original report of operons in the C. elegans genome.

5. Blumenthal, T. & Steward, K. in C. elegans II (eds Riddle, D. et al.) 117–145 (Cold Spring Harbor Laboratory Press, New York, 1997).

6. Blumenthal, T. Trans-splicing and polycistronic transcription in Caenorhabditis elegans. Trends Genet. 11, 132–136 (1995).

7. Blumenthal, T. & Spieth, J. Gene structure and organization in Caenorhabditis elegans. Curr. Opin. Genet. Dev. 6, 692–698 (1996).

8. Huang, T. et al. Intercistronic region required for polycistronic pre-mRNA processing in Caenorhabditis elegans. Mol. Cell. Biol. 21, 1111–1120 (2001).

9. Sutton, R. E. & Boothroyd, J. C. Evidence for trans splicing in trypanosomes. Cell 47, 527–535 (1986).

10. Murphy, W. J., Watkins, K. P. & Agabian, N. Identification of a novel Y branch structure as an intermediate in trypanosome mRNA processing: evidence for trans splicing. Cell 47, 517–525 (1986).

11. Davis, R. E. Surprising diversity and distribution of spliced leader RNAs in flatworms. Mol. Biochem. Parasitol. 87, 29–48 (1997).

12. Stover, N. A. & Steele, R. E. Trans-spliced leader addition to mRNAs in a cnidarian. Proc. Natl Acad. Sci. USA 98, 5693–5698 (2001).

13. Vandenberghe, A. E., Meedel, T. H. & Hastings, K. E. mRNA 5′-leader trans-splicing in the chordates. Genes Dev. 15, 294–303 (2001).

14. Nilsen, T. W. Trans-splicing of nematode premessenger RNA. Annu. Rev. Microbiol. 47, 413–440 (1993).

15. Denker, J. A. et al. New components of the spliced leader RNP required for nematode trans-splicing. Nature 417, 667–670 (2002).

16. Nilsen, T. W. Evolutionary origin of SL-addition trans-splicing: still an enigma. Trends Genet. 17, 678–680 (2001).

17. Zorio, D. A. et al. Operons as a common form of chromosomal organization in C. elegans. Nature 372, 270–272 (1994).

18. Blumenthal, T. et al. A global analysis of Caenorhabditis elegans operons. Nature 417, 851–854 (2002).
The paper presents a microarray and cDNA analysis of the C. elegans genome to identify most of the operons. It also reports the co-regulation of functionally related genes through their presence in the same operon.

19. Kamath, R. S. et al. Systematic functional analysis of the C. elegans genome using RNAi. Nature 421, 231–237 (2003).
The authors analyse most of the C. elegans genes by RNAi, and divide them into functional categories on the basis of their knockdown phenotypes.

20. Evans, D. et al. Operons and SL2 trans-splicing exist in nematodes outside the genus Caenorhabditis. Proc. Natl Acad. Sci. USA 94, 9751–9756 (1997).


21. Muhich, M. L. & Boothroyd, J. C. Polycistronic transcripts in trypanosomes and their accumulation during heat shock: evidence for a precursor role in mRNA synthesis. Mol. Cell. Biol. 8, 3837–3846 (1988).

22. Davis, R. E. & Hodgson, S. Gene linkage and steady state RNAs suggest trans-splicing may be associated with a polycistronic transcript in Schistosoma mansoni. Mol. Biochem. Parasitol. 89, 25–39 (1997).

23. Blumenthal, T. Gene clusters and polycistronic transcription in eukaryotes. Bioessays 20, 480–487 (1998).

24. Dufourcq, P. et al. lir-2, lir-1 and lin-26 encode a new class of zinc-finger proteins and are organized in two overlapping operons both in Caenorhabditis elegans and in Caenorhabditis briggsae. Genetics 152, 221–235 (1999).

25. Zorio, D. A. & Blumenthal, T. U2AF35 is encoded by an essential gene clustered in an operon with RRM/cyclophilin in Caenorhabditis elegans. RNA 5, 487–494 (1999).

26. Mazroui, R., Puoti, A. & Kramer, A. Splicing factor SF1 from Drosophila and Caenorhabditis: presence of an N-terminal RS domain and requirement for viability. RNA 5, 1615–1631 (1999).

27. Zhou, Z. et al. Comprehensive proteomic analysis of the human spliceosome. Nature 419, 182–185 (2002).

28. Jurica, M. S. et al. Purification and characterization of native spliceosomes suitable for three-dimensional structural analysis. RNA 8, 426–439 (2002).

29. Rappsilber, J. et al. Large-scale proteomic analysis of the human spliceosome. Genome Res. 12, 1231–1245 (2002).

30. Nomura, M. Organization of bacterial genes for ribosomal components: studies using novel approaches. Cell 9, 633–644 (1976).
One of several papers reporting the co-expression of genes that encode RNA polymerase subunits and ribosomal proteins in the same bacterial operons.

31. Treinin, M. et al. Two functionally dependent acetylcholine subunits are encoded in a single Caenorhabditis elegans operon. Proc. Natl Acad. Sci. USA 95, 15492–15495 (1998).

32. Clark, S. G., Lu, X. & Horvitz, H. R. The Caenorhabditis elegans locus lin-15, a negative regulator of a tyrosine kinase signaling pathway, encodes two different proteins. Genetics 137, 987–997 (1994).

33. Huang, L. S., Tzou, P. & Sternberg, P. W. The lin-15 locus encodes two negative regulators of Caenorhabditis elegans vulval development. Mol. Biol. Cell 5, 395–411 (1994).

34. Page, A. P. Cyclophilin and protein disulfide isomerase genes are co-transcribed in a functionally related manner in Caenorhabditis elegans. DNA Cell Biol. 16, 1335–1343 (1997).

35. Lercher, M. J., Blumenthal, T. & Hurst, L. D. Co-expression of neighboring genes in Caenorhabditis elegans is mostly due to operons and duplicate genes. Genome Res. (in the press).

36. Hill, A. A. et al. Genomic analysis of gene expression in C. elegans. Science 290, 809–812 (2000).

37. Furst, J. et al. ICln ion channel splice variants in Caenorhabditis elegans: voltage dependence and interaction with an operon partner protein. J. Biol. Chem. 277, 4435–4445 (2002).
The first paper to use a C. elegans operon as a gene finding tool. It shows that one protein in the operon modifies the activity of an ion channel encoded in the operon when the two are co-expressed in a cell culture.

38. Nimmo, R. & Woollard, A. Widespread organisation of C. elegans genes into operons: fact or function? Bioessays 24, 983–987 (2002).

39. Andersen, J. S. et al. Directed proteomic analysis of the human nucleolus. Curr. Biol. 12, 1–11 (2002).

Acknowledgements

We are grateful to P. MacMorris for comments on the manuscript, to J. Ahringer for the communication of results before publication, and to A. Skop, K. Howell, S. Adam, A. Page, A. van Hoof and W. Harper for providing gene lists. The authors are supported by the National Institute of General Medical Sciences.

Online links

DATABASES
The following terms in this article are linked online to:
FlyBase: http://flybase.bio.indiana.edu
Adh | Adhr | stoned
LocusLink: http://www.ncbi.nlm.nih.gov/LocusLink
GDF1 | LASS1
WormBase: http://www.wormbase.org
C14B1.7 | C14B1.8 | F43C1.3 | gpd-1 | gpd-2 | gpd-3 | gst-1 | mai-1 | R107.6 | T08A11.2 | T09F3.4 | Y47G6A.24 | Y47G6A.25

FURTHER INFORMATION
Steve Mount’s lab: http://www.wam.umd.edu/~smount/DmrNAfactors/table.html