the persistent contributions of rna to eukaryotic gen(om)e

28
The Persistent Contributions of RNA to Eukaryotic Gen(om)e Architecture and Cellular Function Ju ¨ rgen Brosius Institute of Experimental Pathology (ZMBE), University of Mu ¨nster, D-48149 Mu ¨nster, Germany Correspondence: [email protected] Currently, the best scenario for earliest forms of life is based on RNA molecules as they have the proven ability to catalyze enzymatic reactions and harbor genetic information. Evolu- tionary principles valid today become apparent in such models already. Furthermore, many features of eukaryotic genome architecture might have their origins in an RNA or RNA/ protein (RNP) world, including the onset of a further transition, when DNA replaced RNA as the genetic bookkeeper of the cell. Chromosome maintenance, splicing, and regulatory function via RNA may be deeply rooted in the RNA/RNP worlds. Mostly in eukaryotes, conversion from RNA to DNA is still ongoing, which greatly impacts the plasticity of extant genomes. Raw material for novel genes encoding protein or RNA, or parts of genes including regulatory elements that selection can act on, continues to enter the evolutionary lottery. Everything has been said already, but not yet by every- one. —Karl Valentin Sturgeon’s Revelation: Ninety percent of science fiction is crud, but then, ninety percent of everything is crud. —Theodore Sturgeon They think that intelligence is about noticing things that are relevant (detecting patterns); in a complex world, intelligence consists in ignoring things that are irrelevant (avoiding false patterns). —Nassim Nicholas Taleb(Taleb 2010) O f all extant cellular macromolecules, RNA is the most ancient, persisting as much as 4 10 9 years in our planet’s life-forms. The abil- ity to combine genotype with phenotype such as catalytic activity (Noller and Chaires 1972; Kruger et al. 1982; Guerrier-Takada et al. 1983; Noller et al. 1992) leveled a major hurdle in understanding the origin of life. The salient discoveries eliminated the virtually impossible prerequisite for two to three different classes of macromolecules to converge as an evolving unit. At the same time, RNA provides a required con- tinuity in the path of evolution (Yarus 2011) during various genetic takeovers or evolution- ary transitions (Cairns-Smith 1982; Szathma ´ry and Smith 1995). In a remarkably insightful ar- ticle dating back half a century, Alex Rich fore- saw much of what now is becoming main- stream, for example, that RNAwas ancestral to protein and DNA (Rich 1962). This landmark publication received little attention over the years; even early proponents of an RNA world Editors: Patrick J. Keeling and Eugene V. Koonin Additional Perspectives on The Origin and Evolution of Eukaryotes available at www.cshperspectives.org Copyright # 2014 Cold Spring Harbor Laboratory Press; all rights reserved; doi: 10.1101/cshperspect.a016089 Cite this article as Cold Spring Harb Perspect Biol 2014;6:a016089 1 on June 27, 2022 - Published by Cold Spring Harbor Laboratory Press http://cshperspectives.cshlp.org/ Downloaded from

Upload: others

Post on 27-Jun-2022

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: The Persistent Contributions of RNA to Eukaryotic Gen(om)e

The Persistent Contributions of RNAto Eukaryotic Gen(om)e Architectureand Cellular Function

Jurgen Brosius

Institute of Experimental Pathology (ZMBE), University of Munster, D-48149 Munster, Germany

Correspondence: [email protected]

Currently, the best scenario for earliest forms of life is based on RNA molecules as they havethe proven ability to catalyze enzymatic reactions and harbor genetic information. Evolu-tionary principles valid today become apparent in such models already. Furthermore, manyfeatures of eukaryotic genome architecture might have their origins in an RNA or RNA/protein (RNP) world, including the onset of a further transition, when DNA replaced RNAas the genetic bookkeeper of the cell. Chromosome maintenance, splicing, and regulatoryfunction via RNA may be deeply rooted in the RNA/RNP worlds. Mostly in eukaryotes,conversion from RNA to DNA is still ongoing, which greatly impacts the plasticity of extantgenomes. Raw material for novel genes encoding protein or RNA, or parts of genes includingregulatory elements that selection can act on, continues to enter the evolutionary lottery.

Everything has been said already, but not yet by every-one.

—Karl Valentin

Sturgeon’s Revelation: Ninety percent of science fictionis crud, but then, ninety percent of everything is crud.

—Theodore Sturgeon

They think that intelligence is about noticing thingsthat are relevant (detecting patterns); in a complexworld, intelligence consists in ignoring things that areirrelevant (avoiding false patterns).

—Nassim Nicholas Taleb (Taleb 2010)

Of all extant cellular macromolecules, RNAis the most ancient, persisting as much as 4

� 109 years in our planet’s life-forms. The abil-ity to combine genotype with phenotype suchas catalytic activity (Noller and Chaires 1972;

Kruger et al. 1982; Guerrier-Takada et al. 1983;Noller et al. 1992) leveled a major hurdle inunderstanding the origin of life. The salientdiscoveries eliminated the virtually impossibleprerequisite for two to three different classes ofmacromolecules to converge as an evolving unit.At the same time, RNA provides a required con-tinuity in the path of evolution (Yarus 2011)during various genetic takeovers or evolution-ary transitions (Cairns-Smith 1982; Szathmaryand Smith 1995). In a remarkably insightful ar-ticle dating back half a century, Alex Rich fore-saw much of what now is becoming main-stream, for example, that RNA was ancestral toprotein and DNA (Rich 1962). This landmarkpublication received little attention over theyears; even early proponents of an RNA world

Editors: Patrick J. Keeling and Eugene V. Koonin

Additional Perspectives on The Origin and Evolution of Eukaryotes available at www.cshperspectives.org

Copyright # 2014 Cold Spring Harbor Laboratory Press; all rights reserved; doi: 10.1101/cshperspect.a016089

Cite this article as Cold Spring Harb Perspect Biol 2014;6:a016089

1

on June 27, 2022 - Published by Cold Spring Harbor Laboratory Press http://cshperspectives.cshlp.org/Downloaded from

Page 2: The Persistent Contributions of RNA to Eukaryotic Gen(om)e

did not refer to this article (Woese 1967; Crick1968; Orgel 1968; Gilbert 1986), although atleast one of the investigators must have hadknowledge about the article, as it was cited ina different context concerning the stereochem-ical possibility of six distinct base pairs (Crick1968). The origin of the DNA genome fromRNA and that “DNA may be regarded as a de-rivative molecule which has evolved in the formthat it only carries out part of the primitive nu-cleic acid function” is another correct prediction(Rich 1962). Furthermore, the investigator pre-saged mechanisms such as antisense RNA con-trol of gene expression, short interfering RNAs(siRNAs), and perhaps microRNAs (miRNAs):“If both strands are active, then the DNAwouldproduce two RNA strands which are comple-mentary to each other. Only one of these mightbe active in protein synthesis, and the otherstrand might be a component of the control orregulatory signal” (Rich 1962).

In this article, I shall present the rise andpersistence of RNA from the dawn of an RNAworld and discuss current evolutionary princi-ples already apparent in an RNAworld. In com-parison to Archaea and Bacteria, the eukaryoticgenome is a better vantage point, as archaeal andbacterial genomes are more derived and, thus,lost many of the RNA signatures that eukaryotesstill show. It is likely that eukaryotic DNA ge-nomes not only kept much more of their RNA/RNP world heritage than previously anticipat-ed, but also continue to evolve novel RNAs invarious functional roles.

WHEN DOES LIFE BEGIN?

An excellent treatise of possible scenarios lead-ing to and continuing in an RNA world to thelast universal common ancestor (LUCA) isavailable (Atkins et al. 2011). Can the beginningof life be defined along the transitions fromphysicochemical to biological reactions? Likealmost everything in biology, clear boundariesare difficult to demarcate and thus the defini-tion of the first life-form rather occupies abandwidth on a continuum. One of several pos-sible thresholds to consider would be the fortu-itous generation of one or two molecules that

could replicate themselves or each other. Shouldthe threshold be set at the transition when themolecules involved could change during repli-cation and the variants are subjected to selec-tion—the initial Darwinian ancestor (IDA)(Szathmary 2006; Yarus 2011)? The first self-replicating macromolecules must not necessar-ily have been RNA. Derivatives of RNA, espe-cially with altered backbones, have been sug-gested as predecessors of RNA owing to morefavorable chemistries/stabilities for spontane-ous generation and persistence of oligomeriza-tion at the presumed planetary conditions(Joyce et al. 1987; Schoning et al. 2000; Zhanget al. 2005; Powner et al. 2009; Robertson andJoyce 2012; Neveu et al. 2013).

An interesting question is, if it is that “sim-ple,” why did life not evolve multiple times?There are a number of explanations. The earlyenvironment of the planet with conditions fa-voring the necessary chemical reactions differedfrom the more temperate conditions now (Rob-ertson and Joyce 2012). Perhaps life did evolvebefore LUCA numerous times independently,but the descendants of LUCA are the only sur-vivors. Perhaps primitive forms of life, for ex-ample, in the form of IDAs, still do arise, but weare not aware of them, in part, because we havenot searched for such simple and different life-forms. Another reason is that a nascent form oflife would easily be outcompeted by the estab-lished ones, as the latter had a great chronolog-ical advantage adapting to current conditions.New forms of life might have a chance only iftheir metabolism is sufficiently different and,thus, not useful prey to LUCA-related life-formsor if they happened to evolve in an unoccupiedniche so as not to succumb to immediate pre-dation by the fitter “incumbents.”1

RETRACING THE PATH

In any event, by applying in vitro synthesis andselection procedures (Ellington and Szostak

1An afterthought worthy of note is the dichotomy with re-spect to life’s fragility considering individuals, even speciesversus the resilience of life as a whole, over an �4-billion-year timescale.

J. Brosius

2 Cite this article as Cold Spring Harb Perspect Biol 2014;6:a016089

on June 27, 2022 - Published by Cold Spring Harbor Laboratory Press http://cshperspectives.cshlp.org/Downloaded from

Page 3: The Persistent Contributions of RNA to Eukaryotic Gen(om)e

1990; Tuerk and Gold 1990; Gold et al. 2012),several laboratories are making great strides to-ward generating RNA molecules with the abilityto self-replicate (Doudna and Szostak 1989;Johnston et al. 2001; Zaher and Unrau 2007;Lincoln and Joyce 2009; Shechner et al. 2009;Wochner et al. 2011; Attwater et al. 2013; Mastet al. 2013), although these RNA polymeraseribozymes still fail to completely self-replicate(Deamer 2005). Cooperation of two or moreRNA enzymes in hypercycles (Eigen and Schus-ter 1977) may be a solution to this problem(Vaidya et al. 2012) (see also below).

Once a self-replicating ribozyme (mono- ormultimeric) arose with the further potential toevolve into a replicator not restricted to onlyself-copy but to copy other RNA templates aswell, a prerequisite for a metabolically self-suf-ficient RNA conglomerate, further challengesare apparent. First, the replicator indiscrimi-nately copying any “junk RNA” in the mix hard-ly would be able to persist. Second, if furtherRNA molecules would arise by copying witherrors—just like in extant organisms, new genesstill arise by duplication and variation—toeventually take over metabolic functions otherthan replication (e.g., activated compounds, in-cluding nucleotides), these associations wouldbe fleeting, at best, because of diffusion.

SEQUESTRA, AMPLIFICA, DIVIDEET IMPERA!

A compartmentalization of cellular constituentsin droplets, as suggested by Carl Woese (1979),stabilized by simple fatty acids (Szostak et al.2001) was an early evolutionary transition(Maynard Smith and Szathmary 1995) crucialfor the continuation of life. Recent experimentsshowed that such bilayered partitions were suf-ficiently permeable for uptake of small mole-cules from the environment and copying nucle-ic acids in such vesicles is possible (Sacerdoteand Szostak 2005; Mansy et al. 2008; Adamalaand Szostak 2013). This sequestration was a pre-requisite for cellular life and evolution as weknow it today, with many evolutionary princi-ples in place already (Brosius 2003c). Vesiclescould grow along with their RNA contents, di-

vide, fuse while shuffling their contents and di-vide again, in other words, performing sexualacts. Another “forecast” of the mechanisms gen-erating genomic diversity would be recombina-tion not only between cells, but also betweendifferent RNA molecules, a mechanism that ac-tually had been observed in a two-componentribozyme system (Lincoln and Joyce 2009). Theorigin of viruses could date to this early stage ofcellular evolution as well. RNA molecules could,perhaps protected by a lipid envelope, movefrom cell to cell, blurring the line between infec-tion and horizontal transfer.

MOST EVOLUTIONARY PRINCIPLES ARE ATLEAST AS ANCIENT AS THE RNA WORLD

Lessons from the RNA world apply remarkablywell to extant organisms and their genomes.In a primitive RNA cell, conflict and coopera-tion, selfishness and altruism had to coexist andestablish a fine balance. Importantly, the suc-cess of individual ribozymes also depended toa large degree on functional interactions withother cellular RNAs (today: gene products),namely, the (genetic) background of the proto-cell (Brosius 2003c). In contemporary biologyand medicine, considerations of interactions ofvarious alleles within the genomic background,as well as individual variability of gene expres-sion levels including complete gene depletions,are beginning to gain wider acceptance (Wil-liams and Nesse 1991; Sibilia and Wagner 1995;Brosius 2003b; Ganten and Nesse 2012; Nesseet al. 2012).

Balances between selfishness and cooper-ation had to evolve early; this is recently docu-mented by in vitro selection research thatrevealed in an experiment to generate Tetrahy-mena group I ribozymes with improved DNAcleavage capability that one of the two RNAaptamers, itself being catalytically inactive, par-ticipated in a productive intermolecular in-teraction with an active ribozyme evolving inparallel, thus, ensuring the survival of bothRNAs in the nucleic acid population (Hanczycand Dorit 1998). Other studies also show thatparasites can cause the evolution of furthercomplexity (Takeuchi and Hogeweg 2008). In

RNA and Eukaryotic Gene/Genome Architecture

Cite this article as Cold Spring Harb Perspect Biol 2014;6:a016089 3

on June 27, 2022 - Published by Cold Spring Harbor Laboratory Press http://cshperspectives.cshlp.org/Downloaded from

Page 4: The Persistent Contributions of RNA to Eukaryotic Gen(om)e

the case of the bacterially derived AzoD ri-bozyme, a clonal preparation showed no activ-ity toward the phosphorothioate substrate. Pre-sumably, this sequence alone fails to show afunctional fold, but could form an active com-plex in an intermolecular partnership withother RNA molecules (Hayden et al. 2011). Re-cently, it was shown that cooperative cycles ofreplication involving three or four participatingRNAs have a selective advantage over selfishreplication cycles (Vaidya et al. 2012), as for-mulated earlier in the hypercycle principle ofnatural self-organization (Eigen and Schuster1978) and placed into context with extant or-ganisms (Brosius 2003c). The participation ofseveral small RNAs also has the potential to sig-nificantly increase the complexity of ribozymes.

CHROMOSOMAL RNA

The dispensation of essential RNA moleculesbetween daughter cells was initially stochastic,probably facilitated by the availability of suffi-cient copies of each RNA species in the parentcell, such that each daughter would have a rea-sonable chance to end up with at least onecomplete set of RNAs. If initially a few dif-ferent RNAs were strung together as “miniRNA chromosomes,” the advantage would beachievement of a more balanced distributionof RNAs. Perhaps they were distributed andreplicated as such, with some of them cut upinto functional RNAs, conceivably the birth ofRNA processing. Processing signals might havebeen placed, in part, between the fragmentscorresponding to mature RNAs, presumablythe origin of intergenic or even intronic space(Brosius 2003c). With the advent of templatedprotein biosynthesis, a major evolutionary tran-sition included co-option of existing functionalRNAs as well as longer “chromosomal RNAs”by RNA cutting and pasting to stitch togethermessenger RNAs (mRNAs) with open and in-creasingly longer reading frames, which mightsuggest a very early origin of RNA splicing in aribonucleoprotein world (RNP world) (Rean-ney 1974; Darnell and Doolittle 1986). Alterna-tive splicing and other rearrangements wouldbe one of the mechanisms to enhance variation

for generating translation products out of alimited repertoire of functional RNAs alsodoubling as templates for translation (Brosius2001).

RNA SIGNATURES WRITTEN ALL OVEREXTANT DNA GENOMES

Linear arrangement of RNA genes, as well as thetransition to the RNP world evolving an ances-tor of the reverse transcriptase enzyme, hap-pened to constitute a useful precondition for anext major evolutionary transition, namely, theconversion of RNA to DNA, the latter merelyserving as bookkeeper (Darnell and Doolittle1986; Gould 2002). For this and other reasons,the central dogma of biology could be revisitedor supplemented by a different graphic account,both in appreciation of the major significance ofRNA in the cell and the chronological order ofthe major transitions (Fig. 1) (Maizels and Wei-ner 1987; Brosius 2003a; Cech 2012). Remark-ably, the process of reverse transcription of anyRNA (Brosius 1999b) and more or less randomintegration into DNA genomes still persists andcontributes much to their landscapes in a num-ber of ways. The modular arrangement of func-tional with nonfunctional DNA and its plastic-ity in extant eukaryotic genomes might be avestige of early evolutionary transitions (Bro-sius 2009).

Reverse transcriptase not only played an im-portant role in generating the DNA genomebut continues to be essential for chromosomemaintenance via the action of telomerase. Theextant enzyme synthesizes telomeres that serveas protective caps of chromosome ends, thuscounteracting their shortening during replica-tion cycles. Telomerase is a complex betweenRNA and protein in which the RNA serves astemplate and protein as reverse transcriptase(Greider and Blackburn 1989; Blackburn et al.2006; Blackburn and Collins 2011). Group IIintrons, when transcribed as RNA are self-splic-ing with the activity residing on the RNA at highsalt conditions in vitro (van der Veen et al. 1986).For in vivo activity and integration at new ge-nomic sites in a process termed reverse splicing,a reverse transcriptase encoded in the intron

J. Brosius

4 Cite this article as Cold Spring Harb Perspect Biol 2014;6:a016089

on June 27, 2022 - Published by Cold Spring Harbor Laboratory Press http://cshperspectives.cshlp.org/Downloaded from

Page 5: The Persistent Contributions of RNA to Eukaryotic Gen(om)e

is necessary (Lambowitz and Zimmerly 2011).The ribozyme has been proposed to be ancestralto non-LTR (long terminal repeat) retroposonsas well as the spliceosome attributable to simi-larities of their reverse transcriptase and somesmall nuclear RNA (snRNA) components ofthe spliceosome, including aspects of the splicemechanism itself, respectively (Xiong and Eick-bush 1990; Guthrie 1991; Sharp 1991; Fica et al.2013). The rapid spread of introns in eukaryoteshas been ascribed to group II introns after theywere imported by endosymbionts (Cavalier-Smith 1991), perhaps necessitating the separa-tion of slow mRNA production in the nucleusfrom the fast translation in the cytoplasm, one ofthe hallmarks of the eukaryotic cell (Martin andKoonin 2006).2

Importantly, the process of continuouslyconverting RNA to DNA and its genomic inte-gration has the potential to grossly inflate ge-nomes with neutrally evolving material and, inconjunction with larger deletion of segments byrecombination, leads to a high turnover rate ofsequences on an evolutionary time scale, oncemore calling into question the ENCODE claimthat �80% of the human genome is functional(Doolittle 2013; Graur et al. 2013; Niu and Jiang2013). Generally, specific retroposons3 becomeactive in certain lineages, and reverse transcriptsfrom one or several master gene transcripts,such as LINEs (long interspersed elements), au-tonomous as they harbor gene encoding theretroposition machinery such as reverse tran-scriptase and nonautonomous SINEs (short in-terspersed elements), such as Alu or B1 ele-ments. Nonautonomous retroposons rely onthe machinery of the autonomous retroposons.

Central dogma of molecular biology Revisited

DNA Causality

RNA Mediating

Protein Execution

RNA Causality,execution

(Protein)

DNABookkeeping

(Protein)

Protein Execution

(Protein) (Protein)

(Protein) (RNA)

(Protein) (RNA)

(Protein)

(Protein)

Figure 1. Alternative to the central dogma of molecular biology. The left part depicts the original central dogmaof molecular biology with several adjustments incorporated, for example, the discovery of reverse transcription(thin upward arrow) (Crick 1958, 1970). The grouping on the right better reflects the evolutionary transitionsand primacy of RNA. RNA could be replicated directly (semicircular arrow), albeit in extant organisms (e.g.,plants, viruses) only with the catalytic activity of protein (RNA-dependent RNA polymerase). In this scheme,catalysis is indicated by (RNA) and or (protein) in parentheses. The major significance of RNAs for peptidyltransferase activity during translation (Noller 2012) is represented by the larger font for (RNA) compared to(Protein). Execution includes structural, catalytic, and regulatory tasks in the cell. The evolutionary develop-ments underscore Stephen Jay Gould’s view that DNA merely is the agent of bookkeeping (Gould 2002). RNAsused to be bookkeepers as well, but remained agents of causality (Brosius 2005a).

2As mentioned above, some of the intron gain could dateback as far as the RNA world, or at least back to LUCA,because type II introns are present in bacteria (Ferat andMichel 1993) and in the more derived bacterial and archaealgenomes, introns could have been lost (Poole et al. 1999).Massive intron loss would be reminiscent of a relativelyrecent purge of most introns in the yeast S. cerevisiae viaretroposition (Fink 1987).

3A general term for retroposed sequences as well as DNAtransposons is “transposable” elements (TEs). However, be-cause following retroposition most elements are not able totranspose any longer attributable to truncations, lack oftranscription, etc., the writer prefers “transposed” elements.

RNA and Eukaryotic Gene/Genome Architecture

Cite this article as Cold Spring Harb Perspect Biol 2014;6:a016089 5

on June 27, 2022 - Published by Cold Spring Harbor Laboratory Press http://cshperspectives.cshlp.org/Downloaded from

Page 6: The Persistent Contributions of RNA to Eukaryotic Gen(om)e

Most retrocopies are not active as they are tran-scriptionally silent because of the lack of inter-nal (e.g., truncation of LINEs) or external pro-moters in SINEs. Intact and transcribed mastergenes can spawn, over a few tens or hundreds ofmillion years, more than one million copies, asis the case for Alu elements (Weiner 2006). Witha few exceptions (see below), there is no selectivepressure on such elements and, accordingly,they deviate over time from the consensus se-quence. The human genome has been estimatedto contain �43% retroposons, a relatively smallamount of DNA transposons (�3%), and aslittle as 5% conserved sequences (Lindblad-Toh et al. 2011; Ward and Kellis 2012) that po-tentially have function (Fig. 2).4 As argued be-fore, conversion of RNA to DNA is an ancientprocess and it has been suggested that mostDNA in the human genome has been derived

by a virtually unabated bombardment of retro-posons (Brosius 1999a). Over time, these ele-ments are blurring into oblivion as randomizedsequences by incessant changes, such as pointmutations and small indels. Recently, it wasconfirmed by more sensitive computationalstrategies involving P-clouds, that almost 70%of the human genome may harbor repetitiveelements (de Koning et al. 2011). Extrapolatingback to or forward from the origin of RNA !DNA genome transition, there is no reason whythis figure, except for the generally lesser con-tributions of DNA transposons, should notapproach almost 100% (Brosius 1999a). Thefact that essential sequences are interspersedwith nonessential sequences as landing padsfor transposed elements lessening their detri-mental impact genome-wide could be chalkedup to the ENCODE project (Bernstein et al.2012), as well as adherents to intelligent design,creationists, and the like as “functional,” whichwould cleanse our species from the blemish ofliving with �80%–90% junk DNA in our ge-

or

or

or

or

Figure 2. Exaptation of a new gene module at any stage of transposed element (TE) deterioration. In thisexample, a SINE, such as Alu, B1, identifier (ID), or mammalian-wide interspersed repeat (MIR), is retroposedinto the first intron of a gene (introns depicted by thin black lines and the SINE as wide yellow bar). The geneharbors three exons (in blue). Open reading frames (ORFs) are shown as wide bars, whereas the 50-UTR(untranslated region) and 30-UTR are shown as narrower blue bars, in the terminal exons, respectively. Thetranscription promoter is depicted in red. The four top representations indicate the gradual decay of the SINEelement by continuous mutation fading from yellow to white; thus, blending into the other anonymous sequenceof the intron. At any of those stages, part of the TE can be exapted as novel exon (framed in black). Sometimespart of a discernible TE and adjacent anonymous intron can be exapted as novel gene module (bottom).

4See discussion points below that absence or low levels ofconservation must not necessarily rule out function and viceversa.

J. Brosius

6 Cite this article as Cold Spring Harb Perspect Biol 2014;6:a016089

on June 27, 2022 - Published by Cold Spring Harbor Laboratory Press http://cshperspectives.cshlp.org/Downloaded from

Page 7: The Persistent Contributions of RNA to Eukaryotic Gen(om)e

nomes (Doolittle 2013; Graur et al. 2013; Niuand Jiang 2013).

YESTERDAY’S JUNK COULD BECOMETOMORROW’S NOVEL GENE MODULE,IF ONLY TEMPORARILY SO

The lack of purifying selection concerning non-harmful TEs, including mRNA-derived retro-copies, should not divert from the fact that,occasionally, such sequences can be exaptedas genetic novelties (Brosius 1991; Brosius andGould 1992). Gene duplication as a means ofgenerating novel genes has been realized for along time (Haldane 1933; Muller 1935; Bridges1936; Lewis 1951; Stephens 1951; Nei 1969).Gradual change from a duplicated gene, viathose encoding isoforms, up to the acquisitionof novel functions including subfunctionali-zation (Lynch and Force 2000) is now well doc-umented (Roth et al. 2007; Kaessmann 2010;Chen et al. 2013). Gene amplification can occurvia segmental duplication (Bailey and Eichler2006) or retroposition (Brosius 1999a; Ba-bushok et al. 2007) with a lower “success rate”for the latter. Retroposition of a copy of themature mRNA requires the fortuitous presenceof a promoter element upstream, which, as apotential benefit, could immediately alter regu-lation of the retrogene in comparison to the par-ent gene (Brosius 1999b). Genes, chiefly thoseencoded by endogenous retroviruses, were inde-pendently exapted or domesticated numeroustimes into novel functions, some of them aremeanwhile essential for procreation or survival(Volff 2006, 2009). Also, DNA copies of nonret-roviral RNA genomes, or parts thereof, can beintegrated into genomes (Koonin 2010). In fun-gi, exaptations of such genes were reported (Tay-lor and Bruenn 2009), underscoring once morethe notion that any RNA can be a template forthe retroposition machinery (Brosius 1999b).

Can novel genes arise de novo from previ-ously gene-free neutrally evolving genome re-gions? Despite some false positives (Monte etal. 1997; Kriegs et al. 2005), recruitment of entireprotein-coding genes out of neutrally evolvingsequences does occur (Long et al. 2003, 2013;Heinen et al. 2009; Kaessmann 2010; Carvunis

et al. 2012; Murphy and McLysaght 2012; Nemeand Tautz 2013). Recently, it has been empha-sized that many long non-protein-coding RNAsoriginated from TEs (Kapusta et al. 2013). ThisRNA class is discussed in more detail below. Inany event, it is much more common that novelgene modules are being added to existing genes.Such co-opted or exapted modules can be de-rived from inter- or intragenic space and consti-tute novel protein-coding exons and regulatoryregions such as promoters and enhancers (Bro-sius 2005b, 2009; Baertsch et al. 2008; Rebolloet al. 2012).5 A prominent case is the exonizationof parts of Alu elements, usually as alternativelyspliced exons (Makalowski et al. 1994; Nekru-tenko and Li 2001; Sorek et al. 2002; Lev-Maoret al. 2003; Krull et al. 2005; Shen et al. 2011).Many such events are, over evolutionary time,not stable. That is, they persist for a certain timeif they are not or only slightly detrimental,especially when the novel splice product onlyconstitutes a fraction of the functional RNA.Should, over time, the novel splice variant hap-pen to become beneficial, it will be under puri-fying selection and, by point mutations in andaround splice signals, its ratio, in comparison tothe canonical splice form, might change in itsfavor. In a phylogenetic study on primates, it hasbeen shown that such exons derived from Aluelements are lost at a high rate in the trial anderror mode, typical for the evolution of novelties(Krull et al. 2005). A similar study involvingall mammals and the older MIRs revealed thatmany of the events were already fixed and cur-rently are under negative selection. The studyalso shows, in the case of a relatively recent exo-nization of an ancient MIR element, that exap-tation can occur at any stage of TE decay (Fig. 3),and consequently alsowith any nondescript ran-domized DNA (Krull et al. 2007). Recently, amechanism was described by which a snoRNA

5Even modules that reduce or destroy the activity of a tar-geted gene and/or its product can be beneficial to the host.Xmrk is an epidermal growth factor receptor-related onco-gene in certain Xiphophorus fish hybrids and, when overex-pressed, leads to melanoma. Insertion of an autonomousnon-LTR retrotransposon disrupted and deactivated Xmrk.As a consequence, the individuals harboring this insertiondo not develop tumors (Schartl et al. 1999).

RNA and Eukaryotic Gene/Genome Architecture

Cite this article as Cold Spring Harb Perspect Biol 2014;6:a016089 7

on June 27, 2022 - Published by Cold Spring Harbor Laboratory Press http://cshperspectives.cshlp.org/Downloaded from

Page 8: The Persistent Contributions of RNA to Eukaryotic Gen(om)e

(small nucleolar RNA) extended its RNA-cod-ing region by alternative processing of its 30 endby introducing a single-point mutation near thesite important for processing, generating boththe canonical snoRNA and an extended variant(Mo et al. 2013). This event, detected in a clusterof rat Snord115 snoRNA genes in the imprintedPrader-Willi syndrome locus, but not in mouse,must be relatively recent. The additional geneproduct, L-Snord115, has most likely not yetacquired a function. The odds are that thissnoRNA variant will not survive another dozenor so million years.

ALL WIRED UP ON REWIRING

Examples for TE contributions to gene expres-sion by providing promoter or enhancer mod-ules have been known for some time and areample (Jordan et al. 2003; van de Lagemaatet al. 2003; Medstrand et al. 2005). Interestingly,TEs that were conserved at unusually high levelsover hundreds of millions of years were reportedto act as enhancers (Bejerano et al. 2006; San-tangelo et al. 2007; Sasaki et al. 2008; Lindblad-Toh et al. 2011; Lowe and Haussler 2012). Re-cently, a number of publications proposed thatthe spreading of copies from active TE classes

can lead to rapid rewiring, affecting hundreds ofgenes whose expression is being altered as a con-sequence, and is caused by the action of addi-tional transcription factors binding to those en-hancers (Wang et al. 2007; Bourque et al. 2008;Feschotte 2008; Xie et al. 2010; Lynch et al. 2011;Rebollo et al. 2012). This might even involve Aluelements in the form of Alu-derived miRNAs ornon-miRNAs (Du et al. 2013; Hoffman et al.2013; Liang and Yeh 2013; Mandal et al. 2013;Spengler et al. 2014).

Nevertheless, caution should be applied forthe following reasons:

(1) Four ultraconserved elements were de-leted in mice. Although there is no indicationthat these elements show immediate ancestryto TEs, the lesson equally applies to TE-derivedenhancers. In this study, enhancer elementsadjacent to genes whose protein-coding exons,when deleted, showed clear phenotypes inmice were chosen for deletion and the elementsalso were known to function as enhancers. Inall four cases, the mice were not only viableand fertile, but also failed to reveal any obviousphenotype among the many parameters testedin the laboratory (Ahituv et al. 2007). Experi-mentally, it appears to be much easier to obtaina positive expression result with reporter con-

mRNAs (including UTRs)RNA genesCNEsOther regulatory elementsRetroderived TEsAnonymousDNA TEs

mRNAs (including UTRs)RNA genesCNEsOther regulatory elementsRetroderived TEsAnonymousDNA TEs

Figure 3. The functional versus nonfunctional human genome. The pies represent the various segments of thegenome applying more conservative estimates. mRNAs (�1.3% accounting for ORFs) including their 50- and30-UTRs (�0.8%) occupy slightly .2% of the human genome. All other RNAs (non-protein-coding RNAs) areestimated to cover about 5% (green), maximally 10%. This figure is under debate. Likewise, there is a conser-vative estimate for regulatory elements of �8%. Other sources argue up to 20% (Bernstein et al. 2012; Shen et al.2012). Because conserved non-protein-coding DNA elements (CNEs, �2%, purple) often encode cis-regulatoryelements (Hiller et al. 2012), the other regulatory elements are shown as 6% only (blue). TEs derived from RNAintermediates contribute �43% (yellow) and DNA transposons only 3% (brown). The remainder (39%)corresponds to anonymous, scrambled sequences (white). The estimated 26% introns (Bernstein et al. 2012)occupy mostly the yellow and white segments; intronic regulatory elements and encoded RNAs (e.g., snoRNAs,miRNAs) are accounted for in the respective segments. The pie on the right is identical, except that theretroposons (yellow) and, to a lesser extent, the DNA transposons (brown), blend into the anonymous sequence(white) to reflect TE ancestry of the nondescript sequences as well.

J. Brosius

8 Cite this article as Cold Spring Harb Perspect Biol 2014;6:a016089

on June 27, 2022 - Published by Cold Spring Harbor Laboratory Press http://cshperspectives.cshlp.org/Downloaded from

Page 9: The Persistent Contributions of RNA to Eukaryotic Gen(om)e

structs than a phenotype by enhancer ablation(Nelson and Wardle 2013).

(2) Should a newly acquired or activated TEclass with “ready-to-use” enhancer activity pop-ulate a genome in high numbers, it might notsmoothly rewire expression of a set of genes ina functionally viable manner, but simply wreakhavoc in a genome by juxtaposition of newenhancers to resident genes. Such eruptions ofTEs would be expected to have at least as manydetrimental effects than beneficial ones. A lessradical scenario would be the following: TEscarry sequences resembling transcription factorbinding sites (TFBS) or other regulatory regionsthat pending minor mutations have the poten-tial to become functional.

(3) Enhancers usually harbor clusters of ho-motypic or heterotypic TFBS and therefore it isless likely that most TEs are instantly functional.Enhancers usually act in a modular fashion andrather than changing expression patterns, theyadd a cell-type/tissue or a developmental win-dow to existing expression patterns. After a pe-riod of testing ( just as in the case of exonization;see above), this or other modules might get lost(adding orexchanging afew mains, yes; rewiring,hardly). For a recent comprehensive assessmentof these problems, see de Souza et al. (2013).

TE FUNCTIONS: TO THE MOON!

TEs are insertional mutagens and often enough,integration near or into functional modules ofthe genome is disadvantageous and, for exam-ple, can cause disease (Chen et al. 2005; Callinanand Batzer 2006; Iskow et al. 2010; Hancks andKazazian 2012). Because of affordable high-throughput sequencing technologies, the searchfor somatic de novo insertions of TEs becomesfeasible. Numerous such events could be detect-ed in tumor tissues and cells (Miki et al. 1992;Lee et al. 2012), some of which might be caus-al. Somatic integration even was proposed tocontribute to neuropsychiatric disease, such asschizophrenia (Bundo et al. 2014). Apart frombeing detrimental or neutral, a minority ofevents has the potential to turn out beneficial.

TEs of a certain class might harbor not onlyTFBS, but also numerous other functional se-

quences in their consensus sequences, and, per-haps, mainly because these elements are definedby their designation, proposals with highly di-verse functions are being published, even in-volving TEs with very narrow phylogenetic dis-tributions (Allen et al. 2004; Espinoza et al. 2004,2007; Lunyak et al. 2007; Mariner et al. 2008;Gong and Maquat 2011; Yakovchuk et al. 2011;Carrieri et al. 2012; Jady et al. 2012; Holdt et al.2013; Ponicsan et al. 2013; Wang et al. 2013).Somatic LINE element integrations even havebeen implied in the development of the brain(Muotri et al. 2005; Coufal et al. 2009; Faulkneret al. 2009; Singer et al. 2010; Baillie et al. 2011;Upton et al. 2011; Perrat et al. 2013; Reilly et al.2013). Either TEs are chock full of regulatorymotifs and control elements, which can be ar-gued in case of promoters, for example, LTRs(Feuchter and Mager 1990), or it is the factthat TEs are defined and designated nucleicacid sequences (nuons) (Brosius and Gould1992) and, therefore, receive more attention in-stead of randomized and anonymous sequencesin attempts to investigate their functions. Per-haps one or the other of these exhilarating find-ings will share the fate of ID repetitive elementsand TEs of the SINE class, in a development thatunfolded about three decades ago. A lot of ex-citement was generated by reports that ID ele-ments regulate brain-specific gene expression(Milner et al. 1984; Sutcliffe et al. 1984a,b;McKinnon et al. 1986). Unfortunately, theseclaims did not stand the test of time (Owenset al. 1985; Sapienza and St-Jacques 1986; Gold-man et al. 2014).6 For sure, any seemingly insig-

6Ironically, the presence of a brain-specific RNA (BC1 RNA)with similarity to the consensus sequence of ID elementshad been noticed in early publications, but dismissed as by-product of a functional act of transcription (Sutcliffe et al.1984a). This activity by RNA polymerase III had been sug-gested to open the chromatin structure of brain-specificgenes to allow transcription by RNA polymerase II (Sutcliffeet al. 1984b). It turned out, however, that BC1 RNA is en-coded by a single active gene, is a master gene for the IDrepetitive SINEs, and is functional (DeChiara and Brosius1987; Martignetti and Brosius 1993; Kim et al. 1994; Wanget al. 2002; Lewejohann et al. 2004; Iacoangeli and Tiedge2013). This perfectly reflects the “Zeitgeist” of the era and isin stark contrast to the current situation in which almostanything that features a ribogroup is being considered func-tional (see below).

RNA and Eukaryotic Gene/Genome Architecture

Cite this article as Cold Spring Harb Perspect Biol 2014;6:a016089 9

on June 27, 2022 - Published by Cold Spring Harbor Laboratory Press http://cshperspectives.cshlp.org/Downloaded from

Page 10: The Persistent Contributions of RNA to Eukaryotic Gen(om)e

nificant novelty could have far-reaching conse-quences for future lineages (Martignetti andBrosius 1993; Kapitonov and Jurka 2005), butin their infancy, the functional significance, ifnot potential of novelties, is not easy to assessand often might be transitory. And what is truefor regulatory elements and protein-codinggenes and their exonic modules should also ap-ply to functions of non-protein-coding RNAs.

FUNCTIONAL RNA: ALIVE AND KICKING

Despite the growing recognition of RNA’s func-tional significance and versatility, as well asits preeminence in the evolution of life, up to�15 years ago, most scholars in the life sciencesstill deemed RNA molecules as fossils or rem-nants from bygone eras. The most complex cy-toplasmic RNA class, messengers between thegenetic information on DNA and ribosomes,organelles in which structural and functionalmacromolecules (proteins) of a cell are beingassembled, did not generate much excitementany longer after the genetic code had beencracked (Nirenberg et al. 1965; Soll et al. 1965).

A minority of investigators sensed that thepreviously known and rather abundant non-protein-coding RNAs, such as transfer and ri-bosomal RNAs, were only the tip of the iceberg(Prestayko and Busch 1968; Zieve and Penman1976; Lerner et al. 1980; Brown and Fournier1984; Lee et al. 1985; Erdmann and Wolters1987; Mattick 1994; Brosius 1996) and thateven novel functional RNAs could evolve (De-Chiara and Brosius 1987; Brosius 1991). Thefloodgates began to open when sequencing ofcopy DNA generated from non-mRNA fractionsrevealed a plethora of novel RNAs and RNAclasses including miRNAs (Filipowicz 2000;Huttenhofer et al. 2001; Lagos-Quintana et al.2001; Couzin 2002). The development of deepand ultradeep sequencing methods for cellu-lar, organ-specific, and whole transcriptomesof organisms greatly accelerated the deluge ofdata (Wang et al. 2009). The past 15 years havesurprised most of the scientific communitywith the incessant discovery of thousands of no-vel RNAs, some extending RNA species fromalready known classes such as snoRNAs, some

establishing novel RNA classes (e.g., miRNAsand siRNAs), and some unclassified (Fire etal. 1998; Ambros 2001; Lagos-Quintana et al.2001; Lau et al. 2001; Moss 2001; Ruvkun2001; Couzin 2002; Carninci et al. 2005; Derrienet al. 2012; Djebali et al. 2012; Kapranov et al.2012; Ross et al. 2014). Even extracellular RNAsare receiving renewed attention (Benner 1988;Wu et al. 2002; Dunoyer et al. 2007; Leslie2013a). Major RNA classes are summarized inTable 1.7

It appears that the mining of novel tiny andsmall RNAs is reaching a point of deceleration(Saxena and Carninci 2011b) and, as a result,long RNAs are receiving more attention (Jacqu-ier 2009; Ponting et al. 2009; Carninci 2010;Derrien et al. 2012; Hu et al. 2012; St Laurentet al. 2012; Cloutier et al. 2013; Di Ruscio et al.2013; Geisler and Coller 2013; Kung et al. 2013;Orom and Shiekhattar 2013; Sabin et al. 2013;Ulitsky and Bartel 2013; Fatica and Bozzoni2014; Yang et al. 2014). Oligonucleotide arraysequence analysis, conventional RNA sequenc-ing, and RNA-sequencing technologies (FAN-TOM) (Carninci et al. 2005; Cheng et al. 2005;Birney et al. 2007; Mercer et al. 2012), even incombination with biocomputation, taking phy-logenetic considerations and/or the potentialfor forming secondary structures into account,mostly come up with at least ten thousand longnon-protein-coding RNA candidates (Washietlet al. 2007; Managadze et al. 2013; Necsulea etal. 2014; Nielsen et al. 2014).

DID RNA ENTER BUBBLE TERRITORY?

The percentage of the human genome that isbeing transcribed appears to be approachingthe maximum asymptotically. A simple expla-nation is the increased coverage with ultradeepsequencing methods and coverage of any celltype including tumors, many developmentalstages, as well as (tumor) cell lines numerous

7Detailed descriptions of the RNA classes and their (poten-tial) functions can readily be found in reviews and the vastoriginal literature cited therein, for example, Prasanth andSpector 2007; Costa 2010; Aalto and Pasquinelli 2012; Dje-bali et al. 2012; Kapranov et al. 2012; Dieci et al. 2013;Gomes et al. 2013; Leslie 2013b.

J. Brosius

10 Cite this article as Cold Spring Harb Perspect Biol 2014;6:a016089

on June 27, 2022 - Published by Cold Spring Harbor Laboratory Press http://cshperspectives.cshlp.org/Downloaded from

Page 11: The Persistent Contributions of RNA to Eukaryotic Gen(om)e

Table 1. RNA classification

RNA class

Approximate

size (nt) Function

Tiny RNAs, <50 ntmicroRNA, miRNA 21–24 Gene regulation, for example, fine-tuning at the

translational levelShort interfering RNA, siRNA 20–25 RNA interference, defensePIWI-interacting RNA, piRNA 26–31 Epigenetic and posttranscriptional gene silencing

of TEsSmall RNAs, ∼50 – 500 ntExtracellular RNA, exRNA Wide range Intercellular communication

Small nuclear RNA (snRNA)Spliceosomal RNAs, U1, U2, U4, U5, etc.

RNAs100–200 Splicing of mRNA out of primary transcripts

U7 RNA 50 30 processing of histone mRNA precursorsRibonuclease P RNA, RNase P RNA or H1

RNA340 tRNA processing

7SK RNA 330 Regulation of transcriptionY RNA 90–100 DNA replicationTelomerase RNA component, TERC 450 Maintenance of telomeres

Small nucleolar RNA, snoRNA 60–300 Guide RNAs for nucleotide modificationC/D box snoRNA 20 O-methylation of the ribose moiety of rRNAH/ACA box snoRNA rRNA pseudouridinylationCajal body–specific RNA, scaRNA Composite C/D and H/ACA guide RNA,

modification of snRNAsU3, U8, U14, U17, and U22 snoRNAs Regulation of preribosomal RNA (pre-rRNA)

folding, mediation of correct nucleolyticprocessing (maturation)

“Orphan” snoRNAs Target of modification (if any) and functionunknown, examples are SNORD115 andSNORD116 in the PWS (Prader-Willisyndrome) locus

Small cytoplasmic RNA, scRNA5S and 5.8S small ribosomal RNAs 120, 150 TranslationTransfer RNAs, tRNAs 73–94 Adapter molecules in translationSignal recognition particle RNA, 7SL RNA

or SRP RNA300 Targets protein to the endoplasmic reticulum

Vault RNAs 90–100 Components of the vault organelle, function notclear yet

Ribonuclease MRP, mitochondrial RNAprocessing RNA

290 Mitochondrial DNA replication and rRNAprocessing in the nucleus

BC1 RNA 150 Neuronal cytoplasmic RNA (some expression intestes) in soma, but also transported todendrites, possibly involved in regulation oftranslation; phylogenetically restricted torodents, originated from tRNA retroposition

BC200 RNA 200 Neuronal cytoplasmic RNA (some expression intestes) in soma, but also transported todendrites, possibly involved in regulation oftranslation; phylogenetically restricted toanthropoid primates; not homologous to BC1RNA, originated from a monomeric Alu element

Continued

RNA and Eukaryotic Gene/Genome Architecture

Cite this article as Cold Spring Harb Perspect Biol 2014;6:a016089 11

on June 27, 2022 - Published by Cold Spring Harbor Laboratory Press http://cshperspectives.cshlp.org/Downloaded from

Page 12: The Persistent Contributions of RNA to Eukaryotic Gen(om)e

times over (Carninci 2009; Ritz et al. 2011; Haaset al. 2012; Ho et al. 2012; Shah et al. 2012; Anet al. 2013; Eswaran et al. 2013; Hu et al. 2013;Schonberg et al. 2013; Yoshihara et al. 2013).With all this overabundance, there is a debateraging between those that claim almost anyidentified transcript and snippet of RNA to befunctional (Mattick 2003; Lee et al. 2009; Car-ninci 2010; Kishore et al. 2010; Clark et al. 2011;Bernstein et al. 2012; Kapranov and St Laurent2012; Khayrullina et al. 2012; Gebetsberger andPolacek 2013; Mattick and Dinger 2013) andmore conservative voices (Brosius 2005c; Hut-tenhofer et al. 2005; Robinson 2010; van Bakelet al. 2010; Graur et al. 2013). The writer con-curs with the argumentation for spurious andstochastic transcripts (Kowalczyk et al. 2012;Jensen et al. 2013; Mudge et al. 2013), especiallyas many transcription promoters are bidirec-

tional (Seila et al. 2008; Neil et al. 2009; Xuet al. 2009; Wei et al. 2011; Uesaka et al. 2014).Even the yeast Saccharomyces cerevisiae featur-ing a compact genome abounds with transcrip-tional noise and erroneous initiation of tran-scription by RNA polymerase II (Struhl 2007).Furthermore, some transcription might be in-volved in gene regulation without the tran-scribed RNA being functional (Tisseur et al.2011). Examples include regulation of tran-scription by upstream promoters, previouslyknown as promoter occlusion or transcriptionalinterference (Adhya and Gottesman 1982; Cul-len et al. 1984). In yeast, for example, it has beenshown that the SER3 gene is repressed by the actof transcription from the upstream SRG1 gene(Martens et al. 2004). In contrast, the PHO5gene is activated by intergenic transcription inthe opposite orientation presumably influenc-

Table 1. Continued

RNA class

Approximate

size (nt) Function

Long RNAs, >500 nt18S and 28S large ribosomal RNA, rRNA 1900, 5000 TranslationMessenger RNAs, mRNAs Various Templates for protein biosynthesisLong intervening non-protein-coding

RNAs, lincRNAsVarious Various suggested functions, not many are certain;

regulation of gene expression; gene/chromosome silencing/imprinting byinteraction with chromatin

Enhancer RNA, eRNA .1000Multiexonic poly(A)þ RNA, meRNA VariousLong antisense RNAs, asRNA, or aRNA or

natural antisense RNA, NAS RNAVarious Possibly regulation of gene expression

Pseudogene transcripts Various Decoy or sink for sequestration of RNA or proteinCircular RNA Decoy or sink for sequestration of RNA or protein

It is difficult to classify RNAs because they are so diverse. This table attempts to categorize according to size; other groupings

take their subcellular locations or functions into account. In any event, the distinctions are blurry and apply only to one species

or lineage. In yeast, for example, the U-type snRNAs are significantly larger than in vertebrates. More recently, an arbitrary line

between large and small RNAs is being drawn at 200 nt, the latter including tiny RNAs, such as miRNAs. The three size classes

here reflect the historical division between small and large RNA at �500 nt to accommodate SRP RNA, 7SK RNA, or

mitochondrial RNA processing (MRP) RNA that are well .200 nt in length, but have been known since their discoveries

as small RNAs. Large or long RNAs are defined as being in the size range of mRNAs, sometimes even displaying mRNA-like

attributes (e.g., processing, polyadenylation), but devoid of a functional open reading frame. Initially, miRNAs were

designated as tiny RNAs to distinguish them from conventional small RNAs (Ruvkun 2001). This tripartite categorization

is kept in the table. There is a flurry of attempts to identify additional RNA classes. Their functions are mostly unknown and the

RNAs might correspond to spurious or aborted transcription or other by-products of gene expression (see the text). These

include promoter-associated short and long RNA (PASR, PALR), termini-associated RNA (TASR), promoter upstream RNA

(PROMPT) transcription initiated RNA (tiRNA), transcription start site antisense RNA (TSSa), antisense termini-associated

short RNA (aTASR), retrotransposon derived RNA, satellite region RNA, telomere repeat region antisense RNA (TERRA), etc.

For further information and original references see Saxena and Carninci (2011a). PIWI, P-element-induced wimpy testis.

J. Brosius

12 Cite this article as Cold Spring Harb Perspect Biol 2014;6:a016089

on June 27, 2022 - Published by Cold Spring Harbor Laboratory Press http://cshperspectives.cshlp.org/Downloaded from

Page 13: The Persistent Contributions of RNA to Eukaryotic Gen(om)e

ing chromatin remodeling (Uhler et al. 2007).In other words, the act of transcription exertsthe function whereas the RNA is merely a by-product, and this mechanism regulates geneexpression in multicellular eukaryotes as well(Kornienko et al. 2013). Another relatively pas-sive role for some RNAs could be the following:Many proteins can bind RNA and proteins canassociate with each other. Perhaps an RNA mol-ecule could have the sole function to broker theinteraction of two or more proteins that other-wise would not be able to interact with eachother. However, if the proteins bind directly oreven via other RNA binding proteins to theRNA, close proximity might facilitate function-al or structural interactions. In addition, suchan RNP complex might even serve as a shuttleinto subcellular domains or environments thatone or the other component, for lack of theappropriate signals, would not be able to reachby itself (Brosius 2005b).

It has been argued that many of the newlydiscovered long RNAs represent 30-UTRs ex-tending beyond the annotated 30-ends of themRNAs by using alternative distal poly(A) ad-dition signals. If there are large introns, 50-UTRswith alternative upstream promoters also can belocated far from the ORF. For that reason, someinvestigators took measures to stay clear of tran-scripts that are located in a certain proximityto annotated protein-coding genes (Managadzeet al. 2013). Nevertheless, splicing does occurin 30-UTRs as well (Bicknell et al. 2012; Cama-cho-Vanegas et al. 2012; Zhernakova et al. 2013)and, hence, the corresponding exons couldmap to corresponding protein-coding genes ata much greater distance to the ORFs. Further-more, apart from spurious initiation of tran-scription anywhere in the genome, a certainlevel of readthrough into gene distal regionscould account for a significant portion of theselong RNA candidates (Finta and Zaphiropoulos2000; Akiva et al. 2006; Parra et al. 2006; Rich-ard and Manley 2009).

RNAs overlapping with annotated genesor transcribed in antisense orientation, albeitshowing regulatory potential, are generally be-ing avoided for now as their structures, regula-tion, and functions are more difficult to inves-

tigate (Ulitsky and Bartel 2013). Hence, thefocus has narrowed to investigating long inter-genic noncoding RNAs (lincRNAs).8 Anotherpoint is being overlooked frequently. For themost part, the arbitrary cutoff for an open read-ing frame is at 100 amino acids in length. Ifhumans were not pentadactyls, but hexa- or tet-radactyls, the cutoff would probably have beenchosen at 144 or 64 amino acids, respectively.There are numerous mRNAs that have evenshorter ORFs, namely, those encoding peptides(Rudd et al. 1998; Sousa et al. 2001; Frith et al.2006a; Kastenmayer et al. 2006; Savard et al.2006; Galindo et al. 2007; Ghabrial 2007; Ha-nada et al. 2007; Hashimoto et al. 2008; Magnyet al. 2013). Perhaps a significant fraction ofmRNA-like long intervening non-protein-cod-ing RNAs are mRNAs after all, encoding pep-tides and short proteins. The question on howmany of the non-protein-coding RNAs are as-sociated with polysomes for productive transla-tion is currently under debate (Ingolia et al.2011; Guttman et al. 2013; Ingolia et al. 2013;Ulitsky and Bartel 2013; van Heesch et al. 2014).

NOVEL RNAs OUT OF THE BLUE

De novo arisen long (intergenic or better inter-vening) non-protein-coding RNA genes out ofneutrally evolving DNA including transposedelements also are receiving increased attention(Guttman et al. 2009; Marques and Ponting2009; Esteller 2011; Hadjiargyrou and Delihas2013; Kapusta et al. 2013; Managadze et al.2013; Ulitsky and Bartel 2013). Their abun-dance as a class is not surprising, however, be-cause parts of TEs, especially LINEs and loneLTRs, can harbor active transcription promot-ers. As a consequence, a large proportion oflong intervening non-protein-coding RNAsshow sequence similarities to TEs.

8This is a double blunder in the troubled RNA nomencla-ture. First, there hardly is a bona fide noncoding RNA (ex-cept, of course, for the nonfunctional noise) and most RNAscarry a code (Trifonov 1989; Barbieri 2003), and, second, ifthe locus encodes an RNA, it is a gene itself and not some-thing intergenic. At least one laboratory has begun to ad-dress these macromolecules as long “intervening” non-[protein]-coding RNAs (Ulitsky and Bartel 2013).

RNA and Eukaryotic Gene/Genome Architecture

Cite this article as Cold Spring Harb Perspect Biol 2014;6:a016089 13

on June 27, 2022 - Published by Cold Spring Harbor Laboratory Press http://cshperspectives.cshlp.org/Downloaded from

Page 14: The Persistent Contributions of RNA to Eukaryotic Gen(om)e

Most investigators agree that among the tensof thousands of long RNA candidates, there willbe hundreds if not thousands of bona fide func-tional RNAs. One article estimates the numberto be �10% of all suggested candidates (Ulit-sky and Bartel 2013). In addition, quite a fewof these spurious and currently nonfunctionalRNAs might one day be exapted into novel func-tions and, as discussed above, as protein-codingexons or regulatory regions. The majority willbe rendered inactive again and a few, if benefi-cial, will eventually persist under selective pres-sure (Brosius 2005c; Siepel 2009; Polev 2012).In an analogy to Alex Rich’s predictions withrespect to an RNA world and the persisting sig-nificance of RNA, Henry Harris envisaged thepotential of spurious transcripts as raw materialfor evolution about half a century ago: “Only asmall proportion of the RNA made in the nu-cleus of animal and higher plant cells serves as atemplate for the synthesis of protein. . . . Mostof the nuclear RNA, however, is made on partsof the DNA, which do not contain informationfor the synthesis of specific proteins. . . . It playsno role in the synthesis of a cell protein, butserves as a background on which mutation andselection may operate to produce new templatesfor protein synthesis” (Harris 1965, 2013). Thisalso applies to new genes encoding functionalRNAs.

The process of new genes arising out of non-genic regions might not appear as extraordinaryif one considers the presumably more trivialreverse process, namely, formation and gradualdecay of pseudogenes. A single-point mutationor small deletion can turn an mRNA into a non-protein-coding RNA. Further mutations can si-lence its transcription and, after enough time,the remnants of a gene cannot be recognized assuch any longer.

COMPETITIVE RNAs

Interestingly, transcribed pseudogenes, includ-ing those that arose via retroposition, reported-ly can be functional as RNAs or mRNAs encod-ing truncated proteins (Balakirev and Ayala2003; Duret et al. 2006; Frith et al. 2006b;Pink et al. 2011; Wen et al. 2012; Sen and Ghosh

2013). For example, retropseudogene PTENP1derived from tumor suppressor gene PTEN istranscribed. Like the canonical mRNA ex-pressed from the parent gene, it harbors bindingsites for several miRNAs in what used to be the30-UTR. These miRNA target sites are compet-ing for the corresponding miRNAs, thus, ame-liorating their inhibitory effect on the intactmRNA, resulting in higher levels of the tumorsuppressor protein PTEN (Poliseno et al. 2010).Even two isoforms of an antisense RNA gener-ated from the PTENP1 pseudogene have beenreported to be involved in gene regulation(Johnsson et al. 2013). In addition, a plethoraof naturally occurring circular RNAs have beendiscovered, which may regulate gene expressionby sponging up complementary RNAs, such asmiRNAs (Salzman et al. 2012; Hansen et al.2013; Memczak et al. 2013; Tay et al. 2014).RNAs not only can be decoys or sinks for otherRNAs but also for proteins. A long RNA pro-cessing product located between two snoRNAshas been reported to act as a sink for Fox2 splic-ing factor and, as a consequence, alter the rela-tive abundance of alternative splice products ofa number of genes (Yin et al. 2012).

THE (NUCLEIC) ACID TESTFOR FUNCTION

As one of the early advocates of a vibrant RNAcomponentry in extant organisms (DeChiaraand Brosius 1987; Brosius 1991, 1996; Petherick2008), the writer never doubted that the num-ber of functional non-protein-coding RNAscould easily match those for mRNAs in linewith more conservative estimates being in the3% range (Clamp et al. 2007; Church et al. 2009;Cabili et al. 2011; Managadze et al. 2013).9 Con-sidering genes encoding RNAs smaller than

9ORFs amount to �1.3% in the mouse and human ge-nomes and an additional �0.8% serve as 50- and 30-UTRsat their extremities (International Human Genome Se-quencing Consortium 2004; Church et al. 2009; Managadzeet al. 2013). Accordingly, all sequences that end up in maturemRNAs cover somewhat .2% of their respective genomes.Strictly speaking, mRNAs are chimera between templatesfor translation and non-protein-coding RNAs and manylong (intervening) non-protein-coding RNAs are quite sim-ilar, except for the lack of (long) ORFs.

J. Brosius

14 Cite this article as Cold Spring Harb Perspect Biol 2014;6:a016089

on June 27, 2022 - Published by Cold Spring Harbor Laboratory Press http://cshperspectives.cshlp.org/Downloaded from

Page 15: The Persistent Contributions of RNA to Eukaryotic Gen(om)e

long RNAs as well and leaving generous spacefor potentially functional RNAs overlappingother genes in sense or antisense orientation,the figure might top 5%. It does not matterwhether the numbers will end up even at 10%of the total genome, they will not be in the rangeof 75% (Djebali et al. 2012). There is a tremen-dous task ahead of us to determine which of thedetected transcripts are bona fide RNAs, andwhat their individual functions and contribu-tions to the cell are. Phylogenetic conservationhelps, but its absence is not a perfect criterionfor lack of function. Not many investigatorswould deny the functional significance of theUTRs of mRNAs. These regions, flankingORFs, can encode other, for example, regulato-ry functions such as structures to modulateturnover or translation efficiencies, includingsequence complementarities to regulatory miR-NAs. Yet, for the most part, these sequencesshow much less phylogenetic conservation incomparison to ORFs. This agrees with findingsabout Xist, an �17-kb-long non-protein-cod-ing RNA in humans that, at best, is conservedonly in small patches between mammals (Duretet al. 2006). Inactivation of Xist RNA expressionon the paternal X chromosome leads to earlyembryonic death in a mouse knockout model(Marahrens et al. 1997). Even the best test forfunctional significance, gene inactivation inknockout model systems, does not always pro-vide immediate and simple answers. In yeast,very few knockouts of snoRNAs showed a clearphenotype as was the case for U14 (Li et al.1990), but not, for example, snR189 (Thomp-son et al. 1988), or only over time or undercertain conditions with respect to other snoR-NAs (Badis et al. 2003; King et al. 2003; Esguerraet al. 2008). Bacterial genomes like those of Es-cherichia coli do not have much space to wasteand everything without a selective advantagewould not remain in the genome for long. Inagreement, the knockout of the 4.5S RNA genehad been shown to be essential (Brown andFournier 1984). Surprisingly, ablation of thegene encoding 6S RNAwith a wide phylogeneticdistribution and structural conservation in bac-teria (Barrick et al. 2005) had no effect on im-mediate viability (Lee et al. 1985). Later, 6S

RNA was shown to regulate transcription andenhance long-term survival (Wassarman andStorz 2000; Trotochaud and Wassarman 2004).

In mammalian genomes, not every inacti-vation of a protein-coding gene results in a dis-cernible phenotype. The same is to be expectedfrom genes encoding RNA. Snora35 is locatedwithin the second intron (almost 100 kb inlength) of the serotonin 2c receptor gene(Htr2c) and is highly conserved between placen-tal mammals. As a consequence, the brain-spe-cific Snora35 (earlier termed MBI-36 snoRNA)is cotranscribed with and processed out of theprimary Htr2c heterogeneous nuclear RNA(hnRNA), for example, in the choroid plexus(Cavaille et al. 2000). Snora35 gene-depletedmice do not appear to display a phenotype dif-ferent from their wild-type siblings, thus far(BV Skryabin and J Brosius, unpubl.). In anoth-er example of an RNA knockout, it took severalyears of work to tease out a reduced explora-tory behavior in mice when small cytoplas-mic BC1 RNA, phylogenetically restricted torodents and expressed in neurons where it isdelivered to dendritic processes, was deleted(Tiedge et al. 1991; Martignetti and Brosius1993; Lewejohann et al. 2004). In a furthermouse snoRNA knockout, the entire cluster en-coding Snord116 RNA isoforms (earlier termedMBII-85 snoRNA) was deleted from a locus as-sociated with Prader-Willi syndrome, a neuro-developmental disorder in humans. Two inde-pendent studies revealed that the mice showedfailure to thrive and growth retardation, butnot all symptoms described for humans (Skrya-bin et al. 2007; Ding et al. 2008). In many multi-cellular organisms, snoRNAs are cotranscribedwith introns of protein-coding genes or non-protein-coding genes. Snord116 is paternallyimprinted and expressed in the brain. The hostRNA is processed into a set of abundant longnon-protein-coding RNAs by splicing out theintrons (Runte et al. 2001). Attempts to com-pensate for the loss of Snord116 with constructscoexpressing some copies of the respectivesnoRNA in introns of a different host transgeneindependently failed thus far (Ding et al. 2008;BV Skryabin and J Brosius, unpubl.). The fol-lowing could explain the findings: (1) The num-

RNA and Eukaryotic Gene/Genome Architecture

Cite this article as Cold Spring Harb Perspect Biol 2014;6:a016089 15

on June 27, 2022 - Published by Cold Spring Harbor Laboratory Press http://cshperspectives.cshlp.org/Downloaded from

Page 16: The Persistent Contributions of RNA to Eukaryotic Gen(om)e

ber of Snord116 RNA isoforms from the origi-nal transcription unit is not sufficient in thetransgenic construct; (2) spatial, temporal ex-pression and/or its levels are not appropriate;and (3) truncation of the long host hnRNA-derived, processed non-protein-coding RNA isthe underlying cause for the mouse phenotypeand, by extension, human disease.

An interesting case involves the Hotair longRNA that had been shown ex vivo to regulateHoxD genes (Rinn et al. 2007); deletion of theHoxC cluster harboring the Hotair gene was de-void of a phenotype (Schorderet and Duboule2011), whereas targeted disruption of theHotair locus alone led to homeotic transforma-tion of the spine and malformation of metacar-pal and carpal bones (Li et al. 2013).

A great stride forward has been madethrough a comprehensive gene knockout proj-ect involving 18 large RNA candidates (Sauva-geau et al. 2013). The corresponding genes wereselected not to overlap with any other gene orgene module, and their coding regions were re-placed with the lacZ and neoR marker genes.At the current level of investigation, five of the18 mouse lines revealed a phenotype. In threelines peri- and postnatal lethality was reported,two of which had some survivors, but they hadgrowth defects. For the two remaining lines witha phenotype, growth defects were also reported.The most lethal knockout involved Fendrr, agene with six exons encoding a nuclear RNAof about 2.4 kb in length. An independentknockout only involving the first exon by insert-ing a transcriptional stop cassette was lethal aswell, albeit showing different organ defects(Grote et al. 2013). The ,30%, but .10% “suc-cess” rate of 18 knockouts could mean a numberof things, for example, that the RNA gene can-didates were carefully selected in favor of poten-tial functionality and phenotypic consequences(Sauvageau et al. 2013). Still, two-thirds of thegene ablations did not show a noticeable impactunder the laboratory conditions tested, but,nevertheless, could be beneficial for the long-term survival of mice in natural, ever-changingenvironments or that some of the RNAs have nofunction per se. Also, it remains valid that, inexceptional cases, such transitory “protogenes”

could become exaptations encoding novel func-tional RNAs, even proteins, or predominantlydisappear again into the noise of the neutrallyevolving genomic mass. In any event, majorpoints in an article written to balance some ofthe extreme views are confirmed: “[t]here is nodoubt that a number of these non-protein-cod-ing RNAs have important regulatory functionsin the cell” and “. . . aberrant transcripts or pro-cessing products embody evolutionary poten-tial and provide novel RNAs that natural selec-tion can act on” (Brosius 2005c).

CONCLUDING REMARKS

All scenarios for the beginning of life are farfrom perfect. The most plausible hypothesis isbased on RNA as a primordial (auto)catalyticmacromolecule, leading over a variety of tran-sitions (Cairns-Smith 1982; Szathmary andSmith 1995), of which one of the first was ac-quisition of a simple lipid enclosure (Szostaket al. 2001; Mansy et al. 2008) to extant formsof life. In the RNA world, pheno- and genotypewere united in the same macromolecule. Thisunion began to separate during the major tran-sition to the RNP world with a “division oflabor” between RNA and protein. Even before,linkage of RNA molecules, confinement (of setsof RNAs), and discrimination, on one hand,versus escape (reminiscent of viruses and hori-zontal gene transfer) and more expansive ex-change by reshuffling of the RNA componentry(sex), are apparent. This resembles a tug of war,constantly in need of balancing the forces andnot allowing for a victory of either side. Up tothis day, at the cellular, organismal, and societallevel, the predicament of discrimination andexchange as well as the quandary of selfishnessand cooperation remains an essence of evolu-tion and life.10

Perhaps the modular arrangement of genesand chromosomal DNA has its origins in the

10“Without Contraries is no progression. Attraction andRepulsion [. . .] are necessary to [. . .] existence.” The orig-inal William Blake quote, not shortened by the writer isgiven as: “Without Contraries is no progression. Attractionand Repulsion, Reason and Energy, Love and Hate, are nec-essary to Human existence” (Blake 1975).

J. Brosius

16 Cite this article as Cold Spring Harb Perspect Biol 2014;6:a016089

on June 27, 2022 - Published by Cold Spring Harbor Laboratory Press http://cshperspectives.cshlp.org/Downloaded from

Page 17: The Persistent Contributions of RNA to Eukaryotic Gen(om)e

RNA or RNP worlds as well, reflecting the hy-pothetical structure of several linked RNA mol-ecules. This also would mean that RNA pro-cessing and trimming had very early origins.Components of the extant telomerase as wellas group II introns, ancestral to the spliceosomeand non-LTR retroposons, might date back asfar as the RNP or RNA worlds. The continuedassault of extant eukaryotic genomes with ret-roposons (somewhat countered and alleviatedby deletions via recombination) clearly had itsorigin in the transition period when RNA wasreplaced by DNA for the task of storing andreplicating genetic material. This process stillcan shape genomes quite drastically in relativelyshort evolutionary time frames. Importantly,out of this sea of change with islands of ratherconserved and fixed gene modules, new mod-ules can be generated fortuitously and “tested”for added or modified function by the forces ofselection. Most disappear again with a few even-tually exapted as novel gene modules and, oc-casionally, even as novel genes encoding proteinor functional RNA.

After a long lag phase on the sidelines, func-tional RNA currently is in the spotlight of biol-ogy, even medicine, as RNomics shows promiseto detect additional disease genes, greatly de-velop the diagnostic toolbox, and revolutionizetherapeutic possibilities (Esteller 2011; Cech2012; Erdmann and Barciszewski 2012; Goldet al. 2012). However, the development fromonly two decades ago when the mere mentionof RNA generally exposed grant proposals tomonkey hammering resulting in poorer scoresand the current situation in which a feedingfrenzy of RNA discovery fueled by ultradeepRNA-sequencing technologies endorses almostany detected transcript or degradation productas functional RNA borders on the grotesque.Clearly, the pendulum has swung to the otherextreme.11 In any event, the renewed interest inRNA and advances in understanding its evolu-tion and biology will continue to fascinate us.RNA, this ancient macromolecule, will grant us

exciting new insights into the works of life past,present, and possibly future.

REFERENCES

Aalto AP, Pasquinelli AE. 2012. Small non-coding RNAsmount a silent revolution in gene expression. CurrOpin Cell Biol 24: 333–340.

Adamala K, Szostak JW. 2013. Nonenzymatic template-directed RNA synthesis inside model protocells. Science342: 1098–1100.

Adhya S, Gottesman M. 1982. Promoter occlusion: Tran-scription through a promoter may inhibit its activity.Cell 29: 939–944.

Ahituv N, Zhu Y, Visel A, Holt A, Afzal V, Pennacchio LA,Rubin EM. 2007. Deletion of ultraconserved elementsyields viable mice. PLoS Biol 5: e234.

Akiva P, Toporik A, Edelheit S, Peretz Y, Diber A, Shemesh R,Novik A, Sorek R. 2006. Transcription-mediated genefusion in the human genome. Genome Res 16: 30–36.

Allen TA, Von Kaenel S, Goodrich JA, Kugel JF. 2004. TheSINE-encoded mouse B2 RNA represses mRNA tran-scription in response to heat shock. Nat Struct Mol Biol11: 816–821.

Ambros V. 2001. microRNAs: Tiny regulators with greatpotential. Cell 107: 823–826.

An J, Lai J, Lehman ML, Nelson CC. 2013. miRDeep�: Anintegrated application tool for miRNA identificationfrom RNA sequencing data. Nucleic Acids Res 41: 727–737.

Atkins JF, Gesteland RF, Cech TR. 2011. RNAWorlds: Fromlife’s origins to diversity in gene regulation. Cold SpringHarbor Laboratory Press, Cold Spring Harbor, NY.

Attwater J, Wochner A, Holliger P. 2013. In-ice evolution ofRNA polymerase ribozyme activity. Nat Chem 5: 1011–1018.

Babushok DV, Ostertag EM, Kazazian HH Jr. 2007. Currenttopics in genome evolution: Molecular mechanisms ofnew gene formation. Cell Mol Life Sci 64: 542–554.

Badis G, Fromont-Racine M, Jacquier A. 2003. A snoRNAthat guides the two most conserved pseudouridine mod-ifications within rRNA confers a growth advantage inyeast. RNA 9: 771–779.

Baertsch R, Diekhans M, Kent WJ, Haussler D, Brosius J.2008. Retrocopy contributions to the evolution of thehuman genome. BMC Genomics 9: 466.

Bailey JA, Eichler EE. 2006. Primate segmental duplications:Crucibles of evolution, diversity and disease. Nat RevGenet 7: 552–564.

Baillie JK, Barnett MW, Upton KR, Gerhardt DJ, RichmondTA, De Sapio F, Brennan PM, Rizzu P, Smith S, Fell M,et al. 2011. Somatic retrotransposition alters the geneticlandscape of the human brain. Nature 479: 534–537.

Balakirev ES, Ayala FJ. 2003. Pseudogenes: Are they “junk”or functional DNA? Annu Rev Genet 37: 123–151.

Barbieri M. 2003. The organic codes: An introduction to se-mantic biology. Cambridge University Press, Cambridge.

Barrick JE, Sudarsan N, Weinberg Z, Ruzzo WL, Breaker RR.2005. 6S RNA is a widespread regulator of eubacterial

11See zmbe.uni-muenster.de/institutes/iep/iepcartoon.htmfor a cartoon.

RNA and Eukaryotic Gene/Genome Architecture

Cite this article as Cold Spring Harb Perspect Biol 2014;6:a016089 17

on June 27, 2022 - Published by Cold Spring Harbor Laboratory Press http://cshperspectives.cshlp.org/Downloaded from

Page 18: The Persistent Contributions of RNA to Eukaryotic Gen(om)e

RNA polymerase that resembles an open promoter. RNA11: 774–784.

Bejerano G, Lowe CB, Ahituv N, King B, Siepel A, SalamaSR, Rubin EM, Kent WJ, Haussler D. 2006. A distal en-hancer and an ultraconserved exon are derived from anovel retroposon. Nature 441: 87–90.

Benner SA. 1988. Extracellular “communicator RNA.” FEBSLett 233: 225–228.

Bernstein BE, Birney E, Dunham I, Green ED, Gunter C,Snyder M. 2012. An integrated encyclopedia of DNAelements in the human genome. Nature 489: 57–74.

Bicknell AA, Cenik C, Chua HN, Roth FP, Moore MJ. 2012.Introns in UTRs: Why we should stop ignoring them.BioEssays 34: 1025–1034.

Birney E, Stamatoyannopoulos JA, Dutta A, Guigo R, Gin-geras TR, Margulies EH, Weng Z, Snyder M, DermitzakisET, Thurman RE, et al. 2007. Identification and analysisof functional elements in 1% of the human genome bythe ENCODE pilot project. Nature 447: 799–816.

Blackburn EH, Collins K. 2011. Telomerase: An RNP en-zyme synthesizes DNA. Cold Spring Harb Perspect Biol3: a003558.

Blackburn EH, Greider CW, Szostak JW. 2006. Telomeresand telomerase: The path from maize, Tetrahymena andyeast to human cancer and aging. Nat Med 12: 1133–1138.

Blake W. 1975. The marriage of heaven and hell. OxfordUniversity Press, Oxford (first published in 1790).

Bourque G, Leong B, Vega VB, Chen X, Lee YL, SrinivasanKG, Chew JL, Ruan Y, Wei CL, Ng HH, et al. 2008. Evo-lution of the mammalian transcription factor bindingrepertoire via transposable elements. Genome Res 18:1752–1762.

Bridges CB. 1936. The bar “gene” a duplication. Science 83:210–211.

Brosius J. 1991. Retroposons—Seeds of evolution. Science251: 753.

Brosius J. 1996. More haemophilus and Mycoplasma genes.Science 271: 1302–1303.

Brosius J. 1999a. Genomes were forged by massive bom-bardments with retroelements and retrosequences.Genetica 107: 209–238.

Brosius J. 1999b. RNAs from all categories generate retro-sequences that may be exapted as novel genes or regula-tory elements. Gene 238: 115–134.

Brosius J. 2001. tRNAs in the spotlight during protein bio-synthesis. Trends Biochem Sci 26: 653–656.

Brosius J. 2003a. The contribution of RNAs and retroposi-tion to evolutionary novelties. Genetica 118: 99–116.

Brosius J. 2003b. From Eden to a hell of uniformity? Direct-ed evolution in humans. BioEssays 25: 815–821.

Brosius J. 2003c. Gene duplication and other evolutionarystrategies: From the RNA world to the future. J StructFunct Genomics 3: 1–17.

Brosius J. 2005a. Disparity, causation, adaptation, exapta-tion, and contingency at the genome level. Paleobiology31: 1–16.

Brosius J. 2005b. Echoes from the past—Are we still in anRNP world? Cytogenet Genome Res 110: 8–24.

Brosius J. 2005c. Waste not, want not—Transcript excess inmulticellular eukaryotes. Trends Genet 21: 287–288.

Brosius J. 2009. The fragmented gene. Ann NY Acad Sci1178: 186–193.

Brosius J, Gould SJ. 1992. On “genomenclature”: A compre-hensive (and respectful) taxonomy for pseudogenes andother “junk DNA.” Proc Natl Acad Sci 89: 10706–10710.

Brown S, Fournier MJ. 1984. The 4.5 S RNA gene ofEscherichia coli is essential for cell growth. J Mol Biol178: 533–550.

Bundo M, Toyoshima M, Okada Y, Akamatsu W, Ueda J,Nemoto-Miyauchi T, Sunaga F, Toritsuka M, Ikawa D,Kakita A, et al. 2014. Increased l1 retrotranspositionin the neuronal genome in schizophrenia. Neuron 81:306–313.

Cabili MN, Trapnell C, Goff L, Koziol M, Tazon-Vega B,Regev A, Rinn JL. 2011. Integrative annotation of humanlarge intergenic noncoding RNAs reveals global proper-ties and specific subclasses. Genes Dev 25: 1915–1927.

Cairns-Smith AG. 1982. Genetic takeover and the mineralorigins of life. Cambridge University Press, Cambridge.

Camacho-Vanegas O, Camacho SC, Till J, Miranda-LorenzoI, Terzo E, Ramirez MC, Schramm V, Cordovano G, WattsG, Mehta S, et al. 2012. Primate genome gain and loss: Abone dysplasia, muscular dystrophy, and bone cancersyndrome resulting from mutated retroviral-derivedMTAP transcripts. Am J Hum Genet 90: 614–627.

Carninci P. 2009. Is sequencing enlightenment ending thedark age of the transcriptome? Nat Methods 6: 711–713.

Carninci P. 2010. RNA dust: Where are the genes? DNA Res17: 51–59.

Carninci P, Kasukawa T, Katayama S, Gough J, Frith MC,Maeda N, Oyama R, Ravasi T, Lenhard B, Wells C, et al.2005. The transcriptional landscape of the mammaliangenome. Science 309: 1559–1563.

Carrieri C, Cimatti L, Biagioli M, Beugnet A, Zucchelli S,Fedele S, Pesce E, Ferrer I, Collavin L, Santoro C, et al.2012. Long non-coding antisense RNA controls Uchl1translation through an embedded SINEB2 repeat. Nature491: 454–457.

Carvunis AR, Rolland T, Wapinski I, Calderwood MA, Yil-dirim MA, Simonis N, Charloteaux B, Hidalgo CA, Barb-ette J, Santhanam B, et al. 2012. Proto-genes and de novogene birth. Nature 487: 370–374.

Cavaille J, Buiting K, Kiefmann M, Lalande M, Brannan CI,Horsthemke B, Bachellerie JP, Brosius J, Huttenhofer A.2000. Identification of brain-specific and imprintedsmall nucleolar RNA genes exhibiting an unusual geno-mic organization. Proc Natl Acad Sci 97: 14311–14316.

Cavalier-Smith T. 1991. Intron phylogeny: A new hypothe-sis. Trends Genet 7: 145–148.

Cech TR. 2012. The RNA worlds in context. Cold SpringHarb Perspect Biol 4: a006742.

Chen S, Krinsky BH, Long M. 2013. New genes as drivers ofphenotypic evolution. Nat Rev Genet 14: 645–660.

Cheng J, Kapranov P, Drenkow J, Dike S, Brubaker S, Patel S,Long J, Stern D, Tammana H, Helt G, et al. 2005. Tran-scriptional maps of 10 human chromosomes at 5-nucle-otide resolution. Science 308: 1149–1154.

Church DM, Goodstadt L, Hillier LW, Zody MC, GoldsteinS, She X, Bult CJ, Agarwala R, Cherry JL, DiCuccio M,

J. Brosius

18 Cite this article as Cold Spring Harb Perspect Biol 2014;6:a016089

on June 27, 2022 - Published by Cold Spring Harbor Laboratory Press http://cshperspectives.cshlp.org/Downloaded from

Page 19: The Persistent Contributions of RNA to Eukaryotic Gen(om)e

et al. 2009. Lineage-specific biology revealed by a finishedgenome assembly of the mouse. PLoS Biol 7: e1000112.

Clamp M, Fry B, Kamal M, Xie X, Cuff J, Lin MF, Kellis M,Lindblad-Toh K, Lander ES. 2007. Distinguishing pro-tein-coding and noncoding genes in the human genome.Proc Natl Acad Sci 104: 19428–19433.

Clark MB, Amaral PP, Schlesinger FJ, Dinger ME, Taft RJ,Rinn JL, Ponting CP, Stadler PF, Morris KV, Morillon A,et al. 2011. The reality of pervasive transcription. PLoSBiol 9: e1000625; discussion e1001102.

Cloutier SC, Wang S, Ma WK, Petell CJ, Tran EJ. 2013. Longnoncoding RNAs promote transcriptional poising of in-ducible genes. PLoS Biol 11: e1001715.

Costa FF. 2010. Non-coding RNAs: Meet thy masters.BioEssays 32: 599–608.

Coufal NG, Garcia-Perez JL, Peng GE, Yeo GW, Mu Y, LovciMT, Morell M, O’Shea KS, Moran JV, Gage FH. 2009. L1retrotransposition in human neural progenitor cells.Nature 460: 1127–1131.

Couzin J. 2002. Breakthrough of the year. Small RNAs makebig splash. Science 298: 2296–2297.

Crick FH. 1958. On protein synthesis. Symp Soc Exp Biol 12:138–163.

Crick FH. 1968. The origin of the genetic code. J Mol Biol 38:367–379.

Crick F. 1970. Central dogma of molecular biology. Nature227: 561–563.

Cullen BR, Lomedico PT, Ju G. 1984. Transcriptional inter-ference in avian retroviruses—Implications for the pro-moter insertion model of leukaemogenesis. Nature 307:241–245.

Darnell JE, Doolittle WF. 1986. Speculations on the earlycourse of evolution. Proc Natl Acad Sci 83: 1271–1275.

Deamer D. 2005. A giant step towards artificial life? TrendsBiotechnol 23: 336–338.

DeChiara TM, Brosius J. 1987. Neural BC1 RNA: cDNAclones reveal nonrepetitive sequence content. Proc NatlAcad Sci 84: 2624–2628.

de Koning AP, Gu W, Castoe TA, Batzer MA, Pollock DD.2011. Repetitive elements may comprise over two-thirdsof the human genome. PLoS Genet 7: e1002384.

Derrien T, Johnson R, Bussotti G, Tanzer A, Djebali S,Tilgner H, Guernec G, Martin D, Merkel A, KnowlesDG, et al. 2012. The GENCODE v7 catalog of humanlong noncoding RNAs: Analysis of their gene structure,evolution, and expression. Genome Res 22: 1775–1789.

de Souza FS, Franchini LF, Rubinstein M.. 2013. Exaptationof transposable elements into novel cis-regulatory ele-ments: Is the evidence always strong? Mol Biol Evol 30:1239–1251.

Dieci G, Conti A, Pagano A, Carnevali D. 2013. Identifica-tion of RNA polymerase III-transcribed genes in eukary-otic genomes. Biochim Biophys Acta 1829: 296–305.

Ding F, Li HH, Zhang S, Solomon NM, Camper SA, CohenP, Francke U. 2008. SnoRNA Snord116 (Pwcr1/MBII-85)deletion causes growth deficiency and hyperphagia inmice. PloS ONE 3: e1709.

Di Ruscio A, Ebralidze AK, Benoukraf T, Amabile G, GoffLA, Terragni J, Figueroa ME, De Figueiredo Pontes LL,Alberich-Jorda M, Zhang P, et al. 2013. DNMT1-inter-

acting RNAs block gene-specific DNA methylation.Nature 503: 371–376.

Djebali S, Davis CA, Merkel A, Dobin A, Lassmann T, Mor-tazavi A, Tanzer A, Lagarde J, Lin W, Schlesinger F, et al.2012. Landscape of transcription in human cells. Nature489: 101–108.

Doolittle WF. 2013. Is junk DNA bunk? A critique ofENCODE. Proc Natl Acad Sci 110: 5294–5300.

Doudna JA, Szostak JW. 1989. RNA-catalysed synthesis ofcomplementary-strand RNA. Nature 339: 519–522.

Du ZQ, Yang CX, Rothschild MF, Ross JW. 2013. NovelmicroRNA families expanded in the human genome.BMC Genomics 14: 98.

Dunoyer P, Himber C, Ruiz-Ferrer V, Alioua A, Voinnet O.2007. Intra- and intercellular RNA interference in Arabi-dopsis thaliana requires components of the microRNAand heterochromatic silencing pathways. Nat Genet 39:848–856.

Duret L, Chureau C, Samain S, Weissenbach J, Avner P. 2006.The Xist RNA gene evolved in eutherians by pseudo-genization of a protein-coding gene. Science 312: 1653–1655.

Eigen M, Schuster P. 1977. The hypercycle. A principle ofnatural self-organization: Part A. Emergence of the hy-percycle. Naturwissenschaften 64: 541–565.

Eigen M, Schuster P. 1978. The hypercycle. A principle ofnatural self-organization: Part C. The realistic hypercycle.Naturwissenschaften 65: 341–369.

Ellington AD, Szostak JW. 1990. In vitro selection ofRNA molecules that bind specific ligands. Nature 346:818–822.

Ellison CE, Bachtrog D. 2013. Dosage compensation viatransposable element mediated rewiring of a regulatorynetwork. Science 342: 846–850.

Erdmann VA, Barciszewski J. 2012. From nucleic acids se-quences to molecular medicine. Springer, Berlin.

Erdmann VA, Wolters J. 1987. The Berlin RNA Databank.Protein Seq Data Anal 1: 127.

Esguerra J, Warringer J, Blomberg A. 2008. Functional im-portance of individual rRNA 20-O-ribose methylationsrevealed by high-resolution phenotyping. RNA 14:649–656.

Espinoza CA, Allen TA, Hieb AR, Kugel JF, Goodrich JA.2004. B2 RNA binds directly to RNA polymerase IIto repress transcript synthesis. Nat Struct Mol Biol 11:822–829.

Espinoza CA, Goodrich JA, Kugel JF. 2007. Characterizationof the structure, function, and mechanism of B2 RNA, anncRNA repressor of RNA polymerase II transcription.RNA 13: 583–596.

Esteller M. 2011. Non-coding RNAs in human disease. NatRev Genet 12: 861–874.

Eswaran J, Horvath A, Godbole S, Reddy SD, Mudvari P,Ohshiro K, Cyanam D, Nair S, Fuqua SA, Polyak K,et al. 2013. RNA sequencing of cancer reveals novel splic-ing alterations. Sci Rep 3: 1689.

Fatica A, Bozzoni I. 2014. Long non-coding RNAs: Newplayers in cell differentiation and development. Nat RevGenet 15: 7–21.

RNA and Eukaryotic Gene/Genome Architecture

Cite this article as Cold Spring Harb Perspect Biol 2014;6:a016089 19

on June 27, 2022 - Published by Cold Spring Harbor Laboratory Press http://cshperspectives.cshlp.org/Downloaded from

Page 20: The Persistent Contributions of RNA to Eukaryotic Gen(om)e

Faulkner GJ, Kimura Y, Daub CO, Wani S, Plessy C, IrvineKM, Schroder K, Cloonan N, Steptoe AL, Lassmann T,et al. 2009. The regulated retrotransposon transcriptomeof mammalian cells. Nat Genet 41: 563–571.

Ferat JL, Michel F. 1993. Group II self-splicing introns inbacteria. Nature 364: 358–361.

Feschotte C. 2008. Transposable elements and the evolutionof regulatory networks. Nat Rev Genet 9: 397–405.

Feuchter A, Mager D. 1990. Functional heterogeneity of alarge family of human LTR-like promoters and enhanc-ers. Nucleic Acids Res 18: 1261–1270.

Fica SM, Tuttle N, Novak T, Li NS, Lu J, Koodathingal P, DaiQ, Staley JP, Piccirilli JA. 2013. RNA catalyses nuclear pre-mRNA splicing. Nature 503: 229–234.

Filipowicz W. 2000. Imprinted expression of small nucleolarRNAs in brain: Time for RNomics. Proc Nat Acad Sci 97:14035–14037.

Fink GR. 1987. Pseudogenes in yeast? Cell 49: 5–6.

Finta C, Zaphiropoulos PG. 2000. The human cytochromeP450 3A locus. Gene evolution by capture of downstreamexons. Gene 260: 13–23.

Fire A, Xu S, Montgomery MK, Kostas SA, Driver SE, MelloCC. 1998. Potent and specific genetic interference bydouble-stranded RNA in Caenorhabditis elegans. Nature391: 806–811.

Frith MC, Forrest AR, Nourbakhsh E, Pang KC, Kai C, KawaiJ, Carninci P, Hayashizaki Y, Bailey TL, Grimmond SM.2006a. The abundance of short proteins in the mamma-lian proteome. PLoS Genet 2: e52.

Frith MC, Wilming LG, Forrest A, Kawaji H, Tan SL,Wahlestedt C, Bajic VB, Kai C, Kawai J, Carninci P,et al. 2006b. Pseudo-messenger RNA: Phantoms of thetranscriptome. PLoS Genet 2: e23.

Galindo MI, Pueyo JI, Fouix S, Bishop SA, Couso JP. 2007.Peptides encoded by short ORFs control developmentand define a new eukaryotic gene family. PLoS Biol 5:e106.

Ganten D, Nesse R. 2012. The evolution of evolutionarymolecular medicine: Genomics are transforming evolu-tionary biology into a science with new importance formodern medicine. J Mol Med (Berl) 90: 467–470.

Gebetsberger J, Polacek N. 2013. Slicing tRNAs to boostfunctional ncRNA diversity. RNA Biol 10: 1798–1806.

Geisler S, Coller J. 2013. RNA in unexpected places: Longnon-coding RNA functions in diverse cellular contexts.Nat Rev Mol Cell Biol 14: 699–712.

Ghabrial A. 2007. Coding RNAs: Separating the grain fromthe chaff. Nat Cell Biol 9: 617–619.

Gilbert W. 1986. Origin of life—The RNA world. Nature319: 618.

Gold L, Janjic N, Jarvis T, Schneider D, Walker JJ, Wilcox SK,Zichi D. 2012. Aptamers and the RNA world, past andpresent. Cold Spring Harb Perspect Biol 4: a003582.

Goldman A, Capoano CA, Gonzalez-Lopez E, Geisinger A.2014. Identifier (ID) elements are not preferentially lo-cated to brain-specific genes: High ID element represen-tation in other tissue-specific- and housekeeping genes ofthe rat. Gene 533: 72–77.

Gomes AQ, Nolasco S, Soares H. 2013. Non-coding RNAs:Multi-tasking molecules in the cell. Int J Mol Sci 14:16010–16039.

Gong C, Maquat LE. 2011. lncRNAs transactivate STAU1-mediated mRNA decay by duplexing with 30 UTRs viaAlu elements. Nature 470: 284–288.

Gould SJ. 2002. The structure of evolutionary theory. Belknap,Harvard University Press, Cambridge, MA.

Graur D, Zheng Y, Price N, Azevedo RB, Zufall RA, Elhaik E.2013. On the immortality of television sets: “Function” inthe human genome according to the evolution-free gos-pel of ENCODE. Genome Biol Evol 5: 578–590.

Greider CW, Blackburn EH. 1989. A telomeric sequence inthe RNA of Tetrahymena telomerase required for telo-mere repeat synthesis. Nature 337: 331–337.

Grote P, Wittler L, Hendrix D, Koch F, Wahrisch S, Beisaw A,Macura K, Blass G, Kellis M, Werber M, et al. 2013. Thetissue-specific lncRNA Fendrr is an essential regulator ofheart and body wall development in the mouse. Dev Cell24: 206–214.

Guerrier-Takada C, Gardiner K, Marsh T, Pace N, Altman S.1983. The RNA moiety of ribonuclease P is the catalyticsubunit of the enzyme. Cell 35: 849–857.

Guthrie C. 1991. Messenger RNA splicing in yeast: Clues towhy the spliceosome is a ribonucleoprotein. Science 253:157–163.

Guttman M, Amit I, Garber M, French C, Lin MF, Feldser D,Huarte M, Zuk O, Carey BW, Cassady JP, et al. 2009.Chromatin signature reveals over a thousand highly con-served large non-coding RNAs in mammals. Nature 458:223–227.

Guttman M, Russell P, Ingolia NT, Weissman JS, Lander ES.2013. Ribosome profiling provides evidence that largenoncoding RNAs do not encode proteins. Cell 154:240–251.

Haas BJ, Chin M, Nusbaum C, Birren BW, Livny J. 2012.How deep is deep enough for RNA-Seq profiling of bac-terial transcriptomes? BMC Genomics 13: 734.

Hadjiargyrou M, Delihas N. 2013. The intertwining oftransposable elements and non-coding RNAs. Int J MolSci 14: 13307–13328.

Haldane JBS. 1933. The part played by recurrent mutation inevolution. Am Nat 67: 5–9.

Hanada K, Zhang X, Borevitz JO, Li WH, Shiu SH. 2007. Alarge number of novel coding small open reading framesin the intergenic regions of the Arabidopsis thaliana ge-nome are transcribed and/or under purifying selection.Genome Res 17: 632–640.

Hanczyc MM, Dorit RL. 1998. Experimental evolution ofcomplexity: In vitro emergence of intermolecular ribo-zyme interactions. RNA 4: 268–275.

Hansen TB, Kjems J, Damgaard CK. 2013. Circular RNAand miR-7 in cancer. Cancer Res 73: 5609–5612.

Harris H. 1965. The short-lived RNA in the cell nucleus andits possible role in evolution. In Evolving genes and pro-teins (ed. Bryson V, Vogel HJ), pp. 469–500. Academic,New York.

Harris H. 2013. History: Non-coding RNA foreseen 48 yearsago. Nature 497: 188.

J. Brosius

20 Cite this article as Cold Spring Harb Perspect Biol 2014;6:a016089

on June 27, 2022 - Published by Cold Spring Harbor Laboratory Press http://cshperspectives.cshlp.org/Downloaded from

Page 21: The Persistent Contributions of RNA to Eukaryotic Gen(om)e

Hashimoto Y, Kondo T, Kageyama Y. 2008. Lilliputians getinto the limelight: Novel class of small peptide genes inmorphogenesis. Dev Growth Differ 50: S269–276.

Hayden EJ, Ferrada E, Wagner A. 2011. Cryptic genetic var-iation promotes rapid evolutionary adaptation in anRNA enzyme. Nature 474: 92–95.

Heinen TJ, Staubach F, Haming D, Tautz D. 2009. Emer-gence of a new gene from an intergenic region. Curr Biol19: 1527–1531.

Hiller M, Schaar BT, Bejerano G. 2012. Hundreds of con-served non-coding genomic regions are independentlylost in mammals. Nucleic Acids Res 40: 11463–11476.

Ho DW, Yang ZF, Yi K, Lam CT, Ng MN, Yu WC, Lau J, WanT, Wang X, Yan Z, et al. 2012. Gene expression profiling ofliver cancer stem cells by RNA-sequencing. PloS ONE 7:e37159.

Hoffman Y, Dahary D, Bublik DR, Oren M, Pilpel Y. 2013.The majority of endogenous microRNA targets withinAlu elements avoid the microRNA machinery. Bioinfor-matics 29: 894–902.

Holdt LM, Hoffmann S, Sass K, Langenberger D, Scholz M,Krohn K, Finstermeier K, Stahringer A, Wilfert W, Beut-ner F, et al. 2013. Alu elements in ANRIL non-codingRNA at chromosome 9p21 modulate atherogenic cellfunctions through trans-regulation of gene networks.PLoS Genet 9: e1003588.

Hu W, Alvarez-Dominguez JR, Lodish HF. 2012. Regulationof mammalian cell differentiation by long non-codingRNAs. EMBO Rep 13: 971–983.

Hu Y, Huang Y, Du Y, Orellana CF, Singh D, Johnson AR,Monroy A, Kuan PF, Hammond SM, Makowski L, et al.2013. DiffSplice: The genome-wide detection of differ-ential splicing events with RNA-seq. Nucleic Acids Res41: e39.

Huttenhofer A, Kiefmann M, Meier-Ewert S, O’Brien J,Lehrach H, Bachellerie JP, Brosius J. 2001. RNomics: Anexperimental approach that identifies 201 candidates fornovel, small, non-messenger RNAs in mouse. EMBO J 20:2943–2953.

Huttenhofer A, Schattner P, Polacek N. 2005. Non-codingRNAs: Hope or hype? Trends Genet 21: 289–297.

Iacoangeli A, Tiedge H. 2013. Translational control at thesynapse: Role of RNA regulators. Trends Biochem Sci 38:47–55.

Ingolia NT, Lareau LF, Weissman JS. 2011. Ribosome pro-filing of mouse embryonic stem cells reveals the complex-ity and dynamics of mammalian proteomes. Cell 147:789–802.

Ingolia NT, Brar GA, Rouskin S, McGeachy AM, WeissmanJS. 2013. Genome-wide annotation and quantitation oftranslation by ribosome profiling. Current protocols inmolecular biology (ed. Ausubel FM, et al.), Chap. 4,Unit 4 18.

International Human Genome Sequencing Consortium.2004. Finishing the euchromatic sequence of the humangenome. Nature 431: 931–945.

Jacquier A. 2009. The complex eukaryotic transcriptome:Unexpected pervasive transcription and novel smallRNAs. Nat Rev Genet 10: 833–844.

Jady BE, Ketele A, Kiss T. 2012. Human intron-encoded AluRNAs are processed and packaged into Wdr79-associated

nucleoplasmic box H/ACA RNPs. Genes Dev 26: 1897–1910.

Jensen TH, Jacquier A, Libri D. 2013. Dealing with pervasivetranscription. Mol Cell 52: 473–484.

Johnsson P, Ackley A, Vidarsdottir L, Lui WO, Corcoran M,Grander D, Morris KV. 2013. A pseudogene long-non-coding-RNA network regulates PTEN transcription andtranslation in human cells. Nat Struct Mol Biol 20: 440–446.

Johnston WK, Unrau PJ, Lawrence MS, Glasner ME, BartelDP. 2001. RNA-catalyzed RNA polymerization: Accurateand general RNA-templated primer extension. Science292: 1319–1325.

Jordan IK, Rogozin IB, Glazko GV, Koonin EV. 2003. Originof a substantial fraction of human regulatory sequencesfrom transposable elements. Trends Genet 19: 68–72.

Joyce GF, Schwartz AW, Miller SL, Orgel LE. 1987. The casefor an ancestral genetic system involving simple ana-logues of the nucleotides. Proc Natl Acad Sci 84: 4398–4402.

Kaessmann H. 2010. Origins, evolution, and phenotypicimpact of new genes. Genome Res 20: 1313–1326.

Kapitonov VV, Jurka J. 2005. RAG1 core and V(D)J recom-bination signal sequences were derived from Transibtransposons. PLoS Biol 3: e181.

Kapranov P, St Laurent G. 2012. Dark matter RNA: Exis-tence, function, and controversy. Front Genet 3: 60.

Kapranov P, Ozsolak F, Milos PM. 2012. Profiling of shortRNAs using Helicos single-molecule sequencing. Meth-ods Mol Biol 822: 219–232.

Kapusta A, Kronenberg Z, Lynch VJ, Zhuo X, Ramsay L,Bourque G, Yandell M, Feschotte C. 2013. Transposableelements are major contributors to the origin, diversifi-cation, and regulation of vertebrate long noncodingRNAs. PLoS Genet 9: e1003470.

Kastenmayer JP, Ni L, Chu A, Kitchen LE, Au WC, Yang H,Carter CD, Wheeler D, Davis RW, Boeke JD, et al. 2006.Functional genomics of genes with small open readingframes (sORFs) in S. cerevisiae. Genome Res 16: 365–373.

Khayrullina GA, Raabe CA, Hoe CH, Becker K, Reinhardt R,Tang TH, Rozhdestvensky TS, Kopylov AM. 2012. Tran-scription analysis and small non-protein coding RNAsassociated with bacterial ribosomal protein operons.Curr Med Chem 19: 5187–5198.

Kim J, Martignetti JA, Shen MR, Brosius J, Deininger P.1994. Rodent BC1 RNA gene as a master gene for IDelement amplification. Proc Natl Acad Sci 91: 3607–3611.

King TH, Liu B, McCully RR, Fournier MJ. 2003. Ribosomestructure and activity are altered in cells lacking snoRNPsthat form pseudouridines in the peptidyl transferase cen-ter. Mol Cell 11: 425–435.

Kishore S, Khanna A, Zhang Z, Hui J, Balwierz PJ, Stefan M,Beach C, Nicholls RD, Zavolan M, Stamm S. 2010. ThesnoRNA MBII-52 (SNORD 115) is processed into small-er RNAs and regulates alternative splicing. Hum Mol Ge-net 19: 1153–1164.

Koonin EV. 2010. Taming of the shrewd: Novel eukaryoticgenes from RNA viruses. BMC Biol 8: 2.

Kornienko AE, Guenzl PM, Barlow DP, Pauler FM. 2013.Gene regulation by the act of long non-coding RNA tran-scription. BMC Biol 11: 59.

RNA and Eukaryotic Gene/Genome Architecture

Cite this article as Cold Spring Harb Perspect Biol 2014;6:a016089 21

on June 27, 2022 - Published by Cold Spring Harbor Laboratory Press http://cshperspectives.cshlp.org/Downloaded from

Page 22: The Persistent Contributions of RNA to Eukaryotic Gen(om)e

Kowalczyk MS, Higgs DR, Gingeras TR. 2012. Molecularbiology: RNA discrimination. Nature 482: 310–311.

Kriegs JO, Schmitz J, Makalowski W, Brosius J. 2005. Doesthe AD7c-NTP locus encode a protein? Biochim BiophysActa 1727: 1–4.

Kruger K, Grabowski PJ, Zaug AJ, Sands J, Gottschling DE,Cech TR. 1982. Self-splicing RNA: Autoexcision and au-tocyclization of the ribosomal RNA intervening sequenceof Tetrahymena. Cell 31: 147–157.

Krull M, Brosius J, Schmitz J. 2005. Alu-SINE exonization:En route to protein-coding function. Mol Biol Evol 22:1702–1711.

Krull M, Petrusma M, Makalowski W, Brosius J, Schmitz J.2007. Functional persistence of exonized mammalian-wide interspersed repeat elements (MIRs). Genome Res17: 1139–1145.

Kung JT, Colognori D, Lee JT. 2013. Long noncoding RNAs:Past, present, and future. Genetics 193: 651–669.

Lagos-Quintana M, Rauhut R, Lendeckel W, Tuschl T. 2001.Identification of novel genes coding for small expressedRNAs. Science 294: 853–858.

Lambowitz AM, Zimmerly S. 2011. Group II introns: Mo-bile ribozymes that invade DNA. Cold Spring Harb Per-spect Biol 3: a003616.

Lau NC, Lim LP, Weinstein EG, Bartel DP. 2001. An abun-dant class of tiny RNAs with probable regulatory roles inCaenorhabditis elegans. Science 294: 858–862.

Lee CA, Fournier MJ, Beckwith J. 1985. Escherichia coli 6SRNA is not essential for growth or protein secretion. JBacteriol 161: 1156–1161.

Lee YS, Shibata Y, Malhotra A, Dutta A. 2009. A novel classof small RNAs: tRNA-derived RNA fragments (tRFs).Genes Dev 23: 2639–2649.

Lerner MR, Boyle JA, Mount SM, Wolin SL, Steitz JA. 1980.Are snRNPs involved in splicing? Nature 283: 220–224.

Leslie M. 2013a. Cell Biology. NIH effort gambles on mys-terious extracellular RNAs. Science 341: 947.

Leslie M. 2013b. Cell biology. The immune system’s compactgenomic counterpart. Science 339: 25–27.

Lev-Maor G, Sorek R, Shomron N, Ast G. 2003. The birth ofan alternatively spliced exon: 30 splice-site selection in Aluexons. Science 300: 1288–1291.

Lewejohann L, Skryabin BV, Sachser N, Prehn C, Hei-duschka P, Thanos S, Jordan U, Dell’Omo G, VyssotskiAL, Pleskacheva MG, et al. 2004. Role of a neuronal smallnon-messenger RNA: Behavioural alterations in BC1RNA-deleted mice. Behav Brain Res 154: 273–289.

Lewis EB. 1951. Pseudoallelism and gene evolution. ColdSpring Harb Symp Quant Biol 16: 159–174.

Li HD, Zagorski J, Fournier MJ. 1990. Depletion of U14small nuclear RNA (snR128) disrupts production of18S rRNA in Saccharomyces cerevisiae. Mol Cell Biol 10:1145–1152.

Li L, Liu B, Wapinski OL, Tsai MC, Qu K, Zhang J, CarlsonJC, Lin M, Fang F, Gupta RA, et al. 2013. Targeted dis-ruption of Hotair leads to homeotic transformation andgene derepression. Cell Rep 5: 3–12.

Liang KH, Yeh CT. 2013. A gene expression restriction net-work mediated by sense and antisense Alu sequences

located on protein-coding messenger RNAs. BMC Geno-mics 14: 325.

Lincoln TA, Joyce GF. 2009. Self-sustained replication of anRNA enzyme. Science 323: 1229–1232.

Lindblad-Toh K, Garber M, Zuk O, Lin MF, Parker BJ, Wash-ietl S, Kheradpour P, Ernst J, Jordan G, Mauceli E, et al.2011. A high-resolution map of human evolutionaryconstraint using 29 mammals. Nature 478: 476–482.

Long M, Betran E, Thornton K, Wang W. 2003. The origin ofnew genes: Glimpses from the young and old. Nat RevGenet 4: 865–875.

Long M, VanKuren NW, Chen S, Vibranowski MD. 2013.New gene evolution: Little did we know. Annu Rev Genet47: 307–333.

Lowe CB, Haussler D. 2012. 29 mammalian genomes revealnovel exaptations of mobile elements for likely regulatoryfunctions in the human genome. PloS ONE 7: e43128.

Lunyak VV, Prefontaine GG, Nunez E, Cramer T, Ju BG,Ohgi KA, Hutt K, Roy R, Garcia-Diaz A, Zhu X, et al.2007. Developmentally regulated activation of a SINE B2repeat as a domain boundary in organogenesis. Science317: 248–251.

Lynch M, Force A. 2000. The probability of duplicategene preservation by subfunctionalization. Genetics 154:459–473.

Lynch VJ, Leclerc RD, May G, Wagner GP. 2011. Transpo-son-mediated rewiring of gene regulatory networks con-tributed to the evolution of pregnancy in mammals. NatGenet 43: 1154–1159.

Magny EG, Pueyo JI, Pearl FM, Cespedes MA, Niven JE,Bishop SA, Couso JP. 2013. Conserved regulation of car-diac calcium uptake by peptides encoded in small openreading frames. Science 341: 1116–1120.

Maizels N, Weiner AM. 1987. Peptide-specific ribosomes,genomic tags, and the origin of the genetic code. ColdSpring Harb Symp Quant Biol 52: 743–749.

Makalowski W, Mitchell GA, Labuda D. 1994. Alu sequencesin the coding regions of mRNA: A source of proteinvariability. Trends Genet 10: 188–193.

Managadze D, Lobkovsky AE, Wolf YI, Shabalina SA, Rogo-zin IB, Koonin EV. 2013. The vast, conserved mammalianlincRNome. PLoS Comput Biol 9: e1002917.

Mandal AK, Pandey R, Jha V, Mukerji M. 2013. Transcrip-tome-wide expansion of non-coding regulatory switches:Evidence from co-occurrence of Alu exonization, anti-sense and editing. Nucleic Acids Res 41: 2121–2137.

Mansy SS, Schrum JP, Krishnamurthy M, Tobe S, Treco DA,Szostak JW. 2008. Template-directed synthesis of a genet-ic polymer in a model protocell. Nature 454: 122–125.

Marahrens Y, Panning B, Dausman J, Strauss W, Jaenisch R.1997. Xist-deficient mice are defective in dosage compen-sation but not spermatogenesis. Genes Dev 11: 156–166.

Mariner PD, Walters RD, Espinoza CA, Drullinger LF, Wag-ner SD, Kugel JF, Goodrich JA. 2008. Human Alu RNA isa modular transacting repressor of mRNA transcriptionduring heat shock. Mol Cell 29: 499–509.

Marques AC, Ponting CP. 2009. Catalogues of mammalianlong noncoding RNAs: Modest conservation and incom-pleteness. Genome Biol 10: R124.

J. Brosius

22 Cite this article as Cold Spring Harb Perspect Biol 2014;6:a016089

on June 27, 2022 - Published by Cold Spring Harbor Laboratory Press http://cshperspectives.cshlp.org/Downloaded from

Page 23: The Persistent Contributions of RNA to Eukaryotic Gen(om)e

Martens JA, Laprade L, Winston F. 2004. Intergenic tran-scription is required to repress the Saccharomyces cerevi-siae SER3 gene. Nature 429: 571–574.

Martignetti JA, Brosius J. 1993. Neural BC1 RNA as anevolutionary marker: Guinea pig remains a rodent. ProcNatl Acad Sci 90: 9698–9702.

Martin W, Koonin EV. 2006. Introns and the origin of nu-cleus-cytosol compartmentalization. Nature 440: 41–45.

Mast CB, Schink S, Gerland U, Braun D. 2013. Escalation ofpolymerization in a thermal gradient. Proc Natl Acad Sci110: 8030–8035.

Mattick JS. 1994. Introns: Evolution and function. CurrOpin Genet Dev 4: 823–831.

Mattick JS. 2003. Challenging the dogma: The hidden layerof non-protein-coding RNAs in complex organisms. Bio-Essays 25: 930–939.

Mattick JS, Dinger ME. 2013. The extent of functionality inthe human genome. HUGO J 7: 2.

Maynard Smith J, Szathmary E. 1995. The major transitionsin evolution. Oxford University Press, Oxford.

McKinnon RD, Shinnick TM, Sutcliffe JG. 1986. The neu-ronal identifier element is a cis-acting positive regulatorof gene expression. Proc Natl Acad Sci 83: 3751–3755.

Medstrand P, van de Lagemaat LN, Dunn CA, Landry JR,Svenback D, Mager DL. 2005. Impact of transposableelements on the evolution of mammalian gene regula-tion. Cytogenet Genome Res 110: 342–352.

Memczak S, Jens M, Elefsinioti A, Torti F, Krueger J, RybakA, Maier L, Mackowiak SD, Gregersen LH, MunschauerM, et al. 2013. Circular RNAs are a large class of animalRNAs with regulatory potency. Nature 495: 333–338.

Mercer TR, Gerhardt DJ, Dinger ME, Crawford J, TrapnellC, Jeddeloh JA, Mattick JS, Rinn JL. 2012. Targeted RNAsequencing reveals the deep complexity of the humantranscriptome. Nat Biotechnol 30: 99–104.

Milner RJ, Bloom FE, Lai C, Lerner RA, Sutcliffe JG. 1984.Brain-specific genes have identifier sequences in theirintrons. Proc Natl Acad Sci 81: 713–717.

Mo D, Raabe CA, Reinhardt R, Brosius J, RozhdestvenskyTS. 2013. Alternative processing as evolutionary mecha-nism for the origin of novel non-protein coding RNAs.Genome Biol Evol 5: 2061–2071.

Monte SM, Ghanbari K, Frey WH, Beheshti I, Averback P,Hauser SL, Ghanbari HA, Wands JR. 1997. Characteriza-tion of the AD7C-NTP cDNA expression in Alzheimer’sdisease and measurement of a 41-kD protein in cerebro-spinal fluid. J Clin Invest 100: 3093–3104.

Moss EG. 2001. RNA interference: It’s a small RNA world.Curr Biol 11: R772–R775.

Mudge JM, Frankish A, Harrow J. 2013. Functional tran-scriptomics in the post-ENCODE era. Genome Res 23:1961–1973.

Muller HJ. 1935. The origination of chromatin deficienciesas minute deletions subject to insertion elsewhere.Genetics 17: 237–252.

Muotri AR, Chu VT, Marchetto MC, Deng W, Moran JV,Gage FH. 2005. Somatic mosaicism in neuronal precur-sor cells mediated by L1 retrotransposition. Nature 435:903–910.

Murphy DN, McLysaght A. 2012. De novo origin of protein-coding genes in murine rodents. PloS ONE 7: e48650.

Necsulea A, Soumillon M, Warnefors M, Liechti A, Daish T,Zeller U, Baker JC, Grutzner F, Kaessmann H. 2014. Theevolution of lncRNA repertoires and expression patternsin tetrapods. Nature 505: 635–640.

Nei M. 1969. Gene duplication and nucleotide substitutionin evolution. Nature 221: 40–42.

Neil H, Malabat C, d’Aubenton-Carafa Y, Xu Z, SteinmetzLM, Jacquier A. 2009. Widespread bidirectional promot-ers are the major source of cryptic transcripts in yeast.Nature 457: 1038–1042.

Nekrutenko A, Li WH. 2001. Transposable elements arefound in a large number of human protein-coding genes.Trends Genet 17: 619–621.

Nelson AC, Wardle FC. 2013. Conserved non-coding ele-ments and cis regulation: Actions speak louder thanwords. Development 140: 1385–1395.

Neme R, Tautz D. 2013. Phylogenetic patterns of emergenceof new genes support a model of frequent de novo evo-lution. BMC Genomics 14: 117.

Nesse RM, Ganten D, Gregory TR, Omenn GS. 2012. Evo-lutionary molecular medicine. J Mol Med (Berl) 90: 509–522.

Neveu M, Kim HJ, Benner SA. 2013. The “strong” RNAworld hypothesis: Fifty years old. Astrobiology 13: 391–403.

Nielsen MM, Tehler D, Vang S, Sudzina F, Hedegaard J,Nordentoft I, Orntoft TF, Lund AH, Pedersen JS. 2014.Identification of expressed and conserved human non-coding RNAs. RNA 20: 236–251.

Nirenberg M, Leder P, Bernfield M, Brimacombe R, TrupinJ, Rottman F, O’Neal C. 1965. RNA codewords and pro-tein synthesis: VII. On the general nature of the RNAcode. Proc Natl Acad Sci 53: 1161–1168.

Niu DK, Jiang L. 2013. Can ENCODE tell us how much junkDNAwe carry in our genome? Biochem Biophys Res Com-mun 430: 1340–1343.

Noller HF. 2012. Evolution of protein synthesis from anRNA world. Cold Spring Harb Perspect Biol 4: a003681.

Noller HF, Chaires JB. 1972. Functional modification of 16Sribosomal RNA by kethoxal. Proc Natl Acad Sci 69: 3115–3118.

Noller HF, Hoffarth V, Zimniak L. 1992. Unusual resistanceof peptidyl transferase to protein extraction procedures.Science 256: 1416–1419.

Orgel LE. 1968. Evolution of the genetic apparatus. J MolBiol 38: 381–393.

Orom UA, Shiekhattar R. 2013. Long noncoding RNAs ush-er in a new era in the biology of enhancers. Cell 154:1190–1193.

Owens GP, Chaudhari N, Hahn WE. 1985. Brain “identifiersequence” is not restricted to brain: Similar abundance innuclear RNA of other organs. Science 229: 1263–1265.

Parra G, Reymond A, Dabbouseh N, Dermitzakis ET, Cas-telo R, Thomson TM, Antonarakis SE, Guigo R. 2006.Tandem chimerism as a means to increase protein com-plexity in the human genome. Genome Res 16: 37–44.

Perrat PN, DasGupta S, Wang J, Theurkauf W, Weng Z,Rosbash M, Waddell S. 2013. Transposition-driven geno-

RNA and Eukaryotic Gene/Genome Architecture

Cite this article as Cold Spring Harb Perspect Biol 2014;6:a016089 23

on June 27, 2022 - Published by Cold Spring Harbor Laboratory Press http://cshperspectives.cshlp.org/Downloaded from

Page 24: The Persistent Contributions of RNA to Eukaryotic Gen(om)e

mic heterogeneity in the Drosophila brain. Science 340:91–95.

Petherick A. 2008. Genetics: The production line. Nature454: 1042–1045.

Pink RC, Wicks K, Caley DP, Punch EK, Jacobs L, Carter DR.2011. Pseudogenes: Pseudo-functional or key regulatorsin health and disease? RNA 17: 792–798.

Polev D. 2012. Transcriptional noise as a driver of geneevolution. J Theor Biol 293: 27–33.

Poliseno L, Salmena L, Zhang J, Carver B, Haveman WJ,Pandolfi PP. 2010. A coding-independent function ofgene and pseudogene mRNAs regulates tumour biology.Nature 465: 1033–1038.

Ponicsan SL, Houel S, Old WM, Ahn NG, Goodrich JA,Kugel JF. 2013. The non-coding B2 RNA binds to theDNA cleft and active-site region of RNA polymerase II.J Mol Biol 425: 3625–3638.

Ponting CP, Oliver PL, Reik W. 2009. Evolution and func-tions of long noncoding RNAs. Cell 136: 629–641.

Poole A, Jeffares D, Penny D. 1999. Early evolution: Prokary-otes, the new kids on the block. BioEssays 21: 880–889.

Powner MW, Gerland B, Sutherland JD. 2009. Synthesis ofactivated pyrimidine ribonucleotides in prebioticallyplausible conditions. Nature 459: 239–242.

Prasanth KV, Spector DL. 2007. Eukaryotic regulatoryRNAs: An answer to the “genome complexity” conun-drum. Genes Dev 21: 11–42.

Prestayko AW, Busch H. 1968. Low molecular weight RNA ofthe chromatin fraction from Novikoff hepatoma and ratliver nuclei. Biochim Biophys Acta 169: 327–337.

Reanney DC. 1974. On the origin of prokaryotes. J TheorBiol 48: 243–251.

Rebollo R, Romanish MT, Mager DL. 2012. Transposableelements: An abundant and natural source of regulatorysequences for host genes. Annu Rev Genet 46: 21–42.

Reilly MT, Faulkner GJ, Dubnau J, Ponomarev I, Gage FH.2013. The role of transposable elements in health anddiseases of the central nervous system. J Neurosci 33:17577–17586.

Rich A. 1962. On the problems of evolution and biochem-ical information transfer. In Horizons in biochemistry(ed. Kasha M, Pullman B), pp. 103–126. Academic,New York.

Richard P, Manley JL. 2009. Transcription termination bynuclear RNA polymerases. Genes Dev 23: 1247–1269.

Rinn JL, Kertesz M, Wang JK, Squazzo SL, Xu X, BrugmannSA, Goodnough LH, Helms JA, Farnham PJ, Segal E, et al.2007. Functional demarcation of active and silent chro-matin domains in human HOX loci by noncoding RNAs.Cell 129: 1311–1323.

Ritz K, van Schaik BD, Jakobs ME, Aronica E, Tijssen MA,van Kampen AH, Baas F. 2011. Looking ultra deep: Shortidentical sequences and transcriptional slippage. Geno-mics 98: 90–95.

Robertson MP, Joyce GF. 2012. The origins of the RNAworld. Cold Spring Harb Perspect Biol 4: a003608.

Robinson R. 2010. Dark matter transcripts: Sound and fury,signifying nothing? PLoS Biol 8: e1000370.

Ross RJ, Weiner MM, Lin H. 2014. PIWI proteins and PIWI-interacting RNAs in the soma. Nature 505: 353–359.

Roth C, Rastogi S, Arvestad L, Dittmar K, Light S, Ekman D,Liberles DA. 2007. Evolution after gene duplication:Models, mechanisms, sequences, systems, and organ-isms. J Exp Zool B Mol Dev Evol 308: 58–73.

Rudd KE, Humphery-Smith I, Wasinger VC, Bairoch A.1998. Low molecular weight proteins: A challenge forpost-genomic research. Electrophoresis 19: 536–544.

Runte M, Huttenhofer A, Gross S, Kiefmann M, Hors-themke B, Buiting K. 2001. The IC-SNURF-SNRPN tran-script serves as a host for multiple small nucleolar RNAspecies and as an antisense RNA for UBE3A. Hum MolGenet 10: 2687–2700.

Ruvkun G. 2001. Molecular biology. Glimpses of a tiny RNAworld. Science 294: 797–799.

Sabin LR, Delas MJ, Hannon GJ. 2013. Dogma derailed: Themany influences of RNA on the genome. Mol Cell 49:783–794.

Sacerdote MG, Szostak JW. 2005. Semipermeable lipid bi-layers exhibit diastereoselectivity favoring ribose. ProcNatl Acad Sci 102: 6004–6008.

Salzman J, Gawad C, Wang PL, Lacayo N, Brown PO. 2012.Circular RNAs are the predominant transcript isoformfrom hundreds of human genes in diverse cell types. PloSONE 7: e30733.

Santangelo AM, de Souza FS, Franchini LF, Bumaschny VF,Low MJ, Rubinstein M. 2007. Ancient exaptation of aCORE-SINE retroposon into a highly conserved mam-malian neuronal enhancer of the proopiomelanocortingene. PLoS Genet 3: 1813–1826.

Sapienza C, St-Jacques B. 1986. “Brain- specific” transcrip-tion and evolution of the identifier sequence. Nature 319:418–420.

Sasaki T, Nishihara H, Hirakawa M, Fujimura K, Tanaka M,Kokubo N, Kimura-Yoshida C, Matsuo I, Sumiyama K,Saitou N, et al. 2008. Possible involvement of SINEs inmammalian-specific brain formation. Proc Natl Acad Sci105: 4220–4225.

Sauvageau M, Goff LA, Lodato S, Bonev B, Groff AF, Ger-hardinger C, Sanchez-Gomez DB, Hacisuleyman E, Li E,Spence M, et al. 2013. Multiple knockout mouse modelsreveal lincRNAs are required for life and brain develop-ment. eLife 2: e01749.

Savard J, Marques-Souza H, Aranda M, Tautz D. 2006. Asegmentation gene in tribolium produces a polycistronicmRNA that codes for multiple conserved peptides. Cell126: 559–569.

Saxena A, Carninci P. 2011a. Long non-coding RNA mod-ifies chromatin: Epigenetic silencing by long non-codingRNAs. BioEssays 33: 830–839.

Saxena A, Carninci P. 2011b. Whole transcriptome analysis:What are we still missing? Wiley Interdiscip Rev Syst BioMed 3: 527–543.

Schartl M, Hornung U, Gutbrod H, Volff JN, Wittbrodt J.1999. Melanoma loss-of-function mutants in Xiphopho-rus caused by Xmrk-oncogene deletion and gene disrup-tion by a transposable element. Genetics 153: 1385–1394.

Schonberg DL, Bao S, Rich JN. 2013. Genomics informsglioblastoma biology. Nat Genet 45: 1105–1107.

Schoning K, Scholz P, Guntha S, Wu X, Krishnamurthy R,Eschenmoser A. 2000. Chemical etiology of nucleic acid

J. Brosius

24 Cite this article as Cold Spring Harb Perspect Biol 2014;6:a016089

on June 27, 2022 - Published by Cold Spring Harbor Laboratory Press http://cshperspectives.cshlp.org/Downloaded from

Page 25: The Persistent Contributions of RNA to Eukaryotic Gen(om)e

structure: The a-threofuranosyl-(30!20) oligonucleo-tide system. Science 290: 1347–1351.

Schorderet P, Duboule D. 2011. Structural and functionaldifferences in the long non-coding RNA hotair in mouseand human. PLoS Genet 7: e1002071.

Seila AC, Calabrese JM, Levine SS, Yeo GW, Rahl PB, FlynnRA, Young RA, Sharp PA. 2008. Divergent transcriptionfrom active promoters. Science 322: 1849–1851.

Sen K, Ghosh TC. 2013. Pseudogenes and their composers:Delving in the “debris” of human genome. Brief FunctGenomics 12: 536–547.

Shah SP, Roth A, Goya R, Oloumi A, Ha G, Zhao Y, Turash-vili G, Ding J, Tse K, Haffari G, et al. 2012. The clonal andmutational evolution spectrum of primary triple-nega-tive breast cancers. Nature 486: 395–399.

Sharp PA. 1991. Five easy pieces. Science 254: 663.

Shechner DM, Grant RA, Bagby SC, Koldobskaya Y, PiccirilliJA, Bartel DP. 2009. Crystal structure of the catalytic coreof an RNA-polymerase ribozyme. Science 326: 1271–1275.

Shen S, Lin L, Cai JJ, Jiang P, Kenkel EJ, Stroik MR, Sato S,Davidson BL, Xing Y. 2011. Widespread establishmentand regulatory impact of Alu exons in human genes.Proc Natl Acad Sci 108: 2837–2842.

Shen Y, Yue F, McCleary DF, Ye Z, Edsall L, Kuan S, WagnerU, Dixon J, Lee L, Lobanenkov VV, et al. 2012. A map ofthe cis-regulatory sequences in the mouse genome. Na-ture 488: 116–120.

Sibilia M, Wagner EF. 1995. Strain-dependent epithelial de-fects in mice lacking the EGF receptor. Science 269: 234–238.

Siepel A. 2009. Darwinian alchemy: Human genes fromnoncoding DNA. Genome Res 19: 1693–1695.

Singer T, McConnell MJ, Marchetto MC, Coufal NG, GageFH. 2010. LINE-1 retrotransposons: Mediators of so-matic variation in neuronal genomes? Trends Neurosci33: 345–354.

Skryabin BV, Gubar LV, Seeger B, Pfeiffer J, Handel S, RobeckT, Karpova E, Rozhdestvensky TS, Brosius J. 2007. Dele-tion of the MBII-85 snoRNA gene cluster in mice resultsin postnatal growth retardation. PLoS Genet 3: e235.

Soll D, Ohtsuka E, Jones DS, Lohrmann R, Hayatsu H,Nishimura S, Khorana HG. 1965. Studies on polynucle-otides, XLIX. Stimulation of the binding of aminoacyl-sRNA’s to ribosomes by ribotrinucleotides and a survey ofcodon assignments for 20 amino acids. Proc Natl Acad Sci54: 1378–1385.

Sorek R, Ast G, Graur D. 2002. Alu-containing exons arealternatively spliced. Genome Res 12: 1060–1067.

Sousa C, Johansson C, Charon C, Manyani H, Sautter C,Kondorosi A, Crespi M. 2001. Translational and struc-tural requirements of the early nodulin gene enod40, ashort-open reading frame-containing RNA, for elicita-tion of a cell-specific growth response in the alfalfa rootcortex. Mol Cell Biol 21: 354–366.

Spengler RM, Oakley CK, Davidson BL. 2014. FunctionalmicroRNAs and target sites are created by lineage-specif-ic transposition. Hum Mol Genet 23: 1783–1793.

Stephens SG. 1951. Possible significances of duplication inevolution. Adv Genet 4: 247–265.

St Laurent G, Shtokalo D, Tackett MR, Yang Z, Eremina T,Wahlestedt C, Urcuqui-Inchima S, Seilheimer B, McCaf-frey TA, Kapranov P. 2012. Intronic RNAs constitute themajor fraction of the non-coding RNA in mammaliancells. BMC Genomics 13: 504.

Struhl K. 2007. Transcriptional noise and the fidelity ofinitiation by RNA polymerase II. Nat Struct Mol Biol14: 103–105.

Sutcliffe JG, Milner RJ, Gottesfeld JM, Lerner RA. 1984a.Identifier sequences are transcribed specifically in brain.Nature 308: 237–241.

Sutcliffe JG, Milner RJ, Gottesfeld JM, Reynolds W. 1984b.Control of neuronal gene expression. Science 225: 1308–1315.

Szathmary E. 2006. The origin of replicators and reproduc-ers. Philos Trans R Soc Lond B Biol Sci 361: 1761–1776.

Szathmary E, Smith JM. 1995. The major evolutionary tran-sitions. Nature 374: 227–232.

Szostak JW, Bartel DP, Luisi PL. 2001. Synthesizing life.Nature 409: 387–390.

Takeuchi N, Hogeweg P. 2008. Evolution of complexity inRNA-like replicator systems. Biol Direct 3: 11.

Taleb NN. 2010. The bed of Procrustes: Philosophical andpractical aphorisms. Random House, New York.

Tay Y, Rinn J, Pandolfi PP. 2014. The multilayered complex-ity of ceRNA crosstalk and competition. Nature 505:344–352.

Taylor DJ, Bruenn J. 2009. The evolution of novel fungalgenes from non-retroviral RNA viruses. BMC Biol 7: 88.

Thompson JR, Zagorski J, Woolford JL, Fournier MJ. 1988.Sequence and genetic analysis of a dispensible 189 nucle-otide snRNA from Saccharomyces cerevisiae. Nucleic AcidsRes 16: 5587–5601.

Tiedge H, Fremeau RT Jr, Weinstock PH, Arancio O, BrosiusJ. 1991. Dendritic location of neural BC1 RNA. Proc NatlAcad Sci 88: 2093–2097.

Tisseur M, Kwapisz M, Morillon A. 2011. Pervasive tran-scription—Lessons from yeast. Biochimie 93: 1889–1896.

Trifonov EN. 1989. The multiple codes of nucleotide se-quences. Bull Math Biol 51: 417–432.

Trotochaud AE, Wassarman KM. 2004. 6S RNA functionenhances long-term cell survival. J Bacteriol 186: 4978–4985.

Tuerk C, Gold L. 1990. Systematic evolution of ligands byexponential enrichment: RNA ligands to bacteriophageT4 DNA polymerase. Science 249: 505–510.

Uesaka M, Nishimura O, Go Y, Nakashima K, Agata K,Imamura T. 2014. Bidirectional promoters are the majorsource of gene activation-associated non-coding RNAs inmammals. BMC Genomics 15: 35.

Uhler JP, Hertel C, Svejstrup JQ. 2007. A role for noncodingtranscription in activation of the yeast PHO5 gene. ProcNatl Acad Sci 104: 8011–8016.

Ulitsky I, Bartel DP. 2013. lincRNAs: Genomics, evolution,and mechanisms. Cell 154: 26–46.

Upton KR, Baillie JK, Faulkner GJ. 2011. Is somatic retro-transposition a parasitic or symbiotic phenomenon? MobGenet Elements 1: 279–282.

RNA and Eukaryotic Gene/Genome Architecture

Cite this article as Cold Spring Harb Perspect Biol 2014;6:a016089 25

on June 27, 2022 - Published by Cold Spring Harbor Laboratory Press http://cshperspectives.cshlp.org/Downloaded from

Page 26: The Persistent Contributions of RNA to Eukaryotic Gen(om)e

Vaidya N, Manapat ML, Chen IA, Xulvi-Brunet R, HaydenEJ, Lehman N. 2012. Spontaneous network formationamong cooperative RNA replicators. Nature 491: 72–77.

van Bakel H, Nislow C, Blencowe BJ, Hughes TR. 2010. Most“dark matter” transcripts are associated with knowngenes. PLoS Biol 8: e1000371.

van de Lagemaat LN, Landry JR, Mager DL, Medstrand P.2003. Transposable elements in mammals promote reg-ulatory variation and diversification of genes with spe-cialized functions. Trends Genet 19: 530–536.

van der Veen R, Arnberg AC, van der Horst G, Bonen L,Tabak HF, Grivell LA. 1986. Excised group II introns inyeast mitochondria are lariats and can be formed by self-splicing in vitro. Cell 44: 225–234.

van Heesch S, van Iterson M, Jacobi J, Boymans S, Essers PB,de Bruijn E, Hao W, Macinnes AW, Cuppen E, Simonis M.2014. Extensive localization of long noncoding RNAs tothe cytosol and mono- and polyribosomal complexes.Genome Biol 15: R6.

Volff JN. 2006. Turning junk into gold: Domestication oftransposable elements and the creation of new genes ineukaryotes. BioEssays 28: 913–922.

Volff JN. 2009. Cellular genes derived from Gypsy/Ty3 ret-rotransposons in mammalian genomes. Ann NY AcadSci 1178: 233–243.

Wang H, Iacoangeli A, Popp S, Muslimov IA, Imataka H,Sonenberg N, Lomakin IB, Tiedge H. 2002. DendriticBC1 RNA: Functional role in regulation of translationinitiation. J Neurosci 22: 10232–10241.

Wang T, Zeng J, Lowe CB, Sellers RG, Salama SR, Yang M,Burgess SM, Brachmann RK, Haussler D. 2007. Species-specific endogenous retroviruses shape the transcription-al network of the human tumor suppressor protein p53.Proc Natl Acad Sci 104: 18613–18618.

Wang Z, Gerstein M, Snyder M. 2009. RNA-Seq: A rev-olutionary tool for transcriptomics. Nat Rev Genet 10:57–63.

Wang J, Gong C, Maquat LE. 2013. Control of myogenesisby rodent SINE-containing lncRNAs. Genes Dev 27:793–804.

Ward LD, Kellis M. 2012. Evidence of abundant purifyingselection in humans for recently acquired regulatoryfunctions. Science 337: 1675–1678.

Washietl S, Pedersen JS, Korbel JO, Stocsits C, Gruber AR,Hackermuller J, Hertel J, Lindemeyer M, Reiche K, TanzerA, et al. 2007. Structured RNAs in the ENCODE selectedregions of the human genome. Genome Res 17: 852–864.

Wassarman KM, Storz G. 2000. 6S RNA regulates E. coliRNA polymerase activity. Cell 101: 613–623.

Watanabe Y, Yamamoto M. 1994. S. pombe mei2þ encodesan RNA-binding protein essential for premeiotic DNAsynthesis and meiosis I, which cooperates with a novelRNA species meiRNA. Cell 78: 487–498.

Wei W, Pelechano V, Jarvelin AI, Steinmetz LM. 2011. Func-tional consequences of bidirectional promoters. TrendsGenet 27: 267–276.

Weiner AM. 2006. SINEs and LINEs: Troublemakers, sabo-teurs, benefactors, ancestors. In The RNAworld (ed. Ges-teland RF, Cech TR, Atkins JF), pp. 507–533. Cold SpringHarbor Laboratory Press, Cold Spring Harbor, NY.

Wen YZ, Zheng LL, Qu LH, Ayala FJ, Lun ZR. 2012. Pseu-dogenes are not pseudo any more. RNA Biol 9: 27–32.

Williams GC, Nesse RM. 1991. The dawn of Darwinianmedicine. Q Rev Biol 66: 1–22.

Wochner A, Attwater J, Coulson A, Holliger P. 2011. Ribo-zyme-catalyzed transcription of an active ribozyme. Sci-ence 332: 209–212.

Woese CR. 1967. The genetic code: The molecular basis forgenetic expression. Harper and Row, New York.

Woese CR. 1979. A proposal concerning the origin of life onthe planet earth. J Mol Evol 13: 95–101.

Wu X, Weigel D, Wigge PA. 2002. Signaling in plants byintercellular RNA and protein movement. Genes Dev16: 151–158.

Xie D, Chen CC, Ptaszek LM, Xiao S, Cao X, Fang F, Ng HH,Lewin HA, Cowan C, Zhong S. 2010. Rewirable generegulatory networks in the preimplantation embryonicdevelopment of three mammalian species. Genome Res20: 804–815.

Xiong Y, Eickbush TH. 1990. Origin and evolution of retro-elements based upon their reverse transcriptase sequenc-es. EMBO J 9: 3353–3362.

Xu Z, Wei W, Gagneur J, Perocchi F, Clauder-Munster S,Camblong J, Guffanti E, Stutz F, Huber W, SteinmetzLM. 2009. Bidirectional promoters generate pervasivetranscription in yeast. Nature 457: 1033–1037.

Yakovchuk P, Goodrich JA, Kugel JF. 2011. B2 RNA repressesTFIIH phosphorylation of RNA polymerase II. Transcrip-tion 2: 45–49.

Yang L, Froberg JE, Lee JT. 2014. Long noncoding RNAs:Fresh perspectives into the RNA world. Trends BiochemSci 39: 35–43.

Yarus M. 2011. Getting past the RNA world: The initialDarwinian ancestor. Cold Spring Harb Perspect Biol 3:a003590.

Yin QF, Yang L, Zhang Y, Xiang JF, Wu YW, Carmichael GG,Chen LL. 2012. Long noncoding RNAs with snoRNAends. Mol Cell 48: 219–230.

Yoshihara K, Shahmoradgoli M, Martinez E, Vegesna R, KimH, Torres-Garcia W, Trevino V, Shen H, Laird PW, LevineDA, et al. 2013. Inferring tumour purity and stromal andimmune cell admixture from expression data. NatureComm 4: 2612.

Zaher HS, Unrau PJ. 2007. Selection of an improved RNApolymerase ribozyme with superior extension and fidel-ity. RNA 13: 1017–1026.

Zhang L, Peritz A, Meggers E. 2005. A simple glycol nucleicacid. J Am Chem Soc 127: 4174–4175.

Zhernakova DV, de Klerk E, Westra HJ, Mastrokolias A,Amini S, Ariyurek Y, Jansen R, Penninx BW, HottengaJJ, Willemsen G, et al. 2013. DeepSAGE reveals geneticvariants associated with alternative polyadenylation andexpression of coding and non-coding transcripts. PLoSGenet 9: e1003594.

Zieve G, Penman S. 1976. Small RNA species of theHeLa cell: Metabolism and subcellular localization. Cell8: 19–31.

J. Brosius

26 Cite this article as Cold Spring Harb Perspect Biol 2014;6:a016089

on June 27, 2022 - Published by Cold Spring Harbor Laboratory Press http://cshperspectives.cshlp.org/Downloaded from

Page 27: The Persistent Contributions of RNA to Eukaryotic Gen(om)e

31, 20142014; doi: 10.1101/cshperspect.a016089 originally published online JulyCold Spring Harb Perspect Biol 

 Jürgen Brosius Architecture and Cellular FunctionThe Persistent Contributions of RNA to Eukaryotic Gen(om)e

Subject Collection The Origin and Evolution of Eukaryotes

FunctionEukaryotic Gen(om)e Architecture and Cellular The Persistent Contributions of RNA to

Jürgen Brosius

Mitochondrion Acquired?Eukaryotic Origins: How and When Was the

Anthony M. Poole and Simonetta Gribaldo

the Plant KingdomGreen Algae and the Origins of Multicellularity in

James G. Umen

Bacterial Influences on Animal OriginsRosanna A. Alegado and Nicole King

Phylogenomic PerspectiveThe Archaeal Legacy of Eukaryotes: A

Lionel Guy, Jimmy H. Saw and Thijs J.G. Ettemathe Eukaryotic Membrane-Trafficking SystemMissing Pieces of an Ancient Puzzle: Evolution of

Klute, et al.Alexander Schlacht, Emily K. Herman, Mary J.

OrganellesCytoskeleton in the Network of Eukaryotic Origin and Evolution of the Self-Organizing

Gáspár Jékely LifeIntracellular Coevolution and a Revised Tree ofOrigin of Eukaryotes and Cilia in the Light of The Neomuran Revolution and Phagotrophic

Thomas Cavalier-Smith

from Fossils and Molecular ClocksOn the Age of Eukaryotes: Evaluating Evidence

et al.Laura Eme, Susan C. Sharpe, Matthew W. Brown,

Consequence of Increased Cellular ComplexityProtein Targeting and Transport as a Necessary

Maik S. Sommer and Enrico Schleiff

SplicingOrigin of Spliceosomal Introns and Alternative

Manuel Irimia and Scott William Roy

How Natural a Kind Is ''Eukaryote?''W. Ford Doolittle

http://cshperspectives.cshlp.org/cgi/collection/ For additional articles in this collection, see

Copyright © 2014 Cold Spring Harbor Laboratory Press; all rights reserved

on June 27, 2022 - Published by Cold Spring Harbor Laboratory Press http://cshperspectives.cshlp.org/Downloaded from

Page 28: The Persistent Contributions of RNA to Eukaryotic Gen(om)e

Eukaryotic Epigeneticsand Geochemistry on the Provenance ofImprints of Bacterial Biochemical Diversification Protein and DNA Modifications: Evolutionary

et al.L. Aravind, A. Maxwell Burroughs, Dapeng Zhang,

Insights from Photosynthetic EukaryotesEndosymbionts to the Eukaryotic Nucleus? What Was the Real Contribution of

David Moreira and Philippe Deschamps

Phylogenomic PerspectiveThe Eukaryotic Tree of Life from a Global

Fabien BurkiComplex LifeBioenergetic Constraints on the Evolution of

Nick Lane

http://cshperspectives.cshlp.org/cgi/collection/ For additional articles in this collection, see

Copyright © 2014 Cold Spring Harbor Laboratory Press; all rights reserved

on June 27, 2022 - Published by Cold Spring Harbor Laboratory Press http://cshperspectives.cshlp.org/Downloaded from