New Gene Evolution: Little Did We Know

Manyuan Long,1,2,∗ Nicholas W. VanKuren,1,2 Sidi Chen,3 and Maria D. Vibranovski4

1 Department of Ecology and Evolution, The University of Chicago, Chicago, Illinois 60637; email: [email protected]
2 Committee on Genetics, Genomics, and Systems Biology, The University of Chicago, Chicago, Illinois 60637; email: [email protected]
3 Department of Biology and the Koch Institute, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139; email: [email protected]
4 Departamento de Genética e Biologia Evolutiva, Instituto de Biociências, Universidade de São Paulo, São Paulo, Brazil 05508; email: [email protected]

∗ Corresponding author

Annu. Rev. Genet. 2013. 47:307–33
First published online as a Review in Advance on September 13, 2013
The Annual Review of Genetics is online at genet.annualreviews.org
This article’s doi: 10.1146/annurev-genet-111212-133301
Copyright © 2013 by Annual Reviews. All rights reserved
Downloaded from www.annualreviews.org by University of Chicago Libraries on 12/01/13. For personal use only.

Keywords

evolutionary patterns, evolutionary rates, phenotypic evolution, brain evolution, sex dimorphism, gene networks

Abstract

Genes are perpetually added to and deleted from genomes during evolution. Thus, it is important to understand how new genes are formed and how they evolve to be critical components of the genetic systems that determine the biological diversity of life. Two decades of effort have shed light on the process of new gene origination and have contributed to an emerging comprehensive picture of how new genes are added to genomes, ranging from the mechanisms that generate new gene structures to the presence of new genes in different organisms to the rates and patterns of new gene origination and the roles of new genes in phenotypic evolution. We review each of these aspects of new gene evolution, summarizing the main evidence for the origination and importance of new genes in evolution. We highlight findings showing that new genes rapidly change existing genetic systems that govern various molecular, cellular, and phenotypic functions.




BACKGROUND AND HISTORICAL OVERVIEW

Understanding how genes originate and subsequently evolve is crucial to explaining the genetic basis for the origin and evolution of novel phenotypes and, ultimately, biological diversity. Gene origination is thus a widely interesting, yet difficult, problem to study. Perhaps unsurprisingly, the peculiar structures, functions, and evolution of new genes have attracted the interest of pioneers in genetics and evolution since the early twentieth century. Sturtevant (129) was one of the first to identify a duplicated gene, the Bar duplication in Drosophila melanogaster, from which Muller (103) developed the first prevalent model of new gene evolution in 1936. Muller (103, p. 529) predicted that a new duplicate copy of a gene could acquire a novel function and be preserved in the genome, and further that “there remains no reason to doubt the application of the dictum ‘all life from pre-existing life’ and ‘every cell from a pre-existing cell’ to the gene: ‘every gene from a pre-existing gene.’” This early thinking on single-gene and whole-chromosome duplications (55) was greatly expanded in the 1970s. Ohno (112) further developed Muller’s model in 1970, and Gilbert (52) proposed an entirely new model of new gene formation in 1978, whereby pieces of unrelated genes can be recombined into new genes rather than just be strictly duplicated. However, experimental work on new genes did not begin until the early 1990s, when a plausible framework for experimental studies of new gene formation and evolution was proposed: studies must focus on genes that were recently formed because young genes still carry all the signatures of the evolutionary forces that shaped their origination and the evolution of their new structures and functions (83). As genes age, they accumulate mutations that obscure the structural or evolutionary signals from their early history (83, 84). In eukaryotes, genes younger than 10–30 million years have not experienced much sequence evolution and thus constitute a valid system in which to investigate the evolution of new genes and to understand their properties. This idea was first manifested in the discovery of jingwei, a three-million-year-old gene in two species of African Drosophila (85). Jingwei revealed several interesting features of new gene evolution that are now known to be general: (a) recombination of existing genes, leading to a hybrid gene structure; (b) rapid sequence evolution driven by positive selection; and (c) acquisition of new biochemical functions (150, 162).

Today, it is clear that new gene origination is a general process in evolution and that species-specific or lineage-specific genes exist in many, if not all, organisms. Gigantic databases of genomic sequences from thousands of species reveal that genomes contain huge numbers and a large diversity of protein-coding genes. For example, the plant Glycine max genome encodes more than 50,000 protein-coding genes, whereas the bacterial genome of Candidatus Hodgkinia cicadicola contains only 189 genes. In addition, the abundance and diversity of noncoding genes are only now beginning to be realized. Even genomes with similar gene numbers can have very different, unrelated genes. These recent data reveal a widespread process of birth and death of genes in organisms in which new genes enter the genome and old genes are lost. What mechanisms and forces dictate gene birth and death? Specifically, how are new genes and novel functions added to genomes?

In the two decades since the discovery of jingwei, there have been several hundred additional publications reporting various interesting and significant observations of new genes and new gene functions in many different organisms. Regrettably, we can only choose a few representative publications to sketch several lines of observation that can provide insight into an emerging, global picture of new gene evolution. We follow the growth of scientific information and underlying ideas and concepts in new gene evolution, beginning by discussing the methods for identifying new genes and mechanistic processes of new gene formation. We then describe the rates and patterns of new gene origination and evolution that may indicate some rules governing these processes and discuss the evolutionary forces that act on new genes. Finally, we review the rapid growth of studies of the phenotypic effects of new genes and their impact on phenotypic evolution.

Fixation: the population genetic process by which a mutation spreads to all individuals in a population

Monophyletic group: a group of taxa that share a common ancestor

THE CONCEPT OF NEW GENE ORIGINATION

To understand various basic properties of new gene evolution, we need to have some conception of the process of new gene origination and an operational definition for the process. This definition helps us explore methods for new gene identification.

The Process of New Gene Origination

New gene origination begins in a microevolutionary process. A protogene structure is first generated by a mutation in a single germ-cell genome. This protogene structure must then spread through the population until it is fixed. Various evolutionary forces, such as natural selection and genetic drift, govern the spread of the protogene through the population, thus making protogene fixation a population genetic process. Both before and after fixation, the protogene accumulates mutations that confer on it new structures and beneficial, sometimes novel, functions that are acted on by natural selection. From the point that the protogene carries an optimized function and is fixed in the genome, it is essentially the same as most other, older genes in the genome and can be considered a new gene. New gene studies typically focus on these first two stages (the fixation process and acquisition of a beneficial function) and the consequences of accepted mutations on the sequence, structure, and function of the new gene.
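To make the roles of selection and drift in protogene fixation concrete, the following minimal sketch evaluates Kimura's classical diffusion approximation for the fixation probability of a single new mutation. This formula is not part of the review; it is brought in here only as an illustration. The sketch assumes a diploid population whose effective size equals its census size N, genic selection with coefficient s, and an initial frequency of 1/(2N). The function name and the example parameter values are hypothetical.

```python
import math

def fixation_probability(s: float, n: int) -> float:
    """Kimura's diffusion approximation for a new mutation.

    Assumptions (not from the review): diploid population of size n,
    effective size equal to n, genic selection with coefficient s,
    initial frequency 1 / (2n).
    """
    if abs(s) < 1e-12:
        # Neutral case: fixation probability equals the initial frequency.
        return 1.0 / (2 * n)
    exponent = -4.0 * n * s
    if exponent > 700:
        # Strongly deleterious in a large population: effectively zero
        # (also avoids floating-point overflow in math.exp).
        return 0.0
    return (1.0 - math.exp(-2.0 * s)) / (1.0 - math.exp(exponent))

if __name__ == "__main__":
    for s in (-0.01, 0.0, 0.01):
        print(s, fixation_probability(s, n=1000))
```

Under these assumptions, a neutral protogene fixes with probability 1/(2N), whereas a beneficial one in a large population fixes with probability close to 2s, which illustrates why both forces matter for whether a protogene becomes a new gene.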

Interest in new gene origination has raised several general problems. What molecular mechanisms generate new gene structures? What are the evolutionary forces that drive the origination of new genes? How often are new genes fixed in a species? Are there any rules or patterns of new gene origination? What are the roles of new genes in phenotypic evolution? This review provides an overview of efforts to understand the answers to these problems.

Approaches to Identifying New Genes

All new gene identification methods are based on comparative analysis of the structures of genes and genomes. Within a group of closely related species, we can define new genes as those that are present in all members of a monophyletic group but absent from all outgroup species (Figure 1). Early studies often serendipitously identified new genes by analyzing the phylogenetic distribution of genes via characterization of small genomic regions (e.g., 85, 108). Microarrays (42, 44, 45) and especially next-generation sequencing (168, 169) have made recent searches for new genes more purposeful efforts.

Multiple genomes. Syntenic alignments (Figure 1) of genomes can be used to identify new genes from related species whose phylogenetic relationship is known. Syntenic alignments of each gene in each species allow identification of genes that are present or absent in one genome relative to another (Figure 1). In these comparisons, a gene can be defined as a new gene candidate if it is present in a certain clade or single species and absent in all outgroup species (Figure 1). Additionally, the orthologous genes that flank the new gene candidate appear in all species under consideration. This strategy has been used with great success in Drosophila and mammals (35, 168, 169, 172). New genes formed by different mechanisms also have correspondingly different structural features that can be used to infer the mechanism of new gene formation and the ancestral and derived characters.
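The presence/absence criterion described above can be sketched in a few lines. This is only an illustration under stated assumptions: the species tree is given as nested tuples, the ladder-like topology used for S1–S4 is hypothetical, the presence/absence calls are taken as already derived from syntenic alignments, and the additional check that flanking orthologs exist in every species is omitted. The function names `clades` and `new_gene_candidates` and the gene G4 are invented for this example.

```python
def clades(tree):
    """Return (all leaves under tree, list of leaf sets for every clade).

    A tree is either a species name (str) or a tuple of subtrees.
    Single species count as clades, so species-specific genes qualify.
    """
    if isinstance(tree, str):
        leaves = frozenset([tree])
        return leaves, [leaves]
    all_leaves = frozenset()
    found = []
    for child in tree:
        child_leaves, child_clades = clades(child)
        all_leaves |= child_leaves
        found.extend(child_clades)
    found.append(all_leaves)
    return all_leaves, found

def new_gene_candidates(tree, presence):
    """Genes present in every member of one clade and absent elsewhere."""
    all_species, monophyletic = clades(tree)
    candidates = {}
    for gene, species in presence.items():
        species = frozenset(species)
        # A gene found in every species is old, not a candidate.
        if species in monophyletic and species != all_species:
            candidates[gene] = species
    return candidates

if __name__ == "__main__":
    # Hypothetical topology: ((S1, S2), S3) with S4 as the outgroup.
    tree = ((("S1", "S2"), "S3"), "S4")
    presence = {
        "G1": {"S1", "S2", "S3", "S4"},
        "G2": {"S1", "S2", "S3"},
        "G3": {"S1", "S2", "S3", "S4"},
        "G4": {"S1", "S3"},  # patchy distribution, not monophyletic
    }
    print(new_gene_candidates(tree, presence))
```

In this toy input only G2 is reported, matching the situation in Figure 1a; a patchy distribution such as G4 is left out because it is more likely to reflect gene loss or annotation gaps than a single origination event.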

Figure 1
New genes are defined using syntenic and sequence comparisons between the genomes of a group of related species. (a) The general procedure to identify new genes. The phylogenetic relationship of species S1–S4 is shown by the blue tree. The linking relationship between the genes G1 (yellow), G2 (red), and G3 (green) is shown within the species tree. Aligning the genomes of species S1–S4 shows that the new gene G2 is present in S1–S3 but absent in S4, indicating that G2 arose in the common ancestor of S1–S3. G2 was thus generated in the genome between old genes G1 and G3 in the common ancestor of S1, S2, and S3 (red star). (b) An example of using syntenic alignments to identify new genes. Sdic exists only in Drosophila melanogaster (110, 160). In this case, Sdic originated as a chimeric gene through recombination of duplicates of the two flanking genes, a 5′ piece of Cdic encoding a cytoplasmic dynein intermediate chain and a 3′ piece of AnnX.

Single genomes. Duplicate genes within a single genome can be identified using exhaustive pairwise comparisons between all annotated genes in that genome. Most mechanisms to form new gene structures (see below) result in certain structural changes in the new gene. For example, new genes created by RNA-based duplication (retrogenes) most often lack introns, contain a stretch of adenine nucleotides at their 3′ end, and contain a pair of short flanking direct repeats. These signals fade with evolutionary time. Betrán et al. (11), Bai et al. (4), and Meisel et al. (100) took advantage of these new structures to identify new retrogenes in fruit flies; Wang et al. (147) in silkworm; and Emerson et al. (43), Marques et al. (92), and Vinckenbosch et al. (144) in primates and specifically humans. Divergence between the new retrogene and the original gene from which the retrogene was derived can be used to define the age of the new genes using a molecular clock. However, both strategies that we have discussed so far can depend on the current annotations, which are biased against the new genes, so caution must be taken when making claims about the presence/absence of genes in different genomes (167).
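The three structural hallmarks of retrogenes listed above (no introns, a 3′ adenine tract, and short flanking direct repeats) lend themselves to a simple screening sketch. The thresholds used here (a run of at least 8 adenines, a shared repeat of at least 5 bases) and all function names are assumptions chosen for illustration, not values taken from the cited studies, and real pipelines would also tolerate mismatches because these signals decay with age.

```python
import re

def has_poly_a(three_prime: str, min_run: int = 8) -> bool:
    """True if the 3' sequence contains a run of at least min_run adenines."""
    return re.search("A{%d,}" % min_run, three_prime.upper()) is not None

def direct_repeat(upstream: str, downstream: str, min_len: int = 5):
    """Longest exact substring (>= min_len) shared by both flanks, else None."""
    up, down = upstream.upper(), downstream.upper()
    for length in range(min(len(up), len(down)), min_len - 1, -1):
        for start in range(len(up) - length + 1):
            piece = up[start:start + length]
            if piece in down:
                return piece
    return None

def retrogene_hallmarks(n_introns, three_prime, upstream, downstream):
    """Report which of the three retrogene signatures a candidate shows."""
    return {
        "intronless": n_introns == 0,
        "poly_a": has_poly_a(three_prime),
        "direct_repeats": direct_repeat(upstream, downstream) is not None,
    }

if __name__ == "__main__":
    # Toy sequences, invented for this example.
    print(retrogene_hallmarks(0, "GCTAAAAAAAAAAGT", "TTGACCGTA", "CCGACCGTAGG"))
```

A candidate that shows all three signatures is a plausible young retrogene; missing signatures are weaker evidence in either direction, since they may simply have eroded.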

Predicting functionality of new genes. It is desirable to predict whether candidate new genes are functional before beginning more laborious functional and phenotypic analyses. Comparisons of open reading frame length, transcription of new gene candidates, and substitution rates between nonsynonymous and synonymous sites (Ka versus Ks) and polymorphism and divergence (60, 97) are often used to predict whether the new gene is functional. A Ka/Ks ratio significantly lower than one (for single genome data, Ka/Ks < 0.5 in a comparison between the new gene and its parental copy), for example, indicates functional constraint acting on the new gene, which we would expect if disruptive mutations were being prevented from accumulating in new protein-coding genes by natural selection. These methods are widely used as the first step to predict if a new gene is likely functional (e.g., 4, 11, 43, 147, 168, 169).
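The Ka/Ks screen described above can be sketched as a simple threshold check. The sketch assumes that Ka and Ks have already been estimated by some external method, and it deliberately omits the significance test that the text calls for, so its output is only a first-pass label. The function name, the label strings, and the example values are hypothetical.

```python
def constraint_call(ka: float, ks: float, single_genome: bool = False) -> str:
    """Label a new gene candidate from pre-computed Ka and Ks estimates.

    Thresholds follow the text: Ka/Ks < 1 in general, and Ka/Ks < 0.5 when
    the new gene is compared with its parental copy in a single genome.
    No significance test is performed here (an omission of this sketch).
    """
    if ks <= 0:
        # The ratio is undefined without synonymous divergence.
        return "undetermined"
    ratio = ka / ks
    threshold = 0.5 if single_genome else 1.0
    return "constrained" if ratio < threshold else "not constrained"

if __name__ == "__main__":
    print(constraint_call(0.02, 0.10))                      # ratio 0.2
    print(constraint_call(0.07, 0.10, single_genome=True))  # ratio 0.7
```

One common rationale for the stricter 0.5 cutoff is that a parent/copy comparison averages over both lineages: if the parental gene is constrained while the copy evolves neutrally, the combined ratio would be expected to sit near 0.5, so only values below that suggest constraint on the copy as well.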

MECHANISMS TO FORM NEW GENE STRUCTURES

How are new gene structures formed? Mutation toward a new gene structure is the first step of new gene evolution, and 11 distinct molecular processes are known that contribute to the formation of new genes. These mechanisms are covered in depth elsewhere (27, 65, 84), so we only briefly touch on them here. We highlight several examples in Figure 2.

Gene Duplication

Gene duplication is thought to contribute most to the generation of new genes. A single (or a few) new gene structure(s) can be formed at one time by DNA-based duplication (the copying and pasting of a DNA sequence from one genomic region to another) or retroposition. Although DNA-based duplications are often tandem (134), retroposed genes most often move to a new genomic environment (14, 15, 65, 172), where they must acquire new regulatory elements or risk becoming processed pseudogenes. An important gene duplication mechanism is whole-genome duplication (WGD), which has occurred multiple times in eukaryote evolution, particularly in plants (126). Hundreds to thousands of duplicate genes are formed by a WGD event, and the vast majority of duplicates are quickly lost. However, estimates of duplicate gene retention after WGDs in teleost fishes (∼15% after 350 million years) (16), yeast (∼12% after 80 million years) (68), and Arabidopsis (∼30% after 80 million years) (13) all suggest that large fractions of duplicated loci can be retained. We show below that there are a variety of ways that new gene structures can subsequently acquire new functions (2, 33, 61, 78, 158, 170). McLysaght et al. (98) showed that WGD may more easily generate new paralogs.

Pseudogenes: genes that are thought to have lost their ability to code for a full-length protein

Alteration of Existing Gene Structures

New gene structures can be generated by modifying existing genes, domains, or exons. Gilbert (52) proposed that exons and domains could be recombined to produce new chimeric gene structures (Figure 2a,b). Chimeric proteins formed by gene recombination have been found in many organisms since their discovery in the LDL receptor gene (86, 130), including Drosophila (85, 118, 119), Caenorhabditis elegans (67), mammals (92, 133), and plants (151), and are estimated to have contributed ∼19% of new exons in eukaryotes (see Reference 86 and references therein). In addition, retroposed sequences may jump into or near existing genes and recruit existing exons, or involve two adjacent protein-coding genes (164). Conversely, new gene structures may be formed by splitting existing genes. Wang et al. (149), for example, found that gene duplication is an intermediate stage in an evolutionary process leading to gene fission (Figure 2c). Okamura et al. (113) demonstrated that frameshift mutations often generate new coding sequences and found 470 human gene duplicates that had done so. Xue et al. (157) found that the Epstein-Barr virus contains an early gene that undergoes frequent frameshifts, probably to combat host immunity. In addition, divergence in alternative splicing patterns between duplicate genes can generate distinct transcripts that produce noncoding RNAs or polypeptides with slightly or entirely different functions and rapidly alter duplicate gene structures and functions (51, 57, 69, 163, 173).

De Novo Genes

New gene structures may arise from previously noncoding DNA (Figure 2d). Chen et al. (24) showed that antifreeze proteins of low structural complexity, which bind and halt the growth of ice crystals in the blood of some polar fishes, were created by amplification of previously noncoding microsatellite DNA. Surprisingly, a number of de novo genes of high structural complexity were first found to originate from noncoding regions in Drosophila (6, 75). Since then, more de novo genes were identified in Drosophila (26, 168, 172) as well as in humans (71, 153, 155, 169), primates (137), murine rodents (104), protozoa (159), yeast (17, 21), rice (154), and viruses (122). Similar to strict de novo gene origination, horizontal gene transfer (HGT), the exchange of genes between genomes from distantly related taxa, can immediately add new genes and functions to a genome (Figure 2f). HGT is a major mechanism for the addition of new genes to prokaryotic genomes (73, 111) but has also been reported in a number of eukaryotic organisms, including plants (8, 161), insects (102), and fungi (56) (Figure 2f).
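One way to picture de novo origination from noncoding DNA is to look at how a single enabling mutation can extend an open reading frame by removing an in-frame stop codon. The sketch below is a toy illustration only: the two sequences are invented, the scan covers the forward strand only, and ORFs that lack a stop codon are ignored. The function name `longest_orf` is hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}

def longest_orf(seq: str) -> str:
    """Longest ATG-to-stop open reading frame on the forward strand."""
    seq = seq.upper()
    best = ""
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon from this start until the first stop codon.
        for pos in range(start, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOPS:
                orf = seq[start:pos + 3]
                if len(orf) > len(best):
                    best = orf
                break
    return best

if __name__ == "__main__":
    # Invented sequences: the "derived" copy differs by one substitution
    # (TGA -> TGG) that removes a premature in-frame stop codon.
    ancestral = "ATGCTGTGAACATACCCGGACTTTTAG"
    derived = "ATGCTGTGGACATACCCGGACTTTTAG"
    print(len(longest_orf(ancestral)), len(longest_orf(derived)))
```

In this toy case the substitution lengthens the reading frame from three codons to nine, which mirrors, in miniature, the kind of enabling substitution or deletion that can convert a noncoding region into a coding sequence.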

Noncoding RNAs

Not all new genes code for proteins. Noncoding RNAs were found to play an important role in neuronal functions in the early 1990s (136). A large number of functional RNAs from noncoding regions have been reported to play vital roles in a wide variety of organisms (7, 80). In Drosophila, microRNAs appear to turn over rapidly, but can be strongly influenced by positive selection (89, 90, 109). Strikingly, Dai et al. (34) showed that a new long noncoding RNA influences courtship behavior in D. melanogaster. Pseudogenes are conventionally thought of as dead genes that play no functional roles (41), but they may evolve functions in regulating expression of related genes. Zheng & Gerstein (171) recently found that many mammalian pseudogenes are transcribed and thus may still function. McCarrey & Riggs (96) predicted that pseudogenes may regulate their parental genes, similar to long noncoding RNAs or miRNAs. An explicit mechanistic model of the use of pseudogene transcripts as decoys for cross-regulating expression of target genes was actually proposed and tested by Marques et al. (93, 94).

New Gene Regulatory Systems

New genes must acquire a specific transcription regulatory system to ensure certain temporal and spatial expression patterns. Betrán et al. (10) investigated the origin of the male-specific expression of Dntf-2r, a retroposed gene in the D. melanogaster–Drosophila simulans clade. The new retrogene did not contain the parental promoter but had acquired a new β2-tubulin-like promoter by recruiting a novel 5′ regulatory sequence. This regulatory sequence drives testis-specific expression of β2-tubulin and appears to still do so for Dntf-2r. In addition, the new retrogene Xcbp1 recruited existing neuron promoters present at its site of integration (29). This co-opted mode of promoter recruitment is also observed in human retrogenes (144) and may be a general mode for retrogene promoter gain (65). Additionally, Ni et al. (107) observed that eight new genes essential for Drosophila development evolved binding sites for the CCCTC binding factor (CTCF) insulator under positive selection, ensuring the delineation of the regulatory domains of these genes.

Transposable Elements

Transposable elements (TEs) can contribute to functional divergence between duplicate genes in several ways, all similar to those described above (12). For instance, TEs can mediate gene recombination by carrying coding sequences from one part of the genome to another (63, 158) and can even themselves be incorporated into existing coding sequences (46, 88, 106). In addition, TEs were recently found to be a source of microRNAs, which are major components of posttranscriptional regulation of expression (116).

Although we still have a developing picture of the contributions of each of these mechanisms for new gene formation in different taxa, work in humans and Drosophila suggests that ∼80% of genes are formed by DNA-based duplication, 5% to 10% by de novo origination, and ∼10% by retroposition (168, 169). And although these mechanisms may generate the initial gene structures, many new structures (in a large variety of taxa) undergo radical structural renovation to change exon-intron structure and even recruit new or existing coding sequences into the new locus (30, 49, 151, 172). It has been observed that the various mechanisms are often combined to determine the structure of new genes.

Figure 2
Representative new genes exhibiting various new gene origination mechanisms. (a) Jingwei, a new gene found only in Drosophila teissieri and Drosophila yakuba, was generated by a combination of retroposition, DNA-based duplication, and gene recombination, which formed a chimeric gene consisting of an Adh-derived enzymatic domain and a hydrophobic domain from Ymp (85, 150). (b) PIPSL in humans is a consequence of gene fusion between two adjacent ancestral genes by read-through transcription and subsequent coretroposition (164). (c) Gene fission split the ancestral gene monkeyking into two distinct genes in Drosophila mauritiana, revealing an intermediate process of gene fission aided by gene duplication and complementary degeneration (149). (d) The gene ENSMUSG00000078384 in mouse revealed the evolutionary process of de novo gene origination (104). Red boxes are ancestral stop codons (TGA) with two triangles showing the positions of the enabling mutations, including a substitution and a deletion. (e) Two new genes in humans, DAF and mNSCI, were generated by domesticating transposable elements, Alu, and short interspersed elements (B1–B4) (91, 106). DAF and Alu elements together make an interesting case in which alternative splicing generated a new isoform in the mammalian genome. (f) Horizontal gene transfer (HGT) is prevalent in bacteria with mechanisms such as homologous recombination (111). Antibiotic resistance genes can be acquired by host genomes containing the intl gene (which encodes integrase), a recombination site (Att), and a promoter to express the captured gene, as depicted by the process shown in the panels on the left.

Evolution of Transcription Units

Beyond the origination and evolution of the macrostructure of genes described above, it was recently found that the transcription units in the genes of vertebrates have been evolving directionally toward productive transcription. Almada et al. (1) reported a highly significant linear correlation between gene age and the critical signals that define transcription units in a gene, including the U1 small nuclear ribonucleoprotein recognition sites and polyadenylation sites (PASs). The observed incremental gain of U1 sites and gradual loss of PASs in the 5′ end of protein-coding genes revealed selection for a U1-PAS axis for productive transcription.

ABUNDANCE AND ORIGINATION RATES OF NEW GENES

The advent of whole-genome sequences for many organisms allowed identification of many new DNA-based and RNA-based duplicate genes (e.g., 11, 43). With more genome sequences available, especially in closely related groups such as the twelve Drosophila species (32), it became possible to investigate the rates of new gene origination in particular lineages. We review these findings in Drosophila, mammals, and plants. There have been no reports of new gene origination rates for mechanisms other than DNA-based duplication, RNA-based duplication, de novo origination, and gene recombination. Thus, the rates of new gene origination we highlight should be viewed as serious underestimates.

Drosophila

The first estimate of the rate of new gene origination was made for retrogenes in Drosophila in 2002 by Betran et al. (11), who identified ∼150 retrogenes in D. melanogaster (4, 11) that arose after the divergence of the Drosophila and Sophophora subgenera approximately 50 Mya. Their estimate of three new retrogenes per million years in the lineage leading to D. melanogaster was corroborated by an independent estimation of ∼1.5 new retrogenes per million years based on cDNA hybridization against salivary polytene chromosomes in species in the D. melanogaster subgroup (∼25 million years old) (158). Zhou et al. (172) computationally estimated the combined rate of new gene origination via DNA-based duplication, retroposition, de novo origination, and gene recombination in the D. melanogaster subgroup to be 5–11 new genes per million years and found different rates for the four mechanisms. In particular, approximately 80% of new genes added to the D. melanogaster lineage genome were generated by DNA-based duplication. More extensive and detailed analyses of DNA-based and RNA-based duplicates were conducted by Vibranovski et al. (142), Meisel et al. (100), and Zhang et al. (168). Zhang et al. (168) analyzed the 12 Drosophila genomes and estimated that ∼17 duplicate genes per million years arose in the Drosophila genome. Figure 3a shows the distribution of these new genes on the Drosophila phylogeny.
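At their simplest, the rates quoted in this section are counts of new genes divided by the time span over which those genes arose. The following Python sketch illustrates that arithmetic under the assumption of a constant origination rate along a branch. The function name `origination_rate` is introduced here for illustration only, and the published estimates rest on more detailed phylogenetic dating than this.

```python
def origination_rate(n_new_genes, older_mya, younger_mya=0.0):
    """New genes per million years on a branch spanning older_mya..younger_mya.

    Assumes a constant origination rate along the branch (a simplification).
    """
    span = older_mya - younger_mya
    if span <= 0:
        raise ValueError("older_mya must be greater than younger_mya")
    return n_new_genes / span


# ~150 retrogenes since the subgenus divergence ~50 Mya (figures from the text)
retrogene_rate = origination_rate(150, 50)

# 220 new genes on the branch dated 35-40 Mya (example from the Figure 3 legend)
branch1_rate = origination_rate(220, 40, 35)

print(f"retrogenes: {retrogene_rate:.1f} per My")
print(f"branch 1:   {branch1_rate:.1f} per My")
```

Because the denominator is the branch length rather than the age of the node, short recent branches with many new genes yield high per-branch rates, which is one reason rates differ so much between branches.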

Mammals

Emerson et al. (43) and Marques et al. (92) identified ∼120 retrogenes in the human genome, yielding an estimated retrogene origination rate of one retrogene per million years in the lineage leading to humans. Zhang et al. (166, 169) systematically identified new genes in vertebrates, especially in primates, and showed that the rates of new gene origination are variable


[Figure 3 graphic not reproduced. Panel a, Drosophila: D. melanogaster, D. sechellia, D. simulans, D. yakuba, D. erecta, D. ananassae, D. persimilis, D. pseudoobscura, D. willistoni, D. grimshawi, D. mojavensis, D. virilis; branches 0–6; time axis 40, 35, 25, 11, 6, 3 Mya. Panel b, Vertebrates: human, chimp, orangutan, rhesus, marmoset, mouse, guinea pig, dog, cow, armadillo, tenrec, opossum, platypus, chicken, lizard, frog, fugu, zebrafish; branches 0–12; time axis 450, 370, 310, 220, 160, 100, 70, 43, 25, 13, 6 Mya. Per-branch gene counts could not be recovered reliably from the extraction.]

Figure 3. The phylogenetic distribution of new gene origination events in (a) Drosophila and (b) vertebrates. These genes were generated by DNA-based duplication, retroposition, and de novo origination (168, 169). The number of new genes that originated in each time period is shown above the branch. For example, in a, branch 1 shows that 220 genes originated between 35 and 40 Mya in Drosophila. In b, red numbers are new genes that originated in the hominoid branches or specifically in humans.


in different evolutionary stages of vertebrates (Figure 3b), although 25–30 genes generated mostly by DNA-based and RNA-based duplication arise per million years. Interestingly, this rate is much higher on the branches closer to human (66 new genes per million years in the human lineage alone) (166).

Plants

In contrast to flies and mammals, Zhang et al. (165) reported that 0.6 retrogenes per million years arose in the Arabidopsis thaliana genome, a rate comparable to that in Populus (174), and a microarray-based study in Arabidopsis identified 94 new genes created by DNA-based duplication and retroposition (45). Surprisingly, Wang et al. (151) found a very high rate of retrogene and chimeric gene origination in rice: More than 1,000 retrogenes were identified in the rice genome, 380 of which evolved chimeric gene structures by recruiting previously existing genes into their gene structures. These authors determined the rate of chimeric gene origination to be 7 per million years in grass genomes in the lineage leading to rice, 50 times the origination rate of chimeric genes in humans (144), and the highest rate of chimeric gene origination known. In addition, Jiang et al. (63) identified more than 3,000 gene recombinants in rice mediated by Pack-Mutator-like transposable elements (Pack-MULEs). These results suggest a huge potential for protein diversity in plant genomes.

Along with these extensive studies in Drosophila, mammals, and plants, there have been many valuable investigations of chimeric genes and retrogenes in Caenorhabditis elegans (66), fish (25, 49), silkworm (147), and chicken (62).

Copy Number Variation

Inexpensive whole-genome analysis has also made it possible to identify genes at the very earliest stages of their evolution, before fixation. Abundant copy number variation (CNV) of individual genes has been detected in Drosophila (40, 42, 124), humans (47), mouse (54), and C. elegans (81). Dopman & Hartl (40), Emerson et al. (42), Cardoso-Moreira & Long (20), and Cardoso-Moreira et al. (19) identified more than 1,000 partial and 100 complete gene duplications/deletions in just 15 strains of D. melanogaster relative to the reference genome using microarray hybridization. In addition, next-generation sequencing and microarrays have identified more than 1,200 partial and 600 complete gene duplications/deletions in 179 individual human genomes relative to the reference genome (101, 125). The recent sequencing of 43 genomes in two D. melanogaster populations revealed more CNVs, including 2,588 duplications and 3,336 deletions relative to the reference genome (74). The large number of new genes segregating in populations is just now beginning to be appreciated and investigated further. An active area of research will be to perform functional and statistical analyses of these new genes to understand their earliest stages of evolution. In all, these studies have shown that new gene origination rates can differ between taxa, yet are appreciable in all groups studied. These results further strengthen the conclusion that new gene origination is a general evolutionary process.
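The distinction drawn above between partial and complete gene duplications or deletions comes down to how a variant interval overlaps an annotated gene. A minimal Python sketch of that classification follows. The coordinates, the gene names, and the function `classify_cnv` are hypothetical and are not taken from the cited studies, which relied on their own array and sequencing pipelines.

```python
def classify_cnv(cnv_start, cnv_end, gene_start, gene_end):
    """Classify a CNV relative to one gene, using closed intervals [start, end]."""
    if cnv_end < gene_start or cnv_start > gene_end:
        return "none"      # no overlap with the gene
    if cnv_start <= gene_start and cnv_end >= gene_end:
        return "complete"  # the CNV spans the whole gene
    return "partial"       # the CNV covers only part of the gene


# Hypothetical annotation and a single hypothetical duplicated interval
genes = {"geneA": (1_000, 2_000), "geneB": (5_000, 6_500), "geneC": (8_000, 9_000)}
cnv = (900, 5_400)

calls = {name: classify_cnv(*cnv, *span) for name, span in genes.items()}
print(calls)
```

Only intervals classified as complete duplications create a full extra gene copy, which is why the text treats them separately from partial events.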

PATTERNS OF NEW GENE ORIGINATION

Gene Traffic in Drosophila, Humans, and Other Organisms

With the large number of new genes identified in various organisms, researchers were able to investigate statistical patterns of new gene characteristics to explore the mechanistic and evolutionary forces that impact the formation, origination, and evolution of new genes. Betran et al. (11) examined the chromosomal distribution of retrogenes and their parental copies in D. melanogaster (Figure 4a). Surprisingly, these authors found a significant excess of autosomal retrogenes derived from X-linked parental genes (X→A) and a significant deficiency of retrogenes formed in the opposite direction (A→X) or between autosomes


[Figure 4 graphic not reproduced. Panels: (a) Drosophila, (b) Humans. Arrows between the X chromosome and the autosomes are annotated with the percentage excess or deficiency of retrogene movement and with the labels "Excess male-biased functions" and "Excess non-sex and female functions".]

Figure 4. Retrogene traffic in (a) Drosophila (11, 142) and (b) humans (43). Each arrow indicates the movement of retrogenes from the parental gene chromosomal location to the retrogene’s location. The size of the arrow indicates the intensity of gene movement between chromosomes, and the percentages show quantitatively the excess of movement over the null expectation (random origination and insertion). The functions of the retrogenes are indicated.

(A→A). Dai et al. (34a) further revealed that retrogenes derived from autosomal parental copies tend to locate to the same chromosome as the parental copies. However, 42 of the 43 retrogenes derived from X-linked parental copies exhibited X→A movement; only one retrogene moved X→X. These two observations clearly reveal a striking pattern of new gene origination in flies: Retrogenes derived from X-linked genes prefer to copy into autosomes. This directional movement of new genes is called gene traffic (43). These results hold in the 12 sequenced species of Drosophila (100, 142) and in Anopheles gambiae (5, 138). Interestingly, 90% of X→A retrogenes in D. melanogaster are expressed in testis, a significantly higher proportion of testis-expressed genes than average (11), suggesting that the retrogene’s function (in this case, male-beneficial function) can be associated with its relocation. The symmetric pattern was observed in silkworm, which has ZW sex determination (females are ZW and males ZZ), whereby genes retroposed from Z→A tend to be ovary expressed (147). Gene traffic appears to be general in Drosophila for different mechanisms of new gene formation, as Vibranovski et al. (142) also showed that new genes created by DNA-based duplication exhibit the same X→A movement and testis expression. Moreover, the neo-X chromosome, an autosomal chromosome arm that fused to the ancestral X chromosome during the evolution of the Drosophila genus, also shows the same excess of gene traffic (100, 142).

Relative to Drosophila, human and mouse studies revealed similar yet distinct patterns of gene traffic (43) (Figure 4b). Compared with a neutral expectation based on the chromosomal distribution of processed pseudogenes, which are expected to be evolving neutrally, there is an excess of X→A retrogene movement and most X→A retrogenes exhibit testis expression. However, there is also a significant excess of A→X retrogene movement in humans, and these excess A→X retrogenes exhibit either female expression or unbiased expression, with nonsignificant male expression. A→A movement is very low in humans (43). The mouse genome shows a very similar pattern. Zhang et al. (166, 168) have shown that these patterns exist for DNA-based duplicates, retrogenes, and de novo genes in Drosophila, humans, and mouse.
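The percentages reported for gene traffic express observed retrogene movement relative to a null expectation. A minimal Python sketch of that comparison is given below, assuming a deliberately simplified two-compartment null in which parental genes are drawn in proportion to each compartment's share of genes and insertions land in proportion to each compartment's share of the genome. All shares and counts are hypothetical, the function names are introduced here for illustration, and the cited studies used more detailed null models (for humans, the chromosomal distribution of processed pseudogenes). Note also that this sketch lumps same-chromosome and between-autosome events into a single A->A class.

```python
def expected_traffic(gene_share, genome_share, n_events):
    """Expected retrogene counts per movement class under a random null."""
    return {
        f"{src}->{dst}": n_events * gene_share[src] * genome_share[dst]
        for src in ("X", "A")
        for dst in ("X", "A")
    }


def percent_excess(observed, expected):
    """Percent by which observed counts exceed (+) or fall short of (-) expectation."""
    return {k: 100.0 * (observed[k] - expected[k]) / expected[k] for k in expected}


# Hypothetical inputs: the X carries 20% of genes and 20% of the genome
gene_share = {"X": 0.2, "A": 0.8}
genome_share = {"X": 0.2, "A": 0.8}
observed = {"X->A": 32, "A->X": 8, "A->A": 58, "X->X": 2}

expected = expected_traffic(gene_share, genome_share, sum(observed.values()))
excess = percent_excess(observed, expected)
for move, pct in excess.items():
    print(f"{move}: {pct:+.0f}%")
```

A positive value for X->A together with negative values for the other classes would correspond to the Drosophila pattern described above. Whether such a deviation is statistically significant requires a separate test against the same null.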

Consequences of Gene Traffic for Genome Evolution

If gene traffic has been historically important for genome evolution, the majority of


MSCI model: X chromosome inactivation during spermatogenesis favors relocation of genes involved in spermatogenesis to autosomes

testis-biased/male-biased genes should be autosomal, contrary to the previous conclusion that the X was a hotbed for male-biased genes (148). Several microarray-based studies of male-biased genes and their chromosome locations by Ranz et al. (117) and Parisi et al. (114) in Drosophila, Khil et al. (70) in mouse, and later by Zhang et al. (166) in humans and mouse have confirmed this prediction. In Drosophila, Zhang et al. (168) showed a smooth transition of new male-biased genes from X linkage to autosomal linkage over evolutionary time.

Models to Interpret the Causes of Gene Traffic

In general, models to explain gene traffic, and experimental evaluation of those models, show that natural selection is a major force governing gene traffic but that mutational processes likely also play a role (38). Meiotic sex chromosome inactivation (MSCI) in the male germ line (11, 43, 139, 140), dosage compensation in the heterogametic sex (3, 143), sexual antagonism between male- and female-beneficial genes (22, 128), and meiotic drive (131, 132) have all been implicated in driving gene traffic. The relative role of each of these forces has been hotly debated. MSCI has a strong effect in mammals (70), and experimental evidence for MSCI in Drosophila comes from several studies (59, 139, 140). Vibranovski et al. (139) showed that genes that are highly expressed in the meiotic phase of spermatogenesis (when the X chromosome is predicted to be inactivated) are significantly enriched on the autosomes. Conversely, genes expressed in the mitotic phases of spermatogenesis are randomly distributed throughout the genome. Other studies suggest reduced expression throughout spermatogenesis, including in the spermatogonia, which also discredits dosage compensation models (99; however, see 141). Clear-cut single-cell transcriptome data are needed to clarify these issues. Along with the MSCI model, other non-germ-line-based models, e.g., sexual antagonism, are also necessary to interpret the expression of new genes in the male somatic cells, although these models need to be rigorously experimentally tested.

Correlation Between Gene Age and Expression

Early studies revealed a connection between the expression and the ages of new genes. Betran & Long (10) showed that Dntf-2r, a ∼10 million-year-old gene in the D. melanogaster subgroup, is expressed only in testis; however, its parent Dntf-2 is expressed ubiquitously. Almost all retrogenes in Drosophila appear to have testis expression (4) and to have maintained testis-biased or testis-specific expression independent of age (50). Vinckenbosch et al. (144) showed that new human retrogenes are often transcribed in testis and later evolve stronger and more diverse spatial expression patterns, coining the “out of the testis” hypothesis. Whether or not the testis is the starting point for new genes, a general survey of the expression patterns for new genes that originated within vertebrates revealed a strong positive correlation with age in both transcription intensity and spatial expression (167). It is possible that this testis-biased pattern of retrogene expression is due to our inability to detect genes expressed at low levels in different tissues, but this issue should be resolved soon with advances in next-generation sequencing.

EVOLUTIONARY FORCES ACTING ON NEW GENES

Evolutionary forces, such as natural selection and genetic drift, operate on both facets of new gene evolution: the fixation of new gene loci and their acquisition of a beneficial function. These two facets may overlap. In this section, we discuss theoretical models developed to describe how new genes arise and acquire novel functions as well as general approaches to studying new genes and the selective forces that act on them.


Neofunctionalization: the process by which a new gene acquires a novel function

Selective Models of New Gene Evolution

Muller (103) was among the first to recognize the potential importance of duplicate genes in evolution. He proposed a simple model whereby new duplicate genes could acquire novel, beneficial functions distinct from those of the original copies. Ohno (112) further elaborated on Muller’s duplication model as a major means of neofunctionalization. However, Ohno also predicted that duplicate genes are most often inactivated and become pseudogenes. This classic model assumes that the new gene is functional upon duplication and that the new gene subsequently acquires mutations that provide a novel beneficial function. The novel function is then preserved in the genome by natural selection.

However, strictly duplicate genes are redundant, and beneficial mutations are extremely rare. How do new duplicate genes remain in the population long enough to accumulate a beneficial, selected mutation(s)? This problem led to the development of models that predict selective preservation of both copies at all stages of their evolution: adaptive radiation (AR), innovation-amplification-divergence (IAD), and escape from adaptive conflict (EAC). The AR model proposes that gene duplication itself is favored, e.g., for increased dosage of a gene product, and that the new duplicates then undergo functional radiation (48). Thus, AR posits that novel functions are acquired after duplication. IAD and EAC, in contrast, propose that ancestral loci develop novel beneficial secondary functions before duplication (9, 36). Under IAD, repeated gene duplication is favored to increase the dosage of the novel secondary function. Different duplicates are then free to optimize the ancestral or novel secondary function, and only the two best copies are retained in the genome. The increase in the number of duplicate genes within the AR and IAD models also provides additional targets for beneficial mutations, thus increasing the probability and speed of functional improvement. EAC predicts that the bifunctional ancestral gene is subject to selection before gene duplication, that adaptive conflict between the ancestral function and the new function constrains improvement of the selected function(s) before duplication, and that adaptive changes and functional improvement occur in the daughter genes after duplication.

For additional information on duplicate gene evolution, see Conant & Wolfe (33), who suggest that preservation of new genes stems from the co-option of existing functions to serve new purposes, and Walsh (145, 146), who gives a detailed mathematical description of the models and relative probabilities of neofunctionalization and pseudogenization.

Examples of EAC (36), IAD (105), and AR (48) have been published, and each model has specific predictions for what we should observe if a new gene originated by each process (33). However, none of these models can be used as a statistical framework for rigorously testing the roles of evolutionary forces in new gene origination. Classic molecular population genetic tests based on nucleotide substitution patterns and allele frequency spectra do provide this framework and have been used extensively to detect selection on new genes. These tests, such as the M-K (McDonald-Kreitman) test (97) and the HKA (Hudson, Kreitman, and Aguadé) test (60), detect elevated rates of amino acid substitutions (M-K) or reduced effective population size (HKA) at loci. In addition, Thornton (135) introduced a coalescent-based model that can be used to test for selection on CNV. The HKA test and Thornton’s test compare measurements of nucleotide variation in genes with a distribution of parameter values derived from neutral coalescent simulations. Thus, the M-K, HKA, and Thornton’s tests are used to test the classic model. Each of these five models (classic, AR, IAD, EAC, and statistical) predicts that new genes should experience strong natural selection after they are formed. We now discuss some of the evidence indicating that this often appears to be the case.
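The logic of the M-K test mentioned above can be sketched in a few lines: it contrasts nonsynonymous and synonymous changes that are fixed between species with those that are still polymorphic within a species, using a 2×2 contingency table. The Python sketch below uses Fisher's exact test for the table and a commonly used estimator, α = 1 − (Ds·Pn)/(Dn·Ps), for the share of amino acid substitutions attributable to positive selection. The counts are hypothetical, the function names are introduced here for illustration, and the choice of Fisher's exact test is an assumption of this sketch rather than a statement about how any cited study performed the test.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


def mk_test(dn, ds, pn, ps):
    """McDonald-Kreitman contrast of fixed differences and polymorphisms.

    dn, ds: nonsynonymous and synonymous fixed differences between species
    pn, ps: nonsynonymous and synonymous polymorphisms within species
    """
    p_value = fisher_exact_two_sided(dn, pn, ds, ps)
    # alpha estimates the share of amino acid substitutions fixed by selection
    alpha = 1 - (ds * pn) / (dn * ps) if dn and ps else float("nan")
    return p_value, alpha


# Hypothetical counts, chosen only to illustrate an excess of amino acid fixations
p_value, alpha = mk_test(dn=20, ds=10, pn=5, ps=25)
print(f"p = {p_value:.4g}, alpha = {alpha:.2f}")
```

Under neutrality the ratio Dn/Ds is expected to match Pn/Ps, so a small p-value together with a positive α points to an excess of amino acid fixations.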


[Figure 5 graphic not reproduced. Panel a compares fixed and polymorphic retrogenes, by chromosome of origin, between D. simulans and D. melanogaster and between chimpanzee and humans. Panel b shows Adh and jingwei in D. teissieri and D. yakuba following retroposition. Panel c shows Adh-derived chimeric genes (Jingwei, Adh-Finnegan, Adh-twain, Siren) among the D. teissieri-yakuba, D. subobscura-guanchi, D. hydei-mettleri, and D. ananassae and D. bipectinata clades, each annotated with KS = 0 and a KA value between 0.018 and 0.064.]

Figure 5. Positive Darwinian selection acting on new genes. (a) Positive selection for the fixation of new retrogenes in Drosophila (124) and humans (123). The numerator and denominator show the numbers of retrogenes that originate on the autosomes and the X, respectively. Tests based on the M-K framework indicate an excess of fixed X→A retrogenes in both species and strong positive selection for X→A retrogene movement. (b) The jingwei (jgw) gene in Drosophila (85). The ratios over the branches are the numbers of nonsynonymous changes over the numbers of synonymous changes, and the ratios in the triangles are the ratios of divergence between the species and the polymorphisms. M-K tests and Ka/Ks ratios indicate strong positive selection acted on jgw shortly after it originated. (c) Selection acted on all Adh-derived chimeric genes in Drosophila (64), as indicated by elevated Ka/Ks ratios.

Fixation of New Genes

The first study to identify signatures of selection on a new gene journeying to fixation was performed by Llopart et al. (82), who analyzed a new variant of the jingwei gene in Drosophila teissieri, which lost its second intron. This D. teissieri–specific intron presence-absence polymorphism exhibits a significant excess of rare alleles and patterns of nucleotide polymorphism that are consistent with moderate natural selection driving the polymorphism to fixation. Selection has also been detected on CNV in D. melanogaster and other organisms. Emerson et al. (42) found a genome-wide pattern consistent with strong purifying selection on all CNVs except duplications of whole genes. That is, single-gene duplications are under significantly weaker purifying selection than partial gene duplications or partial or complete gene deletions. Similarly, Schrider et al. (123, 124) showed a significant excess of fixed versus polymorphic retrogene CNV originating from the X chromosome in both Drosophila and humans, indicating that natural selection governs the patterns of retrogene CNV evolution (Figure 5a). Overall, these studies show that


natural selection can play a key role in driving new genes to fixation. In addition, they highlight the use of classic population genetic tests in determining whether selection acts on new genes during their journeys to fixation.

Selection on Sequence Changes in New Genes

In addition to studies of the evolutionary forces governing the fixation of new genes, many studies have investigated the effects of selection and drift on new gene sequences. Long & Langley (85) showed that the new chimeric gene jingwei in D. teissieri and Drosophila yakuba contains a significant excess of nonsynonymous substitutions compared with nonsynonymous polymorphisms (relative to the ratio of synonymous substitutions to polymorphisms), indicating that amino acid substitutions were rapidly driven to fixation shortly after the origination of jingwei (Figure 5b). Similarly, Nurminsky et al. (110) showed that a D. melanogaster–specific gene family, Sdic, involved in sperm motility rapidly acquired a new exon-intron structure and testis-specific expression (Figure 1). Sdic is a chimeric gene composed of a 5′ piece of Cdic, encoding a cytoplasmic dynein intermediate chain, and a 3′ piece of AnnX, encoding a phospholipid binding protein. This fusion gene underwent rapid structural renovations, including the conversion of a Cdic intron into an exon and an AnnX exon and Cdic intron into a testis-specific promoter. Low levels of sequence polymorphism, preservation of coding potential, and the absence of Sdic in other closely related species suggest that Sdic was rapidly swept to fixation.

These first discoveries sparked searches for general evolutionary patterns in new genes. Jones & Begun (64) searched for common patterns in the evolution of three Adh-derived chimeric genes in different lineages of Drosophila. All three new genes quickly accumulated a large number of amino acid replacement substitutions, several at identical amino acid sites, in the Adh-derived region shortly after they arose. Strikingly, Jones & Begun (64) and Shih & Jones (127) showed that different Adh-derived fusion genes often accumulate mutations at the same sites, regardless of to which other gene they have fused (Figure 5c). In addition, each of the four Adh-derived fusion genes exhibits strong signals of accelerated amino acid substitution using classic population genetic statistical tests (e.g., M-K test).

Some of these observations have recently been borne out by genome-wide studies. Xu et al. (156) surveyed structural differences between more than 600 paralogous pairs of genes in plants and found that most new genes underwent radical changes in exon/intron content and boundaries as well as insertion/deletions. And using molecular population genetic tests, Chen et al. (30) found that young genes in D. melanogaster show strong signals of selection. These authors predicted that ∼25% of amino acid substitutions in young essential genes were fixed by natural selection. In addition, this signal of selection diminishes as genes grow older. Altogether these studies indicate that there are general patterns to new gene evolution: New genes often undergo rapid (or immediate) structural and sequence renovations and expression pattern changes that are driven by strong natural selection.

Analysis of New Gene Structure and Function

In addition to analyses of new gene frequencies and nucleotide changes, many groups have investigated the evolutionary forces acting on new genes by analyzing new gene functions, genomic locations, or expression patterns. This complementary approach has revealed several fundamental patterns of new gene origination. Chen et al. (24) and Cheng & Chen (31), for example, investigated the antifreeze proteins found in the blood of several orders of Arctic and Antarctic fish. These proteins independently evolved in the different orders, yet they consist of nearly identical tripeptide repeats. These tripeptide repeats were generated de novo by amplification of short nucleotide sequences. These studies showed that similar environmental pressures may favor the generation of genes with similar functions.


In addition, as we showed in the previous section, testis-biased genes are underrepresented on the D. melanogaster and mammalian X chromosome. Diaz-Castillo & Ranz’s (38) analysis of the genomic location of genes relative to the position of chromosome domains during spermatogenesis led the authors to alternatively propose that the enrichment of testis-biased retrogenes on the autosomes is caused by an increased availability during spermatogenesis of open chromatin domains that contain testis-expressed genes. This larger target for retrogene integration allows a higher proportion of these retrogenes to acquire testis-biased expression. These general observations of the location of sex-biased genes, and their general movement off of the X chromosome, indicate that differences in expression alone can dictate where in the genome new genes originate. Together, these results show that studies of general patterns of extant gene locations, structures, and expressions can be informative of new gene origination and evolution.

PHENOTYPIC EFFECTS OF NEW GENES

Studying the roles of new genes in phenotypic evolution recently became feasible with the advent of sophisticated genetic tools and molecular techniques as well as significant progress in related areas of important phenotypes in biology. Young genes are often assumed to be dispensable because important functions are thought to require a long evolutionary period to be developed and optimized (76). However, studies in the past decade have found numerous young genes with important, and sometimes essential, functions at the molecular, cellular, and individual level (27). These findings add significantly to the classic reports of diverged or novel functions in genes created by duplication (see 53, 79 and references therein).

Biochemical Pathways

New genes can generate new biochemical pathways and products if they are enzymes or become enzymes. Zhang et al. (162) showed that jingwei evolved the capacity to catalyze breakdown of long-chain alcohols in D. yakuba and D. teissieri, whereas the parent Adh can only act on short-chain alcohols. In Arabidopsis, Weng et al. (152) and Matsuno et al. (95) demonstrated that three recently evolved new duplicate genes from the P-450 family, Cyp98A9, Cyp98A8, and Cyp84A4, assembled two new biochemical pathways related to phenolic metabolism that are required for pollen development and α-pyrone synthesis.

Gene Expression Networks

New genes can also be quickly integrated into existing gene networks. Chen et al. (30) observed that almost all young essential genes have been assimilated into protein-protein physical interaction networks in Drosophila, and a significant number of these young genes have developed multiple interactions with old genes (Figure 6). Integration appears to be driven by natural selection. Several new genes have become new hubs. Analysis of one new gene, Zeus, derived from the DNA-binding protein Caf40 via retroposition (28), revealed that it retained ∼30% of Caf40’s DNA-binding

Figure 6. New genes integrated into gene networks and reshaped those networks. (a) New yeast genes that originated through duplication-based (blue) and non-duplication-based (red) mechanisms since the recent whole-genome duplication (<100 Mya) were integrated into the physical interaction network (18). The orange box highlights a module composed of two new genes involved in the pathway to form and process actin. DID4 (green box) interacts with 11 new genes within three steps. (b) New genes in Drosophila formed hubs in protein-protein interaction networks (30). (c) The Drosophila melanogaster–Drosophila simulans-specific gene Zeus quickly accumulated more than 100 amino acid substitutions (red, yellow, blue, and light green) in its nucleotide-binding domains under positive selection. Consequently, it evolved into a new DNA-binding motif that evolved hundreds of new gene links to rewire the gene networks that control reproduction (28).


[Figure 6 panel labels. (a) Yeast physical interaction network; labeled nodes include DID4, ABP1, SLA1, SLA2, LSB3, and YSC84. (c) Zeus was generated through retroposition 4–6 Mya, accumulated 107 amino acid substitutions, created 193 new gene links, and kept only 30% (129) of the ancestral links of Caf40; sequence logos (y-axis in bits) accompany the nucleic acid binding grooves of Caf40 and Zeus.]


sites. However, in a short evolutionary period (4–6 million years) Zeus acquired 193 new binding sites through which it activates or represses hundreds of downstream genes involved in reproduction. This observation indicates that gene expression networks can be rapidly and globally reshaped in evolution by new genes. Li et al. (77) showed that a de novo gene in yeast can suppress a previously existing mating type–control pathway, thus rewiring the structure of gene networks in the species. Capra et al. (18) revealed that new genes in yeast become more integrated into cellular networks over time. The modified networks are not necessarily novel or unimportant, either: Konikoff et al. (72) found that genes have been continually added to and removed from the Wnt and TGF-β signaling pathways, ancient networks involved in animal development.
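The comparison between a parental gene and its derived copy can be illustrated with simple set arithmetic. The sketch below is only a toy example under stated assumptions: the function name `compare_targets` and the gene names are hypothetical, the data are invented, and this is not the analysis pipeline used in the cited studies. It partitions target genes into those retained from the parent, those gained by the new gene, and those lost, which is the kind of bookkeeping behind statements such as Zeus keeping about 30% of the ancestral links of Caf40 while gaining 193 new ones.

```python
# Toy illustration (hypothetical names and data): compare the target sets of a
# parental gene and a new gene derived from it.

def compare_targets(parent_targets, new_gene_targets):
    """Partition targets into retained, gained, and lost links."""
    parent = set(parent_targets)
    derived = set(new_gene_targets)
    retained = parent & derived   # ancestral links kept by the new gene
    gained = derived - parent     # links found only in the new gene
    lost = parent - derived       # ancestral links the new gene no longer has
    # Fraction of ancestral links retained; 0.0 if the parent has no targets.
    retained_fraction = len(retained) / len(parent) if parent else 0.0
    return {
        "retained": retained,
        "gained": gained,
        "lost": lost,
        "retained_fraction": retained_fraction,
    }

# Invented example data, not measurements from any study.
parent_targets = {"geneA", "geneB", "geneC", "geneD", "geneE"}
new_gene_targets = {"geneB", "geneD", "geneX", "geneY", "geneZ"}

summary = compare_targets(parent_targets, new_gene_targets)
print("retained:", sorted(summary["retained"]))
print("gained:", sorted(summary["gained"]))
print("lost:", sorted(summary["lost"]))
print(f"fraction of ancestral links retained: {summary['retained_fraction']:.0%}")
```

With real data, the two input sets would come from binding-site or expression assays for the parental and derived genes; the partition itself does not depend on how the links were measured.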

Development

Surprisingly, new genes can quickly acquire essential roles in development. Chen et al. (30) identified 59 genes that originated in the past ∼35 million years in Drosophila and evolved essential developmental functions. Silencing expression of these young genes causes developmental failure in early to late pupae and in some cases at even earlier stages (Figure 7a,b). Furthermore, tissue-specific knockdown of these young genes can cause morphological defects in adult flies. Silencing new genes can also have a critical effect on reproduction, even when the individual can complete development. The duplicate gene nsr (novel spermatogenesis regulator) exists only in the four species of the D. melanogaster clade that diverged 3 Mya, yet it evolved an essential function required for sperm individualization (39). Similarly, silencing Zeus, a gene in the same group of Drosophila, causes sterility by disrupting testis and sperm development (28).

Recent work on Umbrea, a 12–15-million-year-old gene in Drosophila, carefully dissected the evolutionary steps this young gene took to become essential in D. melanogaster (121). Umbrea arose by DNA-based duplication of heterochromatin protein 6 (HP6) 12–15 Mya. Subsequent loss of one of its two domains (the chromodomain) and the accumulation of protein-coding changes in the remaining chromoshadow domain gave Umbrea a distinct chromatin localization pattern at the centromere. Umbrea appears to have become essential only after it lost the chromodomain 5–7 Mya. Careful molecular dissection, ancestral protein resurrection, and population genetic analyses are the keys to understanding the processes and time new genes take to acquire important roles in organisms.

Brain Evolution in Flies and Humans

Chen et al. (29) investigated the expression patterns of new genes in Drosophila and found that approximately five new genes per million years evolved brain expression patterns, mostly in structures involved in olfaction and learning/memory. All new brain genes are expressed in the α/β lobe, an evolutionarily new set of neurons, implicating new genes in the evolution of this brain structure. Some of the new brain genes have significant effects on behavior. For example, Xcbp1 and Desr influence foraging behaviors (29), and sphinx influences courtship behaviors (34). The frequent acquisition of new brain genes by the genome and the behavioral phenotypes of some of these genes suggest rapid evolution of behaviors, which is consistent with the remarkable observations of Rollmann et al. (120), who detected great variation in the olfactory behavioral response associated with odorant receptor gene duplicates within the natural population of D. melanogaster. The incorporation of new genes into the brain is not specific to Drosophila. Zhang et al. (166) found a correlation between new genes and brain evolution in the human lineage. A high proportion of hominoid-specific and human-specific genes are expressed in the prefrontal cortex and temporal lobe, the newest brain structures, in early fetal development. Strikingly, 54 of 380 human-specific genes are expressed in these two brain regions, regions that are critical for proper cognitive function.


Figure 7
The essential effects of new genes on development. (a) Development was terminated in a developmental stage when three different genes were knocked down using RNA interference (RNAi). (b) YLL1 originated in the common ancestor of the Drosophila melanogaster subgroup species ∼6–10 Mya, yet showed lethal effects in the pupal stage when silenced by RNAi, mutated by EMS, or disrupted by the P element (30).

[Figure 7 panel labels. (a) CG6289 (9 Mya), early larva; CG32376 (18 Mya), pharate; CG13463 (30 Mya), early pupa. (b) Labels: YLL1, gene duplication, CG7627; phylogeny of D. ananassae, D. erecta, D. yakuba, D. teissieri, D. simulans, D. mauritiana, and D. melanogaster on a time axis of 0–8 Mya. Lines, with method/mutation and phenotype: 11343, P-element insertion, lethal at pupal stage; SH2-0995, EMS/G717S, lethal at pupal stage; SH2-1101, EMS/T765I, lethal at pupal stage; SH2-0504, EMS/synonymous, viable; V39539, RNAi/constitutive Gal4, lethal at pupal stage; V39540, RNAi/constitutive Gal4, lethal at pupal stage.]

One of these genes, SRGAP2, is involved in neocortical development (23, 37).

Sexual Dimorphism and Sexual Reproduction

New genes impact sexual dimorphism by participating in the genetic systems that control sexual reproduction and sex determination (87). As the aforementioned patterns of new gene origination show, the vast majority of new genes are sex-biased, especially male-biased, and their origination processes show directional copying between the sex chromosomes and autosomes (e.g., 11, 43). A number of new genes


have been identified with various phenotypic effects, including testicular descent in theria (RLN3) (115), testis size in mouse (noncoding RNA gene, Poldi) (58), sperm competition in D. melanogaster (Sdic) (160), and spermatogenesis in Drosophila (nsr) (39).

The ability of new genes to be incorporated into such conserved pathways, networks, and developmental programs warrants considerable further study. What specific roles can new genes play, and what characteristics of new genes enable them to become essential components of these processes so quickly? New genes now appear to be potent drivers of phenotypic evolution and the genetic control of important biological processes, and show that organismal development and organ development have evolved species-specific and lineage-specific components. Understanding the evolution and modification of these components through the incorporation of new genes is crucial to further research.

CHALLENGES FOR THE FUTURE

It is apparent that we have just a glimpse of the emerging world of new genes and that these genes play crucial roles in the rapid evolution of the genetic systems that govern biological diversity. Questions about new gene evolution have opened many doors both to our understanding of existing diversity and to new research. For example, most studies have examined new genes generated from a few mechanisms, e.g., duplication and de novo origination, leaving open a vast array of mechanisms to be investigated. Continued efforts will be invaluable for understanding the abundance of new genes, the mechanisms that have been neglected so far, and even new gene evolution in nonmodel organisms. An outstanding challenge is to understand the roles of new genes in the evolution and biology of phenotypes, and the studies we have highlighted have left important, unresolved questions to be answered. For example, what evolutionary forces drive gene traffic? How do new genes evolve essential developmental functions, and how quickly? How is CNV driven to fixation, and when do CNVs acquire novel functions? How are important structures, such as the human brain, able to incorporate new gene functions, and how do new genes contribute to novel cognitive function? Future studies of more diverse phenotypes will help shed light on the general patterns and modes of new gene evolution and on the influence of new genes on evolving systems. In addition, understanding how phenotypes rapidly evolve will require a deep understanding of the underlying local and global gene networks. This will be a tremendous challenge, ranging from the experimental deciphering and graphic description of the gene networks to a valid comparative analysis of the ancestral and derived networks shaped by new genes and eventually to the causal relationship of the altered networks with the evolution of phenotypes.

DISCLOSURE STATEMENT

The authors are not aware of any affiliations, memberships, funding, or financial holdings that might be perceived as affecting the objectivity of this review.

ACKNOWLEDGMENTS

We thank all members of the Manyuan Long lab, past and present, for their scientific contributions to the relevant topics discussed in this review. We are indebted to our collaborators, including Antony Dean, Kevin White, Tim Karr, Liqun Luo, Liming Li, Wei Du, Xiaoxi Zhuang, Xiaochun Ni, and Maria Spletter. We also thank the NIH, the NSF, and the Packard Foundation as well as the late Edna K. Papazian for their support of the study of new genes throughout the past fifteen years as we explored this new and exciting area. M.L. is currently supported by NIH grants


1R01GM100768-01A1, NSF1051826, and NSF1026200; N.W.V. by the NSF Graduate Research Fellowship and partially by the NIH genetics training grant T32 GM007197; S.C. by the NSF Doctoral Dissertation Improvement Grant DEB-1110607; and M.D.V. by a Pew Latin American Postdoctoral Fellowship.

LITERATURE CITED

1. Almada AE, Wu X, Kriz AJ, Burge CB, Sharp PA. 2013. Promoter directionality is controlled by U1 snRNP and polyadenylation signals. Nature 499:360–63

2. Arguello JR, Chen Y, Yang S, Wang W, Long M. 2006. Origination of an X-linked testes chimeric gene by illegitimate recombination in Drosophila. PLoS Genet. 2(5):e77

3. Bachtrog D, Toda NRT, Lockton S. 2010. Dosage compensation and demasculinization of X chromosomes in Drosophila. Curr. Biol. 20(16):1476–81

4. Bai Y, Casola C, Feschotte C, Betran E. 2007. Comparative genomics reveals a constant rate of origination and convergent acquisition of functional retrogenes in Drosophila. Genome Biol. 8(1):R11.1–1.9

5. Baker DA, Russell S. 2011. Role of testis-specific gene expression in sex-chromosome evolution of Anopheles gambiae. Genetics 189(3):1117–20

6. Begun DJ, Lindfors HA, Kern AD, Jones CD. 2007. Evidence for de novo evolution of testis-expressed genes in the Drosophila yakuba/Drosophila erecta clade. Genetics 176(2):1131–37

7. Berezikov E. 2011. Evolution of microRNA diversity and regulation in animals. Nat. Rev. Genet. 12(12):846–60

8. Bergthorsson U, Adams KL, Thomason B, Palmer JD. 2003. Widespread horizontal transfer of mitochondrial genes in flowering plants. Nature 424(6945):197–201

9. Bergthorsson U, Andersson DI, Roth JR. 2007. Ohno’s dilemma: evolution of new genes under continuous selection. Proc. Natl. Acad. Sci. USA 104(43):17004–9

10. Betran E, Long M. 2003. Dntf-2r, a young Drosophila retroposed gene with specific male expression under positive Darwinian selection. Genetics 164(3):977–88

11. Betran E, Thornton K, Long M. 2002. Retroposed new genes out of the X in Drosophila. Genome Res. 12:1854–59

12. Bohne A, Brunet F, Galiana-Arnoux D, Schultheis C, Volff J-N. 2008. Transposable elements as drivers of genomic and biological diversity in vertebrates. Chromosome Res. 16(1):203–15

13. Bowers JE, Chapman BA, Rong J. 2003. Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature 422:433–38

14. Brosius J. 1991. Retroposons: seeds of evolution. Science 251(4995):753

15. Brosius J. 2003. The contribution of RNAs and retroposition to evolutionary novelties. Genetica 118(2–3):99–116

16. Brunet FG, Crollius HR, Paris M, Aury J-M, Gibert P, et al. 2006. Gene loss and evolutionary rates following whole-genome duplication in teleost fishes. Mol. Biol. Evol. 23(9):1808–16

17. Cai J, Zhao R, Jiang H, Wang W. 2008. De novo origination of a new protein-coding gene in Saccharomyces cerevisiae. Genetics 179(1):487–96

18. Capra JA, Pollard KS, Singh M. 2010. Novel genes exhibit distinct patterns of function acquisition and network integration. Genome Biol. 11(12):R127

19. Cardoso-Moreira M, Emerson JJ, Clark AG, Long M. 2011. Drosophila duplication hotspots are associated with late-replicating regions of the genome. PLoS Genet. 7(11):e1002340

20. Cardoso-Moreira M, Long M. 2010. Mutational bias shaping fly copy number variation: implications for genome evolution. Trends Genet. 26(6):243–47

21. Carvunis A-R, Rolland T, Wapinski I, Calderwood MA, Yildirim MA, et al. 2012. Proto-genes and de novo gene birth. Nature 487(7407):370–74

22. Charlesworth B, Coyne JA, Barton NH. 1987. The relative rates of evolution of sex chromosomes and autosomes. Am. Nat. 130(1):113–46

23. Charrier C, Joshi K, Coutinho-Budd J, Kim J-E, Lambert N, et al. 2012. Inhibition of SRGAP2 function by its human-specific paralogs induces neoteny during spine maturation. Cell 149(4):923–35


24. Chen L, DeVries AL, Cheng CH. 1997. Evolution of antifreeze glycoprotein gene from a trypsinogen gene in Antarctic notothenioid fish. Proc. Natl. Acad. Sci. USA 94(8):3811–16

25. Chen M, Zou M, Fu B, Li X, Vibranovski MD, et al. 2011. Evolutionary patterns of RNA-based duplication in non-mammalian chordates. PLoS ONE 6(7):e21466

26. Chen S-T, Cheng H-C, Barbash DA, Yang H-P. 2007. Evolution of hydra, a recently evolved testis-expressed gene with nine alternative first exons in Drosophila melanogaster. PLoS Genet. 3(7):e107

27. Chen S, Krinsky BH, Long M. 2013. New genes as drivers of phenotypic evolution. Nat. Rev. Genet. 14:645–60

28. Chen S, Ni X, Krinsky BH, Zhang YE, Vibranovski MD, et al. 2012. Reshaping of global gene expression networks and sex-biased gene expression by integration of a young gene. EMBO J. 31(12):2798–809

29. Chen S, Spletter M, Ni X, White KP, Luo L, Long M. 2012. Frequent recent origination of brain genes shaped the evolution of foraging behavior in Drosophila. Cell Rep. 1(2):118–32

30. Chen S, Zhang YE, Long M. 2010. New genes in Drosophila quickly become essential. Science 330(6011):1682–85

31. Cheng C-HC, Chen L. 1999. Evolution of an antifreeze glycoprotein. Nature 401:443–44

32. Clark AG, Eisen MB, Smith DR, Bergman CM, Oliver B, et al. 2007. Evolution of genes and genomes on the Drosophila phylogeny. Nature 450(7167):203–18

33. Conant GC, Wolfe KH. 2008. Turning a hobby into a job: how duplicated genes find new functions. Nat. Rev. Genet. 9(12):938–50

34. Dai H, Chen Y, Chen S, Mao Q, Kennedy D, et al. 2008. The evolution of courtship behaviors through the origination of a new gene in Drosophila. Proc. Natl. Acad. Sci. USA 105(21):7478–83

34a. Dai H, Yoshimatsu TF, Long M. 2007. Retrogene movement within- and between-chromosomes in the evolution of Drosophila genomes. Gene 385:96–102

35. Demuth JP, De Bie T, Stajich JE, Cristianini N, Hahn MW. 2006. The evolution of mammalian gene families. PLoS ONE 1(1):e85

36. Deng C, Cheng C-HC, Ye H, He X, Chen L. 2010. Evolution of an antifreeze protein by neofunctionalization under escape from adaptive conflict. Proc. Natl. Acad. Sci. USA 107(50):21593–98

37. Dennis MY, Nuttle X, Sudmant PH, Antonacci F, Graves TA, et al. 2012. Evolution of human-specific neural SRGAP2 genes by incomplete segmental duplication. Cell 149(4):912–22

38. Díaz-Castillo C, Ranz JM. 2012. Nuclear chromosome dynamics in the Drosophila male germ line contribute to the nonrandom genomic distribution of retrogenes. Mol. Biol. Evol. 29(9):2105–8

39. Ding Y, Zhao L, Yang S, Jiang Y, Chen Y, et al. 2010. A young Drosophila duplicate gene plays essential roles in spermatogenesis by regulating several Y-linked male fertility genes. PLoS Genet. 6(12):e1001255

40. Dopman EB, Hartl DL. 2007. A portrait of copy-number polymorphism in Drosophila melanogaster. Proc. Natl. Acad. Sci. USA 104(50):19920–25

41. Duret L, Chureau C, Samain S, Weissenbach J, Avner P. 2006. The Xist RNA gene evolved in eutherians by pseudogenization of a protein-coding gene. Science 312(5780):1653–55

42. Emerson JJ, Cardoso-Moreira M, Borevitz JO, Long M. 2008. Natural selection shapes genome-wide patterns of copy-number polymorphism in Drosophila melanogaster. Science 320(5883):1629–31

43. Emerson JJ, Kaessmann H, Betran E, Long M. 2004. Extensive gene traffic on the mammalian X chromosome. Science 303(5657):537–40

44. Fan C, Chen Y, Long M. 2008. Recurrent tandem gene duplication gave rise to functionally divergent genes in Drosophila. Mol. Biol. Evol. 25(7):1451–58

45. Fan C, Vibranovski MD, Chen Y, Long M. 2007. A microarray based genomic hybridization method for identification of new genes in plants: case analyses of Arabidopsis and Oryza. J. Integr. Plant Biol. 49(6):915–26

46. Feschotte C. 2008. Transposable elements and the evolution of regulatory networks. Nat. Rev. Genet. 9(5):397–405

47. Feuk L, Carson AR, Scherer SW. 2006. Structural variation in the human genome. Nat. Rev. Genet. 7(2):85–97

48. Francino MP. 2005. An adaptive radiation model for the origin of new gene functions. Nat. Genet. 37(6):573–77


49. Fu B, Chen M, Zou M, Long M, He S. 2010. The rapid generation of chimerical genes expanding protein diversity in zebrafish. BMC Genomics 11(1):657

50. Gallach M, Chandrasekaran C, Betran E. 2010. Analyses of nuclearly encoded mitochondrial genes suggest gene duplication as a mechanism for resolving intralocus sexually antagonistic conflict in Drosophila. Genome Biol. Evol. 2:835–50

51. Gardiner A, Barker D, Butlin RK, Jordan WC, Ritchie MG. 2008. Evolution of a complex locus: exon gain, loss and divergence at the Gr39a locus in Drosophila. PLoS ONE 3(1):e1513

52. Gilbert W. 1978. Why genes in pieces? Nature 271:501

53. Gillespie J. 1991. The Causes of Molecular Evolution. New York: Oxford Univ. Press

54. Graubert TA, Cahan P, Edwin D, Selzer RR, Richmond TA, et al. 2007. A high-resolution map of segmental DNA copy number variation in the mouse genome. PLoS Genet. 3(1):e3

55. Haldane J. 1932. The time of action of genes, and its bearing on some evolutionary problems. Am. Nat. 66(702):5–24

56. Hall C, Brachat S, Dietrich FS. 2005. Contribution of horizontal gene transfer to the evolution of Saccharomyces cerevisiae. Eukaryot. Cell 4(6):1102–15

57. Harr B, Turner LM. 2010. Genome-wide analysis of alternative splicing evolution among Mus subspecies. Mol. Ecol. 19:228–39

58. Heinen TJAJ, Staubach F, Haming D, Tautz D. 2009. Emergence of a new gene from an intergenic region. Curr. Biol. 19(18):1527–31

59. Hense W, Baines JF, Parsch J. 2007. X chromosome inactivation during Drosophila spermatogenesis. PLoS Biol. 5(10):e273

60. Hudson RR, Kreitman M, Aguade M. 1987. A test of neutral molecular evolution based on nucleotide data. Genetics 116:153–59

61. Innan H, Kondrashov F. 2010. The evolution of gene duplications: classifying and distinguishing between models. Nat. Rev. Genet. 11(2):97–108

62. Int. Chicken Genome Seq. Consort. 2004. Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution. Nature 432(7018):695–716

63. Jiang N, Bao Z, Zhang X, Eddy SR, Wessler SR. 2004. Pack-MULE transposable elements mediate gene evolution in plants. Nature 431:569–73

64. Jones CD, Begun DJ. 2005. Parallel evolution of chimeric fusion genes. Proc. Natl. Acad. Sci. USA 102(32):11373–78

65. Kaessmann H, Vinckenbosch N, Long M. 2009. RNA-based gene duplication: mechanistic and evolutionary insights. Nat. Rev. Genet. 10(1):19–31

66. Katju V, Lynch M. 2003. The structure and early evolution of recently arisen gene duplicates in the Caenorhabditis elegans genome. Genetics 165(4):1793–803

67. Katju V, Lynch M. 2006. On the formation of novel genes by duplication in the Caenorhabditis elegans genome. Mol. Biol. Evol. 23(5):1056–67

68. Kellis M, Birren BW, Lander ES. 2004. Proof and evolutionary analysis of ancient genome duplication in the yeast Saccharomyces cerevisiae. Nature 428(6983):617–24

69. Keren H, Lev-Maor G, Ast G. 2010. Alternative splicing and evolution: diversification, exon definition and function. Nat. Rev. Genet. 11(5):345–55

70. Khil PP, Smirnova NA, Romanienko PJ, Camerini-Otero RD. 2004. The mouse X chromosome is enriched for sex-biased genes not subject to selection by meiotic sex chromosome inactivation. Nat. Genet. 36(6):642–46

71. Knowles DG, McLysaght A. 2009. Recent de novo origin of human protein-coding genes. Genome Res. 19(10):1752–59

72. Konikoff CE, Wisotzkey RG, Stinchfield MJ, Newfeld SJ. 2010. Distinct molecular evolutionary mechanisms underlie the functional diversification of the Wnt and TGF β signaling pathways. J. Mol. Evol. 70:303–12

73. Koonin EV, Makarova KS, Aravind L. 2001. Horizontal gene transfer in prokaryotes: quantification and classification. Annu. Rev. Microbiol. 55:709–42

74. Langley CH, Stevens K, Cardeno C, Lee Y, Schrider DR, et al. 2012. Genomic variation in natural populations of Drosophila melanogaster. Genetics 192(2):533–98


75. Levine MT, Jones CD, Kern AD, Lindfors HA, Begun DJ. 2006. Novel genes derived from noncoding DNA in Drosophila melanogaster are frequently X-linked and exhibit testis-biased expression. Proc. Natl. Acad. Sci. USA 103(26):9935–39

76. Lewin B, Krebs JE, Goldstein ES, Kilpatrick ST. 2011. Lewin’s Genes X. Sudbury, MA: Jones and Bartlett Publ.

77. Li D, Dong Y, Jiang Y, Jiang H, Cai J, Wang W. 2010. A de novo originated gene depresses budding yeast mating pathway and is repressed by the protein encoded by its antisense strand. Cell Res. 20(4):408–20

78. Li WH, Gojobori T. 1983. Rapid evolution of goat and sheep globin genes following gene duplication. Mol. Biol. Evol. 1(1):94–108

79. Li W-H. 1997. Molecular Evolution. Sunderland, MA: Sinauer Assoc.

80. Li Z, Liu M, Zhang L, Zhang W, Gao G, et al. 2009. Detection of intergenic non-coding RNAs expressed in the main developmental stages in Drosophila melanogaster. Nucleic Acids Res. 37(13):4308–14

81. Lipinski KJ, Farslow JC, Fitzpatrick KA, Lynch M, Katju V, Bergthorsson U. 2011. High spontaneous rate of gene duplication in Caenorhabditis elegans. Curr. Biol. 21(4):306–10

82. Llopart A, Comeron JM, Brunet G, Lachaise D, Long M. 2002. Intron presence: absence polymorphism in Drosophila. Proc. Natl. Acad. Sci. USA 99(12):8121–26

83. Long M. 1992. The origin and evolutionary mechanisms of new genes. PhD Diss. Univ. Calif., Davis. 139 pp.

84. Long M, Betran E, Thornton K, Wang W. 2003. The origin of new genes: glimpses from the young and old. Nat. Rev. Genet. 4(11):865–75

85. Long M, Langley CH. 1993. Natural selection and the origin of jingwei, a chimeric processed functional gene in Drosophila. Science 260(5104):91–95

86. Long M, Rosenberg C, Gilbert W. 1995. Intron phase correlations and the evolution of the intron/exon structure of genes. Proc. Natl. Acad. Sci. USA 92(26):12495–99

87. Long M, Vibranovski MD, Zhang YE. 2012. Evolutionary interactions between sex chromosomes and autosomes. In Rapidly Evolving Genes and Genetic Systems, ed. R Singh, J Xu, R Kulathinal, pp. 101–14. Oxford: Oxford Univ. Press

88. Lorenc A, Makałowski W. 2003. Transposable elements and vertebrate protein diversity. Genetica 118(2–3):183–91

89. Lu J, Fu Y, Kumar S, Shen Y, Zeng K, et al. 2008. Adaptive evolution of newly emerged micro-RNA genes in Drosophila. Mol. Biol. Evol. 25(5):929–38

90. Lu J, Shen Y, Wu Q, Kumar S, He B, et al. 2008. The birth and death of microRNA genes in Drosophila. Nat. Genet. 40(3):351–55

91. Makalowski W. 1995. SINEs as a genomic scrap yard: an essay on genomic evolution. In The Impact of Short Interspersed Elements (SINEs) on the Host Genome, ed. R Maraia, pp. 81–104. Austin, TX: R.G. Landes

92. Marques AC, Dupanloup I, Vinckenbosch N, Reymond A, Kaessmann H. 2005. Emergence of young human genes after a burst of retroposition in primates. PLoS Biol. 3(11):e357

93. Marques AC, Tan J, Lee S, Kong L, Heger A, Ponting CP. 2012. Evidence for conserved post-transcriptional roles of unitary pseudogenes and for frequent bifunctionality of mRNAs. Genome Biol. 13(11):R102

94. Marques AC, Tan J, Ponting CP. 2011. Wrangling for microRNAs provokes much crosstalk. Genome Biol. 12(11):132

95. Matsuno M, Compagnon V, Schoch GA, Schmitt M, Debayle D, et al. 2009. Evolution of a novel phenolic pathway for pollen development. Science 325(5948):1688–92

96. McCarrey JR, Riggs AD. 1986. Determinator-inhibitor pairs as a mechanism for threshold setting in development: a possible function for pseudogenes. Proc. Natl. Acad. Sci. USA 83(3):679–83

97. McDonald JH, Kreitman M. 1991. Adaptive protein evolution at the Adh locus in Drosophila. Nature 351:652–54

98. McLysaght A, Hokamp K, Wolfe KH. 2002. Extensive genomic duplication during early chordate evolution. Nat. Genet. 31(2):200–4

99. Meiklejohn CD, Landeen EL, Cook JM, Kingan SB, Presgraves DC. 2011. Sex chromosome-specific regulation in the Drosophila male germline but little evidence for chromosomal dosage compensation or meiotic inactivation. PLoS Biol. 9(8):e1001126


100. Meisel RP, Han MV, Hahn MW. 2009. A complex suite of forces drives gene traffic from Drosophila X chromosomes. Genome Biol. Evol. 1:176–88

101. Mills RE, Walter K, Stewart C, Handsaker RE, Chen K, et al. 2011. Mapping copy number variation by population-scale genome sequencing. Nature 470(7332):59–65

102. Moran NA, Jarvik T. 2010. Lateral transfer of genes from fungi underlies carotenoid production in aphids. Science 328(5978):624–27

103. Muller HJ. 1936. Bar duplication. Science 83(2161):528–30

104. Murphy DN, McLysaght A. 2012. De novo origin of protein-coding genes in murine rodents. PLoS ONE 7(11):e48650

105. Nasvall J, Sun L, Roth JR, Andersson DI. 2012. Real-time evolution of new genes by innovation, amplification, and divergence. Science 338(6105):384–87

106. Nekrutenko A, Li WH. 2001. Transposable elements are found in a large number of human protein-coding genes. Trends Genet. 17(11):619–21

107. Ni X, Zhang YE, Negre N, Chen S, Long M, White KP. 2012. Adaptive evolution and the birth of CTCF binding sites in the Drosophila genome. PLoS Biol. 10(11):e1001420

108. Nozawa M, Aotsuka T, Tamura K. 2005. A novel chimeric gene, siren, with retroposed promoter sequence in the Drosophila bipectinata complex. Genetics 171(4):1719–27

109. Nozawa M, Miura S, Nei M. 2010. Origins and evolution of microRNA genes in Drosophila species. Genome Biol. Evol. 2:180–89

110. Nurminsky DI, Nurminskaya MV, De Aguiar D, Hartl DL. 1998. Selective sweep of a newly evolved sperm-specific gene in Drosophila. Nature 396:572–75

111. Ochman H, Lawrence JG, Groisman EA. 2000. Lateral gene transfer and the nature of bacterial innovation. Nature 405(6784):299–304

112. Ohno S. 1970. Evolution by Gene Duplication. New York: Springer-Verlag. 160 pp.

113. Okamura K, Feuk L, Marques-Bonet T, Navarro A, Scherer SW. 2006. Frequent appearance of novel protein-coding sequences by frameshift translation. Genomics 88(6):690–97

114. Parisi M, Nuttall R, Naiman D, Bouffard G, Malley J, et al. 2003. Paucity of genes on the Drosophila X chromosome showing male-biased expression. Science 299(5607):697–700

115. Park J, Semyonov J, Chang CL, Yi W, Warren W, et al. 2008. Origin of INSL3-mediated testicular descent in therian mammals. Genome Res. 18:974–85

116. Piriyapongsa J, Jordan IK. 2008. Dual coding of siRNAs and miRNAs by plant transposable elements. Bioinformatics 14:814–21

117. Ranz JM, Castillo-Davis CI, Meiklejohn CD, Hartl DL. 2003. Sex-dependent gene expression and evolution of the Drosophila transcriptome. Science 300(5626):1742–45

118. Rogers RL, Bedford T, Hartl DL. 2009. Formation and longevity of chimeric and duplicate genes in Drosophila melanogaster. Genetics 181(1):313–22

119. Rogers RL, Hartl DL. 2012. Chimeric genes as a source of rapid evolution in Drosophila melanogaster. Mol. Biol. Evol. 29(2):517–29

120. Rollmann SM, Wang P, Date P, West SA, Mackay TF, Anholt RR. 2010. Odorant receptor polymorphisms and natural variation behavior in Drosophila melanogaster. Genetics 186:687–97

121. Ross BD, Rosin L, Thomae AW, Hiatt MA, Vermaak D, et al. 2013. Stepwise evolution of essential centromere function in a Drosophila neogene. Science 340(6137):1211–14

122. Sabath N, Wagner A, Karlin D. 2012. Evolution of viral proteins originated de novo by overprinting. Mol. Biol. Evol. 29(12):3767–80

123. Schrider DR, Navarro FCP, Galante PAF, Parmigiani RB, Camargo AA, et al. 2013. Gene copy-number polymorphism caused by retrotransposition in humans. PLoS Genet. 9(1):e1003242

124. Schrider DR, Stevens K, Cardeno CM, Langley CH, Hahn MW. 2011. Genome-wide analysis of retrogene polymorphisms in Drosophila melanogaster. Genome Res. 21(12):2087–95

125. Sebat J, Lakshmi B, Troge J, Alexander J, Young J, et al. 2004. Large-scale copy number polymorphism in the human genome. Science 305(5683):525–28

126. Semon M, Wolfe KH. 2007. Consequences of genome duplication. Curr. Opin. Genet. Dev. 17(6):505–12

127. Shih H-J, Jones CD. 2008. Patterns of amino acid evolution in the Drosophila ananassae chimeric gene, siren, parallel those of other Adh-derived chimeras. Genetics 180(2):1261–63


128. Sturgill D, Zhang Y, Parisi M, Oliver B. 2007. Demasculinization of X chromosomes in the Drosophila genus. Nature 450(7167):238–41

129. Sturtevant AH. 1925. The effects of unequal crossing over at the bar locus in Drosophila. Genetics 10:117–47

130. Sudhof TC, Goldstein JL, Brown MS, Russell DW. 1985. The LDL receptor gene: a mosaic of exons shared with different proteins. Science 228(4701):815–22

131. Tao Y, Araripe L, Kingan SB, Ke Y, Xiao H, Hartl DL. 2007. A sex-ratio meiotic drive system in Drosophila simulans. II: An X-linked distorter. PLoS Biol. 5(11):e293

132. Tao Y, Masly JP, Araripe L, Ke Y, Hartl DL. 2007. A sex-ratio meiotic drive system in Drosophila simulans. I: An autosomal suppressor. PLoS Biol. 5(11):e292

133. Thomson TM, Lozano JJ, Loukili N, Carrio R, Serra F, et al. 2000. Fusion of the human gene for the polyubiquitination coeffector UEV1 with Kua, a newly identified gene. Genome Res. 10(11):1743–56

134. Thornton KR. 2003. Gene conversion and natural selection at duplicate loci in Drosophila melanogaster. PhD Diss. Univ. Chicago. 181 pp.

135. Thornton KR. 2007. The neutral coalescent process for recent gene duplications and copy-number variants. Genetics 177(2):987–1000

136. Tiedge H, Chen W, Brosius J. 1993. Primary structure, neural-specific expression, and dendritic location of human BC200 RNA. J. Neurosci. 13:2382–90

137. Toll-Riera M, Bosch N, Bellora N, Castelo R, Armengol L, et al. 2009. Origin of primate orphan genes: a comparative genomics approach. Mol. Biol. Evol. 26(3):603–12

138. Toups MA, Hahn MW. 2010. Retrogenes reveal the direction of sex-chromosome evolution in mosquitoes. Genetics 186(2):763–66

139. Vibranovski MD, Lopes HF, Karr TL, Long M. 2009. Stage-specific expression profiling of Drosophila spermatogenesis suggests that meiotic sex chromosome inactivation drives genomic relocation of testis-expressed genes. PLoS Genet. 5(11):e1000731

140. Vibranovski MD, Zhang YE, Kemkemer C, Lopes HF, Karr TL, Long M. 2012. Re-analysis of the larval testis data on meiotic sex chromosome inactivation revealed evidence for tissue-specific gene expression related to the Drosophila X chromosome. BMC Biol. 10(1):49; author reply 50

141. Vibranovski MD, Zhang YE, Kemkemer C, VanKuren NW, Lopes HF, et al. 2012. Segmental dataset and whole body expression data do not support the hypothesis that non-random movement is an intrinsic property of Drosophila retrogenes. BMC Evol. Biol. 12:169

142. Vibranovski MD, Zhang Y, Long M. 2009. General gene movement off the X chromosome in the Drosophila genus. Genome Res. 19(5):897–903

143. Vicoso B, Charlesworth B. 2009. The deficit of male-biased genes on the D. melanogaster X Chromosome is expression-dependent: a consequence of dosage compensation? J. Mol. Evol. 68(5):576–83

144. Vinckenbosch N, Dupanloup I, Kaessmann H. 2006. Evolutionary fate of retroposed gene copies in the human genome. Proc. Natl. Acad. Sci. USA 103(9):3220–25

145. Walsh B. 2003. Population-genetic models of the fates of duplicate genes. Genetica 118(2–3):279–94

146. Walsh JB. 1995. How often do duplicated genes evolve new functions? Genetics 139(1):421–28

147. Wang J, Long M, Vibranovski MD. 2012. Retrogenes moved out of the Z chromosome in the silkworm. J. Mol. Evol. 74(3–4):113–26

148. Wang J, Mager J, Chen Y, Schneider E, Cross JC, et al. 2001. Imprinted X inactivation maintained by a mouse Polycomb group gene. Nat. Genet. 28(4):371–75

149. Wang W, Yu H, Long M. 2004. Duplication-degeneration as a mechanism of gene fission and the origin of new genes in Drosophila species. Nat. Genet. 36(5):523–27

150. Wang W, Zhang J, Alvarez C, Llopart A, Long M. 2000. The origin of the jingwei gene and the complex modular structure of its parental gene, yellow emperor, in Drosophila melanogaster. Mol. Biol. Evol. 17(9):1294–301

151. Wang W, Zheng H, Fan C, Li J, Shi J, et al. 2006. High rate of chimeric gene origination by retroposition in plant genomes. Plant Cell 18:1791–802

152. Weng J-K, Li Y, Mo H, Chapple C. 2012. Assembly of an evolutionarily new pathway for α-pyrone biosynthesis in Arabidopsis. Science 337(6097):960–64


153. Wu D-D, Irwin DM, Zhang Y-P. 2011. De novo origin of human protein-coding genes. PLoS Genet. 7(11):e1002379

154. Xiao W, Liu H, Li Y, Li X, Xu C, et al. 2009. A rice gene of de novo origin negatively regulates pathogen-induced defense response. PLoS ONE 4(2):e4603

155. Xie C, Zhang YE, Chen J-Y, Liu C-J, Zhou W-Z, et al. 2012. Hominoid-specific de novo protein-coding genes originating from long non-coding RNAs. PLoS Genet. 8(9):e1002942

156. Xu G, Guo C, Shan H, Kong H. 2012. Divergence of duplicate genes in exon-intron structure. Proc. Natl. Acad. Sci. USA 109(4):1187–92

157. Xue S, Jones MD, Lu Q, Middeldorp JM, Griffin BE. 2003. Genetic diversity: frameshift mechanisms alter coding of a gene (Epstein-Barr virus LF3 gene) that contains multiple 102-base-pair direct sequence repeats. Mol. Cell. Biol. 23(6):2192–201

158. Yang S, Arguello JR, Li X, Ding Y, Zhou Q, et al. 2008. Repetitive element-mediated recombination as a mechanism for new gene origination in Drosophila. PLoS Genet. 4(1):e3

159. Yang Z, Huang J. 2011. De novo origin of new genes with introns in Plasmodium vivax. FEBS Lett. 585(4):641–44

160. Yeh S-D, Do T, Chan C, Cordova A, Carranza F, et al. 2012. Functional evidence that a recently evolved Drosophila sperm-specific gene boosts sperm competition. Proc. Natl. Acad. Sci. USA 109(6):2043–48

161. Yoshida S, Maruyama S, Nozaki H, Shirasu K. 2010. Horizontal gene transfer by the parasitic plant Striga hermonthica. Science 328:1128

162. Zhang J, Dean AM, Brunet F, Long M. 2004. Evolving protein functional diversity in new genes of Drosophila. Proc. Natl. Acad. Sci. USA 101(46):16246–50

163. Zhang PG, Huang SZ, Pin A-L, Adams KL. 2010. Extensive divergence in alternative splicing patterns after gene and genome duplication during the evolutionary history of Arabidopsis. Mol. Biol. Evol. 27(7):1686–97

164. Zhang Y, Lu S, Zhao S, Zheng X, Long M, Wei L. 2009. Positive selection for the male functionality of a co-retroposed gene in the hominoids. BMC Evol. Biol. 9:252

165. Zhang Y, Wu Y, Liu Y, Han B. 2005. Computational identification of 69 retroposons in Arabidopsis. Plant Physiol. 138:935–48

166. Zhang YE, Landback P, Vibranovski MD, Long M. 2011. Accelerated recruitment of new brain development genes into the human genome. PLoS Biol. 9(10):e1001179

167. Zhang YE, Landback P, Vibranovski M, Long M. 2012. New genes expressed in human brains: implications for annotating evolving genomes. BioEssays 34(11):982–91

168. Zhang YE, Vibranovski MD, Krinsky BH, Long M. 2010. Age-dependent chromosomal distribution of male-biased genes in Drosophila. Genome Res. 20(11):1526–33

169. Zhang YE, Vibranovski MD, Landback P, Marais GAB, Long M. 2010. Chromosomal redistribution of male-biased genes in mammalian evolution with two bursts of gene gain on the X chromosome. PLoS Biol. 8(10):e1000494

170. Zhen Y, Aardema ML, Medina EM, Schumer M, Andolfatto P. 2012. Parallel molecular evolution in an herbivore community. Science 337(6102):1634–37

171. Zheng D, Gerstein MB. 2007. The ambiguous boundary between genes and pseudogenes: The dead rise up, or do they? Trends Genet. 23(5):219–24

172. Zhou Q, Zhang G, Zhang Y, Xu S, Zhao R, et al. 2008. On the origin of new genes in Drosophila. Genome Res. 18(9):1446–55

173. Zhou R, Moshgabadi N, Adams KL. 2011. Extensive changes to alternative splicing patterns following allopolyploidy in natural and resynthesized polyploids. Proc. Natl. Acad. Sci. USA 108(38):16122–27

174. Zhu Z, Zhang Y, Long M. 2009. Extensive structural renovation of retrogenes in the evolution of the Populus genome. Plant Physiol. 151(4):1943–51




