Genome-Wide Analysis of Stowaway-Like MITEs in Wheat Reveals High Sequence Conservation, Gene Association, and Genomic Diversification 1[C][W]

Beery Yaakov2, Smadar Ben-David2, and Khalil Kashkush*

Department of Life Sciences, Ben-Gurion University, Beer-Sheva 84105, Israel

The diversity and evolution of wheat (Triticum-Aegilops group) genomes is determined, in part, by the activity of transposable elements that constitute a large fraction of the genome (up to 90%). In this study, we retrieved sequences from publicly available wheat databases, including a 454-pyrosequencing database, and analyzed 18,217 insertions of 18 Stowaway-like miniature inverted-repeat transposable element (MITE) families previously characterized in wheat that together account for approximately 1.3 Mb of sequence. All 18 families showed high conservation in length, sequence, and target site preference. Furthermore, approximately 55% of the elements were inserted in transcribed regions, into or near known wheat genes. Notably, we observed significant correlation between the mean length of the MITEs and their copy number. In addition, the genomic composition of nine MITE families was studied by real-time quantitative polymerase chain reaction analysis in 40 accessions of Triticum spp. and Aegilops spp., including diploids, tetraploids, and hexaploids. The quantitative polymerase chain reaction data showed massive and significant intraspecific and interspecific variation as well as genome-specific proliferation and nonadditive quantities in the polyploids. We also observed significant differences in the methylation status of the insertion sites among MITE families. Our data thus suggest a possible role for MITEs in generating genome diversification and in the establishment of nascent polyploid species in wheat.

Wheat (Triticum-Aegilops group) likely originated from a common ancestor some 4 million years ago and has since undergone multiple polyploidization events. As such, this organism has been the subject of substantial research into genomic evolution and diversification. Beginning with three ancestral diploid species, two major allopolyploidization events subsequently occurred, resulting in the appearance of tetraploid (pasta) wheat (Triticum turgidum ssp. durum; 2n = 4x = 28; genome AABB) around 0.5 million years ago and hexaploid (bread) wheat (Triticum aestivum; 2n = 6x = 42; genome AABBDD) around 10,000 years ago (Feldman and Levy, 2005). Bread wheat harbors three distinct, yet related, genomes, namely the Au genome originating from Triticum urartu, the B (or S) genome originating from a section of Sitopsis species, most probably Aegilops speltoides or Aegilops searsii, and the D genome originating from Aegilops tauschii (Petersen et al., 2006). The availability of several diploid ancestors of wheat and their polyploid species as research models allows for the tracking of those evolutionary changes that enabled diversification of the different genomes as well as their differentiation within the polyploid species. Past studies on phylogenetic relationships between members of the Triticum-Aegilops group employed nuclear (Mori et al., 1995; Sasanuma et al., 1996; Wang et al., 2000a; Huang et al., 2002; Kudryavtsev et al., 2004; Sallares and Brown, 2004) or organellar (Wang et al., 2000b; Haider and Nabulsi, 2008) DNA markers to cluster divergent species. At the same time, molecular markers have been developed to study wheat phylogeny resulting from polymorphism in transposable element (TE) insertions (Queen et al., 2004; Kalendar et al., 2011; Baruch and Kashkush, 2012), including miniature inverted-repeat transposable elements (MITEs; Yaakov et al., 2012).

TEs are sequences of DNA that multiply independently of the cell cycle, with some sequences, termed retrotransposons, relying on transcription to “copy and paste” themselves into new sites in the genome. A second group of sequences, termed DNA transposons, employ a recombination-like mechanism to the same end (Wicker et al., 2007). The host genome combats the potential deleterious effects of TE activity by inhibiting their transcription and transposition through epigenetic mechanisms, such as cytosine methylation, chromatin modification, and RNA interference. MITEs are nonautonomous DNA elements (i.e. sequences that rely on transposases expressed by autonomous elements for their transposition) and are ubiquitous to eukaryotic genomes. MITEs are very short in length, containing up to a few hundred base pairs, and present structural similarity, conserved terminal repeats, and high copy numbers in some species (Wicker et al., 2001; Jiang et al., 2004; Isidore et al., 2005; Miller et al., 2006; Cloutier et al., 2007; Choulet et al., 2010). Moreover, MITEs have been shown to be active in rice (Oryza sativa; Jiang et al., 2003; Kikuchi et al., 2003; Nakazaki et al., 2003; Shan et al., 2005; Naito et al., 2006, 2009).

Although several Stowaway-like MITE families have been characterized in wheat, their structural similarity, level of activation, epigenetic regulation, and association with wheat genes are poorly understood. In this study, we performed a detailed analysis of thousands of MITE insertions belonging to 18 Stowaway-like elements as found in publicly available wheat sequences, including the 454-pyrosequencing draft of a hexaploid wheat ‘Chinese Spring’ genome. As was reported for MITEs in other fully sequenced plant genomes, we noted high sequence conservation, most notably TA-dinucleotide target site preference, and significant association with transcribed regions in wheat. We also noticed a significant correlation between the average length of a MITE family and its copy number in hexaploid wheat. Furthermore, we assessed the genomic composition of nine MITE families using real-time quantitative PCR (qPCR) in 40 different accessions of wheat, including Triticum spp. and Aegilops spp., and predicted their copy numbers, based on the number of elements for each family, estimated bioinformatically in the hexaploid, employing information from the 454-pyrosequencing database. The qPCR data revealed Triticum spp. or Aegilops spp. element specificity as well as deviations from expected additive values in the polyploid species.

1 This work was supported by the Israel Science Foundation (grant no. 142/08 to K.K.).

2 These authors contributed equally to the article.

* Corresponding author; e-mail [email protected].

The author responsible for distribution of materials integral to the findings presented in this article in accordance with the policy described in the Instructions for Authors (www.plantphysiol.org) is: Khalil Kashkush ([email protected]).

[C] Some figures in this article are displayed in color online but in black and white in the print edition.

[W] The online version of this article contains Web-only data.

www.plantphysiol.org/cgi/doi/10.1104/pp.112.204404

Plant Physiology®, January 2013, Vol. 161, pp. 486–496, www.plantphysiol.org © 2012 American Society of Plant Biologists. All Rights Reserved.

Copyright © 2013 American Society of Plant Biologists. All rights reserved.

RESULTS

In Silico Analysis of MITEs

Retrieval of Stowaway-Like MITE Families from the Wheat 454-Pyrosequencing Database

The availability of a 454 sequence draft for hexaploid wheat facilitated a genome-wide analysis of 18 characterized Stowaway-like MITE families (publicly available at the Triticeae Repeat Sequence Database; http://wheat.pw.usda.gov/ITMI/Repeats/), given their short length (Table I). Overall, 18,217 MITE insertions were retrieved in silico, using the MITE analysis kit (MAK) software (kindly provided by Guojun Yang, University of Toronto; Yang and Hall, 2003; Janicki et al., 2011). The publicly available sequence of each of the 18 MITEs was used as a query in the MAK program to perform BLASTN against the draft 454-pyrosequencing database. Use of the MAK software also retrieved 100 bp of flanking sequence (5′ and 3′ flanking sequences) and indicated terminal duplications for each hit. As a negative control, we included the publicly available sequence of a rice-unique MITE family, termed mPing (Jiang et al., 2003), as a BLASTN query against the draft 454 wheat sequence database. As expected, no sequences were retrieved from the wheat genomic database in this case. It is important to mention that because of the unassembled (5× coverage) reads and because of the quality of the sequence information, we used the following criteria in our analysis. (1) Output sequences from BLASTN with the same identifier number in the 454-pyrosequencing database were removed from the analysis, because in some cases the MAK software-generated output file included sequences from both positive and negative strands. We noticed this phenomenon for MITE families that contain short internal sequences (such as Athos; Table I), meaning that both positive and negative strands can pass the E-value used in the BLAST analysis. (2) The 454-pyrosequencing reads that contain nearly intact elements that significantly align with the query transposon sequence were included in our analysis. It is important to note that we considered the elements that were truncated at one of the terminal sequences as being nearly intact elements. (3) Duplicated hits in the MAK output file, resulting from duplicate reads in the 454-pyrosequencing database, were removed manually by BLAST-based sequence alignment of the flanking sequences of all output MAK file sequences to each other and subsequent exclusion of similar sequences (see “Materials and Methods”). Thus, the number of retrieved elements could be an underestimation and might not represent the true copy number of each family in hexaploid wheat. With this in mind, we noted a massive difference in the number of retrieved insertions in each family, from 14 insertions for Phoebus and up to 2,604 and 4,855 insertions for Hades and Athos, respectively (Table I). When considered together, the retrieved MITE sequences account for approximately 1.3 Mb (calculated based on copy number and average element size) of the approximately 17,000 Mb that constitutes the wheat genome.
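Criterion 1 above can be illustrated with a minimal Python sketch. This is not the MAK software or the authors' script: the `Hit` record, its field names, and the example reads are hypothetical. The text does not state whether one copy of a doubly reported read was retained or whether both were discarded, so the sketch assumes that the hit with the lowest E value is kept for each read identifier.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One BLASTN hit (hypothetical record layout, not the MAK output format)."""
    read_id: str   # identifier of the 454 read
    strand: str    # "+" or "-"
    evalue: float  # BLASTN E value of the hit


def one_hit_per_read(hits):
    """Keep a single hit per read identifier (assumption: the lowest E value wins)."""
    best = {}
    for hit in hits:
        kept = best.get(hit.read_id)
        if kept is None or hit.evalue < kept.evalue:
            best[hit.read_id] = hit
    return list(best.values())


# Made-up example: read_001 is reported on both strands, as can happen for
# families with short internal sequences.
hits = [
    Hit("read_001", "+", 1e-20),
    Hit("read_001", "-", 1e-5),
    Hit("read_002", "+", 1e-12),
]
unique_hits = one_hit_per_read(hits)
```

Removing duplicate reads (criterion 3) is a separate step that compares flanking sequences and is not covered by this sketch.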

High Level of Conservation of Stowaway-Like MITE Families in Wheat

Detailed analysis using Galaxy software (see “Materials and Methods”) of each MITE family showed a high level of conservation in average element length (Supplemental Fig. S1). For all MITE families, we noted very low variation in the length of the different members of the same family, as the SD varied between 3.9 and 9.5 bases (Supplemental Fig. S1). This is despite the fact that truncated elements (nearly intact elements; see above) were included in the analysis. In addition, we observed high sequence conservation for the 18 MITE families, as revealed by multiple sequence alignment. The level of sequence similarity ranged from 61% for high-copy-number families, such as Athos, Hades, and Thalos, and up to 99% for families with low copy numbers, such as Jason, Phoebus, and Polyphemus. Interestingly, sequence conservation at terminal inverted repeat (TIR) regions was very high for all MITE families (over 95%). It is important to note that our analysis was unbiased toward highly conserved elements, because an E value of e−3 was used to retrieve sequences with the MAK software. Recall that we retrieved no mPing elements in the wheat database using the same E value, indicating that no artifacts were obtained in our analysis.

The MAK software also retrieves short duplicated target site sequences based on the analysis of both flanking sequences of a MITE element. We described target site preferences using the short duplicated output sequences created by MAK as an input file in the WebLogo 3.0 package (Crooks et al., 2004). Briefly, WebLogo calculates the relative frequency of each nucleotide at a given position and their relative abundance at different positions (see “Materials and Methods”). The observed logos with significant probabilities of certain nucleotides indicate target site preference. This analysis revealed that the 18 MITE families possess notable target site preference (Table I; Fig. 1). In most cases, the target site preference was the TA dinucleotide, in agreement with the literature on the preference of Stowaway-like MITEs (for review, see Jiang et al., 2004).
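The per-position relative frequency that underlies a sequence logo can be sketched in a few lines of Python. This is only the counting step, not WebLogo's implementation (which also derives information content for the letter heights); the function name and the example target site duplications are made up for illustration.

```python
from collections import Counter


def position_frequencies(sites):
    """Return, for each position, the relative frequency of A, C, G and T."""
    if not sites:
        return []
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all target sites must have the same length")
    freqs = []
    for column in zip(*sites):
        counts = Counter(base.upper() for base in column)
        freqs.append({base: counts[base] / len(sites) for base in "ACGT"})
    return freqs


# Hypothetical target site duplications for one family (not data from the study).
tsds = ["TA", "TA", "TA", "TG"]
freqs = position_frequencies(tsds)
# freqs[0]["T"] is 1.0 and freqs[1]["A"] is 0.75 for this toy input.
```

A family with a strong TA preference would show frequencies close to 1.0 for T at the first position and for A at the second.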

Annotation of MITEs and Flanking Sequences

Because MITEs are nonautonomous, namely lacking sequences that code for transposases and a promoter, it is assumed that they are not transcribed. However, when performing BLAST with MITE sequences against Triticum spp. and Aegilops spp. EST and mRNA databases from the National Center for Biotechnology Information (NCBI), we identified 943 unique chimeric transcripts (653 unique ESTs and 290 unique transcripts containing mRNA characteristics) that contained MITE sequences (Table I). As these 943 transcripts are unique, we assumed them to contain different MITE insertions and thus concluded that approximately 5.1% (943 of 18,217) of the retrieved MITEs underwent transcription, most probably from adjacent promoters. We then tested the locations of the additional MITE insertions (18,217 – 943 = 17,274 elements) by annotating MITE-flanking sequences that were retrieved by MAK, together with the MITEs (see “Materials and Methods”). However, because of the short read length (approximately 388 bp on average) of the 454-pyrosequencing database sequences, we were only able to retrieve short flanking sequences (approximately 100 bp from each side of the element). Surprisingly, we found that approximately 63% of the MITE insertions (11,507 of the 18,217 elements) are located adjacent (within 100 bp) to unique transcribed sequences (Table I). Detailed analysis led to the identification of 76 MITE insertions within introns or near well-characterized wheat genes (Supplemental Table S1). Specifically, 20 MITE insertions (26.3%) were found in the introns of 11 genes (Supplemental Fig. S2), 24 insertions (31.5%) were found upstream of the 5′ untranslated region of a gene, and 32 insertions (42.1%) were found downstream of the 3′ untranslated region of a gene (Supplemental Table S2). The MITE-containing genes included those involved in disease resistance, transport, cell division, DNA repair, transcription, and other roles as well as a glutenin precursor.

Table I. In silico analysis of 18,217 Stowaway-like MITEs

TE          Copy No.  Element    TIR Size  Target Site  TE BLAST Hits (a)         Flanking BLAST Hits (b)
                      Size (bp)  (bp)      Preference   EST   mRNA  Total (%)     EST    mRNA   Total (%)
Athos       4,855     85         41        TA           38    129   167 (3.4)     1,488  1,071  2,559 (52.7)
Hades       2,604     96         22        TA           25    26    89 (3.4)      1,278  556    1,834 (70.4)
Thalos      2,031     162        61        TA           300   78    378 (18.6)    1,125  266    1,391 (68.4)
Icarus      1,663     112        28        TA           115   27    142 (8.5)     895    239    1,134 (68.1)
Xados       1,391     116        30        TA           79    10    41 (2.9)      595    271    866 (62.2)
Minos       1,132     236        25        TA           10    0     29 (2.5)      300    232    532 (46.9)
Pan         1,048     127        58        AC           37    4     51 (4.8)      782    219    1,001 (95.5)
Aison       775       219        45        TA           21    8     3 (0.3)       393    202    595 (76.7)
Eos         615       354        52        CTTAG        3     0     10 (1.6)      470    39     509 (82.7)
Stolos      538       259        21        TA           2     0     2 (0.3)       149    52     201 (37.3)
Oleus       489       150        30        TA           12    4     16 (3.2)      86     78     164 (33.5)
Antonio     415       108        25        TA           6     3     9 (2.1)       157    99     256 (61.6)
Minimus     335       55         26        TA           2     0     2 (0.5)       157    76     233 (69.5)
Fortuna     169       353        30        TA           2     0     2 (1.1)       118    49     167 (98.8)
Tantalos    112       257        30        TA           1     0     1 (0.8)       23     11     34 (30.3)
Polyphemus  16        241        73        TA           0     0     0 (0)         16     3      19 (118.7)
Jason       15        260        51        TA           0     1     1 (6.6)       0      1      1 (6.6)
Phoebus     14        319        15        CG           0     0     0 (0)         9      2      11 (78.5)
Total       18,217    –          –         –            –     –     943 (5.1)     –      –      11,507 (63.1)

(a) Number (and percentage of the overall number of TEs) of TEs containing EST hits.
(b) Number (and percentage of the overall number of TEs) of TE-flanking sequences (within 100 bp downstream and/or upstream of a TE) containing EST hits.


Massive Variation in MITE Composition in Triticum and Aegilops Species

To evaluate the contribution of MITEs to genomic diversification among wheat species, we performed qPCR on genomic DNA from 40 accessions of Triticum spp. and Aegilops spp. These included 10 different species (37 diploid and three polyploid; Supplemental Table S3), of which 19 contain B genomes, 10 contain D genomes, eight contain A genomes, two contain AB genomes (tetraploids), and one contains an ABD genome (hexaploid; Supplemental Table S3). Of the 18 MITEs analyzed (Table I), only nine allowed efficient primer design for real-time PCR analysis (an example of quality control for qPCR experiments is shown in Supplemental Fig. S3). By visualizing the PCR products on 1.5% agarose gels, PCR amplification quality was further validated (Supplemental Fig. S4). It is important to note that two different pairs of primers were designed for some MITE families so as to ensure reproducibility of the qPCR results. The absolute copy number of each MITE in each genome was calculated based on an estimated copy number in T. aestivum (cv Chinese Spring wheat) derived from the 454 database (Table I; see “Materials and Methods”). Thus, the copy number of any genome is the ratio of its relative quantity to the relative quantity of T. aestivum, multiplied by the estimated copy number for cv Chinese Spring wheat. Additional validation of the relative quantification of MITEs in different wheat species was derived from our 454-pyrosequencing analysis of transposon display (TD) products (Yaakov and Kashkush, 2012). TD allows the amplification of multiple TE insertions using a TE-specific primer together with an adaptor primer. We performed 454-pyrosequencing of TD products of one MITE family called Minos in four wheat species: A. tauschii, A. sharonensis, T. monococcum, and T. durum (Yaakov and Kashkush, 2012). The results show that the relative quantities of copy numbers, as provided by both qPCR analysis (this study) and 454-pyrosequencing of TD products (Yaakov and Kashkush, 2012), in the four wheat species are very similar (Supplemental Fig. S5).
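The copy number calculation described in this paragraph (relative quantity of a genome divided by the relative quantity of T. aestivum, multiplied by the estimated copy number in cv Chinese Spring) can be written as a short Python sketch. The function name and the relative quantities below are hypothetical; only the Minos copy number of 1,132 is taken from Table I.

```python
def estimate_copy_number(rq_sample, rq_reference, reference_copies):
    """Scale a qPCR relative quantity to an absolute copy number.

    rq_sample        -- relative quantity measured in the accession of interest
    rq_reference     -- relative quantity measured in T. aestivum cv Chinese Spring
    reference_copies -- copy number estimated in silico for Chinese Spring
    """
    if rq_reference <= 0:
        raise ValueError("reference relative quantity must be positive")
    return rq_sample / rq_reference * reference_copies


MINOS_COPIES_CHINESE_SPRING = 1132  # Table I

# Hypothetical relative quantities: an accession with half the signal of the
# reference would be assigned half of the reference copy number.
copies = estimate_copy_number(0.5, 1.0, MINOS_COPIES_CHINESE_SPRING)
```

Because the reference copy number is itself a possible underestimate (see the retrieval criteria above), the resulting values are best read as relative rather than exact counts.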

The qPCR results further demonstrate the massive proliferation of some MITEs in the A genome (Fig. 2), as all accessions of T. urartu and species containing the A genome (i.e. T. urartu, T. monococcum, T. dicoccoides, T. durum, and T. aestivum) showed some base level of MITE copy number, with most showing high levels. Furthermore, two MITE families (Minos and Fortuna) were specifically amplified in this genome (i.e. Triticum spp.-specific amplification; Fig. 2, A and C). The other A genome species, T. monococcum aegilopoides (genome Am), showed similar copy numbers to T. urartu (genome Au) for four of the nine MITEs (Aison, Oleus, Icarus, and Polyphemus; Fig. 2, B, D, E, and G).

In considering Aegilops spp., only A. speltoides showed species-specific proliferation (for Aison; Fig. 2B). In addition, copy numbers in A. speltoides and A. searsii were clearly distinguishable from one another (Fig. 2, B, C, and E–G) in five of nine MITEs (Aison, Fortuna, Icarus, Phoebus, and Polyphemus), while A. sharonensis (accession TH02) and A. longissima (accession TL05) were similar to A. searsii and A. tauschii (and different from A. speltoides) in three MITEs (Aison, Icarus, and Polyphemus; Fig. 2, B, E, and G), yet they differed from A. searsii and A. tauschii in one MITE (Fortuna; Fig. 2C).

As expected, none of the analyzed MITEs had low copy numbers in the polyploid species. In addition, of the nine MITEs analyzed, only two (Minos and Fortuna) showed a shift from the expected additive values of the parental species (Fig. 2, A and C) in the polyploid species (reflected as an increase in the tetraploid level and a reduction in the hexaploid level in Fortuna and vice versa for Minos) that could not be explained by any combination of accessions of the parental species. Interestingly, these two elements are the only ones specific to Triticum spp. Note that the nonadditive values that were observed in these two cases, of the nine cases considered, were derived from the available wheat accessions analyzed in this study.
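One way to picture the additivity test is sketched below in Python. The text does not give the exact procedure, so this is an assumption: the expected tetraploid value is taken to be the sum of one A-genome donor accession and one B-genome donor accession, and a polyploid is called nonadditive when its observed copy number lies outside the range spanned by all such pairwise sums. Function names and numbers are hypothetical, and measurement error is ignored.

```python
from itertools import product


def additive_range(a_copies, b_copies):
    """Smallest and largest sum over every pairing of parental accessions."""
    sums = [a + b for a, b in product(a_copies, b_copies)]
    return min(sums), max(sums)


def is_nonadditive(observed, a_copies, b_copies):
    """True if no combination of parental accessions can reach the observed value."""
    low, high = additive_range(a_copies, b_copies)
    return observed < low or observed > high


# Hypothetical copy numbers for two A-genome and two B-genome donor accessions.
a_donors = [200, 300]
b_donors = [50, 100]
expected_low, expected_high = additive_range(a_donors, b_donors)

gain = is_nonadditive(600, a_donors, b_donors)      # above every parental sum
additive = is_nonadditive(320, a_donors, b_donors)  # within the parental range
```

The same comparison extends to the hexaploid by adding the D-genome donor as a third parental list.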

Figure 1. Target site preference of MITE insertions, as analyzed by WebLogo 3.0. Analysis was performed based on the sequence of target site duplications retrieved from wheat databases by MAK software. The name of each MITE family is indicated on top of each logo. [See online article for color version of this figure.]

The combination of MITE copy numbers from A genomes and B genomes, as compared with the tetraploid genomes, was best explained when T. urartu was combined with A. speltoides for two elements (Aison and Icarus) or with A. searsii for one element (Stolos). For the remaining MITEs, this difference could be explained by combining either both or neither of these species.

Figure 2. Copy numbers of MITE families Minos (A), Aison (B), Fortuna (C), Oleus (D), Icarus (E), Phoebus (F), Polyphemus (G), Stolos (H), and Eos (I) in various wheat accessions, based on qPCR and the 454-pyrosequencing database. The accession names and plant identifiers or U.S. Department of Agriculture inventory numbers (Supplemental Table S3), as well as respective species names and genome composition, are indicated. SD is indicated based on three replicates. [See online article for color version of this figure.]

Examination of the intraspecific coefficient of variation of MITE copy numbers in different accessions of each species revealed that A. speltoides presents the most variation in MITE copy number, specifically showing high and significant variation in three elements (Oleus, Eos, and Stolos). T. urartu accessions showed high and significant variation in two elements (Minos and Icarus). For example, the copy number of Minos in TMU38 was approximately 5-fold higher than in the other T. urartu accessions (Fig. 2A). This value was obtained in experiments repeated three times, using three replicates in each experiment. Similarly, the copy number of Aison in accession 6008 was approximately 8-fold more than in TS47 (Fig. 2B). Furthermore, significant variations were observed in A. searsii accessions for Aison (Fig. 2B), with some accessions including one or more copies (such as TE16 and TE44), while others included over 170 insertions (such as 599124 and 599149). In addition, the coefficient of variation was higher between species (interspecific variation) than within a species (intraspecific variation) for six of nine elements considered (Fortuna, Minos, Aison, Icarus, Phoebus, and Stolos).
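The coefficient of variation used in this comparison is the standard deviation divided by the mean. A minimal Python sketch follows; the use of the sample (n − 1) standard deviation is an assumption, since the text does not say which estimator was applied, and the copy numbers are invented for illustration.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (unitless)."""
    if len(values) < 2:
        raise ValueError("at least two values are needed")
    average = mean(values)
    if average == 0:
        raise ValueError("mean of zero: coefficient of variation is undefined")
    return stdev(values) / average


# Hypothetical copy numbers of one MITE family in three accessions of one species.
uniform_species = [100, 110, 120]
variable_species = [100, 120, 500]  # one accession carries about 5-fold more copies

cv_uniform = coefficient_of_variation(uniform_species)
cv_variable = coefficient_of_variation(variable_species)
```

Applying the same function to species means instead of accessions gives an interspecific value that can be compared with the intraspecific ones.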

Cytosine Methylation of MITEs in Hexaploid Wheat

To assess the involvement of epigenetic regulation in the activity of MITEs in natural allohexaploid wheat, we performed transposon methylation display (TMD) on 13 MITE families (Eos, Fortuna, Oleus, Minos, Thalos, Aison, Antonio, Hades, Jason, Phoebus, Polyphemus, Tantalos, and Xados). TMD allows for analysis of the methylation status of MITE-flanking CCGG sites in a genome-wide manner (Khasdan et al., 2010; Kraitshtein et al., 2010; Yaakov and Kashkush, 2011). Genomic DNA was restricted with either of two methylation-sensitive enzymes (HpaII or MspI), ligated to adaptors, and amplified with radiolabeled primers specific to the adaptor and transposon sequences. The resulting polyacrylamide gel band patterns were analyzed by comparing the ratio of amplicons that exist in only one restriction digest (e.g. bands generated with HpaII only indicate hemimethylation of the outer cytosine [i.e. CNG methylation], whereas bands generated with MspI only indicate methylation of the inner cytosine [i.e. CG methylation], in CCGG TE-flanking sites) to those found in both restriction digestions (monomorphic bands). An example of a TMD gel is presented in Supplemental Figure S6.

Using TMD, we analyzed between 60 and 115 CCGG sites flanking each of the 13 MITE elements (Table II). For each element, we calculated the number of unmethylated CCGG sites (monomorphic bands generated upon digestion with HpaII and MspI; Supplemental Fig. S6) and the number of methylated CCGG sites (polymorphic bands generated upon digestion with HpaII and MspI, where a MspI-unique band indicates CG methylation and a HpaII-unique band indicates CNG methylation; Table II; Supplemental Fig. S6). The TMD results showed that different levels of cytosine methylation are observed at CCGG sites flanking the different MITE elements in T. aestivum (Table II). Methylation levels ranged from 52.9% for Hades-flanking CCGG sites to 87.2% for Thalos-flanking CCGG sites. This indicates that different MITE elements exist in different methylation environments. It is important to mention that for most elements, CNG hemimethylation was predominant, except for Thalos-flanking CCGG sites, where CG methylation was predominant (Table II). Thalos, however, resides in relatively heavily methylated sites (87.2% methylated flanking CCGG sites). These data thus support our previous conclusion that Thalos might be the least active MITE in wheat (Yaakov and Kashkush, 2011), while Hades and Minos might be the most active (Yaakov and Kashkush, 2012).

DISCUSSION

The evolution of genomes, as reflected in both the diversification of related species and the differentiation of homeologous chromosomes in allopolyploids, is realized by various rapid (revolutionary) and slow (evolutionary) mechanisms, including the activation of TEs (Chantret et al., 2004; Kazazian, 2004; Feldman and Levy, 2005). A detailed mechanism describing the impact of transposons on genomic evolution, however, has yet to be presented. Furthermore, any mechanistic description of TE-mediated genomic evolution would necessarily have to take into account the epigenetic changes induced by transposition as well as the influence of such changes on chromosomal structure and gene expression (Slotkin and Martienssen, 2007). Thus, a genome-wide examination of genetic and epigenetic

Table II. Analysis of the methylation status of CCGG sites flanking 13 Stowaway-like MITEs in T. aestivum, as revealed by TMD

MITE Family    No. of Methylated CCGG Sites (a)    Monomorphic Bands (b)    Total (c)    Methylated (%)
               CNG          CG
Hades          29           16                     40                       85           52.9
Thalos         25           70                     14                       109          87.2
Xados          25           20                     19                       64           70.3
Minos          18           18                     24                       60           60
Aison          48           13                     13                       74           82.4
Eos            16           21                     27                       64           57.8
Oleus          64           18                     33                       115          71.3
Antonio        39           15                     14                       68           79.4
Fortuna        35           23                     34                       92           63
Tantalos       42           20                     18                       80           77.5
Polyphemus     45           30                     33                       108          69.4
Jason          60           16                     20                       96           79.2
Phoebus        49           32                     11                       92           88

(a) Bands present only in samples digested by HpaII or only in samples digested by MspI are considered as being methylated in CNG and CG contexts, respectively (see “Materials and Methods”).
(b) Monomorphic bands from the HpaII and MspI digestions indicate nonmethylated CCGG sites.
(c) The total number of bands indicates the number of analyzed CCGG sites (both methylated and nonmethylated). The level of methylated CCGG sites is also indicated. It is important to note that the number of CCGG sites flanking MITE insertions is not directly correlated with the number of MITE insertions, as some insertions have several CCGG sites that can be analyzed by TMD (see “Materials and Methods”).

492 Plant Physiol. Vol. 161, 2013

Yaakov et al.

Downloaded from www.plantphysiol.org on August 5, 2020. Copyright © 2013 American Society of Plant Biologists. All rights reserved.


variation of Stowaway-like MITEs between related species and their combined polyploid species might provide a mechanistic perspective on this important category of transposons.

In this study, we retrieved and analyzed the sequences of over 18,000 MITE insertions belonging to 18 Stowaway-like families in wheat. As expected for MITEs, based on genome-wide studies in rice (Jiang et al., 2004), all 18 families were short in length (ranging from 55 to 354 bp), presented high sequence conservation, and displayed a clear preference for TA dinucleotides as a target site (Table I). In addition, we found that wheat MITEs might exist in strong association with genes or transcribed regions. Indeed, the strong association of MITEs and wheat genes was reported previously, based on analysis of a subset of bacterial artificial chromosome (BAC) sequences (Sabot et al., 2005; Choulet et al., 2010). Furthermore, massive copy number variation was seen among the 18 MITE families, with values ranging from 14 copies up to 4,855 (Table I). In addition, genome-specific proliferation of MITEs may contribute to genomic diversification in diploid species and, possibly, to the differentiation of subgenomes in allopolyploid species, an event that might aid in their diploidization. Moreover, we noticed that the relatively short members of MITE families (namely, those less than 150 bp in length) had the highest copy numbers, while long elements (measuring over 200 bp in length) had the lowest copy numbers. This negative correlation was found to be statistically significant (Fig. 3). Finally, we also found that the methylation levels of CCGG sites surrounding each family differed substantially among MITE families (ranging from 52.9% to 88%), indicative of different levels of regulation among these elements.

Genome-Specific Proliferation of MITEs

TEs assume a central role in the formation and maintenance of structural elements of the genome, including telomeres and centromeres. TEs can affect the structure of the genome and the regulation of genes by inducing changes in DNA methylation and heterochromatin and by the production of small RNAs (Nakayashiki, 2011). The tendency of TEs to cause mutations, both genetic and epigenetic, has supposedly been coopted by the host genome to increase genetic variability, as TEs are known to be active during stress, in gametes, and in early development (Levin and Moran, 2011). An analogous mechanism may be acting on genomes undergoing “genomic stress,” such as new polyploids, or over large expanses of time, following reproductive isolation of a species. With this in mind, we calculated the relative quantities of nine MITE families using qPCR for 40 accessions of 10 species of wheat. We then translated the PCR data into absolute copy numbers based on the observed copy number of MITE families in hexaploid wheat, with each experiment being repeated at least three times using different primer pairs (SD is indicated in each figure). The results demonstrated specific proliferation of two MITE families (Minos and Fortuna; Fig. 2, A and C, respectively) in the A genome and one in the B genome of A. speltoides (Aison; Fig. 2B). Furthermore, differences revealed by qPCR between intraspecific and interspecific copy number variations of MITEs in the diploid wheat genomes suggest that MITEs play a role in the diversification of genomes during speciation. We specifically focused on MITE content in A. speltoides and A. searsii, the two best candidates for contributing the B genome to wheat. MITE content was clearly distinguishable between the two species (Fig. 2, B, C, and E–G) in five of nine MITEs (Aison, Fortuna, Icarus, Phoebus, and Polyphemus). We also found that A. sharonensis (accession TH02) and A. longissima (accession TL05) were similar to A. searsii and A. tauschii (and different from A. speltoides) in three MITEs (Aison, Icarus, and Polyphemus; Fig. 2, B, E, and G). These data, together with the finding that Aison (Fig. 2B) specifically proliferated in A. speltoides, support A. speltoides as being the choice candidate for donating the B genome to wheat. Our data, however, do show that the diploid donor of the B genome underwent massive genomic changes after the formation of the allotetraploid. In addition, we showed a nonadditive change in the polyploid species, as compared

Figure 3. Correlation between the copy number of each MITE family and average length, as calculated for elements retrieved from the 454-pyrosequencing database. The r² and P values are indicated. Error bars represent SD for MITE length. [See online article for color version of this figure.]


with their progenitors, in two MITEs that displayed specific proliferation in the A genome (Fortuna and Minos), suggesting that T. urartu is the true donor (Feldman and Levy, 2005). This result, in concert with known genetic and epigenetic changes that occur following polyploidization, including transcriptional (Kashkush et al., 2002, 2003) and transpositional (Kraitshtein et al., 2010; Yaakov and Kashkush, 2012) activation of transposons, implies that TEs respond to hybridization, resulting in the differentiation and diploidization of the subgenomes.

Correlation between Element Length and Copy Number

In a recent study of rice, we showed a possible connection between the copy numbers of TEs and the methylation levels of flanking CCGG sites, where a negative correlation was seen in different rice strains for a MITE family termed mPing (Baruch and Kashkush, 2012). The nature of the connection between the methylation of TE insertion sites and TE copy number could be explained by a difference in the genomic context of the initial insertion of an element. Whereas high-copy-number MITEs were inserted into euchromatic regions, where they are able to easily proliferate, low-copy-number MITEs were inserted into heterochromatic regions, where element transposition is hindered by the silenced chromatin. Here, we assessed the nature of this connection in detail for 13 MITE families in a natural hexaploid wheat ‘Chinese Spring’. The overall methylation level of all MITE sites was high, as reported previously (Yaakov and Kashkush, 2011), yet we found no correlation with copy numbers. We did, however, note a significant negative correlation between mean element lengths (as calculated for all elements retrieved from the 454 database; Supplemental Table S1) and their copy numbers (P = 0.0297, r² = 0.28; Fig. 3). This result suggests three possible reasons for the success of short-sequence MITEs: (1) short-sequence MITEs can evade the epigenetic silencing mechanisms imposed on larger elements; (2) short-sequence MITEs are less likely to be eliminated by recombinational mechanisms; and (3) short-sequence MITEs are more likely to transpose because their TIRs lie closer to one another.
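The kind of length versus copy number correlation discussed above can be illustrated with a short Python sketch. The statistics in this study were run in JMP; the code below is only a stand-in that computes Pearson's r and r² by hand. The function name `pearson_r` and the `lengths` and `copies` values are invented placeholders for illustration and are not taken from Table I or Figure 3.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical placeholder data (NOT from the paper):
# mean element length in bp and copy number for five imaginary families.
lengths = [80, 120, 160, 220, 300]
copies = [4000, 2500, 1500, 600, 100]

r = pearson_r(lengths, copies)
print(f"r = {r:.3f}, r^2 = {r * r:.3f}")
```

A negative r corresponds to shorter families having more copies. The sketch does not compute a P value; that would require a t test with n − 2 degrees of freedom from a statistics package.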

In summary, this study has demonstrated that wheat MITEs may have retained their activity throughout evolution. As such, MITEs might play a prominent role in the diversification of the wheat genome, specifically in the stabilization of nascent polyploid species in nature, and could provide new insight into the origin of the B genome.

MATERIALS AND METHODS

Plant Material and DNA Isolation

In this study, 40 accessions of Triticum spp. and Aegilops spp., including 10 different diploid, tetraploid, and hexaploid species, were used (Supplemental Table S3). This includes 34 accessions of four diploid species (Triticum urartu, Aegilops speltoides, Aegilops searsii, and Aegilops tauschii; for details, see Supplemental Table S3). DNA was isolated from young leaves (4 weeks post germination) using the DNeasy plant kit (Qiagen).

In Silico Analysis

MITEs and flanking sequences were retrieved from two sources. The first was the cv Chinese Spring 454-pyrosequencing database (5× coverage; kindly provided by members of the Chinese Spring Sequencing Consortium; http://www.cerealsdb.uk.net), where over 95% of the genome is represented by at least one read. This database was searched using the MITE analysis kit (MAK; Yang and Hall, 2003) below an E value of e-3, with an end mismatch tolerance of 20 nucleotides and a 100-nucleotide flanking size for retrieved members. The second was the publicly available Triticum spp. and Aegilops spp. BAC sequences at the NCBI, searched using the BLAST 2.0 package (http://www.ncbi.nlm.nih.gov/BLAST/). All analyses included the rice (Oryza sativa)-specific MITE, mPing, as a negative control (Jiang et al., 2003). MAK uses BLASTN to search input MITE sequences against a nucleotide database to retrieve high-scoring pairs, according to a defined E value and nucleotide mismatches at the ends of the sequence, as well as retrieving target site duplications and a defined number of nucleotides flanking the high-scoring pairs. Preparation and statistical analysis of the 454-pyrosequencing reads were achieved using Galaxy (Blankenberg et al., 2010; Goecks et al., 2010). For the calculation of average read lengths and MITE lengths in the 454 database, we used Compute Sequence Length, which calculates the lengths of nucleotide sequences in a FASTA file, and Summary Statistics, which calculates the summation, mean, SD, and various percentiles of a series of numbers (in this case, sequence lengths) in Galaxy. Levels of sequence conservation and target site preference in each MITE family were analyzed using MAFFT for multiple sequence alignment (Katoh et al., 2009) and the publicly available online WebLogo 3.0 package (Crooks et al., 2004). The WebLogo 3.0 software creates logos for each MITE family sequence and for target site preferences (for examples, see Fig. 1), where the height of symbols within the stack indicates the relative frequency of each nucleotide at that position, while the width of the stack is proportional to the fraction of valid nucleotides at that position, such that an abundance of short sequences yields thin stacks at the end. It is important to mention that because the 454-pyrosequencing database is not assembled, it includes many redundant sequences. In addition, redundant sequences can be produced as a result of the analysis of both the NCBI and 454-pyrosequencing databases. Redundant MITE-containing sequences were removed manually by comparing a subset of sequences with the database and manually calculating redundancy (the number of sequences with an E value equal to or lower than the query sequence against itself, divided by the total number of sequences analyzed). Copy number was then corrected using this factor.
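As a rough illustration of the length statistics described above, the following Python sketch computes per-sequence lengths from FASTA-formatted text and then summarizes them. It is not the Galaxy implementation; the function name `fasta_lengths` and the toy records are assumptions made for this example, and the SD shown is the sample SD, which may differ from the convention used by the Galaxy tool.

```python
import statistics

def fasta_lengths(lines):
    """Return the length of each sequence in FASTA-formatted lines."""
    lengths = []
    current = None  # length of the record being read, None before the first header
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A header closes the previous record and opens a new one.
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths

# Toy records, invented for illustration only.
toy_fasta = """>read1
ACGTACGTAC
GTAC
>read2
TTGACA
>read3
ACGTACGT
""".splitlines()

lengths = fasta_lengths(toy_fasta)
print("lengths:", lengths)
print("sum:", sum(lengths))
print("mean:", statistics.mean(lengths))
print("sample SD:", statistics.stdev(lengths))
```

Sequences that wrap over several lines are summed into a single length, which matters for unassembled reads stored in multi-line FASTA.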

Annotation of MITE sequences and their flanking sequences was performed against the EST and mRNA databases at PlantGDB (http://www.plantgdb.org/prj/ESTCluster/) and NCBI (http://www.ncbi.nlm.nih.gov/nucest/), respectively. The annotation was performed using BLAST+, stand-alone version 2.2.24. Redundant transcripts and hits below an E value of e-10 were removed from the analysis. The 5′ and 3′ MITE flanking sequences from the 454-pyrosequencing database, as well as the MITEs themselves, were used separately as query against the above-mentioned EST databases. Furthermore, publicly available BAC sequences that contain MITE sequences were analyzed for the association of MITEs with wheat genes (i.e. located in an intron, or within 1 kb downstream or upstream of a given gene). Statistical analysis of the correlation between the overall methylation status of MITEs or their average length and MITE copy number was performed with JMP version 5 (SAS Institute).
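The gene-association criterion above (an insertion inside an intron, or within 1 kb upstream or downstream of a gene) can be sketched as a small Python function. This is only an illustration under stated assumptions: the function name `classify_insertion`, the 1-based inclusive coordinates, the strand handling, the separate "exon" label, and the example gene model are all hypothetical and are not taken from the authors' pipeline.

```python
def classify_insertion(pos, gene_start, gene_end, exons, strand="+", window=1000):
    """Classify a MITE position relative to one gene.

    pos, gene_start, gene_end: 1-based coordinates on the BAC sequence.
    exons: list of (start, end) tuples, inclusive.
    Returns "exon", "intron", "upstream", "downstream", or None.
    """
    if gene_start <= pos <= gene_end:
        if any(start <= pos <= end for start, end in exons):
            return "exon"
        return "intron"
    # Flanking windows; upstream/downstream depend on the gene's strand.
    if gene_start - window <= pos < gene_start:
        return "upstream" if strand == "+" else "downstream"
    if gene_end < pos <= gene_end + window:
        return "downstream" if strand == "+" else "upstream"
    return None

# Hypothetical gene model used only to exercise the function.
gene = {"start": 5000, "end": 8000, "exons": [(5000, 5500), (7000, 8000)]}
for pos in (6000, 4500, 8500, 3000):
    label = classify_insertion(pos, gene["start"], gene["end"], gene["exons"])
    print(pos, label)
```

Under the criterion in the text, insertions labeled "intron", "upstream", or "downstream" would count as gene associated; the "exon" label is reported separately here only so that the sketch does not silently merge the two cases.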

Real-Time qPCR

Primers for previously annotated MITE consensus sequences were designed using Primer Express software, version 2.0 (Applied Biosystems; Supplemental Table S4). Each template for qPCR analysis was run in triplicate reactions, each consisting of 7.5 µL of KAPA SYBR FAST Universal 2× qPCR Master Mix (KAPA Biosystems), 5 µL of DNA template (0.24 ng µL⁻¹), 1 µL of forward primer (10 µM), 1 µL of reverse primer (10 µM), 0.3 µL of ROX low (serving as a passive reference dye), and 0.2 µL of Ultra Pure Water (Biological Industries). The thermal profile employed with the 7500 Fast Real-Time PCR system (Applied Biosystems) consisted of 20 s at 95°C, then 40 cycles of 3 s at 95°C and 30 s at 60°C. The relative quantity (RQ) of each MITE was measured in comparison with the VRN1 gene and with TQ27 as reference, as described previously (Kraitshtein et al., 2010), based on the


following equation:

$$\Delta\Delta C_t(\text{test sample}) = [C_t(\text{target}) - C_t(\mathit{VRN1})]_{\text{test sample}} - [C_t(\text{target}) - C_t(\mathit{VRN1})]_{\text{TQ27}},$$

that is,

$$\mathrm{RQ} = (2 \times \text{primer efficiency})^{-\Delta\Delta C_t},$$

where $C_t$ denotes the cycle at which the PCR amplification reaches a certain level of fluorescence (Livak and Schmittgen, 2001). The relative quantity for each sample was then normalized to its ploidy level, as tetraploids and hexaploids have twice and three times as many VRN1 genes, as compared with diploids, respectively. Reproducibility of the results was evaluated for each sample by running three technical replicates of each reaction. To distinguish specific from nonspecific PCR products, a melting curve was generated immediately after amplification consisting of a 15-s incubation at 95°C and a 1-min incubation at 60°C, after which time the temperature was increased by increments of 0.1°C s⁻¹ until 95°C was reached. A single specific product was detected using either the target or reference gene as template. The copy number of each accession or species was calculated by multiplying the ratio of its relative quantity to that of Triticum aestivum (accession TAA01) with the copy number (CN) of T. aestivum retrieved from the 454 database for each MITE:

$$(\mathrm{RQ}_{\text{sample}} / \mathrm{RQ}_{\text{TAA01}}) \times \mathrm{CN}_{T.\ aestivum}.$$

All primer efficiencies were derived from standard curves with an adequate slope (between −3.0 and −3.6) and r² > 0.98 (for an example, see Supplemental Fig. S3). Fold amplification at each cycle was calculated according to PCR efficiency, which was deduced by the software from the slope of the regression line (y) according to the following equation:

$$E = [10^{-1/y} - 1] \times 100.$$

For primers with 100% efficiency, fold amplification equals 2.
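The qPCR arithmetic above can be followed step by step in a minimal Python sketch. The function names, the hypothetical Ct values, and the copy number of 1000 are illustrative assumptions, not data from this study. The sketch also assumes that "primer efficiency" in the RQ formula is expressed as a fraction (1.0 for 100%), and it leaves out the ploidy normalization step, whose exact form is not spelled out here.

```python
def efficiency_percent(slope):
    """PCR efficiency (%) from the standard-curve slope: E = (10^(-1/slope) - 1) * 100."""
    return (10 ** (-1.0 / slope) - 1.0) * 100.0

def delta_delta_ct(ct_target_sample, ct_vrn1_sample, ct_target_tq27, ct_vrn1_tq27):
    """ddCt = [Ct(target) - Ct(VRN1)]_sample - [Ct(target) - Ct(VRN1)]_TQ27."""
    return (ct_target_sample - ct_vrn1_sample) - (ct_target_tq27 - ct_vrn1_tq27)

def relative_quantity(ddct, primer_efficiency=1.0):
    """RQ = (2 * primer efficiency)^(-ddCt); efficiency as a fraction (assumption)."""
    return (2.0 * primer_efficiency) ** (-ddct)

def copy_number(rq_sample, rq_taa01, cn_taestivum):
    """Copy number = (RQ_sample / RQ_TAA01) * CN of T. aestivum."""
    return rq_sample / rq_taa01 * cn_taestivum

# Hypothetical Ct values, for illustration only.
ddct = delta_delta_ct(20.0, 25.0, 22.0, 25.0)  # (20 - 25) - (22 - 25) = -2
rq = relative_quantity(ddct)                    # 2 ** 2 = 4
print("ddCt:", ddct, "RQ:", rq)

# Hypothetical RQ values and a hypothetical reference copy number of 1000.
print("copy number:", copy_number(rq_sample=0.5, rq_taa01=1.0, cn_taestivum=1000))

# A slope close to -3.32 corresponds to an efficiency close to 100%.
print("efficiency (%):", efficiency_percent(-3.32))
```

A sample whose target amplifies two cycles earlier than in the calibrator, relative to VRN1, therefore has four times the relative quantity when efficiency is 100%.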

Site-Specific PCR

PCR was prepared using 12 µL of Ultra Pure Water (Biological Industries), 2 µL of 10× Taq DNA polymerase buffer (EURX), 2 µL of 25 mM MgCl2 (EURX), 0.8 µL of 2.5 mM deoxyribonucleotide triphosphates, 0.2 µL of Taq DNA polymerase (5 units µL⁻¹; EURX), 1 µL of each qPCR primer (50 ng µL⁻¹), and 1 µL of template genomic DNA (approximately 50 ng µL⁻¹). The PCR conditions employed were 94°C for 3 min, followed by 30 cycles of 94°C for 1 min, 60°C for 1 min, and 72°C for 1 min, and a final step of 72°C for 3 min. PCR products (approximately 10 µL) were separated on 1.5% agarose gels and stained with ethidium bromide (Amresco), along with a DNA standard (100-bp ladder; Fermentas). Primer sequences are available upon request.

TMD

TMD reactions were performed according to a previously published protocol (Kashkush and Khasdan, 2007). Briefly, DNA was cleaved with two isoschizomers, HpaII and MspI, both able to recognize CCGG sites, with HpaII being sensitive to methylation of either cytosine (except when the external cytosine is hemimethylated [i.e. when methylation of only one DNA strand occurs]) and MspI being affected only when the external cytosine is methylated. Thus, the different types of CCGG site methylation resulted in different isoschizomer-generated cleavage patterns and the appearance of polymorphic PCR fragments. Gel-based and sequence analyses of the TMD products revealed that each TMD band contains a chimeric (TE/flanking DNA) sequence. Note that in some cases, TE internal sequences might also be amplified, thus enabling analysis of the methylation status of CCGG sites within that transposon.
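The band-scoring logic implied by these cleavage rules can be summarized in a short Python sketch. The function name `classify_ccgg_site`, the label strings, and the example band list are hypothetical and serve only to make the decision rule explicit; actual scoring was done on polyacrylamide gels.

```python
from collections import Counter

def classify_ccgg_site(band_in_hpaii, band_in_mspi):
    """Interpret one TMD band position from its presence in each digest."""
    if band_in_hpaii and band_in_mspi:
        return "unmethylated"   # monomorphic band: both enzymes cut
    if band_in_hpaii:
        return "CNG"            # HpaII only: outer cytosine hemimethylated
    if band_in_mspi:
        return "CG"             # MspI only: inner cytosine methylated
    return "undetected"         # neither enzyme cuts, so no band is produced

# Hypothetical scoring of five band positions as (HpaII lane, MspI lane).
scored = [(True, True), (True, False), (False, True), (True, False), (True, True)]
tally = Counter(classify_ccgg_site(h, m) for h, m in scored)
print(tally)
```

The "undetected" branch is included only for completeness: sites that neither enzyme cleaves yield no PCR product and therefore never appear among the scored bands.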

Primers were designed for 13 of the 18 MITEs (some MITEs did not allow efficient primer design, and some were excluded as they contained a terminal CCGG site). These primers (Supplemental Table S5) were used together with an adapter primer containing four additional selective nucleotides (TCAG) (Kashkush and Khasdan, 2007) to amplify fragments of DNA resulting from the HpaII and MspI digestions. Levels of methylation were calculated by dividing the number of polymorphic bands from the HpaII and MspI digestions (indicating methylated CCGG sites) by the total number of bands. Note that monomorphic bands in both HpaII and MspI digestions, indicative of nonmethylated CCGG sites, were scored only once. It is important to mention that the calculated methylation levels might be underestimated, as the TMD assay does not detect cases where both cytosines are methylated, since neither isoschizomer cleaves the site. As such, no PCR products are seen in such instances.
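The methylation level calculation described above amounts to a simple ratio, sketched here in Python. The band counts for the three families are copied from Table II; the function name `methylation_level` is an illustrative choice and not part of the published protocol.

```python
def methylation_level(cng_bands, cg_bands, monomorphic_bands):
    """Return the percentage of methylated CCGG sites.

    cng_bands: bands unique to the HpaII digest (CNG hemimethylation)
    cg_bands: bands unique to the MspI digest (CG methylation)
    monomorphic_bands: bands shared by both digests, scored once
    """
    methylated = cng_bands + cg_bands
    total = methylated + monomorphic_bands
    if total == 0:
        raise ValueError("no bands scored")
    return 100.0 * methylated / total

# Band counts (CNG, CG, monomorphic) copied from Table II.
table_ii = {
    "Hades": (29, 16, 40),
    "Thalos": (25, 70, 14),
    "Phoebus": (49, 32, 11),
}

for family, counts in table_ii.items():
    print(f"{family}: {methylation_level(*counts):.1f}%")
```

Because sites with both cytosines methylated produce no band, they are absent from both the numerator and the denominator, which is why the resulting percentages are best read as lower bounds.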

Supplemental Data

The following materials are available in the online version of this article.

Supplemental Figure S1. Average MITE lengths retrieved from the 454 database.

Supplemental Figure S2. List of MITE-containing wheat genes.

Supplemental Figure S3. Quality control for qPCR experiment efficiency.

Supplemental Figure S4. Quality control for qPCR amplification products.

Supplemental Figure S5. Comparison of relative quantities derived from 454-TD and qPCR.

Supplemental Figure S6. Examples of TMD patterns.

Supplemental Table S1. Number of TEs inserted adjacent to or into known genes.

Supplemental Table S2. List of MITE-flanking genes.

Supplemental Table S3. List of wheat species and accessions used in the study.

Supplemental Table S4. List of primers used for qPCR.

Supplemental Table S5. List of TE-specific primers used for TMD reactions.

ACKNOWLEDGMENTS

We thank Dr. Guojun Yang (University of Toronto) for providing the updated stand-alone MAK software, Moshe Feldman (Weizmann Institute) and Hakan Ozkan (University of Cukurova) for providing seed material, and Mike Bevan (John Innes Center), Neil Hall (Liverpool University), and Keith Edwards (Bristol University) for providing access to the 454 database and for their permission to publish the data.

Received July 26, 2012; accepted October 24, 2012; published October 26, 2012.

LITERATURE CITED

Baruch O, Kashkush K (2012) Analysis of copy-number variation, insertional polymorphism, and methylation status of the tiniest class I (TRIM) and class II (MITE) transposable element families in various rice strains. Plant Cell Rep 31: 885–893

Blankenberg D, Von Kuster G, Coraor N, Ananda G, Lazarus R, Mangan M, Nekrutenko A, Taylor J (2010) Galaxy: a Web-based genome analysis tool for experimentalists. Curr Protoc Mol Biol 19: 19.10.1–19.10.21

Chantret N, Cenci A, Sabot F, Anderson O, Dubcovsky J (2004) Sequencing of the Triticum monococcum hardness locus reveals good microcolinearity with rice. Mol Genet Genomics 271: 377–386

Choulet F, Wicker T, Rustenholz C, Paux E, Salse J, Leroy P, Schlub S, Le Paslier MC, Magdelenat G, Gonthier C, et al (2010) Megabase level sequencing reveals contrasted organization and evolution patterns of the wheat gene and transposable element spaces. Plant Cell 22: 1686–1701

Cloutier S, McCallum BD, Loutre C, Banks TW, Wicker T, Feuillet C, Keller B, Jordan MC (2007) Leaf rust resistance gene Lr1, isolated from bread wheat (Triticum aestivum L.) is a member of the large psr567 gene family. Plant Mol Biol 65: 93–106

Crooks GE, Hon G, Chandonia JM, Brenner SE (2004) WebLogo: a sequence logo generator. Genome Res 14: 1188–1190

Feldman M, Levy AA (2005) Allopolyploidy: a shaping force in the evolution of wheat genomes. Cytogenet Genome Res 109: 250–258

Goecks J, Nekrutenko A, Taylor J, Galaxy Team (2010) Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences. Genome Biol 11: R86

Haider N, Nabulsi I (2008) Identification of Aegilops L. species and Triticum aestivum L. based on chloroplast DNA. Genet Resour Crop Evol 55: 537–549

Huang SX, Sirikhachornkit A, Faris JD, Su XJ, Gill BS, Haselkorn R, Gornicki P (2002) Phylogenetic analysis of the acetyl-CoA carboxylase and 3-phosphoglycerate kinase loci in wheat and other grasses. Plant Mol Biol 48: 805–820

Isidore E, Scherrer B, Chalhoub B, Feuillet C, Keller B (2005) Ancient haplotypes resulting from extensive molecular rearrangements in the wheat A genome have been maintained in species of three different ploidy levels. Genome Res 15: 526–536

Janicki M, Rooke R, Yang GJ (2011) Bioinformatics and genomic analysis of transposable elements in eukaryotic genomes. Chromosome Res 19: 787–808

Jiang N, Bao ZR, Zhang XY, Hirochika H, Eddy SR, McCouch SR, Wessler SR (2003) An active DNA transposon family in rice. Nature 421: 163–167


Jiang N, Feschotte C, Zhang XY, Wessler SR (2004) Using rice to understand the origin and amplification of miniature inverted repeat transposable elements (MITEs). Curr Opin Plant Biol 7: 115–119

Kalendar R, Flavell AJ, Ellis TH, Sjakste T, Moisy C, Schulman AH (2011) Analysis of plant diversity with retrotransposon-based molecular markers. Heredity (Edinb) 106: 520–530

Kashkush K, Feldman M, Levy AA (2002) Gene loss, silencing and activation in a newly synthesized wheat allotetraploid. Genetics 160: 1651–1659

Kashkush K, Feldman M, Levy AA (2003) Transcriptional activation of retrotransposons alters the expression of adjacent genes in wheat. Nat Genet 33: 102–106

Kashkush K, Khasdan V (2007) Large-scale survey of cytosine methylation of retrotransposons and the impact of readout transcription from long terminal repeats on expression of adjacent rice genes. Genetics 177: 1975–1985

Katoh K, Asimenos G, Toh H (2009) Multiple alignment of DNA sequences with MAFFT. Methods Mol Biol 537: 39–64

Kazazian HH Jr (2004) Mobile elements: drivers of genome evolution. Science 303: 1626–1632

Khasdan V, Yaakov B, Kraitshtein Z, Kashkush K (2010) Developmental timing of DNA elimination following allopolyploidization in wheat. Genetics 185: 387–390

Kikuchi K, Terauchi K, Wada M, Hirano HY (2003) The plant MITE mPing is mobilized in anther culture. Nature 421: 167–170

Kraitshtein Z, Yaakov B, Khasdan V, Kashkush K (2010) Genetic and epigenetic dynamics of a retrotransposon after allopolyploidization of wheat. Genetics 186: 801–812

Kudryavtsev AM, Martynov SP, Broggio M, Buiatti M (2004) Evaluation of polymorphism at microsatellite loci of spring durum wheat (Triticum durum Desf.) varieties and the use of SSR-based analysis in phylogenetic studies. Russ J Genet 40: 1102–1110

Levin HL, Moran JV (2011) Dynamic interactions between transposable elements and their hosts. Nat Rev Genet 12: 615–627

Livak KJ, Schmittgen TD (2001) Analysis of relative gene expression data using real-time quantitative PCR and the 2(-Delta Delta C(T)) method. Methods 25: 402–408

Miller AK, Galiba G, Dubcovsky J (2006) A cluster of 11 CBF transcription factors is located at the frost tolerance locus Fr-Am2 in Triticum monococcum. Mol Genet Genomics 275: 193–203

Mori N, Liu YG, Tsunewaki K (1995) Wheat phylogeny determined by RFLP analysis of nuclear-DNA. 2. Wild tetraploid wheats. Theor Appl Genet 90: 129–134

Naito K, Cho E, Yang GJ, Campbell MA, Yano K, Okumoto Y, Tanisaka T, Wessler SR (2006) Dramatic amplification of a rice transposable element during recent domestication. Proc Natl Acad Sci USA 103: 17620–17625

Naito K, Zhang F, Tsukiyama T, Saito H, Hancock CN, Richardson AO, Okumoto Y, Tanisaka T, Wessler SR (2009) Unexpected consequences of a sudden and massive transposon amplification on rice gene expression. Nature 461: 1130–1134

Nakayashiki H (2011) The trickster in the genome: contribution and control of transposable elements. Genes Cells 16: 827–841

Nakazaki T, Okumoto Y, Horibata A, Yamahira S, Teraishi M, Nishida H, Inoue H, Tanisaka T (2003) Mobilization of a transposon in the rice genome. Nature 421: 170–172

Petersen G, Seberg O, Yde M, Berthelsen K (2006) Phylogenetic relationships of Triticum and Aegilops and evidence for the origin of the A, B, and D genomes of common wheat (Triticum aestivum). Mol Phylogenet Evol 39: 70–82

Queen RA, Gribbon BM, James C, Jack P, Flavell AJ (2004) Retrotransposon-based molecular markers for linkage and genetic diversity analysis in wheat. Mol Genet Genomics 271: 91–97

Sabot F, Guyot R, Wicker T, Chantret N, Laubin B, Chalhoub B, Leroy P, Sourdille P, Bernard M (2005) Updating of transposable element annotations from large wheat genomic sequences reveals diverse activities and gene associations. Mol Genet Genomics 274: 119–130

Sallares R, Brown TA (2004) Phylogenetic analysis of complete 5′ external transcribed spacers of the 18S ribosomal RNA genes of diploid Aegilops and related species (Triticeae, Poaceae). Genet Resour Crop Evol 51: 701–712

Sasanuma T, Miyashita NT, Tsunewaki K (1996) Wheat phylogeny determined by RFLP analysis of nuclear DNA. 3. Intra- and interspecific variations of five Aegilops sitopsis species. Theor Appl Genet 92: 928–934

Shan X, Liu Z, Dong Z, Wang Y, Chen Y, Lin X, Long L, Han F, Dong Y, Liu B (2005) Mobilization of the active MITE transposons mPing and Pong in rice by introgression from wild rice (Zizania latifolia Griseb.). Mol Biol Evol 22: 976–990

Slotkin RK, Martienssen R (2007) Transposable elements and the epigenetic regulation of the genome. Nat Rev Genet 8: 272–285

Wang C, Shi SH, Wang JB, Zhong Y (2000a) Phylogenetic relationships of diploid species in Aegilops inferred from the ITS sequences of nuclear ribosomal DNA. Acta Bot Sin 42: 507–511

Wang GZ, Matsuoka Y, Tsunewaki K (2000b) Evolutionary features of chondriome divergence in Triticum (wheat) and Aegilops shown by RFLP analysis of mitochondrial DNAs. Theor Appl Genet 100: 221–231

Wicker T, Sabot F, Hua-Van A, Bennetzen JL, Capy P, Chalhoub B, Flavell A, Leroy P, Morgante M, Panaud O, et al (2007) A unified classification system for eukaryotic transposable elements. Nat Rev Genet 8: 973–982

Wicker T, Stein N, Albar L, Feuillet C, Schlagenhauf E, Keller B (2001) Analysis of a contiguous 211 kb sequence in diploid wheat (Triticum monococcum L.) reveals multiple mechanisms of genome evolution. Plant J 26: 307–316

Yaakov B, Ceylan E, Domb K, Kashkush K (2012) Marker utility of miniature inverted-repeat transposable elements for wheat biodiversity and evolution. Theor Appl Genet 124: 1365–1373

Yaakov B, Kashkush K (2011) Massive alterations of the methylation patterns around DNA transposons in the first four generations of a newly formed wheat allohexaploid. Genome 54: 42–49

Yaakov B, Kashkush K (2012) Mobilization of Stowaway-like MITEs in newly formed allohexaploid wheat species. Plant Mol Biol 80: 419–427

Yang GJ, Hall TC (2003) MAK, a computational tool kit for automated MITE analysis. Nucleic Acids Res 31: 3659–3665
