y l o voluti journal of phylogenetics & h ona o l f a n oi ... · showed the strongest evidence...

10
Speciation Genomics of Protein-Coding Genes Common to Mycoplasmatales Dipaloke Mukherjee 1* and Walter J Diehl 2 1 Department of Food Science, Nutrition and Health Promotion, Mississippi State University, MS 39762, USA 2 Department of Biological Sciences, Mississippi State University, MS 39762, USA * Corresponding author: Dipaloke Mukherjee, Department of Food Science, Nutrition and Health Promotion, Mississippi State University, MS 39762, USA, Tel: +1-662-341-2848; Fax: +1-662-325-8728; E-mail: [email protected] Received date: Dec 22, 2016; Accepted date: Jan 09, 2017; Published date: Jan 13, 2017 Copyright: © 2017 Mukherjee D, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Abstract Identifying regions of a genome that evolve by natural selection, particularly as species diverge, has been a matter of considerable interest. The genomes of 12 species in the eubacterial order Mycoplasmatales were compared to test the hypothesis that natural selection targets genes by function and/or at given moments in the phylogenetic history of the species. These species possess some of the smallest genomes known, and analyses on the set of protein-coding genes common to all species in the study will shed light on the evolution of some of the most critical genes to living organisms. Genes that control cellular processes showed greater evidence of natural selection than genes of unknown function or genes associated with information processing and storage or metabolism. Moreover evidence of natural selection was only detected in the deepest branches of the Mycoplasmatales phylogeny, including one node where a host shift from land plants to insects likely occurred and another node where a host shift from land plants/insects to land vertebrates likely occurred. Many of the genes that showed the strongest evidence of natural selection (e.g. secA, secY, ftsH, ftsY, yidC, lepA, dnaK) encode proteins that are components of the Sec-dependent secretory pathway, which regulates the extracellular translocation of proteins. The Sec-dependent secretory pathway is proposed to play a role in speciation of Mycoplasmatales by altering the type and amount of secreted proteins, thereby affecting virulence of Mycoplasma sp. in response to infection of novel hosts. Keywords: Minimal genome; Mycoplasma; Speciation; Natural selection Introduction e proliferation of sequenced genomes has permitted the evaluation of the role of natural selection at that level of organization. Studies have shown that some species, such as Drosophila melanogaster, show positive Darwinian selection in relatively large number of genes [1,2], whereas other species, such as Arabidopsis thaliana show very low levels of positive selection compared to purifying or negative selection [3,4]. By comparison Homo sapiens show intermediate levels of positive selection [5,6]. Such studies have in common an evaluation of selection within a single lineage but generally do not address selection that may be occurring as new species arise, that is at splits in their respective phylogenetic trees. A few studies have targeted selection as species diverge [7,8], but most of them are restricted to evaluating SNPs scattered throughout the genome [9,10] and not whole sequences of genes. Using this approach, one may infer whether selection is common or rare during speciation but not necessarily whether selection is associated with particular functional groups of genes, which in turn may inform hypotheses on the genetics of speciation. e genus Mycoplasma (Order Mycoplasmatales, Domain Eubacteria) is a polyphyletic group [11] that comprises single cell, gram-positive-like, obligate parasites that may cause respiratory, urogenital and other diseases in vertebrates including humans [12,13]. ey lack cell walls due to their inability to synthesize peptidoglycan, and they are considered to be the simplest form of self-replicating biological systems but are entirely dependent on the host cells for essential nutrients. Moreover they possess extremely small genomes (range: 524-1053 genes) [12], which make them ideal for studying natural selection at the genome level. As such, the set of protein-coding genes common to all Mycoplasmatales species should comprise a set of genes that is among the most functionally critical to living organisms and that has potential for the greatest consequence if selection acts commonly and consistently thereon. e objectives of this study were to test the hypotheses (1) that natural selection has acted on the same sets of genes at different times in the Mycoplasmatales phylogeny, (2) that mutation saturation has not affected the pattern of selection acting on the genes in the Order Mycoplasmatales, and (3) that the likelihood of natural selection targeting a particular set of genes has depended on the function of the gene products. is approach has the potential to identify genes (or combinations of genes, gene complexes, or pathways) that may be involved directly or indirectly in speciation. Materials and Methods Phylogenetic tree A Bayesian phylogenetic tree (Figure 1a) was constructed from the 16S ribosomal DNA (rDNA) sequences from 12 Mycoplasma species (Table 1), whose genomes had been sequenced and sufficiently annotated as of April 15, 2008. ese sequences were obtained from the National Center for Biotechnology Information (NCBI) database Journal of Phylogenetics & Evolutionary Biology Mukherjee and Diehl, J Phylogenetics Evol Biol 2017, 5:1 DOI: 10.4172/2329-9002.1000175 Research Article OMICS International J Phylogenetics Evol Biol, an open access journal ISSN: 2329-9002 Volume 5 • Issue 1 • 1000175 J o u r n a l o f P h y l o g e n et i c s & E v o l u ti o n a r y B i o l o g y ISSN: 2329-9002

Upload: phamdung

Post on 30-Sep-2018

213 views

Category:

Documents


0 download

TRANSCRIPT

Speciation Genomics of Protein-Coding Genes Common toMycoplasmatalesDipaloke Mukherjee1* and Walter J Diehl2

1Department of Food Science, Nutrition and Health Promotion, Mississippi State University, MS 39762, USA2Department of Biological Sciences, Mississippi State University, MS 39762, USA*Corresponding author: Dipaloke Mukherjee, Department of Food Science, Nutrition and Health Promotion, Mississippi State University, MS 39762, USA, Tel:+1-662-341-2848; Fax: +1-662-325-8728; E-mail: [email protected]

Received date: Dec 22, 2016; Accepted date: Jan 09, 2017; Published date: Jan 13, 2017

Copyright: © 2017 Mukherjee D, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permitsunrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Abstract

Identifying regions of a genome that evolve by natural selection, particularly as species diverge, has been amatter of considerable interest. The genomes of 12 species in the eubacterial order Mycoplasmatales werecompared to test the hypothesis that natural selection targets genes by function and/or at given moments in thephylogenetic history of the species. These species possess some of the smallest genomes known, and analyses onthe set of protein-coding genes common to all species in the study will shed light on the evolution of some of themost critical genes to living organisms. Genes that control cellular processes showed greater evidence of naturalselection than genes of unknown function or genes associated with information processing and storage ormetabolism. Moreover evidence of natural selection was only detected in the deepest branches of theMycoplasmatales phylogeny, including one node where a host shift from land plants to insects likely occurred andanother node where a host shift from land plants/insects to land vertebrates likely occurred. Many of the genes thatshowed the strongest evidence of natural selection (e.g. secA, secY, ftsH, ftsY, yidC, lepA, dnaK) encode proteinsthat are components of the Sec-dependent secretory pathway, which regulates the extracellular translocation ofproteins. The Sec-dependent secretory pathway is proposed to play a role in speciation of Mycoplasmatales byaltering the type and amount of secreted proteins, thereby affecting virulence of Mycoplasma sp. in response toinfection of novel hosts.

Keywords: Minimal genome; Mycoplasma; Speciation; Naturalselection

IntroductionThe proliferation of sequenced genomes has permitted the

evaluation of the role of natural selection at that level of organization.Studies have shown that some species, such as Drosophilamelanogaster, show positive Darwinian selection in relatively largenumber of genes [1,2], whereas other species, such as Arabidopsisthaliana show very low levels of positive selection compared topurifying or negative selection [3,4]. By comparison Homo sapiensshow intermediate levels of positive selection [5,6]. Such studies havein common an evaluation of selection within a single lineage butgenerally do not address selection that may be occurring as newspecies arise, that is at splits in their respective phylogenetic trees. Afew studies have targeted selection as species diverge [7,8], but most ofthem are restricted to evaluating SNPs scattered throughout thegenome [9,10] and not whole sequences of genes. Using this approach,one may infer whether selection is common or rare during speciationbut not necessarily whether selection is associated with particularfunctional groups of genes, which in turn may inform hypotheses onthe genetics of speciation.

The genus Mycoplasma (Order Mycoplasmatales, DomainEubacteria) is a polyphyletic group [11] that comprises single cell,gram-positive-like, obligate parasites that may cause respiratory,urogenital and other diseases in vertebrates including humans [12,13].They lack cell walls due to their inability to synthesize peptidoglycan,

and they are considered to be the simplest form of self-replicatingbiological systems but are entirely dependent on the host cells foressential nutrients. Moreover they possess extremely small genomes(range: 524-1053 genes) [12], which make them ideal for studyingnatural selection at the genome level. As such, the set of protein-codinggenes common to all Mycoplasmatales species should comprise a set ofgenes that is among the most functionally critical to living organismsand that has potential for the greatest consequence if selection actscommonly and consistently thereon.

The objectives of this study were to test the hypotheses (1) thatnatural selection has acted on the same sets of genes at different timesin the Mycoplasmatales phylogeny, (2) that mutation saturation hasnot affected the pattern of selection acting on the genes in the OrderMycoplasmatales, and (3) that the likelihood of natural selectiontargeting a particular set of genes has depended on the function of thegene products. This approach has the potential to identify genes (orcombinations of genes, gene complexes, or pathways) that may beinvolved directly or indirectly in speciation.

Materials and Methods

Phylogenetic treeA Bayesian phylogenetic tree (Figure 1a) was constructed from the

16S ribosomal DNA (rDNA) sequences from 12 Mycoplasma species(Table 1), whose genomes had been sequenced and sufficientlyannotated as of April 15, 2008. These sequences were obtained fromthe National Center for Biotechnology Information (NCBI) database

Journal of Phylogenetics &Evolutionary Biology

Mukherjee and Diehl, J Phylogenetics Evol Biol 2017, 5:1

DOI: 10.4172/2329-9002.1000175

Research Article OMICS International

J Phylogenetics Evol Biol, an open access journalISSN: 2329-9002

Volume 5 • Issue 1 • 1000175

Jour

nal o

f Phy

logenetics & Evolutionary Biology

ISSN: 2329-9002

(http://www.ncbi.nlm.nih.gov/) [14]. The 16S rDNA sequences fromLactobacillus acidophilus and Escherichia coli were used as the outgroups. The sequences were aligned with ClustalX [15,16], andMrBayes version 3.1 [17] was used to construct the tree. A GeneralTime Reversible (GTR) model with Gamma-distributed rates (GTR+γ)was determined to be the best fit model for the run by Modeltestversion 3.7 [18] and a likelihood ratio test was used for modelselection. A total of 100,000 generations of Markov Chain Monte Carlo(MCMC) simulations were run with the first 2500 generations ignoredas burnins, and the consensus tree was selected for further analyses.This tree was found to converge after the run, indicated by a finalstandard deviation of split frequency of 0.006444. Four nodes (A, B, Cand D, Figure 1a) were chosen for selection analyses, since these nodesunited two clades each with multiple species/variants. A more detailedphylogenetic tree showing the divergence of the Mycoplasma/Ureaplasma group from the Spiroplasma and Mesoplasma/Entomoplasma groups have been presented in the Figure 1b, with thedivergence dates of the latter two groups indicated (extrapolated fromManiloff [11]).

OrganismAccessionNumber References

M. agalactiae PG2CU179680

Sirand-Pugnet et al.[24]

M. capricolum subsp capricolumATCC 27343 CP000123

Craig Venter Institute[25]

M. gallisepticum str R (low) AE015450 Papazist et al. [26]

M. genitalium G37 L43967 Fraser et al. [27]

M. hyopneumoniae 232 AE017332 Minion et al. [28]

M. hyopneumoniae 7448 AE017244 Vasconcelos et al. [29]

M. hyopneumoniae J AE017243 Vasconcelos et al. [29]

M. mobile 163K AE017308 Jaffe et al. [30]

M. penetrans HF-2 BA000026 Sasaki et al. [31]

M. mycoides subsp mycoides PG1 BX293980 Westberg et al. [32]

M. pneumonia M129 U00089 Himmelreich et al. [33]

M. pulmonis UAB CTIP AL445566 Chambaud et al. [34]

M. synoviae 53 AE017245 Vasconcelos et al. [29]

U. parvum serovar 3 str. ATCC700970 AF222894 Glass et al. [35]

Table 1: List of the complete genomes used in the study, obtained fromthe National Center for Biotechnology information (NCBI) database(http://www.ncbi.nlm.nih.gov/).

Genomic databaseA database was constructed that contained the 221 protein coding

gene sequences common to all of the 12 species of Mycoplasma used inthe study (Table 1). Sequences and functions were obtained fromNational Center for Biotechnology Information (NCBI) (http://www.ncbi.nlm.nih.gov/) database. The Kyoto Encyclopedia of Genes(KEGG) and Genomics (http://www.genome.jp/kegg/) [36] database wasused to detect orthologs for a particular gene. Clusters of Orthologus

Groups (COG) [37] were used to segregate these genes into functionalgroups, namely information storage and processing genes, cellularprocesses genes and metabolism genes plus a group of poorlycharacterized genes (Table 2).

Figure 1a: Bayesian phylogeny from 16S ribosomal DNA (rDNA)sequences for 12 Mycoplasmatales species, using Lactobacillusacidophilus (L. acidophilus) and Escherichia coli (E. coli) asoutgroups. The clade credibility values for each branch are given.Approximate divergence dates (extrapolated from Maniloff [11]),are indicated for each node. Nodes A, B, C, & D were chosen fortests of natural selection. The specific hosts for each species areindicated in parentheses [12,19-23].

Figure 1b: Mollicutes phylogenetic tree showing the relationshipsamong Spiroplasma, Mesoplasma/Entomoplasma and Mycoplasma/Ureaplasma groups [extrapolated from Maniloff [11] andincorporating phylogeny from the current study].

Citation: Mukherjee D, Diehl WJ (2017) Speciation Genomics of Protein-Coding Genes Common to Mycoplasmatales . J Phylogenetics Evol Biol5: 175. doi:10.4172/2329-9002.1000175

Page 2 of 10

J Phylogenetics Evol Biol, an open access journalISSN: 2329-9002

Volume 5 • Issue 1 • 1000175

Functional Category Number of Genes

Information Processing and Storage Genes 123

Cellular Processes Genes 26

Metabolism Genes 51

Poorly Categorized Genes 21

Table 2: Number of genes belonging to the different functionalcategories.

Sequence alignmentThe software program package Data Analysis in Molecular Biology

and Evolution (DAMBE) [38] was used to align the sequences. First,the nucleotide sequences were translated to amino acid sequences.Next, multiple sequence alignments were conducted using ClustalW[16,39] with gap open and extension penalties of 10 and 0.1,respectively and using BLOSUM as the amino acid substitution matrix.Finally, the original nucleotide sequences were realigned against thealigned amino acid sequences. This method is a codon-based approachof sequence alignment, which can prove to be useful in accounting forgaps in the middle of the sequences.

Data analysesTwo separate statistical procedures were followed for the analysis of

selection patterns–a codon substitution models test [40] and theMcDonald-Kreitman test [41]. The ratio of non-synonymous (dN) tosynonymous (dS) nucleotide substitution rates (dN/dS, designated bythe letter ω) [42] forms the basis of both tests. A ratio of 1 indicatesneutrality. In other words, the rate of fixation of a neutral amino acidmutation will be equivalent to that of a synonymous substitution. Aratio less than 1 indicates purifying selection and the substitution iseventually eliminated from the population. If the ratio is greater than1, then positive Darwinian selection will retain the amino acidmutation in the population [43]. Codon substitution models tests canbe used to compare dN/dS ratios among branches on a phylogenetictree, where a significant difference implies selection. The McDonald-Kreitman test effectively partitions dN/dS ratios into fixed differencesbetween clades vs. polymorphism across clades, where a significantdifference implies selection by either adaptive fixation orpolymorphism excess. Because each test uses dN/dS ratios differently,the results are effectively independent.

Codon substitution models testsThe Codeml program of the software program package Phylogenetic

Analysis using Maximum Parsimony (PAML) version 4.0 [40] wasused to conduct these analyses. First, each gene in the original datasetwas analyzed by the free and one ratio models test [40]. The free ratiomodel assumes different dN/dS ratios for the different branches of thephylogenetic tree, whereas the one ratio model assumes a single dN/dSvalue for the entire tree. This test effectively determines whetherselection at a gene may be occurring somewhere in the phylogeny butit cannot determine where. Log-likelihood values for each of these twomodels were computed and twice the log-likelihood differences foreach of the genes were compared to a χ2 distribution with α=0.000057(the same α value that was used for the McDonald-Kreitman test afterapplying a Bonferroni correction-described later) and 24 degrees offreedom (25 branches analyzed under the free ratio model plus one

under the one ratio model minus two) [44]. Result from the Fisher’sExact test (P=0.0035) [45,46] showed that the ratio of the genesshowing evidence of selection to the ones showing neutrality amongfunctional groups was significantly different.

To determine whether the dN/dS ratios of the different clades in thephylogeney were different from each other as well as from thebackground, two branch models tests (three vs. two branch ratiomodels test) [40] were used. The three branch ratio model assumesdifferent dN/dS values for the two clades as well as the background.The two branch ratio model, on the other hand, assumes an equaldN/dS ratio for the two clades that are being compared, while thedN/dS value for the rest of the tree (the ‘background’) is assumed to befree. These two tests were applied only to the 153 genes that gaveevidence of selection in response to the application of free and oneratio models tests. Four clades of the phylogenetic tree with sufficientspecies/variants to show sequence variation were analyzed this way.Log-likelihood values for these models were computed, and twice thelog-likelihood differences were compared to a χ2 distribution withα=0.05 with 3 degrees of freedom (three clades tested under the threebranch ratio model plus two under the two branch ratio model minus2). A three way contingency table (4 functional categories × 4 clades ×2 possible selection results, namely selection or neutrality, Table 3) wasconstructed to summarize the results from the analyses using thevarious codon substitution models. Log-linear analysis (α=0.05) [47]was conducted on the data to determine whether selection patterndepended on gene function and/or on nodes of the phylogenetic tree(Table 4).

McDonald-Kreitman testsThe software program package DNA Sequence Polymorphism

(DnaSP) version 4.20.2 [48] was used to compute the neutrality indices[49] for each gene for each pair of clades tested in the phylogeny. Two-tailed Fisher’s exact tests [45] were conducted to assess thesignificances of the computed NIs, with an α value of 0.000057 (afterapplying a Bonferroni correction [44], α=[0.05/Number of genesstudied, 221 × Number of clades analyzed 4]). These tests wereconducted using the software program package DnaSP [48]. Finally, athree way contingency table (4 functional categories × 4 clades × 2McDonald-Kreitman test results, namely adaptive fixation orneutrality, since none of the genes exhibited polymorphism excess,Table 3) was constructed to summarize the outcomes from theMcDonald-Kreitman tests. The results were tested by log-linearanalysis (α=0.05) [47] to determine whether selection pattern variesdepending on functional categories of the gene and/or clades of thephylogenetic tree (Table 4). For both the codon substitution modelsand the McDonald-Kreitman tests, the VassarStats Web Site forStatistical Computation (http://dogsbody.psych.mun.ca/VassarStats/abc.html; [50]) was used for conducting the log-linear analyses.

Detection of mutation saturationMutation saturation [51,52] can be detected by analyzing the

frequency of the complex codons in a gene. Complex codons are agroup of highly variable codons for which the pattern of non-synonymous and synonymous substitutions for fixed differences andpolymorphisms among species sets cannot be inferred, as defined byRozas et al. [48], and that subsequently cannot be analyzed by thepackage DnaSP. We assumed that the greater the degree to which agene shows mutation saturation, the greater the number of complexcodons that gene would possess. Thus mutation saturation can be

Citation: Mukherjee D, Diehl WJ (2017) Speciation Genomics of Protein-Coding Genes Common to Mycoplasmatales . J Phylogenetics Evol Biol5: 175. doi:10.4172/2329-9002.1000175

Page 3 of 10

J Phylogenetics Evol Biol, an open access journalISSN: 2329-9002

Volume 5 • Issue 1 • 1000175

analyzed by plotting number of complex codons as a function of thetotal number of codons in a gene for the all genes in the genome.Genes at each node were analyzed separately. Complete mutationsaturation would be indicated if all codons were complex. Completeabsence of mutation saturation would be indicated if no codons in agene were complex.

Pathway analysisThe network of the protein-protein interactions of the cellular

processes gene products from M. pneumonia M129 [33], based ontheir involvement in interconnected biochemical pathways, 173 wasgenerated in the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database [53] of known and predicted protein-protein interactions. The software program Cytoscape version 2.8.2[54] was used to visualize the network.

ResultsAn initial analysis of natural selection among all species in the study

(free vs. one ratio codon substitution models test) [40] indicated that

153 of 221 genes showed preliminary evidence of natural selection(α=0.000057) after applying a Bonferroni correction [44] somewherein the Mycoplasmatales phylogeny. There was a significant differencein the ratio of genes showing selection to genes showing neutralityamong functional groups (Fisher exact test, P=0.0035 [45,46]) with92% of cellular processes genes showing evidence of selectioncompared to 66% of genes showing selection for other groupscombined (Table 3A). The significant results here justified subsequentnode by node analyses. Genes that showed neutrality in this initialevaluation were assumed to be neutral in the all subsequent tests.

Nodes A-D (Figure 1a) was chosen for subsequent analyses becausethe entire respective sister clades nested within them containedmultiple species or variants with sequenced genomes, hence thepotential for genetic variation at every gene. Of the 153 genes showingevidence of selection in the initial test, 135 genes showed evidence ofselection in subsequent three vs. two ratio codon substitution modelstests [40] conducted node by node (Table 3B), and 90 genes showedevidence of adaptive fixation (similar to divergent selection) inMcDonald-Kreitman tests [41] conducted node by node (Table 3C). Inthe latter tests, no genes showed a significant excess of polymorhisms.

A Codon Substitution Models Tests (free- vs. one-ratio model)

NodeInformation Processing andSotrage Genes

Cellular ProcessesGenes Metabolism Genes Poorly Characterized

Genes Total

S N S N S N S N S N

All 74 49 24 2 39 12 16 5 153 68

B Codon Substitution Models Tests (three- vs. two-ratio model)

NodeInformation Processing andStorage Genes

Cellular ProcessesGenes Metabolism Genes Poorly Characterized

Genes Total

S N S N S N S N S N

A 60 63 20 6 31 20 10 11 121 100

B 26 97 5 21 9 42 6 15 46 175

C 0 123 0 26 2 49 0 21 2 219

D 19 104 0 26 4 47 3 18 26 195

Total 105 387 25 79 46 158 19 65 195 689

C McDonald-Kreitman Tests

NodeInformation Processing andSotrage Genes

Cellular ProcessesGenes Metabolism Genes Poorly Characterized

Genes Total

S N S N S N S N S N

A 30 93 11 15 13 38 7 14 61 160

B 40 83 13 13 10 41 4 17 67 154

C 12 111 7 19 2 49 4 17 25 196

D 7 116 0 26 7 44 0 21 14 207

Total 89 403 31 73 32 172 15 69 167 717

Table 3: Contingency tables of genes showing natural selection (S) or neutrality (N) partitioned among functional groups and nodes in theMycoplasmatales phylogeny. All genes showing selection were significant in respective tests at α=0.000057 after applying Bonferroni correction

Citation: Mukherjee D, Diehl WJ (2017) Speciation Genomics of Protein-Coding Genes Common to Mycoplasmatales . J Phylogenetics Evol Biol5: 175. doi:10.4172/2329-9002.1000175

Page 4 of 10

J Phylogenetics Evol Biol, an open access journalISSN: 2329-9002

Volume 5 • Issue 1 • 1000175

[44]. Cells with ratios of genes showing selection to genes showing neutrality that are arbitrarily 2.5× greater than the respective total S/N ratioare enclosed in boxes; cells with S/N ratios that are 5× greater than the total S/N ratio are enclosed in double boxes. (A) Distribution of genesamong functional groups for the entire phylogeny from the free vs. one ratio model of codon substitutional models tests [40]. (B) Distribution ofgenes among functional groups and nodes from the three vs. two ratio model of codon substitutional models tests [40]. (C) Distribution of genesamong functional groups and nodes from McDonald-Kreitman tests [41].

Only 62 genes showed evidence of selection by both codonsubstitution models tests and McDonald-Kreitman tests. Whenselection state was partitioned among functional groups andphylogenetic nodes, log-linear analyses [47] showed a significant three-way interaction (P<0.0001) regardless of the selection test used (Table4), indicating that the pattern of selection depended on both genefunction and phylogeny. Moreover, partial interactions of selectionstate and functional group (P<0.017) and selection state andphylogenetic node (P<0.0001) were also significant (Table 4). Cellularprocesses genes showed greater evidence of selection (24-30%) thangenes in all other functional categories combined (17-22%). Nodes Aand B had a greater proportion of genes showing evidence of selection(29-38%) than nodes C and D (6-9%). Only 45 genes showed evidenceof selection by both tests at either node A or node B, and 29% of thesegenes were in the cellular processes category despite accounting foronly 12% of genes overall. Regardless of selection test, the greatestproportion of genes showing evidence of selection occurred in thecellular processes category at either node A or B (Table 3).

Three- vs. Two-ratio Codon Substitution Models Test

Source P-values

Selection State × Gene Function × Node (Phylogeny) <0.0001***

Selection State × Gene Function (Effect of Node removed) <0.0177*

Selection State × Node (Effect of Gene Function removed) <0.0001***

B. McDonald-Kreitman Test

Selection State × Gene Functional × Node (Phylogeny) <0.0001***

Selection State × Gene Function (Effect of Node removed) <0.0012**

Selection State × Node (Effect of Gene Function removed) <0.0001***

Table 4: Log-linear analysis [47] of association among selection state,gene function, and node in the Mycoplasmatales phylogeny from 3-way contingency tables (Tables 3B and 3C) for (A) codon substitutionmodels tests [40] and (B) McDonald-Kreitman tests [41].

Figures 2a and 2b show the relationships between the total numberof codons in a gene and the number of complex codons, indicatingmutation saturation. As one would expect, the more codons that agene possesses, the greater the number of complex codons that occuras well (Table 5). It is evident from the Figures 2a and 2b that in thecourse of evolution, the genomes of the Mycoplasmatales species haveacquired some degree of mutation saturation, especially at the deepernodes A and B, although in all cases the level is less than required toshow complete saturation. But because nodes A and B showed thegreatest evidence of selection, there is possibility that this pattern couldhave been caused by mutation saturation. If that were the case, onewould expect that the majority of genes showing selection would occurabove or below the regression line, which did not occur at any node.That is at all nodes, the genes showing natural selection showed nomore tendencies toward mutation saturation than genes showing

neutrality (Figure 2a). Further, if evolutionary rate varies with genefunction, then mutation saturation could theoretically, albeit notnecessarily, lead to an apparent excess of natural selection in functionalgroups with greater evolutionary rates. If this were the case, the slopesof the relationship between the number of complex codons and thetotal number of codons in a gene would be greater for functionalgroups showing excess natural selection. Not only is this not the case atany node, but also the slope of the aforementioned relationship forcellular processes genes is less than that for all functional groupscombined at each node (Table 5).

Node Functional Category Slope r2 p-Value

A

Information Processing and storage 0.5428 0.8605 <0.000*

Cellular Processes 0.4654 0.6523 <0.000*

Metabolism 0.2876 0.4422 <0.000*

Poorly Characterized 0.4507 0.7776 <0.000*

All Functional Categories combined 0.4968 0.7836 <0.000*

B

Information Processing and storage 0.4781 0.8417 <0.000*

Cellular Processes 0.3793 0.6027 <0.000*

Metabolism 0.2699 0.4549 <0.000*

Poorly Characterized 0.3876 0.7802 <0.000*

All Functional Categories combined 0.4357 0.767 <0.000*

C

Information Processing and storage 0.1489 0.7273 <0.000*

Cellular Processes 0.0826 0.3055 <0.000*

Metabolism 0.0785 0.2361 0.0003*

Poorly Characterized 0.0643 0.3732 0.0033*

All Functional Categories combined 0.1281 0.5912 <0.000*

D

Information Processing and storage 0.1675 0.7319 <0.000*

Cellular Processes 0.1552 0.4914 <0.000*

Metabolism 0.1048 0.3265 <0.000*

Poorly Characterized 0.1422 0.7046 <0.000*

All Functional Categories combined 0.158 0.6517 <0.000*

Table 5: The r2 and p-values for the relationship between the totalnumber of codons and the number of complex codons in a gene for allgenes at each node. The p-values that were significant at α=0.0125[after applying Bonferroni correction (0.05/4, the number of nodestested)] are indicated with an asterisk (*).

Citation: Mukherjee D, Diehl WJ (2017) Speciation Genomics of Protein-Coding Genes Common to Mycoplasmatales . J Phylogenetics Evol Biol5: 175. doi:10.4172/2329-9002.1000175

Page 5 of 10

J Phylogenetics Evol Biol, an open access journalISSN: 2329-9002

Volume 5 • Issue 1 • 1000175

Figure 2a: Relationship between the total number of codons and number of complex codons [48] (solid lines from linear regressions) for genesshowing selection under the McDonald and Kreitman tests [41] (closed circles) and genes showing neutrality (open circles) at nodes A (a), B(b), C (c), and D (d) in the phylogeny of the Mycoplasmatales. Complete saturation would be indicated by the dashed lines.

Citation: Mukherjee D, Diehl WJ (2017) Speciation Genomics of Protein-Coding Genes Common to Mycoplasmatales . J Phylogenetics Evol Biol5: 175. doi:10.4172/2329-9002.1000175

Page 6 of 10

J Phylogenetics Evol Biol, an open access journalISSN: 2329-9002

Volume 5 • Issue 1 • 1000175

Using M. pneumoniae M129 [33] as a reference species, pathwayanalysis based on the protein-protein interaction patterns of the totalset of proteins encoded by cellular process genes revealed some

evidence of clustering within two subdivisions, namely within post-translational modification, protein turnover and chaperone proteins,and within inorganic ion transport and metabolism proteins (Figure3). Proteins with the greatest number of interactions (secA, ffh, secY,

Figure 2b: Relationship between the total number of codons and number of complex codons [48] (solid lines from linear regressions) for genesshowing selection under the codon substitution models tests, Yang [40] (closed circles) and genes showing neutrality (open circles) at nodes A(a), B (b), C (c), and D (d) in the phylogeny of the Mycoplasmatales. Complete saturation would be indicated by the dashed lines.

Citation: Mukherjee D, Diehl WJ (2017) Speciation Genomics of Protein-Coding Genes Common to Mycoplasmatales . J Phylogenetics Evol Biol5: 175. doi:10.4172/2329-9002.1000175

Page 7 of 10

J Phylogenetics Evol Biol, an open access journalISSN: 2329-9002

Volume 5 • Issue 1 • 1000175

Sec-dependent secretory pathway [55]. Moreover, a significantlygreater proportion of genes encoding these proteins (7 of 8) showednatural selection at both nodes A and B compared to that for genesencoding proteins outside the Sec-dependent secretory pathway (6 of18; Fisher Exact Test, P<0.05).

dnaK and lepA) are distributed among three functional subdivisionsbut are all components or putative components of the Mycoplasmatales

Figure 3: Interaction patterns among the proteins based oninvolvement in interconnected biochemical pathways encoded bycellular processes genes from M. pneumoniae M129 [33]. Proteinsshowing evidence of natural selection at both Nodes A and B areindicated with yellow inner circles; otherwise inner circles aregreen. Proteins in the Sec-dependent secretory pathway arebounded by the red polygon. Proteins showing 7 or moreinteractions are indicated with asterisks. Post translationalmodification, protein turnover and chaperone proteins (blue outerring): molecular chaperon DnaK (dnaK), ATP dependent heatshock protease Lon (lon), cell division protein FtsH (ftsH), heatshock protein GrpE (grpE), SSRA-binding protein (smpB), triggerfactor (tig), o-sialoglycoprotein endopeptidase (gcp), molecularchaperone/heat shock protein DnaJ (dnaJ), thioredoxin reductase(trxB), glycoprotease family protein (MPN_291). Intracellulartrafficking and secretion proteins (red outer ring): cell divisionprotein FtsY (ftsY), preprotein translocase subunit SecA (secA),preprotein translocase subunit SecY (secY), inner membraneprotein translocase YidC (oxaA), signal recognition particle protein/GTPase (ffh). Cell wall/membrane biogenesis proteins (purple outerring): prolipoprotein diacylglyceryl transferase (lgt), GTP-bindingprotein LepA (lepA), glucose-inhibited division protein B/methyltransferase GidB (rsmG), signal peptidase II (lspA), s-adenosyl-methyltransferase (mraW). Inorganic ion transport andmetabolism proteins (pink outer ring): cobalt transporter ATPbinding subunit (cbi01), cobalt transporter ATP binding subunit(cbi02), cobalt ABC transporter permease protein (MPN_195). Cellcycle control, mitosis and meiosis proteins (green outer ring):chromosomal segregation protein SMC (p115), tRNA-uracil-5-carboxymethylaminomethyl modification enzyme (mnmG).Defense mechanisms protein (brown outer ring): ABC transporterATP-binding and permease protein (MPN_571).

DiscussionThe first objective of this study was to test the hypotheses that

natural selection has acted on the same sets of genes across differentbranches in the Mycoplasmatales phylogeny, with greater number ofgenes exhibiting selection at the deeper nodes of the phylogeny. Theexcess of cellular process genes showing evidence of selection at bothnodes A and B indicates the possibility of an evolutionary process thatwas influenced by selection acting on a common set of genes asspecies, represented by contemporary clades, diverged early inMycoplasmatales evolution. Failure to detect sufficient evidence ofnatural selection by both the statistical tests in more recent nodes Cand D may indicate (1) that selection was acting on a suite of genesthat is unique to a particular clade or species and that therefore wasnot evaluated here, (2) that selection was acting on regulatory ratherthan structural regions of the genome (because regulatory sequencesdo not encode proteins), the statistical methods used cannot assess thepattern of natural selection acting on them, (3) that evolutionarydivergence was dominated by neutral processes, an unlikely scenariogiven evidence from nodes A & B, or (4) that there has not beenenough time for non-synonymous substitutions to accumulatesufficiently to reveal evidence of natural selection, another unlikelysince divergence at nodes C–D was preceded by a period of rapidgenomic change in Mycoplasma groups about 190 Myr ago [11].Regardless, in spite of the inability to detect selection in recent nodes,the study proposes a common pattern of selection acting at moreprimitive nodes.

The genes showing evidence of natural selection were detectedmostly in the deepest nodes of the Mycoplasmatales phylogeny. It canbe speculated by evaluation of this region of the Mycoplasmatalesphylogeny (Figure 1a) that early species divergence coincided with theorigin of insects (396-407 Myr BP) [56] during the early Devonian(node A) and with the origin of land vertebrates (Amphibians) (368Myr BP) [57] during the late Devonian (node B). This is consistentwith the fact that the M. mycoides/capricolum clade comprises theEntoplasmatales group within which M. mycoides and M. capricolumare derived species [11] that have likely infected vertebrate hostsindependently of that by other Mycoplasmatales and are thustaxonomically grouped in the genus Mycoplasma by convergence. TheEntoplasmatales includes the basal group Spiroplasma [11], whosehosts are plants and insects [12]. The data suggest that early in theevolution of Mycoplasmatales natural selection promoted speciation inresponse to novel environments associated with host shifts as plants,insects, and vertebrates’ successively colonized land and were infectedby species of Mycoplasma. Speciation under such circumstances is notunexpected. For example, it has been reported that especially repeat-rich parts of the genome of different lineages of the Irish potato faminepathogen Phytophthora infestans evolve through host jumps [58].

The second objective was to test the hypotheses that mutationsaturation [51,52] has not affected by causing a false-positive pattern ofselection acting on the genes in the order Mycoplasmatales. Mutationsaturation is a neutral genetic phenomenon that can potentially cause asignificant but spurious pattern of selection. The process occurs when aparticular base mutates to a different one, then mutates back to theoriginal state-for example, an A mutating to a T, then back to A, orwhen multiple mutations occur at a given site (an A mutating to G,then to T). In such a case, it is difficult to determine if a particular baseunderwent two successive mutations in its evolutionary history ornone, confounding one’s ability to assess the number of non-synonymous and synonymous changes. If mutation saturation is more

Citation: Mukherjee D, Diehl WJ (2017) Speciation Genomics of Protein-Coding Genes Common to Mycoplasmatales . J Phylogenetics Evol Biol5: 175. doi:10.4172/2329-9002.1000175

Page 8 of 10

J Phylogenetics Evol Biol, an open access journalISSN: 2329-9002

Volume 5 • Issue 1 • 1000175

likely to occur or persist in one type of change than the other, thenselection may be inappropriately implicated. The members of the orderMycoplasmatales have high rates of mutation [12], which is one of theprimary driving forces behind their evolution, and thus mutationsaturation is likely. Indeed our analyses showed that the Mycoplasmagenomes have acquired some mutation saturation, especially at thedeeper nodes. However at each node, the distribution of genes showingselection vs. genes showing neutrality are essentially the same withregard to mutation saturation, indicating that mutation saturation isnot responsible for the patterns of selection observed. Further, it is notat all likely that mutation saturation could produce a pattern ofselection that varies with gene function as is seen in this study, sinceexcess mutation saturation, where it existed, tended to occur infunctional groups not showing high levels of natural selection.

The final objective was to test the hypotheses that the likelihood ofnatural selection targeting a particular set of genes in the orderMycoplasmatales has depended on the function of the gene products,which has finally influenced speciation. In this study, natural selectionhas been shown to target cellular processes genes in general and Sec-dependent secretory pathway genes in particular. The Sec-dependentsecretory pathway is ubiquitous to living organisms [59] and in theMycoplasmatales functions in extracellular protein transport [59],including the export of proteins affecting virulence–a scenario that washypothesized to play a role in the evolution of the Phytoplasmas, theplant pathogenic group within the class Mollicutes [60]. It has beenindicated that genes for protein export should be part of the predictedminimal genome in bacteria [61]. We hypothesize that naturalselection, acting on genes encoding proteins of the Sec-dependentsecretion pathway, has caused speciation by altering the type andamount of secreted proteins, thereby affecting virulence of theMycoplasmatales in response to infection of novel hosts.

The current approach to identifying genes that may play a role in aprevious speciation event is very conservative as the final set of genesthat has passed through 6 filters (3 natural selection tests, 2 selection-by-function interaction tests, and 1 association by specific commonfunction analysis). It provides a mechanism for identifying thesignature of an original selection event that may have been buried insubsequent accumulated genetic variation and sweeps that fixeddifferences in many other genes.

References1. Sawyer SA, Kulathinal RJ, Bustamante CD, Hartl DL (2003) Bayesian

analysis suggests that most amino acid replacements in Drosophila aredriven by positive selection. J Mol Evol 57: S154-S164.

2. Sella G, Petrov DA, Przeworski M, Andolfatto P (2009) Pervasive naturalselection in the Drosophila genome? PLoS Genet 5: e1000495.

3. Bustamante CD, Nielsen R, Sawyer SA, Olsen KM, Purugganan M, et al.(2002) The cost of inbreeding in Arabidopsis. Nature 416: 531-534.

4. Wright SI, Gaut BS (2005) Molecular population genetics and the searchfor adaptive evolution in plants. Mol Biol Evol 22: 506-519.

5. Bustamante CD, Fledel-Alon A, Williamson S, Nielsen R, Hubisz MT, etal. (2005) Natural selection on protein-coding genes in the humangenome. Nature 437: 1153-1157.

6. Coop G, Pickrell JK, Novembre J, Kudaravalli S, Li J, et al. (2009) The roleof geography in human adaptation. PloS Genet 5: e1000500.

7. Ahmed M, Liang P (2013) Study of modern human evolution viacomparative analysis with the Neanderthal genome. Genomics Inform 11:230-238.

8. Crisci JL, Wong A, Good JM, Jensen JD (2011) On characterizingadaptive events unique to modern humans. Genome Biol Evol 3: 791-798.

9. Akey JM, Zhang G, Zhang K, Jin L, Shriver MD (2002) Interrogating ahigh-density SNP map For signatures of natural selection. Genome Res12: 1805-1814.

10. Neafsey DE, Schaffner SF, Volkman SK, Park D, Montgomery P, et al.(2008) Genome-wide SNP genotyping highlights the role of naturalselection in Plasmodium falciparum population divergence. Genome Biol9: R17.

11. Maniloff J (2002) Phylogeny and evolution. In Molecular biology andpathogenicity of mycoplasmas. In: Razin S and Herrmann R (eds.)Kluwer Academic/Plenum Publishers, NY, USA, pp: 31-44.

12. Razin S, Yogev D, Naot Y (1998) Molecular biology and pathogenicity ofMycoplasmas. Microbiol Mol Biol Rev 62: 1094-1156.

13. Belloy L, Janovsky M, Vilei EM, Pilo P, Giacometti M, et al. (2003)Molecular epidemiology of Mycoplasma conjunctivae in caprinae:transmission across species in natural outbreaks. Appl Environ Microbiol69: 1913-1919.

14. http://www.ncbi.nlm.nih.gov/15. Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG (1997)

The CLUSTAL_X windows interface: flexible strategies for multiplesequence alignment aided by quality analysis tools. Nucleic Acids Res 25:4876-4882.

16. Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, et al.(2007) Clustal W and Clustal X version 2.0. Bioinformatics 23:2947-2948.

17. Huelsenbeck JP, Ronquist F (2001) MRBAYES: Bayesian inference ofphylogenetic tree. Bioinformatics 17: 754-755.

18. Posada D, Crandall KA (1998) MODELTEST: testing the model of DNAsubstitution. Bioinformatics 14: 817-818.

19. Al-Momani W, Nicholas RA, Janakat S, Abu-Basha E, Ayling RD (2006)The in vitro effect of six antimicrobials against Mycoplasma putrefaciens,Mycoplasma mycoides subsp. mycoides LC and Mycoplasma capricolumsubsp. capricolum isolated from sheep and goats in Jordan. Trop AnimHealth Prod 38: 1-7.

20. Bergonier D, Berthelot X, Poumarat F (1997) Contagious agalactia ofsmall ruminants: current knowledge concerning epidemiology, diagnosisand control. Rev Sci Tech 16: 843-873.

21. Noormohammadi AH (2007) Role of phenotypic diversity inpathogenesis of avian mycoplasmosis. Avian Pathol 36: 439-444.

22. Opriessnig T, Thacker EL, Yu S, Fenaux M, Meng XL, et al. (2004)Experimental reproduction of postweaning multisystemic wastingsyndrome in pigs by dual infection with Mycoplasma hyopneumoniaeand porcine circovirus type 2. Vet Pathol 41: 624-640.

23. Stadtländer C, Kirchhoff H (1990) Surface parasitism of the fishmycoplasma Mycoplasma mobile 163 K on tracheal epithelial cells. VetMicrobiol 21: 339-343.

24. Sirand-Prugner P, Lartigue C, Merenda M, Jocob D, Barre A, et al. (2007)Being pathogenic, plastic, and sexual while living with a nearly minimalbacterial genome. Plos Genet 3: e75.

25. https://www.crunchbase.com/organization/j-craig-venter-institute#/entity

26. Papazist L, Gorton TS, Kulish G, Markham PF, Browning GF et al. (2003)The complete genome sequence of the avian pathogen Mycoplasmagallisepticum strain R (low). Microbiol 149: 2307-2316.

27. Fraser CM, Gocayne JD, White O, Adams MD, Clayton RA, et al. (1995)The minimal gene complement of Mycoplasma genitalium. Science 270:397-403.

28. Minion FC, Lefkowitz EJ, Madsen ML, Cleary BJ, Swartzell SM, et al.(2004) The genome sequence of Mycoplasma hyopneumoniae strain 232,the agent of swine mycoplasmosis. J Bacteriol 186: 7123-7133.

29. Vasconcelos AT, Ferreira HB, Bizarro CV, Carvalho MO, Pinto PM et al.(2005) Swine and poultry pathogens: the complete genome sequences oftwo strains of Mycoplasma hyopneumoniae and a strain of Mycoplasmasynoviae. J Bacteriol 187: 5568-5577.

Citation: Mukherjee D, Diehl WJ (2017) Speciation Genomics of Protein-Coding Genes Common to Mycoplasmatales . J Phylogenetics Evol Biol5: 175. doi:10.4172/2329-9002.1000175

Page 9 of 10

J Phylogenetics Evol Biol, an open access journalISSN: 2329-9002

Volume 5 • Issue 1 • 1000175

30. Jaffe JD, Stange-Thomann N, Smith C, DeCaprio D, Fisher S, et al. (2005)The complete genome and proteome of Mycoplasma mobile. Genome Res14: 1447-1461.

31. Sasaki Y, Ishikawa J, Yamashita A, Oshima K, Kenri T, et al. (2002) Thecomplete genomic sequence of Mycoplasma penetrans, an intracellularbacterial pathogen in humans. Nucleic Acids Res 30: 5293-5300.

32. Westberg J, Persson A, Holmberg A, Goesmann A, Lundeberg J, et al.(2004) The Genome Sequence of Mycoplasma mycoides subsp. mycoidesSC type strain PG1T, the causative agent of contagious bovinepleuropneumonia (CBPP). Genome Res 14: 221-227.

33. Himmelreich R, Hilbert H, Plagens H, Pirkl E, Li BC, et al. (1996)Complete sequence analysis of the genome of the bacterium Mycoplasmapneumonia. Nucleic Acids Res 24: 4420-4449.

34. Chambaud I, Hellig R, Ferns S, Samson D, Galisson F, et al. (2001) Thecomplete genome sequence of the murine respiratory pathogenMycoplasma pulmonis. Nucleic Acids Res 29: 2145–2153.

35. Glass JI, Lefkowitz EJ, Glass JS, Heiner CR, Chen EY, et al. (2000) Thecomplete sequence of the mucosal pathogen Ureaplasma urealyticum.Nature 407: 757-762.

36. www.genome.jp/kegg/37. Tatusov RL, Galperin LY, Natale DA, Koonin EV (2000) The COG

database: a tool for a tool for genome scale analysis of protein functionsand evolution. Nucleic Acids Res 28: 33-36.

38. Xia X, Xie Z (2001) DAMBE: Software package for data analysis inmolecular biology and evolution. J Hered 4: 371-373.

39. Thompson JD, Higgins DG, Gibson TJ (1994) CLUSTAL W: improvingthe sensitivity of progressive multiple sequence alignment throughsequence weighting, position-specific gap penalties and weight matrixchoice. Nucleic Acids Res 22: 4673-4680.

40. Yang Z (1997) PAML: a program package for phylogenetic analysis bymaximum likelihood. Comput Appl Biosci 3: 555-556.

41. McDonald JH, Kreitman M (1991) Adaptive evolution at the Adh Locusin Drosophila. Nature 351: 652-654.

42. Miyata T, Yasunaga T (1980) Molecular evolution of mRNA: a method forestimating evolutionary rates of synonymous and amino acidsubstitutions from homologous nucleotide sequences and its application.J Mol Evol 16: 23-26.

43. Nei M, Gojobori T (1986) Simple methods for estimating the numbers ofsynonymous and nonsynonymous nucleotide substitution. Mol Biol Evol3: 418-426.

44. Cabin RJ, Mitchell RJ (2000) To Bonferroni or not to Bonferroni: whenand how are the questions. ESA Bull 81: 246-248.

45. Fisher RA (1922) On the interpretation of χ2 from contingency tables andthe calculation of p. J R Stat Soc 85: 87-94.

46. Statistical Applicatory System (SAS) software version 9.1.3, SAS Systemfor Microsoft® Windows® Copyright © 2009, SAS Institute Inc., Cary, NC,USA.

47. King RJ, Plosser CI, Rebelo ST (1988) Production, growth and businesscycles: 1. The basic neoclassical model. J Monet Econ 31: 195-232.

48. Rozas J, Sanchez-DelBarrio JC, Messeguer X, Rozas R (2003) DnaSP:DNA polymorphism analysis by the coalescent and other methods.Bioinformatics 19: 2496-2497.

49. Rand DA, Kann LM (1996) Excess amino acid polymorphism amongmitochondrial genes from Drosophila, mice and humans. Mol Biol Evol13: 735-748.

50. Lowry R (2010) VassarStats Website for Statistical Computation.51. Henn BM, Gignoux CR, Feldman MW, Mountain JL (2009)

Characterizing the time dependency of human mitochondrial DNAmutation Rate estimates. Mol Biol Evol 26: 217-230.

52. Ho SYW, Phillips, MJ, Cooper A, Drummond AJ (2005) Timedependency of molecular rate estimates and systematic overestimation ofrecent divergence times. Mol Biol Evol 22: 1561-1568.

53. Von Mering C, Jensen LJ, Snel B, Hooper SD, Krupp M, et al. (2005)STRING: known and predicted protein-protein associations, integratedand transferred across organisms. Nucleic Acids Res 33: D433-D437.

54. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, et al. (2003)Cytoscape: a software environment for integrated models of biomolecularinteraction networks. Genome Res 13: 2498-2504.

55. Stephenson K (2005) Sec-dependent protein translocation acrossbiological membranes: evolutionary conservation of an essential proteintransport pathway. Mol Membr Biol 22: 17-28.

56. Engel MS, Grimaldi DA (2004) New light shed on the oldest insect.Nature 427: 627-630.

57. Niedźwiedzki G, Szrek P, Narkiewicz K, Narkiewicz M, Ahlberg PE(2010) Tetrapod trackways from the early Middle Devonian period ofPoland. Nature 463: 43-48.

58. Raffaele S, Farrer RA, Cano LM, Studholme DJ, MacLean D, et al. (2010)Genome evolution followed host jumps in the Irish potato faminepathogen lineage. Science 330: 1540-1540

59. Staats CC, Boldo J, Broetto L, Vainstein M, Schrank A (2007)Comparative genome analysis of proteases, oligopeptide uptake andsecretion systems in Mycoplasma spp. Genet Mol Biol 30: 225-229.

60. Bai X, Zhang J, Ewing A, Miller SA, Radek AJ, et al. (2006) Living withgenome instability: the adaptation of phytoplasmas to diverseenvironments of their insect and plant hosts. J Bacteriol 188: 3682-3696.

61. Mushegian AR, Koonin EV (1996) A minimal gene set for cellular lifederived by comparison of complete bacterial genomes. Proc Natl Acad Sci93: 10268-10273.

Citation: Mukherjee D, Diehl WJ (2017) Speciation Genomics of Protein-Coding Genes Common to Mycoplasmatales . J Phylogenetics Evol Biol5: 175. doi:10.4172/2329-9002.1000175

Page 10 of 10

J Phylogenetics Evol Biol, an open access journalISSN: 2329-9002

Volume 5 • Issue 1 • 1000175