For publication in Journal of Virology

Toward Genetic-Based Taxonomy: Comparative
Analysis of a Genetic-Based Classification and the
Taxonomy of Picornaviruses

Chris Lauber1 and Alexander E. Gorbalenya*,1,2

1 Molecular Virology Laboratory, Department of Medical Microbiology, Leiden
University Medical Center, 2300 RC Leiden, The Netherlands
2 Faculty of Bioengineering and Bioinformatics, M.V. Lomonosov Moscow
State University, 119899 Moscow, Russia

* Corresponding author: Dr. Alexander E. Gorbalenya, Department of Medical
Microbiology, Leiden University Medical Center, Albinusdreef 2, P.O. Box 9600, E4-P,
2300 RC Leiden, The Netherlands, Phone: +31-71-526-1652, Fax: +31-71-526-6761,
E-mail: [email protected]

Key words: evolution, genomes, picornaviruses, phylogeny, species, virus discovery,
taxonomy

Running title: Toward Genetic-Based Taxonomy of a Virus Family
Abstract: 250 words
Text: 5545 words

Copyright © 2012, American Society for Microbiology. All Rights Reserved.
J. Virol. doi:10.1128/JVI.07174-11
JVI Accepts, published online ahead of print on 25 January 2012
Downloaded from http://jvi.asm.org/ on July 14, 2018 by guest

Abstract
Virus taxonomy has received little attention from the research community despite its
broad relevance. In Lauber & Gorbalenya (2012) JVI 86(X): xxxx-xxxx we have
introduced a quantitative approach to hierarchically classify viruses of a family using
pair-wise evolutionary distances (PEDs) as a measure of genetic divergence. When
applied to the six most conserved proteins of the Picornaviridae it clustered 1234
genome sequences in groups at three hierarchical levels (the GENETIC
classification). In this study we compare the GENETIC classification with the
expert-based picornavirus taxonomy and outline differences in the underlying frameworks
regarding the relation of virus groups and genetic diversity that represent,
respectively, the structure and content of a classification. To facilitate the analysis we
introduce two novel diagrams. The first connects the genetic diversity of taxa to both
the PED distribution and the phylogeny of picornaviruses. The second depicts a
classification and the accommodated genetic diversity in a standardized manner.
Generally, we found striking agreement between the two classifications on species
and genus taxa. Few disagreements concern the species Human rhinovirus A and
Human rhinovirus C and the genus Aphthovirus, which were split in the GENETIC
classification. Furthermore, we propose a new super-genus level and universal,
level-specific PED thresholds, not reached yet by many taxa. Since the species threshold is
approached mostly by taxa with large sampling sizes and those infecting multiple
hosts, it may represent an upper limit on divergence beyond which homologous
recombination in the six most conserved genes between two picornaviruses might not
give viable progeny.

Introduction

Research in virology relies on virus taxonomy for providing a unified intellectual and
practical framework for analysis, generalization and knowledge dissemination.
Despite its broad relevance, taxonomy has received relatively little attention from the
research community. Virus taxonomy is developed under the direction of the
International Committee on Taxonomy of Viruses (ICTV) and recognizes five hierarchically
arranged ranks: order, family, subfamily, genus and species (in the ascending order of
inter-virus similarity), with order and subfamily levels being used less commonly.
Virus species are of principal importance (60) and for their demarcation the so-called
polythetic species concept (3,74) is applied. Accordingly, viruses are recognized as
a single species if they share a broad range of characteristics while constituting a
replicating lineage that occupies a particular ecological niche (36,75). These
characteristics, so-called demarcation criteria, are devised for each genus separately
and are revised periodically (16,35). To ensure that each virus is classified, they are
allowed to vary greatly between and even within families, with no single unifying
property being sought after (for review see (76)). Consequently, virus species are
operational units that are delimited at the genus level. They can be contrasted to
biological species that are commonly defined by shared gene pools and reproductive
isolation. The lack of a mandatory common denominator of virus species casts
uncertainty over the interpretation and generalization of results obtained across
different genera.

We are interested in exploring the wealth of genomic information for improving the
foundation of virus taxonomy. For this purpose we use the family Picornaviridae as a
case study. Picornaviruses form one of the largest and most actively studied virus
families with many human and societally important pathogens whose number is
steadily growing (15,64). They employ a single-stranded RNA genome of positive
sense (ssRNA+) with lengths in the range of 6500 to 9000 nucleotides of which about
90% encode a single polyprotein that is co- and post-translationally cleaved into
eleven to thirteen mature proteins (50). In total six proteins, three of the capsid
module (1B, 1C and 1D, known also as VP2, VP3 and VP1), and three of the
replicase module (2C, 3C and 3D) are conserved family-wide to form the backbone of
the genetic plan (20). Other proteins may be specific for different subsets of
picornaviruses. Particularly, proteins known as L and 2A come in a large variety of
molecular forms (20,40) most of which were implicated in functions that secure virus
propagation in the host (1). The open reading frame that encodes the polyprotein (55)
is flanked by the two untranslated regions, 5’-UTR and 3’-UTR. The 5’-UTR includes
a highly structured internal ribosomal entry site (IRES) which is known to exist in
four different molecular forms, from type I to IV (78). The expert-based classification
(the ICTV taxonomy) of the Picornaviridae, devised by the Picornavirus Study Group
(PSG), recognizes 28 species distributed among 12 genera, and no subfamilies (40). A
growing number of picornaviruses are either tentatively classified in provisional taxa or
remain unclassified. The PSG uses a complex set of rules to devise taxa and classify
viruses. All genera form compact monophyletic clusters in separate trees of the
conserved proteins as well as the capsid and replicative modules, respectively. The
polyprotein sequences of viruses in different genera differ by at least 58% amino acid
residue (aa) identity (39,70). For genera that include multiple species (Enterovirus,
Cardiovirus, Aphthovirus, Parechovirus, Kobuvirus, Sapelovirus) demarcation criteria
that separate the species have been developed by the PSG. Most commonly, they
define lower limits of pair-wise aa identity in the polyprotein and its two parts, the
capsid and replicative modules. Additionally, the criteria may include restrictions on
genome organization, genome base composition (G+C), host range, host cell receptor
variety, and compatibility in processes that underlie the replicative cycle. Some taxa
may be distinguished by the presence of a molecular marker that could be an L and/or
a 2A protein (20,31), the type of IRES (24,78), the genome position of the internal
cis-replicative element directing the VPg synthesis (CRE) (9,71), or a combination
thereof. For genera that include a single species (Hepatovirus, Erbovirus,
Teschovirus, Senecavirus, Tremovirus, Avihepatovirus) no species demarcation
criteria have been developed due to the lack of sufficient diversity in the available
virus sampling.

In an accompanying paper we have introduced a quantitative approach for partitioning
the genetic diversity of a virus family to build a hierarchical classification, which we
named DEmARC (43). In contrast to the framework of virus taxonomy, DEmARC
uses a sole demarcation criterion – inter-virus genetic divergence. When applied
to the family Picornaviridae, DEmARC clustered 1234 genome sequences in groups
at three hierarchical levels (the GENETIC classification). In this study, two of the
three inferred levels in the GENETIC classification were found to correspond most
closely to the species and genus ranks recognized by ICTV (40). Few deviations from
the ICTV taxonomy concern assignments for the genus Aphthovirus (40,45) and
species Human rhinovirus A and C (2,69). The third level has no counterpart in the
current taxonomy. Furthermore, we found the family-wide conserved proteins to have
almost universally accumulated fewer substitutions in viruses of the same species than
in those belonging to different species, suggesting that picornavirus species are
genetically separated. This also indicates that the objective discrimination between the
genetic divergence within a taxon (intragroup) and that between taxa (intergroup) is
attainable. Finally, we outline conceptual differences between the frameworks that
underlie the two classifications. These differences concern the relation of genetic
diversity, the content of a genetics-based classification, and virus groups representing
its structure. To facilitate the comparison we introduce two novel diagrams that
illustrate, respectively, the connection of new concepts developed in this study to
conventional phylogenetic techniques already used in taxonomy, and the depiction of
a classification and the associated genetic diversity in a standardized manner.

Materials and Methods

Virus sequences, multiple alignment and distance estimation

Complete genome sequences for 1234 picornaviruses available on April 15, 2010 at
the National Center for Biotechnology Information GenBank/RefSeq (5) databases
were downloaded using SARGENS (67) into the Viralis platform (21). A
concatenated multiple amino acid alignment covering the family-wide conserved
capsid proteins 1B, 1C, 1D and the non-structural proteins 2C, 3C and 3D of the 1234
picornaviruses (Fig. 1) was produced using the MUSCLE program (14) and poorly
conserved columns were further manually refined. The alignment subsequently
facilitated the calculation of pair-wise evolutionary distances (PEDs) using a
maximum likelihood (ML) approach (7,17), as implemented in the Tree-Puzzle
program (63). The WAG amino acid substitution matrix (77) was applied. PEDs serve
as a measure of inter-virus genetic divergence.
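
To make the notion of a pair-wise distance matrix concrete, the following minimal sketch computes a distance for every pair of sequences in a toy alignment. It is an illustration only and deliberately swaps in a much simpler measure: the uncorrected proportion of differing aligned residues (p-distance) rather than the ML distances under the WAG matrix that were estimated with Tree-Puzzle in this study. The function names and the toy sequences are hypothetical, and p-distances underestimate ML distances for diverged pairs.

```python
from itertools import combinations


def p_distance(seq_a, seq_b, gap="-"):
    """Fraction of differing residues over columns where neither sequence has a gap.

    Simplified stand-in for an ML distance; not the estimator used in the study.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return differing / compared


def pairwise_distances(alignment):
    """Return {(name_i, name_j): distance} for all unordered pairs of sequences."""
    names = sorted(alignment)
    return {
        (m, n): p_distance(alignment[m], alignment[n])
        for m, n in combinations(names, 2)
    }


# Hypothetical toy alignment (three sequences, six columns).
toy_alignment = {
    "virus_a": "MKV-LT",
    "virus_b": "MKVALT",
    "virus_c": "MRVALS",
}
toy_distances = pairwise_distances(toy_alignment)
```

For n sequences the dictionary holds n(n-1)/2 entries, which is the quantity that grows quickly for a dataset of 1234 genomes.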

Phylogeny reconstruction

Bayesian posterior probability trees were compiled utilizing the BEAST software (12).
Bayesian MCMC chains (2 independent runs per dataset) were run for 4 million steps
(10% burn-in, sampled every 100 generations) under the WAG amino acid
substitution matrix (77). Substitution rate heterogeneity among alignment sites was
allowed as modeled via a gamma distribution with 4 categories. The uncorrelated
relaxed molecular clock approach (lognormal distribution) (11) was used as it was
strongly favored over the strict molecular clock (log Bayes factor of 56.7) and the
relaxed molecular clock approach with exponential distribution (log Bayes factor of
14.6). Convergence of runs was verified using Tracer (13). ML trees were compiled
utilizing the PhyML software (23). The WAG amino acid substitution matrix was
applied and substitution rate heterogeneity among sites (4 categories) was allowed.
Support values for internal nodes were obtained using the non-parametric bootstrap
method with 1000 replicates or through SH-like approximate likelihood ratio tests.

Genetic-based virus classification

We have developed DEmARC, a quantitative procedure for hierarchical classification
of a virus family based on inter-virus genetic divergence (43). It has been evaluated
extensively for consistency and stability with respect to key parameters including the
amount and/or diversity of the input data, the alignment construction method and the
measure of inter-virus divergence. For brevity, we refer to the DEmARC-mediated
picornavirus classification as the GENETIC classification.

Measures of quality

In the accompanying paper (43) we have introduced a cost measure to determine a
threshold on intragroup genetic divergence at each classification level in a
quantitative way. This cost is calculated as the cumulative violation of the respective
threshold by intragroup PED values among all taxa of the level (see (43) for
details). Hence, this cost, which is a nonnegative real number, is used as a quality
measure for a classification level – the lower the cost the higher the quality.
Furthermore, analogs of the cost measure can be calculated for both a taxon and a
single virus by summing over the respective violating PED values.
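
As a rough illustration of such a cost, the sketch below sums the amounts by which intragroup PEDs exceed a threshold, first per taxon and then over all taxa of a level. The exact definition is given in (43); taking the "violation" of a single PED to be its excess over the threshold is an assumption made here for illustration, and the function names and toy values are hypothetical.

```python
def taxon_cost(intragroup_peds, threshold):
    """Sum of the excess over the threshold for the PEDs of one taxon.

    Assumption: a PED d violates the threshold t by max(0, d - t).
    """
    return sum(max(0.0, d - threshold) for d in intragroup_peds)


def level_cost(peds_by_taxon, threshold):
    """Cumulative cost over all taxa of one classification level."""
    return sum(taxon_cost(peds, threshold) for peds in peds_by_taxon.values())


# Hypothetical toy data: intragroup PEDs for two taxa at one level.
toy_level = {
    "taxon_x": [0.10, 0.20, 0.30],
    "taxon_y": [0.35, 0.42],
}
toy_cost = level_cost(toy_level, threshold=0.37)  # only 0.42 contributes
```

Under this reading the cost is zero exactly when no intragroup PED exceeds the threshold, and it grows with both the number and the size of the violations.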

Another measure of quality of a taxon is the fraction of intraspecific pair-wise
distances not exceeding the distance threshold of the respective level, to which we
refer as cluster quality (cq). A taxon is considered complete if cq=1, and incomplete
otherwise (0<cq<1).
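
The definition of cq translates directly into a few lines of code. The sketch below is a minimal illustration with hypothetical function names and toy values; how a taxon with a single sequence (no pairs) is treated is not specified here, so raising an error in that case is an assumption.

```python
def cluster_quality(intragroup_peds, threshold):
    """Fraction of intragroup PEDs that do not exceed the level threshold."""
    if not intragroup_peds:
        # Assumption: cq is undefined when there are no pairs (singleton taxon).
        raise ValueError("cq needs at least one intragroup distance")
    within = sum(1 for d in intragroup_peds if d <= threshold)
    return within / len(intragroup_peds)


def is_complete(intragroup_peds, threshold):
    """A taxon is complete when every intragroup PED respects the threshold."""
    return cluster_quality(intragroup_peds, threshold) == 1.0


# Hypothetical toy taxon: two of four distances exceed a threshold of 0.37.
toy_cq = cluster_quality([0.10, 0.20, 0.45, 0.50], threshold=0.37)
```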

Results and Discussion

Phylogeny, PED distribution and classification of picornaviruses

Our dataset included 1234 genome sequences from picornaviruses whose taxonomic
position at the start of this study was either already established as described above or
remained provisional or uncertain due to a considerable time involved in taxa
assignments (40). Using a concatenated multiple alignment of six conserved proteins
of a representative set of 38 picornaviruses we reconstructed a phylogenetic tree under
both a ML and a Bayesian framework. The two trees had a matching topology and
included monophyletic branches corresponding to the taxa recognized by ICTV (black
tree branches and names in Fig. 2). The phylogeny additionally comprised a number
of new branches of different lengths accommodating a large number of relatively
recently identified picornaviruses. We concluded that the alignment used in our study
contains information compatible with taxonomy. Hence, we used this alignment as
input for DEmARC in order to devise the GENETIC classification of picornaviruses
(43). We identified three statistically most strongly supported positions of
discontinuity (thresholds) in the picornavirus PED distribution that we assigned as
defining species, genus and super-genus levels of the classification, respectively.

Below, we compare the GENETIC classification and the ICTV taxonomy at each of
these levels separately. To facilitate the comparison we devised a special plot (Fig.
3A, central panel), which connects the phylogeny (Fig. 3A, left) and the PED
distribution (Fig. 3A, bottom-right) that are used in taxonomy and DEmARC,
respectively. The plot (Fig. 3A, central) presents a two-dimensional partitioning of the
inter-virus genetic diversity and reveals an association of a taxon in the tree and three
ranges in the PED distribution that correspond to the three levels of the GENETIC
classification. Thus, the phylogeny and the PED distribution represent complementary
projections of the inter-virus genetic diversity that, when combined, reveal the most
critical characteristics utilized in taxonomy. The availability of this plot empowers the
reader with a tool to inspect the foundations and analyze implications of the proposed
classification.

GENETIC classification versus ICTV taxonomy: species level

At the species level, the principal level in taxonomy, the GENETIC classification
includes 38 clusters. Twenty-seven of them correspond one-to-one to species of the
ICTV taxonomy (70), three clusters encompass a single species (Human rhinovirus C;
HRV-C), and eight clusters comprise recently discovered viruses that were not yet
formally classified at the start of the study. HRV-C was split in three species-like
clusters provisionally named Human rhinovirus Cα (HRV-Cα), Human rhinovirus Cβ
(HRV-Cβ), and Human rhinovirus Cγ (HRV-Cγ) (Figs. 2 and 3A, Table 1).

The 27 clusters corresponding to the recognized species include already classified
viruses and some accommodate also recently discovered viruses, including simian
enteroviruses joining Human enterovirus A or B (HEV-A and HEV-B, respectively)
(53,54), Saffold virus grouping with Theilovirus (6,8,32,46), Possum enterovirus
joining Bovine enterovirus (82), and Porcine kobuvirus being classified with Bovine
kobuvirus (62) (Table 1). With the exception of Theilovirus the host range of these
species was expanded as a result of this virus update. A recent phylogenetic study of
RNA viruses from three families and two genera other than the Picornaviridae
revealed that host switching by virus species is more frequent than previously thought
(38).

The eight clusters encompassing exclusively novel viruses include: cosaviruses (4
clusters, CosV-A, CosV-B, CosV-C, CosV-D) (26,33), sealion picornavirus (1,
AqV-A) (34,40), Human klasse- and saliviruses (hereafter referred to as saliviruses) (1,
SaliV-A) (22,27), rhinoviruses close to but separate from Human rhinovirus A, HRV-A
(1, provisionally named Human rhinovirus Aβ, HRV-Aβ) (9,56,57) and simian
enteroviruses not belonging to Simian enterovirus A (1, SiEV-B) (52-54) (Table 1).
There seems to be a good match between the GENETIC classification assignments
listed above and those that are in the pipeline for approval by ICTV (39,40).

Thirty-two out of 38 species include more than one sequence (non-singleton) and they
determine the PED range of all 38 species clusters which is defined as “intra-species”
genetic divergence (Fig. 3A). Virus sampling for the 38 species varied considerably in
the range of one (six species) to 260 (Foot-and-mouth disease virus, FMDV)
sequences (Fig. 3A). The corresponding intragroup PED ranges (distances between
virus pairs belonging to a single species) differed ~10-fold among the species with
more than one non-identical sequence, with maxima varying from 0.04 (Avian
encephalomyelitis virus, AvEMV) to 0.41 (HRV-A) (Fig. 3A). All except three
species clusters were complete (each intragroup PED is below the species distance
threshold) (Fig. 3A) (see (43)). The three incomplete species clusters include viruses
that belong to HRV-A (96 viruses in total and 14 viruses define pairs with
larger-than-threshold distances), Bovine kobuvirus (4 and 1) and the proposed species-like cluster
HRV-Cγ (4 and 2) (Table 2; Fig. 4). In these species, respectively, 3.6%, 16.7% and
50% of intragroup PEDs exceeded the species threshold (Table 1, Fig. 3A). Combined
they account for less than 0.19% (175 out of 93,857) of all intragroup PED values at
this level. In respective classifications obtained with three evaluation datasets (43),
Bovine kobuvirus was split in two clusters that observe the threshold and are
host-restricted, which would be in line with the original proposal by the authors who
identified the porcine kobuvirus (62).
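
The percentages quoted above follow from simple pair counting, which the short sketch below reproduces. It assumes that every unordered pair of the n viruses in a cluster contributes one intragroup PED, so that a cluster of four viruses has six PEDs; under that assumption 16.7% and 50% would correspond to one and three of six pairs. The helper names are hypothetical.

```python
def n_pairs(n_viruses):
    """Number of unordered virus pairs, i.e. intragroup PEDs, in a cluster."""
    return n_viruses * (n_viruses - 1) // 2


def percent(part, total):
    """Share of `part` in `total`, expressed in percent."""
    return 100.0 * part / total


pairs_in_cluster_of_four = n_pairs(4)          # pairs among 4 viruses
share_one_of_six = percent(1, 6)               # about 16.7%
share_three_of_six = percent(3, 6)             # 50%
share_all_violations = percent(175, 93857)     # values quoted in the text
```

The last value stays below the 0.19% bound stated in the text.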
GENETIC classification versus ICTV taxonomy: rhinoviruses

Why do the GENETIC classification and the ICTV taxonomy differ so profoundly in
respect to HRV-C while agreeing on the virus composition of all other species?
Specifics of both HRV-C evolution and the two classification frameworks could play
a role. The genetic diversity of these viruses in capsid (1A, also known as VP4, and
1D proteins) and non-structural (3D) regions was previously reported to exceed those
of other rhinoviruses (51,69). In the 1D protein this difference is smallest and the
entire HRV-C diversity was considered to be below the species limit, paving the way
to the recognition of HRV-C as a single species. We have also observed HRV-C
viruses to form a single species-like cluster in the DEmARC-mediated classification
using the major capsid proteins only (43). However, in the analysis of the dataset
comprising the six family-wide conserved proteins the observed maximum divergence
of HRV-C considerably exceeded that of its most diverged subset (HRV-Cγ) and the
family-wide species demarcation threshold: 0.424, 0.392 and 0.37, respectively. This
was likely due to a combined effect of congruent phylogenetic signals from both the
structural and non-structural proteins (Fig. 4 and data not shown). The virus
divergence in HRV-C is so high that even half of intragroup distances in HRV-Cγ
exceed the species threshold (Fig. 3A and Table 1). This low support for the HRV-Cγ
species (Table 1), which is the lowest overall and only one of three below 100%, is
even more striking given that the virus sampling in this provisional species and the
two HRV-C sister taxa is very limited (one to four available genome sequences per
cluster). Thus, it remains plausible that with the accumulation of sequenced genomes
in the future, HRV-Cγ will be split further, increasing the number of provisional
HRV-C species to at least four compared to the one currently recognized. Each of
these species corresponds to a separate major lineage in the HRV-C phylogeny (51)
(Fig. 4).

Furthermore, the GENETIC classification proposes the recognition of another
potentially new rhinovirus species (HRV-Aβ). It is formed by three viruses and
corresponds to the recently identified “clade D” rhinoviruses (57) (known otherwise
as the cluster HRV-A2, (9)) that is a sister group to the species HRV-A (Fig. 4).
Altogether, our analysis suggests that at least six (rather than three) human rhinovirus
species may exist. Testing this more complex species structure in human rhinoviruses
could facilitate research into the molecular basis of the observed clinical
heterogeneity of rhinovirus infections in humans (2,28,56).

GENETIC classification and the recognition of virus species as biological entities

We have found that viruses belonging to a single species are usually separated by less
than ~0.4 replacements per residue on average in the six most conserved proteins,
while this distance is commonly exceeded in virus pairs representing different species
(Fig. 3B). Furthermore, we observed a dependence of the largest intragroup genetic
divergence (maximum intragroup PED) on the sampling size (number of viruses) in
the 38 species: with increasing sampling size, a species’ maximum genetic divergence
tends to approach the species distance threshold (Fig. 3B). Accordingly, the eleven
species that constitute the upper ~25% of the maximum PED range are enriched with
highly-sampled species. Additionally, host range may be another parameter of
relevance to the genetic divergence of species: the upper ~25% of the maximum PED
range is also enriched with species that infect multiple hosts (five out of six species of
this kind) (Fig. 3B). This correlation is sensible biologically since host switching is
expected to be accompanied by accelerated virus evolution.

The abovementioned correlations involve species that belong to four genera,
indicating that they may be applicable to all picornavirus species. If so, we may
expect that with a sufficient increase of the species sampling size the maximum
divergence of all species in the Picornaviridae will approach the species threshold.
This would indicate that the intragroup genetic divergence of species is constrained
similarly in different lineages. Alternatively, some currently under-sampled lineages
could accommodate a smaller natural diversity due to either stricter constraints or
being a “young” species. For instance, HAV with its relatively large sampling size
and two hosts (Fig. 3B) has an unusually small maximum genetic divergence (see also
(4)). Thus, it remains possible that the inferred species threshold represents an upper
limit on the maximum intragroup genetic divergence but that the actual limit may be
smaller in some picornavirus species. Likewise, we may not exclude that viruses in
some species may diverge above the threshold. This might happen due to
position-specific variations of replacements in the six conserved proteins or involvement of
virus lineages that are in the transition to establish separate species. The virus
diversity known in taxonomy as the species HRV-A and HRV-C (Fig. 4) could
represent such cases. Also, it is important to stress that the species distance threshold
represents an average over 2446 positions in six conserved proteins (43) indicating
that (lineage-specific) variations of maximum divergence for different proteins are
likely (see below and also (44,68)). Further characterization of the natural diversity of
picornavirus species, including the surveillance of novel hosts, could address this
important aspect of the species delimitation in the GENETIC classification.

The existence of a species threshold on intragroup genetic divergence must be
rationalized mechanistically. It may be a manifestation of speciation due to changes
accumulated in either conserved proteins or other elements encoded in the
picornavirus genome. To discuss the alternatives, it is important to recall that the
divergence is a net result of contributions from several sources including mutation and
homologous recombination. Although both promote diversity increase, they act in
opposite directions concerning progeny divergence: on average, the progeny of two
lineages diverged by mutation will be more separated than their parents while those
generated through homologous recombination of parents will be closer to each other
than their parents (49). In other words, recombination limits the maximum genetic
divergence in an asexual population; without it the population will evolve into
separate, more distantly related lineages after a sufficient time.

The inferred species threshold reflects the maximum amount of accumulated genetic
differences in the six conserved proteins between two picornaviruses that remains
compatible with the viability of progeny produced by homologous recombination, as
argued below. The frequency of homologous recombination depends on the extent of
base-pairing, with intratypic recombination being most common (37,72). Two
picornaviruses that are separated by a distance approaching the species threshold
would retain only relatively small stretches of identical orthologous residues in their
genome because the threshold is so high; the lack of extensive base-pairing should
impede homologous recombination. Even if recombination happens between these
viruses, the resulting chimeric progeny will be viable only if the recombinant proteins,
which all are essential for virus reproduction, remain functional. The protein
functionality depends on the intra- and inter-protein compatibility of lineage-specific
mutations that have been accumulated since the divergence of these viruses. The
mutation spectrum is restricted by so-called epistatic interactions between different
protein positions (66) making mutations outside this spectrum incompatible with the
protein functioning. As two viruses diverge they will approach the species distance
threshold beyond which accumulated mutations may become incompatible with
progeny viability in any combination that could be generated in the recombinants. In
this framework, the existence of the species threshold reflects the genetic separation
of species. This model could be probed in experiments on virus chimeras involving the
conserved backbone proteins. It is predicted that intra- but not inter-species chimeras
must be viable. Results compatible with this model are available for Human
enterovirus C (29,30). The viability of chimeric progeny may be determined not only
by the distance between parents but also by the origins of combined parts (30),
indicating that both forward and reciprocal chimeras must be characterized.

In the alternative model, other elements outside the conserved proteins could be
implicated in the control of speciation. These elements include L and 2A proteins,
which exist in a large variety of molecular forms in picornaviruses (1,20,40), or CRE
whose location in the genome varies tremendously among picornaviruses
(9,18,19,50,73,80,81), or other elements located in the 5’- and 3’-non-coding
regions (71,78). For a number of picornaviruses the viability of inter-species chimeras
carrying a non-cognate version of the L (58) or 2A protein (47), or of the CRE (73) or
IRES (48,78) was demonstrated experimentally. Also, several picornaviruses with
deleted L proteins were found to be viable (42,59) which is in line with their
accessory “security” role in virus replication (1). Thus, picornaviruses could accept
“gene flow” from other species in the case of elements that are not conserved
family-wide. Consequently, an acquisition or loss or relocation of a non-conserved element
by a picornavirus in vivo seems plausible. Furthermore, it is conceivable that such a
newly acquired element might confer a function that would allow the virus to explore
a new niche, eventually leading to its reproductive isolation from other lineages; in
other words it would trigger speciation. However, this model does not provide a
mechanistic explanation for the species genetic threshold other than that of the first
model (see above).

Thus, in our opinion, non-conserved and conserved elements of the picornavirus
genome may play distinct roles in speciation. The clear-cut relation between the
species delimitation and the discontinuity in the inter-virus genetic distance
distribution lends support for the notion that picornavirus species are biological
entities rather than merely operational units.

GENETIC classification versus ICTV taxonomy: genus level

The GENETIC classification includes a genus level comprising 16 clusters. Eleven of
them match ICTV genera, two clusters encompass a single genus (Aphthovirus), and
three clusters comprise recently discovered viruses (Figs. 2 and 3A). The genus
Aphthovirus was split into two clusters that are formed by, respectively, the single
species Equine rhinitis A virus (ERAV) (45) and the two species FMDV (10,41) and
Bovine rhinitis B virus (BRBV) (25). The minimum PED of 1.03
between viruses of these two clusters is considerably larger than the genus distance
threshold of 0.905 and comparable to those between the closest virus pairs of other
sister genera, e.g. Senecavirus and Cardiovirus or Enterovirus and Sapelovirus. In
fact, the distance range between viruses of these two clusters fits in the limits of the
next rank (super-genus) that is considered below. This result was also reproduced in
classifications of two evaluation datasets (43) in which these viruses are present but
which differed in respect to genome region and virus selection, respectively. We note
that an L protein variety with a papain-like fold and proteolytic activity that is
associated with this monophyletic virus group (40) could be considered a molecular
marker of a larger group that also includes the sister genus Erbovirus (45,79). Thus,
there is strong support for splitting the genus Aphthovirus into two genera in future
revisions of taxonomy.

The three genus clusters that are formed by recently discovered viruses include
cosaviruses (4 species), sealion picornavirus (1) and saliviruses (1), respectively. All
genus clusters are complete with the exception of Enterovirus (Fig. 3A), resulting in
less than 0.02% (21 out of 152,194) of intragroup PED values that exceed the genus
threshold (Table 2), all involving a single sequence of Enterovirus 71 (GenBank accession
AF119795) from HEV-A. Seven out of 16 genera are non-singletons (include more than one
species), and they determine the genus-specific PED range, which is defined as
“inter-species intra-genus” genetic divergence (Fig. 3A).
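The completeness statistic quoted above is the share of intragroup PED values that lie
above a rank threshold. The following Python lines are a minimal sketch of that
calculation, not the DEmARC implementation; the function name and the toy PED values are
hypothetical, and only the genus threshold (0.905) and the counts (21 out of 152,194) are
taken from the text.

```python
def share_above_threshold(peds, threshold):
    """Return (count, total, percent) of PED values strictly above the threshold."""
    total = len(peds)
    count = sum(1 for d in peds if d > threshold)
    percent = 100.0 * count / total if total else 0.0
    return count, total, percent

# Toy intragroup PED values (made up for illustration; not data from the study)
toy_peds = [0.10, 0.45, 0.88, 0.93, 0.30]
print(share_above_threshold(toy_peds, threshold=0.905))

# Percentage implied by the counts reported in the text for the genus level
print(f"{100.0 * 21 / 152194:.4f}%")
```

With the reported counts the share evaluates to roughly 0.014%, in line with the stated
bound of less than 0.02%.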

GENETIC classification versus ICTV taxonomy: recognition of the new hierarchical
level super-genus

The GENETIC classification recognizes an additional rank, provisionally called
super-genus, that has no counterpart in virus taxonomy. The threshold support for this
level is the strongest overall (43), indicating that it may reflect a clustering that is
genetically and evolutionarily sensible. At this level we observe five non-singleton
super-genera, i.e. those that include more than one genus. They include viruses from 28
species and ten genera. Four of these super-genera represent unions of, respectively,
Enterovirus with Sapelovirus, Cardiovirus with Senecavirus, Hepatovirus with Tremovirus,
and Kobuvirus with the cluster formed by recently discovered saliviruses (Figs. 2 and
3A). The fifth non-singleton super-genus corresponds to the genus Aphthovirus in the ICTV
taxonomy, which is split into two genera in the GENETIC classification (see above). The
other six super-genera accommodate singleton genera including ten species in total. Four
of these super-genera include only a single ICTV genus: Avihepatovirus, Erbovirus,
Parechovirus and Teschovirus, respectively. Two super-genera are formed by recently
discovered cosaviruses and sealion picornavirus, respectively. All super-genus clusters
are complete with the exception of the Enterovirus/Sapelovirus union (Fig. 3A), resulting
in less than 0.25% (7 out of 2814) of intragroup PED values that exceed the super-genus
threshold (Table 2), all involving a single sequence of avian sapelovirus (RefSeq
accession NC_006553) from AvSV. The five non-singleton super-genera determine the
super-genus-specific PED range, which is defined as “inter-species inter-genus
intra-super-genus” genetic divergence (Fig. 3A).
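The counts given in this section can be cross-checked with a few lines of Python. This
is only a bookkeeping sketch: the dictionary layout is our own, and the figure of six
genera for the singleton super-genera is inferred from the statement that each of them
accommodates a singleton genus.

```python
# Counts stated in the text for the two kinds of super-genera
non_singleton = {"super_genera": 5, "genera": 10, "species": 28}
# Six singleton super-genera, each holding one genus (inferred), ten species in total
singleton = {"super_genera": 6, "genera": 6, "species": 10}

totals = {key: non_singleton[key] + singleton[key] for key in non_singleton}
print(totals)

# Share of intragroup PEDs above the super-genus threshold (7 out of 2814)
print(f"{100.0 * 7 / 2814:.3f}%")
```

The totals (11 super-genera, 16 genera, 38 species) agree with the 16 genus clusters
reported above and with the 38 viruses and 11 super-genera shown in Fig. 2.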

Multimodality of the PED distribution and the evolution of picornaviruses

To our knowledge, there is nothing in evolutionary theory that would predict the
multimodality of the PED distribution of conserved proteins for a virus family. However,
once observed, it requires an (evolutionary) explanation. The model of virus speciation
outlined above may explain the existence of the PED discontinuity in which the species
threshold resides. This threshold is expected to limit intragroup but not intergroup
genetic divergence of lineages once they have crossed the threshold. This biological
reasoning seems not to be applicable to the other areas of PED discontinuity, which are
associated with the genus and super-genus thresholds, respectively. One plausible
explanation for these discontinuities is that they could reflect large-scale changes in
the rates of birth and death that might have happened across all virus lineages. Cellular
life forms are known to have gone through alternating periods of mass birth and mass
death across lineages (61,65). If ancestral (picorna)viruses followed their hosts,
alternating peaks and valleys in their PED distribution would reflect periods
characterized predominantly by virus speciation and extinction, respectively. Thus, the
genus and super-genus levels determined in this study would correspond to two major waves
of speciation that are separated by two waves of extinction in the evolution of
picornaviruses, possibly reflecting changes in the environment.

GENETIC classification and the taxonomy of picornaviruses: two different
perspectives on known and unknown virus diversities

As shown above, there is striking agreement between the GENETIC classification and the
ICTV taxonomy (70) of the Picornaviridae at the species and genus levels, with notable
differences concerning the recognition of only a few taxa. The observed match is
non-trivial (76) since the underlying decision-making frameworks seek to satisfy
different criteria. To fully reveal the impact of these criteria in the two frameworks,
which are either exclusively (DEmARC) or predominantly (ICTV) genetic-based, we sought to
characterize their effect on partitioning the virus diversity, the primary target of
classification and an important subject of research in virology. To this end, we have
developed a circular diagram for presenting the classification of a virus family in a
graphical form (Fig. 5). It depicts the proportions of the inter-virus genetic divergence
that are partitioned and non-partitioned by a classification, respectively. The circle
radius is defined by the PED range observed in the family, with inter-virus genetic
divergence increasing linearly from the perimeter (PED of zero) towards the center of the
circle (maximum observed PED). Taxa are shown as boxes with heights (in the radial
dimension) that correspond to the PED range of the respective classification level.
Species form the most external layer, followed by the genus layer and, for the GENETIC
classification, the super-genus layer residing closest to the circle center. Within each
taxon, the PED range that has been sampled and not sampled is colored according to the
coloring scheme for classification ranks (Fig. 3) using bright and soft colors,
respectively. The PED range that has not been partitioned (yet) by a classification
(inner part of the circle) is in white.
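The radial layout described above amounts to a linear map from PED to distance from the
circle center. The following Python sketch illustrates one way to express it; the
function names, the unit circle radius and the maximum PED of 3.0 are assumptions made
for illustration and are not taken from Fig. 5.

```python
def ped_to_radius(ped, ped_max, circle_radius=1.0):
    """Map a PED to a radial position: PED 0 lies on the perimeter,
    ped_max lies at the center, with linear scaling in between."""
    if not 0.0 <= ped <= ped_max:
        raise ValueError("ped must lie within [0, ped_max]")
    return circle_radius * (1.0 - ped / ped_max)

def box_radial_extent(ped_low, ped_high, ped_max, circle_radius=1.0):
    """Return (outer_radius, inner_radius) of a taxon box covering a PED range."""
    return (ped_to_radius(ped_low, ped_max, circle_radius),
            ped_to_radius(ped_high, ped_max, circle_radius))

# Hypothetical maximum PED of 3.0, chosen only for illustration
print(ped_to_radius(0.0, 3.0))             # a PED of zero sits on the perimeter
print(ped_to_radius(3.0, 3.0))             # the maximum PED sits at the center
print(box_radial_extent(0.0, 0.905, 3.0))  # box spanning PEDs from 0 to 0.905
```

Under this mapping the height of a box in the radial dimension is proportional to the
PED range it covers, which is what allows the partitioned and non-partitioned shares of
divergence to be compared visually.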

To facilitate an unbiased comparison of the genetic foundations of both frameworks
involving as many taxa as possible, the ICTV taxonomy in Fig. 5 was required to follow
the GENETIC classification by accepting all taxa containing new viruses and those two
(Aphthovirus, HRV-C) that were classified differently. As a result, the taxonomy and the
GENETIC classification match each other in relation to the virus sampling per taxon (the
most external layer in Figs. 5A and 5B) and the species and genus structure. At the
species level, the PSG applies demarcation criteria that are genus-specific and
determined by the maximum observed intragroup genetic divergence among all sampled
species of the genus. As a consequence, the limit on intragroup genetic divergence of
species varies tremendously between genera. Accordingly, in the ICTV diagram only species
of the same genus have equal heights (compare taxa 11.x with 12.x in Fig. 5A); for
species that comprise a single virus the height is nil (no pair is available to produce a
PED; see for instance taxon 16.1 in Fig. 5A). At the genus level, the PSG does not
provide demarcation criteria for the quantification of maximum intragroup genetic
divergence, and each genus is demarcated separately, usually by means of standard
phylogenetic analyses. To reflect this approach, we represented genera as boxes whose
heights correspond to the maximum observed intragroup genetic divergence (Fig. 5A). For
genera comprising a single species the height is nil (see for instance taxon 15.1 in Fig.
5A). In contrast, in the DEmARC diagram (Fig. 5B) all species, genus or super-genus taxa
have uniform, level-specific heights, since in this framework family-wide limits on
intragroup genetic divergence are devised (compare for instance taxa 10.1 and 11.1 in
Fig. 5B).
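The two ways of assigning box heights to species can be contrasted in a short Python
sketch. It is an illustration under stated assumptions rather than the procedure used to
draw Fig. 5: the function names, the toy genus, the virus labels, the PED values and the
family-wide threshold of 0.40 are all hypothetical.

```python
from itertools import combinations

def max_intragroup_ped(viruses, ped):
    """Largest PED over all virus pairs in a group; 0.0 when no pair exists."""
    return max((ped[frozenset(p)] for p in combinations(viruses, 2)), default=0.0)

def genus_specific_species_heights(genus, ped):
    """Species share one genus-specific height, the maximum intragroup PED observed
    in any species of the genus; single-virus species get height 0.0."""
    limit = max(max_intragroup_ped(v, ped) for v in genus.values())
    return {name: (limit if len(v) > 1 else 0.0) for name, v in genus.items()}

def family_wide_heights(genus, threshold):
    """Every species gets the same family-wide threshold as its height."""
    return {name: threshold for name in genus}

# Toy genus with made-up virus names and PEDs (not data from the study)
genus = {"sp1": ["a", "b", "c"], "sp2": ["d", "e"], "sp3": ["f"]}
ped = {frozenset(("a", "b")): 0.10, frozenset(("a", "c")): 0.25,
       frozenset(("b", "c")): 0.20, frozenset(("d", "e")): 0.05}

print(genus_specific_species_heights(genus, ped))
print(family_wide_heights(genus, threshold=0.40))  # 0.40 is a placeholder value
```

In the toy genus the genus-specific rule gives the two multi-virus species the same
height and the single-virus species a height of zero, whereas the family-wide rule
assigns one height to all three.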

As a consequence of the utilization of family-wide demarcation thresholds, the DEmARC
framework, compared to that of ICTV, partitions a larger share of the total PED space
(compare white areas in Fig. 5A and 5B). This also shows that for most taxa a fraction of
the intragroup genetic divergence is yet to be described in field studies (soft-colored
areas in Fig. 5B). Such predictions are not available in the ICTV framework. The diagrams
also reveal that the most distant relations of viruses in the Picornaviridae remain
totally unstructured (white central area in Fig. 5A and 5B). In the DEmARC framework,
this area is partially partitioned by super-genera, and could be partitioned further if
the subfamily level is introduced (43).

Concluding Remarks

In a field lacking a gold standard, the striking agreement between the GENETIC
classification and the expert-based taxonomy (39,70) of the Picornaviridae could be seen
as a cross-validation for both. Of principal importance is that the observed agreement
implies that genomes may contain the necessary and sufficient information to build a
(picorna)virus taxonomy by using an approach (43) that employs a sole (rather than
polythetic) demarcation criterion. There are additional benefits of the single criterion:
its utilization provides consistency across all taxa, defines expected divergence ranges
for poorly sampled taxa, reveals problematic taxa, and makes taxonomy fully
genetic-based. We expect the latter to facilitate the interaction between taxonomy and
fundamental and applied research. Genetically delimited taxa could be readily targeted
for recognition by virus diagnostics. Furthermore, the validity of the species threshold
could be probed in experiments involving homologous recombinants in the backbone genes as
well as through characterization of the natural virus diversity in already established
and newly identified picornavirus species. Biological foundations of other, higher-rank
thresholds could also be addressed. These advancements, combined with the application of
DEmARC to other virus families, could bring virus taxonomy into the mainstream of
research and pave the way to ultimately unite it with the taxonomy of cellular life
forms.

Acknowledgments

We are indebted to Igor Sidorov, Andrey Leontovich and Ivan Antonov for helpful
discussions and suggestions, and Dmitry Samborskiy, Igor Sidorov and Alexander Kravchenko
for administering and advancing different Viralis modules. This work was partially
supported by the Netherlands Bioinformatics Centre (BioRange SP 2.3.3), the European
Union (FP6 IP Vizier LSHG-CT-2004-511960 and FP7 IP Silver HEALTH-2010-260644), the
Collaborative Agreement in Bioinformatics between Leiden University Medical Center and
Moscow State University (MoBiLe), and Leiden University Fund (Special Chair in Applied
Bioinformatics in Virology).

References

1. Agol, V. I. and A. P. Gmyl. 2010. Viral security proteins: counteracting host defences. Nat. Rev. Microbiol. 8:867-878.
2. Arden, K. E. and I. M. Mackay. 2010. Newly identified human rhinoviruses: molecular methods heat up the cold viruses. Rev. Med. Virol. 20:156-176.
3. Beckner, M. 1959. The biological way of thought. Columbia University Press, New York.
4. Belalov, I. S., O. V. Isaeva, and A. N. Lukashev. 2011. Recombination in hepatitis A virus: evidence for reproductive isolation of genotypes. J. Gen. Virol. 92:860-872.
5. Benson, D. A., I. Karsch-Mizrachi, D. J. Lipman, J. Ostell, and E. W. Sayers. 2010. GenBank. Nucl Acids Res 38:D46-D51.
6. Blinkova, O., A. Kapoor, J. Victoria, M. Jones, N. Wolfe, A. Naeem, S. Shaukat, S. Sharif, M. M. Alam, M. Angez, S. Zaidi, and E. L. Delwart. 2009. Cardioviruses Are Genetically Diverse and Cause Common Enteric Infections in South Asian Children. J. Virol. 83:4631-4641.
7. Cavalli-Sforza, L. L. and A. W. F. Edwards. 1967. Phylogenetic Analysis Models and Estimation Procedures. Am J Hum Genet 19:233-257.
8. Chiu, C. Y., A. L. Greninger, K. Kanada, T. Kwok, K. F. Fischer, C. Runckel, J. K. Louie, C. A. Glaser, S. Yagi, D. P. Schnurr, T. D. Haggerty, J. Parsonnet, D. Ganem, and J. L. Derisi. 2008. Identification of cardioviruses related to Theiler's murine encephalomyelitis virus in human infections. Proc. Natl. Acad. Sci. U. S. A. 105:14124-14129.
9. Cordey, S., D. Gerlach, T. Junier, E. M. Zdobnov, L. Kaiser, and C. Tapparel. 2008. The cis-acting replication elements define human enterovirus and rhinovirus species. RNA 14:1568-1578.
10. Domingo, E., C. Escarmis, E. Baranowski, C. M. Ruiz-Jarabo, E. Carrillo, J. I. Nunez, and F. Sobrino. 2003. Evolution of foot-and-mouth disease virus. Virus Res. 91:47-63.
11. Drummond, A. J., S. Y. W. Ho, M. J. Phillips, and A. Rambaut. 2006. Relaxed phylogenetics and dating with confidence. PLoS Biol 4:699-710.
12. Drummond, A. J. and A. Rambaut. 2007. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol 7.
13. Drummond, A. J. and A. Rambaut. 2007. Tracer v1.4, available from http://beast.bio.ed.ac.uk/Tracer.
14. Edgar, R. C. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucl Acids Res 32:1792-1797.
15. Ehrenfeld, E., E. Domingo, and R. P. Roos (eds.). 2010. The Picornaviruses, p. 1-493. ASM Press, Washington.
16. Fauquet, C. M., M. A. Mayo, J. Maniloff, U. Desselberger, and L. A. Ball (eds.). 2005. Virus Taxonomy, Eighth Report of the International Committee on Taxonomy of Viruses, p. 1-1259. Elsevier, Academic Press, Amsterdam.
17. Felsenstein, J. 1981. Evolutionary Trees from DNA-Sequences - A Maximum-Likelihood Approach. J Mol Evol 17:368-376.
18. Gerber, K., E. Wimmer, and A. V. Paul. 2001. Biochemical and Genetic Studies of the Initiation of Human Rhinovirus 2 RNA Replication: Identification of a cis-Replicating Element in the Coding Sequence of 2A(pro). J. Virol. 75:10979-90.
19. Goodfellow, I., Y. Chaudhry, A. Richardson, J. Meredith, J. W. Almond, W. Barclay, and D. J. Evans. 2000. Identification of a cis-acting replication element within the poliovirus coding region. J. Virol. 74:4590-4600.
20. Gorbalenya, A. E. and C. Lauber. 2010. Origin and Evolution of the Picornaviridae Proteome, p. 253-270. In E. Ehrenfeld, E. Domingo, and R. P. Roos (eds.), The Picornaviruses. ASM Press, Washington.
21. Gorbalenya, A. E., P. Lieutaud, M. R. Harris, B. Coutard, B. Canard, G. J. Kleywegt, A. A. Kravchenko, D. V. Samborskiy, I. A. Sidorov, A. M. Leontovich, and T. A. Jones. 2010. Practical application of bioinformatics by the multidisciplinary VIZIER consortium. Antivir Res 87:95-110.
22. Greninger, A. L., C. Runckel, C. Y. Chiu, T. Haggerty, J. Parsonnet, D. Ganem, and J. L. Derisi. 2009. The complete genome of klassevirus - a novel picornavirus in pediatric stool. Virol. J. 6.
23. Guindon, S. and O. Gascuel. 2003. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst. Biol. 52:696-704.
24. Hellen, C. U. T. and S. de Breyne. 2007. A distinct group of hepacivirus/pestivirus-like internal ribosomal entry sites in members of diverse Picornavirus genera: Evidence for modular exchange of functional noncoding RNA elements by recombination. J. Virol. 81:5850-5863.
25. Hollister, J. R., A. Vagnozzi, N. J. Knowles, and E. Rieder. 2008. Molecular and phylogenetic analyses of bovine rhinovirus type 2 shows it is closely related to foot-and-mouth disease virus. Virology 373:411-425.
26. Holtz, L. R., S. R. Finkbeiner, C. D. Kirkwood, and D. Wang. 2008. Identification of a novel picornavirus related to cosaviruses in a child with acute diarrhea. Virol. J. 5.
27. Holtz, L. R., S. R. Finkbeiner, G. Y. Zhao, C. D. Kirkwood, R. Girones, J. M. Pipas, and D. Wang. 2009. Klassevirus 1, a previously undescribed member of the family Picornaviridae, is globally widespread. Virol. J. 6.
28. Jackson, D. J., R. E. Gangnon, M. D. Evans, K. A. Roberg, E. L. Anderson, T. E. Pappas, M. C. Printz, W. M. Lee, P. A. Shult, E. Reisdorf, K. T. Carlson-Dakes, L. P. Salazar, D. F. DaSilva, C. J. Tisler, J. E. Gern, and R. F. Lemanske. 2008. Wheezing rhinovirus illnesses in early life predict asthma development in high-risk children. Am J Resp Crit Care Med 178:667-672.
29. Jegouic, S., M. L. Joffret, C. Blanchard, F. B. Riquet, C. Perret, I. Pelletier, F. Colbere-Garapin, M. Rakoto-Andrianarivelo, and F. Delpeyroux. 2009. Recombination between Polioviruses and Co-Circulating Coxsackie A Viruses: Role in the Emergence of Pathogenic Vaccine-Derived Polioviruses. PLoS Pathog. 5.
30. Jiang, P., J. A. J. Faase, H. Toyoda, A. Paul, E. Wimmer, and A. E. Gorbalenya. 2007. Evidence for emergence of diverse polioviruses from C-cluster coxsackie A viruses and implications for global poliovirus eradication. Proc. Natl. Acad. Sci. U. S. A. 104:9457-9462.
31. Johansson, S., B. Niklasson, J. Maizel, A. E. Gorbalenya, and A. M. Lindberg. 2002. Molecular analysis of three Ljungan virus isolates reveals a new, close-to-root lineage of the Picornaviridae with a cluster of two unrelated 2A proteins. J. Virol. 76:8920-8930.
32. Jones, M. S., V. V. Lukashov, R. D. Ganac, and D. P. Schnurr. 2007. Discovery of a novel human picornavirus in a stool sample from a pediatric patient presenting with fever of unknown origin. J. Clin. Microbiol. 45:2144-2150.
33. Kapoor, A., J. Victoria, P. Simmonds, E. Slikas, T. Chieochansin, A. Naeem, S. Shaukat, S. Sharif, M. M. Alam, M. Angez, C. L. Wang, R. W. Shafer, S. Zaidi, and E. Delwart. 2008. A highly prevalent and genetically diversified Picornaviridae genus in South Asian children. Proc. Natl. Acad. Sci. U. S. A. 105:20482-20487.
34. Kapoor, A., J. Victoria, P. Simmonds, C. Wang, R. W. Shafer, R. Nims, O. Nielsen, and E. Delwart. 2008. A highly divergent picornavirus in a marine mammal. J. Virol. 82:311-320.
35. King, A. M. Q., M. J. Adams, E. B. Carstens, and E. J. Lefkowitz (eds.). 2012. Virus Taxonomy, Ninth Report of the International Committee on Taxonomy of Viruses, p. 1-1327. Elsevier, Academic Press, Amsterdam.
36. Kingsbury, D. W. 1985. Species Classification Problems in Virus Taxonomy. Intervirology 24:62-70.
37. Kirkegaard, K. and D. Baltimore. 1986. The mechanism of RNA recombination in poliovirus. Cell 47:433-443.
38. Kitchen, A., L. A. Shackelton, and E. C. Holmes. 2011. Family level phylogenies reveal modes of macroevolution in RNA viruses. Proc. Natl. Acad. Sci. U. S. A. 108:238-243.
39. Knowles, N. J., T. Hovi, T. Hyypiä, A. M. Q. King, A. M. Lindberg, M. A. Pallansch, A. C. Palmenberg, P. Simmonds, T. Skern, G. Stanway, T. Yamashita, and R. Zell. 2012. Picornaviridae, p. 855-880. In A. M. Q. King, M. J. Adams, E. B. Carstens, and E. J. Lefkowitz (eds.), Virus Taxonomy, Ninth Report of the International Committee for the Taxonomy of Viruses. Elsevier Academic Press, Amsterdam.
40. Knowles, N. J., T. Hovi, A. M. Q. King, and G. Stanway. 2010. Overview of Taxonomy, p. 19-32. In E. Ehrenfeld, E. Domingo, and R. P. Roos (eds.), The Picornaviruses. ASM Press, Washington.
41. Knowles, N. J. and A. R. Samuel. 2003. Molecular epidemiology of foot-and-mouth disease virus. Virus Res. 91:65-80.
42. Kong, W. P., G. D. Ghadge, and R. P. Roos. 1994. Involvement of Cardiovirus Leader in Host Cell-Restricted Virus Expression. Proc. Natl. Acad. Sci. U. S. A. 91:1796-1800.
43. Lauber, C. and A. E. Gorbalenya. 2012. Partitioning the Genetic Diversity of a Virus Family: Approach and Evaluation through a Case Study of Picornaviruses. J. Virol. 86:xxxx-yyyy.
44. Lewis-Rogers, N. and K. A. Crandall. 2010. Evolution of Picornaviridae: An examination of phylogenetic relationships and cophylogeny. Mol Phylogenet Evol 54:995-1005.
45. Li, F., G. F. Browning, M. J. Studdert, and B. S. Crabb. 1996. Equine rhinovirus 1 is more closely related to foot-and-mouth disease virus than to other picornaviruses. Proc. Natl. Acad. Sci. U. S. A. 93:990-995.
46. Liang, Z., A. S. M. Kumar, M. S. Jones, N. J. Knowles, and H. L. Lipton. 2008. Phylogenetic Analysis of the Species Theilovirus: Emerging Murine and Human Pathogens. J. Virol. 82:11545-11554.
47. Lu, H. H., X. Y. Li, A. Cuconati, and E. Wimmer. 1995. Analysis of Picornavirus 2A(Pro) Proteins - Separation of Proteinase from Translation and Replication Functions. J. Virol. 69:7445-7452.
48. Lu, H. H. and E. Wimmer. 1996. Poliovirus chimeras replicating under the translational control of genetic elements of hepatitis C virus reveal unusual properties of the internal ribosomal entry site of hepatitis C virus. Proc. Natl. Acad. Sci. U. S. A. 93:1412-1417.
49. Lukashev, A. N. 2010. Recombination among picornaviruses. Rev. Med. Virol. 20:327-337.
50. Martinez-Salas, E. and M. D. Ryan. 2010. Translation and protein processing, p. 141-161. In E. Ehrenfeld, E. Domingo, and R. P. Roos (eds.), The Picornaviruses. ASM Press, Washington.
51. McIntyre, C. L., E. C. M. Leitch, C. Savolainen-Kopra, T. Hovi, and P. Simmonds. 2010. Analysis of Genetic Diversity and Sites of Recombination in Human Rhinovirus Species C. J. Virol. 84:10297-10310.
52. Oberste, M. S., X. Jiang, K. Maher, W. A. Nix, and B. M. Jiang. 2008. The complete genome sequences for three simian enteroviruses isolated from captive primates. Arch Virol 153:2117-2122.
53. Oberste, M. S., K. Maher, and M. A. Pallansch. 2002. Molecular phylogeny and proposed classification of the simian picornaviruses. J. Virol. 76:1244-1251.
54. Oberste, M. S., K. Maher, and M. A. Pallansch. 2007. Complete genome sequences for nine simian enteroviruses. J. Gen. Virol. 88:3360-3372.
55. Palmenberg, A., D. Neubauer, and T. Skern. 2010. Genome Organization and Encoded Proteins, p. 3-17. In E. Ehrenfeld, E. Domingo, and R. P. Roos (eds.), The Picornaviruses. ASM Press, Washington.
56. Palmenberg, A. C., J. A. Rathe, and S. B. Liggett. 2010. Analysis of the complete genome sequences of human rhinovirus. Journal of Allergy and Clinical Immunology 125:1190-1199.
57. Palmenberg, A. C., D. Spiro, R. Kuzmickas, S. Wang, A. Djikeng, J. A. Rathe, C. M. Fraser-Liggett, and S. B. Liggett. 2009. Sequencing and Analyses of All Known Human Rhinovirus Genomes Reveal Structure and Evolution. Science 324:55-59.
58. Piccone, M. E., H. H. Chen, R. P. Roos, and M. J. Grubman. 1996. Construction of a chimeric Theiler's murine encephalomyelitis virus containing the leader gene of foot-and-mouth disease virus. Virology 226:135-139.
59. Piccone, M. E., E. Rieder, P. W. Mason, and M. J. Grubman. 1995. The Foot-And-Mouth-Disease Virus Leader Proteinase Gene Is Not Required for Viral Replication. J. Virol. 69:5376-5382.
60. Pringle, C. R. 1991. The 20th Meeting of the Executive-Committee of the International-Committee-On-Virus-Taxonomy - Virus Species, Higher Taxa, A Universal Virus Database, and Other Matters. Arch Virol 119:303-304.
61. Raup, D. M. 1994. The Role of Extinction in Evolution. Proc. Natl. Acad. Sci. U. S. A. 91:6758-6763.
62. Reuter, G., A. Boldizsar, and P. Pankovics. 2009. Complete nucleotide and amino acid sequences and genetic organization of porcine kobuvirus, a member of a new species in the genus Kobuvirus, family Picornaviridae. Arch Virol 154:101-108.
63. Schmidt, H. A., K. Strimmer, M. Vingron, and A. von Haeseler. 2002. TREE-PUZZLE: maximum likelihood phylogenetic analysis using quartets and parallel computing. Bioinformatics 18:502-504.
64. Semler, B. L. and E. Wimmer. 2002. Molecular biology of picornaviruses. ASM Press, Washington, DC, U.S.A.
65. Sepkoski, J. J. 1998. Rates of speciation in the fossil record. Philos Trans R Soc Lond B Biol Sci 353:315-326.
66. Shapiro, B., A. Rambaut, O. G. Pybus, and E. C. Holmes. 2006. A phylogenetic method for detecting positive epistasis in gene sequences and its application to RNA virus evolution. Mol. Biol. Evol. 23:1724-1730.
67. Sidorov, I. A., D. V. Samborskiy, A. M. Leontovich, and A. E. Gorbalenya. 2009. SARGENS, Similarity-based Automatic Retrieval of Genetic Sequences. http://veb.lumc.nl/SARGENS.
68. Simmonds, P. 2006. Recombination and selection in the evolution of picornaviruses and other mammalian positive-stranded RNA viruses. J. Virol. 80:11124-11140.
69. Simmonds, P., C. McIntyre, C. Savolainen-Kopra, C. Tapparel, I. M. Mackay, and T. Hovi. 2010. Proposals for the classification of human rhinovirus species C into genotypically assigned types. J. Gen. Virol. 91:2409-2419.
70. Stanway, G., F. Brown, P. Christian, T. Hovi, T. Hyypiae, A. M. Q. King, N. J. Knowles, S. M. Lemon, P. D. Minor, M. A. Pallansch, A. C. Palmenberg, and T. Skern. 2005. Picornaviridae, p. 757-778. In C. M. Fauquet, M. A. Mayo, J. Maniloff, U. Desselberger, and L. A. Ball (eds.), Virus Taxonomy, Eighth report of the International Committee on Taxonomy of Viruses. Elsevier Academic Press.
71. Steil, B. P. and D. J. Barton. 2009. Cis-active RNA elements (CREs) and picornavirus RNA replication. Virus Res. 139:240-252.
72. Tolskaya, E. A., L. A. Romanova, M. S. Kolesnikova, and V. I. Agol. 1983. Intertypic Recombination in Poliovirus - Genetic and Biochemical-Studies. Virology 124:121-132.
73. van Ooij, M. J. M., D. A. Vogt, A. Paul, C. Castro, J. Kuijpers, F. J. M. van Kuppeveld, C. E. Cameron, E. Wimmer, R. Andino, and W. J. G. Melchers. 2006. Structural and functional characterization of the coxsackievirus B3 CRE(2C): role of CRE(2C) in negative- and positive-strand RNA synthesis. J. Gen. Virol. 87:103-113.
74. Van Regenmortel, M. H. V. 1989. Applying the Species Concept to Plant-Viruses. Arch Virol 104:1-17.
75. Van Regenmortel, M. H. V. 2003. Viruses are real, virus species are man-made, taxonomic constructions. Arch Virol 148:2481-2488.
76. Van Regenmortel, M. H. V. 2007. Virus species and virus identification: Past and current controversies. Inf Genet Evol 7:133-144.
77. Whelan, S. and N. Goldman. 2001. A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach. Mol. Biol. Evol. 18:691-699.
78. Wimmer, E. and A. Paul. 2010. Making of a picornavirus genome, p. 33-55. In E. Ehrenfeld, E. Domingo, and R. P. Roos (eds.), The Picornaviruses. ASM Press, Washington.
79. Wutz, G., H. Auer, N. Nowotny, B. Grosse, T. Skern, and E. Kuechler. 1996. Equine rhinovirus serotypes 1 and 2: Relationship to each other and to aphthoviruses and cardioviruses. J. Gen. Virol. 77:1719-1730.
80. Yang, Y., R. Rijnbrand, K. L. McKnight, E. Wimmer, A. Paul, A. Martin, and S. M. Lemon. 2002. Sequence requirements for viral RNA replication and VPg uridylylation directed by the internal cis-acting replication element (cre) of human rhinovirus type 14. J. Virol. 76:7485-7494.
81. Yang, Y., M. K. Yi, D. J. Evans, P. Simmonds, and S. M. Lemon. 2008. Identification of a Conserved RNA Replication Element (cre) within the 3D(pol)-Coding Sequence of Hepatoviruses. J. Virol. 82:10118-10128.
82. Zheng, T. 2007. Characterisation of two enteroviruses isolated from Australian brushtail possums (Trichosurus vulpecula) in New Zealand. Arch Virol 152:191-198.

Figure Legends
Figure 1. Picornavirus genome organization. The organization of the picornavirus genome
is shown using Porcine sapelovirus as an example. Products derived from cleavage of the
encoded polyprotein are indicated by rectangles and names. They include structural
proteins (dark-grey background) forming virus particles, non-structural/accessory
proteins (light-grey) involved in replication and expression, and the leader protein
(white), which is not found in all picornaviruses. Horizontal bars below highlight the
six proteins conserved across the family, a concatenated, picornavirus-wide multiple
alignment of which forms the dataset of this study.

Figure 2. Phylogeny and GENETIC classification of the Picornaviridae. Shown is a maximum
likelihood phylogeny of 38 picornaviruses representing species diversity based on the
family-wide conserved proteins 1B, 1C, 1D, 2C, 3C and 3D. A Bayesian analysis resulted in
an identical tree topology (data not shown). The part of the tree representing the
ICTV-defined 28 species and 12 genera is drawn in black, and provisional or currently not
recognized taxa in grey. Clusters equivalent to ICTV genera are highlighted by colored
ovals. A split of Aphthovirus according to the GENETIC classification is indicated (white
line). Genera with identical coloring unite into a total of 11 super-genera identified in
this study. The viruses shown represent the following species (italics) or species-like
clusters according to the GENETIC classification: Porcine sapelovirus (PSV), Simian
sapelovirus (SiSV), Avian sapelovirus (AvSV), Human rhinovirus A (HRV-A), Human
rhinovirus A2 (HRV-A2), Human rhinovirus B (HRV-B), Human rhinovirus C1 (HRV-C1), Human
rhinovirus C2 (HRV-C2), Human rhinovirus C3 (HRV-C3), Human enterovirus A (HEV-A), Human
enterovirus B (HEV-B), Human enterovirus C1 (HEV-C1), Human enterovirus D (HEV-D), Simian
enterovirus A (SiEV-A), Simian enterovirus B (SiEV-B), Porcine enterovirus B (PEV-B),
Bovine enterovirus (BEV), Bovine kobuvirus (BKoV), Aichi virus (AiV), Salivirus A
(SaliV-A), Human parechovirus (HPeV), Ljungan virus (LjV), Duck hepatitis A virus (DuHV),
Aquamavirus A (AqV-A), Hepatitis A virus (HAV), Avian encephalomyelitis virus (AvEMV),
Foot-and-mouth disease virus (FMDV), Bovine rhinitis B virus (BRBV), Equine rhinitis A
virus (ERAV), Equine rhinitis B virus (ERBV), Theilovirus (TMEV), Encephalomyocarditis
virus (EMCV), Seneca Valley virus (SVV), Human cosavirus A (CosaV-A), Human cosavirus B
(CosaV-B), Human cosavirus D (CosaV-D), Human cosavirus E (CosaV-E), Porcine teschovirus
(PTeV). Numbers at branch points provide support values from 1000 non-parametric
bootstraps. The scale bar represents 0.5 amino acid substitutions per site on average.
886
Figure 3. Intragroup genetic divergence and species sampling size. (A) Box-and-whisker graphs were used to plot distributions of distances between viruses from the same species (orange), between viruses from different species but the same genus (blue) and between viruses from different genera but the same super-genus (purple). The boxes span from the first to the third quartile and include the median (bold line), and the whiskers (dashed lines) extend to the extreme values. For name abbreviations see the legend of Fig. 2; numbers in brackets correspond to the number of sequences per species; open and filled diamonds indicate single and multiple host species range, respectively. Genera and super-genera comprising only one species are not shown. The corresponding first half of the PED distribution (see (43)) is depicted below. Phylogenetic relationships of the 38 picornavirus species are shown by the cladogram to the left (following the topology in Fig. 2) with intra-genus relations collapsed. Colored shapes indicate those taxa that contribute to the intragroup distances to the right. Species and genera currently not recognized by ICTV are marked with asterisks, and discrepancies between the ICTV taxonomy and the GENETIC classification (not caused by recently discovered viruses) are highlighted in red. (B) The relationship between sampling size and maximum intragroup genetic divergence is shown for each species.
Figure 4. Phylogeny of rhinoviruses. Shown is an ML phylogeny for 140 rhinoviruses based on the family-wide conserved proteins 1B, 1C, 1D, 2C, 3C and 3D. SH-like support values are shown for basal branching events. Species taxa recognized by the GENETIC classification are indicated (see also the legend of Fig. 2). A minimal set of viruses sufficient to explain all violating PEDs that exceed the species distance threshold is highlighted by grey dots (see Table 2 for details on the viruses involved). The scale bar represents 0.1 amino acid substitutions per site on average.
Figure 5. Taxonomy diagram and comparison of classification frameworks. Shown is a taxonomy diagram for a classification under the ICTV framework (A) and the DEmARC framework (B). For simplicity, the GENETIC classification is visualized in both cases and super-genera are omitted for ICTV. Inter-virus genetic divergence (as PED) increases linearly (arrow) from the perimeter (PED of zero) toward the centre of the circle (maximum PED of 2.78). Applied distance thresholds are shown as black dots and the delimited taxa as rectangle-like shapes. Taxa are filled using the coloring scheme from Figure 3; the three basic colors represent, respectively, the species (orange), genus (blue) and super-genus (purple) level. Each color exists in two shadings that highlight, respectively, the limit on intragroup genetic divergence according to a distance threshold (soft shading) and the maximum observed intragroup genetic divergence (bright shading) of a taxon. Outside the circle, the relative density of virus sampling per species is shown as grey shadings from low (light) to high (dark) sampling, which is in the range of 1 (least sampled species) to 260 (most sampled species). For simplicity, species identities are indicated via a binary system where the first and the second number represent the genus and the species, respectively, as defined in the common legend below the circles. (A) ICTV treats each genus independently (different heights of genus shapes) and species must conform to genus-specific distance thresholds (equal heights of species shapes only within the same genus). (B) In the DEmARC framework taxa are treated equally at each level and they must conform to family-wide distance thresholds (equal, level-specific heights of taxon shapes). The space inside taxon shapes colored in soft shading highlights the genetic diversity that may be missed by the current picornavirus sampling, when assuming a universal, level-wide threshold that limits the actual diversity of each taxon.
Table 1. Differences between the GENETIC classification and the ICTV taxonomy at the species level.

virus(a)                                      difference type(b)  ICTV(c)  GENETIC(d)  quality(e)
Simian picornavirus 17                        new                 -        HEV-B       1
Simian picornavirus 13                        new                 -        HEV-A       1
Simian enterovirus SV19, SV43                 new                 -        HEV-A       1
Saffold virus                                 new                 -        TheiloV     1
Possum enterovirus W1, W6                     new                 -        BEV         1
Seal picornavirus type 1                      new                 -        AqV-A*      -
Simian enterovirus N125, N203, SV6            new                 -        SiEV-B*     1
Enterovirus 103 isolate POo-1                 new                 -        SiEV-B*     1
Human cosavirus A1, A2                        new                 -        CosV-A*     1
Human cosavirus B                             new                 -        CosV-B*     -
Human cosavirus D                             new                 -        CosV-C*     -
Human cosavirus E                             new                 -        CosV-D*     -
Salivirus NG-J1, Human klassevirus 1          new                 -        SaliV-A*    1
Porcine kobuvirus S-1-HUN, K-30-HUN           new                 -        BKoV        .833
Human rhinovirus VR-1118, VR-1155, VR-1301    new                 -        HRV-Aβ*     1
Human rhinovirus C 026, NY-074, NAT001, QPM   mm                  HRV-C    HRV-Cα*     1
Human rhinovirus C 025                        mm                  HRV-C    HRV-Cβ*     -
Human rhinovirus C N4, N10, NAT045            mm                  HRV-C    HRV-Cγ*     .500

(a) Shown is the Definition field value in the GenBank annotation of one or several viruses.
(b) A virus was not available or was assigned to a tentative species at the time of the ICTV release (new); a mismatch was observed between the ICTV taxonomy and the GENETIC classification (mm).
(c) The species to which the virus is assigned in the ICTV taxonomy; - if not available at the time.
(d) The species to which the virus was assigned in the GENETIC classification; new species proposed by the GENETIC classification are indicated using asterisks; for species abbreviations see the legend of Fig. 2.
(e) The proportion of intra-species PED values not exceeding the species distance threshold; - for clusters with fewer than 3 viruses.
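The quality measure defined in footnote e of Table 1 can be illustrated with a minimal Python sketch. The function name `cluster_quality`, the virus labels, the distances and the threshold of 0.4 are all hypothetical and chosen only for illustration; they are not values or code from this study.

```python
from itertools import combinations


def cluster_quality(viruses, ped, threshold):
    """Proportion of intra-cluster pairwise distances not exceeding `threshold`.

    `ped` maps a sorted pair of virus names to their pairwise distance.
    Returns None for clusters with fewer than 3 viruses, mirroring the '-'
    entries in the quality column of Table 1.
    """
    if len(viruses) < 3:
        return None
    pairs = list(combinations(sorted(viruses), 2))
    within = sum(1 for pair in pairs if ped[pair] <= threshold)
    return within / len(pairs)


# Hypothetical distances for a four-virus cluster (not data from the study).
example_ped = {
    ("v1", "v2"): 0.10, ("v1", "v3"): 0.45, ("v1", "v4"): 0.50,
    ("v2", "v3"): 0.20, ("v2", "v4"): 0.42, ("v3", "v4"): 0.30,
}
print(cluster_quality(["v1", "v2", "v3", "v4"], example_ped, threshold=0.4))
```

In this made-up cluster, three of the six pairwise distances stay within the assumed threshold, so the sketch reports a quality of 0.5.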
Table 2. Violations of a distance threshold in the GENETIC classification.

accession  virus(a)                            threshold    violations(b)  cost(c)
FJ445152   Human rhinovirus 71, ATCC VR-1181   species      33             .902
FJ445136   Human rhinovirus 51, ATCC VR-1161   species      17             .770
GQ415052   Human rhinovirus A, hrv-A101-v1     species      16             .707
FJ445147   Human rhinovirus 65, ATCC VR-1175   species      14             .577
FJ445156   Human rhinovirus 80, ATCC VR-1190   species      14             .431
GQ415051   Human rhinovirus A, hrv-A101        species      13             .434
FJ445120   Human rhinovirus 20, ATCC VR-1130   species      13             .393
DQ473507   Human rhinovirus 53                 species      11             .285
FJ445150   Human rhinovirus 68, ATCC VR-1178   species      11             .187
DQ473508   Human rhinovirus 28                 species      10             .255
DQ473506   Human rhinovirus 46                 species      6              .154
FJ445183   Human rhinovirus 78, ATCC VR-1188   species      6              .149
EF173418   Human rhinovirus 78                 species      6              .130
DQ473497   Human rhinovirus 23                 species      1              .003
NC_009996  Human rhinovirus C                  species      2              .100
EF077280   Human rhinovirus NAT045             species      1              .049
NC_004421  Bovine kobuvirus                    species      1              .011
AF119795   Enterovirus 71, TW/2272/98          genus        21             .157
NC_006553  Avian sapelovirus                   super-genus  7              .195

(a) Definition field value in the GenBank annotation; viruses of the same taxon are separated from others by an empty row; only the minimal subset of violating viruses sufficient to explain all violating PEDs is listed.
(b) Number of PEDs exceeding the respective distance threshold.
(c) Cumulative value of the disagreement of a virus with the respective distance threshold; calculated as the virus-specific clustering cost (see (43)) using the threshold as a unit.
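The violation counts of footnote b, and the idea of a small set of viruses that explains all violating PEDs, can be sketched as follows. This is only an illustration under stated assumptions: the function names and the toy distances are hypothetical, and the greedy selection is a simple heuristic swapped in for illustration that does not guarantee a truly minimal set; the procedure actually used in the study is not described in this section. The cost column is not reproduced here because its definition is given in (43).

```python
from collections import Counter


def violation_counts(ped, threshold):
    """Count, per virus, the pairwise distances exceeding `threshold`."""
    counts = Counter()
    for (a, b), distance in ped.items():
        if distance > threshold:
            counts[a] += 1
            counts[b] += 1
    return counts


def greedy_explaining_set(ped, threshold):
    """Greedily pick viruses until every violating pair contains a picked virus.

    Heuristic only: the result covers all violating pairs but is not
    guaranteed to be the smallest possible set.
    """
    remaining = {pair for pair, distance in ped.items() if distance > threshold}
    chosen = []
    while remaining:
        counts = Counter(virus for pair in remaining for virus in pair)
        # Most violations first; ties are broken alphabetically for determinism.
        best = min(counts, key=lambda virus: (-counts[virus], virus))
        chosen.append(best)
        remaining = {pair for pair in remaining if best not in pair}
    return chosen


# Hypothetical intra-species distances (not data from the study).
toy_ped = {
    ("a", "b"): 0.50, ("a", "c"): 0.60, ("a", "d"): 0.20,
    ("b", "c"): 0.10, ("b", "d"): 0.30, ("c", "d"): 0.45,
}
print(dict(violation_counts(toy_ped, threshold=0.4)))
print(greedy_explaining_set(toy_ped, threshold=0.4))
```

With these toy values, three pairs exceed the assumed threshold, and the two viruses selected by the heuristic between them take part in every violating pair.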
[Figure 1, Lauber & Gorbalenya (b). Genome map with a nucleotide scale from 1 to 8000 nt, oriented 5' to 3'; encoded proteins L, 1A, 1B, 1C, 1D, 2A, 2B, 2C, 3A, 3B, 3C and 3D; regions labeled structural genes and non-structural genes.]
[Figure 2, Lauber & Gorbalenya (b). Phylogenetic tree of the species and species-like clusters listed in the legend, with bootstrap support values at branch points and a scale bar of 0.5; genus labels: Aphthovirus, Erbovirus, Cosavirus, Teschovirus, Aquamavirus, Avihepatovirus, Parechovirus, Hepatovirus, Tremovirus, Kobuvirus, Salivirus, Sapelovirus, Senecavirus, Cardiovirus, Enterovirus.]
[Figure 3, Lauber & Gorbalenya (b). Panel A: box-and-whisker plots of PED (axis from 0 to 1.2) per species, with the sampling size (n) in brackets and a genus-level cladogram to the left; distance categories: intra-species; inter-species, intra-genus; inter-species, inter-genus, intra-super-genus; symbols: single host species, multiple host species; * not recognized by ICTV. Panel B: max PED (axis from 0 to 0.4) plotted against virus sampling size (axis from 0 to 250).]
[Figure 4, Lauber & Gorbalenya (b). Rhinovirus phylogeny with clusters HRV-A, HRV-Aβ, HRV-B, HRV-Cα, HRV-Cβ and HRV-Cγ; support values at basal branches; scale bar of 0.1.]
[Figure 5, Lauber & Gorbalenya (b). Two circular taxonomy diagrams, panel A (ICTV) and panel B (DEmARC), each with an arrow labeled inter-virus divergence and taxa labeled by genus.species number. Common legend:
Hepatovirus: 1.1 HAV
Tremovirus: 2.1 AvEMV
Teschovirus: 3.1 PTeV
Kobuvirus: 4.1 AiV, 4.2 BKoV
Salivirus*: 5.1 SaliV-A*
Aphthovirus: 6.1 FMDV, 6.2 BRBV, 7.1 ERAV
Parechovirus: 8.1 HPeV, 8.2 LjV
Cardiovirus: 9.1 TheiloV, 9.2 EMCV
Senecavirus: 10.1 SVV
Enterovirus: 11.1 HEV-C, 11.2 HEV-A, 11.3 HEV-B, 11.4 BEV, 11.5 SiEV-B*, 11.6 PEV-B, 11.7 HEV-D, 11.8 HRV-A, 11.9 HRV-B, 11.10 HRV-Cα*, 11.11 HRV-Cγ*, 11.12 HRV-Cβ*, 11.13 HRV-Aβ*, 11.14 SiEV-A
Sapelovirus: 12.1 SiSV, 12.2 PSV, 12.3 AvSV
Avihepatovirus: 13.1 DuHV
Cosavirus*: 14.1 CosV-A*, 14.2 CosV-B*, 14.3 CosV-D*, 14.4 CosV-E*
Erbovirus: 15.1 ERBV
Aquamavirus*: 16.1 AqV-A*]