

DISCLAIMER

This paper was submitted to the Bulletin of the World Health Organization and was posted to the Zika open site, according to the protocol for public health emergencies for international concern as described in Christopher Dye et al. (http://dx.doi.org/10.2471/BLT.16.170860). The information herein is available for unrestricted use, distribution and reproduction in any medium, provided that the original work is properly cited as indicated by the Creative Commons Attribution 3.0 Intergovernmental Organizations licence (CC BY IGO 3.0).

RECOMMENDED CITATION

Shrinet J, Agrawal A, Bhatnagar RK, Sujatha Sunil S. Analysis of the genetic divergence in Asian strains of ZIKA virus with reference to 2015-2016 outbreaks. [Submitted]. Bull World Health Organ. E-pub: 22 Apr 2016. doi: http://dx.doi.org/10.2471/BLT.16.176065

Analysis of the genetic divergence in Asian strains of ZIKA virus with reference to 2015-2016 outbreaks

Jatin Shrinet (a), Aditi Agrawal (a), Raj K Bhatnagar (a) & Sujatha Sunil (a)

(a) International Centre for Genetic Engineering and Biotechnology, New Delhi-110067, India

Correspondence to: Sujatha Sunil (e-mail: [email protected])

(Submitted: 20 April 2016 – Published online: 22 April 2016)


Abstract

Objective: To compare Zika virus (ZIKV) genomes of the 2015-2016 outbreaks with the older strains and evaluate evolution of ZIKV.

Method: We performed several genetic analyses on the 50 ZIKV genomes currently available in the public domain. Phylogenetic and mutation analysis, recombination analysis, and molecular evolution and selection analysis were used to identify amino acid variations unique to the 2015-2016 outbreak strains and to assess the status of recombination and evolution amongst these sequences.

Findings: We report distinct amino acid variations in the structural and non-structural proteins of all 2015-2016 outbreak strains that are conserved amongst these strains. Our results also reveal unique motifs in the UTRs of the new ZIKV strains. We identified recombination events in the African strains but not in the recent isolates of Asian lineage. Population level analysis revealed overdominant selection of alleles in the genome.

Conclusion: 2015-2016 strains of ZIKV show distinct molecular signatures in their genomes that are conserved across strains isolated from different parts of the globe during the outbreak period. Our analysis at the population level points to the possibility of balancing selection of the alleles.

Introduction

Arboviruses are an important group of viruses of medical relevance due to the wide range of illnesses they cause. In the last two decades, infections caused by these viruses have been major public health concerns, resulting in pandemics and epidemics (1, 2). The latest addition to this list is Zika virus (ZIKV), with the World Health Organization declaring Zika fever (ZF) a Public Health Emergency of International Concern due to its possible association with neurological and birth conditions (3).

Zika virus is a member of the genus Flavivirus, family Flaviviridae (4), which includes other medically important flaviviruses such as dengue, yellow fever, West Nile and Japanese encephalitis viruses. The virus was originally maintained in a sylvatic cycle (5) and was first isolated from a Macaca monkey in 1947 in the Zika forest region of Uganda (6). Under these conditions humans are considered to be incidental hosts; however, in the absence of non-human primates, humans probably serve as the primary amplification hosts (7). The first human case was reported in 1954 in Nigeria (8), and sporadic cases have been reported from different regions around the globe over the years (9-12). In addition to clinical cases, isolation of ZIKV from vectors has also been reported (13-15).


The ZIKV genome consists of a 10794-nucleotide single-stranded RNA of positive sense encoding a single open reading frame (ORF). Flanked by two non-coding regions (5’ and 3’ untranslated regions), the ORF encodes a polyprotein, C-prM-E-NS1-NS2A-NS2B-NS3-NS4A-NS4B-NS5, which is cleaved into three structural proteins, namely capsid (C), premembrane/membrane (prM) and envelope (E), and seven non-structural proteins (NS1, NS2A, NS2B, NS3, NS4A, NS4B, NS5) (16, 17). Based on serologic and genetic properties, three lineages, namely East African, West African and Asian, have been identified (18).

In 2015, the Americas witnessed a huge outbreak of ZF with neurological implications and symptoms of Guillain-Barré syndrome in affected individuals (19). Epidemiological studies indicate that the transmission originated on Yap in Micronesia in 2007 (18) and then spread to other Pacific islands (20) and to South and Central America (21). With the rapid spread of this virus to several parts of the globe, it is imperative to understand the cause of the spread. Until 2012, only eight genomes were available; since then, 42 more have been reported in the public domain (as of 20 March 2016), of which 25 were reported after January 2016. Analyzing these genomes at a molecular level may reveal the genetic divergence of the newer viruses, thereby providing insights into the evolution of the virus. The present report is a bioinformatics characterization of the genomes of ZIKV isolated post 2015 and a comparison with the older strains of ZIKV.

Materials and Methods

Genome sequences and Phylogenetic analysis

A total of 50 ZIKV genome sequences were retrieved from the NCBI database. The sequences were aligned and manually edited to discard any aberrations. Twelve gene sequences of ZIKV, namely Capsid, pr, M, Envelope (E), NS1, NS2A, NS2B, NS3, NS4A, 2K, NS4B and NS5, were extracted from the aligned genome sequences and used for analysis of variations in the proteins. Phylogenetic analysis of the trimmed genome sequences was performed using MEGA6 (22). The Neighbor-joining method, the Minimum Evolution method with Gamma parameter 1 and 100 bootstrap replications, the Maximum Likelihood method, the UPGMA method and the Maximum Parsimony method were used to construct the phylogenetic tree. The phylogeny test was performed using the bootstrap method with 1000 bootstrap replications.
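The bootstrap test mentioned above can be illustrated with a minimal sketch. This is not MEGA6's implementation: the toy alignment, the isolate names and the use of a plain p-distance are assumptions made only for illustration. The sketch resamples alignment columns with replacement to produce 1000 pseudo-replicates; in a real analysis a tree would be rebuilt from each replicate and clade frequencies reported as bootstrap support.

```python
import random


def p_distance(seq_a, seq_b):
    """Proportion of aligned sites that differ, skipping sites with a gap."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement (one pseudo-replicate)."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in columns)
            for name, seq in alignment.items()}


# Hypothetical toy alignment; real input would be the aligned ZIKV genomes.
toy_alignment = {
    "isolate_A": "ATGAAAAACC",
    "isolate_B": "ATGAAGAACC",
    "isolate_C": "ATGAGGAATC",
}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_replicate(toy_alignment, rng) for _ in range(1000)]

# Pairwise distances on the original alignment.
d_ab = p_distance(toy_alignment["isolate_A"], toy_alignment["isolate_B"])
d_ac = p_distance(toy_alignment["isolate_A"], toy_alignment["isolate_C"])
```

Each replicate keeps the same number of columns as the original alignment, which is what makes the resulting support values comparable across clades.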

UTR analysis

5’ UTR and 3’ UTR sequences were extracted from the genomes and aligned using MEGA6. UTR sequences were not available for all the genomes, and some genomes had only short UTR sequences. The aligned sequences were then analyzed to study the conservation of residues. The aligned sequences were also submitted to the RNAalifold web server to predict consensus secondary structures of both UTRs (23).
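As a rough illustration of the residue-conservation step, the sketch below scores each column of an aligned set of UTR sequences. The input sequences are invented, and treating gaps and `N` as "not sequenced" is an assumption of this sketch rather than something stated in the methods.

```python
from collections import Counter


def column_conservation(aligned_seqs):
    """Return (most common base, its fraction) for every alignment column.

    Gaps ('-') and ambiguous bases ('N') are ignored, so a column is scored
    only on the isolates that actually have sequence at that position.
    """
    result = []
    for column in zip(*aligned_seqs):
        bases = [b for b in column if b not in ("-", "N")]
        if not bases:
            result.append((None, 0.0))
            continue
        base, count = Counter(bases).most_common(1)[0]
        result.append((base, count / len(bases)))
    return result


# Hypothetical aligned UTR fragments; the last one is truncated at the 5' end.
utr_fragments = ["AGTTG", "AGTTG", "AGCTG", "--TTG"]
scores = column_conservation(utr_fragments)
fully_conserved = [i for i, (_, frac) in enumerate(scores) if frac == 1.0]
```

Ignoring missing positions matters here because, as noted above, several genomes carry short or absent UTR sequences.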

Recombination analysis

The aligned ZIKV genome sequences were subjected to recombination analysis using the RDP tool (24). RDP analyzes the sequences using seven methods, namely RDP (R), GENECONV (G), MaxChi (M), Chimaera (C), Bootscan (B), 3Seq (T) and SiScan (S). Events that were predicted by more than five methods, had no unknown parent and had a p-value < 0.05 were considered recombination events.
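The acceptance criteria above can be written as a small filter. The `Event` record, its field names and the example values are hypothetical and do not reflect the RDP output format; the sketch only encodes the three stated conditions (more than five supporting methods, no unknown parent, p-value < 0.05).

```python
from dataclasses import dataclass


@dataclass
class Event:
    recombinant: str
    major_parent: str  # "Unknown" when a parent could not be identified
    minor_parent: str
    p_values: dict     # method letter -> p-value for methods that flagged the event


def is_accepted(event, min_methods=5, alpha=0.05):
    """Apply the stated criteria: > min_methods significant methods, known parents."""
    supporting = [m for m, p in event.p_values.items() if p < alpha]
    parents_known = "Unknown" not in (event.major_parent, event.minor_parent)
    return len(supporting) > min_methods and parents_known


# Invented examples, one per outcome.
six_methods = dict.fromkeys("RGMCBT", 0.001)
five_methods = dict.fromkeys("RGMCB", 0.001)

accepted = Event("seq1", "seq2", "seq3", six_methods)
unknown_parent = Event("seq1", "Unknown", "seq3", six_methods)
too_few_methods = Event("seq1", "seq2", "seq3", five_methods)
```

Note that "more than five" is read strictly here, so an event supported by exactly five methods is rejected.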

Molecular evolution and selection analysis

The transition/transversion bias, substitution matrix and overall mean distance were calculated using MEGA6. Tajima’s test of neutrality was also performed using MEGA6.
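To make the transition/transversion distinction concrete, the sketch below simply counts the two classes of difference between two aligned sequences. This raw count is not the model-based bias that MEGA6 estimates; the sequences are invented and are assumed to be written with T rather than U.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify(a, b):
    """Return 'transition', 'transversion' or None for a pair of aligned bases."""
    if a == b or a not in "ACGT" or b not in "ACGT":
        return None
    pair = {a, b}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"    # purine<->purine or pyrimidine<->pyrimidine
    return "transversion"      # purine<->pyrimidine


def count_ts_tv(seq_a, seq_b):
    """Count transitions and transversions between two aligned sequences."""
    kinds = [classify(a, b) for a, b in zip(seq_a, seq_b)]
    return kinds.count("transition"), kinds.count("transversion")


# Hypothetical aligned fragments: A/G and C/T are transitions, A/C a transversion.
ts, tv = count_ts_tv("ACGTACA", "GCGTATC")
```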

Results and Discussion

Details of the fifty ZIKV genomes used in the study are listed in Supplementary Table 1. Of these sequences, 15 were from 2015 and ten from 2016 (as of March 2016). Amongst the remaining sequences, two each were isolated in 2014, 2013, 2001, 1974 and 1968, and one each in 2012, 2010, 2007, 2000, 1997, 1984, 1976 and 1966. The isolation date was not available for seven sequences. In terms of geographical distribution, nine sequences were from Brazil, all isolated in 2015. Two of the 2015 sequences were isolated from Guatemala, and one each from Suriname, Puerto Rico, Martinique and Colombia. Several of these sequences have previously been used to study the molecular evolution of ZIKV (25, 26).


The phylogenetic tree of the 50 ZIKV sequences was constructed using the Neighbor-joining method (Figure 1). Trees were also constructed using other methods, namely the Minimum Evolution method with Gamma parameter 1 and 100 bootstrap replications, the Maximum Likelihood method, the UPGMA method and the Maximum Parsimony method with 1000 bootstrap replications (Supplementary Figure 1). The 2015-2016 sequences showed similarity to the Asian lineage and grouped in the same clade. These results indicate that the Asian strain caused the recent outbreak in the western part of the world, as reported by others (27).

To study the molecular variations specific to Asian strains, the Malaysian isolate (HQ234499.1; 1966) (13) was used as the reference for all further analyses. Sequence comparison of the structural and non-structural ZIKV proteins revealed several variations in the 2015-2016 genomes, which are discussed in detail below.

Sequence analysis of the 2015-2016 isolates with Asian genotype

Structural region

Outbreak samples from 2015 and 2016 (n=25) were compared against the 1966 sequence from Malaysia. Nucleotide variations were too numerous to discuss here. With respect to amino acid variations, the structural proteins showed several variations in their sequences, indicating a high mutation rate in the new ZIKV strains (28). The variations observed were classified into two categories: those seen in all 2015-2016 samples and those that were strain-specific. Both categories are discussed in the following sections. To make the common variations easier to analyze, consensus sequences were derived for each region and year and compared with the reference sequence (Table 1a). Capsid showed variations at five aa positions, namely N25S, L27F, R101K, I110V and I113V, in all the sequences. Amino acid variations in individual samples are listed in Table 1b. In Capsid, apart from the above-mentioned variations, sequence KU729218.1 (from Brazil) showed the variation G105S. Five samples, KU647676.1 (from Martinique), KU820897.1 (from Colombia), KU820898.1 (from China), KU922960.1 and KU922923.1 (from Mexico), showed the variation D107E. Sequences KU866423.1 and KU820899.2 (from China) showed the variation S109N. The variation E76D was seen in KU744693.1 (from China).
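The split into conserved and strain-specific variations amounts to a set intersection over per-isolate substitution lists. The sketch below is only an illustration of that idea, using the Capsid entries of three isolates as listed in Table 1b; the function and variable names are assumptions of this sketch, not part of the original analysis pipeline.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    ref: str       # residue in the reference (Malaysia, 1966)
    position: int  # position within the protein
    alt: str       # residue in the outbreak isolate


def parse_substitution(label):
    """Parse a label such as 'N25S' into (ref, position, alt)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label.strip())
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    ref, position, alt = match.groups()
    return Substitution(ref, int(position), alt)


# Capsid variations of three isolates, copied from Table 1b.
capsid = {
    "KU501215.1": "N25S, L27F, R101K, I110V, I113V",
    "KU729217.2": "N25S, L27F, R101K, I110V, I113V",
    "KU729218.1": "N25S, L27F, R101K, G105S, I110V, I113V",
}

parsed = {isolate: {parse_substitution(x) for x in labels.split(",")}
          for isolate, labels in capsid.items()}

# Variations present in every isolate versus those private to one isolate.
shared = set.intersection(*parsed.values())
strain_specific = {isolate: subs - shared for isolate, subs in parsed.items()}
```

With these three isolates, the five common Capsid changes fall into `shared` and G105S is reported as specific to KU729218.1, matching the description above.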

Sequence comparison of the pr protein showed three aa variations, namely V1A, S17N and V31M, in all the 2015 and 2016 sequences (Table 1a). Sequence KU312312.1 (from Suriname) showed an additional change, M44T. The M protein showed a single amino acid variation, P72L, in one sequence from Mexico (KU922923.1) (Table 1b).

Comparison of the Envelope protein of the 2015-2016 ZIKV isolates with the reference Malaysian strain revealed changes at three positions, D393E, V473M and T487M, in all sequences (Table 1a). The amino acid variations T47S, S64T, M68I and V255A were observed in KU729217.2 (from Brazil). Sequences KU729218.1 and KU497555.1 (from Brazil) showed the changes M349T and S260T respectively. Sequences KU501216.1 and KU501217.1 (from Guatemala) showed the V56I variation, isolate KU312312.1 (from Suriname) showed T479A, and sequences KU866423.1 and KU820899.2 (from China) displayed the K419R variation. Of special mention is one isolate from China (KU744693.1) that displayed a total of 12 aa variations, including the three conserved changes (Table 1b).

Non-structural region

Comparison of the non-structural protein sequences of the 25 isolates from 2016 (n=10) and 2015 (n=15) with the reference sequence from Malaysia (HQ234499.1) indicates that the non-structural proteins of ZIKV are more conserved than the structural proteins. NS1, NS2A, NS2B and NS3 showed very few conserved changes compared with NS4B and NS5, which showed 7 and 15 aa variations respectively.

NS1 showed two changes, A188V and V264M, that were present in all 25 sequences (Table 2a). Sequences KU729217.1 and KU321639.1 (from Brazil) have additional aa variations, G190E and Y122H respectively (Table 2b). Two different variations were observed at position R324: sequences KU647676.1 (from Martinique), KU922923.1 and KU922960.1 (from Mexico) and KU820897.1 (from Colombia) showed R324W, whereas KU866423.1 and KU820899.2 (from China) had R324Q, indicating the evolving nature of the site. KU853013.1 (from Italy) has an additional variation, M349V. Sequences KU501216.1 and KU501217.1 (from Guatemala) showed an additional variation, G100A. One isolate from China (KU744693.1) showed a total of seven variations, including the two conserved variations (Table 2b).

NS2A has only one conserved change, A143V, which was present in all the sequences. Analysis of individual protein sequences showed some additional variations: L113F in KU497555.1 (Brazil), and I80T in KU647676.1 (Martinique), KU922923.1 and KU922960.1 (Mexico). The variation I139V was present in three sequences from China (KU820898.1, KU740184.2 and KU761564.1). The NS2B protein was conserved when the 2015-2016 sequences were compared against the reference sequence, with the one exception of KU729217.2 (Brazil), which carried the variation M32I.

NS3 has two conserved variations, N400H and M472L, seen in all sequences (Table 2a). Apart from these positions, isolates KU729218.1 and KU321639.1 (from Brazil) showed the variations M334T and H355Y. Likewise, both KU501216.1 and KU501217.1 (from Guatemala) have the variation M572L. Isolates KU922960.1 and KU922923.1 (from Mexico) showed the variation A106E (Table 2b).

The NS4B protein has changes at seven positions, namely G14S, M26I, L49F, M98I, I180V, V184I and L186S, in all sequences analyzed. Apart from these changes, only KU321639.1 (Brazil), at one position (I176M), and KU744693.1 (China), at four additional sites (A44P, T48S, D150E and I176M), showed variations. The NS5 protein showed 15 evolved sites, which could be due to its large size of 902 amino acid residues. Details of all the aa variations in the non-structural proteins are listed in Tables 2a and 2b.

5’ and 3’ UTRs

Untranslated regions (5’ and 3’) are known to play important roles in flavivirus replication and virulence (29). The UTR sequences of ZIKV from the recent outbreak were aligned and checked for variations. The Malaysian strain had no 5’ or 3’ UTR sequence available for analysis, and UTR information was also absent for two sequences each from the 2015 and 2016 isolates. The analysis revealed that both UTR sequences (5’ UTR and 3’ UTR) were mostly conserved. The sequences were also submitted to the UTRscan web server to predict any conserved UTR motifs. UTRscan detected two motifs in the 3’ UTR sequence, namely uORF (upstream open reading frame) and MBE (Musashi binding element), and no motifs in the 5’ UTR. Further analysis of these motifs revealed nomenclature differences between the prediction software and the literature: the motif that UTRscan labels uORF corresponds to the dORF (downstream open reading frame) described for 3’ UTRs (30). This study is the first to report this motif in flaviviruses, although it has been shown to be present and conserved in mammalian UTRs, which highlights its importance (30). The relevance of dORF in ZIKV warrants in-depth functional studies. MBE, earlier referred to as the polyadenylation response element, is also known to play a part in temporal regulation in Xenopus (31, 32). Studies have shown the importance of this conserved domain in promoting RNA genome cyclization (29, 33). In addition, secondary structures of both UTR sequences were predicted using RNAalifold; the structures are detailed in Figure 2. The results revealed that the structures were conserved, as previously shown in a recent study (34).

Analysis of Recombination events

Recombination analyses were performed using the RDP4 tool on all 50 ZIKV genomes. A total of 11 events were predicted by more than five algorithms (p-value < 0.05). Of these, six events had an unknown parent and were not considered for further analysis. The remaining five events involved eight sequences, namely KF383115.1, KF383116.1, KF383117.1, KF383118.1, KF383120.1, KF383121.1, HQ234501.1 and HQ234498.1 (Table 3a). Further analysis showed that these sequences belong to African strains (East African and West African). Recombination analyses were also performed for the individual genes to predict the presence of any recombination event (Table 3b). The analysis, done using all algorithms with the above-mentioned criteria, showed one recombination event each for the Envelope and NS1 genes and two for NS3. In the case of Envelope, isolate KF383118.1 was a recombinant with sites 1-459 and 1041-1512 from the major parent (KF383117.1) and sites 460-1040 from the minor parent (LC002520.1). For NS1, the recombinant was KF383117.1 (sites 2-641), and the minor and major parents were KF383119.1 and KF383116.1 respectively. NS3 showed two events. In the first, KF383117.1 was the recombinant (sites 610-1070), with HQ23450.1 as the major parent and KF383119.1 as the minor parent. In the second, KF383116.1 was the recombinant (sites 732-1035), with HQ234501.1 as the major parent and KF383119.1 as the minor parent. The p-value was significant (p-value <= 0.05) for all the events. This result indicates that recombination events are present only in African isolates and are absent in the Asian lineage at present. While studies have highlighted that flaviviruses have infrequent recombination events in the field (35), one study has provided evidence of such events in ZIKV (25).

Molecular Evolution - Selection test


The estimated transition/transversion bias is 6.00. Substitution patterns and rates were estimated under the Kimura 2-parameter model (+G+I). Selection analysis of the genome sequences was performed using Tajima’s neutrality test on the 50 nucleotide sequences. Tajima’s test gave a nucleotide diversity of 0.064089 and a D value of 0.124450. The positive value of Tajima’s D suggests overdominant selection of these alleles in the population, resulting in negative selection (36, 37). Several studies have emphasized that infection and transmission modes influence the accumulation of negatively selected sites (25, 38).
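For readers unfamiliar with the statistic, the sketch below computes Tajima's D from a gap-free alignment following the formulas in Tajima (1989) (37). It is an illustration only: the toy alignments are invented, the handling of gaps and ambiguous bases used by MEGA6 is not reproduced, and diversity is computed per sequence rather than per site.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for equal-length, gap-free aligned sequences (n >= 4)."""
    n = len(seqs)
    if n < 4:
        raise ValueError("at least four sequences are needed")
    # S: number of segregating (polymorphic) sites.
    s = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if s == 0:
        raise ValueError("no segregating sites")
    # pi: mean number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Toy data: intermediate-frequency variants give D > 0, singletons give D < 0.
d_balanced = tajimas_d(["AA", "AT", "TA", "TT"])
d_singletons = tajimas_d(["AAAA", "TAAA", "ATAA", "AATA"])
```

The sign of D reflects whether pairwise diversity exceeds (positive) or falls short of (negative) what the number of segregating sites would predict under neutrality.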

Conclusion

In conclusion, our study is a comprehensive analysis of ZIKV genomes available till date. With

ZIKV infection spreading across the globe at an alarming rate, it is important to understand the

underlying molecular mechanisms that could aid the spread. Our analysis reveals balancing

selection of the identified amino acid variations thereby favoring fitness to the strains.

Acknowledgements

We thank ICGEB for the support. This work was supported by ICGEB core funds.


References

1. Weaver SC, Forrester NL. Chikungunya: Evolutionary history and recent epidemic spread. Antiviral research. 2015;120:32-9.

2. Wilson ME, Chen LH. Dengue: update on epidemiology. Current infectious disease reports. 2015;17(1):1-8.

3. Gulland A. Zika virus is a global public health emergency, declares WHO. BMJ. 2016;352:i657.

4. Thiel H, Collett M, Gould E, Heinz F, Houghton M, Meyers G, et al. Family flaviviridae. Virus taxonomy, VIII report of the International Committee on Taxonomy of Viruses, Academic Press, San Diego. 2005:981-98.

5. Haddow A, Williams M, Woodall J, Simpson D, Goma L. Twelve isolations of Zika virus from Aedes (Stegomyia) africanus (Theobald) taken in and above a Uganda forest. Bulletin of the World Health Organization. 1964;31(1):57.

6. Dick G, Kitchen S, Haddow A. Zika virus (I). Isolations and serological specificity. Transactions of the Royal Society of Tropical Medicine and Hygiene. 1952;46(5):509-20.

7. Duffy MR, Chen T-H, Hancock WT, Powers AM, Kool JL, Lanciotti RS, et al. Zika virus outbreak on Yap Island, federated states of Micronesia. New England Journal of Medicine. 2009;360(24):2536-43.

8. Macnamara F. Zika virus: a report on three cases of human infection during an epidemic of jaundice in Nigeria. Transactions of the Royal Society of Tropical Medicine and Hygiene. 1954;48(2):139-45.

9. Fagbami A. Zika virus infections in Nigeria: virological and seroepidemiological investigations in Oyo State. Journal of Hygiene. 1979;83(02):213-9.

10. Foy BD, Kobylinski KC, Chilson Foy JL, Blitvich BJ, Travassos da Rosa A, Haddow AD, et al. Probable non-vector-borne transmission of Zika virus, Colorado, USA. Emerg Infect Dis. 2011;17(5):880-2.

11. Heang V, Yasuda CY, Sovann L, Haddow AD, Travassos da Rosa AP, Tesh RB, et al. Zika virus infection, Cambodia, 2010. Emerg Infect Dis. 2012;18(2):349-51.

12. Smithburn K. Neutralizing antibodies against arthropod-borne viruses in the sera of long-time residents of Malaya and Borneo. American journal of hygiene. 1954;59(2):157-63.

13. Marchette N, Garcia R, Rudnick A. Isolation of Zika virus from Aedes aegypti mosquitoes in Malaysia. American Journal of Tropical Medicine and Hygiene. 1969;18(3):411-5.

14. Dakar IPd. WHO collaborating center for reference and research on arboviruses and hemorrhagic fever viruses: Annual report. Dakar, Senegal. 1999:143.

15. Monlun E, Zeller H, Le Guenno B, Traore-Lamizana M, Hervy J, Adam F, et al. [Surveillance of the circulation of arbovirus of medical interest in the region of eastern Senegal]. Bulletin de la Societe de pathologie exotique (1990). 1992;86(1):21-8.

16. Chambers TJ, Halevy M, Nestorowicz A, Rice CM, Lustig S. West Nile virus envelope proteins: nucleotide sequence analysis of strains differing in mouse neuroinvasiveness. Journal of General Virology. 1998;79(10):2375-80.

17. Kuno G, Chang G-J. Full-length sequencing and genomic characterization of Bagaza, Kedougou, and Zika viruses. Archives of virology. 2007;152(4):687-96.

18. Lanciotti RS, Kosoy OL, Laven JJ, Velez JO, Lambert AJ, Johnson AJ, et al. Genetic and serologic properties of Zika virus associated with an epidemic, Yap State, Micronesia, 2007. Emerg Infect Dis. 2008;14(8):1232-9.

19. European Centre for Disease Prevention and Control. Rapid risk assessment: Zika virus epidemic in the Americas: potential association with microcephaly and Guillain-Barré syndrome. ECDC. 2015. From: http://ecdc.europa.eu/en/publications/_layouts/forms/Publication_DispForm.aspx?List=4f55ad51-4aed-4d32-b960-af70113dbb90&ID=1413

20. Roth A, Mercier A, Lepers C, Hoy D, Duituturaga S, Benyon E, et al. Concurrent outbreaks of dengue, chikungunya and Zika virus infections-an unprecedented epidemic wave of mosquito-borne viruses in the Pacific 2012-2014. Euro Surveill. 2014;19(41):20929.

21. Zanluca C, Melo VCAd, Mosimann ALP, Santos GIVd, Santos CNDd, Luz K. First report of autochthonous transmission of Zika virus in Brazil. Memórias do Instituto Oswaldo Cruz. 2015;110(4):569-72.

22. Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. MEGA6: molecular evolutionary genetics analysis version 6.0. Molecular biology and evolution. 2013:mst197.

23. Bernhart SH, Hofacker IL, Will S, Gruber AR, Stadler PF. RNAalifold: improved consensus structure prediction for RNA alignments. BMC bioinformatics. 2008;9(1):1.

24. Martin DP, Murrell B, Golden M, Khoosal A, Muhire B. RDP4: Detection and analysis of recombination patterns in virus genomes. Virus Evolution. 2015;1(1):vev003.

25. Faye O, Freire CC, Iamarino A, Faye O, de Oliveira JVC, Diallo M, et al. Molecular Evolution of Zika Virus during Its Emergence in the 20th Century. PLoS Negl Trop Dis. 2014;8(1):e2636.

26. Haddow AD, Schuh AJ, Yasuda CY, Kasper MR, Heang V, Huy R, et al. Genetic characterization of Zika virus strains: geographic expansion of the Asian lineage. PLoS Negl Trop Dis. 2012;6(2):e1477.

27. Lazear HM, Stringer EM, de Silva AM. The Emerging Zika Virus Epidemic in the Americas: Research Priorities. JAMA. 2016.

28. Logan IS. ZIKA-How Fast Does This Virus Mutate? bioRxiv. 2016:040303.


29. Villordo SM, Gamarnik AV. Genome cyclization as strategy for flavivirus RNA replication. Virus research. 2009;139(2):230-9.

30. Crowe ML, Wang X-Q, Rothnagel JA. Evidence for conservation and selection of upstream open reading frames suggests probable encoding of bioactive peptides. BMC Genomics. 2006;7(1):1.

31. Charlesworth A, Ridge JA, King LA, MacNicol MC, MacNicol AM. A novel regulatory element determines the timing of Mos mRNA translation during Xenopus oocyte maturation. The EMBO journal. 2002;21(11):2798-806.

32. Charlesworth A, Wilczynska A, Thampi P, Cox LL, MacNicol AM. Musashi regulates the temporal order of mRNA translation during Xenopus oocyte maturation. The EMBO journal. 2006;25(12):2792-801.

33. Polacek C, Foley JE, Harris E. Conformational changes in the solution structure of the dengue virus 5′ end in the presence and absence of the 3′ untranslated region. Journal of virology. 2009;83(2):1161-6.

34. Zhu Z, Chan JF-W, Tee K-M, Choi GK-Y, Lau SK-P, Woo PC-Y, et al. Comparative genomic analysis of pre-epidemic and epidemic Zika virus strains for virological factors potentially associated with the rapidly expanding epidemic. Emerging Microbes & Infections. 2016;5(3):e22.

35. Cook S, Moureau G, Kitchen A, Gould EA, de Lamballerie X, Holmes EC, et al. Molecular evolution of the insect-specific flaviviruses. Journal of General Virology. 2012;93(2):223-34.

36. Holsinger K. Tajima's D, Fu's FS, Fay and Wu's H, and Zeng et al.'s E. 2013.

37. Tajima F. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics. 1989;123(3):585-95.

38. Hanada K, Suzuki Y, Gojobori T. A large variation in the rates of synonymous substitution for RNA viruses and its relationship to a diversity of viral infection and transmission modes. Molecular biology and evolution. 2004;21(6):1074-80.


Figure Legends

Figure 1. Phylogenetic tree constructed using the Neighbor-Joining method. Bootstrap values are shown next to the branches. Evolutionary distances were computed using the Maximum Likelihood method. Asian and African strains formed two distinct clusters, and the tree is rooted using Spondweni virus as the outgroup.


Figure 2. Consensus secondary structures of the UTRs generated using the RNAalifold tool. Bases in black are conserved; bases in grey are absent or not sequenced in some of the isolates. a) Consensus secondary structure of the 5’ UTR. b) Consensus secondary structure of the 3’ UTR.


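Table 1a. Mutations identified in the consensus sequences of the structural proteins of the 2015-2016 isolates. The number of sequences used to build the consensus for each region is also shown. Mutations were identified by comparing the sequences with the Malaysian isolate.

Protein | Polypeptide position | Protein position | Malaysia 1966 (n=1) | French Polynesia 2013 (n=1) | Puerto Rico 2015 (n=1) | Brazil 2015 (n=9) | Martinique 2015 (n=1) | Colombia 2015 (n=1) | Guatemala 2015 (n=2) | Suriname 2015 (n=1) | Mexico 2016 (n=2) | China 2016 (n=6) | Italy 2016 (n=2)
Capsid | 25 | 25 | N | . | S | S | S | S | S | S | S | S | S
Capsid | 27 | 27 | L | . | F | F | F | F | F | F | F | F | F
Capsid | 76 | 76 | E | . | . | . | . | . | . | . | . | E/D | .
Capsid | 101 | 101 | R | . | K | K | K | K | K | K | K | K | K
Capsid | 105 | 105 | G | . | . | G/S | . | . | . | . | . | . | .
Capsid | 107 | 107 | D | . | . | . | E | E | . | . | E | D/E | .
Capsid | 109 | 109 | S | . | . | . | . | . | . | . | . | S/N | .
Capsid | 110 | 110 | I | . | V | V | V | V | V | V | V | V | V
Capsid | 113 | 113 | I | . | V | V | V | V | V | V | V | V | V
pr | 123 | 1 | V | A | A | A | A | A | A | A | A | A | A
pr | 139 | 17 | S | N | N | N | N | N | N | N | N | N | N
pr | 153 | 31 | V | M | M | M | M | M | M | M | M | M | M
pr | 166 | 44 | M | . | . | . | . | . | . | T | . | . | .
M | 287 | 72 | P | . | . | . | . | . | . | . | P/L | . | .
Envelope | 323 | 33 | V | . | . | . | . | . | . | . | . | V/A | .
Envelope | 337 | 47 | T | . | . | T/S | . | . | . | . | . | . | .
Envelope | 346 | 56 | V | . | . | . | . | . | I | . | . | . | .
Envelope | 354 | 64 | S | . | . | T/S | . | . | . | . | . | . | .
Envelope | 358 | 68 | M | . | . | I/M | . | . | . | . | . | . | .
Envelope | 442 | 152 | I | . | . | . | . | . | . | . | . | I/L | .
Envelope | 503 | 213 | V | . | . | . | . | . | . | . | . | V/A | .
Envelope | 520 | 230 | D | . | . | . | . | . | . | . | . | D/A | .
Envelope | 545 | 255 | V | . | . | V/A | . | . | . | . | . | . | .
Envelope | 550 | 260 | S | . | . | S/T | . | . | . | . | . | . | .
Envelope | 612 | 322 | L | . | . | . | . | . | . | . | . | L/V | .
Envelope | 613 | 323 | H | . | . | . | . | . | . | . | . | H/D | .
Envelope | 620 | 330 | V | . | . | . | . | . | . | . | . | V/G | .
Envelope | 623 | 333 | A | . | . | . | . | . | . | . | . | A/G | .
Envelope | 639 | 349 | M | . | . | M/T | . | . | . | . | . | . | .
Envelope | 683 | 393 | D | E | E | E | E | E | E | E | E | E | E
Envelope | 709 | 419 | K | . | . | . | . | . | . | . | . | K/R | .
Envelope | 739 | 449 | F | . | . | . | . | . | . | . | . | F/I | .
Envelope | 763 | 473 | V | M | M | M | M | M | M | M | M | M | M
Envelope | 769 | 479 | T | . | . | . | . | . | . | A | . | . | .
Envelope | 777 | 487 | T | M | M | M | M | M | M | M | M | M | M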
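The two position columns of Table 1a differ by a constant offset within each protein (for example, residue 1 of pr is residue 123 of the polypeptide). The conversion can be sketched as follows. The offsets are read off the rows of Table 1a rather than taken from a genome annotation, and the names `OFFSETS`, `to_protein_position` and `to_polypeptide_position` are illustrative only.

```python
# Offsets inferred from Table 1a (polypeptide position minus protein position).
# Assumption: they are constant within each protein, as the table rows suggest.
OFFSETS = {"Capsid": 0, "pr": 122, "M": 215, "Envelope": 290}


def to_protein_position(protein, polypeptide_pos):
    """Convert a polypeptide coordinate to a within-protein coordinate."""
    pos = polypeptide_pos - OFFSETS[protein]
    if pos < 1:
        raise ValueError(f"position {polypeptide_pos} lies before the start of {protein}")
    return pos


def to_polypeptide_position(protein, protein_pos):
    """Convert a within-protein coordinate back to a polypeptide coordinate."""
    return protein_pos + OFFSETS[protein]


# Envelope residue 393 (the site of D393E) sits at polypeptide position 683.
envelope_site = to_polypeptide_position("Envelope", 393)
```

The same approach would apply to the non-structural proteins once their offsets are read from the corresponding table.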


Table 1b. Mutations identified in the structural protein sequences of the individual 2015-2016 isolates. Mutations were identified by comparing the sequences with the Malaysian isolate.

Sequences/Proteins | Capsid | pr | M | E
KU501215.1_Puertrico_2015 | N25S, L27F, R101K, I110V, I113V | V1A, S17N, V31M | - | D393E, V473M, T487M
KU729217.2_Brazil_2015 | N25S, L27F, R101K, I110V, I113V | V1A, S17N, V31M | - | T47S, S64T, M68I, V255A, D393E, V473M, T487M
KU707826.1_Brazil_2015 | N25S, L27F, R101K, I110V, I113V | V1A, S17N, V31M | - | D393E, V473M, T487M
KU729218.1_Brazil_2015 | N25S, L27F, R101K, G105S, I110V, I113V | V1A, S17N, V31M | - | M349T, D393E, V473M, T487M
KU321639.1_Brazil_2015 | N25S, L27F, R101K, I110V, I113V | V1A, S17N, V31M | - | D393E, V473M, T487M
KU497555.1_Brazil_2015 | N25S, L27F, R101K, I110V, I113V | V1A, S17N, V31M | - | S260T, D393E, V473M, T487M
KU365780.1_Brazil_2015 | N25S, L27F, R101K, I110V, I113V | V1A, S17N, V31M | - | D393E, V473M, T487M
KU365779.1_Brazil_2015 | N25S, L27F, R101K, I110V, I113V | V1A, S17N, V31M | - | D393E, V473M, T487M
KU365778.1_Brazil_2015 | N25S, L27F, R101K, I110V, I113V | V1A, S17N, V31M | - | D393E, V473M, T487M
KU365777.1_Brazil_2015 | N25S, L27F, R101K, I110V, I113V | V1A, S17N, V31M | - | D393E, V473M, T487M
KU647676.1_Martinque_2015 | N25S, L27F, R101K, D107E, I110V, I113V | V1A, S17N, V31M | - | D393E, V473M, T487M
KU820897.1_Colambia_2015 | N25S, L27F, R101K, D107E, I110V, I113V | V1A, S17N, V31M | - | D393E, V473M, T487M
KU501217.1_Guatemala_2015 | N25S, L27F, R101K, I110V, I113V | V1A, S17N, V31M | - | V56I, D393E, V473M, T487M
KU501216.1_Guatemala_2015 | N25S, L27F, R101K, I110V, I113V | V1A, S17N, V31M | - | V56I, D393E, V473M, T487M
KU312312.1_Suriname_2015 | N25S, L27F, R101K, I110V, I113V | V1A, S17N, V31M, M44T | - | D393E, V473M, T479A, T487M
KU922960.1_Mexico_2016 | N25S, L27F, R101K, D107E, I110V, I113V | V1A, S17N, V31M | - | D393E, V473M, T487M
KU922923.1_Mexico_2016 | N25S, L27F, R101K, D107E, I110V, I113V | V1A, S17N, V31M | P72L | D393E, V473M, T487M
KU866423.1_China_2016 | N25S, L27F, R101K, S109N, I110V, I113V | V1A, S17N, V31M | - | D393E, K419R, V473M, T487M
KU820898.1_China_2016 | N25S, L27F, R101K, D107E, I110V, I113V | V1A, S17N, V31M | - | D393E, V473M, T487M
KU740184.2_China_2016 | N25S, L27F, R101K, I110V, I113V | V1A, S17N, V31M | - | D393E, V473M, T487M
KU820899.2_China_2016 | N25S, L27F, R101K, S109N, I110V, I113V | V1A, S17N, V31M | - | D393E, K419R, V473M, T487M
KU761564.1_China_2016 | N25S, L27F, R101K, I110V, I113V | V1A, S17N, V31M | - | D393E, V473M, T487M
KU744693.1_China_2016 | N25S, L27F, E76D, R101K, I110V, I113V | V1A, S17N, V31M | - | V33A, I152L, V213A, D230A, L322V, H323D, V330G, A333G, D393E, F449I, V473M, T487M
KU853013.1_Italy_2016 | N25S, L27F, R101K, I110V, I113V | V1A, S17N, V31M | - | D393E, V473M, T487M
KU853012.1_Italy_2016 | N25S, L27F, R101K, I110V, I113V | V1A, S17N, V31M | - | D393E, V473M, T487M
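Entries such as N25S give the residue in the Malaysian reference, the position within the protein, and the residue in the isolate. A minimal sketch of how such entries could be derived from an aligned reference and isolate is given below. It assumes gap-free sequences of equal length; the function name and the toy sequences are hypothetical and do not reproduce the authors' pipeline.

```python
def call_mutations(reference, isolate):
    """List substitutions as <reference residue><1-based position><isolate residue>."""
    if len(reference) != len(isolate):
        raise ValueError("sequences must be aligned to the same length")
    return [
        f"{ref_res}{pos}{iso_res}"
        for pos, (ref_res, iso_res) in enumerate(zip(reference, isolate), start=1)
        if ref_res != iso_res
    ]


# Toy five-residue example (not real Zika sequence data).
toy_reference = "MKNPV"
toy_isolate = "MKSPA"
toy_calls = call_mutations(toy_reference, toy_isolate)
```

A real alignment would also need a rule for gaps and unsequenced residues, which this sketch deliberately leaves out.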


Table 2a. Mutations identified in the consensus sequences of the non-structural proteins of the 2015-2016 isolates. The number of sequences used to build the consensus for each region is also shown. Mutations were identified by comparing the sequences with the Malaysian isolate.

Protein | Polypeptide position | Protein position | Malaysia 1966 (n=1) | French Polynesia 2013 (n=1) | Puerto Rico 2015 (n=1) | Brazil 2015 (n=9) | Martinique 2015 (n=1) | Colombia 2015 (n=1) | Guatemala 2015 (n=2) | Suriname 2015 (n=1) | Mexico 2016 (n=2) | China 2016 (n=6) | Italy 2016 (n=2)
NS1 | 795 | 1 | D | . | . | . | . | . | . | . | . | D/G | .
NS1 | 894 | 100 | G | . | . | . | . | . | A | . | . | . | .
NS1 | 916 | 122 | Y | . | . | Y/H | . | . | . | . | . | . | .
NS1 | 970 | 176 | S | . | . | . | . | . | . | . | . | S/W | .
NS1 | 982 | 188 | A | V | V | V | V | V | V | V | V | V | V
NS1 | 984 | 190 | G | . | . | G/E | . | . | . | . | . | . | .
NS1 | 1005 | 211 | R | . | . | . | . | . | . | . | . | R/W | .
NS1 | 1050 | 256 | T | . | . | . | . | . | . | . | . | T/A | .
NS1 | 1058 | 264 | V | M | M | M | M | M | M | M | M | M | M
NS1 | 1107 | 313 | C | . | . | . | . | . | . | . | . | C/S | .
NS1 | 1118 | 324 | R | . | . | . | W | W | . | . | W | R/Q | .
NS1 | 1143 | 349 | M | . | . | M/V | . | . | . | . | . | . | V
NS2A | 1226 | 80 | I | . | . | . | T | . | . | . | T | . | .
NS2A | 1259 | 113 | L | . | . | L/F | . | . | . | . | . | . | .
NS2A | 1285 | 139 | I | . | . | . | . | . | . | . | . | I/V | .
NS2A | 1289 | 143 | A | V | V | V | V | V | V | V | V | V | V
NS2B | 1404 | 32 | M | . | . | M/I | . | . | . | . | . | . | .
NS3 | 1608 | 106 | A | . | . | . | . | . | . | . | E | . | .
NS3 | 1836 | 334 | M | . | . | M/T | . | . | . | . | . | . | .
NS3 | 1856 | 354 | D | . | . | . | . | . | . | . | . | D/E | .
NS3 | 1857 | 355 | H | . | . | H/Y | . | . | . | . | . | H/Y | .
NS3 | 1867 | 365 | S | . | . | . | . | . | . | . | . | S/R | .
NS3 | 1902 | 400 | N | H | H | H | H | H | H | H | H | H | H
NS3 | 1938 | 436 | D | . | . | . | . | . | . | . | . | D/G | .
NS3 | 1974 | 472 | M | L | L | L | L | L | L | L | L | L | L
NS3 | 2027 | 525 | R | . | . | . | . | . | . | . | . | R/K | .
NS3 | 2074 | 572 | M | . | . | . | . | . | L | . | . | . | .
NS4B | 2283 | 14 | G | S | S | S | S | S | S | S | S | S | S
NS4B | 2295 | 26 | M | I | I | I | I | I | I | I | I | I/M | I
NS4B | 2313 | 44 | A | . | . | . | . | . | . | . | . | A/P | .
NS4B | 2317 | 48 | T | . | . | . | . | . | . | . | . | T/S | .
NS4B | 2318 | 49 | L | F | F | F | F | F | F | F | F | F | F
NS4B | 2367 | 98 | M | I | I | I | I | I | I | I | I | I | I
NS4B | 2419 | 150 | D | . | . | . | . | . | . | . | . | D/E | .
NS4B | 2445 | 176 | I | . | . | I/M | . | . | . | . | . | I/M | .
NS4B | 2449 | 180 | I | V | V | V | V | V | V | V | V | V | V
NS4B | 2453 | 184 | V | I | I | I | I | I | I | I | I | I | I
NS4B | 2455 | 186 | L | S | S | S | S | S | S | S | S | S | S
NS5 | 2611 | 91 | A | . | V | . | . | . | . | . | . | . | .
NS5 | 2634 | 114 | T | M | V | V | V | V | V | V | V | M/V | V
NS5 | 2644 | 124 | V | . | . | . | . | . | . | . | . | V/I | .
NS5 | 2659 | 139 | S | P | P | P | P | P | P | P | P | P | P
NS5 | 2694 | 174 | K | . | . | . | . | . | R | . | . | . | .
NS5 | 2749 | 229 | I | T | T | T | T | T | T | T | T | T/I | T
NS5 | 2778 | 258 | N | . | . | N/D | . | . | . | . | . | . | .
NS5 | 2787 | 267 | A | V | V | V | V | V | V | V | V | V/A | V
NS5 | 2795 | 275 | L | M | M | M | M | M | M | M | M | M | M
NS5 | 2800 | 280 | N | . | . | N/D | . | . | . | . | . | . | .
NS5 | 2802 | 282 | V | I | I | I | I | I | I | I | I | I | I
NS5 | 2807 | 287 | S | . | . | . | . | . | . | . | . | S/A | .
NS5 | 2809 | 289 | H | . | . | . | . | . | . | . | Q | H/K | .
NS5 | 2831 | 311 | E | . | . | E/V | . | . | . | . | . | E/D | .
NS5 | 2833 | 313 | P | . | . | . | . | . | . | . | . | P/A | .
NS5 | 2842 | 322 | I | . | . | . | . | . | . | . | . | . | V
NS5 | 2896 | 376 | N | S | S | S | S | S | S | S | S | S | S
NS5 | 2974 | 454 | N | . | . | . | . | . | . | . | . | N/I | .
NS5 | 2975 | 455 | M | . | . | . | . | . | . | . | . | M/T | .
NS5 | 3030 | 510 | G | . | . | . | . | . | . | . | V | . | .
NS5 | 3045 | 525 | R | . | . | . | . | . | C | . | . | . | .
NS5 | 3046 | 526 | T | I | I | I | I | I | I | I | I | I | I
NS5 | 3050 | 530 | K | R | R | R | R | R | R | R | R | R | R
NS5 | 3107 | 587 | R | K | K | K | K | K | K | K | K | K | K
NS5 | 3144 | 624 | N | . | . | . | . | . | . | . | . | S/N | .
NS5 | 3162 | 642 | P | S | S | S | S | S | S | S | S | S | S
NS5 | 3167 | 647 | S | N | N | N | N | N | N | N | N | N | N
NS5 | 3190 | 670 | K | . | . | . | . | . | . | . | . | R/K | .
NS5 | 3223 | 703 | S | D | D | D | D | D | D | D | D | D | D
NS5 | 3239 | 719 | Y | H | H | H | H | H | H | H | H | H | H
NS5 | 3334 | 814 | V | . | . | . | . | . | . | . | . | V/A | .
NS5 | 3353 | 833 | T | . | . | . | A | A | . | . | A | . | .
NS5 | 3387 | 867 | D | N | N | N | N | N | N | N | N | N | N
NS5 | 3392 | 872 | V | . | . | . | . | . | . | . | . | V/M | .
NS5 | 3398 | 878 | D | . | . | . | . | . | . | . | . | . | E
NS5 | 3403 | 883 | M | . | . | . | . | . | . | . | . | M/V | .


Table 2b. Mutations identified in the non-structural protein sequences of the individual 2015-2016 isolates. Mutations were identified by comparing the sequences with the Malaysian isolate.

Sequences/Proteins | NS1 | NS2A | NS2B | NS3 | NS4B | NS5
KU501215.1_Puertrico_2015 | A188V, V264M | A143V | - | N400H, M472L | G14S, M26I, L49F, M98I, I180V, V184I, L186S | A91V, T114V, S139P, I229T, A267V, L275M, V282I, N376S, T526I, K530R, R587K, P642S, S647N, S703D, Y719H, D867N
KU729217.2_Brazil_2015 | A188V, G190E, V264M, M349V | A143V | M32I | N400H, M472L | G14S, M26I, L49F, M98I, I180V, V184I, L186S | T114V, S139P, I229T, A267V, L275M, N280D, V282I, N376S, T526I, K530R, R587K, P642S, S647N, S703D, Y719H, D867N
KU707826.1_Brazil_2015 | A188V, V264M | A143V | - | N400H, M472L | G14S, M26I, L49F, M98I, I180V, V184I, L186S | T114V, S139P, I229T, A267V, L275M, V282I, N376S, T526I, K530R, R587K, P642S, S647N, S703D, Y719H, D867N
KU729218.1_Brazil_2015 | A188V, V264M | A143V | - | M334T, N400H, M472L | G14S, M26I, L49F, M98I, I180V, V184I, L186S | T114V, S139P, I229T, A267V, L275M, V282I, N376S, T526I, K530R, R587K, P642S, S647N, S703D, Y719H, D867N
KU321639.1_Brazil_2015 | Y122H, A188V, V264M | A143V | - | H355Y, N400H, M472L | G14S, L49F, M98I, I176M, I180V, V184I, L186S | T114V, S139P, I229T, A267V, L275M, V282I, N376S, T526I, K530R, R587K, P642S, S647N, S703D, Y719H, D867N
KU497555.1_Brazil_2015 | A188V, V264M | L113F, A143V | - | N400H, M472L | G14S, M26I, L49F, M98I, I180V, V184I, L186S | T114V, S139P, I229T, A267V, L275M, V282I, E311V, N376S, T526I, K530R, R587K, P642S, S647N, S703D, Y719H, D867N
KU365780.1_Brazil_2015 | A188V, V264M | A143V | - | N400H, M472L | G14S, M26I, L49F, M98I, I180V, V184I, L186S | T114V, S139P, I229T, N258D, A267V, L275M, V282I, N376S, T526I, K530R, R587K, P642S, S647N, S703D, Y719H, D867N
KU365779.1_Brazil_2015 | A188V, V264M | A143V | - | N400H, M472L | G14S, M26I, L49F, M98I, I180V, V184I, L186S | T114V, S139P, I229T, A267V, L275M, V282I, N376S, T526I, K530R, R587K, P642S, S647N, S703D, Y719H, D867N
KU365778.1_Brazil_2015 | A188V, V264M | A143V | - | N400H, M472L | G14S, M26I, L49F, M98I, I180V, V184I, L186S | T114V, S139P, I229T, A267V, L275M, V282I, N376S, T526I, K530R, R587K, P642S, S647N, S703D, Y719H, D867N
KU365777.1_Brazil_2015 | A188V, V264M | A143V | - | N400H, M472L | G14S, M26I, L49F, M98I, I180V, V184I, L186S | T114V, S139P, I229T, N258D, A267V, L275M, V282I, N376S, T526I, K530R, R587K, P642S, S647N, S703D, Y719H, D867N
KU647676.1_Martinque_2015 | A188V, V264M, R324W | I80T, A143V | - | N400H, M472L | G14S, M26I, L49F, M98I, I180V, V184I, L186S | T114V, S139P, I229T, A267V, L275M, V282I, N376S, T526I, K530R, R587K, P642S, S647N, S703D, Y719H, T833A, D867N
KU820897.1_Colambia_2015 | A188V, V264M, R324W | A143V | - | N400H, M472L | G14S, M26I, L49F, M98I, I180V, V184I, L186S | T114V, S139P, I229T, A267V, L275M, V282I, N376S, T526I, K530R, R587K, P642S, S647N, S703D, Y719H, T833A, D867N
KU501217.1_Guatemala_2015 | G100A, A188V, V264M | A143V | - | N400H, M472L, M572L | G14S, M26I, L49F, M98I, I180V, V184I, L186S | T114V, S139P, K174R, I229T, A267V, L275M, V282I, N376S, R525C, T526I, K530R, R587K, P642S, S647N, S703D, Y719H, D867N
KU501216.1_Guatemala_2015 | G100A, A188V, V264M | A143V | - | N400H, M472L, M572L | G14S, M26I, L49F, M98I, I180V, V184I, L186S | T114V, S139P, I229T, A267V, L275M, V282I, N376S, T526I, K530R, R587K, P642S, S647N, S703D, Y719H, D867N
KU312312.1_Suriname_2015 | A188V, V264M | A143V | - | N400H, M472L | G14S, M26I, L49F, M98I, I180V, V184I, L186S | T114V, S139P, I229T, A267V, L275M, V282I, N376S, T526I, K530R, R587K, P642S, S647N, S703D, Y719H, D867N
KU922960.1_Mexico_2016 | A188V, V264M, R324W | I80T, A143V | - | A106E, N400H, M472L | G14S, M26I, L49F, M98I, I180V, V184I, L186S | T114V, S139P, I229T, A267V, L275M, V282I, H289Q, N376S, G510V, T526I, K530R, R587K, P642S, S647N, S703D, Y719H, T833A, D867N
KU922923.1_Mexico_2016 | A188V, V264M, R324W | I80T, A143V | - | A106E, N400H, M472L | G14S, M26I, L49F, M98I, I180V, V184I, L186S | T114V, S139P, I229T, A267V, L275M, V282I, H289Q, N376S, G510V, T526I, K530R, R587K, P642S, S647N, S703D, Y719H, T833A, D867N
KU866423.1_China_2016 | A188V, V264M, R324Q | A143V | - | N400H, M472L, R525K | G14S, M26I, L49F, M98I, I180V, V184I, L186S | T114M, V124I, S139P, I229T, A267V, L275M, V282I, N376S, T526I, K530R, R587K, N624S, P642S, S647N, K670R, S703D, Y719H, D867N, V872M, M883V
KU820898.1_China_2016 | A188V, V264M | I139V, A143V | - | N400H, M472L | G14S, M26I, L49F, M98I, I180V, V184I, L186S | T114V, S139P, L275M, V282I, N376S, T526I, K530R, R587K, P642S, S647N, S703D, Y719H, D867N
KU740184.2_China_2016 | A188V, V264M | I139V, A143V | - | N400H, M472L | G14S, M26I, L49F, M98I, I180V, V184I, L186S | T114V, S139P, L275M, V282I, N376S, T526I, K530R, R587K, P642S, S647N, S703D, Y719H, D867N
KU820899.2_China_2016 | A188V, V264M, R324Q | A143V | - | N400H, M472L | G14S, M26I, L49F, M98I, I180V, V184I, L186S | T114M, S139P, I229T, A267V, L275M, V282I, N376S, T526I, K530R, R587K, N624S, P642S, S647N, K670R, S703D, Y719H, D867N
KU761564.1_China_2016 | A188V, V264M | I139V, A143V | - | N400H, M472L | G14S, M26I, L49F, M98I, I180V, V184I, L186S | T114V, S139P, L275M, V282I, N376S, T526I, K530R, R587K, P642S, S647N, S703D, Y719H, D867N
KU744693.1_China_2016 | D1G, S176W, A188V, R211W, T256A, V264M, C313S | A143V | - | D354E, H355Y, S365R, N400H, D436G, M472L | G14S, A44P, T48S, L49F, M98I, D150E, I176M, I180V, V184I, L186S | T114V, S139P, I229T, A267V, L275M, V282I, S287A, H289K, E311D, P313A, N376S, N454I, T526I, K530R, R587K, P642S, S647N, S703D, Y719H, V814A, D867N
KU853013.1_Italy_2016 | A188V, V264M, M349V | A143V | - | N400H, M472L | G14S, M26I, L49F, M98I, I180V, V184I, L186S | T114V, S139P, I229T, A267V, L275M, V282I, I322V, N376S, T526I, K530R, R587K, P642S, S647N, S703D, Y719H, D867N, D878E
KU853012.1_Italy_2016 | A188V, V264M | A143V | - | N400H, M472L | G14S, M26I, L49F, M98I, I180V, V184I, L186S | T114V, S139P, I229T, A267V, L275M, V282I, I322V, N376S, T526I, K530R, R587K, P642S, S647N, S703D, Y719H, D867N, D878E
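The per-isolate entries of Table 2b and the per-region cells of Table 2a describe the same sites at two levels of detail. A sketch of how a consensus-style cell could be summarised from the residues observed in one region's isolates follows. It rests on two assumptions that the legends do not state: that "." means every isolate matches the Malaysian reference, and that mixed residues are joined with "/" with the reference residue listed first. The function name is illustrative.

```python
def consensus_cell(reference_residue, isolate_residues):
    """Summarise one site across a region's isolates in the style of Table 2a."""
    if not isolate_residues:
        raise ValueError("at least one isolate residue is required")
    # Collect distinct residues in order of first appearance.
    distinct = []
    for residue in isolate_residues:
        if residue not in distinct:
            distinct.append(residue)
    if distinct == [reference_residue]:
        return "."  # assumed meaning: no isolate differs from the reference
    # Assumed ordering: reference residue first when it is still observed.
    ordered = [r for r in distinct if r == reference_residue]
    ordered += [r for r in distinct if r != reference_residue]
    return "/".join(ordered)


# NS1 position 324, China 2016 (n=6): two isolates carry R324Q in Table 2b.
china_ns1_324 = consensus_cell("R", ["Q", "R", "R", "Q", "R", "R"])
```

With the Table 2b entries for NS1 position 324, this yields the mixed cell for China and a single residue for Mexico, where both isolates carry R324W.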


Table 3a. Recombination analysis of whole genome sequences of ZIKA virus. A ‘+’ sign indicates that the respective method predicted the event; a ‘-’ symbol indicates that the method predicted no event.

Recombinant | Major parent | Minor parent | RDP (P-Val) | GENECONV (P-Val) | BootScan (P-Val) | MaxChi (P-Val) | Chimaera (P-Val) | SiScan (P-Val) | 3Seq (P-Val)
KF383117.1 | KF383116.1 | KF383115.1 | + (6.975E-32) | + (2.081E-32) | - (NA) | + (8.794E-7) | + (9.072E-12) | + (7.309E-11) | + (1.917E-9)
KF383118.1 | KF383121.1 | KF383117.1 | + (4.050E-31) | + (1.319E-26) | - (NA) | + (4.147E-8) | + (7.136E-8) | + (7.859E-8) | + (4.460E-4)
KF383117.1 | HQ234501.1 | KF383121.1 | + (7.007E-19) | + (1.688E-18) | - (NA) | + (1.461E-5) | + (1.020E-5) | + (7.178E-7) | + (2.394E-8)
KF383117.1 | KF383116.1 | KF383118.1 | + (5.698E-22) | + (5.905E-20) | - (NA) | + (7.158E-7) | + (5.999E-7) | + (2.570E-7) | + (2.754E-3)
KF383118.1 | HQ234498.1 | KF383120.1 | + (3.315E-19) | + (2.896E-9) | - (NA) | + (3.221E-7) | - (NA) | + (5.905E-3) | + (1.334E-2)
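Each row of Table 3a records, for seven detection methods, whether the event was predicted and with what P-value. A small sketch of tallying the supporting methods for one event is shown below, using the values of the last row of Table 3a. The dictionary layout, the helper name and the use of `None` for "- (NA)" are illustrative choices, and the method names are spelled as in the RDP software.

```python
# Detection methods in the column order of Table 3a.
METHODS = ["RDP", "GENECONV", "BootScan", "MaxChi", "Chimaera", "SiScan", "3Seq"]


def supporting_methods(p_values):
    """Return the methods that predicted the event (those with a P-value)."""
    return [method for method in METHODS if p_values.get(method) is not None]


# Last row of Table 3a (recombinant KF383118.1); None stands for "- (NA)".
event = {
    "RDP": 3.315e-19,
    "GENECONV": 2.896e-9,
    "BootScan": None,
    "MaxChi": 3.221e-7,
    "Chimaera": None,
    "SiScan": 5.905e-3,
    "3Seq": 1.334e-2,
}

support = supporting_methods(event)
# Largest (least significant) P-value among the supporting methods.
weakest_p_value = max(event[method] for method in support)
```

Counting supporting methods in this way makes it easy to compare events, for instance to see which events in the table are backed by fewer methods than the others.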


Table 3b. Recombination analysis of individual genes of ZIKA virus. A ‘+’ sign indicates that the respective method predicted the event; a ‘-’ symbol indicates that the method predicted no event.

Gene | Recombinant | Major parent | Minor parent | RDP (P-Val) | GENECONV (P-Val) | BootScan (P-Val) | MaxChi (P-Val) | Chimaera (P-Val) | SiScan (P-Val) | 3Seq (P-Val)
Envelope | KF383118.1 | LC002520.1 | KF383117.1 | + (1.065E-10) | + (6.088E-10) | + (7.162E-10) | + (6.415E-10) | + (4.014E-10) | + (7.242E-10) | + (5.422E-12)
NS1 | KF383117.1 | KF383116.1 | KF383119.1 | + (1.553E-07) | + (6.592E-06) | + (1.587E-05) | + (1.418E-03) | + (4.026E-07) | + (2.529E-07) | + (1.290E-12)
NS3 | KF383117.1 | HQ234501.1 | KF383119.1 | + (2.504E-13) | + (6.474E-11) | + (5.521E-09) | + (7.745E-10) | + (6.190E-10) | + (2.222E-11) | + (5.188E-18)
NS3 | KF383116.1 | HQ234501.1 | KF383119.1 | + (3.031E-08) | + (2.575E-09) | + (5.414E-11) | + (9.034E-04) | + (4.071E-05) | + (5.747E-07) | + (2.674E-09)


Supplementary Material

Supplementary Table 1: Sequences used in the study.

S. No Accession Number Country Year References (PMID/DOI)

1 KU501217.1 Guatemala 2015 10.3201/eid2205.160065

2 KU501216.1 Guatemala 2015 10.3201/eid2205.160065

3 KU501215.1 Puerto Rico 2015 10.3201/eid2205.160065

4 KU647676.1 Martinique 2015 10.1016/j.nmni.2016.02.013

5 KU729217.2 Brazil 2015 27013429

6 KU729218.1 Brazil 2015 27013429

7 KU853013.1 Italy 2016 26987769

8 KU853012.1 Italy 2016 26987769

9 KU321639.1 Brazil 2015 26941134

10 KU497555.1 Brazil 2015 26897108

11 KU312312.1 Suriname 2015 26775124

12 KU707826.1 Brazil 2015 26401719

13 KF268950.1 Central African Republic - 25514122

14 KF268949.1 Central African Republic - 25514122

15 KF268948.1 Central African Republic 1976 25514122

16 KF993678.1 Canada 2013 25294619

17 KJ776791.1 French Polynesia 2013 24903869

18 KF383121.1 East African - 24421913

19 KF383119.1 Senegal 2001 24421913

20 KF383118.1 Senegal 2001 24421913

21 KF383115.1 Central African Republic 1968 24421913

22 KF383120.1 Senegal 2000 24421913

23 KF383117.1 Senegal 1997 24421913

24 KF383116.1 Senegal - 24421913

25 JN860885.1 Cambodia 2010 22389730

26 HQ234499.1 Malaysia 1966 22389730

27 HQ234498.1 Uganda 1947 22389730

28 HQ234501.1 Senegal 1984 22389730

29 HQ234500.1 Nigeria 1968 22389730

30 DQ859059.1 Uganda - 19741066

31 EU545988.1 Micronesia 2007 18680646

32 AY632535.2 Uganda - 16223950

33 KU922960.1 Mexico 2016 NA

34 KU922923.1 Mexico 2016 NA

35 KU866423.1 China 2016 NA

36 KU820898.1 China 2016 NA

37 KU740184.2 China 2016 NA

38 KU820899.2 China 2016 NA

39 KU820897.1 Colombia 2015 NA

40 KU761564.1 China 2016 NA

41 KU681082.3 Philippines 2012 NA

42 KU681081.3 Thailand 2014 NA

43 KU744693.1 China 2016 NA

44 KU509998.1 Haiti 2014 NA

45 KU365780.1 Brazil 2015 NA

46 KU365779.1 Brazil 2015 NA

47 KU365778.1 Brazil 2015 NA

48 KU365777.1 Brazil 2015 NA

49 KU720415.1 Uganda 1947 NA

50 LC002520.1 Uganda - NA


Supplementary Figure 1: Phylogenetic trees of ZIKA virus predicted using other methods. a) Maximum Likelihood tree. b) Minimum-Evolution tree. c) Maximum Parsimony tree. d) UPGMA tree.