Brute force in the clinic
Keen to document exactly which microbes are there, many researchers are using
brute-force 'metagenomic' studies in which they extract microbial DNA from the
body cavity of choice and hurl it all into a sequencing machine.
NGS and virus discovery
Eric Delwart
Blood Systems Research Institute
UCSF Laboratory Medicine
Rapid pace of viral discovery is providing many “orphan
viruses” as candidate pathogens
The discovery curve for human virus species (Woolhouse M E et al. Proc. R. Soc. B 2008;275:2111-2115)
Some common diseases may be associated
with still unknown viruses
• Gastroenteritis (~50% unexplained in the US)
• Respiratory illness (15-25% unexplained)
• Encephalitis (~60-80% unexplained in CA)
• Hepatitis
• Chronic fatigue syndrome
• Non-polio acute flaccid paralysis
• Auto-immune diseases (diabetes, MS, Kawasaki)
• Cancers
Virus Size
Relative size of viruses and bacteria
400 nm filters used to purify viral particles
Bacterium (Staphylococcus aureus)
Chlamydia
Pox virus
Herpes virus
Influenza Virus
Picornavirus (polio)
Random RT-PCR amplification
Filtration and nuclease enriched viral nucleic acids (RNA, DNA)
Reverse transcription with a primer carrying a signature tag and a randomized 3’ end:
5’GTCCATGCATGACTCGAGTCNNNNNNNN3’
Melt/anneal and extend cDNA with Klenow DNA polymerase
30-40 PCR rounds with the signature tag primer:
5’GTCCATGCATGACTCGAGTC3’
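Because every amplicon starts with the signature tag followed by eight primer-derived bases, reads are usually cleaned of that prefix before assembly. The following is a minimal sketch of that trimming step in Python. The tag sequence and the eight randomized positions come from the slide; the function names and the decision to also drop the eight random bases are assumptions made for illustration.

```python
TAG = "GTCCATGCATGACTCGAGTC"  # fixed signature tag shown on the slide
N_RANDOM = 8                  # randomized 3' positions (NNNNNNNN)


def trim_primer(read: str, drop_random: bool = True) -> str:
    """Strip a leading signature tag from a read.

    Assumption: the bases laid down by the randomized 3' end are also
    removed, since they may not match the template exactly.
    """
    read = read.upper()
    if not read.startswith(TAG):
        return read  # no tag found: leave the read untouched
    cut = len(TAG) + (N_RANDOM if drop_random else 0)
    return read[cut:]


def possible_primers() -> int:
    """Number of distinct primers the randomized 3' end can produce."""
    return 4 ** N_RANDOM


if __name__ == "__main__":
    example = TAG + "ACGTACGT" + "TTTTGGGGCCCC"  # hypothetical read
    print(trim_primer(example))
    print(possible_primers())
```

A real pipeline would also tolerate sequencing errors in the tag and trim its reverse complement from the 3' end of reads; both are omitted here to keep the sketch short.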
Sequence data generation and analysis
Data collecting and quality control
Sequencing -> raw reads -> bin and trim -> assembly -> BLAST search
Compiling of data into web-based format that is dynamic and searchable
Bioinformatics: the perception..
The reality..
Wide variety of viral
genome types have
been “re-discovered”
Viruses detected:
BLASTx E score < 10^-5
New parvovirus Long contigs
A new highly divergent calicivirus with borderline E score 3 x 10^-4
(23% protein identity)
E = 10^-10
E = 10^-5
E = 10^-20
Highly divergent viruses are not recognized by BLASTx
Recognized: E = 10^-20
Borderline recognized: E = 10^-5
Unrecognized: E > 10^-5
Novel animal viral genomes help identify highly divergent human viruses.
Expansion of viral sequence space by new species/genus
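The E-value cutoffs on these slides can be applied mechanically to BLASTx output. Below is a minimal Python sketch, assuming the hits are in the standard 12-column BLAST tabular format, where the E-value is the 11th column. The 10^-5 cutoff is from the slides. The upper bound of the "borderline" band (10^-3) is an assumption chosen so that the calicivirus example at 3 x 10^-4 falls inside it, and the contig and subject names are hypothetical.

```python
def best_hits(lines):
    """Return {query: (subject, evalue)} keeping the lowest E-value per query."""
    best = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.rstrip("\n").split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best


def classify_hit(evalue, cutoff=1e-5, borderline_max=1e-3):
    """Label a hit; smaller E-values mean stronger similarity.

    cutoff is the slide's threshold; borderline_max is an assumed bound.
    """
    if evalue < cutoff:
        return "recognized"
    if evalue <= borderline_max:
        return "borderline"
    return "unrecognized"


if __name__ == "__main__":
    rows = [  # hypothetical BLASTx tabular rows
        "contig1\tparvovirus_NS1\t45.0\t200\t110\t2\t1\t600\t1\t200\t1e-20\t95.0",
        "contig1\tother_protein\t30.0\t100\t70\t1\t1\t300\t1\t100\t1e-6\t50.0",
        "contig2\tcalicivirus_RdRp\t23.0\t150\t115\t3\t1\t450\t1\t150\t3e-4\t40.0",
    ]
    for query, (subject, evalue) in best_hits(rows).items():
        print(query, subject, evalue, classify_hit(evalue))
```

Borderline contigs are the ones worth checking by hand, for example against newly described animal virus genomes.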
Metagenomics is a useful tool to detect viral contamination
Varivax: 104,273. Other viruses: none detected.
Oral poliovirus 1,2,3 vaccine: 60,735 PV1; 21,251 PV2; 59,943 PV3. Other viruses: none detected.
Rotarix: some rotavirus sequences (458); 6,344 Porcine Circovirus-1.
HEV-B, HEV-C, cosavirus, adenovirus, rhinovirus, cucumber mosaic virus
Number of mammalian viruses in stool from Pakistani children with
non-polio acute flaccid paralysis
Co-infections in high density farm piglets
[Figure: number of piglets (y-axis) versus number of mammalian viruses per feces (x-axis), diarrhea versus healthy]
Parvoviridae family (ssDNA)
Smallest known viruses, very environmentally resistant.
Common pathogen of cats, dogs, pigs. Vaccination available.
Resistant to heat and solvent/detergent inactivation (problem
for transfusion).
Genera infecting vertebrates:
Parvovirus: Murine minute virus
Erythrovirus: B19
Dependovirus: Adeno-associated virus
Amdovirus: Aleutian mink disease virus
Bocavirus: HBoV, HBoV2-3-4
Partetravirus: PARV4
Unassigned: Bufavirus 1-2
Recognizable by protein similarity (BLASTx)
Too divergent to anneal to microarray based on pre-existing
nucleotide sequences in GenBank
PARV4 typical parvovirus genome structure
PARV4: founder of new
Parvovirus genus (Partetravirus)
Consensus primer PCR identified PARV4 relatives
PARV4 antibodies in US/Europe limited to blood exposed subjects:
First blood transmitted parvovirus?
Hitch-hiker to blood-borne pandemics of HIV and HCV?
Sharp CP, Lail A, Donfield S, Gomperts ED, Simmonds P. Transfusion. 2012 Jul;52(7):1482-9.
PARV4 sero-conversion following heat treated clotting factors
Is PARV4 a pathogen?
200 healthy blood donors from Los Angeles: 2% positive
200 patients with symptoms of acute viral infection (HIV RNA
negative MSM and/or IDU): 6% positive
p=0.03
Higher PARV4 prevalence may simply reflect higher general
exposure to blood borne viruses in high risk groups
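The slide does not say which test produced p=0.03 or give the exact counts. As a worked illustration only, the sketch below runs Fisher's exact test in plain Python on counts inferred from the percentages (4 of 200 donors and 12 of 200 patients, both assumptions). Its result need not match the slide's value, which may come from a different test or from unrounded counts.

```python
from math import comb


def hypergeom_pmf(k, total, positives, drawn):
    """P(exactly k positives in the first group) with all margins fixed."""
    return comb(positives, k) * comb(total - positives, drawn - k) / comb(total, drawn)


def fisher_exact(pos1, neg1, pos2, neg2):
    """Return (one_sided, two_sided) p-values for a 2x2 table.

    one_sided: probability of seeing pos1 or fewer positives in group 1.
    two_sided: sum of all table probabilities no larger than the observed one.
    """
    total = pos1 + neg1 + pos2 + neg2
    positives = pos1 + pos2
    drawn = pos1 + neg1
    k_min = max(0, drawn - (total - positives))
    k_max = min(positives, drawn)
    pmf = {k: hypergeom_pmf(k, total, positives, drawn)
           for k in range(k_min, k_max + 1)}
    one_sided = sum(p for k, p in pmf.items() if k <= pos1)
    cutoff = pmf[pos1] * (1 + 1e-9)  # small tolerance for float ties
    two_sided = sum(p for p in pmf.values() if p <= cutoff)
    return one_sided, two_sided


if __name__ == "__main__":
    # Assumed counts: 2% of 200 donors = 4, 6% of 200 patients = 12.
    one_sided, two_sided = fisher_exact(4, 196, 12, 188)
    print(f"one-sided p = {one_sided:.3f}, two-sided p = {two_sided:.3f}")
```

Whatever the p-value, it only measures the difference in prevalence; it cannot separate pathogenicity from higher exposure to blood borne viruses.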
PARV4 in CSF of two unexplained encephalitis cases in India
Benjamin LA, Lewthwaite P, Vasanthapuram R, Zhao G, Sharp C, Simmonds P, Wang D,
Solomon T. Emerg Infect Dis. 2011 Aug;17(8):1484-7.
PARV4 in cases of fetal hydrops
Chen MY, Yang SJ, Hung CC. Emerg Infect Dis. 2011 Oct;17(10):1954-6.
Picornaviridae family
Very wide range of symptoms and transmission strategies
(e.g. poliovirus versus rhinoviruses)
In 2008, only four genera were known to infect humans (Enterovirus,
Hepatovirus, Parechovirus, Kobuvirus).
Infects humans
Recent increase of Picornaviridae
genera infecting humans
Salivirus
Frequency of cosavirus and enterovirus detection by RT-nPCR
% RT-PCR positive          Cosavirus   Enterovirus
Acute flaccid paralysis    49%         76%
Healthy controls           44%         61%
Prevalence of cosavirus and enterovirus in
feces of Pakistani children
Cosaviruses may be second most common
human enteric viral infection.
Genetic diversity (5 species and >30
serotypes) complicates disease
association studies
Multiple astrovirus clades
in extensively sampled
mammals
Viral sequences in human plasma pools with Roche 454
13 x B19V (12079 reads)
1 x PARV4 (3 reads)
1 x Human papillomavirus 27 (1 read)
Recent analysis with Illumina HiSeq also showed:
WU polyomavirus, HERV, lab and extraction column
contaminants.
Many new “orphan” viruses being detected
using metagenomics.
Close homologues of human viruses found in
animals.
Very high rate of viral co-infections in healthy
young humans and animals.
“New” human orphan viruses since 2005
• Parvoviruses:
– B19 erythrovirus (3 genotypes)
– PARV4 (3 genotypes)
– Bocavirus-1/2/3/4 (2 species)
– Bufavirus-1/2 (2 species)
• Picornaviruses:
– Enterovirus/Hepatovirus/Parechovirus/Kobuvirus (4 genera with countless serotypes)
– Cosaviruses-A/B/C/D/E (5 species >33 genotypes)
– Salivirus/Klassevirus (1 species)
– Cardioviruses (1 species ~8 genotypes)
• Astroviruses:
– Human astrovirus (8 genotypes)
– Human astroviruses MLB-1/2/3 (3 genotypes)
– Human astroviruses VA1/2/3/4 (4 genotypes)
• Polyomaviruses:
– BKV/JC, KIPyV, WUPyV
– MCPyV, HPyV6, HPyV7
– TSPyV, HPyV9, HMWPyV
DIAGNOSTICS
Target anti-viral
Detect minority drug resistance mutations (HIV/HCV/HBV)
Avoid unneeded antibiotic therapies
Understand transmission networks
BIOLOGICS QUALITY CONTROL
Serum, cell supernatants, mAb.
VETERINARY MEDICINE
Diagnostics
Vaccine
ENVIRONMENTAL QC
Agricultural products (plants-animals-etc)
Sewage
Practical uses for virus NGS (besides virus discovery)
Sequence what?
-Everything (total DNA and RNA of host and virus)
-Enrich viral NA by filtration and nuclease resistance
-Enrich known pathogens NA with micro-array or beads
Issues in NGS for virus detection
Quality controls
-Reduce nucleic acid contamination
-Sensitivity using standards (mixed /ss/ds/circular
DNA/RNA viruses)
-Effect of sample type
(blood/respiratory/feces/biopsies)
-Best method for unbiased amplification.
Bioinformatics
Speed
De novo assembly?
Curated viral database
Report only known viruses? (what %
nucleotide/protein similarity criteria?)
Issues in NGS for virus detection
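The question of which percent similarity should count as a "known" virus depends on how identity is computed. The Python sketch below shows one common convention: identical residues divided by aligned columns where neither sequence has a gap. Both that convention and the threshold in the helper are assumptions for illustration, not criteria from the slides, and the sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences ('-' marks a gap).

    Assumption: columns with a gap in either sequence are excluded from
    the denominator; other conventions give different numbers.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a.upper(), aligned_b.upper())
               if x != "-" and y != "-"]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


def is_known_virus(identity: float, threshold: float = 90.0) -> bool:
    """Hypothetical reporting rule: call a hit 'known' above the threshold."""
    return identity >= threshold


if __name__ == "__main__":
    ident = percent_identity("MKT-LV", "MRTALV")  # hypothetical alignment
    print(ident, is_known_virus(ident))
```

A highly divergent virus, such as the calicivirus at 23% protein identity, would fall far below any such threshold, which is why a "report only known viruses" rule needs an explicit criterion.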
Blood Systems Research Institute
and UCSF:
Linlin Li
Tung Phan
Lark Coffey
Terry Ng
Xutao Feng
Beatrix Kapuszinsky (Stanford)
Tongling Shan (CAS Shanghai)
Amit Kapoor (Columbia U)
Joe Victoria (Boehringer-Ingelheim)
Morris Jones (USDA)
MANY MANY other collaborators for human, animal and environmental samples
Stanford University
Chunlin Wang
University of Edinburgh
Peter Simmonds
Paul Ehrlich Institut
Sally Baylis
WHO poliovirus eradication
Sohail Zaidi
University of Helsinki
Maria Söderlund-Venermo
NHLBI, USAIDS, and BSRI funding
High prevalence of Bocavirus sero-reactivity
in European children and adults
            HBoV1   HBoV2   HBoV3   HBoV4
Children:   26%     25%     11%     3%
Adults:     59%     34%     15%     2%
Kantola K, et al. J Infect Dis. 2011 Nov;204(9):1403-12.