Phytophthora genome sequencing: a case study
Post on 05-Jan-2016
PHYTOPHTHORA GENOME SEQUENCING: A case study
Santhosh J. Eapen
sjeapen@spices.res.in
Status of whole genome sequencing of Phytophthora spp.
April 20, 2023 2
Organism      Status       Release date  Method       Size (Mb)  Centre/consortium
P. andina     In progress  -             WGS          -          Broad Institute
P. capsici    Assembly     -             WGS          64.04      DOE Joint Genome Institute
P. infestans  Assembly     15/11/06      Clone based  190.13     Broad Institute
P. ipomoeae   In progress  -             WGS          -          Broad Institute
P. mirabilis  In progress  -             WGS          -          Broad Institute
P. phaseoli   In progress  -             WGS          -          Broad Institute
P. ramorum    Assembly     01/09/06      WGS          54.42      DOE Joint Genome Institute
P. sojae      Assembly     01/09/06      WGS          78.05      DOE Joint Genome Institute
Phytophthora Whole Genome Sequencing
• The sequencing platform was the Illumina Genome Analyzer.
• Base calling, alignment, and variant analysis were done using CASAVA v1.7 (short for "Consensus Assessment of Sequence And VAriation").
• Maq was used for reference-guided assembly and variant detection.
• The JGI P. capsici genome was used as the reference genome.
Alignment status and reports
• Number of reference scaffolds : 917
• Length of reference sequences excluding gaps : 56042007
• Length of gaps in the reference sequences : 8005190
• Length of non-gap regions covered by reads : 22593594
• GC% : 50.4
• Total Reads : 15849154
• Reads Aligned : 48.8738
• Total Genome Size : 64022747
• Genome Covered : 28234853
• %Coverage : 44.1013
• Average Read Depth : 1.50491
• Average depth across all non-gap regions : 11.284
• Average depth across 24 bp unique regions : 1.565
• % Coverage at 1X : 54.8897
• Single Nucleotide Variants at 3X cutoff : 330410
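As a quick sanity check, the reported %Coverage can be reproduced from two other figures in the alignment report. The sketch below assumes that %Coverage is simply "Genome Covered" divided by "Total Genome Size"; the function name is illustrative and is not part of CASAVA or Maq.

```python
def percent_coverage(covered_bases: int, genome_size: int) -> float:
    """Return the percentage of the genome covered by aligned reads."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return 100.0 * covered_bases / genome_size


# Figures copied from the alignment report.
GENOME_COVERED = 28_234_853
TOTAL_GENOME_SIZE = 64_022_747

coverage = percent_coverage(GENOME_COVERED, TOTAL_GENOME_SIZE)
print(f"%Coverage: {coverage:.4f}")
```

Under that assumption the result agrees with the reported 44.1013 to the precision shown in the report.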
Base composition and genome size of P. capsici
Total genome size = 64022747 (64 Mb)
SNP and InDel details
Variant Annotation Output
Structural Annotation
• Structural annotation was carried out with AUGUSTUS (version 2.5.5), using Magnaporthe_grisea as the gene model.
• However, an oomycete-specific gene model still has to be developed to obtain accurate results.
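For readers who want to repeat this step, the following is a minimal sketch of how such an AUGUSTUS run could be assembled. The slides do not give the exact command line, so the options shown here (`--species` and `--gff3=on`), the lowercase species identifier, and the FASTA file name are assumptions for illustration only.

```python
import shlex
from typing import List


def build_augustus_command(genome_fasta: str,
                           species: str = "magnaporthe_grisea") -> List[str]:
    """Build an AUGUSTUS command line as an argument list.

    The species identifier and output format are assumptions; the
    original run may have used different settings.
    """
    return ["augustus", f"--species={species}", "--gff3=on", genome_fasta]


# "pcapsici_assembly.fa" is a hypothetical file name.
cmd = build_augustus_command("pcapsici_assembly.fa")
print(shlex.join(cmd))
```

Building the command as a list keeps it safe to pass to `subprocess.run(cmd, ...)` later, and swapping in an oomycete-trained model would only require changing the `species` argument.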
Gene annotation
Functional Annotation
Functional Annotation Result
• Functional Annotation for negative strand is complete
Comparison with Phytophthora capsici (JGI)
Number of reference sequences: 917
Length of reference sequences excluding gaps: 56,042,007
Length of gaps in the reference sequences: 7,981,741
Length of non-gap regions covered by reads: 22,593,608
Length of 24 bp unique regions of the reference: 1,426,016
Reference nucleotide composition: A: 24.81%, C: 25.20%, G: 25.23%, T: 24.77%
Reads nucleotide composition: A: 23.04%, C: 26.43%, G: 27.13%, T: 23.40%
Average depth across all non-gap regions: 11.285
Average depth across 24 bp unique regions: 1.564
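The nucleotide compositions above can be turned into GC content directly, which makes the difference between the reference and the reads easier to see. This is a minimal sketch; the function and variable names are illustrative.

```python
from typing import Dict


def gc_percent(composition: Dict[str, float]) -> float:
    """Return GC content (%) from a per-base composition given in percent."""
    return composition["G"] + composition["C"]


# Percentages copied from the P. capsici (JGI) comparison.
reference = {"A": 24.81, "C": 25.20, "G": 25.23, "T": 24.77}
reads = {"A": 23.04, "C": 26.43, "G": 27.13, "T": 23.40}

print(f"Reference GC%: {gc_percent(reference):.2f}")
print(f"Reads GC%:     {gc_percent(reads):.2f}")
```

The reference value (25.20 + 25.23 = 50.43) is consistent with the GC% of 50.4 given in the alignment report, while the reads come out about three percentage points higher.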
Comparison with Phytophthora infestans (Maq)
Number of reference sequences: 4921
Length of reference sequences excluding gaps: 190,133,476
Length of gaps in the reference sequences: 38,410,029
Length of non-gap regions covered by reads: 1,832,771
Length of 24 bp unique regions of the reference: 44,036
Reference nucleotide composition: A: 24.53%, C: 25.46%, G: 25.51%, T: 24.51%
Reads nucleotide composition: A: 25.11%, C: 23.39%, G: 26.10%, T: 25.40%
Average depth across all non-gap regions: 0.596
Average depth across 24 bp unique regions: 0.030
Comparison with Phytophthora ramorum
Number of reference sequences: 2576
Length of reference sequences excluding gaps: 54,424,536
Length of gaps in the reference sequences: 12,227,865
Length of non-gap regions covered by reads: 430,094
Length of 24 bp unique regions of the reference: 23,950
Reference nucleotide composition: A: 23.09%, C: 26.92%, G: 26.94%, T: 23.06%
Reads nucleotide composition: A: 25.96%, C: 23.94%, G: 23.99%, T: 26.11%
Average depth across all non-gap regions: 0.676
Average depth across 24 bp unique regions: 0.020
Comparison with Phytophthora sojae
Number of reference sequences: 1810
Length of reference sequences excluding gaps: 78,050,814
Length of gaps in the reference sequences: 7,976,489
Length of non-gap regions covered by reads: 514,116
Length of 24 bp unique regions of the reference: 30,059
Reference nucleotide composition: A: 22.77%, C: 27.25%, G: 27.20%, T: 22.78%
Reads nucleotide composition: A: 25.14%, C: 21.86%, G: 26.51%, T: 26.48%
Average depth across all non-gap regions: 0.959
Average depth across 24 bp unique regions: 0.016
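The four comparisons above are easier to read side by side as the fraction of each reference (excluding gaps) that is covered by reads. The sketch below derives that fraction from the figures reported for each species; it is a derived illustration, not a number taken from the Maq output itself.

```python
from typing import Dict, Tuple

# (non-gap bases covered by reads, reference length excluding gaps),
# copied from the four comparison reports.
REFERENCES: Dict[str, Tuple[int, int]] = {
    "P. capsici (JGI)": (22_593_608, 56_042_007),
    "P. infestans": (1_832_771, 190_133_476),
    "P. ramorum": (430_094, 54_424_536),
    "P. sojae": (514_116, 78_050_814),
}


def covered_percent(covered: int, non_gap_length: int) -> float:
    """Percentage of the non-gap reference covered by at least one read."""
    return 100.0 * covered / non_gap_length


results = {name: covered_percent(*values) for name, values in REFERENCES.items()}

# Print references from best to worst covered.
for name, pct in sorted(results.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<18} {pct:6.2f}% of non-gap reference covered")
```

The reads cover roughly 40% of the P. capsici reference but less than 1% of each of the other three, which is what one would expect when aligning short reads to a different species.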
Comparison of P. capsici with P. capsici (JGI), P. infestans, P. ramorum & P. sojae
Organism                    Number of scaffolds  A (%)  C (%)  G (%)  T (%)  Genome size (Mb)  Number of genes
P. capsici – Pepper (IISR)  917                  24.81  25.20  25.23  24.77  64.05             19,805
P. capsici (JGI)            917                  23.04  26.43  27.13  23.40  64                19,805
P. infestans                4921                 25.11  23.39  26.10  25.40  240               22,658
P. sojae                    1810                 22.77  27.25  27.20  22.78  95                19,027
P. ramorum                  2576                 25.96  23.94  23.99  26.11  65                15,743
GenomeView - next-generation stand-alone genome browser
• Visualize and manipulate large volumes of genomic data
• Browse high volumes of aligned short-read data, with dynamic navigation and semantic zooming, from the whole-genome level down to the single nucleotide
• Visualize whole-genome alignments of dozens of genomes relative to a reference sequence
• Handle thousands of annotation features and millions of mapped short reads
Future Plans
• Assign putative functions to the remaining genes
• Provide a genome-wide comparison with other sequenced Phytophthora species
• Sequence more genomes
Data, data, everywhere but ...
is it knowledge?
• Five oomycete genome sequences are available and several more are on the way
• The rate of new sequence generation is accelerating extraordinarily with next generation technologies
Even today, the ability to generate high-throughput sequencing and transcriptomic data is outstripping the ability to transform that data into knowledge
Automated data processing pipelines are not a substitute for human insight
[Diagram: Theory, Experiment, Modeling, Simulation]
Life in a data-rich environment
Every experimental biologist needs to be a computational biologist too
Some concluding remarks
• Trust but verify
• Beware of gene prediction tools!
• Always use more than one gene prediction tool and more than one genome when possible.
• This is an active area of bioinformatics research, so keep up with the new literature in this area.
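One simple way to act on the advice about using more than one gene prediction tool is to check which predicted genes are supported by a second tool. The sketch below assumes each prediction has already been parsed into (scaffold, start, end, strand) tuples; the coordinates shown are made-up illustrative data, not results from this study.

```python
from typing import List, Tuple

# (scaffold, start, end, strand), with 1-based inclusive coordinates.
Gene = Tuple[str, int, int, str]


def overlaps(a: Gene, b: Gene) -> bool:
    """True if two genes share a scaffold and strand and their ranges intersect."""
    return a[0] == b[0] and a[3] == b[3] and a[1] <= b[2] and b[1] <= a[2]


def supported_by_both(tool_a: List[Gene], tool_b: List[Gene]) -> List[Gene]:
    """Return the genes from tool_a that overlap at least one gene from tool_b."""
    return [gene for gene in tool_a if any(overlaps(gene, other) for other in tool_b)]


# Hypothetical predictions from two different tools.
tool_a = [("scaffold_1", 100, 900, "+"),
          ("scaffold_1", 2000, 2600, "-"),
          ("scaffold_2", 50, 400, "+")]
tool_b = [("scaffold_1", 150, 950, "+"),
          ("scaffold_2", 500, 800, "+")]

confirmed = supported_by_both(tool_a, tool_b)
print(confirmed)
```

Genes predicted by only one tool are not necessarily wrong, but they are reasonable candidates for manual inspection in a genome browser.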
Other factors
• Changing technology
– New and disappearing companies?
• Changing price structure
– Cost of machine
– Cost of operation (reagents/people)
– Service from the company
– 1 machine vs (2 or 3 machines) vs 40 machines
• Changing software and processing
What have we learned?
• Sequencing technologies are changing fast
• Allowing new biology to be performed, new questions to be asked
• Understand the difference between some of the technologies
What next?
Phytophthora 2011: RRII, Kottayam