phytophthora genome sequencing: a case study

Post on 05-Jan-2016

PHYTOPHTHORA GENOME SEQUENCING: A case study

Santhosh J. Eapen

sjeapen@spices.res.in

Status of whole genome sequencing of Phytophthora spp.

April 20, 2023 2

Organism      Status       Release date  Method       Size (Mb)  Centre/consortium
P. andina     In progress  -             WGS          -          Broad Institute
P. capsici    Assembly     -             WGS          64.04      DOE Joint Genome Institute
P. infestans  Assembly     15/11/06      Clone based  190.13     Broad Institute
P. ipomoeae   In progress  -             WGS          -          Broad Institute
P. mirabilis  In progress  -             WGS          -          Broad Institute
P. phaseoli   In progress  -             WGS          -          Broad Institute
P. ramorum    Assembly     01/09/06      WGS          54.42      DOE Joint Genome Institute
P. sojae      Assembly     01/09/06      WGS          78.05      DOE Joint Genome Institute

Phytophthora Whole Genome Sequencing

• The sequencing platform was the Illumina Genome Analyzer.

• Base calling, alignment, and variant analysis were done using CASAVA v1.7 (short for "Consensus Assessment of Sequence And VAriation").

• Maq was used for reference-guided assembly and variant detection.

• The JGI P. capsici genome was used as the reference genome.
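The reference-guided Maq steps listed above can be sketched as a sequence of command lines. This is only an illustration, not the commands actually run for this project: the subcommands are the standard ones from the Maq manual, while the file names, the `maq_out` directory, and the helper name `maq_reference_pipeline` are assumptions made for the example.

```python
from pathlib import Path


def maq_reference_pipeline(ref_fasta, reads_fastq, workdir="maq_out"):
    """Return Maq command lines (argv lists) for a reference-guided run.

    All output file names are hypothetical placeholders.
    """
    out = Path(workdir)
    ref_bfa = str(out / "ref.bfa")          # reference in Maq binary format
    reads_bfq = str(out / "reads.bfq")      # reads in Maq binary format
    aln_map = str(out / "reads.map")        # read alignments
    consensus = str(out / "consensus.cns")  # consensus built from alignments
    return [
        ["maq", "fasta2bfa", ref_fasta, ref_bfa],
        ["maq", "fastq2bfq", reads_fastq, reads_bfq],
        ["maq", "map", aln_map, ref_bfa, reads_bfq],
        # mapcheck writes the alignment summary statistics to stdout
        ["maq", "mapcheck", ref_bfa, aln_map],
        ["maq", "assemble", consensus, ref_bfa, aln_map],
        # cns2snp writes the SNP calls to stdout
        ["maq", "cns2snp", consensus],
    ]


if __name__ == "__main__":
    # Hypothetical input names; print the commands instead of running them.
    for argv in maq_reference_pipeline("pcapsici_jgi.fasta", "pepper_isolate.fastq"):
        print(" ".join(argv))
```

Each list could be passed to `subprocess.run` once Maq is installed; printing first makes it easy to review the plan before committing compute time.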

Alignment status and reports

• Number of reference scaffolds : 917

• Length of reference sequences excluding gaps : 56042007

• Length of gaps in the reference sequences : 8005190

• Length of non-gap regions covered by reads : 22593594

• GC% : 50.4

• Total Reads : 15849154

• Reads Aligned : 48.8738

• Total Genome Size : 64022747

• Genome Covered : 28234853

• %Coverage : 44.1013

• Average Read Depth : 1.50491

• Average depth across all non-gap regions : 11.284

• Average depth across 24 bp unique regions : 1.565

• % Coverage at 1X : 54.8897

• Single Nucleotide Variants at 3X cutoff : 330410
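Some of the reported figures can be cross-checked with simple arithmetic. The sketch below recomputes the coverage percentage from "Genome Covered" and "Total Genome Size", and the GC% from the reference composition reported for the JGI assembly (C 25.20%, G 25.23%). Treating "Reads Aligned" as a percentage of total reads is an assumption, since the slide gives no unit.

```python
# Values copied from the alignment report.
total_genome_size = 64_022_747
genome_covered = 28_234_853
total_reads = 15_849_154
reads_aligned_pct = 48.8738   # ASSUMPTION: a percentage of total reads

# Reference nucleotide composition of the JGI P. capsici assembly (%).
ref_c, ref_g = 25.20, 25.23


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


coverage_pct = percent(genome_covered, total_genome_size)
reference_gc = ref_c + ref_g
# Approximate read count implied by the assumed percentage.
aligned_reads = round(total_reads * reads_aligned_pct / 100.0)

if __name__ == "__main__":
    print(f"%Coverage recomputed : {coverage_pct:.4f}")
    print(f"GC% recomputed       : {reference_gc:.1f}")
    print(f"Reads aligned (est.) : {aligned_reads}")
```

The recomputed coverage and GC values agree with the reported 44.1013 and 50.4, which suggests that "%Coverage" is simply covered bases divided by total genome size.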

Base composition and genome size of P. capsici

Total genome size = 64022747 (64 Mb)

SNP and InDel details

Variant Annotation Output

Structural Annotation

• Structural annotation was carried out with AUGUSTUS (version 2.5.5), using Magnaporthe_grisea as the species model.

• However, an oomycete-specific gene model still has to be developed to obtain accurate results.
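As a rough illustration of the structural annotation step, the sketch below builds an AUGUSTUS command line with a pre-trained species model. The `--species` and `--gff3` options are standard AUGUSTUS parameters; the input file name and the helper `augustus_command` are assumptions for the example, not the exact invocation used here.

```python
def augustus_command(genome_fasta, species="magnaporthe_grisea", gff3=True):
    """Return an AUGUSTUS argv list for ab initio gene prediction.

    `species` selects one of the pre-trained AUGUSTUS parameter sets;
    a fungal model is only a stand-in until an oomycete model is trained.
    """
    argv = ["augustus", f"--species={species}"]
    if gff3:
        argv.append("--gff3=on")   # write predictions in GFF3 format
    argv.append(genome_fasta)      # the query file comes last
    return argv


if __name__ == "__main__":
    # Hypothetical assembly file name.
    print(" ".join(augustus_command("pcapsici_assembly.fasta")))
```

Keeping the species model as a parameter makes it straightforward to rerun the prediction once an oomycete-specific model is available.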

Gene annotation

Functional Annotation

Functional Annotation Result

• Functional Annotation for negative strand is complete

Comparison with Phytophthora capsici (JGI)

Number of reference sequences: 917

Length of reference sequences excluding gaps: 56,042,007

Length of gaps in the reference sequences: 7,981,741

Length of non-gap regions covered by reads: 22,593,608

Length of 24 bp unique regions of the reference: 1,426,016

Reference nucleotide composition: A: 24.81%, C: 25.20%, G: 25.23%, T: 24.77%

Reads nucleotide composition: A: 23.04%, C: 26.43%, G: 27.13%, T: 23.40%

Average depth across all non-gap regions: 11.285

Average depth across 24 bp unique regions: 1.564

Comparison with Phytophthora infestans (Maq)

Number of reference sequences: 4921

Length of reference sequences excluding gaps: 190,133,476

Length of gaps in the reference sequences: 38,410,029

Length of non-gap regions covered by reads: 1,832,771

Length of 24bp unique regions of the reference: 44,036

Reference nucleotide composition: A: 24.53%, C: 25.46%, G: 25.51%, T: 24.51%

Reads nucleotide composition: A: 25.11%, C: 23.39%, G: 26.10%, T: 25.40%

Average depth across all non-gap regions: 0.596

Average depth across 24bp unique regions: 0.030

Comparison with Phytophthora ramorum

Number of reference sequences: 2576

Length of reference sequences excluding gaps: 54,424,536

Length of gaps in the reference sequences: 12,227,865

Length of non-gap regions covered by reads: 430,094

Length of 24bp unique regions of the reference: 23,950

Reference nucleotide composition: A: 23.09%, C: 26.92%, G: 26.94%, T: 23.06%

Reads nucleotide composition: A: 25.96%, C: 23.94%, G: 23.99%, T: 26.11%

Average depth across all non-gap regions: 0.676

Average depth across 24bp unique regions: 0.020

Comparison with Phytophthora sojae

Number of reference sequences: 1810

Length of reference sequences excluding gaps: 78,050,814

Length of gaps in the reference sequences: 7,976,489

Length of non-gap regions covered by reads: 514,116

Length of 24bp unique regions of the reference: 30,059

Reference nucleotide composition: A: 22.77%, C: 27.25%, G: 27.20%, T: 22.78%

Reads nucleotide composition: A: 25.14%, C: 21.86%, G: 26.51%, T: 26.48%

Average depth across all non-gap regions: 0.959

Average depth across 24bp unique regions: 0.016
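The four mapping summaries above are easier to compare once the covered length is expressed as a fraction of each reference's non-gap length. The sketch below does that using only the figures reported in these summaries; the dictionary layout and function names are assumptions made for the example.

```python
# (non-gap reference length, non-gap length covered by reads, reference C %, reference G %)
MAPCHECK = {
    "P. capsici (JGI)": (56_042_007, 22_593_608, 25.20, 25.23),
    "P. infestans":     (190_133_476, 1_832_771, 25.46, 25.51),
    "P. ramorum":       (54_424_536, 430_094, 26.92, 26.94),
    "P. sojae":         (78_050_814, 514_116, 27.25, 27.20),
}


def covered_percent(non_gap, covered):
    """Percentage of the non-gap reference covered by at least one read."""
    return 100.0 * covered / non_gap


def summarize(stats=MAPCHECK):
    """Return {reference: (covered %, reference GC %)}."""
    return {
        name: (covered_percent(non_gap, covered), c + g)
        for name, (non_gap, covered, c, g) in stats.items()
    }


if __name__ == "__main__":
    for name, (cov, gc) in summarize().items():
        print(f"{name:18s} covered {cov:6.2f}%   reference GC {gc:5.2f}%")
```

On these numbers, roughly 40% of the non-gap P. capsici reference is covered, against under 1% for each of the other three species, which is consistent with the reads coming from a P. capsici isolate.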

Comparison of P. capsici with P. capsici (JGI), P. infestans, P. ramorum & P. sojae

                            Number of   Nucleotide composition        Genome size  Number of
Organism                    scaffolds   A (%)  C (%)  G (%)  T (%)    (Mb)         genes
P. capsici – Pepper (IISR)  917         24.81  25.20  25.23  24.77    64.05        19,805
P. capsici (JGI)            917         23.04  26.43  27.13  23.40    64           19,805
P. infestans                4921        25.11  23.39  26.10  25.40    240          22,658
P. sojae                    1810        22.77  27.25  27.20  22.78    95           19,027
P. ramorum                  2576        25.96  23.94  23.99  26.11    65           15,743

GenomeView - next-generation stand-alone genome browser

• Visualize and manipulate large volumes of genomic data

• Browse high volumes of aligned short-read data, with dynamic navigation and semantic zooming, from the whole-genome level down to the single nucleotide

• Visualize whole-genome alignments of dozens of genomes relative to a reference sequence

• Handle thousands of annotation features and millions of mapped short reads

Future Plans

• Assign putative functions to the remaining genes

• Provide a genome-wide comparison with other sequenced Phytophthora species

• Sequence more genomes

Data, data, everywhere but ... is it knowledge?

• Five oomycete genome sequences are available and several more are on the way

• The rate of new sequence generation is accelerating extraordinarily with next generation technologies

Even today the ability to generate high throughput sequencing and transcriptomic data is outstripping the ability to transform the data into knowledge

Automated data processing pipelines are not a substitute for human insight

[Diagram: Theory, Experiment, Modeling, Simulation]

Life in a data-rich environment

Every experimental biologist needs to be a computational biologist too

Some concluding remarks

• Trust but verify

• Beware of gene prediction tools!

• Always use more than one gene prediction tool and more than one genome when possible.

• This is an active area of bioinformatics research, so be mindful of the new literature in this area.
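One simple way to act on the advice to use more than one gene prediction tool is to keep track of which gene models are supported by two independent predictors. The sketch below is a minimal illustration under stated assumptions: genes are reduced to (scaffold, start, end, strand) tuples, "support" means a reciprocal-free overlap of at least 80% of the shorter model, and the coordinates are made up for the example.

```python
from typing import List, Tuple

# (scaffold, start, end, strand) with 1-based inclusive coordinates
Gene = Tuple[str, int, int, str]


def overlap_fraction(a: Gene, b: Gene) -> float:
    """Overlap length divided by the length of the shorter gene model."""
    if a[0] != b[0] or a[3] != b[3]:
        return 0.0                      # different scaffold or strand
    shared = min(a[2], b[2]) - max(a[1], b[1]) + 1
    if shared <= 0:
        return 0.0                      # no overlap
    shorter = min(a[2] - a[1] + 1, b[2] - b[1] + 1)
    return shared / shorter


def supported_by_both(tool_a: List[Gene], tool_b: List[Gene],
                      min_overlap: float = 0.8) -> List[Gene]:
    """Genes from tool_a that overlap some tool_b gene by >= min_overlap."""
    return [g for g in tool_a
            if any(overlap_fraction(g, h) >= min_overlap for h in tool_b)]


if __name__ == "__main__":
    # Hypothetical predictions from two different tools.
    tool_a = [("scaffold_1", 100, 900, "+"),
              ("scaffold_1", 2000, 2600, "-"),
              ("scaffold_2", 50, 400, "+")]
    tool_b = [("scaffold_1", 120, 900, "+"),
              ("scaffold_1", 2500, 3200, "-")]
    print(supported_by_both(tool_a, tool_b))
```

Models found by only one tool are not necessarily wrong, but they are the natural candidates for manual inspection in a genome browser.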

Other factors

• Changing technology

  – New and disappearing companies?

• Changing price structure

  – Cost of machine

  – Cost of operation (reagents/people)

  – Service from the company

  – 1 machine vs (2 or 3 machines) vs 40 machines

• Changing software and processing

What have we learned?

• Sequencing technologies are changing fast

• Allowing new biology to be performed, new questions to be asked

• Understand the difference between some of the technologies

What next?

April 20, 2023 Phytophthora 2011: RRII, Kottayam 36
