generation and analysis of - biotecbiotec.or.th/en/images/stories/news/banner/generation1.pdf ·...

Anchalee Tassanakajon Shrimp Molecular Biology and Genomics Laboratory, Department of Biochemistry, Faculty of Science, Chulalongkorn University Generation and Analysis of Penaeus monodon Expressed Sequence Tags


Page 1: Generation and Analysis of - BIOTECbiotec.or.th/en/images/stories/News/Banner/generation1.pdf · Canis 3% Pan 2% Others 6% Bos 4% Mus 8% Matched Species. Gene Ontology Annotations

Anchalee Tassanakajon

Shrimp Molecular Biology and Genomics Laboratory, Department of Biochemistry, Faculty of Science,

Chulalongkorn University

Generation and Analysis of Penaeus monodon Expressed Sequence Tags


Research Team

Chulalongkorn University

Shrimp Molecular Biology and Genomics Laboratory: Dr. Anchalee Tassanakajon, Dr. Siriporn Pongsomboon, Dr. Premruethai Supungul, Dr. Piti Amparyup, Ms. Sureerat Tang

Advanced Virtual and Intelligent Computing Research Center: Dr. Chidchanok Lursinsap, Mr. Kasemsant Kuphanumat

CE for Marine Biotechnology: Dr. Sirawut Klinbunga, Dr. Narongsak Paunglarp


Prince of Songkla University: Dr. Amornrat Pongdara

Mahidol University: Dr. Apinunt Udomkit (IMBG), Dr. Sarawut Jitrapakdee (Faculty of Sci.), Dr. Kallaya Dangtip (Centex Shrimp)


dbEST release 012508 Summary by Organism - January 25, 2008

Number of public entries: 49,284,356

Shrimp ESTs from GenBank


Objectives

To generate a collection of ESTs from Penaeus monodon

To establish a database for mining genes and repetitive sequences and for SNP detection


cDNA Libraries


Non-normalized cDNA libraries

Hemocyte
Hematopoietic tissue
Lymphoid organ
Intestine
Gill and epipodite
Hepatopancreas
Antennal gland
Eye stalk
Brain and thoracic ganglia
Heart
Ovary
Testis


Normalized cDNA libraries

Hemocyte (11,008 ESTs)
Hepatopancreas (4,122 ESTs)
Ovary
Antennal gland
Gill-epipodite


Subtractive cDNA libraries

Heat-induced gill subtraction

Ovary subtraction (different stages of female broodstock)

Testes subtraction (Broodstock / juvenile)


Experimental animals

Normal shrimp
Pathogen-infected shrimp
Heat-induced shrimp


Summary of Penaeus monodon EST analysis

Total no. of ESTs: 40,001
Total no. of Mt sequences: 6,858 (17%)
Total ESTs analyzed: 33,143
No. of ESTs in contigs: 25,834
No. of contigs: 3,227
No. of singletons: 7,309
No. of unique transcripts: 10,536
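The counts in this summary are tied together by simple arithmetic, and a short sketch makes the relationships explicit. The variable names are illustrative. The reading that the analyzed set is the total minus the Mt sequences, and that 17% is the Mt share of all ESTs, is an assumption inferred from the figures rather than something the slide states.

```python
# Figures copied from the summary slide; variable names are illustrative only.
total_ests = 40_001
mt_sequences = 6_858
ests_in_contigs = 25_834
contigs = 3_227
singletons = 7_309

# Assumption: the analyzed set excludes the Mt sequences.
analyzed = total_ests - mt_sequences

# Every analyzed EST is either assembled into a contig or left as a singleton.
assembled_plus_single = ests_in_contigs + singletons

# Unique transcripts are counted as contigs plus singletons.
unique_transcripts = contigs + singletons

# Assumption: the 17% is the Mt share of the full EST collection.
mt_percent = round(100 * mt_sequences / total_ests)

print(analyzed, assembled_plus_single, unique_transcripts, mt_percent)
```

All four derived values agree with the numbers reported on the slide (33,143 analyzed ESTs, 10,536 unique transcripts, 17% Mt sequences).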


The distribution of cluster size

[Figure: histogram of cluster size; x-axis: cluster size, last bin labelled ">50-495"]

No. of ESTs per contig = 2 to 454
Average contig length = 945 bp
Longest assembled sequence = 6,309 bp
Shortest assembled sequence = 109 bp


Functional Annotation

Blastx and Blastn

Matched ESTs (e-value < 10^-4): 5,648 (53.6%)
Unmatched: 4,888 (46.4%)
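The matched/unmatched split depends on an e-value cutoff of 10^-4. As a rough sketch of how such a cutoff might be applied, the function below keeps the best hit per query from BLAST tabular output. The slides do not say which output format or scripts were used, so the 12-column tab-separated layout (e-value in the 11th column), the function name `best_hits`, and the sample rows are all assumptions made for illustration.

```python
def best_hits(lines, max_evalue=1e-4):
    """Return {query_id: (subject_id, evalue)} for the best hit below the cutoff.

    Assumes 12-column tab-separated BLAST tabular output, where column 1 is
    the query ID, column 2 the subject ID and column 11 the e-value.
    """
    best = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if evalue >= max_evalue:
            continue  # not significant under the chosen cutoff
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best


# Invented placeholder rows, not data from the study.
sample = [
    "q1\tsubjA\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-30\t120",
    "q1\tsubjB\t80.0\t100\t20\t0\t1\t100\t1\t100\t1e-10\t80",
    "q2\tsubjC\t40.0\t50\t30\t0\t1\t50\t1\t50\t0.5\t20",
]
hits = best_hits(sample)
print(hits)
```

In this toy input, `q1` counts as matched through its stronger hit and `q2` as unmatched, because its only hit has an e-value above the cutoff.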


Matched Species

Taxonomic groups:
Arthropods 41% (Insects 32%, Crustacea 8%, Other Arthropods 1%)
Chordates 22% (Mammals 11%, Actinopterygii 6%, Other Chordates 5%)
Protists 14%
Platyhelminthes 7%
Echinoderms 4%
Bacteria 4%
All Others 8%

Arthropod genera:
Tribolium 23%, Apis 21%, Aedes 11%, Drosophila 10%, Anopheles 8%, Penaeus 6%, Litopenaeus 4%, Bombyx 2%, Homarus 2%, Marsupenaeus 2%, Fenneropenaeus 1%, Other 10%

Chordate genera:
Danio 19%, Rattus 15%, Homo 12%, Xenopus 11%, Gallus 9%, Tetraodon 8%, Mus 8%, Bos 4%, Macaca 3%, Canis 3%, Pan 2%, Others 6%


Gene Ontology Annotations

Total no. of GO hits: 5,002 (47.5%)
GO hits within "Molecular Function": 3,859
GO hits within "Biological Process": 3,427
GO hits within "Cellular Component": 3,797


Gene Ontology Annotations


Highly represented EST transcripts from P. monodon libraries

Contig number | No. of sequences | Putative gene [closest species] | Accession no. | E value
CT95 | 454 | hypothetical protein [Rattus norvegicus] | XP_001054782 | 3.00E-17
CT115 | 443 | unknown | |
CT19 | 393 | elongation factor 1-alpha [Pocillopora damicornis] | BAE66714 | 0
CT255 | 393 | thrombospondin [Penaeus monodon] | AAN17670 | 0
CT148 | 374 | hypothetical protein [Eimeria tenella str. Houghton] | XP_001238639 | 1.00E-09
CT263 | 275 | conserved hypothetical protein [Aedes aegypti] | EAT47957 | 2.00E-27
CT111 | 260 | penaeidin [Penaeus monodon] | AAQ05769 | 6.00E-39
CT151 | 254 | beta-actin [Litopenaeus vannamei] | AAG16253 | 0
CT283 | 214 | ovarian peritrophin 2 precursor [Penaeus monodon] | AAM44050 | 1.00E-169
CT242 | 161 | similar to secreted nidogen domain protein [Strongylocentrotus purpuratus] | XP_788074 | 2.00E-30
CT82 | 159 | crustin-like peptide type 2 [Marsupenaeus japonicus] | BAD15063 | 7.00E-74
CT42 | 148 | ribosomal protein S26 [Branchiostoma belcheri] | ABK32080 | 6.00E-44
CT48 | 147 | putative senescence-associated protein [Pisum sativum] | BAB33421 | 5.00E-47
CT170 | 132 | hemocyte kazal-type proteinase inhibitor [Penaeus monodon] | AAP92779 | 1.00E-167
CT100 | 131 | profilin [Branchiostoma belcheri] | Q8T938 | 2.00E-23
CT251 | 129 | ovarian peritrophin 2 precursor [Penaeus monodon] | AAM44050 | 1.00E-128
CT169 | 128 | mFLJ00348 protein [Mus musculus] | BAD90390 | 6.00E-16
CT156 | 125 | ovarian peritrophin 1 precursor [Penaeus monodon] | AAM44049 | 1.00E-170
CT219 | 124 | hemocyanin [Litopenaeus vannamei] | CAA57880 | 0


Differentially expressed immune-related genes


Mining for microsatellites

Total clones searched: 10,100
No. of ESTs containing microsatellites: 1,381 (13.7%)
No. of unique ESTs containing microsatellites: 997
Total no. of microsatellite loci: 2,165
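Microsatellite mining of this kind comes down to scanning each EST for short tandem repeats. The slides do not give the search criteria or the software used, so the following is only a minimal sketch under assumed settings: motifs of 2 to 6 bp repeated at least 5 times, with single-base runs ignored. The function name `find_ssrs` and the thresholds are hypothetical.

```python
import re

# Assumed criteria (not from the slides): a 2-6 bp motif followed by at least
# 4 further copies of itself, i.e. 5 or more tandem repeats in total.
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{4,}")


def find_ssrs(sequence):
    """Return (start, motif, repeat_count) for each tandem repeat found."""
    results = []
    for match in SSR_PATTERN.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip single-base runs such as AAAAAA
        repeats = len(match.group(0)) // len(motif)
        results.append((match.start(), motif, repeats))
    return results


print(find_ssrs("GGACACACACACTT"))
print(find_ssrs("TTGATGATGATGATGATCC"))
```

Applied to every clone in a collection, a scan like this gives the three kinds of count reported above: ESTs with at least one hit, unique ESTs among them, and the total number of loci.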


85 new polymorphic microsatellite markers were developed.

No. of alleles per locus: 3–30 (an average of 12.6 alleles/locus)

Distribution of microsatellite repeat types


SNP Prediction

Total clones subjected to prediction: 8,091
No. of clones in contigs: 3,846
No. of contigs: 356
Potential SNP sites: 595
Estimated SNP frequency: 1 per 644 bp
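SNP prediction from ESTs generally means looking for alignment columns within a contig where the reads disagree in a consistent way. The slides do not describe the prediction method, so the sketch below is a generic illustration and not the team's pipeline. It assumes the reads are already aligned and gap-padded to equal length, and it flags a column when at least two different bases are each seen in at least `min_support` reads. The function name and the threshold are hypothetical.

```python
from collections import Counter


def candidate_snps(aligned_reads, min_support=2):
    """Return [(column, {base: count})] for columns with two or more supported bases.

    Assumes all reads are aligned and padded to the same length, with '-' or
    'N' marking positions a read does not cover.
    """
    sites = []
    for col, column in enumerate(zip(*aligned_reads)):
        counts = Counter(base for base in column if base in "ACGT")
        supported = {b: n for b, n in counts.items() if n >= min_support}
        if len(supported) >= 2:
            sites.append((col, supported))
    return sites


# Toy alignment, invented for illustration.
reads = ["ACGTAC", "ACGTAC", "ACATAC", "ACATAC", "ACGTTC"]
snps = candidate_snps(reads)
print(snps)
```

Requiring each allele in more than one read is a common way to keep isolated sequencing errors (such as the lone T in column 4 of the toy alignment) from being counted as SNPs. It also means that contigs built from few clones can report few or no candidate sites.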


SNP Prediction in various putative genes

Contig name | Putative gene | Length | No. of sequences in contig | No. of SNP sites
Contig 1274 | thrombospondin | 3428 | 139 | 35
Contig 1256 | hemocyanin | 2243 | 32 | 33
Contig 1267 | ovarian peritrophin 2 precursor | 2658 | 62 | 26
Contig 1270 | unknown | 1881 | 85 | 19
Contig 1251 | hemocyte kazal-type proteinase inhibitor | 1554 | 26 | 17
Contig 1255 | unknown | 1690 | 29 | 16
Contig 1271 | anti-lipopolysaccharide factor | 723 | 86 | 14
Contig 1272 | Oryza sativa (japonica cultivar-group) cDNA clone:J023007E09 | 2918 | 96 | 12
Contig 1260 | ovarian peritrophin 2 precursor | 900 | 36 | 12
Contig 1258 | penaeidin | 662 | 34 | 12
Contig 1261 | antimicrobial peptide | 1795 | 39 | 12
Contig 1223 | elongation factor-1 alpha | 2500 | 15 | 11
Contig 1254 | thymosin isoform 1 | 1337 | 29 | 0
Contig 1249 | ribosomal protein L10 | 696 | 23 | 0
Contig 1227 | 40S ribosomal protein | 552 | 15 | 0
Contig 1207 | ATP/ADP translocase | 1302 | 13 | 0
Contig 1198 | eukaryotic initiation factor 4A | 1548 | 12 | 0
Contig 1204 | Rps16 protein | 523 | 12 | 0
Contig 1189 | oncoprotein nm23 | 721 | 11 | 0
Contig 1193 | trypsin | 762 | 11 | 0
Contig 1176 | profilin | 1082 | 10 | 0
Contig 1177 | actin depolymerizing factor | 1353 | 10 | 0
Contig 1180 | vacuolar ATP synthase subunit E | 1985 | 10 | 0
Contig 1124 | fructose 1,6-bisphosphate aldolase | 2315 | 8 | 0
Contig 1117 | ficolin | 1912 | 7 | 0
Contig 1112 | cathepsin A | 2166 | 7 | 0
Contig 1118 | polehole | 2637 | 7 | 0
Contig 1099 | chaperonin | 1875 | 6 | 0
Contig 1037 | calcium-binding protein Calnexin | 2244 | 5 | 0


Microarray Fabrication

9,991 genes on the array

7,256 gene spots (72.6%) showed acceptable signal intensity for data analysis.

Duplicated spots


Outcome and Future Prospects

40,001 high-quality EST sequences representing 10,536 unique genes

P. monodon EST database and a user-friendly web site (http://pmonodon.biotec.or.th)

A large number of potential genetic markers from microsatellites and potential SNP sites

A cDNA microarray containing 9,991 unigenes

11 international publications


This research received financial support from BIOTEC.

Dr. Prasit Palittapongarnpim

Prof. Boonsirm Withyachamnarnkul
