a generic and modular platform for automated sequence processing and annotation
![Page 1: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/1.jpg)
A generic and modular platform for automated sequence processing and annotation
Arthur Gruber
Instituto de Ciências Biomédicas, Universidade de São Paulo
AG-ICB-USP
![Page 2: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/2.jpg)
Sequence processing and annotation
• Analyzing and processing sequencing reads is a tedious and error-prone job
• Multistep process
  • All sequences are submitted to the same processing steps
  • Sequences processed by a given step are the input for the next one
• Requires different programs
• Integrated system – PIPELINE
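The core idea of this slide, that the sequences leaving one step become the input of the next, can be sketched in a few lines of Python. This is only an illustration and not EGene code: the `Read` type, the two example steps and `run_pipeline` are hypothetical names chosen for the sketch.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Read:
    """A sequencing read (hypothetical minimal representation)."""
    name: str
    bases: str


# A step takes a list of reads and returns the reads passed on to the next step.
Step = Callable[[List[Read]], List[Read]]


def uppercase_bases(reads: List[Read]) -> List[Read]:
    """Example step: normalize all bases to upper case."""
    return [Read(r.name, r.bases.upper()) for r in reads]


def drop_short(min_length: int) -> Step:
    """Example step factory: discard reads shorter than min_length."""
    def step(reads: List[Read]) -> List[Read]:
        return [r for r in reads if len(r.bases) >= min_length]
    return step


def run_pipeline(reads: List[Read], steps: Iterable[Step]) -> List[Read]:
    """Apply every step in order; the output of one step feeds the next."""
    for step in steps:
        reads = step(reads)
    return reads


reads = [Read("r1", "acgtacgt"), Read("r2", "ac")]
result = run_pipeline(reads, [uppercase_bases, drop_short(4)])
print([r.name for r in result])  # prints ['r1']
```

Because every step has the same signature, steps can be added, removed or reordered without touching the others.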
![Page 3: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/3.jpg)
Problem: how to build pipelines
• Creating scripts for new pipelines requires good programming knowledge
• Once created, most pipelines are difficult to change and customize
• Many programs must be used
  • Phred, Cross_match, Phrap, CAP3, Blast, HMMer, InterproScan, TMHMM, etc.
![Page 4: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/4.jpg)
Problem: how to build pipelines
• Each program needs a specific environment to work (e.g. directories with specific names)
• Each program produces output in different ways and formats
• Integrating programs is a hard task
![Page 5: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/5.jpg)
Solution: creating an environment to build pipelines
Requirements:
• Abstract the environment of each program
• Abstract output format
• Easily specify “coupling” of different programs
• Document how the pipe was built
  • Easy to inspect and monitor
  • Easy to store (e.g. in a database)
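One way to meet these requirements is to describe the pipeline as plain data and to hide each program behind a uniform component interface. The sketch below assumes this design for illustration only; it does not reproduce EGene's configuration format. The registry, the component names `mask_vector` and `filter_size`, and their parameters are hypothetical, and the components operate on plain strings instead of calling third-party programs.

```python
import json
from typing import Callable, Dict, List

# Registry mapping a component name to a function. Real components would wrap
# third-party programs; here they are stand-ins that work on plain strings.
COMPONENTS: Dict[str, Callable[..., List[str]]] = {}


def component(name: str):
    """Register a function as a pipeline component under the given name."""
    def register(func):
        COMPONENTS[name] = func
        return func
    return register


@component("filter_size")
def filter_size(seqs: List[str], min_length: int = 1) -> List[str]:
    """Keep only sequences with at least min_length bases."""
    return [s for s in seqs if len(s) >= min_length]


@component("mask_vector")
def mask_vector(seqs: List[str], vector: str = "") -> List[str]:
    """Replace every occurrence of the vector string by X characters."""
    if not vector:
        return seqs
    return [s.replace(vector, "X" * len(vector)) for s in seqs]


def run(spec: List[dict], seqs: List[str]) -> List[str]:
    """Run the components listed in spec, in order, coupling output to input."""
    for entry in spec:
        seqs = COMPONENTS[entry["component"]](seqs, **entry.get("params", {}))
    return seqs


# The pipeline itself is plain data: it documents how the pipe was built
# and can be stored as text, for instance in a file or a database column.
spec = [
    {"component": "mask_vector", "params": {"vector": "GGCC"}},
    {"component": "filter_size", "params": {"min_length": 6}},
]
stored = json.dumps(spec)
print(run(json.loads(stored), ["AAGGCCTT", "ACGT"]))  # prints ['AAXXXXTT']
```

Keeping the description separate from the code means the same stored text can be inspected, versioned or re-run later.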
![Page 6: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/6.jpg)
EGene
Aims and characteristics:
• Develop a platform for pipeline construction that is simple to use and configure
  • Big sequencing centers already have sophisticated pipelines, but many are not published and/or publicly available
  • They are too complex for small- and mid-sized labs
• Platform should be generic
  • Useful for any sequencing project
• Platform should provide components for the most common tasks
• New components should be easy to develop
![Page 7: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/7.jpg)
EGene: a generic platform for pipeline construction
Characteristics:
• Written in Perl
• Modular
• Easy to build specific components to interact with third-party programs
• EGene components can be integrated to fulfill user-specific needs
• CoEd – a graphical configuration editor written in Java – user-friendly interface
![Page 8: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/8.jpg)
![Page 9: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/9.jpg)
![Page 10: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/10.jpg)
![Page 11: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/11.jpg)
![Page 12: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/12.jpg)
![Page 13: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/13.jpg)
![Page 14: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/14.jpg)
![Page 15: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/15.jpg)
Sequence processing pipeline: the Eimeria ORESTES project
• Size filtering: Filter-size
• End trimming: Trim-ends.pl
• Quality filtering: Filter-quality.pl
• Vector masking and screening: Cross_Match
• Primer screening and masking: Cross_Match
• Base calling and quality assignment: Phred
• Input: chromatogram files
• Assembly: CAP3
• Human sequence filtering: Blast
• Chicken sequence filtering: Blast
• Bacterial sequence filtering: Blast
• Repetitive sequence filtering: Cross_Match
• Ribosomal sequence filtering: Cross_Match
• Plastid sequence filtering: Cross_Match
• Mitochondrial sequence filtering: Cross_Match
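The quality-related steps named on this slide (end trimming, quality filtering and size filtering) can be illustrated with a small Python sketch that works on Phred quality values. It is not the logic of Trim-ends.pl or Filter-quality.pl, which the slides do not describe; the function names, the thresholds and the order in which the steps are applied are assumptions made for the example.

```python
from typing import List, Tuple

Read = Tuple[str, List[int]]  # (bases, per-base Phred quality values)


def trim_ends(read: Read, min_quality: int) -> Read:
    """Remove bases from both ends until a base with quality >= min_quality is met."""
    bases, quals = read
    start, end = 0, len(bases)
    while start < end and quals[start] < min_quality:
        start += 1
    while end > start and quals[end - 1] < min_quality:
        end -= 1
    return bases[start:end], quals[start:end]


def passes_quality(read: Read, min_quality: int, min_fraction: float) -> bool:
    """True if at least min_fraction of the bases reach min_quality."""
    _, quals = read
    if not quals:
        return False
    good = sum(1 for q in quals if q >= min_quality)
    return good / len(quals) >= min_fraction


def passes_size(read: Read, min_length: int) -> bool:
    """True if the read still has at least min_length bases."""
    return len(read[0]) >= min_length


def process(reads: List[Read], min_quality: int = 20,
            min_fraction: float = 0.8, min_length: int = 4) -> List[Read]:
    """Trim each read, then keep it only if it passes both filters."""
    kept = []
    for read in reads:
        trimmed = trim_ends(read, min_quality)
        if (passes_quality(trimmed, min_quality, min_fraction)
                and passes_size(trimmed, min_length)):
            kept.append(trimmed)
    return kept


reads = [
    ("ACGTACGT", [5, 10, 30, 30, 30, 30, 8, 4]),  # low-quality ends get trimmed
    ("ACGTAC", [30, 30, 5, 5, 30, 30]),           # too many low-quality bases
]
print(process(reads))  # prints [('GTAC', [30, 30, 30, 30])]
```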
![Page 16: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/16.jpg)
Sequence processing and graphical report
![Page 17: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/17.jpg)
How to get EGene
Internet site: http://www.coccidia.icb.usp.br/egene
- EGene is distributed under the GNU General Public License
- EGene is Open Source
![Page 18: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/18.jpg)
![Page 19: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/19.jpg)
![Page 20: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/20.jpg)
Recent developments
• Incorporation of forks
• Enhancement of the data model – incorporation of annotation evidence
• Development of annotation components
• Evidence-based annotation
![Page 21: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/21.jpg)
![Page 22: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/22.jpg)
![Page 23: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/23.jpg)
![Page 24: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/24.jpg)
Genome annotation
• Annotation is the process of adding information to a DNA sequence.
• The information usually has a DNA coordinate.
• Features could be repeats, genes, promoters, protein domains, etc.
• Features can be cross-referenced to other databases (e.g. Pfam/Pubmed)
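A feature as described here, a piece of information tied to DNA coordinates and optionally cross-referenced to other databases, maps naturally onto a small record type. The Python sketch below is illustrative only: the `Feature` class and its field names are hypothetical and are not taken from EGene's data model, and the Pfam accession is just an example value.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Feature:
    """An annotated region of a DNA sequence (hypothetical representation)."""
    kind: str                    # e.g. "gene", "repeat", "protein_domain"
    start: int                   # 1-based, inclusive
    end: int                     # 1-based, inclusive
    strand: str = "+"
    xrefs: Dict[str, str] = field(default_factory=dict)  # database -> identifier

    def length(self) -> int:
        """Number of bases covered by the feature."""
        return self.end - self.start + 1

    def overlaps(self, other: "Feature") -> bool:
        """True if the two features share at least one base."""
        return self.start <= other.end and other.start <= self.end


gene = Feature("gene", 100, 400)
domain = Feature("protein_domain", 150, 300, xrefs={"Pfam": "PF00069"})
print(domain.length(), gene.overlaps(domain))  # prints 151 True
```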
![Page 25: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/25.jpg)
![Page 26: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/26.jpg)
Annotation file
A typical annotation file contains:
• A header with:
  • Information about the sequence
  • Organism
  • Authors
  • References
  • Comments
• A feature table containing:
  • Sequence features and coordinates
![Page 27: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/27.jpg)
Feature table format
• Flatfile format
• Format definition available at http://www.ncbi.nlm.nih.gov/projects/collab/FT/
• Covers DDBJ/EMBL/GenBank
• Defines all accepted annotation terms and hierarchy
![Page 28: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/28.jpg)
Incorporating annotation
• EGene’s data model was enriched to incorporate annotation information into the representation of the sequences
• All collected data is converted into a proprietary XML format
• The XML can be easily converted into different annotation formats: Feature Table, GFF3, etc.
• We provide some converters and new ones can be easily implemented
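To give an idea of what such a converter involves, here is a minimal Python sketch that turns an XML description of features into GFF3 lines. EGene's XML format is not shown in these slides, so the element and attribute names below (`sequence`, `feature`, `type`, `start`, `end`, `strand`, `source`, `id`) are invented for the example. The output follows the nine tab-separated columns of GFF3.

```python
import xml.etree.ElementTree as ET

# Hypothetical input; the real EGene XML schema is not reproduced here.
XML = """<sequence id="contig1">
  <feature type="gene" start="100" end="400" strand="+" source="Genscan" id="gene1"/>
  <feature type="tRNA" start="900" end="972" strand="-" source="tRNAscan-SE" id="trna1"/>
</sequence>"""


def xml_to_gff3(xml_text: str) -> str:
    """Convert the hypothetical feature XML into GFF3 text."""
    root = ET.fromstring(xml_text)
    lines = ["##gff-version 3"]
    for feat in root.findall("feature"):
        columns = [
            root.get("id"),            # seqid
            feat.get("source", "."),   # source (program that produced the feature)
            feat.get("type"),          # feature type
            feat.get("start"),         # start, 1-based
            feat.get("end"),           # end, inclusive
            ".",                       # score: not available
            feat.get("strand", "."),   # strand
            ".",                       # phase: only meaningful for CDS features
            "ID=" + feat.get("id"),    # attributes
        ]
        lines.append("\t".join(columns))
    return "\n".join(lines) + "\n"


print(xml_to_gff3(XML), end="")
```

A converter for another target format would reuse the same parsing loop and change only how each feature is written out.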
![Page 29: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/29.jpg)
Annotation components
• A comprehensive set of annotation components has been implemented:
  • ORF finding and translation
  • Tandem repeats finding: TRF, String, mREPS
  • tRNA finding: tRNAscan-SE
  • Gene Prediction: Genscan, GlimmerM, GlimmerHMM, Twinscan, Phat, ESTscan, SNAP
  • Motif finding: HMMer x Pfam, RPS-BLAST, InterproScan
  • Similarity search: BLAST
  • EST mapping: Sim4, Exonerate
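As an illustration of the simplest component in this list, ORF finding and translation, the following Python sketch scans the three forward reading frames for ATG...stop stretches and translates them with the standard genetic code. It is a generic textbook approach and not EGene's implementation; the function names and the `min_codons` parameter are assumptions, and the reverse strand is left out to keep the example short.

```python
from itertools import product
from typing import List, Tuple

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
STOPS = {"TAA", "TAG", "TGA"}


def translate(dna: str) -> str:
    """Translate complete codons; unknown codons become X."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(c, "X") for c in codons)


def find_orfs(dna: str, min_codons: int = 2) -> List[Tuple[int, int, str]]:
    """Return (start, end, protein) for ATG...stop ORFs on the forward strand.

    Coordinates are 1-based and inclusive, and the end includes the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOPS:
                protein = translate(dna[start:i])
                if len(protein) >= min_codons:
                    orfs.append((start + 1, i + 3, protein))
                start = None
    return sorted(orfs)


print(find_orfs("CCATGGCTTGGTAACC"))  # prints [(3, 14, 'MAW')]
```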
![Page 30: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/30.jpg)
Annotation components
• A comprehensive set of annotation components has been implemented:
  • Transmembrane domain finding: TMHMM, Phobius
  • Signal peptide: SignalP, Phobius
  • GPI anchor: DGPI
  • GO mapping and quantification
  • Orthology assignment and quantification: COG/KOG
  • Pathway mapping: KEGG
  • Annotation visualization with GBrowse: web inspection
  • Annotation report generation: feature table, GFF3
  • Web site generation: HTML/PHP
![Page 31: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/31.jpg)
![Page 32: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/32.jpg)
![Page 33: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/33.jpg)
![Page 34: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/34.jpg)
EGene generates annotation files that can be inspected using regular editors (Artemis, Apollo, etc.)
![Page 35: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/35.jpg)
EGene’s annotation
• EGene can generate annotation in different formats:
  • XML – local use, easy to feed a database management system
  • Feature table
    • Convenient for manual curation on Artemis
    • Ready for submission to public databases
  • GFF3
    • Current annotation interchange format
    • Manual curation/visualization on Artemis, Apollo and GMOD Genome Browser
    • Compliant with Sequence Ontology terms
![Page 36: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/36.jpg)
![Page 37: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/37.jpg)
![Page 38: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/38.jpg)
![Page 39: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/39.jpg)
![Page 40: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/40.jpg)
EGene performs GO term mapping and constructs web pages for inspection
![Page 41: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/41.jpg)
![Page 42: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/42.jpg)
![Page 43: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/43.jpg)
![Page 44: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/44.jpg)
![Page 45: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/45.jpg)
![Page 46: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/46.jpg)
![Page 47: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/47.jpg)
![Page 48: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/48.jpg)
![Page 49: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/49.jpg)
![Page 50: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/50.jpg)
![Page 51: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/51.jpg)
![Page 52: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/52.jpg)
![Page 53: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/53.jpg)
![Page 54: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/54.jpg)
![Page 55: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/55.jpg)
![Page 56: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/56.jpg)
![Page 57: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/57.jpg)
![Page 58: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/58.jpg)
![Page 59: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/59.jpg)
![Page 60: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/60.jpg)
![Page 61: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/61.jpg)
EGene performs an integrated and quantitative orthology analysis (COG/KOG) and constructs web pages
![Page 62: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/62.jpg)
![Page 63: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/63.jpg)
![Page 64: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/64.jpg)
![Page 65: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/65.jpg)
![Page 66: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/66.jpg)
![Page 67: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/67.jpg)
![Page 68: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/68.jpg)
![Page 69: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/69.jpg)
![Page 70: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/70.jpg)
![Page 71: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/71.jpg)
![Page 72: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/72.jpg)
![Page 73: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/73.jpg)
![Page 74: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/74.jpg)
![Page 75: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/75.jpg)
![Page 76: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/76.jpg)
![Page 77: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/77.jpg)
![Page 78: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/78.jpg)
![Page 79: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/79.jpg)
![Page 80: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/80.jpg)
![Page 81: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/81.jpg)
![Page 82: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/82.jpg)
EGene automatically constructs a full web site for evidence inspection
![Page 83: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/83.jpg)
![Page 84: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/84.jpg)
![Page 85: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/85.jpg)
![Page 86: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/86.jpg)
![Page 87: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/87.jpg)
![Page 88: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/88.jpg)
![Page 89: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/89.jpg)
![Page 90: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/90.jpg)
![Page 91: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/91.jpg)
![Page 92: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/92.jpg)
![Page 93: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/93.jpg)
![Page 94: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/94.jpg)
![Page 95: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/95.jpg)
![Page 96: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/96.jpg)
Current developments
• Full integration with a database management system
• Automated task distribution management across multiple processing nodes
• Development of a graphical interface for evidence inspection and manual curation
• “Intelligent” annotation – use of probabilistic methods to evaluate evidence and designate protein functions
![Page 97: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/97.jpg)
Why use EGene2?
• Ideal for small- and mid-sized laboratories
  • Genome and EST sequencing projects
• Conceived for biologists
  • Does not require programming skills
• Generic tool for any sequencing/annotation project – customizable to each user’s requirements
• Very easy to implement new components
• Multiplatform - MacOS, UNIX, Linux, etc.
• Well documented – HOWTOs, tutorials, example datasets available
• Easy configuration
  • CoEd - Application with a GUI for pipeline construction
  • Generic pipeline templates provided
![Page 98: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/98.jpg)
Research team
Prof. Alan M. Durham – IME-USP
Annotation
Milene Ferro – ICB-USP
Ricardo Yamamoto Abe – IME-USP
Luiz Thiberio Rangel – ICB-USP
Sequence pre-processing
André Yoshiaki Kashiwabara – IME-USP
Fernando Tadashi G. Matsunaga – ICB-USP
Paulo Henrique Ahagon – ICB-USP
Leonardo Varuzza – ICB-USP
![Page 99: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/99.jpg)
Financial Support
• FAPESP - São Paulo State Science Foundation
• CNPq - National Research Council
![Page 100: A generic and modular platform for automated sequence processing and annotation](https://reader035.vdocuments.us/reader035/viewer/2022062520/568157cf550346895dc5576a/html5/thumbnails/100.jpg)
Thanks for your attention