TrypDB Analysis Workflow


Upload: kamea

Post on 12-Jan-2016

TRANSCRIPT

Page 1: TrypDB Analysis Workflow

TrypDB Analysis Workflow

Common Analysis

T Cruzi Analysis

T Brucei Analysis

L Braziliensis Analysis

L Infantum Analysis

L Major Analysis

Mercator

Page 2: TrypDB Analysis Workflow

Common Analysis

Init Workflow Home Dir on Cluster

Init User/Group/Project

Copy PDB from Downloads

Make Data Dir

Mirror Common Data Dir to Cluster

Copy NRDB from Downloads

Make NRDB Short Defline

Make Mercator Data Dir

Init apiSiteFiles

Run Tuning Manager

Page 3: TrypDB Analysis Workflow

Organism Analysis Workflow

Genome Analysis

Proteome Analysis

Make Data Dir

Make Gff File

Run Full Record Dump

Init apiSiteFiles DownloadSite Organism Dir

Page 4: TrypDB Analysis Workflow

Genome Analysis

Extract Genome Seqs

Find Tandem Repeats

Load Tandem Repeats

Copy Genomic Seqs to Cluster

BLASTX NRDB

Filter Sequences

Load Low Complexity Seqs

Splign

Make Data Dir

Dump and Block Mixed Genome Seqs

Calculate Residues for NASequence

Make Mercator Gff File

tRNA Scan

DoTS Assemblies

ORFs

Misc DownloadSite Files

Correct Reading Frame in Mercator Gff File

Page 5: TrypDB Analysis Workflow

Proteome Analysis

Calculate Protein Seq

Molecular Weight

Molecular Weight Min Max

Isoelectric Point

Extract Protein Seqs

Filter Seqs

Load Low Complexity Seqs

Copy Protein Seqs to Cluster

BLASTP NRDB

Psipred

InterproScan

Run TMHMM

Load TMHMM

Run SignalP

Load SignalP

Epitopes

Find Seq Identity to NRDB

Load NRDB xrefs

BLASTP PDB

Make Data Dir

Make Annotated Protein Download File

Update TaxonId for ExternalAASequence
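
The slide names the Molecular Weight and Molecular Weight Min Max steps without describing them. As a rough illustration only, the sketch below computes an average molecular weight from a protein sequence and a min/max range when ambiguity codes are present. The function names, the handling of B, Z and X, and the idea that Min Max refers to ambiguous residues are all assumptions, not the workflow's actual implementation.

```python
# Average residue masses in daltons for the 20 standard amino acids
# (free amino acid mass minus one water, as lost in peptide bond formation).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is added back for the free N and C termini

# Assumed expansion of ambiguity codes (hypothetical choice for this sketch).
AMBIGUOUS = {"B": "DN", "Z": "EQ", "X": "".join(RESIDUE_MASS)}


def _clean(seq: str) -> str:
    """Upper-case the sequence and drop a trailing stop symbol."""
    seq = seq.strip().upper().rstrip("*")
    if not seq:
        raise ValueError("empty protein sequence")
    return seq


def molecular_weight(seq: str) -> float:
    """Average molecular weight of an unambiguous, unmodified protein."""
    return sum(RESIDUE_MASS[aa] for aa in _clean(seq)) + WATER


def molecular_weight_min_max(seq: str) -> tuple[float, float]:
    """Lowest and highest possible weight when ambiguity codes occur."""
    low = high = WATER
    for aa in _clean(seq):
        options = AMBIGUOUS.get(aa, aa)  # a plain residue maps to itself
        masses = [RESIDUE_MASS[o] for o in options]
        low += min(masses)
        high += max(masses)
    return low, high
```

For a sequence with no ambiguity codes the two bounds coincide with `molecular_weight`, so a single stored value would suffice in that case.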

Page 6: TrypDB Analysis Workflow

DoTS Assemblies

Make and Block Candidate Assem Seqs

Make and Block DoTS Assemblies

Map Candidate Assem Seqs to Genome

Map DoTS Assemblies to Genome

Run Tuning Manager

Make DoTS Assemblies Download File

Page 7: TrypDB Analysis Workflow

Misc DownloadSite Files

Make Derived CDS Download File

Make EST Download File

Make Transcript Download File

Make Codon Usage Download File

Page 8: TrypDB Analysis Workflow

ORFs

Make ORFs

Load ORFs

Run Tuning Manager

Make ORF Download File

Make ORFNa Download File

Page 9: TrypDB Analysis Workflow

BLAST

Make data dir

Start blast

Wait for cluster

Copy files from cluster

Extract IDs from Blast result

Load Subject subset

Load Result

Optional steps (runtime test)

filter by subject

Page 10: TrypDB Analysis Workflow

Psipred

fix protein IDs for psipred

create psipred Task dir

copy Data Dir to cluster

start psipred on cluster

wait for cluster

copy psipred Files from cluster

fix psipred File names

make Alg Inv

load psipred

run pfilt on nrdb

Make data dir

Page 11: TrypDB Analysis Workflow

Splign

runSplign

Extract subject Sequence Alt defline

insertSplign

Extract query Sequence Alt defline

Make Data Dir

Page 12: TrypDB Analysis Workflow

Epitopes

Make Data Dir

Make Blast Dir

Make proteins file

simple defline

Format NCBI blast file

Create Epitopes map file

Load Epitopes map

Page 13: TrypDB Analysis Workflow

InterproScan

Make Data Dir

Make InterproScan Cluster Task Input Dir

Mirror InterproScan to Cluster

Start Cluster Task

Wait for Cluster Task

Mirror InterproScan From Cluster

Insert IprScan Results

Make Interpro Download File

Page 14: TrypDB Analysis Workflow

Make and Block Candidate Assembly Seqs

Make Candidate Assembly Seqs

Extract Candidate Assembly Seqs

Make Cluster Task Input Dir

Mirror To Cluster

Start Cluster Task

Wait for Cluster Task

Mirror From Cluster

Make Data Dir
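
Several slides repeat the same cluster pattern: make an input dir, mirror it to the cluster, start the task, wait for it, and mirror the results back. The sketch below shows one way that sequence could be driven. It is a generic illustration, not the workflow's own code: `run_cluster_task` and its four callables are hypothetical names, and polling at a fixed interval is an assumption.

```python
import time
from typing import Callable, Optional


def run_cluster_task(
    mirror_to: Callable[[], None],
    start: Callable[[], None],
    is_done: Callable[[], bool],
    mirror_from: Callable[[], None],
    poll_seconds: float = 60.0,
    timeout_seconds: Optional[float] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Run one mirror/start/wait/mirror cycle.

    All four callables are supplied by the caller; how they copy files or
    submit jobs is outside this sketch.
    """
    mirror_to()   # "Mirror To Cluster"
    start()       # "Start Cluster Task"

    waited = 0.0
    while not is_done():  # "Wait for Cluster Task"
        if timeout_seconds is not None and waited >= timeout_seconds:
            raise TimeoutError(f"cluster task not done after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds

    mirror_from()  # "Mirror From Cluster"
```

Passing the steps as callables keeps the waiting logic in one place, so the same driver could serve the BLAST, Psipred, InterproScan, Gf Client and Repeat Mask tasks that follow the same pattern.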

Page 15: TrypDB Analysis Workflow

Map Candidate Assembly Seqs to Genome

Extract Genomic Seqs into Separate Fasta Files

Make Data Dir

Make Gf Client Cluster Task Input Dir

Mirror Gf Client to Cluster

Mirror Gf Client From Cluster

Insert BlatAlignmentQuality Table with Xml

Insert BLAT Alignment

Set best BLAT Alignment

Start GF Cluster Task

Wait for GF Cluster Task

Page 16: TrypDB Analysis Workflow

Make and Block Assemblies

Cluster Transcripts by Genome Alignment

Put Unaligned Transcripts into One Cluster

Assemble Transcripts

Extract Assemblies

Make Data Dir

Make Repeat Mask Cluster Task Input Dir

Mirror Assembly Repeat Mask To Cluster

Start RM Task on Cluster

Wait for RM Cluster Task

Page 17: TrypDB Analysis Workflow

Map Assemblies to Genome

Make Data Dir

Make Assembly Gf Client Cluster Task Input Dir

Mirror Assembly Gf Client to Cluster

Start GF Task on Cluster

Wait for GF Cluster Task

Mirror Gf Client From Cluster

Insert BLAT Alignment

Set best BLAT Alignment

Update Assembly Source Id

Copy Genomic Separate Fasta Files

Page 18: TrypDB Analysis Workflow

Dump Mixed Genomic Sequences

Make Repeat Mask Cluster Task Input Dir

Mirror Repeat Mask To Cluster

Start Cluster Task

Wait for Cluster Task

Mirror Virtual Sequence Repeat Mask From Cluster

Make Data Dir

Dump and Block Mixed Genome Seqs

Move Blocked Seq File to Mercator Data Dir

Push Mixed Genomic Seq File to Download File Dir

Page 19: TrypDB Analysis Workflow

Mercator

Run MercatorMavid

Create External Database and Release for Synteny from Mercator

Insert Mercator Synteny Spans