TrypDB Analysis Workflow

Upload: cian

Post on 12-Jan-2016

TRANSCRIPT

Page 1: TrypDB Analysis Workflow

TrypDB Analysis Workflow

Common Analysis

T Cruzi Analysis

T Brucei Analysis

L Braziliensis Analysis

L Infantum Analysis

L Major Analysis

Tc Specific Analysis

Tb Specific Analysis

Lm Specific Analysis

Page 2: TrypDB Analysis Workflow

PlasmDB Analysis Workflow

Common Analysis

Pf Analysis

Pv Analysis

Py Analysis

Pc Analysis

Pk Analysis

Pg Analysis

Pv Mito Analysis

Pr Analysis

Pf Specific Analysis

Py Specific Analysis

Pv Specific Analysis

Pc Specific Analysis

Pk Specific Analysis

Pv Specific Analysis

Pg Specific Analysis

Pr Specific Analysis

Page 3: TrypDB Analysis Workflow

ToxoDB Analysis Workflow

Common Analysis

TgME49 Analysis

TgGT1 Analysis

TgVEG Analysis

TgCkUg2 Analysis

TgApicoplast Analysis

Ncaninum Analysis

TgME49 Specific Analysis

TgVEG Specific Analysis

TgGT1 Specific Analysis

TgCkUg2 Specific Analysis

TgApicoplast Specific Analysis

Ncaninum Specific Analysis

Page 4: TrypDB Analysis Workflow

Common Analysis

Init Workflow Home Dir on Cluster

Init User/Group/Project

Copy PDB from Downloads

Make Data Dir

Mirror Common Data Dir to Cluster

Copy NRDB from Downloads

Make NRDB Short Defline

Make Mercator Data Dir

Init apiSiteFiles WebServices Dirs

Insert BlatAlignmentQuality Table with Xml

Extract Isolate Seqs

Page 5: TrypDB Analysis Workflow

Organism Analysis Workflow

Genome Analysis

Proteome Analysis

Mirror Data Dir to Cluster

Init apiSiteFiles DownloadSite Organism Dir

Make Data Dir

Make and Format Download Files

Run Tuning Manager

Optional steps

Page 6: TrypDB Analysis Workflow

Genome Analysis

Extract Genome Seqs

Find Tandem Repeats

Load Tandem Repeats

Copy Genomic Seqs to Cluster

BLASTX NRDB

Filter Sequences

Load Low Complexity Seqs

Make Data Dir

tRNA Scan

Load ORFs

Make ORFs

Make and Block Candidate Assem Seqs

Make and Block DoTS Assemblies

Map Candidate Assem Seqs to Genome

Map DoTS Assemblies to Genome

BLASTN Isolates

Page 7: TrypDB Analysis Workflow

Proteome Analysis

Calculate AASeq Attributes

Extract Protein Seqs

Filter Seqs

Load Low Complexity Seqs

Copy Protein Seqs to Cluster

BLASTP NRDB

Psipred

InterproScan

Run TMHMM

Load TMHMM

Run SignalP

Load SignalP

Epitopes

Find Seq Identity to NRDB

Load NRDB xrefs

BLASTP PDB

Make Data Dir

Run ExportPred

Load ExportPred

Page 8: TrypDB Analysis Workflow

BLAST

Make Data Dir

Start Blast

Wait for Cluster

Copy Files from Cluster

Extract IDs from Blast Result

Load Subject Subset

Load Result

Optional steps (runtime test)

Filter by Subject

Mirror Data Dir to Cluster

Make Task Input Dir
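
The boxes on this slide read as a small dependency graph whose arrows were lost in the transcript. The following is a minimal sketch of how such a sub-workflow could be ordered, not the workflow engine actually used here. The step names follow the slide; the dependency edges and the helper name `execution_order` are assumptions made for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each step maps to the set of steps it is assumed to depend on.
# The names come from the slide; the edges are a guess, because the
# transcript does not preserve the arrows between boxes.
blast_steps = {
    "Make Data Dir": set(),
    "Make Task Input Dir": {"Make Data Dir"},
    "Mirror Data Dir to Cluster": {"Make Task Input Dir"},
    "Start Blast": {"Mirror Data Dir to Cluster"},
    "Wait for Cluster": {"Start Blast"},
    "Copy Files from Cluster": {"Wait for Cluster"},
    "Extract IDs from Blast Result": {"Copy Files from Cluster"},
    "Load Subject Subset": {"Extract IDs from Blast Result"},
    "Load Result": {"Load Subject Subset"},
}


def execution_order(steps):
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(steps).static_order())


order = execution_order(blast_steps)
for position, step in enumerate(order, start=1):
    print(f"{position}. {step}")
```

Declaring only the dependencies and deriving the order keeps the optional steps easy to add or drop: an optional box such as "Filter by Subject" would be one more entry in the mapping.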

Page 9: TrypDB Analysis Workflow

tRNA Scan

Make Data Dir

Start tRNA Scan

Wait for Cluster

Copy Files from Cluster

Mirror Data Dir to Cluster

Make Task Input Dir

Load tRNA Scan

Page 10: TrypDB Analysis Workflow

Psipred

Fix Protein IDs for Psipred

Create Psipred Task Dir

Copy Data Dir to Cluster

Start Psipred on Cluster

Wait for Cluster

Copy Psipred Files from Cluster

Fix Psipred File Names

Make Alg Inv

Load Psipred

Run pfilt on NRDB

Make Data Dir

Page 11: TrypDB Analysis Workflow

Epitopes

Make Data Dir

Make Blast Dir

Format NCBI Blast File

Create Epitopes Map File

Load Epitopes Map

Page 12: TrypDB Analysis Workflow

InterproScan

Make Data Dir

Make InterproScan Cluster Task Input Dir

Mirror InterproScan to Cluster

Start Cluster Task

Wait for Cluster Task

Mirror InterproScan From Cluster

Insert IprScan Results
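
The "Start Cluster Task" and "Wait for Cluster Task" pair recurs on most slides. The transcript does not say how the wait is implemented; the sketch below assumes a simple polling loop with a timeout. The function name `wait_for_cluster_task` and the `is_done` callback are hypothetical and stand in for whatever check the real workflow performs.

```python
import time


def wait_for_cluster_task(is_done, poll_seconds=30.0, timeout_seconds=3600.0,
                          sleep=time.sleep, clock=time.monotonic):
    """Poll is_done() until it returns True or the timeout expires.

    is_done is a hypothetical callback (for example, a check for a
    completion marker on the cluster). Returns True if the task
    finished, False if the timeout was reached first.
    """
    deadline = clock() + timeout_seconds
    while True:
        if is_done():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_seconds)


# Usage with a stand-in check that reports completion on the third poll.
polls = {"count": 0}


def fake_check():
    polls["count"] += 1
    return polls["count"] >= 3


finished = wait_for_cluster_task(fake_check, poll_seconds=0.0, timeout_seconds=5.0)
print("finished:", finished, "after", polls["count"], "polls")
```

Passing `sleep` and `clock` as parameters is a design choice for testability; a real run would leave them at their defaults.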

Page 13: TrypDB Analysis Workflow

Make and Block Candidate Assembly Seqs

Make Candidate Assembly Seqs

Extract Candidate Assembly Seqs

Make Cluster Task Input Dir

Mirror To Cluster

Start Cluster Task

Wait for Cluster Task

Mirror From Cluster

Make Data Dir

Make Candidate Assembly Seqs from Predicted Transcripts

Optional steps (runtime test)

Page 14: TrypDB Analysis Workflow

Map Candidate Assembly Seqs to Genome

Extract Genomic Seqs into Separate Fasta Files

Make Data Dir

Make Gf Client Cluster Task Input Dir

Mirror Gf Client to Cluster

Mirror Gf Client From Cluster

Insert BLAT Alignment

Set Best BLAT Alignment

Start GF Cluster Task

Wait for GF Cluster Task

Run Nib On Cluster

Page 15: TrypDB Analysis Workflow

Cluster Transcripts by Genome Alignment

Put Unaligned Transcripts into One Cluster

Assemble Transcripts

Extract Assemblies

Make Data Dir

Make Repeat Mask Cluster Task Input Dir

Mirror Assembly Repeat Mask To Cluster

Start RM Task on Cluster

Wait for RM Cluster Task

Make and Block Assemblies

Page 16: TrypDB Analysis Workflow

Make Data Dir

Make Assembly Gf Client Cluster Task Input Dir

Mirror Assembly Gf Client to Cluster

Start GF Task on Cluster

Wait for GF Cluster Task

Mirror Gf Client From Cluster

Insert BLAT Alignment

Set Best BLAT Alignment

Update Assembly Source Id

Map Assemblies to Genome

Page 17: TrypDB Analysis Workflow

Mercator

Run MercatorMavid

Create External Database and Release for Synteny from Mercator

Insert Mercator Synteny Spans

Make Mercator Gff File

Correct Reading Frame in Mercator Gff File

Page 18: TrypDB Analysis Workflow

Pfalciparum Specific Analysis

Run MUMMER for SNPs (Su_SNPs)

Map SAGE Tags to Genome Seqs

Run MUMMER for SNPs (Broad_SNPs)

Run MUMMER for SNPs (Winzeler_SNPs)

Run MUMMER for SNPs (SangerPf_SNPs)

Run MUMMER for SNPs (SangerPr_SNPs)

Run MUMMER for SNPs (Combined_SNPs)

Load Anti-codon

Map Oligos

Page 19: TrypDB Analysis Workflow

TgME49 Specific Analysis

Run MUMMER for SNPs (Stanford_SNPs)

Map SAGE Tags to Genome Seqs

Run MUMMER for SNPs (Sibley_SNPs)

Run MUMMER for SNPs (Nucmer_SNPs)

Load Anti-codon

Blastn Genbank Isolates

Blastn Sibley Isolates

Map BAC Ends Seqs to Genome

Map Cosmid Ends Seqs to Genome

Map Oligos

ChIP-chip

Page 20: TrypDB Analysis Workflow

Extract Annotated Transcript Seqs

Extract Oligo Seqs

Copy Transcripts Seqs to Cluster

Copy Oligo Seqs to Cluster

BLAST Oligo Against Transcripts

BLAST Oligo Against Genomic Seqs

Map Oligos

Page 21: TrypDB Analysis Workflow

Extract Isolate Seqs

Copy Isolate Seqs to Cluster

BLAST Isolate Against Genomic Seqs

Blastn Isolate Seqs against Genome Seqs

Page 22: TrypDB Analysis Workflow

Extract BAC Ends Seqs

Map BAC Ends Seqs to Genome

Map BAC Ends Seqs to Genome Seqs

Make Data Dir

Make Repeat Mask Cluster Task Input Dir

Mirror BAC Ends Repeat Mask To Cluster

Start RM Task on Cluster

Wait for RM Cluster Task

Page 23: TrypDB Analysis Workflow

Map ChIP Seqs to NA Sequence

Load Mapping Results

ChIP-chip

Page 24: TrypDB Analysis Workflow

Pyoelii Specific Analysis

Map Oligos

Page 25: TrypDB Analysis Workflow

Run MUMMER for SNPs

Run Mummer

Convert Mummer Result to Gff file

Load Mummer Result

Copy Gff file from Download

Extract Fasta file from Gff file

Page 26: TrypDB Analysis Workflow

Map SAGE Tags to Genome Seqs

Load SAGE Tag Mapping Results

Create SAGE Tag Normalization Files

Load SAGE Tag Normalization Results

Extract SAGE Tag Seqs

Map SAGE Tag Seqs to Genome Seqs

Page 27: TrypDB Analysis Workflow

Dump Mixed Genomic Sequences

Make Repeat Mask Cluster Task Input Dir

Mirror Repeat Mask To Cluster

Start Cluster Task

Wait for Cluster Task

Mirror Virtual Sequence Repeat Mask From Cluster

Make Data Dir

Dump and Block Mixed Genome Seqs

Move Blocked Seq File to Mercator Data Dir