rna rocket demo
DESCRIPTION
A how-to for using the RNA-Rocket site.
TRANSCRIPT
RNA-Seq Analysis Using Pathogen Portal’s RNA-Seq Analysis Pipeline, RNA-Rocket
Overview
- Creating an account
- Exploring the site
- Getting data
- Checking quality
- Starting analysis
- Further analysis
Create an account
Step 1: Create a login account:
I. Go to http://pathogenportal.org
II. Click on RNA Rocket.
III. Click on Create account.
IV. Fill in the required information.
Exploring the site: Launch Pad
- Interactive concept diagram
- Task oriented menu system
- Designed for novice users
Exploring the site: Launch Pad → Trim Reads
- User guide to why, what, and how
- Details required inputs and expected outputs
- Helps organize files into project spaces
Exploring the site: Project View
- View existing projects
- Download files
- View metadata
- Stream to BRC sites
- Manage space allocation
- Share projects
Exploring the site: Shared Data → Published Projects
- View shared projects
- Import into your project space
- Share with collaborators
- Provide data for presentations…
Getting Data
A. Importing shared data
B. Transferring ENA/SRA data
C. Uploading your own
Getting Data: A. Importing shared data
1. Click on “Shared Data”, then “Published Projects”
2. Click on the title of the Project you wish to import
3. Click “Import History” to import the Project into your Project View
Getting Data: B. Transferring ENA/SRA data
1. Navigate to the ‘Launch Pad’ page and click the ‘Get fastq files from SRA/ENA’ link
2. Click the ‘Continue’ button
3. Search for the SRA or ENA accession in the search box provided. Alternatively, search for GEO, ArrayExpress, SRA, or ENA identifiers in the global search box at the top of the page.
4. Click on the title of the Nucleotide Sequences Record you wish to import.
5. On the subsequent ENA record page, click the ‘File’ link in the ‘Fastq files (galaxy)’ column for each file you wish to transfer.
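For readers who prefer to look up FASTQ locations outside the browser, the following is a minimal sketch and not part of RNA-Rocket itself. It assumes the ENA Portal API `filereport` endpoint and its `fastq_ftp` field behave as they did at the time of writing; the endpoint URL, the field names, and the helper names `filereport_url` and `parse_filereport` are assumptions introduced here for illustration.

```python
import csv
import io
from typing import Dict, List
from urllib.parse import urlencode

# Assumed endpoint of the ENA Portal API (not taken from the RNA-Rocket slides).
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def filereport_url(accession: str) -> str:
    """Build a file-report query URL for a run, experiment, or study accession."""
    params = {
        "accession": accession,
        "result": "read_run",
        "fields": "run_accession,fastq_ftp",  # assumed field names
    }
    return ENA_FILEREPORT + "?" + urlencode(params)


def parse_filereport(tsv_text: str) -> Dict[str, List[str]]:
    """Map each run accession to its FASTQ URLs from a tab-separated report."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    result: Dict[str, List[str]] = {}
    for row in reader:
        # Multiple files (e.g. paired-end reads) are assumed to be ';'-separated.
        paths = [p for p in (row.get("fastq_ftp") or "").split(";") if p]
        # Paths are assumed to come without a scheme, so add one if missing.
        result[row["run_accession"]] = [
            p if "://" in p else "ftp://" + p for p in paths
        ]
    return result
```

The URL returned by `filereport_url` can be opened in a browser; the FASTQ URLs extracted by `parse_filereport` could then be pasted into the ‘URL/Text’ box on the ‘Upload Files’ page.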
Getting Data: C. Uploading your own data
1. To upload data from your computer or a remote computer, click the ‘Upload Files’ link on the Launch Pad page.
2. On the subsequent page, use the ‘Choose File’ button to upload files from your own computer (limited to 2 GB), the ‘URL/Text’ box to paste URLs for files on remote computers, or the FTP instructions to transfer files over FTP (better for larger files).
[Screenshot of the upload page, with callouts: “Paste the FastQ URLs here”, “Choose files from your computer here”, “Instructions for using FTP”]
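The choice between browser upload and FTP comes down to file size. As a rough sketch, the check below suggests a method for a given file. It assumes the stated limit means 2 × 1024³ bytes; the names `BROWSER_UPLOAD_LIMIT_BYTES`, `suggest_upload_method`, and `suggest_for_file` are hypothetical helpers, not part of the site.

```python
import os

# Assumption: the browser upload limit quoted on the site is 2 * 1024**3 bytes.
BROWSER_UPLOAD_LIMIT_BYTES = 2 * 1024**3


def suggest_upload_method(size_bytes: int) -> str:
    """Return 'browser' for files within the limit, otherwise 'ftp'."""
    return "browser" if size_bytes <= BROWSER_UPLOAD_LIMIT_BYTES else "ftp"


def suggest_for_file(path: str) -> str:
    """Look up the size of a local file and suggest an upload method."""
    return suggest_upload_method(os.path.getsize(path))
```

Compressed FASTQ files from a full sequencing run often exceed this limit, in which case FTP is likely the more practical route.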
Checking quality
Read base quality can affect how the reads map to the genome. Different sequencing technologies can have different quality and base-call error profiles. Depending on the quality of the base calls, you may wish to trim your read sequences or adjust the alignment parameters to account for this.
Two tools are available: FastQC, for checking the average base call quality in a FASTQ file, and SAMStat, for checking the number of reads aligned.
An example is provided in Shared Data → Published Projects → RNASeq_QC_Demo. It shows two classes of files:
1. the original reads
2. a trimmed version of those reads, with low-quality ends removed
For both classes we give the FastQC and SAMStat reports.
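To make "average base call quality" concrete, here is a minimal sketch of the per-position mean that a quality report summarizes. It is not FastQC's implementation. It assumes four-line FASTQ records and Phred+33 quality encoding; the function names are introduced here for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def read_fastq(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality_string) from 4-line FASTQ records."""
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # tolerate blank lines between records
        seq = next(it, "").strip()
        next(it, "")  # skip the '+' separator line
        qual = next(it, "").strip()
        yield header.strip()[1:], seq, qual


def mean_quality_by_position(lines: Iterable[str], offset: int = 33) -> List[float]:
    """Return the mean Phred score at each read position (assumes Phred+33)."""
    totals: Dict[int, int] = defaultdict(int)
    counts: Dict[int, int] = defaultdict(int)
    for _, _, qual in read_fastq(lines):
        for pos, char in enumerate(qual):
            totals[pos] += ord(char) - offset
            counts[pos] += 1
    return [totals[pos] / counts[pos] for pos in sorted(counts)]
```

Passing an open FASTQ file handle to `mean_quality_by_position` gives one mean score per read position; a drop toward the end of the list is the pattern that trimming is meant to address.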
Original fastq & analysis
Trimmed fastq & analysis
Click the eye icon to see the contents of a file or report.
From the FastQC report we see that the average base call quality is improved by trimming the reads.
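The idea behind removing low-quality ends can be sketched as follows. This is a simplified illustration, not the algorithm used by the Trim Reads tool: it strips bases from both ends of a read until it meets one at or above a threshold. The threshold of 20 and the Phred+33 encoding are assumptions.

```python
from typing import Tuple


def trim_low_quality_ends(
    seq: str, qual: str, min_q: int = 20, offset: int = 33
) -> Tuple[str, str]:
    """Remove bases below min_q from both ends of a read (assumes Phred+33)."""
    scores = [ord(char) - offset for char in qual]
    start, end = 0, len(scores)
    # Advance past low-quality bases at the 5' end.
    while start < end and scores[start] < min_q:
        start += 1
    # Step back past low-quality bases at the 3' end.
    while end > start and scores[end - 1] < min_q:
        end -= 1
    return seq[start:end], qual[start:end]
```

Because only the ends are removed, low-quality bases in the middle of a read are kept; a read made up entirely of low-quality bases comes back empty and would normally be discarded.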
From the SAMStat report we see that the number of unaligned reads only shows a slight improvement with trimming. Modern alignment software is often able to account for the base call quality in determining alignments. Also of note is that the ‘Mean Base Quality’ profile is not substantially different for MAPQ >=30 and MAPQ < 3.
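The counts behind such a report can be approximated from a SAM file directly. The sketch below is not SAMStat's code: it relies only on the SAM text format (FLAG in column 2, where bit 0x4 marks an unmapped read, and MAPQ in column 5) and groups reads into a few illustrative bins chosen here. BAM files are binary and would need converting to SAM text first.

```python
from collections import Counter
from typing import Iterable


def summarize_sam(lines: Iterable[str]) -> Counter:
    """Count unmapped reads and mapped reads per (illustrative) MAPQ bin."""
    counts: Counter = Counter()
    for line in lines:
        if line.startswith("@") or not line.strip():
            continue  # skip header lines and blank lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # column 2: bitwise FLAG
        mapq = int(fields[4])  # column 5: mapping quality
        if flag & 0x4:
            counts["unmapped"] += 1
        elif mapq >= 30:
            counts["MAPQ>=30"] += 1
        elif mapq < 3:
            counts["MAPQ<3"] += 1
        else:
            counts["3<=MAPQ<30"] += 1
    return counts
```

Running this on the alignments of the original and the trimmed reads and comparing the "unmapped" counts gives a rough version of the comparison described above.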
Starting Analysis
Test datasets for starting an alignment and transcript assembly job are provided at Shared Data → Published Projects → RNASeq_Run_Demo.
- To begin, import this history into your own workspace using the ‘Import History’ functionality demonstrated previously.
- After the Project is imported, it should appear in your ‘Project View’.
- Proceed to the ‘Launch Pad’ page and click the ‘Align Reads & Assemble Transcripts’ link.
- On the next page, choose the type of analysis (we are analyzing a paired-end prokaryotic sample).
- Next, select the target project from the drop-down menu. You should have a project called ‘imported: RNASeq_Run_Demo’. Once you select the correct project, you should see the two FASTQ files listed.
- Next, click ‘Continue’.
The following page allows you to configure the parameters for the various tools that will run as part of the analysis you have selected. Here we describe the bare minimum for running a job; more care should be taken when customizing the analysis for your own data.
First, populate the Upstream and Downstream Read Files with READ1_SHORT.fastq and READ2_SHORT.fastq, respectively.
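With your own paired-end data, it can help to confirm that the upstream and downstream files really belong together before assigning them. The sketch below is an optional local check, not a step of the RNA-Rocket workflow. It assumes strict four-line FASTQ records and that mates share a read ID apart from an optional `/1` or `/2` suffix.

```python
from itertools import zip_longest
from typing import Iterable, Iterator


def read_ids(lines: Iterable[str]) -> Iterator[str]:
    """Yield the read ID from the header of each 4-line FASTQ record."""
    for index, line in enumerate(lines):
        if index % 4 == 0 and line.strip():
            read_id = line.strip()[1:].split()[0]
            # Drop the mate suffix so that both files can be compared.
            if read_id.endswith(("/1", "/2")):
                read_id = read_id[:-2]
            yield read_id


def pairs_are_consistent(
    upstream_lines: Iterable[str], downstream_lines: Iterable[str]
) -> bool:
    """Return True if both files list the same read IDs in the same order."""
    pairs = zip_longest(read_ids(upstream_lines), read_ids(downstream_lines))
    return all(a is not None and a == b for a, b in pairs)
```

For the demo files this would be called as `pairs_are_consistent(open("READ1_SHORT.fastq"), open("READ2_SHORT.fastq"))` on local copies; a result of `False` suggests the files are mismatched or one of them is truncated.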
Select the reference organism ‘Salmonella enterica subsp. Typhimurium 14028S’ from the dropdown. It may take a moment for the dropdown to appear once clicked due to the number of organisms.
Select ‘Run Workflow’ at the bottom of the page
If the workflow is successfully queued you should see the following
Next, go to the ‘Project View’ page to see the status of your jobs. In the rightmost panel, grey jobs are pending, yellow jobs are running, and green jobs are complete.
Further Analysis
Test datasets have been provided for testing the RNA-Seq visualization capabilities at PATRIC. Navigate to Shared Data → Published Projects → RNASeq_Analysis_Demo.
Each of the files displayed has a visualization component on the PATRIC site. To open it, first click the dataset title to expand the dataset section, then click the ‘display at PATRIC’ link.
Displaying BAM at PATRIC
Read Quality View
Expression View
Displaying BigWig at PATRIC
Displaying GFF at PATRIC
Displaying GeneList file at PATRIC
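For readers who download the GFF output and want to inspect it outside PATRIC, the following is a small sketch of reading one feature line. It assumes GFF3-style lines with nine tab-separated columns and `key=value` attributes separated by semicolons; the files produced by the pipeline may follow a different GFF dialect, and the names `GffFeature` and `parse_gff_line` are introduced here for illustration.

```python
from typing import Dict, NamedTuple


class GffFeature(NamedTuple):
    seqid: str
    source: str
    type: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    score: str
    strand: str
    phase: str
    attributes: Dict[str, str]


def parse_gff_line(line: str) -> GffFeature:
    """Parse one tab-separated feature line (assumes GFF3-style attributes)."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 columns, found {len(cols)}")
    attributes: Dict[str, str] = {}
    for item in cols[8].split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            attributes[key.strip()] = value.strip()
    return GffFeature(
        cols[0], cols[1], cols[2], int(cols[3]), int(cols[4]),
        cols[5], cols[6], cols[7], attributes,
    )
```

Lines beginning with `#` are comments or directives and should be skipped before calling `parse_gff_line`.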