![Page 1: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/1.jpg)
An Introduction to RNA-Seq Transcriptome Profiling with iPlant
![Page 2: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/2.jpg)
![Page 3: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/3.jpg)
Before we start: Align sequence reads to the reference genome
The most time-consuming part of the analysis is aligning the reads (in Sanger FASTQ format) for all replicates against the reference genome.
![Page 4: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/4.jpg)
Overview: This training module is designed to demonstrate a workflow in the iPlant Discovery Environment using RNA-Seq for transcriptome profiling.
Question: How can we compare gene expression levels using RNA-Seq data in Arabidopsis WT and hy5 genetic backgrounds?
RNA-seq in the Discovery Environment
![Page 5: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/5.jpg)
Scientific Objective
LONG HYPOCOTYL 5 (HY5) is a basic leucine zipper transcription factor (TF).
Mutations in HY5 cause aberrant phenotypes in Arabidopsis morphology, pigmentation and hormonal response.
We will use RNA-seq to compare WT and hy5 to identify HY5-regulated genes.
Source: http://www.gla.ac.uk/media/media_73736_en.jpg
![Page 6: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/6.jpg)
Samples
• Experimental data downloaded from the NCBI Short Read Archive (GEO:GSM613465 and GEO:GSM613466)
• Two replicate RNA-seq runs each for wild-type and hy5 mutant seedlings.
![Page 7: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/7.jpg)
RNA-Seq Conceptual Overview
Image source: http://www.bgisequence.com
![Page 8: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/8.jpg)
RNA-seq Sample Read Statistics
• Genome alignments from TopHat were saved as BAM files, the binary version of SAM (samtools.sourceforge.net/).
• Reads retained by TopHat are shown below
| Sequence run | WT-1 | WT-2 | hy5-1 | hy5-2 |
|---|---|---|---|---|
| Reads | 10,866,702 | 10,276,268 | 13,410,011 | 12,471,462 |
| Seq. (Mbase) | 445.5 | 421.3 | 549.8 | 511.3 |
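The sequence totals in the table follow from the read counts, because every read in this dataset is 41 bases long (the `length=41` field in the FASTQ headers). A minimal shell sketch of that arithmetic, assuming `awk` is available; the function name `mbases` is ours and is not part of any iPlant tool:

```shell
#!/usr/bin/env bash
# Sequence yield in megabases = number of reads x read length / 1,000,000.
# The default read length of 41 comes from the "length=41" FASTQ header field.
mbases() {
  local reads=$1 read_length=${2:-41}
  awk -v n="$reads" -v len="$read_length" 'BEGIN { printf "%.1f\n", n * len / 1e6 }'
}

# Read counts retained by TopHat, as listed in the table above.
for sample in WT-1:10866702 WT-2:10276268 hy5-1:13410011 hy5-2:12471462; do
  name=${sample%%:*}    # text before the colon
  reads=${sample##*:}   # text after the colon
  printf '%s\t%s Mbase\n' "$name" "$(mbases "$reads")"
done
```

Running it reproduces the Seq. (Mbase) row, which is a quick sanity check that the read counts and read length are consistent.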
![Page 9: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/9.jpg)
RNA-Seq Data
@SRR070570.4 HWUSI-EAS455:3:1:1:1096 length=41
CAAGGCCCGGGAACGAATTCACCGCCGTATGGCTGACCGGC
+
BA?39AAA933BA05>A@A=?4,9#################
@SRR070570.12 HWUSI-EAS455:3:1:2:1592 length=41
GAGGCGTTGACGGGAAAAGGGATATTAGCTCAGCTGAATCT
+
@=:9>5+.5=?@<6>A?@6+2?:</7>,%1/=0/7/>48##
@SRR070570.13 HWUSI-EAS455:3:1:2:869 length=41
TGCCAGTAGTCATATGCTTGTCTCAAAGATTAAGCCATGCA
+
A;BAA6=A3=ABBBA84B<&78A@BA=(@B>AB2@>B@/
[email protected] HWUSI-EAS455:3:1:4:1075 length=41
CAGTAGTTGAGCTCCATGCGAAATAGACTAGTTGGTACCAC
+
BB9?A@>AABBBB@BCA?A8BBBAB4B@BC71=?9;B:
[email protected] HWUSI-EAS455:3:1:5:238 length=41
AAAAGGGTAAAAGCTCGTTTGATTCTTATTTTCAGTACGAA
+
BBB?06-8BB@B17>9)=A91?>>8>*@<A<>>@1:B>(B@
@SRR070570.44 HWUSI-EAS455:3:1:5:1871 length=41
GTCATATGCTTGTCTCAAAGATTAAGCCATGCATGTGTAAG
+
BBBCBCCBBBBBA@BBCCB+ABBCB@B@BB@:BAA@B@BB>
@SRR070570.46 HWUSI-EAS455:3:1:5:1981 length=41
GAACAACAAAACCTATCCTTAACGGGATGGTACTCACTTTC
+
?A>-?B;BCBBB@BC@/>A<BB:?<?B?=75?:9@@@3=>:
…Now What?
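Each FASTQ record spans four lines: a header starting with `@`, the sequence, a `+` separator, and a quality string with one character per base. In Sanger format each quality character encodes a Phred score as its ASCII code minus 33. The sketch below, which assumes a bash shell with `awk`, checks that structure on the first two reads shown above; the file name `sample.fq` and the function names are illustrative only.

```shell
#!/usr/bin/env bash
# Two reads copied from the slide, one FASTQ field per line.
cat > sample.fq <<'EOF'
@SRR070570.4 HWUSI-EAS455:3:1:1:1096 length=41
CAAGGCCCGGGAACGAATTCACCGCCGTATGGCTGACCGGC
+
BA?39AAA933BA05>A@A=?4,9#################
@SRR070570.12 HWUSI-EAS455:3:1:2:1592 length=41
GAGGCGTTGACGGGAAAAGGGATATTAGCTCAGCTGAATCT
+
@=:9>5+.5=?@<6>A?@6+2?:</7>,%1/=0/7/>48##
EOF

# Print "<records> <bases>"; exit non-zero if any record is malformed.
# Lines are classified by position, not by prefix, because a quality
# string may itself start with "@" (as in the second read).
fastq_summary() {
  awk '
    NR % 4 == 1 && substr($0, 1, 1) != "@" { bad = 1; exit 1 }  # header
    NR % 4 == 2 { seq_len = length($0); bases += seq_len }       # sequence
    NR % 4 == 3 && substr($0, 1, 1) != "+" { bad = 1; exit 1 }  # separator
    NR % 4 == 0 {                                                # quality
      if (length($0) != seq_len) { bad = 1; exit 1 }
      records++
    }
    END {
      if (bad || NR % 4 != 0) exit 1
      print records + 0, bases + 0
    }
  ' "$1"
}

# Convert one Sanger quality character to its Phred score (ASCII - 33).
phred() { printf '%d\n' "$(( $(printf '%d' "'$1") - 33 ))"; }

fastq_summary sample.fq
phred '#'
```

The run of `#` characters at the end of the first read corresponds to Phred 2, the lowest score in that record, which is why trailing bases like these are commonly trimmed or treated with caution.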
![Page 10: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/10.jpg)
[Figure: the same FASTQ reads as on the previous slide, followed by the label "Bioinformatician" and a string of binary digits]
![Page 11: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/11.jpg)
![Page 12: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/12.jpg)
The Tuxedo Protocol
![Page 13: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/13.jpg)
$ tophat -p 8 -G genes.gtf -o C1_R1_thout genome C1_R1_1.fq C1_R1_2.fq
$ tophat -p 8 -G genes.gtf -o C1_R2_thout genome C1_R2_1.fq C1_R2_2.fq
$ tophat -p 8 -G genes.gtf -o C1_R3_thout genome C1_R3_1.fq C1_R3_2.fq
$ tophat -p 8 -G genes.gtf -o C2_R1_thout genome C2_R1_1.fq C2_R1_2.fq
$ tophat -p 8 -G genes.gtf -o C2_R2_thout genome C2_R2_1.fq C2_R2_2.fq
$ tophat -p 8 -G genes.gtf -o C2_R3_thout genome C2_R3_1.fq C2_R3_2.fq
$ cufflinks -p 8 -o C1_R1_clout C1_R1_thout/accepted_hits.bam
$ cufflinks -p 8 -o C1_R2_clout C1_R2_thout/accepted_hits.bam
$ cufflinks -p 8 -o C1_R3_clout C1_R3_thout/accepted_hits.bam
$ cufflinks -p 8 -o C2_R1_clout C2_R1_thout/accepted_hits.bam
$ cufflinks -p 8 -o C2_R2_clout C2_R2_thout/accepted_hits.bam
$ cufflinks -p 8 -o C2_R3_clout C2_R3_thout/accepted_hits.bam
$ cuffmerge -g genes.gtf -s genome.fa -p 8 assemblies.txt
$ cuffdiff -o diff_out -b genome.fa -p 8 -L C1,C2 -u merged_asm/merged.gtf \
./C1_R1_thout/accepted_hits.bam,./C1_R2_thout/accepted_hits.bam,\
./C1_R3_thout/accepted_hits.bam \
./C2_R1_thout/accepted_hits.bam,\
./C2_R3_thout/accepted_hits.bam,./C2_R2_thout/accepted_hits.bam
Your RNA-Seq Data
Your transformed RNA-Seq Data
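The `assemblies.txt` file passed to cuffmerge is a plain list of assembly GTF files, one per line, and cuffdiff expects the replicate BAM files of each condition joined by commas. A small sketch of how both could be generated for the six runs in the commands above, assuming bash and the `_clout` and `_thout` directory names used there; `bam_list` is a helper name of ours.

```shell
#!/usr/bin/env bash
# Write one Cufflinks assembly path per line for cuffmerge.
# Cufflinks saves each assembly as transcripts.gtf inside its -o directory.
: > assemblies.txt
for condition in C1 C2; do
  for replicate in R1 R2 R3; do
    echo "./${condition}_${replicate}_clout/transcripts.gtf" >> assemblies.txt
  done
done
cat assemblies.txt

# Build the comma-separated replicate list that cuffdiff takes per condition.
bam_list() {
  local condition=$1 replicate list=""
  for replicate in R1 R2 R3; do
    list+="${list:+,}./${condition}_${replicate}_thout/accepted_hits.bam"
  done
  echo "$list"
}

bam_list C1
bam_list C2
```

Deriving every path from the same condition and replicate labels avoids the copy-and-paste slips that are easy to make when the commands are typed out by hand.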
![Page 14: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/14.jpg)
RNA-Seq Analysis Workflow
TopHat (Bowtie)
Cufflinks
Cuffmerge
Cuffdiff
CummeRbund
Your Data
iPlant Data Store
FASTQ
Discovery Environment
Atmosphere
![Page 15: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/15.jpg)
The iPlant Discovery Environment
![Page 16: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/16.jpg)
The iPlant Discovery Environment
![Page 17: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/17.jpg)
The iPlant Discovery Environment
![Page 18: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/18.jpg)
The iPlant Discovery Environment
![Page 19: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/19.jpg)
Import SRA data from NCBI SRA
Extract FASTQ files from the downloaded SRA archives
Getting the RNA-Seq Data
![Page 20: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/20.jpg)
Staged Data
![Page 21: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/21.jpg)
Examining Data Quality with FastQC
![Page 22: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/22.jpg)
TopHat
![Page 23: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/23.jpg)
TopHat in the Discovery Environment
![Page 24: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/24.jpg)
Align the four FASTQ files to the Arabidopsis genome using TopHat
Align Reads to the Genome
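In the Discovery Environment this step is configured through the TopHat app, but on the command line it amounts to one TopHat run per sample. The dry-run sketch below only prints the commands and does not execute them. It assumes single-end reads, a Bowtie index named `genome`, and an annotation file `genes.gtf` as in the protocol commands earlier; FASTQ file names such as `WT-1.fastq` are placeholders, not the names used in the workshop data.

```shell
#!/usr/bin/env bash
# Dry run: print (do not execute) one TopHat command per sample.
# Each sample name doubles as the FASTQ file stem and the output
# directory prefix, so inputs and outputs cannot get mixed up.
tophat_commands() {
  local sample
  for sample in "$@"; do
    echo "tophat -p 8 -G genes.gtf -o ${sample}_thout genome ${sample}.fastq"
  done
}

tophat_commands WT-1 WT-2 hy5-1 hy5-2
```

Once the printed commands look right, they could be executed by piping the output to `bash` on a machine where TopHat is installed.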
![Page 25: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/25.jpg)
TopHat
• TopHat is one of many applications for aligning short sequence reads to a reference genome.
• It uses the Bowtie aligner internally.
• Alternatives include BWA, MAQ, OLego, Stampy, and Novoalign.
![Page 26: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/26.jpg)
ATG44120 (12S seed storage protein) is significantly down-regulated in the hy5 mutant background (> 9-fold, p=0). Compare to the gene on the right, which lacks differential expression.
![Page 27: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/27.jpg)
Assembling the Transcripts
![Page 28: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/28.jpg)
Cufflinks in the Discovery Environment
![Page 29: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/29.jpg)
Cufflinks
![Page 30: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/30.jpg)
Merging the Transcriptomes
![Page 31: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/31.jpg)
Cuffmerge in the Discovery Environment
![Page 32: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/32.jpg)
Cuffmerge
![Page 33: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/33.jpg)
Comparing wild-type to hy5 transcriptomes
![Page 34: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/34.jpg)
Cuffdiff in the Discovery Environment
![Page 35: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/35.jpg)
Cuffdiff
![Page 36: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/36.jpg)
Cuffdiff Results
![Page 37: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/37.jpg)
Differentially expressed genes
Example filtered Cuffdiff results generated with the Filter_CuffDiff_Results app to:
1) Select genes with a minimum two-fold expression difference
2) Select genes with significant differential expression (q <= 0.05)
3) Add gene descriptions
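The first two filtering criteria can be approximated on the command line. This is a sketch of the idea, not the implementation of Filter_CuffDiff_Results: it assumes the tab-separated column layout of Cuffdiff's `gene_exp.diff`, treats a two-fold difference as an absolute log2 fold change of at least 1, and does not add gene descriptions. The gene IDs and values in the sample file are made up for illustration, and `row` and `filter_cuffdiff` are helper names of ours.

```shell
#!/usr/bin/env bash
# Join arguments with tabs to build one table row.
row() { local IFS=$'\t'; printf '%s\n' "$*"; }

# Hypothetical excerpt in the gene_exp.diff column layout (values invented).
{
  row test_id gene_id gene locus sample_1 sample_2 status value_1 value_2 \
      'log2(fold_change)' test_stat p_value q_value significant
  row geneA geneA - 1:100-200 WT hy5 OK 80 10 -3 -5.1 0.0001 0.002 yes
  row geneB geneB - 1:300-400 WT hy5 OK 20 25 0.32 0.6 0.4 0.6 no
  row geneC geneC - 2:100-200 WT hy5 OK 5 20 2 3.9 0.001 0.01 yes
  row geneD geneD - 2:300-400 WT hy5 OK 10 30 1.58 1.2 0.05 0.2 no
} > gene_exp.diff

# Keep the header plus rows with |log2 fold change| >= 1 and q <= 0.05.
# Columns are located by header name rather than by fixed position.
filter_cuffdiff() {
  awk -F '\t' -v min_log2fc=1 -v max_q=0.05 '
    NR == 1 {
      for (i = 1; i <= NF; i++) col[$i] = i
      print
      next
    }
    {
      lfc = $(col["log2(fold_change)"]) + 0
      if (lfc < 0) lfc = -lfc
      if (lfc >= min_log2fc && $(col["q_value"]) + 0 <= max_q) print
    }
  ' "$1"
}

filter_cuffdiff gene_exp.diff
```

In this made-up table geneA and geneC pass both thresholds, while geneD changes about three-fold but is dropped because its q-value exceeds 0.05. Real Cuffdiff output can also contain infinite fold changes when one condition has zero expression; this sketch does not handle those rows specially.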
![Page 38: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/38.jpg)
Density Plot
![Page 39: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/39.jpg)
Scatter Plot
![Page 40: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/40.jpg)
Volcano Plot
![Page 41: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/41.jpg)
Expression Plots
![Page 42: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/42.jpg)
Cloud Computing with iPlant Atmosphere
![Page 43: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/43.jpg)
Launch a Virtual Server (in the Cloud!)
![Page 44: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/44.jpg)
You now have your very own virtual Linux server
![Page 45: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/45.jpg)
Expression Plots: Open a terminal and launch R
![Page 46: An Introduction to RNA-Seq Transcriptome Profiling with iPlant](https://reader030.vdocuments.us/reader030/viewer/2022032706/56649de95503460f94ae3bc2/html5/thumbnails/46.jpg)
Expression Plots: Demonstration