Post on 01-Feb-2021
-
RNA-seq Gene Expression Analysis
Tzu L. Phang Ph.D. Robert Stearman Ph.D. Michael Edwards Ph.D.
University of Colorado Denver 2014 AACR Workshop, Snowmass CO
-
What is gene expression?
“Gene expression is the process by which information from a gene is used in the synthesis of a functional gene product”
-
The Central Dogma
Transcriptome Proteomic
-
Agenda
• Import BAM file
• What is Next Generation Sequencing (NGS)?
• NGS Usages: RNA-seq and ChIP-seq
• NGS File Format and Size
• Mapping Phred Quality Score
• Demo: Using SeqMonk for RNA-seq and ChIP-seq Analysis
-
Installing SeqMonk
http://www.bioinformatics.babraham.ac.uk/projects/seqmonk/
-
SeqMonk: First Look
-
List Panel
Chromosome Panel
Track Panel
Quick Assess Panel
-
Sample BAM Files
• ABC_DHL2.bam
• ABC_Ly10.bam
• ABC_Ly3.bam
• ABC_U2932.bam
• GCB_DHL10.bam
• GCB_DHL4.bam
• GCB_DHL6.bam
• GCB_Ly7.bam
• STAT3 ChIPSeq Genes.txt
-
Import BAM files
-
Next Generation Sequencing
http://www.youtube.com/watch?v=77r5p8IBwJk
-
How RNA-seq works
Figure from Wang et al., RNA-Seq: a revolutionary tool for transcriptomics, Nat. Rev. Genetics 10, 57-63 (2009).
Next generation sequencing (NGS)
Sample preparation
-
How ChIP-seq works
-
File System
-
unknown:5:1:2:836#0/1:CATACAAGTTGTTTGTACTATAGNTGTTTTTGAATT:aabaaaa^abaaba^_]_aaaXPD\^_aaa`Y]_aa
unknown:5:1:2:717#0/1:TCTGTTCCAGATTCTAAGGGCATNGTCTTTTTGAAT:aa^]]`\_^[Y_`^aZP^VZV[SDLZ^aa__^^\Ya
unknown:5:1:2:188#0/1:TAAGAAGAAAGATGCATAGGTACNATATTTTTGAAT:a``Z[^Y^`\\\^[\^][WNTWNDS_[^_^^[OWY_
unknown:5:1:2:1262#0/1:CACTTACAAACAAGGAATGTTGGNCGGTTTTTGAAT:a`ababaabaaaa_``aa``_ULDXZ_^aaa`O_aa
unknown:5:1:2:1046#0/1:CTAAGATGGCCTAAGAGTAGACTNACTTTTTTGAAT:abb`Xa`Z_aabaaa`]__Z^`\D\`aaaaaa^aab

@ILLUMINA-545855_0001:4:100:743:1210#0
TAACATGTGTCATATGTCCCAGGATGTC
+ILLUMINA-545855_0001:4:100:743:1210#0
ab^aaaa_a_aaaa`a^abaaaa``a_a
Data Structure
FASTQ format
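The four-line layout of a FASTQ record (header starting with "@", sequence, separator starting with "+", quality string) can be read with a few lines of Python. This is a minimal sketch rather than anything from the workshop: the function name parse_fastq is hypothetical, and it assumes every record occupies exactly four lines with no wrapped sequences.

```python
import io
from typing import Iterator, TextIO, Tuple


def parse_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break  # end of input
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield header[1:], seq, qual


# The example record from the slide, held in memory instead of a file.
record = (
    "@ILLUMINA-545855_0001:4:100:743:1210#0\n"
    "TAACATGTGTCATATGTCCCAGGATGTC\n"
    "+ILLUMINA-545855_0001:4:100:743:1210#0\n"
    "ab^aaaa_a_aaaa`a^abaaaa``a_a\n"
)
reads = list(parse_fastq(io.StringIO(record)))
```

For a real file, the same generator could be fed an open file handle instead of the in-memory string.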
-
Quality Control of sequences
• The quality scores are the only measure of confidence
• Qualities usually fall with length where trimming is needed to remove
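One simple way to act on falling qualities is to cut low-scoring bases off the 3' end of each read. The sketch below is an illustration under stated assumptions, not the method used in the workshop: the name trim_3prime, the threshold of 20, and the ASCII offset of 64 (the older Illumina encoding, which the lowercase quality characters in the example reads suggest) are all assumptions, and the example read is invented.

```python
def trim_3prime(seq: str, qual: str, min_q: int = 20, offset: int = 64):
    """Remove trailing bases whose quality score (ASCII code minus offset) is below min_q."""
    if len(seq) != len(qual):
        raise ValueError("sequence and quality must have the same length")
    end = len(seq)
    # Walk back from the 3' end until a base meets the threshold.
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


# Invented read: the last three quality characters (P, D, B) score 16, 4 and 2.
trimmed_seq, trimmed_qual = trim_3prime("ACGTACGT", "aaaa`PDB")
```

Production trimmers typically use sliding windows or running sums rather than this base-by-base cut, so treat the function as a teaching aid only.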
-
Phred Qualities
• Developed by Phil Green’s group at the University of Washington in the 1990s
• Automatically processes sequence chromatogram files
  – Reports sequence and associated qualities
  – Introduced concept of phred quality values
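A phred quality value Q is related to the estimated probability P that a base call is wrong by Q = -10 · log10(P), so Q = 30 corresponds to one expected error in 1,000 calls. The sketch below decodes a quality string into scores and error probabilities. The function names are hypothetical, and the ASCII offset of 64 is an assumption (older Illumina encoding); files using the Sanger convention need an offset of 33.

```python
from typing import List


def phred_scores(qual: str, offset: int = 64) -> List[int]:
    """Convert a quality string to integer phred scores (ASCII code minus offset)."""
    return [ord(ch) - offset for ch in qual]


def error_probability(q: float) -> float:
    """Invert Q = -10 * log10(P) to get the per-base error probability."""
    return 10 ** (-q / 10)


# First three quality characters of the FASTQ record shown earlier in the slides.
scores = phred_scores("ab^")
probs = [error_probability(q) for q in scores]
```

With offset 64 the characters a, b and ^ decode to 33, 34 and 30.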
-
James H. Thomas, University of Washington
-
James H. Thomas, University of Washington
-
FASTQ QC Visualization
Per base sequence quality
http://www.bioinformatics.babraham.ac.uk/projects/fastqc/
-
Our Dataset
-
RNA-seq Analysis Workflow
-
From FASTQ to SAM/BAM
-
galaxyproject.org
-
https://usegalaxy.org
-
File Format
[Diagram: file formats SCARF, FASTQ, SAM, BAM, VCF, GTF, BED, WIG and PILEUP; Single-End and Paired-End reads; stages labeled Mapping, Input for Visualization, Tools and QC Visualization; file sizes shown: 5.2 GB, 3.3 GB, 4.5 GB, 696 MB]
-
NOW A DEMONSTRATION
-
After BAM Import
-
RNA-seq Analysis Workflow
-
Define Probes & Quantitation
-
RNA-seq Analysis Workflow
-
Why normalization?
• Remove systematic errors introduced in labeling, hybridization and scanning procedures
• Correct these errors while preserving biological variability / information
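The slides do not state which normalization method is applied. As one common illustration for read counts, the sketch below scales each sample to counts per million (CPM), which removes differences in sequencing depth while leaving the relative expression of genes within a sample unchanged. The function name, gene names, and counts are invented for the example.

```python
from typing import Dict


def counts_per_million(counts: Dict[str, int]) -> Dict[str, float]:
    """Scale raw read counts so that each sample sums to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {gene: count * 1_000_000 / total for gene, count in counts.items()}


# Invented samples: sample_b was sequenced twice as deeply as sample_a,
# but the underlying expression profile is the same.
sample_a = {"geneA": 100, "geneB": 300, "geneC": 600}
sample_b = {"geneA": 200, "geneB": 600, "geneC": 1200}

cpm_a = counts_per_million(sample_a)
cpm_b = counts_per_million(sample_b)
```

After scaling, the two samples give the same values, so the depth difference (a technical effect) no longer looks like a change in expression.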
-
A different look …
[Plot: Technical replicate difference vs. Average Intensity Values]
-
To normalize or not to …
-
ChIP-seq Analysis Strategy
-
ChIP-seq Analysis Caveats
-
ChIP-seq Analysis Workflow