![Page 1: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/1.jpg)
RNA spike-in controls & analysis methods for trustworthy genome-scale measurements
Sarah A. Munro, Ph.D.
Genome-Scale Measurements Group
ABRF Meeting
March 29, 2015
![Page 2: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/2.jpg)
Overview
• External RNA Controls Consortium (ERCC) RNA spike-in controls
• ‘erccdashboard’ analysis tool
• ERCC 2.0: Building an updated suite of RNA controls
![Page 3: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/3.jpg)
![Page 4: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/4.jpg)
How can we have trustworthy gene expression results?
• We’re simultaneously measuring thousands of RNA molecules in gene expression experiments
• But are we getting it right?
![Page 5: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/5.jpg)
External RNA Controls Consortium (ERCC) initiated by industry, hosted by NIST
• Initiated by Janet Warrington, VP Clinical Genomics at Affymetrix
• Open to all interested parties
• Voluntary
• More than 90 participants
  – Industry, Academia, Government
  – All major microarray technology developers
  – Other gene expression assay developers
Spike-ins
![Page 6: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/6.jpg)
ERCC control sequences are in NIST Standard Reference Material 2374
• DNA sequence library
• 96 unique control sequences in DNA plasmids
• Controls intended to mimic mammalian mRNA
• In vitro transcription to make RNA controls
NIST SRM 2374 and related data files are available directly from NIST @ http://tinyurl.com/erccsrm
![Page 7: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/7.jpg)
Making ERCC ratio mixtures with true positive and true negative ratios
NIST Plasmid DNA Library → in vitro transcription → RNA transcripts → Pooling → Mixtures with known abundance ratios
…
![Page 8: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/8.jpg)
Using ERCC ratio mixtures
Control (n>3), Treated (n>3)
![Page 9: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/9.jpg)
Using ERCC ratio mixtures
Control (n>3), Treated (n>3)
![Page 10: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/10.jpg)
Using ERCC ratio mixtures
Control (n>3), Treated (n>3)
![Page 11: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/11.jpg)
Using ERCC ratio mixtures
Control (n>3), Treated (n>3)
Measurement process → Expression Measures → Statistical Analysis
• Multiple steps
• Many people & labs
• Takes days to weeks
![Page 12: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/12.jpg)
Example gene expression data
Treated Control
![Page 13: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/13.jpg)
Are the RNA molecule ratios statistically different across the samples?
Treated Control
![Page 14: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/14.jpg)
Evaluate technical performance with ERCC true positive and true negative ratios
Treated Control
![Page 15: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/15.jpg)
Overview
• External RNA Controls Consortium (ERCC) RNA spike-in controls
• ‘erccdashboard’ analysis tool
• ERCC 2.0: Building an updated suite of RNA controls
![Page 16: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/16.jpg)
Use erccdashboard to produce standard performance metrics for any experiment
• R package is available from:
  – Bioconductor
  – NIST GitHub Site
• Open source and open access for use in
  – Other analysis tools and pipelines
  – Commercial software
![Page 17: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/17.jpg)
Gauge technical performance with 4 erccdashboard figures
• Developed as part of SEQC study, with ABRF partners
• Technology-independent ratio performance measures
• Assessed differences in performance across
  – Experiments
  – Laboratories
  – Measurement processes
Munro, S. A. et al. Nature Communications 5:5125 doi: 10.1038/ncomms6125 (2014).
![Page 18: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/18.jpg)
Ambion ERCC Ratio Mixtures
• 23 controls per subpool
• Design abundance spans a 2^20 range within each subpool
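The mixture design can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the manufacturer's formulation: the slides give 23 controls per subpool, a 2^20 abundance range, and a 4:1 ratio; the subpool names, the other three ratios (1:1, 1:1.5, 1:2), and the evenly log-spaced abundance ladder are assumptions made for this example.

```python
# Hypothetical sketch of a two-mixture spike-in design.
# From the slides: 23 controls per subpool, 2^20 design abundance range.
# Assumed for illustration: subpool names, the 1:1, 1:1.5 and 1:2 ratios,
# and evenly log-spaced relative abundances.

CONTROLS_PER_SUBPOOL = 23
LOG2_RANGE = 20  # design abundance spans 2^20 within each subpool

# Nominal Mix 1 : Mix 2 abundance ratio for each subpool.
SUBPOOL_RATIOS = {"A": 4.0, "B": 1.0, "C": 1 / 1.5, "D": 0.5}


def design_mixtures():
    """Return rows of (control_id, subpool, mix1_abundance, mix2_abundance)."""
    rows = []
    for subpool, ratio in SUBPOOL_RATIOS.items():
        for i in range(CONTROLS_PER_SUBPOOL):
            # Relative abundance in Mix 2, log-spaced from 2^0 up to 2^20.
            log2_abundance = LOG2_RANGE * i / (CONTROLS_PER_SUBPOOL - 1)
            mix2 = 2.0 ** log2_abundance
            mix1 = mix2 * ratio
            rows.append((f"{subpool}{i + 1:02d}", subpool, mix1, mix2))
    return rows


design = design_mixtures()
# Controls at 1:1 act as true negatives; all other controls are true positives.
true_negatives = [r for r in design if r[1] == "B"]
true_positives = [r for r in design if r[1] != "B"]
print(len(design), len(true_positives), len(true_negatives))
```

Because every control's ratio is known by design, any differential expression call on a 1:1 control is a false positive, and any missed call on the other subpools is a false negative.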
![Page 19: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/19.jpg)
Spike-in design for SEQC RNA Sequencing Experiments
Rat Experiment
• Treated and Control Rat RNA
• Biological Replicates
Interlaboratory Experiment
• Human Reference RNA Samples
• Technical Replicates
Sample replicates for sequencing
![Page 20: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/20.jpg)
What is the dynamic range of my experiment?
[Figure: two panels, Rat Experiment and Interlaboratory Experiment. x-axis: Log2 ERCC Spike Amount (attomol nt µg⁻¹ total RNA); y-axis: Log2 Normalized ERCC Counts]
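One way to read dynamic range off a signal-abundance plot is to fit a line to log2 counts against log2 spike amount for the controls that were detected. The sketch below assumes a plain least-squares fit and a simple count cutoff for "detected"; the function name, the cutoff, and the example numbers are all made up for illustration and are not taken from erccdashboard.

```python
import math


def signal_abundance_fit(spike_amounts, counts, min_count=1.0):
    """Least-squares line through log2(counts) vs log2(spike amount).

    Controls with counts below min_count are treated as undetected and dropped.
    Returns (slope, intercept, log2 dynamic range of the detected controls).
    """
    pts = [(math.log2(a), math.log2(c))
           for a, c in zip(spike_amounts, counts) if c >= min_count]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    xs = [x for x, _ in pts]
    return slope, intercept, max(xs) - min(xs)


# Made-up example: counts proportional to spike amount, with the two least
# abundant controls not detected.
amounts = [2.0 ** k for k in range(0, 21, 2)]
counts = [0.0, 0.0] + [a / 8.0 for a in amounts[2:]]
slope, intercept, log2_range = signal_abundance_fit(amounts, counts)
print(f"slope={slope:.2f}, detected range=2^{log2_range:.0f}")
```

A slope near 1 indicates that signal scales proportionally with input, and the detected range shrinks as low-abundance controls drop out.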
![Page 21: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/21.jpg)
What is the dynamic range of my experiment?
[Figure: two panels. Rat Experiment: typical sequencing, ~40 million sequence reads per replicate. Interlaboratory Experiment: deep sequencing, ~260 million sequence reads per replicate. x-axis: Log2 ERCC Spike Amount (attomol nt µg⁻¹ total RNA); y-axis: Log2 Normalized ERCC Counts]
![Page 22: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/22.jpg)
What was the diagnostic power?
[Figure: ROC curves for the Rat Experiment and Interlaboratory Experiment. x-axis: False Positive Rate; y-axis: True Positive Rate]
![Page 23: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/23.jpg)
What was the diagnostic power?
[Figure: ROC curves for the Rat Experiment and Interlaboratory Experiment. x-axis: False Positive Rate; y-axis: True Positive Rate]
Area Under the Curve (AUC) depends on the number of controls detected!
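The AUC for one ratio can be sketched directly from differential expression p-values, using the rank (Mann-Whitney) formulation of the area under the ROC curve. The function name and example p-values below are hypothetical, and this is not the erccdashboard implementation. Only detected controls have p-values, so an undetected control simply never enters the calculation, which is why AUC depends on how many controls were detected.

```python
def roc_auc(tp_pvalues, tn_pvalues):
    """Area under the ROC curve from p-values.

    tp_pvalues: p-values of true positive controls (e.g. a 4:1 subpool).
    tn_pvalues: p-values of true negative controls (the 1:1 subpool).
    Equals the probability that a randomly chosen true positive has a smaller
    p-value than a randomly chosen true negative; ties count as one half.
    """
    wins = 0.0
    for p_pos in tp_pvalues:
        for p_neg in tn_pvalues:
            if p_pos < p_neg:
                wins += 1.0
            elif p_pos == p_neg:
                wins += 0.5
    return wins / (len(tp_pvalues) * len(tn_pvalues))


# Made-up p-values for detected controls only.
tp = [0.001, 0.01, 0.2]
tn = [0.05, 0.5, 0.9]
print(round(roc_auc(tp, tn), 3))
```

An AUC of 1 means every true positive ranks ahead of every true negative; 0.5 means the test does no better than chance.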
![Page 24: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/24.jpg)
AUC is a reasonable summary statistic…
But we’d like to evaluate our diagnostic performance as a function of abundance…
![Page 25: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/25.jpg)
Rat Experiment MA Plot
[Figure: x-axis: Log2 Normalized Average Counts; y-axis: Log2 Normalized Ratio of Counts]
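The two MA plot coordinates are simple to compute per transcript. The sketch below follows the axis labels on the slide (log2 of the average count, log2 of the count ratio); the function name and the rule of skipping transcripts with a zero count are assumptions for illustration.

```python
import math


def ma_values(counts_sample1, counts_sample2):
    """Return (A, M) per transcript.

    A = log2 of the average normalized count across the two samples.
    M = log2 ratio of normalized counts (sample 1 over sample 2).
    Transcripts with a zero count in either sample are skipped.
    """
    out = []
    for c1, c2 in zip(counts_sample1, counts_sample2):
        if c1 > 0 and c2 > 0:
            out.append((math.log2((c1 + c2) / 2.0), math.log2(c1 / c2)))
    return out


# A control spiked at 4:1 should sit near M = 2, a 1:1 control near M = 0.
print(ma_values([400.0, 8.0], [100.0, 8.0]))
```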
![Page 26: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/26.jpg)
LODR: Limit of Detection of Ratios
Rat Experiment Reference RNA
• Model P-values as a function of average signal
• Find P-value threshold based on chosen false discovery rate
  – Here FDR = 0.1
  – Default is FDR = 0.05
• Estimate LODR from intersection of model confidence interval upper bound and P-value threshold
[Figure: x-axis: Average Counts; y-axis: DE Test P-values]
LODR: Limit of Detection of Ratios
Rat Experiment Reference RNA
LODR provides
• Specified confidence in the differentially expressed transcripts above LODR (90% chance of <10% FDR)
• Guidance for experimental design: increase signal for transcripts above LODR estimate
[Figure: x-axis: Average Counts; y-axis: DE Test P-values]
![Page 28: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/28.jpg)
Rat Experiment MA Plot
[Figure: MA plot with the 4:1 LODR marked. x-axis: Log2 Normalized Average Counts; y-axis: Log2 Ratio of Normalized Counts]
![Page 29: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/29.jpg)
Rat Experiment MA Plot
[Figure: MA plot with the 4:1 LODR marked. x-axis: Log2 Normalized Average Counts; y-axis: Log2 Ratio of Normalized Counts]
![Page 30: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/30.jpg)
Increased sequencing depth shifts endogenous transcript ratio measurements above LODR
Rat Experiment MA Plot
[Figure: MA plot with the 4:1 LODR marked. x-axis: Log2 Normalized Average Counts; y-axis: Log2 Ratio of Normalized Counts]
![Page 31: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/31.jpg)
What are the LODR estimates for my experiment?
[Figure: two panels, Rat Experiment and Interlaboratory Experiment. x-axis: Average Counts; y-axis: DE Test P-values]
![Page 32: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/32.jpg)
How do the endogenous samples relate to LODR?
[Figure: MA plots for the Rat Experiment and Interlaboratory Experiment, each with the 4:1 LODR marked. x-axis: Log2 Normalized Average Counts; y-axis: Log2 Ratio of Normalized Counts]
![Page 33: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/33.jpg)
How much technical variability & bias is there?
[Figure: two panels, Rat Experiment and Interlaboratory Experiment, with annotations "Significant Ratio Bias" and "Decreased Variability". y-axis: Log2 Ratio of Normalized Counts]
![Page 34: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/34.jpg)
mRNA Fraction Differences Between Samples Contribute to Bias in ERCC Ratios
[Figure: total RNA in Sample 1 and Sample 2, made up of mRNA, rRNA, and spike-in; after mRNA enrichment, mRNA and spike-in remain in each sample]
The RNA fractions are exaggerated for illustration purposes
![Page 35: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/35.jpg)
LODR: Limit of Detection of Ratios
• Variability
• Bias
• LODR & Sample Transcripts
AUC: Diagnostic performance
Dynamic Range
![Page 36: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/36.jpg)
EVALUATE REPRODUCIBILITY ACROSS LABORATORIES
![Page 37: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/37.jpg)
Good Performance
Poor Performance
![Page 38: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/38.jpg)
Interlaboratory Analysis Using erccdashboard performance metrics
• Lab 1-6: Illumina + poly-A selection (Illumina kit)
• Lab 7-9: Life Tech + poly-A selection (Life Tech kit)
• Lab 10-12: Illumina + ribosomal RNA depletion
![Page 39: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/39.jpg)
Consistent LODR across 11 of 12 Labs
• Diagnostic performance was consistent within and amongst measurement processes
• Lab 7 was an outlier for diagnostic performance
• LODR agreement with AUC
[Figure: x-axis: Laboratory; y-axis: LODR (Average Counts)]
![Page 40: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/40.jpg)
Ratio bias is highly variable amongst experiments
• Ratio bias (r_m) can be attributed to mRNA fraction difference between samples:
  R_s = nominal subpool ratio
  (E1/E2)_s = empirical ratio
• Large standard errors indicate that mRNA fraction isn’t the only factor contributing to ERCC ratio bias
  – mRNA enrichment protocol is a factor…
[Figure: x-axis: Laboratory; y-axis: Log(r_m); reference: mRNA fraction difference from Shippy et al. 2006]
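A rough way to see how r_m and its standard error relate to the two quantities defined above is to average the log of empirical over nominal ratios across controls. The sketch assumes a simple multiplicative model, (E1/E2)_s = r_m · R_s, which is an assumption made for this example (the slide's own equation did not survive extraction), and it uses an unweighted mean rather than whatever model the published analysis fits. The function name and numbers are hypothetical.

```python
import math


def estimate_log_rm(nominal_ratios, empirical_ratios):
    """Mean and standard error of log(empirical / nominal) across controls.

    Assumes empirical = r_m * nominal for every control (illustrative model).
    Returns (log_rm_estimate, standard_error).
    """
    logs = [math.log(e / r) for r, e in zip(nominal_ratios, empirical_ratios)]
    n = len(logs)
    mean = sum(logs) / n
    var = sum((v - mean) ** 2 for v in logs) / (n - 1)
    return mean, math.sqrt(var / n)


# Made-up example: nominal ratios from three subpools, empirical ratios
# shifted by a common factor with some scatter.
nominal = [4.0, 4.0, 1.0, 1.0, 0.5, 0.5]
empirical = [3.4, 3.0, 0.85, 0.78, 0.42, 0.38]
log_rm, se = estimate_log_rm(nominal, empirical)
print(f"log(r_m) = {log_rm:.3f} +/- {se:.3f}")
```

If a single mRNA fraction difference explained all of the bias, every control would shift by the same factor and the standard error would be small; large standard errors point to control-specific effects such as the enrichment protocol.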
![Page 41: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/41.jpg)
Protocol-dependent bias from poly-A selection affects ERCC controls due to short poly-A tails
[Figure panels: Lab 1-6 ILM Poly-A; Lab 7-9 LIF Poly-A; Lab 10-12 ILM Ribo]
![Page 42: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/42.jpg)
mRNA enrichment protocol biases vary across individual ERCCs but are consistent for a protocol
![Page 43: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/43.jpg)
mRNA enrichment protocol biases vary across individual ERCCs but are consistent for a protocol
![Page 44: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/44.jpg)
![Page 45: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/45.jpg)
![Page 46: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/46.jpg)
![Page 47: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/47.jpg)
Results of the erccdashboard Publication
• Ratio performance measures for any technology platform and any experiment
  – Diagnostic Power
  – Novel LODR metric
  – Technical Variability & Bias
• Comparison across experiments
• Quantification of mRNA fraction differences between samples
• Show protocol-dependent bias
![Page 48: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/48.jpg)
Overview
• External RNA Controls Consortium (ERCC) RNA spike-in controls
• ‘erccdashboard’ analysis tool
• ERCC 2.0: Building an updated suite of RNA controls
![Page 49: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/49.jpg)
ERCC 2.0: A New Suite of RNA Controls
• Approached by industry and academia to build new RNA controls
• NIST-hosted open, public ERCC 2.0 workshop
  – Workshop report and presentations available: slideshare.net/ERCC-Workshop
• All interested parties are welcome to participate
  – Sequence contributions
  – Interlaboratory analysis
• New and Improved mRNA Mimics
• Transcript Isoforms
• miRNA
![Page 50: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/50.jpg)
New and Improved mRNA Mimics
• Additional controls
• Expand distributions of RNA control properties
  – Length (> 2kb)
  – GC content
  – Poly-A tail length
![Page 51: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/51.jpg)
Transcript Isoform Controls
• Transcript Design
  – Non-cognate Spike-in RNA Variants (SIRVs) developed by Lexogen
  – Cognate sequence selection in progress
    • Schizosaccharomyces pombe
• Mixture design
  – Dynamic Range
    • 24
  – Design Ratios
    • < 2:1
Lukas Paul, Lexogen
![Page 52: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/52.jpg)
Small and miRNA Controls
• Needed for validation of clinical applications
  – Early Detection Research Network
  – Tgen
• Other applications relevant to bacterial RNA-Seq
• Non-cognate miRNA controls
• Include some pre-miRNA
• Direct RNA control synthesis by Agilent
  – no need for DNA templates
Karol Thompson, FDA
![Page 53: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/53.jpg)
Recap
• External RNA Controls Consortium (ERCC) RNA spike-in controls
• ‘erccdashboard’ analysis tool
• ERCC 2.0: Building an updated suite of RNA controls
![Page 54: RNA Spike-in Controls and Analysis Methods for Trustworthy](https://reader034.vdocuments.us/reader034/viewer/2022051710/584ca8ca1a28ab85738f3ace/html5/thumbnails/54.jpg)
Acknowledgements
• All External RNA Controls Consortium participants
• NIST
  – Marc Salit
  – Steve Lund
  – P. Scott Pine
  – Justin Zook
  – David Duewer
  – Jerod Parsons
  – Jennifer McDaniel
  – Margaret Klein
• Empa
  – Matthias Roesslein
• SEQC study participants
• Co-authors on erccdashboard manuscript:
  S. P. Lund, P. S. Pine, H. Binder, D. Clevert, A. Conesa, J. Dopazo, M. Fasold, S. Hochreiter, H. Hong, N. Jafari, D. P. Kreil, P. P. Łabaj, S. Li, Y. Liao, S. M. Lin, J. Meehan, C. E. Mason, J. Santoyo-Lopez, R. A. Setterquist, L. Shi, W. Shi, G. K. Smyth, N. Stralis-Pavese, Z. Su, W. Tong, C. Wang, J. Wang, J. Xu, Z. Ye, Y. Yang, Y. Yu, & M. Salit
For more information contact: [email protected]