![Page 1: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/1.jpg)
Affymetrix Probe Level Analysis
Rafael A. Irizarry and Zhijin Wu
Department of Biostatistics, JHU
Johnson and Johnson, 12/5/3
![Page 2: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/2.jpg)
Contact Information
• e-mails: [email protected], [email protected]
• Personal webpages:
– http://www.biostat.jhsph.edu/~ririzarr
– http://www.biostat.jhsph.edu/~zwu
• Bioconductor (http://www.bioconductor.org)
![Page 3: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/3.jpg)
Acknowledgements
• Paco Martinez-Murillo, Forrest Spencer (JHU)
• Felix Naef (Rockefeller)
• Ben Bolstad, Sandrine Dudoit, Terry Speed (Berkeley)
• Jean Yang (UCSF)
• Robert Gentleman (Harvard)
• Wolfgang Huber (Germany)
• Johnson & Johnson
![Page 4: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/4.jpg)
Outline
• Quick review of technology
• Overview of Issues
• Previous Work
• RMA
• Improvements to RMA
![Page 5: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/5.jpg)
Applications of microarrays
• Measuring transcript abundance
– Differential Expression
– Classifying samples
– Detecting expression pattern
• Other applications:
– Genotyping
– TAG arrays
[Figure labels: Brain, Liver]
![Page 6: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/6.jpg)
How they work
![Page 7: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/7.jpg)
Before Labelling
Array 1 Array 2
Sample 1 Sample 2
![Page 8: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/8.jpg)
Before Hybridization
Array 1 Array 2
Sample 1 Sample 2
![Page 9: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/9.jpg)
After Hybridization
Array 1 Array 2
![Page 10: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/10.jpg)
Scanner Image
Array 1 Array 2
![Page 11: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/11.jpg)
Quantification
Array 1 Array 2
4 2 0 3 0 4 0 3
![Page 12: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/12.jpg)
Microarray Image
![Page 13: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/13.jpg)
Case Study: Preprocessing Affymetrix GeneChip Arrays
[Figure: GeneChip Probe Array (1.28 cm) and a Hybridized Probe Cell (24 µm)]
• Millions of copies of a specific oligonucleotide probe
• >200,000 different complementary probes
• Single stranded, labeled RNA target
• Oligonucleotide probe
• Image of Hybridized Probe Array
Compliments of D. Gerhold
![Page 14: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/14.jpg)
![Page 15: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/15.jpg)
Before Hybridization
Array 1 Array 2
Sample 1 Sample 2
![Page 16: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/16.jpg)
More Realistic
Array 1 Array 2
Sample 1 Sample 2
![Page 17: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/17.jpg)
Non-specific Hybridization
Array 1 Array 2
![Page 18: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/18.jpg)
![Page 19: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/19.jpg)
Statistical Problem
• Each gene is represented by 11-20 pairs (PM and MM) of probe intensities
• Each array has 8K-20K genes
• Usually there are several arrays
• Obtain a measure for each gene on each array: summarize probeset data
• Background adjustment and normalization are issues
![Page 20: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/20.jpg)
Default until 2002 (MAS 4.0)
• GeneChip® software used Avg.diff

$$\text{Avg.diff} = \frac{1}{|A|} \sum_{j \in A} (PM_j - MM_j)$$

• with A a set of “suitable” pairs chosen by software.
• Obvious Problems:
– Many negative expression values
– No log transform
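As a small illustration of why Avg.diff can go negative, the sketch below averages PM minus MM over one probeset. The intensities are made-up numbers, and the set A is simply taken to be all pairs supplied; the rule the GeneChip software used to pick "suitable" pairs is not reproduced here.

```python
import numpy as np

def avg_diff(pm, mm):
    """Average of PM - MM over the probe pairs supplied (treated as the set A)."""
    pm = np.asarray(pm, dtype=float)
    mm = np.asarray(mm, dtype=float)
    return float(np.mean(pm - mm))

# Hypothetical intensities for one probeset (illustrative values only).
pm = [120.0, 95.0, 210.0, 80.0]
mm = [150.0, 130.0, 190.0, 110.0]

value = avg_diff(pm, mm)
print(value)  # negative whenever MM tends to exceed PM
```

A negative value like this one has no log, which is the second problem listed on the slide.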
![Page 21: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/21.jpg)
Why use log?
[Figure: the same data shown on the original scale and on the log scale]
![Page 22: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/22.jpg)
Current default (MAS 5.0)
• GeneChip® new version uses something else

$$\text{signal} = \text{TukeyBiweight}\{\log(PM_j - MM_j^*)\}$$

• with MM* a version of MM that is never bigger than PM.
• Ad-hoc background procedure and scale normalization are used.
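A rough sketch of the robust average named on this slide: a one-step Tukey biweight applied to log(PM − MM*). The tuning constants `c=5` and `eps=1e-4`, the choice of log base 2, and the example intensities are assumptions made for illustration. The construction of MM* is not shown on the slide, so the function simply takes MM* as an input that must be smaller than PM.

```python
import numpy as np

def tukey_biweight(x, c=5.0, eps=1e-4):
    """One-step Tukey biweight: a weighted mean that downweights points far from the median."""
    x = np.asarray(x, dtype=float)
    m = np.median(x)
    s = np.median(np.abs(x - m))              # median absolute deviation
    u = (x - m) / (c * s + eps)               # scaled distance from the median
    w = np.where(np.abs(u) < 1.0, (1.0 - u**2) ** 2, 0.0)
    return float(np.sum(w * x) / np.sum(w))

def signal(pm, mm_star):
    """Robust average of log2(PM - MM*); mm_star must be strictly below pm."""
    pm = np.asarray(pm, dtype=float)
    mm_star = np.asarray(mm_star, dtype=float)
    if np.any(mm_star >= pm):
        raise ValueError("MM* must be smaller than PM for every probe pair")
    return tukey_biweight(np.log2(pm - mm_star))

# Hypothetical log-scale values with one gross outlier.
values = [8.0, 8.1, 7.9, 8.05, 14.0]
robust = tukey_biweight(values)
plain = float(np.mean(values))
print(robust, plain)
```

The outlier receives zero weight in the biweight, while it pulls the plain mean upward.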
![Page 23: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/23.jpg)
Can this be improved?
![Page 24: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/24.jpg)
Log-scale scatter plot
[Figure: scatter plot of log2(expression 2) versus log2(expression 1)]
![Page 25: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/25.jpg)
MvA Plot
[Figure: MvA plot. Vertical axis: M = log2(expression 2 / expression 1). Horizontal axis: A = ½{ log2(expression 2) + log2(expression 1) }]
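The coordinates of an MvA plot can be computed directly from two expression vectors: M is the log2 ratio and A is the average of the two log2 values. The sketch below uses made-up expression values; the function name `mva` is only for illustration.

```python
import numpy as np

def mva(expr1, expr2):
    """Return (M, A): log2 ratio and average log2 expression for two arrays."""
    log1 = np.log2(np.asarray(expr1, dtype=float))
    log2_ = np.log2(np.asarray(expr2, dtype=float))
    m = log2_ - log1              # M = log2(expression 2 / expression 1)
    a = 0.5 * (log1 + log2_)      # A = average of the two log2 expressions
    return m, a

# Hypothetical expression values for two genes on two arrays.
m, a = mva([100.0, 400.0], [200.0, 100.0])
print(m, a)
```

Plotting M against A rotates the log-scale scatter plot by 45 degrees, so disagreement between the arrays shows up as vertical spread.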
![Page 26: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/26.jpg)
Can this be improved?
![Page 27: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/27.jpg)
Precision/Accuracy
• It appears precision can be improved. How does it relate to accuracy?
• Spike-in experiments (Affymetrix and GeneLogic)
• Dilution Study (GeneLogic)
![Page 28: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/28.jpg)
Use Spike-In Experiment
![Page 29: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/29.jpg)
First academic alternative: dChip
Li and Wong fit a model

$$PM_{ij} - MM_{ij} = \theta_i \phi_j + \varepsilon_{ij}, \quad \varepsilon_{ij} \sim N(0, \sigma^2)$$

Here $\theta_i$ represents expression on chip i and $\phi_j$ represents the probe effect.

A non-linear normalization technique is used and the model assumptions are used to remove outliers.
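The multiplicative model PM_ij − MM_ij = θ_i φ_j + ε_ij can be fit by alternating least squares: fix the probe effects and solve for the expression values, then the reverse. The sketch below is a minimal version under the assumption that the probe effects are scaled so that their squares sum to the number of probes; it leaves out the outlier removal and normalization mentioned on the slide, and the data are simulated without noise.

```python
import numpy as np

def fit_li_wong(d, n_iter=50):
    """Alternating least squares for d[i, j] ~ theta[i] * phi[j].

    d is an (arrays x probes) matrix of PM - MM differences.
    phi is rescaled each pass so that sum(phi**2) equals the number of probes.
    """
    d = np.asarray(d, dtype=float)
    n_probes = d.shape[1]
    phi = np.ones(n_probes)
    for _ in range(n_iter):
        theta = d @ phi / (phi @ phi)          # least squares for theta given phi
        phi = theta @ d / (theta @ theta)      # least squares for phi given theta
        phi *= np.sqrt(n_probes) / np.linalg.norm(phi)
    theta = d @ phi / (phi @ phi)
    return theta, phi

# Noise-free illustration with hypothetical expression values and probe effects.
theta_true = np.array([100.0, 200.0, 400.0])
phi_true = np.array([0.5, 1.0, 1.5, 1.0])
phi_true *= np.sqrt(phi_true.size) / np.linalg.norm(phi_true)
d = np.outer(theta_true, phi_true)

theta_hat, phi_hat = fit_li_wong(d)
print(theta_hat, phi_hat)
```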
![Page 30: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/30.jpg)
dChip is better but still room for improvement
![Page 31: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/31.jpg)
Three steps
From the spike-in data we learn that:
• We need to background adjust
• Normalize
• Summarize appropriately (in the log-scale)
![Page 32: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/32.jpg)
Why background correct?
[Figure: three panels, each labeled 100, for concentrations of 0 pM, 0.5 pM, and 1.0 pM]
![Page 33: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/33.jpg)
Why background correct?
![Page 34: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/34.jpg)
Why background correct?
![Page 35: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/35.jpg)
Why background correct?
![Page 36: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/36.jpg)
Why normalize?
Compliments of Ben Bolstad
![Page 37: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/37.jpg)
Why correct for non-specific hyb?
One MM not enough? Look for more!
![Page 38: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/38.jpg)
RMA
• Robust Multiarray Analysis (RMA) is a 3-step approach:
– ignore MM and remove global background
– quantile normalize
– use median polish to estimate log expression robustly
• Irizarry et al: Biostatistics (2003)
• Irizarry et al: NAR (2003)
• affy R package (www.bioconductor.org)
![Page 39: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/39.jpg)
Background adjustment
![Page 40: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/40.jpg)
Deterministic Model
PM = B + N + S
MM = B + N
PM – MM = S
![Page 41: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/41.jpg)
Do MMs measure non-specific binding?
Look at Yeast DNA hybridized to Human Chip (HGU95)
log (PM-B) v log (MM-B)
Not perfectly: This explains large variance
![Page 42: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/42.jpg)
Stochastic Model (Additive background/multiplicative error)

$$PM = B_{PM} + N_{PM} + S, \qquad MM = B_{MM} + N_{MM}$$

$$\log(N_{PM}),\ \log(N_{MM}) \sim \text{Bivariate Normal}\ (\rho \approx 0.7)$$

$$S = \exp(\theta + \alpha + \varepsilon)$$

θ is the quantity of interest (log scale expression)

$$E[PM - MM] = S, \quad \text{but} \quad \mathrm{Var}[\log(PM - MM)] \sim 1/\exp(\theta) \quad \text{(can be very large)}$$
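A small simulation can make the variance statement concrete. The sketch below draws PM and MM from the additive background, multiplicative error model with correlated log-normal non-specific binding (ρ = 0.7). Every numeric setting (the log-scale mean of 5, the unit variance, the shared constant background of 50, the error SD of 0.1, the two θ values) is an assumption chosen for illustration, and the probe effect α is set to zero.

```python
import numpy as np

def simulate_pm_mm(theta, n=10000, rho=0.7, seed=0):
    """Simulate PM and MM for one probe at log-scale expression theta."""
    rng = np.random.default_rng(seed)
    cov = [[1.0, rho], [rho, 1.0]]
    log_n = rng.multivariate_normal([5.0, 5.0], cov, size=n)   # log NSB for PM and MM
    n_pm, n_mm = np.exp(log_n[:, 0]), np.exp(log_n[:, 1])
    b = 50.0                                   # assumed shared optical background
    s = np.exp(theta + rng.normal(0.0, 0.1, size=n))           # multiplicative error
    return b + n_pm + s, b + n_mm

def summarize(theta):
    """Fraction of non-positive PM - MM and SD of log(PM - MM) where it is defined."""
    pm, mm = simulate_pm_mm(theta)
    diff = pm - mm
    frac_nonpositive = float(np.mean(diff <= 0))
    sd_log = float(np.std(np.log(diff[diff > 0])))
    return frac_nonpositive, sd_log

low = summarize(2.0)    # weakly expressed
high = summarize(9.0)   # strongly expressed
print(low, high)
```

At low θ many differences are non-positive and the remaining log values are widely spread; at high θ the spread is close to that of the multiplicative error alone.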
![Page 43: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/43.jpg)
Can we just ignore background?
PM is a biased estimate of θ
![Page 44: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/44.jpg)
Alternative Approach
Predict log(S) from PM, MM

For example:
1) E[ log(S) | PM, MM ]
2) Estimate θ and obtain standard error: Formal hypothesis testing
![Page 45: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/45.jpg)
Quantile normalization
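Quantile normalization forces every array to share the same intensity distribution: sort each array, average across arrays at each rank, and put the averages back in each array's original order. The sketch below is a minimal version that assumes a probes-by-arrays matrix and does no special handling of ties.

```python
import numpy as np

def quantile_normalize(x):
    """Quantile-normalize the columns (arrays) of a probes x arrays matrix."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, axis=0)                        # rank order within each array
    sorted_x = np.take_along_axis(x, order, axis=0)
    reference = sorted_x.mean(axis=1)                    # mean at each rank across arrays
    out = np.empty_like(x)
    np.put_along_axis(out, order, reference[:, None], axis=0)
    return out

# Hypothetical intensities: 4 probes on 3 arrays.
x = np.array([[5.0, 4.0, 3.0],
              [2.0, 1.0, 4.0],
              [3.0, 4.0, 6.0],
              [4.0, 2.0, 8.0]])
normalized = quantile_normalize(x)
print(normalized)
```

After normalization each column contains the same set of values, and within each array the ordering of probes is unchanged.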
![Page 46: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/46.jpg)
Summarization
• Do it in the log-scale
• Account for the probe effect
• Use robust procedure
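One robust procedure that meets all three requirements is Tukey's median polish, which the RMA slides name as the summarization step. The sketch below fits an additive model (overall level + probe effect + array effect) to a probes-by-arrays matrix of log2 intensities by repeatedly sweeping out row and column medians. The data, the iteration limit, and the tolerance are assumptions for illustration.

```python
import numpy as np

def median_polish(y, n_iter=10, tol=1e-6):
    """Fit y[j, i] = overall + row[j] + col[i] + resid[j, i] using medians."""
    resid = np.array(y, dtype=float)
    overall = 0.0
    row = np.zeros(resid.shape[0])    # probe effects
    col = np.zeros(resid.shape[1])    # array effects
    for _ in range(n_iter):
        row_med = np.median(resid, axis=1)
        resid -= row_med[:, None]
        row += row_med
        shift = np.median(col)
        col -= shift
        overall += shift
        col_med = np.median(resid, axis=0)
        resid -= col_med[None, :]
        col += col_med
        shift = np.median(row)
        row -= shift
        overall += shift
        if np.max(np.abs(row_med)) < tol and np.max(np.abs(col_med)) < tol:
            break
    return overall, row, col, resid

# Hypothetical probeset: 5 probes, 3 arrays, purely additive plus one outlier.
probe_effect = np.array([0.0, 1.0, -1.0, 2.0, -2.0])
expression = np.array([7.0, 8.0, 9.0])
y = probe_effect[:, None] + expression[None, :]
y[3, 1] += 5.0                        # a single corrupted probe

overall, row, col, resid = median_polish(y)
estimates = overall + col             # log2 expression estimate per array
print(estimates)
```

The outlier ends up in the residuals instead of shifting the expression estimate for its array.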
![Page 47: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/47.jpg)
Probe-effect
• Li and Wong (2001) first observed the very strong probe effect
• Within the same probeset, a large range of intensities (orders of magnitude) is observed. But across arrays, variance of intensities, for the same probe, is relatively small
• This probe effect explains high correlation between replicate arrays
![Page 48: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/48.jpg)
Expression from 2 replicate arrays
Correlation is higher than 0.99
![Page 49: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/49.jpg)
Expression from probesets divided into 2 (at random)
Correlation drops to 0.55
![Page 50: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/50.jpg)
Probe effect seen in spike-ins
![Page 51: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/51.jpg)
Why fit log scale additive model?
![Page 52: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/52.jpg)
RMA
• Instead of subtracting MM, assume PM = B + S
• To estimate S, use expectation: E[S | B + S], with B normal and S exponential
• After quantile normalization, assume: log2 S_ij = θ_i + α_j + ε_ij
• Estimate θ_i using robust procedure (median polish)
• We call this procedure RMA
• Does it make a difference?
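The background step above has a closed form. If PM = B + S with B normal (mean μ, SD σ) and S exponential (rate α), then given the observed PM the signal S follows a normal distribution with mean a = PM − μ − ασ² and SD σ, truncated to positive values, so E[S | PM] = a + σ φ(a/σ) / Φ(a/σ). The sketch below evaluates that expression; the parameter values are made up, and how μ, σ, and α are estimated from real data is not covered here.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def norm_pdf(z):
    """Standard normal density."""
    return np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + _erf(z / math.sqrt(2.0)))

def background_adjust(pm, mu, sigma, alpha):
    """E[S | PM] for PM = B + S, B ~ Normal(mu, sigma^2), S ~ Exponential(rate alpha)."""
    pm = np.asarray(pm, dtype=float)
    a = pm - mu - alpha * sigma**2
    return a + sigma * norm_pdf(a / sigma) / norm_cdf(a / sigma)

# Hypothetical PM intensities and background parameters.
pm = np.array([50.0, 100.0, 200.0, 5000.0])
adjusted = background_adjust(pm, mu=100.0, sigma=20.0, alpha=0.01)
print(adjusted)
```

Unlike PM − MM, the adjusted values are always positive, so the log can be taken for every probe. For intensities very far below μ the ratio φ/Φ can underflow numerically, which this sketch does not guard against.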
![Page 53: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/53.jpg)
Does it make a difference?
Ranks: 1, 270, 2074, 3063, 3935, 4639, 4652, 5149, 5372, 5947, 6448, 6870, 7037, 7549, 8429, 9721
![Page 54: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/54.jpg)
Perfect
Ranks: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
![Page 55: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/55.jpg)
MAS 5.0
Ranks: 1, 270, 2074, 3063, 3935, 4639, 4652, 5149, 5372, 5947, 6448, 6870, 7037, 7549, 8429, 9721
![Page 56: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/56.jpg)
RMA
Ranks: 1, 2, 3, 4, 6, 7, 10, 16, 45, 56, 58, 88, 406, 999, 1643, 2739
![Page 57: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/57.jpg)
Can RMA be improved?
Global Accuracy and Precision
| Method | Slope | Median SD | Percentile | Rank |
|---|---|---|---|---|
| MAS 5.0 | 0.69 | 0.63 | 82.43 | 2188 |
| RMA | 0.61 | 0.11 | 99.96 | 4 |
![Page 58: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/58.jpg)
Can RMA be improved?
![Page 59: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/59.jpg)
Current Work
• Incorporate MM and sequence information to build an improved model and estimate
• Find alternative, faster, approaches to posterior mean
• Preliminary work: GCRMA
![Page 60: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/60.jpg)
Predict NSB with sequence info
![Page 61: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/61.jpg)
Naef’s model
• Assume that being an A, T, G or C has a position-dependent effect on the probe effect
• Assume that this effect is a smooth function of position (Naef uses cubic polynomials; we use splines)
• Use training data to get affinities
![Page 62: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/62.jpg)
Naef uses these to predict probe effect
![Page 63: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/63.jpg)
We use them to predict NSB too
![Page 64: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/64.jpg)
Problems with MM
Also they take up half the space on the chip ($250)
![Page 65: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/65.jpg)
More problems with MM
![Page 66: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/66.jpg)
More problems with MM
![Page 67: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/67.jpg)
Our model predicts this
![Page 68: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/68.jpg)
Adjustment options
• Define a loss function, assume S is a random variable, and find the empirical Bayes estimate; e.g. for log ratio based loss the solution is: E[ log(S) | PM, MM ]
• GCRMA assumes S follows a power law or log(S) is uniform
![Page 69: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/69.jpg)
Does it help?
Global Accuracy and Precision
| Method | Slope | Median SD | Percentile | Rank |
|---|---|---|---|---|
| MAS 5.0 | 0.69 | 0.63 | 82.43 | 2188 |
| RMA | 0.61 | 0.11 | 99.96 | 4 |
| GCRMA | 0.85 | 0.08 | 99.98 | 2 |
Local slopes also improve
![Page 70: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/70.jpg)
Does it help?
![Page 71: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/71.jpg)
ROC for FC=2 spikes
![Page 72: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/72.jpg)
ROC for low concentration spikes
![Page 73: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/73.jpg)
Local Ranks
16186189433324815GCRMA
00011261117181625% of data
408761053333527360380RMA
412839762633235220511887193518981998228227361715MAS_5.0
9:108:97:86:75:64:53:42:31:20:1-1:0-2:-1
![Page 74: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/74.jpg)
Conclusion
• Data exploration useful tool for quality assessment and motivating models
• Statistical thinking helpful for interpretation
• Statistical models may help find signals in noise
• Physical models help improve accuracy
![Page 75: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/75.jpg)
Supplemental Slide
![Page 76: Affymetrix Probe Level Analysis - Biostatisticsririzarr/Talks/jnj-affy.pdf · • affy R package () Background adjustment. Deterministic Model PM = B + N + S MM = B + N PM – MM](https://reader033.vdocuments.us/reader033/viewer/2022052018/6031817303f63c69f553f098/html5/thumbnails/76.jpg)
Local Ranks
96247719210275525261113200907961dChip
16186189433324815GCRMA
00011261117181625% of data
408761053333527360380RMA
412839762633235220511887193518981998228227361715MAS_5.0
9:108:97:86:75:64:53:42:31:20:1-1:0-2:-1