detecting subtle differential expression in t cells of...
TRANSCRIPT
![Page 1: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/1.jpg)
Detecting subtle differential expression in Tcells of Melanoma patients using
Microarray Experiments
Susan Holmes
Statistics Department, Stanford University
Collaboration with Peter P Lee from Stanford’s Hematology Department
Work supported by the ACS and the NSF.
Berkeley, October 23rd, 2003
Thanks to Bioconductor contributors, in particular Sandrine Dudoit, Jean
Yee et al. for marrayNorm and Wolfgang Huber for vsn.
![Page 2: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/2.jpg)
T-cell Data
70 agilent arrays, 13,000 genes in 5 batches. Much batch to batch effect,
no replication.
• five patients with melanoma, five healthy subjects.
• Stimulated and unstimulated T-cells of 7 different types.
![Page 3: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/3.jpg)
T-cell sorting
![Page 4: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/4.jpg)
1 10 100 1000 1
1 10 100 1000 10000
10
100
1000
1
10
100
1000
10000
effector naive
memoryCD45
RA
CD27
![Page 5: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/5.jpg)
The percentages of naıve, memory, and effector CD8+ subsets in each
healthy donor or melanoma patient sample was determined by CD27 and
CD45RA expression. There was no statistically significant differences
(p > 0.05) in the relative distribution of the subsets either amongst
subjects within each group or between melanoma and healthy subjects,
confirming that melanoma patients do not have gross perturbations of
their CD8+ T cell subsets. CD8+ T cell subsets were then isolated from
PBMC samples by FACSorting based on their expression of CD27 and
CD45RA, RNA was extracted, amplified, and hybridized onto Agilent
microarrays.
![Page 6: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/6.jpg)
Two channels
Mean intensities.
Channel 1: Green, reference mRNA mixture of Human cell types.
Channel 2: Red, T-cells, of 7 different types after sorting and
amplication.
![Page 7: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/7.jpg)
Red
![Page 8: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/8.jpg)
Self-self and Dye Flip Experiments
Array system noise determination and dye swap experiment
Self-self hybridizations. 500 ng aRNA (from sorted CD8+ cells) labeled
with Cy3 or Cy5 were competitively hybridized against each other on
microarrays.
.
![Page 9: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/9.jpg)
![Page 10: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/10.jpg)
![Page 11: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/11.jpg)
The scatter plot shows that virtually all the features lie close to the
diagonal line corresponding to the expected log ratio of 0. Overall the
signal from the two channels is similar.
To test the difference of dye incorporation in probes, we performed dye
swap experiments. The figure shows that among 12842 features, 2773
display good correlation with similar differential expression for both dyes,
less than 1% features are differentially expressed, and about 97% of the
genes were correlated to each other.
![Page 12: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/12.jpg)
![Page 13: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/13.jpg)
Combined microarray data from two microarrays representing the two
halves of a dye swap experiment. In one experiment, the microarray was
hybridized with cyanine-3 labeled T cell cDNA and cyanine-5 labeled
reference cDNA generated from 500 ng of the corresponding cRNA. In
the second experiment, the microarray was hybridized with cyanine-5
labeled T cell cDNA and cyanine-3 labeled reference cDNA. Together,
the combined plot represents the two polarities in a dye swap
experiement. The yellow points are genes that displayed good correlation
with similar differential expression for both polarities of the dye swap
experiment. Shown in blue are genes that are unchanged and in red are
genes that were found to be differentially expressed in one polarity but
unchanged in the other polarity. The pink points represent anti-correlated
genes in the two polarities.
![Page 14: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/14.jpg)
Data Normalization and Variance Stabilizing
• Variance Stabilizing Transformation (vsn) was used on the batches one
at a time.
• Batch effect was taken out by removing the batch medians.
![Page 15: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/15.jpg)
Discriminant Analysis between healthy andmelanoma patients
Correction for cell effects allowed us to pool data from the three cell
subsets, to maximize the statistical power of our dataset. In order to
pool the cell types, they had to be homogenized, so cell type differences
between the CD8+ cells were removed through a simple linear model on
the transformed data.
We used linear discriminant analysis Then the Westfall and Young
multiple testing procedure was applied to detect the difference between
patients and healthy donors. A subset of 50 genes with the smallest
p-values was selected by taking the 50 features with smallest adjusted
p-values. These were used as the input to a discriminant analyses with
cross-validation that chose the best subset for discriminating the arrays
into healthy and melanoma groups with low classification error.
![Page 16: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/16.jpg)
Eleven features (10 genes) were found to consistently discriminate
between T cells from melanoma patients versus healthy controls with
adjusted p-values < 0.05.
![Page 17: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/17.jpg)
Discriminating Features, Ranked in order of SignificanceAbrev. UniGene Name/Description Adj. p-value Rel.Expres.
PSPC1 Paraspeckle component 1 0.0002 ⇓
DDB2 Damage-specific DNA binding protein 2 0.0160 ⇑
VIL2 Villin 2 (ezrin) 0.0278 ⇑
LGALS1 Lectin, galactoside-bind.1 (galectin 1) 0.0278 ⇑
ITPKB Inositol 1,4,5-trisphosphate 3-kinase B 0.0340 ⇑
LGALS1 Lectin, galactoside-bind.1 (galectin 1) 0.0340 ⇑
—- Hs, Simi.RNA binding motif prot., X 0.0492 ⇓
NMT2 N-myristoyltransferase 2 0.0492 ⇓
—- LOC154084; Hs clone mRNA 0.0492 ⇑
NME2 Non-metastatic cells 2, protein in 0.0492 ⇑
FLJ22059Hypothetical protein FLJ22059 0.0492 ⇑
Table 1: Table of Significantly differentially expressed genes The relative
expression in the last column gives the cancer patients’ expression relative
to healthy, ⇑ means overexpressed compared to healthy. The adjusted
p-values were computed using a Westfall and Young step down procedure
[?].
![Page 18: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/18.jpg)
Hierarchical cluster and gene combinationhierarchical cluster
A hierarchical clustering of the data was performed using the mva
package in R. The results are displayed here for the 11 retained features.
![Page 19: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/19.jpg)
![Page 20: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/20.jpg)
HE
A3
1_
EF
FE
_2
HE
A5
5_
ME
M_
4
HE
A5
5_
NA
I_4
HE
A5
9_
EF
FE
_5
HE
A2
6_
NA
I_1
HE
A2
5_
EF
FE
_3
HE
A3
1_
NA
I_2
HE
A2
6_
ME
M_
1
HE
A3
1_
ME
M_
2
HE
A2
6_
EF
FE
_1
HE
A2
5_
NA
I_3
HE
A5
5_
EF
FE
_4
HE
A2
5_
ME
M_
3
HE
A5
9_
ME
M_
5
HE
A5
9_
NA
I_5
ME
L5
3_
EF
FE
_3
ME
L3
9_
ME
M_
2
ME
L3
9_
NA
I_2
ME
L6
7_
EF
FE
_4
ME
L3
6_
NA
I_1
ME
L5
3_
NA
I_3
ME
L6
7_
NA
I_4
ME
L5
1_
EF
FE
_5
ME
L3
6_
EF
FE
_1
ME
L5
1_
ME
M_
5
ME
L3
6_
ME
M_
1
ME
L5
1_
NA
I_5
ME
L3
9_
EF
FE
_2
ME
L5
3_
NA
I_3
ME
L6
7_
ME
M_
4
Paraspeckle component 1
H. similar to RNA bind
N−myristoyltransferase 2
Galectin 1
Galectin 1
FLJ22059
inositol 1,4,5−triph.
H.clone 24889
DDB2
protein (NM23B) non−metastatic.
Villin 2
![Page 21: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/21.jpg)
![Page 22: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/22.jpg)
HE
A2
5_
EF
FE
_3
HE
A5
9_
NA
I_5
HE
A2
6_
ME
M_
1
HE
A3
1_
NA
I_2
HE
A3
1_
ME
M_
2
HE
A5
9_
EF
FE
_5
HE
A2
5_
NA
I_3
ME
L6
7_
ME
M_
4
HE
A3
1_
EF
FE
_2
ME
L5
3_
NA
I_3
HE
A5
9_
ME
M_
5
ME
L3
9_
NA
I_2
ME
L3
6_
EF
FE
_1
ME
L6
7_
EF
FE
_4
ME
L5
1_
ME
M_
5
ME
L3
9_
EF
FE
_2
ME
L5
1_
NA
I_5
HE
A5
5_
ME
M_
4
HE
A2
6_
EF
FE
_1
HE
A5
5_
EF
FE
_4
HE
A2
5_
ME
M_
3
HE
A5
5_
NA
I_4
ME
L5
1_
EF
FE
_5
HE
A2
6_
NA
I_1
ME
L6
7_
NA
I_4
ME
L3
6_
ME
M_
1
ME
L5
3_
NA
I_3
ME
L3
6_
NA
I_1
ME
L3
9_
ME
M_
2
ME
L5
3_
EF
FE
_3
H.mRNA for CASH beta
caspase 3, apopt.
H.TFAR19 mRNA
programmed cell death
H.prostaglandin end.
caspase 7, apopt.
H.Lice2 beta cysteine
H.B−cell leuk./lymph.
caspase 10, apopt.
![Page 23: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/23.jpg)
The white are the most expressed, the red the least, with yellow as the
intermediary. Note that the clustering separates the two groups neatly
between melanoma and healthy patients. (b) Hierarchical Clustering of
the best 9 apoptosis genes, this clustering, even though done using the
best possible classifier is unable to decompose the two groups distinctly.
The same analysis was carried out on a set of 86 apoptosis genes, of
which the 9 most discriminant were chosen, these 9 genes were then
analysed using the same hierarchical clustering as the 10 overall best
genes.
The figure shows that this smaller subset of genes classifies healthy
donors and melanoma patients into two groups (only two outliers were
found in the cluster, and they are just borderline cases). In order to
check our procedure, we have used a cross validation method that leaves
the known label of each array out in turn and builds a predictor from the
remaining arrays. We can then compare to the known classification. This
cross validation estimate of prediction error gave a 100% well classified
![Page 24: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/24.jpg)
score, encouraging us to think that these 10 genes are indeed good
predictors of T cells from melanoma versus healthy subjects. The
expression for each gene in this subset is significantly different between
healthy donors and melanoma patients (p < 0.05). This data
demonstrates that CD8+ T cells (except CD45RA-CD27-) in melanoma
patients are subtly different from CD8+ T cells from healthy donors.
As the clustering done only on known apoptosis genes shows, the
distinction between melanoma and healthy patients cannot be made
simply on the evidence of genes involved in the classical apoptosis
pathway.
![Page 25: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/25.jpg)
Check and recalibration
Male subjects were in a 2:1 ratio between the melanoma and healthy
patients in this study, creating an expected fold difference that was
observed on several specific Y-chromosome genes. When we performed
the multiple testing procedure with the Y-genes re-inserted into the
dataset, the SMC (mouse) homolog Y gene appears tied with galectin-1,
with an adjusted p-value of 0.0342. This puts the level of the differential
expression between the two groups of patients at around two-fold for the
galectin-1 gene.
![Page 26: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/26.jpg)
Gene confirmation with real time RT-PCR
To confirm the significance of the genes identified to be differentially
expressed using microarrays, we performed quantitative real-time PCR
(RQ-PCR) analysis using the original, unamplified RNA materials. CD8
subsets were not combined in these experiments. Due to limited
quantities of unamplified RNA, six of ten genes were selected for
RQ-PCR analysis: DDB2, FLJ10955, Galectin-1, Myristoyltransferase,
Hypothetical Protein 669. A housekeeping gene, GAPDH, was also
amplified for normalization of data. The threshold cycle (CT ) for each
sample is calculated by iCycler Real Time Detection System. After
normalized with GAPDH, the CT of the six genes was analyzed using a
Wilcoxon test. Significant differences were found in the CD8+ T cells
between melanoma patients and healthy donors, matching the microarray
data.The only exception was expression of gene FLJ10955, which did not
show significant differences between healthy and melanoma in any of the
three subsets.
![Page 27: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/27.jpg)
Results from two sample t-tests, by cell typesUniGene UniGene p-
value
p-
value
p-
value
Abbrev. Name/description Naive Effector Memory
PSPC1 Paraspeckle component 1 0.53 0.0498 0.54
DDB2 Damage-specific DNA binding protein
2
0.28 0.27 0.027
LGALS1 Lectin, galactoside-binding,1(galectin
1)
0.048 0.041 0.021
—- Hs, Similar to RNA binding motif
protein, X chromosome, clone, mRNA
0.08 6.10−6 0.69
NMT2 N-myristoyltransferase 2 0.27 0.0003 0.45
FLJ22059 Hypothetical protein FLJ22.. 0.75 0.62 0.21
Table 2: Significance of PCR validation of most differentially expressed
genes. The p-values are computed by using a Wilcoxon two sample test
comparing the melanoma patients and the healthy subjects.
![Page 28: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/28.jpg)
Histogram of M3[, 12]
M3[, 12]
Fre
quen
cy
−6 −4 −2 0 2 4
010
020
030
040
050
060
070
0
![Page 29: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/29.jpg)
Parametric Bootstrap
Generate data under various null hypotheses that have many features
similar to the orginal data:
• Same error distribution (form and moments).
• Same components.
• Vary some underlying tuning parameters.
![Page 30: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/30.jpg)
VSN Noise Model: Additive and MultiplicativeComponents
Yk,red = ared + vk,red + βredγkeηk,red
ared and βred are the offset and the factor. γk is gene specific and the
other terms are the noise. additive and multiplicative, we make weak
assumptions of symmetry of noise around 0:∑k
vki = 0∑
k
ηk,red = 0∑
`
η`,green = 0
η`,green + η`,red = 0
![Page 31: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/31.jpg)
Finding the right distribution for the residuals
Heavy tails, extreme case was from a different data set:
![Page 32: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/32.jpg)
Histogram of M3[, 12]
M3[, 12]
Fre
quen
cy
−6 −4 −2 0 2 4
010
020
030
040
050
060
070
0
![Page 33: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/33.jpg)
Non normal residuals
For this data, the recentering-rescaling was always better with mad and
median.
=⇒ Laplace knew already that the error distribution that has the location
parameter the median and the scale parameter the mad has density:
fY (y) =12φexp[−|y − θ|/φ], φ > 0
![Page 34: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/34.jpg)
Laplace’s First Error Distribution
Useful representation:
Y =√X · Z, X ∼ Exp(1), Z ∼ N(0, 1)
A mixture of Normals whose scale parameters vary from an exponential.
The probe sequences are of different length and so have varying
hybridization precisions.
![Page 35: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/35.jpg)
Histogram of log5
Log Ratio
Fre
quen
cy
−4 −2 0 2 4
020
4060
8010
0
Histogram of X2
Simulated Laplace
Fre
quen
cy
−4 −2 0 2 4
050
100
150
![Page 36: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/36.jpg)
●
●
●●●●●
●●●●●●
●●●●●●●●●●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●●●●●●●
●●●●●●●●●●
●●●●●●
●●●●●●●
●●
●
●
−4 −2 0 2 4
−6
−4
−2
02
46
log5
X2
![Page 37: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/37.jpg)
Assymetric Laplace Distribution
The marginals each have assymmetric Laplace distributions:
Yid= miX +X1/2σiiZi
where Z ∼ N (0, 1) and X ∼ Exp(1).
![Page 38: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/38.jpg)
Assymetric Multivariate Laplace Distribution
Definition (Kotz et al):
A random vector Y in Rd has a multivariate asymmetric Laplace
distribution (AL) if its characteristic function is given by
ψ(t) =1
1 + 12t′Σt− im′t
, t ∈ Rd.
where m ∈ Rd , m 6= 0 and is a dd non-negative definite symmetric
matrix. This distribution is ALd(m,Σ).
Let Y ∼ ALd(m,Σ) and let Z ∼ Nd(0,Σ). Let X be an exponentially
distributed r.v. with mean 1, independent of X. Then, the following
representation is valid:
Yd= mX +X1/2Z.
![Page 39: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/39.jpg)
and we have
Cov(Y ) = Σ +mm′
![Page 40: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/40.jpg)
Random Number Generation
An ALd(m,Σ) generator (Kotz et al., 2003)
1. Decompose Σ = CC ′ using Choleski decomposition.
2. Generate a standard exponential variate X.
3. Independently of X, generate multivariate normal Nd(0,Σ) variate N:
• Generate iid normal vectors in a matrix of dimensions n× p: W.
• N = C ×W .
4. Set Y = m ·X +√XN.
5. Return Y
![Page 41: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/41.jpg)
Parametric Bootstrap
Very useful in situations where there are not enough replicates.
Pretend that the data come from a parametric family Fθ.
Replace in the generation of ‘new’ data, the unknown parameters by Fθ.
![Page 42: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/42.jpg)
Application to Microarrays
Model, after renormalization, and eventual variance stabilization.
mjk = mkεk +√εkZkj, Zkj ∼ N (0,Σ)
Genes are k, k = 1....n. Estimate the parameters, covariance matrix,
medians, mads.
This model is added onto the variance stabilization model.
One may want to change the covariance matrix to adjust for the
hypotheses being tested.
Plug these into the Assymetric Multivariate Laplace generator.
We can compare the number of genes chosen with the parametric model
with a given covariance structure as the one we get with the data.
![Page 43: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/43.jpg)
Calibration through Simulation Experiments
If we assume that the measured signal ykj increases, to sufficient
approximation, proportionally to the mRNA abundance ckj of gene k on
the j-th array, or on the j-th color channel:
ykj ≈ aj + bjbkckj +mkj. (1)
![Page 44: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/44.jpg)
Simulation of Data with same parameters
For red channel, and k = 1...600 consistently expressed genes
rk = ak,green + bk,green × sinh(arcsinh(lk) + εk,green,j)
1/lk ∼ Γ(1, 1), ε ∼ N (0, c2)
All the parameters were estimated from the data c2 is estimated from
Var(h(Y )−mean(µkj)).
For consistently expressed genes:
> lks <- 1/rgamma(600,1,1)
> simuldata <- matrix(0,600,96)
> eps <- matrix(rnorm(600*96,0,0.25),600,96)
> areds <- matrix(rnorm(600*48,-0.5,0.26),600,48)
> agreens <- matrix(rnorm(600*48,-1.66,0.52),600,48)
![Page 45: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/45.jpg)
> breds <- matrix(rnorm(600*48,0.006,0.0038),600,48)
> bgreens <- matrix(rnorm(600*48,0.004,0.0022),600,48)
> simuldata[,refs] <- agreens+bgreens*sinh(asinh(lks)+eps[,refs])
> simuldata[,samps] <- areds+breds*sinh(asinh(mks)+eps[,samps])
For differentially expressed genes (of which we believe there are about
20), we use an additional Bernouilli random variable (at the strain level
and an extra uniform random variable to say how much the new
expressions differ).
![Page 46: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/46.jpg)
Conclusions
Beyond the bootstrap:Distributional Assumptions can be useful.
• Frechet bounds for the correlations of genes under these distributional
assumptions can be found.
• The correlation coefficient may have a much narrower range than we
believe.
• We can attack the Multiple Testing Problem in a Multivariate
Framework.
• We will probably need a different measure of association.
![Page 47: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/47.jpg)
References
[1] S. Dudoit, Y. H. Yang, T. P. Speed, and M. J. Callow. Statistical
methods for identifying differentially expressed genes in replicated
cDNA microarray experiments. Statistica Sinica, 12:111–139, 2002.
[2] W. Huber, A. von Heydebreck, H. Sultmann, A. Poustka, and M.
Vingron. Variance stablization applied to microarray data calibration
and to quantification of differential expression. Bioinformatics,
18:S96–S104, 2002.
[3] W. Huber, A. von Heydebreck, and M. Vingron.
Analysis of microarray gene expression data. Manuscript,
http://www.dkfz.de/abt0840/whuber, 2002.
[4] Kotz, S. Kozubowski, T., J. and Podgorski, K. An assymetric
multivariate Laplace Distribution, May, 2003.
![Page 48: Detecting subtle differential expression in T cells of ...statweb.stanford.edu/~susan/talks/berkbiostat.pdf · Detecting subtle differential expression in T cells of Melanoma patients](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f07369c7e708231d41bdf35/html5/thumbnails/48.jpg)
[5] David M. Rocke and Blythe Durbin. A model for measurement
error for gene expression analysis. Journal of Computational Biology,
8:557–569, 2001.
[6] Tong Xu, Chen-Tsen Shu, Elizabeth Purdom, Demi Dang, Diane
Ilsley, Yaqian Guo, Susan P. Holmes and Peter P. Lee Subtle but
consistent differences in gene expression of CD8+ T cells in melanoma
patients and healthy donors submitted.