Discussion of Dimension Reduction by Dennis Cook
Peter McCullagh
Department of Statistics, University of Chicago
Anthony Atkinson 70th Birthday Celebration, London, December 14, 2007
Outline
The renaissance of PCA?
State of affairs regarding PCA
- In a regression model used for prediction, only the vector space span(X) matters, not the individual vectors.
- Non-invariance: PCA of X versus PCA of XL (either diagonal L or general L).
- No logical reason why the smallest PC should not be best for predicting Y.
- Logical conclusion: PCA cannot be useful.
- Yet it continues to be used by research workers. And we are their methodological leaders, so we must follow them!
- N.B. logical argument = coordinate-free linear-algebra argument.
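The non-invariance point is easy to demonstrate numerically. The sketch below (a numpy illustration, not part of the talk) compares the leading principal direction of X with that of XL for a diagonal rescaling L: the two directions differ substantially, even though span(X) = span(XL) as a regression design.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3))

def first_pc(M):
    # Leading eigenvector of the sample covariance of M.
    cov = np.cov(M, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)
    return vecs[:, -1]

# A diagonal rescaling of the columns (the "L" of the slide).
L = np.diag([1.0, 10.0, 0.1])
v_X = first_pc(X)
v_XL = first_pc(X @ L)

# span(X) = span(XL) as a regression design, yet the leading
# principal directions differ substantially.
print(abs(v_X @ v_XL))
```

After rescaling, the second column dominates the variance, so the leading PC of XL aligns with it regardless of any relation to Y.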
When/where can PCA be successful?
Example: sub-population structure in genetics
Yi = ethnic origin of individual i
xir = allele at SNP locus r on individual i
n = x000, p = 10 * n, both large
Every allele is a potential discriminator
No single allele (or pair) is a good discriminator
First few PCs Xξ, with ξ the leading eigenvectors of X′X, sufficient to identify sub-groups
sub-groups related to certain measurable characteristics
Sub-group structure in X may also be related to Y .
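A minimal simulation in the spirit of this example (all parameter values are illustrative assumptions, not from the talk): two sub-populations differ slightly in allele frequency at every locus, so no single column discriminates well, yet the first PC separates the groups cleanly.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 200, 2000                      # individuals, SNP loci (p = 10 * n)
groups = np.repeat([0, 1], n // 2)    # two hypothetical sub-populations

# Allele frequencies differ slightly between groups at every locus:
# each column is a weak discriminator, but jointly they are strong.
base = rng.uniform(0.2, 0.8, size=p)
shift = 0.05 * rng.choice([-1, 1], size=p)
freqs = np.where(groups[:, None] == 0, base + shift, base - shift)
X = rng.binomial(2, freqs).astype(float)   # genotype counts 0, 1, 2

Xc = X - X.mean(axis=0)                    # centre each locus
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
pc1 = U[:, 0] * s[0]                       # scores on the first PC

# The first-PC scores should separate the two sub-groups almost perfectly.
print(abs(np.corrcoef(pc1, groups)[0, 1]))
```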
What is different about the genetics example?
- Natural to regard X as random.
- Each column (locus) of X is measured on the same scale.
- Natural to regard X as a process indexed by loci (columns) and individuals (rows).
- Exchangeable rows for individuals, possibly exchangeable columns for loci.
- What models do we have for exchangeable arrays?
- What interesting models do we have for patterned covariance matrices?
- A class intermediate between Ip and general Σ?
- Relation to PCA?
Rooted trees as covariance matrices
Defn: a symmetric matrix Σ of order n such that Σrs ≥ min{Σrt, Σst} ≥ 0 for all r, s, t; it is spherical if Σii = Σjj for all i, j.
Table 1a. Viewer preference correlations for 10 programmes (Ehrenberg, 1981)
WoS 1.000 0.581 0.622 0.505 0.296 0.140 0.187 0.145 0.093 0.078
MoD 0.581 1.000 0.593 0.473 0.326 0.121 0.131 0.082 0.039 0.049
GrS 0.622 0.593 1.000 0.474 0.341 0.142 0.181 0.132 0.070 0.085
PrB 0.505 0.473 0.474 1.000 0.309 0.124 0.168 0.106 0.065 0.092
RgS 0.296 0.327 0.341 0.309 1.000 0.121 0.147 0.064 0.051 0.097
24H 0.140 0.122 0.142 0.124 0.121 1.000 0.524 0.395 0.243 0.266
Pan 0.187 0.131 0.181 0.168 0.147 0.524 1.000 0.352 0.200 0.197
ThW 0.145 0.082 0.132 0.106 0.064 0.395 0.352 1.000 0.270 0.188
ToD 0.093 0.039 0.070 0.065 0.051 0.243 0.200 0.270 1.000 0.155
LnU 0.078 0.049 0.085 0.092 0.097 0.266 0.197 0.188 0.155 1.000
Table 1b. Fitted tree for viewer preference correlations
WoS 0.99 0.59 0.61 0.48 0.32 0.10 0.10 0.10 0.10 0.10
MoD 0.59 1.01 0.59 0.48 0.32 0.10 0.10 0.10 0.10 0.10
GrS 0.61 0.59 0.99 0.48 0.32 0.10 0.10 0.10 0.10 0.10
PrB 0.48 0.48 0.48 1.00 0.32 0.10 0.10 0.10 0.10 0.10
RgS 0.32 0.32 0.32 0.32 1.00 0.10 0.10 0.10 0.10 0.10
24H 0.10 0.10 0.10 0.10 0.10 0.96 0.51 0.36 0.25 0.20
Pan 0.10 0.10 0.10 0.10 0.10 0.51 1.01 0.36 0.25 0.20
ThW 0.10 0.10 0.10 0.10 0.10 0.36 0.36 0.99 0.25 0.20
ToD 0.10 0.10 0.10 0.10 0.10 0.25 0.25 0.25 1.03 0.20
LnU 0.10 0.10 0.10 0.10 0.10 0.20 0.20 0.20 0.20 1.01
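The defining inequality can be verified by brute force. A short numpy sketch (illustrative, not from the talk) checks the condition on a small block matrix in the spirit of Table 1b, and on a matrix that violates it.

```python
import numpy as np

def is_tree_covariance(S, tol=1e-9):
    # Check S[r, s] >= min(S[r, t], S[s, t]) >= 0 for all r, s, t.
    if np.any(S < -tol):
        return False
    n = S.shape[0]
    for r in range(n):
        for s in range(n):
            for t in range(n):
                if S[r, s] < min(S[r, t], S[s, t]) - tol:
                    return False
    return True

# A small block matrix in the spirit of Table 1b.
S = np.array([[1.00, 0.59, 0.10],
              [0.59, 1.00, 0.10],
              [0.10, 0.10, 1.00]])
print(is_tree_covariance(S))       # True

# Breaking the inequality: 0.05 < min(0.59, 0.10).
S_bad = S.copy()
S_bad[0, 2] = S_bad[2, 0] = 0.05
print(is_tree_covariance(S_bad))   # False
```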
[Figure 1 (line drawing, not reproduced): rooted tree over the ten programmes, with WoS, GrS, MoD, PrB, RgS on one main branch and 24H, Pan, ThW, ToD, LnU on the other.]

Figure 1: Rooted tree illustrating an approximate correlation matrix. For each pair of variables, the correlation in percent is the distance from the root to the junction.
Matrix random-effects models
B a partition of [n] means Bij = 1 if i, j are in the same block (B may be random); #B = number of blocks or clusters.
X a random matrix of order n × p.

(i) X ∼ N(0, In ⊗ Ip + B ⊗ Σ), Σ arbitrary;
equivalent to Xij = εij + η_{b(i),j} with η ∼ Np(0, Σ).
Implies X′X/n ≃ Ip + Σ, with eigenvalues 1 + λ.
Z = Xξ ∼ Nn(0, In + λB), equivalent to Zi = εi + λ^{1/2} η_{b(i)},
with λ1 maximizing the rms distance between blocks.
(ii) More elaborate version with B a random partition (Ewens distribution).
(iii) Replace the stump B with a fully branched tree...
(iv) ...or an exchangeable random tree.
(v) More principled version: X a process on loci (stationary Σ).
(vi) Relation between partition/tree B and response Y?
plausibly related but no necessary connection!
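Model (i) can be simulated directly. The sketch below (numpy, with illustrative block sizes and an illustrative Σ, not from the talk) draws X_ij = ε_ij + η_{b(i),j} and confirms that X′X/n is close to I_p + Σ when there are many blocks.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, block_size = 2000, 3, 4        # illustrative sizes
n_blocks = n // block_size
b = np.repeat(np.arange(n_blocks), block_size)   # block label b(i)

Sigma = np.array([[1.0, 0.8, 0.0],   # an illustrative Σ
                  [0.8, 1.0, 0.0],
                  [0.0, 0.0, 0.5]])
A = np.linalg.cholesky(Sigma)

eps = rng.standard_normal((n, p))                # ε_ij
eta = rng.standard_normal((n_blocks, p)) @ A.T   # η_b ~ N_p(0, Σ), one per block
X = eps + eta[b]                                 # X_ij = ε_ij + η_{b(i),j}

S = X.T @ X / n
print(np.round(S, 2))    # close to I_p + Σ
```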
Conclusions
- Einstein quote, 'nature is tricky but not mean': a useful quote, but not relevant here.
- Reasons PCA is widely used in various fields: the need to do something (why not PCA?), and the absence of alternative things to do.
- No shortage of testimonials to PCA: it sometimes gives interesting projections.
- X needs to be a process in order to evade Cox's objection.
- Difficult to formulate a model in which the PC projection emerges as part of the likelihood solution.
- Sufficient reductions: not convincing, since R(X) is not a statistic.