PCA and admixture models (web.cs.ucla.edu/~sriram/courses/cm226.fall-2016/slides/pca.1.pdf)
TRANSCRIPT
PCA and admixture models
CM226: Machine Learning for Bioinformatics. Fall 2016
Sriram Sankararaman
Acknowledgments: Fei Sha, Ameet Talwalkar, Alkes Price
PCA and admixture models 1 / 57
Announcements
• HW1 solutions posted.
Supervised versus Unsupervised Learning
Unsupervised Learning from unlabeled observations
• Dimensionality Reduction. Last class.
• Other latent variable models. This class + review of PCA.
Outline
Dimensionality reduction
Linear Algebra background
PCA
Practical issues
Probabilistic PCA
Admixture models
Population structure and GWAS
Raw data can be complex, high-dimensional
• If we knew what to measure, we could find simple relationships.
• Signals have redundancy.
• Genotype measured at ≈ 500K SNPs.
• Genotypes at neighboring SNPs correlated.
Dimensionality reduction
Goal: Find a “more compact” representation of data.
Why?
• Visualize and discover hidden patterns.
• Preprocessing for a supervised learning problem.
• Statistical: remove noise.
• Computational: reduce wasteful computation.
An example
• We measure parents’ and offspring heights.
• Two measurements: points in R^2.
• How can we find a more “compact” representation?
• Two measurements are correlated with some noise.
• Pick a direction and project.
Goal: Minimize reconstruction error
• Find the projection that minimizes the Euclidean distance between original points and their projections.
• Principal Components Analysis solves this problem!
Principal Components Analysis
PCA: find lower dimensional representation of data
• Choose K.
• X is N ×M raw data.
• X ≈ ZW^T, where Z is the N × K reduced representation (PC scores).
• W is the M × K matrix of principal components (columns are principal components).
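As a concrete sketch of this factorization (illustrative NumPy code, not from the slides; the function name `pca` and the synthetic data are assumptions), X ≈ ZWᵀ can be computed by eigendecomposing the covariance matrix:

```python
import numpy as np

def pca(X, K):
    """PCA via eigendecomposition of the covariance matrix.

    X : (N, M) data matrix, one sample per row.
    K : number of components to keep.
    Returns Z (N, K) PC scores and W (M, K) principal components.
    """
    X = X - X.mean(axis=0)                # center each feature
    C = X.T @ X / X.shape[0]              # M x M covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)  # eigh: ascending eigenvalues
    order = np.argsort(eigvals)[::-1]     # sort descending
    W = eigvecs[:, order[:K]]             # top-K eigenvectors as columns
    Z = X @ W                             # PC scores
    return Z, W

# Example: 2-D correlated data reduced to 1 dimension
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2)) @ np.array([[2.0, 0.5], [0.5, 1.0]])
Z, W = pca(X, K=1)
X_hat = Z @ W.T   # rank-1 reconstruction of the (centered) data
```

With K = M the reconstruction is exact, since the eigenvectors form a complete orthonormal basis.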
Outline
Dimensionality reduction
Linear Algebra background
PCA
Practical issues
Probabilistic PCA
Admixture models
Population structure and GWAS
Covariance matrix
\[
C = \frac{1}{N} X^T X
\]
• Generalizes to many features
• C_{i,i}: variance of feature i
• C_{i,j}: covariance of features i and j
• Symmetric
Covariance matrix
\[
C = \frac{1}{N} X^T X
\]

• Positive semi-definite (PSD). Sometimes written C ⪰ 0.

(Positive semi-definite matrix) A matrix A ∈ R^{n×n} is positive semi-definite iff v^T A v ≥ 0 for all v ∈ R^n.
Covariance matrix
\[
C = \frac{1}{N} X^T X
\]

• Positive semi-definite (PSD). Sometimes written C ⪰ 0:

\[
v^T C v \;\propto\; v^T X^T X v \;=\; (Xv)^T (Xv) \;=\; \sum_{i=1}^{N} (Xv)_i^2 \;\geq\; 0
\]
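This fact is easy to check numerically; a minimal sketch with synthetic data (the dimensions and random draws here are illustrative):

```python
import numpy as np

# Any covariance matrix C = X^T X / N is positive semi-definite:
# v^T C v = ||Xv||^2 / N >= 0 for every v.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 4))
X -= X.mean(axis=0)                 # center the features
C = X.T @ X / X.shape[0]            # 4 x 4 covariance matrix

# v^T C v is a (scaled) sum of squares, hence never negative
for _ in range(1000):
    v = rng.normal(size=4)
    assert v @ C @ v >= -1e-12      # non-negative up to round-off

# Equivalently, all eigenvalues of C are >= 0
assert np.all(np.linalg.eigvalsh(C) >= -1e-12)
```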
Covariance matrix
\[
C = \frac{1}{N} X^T X
\]

• All covariance matrices (being symmetric and PSD) have an eigendecomposition.
Eigenvector and eigenvalue
(Eigenvector and eigenvalue) A vector v is an eigenvector of A ∈ R^{n×n} if Av = λv; the scalar λ is the eigenvalue associated with v.
Eigendecomposition of a covariance matrix
• C is symmetric ⇒ its eigenvectors {u_i}, i ∈ {1, . . . , M}, can be chosen to be orthonormal:
• u_i^T u_j = 0, i ≠ j
• u_i^T u_i = 1
• We can order the eigenvectors so that the eigenvalues are decreasing: λ_1 ≥ λ_2 ≥ . . . ≥ λ_M.
Eigendecomposition of a covariance matrix
\[
C u_i = \lambda_i u_i, \quad i \in \{1, \ldots, M\}
\]

Arrange U = [u_1 . . . u_M]. Then

\[
\begin{aligned}
CU &= C[u_1 \ldots u_M] \\
&= [Cu_1 \ldots Cu_M] \\
&= [\lambda_1 u_1 \ldots \lambda_M u_M] \\
&= [u_1 \ldots u_M]
\begin{pmatrix}
\lambda_1 & 0 & \cdots & 0 \\
\vdots & \ddots & & \vdots \\
0 & 0 & \cdots & \lambda_M
\end{pmatrix} \\
&= U\Lambda
\end{aligned}
\]
Eigendecomposition of a covariance matrix
CU = UΛ
Now U is an orthogonal matrix, so U U^T = I_M. Therefore

\[
C = C U U^T = (CU) U^T = U \Lambda U^T
\]
Eigendecomposition of a covariance matrix
\[
C = U \Lambda U^T
\]

• U is an M × M orthogonal matrix; its columns are the eigenvectors, sorted by eigenvalue.
• Λ is a diagonal matrix of eigenvalues.
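A quick NumPy check of this decomposition (synthetic data; `np.linalg.eigh` is the standard routine for symmetric matrices):

```python
import numpy as np

# Verify the eigendecomposition C = U Lambda U^T for a covariance matrix.
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 5))
X -= X.mean(axis=0)
C = X.T @ X / X.shape[0]

eigvals, U = np.linalg.eigh(C)          # eigh returns ascending eigenvalues
eigvals, U = eigvals[::-1], U[:, ::-1]  # reorder so lambda_1 >= ... >= lambda_M
Lam = np.diag(eigvals)

assert np.allclose(U @ Lam @ U.T, C)    # C = U Lambda U^T
assert np.allclose(U.T @ U, np.eye(5))  # columns are orthonormal
```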
Eigendecomposition: Example
Covariance matrix: Ψ
Alternate characterization of eigenvectors
• Eigenvectors are orthonormal directions of maximum variance
• Eigenvalues are the variance in these directions.
• The first eigenvector is the direction of maximum variance, with variance = λ_1.
Alternate characterization of eigenvectors
Given covariance matrix C ∈ RM×M
\[
x^* = \arg\max_{x} x^T C x \quad \text{subject to } \|x\|_2 = 1
\]

Solution: x^* = u_1, the first eigenvector of C.

• Example of a constrained optimization problem.
• Why do we need the constraint? (Without it the objective is unbounded: scaling x by c scales x^T C x by c^2.)
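A numerical sketch of this characterization (synthetic matrix; a Monte-Carlo check rather than a proof):

```python
import numpy as np

# Check that the first eigenvector of C maximizes x^T C x over unit vectors.
rng = np.random.default_rng(3)
A = rng.normal(size=(4, 4))
C = A.T @ A / 4                       # symmetric PSD test matrix

eigvals, U = np.linalg.eigh(C)        # ascending eigenvalues
u1, lam1 = U[:, -1], eigvals[-1]      # largest eigenpair

assert np.isclose(u1 @ C @ u1, lam1)  # u1 attains the value lambda_1
for _ in range(1000):
    x = rng.normal(size=4)
    x /= np.linalg.norm(x)            # random point on the unit sphere
    assert x @ C @ x <= lam1 + 1e-12  # never exceeds lambda_1
```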
Outline
Dimensionality reduction
Linear Algebra background
PCA
Practical issues
Probabilistic PCA
Admixture models
Population structure and GWAS
Back to PCA
Given N data points x_n ∈ R^M, n ∈ {1, . . . , N}, find a linear transformation from a lower-dimensional space K < M, i.e. W ∈ R^{M×K} and a projection z_n ∈ R^K, so that we can reconstruct the original data from the lower-dimensional projection:

\[
x_n \approx w_1 z_{n,1} + \ldots + w_K z_{n,K}
= [w_1 \ldots w_K]
\begin{pmatrix} z_{n,1} \\ \vdots \\ z_{n,K} \end{pmatrix}
= W z_n, \quad z_n \in R^K
\]

• We assume the data is centered: \(\sum_n x_{n,m} = 0\) for each feature m.

Compression
• We go from storing N × M values to M × K + N × K.

How do we define quality of reconstruction?
PCA
• Find z_n ∈ R^K and W ∈ R^{M×K} to minimize the reconstruction error

\[
J(W, Z) = \frac{1}{N} \sum_n \| x_n - W z_n \|_2^2, \qquad Z = [z_1, \ldots, z_N]^T
\]

• Require the columns of W to be orthonormal.
• The optimal solution is obtained by setting W = U_K, where U_K contains the K eigenvectors associated with the K largest eigenvalues of the covariance matrix C of X.
• The low-dimensional projection is z_n = W^T x_n.
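A sanity-check sketch (synthetic data, illustrative names): with W = U_K, no random orthonormal W achieves a lower reconstruction error.

```python
import numpy as np

def recon_error(X, W):
    """Mean squared reconstruction error (1/N) sum_n ||x_n - W W^T x_n||^2."""
    Z = X @ W
    return np.mean(np.sum((X - Z @ W.T) ** 2, axis=1))

rng = np.random.default_rng(4)
X = rng.normal(size=(300, 6)) @ rng.normal(size=(6, 6))
X -= X.mean(axis=0)
C = X.T @ X / X.shape[0]

eigvals, U = np.linalg.eigh(C)       # ascending eigenvalues
W_opt = U[:, -2:]                    # top K=2 eigenvectors
best = recon_error(X, W_opt)

# Random orthonormal 6x2 matrices never do better than the eigenvector choice
for _ in range(200):
    Q, _ = np.linalg.qr(rng.normal(size=(6, 2)))
    assert recon_error(X, Q) >= best - 1e-9
```

The minimum error equals the sum of the discarded eigenvalues, consistent with the variance-retained criterion later in the slides.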
PCA: K = 1
\[
\begin{aligned}
J(w_1, z_1) &= \frac{1}{N} \sum_n \| x_n - w_1 z_{n,1} \|_2^2 \\
&= \frac{1}{N} \sum_n (x_n - w_1 z_{n,1})^T (x_n - w_1 z_{n,1}) \\
&= \frac{1}{N} \sum_n \left( x_n^T x_n - 2 w_1^T x_n z_{n,1} + z_{n,1}^2\, w_1^T w_1 \right) \\
&= \mathrm{const} + \frac{1}{N} \sum_n \left( -2 w_1^T x_n z_{n,1} + z_{n,1}^2 \right)
\end{aligned}
\]

(using w_1^T w_1 = 1). To minimize this function, take the derivative with respect to z_{n,1} and set it to zero:

\[
\frac{\partial J(w_1, z_1)}{\partial z_{n,1}} = 0 \;\Rightarrow\; z_{n,1} = w_1^T x_n
\]
PCA: K = 1

Plugging back z_{n,1} = w_1^T x_n:

\[
\begin{aligned}
J(w_1) &= \mathrm{const} + \frac{1}{N} \sum_n \left( -2 w_1^T x_n z_{n,1} + z_{n,1}^2 \right) \\
&= \mathrm{const} + \frac{1}{N} \sum_n \left( -2 z_{n,1} z_{n,1} + z_{n,1}^2 \right) \\
&= \mathrm{const} - \frac{1}{N} \sum_n z_{n,1}^2
\end{aligned}
\]

Now, because the data is centered,

\[
\mathrm{E}[z_1] = \frac{1}{N} \sum_n z_{n,1} = \frac{1}{N} \sum_n w_1^T x_n = w_1^T \frac{1}{N} \sum_n x_n = 0
\]
PCA: K = 1
\[
J(w_1) = \mathrm{const} - \frac{1}{N} \sum_n z_{n,1}^2
\]

\[
\mathrm{Var}[z_1] = \mathrm{E}[z_1^2] - \mathrm{E}[z_1]^2 = \frac{1}{N} \sum_n z_{n,1}^2 - 0 = \frac{1}{N} \sum_n z_{n,1}^2
\]
PCA: K = 1
Putting it together:

\[
J(w_1) = \mathrm{const} - \frac{1}{N} \sum_n z_{n,1}^2, \qquad \mathrm{Var}[z_1] = \frac{1}{N} \sum_n z_{n,1}^2
\]

We have

\[
J(w_1) = \mathrm{const} - \mathrm{Var}[z_1]
\]

Two views of PCA: finding the direction that minimizes the reconstruction error ≡ finding the direction that maximizes the variance of the projected data.

\[
\arg\min_{w_1} J(w_1) = \arg\max_{w_1} \mathrm{Var}[z_1]
\]
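The equivalence of the two views can also be seen numerically; a sketch on synthetic 2-D data, scanning unit directions:

```python
import numpy as np

# Two views of PCA: over unit directions w, minimizing reconstruction error
# is the same as maximizing the variance of the projection w^T x.
rng = np.random.default_rng(5)
X = rng.normal(size=(500, 2)) @ np.array([[3.0, 1.0], [0.0, 1.0]])
X -= X.mean(axis=0)

def recon_err(w):        # (1/N) sum_n ||x_n - w (w^T x_n)||^2
    z = X @ w
    return np.mean(np.sum((X - np.outer(z, w)) ** 2, axis=1))

def proj_var(w):         # (1/N) sum_n (w^T x_n)^2
    return np.mean((X @ w) ** 2)

# Scan unit directions: both criteria select the same one
thetas = np.linspace(0.0, np.pi, 3600)
dirs = np.stack([np.cos(thetas), np.sin(thetas)], axis=1)
errs = np.array([recon_err(w) for w in dirs])
vars_ = np.array([proj_var(w) for w in dirs])
assert np.argmin(errs) == np.argmax(vars_)

# The identity behind it: recon_err(w) = mean ||x||^2 - proj_var(w)
total = np.mean(np.sum(X ** 2, axis=1))
assert np.allclose(errs, total - vars_)
```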
PCA: K = 1
\[
\arg\min_{w_1} J(w_1) = \arg\max_{w_1} \mathrm{Var}[z_1]
\]

\[
\begin{aligned}
\mathrm{Var}[z_1] &= \frac{1}{N} \sum_n z_{n,1}^2 \\
&= \frac{1}{N} \sum_n w_1^T x_n\, w_1^T x_n \\
&= \frac{1}{N} \sum_n w_1^T x_n x_n^T w_1 \\
&= w_1^T \frac{\sum_n x_n x_n^T}{N} w_1 \\
&= w_1^T C w_1
\end{aligned}
\]
PCA: K = 1
\[
\arg\min_{w_1} J(w_1) = \arg\max_{w_1} \mathrm{Var}[z_1]
\]

So we need to solve

\[
\arg\max_{w_1} w_1^T C w_1
\]

Since we required W to be orthonormal, we need the constraint ‖w_1‖_2 = 1.

This objective function is maximized when w_1 is the first eigenvector of C.
PCA: K > 1
• We can repeat the argument for K > 1.
• Since we require the directions w_k to be orthonormal, we can repeat the argument, searching for the direction that maximizes the remaining variance and is orthogonal to the previously selected directions.
Computing eigendecompositions
• Numerical algorithms compute all eigenvalues and eigenvectors in O(M^3) time.
• Infeasible for genetic datasets.
• Computing the largest eigenvalue and eigenvector: power iteration. O(M^2) per iteration.
• Since we are interested in covariance matrices, we can use algorithms that compute the singular value decomposition (SVD): O(MN^2). (Will discuss later.)
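A minimal power-iteration sketch (illustrative; the test matrix is constructed with a known spectrum so convergence is guaranteed):

```python
import numpy as np

def power_iteration(C, num_iters=1000):
    """Leading eigenvalue/eigenvector of a symmetric PSD matrix C.

    Each iteration is one matrix-vector product: O(M^2) for an M x M matrix.
    """
    rng = np.random.default_rng(0)
    v = rng.normal(size=C.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(num_iters):
        v = C @ v                     # amplify the leading eigendirection
        v /= np.linalg.norm(v)        # renormalize to avoid overflow
    lam = v @ C @ v                   # Rayleigh quotient
    return lam, v

# Test matrix with known eigenvalues (10 > 4 > 3 > 2 > 1): a clear spectral gap
rng = np.random.default_rng(6)
Q, _ = np.linalg.qr(rng.normal(size=(5, 5)))
C = Q @ np.diag([10.0, 4.0, 3.0, 2.0, 1.0]) @ Q.T

lam, v = power_iteration(C)
assert np.isclose(lam, 10.0)
```

Convergence is geometric at rate λ_2/λ_1, which is why a gap between the top two eigenvalues matters.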
Practical issues
Choosing K
• For visualization, K = 2 or K = 3.
• For other analyses, pick K so that most of the variance in the data is retained. The fraction of variance retained in the top K eigenvectors is

\[
\frac{\sum_{k=1}^{K} \lambda_k}{\sum_{m=1}^{M} \lambda_m}
\]
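A small helper sketch for this criterion (illustrative names, toy eigenvalue spectrum):

```python
import numpy as np

def variance_retained(eigvals, K):
    """Fraction of total variance captured by the top-K eigenvalues."""
    lam = np.sort(eigvals)[::-1]     # descending order
    return lam[:K].sum() / lam.sum()

def smallest_K(eigvals, threshold=0.95):
    """Smallest K retaining at least `threshold` of the variance."""
    lam = np.sort(eigvals)[::-1]
    frac = np.cumsum(lam) / lam.sum()
    return int(np.searchsorted(frac, threshold) + 1)

eigvals = np.array([5.0, 3.0, 1.0, 0.5, 0.5])   # toy spectrum
assert np.isclose(variance_retained(eigvals, 2), 0.8)
assert smallest_K(eigvals, 0.9) == 3
```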
PCA: Example
PCA on HapMap
PCA on Human Genome Diversity Project
PCA on European genetic data
Novembre et al. Nature 2008
Probabilistic interpretation of PCA
\[
z_n \overset{iid}{\sim} \mathcal{N}(0, I_K), \qquad p(x_n \mid z_n) = \mathcal{N}(W z_n, \sigma^2 I_M)
\]
Probabilistic interpretation of PCA
\[
z_n \overset{iid}{\sim} \mathcal{N}(0, I_K), \qquad p(x_n \mid z_n) = \mathcal{N}(W z_n, \sigma^2 I_M)
\]

\[
\begin{aligned}
\mathrm{E}[x_n \mid z_n] &= W z_n \\
\mathrm{E}[x_n] &= \mathrm{E}\left[\mathrm{E}[x_n \mid z_n]\right] = \mathrm{E}[W z_n] = W\,\mathrm{E}[z_n] = 0
\end{aligned}
\]
Probabilistic interpretation of PCA
\[
z_n \overset{iid}{\sim} \mathcal{N}(0, I_K), \qquad p(x_n \mid z_n) = \mathcal{N}(W z_n, \sigma^2 I_M)
\]

\[
\begin{aligned}
\mathrm{Cov}[x_n] &= \mathrm{E}[x_n x_n^T] - \mathrm{E}[x_n]\,\mathrm{E}[x_n]^T \\
&= \mathrm{E}\left[(W z_n + \varepsilon_n)(W z_n + \varepsilon_n)^T\right] - 0 \\
&= \mathrm{E}\left[W z_n z_n^T W^T + 2 W z_n \varepsilon_n^T + \varepsilon_n \varepsilon_n^T\right] \\
&= \mathrm{E}\left[W z_n z_n^T W^T\right] + \mathrm{E}\left[2 W z_n \varepsilon_n^T\right] + \mathrm{E}\left[\varepsilon_n \varepsilon_n^T\right] \\
&= W\,\mathrm{E}[z_n z_n^T]\,W^T + 2 W\,\mathrm{E}[z_n \varepsilon_n^T] + \sigma^2 I_M \\
&= W\,\mathrm{E}[z_n z_n^T]\,W^T + 2 W\,\mathrm{E}[z_n]\,\mathrm{E}[\varepsilon_n]^T + \sigma^2 I_M \\
&= W I_K W^T + 2 W \cdot 0 + \sigma^2 I_M \\
&= W W^T + \sigma^2 I_M
\end{aligned}
\]
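This identity can be checked by simulation; a sketch with illustrative dimensions and noise level:

```python
import numpy as np

# Simulate the PPCA model x = W z + eps and check Cov[x] = W W^T + sigma^2 I.
rng = np.random.default_rng(7)
M, K, sigma = 4, 2, 0.5
W = rng.normal(size=(M, K))

N = 200_000
Z = rng.normal(size=(N, K))              # z_n ~ N(0, I_K)
eps = sigma * rng.normal(size=(N, M))    # eps_n ~ N(0, sigma^2 I_M)
X = Z @ W.T + eps                        # x_n = W z_n + eps_n

C_hat = X.T @ X / N                      # sample covariance (mean is 0)
C_theory = W @ W.T + sigma**2 * np.eye(M)
assert np.allclose(C_hat, C_theory, atol=0.08)   # agree up to sampling noise
```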
Probabilistic PCA
Log likelihood:

\[
LL(W, \sigma^2) \equiv \log P(D \mid W, \sigma^2)
\]

Maximize over W subject to the constraint that the columns of W are orthonormal. The maximum likelihood estimator is

\[
W_{ML} = U_K \sqrt{\Lambda_K - \sigma^2 I_K}, \qquad
U_K = [u_1 \ldots u_K], \qquad
\Lambda_K =
\begin{pmatrix}
\lambda_1 & \cdots & 0 \\
\vdots & \ddots & \vdots \\
0 & \cdots & \lambda_K
\end{pmatrix}
\]

\[
\sigma^2_{ML} = \frac{1}{M - K} \sum_{j=K+1}^{M} \lambda_j
\]
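The maximum likelihood estimators translate directly to code; a sketch (the function name `ppca_mle` and the synthetic data are illustrative):

```python
import numpy as np

def ppca_mle(X, K):
    """Closed-form probabilistic-PCA MLE (Tipping & Bishop, 1999)."""
    N, M = X.shape
    Xc = X - X.mean(axis=0)
    C = Xc.T @ Xc / N
    eigvals, U = np.linalg.eigh(C)
    eigvals, U = eigvals[::-1], U[:, ::-1]   # descending eigenvalues
    sigma2 = eigvals[K:].mean()              # (1/(M-K)) sum_{j>K} lambda_j
    W = U[:, :K] @ np.diag(np.sqrt(eigvals[:K] - sigma2))
    return W, sigma2

rng = np.random.default_rng(8)
X = rng.normal(size=(500, 6)) @ rng.normal(size=(6, 6))
W, sigma2 = ppca_mle(X, K=2)

# Model covariance: top-K eigenvalues preserved, the rest flattened to sigma2
C_model = W @ W.T + sigma2 * np.eye(6)
```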
Probabilistic PCA
Computing the MLE
• Compute eigenvalues and eigenvectors (closed form).
• Or treat it as a hidden/latent variable problem: use EM.
Other advantages of Probabilistic PCA
Can use model selection to infer K.
• Choose K to maximize the marginal likelihood P (D|K).
• Use cross-validation and pick the K that maximizes the likelihood on held-out data.
• Other model selection criteria such as AIC or BIC (see lecture 6 on clustering).
Mini-Summary
• Dimensionality reduction: linear methods.
• Exploratory analysis and visualization.
• Downstream inference: can use the low-dimensional features for other tasks.
• Principal Components Analysis finds a linear subspace that minimizes reconstruction error or, equivalently, maximizes the variance.
• Eigenvalue problem.
• Probabilistic interpretation also leads to EM.
• Why may PCA not be appropriate for genetic data?