an introduction to Principal Component Analysis (PCA)
abstract
Principal component analysis (PCA) is a technique that is useful for the compression and classification of data. The purpose is to reduce the dimensionality of a data set (sample) by finding a new set of variables, smaller than the original set of variables, that nonetheless retains most of the sample's information.

By information we mean the variation present in the sample, given by the correlations between the original variables. The new variables, called principal components (PCs), are uncorrelated and are ordered by the fraction of the total information each retains.
overview
• geometric picture of PCs
• algebraic definition and derivation of PCs
• usage of PCA
• astronomical application
Geometric picture of principal components (PCs)
A sample of $n$ observations in the 2-D space $\mathbf{x} = (x_1, x_2)$

Goal: to account for the variation in a sample in as few variables as possible, to some accuracy
• the 1st PC $z_1$ is a minimum-distance fit to a line in the space
• the 2nd PC $z_2$ is a minimum-distance fit to a line in the plane perpendicular to the 1st PC

In general, the PCs are a series of linear least-squares fits to the sample, each orthogonal to all the previous.
Algebraic definition of PCs
Given a sample of $n$ observations on a vector of $p$ variables $\mathbf{x} = (x_1, x_2, \ldots, x_p)$, define the first principal component of the sample by the linear transformation

$$z_1 = \mathbf{a}_1^T \mathbf{x} = \sum_{i=1}^{p} a_{i1}\, x_i$$

where the vector $\mathbf{a}_1 = (a_{11}, a_{21}, \ldots, a_{p1})$ is chosen such that $\mathrm{var}[z_1]$ is maximum.
Likewise, define the $k$th PC of the sample by the linear transformation

$$z_k = \mathbf{a}_k^T \mathbf{x}, \qquad k = 1, \ldots, p$$

where the vector $\mathbf{a}_k = (a_{1k}, a_{2k}, \ldots, a_{pk})$ is chosen such that $\mathrm{var}[z_k]$ is maximum, subject to

$$\mathrm{cov}[z_k, z_l] = 0 \quad \text{for} \quad k > l \ge 1$$

and to $\mathbf{a}_k^T \mathbf{a}_k = 1$.
Algebraic derivation of coefficient vectors
To find $\mathbf{a}_1$, first note that

$$\mathrm{var}[z_1] = E[z_1^2] - (E[z_1])^2 = \mathbf{a}_1^T S\, \mathbf{a}_1$$

where $S = E\!\left[(\mathbf{x} - E[\mathbf{x}])(\mathbf{x} - E[\mathbf{x}])^T\right]$ is the covariance matrix for the variables $\mathbf{x}$.
To find $\mathbf{a}_1$, maximize $\mathrm{var}[z_1]$ subject to $\mathbf{a}_1^T \mathbf{a}_1 = 1$.

Let $\lambda$ be a Lagrange multiplier, then maximize

$$\mathbf{a}_1^T S\, \mathbf{a}_1 - \lambda\,(\mathbf{a}_1^T \mathbf{a}_1 - 1)$$

By differentiating with respect to $\mathbf{a}_1$ and setting the derivative to zero,

$$S\, \mathbf{a}_1 - \lambda\, \mathbf{a}_1 = 0 \quad\Longrightarrow\quad (S - \lambda I_p)\, \mathbf{a}_1 = 0$$

therefore $\mathbf{a}_1$ is an eigenvector of $S$ corresponding to eigenvalue $\lambda$.
We have maximized

$$\mathrm{var}[z_1] = \mathbf{a}_1^T S\, \mathbf{a}_1 = \lambda\, \mathbf{a}_1^T \mathbf{a}_1 = \lambda$$

So $\lambda$ is the largest eigenvalue of $S$, written $\lambda_1$.

The first PC $z_1$ retains the greatest amount of variation in the sample.
To find the next coefficient vector $\mathbf{a}_2$, maximize $\mathrm{var}[z_2]$ subject to $\mathrm{cov}[z_2, z_1] = 0$ and to $\mathbf{a}_2^T \mathbf{a}_2 = 1$.

First note that

$$\mathrm{cov}[z_2, z_1] = \mathbf{a}_1^T S\, \mathbf{a}_2 = \lambda_1\, \mathbf{a}_1^T \mathbf{a}_2$$

then let $\lambda$ and $\phi$ be Lagrange multipliers, and maximize

$$\mathbf{a}_2^T S\, \mathbf{a}_2 - \lambda\,(\mathbf{a}_2^T \mathbf{a}_2 - 1) - \phi\, \mathbf{a}_2^T \mathbf{a}_1$$
We find that $\mathbf{a}_2$ is also an eigenvector of $S$, whose eigenvalue $\lambda_2$ is the second largest. In general

$$\mathrm{var}[z_k] = \mathbf{a}_k^T S\, \mathbf{a}_k = \lambda_k$$

• The $k$th largest eigenvalue of $S$ is the variance of the $k$th PC.
• The $k$th PC $z_k$ retains the $k$th greatest fraction of the variation in the sample.
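The derivation above can be checked numerically. The following is a minimal sketch with NumPy (the 2-D sample and random seed are illustrative choices, not from the slides): the variance of the projection onto the top eigenvector of the sample covariance equals the largest eigenvalue, and no other unit direction yields a larger projected variance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative correlated 2-D sample: n observations of p = 2 variables.
n = 2000
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.3 * rng.normal(size=n)
X = np.column_stack([x1, x2])
X -= X.mean(axis=0)                      # measure about the sample mean

S = X.T @ X / (n - 1)                    # covariance matrix S
eigvals, eigvecs = np.linalg.eigh(S)     # eigenvalues in ascending order
a1 = eigvecs[:, -1]                      # eigenvector of the largest eigenvalue
lam1 = eigvals[-1]

# var[z1] = a1' S a1 equals the largest eigenvalue of S ...
z1_var = a1 @ S @ a1
print(np.isclose(z1_var, lam1))          # True

# ... and no other unit vector gives a larger projected variance.
thetas = np.linspace(0, np.pi, 181)
candidates = np.column_stack([np.cos(thetas), np.sin(thetas)])
variances = np.einsum('ij,jk,ik->i', candidates, S, candidates)
print(variances.max() <= lam1 + 1e-12)   # True
```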
Algebraic formulation of PCA
Given a sample of $n$ observations on a vector of $p$ variables $\mathbf{x}$, define a vector of $p$ PCs

$$\mathbf{z} = A^T \mathbf{x}$$

where $A$ is an orthogonal $p \times p$ matrix whose $k$th column $\mathbf{a}_k$ is the $k$th eigenvector of $S$.

Then $\Lambda = A^T S A$ is the covariance matrix of the PCs, being diagonal with elements $\lambda_1, \lambda_2, \ldots, \lambda_p$.
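This formulation can be sketched directly with NumPy (the synthetic 3-variable sample is an illustrative assumption): build $A$ from the eigenvectors of $S$, ordered by descending eigenvalue, and confirm that $A^T S A$ is diagonal.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative sample of n observations on p = 3 correlated variables.
n, p = 500, 3
X = rng.normal(size=(n, p)) @ rng.normal(size=(p, p))
X -= X.mean(axis=0)
S = X.T @ X / (n - 1)

# A: orthogonal p x p matrix whose k-th column is the k-th eigenvector of S,
# ordered so that lambda_1 >= lambda_2 >= ... >= lambda_p.
eigvals, A = np.linalg.eigh(S)
order = np.argsort(eigvals)[::-1]
eigvals, A = eigvals[order], A[:, order]

# Lambda = A' S A is the covariance matrix of the PCs:
# diagonal, with entries lambda_k.
Lam = A.T @ S @ A
print(np.allclose(Lam, np.diag(eigvals)))   # True
print(np.allclose(A.T @ A, np.eye(p)))      # True: A is orthogonal
```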
usage of PCA: Probability distribution for sample PCs
If (i) the $n$ observations of $\mathbf{x}$ in the sample are independent, and
(ii) $\mathbf{x}$ is drawn from an underlying population that follows a $p$-variate normal (Gaussian) distribution with known covariance matrix $\Sigma$,

then the sample covariance matrix follows a Wishart distribution, $(n-1)\,S \sim W_p(\Sigma, n-1)$;

else utilize a bootstrap approximation.
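One common form of the bootstrap alternative can be sketched as follows (a minimal illustration with NumPy; the heavy-tailed sample, the resampling scheme, and B = 1000 replicates are choices of this sketch, not from the slides): resample the observations with replacement and use the spread of the recomputed eigenvalues as an approximate sampling distribution.

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative non-Gaussian sample, so the Wishart result does not apply.
n, p = 300, 2
X = rng.standard_t(df=5, size=(n, p))    # heavy-tailed data
X[:, 1] += 0.5 * X[:, 0]

def largest_eigenvalue(sample):
    """Largest eigenvalue of the sample covariance matrix."""
    centered = sample - sample.mean(axis=0)
    S = centered.T @ centered / (len(sample) - 1)
    return np.linalg.eigvalsh(S)[-1]

# Bootstrap: resample the n observations with replacement, B times.
B = 1000
boot = np.array([largest_eigenvalue(X[rng.integers(0, n, size=n)])
                 for _ in range(B)])

# Percentile interval as an approximate 95% confidence interval for lambda_1.
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"bootstrap 95% interval for lambda_1: [{lo:.3f}, {hi:.3f}]")
```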
If (i) $S$ follows a Wishart distribution, and
(ii) the population eigenvalues $\tilde\lambda_k$ are all distinct,

then the following results hold as $n \to \infty$ (a tilde denotes a population quantity):

• all the $\lambda_k$ are independent of all the $\mathbf{a}_k$
• the $\lambda_k$ and the $\mathbf{a}_k$ are jointly normally distributed
• $E[\lambda_k] = \tilde\lambda_k\,[1 + O(1/n)]$
• $E[\mathbf{a}_k] = \tilde{\mathbf{a}}_k\,[1 + O(1/n)]$
and

• $\mathrm{var}[\lambda_k] = \dfrac{2\tilde\lambda_k^2}{n}\,[1 + O(1/n)]$
• $\mathrm{var}[\mathbf{a}_k] = \dfrac{\tilde\lambda_k}{n} \displaystyle\sum_{l \ne k} \dfrac{\tilde\lambda_l}{(\tilde\lambda_l - \tilde\lambda_k)^2}\, \tilde{\mathbf{a}}_l \tilde{\mathbf{a}}_l^T\,[1 + O(1/n)]$
usage of PCA: Inference about population PCs
If $\mathbf{x}$ follows a $p$-variate normal distribution, then analytic expressions exist* for

• MLE's of $\tilde\lambda_k$, $\tilde{\mathbf{a}}_k$, and the population covariance matrix
• confidence intervals for $\tilde\lambda_k$ and $\tilde{\mathbf{a}}_k$
• hypothesis testing for $\tilde\lambda_k$ and $\tilde{\mathbf{a}}_k$

else bootstrap and jackknife approximations exist
*see references, esp. Jolliffe
usage of PCA: Practical computation of PCs
In general it is useful to define standardized variables by

$$x_k^{*} = \frac{x_k}{\sqrt{\mathrm{var}(x_k)}}$$

If the $x_k$ are each measured about their sample mean, then the covariance matrix of $\mathbf{x}^{*} = (x_1^{*}, \ldots, x_p^{*})$ will be equal to the correlation matrix of $\mathbf{x}$, and the PCs will be dimensionless.
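This identity is easy to verify numerically. A minimal NumPy sketch (the mixed-scale variables are an illustrative assumption): center the data, divide each column by its sample standard deviation, and check that the covariance of the standardized variables matches the correlation matrix of the originals.

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative variables with very different scales/units.
n = 400
X = rng.normal(size=(n, 3)) * np.array([1.0, 50.0, 0.01])
X[:, 1] += 20 * X[:, 0]

Xc = X - X.mean(axis=0)                  # measure about sample means
Xstar = Xc / Xc.std(axis=0, ddof=1)      # standardized: x* = x / sqrt(var x)

# Covariance of the standardized variables ...
S_star = Xstar.T @ Xstar / (n - 1)
# ... equals the correlation matrix of the original variables.
R = np.corrcoef(X, rowvar=False)
print(np.allclose(S_star, R))            # True
```

The normalization cancels in the correlation coefficient, which is why the choice of `ddof` does not affect the agreement.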
Given a sample of $n$ observations on a vector of $p$ variables $\mathbf{x}$ (each measured about its sample mean), compute the covariance matrix

$$S = \frac{1}{n-1}\, X^T X$$

where $X$ is the $n \times p$ matrix whose $i$th row is the $i$th observation $\mathbf{x}_i^T$.

Then compute the $n \times p$ matrix

$$Z = X A$$

whose $i$th row is the PC score vector $\mathbf{z}_i^T$ for the $i$th observation.
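The computation can be sketched end-to-end with NumPy (the synthetic 4-variable sample is an illustrative assumption): the columns of $Z$ come out uncorrelated, with the variance of column $k$ equal to $\lambda_k$.

```python
import numpy as np

rng = np.random.default_rng(4)

# Illustrative n x p data matrix X, each column measured about its sample mean.
n, p = 250, 4
X = rng.normal(size=(n, p)) @ rng.normal(size=(p, p))
X -= X.mean(axis=0)

S = X.T @ X / (n - 1)                    # covariance matrix
eigvals, A = np.linalg.eigh(S)
order = np.argsort(eigvals)[::-1]        # sort eigenvalues descending
eigvals, A = eigvals[order], A[:, order]

Z = X @ A                                # n x p matrix of PC scores
# Row i of Z is the score vector z_i for observation i; the columns of Z
# are uncorrelated, and the variance of column k is lambda_k.
S_z = Z.T @ Z / (n - 1)
print(np.allclose(S_z, np.diag(eigvals)))   # True
```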
Write

$$\mathbf{x}_i = A\, \mathbf{z}_i = \sum_{k=1}^{p} z_{ik}\, \mathbf{a}_k$$

to decompose each observation into its PCs.
usage of PCA: Data compression
Because the $k$th PC retains the $k$th greatest fraction of the variation, we can approximate each observation by truncating the sum at the first $m < p$ PCs:

$$\mathbf{x}_i \approx \sum_{k=1}^{m} z_{ik}\, \mathbf{a}_k$$
Equivalently, reduce the dimensionality of the data from $p$ to $m < p$ by approximating

$$X \approx Z_m A_m^T$$

where $Z_m$ is the $n \times m$ portion of $Z$ and $A_m$ is the $p \times m$ portion of $A$.
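A minimal NumPy sketch of this compression (the nearly-planar synthetic data, noise level, and choice $m = 2$ are illustrative assumptions): the rank-$m$ reconstruction captures essentially all of the variation, and its mean squared error equals the variance carried by the discarded PCs.

```python
import numpy as np

rng = np.random.default_rng(5)

# Illustrative data that is nearly 2-dimensional, embedded in p = 5 dimensions.
n, p, m = 300, 5, 2
latent = rng.normal(size=(n, m)) @ rng.normal(size=(m, p)) * 3
X = latent + 0.05 * rng.normal(size=(n, p))   # small noise off the plane
X -= X.mean(axis=0)

S = X.T @ X / (n - 1)
eigvals, A = np.linalg.eigh(S)
order = np.argsort(eigvals)[::-1]
eigvals, A = eigvals[order], A[:, order]

Z = X @ A
X_approx = Z[:, :m] @ A[:, :m].T              # X ~ Z_m A_m', rank-m approximation

# Fraction of the total variation retained by the first m PCs.
retained = eigvals[:m].sum() / eigvals.sum()
print(retained > 0.99)                        # True: the sample is nearly planar

# Mean squared reconstruction error equals the variation in the discarded PCs.
mse = np.mean(np.sum((X - X_approx) ** 2, axis=1))
print(np.isclose(mse, eigvals[m:].sum() * (n - 1) / n))   # True
```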
astronomical application: PCs for elliptical galaxies

Rotating to the 1st PC in $B_T$–$\Sigma$ space improves the Faber-Jackson relation as a distance indicator (Dressler, et al. 1987).
astronomical application: Eigenspectra (KL transform)
Connolly, et al. 1995
references
Connolly and Szalay, et al., “Spectral Classification of Galaxies: An Orthogonal Approach”, AJ, 110, 1071-1082, 1995.
Dressler, et al., “Spectroscopy and Photometry of Elliptical Galaxies. I. A New Distance Estimator”, ApJ, 313, 42-58, 1987.
Efstathiou, G., and Fall, S.M., “Multivariate analysis of elliptical galaxies”, MNRAS, 206, 453-464, 1984.
Johnston, D.E., et al., “SDSS J0903+5028: A New Gravitational Lens”, AJ, 126, 2281-2290, 2003.
Jolliffe, Ian T., 2002, Principal Component Analysis (Springer-Verlag New York, Secaucus, NJ).
Lupton, R., 1993, Statistics in Theory and Practice (Princeton University Press, Princeton, NJ).
Murtagh, F., and Heck, A., Multivariate Data Analysis (D. Reidel Publishing Company, Dordrecht, Holland).
Yip, C.W., and Szalay, A.S., et al., “Distributions of Galaxy Spectral Types in the SDSS”, AJ, 128, 585-609, 2004.