Partial Least Squares Regression (PLSR)
• Partial least squares (PLS) is a method for constructing predictive models when the predictors are many and highly collinear.
• Note that the emphasis is on predicting the responses and not necessarily on trying to understand the underlying relationship between the variables.
• When prediction is the goal and there is no practical need to limit the number of measured factors, PLS can be a useful tool.
• PLS was developed in the 1960s by Herman Wold as an econometric technique, but some of its most avid proponents (including Wold's son Svante) are chemical engineers and chemometricians.
• Partial least squares regression (PLSR) is a multivariate data analytical technique designed to handle intercorrelated regressors.
• It is based on Herman Wold's general PLS principle, in which complicated, multivariate systems analysis problems are solved by a sequence of simple least squares regressions.
How Does PLS Work?
• In principle, MLR can be used with very many predictors.
• However, if the number of predictors gets too large (for example, greater than the number of observations), you are likely to get a model that fits the sampled data perfectly but that will fail to predict new data well.
• This phenomenon is called over-fitting.
• In such cases, although there are many manifest predictors, there may be only a few underlying or latent factors that account for most of the variation in the response.
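The over-fitting problem can be demonstrated numerically: with more predictors than observations, least squares fits even pure noise perfectly yet predicts new data poorly. A minimal NumPy sketch (hypothetical data, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(10)
n, p = 15, 40                       # more predictors than observations
X = rng.normal(size=(n, p))
y = rng.normal(size=n)              # pure noise: nothing real to predict

# The minimum-norm least squares solution fits the sample exactly...
beta = np.linalg.pinv(X) @ y
train_resid = np.linalg.norm(y - X @ beta)

# ...but fails on new data drawn from the same distribution.
X_new = rng.normal(size=(n, p))
y_new = rng.normal(size=n)
test_resid = np.linalg.norm(y_new - X_new @ beta)
print(round(train_resid, 8), round(test_resid, 3))
```

The training residual is essentially zero while the test residual is large: the model has memorized noise, which is exactly the situation latent-factor methods such as PLS are designed to avoid.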
• The general idea of PLS is to try to extract these latent factors, accounting for as much of the manifest predictor variation as possible while modeling the responses well.
• For this reason, the acronym PLS has also been taken to mean "projection to latent structure."
• The overall goal is to use the predictors to predict the responses in the population.
• This is achieved indirectly by extracting latent variables T and U from sampled factors and responses, respectively.
• The extracted factors T (also referred to as X-scores) are used to predict the Y-scores U, and then the predicted Y-scores are used to construct predictions for the responses.
• This procedure actually covers various techniques, depending on which source of variation is considered most crucial.
• PCR is based on the spectral decomposition of XᵀX, where X is the matrix of predictor values;
• PLS is based on the singular value decomposition of XᵀY.
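The two decompositions can be written out directly in NumPy (a sketch on random, purely illustrative data): the PCR directions come from XᵀX alone and ignore Y entirely, while the PLS directions come from the cross-product XᵀY and therefore couple the two blocks.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(20, 5))
Y = rng.normal(size=(20, 2))

# PCR direction: leading eigenvector of X^T X (unsupervised, ignores Y).
eigvals, eigvecs = np.linalg.eigh(X.T @ X)
pcr_dir = eigvecs[:, -1]            # eigenvector of the largest eigenvalue

# PLS direction: leading left singular vector of X^T Y (couples X and Y).
U, s, Vt = np.linalg.svd(X.T @ Y)
pls_dir = U[:, 0]

print(pcr_dir.shape, pls_dir.shape)
```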
• If the number of extracted factors is greater than or equal to the rank of the sample factor space, then PLS is equivalent to MLR.
• An important feature of the method is that usually far fewer factors are required.
• One approach to extracting the optimum number of factors is to construct the PLS model for a given number of factors on one set of data and then to test it on another, choosing the number of extracted factors for which the total prediction error is minimized.
• Alternatively, van der Voet (1994) suggests choosing the smallest number of extracted factors whose residuals are not significantly greater than those of the model with minimum error.
• If no convenient test set is available, then each observation can be used in turn as a test set; this is known as cross-validation.
• PLSR is a bilinear regression method that extracts a small number of factors, ta, a = 1, 2, …, A, that are linear combinations of the K X-variables, and uses these factors as regressors for y.
• What is special about PLSR compared to principal component regression (PCR) is that the y-variable is used actively in determining how the regression factors ta are computed from X.
• Each PLSR factor ta is defined so that it describes as much as possible of the covariance between X and y remaining after the previous a-1 factors have been estimated and subtracted.
• The purpose of using PLSR in multivariate calibration is to obtain good insight and good predictive ability at the same time.
• In classical stepwise multiple linear regression (SMLR) the collinearity is handled by picking out a small subset of individual, distinctly different X variables from all the available X variables.
• This reduced subset is used as regressors for y, leaving the other X variables unused.
• The estimated factors are often defined to be orthogonal to one another.
• The model for regressions on estimated latent variables can be summarized as follows:
T = w(X)
X = p(T) + E
y = q(T) + f
y = q(w(X)) + f = b(X) + f
• In practice, the model parameters have to be estimated from empirical data.
• Since the regression is intended for later prediction of y from X, the factor scores T are generally defined as functions of X: T = w(X).
• The major difference between calibration methods is how T is estimated.
• For instance, in PCR it is estimated as a series of eigenvector spectra of the centred cross-product matrix (X − 1x̄ᵀ)ᵀ(X − 1x̄ᵀ), etc.
• In PLSR w() is defined as a sequence of X versus y covariances.
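This covariance definition is easy to verify: on mean-centred data, the first PLS weight vector is simply the normalised vector of X-versus-y covariances, Xᵀy. A NumPy sketch on hypothetical data:

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(30, 4))
y = X @ np.array([1.0, 0.5, 0.0, -0.5]) + 0.1 * rng.normal(size=30)

# Centre the data, as PLSR works on mean-centred X and y.
Xc = X - X.mean(axis=0)
yc = y - y.mean()

# First PLS weight vector: the normalised X-versus-y covariance vector.
w1 = Xc.T @ yc
w1 /= np.linalg.norm(w1)
t1 = Xc @ w1                        # first score vector t1 = Xc w1
print(np.round(w1, 3))
```

The resulting score t1 is, by construction, the linear combination of the X-variables with maximal covariance with y.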
PLS-Regression (PLS-R): A Powerful Alternative to PCR
• It is possible to obtain the same prediction results as PCR, but based on a smaller number of components, by allowing the y-data structure to intervene directly in the X-decomposition.
• This is done by condensing the two-stage PCR process into just one: PLS-R (Partial Least Squares Regression).
• Usually the term used is just PLS, which has also been interpreted to signify Projection to Latent Structures.
• PLS claims to do the same job as PCR, only with fewer bilinear components.
PLS(X, Y): Initial Comparison with PCA(X), PCA(Y)
• Compared with PCR, PLS uses the y-data structure, the y-variance, directly as a guiding hand in decomposing the X-matrix, so that the outcome constitutes an optimal regression, precisely in the strict prediction-validation sense.
• A first approximation to understanding how the PLS approach works (though not entirely correct) is simply to view it as two simultaneous PCA analyses: PCA of X and PCA of Y.
• The equivalent PCA equations are shown below.
• Note how the score and loading complements in X are called T and P respectively (X also has an alternative W-loading in addition to the familiar P-loading), while these are called U and Q respectively for the Y-space.
X = TPᵀ + E
Y = UQᵀ + F
• However PLS does not really perform two independent PCA-analyses on the two spaces.
• On the contrary, PLS actively connects the X- and Y-spaces by specifying the u-score vector(s) to act as the starting points for (actually instead of) the t-score vectors in the X-space decomposition.
w = loading weights; p = X-loadings; q = Y-loadings
• Thus the starting proxy-t1 is actually u1 in the PLS-R method, thereby letting the Y-data structure directly guide the otherwise much more “PCA-like” decomposition of X.
• u1 is later substituted by t1 at the relevant stage of the PLS algorithm in which the Y-space is decomposed.
• The crucial point is that it is the u1 (reflecting the Y-space structure) that first influences the X-decomposition leading to calculation of the X-loadings, but these are now termed “w” (for “loading-weights”).
• Then the X-space t-vectors are calculated, formally in a “standard” PCA fashion, but necessarily based on this newly calculated w-vector.
• This t-vector is now immediately used as the starting proxy-u1 vector, i.e. instead of u1, exactly as described above but with the X- and Y-spaces interchanged.
• By this means, the X-data structure also influences the “PCA (Y)-like” decomposition.
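The score-interchange described above can be written as a minimal NIPALS-style sketch for a single component, on mean-centred data. This is a simplified illustration (one component, no deflation loop), not a full production implementation:

```python
import numpy as np

def pls_one_component(X, Y, n_iter=100, tol=1e-10):
    """One PLS component via the NIPALS score interchange (sketch)."""
    u = Y[:, [0]]                       # start proxy-t from a Y-score column
    for _ in range(n_iter):
        w = X.T @ u                     # loading weights from u (Y-guided)
        w /= np.linalg.norm(w)
        t = X @ w                       # X-scores from the new weights
        q = Y.T @ t / (t.T @ t)         # Y-loadings regressed on t
        u_new = Y @ q / (q.T @ q)       # updated Y-scores (t acts as proxy-u)
        converged = np.linalg.norm(u_new - u) < tol
        u = u_new
        if converged:
            break
    p = X.T @ t / (t.T @ t)             # X-loadings, used for deflation
    return w, t, p, q, u

rng = np.random.default_rng(5)
X = rng.normal(size=(30, 5)); X -= X.mean(axis=0)
Y = rng.normal(size=(30, 2)); Y -= Y.mean(axis=0)
w, t, p, q, u = pls_one_component(X, Y)
print(t.shape, w.shape)
```

Subsequent components would be obtained by deflating X (and optionally Y) with the rank-one contributions t pᵀ and t qᵀ, then repeating.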
B = W(PᵀW)⁻¹Qᵀ
• Thus, what might at first sight appear as two sets of independent PCA decompositions is in fact based on these interchanged score vectors.
• In this way we have achieved the goal of modeling the X- and Y-space interdependently. PLS actively reduces the influence of large X-variations which do not correlate with Y.
• PCR is based on the spectral decomposition of XᵀX, where X is the matrix of variables, and PLS is based on the singular value decomposition of XᵀY.
• Alternative overview of PLS (indirect modeling) states that the overall goal is to use the variables to predict the responses in the population.
• This is achieved indirectly by extracting latent variables T and U from sampled variables and responses, respectively.
• The extracted factors T (also referred to as X-scores) are used to predict the Y-scores U, and then the predicted Y-scores are used to construct predictions for the responses.
Interpretation of PLS Models
• In principle PLS models are interpreted in much the same way as PCA and PCR models.
• Plotting the X- and the Y-loadings in the same plot allows you to study the inter-variable relationship, now also including the relationship between the X- and Y-variables.
• Since PLS focuses on Y, the Y-relevant information is usually captured already in the early components.
• There are however situations where the variation related to Y is very subtle, so many components will be necessary to explain enough of Y.
Loadings (p) and Loading Weights (w)
• The P-loadings are very much like the well-known PCA loadings; they express the relationship between the raw data matrix X and its scores, T (in PLS these may be called PLS scores).
• These loadings may be interpreted in the same way as in PCA or PCR, as long as one is aware that the scores have been calculated by PLS.
• In many PLS applications P and W are quite similar. This means that the dominant structures in X “happen” to be directed more or less along the same directions as those with maximum correlation to Y.
• The loading weights, W, however, represent the effective loadings directly connected to building the sought-for regression relationship between X and Y.
• In PLS there is also a set of Y-loadings, Q, which are the regression coefficients from the Y-variables onto the scores, U.
• Q and W may be used to interpret relationships between the X- and Y-variables, and to interpret the patterns in the score plots related to these loadings.
Loading Plot of Non-Spectral Variables
Loading Plot of Spectral Variables
• The fact that both P and W are important, however, is clear from the construction of the formal regression equation Y = XB from any specific PLS solution with A components.
• This B-matrix is calculated from:
B = W(PᵀW)⁻¹Qᵀ
This B-matrix is often used for practical (numerical) prediction purposes.
When to Use Which Method?
• The PLS approach is easy to understand conceptually and is often preferred because it is direct and effective.
• PLS is said to produce results, which are easier to interpret because they are less complex (using fewer components).
• Often PCR may give prediction errors as low as those of PLS, but almost invariably by using more PCs to do the job.
• PLS2 is a natural method to start with when there are many Y-variables.
• You quickly get an overview of the basic patterns and see if there is significant correlation between the Y-variables.
• PLS2 may actually in a few cases even give better results if Y is collinear, because it utilises all the available information in Y.
• The drawback is that you may need different numbers of PCs for the different Y-variables, which you must keep in mind during interpretation and prediction.
Exercise: Interpretation of PLS (Jam)