02 murphy multi variate distanc
DESCRIPTION
fghfghgfhTRANSCRIPT
-
Multivariate Distance and SimilarityRobert F. MurphyCytometry Development Workshop 2000
-
General Multivariate DatasetWe are given values of p variables for n independent observationsConstruct an n x p matrix M consisting of vectors X1 through Xn each of length p
-
Multivariate Sample MeanDefine mean vector I of length p
ormatrix notationvector notation
-
Multivariate VarianceDefine variance vector s2 of length pmatrix notation
-
Multivariate Varianceor
vector notation
-
Covariance MatrixDefine a p x p matrix cov (called the covariance matrix) analogous to s2
-
Covariance MatrixNote that the covariance of a variable with itself is simply the variance of that variable
-
Univariate DistanceThe simple distance between the values of a single variable j for two observations i and l is
-
Univariate z-score DistanceTo measure distance in units of standard deviation between the values of a single variable j for two observations i and l we define the z-score distance
-
Bivariate Euclidean DistanceThe most commonly used measure of distance between two observations i and l on two variables j and k is the Euclidean distance
-
Multivariate Euclidean DistanceThis can be extended to more than two variables
-
Effects of variance and covariance on Euclidean distancePoints A and B have similar Euclidean distances from the mean, but point B is clearly more different from the population than point A.BAThe ellipse shows the 50% contour of a hypothetical population.
-
Mahalanobis DistanceTo account for differences in variance between the variables, and to account for correlations between variables, we use the Mahalanobis distance