
Page 1: Principal Component Analysis

Jing Gao, SUNY Buffalo

Page 2: Why Dimensionality Reduction?

• We have too many dimensions
  – Too many to reason about or obtain insights from
  – Too many to visualize
  – Too much noise in the data
• We need to “reduce” them to a smaller set of factors
  – Better representation of the data without losing much information
  – More effective data analyses can be built on the reduced-dimensional space: classification, clustering, pattern recognition

Page 3: Component Analysis

• Discover a new set of factors/dimensions/axes against which to represent, describe, or evaluate the data
• Factors are combinations of observed variables
  – May be more effective bases for insights
  – Observed data are described in terms of these factors rather than in terms of the original variables/dimensions

Page 4: Basic Concept

• Areas of variance in the data are where items can best be discriminated and key underlying phenomena observed
  – These are the areas of greatest “signal” in the data
• If two items or dimensions are highly correlated or dependent
  – They are likely to represent highly related phenomena
  – If they tell us about the same underlying variance in the data, combining them to form a single measure is reasonable

Page 5: Basic Concept

• So we want to combine related variables and focus on uncorrelated or independent ones, especially those along which the observations have high variance
• We want a smaller set of variables that explain most of the variance in the original data, in a more compact and insightful form
• These variables are called “factors” or “principal components”

Page 6: Principal Component Analysis

• The most common form of factor analysis
• The new variables/dimensions
  – Are linear combinations of the original ones
  – Are uncorrelated with one another
    • Orthogonal in dimension space
  – Capture as much of the original variance in the data as possible
  – Are called principal components (checked numerically in the sketch below)
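As a quick numerical check of these properties, here is a small sketch (my addition, not part of the original slides; assumes NumPy is available). It builds the components from the covariance matrix of some correlated data and confirms the new directions are orthogonal and the transformed variables uncorrelated:

```python
import numpy as np

rng = np.random.default_rng(0)
# 500 observations of 2 correlated variables (columns)
X = rng.normal(size=(500, 2)) @ np.array([[2.0, 0.8], [0.8, 1.0]])

Xc = X - X.mean(axis=0)                 # center each variable
S = (Xc.T @ Xc) / len(Xc)               # sample covariance matrix
vals, vecs = np.linalg.eigh(S)          # columns of vecs are the PC directions
Z = Xc @ vecs                           # data expressed in the new axes

print(np.round(vecs.T @ vecs, 6))       # identity: directions are orthogonal
print(np.round((Z.T @ Z) / len(Z), 6))  # diagonal: the PCs are uncorrelated
```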

Page 7: What are the new axes?

[Figure: data scattered over Original Variable A and Original Variable B, with new orthogonal axes PC 1 and PC 2 drawn along the directions of greatest variance.]

• Orthogonal directions of greatest variance in the data
• Projections along PC 1 discriminate the data most along any one axis

Page 8: Principal Components

• The first principal component is the direction of greatest variability (covariance) in the data
• The second is the next orthogonal (uncorrelated) direction of greatest variability
  – So first remove all the variability along the first component, and then find the next direction of greatest variability (see the deflation sketch below)
• And so on …
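The “remove the variability, then repeat” recipe can be run literally via deflation. A minimal sketch (my addition; power iteration is a swapped-in way to find the leading direction, not a method the slides prescribe):

```python
import numpy as np

def leading_direction(S, iters=1000):
    """Power iteration: dominant eigenvector of a symmetric matrix S."""
    v = np.ones(S.shape[0]) / np.sqrt(S.shape[0])
    for _ in range(iters):
        v = S @ v
        v /= np.linalg.norm(v)
    return v

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 3)) @ rng.normal(size=(3, 3))
S = np.cov(X, rowvar=False, bias=True)  # covariance of the 3 variables

a1 = leading_direction(S)               # first PC direction
lam1 = a1 @ S @ a1                      # variance along it
S2 = S - lam1 * np.outer(a1, a1)        # remove that variability from S
a2 = leading_direction(S2)              # next direction of greatest variance

print(round(float(a1 @ a2), 8))         # ~0.0: orthogonal, as claimed
```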

Page 9: Principal Components Analysis (PCA)

• Principle
  – A linear projection method to reduce the number of parameters
  – Transforms a set of correlated variables into a new set of uncorrelated variables
  – Maps the data into a space of lower dimensionality
• Properties
  – It can be viewed as a rotation of the existing axes to new positions in the space defined by the original variables
  – The new axes are orthogonal and represent the directions of maximum variability

Page 10: Algebraic definition of PCs

Given a sample of $n$ observations $x_1, x_2, \ldots, x_n$ on a vector of $p$ variables, define the first principal component of the sample by the linear transformation

$$z_1 = a_1^T x_j = \sum_{i=1}^{p} a_{i1}\, x_{ij}, \qquad j = 1, 2, \ldots, n,$$

where the vector $a_1 = (a_{11}, a_{21}, \ldots, a_{p1})^T$, applied to observation $x_j = (x_{1j}, x_{2j}, \ldots, x_{pj})^T$, is chosen such that $\mathrm{var}[z_1]$ is maximal.

Page 11: Algebraic derivation of PCs

To find $a_1$, first note that

$$\mathrm{var}[z_1] = E[(z_1 - \bar{z}_1)^2] = \frac{1}{n}\sum_{i=1}^{n}\left(a_1^T x_i - a_1^T \bar{x}\right)^2 = \frac{1}{n}\sum_{i=1}^{n} a_1^T (x_i - \bar{x})(x_i - \bar{x})^T a_1 = a_1^T S a_1,$$

where

$$S = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})^T$$

is the covariance matrix and $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ is the mean.

In the following, we assume the data is centered: $\bar{x} = 0$.

Page 12: Algebraic derivation of PCs

Assume the data is centered ($\bar{x} = 0$) and form the $p \times n$ matrix

$$X = [x_1, x_2, \ldots, x_n].$$

Then the covariance matrix is

$$S = \frac{1}{n} X X^T.$$

Page 13: Algebraic derivation of PCs

To find the $a_1$ that maximizes $\mathrm{var}[z_1]$ subject to $a_1^T a_1 = 1$, let $\lambda$ be a Lagrange multiplier and maximize

$$L = a_1^T S a_1 - \lambda\,(a_1^T a_1 - 1).$$

Setting the derivative to zero,

$$\frac{\partial L}{\partial a_1} = 2 S a_1 - 2\lambda a_1 = 0 \quad\Longrightarrow\quad S a_1 = \lambda a_1,$$

so $a_1$ is an eigenvector of $S$, and since $\mathrm{var}[z_1] = a_1^T S a_1 = \lambda$, it must correspond to the largest eigenvalue $\lambda_1$.

Page 14: Algebraic derivation of PCs

To find the next coefficient vector $a_2$ maximizing $\mathrm{var}[z_2]$ subject to $a_2^T a_2 = 1$ and to $z_2$ being uncorrelated with $z_1$,

$$\mathrm{cov}[z_2, z_1] = a_1^T S a_2 = \lambda_1 a_1^T a_2 = 0,$$

let $\lambda$ and $\phi$ be Lagrange multipliers, and maximize

$$L = a_2^T S a_2 - \lambda\,(a_2^T a_2 - 1) - \phi\, a_2^T a_1.$$
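The next page states the conclusion directly; the intermediate step (the standard argument, added here for completeness) is to set the derivative to zero,

$$\frac{\partial L}{\partial a_2} = 2 S a_2 - 2\lambda a_2 - \phi\, a_1 = 0,$$

and premultiply by $a_1^T$: since $a_1^T S a_2 = \lambda_1 a_1^T a_2 = 0$ and $a_1^T a_1 = 1$, this forces $\phi = 0$, leaving $S a_2 = \lambda a_2$.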

Page 15: Algebraic derivation of PCs

We find that $a_2$ is also an eigenvector of $S$, whose eigenvalue $\lambda_2$ is the second largest.

In general,

$$\mathrm{var}[z_k] = a_k^T S a_k = \lambda_k$$

• The $k$-th largest eigenvalue of $S$ is the variance of the $k$-th PC.
• The $k$-th PC $z_k$ retains the $k$-th greatest fraction of the variation in the sample.

Page 16: Algebraic derivation of PCs

• Main steps for computing PCs (a sketch in code follows below)
  – Form the covariance matrix $S$.
  – Compute its eigenvectors $\{a_i\}_{i=1}^{p}$.
  – Use the first $d$ eigenvectors $\{a_i\}_{i=1}^{d}$ to form the $d$ PCs.
  – The transformation $G$ is given by $G = [a_1, a_2, \ldots, a_d]$. A test point $x \in \mathbb{R}^p \mapsto G^T x \in \mathbb{R}^d$.
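A minimal NumPy sketch of these main steps (my illustration; the function name `pca` and the variable names are mine, not from the course materials):

```python
import numpy as np

def pca(X, d):
    """X: p x n data matrix (observations in columns). Returns (G, Y)."""
    xbar = X.mean(axis=1, keepdims=True)  # p x 1 mean vector
    Xc = X - xbar                         # center the data
    S = (Xc @ Xc.T) / X.shape[1]          # covariance matrix S = (1/n) X X^T
    vals, vecs = np.linalg.eigh(S)        # eigenvectors of S (ascending eigenvalues)
    order = np.argsort(vals)[::-1]        # largest variance first
    G = vecs[:, order[:d]]                # G = [a_1, ..., a_d], p x d
    Y = G.T @ Xc                          # each point x maps to G^T x
    return G, Y
```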

Page 17: Dimensionality Reduction

Linear transformation $G^T \in \mathbb{R}^{d \times p}$:

$$X \in \mathbb{R}^{p} \;\longrightarrow\; Y = G^T X \in \mathbb{R}^{d}$$

Original data $\rightarrow$ reduced data.
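Continuing the sketch above, a quick shape check of the transformation (again my illustration):

```python
rng = np.random.default_rng(2)
X = rng.normal(size=(10, 200))   # p = 10 variables, n = 200 observations
G, Y = pca(X, d=3)               # pca as defined in the sketch above
print(G.shape, Y.shape)          # (10, 3) and (3, 200): 10-dim data -> 3-dim
```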

Page 18: Steps of PCA

• Let $\bar{X}$ be the mean vector (taking the mean of all rows).
• Adjust the original data by the mean: $X' = X - \bar{X}$.
• Compute the covariance matrix $S$ of the adjusted $X$.
• Find the eigenvectors and eigenvalues of $S$.
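In NumPy the covariance step is a single call; one detail worth noting (my addition) is the divisor: `np.cov` divides by $n-1$ by default, while these slides use $1/n$, which is selected with `bias=True`:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(4, 50))              # 4 variables, 50 observations
Xc = X - X.mean(axis=1, keepdims=True)    # adjust by the mean vector

S_manual = (Xc @ Xc.T) / X.shape[1]       # the slides' 1/n convention
S_numpy = np.cov(X, bias=True)            # rows are variables by default
print(np.allclose(S_manual, S_numpy))     # True
```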

Page 19: Principal components - Variance

[Bar chart: variance (%) explained by each principal component, PC1 through PC10, with the y-axis running from 0 to 25%.]

Page 20: Transformed Data

• Eigenvalue $\lambda_j$ corresponds to the variance on component $j$
• Thus, sort the eigenvectors by $\lambda_j$ (see the snippet below)
• Take the first $d$ eigenvectors $a_i$, where $d$ is the number of top eigenvalues
• These are the directions with the largest variances:

$$\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_d \end{pmatrix} = \begin{pmatrix} a_1^T (x_i - \bar{x}) \\ a_2^T (x_i - \bar{x}) \\ \vdots \\ a_d^T (x_i - \bar{x}) \end{pmatrix}$$
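Since `np.linalg.eigh` returns eigenvalues in ascending order, the descending sort is easy to forget. A short sketch (my addition) of sorting and of choosing $d$ from the explained variance, in the spirit of the bar chart on page 19:

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(6, 100))
S = np.cov(X, bias=True)               # covariance of 6 variables

vals, vecs = np.linalg.eigh(S)         # NOTE: eigenvalues come back ascending
order = np.argsort(vals)[::-1]         # sort so the largest variance is first
vals, vecs = vals[order], vecs[:, order]

explained = 100.0 * vals / vals.sum()  # variance (%) per component
d = int(np.searchsorted(np.cumsum(explained), 90.0)) + 1  # smallest d covering 90%
print(explained.round(1), d)
```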

Page 21: An Example

X1   X2   X1'     X2'
19   63   -5.1    9.25
39   74   14.9   20.25
30   87    5.9   33.25
30   23    5.9  -30.75
15   35   -9.1  -18.75
15   43   -9.1  -10.75
15   32   -9.1  -21.75
30   73    5.9   19.25

Mean1 = 24.1, Mean2 = 53.8

[Scatter plots: the original data (X1 vs. X2) and the mean-adjusted data (X1' vs. X2'), the same cloud shifted so it is centered at the origin.]

Page 22: Covariance Matrix

• The covariance matrix of the adjusted data:

$$C = \begin{pmatrix} 75 & 106 \\ 106 & 482 \end{pmatrix}$$

• We find its eigenvectors and eigenvalues:
  – $a_1 = (0.21, -0.98)$, $\lambda_1 = 560.2$
  – $a_2 = (-0.98, -0.21)$, $\lambda_2 = 51.8$

Page 23: Transform to One-dimension

• We keep the dimension of $a_1 = (0.21, -0.98)$
• We obtain the final data as

$$y_i = \begin{pmatrix} 0.21 & -0.98 \end{pmatrix} \begin{pmatrix} x_{1i}' \\ x_{2i}' \end{pmatrix} = 0.21\, x_{1i}' - 0.98\, x_{2i}'$$

The projected values for the eight observations:

y_i: -10.14, -16.72, -31.35, 31.374, 16.464, 8.624, 19.404, -17.63

[Scatter plot: the eight values $y_i$ plotted along a single axis, spanning roughly -40 to 40.]
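The whole worked example fits in a few lines of NumPy (my sketch; it reuses the slides' rounded $a_1 = (0.21, -0.98)$ from page 22 rather than re-solving the eigenproblem, so the output matches the list above up to rounding):

```python
import numpy as np

X = np.array([[19, 63], [39, 74], [30, 87], [30, 23],
              [15, 35], [15, 43], [15, 32], [30, 73]], dtype=float)
Xc = X - X.mean(axis=0)         # mean-adjust; means are about 24.1 and 53.8
C = (Xc.T @ Xc) / len(Xc)       # covariance, about [[75, 106], [106, 482]]

a1 = np.array([0.21, -0.98])    # first PC direction as given on page 22
y = Xc @ a1                     # project each observation to one dimension
print(np.round(y, 2))           # -10.14 -16.72 -31.35  31.37  16.46 ...
```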