
Page 1: Non-linear dimension-reduction methods

Olga Sorkine, January 2006

Page 2: Overview

• Dimensionality reduction of high-dimensional data
• Good for learning, visualization and … parameterization

Page 3: Dimension reduction

• Input: points in some D-dimensional space (D is large)
  – Images
  – Physical measurements
  – Statistical data
  – etc.
• We want to discover some structure/correlation in the input data. Hopefully, the data lives on a d-dimensional surface (d << D).
  – Discover the real dimensionality d
  – Find a mapping from $\mathbb{R}^D$ to $\mathbb{R}^d$ that preserves something about the data
• Today we'll talk about preserving variance/distances.

Page 4: Discovering linear structures

• PCA – finds linear subspaces that best preserve the variance of the data points.

Page 5: Linear is sometimes not enough

• When our data points sit on a non-linear manifold
  – We won't find a good linear mapping from the data points to a plane, because there isn't any.

Page 6: Today

Two methods to discover such non-linear manifolds:
• Isomap (a descendant of Multidimensional Scaling, MDS)
• Locally Linear Embedding (LLE)

Page 7: Notations

• Input data points: the columns of $X \in \mathbb{R}^{D \times n}$:
$$X = \begin{bmatrix} | & | & & | \\ x_1 & x_2 & \cdots & x_n \\ | & | & & | \end{bmatrix}$$
• Assume that the center of mass of the points is the origin.

Page 8: Reminder about PCA

• PCA finds a linear d-dimensional subspace of $\mathbb{R}^D$ along which the variance of the data is largest.
• Denote by $x_1', x_2', \ldots, x_n'$ the data points projected onto the d-dimensional subspace. PCA finds the subspace such that
$$\sum_{i,j} \| x_i' - x_j' \|^2 \;\to\; \max.$$
• When we do a parallel projection of the data points, the distances between them can only get smaller. So finding a subspace that attains the maximum scatter means the distances are somehow preserved.

Page 9: Reminder about PCA

To find the principal axes:
– Compute the scatter matrix $S \in \mathbb{R}^{D \times D}$:
$$S = X X^T$$
– Diagonalize $S$:
$$S = \begin{bmatrix} | & & | \\ v_1 & \cdots & v_D \\ | & & | \end{bmatrix} \begin{bmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_D \end{bmatrix} \begin{bmatrix} | & & | \\ v_1 & \cdots & v_D \\ | & & | \end{bmatrix}^T$$
The eigenvectors of $S$ are the principal directions; the eigenvalues are sorted in descending order, $\lambda_1 \ge \cdots \ge \lambda_D$.
Take the first d eigenvectors as the "principal subspace" and project the data points onto this subspace.
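
To make the recipe concrete, here is a minimal NumPy sketch of PCA via the scatter matrix, following the slides' notation (points as the columns of X); the function name is a choice of this sketch, and `np.linalg.eigh` is used because S is symmetric:

```python
import numpy as np

def pca(X, d):
    """Project the columns of X (D x n) onto the top-d principal subspace."""
    # Center the data: the columns are the points x_1 .. x_n.
    Xc = X - X.mean(axis=1, keepdims=True)
    # Scatter matrix S = X X^T  (D x D).
    S = Xc @ Xc.T
    # eigh returns eigenvalues of a symmetric matrix in ascending order.
    eigvals, eigvecs = np.linalg.eigh(S)
    # The d eigenvectors with the largest eigenvalues: principal directions.
    V_d = eigvecs[:, ::-1][:, :d]          # D x d
    # Coordinates of the points in the principal subspace.
    return V_d.T @ Xc                      # d x n

# Example: 100 random 3-D points reduced to 2-D.
X = np.random.rand(3, 100)
Y = pca(X, 2)
```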

Page 10: Why does this work?

• The eigenvectors $v_i$ are the maxima of the following quadratic form:
$$f(v) = \langle Sv, v \rangle = v^T S v, \qquad \|v\| = 1.$$
• In fact, we get directions of maximal variance:
$$f(v) = v^T S v = v^T X X^T v = (X^T v)^T (X^T v) = \| X^T v \|^2 = \sum_{i=1}^{n} \langle x_i, v \rangle^2.$$
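
As a quick sanity check of this derivation (a hypothetical snippet, not from the slides), one can verify numerically that the top eigenvector of S attains the largest value of f among random unit directions:

```python
import numpy as np

X = np.random.rand(3, 100)
X -= X.mean(axis=1, keepdims=True)
S = X @ X.T

# Top eigenvector of the scatter matrix (eigh sorts ascending).
w, V = np.linalg.eigh(S)
v_top = V[:, -1]

f = lambda v: v @ S @ v                      # f(v) = v^T S v = ||X^T v||^2
dirs = np.random.randn(3, 1000)
dirs /= np.linalg.norm(dirs, axis=0)         # random unit vectors
print(f(v_top) >= max(f(dirs[:, i]) for i in range(1000)))   # True
```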

Page 11: Multidimensional Scaling

J. Tenenbaum, V. de Silva, J. C. Langford
Science, December 2000

Page 12: Multidimensional scaling (MDS)

• The idea: compute the pairwise (squared) distances between the input points:
$$M = \big( \mathrm{dist}^2(x_i, x_j) \big)_{n \times n}$$
• Now, find n points in the low-dimensional space $\mathbb{R}^d$, so that their distance matrix is as close as possible to $M$.

Page 13: MDS – the math details

We look for
$$X' = \begin{bmatrix} | & & | \\ x_1' & \cdots & x_n' \\ | & & | \end{bmatrix} \in \mathbb{R}^{d \times n}$$
such that $\| M' - M \|$ is as small as possible, where $M'$ is the Euclidean distance matrix for the points $x_i'$:
$$M' = \big( \mathrm{dist}^2(x_i', x_j') \big) = \big( \| x_i' - x_j' \|^2 \big) \in \mathbb{R}^{n \times n}.$$

Page 14: MDS – the math details

Ideally, we want $M' = M$. Expanding the entries of $M'$:
$$M'_{ij} = \| x_i' - x_j' \|^2 = \| x_i' \|^2 + \| x_j' \|^2 - 2 \langle x_i', x_j' \rangle,$$
so in matrix form:
$$M' = \begin{bmatrix} \|x_1'\|^2 & \|x_1'\|^2 & \cdots & \|x_1'\|^2 \\ \|x_2'\|^2 & \|x_2'\|^2 & \cdots & \|x_2'\|^2 \\ \vdots & & & \vdots \\ \|x_n'\|^2 & \|x_n'\|^2 & \cdots & \|x_n'\|^2 \end{bmatrix} + \begin{bmatrix} \|x_1'\|^2 & \|x_2'\|^2 & \cdots & \|x_n'\|^2 \\ \|x_1'\|^2 & \|x_2'\|^2 & \cdots & \|x_n'\|^2 \\ \vdots & & & \vdots \\ \|x_1'\|^2 & \|x_2'\|^2 & \cdots & \|x_n'\|^2 \end{bmatrix} - 2\, X'^T X'$$
We want to get rid of these two norm matrices.

Page 15: MDS – the math details

Trick: use the "magic matrix" $J$:
$$J = I - \tfrac{1}{n} \mathbf{1}\mathbf{1}^T = \begin{bmatrix} 1 - \tfrac{1}{n} & -\tfrac{1}{n} & \cdots & -\tfrac{1}{n} \\ -\tfrac{1}{n} & 1 - \tfrac{1}{n} & & \vdots \\ \vdots & & \ddots & -\tfrac{1}{n} \\ -\tfrac{1}{n} & \cdots & -\tfrac{1}{n} & 1 - \tfrac{1}{n} \end{bmatrix}$$
$J$ annihilates constant vectors from either side:
$$\begin{pmatrix} a & a & \cdots & a \end{pmatrix} J = 0, \qquad J \begin{pmatrix} b \\ b \\ \vdots \\ b \end{pmatrix} = 0.$$
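
A two-line numerical check of these properties (a hypothetical snippet; the size n is arbitrary):

```python
import numpy as np

n = 5
J = np.eye(n) - np.ones((n, n)) / n   # the centering matrix J

a = np.full(n, 3.0)                    # a constant vector
print(np.allclose(a @ J, 0))           # True: (a a ... a) J = 0
print(np.allclose(J @ a, 0))           # True: J (b b ... b)^T = 0
```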

Page 16: MDS – the math details

Cleaning the system: multiplying by $J$ on both sides kills the constant-row and constant-column norm terms, and choosing the $x_i'$ centered at the origin gives $J\, X'^T X'\, J = X'^T X'$:
$$J M' J = -2\, X'^T X' \quad\Longrightarrow\quad X'^T X' = -\tfrac{1}{2} J M' J.$$
Substituting $M' = M$ and defining
$$B := -\tfrac{1}{2} J M J,$$
we are left with
$$X'^T X' = B.$$

Page 17: How to find X'

We will use the spectral decomposition of $B$:
$$X'^T X' = B = \begin{bmatrix} | & & | \\ v_1 & \cdots & v_n \\ | & & | \end{bmatrix} \begin{bmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{bmatrix} \begin{bmatrix} | & & | \\ v_1 & \cdots & v_n \\ | & & | \end{bmatrix}^T$$
Keeping only the $d$ largest eigenvalues:
$$X'^T X' \approx \underbrace{V_d}_{n \times d}\; \underbrace{\Lambda_d}_{d \times d}\; \underbrace{V_d^T}_{d \times n}, \qquad V_d = \begin{bmatrix} v_1 & \cdots & v_d \end{bmatrix}.$$

Page 18: How to find X'

So we find $X'$ by throwing away the last $n - d$ eigenvalues:
$$X' = \Lambda_d^{1/2} V_d^T = \begin{bmatrix} \sqrt{\lambda_1} & & \\ & \ddots & \\ & & \sqrt{\lambda_d} \end{bmatrix} \begin{bmatrix} -\; v_1^T\; - \\ \vdots \\ -\; v_d^T\; - \end{bmatrix} \in \mathbb{R}^{d \times n}.$$
This $X'$ is optimal in the least-squares sense:
$$X' = \arg\min_{X'} \big\| X'^T X' - B \big\|_L^2, \qquad \| A \|_L^2 = \sum_{i,j} A_{ij}^2.$$

Page 19: Isomap

• The idea of Tenenbaum et al.: estimate geodesic distances between the data points (instead of Euclidean ones).
• Use K nearest neighbors or ε-balls to define a neighborhood graph.
• Approximate the geodesics by shortest paths on the graph.

Page 20: Inducing a graph

[Figure: the neighborhood graph induced on the sampled data points (3-D scatter plot).]

Page 21: Defining neighborhood and weights

Each graph edge is weighted by the Euclidean distance between its endpoints:
$$w_{ij} = \| x_i - x_j \|.$$

Page 22: Finding geodesic paths

Compute weighted shortest paths on the graph (Dijkstra), as sketched below.
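
A compact sketch of these two steps (a hypothetical helper built on SciPy; the value of K and the symmetrization policy are choices of this sketch, not prescribed by the slides):

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.sparse.csgraph import shortest_path

def geodesic_distances(X, K=8):
    """Approximate geodesic distances for the points in the columns of X (D x n)."""
    E = cdist(X.T, X.T)                        # Euclidean distances, n x n
    n = E.shape[0]
    # Keep only each point's K nearest neighbors; edge weights w_ij = ||x_i - x_j||.
    W = np.zeros_like(E)
    nearest = np.argsort(E, axis=1)[:, 1:K + 1]   # skip column 0 (the point itself)
    rows = np.repeat(np.arange(n), K)
    W[rows, nearest.ravel()] = E[rows, nearest.ravel()]
    W = np.maximum(W, W.T)                     # symmetrize the graph
    # Dijkstra shortest paths on the graph approximate the geodesics
    # (zeros in a dense matrix are treated as non-edges).
    return shortest_path(W, method='D', directed=False)

# The geodesic distances can then be fed (squared) to classical MDS:
# M = geodesic_distances(X) ** 2;  Y = classical_mds(M, 2)
```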

Page 23: Locating new points in the Isomap embedding

• Suppose we have a new data point $p \in \mathbb{R}^D$.
• We want to find where it belongs in the $\mathbb{R}^d$ embedding. Compute the squared distances from $p$ to all other points:
$$u = \begin{pmatrix} \mathrm{dist}^2(p, x_1) \\ \mathrm{dist}^2(p, x_2) \\ \vdots \\ \mathrm{dist}^2(p, x_n) \end{pmatrix}$$
and map it through the spectral factors of $B$ (as in landmark MDS):
$$p' = -\tfrac{1}{2}\, \Lambda_d^{-1/2}\, V_d^T\, (u - \bar{u}),$$
where $\bar{u}$ is the vector of column means of $M$.
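
A sketch of this out-of-sample step, assuming the landmark-MDS formula above (all names are hypothetical; lam_d and V_d are the factors kept when diagonalizing B, u_bar the column means of M):

```python
import numpy as np

def embed_new_point(u, u_bar, lam_d, V_d):
    """Map a new point into an existing d-dimensional MDS/Isomap embedding.

    u     : squared distances from the new point to the n embedded points
    u_bar : column means of the squared-distance matrix M
    lam_d : the d retained eigenvalues of B
    V_d   : the corresponding eigenvectors, n x d
    """
    # p' = -1/2 * Lambda_d^{-1/2} V_d^T (u - u_bar)
    return -0.5 * (V_d.T @ (u - u_bar)) / np.sqrt(lam_d)
```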

Page 24: Some results

Page 25: Morph in Isomap space

Page 26: Flattening results (Zigelman et al.)

Page 27: Flattening results (Zigelman et al.)

Page 28: Flattening results (Zigelman et al.)

Page 29: Locally Linear Embedding

S. T. Roweis and L. K. Saul
Science, December 2000

Page 30: The idea

• Define neighborhood relations between points:
  – K nearest neighbors
  – ε-balls
• Find weights that reconstruct each data point from its neighbors:
$$\min_{w} \sum_{i=1}^{n} \Big\| x_i - \sum_{j \in N(i)} w_{ij} x_j \Big\|^2, \qquad \sum_{j} w_{ij} = 1.$$
• Find low-dimensional coordinates $x_1', \ldots, x_n' \in \mathbb{R}^d$ so that the same weights hold:
$$\min_{x_1', \ldots, x_n'} \sum_{i=1}^{n} \Big\| x_i' - \sum_{j \in N(i)} w_{ij} x_j' \Big\|^2.$$

Page 31: Local information reconstructs global one

The weights $w_{ij}$ capture the local shape:
– Invariant to translation, rotation and scale of the neighborhood.
– If the neighborhood lies on a manifold, the local mapping from the global coordinates ($\mathbb{R}^D$) to the surface coordinates ($\mathbb{R}^d$) is almost linear.
– Thus, the weights $w_{ij}$ should also hold in the manifold ($\mathbb{R}^d$) coordinate system!

Page 32: Solving the minimizations

• The weights are found by linear least squares (using Lagrange multipliers for the constraint $\sum_j w_{ij} = 1$):
$$\min_{w} \sum_{i=1}^{n} \Big\| x_i - \sum_{j \in N(i)} w_{ij} x_j \Big\|^2.$$
• To find $x_1', \ldots, x_n' \in \mathbb{R}^d$ that minimize
$$\sum_{i=1}^{n} \Big\| x_i' - \sum_{j \in N(i)} w_{ij} x_j' \Big\|^2,$$
a sparse eigen-problem is solved. Additional constraints are added for conditioning:
$$\sum_i x_i' = 0, \qquad \frac{1}{n} \sum_i x_i'\, x_i'^T = I.$$
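
A NumPy sketch of both steps (function name, K and the regularization term are choices of this sketch; the embedding is read off the bottom eigenvectors of $(I - W)^T (I - W)$, discarding the constant one):

```python
import numpy as np
from scipy.spatial.distance import cdist

def lle(X, d, K=8, reg=1e-3):
    """Locally Linear Embedding of the columns of X (D x n) into R^d."""
    n = X.shape[1]
    nbrs = np.argsort(cdist(X.T, X.T), axis=1)[:, 1:K + 1]
    W = np.zeros((n, n))
    for i in range(n):
        # Weights reconstructing x_i from its neighbors, with sum_j w_ij = 1.
        Z = X[:, nbrs[i]] - X[:, [i]]          # shifted neighbors, D x K
        G = Z.T @ Z                            # local Gram matrix, K x K
        G += reg * np.trace(G) * np.eye(K)     # regularize for conditioning
        w = np.linalg.solve(G, np.ones(K))
        W[i, nbrs[i]] = w / w.sum()            # enforce the sum-to-one constraint
    # Embedding: bottom eigenvectors of (I - W)^T (I - W),
    # skipping the constant eigenvector (eigenvalue ~ 0).
    M = (np.eye(n) - W).T @ (np.eye(n) - W)
    _, vecs = np.linalg.eigh(M)
    # Scale so that (1/n) sum_i x_i' x_i'^T = I.
    return np.sqrt(n) * vecs[:, 1:d + 1].T     # d x n

# Example: embed 200 random 3-D points into the plane.
X = np.random.rand(3, 200)
Y = lle(X, 2)
```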

Page 33: Some results

The Swiss roll

Page 34: Some results

Page 35: Some results

Page 36: The end