Nonlinear Dimensionality Reduction Approaches


Page 1:

Nonlinear Dimensionality Reduction Approaches

Page 2:

Dimensionality Reduction

The goal: recover the meaningful low-dimensional structures hidden in high-dimensional observations.

Classical techniques: Principal Component Analysis (preserves the variance) and Multidimensional Scaling (preserves inter-point distances).

Nonlinear techniques: Isomap and Locally Linear Embedding.

Page 3:

Common Framework

Algorithm: Given data $x_1, \dots, x_n$:

1. Construct an $n \times n$ affinity matrix $M$.
2. Normalize $M$, yielding $\tilde{M}$.
3. Compute the $m$ largest eigenvalues $\lambda_j$ and eigenvectors $v_j$ of $\tilde{M}$; only positive eigenvalues should be considered.

The embedding of each example $x_i$ is the vector $y_i$ with $y_{ij}$ the $i$-th element of the $j$-th principal eigenvector $v_j$ of $\tilde{M}$.

Alternatively (MDS and Isomap), the embedding is $e_i$, with $e_{ij} = \sqrt{\lambda_j}\, y_{ij}$. If the first $m$ eigenvalues are positive, then $e_i \cdot e_j$ is the best approximation of $\tilde{M}_{ij}$ using only $m$ coordinates, in the sense of squared error.
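A minimal NumPy sketch of this shared recipe (the function name is illustrative; constructing and normalizing $M$ is left to the caller, since that step is what distinguishes MDS, Isomap, and LLE):

```python
import numpy as np

def spectral_embedding(M_tilde, m, scale_by_sqrt_eig=False):
    """Embed n points given a normalized n x n affinity matrix M_tilde.

    Row i of the result is the embedding y_i; with scale_by_sqrt_eig=True
    it is instead e_i with e_ij = sqrt(lambda_j) * y_ij (MDS/Isomap style).
    """
    # eigh returns eigenvalues in ascending order for symmetric matrices
    eigvals, eigvecs = np.linalg.eigh(M_tilde)
    order = np.argsort(eigvals)[::-1]              # largest first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # only positive eigenvalues should be considered; keep the m largest
    pos = eigvals > 0
    eigvals, eigvecs = eigvals[pos][:m], eigvecs[:, pos][:, :m]

    if scale_by_sqrt_eig:
        return eigvecs * np.sqrt(eigvals)          # scales each column j
    return eigvecs
```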

Page 4:

Linear Dimensionality Reduction

PCA finds a low-dimensional embedding of the data points that best preserves their variance as measured in the high-dimensional input space.

MDS finds an embedding that preserves the inter-point distances; it is equivalent to PCA when the distances are Euclidean.

Page 5:

Multi-Dimensional Scaling

MDS starts from a notion of distance or affinity that is computed for each pair of training examples.

The normalizing step converts these distances to equivalent dot products using the "double-centering" formula:

$\tilde{M}_{ij} = -\frac{1}{2}\left(M_{ij} - \frac{1}{n}S_i - \frac{1}{n}S_j + \frac{1}{n^2}\sum_k S_k\right)$, where $S_i = \sum_j M_{ij}$.

The embedding of example $x_i$ is given by $e_{ik} = \sqrt{\lambda_k}\, v_{ik}$, where $v_k$ is the $k$-th eigenvector of $\tilde{M}$. Note that if $M_{ij} = \|y_i - y_j\|^2$ then $\tilde{M}_{ij} = (y_i - \bar{y}) \cdot (y_j - \bar{y})$, where $\bar{y}$ is the average value of $y_i$.
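Putting the double-centering formula and the eigendecomposition together, a sketch of classical MDS in NumPy (assuming the input is an ordinary matrix of pairwise Euclidean distances):

```python
import numpy as np

def classical_mds(D, d):
    """Embed points given an n x n matrix D of pairwise Euclidean distances."""
    n = len(D)
    M = D ** 2                       # squared distances M_ij
    S = M.sum(axis=1)                # S_i = sum_j M_ij
    # double-centering: M~_ij = -1/2 (M_ij - S_i/n - S_j/n + sum_k S_k / n^2)
    M_tilde = -0.5 * (M - S[:, None] / n - S[None, :] / n + S.sum() / n**2)
    eigvals, eigvecs = np.linalg.eigh(M_tilde)
    top = np.argsort(eigvals)[::-1][:d]
    # e_ik = sqrt(lambda_k) * v_ik
    return eigvecs[:, top] * np.sqrt(np.maximum(eigvals[top], 0.0))
```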

Page 6:

Nonlinear Dimensionality Reduction

Many data sets contain essential nonlinear structures that are invisible to PCA and MDS. This motivates resorting to nonlinear dimensionality reduction approaches.

Page 7:

A Global Geometric Framework for Nonlinear Dimensionality Reduction (Isomap)

Joshua B. Tenenbaum, Vin de Silva, John C. Langford

Page 8:

Example

64×64 input images form 4096-dimensional vectors.

Intrinsically, three dimensions suffice to represent them: two pose parameters and the azimuthal lighting angle.

Page 9:

Isomap Advantages

Combines the major algorithmic features of PCA and MDS: computational efficiency, global optimality, and asymptotic convergence guarantees.

Adds the flexibility to learn a broad class of nonlinear manifolds.

Page 10:

Example of Nonlinear Structure: the Swiss Roll

Only the geodesic distances reflect the true low-dimensional geometry of the manifold.
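A quick way to generate such a data set (the parameter values below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
t = 1.5 * np.pi * (1 + 2 * rng.random(n))   # angle along the spiral
h = 21 * rng.random(n)                      # height across the roll
X = np.column_stack([t * np.cos(t), h, t * np.sin(t)])  # 3-D Swiss roll

# Two points on adjacent sheets of the spiral can be close in R^3 yet far
# apart along the surface, so Euclidean distances (and hence PCA/MDS)
# misrepresent the manifold's true 2-D geometry.
```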

Page 11:

Intuition

Built on top of MDS. The geodesic path along the manifold between any two points is captured by concatenating shortest paths between neighboring points.

These in-between shortest paths are approximated given only the input-space distances.

Page 12:

Algorithm Description

Step 1: Determine neighboring points within a fixed radius (or among the K nearest neighbors) based on the input-space distances $d_X(i,j)$. These neighborhood relations are represented as a weighted graph G over the data points.

Step 2: Estimate the geodesic distances $d_M(i,j)$ between all pairs of points on the manifold M by computing their shortest-path distances $d_G(i,j)$ in the graph G.

Step 3: Construct an embedding of the data in d-dimensional Euclidean space Y that best preserves the manifold's geometry.
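A sketch of the three steps in Python, using the K-nearest-neighbor variant of Step 1 (the fixed-radius variant only changes how edges are selected; a connected graph is assumed):

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path
from scipy.spatial.distance import pdist, squareform

def isomap(X, n_neighbors, d):
    """Isomap: k-NN graph -> shortest paths -> classical MDS."""
    # Step 1: neighborhood graph weighted by input-space distances d_X(i, j)
    n = len(X)
    D_X = squareform(pdist(X))
    G = np.full((n, n), np.inf)                # inf marks "no edge"
    nbrs = np.argsort(D_X, axis=1)[:, 1 : n_neighbors + 1]
    for i in range(n):
        G[i, nbrs[i]] = D_X[i, nbrs[i]]
    G = np.minimum(G, G.T)                     # symmetrize the graph

    # Step 2: geodesic estimates d_G(i, j) via shortest paths (Dijkstra)
    D_G = shortest_path(G, method="D")

    # Step 3: classical MDS on D_G (tau converts distances to inner products)
    H = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * H @ (D_G ** 2) @ H
    eigvals, eigvecs = np.linalg.eigh(B)
    top = np.argsort(eigvals)[::-1][:d]
    return eigvecs[:, top] * np.sqrt(np.maximum(eigvals[top], 0.0))
```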

Page 13:

Construct Embeddings

The coordinate vectors $y_i$ for points in Y are chosen to minimize the cost function

$E = \|\tau(D_G) - \tau(D_Y)\|_{L^2}$

where $D_Y$ denotes the matrix of Euclidean distances $d_Y(i,j) = \|y_i - y_j\|$ and the matrix norm is $\|A\|_{L^2} = \sqrt{\sum_{i,j} A_{ij}^2}$. The operator $\tau$ converts distances to inner products.
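The operator $\tau$ and the cost $E$ translate directly into code. A sketch, assuming $D_G$ has already been computed as in Step 2 and $\tau(D) = -HSH/2$ with $S$ the matrix of squared distances and $H$ the centering matrix:

```python
import numpy as np

def tau(D):
    """Convert a distance matrix to inner products: tau(D) = -H S H / 2."""
    n = len(D)
    H = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    return -0.5 * H @ (D ** 2) @ H

def embedding_error(D_G, Y):
    """E = || tau(D_G) - tau(D_Y) ||_{L2} for a candidate embedding Y."""
    D_Y = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=-1)
    return np.linalg.norm(tau(D_G) - tau(D_Y))   # Frobenius (L2) matrix norm
```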

Page 14:

Dimension

The true dimensionality of data can be estimated from the decrease in error as the dimensionality of Y is increased.
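One way to read this curve off in code, as an illustrative diagnostic rather than anything prescribed by the paper: since the optimal d-dimensional Y keeps the top d eigenvalues of $\tau(D_G)$, the squared error remaining after rank-d truncation traces the decrease in error described above.

```python
import numpy as np

def residual_curve(B, d_max):
    """Residual squared error of the best rank-d approximation of B = tau(D_G),
    for d = 1..d_max; the curve typically flattens near the true dimension."""
    eigvals = np.sort(np.linalg.eigvalsh(B))[::-1]
    total = np.sum(eigvals ** 2)
    return [np.sum(eigvals[d:] ** 2) / total for d in range(1, d_max + 1)]
```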

Page 15:

Manifold Recovery Guarantee

Isomap is guaranteed asymptotically to recover the true dimensionality and geometric structure of nonlinear manifolds.

As the number of sample data points increases, the graph distances $d_G(i,j)$ provide increasingly better approximations to the intrinsic geodesic distances $d_M(i,j)$.

Page 16:

Examples

Interpolations between distant points in the low-dimensional coordinate space.

Page 17:

Summary

Isomap handles nonlinear manifolds.

Isomap keeps the advantages of PCA and MDS: a non-iterative, polynomial-time procedure with guaranteed convergence.

Isomap represents the global structure of a data set within a single coordinate system.

Page 18:

Nonlinear Dimensionality Reduction by Locally Linear Embedding

Sam T. Roweis and Lawrence K. Saul

Page 19:

LLE

Neighborhood-preserving embeddings

Mapping to a global coordinate system of low dimensionality

No need to estimate pairwise distances between widely separated points

Recovering global nonlinear structure from locally linear fits

Page 20:

Algorithm Description

We expect each data point and its neighbors to lie on or close to a locally linear patch of the manifold.

We reconstruct each point from its neighbors by minimizing the error

$\varepsilon(W) = \sum_i \left\| X_i - \sum_j W_{ij} X_j \right\|^2$

where $W_{ij}$ summarizes the contribution of the $j$-th data point to the reconstruction of the $i$-th, and is estimated by optimizing this error subject to two constraints: each point is reconstructed only from its neighbors ($W_{ij} = 0$ if $X_j$ is not a neighbor of $X_i$), and the weights for each point sum to 1 ($\sum_j W_{ij} = 1$).
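A sketch of the constrained least-squares solution for the weights. The regularization term is a standard practical addition for when the number of neighbors exceeds the input dimension; it is not spelled out on the slide.

```python
import numpy as np

def lle_weights(X, n_neighbors, reg=1e-3):
    """Minimize sum_i |X_i - sum_j W_ij X_j|^2 subject to W_ij = 0 for
    non-neighbors and sum_j W_ij = 1 for every i."""
    n = len(X)
    W = np.zeros((n, n))
    D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    nbrs = np.argsort(D, axis=1)[:, 1 : n_neighbors + 1]  # skip self
    for i in range(n):
        Z = X[nbrs[i]] - X[i]                     # neighbors, shifted to X_i
        C = Z @ Z.T                               # local Gram matrix (k x k)
        C += reg * np.trace(C) * np.eye(len(C))   # regularize if k > dim
        w = np.linalg.solve(C, np.ones(len(C)))
        W[i, nbrs[i]] = w / w.sum()               # enforce sum_j W_ij = 1
    return W
```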

Page 21:

Algorithm Description

A linear mapping transforms the high-dimensional coordinates of each neighborhood to global internal coordinates on the manifold: the embedding Y minimizes

$\Phi(Y) = \sum_i \left\| Y_i - \sum_j W_{ij} Y_j \right\|^2$

Note that the cost defines a quadratic form

$\Phi(Y) = \sum_{ij} M_{ij} \, (Y_i \cdot Y_j)$

where $M_{ij} = \delta_{ij} - W_{ij} - W_{ji} + \sum_k W_{ki} W_{kj}$.

The optimal embedding is found by computing the bottom d + 1 eigenvectors of M and discarding the bottom (constant) one; d is the dimension of the embedding.
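The quadratic form makes the final step an eigenproblem; a dense sketch for clarity (at scale one would use a sparse eigensolver, since M is sparse):

```python
import numpy as np

def lle_embedding(W, d):
    """Minimize Phi(Y) = sum_i |Y_i - sum_j W_ij Y_j|^2 over unit-covariance Y.

    M = (I - W)^T (I - W) realizes M_ij = delta_ij - W_ij - W_ji
    + sum_k W_ki W_kj; the embedding is given by its bottom eigenvectors.
    """
    I = np.eye(len(W))
    M = (I - W).T @ (I - W)
    eigvals, eigvecs = np.linalg.eigh(M)     # ascending eigenvalues
    # discard the bottom (constant) eigenvector, keep the next d
    return eigvecs[:, 1 : d + 1]
```

Chaining the two steps, for example, `Y = lle_embedding(lle_weights(X, n_neighbors=10), d=2)` produces a 2-D embedding of the data in X.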

Page 22:

Illustration

Page 23:

Examples: Two-Dimensional Embeddings of Faces

Page 24:

Examples

Page 25:

Examples

Page 26:

Thank you