Project 11: Determining the Intrinsic Dimensionality of a Distribution
Okke Formsma, Nicolas Roussis and Per Løwenborg
Outline

• About the project
• What is intrinsic dimensionality?
• How can we assess the ID?
  – PCA
  – Neural Network
  – Nearest Neighbour
• Experimental Results
Why did we choose this Project?

• We wanted to learn more about developing and experimenting with algorithms for analyzing high-dimensional data
• We wanted to see how we can implement this in a program
Papers

• N. Kambhatla and T. Leen, “Dimension Reduction by Local Principal Component Analysis”
• J. Bruske and G. Sommer, “Intrinsic Dimensionality Estimation with Optimally Topology Preserving Maps”
• P. Verveer and R. Duin, “An Evaluation of Intrinsic Dimensionality Estimators”
How does dimensionality reduction influence our lives?

• Compressing images, audio and video
• Reducing noise
• Editing
• Reconstruction
(Figure: an image going through different steps of a reconstruction.)
Intrinsic Dimensionality

The number of ‘free’ parameters needed to generate a pattern.

Ex:
• f(x) = -x² => 1 dimensional
• f(x, y) = -x² => 1 dimensional
PRINCIPAL COMPONENT ANALYSIS
Principal Component Analysis (PCA)

• The classic technique for linear dimension reduction.
• A vector space transformation which reduces multidimensional data sets to lower dimensions for analysis.
• A way of identifying patterns in data, and expressing the data in such a way as to highlight their similarities and differences.
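As an illustration (not part of the original slides), PCA via eigendecomposition of the covariance matrix can be sketched in a few lines of NumPy; the data set and all names below are invented for the example:

```python
import numpy as np

def pca(X, n_components):
    """Project X (n_samples x n_features) onto its top principal components."""
    Xc = X - X.mean(axis=0)                        # center the data
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    order = np.argsort(eigvals)[::-1]              # largest eigenvalues first
    components = eigvecs[:, order[:n_components]]
    return Xc @ components, eigvals[order]

# Data that is essentially one-dimensional: y = 2x plus a little noise.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
X = np.column_stack([x, 2 * x + 0.01 * rng.normal(size=200)])
projected, eigvals = pca(X, 1)
# The first eigenvalue dominates, hinting at an intrinsic dimensionality of 1.
```

Here the ratio of the eigenvalues, not their absolute size, is what signals the intrinsic dimensionality.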
Advantages of PCA

• Patterns can be hard to find in data of high dimension, where the luxury of graphical representation is not available; PCA is a powerful tool for analysing such data.
• Once you have found these patterns, you can compress the data (by reducing the number of dimensions) without much loss of information.
Example
Problems with PCA
• Data might be uncorrelated, but PCA relies on second-order statistics (correlation), so sometimes it fails to find the most compact description of the data.
(Figures: the first and second global eigenvectors; a better solution: local eigenvectors fitted per region. Another problem: is this the principal eigenvector, or do we need more than one?)
The answer depends on your application
Low resolution vs. high resolution
Challenges

• How to partition the space?
• How many partitions should we use?
• How many dimensions should we retain?
How to partition the space?

Vector Quantization: the Lloyd Algorithm

Partition the space into k sets
Repeat until convergence:
    Calculate the centroid of each set
    Associate each point with the nearest centroid
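The steps above can be sketched in Python (a minimal illustration; the two-blob data set is invented, and empty sets are not handled):

```python
import numpy as np

def lloyd(points, k, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1: randomly assign each point to one of the k sets.
    labels = rng.integers(0, k, size=len(points))
    centroids = None
    for _ in range(iters):
        # Step 2: calculate the centroid of each set.
        centroids = np.array([points[labels == i].mean(axis=0) for i in range(k)])
        # Step 3: associate each point with the nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):     # converged
            break
        labels = new_labels
    return labels, centroids

# Two well-separated blobs: the algorithm should recover them as the two sets.
rng = np.random.default_rng(1)
points = np.vstack([rng.normal(0, 0.05, (50, 2)),
                    rng.normal(10, 0.05, (50, 2))])
labels, centroids = lloyd(points, 2)
```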
(Figures: the Lloyd algorithm on two sets. Step 1: randomly assign. Step 2: calculate centroids. Step 3: associate points with the nearest centroid. Steps 2 and 3 repeat; result shown after 2 iterations.)
How many partitions should we use?

Bruske & Sommer: “just try them all”

For k = 1 to dimension(set):
    Subdivide the space into k regions
    Perform PCA on each region
    Retain the significant eigenvalues per region
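A sketch of this loop (illustrative only: the region partitioning here is a crude nearest-centroid quantizer, and the 5% eigenvalue cutoff is an assumed significance rule):

```python
import numpy as np

def local_id_estimates(X, k, iters=10, alpha=0.05, seed=0):
    """Partition X into k regions and return one PCA-based ID estimate per region."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):                         # crude vector quantization
        labels = np.linalg.norm(X[:, None] - centroids[None], axis=2).argmin(axis=1)
        # Keep the old centroid if a region happens to be empty.
        centroids = np.array([X[labels == i].mean(axis=0) if np.any(labels == i)
                              else centroids[i] for i in range(k)])
    estimates = []
    for i in range(k):                             # local PCA in each region
        region = X[labels == i] - X[labels == i].mean(axis=0)
        eigvals = np.linalg.eigvalsh(np.cov(region, rowvar=False))[::-1]
        estimates.append(int((eigvals / eigvals[0] >= alpha).sum()))
    return estimates

# A circle is globally 2-D, but each small arc is close to 1-D.
t = np.linspace(0, 2 * np.pi, 400, endpoint=False)
X = np.column_stack([np.cos(t), np.sin(t)])
```

On this data the single-region estimate is 2, while with finer partitions the per-region estimates tend toward the intrinsic value of 1.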
Which eigenvalues are significant?

Depends on:
• Intrinsic dimensionality
• Curvature of the surface
• Noise
Which eigenvalues are significant?

Discussed in class:
• Largest-n

In papers:
• Cutoff after normalization (Bruske & Sommer)
• Statistical method (Verveer & Duin)
Which eigenvalues are significant?

Cutoff after normalization: with µ_x the x-th eigenvalue, retain eigenvalue µ_i if µ_i / µ_max ≥ α%, with α = 5, 10 or 20.
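Read as a rule of thumb, the cutoff can be coded directly (my reading of the slide; the example eigenvalues are invented):

```python
import numpy as np

def significant_eigenvalues(eigvals, alpha=10):
    """Count eigenvalues that are at least alpha percent of the largest one."""
    eigvals = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    return int((eigvals / eigvals[0] >= alpha / 100).sum())

print(significant_eigenvalues([4.0, 1.0, 0.2, 0.001], alpha=10))  # prints 2
```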
Which eigenvalues are significant?

Statistical method (Verveer & Duin):
• Calculate the error on the reconstructed data when the lowest eigenvalue is dropped
• Decide whether this error is significant
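The error computation can be sketched as follows (an illustration only: the significance test itself in Verveer & Duin is more involved, and the near-planar test data is invented):

```python
import numpy as np

def reconstruction_error(X, n_drop=1):
    """MSE of reconstructing X without its n_drop smallest principal components."""
    Xc = X - X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    keep = eigvecs[:, n_drop:]          # eigh sorts ascending: drop the smallest
    X_rec = Xc @ keep @ keep.T          # project down and reconstruct
    return float(np.mean((Xc - X_rec) ** 2))

# Nearly planar data in 3-D: dropping the smallest component costs almost nothing.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = rng.normal(size=500)
X_plane = np.column_stack([a, b, a + b + 0.01 * rng.normal(size=500)])
```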
Results

• One-dimensional space, embedded in 256 × 256 = 65,536 dimensions
• 180 images of a rotating cylinder
• ID = 1
NEURAL NETWORK PCA
Basic Computational Element - Neuron
• Inputs/Outputs, Synaptic Weights, Activation Function
3-Layer Autoassociators

• N input, N output, and M < N hidden neurons.
• Drawback of this model: the optimal solution remains the linear PCA projection.
5-Layer Autoassociators

• Neural network approximators for principal surfaces, using 5 layers of neurons.
• A global, non-linear dimension reduction technique.
• Non-linear PCA with these networks has been implemented successfully for image and speech dimension reduction and for obtaining concise representations of color.
• The third layer, of width M < N, carries the dimension-reduced representation.
• Linear activation functions are used for the representation layer.
• The networks are trained to minimize an MSE criterion.
• They are approximators of principal surfaces.
Locally Linear Approach to Non-linear Dimension Reduction (VQPCA Algorithm)

• Much faster to train than five-layer autoassociators, and provides superior solutions.
• The algorithm minimizes the MSE (like 5-layer autoassociators) between the original data and its reconstruction from a low-dimensional representation (the reconstruction error).
VQPCA

Two steps in the algorithm:
1) Partition the data space by VQ (clustering).
2) Perform a local PCA about each cluster center.

VQPCA is thus a local PCA within each cluster.
We can use two kinds of distance measures in VQPCA:
1) Euclidean distance
2) Reconstruction distance

Example intended for a 1-D local PCA:
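For a 1-D local PCA the two distances can be written out directly (a sketch; r, e, and x are invented stand-ins for a cluster centroid, its leading unit eigenvector, and a data point):

```python
import numpy as np

def euclidean_dist2(x, r):
    """Squared Euclidean distance from point x to centroid r."""
    return float(np.sum((x - r) ** 2))

def reconstruction_dist2(x, r, e):
    """Squared error left after reconstructing x from its coordinate along e."""
    d = x - r
    d_rec = d - (d @ e) * e            # remove the component along e
    return float(np.sum(d_rec ** 2))

r = np.array([0.0, 0.0])               # cluster centroid
e = np.array([1.0, 0.0])               # local principal direction (unit vector)
x = np.array([3.0, 1.0])
# Euclidean distance² = 10.0, but reconstruction distance² = 1.0: only the
# part of x that the 1-D local PCA cannot represent counts.
```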
5-layer Autoassociators vs VQPCA

• 5-layer autoassociators are difficult to train; training is faster with the VQPCA algorithm (VQ can be accelerated using tree-structured or multistage VQ).
• 5-layer autoassociators are prone to becoming trapped in poor local optima.
• VQPCA is slower for encoding new data, but much faster for decoding.