manifold blurring mean shift algorithms for manifold denoising, presentation, 2012
DESCRIPTION
(General) To retrieve a clean dataset by deleting outliers. (Computer Vision) The recovery of a digital image that has been contaminated by additive white Gaussian noise.
TRANSCRIPT
Computer Vision
Manifold Blurring Mean Shift algorithms for manifold denoising
Kevin ADDA, Florent RENUCCI
Denoising
(General) To retrieve a clean dataset by deleting outliers.
(Computer Vision) The recovery of a digital image that has been contaminated by additive white Gaussian noise.
[Figures: noisy spiral dataset; handwritten digits recognition; noisy image]
2Computer Vision project
Manifold Blurring Mean Shift algorithm (MBMS)
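Blurring mean-shift update: each point is moved toward the mean of its neighbors weighted by K, where K is a Gaussian kernel. [Equations not preserved in this transcript.]
Projection on a sub-dimensional space with PCA. [Equations not preserved in this transcript.]
Parameters: sigma, the variance of the Gaussian kernel; k, the number of neighbors to consider; L, the local intrinsic dimension; and the number of iterations of the whole algorithm.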
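For reference, the two steps can be written out as below. This is a reconstruction under assumptions, not a copy of the slide: it follows the usual statement of MBMS, with x_n the data points, N_n the neighborhood of x_n, U_n a D x L matrix whose orthonormal columns are the L leading principal directions of the k nearest neighbors of x_n, and sigma entering the kernel as a bandwidth (the slides call it the kernel variance).

```latex
% Blurring mean-shift step for point x_n over its neighborhood N_n
\partial x_n = -x_n + \sum_{m \in \mathcal{N}_n}
  \frac{K(x_n, x_m)}{\sum_{m' \in \mathcal{N}_n} K(x_n, x_{m'})}\, x_m ,
\qquad
K(x_n, x_m) = \exp\!\left(-\frac{\lVert x_n - x_m \rVert^2}{2\sigma^2}\right)

% Local PCA on the k nearest neighbors, then removal of the tangent motion
x_n \leftarrow x_n + \left(I - U_n U_n^{\top}\right) \partial x_n ,
\qquad
U_n^{\top} U_n = I_L
```

Under this form, L = 0 leaves the mean-shift step unchanged (plain Gaussian blurring mean shift), and L equal to the data dimension cancels the motion entirely.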
Setting the parameters: the kernel variance
It is related to the level of local noise outside the manifold;
The larger it is, the stronger the denoising effect;
But a large value can distort the manifold shape over iterations.
Trade-off between kernel variance and iteration number.
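The effect of the kernel width can be checked on a toy case. The sketch below is illustrative only: the function name `mean_shift_vector` and the toy data are assumptions, and sigma is used as a standard deviation inside the Gaussian kernel. It measures how far a single off-line point is pulled toward the kernel-weighted mean for three kernel widths.

```python
import numpy as np

def mean_shift_vector(X, n, sigma):
    """Displacement of X[n] toward the Gaussian-weighted mean of all points in X."""
    sq_dist = np.sum((X - X[n]) ** 2, axis=1)
    w = np.exp(-0.5 * sq_dist / sigma**2)  # X[n] itself gets weight 1
    return w @ X / w.sum() - X[n]

# Thirteen points on the line y = 0, plus one noisy point at (0, 1).
line = np.column_stack([np.linspace(-3.0, 3.0, 13), np.zeros(13)])
X = np.vstack([line, [[0.0, 1.0]]])

# Length of the step taken by the noisy point for increasing kernel widths.
step_lengths = [float(np.linalg.norm(mean_shift_vector(X, 13, s)))
                for s in (0.5, 1.1, 3.0)]
print(step_lengths)
```

The step grows with sigma: a narrow kernel lets the point's own weight dominate, while a wide kernel averages it with the whole line.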
Setting the parameters: the number of neighbors
k is the number of nearest neighbors used to estimate the local tangent space;
MBMS is quite robust to it. It typically grows sublinearly with N.
However, it strongly affects the mean-shift blurring effect, as each point is moved toward the Gaussian kernel mean of its neighbors.
Trade-off between the number of neighbors and the kernel variance.
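A small experiment makes the dependence on k concrete. This is a sketch under assumptions (the helper `knn_mean_shift_length` and the circle data are made up for illustration, and only the mean-shift part is shown, without the PCA projection): on noise-free points lying on a circle, it compares the mean-shift step of one point for k = 10 and k = 100.

```python
import numpy as np

def knn_mean_shift_length(X, n, k, sigma):
    """Length of the Gaussian mean-shift step of X[n] over its k nearest neighbors."""
    sq_dist = np.sum((X - X[n]) ** 2, axis=1)
    nearest = np.argsort(sq_dist, kind="stable")[:k]  # includes X[n] itself
    w = np.exp(-0.5 * sq_dist[nearest] / sigma**2)
    mean = w @ X[nearest] / w.sum()
    return float(np.linalg.norm(mean - X[n]))

# 200 noise-free points on the unit circle: any motion here distorts the shape.
theta = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
circle = np.column_stack([np.cos(theta), np.sin(theta)])

small_k = knn_mean_shift_length(circle, 0, k=10, sigma=1.1)
large_k = knn_mean_shift_length(circle, 0, k=100, sigma=1.1)
print(small_k, large_k)
```

With k = 100 the neighborhood covers half the circle, so the weighted mean falls well inside it and the point is pulled much further than with k = 10.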
Setting the parameters: the intrinsic dimensionality
If L is too small, it produces more local clustering and can distort the manifold;
If L is too big, points move only a little: if L is equal to the dimension of the set, there is no motion.
Since we use 2D datasets, we will usually choose L = 1, except for the GBMS algorithm (L = 0).
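The role of L can be sketched with a local PCA followed by a projection. The code below is a minimal illustration, not the presenters' implementation: `local_pca_basis` and `remove_tangent_motion` are hypothetical helper names, and the neighbors are placed exactly on a horizontal line so the principal direction is known.

```python
import numpy as np

def local_pca_basis(neighbors, L):
    """Orthonormal basis (D x L) of the L leading principal directions."""
    centered = neighbors - neighbors.mean(axis=0)
    # Rows of Vt are principal directions, sorted by decreasing singular value.
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    return Vt[:L].T

def remove_tangent_motion(step, U):
    """Keep only the part of `step` orthogonal to the subspace spanned by U."""
    return step - U @ (U.T @ step)

# Neighbors on the x-axis, and a candidate mean-shift step with both components.
line = np.array([[-2.0, 0.0], [-1.0, 0.0], [0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
step = np.array([0.3, -0.5])

moves = {L: remove_tangent_motion(step, local_pca_basis(line, L)) for L in (0, 1, 2)}
for L, move in moves.items():
    print(L, move)
```

With L = 0 the step is left untouched (the GBMS case), with L = 1 only the component across the line survives, and with L = 2 nothing is left to move.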
Setting the parameters: the number of iterations
A few iterations (1 to 5) achieve most of the denoising.
More iterations can refine this and produce a better result, but shrinkage might arise.
Trade-off between the number of iterations and the other parameters.
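Putting the four parameters together, the whole procedure can be sketched as a short loop. This is a hedged sketch, not the code used for the slides: the function name `mbms`, the brute-force neighbor search, the use of sigma as a standard deviation, and the noisy-line test data are all assumptions made for illustration.

```python
import numpy as np

def mbms(X, L, k, sigma, n_iter):
    """Sketch of MBMS: returns a denoised copy of X (N x D)."""
    X = np.asarray(X, dtype=float).copy()
    N = X.shape[0]
    for _ in range(n_iter):
        # Pairwise squared distances, then the k nearest neighbors of each point.
        sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
        knn = np.argsort(sq_dist, axis=1)[:, :k]
        step = np.empty_like(X)
        for n in range(N):
            nb = X[knn[n]]
            # Mean-shift step toward the Gaussian-weighted mean of the neighbors.
            w = np.exp(-0.5 * sq_dist[n, knn[n]] / sigma**2)
            shift = w @ nb / w.sum() - X[n]
            # Local PCA of dimension L, then removal of the motion along it.
            _, _, Vt = np.linalg.svd(nb - nb.mean(axis=0), full_matrices=False)
            U = Vt[:L].T
            step[n] = shift - U @ (U.T @ shift)
        X += step  # all points move at once (blurring)
    return X

# Toy data: a horizontal line with small vertical noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 200)
noisy = np.column_stack([t, rng.normal(scale=0.05, size=t.size)])
denoised = mbms(noisy, L=1, k=15, sigma=1.1, n_iter=3)

print(np.abs(noisy[:, 1]).mean(), np.abs(denoised[:, 1]).mean())
```

The brute-force distance matrix keeps the sketch short but costs O(N^2) memory per iteration; a spatial index would be the natural replacement for larger datasets.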
Spiral dataset
Pinwheel.m: generates small two-dimensional datasets that are spirals of noisy data.
(credit: Harvard Intelligent Probabilistic Systems)
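For readers without the MATLAB script, a comparable test set can be generated as follows. This is not a port of Pinwheel.m: the function `noisy_spiral`, the Archimedean spiral shape, the noise level and the split between spiral points and outliers are all assumptions chosen for illustration.

```python
import numpy as np

def noisy_spiral(n_points, n_outliers, noise_std, rng):
    """Noisy 2D spiral plus outliers drawn uniformly over its bounding square."""
    t = np.linspace(0.5, 3.0 * np.pi, n_points)  # angle along the spiral
    spiral = np.column_stack([t * np.cos(t), t * np.sin(t)])
    spiral += rng.normal(scale=noise_std, size=spiral.shape)
    lim = np.abs(spiral).max()
    outliers = rng.uniform(-lim, lim, size=(n_outliers, 2))
    return np.vstack([spiral, outliers])

rng = np.random.default_rng(0)
# Hypothetical split: 1000 spiral points and 250 outliers.
data = noisy_spiral(n_points=1000, n_outliers=250, noise_std=0.2, rng=rng)
print(data.shape)
```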
Spiral dataset: application
Parameters: L = 1; k = 15; sigma = 1.1
Initial set: noisy spiral with uniformly distributed outliers, N = 1250
[Figures: the initial set, then the result after each of iterations 1 to 8]
Number of neighbors effect
Two sets of parameters, applied to the same initial dataset:
L = 1, k = 10, sigma = 1.1
L = 1, k = 100, sigma = 1.1
[Figures: the initial dataset, then side-by-side results for k = 10 and k = 100 after iterations 1, 2 and 3]
Intrinsic dimension effect
Two sets of parameters, applied to the same initial dataset:
L = 1, k = 15, sigma = 1.1
L = 0, k = 15, sigma = 1.1
[Figures: the initial dataset, then side-by-side results for L = 1 and L = 0 after iterations 1, 2 and 3]
MNIST Dataset Classification
Input : 16x8 matrices of 0 and 1 representing the image of a letter.
Parameters :
L = 1; sigma = 1;
k = 4; (must be an even number)
n_iteration = 1;
Preprocessing algorithm :
Extract the "1" elements, that is, the coordinates of the white points. For example, if m(1,3) = 1, we extract the point (1, 3).
Denoising step.
If the result is not an integer, we round it. For example, if we plan to move a pixel to the coordinates (12.54; 14.1), we round it to (13; 14).
The set of points obtained is transformed back into a matrix of 0 and 1.
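The conversion steps around the denoising can be sketched as two small helpers. The names `matrix_to_points` and `points_to_matrix` are hypothetical, indices are 0-based here (unlike the 1-based example above), and clipping points that leave the image is an assumption the slides do not address.

```python
import numpy as np

def matrix_to_points(image):
    """Coordinates (row, col) of the entries equal to 1."""
    return np.argwhere(image == 1).astype(float)

def points_to_matrix(points, shape):
    """Round the points to integer pixels and set the matching entries to 1."""
    idx = np.rint(points).astype(int)  # np.rint rounds exact halves to even
    # Assumption: points pushed outside the image are clipped to the border.
    idx[:, 0] = np.clip(idx[:, 0], 0, shape[0] - 1)
    idx[:, 1] = np.clip(idx[:, 1], 0, shape[1] - 1)
    out = np.zeros(shape, dtype=int)
    out[idx[:, 0], idx[:, 1]] = 1
    return out

# A 16x8 binary image with two white pixels.
image = np.zeros((16, 8), dtype=int)
image[1, 3] = 1
image[5, 2] = 1

points = matrix_to_points(image)
# Stand-in for the denoising step: small non-integer displacements.
moved = points + np.array([[0.54, 0.1], [-0.3, 0.4]])
restored = points_to_matrix(moved, image.shape)
print(np.argwhere(restored == 1))
```

Two points that round to the same pixel collapse into one, which is one way the number of isolated white pixels can decrease.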
General algorithm :
We train a neural network that labels the dataset.
We compute the correct labelling rate.
We denoise the images.
We train a new neural network.
We compute the correct labelling rate again.
Results :
We first run the algorithm on the whole dataset, and then with separate training and test sets. We compare the correct labelling rates.
Correct labelling rate    Dataset    Training/test dataset
No blurring               51%        35%
Blurring                  53%        39%
Conclusion
The Manifold Blurring Mean Shift algorithm blurs an image in order to:
Erase some outliers by merging them into the "real" image;
Merge outliers and decrease their number;
Decrease the error rate of a labelling method.
The result is a more congruent image for the human eye,
and also more congruent for automatic classification.
Thank you