manifold learning and pattern matching with entropic graphs
DESCRIPTION
Manifold learning and pattern matching with entropic graphs. Alfred O. Hero Dept. EECS, Dept Biomed. Eng., Dept. Statistics University of Michigan - Ann Arbor [email protected] http://www.eecs.umich.edu/~hero. Multimodality Face Matching. Clustering Gene Microarray Data. - PowerPoint PPT PresentationTRANSCRIPT
Manifold learning and pattern matching with entropic graphs
Alfred O. Hero Dept. EECS, Dept Biomed. Eng., Dept. Statistics
University of Michigan - Ann Arbor [email protected]
http://www.eecs.umich.edu/~hero
Multimodality Face Matching
Clustering Gene Microarray Data
Cy5/Cy3 hybridization profiles
Image Registration
Vehicle Classification
• 128x128 images of three vehicles over 1 deg increments of 360 deg azimuth at 0 deg elevation
• The 3(360)=1080 images evolve on a lower dimensional imbedded manifold in R^(16384)
Courtesy of Center for Imaging Science, JHU
HMMVT62Truck
Image Manifold
What is manifold learning good for?
• Interpreting high dimensional data• Discovery and exploitation of lower dimensional
structure • Deducing non-linear dependencies between
populations• Improving detection and classification
performance• Improving image compression performance
Random Sampling on a Manifold
Classifying on a Manifold
Class A Class B
Background on Manifold Learning• Manifold intrinsic dimension estimation
– Local KLE, Fukunaga, Olsen (1971)– Nearest neighbor algorithm, Pettis, Bailey, Jain, Dubes (1971) – Fractal measures, Camastra and Vinciarelli (2002)– Packing numbers, Kegl (2002)
• Manifold Reconstruction– Isomap-MDS, Tenenbaum, de Silva, Langford (2000)– Locally Linear Embeddings (LLE), Roweiss, Saul (2000)– Laplacian eigenmaps (LE), Belkin, Niyogi (2002)– Hessian eigenmaps (HE), Grimes, Donoho (2003)
• Characterization of sampling distributions on manifolds– Statistics of directional data, Watson (1956), Mardia (1972)– Statistics of shape, Kendall (1984), Kent, Mardia (2001)– Data compression on 3D surfaces, Kolarov, Lynch (1997)
Assumption:
is a conformal mappingA statistical sample
Sampling distribution
2D manifold
Sampling
Embedding
Sampling on a Domain Manifold
Alpha-Entropy and Divergence• Alpha-entropy
• Alpha-divergence
• Other alpha-dissimilarity measures– Alpha-Jensen difference– Alpha geometric-arithmetic (GA) divergence
MST and Geodesic MST• For a set of points in d-
dimensional Euclidean space, the Euclidean MST with edge power weighting gamma is defined as
• edge lengths of a spanning tree over
• pairwise distance matrix of complete graph
• When the matrix is constructed from geodesic distances between points on , e.g. using ISOMAP, we obtain the Geodesic MST
A Planar Sample and its Euclidean MST
Convergence of Euclidean MST
Beardwood, Halton, Hammersley Theorem:
Key Result for GMST
Ref: Costa&Hero:TSP2003
Special Cases
• Isometric embedding (ISOMAP)
• Conformal embedding (C-ISOMAP)
Remarks
• Result holds for many other combinatorial optimization algorithms (Costa&Hero:2003)– K-NNG– Steiner trees– Minimal matchings– Traveling Salesman Tours
• a.s. convergence rates (Hero&etal:2002)• For isometric embeddings Jacobian does not
have to be estimated for dimension estimation
Joint Estimation Algorithm
• Assume large-n log-affine model
• Use bootstrap resampling to estimate mean MST length and apply LS to jointly estimate slope and intercept from sequence
• Extract d and H from slope and intercept
Random Samples on a Swiss Roll
• Ref: Grimes and Donoho (2003)
Bootstrap Estimates of GMST Length
785 790 795 800805
806
807
808
809
810
811
812
813
814
815
n
E[L n]
Segment n=786:799 of MST sequence (=1,m=10) for unif sampled Swiss Roll
loglogLinear Fit to GMST Length
6.665 6.67 6.675 6.68 6.6856.692
6.694
6.696
6.698
6.7
6.702
6.704Segment of logMST sequence (=1,m=10) for unif sampled Swiss Roll
log(n)
log(
E[L n])
y = 0.53*x + 3.2
log(E[Ln])LS fit
Dimension and Entropy Estimates
• From LS fit find:• Intrinsic dimension estimate
• Alpha-entropy estimate (nats)
Dimension Estimation Comparisons
Practical Application
• Yale face database 2– Photographic folios of many people’s faces – Each face folio contains images at 585
different illumination/pose conditions– Subsampled to 64 by 64 pixels (4096 extrinsic
dimensions)• Objective: determine intrinsic dimension
and entropy of a face folio
GMST for 3 Face Folios
GMST for 3 Face Folios
Yale Face Database Results
• GMST LS estimation parameters– ISOMAP used to generate pairwise distance matrix– LS based on 25 resamplings over 26 largest folio sizes
• To represent any folio we might hope to attain– factor > 600 reduction in degrees of freedom (dim)– only 1/10 bit per pixel for compression– a practical parameterization/encoder?
Ref: Costa&Hero 2003
Conclusions
• Characterizing high dimension sampling distributions – Standard techniques (histogram, density estimation) fail
due to curse of dimensionality– Entropic graphs can be used to construct consistent
estimators of entropy and information divergence – Robustification to outliers via pruning
• Manifold learning and model reduction– Standard techniques (LLE, MDS, LE, HE) rely on local
linear fits – Entropic graph methods fit the manifold globally– Computational complexity is only n log n
Advantages of Geodesic Entropic Graph Methods
Summary of Algorithm
• Run ISOMAP or C-ISOMAP algorithm to generate pairwise distance matrix on intrinsic domain of manifold
• Build geodesic entropic graph from pairwise distance matrix– MST: consistent estimator of manifold dimension and
process alpha-entropy– K-NNG: consistent estimator of information divergence
between labeled vectors• Use bootstrap resampling and LS fitting to extract
rate of convergence (intrinsic dimension) and convergence factor (entropy) of entropic graph
Swiss Roll Example
Uniform Samples on 3D Imbedding of Swiss Roll
Geodesic Minimal Spanning Tree
GMST over Uniform Samples on Swiss Roll
Geodesic MST on Imbedded Mixture
GMST on Gaussian Samples on Swiss Roll
Classifying on a Manifold
Class A Class B