Mathematics of Data I (PKU)
Yao Yuan (姚远)
2011.7.11
Data Science is the science of Learning from Data.
In mathematics, we look for mappings from the data domain to the knowledge domain.
John Dewey
If knowledge comes from the impressions made upon us by natural objects, it is
impossible to procure knowledge without the use of objects which impress the
mind.
Democracy and Education: An Introduction to the Philosophy of Education, 1916
Statistics has been the most successful
information science.
Those who ignore Statistics are condemned
to reinvent it.
---- Bradley Efron
Types of Data
• Data as vectors/matrices/tensors in Euclidean spaces
  – images, videos, speech waves, gene expression, financial data
  – most statistical methods deal with this type of data
• Data as graphs
  – data as points in metric spaces (molecular dynamics, etc.)
  – internet, biological/social networks
  – data where we just know relations (similarity, …)
  – data visualization
  – modern computer science likes this type of data
and there are more coming …
A Dictionary between Machine Learning and Statistics

Machine Learning         | Statistics
Supervised Learning      | Regression, Classification, Ranking, …
Unsupervised Learning    | Dimensionality Reduction, Clustering, Density Estimation, …
Semi-supervised Learning | ✗
Contributions
• The machine learning and computer science communities have proposed many scalable algorithms to deal with massive data sets
• Those algorithms are often found to follow the consistency theory of statistics
• But sometimes traditional statistics does not work …
Principal Component Analysis (PCA)

(Example: data along a one-dimensional manifold.)

  X_{n×p} = [X_1 X_2 ... X_p],   Σ_ij = cov(X_i, X_j) = E[(X_i − μ_i)(X_j − μ_j)]

Eigen-decomposition of the sample covariance

  Σ̂ = (1/n) Xᵀ (I − (1/n) e eᵀ)² X

Σ̂ → Σ for fixed p as n → ∞
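A minimal numerical sketch of the slide's recipe, assuming the centering-matrix form of the sample covariance (the names and the example data are illustrative, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 500, 3

# Data stretched along one direction (an approximately low-dimensional cloud).
X = rng.normal(size=(n, p)) @ np.diag([3.0, 1.0, 0.1])

# Centering matrix I - (1/n) e e^T; multiplying by it subtracts column means.
e = np.ones((n, 1))
H = np.eye(n) - (e @ e.T) / n

# Sample covariance via the centering matrix (H is idempotent, so H^2 = H).
Sigma_hat = X.T @ H @ X / n
assert np.allclose(Sigma_hat, np.cov(X, rowvar=False, bias=True))

# PCA: eigen-decomposition of the sample covariance, largest eigenvalue first.
evals, evecs = np.linalg.eigh(Sigma_hat)
order = np.argsort(evals)[::-1]
evals, evecs = evals[order], evecs[:, order]
print(evals)  # leading eigenvalue near 9, the variance of the stretched direction
```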
PCA may go wrong if p ≥ n

• For p/n → γ > 0, assume p-dimensional samples

    X_i = Σ_{ν=1}^{M} √λ_ν u_{νi} θ_ν + σ Z_i,   u_{νi} ~ N(0, 1),   Z_i ~ N_p(0, I_p)

  where the θ_ν are orthonormal; then PCA is inconsistent, by Random Matrix Theory.

• Phase transition:
  – Below the threshold (λ_ν ∈ [0, √γ]), the estimate is orthogonal to the truth: ⟨θ̂_ν, θ_ν⟩ → 0
  – Above the threshold (λ_ν > √γ), the angle decreases as the eigenvalue grows, but is always biased:

    ⟨θ̂_ν, θ_ν⟩² → (1 − γ/λ_ν²) / (1 + γ/λ_ν)
Johnstone (2006), High Dimensional Statistical Inference and Random Matrices, arXiv:math/0611589v1
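A simulation sketch of the single-spike case of this phase transition; the sizes, seed, and tolerance are illustrative choices, and the prediction formula is the limit quoted on the slide:

```python
import numpy as np

rng = np.random.default_rng(42)
n, p = 2000, 1000            # gamma = p/n = 0.5
gamma = p / n
lam = 2.0                    # spike strength, above the threshold sqrt(gamma)

theta = np.zeros(p); theta[0] = 1.0          # true spike direction (unit vector)
u = rng.normal(size=(n, 1))
Z = rng.normal(size=(n, p))
X = np.sqrt(lam) * u * theta + Z             # X_i = sqrt(lam) u_i theta + Z_i

S = X.T @ X / n                              # sample covariance
evals, evecs = np.linalg.eigh(S)
theta_hat = evecs[:, -1]                     # leading sample eigenvector

cos2 = (theta_hat @ theta) ** 2
pred = (1 - gamma / lam**2) / (1 + gamma / lam)   # RMT limit: 0.7 here
print(cos2, pred)
```

Setting lam below sqrt(gamma) instead would drive cos2 toward 0, the "orthogonal to the truth" regime.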
A Dawn of the Science for High-Dimensional Massive Data Sets
• When p >> n, PCA may work with additional requirements
  – Σ is sparse or has fast decay
  – Σ⁻¹ is sparse
  – … (e.g., see Tony Cai's tutorials)
• Geometry and topology begin to enter this Odyssey
  – data concentrate on low-dimensional manifolds …
Geometric and Topological Methods
• Algebraic geometry for graphical models: Bernd Sturmfels (UC Berkeley), Mathias Drton (U Chicago)
• Differential geometry for graphical models: Shun-ichi Amari (RIKEN, Japan), John Lafferty (CMU)
• Integral geometry for statistical signal analysis: Jonathan Taylor (Stanford), Rob Ghrist (UIUC, then U Penn)
• Spectral kernel embedding: LLE (Roweis, Saul et al.), ISOMAP (Tenenbaum et al.), Laplacian eigenmap (Niyogi, Belkin et al.), Hessian LLE (Donoho et al.), Diffusion map (Coifman, Singer et al.)
• Computational topology for data analysis: Herbert Edelsbrunner (Duke, then Institute of Science and Technology Austria), Gunnar Carlsson (Stanford), et al.
Two Aspects in Those Works

Geometry and topology may play a role in:
1. characterizing local or global constraints (symmetry) in model spaces or data
   – algebraic and differential geometry for graphical models
   – spectral analysis for discrete groups (Risi Kondor)
2. characterizing the nonlinear distribution (sparsity) of data
   – spectral kernel embedding (nonlinear dimensionality reduction)
   – integral geometry for signal analysis
   – computational topology for data analysis

Let's focus on the second aspect.
Geometric and Topological Data Analysis

Geometric data analysis attempts to give insight into data by imposing a geometry (metric) on it:
• manifold learning: global coordinates preserving local structure
• metric learning: finding a metric accounting for similarity

Topological methods study invariants under metric distortion:
• clustering as connected components
• loops, holes

Between them lies Hodge Theory, a bridge between geometry and topology:
• 0-dimensional Hodge Theory: Laplacian eigenmaps, Diffusion Maps
• 1-dimensional Hodge Theory: Preference Aggregation, Game Theory
What will we cover in this course?

I. Geometric Data Analysis: from PCA/MDS to LLE/ISOMAP
II. Geometric Data Analysis: diffusion geometry
III. Topological Data Analysis
IV. Hodge Theory in Data Analysis
V. Seminar
Manifold Learning: Spectral Kernel Embeddings
– Principal Component Analysis (PCA)
– Multi-Dimensional Scaling (MDS)
– Locally Linear Embedding (LLE)
– Isometric Feature Mapping (ISOMAP)
– Laplacian Eigenmaps
– Diffusion Map
– Local Tangent Space Alignment (LTSA)
– …
Dimensionality Reduction
• Data are concentrated around low-dimensional manifolds
  – Linear: MDS, PCA
  – Nonlinear: ISOMAP, LLE, …
Generative Models in Manifold Learning
Example…
Application I (Biomolecular): Alanine-dipeptide
ISOMAP 3D embedding with RMSD metric on 3900 K-centers
Distances & Mappings
• Given a Euclidean embedding, it is easy to calculate the distances between the points.
• Multi-Dimensional Scaling (MDS) operates the other way round:
  – Given the "distances" [data], find the embedding map [configuration] which generated them
  – MDS can do so even when all but ordinal information has been jettisoned (fruit of the "non-metric revolution")
  – even when there are missing data and in the presence of considerable "noise"/error (MDS is robust)
PCA => MDS
• Here we are given pairwise distances instead of the actual data points.
  – First convert the pairwise distance matrix into the dot-product matrix
  – After that, same as PCA.

If we preserve the pairwise distances, do we preserve the structure?
Matlab Commands
• The STATS toolbox provides the following:
  – cmdscale: classical MDS
  – mdscale: nonmetric MDS
Example I: 50-Node 2-D Sensor Network Localization
Example II: City Map
How to get the dot-product matrix from the pairwise distance matrix?
(Figure: triangle on points i, j, k with side lengths d_ij, d_kj, d_ki and angle α.)
Origin-Centered MDS
• MDS: take the origin as one of the points, with orientation arbitrary.
• Alternatively, take the centroid as origin.
MDS Summary
• Given pairwise distances D, where Dij = dij², the squared distance between points i and j:
  – Convert the pairwise distance matrix D (c.n.d.) into the dot-product matrix B (p.s.d.):
    • B(a) = −0.5 H(a) D H'(a), with Householder centering matrix H(a) = I − 1a'
    • a = 1k (the k-th standard basis vector): Bij = −0.5 (Dij − Dik − Djk)
    • a = (1/N)1 (centroid):

      Bij = −(1/2) ( Dij − (1/N) Σ_{s=1}^{N} Dsj − (1/N) Σ_{t=1}^{N} Dit + (1/N²) Σ_{s,t=1}^{N} Dst )

  – Eigendecomposition of B = Y Yᵀ

If we preserve the pairwise Euclidean distances, do we preserve the structure?
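The whole recipe can be sketched in a few lines; `classical_mds` is an illustrative helper name, and the check that a known configuration is recovered up to a rigid motion is an assumption of exact Euclidean input:

```python
import numpy as np

def classical_mds(D, dim=2):
    """Classical MDS: D is a matrix of *squared* pairwise distances."""
    n = D.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n       # centering matrix (centroid choice)
    B = -0.5 * H @ D @ H                      # double-centered dot-product matrix
    evals, evecs = np.linalg.eigh(B)
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order][:dim], evecs[:, order][:, :dim]
    return evecs * np.sqrt(np.maximum(evals, 0))   # Y = U Lambda^{1/2}

# Recover a point configuration (up to rotation/translation) from distances.
rng = np.random.default_rng(1)
Y0 = rng.normal(size=(10, 2))
D = ((Y0[:, None, :] - Y0[None, :, :]) ** 2).sum(-1)   # squared distances
Y = classical_mds(D, dim=2)
D_rec = ((Y[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
print(np.allclose(D, D_rec))   # distances are preserved exactly
```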
Theory of Classical MDS: A Few Concepts
• An n-by-n matrix C is positive semi-definite (p.s.d.) if for all v ∈ Rⁿ, v'Cv ≥ 0.
• An n-by-n matrix C is conditionally negative definite (c.n.d.) if for all v ∈ Rⁿ such that Σi vi = 0, v'Cv ≤ 0.
Young–Householder–Schoenberg Lemma
• Let x be a signed distribution, i.e., x obeying Σi xi = 1 while the xi can be negative.
• Householder centering matrix: H(x) = I − 1x'
• Define B(x) = −1/2 H(x) C H'(x), for any C.
• Theorem [Young/Householder, Schoenberg 1938b]: For any signed distribution x, B(x) is p.s.d. iff C is c.n.d.
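A numerical sanity check of one direction of the lemma, assuming C is a squared Euclidean distance matrix (hence c.n.d. with zero diagonal) and trying two different signed distributions x:

```python
import numpy as np

rng = np.random.default_rng(2)

# A squared Euclidean distance matrix C: c.n.d. with zero diagonal.
P = rng.normal(size=(8, 3))
C = ((P[:, None, :] - P[None, :, :]) ** 2).sum(-1)

def B_of(x, C):
    """B(x) = -1/2 H(x) C H(x)' with H(x) = I - 1 x' (x sums to 1)."""
    n = len(x)
    H = np.eye(n) - np.outer(np.ones(n), x)
    return -0.5 * H @ C @ H.T

# Two different signed distributions (each summing to 1).
x_centroid = np.full(8, 1 / 8)               # centroid as origin
x_point = np.zeros(8); x_point[0] = 1.0      # point 0 as origin

for x in (x_centroid, x_point):
    evals = np.linalg.eigvalsh(B_of(x, C))
    assert evals.min() > -1e-9               # B(x) p.s.d. for every such x
```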
Proof
• "⇐" (C c.n.d. ⇒ B(x) p.s.d.): for any z, z'B(x)z = −y'Cy/2, where the vector y = H'(x)z obeys Σi yi = 0 (since Σi xi = 1); C c.n.d. then gives z'B(x)z ≥ 0. Observe also that if B(x) is p.s.d., then B(y) is p.s.d. for any other signed distribution y, in view of the identity B(y) = H(y)B(x)H'(y), itself a consequence of H(y) = H(y)H(x).
• "⇒" (B(x) p.s.d. ⇒ C c.n.d.): y = H'(x)y whenever Σi yi = 0, and hence y'B(x)y = −y'Cy/2 ≥ 0, i.e., y'Cy ≤ 0.
Classical MDS Theorem
• Let D be an n-by-n symmetric matrix. Define the zero-diagonal matrix C by Cij = Dij − 0.5 Dii − 0.5 Djj. Then:
  – B(x) := −0.5 H(x) D H'(x) = −0.5 H(x) C H'(x)
  – Cij = Bii(x) + Bjj(x) − 2Bij(x)
  – D is c.n.d. iff C is c.n.d.
  – If C is c.n.d., then C is isometrically embeddable, i.e., Cij = Σk (Yik − Yjk)², where Y = U Λ^{1/2} with eigendecomposition B(x) = U Λ U'
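A quick numerical check of the first two identities of the theorem, assuming the centroid choice x = (1/n)1 for the centering matrix (the test matrix D is an arbitrary symmetric matrix, as the theorem allows):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 6

# A symmetric D and its zero-diagonal companion C_ij = D_ij - 0.5 D_ii - 0.5 D_jj.
A = rng.normal(size=(n, n))
D = A + A.T
C = D - 0.5 * np.diag(D)[:, None] - 0.5 * np.diag(D)[None, :]
assert np.allclose(np.diag(C), 0)

# B(x) with x the centroid distribution.
H = np.eye(n) - np.ones((n, n)) / n
B_D = -0.5 * H @ D @ H
B_C = -0.5 * H @ C @ H

# First identity: the diagonal corrections vanish under centering (H 1 = 0).
assert np.allclose(B_D, B_C)

# Second identity: C_ij = B_ii + B_jj - 2 B_ij.
C_rec = np.diag(B_D)[:, None] + np.diag(B_D)[None, :] - 2 * B_D
assert np.allclose(C_rec, C)
```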
Proof
• The first identity follows from H(x)1 = 0.
• The second follows from Bii(x) + Bjj(x) − 2Bij(x) = Cij − 0.5Cii − 0.5Cjj, itself a consequence of the definition Bij(x) = −0.5Cij + γi + γj for some vector γ.
• The next assertion follows from u'Du = u'Cu whenever Σi ui = 0.
• The last one can be shown to amount to the second identity by direct substitution.
Remarks
• A p.s.d. B(x) (or c.n.d. C) defines a unique squared Euclidean distance D, which satisfies:
  – Bij(x) = −0.5 (Dij − Dik − Djk), with the freedom to choose the center via x
  – B = Y Y', the scalar-product matrix of the n-by-d Euclidean coordinate matrix Y
Gaussian Kernels
• Theorem. Let Dij be a squared Euclidean distance. Then for any λ ≥ 0, Bij(λ) = exp(−λ Dij) is p.s.d., and Cij(λ) = 1 − exp(−λ Dij) is a squared Euclidean distance (c.n.d. with zero diagonal).
• So Gaussian kernels are p.s.d., and 1 − (Gaussian kernel) is a squared Euclidean distance.
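Both halves of the theorem can be checked numerically; the point cloud, bandwidth, and tolerances below are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(3)
P = rng.normal(size=(12, 4))
D = ((P[:, None, :] - P[None, :, :]) ** 2).sum(-1)   # squared Euclidean distances

lam = 0.7
K = np.exp(-lam * D)            # Gaussian kernel B(lambda)
C = 1.0 - K                     # candidate squared Euclidean distance C(lambda)

# The Gaussian kernel matrix is p.s.d.
assert np.linalg.eigvalsh(K).min() > -1e-9

# C is c.n.d. with zero diagonal: its double-centered -C/2 is p.s.d.
n = len(C)
H = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * H @ C @ H
assert np.linalg.eigvalsh(B).min() > -1e-9
assert np.allclose(np.diag(C), 0)
```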
Schoenberg Transform
• A Schoenberg transformation is a function φ(d) from R₊ to R₊ of the form (Schoenberg 1938a)

    φ(d) = ∫₀^∞ [(1 − exp(−λd)) / λ] g(λ) dλ

• where g(λ)dλ is a non-negative measure on [0, ∞) such that

    ∫₁^∞ [g(λ) / λ] dλ < ∞
Schoenberg Theorem [1938a]
• Let D be an n×n matrix of squared Euclidean distances. Define the n×n matrix C componentwise by Cij = φ(Dij). Then C is a squared Euclidean distance matrix iff φ is a Schoenberg transformation:

    φ(d) = ∫₀^∞ [(1 − exp(−λd)) / λ] g(λ) dλ
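One standard example (an assumption here, not stated on the slide): the power map φ(d) = d^a with 0 < a < 1 is a Schoenberg transformation, so applying it entrywise to squared Euclidean distances should again yield squared Euclidean distances. A numeric check:

```python
import numpy as np

rng = np.random.default_rng(4)
P = rng.normal(size=(10, 3))
D = ((P[:, None, :] - P[None, :, :]) ** 2).sum(-1)   # squared Euclidean distances

# phi(d) = d^0.5 is a Schoenberg transformation (power maps d^a, 0 < a < 1,
# arise from g(lambda) proportional to lambda^(-a) in the integral form),
# so C = phi(D) should be c.n.d. with zero diagonal.
C = D ** 0.5

n = len(C)
H = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * H @ C @ H
assert np.linalg.eigvalsh(B).min() > -1e-9   # B p.s.d., so C embeds isometrically
```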
Extension I: Nonmetric MDS
• Instead of pairwise distances we can use pairwise "dissimilarities".
• When the distances are Euclidean, MDS is equivalent to PCA.
• E.g., face recognition, wine tasting
• Can extract the significant cognitive dimensions.
Extension II: MDS with Missing Values (Graph Realization Problem)
Classical MDS (complete graph with squared distances dij²): Young/Householder 1938, Schoenberg 1938
MDS with Uncertainty
A quadratic equality and inequality system.
MDS in Sensor Network Localization: anchor points
Key Problems
Nonlinear Least Squares
A difficult global optimization problem.
Matrix Representation I
A difficult global optimization problem.
Matrix Representation II
SDP Relaxation [Biswas-Ye 2004]
SDP Standard Form [Biswas-Ye 2004]
SDP Dual Form [Biswas-Ye 2004]
Unique Localizability
UL is called universal rigidity.
ULP is localizable in polynomial time [So-Ye 2005]
UL Graphs
d-lateration Graphs
Open Problems in Sensor Network Localization
Reference
• Schoenberg, I. J. (1938a) Metric Spaces and Completely Monotone Functions. The Annals of Mathematics 39, 811–841
• Schoenberg, I. J. (1938b) Metric Spaces and Positive Definite Functions. Transactions of the American Mathematical Society 44, 522–536
• François Bavaud (2010) On the Schoenberg Transformations in Data Analysis: Theory and Illustrations, arXiv:1004.0089v2
• T. F. Cox and M. A. A. Cox, Multidimensional Scaling. New York: Chapman & Hall, 1994
• Yinyu Ye, Semidefinite Programming and its Low-Rank Theorems, ICCM 2010
Nonlinear Manifolds
Unfold the manifold:
PCA and MDS see the Euclidean distance;
what is important is the geodesic distance.
Intrinsic Description
• To preserve structure, preserve the geodesic distance and not the Euclidean distance.
Two Basic Geometric Embedding Methods
• Tenenbaum–de Silva–Langford Isomap algorithm
  – Global approach: in a low-dimensional embedding,
    • nearby points should be nearby;
    • faraway points should be faraway.
• Roweis–Saul Locally Linear Embedding algorithm
  – Local approach: nearby points stay nearby.
Isomap
• Estimate the geodesic distance between faraway points.
• For neighboring points, the Euclidean distance is a good approximation to the geodesic distance.
• For faraway points, estimate the distance by a series of short hops between neighboring points.
  – Find shortest paths in a graph with edges connecting neighboring data points.
• Once we have all pairwise geodesic distances, use classical metric MDS.
Isomap – Algorithm
• Determine the neighbors.
  – All points within a fixed radius, or
  – K nearest neighbors
• Construct a neighborhood graph.
  – Each point is connected to another if it is a K-nearest neighbor.
  – Edge length equals the Euclidean distance.
• Compute the shortest paths between all pairs of nodes.
  – Floyd's algorithm: O(N³)
  – Dijkstra's algorithm: O(kN² log N)
• Construct a lower-dimensional embedding.
  – Classical MDS
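The four steps above can be sketched end to end; `isomap` is an illustrative helper name, the symmetrized k-NN graph and SciPy's Dijkstra routine are implementation assumptions, and the spiral test data is made up for the demo:

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path
from scipy.spatial.distance import pdist, squareform

def isomap(X, k=10, dim=2):
    """Isomap sketch: k-NN graph -> graph geodesics -> classical MDS."""
    n = X.shape[0]
    E = squareform(pdist(X))                 # Euclidean distances
    # Neighborhood graph: keep each point's k nearest neighbors (symmetrized);
    # edge length equals the Euclidean distance, np.inf marks non-edges.
    G = np.full((n, n), np.inf)
    for i in range(n):
        nbrs = np.argsort(E[i])[1:k + 1]
        G[i, nbrs] = E[i, nbrs]
        G[nbrs, i] = E[nbrs, i]
    D = shortest_path(G, method="D")         # Dijkstra geodesic estimates
    # Classical MDS on the squared geodesic distances.
    H = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * H @ (D ** 2) @ H
    evals, evecs = np.linalg.eigh(B)
    order = np.argsort(evals)[::-1][:dim]
    return evecs[:, order] * np.sqrt(np.maximum(evals[order], 0))

# Unroll a noisy spiral embedded in 3-D; the first Isomap coordinate should
# roughly track arc length along the curve.
rng = np.random.default_rng(5)
t = np.sort(rng.uniform(np.pi / 2, 3 * np.pi, 300))
X = np.c_[t * np.cos(t), t * np.sin(t), rng.uniform(0, 1, 300)]
Y = isomap(X, k=10, dim=2)
print(Y.shape)
```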
Example…
Residual Variance
(Plots: Face Images, Swiss Roll, Hand Images.)
Application I (Biomolecular): Alanine-dipeptide
ISOMAP 3D embedding with RMSD metric on 3900 K-centers
Convergence of ISOMAP
• ISOMAP has provable convergence guarantees.
• Given that {xi} is sampled sufficiently densely, ISOMAP will closely approximate the original distances as measured in the manifold M.
• In other words, the geodesic distance approximations computed in the graph G can be arbitrarily good.
• Let's examine these theoretical guarantees in more detail …
Possible Issues
Two-Step Approximations
Dense-Sampling Theorem [Bernstein, de Silva, Langford, and Tenenbaum 2000]
ISOMAP Asymptotic Convergence Proofs

Proof of Theorem 1
d_M(x, y) ≤ d_S(x, y) ≤ (1 + 4δ/ε) d_M(x, y)

Proof:
• The left-hand side of the inequality follows directly from the triangle inequality.
• Let γ be any piecewise-smooth arc connecting x to y, with ℓ = length(γ).
• If ℓ ≤ ε − 2δ, then x and y are connected by an edge in G, which we can use as our path.
Global vs. Local Methods in NLDR
Proof (cont.)
The Second Approximation: d_S ≈ d_G

We would now like to show the other approximate equality, d_S ≈ d_G. First let's make some definitions:
1. The minimum radius of curvature r0 = r0(M) is defined by 1/r0 = max_{γ,t} ‖γ′′(t)‖, where γ varies over all unit-speed geodesics in M and t is in the domain D of γ.
• Intuitively, geodesics in M curl around "less tightly" than circles of radius less than r0(M).
2. The minimum branch separation s0 = s0(M) is the largest positive number for which ‖x − y‖ < s0 implies d_M(x, y) ≤ πr0 for any x, y ∈ M.

Lemma: If γ is a geodesic in M connecting points x and y, and if ℓ = length(γ) ≤ πr0, then:
2 r0 sin(ℓ/2r0) ≤ ‖x − y‖ ≤ ℓ
Remarks: Notes on the Lemma
• We take this Lemma without proof, as it is somewhat technical and long.
• Using the fact that sin(t) ≥ t − t³/6 for t ≥ 0, we can write down a weakened form of the Lemma:
(1 − ℓ²/24r0²) ℓ ≤ ‖x − y‖ ≤ ℓ
• We can also write down an even more weakened version, valid for ℓ ≤ πr0:
(2/π) ℓ ≤ ‖x − y‖ ≤ ℓ
• We can now show d_G ≈ d_S.
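A quick numerical sanity check (illustrative, not from the slides; the value of r0 is an arbitrary choice) confirms that both weakened forms really are lower bounds on the sharp quantity 2r0 sin(ℓ/2r0) over the valid range ℓ ∈ [0, πr0]:

```python
import numpy as np

r0 = 1.7                                      # arbitrary illustrative radius of curvature
ell = np.linspace(0.0, np.pi * r0, 10_000)    # geodesic lengths in the valid range

exact  = 2 * r0 * np.sin(ell / (2 * r0))      # sharp lower bound from the Lemma
taylor = (1 - ell**2 / (24 * r0**2)) * ell    # weakened form, via sin(t) >= t - t^3/6
jordan = (2 / np.pi) * ell                    # weakest form, valid for ell <= pi*r0

# Both weakened forms sit below the sharp bound, and everything is <= ell
# (small tolerance absorbs floating-point rounding at the endpoints).
ok_taylor = bool(np.all(exact + 1e-9 >= taylor))
ok_jordan = bool(np.all(exact + 1e-9 >= jordan))
ok_upper  = bool(np.all(exact <= ell + 1e-9))
print(ok_taylor, ok_jordan, ok_upper)
```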
Theorem 2: Euclidean Hops ≈ Geodesic Hops [Bernstein, de Silva, Langford, and Tenenbaum 2000]

Theorem 2: Let λ > 0 be given. Suppose data points x_i, x_{i+1} ∈ M satisfy:
‖x_i − x_{i+1}‖ < s0
‖x_i − x_{i+1}‖ ≤ (2/π) r0 √(24λ)
Suppose also there is a geodesic arc of length ℓ = d_M(x_i, x_{i+1}) connecting x_i to x_{i+1}. Then:
(1 − λ) ℓ ≤ ‖x_i − x_{i+1}‖ ≤ ℓ
Proof of Theorem 2
• By the first assumption we can directly conclude ℓ ≤ πr0.
• This fact allows us to apply the Lemma; using the weakest form combined with the second assumption gives us:
ℓ ≤ (π/2) ‖x_i − x_{i+1}‖ ≤ r0 √(24λ)
• Solving for λ in the above gives 1 − λ ≤ 1 − ℓ²/24r0². Applying the weakened statement of the Lemma then gives us the desired result.
• Combining Theorems 1 and 2 shows d_M ≈ d_G. This leads us then to our main theorem...
Main Theorem [Bernstein, de Silva, Langford, and Tenenbaum 2000]

Let M be a compact submanifold of R^n and let {x_i} be a finite set of data points in M. We are given a graph G on {x_i} and positive real numbers λ1, λ2 < 1 and δ, ε > 0. Suppose:
1. G contains all edges (x_i, x_j) of length ‖x_i − x_j‖ ≤ ε.
2. The data set {x_i} satisfies a δ-sampling condition: for every point m ∈ M there exists an x_i such that d_M(m, x_i) < δ.
3. M is geodesically convex: the shortest curve joining any two points on the surface is a geodesic curve.
4. ε < (2/π) r0 √(24λ1), where r0 is the minimum radius of curvature of M: 1/r0 = max_{γ,t} ‖γ′′(t)‖, where γ varies over all unit-speed geodesics in M.
5. ε < s0, where s0 is the minimum branch separation of M: the largest positive number for which ‖x − y‖ < s0 implies d_M(x, y) ≤ πr0.
6. δ < λ2 ε / 4.
Then the following is valid for all x, y ∈ M:
(1 − λ1) d_M(x, y) ≤ d_G(x, y) ≤ (1 + λ2) d_M(x, y)
Probabilistic Result: Recap
• So, short Euclidean-distance hops along G approximate well the actual geodesic distance as measured in M.
• What were the main assumptions we made? The biggest one was the δ-sampling density condition.
• A probabilistic version of the Main Theorem can be shown, where each point x_i is drawn from a density function; the approximation bounds then hold with high probability. Here's a truncated version of what the theorem looks like now:

Asymptotic Convergence Theorem: Given λ1, λ2, µ > 0, then for density function α sufficiently large:
1 − λ1 ≤ d_G(x, y) / d_M(x, y) ≤ 1 + λ2
will hold with probability at least 1 − µ for any two data points x, y.
A Shortcoming of ISOMAP
• One needs to compute pairwise shortest paths between all sample pairs (i, j):
– Global
– Non-sparse
– Cubic complexity O(N³)
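For concreteness, here is a minimal ISOMAP sketch (illustrative code, not from the slides; `isomap` and its parameters are our own naming) that makes the cubic cost visible: the Floyd-Warshall step is the O(N³) bottleneck, and its output feeds classical MDS.

```python
import numpy as np

def isomap(X, k=10, d=2):
    """Minimal ISOMAP sketch: k-NN graph -> all-pairs shortest paths
    (Floyd-Warshall, the O(N^3) step) -> classical MDS."""
    n = len(X)
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))
    # Symmetric k-NN graph with Euclidean edge lengths.
    g = np.full((n, n), np.inf)
    nn = np.argsort(dist, axis=1)[:, 1:k + 1]
    rows = np.repeat(np.arange(n), k)
    g[rows, nn.ravel()] = dist[rows, nn.ravel()]
    g = np.minimum(g, g.T)
    np.fill_diagonal(g, 0.0)
    # Floyd-Warshall: global, dense, cubic in n.
    for m in range(n):
        g = np.minimum(g, g[:, m:m + 1] + g[m:m + 1, :])
    # Classical MDS on the geodesic distance matrix.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (g ** 2) @ J
    w, V = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:d]
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

# Noiseless planar spiral: a 1-D manifold embedded in 3-D.
t = np.linspace(0, 4 * np.pi, 300)
X = np.stack([t * np.cos(t), t * np.sin(t), np.zeros_like(t)], axis=1)
Y = isomap(X, k=6, d=1)
# The recovered coordinate should order points as t does (up to sign).
corr = np.corrcoef(Y[:, 0], t)[0, 1]
print(abs(corr))
```

On this spiral the single recovered coordinate orders points by arc length, so its correlation with the curve parameter is high. In practice the Floyd-Warshall step is replaced by Dijkstra from each vertex, roughly O(kN² log N) on a sparse k-NN graph.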
Locally Linear Embedding
"A manifold is a topological space which is locally Euclidean."
Fit Locally, Think Globally
We expect each data point and its neighbours to lie on or close to a locally linear patch of the manifold.
Each point can be written as a linear combination of its neighbors, with the weights chosen to minimize the reconstruction error.
Derivation on board
Fit Locally…
Important property...
• The weights that minimize the reconstruction errors are invariant to rotation, rescaling and translation of the data points.
– Invariance to translation is enforced by adding the constraint that the weights sum to one.
• The same weights that reconstruct the data points in D dimensions should reconstruct them on the manifold in d dimensions.
– The weights characterize the intrinsic geometric properties of each neighborhood.
Think Globally…
Algorithm (K-NN)
• Local fitting step (with centering):
– Consider a point x_i.
– Choose its K(i) neighbors η_j, whose origin is at x_i (so that x_i = 0).
– Compute the sum-to-one weights w_ij which minimize
Ψ_i(w) = ‖x_i − Σ_{j=1}^{K(i)} w_ij η_j‖²,  subject to Σ_j w_ij = 1, x_i = 0.
• Construct the neighborhood inner product matrix: C_jk = ⟨η_j, η_k⟩.
• Compute the weight vector w_i = (w_ij) as
w_i = (C + λI)^{-1} 1,
where 1 is the K-vector of all ones and λ is a regularization parameter.
• Then normalize w_i to a sum-to-one vector.
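The centered fitting step above can be sketched as follows (illustrative code, not from the slides; `lle_weights` and the choices k = 5, λ = 10⁻³ are our own):

```python
import numpy as np

def lle_weights(X, i, k=5, lam=1e-3):
    """Centered local fit: neighbors eta_j = x_j - x_i, Gram matrix
    C_jk = <eta_j, eta_k>, solve w = (C + lam*I)^{-1} 1, normalize to sum 1."""
    dist = np.linalg.norm(X - X[i], axis=1)
    nbrs = np.argsort(dist)[1:k + 1]     # indices of the k nearest neighbors
    eta = X[nbrs] - X[i]                 # center the neighborhood at x_i
    C = eta @ eta.T                      # k-by-k inner-product matrix
    w = np.linalg.solve(C + lam * np.eye(k), np.ones(k))
    return nbrs, w / w.sum()

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3))
nbrs, w = lle_weights(X, i=0, k=5)
# With sum-to-one weights, the reconstruction error of x_0 equals
# || sum_j w_j (x_j - x_0) ||, the norm of the weighted centered neighbors.
resid = np.linalg.norm(w @ (X[nbrs] - X[0]))
print(w.sum(), resid)
```

The λI term keeps the solve well-posed when the neighborhood is nearly degenerate (neighbors close to a low-dimensional affine patch); a common variant scales λ by tr(C).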
Algorithm (K-NN)
• Local fitting step (without centering):
– Consider a point x_i.
– Choose its K(i) neighbors x_j.
– Compute the sum-to-one weights w_ij which minimize
Ψ_i(w) = ‖x_i − Σ_{j=1}^{K(i)} w_ij x_j‖².
• Construct the neighborhood inner product matrix: C_jk = ⟨η_j, η_k⟩.
• Compute the weight vector w_i = C⁺ v_i, where v_i = (v_ik) ∈ R^{K(i)} and v_ik = ⟨η_k, x_i⟩.
Algorithm continued
• Global embedding step:
– Construct the N-by-N weight matrix W:
W_ij = w_ij if j ∈ N(i), and 0 otherwise.
– Compute the d-by-N matrix Y which minimizes
φ(Y) = Σ_i ‖Y_i − Σ_{j=1}^{N} W_ij Y_j‖² = tr[Y (I − W)ᵀ (I − W) Yᵀ]
• Compute B = (I − W)ᵀ (I − W).
• Find the d+1 bottom eigenvectors of B: v_n, v_{n−1}, ..., v_{n−d}.
• Let the d-dimensional embedding be Y = [v_{n−1}, v_{n−2}, ..., v_{n−d}] (discarding the constant bottom eigenvector v_n).
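Putting both steps together, a compact end-to-end sketch (illustrative; `lle_embed`, k = 8 and the trace-scaled regularizer are our own choices, not the slides'):

```python
import numpy as np

def lle_embed(X, k=8, d=2, reg=1e-3):
    """Global embedding step: build W row by row, form
    B = (I - W)^T (I - W), and take the d bottom nonzero eigenvectors."""
    n = len(X)
    W = np.zeros((n, n))
    for i in range(n):
        dist = np.linalg.norm(X - X[i], axis=1)
        nbrs = np.argsort(dist)[1:k + 1]
        eta = X[nbrs] - X[i]
        C = eta @ eta.T
        # Trace-scaled regularization keeps the solve stable on flat patches.
        w = np.linalg.solve(C + reg * np.trace(C) * np.eye(k), np.ones(k))
        W[i, nbrs] = w / w.sum()
    B = (np.eye(n) - W).T @ (np.eye(n) - W)   # p.s.d. by construction
    vals, vecs = np.linalg.eigh(B)
    # Rows of W sum to one, so (I - W)1 = 0: the bottom eigenvector is
    # constant and is discarded; the next d give the embedding.
    return vecs[:, 1:d + 1]

# Points on a smooth curve in 3-D: LLE should recover a 1-D parameterization.
t = np.linspace(0, 1, 200)
X = np.stack([np.cos(2 * t), np.sin(2 * t), 2 * t], axis=1)
Y = lle_embed(X, k=8, d=1)
corr = abs(np.corrcoef(Y[:, 0], t)[0, 1])
print(Y.shape, corr)
```

Note W is sparse (k nonzeros per row), so in a serious implementation B is never formed densely; one computes the bottom eigenvectors of the sparse (I − W)ᵀ(I − W) directly.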
Remarks on LLE
• Searching the k-nearest neighbors is O(kN).
• W is sparse: the fraction of nonzeros is kN/N² = k/N.
• W might be negative; an additional nonnegativity constraint can be imposed.
• B = (I − W)ᵀ(I − W) is positive semi-definite (p.s.d.).
• Open problem: exact reconstruction conditions?
Grolliers Encyclopedia
Summary: ISOMAP vs. LLE

| ISOMAP | LLE |
| --- | --- |
| Does MDS on the geodesic distance matrix. | Models local neighborhoods as linear patches, then embeds them in a lower-dimensional space. |
| Global approach. | Local approach. |
| Might not work for nonconvex manifolds with holes. | Handles nonconvex manifolds with holes. |
| Extensions: Landmark, Conformal & Isometric ISOMAP. | Extensions: Hessian LLE, Laplacian Eigenmaps, etc. |

Both need the manifold to be finely sampled.
References
• Tenenbaum, de Silva, and Langford. A Global Geometric Framework for Nonlinear Dimensionality Reduction. Science 290:2319-2323, 22 Dec. 2000.
• Roweis and Saul. Nonlinear Dimensionality Reduction by Locally Linear Embedding. Science 290:2323-2326, 22 Dec. 2000.
• M. Bernstein, V. de Silva, J. Langford, and J. Tenenbaum. Graph Approximations to Geodesics on Embedded Manifolds. Technical Report, Department of Psychology, Stanford University, 2000.
• V. de Silva and J.B. Tenenbaum. Global versus local methods in nonlinear dimensionality reduction. Neural Information Processing Systems 15 (NIPS 2002), pp. 705-712, 2003.
• V. de Silva and J.B. Tenenbaum. Unsupervised learning of curved manifolds. Nonlinear Estimation and Classification, 2002.
• V. de Silva and J.B. Tenenbaum. Sparse multidimensional scaling using landmark points. Available at: http://math.stanford.edu/~silva/public/publications.html
Matlab Dimensionality Reduction Toolbox
• http://homepage.tudelft.nl/19j49/Matlab_Toolbox_for_Dimensionality_Reduction.html
• math.pku.edu.cn/teachers/yaoy/Spring2011/matlab/drtoolbox
– Principal Component Analysis (PCA), Probabilistic PCA
– Factor Analysis (FA), Sammon mapping, Linear Discriminant Analysis (LDA)
– Multidimensional scaling (MDS), Isomap, Landmark Isomap
– Local Linear Embedding (LLE), Laplacian Eigenmaps, Hessian LLE, Conformal Eigenmaps
– Local Tangent Space Alignment (LTSA), Maximum Variance Unfolding (extension of LLE)
– Landmark MVU (LandmarkMVU), Fast Maximum Variance Unfolding (FastMVU)
– Kernel PCA
– Diffusion maps
– …