Non-linear Principal Manifolds: a Useful Tool in Bioinformatics and Medical Applications
Andrei Zinovyev, Institut des Hautes Études Scientifiques, France. Plan of the talk: object of study; definition of principal manifold (PM); constructing PMs: elastic maps.
![Page 1: Non-linear Principal Manifolds a Useful Tool in Bioinformatics and Medical Applications](https://reader030.vdocuments.us/reader030/viewer/2022032805/56813214550346895d986dfa/html5/thumbnails/1.jpg)
Non-linear Principal Manifolds: a Useful Tool in Bioinformatics and Medical Applications
Andrei Zinovyev
Institut des Hautes Études Scientifiques, France
![Page 2: Non-linear Principal Manifolds a Useful Tool in Bioinformatics and Medical Applications](https://reader030.vdocuments.us/reader030/viewer/2022032805/56813214550346895d986dfa/html5/thumbnails/2.jpg)
Plan of the talk
Object of study
Definition of principal manifold (PM)
Constructing PMs: elastic maps
Examples of biomedical applications
![Page 3: Non-linear Principal Manifolds a Useful Tool in Bioinformatics and Medical Applications](https://reader030.vdocuments.us/reader030/viewer/2022032805/56813214550346895d986dfa/html5/thumbnails/3.jpg)
Principal manifolds: elastic maps framework
[Diagram: principal manifolds placed among non-linear data-mining methods. Labels: regression, approximation; supervised classification; SVM; clustering; K-means; SOM; visualization; multidimensional scaling; PCA; factor analysis; LLE; ISOMAP.]
![Page 4: Non-linear Principal Manifolds a Useful Tool in Bioinformatics and Medical Applications](https://reader030.vdocuments.us/reader030/viewer/2022032805/56813214550346895d986dfa/html5/thumbnails/4.jpg)
Finite set of objects in $\mathbb{R}^N$: $X_i$, $i = 1, \dots, m$
IRIS database

| Sepal height | Sepal width | Petal height | Petal width | Species |
|---|---|---|---|---|
| 4.9 | 3 | 1.4 | 0.2 | Iris-setosa |
| 4.7 | 3.2 | 1.3 | 0.3 | Iris-setosa |
| 4.6 | 3.1 | 1.5 | 0.2 | Iris-setosa |
| 7 | 3.2 | 4.7 | 1.4 | Iris-versicolor |
| 6.4 | 3.2 | 4.5 | 1.5 | Iris-versicolor |
| 6.9 | 3.1 | 4.9 | 1.5 | Iris-versicolor |
| 6.3 | 3.3 | 6 | 2.5 | Iris-virginica |
| 5.8 | 2.7 | X | 1.9 | Iris-virginica |
| 7.1 | 3 | 5.9 | 2.1 | Iris-virginica |
| 6.3 | 2.9 | 5.6 | 1.8 | Iris-virginica |
![Page 5: Non-linear Principal Manifolds a Useful Tool in Bioinformatics and Medical Applications](https://reader030.vdocuments.us/reader030/viewer/2022032805/56813214550346895d986dfa/html5/thumbnails/5.jpg)
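Mean point
$$\bar{X} = \frac{1}{m} \sum_{i=1}^{m} X_i$$
$$\sum_{i=1}^{m} \left( X_i - \bar{X} \right)^2 \to \min$$
K-means clustering
$$\sum_{i=1}^{m} \left( X_i - Y_i^{\mathrm{closest}} \right)^2 \to \min$$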
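The two criteria above can be compared numerically. The following is a minimal sketch, assuming NumPy and a plain Lloyd-style K-means loop; the function names, the seed and the choice of k = 3 are illustrative assumptions, and the data are the nine complete rows of the IRIS slide (the row with the missing value X is left out).

```python
import numpy as np

def mean_point(X):
    """Mean of the rows of X: the single point minimizing the sum of squared distances."""
    return X.mean(axis=0)

def sum_sq(X, Y):
    """Sum over data points of the squared distance to the closest row of Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
    return float(d2.min(axis=1).sum())

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd iterations: assign each point to its closest center, then re-average."""
    rng = np.random.default_rng(seed)
    Y = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for c in range(k):
            members = X[labels == c]
            if len(members):  # keep the old center if a cluster is empty
                Y[c] = members.mean(axis=0)
    return Y

# Complete rows from the IRIS slide (species labels are not used here).
X = np.array([
    [4.9, 3.0, 1.4, 0.2], [4.7, 3.2, 1.3, 0.3], [4.6, 3.1, 1.5, 0.2],
    [7.0, 3.2, 4.7, 1.4], [6.4, 3.2, 4.5, 1.5], [6.9, 3.1, 4.9, 1.5],
    [6.3, 3.3, 6.0, 2.5], [7.1, 3.0, 5.9, 2.1], [6.3, 2.9, 5.6, 1.8],
])

x_bar = mean_point(X)
total = sum_sq(X, x_bar[None, :])   # criterion for the mean point
centers = kmeans(X, k=3)
within = sum_sq(X, centers)         # criterion for K-means
print("mean point:", total, "K-means:", within)
```

With several centers the criterion can only stay equal or decrease relative to the single mean point, which is why the K-means value printed here is never larger.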
![Page 6: Non-linear Principal Manifolds a Useful Tool in Bioinformatics and Medical Applications](https://reader030.vdocuments.us/reader030/viewer/2022032805/56813214550346895d986dfa/html5/thumbnails/6.jpg)
Principal “Object”
$$\sum_{i=1}^{m} \operatorname{dist}^2\left( X_i, \text{object} \right) \to \min$$
![Page 7: Non-linear Principal Manifolds a Useful Tool in Bioinformatics and Medical Applications](https://reader030.vdocuments.us/reader030/viewer/2022032805/56813214550346895d986dfa/html5/thumbnails/7.jpg)
Principal Component Analysis
[Figure: data cloud with the 1st principal axis (direction of maximal dispersion) and the 2nd principal axis.]
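As an illustration of the principal axes, here is a minimal sketch that computes them with a singular value decomposition of the centered data. The synthetic data and the function name `principal_axes` are assumptions made for the example, not part of the talk.

```python
import numpy as np

def principal_axes(X):
    """Return principal axes (rows, ordered by decreasing dispersion) and their variances."""
    Xc = X - X.mean(axis=0)                      # center the data at the mean point
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    variances = s ** 2 / len(X)                  # dispersion along each axis
    return Vt, variances

# Synthetic cloud stretched along the direction (3, 1), with small isotropic noise.
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([3 * t, t]) + 0.1 * rng.normal(size=(200, 2))

axes, variances = principal_axes(X)
print("1st principal axis:", axes[0], "variance:", variances[0])
print("2nd principal axis:", axes[1], "variance:", variances[1])
```

The sign of each axis is arbitrary; only the direction is meaningful.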
![Page 8: Non-linear Principal Manifolds a Useful Tool in Bioinformatics and Medical Applications](https://reader030.vdocuments.us/reader030/viewer/2022032805/56813214550346895d986dfa/html5/thumbnails/8.jpg)
Principal manifold
![Page 9: Non-linear Principal Manifolds a Useful Tool in Bioinformatics and Medical Applications](https://reader030.vdocuments.us/reader030/viewer/2022032805/56813214550346895d986dfa/html5/thumbnails/9.jpg)
What do we want?
Non-linear surface (1D, 2D, 3D …)
Smooth and not twisted
The data model is unknown
Speed (time linear with Nm)
Uniqueness
Fast way to project data points
![Page 10: Non-linear Principal Manifolds a Useful Tool in Bioinformatics and Medical Applications](https://reader030.vdocuments.us/reader030/viewer/2022032805/56813214550346895d986dfa/html5/thumbnails/10.jpg)
Metaphor of elasticity
[Figure labels: data points; graph nodes; energies $U^{(Y)}$, $U^{(E)}$, $U^{(R)}$.]
![Page 11: Non-linear Principal Manifolds a Useful Tool in Bioinformatics and Medical Applications](https://reader030.vdocuments.us/reader030/viewer/2022032805/56813214550346895d986dfa/html5/thumbnails/11.jpg)
Constructing elastic nets
[Figure labels: node $y$; edge with end nodes $E(0)$, $E(1)$; rib with nodes $R(1)$, $R(0)$, $R(2)$.]
![Page 12: Non-linear Principal Manifolds a Useful Tool in Bioinformatics and Medical Applications](https://reader030.vdocuments.us/reader030/viewer/2022032805/56813214550346895d986dfa/html5/thumbnails/12.jpg)
Definition of elastic energy
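$$U = U^{(Y)} + U^{(E)} + U^{(R)}$$
$$U^{(Y)} = \frac{1}{N} \sum_{i=1}^{p} \sum_{X^{(j)} \in K^{(i)}} \left\| X^{(j)} - y^{(i)} \right\|^2$$
$$U^{(E)} = \sum_{i=1}^{s} \lambda_i \left\| E^{(i)}(1) - E^{(i)}(0) \right\|^2$$
$$U^{(R)} = \sum_{i=1}^{r} \mu_i \left\| R^{(i)}(1) + R^{(i)}(2) - 2 R^{(i)}(0) \right\|^2$$
$$\lambda_i = \lambda_0, \quad \mu_i = \mu_0$$
[Figure labels: edge $E(0)$, $E(1)$; rib $R(1)$, $R(0)$, $R(2)$; node $y$; data point $X_j$.]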
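The three energy terms can be evaluated directly for a small net. This is a minimal sketch under stated assumptions: NumPy, a single stretching coefficient `lam` and bending coefficient `mu` for all edges and ribs, the $1/N$ factor read as one over the number of data points, and ribs stored as hypothetical index triples `(r0, r1, r2)` with `r0` the central node. It is not the elmap implementation.

```python
import numpy as np

def elastic_energy(X, Y, edges, ribs, lam, mu):
    """Return (U_Y, U_E, U_R) for data X, node positions Y and a given net structure."""
    # U_Y: mean squared distance from each data point to its closest node.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
    U_Y = float(d2.min(axis=1).sum() / len(X))
    # U_E: stretching, summed squared edge lengths.
    e = np.asarray(edges)
    U_E = float(lam * ((Y[e[:, 1]] - Y[e[:, 0]]) ** 2).sum())
    # U_R: bending, squared deviation of each rib from a straight, evenly spaced one.
    r = np.asarray(ribs)
    U_R = float(mu * ((Y[r[:, 1]] + Y[r[:, 2]] - 2 * Y[r[:, 0]]) ** 2).sum())
    return U_Y, U_E, U_R

# A chain of three nodes: two edges and one rib centered on node 1.
edges = [(0, 1), (1, 2)]
ribs = [(1, 0, 2)]
X = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])

Y_straight = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
Y_bent = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])

straight = elastic_energy(X, Y_straight, edges, ribs, lam=0.5, mu=0.1)
bent = elastic_energy(X, Y_bent, edges, ribs, lam=0.5, mu=0.1)
print("straight chain:", straight)
print("bent chain:    ", bent)
```

The straight, evenly spaced chain has zero bending energy, while bending the middle node away from the line raises all three terms in this example.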
![Page 13: Non-linear Principal Manifolds a Useful Tool in Bioinformatics and Medical Applications](https://reader030.vdocuments.us/reader030/viewer/2022032805/56813214550346895d986dfa/html5/thumbnails/13.jpg)
Elastic manifold
![Page 14: Non-linear Principal Manifolds a Useful Tool in Bioinformatics and Medical Applications](https://reader030.vdocuments.us/reader030/viewer/2022032805/56813214550346895d986dfa/html5/thumbnails/14.jpg)
Global minimum and softening
$\lambda_0, \mu_0 \sim 10^{3}$
$\lambda_0, \mu_0 \sim 10^{2}$
$\lambda_0, \mu_0 \sim 10^{1}$
$\lambda_0, \mu_0 \sim 10^{-1}$
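The softening idea can be sketched for the simplest case, a 1D chain of nodes. The code below is an illustration under assumptions, not the author's algorithm: it alternates between assigning points to their closest node and solving the linear system that minimizes the quadratic energy for that assignment, and it reuses the result of each stiff stage as the start of the next, softer one. Setting both coefficients to the same value at each stage, the number of nodes and the iteration count are all choices made for the example.

```python
import numpy as np

def mean_sq_dist(X, Y):
    """Mean squared distance from each data point to its closest node."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
    return float(d2.min(axis=1).mean())

def fit_chain(X, Y, lam, mu, n_iter=20):
    """Minimize U_Y + U_E + U_R for a chain of nodes at fixed lam and mu."""
    N, dim = X.shape
    p = len(Y)
    D = np.diff(np.eye(p), axis=0)        # rows: y[i+1] - y[i]            (edges)
    S = np.diff(np.eye(p), n=2, axis=0)   # rows: y[i] - 2 y[i+1] + y[i+2] (ribs)
    stiffness = lam * D.T @ D + mu * S.T @ S
    for _ in range(n_iter):
        # Step 1: partition the data by closest node.
        d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
        k = d2.argmin(axis=1)
        counts = np.bincount(k, minlength=p)
        sums = np.zeros((p, dim))
        np.add.at(sums, k, X)
        # Step 2: for this partition the energy is quadratic in Y; solve for its minimum.
        Y = np.linalg.solve(np.diag(counts / N) + stiffness, sums / N)
    return Y

# Noisy half circle as test data.
rng = np.random.default_rng(0)
theta = rng.uniform(0, np.pi, size=300)
X = np.column_stack([np.cos(theta), np.sin(theta)]) + 0.05 * rng.normal(size=(300, 2))

# Start from a straight chain through the mean point, then soften step by step.
p = 10
Y = X.mean(axis=0) + np.linspace(-1, 1, p)[:, None] * np.array([1.0, 0.0])
errors = []
for strength in (1e3, 1e2, 1e1, 1e-1):
    Y = fit_chain(X, Y, lam=strength, mu=strength)
    errors.append(mean_sq_dist(X, Y))
print(errors)
```

A very stiff chain stays close to the mean point; as the coefficients decrease, the chain unfolds and follows the data more closely, which is the behaviour the softening schedule is meant to exploit.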
![Page 15: Non-linear Principal Manifolds a Useful Tool in Bioinformatics and Medical Applications](https://reader030.vdocuments.us/reader030/viewer/2022032805/56813214550346895d986dfa/html5/thumbnails/15.jpg)
Adaptive algorithms
Growing net
Adaptive net
Refining net:
Idea of scaling:
![Page 16: Non-linear Principal Manifolds a Useful Tool in Bioinformatics and Medical Applications](https://reader030.vdocuments.us/reader030/viewer/2022032805/56813214550346895d986dfa/html5/thumbnails/16.jpg)
Projection onto the manifold
Closest node of the net
Closest point of the manifold
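The difference between the two projections can be shown on a 1D chain. This is a minimal sketch that assumes the manifold between nodes is interpolated piecewise linearly (the slide does not say which interpolation is used) and that neighbouring nodes are distinct; the function names are illustrative.

```python
import numpy as np

def closest_node(x, Y):
    """Index of the node closest to x."""
    return int(((Y - x) ** 2).sum(axis=1).argmin())

def closest_point_on_chain(x, Y):
    """Closest point on the piecewise-linear chain through the nodes Y.

    Returns (coordinate, point), where the coordinate i + t means
    'fraction t of the way from node i to node i + 1'.
    """
    best_d2, best_coord, best_point = np.inf, None, None
    for i in range(len(Y) - 1):
        a, b = Y[i], Y[i + 1]
        ab = b - a
        # Orthogonal projection onto the segment, clipped to its end points.
        t = float(np.clip(np.dot(x - a, ab) / np.dot(ab, ab), 0.0, 1.0))
        q = a + t * ab
        d2 = float(np.sum((x - q) ** 2))
        if d2 < best_d2:
            best_d2, best_coord, best_point = d2, i + t, q
    return best_coord, best_point

Y = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
x = np.array([0.4, 1.0])

node = closest_node(x, Y)
coord, point = closest_point_on_chain(x, Y)
print("closest node:", node)
print("closest point on chain:", point, "at coordinate", coord)
```

Projection to the closest node gives a discrete position on the map, while projection to the closest point gives a continuous internal coordinate.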
![Page 17: Non-linear Principal Manifolds a Useful Tool in Bioinformatics and Medical Applications](https://reader030.vdocuments.us/reader030/viewer/2022032805/56813214550346895d986dfa/html5/thumbnails/17.jpg)
Colorings: visualize any function
![Page 18: Non-linear Principal Manifolds a Useful Tool in Bioinformatics and Medical Applications](https://reader030.vdocuments.us/reader030/viewer/2022032805/56813214550346895d986dfa/html5/thumbnails/18.jpg)
Density visualization
![Page 19: Non-linear Principal Manifolds a Useful Tool in Bioinformatics and Medical Applications](https://reader030.vdocuments.us/reader030/viewer/2022032805/56813214550346895d986dfa/html5/thumbnails/19.jpg)
Example: different topologies
$\mathbb{R}^N$
$\mathbb{R}^2$
![Page 20: Non-linear Principal Manifolds a Useful Tool in Bioinformatics and Medical Applications](https://reader030.vdocuments.us/reader030/viewer/2022032805/56813214550346895d986dfa/html5/thumbnails/20.jpg)
VIDAExpert tool and elmap C++ package
![Page 21: Non-linear Principal Manifolds a Useful Tool in Bioinformatics and Medical Applications](https://reader030.vdocuments.us/reader030/viewer/2022032805/56813214550346895d986dfa/html5/thumbnails/21.jpg)
Regression and principal manifolds
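Regression: $\sum_i \left( x_i - F(x_i) \right)^2 \to \min$
Principal component: $\sum_i \left\| x_i - P x_i \right\|^2 \to \min$
[Figure labels: axes x and F(x); data; gen. curve; grid.]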
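The contrast between the two criteria can be checked on a straight-line fit. In this minimal sketch, regression minimizes squared vertical residuals and the principal component minimizes squared orthogonal distances. Treating the regression target as a separate coordinate called `y` is an assumption about notation, and the synthetic data and function names are illustrative only.

```python
import numpy as np

def regression_line(x, y):
    """Least-squares line y = slope * x + intercept (vertical residuals)."""
    slope, intercept = np.polyfit(x, y, 1)
    return slope, intercept

def principal_line(x, y):
    """Line through the mean point along the first principal axis (orthogonal residuals)."""
    pts = np.column_stack([x, y])
    mean = pts.mean(axis=0)
    _, _, Vt = np.linalg.svd(pts - mean, full_matrices=False)
    dx, dy = Vt[0]
    slope = dy / dx                      # assumes the axis is not vertical
    return slope, mean[1] - slope * mean[0]

def vertical_sse(x, y, slope, intercept):
    """Sum of squared vertical residuals."""
    return float(np.sum((y - (slope * x + intercept)) ** 2))

def orthogonal_sse(x, y, slope, intercept):
    """Sum of squared orthogonal distances to the line."""
    return vertical_sse(x, y, slope, intercept) / (1.0 + slope ** 2)

rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 0.5 * x + 0.3 * rng.normal(size=100)

reg = regression_line(x, y)
pc = principal_line(x, y)
print("regression line:", reg)
print("principal line: ", pc)
```

Each line is optimal only for its own criterion, so the two fits generally differ whenever the data are noisy.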
![Page 22: Non-linear Principal Manifolds a Useful Tool in Bioinformatics and Medical Applications](https://reader030.vdocuments.us/reader030/viewer/2022032805/56813214550346895d986dfa/html5/thumbnails/22.jpg)
Image skeletonization or clustering around curves
![Page 23: Non-linear Principal Manifolds a Useful Tool in Bioinformatics and Medical Applications](https://reader030.vdocuments.us/reader030/viewer/2022032805/56813214550346895d986dfa/html5/thumbnails/23.jpg)
Approximation of molecular surfaces
![Page 24: Non-linear Principal Manifolds a Useful Tool in Bioinformatics and Medical Applications](https://reader030.vdocuments.us/reader030/viewer/2022032805/56813214550346895d986dfa/html5/thumbnails/24.jpg)
Application: economic data
[Map colorings: gross output; density; profit; growth rate.]
![Page 25: Non-linear Principal Manifolds a Useful Tool in Bioinformatics and Medical Applications](https://reader030.vdocuments.us/reader030/viewer/2022032805/56813214550346895d986dfa/html5/thumbnails/25.jpg)
Medical table: 1700 patients with myocardial infarction
[Map colorings: lethal cases; patients map, density.]
![Page 26: Non-linear Principal Manifolds a Useful Tool in Bioinformatics and Medical Applications](https://reader030.vdocuments.us/reader030/viewer/2022032805/56813214550346895d986dfa/html5/thumbnails/26.jpg)
Medical table: 1700 patients with myocardial infarction
128 indicators
[Map colorings: age; number of infarctions in anamnesis; stenocardia (angina) functional class.]
![Page 27: Non-linear Principal Manifolds a Useful Tool in Bioinformatics and Medical Applications](https://reader030.vdocuments.us/reader030/viewer/2022032805/56813214550346895d986dfa/html5/thumbnails/27.jpg)
Codon usage in all genes of one genome
Escherichia coli Bacillus subtilis
Majority of genes
Highly expressed genes
“Foreign” genes
“Hydrophobic” genes
![Page 28: Non-linear Principal Manifolds a Useful Tool in Bioinformatics and Medical Applications](https://reader030.vdocuments.us/reader030/viewer/2022032805/56813214550346895d986dfa/html5/thumbnails/28.jpg)
Golub’s leukemia dataset: 3051 genes, 38 samples (ALL/B-cell, ALL/T-cell, AML)
[Map of genes. Panels: ALL sample; AML sample. Legend: vote for ALL; vote for AML; used by T. Golub; used by W. Lie.]
![Page 29: Non-linear Principal Manifolds a Useful Tool in Bioinformatics and Medical Applications](https://reader030.vdocuments.us/reader030/viewer/2022032805/56813214550346895d986dfa/html5/thumbnails/29.jpg)
Golub’s leukemia dataset, map of samples: AML, ALL/B-cell, ALL/T-cell
[Map colorings: density; Cystatin C; Retinoblastoma binding protein P48; CA2 Carbonic anhydrase II; X-linked Helicase II.]
![Page 30: Non-linear Principal Manifolds a Useful Tool in Bioinformatics and Medical Applications](https://reader030.vdocuments.us/reader030/viewer/2022032805/56813214550346895d986dfa/html5/thumbnails/30.jpg)
Thank you for your attention!
Questions?