tensor decomposition based unsupervised feature extraction applied to matrix products for multiview...
TRANSCRIPT
![Page 1: Tensor decomposition based unsupervised feature extraction applied to matrix products for multiview data processing](https://reader035.vdocuments.us/reader035/viewer/2022062523/5a653c2b7f8b9a8c388b47cb/html5/thumbnails/1.jpg)
Tensor decomposition-based unsupervised feature extraction applied to matrix products
for multiview data processing
Y-h. Taguchi
Department of Physics, Chuo University, Tokyo, Japan.
PLoS ONE 12(8): e0183933. DOI: 10.1371/journal.pone.0183933
![Page 2: Tensor decomposition based unsupervised feature extraction applied to matrix products for multiview data processing](https://reader035.vdocuments.us/reader035/viewer/2022062523/5a653c2b7f8b9a8c388b47cb/html5/thumbnails/2.jpg)
What's typical in Bioinformatics?
Few samples (a handful), a huge number of variables (= genes, $\sim 10^4$) → a typical “large p, small n” problem
Difficult to apply usual statistical analyses
e.g. small samples → deep learning ×; “large p, small n” problem → sparse modeling (lasso) variable selection ×
Approaches specific to bioinformatics are required
![Page 3: Tensor decomposition based unsupervised feature extraction applied to matrix products for multiview data processing](https://reader035.vdocuments.us/reader035/viewer/2022062523/5a653c2b7f8b9a8c388b47cb/html5/thumbnails/3.jpg)
Purpose: multiview data analysis
[Figure: a persons × features matrix and a persons × shoppings matrix that share the persons axis. Highlighted entries: features A, B, D, M; persons β, δ, μ; shoppings 1, 3, 4.]
![Page 4: Tensor decomposition based unsupervised feature extraction applied to matrix products for multiview data processing](https://reader035.vdocuments.us/reader035/viewer/2022062523/5a653c2b7f8b9a8c388b47cb/html5/thumbnails/4.jpg)
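matrix → tensor
$x_{ij} \times x_{il} \rightarrow x_{ijl}$
Tensor decomposition: core tensor $G$ with factors $x_{ik_1}$, $x_{jk_2}$, $x_{lk_3}$
$$x_{ijl} = x_{ij} \times x_{il} \simeq \sum_{k_1,k_2,k_3} G(k_1,k_2,k_3)\, x_{ik_1} x_{jk_2} x_{lk_3}$$
i: persons, j: features, l: shoppings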
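The construction on this slide can be sketched in a few lines of NumPy. This is only an illustration under assumptions: the slide does not say which algorithm computes the decomposition, so the sketch uses a higher-order SVD (HOSVD), one common way to obtain a core tensor plus one factor matrix per mode. The helper names `unfold` and `hosvd` and the toy sizes are hypothetical.

```python
import numpy as np

def unfold(tensor, mode):
    """Matricize: move `mode` to the front and flatten the other axes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def hosvd(tensor):
    """Return (core, factors) for a 3-mode tensor via HOSVD (assumed algorithm)."""
    # One factor matrix per mode: left singular vectors of each unfolding.
    factors = [np.linalg.svd(unfold(tensor, m), full_matrices=False)[0]
               for m in range(tensor.ndim)]
    # Core tensor G(k1, k2, k3): project the tensor onto the factors.
    core = np.einsum('ijl,ia,jb,lc->abc', tensor, *factors)
    return core, factors

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 6))    # x_ij: persons x features (toy sizes)
Y = rng.standard_normal((4, 5))    # x_il: persons x shoppings (toy sizes)

T = np.einsum('ij,il->ijl', X, Y)  # x_ijl = x_ij * x_il
G, (U1, U2, U3) = hosvd(T)

# Rebuild x_ijl from the core and the three factor matrices.
T_hat = np.einsum('abc,ia,jb,lc->ijl', G, U1, U2, U3)
print(np.allclose(T, T_hat))
```

Here the columns of `U1`, `U2` and `U3` play the roles of $x_{ik_1}$, $x_{jk_2}$ and $x_{lk_3}$, and `G` is the core tensor.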
![Page 5: Tensor decomposition based unsupervised feature extraction applied to matrix products for multiview data processing](https://reader035.vdocuments.us/reader035/viewer/2022062523/5a653c2b7f8b9a8c388b47cb/html5/thumbnails/5.jpg)
Demonstration using synthetic data set
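[Figure: two matrices, each with 50 samples × 1000 variables. In each matrix, 50 variables carry a signal + 20% noise and the remaining variables are 100% noise; the figure is labeled "No correlations". Their product gives a 50 × 1000 × 1000 tensor, which is passed to tensor decomposition.]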
![Page 6: Tensor decomposition based unsupervised feature extraction applied to matrix products for multiview data processing](https://reader035.vdocuments.us/reader035/viewer/2022062523/5a653c2b7f8b9a8c388b47cb/html5/thumbnails/6.jpg)
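[Figure: decomposition results for the synthetic data.
persons: $x_{ik_1}$ for $k_1 = 1, 2, 3$ ($1 \le i \le 50$);
features: $x_{jk_2}$ for $k_2 = 1, 2$ ($1 \le j \le 1000$);
shoppings: $x_{lk_3}$ for $k_3 = 1, 2$ ($1 \le l \le 1000$).]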
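A scaled-down version of the synthetic experiment can be sketched as follows. The exact layout of the original data is not recoverable from the slides, so everything here is an assumption: the sizes (40 × 100 × 100 instead of 50 × 1000 × 1000), the Gaussian signal shape, the placement of the signal variables in the first columns, and the helper name `make_view` are all hypothetical. The sketch only checks whether the first singular vector of the feature mode puts more weight on the signal variables than on the pure-noise ones.

```python
import numpy as np

rng = np.random.default_rng(0)
n, J, L, n_sig = 40, 100, 100, 25   # hypothetical, scaled-down sizes

def make_view(n_vars):
    """First n_sig columns share one signal (+20% noise); the rest are pure noise."""
    signal = rng.standard_normal(n)
    view = rng.standard_normal((n, n_vars))
    view[:, :n_sig] = signal[:, None] + 0.2 * rng.standard_normal((n, n_sig))
    return view

X = make_view(J)   # persons x features
Y = make_view(L)   # persons x shoppings, with an independently drawn signal

T = np.einsum('ij,il->ijl', X, Y)                # x_ijl = x_ij * x_il
unfolded = np.moveaxis(T, 1, 0).reshape(J, -1)   # feature-mode unfolding
U2 = np.linalg.svd(unfolded, full_matrices=False)[0]

u1 = np.abs(U2[:, 0])                            # |x_{j, k2=1}|
print("mean |u| on signal variables:", u1[:n_sig].mean())
print("mean |u| on noise variables: ", u1[n_sig:].mean())
```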
![Page 7: Tensor decomposition based unsupervised feature extraction applied to matrix products for multiview data processing](https://reader035.vdocuments.us/reader035/viewer/2022062523/5a653c2b7f8b9a8c388b47cb/html5/thumbnails/7.jpg)
Advantages as multiview data analysis tools
・No weights are required to integrate multiple views
・Completely unsupervised learning (no model building using prior knowledge)
・Smaller computational resources because of linearity
Disadvantages
・Tends to require more memory. Solution: summing over i, $\sum_i x_{ij} \times x_{il}$, gives a j × l matrix whose decomposition can be converted back (explanation omitted).
・If neither features nor samples are shared, the result is a four-mode tensor.
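The memory-saving step can be illustrated with a small sketch. The summation $\sum_i x_{ij} x_{il}$ is from the slide; the rest is an assumption, because the slide omits how the result is "converted back". Here the j × l matrix is decomposed by an ordinary SVD and sample-side vectors are obtained by projecting each view onto the singular vectors, which is one plausible reading and not necessarily the author's procedure. Sizes and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, J, L = 50, 300, 200              # hypothetical sizes
X = rng.standard_normal((n, J))     # x_ij
Y = rng.standard_normal((n, L))     # x_il

# Sum over the shared index i: a J x L matrix instead of an n x J x L tensor.
M = X.T @ Y

U, s, Vt = np.linalg.svd(M, full_matrices=False)
k = 3                               # number of components kept (arbitrary)
feat_vecs = U[:, :k]                # feature-side vectors, J x k
shop_vecs = Vt[:k].T                # shopping-side vectors, L x k

# ASSUMED conversion back to the sample side: project each view.
samp_from_X = X @ feat_vecs         # n x k
samp_from_Y = Y @ shop_vecs         # n x k

print("entries stored:", M.size, "instead of", n * J * L)
```

The summed matrix holds J × L values, a factor of n fewer than the full tensor.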
![Page 8: Tensor decomposition based unsupervised feature extraction applied to matrix products for multiview data processing](https://reader035.vdocuments.us/reader035/viewer/2022062523/5a653c2b7f8b9a8c388b47cb/html5/thumbnails/8.jpg)
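Feature extraction
No real data are separated this well
Assume Gaussian → detect outliers
$$P_i = P_{\chi^2}\left[ > \sum_k \left( \frac{x_{ik}}{\sigma} \right)^2 \right]$$
P-values by $\chi^2$ distribution
Benjamini-Hochberg corrected P < 0.01
[Figure: histogram $P(p)$ of $1-p$, axis from 0 to 1]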
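The selection rule above can be sketched with the standard library alone. Several details are assumptions not stated on the slide: σ is estimated as the standard deviation of each selected component, the degrees of freedom equal the number of components summed over, and the function names are hypothetical. `chi2_sf` computes the upper-tail probability of the $\chi^2$ distribution for integer degrees of freedom (the same quantity that `scipy.stats.chi2.sf` returns).

```python
import math
import statistics

def chi2_sf(x, dof):
    """Upper-tail probability P[chi2_dof > x] for integer dof >= 1."""
    if dof % 2:
        q, k = math.erfc(math.sqrt(x / 2.0)), 1
    else:
        q, k = math.exp(-x / 2.0), 2
    while k < dof:  # recurrence from dof k to dof k + 2
        q += (x / 2.0) ** (k / 2.0) * math.exp(-x / 2.0) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):          # walk from largest to smallest P
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def select_outliers(vectors, alpha=0.01):
    """vectors: one list per selected component, one value per feature."""
    sigmas = [statistics.pstdev(v) for v in vectors]   # ASSUMED estimate of sigma
    n_feat = len(vectors[0])
    stats = [sum((v[i] / s) ** 2 for v, s in zip(vectors, sigmas))
             for i in range(n_feat)]
    pvals = [chi2_sf(t, len(vectors)) for t in stats]
    adjusted = benjamini_hochberg(pvals)
    return [i for i, q in enumerate(adjusted) if q < alpha]

# Toy component: 200 small values plus two clear outliers at the end.
toy = [0.01 * ((i % 7) - 3) for i in range(200)] + [1.0, -1.2]
selected = select_outliers([toy])
print(selected)
```

In the toy run only the two appended outliers pass the corrected threshold; with real singular vectors the same call would take the components chosen for one mode.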
![Page 9: Tensor decomposition based unsupervised feature extraction applied to matrix products for multiview data processing](https://reader035.vdocuments.us/reader035/viewer/2022062523/5a653c2b7f8b9a8c388b47cb/html5/thumbnails/9.jpg)
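Applications: multiomics data
[Figure: mRNA and miRNA profiles measured over the same samples (sample1 to sample5), divided into an A group and a B group; labels in the figure: "active", "expression", "interaction".]
$x_{ij} \times x_{il}$; i: 161 samples, j: 13393 mRNAs, l: 755 miRNAs (8 groups)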
![Page 10: Tensor decomposition based unsupervised feature extraction applied to matrix products for multiview data processing](https://reader035.vdocuments.us/reader035/viewer/2022062523/5a653c2b7f8b9a8c388b47cb/html5/thumbnails/10.jpg)
Selection of $x_{ik_1}$ distinct between symptoms
[Figure: panels for $k_1 = 1, 2, 3, 4, 5$, with P-values]
$1 \le k_1 \le 5$ are symptom dependent
![Page 11: Tensor decomposition based unsupervised feature extraction applied to matrix products for multiview data processing](https://reader035.vdocuments.us/reader035/viewer/2022062523/5a653c2b7f8b9a8c388b47cb/html5/thumbnails/11.jpg)
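[Table: $k_2$, $k_3$, $k_1$ and $G(k_1,k_2,k_3)$ for $1 \le k_1, k_2, k_3 \le 5$, ordered from larger G to smaller G]
$k_1$: sample, $k_2$: mRNA, $k_3$: miRNA
Selected: $1 \le k_2 \le 5$, $1 \le k_3 \le 2$
$x_{jk_2}$, $x_{lk_3}$: assume Gaussian → detect outliers
P-values by $\chi^2$ distribution
Benjamini-Hochberg corrected P < 0.01
Result: 7 miRNAs out of 755 miRNAs, 427 mRNAs out of 13393 mRNAs (biological validation omitted)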
![Page 12: Tensor decomposition based unsupervised feature extraction applied to matrix products for multiview data processing](https://reader035.vdocuments.us/reader035/viewer/2022062523/5a653c2b7f8b9a8c388b47cb/html5/thumbnails/12.jpg)
Summary
・As a feature selection method for multiview data, I propose applying tensor decomposition to a tensor generated as a product of matrices, then selecting the features whose BH-corrected P-values, computed from a $\chi^2$ distribution assumed for a mode, are < 0.01.
・For the synthetic data set, apparently uncorrelated variables embedded in noise are decomposed into the original orthogonal vectors once the correlated variables are identified.
・For the multiomics data set, a small fraction (a few %) of intercorrelated and biologically reasonable miRNAs and mRNAs are identified among a huge number of mRNAs and miRNAs.
![Page 13: Tensor decomposition based unsupervised feature extraction applied to matrix products for multiview data processing](https://reader035.vdocuments.us/reader035/viewer/2022062523/5a653c2b7f8b9a8c388b47cb/html5/thumbnails/13.jpg)
My presentation in GIW2017: GIW 7 RNA Bioinformatics, 2nd Nov., morning (ca. 10 AM), at Adonis (1F)
Tensor decomposition-based unsupervised feature extraction identified the universal nature of sequence-nonspecific off-target regulation of mRNA mediated by microRNA transfection