exploring temporal graph data with python: a study on tensor decomposition of wearable sensor data...
![Page 1: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/1.jpg)
EXPLORING TEMPORAL GRAPH DATA WITH PYTHON: A STUDY ON TENSOR DECOMPOSITION OF WEARABLE SENSOR DATA
ANDRÉ PANISSON
@apanisson ISI Foundation, Torino, Italy & New York City
![Page 2: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/2.jpg)
WHY TENSOR FACTORIZATION + PYTHON?
▸ Matrix Factorization is already used in many fields
▸ Tensor Factorization is becoming very popular for multiway data analysis
▸ TF is very useful to explore temporal graph data
▸ But still, the most used tool is Matlab
▸ There’s room for improvement in the Python libraries for TF
▸ Study: NTF of wearable sensor data
![Page 3: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/3.jpg)
TENSORS AND TENSOR DECOMPOSITION
![Page 4: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/4.jpg)
FACTOR ANALYSIS
Spearman ~1900
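$X \approx WH$
$X_{\text{tests} \times \text{subjects}} \approx W_{\text{tests} \times \text{intelligences}} \, H_{\text{intelligences} \times \text{subjects}}$
Spearman, 1927: The abilities of man.
[Figure: X (tests × subjects) ≈ W (tests × Int.) · H (Int. × subjects)]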
![Page 5: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/5.jpg)
TOPIC MODELING / LATENT SEMANTIC ANALYSIS
Blei, David M. "Probabilistic topic models." Communications of the ACM 55.4 (2012): 77-84.
[Figure: topics (gene, dna, genetic; life, evolve, organism; brain, neuron, nerve; data, number, computer), documents, and topic proportions and assignments]
![Page 6: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/6.jpg)
TOPIC MODELING / LATENT SEMANTIC ANALYSIS
Non-negative Matrix Factorization (NMF): $X \approx WH$
(~1970 Lawson, ~1995 Paatero, ~2000 Lee & Seung)
2005 Gaussier et al. "Relation between PLSA and NMF and implications."
$$\operatorname*{argmin}_{W,H} \|X - WH\| \quad \text{s.t. } W, H \ge 0$$
[Figure: X (documents × terms) ≈ W (documents × topics) · H (topics × terms); X is a sparse matrix]
![Page 7: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/7.jpg)
NON-NEGATIVE MATRIX FACTORIZATION (NMF)
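NMF gives a part-based representation (Lee & Seung, Nature 1999)
[Figure: an original image and its factorized reconstructions with NMF and PCA]
NMF is equivalent to Spectral Clustering (Ding et al., SDM 2005)
$$\operatorname*{argmin}_{W,H} \|X - WH\| \quad \text{s.t. } W, H \ge 0$$
Multiplicative updates (V denotes the data matrix X, • the element-wise product, and the division is element-wise):
$$W \leftarrow W \bullet \frac{V H^T}{W H H^T} \qquad H \leftarrow H \bullet \frac{W^T V}{W^T W H}$$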
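The multiplicative update rules on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used in the talk: the function name `nmf_multiplicative`, the small constant `eps` added to the denominators, and the random test matrix are all assumptions made for the example.

```python
import numpy as np

def nmf_multiplicative(V, rank, n_iter=200, eps=1e-12, seed=0):
    """Hypothetical helper: NMF of a non-negative matrix V with multiplicative updates."""
    rng = np.random.default_rng(seed)
    # Random non-negative starting point
    W = rng.random((V.shape[0], rank))
    H = rng.random((rank, V.shape[1]))
    for _ in range(n_iter):
        # W <- W * (V H^T) / (W H H^T), element-wise; eps avoids division by zero
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        # H <- H * (W^T V) / (W^T W H), element-wise
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return W, H

# Made-up non-negative data with an exact rank-3 structure
rng = np.random.default_rng(1)
V = rng.random((20, 3)) @ rng.random((3, 15))

W1, H1 = nmf_multiplicative(V, rank=3, n_iter=1)
W, H = nmf_multiplicative(V, rank=3, n_iter=200)
err_start = np.linalg.norm(V - W1 @ H1)
err_end = np.linalg.norm(V - W @ H)
print(err_start, err_end)
```

Because every factor in the update is non-negative, W and H stay non-negative without any explicit projection step.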
![Page 8: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/8.jpg)
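import matplotlib.pyplot as plt
from sklearn import datasets, decomposition

# Load the 8x8 handwritten digits; each row of A is a flattened image
digits = datasets.load_digits()
A = digits.data

# Factorize A ≈ W H with 10 non-negative components
nmf = decomposition.NMF(n_components=10)
W = nmf.fit_transform(A)
H = nmf.components_

# Show each component (a row of H) as an 8x8 image
plt.rc("image", cmap="binary")
plt.figure(figsize=(8, 4))
for i in range(10):
    plt.subplot(2, 5, i + 1)
    plt.imshow(H[i].reshape(8, 8))
    plt.xticks(())
    plt.yticks(())
plt.tight_layout()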
![Page 9: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/9.jpg)
BEYOND MATRICES: HIGH DIMENSIONAL DATASETS
Cichocki et al. Nonnegative Matrix and Tensor Factorizations
Environmental analysis
▸ Measurement as a function of (Location, Time, Variable)
Sensory analysis
▸ Score as a function of (Food sample, Judge, Attribute)
Process analysis
▸ Measurement as a function of (Batch, Variable, Time)
Spectroscopy
▸ Intensity as a function of (Wavelength, Retention, Sample, Time, Location, …)
…
MULTIWAY DATA ANALYSIS
![Page 10: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/10.jpg)
DIGITAL TRACES FROM SENSORS AND IOT
USER POSITION TIME …
![Page 11: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/11.jpg)
Sidiropoulos, Giannakis and Bro, IEEE Trans. Signal Processing, 2000.
Mørup, Hansen and Arnfred, Journal of Neuroscience Methods, 2007.
Hazan, Polak and Shashua, ICCV 2005.
Bader, Berry, Browne, Survey of Text Mining: Clustering, Classification, and Retrieval, 2nd Ed., 2007.
Doostan and Iaccarino, Journal of Computational Physics, 2009.
Andersen and Bro, Journal of Chemometrics, 2003.
• Chemometrics
– Fluorescence Spectroscopy
– Chromatographic Data Analysis
• Neuroscience
– Epileptic Seizure Localization
– Analysis of EEG and ERP
• Signal Processing
• Computer Vision
– Image compression, classification
– Texture analysis
• Social Network Analysis
– Web link analysis
– Conversation detection in emails
– Text analysis
• Approximation of PDEs
data reconstruction, cluster analysis, compression, dimensionality reduction, latent semantic analysis, …
![Page 12: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/12.jpg)
TENSORS
![Page 13: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/13.jpg)
WHAT IS A TENSOR?
A tensor is a multidimensional array
E.g., three-way tensor:
[Figure: a three-way tensor with axes Mode-1, Mode-2 and Mode-3]
![Page 14: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/14.jpg)
FIBERS AND SLICES
Cichocki et al. Nonnegative Matrix and Tensor Factorizations
Column (Mode-1) Fibers: A[:, 4, 1]
Row (Mode-2) Fibers: A[1, :, 4]
Tube (Mode-3) Fibers: A[1, 3, :]
Horizontal Slices: A[1, :, :]
Lateral Slices: A[:, 1, :]
Frontal Slices: A[:, :, 1]
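As a small illustration of fibers and slices, the sketch below indexes a made-up 5 × 5 × 5 NumPy array. A fiber fixes every index but one (so a mode-2 fiber fixes the first and third indices), and a slice fixes exactly one index. The array contents and variable names are assumptions for the example only.

```python
import numpy as np

# Made-up three-way tensor; the values are just a running counter
A = np.arange(5 * 5 * 5).reshape(5, 5, 5)

# Fibers: one free index, the others fixed (results are 1-D)
column_fiber = A[:, 4, 1]   # mode-1 fiber
row_fiber = A[1, :, 4]      # mode-2 fiber
tube_fiber = A[1, 3, :]     # mode-3 fiber

# Slices: one fixed index, two free (results are 2-D)
horizontal_slice = A[1, :, :]
lateral_slice = A[:, 1, :]
frontal_slice = A[:, :, 1]

print(column_fiber.shape, row_fiber.shape, tube_fiber.shape)
print(horizontal_slice.shape, lateral_slice.shape, frontal_slice.shape)
```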
![Page 15: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/15.jpg)
TENSOR UNFOLDINGS: MATRICIZATION AND VECTORIZATION
Matricization: convert a tensor to a matrix
Vectorization: convert a tensor to a vector
![Page 16: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/16.jpg)
>>> import numpy as np
>>> T = np.arange(0, 24).reshape((3, 4, 2))
>>> T
array([[[ 0,  1],
        [ 2,  3],
        [ 4,  5],
        [ 6,  7]],

       [[ 8,  9],
        [10, 11],
        [12, 13],
        [14, 15]],

       [[16, 17],
        [18, 19],
        [20, 21],
        [22, 23]]])
OK for dense tensors: use a combination of transpose() and reshape()
Not simple for sparse datasets (e.g.: <authors, terms, time>)
# Print the mode-1 fibers in the order they appear as columns of the mode-1 unfolding
for j in range(2):
    for i in range(4):
        print(T[:, i, j])

[ 0  8 16]
[ 2 10 18]
[ 4 12 20]
[ 6 14 22]
[ 1  9 17]
[ 3 11 19]
[ 5 13 21]
[ 7 15 23]
# supposing the existence of unfold
>>> T.unfold(0)
array([[ 0,  2,  4,  6,  1,  3,  5,  7],
       [ 8, 10, 12, 14,  9, 11, 13, 15],
       [16, 18, 20, 22, 17, 19, 21, 23]])
>>> T.unfold(1)
array([[ 0,  8, 16,  1,  9, 17],
       [ 2, 10, 18,  3, 11, 19],
       [ 4, 12, 20,  5, 13, 21],
       [ 6, 14, 22,  7, 15, 23]])
>>> T.unfold(2)
array([[ 0,  8, 16,  2, 10, 18,  4, 12, 20,  6, 14, 22],
       [ 1,  9, 17,  3, 11, 19,  5, 13, 21,  7, 15, 23]])
![Page 17: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/17.jpg)
RANK-1 TENSOR
The outer product of N vectors results in a rank-1 tensor
$T = a^{(1)} \circ \cdots \circ a^{(N)}$
$T_{i,j,k} = a_i b_j c_k$

a = np.array([1, 2, 3])
b = np.array([1, 2, 3, 4])
c = np.array([1, 2])

T = np.zeros((a.shape[0], b.shape[0], c.shape[0]))

for i in range(a.shape[0]):
    for j in range(b.shape[0]):
        for k in range(c.shape[0]):
            T[i, j, k] = a[i] * b[j] * c[k]

array([[[  1.,   2.],
        [  2.,   4.],
        [  3.,   6.],
        [  4.,   8.]],

       [[  2.,   4.],
        [  4.,   8.],
        [  6.,  12.],
        [  8.,  16.]],

       [[  3.,   6.],
        [  6.,  12.],
        [  9.,  18.],
        [ 12.,  24.]]])
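The triple loop for the rank-1 tensor can also be written without explicit loops. The sketch below shows two equivalent NumPy formulations, `np.einsum` and broadcasting, and checks them against the loop. The variable names `T_einsum` and `T_broadcast` are chosen for this example only.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([1, 2, 3, 4])
c = np.array([1, 2])

# Reference: explicit loops, T[i, j, k] = a[i] * b[j] * c[k]
T_loop = np.zeros((a.shape[0], b.shape[0], c.shape[0]))
for i in range(a.shape[0]):
    for j in range(b.shape[0]):
        for k in range(c.shape[0]):
            T_loop[i, j, k] = a[i] * b[j] * c[k]

# Outer product of three vectors with einsum
T_einsum = np.einsum('i,j,k->ijk', a, b, c)

# Same result with broadcasting
T_broadcast = a[:, None, None] * b[None, :, None] * c[None, None, :]

print(np.array_equal(T_loop, T_einsum), np.array_equal(T_loop, T_broadcast))
```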
![Page 18: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/18.jpg)
TENSOR RANK
▸ Every tensor can be written as a sum of rank-1 tensors
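[Figure: a tensor written as a sum of rank-1 terms, $a_1 \circ b_1 \circ c_1 + \cdots + a_J \circ b_J \circ c_J$]
▸ Tensor rank: the smallest number of rank-1 tensors whose sum generates the tensor
$$X \approx \sum_{r=1}^{R} a^{(1)}_r \circ a^{(2)}_r \circ \cdots \circ a^{(N)}_r \equiv [\![A^{(1)}, A^{(2)}, \cdots, A^{(N)}]\!]$$
$$T \approx \sum_{r=1}^{R} a_r \circ b_r \circ c_r \equiv [\![A, B, C]\!]$$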
![Page 19: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/19.jpg)
Kruskal Tensor: $T \approx \sum_{r=1}^{R} a_r \circ b_r \circ c_r \equiv [\![A, B, C]\!]$

A = np.array([[1, 2, 3], [4, 5, 6]]).T
B = np.array([[1, 2, 3, 4], [5, 6, 7, 8]]).T
C = np.array([[1, 2], [3, 4]]).T

T = np.zeros((A.shape[0], B.shape[0], C.shape[0]))
for i in range(A.shape[0]):
    for j in range(B.shape[0]):
        for k in range(C.shape[0]):
            for r in range(A.shape[1]):
                T[i, j, k] += A[i, r] * B[j, r] * C[k, r]

# Equivalent, without explicit loops
T = np.einsum('ir,jr,kr->ijk', A, B, C)

array([[[  61.,   82.],
        [  74.,  100.],
        [  87.,  118.],
        [ 100.,  136.]],

       [[  77.,  104.],
        [  94.,  128.],
        [ 111.,  152.],
        [ 128.,  176.]],

       [[  93.,  126.],
        [ 114.,  156.],
        [ 135.,  186.],
        [ 156.,  216.]]])
![Page 20: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/20.jpg)
TENSOR FACTORIZATION
▸ CANDECOMP/PARAFAC factorization (CP)
▸ extensions of SVD / PCA / NMF of matrices
NON-NEGATIVE TENSOR FACTORIZATION
▸ Decompose a non-negative tensor into a sum of R non-negative rank-1 tensors
$$\operatorname*{argmin}_{A,B,C} \| T - [\![A, B, C]\!] \| \quad \text{with } [\![A, B, C]\!] \equiv \sum_{r=1}^{R} a_r \circ b_r \circ c_r \quad \text{subject to } A \ge 0,\ B \ge 0,\ C \ge 0$$
![Page 21: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/21.jpg)
TENSOR FACTORIZATION: HOW TO
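Alternating Least Squares (ALS): fix all but one factor matrix, to which least squares is applied
$$\min_{A \ge 0} \| T_{(1)} - A (C \odot B)^T \|$$
$$\min_{B \ge 0} \| T_{(2)} - B (C \odot A)^T \|$$
$$\min_{C \ge 0} \| T_{(3)} - C (B \odot A)^T \|$$
$\odot$ denotes the Khatri-Rao product, which is a column-wise Kronecker product, i.e., $C \odot B = [c_1 \otimes b_1, c_2 \otimes b_2, \ldots, c_R \otimes b_R]$
$T_{(k)}$ is the tensor unfolded on the k-th mode:
$$T_{(1)} = A (C \odot B)^T$$
$$T_{(2)} = B (C \odot A)^T$$
$$T_{(3)} = C (B \odot A)^T$$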
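The unfolding identities above can be checked numerically on a small Kruskal tensor. The sketch below assumes dense NumPy arrays and defines two helpers, `khatri_rao` and `unfold`, that are written for this example (they are not taken from the talk or from a library). The unfolding convention is the one in which the earliest remaining index varies fastest along the columns.

```python
import numpy as np

def khatri_rao(X, Y):
    """Hypothetical helper: column-wise Kronecker product of X (I x R) and Y (J x R)."""
    R = X.shape[1]
    # Row i * J + j of the result holds X[i, r] * Y[j, r]
    return np.einsum('ir,jr->ijr', X, Y).reshape(-1, R)

def unfold(T, mode):
    """Hypothetical helper: mode-n unfolding, earliest remaining index fastest."""
    return np.reshape(np.moveaxis(T, mode, 0), (T.shape[mode], -1), order="F")

# Factor matrices of a small Kruskal tensor with R = 2 components
A = np.array([[1, 2, 3], [4, 5, 6]]).T
B = np.array([[1, 2, 3, 4], [5, 6, 7, 8]]).T
C = np.array([[1, 2], [3, 4]]).T
T = np.einsum('ir,jr,kr->ijk', A, B, C)

ok_1 = np.allclose(unfold(T, 0), A @ khatri_rao(C, B).T)
ok_2 = np.allclose(unfold(T, 1), B @ khatri_rao(C, A).T)
ok_3 = np.allclose(unfold(T, 2), C @ khatri_rao(B, A).T)
print(ok_1, ok_2, ok_3)
```

With these identities, each ALS step reduces to an ordinary (non-negative) least-squares problem on a matrix.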
![Page 22: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/22.jpg)
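import numpy as np

# n, m, t: tensor dimensions; r: number of components.
# Random non-negative start: all-zero factors would never change under a multiplicative update.
F = [np.random.rand(n, r), np.random.rand(m, r), np.random.rand(t, r)]
FF_init = np.array([f.T.dot(f) for f in F])

def iter_solver(T, F, FF_init):
    r = F[0].shape[1]
    # Update each factor
    for k in range(len(F)):
        # Compute the inner-product matrix
        FF = np.ones((r, r))
        for i in list(range(k)) + list(range(k + 1, len(F))):
            FF = FF * FF_init[i]

        # Unfolded tensor times Khatri-Rao product
        # (T is assumed to provide uttkrp, as scikit-tensor tensors do)
        XF = T.uttkrp(F, k)

        F[k] = F[k] * XF / (F[k].dot(FF))
        # F[k] = nnls(FF, XF.T).T

        FF_init[k] = F[k].T.dot(F[k])
    return F, FF_init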
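NMF:
$$\operatorname*{argmin}_{W,H} \|X - WH\| \quad \text{s.t. } W, H \ge 0$$
$$W \leftarrow W \bullet \frac{V H^T}{W H H^T} \qquad H \leftarrow H \bullet \frac{W^T V}{W^T W H}$$
NTF:
$$\min_{A \ge 0} \| T_{(1)} - A (C \odot B)^T \|$$
$$\min_{B \ge 0} \| T_{(2)} - B (C \odot A)^T \|$$
$$\min_{C \ge 0} \| T_{(3)} - C (B \odot A)^T \|$$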
J. Kim and H. Park. Fast Nonnegative Tensor Factorization with an Active-set-like Method. In High-Performance Scientific Computing: Algorithms and Applications, Springer, 2012, pp. 311-326.
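For readers without a tensor library at hand, the iteration on this slide can be sketched for a dense three-way NumPy array. This is an illustrative sketch under stated assumptions, not the implementation from the talk: the `uttkrp` call is replaced by an `np.einsum` contraction, the function names (`mttkrp`, `ntf_multiplicative`, `reconstruct`) and the small constant `eps` are introduced for the example, and the test tensor is made up.

```python
import numpy as np

def mttkrp(T, F, k):
    """Hypothetical helper: mode-k unfolded tensor times the Khatri-Rao
    product of the other two factors, for a dense three-way tensor."""
    A, B, C = F
    if k == 0:
        return np.einsum('ijk,jr,kr->ir', T, B, C)
    if k == 1:
        return np.einsum('ijk,ir,kr->jr', T, A, C)
    return np.einsum('ijk,ir,jr->kr', T, A, B)

def reconstruct(F):
    """Kruskal tensor built from the three factor matrices."""
    A, B, C = F
    return np.einsum('ir,jr,kr->ijk', A, B, C)

def ntf_multiplicative(T, rank, n_iter=200, eps=1e-12, seed=0):
    rng = np.random.default_rng(seed)
    # Random non-negative initial factors, one per mode
    F = [rng.random((dim, rank)) for dim in T.shape]
    for _ in range(n_iter):
        for k in range(3):
            # Hadamard product of the Gram matrices of the other factors
            FF = np.ones((rank, rank))
            for i in range(3):
                if i != k:
                    FF *= F[i].T @ F[i]
            XF = mttkrp(T, F, k)
            # Multiplicative update; eps avoids division by zero
            F[k] = F[k] * XF / (F[k] @ FF + eps)
    return F

# Made-up non-negative tensor with an exact two-component structure
rng = np.random.default_rng(1)
T = reconstruct([rng.random((6, 2)), rng.random((5, 2)), rng.random((4, 2))])

err_first = np.linalg.norm(T - reconstruct(ntf_multiplicative(T, rank=2, n_iter=1)))
F = ntf_multiplicative(T, rank=2, n_iter=200)
err_last = np.linalg.norm(T - reconstruct(F))
print(err_first, err_last)
```

The multiplicative form keeps the factors non-negative by construction; the commented-out `nnls` line on the slide points to the alternative of solving each subproblem with a non-negative least-squares solver.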
![Page 23: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/23.jpg)
HOW TO INTERPRET: USER X TERM X TIME
X is a 3-way tensor in which $x_{nmt}$ is 1 if the term m was used by user n at interval t, 0 otherwise
A (N×K) is the association of each user n to a factor k
B (M×K) is the association of each term m to a factor k
C (T×K) shows the time activity of each factor
[Figure: X (N×M×T; users × terms × time) = [[A, B, C]], with A (N×K; users × factors), B (M×K; terms × factors) and C (T×K; time × factors)]
![Page 24: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/24.jpg)
http://www.datainterfaces.org/2013/06/twitter-topic-explorer/
![Page 25: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/25.jpg)
TOOLS FOR TENSOR DECOMPOSITION
![Page 26: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/26.jpg)
TOOLS FOR TENSOR FACTORIZATION
![Page 27: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/27.jpg)
TOOLS: THE PYTHON WORLD
NumPy SciPy
Scikit-Tensor (under development): github.com/mnick/scikit-tensor
NTF: gist.github.com/panisson/7719245
![Page 28: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/28.jpg)
TENSOR DECOMPOSITION OF WEARABLE SENSOR DATA
![Page 29: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/29.jpg)
![Page 30: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/30.jpg)
recorded proximity data
direct proximity sensing
![Page 31: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/31.jpg)
primary school
Lyon, France primary school 231 students 10 teachers
![Page 32: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/32.jpg)
Hong Kong primary school 900 students 65 teachers
![Page 33: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/33.jpg)
SocioPatterns.org
7 years, 30+ deployments, 10 countries, 50,000+ persons
• Mongan Institute for Health Policy, Boston
• US Army Medical Component of the Armed Forces, Bangkok
• School of Public Health of the University of Hong Kong
• KEMRI Wellcome Trust, Kenya
• London School for Hygiene and Tropical Medicine, London
• Public Health England, London
• Saw Swee Hock School of Public Health, Singapore
![Page 34: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/34.jpg)
TENSORS
![Page 35: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/35.jpg)
FROM TEMPORAL GRAPHS TO 3-WAY TENSORS
0 1 0
1 0 1
0 1 0
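One way to picture the step from a temporal graph to a three-way tensor is to stack one adjacency matrix per time interval. The sketch below assumes a hypothetical list of undirected contact records `(node_i, node_j, time_interval)`; the records, sizes and variable names are made up for illustration and do not come from the sensor data set.

```python
import numpy as np

# Hypothetical contact records: (node_i, node_j, time_interval)
contacts = [(0, 1, 0), (1, 2, 0), (0, 1, 1), (0, 2, 2), (1, 2, 2)]
n_nodes = 3
n_intervals = 3

# T[i, j, t] = 1 if nodes i and j were in contact during interval t
T = np.zeros((n_nodes, n_nodes, n_intervals))
for i, j, t in contacts:
    T[i, j, t] = 1
    T[j, i, t] = 1  # contacts are undirected, so every time slice is symmetric

# Frontal slice for the first interval: the adjacency matrix of that snapshot
print(T[:, :, 0])
```

For large, sparse contact data a dense array like this would waste memory; a sparse tensor representation would be the natural replacement.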
![Page 36: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/36.jpg)
temporal network → tensorial representation → tensor factorization → factors A, B (communities) and C (temporal activity)
factorization quality (quality metrics per number of components): tuning the complexity of the model
structures in temporal networks
[Figure: nodes × communities, with classes 1B, 5A, 3B, 5B, 2B, 2A, 3A, 4A, 1A, 4B]
[Figure 2: Temporal activity of each community (component × time interval)]
![Page 37: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/37.jpg)
TENSOR DECOMPOSITION OF SCHOOL NETWORK
[Figure: school classes 1B, 5A, 3B, 5B, 2B, 2A, 3A, 4A, 1A, 4B]
L. Gauvin et al., PLoS ONE 9(1), e86028 (2014)
![Page 39: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/39.jpg)
ANOMALY DETECTION IN TEMPORAL NETWORKS
![Page 40: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/40.jpg)
ANOMALY DETECTION IN TEMPORAL NETWORKS
A. Sapienza et al. "Detecting anomalies in time-varying networks using tensor decomposition", ICDM Data Mining in Networks
![Page 41: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/41.jpg)
anomaly detection in temporal networks
![Page 42: Exploring temporal graph data with Python: a study on tensor decomposition of wearable sensor data (PyData NYC 2015)](https://reader036.vdocuments.us/reader036/viewer/2022062502/589b864d1a28abc0098b4693/html5/thumbnails/42.jpg)
Laetitia Gauvin, Ciro Cattuto, Anna Sapienza
.fit().predict()