The Canonical Tensor Decomposition and Its Applications to Social Network Analysis

Evrim Acar, Tamara G. Kolda and Daniel M. Dunlavy
Sandia National Labs

Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.


Page 2

What is Canonical Tensor Decomposition?

CANDECOMP/PARAFAC (CP) model [Hitchcock '27, Harshman '70, Carroll & Chang '70]:

X ≈ a1 ∘ b1 ∘ c1 + … + aR ∘ bR ∘ cR

[Figure: an I × J × K tensor written as a sum of R rank-one components.]
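As a concrete sketch of the model (an illustration in NumPy, not the authors' MATLAB code), the sum of R rank-one outer products can be written as:

```python
import numpy as np

def cp_reconstruct(A, B, C):
    """Sum of R rank-one terms a_r (outer) b_r (outer) c_r, where
    A is I x R, B is J x R, C is K x R; returns an I x J x K tensor."""
    return np.einsum('ir,jr,kr->ijk', A, B, C)

# Example: a 3 x 4 x 5 tensor built from R = 2 components
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 2))
B = rng.standard_normal((4, 2))
C = rng.standard_normal((5, 2))
X = cp_reconstruct(A, B, C)
```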


Page 4

CP Application: Neuroscience

[Figure: an EEG array (time samples × scales × channels) decomposed into components a1 ∘ b1 ∘ c1 + a2 ∘ b2 ∘ c2.]

Epileptic Seizure Localization: Acar et al., 2007; De Vos et al., 2007.

Page 5

CP has Numerous Applications!

• Chemometrics
  – Fluorescence Spectroscopy (Andersen and Bro, Journal of Chemometrics, 2003)
  – Chromatographic Data Analysis
• Neuroscience
  – Epileptic Seizure Localization
  – Analysis of EEG and ERP (Mørup, Hansen and Arnfred, Journal of Neuroscience Methods, 2007)
• Signal Processing (Sidiropoulos, Giannakis and Bro, IEEE Trans. Signal Processing, 2000)
• Computer Vision
  – Image compression, classification
  – Texture analysis (Hazan, Polak and Shashua, ICCV 2005)
• Social Network Analysis
  – Web link analysis
  – Conversation detection in emails
  – Text analysis (Bader, Berry, Browne, Survey of Text Mining: Clustering, Classification, and Retrieval, 2nd Ed., 2007)
• Approximation of PDEs (Doostan and Iaccarino, Journal of Computational Physics, 2009)

Page 6

Algorithms: How Can We Compute CP?

Page 7

Mathematical Details for CP

Unfolding (Matricization)
Columns: mode-1 fibers
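A minimal NumPy sketch of matricization (note: papers differ on how the remaining modes are ordered; this is one common convention, not necessarily the one on the slides):

```python
import numpy as np

def unfold(X, mode):
    """Matricize X along `mode`: bring that mode to the front and
    flatten the rest, so each column of the result is a mode-`mode`
    fiber of X."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

X = np.arange(24).reshape(2, 3, 4)   # a 2 x 3 x 4 tensor
X1 = unfold(X, 0)                    # 2 x 12: columns are mode-1 fibers
```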

Page 8

Mathematical Details for CP

Unfolding (Matricization)
Rows: mode-2 fibers

Page 9

Mathematical Details for CP

Unfolding (Matricization)
Tubes: mode-3 fibers

In unfolded form, the CP model X ≈ a1 ∘ b1 ∘ c1 + … + aR ∘ bR ∘ cR becomes X_(1) ≈ A (C ⊙ B)ᵀ, where ⊙ denotes the matrix Khatri-Rao product.
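The Khatri-Rao product itself is just a column-wise Kronecker product; a short NumPy sketch:

```python
import numpy as np

def khatri_rao(B, C):
    """Matrix Khatri-Rao product: the column-wise Kronecker product of
    B (J x R) and C (K x R), giving a JK x R matrix whose r-th column
    is kron(B[:, r], C[:, r])."""
    J, R = B.shape
    K, R2 = C.shape
    assert R == R2, "factor matrices must have the same number of columns"
    return (B[:, None, :] * C[None, :, :]).reshape(J * K, R)

rng = np.random.default_rng(1)
B = rng.standard_normal((4, 3))
C = rng.standard_normal((5, 3))
KR = khatri_rao(B, C)   # 20 x 3
```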

Page 10

CP is a Nonlinear Optimization Problem

Given an I × J × K tensor X and R (# of components), find matrices A, B, C that solve the optimization problem

min_x f(x) = ½ ‖X − Σ_{r=1}^R a_r ∘ b_r ∘ c_r‖²   (Objective Function)

where the vector x comprises the entries of A, B, and C stacked column-wise: (I + J + K)R variables.

Page 11

Traditional Approach: CPALS

CPALS, dating back to Harshman '70 and Carroll & Chang '70, solves for one factor matrix at a time.

Alternating Algorithm:
for k = 1, …
  A ← arg min_A ‖X_(1) − A (C ⊙ B)ᵀ‖²
  B ← arg min_B ‖X_(2) − B (C ⊙ A)ᵀ‖²
  C ← arg min_C ‖X_(3) − C (B ⊙ A)ᵀ‖²
end

Each step can be converted to a matrix least-squares problem. For A, with X_(1) of size I × JK and C ⊙ B of size JK × R, only an R × R matrix must be inverted:
A = X_(1) (C ⊙ B) [(CᵀC) ∗ (BᵀB)]⁻¹.
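One ALS sweep can be sketched in a few lines of NumPy (an illustration, not the Tensor Toolbox parafac_als; unfolding conventions are chosen to be self-consistent):

```python
import numpy as np

def khatri_rao(U, V):
    m, R = U.shape
    n, _ = V.shape
    return (U[:, None, :] * V[None, :, :]).reshape(m * n, R)

def cp_als_sweep(X, A, B, C):
    """One sweep of CP-ALS: solve for A, then B, then C, each with the
    other two factors held fixed.  Each update is a least-squares solve
    in which only an R x R Gram matrix is (pseudo-)inverted."""
    I, J, K = X.shape
    def update(Xn, U, V):
        G = (U.T @ U) * (V.T @ V)          # R x R Gram matrix
        return Xn @ khatri_rao(U, V) @ np.linalg.pinv(G)
    A = update(X.reshape(I, -1), B, C)                      # mode-1 unfolding
    B = update(np.moveaxis(X, 1, 0).reshape(J, -1), A, C)   # mode-2
    C = update(np.moveaxis(X, 2, 0).reshape(K, -1), A, B)   # mode-3
    return A, B, C

# Example: fit a noise-free rank-2 tensor from a random start
rng = np.random.default_rng(2)
A0, B0, C0 = (rng.standard_normal((n, 2)) for n in (4, 5, 6))
X = np.einsum('ir,jr,kr->ijk', A0, B0, C0)
A, B, C = (rng.standard_normal((n, 2)) for n in (4, 5, 6))
for _ in range(10):
    A, B, C = cp_als_sweep(X, A, B, C)
```

Each subproblem is solved exactly, so the objective is non-increasing across a sweep, which is the monotonicity property ALS relies on.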

Page 12

Traditional Approach: CPALS

Repeat the alternating least-squares steps until "convergence".

• Very fast, but not always accurate.
• Not guaranteed to converge to a stationary point.
• Other issues, e.g., cannot exploit symmetry.

Page 13

Our Approach: CPOPT

Unlike CPALS, CPOPT solves for all factor matrices simultaneously using gradient-based optimization.

Define the objective function: f(A, B, C) = ½ ‖X − Σ_r a_r ∘ b_r ∘ c_r‖².

Page 14

Rewriting the Objective Function

Expanding the square gives three summands: a constant, an inner product, and a norm:

f = ½ ‖X‖² − ⟨X, Σ_r a_r ∘ b_r ∘ c_r⟩ + ½ ‖Σ_r a_r ∘ b_r ∘ c_r‖².

Page 15

Derivative of 2nd Summand

∂/∂a_r ⟨X, Σ_s a_s ∘ b_s ∘ c_s⟩ is a tensor-vector multiplication: its ith entry is Σ_{j,k} x_ijk b_jr c_kr.

Analogous formulas exist for partials w.r.t. columns of B and C.

Page 16

Derivative of 3rd Summand

∂/∂a_r ½ ‖Σ_s a_s ∘ b_s ∘ c_s‖² = Σ_s (b_sᵀ b_r)(c_sᵀ c_r) a_s.

Analogous formulas exist for partials w.r.t. columns of B and C.

Page 17

Objective and Gradient

Objective Function: f = ½ ‖X − Σ_r a_r ∘ b_r ∘ c_r‖².

Gradient (for r = 1, …, R): the ith entry of ∂f/∂a_r is −Σ_{j,k} x_ijk b_jr c_kr + Σ_s a_is (b_sᵀ b_r)(c_sᵀ c_r), and similarly for b_r and c_r.

Page 18

Gradient in Matrix Form

Objective Function: f = ½ ‖X − Σ_r a_r ∘ b_r ∘ c_r‖².

Gradient: ∂f/∂A = −X_(1) (C ⊙ B) + A [(BᵀB) ∗ (CᵀC)], and similarly for B and C (∗ is the elementwise product).

Note that this formulation can be used to derive the ALS approach!
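A self-contained NumPy sketch of the objective and its matrix-form gradient (ordering conventions chosen to be internally consistent; this is an illustration, not the Poblano/Tensor Toolbox implementation):

```python
import numpy as np

def khatri_rao(U, V):
    m, R = U.shape
    n, _ = V.shape
    return (U[:, None, :] * V[None, :, :]).reshape(m * n, R)

def cp_fg(X, A, B, C):
    """f = 0.5 * ||X - [[A, B, C]]||_F^2 and its gradient in matrix
    form: dA = -X_(1)(C kr B) + A[(B'B) .* (C'C)], and similarly
    for B and C."""
    I, J, K = X.shape
    M = np.einsum('ir,jr,kr->ijk', A, B, C)   # model tensor
    f = 0.5 * np.linalg.norm(X - M) ** 2
    dA = -X.reshape(I, -1) @ khatri_rao(B, C) + A @ ((B.T @ B) * (C.T @ C))
    dB = -np.moveaxis(X, 1, 0).reshape(J, -1) @ khatri_rao(A, C) + B @ ((A.T @ A) * (C.T @ C))
    dC = -np.moveaxis(X, 2, 0).reshape(K, -1) @ khatri_rao(A, B) + C @ ((A.T @ A) * (B.T @ B))
    return f, dA, dB, dC

rng = np.random.default_rng(3)
A, B, C = (rng.standard_normal((n, 2)) for n in (3, 4, 5))
X = rng.standard_normal((3, 4, 5))
f, dA, dB, dC = cp_fg(X, A, B, C)
```

The (f, gradient) pair is exactly what a derivative-based optimizer such as nonlinear conjugate gradient needs at each iterate.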

Page 19

Indeterminacies of CP

• CP is often unique.
• However, CP has two fundamental indeterminacies:
  – Permutation: the components can be reordered (e.g., swap a1, b1, c1 with a3, b3, c3). Not a big deal; it leads to multiple, but separated, minima.
  – Scaling: the vectors comprising a single rank-one factor can be scaled (e.g., replace a1 and b1 with 2a1 and ½b1). This leads to a continuous space of equivalent solutions.
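Both indeterminacies are easy to verify numerically (illustrative NumPy sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 2))
B = rng.standard_normal((4, 2))
C = rng.standard_normal((5, 2))
X = np.einsum('ir,jr,kr->ijk', A, B, C)

# Scaling: multiply a column of A by 2 and the same column of B by 1/2;
# the rank-one term (2 a_1) o (b_1 / 2) o c_1 is unchanged.
A_s, B_s = A.copy(), B.copy()
A_s[:, 0] *= 2.0
B_s[:, 0] /= 2.0
X_scaled = np.einsum('ir,jr,kr->ijk', A_s, B_s, C)

# Permutation: reorder the components in all three factors at once.
perm = [1, 0]
X_perm = np.einsum('ir,jr,kr->ijk', A[:, perm], B[:, perm], C[:, perm])
```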

Page 20

Adding Regularization

Objective Function (Tikhonov regularization): f_λ = ½ ‖X − Σ_r a_r ∘ b_r ∘ c_r‖² + (λ/2)(‖A‖² + ‖B‖² + ‖C‖²).

Gradient: ∂f_λ/∂A = −X_(1) (C ⊙ B) + A [(BᵀB) ∗ (CᵀC)] + λA, and similarly for B and C.

Page 21

Our Methods: CPOPT & CPOPTR

CPOPT: apply a derivative-based optimization method to the objective function f = ½ ‖X − Σ_r a_r ∘ b_r ∘ c_r‖².

CPOPTR: apply a derivative-based optimization method to the regularized objective function f_λ = ½ ‖X − Σ_r a_r ∘ b_r ∘ c_r‖² + (λ/2)(‖A‖² + ‖B‖² + ‖C‖²).

Page 22

Another Competing Method: CPNLS

CPNLS: apply a nonlinear least-squares solver to the equations x_ijk ≈ Σ_r a_ir b_jr c_kr.

Proposed by Paatero '97 and also Tomasi and Bro '05.

The Jacobian is of size IJK × (I + J + K)R.

Page 23

Experimental Set-Up [Tomasi & Bro '06]

Step 1: Generate random factor matrices A, B, C with R_true = 3 or 5 columns each and collinearity set to 0.5 (20 triplets).

Step 2: Construct a tensor from the factor matrices and add noise, in all combinations of:
• Homoscedastic: 1%, 5%, 10%
• Heteroscedastic: 0%, 1%, 5%
giving 180 tensors.

Step 3: Use each algorithm to extract factors, using R_true and R_true + 1 components, and compare against the factors from Step 1: 360 tests.

Page 24

Implementation Details

• All experiments were performed in MATLAB on a Linux workstation (Quad-Core Intel Xeon 2.50GHz, 9 GB RAM).
• Methods
  – CPALS: Alternating least squares. Used parafac_als in the Tensor Toolbox (Bader & Kolda).
  – CPNLS: Nonlinear least squares. Used PARAFAC3W by Tomasi and Bro, which implements Levenberg-Marquardt (necessary due to the scaling ambiguity).
  – CPOPT: Optimization. Used Tensor Toolbox routines to compute function values and gradients; optimization via the Nonlinear Conjugate Gradient (NCG) method with the Hestenes-Stiefel update, using Poblano (in-house code to be released soon).
  – CPOPTR: Optimization with regularization. Same as above (regularization parameter = 0.02).

Page 25

CPOPT is Fast and Accurate

Generated 360 dense test problems (with ranks 3 and 5) and factorized with R set to the correct number of components and to one more than that. Total of 720 tests for each entry below.

[Table: per-iteration cost for a K × K × K tensor with R components: O(RK³) for CPALS, CPOPT and CPOPTR, versus O(R³K³) for CPNLS.]

Page 26

Overfactoring has a significant impact

Page 27

CPOPT is robust to overfactoring

[Figure: emission-mode factors 1-3 of the Amino fluorescence data (http://www.models.life.ku.dk/), plotted against emission wavelength (250-450).]

Page 28

Application: Link Prediction

Page 29

Link Prediction on Bibliometric Data

[Figure: an authors × conferences × years tensor for 1991-2004, extended to 2005-2007; entry (i, j, k) is the # of papers by the ith author at the jth conference in year k.]

Question 1: Can we use tensor decompositions to model the data and extract meaningful factors?

Question 2: Can we predict who is going to publish at which conferences in the future?

Page 30

Components make sense! (DBLP data)

The authors × conferences × years tensor is factored as X ≈ a1 ∘ b1 ∘ c1 + a2 ∘ b2 ∘ c2 + … + aR ∘ bR ∘ cR.

[Figure: one component (a_r, b_r, c_r). Time mode: coefficients over the years 1992-2004. Conference mode: dominant coefficients at BILDMED, CARS and DAGM. Author mode: dominant coefficients for Hans Peter Meinzer, Heinrich Niemann and Thomas Martin Lehmann.]


Page 32

Components make sense!

[Figure: another component of the same decomposition. Time mode: coefficients over the years 1992-2004. Conference mode: dominant coefficient at IJCAI. Author mode: dominant coefficients for Craig Boutilier and Daphne Koller.]


Page 34

Link Prediction Problem

TRAIN: authors × conferences × years (1991-2004) tensor, factored as ≈ a1 ∘ b1 ∘ c1 + a2 ∘ b2 ∘ c2 + … + aR ∘ bR ∘ cR.

TEST: authors × conferences links for 2005-2007, with <author_i, conf_j> = 1 if the ith author publishes at the jth conference, and 0 otherwise.

~60K links out of 19 million possible <author, conf> pairs (~0.3% dense); ~32K links previously unseen in the training set.

Page 35

Score for <author_i, conf_j>

• Sign ambiguity: a component is unchanged if two of its vectors are negated (e.g., a1, b1 vs. −a1, −b1).
• Fix the signs using the signs of the maximum-magnitude entries, and then compute a score for each author-conference pair using the information from the time mode.

[Figure: the temporal profile c1 over the training years.]
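The slide leaves the exact scoring rule to the figure; the following is a plausible NumPy sketch. The weight w_r taken from the time mode (here a mean over the last n_recent entries of c_r) is a hypothetical choice for illustration, not necessarily the authors' rule:

```python
import numpy as np

def link_scores(A, B, C, n_recent=3):
    """Score every <author_i, conf_j> pair as sum_r w_r * a_ir * b_jr,
    where w_r summarizes the time profile c_r over the most recent
    training years (a hypothetical weighting, for illustration)."""
    # Sign fix: flip (a_r, b_r) together so the largest-magnitude entry
    # of a_r is positive; the product a_ir * b_jr is unchanged.
    for r in range(A.shape[1]):
        if A[np.argmax(np.abs(A[:, r])), r] < 0:
            A[:, r] *= -1
            B[:, r] *= -1
    w = C[-n_recent:, :].mean(axis=0)         # R weights from the time mode
    return np.einsum('ir,jr,r->ij', A, B, w)  # authors x conferences scores

rng = np.random.default_rng(4)
A = rng.standard_normal((6, 2))
B = rng.standard_normal((4, 2))
C = rng.standard_normal((5, 2))
S = link_scores(A.copy(), B.copy(), C)   # 6 authors x 4 conferences
```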



Page 38

Performance Measure: AUC

s contains the scores for all possible <author_i, conf_j> pairs (e.g., ~19 million), with <author_i, conf_j> = 1 if the ith author publishes at the jth conference and 0 otherwise. Let N be the number of 1's and M the number of 0's.

Sort the scores in decreasing order and sweep a threshold down the sorted list: each 1 encountered raises the TP rate by 1/N, and each 0 raises the FP rate by 1/M. This traces out the Receiver Operating Characteristic (ROC) curve; the performance measure is the Area Under the Curve (AUC).
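AUC can also be computed directly from its pairwise interpretation; a short sketch (quadratic in the number of pairs, so fine for illustration but not for ~19 million scores):

```python
import numpy as np

def auc(scores, labels):
    """Area under the ROC curve via the pairwise (Mann-Whitney) view:
    the fraction of (positive, negative) pairs in which the positive
    outscores the negative, counting ties as half."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    pos = scores[labels == 1]   # N positive scores
    neg = scores[labels == 0]   # M negative scores
    wins = (pos[:, None] > neg[None, :]).mean()
    ties = (pos[:, None] == neg[None, :]).mean()
    return wins + 0.5 * ties
```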

Page 39

Performance Evaluation

• Predicting links for 2005-2007 (~60K): AUC = 0.92.
• Predicting previously unseen links for 2005-2007 (~32K): AUC = 0.87.

(CP scores compared against a RANDOM baseline in the ROC plots.)

Page 40

CP-WOPT: Handling Missing Data

Page 41

Missing Data Examples

Missing data arise in different disciplines due to loss of information, machine failures, different sampling frequencies or experimental set-ups:
• Chemometrics (excitation × emission fluorescence data; Tomasi & Bro '05)
• Biomedical signal processing (e.g., EEG: channels × time-frequency × subjects 1…N)
• Network traffic analysis (e.g., packet drops)
• Computer vision (e.g., occlusions)
• …

Page 42

Modify the Objective for CP

No missing data: f(A, B, C) = ½ ‖X − Σ_r a_r ∘ b_r ∘ c_r‖².

For handling missing data, weight the residual so that only known entries contribute:
f_W(A, B, C) = ½ ‖W ∗ (X − Σ_r a_r ∘ b_r ∘ c_r)‖², where w_ijk = 1 if x_ijk is known and 0 if it is missing.

Page 43

Our Approach: CP-WOPT

Objective Function: f_W(A, B, C) = ½ ‖W ∗ (X − Σ_r a_r ∘ b_r ∘ c_r)‖², with w_ijk = 1 for known entries and 0 for missing ones.

Page 44

Objective and Gradient

Objective Function: f_W = ½ Σ_{i,j,k} w_ijk (x_ijk − Σ_r a_ir b_jr c_kr)².

Gradient (for r = 1,…,R; i = 1,…,I; j = 1,…,J; k = 1,…,K):
∂f_W/∂a_ir = −Σ_{j,k} w_ijk (x_ijk − Σ_s a_is b_js c_ks) b_jr c_kr, and analogously for b_jr and c_kr.

Page 45

Gradient in Matrix Form

Objective Function: f_W = ½ ‖W ∗ (X − Σ_r a_r ∘ b_r ∘ c_r)‖².

Gradient: with Y = W ∗ X and Z = W ∗ (Σ_r a_r ∘ b_r ∘ c_r),
∂f_W/∂A = (Z_(1) − Y_(1)) (C ⊙ B), and similarly for B and C.
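A self-contained NumPy sketch of the weighted objective and gradient (an illustration, not the authors' CP-WOPT code):

```python
import numpy as np

def khatri_rao(U, V):
    m, R = U.shape
    n, _ = V.shape
    return (U[:, None, :] * V[None, :, :]).reshape(m * n, R)

def cp_wopt_fg(X, W, A, B, C):
    """Weighted objective f = 0.5 * ||W .* (X - [[A, B, C]])||_F^2 with
    W = 1 on known entries and 0 on missing ones, and its gradient
    dA = D_(1) (C kr B) where D = W .* ([[A, B, C]] - X), etc.
    Missing entries of X may hold anything (even NaN); they are
    zeroed out before use."""
    I, J, K = X.shape
    Xk = np.where(W > 0, X, 0.0)   # keep only the known entries
    D = W * (np.einsum('ir,jr,kr->ijk', A, B, C) - Xk)
    f = 0.5 * np.sum(D ** 2)
    dA = D.reshape(I, -1) @ khatri_rao(B, C)
    dB = np.moveaxis(D, 1, 0).reshape(J, -1) @ khatri_rao(A, C)
    dC = np.moveaxis(D, 2, 0).reshape(K, -1) @ khatri_rao(A, B)
    return f, dA, dB, dC

rng = np.random.default_rng(5)
A, B, C = (rng.standard_normal((n, 2)) for n in (3, 4, 5))
X = rng.standard_normal((3, 4, 5))
W = (rng.random((3, 4, 5)) > 0.4).astype(float)   # ~60% known entries
f, dA, dB, dC = cp_wopt_fg(X, W, A, B, C)
```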

Page 46

Experimental Set-Up [Tomasi & Bro '05]

Step 1: Generate random factor matrices A, B, C with R = 5 or 10 columns each and collinearity set to 0.5 (20 triplets).

Step 2: Construct a tensor from the factor matrices and add noise (2% homoscedastic noise).

Step 3: Set some entries to missing: 10%, 40% or 70% of the data, missing either as individual entries or as whole fibers.

Step 4: Use the algorithm to extract R factors and compare against the factors in Step 1.

Page 47

CP-WOPT is Accurate!

Generated 40 test problems (with ranks 5 and 10) and factorized with an R-component CP model. Each entry corresponds to the percentage of correctly recovered solutions, reported against the ratio of # known data entries to # variables.

Page 48

CP-WOPT is Accurate!

CPNLS: nonlinear least squares. Used INDAFAC, which implements Levenberg-Marquardt [Tomasi and Bro '05].
Other alternative: ALS-based imputation (for comparisons, see Tomasi and Bro '05).

Page 49

CP-WOPT is Fast!

Generated 60 test problems (with M = 10%, 40% and 70% missing data) and factorized with an R-component CP model. Each entry corresponds to the average/std over the runs that successfully recovered the underlying factors.

Page 50

CP-WOPT is Useful for Real Data!

GOAL: to differentiate between left- and right-hand stimulation.

[Figure: an EEG tensor (channels × time-frequency × subjects) decomposed into a sum of components, fit both on the complete data and on incomplete data with missing entries.]

Thanks to Morten Mørup!

Page 51

Summary & Future Work

• New CPOPT method: accurate & scalable.
• Extended CPOPT to CP-WOPT to handle missing data: accurate & scalable.
• More open questions: starting point? tuning the optimization; regularization; exploiting sparsity; nonnegativity.
• Application to link prediction: on-going work comparing to other methods.

Page 52

Thank you!

More on tensors and tensor models:
• Survey: E. Acar and B. Yener, Unsupervised Multiway Data Analysis: A Literature Survey, IEEE Transactions on Knowledge and Data Engineering, 21(1): 6-20, 2009.
• CPOPT: E. Acar, T. G. Kolda and D. M. Dunlavy, An Optimization Approach for Fitting Canonical Tensor Decompositions, submitted for publication.
• CP-WOPT: E. Acar, T. G. Kolda, D. M. Dunlavy and M. Mørup, Tensor Factorizations with Missing Data, submitted for publication.
• Link Prediction: E. Acar, T. G. Kolda and D. M. Dunlavy, Link Prediction on Evolving Data, in preparation.

Contact:
• Evrim Acar, [email protected]
• Tamara G. Kolda, [email protected]
• Daniel M. Dunlavy, [email protected]

Minisymposia on Tensors and Tensor-based Computations