Non-Negative Residual Matrix Factorization w/ Application to Graph Anomaly Detection



Page 1: Non-Negative  Residual  Matrix Factorization  w/ Application to Graph Anomaly Detection

© 2011 IBM Corporation

IBM Research

SIAM-DM 2011, Mesa AZ, USA,

Non-Negative Residual Matrix Factorization w/ Application to Graph Anomaly Detection

Hanghang Tong and Ching-Yung Lin

April 28-30, 2011

Page 2: Non-Negative  Residual  Matrix Factorization  w/ Application to Graph Anomaly Detection


Large Graphs are Everywhere!


Internet Map [Koren 2009]

Food Web [2007]

Protein Network [Salthe 2004]

Social Network [Newman 2005]

Web Graph

Terrorist Network [Krebs 2002]

Q: How to find patterns? E.g., community, anomaly, etc.

Page 3: Non-Negative  Residual  Matrix Factorization  w/ Application to Graph Anomaly Detection


A Typical Procedure:

Matrix Tool for Finding Graph Patterns

Graph → Adjacency Matrix A → A = F x G + R

F, G: low-rank matrices; R: residual matrix

Page 4: Non-Negative  Residual  Matrix Factorization  w/ Application to Graph Anomaly Detection


A Typical Procedure:

Matrix Tool for Finding Graph Patterns

An Illustrative Example

Graph → Adjacency Matrix A → A = F x G + R

F, G: low-rank matrices (community); R: residual matrix (anomalies)
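To make the decomposition A = F x G + R concrete, here is a minimal sketch in plain Python. The toy graph, the choice of F as a community-indicator matrix with G as its transpose, and every variable name are assumptions made for illustration; none of them comes from the slides.

```python
# Hypothetical toy graph: nodes 0-2 and nodes 3-5 form two cliques
# (communities), plus one cross-community edge between nodes 0 and 5.
n = 6
communities = [[0, 1, 2], [3, 4, 5]]
A = [[0.0] * n for _ in range(n)]
for group in communities:
    for i in group:
        for j in group:
            if i != j:
                A[i][j] = 1.0
A[0][5] = A[5][0] = 1.0

# Assumed low-rank factors: F is the n x 2 community-indicator matrix and
# G is its transpose (2 x n), so (F x G)[i][j] is 1 exactly when nodes
# i and j belong to the same community.
F = [[1.0 if i in group else 0.0 for group in communities] for i in range(n)]
G = [list(col) for col in zip(*F)]


def matmul(X, Y):
    """Dense matrix product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]


FG = matmul(F, G)

# Residual matrix R = A - F x G.
R = [[A[i][j] - FG[i][j] for j in range(n)] for i in range(n)]

# Edges that the community structure does not explain keep a positive residual.
anomalous_edges = [(i, j) for i in range(n) for j in range(n)
                   if A[i][j] > 0 and R[i][j] > 0]
print(anomalous_edges)
```

Under these assumptions the within-community edges are fully explained by F x G, and only the cross-community edge is left in the residual. The diagonal entries of R are negative, but they sit where A(i,j) = 0, so they are not edges.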

Page 5: Non-Negative  Residual  Matrix Factorization  w/ Application to Graph Anomaly Detection


Improve Interpretation by Non-negativity

A Typical Procedure (An Example): Graph → Adjacency Matrix A → A = F x G + R

F x G: community; R: anomalies

Non-negative Matrix Factorization: F >= 0; G >= 0 (for community detection)

Non-negative Residual Matrix Factorization (This Paper): R(i,j) >= 0 for A(i,j) > 0 (for anomaly detection)

Page 6: Non-Negative  Residual  Matrix Factorization  w/ Application to Graph Anomaly Detection


Anomaly Detection on Graphs

Social Networks – 'Popularity contest'

Computer Networks – Spammer, Port Scanner, Vulnerable Machines, etc.

Financial Transaction Networks – Fraudulent transactions (e.g., money-laundering ring), scammer

Criminal Networks – New criminal trend

Telecommunication Networks – Telemarketer

Key Observation: Abnormal Behavior → Actual Activities

Page 7: Non-Negative  Residual  Matrix Factorization  w/ Application to Graph Anomaly Detection


Optimization Formulation

General Case

Objective: Weighted Frobenius Form, with Weight (Common in Any Matrix Factorization)

Page 8: Non-Negative  Residual  Matrix Factorization  w/ Application to Graph Anomaly Detection


Optimization Formulation

General Case

Objective: Weighted Frobenius Form, with Weight (Common in Any Matrix Factorization)

Constraint: Non-negative residual (Unique in This Paper)
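The slide's equation did not survive in this transcript, so the following is only a sketch of what a weighted Frobenius objective with a non-negative residual constraint could look like in plain Python. The exact objective (a W-weighted sum of squared residual entries), the function names, and the tiny example matrices are assumptions. The constraint itself, R(i,j) >= 0 wherever A(i,j) > 0, is the one stated earlier in the deck.

```python
def product_entry(F, G, i, j):
    """Entry (i, j) of F x G, i.e. row i of F times column j of G."""
    return sum(F[i][k] * G[k][j] for k in range(len(G)))


def weighted_frobenius(A, F, G, W):
    """Assumed objective: sum over (i, j) of W[i][j] * (A - F x G)[i][j] ** 2."""
    return sum(W[i][j] * (A[i][j] - product_entry(F, G, i, j)) ** 2
               for i in range(len(A)) for j in range(len(A[0])))


def has_nonnegative_residual(A, F, G, tol=1e-12):
    """Check R(i,j) >= 0 on every entry where A(i,j) > 0."""
    return all(A[i][j] - product_entry(F, G, i, j) >= -tol
               for i in range(len(A)) for j in range(len(A[0]))
               if A[i][j] > 0)


# Hypothetical 2 x 2 example with rank-1 factors.
A = [[1.0, 1.0], [1.0, 0.0]]
W = [[1.0, 1.0], [1.0, 1.0]]
G = [[1.0, 1.0]]
F_ok = [[1.0], [1.0]]    # F x G never exceeds A on the edges
F_bad = [[2.0], [1.0]]   # F x G overshoots the edges in row 0

print(weighted_frobenius(A, F_ok, G, W), has_nonnegative_residual(A, F_ok, G))
print(weighted_frobenius(A, F_bad, G, W), has_nonnegative_residual(A, F_bad, G))
```

The first pair of factors is feasible even though its residual is negative at entry (1, 1), because that entry is not an edge. The second pair overshoots two edges and so violates the constraint.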

Page 9: Non-Negative  Residual  Matrix Factorization  w/ Application to Graph Anomaly Detection


Optimization Formulation

0/1 Weight Matrix (Major Focus of the Paper)

Objective: 0/1 weight (Common in Any Matrix Factorization)

Constraint: Non-negative residual (Unique in This Paper)

Page 10: Non-Negative  Residual  Matrix Factorization  w/ Application to Graph Anomaly Detection


Optimization Formulation with 0/1 Weight Matrix

NrMF with 0/1 Weight Matrix

Q: How to find 'optimal' F and G?

– D1: Quality; C1: non-convexity of the optimization objective

– D2: Scalability; C2: large size of the graph

Page 11: Non-Negative  Residual  Matrix Factorization  w/ Application to Graph Anomaly Detection


Optimization Method: Batch Mode

Basic Idea 1: Alternating

Not convex wrt F and G jointly, but convex if fixing either F or G

[Equation not preserved: argmin over G, subject to constraints]

Basic Idea 2: Separation

[Equation not preserved: argmin over G, subject to constraints, for each j]

Each subproblem is a Standard Quadratic Programming Problem

Overall Complexity: Polynomial. Can we do better?

Page 12: Non-Negative  Residual  Matrix Factorization  w/ Application to Graph Anomaly Detection


Optimization Method: Incremental Mode

Basic Idea 1: Recursive

Basic Idea 2: Alternating

Basic Idea 3: Separation

Adjacency Matrix A → Initialize: R = A → Do r times: (Rank-1 Approximation → Update Residual Matrix R) → Output Final Residual Matrix

Each subproblem is a QP for a single variable w/ boundary constraints, which can be solved in constant time

Overall Complexity: Linear wrt # of edges
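The incremental loop (initialize R = A, fit a rank-1 approximation, update R, repeat r times) can be sketched in plain Python as below. This is an illustration under stated assumptions, not the authors' implementation: the objective is taken to be the squared residual summed over existing edges only (one possible reading of a 0/1 weight matrix), the rank-1 factors start at all ones, the number of alternating sweeps is fixed, and all function names and the example graph are hypothetical.

```python
def solve_bounded_1d(pairs):
    """Minimize sum((r - x * c) ** 2) over a single variable x, subject to
    x * c <= r for every (r, c) in pairs. Assumes every r >= 0, so x = 0 is
    always feasible. The cost is proportional to len(pairs)."""
    den = sum(c * c for _, c in pairs)
    if den == 0.0:
        return 0.0
    x = sum(r * c for r, c in pairs) / den  # unconstrained optimum
    lo = max((r / c for r, c in pairs if c < 0), default=float("-inf"))
    hi = min((r / c for r, c in pairs if c > 0), default=float("inf"))
    return min(max(x, lo), hi)  # clip to the feasible interval


def rank1_fit(R, n_rows, n_cols, sweeps=20):
    """Alternate between updating f (with g fixed) and g (with f fixed)."""
    rows = [[] for _ in range(n_rows)]
    cols = [[] for _ in range(n_cols)]
    for (i, j) in R:
        rows[i].append(j)
        cols[j].append(i)
    f = [1.0] * n_rows
    g = [1.0] * n_cols
    for _ in range(sweeps):
        # Separation: each f[i] (and each g[j]) is its own one-variable QP.
        f = [solve_bounded_1d([(R[(i, j)], g[j]) for j in rows[i]])
             for i in range(n_rows)]
        g = [solve_bounded_1d([(R[(i, j)], f[i]) for i in cols[j]])
             for j in range(n_cols)]
    return f, g


def incremental_nrmf(edges, n_rows, n_cols, r):
    """edges maps (i, j) to a positive weight A(i, j)."""
    R = dict(edges)  # Initialize: R = A (edge entries only)
    factors = []
    for _ in range(r):  # Do r times
        f, g = rank1_fit(R, n_rows, n_cols)
        for (i, j) in R:
            # Update the residual; max() only guards against rounding error.
            R[(i, j)] = max(0.0, R[(i, j)] - f[i] * g[j])
        factors.append((f, g))
    return factors, R


# Hypothetical weighted graph with 3 row nodes and 2 column nodes.
edges = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 3.0, (2, 1): 0.5}
factors, R = incremental_nrmf(edges, 3, 2, r=2)
print(R)
```

Because the unconstrained optimum of a one-variable convex quadratic is simply clipped to an interval, each sweep touches every edge a fixed number of times, which is in keeping with the linear-in-edges complexity stated on the slide. The alternating updates can stall at a poor fixed point, so this sketch makes no claim about solution quality.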

Page 13: Non-Negative  Residual  Matrix Factorization  w/ Application to Graph Anomaly Detection


Experimental Evaluation

Effectiveness

[Plot: Accuracy by Anomaly Type]

Efficiency

[Plot: Wall-clock Time vs. # of edges]

Page 14: Non-Negative  Residual  Matrix Factorization  w/ Application to Graph Anomaly Detection


Batch Method vs. Incremental Method

[Plot: Log Wall-clock time (sec.) per Data Set; series: Incremental Method, Batch Method]

Page 15: Non-Negative  Residual  Matrix Factorization  w/ Application to Graph Anomaly Detection


Conclusion

Problem Formulation: Non-negative Residual Matrix Factorization

– a new matrix factorization for interpretable graph anomaly detection

Optimization Methods

– Batch: straightforward, polynomial time complexity

– Incremental: linear time complexity

Future Work

– Other interpretable properties (sparseness) for anomaly detection

– Matrix Factorization w/ Total Non-negativity

Page 16: Non-Negative  Residual  Matrix Factorization  w/ Application to Graph Anomaly Detection


Thank you!

(We are hiring at IBM Research!)

Page 17: Non-Negative  Residual  Matrix Factorization  w/ Application to Graph Anomaly Detection


Visual Comparison

