ICDM'07 1 Depth-Based Novelty Detection Yixin Chen Dept. of Computer and Information Science University of Mississippi http://www.cs.olemiss.edu/~ychen Joint work with Henry Bart, Xin Dang, and Hanxiang Peng

Post on 15-Jan-2016

ICDM'07 1

Depth-Based Novelty Detection

Yixin Chen

Dept. of Computer and Information Science

University of Mississippi

http://www.cs.olemiss.edu/~ychen

Joint work with Henry Bart, Xin Dang, and Hanxiang Peng

ICDM'07 2

Outline

Novelty detection

Motivations

Kernelized spatial depth (KSD)

Bounds on the false alarm probability

Empirical studies

Discussions

ICDM'07 3

Outlier Detection

Missing label problem

One-class learning

ICDM'07 4

A Simple Outlier Detector

1-d example

Sensitivity

Threshold

Structure of the data

[Figure: 1-d example; labels: X (data points), mean, median, ?]

ICDM'07 5

Median

The sign function

Median is
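The formulas on this slide did not survive extraction. As a reminder of the standard definitions (not copied from the slide): the sign function is s(x) = 1 for x > 0, 0 for x = 0, and -1 for x < 0, and a median m of a sample balances the signs, so the sum of s(x_i - m) is zero or as close to zero as possible. A minimal Python sketch, with made-up data and a hypothetical helper name `sign_sum`, shows that one extreme value drags the mean but leaves the median balanced:

```python
import numpy as np

def sign_sum(m, xs):
    """Sum of sign(x_i - m); it is zero when m balances the sample."""
    return int(np.sum(np.sign(np.asarray(xs, dtype=float) - m)))

# Illustrative data (an assumption, not from the talk): one extreme value.
xs = [1.0, 2.0, 3.0, 4.0, 100.0]

med = float(np.median(xs))   # robust center
mean = float(np.mean(xs))    # pulled toward the extreme value

print("median:", med, "sign sum at median:", sign_sum(med, xs))
print("mean:  ", mean, "sign sum at mean:  ", sign_sum(mean, xs))
```

The sign sum at the mean is far from zero here, which is one way to see why a detector built around the mean is sensitive to the very outliers it is supposed to find.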

ICDM'07 6

Spatial Median

The spatial sign function

The spatial median is
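The slide's formulas are missing from the transcript. In the standard formulation, the spatial sign of a nonzero vector x is the unit vector x/||x|| (and the zero vector maps to zero), and the spatial median is a point m at which the spatial signs of x_i - m sum to zero. The sketch below approximates the spatial median with Weiszfeld's iteration, which is a swapped-in textbook method and not necessarily what the authors used; the data and function names are illustrative assumptions.

```python
import numpy as np

def spatial_sign(v, eps=1e-12):
    """Unit vector in the direction of v; the zero vector maps to zero."""
    v = np.asarray(v, dtype=float)
    norm = np.linalg.norm(v)
    return v / norm if norm > eps else np.zeros_like(v)

def spatial_median(X, n_iter=500, eps=1e-12):
    """Approximate the spatial median with Weiszfeld's iteration."""
    X = np.asarray(X, dtype=float)
    m = X.mean(axis=0)                      # start from the mean
    for _ in range(n_iter):
        d = np.linalg.norm(X - m, axis=1)   # distances to the current estimate
        w = 1.0 / np.maximum(d, eps)        # inverse-distance weights
        m = (w[:, None] * X).sum(axis=0) / w.sum()
    return m

# Illustrative data: a triangle whose angles are all below 120 degrees,
# so the spatial median lies strictly inside it.
X = np.array([[0.0, 0.0], [4.0, 0.0], [1.0, 3.0]])
m = spatial_median(X)

# At the spatial median the spatial signs cancel.
residual = sum(spatial_sign(x - m) for x in X)
print("spatial median:", m, "norm of summed signs:", np.linalg.norm(residual))
```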

ICDM'07 7

Spatial Depth

Sample version

The expectation of the unit vector starting from x
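Read this way, the spatial depth of x is one minus the length of the average unit vector pointing from x to the data: deep inside the cloud the unit vectors cancel and the depth is near 1, while far outside they align and the depth is near 0. The following sketch uses the plain average over n sample points, which is an assumption about normalization (the slide's exact sample formula is not in the transcript), and toy data:

```python
import numpy as np

def spatial_depth(x, X, eps=1e-12):
    """Sample spatial depth: 1 - || mean of unit vectors from x to each x_i ||."""
    X = np.asarray(X, dtype=float)
    diffs = X - np.asarray(x, dtype=float)
    norms = np.linalg.norm(diffs, axis=1)
    units = np.zeros_like(diffs)
    mask = norms > eps                      # points equal to x contribute zero
    units[mask] = diffs[mask] / norms[mask][:, None]
    return 1.0 - np.linalg.norm(units.mean(axis=0))

# Toy data: the four corners of a square centered at the origin.
X = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])

depth_center = spatial_depth([0.0, 0.0], X)    # unit vectors cancel
depth_far = spatial_depth([100.0, 0.0], X)     # unit vectors nearly align
print(depth_center, depth_far)
```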

ICDM'07 8

Spatial Depth and Outlier Detection

outlier

ICDM'07 9

Example: Half-Moon Data

FAR = 70%

ICDM'07 10

Example: Ring Data

FAR = 100%

ICDM'07 11

Kernelized Spatial Depth (KSD)

σ→∞, KSD converges to SD

σ→0, KSD → 0.293
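Both limits can be checked numerically. The sketch below assumes a Gaussian kernel κ(x, y) = exp(-||x - y||² / σ²) (the parameterization is an assumption) and evaluates the depth in feature space using only kernel values, via ||φ(x) - φ(y)||² = κ(x, x) + κ(y, y) - 2κ(x, y) and ⟨φ(x) - φ(y), φ(x) - φ(z)⟩ = κ(x, x) + κ(y, z) - κ(x, y) - κ(x, z). The normalization by n and all names are illustrative. For very small σ every off-diagonal kernel value vanishes, the squared norm of the averaged unit vectors becomes (n + 1)/(2n), and the depth tends to 1 - 1/√2 ≈ 0.293 as n grows.

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    """Gaussian kernel matrix exp(-||a - b||^2 / sigma^2) (assumed form)."""
    A = np.atleast_2d(np.asarray(A, dtype=float))
    B = np.atleast_2d(np.asarray(B, dtype=float))
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq / sigma**2)

def kernelized_spatial_depth(x, X, sigma, eps=1e-12):
    """Spatial depth of x in the kernel feature space, from kernel values only."""
    X = np.asarray(X, dtype=float)
    k_xx = 1.0                                   # Gaussian kernel: kappa(x, x) = 1
    k_xX = gaussian_kernel(x, X, sigma)[0]       # kappa(x, x_i), shape (n,)
    K = gaussian_kernel(X, X, sigma)             # kappa(x_i, x_j), shape (n, n)
    # Feature-space distances ||phi(x) - phi(x_i)||.
    delta = np.sqrt(np.maximum(k_xx + 1.0 - 2.0 * k_xX, 0.0))
    w = np.zeros_like(delta)
    w[delta > eps] = 1.0 / delta[delta > eps]
    # Inner products <phi(x) - phi(x_i), phi(x) - phi(x_j)>.
    inner = k_xx + K - k_xX[:, None] - k_xX[None, :]
    total = (inner * np.outer(w, w)).sum()       # squared norm of summed unit vectors
    return 1.0 - np.sqrt(max(total, 0.0)) / len(X)

# Toy data: a 10 x 10 integer grid, query point off the grid.
g = np.arange(10.0)
X = np.array([[a, b] for a in g for b in g])
x = np.array([0.5, 0.5])

ksd_small = kernelized_spatial_depth(x, X, sigma=0.01)   # sigma -> 0 regime
ksd_large = kernelized_spatial_depth(x, X, sigma=1e3)    # sigma -> infinity regime
print(ksd_small, ksd_large)
```

With n = 100 the small-σ value is 1 - √(101/200), slightly below 0.293; the large-σ value should agree with the ordinary spatial depth of x up to a small error.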

ICDM'07 12

Example: Half-Moon Data

0.2495

ICDM'07 13

Example: Ring Data

0.2651

ICDM'07 14

KSD Outlier Detector

outliers

normal observations

b is margin

How should we decide the threshold t?

ICDM'07 15

Threshold Selection

Largest threshold such that upper bound on FAP ≤ desired level

ICDM'07 16

Bounds on the False Alarm Probability

A training set bound

A test set bound
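The bounds themselves are not in the transcript, so the sketch below swaps in a generic Hoeffding-style held-out bound to illustrate the selection rule "largest threshold whose upper bound on the false alarm probability stays within the desired level". It assumes a detector that flags x as an outlier when its depth is at most t, a held-out set of depths from normal observations, and a union bound over the k candidate thresholds. This is not the authors' training set or test set bound, and all names and defaults are hypothetical.

```python
import numpy as np

def select_threshold(heldout_depths, candidates, level=0.1, delta=0.05):
    """Largest candidate t whose Hoeffding upper bound on the false alarm
    probability P(depth <= t) is at most `level`; None if no candidate fits.

    With probability at least 1 - delta over the held-out sample, the bound
    holds simultaneously for all k candidates (union bound).
    """
    depths = np.asarray(heldout_depths, dtype=float)
    candidates = np.sort(np.asarray(candidates, dtype=float))
    m, k = len(depths), len(candidates)
    slack = np.sqrt(np.log(k / delta) / (2.0 * m))
    best = None
    for t in candidates:                    # ascending, so the last hit is the largest
        empirical_far = np.mean(depths <= t)
        if empirical_far + slack <= level:
            best = float(t)
    return best

# Illustrative held-out depths of normal observations (not real data).
depths = np.linspace(0.001, 1.0, 1000)
cands = [0.0505, 0.1005, 0.1505, 0.2005]

t_star = select_threshold(depths, cands, level=0.2)
print("selected threshold:", t_star)
```

Because the empirical false alarm rate is non-decreasing in t, scanning the candidates in ascending order and keeping the last one that passes yields the largest admissible threshold.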

ICDM'07 17

Empirical Study 1

10 species under the order Cypriniformes

989 specimens from Tulane University Museum of Natural History

ICDM'07 18

Empirical Study 1

Masking Effect

ICDM'07 19

Empirical Study 2

ICDM'07 20

Discussions

KSD outlier detection and density based approaches

[Figure: two plots over the observations (horizontal axis 0 to 12): kernel spatial depth, and estimated probability density]

ICDM'07 21

Acknowledgment

Kory P. Northrop, Tulane University

Huimin Chen, University of New Orleans

University of Mississippi

National Science Foundation