Modeling, Searching, and Explaining Abnormal Instances in Multi-Relational Networks Chapter 1. Introduction Speaker: Cheng-Te Li 2007 . 7 . 9



Modeling, Searching, and Explaining Abnormal Instances in Multi-Relational Networks

Chapter 1. Introduction

Speaker: Cheng-Te Li

2007 . 7 . 9


Outline
• Introduction
• Problem Definition
  – Multi-relational Networks
  – The Importance of Abnormal Instances
  – Explanation
• Design Considerations
• Objective and Challenges
• Approach
• Contributions


Introduction
• "A discovery is said to be an accident meeting a prepared mind." – Albert Szent-Györgyi
• For CS: model the discovery process via AI
• Motivation: “Natural Selection”
• The discovery process


Problem Definition
• Essentially, how can the discovery process be modeled through AI?
  – Our general framework
• Three key features
  – Multi-relational network (MRN)
  – Abnormal instances
  – Human-understandable explanation


Multi-relational Networks
• Definition
  – Nodes: objects of different types
  – Links: binary relationships between objects
  – Multi-relational: multiple different types of links
  – Attributes
• Encode semantic relationships between different types of objects
• E.g. bibliography network
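The definition above can be made concrete with a small data structure. The following is a minimal sketch, not the speaker's implementation: the class name `MultiRelationalNetwork`, its methods, and the bibliography data (authors, papers, link types such as `writes` and `cites`) are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MultiRelationalNetwork:
    """Typed nodes, typed binary links, and optional node attributes."""
    node_types: dict = field(default_factory=dict)  # node -> node type
    attributes: dict = field(default_factory=dict)  # node -> {name: value}
    links: set = field(default_factory=set)         # (source, link_type, target)

    def add_node(self, node, node_type, **attrs):
        self.node_types[node] = node_type
        self.attributes[node] = dict(attrs)

    def add_link(self, source, link_type, target):
        # A link is a binary relationship between two existing objects.
        if source not in self.node_types or target not in self.node_types:
            raise KeyError("both endpoints must be added as nodes first")
        self.links.add((source, link_type, target))

    def neighbors(self, node, link_type):
        """Targets reachable from `node` over one link of the given type."""
        return sorted(t for s, lt, t in self.links if s == node and lt == link_type)

    def link_types(self):
        return sorted({lt for _, lt, _ in self.links})


# Hypothetical bibliography network (all names are illustrative).
mrn = MultiRelationalNetwork()
mrn.add_node("alice", "author")
mrn.add_node("bob", "author")
mrn.add_node("paper1", "paper", year=2005)
mrn.add_node("paper2", "paper", year=2006)
mrn.add_node("journalX", "venue")
mrn.add_link("alice", "writes", "paper1")
mrn.add_link("bob", "writes", "paper2")
mrn.add_link("paper2", "cites", "paper1")
mrn.add_link("paper1", "published_in", "journalX")

print(mrn.link_types())
print(mrn.neighbors("paper2", "cites"))
```

Storing each link as a `(source, link_type, target)` triple keeps the relation type attached to every edge, which is the property that separates an MRN from an ordinary graph.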


Multi-relational Networks (cont'd)
• More examples
  – Kinship network
  – WWW: incoming, outgoing, and email links
  – WordNet: lexical relationships between concepts
• Multiple relationship types carry different kinds of semantic information to compare and contrast
• PageRank, centrality theory
  – Cannot deal with relation types in a network
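The limitation in the last bullet can be illustrated with a plain power-iteration PageRank. This is a minimal sketch under stated assumptions: the function `pagerank`, the damping value, and the two toy networks with link types `friend_of` and `sues` are hypothetical. Because the algorithm only looks at who links to whom, two networks with the same topology but different link types receive identical scores.

```python
def pagerank(nodes, typed_links, damping=0.85, iterations=50):
    """Plain power-iteration PageRank; the link type in each triple is ignored."""
    out = {n: [] for n in nodes}
    for source, _link_type, target in typed_links:
        out[source].append(target)

    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for n in nodes:
            targets = out[n] or nodes  # dangling node: spread its rank evenly
            share = damping * rank[n] / len(targets)
            for t in targets:
                new_rank[t] += share
        rank = new_rank
    return rank


# Two hypothetical networks: same structure, different relation types.
nodes = ["a", "b", "c"]
friendly = [("a", "friend_of", "b"), ("b", "friend_of", "c"), ("a", "friend_of", "c")]
hostile = [("a", "sues", "b"), ("b", "sues", "c"), ("a", "sues", "c")]

same_scores = pagerank(nodes, friendly) == pagerank(nodes, hostile)
print(same_scores)
```

Any semantic difference between the two relation types is invisible to the ranking, which is why a model for MRNs needs features that carry the link type.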


Abnormal Instances
• Discovery from a network
  – Identify central nodes, recognize frequent subgraphs, learn interesting properties
• Our goal is to discover the instances that look different!
  – Attraction of the “light bulb”
  – A previously unexplored kind of anomaly detection over relational data
  – Potential applications:
    • Information Awareness and Homeland Security
    • Fraud Detection and Law Enforcement
    • General Scientific Discovery
    • Data Cleaning


Explanation
• The difficulty of verification
  – The goal is to find something previously unknown
  – False positives may exist even with high precision and high recall, as in unsupervised discovery
• Explanation-based discovery
  – Human-understandable explanation
  – Intuitive validation by the user
  – Further investigation


Design Considerations
• Three strategies to identify abnormal instances
  – Rule-based learning
    • Pattern matching, e.g. “abnormal if it doesn’t cite any other people’s papers”
  – Supervised learning
    • Manual labeling for training and classification
    • Merit: high precision
    • Demerits: domain dependent; expensive to create; sensitive to human bias; can only find expected anomalies, not novel ones
  – Unsupervised learning
    • Comparison-based, following our definition
    • Property: easily adapted to a new domain without training
    • More suitable for security-related problems
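The rule-based strategy can be sketched with the example rule quoted above. This is a minimal illustration, assuming the rule is applied to individual papers; the `papers` dictionary and the function name `cites_only_self` are hypothetical.

```python
# Hypothetical citation data: paper -> (set of authors, list of cited papers).
papers = {
    "p1": ({"alice"}, []),
    "p2": ({"bob"}, ["p1"]),
    "p3": ({"alice"}, ["p1"]),        # cites only the author's own earlier work
    "p4": ({"carol"}, ["p2", "p3"]),
}


def cites_only_self(paper_id, papers):
    """Rule: abnormal if the paper cites no paper written by other people."""
    authors, cited = papers[paper_id]
    # A cited paper counts as "other people's" if it has an author outside `authors`.
    return not any(papers[c][0] - authors for c in cited)


flagged = sorted(p for p in papers if cites_only_self(p, papers))
print(flagged)
```

The rule flags `p3` as intended, but it also flags `p1`, which simply has no citations. A hand-written pattern finds only what its author anticipated and has to be rewritten for each new domain, which is the motivation for the comparison-based, unsupervised route.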


Design Considerations (cont'd)
• System requirements
  – Utilize the information in the MRN, e.g. link types
  – Adapt to different domains without training
  – Explainable
  – Scalable
  – Provide high-level bias
  – Support different levels of detail for explanations


Objectives & Challenges
• Objectives
  – Discovery stage: identify abnormal nodes
  – Explanation stage: produce descriptions for the nodes found
    • e.g. organized crime network
• Challenges
  – Make anomaly detection obey the previous requirements
    • Existing ways to identify suspicious instances in an MRN: rule-based, supervised
    • Conventional unsupervised algorithms work on propositional or numerical data
    • PageRank, HITS, random walk: do not consider link types
  – Consider understandable explanations as part of discovery
    • Need a model that is complex enough but not over-complicated


Approach
• Design a model capturing the semantics of nodes
  – Select a set of relevant path types as semantic features
  – Compute the statistical dependency between nodes and path types as feature values
• Find nodes with abnormal semantics
  – Distance-based outlier detection with semantic profiles
• Explain them!
  – Apply a classifier to separate the abnormal nodes from the others
  – Translate the generated rules into natural language
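The first two steps above can be sketched as follows. This is a simplified illustration under loud assumptions, not the method from the talk: a path type is taken to be the sequence of link types along a path of length at most 2, the feature value is the relative frequency of each path type (a stand-in for the statistical dependency measure, which the slides do not specify), and the outlier score is the Euclidean distance to the k-th nearest neighbor. All function names and the toy data are hypothetical.

```python
from collections import Counter
from math import sqrt


def path_type_profile(node, links, max_len=2):
    """Relative frequency of each path type (tuple of link types) leaving `node`."""
    out = {}
    for s, lt, t in links:
        out.setdefault(s, []).append((lt, t))
    counts = Counter()
    frontier = [(node, ())]
    for _ in range(max_len):
        next_frontier = []
        for current, path in frontier:
            for lt, target in out.get(current, []):
                new_path = path + (lt,)
                counts[new_path] += 1
                next_frontier.append((target, new_path))
        frontier = next_frontier
    total = sum(counts.values())
    return {p: c / total for p, c in counts.items()} if total else {}


def distance(p, q):
    """Euclidean distance between two sparse profiles."""
    keys = set(p) | set(q)
    return sqrt(sum((p.get(k, 0.0) - q.get(k, 0.0)) ** 2 for k in keys))


def outlier_scores(profiles, k=2):
    """Score each node by the distance to its k-th nearest neighbor."""
    scores = {}
    for n, p in profiles.items():
        dists = sorted(distance(p, q) for m, q in profiles.items() if m != n)
        scores[n] = dists[k - 1]
    return scores


# Hypothetical data: a4's paper cites nothing, unlike the other authors' papers.
links = [
    ("a1", "writes", "p1"), ("a2", "writes", "p2"),
    ("a3", "writes", "p3"), ("a4", "writes", "p4"),
    ("p1", "cites", "p2"), ("p2", "cites", "p3"), ("p3", "cites", "p1"),
]
authors = ["a1", "a2", "a3", "a4"]
profiles = {a: path_type_profile(a, links) for a in authors}
scores = outlier_scores(profiles, k=2)
most_abnormal = max(scores, key=scores.get)
print(most_abnormal)
```

Because every feature is a named path type such as `("writes", "cites")`, a profile stays readable, which is what makes the later explanation step feasible: a rule learned over these features can be phrased in terms of relations a person recognizes.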


Contributions
• An unsupervised way to identify abnormal instances in an MRN
• Outperforms state-of-the-art algorithms by a large margin
• Generates understandable explanations
• Performs complex data analysis accurately and efficiently
• Generality and applicability


Q & A

Thank you for listening!