Novelty Detection in Data Streams
Profa. Elaine Faria UFU - 2018
• Slides based on the papers
– FARIA, ELAINE R.; GONÇALVES, ISABEL J. C. R.; DE CARVALHO, ANDRÉ C. P. L. F.; GAMA, JOÃO. Novelty detection in data streams. Artificial Intelligence Review, v. 45, p. 235-269, 2016.
– FARIA, ELAINE RIBEIRO; PONCE DE LEON FERREIRA CARVALHO, ANDRÉ CARLOS; GAMA, JOÃO. MINAS: multiclass learning algorithm for novelty detection in data streams. Data Mining and Knowledge Discovery, v. 30, p. 640-680, 2016.
– FARIA, ELAINE; GONCALVES, ISABEL; GAMA, JOAO; PONCE DE LEON FERREIRA CARVALHO, ANDRE. Evaluation of Multiclass Novelty Detection Algorithms for Data Streams. IEEE Transactions on Knowledge and Data Engineering, v. 27, p. 2961-2973, 2015.
Introduction
• Novelty Detection (ND) - Definitions
Novelty detection is concerned with identifying abnormal system behaviours and abrupt changes from one regime to another (Lee and Roberts 2008)
The recognition that an input differs in some respect from previous inputs (Perner 2008)
Novelty detection makes it possible to recognize novel concepts, which may indicate the appearance of a new concept, a change occurred in known concepts or the presence of noise (Gama 2010).
Introduction
• Novelty detection
– is useful in cases where an important class is under-represented in the training set
– is an important task, since, for many problems, we never know whether the currently available training data include all possible object classes
– allows the recognition of novel profiles (concepts) in unlabeled data
Introduction
• Novelty Detection - Challenges
– Concept drift
– Noise and outliers
– Recurring concepts
– Concept evolution
• Number of problem classes increases over time
Introduction
• Data stream applications for ND
– Intrusion detection
– Fraud detection
– Medical diagnosis
– Detection of interest regions in images
– Fault detection
– Spam filter
– Text classification
– ...
Introduction
• It is important to distinguish
– Anomaly detection
– Outlier detection
– Novelty detection
Introduction
• Novelty, anomaly and outlier detection all aim to find patterns that differ from the normal (usual) ones
– Anomaly and outlier detection convey the idea of an undesired pattern
– Novelty indicates an emergent or new concept that needs to be incorporated into the normal pattern
Novelty detection - Formalization of the Problem
Training set (Offline Phase)
$D_{tr} = \{(X_1, y_1), (X_2, y_2), \ldots, (X_m, y_m)\}$
$X_i$: vector of input attributes for the $i$-th example
$y_i$: target attribute, with $y_i \in Y_{tr}$ and $Y_{tr} = \{c_1, c_2, \ldots, c_L\}$
When new data arrive (Online Phase)
$Y_{all} = \{c_1, c_2, \ldots, c_L, \ldots, c_K\}$, $K > L$
Goal: classify $X_{new}$ in $Y_{all}$
Novelty detection - Phases
• Offline Phase
– Induces a classifier from a set of labeled examples → known concept about the problem
• Online Phase
– Classifies new unlabeled examples
– Identifies novelty patterns
– Updates the decision model
Offline Phase - Taxonomy
Offline Phase
• Learning task
– Unsupervised approaches
• Assume that all the examples in the training set belong to the normal concept
– Supervised approaches
• Use the labels of the examples to build the decision model
• The normal concept is composed of a set of different classes
Online Phase
• Tasks
– Classification of new examples
– Detection of novelty patterns
– Adaptation of the decision model
• Some algorithms update the decision model in an offline fashion
Online Phase
Classification
Online Phase
• Classification
– Verify whether a new example can be explained by the current decision model
– Approach 1
• Classify new examples only as normal or novelty
– Approach 2
• Treat the problem as a multiclass classification task
Online Phase
Classification - Taxonomy
Online Phase
• Classification with unknown label option
– Examples not explained by the current decision model are not immediately classified
• They are assigned an unknown profile
– They are put in a short-term memory for future analysis
• Used to update the decision model: extensions and novelty patterns
Online Phase
Detection of novelty patterns
Online Phase
• Detection of novelty patterns
– Uses unlabeled examples not explained by the current decision model to identify novelty patterns
– Anomaly detection
• The presence of a single example not explained by the model identifies anomalous behavior
– Novelty
• Composed of a set of cohesive and representative examples not explained by the decision model
Online Phase
Detection of novelty patterns: Taxonomy
Online Phase
Update of the decision model
Online Phase
• Update of the decision model
– Necessary to address concept drift and concept evolution
– Can be carried out with or without feedback
– Forgetting mechanisms
• Important strategy used to remove outdated concepts
Online Phase
Update of the decision model
Online Phase
• Update of the decision model: External Feedback
– Approach 1: external feedback
• Assume that the true label of all the examples will be available after a delay
• Unrealistic assumption for data streams
– Approach 2: active learning
• Ask the user for the label of a subset of the examples in the stream
– Approach 3: without feedback
• The decision model is updated without information about the true label of the examples
Online Phase
• Update of the decision model: Forgetting mechanism → Important to forget previous, outdated concepts
– Approach 1: Based on an ensemble of classifiers
• Train a new classifier and replace an old one
– Approach 2: Based on clusters
• Clusters that have not received new examples for a long time are removed
– Approach 3: Based on weight
• Reduce the weight of the old examples
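The cluster-based forgetting approach can be sketched as follows. This is a minimal illustration, assuming each cluster stores the timestamp of the last example it received; the names `Cluster`, `last_update` and `max_idle` are hypothetical and do not come from any specific algorithm.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    label: str
    last_update: int  # timestamp of the last example absorbed by the cluster


def forget_stale_clusters(clusters, now, max_idle):
    """Keep only clusters that received an example within the last max_idle timestamps."""
    return [c for c in clusters if now - c.last_update <= max_idle]


# Cluster "B" has been idle for 90 timestamps, so it is dropped from the model.
model = [Cluster("A", last_update=95), Cluster("B", last_update=10)]
model = forget_stale_clusters(model, now=100, max_idle=50)
```

The idle limit trades memory and stability against the risk of discarding a concept that later recurs.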
Detection of recurring concepts
• Recurring concepts: definition
– The class definitions may change when previous situations recur, in a periodic or random way, after some period of time (Elwell and Polikar 2011)
– Special type of concept drift where concepts that appeared in the past may recur in the future (Katakis et al. 2010)
Detection of recurring concepts
• Recurring contexts: Examples
– Climate change
– Electricity demand
– Buyer habits
– ...
Detection of recurring concepts
It would be a waste of effort to relearn an old concept from scratch for each recurrence (Widmer and Kubat 1996)
– In recurring contexts
• Instead of forgetting outdated concepts, these concepts should be saved and reexamined at some later time when they can improve the prediction performance in a cost-effective way
Detection of recurring concepts
• Systems that do not address recurring concepts treat them as novelty
– Undesirable effects
• Increase in the false alarm rate
• Increase in the human effort spent analyzing the false alarms
• Computational effort spent executing a novelty detection task and learning a class that was already learned
Detection of recurring concepts
• Approaches
– Approach 1: Use an auxiliary ensemble of classifiers that detects recurring classes
– Approach 2: Use c ensembles, one per class
• Each ensemble is never deleted, only updated
• c is the number of classes seen so far in the stream
– Approach 3: Use a sleep memory to store clusters that have not been used to classify new examples for a long time
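The sleep-memory idea (Approach 3) can be illustrated with a small sketch. It assumes clusters are hyperspheres stored as dictionaries with `center`, `radius`, `label` and `last_used` keys; these names, and the rule of waking the first matching cluster, are assumptions made for illustration only.

```python
import math


def split_active_and_sleeping(clusters, now, max_idle):
    """Move clusters unused for more than max_idle timestamps to the sleep memory."""
    active = [c for c in clusters if now - c["last_used"] <= max_idle]
    sleeping = [c for c in clusters if now - c["last_used"] > max_idle]
    return active, sleeping


def wake_up_matching(example, active, sleeping, now):
    """Reactivate the first sleeping cluster whose hypersphere contains the example."""
    for c in sleeping:
        if math.dist(example, c["center"]) <= c["radius"]:
            sleeping.remove(c)
            c["last_used"] = now
            active.append(c)
            return c["label"]
    return None  # no recurrence found: the example stays unknown
```

Checking the sleep memory before running novelty detection lets a recurring concept be recognized instead of being relearned as a new class.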
Treatment of Outliers
• Outliers
– Data that are isolated, sparse and not present in a representative number
• Novelty detection algorithms
– Look for a cohesive and representative set of examples
– Must handle noise and outliers, which can be confused with the appearance of a new concept or a change in the known concepts
Treatment of Outliers
• Approach for outlier treatment (used by the MCM, ECSMiner, MINAS and OLINDDA algorithms)
– Store the examples not explained by the current model in a temporary memory
– Cluster these examples
– Apply validation criteria to the clusters
• Examples of validation criteria: cohesiveness, representativeness, separability
• Invalid clusters are potential outliers
MINAS also proposes removing old examples that have stayed in the temporary memory for a long time
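The validation step can be sketched as below. This is only one possible reading: representativeness is taken as a minimum number of examples and cohesiveness as a bound on the mean distance to the centroid. The thresholds `min_examples` and `max_mean_dist` are hypothetical parameters, and each algorithm named above defines its own criteria.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length points."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def is_valid_cluster(points, min_examples, max_mean_dist):
    """Representative: enough examples. Cohesive: small mean distance to the centroid."""
    if len(points) < min_examples:
        return False
    c = centroid(points)
    mean_dist = sum(math.dist(p, c) for p in points) / len(points)
    return mean_dist <= max_mean_dist
```

Clusters that fail the check are left in the temporary memory as potential outliers rather than being promoted to novelty patterns.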
Examples of Novelty Detection Algorithms for Data Streams
• ECSMiner (Masud et al. 2011)
• OLINDDA (Spinosa 2009)
• MINAS (Faria 2016)
• MCM (Masud et al. 2010)
• CLAM (Al-Khateeb et al. 2012)
ECSMiner
• Supervised algorithm for concept drift and concept evolution
• The decision model is composed of an ensemble of classifiers
– It assumes that all examples will be labeled after a delay
– Each classification model is trained from a chunk of data
– The ensemble is composed of M models
– The ensemble is continuously updated
• The model with the highest prediction error is replaced by a new model
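The ensemble update can be sketched as follows. This is a simplified reading of the replacement rule, not the ECSMiner implementation: models are assumed to be plain callables, the error is measured on the most recent labeled chunk, and the function name `update_ensemble` is hypothetical.

```python
def update_ensemble(ensemble, new_model, labeled_chunk, max_size):
    """Add new_model; if the ensemble grows past max_size, drop the model with the highest error."""
    def error(model):
        wrong = sum(1 for x, y in labeled_chunk if model(x) != y)
        return wrong / len(labeled_chunk)

    candidates = ensemble + [new_model]
    if len(candidates) <= max_size:
        return candidates
    worst = max(candidates, key=error)  # highest prediction error on the chunk
    return [m for m in candidates if m is not worst]
```

Because the new model competes with the existing ones, a poorly trained newcomer can itself be the one discarded in this sketch.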
ECSMiner
• Assumptions
– After Tl timestamps the true label of the example will be available
– It is possible to wait up to Tc timestamps before making a decision about the classification of an example
Tc < Tl
ECSMiner
• Offline Phase
– Supervised
– Ensemble of classifiers
• Decision tree or KNN
• Online Phase
– Use the ensemble to classify new examples
– Store the examples not explained by the ensemble (f-outliers)
– Build clusters from the f-outliers using K-Means
– Calculate the q-NSC measure (q-neighbourhood silhouette coefficient)
– If most of the classifiers have a positive q-NSC → a novelty is detected
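The slides name q-NSC without giving its formula. The sketch below shows a silhouette-style coefficient in that spirit: it compares the mean distance from an f-outlier to its q nearest fellow f-outliers with the smallest mean distance to its q nearest examples of any known class. The exact definition should be checked against Masud et al. (2011); the function names and the dictionary layout are assumptions.

```python
import math


def mean_dist_to_q_nearest(x, points, q):
    """Mean distance from x to its q nearest neighbours among points."""
    nearest = sorted(math.dist(x, p) for p in points)[:q]
    return sum(nearest) / len(nearest)


def q_nsc(x, other_outliers, examples_by_class, q):
    """Score in [-1, 1]; positive means x is closer to the other f-outliers than to any known class."""
    d_out = mean_dist_to_q_nearest(x, other_outliers, q)
    d_min = min(mean_dist_to_q_nearest(x, pts, q) for pts in examples_by_class.values())
    denom = max(d_out, d_min)
    return 0.0 if denom == 0 else (d_min - d_out) / denom
```

A positive value suggests the f-outliers form a group that is cohesive and well separated from the known classes.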
OLINDDA
• Offline Phase
– Unsupervised
– Learn a decision model of the normal class
– The decision model is a set of clusters (k hyperspheres)
• Clustering algorithm: K-Means
OLINDDA
• Online Phase
– Unsupervised
– Use the decision model created in the offline phase to classify new examples as normal
– Examples not explained by the decision model are put in a short-term memory (unknown)
– Valid clusters of unknown examples are used to create the extension and novelty models
OLINDDA
[Figure: the three OLINDDA models: Normal, Extension and Novelty]
OLINDDA
[Figure: new examples plotted against the Normal, Extension and Novelty hyperspheres]
If a new example is inside the radius of one of the hyperspheres, classify it with the label of that hypersphere
OLINDDA
• If the example is not explained by any of the hyperspheres, it is labeled as unknown and stored in a short-term memory
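The classification rule of the last two slides can be sketched as follows. It is a minimal sketch, assuming each hypersphere is a `(center, radius, label)` tuple and that the first hypersphere containing the example decides its label; both choices are illustrative assumptions.

```python
import math


def classify(example, hyperspheres, short_term_memory):
    """Return the label of the first hypersphere containing the example, else store it as unknown."""
    for center, radius, label in hyperspheres:
        if math.dist(example, center) <= radius:
            return label
    short_term_memory.append(example)  # kept for later clustering
    return "unknown"
```

An implementation could instead pick the closest hypersphere when several contain the example; the slides do not say which rule is used.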
OLINDDA
If the number of examples in the short-term memory exceeds a threshold, cluster the examples using K-Means
Only valid clusters (cohesive and representative) are considered
[Figure: short-term memory → # examples > threshold → K-Means]
OLINDDA
• A new cluster is
– Extension
• In the neighbourhood of the normal model
– Novelty
• Distant from the normal model
MINAS
MultIclass learNing Algorithm for data Streams
• Offline Phase
– Learns a decision model based on the known concept about the problem
– Executed once
– Each class is represented by a set of clusters (hyperspheres)
• Online Phase
– Receives new examples and classifies them either as one of the known classes or as unknown
– Cohesive groups of unknown examples are used to detect new classes or extensions
MINAS - Offline Phase
MINAS - Offline Phase
• Micro-clusters: statistical summary (incremental)
N: number of examples
LS: linear sum of the examples
SS: squared sum of the examples
t: timestamp of the arrival of the last example classified by the micro-cluster
• Examples of clustering algorithms used in the Training Phase
– K-Means
– CluStream
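A micro-cluster summary of this kind can be sketched as below. The fields follow the (N, LS, SS, t) tuple from the slide; the class name, the `label` field and the radius definition (square root of the summed per-attribute variance) are assumptions, and MINAS may scale or define the radius differently.

```python
import math


class MicroCluster:
    """Incremental summary (N, LS, SS, t) of the examples absorbed so far."""

    def __init__(self, dim, label):
        self.n = 0
        self.ls = [0.0] * dim  # linear sum per attribute
        self.ss = [0.0] * dim  # squared sum per attribute
        self.t = None          # timestamp of the last absorbed example
        self.label = label

    def add(self, x, timestamp):
        """Absorb one example without storing it."""
        self.n += 1
        for d, v in enumerate(x):
            self.ls[d] += v
            self.ss[d] += v * v
        self.t = timestamp

    def centroid(self):
        return [s / self.n for s in self.ls]

    def radius(self):
        """Square root of the summed per-attribute variance (an assumed definition)."""
        var = sum(ss / self.n - (ls / self.n) ** 2 for ls, ss in zip(self.ls, self.ss))
        return math.sqrt(max(var, 0.0))
```

Because only sums are kept, the centroid and radius can be updated in constant time per example, which is what makes the summary suitable for streams.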
MINAS - Offline Phase
MINAS
• Online Phase
– To classify new examples
– To detect novelty patterns
– To update the decision model
MINAS - Classification
MINAS - Classification
• Classifying an example as unknown means one of the following
– The example is noise or an outlier and cannot be explained by any of the micro-clusters
• The example must be discarded
– The example represents a concept drift
• The example must be used to update the decision model
– The example represents a novelty pattern
• The example must be used to update the decision model
MINAS – Novelty detection and update
MINAS - Online Phase
MINAS - Online Phase
MINAS-Active Learning
• Used when the labels of a reduced set of examples are available
• Uses active learning techniques to select a representative set of examples to be labeled and used to update the decision model
• Main idea
– From time to time, select the centroids of the newly created micro-clusters as the examples to be labeled by the specialist
– Update the decision model with the new labels
Evaluation in Novelty Detection
Multiclass novelty detection data stream algorithms use binary evaluation measures
% of examples misclassified in the normal class
% of normal class examples wrongly classified as novelty
% of incorrect classifications
FP: # of examples from the known classes wrongly classified as novelty
FN: # of examples from the novel classes wrongly classified as known classes
FE: # of examples from known classes misclassified (other than FP)
N: # of examples in the stream
Nc: # of examples from the novel classes
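The three percentage measures appear in the transcript only as descriptions; their formulas did not survive extraction. The sketch below assumes the definitions commonly used with these counts in the ECSMiner line of work. The names `m_new`, `f_new` and `err` are assumptions and are not taken from the slides.

```python
def binary_nd_measures(fp, fn, fe, n, nc):
    """Return (m_new, f_new, err) as percentages, from the counts FP, FN, FE, N and Nc."""
    m_new = 100.0 * fn / nc           # novel-class examples missed (classified as known)
    f_new = 100.0 * fp / (n - nc)     # known-class examples wrongly flagged as novelty
    err = 100.0 * (fp + fn + fe) / n  # all misclassifications in the stream
    return m_new, f_new, err
```

For example, with FP = 5, FN = 10, FE = 15, N = 1000 and Nc = 100, ten of the hundred novel examples are missed and thirty of the thousand examples are misclassified overall.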
Evaluation in Novelty Detection
• Binary classification evaluation measures: Problems
– They treat novelty detection as a binary classification task
• It is a multiclass classification task
– They do not consider the unknown examples separately
– They do not consider that different novelty patterns can appear
– They evaluate only the final confusion matrix
Evaluation in Novelty Detection (Faria et al. 2013)
• Confusion matrix
– Not square (rectangular)
– Number of columns increases over time
– Novelty patterns do not have a direct match with the problem classes
– Presence of unknown examples
Evaluation in Novelty Detection (Faria et al. 2013)
• Rectangular Confusion Matrix
– Problem
• Difficult to define hits and errors
• Matrix is not square
• Each novelty pattern needs to be assigned to only one class (one class may be associated with one or more novelty patterns)
– Solution
• Representation using a bipartite graph
• Based on the Hungarian Method
Evaluation in Novelty Detection (Faria et al. 2013)
[Figure: confusion matrix, the corresponding bipartite graph, and the resulting bipartite subgraph]
Evaluation in Novelty Detection (Faria et al. 2013)
• Unknown examples– Problem
• How to consider the unknown examples? – Hits or Errors?
– Solution• Neither hits nor errors• Unknown examples should be computed
separately
Evaluation in Novelty Detection (Faria et al. 2013)
• Unknown examples
– ACCExp + ErrExp = 1
– ACCExp / ErrExp: accuracy / error considering only the examples explained by the model
– Unk_i: number of examples from class C_i classified as unknown
– ExC_i: number of examples from class C_i
– M: number of classes
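The formula that these symbols belong to did not survive in the transcript. One reading consistent with the definitions is the per-class fraction of unknown examples averaged over the M classes, that is, (1/M) · Σ_i (Unk_i / ExC_i). The sketch below computes that quantity in Python; the averaging form, the function name `unknown_rate`, and the counts are assumptions, not taken from the slide.

```python
def unknown_rate(unknown_per_class, examples_per_class):
    """Average, over the M classes, of Unk_i / ExC_i.

    unknown_per_class[c]  -> Unk_i: examples of class c labelled unknown
    examples_per_class[c] -> ExC_i: all examples of class c
    Assumption: every class in examples_per_class has at least one example.
    """
    m = len(examples_per_class)
    if m == 0:
        raise ValueError("at least one class is required")
    total = 0.0
    for label, n_examples in examples_per_class.items():
        total += unknown_per_class.get(label, 0) / n_examples
    return total / m


# Illustrative counts only.
rate = unknown_rate({"C1": 10, "C2": 5}, {"C1": 100, "C2": 50})
print(round(rate, 3))  # 0.1
```

Averaging per class rather than pooling all examples keeps a small class with many unknowns from being hidden by a large class with few.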
Evaluation in Novelty Detection (Faria et al. 2013)
• Use the Combined Error Rate (CER) evaluation measure to calculate the classification error rate
• Consider only the examples not classified as unknown
– #Ex′C_i: number of examples from class C_i
– #Ex′: number of examples
– FPR_i: false positive rate
– FNR_i: false negative rate
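The CER formula itself is missing from the transcript. A form consistent with the symbols defined above weights each class's false positive and false negative rates by its share of the examples: CER = (1/2) · Σ_i (#Ex′C_i / #Ex′) · (FPR_i + FNR_i). The Python sketch below assumes that form and assumes a square confusion matrix obtained after novelty patterns were mapped to classes and unknown examples were removed; `combined_error_rate` and the counts are hypothetical.

```python
def combined_error_rate(confusion):
    """CER under the assumed weighted form.

    confusion[i][j]: examples of true class i predicted as class j,
    with unknown examples already excluded (square matrix).
    """
    m = len(confusion)
    total = sum(sum(row) for row in confusion)  # #Ex'
    if total == 0:
        raise ValueError("confusion matrix has no examples")
    cer = 0.0
    for i in range(m):
        tp = confusion[i][i]
        fn = sum(confusion[i]) - tp
        fp = sum(confusion[r][i] for r in range(m)) - tp
        tn = total - tp - fn - fp
        fnr = fn / (tp + fn) if (tp + fn) else 0.0
        fpr = fp / (fp + tn) if (fp + tn) else 0.0
        weight = (tp + fn) / total  # #Ex'C_i / #Ex'
        cer += 0.5 * weight * (fpr + fnr)
    return cer


# Illustrative two-class matrix.
print(round(combined_error_rate([[8, 2], [1, 9]]), 3))  # 0.15
```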
Evaluation in Novelty Detection (Faria et al. 2013)
• Evaluation over time: Problem
– In an evolving data stream, it is not sufficient to extract information only from the final confusion matrix
• Solution
– Plot a 2D graph
• X represents the data timestamps
• Y represents the evaluation measure values
– Plot the information about errors and unknown examples
– Identify the timestamps at which a new concept was detected
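A minimal Python sketch of how the points for such a plot could be produced: the stream is cut into fixed-size windows and, for each window, the error rate over explained examples and the fraction of unknown examples are recorded against the window's last timestamp. The fixed window, the outcome labels "hit", "error" and "unknown", and the function name `metrics_over_time` are assumptions made for illustration.

```python
def metrics_over_time(records, window):
    """Return (timestamp, error_rate, unknown_rate) per window.

    records: (timestamp, outcome) pairs ordered by timestamp, where
    outcome is "hit", "error" or "unknown" (hypothetical labels).
    error_rate is computed over explained (non-unknown) examples only.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    records = list(records)
    series = []
    for start in range(0, len(records), window):
        chunk = records[start:start + window]
        unknown = sum(1 for _, outcome in chunk if outcome == "unknown")
        errors = sum(1 for _, outcome in chunk if outcome == "error")
        explained = len(chunk) - unknown
        error_rate = errors / explained if explained else 0.0
        unknown_rate = unknown / len(chunk)
        series.append((chunk[-1][0], error_rate, unknown_rate))
    return series


# Illustrative stream of eight classified examples.
stream = [(1, "hit"), (2, "error"), (3, "unknown"), (4, "hit"),
          (5, "hit"), (6, "hit"), (7, "unknown"), (8, "unknown")]
for timestamp, err, unk in metrics_over_time(stream, window=4):
    print(timestamp, round(err, 2), round(unk, 2))
```

The first element of each tuple is the X value and the other two are Y values for two curves. The timestamps at which the model reported a new concept can then be drawn as vertical markers on the same axes.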
References
• Masud M, Gao J, Khan L, Han J, Thuraisingham BM (2011) Classification and novel class detection in concept-drifting data streams under time constraints. IEEE Transactions on Knowledge and Data Engineering 23(6):859–874
• Spinosa EJ, Carvalho ACPLF, Gama J (2009) Novelty detection with application to data streams. Intelligent Data Analysis 13(3):405–422
• Faria ER, Carvalho ACPLF, Gama J (2016) MINAS: multiclass learning algorithm for novelty detection in data streams. Data Mining and Knowledge Discovery 30:640–680
• Masud MM, Chen Q, Khan L, Aggarwal CC, Gao J, Han J, Thuraisingham BM (2010) Addressing concept evolution in concept-drifting data streams. In: Proceedings of the 10th IEEE international conference on data mining (ICDM’10), pp 929–934
• Al-Khateeb TM, Masud MM, Khan L, Thuraisingham B (2012) Cloud guided stream classification using class-based ensemble. In: Proceedings of the 2012 IEEE 5th international conference on cloud computing (CLOUD’12). IEEE Computer Society, Washington, DC, USA, pp 694–701
• Elwell R, Polikar R (2011) Incremental learning of concept drift in nonstationary environments. IEEE Transactions on Neural Networks 22(10):1517–1531
• Katakis I, Tsoumakas G, Vlahavas I (2010) Tracking recurring contexts using ensemble classifiers: an application to email filtering. Knowledge and Information Systems 22(3):371–391
• Widmer G, Kubat M (1996) Learning in the presence of concept drift and hidden contexts. Machine Learning 23(1):69–101