Novelty Detection Through Semantic Context Modelling
DESCRIPTION
Novelty Detection Through Semantic Context Modelling. Pattern Recognition 45 (2012) 3439–3450. Suet-Peng Yong, Jeremiah D. Deng, Martin K. Purvis.
TRANSCRIPT
![Page 1: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/1.jpg)
NOVELTY DETECTION THROUGH
SEMANTIC CONTEXT MODELLING
Pattern Recognition 45 (2012) 3439–3450
Suet-Peng Yong, Jeremiah D. Deng, Martin K. Purvis
![Page 2: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/2.jpg)
The geographical database contains many objects as well as scenes, and
we cannot compare only their properties;
we need to set up different criteria
to detect or discover contextual relationships between objects.
![Page 3: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/3.jpg)
Novelty detection is an important functionality that has found many applications in
information retrieval and processing.
The framework starts with image segmentation,
followed by feature extraction and
classification of the image blocks extracted from the image segments.
SEMANTIC CONTEXT WITHIN THE SCENE
![Page 4: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/4.jpg)
Contextual knowledge can be obtained from the nearby image data and
location of other objects.
Semantic context (probabilistic): an object's co-occurrence with other objects
and its occurrence in scenes.
Basic information is obtained from training data
or from an external knowledge base.
![Page 5: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/5.jpg)
Context information comes from the global and the local image level.
Contextual interactions:
- Local interactions: pixel, region and object interactions
- Global interactions: object-scene interactions
The goal: integrating context.
![Page 6: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/6.jpg)
Pixel-based image similarity
Let f and g be two gray-value image functions. Their pixel-wise distance is

d(f, g) = (1/n) Σ_i Σ_j (f(i, j) − g(i, j))²

Histogram-based image similarity
Similar images have similar histograms.
Warning: different images can have similar histograms.

d(H(f), H(g)) = Σ_{i=0}^{255} (H(f)(i) − H(g)(i))²
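The two distances above can be sketched in Python. The toy images are hypothetical data chosen to make the warning concrete: the two images differ at every pixel but share exactly the same gray-value histogram.

```python
import numpy as np

# Toy 4x4 gray-value "images" (hypothetical data, not from the paper).
f = np.array([[10, 10, 200, 200]] * 4, dtype=float)
g = np.array([[200, 200, 10, 10]] * 4, dtype=float)

def pixel_distance(f, g):
    # d(f, g) = (1/n) * sum over pixels of (f(i,j) - g(i,j))^2
    n = f.size
    return np.sum((f - g) ** 2) / n

def histogram_distance(f, g, bins=256):
    # d(H(f), H(g)) = sum_{i=0}^{255} (H_f(i) - H_g(i))^2
    hf, _ = np.histogram(f, bins=bins, range=(0, 256))
    hg, _ = np.histogram(g, bins=bins, range=(0, 256))
    return np.sum((hf - hg) ** 2)

print(pixel_distance(f, g))      # large: every pixel differs
print(histogram_distance(f, g))  # 0: identical histograms
```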
![Page 7: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/7.jpg)
Local interactions
Co-occurrence matrices (spatial context)
In image processing, co-occurrence matrices were proposed by
Haralick as a texture feature representation (local or low-level context).
They are of size 256 × 256, with co-occurrences counted in different directions
and at different distances.
The matrix is sparse; what matters is the arrangement of values inside it.
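A minimal sketch of a gray-level co-occurrence matrix for one direction/distance (horizontal neighbour at distance 1); Haralick's texture measures are statistics computed from such matrices. The 3-level toy image is illustrative only.

```python
import numpy as np

def glcm(img, dx=1, dy=0, levels=256):
    """Count co-occurring gray-level pairs (img[y, x], img[y+dy, x+dx])."""
    m = np.zeros((levels, levels), dtype=np.int64)
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                m[img[y, x], img[y2, x2]] += 1
    return m

img = np.array([[0, 0, 1],
                [0, 1, 1],
                [2, 2, 2]], dtype=np.int64)
m = glcm(img, dx=1, dy=0, levels=3)
print(m)  # e.g. m[0, 1] counts how often a 1 appears right of a 0
```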
![Page 8: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/8.jpg)
Global interactions
The semantic co-occurrence matrices then undergo binarization
and principal component analysis for dimension reduction,
forming the basis for constructing one-class models on scene categories.
An advantage of this approach is that it can be used for novelty detection
and scene classification at the same time.
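The binarization-plus-PCA step can be sketched as follows; the data is random stand-in for flattened co-occurrence matrices, and PCA is done via SVD on the centred data rather than any particular library routine.

```python
import numpy as np

# Hypothetical co-occurrence matrices flattened to row vectors (one per image).
rng = np.random.default_rng(0)
X = rng.poisson(0.5, size=(20, 64)).astype(float)

# Step 1: binarize -- all non-zero entries become 1.
Xb = (X > 0).astype(float)

# Step 2: PCA via SVD on the centred data, keeping k components.
k = 5
Xc = Xb - Xb.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
codes = Xc @ Vt[:k].T   # reduced k-dimensional codes for the one-class models
print(codes.shape)      # (20, 5)
```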
![Page 9: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/9.jpg)
Scenes with similar objects but different context can be either normal or novel: (a) ‘normal’ scene, (b) ‘novel’ scene.
![Page 10: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/10.jpg)
From a statistical point of view, novelty detection is equivalent to
anomaly or outlier detection.
Type 1 determines outliers without prior knowledge, analogous to unsupervised clustering.
Type 2 models both normality and abnormality with pre-trained samples, analogous to supervised classification.
Type 3 is semi-supervised recognition: only normality is
modelled, but the algorithm learns to recognize abnormality.
![Page 11: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/11.jpg)
This approach focuses on semantic analysis of
individual scenes and then employs statistical analysis to detect
novelty.
Anomaly detection in images has been explored in specific domains such
as biomedicine and geography.
WE ARE INTERESTED NOT IN PIXEL- OR REGION-LEVEL NOVELTY,
BUT IN THE CONTEXT OF THE OVERALL SCENE.
![Page 12: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/12.jpg)
Computational framework
The computational framework for image novelty detection and
scene classification.
![Page 13: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/13.jpg)
JSEG – segmentation of colour-texture regions in images.
The essential idea of JSEG is to separate the segmentation process
into two independently processed stages,
colour quantization and spatial segmentation, which makes it both efficient and effective.
![Page 14: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/14.jpg)
In JSEG,
colours in the image are first quantized to a set of representative classes
that are used to separate the regions in the image.
Image pixel colours are then replaced by their corresponding colour-class
labels to form a class map of the image.
Two parameters are involved in the algorithm:
a colour quantization threshold and a merge threshold.
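The quantize-then-label idea can be illustrated with a minimal k-means-style sketch; JSEG's actual quantizer is more sophisticated, so this only shows how a class map arises from quantized colours.

```python
import numpy as np

def quantize_colours(img, k):
    """Quantize pixel colours to k classes and return the class map
    (illustrative k-means sketch, not JSEG's quantizer)."""
    h, w, c = img.shape
    pixels = img.reshape(-1, c).astype(float)
    # Deterministic init: the first k distinct pixel colours.
    centres = np.unique(pixels, axis=0)[:k].copy()
    for _ in range(10):
        # Assign every pixel to its nearest colour class...
        d = np.linalg.norm(pixels[:, None, :] - centres[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # ...then move each class centre to the mean of its pixels.
        for j in range(k):
            if np.any(labels == j):
                centres[j] = pixels[labels == j].mean(axis=0)
    # Replacing each pixel by its colour-class label gives the class map.
    return labels.reshape(h, w)

# Toy image: dark left half, bright right half.
img = np.zeros((4, 4, 3))
img[:, 2:] = 255.0
class_map = quantize_colours(img, k=2)
print(class_map)  # two colour classes, one per half
```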
![Page 15: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/15.jpg)
The segmentation was done using the JSEG package,
with the colour quantization threshold set to 250
and the merge threshold to 0.4.
Segmenting an image into homogeneous regions facilitates
detection or classification of the objects.
![Page 16: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/16.jpg)
Each segmented image is further tiled into b × b pixel blocks, where b ∈ ℕ.
Feature extraction and block labelling. Note: the block size matters for the texture features!
The smallest segment is 31 × 67 pixels, so b = 25.
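Tiling a segment into b × b blocks can be sketched as below; partial blocks at the right/bottom edges are simply dropped, which is one plausible reading (the paper does not specify edge handling here).

```python
import numpy as np

def tile_blocks(img, b):
    """Tile a 2-D image into non-overlapping b x b pixel blocks,
    dropping partial blocks at the right/bottom edges."""
    h, w = img.shape
    blocks = [img[y:y + b, x:x + b]
              for y in range(0, h - b + 1, b)
              for x in range(0, w - b + 1, b)]
    return np.stack(blocks)

# A hypothetical 100 x 75 segment tiled with b = 25.
img = np.arange(100 * 75).reshape(100, 75)
blocks = tile_blocks(img, 25)
print(blocks.shape)  # (12, 25, 25): 4 rows x 3 columns of blocks
```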
![Page 17: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/17.jpg)
The main disadvantage of the RGB colour space in applications with
natural images is the high correlation between its components:
about 0.78 for r_BR (the cross-correlation between the B and R channels), 0.98 for r_RG and
0.94 for r_GB.
Another problem is perceptual non-uniformity: the low correlation
between the perceived difference of two colours and
their Euclidean distance in RGB space.
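The channel correlations can be measured directly. The synthetic "image" below is hypothetical: its channels share a common luminance signal plus small independent noise, which mimics why natural-image RGB channels correlate so strongly.

```python
import numpy as np

# Hypothetical natural-image model: shared luminance plus channel noise.
rng = np.random.default_rng(1)
lum = rng.uniform(0, 255, size=10_000)
r = lum + rng.normal(0, 10, size=lum.size)
g = lum + rng.normal(0, 10, size=lum.size)
b = lum + rng.normal(0, 10, size=lum.size)

r_rg = np.corrcoef(r, g)[0, 1]   # cross-correlation between R and G
r_gb = np.corrcoef(g, b)[0, 1]
r_br = np.corrcoef(b, r)[0, 1]
print(round(r_rg, 2), round(r_gb, 2), round(r_br, 2))  # all close to 1
```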
![Page 18: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/18.jpg)
LUV histograms
are found to be robust to resolution and rotation changes.
The LUV colour space models human perception of colour similarity very well, and is also
machine independent.
Its main goal is to provide a perceptually uniform space.
![Page 19: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/19.jpg)
The LUV channels have different ranges:
L (0 to 100), U (−134 to 220) and V (−140 to 122).
Using the same bin interval gives 20 bins for the L channel, 70 bins for the U channel and 52 bins for the V channel;
together with a standard-deviation feature,
the LUV histogram feature code thus has 143 dimensions.
Haralick texture features: a total of 13 statistical measures can
be calculated for each co-occurrence matrix in four directions;
taking μ and σ over directions gives 26 dimensions, for 169 dimensions in total.
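A sketch of the 143-D LUV histogram code with the bin counts from the slide; treating the extra dimension as a single standard-deviation feature is our reading of the slide, not something the transcript states explicitly.

```python
import numpy as np

# Per-channel (min, max, bins) from the slide: 20 (L), 70 (U), 52 (V).
RANGES = (("L", 0, 100, 20), ("U", -134, 220, 70), ("V", -140, 122, 52))

def luv_histogram_code(L, U, V):
    """142 normalized histogram bins + 1 std-dev feature = 143-D code."""
    parts = []
    for chan, (_, lo, hi, nbins) in zip((L, U, V), RANGES):
        h, _ = np.histogram(chan, bins=nbins, range=(lo, hi))
        parts.append(h / chan.size)      # normalized histogram
    parts.append([np.std(L)])            # assumed extra std-dev feature
    return np.concatenate(parts)

rng = np.random.default_rng(0)
code = luv_histogram_code(rng.uniform(0, 100, 500),
                          rng.uniform(-134, 220, 500),
                          rng.uniform(-140, 122, 500))
print(code.shape)  # (143,)
```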
![Page 20: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/20.jpg)
The Euclidean distance between two colours in the LUV colour space is
strongly correlated with human visual perception.
L gives luminance, while U and V give the chromaticity values of a colour image.
A positive value of U indicates prominence of the red component;
a negative value of V indicates prominence of the green component.
![Page 21: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/21.jpg)
Edge Histogram Descriptor; Gabor filtering features.
![Page 22: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/22.jpg)
Block labelling
Classification was done on the segments and also on the image blocks,
with similar feature extraction; the classifier employed is the
nearest-neighbour (1-NN) classifier.
We can now adopt the LUV and Haralick features and concatenate
them together, giving a feature vector of
169 dimensions to represent an image block.
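1-NN block labelling reduces to a nearest-vector lookup. The two clusters of 169-D features below are synthetic stand-ins for labelled training blocks, not the paper's data.

```python
import numpy as np

def nn1_classify(train_X, train_y, query):
    """Label a query feature vector with the class of its nearest
    training vector (1-NN, Euclidean distance)."""
    d = np.linalg.norm(train_X - query, axis=1)
    return train_y[d.argmin()]

# Hypothetical 169-D block features for two object classes.
rng = np.random.default_rng(0)
grass = rng.normal(0.0, 0.1, size=(30, 169))
sky = rng.normal(1.0, 0.1, size=(30, 169))
X = np.vstack([grass, sky])
y = np.array(["grass"] * 30 + ["sky"] * 30)

print(nn1_classify(X, y, np.full(169, 0.95)))  # nearest cluster: "sky"
```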
![Page 23: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/23.jpg)
Semantic context modelling
To model the semantic context within a scene, we further generate a
block-label co-occurrence matrix (BLCM) within a distance
threshold R.
The co-occurrence statistics are gathered across the entire image and
normalized by the total number of image blocks.
Obviously, variation in object sizes will affect the matrix values.
To reduce this effect, one option is to binarize the values of the BLCM
elements, with all non-zero elements set to 1.
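A BLCM sketch over a grid of block labels; using a symmetric Chebyshev neighbourhood of radius R is an assumption on our part (the paper only says "within a distance threshold R").

```python
import numpy as np

def blcm(label_grid, n_classes, R=1, binarize=True):
    """Block-label co-occurrence matrix: count label pairs within
    Chebyshev distance R, normalize by the number of blocks, and
    optionally binarize (non-zero -> 1)."""
    h, w = label_grid.shape
    m = np.zeros((n_classes, n_classes), dtype=float)
    for y in range(h):
        for x in range(w):
            for dy in range(-R, R + 1):
                for dx in range(-R, R + 1):
                    if dy == 0 and dx == 0:
                        continue
                    y2, x2 = y + dy, x + dx
                    if 0 <= y2 < h and 0 <= x2 < w:
                        m[label_grid[y, x], label_grid[y2, x2]] += 1
    m /= label_grid.size
    return (m > 0).astype(float) if binarize else m

# 0 = 'sky', 1 = 'land': a row of sky blocks above a row of land blocks.
grid = np.array([[0, 0, 0],
                 [1, 1, 1]])
m = blcm(grid, n_classes=2)
print(m)  # the non-zero 'sky'-'land' entry records the adjacency
```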
![Page 24: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/24.jpg)
Image blocks in an ‘elephant’ image and its corresponding co-occurrence matrix. The matrix is read row-column; e.g., a ‘sky’-‘land’ entry of ‘1’ indicates there is a ‘sky’ block above (or to the left of) a ‘land’ block.
![Page 25: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/25.jpg)
The dimension of the BLCM depends on the number of object classes in the
knowledge base or database. The matrix is sparse; its rows are concatenated
into a 1-D feature vector and PCA is applied to reduce the dimensionality.
Building scene classifiers
This is not a typical multi-class classification scenario.
To build a one-class classifier for each of the scene types,
the classifiers need only normally labelled images for training.
The method is called ‘Multiple One-class Classification with Distance
Thresholding’ (MOC-DT).
![Page 26: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/26.jpg)
![Page 27: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/27.jpg)
Testing of MOC-DT:
1. Given a query image, calculate its BLCM code and its distance towards
each image group.
2. If the distance exceeds every group’s threshold, label the image as ‘novel’; otherwise,
assign the scene label c of the nearest group to the image.
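A minimal MOC-DT sketch under stated assumptions: each scene class is modelled by the mean of its training codes, and the distance threshold is fitted as μ_d + k·σ_d over training distances (the slide does not give the threshold formula, so this choice and the value of k are hypothetical).

```python
import numpy as np

def fit_moc_dt(codes_by_class, k=2.0):
    """One model per scene class: class mean plus an assumed
    mean + k*std distance threshold fitted on training distances."""
    models = {}
    for label, codes in codes_by_class.items():
        centre = codes.mean(axis=0)
        d = np.linalg.norm(codes - centre, axis=1)
        models[label] = (centre, d.mean() + k * d.std())
    return models

def classify(models, code):
    """Return the nearest in-threshold class label, or 'novel' if the
    query exceeds every class threshold."""
    best, best_d = "novel", np.inf
    for label, (centre, thresh) in models.items():
        d = np.linalg.norm(code - centre)
        if d <= thresh and d < best_d:
            best, best_d = label, d
    return best

# Hypothetical 10-D BLCM/PCA codes for two scene classes.
rng = np.random.default_rng(0)
models = fit_moc_dt({
    "cow": rng.normal(0.0, 0.1, size=(40, 10)),
    "zebra": rng.normal(1.0, 0.1, size=(40, 10)),
})
print(classify(models, np.full(10, 0.0)))   # in-distribution: "cow"
print(classify(models, np.full(10, 10.0)))  # far from both: "novel"
```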
![Page 28: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/28.jpg)
Experiments and results
The normal image set consists of 335 normal wildlife scenes
taken from Google Image Search and Microsoft Research
(the ‘cow’ image set).
Wildlife images usually feature one type of animal in the scene, with a
few other object classes in the background forming the semantic
context.
![Page 29: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/29.jpg)
Experiments and results
There are six scene types (each followed by the number of
instances in each type):
‘cow’ (51), ‘dolphin’ (57), ‘elephant’ (65), ‘giraffe’ (50), ‘penguin’ (55) and
‘zebra’ (57).
The distribution of the number of images across the different scene types is
roughly even.
The background objects belong to eight object classes:
‘grass’, ‘land’, ‘mount’, ‘rock’, ‘sky’, ‘snow’, ‘trees’ and ‘water’.
![Page 30: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/30.jpg)
Animal and background objects: there are 14 object classes in total.
![Page 31: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/31.jpg)
![Page 32: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/32.jpg)
43 ‘novel’ images
![Page 33: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/33.jpg)
After segmentation, 65,000 image blocks are obtained in total, and
these are used to train the classifiers.
30 ‘normal’ images together with the 43 ‘novel’ images make up a testing
set of 73 images.
For the classification of image blocks, an average accuracy of 84.6% is
achieved in a 10-fold cross-validation using a 1-NN classifier.
![Page 34: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/34.jpg)
![Page 35: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/35.jpg)
90% of all image blocks are set aside to train a classifier each
time, which then labels the image blocks of the testing images.
Results
The semantic context of each image scene is calculated, then undergoes
binarization and PCA.
The BLCM data has a higher dimensionality (196) than the number of
images in each scene type (50–65).
![Page 36: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/36.jpg)
PCA is conducted to reduce the dimension of BLCM.
![Page 37: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/37.jpg)
The 2-D PCA projection of BLCM data for ‘normal’ images shown with their respective classes.
![Page 38: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/38.jpg)
Methods compared:
- Global scene descriptor (GSD)
- Local binary pattern (LBP)
- Gist [34]
- BLCM
- BLCM with PCA (BLCM/PCA)
- Binary BLCM (B-BLCM)
- Binary BLCM with PCA (B-BLCM/PCA)
![Page 39: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/39.jpg)
The fitted Gaussian curve is displayed along with the data points and the threshold line.
![Page 40: NOVELTY DETECTION THROUGH SEMANTIC CONTEXT MODELLING](https://reader033.vdocuments.us/reader033/viewer/2022051517/568157f3550346895dc57052/html5/thumbnails/40.jpg)
Thanks for your attention