
Spatial Coordinate Coding To Reduce Histogram Representations, Dominant Angle And Colour Pyramid Match

P. Koniusz, K. Mikolajczyk

CVSSP, University of Surrey, UK

{P.Koniusz, K.Mikolajczyk}@surrey.ac.uk

September 11, 2011

P. Koniusz, K. Mikolajczyk (CVSSP) Spatial Coordinate Coding September 11, 2011 1 / 13


Introduction

Recognition approach (Bag of Words)

[Figure: Bag-of-Words recognition pipeline in four stages: 1. Feature extraction (detect key-points, compute descriptors), 2. Visual vocabulary (cluster descriptors), 3. Mid-level features (build histograms, average or max pooling), 4. Classification (kernel + SVM or KDA). Three variants are shown: S2-spatial pyramid match (pooling per pyramid level over LX and LY), S1-spatial pyramid match (joint clustering, pooling per pyramid level over LY only) and pyramid match removed (joint clustering, a single pooling step). Key-points are annotated with their coordinates and descriptors, (x_n, y_n), d_n or (x_n), d_n.]

Spatial Pyramid Match [S. Lazebnik, 2006] is at the heart of modern object category recognition and exploits spatial bias in images

Mid-level feature representations result from mapping low-level features (e.g. descriptors) to a given vocabulary space

Increasing the number of quantisation levels results in extremely long histogram vectors of 200K or more elements
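To make the growth in vector size concrete, the following is a minimal sketch of spatial pyramid pooling. It assumes a regular layout with 2^l x 2^l cells at level l and key-point coordinates normalised to [0, 1); the function name `spatial_pyramid_pool` and the toy sizes are illustrative and are not taken from the slides.

```python
import numpy as np

def spatial_pyramid_pool(H, xy, levels=3, pool="avg"):
    """Pool mid-level features H (N x K) over a regular spatial pyramid.

    xy holds key-point coordinates normalised to [0, 1), shape N x 2.
    Level l splits the image into 2**l x 2**l cells (an assumed layout).
    The result has K * sum(4**l for l in range(levels)) elements.
    """
    N, K = H.shape
    out = []
    for l in range(levels):
        cells = 2 ** l
        # Cell index of every key-point along x and y at this level.
        ix = np.minimum((xy[:, 0] * cells).astype(int), cells - 1)
        iy = np.minimum((xy[:, 1] * cells).astype(int), cells - 1)
        for cy in range(cells):
            for cx in range(cells):
                mask = (ix == cx) & (iy == cy)
                if not mask.any():
                    out.append(np.zeros(K))      # empty cell
                elif pool == "avg":
                    out.append(H[mask].mean(axis=0))
                else:
                    out.append(H[mask].max(axis=0))
    return np.concatenate(out)

rng = np.random.default_rng(0)
H = rng.random((500, 40))    # toy mid-level features, K = 40 visual words
xy = rng.random((500, 2))    # toy key-point positions
v = spatial_pyramid_pool(H, xy, levels=3)
print(v.shape)               # (840,) = 40 * (1 + 4 + 16)
```

Under this assumed layout every extra level multiplies the number of cells by four, so the pooled vector grows much faster than the vocabulary itself.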


Introduction

Aim

To propose a new joint appearance and spatial representation

To reduce the resulting vector sizes and therefore both computational and memory requirements

To investigate which pooling modalities (spatial, dominant angle, scale, colour bias) benefit from multiple levels of quantisation

Bias in images (Spatial Pyramid Match)

[Figure: example scene annotated with the labels sky, trees, fence, trunk, ship and grass.]

The coordinate set $X_s$ of an object $s$ introduces spatial bias: $p(s|\vec{x}) \ge p(s)$ for $\vec{x} \in X_s$


Introduction

Bias in images (Dominant Edge Orientation)

[Figure: example scene annotated with the labels sky, trees, fence, trunk, ship and grass.]

Trunks $t$ remain largely vertical, i.e. their orientations fall in a set $\Theta_t$: $p(t|\theta) \ge p(t)$ if $\theta \in \Theta_t$

Bias in images (Dominant Colours)

[Figure: example scene annotated with the labels sky, trees, fence, trunk, ship and grass.]

Foliage $f$ is of a limited colour set $C_f$, thus $p(f|\vec{c}) \ge p(f)$ if $\vec{c} \in C_f$


Spatial Coordinate Coding for Soft Assignment

Descriptor to mid-level features mapping:

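$\vec{h}_n = f(\vec{x}_n), \quad n = 1, ..., N$

$\vec{x}_n \in X$ - image descriptors
$\vec{h}_n$ - mid-level features

Mid-level features are Component Membership Probabilities of GMM:

$h_{nk} = p(\vec{m}_k | \vec{x}_n) = \frac{g(\vec{x}_n; \vec{m}_k, \sigma)}{\sum_{k'=1}^{K} g(\vec{x}_n; \vec{m}_{k'}, \sigma)}$

$\vec{m}_k \in M$ - visual words
$\sigma$ - model parameter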

Average (or maximum) pooling operation is performed on columns of matrix $H_{N \times K}$
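The membership probabilities and the pooling step can be sketched as follows. This is an illustration only: it assumes that g is an isotropic Gaussian with a shared width sigma, and the function name `soft_assign` and the toy data are hypothetical.

```python
import numpy as np

def soft_assign(X, M, sigma):
    """Component membership probabilities h_nk = p(m_k | x_n).

    X: N x D image descriptors, M: K x D visual words, sigma: kernel width.
    g is taken to be an isotropic Gaussian (an assumption of this sketch).
    """
    # Squared Euclidean distance between every descriptor and every word.
    d2 = ((X[:, None, :] - M[None, :, :]) ** 2).sum(axis=2)
    logits = -d2 / (2.0 * sigma ** 2)
    # Subtract the row maximum before exponentiating for numerical stability.
    logits -= logits.max(axis=1, keepdims=True)
    G = np.exp(logits)
    # Normalise over the K visual words so that each row sums to one.
    return G / G.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))   # toy descriptors
M = rng.normal(size=(5, 8))     # toy vocabulary with K = 5 words
H = soft_assign(X, M, sigma=1.0)

h_avg = H.mean(axis=0)          # average pooling over the columns of H
h_max = H.max(axis=0)           # maximum pooling over the columns of H
```

Because every Gaussian shares the same sigma, any normalisation constant of g cancels in the ratio, so the unnormalised form is sufficient here.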

We assume independence of visual appearance and spatial bias and code both modalities as a joint distribution (key idea):

$g'_\alpha(n, k) = \underbrace{g[(1-\alpha)\vec{x}_n; (1-\alpha)\vec{m}_k, \sigma']}_{\text{visual term}} \cdot \underbrace{g(\alpha\vec{x}'_n; \alpha\vec{m}'_k, \sigma')}_{\text{spatial term}}$
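The product of the visual and spatial terms can be checked numerically. The sketch below assumes that g is an unnormalised isotropic Gaussian and that the spatial descriptor holds normalised (x, y) positions; the names `gauss` and `joint_term` are illustrative. Under that assumption the product equals a single Gaussian evaluated on the concatenated, rescaled vectors.

```python
import math
import numpy as np

def gauss(x, m, sigma):
    """Unnormalised isotropic Gaussian g(x; m, sigma) (assumed form of g)."""
    return math.exp(-float(np.sum((x - m) ** 2)) / (2.0 * sigma ** 2))

def joint_term(x, m, x_sp, m_sp, alpha, sigma):
    """g'_alpha(n, k): product of the visual and spatial Gaussian terms."""
    visual = gauss((1 - alpha) * x, (1 - alpha) * m, sigma)
    spatial = gauss(alpha * x_sp, alpha * m_sp, sigma)
    return visual * spatial

rng = np.random.default_rng(0)
x, m = rng.normal(size=8), rng.normal(size=8)   # toy descriptor and visual word
x_sp, m_sp = rng.random(2), rng.random(2)       # toy normalised (x, y) positions
alpha, sigma = 0.3, 2.0

product = joint_term(x, m, x_sp, m_sp, alpha, sigma)

# The same value from one Gaussian on the concatenated, rescaled vectors.
x_cat = np.concatenate([(1 - alpha) * x, alpha * x_sp])
m_cat = np.concatenate([(1 - alpha) * m, alpha * m_sp])
single = gauss(x_cat, m_cat, sigma)
```

Here alpha trades off the two modalities: alpha = 0 leaves only the visual term and alpha = 1 leaves only the spatial term.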


Spatial Coordinate Coding for Sparse Coding

Mid-level features by optimising:

$\arg\min_{\vec{h}_n} \|\vec{x}_n - M\vec{h}_n\|^2 + \beta|\vec{h}_n|$

$M_{D \times K}$ - visual vocabulary with K atoms of length D

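Spatial descriptor $\vec{x}'_n$ and dictionary $M'$ terms added to the problem (key idea):

$\arg\min_{\vec{h}_n} \underbrace{(1-\alpha)\|\vec{x}_n - M\vec{h}_n\|^2}_{\text{visual term}} + \underbrace{\alpha\|\vec{x}'_n - M'\vec{h}_n\|^2}_{\text{spatial term}} + \beta|\vec{h}_n| \quad (1)$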

Soft Assignment and Sparse Coding can be spatially enhanced by just concatenating image descriptors with the spatial information $\vec{x}'_n$, i.e.:

$\vec{x}^{aug}_n = [\underbrace{\sqrt{1-\alpha}\,\vec{x}_n^T}_{\text{visual term}}, \underbrace{\sqrt{\alpha}\,(\vec{x}'_n)^T}_{\text{spatial term}}]^T$ (key outcome)
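The equivalence between the weighted objective (1) and plain sparse coding on augmented vectors can be verified with a small numerical check. This is a sketch under stated assumptions: $|\vec{h}_n|$ is taken to be the l1 norm, the dictionaries are random toy matrices, and the helper names `augment` and `objective` are hypothetical.

```python
import numpy as np

def augment(X, X_sp, alpha):
    """Row-wise augmentation [sqrt(1 - alpha) * x, sqrt(alpha) * x']."""
    return np.hstack([np.sqrt(1 - alpha) * X, np.sqrt(alpha) * X_sp])

def objective(x, x_sp, M, M_sp, h, alpha, beta):
    """Weighted objective (1) with separate visual and spatial terms."""
    visual = (1 - alpha) * np.sum((x - M @ h) ** 2)
    spatial = alpha * np.sum((x_sp - M_sp @ h) ** 2)
    return visual + spatial + beta * np.abs(h).sum()

rng = np.random.default_rng(0)
D, K = 16, 10
x, x_sp = rng.normal(size=D), rng.random(2)        # toy descriptor and (x, y) position
M, M_sp = rng.normal(size=(D, K)), rng.random((2, K))  # toy visual and spatial dictionaries
h = rng.normal(size=K)                             # an arbitrary code vector
alpha, beta = 0.4, 0.1

# Augment the descriptor and, column by column, the dictionary.
x_aug = augment(x[None, :], x_sp[None, :], alpha)[0]
M_aug = augment(M.T, M_sp.T, alpha).T              # (D + 2) x K

lhs = objective(x, x_sp, M, M_sp, h, alpha, beta)
rhs = np.sum((x_aug - M_aug @ h) ** 2) + beta * np.abs(h).sum()
```

Since `lhs` and `rhs` agree for any code vector, an unmodified sparse coding solver run on the augmented descriptors and dictionary minimises objective (1), which is why concatenation alone suffices.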


Experiments on Spatial Information (VOC 2010)

Spatial Coordinate Coding

Pascal 2010 [M. Everingham, 2010] Action Classification set

9 classes, 301 training, 307 validation, and 613 testing bounding boxes

Soft Assignment (SA) and Spatial Coordinate Coding (SCC) with RBF $\chi^2$ kernels used

Results reported as Mean Average Precision

Method        SA + SPM (3 levels)     SA + SCC                SA + SCC
Evaluation    validation, 1 kernel    validation, 1 kernel    test, multiple kernels
MAP           49.8                    51.6                    62.15

Spatial Coordinate Coding outperforms Spatial Pyramid Match
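For readers unfamiliar with the RBF $\chi^2$ kernel used in these experiments, a minimal sketch follows. The exact form and the bandwidth used by the authors are not given on the slides; the version below, exp(-gamma * chi2) without a factor of 1/2, and the name `rbf_chi2_kernel` are assumptions.

```python
import numpy as np

def rbf_chi2_kernel(A, B, gamma=1.0, eps=1e-12):
    """K[i, j] = exp(-gamma * sum_k (a_ik - b_jk)**2 / (a_ik + b_jk)).

    A: n x K and B: m x K non-negative histograms.
    eps guards against division by zero when both bins are empty.
    """
    diff = A[:, None, :] - B[None, :, :]
    summ = A[:, None, :] + B[None, :, :]
    chi2 = (diff ** 2 / (summ + eps)).sum(axis=2)
    return np.exp(-gamma * chi2)

rng = np.random.default_rng(0)
hists = rng.random((6, 20))
hists /= hists.sum(axis=1, keepdims=True)   # L1-normalised toy histograms
K_train = rbf_chi2_kernel(hists, hists, gamma=0.5)
```

A matrix such as `K_train` could then be passed to any classifier that accepts precomputed kernels, for example an SVM.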


Experiments on Spatial Information (Flower 17)

Spatial Coordinate Coding

Flower 17 [M. E. Nilsback, 2008], 17 classes, 3 splits of data, each consisting of 680 training, 340 validation, and 340 testing images

Coding             Kernel             SCC      SPM
Soft Assignment    $\chi^2$ kernel    91.16    89.3 (3 levels)
Sparse Coding      linear kernel      88.43    88.86 (4 levels)

Spatial Coordinate Coding is a weaker performer if Sparse Coding and a linear classifier are used

Pyramid Match elevates histogram data to a higher-dimensional representation (vital for a linear classifier)


Experiments on Dominant Angle Pyramid Match

Dominant Angle Pooling

Pascal 2007 consists of 20 object categories with high variability in intra-class appearance, rotation, and spatial position

Dominant Angle (DA) on descriptor level (variant, invariant, and descriptor augmentation cases)

DA invariant    DA variant    DA coordinate appended
46.00           50.23         50.24

Dominant Angle is important in classification

Dominant Angle (DA) with multiple quantisation levels (DAPM) and Spatial Pyramid Match (SPM)

SPM (3 levels)    DAPM (5 levels)    DAPM + SPM
54.3              53.40              56.3

Best results achieved when using both Spatial (3 levels) and Dominant Angle Pyramid Match (5 levels)
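Pooling over several quantisation levels of the dominant angle can be sketched in the same way as spatial pyramid pooling. The slides do not state how the five levels are laid out, so the sketch assumes 2^l equal angular bins at level l; the name `angle_pyramid_pool` and the toy data are illustrative.

```python
import numpy as np

def angle_pyramid_pool(H, theta, levels=5):
    """Average-pool H (N x K) within dominant-angle bins at several levels.

    theta: dominant angles in radians, wrapped here to [0, 2*pi).
    Level l uses 2**l equal angular bins (an assumed layout).
    The result has K * (2**levels - 1) elements.
    """
    N, K = H.shape
    theta = np.mod(theta, 2 * np.pi)
    out = []
    for l in range(levels):
        bins = 2 ** l
        # Angular bin index of every key-point at this level.
        idx = np.minimum((theta / (2 * np.pi) * bins).astype(int), bins - 1)
        for b in range(bins):
            mask = idx == b
            out.append(H[mask].mean(axis=0) if mask.any() else np.zeros(K))
    return np.concatenate(out)

rng = np.random.default_rng(0)
H = rng.random((300, 12))                     # toy mid-level features
theta = rng.uniform(0, 2 * np.pi, size=300)   # toy dominant angles
v = angle_pyramid_pool(H, theta, levels=5)
print(v.shape)                                # (372,) = 12 * (1 + 2 + 4 + 8 + 16)
```

The same routine applies to any scalar modality that can be quantised into bins, which is one way to read the colour pyramid match discussed later in the deck.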


Experiments on Colour Pyramid Match

Colour Component Pooling

Flower 17 set used for further evaluation as it greatly benefits from colour information

Soft Assignment (SA) and Spatial Coordinate Coding (SCC) with RBF $\chi^2$ kernels used

Results reported as Average Accuracy

SCC                                           86.4%
SCC + Colour Pyramid Match                    87.4%
SCC + Colour Pyramid Match + Opponent SIFT    91.4%
MKL based approach [F. Yan, 2010]             86.7%


Conclusions

Spatial Coordinate Coding outperforms SPM (3 levels) (e.g. by 1.8% on Flower 17)

It reduces histogram sizes from e.g. 56K to 4K by bypassing Spatial Pyramid Match

Spatial bias does not benefit much from multi-level quantisation

Dominant Angle benefits from multi-level quantisation (DAPM)

DAPM+SPM results in 2.0% improvement on VOC 2007

Colour Pyramid Match further improves Spatial Coordinate Coding by 1.0% on Flower 17

Letting the classifier decide the right level of quantisation for multiple modalities leads to performance improvement


Thank You


References

S. Lazebnik et al. (2006). Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories. CVPR.

J. C. Van Gemert et al. (2010). Visual Word Ambiguity. PAMI.

J. Yang et al. (2009). Linear spatial pyramid matching using sparse coding for image classification. CVPR.

M. E. Nilsback et al. (2008). Automated Flower Classification over a Large Number of Classes. ICCV.

M. Everingham et al. (2010). The PASCAL Visual Object Classes Challenge 2010 (VOC2010) Results. ICCV.

F. Yan et al. (2010). Lp Norm Multiple Kernel Fisher Discriminant Analysis for Object and Image Categorisation. CVPR.
