

Post on 23-Feb-2016

I. Problem
› Improve large-scale retrieval / classification accuracy
› Incorporate spatial relationship between the features in the image
› Oxford 5K Dataset

II. Approach
› Use a mining algorithm to find Frequent Itemsets (phrases)
› Use transactions to encode spatial information among features
› Different geometric configurations to capture spatial information

Spatial Encodings for Visual Phrases Using Data Mining Techniques

Ivette Carreras, Haroon Idrees

University of Central Florida

III. Bag of Visual Words

V. Visual Phrases
› Multiple words make up a phrase
› Different configurations capture spatial information

IV. Data Mining
› Algorithms used on large market-basket-type data

[Figure: spatial configurations. Four Quadrants (Quadrant 1, Quadrant 2, Quadrant 3, Quadrant 4) and Single Circle]

Transaction Format

Four Quadrants
› Prefixes: 1000, 2000, 3000, 4000
› 5000 for the origin

Single Circle
› No prefixes

Three Circles
› Prefixes: 1000, 2000, 3000
› 4000 for the origin
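
A minimal sketch of how the transaction formats above could be encoded. The prefixes are taken from the poster; everything else is an assumption made for illustration: the function names, the representation of an item as a (prefix, word_id) pair, the quadrant numbering convention, and the ring radii passed to the three-circle encoder.

```python
import math

# Prefixes as listed on the poster. Pairing a prefix with a word id as a
# tuple is an assumption; the poster does not say how the two are combined.
QUADRANT_ORIGIN = 5000
CIRCLE_ORIGIN = 4000


def quadrant(dx, dy):
    """Quadrant 1..4 of an offset from the origin feature.
    Assumed convention: counter-clockwise, starting at dx >= 0, dy >= 0."""
    if dx >= 0 and dy >= 0:
        return 1
    if dx < 0 and dy >= 0:
        return 2
    if dx < 0 and dy < 0:
        return 3
    return 4


def encode_quadrants(origin, neighbours):
    """origin and neighbours are (x, y, word_id) triples."""
    ox, oy, oword = origin
    items = {(QUADRANT_ORIGIN, oword)}
    for x, y, word in neighbours:
        items.add((quadrant(x - ox, y - oy) * 1000, word))
    return items


def encode_single_circle(origin, neighbours):
    """Single Circle: no prefixes, the transaction is the plain set of words."""
    return {origin[2]} | {word for _, _, word in neighbours}


def encode_three_circles(origin, neighbours, radii):
    """radii: three increasing ring radii (hypothetical parameter).
    Neighbours outside the largest ring are dropped (an assumption)."""
    ox, oy, oword = origin
    items = {(CIRCLE_ORIGIN, oword)}
    for x, y, word in neighbours:
        dist = math.hypot(x - ox, y - oy)
        for ring, radius in enumerate(radii, start=1):
            if dist <= radius:
                items.add((ring * 1000, word))
                break
    return items
```

Using pairs rather than a numeric sum keeps the sketch independent of the vocabulary size; any encoding that keeps (region, word) combinations distinct would serve the same purpose for the miner.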

VII. Qualitative Results

Phrases capturing fences/tiles (length = 6)

Phrases capturing window/arches (length = 3 & 4)

References:

[Figure: Bag of Visual Words pipeline (Extract regions, Compute descriptors, Find clusters and frequencies, Compute distance matrix), with example classes Faces, Bikes, Wild cats and a "Bag of Words" histogram over Word 1, Word 2, Word 3, …, Word n]

Ranking (Phrases vs. Bag of Words)


[Figure: Three Circles configuration (Circle 1, Circle 2, Circle 3)]

Extract features & quantize

For each word, find k-NN

Encode configurations

Mine phrases with support s

Sort phrases by length

Build Bag of Visual Phrases
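
The steps above can be sketched roughly as follows, for the prefix-free Single Circle format. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, neighbours are found by brute-force Euclidean distance in image coordinates, the mined phrases are passed in as given, and a phrase is counted once for every transaction that contains it.

```python
import math
from collections import Counter


def knn_transactions(features, k):
    """features: list of (x, y, word_id). Builds one transaction per feature:
    its own word plus the words of its k spatially nearest neighbours."""
    transactions = []
    for i, (xi, yi, wi) in enumerate(features):
        others = [f for j, f in enumerate(features) if j != i]
        others.sort(key=lambda f: math.hypot(f[0] - xi, f[1] - yi))
        transactions.append(frozenset([wi] + [w for _, _, w in others[:k]]))
    return transactions


def bag_of_visual_phrases(transactions, phrases):
    """Histogram over phrases: number of transactions containing each phrase."""
    hist = Counter()
    for t in transactions:
        for p in phrases:
            if p <= t:  # phrase is a subset of the transaction
                hist[p] += 1
    return [hist[p] for p in phrases]


# Toy image with two spatial groups of features.
features = [(0, 0, 1), (1, 0, 2), (0, 2, 3), (10, 10, 4), (11, 10, 5)]
transactions = knn_transactions(features, k=2)

# Phrases as a miner might return them; sorting longest-first is an assumed
# reading of "Sort phrases by length".
phrases = sorted(
    [frozenset({1, 2}), frozenset({4, 5}), frozenset({1, 2, 3})],
    key=len,
    reverse=True,
)
histogram = bag_of_visual_phrases(transactions, phrases)
```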

VIII. Quantitative Results

Phrase Percentage   BoW (mAP)   BoVP_Q (mAP)   BoVP_1C (mAP)
10%                 17.72       13.74          19.47
25%                 21.99       16.53          22.75
40%                 24.61       18.06          25.24
50%                 25.56       18.96          25.94
65%                 26.40       19.37          26.42
80%                 27.44       20.21          26.85
85%                 27.49       20.21          26.89
90%                 27.42       20.19          26.80
95%                 27.13       20.49          26.50
97%                 26.67       20.50          26.28
98%                 26.68       20.49          26.42

Configurations                mAP
Bag of Words                  26.95%
Bag of Phrases – Quadrants    20.68%
Bag of Phrases – 1 Circle     24.77%
Bag of Phrases – 3 Circles    17.65%
BoW & BoP_Q                   27.09%
BoW & BoP_1C                  27.69%
BoW & BoP_3C                  23.87%
BoW & BoP_Q & BoP_3C          24.02%
BoW & BoP_1C & BoP_3C         24.63%
BoW & BoP_Q & BoP_1C          27.74%

                      20k      40k      60k      95k
Single Circle (mAP)   22.55%   23.69%   24.06%   24.77%
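
For readers unfamiliar with the metric in these tables, a minimal sketch of mean average precision (mAP) follows. It uses the textbook definition over ranked lists; the function names are hypothetical, and the poster's own evaluation protocol for Oxford 5K may differ in details not stated here.

```python
def average_precision(ranked_ids, relevant_ids):
    """Mean of precision@rank taken at each rank where a relevant item appears."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked_ids, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(runs):
    """runs: list of (ranked_ids, relevant_ids) pairs, one per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# One query: relevant images "a" and "b" are retrieved at ranks 1 and 3,
# so AP = (1/1 + 2/3) / 2.
ap = average_precision(["a", "x", "b"], {"a", "b"})
```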

Statistics of Phrases

Phrases     BoVP single circle
Length 2    98112
Length 3    2782
Length 4    94
Length 5    5
Total       100989

Set of items
{Beer, Bread, Jelly, Milk, PeanutButter}

Transactions
Transaction   Items
t1            Bread, Jelly, PeanutButter
t2            Bread, PeanutButter
t3            Bread, Milk, PeanutButter, Beer
t4            Beer, Bread
t5            Beer, Milk

FIM (Apriori, Eclat)

Frequent Itemsets
{Bread, PeanutButter} – 3/5
{Beer, Milk} – 2/5
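
The market-basket example can be reproduced with a small level-wise search in the spirit of Apriori. This is a sketch for illustration, not the miner used for the poster; the function names and the minimum support of 2/5 are assumptions.

```python
from itertools import combinations

baskets = [
    {"Bread", "Jelly", "PeanutButter"},          # t1
    {"Bread", "PeanutButter"},                   # t2
    {"Bread", "Milk", "PeanutButter", "Beer"},   # t3
    {"Beer", "Bread"},                           # t4
    {"Beer", "Milk"},                            # t5
]


def support(itemset, baskets):
    """Fraction of transactions that contain every item of the itemset."""
    return sum(1 for b in baskets if itemset <= b) / len(baskets)


def frequent_itemsets(baskets, min_support):
    """Level-wise search: grow itemsets one item at a time, keeping only
    candidates whose smaller subsets were all frequent (Apriori pruning)."""
    items = sorted(set().union(*baskets))
    level = [frozenset([i]) for i in items
             if support(frozenset([i]), baskets) >= min_support]
    result = {}
    while level:
        for s in level:
            result[s] = support(s, baskets)
        size = len(level[0]) + 1
        previous = set(level)
        candidates = {a | b for a in level for b in level if len(a | b) == size}
        level = [c for c in candidates
                 if all(frozenset(sub) in previous
                        for sub in combinations(c, size - 1))
                 and support(c, baskets) >= min_support]
    return result


found = frequent_itemsets(baskets, min_support=2 / 5)
pairs = {s: v for s, v in found.items() if len(s) == 2}
```

With this threshold, `pairs` contains the two itemsets shown above with supports 3/5 and 2/5, plus {Beer, Bread}, which also occurs in two of the five transactions.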

Selection of Phrases - Frequency

Selection of Phrases - Entropy

Results

* http://www.di.ens.fr/willow/events/cvml2010/materials/INRIA_summer_school_2010_Cordelia_bof_classification.pdf

1. Philbin, J., Chum, O., Isard, M., Sivic, J., and Zisserman, A. Object retrieval with large vocabularies and fast spatial matching. In CVPR, 2007.
2. Quack, T., Ferrari, V., and Van Gool, L. Video mining with frequent itemset configurations. In CIVR, 2006.
3. Zhang, Y., Jia, Z., and Chen, T. Image retrieval with geometry-preserving visual phrases. In CVPR, 2011.