I. Problem
Improve large-scale retrieval / classification accuracy
Incorporate spatial relationships between the features in the image
Oxford 5K Dataset
II. Approach
Use a mining algorithm to find Frequent Itemsets (phrases)
Use transactions to encode spatial information among features
Different geometric configurations to capture spatial information
Spatial Encodings for Visual Phrases Using Data Mining Techniques
Ivette Carreras, Haroon Idrees
University of Central Florida
III. Bag of Visual Words
V. Visual Phrases
Multiple words make up a phrase
Different configurations to capture spatial information
IV. Data Mining
Algorithms used on large market-basket type data
[Figure: Four Quadrants configuration (Quadrants 1 to 4) and Single Circle configuration]
Transaction Format
Four Quadrants: prefixes 1000, 2000, 3000, 4000; 5000 for the origin
Single Circle: no prefixes
Three Circles: prefixes 1000, 2000, 3000; 4000 for the origin
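The prefix scheme above can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes that each feature is an (x, y, word_id) tuple, that a prefix is added to the word ID (so word IDs must stay below 1000), that quadrants are numbered counter-clockwise starting from the positive x/y region, and that the three circles are given by hypothetical radii. None of these details are stated on the poster.

```python
import math

def quadrant(dx, dy):
    """Quadrant (1-4) of an offset; counter-clockwise numbering is an assumption."""
    if dx >= 0 and dy >= 0:
        return 1
    if dx < 0 and dy >= 0:
        return 2
    if dx < 0 and dy < 0:
        return 3
    return 4

def encode_four_quadrants(center, neighbours):
    """Transaction for one feature: prefix 5000 marks the origin word,
    prefixes 1000-4000 mark the quadrant each neighbour falls in."""
    cx, cy, cword = center
    items = {5000 + cword}
    for x, y, word in neighbours:
        items.add(1000 * quadrant(x - cx, y - cy) + word)
    return sorted(items)

def encode_three_circles(center, neighbours, radii):
    """Transaction for one feature: prefix 4000 marks the origin word,
    prefixes 1000-3000 mark the innermost circle containing each neighbour.
    Neighbours outside the largest circle are ignored (an assumption)."""
    cx, cy, cword = center
    items = {4000 + cword}
    for x, y, word in neighbours:
        dist = math.hypot(x - cx, y - cy)
        for ring, radius in enumerate(radii, start=1):
            if dist <= radius:
                items.add(1000 * ring + word)
                break
    return sorted(items)

# Word 3 appears twice but in different quadrants, so it yields two distinct items.
quad_items = encode_four_quadrants(
    (10, 10, 7), [(12, 15, 3), (4, 11, 42), (5, 2, 3), (20, 1, 9)]
)  # [1003, 2042, 3003, 4009, 5007]
ring_items = encode_three_circles(
    (0, 0, 7), [(3, 4, 1), (6, 8, 2), (0, 12, 3), (30, 0, 4)], radii=(5, 10, 15)
)  # [1001, 2002, 3003, 4007]
```

With the Single Circle format no prefix is applied, so a transaction is simply the set of word IDs in the neighbourhood.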
VII. Qualitative Results
Phrases capturing fences/tiles (length = 6)
Phrases capturing window/arches (length = 3 & 4)
References:
Extract regions
Compute descriptors
Find clusters and frequencies
Compute distance matrix
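The bag-of-words steps above can be sketched in a few lines. This is a hedged illustration only: it assumes the cluster centres are already available (for example from k-means, not shown), uses Euclidean distance both for quantization and between histograms, and L1-normalizes the histograms. The poster does not specify any of these choices, and the function names are hypothetical.

```python
import math

def nearest_word(descriptor, centres):
    """Index of the closest cluster centre (the visual word) for one descriptor."""
    return min(range(len(centres)), key=lambda k: math.dist(descriptor, centres[k]))

def bow_histogram(descriptors, centres):
    """L1-normalized word-frequency histogram for one image."""
    hist = [0.0] * len(centres)
    for d in descriptors:
        hist[nearest_word(d, centres)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist

def distance_matrix(histograms):
    """Pairwise Euclidean distances between image histograms."""
    return [[math.dist(a, b) for b in histograms] for a in histograms]

# Toy data: two 2-D cluster centres and two images.
centres = [(0, 0), (10, 10)]
image_a = [(1, 0), (0, 1), (9, 9), (11, 10)]
image_b = [(0, 0), (1, 1)]
hists = [bow_histogram(image_a, centres), bow_histogram(image_b, centres)]
D = distance_matrix(hists)
```

Ranking images for a query then amounts to sorting one row of `D` in increasing order.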
[Figure: example classes (Faces, Bikes, Wild cats) with bag-of-words histograms over Word 1 … Word n]
Bag of Words
Ranking (Phrases vs. Bag of Words)
[Figure: Three Circles configuration (Circles 1 to 3)]
Extract features & quantize
For each word, find k-NN
Encode configurations
Mine phrases with support s
Sort phrases by length
Build Bag of Visual Phrases
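The pipeline above can be illustrated for the Single Circle case, where transactions carry no prefixes. This is a sketch under stated assumptions rather than the authors' implementation: features are (x, y, word_id) tuples, each transaction holds a word plus the words of its k nearest neighbours, the mined phrases are supplied as input, and each transaction votes for the longest phrase it contains. The voting rule in particular is a guess at why phrases are sorted by length.

```python
import math

def knn_transactions(features, k):
    """One transaction per feature: its word plus the words of its k nearest neighbours."""
    transactions = []
    for i, (x, y, word) in enumerate(features):
        others = [f for j, f in enumerate(features) if j != i]
        others.sort(key=lambda f: math.hypot(f[0] - x, f[1] - y))
        transactions.append(frozenset([word] + [f[2] for f in others[:k]]))
    return transactions

def bag_of_visual_phrases(transactions, phrases):
    """Histogram over phrases, longest first; each transaction counts towards
    the first (longest) phrase it fully contains. Assumed voting rule."""
    ordered = sorted(phrases, key=len, reverse=True)
    hist = [0] * len(ordered)
    for t in transactions:
        for idx, phrase in enumerate(ordered):
            if phrase <= t:
                hist[idx] += 1
                break
    return ordered, hist

# Toy image: two spatial groups of three features each.
features = [(0, 0, 1), (1, 0, 2), (0, 1, 3), (20, 0, 1), (21, 0, 2), (20, 3, 4)]
transactions = knn_transactions(features, k=2)
phrases = [frozenset({1, 2}), frozenset({1, 2, 3})]  # stand-in for mined phrases
ordered, hist = bag_of_visual_phrases(transactions, phrases)
```

Here the first group supports the length-3 phrase and the second group falls back to the length-2 phrase, so both bins receive three votes.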
VIII. Quantitative Results
Phrase Percentage   BoW (mAP)   BoVP_Q (mAP)   BoVP_1C (mAP)
10%                 17.72       13.74          19.47
25%                 21.99       16.53          22.75
40%                 24.61       18.06          25.24
50%                 25.56       18.96          25.94
65%                 26.40       19.37          26.42
80%                 27.44       20.21          26.85
85%                 27.49       20.21          26.89
90%                 27.42       20.19          26.80
95%                 27.13       20.49          26.50
97%                 26.67       20.50          26.28
98%                 26.68       20.49          26.42
Configurations                  mAP
Bag of Words                    26.95%
Bag of Phrases – Quadrants      20.68%
Bag of Phrases – 1 Circle       24.77%
Bag of Phrases – 3 Circles      17.65%
BoW & BoP_Q                     27.09%
BoW & BoP_1C                    27.69%
BoW & BoP_3C                    23.87%
BoW & BoP_Q & BoP_3C            24.02%
BoW & BoP_1C & BoP_3C           24.63%
BoW & BoP_Q & BoP_1C            27.74%
                      20k      40k      60k      95k
Single Circle (mAP)   22.55%   23.69%   24.06%   24.77%
Statistics of Phrases
Phrases    BoVP single circle
Length 2   98112
Length 3   2782
Length 4   94
Length 5   5
Total      100989
Set of items: {Beer, Bread, Jelly, Milk, PeanutButter}
Transactions:
Transaction   Items
t1            Bread, Jelly, PeanutButter
t2            Bread, PeanutButter
t3            Bread, Milk, PeanutButter, Beer
t4            Beer, Bread
t5            Beer, Milk
FIM (Apriori, Eclat)
Frequent Itemsets: {Bread, PeanutButter} – 3/5, {Beer, Milk} – 2/5
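The market-basket example above can be reproduced with a small level-wise search in the spirit of Apriori. This is an illustrative sketch, not the mining tool used for the poster. The function name and the use of an absolute support count (rather than a fraction) are assumptions.

```python
from itertools import combinations

# Transactions t1..t5 from the example above.
transactions = [
    {"Bread", "Jelly", "PeanutButter"},
    {"Bread", "PeanutButter"},
    {"Bread", "Milk", "PeanutButter", "Beer"},
    {"Beer", "Bread"},
    {"Beer", "Milk"},
]

def frequent_itemsets(transactions, min_support, min_len=2):
    """Return {itemset: support count} for itemsets of at least min_len items
    that occur in at least min_support transactions."""
    items = sorted(set().union(*transactions))
    current = [frozenset([i]) for i in items]  # level 1 candidates
    size = 1
    frequent = {}
    while current:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        survivors = {c: n for c, n in counts.items() if n >= min_support}
        if size >= min_len:
            frequent.update(survivors)
        # Join survivors into candidates one item longer.
        size += 1
        candidates = {a | b for a, b in combinations(survivors, 2) if len(a | b) == size}
        # Apriori pruning: every subset one item shorter must itself be frequent.
        current = [
            c for c in candidates
            if all(frozenset(s) in survivors for s in combinations(c, size - 1))
        ]
    return frequent

result = frequent_itemsets(transactions, min_support=2)
```

With a minimum support of 2 out of 5, `result` contains {Bread, PeanutButter} with count 3 and {Beer, Milk} with count 2, matching the example, and also {Beer, Bread} with count 2. Raising `min_support` to 3 leaves only {Bread, PeanutButter}.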
Selection of Phrases - Frequency
Selection of Phrases - Entropy
Results
* http://www.di.ens.fr/willow/events/cvml2010/materials/INRIA_summer_school_2010_Cordelia_bof_classification.pdf
1. Philbin, J., Chum, O., Isard, M., Sivic, J., and Zisserman, A. Object retrieval with large vocabularies and fast spatial matching. In CVPR (2007).
2. Quack, T., Ferrari, V., and Van Gool, L. Video mining with frequent itemset configurations. In CIVR (2006).
3. Zhang, Y., Jia, Z., and Chen, T. Image retrieval with geometry-preserving visual phrases. In CVPR (2011).