![Page 1: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/1.jpg)
Large Image Databases and Small Codes
for Object Recognition
Rob Fergus (NYU)
Antonio Torralba (MIT)
Yair Weiss (Hebrew U.)
William T. Freeman (MIT)
![Page 2: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/2.jpg)
Banksy, 2006
Object Recognition
Pixels → Description of scene contents
![Page 3: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/3.jpg)
Large image datasets
Internet contains billions of images
Amazing resource. Maybe we can use it for recognition?
![Page 4: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/4.jpg)
Web image dataset
• 79.3 million images
• Collected using image search engines
• List of nouns taken from Wordnet
• 32x32 resolution
• Example first 10: a-bomb, a-horizon, a._conan_doyle, a._e._burnside, a._e._housman, a._e._kennelly, a.e., a_battery, a_cappella_singing, a_horizon
See “80 Million Tiny images” TR
![Page 5: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/5.jpg)
Noisy Output from Image Search Engines
![Page 6: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/6.jpg)
Nearest-Neighbor methods for recognition
# images: 10^5, 10^6, 10^8
![Page 7: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/7.jpg)
Issues
• Need lots of data
• How to find neighbors quickly?
• What distance metric to use?
• How to transfer labels from neighbors to query image?
• What happens if labels are unreliable?
![Page 8: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/8.jpg)
Overview
1. Fast retrieval using compact codes
2. Recognition using neighbors with unreliable labels
![Page 9: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/9.jpg)
Overview
1. Fast retrieval using compact codes
2. Recognition using neighbors with unreliable labels
![Page 10: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/10.jpg)
Binary codes for images
• Want images with similar content to have similar binary codes
• Use Hamming distance between codes
  – Number of bit flips
  – E.g.:
    Ham_Dist(10001010,10001110)=1
    Ham_Dist(10001010,11101110)=3
• Semantic Hashing [Salakhutdinov & Hinton, 2007]
  – Text documents
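The Hamming distance on this slide can be checked with a few lines of Python. This is a minimal sketch; the function names `hamming_distance` and `hamming_distance_int` are illustrative and do not come from the talk.

```python
def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def hamming_distance_int(a: int, b: int) -> int:
    """Same distance for codes packed into integers: popcount of the XOR."""
    return bin(a ^ b).count("1")


print(hamming_distance("10001010", "10001110"))  # 1
print(hamming_distance("10001010", "11101110"))  # 3
```

Packing codes into machine words and counting the set bits of the XOR is the usual way to make this comparison cheap when codes are short.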
![Page 11: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/11.jpg)
Binary codes for images
• Permits fast lookup via hashing
  – Each code is a memory address
  – Find neighbors by exploring Hamming ball around query address
  – Lookup time depends on radius of ball, NOT on # data points
[Figure: the semantic hash function maps the query image to a query address in the address space; semantically similar images lie at nearby addresses. Figure adapted from Salakhutdinov & Hinton ’07]
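The lookup described on this slide can be sketched as follows. The table layout (a Python `dict` from integer address to a list of image ids) and the names `hamming_ball` and `lookup` are assumptions made for illustration, not the authors' implementation.

```python
from itertools import combinations


def hamming_ball(code: int, n_bits: int, radius: int):
    """Yield every n_bits-wide address within `radius` bit flips of `code`."""
    for r in range(radius + 1):
        for positions in combinations(range(n_bits), r):
            flipped = code
            for p in positions:
                flipped ^= 1 << p
            yield flipped


def lookup(table: dict, query_code: int, n_bits: int, radius: int) -> list:
    """Collect the items stored at every address inside the Hamming ball."""
    hits = []
    for address in hamming_ball(query_code, n_bits, radius):
        hits.extend(table.get(address, []))
    return hits


# Toy address space: 8-bit codes mapped to made-up image ids.
table = {0b10001010: ["img_a"], 0b10001110: ["img_b"], 0b11101110: ["img_c"]}
print(lookup(table, 0b10001010, n_bits=8, radius=1))  # ['img_a', 'img_b']
```

The number of addresses probed is the sum of C(n_bits, r) for r up to the radius, so it grows with the radius and the code length but is independent of how many images are stored.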
![Page 12: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/12.jpg)
Compact Binary Codes
• Google has a few billion images (10^9)
• Big PC has ~10 Gbytes (10^11 bits)
• Codes must fit in memory (disk too slow)
⇒ Budget of 10^2 bits/image
• 1 Megapixel image is 10^7 bits
• 32x32 color image is 10^4 bits
⇒ Semantic hash function must also reduce dimensionality
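The budget arithmetic can be made explicit. The sketch below assumes exactly 10 GB of memory, 10^9 images and 8 bits per color channel; the slide itself only gives orders of magnitude.

```python
n_images = 10**9                      # "a few billion" images, order of magnitude
memory_bits = 10 * 10**9 * 8          # 10 GB expressed in bits (8e10, about 10^11)

budget_bits = memory_bits / n_images
print(budget_bits)                    # 80.0 bits per image, on the order of 10^2

raw_bits_32x32 = 32 * 32 * 3 * 8      # 24,576 bits, about 10^4
print(raw_bits_32x32 / budget_bits)   # compression factor the hash function must deliver
```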
![Page 13: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/13.jpg)
Code requirements
• Preserves neighborhood structure of input space
• Very compact (<10^2 bits/image)
• Fast to compute
Three approaches:
1. Locality Sensitive Hashing (LSH)
2. Boosting
3. Restricted Boltzmann Machines (RBMs)
![Page 14: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/14.jpg)
Image representation: Gist vectors
• Pixels not a convenient representation
• Use GIST descriptor instead
• 512 dimensions/image
• L2 distance between Gist vectors is not a bad substitute for human perceptual distance
Oliva & Torralba, IJCV 2001
![Page 15: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/15.jpg)
1. Locality Sensitive Hashing
• Gionis, A. & Indyk, P. & Motwani, R. (1999)
• Take random projections of data
• Quantize each projection with a few bits
• For our N bit code:
  – Compute first N PCA components of data
  – Each random projection must be a linear combination of the N PCA components
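One way to read the recipe above is sketched below with NumPy. Several details are assumptions not stated on the slide: one bit per projection, a median threshold for quantization, Gaussian mixing coefficients, and random data standing in for real Gist vectors.

```python
import numpy as np


def fit_pca_lsh(data: np.ndarray, n_bits: int, seed: int = 0):
    """Return (mean, projection, thresholds) for an n_bits binary code."""
    rng = np.random.default_rng(seed)
    mean = data.mean(axis=0)
    centered = data - mean
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    pca = vt[:n_bits].T                              # shape (dim, n_bits)
    # Each random projection is a random linear combination of the PCA components.
    mix = rng.standard_normal((n_bits, n_bits))
    projection = pca @ mix                           # shape (dim, n_bits)
    # Assumption: one bit per projection, thresholded at the training median.
    thresholds = np.median(centered @ projection, axis=0)
    return mean, projection, thresholds


def encode(x: np.ndarray, mean, projection, thresholds) -> np.ndarray:
    """Map the rows of x to {0, 1} codes of length n_bits."""
    return ((x - mean) @ projection > thresholds).astype(np.uint8)


rng = np.random.default_rng(1)
gist = rng.standard_normal((1000, 512))              # stand-in for Gist vectors
params = fit_pca_lsh(gist, n_bits=32)
codes = encode(gist, *params)
print(codes.shape)  # (1000, 32)
```

Thresholding at the median keeps each bit roughly balanced between 0 and 1 on the training data, which is a common choice but not the only possible quantizer.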
![Page 16: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/16.jpg)
2. Boosting
• Modified form of BoostSSC [Shakhnarovich, Viola & Darrell, 2003]
• GentleBoost with exponential loss
• Positive examples are pairs of neighbors
• Negative examples are pairs of unrelated images
• Each binary regression stump examines a single coordinate in the input pair, comparing it to a learnt threshold to check that the two images agree
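The stump described above can be illustrated in a few lines. This sketch covers only the form of a weak learner and how selected stumps yield a code; the boosting loop that picks coordinates and thresholds is omitted, and the stump values below are hypothetical.

```python
def stump_bit(x, coord, threshold):
    """One code bit: which side of the threshold coordinate `coord` falls on."""
    return int(x[coord] > threshold)


def stump_agrees(x1, x2, coord, threshold):
    """+1 if the pair gets the same bit (predicted neighbors), -1 otherwise."""
    same = stump_bit(x1, coord, threshold) == stump_bit(x2, coord, threshold)
    return 1 if same else -1


def encode(x, stumps):
    """Concatenate the bits of the selected stumps into a binary code."""
    return [stump_bit(x, coord, threshold) for coord, threshold in stumps]


# Hypothetical (coordinate, threshold) pairs, as boosting might select them.
stumps = [(0, 0.5), (2, -1.0), (1, 0.0)]
a, b = [0.9, 0.3, -2.0], [0.7, -0.4, -3.0]
print(encode(a, stumps), encode(b, stumps))            # [1, 0, 1] [1, 0, 0]
print([stump_agrees(a, b, c, t) for c, t in stumps])   # [1, 1, -1]
```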
![Page 17: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/17.jpg)
3. Restricted Boltzmann Machine (RBM)
Hidden units: h
Visible units: v
Symmetric weights w
Parameters: weights w, biases b
• Network of binary stochastic units
• Hinton & Salakhutdinov, Science 2006
![Page 18: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/18.jpg)
RBM architecture
Hidden units: h
Visible units: v
Symmetric weights w
Learn weights and biases using Contrastive Divergence
Parameters: weights w, biases b
• Network of binary stochastic units
• Hinton & Salakhutdinov, Science 2006
Convenient conditional distributions:
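For binary units, the standard RBM conditionals factorize: each hidden unit is on with probability sigmoid(b_h + v·w), and each visible unit with probability sigmoid(b_v + h·wᵀ). The sketch below shows these conditionals and one step of Contrastive Divergence (CD-1). The class layout, learning rate and initialization are illustrative assumptions, and a first layer fed with real-valued Gist vectors may need a different visible-unit model than the binary one used here.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class RBM:
    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = np.random.default_rng(seed)
        self.w = 0.01 * self.rng.standard_normal((n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)
        self.b_h = np.zeros(n_hidden)

    def p_h_given_v(self, v):
        """P(h_j = 1 | v) for every hidden unit."""
        return sigmoid(v @ self.w + self.b_h)

    def p_v_given_h(self, h):
        """P(v_i = 1 | h) for every visible unit."""
        return sigmoid(h @ self.w.T + self.b_v)

    def cd1_step(self, v0, lr=0.1):
        """One CD-1 update on a batch v0 of shape (batch, n_visible)."""
        ph0 = self.p_h_given_v(v0)
        h0 = (self.rng.random(ph0.shape) < ph0).astype(float)  # sample hidden states
        v1 = self.p_v_given_h(h0)                              # reconstruction
        ph1 = self.p_h_given_v(v1)
        n = v0.shape[0]
        self.w += lr * (v0.T @ ph0 - v1.T @ ph1) / n
        self.b_v += lr * (v0 - v1).mean(axis=0)
        self.b_h += lr * (ph0 - ph1).mean(axis=0)


rbm = RBM(n_visible=6, n_hidden=3)
batch = np.array([[1, 1, 0, 0, 0, 0], [0, 0, 0, 0, 1, 1]], dtype=float)
rbm.cd1_step(batch)
print(rbm.p_h_given_v(batch).shape)  # (2, 3)
```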
![Page 19: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/19.jpg)
Multi-Layer RBM: non-linear dimensionality reduction
Input Gist vector (512 dimensions)
Layer 1: 512 → 512 (weights w1)
Layer 2: 512 → 256 (weights w2)
Layer 3: 256 → N (weights w3)
Output binary code (N dimensions)
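A forward pass through the 512-512-256-N stack might look like the sketch below. The weights are random placeholders, N = 32 is an arbitrary choice, and thresholding the top-layer activations at 0.5 to obtain bits is an assumption about how the code is binarized.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def encode(gist, layers):
    """Propagate Gist vectors through the stack and threshold the top layer."""
    x = gist
    for w, b in layers:
        x = sigmoid(x @ w + b)          # deterministic pass using probabilities
    return (x > 0.5).astype(np.uint8)


n_bits = 32                              # N on the slide; 32 is arbitrary here
sizes = [512, 512, 256, n_bits]
rng = np.random.default_rng(0)
# Random placeholder weights; real ones would come from training.
layers = [(0.05 * rng.standard_normal((m, n)), np.zeros(n))
          for m, n in zip(sizes[:-1], sizes[1:])]
codes = encode(rng.random((4, 512)), layers)
print(codes.shape)  # (4, 32)
```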
![Page 20: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/20.jpg)
Training RBM models
• Two phases
1. Pre-training
  – Unsupervised
  – Use Contrastive Divergence to learn weights and biases
  – Gets parameters to right ballpark
2. Fine-tuning
  – Can make use of label information
  – No longer stochastic
  – Back-propagate error to update parameters
  – Moves parameters to local minimum
![Page 21: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/21.jpg)
Greedy pre-training (Unsupervised)
Input Gist vector (512 dimensions)
Layer 1: 512 → 512 (weights w1)
![Page 22: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/22.jpg)
Greedy pre-training (Unsupervised)
Activations of hidden units from layer 1 (512 dimensions)
Layer 2: 512 → 256 (weights w2)
![Page 23: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/23.jpg)
Greedy pre-training (Unsupervised)
Activations of hidden units from layer 2 (256 dimensions)
Layer 3: 256 → N (weights w3)
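The layer-by-layer procedure on these three slides can be sketched as a loop in which each RBM is trained on the hidden activations of the one below. The sketch assumes binary units with CD-1, full-batch updates, a handful of epochs and random input data; none of these settings come from the talk.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def train_rbm(data, n_hidden, epochs=5, lr=0.1, seed=0):
    """Train one RBM with CD-1 and return its weights and hidden biases."""
    rng = np.random.default_rng(seed)
    n_visible = data.shape[1]
    w = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b_v, b_h = np.zeros(n_visible), np.zeros(n_hidden)
    for _ in range(epochs):
        ph0 = sigmoid(data @ w + b_h)
        h0 = (rng.random(ph0.shape) < ph0).astype(float)
        v1 = sigmoid(h0 @ w.T + b_v)
        ph1 = sigmoid(v1 @ w + b_h)
        w += lr * (data.T @ ph0 - v1.T @ ph1) / len(data)
        b_v += lr * (data - v1).mean(axis=0)
        b_h += lr * (ph0 - ph1).mean(axis=0)
    return w, b_h


def greedy_pretrain(data, layer_sizes):
    """Train the stack bottom-up; each layer sees the activations of the previous one."""
    layers, x = [], data
    for n_hidden in layer_sizes:
        w, b_h = train_rbm(x, n_hidden)
        layers.append((w, b_h))
        x = sigmoid(x @ w + b_h)        # activations become input to the next RBM
    return layers


rng = np.random.default_rng(0)
gist = rng.random((200, 512))            # stand-in for Gist vectors scaled to [0, 1]
layers = greedy_pretrain(gist, [512, 256, 32])
print([w.shape for w, _ in layers])      # [(512, 512), (512, 256), (256, 32)]
```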
![Page 24: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/24.jpg)
Backpropagation using Neighborhood Components Analysis objective
Input Gist vector (512 dimensions)
Layer 1: 512 → 512 (weights w1)
Layer 2: 512 → 256 (weights w2)
Layer 3: 256 → N (weights w3)
Output binary code (N dimensions)
![Page 25: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/25.jpg)
Neighborhood Components Analysis
• Goldberger, Roweis, Salakhutdinov & Hinton, NIPS 2004
Labels defined in input space (high dimensional)
Points in output space (low dimensional)
![Page 26: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/26.jpg)
Neighborhood Components Analysis
• Goldberger, Roweis, Salakhutdinov & Hinton, NIPS 2004
Output of RBM
W are RBM weights
![Page 27: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/27.jpg)
Neighborhood Components Analysis
• Goldberger, Roweis, Salakhutdinov & Hinton, NIPS 2004
Pulls nearby points OF SAME CLASS closer
Pushes nearby points OF DIFFERENT CLASS away
![Page 28: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/28.jpg)
Neighborhood Components Analysis
• Goldberger, Roweis, Salakhutdinov & Hinton, NIPS 2004
Pulls nearby points OF SAME CLASS closer
Preserves neighborhood structure of original, high dimensional, space
Pushes nearby points OF DIFFERENT CLASS away
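The NCA objective behind these slides can be written compactly: each point picks a neighbor with probability proportional to exp(-squared distance) in the output space, and the objective sums the probability of picking a point of the same class. The sketch below computes that value with NumPy for a toy set of 2-D outputs. In the setting of the talk the outputs would come from the network and would be optimized by back-propagation, which is not shown here.

```python
import numpy as np


def nca_objective(outputs, labels):
    """Expected number of points whose stochastically chosen neighbor shares their class."""
    diff = outputs[:, None, :] - outputs[None, :, :]
    d2 = (diff ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)                    # a point never picks itself
    logits = -d2
    logits -= logits.max(axis=1, keepdims=True)     # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)               # p[i, j]: prob. that i picks j
    same_class = labels[:, None] == labels[None, :]
    return float((p * same_class).sum())


outputs = np.array([[0.0, 0.0], [0.0, 0.1], [5.0, 5.0], [5.0, 5.1]])
good = nca_objective(outputs, np.array([0, 0, 1, 1]))   # classes match the clusters
bad = nca_objective(outputs, np.array([0, 1, 0, 1]))    # classes cut across the clusters
print(good > bad)  # True
```

Maximizing this quantity pulls same-class points together and pushes different-class points apart, which is the behavior the slide annotations describe.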
![Page 29: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/29.jpg)
Two test datasets
1. LabelMe
  – 22,000 images (20,000 train | 2,000 test)
  – Ground truth segmentations for all
  – Can define ground truth distance between images using these segmentations
  – Per pixel labels [SUPERVISED]
2. Web data
  – 79.3 million images
  – Collected from Internet
  – No labels, so use L2 distance between GIST vectors as ground truth distance [UNSUPERVISED]
  – Noisy image labels only
![Page 30: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/30.jpg)
Retrieval Experiments
![Page 31: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/31.jpg)
Examples of LabelMe retrieval
• 12 closest neighbors under different distance metrics
![Page 32: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/32.jpg)
LabelMe Retrieval
[Plot: % of 50 true neighbors in retrieval set vs. size of retrieval set (0 to 20,000)]
![Page 33: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/33.jpg)
LabelMe Retrieval
[Plot: % of 50 true neighbors in retrieval set vs. size of retrieval set (0 to 20,000)]
[Plot: % of 50 true neighbors in first 500 retrieved vs. number of bits]
![Page 34: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/34.jpg)
Web images retrieval
[Plot: % of 50 true neighbors in retrieval set vs. size of retrieval set]
![Page 35: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/35.jpg)
Web images retrieval
[Two plots: % of 50 true neighbors in retrieval set vs. size of retrieval set]
![Page 36: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/36.jpg)
Examples of Web retrieval
• 12 neighbors using different distance metrics
![Page 37: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/37.jpg)
Retrieval Timings
![Page 38: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/38.jpg)
LabelMe Recognition examples
![Page 39: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/39.jpg)
LabelMe Recognition
![Page 40: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/40.jpg)
Overview
1. Fast retrieval using compact codes
2. Recognition using neighbors with unreliable labels
![Page 41: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/41.jpg)
Label Assignment
• Retrieval gives set of nearby images
• How to compute label?
• Issues:
  – Labeling noise
  – Keywords can be very specific
    • e.g. yellowfin tuna
[Figure: query image and its neighbors, labeled Grover Cleveland, Linnet, Birdcage, Chiefs, Casing]
![Page 42: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/42.jpg)
Wordnet – a Lexical Dictionary
Synonyms/Hypernyms (Ordered by Estimated Frequency) of noun aardvark
Sense 1
aardvark, ant bear, anteater, Orycteropus afer
  => placental, placental mammal, eutherian, eutherian mammal
    => mammal
      => vertebrate, craniate
        => chordate
          => animal, animate being, beast, brute, creature
            => organism, being
              => living thing, animate thing
                => object, physical object
                  => entity
http://wordnet.princeton.edu/
![Page 43: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/43.jpg)
Wordnet Hierarchy
• Convert graph structure into tree by taking most common meaning
![Page 44: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/44.jpg)
![Page 45: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/45.jpg)
Wordnet Voting Scheme
Ground truth
One image – one vote
![Page 46: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/46.jpg)
Classification at Multiple Semantic Levels
Votes:
• Animal 6
• Person 33
• Plant 5
• Device 3
• Administrative 4
• Others 22
Votes:
• Living 44
• Artifact 9
• Land 3
• Region 7
• Others 10
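The vote counts at the two levels above are consistent with each neighbor voting for its own label and for every ancestor of that label in the Wordnet tree (for instance, Animal 6 + Person 33 + Plant 5 = Living 44). A minimal sketch of such a scheme follows. The tiny hypernym tree and the neighbor labels are hypothetical and only illustrate the mechanics.

```python
from collections import Counter

# Hypothetical hypernym tree: child -> parent (most common sense only).
PARENT = {
    "aardvark": "mammal", "mammal": "animal", "animal": "living thing",
    "politician": "person", "person": "living thing",
    "oak": "plant", "plant": "living thing",
    "hammer": "device", "device": "artifact",
    "living thing": "entity", "artifact": "entity",
}


def path_to_root(label):
    """Return the label followed by all of its ancestors up to the root."""
    path = [label]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def vote(neighbor_labels):
    """One image, one vote: each neighbor votes for every node on its branch."""
    votes = Counter()
    for label in neighbor_labels:
        votes.update(path_to_root(label))
    return votes


neighbors = ["politician", "politician", "aardvark", "oak", "hammer"]
votes = vote(neighbors)
print(votes["person"], votes["living thing"], votes["artifact"])  # 2 4 1
```

Reading the counter at a given depth of the tree gives a classification at that semantic level, so noisy or overly specific keywords still contribute to the correct coarse category.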
![Page 47: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/47.jpg)
Wordnet Voting Scheme
![Page 48: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/48.jpg)
Person Recognition
• 23% of all images in dataset contain people
• Wide range of poses: not just frontal faces
![Page 49: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/49.jpg)
Person Recognition – Test Set
• 1016 images from Altavista using “person” query
• High res and 32x32 available
• Disjoint from 79 million tiny images
![Page 50: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/50.jpg)
Person Recognition
• Task: person in image or not?
• 64-bit RBM code trained on web data (weak image labels only)
[Plots: curves labeled Viola-Jones]
![Page 51: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/51.jpg)
Object Classification
• Hand-designed metric
![Page 52: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/52.jpg)
Distribution of objects in the world
LabelMe statistics (LabelMe dataset)
[Plot: number of labeled samples (1 to 10,000) vs. object rank (10 to 1,000), log-log axes]
• 10% of the classes account for 93% of the labels!
• 1500 classes with less than 100 samples
![Page 53: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/53.jpg)
Conclusions
• Possible to build compact codes for retrieval
  – # bits seems to depend on dataset
  – Much room for improvement
  – Use JPEG coefficients as input to RBM
• Can do interesting things with lots of data
  – What would happen with Google’s ~2 billion images?
  – Video data
![Page 54: Large Image Databases and Small Codes for Object Recognition Rob Fergus (NYU) Antonio Torralba (MIT) Yair Weiss (Hebrew U.) William T. Freeman (MIT)](https://reader036.vdocuments.us/reader036/viewer/2022062806/56649d4d5503460f94a2cc3d/html5/thumbnails/54.jpg)