SVM Active Learning for Image Retrieval

Morris LeBlanc

Why Is Image Retrieval Hard?

Problems with Image Retrieval

Support Vector Machines

Active Learning

Image Processing

◦ Texture and Color

Relevance Feedback

What is the topic of this image?

What are the right keywords to index this image?

What words would you use to retrieve this image?

The Semantic Gap

A picture is worth a thousand words

The meaning of an image is highly individual and subjective

Support Vector Machines (SVMs) are a set of related learning methods used for classification and regression

An SVM views the input data as two sets of vectors in an n-dimensional space

With this we are able to label “relevant” and “non-relevant” images

◦Based on distance from a labeled instance
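As a rough illustration of labeling by distance, the sketch below scores each image's feature vector by its signed distance to a separating hyperplane and ranks the images from most to least relevant. It assumes a linear classifier with weights `w` and offset `b`, and it measures distance to the hyperplane rather than to a single labeled instance. The function names and the example numbers are hypothetical and not taken from the slides.

```python
import math

def signed_distance(w, b, x):
    """Signed distance from feature vector x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

def label_images(w, b, features):
    """Label every feature vector and rank the list, most relevant first."""
    scored = [(signed_distance(w, b, x), i) for i, x in enumerate(features)]
    scored.sort(reverse=True)
    return [(i, "relevant" if d > 0 else "non-relevant", d) for d, i in scored]

# Hypothetical hyperplane and three 2-D feature vectors.
w, b = [3.0, 4.0], -5.0
features = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)]
for index, label, distance in label_images(w, b, features):
    print(index, label, round(distance, 2))
```

Images on the positive side of the hyperplane are treated as relevant, and a larger positive distance is read as higher confidence.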

SVM training process proceeds as follows:

1. Choose some working subset of the query images

2. Construct classifier, i.e. create a new surface:

◦ Optimize the weights associated with the working subset of images (feature vectors)

◦ Update optimality conditions for images (vectors) not in working subset

  ◦ Broadcast working subset images (vectors) and weights

  ◦ Update optimality conditions for all images in query (Map)

  ◦ Reduce to find greatest violating image (vector) not contained in working subset (Reduce)

Updating SVMs Cont’d

3. Update working subset to include greatest violating image (vector)

4. Iterate until all images (vectors) satisfy optimality conditions

5. Repeat steps 2 through 4 until correct images are returned
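The working-subset loop in steps 1 through 4 can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the solver used in the talk: a margin perceptron stands in for the real SVM optimizer, the optimality condition is reduced to `y * f(x) >= 1`, and the Map and Reduce steps run in a single process with Python's built-in `map` and `functools.reduce`. All function names are hypothetical. Step 5 (asking the user whether the returned images are correct) is outside the sketch.

```python
from functools import reduce

def score(w, b, x):
    """Linear decision value f(x) = w.x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def fit_working_subset(w, b, X, y, working, max_epochs=1000):
    """Stand-in optimizer (margin perceptron), NOT a real SVM QP solver."""
    for _ in range(max_epochs):
        changed = False
        for i in working:
            if y[i] * score(w, b, X[i]) < 1:
                w = [wi + y[i] * xi for wi, xi in zip(w, X[i])]
                b += y[i]
                changed = True
        if not changed:
            break
    return w, b

def train(X, y):
    w, b = [0.0] * len(X[0]), 0.0
    working = [0]                                       # step 1: initial working subset
    while True:
        w, b = fit_working_subset(w, b, X, y, working)  # step 2: optimize weights
        outside = [i for i in range(len(X)) if i not in working]
        # Map: violation of the simplified condition for every outside vector.
        violations = map(lambda i: (1 - y[i] * score(w, b, X[i]), i), outside)
        # Reduce: keep the greatest violator, if any.
        worst, index = reduce(max, violations, (0.0, -1))
        if worst <= 0:                                  # step 4: all conditions satisfied
            return w, b, working
        working.append(index)                           # step 3: grow the working subset

# Hypothetical, linearly separable toy data.
X = [(1, 1), (2, 1), (1, 2), (5, 5), (6, 5), (5, 6)]
y = [-1, -1, -1, 1, 1, 1]
w, b, working = train(X, y)
print(w, b, working)
```

Because each pass adds a vector that is not yet in the working subset, the outer loop stops after at most as many passes as there are vectors.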

This image shows the current version space given the images labeled by the user, the current classifier (wi), and the unlabeled instances. The instance closest to wi is the one that will be shown to the user next.

In active learning, the learner has the flexibility to choose the data points that it considers most relevant for learning a particular task

◦ An analogy is that a standard passive learner is a student that sits and listens to a teacher while an active learner is a student that asks the teacher questions, listens to the answers and asks further questions based upon the teacher's response
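One common way for an active learner to "ask questions" is to query the unlabeled images it is least sure about. The sketch below picks the `k` pool vectors closest to the current decision boundary, assuming a linear classifier with weights `w` and offset `b`. The function name and the example pool are hypothetical.

```python
def select_queries(w, b, pool, k):
    """Return indices of the k unlabeled vectors closest to the boundary."""
    def closeness(i):
        # |w.x + b| ranks vectors the same way as true distance,
        # because dividing by the norm of w does not change the order.
        return abs(sum(wi * xi for wi, xi in zip(w, pool[i])) + b)
    return sorted(range(len(pool)), key=closeness)[:k]

# Hypothetical classifier and pool of unlabeled feature vectors.
w, b = [1.0, 0.0], 0.0
pool = [(3.0, 0.0), (-0.5, 2.0), (1.0, 1.0), (-4.0, 1.0)]
print(select_queries(w, b, pool, k=2))
```

The selected images would be shown to the user for labeling, after which the classifier is retrained and the selection repeats.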

Representing the Images

◦Segmentation

◦Low Level Features: Color, Texture

Information about color, texture, or shape that is extracted from an image is known as image features

◦Also called low-level features, e.g. red, sandy

◦As opposed to high-level features or concepts, e.g. beaches, mountains, happy, serene, George Bush

Do we consider the whole image or just part?

◦Whole image - global features

◦Parts of image - local features

Segment images into parts

Two sorts:

◦Tile based

◦Region based

[Figure: example segmentations. (a) 5 tiles, (b) 9 tiles, (c) 5 regions, (d) 9 regions]

Tiles

Break image down into simple geometric shapes

Similar problems to global features, plus the danger of breaking up significant objects

Computationally simple

Some schemes seem to work well in practice

Regions

Break image down into visually coherent areas

Can identify meaningful areas and objects

Computationally intensive

Unreliable
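Tile-based segmentation is simple enough to sketch in a few lines. The example below cuts an image into a regular grid of rectangular tiles, assuming the image is a list of pixel rows. The function name is hypothetical, and a 3 x 3 grid is only one of the possible tiling schemes.

```python
def split_into_tiles(image, rows, cols):
    """Split a 2-D image (list of pixel rows) into rows x cols rectangular tiles."""
    height, width = len(image), len(image[0])
    tiles = []
    for r in range(rows):
        top, bottom = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            left, right = c * width // cols, (c + 1) * width // cols
            tiles.append([row[left:right] for row in image[top:bottom]])
    return tiles

# Hypothetical 6 x 6 image whose pixel values are just running numbers.
image = [[r * 6 + c for c in range(6)] for r in range(6)]
tiles = split_into_tiles(image, 3, 3)
print(len(tiles), tiles[0])
```

Each tile can then be given its own color or texture signature, which is how local features are obtained without any object detection.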

Produce a color signature for region/whole image

Typically done using color correlograms or color histograms

Identify a number of buckets in which to sort the available colours (e.g. red, green and blue, or up to ten or so colours)

Allocate each pixel in an image to a bucket and count the number of pixels in each bucket.

Use the figure produced (bucket id plus count, normalised for image size and resolution) as the index key (signature) for each image
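The bucket-and-count procedure above can be sketched as follows. The example assumes pixels are RGB tuples and assigns each pixel to the nearest reference colour. The three-colour `PALETTE` and the function names are hypothetical choices for illustration, and a real system would likely use more buckets.

```python
# Hypothetical bucket definitions: one reference RGB colour per bucket.
PALETTE = {"red": (255, 0, 0), "green": (0, 255, 0), "blue": (0, 0, 255)}

def nearest_bucket(pixel, palette):
    """Name of the bucket whose reference colour is closest to the pixel."""
    return min(
        palette,
        key=lambda name: sum((p - q) ** 2 for p, q in zip(pixel, palette[name])),
    )

def color_signature(pixels, palette=PALETTE):
    """Fraction of pixels per bucket, so images of any size are comparable."""
    if not pixels:
        raise ValueError("image has no pixels")
    counts = {name: 0 for name in palette}
    for pixel in pixels:
        counts[nearest_bucket(pixel, palette)] += 1
    return {name: count / len(pixels) for name, count in counts.items()}

# Hypothetical image: three reddish pixels and one bluish pixel.
pixels = [(250, 10, 10)] * 3 + [(0, 0, 200)]
print(color_signature(pixels))
```

Dividing by the pixel count is the normalisation step: two copies of the same picture at different resolutions end up with the same signature.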

[Figure: color histogram of pixel counts per bucket; vertical axis from 0 to 90, visible bucket labels: Red, Orange]

Texture: produce a mathematical characterization of a repeating pattern in the image

◦Smooth

◦Sandy

◦Grainy

◦Stripey

Reduces an area/region to a small set of numbers (around 15?) which can be used as a signature for that region

Proven to work well in practice

Hard for people to understand
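To make the idea of a numeric texture signature concrete, here is a deliberately tiny sketch. It reduces a greyscale patch to four numbers: mean intensity, standard deviation, and the average absolute change between horizontal and vertical neighbours. These four statistics are an assumption chosen for illustration, not the texture measure used in the talk, and the function name is hypothetical.

```python
import math

def texture_signature(patch):
    """Reduce a greyscale patch (at least 2 x 2) to four summary numbers."""
    values = [v for row in patch for v in row]
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    # Average absolute difference between horizontally adjacent pixels.
    horiz = [abs(row[c + 1] - row[c]) for row in patch for c in range(len(row) - 1)]
    # Average absolute difference between vertically adjacent pixels.
    vert = [
        abs(patch[r + 1][c] - patch[r][c])
        for r in range(len(patch) - 1)
        for c in range(len(patch[0]))
    ]
    return (mean, std, sum(horiz) / len(horiz), sum(vert) / len(vert))

# Hypothetical patches: a flat grey area and vertical stripes.
smooth = [[100] * 4 for _ in range(4)]
stripey = [[0, 255, 0, 255] for _ in range(4)]
print(texture_signature(smooth))
print(texture_signature(stripey))
```

A smooth patch gives zero variation in both directions, while vertical stripes vary strongly between horizontal neighbours only. The numbers separate the textures even though they are hard to interpret by eye.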

Relevance feedback is a well-established technique in text retrieval

◦ Experimental results have always shown it to work well in practice

Unfortunately, experience with search engines has shown it is difficult to get real searchers to adopt it: too much interaction

User performs an initial query

Selects some relevant results

System then extracts terms from these to augment the initial query

Requeries

Identify the N top-ranked images

Identify all terms from the N top-ranked images

Select the feedback terms

Merge the feedback terms with the original query

Identify the top-ranked images for the modified queries through relevance ranking
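The term-selection and merge steps above might look like the sketch below. It assumes each image carries a list of annotation terms and that feedback terms are chosen by how many of the top N images contain them. That selection rule, the function name, and the example data are all hypothetical. The final re-ranking step is left to whatever retrieval engine is in use.

```python
from collections import Counter

def expand_query(query_terms, ranked_image_terms, n, k):
    """Add the k most frequent new terms from the top-n images to the query."""
    counts = Counter()
    for terms in ranked_image_terms[:n]:
        # Count each term once per image and skip terms already in the query.
        counts.update(set(terms) - set(query_terms))
    feedback = [term for term, _ in counts.most_common(k)]
    return list(query_terms) + feedback

# Hypothetical ranked results, each image described by its terms.
ranked = [
    ["beach", "sand", "sea"],
    ["beach", "sea", "sun"],
    ["mountain"],
]
print(expand_query(["beach"], ranked, n=2, k=1))
```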

Q’ = aQ + b sum(R) - c sum(S)

◦Q: original query vector

◦R: set of relevant image vectors

◦S: set of non-relevant image vectors

◦a, b, c: constants (Rocchio weights)

◦Q’: new query vector
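The update Q’ = aQ + b sum(R) - c sum(S) translates directly into code. The sketch below assumes all vectors are plain lists of equal length. The default weights and the function name are hypothetical and would need tuning in practice.

```python
def rocchio(query, relevant, non_relevant, a=1.0, b=0.5, c=0.25):
    """Return Q' = a*Q + b*sum(R) - c*sum(S), computed component by component."""
    def vector_sum(vectors):
        if not vectors:
            return [0.0] * len(query)
        return [sum(component) for component in zip(*vectors)]

    r_sum = vector_sum(relevant)
    s_sum = vector_sum(non_relevant)
    return [a * q + b * r - c * s for q, r, s in zip(query, r_sum, s_sum)]

# Hypothetical 2-D feature vectors.
Q = [1.0, 0.0]
R = [[1.0, 1.0], [0.0, 1.0]]   # images the user marked relevant
S = [[1.0, 0.0]]               # images the user marked non-relevant
print(rocchio(Q, R, S))
```

The new query moves toward the relevant vectors and away from the non-relevant ones, with a, b and c controlling how far.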

Simon Tong (Stanford University) and Edward Chang (UCSB), “SVM Active Learning for Image Retrieval”

John Tait, University of Sunderland, UK, tait.ppt

Simon Tong’s website: http://robotics.stanford.edu/~stong/research.html
