Support vector machine concept-dependent active learning for image retrieval

Reporter: Francis

2005-7-5

1. Introduction

RF (relevance feedback): a query refinement scheme by which a user informs the database of his query concept.

Such a query refinement scheme (query-concept learner) is a case of pool-based active learning. In the beginning, the unlabeled pool would be the entire image database.

1-1 Active learning

Traditional (passive learning): randomly select k images for the training set.

Active learning: choose informative images within the pool to present to the user. Such a request is called a pool-query.

The learner should choose its next pool-query based upon the answers to previous pool-queries.

Our approach is called SVM active learner.

1-2 Querying example

2. Support vector machines and version space

3. Active learning and batch sampling strategies

Two steps:

Sampling: request user feedback on the query concept. This is the key step of the SVM active learner.

Learning: train on the feedback to obtain a better classifier.

Then return the k images farthest from the boundary on the relevant side.
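
The final retrieval step can be sketched as follows. This is a minimal illustration, not the presenter's implementation: it assumes each image already has a signed SVM output f(x), with positive values on the relevant side, and the function name `top_k_relevant` and the sample values are hypothetical.

```python
def top_k_relevant(decision_values, k):
    """Return indices of the k images farthest from the boundary on the relevant side.

    decision_values[i] is assumed to be the signed classifier output f(x_i):
    positive means the relevant side, larger means farther from the boundary.
    """
    relevant = [i for i, f in enumerate(decision_values) if f > 0]
    relevant.sort(key=lambda i: decision_values[i], reverse=True)
    return relevant[:k]


# Hypothetical decision values for six database images.
scores = [0.2, -1.3, 1.7, 0.9, -0.1, 2.4]
top3 = top_k_relevant(scores, 3)
print(top3)
```

Images on the irrelevant side are never returned, even if fewer than k relevant images exist.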

3-1 Speculative sampling

It’s computationally intensive. We use it as a yardstick to measure other active-learning strategies.

3-2 Batch-simple sampling

Choose the h unlabeled instances closest to the hyperplane (the boundary between the relevant and the irrelevant instances in the feature space).
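
A small sketch of this selection rule, assuming the distance to the hyperplane is approximated by |f(x)| for a signed classifier output f(x). The function name `batch_simple` and the data are hypothetical.

```python
def batch_simple(decision_values, unlabeled, h):
    """Pick the h unlabeled instances with the smallest |f(x)|."""
    return sorted(unlabeled, key=lambda i: abs(decision_values[i]))[:h]


# Hypothetical decision values; index 3 is assumed to be labeled already.
f = [0.2, -1.3, 1.7, 0.9, -0.1, 2.4]
pool = [0, 1, 2, 4, 5]
batch = batch_simple(f, pool, 2)
print(batch)
```

The rule looks only at closeness to the boundary, so the chosen instances may be very similar to one another.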

3-3 Angle-diversity sampling

Maintains diversity among the selected samples. Diversity of samples is measured by the angles between the samples.

Score: [equation not preserved; it is annotated with two unlabeled instances]

The trade-off parameter is set at 0.5.
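
The score equation did not survive on this slide, so the sketch below rests on an assumption rather than on the presenter's formula: a candidate's score is taken to be λ·|f(x)| plus (1 − λ) times its largest absolute cosine with any already selected sample, and the candidate with the smallest score is picked greedily. Angles are computed in the input space (a linear-kernel simplification), feature vectors are assumed to be nonzero, and all names and data are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two nonzero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def angle_diversity(points, decision_values, unlabeled, h, lam=0.5):
    """Greedily pick h instances that are near the boundary and mutually diverse."""
    selected = []
    candidates = list(unlabeled)
    while candidates and len(selected) < h:
        def score(i):
            closeness = abs(decision_values[i])  # assumed distance to the hyperplane
            redundancy = max(
                (abs(cosine(points[i], points[j])) for j in selected),
                default=0.0,  # nothing selected yet, so no redundancy penalty
            )
            return lam * closeness + (1.0 - lam) * redundancy
        best = min(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical 2-D features and decision values.
points = [(1.0, 0.0), (1.0, 0.1), (0.0, 1.0), (1.0, 1.0)]
f = [0.1, 0.15, 0.3, 0.9]
picked = angle_diversity(points, f, unlabeled=[0, 1, 2, 3], h=2)
print(picked)
```

On this toy data the two instances closest to the boundary are 0 and 1, which point in almost the same direction. With λ = 0.5 the sketch selects 0 and 2 instead, trading a little closeness for diversity.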

3-4 Error-reduction sampling

4-1 Concept complexity

1. Scarcity: indicated by the hit rate. Ex: keyword “sun” vs. “sunrise”

2. Diversity: Ex: the “flowers” concept is more diverse than the “red roses” concept.

4-1 Concept complexity (cont.)

3. Isolation: input-space isolation and keyword isolation

Keyword isolation uses association-rule mining. Ex: fruit → apple (0.5) vs. apple → fruit (0.7)

1. 0.25 “Fruit” is poorly isolated from “apple”

2. 0.21 “Apple” is well isolated from “fruit”
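
A sketch of how rule strengths like these could be computed, assuming the figures in parentheses are association-rule confidences, that is, the fraction of images annotated with the first keyword that also carry the second. The slide does not say how the values were obtained. The annotation sets are invented so that the two confidences come out to the slide's 0.5 and 0.7.

```python
def confidence(annotations, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent over keyword sets."""
    with_antecedent = [kws for kws in annotations if antecedent in kws]
    if not with_antecedent:
        return 0.0
    both = sum(1 for kws in with_antecedent if consequent in kws)
    return both / len(with_antecedent)


# Hypothetical annotations: 7 images with both keywords,
# 7 with only "fruit", 3 with only "apple".
annotations = [{"fruit", "apple"}] * 7 + [{"fruit"}] * 7 + [{"apple"}] * 3

fruit_to_apple = confidence(annotations, "fruit", "apple")  # 7 of 14 fruit images
apple_to_fruit = confidence(annotations, "apple", "fruit")  # 7 of 10 apple images
print(fruit_to_apple, apple_to_fruit)
```

The measure is asymmetric: the two directions of the same keyword pair generally give different values.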

4-2 Limitations of active learning

When the target concept instances are scarce and not well isolated, active learning will be ineffective.

1. Scarce: a common situation is that fewer than 1% of the images match the target concept, so many feedback iterations are needed to obtain positive feedback.
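
A back-of-the-envelope sketch of why scarcity hurts. It assumes a batch drawn uniformly at random, with each image matching the concept independently with probability equal to the hit rate; the batch size of 20 is an illustrative choice.

```python
def prob_at_least_one_positive(hit_rate, batch_size):
    """Chance that a uniformly random batch contains at least one matching image."""
    return 1.0 - (1.0 - hit_rate) ** batch_size


# Illustrative values: 1% hit rate, 20 images shown per round.
p = prob_at_least_one_positive(0.01, 20)
expected_rounds = 1.0 / p  # mean of a geometric distribution with success rate p
print(round(p, 3), round(expected_rounds, 1))
```

Under these assumptions fewer than one random batch in five contains any positive example, so several rounds may pass before the learner sees its first relevant image.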

4-2 Limitations of active learning (cont.)

2. Not well isolated:

4-3 Concept-dependent active learning algorithms

State C – keyword disambiguation

State B – input-space disambiguation

State D – keyword & input-space disambiguation

4-3-1 Keyword disambiguation

Remove from the unlabeled set the elements that share keywords with the negative feedback.

Randomly pick n elements that have ambiguous keywords.
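
One possible reading of these two steps, sketched under stated assumptions: each unlabeled image carries a set of keywords, the keywords collected from negative feedback exclude the ambiguous query keyword itself, and the random pick is made after the pruning step. The function name, the data, and the fixed seed are all hypothetical.

```python
import random


def disambiguate_keywords(unlabeled, negative_keywords, ambiguous_keyword, n, seed=0):
    """Prune the unlabeled set, then sample n images carrying the ambiguous keyword."""
    # Step 1: drop images that share any keyword with the negative feedback.
    kept = {img: kws for img, kws in unlabeled.items()
            if not (kws & negative_keywords)}
    # Step 2: randomly draw up to n remaining images that have the ambiguous keyword.
    candidates = sorted(img for img, kws in kept.items() if ambiguous_keyword in kws)
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    return kept, rng.sample(candidates, min(n, len(candidates)))


# Hypothetical pool: the user asked for "apple" and rejected computer images.
pool = {
    "a": {"apple", "fruit"},
    "b": {"apple", "computer"},
    "c": {"apple", "tree"},
    "d": {"apple", "laptop", "computer"},
    "e": {"banana"},
}
kept, picked = disambiguate_keywords(pool, {"computer"}, "apple", n=2)
print(sorted(kept), sorted(picked))
```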

4-3-2 Input-space disambiguation

4-3-3 State D & A

State D: uses the DK (keyword disambiguation) and DS (input-space disambiguation) algorithms

State A: adapts to diversity

Ex: “flowers” concept: learner may need to be more explorative and search for flowers of all colors.

Classification score function:

In state A, λ is reduced so that angle diversity gets more weight during sample selection.

5. Experiments

Using five image datasets from the Corel image database.

Four-category set: 602 images

Ten-category set: 1277 images

Fifteen-category set: 1920 images

107-category set: 50000 images

Large set: 300K images from a stock-photo company

5-1 Active learning vs. passive learning

First round: 20 images chosen by random sampling. Afterwards, active learning selects 10 or 20 images.

5-2 Comparison against traditional relevance feedback schemes

5-3 Sampling method evaluation

Using the 107-category dataset

Error reduction sampling

5-3 Sampling method evaluation

5-4 Concept-dependent learning
