
Support Vector Machines

Pattern Recognition, Second Edition

Sergios Theodoridis, Konstantinos Koutroumbas

A Tutorial on Support Vector Machines for Pattern Recognition

C. J. C. Burges

Data Mining and Knowledge Discovery, 1998

Separable Case

Maximum Margin Formulation

Separable Case

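Label the training data: $\{x_i, y_i\},\ i = 1, \dots, l,\ y_i \in \{-1, 1\},\ x_i \in \mathbb{R}^d$

Points $x$ on the hyperplane satisfy $g(x) = w^T x + b = 0$

• $w$: normal to the hyperplane
• $|b| / \|w\|$: perpendicular distance from the hyperplane to the origin
• $d_+$ ($d_-$): margin, the shortest distance from the hyperplane to the closest positive (negative) example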

Separable Case

[Figure: separating hyperplane with margins d+ and d-, positive examples and negative examples]

Separable Case

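Suppose that all the training data satisfy the following constraints:

$w^T x_i + b \ge +1$ for $y_i = +1$
$w^T x_i + b \le -1$ for $y_i = -1$

These can be combined into one set of inequalities:

$y_i (w^T x_i + b) - 1 \ge 0, \quad \forall i$

Distance of a point from a hyperplane: a point $x$ lies at distance $|w^T x + b| / \|w\|$ from the hyperplane, so the closest points of each class, which meet the constraints with equality, give

$d_+ = d_- = \frac{1}{\|w\|}$

[Figure: class 1 and class 2 separated by the hyperplane]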
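The constraints and the point-to-hyperplane distance can be checked numerically. The following is a minimal sketch, not code from the slides: the toy points, the labels and the hyperplane `(w, b)` are hypothetical values chosen so that the closest points meet the constraints with equality.

```python
import math

def g(w, b, x):
    """Decision function g(x) = w^T x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def distance_to_hyperplane(w, b, x):
    """Perpendicular distance |g(x)| / ||w|| from x to the hyperplane."""
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    return abs(g(w, b, x)) / norm_w

def satisfies_constraints(w, b, points, labels, tol=1e-12):
    """True if y_i (w^T x_i + b) - 1 >= 0 holds for every training point."""
    return all(y * g(w, b, x) - 1 >= -tol for x, y in zip(points, labels))

# Hypothetical toy data and hyperplane (not taken from the slides).
points = [(1.0, 1.0), (2.0, 2.0), (-1.0, -1.0), (-2.0, 0.0)]
labels = [1, 1, -1, -1]
w, b = (0.5, 0.5), 0.0

feasible = satisfies_constraints(w, b, points, labels)
# d+ and d-: distance from the hyperplane to the closest point of each class.
d_plus = min(distance_to_hyperplane(w, b, x)
             for x, y in zip(points, labels) if y == 1)
d_minus = min(distance_to_hyperplane(w, b, x)
              for x, y in zip(points, labels) if y == -1)
print(feasible, d_plus, d_minus)
```

For this choice both distances equal 1/||w||, which is what the combined inequality predicts when the closest points satisfy it with equality.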

Separable Case

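Having a margin of

$\frac{1}{\|w\|} + \frac{1}{\|w\|} = \frac{2}{\|w\|}$

Task: compute the parameters $w$, $b$ of the hyperplane that maximize the margin, which is equivalent to

minimize $J(w) = \frac{1}{2} \|w\|^2$
subject to $y_i (w^T x_i + b) - 1 \ge 0, \quad \forall i$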

Separable Case

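Karush-Kuhn-Tucker (KKT) conditions

$\lambda$: vector of the Lagrange multipliers
$L(w, b, \lambda)$: Lagrangian function, defined over the $N$ training points

$L(w, b, \lambda) = \frac{1}{2} \|w\|^2 - \sum_{i=1}^{N} \lambda_i [y_i (w^T x_i + b) - 1]$

$\frac{\partial}{\partial w} L(w, b, \lambda) = 0$
$\frac{\partial}{\partial b} L(w, b, \lambda) = 0$
$\lambda_i \ge 0, \quad i = 1, 2, \dots, N$
$\lambda_i [y_i (w^T x_i + b) - 1] = 0, \quad i = 1, 2, \dots, N$

Setting the two derivatives to zero gives

$w = \sum_{i=1}^{N} \lambda_i y_i x_i$
$\sum_{i=1}^{N} \lambda_i y_i = 0$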
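The KKT conditions can be verified on a problem small enough to solve by hand. The sketch below assumes a hypothetical two-point training set whose multipliers (0.25 each) were worked out analytically; it recovers w and b from the multipliers and checks every condition listed above. None of the values come from the slides.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def recover_w(lams, ys, xs):
    """w = sum_i lambda_i y_i x_i (stationarity with respect to w)."""
    dim = len(xs[0])
    return [sum(l * y * x[d] for l, y, x in zip(lams, ys, xs)) for d in range(dim)]

def recover_b(w, lams, ys, xs, tol=1e-9):
    """For a support vector (lambda_i > 0), y_i (w^T x_i + b) = 1.
    Because y_i is +1 or -1, this gives b = y_i - w^T x_i."""
    for l, y, x in zip(lams, ys, xs):
        if l > tol:
            return y - dot(w, x)
    raise ValueError("no support vector found")

def kkt_holds(w, b, lams, ys, xs, tol=1e-9):
    """Check sum_i lambda_i y_i = 0, lambda_i >= 0, primal feasibility
    and complementary slackness."""
    balanced = abs(sum(l * y for l, y in zip(lams, ys))) <= tol
    nonneg = all(l >= -tol for l in lams)
    feasible = all(y * (dot(w, x) + b) - 1 >= -tol for y, x in zip(ys, xs))
    slack = all(abs(l * (y * (dot(w, x) + b) - 1)) <= tol
                for l, y, x in zip(lams, ys, xs))
    return balanced and nonneg and feasible and slack

# Hypothetical example: one point per class, multipliers found by hand.
xs = [(1.0, 1.0), (-1.0, -1.0)]
ys = [1, -1]
lams = [0.25, 0.25]

w = recover_w(lams, ys, xs)
b = recover_b(w, lams, ys, xs)
margin = 2.0 / math.sqrt(dot(w, w))
print(w, b, margin, kkt_holds(w, b, lams, ys, xs))
```

In this example the margin 2/||w|| equals the Euclidean distance between the two points, as expected when both are support vectors.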

Separable Case

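Wolfe dual representation form

maximize $L(w, b, \lambda)$
subject to $w = \sum_{i=1}^{N} \lambda_i y_i x_i$, $\quad \sum_{i=1}^{N} \lambda_i y_i = 0$, $\quad \lambda \ge 0$

Substituting the equality constraints into the Lagrangian gives the equivalent problem

$\max_{\lambda} \left( \sum_{i=1}^{N} \lambda_i - \frac{1}{2} \sum_{i,j} \lambda_i \lambda_j y_i y_j x_i^T x_j \right)$
subject to $\sum_{i=1}^{N} \lambda_i y_i = 0$, $\quad \lambda \ge 0$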
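To make the dual concrete, the sketch below evaluates the dual objective on a hypothetical two-point problem and finds its maximizer by a coarse grid search. The grid search is only an illustration for this tiny case, not a practical solver, and the data are not from the slides. With one point per class, the equality constraint forces both multipliers to the same value t, so the search is one-dimensional.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def dual_objective(lams, ys, xs):
    """sum_i lambda_i - 1/2 sum_{i,j} lambda_i lambda_j y_i y_j x_i^T x_j."""
    n = len(lams)
    linear = sum(lams)
    quad = sum(lams[i] * lams[j] * ys[i] * ys[j] * dot(xs[i], xs[j])
               for i in range(n) for j in range(n))
    return linear - 0.5 * quad

# Hypothetical data: one point per class.
xs = [(1.0, 1.0), (-1.0, -1.0)]
ys = [1, -1]

# sum_i lambda_i y_i = 0 forces lambda_1 = lambda_2 = t, with t >= 0.
candidates = [k / 100 for k in range(101)]
best_t = max(candidates, key=lambda t: dual_objective([t, t], ys, xs))
best_value = dual_objective([best_t, best_t], ys, xs)

# Recover w from the maximizer and compare with the primal cost 1/2 ||w||^2.
w = [sum(best_t * y * x[d] for y, x in zip(ys, xs)) for d in range(2)]
primal_value = 0.5 * dot(w, w)
print(best_t, best_value, primal_value)
```

The dual maximum and the primal cost coincide here, which is the zero duality gap expected for this convex problem.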

Image Categorization by Learning and Reasoning with Regions

Yixin Chen, University of New Orleans

James Z. Wang, The Pennsylvania State University

Journal of Machine Learning Research 5 (2004)
(Submitted 7/03; Revised 11/03; Published 8/04)

Introduction

Automatic image categorization

Difficulties
• Variable & uncontrolled image conditions
• Complex and hard-to-describe objects in image
• Objects occluding other objects

Applications
• Digital libraries, Space science, Web searching, Geographic information systems, Biomedicine, Surveillance and sensor systems, Commerce, Education

Overview

Given a set of labeled images, can a computer program learn such knowledge or semantic concepts from implicit information about the objects contained in the images?

Related Work

Multiple-Instance Learning
• Diverse Density Function (1998)
• MI-SVM (2003)

Image Categorization
• Color Histograms (1998-2001)
• Subimage-based Methods (1994-2004)

Motivation

Correct categorization of an image depends on identifying multiple aspects of the image

Extension of MIL → a bag must contain a number of instances satisfying various properties

A New Formulation of Multiple-Instance Learning

Maximum margin problem in a new feature space defined by the DD function

DD-SVM
• In the instance feature space, a collection of feature vectors, each of which is called an instance prototype, is determined according to DD

A New Formulation of Multiple-Instance Learning

Instance prototype:
• A class of instances (or regions) that is more likely to appear in bags (or images) with the specific label than in the other bags

Each instance prototype defines one coordinate of a mapping that sends every bag to a point in the bag feature space

Standard SVMs are then trained in the bag feature space

Outline

Image segmentation & feature representation
DD-SVM, an extension of MIL
Experiments & results
Conclusions & future work

Image Segmentation

Partitions the image into non-overlapping blocks of size 4x4 pixels

Each feature vector consists of six features
• Average color components in a block (LUV color space)
• Square root of the second order moment of wavelet coefficients in high-frequency bands

Image Segmentation

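Daubechies-4 wavelet transform

Moments of wavelet coefficients in various frequency bands are effective for representing texture (Unser, 1995)

[Figure: wavelet decomposition into LL, HL, LH and HH bands, with a block of 2x2 coefficients indexed from (k, l)]

$f = \left( \frac{1}{4} \sum_{i=0}^{1} \sum_{j=0}^{1} c_{k+i,\, l+j}^2 \right)^{1/2}$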
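The texture feature is the root mean square of the four wavelet coefficients of a block in one high-frequency band. A minimal sketch follows; the coefficient values are made up for illustration, and computing the wavelet transform itself is outside the scope of the example.

```python
import math

def band_texture_feature(block):
    """Square root of the second order moment of a 2x2 coefficient block.

    block[i][j] holds the coefficient c_{k+i, l+j} for i, j in {0, 1}.
    """
    total = sum(c * c for row in block for c in row)
    return math.sqrt(total / 4.0)

# Hypothetical coefficients of one 4x4 image block in a high-frequency band.
hl_block = [[1.0, 2.0],
            [2.0, 4.0]]
f_hl = band_texture_feature(hl_block)
print(f_hl)  # 2.5, since (1 + 4 + 4 + 16) / 4 = 6.25
```

Applying the same function to each of the three high-frequency bands would give three texture features which, together with three average color components, would account for the six features per block.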

Image Segmentation

k-means algorithm: cluster the feature vectors into several classes with every class corresponding to one “region”

Adaptively select N by gradually increasing N until a stopping criterion is met (Wang et al. 2001)
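The adaptive choice of N can be sketched as a loop that reruns k-means with more clusters until a stopping rule fires. The slides do not state the stopping criterion, so the average-distortion threshold used here is a placeholder assumption, as are the starting value of N, the farthest-point initialization and the toy 2-D data (real feature vectors would have six dimensions).

```python
def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
            for p in points]

def kmeans(points, n_clusters, n_iter=100):
    # Deterministic farthest-point initialization (an assumption, chosen
    # so that the example is reproducible).
    centers = [points[0]]
    while len(centers) < n_clusters:
        centers.append(max(points,
                           key=lambda p: min(sq_dist(p, c) for c in centers)))
    for _ in range(n_iter):
        labels = assign(points, centers)
        new_centers = []
        for k in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append(tuple(sum(col) / len(members)
                                         for col in zip(*members)))
            else:
                new_centers.append(centers[k])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    labels = assign(points, centers)
    distortion = sum(sq_dist(p, centers[lab])
                     for p, lab in zip(points, labels)) / len(points)
    return centers, labels, distortion

def adaptive_kmeans(points, min_clusters=2, max_clusters=8, max_distortion=1.0):
    """Increase N until the average distortion drops below the threshold."""
    for n in range(min_clusters, max_clusters + 1):
        centers, labels, distortion = kmeans(points, n)
        if distortion <= max_distortion:
            break
    return centers, labels

# Hypothetical feature vectors forming three well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
          (0.0, 10.0), (0.0, 11.0), (1.0, 10.0)]
centers, labels = adaptive_kmeans(points)
print(len(centers), labels)
```

Each resulting cluster plays the role of one region; the number of regions therefore varies from image to image.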

Segmentation Results

Image Representation

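$\hat{f}_j$: the mean of the set of feature vectors corresponding to each region $R_j$

Shape properties of each region: normalized inertia of order 1, 2, 3 (Gersho, 1979)

$I(R_j, \gamma) = \frac{\sum_{r \in R_j} \|r - \hat{r}\|^{\gamma}}{V_j^{1 + \gamma/2}}$

where $\hat{r}$ is the centroid of $R_j$ and $V_j$ is the number of pixels in $R_j$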
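A short sketch of the normalized inertia follows. It reads the formula with the region centroid as the reference point and the pixel count as the volume term, which is an assumption about the symbols, and it uses a made-up four-pixel region.

```python
import math

def normalized_inertia(pixels, gamma):
    """I(R, gamma) = sum_r ||r - centroid||^gamma / V^(1 + gamma / 2).

    pixels: list of (row, col) coordinates belonging to the region.
    V is taken to be the number of pixels (assumption).
    """
    v = len(pixels)
    cy = sum(p[0] for p in pixels) / v
    cx = sum(p[1] for p in pixels) / v
    total = sum(math.hypot(p[0] - cy, p[1] - cx) ** gamma for p in pixels)
    return total / v ** (1 + gamma / 2)

# Hypothetical region: the four corners of a small square.
region = [(0, 0), (0, 2), (2, 0), (2, 2)]
inertia = [normalized_inertia(region, gamma) for gamma in (1, 2, 3)]
print(inertia)
```

For this region the order-2 value is approximately 0.5: each pixel lies at squared distance 2 from the centroid (1, 1), so the numerator is 8 and the denominator is 4 squared.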

Image Representation

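Shape feature of region $R_j$ as

$s_j = \left[ \frac{I(R_j, 1)}{I_1},\ \frac{I(R_j, 2)}{I_2},\ \frac{I(R_j, 3)}{I_3} \right]^T$

An image $B_i$
• Segmentation: $\{R_j : j = 1, \dots, N_i\}$
• Feature vectors: $\{x_{ij} : j = 1, \dots, N_i\}$

$x_{ij} = \left[ \hat{f}_j^T,\ s_j^T \right]^T$, a 9-dimensional feature vector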

An extension of Multiple-Instance Learning

Maximum margin formulation of MIL in a bag feature space

Constructing a bag feature space
• Diverse density
• Learning instance prototypes
• Computing bag features

Maximum Margin Formulation of MIL in a Bag Feature Space

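Basic idea of new MIL framework:
• Map every bag to a point in a new feature space, named the bag feature space
• Train SVMs in the bag feature space

$\alpha^* = \arg\max_{\alpha} \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{l} y_i y_j \alpha_i \alpha_j K(\phi(B_i), \phi(B_j))$

subject to
$\sum_{i=1}^{l} y_i \alpha_i = 0$
$C \ge \alpha_i \ge 0, \quad i = 1, \dots, l$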

Constructing a Bag Feature Space

Clues for classifier design:
• What is common in positive bags and does not appear in the negative bags
• Instance prototypes computed from the DD function

A bag feature space is then constructed using the instance prototypes

Diverse Density (Maron and Lozano-Perez, 1998)

A function defined over the instance space

DD value at a point in the feature space
• The probability that the point agrees with the underlying distribution of positive and negative bags

Diverse Density

It measures a co-occurrence of instances from different (diverse) positive bags

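$DD_D(x, w) = \prod_{i=1}^{l} \left[ \frac{1 + y_i}{2} - y_i \prod_{j=1}^{N_i} \left( 1 - e^{-\|x_{ij} - x\|_w^2} \right) \right]$

$\|x\|_w = \left[ x^T \mathrm{Diag}(w)^2 x \right]^{1/2}$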
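The DD function can be evaluated directly from its definition. In the sketch below the bags, labels and weights are hypothetical: one positive bag contains an instance at the origin plus a distractor, and one negative bag contains only the distractor, so the origin should score high and the distractor low.

```python
import math

def weighted_sq_dist(a, b, w):
    """||a - b||_w^2 = (a - b)^T Diag(w)^2 (a - b)."""
    return sum((wd * (ad - bd)) ** 2 for ad, bd, wd in zip(a, b, w))

def diverse_density(x, w, bags, labels):
    """DD_D(x, w) as a product over bags; labels are +1 or -1."""
    dd = 1.0
    for bag, y in zip(bags, labels):
        # Probability-like term that no instance of the bag is close to x.
        none_close = 1.0
        for inst in bag:
            none_close *= 1.0 - math.exp(-weighted_sq_dist(inst, x, w))
        dd *= (1 + y) / 2 - y * none_close
    return dd

# Hypothetical bags (not data from the slides).
bags = [[(0.0, 0.0), (5.0, 5.0)],   # positive bag
        [(5.0, 5.0)]]               # negative bag
labels = [1, -1]
w = (1.0, 1.0)

dd_target = diverse_density((0.0, 0.0), w, bags, labels)
dd_distractor = diverse_density((5.0, 5.0), w, bags, labels)
print(dd_target, dd_distractor)
```

A point scores high only if every positive bag has an instance near it and no negative bag does, which is why the distractor, present in the negative bag, scores close to zero.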

Learning Instance Prototype

An instance prototype represents a class of instances that is more likely to appear in positive bags than in negative bags

Learning instance prototypes then becomes an optimization problem
• Finding local maximizers of the DD function in a high-dimensional space

Learning Instance Prototype

How do we find the local maximizers?
• Start an optimization at every instance in every positive bag

Constraints:
• Need to be distinct from each other
• Have large DD values
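The two constraints can be illustrated as a filtering step over candidate maximizers. This is a simplified sketch, not the authors' selection rule: the DD threshold, the minimum separation and the use of plain Euclidean distance between points (ignoring the weights) are all assumptions, and the candidates are made-up values standing in for the results of the optimization runs.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def select_prototypes(candidates, dd_threshold, min_separation):
    """Keep candidates with a large DD value that are distinct from each other.

    candidates: list of (point, dd_value) pairs, one per optimization run.
    """
    kept = []
    # Visit the strongest maximizers first so they win over near-duplicates.
    for point, dd in sorted(candidates, key=lambda c: c[1], reverse=True):
        if dd < dd_threshold:
            continue
        if all(euclidean(point, other) >= min_separation for other, _ in kept):
            kept.append((point, dd))
    return kept

# Hypothetical local maximizers and their DD values.
candidates = [((0.0, 0.0), 0.90),
              ((0.05, 0.0), 0.85),   # near-duplicate of the first
              ((3.0, 3.0), 0.70),
              ((8.0, 8.0), 0.10)]    # DD value too small

prototypes = select_prototypes(candidates, dd_threshold=0.5, min_separation=0.5)
print(prototypes)
```

Several optimization runs started from different instances often converge to nearly the same point, which is why a distinctness check of some kind is needed.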

Computing Bag Features

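Let $\{(x_k^*, w_k^*) : k = 1, \dots, n\}$ be the collection of instance prototypes

Bag features $\phi(B_i)$, for $B_i = \{x_{ij} : j = 1, \dots, N_i\}$:

$\phi(B_i) = \begin{bmatrix} \min_{j=1,\dots,N_i} \|x_{ij} - x_1^*\|_{w_1^*} \\ \min_{j=1,\dots,N_i} \|x_{ij} - x_2^*\|_{w_2^*} \\ \vdots \\ \min_{j=1,\dots,N_i} \|x_{ij} - x_n^*\|_{w_n^*} \end{bmatrix}$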
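The bag feature mapping takes, for each instance prototype, the smallest weighted distance from any instance in the bag to that prototype. A minimal sketch with hypothetical 2-D instances and prototypes (the real feature vectors are 9-dimensional):

```python
import math

def weighted_norm(diff, w):
    """||diff||_w = sqrt(diff^T Diag(w)^2 diff)."""
    return math.sqrt(sum((wd * d) ** 2 for d, wd in zip(diff, w)))

def bag_features(bag, prototypes):
    """phi(B): one coordinate per prototype (x_k*, w_k*)."""
    features = []
    for x_star, w_star in prototypes:
        features.append(min(
            weighted_norm([a - b for a, b in zip(x, x_star)], w_star)
            for x in bag))
    return features

# Hypothetical bag and prototypes.
bag = [(0.0, 0.0), (3.0, 4.0)]
prototypes = [((0.0, 0.0), (1.0, 1.0)),
              ((3.0, 0.0), (1.0, 1.0)),
              ((3.0, 4.0), (2.0, 2.0))]

phi = bag_features(bag, prototypes)
print(phi)  # [0.0, 3.0, 0.0]
```

Every bag maps to a vector of the same length n regardless of how many instances it contains, which is what allows a standard SVM to be trained on the result.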

Experimental Setup for Image Categorization

COREL Corp: 2,000 images
• 20 image categories
• JPEG format, size 384x256 (or 256x384)
• Each category is randomly divided into a training set and a test set (50/50)
• SVMLight [Joachims, 1999] software is used to train the SVMs

Sample Images (COREL)

Image Categorization Performance

5 random test sets, 95% confidence intervals

The images belong to Cat.0 ~ Cat.9

[Table: average classification accuracy compared with Chapelle et al., 1999 and Andrews et al., 2003; differences marked on the slide: 14.8% and 6.8%]

Image Categorization Experiments

Sensitivity to Image Segmentation

k-means clustering algorithm with 5 different stopping criteria

1,000 images for Cat.0 ~ Cat.9

Robustness to Image Segmentation

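[Figure: classification accuracy under the different stopping criteria; values marked on the slide: 6.8%, 9.5%, 11.7%, 13.8%, 27.4%]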

Robustness to the Number of Categories in a Data Set

[Figure: difference in average classification accuracies as the number of categories varies; values marked on the slide: 81.5%, 67.5%, 6.8%, 12.9]

Sensitivity to the Size of Training Images

Sensitivity to the Diversity of Training Images Varies

MUSK Data Sets

Speed

40 minutes
• Training set of 500 images (4.31 regions per image)
• Pentium III 700MHz PC running the Linux operating system
• Algorithm is implemented in Matlab and the C programming language
• The majority of the time is spent on learning instance prototypes

Conclusions

A region-based image categorization method using an extension of MIL → DD-SVM
• Image → collection of regions → k-means alg.
• Image → a point in a bag feature space (defined by a set of instance prototypes learned with the DD func.)

SVM-based image classifiers are trained in the bag feature space

DD-SVM outperforms two other methods

DD-SVM generates highly competitive results on the MUSK data sets

Future Work

Limitations
• Region naming (Barnard et al., 2003)
• Texture dependence

Improvement
• Image segmentation algorithm
• DD function

Scene category can be a vector

Semantically-adaptive searching

Art & biomedical images
