Support Vector Machines Pattern Recognition Sergios Theodoridis Konstantinos Koutroumbas Second Edition A Tutorial on Support Vector Machines for Pattern Recognition Data Mining and Knowledge Discovery, 1998 C. J. C. Burges

Post on 20-Dec-2015


Page 1

Support Vector Machines

Pattern Recognition, Second Edition
Sergios Theodoridis, Konstantinos Koutroumbas

A Tutorial on Support Vector Machines for Pattern Recognition
C. J. C. Burges, Data Mining and Knowledge Discovery, 1998

Page 2

Separable Case

Page 3

Maximum Margin Formulation

Page 4

Separable Case

Label the training data: $\{x_i, y_i\},\ i = 1, \dots, l,\ y_i \in \{-1, +1\},\ x_i \in \mathbb{R}^d$

Points on the separating hyperplane satisfy $g(x) = w^T x + b = 0$

$w$: normal to the hyperplane

$|b| / \|w\|$: perpendicular distance from the hyperplane to the origin

$d_+$ ($d_-$): shortest distance from the hyperplane to the closest positive (negative) example; the margin is $d_+ + d_-$

Page 5

Separable Case

[Figure: separating hyperplane with distances d+ and d- to the closest positive example and negative example]

Page 6

Separable Case

Suppose that all the training data satisfy the following constraints:

$$w^T x_i + b \ge +1 \quad \text{for } y_i = +1$$
$$w^T x_i + b \le -1 \quad \text{for } y_i = -1$$

These can be combined into one set of inequalities:

$$y_i (w^T x_i + b) - 1 \ge 0, \quad \forall i$$

Distance of a point from a hyperplane: a point $x$ lies at distance $|g(x)| / \|w\|$; the closest points of each class have $|g(x)| = 1$, so

$$d_+ = d_- = \frac{1}{\|w\|}$$

[Figure labels: class 1, class 2]

Page 7

Separable Case

Having a margin of

$$\frac{1}{\|w\|} + \frac{1}{\|w\|} = \frac{2}{\|w\|}$$

Task: compute the parameters $w$, $b$ of the hyperplane that maximize the margin, or equivalently

$$\text{minimize} \quad J(w) = \frac{1}{2} \|w\|^2$$
$$\text{subject to} \quad y_i (w^T x_i + b) - 1 \ge 0, \quad \forall i$$
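
As a small numerical illustration of the constraints and the margin, the sketch below checks $y_i (w^T x_i + b) - 1 \ge 0$ for a hand-picked hyperplane and evaluates $2 / \|w\|$. The toy points, the values of `w` and `b`, and both helper names are assumptions made for the example; none of them come from the slides.

```python
import numpy as np

def satisfies_constraints(X, y, w, b, tol=1e-12):
    """Return True if y_i (w^T x_i + b) - 1 >= 0 holds for every training point."""
    return bool(np.all(y * (X @ w + b) - 1.0 >= -tol))

def margin_width(w):
    """Total margin d+ + d- = 2 / ||w|| of a canonical separating hyperplane."""
    return 2.0 / np.linalg.norm(w)

# Toy 2-D data (assumed for illustration): two points per class.
X = np.array([[2.0, 0.0], [3.0, 1.0], [-2.0, 0.0], [-3.0, -1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

# Hand-picked hyperplane x_1 = 0, scaled so the closest points give equality.
w = np.array([0.5, 0.0])
b = 0.0

feasible = satisfies_constraints(X, y, w, b)
width = margin_width(w)
print(feasible, width)
```

Shrinking `w` widens the nominal margin but eventually violates the constraints, which is why the minimization of $\|w\|^2$ has to be carried out subject to them.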

Page 8

Separable Case

Karush-Kuhn-Tucker (KKT) conditions

$$\frac{\partial}{\partial w} L(w, b, \lambda) = 0$$
$$\frac{\partial}{\partial b} L(w, b, \lambda) = 0$$
$$\lambda_i \ge 0, \quad i = 1, 2, \dots, N$$
$$\lambda_i [y_i (w^T x_i + b) - 1] = 0, \quad i = 1, 2, \dots, N$$

$\lambda$: vector of the Lagrange multipliers; $L(w, b, \lambda)$: Lagrangian function; $N$: number of training points

$$L(w, b, \lambda) = \frac{1}{2} \|w\|^2 - \sum_{i=1}^{N} \lambda_i [y_i (w^T x_i + b) - 1]$$

The first two conditions give

$$w = \sum_{i=1}^{N} \lambda_i y_i x_i, \qquad \sum_{i=1}^{N} \lambda_i y_i = 0$$
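
Substituting the two stationarity conditions back into the Lagrangian eliminates $w$ and $b$, a step the slides leave implicit. A short derivation, written with the same symbols ($\lambda_i$ for the multipliers, $N$ training points):

```latex
\begin{aligned}
L(w, b, \lambda)
  &= \frac{1}{2} w^T w
     - \sum_{i=1}^{N} \lambda_i y_i \, w^T x_i
     - b \sum_{i=1}^{N} \lambda_i y_i
     + \sum_{i=1}^{N} \lambda_i \\
  &= \frac{1}{2} w^T w - w^T w - 0 + \sum_{i=1}^{N} \lambda_i
     \qquad \Big(\text{using } w = \sum_i \lambda_i y_i x_i,\ \sum_i \lambda_i y_i = 0\Big) \\
  &= \sum_{i=1}^{N} \lambda_i
     - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N}
       \lambda_i \lambda_j y_i y_j \, x_i^T x_j
\end{aligned}
```

The result depends on the training points only through the inner products $x_i^T x_j$.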

Page 9

Separable Case

Wolfe dual representation form

$$\text{maximize} \quad L(w, b, \lambda)$$
$$\text{subject to} \quad w = \sum_{i=1}^{N} \lambda_i y_i x_i, \quad \sum_{i=1}^{N} \lambda_i y_i = 0, \quad \lambda \ge 0$$

Equivalently,

$$\max_{\lambda} \left( \sum_{i=1}^{N} \lambda_i - \frac{1}{2} \sum_{i,j} \lambda_i \lambda_j y_i y_j x_i^T x_j \right)$$
$$\text{subject to} \quad \sum_{i=1}^{N} \lambda_i y_i = 0, \quad \lambda \ge 0$$
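
To make the dual concrete, here is a minimal sketch for the smallest possible problem: one positive and one negative training point, where the dual can be maximized by hand. The function name `two_point_svm` and the example points are assumptions for illustration; a real training set would need a quadratic programming solver instead.

```python
import numpy as np

def two_point_svm(x_pos, x_neg):
    """Closed-form hard-margin SVM for one positive and one negative point.

    With y = (+1, -1), the constraint sum_i lambda_i y_i = 0 forces
    lambda_1 = lambda_2 = lam, so the dual objective reduces to
    2*lam - 0.5 * lam**2 * ||x_pos - x_neg||**2, maximized at
    lam = 2 / ||x_pos - x_neg||**2.
    """
    diff = x_pos - x_neg
    lam = 2.0 / float(diff @ diff)
    w = lam * diff                  # w = sum_i lambda_i y_i x_i
    b = 1.0 - float(w @ x_pos)      # from y_pos * (w^T x_pos + b) = 1
    return lam, w, b

x_pos = np.array([1.0, 1.0])
x_neg = np.array([-1.0, -1.0])
lam, w, b = two_point_svm(x_pos, x_neg)
print(lam, w, b)
```

Both points are support vectors here, so each constraint holds with equality: $w^T x + b$ is $+1$ at the positive point and $-1$ at the negative one.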

Page 10

Image Categorization by Learning and Reasoning with Regions

Yixin Chen, University of New Orleans

James Z. Wang, The Pennsylvania State University

Journal of Machine Learning Research 5 (2004)
(Submitted 7/03; Revised 11/03; Published 8/04)

Page 11

Introduction

Automatic image categorization

Difficulties
  Variable & uncontrolled image conditions
  Complex and hard-to-describe objects in image
  Objects occluding other objects

Applications
  Digital libraries, Space science, Web searching, Geographic information systems, Biomedicine, Surveillance and sensor systems, Commerce, Education

Page 12

Overview

Given a set of labeled images, can a computer program learn such knowledge or semantic concepts from implicit information about the objects contained in the images?

Page 13

Related Work

Multiple-Instance Learning
  Diverse Density Function (1998)
  MI-SVM (2003)

Image Categorization
  Color Histograms (1998-2001)
  Subimage-based Methods (1994-2004)

Page 14

Motivation

Correct categorization of an image depends on identifying multiple aspects of the image

Extension of MIL → A bag must contain a number of instances satisfying various properties

Page 15

A New Formulation of Multiple-Instance Learning

Maximum margin problem in a new feature space defined by the DD function

DD-SVM
  In the instance feature space, a collection of feature vectors, each of which is called an instance prototype, is determined according to DD

Page 16

A New Formulation of Multiple-Instance Learning

Instance prototype:
  • A class of instances (or regions) that is more likely to appear in bags (or images) with the specific label than in the other bags

Maps every bag to a point in bag feature space

Standard SVMs are then trained in the bag feature space

Page 17

Outline

Image segmentation & feature representation

DD-SVM, an extension of MIL

Experiments & results

Conclusions & future work

Page 18

Image Segmentation

Partitions the image into non-overlapping blocks of size 4x4 pixels

Each feature vector consists of six features
  Average color components in a block
    • LUV color space
  Square root of the second order moment of wavelet coefficients in high-frequency bands

Page 19

Image Segmentation

Daubechies-4 wavelet transform

Moments of wavelet coefficients in various frequency bands are effective for representing texture (Unser, 1995)

[Figure: wavelet subbands LL, HL, LH, HH; 2x2 coefficients at position (k, l)]

$$f = \left( \frac{1}{4} \sum_{i=0}^{1} \sum_{j=0}^{1} c_{k+i,\,l+j}^2 \right)^{1/2}$$
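
The texture feature is the root mean square of a 2x2 block of wavelet coefficients. The sketch below evaluates it for a block taken from a coefficient array; it does not perform the wavelet transform itself, and the function name `texture_feature` and the sample `band` values are assumptions for illustration.

```python
import numpy as np

def texture_feature(coeffs, k, l):
    """f = sqrt( (1/4) * sum_{i=0..1} sum_{j=0..1} c[k+i, l+j]**2 )."""
    block = np.asarray(coeffs, dtype=float)[k:k + 2, l:l + 2]
    if block.shape != (2, 2):
        raise ValueError("the 2x2 block at (k, l) must lie inside the band")
    return float(np.sqrt(np.mean(block ** 2)))

# Hypothetical coefficients of one high-frequency band.
band = np.array([[3.0, 4.0],
                 [0.0, 0.0]])
f = texture_feature(band, 0, 0)
print(f)
```

Computing one such value per high-frequency band gives three texture features, which together with three average color components would account for the six features per block.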

Page 20

Image Segmentation

k-means algorithm: cluster the feature vectors into several classes with every class corresponding to one “region”

Adaptively select the number of clusters N by gradually increasing N until a stopping criterion is met (Wang et al., 2001)
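
The adaptive choice of N can be sketched as a loop around plain k-means. The slides do not state the stopping criterion, so the one used below (mean squared distortion under a threshold, or a cap on N) is an assumption, as are the function names, the default parameters, and the deterministic farthest-point initialization.

```python
import numpy as np

def kmeans(X, n_clusters, n_iter=50):
    """Plain Lloyd's k-means with deterministic farthest-point initialization."""
    centers = [X[0]]
    for _ in range(1, n_clusters):
        d2 = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[int(np.argmax(d2))])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    distortion = float(d2[np.arange(len(X)), labels].mean())
    return labels, centers, distortion

def adaptive_kmeans(X, max_clusters=16, threshold=0.05):
    """Increase N until the mean squared distortion is <= threshold (assumed criterion)."""
    if max_clusters < 1:
        raise ValueError("max_clusters must be at least 1")
    for n in range(1, max_clusters + 1):
        labels, centers, distortion = kmeans(X, n)
        if distortion <= threshold:
            break
    return n, labels, centers

# Toy data (assumed): three tight groups of four 2-D points each.
offsets = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1]])
X = np.vstack([offsets + np.array(c) for c in ([0.0, 0.0], [5.0, 5.0], [10.0, 0.0])])
n, labels, centers = adaptive_kmeans(X)
print(n)
```

On real images each row of `X` would be the six-dimensional feature vector of one 4x4 block, and each resulting cluster would correspond to one region.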

Page 21

Segmentation Results

Page 22

Image Representation

$\hat{f}_j$: the mean of the set of feature vectors corresponding to each region $R_j$

Shape properties of each region
  Normalized inertia of order 1, 2, 3 (Gersho, 1979)

$$I(R_j, \gamma) = \frac{\sum_{r \in R_j} \|r - \hat{r}\|^{\gamma}}{V_j^{1 + \gamma/2}}$$

where $\hat{r}$ is the centroid of $R_j$ and $V_j$ is its volume (number of pixels)
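
A minimal sketch of the normalized inertia, assuming a region is given as a list of pixel coordinates, that the reference point is the region centroid, and that the volume is the pixel count. The function name and the toy 2x2 region are illustrative.

```python
import numpy as np

def normalized_inertia(pixels, gamma):
    """I(R, gamma) = sum_{r in R} ||r - r_hat||**gamma / V**(1 + gamma/2)."""
    pts = np.asarray(pixels, dtype=float)
    r_hat = pts.mean(axis=0)        # centroid of the region
    volume = len(pts)               # V: number of pixels in the region
    dist = np.linalg.norm(pts - r_hat, axis=1)
    return float(np.sum(dist ** gamma) / volume ** (1.0 + gamma / 2.0))

# Toy region (assumed): a 2x2 square of pixels.
square = [(0, 0), (0, 1), (1, 0), (1, 1)]
inertia_2 = normalized_inertia(square, 2)
print(inertia_2)
```

Dividing by $V^{1 + \gamma/2}$ is what makes the measure comparable across regions of different sizes.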

Page 23

Image Representation

Shape feature of region $R_j$ as

$$s_j = \left[ \frac{I(R_j, 1)}{I_1}, \frac{I(R_j, 2)}{I_2}, \frac{I(R_j, 3)}{I_3} \right]^T$$

($I_1$, $I_2$, $I_3$ are normalizing constants)

An image $B_i$

Segmentation: $\{R_j : j = 1, \dots, N_i\}$

Feature vectors: $\{x_{ij} : j = 1, \dots, N_i\}$

$$x_{ij} = \left[ \hat{f}_j^T, s_j^T \right]^T$$

$x_{ij}$: 9-dimensional feature vector

Page 24

An extension of Multiple-Instance Learning

Maximum margin formulation of MIL in a bag feature space

Constructing a bag feature space
  Diverse density
  Learning instance prototypes
  Computing bag features

Page 25

Maximum Margin Formulation of MIL in a Bag Feature Space

Basic idea of the new MIL framework:
  Map every bag to a point in a new feature space, named the bag feature space
  Train SVMs in the bag feature space

$$\alpha^* = \arg\max_{\alpha} \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{l} y_i y_j \alpha_i \alpha_j K(\phi(B_i), \phi(B_j))$$
$$\text{subject to} \quad \sum_{i=1}^{l} y_i \alpha_i = 0, \quad C \ge \alpha_i \ge 0, \quad i = 1, \dots, l$$

Page 26

Constructing a Bag Feature Space

Clues for classifier design:
  What is common in positive bags and does not appear in the negative bags
  Instance prototypes computed from the DD function

A bag feature space is then constructed using the instance prototypes

Page 27

Diverse Density (Maron and Lozano-Perez, 1998)

A function defined over the instance space

DD value at a point in the feature space
  The probability that the point agrees with the underlying distribution of positive and negative bags

Page 28

Diverse Density

It measures a co-occurrence of instances from different (diverse) positive bags

$$DD_D(x, w) = \prod_{i=1}^{l} \left[ \frac{1 + y_i}{2} - y_i \prod_{j=1}^{N_i} \left( 1 - e^{-\|x_{ij} - x\|_w^2} \right) \right]$$

$$\|x\|_w = \left[ x^T \mathrm{Diag}(w)^2 x \right]^{1/2}$$
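
The DD function can be evaluated directly from its definition. The sketch below is one straightforward reading of the formula, with bags given as arrays of instances and labels in {+1, -1}; the function names and the toy bags are assumptions for illustration.

```python
import numpy as np

def weighted_sq_dist(a, x, w):
    """||a - x||_w^2 = (a - x)^T Diag(w)^2 (a - x)."""
    d = (np.asarray(a, dtype=float) - x) * w
    return float(d @ d)

def diverse_density(bags, labels, x, w):
    """DD(x, w) = prod_i [ (1 + y_i)/2 - y_i * prod_j (1 - exp(-||x_ij - x||_w^2)) ]."""
    dd = 1.0
    for bag, y in zip(bags, labels):
        miss = 1.0                      # product over the instances of one bag
        for inst in bag:
            miss *= 1.0 - np.exp(-weighted_sq_dist(inst, x, w))
        dd *= (1.0 + y) / 2.0 - y * miss
    return float(dd)

# Toy bags (assumed): one positive bag containing the query point,
# one negative bag with an instance at squared distance ln 2 from it.
w = np.array([1.0, 1.0])
bags = [np.array([[0.0, 0.0], [3.0, 3.0]]),
        np.array([[np.sqrt(np.log(2.0)), 0.0]])]
labels = [1, -1]
dd_near = diverse_density(bags, labels, np.array([0.0, 0.0]), w)
dd_far = diverse_density(bags, labels, np.array([10.0, 10.0]), w)
print(dd_near, dd_far)
```

A positive bag contributes a factor near 1 when at least one of its instances is close to $x$, while a negative bag contributes a factor near 1 only when all of its instances are far from $x$.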

Page 29

Learning Instance Prototype

An instance prototype represents a class of instances that is more likely to appear in positive bags than in negative bags

Learning instance prototypes then becomes an optimization problem
  Finding local maximizers of the DD function in a high-dimensional space

Page 30

Learning Instance Prototype

How do we find the local maximizers?
  Start an optimization at every instance in every positive bag

Constraints:
  Need to be distinct from each other
  Have large DD values

Page 31

Computing Bag Features

Let $\{(x_k^*, w_k^*) : k = 1, \dots, n\}$ be the collection of instance prototypes

Bag features $\phi(B_i)$, $B_i = \{x_{ij} : j = 1, \dots, N_i\}$:

$$\phi(B_i) = \begin{bmatrix} \min_{j=1,\dots,N_i} \|x_{ij} - x_1^*\|_{w_1^*} \\ \min_{j=1,\dots,N_i} \|x_{ij} - x_2^*\|_{w_2^*} \\ \vdots \\ \min_{j=1,\dots,N_i} \|x_{ij} - x_n^*\|_{w_n^*} \end{bmatrix}$$
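
The bag feature map can be sketched in a few lines: each coordinate is the smallest weighted distance from any instance in the bag to one instance prototype. The function name `bag_features` and the toy bag and prototypes are assumptions; real prototypes would come from maximizing the DD function.

```python
import numpy as np

def bag_features(bag, prototypes):
    """phi(B_i): for each prototype (x_k*, w_k*), the smallest weighted distance
    min_j ||x_ij - x_k*||_{w_k*} over the instances x_ij of the bag."""
    bag = np.asarray(bag, dtype=float)
    feats = []
    for x_star, w_star in prototypes:
        diffs = (bag - x_star) * w_star     # Diag(w) applied to each difference
        feats.append(np.sqrt((diffs ** 2).sum(axis=1)).min())
    return np.array(feats)

# Toy bag with two instances and two hypothetical prototypes.
bag = np.array([[0.0, 0.0], [3.0, 4.0]])
prototypes = [
    (np.array([0.0, 0.0]), np.array([1.0, 1.0])),
    (np.array([3.0, 0.0]), np.array([1.0, 1.0])),
]
phi = bag_features(bag, prototypes)
print(phi)
```

Every bag maps to a vector of length n regardless of how many instances it contains, which is what allows a standard SVM to be trained on the result.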

Page 32

Experimental Setup for Image Categorization

COREL Corp: 2,000 images
  20 image categories
  JPEG format, size 384*256 (256*384)
  Each category is randomly divided into a training set and a test set (50/50)
  SVMLight [Joachims, 1999] software is used to train the SVMs
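
The per-category 50/50 split can be sketched as below. The function name, the fixed seed, and the placeholder file names are assumptions; the example also assumes the 2,000 images are spread evenly over the 20 categories, which the slide does not state.

```python
import random

def split_half(items_by_category, seed=0):
    """Randomly split each category's items into equal training and test halves."""
    rng = random.Random(seed)
    train, test = {}, {}
    for category, items in items_by_category.items():
        shuffled = list(items)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        train[category] = shuffled[:half]
        test[category] = shuffled[half:]
    return train, test

# Placeholder data set: 20 categories with 100 hypothetical image names each.
images = {c: [f"cat{c}_img{i}.jpg" for i in range(100)] for c in range(20)}
train, test = split_half(images)
print(len(train[0]), len(test[0]))
```

Repeating the split with different seeds would give the several random test sets over which results can be averaged.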

Page 33

Sample Images (COREL)

Page 34

Image Categorization Performance

5 random test sets, 95% confidence intervals

The images belong to Cat.0 ~ Cat.9

[Table/figure annotations: 14.8%, 6.8%; compared methods: Chapelle et al., 1999; Andrews et al., 2003]

Page 35

Image Categorization Experiments

Page 36

Sensitivity to Image Segmentation

k-means clustering algorithm with 5 different stopping criteria

1,000 images for Cat.0 ~ Cat.9

Page 37

Robustness to Image Segmentation

[Figure annotations: 6.8%, 9.5%, 11.7%, 13.8%, 27.4%]

Page 38

Robustness to the Number of Categories in a Data Set

[Figure annotations: 81.5%, 67.5%, 6.8%, 12.9]

Page 39

Difference in Average Classification Accuracies

Page 40

Sensitivity to the Size of Training Images

Page 41

Sensitivity to the Diversity of Training Images Varies

Page 42

MUSK Data Sets

Page 43

Speed

40 minutes
  Training set of 500 images (4.31 regions per image)
  Pentium III 700MHz PC running the Linux operating system
  Algorithm is implemented in Matlab, C programming language
  The majority of the time is spent on learning instance prototypes

Page 44

Conclusions

A region-based image categorization method using an extension of MIL → DD-SVM
  Image → collection of regions → k-means alg.
  Image → a point in a bag feature space (defined by a set of instance prototypes learned with the DD func.)
  SVM-based image classifiers are trained in the bag feature space

DD-SVM outperforms two other methods

DD-SVM generates highly competitive results on MUSK data set

Page 45

Future Work

Limitations
  Region naming (Barnard et al., 2003)
  Texture dependence

Improvement
  Image segmentation algorithm
  DD function

Scene category can be a vector

Semantically-adaptive searching

Art & biomedical images