
Ti5216100 MACHINE VISION

SUPPORT VECTOR MACHINES

• Maxim Mikhnevich
• Pavel Stepanov
• Pankaj Sharma
• Ivan Ryzhov
• Sergey Vlasov

2006-2007

Content

1. Where does the Support Vector Machine come from?
2. Relationship between Machine Vision and Pattern Recognition (the place of SVM in the whole system)
3. Application areas of Support Vector Machines
4. Classification problem
5. Linear classifiers
6. The Non-Separable Case
7. Kernel trick
8. Advantages and disadvantages

max
pasha

Out of the presentation

• Lagrange Theorem
• Kuhn-Tucker Theorem
• Quadratic Programming
• We won't go too deep into the math

max
made by maxim; pavel will talk about it

History

The Support Vector Machine (SVM) is a new and very promising classification technique developed by Vapnik and his group at AT&T Bell Labs

max
pasha: I think you should mention that Vapnik was born in the USSR and worked for about 30 years in Moscow.

Relationship between Machine Vision and Pattern Recognition

Our task during this presentation is to show that SVM is one of the best classifiers.

max
pasha

Application Areas (just several examples)

max
pasha

Application Areas (cont.)

max
pasha

Application Areas (cont.)

max
pasha

Application Areas (cont.)

Geometrical interpretation of how the SVM separates the face and non-face classes. The patterns are real support vectors obtained after training the system. Notice the small number of total support vectors and the fact that a higher proportion of them correspond to non-faces.

max
pasha

Basic Definitions from technical viewpoint

Feature

Feature space

Hyperplane

Margin

max
maxim or pasha will talk about it; better if pasha does it

Problem

• Binary classification

• Learning collection:

- vectors x1, …, xn – our documents (objects)

- y1, …, yn ∈ {-1, 1}

Our goal is to find the optimal hyperplane!

max
maxim

Linear classifiers

w·xi > b => yi=1

w·xi < b => yi = -1

Maximum margin linear classifier

w·xi - b >= 1 => yi = 1

w·xi - b <= -1 => yi = -1

max
maxim
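To make the decision rule concrete, here is a minimal Python sketch (not from the original slides); the weight vector w, the threshold b and the sample points are invented values for illustration.

import numpy as np

# Hypothetical weight vector and threshold (illustrative values only)
w = np.array([2.0, 1.0])
b = 1.0

# A few 2-D points to classify
X = np.array([[2.0, 2.0],    # w.x = 6.0 > b  -> y = +1
              [0.0, 0.0],    # w.x = 0.0 < b  -> y = -1
              [1.0, -0.5]])  # w.x = 1.5 > b  -> y = +1

# The linear decision rule from the slide: w.x > b => y = +1, w.x < b => y = -1
y = np.where(X @ w > b, 1, -1)
print(y)   # [ 1 -1  1]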

Linear classifiers (cont.)

(a) - a separating hyperplane with a small margin

(b) - a separating hyperplane with a larger margin

A better generalization capability is expected from (b)!

max
maxim

Margin width

Let's take any two points, x+ from H1 and x- from H2:

M(x_+, x_-, w, b) = (x_+ - x_-) \cdot \frac{w}{\|w\|} = \frac{(1 + b) - (-1 + b)}{\|w\|} = \frac{2}{\|w\|}

Formalization

Our aim is to find the widest margin!!!

Optimization criterion: minimize \|w\| = \sqrt{w \cdot w}

Constraints: y_i (w \cdot x_i - b) \ge 1 for every i

Number of constraints = number of pairs (x_i, y_i)

max
maxim
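As a rough numerical companion to this formulation (an assumption about tooling, not part of the slides): scikit-learn's SVC with a linear kernel and a very large C approximates the hard-margin problem, so w, b and the margin width 2/||w|| can be read off after training on made-up separable points.

import numpy as np
from sklearn.svm import SVC

# Made-up, linearly separable 2-D points
X = np.array([[1.0, 1.0], [2.0, 2.5], [0.5, 2.0],         # class +1
              [-1.0, -1.0], [-2.0, -0.5], [-1.5, -2.0]])  # class -1
y = np.array([1, 1, 1, -1, -1, -1])

# A very large C approximates "minimize ||w|| subject to y_i (w.x_i - b) >= 1"
clf = SVC(kernel='linear', C=1e6).fit(X, y)

w = clf.coef_[0]             # the weight vector w
b = -clf.intercept_[0]       # scikit-learn uses w.x + intercept, so b = -intercept
print(w, b, 2.0 / np.linalg.norm(w))   # margin width = 2 / ||w||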

Noise and Penalties

Optimization criterion: minimize \|w\| + C \sum_i \xi_i

Constraints: y_i (w \cdot x_i - b) \ge 1 - \xi_i and \xi_i \ge 0

Number of constraints = 2 × number of pairs (x_i, y_i)
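The C in this criterion is the same penalty parameter most SVM libraries expose. A small sketch (invented, slightly overlapping data; parameter values are arbitrary) showing that a smaller C tolerates more slack ξ_i and typically leaves a wider margin:

import numpy as np
from sklearn.svm import SVC

# Invented data with one noisy point on the "wrong" side
X = np.array([[1.0, 1.0], [2.0, 2.0], [1.5, 0.5],
              [-1.0, -1.0], [-2.0, -1.5], [-0.5, -2.0], [1.2, 1.2]])
y = np.array([1, 1, 1, -1, -1, -1, -1])   # the last point is mislabelled noise

for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel='linear', C=C).fit(X, y)
    # Small C: slack is cheap, ||w|| stays small, margin 2/||w|| stays wide.
    # Large C: slack is expensive, the margin shrinks to accommodate the noise.
    print(C, 2.0 / np.linalg.norm(clf.coef_[0]))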

First great idea

This idea tells us how to find a linear classifier: the wider the margin and the smaller the sum of errors, the better.

We have now reduced the problem of finding a linear classifier to a Quadratic Programming problem.

max
maxim

How to solve our problem

1. Construct Lagrangian

2. Use Kuhn-Tucker Theorem

L(w, b, \xi, \alpha) = w \cdot w + C \sum_i \xi_i - \sum_i \alpha_i \left( y_i (w \cdot x_i - b) - 1 + \xi_i \right) \;\to\; \min_{w, b, \xi} \; \max_{\alpha}

with \alpha_i \ge 0, and for every i either \alpha_i = 0 or y_i (w \cdot x_i - b) = 1 - \xi_i

max
maybe reformulate this slide

How to solve our problem

Our solution is:

Maximize \; \sum_i \alpha_i - \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j)

subject to \; 0 \le \alpha_i \le C \; and \; \sum_i \alpha_i y_i = 0

which gives w = \sum_i \alpha_i y_i x_i and the decision rule \operatorname{sign}\left( \sum_i \alpha_i y_i (x_i \cdot x) - b \right)
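If the dual is solved with a library such as scikit-learn (an illustrative choice, not something the slides prescribe), the products α_i y_i are stored in dual_coef_, so w = Σ α_i y_i x_i can be rebuilt from the support vectors:

import numpy as np
from sklearn.svm import SVC

# Small made-up training set
X = np.array([[1.0, 1.0], [2.0, 2.5], [-1.0, -1.0], [-2.0, -0.5]])
y = np.array([1, 1, -1, -1])

clf = SVC(kernel='linear', C=1.0).fit(X, y)

# dual_coef_[0] holds alpha_i * y_i for each support vector,
# so w = sum_i alpha_i y_i x_i is a dot product with the support vectors
w_from_dual = clf.dual_coef_[0] @ clf.support_vectors_
print(np.allclose(w_from_dual, clf.coef_[0]))   # True: both give the same w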

Second great idea

• Choose the mapping to an extended space: x \to \varphi(x)
• After that we can find a new function, called the kernel: K(x, y) = \varphi(x) \cdot \varphi(y)
• Find the linear margin w, b in the extended space
• Now we have our hyperplane in the initial space: w \cdot \varphi(x) - b

max
maxim
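A minimal numerical check of this idea (the explicit mapping φ chosen here is one common degree-2 example, assumed for illustration): the kernel K(x, y) = (x·y + 1)² returns the inner product φ(x)·φ(y) without ever building φ explicitly.

import numpy as np

def phi(x):
    # One explicit degree-2 polynomial feature map for a 2-D vector
    x1, x2 = x
    return np.array([x1**2, np.sqrt(2)*x1*x2, x2**2,
                     np.sqrt(2)*x1, np.sqrt(2)*x2, 1.0])

def K(a, b):
    # Polynomial kernel of degree 2
    return (np.dot(a, b) + 1.0) ** 2

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

# The kernel gives the inner product in the extended space directly
print(np.dot(phi(x), phi(y)), K(x, y))   # both print 4.0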

Second great idea - Extend our space: solution of the XOR problem with the help of Support Vector Machines (by increasing the dimension of our space)

Or a different example:

max
pasha and maxm
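A small sketch of the XOR example with scikit-learn (data layout and kernel choice are assumptions): the four XOR points cannot be split by any straight line in 2-D, but a degree-2 polynomial kernel separates them in the implicitly extended space.

import numpy as np
from sklearn.svm import SVC

# The classic XOR points: not linearly separable in the plane
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([-1, 1, 1, -1])

# A degree-2 polynomial kernel maps them to a space where a hyperplane separates them
clf = SVC(kernel='poly', degree=2, coef0=1, C=10.0).fit(X, y)
print(clf.predict(X))   # [-1  1  1 -1]: all four points classified correctly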

Examples (Extend our space)

max
I like these pictures more than the ones on the previous slide; they are more illustrative. I suggest using this one instead of the previous one.

SVM Kernel Functions

K(a, b) = (a \cdot b + 1)^d is an example of an SVM kernel function.

Beyond polynomials there are other very high-dimensional basis functions that can be made practical by finding the right kernel function:

– Radial-basis-style kernel function: K(a, b) = \exp\left( -\frac{\|a - b\|^2}{2\sigma^2} \right)

– Neural-net-style kernel function: K(a, b) = \tanh(\kappa \, a \cdot b - \delta)

\sigma, \kappa and \delta are magic parameters that must be chosen by a model selection method such as CV or VCSRM.
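For reference (an assumption about tooling, not part of the slides): in scikit-learn these families correspond to the 'poly', 'rbf' and 'sigmoid' kernels of SVC, where degree and coef0 play the role of d, gamma corresponds to 1/(2σ²) for the RBF kernel, and gamma and coef0 stand in for κ and -δ; the magic parameters are usually picked by cross-validation, e.g. with GridSearchCV. The data and parameter grid below are invented for illustration.

import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

# Made-up two-class data just to exercise the kernels
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = np.where(X[:, 0] * X[:, 1] > 0, 1, -1)   # an XOR-like labelling

kernels = {
    'poly':    SVC(kernel='poly', degree=2, coef0=1),         # (gamma a.b + coef0)^degree
    'rbf':     SVC(kernel='rbf', gamma=0.5),                  # exp(-gamma ||a - b||^2)
    'sigmoid': SVC(kernel='sigmoid', gamma=0.1, coef0=-1.0),  # tanh(gamma a.b + coef0)
}
for name, clf in kernels.items():
    print(name, clf.fit(X, y).score(X, y))   # training accuracy of each kernel

# The magic parameters are usually chosen by cross-validation
grid = GridSearchCV(SVC(kernel='rbf'), {'C': [0.1, 1, 10], 'gamma': [0.1, 1, 10]}, cv=5)
grid.fit(X, y)
print(grid.best_params_)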

Advantages and Disadvantages

References

1. V. Vapnik, The Nature of Statistical Learning Theory, Springer-Verlag, New York, 1995.

2. http://www.support-vector-machines.org/

3. http://en.wikipedia.org/wiki/Support_vector_machine

4. Richard O. Duda, Peter E. Hart, Pattern Classification (2nd ed.)

5. Andrew W. Moore, Support Vector Machines tutorial, http://www.autonlab.org/tutorials/svm.html

6. B. Schölkopf, C.J.C. Burges, and A.J. Smola, Advances in Kernel Methods: Support Vector Learning, MIT Press, Cambridge, Mass., 1998.