Ti5216100 MACHINE VISION
SUPPORT VECTOR MACHINES
• Maxim Mikhnevich • Pavel Stepanov • Pankaj Sharma • Ivan Ryzhov • Sergey Vlasov
2006-2007
Content
1. Where does the Support Vector Machine come from?
2. Relationship between Machine Vision and Pattern Recognition (the place of SVM in the whole system)
3. Application areas of Support Vector Machines
4. Classification problem
5. Linear classifiers
6. The non-separable case
7. Kernel trick
8. Advantages and disadvantages
Out of the scope of this presentation
• Lagrange theorem
• Kuhn-Tucker theorem
• Quadratic programming
• We do not go too deep into the math
History
The Support Vector Machine (SVM) is a new and very promising classification technique developed by Vapnik and his group at AT&T Bell Labs.
Relationship between Machine Vision and Pattern Recognition
Our task during this presentation is to show that SVM is one of the best classifiers.
Application Areas (cont.)
Figure: geometrical interpretation of how the SVM separates the face and non-face classes. The patterns shown are real support vectors obtained after training the system. Notice the small number of total support vectors and the fact that a higher proportion of them correspond to non-faces.
Basic Definitions from a Technical Viewpoint
• Feature: a measurable property of an object, expressed as a number
• Feature space: the vector space whose coordinates are the features
• Hyperplane: a flat subspace of dimension n-1 that splits the feature space into two half-spaces
• Margin: the distance between the separating hyperplane and the closest training points
Problem
• Binary classification
• Learning collection:
  - vectors x1, …, xn – our documents (objects)
  - labels y1, …, yn ∈ {-1, 1}
Our goal is to find the optimal hyperplane!
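A minimal sketch, in Python with NumPy, of how such a learning collection might be stored; the variable names and the toy points are illustrative, not taken from the slides.

```python
import numpy as np

# Toy learning collection: n = 4 objects with 2 features each.
# Rows of X are the vectors x1, ..., xn; y holds the labels in {-1, +1}.
X = np.array([[ 2.0, 2.0],
              [ 1.5, 2.5],
              [ 0.0, 0.5],
              [-1.0, 0.0]])
y = np.array([1, 1, -1, -1])

# The goal: find a hyperplane (w, b) separating the +1 class from the -1 class.
```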
Linear classifiers
w·xi > b  =>  yi = 1
w·xi < b  =>  yi = -1
Maximum margin linear classifier
w·xi - b >= 1 => yi = 1
w·xi - b <= -1 => yi = -1
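A hedged sketch of this decision rule in Python; the values of w and b below are made up for illustration, not a trained classifier.

```python
import numpy as np

def classify(x, w, b):
    """Linear classifier: +1 if w·x > b, otherwise -1."""
    return 1 if np.dot(w, x) > b else -1

# Illustrative parameters; in the SVM they come out of the training procedure.
w = np.array([1.0, 1.0])
b = 3.0

print(classify(np.array([2.0, 2.0]), w, b))  # w·x = 4.0 > 3  -> +1
print(classify(np.array([0.0, 0.5]), w, b))  # w·x = 0.5 < 3  -> -1
```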
Linear classifiers (cont.)
(a) - a separating hyperplane with a small margin
(b) - a separating hyperplane with a larger margin
A better generalization capability is expected from (b)!
Margin width
Let's take any two points, x+ from H1 and x- from H2, where H1: w·x - b = 1 and H2: w·x - b = -1.
$$M(x_+, x_-, w) = \left\langle x_+ - x_-,\; \frac{w}{\|w\|} \right\rangle = \frac{w \cdot x_+ - w \cdot x_-}{\|w\|} = \frac{(1 + b) - (-1 + b)}{\|w\|} = \frac{2}{\|w\|}$$
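As a quick worked check of the formula (a made-up numeric example, not from the slides), take $w = (3, 4)$:

$$\|w\| = \sqrt{3^2 + 4^2} = 5, \qquad M = \frac{2}{\|w\|} = \frac{2}{5} = 0.4$$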
Formalization
Our aim is to find the widest margin!

Optimization criterion: minimize $\|w\|$ (equivalently, $w \cdot w$)

Constraints: $y_i (w \cdot x_i - b) \ge 1$

Number of constraints = number of pairs (xi, yi)
Noise and Penalties
Optimization criterion: minimize $\|w\| + C \sum_i \xi_i$

Constraints: $y_i (w \cdot x_i - b) \ge 1 - \xi_i$, where $\xi_i \ge 0$

Number of constraints = 2 * number of pairs (xi, yi)
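A small sketch of how the slack variables can be read off for a fixed candidate hyperplane: at the optimum $\xi_i = \max(0,\, 1 - y_i (w \cdot x_i - b))$, so the penalty term $C \sum_i \xi_i$ grows with the margin violations. The hyperplane and points below are illustrative assumptions.

```python
import numpy as np

def slacks(X, y, w, b):
    """Slack xi_i = max(0, 1 - y_i * (w·x_i - b)) for each training pair."""
    margins = y * (X @ w - b)
    return np.maximum(0.0, 1.0 - margins)

X = np.array([[2.0, 2.0], [1.5, 2.5], [0.0, 0.5], [1.8, 1.0]])
y = np.array([1, 1, -1, -1])        # the last point sits inside the margin
w, b = np.array([1.0, 1.0]), 3.0    # illustrative hyperplane, not an SVM solution

xi = slacks(X, y, w, b)
C = 10.0
print(xi, "penalty =", C * xi.sum())   # only the last point contributes a slack
```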
First great idea
This idea tells us how to find a linear classifier: the wider the margin and the smaller the sum of errors, the better.
We have now reduced the problem of finding a linear classifier to a Quadratic Programming problem.
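In practice this QP is rarely solved by hand. A minimal sketch with scikit-learn (an assumed dependency; note it optimizes the standard scaled criterion ½·w·w + C·Σξi rather than ||w|| + C·Σξi):

```python
import numpy as np
from sklearn.svm import SVC

X = np.array([[2.0, 2.0], [1.5, 2.5], [0.0, 0.5], [-1.0, 0.0]])
y = np.array([1, 1, -1, -1])

# Linear soft-margin SVM; C weights the slack penalties.
clf = SVC(kernel="linear", C=10.0)
clf.fit(X, y)

# scikit-learn's decision function is w·x + intercept_, so in the slides'
# notation (w·x - b) the bias is b = -intercept_.
print("w =", clf.coef_[0], "b =", -clf.intercept_[0])
print("support vectors:", clf.support_vectors_)
```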
How to solve our problem
1. Construct the Lagrangian
2. Use the Kuhn-Tucker theorem
$$\min_{w,\,b,\,\xi}\;\max_{\alpha_i \ge 0}\;\Big( w \cdot w + C \sum_i \xi_i - \sum_i \alpha_i \big( y_i (w \cdot x_i - b) - 1 + \xi_i \big) \Big)$$

Kuhn-Tucker conditions: for each $i$, either $\alpha_i = 0$ or $y_i (w \cdot x_i - b) = 1 - \xi_i$.
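Carrying the Lagrangian one step further gives the usual dual QP in the multipliers: maximize $\sum_i \alpha_i - \tfrac{1}{2}\sum_{i,j} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j)$ subject to $0 \le \alpha_i \le C$ and $\sum_i \alpha_i y_i = 0$. The sketch below hands that QP to a generic solver; the use of cvxopt and the toy data are assumptions, not part of the slides.

```python
import numpy as np
from cvxopt import matrix, solvers

X = np.array([[2.0, 2.0], [1.5, 2.5], [0.0, 0.5], [-1.0, 0.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
n, C = len(y), 10.0

# Dual QP in standard form: minimize 1/2 a'Qa - 1'a
# subject to 0 <= a_i <= C and sum_i a_i y_i = 0, with Q_ij = y_i y_j (x_i·x_j).
# A tiny ridge keeps Q numerically positive semidefinite for the solver.
Q = np.outer(y, y) * (X @ X.T) + 1e-8 * np.eye(n)
P, q = matrix(Q), matrix(-np.ones(n))
G = matrix(np.vstack([-np.eye(n), np.eye(n)]))
h = matrix(np.hstack([np.zeros(n), C * np.ones(n)]))
A, b_eq = matrix(y.reshape(1, -1)), matrix(0.0)

solvers.options["show_progress"] = False
alpha = np.array(solvers.qp(P, q, G, h, A, b_eq)["x"]).ravel()

# Recover w; points with alpha_i > 0 are the support vectors, and b can be
# read off any support vector with 0 < alpha_i < C.
w = (alpha * y) @ X
print("alpha =", np.round(alpha, 3), "w =", w)
```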
Second great idea
• Choose a mapping $\Phi$ into an extended space: $x \mapsto \Phi(x)$
• After that we can define a new function, which is called the kernel: $K(x, y) = \Phi(x) \cdot \Phi(y)$
• Find the linear margin (w, b) in the extended space
• Now we have our hyperplane in the initial space: $w \cdot \Phi(x) - b = 0$
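A small sketch verifying the kernel identity on one concrete mapping (the degree-2 polynomial map below is an illustrative choice, not the slides' example): for $K(x, y) = (x \cdot y + 1)^2$ in two dimensions, the explicit map $\Phi$ and the kernel give the same inner product.

```python
import numpy as np

def phi(x):
    """Explicit feature map whose inner products equal (x·y + 1)^2 in 2-D."""
    x1, x2 = x
    return np.array([1.0,
                     np.sqrt(2) * x1, np.sqrt(2) * x2,
                     x1 ** 2, x2 ** 2,
                     np.sqrt(2) * x1 * x2])

def K(x, y):
    """The same quantity computed directly in the initial space."""
    return (np.dot(x, y) + 1.0) ** 2

x, y = np.array([1.0, 2.0]), np.array([3.0, -1.0])
print(np.dot(phi(x), phi(y)))   # 4.0, computed via the extended space
print(K(x, y))                  # 4.0, without ever forming phi explicitly
```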
Second great idea - extend our space
Solution of the XOR problem with the help of Support Vector Machines (by increasing the dimension of our space)
Or a different example:
Examples (Extend our space)
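A sketch of the XOR idea with one assumed extension of the space: the four XOR points are not linearly separable in the plane, but adding the single extra coordinate x1·x2 makes them separable by a hyperplane (a hand-rolled illustration, not the slides' original figure).

```python
import numpy as np

# XOR with inputs in {-1, +1}; these labels are not linearly separable in 2-D.
X = np.array([[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]])
y = np.array([-1, 1, 1, -1])

# Extend the space: (x1, x2) -> (x1, x2, x1*x2).
X_ext = np.hstack([X, (X[:, 0] * X[:, 1]).reshape(-1, 1)])

# In the extended space the hyperplane with w = (0, 0, -1), b = 0 separates
# the classes: the third coordinate alone carries the XOR structure.
w, b = np.array([0.0, 0.0, -1.0]), 0.0
print(np.sign(X_ext @ w - b))   # [-1.  1.  1. -1.], matching y
```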
SVM Kernel Functions
$K(a, b) = (a \cdot b + 1)^d$ is an example of an SVM Kernel Function.
Beyond polynomials there are other very high dimensional basis functions that can be made practical by finding the right Kernel Function.
Radial-Basis-style Kernel Function:
$$K(a, b) = \exp\!\left( -\frac{\|a - b\|^{2}}{2\sigma^{2}} \right)$$
Neural-net-style Kernel Function:
$$K(a, b) = \tanh(\kappa\, a \cdot b - \delta)$$
σ, κ, and δ are magic parameters that must be chosen by a model selection method such as cross-validation (CV) or VC-dimension-based structural risk minimization (VCSRM).
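A minimal sketch of choosing such parameters by cross-validation with scikit-learn (the toy data, the parameter grid, and the use of GridSearchCV are assumptions for illustration; scikit-learn's gamma plays the role of 1/(2σ²) in the radial-basis kernel above):

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = np.where(X[:, 0] * X[:, 1] > 0, 1, -1)   # XOR-like toy labels

# 5-fold cross-validation over a grid of C and gamma values.
grid = GridSearchCV(SVC(kernel="rbf"),
                    param_grid={"C": [0.1, 1, 10, 100],
                                "gamma": [0.01, 0.1, 1, 10]},
                    cv=5)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)
```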
References
1. V. Vapnik, The Nature of Statistical Learning Theory, Springer-Verlag, New York, 1995.
2. http://www.support-vector-machines.org/
3. http://en.wikipedia.org/wiki/Support_vector_machine
4. R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification (2nd ed.), Wiley, 2001.
5. A. W. Moore, Support Vector Machines tutorial, http://www.autonlab.org/tutorials/svm.html
6. B. Schölkopf, C. J. C. Burges, and A. J. Smola (eds.), Advances in Kernel Methods: Support Vector Learning, MIT Press, Cambridge, Mass., 1998.