Support Vector Machines: Hype or Hallelujah? Kristin Bennett, Math Sciences Dept, Rensselaer Polytechnic Inst., http://www.rpi.edu/~bennek (M2000, October 2-4, 2000)


  • Slide 1
  • Slide 2
  • Support Vector Machines: Hype or Hallelujah? Kristin Bennett, Math Sciences Dept, Rensselaer Polytechnic Inst., http://www.rpi.edu/~bennek
  • Slide 3
  • Outline: Support Vector Machines for Classification (Linear Discrimination, Nonlinear Discrimination); Extensions; Application in Drug Design; Hallelujah; Hype
  • Slide 4
  • Support Vector Machines (SVM). Key Ideas: Maximize Margins; Do the Dual; Construct Kernels. A methodology for inference based on Vapnik's Statistical Learning Theory.
  • Slide 5
  • Best Linear Separator?
  • Slide 6
  • Best Linear Separator?
  • Slide 7
  • Best Linear Separator?
  • Slide 8
  • Best Linear Separator?
  • Slide 9
  • Best Linear Separator?
  • Slide 10
  • Find Closest Points in Convex Hulls (points c and d)
  • Slide 11
  • Plane Bisects Closest Points (c and d)
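As a minimal sketch of the bisection step, assuming the closest points c and d have already been found: the separating plane has normal w = c - d and passes through the midpoint (c + d)/2. The function names and the toy points below are illustrative and do not come from the slides.

```python
def bisecting_plane(c, d):
    """Return (w, b) for the plane w.x = b that perpendicularly bisects segment c-d."""
    w = [ci - di for ci, di in zip(c, d)]
    mid = [(ci + di) / 2.0 for ci, di in zip(c, d)]
    b = sum(wi * mi for wi, mi in zip(w, mid))
    return w, b

def classify(x, w, b):
    """+1 on c's side of the plane, -1 on d's side."""
    score = sum(wi * xi for wi, xi in zip(w, x)) - b
    return 1 if score > 0 else -1

# Hypothetical closest points of the two convex hulls
w, b = bisecting_plane([2.0, 2.0], [0.0, 0.0])
side_near_c = classify([3.0, 3.0], w, b)
side_near_d = classify([0.0, 0.5], w, b)
```

A point is assigned to the class whose hull contains c when it falls on c's side of the plane, and to the other class otherwise.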
  • Slide 12
  • Find the closest points using a quadratic program. Many existing and new solvers.
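One common way to write this problem is sketched here as an assumption, since the slide's formula is not in the transcript. Let $A$ and $B$ index the points of the two classes and let $\alpha_i$ be convex-combination weights:

```latex
\begin{aligned}
\min_{\alpha}\quad & \tfrac{1}{2}\,\lVert c - d \rVert^2 \\
\text{s.t.}\quad & c = \sum_{i \in A} \alpha_i x_i, \qquad d = \sum_{i \in B} \alpha_i x_i, \\
& \sum_{i \in A} \alpha_i = 1, \qquad \sum_{i \in B} \alpha_i = 1, \qquad \alpha_i \ge 0 .
\end{aligned}
```

The objective is quadratic in $\alpha$ and the constraints are linear, which is why general-purpose QP solvers apply.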
  • Slide 13
  • Best Linear Separator: Supporting Plane Method. Maximize the distance between two parallel supporting planes. This distance is the margin.
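As a small illustration, the distance between two parallel planes w.x = alpha and w.x = beta is |alpha - beta| / ||w||. The symbols alpha and beta and the numbers below are assumptions made for this sketch; the slide's own formula is not in the transcript.

```python
import math

def margin(w, alpha, beta):
    """Distance between the parallel planes w.x = alpha and w.x = beta."""
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    return abs(alpha - beta) / norm_w

# Hypothetical normal vector and plane offsets
m = margin([3.0, 4.0], 1.0, -1.0)
```

For a fixed gap between the offsets, shrinking the norm of w widens the margin.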
  • Slide 14
  • Maximize the margin using a quadratic program.
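A standard textbook form of this quadratic program is given here as a sketch, because the slide's formula is not in the transcript. With labels $y_i \in \{+1, -1\}$ and supporting planes $w \cdot x = b + 1$ and $w \cdot x = b - 1$, the margin is $2 / \lVert w \rVert$, so maximizing it amounts to:

```latex
\min_{w,\,b}\ \tfrac{1}{2}\lVert w \rVert^2
\quad \text{s.t.} \quad y_i\,(w \cdot x_i - b) \ge 1, \qquad i = 1, \dots, m .
```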
  • Slide 15
  • Dual of Closest Points Method is Support Plane Method. The solution depends only on the support vectors.
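A minimal sketch of why only support vectors matter, assuming the usual dual representation w = sum_i alpha_i y_i x_i (not shown in the transcript): points whose multiplier alpha_i is zero contribute nothing to w. The data and multipliers below are made up for illustration.

```python
def weight_vector(alphas, labels, points):
    """w = sum_i alpha_i * y_i * x_i, skipping points with alpha_i == 0."""
    dim = len(points[0])
    w = [0.0] * dim
    for a, y, x in zip(alphas, labels, points):
        if a == 0.0:
            continue  # not a support vector
        for k in range(dim):
            w[k] += a * y * x[k]
    return w

points = [[2.0, 2.0], [3.0, 3.0], [0.0, 0.0], [-1.0, -1.0]]
labels = [1, 1, -1, -1]
alphas = [0.25, 0.0, 0.25, 0.0]  # hypothetical dual solution

w_all = weight_vector(alphas, labels, points)
# Same computation using only the two support vectors (indices 0 and 2)
w_sv = weight_vector(alphas[::2], labels[::2], points[::2])
```

Dropping the two points with zero multipliers leaves w unchanged, so the classifier would be the same if they had never been in the training set.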
  • Slide 16
  • Statistical Learning Theory: Misclassification error and function complexity together bound the generalization error. Maximizing margins minimizes complexity. Eliminates overfitting. Solution depends only on Support Vectors, not on the number of attributes.
  • Slide 17
  • Margins and Complexity: a skinny margin is more flexible, thus more complex.
  • Slide 18
  • Margins and Complexity: a fat margin is less complex.
  • Slide 19
  • Linearly Inseparable Case: Convex Hulls Intersect! Same argument won't work.
  • Slide 20
  • Reduced Convex Hulls Don't Intersect. Reduce by adding upper bound D.
  • Slide 21
  • Find Closest Points, Then Bisect. No change except for D. D determines the number of Support Vectors.
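In one common formulation of reduced convex hulls, sketched here as an assumption because the slide's formula is not in the transcript, the only change to the closest-points problem is an upper bound $D < 1$ on each convex-combination weight $\alpha_i$, so that no single point can dominate ($A$ and $B$ index the two classes):

```latex
c = \sum_{i \in A} \alpha_i x_i, \quad d = \sum_{i \in B} \alpha_i x_i, \quad
\sum_{i \in A} \alpha_i = \sum_{i \in B} \alpha_i = 1, \quad 0 \le \alpha_i \le D .
```

With $D < 1$, at least $\lceil 1/D \rceil$ points in each class must carry nonzero weight, which is one way to see how D controls the number of support vectors.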
  • Slide 22
  • Linearly Inseparable Case: Supporting Plane Method. Just add a non-negative error vector z.
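A minimal sketch of the error vector, assuming the common soft-margin convention z_i = max(0, 1 - y_i (w.x_i - b)). That convention, the plane (w, b), and the data are assumptions for illustration.

```python
def slack(w, b, points, labels):
    """z_i = max(0, 1 - y_i * (w.x_i - b)); zero for points on or beyond their supporting plane."""
    z = []
    for x, y in zip(points, labels):
        score = sum(wi * xi for wi, xi in zip(w, x)) - b
        z.append(max(0.0, 1.0 - y * score))
    return z

# Hypothetical plane and data: the second point sits inside the margin,
# and the fourth is on the wrong side of the separating plane.
points = [[2.0, 0.0], [0.5, 0.0], [-2.0, 0.0], [0.5, 1.0]]
labels = [1, 1, -1, -1]
z = slack([1.0, 0.0], 0.0, points, labels)
```

Every entry of z is non-negative, and an entry above 1 marks a misclassified point.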
  • Slide 23
  • Dual of Closest Points Method is Support Plane Method. The solution depends only on the support vectors.
  • Slide 24
  • Nonlinear Classification
  • Slide 25
  • Nonlinear Classification: Map to higher dimensional space. IDEA: Map each point to a higher dimensional feature space and construct a linear discriminant in the higher dimensional space. The dual SVM then uses inner products of the mapped points.
  • Slide 26
  • Generalized Inner Product: by Hilbert-Schmidt kernels (Courant and Hilbert 1953), the inner product in feature space can be computed as a kernel function K of the original points, for certain mappings and K.
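As a worked example of a generalized inner product, assume a degree-2 polynomial kernel K(u, v) = (u.v)^2 on 2-D inputs (one standard choice; the slide's own examples are not in the transcript). It equals the ordinary inner product after the explicit map phi(u) = (u1^2, sqrt(2) u1 u2, u2^2). The name phi is introduced here for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def phi(u):
    """Explicit feature map for the degree-2 polynomial kernel on 2-D inputs."""
    return [u[0] ** 2, math.sqrt(2.0) * u[0] * u[1], u[1] ** 2]

def poly2_kernel(u, v):
    """K(u, v) = (u.v)^2, computed without ever forming phi."""
    return dot(u, v) ** 2

u, v = [1.0, 2.0], [3.0, -1.0]
lhs = dot(phi(u), phi(v))   # inner product in the 3-D feature space
rhs = poly2_kernel(u, v)    # same value from the 2-D inputs
```

The kernel side never builds the higher-dimensional vectors, which is what makes very large feature spaces affordable.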
  • Slide 27
  • Final Classification via Kernels: the dual SVM is expressed through the kernel K.
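A standard textbook form of the kernelized dual and the resulting classifier is offered here as a sketch, since the slide's formula is not in the transcript ($C$ is the usual soft-margin bound on the multipliers, and $y_i \in \{+1, -1\}$):

```latex
\begin{aligned}
\min_{\alpha}\quad & \tfrac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{m} y_i y_j \alpha_i \alpha_j K(x_i, x_j) - \sum_{i=1}^{m}\alpha_i \\
\text{s.t.}\quad & \sum_{i=1}^{m} y_i \alpha_i = 0, \qquad 0 \le \alpha_i \le C, \\[4pt]
f(x) &= \operatorname{sign}\Big(\sum_{i=1}^{m} y_i \alpha_i K(x_i, x) - b\Big).
\end{aligned}
```

The data enter only through kernel evaluations, so changing K changes the classifier without changing the optimization problem's structure.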
  • Slide 29
  • Final SVM Algorithm: Solve Dual SVM QP; Recover primal variable b; Classify new x. The solution depends only on the support vectors.
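The steps above can be sketched end to end on a toy problem. This is not the solver used in the talk: to stay short it swaps the QP solver for dual coordinate ascent and, instead of recovering b, absorbs the bias by adding 1 to the kernel, which removes the equality constraint from the dual. The RBF kernel, gamma, C, the sweep count, and the XOR data are all assumptions chosen for illustration.

```python
import math

def rbf_plus_one(u, v, gamma=1.0):
    """Gaussian (RBF) kernel plus 1; the +1 absorbs the bias term."""
    sq = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq) + 1.0

def train(points, labels, C=10.0, sweeps=200):
    """Dual coordinate ascent: maximize sum(a) - 1/2 a'Qa subject to 0 <= a_i <= C."""
    n = len(points)
    K = [[rbf_plus_one(points[i], points[j]) for j in range(n)] for i in range(n)]
    alpha = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            f_i = sum(alpha[j] * labels[j] * K[i][j] for j in range(n))
            step = (1.0 - labels[i] * f_i) / K[i][i]  # exact 1-D maximizer
            alpha[i] = min(C, max(0.0, alpha[i] + step))
    return alpha

def predict(x, points, labels, alpha):
    """Classify x using only points with nonzero alpha (the support vectors)."""
    score = sum(a * y * rbf_plus_one(p, x)
                for a, y, p in zip(alpha, labels, points) if a > 0.0)
    return 1 if score > 0 else -1

# XOR: not linearly separable in the input space
points = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
labels = [-1, -1, 1, 1]
alpha = train(points, labels)
preds = [predict(p, points, labels, alpha) for p in points]
```

Swapping `rbf_plus_one` for another kernel function changes the classifier while `train` and `predict` stay the same.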
  • Slide 30
  • Support Vector Machines (SVM): Key Formulation Ideas (Maximize Margins, Do the Dual, Construct Kernels); Generalization Error Bounds; Practical Algorithms
  • Slide 31
  • SVM Extensions: Regression; Variable Selection; Boosting; Density Estimation; Unsupervised Learning (Novelty/Outlier Detection, Feature Detection, Clustering)
  • Slide 32
  • Example in Drug Design: The goal is to predict bio-reactivity of molecules to decrease drug development time. The target is to predict the logarithm of inhibition concentration for site "A" on the Cholecystokinin (CCK) molecule. Constructs a quantitative structure activity relationship (QSAR) model.
  • Slide 33
  • SVM Regression: ε-insensitive loss function
  • Slide 34
  • SVM Minimizes Underestimate + Overestimate
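A minimal sketch of the ε-insensitive loss, assuming the usual definition max(0, |y - f(x)| - ε): errors inside the ε tube cost nothing, and underestimates and overestimates outside it are penalized linearly. The value of epsilon and the numbers are illustrative.

```python
def eps_insensitive_loss(y_true, y_pred, epsilon=0.5):
    """Zero inside the epsilon tube, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

# Target 2.0 with a prediction inside the tube, an overestimate, and an underestimate
losses = [eps_insensitive_loss(2.0, p) for p in (2.25, 3.0, 0.5)]
```

Only predictions that miss by more than epsilon contribute to the objective.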
  • Slide 35
  • LCCKA Problem: Training data: 66 molecules. The 323 original attributes are wavelet coefficients of TAE Descriptors. A subset of 39 attributes was selected by linear 1-norm SVM (with no kernels). For details see the DDASSL project link off of http://www.rpi.edu/~bennek. Testing set results reported.
  • Slide 36
  • LCCK Prediction: Q2 = 0.25
  • Slide 37
  • Many Other Applications: Speech Recognition; Data Base Marketing; Quark Flavors in High Energy Physics; Dynamic Object Recognition; Knock Detection in Engines; Protein Sequence Problem; Text Categorization; Breast Cancer Diagnosis. See: http://www.clopinet.com/isabelle/Projects/SVM/applist.html
  • Slide 38
  • Hallelujah! Generalization theory and practice meet. General methodology for many types of problems. Same Program + New Kernel = New method. No problems with local minima. Few model parameters. Selects capacity. Robust optimization methods. Successful Applications. BUT...
  • Slide 39
  • HYPE? Will SVMs beat my best hand-tuned method Z for X? Do SVMs scale to massive datasets? How to choose C and Kernel? What is the effect of attribute scaling? How to handle categorical variables? How to incorporate domain knowledge? How to interpret results?
  • Slide 40
  • Support Vector Machine Resources: http://www.support-vector.net/; http://www.kernel-machines.org/; Links off my web page: http://www.rpi.edu/~bennek