svm: algorithms of choice for challenging data · oracle data mining (odm) – commercial svm...
TRANSCRIPT
![Page 1: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/1.jpg)
![Page 2: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/2.jpg)
SVM: Algorithms of Choice for Challenging Data
Boriana Milenova, Joseph Yarmus, Marcos CamposData Mining TechnologiesORACLE Corp.
![Page 3: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/3.jpg)
Overview
SVM theoretical framework
ORACLE data mining technology– SVM parameter estimation– SVM optimization strategy
SVM on challenging data
![Page 4: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/4.jpg)
SVM Model Defines a Hyperplane
Linear models in feature space
Hyperplane defined by a set of coefficients and a bias term
0 bxw
wb
![Page 5: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/5.jpg)
Maximum Margin Models
))(min( ii xfymarginFunctional
support vectors
)max(min marginw
ww1))(min( ii xfymarginGeometric
![Page 6: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/6.jpg)
SVM Optimization ProblemMinimize ||w|| subject toLagrangian in primal space:
subject to
1)( ii xfy
121)( byL iiip xwwww
0i
0wpL
0bLp
iii y xw
0 ii y
![Page 7: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/7.jpg)
Duality
Lagrangian in dual space:
subject to
Dot products!– dimension-insensitive optimization– generalized dot products via non-linear map
jijijiiD yyL xx21
0i 0 ii y
)()(),( jijiK xxxx
![Page 8: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/8.jpg)
Towards Higher Dimensionality via Kernels1. Transform data via non-linear mapping to an inner
product feature space2. Train a linear machine in the new feature space
),(,.)(,.)( jiji KKK xxxx
Mercer’s kernels:– symmetry
– positive semi-definite kernel matrix
– reproducing property
),(),( ijji KK xxxx
![Page 9: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/9.jpg)
Soft Margin: Non-Separable Data
k
p CL www21)(
iii by 1xwsubject to
Capacity parameter Ctrades off complexity and empirical risk
![Page 10: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/10.jpg)
1-Norm Dual Problem
Lagrangian in dual space:
subject to
Quadratic problem– linear and inequality constraints
),(21
jijijiiD KyyL xx
Ci 0 0 ii y
![Page 11: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/11.jpg)
SVM Regression
)ˆ(21)( kk
p CL www
iii yb xwsubject to
iii by ˆ xw
![Page 12: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/12.jpg)
SVM Fundamental Properties
Convexity– single global minimum
Regularization– trades off structural and empirical risk to
avoid overfittingSparse solution
– usually only a fraction of training data become support vectors
Not probabilistic
Solvable in polynomial time…
![Page 13: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/13.jpg)
SVM in the Database
ORACLE Data Mining (ODM)– commercial SVM implementation in the
database– product targets application developers and
data mining practitioners– focuses on ease of use and efficiency
Challenges:– effective and inexpensive parameter
tuning– computationally efficient SVM model
optimization
![Page 14: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/14.jpg)
SVM Out-Of-The-Box
Inexperienced users can get dramatically poor results
LIBSVM examples:
VehicleBioinformaticsAstroparticle Physics
0.880.020.790.570.970.67
After tuningcorrect rate
Out-of-the-boxcorrect rate
![Page 15: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/15.jpg)
SVM Parameter Tuning
Grid search (+ cross-validation or generalization error estimates)
– naive– guided (Keerthi & Lin, 2002)
Parameter optimization– gradient descent (Chapelle et al., 2000)
Heuristics
![Page 16: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/16.jpg)
ODM On-the-Fly Estimates
Standard deviation for Gaussian kernel– single kernel parameter– kernel has good numeric properties
bounded, no overflowCapacity
– key to good classification generalizationEpsilon estimate for regression
– key to good regression generalization
![Page 17: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/17.jpg)
ODM Standard Deviation Estimate
Goal: Estimate distance between classes
3. Pick random pairs from opposite classes
4. Measure distances5. Order descending6. Exclude tail (90th percentile)7. Select minimum distance
![Page 18: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/18.jpg)
ODM Capacity EstimateGoal: Allocate sufficient capacity
to separate typical examples2. Pick m random examples per class3. Compute yi assuming = C
5. Exclude noise (incorrect sign)6. Scale C, (non bounded sv)
8. Order descending9. Exclude tail (90th percentile)10.Select minimum value
m
j ijji KCyy 2
1),( xx
m
j ijji KyyC 2
1),(/ xx
1iy
![Page 19: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/19.jpg)
Some Comparison Numbers
LIBSVM examples:
0.710.840.97
On-the-fly estimates
VehicleBioinformaticsAstroparticle Physics
0.880.020.850.570.970.67
Grid search + xval
Out-of-the-box
![Page 20: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/20.jpg)
ODM Epsilon EstimateGoal: estimate target noise
by fitting a preliminary model
3. Pick m random examples 4. Train SVM model with 5. Compute residuals on
remaining data6. Scale 7. Retrain
0
2/1 ntt
![Page 21: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/21.jpg)
Comparison Numbers Regression
0.020.356.57
On-the-fly estimatesRMSE
PumadynComputer activityBoston housing
0.020.336.26
Grid searchRMSE
![Page 22: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/22.jpg)
Optimization Approaches
QP solvers– MINOS, LOQO, quadprog (Matlab)
Gradient descent methods– Sequentially update one coefficient at a
timeChunking and decomposition
– optimize small “working sets” towards global solution
– analytic solution possible (SMO - Platt, 1998)
![Page 23: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/23.jpg)
Chunking strategy
/* WS working set */select initial WS randomly;while (violations){Solve QP on WS;Select new WS;
}
![Page 24: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/24.jpg)
ODM Working Set Selection
Avoid oscillations– overlap across chunks– retain non-bounded support vectors
Choose among violators– add large violators
Computational efficiency– avoid sorting
![Page 25: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/25.jpg)
Who to Retain?
/* Examine previous working set */if (non-bounded sv < 50%){
retain all non-bounded sv;add other randomly selected up to 50%;
}else{
randomly select non-bounded sv;}
![Page 26: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/26.jpg)
Who to Add?create violator list;/* Scan I - pick largest violators */while (new examples < 50% AND WS Not Full){
if (violation > avg_violation)add to WS;
}
/* Scan II - pick other violators */while (new examples < 50% AND WS Not Full){
add randomly selected violators to WS;}
![Page 27: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/27.jpg)
SVM in Feed-Forward Framework
j iijji Kyy ),( xx
j
),( iiK xx
![Page 28: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/28.jpg)
DOF in Neural Nets / RBF
![Page 29: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/29.jpg)
DOF in SVM
![Page 30: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/30.jpg)
SVM vs. Neural Net / RBF
Compact model
Global minimum
Regularization
––
–
NN / RBFSVM
![Page 31: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/31.jpg)
Text Mining
Domain characteristics:– thousands of features– hundreds of topics– sparse data
Science Sport Art
![Page 32: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/32.jpg)
SVM in Text Mining
Reuters corpus~10K documents, ~10K terms, 115 classesAccuracy: recall / precision breakeven point
0.860.840.820.790.800.72
SVMnon-linear
SVM linear
K-NNC4.5RocchioNaive Bayes
Joachims, 1998
![Page 33: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/33.jpg)
Biomining
Domain characteristics:– thousands of features– very few data points– dense data
…
microarray data
![Page 34: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/34.jpg)
SVM on Microarray Data
Multiple tumor types144 samples, 16063 genes, 14 classes Accuracy: correct rate
0.43
Naive Bayes
0.780.680.62
SVM linearK-NNWeighted voting
Ramaswamy et al., 2001
![Page 35: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/35.jpg)
Other domains
High dimensionality problems:– image (color and texture histograms)– satellite remote sensing– speech
Linear kernels sufficient in most cases– data separability– single parameter tuning (capacity)– small model size
![Page 36: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/36.jpg)
Final Note
SVM classification and regression algorithms available in ORACLE 10G database
Two APIs– JAVA (J2EE)– PL/SQL
![Page 37: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/37.jpg)
References
Chapelle, O., Vapnik, V., Bousquet, O., & Mukherjee, S. (2001). Choosing Multiple Parameters for Support Vector Machines.Hsu C., Chang C., & Lin, C. (2003). A Practical Guide to Support Vector Classification.Joachims, T. (1998). Text Categorization with Support Vector Machines: Learning with Many Relevant Features.Keerthi, S. & Lin, C. (2002). Asymptotic Behaviors of Support Vector Machines with Gaussian Kernel.Platt, J. (1998). Sequential Minimal Optimization: A Fast Algorithm for Training Support Vector Machines.Ramaswamy, S., Tamayo, P., Rifkin, R., Mukherjee, S., Yeang, C., Angelo, M., Ladd, C., Reich, M., Latulippe, E., Mesirov, J., Poggio, T., Gerald, W., Loda, M., Lander, E., Golub, T. (2001). Multi-Class Cancer Diagnosis Using Tumor Gene Expression Signatures.
![Page 38: SVM: Algorithms of Choice for Challenging Data · ORACLE Data Mining (ODM) – commercial SVM implementation in the database – product targets application developers and data mining](https://reader034.vdocuments.us/reader034/viewer/2022042103/5e8077b911fb0a6be2006bb6/html5/thumbnails/38.jpg)