The Power of Ensembles in Machine Learning
TRANSCRIPT
![Page 1: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/1.jpg)
Power of Ensembles______
Amit Kapoor @amitkaps
Bargava Subramanian @bargava
![Page 2: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/2.jpg)
The Blind Men & the Elephant___
“And so these men of Indostan
Disputed loud and long,
Each in his own opinion
Exceeding stiff and strong,
Though each was partly in the right,
And all were in the wrong.”
- John Godfrey Saxe
![Page 3: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/3.jpg)
Model Abstraction___
"All models are wrong, but some are useful"— George Box
![Page 4: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/4.jpg)
Building Many Models___
"All models are wrong, but some are useful, and their combination may be better"
![Page 5: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/5.jpg)
Ensembles___
(noun) /änˈsämbəl/
a group of items viewed as a whole rather than individually.
![Page 6: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/6.jpg)
Machine Learning Process___
Frame: Problem definition
Acquire: Data ingestion
Refine: Data wrangling
Transform: Feature creation
Explore: Feature selection
Model: Model selection
Insight: Solution communication
![Page 7: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/7.jpg)
Machine Learning Model Pipeline___
Data Space (n) --- Feature Space (f) --- Model Algorithm (m) --- Model Parameters (p)
![Page 8: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/8.jpg)
Machine Learning Challenge___
"...the task of searching through a hypotheses space to find a suitable hypothesis that will make good predictions for a particular problem"
![Page 9: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/9.jpg)
Hypotheses Space___
Data Space (n) --- Feature Space (f) --- Model Algorithm (m) --- Model Parameters (p)
|_________________________| Hypotheses Space
![Page 10: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/10.jpg)
Search Hypotheses Space___
The key question is how do we efficiently search this Hypotheses Space?
![Page 11: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/11.jpg)
Illustrative Example___
Binary Classification Problem
Classes: c = 2
Features: f = x1, x2, x3
Observations: n = 1,000
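The transcript does not include the dataset itself, so the following is a minimal sketch that builds a stand-in with the same shape (2 classes, 3 features, 1,000 observations). The use of scikit-learn's `make_classification` and the `random_state` value are assumptions for illustration, not the presenters' data.

```python
import numpy as np
from sklearn.datasets import make_classification

# Assumption: synthetic data stands in for the talk's dataset, which is not shown here.
X, y = make_classification(
    n_samples=1000,   # n = 1,000 observations
    n_features=3,     # f = x1, x2, x3
    n_informative=3,
    n_redundant=0,
    n_classes=2,      # c = 2
    random_state=0,
)

print(X.shape)
print(np.unique(y))
```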
![Page 12: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/12.jpg)
![Page 13: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/13.jpg)
Dimensionality Reduction___
"Easier to visualise the data space"
Principal Component Analysis
Dimensions = 2 --> (pc1, pc2)
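A minimal sketch of the reduction from three features to two principal components, assuming scikit-learn's `PCA` and the synthetic stand-in data (neither is confirmed by the slides):

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA

# Assumption: synthetic stand-in for the talk's 3-feature dataset.
X, y = make_classification(n_samples=1000, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)

pca = PCA(n_components=2)
X_pc = pca.fit_transform(X)   # columns play the role of pc1 and pc2

print(X_pc.shape)
print(pca.explained_variance_ratio_)  # share of variance kept by each component
```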
![Page 14: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/14.jpg)
![Page 15: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/15.jpg)
Approach to Build Models___
1. Train a model
2. Predict the probabilities for each class
3. Score the model on AUC via 5-Fold Cross-Validation
4. Show the Decision Space
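The four steps above can be sketched as follows. This assumes scikit-learn, a logistic regression as the example model, and synthetic stand-in data; the grid resolution and padding are arbitrary illustrative choices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Assumption: synthetic stand-in data, reduced to (pc1, pc2).
X, y = make_classification(n_samples=1000, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)
X_pc = PCA(n_components=2).fit_transform(X)

# 1. Train a model
model = LogisticRegression(C=1.0)
model.fit(X_pc, y)

# 2. Predict the probabilities for each class
proba = model.predict_proba(X_pc)

# 3. Score the model on AUC via 5-fold cross-validation
auc = cross_val_score(model, X_pc, y, cv=5, scoring="roc_auc")
print("mean AUC:", auc.mean())

# 4. Decision space: probability of class 1 on a grid over (pc1, pc2)
x_min, x_max = X_pc[:, 0].min() - 1, X_pc[:, 0].max() + 1
y_min, y_max = X_pc[:, 1].min() - 1, X_pc[:, 1].max() + 1
xx, yy = np.meshgrid(np.linspace(x_min, x_max, 100),
                     np.linspace(y_min, y_max, 100))
zz = model.predict_proba(np.c_[xx.ravel(), yy.ravel()])[:, 1].reshape(xx.shape)
```

The `zz` grid can then be drawn as a filled contour plot, for example with matplotlib's `contourf(xx, yy, zz)`.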
![Page 16: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/16.jpg)
Simple Model with Different Features___
| Data Space (n) | Feature Space (f) | Model Algorithm (m) | Model Parameters (p) |
|---|---|---|---|
| 100% | x1, x2 | Logistic | C=1 |
| 100% | x2, x3 | Logistic | C=1 |
| 100% | x1, x3 | Logistic | C=1 |
| 100% | pc1, pc2 | Logistic | C=1 |
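A sketch of the same comparison in code: one logistic regression per feature subset, each scored on 5-fold AUC. The data is a synthetic stand-in and the dictionary keys are only labels, so the scores will not match the slides.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Assumption: synthetic stand-in data with features x1, x2, x3.
X, y = make_classification(n_samples=1000, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)

feature_sets = {
    "x1,x2": X[:, [0, 1]],
    "x2,x3": X[:, [1, 2]],
    "x1,x3": X[:, [0, 2]],
    "pc1,pc2": PCA(n_components=2).fit_transform(X),
}

scores = {}
for name, features in feature_sets.items():
    auc = cross_val_score(LogisticRegression(C=1.0), features, y,
                          cv=5, scoring="roc_auc")
    scores[name] = auc.mean()
    print(f"{name:8s} mean AUC = {scores[name]:.3f}")
```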
![Page 17: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/17.jpg)
![Page 18: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/18.jpg)
Tune the Model Parameters___
| Data Space (n) | Feature Space (f) | Model Algorithm (m) | Model Parameters (p) |
|---|---|---|---|
| 100% | pc1, pc2 | Logistic | C=1 |
| 100% | pc1, pc2 | Logistic | C=1e-1 |
| 100% | pc1, pc2 | Logistic | C=1e-3 |
| 100% | pc1, pc2 | Logistic | C=1e-5 |
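A sketch of this parameter sweep, assuming scikit-learn's `LogisticRegression`, where `C` is the inverse of the regularisation strength (smaller `C` means stronger regularisation). The data is a synthetic stand-in.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Assumption: synthetic stand-in data, reduced to (pc1, pc2).
X, y = make_classification(n_samples=1000, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)
X_pc = PCA(n_components=2).fit_transform(X)

scores = {}
for C in [1, 1e-1, 1e-3, 1e-5]:
    auc = cross_val_score(LogisticRegression(C=C), X_pc, y,
                          cv=5, scoring="roc_auc")
    scores[C] = auc.mean()
    print(f"C={C:g}  mean AUC = {scores[C]:.3f}")
```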
![Page 19: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/19.jpg)
![Page 20: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/20.jpg)
Make Many Simple Models___
| Data Space (n) | Feature Space (f) | Model Algorithm (m) | Model Parameters (p) |
|---|---|---|---|
| 100% | pc1, pc2 | Logistic | C=1 |
| 100% | pc1, pc2 | SVM-Linear | prob=True |
| 100% | pc1, pc2 | Decision Tree | d=full |
| 100% | pc1, pc2 | KNN | nn=3 |
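A sketch of the four simple models side by side. The mapping from the slide shorthand to scikit-learn arguments (`prob=True` to `probability=True`, `d=full` to `max_depth=None`, `nn=3` to `n_neighbors=3`) is an assumption, as is the synthetic data.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic stand-in data, reduced to (pc1, pc2).
X, y = make_classification(n_samples=1000, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)
X_pc = PCA(n_components=2).fit_transform(X)

models = {
    "Logistic": LogisticRegression(C=1.0),
    "SVM-Linear": SVC(kernel="linear", probability=True),
    "Decision Tree": DecisionTreeClassifier(max_depth=None, random_state=0),
    "KNN": KNeighborsClassifier(n_neighbors=3),
}

scores = {}
for name, model in models.items():
    auc = cross_val_score(model, X_pc, y, cv=5, scoring="roc_auc")
    scores[name] = auc.mean()
    print(f"{name:14s} mean AUC = {scores[name]:.3f}")
```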
![Page 21: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/21.jpg)
![Page 22: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/22.jpg)
Hypotheses Search Approach___
Exhaustive search is impossible
Compute as a proxy for human IQ
Clever algorithmic way to search the solution space
![Page 23: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/23.jpg)
Ensemble Thought Process___
"The goal of ensemble methods is to combine the predictions of several base estimators built with (one or multiple) model algorithms in order to improve generalisation and robustness over a single estimator."
![Page 24: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/24.jpg)
Ensemble Approach___
Different training sets
Different feature sampling
Different algorithms
Different (hyper) parameters
Clever aggregation
![Page 25: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/25.jpg)
Ensemble Methods___
[1] Averaging
[2] Boosting
[3] Voting
[4] Stacking
![Page 26: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/26.jpg)
[1] Averaging: Concept___
"Build several estimators independently and then average their predictions to reduce model variance"
Ensemble several good models to produce a lower-variance model.
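Why averaging reduces variance can be made concrete with a standard identity. The symbols here are assumptions introduced for illustration: $M$ estimators $f_1, \dots, f_M$, each with variance $\sigma^2$ and pairwise correlation $\rho$.

```latex
\operatorname{Var}\!\left(\frac{1}{M}\sum_{i=1}^{M} f_i\right)
  = \rho\,\sigma^2 + \frac{1-\rho}{M}\,\sigma^2
```

With independent estimators ($\rho = 0$) the variance falls as $\sigma^2 / M$. With perfectly correlated estimators ($\rho = 1$) averaging gives no reduction, which is why the approaches that follow try to make the base models differ from one another.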
![Page 27: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/27.jpg)
[1] Averaging: Approaches
![Page 28: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/28.jpg)
[1] Averaging: Simple Models___
| Data Space (n) | Feature Space (f) | Model Algorithm (m) | Model Parameters (p) |
|---|---|---|---|
| R[50%] | pc1, pc2 | Decision Tree | n_est=10 |
| R[50%,r] | pc1, pc2 | Decision Tree | n_est=10 |
| 100% | R[50%] | Decision Tree | n_est=10 |
| R[50%] | R[50%] | Decision Tree | n_est=10 |
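One possible reading of the four rows, sketched with scikit-learn's `BaggingClassifier`. It assumes that `R[50%]` means a random 50% sample, that the trailing `r` means sampling with replacement, and that feature sampling draws from (pc1, pc2). None of this is spelled out in the transcript, and the data is a synthetic stand-in.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic stand-in data, reduced to (pc1, pc2).
X, y = make_classification(n_samples=1000, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)
X_pc = PCA(n_components=2).fit_transform(X)

def averaged_trees(max_samples, bootstrap, max_features):
    """Ten decision trees, each fit on a random view of the data."""
    return BaggingClassifier(
        DecisionTreeClassifier(),
        n_estimators=10,
        max_samples=max_samples,     # share of rows given to each tree
        bootstrap=bootstrap,         # True: rows drawn with replacement
        max_features=max_features,   # share of features given to each tree
        random_state=0,
    )

ensembles = {
    "rows R[50%]": averaged_trees(0.5, False, 1.0),
    "rows R[50%,r]": averaged_trees(0.5, True, 1.0),
    "features R[50%]": averaged_trees(1.0, False, 0.5),
    "rows and features R[50%]": averaged_trees(0.5, False, 0.5),
}

scores = {}
for name, model in ensembles.items():
    auc = cross_val_score(model, X_pc, y, cv=5, scoring="roc_auc")
    scores[name] = auc.mean()
    print(f"{name:26s} mean AUC = {scores[name]:.3f}")
```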
![Page 29: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/29.jpg)
![Page 30: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/30.jpg)
[1] Averaging: Extended Models___
Use perturb-and-combine techniques
Bagging: Bootstrap Aggregation
Random Forest: Best split amongst random features
Extremely Randomised: Random threshold for split
![Page 31: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/31.jpg)
[1] Averaging: Extended Models___
| Data Space (n) | Feature Space (f) | Model Algorithm (m) | Model Parameters (p) |
|---|---|---|---|
| 100% | pc1, pc2 | Decision Tree | d=full |
| R[50%,r] | pc1, pc2 | Decision Tree | n_est=10 |
| R[50%,r] | R[Split] | Decision Tree | n_est=10 |
| R[50%,r] | R[Split,thresh] | Decision Tree | n_est=10 |
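A sketch comparing a single tree with the three perturb-and-combine variants, assuming the rows correspond to scikit-learn's `BaggingClassifier`, `RandomForestClassifier` and `ExtraTreesClassifier`. The `max_samples=0.5` and `bootstrap=True` settings are an interpretation of `R[50%,r]`, and the data is a synthetic stand-in.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import (BaggingClassifier, ExtraTreesClassifier,
                              RandomForestClassifier)
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic stand-in data, reduced to (pc1, pc2).
X, y = make_classification(n_samples=1000, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)
X_pc = PCA(n_components=2).fit_transform(X)

models = {
    # Single fully grown tree as the baseline
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    # Bagging: each tree sees a bootstrap sample of the rows
    "Bagging": BaggingClassifier(DecisionTreeClassifier(), n_estimators=10,
                                 max_samples=0.5, bootstrap=True,
                                 random_state=0),
    # Random forest: best split amongst a random subset of features
    "Random Forest": RandomForestClassifier(n_estimators=10, max_samples=0.5,
                                            bootstrap=True, random_state=0),
    # Extremely randomised trees: split thresholds are also drawn at random
    "Extra Trees": ExtraTreesClassifier(n_estimators=10, max_samples=0.5,
                                        bootstrap=True, random_state=0),
}

scores = {}
for name, model in models.items():
    auc = cross_val_score(model, X_pc, y, cv=5, scoring="roc_auc")
    scores[name] = auc.mean()
    print(f"{name:14s} mean AUC = {scores[name]:.3f}")
```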
![Page 32: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/32.jpg)
![Page 33: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/33.jpg)
[2] Boosting: Concept___
"Build base estimators sequentially and then try to reduce the bias of the combined estimator."
Combine several weak models to produce a powerful ensemble.
![Page 34: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/34.jpg)
[2] Boosting: Concept___
Source: Wang Tan
![Page 35: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/35.jpg)
[2] Boosting: Models___
| Data Space (n) | Feature Space (f) | Model Algorithm (m) | Model Parameters (p) |
|---|---|---|---|
| 100% | pc1, pc2 | Decision Tree | d=full |
| 100% | pc1, pc2 | DTree Boost | n_est=10 |
| 100% | pc1, pc2 | DTree Boost | n_est=10, d=1 |
| 100% | pc1, pc2 | DTree Boost | n_est=10, d=2 |
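The transcript does not say which boosting algorithm "DTree Boost" denotes. The sketch below assumes gradient boosting over decision trees (scikit-learn's `GradientBoostingClassifier`); `AdaBoostClassifier` with a tree base estimator would be another plausible reading. Where the table gives no depth, the library default is left in place. The data is a synthetic stand-in.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic stand-in data, reduced to (pc1, pc2).
X, y = make_classification(n_samples=1000, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)
X_pc = PCA(n_components=2).fit_transform(X)

models = {
    "Decision Tree (d=full)": DecisionTreeClassifier(random_state=0),
    # Depth not given on the slide: library default is used (assumption)
    "Boost (n_est=10)": GradientBoostingClassifier(n_estimators=10,
                                                   random_state=0),
    "Boost (n_est=10, d=1)": GradientBoostingClassifier(n_estimators=10,
                                                        max_depth=1,
                                                        random_state=0),
    "Boost (n_est=10, d=2)": GradientBoostingClassifier(n_estimators=10,
                                                        max_depth=2,
                                                        random_state=0),
}

scores = {}
for name, model in models.items():
    auc = cross_val_score(model, X_pc, y, cv=5, scoring="roc_auc")
    scores[name] = auc.mean()
    print(f"{name:24s} mean AUC = {scores[name]:.3f}")
```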
![Page 36: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/36.jpg)
![Page 37: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/37.jpg)
[3] Voting: Concept___
"To use conceptually different model algorithms and use majority vote or soft vote (average probabilities) to combine."
Ensemble a set of equally well-performing models to balance out their individual weaknesses.
![Page 38: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/38.jpg)
[3] Voting: Approach
![Page 39: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/39.jpg)
[3] Voting: Approach___
Hard Voting: the majority (mode) of the class labels predicted
Soft Voting: the argmax of the sum of predicted probabilities
Weighted Voting: the argmax of the weighted sum of predicted probabilities*
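The three rules above can be worked through on a single sample. The probabilities and weights below are made-up numbers chosen so that hard and soft voting disagree; they do not come from the talk.

```python
import numpy as np

# Hypothetical predicted probabilities for one sample, columns = classes [0, 1]
proba = np.array([
    [0.55, 0.45],   # classifier A leans to class 0
    [0.60, 0.40],   # classifier B leans to class 0
    [0.10, 0.90],   # classifier C is confident in class 1
])
weights = np.array([1.0, 1.0, 2.0])   # hypothetical voting weights

# Hard voting: the mode of the predicted labels
labels = proba.argmax(axis=1)                       # [0, 0, 1]
hard = np.bincount(labels, minlength=2).argmax()

# Soft voting: argmax of the summed probabilities, [1.25, 1.75]
soft = proba.sum(axis=0).argmax()

# Weighted voting: argmax of the weighted sum, [1.35, 2.65]
weighted = (weights[:, None] * proba).sum(axis=0).argmax()

print(hard, soft, weighted)   # prints: 0 1 1
```

Two lukewarm votes outnumber one confident vote under hard voting, while soft and weighted voting let the confident classifier carry the decision.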
![Page 40: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/40.jpg)
[3] Voting: Models___
| Data Space (n) | Feature Space (f) | Model Algorithm (m) | Model Parameters (p) |
|---|---|---|---|
| 100% | pc1, pc2 | Gaussian | - |
| 100% | pc1, pc2 | SVM - RBF | gamma=0.5, C=1 |
| 100% | pc1, pc2 | Gradient Boost | n_est=10, d=2 |
| ----- | ------- | -------------- | -------------- |
| 100% | pc1, pc2 | Voting | Soft |
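A sketch of this soft-voting ensemble with scikit-learn's `VotingClassifier`. Reading "Gaussian" as Gaussian naive Bayes is an assumption, as are the estimator labels and the synthetic data.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import GradientBoostingClassifier, VotingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Assumption: synthetic stand-in data, reduced to (pc1, pc2).
X, y = make_classification(n_samples=1000, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)
X_pc = PCA(n_components=2).fit_transform(X)

voter = VotingClassifier(
    estimators=[
        ("gaussian", GaussianNB()),   # assumed meaning of "Gaussian"
        ("svm_rbf", SVC(kernel="rbf", gamma=0.5, C=1, probability=True)),
        ("gboost", GradientBoostingClassifier(n_estimators=10, max_depth=2,
                                              random_state=0)),
    ],
    voting="soft",   # average the predicted class probabilities
)

auc = cross_val_score(voter, X_pc, y, cv=5, scoring="roc_auc")
print("mean AUC:", auc.mean())
```

Soft voting needs every base model to expose `predict_proba`, which is why the SVM is created with `probability=True`.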
![Page 41: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/41.jpg)
![Page 42: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/42.jpg)
[4] Stacking___
"To use output of different model algorithms as feature inputs for a meta classifier."
Ensemble a set of well-performing models to uncover higher-order interaction effects.
![Page 43: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/43.jpg)
[4] Stacking: Approach
![Page 44: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/44.jpg)
[4] Stacking: Models___
| Data Space (n) | Feature Space (f) | Model Algorithm (m) | Model Parameters (p) |
|---|---|---|---|
| 100% | pc1, pc2 | Gaussian | - |
| 100% | pc1, pc2 | SVM - RBF | gamma=0.5, C=1 |
| 100% | pc1, pc2 | Gradient Boost | n_est=10, d=2 |
| ----- | ------- | -------------- | -------------- |
| 100% | pc1, pc2 | Stacking | Logistic() |
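A sketch of the stacked model using scikit-learn's `StackingClassifier`, which may differ from the library the presenters used. "Gaussian" is again read as Gaussian naive Bayes, `cv=5` for building the meta-features is an illustrative choice, and the data is a synthetic stand-in.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import GradientBoostingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Assumption: synthetic stand-in data, reduced to (pc1, pc2).
X, y = make_classification(n_samples=1000, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)
X_pc = PCA(n_components=2).fit_transform(X)

stack = StackingClassifier(
    estimators=[
        ("gaussian", GaussianNB()),   # assumed meaning of "Gaussian"
        ("svm_rbf", SVC(kernel="rbf", gamma=0.5, C=1, probability=True)),
        ("gboost", GradientBoostingClassifier(n_estimators=10, max_depth=2,
                                              random_state=0)),
    ],
    # Meta classifier trained on the base models' out-of-fold predictions
    final_estimator=LogisticRegression(),
    cv=5,
)

auc = cross_val_score(stack, X_pc, y, cv=5, scoring="roc_auc")
print("mean AUC:", auc.mean())
```

The inner `cv=5` means the meta classifier only sees predictions made on rows the base models were not trained on, which limits leakage from the base models into the meta features.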
![Page 45: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/45.jpg)
![Page 46: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/46.jpg)
Ensemble Advantages___
Improved accuracy
Robustness and better generalisation
Parallelization
![Page 47: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/47.jpg)
Ensemble Disadvantages___
Model human readability isn't great
Time/effort trade-off to improve accuracy may not make sense
![Page 48: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/48.jpg)
Advanced Ensemble Approach___
— Grid Search for voting weights
— Bayesian Optimization for weights & hyper-parameters
— Feature Engineering e.g. tree or cluster embedding
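The first item, a grid search over voting weights, can be sketched as below. To keep it cheap, each base model's out-of-fold probabilities are computed once and the weight grid is searched over those cached predictions. The candidate weights `{1, 2, 3}`, the base models and the synthetic data are all assumptions for illustration.

```python
from itertools import product

import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Assumption: synthetic stand-in data, reduced to (pc1, pc2).
X, y = make_classification(n_samples=1000, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)
X_pc = PCA(n_components=2).fit_transform(X)

models = [
    GaussianNB(),
    SVC(kernel="rbf", gamma=0.5, C=1, probability=True),
    GradientBoostingClassifier(n_estimators=10, max_depth=2, random_state=0),
]

# Out-of-fold probability of class 1 for each model, computed once
oof = [cross_val_predict(m, X_pc, y, cv=5, method="predict_proba")[:, 1]
       for m in models]

best_auc, best_weights = -1.0, None
for w in product([1, 2, 3], repeat=len(models)):
    blended = np.average(oof, axis=0, weights=w)   # weighted soft vote
    auc = roc_auc_score(y, blended)
    if auc > best_auc:
        best_auc, best_weights = auc, w

print("best weights:", best_weights, "AUC:", round(best_auc, 3))
```

Because the weights are chosen on the same out-of-fold predictions they are scored on, the reported AUC is optimistic; a held-out set would give a fairer estimate.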
![Page 49: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/49.jpg)
Ensembles___
"It is the harmony of the diverse parts, their symmetry, their happy balance; in a word it is all that introduces order, all that gives unity, that permits us to see clearly and to comprehend at once both the ensemble and the details."- Henri Poincare
![Page 50: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/50.jpg)
Code and Slides____
All the code and slides are available at https://github.com/amitkaps/ensemble
![Page 51: The Power of Ensembles in Machine Learning](https://reader034.vdocuments.us/reader034/viewer/2022042907/58cf3cc11a28ab254a8b509b/html5/thumbnails/51.jpg)
Contact______
Amit Kapoor @amitkaps amitkaps.com
Bargava Subramanian @bargava