Towards Automatic Composition of Multi-Component Predictive Systems
Manuel Martin Salvador, Marcin Budka, Bogdan Gabrys
Data Science Institute, Bournemouth University, UK
April 18th, 2016, Seville, Spain
Data is imperfect
● Missing values
● Noise
● High dimensionality
● Outliers
Image credits: Question mark: http://commons.wikimedia.org/wiki/File:Question_mark_road_sign,_Australia.jpg · Noise: http://www.flickr.com/photos/benleto/3223155821/ · Outliers: http://commons.wikimedia.org/wiki/File:Diagrama_de_caixa_com_outliers_and_whisker.png · 3D plot: http://salsahpc.indiana.edu/plotviz/
Multi-Component Predictive Systems
[Diagram: an MCPS connects Data through one or more Preprocessing steps to one or more Predictive Models, followed by Postprocessing, to produce Predictions]
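To make this concrete, below is a minimal sketch of an MCPS expressed as a scikit-learn Pipeline (the systems in this work are built from WEKA components, so the dataset, steps and hyperparameter values here are purely illustrative): preprocessing steps feed a predictive model whose outputs are the predictions.

```python
# Illustrative only: an MCPS sketched as a scikit-learn Pipeline.
from sklearn.datasets import load_iris
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

mcps = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),      # handle missing values
    ("scale", StandardScaler()),                     # normalise feature ranges
    ("model", DecisionTreeClassifier(max_depth=3)),  # predictive model
])

mcps.fit(X, y)
predictions = mcps.predict(X)
```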
CASH problem
Combined Algorithm Selection and Hyperparameter configuration problem: select the algorithms and their hyperparameters that minimise an objective function (e.g. classification error), estimated by k-fold cross-validation over training and validation datasets.
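In the notation of Thornton et al. (cited below), the CASH problem is to choose the algorithm A(j) from a set of algorithms and its hyperparameters λ that minimise the loss L (e.g. classification error) averaged over the k cross-validation folds:

```latex
A^{*}_{\lambda^{*}} \in \operatorname*{arg\,min}_{A^{(j)} \in \mathcal{A},\ \lambda \in \Lambda^{(j)}}
  \frac{1}{k} \sum_{i=1}^{k}
  \mathcal{L}\left(A^{(j)}_{\lambda},\ \mathcal{D}^{(i)}_{\mathrm{train}},\ \mathcal{D}^{(i)}_{\mathrm{valid}}\right)
```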
Thornton, C., Hutter, F., Hoos, H.H., Leyton-Brown, K.: Auto-WEKA: Combined selection and hyperparameter optimization of classification algorithms. In: Proc. of the 19th ACM SIGKDD (2013) 847–855
Auto-WEKA
WEKA methods as search space
One-click black box: Data + Time Budget → MCPS
Our contribution: recursive extension of complex hyperparameters in the search space (see the sketch below).
Code available at https://github.com/dsibournemouth/autoweka
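A minimal, hypothetical sketch of such a hierarchical search space, in which a complex hyperparameter (here, the base predictor of a meta-predictor) is itself a component with its own hyperparameters. The component names and structure are purely illustrative and do not match Auto-WEKA's internal representation.

```python
import random

def choice(options):
    return ("choice", options)

def uniform(low, high):
    return ("uniform", low, high)

# Hierarchical search space: "base_predictor" is a complex hyperparameter
# that expands recursively into another component and its hyperparameters.
search_space = {
    "preprocessing": choice([
        {"name": "ReplaceMissingValues"},
        {"name": "PrincipalComponents", "variance_covered": uniform(0.5, 1.0)},
    ]),
    "predictor": choice([
        {"name": "J48", "confidence": uniform(0.01, 0.5)},
        {"name": "AdaBoostM1",
         "iterations": uniform(10, 100),
         "base_predictor": choice([
             {"name": "J48", "confidence": uniform(0.01, 0.5)},
         ])},
    ]),
}

def sample(node):
    """Recursively draw one concrete configuration from the nested space."""
    if isinstance(node, tuple) and node[0] == "choice":
        return sample(random.choice(node[1]))
    if isinstance(node, tuple) and node[0] == "uniform":
        return random.uniform(node[1], node[2])
    if isinstance(node, dict):
        return {key: sample(value) for key, value in node.items()}
    return node

print(sample(search_space))
```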
Optimisation strategies
● Grid search: exhaustive exploration of the whole search space. Not feasible in high-dimensional spaces.
● Random search: explores the search space at random within a given time budget (see the sketch after this list).
● Bayesian optimisation: assumes an underlying function mapping hyperparameters to the objective and tries to explore the most promising parts of the search space.
Hutter, F., Hoos, H. H., & Leyton-Brown, K. (2011). Sequential Model-Based Optimization for General Algorithm Configuration. Learning and Intelligent Optimization, 6683 LNCS, 507–523.
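A minimal sketch of random search under an evaluation budget; the objective function and the single hyperparameter below are made up for illustration and stand in for cross-validated classification error:

```python
import random

def objective(c):
    # Stand-in for the cross-validated error of a model with hyperparameter c.
    return (c - 0.3) ** 2 + random.gauss(0, 0.01)

best_c, best_error = None, float("inf")
for _ in range(100):                  # budget: 100 evaluations
    c = random.uniform(0.0, 1.0)      # sample uniformly from the search space
    error = objective(c)
    if error < best_error:
        best_c, best_error = c, error

print(best_c, best_error)
```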
Evaluated strategies
1. WEKA-Def: All the predictors and meta-predictors are run using WEKA’s default hyperparameter values.
2. Random search: the search space is explored at random.
3. SMAC: Sequential Model-based Algorithm Configuration incrementally builds a Random Forest as its inner model.
4. TPE: the Tree-structured Parzen Estimator incrementally builds an inner model from non-parametric (Parzen) density estimates (see the sketch below).
Hutter, F., Hoos, H. H., & Leyton-Brown, K. (2011). Sequential Model-Based Optimization for General Algorithm Configuration. Learning and Intelligent Optimization, 6683 LNCS, 507–523.
Bergstra, J., Bardenet, R., Bengio, Y., & Kégl, B. (2011). Algorithms for Hyper-Parameter Optimization. In Advances in NIPS 24, pp. 1–9.
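For illustration only, a toy objective optimised with TPE via the hyperopt library (this is not the Auto-WEKA/SMAC setup used in the experiments; the objective stands in for cross-validated classification error):

```python
from hyperopt import fmin, tpe, hp

def objective(c):
    return (c - 0.3) ** 2             # stand-in for classification error

best = fmin(
    fn=objective,
    space=hp.uniform("c", 0.0, 1.0),  # search space for hyperparameter c
    algo=tpe.suggest,                 # Tree-structured Parzen Estimator
    max_evals=100,                    # evaluation budget
)
print(best)                           # e.g. {'c': 0.30...}
```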
Experiments
21 datasets (classification problems)
Budget: 30 CPU-hours (per run)
25 runs with different seeds
Timeout: 30 minutes
Memory limit (memout): 3 GB RAM
Results
Classification error on test set
● WEKA-Def (best): 1/21
● Random search (mean): 4/21
● SMAC (mean): 10/21
● TPE (mean): 6/21
Search spaces
● NEW > PREV: 52/63
Conclusion and future work
Automation of composition and optimisation of MCPSs is feasible
Extending the search space has helped to find better solutions
Bayesian optimisation strategies have performed better than random search in most cases
Future work:
● There is still room for improvement in Bayesian optimisation strategies.
● Multi-objective optimisation (e.g. time and error).
● Adaptive optimisation in changing environments.
Thank you!
Paper available at https://dx.doi.org/10.1007/978-3-319-32034-2_3
Slides available at http://slideshare.net/draxus