
Page 1: Title

Performance Prediction and Automated Tuning of Randomized and Parametric Algorithms: An Initial Investigation

Frank Hutter¹, Youssef Hamadi², Holger Hoos¹, and Kevin Leyton-Brown¹
¹University of British Columbia, Vancouver, Canada
²Microsoft Research Cambridge, UK

Page 2: Motivation: Why automatic parameter tuning? (1)

• Most approaches for solving hard combinatorial problems (e.g. SAT, CP) are highly parameterized
  Tree search:
  - Variable/value heuristics
  - Propagation
  - Whether and when to restart
  - How much learning
  Local search:
  - Noise parameter
  - Tabu length in tabu search
  - Strength of penalty increase and decrease in DLS
  - Perturbation, acceptance criterion, etc. in ILS

Page 3: Motivation: Why automatic parameter tuning? (2)

• Algorithm design: for a new algorithm or application, a lot of time is spent on parameter tuning
• Algorithm analysis: comparability. Is algorithm A faster than algorithm B only because its developers spent more time tuning it?
• Algorithm use in practice: want to solve MY problems fast, not necessarily the ones the developers used for parameter tuning

Page 4: Related work in automated parameter tuning

• Best fixed parameter setting for an instance set [Birattari et al. '02, Hutter '04, Adenso-Díaz & Laguna '05]
• Algorithm selection/configuration per instance [Lobjois & Lemaître '98, Leyton-Brown, Nudelman et al. '02 & '04, Patterson & Kautz '02]
• Best sequence of operators / changing the search strategy during the search [Battiti et al. '05, Lagoudakis & Littman '01 & '02]

Page 5: Overview

• Previous work on empirical hardness models [Leyton-Brown, Nudelman et al. ’02 & ’04]

• EH models for randomized algorithms

• EH models and automated tuning for parametric algorithms

• Conclusions

Page 6: Empirical hardness models: Basics (1 algorithm)

• Training: Given a set of t instances s1, ..., st
  For each instance si:
  - Compute instance features xi = (xi1, ..., xim)
  - Run the algorithm and record its runtime yi
  Learn a function f: features → runtime, such that yi ≈ f(xi) for i = 1, ..., t
• Test: Given a new instance st+1:
  Compute features xt+1
  Predict runtime yt+1 = f(xt+1)
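A minimal sketch of this training/test loop, assuming hypothetical helpers compute_features and run_algorithm, and using plain least squares as a stand-in for the regression method (the slides' actual model, ridge regression over basis functions, appears on a later slide):

```python
import numpy as np

def train_eh_model(instances, compute_features, run_algorithm):
    """Fit f: features -> runtime by least squares on (x_i, y_i) pairs."""
    X = np.array([compute_features(s) for s in instances])  # x_i = (x_i1, ..., x_im)
    y = np.array([run_algorithm(s) for s in instances])     # recorded runtime y_i
    w, *_ = np.linalg.lstsq(X, y, rcond=None)               # y_i ~= f(x_i) = x_i . w
    return w

def predict_runtime(w, compute_features, new_instance):
    """Predict y_{t+1} = f(x_{t+1}) for an unseen instance s_{t+1}."""
    return compute_features(new_instance) @ w
```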

Page 7: Empirical hardness models: Which instance features?

• Features should be polytime computable (for us, in seconds):
  - Basic properties, e.g. #vars, #clauses, their ratio (13)
  - Graph-based characteristics (10)
  - Estimates of DPLL search space size (5)
  - Local search probes (15)
• Combine features to form more expressive basis functions:
  - Basis functions φ = (φ1, ..., φq) can be arbitrary combinations of the features x1, ..., xm
• Basis functions used for SAT in [Nudelman et al. '04]:
  - 91 original features xi
  - Pairwise products of features xi · xj
  - Only a subset of these (drop useless basis functions)
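The pairwise-product expansion can be sketched as follows; the subset selection that drops useless basis functions is not shown:

```python
import numpy as np

def quadratic_basis(x):
    """Expand raw features x_1..x_m into basis functions: the original
    features plus all pairwise products x_i * x_j (i <= j)."""
    x = np.asarray(x, dtype=float)
    i, j = np.triu_indices(len(x))
    return np.concatenate((x, x[i] * x[j]))  # linear terms + pairwise products
```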

Page 8: Empirical hardness models: How to learn the function f: features → runtime?

• Runtimes vary by orders of magnitude, so we need a model that can deal with that:
  Log-transform the output, e.g. a runtime of 10³ sec gives yi = 3
• Learn a linear function of the basis functions: f(φi) = φi wᵀ ≈ yi
  Learning reduces to fitting the weights w (ridge regression: w = (δI + ΦᵀΦ)⁻¹ Φᵀ y), sketched below
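A sketch of this learning step, where Phi is the t-by-q matrix whose rows are the basis-function vectors φi, and delta is a hypothetical regularization constant:

```python
import numpy as np

def fit_ridge(Phi, runtimes, delta=1e-2):
    """Ridge regression on log10-transformed runtimes:
    w = (delta*I + Phi^T Phi)^{-1} Phi^T y."""
    y = np.log10(runtimes)  # runtime 10^3 sec -> y_i = 3
    q = Phi.shape[1]
    return np.linalg.solve(delta * np.eye(q) + Phi.T @ Phi, Phi.T @ y)

def predict_log_runtime(w, phi):
    """f(phi) = phi . w, an estimate of log10 runtime."""
    return phi @ w
```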

Page 9: Algorithm selection based on empirical hardness models (e.g. SATzilla)

• Given a portfolio of n different algorithms A1, ..., An:
  Pick the best algorithm for each instance
• Training: Learn n separate functions fj: features → runtime of algorithm j
• Test (for each new instance st+1):
  Predict runtime yj,t+1 = fj(xt+1) for each algorithm
  Choose the algorithm Aj with minimal yj,t+1
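A sketch of the per-instance selection step, assuming each learned model fj is a weight vector over the same basis functions:

```python
import numpy as np

def select_algorithm(weight_vectors, phi_new):
    """Pick the portfolio algorithm with the smallest predicted runtime.
    weight_vectors[j] holds the weights of the learned model f_j."""
    predicted = np.array([phi_new @ w for w in weight_vectors])
    return int(np.argmin(predicted))  # index j of the chosen algorithm A_j
```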

Page 10: Overview

• Previous work on empirical hardness models [Leyton-Brown, Nudelman et al. ’02 & ’04]

• EH models for randomized algorithms

• EH models and automated tuning for parametric algorithms

• Conclusions

Page 11: Empirical hardness models for randomized algorithms

• Can this same approach predict the run-time of incomplete, randomized local search algorithms? Yes!
• Incomplete → limited to satisfiable instances (train & test)
• Local search → no changes needed
• Randomized → ultimately, we want to predict entire run-time distributions (RTDs)
  For our algorithms, these RTDs are typically exponential and can thus be characterized by a single sufficient statistic, such as median run-time
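For intuition, here is the worked form of that last claim (a sketch, not the paper's code): an exponential RTD with median m has rate ln(2)/m, so one predicted median determines the whole distribution:

```python
import numpy as np

def exponential_rtd(median_runtime, t):
    """CDF of an exponential RTD with median m: P(T <= t) = 1 - 2^(-t/m).
    The rate is ln(2)/m, so the median is a sufficient statistic."""
    return 1.0 - np.exp2(-np.asarray(t, dtype=float) / median_runtime)

# e.g. with a predicted median of 10 s, the chance of solving within 20 s is
# exponential_rtd(10.0, 20.0) == 0.75
```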

Page 12: Prediction of median run-time (only the red parts changed)

• Training: Given a set of t instances s1, ..., st
  For each instance si:
  - Compute features xi = (xi1, ..., xim)
  - Run the algorithm multiple times to get runtimes yi1, ..., yik
  - Compute the median mi of yi1, ..., yik
  Learn a function f: features → median run-time, such that mi ≈ f(xi)
• Test: Given a new instance st+1:
  Compute features xt+1
  Predict median run-time mt+1 = f(xt+1)
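A sketch of the changed training step, again with hypothetical compute_features and run_algorithm helpers; everything downstream of the targets stays as on slide 6:

```python
import numpy as np

def median_runtime_targets(instances, compute_features, run_algorithm, k=10):
    """Build training data for a randomized solver: k runs per instance,
    then regress on the median m_i of the observed runtimes y_i1..y_ik."""
    X = np.array([compute_features(s) for s in instances])
    m = np.array([np.median([run_algorithm(s) for _ in range(k)])
                  for s in instances])
    return X, m  # then learn f with m_i ~= f(x_i) exactly as before
```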

Page 13: Experimental setup: solvers

• Two SAT solvers:
  - Novelty+ (WalkSAT variant)
  - SAPS (Scaling and Probabilistic Smoothing)
  An adaptive version of Novelty+ won the SAT04 competition for random instances; SAPS came second
• Runs cut off after 15 minutes

Page 14: Experimental setup: benchmarks

• Three unstructured distributions:
  - CV-fix: c/v ratio 4.26; 20,000 instances with 400 variables (10,011 satisfiable)
  - CV-var: c/v ratio between 3.26 and 5.26; 20,000 instances with 400 variables (10,129 satisfiable)
  - SAT04: 3,000 instances created with the generators for the SAT04 competition (random instances) with the same parameters (1,420 satisfiable)
• One structured distribution:
  - QWH: quasigroups with holes, 25% to 75% holes; 7,498 instances, satisfiable by construction
• All data sets were split 50:25:25 for train/validation/test

Page 15: Results for predicting (median) run-time for SAPS on CV-var

[Figure: predicted vs. actual run-time; left: prediction based on single runs, right: prediction based on medians of 1000 runs]

Page 16: Results for predicting median run-time based on 10 runs

[Figure: left: SAPS on QWH; right: Novelty+ on SAT04]

Page 17: Predicting complete run-time distributions for exponential RTDs (from the extended CP version)

[Figure: left: RTD of SAPS on the q0.75 instance of QWH; right: RTD of SAPS on the q0.25 instance of QWH]

Page 18: Overview

• Previous work on empirical hardness models [Leyton-Brown, Nudelman et al. ’02 & ’04]

• EH models for randomized algorithms

• EH models and automated tuning for parametric algorithms

• Conclusions

Page 19: Empirical hardness models for parametric algorithms (only the red parts changed)

• Training: Given a set of instances s1, ..., st
  For each instance si:
  - Compute features xi
  - Run the algorithm with some settings pi1, ..., pi,ni to get runtimes yi1, ..., yi,ni
  - Form basis functions φij = φ(xi, pij) of features and parameter settings (quadratic expansion of the parameters, multiplied by the instance features)
  Learn a function g: basis functions → run-time, such that g(φij) ≈ yij
• Test: Given a new instance st+1:
  Compute features xt+1
  For each parameter setting p of interest, construct φt+1,p = φ(xt+1, p) and predict run-time g(φt+1,p), as sketched below
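A sketch of the joint basis functions and the tuning step, under one concrete reading of "quadratic expansion of the parameters, multiplied by the instance features":

```python
import numpy as np

def param_basis(x, p):
    """phi(x, p): quadratic expansion of the parameter vector p
    (1, p_k, p_k * p_l), with each term multiplied by every instance feature."""
    x, p = np.asarray(x, dtype=float), np.asarray(p, dtype=float)
    i, j = np.triu_indices(len(p))
    quad = np.concatenate(([1.0], p, p[i] * p[j]))
    return np.outer(x, quad).ravel()

def best_setting(w, x_new, candidate_settings):
    """Predict g(phi(x_{t+1}, p)) for every setting p of interest and
    return the setting with the smallest predicted (log) runtime."""
    preds = [param_basis(x_new, p) @ w for p in candidate_settings]
    return candidate_settings[int(np.argmin(preds))]
```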

Page 20: Experimental setup

• Parameters:
  - Novelty+: noise ∈ {0.1, 0.2, 0.3, 0.4, 0.5, 0.6}
  - SAPS: all 30 combinations of α ∈ {1.2, 1.3, 1.4} and ρ ∈ {0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9}
• One additional data set: Mixed = union of QWH and SAT04
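For concreteness, this SAPS grid is just the Cartesian product of the two value sets and could be fed directly to the best_setting sketch above; the α/ρ reading of the two sets follows the default (α, ρ) = (1.3, 0.8) given on the backup slide:

```python
from itertools import product

# The SAPS configuration space from this slide: every (alpha, rho) pair.
alphas = [1.2, 1.3, 1.4]
rhos = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
saps_settings = list(product(alphas, rhos))
assert len(saps_settings) == 30  # matches the 30 combinations above
```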

Page 21: Results for predicting SAPS run-time with 30 different parameter settings on QWH (from the extended CP version)

[Figure: left: 5 instances, one symbol per instance; right: 1 instance in detail (the blue diamonds in the left figure)]

Page 22: Results for automated parameter setting

Data set | Algo | Speedup over default params | Speedup over best fixed params
---------|------|-----------------------------|-------------------------------
SAT04    | Nov+ | 0.89                        | 0.89
QWH      | Nov+ | 177                         | 0.91
Mixed    | Nov+ | 13                          | 10.72
SAT04    | SAPS | 2                           | 1.07
QWH      | SAPS | 2                           | 0.93
Mixed    | SAPS | 1.91                        | 0.93

(Slide annotation: "Not the best algorithm to tune. Do you have a better one?")

Page 23: Results for automated parameter setting, Novelty+ on Mixed

[Figure: left: compared to random parameters; right: compared to best fixed parameters]

Page 24: Overview

• Previous work on empirical hardness models [Leyton-Brown, Nudelman et al. ’02 & ’04]

• EH models for randomized algorithms

• EH models and automated tuning for parametric algorithms

• Conclusions

Page 25: Conclusions

• We can predict the run-time of randomized, incomplete, parameterized local search algorithms
• We can automatically find good parameter settings:
  - Better than the default settings
  - Sometimes better than the best possible fixed setting
• There's no free lunch:
  - Long initial training time
  - Need domain knowledge to define features for a domain (only once per domain)

Page 26: Future work

• Even better predictions:
  - More involved ML techniques, e.g. Gaussian processes
  - Predictive uncertainty, to know when our predictions are not reliable
• Reduce training time:
  - Especially for high-dimensional parameter spaces, the current approach will not scale
  - Use active learning to choose the best parameter configurations to train on
• We need compelling domains: please come talk to me!
  - Complete parametric SAT solvers
  - Parametric solvers for other domains where features can be defined (CP? Even planning?)
  - Optimization algorithms

Page 27: The End

• Thanks to:
  - Holger Hoos, Kevin Leyton-Brown, Youssef Hamadi
  - The reviewers, for helpful comments
  - You, for your attention

Page 28: Experimental setup: solvers (details)

• Two SAT solvers:
  - Novelty+ (WalkSAT variant)
    - Default noise setting 0.5 (= 50%) for unstructured instances
    - Noise setting 0.1 used for structured instances
  - SAPS (Scaling and Probabilistic Smoothing)
    - Default setting (α, ρ) = (1.3, 0.8)
  An adaptive version of Novelty+ won the SAT04 competition for random instances; SAPS came second
• Runs cut off after 15 minutes

Page 29: Which features are most important? (from the extended CP version)

• Results are consistent with those for deterministic tree-search algorithms:
  - Graph-based and DPLL-based features matter
  - Local search probes are even more important here
• Only very few features are needed for good models:
  - Previously observed for all-sat data [Nudelman et al. '04]
  - A single quadratic basis function is often almost as good as the best feature subset
• Strong correlation between features: many choices yield comparable performance