Benchmarking DFO algorithms
MTH8418
S. Le Digabel, Polytechnique Montréal
Winter 2020 (v2)
Plan
Introduction and problems
Tables and convergence graph
Performance and data profiles
References
Introduction and problems
Introduction
▶ How to measure the performance of an algorithm?
▶ How to compare algorithms amongst themselves?
▶ How to identify groups of problems?
▶ Are we interested in the final solution only, or in the progression of the algorithms?
How to benchmark DFO algorithms
▶ Performance indicators:
  ▶ Typically: value of f and number of blackbox evaluations
  ▶ Special measures for robustness, multiobjective optimization, etc.
▶ In the DFO context, CPU time is rarely relevant since the number of evaluations is a perfect machine-independent criterion
▶ However, CPU time is useful for:
  ▶ Blackboxes with variable complexity
  ▶ Parallelism
  ▶ ...
How to find problems
▶ We want to achieve reproducibility. However, most blackbox applications are proprietary and/or cannot be shared easily. There is not yet a universal and accepted collection of such problems.
▶ Typically, algorithms are tested on a mixture of analytic problems and real-life applications.
▶ When testing on real applications, it is necessary to use many instances, for example by changing the parameters of the applications, or by using several starting solutions.
A realistic blackbox for benchmarking
STYRENE problem [Audet et al., 2008]
8 variables, 11 constraints, one evaluation ≈ 1 s, ≈ 20% of failures
Collections of analytic problems
The usual analytic collections are:
▶ The Hock and Schittkowski set [Hock and Schittkowski, 1981]
▶ The Lukšan and Vlček set [Lukšan and Vlček, 2000]
▶ Problems used in [Moré and Wild, 2009]: a core of 22 unconstrained CUTEst problems reformulated as 212 instances (smooth, nonsmooth, and noisy)
▶ CUTEst, the latest evolution of the CUTE and CUTEr collections [Gould et al., 2015]
▶ The COCO platform
▶ Isolated problems such as ROSENBROCK or GRIEWANK
Tables and convergence graph
Tables of numerical results
▶ Useful to summarize instance characteristics
▶ Give the exact values for each execution of Solver s on Problem p
▶ Problem: becomes difficult to read (too much information)
Table example
[Example table of exact values for each (solver, problem) execution not reproduced]
Convergence graph
▶ Provides a picture of the convergence of one or several methods for one given instance of a problem
▶ Represents only the successes, or all evaluations
▶ Horizontal steps between two f values
▶ Problem: only for one instance. Does not allow one to draw any general conclusion
Convergence graph example (1)
[Figure: convergence graph for STYRENE (run S4), objective value (×10⁷) vs. number of evaluations (0 to 1,000), comparing MADS alone, Quadratic, TGP, and TGP (precise)]
Convergence graph example (2)
[Figure: convergence graph of best valid objective (f) vs. blackbox evaluations (n), comparing EY, EY-nomax, EI, EI-nomax, SANN, MADS-AL, AsyEnt, MADS, and model-based]
From [Gramacy et al., 2015]
Convergence + trajectory plots for 2D examples
From [Audet and Hare, 2017]
Performance and data profiles
Performance and data profiles
▶ Introduced by [Dolan and Moré, 2002, Moré and Wild, 2009]
▶ Graphical and straightforward way of comparing methods on sets of problems
▶ Relative comparison of the methods
▶ Moré and Wild give the MATLAB tools to draw profiles at http://www.mcs.anl.gov/~more/dfo/
▶ We consider:
  ▶ Unconstrained problems
  ▶ Single-objective problems
  ▶ Deterministic instances and algorithms
  ▶ No parallelism
Profiles: Original version from the M&W paper
▶ P: set of problems or instances
▶ S: set of solvers, or algorithms, or methods
▶ Performance measure t_{p,s} > 0 available for each p ∈ P and s ∈ S. Typically the number of evaluations required to satisfy a convergence test
▶ Small values of the performance measure are preferable
▶ Performance ratio for problem p ∈ P and solver s ∈ S:
\[ r_{p,s} = \frac{t_{p,s}}{\min\{t_{p,a} : a \in S\}} \]
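As an illustration of this ratio (not part of the original slides), a minimal NumPy sketch with an invented 2-problem, 3-solver table:

```python
import numpy as np

# Invented t[p, s] table: rows = problems, columns = solvers;
# np.inf marks a solver that never satisfied the convergence test.
T = np.array([[10.0, 25.0, 5.0],
              [80.0, 40.0, np.inf]])

# r[p, s] = t[p, s] / min over solvers of t[p, .]
R = T / T.min(axis=1, keepdims=True)
print(R)  # [[2. 5. 1.], [2. 1. inf]]
```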
Convergence test (1/2)
▶ One possible convergence test is, for the candidate solution x:
\[ f(x_0) - f(x) \geq (1 - \tau)\,(f(x_0) - f_L) \]
▶ Where:
  ▶ τ > 0: tolerance
  ▶ x_0: unique and feasible starting point
  ▶ f_L: smallest value of f obtained by any solver within a given budget of evaluations, for each p ∈ P
▶ It requires that the reduction f(x_0) - f(x) achieved by x be at least (1 - τ) times the best possible reduction f(x_0) - f_L
▶ τ represents the percentage decrease from f(x_0). As τ decreases, the accuracy of f(x) as an approximation to f_L increases
Convergence test (2/2)
▶ Example 1: values of f(x_0) and of f after 100 evaluations, for 3 algorithms on 2 problems:

                Pb 1    Pb 2
    f(x_0)      10      -703.57
    f, Algo. 1  0.01    -1,907.47
    f, Algo. 2  1.2     -3,964.20
    f, Algo. 3  0       -3,682.12

  For τ = 0.1, when does the convergence test pass?
▶ Other convergence tests include the relative error between f(x) and f_L. For example
\[ \frac{f(x) - f_L}{|f_L|} \leq \tau \quad \text{if } f_L \neq 0 \]
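A short Python check of this test on the Example 1 data above (the dictionary layout is just for illustration):

```python
# Convergence test f(x0) - f(x) >= (1 - tau) * (f(x0) - fL), with
# tau = 0.1 and fL taken as the best value found by any algorithm.
tau = 0.1
f0 = {"Pb 1": 10.0, "Pb 2": -703.57}
f  = {"Pb 1": [0.01, 1.2, 0.0],               # Algos 1, 2, 3
      "Pb 2": [-1907.47, -3964.20, -3682.12]}

for pb, values in f.items():
    fL = min(values)
    for algo, fx in enumerate(values, start=1):
        ok = f0[pb] - fx >= (1 - tau) * (f0[pb] - fL)
        print(f"{pb}, Algo. {algo}: {'passes' if ok else 'fails'}")
# Pb 1: Algos 1 and 3 pass (f must reach 1 or below); Algo 2 fails.
# Pb 2: Algos 2 and 3 pass; Algo 1 fails.
```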
Performance profiles
▶ The best solver s* ∈ S for a particular problem p ∈ P attains the lower bound r_{p,s*} = 1
▶ t_{p,s} = r_{p,s} = ∞ when s fails to satisfy the convergence test on p
▶ The performance profile of s is the fraction of problems where the performance ratio is at most α:
\[ \rho_s(\alpha) = \frac{1}{|P|}\,\text{size}\{p \in P : r_{p,s} \leq \alpha\} \]
▶ It is the cumulative distribution function of the ratio r_{p,s}
▶ ρ_s(1) is the fraction of problems for which s performs the best
▶ For α sufficiently large, ρ_s(α) is the fraction of problems solved by s
▶ Solvers with high values of ρ_s are preferable
Performance profiles: Example
Accurate view of the performance for τ = 10⁻³. Taken from [Moré and Wild, 2009]
Example 2
Draw the performance profiles for the following table:
    t_{p,s}   Pb 1   Pb 2
    Algo. 1   35     ∞
    Algo. 2   ∞      1,200
    Algo. 3   112    500

t_{p,s} represents the number of evaluations for which the convergence test succeeded, for a given value of τ
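A minimal NumPy sketch (illustrative; not the official Moré–Wild MATLAB tools) that computes ρ_s(α) for this table:

```python
import numpy as np

# t[p, s] from the table above (Example 2); np.inf marks a failure.
T = np.array([[35.0, np.inf, 112.0],     # Pb 1
              [np.inf, 1200.0, 500.0]])  # Pb 2

# Performance ratios r[p, s] = t[p, s] / min_a t[p, a]
R = T / T.min(axis=1, keepdims=True)

def rho(s, alpha):
    """Fraction of problems where solver s attains r[p, s] <= alpha."""
    return float(np.mean(R[:, s] <= alpha))

for s, name in enumerate(["Algo. 1", "Algo. 2", "Algo. 3"]):
    print(name, [rho(s, a) for a in (1.0, 2.4, 3.2)])
# Algo. 1: [0.5, 0.5, 0.5]   (best on Pb 1; never solves Pb 2)
# Algo. 2: [0.0, 0.5, 0.5]   (ratio 1200/500 = 2.4 on Pb 2)
# Algo. 3: [0.5, 0.5, 1.0]   (best on Pb 2; ratio 112/35 = 3.2 on Pb 1)
```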
Data profiles
▶ We are interested in the percentage of problems that can be solved, for a given tolerance τ, with a variable budget of evaluations
▶ The data profile of Solver s is
\[ d_s(\kappa) = \frac{1}{|P|}\,\text{size}\left\{p \in P : \frac{t_{p,s}}{n_p + 1} \leq \kappa\right\}, \]
where n_p is the number of variables in Problem p
▶ It represents the percentage of problems that can be solved with κ groups of n_p + 1 function evaluations, or simplex gradient estimates
▶ n_p + 1 is the number of evaluations needed to compute a one-sided finite-difference estimate of the gradient
Data profiles: Example
Taken from [Moré and Wild, 2009]
Example 3
Draw the data profiles for the following table.
    t_{p,s}   Pb 1   Pb 2
    Algo. 1   35     ∞
    Algo. 2   ∞      1,200
    Algo. 3   112    500

t_{p,s} represents the number of evaluations for which the convergence test succeeded, for a given value of τ. Problem 1 has 2 variables and Problem 2 has 9 variables
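The same kind of sketch works for data profiles, dividing each t_{p,s} by n_p + 1 before thresholding (again illustrative, not the official tools):

```python
import numpy as np

# t[p, s] from the table above (Example 3); np.inf marks a failure.
T = np.array([[35.0, np.inf, 112.0],     # Pb 1, n_p = 2
              [np.inf, 1200.0, 500.0]])  # Pb 2, n_p = 9
n = np.array([2, 9])                     # number of variables per problem

# Cost in units of simplex gradients: t[p, s] / (n_p + 1)
G = T / (n + 1)[:, None]

def d(s, kappa):
    """Fraction of problems solver s solves within kappa simplex gradients."""
    return float(np.mean(G[:, s] <= kappa))

for s, name in enumerate(["Algo. 1", "Algo. 2", "Algo. 3"]):
    print(name, [d(s, k) for k in (12, 50, 120)])
# Algo. 1: [0.5, 0.5, 0.5]   (35/3 ≈ 11.7 on Pb 1)
# Algo. 2: [0.0, 0.0, 0.5]   (1200/10 = 120 on Pb 2)
# Algo. 3: [0.0, 1.0, 1.0]   (112/3 ≈ 37.3 and 500/10 = 50)
```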
Simplified performance profiles
▶ Consider only the final solutions. The stopping criterion must be the same for all the algorithms (i.e., same budget of evaluations)
▶ x-axis: tolerance τ
▶ y-axis: percentage of problems solved for the tolerance τ
▶ "solved" must be defined
▶ At τ = 0:
  ▶ We see the methods that gave the best solutions
  ▶ The performance values may sum up to a value < 100%
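As a hedged sketch of how such a curve could be computed from final values only, using the earlier convergence test as the definition of "solved"; all names and data below are invented for illustration:

```python
import numpy as np

def solved_fraction(f0, f_final, f_best, tau):
    """Fraction of problems where the solver's final value passes the
    test f(x0) - f >= (1 - tau) * (f(x0) - fL) for the given tau."""
    f0, f_final, f_best = map(np.asarray, (f0, f_final, f_best))
    return float(np.mean(f0 - f_final >= (1 - tau) * (f0 - f_best)))

# Invented data: 3 problems, one solver's final values after a common
# budget, and the best final value found by any solver (fL).
f0      = [10.0, 5.0, 100.0]
f_final = [0.5, 5.0, 1.0]
f_best  = [0.0, 2.0, 1.0]
for tau in (0.0, 0.05, 0.5):
    print(tau, solved_fraction(f0, f_final, f_best, tau))
# 0.0  -> 0.333... (only the problems where this solver is best)
# 0.05 -> 0.666...
# 0.5  -> 0.666...
```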
Simplified performance profiles: Example
[Figure: simplified performance profile on GRIEWANK, number of problems (%) vs. precision, comparing MADS alone, Quadratic, and TGP]
Simplified data profiles
▶ We consider the entire history of all executions: we focus on the convergence
▶ The tolerance τ is fixed
▶ x-axis: convergence measure: number of evaluations, or number of simplex gradients when the problems have different dimensions
▶ y-axis: percentage of problems solved for the tolerance τ
▶ For x = 0, we should have y = 0
▶ For x = x_max, we should observe the same values as on the performance profiles at τ
▶ Equivalent to the original version. Only the presentation differs
Simplified data profiles: Example
[Figure: simplified data profiles (τ = 10⁻³), number of problems (%) vs. number of evaluations, comparing model search + model ordering, model search only, model ordering only, and no models]
Extensions
▶ How to extend performance and data profiles:
  ▶ To the constrained case?
  ▶ With parallelism?
  ▶ With stochastic algorithms?
▶ Several choices have to be made. This is the subject of Homework #1 for the constrained case
▶ Accuracy profiles [Audet and Hare, 2017] for the robustness and quality of the final solution
References
▶ Performance and data profiles [Dolan and Moré, 2002, Moré and Wild, 2009]
▶ Test problems [Hock and Schittkowski, 1981, Lukšan and Vlček, 2000, Moré and Wild, 2009, Gould et al., 2015]
▶ Some benchmark papers [Whitley et al., 2006, Fowler et al., 2008, Moré and Wild, 2009, Rios and Sahinidis, 2013, Martelli and Amaldi, 2014]
Audet, C., Béchard, V., and Le Digabel, S. (2008). Nonsmooth optimization through Mesh Adaptive Direct Search and Variable Neighborhood Search. Journal of Global Optimization, 41(2):299–318.

Audet, C. and Hare, W. (2017). Derivative-Free and Blackbox Optimization. Springer Series in Operations Research and Financial Engineering. Springer International Publishing, Berlin.

Dolan, E. and Moré, J. (2002). Benchmarking optimization software with performance profiles. Mathematical Programming, 91(2):201–213.

Fowler, K., Reese, J., Kees, C., Dennis Jr., J., Kelley, C., Miller, C., Audet, C., Booker, A., Couture, G., Darwin, R., Farthing, M., Finkel, D., Gablonsky, J., Gray, G., and Kolda, T. (2008). Comparison of derivative-free optimization methods for groundwater supply and hydraulic capture community problems. Advances in Water Resources, 31(5):743–757.

Gould, N., Orban, D., and Toint, P. (2015). CUTEst: a Constrained and Unconstrained Testing Environment with safe threads for mathematical optimization. Computational Optimization and Applications, 60(3):545–557. Code available at http://ccpforge.cse.rl.ac.uk/gf/project/cutest/wiki.
Gramacy, R., Gray, G., Le Digabel, S., Lee, H., Ranjan, P., Wells, G., and Wild, S. (2015). Modeling an Augmented Lagrangian for Improved Blackbox Constrained Optimization. To appear in Technometrics (with discussion).

Hock, W. and Schittkowski, K. (1981). Test Examples for Nonlinear Programming Codes, volume 187 of Lecture Notes in Economics and Mathematical Systems. Springer Verlag, Berlin, Germany.

Lukšan, L. and Vlček, J. (2000). Test problems for nonsmooth unconstrained and linearly constrained optimization. Technical Report V-798, ICS AS CR.

Martelli, E. and Amaldi, E. (2014). PGS-COM: A hybrid method for constrained non-smooth black-box optimization problems: Brief review, novel algorithm and comparative evaluation. Computers and Chemical Engineering, 63:108–139.

Moré, J. and Wild, S. (2009). Benchmarking derivative-free optimization algorithms. SIAM Journal on Optimization, 20(1):172–191. Test problems and results available at http://www.mcs.anl.gov/~more/dfo/.
Rios, L. and Sahinidis, N. (2013). Derivative-free optimization: a review of algorithms and comparison of software implementations. Journal of Global Optimization, 56(3):1247–1293.

Whitley, D., Lunacek, M., and Sokolov, A. (2006). Comparing the Niches of CMA-ES, CHC and Pattern Search Using Diverse Benchmarks. In Parallel Problem Solving from Nature – PPSN IX, volume 4193 of Lecture Notes in Computer Science, pages 988–997. Springer Berlin / Heidelberg.