
Page 1: INTRODUCTION TO OPTIMISATION

Annalisa Riccardi†, Edmondo Minisci†, Kerem Akartunalı‡

† Department of Mechanical and Aerospace Engineering
‡ Department of Management Science

University of Strathclyde, Glasgow (United Kingdom)

November 20, 2017

Page 2: Outline

• Terminology, optimality conditions, local optimisation algorithms

• Global optimisation approaches, multiobjective optimisation

• Network and combinatorial optimisation, integer programming

Page 3: Optimisation

Optimisation is derived from the Latin word optimus, "the best".

Optimisation characterises the activities involved in finding the best.

People have been optimising forever, but the roots of modern-day (engineering) optimisation can be traced to the Second World War.

Applications in the service industries did not start until the mid-1960s.

Page 4: Programming

You will often hear the phrase "programming", as in mathematical programming, linear programming, nonlinear programming, mixed integer programming, etc.

This has (in principle) nothing to do with modern-day computer programming.

In the early days, a set of values which represented a solution to a problem was referred to as a program. Nowadays you program (software) to find a program!

Page 5: Challenges

• Problem formulation: the problem, explained in "engineering" terms, needs to be translated into its mathematical formulation (objectives and constraints). Regularity of the model is an issue if gradient-based optimisation techniques are applied.

• Algorithm selection: the advantages and disadvantages of each method need to be assessed according to the problem to be solved.

• Complexity: both in terms of a large number of optimisation variables and in terms of expensive function evaluation(s).

Page 6: Mathematical Formulation

• x ∈ Ω ⊂ R^{n_C} × Z^{n_D} is the vector of optimisation variables

• f(x) : Ω → R^m is the objective function

• g(x) : Ω → R^{n_i} is the inequality constraint function

• h(x) : Ω → R^{n_e} is the equality constraint function

Definition (Optimisation Problem)

min_{x ∈ Ω} f(x), subject to g(x) ≤ 0, h(x) = 0

The set of points satisfying the constraints is called the feasible region

D = {x ∈ Ω | g(x) ≤ 0, h(x) = 0}
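
To make the notation concrete, here is a minimal Python sketch of this formulation on a hypothetical toy problem (the functions f, g, h below are invented for illustration, not taken from the slides):

```python
import numpy as np

# A toy instance of the general formulation (hypothetical example):
# minimise f(x) = x1^2 + x2^2 subject to g(x) = 1 - x1 - x2 <= 0
# (inequality) and h(x) = x1 - x2 = 0 (equality).

def f(x):
    return x[0]**2 + x[1]**2               # objective, m = 1

def g(x):
    return np.array([1.0 - x[0] - x[1]])   # inequality constraints, n_i = 1

def h(x):
    return np.array([x[0] - x[1]])         # equality constraints, n_e = 1

def is_feasible(x, tol=1e-8):
    """Membership test for the feasible region D."""
    return np.all(g(x) <= tol) and np.all(np.abs(h(x)) <= tol)

print(is_feasible(np.array([0.5, 0.5])))   # True: on both constraint boundaries
print(is_feasible(np.array([0.0, 0.0])))   # False: violates g(x) <= 0
```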

Page 7: Optimisation Problems Taxonomy

• Continuous (n_C > 0, n_D = 0), Discrete (n_C = 0, n_D > 0) or Mixed Integer (n_C > 0, n_D > 0): about the nature of the set of optimisation variables

• Single (m = 1), Multi (1 < m < 4) or Many Objectives (m ≥ 4): about the size of the objective space

• Constrained (n_i > 0 and/or n_e > 0) or Unconstrained (n_i = 0 and n_e = 0): about the size of the constraints space

• Linear or Nonlinear: about the linearity/nonlinearity of the objective and constraint functions

• Local or Global: about the nature of the fitness landscape

Page 8: Pareto Optimality

In the case of single objective optimisation the optimum is a point; in the case of multi (and many) objective optimisation the optimum is a set of points.

Definition (Pareto dominance)
A point x1 ∈ D Pareto dominates x2 ∈ D (for a minimisation problem) if

f_i(x1) ≤ f_i(x2), ∀i = 1, ..., m

and there is at least one component j ∈ {1, ..., m} such that f_j(x1) < f_j(x2). This is indicated by x1 ≺ x2.

Definition (Pareto optimality)
A point x* ∈ D is Pareto optimal if it is not dominated by any x ∈ D.

Page 9: Pareto Front

Definition (Pareto Optimal Set)
For a multi-objective optimisation problem the Pareto Optimal Set is defined as

P* = {x ∈ D | ∄ x′ ∈ D : x′ ≺ x}.

Definition (Pareto Front)
The union of the objective values of all Pareto optimal points is called the Pareto front, or equivalently

PF = {f(x) ∈ R^m | x ∈ P*}.
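
A brute-force check of these definitions is straightforward to code. The following Python sketch (with invented bi-objective values F) filters a finite set of candidate points down to its non-dominated subset:

```python
import numpy as np

def dominates(f1, f2):
    """Pareto dominance for minimisation: f1 <= f2 componentwise, strictly somewhere."""
    return np.all(f1 <= f2) and np.any(f1 < f2)

def pareto_optimal_set(points):
    """Indices of non-dominated points (brute force, O(N^2))."""
    return [i for i, fi in enumerate(points)
            if not any(dominates(fj, fi) for j, fj in enumerate(points) if j != i)]

# Hypothetical objective values f(x) for five candidate solutions
F = np.array([[1.0, 4.0], [2.0, 2.0], [3.0, 3.0], [4.0, 1.0], [2.5, 2.5]])
idx = pareto_optimal_set(F)
print(idx, F[idx])   # points [1,4], [2,2], [4,1] form the Pareto front
```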

Page 10: Example

Page 11: Multi-Objectives to Single Objectives

Some examples (not exhaustive)

• Weighted Sum Approach (with weight coefficients w_i ≥ 0 and Σ_{i=1}^m w_i = 1)

min_{x ∈ Ω} Σ_{i=1}^m w_i f_i(x), s.t. g(x) ≤ 0; h(x) = 0.

• ε-Constrained

min_{x ∈ Ω} f_j(x), s.t. g(x) ≤ 0; h(x) = 0; f_i(x) ≤ ε ∀i = 1, ..., m, i ≠ j

• Goal Programming (with target values T_i ∈ R)

min_{x ∈ Ω} Σ_{i=1}^m |f_i(x) − T_i|, s.t. g(x) ≤ 0; h(x) = 0.
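
As an illustration of the weighted sum approach, here is a minimal Python sketch using scipy.optimize.minimize on a hypothetical unconstrained bi-objective problem (both objectives are invented); sweeping the weights traces out points of the Pareto front:

```python
import numpy as np
from scipy.optimize import minimize

# Two hypothetical conflicting objectives on x in R^2
f1 = lambda x: (x[0] - 1.0)**2 + x[1]**2
f2 = lambda x: x[0]**2 + (x[1] - 1.0)**2

def weighted_sum(w):
    """Solve the scalarised problem for weights w (w >= 0, sum w = 1)."""
    res = minimize(lambda x: w[0]*f1(x) + w[1]*f2(x), x0=np.zeros(2))
    return res.x

# Each weight vector yields (at most) one Pareto optimal point
for w1 in (0.0, 0.25, 0.5, 0.75, 1.0):
    x = weighted_sum((w1, 1.0 - w1))
    print(f"w1={w1:.2f} -> x={np.round(x, 3)}")
```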

Page 12: Constrained to Unconstrained

Some examples (not exhaustive). The examples refer only to the case of an inequality constrained optimisation problem.

• Penalisation (with penalty terms p_i ≥ 0): the constraint violation is added as penalty terms to the objective function

min_{x ∈ Ω} f(x) + Σ_{i=1}^{n_i} p_i max{0, g_i(x)}

• Multiobjective: the constraints (or the sum of the constraint violations) are added to the list of objectives

min_{x ∈ Ω} [f(x), g_1(x), ..., g_{n_i}(x)];   min_{x ∈ Ω} [f(x), Σ_{i=1}^{n_i} max{0, g_i(x)}]
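
A minimal sketch of the penalisation idea, assuming a single inequality constraint and a toy objective (both invented); the squared violation is used here instead of the slide's max{0, g(x)} term, purely to keep the penalised function smooth for the local optimiser:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical problem: minimise f(x) = x1^2 + x2^2
# subject to g(x) = 1 - x1 - x2 <= 0 (constrained optimum at (0.5, 0.5))
f = lambda x: x[0]**2 + x[1]**2
g = lambda x: 1.0 - x[0] - x[1]

def penalised(x, p):
    # f(x) plus a penalty on the constraint violation; the squared form
    # p * max(0, g(x))**2 is a smooth variant of p * max(0, g(x))
    return f(x) + p * max(0.0, g(x))**2

x = np.zeros(2)
for p in (1.0, 10.0, 100.0):        # increasing penalty terms p >= 0
    x = minimize(penalised, x, args=(p,)).x
print(np.round(x, 3))               # approaches the constrained optimum (0.5, 0.5)
```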

Page 13: Optimality Conditions - definitions

Definition (Lagrangian)
The real function L : Ω × R^{n_i + n_e} → R defined as

L(x, µ, λ) = f(x) + µ^T g(x) + λ^T h(x)

is the Lagrangian, and the coefficients µ ∈ R^{n_i} and λ ∈ R^{n_e} are called Lagrange multipliers.

Definition (Active Set)
Given a point x in the feasible region D, the active set A(x) is defined as A(x) = {i ∈ I | g_i(x) = 0} ∪ I_e, with index sets I = {1, ..., n_i} and I_e = {1, ..., n_e}.

Definition (LICQ)
The Linear Independence Constraint Qualification (LICQ) condition holds at a point x ∈ Ω if the gradients of the active inequality constraints and the gradients of the equality constraints are linearly independent.

Page 14: Optimality Conditions

Theorem (First-order necessary conditions)
Suppose that f, g and h are continuously differentiable, that x* is a local solution of the constrained problem, and that the LICQ holds at x*. Then a Lagrange multiplier vector (µ*, λ*) exists such that the following conditions are satisfied:

∇_x L(x*, µ*, λ*) = 0   (1)

g(x*) ≤ 0;  h(x*) = 0   (2)

µ*_i ≥ 0, ∀i = 1, ..., n_i   (3)

µ*_i g_i(x*) = 0, ∀i = 1, ..., n_i   (4)

These conditions are known as the Karush-Kuhn-Tucker (KKT) conditions.
Remark: the last condition implies that the Lagrange multipliers corresponding to inactive inequality constraints are zero.
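
As a worked illustration of conditions (1)-(4) on a hypothetical toy problem (invented here, not from the slides):

```latex
% Toy problem: min f(x) = x_1^2 + x_2^2
% subject to g(x) = 1 - x_1 - x_2 \le 0, with L(x, \mu) = f(x) + \mu g(x).
\nabla_x L = \bigl(2x_1 - \mu,\; 2x_2 - \mu\bigr) = 0
  \;\Rightarrow\; x_1 = x_2 = \mu/2.
% Complementarity \mu^\ast g(x^\ast) = 0 with \mu > 0 forces the constraint active:
x_1 + x_2 = 1 \;\Rightarrow\; \mu^\ast = 1 \ge 0, \qquad x^\ast = (\tfrac{1}{2}, \tfrac{1}{2}).
% Conditions (1)-(4) all hold at (x^\ast, \mu^\ast), so it is a KKT point.
```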

Page 15: Optimality Conditions

Theorem (Second-order necessary condition)
Let f, g and h be twice continuously differentiable, let x* be a local solution of the constrained problem, and assume the LICQ condition is satisfied. Let (µ*, λ*) be the Lagrange multipliers for which the triple (x*, µ*, λ*) satisfies the KKT conditions. Then

ω^T ∇²_{xx} L(x*, µ*, λ*) ω ≥ 0, ∀ω ∈ C(µ*, λ*)

where C(µ*, λ*) is the critical cone, i.e. "the part of the cone of feasible directions for which the behaviour of f is not clear from its first derivative".

Page 16: Unconstrained NLP algorithms

NLP algorithms start from an initial guess x0 and move towards a direction of "improvement".

• Line Search: the algorithm determines a search direction p_k and searches along this direction from the current iterate x_k for a new iterate with a lower function value.

• Trust Region: the algorithm constructs a model function m_k whose behaviour near the current iterate x_k is similar to that of the actual objective function f. The search step p_k is found as the step that minimises m_k within the trust region around x_k.

Page 17: Examples

Examples of line search directions (a Trust Region counterpart exists for each)

• Steepest Descent method: chooses as search direction the descent direction p_k = −∇f(x_k)

• Newton method: the search direction is the solution of the Newton equation, p_k = −H(x_k)^{-1} ∇f(x_k)

• Quasi-Newton methods: they don't require the computation of the second order derivatives but use an approximation of them, denoted B: p_k = −B(x_k)^{-1} ∇f(x_k)
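
A minimal Python sketch contrasting the first two directions on an invented quadratic test function (a fixed step length stands in for a proper line search, for brevity):

```python
import numpy as np

# Hypothetical smooth test function: f(x) = (x1 - 3)^2 + 10*(x2 + 1)^2
grad = lambda x: np.array([2*(x[0] - 3), 20*(x[1] + 1)])
hess = lambda x: np.diag([2.0, 20.0])

x_sd = np.zeros(2)
for _ in range(200):                       # steepest descent, fixed step
    x_sd = x_sd - 0.05 * grad(x_sd)        # p_k = -grad f(x_k)

x_nt = np.zeros(2)
for _ in range(5):                         # Newton: solve H p = -grad f
    x_nt = x_nt + np.linalg.solve(hess(x_nt), -grad(x_nt))

print(np.round(x_sd, 4), np.round(x_nt, 4))  # both approach (3, -1)
```

On a quadratic the Newton step reaches the minimiser in one iteration, while steepest descent needs many small steps; that gap is exactly why the (quasi-)Newton directions above are attractive.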

Page 18: Constrained NLP algorithms

• Penalty, Barrier, Augmented Lagrangian methods and Sequential linearly constrained methods: they solve a sequence of simpler subproblems (unconstrained or with simple linearised constraints) related to the original one. The solutions of the subproblems converge to the solution of the primal one either in a finite number of steps or in the limit.

• Newton-like methods: they try to find a point satisfying the necessary conditions of optimality (KKT conditions in general). The Sequential Quadratic Programming (SQP) method is part of this class.

Page 19: Optimal Control

Finding the control laws u(t) ∈ C⁰_p([t0, tf]; R^{n_c}) and states x(t) ∈ C⁰_p([t0, tf]; R^{n_s}) for a given system that minimise a cost functional subject to initial and final states as well as path constraints.

Definition (Optimal Control Problem)

min_{u,x} φ(x(tf)) + ∫_{t0}^{tf} f_0(x(t), u(t)) dt

subject to

ẋ(t) = f(x(t), u(t))

c(x(t), u(t)) ≤ 0, t ∈ [t0, tf]

ω(x(t0), x(tf)) = 0

Page 20: Solution Methods

• Indirect: they convert the optimal control problem into a boundary value problem using the necessary conditions of Pontryagin's minimum principle.

• Direct: discretisation of the control and state functions, and approximation of the infinite dimensional optimal control problem by an NLP problem (using numerical methods to approximate the integrals and the ODE solution). Examples: multiple shooting, collocation methods.
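
A minimal sketch of the direct approach on an invented toy problem (minimise ∫ u² dt with ẋ = u, x(0) = 0, x(tf) = 1), using Euler discretisation and SciPy's SLSQP for the resulting NLP; the discretisation size and solver choice are illustrative assumptions:

```python
import numpy as np
from scipy.optimize import minimize

N, tf = 20, 1.0          # number of Euler steps, final time
dt = tf / N

def cost(u):
    return np.sum(u**2) * dt             # quadrature of the running cost f0

def final_state_defect(u):
    x = 0.0
    for uk in u:                          # Euler integration of xdot = f(x, u) = u
        x = x + dt * uk
    return x - 1.0                        # boundary condition omega(x(t0), x(tf)) = 0

res = minimize(cost, np.zeros(N),
               constraints={"type": "eq", "fun": final_state_defect})
print(np.round(res.x, 3))                 # optimal control is constant u = 1
```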

Page 21: References

• Numerical Optimization, by Jorge Nocedal and Stephen J. Wright (Springer, 2006)

• Numerical Optimization - Theoretical and Practical Aspects, by J. Frédéric Bonnans, J. Charles Gilbert, Claude Lemaréchal, and Claudia A. Sagastizábal (Springer, 2006)

• Introduction to Applied Optimization, 2nd Ed., by Urmila Diwekar (Springer, 2008)

• Optimization Theory and Methods - Nonlinear Programming, by Wenyu Sun and Ya-Xiang Yuan (Springer, 2006)

• Linear and Nonlinear Programming, 4th Ed., by David G. Luenberger and Yinyu Ye (Springer, 2016)

Page 22: Local and Global Optimal solutions

For convex optimisation problems there is no need to distinguish between local and global optimal solutions, because a locally optimal solution is also globally optimal. Convex optimisation problems include:

• LP problems;

• QP problems where the objective is positive definite (if minimising; negative definite if maximising); and

• NLP problems where the objective is a convex function (if minimising; concave if maximising) and the constraints (if existing/formulated) form a convex set.

But all the other NLP problems are non-convex and generally have multiple locally optimal solutions.

Page 23: Local and Global Optimal solutions - Convex sets

Let S ⊂ R^n. If, for any x1, x2 ∈ S, we have αx1 + (1 − α)x2 ∈ S, ∀α ∈ [0, 1], then S is said to be a convex set. The point x = αx1 + (1 − α)x2, with α ∈ [0, 1], is called a convex combination of x1 and x2. Equivalently: for any two points x1, x2 ∈ S, the line segment joining x1 and x2 is entirely contained in S.

Page 24: Local and Global Optimal solutions - Convex functions

Let S ⊂ R^n be a non-empty convex set and let f : S ⊂ R^n → R. If, for any x1, x2 ∈ S and all α ∈ [0, 1], we have f(αx1 + (1 − α)x2) ≤ αf(x1) + (1 − α)f(x2), then f is said to be convex on S. If the above inequality holds strictly for all x1 ≠ x2, then f is called a strictly convex function on S. If there is a constant c > 0 such that, for any x1, x2 ∈ S,

f(αx1 + (1 − α)x2) ≤ αf(x1) + (1 − α)f(x2) − (1/2) c α(1 − α) ‖x1 − x2‖²₂,

then f is called a uniformly (or strongly) convex function on S.

Page 25: Local vs Global Optimisation

In Global Optimisation, we distinguish between (consider minimisation):

• Local minimum f* = f(x*), local minimiser x*
  • smallest function value in some feasible neighbourhood
  • x* ∈ S
  • there exists a δ > 0 such that f* ≤ f(x) ∀x ∈ {x ∈ S : |x − x*| < δ}

• Global minimum f* = f(x*), global minimiser x*
  • smallest function value over all feasible points
  • f* ≤ f(x) ∀x ∈ S

Page 26: Global Approaches

The aim is to find a global minimiser of the function f. The search for a solution is commonly made up of two components:

• Global exploration

• Local exploitation

The global component is used to prevent local convergence and to globally characterise the solution space. The local component is used for accurate convergence to locally optimal solutions. The critical point is to make a balanced use of both components. The complexity/cost of global optimisation methods grows exponentially with the problem size.

Page 27: Global Approaches

Basin of Attraction: the set of solutions that tend to converge to the same attractor (defined by the characteristics of the problem and the characteristics of the search algorithm).

Global exploration: to find the optimal basin of attraction. Local exploitation: to converge to the local minimum.

Traditional techniques for global optimisation, such as (a) branch and bound, (b) grid and multi-grid algorithms, and (c) multi-start algorithms, are usually used for problems with a small number of variables. Other (stochastic) algorithms, such as evolutionary algorithms, have proved to be an extremely valid alternative to the previously mentioned ones.

Page 28: Global Approaches - Deterministic vs Stochastic algorithms

All the algorithms seen up to now (local methods) are deterministic. For global search we have both deterministic and stochastic algorithms.

Wikipedia (so, common knowledge):

• In computer science, a deterministic algorithm is an algorithm which, given a particular input, will always produce the same output, with the underlying system always passing through the same sequence of states.

• Stochastic algorithms are algorithms with one or more steps depending on random numbers (given a particular input, they will not always produce the same output, and the underlying system will not always pass through the same sequence of states).

Page 29: Global Approaches - Heuristics

Most of the methods contain heuristic techniques or steps.

Wikipedia: A heuristic technique, or simply a heuristic, is any approach to problem solving, learning, or discovery that employs a practical method not guaranteed to be optimal or perfect, but sufficient for the immediate goals. Where finding an optimal solution is impossible or impractical, heuristic methods can be used to speed up the process of finding a satisfactory solution.

Heuristics are strategies derived from experience with similar problems.

Page 30: Global Approaches - Grid and Multigrid Search

This is the most straightforward deterministic approach.

• The function f is evaluated at a set of regular grid points in the solution domain S

• From each point of the grid, a finer grid can be built locally where the coarser grid reveals good values of f

The procedure is analogous to an enumerative search of a discrete set of values.
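
A minimal grid-and-refinement sketch in Python, on an invented multimodal function (grid sizes and the refinement radius are arbitrary choices):

```python
import numpy as np
from itertools import product

# Hypothetical multimodal function on S = [-5, 5]^2
f = lambda x: np.sin(3*x[0]) + np.sin(3*x[1]) + 0.1*(x[0]**2 + x[1]**2)

def grid_search(lo, hi, n):
    """Evaluate f on a regular n x n grid and return the best point."""
    axis = np.linspace(lo, hi, n)
    return min(product(axis, axis), key=f)

def refine(center, radius, n):
    """Finer local grid around a promising point of the coarse grid."""
    ax0 = np.linspace(center[0] - radius, center[0] + radius, n)
    ax1 = np.linspace(center[1] - radius, center[1] + radius, n)
    return min(product(ax0, ax1), key=f)

best = grid_search(-5.0, 5.0, 50)    # coarse regular grid over S
best = refine(best, 0.2, 20)         # finer grid where f looked good
print(np.round(best, 4), round(f(best), 4))
```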

Page 31: Global Approaches - DIRECT (DIviding RECTangles)

A deterministic search algorithm based on the systematic division of the search domain into smaller and smaller hyperrectangles. The algorithm generates a predefined number N_s of sample points over a grid in a box-constrained feasible area, starting from the scaled midpoint x1 = 0.5(1, 1, ..., 1)^T ∈ S ⊂ R^d.

All sample points x1, x2, ..., x_{N_s} are stored as potential places where refinement may take place. Refinement of the generic solution x_k consists of "sampling more" in a region around x_k.

Page 32: Global Approaches - Multi-start algorithms

The simple idea behind multi-start algorithms is to pick a number of points in the search space and start a local search from each one of them (e.g., the local search can be performed with a gradient based method). The initial grid can be set deterministically or stochastically.

1 Set k = 1, f_best = +∞ (minimisation), perform a DOE

2 Select point y_k from the DOE set

3 Run a local optimiser from y_k and let x_k be the detected local minimum

4 Evaluate the objective function f(x_k)

5 If f(x_k) < f_best then x_best = x_k, f_best = f(x_k)

6 Set n_eval = n_eval + n_eval,k

7 Termination: unless n_eval ≥ n_eval,max, set k = k + 1 and go to Step 2
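
The loop above translates almost directly into code. A minimal sketch, assuming a random (stochastic) DOE, SciPy's BFGS as the local optimiser, and an invented Rastrigin-like objective; the budget numbers are arbitrary:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical multimodal objective (Rastrigin-like) on [-5, 5]^2
f = lambda x: 10*len(x) + sum(xi**2 - 10*np.cos(2*np.pi*xi) for xi in x)

rng = np.random.default_rng(0)
doe = rng.uniform(-5, 5, size=(30, 2))     # Step 1: stochastic DOE of 30 points

f_best, x_best, n_eval = np.inf, None, 0
for y_k in doe:                            # Step 2: pick y_k from the DOE set
    res = minimize(f, y_k, method="BFGS")  # Step 3: local optimiser from y_k
    n_eval += res.nfev                     # Step 6: accumulate evaluations
    if res.fun < f_best:                   # Steps 4-5: keep the incumbent
        f_best, x_best = res.fun, res.x
    if n_eval >= 5000:                     # Step 7: evaluation budget
        break

print(np.round(x_best, 4), round(f_best, 6))  # global minimum is f(0, 0) = 0
```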

Page 33: Global Approaches - Design of Experiment (DOE)

DOE is a way of choosing samples in the design space in order to get the maximum amount of information using the minimum amount of resources, that is, with the lowest number of samples. DOE is normally used:

• to start the global search with population based methods or with grid and multi-start methods (see the multi-start algorithm above)

• to build surrogates

• to perform sensitivity analyses

Page 34: Global Approaches

An appropriate application of search algorithms involves both recognising what kind of system the user is dealing with and knowing the right algorithm to apply, but these aspects are not easy to handle. Stochastic algorithms are able to handle complex and non-identifiable optimisation problems. These algorithms have proved to be extremely robust, meaning they are effective within a wide range of applications, but they pay for their robustness with a generally low efficiency in terms of computational resources. The No-Free-Lunch (NFL) theorems confirm this behaviour. (D.H. Wolpert and W.G. Macready. No Free Lunch Theorems for Optimization. IEEE Transactions on Evolutionary Computation, 1(1):67-82, April 1997.)

Page 35: Global Approaches - No-Free-Lunch (NFL)

The main theorem states: if any algorithm A outperforms another algorithm B in the search for an extremum of an objective function, then algorithm B will outperform A over other objective functions.

The NFL theorem(s) suggest that the average performance over all possible objective functions is the same for all search algorithms.

All optimisation algorithms give the same average performance when averaged over all possible functions, which means that a universally best method does not exist for all optimisation problems.

Page 36: Global Approaches - Stochastic Methods

Nature provides some of the most efficient ways to solve problems. Algorithms imitating processes in nature, or inspired by nature, are called Nature Inspired Algorithms.

Page 37: Global Approaches - Simulated Annealing

In 1983, Kirkpatrick et al. proposed a method of using a Metropolis Monte Carlo simulation to find the lowest energy (most stable) orientation of a system. It is called Simulated Annealing because it is inspired by the annealing process of metals during cooling. Annealing: at high temperatures the molecules in a metal move freely, but as the metal is cooled this movement is gradually reduced and atoms align to form crystals. The crystalline form constitutes a state of minimum energy, and annealed metals have better mechanical characteristics.
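
A minimal simulated annealing sketch on an invented one-dimensional multimodal function, using the Metropolis acceptance rule and a geometric cooling schedule (all hyperparameters are arbitrary choices):

```python
import numpy as np

# Hypothetical 1-D multimodal function
f = lambda x: x**2 + 10*np.sin(3*x)
rng = np.random.default_rng(1)

x, fx = 4.0, f(4.0)
T = 5.0                                   # initial "temperature"
for _ in range(5000):
    x_new = x + rng.normal(scale=0.5)     # random neighbour of the current state
    d = f(x_new) - fx
    # Metropolis rule: always accept improvements; accept worsening
    # moves with probability exp(-d/T), which shrinks as T cools
    if d < 0 or rng.random() < np.exp(-d / T):
        x, fx = x_new, fx + d
    T *= 0.999                            # geometric cooling schedule
print(round(x, 4), round(fx, 4))
```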

Page 38: Global Approaches - Evolutionary Algorithms (EAs)

EAs are stochastic search methods that take their inspiration from natural selection and survival of the fittest in the biological world. By analogy to natural evolution, the solution candidates are called individuals, the set of solution candidates is called the population, and each optimisation iteration is a generation.

Each individual represents a possible solution, i.e., a decision vector, to the problem at hand.

Sometimes an individual is not a decision vector but rather encodes it, based on an appropriate representation.

Page 39: Global Approaches - EAs

Most EAs are population based. A general evolutionary algorithm is a stochastic one which: (a) principally memorises a population of solutions; (b) has some kind of mating selection; (c) has some kind of recombination and mutation as variation operators; and (d) has some kind of environment selection.

Page 40: Global Approaches - EAs

Among the EAs, Genetic Algorithms (GAs) have had an enormous success. GAs were born, and are well suited, to solve combinatorial problems, but they have been successfully applied to continuous problems as well. Most of their efficacy is due to a powerful recombination operator, which, for this reason, becomes the main operator. The recombination operation used by GAs requires that the problem can be represented in a manner that makes combinations of two solutions likely to generate interesting solutions. Selecting an appropriate representation is a challenging aspect of properly applying these methods.

Page 41: Global Approaches - EAs - Genetic Algorithms

Usually a binary coding is used, and many applications have demonstrated the validity of this approach. In these cases the recombination operator is typically a one-point, two-point, multi-point, or uniform crossover.
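
For instance, a one-point crossover on binary-coded parents can be sketched as follows (string length and RNG seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)

def one_point_crossover(parent1, parent2):
    """Swap the tails of two binary-coded parents at a random cut point."""
    cut = rng.integers(1, len(parent1))          # cut strictly inside the string
    child1 = np.concatenate([parent1[:cut], parent2[cut:]])
    child2 = np.concatenate([parent2[:cut], parent1[cut:]])
    return child1, child2

p1 = np.array([0, 0, 0, 0, 0, 0, 0, 0])
p2 = np.array([1, 1, 1, 1, 1, 1, 1, 1])
print(one_point_crossover(p1, p2))   # e.g. 00011111 and 11100000
```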

Page 42: Global Approaches - Other Population Based Algorithms - Differential Evolution (DE)

DE belongs to the class of Evolution Strategy optimisers.

The main idea is to generate the variation vector v_{i,k+1} by taking the weighted difference between two solution vectors randomly chosen within the population and adding that difference to a third solution vector.
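
A minimal sketch of this scheme using the common DE/rand/1 variation rule v = x_{r1} + F (x_{r2} − x_{r3}) with binomial crossover (one standard reading of the description above; the objective, population size, F and CR are arbitrary choices):

```python
import numpy as np

f = lambda x: np.sum(x**2)                  # hypothetical objective (sphere)
rng = np.random.default_rng(3)
NP, d, F, CR = 20, 2, 0.8, 0.9              # population size, dim, weight, crossover rate

pop = rng.uniform(-5, 5, (NP, d))
for k in range(200):
    for i in range(NP):
        r1, r2, r3 = rng.choice([j for j in range(NP) if j != i], 3, replace=False)
        v = pop[r1] + F * (pop[r2] - pop[r3])   # DE/rand/1 variation vector
        mask = rng.random(d) < CR               # binomial crossover with the target
        trial = np.where(mask, v, pop[i])
        if f(trial) < f(pop[i]):                # greedy one-to-one selection
            pop[i] = trial

best = pop[np.argmin([f(x) for x in pop])]
print(np.round(best, 5))                        # approaches the minimiser (0, 0)
```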

Page 43: Global Approaches - Other Population Based Algorithms - Particle Swarm Optimisation (PSO)

PSO is a population based stochastic optimisation method inspired by the social behaviour of bird flocking and fish schooling. In PSO the potential solutions, called particles, fly through the problem space by following the current optimum particles. Each particle keeps track of its coordinates in the problem space, which are associated with the best solution it has achieved so far. The particle swarm optimisation concept consists of, at each iteration, changing the velocity of each particle x_i according to a closed-loop control mechanism.
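
A minimal PSO sketch with the standard inertia-weight velocity update (the coefficients w, c1, c2 are conventional but arbitrary choices; the objective is invented):

```python
import numpy as np

f = lambda x: np.sum(x**2)                   # hypothetical objective
rng = np.random.default_rng(4)
n, d = 15, 2
w, c1, c2 = 0.7, 1.5, 1.5                    # inertia, cognitive, social weights

x = rng.uniform(-5, 5, (n, d))               # particle positions
v = np.zeros((n, d))                         # particle velocities
pbest = x.copy()                             # personal best positions
gbest = min(x, key=f).copy()                 # global best position

for _ in range(100):
    r1, r2 = rng.random((n, d)), rng.random((n, d))
    # velocity update: each particle is pulled towards its personal
    # best and the swarm's global best (the closed-loop mechanism)
    v = w*v + c1*r1*(pbest - x) + c2*r2*(gbest - x)
    x = x + v
    for i in range(n):
        if f(x[i]) < f(pbest[i]):
            pbest[i] = x[i]
            if f(x[i]) < f(gbest):
                gbest = x[i].copy()
print(np.round(gbest, 5))
```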

Page 44: Global Approaches - Constraint Handling Techniques for Nature Inspired Methods

Despite the broad applicability of Nature Inspired methods to a wide range of domains, they are essentially unconstrained optimisation techniques. Constraint handling techniques can be roughly classified into five classes (non-exhaustive list):

• using penalty function;

• special representations and/or genetic operators (for EAs);

• repairing algorithms;

• considering separations of objectives and constraints; and

• hybrid methods.

Page 45: Global Approaches - Constraint Handling Techniques for Nature Inspired Methods - Using penalty function

This is the most common approach to handle constraints. Nature inspired methods do not usually require an initial feasible solution, so the penalty should be able to bring an unfeasible solution into the feasible region. There are at least two main choices to define a relationship between an unfeasible individual and the region of the search space:

• an individual can be penalised just for being unfeasible, regardless of its amount of constraint violation (i.e., no use of any information about how close it is to the feasible region);

• the amount of its unfeasibility can be measured and used to determine its corresponding penalty.

Page 46: Multi-Objective Problems

A multi-objective optimisation problem can be formulated as: find the vector x* = [x*_1, x*_2, ..., x*_d]^T ∈ S ⊆ R^d that satisfies the m constraints g_i(x) ≤ 0 (i = 1, 2, ..., m) and optimises the vector function f(x) = [f_1(x), f_2(x), ..., f_k(x)]^T ∈ F ⊆ R^k.

A problem should be formulated as a multi-objective one if finding the particular solution x* that yields the optimum values of all the objective functions, i.e. f_i(x*) ≤ f_i(x) ∀x ∈ S and ∀i = 1, 2, ..., k, is NOT possible.

Page 47: Multi-Objective Problems - bi-objective example - aggregation of objectives

Single-solution approaches can converge only to one point of the Pareto front at each run (not necessarily so with a weighted sum approach).

Page 48: Multi-Objective Problems

EAs (in general, population based algorithms) are particularly suitable for solving multi-objective optimisation problems, because they deal simultaneously with a set of possible solutions. EAs are less susceptible to the shape or continuity of the Pareto front. Some examples of well known Pareto based approaches will be presented. Pareto based approaches are based on the idea of calculating the fitness of individuals on the basis of Pareto dominance.

Page 49: Multi-Objective Problems

Some approaches use the dominance rank, i.e., the number of individuals dominating an individual, to determine the fitness values. Others make use of the dominance depth/class, where the population is divided into several fronts and the depth reflects which front an individual belongs to. Alternatively, the dominance count, i.e., the number of individuals dominated by a certain individual, can also be taken into account. A niching/crowding mechanism allows the algorithms to maintain individuals all along the non-dominated frontier.

Some newer algorithms use measures such as the hypervolume.

Page 50: Multi-Objective Problems - Non-dominated Sorting Genetic Algorithm (NSGA)

NSGA is based on several layers of classification of the individuals. Before selection, the population is ranked on the basis of non-domination: all non-dominated individuals are classified into one category. Sharing (a sharing function) in the objective space helps to distribute the population over the non-dominated region.

Page 51: References

• Carlos A. Coello Coello, David A. Van Veldhuizen, Gary B. Lamont, (2013). Evolutionary Algorithms for Solving Multi-Objective Problems. Springer.

• Achille Messac, (2015). Optimization in Practice with MATLAB. Cambridge Univ. Press.

• Xin-She Yang, Xingshi He (Editors), (2016). Nature-Inspired Optimization Algorithms in Engineering: Overview and Applications. Springer.

• Marco Locatelli, Fabio Schoen, (2013). Global Optimization: Theory, Algorithms, and Applications. SIAM.

Page 52: The Challenge of Combinatorial Choices

• Often we have “discrete” choices to make.
  • On which route should I drive to work?
  • In which order should I carry out my tasks today?
  • Which combination of classes should I take this semester?

• The “Devil's Triangle” of routing, scheduling and planning.

• Applications from nurse scheduling to aircraft routing, from production planning to radiotherapy optimisation.

• The most natural mathematical models use “mixed integer programming” (MIP).

• For the sake of clarity, let's focus on linear problems.
  • Format: min{c^T x | Ax ≥ b, x ∈ R^n_+ × Z^p_+}.

• Often NP-hard, but special cases exist.

• Also constraint programming may be useful.

Page 53: MIP 101: LP - Life is good!

• When p = 0, then min{c^T x | Ax ≥ b, x ∈ R^n_+ × Z^p_+} is an LP.
• Always a corner solution.
• Simplex & interior points.
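
A toy LP in this format, solved with SciPy's linprog (the HiGHS backend implements both simplex and interior point methods); the data are invented, and the ≥ constraints are negated to fit linprog's A_ub x ≤ b_ub convention:

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical instance of min{c^T x | Ax >= b, x >= 0}
c = np.array([1.0, 2.0])
A = np.array([[1.0, 1.0],
              [1.0, 3.0]])
b = np.array([2.0, 3.0])

# Ax >= b is passed as -Ax <= -b
res = linprog(c, A_ub=-A, b_ub=-b, bounds=[(0, None)] * 2, method="highs")
print(res.x, res.fun)   # optimum (1.5, 0.5) sits at a corner of the polyhedron
```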

Page 54: MIP 101: Discrete Variables

• Corners not necessarily feasible!
• Continuity lost.

Page 55: MIP 101: Discrete Variables

• Convex hull of feasible points.
• Nice theory, not so easy practice.

Page 56: MIP 101: Cutting Planes

• Valid inequalities violated by fractional corners.

Page 57: MIP 101: Branch & Bound

• Branch & Bound (divide and conquer!)
• Useful bound information.
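
A minimal branch-and-bound sketch on a hypothetical 0/1 knapsack instance (maximisation), where the fractional LP relaxation provides the bound used for pruning; the items are pre-sorted by value/weight ratio, which that bound requires:

```python
# Hypothetical 0/1 knapsack instance, items sorted by value/weight ratio
values   = [10, 7, 13, 8]
weights  = [4, 3, 6, 5]
capacity = 10

def lp_bound(i, value, room):
    """Fractional-knapsack (LP relaxation) bound for items i onwards."""
    for v, w in zip(values[i:], weights[i:]):
        if w <= room:
            value, room = value + v, room - w
        else:
            return value + v * room / w   # fractional item => upper bound
    return value

best = 0
def branch(i, value, room):
    global best
    if lp_bound(i, value, room) <= best:  # prune by bound
        return
    if i == len(values):
        best = max(best, value)           # leaf: update the incumbent
        return
    if weights[i] <= room:                # branch: take item i ...
        branch(i + 1, value + values[i], room - weights[i])
    branch(i + 1, value, room)            # ... or leave it out

branch(0, 0, capacity)
print(best)   # optimal value 23 (items with values 10 and 13)
```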

Page 58: MIP 101: Branch & Bound Tree

Figure from L. Wolsey, Integer Programming, Wiley, 1998.

• Enumeration of all possible combinations.
• Of course many will be eliminated by “pruning”.

Page 59: MIP 101: Branch & Bound Pruning

Figures from L. Wolsey, Integer Programming, Wiley, 1998.

• Prune by Optimality (top) and Bound (bottom).

Page 60: MIP Solution Methods

• Exact methods
  • Dynamic programming
  • Valid inequalities / extended reformulations
  • Dantzig-Wolfe / column generation / Benders
  • Lagrangean relaxation
  • We know the solution quality, but often significant enumeration and time.

• Heuristic methods
  • Problem-specific vs. general heuristics
  • Construction vs. improvement heuristics
  • MIP-heuristics vs. metaheuristics
  • Fast solutions, but no guarantee on finding even a feasible solution.

• Hence also often combined methods (tradeoff between solution quality and solution time)

Page 61: MIP-heuristics: Relax-and-Fix

• Option: Good formulation; fast heuristic.

• Start with overlapping “time windows”: solve the current window exactly while the later periods are relaxed.
  (Diagram: horizon of periods 1-6, first window exact, remainder relaxed.)

• Option: If time permits, a fast construction heuristic.

• Fix the solved window and continue with the next time window (until the last period): earlier periods stay fixed, later ones relaxed.
  (Diagram: horizon of periods 1-6, the window advancing over fixed and relaxed segments.)

• Option: Improvement heuristic.

Page 62: Relaxations

• LP relaxation: simply relax all integrality constraints.

• Lagrangean relaxation: remove a “difficult” constraint and instead penalise it in the objective function.
  • min{c^T x | Ax ≥ b, Dx ≥ d, x ∈ R^n_+ × Z^p_+}
  • min{c^T x − λ^T(Dx − d) | Ax ≥ b, x ∈ R^n_+ × Z^p_+}, with multipliers λ ≥ 0

• Why do we care about relaxations?

Page 63: Decompositions

• Often problems have a “special” structure.
  • E.g. a block-angular matrix
  • Many subproblems “easy” to solve.

• Dantzig-Wolfe decomposition
  • Master problem & n subproblems.

Page 64: “Nice” Problems

• A problem being “combinatorial” or MIP does not necessarily mean it is hard.
  • There is plenty in complexity theory.
  • Some are indeed polynomially solvable.

• A matrix A is totally unimodular if each of its square non-singular submatrices is unimodular (i.e., has determinant 1 or -1).
  • Very important property...
  • The set {Ax ≥ b, x ∈ R^n_+} has all integer corners (for integral b)!

• Might look cumbersome, but there is a big class of problems fitting this description: min cost flow problems.

min{c^T x | Σ_{(i,j)∈E} x_ij − Σ_{(j,i)∈E} x_ji = b_i ∀i ∈ N, x ∈ R^n_+}

Page 65: An Easy Min Cost Flow Problem

Shortest path from our venue to my favourite pub!

Page 66: An Easy Min Cost Flow Problem

min{c^T x | Σ_{(i,j)∈E} x_ij − Σ_{(j,i)∈E} x_ji = b_i ∀i ∈ N, x ∈ R^n_+}

Page 67: Shortest Path and Dijkstra

Figure from R. Ahuja, T. Magnanti, J. Orlin, Network Flows, Prentice Hall, 1993.

Page 68: Shortest Path and Dijkstra (cont'd)

Figure from R. Ahuja, T. Magnanti, J. Orlin, Network Flows, Prentice Hall, 1993.
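
Dijkstra's algorithm itself fits in a few lines of Python; a minimal sketch with a binary heap (the node names and arc costs below are invented for illustration):

```python
import heapq

def dijkstra(graph, source):
    """Shortest path distances from source; graph[u] = [(v, cost), ...]."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                         # stale heap entry, skip it
        for v, cost in graph.get(u, []):
            if d + cost < dist.get(v, float("inf")):
                dist[v] = d + cost           # found a shorter path to v
                heapq.heappush(heap, (dist[v], v))
    return dist

# Hypothetical network
graph = {"venue": [("station", 2), ("bridge", 5)],
         "station": [("bridge", 1), ("pub", 7)],
         "bridge": [("pub", 3)]}
print(dijkstra(graph, "venue"))  # {'venue': 0, 'station': 2, 'bridge': 3, 'pub': 6}
```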

Page 69: Caution: TSP

532-city USA tour by Padberg-Rinaldi, 1987.

min{c^T x | Σ_{(i,j)∈E} x_ij − Σ_{(j,i)∈E} x_ji = b_i ∀i ∈ N, x ∈ R^n_+}

(The caution: a flow-balance formulation like this does not by itself capture the TSP; additional constraints, such as subtour elimination, are needed.)