
Integration of Artificial Intelligence and Operations Research Techniques

for Combinatorial Problems

Carla P. Gomes, Cornell University

[email protected]

Ken McAloon and Carol Tretkoff, ILOG

{mcaloon,tretkoff}@ilog.com

AI, OR, and CS

[Figure: diagram relating AI, OR, and CS]

Integration of Artificial Intelligence & Operations Research

AI Techniques

Representations: Constraint Languages, Logic Formalisms, Object-Oriented Prog., Bayesian Nets, Rule Based Systems, ...

Tools: Constraint Propagation, Systematic Search, Stochastic Search, ...

Pros / Cons: Rich Representations, Computational Complexity

OR Techniques

Representations: Mathematical Modeling Languages, Linear & Non-linear (In)Equalities, ...

Tools: Linear Programming, Mixed-Integer Prog., Non-linear Models, ...

Pros / Cons: More Tractable (LP), Primarily Complete Info, Limited Representations

Combinatorial Problems: Planning, Scheduling

THE CHALLENGE

AI OR

UNIFY APPROACHES TO:

SCALE UP SOLUTIONS

HANDLE UNCERTAINTY

ANALYZE COMPLEXITY (phase transition)

FRAGILE

EXPLOIT PROBLEM STRUCTURE

INCREASE ROBUSTNESS

[Figure: screenshot of the Rome Laboratory Outage Manager (ROMAN), showing a Gantt chart of AC-power status for DIV1 to DIV4 on a 0 to 90 time axis]

Outline

I. Short Overview of OR

II. Disjunctive Programming and Hybrid Solvers

III. Exploiting Randomization to Solve Hard Combinatorial Problems

IV. Conclusions

I. Short OR Overview

Outline for Linear Programming and Integer Programming


• Standard Form of LP and a Simple Example

• Geometric Interpretation of LP

• Complexity issues

• MIP

• Example: Fast Food

• Example: Capacitated Warehouse

• Example: 911

Outline

1. Short Overview of OR

2. Constraint Programming

3. Cooperating Solvers

4. Disjunctive Programming

5. Exploiting Randomization to Solve Hard Combinatorial Problems

6. Conclusions

Optimization Technology Evolution

[Figure: timeline from 1947 to 1998 (axis marks 1960, 1970, 1980, 1990). Items shown: Primal Simplex LP, Dual Simplex, Dispatch Rules, CPM/PERT, Large IPs / MIP, Shifting Bottleneck, SA, GA, Tabu, Interior Point, Constraint Propagation, First CP Systems, Constraint-based Scheduling, Dual Simplex Implementation, Barrier LP, Barrier Crossover, Global constraints, Parallel LP/MIP, Concurrent Scheduling, Cooperating Solvers (LP/CP)]

1. Short OR Overview


An LP Story

A factory can produce n products from m parts

For product j it needs $a_{ij}$ units of part i

There are $b_i$ units of part i available

Each unit of product j sold earns $c_j$

The amount of each product to make is an unknown $x_j \ge 0$

Each part i determines a constraint

$a_{i1} x_1 + \dots + a_{in} x_n \le b_i$

Obvious solution: do nothing

Better: maximize $c_1 x_1 + \dots + c_n x_n$

Standard Forms of LP

A linear program (LP) in standard form (Dantzig 1947)

$\max\ c^T x$

subject to $Ax \le b$

$x \ge 0$

Input data: c (n x 1), A (m x n), b (m x 1).

Variables: x (n x 1)

// The objective function

$\max\ c_1 x_1 + \dots + c_n x_n$

// The constraints

subject to

$a_{11} x_1 + \dots + a_{1n} x_n \le b_1$

...

$a_{m1} x_1 + \dots + a_{mn} x_n \le b_m$

$x_1 \ge 0, \dots, x_n \ge 0$



• In OR emphasis is on optimality

• Solution means optimal solution

• Feasible solution means solution in the ordinary sense

Interpretation of standard form:

• $x_j$ = amount of product j to make

• $c_j$ = revenue per unit of product j

• $b_i$ = available amount of component i

• $a_{ij}$ = units of i used per unit of j produced

The constraints “say”:

$a_{ij} x_j$ = units of i used by j

$\sum_j a_{ij} x_j$ = units of i used $\le b_i$

What are models?

A model is a data-independent abstraction of a problem

A model lets you write down the mathematical representation of a problem independently of the data

[Figure: a Project combines a Model with Data to give one problem instance]

Products Could be Jewelry

Products: Rings and Earrings

Components: Gold and Diamonds

One ring requires 3 units of Gold and 1 Diamond

One set of earrings requires 2 units of Gold and 2 Diamonds

Total Gold and Diamonds are limited

Profit is different for Rings than for Earrings

Products = { rings, earrings };

Components = { Gold, Diamonds };

demand = [ [3, 1], [2, 2] ];

stock = [150, 180];

profit = [60, 40];

Products Could be Chemicals

Products: Ammonia Gas (NH3) and Ammonium Chloride (NH4Cl)

Components: Nitrogen, Hydrogen, Chlorine

One unit of Gas requires 1 unit of Nitrogen and 3 units of Hydrogen

One unit of Chloride requires 1 unit of Nitrogen, 4 units of Hydrogen, and 1 unit of Chlorine

Total Nitrogen, Hydrogen, and Chlorine are limited

Profit is different for Gas than for Chloride

Products = { gas, chloride };

Components = { nitrogen, hydrogen, chlorine };

demand = [ [1, 3, 0], [1, 4, 1] ];

stock = [50, 180, 40];

profit = [30, 40];

The Problems Have One Model

// Data

enum Products ...;

enum Components ...;

float+ demand[Products, Components] = ...;

float+ profit[Products] = ...;

float+ stock[Components] = ...;

// Decision variables

var float+ production[Products];

// Objective function

maximize sum (p in Products) profit[p] * production[p]

// Constraints

subject to {

forall (c in Components)

sum (p in Products) demand[p, c] * production[p] <= stock[c]

};
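The separation of model and data can be illustrated with a small sketch. It is not the OPL model itself: the function name `solve_product_mix` is hypothetical, it handles only two products, and it finds the optimum by enumerating the vertices of the feasible polygon rather than by calling an LP solver. The same function is applied to the jewelry data and to the chemicals data.

```python
from fractions import Fraction
from itertools import combinations


def solve_product_mix(demand, stock, profit):
    """Maximize profit for a two-product mix by enumerating polygon vertices.

    demand[p][c] = units of component c used per unit of product p.
    Assumes exactly two products and a bounded feasible region.
    """
    assert len(profit) == 2
    # Each row (a1, a2, b) encodes a1*x1 + a2*x2 <= b.
    rows = [(demand[0][c], demand[1][c], stock[c]) for c in range(len(stock))]
    rows += [(-1, 0, 0), (0, -1, 0)]  # x1 >= 0 and x2 >= 0
    best = None
    for (a1, a2, b), (c1, c2, d) in combinations(rows, 2):
        det = a1 * c2 - a2 * c1
        if det == 0:
            continue  # parallel constraints never meet in a vertex
        # Cramer's rule for the intersection of the two boundary lines.
        x1 = Fraction(b * c2 - a2 * d, det)
        x2 = Fraction(a1 * d - b * c1, det)
        if all(r1 * x1 + r2 * x2 <= rb for r1, r2, rb in rows):
            value = profit[0] * x1 + profit[1] * x2
            if best is None or value > best[0]:
                best = (value, (x1, x2))
    return best


# One model, two data sets (values taken from the slides).
jewelry_value, jewelry_plan = solve_product_mix(
    demand=[[3, 1], [2, 2]], stock=[150, 180], profit=[60, 40])
chemicals_value, chemicals_plan = solve_product_mix(
    demand=[[1, 3, 0], [1, 4, 1]], stock=[50, 180, 40], profit=[30, 40])
```

In both data sets the objective happens to be parallel to a binding constraint, so the optimal value is unique but the optimal production plan is not.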

OR Modeling Systems

• OPL

• AMPL

• 2LP

• AIMMS

• GAMS

• MPL

• ILOG Planner

• etc

The Dual

The dual linear program (von Neumann 1947):

$\min\ y^T b$

subject to $y^T A \ge c^T$

$y \ge 0$

Variables: y (m x 1)

Awesome Symmetry -

The dual of the dual is the primal

Rows and Columns Exchanged

$\min\ b_1 y_1 + \dots + b_m y_m$

subject to

$a_{11} y_1 + \dots + a_{m1} y_m \ge c_1$

...

$a_{1n} y_1 + \dots + a_{mn} y_m \ge c_n$

$y_1 \ge 0, \dots, y_m \ge 0$

Duality Theorem

Theorem: $\min\ y^T b = \max\ c^T x$

• Consequence: This turns the optimality problem into a feasibility problem in x and y

$Ax \le b$

$x \ge 0$

$y^T A \ge c^T$

$y \ge 0$

$y^T b = c^T x$

• Consequence: Enumeration not needed to verify optimality
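The feasibility system above can be checked directly for a candidate pair (x, y). The following is a minimal sketch; the function name `is_optimal_pair` is hypothetical, and the data is the jewelry example with the constraint matrix written one row per component. The dual values y = (20, 0) were worked out by hand for this illustration.

```python
def is_optimal_pair(A, b, c, x, y):
    """Check the primal-dual optimality certificate for max c.x s.t. Ax <= b, x >= 0."""
    m, n = len(A), len(A[0])
    primal_feasible = all(v >= 0 for v in x) and all(
        sum(A[i][j] * x[j] for j in range(n)) <= b[i] for i in range(m))
    dual_feasible = all(v >= 0 for v in y) and all(
        sum(y[i] * A[i][j] for i in range(m)) >= c[j] for j in range(n))
    same_value = (sum(c[j] * x[j] for j in range(n))
                  == sum(y[i] * b[i] for i in range(m)))
    return primal_feasible and dual_feasible and same_value


# Jewelry data: rows are Gold and Diamonds, columns are rings and earrings.
A = [[3, 2], [1, 2]]
b = [150, 180]
c = [60, 40]

certified = is_optimal_pair(A, b, c, x=[50, 0], y=[20, 0])
not_certified = is_optimal_pair(A, b, c, x=[0, 0], y=[20, 0])
```

The first pair passes all three tests, so x = (50, 0) is optimal without any enumeration; the second pair is feasible on both sides but the objective values differ, so it certifies nothing.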

• Sensitivity Analysis

• Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints, which measure the rate of change of the objective function with respect to the right-hand-side bounds b

$y_i^* = \partial Z / \partial b_i$, where Z is the optimum

Reference: McAloon and Tretkoff [1996] Wiley

Duality

Two different views of the same phenomenon

Point vs Set

Arc vs Node

Momentum vs Position

Vector vs Hyperplane

Landlord vs Renter

Simplex and Barrier

• The simplex algorithm turns the feasibility problem into an iterative repair process with a powerful evaluation function

• The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope

Geometric Interpretation of LP

Max: X

subject to:

-X + Y <= 4

X + 4*Y <= 36

2*X + Y <= 23

X + Y >= 4

Y >= X + 10

[Figure: feasible region in the X-Y plane with points labeled (0,4), (4,0), (8,0), (10,3), (4,8), and the paths taken by Simplex and Barrier marked]

Complexity of Linear Programming

Simplex Method

Worst-case --- exponential (Klee and Minty 72)

Practice --- good performance

Khachian’s Ellipsoid Method

Worst-case --- polynomial

Practice --- poor performance

Interior Point Methods or Barrier Methods: “Karmarkar’s” Method (and variants)

Worst-case --- polynomial

Practice --- good performance



• Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice.

• Which method performs best is problem dependent.

Success Stories

• Industrial Planning

Given current resources, decide what to produce in what quantity

• Supply Chain Management

Multiperiod planning models that link flow from one period to the next

• Network Flow

How best to route goods across a network

Assumptions of Linear Programming

• Linearity

when violated: ( xy = 50)

Nonlinear programming

• Continuity

when violated: (x integral)

(Mixed) Integer programming

Assumptions of Linear Programming - continued

• No Disjunctive Constraints

when violated: ($x \ge 100$ or $x = 0$)

Disjunctive programming

Additional 0-1 variables and Big M constraints

• Certainty

when violated: (cost c is a random variable)

Stochastic programming

Search and MIP

• In order to deal with variables that must have integer values in the solution, a search must be performed.

• Mixed Integer Programming problems are combinatorial optimization problems and are NP hard

• feasibility is NP-Complete

• verifying optimality is co-NP-Complete

MIP and Combinatorial Optimization

• These problems have been attacked by both the AI and OR communities.

• In AI, these problems are attacked as CSPs or as Planning Problems.

• In OR, they are done as MIPs and use linear relaxation to help guide the search.

• The overriding idea in each case is to limit search.

Integer Program: All Integer Points in Region

[Figure: a polyhedral region with its integer points]

Cut to Create Integer Vertex

[Figure: the region after a cut, with the label “Integer Vertex”]

Example - Fast Food

• Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet?

• If so, what is the least he can do it for?

Nutritional Requirements

• At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron

• At least 55 grams of protein

• At most 3000 milligrams of sodium

• At most 30% of the calories can come from fat

• Nutritional information is available from fast food outlets

College Student’s Requirements

• At least 2000 calories a day

• No more than 3 servings of any one food

• Milk only with cereal and not as a stand-alone drink

Fast Food - MIP Model

• We will have variables $Serv_k$ to represent the number of servings of item k in the plan.

• The variable $Serv_k$ will have to take an integer value for the solution to be valid.

• The objective function: Z for cost

• Let $food_{k,j}$ represent the percent of RDA of nutrient j in a serving of item k

• Then for each nutrient j, we have a constraint

$\sum_k food_{k,j} \, Serv_k \ge 100$

• Let $sodium_k$ represent the amount of sodium (in milligrams) in a serving of item k

• For sodium we have the constraint

$\sum_k sodium_k \, Serv_k \le 3000$

• Similarly for fat

• Let $cost_k$ represent the cost of a serving of item k

• For the objective function we have the defining constraint

$\sum_k cost_k \, Serv_k = Z$

Fast Food - Solution

• With a MIP solver and a way to input these constraints we ask for

• a solution that makes the variables Servk integral

• and which minimizes Z

MIP Solution Technique

• What the MIP solver does is to carry out a branch and bound search guided by

• the linear relaxation: the solution to the problem with the integrality requirements relaxed

• Initialize the global variable best_so_far to 1000 (or something else very big).

At a Node

• Compute a solution to the linear relaxation which minimizes Z, yielding z*. Prune this node if

$z^* \ge best\_so\_far$

• If all values of $Serv_k$ are integral, this is a solution. Set best_so_far = z*. Save this node.

Branching at a node

• Choose a variable $Serv_k$ whose value s* is not integral.

• Typical heuristic: most non-integral variable

• Create two child nodes,

• add $Serv_k \le \lfloor s^* \rfloor$

• add $Serv_k \ge \lceil s^* \rceil$
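The node and branching rules above can be sketched on a toy instance. Everything here is an assumption made for illustration: the two-item menu, its costs and nutrient rows are invented, and the linear relaxation is solved by enumerating vertices of a two-variable polygon (`solve_relaxation`) instead of calling a real LP solver. The search is depth-first with a stack.

```python
from fractions import Fraction
from itertools import combinations
from math import ceil, floor


def solve_relaxation(cost, rows):
    """Minimize cost.x over {a1*x1 + a2*x2 <= b} by vertex enumeration.

    Returns (z, x) or None if infeasible. Assumes a bounded region.
    """
    best = None
    for (a1, a2, b), (c1, c2, d) in combinations(rows, 2):
        det = a1 * c2 - a2 * c1
        if det == 0:
            continue
        x = (Fraction(b * c2 - a2 * d, det), Fraction(a1 * d - b * c1, det))
        if all(r1 * x[0] + r2 * x[1] <= rb for r1, r2, rb in rows):
            z = cost[0] * x[0] + cost[1] * x[1]
            if best is None or z < best[0]:
                best = (z, x)
    return best


def branch_and_bound(cost, rows):
    """Depth-first branch and bound guided by the linear relaxation."""
    best_so_far, best_x = None, None
    stack = [list(rows)]
    while stack:
        node = stack.pop()
        relaxed = solve_relaxation(cost, node)
        if relaxed is None:
            continue  # infeasible node
        z, x = relaxed
        if best_so_far is not None and z >= best_so_far:
            continue  # prune: the relaxation cannot beat the incumbent
        fractional = [k for k in range(2) if x[k].denominator != 1]
        if not fractional:
            best_so_far, best_x = z, x  # integral solution: new incumbent
            continue
        # Most non-integral variable: farthest from its nearest integer.
        k = max(fractional, key=lambda i: min(x[i] - floor(x[i]), ceil(x[i]) - x[i]))
        unit = (1, 0) if k == 0 else (0, 1)
        stack.append(node + [(unit[0], unit[1], floor(x[k]))])      # x_k <= floor
        stack.append(node + [(-unit[0], -unit[1], -ceil(x[k]))])    # x_k >= ceil
    return best_so_far, best_x


# Invented instance: minimize 3*x1 + 2*x2 with two nutrient rows
# (2*x1 + x2 >= 7, x1 + 3*x2 >= 8) and at most 3 servings of each item.
ROWS = [(-2, -1, -7), (-1, -3, -8), (-1, 0, 0), (0, -1, 0), (1, 0, 3), (0, 1, 3)]
best_value, best_plan = branch_and_bound((3, 2), ROWS)
```

On this instance the root relaxation has value 57/5 at (13/5, 9/5), and the search ends with the integral plan (2, 3) at cost 12.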

Good News

• The linear relaxation can prune nodes before all variables Servk are forced to be integral.

• Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why

• A solution to the LP is at a vertex

• A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints

• Many of these constraints are integer bounding constraints yielding X = integer

Arboreally Speaking

• Best-first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A*

• If the linear relaxation is tight

| z*linear - z*integral | is relatively small

then z*linear is an excellent evaluation function

Answer - Fast Food

Total cost is 8.71

Buy 3 burgers

Buy 2 fries

Buy 3 honeys

Buy 1 yogurt

...

Example - Fixed Cost

• Warehouses must be rented in order to supply stores and we must decide which to use

• For each store j we know its monthly demand dj

• For each warehouse i we know its capacity ki

• For each warehouse i we know the fixed cost to run it each month fci

• For each pair i, j we know the monthly cost cij of supplying j from i

• $X_{ij}$ is the fraction of store j’s demand met by i

• $X_{ij} \le 1$

• $Y_i$ is a “fuzzy” boolean

• it will be 1 if the warehouse is rented

• 0 if it is not rented

• $Y_i \le 1$

• Each store must be supplied

$\sum_i X_{ij} = 1$ for each store j

• Warehouse capacity cannot be exceeded

$\sum_j d_j X_{ij} \le k_i$ for each warehouse i

• Tighter

$\sum_j d_j X_{ij} \le k_i Y_i$ for each warehouse i

• Objective function

$\sum_i fc_i Y_i + \sum_{i,j} c_{ij} X_{ij}$

• This yields a MIP with 0-1 variables $Y_i$

Branch and Cut: An Enhanced Solution Method

• Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation

$X_{ij} \le Y_i$

• Add at a node if violated by solution to linear relaxation

• Powerful method - will solve the capacitated warehouse (CW) problems in the Imperial College OR-Library very easily
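Why the cut is redundant for the MIP but not for the relaxation can be seen on a tiny invented instance. In this sketch the helper names `capacity_ok` and `violated_cuts` are hypothetical, and the numbers (one warehouse of capacity 100, two stores with demand 10 each) are made up for illustration.

```python
from fractions import Fraction


def capacity_ok(X, Y, d, k):
    """Check the tightened capacity rows: sum_j d[j]*X[i][j] <= k[i]*Y[i]."""
    return all(
        sum(d[j] * X[i][j] for j in range(len(d))) <= k[i] * Y[i]
        for i in range(len(k)))


def violated_cuts(X, Y):
    """Return the (i, j) pairs whose cut X[i][j] <= Y[i] is violated."""
    return [(i, j)
            for i, row in enumerate(X)
            for j, x_ij in enumerate(row)
            if x_ij > Y[i]]


d = [10, 10]   # store demands
k = [100]      # warehouse capacity
X = [[1, 1]]   # the single warehouse supplies both stores completely

# A fractional point of the relaxation: the warehouse is "one fifth rented".
Y_fractional = [Fraction(1, 5)]
fractional_capacity_ok = capacity_ok(X, Y_fractional, d, k)
fractional_cuts = violated_cuts(X, Y_fractional)

# With Y integral the cuts add nothing.
integral_cuts = violated_cuts(X, [1])
```

The fractional point satisfies the capacity row (20 <= 100 * 1/5) yet violates both cuts, so adding them at this node removes the point from the relaxation.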

Example - Call 911

• PCTs answer the phone 24 hours a day, 7 days a week.

• It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate.

• Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours.

Example - Call 911

• Each PCT has a work week of 5 days followed by 2 days off.

• Want to meet the demand with minimal or near-minimal number of PCTs.

• So need to determine how many PCTs start their work week at each hour h of the week

Modeling 911

• A continuous variable $Pct_h$ will represent the number of workers who start their work week at hour h, $0 \le h < 168$.

• A continuous variable Z will represent the objective function

$\sum_h Pct_h = Z$

• There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed.


• For this constraint we need to represent the number of workers who are on duty at time h

• Certainly, those who start the week at time h are here, as are those who started the week at time h - 1

• And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break.

• This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h

$\sum_w Pct_w \ge b_h$
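The on-duty rule described above (eight hours with the fifth hour off, five consecutive days, wrapping around the 168-hour week) can be sketched as a small coverage computation. The function names are hypothetical, and the uniform staffing vectors in the example are invented only to exercise the check; this is not the LP itself.

```python
HOURS_PER_WEEK = 168


def on_duty_offsets():
    """Hours after the start of the work week at which a worker is on duty."""
    return [day * 24 + hour
            for day in range(5)      # 5 working days
            for hour in range(8)     # 8-hour shift
            if hour != 4]            # one-hour break after 4 hours


def starts_covering(h):
    """Start hours w whose workers are on duty at hour h (week wraps around)."""
    return {(h - offset) % HOURS_PER_WEEK for offset in on_duty_offsets()}


def staff_on_duty(pct, h):
    """Left-hand side of the coverage constraint for hour h."""
    return sum(pct[w] for w in starts_covering(h))


def is_feasible(pct, b):
    """Check every hourly coverage constraint."""
    return all(staff_on_duty(pct, h) >= b[h] for h in range(HOURS_PER_WEEK))


# Invented check: with a flat requirement of 61 per hour, two starters per
# hour give 70 on duty everywhere, while one starter per hour gives only 35.
flat_demand = [61] * HOURS_PER_WEEK
two_per_hour_ok = is_feasible([2] * HOURS_PER_WEEK, flat_demand)
one_per_hour_ok = is_feasible([1] * HOURS_PER_WEEK, flat_demand)
```

Each constraint row therefore has 35 ones: 7 working hours on each of 5 days.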

Call 911 solved with progressive roundoff

int b[168] = { // New York City 911

30,24,18,15,14,14,15,25,34,36,38,40,41,43,46,57,57,59,61,59,55,50,45,38,
32,25,20,17,15,13,17,25,32,35,38,40,42,43,47,58,57,57,59,57,55,52,47,41,
33,25,20,17,15,13,15,25,32,33,37,39,42,43,47,57,56,57,57,56,53,50,47,41,
34,27,22,19,16,15,16,25,31,35,37,40,44,45,48,57,57,56,58,56,53,53,46,41,
34,28,23,19,16,15,17,25,33,37,39,42,45,47,51,59,58,60,61,61,57,56,57,55,
48,41,35,30,26,20,18,22,26,32,42,46,49,53,54,56,56,56,59,59,57,57,56,56,
52,46,41,34,29,23,18,19,25,31,36,41,46,50,52,53,52,53,54,53,50,49,45,40

};

• Subject to these constraints we want to find a solution which makes the $Pct_h$ integer and which makes Z small.

• The naïve approach is to compute the minimal linear solution and to round each value of $Pct_h$ up to the next integer.

• The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers.


• For this and many other applications, heuristics can be used to develop good solutions

• Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc.

Solving the Integer Problem

int main() // Planner Code

{

IlcInitFloat();

IlcManager m(IlcNoEdit);

IlcLinOpt simplex(m);

IlcFloatVarArray Pct(m,168,0,1000);

IlcFloatArray coeffs(m,168);

int i,j,k,h,n;

// Coverage constraint for hour h: sum over on-duty start hours w of Pct[w] >= b[h]

for(h=0;h<168;h++) { // for each hour of 168 in week

for(j=0;j<168;j++)

coeffs[j] = 0;

for(k=0;k<5;k++) // for each of 5 days

for(j=k*24;j<k*24+8;j++) // for each of 8

if (j!=(k*24+4)) // hours

coeffs[(h+168-j)%168] = 1;

simplex.add(IlcScalProd(coeffs,Pct) >= b[h]);

}


IlcFloatVar Z = IlcSum(Pct);// Objective

simplex.setObjMin(Z);

for(i=0;i<168;i++) { //Progressive roundoff

n =ceil(simplex.getCurrentValue(Pct[i]));

// Fix variable and re-optimize

simplex.add(Pct[i] == n);

}

m.out() << "Number of Pcts needed is " << Z << endl;

m.end();

}

Solution

• This code finds a solution with 208 workers in a couple of seconds. The optimum is 207.

• The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution

• [Bartholdi, Ratliff, Orlin]

2. Constraint Programming

LP/MIP is Beautiful, except when

• Variable domain information is important to the search strategy

• especially critical in scheduling

• The problem variables range over symbolic entities and there are lots of symmetries

• timetabling

• The MIP representation can be too verbose or awkward

• configuration

• There are just too many constraints e.g. vehicle routing

Mathematical Basis of Constraint Programming (CP)

The Constraint Satisfaction Problem:

• Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain.

• A constraint on k variables $X_1, \dots, X_k$ is a relation $R(X_1, \dots, X_k) \subseteq D_1 \times \dots \times D_k$.

• A constraint satisfaction problem (CSP) is given by a finite set of constraints.

• A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied.

Domain Reduction

• In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints.

• With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint.

• Accelerates convergence toward a solution

• Detects infeasibility

Constraint Propagation

• The other key issue is communication among the constraints or subproblems.

• The basic method used is called constraint propagation which links the constraints through their shared variables.

• The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints.
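Domain reduction and propagation through shared variables can be sketched for binary constraints. This is an AC-3 style loop written for illustration, not the algorithm of any particular CP system; the names `revise` and `propagate` and the three-variable example are assumptions.

```python
def revise(domains, x, y, allowed):
    """Domain reduction for one constraint: drop values of x with no support in y."""
    supported = {a for a in domains[x] if any(allowed(a, b) for b in domains[y])}
    changed = supported != domains[x]
    domains[x] = supported
    return changed


def propagate(domains, constraints):
    """Propagate binary constraints (x, y, allowed) until nothing changes.

    Returns False as soon as some domain becomes empty (infeasibility detected).
    """
    arcs = []
    for x, y, allowed in constraints:
        arcs.append((x, y, allowed))
        # Same constraint seen from y's side, with the arguments swapped.
        arcs.append((y, x, lambda b, a, f=allowed: f(a, b)))
    queue = list(arcs)
    while queue:
        x, y, allowed = queue.pop()
        if revise(domains, x, y, allowed):
            if not domains[x]:
                return False
            # x changed, so every other constraint sharing x must be rechecked.
            queue.extend(arc for arc in arcs if arc[1] == x and arc[0] != y)
    return True


def less_than(a, b):
    return a < b


# X < Y and Y < Z over {1, 2, 3}: propagation alone fixes every variable.
chain = {"X": {1, 2, 3}, "Y": {1, 2, 3}, "Z": {1, 2, 3}}
chain_ok = propagate(chain, [("X", "Y", less_than), ("Y", "Z", less_than)])

# X < Y and Y < X: propagation empties a domain and reports infeasibility.
cycle = {"X": {1, 2, 3}, "Y": {1, 2, 3}}
cycle_ok = propagate(cycle, [("X", "Y", less_than), ("Y", "X", less_than)])
```

The loop never inspects what a constraint means; it only calls `allowed`, which is the modularity the slide refers to.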

Monsieur Jourdain Phenomenon

• Like prose, you have been doing constraint propagation all your life.

– Crossword puzzles

• Incomplete and so backtracking is needed

– NY Times Sunday Crossword

– Optical Illusions

• Origin: Vision analysis (Marr,Waltz et al)

Strengths of Constraint Programming

• Rich representation language.

• CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT.

• Opportunity to choose a good heuristic for the solution strategy.

Which Method for Which App?

Application: Technology

Product Mix: LP

Production Planning: MIP

Distribution Planning: MIP

Scheduling: Constraint-Based Scheduling

Dispatching: CP, Local search

Configuration: CP

Linear => Disjunctive Constraints

Strategic => Operational Optimization

3. Cooperating Solvers

First Stop: CP/CP

Mother of All Examples - N Queens

• Do we think in terms of queens

• Where do we place this queen ?

• Do we think in terms of squares

• Will this square contain a queen ?

• These views are dual to one another

The Primal View

For each queen assign it a square

Place this queen in this square ?

The Dual View

For each square decide whether it will have a queen

Place a queen in this square ?

The Primal Model

In which row do we place q[j] - the queen in column j

The constraints

q[i] != q[j]

q[i] - q[j] != i - j

q[i] - q[j] != j - i

Note: no alldifferent constraint

Yet Another duality - rows vs columns

In which column do we place qq[i] the queen in row i

The constraints are the same

qq[i] != qq[j]

qq[i] - qq[j] != i - j

qq[i] - qq[j] != j - i

The Relationship

Can link them as inverse functions:

q[qq[i]] = i

qq[q[j]] = j

The constraint propagation

i leaves domain of q[j] iff j leaves domain of qq[i]
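The inverse-function link between the column model q and the row model qq can be checked on a concrete solution. This sketch uses plain chronological backtracking on the primal model only; it does not implement the cooperating propagation or the first-fail heuristic, and all function names are hypothetical.

```python
def satisfies_model(p):
    """The three constraint families, applied to every pair i < j."""
    n = len(p)
    return all(
        p[i] != p[j] and p[i] - p[j] != i - j and p[i] - p[j] != j - i
        for i in range(n) for j in range(i + 1, n))


def solve_primal(n, q=()):
    """Place the queen of column len(q) in a row, backtracking on failure."""
    j = len(q)
    if j == n:
        return list(q)
    for row in range(n):
        if all(q[i] != row and abs(q[i] - row) != j - i for i in range(j)):
            solution = solve_primal(n, q + (row,))
            if solution is not None:
                return solution
    return None


def inverse(q):
    """Build qq with qq[q[j]] = j: the column of the queen in each row."""
    qq = [None] * len(q)
    for column, row in enumerate(q):
        qq[row] = column
    return qq


q = solve_primal(8)   # q[j] = row of the queen in column j
qq = inverse(q)       # qq[i] = column of the queen in row i
```

Because the constraints are symmetric in rows and columns, the inverse of a primal solution automatically satisfies the row model.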

In this primal/dual model

• Apply first-fail to q[i]

• Lo and behold

one-third fewer fails

(Example from Jean Jordan’s thesis)

[Figures: three successive chessboard snapshots, each showing the queens placed so far (Q) and the squares marked X]

So …

The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint

The arc consistency of this global constraint is non-trivial to maintain

Network flow algorithms

flow goes from values to variables

each variable has unit demand and capacity

Remarks

• An IP model will encode the first dual solution

Will this square contain a queen: x[i][j] = 0 or x[i][j] = 1

• A disaster beyond 30 queens

• network structure on rows and columns lost

• Another example - sports scheduling

Constraints and Indices

• In IP symbols are represented by indices as opposed to values for variables.

• nurses, teams

• Paris-St-Germain plays Manchester United on day k

$x_{ijk} \in \{0, 1\}$ to represent “team i plays team j on day k”

• You can’t put symmetry breaking and other constraints on indices.

Second Stop: CP/IP

CP Is Powerful, But …

• Sometimes, inconsistencies can be overlooked

X - Y 12

X + Y 10

X in [1..20]

Y in [1..20]

• Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution

• but an LP solver would spot this

2 Dimensional Bin Packing

• Application for the Automobile Industry built by Greg Glockner

• The problem here is to pack as many small rectangles as possible into a big rectangle, with 90-degree rotation allowed.

The actual application involves circuit boards

• There are two complete models, one a CP model and the other an IP model.

The CP model directs the search

The LP relaxation prunes the search space by detecting infeasible nodes

2-D Bin Packing

Arrange circuit boards onto raw material

Boards may be rotated

Use same number of each board

Objective: minimize scrap

Classic combinatorial optimization problem

Solving 2-D Bin Packing

Use CP to generate partial solutions (nodes)

Restrict placement to reduce fragmentation of blank space

Use tight LP to test feasibility

If any partial solution is infeasible in the LP, prune the tree immediately

CP constraints reduce the tree width

LP allows us to prune quickly



• As the search tree is traversed, the two models are in sync.

• Note that the variables used in the 2 models are disjoint

• The two models are dual to each other

• The IP sees the model from the point of view of the board, the large rectangle

• The CP sees the model from the point of view of the small rectangles

• Solutions are obtained in minutes

2DBP: Basic CP Formulation

• Let $(x_i, y_i)$ be the location and $(w_i, h_i)$ be the dimensions of the ith tile

• Basic constraints:

– Disjunctive constraints to prevent overlapping tiles

$x_i + w_i \le x_j \ \lor\ y_i + h_i \le y_j \ \lor\ x_j + w_j \le x_i \ \lor\ y_j + h_j \le y_i$

– Constraints to count the number of each tile type

Tile-oriented formulation
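The non-overlap disjunction of the tile-oriented formulation is easy to state as a predicate. The sketch below is only a checker for complete layouts, not a CP model; tiles are assumed to be tuples (x, y, w, h), and the function names and the 3 x 2 bin are invented for illustration.

```python
from itertools import combinations


def no_overlap(a, b):
    """True if at least one of the four disjuncts separates tiles a and b."""
    xa, ya, wa, ha = a
    xb, yb, wb, hb = b
    return (xa + wa <= xb or xb + wb <= xa      # a left of b, or b left of a
            or ya + ha <= yb or yb + hb <= ya)  # a below b, or b below a


def rotate(tile):
    """Rotate a tile by 90 degrees about its lower-left corner."""
    x, y, w, h = tile
    return (x, y, h, w)


def layout_is_valid(tiles, bin_width, bin_height):
    """All tiles lie inside the bin and no two of them overlap."""
    inside = all(x >= 0 and y >= 0 and x + w <= bin_width and y + h <= bin_height
                 for x, y, w, h in tiles)
    return inside and all(no_overlap(a, b) for a, b in combinations(tiles, 2))


first = (0, 0, 2, 1)
second = (2, 0, 2, 1)

unrotated_fits = layout_is_valid([first, second], 3, 2)
rotated_fits = layout_is_valid([first, rotate(second)], 3, 2)
```

In the example the second tile sticks out of the bin as given but fits once rotated, which is the kind of choice the CP search has to make.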

2DBP: Basic IP Formulation

Let $x_{ijnt} = 1$ if tile n of type t is in position (i, j)

The constraints are:

$\sum_{i,j} x_{ijnt} \le 1 \quad \forall\, n, t$

$\sum_{n,t} \ \sum_{(i',j')\,:\, i - w_t < i' \le i,\ j - h_t < j' \le j} x_{i'j'nt} \le 1 \quad \forall\, i, j$

$x_{ijnt} \in \{0, 1\} \quad \forall\, i, j, n, t$

Grid-oriented formulation

2DBP: LP Issues

• The LP is large

• The LP exhibits significant primal degeneracy

• The LP exhibits significant dual degeneracy

• The simplex algorithm cannot solve the LP

• There is no way for a MIP solver to solve the IP as such

• The barrier method can solve the LP

2DBP: Summary

• CP as master problem

– Orders tiles

– Places tiles by position, then type

– Selects tile type by frequency to scatter tiles throughout the bin

– Uses a one-ply lookahead constraint to limit the position of following tile

• LP relaxation prunes the CP search space

– Checks whether the partial solution will lead to an infeasible instance

• Use idiomatic formulations for CP and IP

2DBP: Remarks

• The CP fixes significant numbers of variables at each node

• The LP pre-processor greatly simplifies the LP

• Therefore, the lack of incrementality of the barrier method does not cost us

2DBP: Cooperative Algorithm Demo

Last Stop

• Constraint Programming and Local Search cooperation

• Another example of duality in action

CP/LS

Parallel machines with set-up times

Ready times

Due dates

Splittable jobs

Rogue machines

Objectives

meet due dates

minimize setup costs

Two Phase Cooperation

Phase 1 - the Primal (Work on first objective)

• Configure and schedule the jobs

– Use constraint based scheduling

Machines morph into trucks

Phase 2 - the Dual (Work on second objective)

• Schedule the trucks

– Use Lin, Lin-Kernighan, tabu etc.

Parallel Machines: Cooperative Algorithm Demo

IC-Parc Example

• Hoist Scheduling (Rodosek and Wallace)

• The original model is an IP

• The CP model is “the same”

• CP guides search, LP relaxation and CP share pruning duties

• No apparent duality

Remarks

• One can get great benefit with CP/IP algorithms, CP/CP algorithms and CP/LS algorithms

• IP/LS is just around the corner

• IP/IP cooperation is hard because one can’t formulate truly dual views

– either simply not there

– or too verbose

counterexamples welcome

4. DISJUNCTIVE PROGRAMMING

Disjunctive Linear Programming

• An extension of Mixed Integer Programming

• A union of polyhedral sets (feasible regions) is called a disjunctive set.

Disjunctive Set

[Figure: a disjunctive set]

Disjunctive Linear Programming

• The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply the disjunctive programming problem.

• The solution set of the disjunctive programming problem is

$\bigcap_{i<M} \bigcup_{j<N} F_{ij}$

Disjunctive Linear Programming Examples

• Semi-continuous variables

either X >= 100

or X == 0

• Rather than

X <= BigM*Y ,

X >=100*Y,

Y a 0-1 variable
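The relation between the disjunction and its Big M encoding can be checked by brute force on integer values of X. This is a sketch under stated assumptions: X is taken to be non-negative, BigM is set to 500 for the example, and the function names are hypothetical.

```python
def feasible_disjunctive(x, lower=100):
    """The disjunctive constraint itself: either X >= lower or X == 0."""
    return x == 0 or x >= lower


def feasible_big_m(x, big_m, lower=100):
    """Big M encoding: some Y in {0, 1} with lower*Y <= X <= big_m*Y."""
    return any(lower * y <= x <= big_m * y for y in (0, 1))


BIG_M = 500

# The two formulations agree exactly on the range 0..BIG_M.
agree_up_to_big_m = all(
    feasible_big_m(x, BIG_M) == feasible_disjunctive(x)
    for x in range(0, BIG_M + 1))

# Beyond BIG_M the encoding silently cuts off values the disjunction allows.
cut_off_above_big_m = feasible_disjunctive(BIG_M + 1) and not feasible_big_m(BIG_M + 1, BIG_M)
```

The check makes the hidden cost of the encoding visible: BigM acts as an upper bound on X, so it must be chosen at least as large as any value X can take.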

Solution Set Inside Initial Region

Disjunctive Linear Programming Examples - continued

• Bollapragada, Ghattas and Hooker

Truss structure design problem

Branches directly on alternatives dictated by Hooke’s Law

• Wyatt

Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization

Extends Benders decomposition to disjunctive linear programs


Disjunctive Linear Programming continued


• Balas, Cornuejols and Ceria

Generating cuts for disjunctive programming problems.

• McAloon and Tretkoff

Basic mathematical results: Optimization and Computational Logic, Wiley




• Dealing with the disjunctive part requires search.

• This requires an engine which is not available in MIP packages

• Also the linear relaxation is not as tight and the evaluation function is not as faithful

• The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers

• Beringer and DeBacker for MIP


To Keep It Simple

GERALD + DONALD = ROBERT

An AI classic

Newell and Simon

Assignment problem + 1 constraint

Surprisingly hard for MIP solvers

CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution

The Disjunctive Program

• One constraint for the equation

• 100000 G + … + D = 100000 R + … + T

• For each variable X among G,…,T

• X = 0 or X = 1 or … or X = 9

• For each pair X, Y

• $X \le Y - 1$ or $Y \le X - 1$

Solution Set: SOME of the Integer Points in the Region

The Twin Variables for Cooperating Solvers

• Integer variables for the letters

$0 \le g, e, r, a, l, d, o, n, b, t \le 9$

• With continuous doppelgangers

$0 \le G, E, R, A, L, D, O, N, B, T \le 9$

The Variables

• One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver

100000 G + 10000 E + 1000 R + … + D +

100000 D + 10000 O + 1000 N + … + D

=

100000 R + 10000 O + 1000 B + … + T

• One CSP constraint on the integer variables posted to a discrete constraint propagation engine

AllDifferent(g, e, r, a, l, d, o, n, b, t)

The Search

• Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely

• The branching strategy is guided by the linear relaxation on the continuous variables

• if there is a non-integral variable X, branch on it

$X \le \lfloor X^* \rfloor$

or

$X \ge \lceil X^* \rceil$



• If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack

• If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack

• New wrinkle

• The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values

• In this case, branch to keep them apart

• either $X \le Y - 1$

• or $Y \le X - 1$
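For comparison with the cooperating-solver search, the puzzle can also be solved by a plain backtracking search. The sketch below is a different and much simpler technique than the LP/CSP hybrid described above: it works column by column from the right, carries the addition carry along, and enforces all-different by checking used digits. All names are hypothetical.

```python
# Columns of GERALD + DONALD = ROBERT, rightmost first: (addend, addend, sum).
COLUMNS = [("D", "D", "T"), ("L", "L", "R"), ("A", "A", "E"),
           ("R", "N", "B"), ("E", "O", "O"), ("G", "D", "R")]
LEADING = {"G", "D", "R"}  # leading letters may not be zero


def search(col, carry, assign):
    """Assign digits column by column; return a full assignment or None."""
    if col == len(COLUMNS):
        return dict(assign) if carry == 0 else None
    a, b, s = COLUMNS[col]
    # Branch on the first addend letter of this column that has no digit yet.
    for letter in (a, b):
        if letter not in assign:
            for digit in range(10):
                if digit in assign.values() or (digit == 0 and letter in LEADING):
                    continue
                assign[letter] = digit
                result = search(col, carry, assign)
                del assign[letter]
                if result is not None:
                    return result
            return None
    # Both addends are known, so the sum digit and the next carry are forced.
    total = assign[a] + assign[b] + carry
    digit = total % 10
    if s in assign:
        return search(col + 1, total // 10, assign) if assign[s] == digit else None
    if digit in assign.values() or (digit == 0 and s in LEADING):
        return None
    assign[s] = digit
    result = search(col + 1, total // 10, assign)
    del assign[s]
    return result


def word_value(word, assign):
    """Decimal value of a word under a letter-to-digit assignment."""
    return int("".join(str(assign[ch]) for ch in word))


solution = search(0, 0, {})
```

A known solution, which can be verified by hand, is GERALD = 197485, DONALD = 526485, ROBERT = 723970.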

The Variables

int main()

{

IlcInitFloat();

IlcManager m(IlcNoEdit);

IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9),

G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9);

IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T);

// The constraints

m.add(IlcAllDiff(vars,IlcWhenValue));

IlcLinOpt simplex(m);

simplex.add(

100000*R + 10000*O + 1000*B + 100*E + 10*R + T

==

100000*G + 10000*E + 1000*R + 100*A + 10*L + D

+

100000*D + 10000*O + 1000*N + 100*A + 10*L + D ,

IlcTrue // Post to Solver as well

);

136

The Search for solutions

m.add(Generate(m,simplex,vars)); // Search strategy

if (m.nextSolution()) { // Find a solution

m.out() << " solution found " << endl;

}

m.printInformation();

m.end();

}

137

Branch if a variable is non-integer

ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars)

{

IlcInt varIndex = MostNotInteger(vars, simplex);

if (varIndex >= 0) // There is a non-integer variable

return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this);

138

Is integer relaxation a solution ?

IlcManager m = getManager();

if(m.solve(TestIntegerRelaxation(m,simplex)))

return 0;

139

Find two variables with same value

IlcInt i, j; // i was used below without being declared

for(i=0;i<vars.getSize()-1;i++) {

if (vars[i].isBound()) continue; // Can’t both be bound

IlcInt n = simplex.nearest(simplex.getCurrentValue(vars[i]));

for(j=i+1;j<vars.getSize();j++) {

IlcInt m =

simplex.nearest(simplex.getCurrentValue(vars[j]));

if (m == n) break;

}

if (j< vars.getSize()) break;

}

140

Branch to push them apart

// j and i are the indices of two variables with same current value

return

IlcAnd(IlcOr(

Smaller(m,vars[i],vars[j],simplex),

Smaller(m,vars[j],vars[i],simplex)),

this // Recursion

);

}

141

Pushing two variables apart

ILCGOAL3(Smaller,IlcIntVar,x,IlcIntVar,y,IlcSimplex,simplex)

{

simplex.add(x <= y-1,IlcTrue);

return 0;

}

142

Testing the integer relaxation

ILCGOAL1(TestIntegerRelaxation, IlcSimplex, simplex)

{

simplex.trySolution();

return 0;

}

143

Results

• ILOG Solver/Planner finds a solution in 6 nodes (0.29 seconds on a laptop)

• Straightforward ILOG Solver finds a solution in 8024 nodes (1.8 seconds on a laptop)

• Again, CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution

144

Example: The Dutch Trains

Scheduling intercity trains

Amsterdam, Rotterdam, Roosendaal, Vlissingen

Without coupling constraints, multi-commodity integer flow problem

With coupling constraints, a DLP with an integer relaxation

Additional logic handled directly in 2LP with CPLEX

Disjunctive Programming and Cooperating Solvers, CSTS 98 (Kluwer, edited by D. Woodruff)

Conclusions

• CP and MIP are powerful techniques that can solve many combinatorial problems

• Each has preferred formulations

• Can get even greater benefits when combining CP and IP algorithms

146

Recent and Current Work

Beaumont

Beringer, DeBacker

Balas, Ceria, Cornuejols.

Wallace, Rodosek, Schrimpf

Heipke, Colombani

Bockmayr

McAloon, Tretkoff, Wetzel

147

III. Exploiting Randomization to Solve Hard Combinatorial Problems

148

Background

Combinatorial search methods often exhibit a remarkable variability in performance. It is common to observe significant differences between:

- different heuristics

- same heuristic on different instances

- different runs of same heuristic with different seeds (stochastic methods)

149

Main Claim

One can take advantage of the extreme variability of combinatorial search methods:

One can improve the performance of a deterministic complete method by introducing a stochastic element, while maintaining completeness.

We’ll explain WHY that is the case.

150

A Structured Benchmark Domain for Studying the Distributions of Search Methods

Stochasticity in Search Procedures

Intriguing Properties of Complete Backtrack-Style Algorithms

Consequences for Algorithm Design - Rapid Randomized Restarts

Portfolio of Algorithms

151

Structured Benchmark Domain

152

Background

Study of local and systematic search methods has been driven by:

Random instance distributions (Hogg et al. 96). Limitation: lack of structure that characterizes realistic problems;

Highly structured problems (Fujita et al. 93). Limitation: “too much” structure.

We propose a benchmark domain that bridges the gap between purely random instances and highly structured problems.

Gomes and Selman 1997 - Proc. AAAI-97

153

Quasigroups

Defn.: a pair (Q, *) where Q is a set, and * is a binary operation on Q such that the equations

a * x = b ; y * a = b

are uniquely solvable for every pair of elements a, b in Q.

The multiplication table of its binary operation defines a latin square (i.e., each element of Q appears exactly once in each row/column).

Example: Quasigroup of order 4

154

Given a partial latin square, can it be completed?

Example:

Quasigroup Completion Problem (QCP)


155

Quasigroup Completion Problem: A Framework for Studying Search

NP-Complete (Colbourn 1983, 1984; Anderson 1985).

Has a structure not found in random instances.

Leads to interesting search problems when structure is perturbed.

The study of this problem led us to identify the unusual distributions of combinatorial search (Gomes, Selman & Crato --- CP97)

156

Aside: Applications of Quasigroups

Design of statistical experiments

eliminating data dependencies

Scheduling/Timetabling (Anderson 1992)

completing a schedule given a set of pre-defined events

Automated theorem proving (Fujita et al. 1993)

existence vs. non-existence of quasigroups with intricate mathematical properties

157

Example: Scheduling of Drug Experiment

Given 5 different drugs, test the effects of the different medications on 5 different subjects over different days of the week.

Use constraint:

No two people get same brand on the same day (eliminate bias for day of the week).

158

Quasigroup Completion

              DAY
SUBJECT       Mon.        Tues.         Wed.          Thurs.         Fri.
Tim           Tylenol     Aleve         Bayer         Excedrin (*)   Advil
Sue           Aleve       Bayer         Excedrin      Advil          Tylenol
Frank         Bayer       Excedrin      Advil (*)     Tylenol        Aleve
Teresa        Excedrin    Advil (*)     Tylenol       Aleve          Bayer
Todd          Advil       Tylenol       Aleve         Bayer          Excedrin

(*) Pre-assigned

159

QCP has a natural formulation as a Constraint Satisfaction Problem

variable for each NxN entry

constraints capture row/column requirement

variable assignments capture pre-assigned values
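The row/column requirement of this formulation can be illustrated with a small consistency check on a partial latin square. This is a sketch under stated assumptions, not a solver: `Grid` and `isConsistent` are hypothetical names, the grid is assumed square, values are encoded as 0..N-1, and -1 marks an unassigned cell.

```cpp
#include <vector>

// A partial latin square of order N: grid[row][col] holds a value in
// 0..N-1, or -1 for a cell that is not (pre-)assigned.
using Grid = std::vector<std::vector<int>>;

// True when no value is repeated within any row or any column.
// Unassigned cells are ignored; out-of-range values are rejected.
bool isConsistent(const Grid& grid) {
    const int n = static_cast<int>(grid.size());
    for (int k = 0; k < n; ++k) {
        std::vector<bool> seenInRow(n, false);
        std::vector<bool> seenInCol(n, false);
        for (int j = 0; j < n; ++j) {
            const int rowValue = grid[k][j];  // j-th cell of row k
            const int colValue = grid[j][k];  // j-th cell of column k
            if (rowValue >= n || colValue >= n) return false;
            if (rowValue >= 0) {
                if (seenInRow[rowValue]) return false;
                seenInRow[rowValue] = true;
            }
            if (colValue >= 0) {
                if (seenInCol[colValue]) return false;
                seenInCol[colValue] = true;
            }
        }
    }
    return true;
}
```

A backtrack search for QCP would call a check of this kind (or propagate the same constraints incrementally) after each tentative assignment.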

160

How does the difficulty of QCP vary with the fraction of pre-assignment?

161

[Figure: median number of backtracks (log) vs. fraction of pre-assignment. Regions: underconstrained area, critically constrained area, overconstrained area.]

162

Complexity Graph shows (up to order 20):

curve peaks around 42% of pre-assignment --- critically constrained area.

under-constrained and over-constrained areas are easier.

163

Directly related to the peak in computational difficulty is the so-called phase transition graph for the QCP problem.

164

[Figure: fraction of unsolved cases vs. fraction of pre-assignment. Regions: almost all solvable area, phase transition area, almost all unsolvable area.]

165

Phase Transition

QCP Phase Transition --- threshold phenomenon from almost all solvable to almost all unsolvable --- occurs around 42% of pre-assignment.

It’s called a phase transition because of the close relation to state transition phenomena studied in physics, such as the melting of a solid into a liquid.

166

Exploiting Structure

167

Exploiting Structure in QCP

[Figure labels: Forward Checking; Arc Consistency on binary constraints]

168

Further Exploiting Structure in QCP

[Figure labels: Arc Consistency on Binary Constraints; General Arc Consistency on all different constraints]

Shaw, Stergiou and Walsh - ECAI98

169

Enforcing General Arc Consistency on All Different Constraints


• Beautiful example of integration of AI/OR techniques for a well defined sub-problem

• Propagation uses Maximum Matching problem (particular case of Network Flow problems which have polynomial time complexity)

Regin - AAAI94

170

Further Exploiting Structure in QCP

By enforcing general arc consistency on all different constraints, problems up to order 50 could be solved!

Shaw, Stergiou and Walsh - ECAI98

Regin - AAAI94

171

Stochasticity in Search Procedures

172

Background

Stochastic strategies have been very successful in the area of local search.

Limitation: inherent incomplete nature of local search methods.

We want to explore the addition of a stochastic element to a systematic search procedure without losing completeness.

173

We introduce stochasticity in a backtrack search method by randomly breaking ties in variable and/or value selection.

Compare with standard lexicographic tie-breaking.
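Random tie-breaking can be sketched as follows. The heuristic score (for example, smallest remaining domain) is an assumption made for illustration, and `pickWithRandomTies` is a hypothetical name; a lexicographic rule would always return the first of the tied indices instead.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Returns the index of a minimum-score candidate. When several candidates
// share the minimum score, one of them is chosen uniformly at random.
// Precondition: scores is not empty.
std::size_t pickWithRandomTies(const std::vector<int>& scores, std::mt19937& rng) {
    std::vector<std::size_t> tied;  // indices currently sharing the best score
    for (std::size_t i = 0; i < scores.size(); ++i) {
        if (tied.empty() || scores[i] < scores[tied.front()]) {
            tied.clear();           // strictly better score: start a new tie set
            tied.push_back(i);
        } else if (scores[i] == scores[tied.front()]) {
            tied.push_back(i);      // same score: extend the tie set
        }
    }
    std::uniform_int_distribution<std::size_t> pick(0, tied.size() - 1);
    return tied[pick(rng)];
}
```

Because only the order among equally ranked choices changes, the search remains complete; different seeds simply explore the tree in different orders.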

174

Randomized Strategies

Strategy Variable sel. Value sel.

DD deterministic deterministic

DR deterministic random

RD random deterministic

RR random random

178

Lesson: Randomized tie-breaking can improve performance over a purely deterministic strategy.

Next: But we can obtain a more dramatic advantage from randomization ...

179

Cost Distributions

Key Properties:

I Erratic behavior of mean.

II Distributions have “heavy tails”.

180

[Figure: sample mean vs. number of runs. Extracted labels: Median = 1!; 3500!; 500; 2000]

181


182

[Figure: proportion of cases solved vs. number of backtracks. Annotations: 75% <= 30; 5% > 100000]

183

Heavy-Tailed Distributions

… infinite variance … infinite mean

Introduced by Pareto in the 1920’s --- “probabilistic curiosity.”

Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena.

Examples: stock market, earthquakes, weather, ...

184

Decay of Distributions

Standard --- Exponential Decay

e.g. Normal: $\Pr[X > x] \sim C e^{-x^2}$, for some $C > 0$, $x > 1$

Heavy-Tailed --- Power Law Decay

e.g. Pareto-Levy: $\Pr[X > x] \sim C x^{-\alpha}$, $x > 0$

185

[Figure: Standard Distribution (finite mean & variance) with Exponential Decay vs. Power Law Decay]

186

Normal, Cauchy, and Levy

Normal - Exponential Decay

Cauchy - Power law Decay

Levy - Power law Decay

187

Tail Probabilities (Standard Normal, Cauchy, Levy)

c    Normal        Cauchy    Levy
0    0.5           0.5       1
1    0.1587        0.25      0.6827
2    0.0228        0.1476    0.5205
3    0.001347      0.1024    0.4363
4    0.00003167    0.078     0.3829
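The tail probabilities Pr[X > c] for the three standard distributions have closed forms, so the values can be recomputed with the C++ standard library. The function names below are hypothetical; the formulas are the standard ones for the standard normal, the standard Cauchy, and the Levy distribution with location 0 and scale 1.

```cpp
#include <cmath>

// Pr[X > c] for a standard normal variable.
double normalTail(double c) {
    return 0.5 * std::erfc(c / std::sqrt(2.0));
}

// Pr[X > c] for a standard Cauchy variable.
double cauchyTail(double c) {
    const double pi = std::acos(-1.0);
    return 0.5 - std::atan(c) / pi;
}

// Pr[X > c] for a Levy variable (location 0, scale 1), which is
// supported on the positive reals only.
double levyTail(double c) {
    if (c <= 0.0) return 1.0;
    return std::erf(1.0 / std::sqrt(2.0 * c));
}
```

At c = 4 the normal tail is already about 0.00003, while the Cauchy and Levy tails are still about 0.078 and 0.38: power law decay leaves far more mass in the tail than exponential decay.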

188

How to Check for “Heavy Tails”?

Log-Log plot of tail of distribution should be approximately linear.

Slope gives value of $\alpha$

$\alpha \le 1$: infinite mean and infinite variance

$1 < \alpha < 2$: infinite variance
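The log-log check can be sketched as an ordinary least-squares fit of log(1 - F(x)) against log(x); for a power law tail the fitted slope is approximately the negative of the tail index. `TailPoint` and `estimateTailIndex` are hypothetical names, and a plain least-squares fit is an assumption of this sketch (the slides do not say which estimator was used).

```cpp
#include <cmath>
#include <vector>

// One point of the empirical tail: survival is the observed 1 - F(x).
// Both fields must be strictly positive so that their logs exist.
struct TailPoint {
    double x;
    double survival;
};

// Fits a straight line to (log x, log survival) by least squares and
// returns the tail index estimate, i.e. minus the fitted slope.
// Precondition: at least two points with distinct x values.
double estimateTailIndex(const std::vector<TailPoint>& points) {
    double sumX = 0.0, sumY = 0.0, sumXX = 0.0, sumXY = 0.0;
    for (const TailPoint& p : points) {
        const double lx = std::log(p.x);
        const double ly = std::log(p.survival);
        sumX += lx;
        sumY += ly;
        sumXX += lx * lx;
        sumXY += lx * ly;
    }
    const double n = static_cast<double>(points.size());
    const double slope = (n * sumXY - sumX * sumY) / (n * sumXX - sumX * sumX);
    return -slope;
}
```

On exact power law data such as 1 - F(x) = x^(-0.5), the estimate recovers 0.5, which falls in the infinite-mean range.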

189

Example of Heavy Tailed Model (Random Walk)

Random Walk:

• Start at position 0

• Toss a fair coin:

• with each head take a step up (+1)

• with each tail take a step down (-1)

X --- number of steps the random walk takes to return to position 0.
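The random walk is easy to simulate. In the sketch below, `returnTime` is a hypothetical helper; the step cap `maxSteps` is an assumption added so that a very long excursion cannot run forever, and a capped run is reported as -1.

```cpp
#include <random>

// Simulates a fair +1/-1 random walk started at position 0 and returns the
// number of steps until it first comes back to 0. Returns -1 when the walk
// has not returned within maxSteps steps (a censored observation).
long returnTime(std::mt19937& rng, long maxSteps) {
    std::bernoulli_distribution heads(0.5);
    long position = 0;
    for (long step = 1; step <= maxSteps; ++step) {
        position += heads(rng) ? 1 : -1;  // head: step up, tail: step down
        if (position == 0) return step;
    }
    return -1;
}
```

Collecting many samples of `returnTime` and plotting the fraction of walks that have not yet returned against the number of steps, on log-log axes, is one way to observe the power law tail.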

190

[Figure: the record of 10,000 tosses of an ideal coin (Feller). Labels: zero crossing; long periods without zero crossing]

191

Heavy-tails vs. Non-Heavy-Tails

[Figure: 1-F(x) (unsolved fraction) vs. X, the number of steps the walk takes to return to zero (log scale). Curves: Random Walk, Normal(2,1000000), Normal(2,1). Annotations: Median = 2 (50%); 0.1% > 200000]

192

Heavy-tails in QCP Domain

[Figure: 1-F(x) (unsolved fraction) vs. number of backtracks (log). Curve labels: $\alpha$ = 0.466, 0.319, 0.153]

$\alpha < 1$ => Infinite mean

193

The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior.

196

Heavy Tailed Cost Distribution

[Figure: log( 1 - F(x) ) vs. log( Backtracks ), with backtracks ranging from 1 to 100000]

197


198

By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution.

Right-hand side: non-negligible fraction of very long runs

Left-hand side: non-negligible fraction of very short runs

199

Sports Scheduling

[Figure: cumulative distribution function. Extracted annotations: 70%>250000; 15!; 1%<=650!]

200

[Figure: Standard Distribution (finite mean & variance) with Exponential Decay vs. Power Law Decay]

Also, heavy tails on left. (High probability of very short runs.)

201

Consequence for algorithm design:

Use rapid restarts or parallel / interleaved runs

Super linear speedups!!!

202

Super-linear Speedups

[Figure: five branches marked X, 10 seconds each, followed by one marked solved]

Sequential: 50 + 1 = 51 seconds

Parallel: 10 machines --- 1 second (51x speedup)

Interleaved (1 machine): 10 x 1 = 10 seconds (5x speedup)

203

Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution.

RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left.

Restarts provably eliminate heavy tails (Gomes, Selman & Crato)

204

Sketch of proof of elimination of heavy tails

X: number of backtracks to solve the problem

Let’s truncate the search procedure after m backtracks.

Probability of solving problem with truncated version:

$p_m = \Pr[X \le m]$

Run the truncated procedure and restart it repeatedly.

205

Y: total number of backtracks with restarts

Number of restarts: $\lceil Y/m \rceil \sim \mathrm{Geometric}(p_m)$

$\bar{F}(y) = \Pr[Y > y] \le (1 - p_m)^{\lfloor y/m \rfloor} \le c_1 e^{-c_2 y}$

Y - does not have Heavy Tails
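The restart scheme used in this proof sketch (truncate each run after m backtracks, then restart) can be written as a small driver loop. This is a minimal sketch: `RestartOutcome`, `rapidRestarts` and the `runOnce` callback are hypothetical names, and `runOnce(cutoff)` is assumed to return the number of backtracks used when the run succeeds within the cutoff, or -1 when it is cut off.

```cpp
#include <functional>

// Result of a restart loop (hypothetical type for this sketch).
struct RestartOutcome {
    bool solved;           // did some run finish within the cutoff?
    long restarts;         // number of runs that were cut off
    long totalBacktracks;  // Y: backtracks summed over all runs
};

// Repeats a randomized run, truncated after `cutoff` backtracks, until one
// run succeeds or `maxRuns` runs have been tried.
RestartOutcome rapidRestarts(const std::function<long(long)>& runOnce,
                             long cutoff, long maxRuns) {
    RestartOutcome outcome{false, 0, 0};
    for (long run = 0; run < maxRuns; ++run) {
        const long used = runOnce(cutoff);
        if (used >= 0) {                    // solved within the cutoff
            outcome.solved = true;
            outcome.totalBacktracks += used;
            return outcome;
        }
        outcome.totalBacktracks += cutoff;  // a failed run costs the full cutoff
        ++outcome.restarts;
    }
    return outcome;
}
```

Since each truncated run succeeds independently with probability p_m, the number of runs is geometric, which is what bounds the tail of Y exponentially.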

206

Restarts

[Figure: 1-F(x) (unsolved fraction). Extracted annotations: 70% unsolved; 250; ~ 62.5 restarts]


207

Example of Rapid Restart Speedup (planning)

[Figure: number of backtracks (log) vs. cutoff (log); cutoff axis from 1 to 1000000, backtracks axis from 1000 to 1000000. Extracted annotations: 20; 2000 ~100 restarts; ~10 restarts; 100000]

208

Summary Results

Problem                Deterministic    R3
Logistics Planning     108 mins.        95 sec.
Scheduling 14          411 sec          250 sec
Scheduling 16          ---(*)           1.4 hours
Scheduling 18          ---(*)           ~18 hrs
Circuit Synthesis 1    ---(*)           165 sec.
Circuit Synthesis 2    ---(*)           17 min.

(*) not found after 2 days

209

Our results provide the first indication of heavy-tailed distributions in a computational model.

Overall insight: Randomized tie-breaking with rapid restarts gives a powerful search strategy.

210

Heavy-Tailed Distributions in Other Domains


Quasigroup Completion Problem

Graph Coloring

Logistic Planning

Circuit Synthesis

Gomes, Selman, and Crato 1997;

Gomes, Selman, McAloon, and Tretkoff 1998;

Gomes, Kautz, and Selman 1998;

211


212

Rapid Restart Speedup

[Figure: log( backtracks ) vs. log( cutoff ); cutoff axis from 1 to 1000000, backtracks axis from 1000 to 1000000]

213


214



Quasigroup Completion Problem

Graph Coloring

Logistic Planning

Circuit Synthesis

Gomes, Selman, and Crato 1997 - Proc. CP97;

Gomes, Selman, McAloon, and Tretkoff 1998 - Proc AIPS98;

Gomes, Kautz, and Selman 1998 - Proc. AAAI98.

215

Algorithm Portfolio Design

Gomes and Selman 1997 - Proc. UAI-97;

Gomes, Selman, and Crato 1997 - Proc. CP97.

216

Motivation

The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances.

Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths.

217

Branch & Bound: Best Bound vs. Depth First Search

218

Branch & Bound (Randomized)

Standard OR approach for solving Mixed Integer Programs (MIPs)

• Solve linear relaxation of MIP

• Branch on the integer variables for which the solution of the LP relaxation is non-integer:

apply a good heuristic (e.g., max infeasibility) for variable selection ( + randomization ) and create two new nodes (floor and ceiling of the fractional value)

• Once we have found an integer solution, its objective value can be used to prune other nodes, whose relaxations have worse values

219

Branch & Bound: Depth First vs. Best Bound

Critical in performance of Branch & Bound:

the way in which the next node to be expanded is selected.

Best-bound - select the node with the best LP bound

(standard OR approach) ---> this case is equivalent to A*, the LP relaxation provides an admissible search heuristic

Depth-first - often quickly reaches an integer solution

(may take longer to produce an overall optimal value)

220

Portfolio of Algorithms

A portfolio of algorithms is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors.

Goal: to improve on the performance of the component algorithms in terms of:

expected computational cost

“risk” (variance)

Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk.

221

Depth-first vs. Best-bound (logistics planning)

[Figure: cumulative frequencies vs. number of nodes. Curves: Depth-First (~50%), Best-Bound (~30%)]

222

Depth-First and Best-Bound do not dominate each other overall.

223

Heavy-tailed behavior of Depth-first

224

Portfolio for heavy-tailed search procedures (2 processors)

[Figure: expected run time of portfolios vs. standard deviation of run time of portfolios. Labeled portfolios: 0 DF / 2 BB; 2 DF / 0 BB]

225



[Figure: expected run time of portfolios. Labeled portfolios: 0 DF / 6 BB; 6 DF / 0 BB; 5 DF / 1 BB; 3 DF / 3 BB; 4 DF / 2 BB. The efficient set is marked.]

226



[Figure: expected run time of portfolios. Labeled portfolios: 0 DF / 20 BB; 20 DF / 0 BB]

227

Portfolio for heavy-tailed search procedures (2-20 processors)

228

A portfolio approach can lead to substantial improvements in the expected cost and risk of stochastic algorithms, especially in the presence of heavy-tailed phenomena.

229

Summary of Randomization

Considered randomized backtrack search.

Showed Heavy-Tailed Distributions.

Suggests: Rapid Restart Strategy.

--- cuts very long runs

--- exploits ultra-short runs

Experimentally validated on previously unsolved planning and scheduling problems.

Portfolio of Algorithms for cases where no single heuristic dominates

230


231

IV. CONCLUSIONS

232

Important Themes in OR

Linear Programming

(Mixed) Integer Programming

Exploit Structure: e.g., Network Flow Problems

Duality: very elegant theory in LP; sensitivity analysis

233

Opportunities for Integration of AI/OR


OR methods:

Have focused on tractable representations (LP)

Have demonstrated the ability to identify optimal and locally optimal solutions

LIMITATION: Restricted to rigid models with limited expressive power

AI methods:

Richer and more flexible representations, supporting constraint-based reasoning mechanisms as well as mixed-initiative frameworks, allowing human expertise to be in the loop.

LIMITATION: Rich representations in general lead to intractable problems

CHALLENGE: good representations / fast & good solutions

234



AI methods are becoming competitive

AI methods used to be considered unsuitable for real-world scheduling problems. Recent developments have shown they can be competitive. Examples:

SAP, PeopleSoft, i2, … -> provide solutions for scheduling combining constraint programming and mathematical programming approaches.

ILOG (CP language) has several fielded applications in different scheduling areas; ILOG has integrated a CSP solver with CPLEX.

OR people have acknowledged the benefits of combining OR and AI methods

235



Exploiting Duality in CSP frameworks

Exploiting Randomization

Hybrid Solvers

236



Hybrid Solvers - emerging area of research (CSP+OR); it started with CLP(R), Prolog III and CHIP; ILOG integrates a CSP solver with CPLEX

local constraint propagation - local consistency algs

global constraint propagation - LP relaxations

Only a hybrid approach could prove optimality, e.g.:

Hoist scheduling (Rodosek & Wallace 1998)

Multicommodity integer network flow problem (Dutch Railways) (McAloon, Tretkoff, Wetzel 1998)

237

Updated version of tutorial slides

www.cs.cornell.edu/gomes/

TalksDemos


238

Appendix

239


A portfolio of algorithms is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors.

A portfolio has an expected computational cost and a standard deviation, a measure of the dispersion of the computational cost.

The standard deviation of the portfolio is a measure of the risk inherent to the portfolio.

240



expected computational cost

“risk” (variance)

Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk.

241

Appendix: Portfolio of Algorithms


expected computational cost;

risk;

Efficient Set or Efficient Frontier - set of portfolios that are the best in terms of expected value and risk.

Within the efficient set, in order to minimize the risk, one has to deteriorate the expected value or, in order to improve the expected value, one has to increase the risk.

242

Appendix Portfolio of Two Algorithms


Let us consider the random variables:

A1 - the number of backtracks that algorithm 1 takes to find a solution or prove that a solution doesn’t exist;

A2 - the number of backtracks that algorithm 2 takes to find a solution or prove that a solution doesn’t exist;

243



Let us consider that we have N processors and we design a portfolio using n1 processors with algorithm 1 and n2 processors with algorithm 2 (N = n1 + n2).

Let us consider the random variable:

X - the number of backtracks that the portfolio takes to find a solution or prove that a solution doesn’t exist;

244

Appendix: Portfolio of Two Algorithms

Given N processors, and $n_1 = N$, $n_2 = 0$:

$$P[X = x] = \sum_{i=1}^{N} \binom{N}{i} \, P[A_1 = x]^{i} \, P[A_1 > x]^{(N-i)}$$

245

Appendix: Portfolio of Algorithms

Given N processors, $n_1$ such that $0 \le n_1 \le N$, and $n_2 = N - n_1$:

$$P[X = x] = \sum_{i=1}^{N} \sum_{i'=0}^{n_1} \binom{n_1}{i'} P[A_1 = x]^{i'} \, P[A_1 > x]^{(n_1 - i')} \times \binom{n_2}{i''} P[A_2 = x]^{i''} \, P[A_2 > x]^{(n_2 - i'')}$$

where $i' + i'' = i$, and the term in the summation is 0 when $i'' < 0$ or $i'' > n_2$.
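A simpler way to look at the same portfolio is through its tail: the portfolio is still running after x backtracks only if every one of its runs is still running. Assuming the runs are independent and the portfolio stops as soon as any run finishes, P[X > x] = P[A1 > x]^n1 · P[A2 > x]^n2. The function name below is hypothetical.

```cpp
#include <cmath>

// P[X > x] for a portfolio of n1 independent runs of algorithm 1 and
// n2 independent runs of algorithm 2, where
//   survival1 = P[A1 > x] and survival2 = P[A2 > x].
// Assumes the portfolio stops as soon as any single run finishes.
double portfolioSurvival(double survival1, int n1, double survival2, int n2) {
    return std::pow(survival1, n1) * std::pow(survival2, n2);
}
```

For example, with P[A1 > x] = 0.5 and P[A2 > x] = 0.25, a portfolio of two copies of algorithm 1 and one copy of algorithm 2 has P[X > x] = 0.5 · 0.5 · 0.25 = 0.0625.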

246

Preliminary Research on Structure of Search Spaces

247

Fringe of Search Tree

248

Fractal Dimension

249

Fractal Dimension

When plotting the length of a curve as a function of the measuring tool on a log-log plot, one obtains a linear relationship:

$L = c \, (1/s)^{d}$

L - the measured length;

s - length of the yardstick;

c and d are constants;

Mandelbrot introduced the fractal dimension D = d + 1;

A straight line has D = 1.0;

The coast of Britain has fractal dimension 1.22;

The higher D the more fractal the curve is.

250

Heavy-Tailed Behavior vs Non-heavy-tailed behavior
