CSCI 1951-G – Optimization Methods in Finance, Part 03: (Mixed) Integer (Linear) Programming (February 9, 2018)


Page 1: CSCI 1951-G Optimization Methods in Finance Part 03 ...cs.brown.edu/courses/cs1951g/slides/03-IPTheory.pdf · Integer Programming Integer Program: some or all the variables are restricted


Roadmap

1 Introduce Integer Programming and the various sub classes of IP

2 Study the feasible region for IPs

3 Show the computational complexity of IP

4 Comment on the importance of modeling and formulation

5 Present different algorithms and strategies to solve IPs:
• Branch and bound
• Cutting planes (Gomory cuts)
• Branch and cut


Motivation example: Combinatorial auctions

• Auction: a set of items $M = \{1, \dots, m\}$ is available for sale from a seller, and $n$ bidders compete to buy some of them.

• Each bidder $j$ makes a bid $B_j = (S_j, p_j)$, where $S_j \subseteq M$ and $p_j > 0$, to buy the set $S_j$ of items at the price $p_j$, $1 \le j \le n$.

• The seller wants to maximize her profits. Which bids should she accept?

Let’s model the variables, objective function, and constraints.

Combinatorial auction

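\[
\begin{aligned}
\max\ & \sum_{j=1}^{n} p_j x_j \\
\text{s.t.}\ & \sum_{j \,:\, i \in S_j} x_j \le 1 && \text{for } i = 1, \dots, m \\
& x_j \in \{0, 1\} && \text{for } j = 1, \dots, n
\end{aligned}
\]

Looks like a Linear Program . . . but the variables are discrete.
It is an integer linear program.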
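To make the formulation concrete, here is a minimal brute-force sketch in Python: it enumerates every 0-1 vector x, keeps those in which each item is sold at most once, and returns the most profitable one. The function name `best_bids` and the sample bids are illustrative assumptions, not part of the slides, and the enumeration takes 2^n steps, so it is only meant for tiny instances.

```python
from itertools import product

def best_bids(bids):
    """bids: list of (set_of_items, price). Returns (profit, accepted bid indices)."""
    best_profit, best_choice = 0, ()
    for x in product((0, 1), repeat=len(bids)):          # x_j in {0, 1}
        chosen = [j for j, xj in enumerate(x) if xj]
        items = [i for j in chosen for i in bids[j][0]]
        if len(items) == len(set(items)):                # each item sold at most once
            profit = sum(bids[j][1] for j in chosen)     # objective: sum_j p_j x_j
            if profit > best_profit:
                best_profit, best_choice = profit, tuple(chosen)
    return best_profit, best_choice

# Hypothetical instance with items {1, 2, 3} and four bids.
bids = [({1, 2}, 5), ({2, 3}, 4), ({3}, 3), ({1}, 1)]
print(best_bids(bids))  # (8, (0, 2))
```

Here the seller accepts bids 0 and 2, which together cover items 1, 2, 3 without overlap.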


Integer Programming

Integer Program: some or all the variables are restricted to Z.

Integer Linear Program: an integer program with linear constraints and linear objective function.

Mixed/Pure (Linear) Integer Program: an IP (or ILP) s.t. some/all variables are restricted to Z, the others to R.

0-1 (Mixed/Pure) (Linear) Integer Program: an MIP/PIP/MILP/PILP s.t. all integer variables are restricted to {0, 1}.

We will mostly restrict to (0-1) MILPs.


Feasible region

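Consider the ILP

\[
\begin{aligned}
\min\ & c^T x \\
\text{s.t.}\ & Ax = b \\
& x \ge 0 \\
& x \in \mathbb{Z}^n
\end{aligned}
\]

Feasible “region”: the integer points inside a polyhedron.

[Figure: integer points inside a polyhedron. Image by Ted Ralphs.]

Finding the optimal LP solution takes polynomial time.
Finding the optimal ILP solution?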


Complexity of solving 0-1 PILP

Theorem (Karp 1972)

0-1 PILP is NP-Complete.

Proof.

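Reduction from minimum vertex cover: given a graph $G = (V, E)$, find the smallest set $S \subseteq V$ such that each $e \in E$ is incident to at least one vertex in $S$.

0-1 PILP formulation:

\[
\begin{aligned}
\min\ & \sum_{v \in V} x_v \\
\text{s.t.}\ & x_u + x_v \ge 1 && \text{for each } (u, v) \in E \\
& x_v \in \{0, 1\} && \text{for each } v \in V
\end{aligned}
\]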


LP Relaxation

[Figure: feasible region for the ILP.]

The LP defined on the polyhedron is the LP Relaxation (LPR) of the ILP.

$x_L$: optimal solution to the LPR, with obj. value $z_L$.
$x_I$: optimal solution to the ILP, with obj. value $z_I$.

What’s the relationship between $z_L$ and $z_I$? $z_L \le z_I$.
$z_L = z_I$ iff . . . the LPR has an integral optimal solution (in particular, if $x_L$ is integral).

Can we use $x_L$ to obtain $x_I$?
No, rounding the components of $x_L$ does not, in general, give $x_I$ (the rounded point may not even be feasible).

Can we say something about $\min_{\text{roundings}} \left( c^T \mathrm{rounding}(x_L) - z_I \right)$?

No: any rounding of $x_L$ may be arbitrarily far from any optimal solution for the ILP, and the change in objective value may be huge (i.e., rounding has no approximation guarantee!)


Models and formulation

Two ways to specify that a variable is binary:
• $x \in \{0, 1\}$
• $x^2 = x$

They lead to optimization problems of different classes, with different algorithms to solve them, and different complexity.

Question

Does the formulation/model impact $\lceil x_L \rceil - x_I$?

Different formulations may result in different polyhedra that better/worse approximate the convex hull of the feasible solutions of the ILP. (Homework?)


Partitioning the feasible region

Partition the feasible region $S$ of the ILP into non-overlapping subsets $S_1, \dots, S_k$.
$z^*$: optimal obj. value of the ILP; $z_i^*$: optimal obj. value of the ILP restricted to $S_i$, $1 \le i \le k$ (ILP$_i$).

Can we express $z^*$ in terms of the $z_i^*$?

\[
z^* = \min_{1 \le i \le k} z_i^*
\]

Equivalently:

\[
\min_{x \in S} c^T x = \min \left\{ \min_{x \in S_i} c^T x \;:\; 1 \le i \le k \right\}
\]

Let $x^*$ be such that $c^T x^* = z^*$. Then $x^*$ must be optimal for some $S_i$.

Idea for finding the optimal solution to the ILP

Find the optimal solution in each Si (i.e., solve subproblems)


Branch and bound

Branching

The process of dividing the original problem into subproblems.

• Can branch recursively: creates a hierarchical partitioning of the feasible region.

• The partitioning is a tree whose nodes are subsets of the feasible region (equivalently: ILPs defined on those subsets).

• The root is the whole feasible region, i.e., the original ILP.

• The ILP in a node at level $i > 0$ is obtained by adding constraints to its parent at level $i - 1$, according to a branching rule.

What happens if we keep branching?
Complete enumeration of all solutions. Good? No, let’s avoid it.

How? By pruning subtrees using upper bounds to $z^*$.


Lower and upper bounds

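We saw a Lower Bound (LB) to $z^*$ . . . the optimal obj. value of the LPR.

Are the opt. obj. values $\tilde{z}_i$ of the LPRs of ILP$_i$, $1 \le i \le k$, LBs to $z^*$?
No, each $\tilde{z}_i$ is in general only a LB to $z_i^*$. (Their minimum, $\min_i \tilde{z}_i$, is a LB to $z^*$.)

Can $\tilde{z}_i$ be an Upper Bound (UB) to $z^*$? Yes, if the LPR solution attaining $\tilde{z}_i$ is integral.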


Lower and upper bounds (cont.)

Let $u$ be a UB to $z^*$ for which we know a point $x^*$, feasible for the ILP, such that $c^T x^* = u$.

Let $\tilde{z}_i$ be the optimal obj. value of the LPR of ILP$_i$, for a specific $i$.

• If $\tilde{z}_i \ge u$ then $S_i$ . . . cannot contain a feasible solution to the ILP with lower obj. value than $x^*$ (Why?).
We can prune the node for ILP$_i$.

• If $\tilde{z}_i < u$ then we must branch recursively inside of $S_i$.
But if the solution corresponding to $\tilde{z}_i$ is integral then . . . $\tilde{z}_i$ is a better UB to $z^*$ than $u$.


Initial subproblem

Start by solving the LPR of the original ILP (i.e., on S).

1 The LPR is infeasible ⇒ the ILP is infeasible.

2 The optimal solution of the LPR is integral ⇒ it is also the optimal solution of the ILP.

3 The optimal solution of the LPR is not integral ⇒ its obj. value is a lower bound to the optimal obj. value of the ILP (useful? no).

• In cases 1 and 2, we are done.

• In case 3, we must branch: create subproblems by partitioning S,and solve the subproblems recursively.


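Branching rule

(Brainstorm)

1 Select a single variable $x_i$ whose value $v_i$ is fractional in the optimal LPR solution of the node to branch.

2 Create two subproblems (nodes):
• In one subproblem, add the constraint $x_i \le \lfloor v_i \rfloor$.
• In the other subproblem, add the constraint $x_i \ge \lceil v_i \rceil$.

Original LP rel.

\[
\begin{aligned}
\min\ & -x_1 - x_2 \\
\text{s.t.}\ & -x_1 + x_2 \le 2 \\
& 8x_1 + 2x_2 \le 19 \\
& x_1, x_2 \ge 0
\end{aligned}
\]

Optimal solution is $x_1 = 1.5$, $x_2 = 3.5$, obj. val. $-5$.

1st Modified ILP

\[
\begin{aligned}
\min\ & -x_1 - x_2 \\
\text{s.t.}\ & -x_1 + x_2 \le 2 \\
& 8x_1 + 2x_2 \le 19 \\
& x_1 \le 1 \\
& x_1, x_2 \ge 0 \\
& x_1, x_2 \in \mathbb{Z}
\end{aligned}
\]

2nd Modified ILP

\[
\begin{aligned}
\min\ & -x_1 - x_2 \\
\text{s.t.}\ & -x_1 + x_2 \le 2 \\
& 8x_1 + 2x_2 \le 19 \\
& x_1 \ge 2 \\
& x_2 \ge 0 \\
& x_1, x_2 \in \mathbb{Z}
\end{aligned}
\]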


Impact on the feasible region

(This is not the feasible region for the problems in the previous slide)


After branching

After branching, we solve each subproblem recursively.

We stop when either:
• the optimal obj. value of the LPR of the current problem is larger than (or equal to) the current upper bound to the optimal obj. value of the original ILP; or

• the LPR is infeasible.

Branch-and-bound

We avoid exploring (branching into) subsets of the feasible region because the bounds tell us that they do not contain any better solution than the best one found so far.


LP-based Branch-and-Bound algorithm

L: data structure that will contain pairs $(P, r)$, where $P$ is an ILP and $r$ is a number.

1 L ← $\{(P_0, +\infty)\}$, where $P_0$ is the original ILP.

2 Upper bound $u \leftarrow +\infty$, vector $x^* \leftarrow$ infeasible.

3 Pop an ILP $P$ from L and solve its LPR. Let $\tilde{z}_P$ be the optimal obj. value and $\tilde{x}_P$ the optimal solution of the LPR.
• If the LPR is infeasible or $\tilde{z}_P \ge u$, the node for $P$ . . . can be pruned (go to Step 4).
• Else if $\tilde{z}_P < u$ and $\tilde{x}_P$ is integral, then . . . $u \leftarrow \tilde{z}_P$, $x^* \leftarrow \tilde{x}_P$, and . . . remove all pairs $(X, r)$ from L s.t. $r \ge u$ (go to Step 4).
• Else, branch: create subproblems $P_1$ and $P_2$, and add $(P_1, \tilde{z}_P)$ and $(P_2, \tilde{z}_P)$ to L.

4 If L $\ne \emptyset$, go to Step 3. Else, terminate with output $(x^*, u)$.
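The steps above can be sketched in Python on the two-variable example from the branching-rule slide, taking the objective to be min −x1 − x2 (the choice under which x1 = 1.5, x2 = 3.5 is the LP optimum). Everything here is an illustrative assumption rather than the course's implementation: `solve_lpr` is a toy LP solver that enumerates vertices and only works for bounded two-variable problems, the list L is used as a stack, the branching variable is simply the first fractional one, and instead of storing pairs (P, r) and purging L, each node is checked against the incumbent when it is popped.

```python
from fractions import Fraction
from itertools import combinations
from math import floor, ceil

def solve_lpr(c, cons):
    """Minimize c.x subject to a.x <= b for every (a, b) in cons.

    Enumerates intersections of constraint pairs, so it is only valid for
    bounded LPs in two variables. Returns (value, x), or None if infeasible.
    """
    best = None
    for (a, b), (d, e) in combinations(cons, 2):
        det = a[0] * d[1] - a[1] * d[0]
        if det == 0:
            continue  # parallel lines: no vertex
        x = (Fraction(b * d[1] - a[1] * e, det),
             Fraction(a[0] * e - b * d[0], det))
        if all(g[0] * x[0] + g[1] * x[1] <= h for g, h in cons):
            value = c[0] * x[0] + c[1] * x[1]
            if best is None or value < best[0]:
                best = (value, x)
    return best

def branch_and_bound(c, cons):
    u, x_star = None, None        # Step 2: no incumbent yet (u = +infinity)
    open_nodes = [cons]           # Step 1: the list L, used here as a stack
    while open_nodes:             # Step 4
        node = open_nodes.pop()   # Step 3
        lpr = solve_lpr(c, node)
        if lpr is None:
            continue              # LPR infeasible: prune
        z, x = lpr
        if u is not None and z >= u:
            continue              # bound no better than the incumbent: prune
        fractional = [i for i in (0, 1) if x[i].denominator != 1]
        if not fractional:
            u, x_star = z, x      # integral LPR solution: new incumbent
            continue
        i = fractional[0]         # branch on the first fractional variable
        unit = (1, 0) if i == 0 else (0, 1)
        neg = (-unit[0], -unit[1])
        open_nodes.append(node + [(unit, floor(x[i]))])   # x_i <= floor(v_i)
        open_nodes.append(node + [(neg, -ceil(x[i]))])    # x_i >= ceil(v_i)
    return u, x_star

# min -x1 - x2  s.t.  -x1 + x2 <= 2,  8x1 + 2x2 <= 19,  x1, x2 >= 0
constraints = [((-1, 1), 2), ((8, 2), 19), ((-1, 0), 0), ((0, -1), 0)]
u, x_star = branch_and_bound((-1, -1), constraints)
print(u, x_star)
```

On this instance the search ends with objective value −4 at x = (1, 3); exact `Fraction` arithmetic is used so that the integrality test is not affected by rounding error.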


Details to discuss

Details of the algorithm that are left open:
• How do we select which variable to branch on if there are multiple fractional ones?

• How do we select the next candidate subproblem to process?

Remark

The trade-off space for the above choices is extremely large. “Optimizing” branch-and-bound is, in itself, a difficult optimization problem, and still the subject of a lot of research.

We also want to obtain good upper bounds as fast as possible (why?). How?

(Obtaining good upper bounds faster allows us to prune larger regions, hence to terminate earlier.)


Branching

In the optimal solution of an LPR, many variables may be fractional.

Question

How do we select which variable to branch on?

Question

What do we look for in a variable to branch on?

Brainstorm.

Answer

We want the two resulting subproblems to both give much better lower bounds than the subproblem we are branching.

We will present heuristics to achieve this goal.


Branching (cont.)

Most Infeasible Branching

Choose the variable whose fractional part is closest to 0.5.

Reason: The closest feasible solutions for the LPRs of the subproblems will both have large distances from the optimal solution to the current relaxation.

Underlying assumption: large distance implies large difference in objective value. Yes/No?
It depends on the objective function!


Branching (cont.)

• Suppose we are considering a subproblem $P$. Let $\tilde{z}_P$ be the optimal obj. value of the LPR of $P$. Assume that the corresponding solution is not integral. Let $v_i$ be the value taken by the variable $x_i$.

• For $1 \le i \le n$, let $P_{i,-}$ be the subproblem with added constraint $x_i \le \lfloor v_i \rfloor$. Let $z_{i,-}$ be the optimal obj. value of the LPR of $P_{i,-}$.

• Let $P_{i,+}$ be the subproblem with added constraint $x_i \ge \lceil v_i \rceil$, and define $z_{i,+}$ similarly.

• We would like to branch on a variable such that $D_{i,-} = z_{i,-} - \tilde{z}_P$ and $D_{i,+} = z_{i,+} - \tilde{z}_P$ are both large.

Possible branching rules

• Branch on the variable for which $\min\{D_{i,-}, D_{i,+}\}$ is maximized.

• Branch on the variable for which $D_{i,-} + D_{i,+}$ is maximized.

• Combine the above, with weights.
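The three rules can be written as one scoring function. The sketch below is only an illustration: the name `choose_branching_variable`, the weight `alpha`, and the sample gains are assumptions made here. With `alpha = 1` it reduces to the min rule, with `alpha = 0` to the sum rule, and values in between give a weighted combination.

```python
def choose_branching_variable(gains, alpha=1.0):
    """gains maps a variable to its pair (D_minus, D_plus).

    Score = alpha * min(D_minus, D_plus) + (1 - alpha) * (D_minus + D_plus);
    the variable with the largest score is returned.
    """
    def score(d):
        return alpha * min(d) + (1 - alpha) * sum(d)
    return max(gains, key=lambda i: score(gains[i]))

# Hypothetical gains for two fractional variables.
gains = {"x1": (0.5, 4.0), "x2": (1.5, 2.0)}
print(choose_branching_variable(gains, alpha=1.0))  # x2 (best worst-case gain)
print(choose_branching_variable(gains, alpha=0.0))  # x1 (best total gain)
```

The two rules disagree on this data, which is why a weighted combination is sometimes preferred.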


Strong Branching

The strategy of computing $D_{i,-}$ and $D_{i,+}$ explicitly for each $i$ is known as Strong Branching.
• Strong Branching reduces the size of the branch-and-bound tree by a factor of 20x or more, in most cases, w.r.t. Most Infeasible Branching.

• Computationally, it can be . . . expensive: . . . we are solving two LPs for each variable (although not from scratch).

• More reasonable strategy: only evaluate $D_{i,-}$ and $D_{i,+}$ for some variables, e.g., those with fractional part closest to 0.5.


The relative value of time

Where would you spend more time looking for good lower bounds? In the top part of the exploration tree or later in the execution?

It is reasonable to spend more time in evaluating $D_{i,-}$ and $D_{i,+}$ at the top of the tree.

The value of time is relative to the achieved improvement in the lower bound.
This leads to the concept of pseudocosts.


Pseudocosts (cont.)

For a variable $x_i$ with value $v_i$, let $f_i = v_i - \lfloor v_i \rfloor$ be its fractional part.

Definition (Pseudocosts)

If $f_i > 0$, define the up and down pseudocosts as:

\[
C_{i,+} = \frac{D_{i,+}}{1 - f_i}, \qquad C_{i,-} = \frac{D_{i,-}}{f_i}
\]

$C_{i,+}$ and $C_{i,-}$ tend to be almost constant in all nodes of the branch-and-bound tree.

Don’t compute them exactly at each node, rather estimate them!


Pseudocosts estimation

• Initialize the pseudocost estimates $\tilde{C}_{i,+}$ and $\tilde{C}_{i,-}$ with strong branching at the root node (or whenever a variable becomes fractional for the first time).

• When branching on $x_i$, update $\tilde{C}_{i,+}$ and $\tilde{C}_{i,-}$ as the averages of the observed pseudocosts over all nodes where we branched on $x_i$.

• When choosing which variable to branch on, estimate $D_{i,+}$ and $D_{i,-}$ as $\tilde{C}_{i,+}(1 - f_i)$ and $\tilde{C}_{i,-} f_i$, and use the estimates to decide which variable to branch on (e.g., choosing the $x_i$ that maximizes $\min\{D_{i,-}, D_{i,+}\}$).
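The bookkeeping described above might look as follows. This is a minimal sketch: the class name `Pseudocosts`, its method names, and the sample numbers are assumptions made for illustration, and the initialization by strong branching is left to the caller, who would simply call `record` with the gains measured at the root.

```python
class Pseudocosts:
    """Running averages of observed up ('+') and down ('-') pseudocosts."""

    def __init__(self):
        self.total = {}   # (variable, direction) -> sum of observed pseudocosts
        self.count = {}   # (variable, direction) -> number of observations

    def record(self, var, direction, gain, f):
        """Store one observation made after branching on `var`.

        gain is D_{i,+} for direction '+' and D_{i,-} for direction '-';
        f is the fractional part of the variable at the branched node.
        """
        per_unit = gain / (1 - f) if direction == "+" else gain / f
        key = (var, direction)
        self.total[key] = self.total.get(key, 0.0) + per_unit
        self.count[key] = self.count.get(key, 0) + 1

    def average(self, var, direction):
        key = (var, direction)
        return self.total[key] / self.count[key]

    def estimate(self, var, f):
        """Estimated (D_minus, D_plus) for a node where var has fractional part f."""
        return (self.average(var, "-") * f, self.average(var, "+") * (1 - f))

pc = Pseudocosts()
pc.record("x1", "+", gain=1.0, f=0.5)    # observed up pseudocost 1.0 / 0.5  = 2.0
pc.record("x1", "+", gain=3.0, f=0.25)   # observed up pseudocost 3.0 / 0.75 = 4.0
pc.record("x1", "-", gain=1.0, f=0.5)    # observed down pseudocost 1.0 / 0.5 = 2.0
print(pc.estimate("x1", 0.25))           # (0.5, 2.25)
```

The estimate costs two multiplications per variable, compared with two LP solves per variable for strong branching.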


Search strategy

During the execution of branch-and-bound, the list L may contain multiple subproblems.

Question

How do we select the next candidate subproblem to process?

Let’s ask the real question first:

Question

What do we look for in a node that would make us select it?

We have two goals:
1 Find good ILP feasible solutions to decrease the upper bound $u$; and

2 Prove that the current best feasible solution is optimal, by increasing the lower bound as quickly as possible.

(Brainstorm)


Search strategy (cont.)

To achieve the first goal (find good ILP feasible solutions):

Best Estimate Criterion

1 Estimate the value of the best feasible solution in each subproblem $P_i$ in L as

\[
E_i = r_{P_i} + \sum_{j=1}^{n} \min\left\{ \tilde{C}^{(i)}_{j,-} f^{(i)}_j,\ \tilde{C}^{(i)}_{j,+} \left(1 - f^{(i)}_j\right) \right\}
\]

2 Choose the subproblem that most decreases (in estimation) the upper bound.

Intuition on the estimation

Round the non-integral solution to a nearby integral solution, and use the pseudocosts to estimate the change in the objective value.
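As a small illustration of the estimate, the function below adds to the node's bound r the cheaper of the two estimated rounding costs for each variable. The function name `best_estimate` and the numbers are assumptions chosen here; variables that are already integral (fractional part 0) contribute nothing.

```python
def best_estimate(r, fracs, c_minus, c_plus):
    """E = r + sum_j min(C_minus[j] * f[j], C_plus[j] * (1 - f[j]))."""
    return r + sum(min(cm * f, cp * (1 - f))
                   for f, cm, cp in zip(fracs, c_minus, c_plus))

# Hypothetical node: bound 10, three variables with fractional parts 0.5, 0.25, 0.
estimate = best_estimate(10.0,
                         fracs=[0.5, 0.25, 0.0],
                         c_minus=[2.0, 4.0, 1.0],
                         c_plus=[6.0, 1.0, 1.0])
print(estimate)  # 11.75
```

The first variable is cheaper to round down (cost 1.0 versus 3.0) and the second cheaper to round up (0.75 versus 1.0), giving 10 + 1.0 + 0.75.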


Node selection (cont.)

To achieve the second goal (prove that the current best feasible solution is optimal), we act differently depending on whether we already achieved the first goal (find good ILP feasible solutions).

Depth-First Search strategy (if good upper bound available)

Go deep until you prune, then go back and go the other way.

• Advantage: The LPRs of successive nodes are small variations of each other: they can be solved fast in sequence.

• Disadvantage: If we do not have a good upper bound, we may explore many nodes with a value larger than the optimal.

Best-first search strategy

Select the node with the best lower bound.

Why? That node cannot be pruned by exploring other nodes: we must explore it at some point.
Advantage: BFS (best-first search) minimizes the total number of nodes in the tree.


The following material is inspired by or taken from the course Integer Programming, ISE 418, by Dr. Ted Ralphs.


Branch and bound tree

[Figure: BaB tree for a minimization IP after 400 nodes.]

Node height is the LPR optimal obj. value (a . . . lower bound for the IP optimal obj. value).

Node color:
• red: candidates for processing/branching;

• green: branched or infeasible;

• blue: pruned by bound (maybe gave feasible sol.) or infeasible.

Red line: obj. value of the best feasible sol. found (global . . . upper bound).

The level of the highest red node is the global lower bound.


Branch and bound tree


The many trade-offs of branch and bound

• Goal: solve the problem as fast as possible;

• Assuming the time to process a node (e.g., solve its LPR) is a constant $p$, and BaB processes $\ell$ nodes, the overall running time is $p \times \ell$;

• Different branching and exploration strategies give different $\ell$;

What about changing $p$?

For now we only looked at LPR. Are there other relaxations?


Relaxations

IP:

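\[
z_I = \min\{c^T x : x \in S\}, \quad \text{with } S = \{x \in \mathbb{Z}^n_+ : Ax \le b\}
\]

Relaxation of an IP: a minimization problem

\[
z_R = \min\{f_R(x) : x \in S_R\}
\]

with the following properties:

\[
S \subseteq S_R
\]

\[
c^T x \ge f_R(x) \quad \text{for every } x \in S
\]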


Relaxation

What’s the goal of a relaxation? Obtain a lower bound to $z_I$.

What is a desirable computational property of a relaxation?

The relaxation should be much easier to solve optimally than the original problem.

Types of relaxations (there are others):
• LP relaxation

• Combinatorial relaxation

• Lagrangian relaxation (Prefer “Lagrangean”? Fine!)

LPR and CR drop constraints, LR relaxes them.


CR Example: Traveling Salesman Problem

TSP: combinatorial problem on a clique $G = (V, E)$.
• $V = [n]$: set of $n$ cities

• $E$ = all pairs of different nodes: travel links between cities; each edge $(i, j)$ has cost $c_{i,j}$

Goal: find a tour of all the nodes minimizing the cost of the traversed edges.

In general, TSP is NP-hard (its decision version is NP-Complete).

ILP formulation?


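ILP formulation of TSP

Variables:

\[
x_{ij} =
\begin{cases}
1 & \text{tour moves from city } i \text{ to city } j \\
0 & \text{otherwise}
\end{cases}
\]

$u_i \in \mathbb{Z}$: dummy variables

Problem:

\[
\begin{aligned}
\min\ & \sum_{i=1}^{n} \sum_{j=1,\, j \ne i}^{n} c_{ij} x_{ij} \\
\text{s.t.}\ & \sum_{i=1,\, i \ne j}^{n} x_{ij} = 1 && j = 1, \dots, n \\
& \sum_{j=1,\, j \ne i}^{n} x_{ij} = 1 && i = 1, \dots, n \\
& u_i - u_j + n x_{ij} \le n - 1 && 2 \le i \ne j \le n
\end{aligned}
\]

The last set of constraints ensures that . . . there is a single tour covering all cities.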


Combinatorial relaxation for TSP

Build a Minimum 1-Tree (M1T):
1 Take away a vertex $x$ from the graph.

2 Find the Minimum Spanning Tree (MST) $T$ for the remaining graph.

3 Connect $x$ to $T$ using the 2 cheapest edges incident to $x$.

Theorem: The minimum 1-tree problem is a relaxation of the TSP.

Before the formal proof, what constraints of TSP are dropped in M1T?

No single tour enforced and no 2-degree constraints.

Why is M1T easier to solve than TSP? Finding the MST $T$ takes polynomial time (e.g., Prim’s algorithm takes $O(n^2)$ on a clique).
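The three-step construction can be sketched in Python for a symmetric cost matrix. The function names and the 4-city matrix are illustrative assumptions; the MST step is a plain O(n²) version of Prim's algorithm, and a brute-force tour search is included only to compare the bound with the true optimum on a tiny instance.

```python
from itertools import permutations

def one_tree_bound(cost, special=0):
    """Cost of a minimum 1-tree: MST on all nodes except `special`,
    plus the two cheapest edges incident to `special`."""
    n = len(cost)
    rest = [v for v in range(n) if v != special]
    # Prim's algorithm on the graph without `special`, started from rest[0].
    dist = {v: cost[rest[0]][v] for v in rest[1:]}  # cheapest link into the tree
    mst = 0
    while dist:
        v = min(dist, key=dist.get)       # closest node not yet in the tree
        mst += dist.pop(v)
        for w in dist:                    # update links through the new node
            dist[w] = min(dist[w], cost[v][w])
    two_cheapest = sorted(cost[special][v] for v in rest)[:2]
    return mst + sum(two_cheapest)

def tsp_brute_force(cost):
    """Optimal tour cost by enumeration (only for very small n)."""
    n = len(cost)
    tours = ((0,) + p for p in permutations(range(1, n)))
    return min(sum(cost[t[k]][t[(k + 1) % n]] for k in range(n)) for t in tours)

# Hypothetical symmetric costs between 4 cities.
cost = [[0, 1, 4, 3],
        [1, 0, 2, 5],
        [4, 2, 0, 1],
        [3, 5, 1, 0]]
print(one_tree_bound(cost), tsp_brute_force(cost))  # 7 7
```

On this instance the minimum 1-tree happens to be a tour, so the lower bound is tight; in general it is only guaranteed not to exceed the optimal tour cost.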


Minimum 1-tree is a relaxation of TSP

What do we need to show?

Is $S_{TSP} \subseteq S_{M1T}$?
Yes, easy to see.

Is $z_{M1T} \le z_{TSP}$?
1 The two edges incident to $x$ in the opt. sol. of TSP cannot be cheaper than those in the sol. of M1T.

2 The opt. sol. of TSP must touch all the nodes in $V \setminus \{x\}$. The MST $T$ is the cheapest way to touch them all.


Lagrangian Relaxation

For many combinatorial problems, it is possible to find combinatorial relaxations. What about other problems?

The Lagrangian relaxation removes some constraints and incorporates them in the objective function.

The goal is to penalize the violation of the dropped constraints.

We will talk a lot about Lagrangian relaxation when dealing with quadratic optimization.


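Example of Lagrangian Relaxation

Original IP:

\[
\begin{aligned}
\min\ & c^T x \\
\text{s.t.}\ & A'x \le b' \\
& A''x \le b'' \\
& x \in \mathbb{Z}^n_+
\end{aligned}
\]

Assume we know that optimization over $S_R = \{x \in \mathbb{Z}^n_+ : A'x \le b'\}$ is “easy” (from an oracle).

Lagrangian relaxation LR($u$):

\[
\begin{aligned}
\min\ & (c^T + uA'')x - ub'' \\
\text{s.t.}\ & A'x \le b' \\
& x \in \mathbb{Z}^n_+
\end{aligned}
\]

For any $u \ge 0$, LR($u$) is a relaxation of the IP. (why?) Think of $u$ as a set of “dual” variables, maximize over $u$ to find the best LB.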
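One way to answer the “why?” is to check the two defining properties of a relaxation directly. The short derivation below is a sketch that uses only the notation of this slide, with $S$ denoting the feasible set of the original IP. First, dropping constraints can only enlarge the feasible set:

\[
S = \{x \in \mathbb{Z}^n_+ : A'x \le b',\ A''x \le b''\} \subseteq \{x \in \mathbb{Z}^n_+ : A'x \le b'\} = S_R .
\]

Second, for every $x \in S$ and every $u \ge 0$ the new objective does not exceed the original one:

\[
(c^T + uA'')x - ub'' = c^T x + u\,(A''x - b'') \le c^T x ,
\]

because $u \ge 0$ and $A''x - b'' \le 0$ for $x \in S$. Both properties hold, so the optimal value of LR($u$) is a lower bound to the optimal value of the IP.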


Cutting planes

• Branching involves generating constraints dynamically to split the feasible region into subsets.

• The added constraints are generated “on the fly” using the opt. sol. to the LPR at the present node.

• The goal was not to reduce the feasible region for the LP relaxation, but in practice it happened (how?)

Question

Can we add constraints that directly reduce the feasible region?

Why would we want to do it?


Cutting planes (cont.)

Our goal is to make the feasible region of the LP relaxation closer to the feasible region for the ILP.

Doing so would result in tighter bounds:
1 “More likely” that the opt. sol. to the LPR is integral.

2 “More likely” that the opt. obj. val. to the LPR is close to the opt. obj. val. to the ILP subproblem.

Question

Which constraints should we try to add?


Cutting planes (cont.)

A cutting plane is a constraint that is satisfied by all ILP feasible sols., but violated by the opt. sol. of the (current) LPR.

The goal of adding cutting planes:
Improve the bound produced by the LP relaxation by removing subsets of the LP relaxation feasible region where
1 there are no integral points; and

2 the objective value is low.

There are many ways of building cutting planes. Basic idea:
• Take combinations of the constraints;

• Use rounding to produce stronger ones.


$P = \{x \in \mathbb{R}^n_+ : Ax \le b\}$: feasible region of the LPR.
With one pathological exception (see the note below), all valid inequalities for $P$ are either equivalent to or dominated by an inequality of the form

\[
uAx \le ub, \quad u \in \mathbb{R}^m_+
\]

Let $c(S)$ be the convex hull of the feasible region of the original IP.

All inequalities valid for $P$ are also valid for $c(S)$.
Are they cutting planes?
Not necessarily.

How to do better? If $a \le b$ and $a \in \mathbb{Z}$ and $b \in \mathbb{R}$, then . . . $a \le \lfloor b \rfloor$.

Note (the pathological exception): when one or more variables have no explicit upper bound and both primal and dual problems are infeasible. We ignore this case, and assume $A$ contains explicit upper bounds for all variables.


Perfect Matching problem

• $n$ people need to be paired in teams of two.

• $c_{ij}$: “cost” of the team formed by $i$ and $j$ (inefficiency). It may be infinite.

• Want to maximize efficiency over all teams.

Think of it as a problem on a graph $G = ([n], E)$. Nodes are people, edges are pairings of non-infinite cost.

IP formulation:
$x_{ij} = 1$ if endpoints $(i, j)$ are matched, 0 otherwise

\[
\begin{aligned}
\min\ & \sum_{(i,j) \in E} c_{ij} x_{ij} \\
\text{s.t.}\ & \sum_{\{j \,:\, (i,j) \in E\}} x_{ij} = 1 && \text{for each } i \in [n] \\
& x_{ij} \in \{0, 1\} && \text{for each } (i, j) \in E
\end{aligned}
\]


The odd set inequalities

U : odd set of nodes.

Cutset of $U$: $\delta(U) = \{(i, j) \in E : i \in U, j \notin U\}$

Every perfect matching contains at least one edge from every odd cutset.
Every odd cutset induces a valid inequality:

\[
\sum_{(i,j) \in \delta(U)} x_{ij} \ge 1, \quad U \subseteq [n],\ |U| \text{ odd}
\]


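Generating new inequalities

1 Consider an odd set of nodes $U$, and let $E(U)$ be the set of edges with both endpoints in $U$.

2 Sum the constraints $\sum_{\{j \,:\, (i,j) \in E\}} x_{ij} \le 1$ for $i \in U$ to obtain

\[
2 \sum_{(i,j) \in E(U)} x_{ij} + \sum_{(i,j) \in \delta(U)} x_{ij} \le |U|
\]

3 Divide through by two and drop the second term of the sum to obtain:

\[
\sum_{(i,j) \in E(U)} x_{ij} \le \frac{1}{2} |U|
\]

4 Can we go a step further?

\[
\sum_{(i,j) \in E(U)} x_{ij} \le \left\lfloor \frac{1}{2} |U| \right\rfloor
\]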


The Chvátal-Gomory procedure

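Let $A = (a_1, a_2, \dots, a_n)$, where $a_j$ is the $j$-th column of $A$.
1 Choose a weight vector $u \in \mathbb{R}^m_+$.

2 Obtain the valid inequality

\[
\sum_{j \in [n]} (u a_j)\, x_j \le ub
\]

3 Round coefficients down:

\[
\sum_{j \in [n]} \lfloor u a_j \rfloor\, x_j \le ub \quad \text{(why is it valid?)}
\]

4 Round the rhs to get:

\[
\sum_{j \in [n]} \lfloor u a_j \rfloor\, x_j \le \lfloor ub \rfloor
\]

For pure IP, any inequality valid for $c(S)$ can be obtained by a finite number of iterations of this procedure!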
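The four steps translate almost line by line into code. The sketch below is an illustration under stated assumptions: the function name `chvatal_gomory_cut` is made up here, exact `Fraction` weights are used to avoid floating-point rounding, and the example reuses the matching degree constraints on a triangle, where weights of 1/2 give the odd-set inequality for |U| = 3.

```python
from fractions import Fraction
from math import floor

def chvatal_gomory_cut(A, b, u):
    """Return (coeffs, rhs) of the C-G cut sum_j coeffs[j] * x_j <= rhs.

    A is a list of rows, b the right-hand sides, u nonnegative weights.
    Validity relies on x >= 0 and x integer.
    """
    m, n = len(A), len(A[0])
    # Steps 2 and 3: combine the rows with weights u, then round each coefficient down.
    coeffs = [floor(sum(u[i] * A[i][j] for i in range(m))) for j in range(n)]
    # Step 4: round the right-hand side down.
    rhs = floor(sum(u[i] * b[i] for i in range(m)))
    return coeffs, rhs

# Degree constraints of a triangle, variables ordered (x12, x13, x23).
A = [[1, 1, 0],   # node 1: x12 + x13 <= 1
     [1, 0, 1],   # node 2: x12 + x23 <= 1
     [0, 1, 1]]   # node 3: x13 + x23 <= 1
b = [1, 1, 1]
half = Fraction(1, 2)
print(chvatal_gomory_cut(A, b, [half, half, half]))  # ([1, 1, 1], 1)
```

The resulting cut x12 + x13 + x23 ≤ 1 removes the fractional point (1/2, 1/2, 1/2), which satisfies all three degree constraints.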


The Chvátal-Gomory procedure

The major challenges in the C-G procedure are numerical errors and slow convergence.

The inequalities may be very weak: they may not even touch $c(S)$.

Rounding “pushes” the inequality to the nearest point in $\mathbb{Z}^n$, which may not be sufficient.

For the generated hyperplane to have integral points, the coefficients of the inequality must be relatively prime.

Theorem: Let $S = \{x \in \mathbb{Z}^n : \sum_{j=1}^{n} a_j x_j \le b\}$, where $a_j \in \mathbb{Z}$ for $1 \le j \le n$. Let $k = \gcd(a_1, \dots, a_n)$.
Then $c(S) = \{x \in \mathbb{R}^n : \sum_{j=1}^{n} (a_j / k)\, x_j \le \lfloor b / k \rfloor\}$.
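As a small illustration of the theorem (the numbers are chosen here and do not come from the slides), take $S = \{x \in \mathbb{Z}^2 : 2x_1 + 4x_2 \le 7\}$. Then $k = \gcd(2, 4) = 2$ and

\[
c(S) = \left\{ x \in \mathbb{R}^2 : x_1 + 2x_2 \le \left\lfloor \tfrac{7}{2} \right\rfloor = 3 \right\},
\]

which is strictly tighter than the inequality $x_1 + 2x_2 \le 3.5$ obtained by dividing by $k$ without rounding. The coefficients 1 and 2 are relatively prime, so the hyperplane $x_1 + 2x_2 = 3$ contains integral points such as $(1, 1)$ and $(3, 0)$.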


Gomory cuts

Let $T$ be the set of solutions to an IP with one equation:

\[
T = \left\{ x \in \mathbb{Z}^n_+ : \sum_{j=1}^{n} a_j x_j = a_0 \right\}
\]

For each $j$ let $f_j = a_j - \lfloor a_j \rfloor$. Then every $x \in T$ satisfies:

\[
\sum_{j=1}^{n} f_j x_j = f_0 + k \quad \text{for some integer } k
\]

Since $\sum_{j=1}^{n} f_j x_j \ge 0$ and $f_0 < 1$, then $k \ge 0$, and so

\[
\sum_{j=1}^{n} f_j x_j \ge f_0
\]

is a valid inequality for $T$, called a Gomory cut.
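Computing the cut only requires the fractional parts of the coefficients. The sketch below is illustrative: the function name `gomory_cut` and the sample equation are assumptions made here, and exact `Fraction` arithmetic is used so that fractional parts are not distorted by floating-point error.

```python
from fractions import Fraction
from math import floor

def gomory_cut(a, a0):
    """Return (f, f0) describing the cut sum_j f[j] * x_j >= f0
    for the equation sum_j a[j] * x_j = a0 over nonnegative integers."""
    frac = lambda t: t - floor(t)   # fractional part, also correct for negative t
    return [frac(t) for t in a], frac(a0)

# Hypothetical equation: x1 + (1/4) x2 - (3/4) x3 = 7/2
row = [Fraction(1), Fraction(1, 4), Fraction(-3, 4)]
f, f0 = gomory_cut(row, Fraction(7, 2))
print(f, f0)
```

For this equation the cut reads (1/4) x2 + (1/4) x3 ≥ 1/2; note that the fractional part of −3/4 is 1/4, not 3/4.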


Gomory cuts

When solving a general IP, we can get the set $T$ used in the previous slide from a valid inequality for $P$.
Choose $\lambda \in \mathbb{R}^m$ so that $\pi = \lambda A$ and $\pi_0 = \lambda b$ are such that $\pi x \le \pi_0$ is a valid inequality for $P$.
Add a slack variable $s$ to obtain an equation and consider

\[
T = \left\{ (x, s) \in \mathbb{Z}^n_+ \times \mathbb{R}_+ : \sum_{j=1}^{n} \pi_j x_j + s = \pi_0 \right\}
\]

The Gomory cut for $T$ is

\[
\sum_{j=1}^{n} \left( \lambda A_j - \lfloor \lambda A_j \rfloor \right) x_j \ge \lambda b - \lfloor \lambda b \rfloor .
\]

Does it look like anything we have already seen?
It is a C-G inequality with weights $u_i = \lambda_i - \lfloor \lambda_i \rfloor$.
Using the optimal simplex tableau of the LPR, one can obtain Gomory cuts that are violated by the opt. sol. of the LPR.


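GMI cuts

• Consider a Mixed ILP in standard form:

\[
\begin{aligned}
\min\ & c^T x \\
\text{s.t.}\ & Ax = b \\
& x \ge 0 \\
& x_j \in \mathbb{Z} && \text{for } j \in I \\
& x_j \in \mathbb{R} && \text{for } j \in C
\end{aligned}
\]

• Assume that there is one component of $b$ that is fractional.

• Consider one equation for which the r.h.s. is fractional, e.g.:

\[
\sum_{j \in I} a_{ij} x_j + \sum_{j \in C} a_{ij} x_j = b_i
\]

• Let $f_0 = b_i - \lfloor b_i \rfloor$ and $f_j = a_{ij} - \lfloor a_{ij} \rfloor$ for $j \in I$.
For any ILP feasible solution, rewrite the above as:

\[
\sum_{j \in I : f_j \le f_0} f_j x_j + \sum_{j \in I : f_j > f_0} (f_j - 1)\, x_j + \sum_{j \in C} a_{ij} x_j = k + f_0
\]

for some integer $k$.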


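GMI cuts (cont.)

\[
\sum_{j \in I : f_j \le f_0} f_j x_j + \sum_{j \in I : f_j > f_0} (f_j - 1)\, x_j + \sum_{j \in C} a_{ij} x_j = k + f_0
\]

for some integer $k$.
Since $k \ge 0$ or $k \le -1$, it must hold that either

\[
\sum_{j \in I : f_j \le f_0} \frac{f_j}{f_0} x_j - \sum_{j \in I : f_j > f_0} \frac{1 - f_j}{f_0} x_j + \sum_{j \in C} \frac{a_{ij}}{f_0} x_j \ge 1
\]

or

\[
- \sum_{j \in I : f_j \le f_0} \frac{f_j}{1 - f_0} x_j + \sum_{j \in I : f_j > f_0} \frac{1 - f_j}{1 - f_0} x_j - \sum_{j \in C} \frac{a_{ij}}{1 - f_0} x_j \ge 1
\]

I.e., “$\sum c_j x_j \ge 1$ or $\sum d_j x_j \ge 1$”, which implies for $x \ge 0$: . . .

\[
\sum \max\{c_j, d_j\}\, x_j \ge 1
\]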


GMI cuts (cont.)

“$\sum c_j x_j \ge 1$ or $\sum d_j x_j \ge 1$” implies

\[
\sum \max\{c_j, d_j\}\, x_j \ge 1
\]

For each variable $x_j$, one of $c_j$ or $d_j$ is nonnegative and the other is nonpositive, so the max is immediate:

\[
\sum_{j \in I : f_j \le f_0} \frac{f_j}{f_0} x_j + \sum_{j \in I : f_j > f_0} \frac{1 - f_j}{1 - f_0} x_j + \sum_{j \in C : a_{ij} > 0} \frac{a_{ij}}{f_0} x_j - \sum_{j \in C : a_{ij} < 0} \frac{a_{ij}}{1 - f_0} x_j \ge 1
\]

• This inequality holds for all $x \ge 0$ that satisfy the original equation and for which $x_j \in \mathbb{Z}$ for all $j \in I$.

• It holds for all ILP feasible solutions . . . and may not hold for some feasible solutions of the LP relaxation, i.e., it may reduce the difference between the convex hull of the ILP feasible region and the LP relaxation feasible region, leading to better bounds!
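The coefficients of the final inequality can be computed directly from one row of the system. The sketch below is an illustration under stated assumptions: the function name `gmi_cut`, its argument layout (integer-variable coefficients and continuous-variable coefficients passed separately), and the sample row are all made up here; `Fraction` keeps the arithmetic exact.

```python
from fractions import Fraction
from math import floor

def gmi_cut(int_coeffs, cont_coeffs, rhs):
    """Coefficients of the GMI cut (int_part . x_I + cont_part . x_C >= 1)
    for the row  sum_I a_j x_j + sum_C a_j x_j = rhs  with fractional rhs."""
    f0 = rhs - floor(rhs)
    if f0 == 0:
        raise ValueError("right-hand side must be fractional")
    int_part = []
    for a in int_coeffs:
        f = a - floor(a)                               # fractional part f_j
        int_part.append(f / f0 if f <= f0 else (1 - f) / (1 - f0))
    # Continuous variables: a/f0 if a > 0, -a/(1 - f0) otherwise.
    cont_part = [a / f0 if a > 0 else -a / (1 - f0) for a in cont_coeffs]
    return int_part, cont_part

# Hypothetical row: (1/8) x1 + (3/4) x2 + (1/2) y1 - (3/2) y2 = 13/4
ints, conts = gmi_cut([Fraction(1, 8), Fraction(3, 4)],
                      [Fraction(1, 2), Fraction(-3, 2)],
                      Fraction(13, 4))
print(ints, conts)
```

Here f0 = 1/4, so the cut is (1/2) x1 + (1/3) x2 + 2 y1 + 2 y2 ≥ 1; every coefficient is nonnegative, as the max construction guarantees.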


Branch and cut

• Cutting planes are not very efficient: they only “cut” a very small part of the LPR feasible region.

• Mixing branch-and-bound and cutting planes may lead to great speedups:
  • During branch-and-bound, after solving an LP relaxation at a node, generate new cuts.
  • Cuts may be local or global, i.e., may hold just for the subproblem at this node (and its descendants) or be valid for the whole feasible region.
