The Traveling Salesman Problem in Theory & Practice

Lecture 2: NP-Hardness

28 January 2014

David S. Johnson

http://davidsjohnson.net

Seeley Mudd 523, Tuesdays and Fridays

Today’s Outline

• NP-completeness and the complexity of finding optimal TSP tours.

• Hardness of Approximation, the PCP Theorem, and the TSP.

Sources

• Garey & Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, 1979.

• Johnson & Papadimitriou, “Computational Complexity,” Chapter 3 of Lawler, Lenstra, Rinnooy Kan, & Shmoys, The Traveling Salesman Problem, 1985.

• Johnson, “The tale of the second prover” (NP-Completeness Column #23), J. Algorithms 13 (1992), 502-524 [Also available on DSJ’s website].

• Johnson, “A brief history of NP-completeness, 1954-2012,” in Optimization Stories, Martin Grötschel (Editor), Special Volume of Documenta Mathematica (2012), 359-376. (Book distributed at the 21st International Symposium on Mathematical Programming, Berlin, August 19-24, 2012) [Also available on DSJ’s website].

NP-Completeness

Concept invented contemporaneously and independently by Stephen Cook [1971] and Leonid Levin [1973].

Breadth of applicability first illustrated by Karp [1972].

First Book

M. R. Garey & D. S. Johnson

Computers and Intractability: A Guide to the Theory of NP-Completeness

W.H. Freeman, 1979

First Cartoon

Still an Open Question:

Does P = NP?

Note: If you can provide a valid proof of the answer, you will win fame and fortune ($1,000,000 from the Clay Institute).

The P-versus-NP Page

http://www.win.tue.nl/~gwoegi/P-versus-NP.htm

Milestones

1 [Equal]: In 1986/87 Ted Swart (University of Guelph) wrote a number of papers (some of them had the title "P=NP") that gave linear programming formulations of polynomial size for the Hamiltonian cycle problem. Since linear programming is polynomially solvable and Hamiltonian cycle is NP-hard, Swart deduced that P=NP. In 1988, Mihalis Yannakakis closed the discussion with his paper "Expressing combinatorial optimization problems by linear programs."

27 [Not equal]: In November 2005, Ron Cohen proved that P is not equal to NP. In addition, his paper shows that P is not equal to the intersection of NP and co-NP. Finally, the exact inclusion relationships between the classes P, NP and co-NP are discussed. The paper is available at http://www.arxiv.org/abs/cs.CC/0511085. The title of the paper is "Proving that P is not equal to NP and that P is not equal to the intersection of NP and co-NP".

96 [Unprovable]: In November 2012, Natalia L. Malinina put the paper "On the principal impossibility to prove P=NP" onto the arXiv, at http://arxiv.org/abs/1211.3492. On page 19, she writes: "Summarizing all that was said, it can be concluded that such dividing of the graphs into three classes and the behavior of the complicated vertexes at the converting (they turn into the independent cycles) gives us the infallible fact that it is impossible to prove that P=NP."

98 [Equal]: In January 2013, Dmitriy Nuriyev established P=NP. His paper "A DP Approach to Hamiltonian Path Problem" designs a polynomial worst case time Dynamic Programming algorithm for computing a Hamiltonian Path in a directed graph. The result is obtained via the use of original colored hypergraph structures in order to maintain and update the necessary DP states. The running time of the resulting algorithm is O(n^8) where n denotes the number of vertices in the directed graph. The paper is available at http://arxiv.org/abs/1301.3093.

49 for Equals, 41 for Not Equals, 1 for both, 3 for Unprovable/Undecidable, 1 for NP=coNP, 3 Others

Proving that the TSP is NP-Hard

• What we will actually prove: HAMILTON CYCLE for “grid graphs” is NP-complete.

(A grid graph is a finite subgraph of the infinite rectilinear grid.)
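To make the decision problem concrete, here is a minimal Python sketch that tests small grid graphs for a Hamilton cycle by exhaustive backtracking. It assumes the grid graph is the subgraph induced by a finite set of integer grid points (the slide only says "finite subgraph"), and the names `has_hamilton_cycle` and `rectangle` are illustrative, not taken from the lecture. The search is exponential in the worst case, which is consistent with the NP-completeness claim but of course proves nothing.

```python
from itertools import product


def grid_neighbors(v, vertices):
    """Neighbors of v at unit distance that also belong to `vertices`."""
    x, y = v
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    return [(x + dx, y + dy) for dx, dy in steps if (x + dx, y + dy) in vertices]


def has_hamilton_cycle(vertices):
    """Exhaustive backtracking search over paths that start at a fixed vertex."""
    vertices = set(vertices)
    if len(vertices) < 4:  # the shortest cycle in the grid is a unit square
        return False
    start = min(vertices)

    def extend(v, visited):
        if len(visited) == len(vertices):
            # All vertices used: the path closes up only if v is next to start.
            return start in grid_neighbors(v, vertices)
        return any(extend(w, visited | {w})
                   for w in grid_neighbors(v, vertices) if w not in visited)

    return extend(start, {start})


def rectangle(width, height):
    """All grid points (x, y) with 0 <= x < width and 0 <= y < height."""
    return set(product(range(width), range(height)))
```

For example, `has_hamilton_cycle(rectangle(3, 2))` holds, while a 3 x 3 rectangle has no Hamilton cycle: it is bipartite with an odd number of vertices.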

General Asymmetric TSP

Asymmetric Triangle Inequality TSP

Hamilton Cycle for Bipartite Planar Graphs

Hamilton Cycle

Rectilinear TSP

Hamilton Cycle for Grid Graphs

Euclidean TSP

Directed Hamilton Cycle

Symmetric Triangle Inequality TSP

Symmetric TSP

EXACT COVER

• Instance: A finite set U together with a collection C of subsets of U.

• Question: Does C contain an exact cover for U, that is, a subcollection C’ ⊆ C such that every element of U occurs in exactly one member of C’?

• Status: NP-complete [Karp, 1972].
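The definition can be checked directly on small instances. The following Python sketch is a brute-force search over all subcollections, so it is exponential in the size of the collection; the function names are illustrative assumptions, not part of the lecture.

```python
from itertools import combinations


def is_exact_cover(universe, chosen):
    """True if every element of `universe` lies in exactly one chosen set."""
    covered = [element for s in chosen for element in s]
    no_overlap = len(covered) == len(set(covered))
    return no_overlap and set(covered) == set(universe)


def find_exact_cover(universe, collection):
    """Try every subcollection, smallest first; return one exact cover or None."""
    sets = [frozenset(s) for s in collection]
    for size in range(len(sets) + 1):
        for chosen in combinations(sets, size):
            if is_exact_cover(universe, chosen):
                return list(chosen)
    return None
```

For U = {a, b, c} and the collection {a,b}, {a,c}, {b}, {b,c}, the only exact cover is {a,c} together with {b}.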

SATISFIABILITY

→ EXACT COVER

→ HAMILTON CYCLE for Bipartite, Max Degree 3 Graphs

→ HAMILTON CYCLE for Planar Bipartite, Max Degree 3 Graphs

→ HAMILTON CYCLE for Grid Graphs

SATISFIABILITY

• Instance: A set X of variables {x1,x2,…,xn} and a set of clauses C = {c1,c2,…,cm}, where each ci is a subset of the “literals” {x1, ¬x1, x2, ¬x2, … , xn, ¬xn}.

• Question: Is there an assignment of the values “true” or “false” to the variables so that each clause ci contains at least one true literal (where ¬xi is true if xi is assigned the value “false” and otherwise is false)?

• Status: NP-complete [Cook, 1971], [Levin, 1973]
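As a small illustration of the definition, the sketch below evaluates an assignment and searches all 2^n assignments by brute force. The encoding is an assumption made for the example: literal xi is the integer i and ¬xi is the integer -i, a common convention but not one the slides use.

```python
from itertools import product


def satisfies(assignment, clauses):
    """True if every clause contains at least one literal made true.

    `assignment` maps variable index i (1-based) to True or False.
    A literal is a nonzero int: i stands for x_i, -i for its negation.
    """
    return all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses)


def brute_force_sat(n, clauses):
    """Return a satisfying assignment of x_1..x_n, or None if there is none."""
    for values in product([False, True], repeat=n):
        assignment = {i + 1: value for i, value in enumerate(values)}
        if satisfies(assignment, clauses):
            return assignment
    return None
```

For instance, `brute_force_sat(3, [[1, -2], [2, 3], [-1, -3]])` returns a satisfying assignment, while `brute_force_sat(1, [[1], [-1]])` returns `None`.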

Transformation: SATISFIABILITY to EXACT COVER

Elements of U:

• <c,x> for each combination of a clause c and a literal x that it contains.

• Plus |c|-1 auxiliary “garbage collection” elements for each clause c.

• Plus auxiliary elements for each variable.

Clause Component

[Figure: clause component for a clause c, with elements <c,x3>, <c,x21>, <c,¬x7>, <c,¬x6> and the garbage collection elements.]

To cover the garbage collection elements, we must pick up all but one of the <c,x> elements, and all choices of the uncovered <c,x> are possible.

The uncovered <c,x> specifies the literal x chosen to satisfy the clause c.

Need Variable Components to ensure that no conflicting literal choices are made (that is, that no two elements <c,x> and <c’,¬x> are both left uncovered).

Variable Component for x

[Figure: variable component for x, with elements <c2,¬x>, <c4,¬x>, <c7,x>, <c9,x>, <c22,¬x>, <c11,¬x>, <c15,x>.]

Nx = max(|{c: clause c contains literal x}|, |{c: clause c contains literal ¬x}|)

We begin with 2Nx red elements.

Theorem

The original SATISFIABILITY instance is satisfiable if and only if the resulting EXACT COVER instance has an exact cover.

Corollary: Since the construction can be accomplished in polynomial time, we have produced a polynomial reduction from SATISFIABILITY to EXACT COVER.

EXACT COVER to HAMILTON CIRCUIT for Bipartite Graphs with Max Degree 3

• The construction involves three types of components this time.

– Element components

– Set Choosing components

– Linking components

• We will exploit the fact that a tour is 2-connected and has only degree-2 vertices.

Component for Element u

Suppose that element u is contained in sets Sa[1], Sa[2], …, Sa[k].

[Figure: component for u, with bottom edges labeled Sa[1], Sa[2], …, Sa[k].]

Component for Element u

Proposition: Every Hamilton Cycle omits precisely one of the bottom edges of the component for u, and any one of those bottom edges can be the omitted one.

(The omitted edge identifies the set that is supposed to cover element u.)

Proof: We will illustrate the proof for the case of k = 6

Any one of the bottom edges can be the omitted one.

(Edges Forced by Degree-2 and 2-Connectivity Constraints)

Every Hamilton Cycle omits precisely one of the bottom edges.

If none of the edges is omitted, the rest of the component cannot be in the tour.

Suppose 2 or more edges are omitted. Consider the leftmost omitted edge. Tour must contain these edges. Consider the rightmost omitted edge. Tour must contain these edges. Oops!

Every Hamilton Cycle omits precisely one of the bottom edges.

Set Choosing Components

[Figure: pairs of paths for the sets S1, S2, …, Sm.]

A set is in the cover if the Hamilton path takes the top path in the pair for that set.

Linking Component: The Exclusive-Or Graph

Only two options for edges linking to the outside world.

Choice 1

Choice 2


Forced by Degree-2 Constraints

Suppose we take the top left edge but not the bottom. Then these edges are forced.

Suppose we take the bottom left edge but not the top. Then these edges are forced.

Suppose we take both the bottom and top left edges. Then these edges are forced. Oops!

Suppose we take neither the bottom nor the top left edge. Then these edges are forced. Oops!


Shorthand for Exclusive-Or

The Transformation from EXACT COVER to HAMILTONIAN CYCLE

[Figure: example with elements a, b, c and sets {a,b}, {a,c}, {b}, {b,c}.]

Oops! Not Bipartite

Fixed!

Adding Planarity: The Crossover

Getting to the Grid

Lemma: Any 2-connected planar graph with f faces and n vertices (all of degree 2 or 3) has an embedding in the 2D grid that can be contained in a square of size 2f+n (has “extent” no more than 2f+n).

Proof of Lemma

• Note that, if a graph is planar, then for each of its faces there is a planar representation in which that face is the external face (that is, the one that contains all the other vertices).

• We proceed by induction on the number of faces f, with the hypothesis that, for any graph G as above and a designated face, there is a grid embedding of extent 2f+n or less, such that

– the designated face is the external face of the embedding, and

– no degree-2 vertex of the external face has any vertex or edge of the embedding to its right on the gridline containing it.

Base Case: f = 2

2f + n = 4 + 6 = 10 > 7

Inductive step: Assume true for all f’ < f

[Figure: planar 2-connected graph G with all vertex degrees equal to 2 or 3, with the chosen face F marked.]

Pick a face F’ that shares an edge with our chosen face. Delete all shared edges and degree-2 vertices, leaving n’ vertices.

Now we have an (f-1)-face graph and a chosen face, for which the induction hypothesis holds.

[Figure: the neighboring face F’ and the combined face F’’.]

Embedding with F’’ as external face. Extent no more than 2(f-1)+n’.

[Figure: the embedding, with the degree-2 vertices on the boundary of our original face marked.]

Two Cases for the boundary of F’’ that is not shared with F:

1. Interior

2. Exterior

First Case: Internal

[Figure: deleted edges and degree-2 vertices from F.]

Extent no more than 2(f-1)+n’ + 1 < 2f+n

But what if there were many deleted degree-2 vertices?

Extent no more than 2(f-1) + n’

Extent no more than 2(f-1) + n’ + (n-n’) = 2(f-1) + n

Extent no more than 2(f-1) + n + 1 < 2f + n

2nd Case: External

Extent no more than 2(f-1) + n’

Extent no more than 2(f-1) + n’ + (n-n’) = 2(f-1) + n

Extent no more than 2(f-1) + n + 2 = 2f + n

Note that this argument assumes we can always find a neighboring face that shares only a single path with our chosen face.

Exercise: Show that this is true.

[Figure: examples of neighboring faces of F that do (“Yes”) and do not (“No”) share only a single path with F.]

Color-Preserving Embeddings

• If a graph is bipartite, we can two-color its vertices (say black and white), so that no two adjacent vertices get the same color.

• Similarly, we can 2-color the vertices of the 2D grid, where the vertices whose coordinates sum to an even number get white, and those with odd sum get black.

• Lemma: If our original planar graph is bipartite, we can obtain an embedding into the 2D grid that is color preserving.

1. Embed as before.

2. Multiply the scale by two so that all vertices go to black grid points.

3. Move white vertices one cell to the right, as illustrated below.
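The scale-and-shift step can be sketched in a few lines of Python. This is only an illustration of the vertex placement, under the assumption that an embedding is given as a map from vertices to integer grid points; rerouting the edges is not modeled. The sketch does not hard-code which color moves: after doubling, every vertex sits on a grid point with even coordinate sum, and it shifts exactly those vertices whose own color differs from the color of such a point, whichever color that is under the chosen naming. The function names are hypothetical.

```python
def grid_color(point):
    """Color of a grid point under the parity rule: even sum is white."""
    x, y = point
    return "white" if (x + y) % 2 == 0 else "black"


def make_color_preserving(positions, colors):
    """Double all coordinates, then shift mismatched vertices one cell right.

    positions: vertex -> (x, y) grid point of an existing embedding
    colors:    vertex -> "white" or "black" (a proper 2-coloring of the graph)
    """
    result = {}
    for v, (x, y) in positions.items():
        x2, y2 = 2 * x, 2 * y  # doubled points all have an even coordinate sum
        if colors[v] != grid_color((x2, y2)):
            x2 += 1            # moving one cell to the right flips the parity
        result[v] = (x2, y2)
    return result
```

Because doubled x-coordinates are even and a shift adds at most one, distinct vertices never collide after the move.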

Hardware for the Final Step: Strips and Tentacles

Lemma: In a tentacle there is a Hamilton path between two endpoints if and only if one is black and one is white, and our tour must contain such a path.

[Figure: tentacles, with their endpoints marked.]

Tour must contain paths from this vertex to one of {a,b} and to one of {d’,c’}. Each vertex of the tentacle must be contained in one of these paths.

Consequently, the union of the two paths makes up a Hamilton path between two endpoints.

Oops!

Two possibilities starting with vertex a

Both go from white to black.

Situation is analogous if we start with any other endpoint.

Hardware for the Final Step: The Box

Lemma: for all i,j, 1 ≤ i < j ≤ 4, there is a Hamilton path from pi to pj containing all four edges e1, e2, e3, and e4.

(For future reference, let the “color” of the box be the color of the four corner vertices and the central one, in this case white.)

Take our embedding and refine the grid by a factor of 9. (This adds space and maintains vertex color.)

Increase the width of edges to a grid cell, suitably offset

Replace edges with the included grid points

Path must end in a white vertex and hence must be adjacent to a corner of a black box.

The full construction (with tour)

Historical Notes

• Directed and Undirected HAMILTON CYCLE first proved NP-complete in [Karp, 1972], with the directed case a transformation from EXACT COVER, credited to Karp & Tarjan, and the undirected case credited to Tarjan.

• Planar directed HAMILTON PATH was proved NP-complete in “Some simplified NP-complete problems,” Garey, Johnson, & Stockmeyer, STOC 1974, with journal version in Theor. Comput. Sci. 1 (1976), 237-267.

• Planar undirected HAMILTON CYCLE for triply-connected cubic graphs was proved NP-complete in “The planar Hamilton circuit problem is NP-complete,” Garey, Johnson, & Tarjan, SIAM J. Comput. 5 (1976), 704-714. [This paper introduced the exclusive-or and crossover constructions.]

Rectilinear and Euclidean TSP were first proved NP-hard in “Some NP-complete geometric problems,” Garey, Graham, & Johnson, 8th Annual ACM Symp. on Theory of Computing, 1976, 10-22, and “Some complexity results for the Traveling Salesman Problem,” Papadimitriou & Steiglitz, 8th Annual ACM Symp. on Theory of Computing, 1976, 1-9. [The results were claimed here, with proof details given in Papadimitriou, “The Euclidean traveling salesman problem is NP-complete,” Theor. Comp. Sci. 4 (1977), 237-244.]

Hamilton Circuit for Grid Graphs was first proved NP-complete in “Hamilton paths in grid graphs,” Itai, Papadimitriou, and Szwarcfiter, SIAM J. Comput. 11 (1982), 676-686.

From [GGJ 1976]

Coping with Complexity: Approximation Algorithms

Hardness of Approximation

• The first paper proving that for some problems it was just as hard to get close as to find the optimal solution was by Sahni & Gonzalez [1974].

• Basic idea: Prove a “gap” theorem.

• Simple example: GRAPH COLORING:

– It is NP-complete to tell whether a graph is 3-colorable.

– Any graph that is not 3-colorable requires at least 4 colors.

– Thus it is NP-hard to determine the chromatic number of a graph to within a factor r < 4/3.

– Drawback: This proof does not rule out a polynomial-time algorithm with a guarantee of OPT + 1.

Asymptotic Hardness of Approximation

[Figure: example graphs H and G.]

Graph Product G[H]: Replace each vertex v in G by a copy Hv of H, and if there is an edge between u and v in G, then include edges between each vertex in Hu and each vertex in Hv.

Asymptotic Hardness of Approximation

• Let G be the graph we wish to test for 3-colorability, and let Ck be a clique on k vertices.

• Then G[Ck] has chromatic number at most 3k if G is 3-colorable, and otherwise has chromatic number at least 4k.

• Since G[Ck] can be constructed in polynomial time for any fixed k, this implies that no polynomial-time algorithm for GRAPH COLORING can guarantee to use at most rOPT + b colors for any constant b and r < 4/3 unless P = NP.

• This was improved to r < 2 in Garey & Johnson, “The complexity of near-optimal graph coloring,” J.ACM 23 (1976), 43-49.
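The product construction is easy to experiment with on tiny graphs. The Python sketch below builds G[Ck] as described (each vertex becomes a k-clique, and copies of adjacent vertices are fully joined) and computes chromatic numbers by exponential-time backtracking. The function names and the graph representation (a vertex list plus a list of edges) are assumptions made for this illustration.

```python
from itertools import product


def product_with_clique(vertices, edges, k):
    """Return (vertices, edges) of G[C_k] for the graph G = (vertices, edges)."""
    new_vertices = [(v, i) for v in vertices for i in range(k)]
    new_edges = set()
    for v in vertices:                  # each copy of C_k is itself a clique
        for i in range(k):
            for j in range(i + 1, k):
                new_edges.add(((v, i), (v, j)))
    for u, v in edges:                  # fully join copies of adjacent vertices
        for i, j in product(range(k), repeat=2):
            new_edges.add(((u, i), (v, j)))
    return new_vertices, new_edges


def chromatic_number(vertices, edges):
    """Smallest number of colors in a proper coloring (backtracking search)."""
    adjacent = {v: set() for v in vertices}
    for u, v in edges:
        adjacent[u].add(v)
        adjacent[v].add(u)
    order = list(vertices)

    def colorable(c, index, coloring):
        if index == len(order):
            return True
        v = order[index]
        for color in range(c):
            if all(coloring.get(w) != color for w in adjacent[v]):
                coloring[v] = color
                if colorable(c, index + 1, coloring):
                    return True
                del coloring[v]
        return False

    for c in range(len(order) + 1):
        if colorable(c, 0, {}):
            return c
```

With k = 2, a triangle (chromatic number 3) becomes a graph with chromatic number 6, and the complete graph on 4 vertices (chromatic number 4) becomes one with chromatic number 8, so the additive gap of 1 has been stretched to a gap of k.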

Theoretical Breakthrough (1991,…): PCP Theorems

• A probabilistically checkable proof (PCP) system for a decision problem P consists of a polynomial-time “verifier” who behaves as follows:

• Given an instance of P, a string X (the “proof”) to which the verifier has random access, and k random bits, the verifier determines, in time polynomial in the instance size, the addresses of b bits of the proof, and then reads what those bits are.

• If the answer for the instance is “yes”, then there is a string X (possibly exponentially long in terms of the instance size) which will lead the verifier to say “yes” no matter what her random bits are.

• If the answer is “no”, then no string can lead her to say “yes” with probability greater than 0.5.

Theoretical Breakthrough (1991,…): PCP Theorems

• Let PCP[f(n),g(n)] be the set of languages that have PCPs that use k = O(f(n)) random bits and make b = O(g(n)) queries.

• In 1991, Feige, Goldwasser, Lovász, Safra, & Szegedy showed that NP ⊆ PCP[log(n)loglog(n), log(n)loglog(n)].

• Surprisingly, this implies that no polynomial-time approximation algorithm for the CLIQUE problem can be guaranteed to find a clique of size rOPT + b or greater for any constants r and b unless NP ⊆ DTIME[n^{O(loglog n)}].

• In February 1992, Arora & Safra showed NP = PCP[log(n),log(n)].

• In April 1992, Arora, Lund, Safra, Sudan, & Szegedy showed NP = PCP[log(n),1].

• This implies a hardness of approximation result for the TSP, assuming only P ≠ NP.

Another Necessary Digression: PTAS’s and MaxSNP

• A polynomial time approximation scheme (PTAS) for an optimization problem P is a collection of algorithms {Aε: ε > 0} such that each Aε runs in time bounded by a polynomial in the input size (although possibly exponential in 1/ε) and is guaranteed to find a solution within a ratio of 1+ε of optimum.

• Many problems have a PTAS, including, as we shall see later, the 2-D (rounded) Euclidean TSP and the planar graph TSP.

Another Necessary Digression: PTAS’s and MaxSNP

• MaxSNP is a class of optimization problems, introduced in Papadimitriou & Yannakakis [1988,1991], each of which has a polynomial-time approximation algorithm with a finite worst-case ratio, but none of which was known to have a PTAS, although if any one did have a PTAS, then they all did. It includes

– MaxCut.

– Max kSAT for any fixed k.

• A problem is MaxSNP-hard if every member of MaxSNP has a polynomial-time “gap-preserving” reduction to it.

• If we can show that, assuming P ≠ NP, some specific member of MaxSNP does not have a PTAS, then no MaxSNP-hard problem has a PTAS under the same assumption.

• A key example of a MaxSNP-hard problem:

– TSP with each edge weight in {1,2} [Papadimitriou & Yannakakis, 1993].

MAX kSAT

• In MAX kSAT, we are given an instance of SATISFIABILITY in which each clause contains exactly k literals, and are asked for a truth assignment that satisfies the maximum number of clauses.

• Johnson [1973] presents a simple algorithm that is guaranteed to satisfy a fraction 1 − (1/2)^k of the clauses and hence at least (1 − (1/2)^k)OPT.

• Theorem [Arora, Lund, Safra, Sudan, & Szegedy, 1992]: There exists a k > 2 and ε > 0 such that, assuming P ≠ NP, no polynomial time approximation algorithm for MAX kSAT can guarantee to satisfy more than (1-ε)OPT clauses, and hence MAX kSAT does not have a PTAS.
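One standard way to reach the 1 − (1/2)^k fraction deterministically is the method of conditional expectations, sketched below in Python. This is offered as an illustration of why such a guarantee is achievable and is not claimed to be Johnson's original formulation. It assumes each clause contains exactly k literals on k distinct variables, with literal xi encoded as the integer i and ¬xi as -i; both the encoding and the function names are assumptions of this sketch.

```python
from fractions import Fraction


def expected_satisfied(clauses, assignment):
    """Expected number of satisfied clauses if every variable missing from
    `assignment` is set to True or False independently with probability 1/2."""
    total = Fraction(0)
    for clause in clauses:
        unset = 0
        satisfied = False
        for lit in clause:
            var = abs(lit)
            if var in assignment:
                if assignment[var] == (lit > 0):
                    satisfied = True
                    break
            else:
                unset += 1
        # An unsatisfied clause with `unset` free literals fails w.p. 2^-unset.
        total += 1 if satisfied else 1 - Fraction(1, 2 ** unset)
    return total


def conditional_expectation_max_sat(n, clauses):
    """Fix x_1..x_n one at a time, never letting the expectation decrease."""
    assignment = {}
    for var in range(1, n + 1):
        assignment[var] = max(
            (True, False),
            key=lambda value: expected_satisfied(clauses, {**assignment, var: value}),
        )
    return assignment


def count_satisfied(clauses, assignment):
    """Number of clauses containing at least one true literal."""
    return sum(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses)
```

The expectation starts at m(1 − (1/2)^k) for m clauses, and the better of the two choices for each variable is at least the average of the two, so the final assignment satisfies at least that many clauses.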

Proof

• Suppose we have a PCP[log(n),1] proof system for 3SAT (as is guaranteed to exist by the PCP Theorem).

• We may assume without loss of generality that the verifier always queries precisely k proof bits, for some constant k, and never uses more than c·log(n) random bits, where n is the length of the input, for some other constant c.

• Note that given these constants, the verifier can never examine more than k·n^c different proof bits on an instance of size n, so we may assume that k·n^c is an upper bound on proof length.

• Let I be an instance of 3SAT of length n. We shall construct a corresponding instance Ik of MAX-kSAT.

• There are k·n^c variables in Ik, with variable xi standing for the statement “the bit in location i of the proof is 1.”

Proof, Slide II

• Let r be the precise number of random bits that the verifier uses for instance I.

• For each of the 2^r possible sequences x of r random bits, let vx[1], …, vx[k] be the k variables corresponding to the addresses the verifier examines given I and x.

• Let Nx be the set of k-tuples of values for these variables (assignments of bits to the addresses) that would cause the verifier to disbelieve the proof. Note that we must have |Nx| ≤ 2^k.

• Now suppose (b1,...,bk) is a tuple in Nx. The statement that the variables vx[i] do not take on this tuple of values is simply a single kSAT clause.

• For instance, if k = 4 and (b1,...,b4) = (1,0,0,1), then the clause would be (¬vx[1] or vx[2] or vx[3] or ¬vx[4]). Given x, the verifier will believe the proof if and only if the conjunction of such kSAT clauses, one for each tuple in Nx, is true.
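The tuple-to-clause step is mechanical, as the following Python sketch shows. It assumes variables are identified by positive integers and a clause is a list of signed integers (v for the variable, -v for its negation); the function name is hypothetical.

```python
def forbidding_clause(variables, bits):
    """Clause that is false exactly when variables[i] == bits[i] for all i.

    A variable that the forbidden tuple sets to 1 appears negated, and one
    set to 0 appears unnegated, so any deviation from the tuple makes some
    literal true.
    """
    return [-v if b == 1 else v for v, b in zip(variables, bits)]


def clause_is_true(clause, values):
    """Evaluate a clause; `values` maps each variable index to 0 or 1."""
    return any((values[abs(lit)] == 1) == (lit > 0) for lit in clause)
```

With variables 1 to 4 standing for vx[1], …, vx[4], `forbidding_clause([1, 2, 3, 4], (1, 0, 0, 1))` gives `[-1, 2, 3, -4]`, the clause from the example above.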

Proof, Slide III

• Our kSAT instance is the conjunction, over all length r binary sequences x, of all the clauses corresponding to tuples in Nx (for a total of N = Σ|Nx| ≤ 2^{r+k} clauses).

• Note that the size of Ik (number of literals it contains) is at most kN ≤ k·2^{r+k} ≤ k·2^k·n^c. Thus it is bounded by a polynomial in the size of I (and of course, so is the time to construct Ik).

• Now, if I is satisfiable, there must be a proof (i.e., a truth assignment for the variables xi) such that all the clauses corresponding to each set Nx are satisfied.

• If I is not satisfiable, then, for any truth assignment, at least ½·2^r of the sets Nx must yield one or more unsatisfied clauses.

• The ratio between the maximum number of satisfiable clauses in the two cases is thus at most (N − ½·2^r)/N = 1 − ½·2^r/N ≤ 1 − 1/2^{k+1}, since N ≤ 2^{r+k}.

• QED

Constants

• The original version of the Arora et al. paper required k > 100 and so only concluded something like “no guarantee greater than 1-ε” for an almost infinitesimal ε.

• Subsequently, Håstad, “Some optimal inapproximability results,” J.ACM 48 (2001), 798-859, showed that one could take k=3 and that for no ε > 0 could a polynomial-time algorithm for Max 3SAT guarantee to satisfy more than (7/8+ε)OPT clauses unless P = NP, a tight match with the 7/8 guarantee for that problem provided by the Johnson [1973] algorithm.

• Corollary: Assuming P ≠ NP, there is no PTAS for the undirected TSP with triangle inequality.

• Proof: This is because the TSP with all edge lengths in {1,2} obeys the triangle inequality and is MaxSNP-hard.

• The unobtainable ratio was initially something like 1.000001.

• This has subsequently been improved by [Papadimitriou & Vempala, 2006] to

– 117/116 for the asymmetric TSP with triangle inequality, and

– 220/219 for the symmetric TSP with triangle inequality.
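The claim in the proof that edge lengths in {1,2} always obey the triangle inequality follows because any direct edge costs at most 2 while any two-edge detour costs at least 1 + 1 = 2. The Python sketch below confirms this exhaustively for all symmetric {1,2} weightings on a small vertex set; the helper names are illustrative assumptions.

```python
from itertools import combinations, permutations, product


def obeys_triangle_inequality(n, weight):
    """Check weight(a, c) <= weight(a, b) + weight(b, c) for all distinct a, b, c."""
    return all(weight(a, c) <= weight(a, b) + weight(b, c)
               for a, b, c in permutations(range(n), 3))


def all_one_two_weightings(n):
    """Yield every symmetric weight function on n vertices with values in {1, 2}."""
    pairs = list(combinations(range(n), 2))
    for values in product((1, 2), repeat=len(pairs)):
        table = dict(zip(pairs, values))
        # Look up the unordered pair so that the weight is symmetric.
        yield lambda a, b, table=table: table[(min(a, b), max(a, b))]
```

For n = 4 this checks all 2^6 = 64 weightings, and every one passes; a weight of 3 on a single edge is already enough to break the inequality.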

Possible Papers for Class Presentations

• Papadimitriou & Vempala, “On the approximability of the traveling salesman problem,” Combinatorica 26 (2006), 101-120.

• Papadimitriou & Yannakakis (1991,1993): The class MaxSNP and the complexity of the TSP with edge lengths in {1,2}.

• Papadimitriou & Yannakakis (1984): “The complexity of facets and some facets of complexity.”

• Yannakakis (1991) on why attempts to prove that Hamilton Circuit is in P via linear programming are doomed to fail, and the recent generalization of this result.

• PTAS’s for Planar and Euclidean TSP’s.

• Papers on polynomial-time solvable special cases of the TSP.

• Papers on algorithms for the TSP on graphs with unit-length edges under the shortest path metric.
