
Upload: ashlynn-miller

Post on 03-Jan-2016


Combinatorial Algorithms

Parametric Pruning

2

Metric k-center

• Given a complete undirected graph G = (V, E) with nonnegative edge costs satisfying the triangle inequality, and a positive integer k. For any set S ⊆ V and vertex v, define connect(v, S) = min{cost(u, v) | u ∈ S} (the cost of the cheapest edge from v to a vertex in S).

• Find a set S ⊆ V, with |S| = k, so as to minimize max_v {connect(v, S)}.

• The metric k-center problem is NP-hard.
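The definitions above can be made concrete with a small sketch. The instance (four points on a line, with distance as cost) and the function names are assumptions made for illustration; the exhaustive search is exponential and only shows what is being minimized.

```python
from itertools import combinations

def connect(cost, v, S):
    """Cost of the cheapest edge from v to a vertex in S (0 when v is in S)."""
    return min(cost(u, v) for u in S)

def objective(cost, V, S):
    """max over v of connect(v, S): the quantity k-center minimizes."""
    return max(connect(cost, v, S) for v in V)

def brute_force_k_center(cost, V, k):
    """Try every k-subset; exponential, for illustration only."""
    return min(combinations(V, k), key=lambda S: objective(cost, V, S))

# Hypothetical instance: four points on a line, cost = distance (a metric).
V = [0, 3, 4, 10]
line = lambda u, v: abs(u - v)
best = brute_force_k_center(line, V, 2)
print(best, objective(line, V, best))  # (3, 10) 3
```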

Parametric pruning (1)

• If we know the cost of an optimal solution, we may be able to prune away irrelevant parts of the input and thereby simplify the search for a good solution.

• However, computing the cost of an optimal solution is precisely the difficult core of NP-hard NP-optimization problems.

• The technique of parametric pruning gets around this difficulty as follows. A parameter t is chosen, which can be viewed as a “guess” on the cost of an optimal solution. For each value of t, the given instance I is pruned by removing parts that cannot be used in any solution of cost at most t. The pruned instance is denoted I(t).

3

Parametric pruning (2)

The algorithm consists of two steps.

• In the first step, the family of instances I(t) is used for computing a lower bound on OPT, say t∗.

• In the second step, a solution is found in instance I(α t∗), for a suitable choice of α.

4

5

Instance

[Figure: example instance with the values 10, 7, 5, 3]

6

Idea of Algorithm (k=2, OPT 7)

[Figure: the example instance with the values 10, 7, 5, 3]

7

Idea of Algorithm (k=2, OPT 3)

[Figure: the example instance with the values 10, 7, 5, 3]

8

Parametric pruning

• Sort the edges of G in nondecreasing order of cost, i.e. cost(e1) ≤ cost(e2) ≤ … ≤ cost(em).

• Let Gi = (V, Ei), where Ei={e1, e2,…, ei}.

• For each Gi, we have to check whether there exists a subset S ⊆ V with |S| ≤ k such that every vertex in V – S is adjacent to a vertex in S.

9

Dominating Set

• A dominating set in an undirected graph G = (V, E) is a subset S ⊆ V such that every vertex in V – S is adjacent to a vertex in S.

• Let dom(G) denote the size of a minimum cardinality dominating set in G.

• Computing dom(G) is NP-hard.

10

k-Center

• The k-center problem is equivalent to finding the smallest index i such that Gi has a dominating set of size at most k.

• Such a Gi contains k stars (K1,p) spanning all vertices.

[Figure: the star K1,7]
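The equivalence can be sketched as an exact (exponential-time) procedure: scan the pruned graphs Gi in order and stop at the first one with a dominating set of size at most k. The function names and the line-metric instance are assumptions for illustration; the dominating-set test is brute force because computing dom(G) is NP-hard.

```python
from itertools import combinations

def is_dominating(V, edges, S):
    """True if every vertex is in S or adjacent to a vertex of S."""
    S = set(S)
    dominated = set(S)
    for u, v in edges:
        if u in S:
            dominated.add(v)
        if v in S:
            dominated.add(u)
    return dominated == set(V)

def has_dominating_set(V, edges, k):
    """Brute-force check for a dominating set of size at most k."""
    return any(is_dominating(V, edges, S)
               for size in range(1, k + 1)
               for S in combinations(V, size))

def exact_k_center_cost(V, cost, k):
    """cost(e_i) for the smallest i such that G_i has a dominating set of size <= k."""
    edges = sorted(combinations(V, 2), key=lambda e: cost(*e))
    for i in range(len(edges) + 1):
        if has_dominating_set(V, edges[:i], k):  # edges[:i] is E_i
            return cost(*edges[i - 1]) if i else 0

# Hypothetical instance: four points on a line, cost = distance.
V = [0, 3, 4, 10]
line = lambda u, v: abs(u - v)
print(exact_k_center_cost(V, line, 2))  # 3
```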

11

G²

• An independent set (stable set) in G = (V, E) is a subset I ⊆ V of pairwise non-adjacent vertices.

• Define the square of graph G = (V, E) to be the graph G² = (V, E′), containing an edge (u, v) ∈ E′ whenever G has a path of length at most 2 between u and v, u ≠ v.

Example: G = K1,4, G² = K5
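A minimal sketch of the squaring operation, checked on the K1,4 example. The function name and the vertex labels 0..4 are assumptions for illustration.

```python
from itertools import combinations

def square(V, edges):
    """Edge set of G^2: pairs joined in G by a path of length at most 2."""
    adj = {v: set() for v in V}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    sq = set()
    for u, v in combinations(V, 2):
        # Path of length 1 (an edge) or length 2 (a common neighbor).
        if v in adj[u] or adj[u] & adj[v]:
            sq.add(frozenset((u, v)))
    return sq

# K_{1,4}: center 0 with leaves 1..4; its square should be K_5 (10 edges).
V = [0, 1, 2, 3, 4]
star = [(0, leaf) for leaf in range(1, 5)]
sq = square(V, star)
print(len(sq))  # 10
```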

12

Lower bound

• Lemma 4.1 Given a graph H, let I be an independent set in H². Then |I| ≤ dom(H).

13

Hochbaum-Shmoys Algorithm (1986)

Input (G, cost: E → Q+)

1) Construct G1², G2², …, Gm².

2) Compute a maximal independent set, Ir, in each graph Gr².

3) Find the smallest index r such that |Ir| ≤ k, say j.

Output (Ij)
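The three steps might be sketched as follows. This is an illustration under assumptions, not a reference implementation: the maximal independent set is built by a simple greedy scan (any maximal independent set works for the analysis), and the function names and the line-metric instance are hypothetical.

```python
from itertools import combinations

def square_adjacency(V, edges):
    """Adjacency sets of G^2: u ~ v if joined in G by a path of length <= 2."""
    adj = {v: set() for v in V}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return {u: {v for v in V if v != u and (v in adj[u] or adj[u] & adj[v])}
            for u in V}

def maximal_independent_set(V, adj):
    """Greedy scan: keep a vertex unless it is adjacent to one already kept."""
    chosen = []
    for v in V:
        if not any(u in adj[v] for u in chosen):
            chosen.append(v)
    return chosen

def hochbaum_shmoys(V, cost, k):
    """Return (I_j, cost(e_j)); cost(e_j) is a lower bound on OPT."""
    if len(V) <= k:
        return list(V), 0
    edges = sorted(combinations(V, 2), key=lambda e: cost(*e))
    for r in range(1, len(edges) + 1):
        I = maximal_independent_set(V, square_adjacency(V, edges[:r]))
        if len(I) <= k:
            return I, cost(*edges[r - 1])

# Hypothetical instance: four points on a line, cost = distance.
V = [0, 3, 4, 10]
line = lambda u, v: abs(u - v)
centers, lower_bound = hochbaum_shmoys(V, line, 2)
print(centers, lower_bound)  # [0, 10] 3
```

On this instance the returned centers have objective value 4, which lies between the lower bound 3 and twice the lower bound.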

14

Approximation ratio of Hochbaum-Shmoys Algorithm-1

Theorem 4.2

Hochbaum-Shmoys Algorithm achieves an approximation factor of 2 for the metric k-center problem.

15

Main Lemma

• Lemma 4.3

For j as defined in the algorithm, cost(ej) ≤ OPT.

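Proof.

• For every r < j we have |Ir| > k.

• Now by Lemma 4.1, dom(Gr) ≥ |Ir| > k.

• Let r* be the smallest index such that Gr* has a dominating set of size at most k, so that OPT = cost(er*). Then r* > r for every r < j, and hence r* ≥ j.

• Therefore cost(ej) ≤ cost(er*) = OPT.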

16

Proof of Theorem 4.2

• A maximal independent set Ij in a graph Gj² is also a dominating set.

• Thus there exist stars in Gj² centered on the vertices of Ij, covering all vertices.

• By the triangle inequality, each edge used in constructing these stars has cost at most 2 cost(ej).

• Lemma 4.3 implies 2 cost(ej) ≤ 2 OPT.
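The triangle-inequality step can be written out explicitly. Here u is a star center in Ij, v is a vertex of its star, and w is a name introduced only for this illustration: a common neighbor of u and v in Gj. If u and v are adjacent in Gj itself, then cost(u, v) ≤ cost(ej) holds directly.

$$\mathrm{cost}(u, v) \le \mathrm{cost}(u, w) + \mathrm{cost}(w, v) \le 2\,\mathrm{cost}(e_j) \le 2\,\mathrm{OPT}$$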

17

Tight Example (k = 1)

[Figure: tight example for k = 1, with edge costs 1 and 2]

18

Metric weighted k-center

• Given a complete undirected graph G = (V, E) with nonnegative edge costs satisfying the triangle inequality, a weight function on vertices, w: V → R+, and a bound W ∈ R+. For any set S ⊆ V and vertex v, define connect(v, S) = min{cost(u, v) | u ∈ S}.

• Find a set S ⊆ V of total weight at most W, so as to minimize max_v {connect(v, S)}.

19

Weight dominating set

• Let wdom(G) denote the weight of a minimum weight dominating set in G.

• Calculating wdom(G) is NP-hard.

20

Parametric pruning

• Sort the edges of G in nondecreasing order of cost, i.e. cost(e1) ≤ cost(e2) ≤ … ≤ cost(em).

• Let Gi = (V, Ei), where Ei={e1, e2,…, ei}.

• We need to find the smallest index i such that wdom(Gi) ≤ W. If i* is this index, then the cost of the optimal solution is OPT = cost(ei*).

21

Lightest neighbors

• Given a vertex weighted graph G = (V, E), let I be an independent set in G².

• For each u ∈ I, let s(u) denote a lightest neighbor of u in G, where u is also considered a neighbor of itself.

• Let S = {s(u) | u ∈ I}.
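A short sketch of the lightest-neighbor construction. The function name, the adjacency-dictionary representation, and the star instance are assumptions for illustration; ties are broken by sorted order only to keep the result deterministic.

```python
def lightest_neighbors(adj, weight, I):
    """S = {s(u) | u in I}: s(u) is a lightest vertex among u and its neighbors in G."""
    return {min(sorted(adj[u] | {u}), key=lambda v: weight[v]) for u in I}

# Hypothetical instance: a star with a light center c and heavy leaves x, y.
adj = {"c": {"x", "y"}, "x": {"c"}, "y": {"c"}}
weight = {"c": 1, "x": 5, "y": 5}

# G^2 is a triangle here, so {x} is a (maximal) independent set in G^2.
S = lightest_neighbors(adj, weight, ["x"])
print(S)  # {'c'}
```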

22

Lower Bound

• Lemma 4.4 Let H be a vertex weighted graph, let I be an independent set in H², and let S = {s(u) | u ∈ I} be the corresponding set of lightest neighbors. Then w(S) ≤ wdom(H).

Proof.

• Let D be a minimum weight dominating set of H.

• Then there exists a set of disjoint stars in H, centered on the vertices of D and covering all the vertices.

• Since each of these stars becomes a clique in H², the set I can pick at most one vertex from each of them.

• Thus each vertex in I has the center of the corresponding star available as a neighbor in H. Hence, w(S) ≤ wdom(H).

23

Hochbaum-Shmoys Algorithm-2

Input (G, cost: E → Q+, w: V → R+, W)

1) Construct G1², G2², …, Gm².

2) Compute a maximal independent set, Ir, in each graph Gr².

3) Compute Sr = {sr(u) | u ∈ Ir}.

4) Find the minimum index r such that w(Sr) ≤ W, say j.

Output (Sj)
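The weighted variant might be sketched as below. The function names, the greedy choice of maximal independent set, and the instance (points on a line with one heavy vertex) are assumptions made for illustration.

```python
from itertools import combinations

def adjacency(V, edges):
    """Adjacency sets of the graph (V, edges)."""
    adj = {v: set() for v in V}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def weighted_hochbaum_shmoys(V, cost, weight, W):
    """Return (S_j, cost(e_j)), or None if no S_r fits within the weight bound W."""
    edges = sorted(combinations(V, 2), key=lambda e: cost(*e))
    for r in range(1, len(edges) + 1):
        adj = adjacency(V, edges[:r])
        # Greedy maximal independent set I_r in G_r^2:
        # u and v are adjacent in G_r^2 if adjacent or sharing a neighbor in G_r.
        I = []
        for v in V:
            if not any(u in adj[v] or adj[u] & adj[v] for u in I):
                I.append(v)
        # S_r: replace each u in I_r by a lightest neighbor in G_r (u counts as its own neighbor).
        S = {min(sorted(adj[u] | {u}), key=lambda x: weight[x]) for u in I}
        if sum(weight[x] for x in S) <= W:
            return S, cost(*edges[r - 1])
    return None

# Hypothetical instance: points on a line; vertex 3 is expensive.
V = [0, 3, 4, 10]
line = lambda u, v: abs(u - v)
weight = {0: 1, 3: 5, 4: 1, 10: 1}
S, lower_bound = weighted_hochbaum_shmoys(V, line, weight, W=2)
print(S, lower_bound)  # {0, 10} 3
```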

24

Approximation ratio of Hochbaum-Shmoys Algorithm-2

Theorem 4.5

Hochbaum-Shmoys Algorithm-2 achieves an approximation factor of 3 for the metric weighted k-center problem.

Proof

• By Lemma 4.4, cost(ej) is a lower bound on OPT; the argument is identical to that in Lemma 4.3. Since Ij is a dominating set in Gj², we can cover V with stars of Gj² centered in vertices of Ij. By the triangle inequality these stars use edges of cost at most 2 cost(ej).

• Each star center is adjacent to a vertex in Sj, using an edge of cost at most cost(ej). Move each of the centers to the adjacent vertex in Sj and redefine the stars. Again, by the triangle inequality, the largest edge cost used in constructing the final stars is at most 3 cost(ej).
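Written out for a single vertex, with u ∈ Ij the original star center, sj(u) ∈ Sj the vertex it is moved to, and v any vertex of the star of u, the factor 3 arises as follows:

$$\mathrm{cost}(v, s_j(u)) \le \mathrm{cost}(v, u) + \mathrm{cost}(u, s_j(u)) \le 2\,\mathrm{cost}(e_j) + \mathrm{cost}(e_j) = 3\,\mathrm{cost}(e_j) \le 3\,\mathrm{OPT}$$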

26

Tight Example (W = 3)

[Figure: the graph G on vertices a, b, c, d, with labels 1, 1+ε, and 2]

27

Tight Example

[Figure: the square G² of the same instance on vertices a, b, c, d, with I_{n+3} = {b}, S_{n+3} = {a}, and OPT = {a, c}]

28

Shortest superstring

• Given a finite alphabet Σ, and a set of n strings S = {s1,…,sn} ⊆ Σ+.

• Find a shortest string s that contains each si as a substring.

• Without loss of generality, we may assume that no string si is a substring of another string sj, i ≠ j.

Overlap, prefix

• We begin by developing a good lower bound on OPT.

• Let us assume that s1, s2,…, sn are numbered in order of leftmost occurrence in the shortest superstring, s.

• Let overlap(si, sj) denote the maximum overlap between si and sj, i.e., the longest suffix of si that is a prefix of sj.

• Let prefix(si, sj) be the prefix of si obtained by removing its overlap with sj.
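The two definitions can be sketched directly. One assumption is made explicit here: the overlap is restricted to be shorter than both strings, which changes nothing for distinct strings that are not substrings of one another and keeps the definition meaningful when a string is compared with itself.

```python
def overlap(a, b):
    """Longest suffix of a that is a prefix of b (shorter than both strings)."""
    for k in range(min(len(a), len(b)) - 1, 0, -1):
        if a.endswith(b[:k]):
            return b[:k]
    return ""

def prefix(a, b):
    """The part of a that remains after removing its overlap with b."""
    return a[:len(a) - len(overlap(a, b))]

print(overlap("abcde", "cdefg"))  # cde
print(prefix("abcde", "cdefg"))   # ab
```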

29

30

Prefix

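[Figure: the strings s1, s2, …, sn–1, sn, s1 laid out along the shortest superstring s, marking pref(s1, s2), …, pref(sn–1, sn), pref(sn, s1), and over(sn, s1)]

$$\mathrm{OPT} = |\mathrm{prefix}(s_1, s_2)| + |\mathrm{prefix}(s_2, s_3)| + \cdots + |\mathrm{prefix}(s_n, s_1)| + |\mathrm{overlap}(s_n, s_1)|$$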

31

• Define the prefix graph of S as the directed graph Gpref on vertex set V={1,…,n} that contains an edge i → j of weight |prefix(si, sj)| for each i, j.

• |prefix(s1, s2)| + |prefix(s2, s3)| + … + |prefix(sn, s1)| represents the weight of the tour 1 → 2 → … → n → 1.

• Hence the minimum weight of a travelling salesman tour of the prefix graph gives a lower bound on OPT.

• Unfortunately, this lower bound is not very useful, since computing a minimum weight TSP tour is NP-hard.

$$\mathrm{OPT} = |\mathrm{prefix}(s_1, s_2)| + |\mathrm{prefix}(s_2, s_3)| + \cdots + |\mathrm{prefix}(s_n, s_1)| + |\mathrm{overlap}(s_n, s_1)|$$

32

Lower Bound

• We will use the minimum weight of a cycle cover of the prefix graph.

• A cycle cover is a collection of disjoint cycles covering all vertices.

• A Hamiltonian cycle is a cycle cover.

• We get that the minimum weight of a cycle cover lower-bounds OPT.

• Unlike minimum TSP, a minimum weight cycle cover can be computed in polynomial time.

33

Cycle → prefix

• If c = (i1 → i2 → … → il → i1) is a cycle in the prefix graph, let α(c) = prefix(si1, si2) ○ … ○ prefix(sil-1, sil) ○ prefix(sil, si1).

• Let w(c) be the weight of c, w(c) = |α(c)|.

• Notice that each string si1, si2, …, sil is a substring of (α(c))^∞.

• Next, let σ(c) = α(c) ○ si1.

• Then σ(c) is a superstring of si1, si2, …, sil.

• In the above construction, we “opened” cycle c at an arbitrary string si1. For the rest of the algorithm, we will call si1 the representative string for c.

34

Example

Cycle c: abcdeabcdeabcde → bcdeabcdeabcdea → cdeabcdeabcdeabc → deabcdeabcdeabcd → abcdeabcdeabcde

α(c) = abcde, |α(c)| = 5, (α(c))² = abcdeabcde; bcdeabcdeabcdea is a substring of (α(c))⁴.

σ(c) = α(c) ○ si1 = abcdeabcdeabcdeabcde
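The example can be recomputed mechanically. The sketch below assumes the four strings are visited in the listed order and that the cycle is opened at the first one; the helper names are hypothetical.

```python
def overlap(a, b):
    """Longest suffix of a that is a prefix of b (shorter than both strings)."""
    for k in range(min(len(a), len(b)) - 1, 0, -1):
        if a.endswith(b[:k]):
            return b[:k]
    return ""

def prefix(a, b):
    """a with its overlap with b removed."""
    return a[:len(a) - len(overlap(a, b))]

def alpha(cycle):
    """Concatenate prefix(s_i, s_next) around the cycle, closing back to the first string."""
    return "".join(prefix(cycle[t], cycle[(t + 1) % len(cycle)])
                   for t in range(len(cycle)))

def sigma(cycle):
    """alpha(c) followed by the representative string (the first one in the cycle)."""
    return alpha(cycle) + cycle[0]

cycle = ["abcdeabcdeabcde", "bcdeabcdeabcdea",
         "cdeabcdeabcdeabc", "deabcdeabcdeabcd"]
print(alpha(cycle))  # abcde
print(sigma(cycle))  # abcdeabcdeabcdeabcde
```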

35

Algorithm Superstring

Input (S = {s1,…,sn})

1) Construct the prefix graph Gpref corresponding to the strings in S.

2) Find a minimum weight cycle cover of Gpref, C = {c1,…,ck}.

Output (σ(c1) ○ … ○ σ(ck))
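A minimal sketch of the algorithm for tiny inputs. It makes one substitution that should be stated plainly: the minimum weight cycle cover is found by trying every permutation, which is exponential, whereas a real implementation would use a polynomial-time assignment (bipartite matching) routine. Self-loops are allowed, and all names are hypothetical.

```python
from itertools import permutations

def overlap(a, b):
    """Longest suffix of a that is a prefix of b (shorter than both strings)."""
    for k in range(min(len(a), len(b)) - 1, 0, -1):
        if a.endswith(b[:k]):
            return b[:k]
    return ""

def prefix_len(a, b):
    """Weight of the edge a -> b in the prefix graph."""
    return len(a) - len(overlap(a, b))

def min_cycle_cover(strings):
    """Brute force: a cycle cover is a permutation p with edges i -> p[i]."""
    n = len(strings)
    best = min(permutations(range(n)),
               key=lambda p: sum(prefix_len(strings[i], strings[p[i]])
                                 for i in range(n)))
    seen, cycles = set(), []
    for start in range(n):  # split the permutation into its cycles
        cyc, i = [], start
        while i not in seen:
            seen.add(i)
            cyc.append(i)
            i = best[i]
        if cyc:
            cycles.append(cyc)
    return cycles

def sigma(strings, cyc):
    """alpha(c) followed by the representative string strings[cyc[0]]."""
    alpha = "".join(
        strings[i][:prefix_len(strings[i], strings[j])]
        for i, j in zip(cyc, cyc[1:] + cyc[:1]))
    return alpha + strings[cyc[0]]

def superstring(strings):
    return "".join(sigma(strings, c) for c in min_cycle_cover(strings))

strings = ["abc", "bcd", "cde"]
print(superstring(strings))  # abcdeabc
```

Here the shortest superstring is abcde (length 5), and the output has length 8, within the factor 4 stated in Theorem 4.8.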

36

Remark

• Clearly, the output σ(c1)○…○ σ(ck) is a superstring of the strings in S.

• Notice that if in each of the cycles we can find a representative string of length at most the weight of the cycle, then the output string has length at most 2·OPT.

• Thus, the hard case is when all strings of some cycle c are long.
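The 2·OPT claim in the easy case follows from a short calculation. Writing ri for the representative string of cycle ci and assuming |ri| ≤ w(ci) for every i:

$$|\sigma(c_i)| = w(c_i) + |r_i| \le 2\,w(c_i) \quad\Longrightarrow\quad \sum_{i=1}^{k} |\sigma(c_i)| \le 2 \sum_{i=1}^{k} w(c_i) = 2\,w(C) \le 2\,\mathrm{OPT}$$

The last step uses the fact that the minimum weight of a cycle cover is a lower bound on OPT.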

37

Example

Cycle c: abcde|abcde|abcde → bcde|abcde|abcde|a → cde|abcde|abcde|abc → de|abcde|abcde|abcd → abcde|abcde|abcde

α(c) = abcde, |α(c)| = 5, (α(c))² = abcdeabcde; bcdeabcdeabcdea is a substring of (α(c))⁴.

σ(c) = α(c) ○ si1 = abcde|abcde|abcde|abcde

38

New lower bound

• Lemma 4.6 If each string in S′ ⊆ S is a substring of t^∞ for a string t, then there is a cycle of weight at most |t| in the prefix graph covering all the vertices corresponding to strings in S′.

39

Proof of Lemma 4.6

• For each string in S′, locate the starting point of its first occurrence in t^∞.

• All these starting points will be distinct and will lie in the first copy of t.

• Consider the cycle in the prefix graph visiting the corresponding vertices in this order.

• Clearly, the weight of this cycle is at most |t|.
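The construction in the proof can be traced on a small case. In this sketch, t^∞ is approximated by enough copies of t to contain every string starting in the first copy, and the input is assumed to satisfy the lemma's hypothesis; the function names are hypothetical.

```python
def overlap(a, b):
    """Longest suffix of a that is a prefix of b (shorter than both strings)."""
    for k in range(min(len(a), len(b)) - 1, 0, -1):
        if a.endswith(b[:k]):
            return b[:k]
    return ""

def cycle_from_period(t, strings):
    """Order the strings by first occurrence in t^infinity and return
    (ordering, weight of the corresponding cycle in the prefix graph)."""
    longest = max(len(s) for s in strings)
    t_inf = t * (longest // len(t) + 2)  # long enough for every first occurrence
    order = sorted(strings, key=t_inf.find)
    weight = sum(len(a) - len(overlap(a, b))
                 for a, b in zip(order, order[1:] + order[:1]))
    return order, weight

t = "abcde"
strings = ["cdeabcdeabcdeabc", "abcdeabcdeabcde",
           "deabcdeabcdeabcd", "bcdeabcdeabcdea"]
order, weight = cycle_from_period(t, strings)
print(weight, len(t))  # 5 5
```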

40

Lower bound on overlap

• Lemma 4.7 Let c and c′ be two cycles in C (a minimum weight cycle cover), and let r, r′ be representative strings from these cycles. Then |overlap(r, r′)| < w(c) + w(c′).

41

Suppose, for contradiction, that |overlap(r, r′)| ≥ w(c) + w(c′).

[Figure: r and r′ aligned along overlap(r, r′), which is covered by copies of α and by copies of α′]

α is the prefix of length w(c) of overlap(r, r′).

α′ is the prefix of length w(c′) of overlap(r, r′).

Since |overlap(r, r′)| ≥ w(c) + w(c′), it follows that α and α′ commute: α○α′ = α′○α.

42

Since α and α′ commute, (α)^∞ = (α′)^∞: for any N > 0, the prefix of length N of (α)^∞ is the same as that of (α′)^∞.

Proof of Lemma 4.7

• Now, by Lemma 4.6, there is a cycle of weight at most w(c) in the prefix graph covering all strings in c and c′, contradicting the fact that C is a minimum weight cycle cover.

• So, we have |overlap(r, r′)| < w(c) + w(c′).

43

44

Approximation ratio of Algorithm Superstring

Theorem 4.8

Algorithm Superstring achieves an approximation factor of 4 for the shortest superstring problem.


46

Proof

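$$w(C) = \sum_{i=1}^{k} w(c_i) \le \mathrm{OPT}$$

$$|A| = \sum_{i=1}^{k} |\sigma(c_i)| = w(C) + \sum_{i=1}^{k} |r_i|$$

ri is a representative string for ci.

string*: …, r1, …, r2, …, rk, … (the representatives numbered in order of leftmost occurrence)

$$\mathrm{OPT} \ge \sum_{i=1}^{k} |r_i| - \sum_{i=1}^{k-1} |\mathrm{overlap}(r_i, r_{i+1})| \overset{\text{L. 4.7}}{\ge} \sum_{i=1}^{k} |r_i| - 2 \sum_{i=1}^{k} w(c_i)$$

$$\sum_{i=1}^{k} |r_i| \le 3\,\mathrm{OPT}, \qquad |A| \le 4\,\mathrm{OPT}$$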

Exercise 4.1

• Show that the metric k-center problem cannot be approximated within a factor smaller than 2, unless P = NP.

• Hint: show that such an algorithm can solve the dominating set problem in polynomial time.

Dominating set

• Given an undirected graph G = (V, E) and a number k ∈ N, is there a dominating set X ⊆ V with |X| ≤ k?

47