A (1+ε)-Approximation Algorithm for 2-Line-Center
P.K. Agarwal, C.M. Procopiuc, K.R. Varadarajan. Computational Geometry, 2003
Outline: Introduction, Preliminaries, Approximation Algorithm, Conclusion
1. Introduction: Projective clustering
Given a set S of n objects in R^d and two integers k < n and q ≤ d, find k q-dimensional flats h1,...,hk and partition S into k subsets S1,...,Sk so that
max_{1≤i≤k} max_{p∈Si} d(p, hi)
is minimized.
The k-line-center problem is the projective clustering problem for d = 2 and q = 1: partition S into k clusters and project each cluster Si onto a line so that the maximum distance between a point p and its projection p* is minimized.
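To make the objective concrete, here is a minimal Python sketch that evaluates the clustering cost for a given partition and given lines in the plane (d = 2, q = 1). The function names and the choice to represent a line by two of its points are illustrative assumptions, not notation from the paper.

```python
import math

def dist_point_line(p, a, b):
    """Distance from point p to the line through a and b (a != b)."""
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    return abs(cross) / math.hypot(b[0] - a[0], b[1] - a[1])

def clustering_cost(clusters, lines):
    """Largest distance from any point to the line assigned to its cluster.

    clusters: list of point lists (the subsets S1, ..., Sk)
    lines:    list of (a, b) point pairs, one line per cluster
    """
    return max(
        dist_point_line(p, a, b)
        for cluster, (a, b) in zip(clusters, lines)
        for p in cluster
    )

# Two clusters: one near the x-axis, one near the vertical line x = 5.
clusters = [[(0, 1), (2, -1)], [(5, 5), (7, 3)]]
lines = [((0, 0), (1, 0)), ((5, 0), (5, 1))]
print(clustering_cost(clusters, lines))  # 2.0, attained by the point (7, 3)
```

The 2-line-center problem asks for the partition and the two lines that minimize this value; a strip of width twice the cost around each line then covers the cluster.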
1. Introduction: This paper
2-line-center: given a set S of n points in R^2, cover S by two strips so that the maximum width of a strip is minimized.
Projective clustering has recently received attention as a tool for building more efficient nearest-neighbor structures, as searching amid high-dimensional point sets is becoming increasingly important.
1. Introduction: Previous Work
2-line: near-quadratic running time for the exact version.
1-line (width problem): O(n log n) for d = 2; (1+ε)-approximation algorithms are also known.
General (computing k projective clusters):
Deciding whether a set of n points in the plane can be covered by k lines is NP-Complete.
Projective clustering is NP-Complete.
Approximating the minimum width within a constant factor is NP-Complete.
1. Introduction: This result
Let w* denote the minimum value such that S can be covered by two strips of width at most w*.
This paper presents an algorithm that computes, for any ε > 0, a cover of S by two strips of width at most (1+ε)w*, in time
Strategy of this paper: first present a 6-approximation algorithm, then derive a (1+ε)-approximation algorithm from it.
2. Preliminaries: Notation
Strip σ: the region lying between two parallel lines l1 and l2.
Width of σ: the distance between l1 and l2.
Direction of σ: the direction of l1.
Strip cover of S: two strips such that each point of S lies in one of them.
For any points p, q, lpq is the line passing through p and q.
σ(p,q,r): the strip with median line lpq that has r on its boundary; if r ∈ lpq, it is the same as lpq.
σ(p,q; w): the strip of width 2w having lpq as its median line.
[Figure: points p, q, r, the line lpq, and the strip σ(p,q,r)]
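2. Preliminaries: Notation
Optimal cover of S: Σ* = {σ1*, σ2*}, of width w*.
Si* = S ∩ σi*.
Anchor pair (p,q) of a strip σ: p, q ∈ S ∩ σ and d(p,q) ≥ diam(S ∩ σ)/2.
[Figure: a strip σ containing points of S, an anchor pair p, q, and diam(S ∩ σ)]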
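The anchor-pair condition can be checked directly from its definition. The sketch below is a brute-force illustration, assuming the points of S ∩ σ are already given as a list; `diameter` and `is_anchor_pair` are hypothetical helper names.

```python
import math
from itertools import combinations

def diameter(points):
    """Largest pairwise distance (brute force); 0.0 for fewer than two points."""
    return max((math.dist(a, b) for a, b in combinations(points, 2)), default=0.0)

def is_anchor_pair(p, q, strip_points):
    """True if p, q belong to strip_points and d(p, q) >= diam(strip_points) / 2.

    strip_points plays the role of S ∩ σ.
    """
    if p not in strip_points or q not in strip_points:
        return False
    return math.dist(p, q) >= diameter(strip_points) / 2

pts = [(0, 0), (4, 0), (10, 0), (10, 1)]
print(is_anchor_pair((0, 0), (4, 0), pts))   # distance 4 is below half the diameter
print(is_anchor_pair((4, 0), (10, 1), pts))  # distance sqrt(37) is above it
```

A diametral pair of S ∩ σ is always an anchor pair, but the definition is deliberately weaker so that pairs computed from all of S can qualify.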
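2. Preliminaries: Lemma 2.1
Lemma 2.1 (statement inferred from its use in 3.1): if (p,q) is an anchor pair of a strip σ of width w, then there is a point r ∈ S ∩ σ such that S ∩ σ ⊆ σ(p,q,r) and width(σ(p,q,r)) ≤ 6w.
Proof: let Δ be the diameter of S ∩ σ.
ρ: the smallest rectangle containing S ∩ σ; the length of ρ is L and its width is w.
σ': the thinnest strip parallel to lpq that contains ρ; its width is w'.
Choose r ∈ S ∩ σ to be the point farthest from lpq. Since r ∈ ρ, d(r, lpq) ≤ 3w.
Moreover S ∩ σ ⊆ σ(p,q,r), and the lemma follows.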
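The choice of r in the proof is easy to illustrate. The sketch below assumes that σ(p,q,r) is the strip with median line lpq and r on its boundary, so its width is 2·d(r, lpq); `enclosing_strip` is a hypothetical helper name.

```python
import math

def dist_to_line(x, p, q):
    """Distance from x to the line through p and q (p != q)."""
    cross = (q[0] - p[0]) * (x[1] - p[1]) - (q[1] - p[1]) * (x[0] - p[0])
    return abs(cross) / math.hypot(q[0] - p[0], q[1] - p[1])

def enclosing_strip(strip_points, p, q):
    """Pick r as the point farthest from the line pq; return (r, width of σ(p,q,r))."""
    r = max(strip_points, key=lambda x: dist_to_line(x, p, q))
    return r, 2 * dist_to_line(r, p, q)

# Points inside the horizontal strip 0 <= y <= 1, so w = 1.
pts = [(0, 0), (10, 1), (5, 0.5), (3, 0), (7, 1)]
p, q = (0, 0), (7, 1)          # d(p, q) is more than half the diameter of pts
r, width = enclosing_strip(pts, p, q)
print(r, width)                # the width stays well below the bound 6w = 6
```

By construction every input point lies within distance width/2 of lpq, so the returned strip covers all of them.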
3. Approximation Algorithm
Two phases:
Phase 1: compute a cover of S by two strips of width at most 6w*.
Phase 2: use that cover to compute a new cover by two strips of width at most (1+ε)w*.
3.1 6-approximation cover
Suppose we have an anchor pair (p,q) of a strip in Σ*. How to obtain such a pair is described in 3.2.
WLOG, let (p,q) be an anchor pair of σ1*.
By Lemma 2.1 there exists r ∈ S such that width(σ(p,q,r)) ≤ 6w* and S \ σ(p,q,r) ⊆ σ2*.
Perform a binary search to find such an r.
Then compute a strip of width at most 2w* that contains the remaining points, i.e. S \ σ(p,q,r).
[Figure: binary search over w among the candidate values ..., wi, wi+1, ..., wn; curves labeled f(w), 2w, and g(w)]
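The first phase can be sketched as follows. This is a reconstruction under stated assumptions, not the paper's exact procedure: f(w) is taken to be the exact width of the points outside σ(p,q; w), the candidate values wi are the distances of the points of S to lpq, and the cover {σ(p,q; w), thinnest strip around the rest} has width max(2w, f(w)). Since f does not increase with w while 2w grows, the best candidate sits at the crossover, which a binary search locates. The quadratic-time `width` helper is chosen for brevity only.

```python
import math

def dist_to_line(x, p, q):
    cross = (q[0] - p[0]) * (x[1] - p[1]) - (q[1] - p[1]) * (x[0] - p[0])
    return abs(cross) / math.hypot(q[0] - p[0], q[1] - p[1])

def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    lower, upper = [], []
    for pt in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], pt) <= 0:
            lower.pop()
        lower.append(pt)
    for pt in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], pt) <= 0:
            upper.pop()
        upper.append(pt)
    return lower[:-1] + upper[:-1]

def width(points):
    """Exact width: the thinnest enclosing strip is flush with some hull edge."""
    hull = convex_hull(points)
    if len(hull) < 3:
        return 0.0  # empty, single point, or collinear input
    n = len(hull)
    return min(max(dist_to_line(x, hull[i], hull[(i + 1) % n]) for x in hull)
               for i in range(n))

def six_approx_cover(points, p, q):
    """Return (w, cover_width) for the best candidate strip σ(p,q; w)."""
    dists = sorted(dist_to_line(x, p, q) for x in points)
    def f(w):  # width of the points left uncovered by σ(p,q; w)
        return width([x for x in points if dist_to_line(x, p, q) > w])
    lo, hi = 0, len(dists) - 1
    while lo < hi:  # smallest index i with f(w_i) <= 2 * w_i
        mid = (lo + hi) // 2
        if f(dists[mid]) <= 2 * dists[mid]:
            hi = mid
        else:
            lo = mid + 1
    candidates = [lo] + ([lo - 1] if lo > 0 else [])
    best = min(candidates, key=lambda i: max(2 * dists[i], f(dists[i])))
    return dists[best], max(2 * dists[best], f(dists[best]))

pts = [(0, 0), (10, 0), (5, 0.1), (3, -0.1), (0, 5), (10, 5), (5, 5.1)]
print(six_approx_cover(pts, (0, 0), (10, 0)))  # roughly (0.1, 0.2)
```

If (p,q) is an anchor pair of σ1*, Lemma 2.1 guarantees a candidate w ≤ 3w* whose leftover points all lie in σ2*, so the minimum of max(2w, f(w)) over the candidates is at most 6w*.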
3.2 Computing an anchor pair
Compute a family F of at most 11 pairs of points that contains an anchor pair.
Compute the diameter Δ of S, and let (p,q) be a diametral pair of S.
Let Dp and Dq be the disks of radius Δ/2 centered at p and q, respectively.
Case 1: if S \ (Dp ∪ Dq) ≠ ∅, let r ∈ S \ (Dp ∪ Dq). Return F = {(p,q), (q,r), (p,r)}.
Correctness: at least two of the points p, q, and r lie in the same strip of the optimal cover.
Since d(p,q) = Δ and d(p,r), d(q,r) ≥ Δ/2, all three distances are at least diam(S)/2, and hence at least half the diameter of any subset of S.
So at least one of these three pairs is an anchor pair of an optimal strip (recall the definition of anchor pairs).
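Case 1 is simple enough to sketch in a few lines. The code below finds a diametral pair by brute force (an assumption made for brevity; it is not the paper's method for computing the diameter) and returns None when no point lies outside both disks, which is where Case 2 would take over. `anchor_family_case1` is a hypothetical name.

```python
import math
from itertools import combinations

def anchor_family_case1(points):
    """Return the family {(p,q), (q,r), (p,r)} of Case 1, or None if Case 2 applies."""
    # Diametral pair (p, q) and diameter Δ, by brute force over all pairs.
    p, q = max(combinations(points, 2), key=lambda pq: math.dist(*pq))
    delta = math.dist(p, q)
    for r in points:
        # r lies outside both disks of radius Δ/2 centered at p and q.
        if math.dist(p, r) > delta / 2 and math.dist(q, r) > delta / 2:
            return [(p, q), (q, r), (p, r)]
    return None  # every point is in Dp or Dq

print(anchor_family_case1([(0, 0), (10, 0), (5, 8), (1, 0)]))
print(anchor_family_case1([(0, 0), (10, 0), (1, 0), (9, 1)]))  # None: Case 2
```

Neither p nor q can be returned as r, since each has distance 0 to itself and therefore lies inside its own disk.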
Case 2: otherwise, S \ (Dp ∪ Dq) = ∅.
Let P = S ∩ Dp and Q = S ∩ Dq (P and Q are point sets).
Let conv(P) and conv(Q) be their convex hulls; these two hulls do not intersect.
Compute l1 and l2, the inner common tangent lines of conv(P) and conv(Q).
Let p1 ∈ P and q1 ∈ Q be the points lying on l1, and p2 ∈ P and q2 ∈ Q the points lying on l2.
Let (p3,p4) be a diametral pair in P, and (q3,q4) a diametral pair in Q.
Return F = { (p,q), (p3,p4), (q3,q4),
(p,q1), (p,q2), (p,q3), (p,q4),
(q,p1), (q,p2), (q,p3), (q,p4) }
Correctness of Case 2
Suppose, on the contrary, that no pair of F is an anchor pair.
Then (p,q) is an anchor pair of neither σ1* nor σ2*, so S12* contains either p or q but not both (where S12* = S1* \ S2* and S21* = S2* \ S1*).
WLOG, let p ∈ S12* and q ∈ S21*.
Since d(p,qi), d(q,pi) ≥ Δ/2 (the points lie in different disks), pi ∈ S12* and qi ∈ S21* for i = 1,2,3,4.
S12* ∩ Q ≠ ∅, because otherwise S12* ⊆ P and (p3,p4) would be an anchor pair, a contradiction. Similarly, S21* ∩ P ≠ ∅.
Therefore there exist points p' ∈ S12* ∩ Q and q' ∈ S21* ∩ P.
[Figure: the point sets P and Q with the subsets S12* and S21*]
Correctness of Case 2 (continued)
σ1* contains p1,...,p4, p, and p'; σ2* contains q1,...,q4, q, and q'.
Let s be the intersection point of l1 and l2.
Since the strip σ2* contains q1, q2, and q', it also contains the triangle q1q2s. Hence p' ∉ △q1q2s.
But p' lies in the wedge formed by l1 and l2 on the side of Q, so △p1p2p' intersects the segment q1q2 (drawn in green). Let x be a point of the segment q1q2 inside this triangle.
Since σ1* contains △p1p2p', it also contains x. But q1 and q2 do not lie inside σ1*, so σ1* separates q1 and q2.
Symmetrically, σ2* separates p1 and p2.
[Figure: the point sets P and Q, the tangents l1 and l2, and the point x on the segment q1q2]
3.3 (1+ε)-Approximation
Strategy
We have a 6-approximation cover. Within this region, we try to "guess" the optimal strip σ1*.
We guess its direction (θ), its displacement (by z), and its width w* (by w).
The result of our guess is a strip σ' whose width is a (1+ε)-approximation of that of σ1* and which contains all of S1*.
For the points not covered by σ', we run the known PTAS for the width problem to find the second strip covering them.
[Figure: the region R, with dimensions labeled 3w' and 4d(p,q), and a boundary position z]
Detail
R is as shown in the figure. We use three grids: Z, Θ, and W.
Let δ = Cε, where C is a constant to be specified later.
Z: a grid of "positions" placed along each side of the boundary of R.
Θ: a grid of "directions".
W: a grid of values for the "width".
[Figure: the width grid W, with marks at w~/6 and 2w~ and spacing (ε/2)·(w~/6)]
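The width grid can be illustrated with a short sketch. It rests on one reading of the figure labels, which is an assumption: W runs from w~/6 to 2w~ in steps of (ε/2)·(w~/6), where w~ is taken to be the width of the 6-approximation cover, so that w~/6 ≤ w* ≤ w~. `width_grid` is a hypothetical name.

```python
import math

def width_grid(w_tilde, eps):
    """Candidate widths from w_tilde/6 up to at least 2*w_tilde, evenly spaced."""
    start = w_tilde / 6
    step = (eps / 2) * start
    count = math.ceil((2 * w_tilde - start) / step)
    return [start + i * step for i in range(count + 1)]

grid = width_grid(6.0, 0.5)      # start 1.0, step 0.25
w_star = 3.3                     # some unknown optimum in [w~/6, w~]
guess = min(v for v in grid if v >= w_star)
print(guess)                     # smallest grid value not below w_star
```

Under that reading the step is at most (ε/2)·w*, so rounding w* up to the next grid value increases it by a factor of at most 1 + ε/2.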
Existence of z', θ', w'
Assuming we know z' and θ', we can perform a binary search on w' (by computing the width of the "rest", i.e. the points left uncovered).
Since we do not know z' and θ', we try all possible pairs of them.
Proof of correctness
Can we find σ' by guessing?
By Lemma 2.1, R is "big enough". The remaining question is whether the grids are "dense enough".
1. First we prove that there exists a "good" θ.
2. Then we prove that there exists a good z.
Proof of Lemma 3.6
1. There is a good θ such that there exists a strip σ with S1* ⊆ σ and width(σ) ≤ (1+ε/4)w*.
Case ½ ≤ w~/d(p,q), assuming ε ≤ 2/3.
Case ½ > w~/d(p,q), which implies an angle of less than 30°.
2. There is a good z (together with the previous θ) such that there exists a strip σ with S1* ⊆ σ and width(σ) ≤ (1+ε/2)w*.
Conclusion
We have a simple and efficient approximation algorithm for 2-line-center.
Possible extensions: k-line-center for fixed k, and hyper-strips in higher dimensions.