testing metric properties michal parnas and dana ron
Post on 20-Dec-2015
Testing Metric Properties
Michal Parnas and Dana Ron
Property Testing (Informal Definition)
For a fixed property P and any object O, determine whether O has property P, or whether O is far from having property P (i.e., far from any other object having P).
Task should be performed by querying the object (in as few places as possible).
Property Testing - Background
• Initially defined by Rubinfeld and Sudan in the context of Program Testing (of algebraic functions).
• Goldreich, Goldwasser and Ron initiated the study of testing properties of (undirected) graphs.
• Growing body of work deals with properties of functions, graphs, strings, sets of points ... Many algorithms with complexity that is sub-linear in (or even independent of) size of object.
Motivation
• Computational: Design testing algorithms that are (much) more efficient than exact decision algorithms for properties.
• Combinatorial: Gain new understanding about tested property.
Testing Metric Properties
P - metric property; M - n x n rational-valued matrix;
ε - distance/approximation parameter;
M is said to be ε-far from property P if more than an ε fraction of its n^2 entries must be modified so that M obtains P. Otherwise, it is said to be ε-close.
The testing algorithm can query M on entries M[i,j]. If M has property P, it should accept;
if M is ε-far from property P, it should reject w.p. at least 2/3.
Tree Metrics and Ultrametrics
An n x n matrix M is a tree metric (additive metric) if there exists a tree T with positive weights on its edges, such that:
• There exists a mapping φ from [n] into the nodes of T;
• For every i,j ∈ [n]={1,…,n}, d_T(φ(i),φ(j))=M[i,j], where d_T denotes weighted distance in T;
• All nodes to which no i ∈ [n] is mapped have degree greater than 2.
If T is rooted, φ maps only to leaves of T, and the distance of all leaves to the root is the same, then M is an ultrametric.
[Figure: Tree Metric. Example weighted tree with M[1,2]=8; M[1,3]=12; M[1,4]=10; M[1,5]=15; ...]
[Figure: Ultrametric. Example tree on leaves 1,...,6 with M[1,2]=M[1,3]=M[2,3]=8; M[1,4]=M[1,5]=M[1,6]=12; M[4,5]=M[4,6]=6; M[5,6]=2; ...]
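For comparison with the sub-linear testers discussed next, an exact decision procedure can be sketched as follows. It relies on the standard three-point characterization of ultrametrics (not stated on the slides): M is symmetric with zero diagonal, and M[i,k] <= max(M[i,j], M[j,k]) for every triple. The function name `is_ultrametric` and the small matrices are hypothetical illustrations. This check reads all n^2 entries and takes O(n^3) time.

```python
from itertools import permutations

def is_ultrametric(M):
    """Exact check via the three-point condition (assumed characterization)."""
    n = len(M)
    for i in range(n):
        if M[i][i] != 0:
            return False
        for j in range(n):
            # Distances must be symmetric and non-negative.
            if M[i][j] != M[j][i] or M[i][j] < 0:
                return False
    # Every triple must satisfy the strong triangle inequality.
    return all(M[i][k] <= max(M[i][j], M[j][k])
               for i, j, k in permutations(range(n), 3))

# Hypothetical 4-point ultrametric: {0,1} at distance 2, {2,3} at distance 4,
# and distance 6 across the two groups.
ULTRA = [[0, 2, 6, 6],
         [2, 0, 6, 6],
         [6, 6, 0, 4],
         [6, 6, 4, 0]]

# Three points on a line: a metric, but not an ultrametric.
LINE = [[0, 1, 2],
        [1, 0, 1],
        [2, 1, 0]]
```

Here `is_ultrametric(ULTRA)` holds, while `LINE` fails because M[0][2]=2 exceeds max(M[0][1], M[1][2])=1.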
Our Results
Our algorithms all work by taking a uniformly selected sample S ⊆ [n] and querying M[i,j] for i,j ∈ S. The size of the sample is always poly(1/ε) and independent of n. Specifically:
• Can test ultrametrics with |S| = O(log(1/ε)/ε).
• Can test general tree metrics with |S| = O(log(1/ε)/ε).
• Can extend result for ultrametrics to approximate ultrametrics.
• Can test d-dimensional Euclidean metrics with |S| = O(d log d/ε).
Our Results (continued)
Testing algorithms can be used to solve relaxed versions of the corresponding search problems in time linear in n (and polynomial in 1/ε). That is, can construct a tree that agrees with M on all but at most an ε-fraction of entries.
(Note that running time is sub-linear in size of matrix M.)
Constructing an Ultrametric Tree
Suppose M is an ultrametric. We can construct an ultrametric tree that agrees with M on given subset {1,…,s} in following manner:
• Initialization: Position points 1 and 2 at equal distance M[1,2]/2 from root node.
• Iterations: For each point j = 3,…,s add point j to the current tree by adding a new branch that emanates from j’s unique point of departure from the tree. This point is determined by the closest point in the tree.
Example: M[1,2]=8; M[1,3]=M[1,4]=M[1,5]=10; M[2,3]=M[2,4]=M[2,5]=10; M[3,4]=2; M[3,5]=6; M[4,5]=6.
[Figure: the ultrametric tree constructed for points 1,...,5.]
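The construction procedure can be sketched in code roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes M is symmetric with zero diagonal, points are 0-indexed, and each node stores `dist`, the distance between any two leaves that first meet at that node (twice the node's height). The names `Node`, `attach`, `build_ultrametric_tree`, `tree_distance` and `M_EXAMPLE` are hypothetical.

```python
class Node:
    def __init__(self, dist, label=None):
        self.dist = dist        # twice the height of this node above the leaves
        self.label = label      # point index for a leaf, None for an internal node
        self.parent = None
        self.children = []

def attach(parent, child):
    child.parent = parent
    parent.children.append(child)

def build_ultrametric_tree(M, points):
    """Add points one at a time; return (root, leaves), or None on failure."""
    p, q = points[0], points[1]
    root = Node(M[p][q])                         # p and q at height M[p][q]/2
    leaves = {}
    for x in (p, q):
        leaves[x] = Node(0, x)
        attach(root, leaves[x])
    for j in points[2:]:
        c = min(leaves, key=lambda i: M[i][j])   # closest point already in the tree
        d = M[c][j]
        # A branch leaving the path from c to the root at height d/2 puts j
        # at distance max(M[i][c], d) from every existing point i.
        if any(M[i][j] != max(M[i][c], d) for i in leaves):
            return None
        node = leaves[c]
        while node.parent is not None and node.parent.dist < d:
            node = node.parent
        parent, leaf = node.parent, Node(0, j)
        if parent is not None and parent.dist == d:
            attach(parent, leaf)                 # departure point is an existing node
        else:
            branch = Node(d)                     # departure point splits an edge
            if parent is None:
                root = branch                    # j departs above the old root
            else:
                parent.children.remove(node)
                attach(parent, branch)
            attach(branch, node)
            attach(branch, leaf)
        leaves[j] = leaf
    return root, leaves

def tree_distance(leaves, i, j):
    """Distance in the tree: the `dist` of the lowest common ancestor."""
    ancestors, node = set(), leaves[i]
    while node is not None:
        ancestors.add(id(node))
        node = node.parent
    node = leaves[j]
    while id(node) not in ancestors:
        node = node.parent
    return node.dist

# The 5-point example above, with points 1..5 renamed 0..4.
M_EXAMPLE = [[0, 8, 10, 10, 10],
             [8, 0, 10, 10, 10],
             [10, 10, 0, 2, 6],
             [10, 10, 2, 0, 6],
             [10, 10, 6, 6, 0]]
```

Calling `build_ultrametric_tree(M_EXAMPLE, [0, 1, 2, 3, 4])` yields a tree whose pairwise leaf distances reproduce the example matrix. Changing a single entry so that no ultrametric tree fits makes the function return None.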
Consistency of points with tree
For U ⊆ [n], let T_U denote the tree with leaf-set U that agrees with M on U (if such a tree exists, it is unique).
Def: Say that j ∈ [n] \ U is consistent with T_U if adding j to T_U as described in the construction procedure results in a tree that agrees with M on U+j.
The points consistent with T_U are referred to below as the consistent points.
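Consistency can be checked directly from the matrix without materializing the tree. The sketch below assumes that M restricted to U is already an ultrametric (so T_U exists) and uses the observation that a branch leaving the path from the closest point c at height M[c,j]/2 places j at distance max(M[i,c], M[c,j]) from each i in U. The function name `is_consistent` and the 0-indexed example matrix are hypothetical.

```python
def is_consistent(M, U, j):
    """True if adding j to the scaffold on U yields a tree agreeing with M on U+j.

    Assumes M restricted to U is an ultrametric, so tree distances inside U
    equal the corresponding entries of M.
    """
    c = min(U, key=lambda u: M[u][j])   # closest scaffold point
    d = M[c][j]                         # j departs at height d/2 above c
    return all(M[u][j] == max(M[u][c], d) for u in U)

# Example with five points (0-indexed): {0,1} at distance 8, {2,3} at
# distance 2, point 4 at distance 6 from 2 and 3, everything else 10.
M = [[0, 8, 10, 10, 10],
     [8, 0, 10, 10, 10],
     [10, 10, 0, 2, 6],
     [10, 10, 2, 0, 6],
     [10, 10, 6, 6, 0]]
```

With U = [0, 1, 2], point 3 is consistent. If M[0][3] were 9 instead of 10, the scaffold would predict 10 and the point would be inconsistent. The check costs |U| queries per point.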
The “Scaffold Partition”
For U ⊆ [n], let T_U denote the tree with leaf-set U that agrees with M on U. We refer to this tree as the scaffold.
Def: Let P_U be the following partition of the points consistent with T_U, induced by T_U: points i and j are in the same class iff they have the same point of departure from T_U.
[Figure: The scaffold partition. A scaffold tree with classes C1, C2, C3, C4.]
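One way to compute the scaffold partition from distances alone is sketched below. It rests on an observation that is not on the slides: for a consistent point j, the point of departure is identified by the pair (minimum distance from j to U, set of points of U attaining that minimum). The names `departure_key` and `scaffold_partition`, and the example matrix built from bit lengths of XORs (a standard ultrametric), are hypothetical.

```python
from collections import defaultdict

def departure_key(M, U, j):
    """Identify j's point of departure from the scaffold on U.

    Assumes j is consistent with the scaffold. The key is the distance d to
    the closest scaffold points together with the set of those points.
    """
    d = min(M[u][j] for u in U)
    nearest = frozenset(u for u in U if M[u][j] == d)
    return d, nearest

def scaffold_partition(M, U, points):
    """Group the given points by their point of departure."""
    classes = defaultdict(list)
    for j in points:
        classes[departure_key(M, U, j)].append(j)
    return dict(classes)

# Hypothetical ultrametric on 8 points: distance = bit length of i XOR j.
N = 8
M = [[(i ^ j).bit_length() for j in range(N)] for i in range(N)]
partition = scaffold_partition(M, [0, 4], [1, 2, 3, 5, 6, 7])
```

For the scaffold U = [0, 4] this produces four classes: {1}, {2, 3}, {5} and {6, 7}.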
Violating Pairs
If M is an ultrametric, then for every subset U, and for every two points i,j that belong to different classes in P_U, the value of M[i,j] is exactly determined by the corresponding (different) departure points in T_U.
Def: Say that consistent points i,j that belong to different classes in P_U are a violating pair w.r.t. T_U if the distance between them according to the scaffold T_U differs from M[i,j].
[Figure: scaffold tree with classes C1, C2, C3, C4 and points i, j in different classes. If M is ultrametric, must have M[i,j]=8.]
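The distance that the scaffold forces on a separated pair can be sketched as follows. The formula used here is derived from the tree picture and is not stated on the slides: if i departs at distance d_i above its closest scaffold point c_i, and j at d_j above c_j, then their departure points meet at height max(d_i, d_j, M[c_i, c_j])/2. The function names and the XOR-based example matrix are hypothetical.

```python
def departure(M, U, j):
    """(distance to closest scaffold points, set of those points) for j."""
    d = min(M[u][j] for u in U)
    return d, frozenset(u for u in U if M[u][j] == d)

def scaffold_distance(M, U, i, j):
    """Distance between i and j implied by the scaffold on U.

    Returns None when i and j share a point of departure (same class),
    since the scaffold does not determine their distance in that case.
    Assumes both points are consistent with the scaffold.
    """
    (di, ni), (dj, nj) = departure(M, U, i), departure(M, U, j)
    if (di, ni) == (dj, nj):
        return None
    ci, cj = min(ni), min(nj)           # any representative of each set works
    return max(di, dj, M[ci][cj])

def is_violating(M, U, i, j):
    expected = scaffold_distance(M, U, i, j)
    return expected is not None and M[i][j] != expected

# Hypothetical ultrametric on 8 points: distance = bit length of i XOR j.
M = [[(i ^ j).bit_length() for j in range(8)] for i in range(8)]
U = [0, 4]
```

With this matrix, points 2 and 6 lie in different classes and the scaffold forces distance 3, which matches M[2][6]. Points 2 and 3 share a class, so no distance is forced for them.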
Two types of “witnesses”
Suppose have scaffold tree T_U that agrees with M on U. (If can’t construct such a tree, clearly M is not an ultrametric.)
It follows that:
• If obtain point j that is inconsistent with T_U, then have witness that M is not an ultrametric.
• If obtain pair of points i,j that are violating w.r.t. T_U, then have witness that M is not an ultrametric.
Testing Algorithm for Ultrametrics
1. Uniformly select s=O(log(1/ε)/ε^3) points from [n]. Denote the set by U.
2. Construct tree T_U that agrees with M on U. If fail, reject.
3. Uniformly select m=O(1/ε) pairs of points from [n].
4. If any of these 2m points is inconsistent with T_U, or any of the m pairs is violating w.r.t. T_U, then reject.
5. If no step causes rejection, then accept.
Analysis of Algorithm
If M is an ultrametric, the algorithm always accepts. (No inconsistent points and no violating pairs.)
From now on assume M is ε-far from ultrametric. Will show that the algorithm rejects w.h.p.
Specifically: either can’t construct T_U that agrees with M; or there are many inconsistent points w.r.t. T_U; or there are many violating pairs w.r.t. T_U.
Special Case (for M ε-far from ultrametric)
Suppose T_U agrees with M, and all but at most (ε/3)n^2 pairs of consistent points belong to different classes in P_U (are separated). (In particular, this is the case if all classes are of size O(εn).)
Claim: Either have > (ε/3)n inconsistent points w.r.t. T_U or have > (ε/3)n^2 violating pairs w.r.t. T_U.
Subject to the claim, if M is ε-far from ultrametric, then it is rejected w.h.p., as required.
Proof of Claim for special case
Assume, contrary to the claim, that have at most (ε/3)n inconsistent points and at most (ε/3)n^2 violating pairs. Will show that there exists an ultrametric tree T that agrees with M on all but at most εn^2 entries, in contradiction to the assumption on M.
Tree T builds on scaffold T_U:
For every class C in P_U create a star-shaped sub-tree with leaf set C that is rooted at the point of departure of C from T_U. Inconsistent points are added arbitrarily.
By the premise of the claim and the (counter) assumptions, the number of disagreements is at most
(ε/3)n·n [inconsistent points] + (ε/3)n^2 [violating pairs] + (ε/3)n^2 [unseparated pairs] = εn^2.
General Case
By special case: Gain from separating points to diff classes.
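Def: Say that a consistent point k is an effective separator w.r.t. T_U if adding k to U causes at least (εn/12)^2 pairs of points to be separated into different classes.
[Figure: adding k to U splits class C1 into C1,1 and C1,2; classes C2, C3, C4 are unchanged.]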
General Case (continued)
In analysis, view sample U as being selected in phases.
In each phase, if many effective separators then one selected w.h.p.
After a sufficient number of phases, either have the special case (few non-separated pairs), or U s.t. there are few effective separators w.r.t. T_U.
In the latter case can show that for every class C in P_U there exists a tree T_C s.t. for almost all pairs i,j ∈ C, M[i,j] equals the distance between i and j in T_C. (The tree is star-shaped/broom-shaped.)
General Case (continued)
Claim: Either have > (ε/4)n inconsistent points w.r.t. T_U or have > (ε/4)n^2 violating pairs w.r.t. T_U.
Subject to the claim, if M is ε-far from ultrametric, then it is rejected w.h.p., as required.
Proof of Claim is similar to that in the special case: assume few inconsistent points and violating pairs, and show that there is a tree close to M (contradicting M being ε-far from ultrametric).
Solving Relaxed Version of Search Problem
Analysis implies that the testing algorithm can be used to solve a relaxed version of the corresponding search problem.
That is, if M is an ultrametric then, w.h.p., can construct a tree that agrees with M on all but at most an ε-fraction of entries, in time linear in n and polynomial in 1/ε:
• Construct scaffold TU on uniformly selected sample U;
• Partition all points in [n]\U into classes of P_U according to their distances to points in U;
• For each class C construct star/broom-shaped tree TC.
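A rough sketch of these three steps, restricted to star-shaped class trees (the broom-shaped variant is not shown), is given below. Instead of an explicit tree it returns a distance oracle for the constructed tree. The classification pass costs O(n·|U|) queries, which is linear in n for a fixed sample size. The name `relaxed_tree_oracle`, the fixed scaffold in the example, and the XOR-based matrix are hypothetical. In the procedure above U is a uniform sample.

```python
def relaxed_tree_oracle(M, U):
    """Return a function dist(i, j) giving distances in the constructed tree.

    Assumes M restricted to U is an ultrametric. Every point is classified by
    its point of departure from the scaffold; each class becomes a star rooted
    at that point of departure.
    """
    n = len(M)
    info = []
    for j in range(n):
        d = min(M[u][j] for u in U)             # distance to closest scaffold points
        nearest = frozenset(u for u in U if M[u][j] == d)
        info.append((d, nearest))

    def dist(i, j):
        if i == j:
            return 0
        (di, ni), (dj, nj) = info[i], info[j]
        if (di, ni) == (dj, nj):
            return di                           # two leaves of the same star
        # Different classes: the scaffold determines the distance.
        return max(di, dj, M[min(ni)][min(nj)])

    return dist

# Hypothetical ultrametric on 8 points: distance = bit length of i XOR j.
N = 8
M = [[(i ^ j).bit_length() for j in range(N)] for i in range(N)]
dist = relaxed_tree_oracle(M, [0, 4])
disagreements = sum(dist(i, j) != M[i][j] for i in range(N) for j in range(N))
```

In this small example the oracle disagrees with M only on the pairs inside the classes {2, 3} and {6, 7}, where the star assigns distance 2 while M has distance 1.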
Testing Approximate Ultrametrics
Def: For a given approximation parameter δ, we say that matrix M is a δ-approximate ultrametric if there exists an ultrametric M’ s.t. for every i,j ∈ [n], |M[i,j]-M’[i,j]| ≤ δ.
We describe an algorithm such that, for every ε and δ, if M is a δ-approximate ultrametric then the algorithm accepts M, and if M is ε-far from being a cδ-approximate ultrametric then the algorithm rejects M w.h.p. (c is a fixed constant).
Conclusions and Further Research
• Presented algorithm for testing whether matrix is an ultrametric or far from being an ultrametric. Analysis implies fast solution for relaxed search problem.
• Mentioned similar results for approximate ultrametrics, general tree metrics and Euclidean metrics.
• We suspect that results can be improved in terms of dependence on 1/ε.
• We conjecture that can extend result for general tree metrics to approximate variant.
• Testing other natural metric properties?