![Page 1: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/1.jpg)
Function Matching
Amihood Amir
Yonatan Aumann
Moshe Lewenstein
Ely Porat
Bar Ilan University
![Page 2: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/2.jpg)
Baker’s Parameterized Matching
Prog.c:
int a,b;
a=1;
a = g(a)*5+f(a);
b=2;
a = func(a,b);
a = a*g(b);
b=1;
b = g(b)*5+f(b);
…
![Page 3: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/3.jpg)
Baker’s Parameterized Matching
Text: Prog.c, as on the previous slide
Pattern: c=1; c = g(c)*5+f(c);
Baker’s work: pdup, dupstat, psearch (SICOMP 1997, JCSS 1996)
![Page 4: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/4.jpg)
Two dimensional parameterized matching
pattern
‘A horse is a horse, it ain’t make a difference what color it is’ - John Wayne
![Page 5: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/5.jpg)
Parameterized Matching
Input: $P = p_1 \dots p_m$ over alphabet $\Sigma_P$, $T = t_1 \dots t_n$ over alphabet $\Sigma_T$
Output: locations $i$ of $T$ for which a bijection $\pi: \Sigma_P \to \Sigma_T$ exists s.t.
$\pi(P) = \pi(p_1)\pi(p_2)\dots\pi(p_m) = t_i \dots t_{i+m-1}$
![Page 6: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/6.jpg)
Parameterized Matching
• One dimensional
• Baker 1996, JCSS - Suffix Trees
• Baker 1997, SICOMP - Boyer Moore
• Amir, Farach, Muthu 1995, IPL - Knuth-Morris-Pratt
• Two dimensional
Regular methods fail !!
![Page 7: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/7.jpg)
Function Matching
Input: $P = p_1 \dots p_m$ over alphabet $\Sigma_P$, $T = t_1 \dots t_n$ over alphabet $\Sigma_T$
Output: locations $i$ of $T$ where a function $f: \Sigma_P \to \Sigma_T$ exists s.t.
$f(P) = f(p_1)f(p_2)\dots f(p_m) = t_i \dots t_{i+m-1}$
![Page 8: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/8.jpg)
Function Matching
Input: $P = p_1 \dots p_m$ over alphabet $\Sigma_P$, $T = t_1 \dots t_n$ over alphabet $\Sigma_T$
P = h e h a e h
T = a b c b a c b a d a b d a d d a d
Output: locations $i$ of $T$ where a function $f: \Sigma_P \to \Sigma_T$ exists s.t.
$f(P) = f(p_1)f(p_2)\dots f(p_m) = t_i \dots t_{i+m-1}$
![Page 9: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/9.jpg)
Function Matching
P = h e h a e h
T = a b c b a c b a d a b d a d d a d
Location i = 2 (P aligned with b c b a c b): f(h) = b, f(e) = c, f(a) = a
![Page 10: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/10.jpg)
Function Matching
P = h e h a e h
T = a b c b a c b a d a b d a d d a d
Location i = 8 (P aligned with a d a b d a): f(h) = a, f(e) = d, f(a) = b
![Page 11: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/11.jpg)
Function Matching
P = h e h a e h
T = a b c b a c b a d a b d a d d a d
Location i = 12 (P aligned with d a d d a d): f(h) = d, f(e) = a, f(a) = d
![Page 12: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/12.jpg)
Function Matching
P = h e h a e h
T = a b c b a c b a d a b d a d d a d
f(h) = ?? No match!
![Page 13: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/13.jpg)
Function Matching vs. Parameterized Matching
P p-matches $t_i \dots t_{i+m-1}$ iff
1. P f-matches $t_i \dots t_{i+m-1}$, and
2. the number of distinct symbols in $t_i \dots t_{i+m-1}$ equals the number of distinct symbols in P.
P = h e h a e h
T = a b c b a c b a d a b d a d d a d
f(h) = d, f(e) = a, f(a) = d: an f-match but not a p-match (h and a share the image d).
f(h) = b, f(e) = c, f(a) = a: an f-match and also a p-match (f is one-to-one).
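The two conditions can be checked directly on a single text window. The following is a small sketch, not code from the talk; the function names `f_match` and `p_match` are made up for illustration.

```python
def f_match(pattern, window):
    """True if some function f maps each pattern symbol to the text symbol beneath it."""
    f = {}
    # setdefault records the first image of p and returns it on later occurrences
    return all(f.setdefault(p, t) == t for p, t in zip(pattern, window))

def p_match(pattern, window):
    """Condition 1 (f-match) plus condition 2 (equal number of distinct symbols)."""
    return f_match(pattern, window) and len(set(window)) == len(set(pattern))

P = "hehaeh"
print(f_match(P, "bcbacb"), p_match(P, "bcbacb"))  # window with f(h)=b, f(e)=c, f(a)=a
print(f_match(P, "daddad"), p_match(P, "daddad"))  # window with f(h)=d, f(e)=a, f(a)=d
```

Condition 2 forces f to be one-to-one on the symbols of P, which is why the second window is only an f-match.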
![Page 14: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/14.jpg)
Naïve Algorithm
At each location i of text T, check if the pattern f-matches.
Check:
For each letter ‘a’ in the pattern:
 Are the text elements aligned with the pattern’s ‘a’s all the same?
 No? Declare ‘no match’.
All letters “OK”: declare ‘match’.
Running time: O(nm), where m = |P| and n = |T|
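The check above can be sketched in a few lines of Python. This is an illustrative version only; the function names are hypothetical and locations are 0-based, so the matches at locations 2, 8 and 12 of the earlier example appear as 1, 7 and 11.

```python
def f_matches_at(pattern, text, i):
    """Return the function f if pattern f-matches text at 0-based location i, else None."""
    f = {}
    for j, p in enumerate(pattern):
        t = text[i + j]
        # The first occurrence of p fixes f(p); later occurrences must agree with it.
        if f.setdefault(p, t) != t:
            return None
    return f

def naive_function_matching(pattern, text):
    """Try every location: O(nm) time overall."""
    m, n = len(pattern), len(text)
    return [i for i in range(n - m + 1) if f_matches_at(pattern, text, i) is not None]

P = "hehaeh"
T = "abcbacbadabdaddad"
print(naive_function_matching(P, T))  # 0-based match locations
print(f_matches_at(P, T, 1))          # the function found at the first match
```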
![Page 15: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/15.jpg)
Function Matching with Don’t Cares
Input: $P = p_1 \dots p_m$ over alphabet $\Sigma_P \cup \{?\}$, $T = t_1 \dots t_n$ over alphabet $\Sigma_T$
P = h e ? ? e h
T = a b c b a c b c d b c d a d d a d
Output: locations $i$ of $T$ where a function $f: \Sigma_P \to \Sigma_T$ exists s.t.
$f(P) = f(p_1)f(p_2)\dots f(p_m) = t_i \dots t_{i+m-1}$,
where ? is a wildcard that may align with any text symbol.
![Page 16: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/16.jpg)
Why do we need don’t cares?
Pattern
Text
![Page 17: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/17.jpg)
Linearize Text and Pattern
[Figure: the two-dimensional text and pattern; the text is written out row by row as one string, T = Line 1, Line 2, …]
![Page 18: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/18.jpg)
Linearize Text and Pattern
[Figure: the n × n text is written out row by row (…, Line 5, Line 6, …) as T. The m × m pattern is written out row by row (Line 1, Line 2, …) as P, with each row of m symbols followed by n-m don’t cares (?).]
![Page 19: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/19.jpg)
Polynomial Multiplication - Convolutions
Multiply $t_1 t_2 t_3 \dots t_n$ by the reversed pattern $p_m p_{m-1} \dots p_2 p_1$ as in long multiplication. Row $j$ holds the partial products $p_j t_1, p_j t_2, \dots, p_j t_n$ and is shifted one column relative to row $j-1$, so each column collects the products $p_j t_{i+j-1}$ for one alignment $i$ of P against T.
Running time: O(n log m)
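As a rough illustration of the multiplication above, the sketch below computes all column sums with a textbook recursive FFT. It is an assumption-laden example rather than the talk's implementation: the helper names are invented, inputs are assumed to be small integers, and one FFT over the whole text costs O(n log n). The O(n log m) bound is usually obtained by cutting the text into overlapping pieces of length about 2m, which is not done here.

```python
import cmath

def fft(values, inverse=False):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def convolve(x, y):
    """Coefficients of the product of the integer polynomials x and y."""
    length = len(x) + len(y) - 1
    size = 1
    while size < length:
        size *= 2
    fx = fft(list(x) + [0] * (size - len(x)))
    fy = fft(list(y) + [0] * (size - len(y)))
    product = fft([u * v for u, v in zip(fx, fy)], inverse=True)
    # The inverse transform needs the 1/size scaling; round back to integers.
    return [round(c.real / size) for c in product[:length]]

def sliding_dot_products(text, pattern):
    """Entry i is the sum over j of pattern[j] * text[i + j], for each full alignment i."""
    m, n = len(pattern), len(text)
    full = convolve(text, pattern[::-1])  # reversing the pattern turns convolution into correlation
    return full[m - 1 : n]

print(sliding_dot_products([1, 2, 3, 4], [1, 0, 1]))  # → [4, 6]
```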
![Page 20: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/20.jpg)
Convolutions: Fischer-Paterson [1974]
Same multiplication of $t_1 t_2 \dots t_n$ by $p_m p_{m-1} \dots p_1$.
With $p_1 p_2 p_3 p_4 \dots p_m$ aligned at the first text location, the corresponding column sums to
$\sum_{i=1}^{m} p_i t_i$
![Page 21: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/21.jpg)
Convolutions: Fischer-Paterson [1974]
Same multiplication of $t_1 t_2 \dots t_n$ by $p_m p_{m-1} \dots p_1$.
With $p_1 p_2 p_3 p_4 \dots p_m$ shifted to the second text location, the corresponding column sums to
$\sum_{i=1}^{m} p_i t_{i+1}$
![Page 22: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/22.jpg)
How does this help for Function Matching?
The property that needs to be checked is:
beneath each symbol from the pattern alphabet, all text characters must be the same.
![Page 23: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/23.jpg)
Example
T = a b c b a c b a c a b d a d d a d e a
P = h e h a e h ? e
$P^R$ = e ? h e a h e h
![Page 24: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/24.jpg)
Example: h in P vs. a in T
T = a b c b a c b a c a b d a d d a d e a
P = h e h a e h ? e
$P^R$ = e ? h e a h e h
$T_a$ = 1 0 0 0 1 0 0 1 0 1 0 0 1 0 0 1 0 0 1
$P^R_h$ = 0 0 1 0 0 1 0 1
![Page 25: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/25.jpg)
Example: h - a
T = a b c b a c b a c a b d a d d a d e a
P = h e h a e h ? e
$P^R$ = e ? h e a h e h
$T_a$ = 1 0 0 0 1 0 0 1 0 1 0 0 1 0 0 1 0 0 1
$P^R_h$ = 0 0 1 0 0 1 0 1
Long multiplication of $T_a$ by $P^R_h$: each 1 in $P^R_h$ contributes a shifted copy of $T_a$, each 0 contributes a row of zeros. Summing the columns gives
$T_a * P^R_h$ = 0 0 1 0 0 1 1 1 0 2 0 2 1 0 3 0 1 2 0 1 2 0 1 1 0 1
![Page 26: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/26.jpg)
Example: h - a
$T_a * P^R_h$ = 0 0 1 0 0 1 1 1 0 2 0 2 1 0 3 0 1 2 0 1 2 0 1 1 0 1
Each entry corresponds to one alignment of P = h e h a e h ? e against T (partial alignments at both ends) and counts the a’s beneath the h’s.
![Page 27: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/27.jpg)
Example: h - a
T = a b c b a c b a c a b d a d d a d e a
P = h e h a e h ? e
$P^R$ = e ? h e a h e h
$T_a$ = 1 0 0 0 1 0 0 1 0 1 0 0 1 0 0 1 0 0 1
$P^R_h$ = 0 0 1 0 0 1 0 1
$T_a * P^R_h$ = 0 0 1 0 0 1 1 1 0 2 0 2 1 0 3 0 1 2 0 1 2 0 1 1 0 1
=> in O(n log m) time!!
![Page 28: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/28.jpg)
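Example
T = a b c b a c b a c a b d a d d a d e a
P = h e h a e h ? e
$P^R$ = e ? h e a h e h
Number of a’s, b’s, c’s and d’s beneath the h’s, for the 12 full alignments of P:
h - a: 1 0 2 0 2 1 0 3 0 1 2 0
h - b: 0 3 0 1 1 1 1 0 1 0 1 0
h - c: 2 0 1 2 0 1 1 0 1 0 0 0
h - d: 0 0 0 0 0 0 1 0 1 2 0 3
Match(h): 0 1 0 0 0 0 0 1 0 0 0 1
Match(h) is 1 exactly where one of the rows reaches 3, the number of h’s in P.
=> in $O(|\Sigma_T| n \log m)$ time!!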
![Page 29: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/29.jpg)
In general - the Algorithm
• For each character ‘a’ in $\Sigma_P$ create $P_a$
• For each character ‘b’ in $\Sigma_T$ create $T_b$
• For all $P_a$ and $T_b$, multiply them and construct Match(a) for each ‘a’ in $\Sigma_P$
• Announce each location i of T as a ‘match’ if Match(a)[i] = 1 for all a’s in P
=> in $O(|\Sigma_T||\Sigma_P| n \log m)$ time.
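The steps of this algorithm can be sketched as follows. This is an illustrative version under stated assumptions: the function names are hypothetical, locations are 0-based, `?` is treated as the don't care, and `correlate` computes the sliding products directly in O(nm) as a stand-in for the FFT-based convolution that gives the stated running time.

```python
def correlate(text_values, pattern_values):
    """Sliding dot products, computed directly (an FFT convolution would replace this)."""
    m = len(pattern_values)
    return [
        sum(pattern_values[j] * text_values[i + j] for j in range(m))
        for i in range(len(text_values) - m + 1)
    ]

def match_vector(pattern, text, a):
    """Match(a)[i] = 1 iff all text symbols beneath the a's at location i are equal."""
    p_a = [1 if c == a else 0 for c in pattern]   # indicator vector P_a
    count_a = sum(p_a)
    result = [0] * (len(text) - len(pattern) + 1)
    for b in set(text):
        t_b = [1 if c == b else 0 for c in text]  # indicator vector T_b
        for i, hits in enumerate(correlate(t_b, p_a)):
            if hits == count_a:                   # every a sits above a b
                result[i] = 1
    return result

def function_matching(pattern, text, wildcard="?"):
    """Locations where Match(a) is 1 for every non-wildcard pattern symbol a."""
    vectors = [match_vector(pattern, text, a) for a in set(pattern) - {wildcard}]
    return [i for i in range(len(text) - len(pattern) + 1) if all(v[i] for v in vectors)]

T = "abcbacbacabdaddadea"
P = "hehaeh?e"
print(match_vector(P, T, "h"))
print(function_matching(P, T))
```

On the running example, `match_vector(P, T, "h")` reproduces the Match(h) row from the example slide.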
![Page 30: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/30.jpg)
Improvement
Lemma: Let $a_1, \dots, a_k \in \mathbb{N}$, then
$k \sum_{h=1}^{k} a_h^2 = \left(\sum_{h=1}^{k} a_h\right)^2$ iff for all $i,j$, $a_i = a_j$
Idea: Let’s encode the text with numbers for symbols,
and encode the pattern so that convolutions compute their sum
and separately their sum of squares.
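One way to see why the lemma holds is the following identity. It is a short derivation added for reference and is not taken from the slides.

```latex
k \sum_{h=1}^{k} a_h^2 - \Bigl(\sum_{h=1}^{k} a_h\Bigr)^2
  = (k-1) \sum_{h=1}^{k} a_h^2 - 2 \sum_{1 \le i < j \le k} a_i a_j
  = \sum_{1 \le i < j \le k} (a_i - a_j)^2 \;\ge\; 0
```

The right-hand side is a sum of squares, so it vanishes exactly when $a_i = a_j$ for all $i, j$.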
![Page 31: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/31.jpg)
Improvement
Example: Compute the sum of text chars beneath “e”
T = a b c b a c b a c a b d a d d a d e a
T# = 1 2 3 2 1 3 2 1 3 1 2 4 1 4 4 1 4 5 1
P = h e h a e h ? e
$P_e$ = 0 1 0 0 1 0 0 1
![Page 32: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/32.jpg)
Improvement
Example: Compute the sum of squares beneath “e”
T = a b c b a c b a c a b d a d d a d e a
T# = 1 2 3 2 1 3 2 1 3 1 2 4 1 4 4 1 4 5 1
T#² = 1 4 9 4 1 9 4 1 9 1 4 16 1 16 16 1 16 25 1
P = h e h a e h ? e
$P_e$ = 0 1 0 0 1 0 0 1
![Page 33: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/33.jpg)
Improvement
Running Time:
Two convolutions for each pattern character.
$O(|\Sigma_P| n \log m)$
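A minimal sketch of the improved algorithm, assuming the lemma's test $k \sum a_h^2 = (\sum a_h)^2$ is applied per pattern symbol. The names are invented for illustration, the numbering of text symbols in sorted order is an arbitrary choice that happens to reproduce T# from the example, and `correlate` is again a direct O(nm) stand-in for an FFT-based convolution.

```python
def correlate(text_values, pattern_values):
    """Sliding dot products, computed directly (an FFT convolution would replace this)."""
    m = len(pattern_values)
    return [
        sum(pattern_values[j] * text_values[i + j] for j in range(m))
        for i in range(len(text_values) - m + 1)
    ]

def encode_text(text):
    """Number the text symbols 1, 2, 3, ... in sorted order (an arbitrary choice)."""
    code = {c: k + 1 for k, c in enumerate(sorted(set(text)))}
    return [code[c] for c in text]

def function_matching_by_squares(pattern, text, wildcard="?"):
    t_num = encode_text(text)
    t_sq = [x * x for x in t_num]
    locations = len(text) - len(pattern) + 1
    ok = [True] * locations
    for a in set(pattern) - {wildcard}:
        p_a = [1 if c == a else 0 for c in pattern]
        k = sum(p_a)                      # number of occurrences of a in the pattern
        sums = correlate(t_num, p_a)      # first convolution: sum beneath the a's
        squares = correlate(t_sq, p_a)    # second convolution: sum of squares beneath the a's
        for i in range(locations):
            # Lemma: equality holds iff all k numbers beneath the a's are equal.
            if k * squares[i] != sums[i] ** 2:
                ok[i] = False
    return [i for i in range(locations) if ok[i]]

T = "abcbacbacabdaddadea"
print(encode_text(T))
print(function_matching_by_squares("hehaeh?e", T))
```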
![Page 34: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/34.jpg)
Can we do better for big alphabets?
We have seen 2 algorithms for Function Matching:
1. O(nm) - naïve algorithm
2. $O(|\Sigma_P| n \log m)$ - convolution based
We will see:
1. $O(n \log^2 m)$ - randomized, convolution based
2. Lower bound of $\Omega(nm)$ for deterministic convolution based methods
![Page 35: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/35.jpg)
Def: A pattern is 2-charactered if every character appears at most twice in the pattern.
Example:
P = a b c b c c b b
P1 = a1 b1 c1 b1 c1 c2 b2 b2 (even pairs)
P2 = a1 b1 c1 b2 c2 c2 b2 b3 (odd pairs)
Lemma: Let P be a pattern and T a text. There exist 2-charactered patterns P1 and P2 s.t. at location i of T, P f-matches iff both P1 and P2 f-match.
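The relabeling in the example can be sketched as below. The function name and the representation of a relabeled symbol as a (character, index) pair are assumptions made for illustration; the pattern is assumed to contain no don't cares.

```python
from collections import defaultdict

def split_two_charactered(pattern):
    """Relabel occurrences so that each new symbol appears at most twice.

    P1 pairs occurrences (1,2), (3,4), ... of each character;
    P2 pairs occurrences (2,3), (4,5), ... of each character.
    """
    seen = defaultdict(int)
    p1, p2 = [], []
    for c in pattern:
        k = seen[c]                        # 0-based occurrence index of c so far
        seen[c] += 1
        p1.append((c, k // 2 + 1))
        p2.append((c, (k + 1) // 2 + 1))
    return p1, p2

p1, p2 = split_two_charactered("abcbccbb")
print(" ".join(f"{c}{k}" for c, k in p1))  # → a1 b1 c1 b1 c1 c2 b2 b2
print(" ".join(f"{c}{k}" for c, k in p2))  # → a1 b1 c1 b2 c2 c2 b2 b3
```

Between them, the pairs of P1 and P2 chain every occurrence of a character to the next one, so equality inside all pairs forces equality across all occurrences.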
![Page 36: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/36.jpg)
Situation: An algorithm for Function Matching with 2-charactered patterns yields a general algorithm for Function Matching.
So, all that needs to be checked is that each pair in P has equal text symbols beneath it.
![Page 37: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/37.jpg)
New Randomized Algorithm
1. For each character:
 - a in T, randomly choose $r_a$ in {0, 1}
 - replace all a’s in T with $r_a$
 - get T’
 - b in P, randomly choose $s_b$ in {1, 2}
 - set the first b to be $s_b$ and the second b to be $-s_b$
 - get P’
2. Convolve T’ and $P'^R$
3. For each location i for which the convolution $(T' * P'^R)[i]$ equals 0, declare a ‘match’
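The three steps can be sketched as follows for a 2-charactered pattern. This is a hedged illustration, not the authors' code: the function names are invented, symbols that occur once and the don't care `?` are encoded as 0 (as in the worked example's g(P)), the choice set for $s_b$ is a parameter defaulting to the {1, 2} stated above, and `correlate` is a direct O(nm) stand-in for the FFT convolution.

```python
import random

def correlate(text_values, pattern_values):
    """Sliding dot products, computed directly (an FFT convolution would replace this)."""
    m = len(pattern_values)
    return [
        sum(pattern_values[j] * text_values[i + j] for j in range(m))
        for i in range(len(text_values) - m + 1)
    ]

def encode_pattern(pattern, s, wildcard="?"):
    """First occurrence of b becomes +s[b], the second -s[b]; unpaired symbols become 0."""
    seen = set()
    out = []
    for c in pattern:
        if c == wildcard or pattern.count(c) < 2:
            out.append(0)
        elif c in seen:
            out.append(-s[c])
        else:
            seen.add(c)
            out.append(s[c])
    return out

def one_round(pattern, text, r, s):
    """Locations where the encoded text and pattern have zero correlation."""
    t_enc = [r[c] for c in text]
    p_enc = encode_pattern(pattern, s)
    return [i for i, v in enumerate(correlate(t_enc, p_enc)) if v == 0]

def randomized_candidates(pattern, text, rounds, rng, s_values=(1, 2)):
    """Keep the locations that survive every round; true matches always survive."""
    candidates = set(range(len(text) - len(pattern) + 1))
    for _ in range(rounds):
        r = {c: rng.randint(0, 1) for c in set(text)}
        s = {c: rng.choice(s_values) for c in set(pattern)}
        candidates &= set(one_round(pattern, text, r, s))
    return sorted(candidates)

print(randomized_candidates("vqvuqu?s", "abaababacabdabcbdba", 10, random.Random(0)))
```

A single round can report locations that are not f-matches; repeating with fresh random choices is what drives the error probability down.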
![Page 38: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/38.jpg)
Example:
P = v q v u q u ? s
T = a b a a b a b a c a b d a b c b d b a
f(a) = 1, f(b) = 0, f(c) = 0, f(d) = 1
g(v) = 2, g(q) = 6, g(u) = 8
f(T) = 1 0 1 1 0 1 0 1 0 1 0 1 1 0 0 0 1 0 1
g(P) = 2 6 –2 8 –6 –8 0 0
At the first location: 2+0–2+8+0–8+0+0 = 0
h(v) = a, h(q) = b, h(u) = a, h(s) = a
![Page 39: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/39.jpg)
Example:
P = v q v u q u ? s
T = a b a a b a b a c a b d a b c b d b a
f(a) = 1, f(b) = 0, f(c) = 0, f(d) = 1
g(v) = 2, g(q) = 6, g(u) = 8
f(T) = 1 0 1 1 0 1 0 1 0 1 0 1 1 0 0 0 1 0 1
g(P) = 2 6 –2 8 –6 –8 0 0
At the second location: 0+6–2+0–6+0+0+0 = –2, which is nonzero, so no match.
![Page 40: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/40.jpg)
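Example:
P = v q v u q u ? s
T = a b a a b a b a c a b d a b c b d b a
f(a) = 1, f(b) = 0, f(c) = 0, f(d) = 1
g(v) = 2, g(q) = 6, g(u) = 8
f(T) = 1 0 1 1 0 1 0 1 0 1 0 1 1 0 0 0 1 0 1
g(P) = 2 6 –2 8 –6 –8 0 0
With P aligned over d a b c b d b a: 2+6+0+0+0–8+0+0 = 0, although P does not f-match there (v is aligned with both d and b).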
![Page 41: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/41.jpg)
Correctness:
If P f-matches at location i of T, then $(f(T) * g(P)^R)[i+m-1]$ is trivially always equal to 0.
If P does not f-match at location i of T, then for each convolution <f,g>, $(f(T) * g(P)^R)[i+m-1]$ equals 0 with probability ½; with k rounds of amplification the probability is $(1/2)^k$.
Running Time:
O(nk log m) with error probability $2^{-k}$
$O(n \log^2 m)$ with error probability 1/m
![Page 42: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/42.jpg)
Limitation of the Convolutions Model
Can we do the same deterministically? No!
To show this we use the model of communication complexity.
[Figure: Alice holds x, Bob holds y; together they compute f(x,y)]
![Page 43: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/43.jpg)
Limitation of the Convolutions Model
Known: for $x, y \in \{0,1\}^k$ the communication complexity of equals(x,y) is $\Omega(k)$.
Take pattern $P = a_1 a_2 a_3 \dots a_m a_1 a_2 a_3 \dots a_m$, where $\forall i \neq j$, $a_i \neq a_j$.
Given a collection of convolutions $\{\langle g(P), f(T) \rangle\}$, the convolution at location $i$ is
$(g(P) * f(T))[i+m-1] = \sum_{j=1}^{m} g(a_j) f(t_{i+j-1}) + \sum_{j=1}^{m} g(a_j) f(t_{i+j+m-1})$.
Since we are in essence comparing $t_i \dots t_{i+m-1}$ to $t_{i+m} \dots t_{i+2m-1}$, we get the equality information from the convolutions. This is lower bounded by $\Omega(m)$ for each location, and in general by $\Omega(nm)$.
![Page 44: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/44.jpg)
Another Application for Function Matching
Protein Folding detection:
[Figure: a folded chain with numbered positions 1 to 10]
P = 1 2 3 4 5 6 7 8 9 10 10 9 8 7 6 5 4 11 12 … 12 11 3 2 1
![Page 45: Function Matching](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814560550346895db23432/html5/thumbnails/45.jpg)
Questions
1. Can Function Matching be solved deterministically in o(nm) time for big alphabets?
2. Are there special cases of Function Matching that are easier (other than Parameterized Matching and other trivial ones)?
3. Does 2-dimensional Parameterized Matching need to be solved with function matching?