simple and fast linear space computation of longest common subsequences claus rick, 1999
Post on 21-Dec-2015
227 views
TRANSCRIPT
![Page 1: Simple and fast linear space computation of Longest common subsequences Claus Rick, 1999](https://reader031.vdocuments.us/reader031/viewer/2022032201/56649d575503460f94a35cab/html5/thumbnails/1.jpg)
Simple and fast linear space computation of Longest common subsequences
Claus Rick, 1999
![Page 2: Simple and fast linear space computation of Longest common subsequences Claus Rick, 1999](https://reader031.vdocuments.us/reader031/viewer/2022032201/56649d575503460f94a35cab/html5/thumbnails/2.jpg)
What is the LCS problem?
A A B A C
A B C
…Finding a sequence of greatest possible length that can be obtained From both A and B by deleting zero or more (not necessarily adjacent) symbols.
![Page 3: Simple and fast linear space computation of Longest common subsequences Claus Rick, 1999](https://reader031.vdocuments.us/reader031/viewer/2022032201/56649d575503460f94a35cab/html5/thumbnails/3.jpg)
Some boring history…Year Author Time Constants Paradigm
1975 Hirschberg O(mn) 2 Dyn. Prog.
1985 Apostolico, Guerra O(mLgm+pm) [2,logm] contours
1986 Myers O(n(n-p)) 2 Shortest path
1987 Kumar, Rangan O(n(m-p)) 3 contours
1990 Wu et al. O(n(m-p)) 2 Shortest path
1992 Apostolico, et al. O(n(m-p)) 3 contours
1992 Apostolico, et al. O(pm) 3 contours
1999 Goeman, Clausen O(min(pm, mLgm + p(n-p)])
[5,25,lgM] contours
1999 This article O(min(pm,p(n-p)]) 2 contours
![Page 4: Simple and fast linear space computation of Longest common subsequences Claus Rick, 1999](https://reader031.vdocuments.us/reader031/viewer/2022032201/56649d575503460f94a35cab/html5/thumbnails/4.jpg)
Pre-Info
Divide and conquer Midpoint
![Page 5: Simple and fast linear space computation of Longest common subsequences Claus Rick, 1999](https://reader031.vdocuments.us/reader031/viewer/2022032201/56649d575503460f94a35cab/html5/thumbnails/5.jpg)
Some basic terms
Ordered Pair (i,j)
A A B A C
A B C
(2,3)= (A,C)
![Page 6: Simple and fast linear space computation of Longest common subsequences Claus Rick, 1999](https://reader031.vdocuments.us/reader031/viewer/2022032201/56649d575503460f94a35cab/html5/thumbnails/6.jpg)
Some basic terms
Match
A A B A C
A B C
![Page 7: Simple and fast linear space computation of Longest common subsequences Claus Rick, 1999](https://reader031.vdocuments.us/reader031/viewer/2022032201/56649d575503460f94a35cab/html5/thumbnails/7.jpg)
Some basic terms
Chain
A A B A C
A B C
![Page 8: Simple and fast linear space computation of Longest common subsequences Claus Rick, 1999](https://reader031.vdocuments.us/reader031/viewer/2022032201/56649d575503460f94a35cab/html5/thumbnails/8.jpg)
Rank k
A A B A C
A B C
![Page 9: Simple and fast linear space computation of Longest common subsequences Claus Rick, 1999](https://reader031.vdocuments.us/reader031/viewer/2022032201/56649d575503460f94a35cab/html5/thumbnails/9.jpg)
Some basic terms
c b a b b a c a cabacbcba
Matching Matrix
![Page 10: Simple and fast linear space computation of Longest common subsequences Claus Rick, 1999](https://reader031.vdocuments.us/reader031/viewer/2022032201/56649d575503460f94a35cab/html5/thumbnails/10.jpg)
Some basic terms
Dominant matches
All Upper-left matches in each rank
![Page 11: Simple and fast linear space computation of Longest common subsequences Claus Rick, 1999](https://reader031.vdocuments.us/reader031/viewer/2022032201/56649d575503460f94a35cab/html5/thumbnails/11.jpg)
c b a b b a c a cabacbcba
Dominant matches
1
2
3
4
5
![Page 12: Simple and fast linear space computation of Longest common subsequences Claus Rick, 1999](https://reader031.vdocuments.us/reader031/viewer/2022032201/56649d575503460f94a35cab/html5/thumbnails/12.jpg)
A A B A C
A B C
![Page 13: Simple and fast linear space computation of Longest common subsequences Claus Rick, 1999](https://reader031.vdocuments.us/reader031/viewer/2022032201/56649d575503460f94a35cab/html5/thumbnails/13.jpg)
c b a b b a c a cabacbcba
![Page 14: Simple and fast linear space computation of Longest common subsequences Claus Rick, 1999](https://reader031.vdocuments.us/reader031/viewer/2022032201/56649d575503460f94a35cab/html5/thumbnails/14.jpg)
c b a b b a c a c
abacbcba
Backward contours (BC)
1
2
3
4
5
![Page 15: Simple and fast linear space computation of Longest common subsequences Claus Rick, 1999](https://reader031.vdocuments.us/reader031/viewer/2022032201/56649d575503460f94a35cab/html5/thumbnails/15.jpg)
Some last basic terms
FCk
BCk
![Page 16: Simple and fast linear space computation of Longest common subsequences Claus Rick, 1999](https://reader031.vdocuments.us/reader031/viewer/2022032201/56649d575503460f94a35cab/html5/thumbnails/16.jpg)
c b a b b a c a cabacbcba
1
2
3
4
5
Forward contours (FC)
![Page 17: Simple and fast linear space computation of Longest common subsequences Claus Rick, 1999](https://reader031.vdocuments.us/reader031/viewer/2022032201/56649d575503460f94a35cab/html5/thumbnails/17.jpg)
c b a b b a c a c
abacbcba
Backward contours (BC)
1
2
3
4
5
![Page 18: Simple and fast linear space computation of Longest common subsequences Claus Rick, 1999](https://reader031.vdocuments.us/reader031/viewer/2022032201/56649d575503460f94a35cab/html5/thumbnails/18.jpg)
Let p be the length of an LCS between strings A and B. Then for every match (i,j) the following holds:
•There is an LCS containing (i,j) if and only if (i,j) is on the kth forward contour and on the (p-k+1)st backward contour.
Lemma 1
![Page 19: Simple and fast linear space computation of Longest common subsequences Claus Rick, 1999](https://reader031.vdocuments.us/reader031/viewer/2022032201/56649d575503460f94a35cab/html5/thumbnails/19.jpg)
Lemma 1- proof
|BC|- (p-k+1)|FC|= (k)
P
P
K <(p-k+1)<(p-k+1)
![Page 20: Simple and fast linear space computation of Longest common subsequences Claus Rick, 1999](https://reader031.vdocuments.us/reader031/viewer/2022032201/56649d575503460f94a35cab/html5/thumbnails/20.jpg)
Start calculating
FC1 BC1 FC2 BC2
Sooner or later…
![Page 21: Simple and fast linear space computation of Longest common subsequences Claus Rick, 1999](https://reader031.vdocuments.us/reader031/viewer/2022032201/56649d575503460f94a35cab/html5/thumbnails/21.jpg)
Really really last terms
Define sets Mi as:
M0= M
M1= M0\FC1
M2= M1\BC1
M2i-1=M2(i-1) \FCi
M2i=M2i-1\BCi
![Page 22: Simple and fast linear space computation of Longest common subsequences Claus Rick, 1999](https://reader031.vdocuments.us/reader031/viewer/2022032201/56649d575503460f94a35cab/html5/thumbnails/22.jpg)
c b a b b a c a cabacbcba
abacbcba
c b a b b a c a c
M
![Page 23: Simple and fast linear space computation of Longest common subsequences Claus Rick, 1999](https://reader031.vdocuments.us/reader031/viewer/2022032201/56649d575503460f94a35cab/html5/thumbnails/23.jpg)
c b a b b a c a cabacbcba
abacbcba
c b a b b a c a c
M1M2M3M4M5
![Page 24: Simple and fast linear space computation of Longest common subsequences Claus Rick, 1999](https://reader031.vdocuments.us/reader031/viewer/2022032201/56649d575503460f94a35cab/html5/thumbnails/24.jpg)
Let call the first empty Mi….
M p’
![Page 25: Simple and fast linear space computation of Longest common subsequences Claus Rick, 1999](https://reader031.vdocuments.us/reader031/viewer/2022032201/56649d575503460f94a35cab/html5/thumbnails/25.jpg)
Lemma 2
The Length of an LCS is p’ and each match in M(p’-1) is a possible midpoint
![Page 26: Simple and fast linear space computation of Longest common subsequences Claus Rick, 1999](https://reader031.vdocuments.us/reader031/viewer/2022032201/56649d575503460f94a35cab/html5/thumbnails/26.jpg)
Lemma 2- proof
K
M 0
K-1K-210
M 2M 1M k-1M kK=p
![Page 27: Simple and fast linear space computation of Longest common subsequences Claus Rick, 1999](https://reader031.vdocuments.us/reader031/viewer/2022032201/56649d575503460f94a35cab/html5/thumbnails/27.jpg)
Little problem…
We can`t keep tracks of each set- very expensive
![Page 28: Simple and fast linear space computation of Longest common subsequences Claus Rick, 1999](https://reader031.vdocuments.us/reader031/viewer/2022032201/56649d575503460f94a35cab/html5/thumbnails/28.jpg)
c b a b b a c a cabacbcba
abacbcba
c b a b b a c a c
![Page 29: Simple and fast linear space computation of Longest common subsequences Claus Rick, 1999](https://reader031.vdocuments.us/reader031/viewer/2022032201/56649d575503460f94a35cab/html5/thumbnails/29.jpg)
What do we do?
Keep only dominant matches…
When we see a dominant match below- done.
![Page 30: Simple and fast linear space computation of Longest common subsequences Claus Rick, 1999](https://reader031.vdocuments.us/reader031/viewer/2022032201/56649d575503460f94a35cab/html5/thumbnails/30.jpg)
c b a b b a c a cabacbcba
abacbcba
c b a b b a c a c
![Page 31: Simple and fast linear space computation of Longest common subsequences Claus Rick, 1999](https://reader031.vdocuments.us/reader031/viewer/2022032201/56649d575503460f94a35cab/html5/thumbnails/31.jpg)
Lets define:
FCf’ , BCb’ the minimal indices as stated above
![Page 32: Simple and fast linear space computation of Longest common subsequences Claus Rick, 1999](https://reader031.vdocuments.us/reader031/viewer/2022032201/56649d575503460f94a35cab/html5/thumbnails/32.jpg)
Lemma 3
The Length of an LCS is b’ + f’ -1.
![Page 33: Simple and fast linear space computation of Longest common subsequences Claus Rick, 1999](https://reader031.vdocuments.us/reader031/viewer/2022032201/56649d575503460f94a35cab/html5/thumbnails/33.jpg)
Complexity
Finding the dominant matches each contour:
O(min(m, (n-p))
Number of contours:
P
O(Min(pm, p(n-p)
![Page 34: Simple and fast linear space computation of Longest common subsequences Claus Rick, 1999](https://reader031.vdocuments.us/reader031/viewer/2022032201/56649d575503460f94a35cab/html5/thumbnails/34.jpg)
The End
![Page 35: Simple and fast linear space computation of Longest common subsequences Claus Rick, 1999](https://reader031.vdocuments.us/reader031/viewer/2022032201/56649d575503460f94a35cab/html5/thumbnails/35.jpg)
Simple and fast linear space computation of longest common subsequence
Written by: Claus Rick,1999
Based on algorithm by:D.Hirschberg, 1975
Cast:
MatricesLines
ArrowsSquares
Blue Red
BrownGreyBlack
String AString B
Presentation: Uri Scheiner
No Dominant Matches were harmed during the making of this presentation
![Page 36: Simple and fast linear space computation of Longest common subsequences Claus Rick, 1999](https://reader031.vdocuments.us/reader031/viewer/2022032201/56649d575503460f94a35cab/html5/thumbnails/36.jpg)
Appendix
What is the LCS
Divided And Conquer
Match
Chain
Dominant Matches
FC
BC
Lemma 1
Define M…
Lemma 2
Keep just Dominant…
Lemma 3
Complexity