Simple Algorithm for Sorting the Fibonacci String Rotations
Manolis Christodoulakis, King’s College London
Joint work with Costas S. Iliopoulos and Yoan José Pinzón Ardila
SOFSEM 2006
Our Goal
What makes Fibonacci strings a best-case input for the Burrows-Wheeler Transform (BWT)?
Relationship between different rotations of a Fibonacci string
What is their lexicographic order?
Side effect: we can deduce the symbol stored at any position of any Fibonacci string in constant time (without using […], provided that the $f_n$ values are known)
Fibonacci Strings & Numbers
The n-th Fibonacci string: $F_n = F_{n-1}F_{n-2}$ for $n \ge 2$, with $F_0 = b$, $F_1 = a$
The n-th Fibonacci number: $f_n = f_{n-1} + f_{n-2}$ for $n \ge 2$, with $f_0 = 1$, $f_1 = 1$
$F_0 = b$, $f_0 = 1$
$F_1 = a$, $f_1 = 1$
$F_2 = ab$, $f_2 = 2$
$F_3 = aba$, $f_3 = 3$
$F_4 = abaab$, $f_4 = 5$
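The two recurrences above can be sketched in Python as follows. This is only a minimal illustration; the names `fib_string` and `fib_number` are hypothetical helpers, not part of the talk.

```python
def fib_string(n: int) -> str:
    """Return F_n, assuming F_0 = 'b', F_1 = 'a', F_n = F_{n-1} F_{n-2}."""
    if n == 0:
        return "b"
    prev, cur = "b", "a"              # F_0, F_1
    for _ in range(n - 1):
        prev, cur = cur, cur + prev   # F_k = F_{k-1} F_{k-2}
    return cur


def fib_number(n: int) -> int:
    """Return f_n, assuming f_0 = f_1 = 1, f_n = f_{n-1} + f_{n-2}."""
    prev, cur = 1, 1                  # f_0, f_1
    for _ in range(n - 1):
        prev, cur = cur, prev + cur
    return cur


# Print the first few strings next to their numbers.
for n in range(6):
    print(n, fib_string(n), fib_number(n))
```

With these initial values the length of $F_n$ equals $f_n$, which is why $f_n$ can serve as the modulus for rotation indices.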
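Notation
The i-th rotation of a string $x = x[0]\,x[1] \dots x[n-1]$:
$R_i(x) = x[i] \dots x[n-1]\,x[0]\,x[1] \dots x[i-1]$
where i is taken modulo n.
$\mathrm{rank}(i,x)$ = the rank of $R_i(x)$ in the sorted list of rotations of x
$\mathrm{rot}(\rho,x)$ = the rotation whose rank is $\rho$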
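The notation can be made concrete with a brute-force sketch. It assumes ranks are counted from 0 (this matches the closed forms given later in the talk) and that all rotations of x are distinct, which holds for Fibonacci strings; the function names are illustrative only.

```python
def rotation(x: str, i: int) -> str:
    """R_i(x): move the first i symbols of x to the end (i taken modulo len(x))."""
    i %= len(x)
    return x[i:] + x[:i]


def sorted_rotation_indices(x: str) -> list:
    """Indices i of the rotations R_i(x), listed in lexicographic order."""
    return sorted(range(len(x)), key=lambda i: rotation(x, i))


def rot(rho: int, x: str) -> int:
    """Index of the rotation whose (0-based) rank is rho."""
    return sorted_rotation_indices(x)[rho]


def rank(i: int, x: str) -> int:
    """0-based rank of R_i(x) among all rotations of x."""
    return sorted_rotation_indices(x).index(i % len(x))


x = "abaab"  # F_4
print([rot(rho, x) for rho in range(len(x))])
print([rank(i, x) for i in range(len(x))])
```

Both functions sort all rotations on every call, so they are only a reference against which faster formulas can be checked.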
Burrows-Wheeler Transform (BWT)
M. Burrows and D. J. Wheeler, 1994
Purpose: to make a string more compressible
BWT Algorithm:
1. Create list of all rotations
2. Sort them
3. Output last symbol of every rotation
4. Output the rank of the 0-th rotation
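The four steps can be sketched directly in Python. This is a naive quadratic-space version meant only to mirror the list above, not an efficient BWT; the name `bwt` and the 0-based rank are assumptions of this sketch.

```python
from typing import Tuple


def bwt(x: str) -> Tuple[str, int]:
    """Naive BWT: returns (last column, 0-based rank of the 0-th rotation)."""
    n = len(x)
    # Step 1: create the list of all rotations.
    rotations = [x[i:] + x[:i] for i in range(n)]
    # Step 2: sort them (we sort indices so that step 4 can find rotation 0).
    order = sorted(range(n), key=lambda i: rotations[i])
    # Step 3: output the last symbol of every rotation, in sorted order.
    last_column = "".join(rotations[i][-1] for i in order)
    # Step 4: output the rank of the 0-th rotation.
    return last_column, order.index(0)


print(bwt("abaab"))  # the Fibonacci string F_4
```

Returning the rank of the 0-th rotation alongside the last column is what makes the transform invertible.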
BWT on Fibonacci Strings
$F_5 = abaababa$, $f_5 = 8$
$R_0(F_5)$ = abaababa
$R_1(F_5)$ = baababaa
$R_2(F_5)$ = aababaab
$R_3(F_5)$ = ababaaba
$R_4(F_5)$ = babaabaa
$R_5(F_5)$ = abaabaab
$R_6(F_5)$ = baabaaba
$R_7(F_5)$ = aabaabab
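To see what sorting does to this particular list, the following sketch sorts the eight rotations of $F_5$ and collects their last symbols. The variable names are illustrative, and ranks are assumed to start at 0.

```python
F5 = "abaababa"

# Pair every rotation with its index, then sort lexicographically.
rows = sorted((F5[i:] + F5[:i], i) for i in range(len(F5)))

for rank, (text, i) in enumerate(rows):
    print(f"rank {rank}: R{i}(F5) = {text}   last symbol: {text[-1]}")

sorted_indices = [i for _, i in rows]
bwt_output = "".join(text[-1] for text, _ in rows)
rank_of_r0 = sorted_indices.index(0)

print(sorted_indices)  # [7, 2, 5, 0, 3, 6, 1, 4]
print(bwt_output)      # bbbaaaaa
print(rank_of_r0)      # 3
```

The last column consists of a single run of ‘b’ followed by a single run of ‘a’, which is the best-case behaviour the talk sets out to explain.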
Properties of Fibonacci Strings
The number of ‘b’ in $F_n$ is $f_{n-2}$
Proof: By induction.
C. S. Iliopoulos, D. W. Moore and W. F. Smyth, 1997:
$F_n = F_{n-2}F_{n-3} \dots F_1 u_n$, where $u_n = ba$ (n odd) and $u_n = ab$ (n even)
Let’s call this the IMS formula.
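Both properties are easy to check empirically for small n. The sketch below is a sanity check, not a proof; `ims_rhs` and `check_properties` are hypothetical helper names, and the check starts at n = 2, where the product $F_{n-2} \dots F_1$ is taken to be empty.

```python
def fib_string(n: int) -> str:
    """F_0 = 'b', F_1 = 'a', F_n = F_{n-1} F_{n-2}."""
    if n == 0:
        return "b"
    prev, cur = "b", "a"
    for _ in range(n - 1):
        prev, cur = cur, cur + prev
    return cur


def ims_rhs(n: int) -> str:
    """Right-hand side of the IMS formula: F_{n-2} F_{n-3} ... F_1 u_n."""
    u_n = "ba" if n % 2 == 1 else "ab"
    prefix = "".join(fib_string(k) for k in range(n - 2, 0, -1))
    return prefix + u_n


def check_properties(max_n: int) -> bool:
    """Check the 'b' count and the IMS formula for 2 <= n <= max_n."""
    for n in range(2, max_n + 1):
        F = fib_string(n)
        if F != ims_rhs(n):
            return False
        if F.count("b") != len(fib_string(n - 2)):  # |F_{n-2}| = f_{n-2}
            return False
    return True


print(check_properties(15))
```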
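Similarities in Rotations
$R_0(F_n)$ differs from $R_{f_{n-2}}(F_n)$ in 2 symbols
Proof:
By the IMS formula, $R_0(F_n) = F_{n-2}F_{n-3} \dots F_1 u_n$, so rotating by $|F_{n-2}| = f_{n-2}$ gives
$R_{f_{n-2}}(F_n) = F_{n-3} \dots F_1 u_n F_{n-2}$ (1)
By the definition of $F_n$ and the IMS formula applied to $F_{n-1}$,
$R_0(F_n) = F_{n-1}F_{n-2} = F_{n-3} \dots F_1 u_{n-1} F_{n-2}$ (2)
(1) and (2) differ only in $u_n$ versus $u_{n-1}$, that is, in ab versus ba.
$R_i(F_n)$ differs from $R_{i+f_{n-2}}(F_n)$ in 2 symbols
Proof:
$R_i(F_n) = R_i(R_0(F_n))$
$R_{i+f_{n-2}}(F_n) = R_i(R_{f_{n-2}}(F_n))$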
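The claim can be checked by counting mismatches between each rotation and the rotation $f_{n-2}$ positions further on. This is a small empirical sketch with illustrative helper names, not a substitute for the proof.

```python
def fib_string(n: int) -> str:
    """F_0 = 'b', F_1 = 'a', F_n = F_{n-1} F_{n-2}."""
    if n == 0:
        return "b"
    prev, cur = "b", "a"
    for _ in range(n - 1):
        prev, cur = cur, cur + prev
    return cur


def rotation(x: str, i: int) -> str:
    """R_i(x), with i taken modulo len(x)."""
    i %= len(x)
    return x[i:] + x[:i]


def hamming(s: str, t: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(a != b for a, b in zip(s, t))


def mismatch_counts(n: int) -> set:
    """Set of Hamming distances between R_i(F_n) and R_{i+f_{n-2}}(F_n) over all i."""
    F = fib_string(n)
    shift = len(fib_string(n - 2))  # f_{n-2}
    return {hamming(rotation(F, i), rotation(F, i + shift)) for i in range(len(F))}


for n in range(2, 10):
    print(n, mismatch_counts(n))
```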
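Relative Order of Rotations
$R_i(F_n) < R_{i+f_{n-2}}(F_n)$ for n odd, $i \ne f_{n-1}-1$
Proof:
$R_0(F_n) = F_{n-3} \dots F_1 u_{n-1} F_{n-2} = F_{n-3} \dots F_1\,ab\,F_{n-2}$
$R_{f_{n-2}}(F_n) = F_{n-3} \dots F_1 u_n F_{n-2} = F_{n-3} \dots F_1\,ba\,F_{n-2}$
For $i = f_{n-1}-1$ the rotation splits the differing pair, and the order is reversed:
$R_i(F_n) = b\,F_{n-2}F_{n-3} \dots F_1\,a$
$R_{i+f_{n-2}}(F_n) = a\,F_{n-2}F_{n-3} \dots F_1\,b$
Similarly, $R_i(F_n) > R_{i+f_{n-2}}(F_n)$ for n even, $i \ne f_{n-1}-1$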
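A quick way to convince oneself of the single exception is to list every index i at which the stated inequality fails. The sketch below does this by brute force; `order_exceptions` is a hypothetical helper name.

```python
def fib_string(n: int) -> str:
    """F_0 = 'b', F_1 = 'a', F_n = F_{n-1} F_{n-2}."""
    if n == 0:
        return "b"
    prev, cur = "b", "a"
    for _ in range(n - 1):
        prev, cur = cur, cur + prev
    return cur


def rotation(x: str, i: int) -> str:
    """R_i(x), with i taken modulo len(x)."""
    i %= len(x)
    return x[i:] + x[:i]


def order_exceptions(n: int) -> list:
    """Indices i where R_i < R_{i+f_{n-2}} (n odd) or R_i > R_{i+f_{n-2}} (n even) fails."""
    F = fib_string(n)
    shift = len(fib_string(n - 2))  # f_{n-2}
    exceptions = []
    for i in range(len(F)):
        left, right = rotation(F, i), rotation(F, i + shift)
        holds = left < right if n % 2 == 1 else left > right
        if not holds:
            exceptions.append(i)
    return exceptions


for n in range(2, 10):
    # The second column is f_{n-1} - 1, the index excluded in the statement.
    print(n, order_exceptions(n), len(fib_string(n - 1)) - 1)
```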
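Sorted List of Rotations
We proved (n odd): $R_i(F_n) < R_{i+f_{n-2}}(F_n)$ for $i \ne f_{n-1}-1$ (3)
We will now prove that there is no j such that $R_i(F_n) < R_j(F_n) < R_{i+f_{n-2}}(F_n)$
Proof (constructive): Start at $i = f_n - 1$ and construct the partial list
$R_i,\ R_{i+f_{n-2}},\ R_{i+2f_{n-2}},\ R_{i+3f_{n-2}},\ \dots,\ R_{i+kf_{n-2}},\ \dots$ (indices modulo $f_n$)
for as long as $i + kf_{n-2} \not\equiv f_{n-1}-1 \pmod{f_n}$, that is, for $k < f_n - 1$.
By (3) the list is increasing, and it contains all $f_n$ rotations before (3) stops applying.
I.e. the list is complete!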
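The construction can be replayed in a few lines: start at index $f_n - 1$, keep adding $f_{n-2}$ modulo $f_n$, and stop once index $f_{n-1} - 1$ has been reached. The sketch below assumes n is odd and at least 3; `chain_order` is an illustrative name.

```python
def fib_string(n: int) -> str:
    """F_0 = 'b', F_1 = 'a', F_n = F_{n-1} F_{n-2}."""
    if n == 0:
        return "b"
    prev, cur = "b", "a"
    for _ in range(n - 1):
        prev, cur = cur, cur + prev
    return cur


def chain_order(n: int) -> list:
    """Rotation indices in the order produced by the constructive proof (n odd, n >= 3)."""
    fn = len(fib_string(n))             # f_n
    step = len(fib_string(n - 2))       # f_{n-2}
    stop = len(fib_string(n - 1)) - 1   # f_{n-1} - 1, where inequality (3) no longer applies
    i = fn - 1
    order = [i]
    while i != stop:
        i = (i + step) % fn
        order.append(i)
    return order


def sorted_order(n: int) -> list:
    """Rotation indices of F_n in lexicographic order, by brute force."""
    F = fib_string(n)
    return sorted(range(len(F)), key=lambda i: F[i:] + F[:i])


print(chain_order(5))
print(sorted_order(5))
```

For $F_5$ both calls give 7, 2, 5, 0, 3, 6, 1, 4: the chain visits every index exactly once before it reaches the excluded one.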
Identify Rotation (i) by Rank (ρ)
Therefore, for n odd:
$\mathrm{rot}(\rho,F_n) = (f_n - 1 + \rho f_{n-2}) \bmod f_n = (\rho f_{n-2} - 1) \bmod f_n$
Similarly, for n even, the sorted list is constructed bottom-up, giving
$\mathrm{rot}(\rho,F_n) = (-(\rho+1) f_{n-2} - 1) \bmod f_n$
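The closed forms can be compared against explicit sorting. The sketch below assumes 0-based ranks and n ≥ 2; `rot_closed_form` and `rot_brute_force` are hypothetical names. Python's `%` returns a non-negative result for a positive modulus, so the negative expression in the even case needs no extra handling.

```python
def fib_numbers(n: int) -> list:
    """[f_0, ..., f_n] with f_0 = f_1 = 1."""
    f = [1, 1]
    while len(f) <= n:
        f.append(f[-1] + f[-2])
    return f


def fib_string(n: int) -> str:
    """F_0 = 'b', F_1 = 'a', F_n = F_{n-1} F_{n-2}."""
    if n == 0:
        return "b"
    prev, cur = "b", "a"
    for _ in range(n - 1):
        prev, cur = cur, cur + prev
    return cur


def rot_closed_form(rho: int, n: int) -> int:
    """Index of the rotation of F_n with 0-based rank rho, via the closed forms (n >= 2)."""
    f = fib_numbers(n)
    if n % 2 == 1:
        return (rho * f[n - 2] - 1) % f[n]
    return (-(rho + 1) * f[n - 2] - 1) % f[n]


def rot_brute_force(rho: int, n: int) -> int:
    """Same quantity, obtained by actually sorting the rotations."""
    x = fib_string(n)
    order = sorted(range(len(x)), key=lambda i: x[i:] + x[:i])
    return order[rho]


print([rot_closed_form(rho, 5) for rho in range(8)])
print([rot_closed_form(rho, 4) for rho in range(5)])
```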
Identify Rank (ρ) of a Rotation (i)
This is simply the inverse of the previous function
n odd: $\mathrm{rank}(i,F_n) = ((i+1) f_{n-2}) \bmod f_n$
n even: $\mathrm{rank}(i,F_n) = ((i+1) f_{n-2} - 1) \bmod f_n$
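That the two functions really are inverses can be checked with a round trip. The sketch below restates the closed form for rot from the previous slide next to the one for rank; the function names are illustrative, ranks are 0-based, and n ≥ 2 is assumed.

```python
def fib_numbers(n: int) -> list:
    """[f_0, ..., f_n] with f_0 = f_1 = 1."""
    f = [1, 1]
    while len(f) <= n:
        f.append(f[-1] + f[-2])
    return f


def rank_closed_form(i: int, n: int) -> int:
    """0-based rank of R_i(F_n) among the rotations of F_n (n >= 2)."""
    f = fib_numbers(n)
    if n % 2 == 1:
        return ((i + 1) * f[n - 2]) % f[n]
    return ((i + 1) * f[n - 2] - 1) % f[n]


def rot_closed_form(rho: int, n: int) -> int:
    """Index of the rotation of F_n with 0-based rank rho (n >= 2)."""
    f = fib_numbers(n)
    if n % 2 == 1:
        return (rho * f[n - 2] - 1) % f[n]
    return (-(rho + 1) * f[n - 2] - 1) % f[n]


def is_inverse(n: int) -> bool:
    """True if rank(rot(rho)) == rho for every rank rho of F_n."""
    fn = fib_numbers(n)[n]
    return all(rank_closed_form(rot_closed_form(rho, n), n) == rho for rho in range(fn))


print([rank_closed_form(i, 5) for i in range(8)])
print(all(is_inverse(n) for n in range(2, 20)))
```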
Symbols of Fibonacci Strings
$F_n[i] = ?$
Observe that $F_n[i] = R_i(F_n)[0]$
In the sorted list of rotations, the first $f_{n-1}$ rotations start with ‘a’, the rest with ‘b’
Thus $F_n[i]$ can be deduced from $\mathrm{rank}(i,F_n)$
With ranks counted from 0: if $\mathrm{rank}(i,F_n) < f_{n-1}$ then $F_n[i] = a$, else $F_n[i] = b$.
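Put together, a symbol lookup needs one multiplication, one modulo and one comparison once the $f_n$ values are available. The sketch below passes them in as a precomputed list; `fib_symbol` is a hypothetical name, ranks are 0-based, and n ≥ 2 is assumed.

```python
def fib_numbers(n: int) -> list:
    """[f_0, ..., f_n] with f_0 = f_1 = 1."""
    f = [1, 1]
    while len(f) <= n:
        f.append(f[-1] + f[-2])
    return f


def fib_symbol(i: int, n: int, f: list) -> str:
    """F_n[i] computed from rank(i, F_n), given precomputed f[0..n] (n >= 2)."""
    if n % 2 == 1:
        r = ((i + 1) * f[n - 2]) % f[n]
    else:
        r = ((i + 1) * f[n - 2] - 1) % f[n]
    # The f_{n-1} smallest rotations (ranks 0 .. f_{n-1} - 1) start with 'a'.
    return "a" if r < f[n - 1] else "b"


f = fib_numbers(6)
print("".join(fib_symbol(i, 5, f) for i in range(f[5])))
print("".join(fib_symbol(i, 6, f) for i in range(f[6])))
```

The first line printed spells out $F_5$ = abaababa without ever building the string by concatenation.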
BWT & Fibonacci: The Quick Way
The first $f_{n-2}$ symbols of BWT are ‘b’
Proof (n odd):
We proved the first $f_{n-2}$ rotations have index $(\rho f_{n-2} - 1) \bmod f_n$ for $0 \le \rho < f_{n-2}$
The last symbol of these rotations is $F_n[(\rho f_{n-2} - 1 + f_n - 1) \bmod f_n]$
Which for $0 \le \rho < f_{n-2}$ is ‘b’
The next $f_{n-1}$ symbols of BWT are ‘a’
Proof: Consequence of previous lemma
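As a consequence, the BWT of a Fibonacci string can be written down without sorting anything. The sketch below builds the output from the $f_n$ values alone and compares it with a naive BWT. It assumes 0-based ranks, n ≥ 2, and that the even case mirrors the odd case proved above; the function names are illustrative.

```python
def fib_numbers(n: int) -> list:
    """[f_0, ..., f_n] with f_0 = f_1 = 1."""
    f = [1, 1]
    while len(f) <= n:
        f.append(f[-1] + f[-2])
    return f


def fib_string(n: int) -> str:
    """F_0 = 'b', F_1 = 'a', F_n = F_{n-1} F_{n-2}."""
    if n == 0:
        return "b"
    prev, cur = "b", "a"
    for _ in range(n - 1):
        prev, cur = cur, cur + prev
    return cur


def bwt_naive(x: str) -> tuple:
    """(last column, 0-based rank of rotation 0), obtained by sorting all rotations."""
    order = sorted(range(len(x)), key=lambda i: x[i:] + x[:i])
    return "".join(x[i - 1] for i in order), order.index(0)


def bwt_fibonacci(n: int) -> tuple:
    """BWT of F_n without sorting: f_{n-2} copies of 'b', then f_{n-1} copies of 'a'."""
    f = fib_numbers(n)
    last_column = "b" * f[n - 2] + "a" * f[n - 1]
    # Rank of the 0-th rotation, from the closed form rank(0, F_n).
    rank0 = f[n - 2] % f[n] if n % 2 == 1 else (f[n - 2] - 1) % f[n]
    return last_column, rank0


print(bwt_fibonacci(5))
print(bwt_naive(fib_string(5)))
```

In `bwt_naive`, `x[i - 1]` is the last symbol of $R_i(x)$; for i = 0 the index -1 wraps to the final symbol of x, which matches the $+ f_n - 1$ term in the proof.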