
arXiv:1606.04763v1 [cs.DS] 15 Jun 2016

A Simpler Bit-parallel Algorithm for Swap Matching

Václav Blažej, Ondřej Suchý, and Tomáš Valla

Faculty of Information Technology, Czech Technical University in Prague, Prague, Czech Republic

Abstract. The pattern matching problem with swaps is to find all occurrences of a pattern in a text while allowing the pattern to swap adjacent symbols. The goal is to design a fast matching algorithm that takes advantage of the bit parallelism of bitwise machine instructions. We point out a fatal flaw in the algorithm proposed by Ahmed et al. [The swap matching problem revisited, Theor. Comp. Sci. 2014], which we describe in detail. Furthermore we devise a new swap pattern matching algorithm which is based on the same graph theoretical model as the algorithm by Ahmed et al. (thus still not based on FFT) and we prove its correctness. We also show that an approach using deterministic finite automata cannot achieve similarly efficient algorithms.

1 Introduction

In the Pattern Matching problem with Swaps (Swap Matching, for short), the goal is to find all occurrences of any swap version of a pattern P in a text T, where P and T are strings over an alphabet Σ of length p and t, respectively. By the swap version of a pattern P we mean a string of symbols created from P by swapping adjacent symbols while ensuring that each symbol is swapped at most once (see Section 2 for formal definitions). The solution of Swap Matching is a set of indices which represent where the swap matches of P in T begin (or, alternatively, end). Swap Matching is an intensively studied problem due to its use in practical applications such as text and music retrieval, data mining, network security and biological computing [7].

Swap Matching was introduced in 1995 as an open problem in non-standard string matching [18]. The first result was reported by Amir et al. [2] in 1997, who provided an O(tp^{1/3} log p)-time solution for alphabets of size 2, while also showing that alphabets of size exceeding 2 can be reduced to size 2 with a little overhead. Amir et al. [5] also came up with a solution with O(t log² p) time complexity for some very restrictive cases. Several years later Amir et al. [3] showed that Swap Matching can be solved by an algorithm for the overlap matching achieving the running time of O(t log p log |Σ|). This algorithm as well as all the previous ones is based on fast Fourier transformation (FFT).

In 2008 Iliopoulos and Rahman [16] came up with the first efficient solution to Swap Matching without using FFT and introduced a new graph theoretic approach to model the problem. Their algorithm based on bit parallelism runs in O((t + p) log p) time if the pattern length is similar to the word-size of the target machine. One year later Cantone and Faro [9] presented the Cross Sampling algorithm solving Swap Matching in O(t) time and O(|Σ|) space, assuming that the pattern length is similar to the word-size in the target machine. In the same year Campanelli et al. [8] enhanced the Cross Sampling algorithm using notions from the Backward directed acyclic word graph matching algorithm and named the new algorithm Backward Cross Sampling. This algorithm also assumes short pattern length. Although Backward Cross Sampling has O(|Σ|) space and O(tp) time complexity, which is worse than that of Cross Sampling, it improves the real-world performance.

In 2013 Faro [13] presented a new model to solve Swap Matching using reactive automata and also presented a new algorithm with O(t) time complexity assuming short patterns. The same year Chedid [11] improved the dynamic programming solution by Cantone and Faro [9], which results in simpler algorithms and implementations. In 2014 a minor improvement by Fredriksson and Giaquinta [14] appeared, yielding slightly (at most factor |Σ|) better asymptotic time complexity (and also slightly worse space complexity) for special cases of patterns. The same year Ahmed et al. [1] took a new look at Swap Matching using ideas from the algorithm by Iliopoulos and Rahman [16]. They devised two algorithms named Smalgo-I and Smalgo-II which both run in O(t) for short patterns and utilize the so-called graph theoretical approach to Swap Matching. Unfortunately, these algorithms contain a fatal flaw, as we show in this paper.

Another remarkable effort related to Swap Matching is to actually count the number of swaps needed to match the pattern at the location [6]. This is more often studied with an extra operation of changing characters allowed [4,12,17].

Our Contribution We describe a fatal flaw in the Smalgo-I and Smalgo-II algorithms proposed by Ahmed et al. [1]. The flaw is actually already present in the algorithm introduced by Iliopoulos and Rahman [16].

We further propose a way to employ the graph theoretical approach correctly by designing a new algorithm for Swap Matching. This algorithm has O(⌈p/w⌉(|Σ| + t) + p) time and O(⌈p/w⌉|Σ|) space complexity, where w is the word-size of the machine. We would also like to stress that our solution, as based on the graph theoretic approach, does not use FFT. Therefore, it yields a much simpler non-recursive algorithm allowing bit parallelism and not suffering from the disadvantages of the convolution-based methods. Although our algorithm does not provide better asymptotic bounds than some of the previous results [9,14], we believe that its strength lies in the applications where the alphabet is small and the pattern length is at most the word-size, as it can be implemented using only 7 + |Σ| CPU registers and few machine instructions. This makes it practical for tasks like DNA sequence scanning. We have prepared an implementation of both the Smalgo-I and our own algorithm for illustration purposes, which is available for download.¹

Finally we show that there are instances for which any deterministic finite automaton solving Swap Matching must have at least an exponential number of states. This means that a possible approach to Swap Matching by designing a suitable deterministic finite automaton cannot lead to an efficient solution.

This paper is organized as follows. First we introduce all the basic definitions, recall the graph model introduced in [16] and its use for matching in Section 2. Then we describe the algorithms Smalgo-I and Smalgo-II and the flaw in detail in Section 3. We show there an input pattern and a text sequence which cause the algorithms to produce an incorrect output. In Section 4 we show our own algorithm which uses the graph model in a new way. Section 5 contains the proof that Swap Matching cannot be solved efficiently by deterministic finite automata.

2 Basic Definitions and the Graph Theoretical Model

In this section we collect the basic definitions, present the graph theoretical model and show a basic algorithm that solves Swap Matching using the model.

Notations and Basic Definitions We use the word-RAM as our computational model. That means we have access to memory cells of fixed capacity w (e.g., 64 bits). A standard set of arithmetic and bitwise instructions includes And (&), Or (|), Left bitwise-shift (LShift) and Right bitwise-shift (RShift). Each of the standard operations on words takes a single unit of time. The input is read from a read-only part of memory and the output is written to a write-only part of memory. We do not include the input and the output in the space complexity analysis.

A string S over an alphabet Σ is a finite sequence of symbols from Σ and |S| is its length. By S_i we mean the i-th symbol of S and we define a substring S_{[i,j]} = S_i S_{i+1} . . . S_j for 1 ≤ i ≤ j ≤ |S|, and a prefix S_{[1,i]} for 1 ≤ i ≤ |S|. String P prefix matches string T n symbols on position i if P_{[1,n]} = T_{[i,i+n−1]}.

Next we formally introduce a swapped version of a string.

Definition 1 (Campanelli et al. [8]). A swap permutation for S is a permutation π : {1, . . . , n} → {1, . . . , n} such that:

(i) if π(i) = j then π(j) = i (symbols at positions i and j are swapped),
(ii) for all i, π(i) ∈ {i − 1, i, i + 1} (only adjacent symbols are swapped),
(iii) if π(i) ≠ i then S_{π(i)} ≠ S_i (identical symbols are not swapped).

For a string S a swapped version π(S) is a string π(S) = S_{π(1)} S_{π(2)} . . . S_{π(n)} where π is a swap permutation for S.
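As a concrete illustration of Definition 1, here is a brute-force sketch in Python (ours, not part of the paper; the function names are just illustrative). It enumerates all swapped versions of a string by deciding, left to right, whether each pair of adjacent distinct symbols is swapped, and it tests a swap match by direct comparison. It is exponential in the worst case and serves only as a reference against which the bit-parallel algorithms below can be checked.

```python
def swapped_versions(s):
    """All swapped versions of s per Definition 1: only adjacent symbols swap,
    each symbol is swapped at most once, identical symbols are not swapped."""
    out = set()

    def rec(i, acc):
        if i == len(s):
            out.add("".join(acc))
            return
        rec(i + 1, acc + [s[i]])                    # keep s[i] in place
        if i + 1 < len(s) and s[i] != s[i + 1]:     # or swap it with s[i+1]
            rec(i + 2, acc + [s[i + 1], s[i]])

    rec(0, [])
    return out


def swap_matches_at(pattern, text, pos):
    """Does some swapped version of pattern match text at 1-based position pos?"""
    return text[pos - 1:pos - 1 + len(pattern)] in swapped_versions(pattern)


if __name__ == "__main__":
    print(sorted(swapped_versions("acbab")))
    print(swap_matches_at("acbab", "babcabc", 2))   # True (cf. Fig. 2 below)
```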

Now we formalize the version of matching we are interested in.

1 http://users.fit.cvut.cz/blazeva1/gsm.html


Fig. 1. P-graph P_P for the pattern P = abcbbac.

Definition 2. Given a text T = T_1 T_2 . . . T_t and a pattern P = P_1 P_2 . . . P_p, the pattern P is said to swap match T at location i if there exists a swapped version π(P) of P that matches T at location i, i.e., π(P) = T_{[i,i+p−1]}.

A Graph Theoretic Model The algorithms in this paper are based on a model introduced by Iliopoulos and Rahman [16]. In this section we briefly describe this model.

For a pattern P of length p we construct a labeled graph P_P = (V, E, σ) with vertices V, edges E, and a vertex labeling function σ : V → Σ (see Fig. 1 for an example). Let V = V′ \ {m_{−1,1}, m_{1,p}} where V′ = {m_{r,c} | r ∈ {−1, 0, 1}, c ∈ {1, 2, . . . , p}}. For m_{r,c} ∈ V we set σ(m_{r,c}) = P_{r+c}. Each vertex m_{r,c} is identified with an element of a 3 × p grid. We set E′ := E′_1 ∪ E′_2 ∪ · · · ∪ E′_{p−1}, where E′_j := {(m_{k,j}, m_{i,j+1}) | k ∈ {−1, 0}, i ∈ {0, 1}} ∪ {(m_{1,j}, m_{−1,j+1})}, and let E = E′ ∩ (V × V). We call P_P the P-graph. Note that P_P is a directed acyclic graph, |V(P_P)| = 3p − 2, and |E(P_P)| = 5(p − 1) − 4.

The idea behind the construction of P_P is as follows. We create vertices V′ and edges E′ which represent every swap pattern without any restrictions (equal symbols can be swapped, beginning and end can swap to invalid positions). We remove vertices m_{−1,1} and m_{1,p} because they would represent symbols from invalid indices 0 and p + 1.

The P-graph now represents all possible swap permutations of the pattern P in the following sense. Vertices m_{0,j} represent ends of prefixes of swapped versions of the pattern which end with a non-swapped symbol. A possible swap of symbols P_j and P_{j+1} is represented by vertices m_{1,j} and m_{−1,j+1}. Edges represent symbols which can be consecutive. Each path from column 1 to column p represents a swap pattern and each swap pattern is represented this way.

The following definition gives a graph model equivalent of the P-graph for ordinary matching.

Definition 3. Let T be a string. The T-graph of T is a graph T_T = (V, E, σ) where V = {v_i | 1 ≤ i ≤ |T|}, E = {(v_i, v_{i+1}) | 1 ≤ i ≤ |T| − 1} and σ : V → Σ such that σ(v_i) = T_i.

The next definition assigns to each path a string of labels of vertices along it.


Algorithm 1 The basic matching algorithm (BMA).

Input: Labeled directed acyclic graph G = (V, E, σ), set Q_0 ⊆ V of starting vertices, set F ⊆ V of accepting vertices, text T, and position k.

1: Let D′_1 := Q_0.
2: for i = 1, 2, 3, . . . , p do
3:   Let D_i := {x | x ∈ D′_i, σ(x) = T_{k+i−1}}.
4:   if D_i = ∅ then finish.
5:   if D_i ∩ F ≠ ∅ then we have found a match and finish.
6:   Define the next iteration set D′_{i+1} as vertices which are successors of D_i, i.e., D′_{i+1} := {d ∈ V(P_P) | (v, d) ∈ E(P_P) for some v ∈ D_i}.

Definition 4. For a given Σ-labeled directed acyclic graph G = (V, E, σ), vertices s, e ∈ V and a directed path f = v_1, v_2, . . . , v_k from v_1 = s to v_k = e, we call S = σ(f) = σ(v_1)σ(v_2) . . . σ(v_k) ∈ Σ* a path string of f.

Using Graph Theoretic Model for Matching In this section we describe an algorithm called Basic Matching Algorithm (BMA) which can determine whether there is a swap match of pattern P in text T on a position k using a graph model (V, E, σ), i.e., a labeled directed graph.

To use BMA, the model has to satisfy the following conditions:

– it is a directed acyclic graph,
– V = V_1 ⊎ V_2 ⊎ · · · ⊎ V_p (we can divide vertices to columns) such that
– E = {(u, w) | u ∈ V_i, w ∈ V_{i+1}, 1 ≤ i < p} (edges lead to the next column), and
– Q_0 = V_1 are the starting vertices and F = V_p are the accepting vertices.

BMA is designed to run on any graph which satisfies these conditions. Since the P-graph and the T-graph satisfy these assumptions we can use BMA for P_P and T_T.

The algorithm runs as follows (see also Algorithm 1). We initialize the algorithm by setting D′_1 := Q_0 (Step 1). D′_1 now holds information about vertices which are the end of some path f starting in Q_0 for which σ(f) possibly prefix matches 1 symbol of T_{[k,k+p−1]}. To make sure that the path f represents a prefix match we need to check whether the label of the last vertex of the path f matches the symbol T_k (Step 3). If no prefix match is left we did not find a match (Step 4). If some prefix match is left we need to check whether we already have a complete match (Step 5). If the algorithm did not stop it means that we have some prefix match but it is not a complete match yet. Therefore we can try to extend this prefix match by one symbol (Step 6) and check whether it is a valid prefix match (Step 3). Since we extend the matched prefix in each step, we repeat these steps until the prefix match is as long as the pattern (Step 2).

Having vertices in sets is not handy for computing, so we present another way to describe this algorithm. We use their characteristic vectors instead.

Definition 5. A Boolean labeling function I : V → {0, 1} of vertices of P_P is called a prefix match signal.


Algorithm 2 BMA in terms of prefix match signals

1: Let I^0(v) := 1 for each v ∈ Q_0 and I^0(v) := 0 for each v ∉ Q_0.
2: for i = 0, 1, 2, 3, . . . , p − 1 do
3:   Filter signals by a symbol T_{k+i}.
4:   if I(v) = 0 for every v ∈ P_P then finish.
5:   if I(v) = 1 for any v ∈ F then we have found a match and finish.
6:   Propagate signals along the edges.

The algorithm can be easily divided into iterations according to the value of i in Step 2. We denote the value of the prefix match signal in the j-th iteration as I^j and we define the following operations:

– propagate signal along the edges is an operation which sets I^j(v) := 1 if and only if there exists an edge (u, v) ∈ E with I^{j−1}(u) = 1,

– filter signal by a symbol x ∈ Σ is an operation which sets I(v) := 0 for each v where σ(v) ≠ x,

– match check is an operation which checks whether there exists v ∈ F such that I(v) = 1 and if so reports a match.

With these definitions in hand we can describe BMA in terms of prefix match signals as Algorithm 2. See Fig. 2 for an example of the use of BMA to figure out whether P = acbab swap matches T = babcabc at a position 2.

Fig. 2. BMA of T_{[2,6]} = abcab on a P-graph of the pattern P = acbab. The prefix match signal propagates along the dashed edges. Index j above a vertex v represents that I^j(v) = 1, otherwise I^j(v) = 0.
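The following Python sketch (ours, not part of the paper; the helper names are illustrative) builds the P-graph exactly as defined in Section 2 and runs Algorithm 1 on it with explicit vertex sets, reproducing the Fig. 2 example. Vertices are pairs (r, c) standing for m_{r,c}.

```python
def build_p_graph(pattern):
    """P-graph of Section 2: vertices m_{r,c} labeled P_{r+c}, edges between
    consecutive columns, minus the invalid vertices m_{-1,1} and m_{1,p}."""
    p = len(pattern)
    V = {(r, c) for r in (-1, 0, 1) for c in range(1, p + 1)} - {(-1, 1), (1, p)}
    lab = {(r, c): pattern[r + c - 1] for (r, c) in V}       # pattern indexed as P_1..P_p
    E = set()
    for j in range(1, p):
        E |= {((k, j), (i, j + 1)) for k in (-1, 0) for i in (0, 1)}
        E.add(((1, j), (-1, j + 1)))
    E = {(u, v) for (u, v) in E if u in V and v in V}
    assert len(V) == 3 * p - 2 and len(E) == 5 * (p - 1) - 4
    return V, E, lab


def bma(V, E, lab, Q0, F, text, k):
    """Basic matching algorithm (Algorithm 1) on a column-structured DAG."""
    p = max(c for (_, c) in V)
    D = set(Q0)                                              # D'_1
    for i in range(1, p + 1):
        D = {v for v in D if lab[v] == text[k + i - 2]}      # filter by T_{k+i-1} (Step 3)
        if not D:
            return False                                     # Step 4
        if D & F:
            return True                                      # Step 5
        D = {w for (v, w) in E if v in D}                    # propagate (Step 6)
    return False


if __name__ == "__main__":
    P, T = "acbab", "babcabc"
    V, E, lab = build_p_graph(P)
    Q0 = {v for v in V if v[1] == 1}                         # starting vertices (column 1)
    F = {v for v in V if v[1] == len(P)}                     # accepting vertices (column p)
    print(bma(V, E, lab, Q0, F, T, 2))                       # True: swap match at position 2
```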

3 Smalgo Algorithms and Why They Cannot Work

In this section we discuss how the algorithms Smalgo-I and Smalgo-II introduced by Ahmed et al. [1] work, describe their flaw, and show inputs which cause false positives. However, first we describe the Shift-And algorithm for the standard pattern matching (without swaps) which forms the basis of both algorithms.


Shift-And Algorithm The following description is based on [10, Chapter 5] describing the Shift-Or algorithm.

For a pattern P and a text T of length p and t, respectively, let R be a bit array of size p and R^j its value after text symbol T_j has been processed. It contains information about all matches of prefixes of P that end at the position j in the text. For 1 ≤ i ≤ p, R^j_i = 1 if P_{[1,i]} = T_{[j−i+1,j]} and 0 otherwise. The vector R^{j+1} can be computed from R^j as follows. For each positive i we have R^{j+1}_{i+1} = 1 if R^j_i = 1 and P_{i+1} = T_{j+1}, and R^{j+1}_{i+1} = 0 otherwise. Furthermore, R^{j+1}_1 = 1 if P_1 = T_{j+1} and 0 otherwise. If R^{j+1}_p = 1 then a complete match can be reported.

The transition from R^j to R^{j+1} can be computed very fast as follows. For each x ∈ Σ let D^x be a bit array of size p such that for 1 ≤ i ≤ p, D^x_i = 1 if and only if P_i = x. The array D^x denotes the positions of the symbol x in the pattern P. Each D^x can be preprocessed before the search. The computation of R^{j+1} is then reduced to three bitwise operations, namely R^{j+1} = (LShift(R^j) | 1) & D^{T_{j+1}}. When R^j_p = 1, the algorithm reports a match on a position j − p + 1.
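A minimal Python rendering of the Shift-And algorithm just described (our sketch, not taken from [10]): an unbounded Python integer plays the role of the bit array R, bit i − 1 corresponds to R_i, and LShift is the << operator.

```python
def shift_and(pattern, text):
    """Plain Shift-And: all 1-based starting positions of exact matches.
    Bit i-1 of R corresponds to R_i (a prefix of length i matched)."""
    p = len(pattern)
    D = {}                                   # D^x masks: bit i-1 set iff P_i = x
    for i, x in enumerate(pattern):
        D[x] = D.get(x, 0) | (1 << i)
    matches = []
    R = 0
    for j, t in enumerate(text, start=1):    # j is the 1-based text position
        R = ((R << 1) | 1) & D.get(t, 0)     # R^j from R^{j-1}
        if R & (1 << (p - 1)):               # R^j_p = 1
            matches.append(j - p + 1)
    return matches


if __name__ == "__main__":
    print(shift_and("abab", "aababab"))      # [2, 4]
```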

Before we show how Smalgo-I works, we need one more definition.

Definition 6. A degenerate symbol w over an alphabet Σ is a nonempty set of symbols from alphabet Σ. A degenerate string S is a string built over an alphabet of degenerate symbols. We say that a degenerate string P̃ matches a text T at a position j if T_{j+i−1} ∈ P̃_i for every 1 ≤ i ≤ p.

Smalgo-I The Smalgo-I algorithm [1] is a modification of the Shift-And algorithm for Swap Matching. The algorithm uses the graph theoretic model introduced in Section 2.

First let P̃ = [P_1, P_2] . . . [P_{x−1}, P_x, P_{x+1}] . . . [P_{p−1}, P_p] be a degenerate version of pattern P. The symbol in P̃ on position i represents the set of symbols of P which can swap to that position. To accommodate the Shift-And algorithm to match degenerate patterns we need to change the way the D^x masks are defined. For each x ∈ Σ let D^x be the bit array of size p such that for 1 ≤ i ≤ p, D^x_i = 1 if and only if x ∈ P̃_i.

While a match of the degenerate pattern P̃ is a necessary condition for a swap match of P, it is clearly not sufficient. The way the Smalgo algorithms try to fix this is by introducing the P-mask P(x_1, x_2, x_3), which is defined as P(x_1, x_2, x_3)_i = 1 if i = 1 or if there exist vertices u_1, u_2, and u_3 and edges (u_1, u_2), (u_2, u_3) in P_P for which u_2 = m_{r,i} for some r ∈ {−1, 0, 1} and σ(u_n) = x_n for 1 ≤ n ≤ 3, and P(x_1, x_2, x_3)_i = 0 otherwise. One P-mask called P(x, x, x) is used to represent the P-masks for triples (x_1, x_2, x_3) which only contain 1 in the first column.

Now, whenever checking whether P prefix swap matches T k + 1 symbols at position j, we check for a match of P̃ in T and we also check whether P(T_{j+k−1}, T_{j+k}, T_{j+k+1})_{k+1} = 1. This ensures that the symbols are able to swap to the respective positions and that those three symbols of the text T are present in some π(P).


With the P-masks completed we initialize R^1 = 1 & D^{T_1}. Then for every j = 1 to t we repeat the following. We compute R^{j+1} as R^{j+1} = LSO(R^j) & D^{T_{j+1}} & RShift(D^{T_{j+2}}) & P(T_j, T_{j+1}, T_{j+2}), where LSO is defined as LSO(x) = LShift(x) | 1. To check whether or not a swap match occurred we check whether R^j_{p−1} = 1. This is claimed to be sufficient because during the processing we are in fact considering not only the next symbol T_{j+1} but also the symbol T_{j+2}.

Smalgo-II The Smalgo-II algorithm is a derivative of the Smalgo-I algorithm. It improves the space complexity from O(⌈p/w⌉(p + |Σ|³)) to O(⌈p/w⌉(p + |Σ|²)) at the cost of a more complex algorithm. The P-masks in Smalgo-I take O(⌈p/w⌉|Σ|³) space and are the main part of its space complexity. This can be improved by making the P-masks hold information about paths of length 2 instead of 3. The change makes the P-masks take only O(⌈p/w⌉|Σ|²) space.

In order to compensate for the less powerful P-masks, the algorithm introduces two new types of masks and further checks to filter out (some) invalid matches.

To explain the algorithm in more detail, we first introduce a notion of change. An upward change corresponds to (the BMA) going to vertex m_{−1,i} for some i, a downward change corresponds to going to vertex m_{+1,i}, and a middle-ward change corresponds to going to vertex m_{0,i}.

If a downward change has occurred, then we have to check whether an upward change occurs at the next position. If an upward change has occurred, then we have to check whether a downward or middle-ward change occurs at the next position. The main problem here is how to tell whether the changes actually occur.

To this end, the authors of the algorithm introduce three new types of masks, namely up-masks up(x, y), down-masks down(x, y), and middle-masks middle(x, y), which express whether an upward, a downward, and a middle-ward change can occur at the particular position, respectively, with the endpoints of the edge having labels x and y.

The authors of the algorithm now claim that to perform the above checks, it is enough to save the previous down-mask and match its value with the current up-mask and R^j, or to save the previous up-mask and match its value with the current down-mask, middle-mask, and R^j, respectively. However, this way in both cases we only check whether the change can occur, not whether it actually occurred. This would lead not only to false positives (as shown in Section 3), but also to false negatives.

Unfortunately, no more details are available about the algorithm in the original paper. The pseudocode of Smalgo-II (which contains numerous errors) performs something different and we include its analysis in the appendix for completeness. Nevertheless, the example presented in Section 3 still makes the pseudocode (with the small errors corrected) report a false positive.


Table 1. D-masks and P-masks for P = abab. A column xyz contains the values P(x, y, z)_i.

i  P̃_i   D^a_i  D^b_i  aaa  aab  aba  baa  abb  bab  bba  bbb
1  [ab]   1      1      1    1    1    1    1    1    1    1
2  [ba]   1      1      0    1    1    1    1    1    0    0
3  [ab]   1      1      0    1    1    0    1    1    1    0
4  [ba]   1      1      0    0    0    0    0    0    0    0

Table 2. Smalgo-I algorithm execution for P = abab and T = aaba. The column RD^x denotes the values of RShift(D^x).

i  R^1  LSO(R^1)  D^a  RD^b  P(a,a,b)  R^2  LSO(R^2)  D^b  RD^a  P(a,b,a)  R^3
1  1    1         1    1     1         1    1         1    1     1         1
2  0    1         1    1     1         1    1         1    1     1         1
3  0    0         1    1     1         0    1         1    1     1         1
4  0    0         1    0     0         0    0         1    0     0         0

The Flaw in the Smalgo Algorithms We shall see that for a pattern P = abab and a text T = aaba both algorithms Smalgo-I and Smalgo-II give false positives.

The concept of Smalgo-I and Smalgo-II is based on the assumption that we can find a path in P_P by searching for consecutive paths of length 3 (triplets), where each two consecutive triplets share two columns and can partially overlap. However, this only works if the consecutive triplets actually share the two vertices in the common columns. If the assumption is not true then the found substring of the text might not match any swap version of P.

The above input gives such a configuration and therefore the assumption is false. In Tables 1 and 2 we can see the step by step execution of the Smalgo-I algorithm on the pattern P = abab and the text T = aaba. In Table 2 we see that R^3 has 1 in the 3-rd row, which means that the algorithm reports a pattern match on a position 1. This is a false positive, because it is not possible to swap match the pattern, which contains two b symbols, against a part of the text containing only one b symbol.

The reason behind the false positive match is as follows. The algorithm checks whether the first triplet of symbols (a, a, b) matches. It can match the swap pattern aabb. Next it checks the second triplet of symbols (a, b, a), which can match baba. We know that baba is not possible since it did not appear in the previous check, but the algorithm cannot distinguish them since it checks only the triplets and nothing more. Since each step gave us a positive match, the algorithm reports a swap match of the pattern in the text.
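The false positive can also be reproduced mechanically. The sketch below (ours, not the authors' implementation; all helper names are illustrative) simulates the Smalgo-I recurrence R^{j+1} = LSO(R^j) & D^{T_{j+1}} & RShift(D^{T_{j+2}}) & P(T_j, T_{j+1}, T_{j+2}) with D-masks and P-masks derived from the P-graph, and cross-checks the reported position against a brute-force enumeration of swapped versions: on P = abab and T = aaba it reports position 1 although no swapped version of the pattern equals aaba.

```python
from collections import defaultdict


def build_p_graph(pattern):
    """P-graph of Section 2: vertices m_{r,c} labeled P_{r+c}."""
    p = len(pattern)
    V = {(r, c) for r in (-1, 0, 1) for c in range(1, p + 1)} - {(-1, 1), (1, p)}
    lab = {(r, c): pattern[r + c - 1] for (r, c) in V}
    E = set()
    for j in range(1, p):
        E |= {((k, j), (i, j + 1)) for k in (-1, 0) for i in (0, 1)}
        E.add(((1, j), (-1, j + 1)))
    return V, {(u, v) for (u, v) in E if u in V and v in V}, lab


def smalgo_i(pattern, text):
    """Smalgo-I as described above; returns the 1-based positions it reports."""
    p, t = len(pattern), len(text)
    V, E, lab = build_p_graph(pattern)
    D = defaultdict(int)                       # degenerate D-masks, bit c-1 <-> index c
    for (r, c) in V:
        D[lab[(r, c)]] |= 1 << (c - 1)
    succ = defaultdict(list)
    for (u, v) in E:
        succ[u].append(v)
    pm = defaultdict(int)                      # P-masks for symbol triples
    for u1 in V:
        for u2 in succ[u1]:
            for u3 in succ[u2]:
                pm[(lab[u1], lab[u2], lab[u3])] |= 1 << (u2[1] - 1)
    reports = []
    R = 1 & D[text[0]]                         # R^1
    for j in range(1, t - 1):                  # 1-based j = 1 .. t-2
        trip = (text[j - 1], text[j], text[j + 1])
        R = ((R << 1) | 1) & D[text[j]] & (D[text[j + 1]] >> 1) & (pm[trip] | 1)
        if R & (1 << (p - 2)):                 # R_{p-1} = 1 -> Smalgo-I reports a match
            reports.append(j + 3 - p)          # it would start at position j - p + 3
    return reports


def swapped_versions(s):
    """Brute-force set of all swapped versions of s (Definition 1)."""
    out = set()

    def rec(i, acc):
        if i == len(s):
            out.add("".join(acc))
            return
        rec(i + 1, acc + [s[i]])
        if i + 1 < len(s) and s[i] != s[i + 1]:
            rec(i + 2, acc + [s[i + 1], s[i]])

    rec(0, [])
    return out


if __name__ == "__main__":
    P, T = "abab", "aaba"
    print(smalgo_i(P, T))                        # [1] -- a swap match is reported
    print("aaba" in swapped_versions(P))         # False -- so it is a false positive
```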

In Fig. 3 we see the two triplets which Smalgo-I assumes have two vertices in common. Even though Smalgo-II checks level changes, it does so just to simulate how Smalgo-I works. Therefore the flaw is also contained in Smalgo-II.


Fig. 3. Smalgo-I flaw represented in the P-graph for P = abab.

4 Our Algorithm

In this section we will show an algorithm which solves Swap Matching and prove the correctness of the new algorithm. We call the algorithm GSM (Graph Swap Matching). GSM uses the graph theoretic model presented in Section 2 and is also based on the Shift-And algorithm from Section 3.

The basic idea of the GSM algorithm is to represent the prefix match signals (see Definition 5) from the basic matching algorithm (Section 2) over P_P in bit vectors. The GSM algorithm represents all signals I in the bitmaps RX formed by three vectors, one for each row. Each time GSM processes a symbol of T, it first propagates the signal along the edges, then filters the signal and finally checks for matches. All these operations can be done very quickly thanks to bitwise parallelism.

First, we make the concept of GSM more familiar by presenting a way to interpret the Shift-And algorithm by means of the T-graph model and the basic matching algorithm (BMA) from Section 2 to solve the (ordinary) Pattern Matching problem. Then we expand this idea to Swap Matching by using the graph theoretic model.

Graph Theoretic View of the Shift-And Algorithm Let T and P be a text and a pattern of lengths t and p, respectively. We create the T-graph T_P = (V, E, σ) of the pattern P. We know that the T-graph is a directed acyclic graph which can be divided into columns V_i, 1 ≤ i ≤ p (each of them containing one vertex v_i) such that the edges lead from V_j to V_{j+1}. This means that the T-graph satisfies all assumptions of BMA. We apply BMA to T_P to figure out whether P matches T at a position j. We get a correct result because for each i ∈ {1, . . . , p} we check whether T_{j+i−1} = σ(v_i) = P_i.

To find every occurrence of P in T we would have to run BMA for each position separately. This is basically the naive approach to solve the pattern matching. We can improve the algorithm significantly when we parallelize the computations of p runs of BMA in the following way.

The algorithm processes one symbol at a time starting from T_1. We say that the algorithm is in the j-th step when a symbol T_j has been processed. BMA represents a prefix match as a prefix match signal I : V → {0, 1}. Its value in the j-th step is denoted I^j. Since one run of the BMA uses only one column of the T-graph at any time we can use other vertices to represent different runs of the BMA. We represent all prefix match indicators in one vector so that we can manipulate them easily. To do that we prepare a bit vector R. Its value in the j-th step is denoted R^j and defined as R^j_i = I^j(v_i).

The first operation which is used in BMA (propagate signal along the edges) can be done easily by setting the signal of v_i to the value of the signal of its predecessor v_{i−1} in the previous step. I.e., for i ∈ {1, . . . , p} we set I^j(v_i) = 1 if i = 1 and I^j(v_i) = I^{j−1}(v_{i−1}) otherwise. In terms of R^j this means just R^j = LSO(R^{j−1}).

We also need a way to set I(v_i) := 0 for each v_i for which σ(v_i) ≠ T_j, which is another basic BMA operation (filter signal by a symbol). We can do this using the bit vector D^x from Section 3 and taking R & D^x. I.e., the algorithm computes R^j as R^j = LSO(R^{j−1}) & D^{T_j}.

The last BMA operation we have to define is the match detection. We do this by checking whether R^j_p = 1 and if this is the case then a match starting at position j − p + 1 occurred.

Our Algorithm for Swap Matching Using the Graph Theoretic Model Now we are ready to describe the GSM algorithm.

We again let P_P = (V, E, σ) be the P-graph of the pattern P, apply BMA to P_P to figure out whether P matches T at a position j, and parallelize p runs of BMA on P_P.

Again, the algorithm processes one symbol at a time and it is in the j-th step when a symbol T_j is being processed. We again denote the value of the prefix match signal I : V → {0, 1} of BMA in the j-th step by I^j. I.e., the semantic meaning of I^j(m_{r,c}) is that I^j(m_{r,c}) = 1 if there exists a swap permutation π such that π(c) = c + r and π(P)_{[1,c]} = T_{[j−c+1,j]}. Otherwise I^j(m_{r,c}) is 0.

We want to represent all prefix match indicators in vectors so that we can manipulate them easily. We can do this by mapping the values of I for rows r ∈ {−1, 0, 1} of the P-graph to vectors RU, RM, and RD, respectively. We denote the value of the vector RX ∈ {RU, RM, RD} in the j-th step as RX^j. We define the values of the vectors as RU^j_i = I^j(m_{−1,i}), RM^j_i = I^j(m_{0,i}), and RD^j_i = I^j(m_{1,i}), where the value of I^j(v) = 0 for every v ∉ V.

We define the BMA propagate signal along the edges as setting the signal of m_{r,c} to 1 if at least one of its predecessors has its signal set to 1. I.e., we set I^{j+1}(m_{−1,i}) := I^j(m_{1,i−1}), I^{j+1}(m_{0,i}) := I^j(m_{−1,i−1}) | I^j(m_{0,i−1}), I^{j+1}(m_{0,1}) := 1, I^{j+1}(m_{1,i}) := I^j(m_{−1,i−1}) | I^j(m_{0,i−1}), and I^{j+1}(m_{1,1}) := 1. We can perform the above operation using the operation LSO(R). We obtain the operation propagate signal along the edges for our algorithm in the form RU′^{j+1} := LSO(RD^j), RM′^{j+1} := LSO(RM^j | RU^j), and RD′^{j+1} := LSO(RM^j | RU^j).

The operation filter signal by a symbol can be done by first constructing a bit vector D^x for each x ∈ Σ as D^x_i = 1 if x = P_i and D^x_i = 0 otherwise, and letting DU^x = LShift(D^x), DM^x = D^x, and DD^x = RShift(D^x). Then we use these vectors to filter signal by a symbol x by taking RU^j := RU′^j & LShift(D^{T_j}), RM^j := RM′^j & D^{T_j}, and RD^j := RD′^j & RShift(D^{T_j}).


Algorithm 3 The graph swap matching (GSM).

Input: Pattern P of length p and text T of length t over alphabet Σ.
Output: Positions of all swap matches.

1: Let RU^0 := RM^0 := RD^0 := 0^p.
2: Let D^x := 0^p, for all x ∈ Σ.
3: for i = 1, 2, 3, . . . , p do D^{P_i}_i := 1
4: for j = 1, 2, 3, . . . , t do
5:   RU′^j := LSO(RD^{j−1}), RM′^j := LSO(RM^{j−1} | RU^{j−1})
6:   RD′^j := LSO(RM^{j−1} | RU^{j−1}).
7:   RU^j := RU′^j & LShift(D^{T_j}), RM^j := RM′^j & D^{T_j}, RD^j := RD′^j & RShift(D^{T_j}).
8:   if RU^j_p = 1 or RM^j_p = 1 then report a match on position j − p + 1.

The last operation we define is the match detection. We do this by checking whether RU^j_p = 1 or RM^j_p = 1 and if this is the case, then a match starting at a position j − p + 1 occurred.

The final GSM algorithm (Algorithm 3) first prepares the D-masks D^x for every x ∈ Σ and initializes RU^0 := RM^0 := RD^0 := 0 (Steps 1–3). Then the algorithm computes the value of vectors RU^j, RM^j, and RD^j for j ∈ {1, . . . , t} by first using the above formula for signal propagation (Steps 5 and 6) and then the formula for signal filtering (Step 7) and checks whether RU^j_p = 1 or RM^j_p = 1 and if this is the case the algorithm reports a match (Step 8).
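A compact Python rendering of Algorithm 3 follows (our sketch, not the authors' reference implementation; the function name is illustrative). Python integers stand in for the machine words holding RU, RM and RD, bit i − 1 corresponds to index i, LSO(x) is (x << 1) | 1, and LShift/RShift are shifts by one.

```python
def gsm(pattern, text):
    """Graph Swap Matching (Algorithm 3): 1-based starting positions of swap matches."""
    p = len(pattern)
    D = {}                                   # D-masks (Steps 2-3): bit i-1 set iff P_i = x
    for i, x in enumerate(pattern):
        D[x] = D.get(x, 0) | (1 << i)

    def lso(x):                              # LSO(x) = LShift(x) | 1
        return (x << 1) | 1

    top = 1 << (p - 1)                       # bit of index p, used for the match check
    RU = RM = RD = 0                         # Step 1
    matches = []
    for j, t in enumerate(text, start=1):    # Step 4: process T_j
        d = D.get(t, 0)
        RU, RM, RD = (lso(RD) & (d << 1),    # Steps 5-7: propagate, then filter by T_j
                      lso(RM | RU) & d,
                      lso(RM | RU) & (d >> 1))
        if (RU | RM) & top:                  # Step 8
            matches.append(j - p + 1)
    return matches


if __name__ == "__main__":
    print(gsm("acbab", "babcabc"))   # [2]  (the Fig. 2 example)
    print(gsm("abab", "aaba"))       # []   (no false positive on the Section 3 input)
```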

Correctness of Our Algorithm To ease the notation let us define R^j(m_{r,c}) to be RU^j_c if r = −1, RM^j_c if r = 0, and RD^j_c if r = 1. We define R′^j(m_{r,c}) analogously. Similarly, we define D^x(m_{r,c}) as (LShift(D^x))_c = D^x_{c−1} if r = −1, D^x_c if r = 0, and (RShift(D^x))_c = D^x_{c+1} if r = 1. By the way the masks D^x are computed on lines 2 and 3 of Algorithm 3, we get the following observation.

Observation 1 For every m_{r,i} ∈ V and every j ∈ {i, . . . , t} we have D^{T_j}(m_{r,i}) = 1 if and only if T_j = P_{r+i}.

Lemma 1. For every m_{r,i} ∈ V and every j ∈ {i, . . . , t}, if there exists a swap permutation π such that π(P)_{[1,i]} = T_{[j−i+1,j]} and π(i) = i + r, then R^j(m_{r,i}) = 1.

Proof. We prove the claim by induction on i. If i = 1 and there is a swap permutation π such that π(1) = 1 + r and P_{1+r} = T_j, then the algorithm sets R′^j(m_{r,1}) to 1 on line 5 or 6 (recall the definition of LSO). As P_{1+r} = T_j, we have D^{T_j}(m_{r,1}) = 1 by Observation 1 and, therefore, by line 7, also R^j(m_{r,1}) = 1.

Now assume that i > 1 and that the claim is true for every smaller i. Assume that there exists a swap permutation π such that π(P)_{[1,i]} = T_{[j−i+1,j]} and π(i) = i + r. By the induction hypothesis we have that R^{j−1}(m_{r′,i−1}) = 1, where r′ = π(i − 1) − (i − 1). Since r equals −1 if and only if r′ equals +1 by Definition 1, we have (r, r′) ∈ {(−1, 1), (0, −1), (0, 0), (1, −1), (1, 0)}. Therefore the algorithm sets R′^j(m_{r,i}) to 1 on line 5 or 6. Moreover, since P_{i+r} = T_j, we have D^{T_j}(m_{r,i}) = 1 by Observation 1 and the algorithm sets R^j(m_{r,i}) to 1 on line 7.

Lemma 2. For every m_{r,i} ∈ V and every j ∈ {i, . . . , t}, if R^j(m_{r,i}) = 1, then there exists a swap permutation π for P such that π(P)_{[1,i]} = T_{[j−i+1,j]} and π(i) = i + r.

Proof. We prove the claim by induction on i. If i = 1 and R^j(m_{r,i}) = 1, then we must have D^{T_j}(m_{r,1}) = 1 and, by Observation 1, also P_{1+r} = T_j. We obtain π by setting π(1) = 1 + r, π(2) = 2 − r and π(i′) = i′ for every i′ ∈ {3, . . . , p}. It is easy to verify that this is a swap permutation for P and has the desired properties.

Now assume that i > 1 and that the claim is true for every smaller i. Assume that R^j(m_{r,i}) = 1. Then, due to line 7 we must have D^{T_j}(m_{r,i}) = 1 and, hence, by Observation 1, also P_{i+r} = T_j. Moreover, we must have R′^j(m_{r,i}) = 1 and, hence, by lines 5 and 6 of the algorithm also R^{j−1}(m_{r′,i−1}) = 1 for some r′ with (r, r′) ∈ {(−1, 1), (0, −1), (0, 0), (1, −1), (1, 0)}. By the induction hypothesis there exists a swap permutation π′ for P such that π′(P)_{[1,i−1]} = T_{[j−i+1,j−1]} and π′(i − 1) = i − 1 + r′. If π′(i) = i + r, then setting π = π′ finishes the proof. Otherwise we have either r = 0, or r = 1 and i < p. In the former case we let π(i′) = i′ for every i′ ∈ {i, . . . , p} and in the latter case we let π(i) = i + 1, π(i + 1) = i and π(i′) = i′ for every i′ ∈ {i + 2, . . . , p}. In both cases we let π(i′) = π′(i′) for every i′ ∈ {1, . . . , i − 1}. It is again easy to verify that π is a swap permutation for P with the desired properties.

Theorem 2. The GSM algorithm is correct.

Proof. By Lemma 1 if there is a swap match of P on position j − p + 1 in T, then R^j(m_{−1,p}) = 1 or R^j(m_{0,p}) = 1. In either case the algorithm reports a match on position j − p + 1.

On the other hand, if the algorithm reports a match on position j − p + 1, then R^j(m_{−1,p}) = 1 or R^j(m_{0,p}) = 1. But then, by Lemma 2 there is a swap match of P on position j − p + 1 in T.

It remains to express the complexity of the algorithm.

Theorem 3. The GSM algorithm runs in O(⌈p/w⌉(|Σ| + t) + p) time and uses O(⌈p/w⌉|Σ|) memory cells (not counting the input and output cells), where t is the length of the input text, p the length of the input pattern, w the word-size of the machine, and |Σ| the size of the alphabet.²

Proof (of Theorem 3). The initialization of the RX and D^x masks (lines 1 and 2) takes O(⌈p/w⌉|Σ|) time. The bits in the D^x masks are set according to the pattern in O(p) time (line 3). The main cycle of the algorithm (lines 4–8) makes t iterations. Each iteration consists of computing the values of RX in 13 bitwise operations, i.e., in O(⌈p/w⌉) machine operations, and checking for the result in O(1) time. This gives O(⌈p/w⌉(|Σ| + t) + p) time in total. The algorithm saves 3 RX masks (using the same space for all j and also for the RX′ masks), |Σ| D^x masks, and a constant number of variables for other uses (iteration counters, temporary variable, etc.). Thus, in total the GSM algorithm needs O(⌈p/w⌉|Σ|) memory cells.

² To simplify the analysis, we assume that log t < w, i.e., the iteration counter fits into one memory cell.

Corollary 1. If p = cw for some constant c, then the GSM algorithm runs in O(|Σ| + p + t) time and has O(|Σ|) space complexity. Moreover, if p ≤ w, then the GSM algorithm can be implemented using only 7 + |Σ| memory cells.

Proof (of Corollary 1). The first part follows directly from Theorem 3. Let us show the second part. We need |Σ| cells for all D-masks, 3 cells for the R vectors (reusing the space also for the R′ vectors), one pointer to the text, one iteration counter, one constant for the match check and one temporary variable for the computation of the more complex parts of the algorithm. Altogether, we need only 7 + |Σ| memory cells to run the GSM algorithm.

From the space complexity analysis we see that for some sufficiently small alphabets (e.g. DNA sequences) the GSM algorithm can be implemented in practice using solely CPU registers, with the exception of the text which has to be loaded from the RAM.

5 Limitations of the Finite Deterministic Automata Approach

It is easy to construct a non-deterministic finite automaton that solves Swap Matching. An alternative approach would thus be to determinize and execute this automaton. The drawback is that the determinization process may lead to an exponential number of states. We show that in some cases it actually does, contradicting the conjecture of Holub [15], stating that the number of states of this determinized automaton is O(p).

Theorem 4. There is an infinite family F of patterns such that any deterministic finite automaton A_P accepting the language L_S(P) = {uπ(P) | u ∈ Σ*, π is a swap permutation for P} for P ∈ F has 2^{Ω(|P|)} states.

Table 3. An example of the construction from the proof of Theorem 4 for k = 2. Suppose we have i = 0 and j = 1. Then T_i = acabcabc, T_j = acabcbac, T′_i = acabcabcabcabc, T′_j = acabcbacabcabc and the suffixes are X = bcabcabc and Y = acabcabc.

P = T_0   acabcabc
T_1       acabcbac
T_2       acbacabc
T_3       acbacbac


Proof. For any integer k we define the pattern P_k := ac(abc)^k. Note that the length of P_k is Θ(k). Suppose that the automaton A_P recognizing the language L(P) has s states such that s < 2^k. We define a set of strings T_0, . . . , T_{2^k−1} where T_i is defined as follows. Let b^i_{k−1} b^i_{k−2} . . . b^i_0 be the binary representation of the number i. Let B^i_j = abc if b^i_j = 0 and let B^i_j = bac if b^i_j = 1. Then, let T_i := ac B^i_{k−1} B^i_{k−2} . . . B^i_0. See Table 3 for an example. Since s < 2^k, there exist 0 ≤ i < j ≤ 2^k − 1 such that both T_i and T_j are accepted by the same accepting state q of the automaton A. Let m be the minimum number such that b^i_{k−1−m} ≠ b^j_{k−1−m}. Note that b^i_{k−1−m} = 0 and b^j_{k−1−m} = 1. Now we define T′_i = T_i(abc)^{m+1} and T′_j = T_j(abc)^{m+1}. Let X = (T′_i)_{[3(m+1)+1, 3(m+1+k)+2]} and Y = (T′_j)_{[3(m+1)+1, 3(m+1+k)+2]} be the suffixes of the strings T′_i and T′_j, both of length 3k + 2. Note that X begins with bc . . . and Y begins with ac . . . and that the block abc or bac repeats k times in both. Therefore the pattern P swap matches Y and does not swap match X. Since for the last symbol of both T_i and T_j the automaton is in the same state q, the computation for T′_i and T′_j must end in the same state q′. However, as X should not be accepted and Y should be accepted, we obtain a contradiction with the correctness of the automaton A. Hence, we may define the family F as F = {P_1, P_2, . . . }, concluding the proof.
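The construction from the proof is easy to reproduce. The following Python sketch (ours; the helper names are illustrative) generates P_k and the strings T_i, builds T′_i, T′_j, X and Y for the Table 3 instance, and confirms by brute-force enumeration of swapped versions that P_k swap matches Y but not X.

```python
def swapped_versions(s):
    """All swapped versions of s: adjacent, disjoint swaps of distinct symbols."""
    out = set()

    def rec(i, acc):
        if i == len(s):
            out.add("".join(acc))
            return
        rec(i + 1, acc + [s[i]])
        if i + 1 < len(s) and s[i] != s[i + 1]:
            rec(i + 2, acc + [s[i + 1], s[i]])

    rec(0, [])
    return out


def pattern_k(k):
    return "ac" + "abc" * k                       # P_k = ac(abc)^k


def t_string(i, k):
    """T_i = ac B^i_{k-1} ... B^i_0 with B = abc for bit 0 and bac for bit 1."""
    bits = format(i, "0{}b".format(k))
    return "ac" + "".join("bac" if b == "1" else "abc" for b in bits)


if __name__ == "__main__":
    k, i, j = 2, 0, 1
    P = pattern_k(k)
    Ti, Tj = t_string(i, k), t_string(j, k)
    m = next(m for m in range(k)                  # first differing bit from the left
             if format(i, "0{}b".format(k))[m] != format(j, "0{}b".format(k))[m])
    Tpi, Tpj = Ti + "abc" * (m + 1), Tj + "abc" * (m + 1)
    X = Tpi[3 * (m + 1):3 * (m + 1 + k) + 2]      # suffix of length 3k + 2
    Y = Tpj[3 * (m + 1):3 * (m + 1 + k) + 2]
    print(P, Ti, Tj, X, Y)                        # acabcabc acabcabc acabcbac bcabcabc acabcabc
    print(Y in swapped_versions(P), X in swapped_versions(P))   # True False
```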

References

1. Pritom Ahmed, Costas S. Iliopoulos, A.S.M. Sohidull Islam, and M. Sohel Rahman. The swap matching problem revisited. Theoretical Computer Science, 557:34–49, 2014.
2. Amihood Amir, Yonatan Aumann, Gad M. Landau, Moshe Lewenstein, and Noa Lewenstein. Pattern matching with swaps. Journal of Algorithms, 37(2):247–266, 2000.
3. Amihood Amir, Richard Cole, Ramesh Hariharan, Moshe Lewenstein, and Ely Porat. Overlap matching. Information and Computation, 181(1):57–74, 2003.
4. Amihood Amir, Estrella Eisenberg, and Ely Porat. Swap and mismatch edit distance. Algorithmica, 45(1):109–120, 2006.
5. Amihood Amir, Gad M. Landau, Moshe Lewenstein, and Noa Lewenstein. Efficient special cases of pattern matching with swaps. Information Processing Letters, 68(3):125–132, 1998.
6. Amihood Amir, Moshe Lewenstein, and Ely Porat. Approximate swapped matching. Information Processing Letters, 83(1):33–39, 2002.
7. Pavlos Antoniou, Costas S. Iliopoulos, Inuka Jayasekera, and M. Sohel Rahman. Implementation of a swap matching algorithm using a graph theoretic model. In Bioinformatics Research and Development, BIRD 2008, volume 13 of CCIS, pages 446–455. Springer, 2008.
8. Matteo Campanelli, Domenico Cantone, and Simone Faro. A new algorithm for efficient pattern matching with swaps. In IWOCA 2009, volume 5874 of LNCS, pages 230–241. Springer, 2009.
9. Domenico Cantone and Simone Faro. Pattern matching with swaps for short patterns in linear time. In SOFSEM 2009, volume 5404 of LNCS, pages 255–266. Springer, 2009.
10. Christian Charras and Thierry Lecroq. Handbook of Exact String Matching Algorithms. King's College Publications, 2004.
11. F.B. Chedid. On pattern matching with swaps. In AICCSA 2013, pages 1–5. IEEE, 2013.
12. Yair Dombb, Ohad Lipsky, Benny Porat, Ely Porat, and Asaf Tsur. The approximate swap and mismatch edit distance. Theoretical Computer Science, 411(43):3814–3822, 2010.
13. Simone Faro. Swap matching in strings by simulating reactive automata. In Proceedings of the Prague Stringology Conference 2013, pages 7–20. CTU in Prague, 2013.
14. Kimmo Fredriksson and Emanuele Giaquinta. On a compact encoding of the swap automaton. Information Processing Letters, 114(7):392–396, 2014.
15. Jan Holub. Personal communication, 2015.
16. Costas S. Iliopoulos and M. Sohel Rahman. A new model to solve the swap matching problem and efficient algorithms for short patterns. In SOFSEM 2008: Theory and Practice of Computer Science, volume 4910 of LNCS, pages 316–327. Springer, 2008.
17. Ohad Lipsky, Benny Porat, Ely Porat, B. Riva Shalom, and Asaf Tzur. String matching with up to k swaps and mismatches. Information and Computation, 208(9):1020–1030, 2010.
18. S. Muthukrishnan. New results and open problems related to non-standard stringology. In CPM 95, volume 937 of LNCS, pages 298–317. Springer, 1995.


A Appendix: Analysis of the Pseudocode of SMALGO-II

In this section we analyze the pseudocode of the SMALGO-II algorithm as given by Ahmed et al. in [1] in order to understand the meaning of the checks the pseudocode actually performs.

The original pseudocode is as follows.

Algorithm 4 SMALGO-II

Require: Text T, up-mask up, down-mask down, middle-mask middle, P-mask pmask, D-mask D for given pattern p

1: R_0 ← 2^{patternlength−1}
2: checkup ← checkdown ← 0
3: R_0 ← R_0 & D_{T_0}
4: R_1 ← R_0 >> 1
5: for j = 0 to (n − 2) do
6:   R_j ← R_j & pmask(T_j, T_{j+1}) & D_{T_{j+1}}
7:   temp ← prevcheckup >> 1
8:   checkup ← checkup | up(T_j, T_{j+1})
9:   checkup ← checkup & ∼down(T_j, T_{j+1}) & ∼middle(T_j, T_{j+1})
10:  prevcheckup ← checkup
11:  R_j ← ∼(temp & checkup) & R_j
12:  temp ← prevcheckdown >> 1
13:  checkdown ← checkdown | down(T_j, T_{j+1})
14:  checkdown ← checkdown & ∼up(T_j, T_{j+1})
15:  prevcheckdown ← checkdown
16:  R_j ← ∼(temp & checkdown) & R_j
17:  if (R_j & 1) = 1 then
18:    Match found ending at position (j − 1)
19:  R_{j+1} ← R_j >> 1
20:  checkup ← checkup >> 1
21:  checkdown ← checkdown >> 1

The pseudocode has several problems. First, in the first iteration of the cycle, the algorithm uses the value of the variable prevcheckup which was never initialized. Second, the algorithm never adds new ones to the variable R and, hence, can never report a match after position patternlength of the text. Third, if the text is of the same length as the pattern, the algorithm only applies the shift patternlength − 2 times to the original value of 2^{patternlength−1} (note that in the first iteration it uses R_0 and overwrites the value of R_1) before the last match check. Therefore, at the last check, the value could only drop to 2^{patternlength−1−patternlength+2} = 2^1 = 2 and the match check cannot be successful. Also the reported position of the match does not make much sense.

Let us first correct all these easy problems.


Algorithm 5 SMALGO-II

1: R_0 ← 2^{patternlength−1}
2: prevcheckup ← prevcheckdown ← checkup ← checkdown ← 0
3: R_0 ← R_0 & D_{T_0}
4: R_1 ← R_0 >> 1
5: for j = 0 to (n − 2) do
6:   R_{j+1} ← R_{j+1} & pmask(T_j, T_{j+1}) & D_{T_{j+1}}
7:   temp ← prevcheckup >> 1
8:   checkup ← checkup | up(T_j, T_{j+1})
9:   checkup ← checkup & ∼down(T_j, T_{j+1}) & ∼middle(T_j, T_{j+1})
10:  prevcheckup ← checkup
11:  R_{j+1} ← ∼(temp & checkup) & R_{j+1}
12:  temp ← prevcheckdown >> 1
13:  checkdown ← checkdown | down(T_j, T_{j+1})
14:  checkdown ← checkdown & ∼up(T_j, T_{j+1})
15:  prevcheckdown ← checkdown
16:  R_{j+1} ← ∼(temp & checkdown) & R_{j+1}
17:  if (R_{j+1} & 1) = 1 then
18:    Match found ending at position (j + 1)
19:  R_{j+2} ← (R_{j+1} >> 1) | 2^{patternlength−1}
20:  checkup ← checkup >> 1
21:  checkdown ← checkdown >> 1

If we now move the line setting prevcheckup to checkup after the line where the check with the temp variable is performed, and similarly with prevcheckdown, we do not need the temp variable anymore. We also move the shifts of checkup and checkdown closer to where these variables are used. We only show the important part of the algorithm.

Algorithm 6 SMALGO-II
. . .

5:   for j = 0 to (n − 2) do
6:   R_{j+1} ← R_{j+1} & pmask(T_j, T_{j+1}) & D_{T_{j+1}}
7:   checkup ← checkup | up(T_j, T_{j+1})
8:   checkup ← checkup & ∼down(T_j, T_{j+1}) & ∼middle(T_j, T_{j+1})
9:   R_{j+1} ← ∼(prevcheckup >> 1 & checkup) & R_{j+1}
10:  prevcheckup ← checkup
11:  checkup ← checkup >> 1
12:  checkdown ← checkdown | down(T_j, T_{j+1})
13:  checkdown ← checkdown & ∼up(T_j, T_{j+1})
14:  R_{j+1} ← ∼(prevcheckdown >> 1 & checkdown) & R_{j+1}
15:  prevcheckdown ← checkdown
16:  checkdown ← checkdown >> 1
17:  if (R_{j+1} & 1) = 1 then
18:    Match found ending at position (j + 1)
19:  R_{j+2} ← (R_{j+1} >> 1) | 2^{patternlength−1}


Now we swap the order of setting prevcheckup to checkup and the shift of checkup. As this makes prevcheckup shifted by one, we remove the additional shift in the check. Similarly for checkdown.

Algorithm 7 SMALGO-II
. . .

7:   checkup ← checkup | up(T_j, T_{j+1})
8:   checkup ← checkup & ∼down(T_j, T_{j+1}) & ∼middle(T_j, T_{j+1})
9:   R_{j+1} ← ∼(prevcheckup & checkup) & R_{j+1}
10:  checkup ← checkup >> 1
11:  prevcheckup ← checkup
12:  checkdown ← checkdown | down(T_j, T_{j+1})
13:  checkdown ← checkdown & ∼up(T_j, T_{j+1})
14:  R_{j+1} ← ∼(prevcheckdown & checkdown) & R_{j+1}
15:  checkdown ← checkdown >> 1
16:  prevcheckdown ← checkdown

. . .

Now we substitute checkup into the check and move its computation after the check.

Algorithm 8 SMALGO-II
. . .

6:   R_{j+1} ← R_{j+1} & pmask(T_j, T_{j+1}) & D_{T_{j+1}}
7:   R_{j+1} ← ∼(prevcheckup & (checkup | up(T_j, T_{j+1})) & ∼down(T_j, T_{j+1}) & ∼middle(T_j, T_{j+1})) & R_{j+1}
8:   checkup ← (checkup | up(T_j, T_{j+1})) & ∼down(T_j, T_{j+1}) & ∼middle(T_j, T_{j+1})
9:   checkup ← checkup >> 1
10:  prevcheckup ← checkup
11:  R_{j+1} ← ∼(prevcheckdown & (checkdown | down(T_j, T_{j+1})) & ∼up(T_j, T_{j+1})) & R_{j+1}
12:  checkdown ← (checkdown | down(T_j, T_{j+1})) & ∼up(T_j, T_{j+1})
13:  checkdown ← checkdown >> 1
14:  prevcheckdown ← checkdown

. . .

Now note that during the check, the content of prevcheckup is exactly the same as the content of checkup, so we can remove prevcheckup completely.


Algorithm 9 SMALGO-II

1: R_0 ← 2^{patternlength−1}
2: checkup ← checkdown ← 0
3: R_0 ← R_0 & D_{T_0}
4: R_1 ← R_0 >> 1
5: for j = 0 to (n − 2) do
6:   R_{j+1} ← R_{j+1} & pmask(T_j, T_{j+1}) & D_{T_{j+1}}
7:   R_{j+1} ← ∼(checkup & (checkup | up(T_j, T_{j+1})) & ∼down(T_j, T_{j+1}) & ∼middle(T_j, T_{j+1})) & R_{j+1}
8:   checkup ← (checkup | up(T_j, T_{j+1})) & ∼down(T_j, T_{j+1}) & ∼middle(T_j, T_{j+1})
9:   checkup ← checkup >> 1
10:  R_{j+1} ← ∼(checkdown & (checkdown | down(T_j, T_{j+1})) & ∼up(T_j, T_{j+1})) & R_{j+1}
11:  checkdown ← (checkdown | down(T_j, T_{j+1})) & ∼up(T_j, T_{j+1})
12:  checkdown ← checkdown >> 1
13:  if (R_{j+1} & 1) = 1 then
14:    Match found ending at position (j + 1)
15:  R_{j+2} ← (R_{j+1} >> 1) | 2^{patternlength−1}

Now we modify the expressions by laws of logic to arrive at the following formulation.

Algorithm 10 SMALGO-II
. . .

7:   R_{j+1} ← R_{j+1} & (∼checkup | down(T_j, T_{j+1}) | middle(T_j, T_{j+1}))
8:   checkup ← (checkup & ∼down(T_j, T_{j+1}) & ∼middle(T_j, T_{j+1})) | (up(T_j, T_{j+1}) & ∼down(T_j, T_{j+1}) & ∼middle(T_j, T_{j+1}))
9:   checkup ← checkup >> 1
10:  R_{j+1} ← R_{j+1} & (∼checkdown | up(T_j, T_{j+1}))
11:  checkdown ← (checkdown & ∼up(T_j, T_{j+1})) | (down(T_j, T_{j+1}) & ∼up(T_j, T_{j+1}))
12:  checkdown ← checkdown >> 1

. . .

Now, if the first subexpression in the logical or setting the new value of checkup is true, then the appropriate bit of R_{j+1} was just set to 0 on the previous line and filtering this bit again in the future is useless. Hence, we can omit this part of the expression. We arrive at the following resulting pseudocode.


Algorithm 11 SMALGO-II

1: R_0 ← 2^{patternlength−1}
2: checkup ← checkdown ← 0
3: R_0 ← R_0 & D_{T_0}
4: R_1 ← R_0 >> 1
5: for j = 0 to (n − 2) do
6:   R_{j+1} ← R_{j+1} & pmask(T_j, T_{j+1}) & D_{T_{j+1}}
7:   R_{j+1} ← R_{j+1} & (∼checkup | down(T_j, T_{j+1}) | middle(T_j, T_{j+1}))
8:   checkup ← up(T_j, T_{j+1}) & ∼down(T_j, T_{j+1}) & ∼middle(T_j, T_{j+1})
9:   checkup ← checkup >> 1
10:  R_{j+1} ← R_{j+1} & (∼checkdown | up(T_j, T_{j+1}))
11:  checkdown ← down(T_j, T_{j+1}) & ∼up(T_j, T_{j+1})
12:  checkdown ← checkdown >> 1
13:  if (R_{j+1} & 1) = 1 then
14:    Match found ending at position (j + 1)
15:  R_{j+2} ← (R_{j+1} >> 1) | 2^{patternlength−1}

Now it is easy to see that checkup stores the information on whether an upward-change must have occurred in the previous step (provided that there was a prefix match) and this is compared with the information whether a downward-change or middle-change can occur. Similarly for the downward-change. This is not sufficient to avoid false positives since sometimes both an upward-change and a downward-change can occur (e.g., as in our counterexample), in which case no filtration is performed at all.