16 jop april 08

c m y k

85www.openITis.com | L INUX For YoU | AprIL 2008

When we hear of some strange algorithms with names like ‘Dijkstra’s shortest path’, ‘Markov’s string replacement’ or ‘Sieve of

Eratosthenes’, we usually dislike the arcane and dry solutions they describe. However, such algorithms can have interesting applications or can provide some unusual solutions that are not possible with the straightforward solutions we often arrive at based on common sense.

We’ll explore ‘Markov’s string replacement’ algorithm in this article. What this algorithm does is simple string rewriting based on a set of rules. Let’s look at an interesting application of this algorithm: converting a binary to unary by string replacement, by applying a set of rules. The idea described here is from en.wikipedia.org/wiki/Markov_algorithm

Let us look at the rules first: • Rule 1: “#0” => “0##” • Rule 2: “1” => “0#” • Rule 3: “0” => “” • Rule 4: “#” => “1”

The symbol ‘=>’ stands for ‘replace with’. Take any binary number – preferably, a small one. Treat that binary number as a string. Start repeatedly applying the rules from the top to the bottom. Apply the rules from left to right on the string. Only if you cannot apply the current rule or the previous rules anymore on the string, go to the next rule. Once you finish applying the last rule, stop the process.

Consider the binary number ‘11’. Its equivalent decimal value is 3. So, its unary value is 1 repeated 3 times, that is, 111. By applying these rules, we should get ‘111’. Do the rules work? Let us check:

• Input string: “11” • Step 1: “0#1” (apply rule 2) • Step 2: “0#0#” (apply rule 2) • Step 3: “00###” (apply rule 1) • Step 4: “0###” (apply rule 3)

S.G. GaneSh

algorithms need not always be a dry and esoteric subject. In this column, we will look at how to convert binary values to unary values by repeatedly applying simple, automated rules using Markov’s string replacement algorithm.

The Joy ofProgrammingThe Beautiful Markov String Replacement Algorithm

S.G. Ganesh is a research engineer at Siemens (Corporate Technology). His latest book is “60 Tips on Object Oriented Programming”, published by Tata McGraw-Hill in December last year. You can reach him at [email protected].

• Step 5: “###” (apply rule 3)• Step 6: “1##” (apply rule 4)• Step 7: “11#” (apply rule 4)• Step 8: “111” (apply rule 4) • Terminate.

Looks interesting, but does it work in all cases? Let us try another one. ‘100’ is decimal 4 and unary 1111. Here are the strings during each step of applying the rules:

“100” => “0#00” => “00##0” => “00#0##” => “000####” => “00####” => “0####” => “####” => “1###” => “11##” => “111#” => “1111”.

It does indeed work! Now, what is the logic behind these rules? The first two steps do the actual work. The third rule just removes leading zeros, and the last rule converts “#”s to “1”s.

The first and second rule moves 1 from left to right, repeatedly. Each time it shifts, it increases the number of #’s by two. Remember that the value of 1 in a binary number is its power of two, and the rules implicitly calculate that. By the time you are unable to apply the first two rules any more, all the entries in the string would be “#”s, with leading “0”s that were created because of this process. Simple, isn’t it?

There is another very interesting application of this algorithm. It is possible to write programs that create grammatically correct, non-sense text! If you’re interested to know how, read the chapter on ‘Generating Text’ in Programming Pearls by Jon Bentley, Addison-Wesley, 2000.

The algorithm described here is simple, but powerful: This algorithm is shown to be ‘Turing complete’ (to describe it informally, Turing complete algorithms can be used to implement any computation that any powerful computer in the world is capable of performing!).