run length coding

2/27/2015

1

Data Compression

Eng. Reem Sabah Ahmed

Computer Engineering Department

Engineering College

Al-Nahrain University

Data Compression Introduction

Lossless and lossy compression.

Run-Length coding

Delta coding.

Huffman method(coding)

Huffman method(decoding)

Arithmetic method(coding)

Arithmetic method(decoding)

Dictionary methods

Lempel-Ziv Algorithm

Image compression

Video compression

2/27/2015

2

Run-Length Coding

Run length encoding (RLE) is a simple technique to

compress digital data by representing successive runs

of the same value in the data as the value followed

by the count, rather than the original run of values.

The goal is to reduce the amount of data needed to

be stored or transmitted.

Run length encoding (RLE) is one very simple lossless

data compression algorithm called run-lengthencoding that can be very useful in some cases.

Run-length encoding (RLE) is a very simple form of

data compression in which consecutive sequences

of the same data value (runs) are stored or

transmitted as a single data value and count,

rather than as the original individual data

elements. This is particularly useful for data that

contains many such runs such as simple graphic

images and faxed documents. If the original

doesnt have many runs, it could increase ratherthan decrease the size of the data file or

transmission.

2/27/2015

3

A fax machine scans a page and represents that image as

black or white pixels, which are sent over telephone

lines to another fax machine that will print the pixels

onto a blank page. The total number of pixels to be

transmitted may be very large which would result in

lengthy transmission times. Because fax images often

have large blocks of white (e.g. margins and inter-line

spacing) or black (e.g. horizontal lines) they are readily

amenable to run-length encoding.

this compression under the category of lossless

compression algorithms. The decoded data will be the

exact same as the original data.

The following are the main considerations that apply to

this technique:

Text in a natural language (as well as names) may have many

doubles and a few triplesas in AAA (an acronym), abbess,

Emmanuelle, bookkeeper, arrowwood, freeer (old usage),

and hostessship (used by Shakespeare)but longer runs are

limited to consecutive spaces and periods.

In a bi-level image there are two types of symbols, namely

black and white pixels, so runs of pixels alternate between

these two colors, which implies that RLE can compress such

an image by replacing each run with its length.

2/27/2015

4

In general, the data to be compressed includes several

types of symbols, so RLE compresses a run by replacing

it with a pair (length, symbol).

If the run is short, such a pair may be longer than the

run of symbols, thereby leading to expansion. A

reasonable solution is to write such short runs on the

output in raw format, so at least they do not cause

expansion. However, this raises the question of

distinguishing between pairs and raw items, because in

the output file both types are binary strings. A practical

RLE program must therefore precede each pair and each

raw item with a 1-bit indicator.

Thus, a pair becomes the triplet (0, length, symbol) and a

raw item becomes the pair (1, symbol).

Runs have different lengths, which is why the pairs and

triplets have different lengths. It therefore makes sense

to replace each by a variable-length code and write the

codes on the output. Thus, RLE is normally just one step

in a multistep compression algorithm that may include a

transform, variable-length codes, and perhaps also

quantization.

2/27/2015

5

Examples of Run Length coding

Example1

As a basic example, consider the following string of numbers:

5 5 5 5 8 8 8 2 2 2 2 2

There is a fair amount of redundancy there. In RLE notation, this

same string could be expressed as:

4 5 3 8 5 2

This indicates a run of 4 5 numbers, followed by a run of 3 8

numbers, followed by a run of 5 2 numbers. Suppose that each

number was represented by a byte on disk. The original encoding

requires 12 bytes. The RLE encoding requires 6 bytes. 2:1

compression in this simple case not bad.

Example 2

consider the following number string:

1 2 3 4 5 6

Apply the same RLE compression scheme as before:

1 1 1 2 1 3 1 4 1 5 1 6

So the original string was 6 bytes and the RLE string is 12

bytes. That did not go too well as the compression ratio is

now 1:2. This is why pure RLE implementations have a mode

for encoding strings of dissimilar numbers as well. It usually

works by encoding a negative number that, when negated,

will give the number of succeeding bytes that are not RLE

data.

2/27/2015

6

Example 2 Cont.

Since the above example contains a string of 6 dissimilar

numbers, the encoding would be:

-6 1 2 3 4 5 6

A decoder would see that the RLE code is negative, negate it

to get 6, and then copy 6 bytes from the encoded byte stream

to the decoded byte stream.

Example 3

Using the basic concepts described so far, lets decode thefollowing byte stream:

4 5 -6 1 2 3 4 5 6 3 8 5 2

This is a combination of examples from above and decodesto:

5 5 5 5 1 2 3 4 5 6 8 8 8 2 2 2 2 2

Work through it by hand as necessary. So the encoded stringis 13 bytes and the decoded string is 18 bytes. Modestcompression. RLE compression naturally works best wherethere are long runs of the same pixel value. At worst, therewill be no compression if there are almost no pairs ofadjacent pixels with the same value.

2/27/2015

7

Summery

The basic RLE principle is to detect sequences of repeated data values and after that to replace this sequence with two elements:

- the number of the same characters.

- the character itself.

Useful for compressing data that contains repeated values

Very simple compared with other compressiontechniques

Reversible (Lossless) compression: decompression is just as easy

Question?

2/27/2015

8

Thank you

run length coding

Documents