Run-Length Coding
-
2/27/2015
1
Data Compression
Eng. Reem Sabah Ahmed
Computer Engineering Department
Engineering College
Al-Nahrain University
Data Compression Introduction
Lossless and lossy compression
Run-length coding
Delta coding
Huffman method (coding)
Huffman method (decoding)
Arithmetic method (coding)
Arithmetic method (decoding)
Dictionary methods
Lempel-Ziv Algorithm
Image compression
Video compression
Run-Length Coding
Run-length encoding (RLE) is a simple technique to
compress digital data by representing each successive run
of the same value in the data as a count paired with
that value, rather than as the original run of values.
The goal is to reduce the amount of data needed to
be stored or transmitted.
RLE is a very simple lossless data compression
algorithm that can be very useful in some cases.
Run-length encoding (RLE) is a very simple form of
data compression in which consecutive sequences
of the same data value (runs) are stored or
transmitted as a single data value and count,
rather than as the original individual data
elements. This is particularly useful for data that
contains many such runs such as simple graphic
images and faxed documents. If the original
doesn't have many runs, RLE could increase rather
than decrease the size of the data file or
transmission.
A fax machine scans a page and represents that image as
black or white pixels, which are sent over telephone
lines to another fax machine that will print the pixels
onto a blank page. The total number of pixels to be
transmitted may be very large which would result in
lengthy transmission times. Because fax images often
have large blocks of white (e.g. margins and inter-line
spacing) or black (e.g. horizontal lines) they are readily
amenable to run-length encoding.
This compression falls under the category of lossless
compression algorithms: the decoded data will be
exactly the same as the original data.
The following are the main considerations that apply to
this technique:
Text in a natural language (as well as names) may have many
doubles and a few triples, as in AAA (an acronym), abbess,
Emmanuelle, bookkeeper, arrowwood, freeer (old usage),
and hostessship (used by Shakespeare), but longer runs are
limited to consecutive spaces and periods.
In a bi-level image there are two types of symbols, namely
black and white pixels, so runs of pixels alternate between
these two colors, which implies that RLE can compress such
an image by replacing each run with its length.
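The bi-level case can be sketched in a few lines of Python. This is only an illustration, not a fax standard: the function names and the convention that every row is assumed to start with white (0), with a zero-length first run emitted when it actually starts with black (1), are assumptions made for this sketch.

```python
from itertools import groupby

def encode_bilevel(pixels, first=0):
    """Encode a row of 0/1 pixels as a list of run lengths only.

    Assumption of this sketch: the decoder expects the row to start
    with colour `first`. If the row starts with the other colour,
    a zero-length run is emitted so the colours still alternate.
    """
    runs = [len(list(group)) for _, group in groupby(pixels)]
    if pixels and pixels[0] != first:
        runs.insert(0, 0)
    return runs

def decode_bilevel(runs, first=0):
    """Rebuild the pixel row by alternating colours after every run."""
    pixels = []
    colour = first
    for length in runs:
        pixels.extend([colour] * length)
        colour = 1 - colour
    return pixels

row = [0, 0, 0, 0, 1, 1, 0, 0, 0, 1]
runs = encode_bilevel(row)
print(runs)                          # [4, 2, 3, 1]
print(decode_bilevel(runs) == row)   # True
```

Because the colours alternate, no symbol needs to be stored: the position of a length in the list already says whether it is a white or a black run.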
In general, the data to be compressed includes several
types of symbols, so RLE compresses a run by replacing
it with a pair (length, symbol).
If the run is short, such a pair may be longer than the
run of symbols, thereby leading to expansion. A
reasonable solution is to write such short runs on the
output in raw format, so at least they do not cause
expansion. However, this raises the question of
distinguishing between pairs and raw items, because in
the output file both types are binary strings. A practical
RLE program must therefore precede each pair and each
raw item with a 1-bit indicator.
Thus, a pair becomes the triplet (0, length, symbol) and a
raw item becomes the pair (1, symbol).
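A minimal sketch of this flagged scheme follows. It works on Python tuples rather than packed bits, and the threshold `min_run` (runs shorter than 3 are written raw) and the function names are assumptions chosen for illustration; the slides do not fix these values.

```python
from itertools import groupby

def encode_flagged(data, min_run=3):
    """Return a list of (0, length, symbol) triplets and (1, symbol) pairs.

    Assumption: runs shorter than `min_run` are written as raw items,
    one (1, symbol) pair per symbol, so they cannot expand much.
    """
    out = []
    for symbol, group in groupby(data):
        length = len(list(group))
        if length >= min_run:
            out.append((0, length, symbol))   # flag 0: a real run
        else:
            out.extend((1, symbol) for _ in range(length))  # flag 1: raw
    return out

def decode_flagged(items):
    """Invert encode_flagged by inspecting the leading flag of each item."""
    data = []
    for item in items:
        if item[0] == 0:
            _, length, symbol = item
            data.extend([symbol] * length)
        else:
            data.append(item[1])
    return data

items = encode_flagged("aaaabcc")
print(items)  # [(0, 4, 'a'), (1, 'b'), (1, 'c'), (1, 'c')]
print("".join(decode_flagged(items)))  # aaaabcc
```

In a real bitstream the flag would occupy a single bit and the length and symbol fields would have fixed or variable-length codes; the tuples here only make the structure visible.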
Runs have different lengths, which is why the pairs and
triplets have different lengths. It therefore makes sense
to replace each by a variable-length code and write the
codes on the output. Thus, RLE is normally just one step
in a multistep compression algorithm that may include a
transform, variable-length codes, and perhaps also
quantization.
Examples of Run Length coding
Example 1
As a basic example, consider the following string of numbers:
5 5 5 5 8 8 8 2 2 2 2 2
There is a fair amount of redundancy there. In RLE notation, this
same string could be expressed as:
4 5 3 8 5 2
This indicates a run of four 5s, followed by a run of three 8s,
followed by a run of five 2s. Suppose that each
number is represented by one byte on disk. The original encoding
requires 12 bytes, while the RLE encoding requires 6 bytes: 2:1
compression in this simple case, which is not bad.
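The encoding in Example 1 can be reproduced with a short Python sketch. The function name `rle_encode` is an assumption for illustration, and the sketch ignores the limit a real one-byte count would impose (runs longer than 255 would have to be split).

```python
from itertools import groupby

def rle_encode(values):
    """Replace every run with the pair: count, value."""
    out = []
    for value, group in groupby(values):
        out.extend([len(list(group)), value])
    return out

original = [5, 5, 5, 5, 8, 8, 8, 2, 2, 2, 2, 2]
encoded = rle_encode(original)
print(encoded)                        # [4, 5, 3, 8, 5, 2]
print(len(original) / len(encoded))   # 2.0
```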
Example 2
Now consider the following number string:
1 2 3 4 5 6
Apply the same RLE compression scheme as before:
1 1 1 2 1 3 1 4 1 5 1 6
So the original string was 6 bytes and the RLE string is 12
bytes. That did not go too well as the compression ratio is
now 1:2. This is why pure RLE implementations have a mode
for encoding strings of dissimilar numbers as well. It usually
works by encoding a negative number that, when negated,
will give the number of succeeding bytes that are not RLE
data.
Example 2 (continued)
Since the above example contains a string of 6 dissimilar
numbers, the encoding would be:
-6 1 2 3 4 5 6
A decoder would see that the RLE code is negative, negate it
to get 6, and then copy 6 bytes from the encoded byte stream
to the decoded byte stream.
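One possible encoder for this mixed mode is sketched below. The name `rle_encode_mixed` and the threshold `min_run` (a run must have at least 2 equal values to be stored as a count/value pair) are assumptions of this sketch; real formats choose their own thresholds and count ranges.

```python
from itertools import groupby

def rle_encode_mixed(values, min_run=2):
    """Encode runs as (count, value) and dissimilar values as (-n, v1..vn)."""
    out = []
    literals = []   # values waiting to be written as one literal block

    def flush():
        # Write the pending literal block, prefixed by its negated length.
        if literals:
            out.append(-len(literals))
            out.extend(literals)
            literals.clear()

    for value, group in groupby(values):
        n = len(list(group))
        if n >= min_run:
            flush()
            out.extend([n, value])
        else:
            literals.extend([value] * n)
    flush()
    return out

print(rle_encode_mixed([1, 2, 3, 4, 5, 6]))
# [-6, 1, 2, 3, 4, 5, 6]
```

With this mode, six dissimilar numbers cost 7 values instead of 12, so the worst-case expansion is one extra count per literal block.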
Example 3
Using the basic concepts described so far, let's decode the
following byte stream:
4 5 -6 1 2 3 4 5 6 3 8 5 2
This is a combination of the examples above and decodes
to:
5 5 5 5 1 2 3 4 5 6 8 8 8 2 2 2 2 2
Work through it by hand as necessary. So the encoded string
is 13 bytes and the decoded string is 18 bytes. Modest
compression. RLE compression naturally works best where
there are long runs of the same pixel value. At worst, there
will be no compression if there are almost no pairs of
adjacent pixels with the same value.
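The hand decoding of Example 3 can be checked with a small Python sketch. The function name is hypothetical, and the sketch assumes a well-formed stream (every count is followed by the values it promises); it does no error handling.

```python
def rle_decode_mixed(stream):
    """Decode a stream of (count, value) pairs and (-n, v1..vn) literal blocks."""
    out = []
    i = 0
    while i < len(stream):
        count = stream[i]
        if count < 0:
            # Negative code: copy the next -count values unchanged.
            n = -count
            out.extend(stream[i + 1 : i + 1 + n])
            i += 1 + n
        else:
            # Non-negative code: repeat the next value `count` times.
            out.extend([stream[i + 1]] * count)
            i += 2
    return out

encoded = [4, 5, -6, 1, 2, 3, 4, 5, 6, 3, 8, 5, 2]
decoded = rle_decode_mixed(encoded)
print(decoded)
# [5, 5, 5, 5, 1, 2, 3, 4, 5, 6, 8, 8, 8, 2, 2, 2, 2, 2]
print(len(encoded), len(decoded))   # 13 18
```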
Summary
The basic RLE principle is to detect sequences of repeated data values and then replace each sequence with two elements:
- the number of repeated characters (the run length).
- the character itself.
Useful for compressing data that contains repeated values
Very simple compared with other compression techniques
Reversible (lossless) compression: decompression is just as easy
Questions?
Thank you