Run-Length Coding

Upload: aya-mazin

Post on 29-Sep-2015

  • 2/27/2015

    Data Compression

    Eng. Reem Sabah Ahmed

    Computer Engineering Department

    Engineering College

    Al-Nahrain University

    Data Compression Introduction

    Lossless and lossy compression

    Run-Length coding

    Delta coding

    Huffman method (coding)

    Huffman method (decoding)

    Arithmetic method (coding)

    Arithmetic method (decoding)

    Dictionary methods

    Lempel-Ziv Algorithm

    Image compression

    Video compression

    Run-Length Coding

    Run-length encoding (RLE) is a simple technique for compressing digital data: each run of successive identical values is represented by the value paired with its repeat count, rather than by the original run of values.

    The goal is to reduce the amount of data that needs to be stored or transmitted.
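The idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `rle_pairs` is made up here, and runs are returned as (value, count) tuples rather than packed into bytes.

```python
from itertools import groupby

def rle_pairs(data):
    """Collapse each run of equal items into a (value, count) pair."""
    # groupby yields one group per run of consecutive equal items.
    return [(value, sum(1 for _ in run)) for value, run in groupby(data)]

pairs = rle_pairs("WWWWWBBBWW")
print(pairs)  # [('W', 5), ('B', 3), ('W', 2)]
```

Ten input symbols become three pairs; how much space that actually saves depends on how each pair is written to the output.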

    RLE is a very simple lossless data compression algorithm that can be very useful in some cases. Consecutive sequences of the same data value (runs) are stored or transmitted as a single data value and a count, rather than as the original individual data elements. This is particularly useful for data that contains many such runs, such as simple graphic images and faxed documents. If the original does not have many runs, RLE could increase rather than decrease the size of the data file or transmission.

    A fax machine scans a page and represents that image as black or white pixels, which are sent over telephone lines to another fax machine that will print the pixels onto a blank page. The total number of pixels to be transmitted may be very large which would result in lengthy transmission times. Because fax images often have large blocks of white (e.g. margins and inter-line spacing) or black (e.g. horizontal lines) they are readily amenable to run-length encoding.

    This compression falls under the category of lossless compression algorithms: the decoded data is exactly the same as the original data.
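The lossless property can be checked with a round trip. The sketch below is an assumption-laden illustration, not a reference implementation: `rle_compress` and `rle_expand` are hypothetical names, and runs are kept as in-memory (value, count) tuples.

```python
from itertools import groupby

def rle_compress(data):
    """Turn a sequence into a list of (value, count) runs."""
    return [(value, len(list(run))) for value, run in groupby(data)]

def rle_expand(pairs):
    """Rebuild the original sequence by repeating each value count times."""
    out = []
    for value, count in pairs:
        out.extend([value] * count)
    return out

# Hypothetical scan line: 0 = white, 255 = black.
original = [0, 0, 0, 255, 255, 0]
restored = rle_expand(rle_compress(original))
print(restored == original)  # True
```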

    The following are the main considerations that apply to this technique:

    Text in a natural language (as well as names) may have many doubles and a few triples, as in AAA (an acronym), abbess, Emmanuelle, bookkeeper, arrowwood, freeer (old usage), and hostessship (used by Shakespeare), but longer runs are limited to consecutive spaces and periods.

    In a bi-level image there are two types of symbols, namely black and white pixels, so runs of pixels alternate between these two colors, which implies that RLE can compress such an image by replacing each run with its length.
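For the bi-level case, a minimal sketch might look as follows. It assumes pixels are given as 0 (white) and 1 (black), and it assumes a convention the text does not spell out: the first length always counts white pixels, so a row that starts with black gets a leading white run of length 0. The function names are illustrative only.

```python
def encode_bilevel(row):
    """Return the run lengths of a row of 0 (white) / 1 (black) pixels."""
    lengths = []
    current = 0  # colour of the run being counted; rows start on white
    count = 0
    for pixel in row:
        if pixel == current:
            count += 1
        else:
            lengths.append(count)  # close the finished run
            current = pixel
            count = 1
    lengths.append(count)
    return lengths

def decode_bilevel(lengths):
    """Rebuild the pixel row by alternating colours, starting with white."""
    row = []
    colour = 0
    for n in lengths:
        row.extend([colour] * n)
        colour = 1 - colour
    return row

print(encode_bilevel([0, 0, 0, 1, 1, 0, 0, 0, 0]))  # [3, 2, 4]
print(encode_bilevel([1, 1, 0]))                    # [0, 2, 1]
```

Because the colours alternate, no symbol has to be stored next to each length; the decoder only needs to know which colour comes first.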

    In general, the data to be compressed includes several types of symbols, so RLE compresses a run by replacing it with a pair (length, symbol).

    If the run is short, such a pair may be longer than the run of symbols, thereby leading to expansion. A reasonable solution is to write such short runs on the output in raw format, so at least they do not cause expansion. However, this raises the question of distinguishing between pairs and raw items, because in the output file both types are binary strings. A practical RLE program must therefore precede each pair and each raw item with a 1-bit indicator.

    Thus, a pair becomes the triplet (0, length, symbol) and a raw item becomes the pair (1, symbol).
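The flagged format can be sketched as below. The threshold `MIN_RUN = 3` is an assumption chosen for illustration (the text only says "short"), the function names are hypothetical, and tokens are kept as Python tuples instead of being packed into bits.

```python
from itertools import groupby

MIN_RUN = 3  # assumed threshold: shorter runs are written raw

def tokenize(data, min_run=MIN_RUN):
    """Emit (0, length, symbol) for long runs and (1, symbol) for raw items."""
    tokens = []
    for symbol, run in groupby(data):
        length = len(list(run))
        if length >= min_run:
            tokens.append((0, length, symbol))
        else:
            # A short run is cheaper as individual raw items.
            tokens.extend((1, symbol) for _ in range(length))
    return tokens

def detokenize(tokens):
    """Use the leading indicator to tell triplets from raw pairs."""
    out = []
    for token in tokens:
        if token[0] == 0:
            _, length, symbol = token
            out.extend([symbol] * length)
        else:
            out.append(token[1])
    return out

print(tokenize("aaaabccd"))
# [(0, 4, 'a'), (1, 'b'), (1, 'c'), (1, 'c'), (1, 'd')]
```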

    Runs have different lengths, which is why the pairs and triplets have different lengths. It therefore makes sense to replace each by a variable-length code and write the codes on the output. Thus, RLE is normally just one step in a multistep compression algorithm that may include a transform, variable-length codes, and perhaps also quantization.

    Examples of Run-Length Coding

    Example 1

    As a basic example, consider the following string of numbers:

    5 5 5 5 8 8 8 2 2 2 2 2

    There is a fair amount of redundancy there. In RLE notation, this same string could be expressed as:

    4 5 3 8 5 2

    This indicates a run of four 5s, followed by a run of three 8s, followed by a run of five 2s. Suppose that each number is represented by one byte on disk. The original encoding requires 12 bytes and the RLE encoding requires 6 bytes: 2:1 compression in this simple case, which is not bad.
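The numbers in this example can be reproduced with a short sketch. It assumes the flat count, value, count, value layout used above and one byte per number; `rle_encode` is an illustrative name.

```python
from itertools import groupby

def rle_encode(values):
    """Flatten runs into count, value, count, value, ..."""
    encoded = []
    for value, run in groupby(values):
        encoded += [len(list(run)), value]
    return encoded

original = [5, 5, 5, 5, 8, 8, 8, 2, 2, 2, 2, 2]
encoded = rle_encode(original)
print(encoded)                       # [4, 5, 3, 8, 5, 2]
print(len(original) / len(encoded))  # 2.0, i.e. 2:1 compression
```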

    Example 2

    Consider the following number string:

    1 2 3 4 5 6

    Apply the same RLE compression scheme as before:

    1 1 1 2 1 3 1 4 1 5 1 6

    The original string was 6 bytes and the RLE string is 12 bytes. That did not go well: the compression ratio is now 1:2, so the data has doubled in size. This is why RLE implementations usually also have a mode for encoding strings of dissimilar numbers. It typically works by writing a negative number that, when negated, gives the number of succeeding bytes that are not RLE data.

    Since the above example contains a string of 6 dissimilar numbers, the encoding would be:

    -6 1 2 3 4 5 6

    A decoder would see that the RLE code is negative, negate it to get 6, and then copy 6 bytes from the encoded byte stream to the decoded byte stream.
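An encoder with this literal mode might be sketched as follows. Several choices here are assumptions rather than part of the scheme described above: a run of length 1 is treated as a literal, any run of 2 or more becomes a (count, value) pair, and counts are plain Python integers that are not capped to the range of a signed byte. The name `rle_encode_mixed` is hypothetical.

```python
from itertools import groupby

def rle_encode_mixed(values):
    """Encode runs as count, value and dissimilar values as -n, v1, ..., vn."""
    encoded = []
    literals = []  # pending values that are not part of a run

    def flush_literals():
        # Write the pending literals behind a negative count.
        if literals:
            encoded.append(-len(literals))
            encoded.extend(literals)
            literals.clear()

    for value, run in groupby(values):
        length = len(list(run))
        if length == 1:
            literals.append(value)
        else:
            flush_literals()
            encoded.extend([length, value])
    flush_literals()
    return encoded

print(rle_encode_mixed([1, 2, 3, 4, 5, 6]))  # [-6, 1, 2, 3, 4, 5, 6]
```

With the literal mode, the six dissimilar numbers cost 7 bytes instead of 12.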

    Example 3

    Using the basic concepts described so far, let's decode the following byte stream:

    4 5 -6 1 2 3 4 5 6 3 8 5 2

    This is a combination of the examples above and decodes to:

    5 5 5 5 1 2 3 4 5 6 8 8 8 2 2 2 2 2

    Work through it by hand as necessary. The encoded string is 13 bytes and the decoded string is 18 bytes: modest compression. RLE naturally works best where there are long runs of the same pixel value. At worst, if almost no adjacent pixels share a value, there will be no compression; the literal count even adds a little overhead, as in Example 2, where 6 bytes became 7.
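The hand decoding can be cross-checked with a small decoder. This is a sketch under the same assumptions as the examples: a positive count is followed by one value to repeat, and a negative count is followed by that many literal values. The name `rle_decode_mixed` is illustrative.

```python
def rle_decode_mixed(encoded):
    """Decode a stream of (count, value) runs and (-n, literals...) blocks."""
    decoded = []
    i = 0
    while i < len(encoded):
        count = encoded[i]
        i += 1
        if count < 0:
            n = -count  # number of literal values that follow
            decoded.extend(encoded[i:i + n])
            i += n
        else:
            decoded.extend([encoded[i]] * count)  # repeat the next value
            i += 1
    return decoded

stream = [4, 5, -6, 1, 2, 3, 4, 5, 6, 3, 8, 5, 2]
result = rle_decode_mixed(stream)
print(result)
# [5, 5, 5, 5, 1, 2, 3, 4, 5, 6, 8, 8, 8, 2, 2, 2, 2, 2]
print(len(stream), len(result))  # 13 18
```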

    Summary

    The basic RLE principle is to detect sequences of repeated data values and then replace each sequence with two elements:

    - the number of repetitions of the character

    - the character itself

    Useful for compressing data that contains repeated values

    Very simple compared with other compression techniques

    Reversible (lossless) compression: decompression is just as easy

    Questions?

    Thank you