Data Compression Basics (EE465: Introduction to Digital Image Processing)


TRANSCRIPT

  • Slide 1/41

    One-Minute Survey Result

    Thank you for your responses: Kristen, Anusha, Ian, Christofer, Bernard,
    Greg, Michael, Shalini, Brian and Justin

    Valentine's challenge
    Min: 30-45 minutes, Max: 5 hours, Ave: 2-3 hours

    Muddiest points
    Regular tree grammar (CS410: Compilers or CS422: Automata)
    Fractal geometry (The Fractal Geometry of Nature by Mandelbrot)

    Seeing the Connection

    Remember the first story in Steve Jobs' speech "Staying Hungry,
    Staying Foolish"?

    In addition to Jobs and Shannon, I have two more examples:
    Charles Darwin and Bruce Lee

  • Slide 2/41

    Data Compression Basics

    Discrete source

    Information = uncertainty

    Quantification of uncertainty

    Source entropy

    Variable length codes

    Motivation

    Prefix condition

    Huffman coding algorithm

  • Slide 3/41

    Information

    What do we mean by information?

    A numerical measure of the uncertainty of an experimental outcome
    (Webster's Dictionary)

    How to quantitatively measure and represent information?

    Shannon proposed a statistical-mechanics-inspired approach

    Let us first look at how we assess the amount of information in our daily
    lives using common sense

  • Slide 4/41

    Information = Uncertainty

    Zero information:
    Pittsburgh Steelers won Super Bowl XL (past news, no uncertainty)
    Yao Ming plays for the Houston Rockets (celebrity fact, no uncertainty)

    Little information:
    It will be very cold in Chicago tomorrow (not much uncertainty, since this
    is wintertime)
    It is going to rain in Seattle next week (not much uncertainty, since it
    rains nine months a year in the NW)

    Large information:
    An earthquake is going to hit CA in July 2006 (are you sure? an unlikely event)
    Someone has shown P=NP (Wow! Really? Who did it?)

  • Slide 5/41

    Shannon's Picture of Communication (1948)

    source -> source encoder -> [channel encoder -> channel -> channel decoder] -> source decoder -> destination

    (the bracketed portion is the "super-channel")

    Examples of source: human speech, photos, text messages, computer programs

    Examples of channel: storage media, telephone lines, wireless transmission

    The goal of communication is to move information from here to there and
    from now to then

  • Slide 6/41

    Source-Channel Separation Principle*

    The role of source coding (data compression):
    facilitate storage and transmission by eliminating source redundancy.
    Our goal is to maximally remove the source redundancy by intelligently
    designing the source encoder/decoder.

    The role of channel coding:
    fight against channel errors for reliable transmission of information
    (design of the channel encoder/decoder is considered in EE461).
    We simply assume that the super-channel achieves error-free transmission.

  • Slide 7/41

    Discrete Source

    A discrete source is characterized by a discrete random variable X

    Examples:

    Coin flipping: P(X=H) = P(X=T) = 1/2

    Dice tossing: P(X=k) = 1/6, k = 1, ..., 6

    Playing-card drawing: P(X=S) = P(X=H) = P(X=D) = P(X=C) = 1/4

    What is the redundancy with a discrete source?

  • Slide 8/41

    Two Extreme Cases

    Case 1: tossing a fair coin (Head or Tail?)
    source encoder -> channel -> source decoder
    P(X=H) = P(X=T) = 1/2 (maximum uncertainty):
    minimum (zero) redundancy, compression impossible

    Case 2: tossing a coin with two identical sides (HHHH... or TTTT...)
    duplication -> channel
    P(X=H) = 1, P(X=T) = 0 (minimum uncertainty):
    maximum redundancy, compression trivial (1 bit is enough)

    Redundancy is the opposite of uncertainty

  • Slide 9/41

    Quantifying Uncertainty of an Event

    Self-information:  I(p) = -log2(p)
    where p is the probability of the event x (e.g., x can be X=H or X=T)

    p = 1:    I(p) = 0      (must happen: no uncertainty)
    p -> 0:   I(p) -> inf   (unlikely to happen: infinite amount of uncertainty)

    Intuitively, I(p) measures the amount of uncertainty of event x

  • Slide 10/41

    Weighted Self-information

    I_w(p) = p * I(p) = -p log2(p)

    p = 0:    I(p) = inf,  I_w(p) = 0
    p = 1/2:  I(p) = 1,    I_w(p) = 1/2
    p = 1:    I(p) = 0,    I_w(p) = 0

    As p goes from 0 to 1, the weighted self-information first increases and
    then decreases.

    Question: Which value of p maximizes I_w(p)?

  • Slide 11/41

    Maximum of Weighted Self-information*

    I_w(p) is maximized at p = 1/e, where I_w(1/e) = 1/(e ln 2)
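
    As a quick numerical sanity check (not part of the original slides; plain
    Python, assuming the definition I_w(p) = -p log2(p) above), one can confirm
    the location and value of the maximum:

    import math

    def weighted_self_information(p):
        # I_w(p) = p * I(p) = -p * log2(p), with I_w(0) taken as 0
        return 0.0 if p == 0 else -p * math.log2(p)

    # Scan p over a fine grid in (0, 1) and locate the maximizer.
    grid = [i / 10000 for i in range(1, 10000)]
    p_star = max(grid, key=weighted_self_information)

    print(round(p_star, 4), round(1 / math.e, 4))            # both about 0.3679
    print(round(weighted_self_information(p_star), 4),
          round(1 / (math.e * math.log(2)), 4))              # both about 0.5307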

  • Slide 12/41

    Quantification of Uncertainty of a Discrete Source

    A discrete source (random variable) is a collection (set) of individual
    events whose probabilities sum to 1:

    X is a discrete random variable, x in {1, 2, ..., N}
    p_i = prob(x = i), i = 1, 2, ..., N
    sum_{i=1}^N p_i = 1

    To quantify the uncertainty of a discrete source, we simply take the
    summation of weighted self-information over the whole set.

  • Slide 13/41

    Shannon's Source Entropy Formula

    H(X) = sum_{i=1}^N I_w(p_i)
         = - sum_{i=1}^N p_i log2(p_i)   (bits/sample, or bps)

    The probabilities p_i serve as the weighting coefficients.
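
    As a concrete illustration (a small Python sketch, not part of the original
    slides), the entropy formula can be evaluated directly from a list of
    probabilities:

    import math

    def source_entropy(probs):
        # H(X) = -sum_i p_i * log2(p_i), in bits per sample (bps);
        # zero-probability symbols contribute nothing (0 * log 0 is taken as 0).
        return -sum(p * math.log2(p) for p in probs if p > 0)

    print(source_entropy([0.5, 0.5]))   # fair coin: 1.0 bps (maximum uncertainty)
    print(source_entropy([1.0, 0.0]))   # two-headed coin: 0.0 bps (no uncertainty)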

  • Slide 14/41

    Source Entropy Examples

    Example 1: (binary Bernoulli source)
    Flipping a coin with probability of head being p (0 < p < 1):

    p = prob(x = 0),  q = 1 - p = prob(x = 1)

    H(X) = -(p log2(p) + q log2(q))

  • Slide 15/41

    Entropy of Binary Bernoulli Source

  • Slide 16/41

    Source Entropy Examples

    Example 2: (4-way random walk, with moves N, E, S, W)

    prob(x = S) = 1/2,  prob(x = N) = 1/4
    prob(x = E) = prob(x = W) = 1/8

    H(X) = -(1/2 log2(1/2) + 1/4 log2(1/4) + 1/8 log2(1/8) + 1/8 log2(1/8))
         = 1.75 bps
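
    A short numerical check of Examples 1 and 2 (again a sketch, not from the
    slides; the helper mirrors the source_entropy function sketched above):

    import math

    def H(probs):
        return -sum(p * math.log2(p) for p in probs if p > 0)

    # Example 2: 4-way random walk, P(S)=1/2, P(N)=1/4, P(E)=P(W)=1/8.
    print(H([0.5, 0.25, 0.125, 0.125]))   # 1.75 bps

    # Example 1 with a biased coin, e.g. p = 0.9: entropy drops well below 1 bit.
    print(H([0.9, 0.1]))                  # about 0.469 bps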

  • Slide 17/41

    Source Entropy Examples (Cont'd)

    Example 3:

    A jar contains the same number of balls of two different colors: blue and
    red. Each time, a ball is randomly picked out of the jar and then put back.
    Consider the event that the k-th pick is the first time a red ball is seen.
    What is the probability of such an event?

    p = prob(x = red) = 1/2,  1 - p = prob(x = blue) = 1/2

    Prob(event) = Prob(blue in the first k-1 picks) * Prob(red in the k-th pick)
                = (1/2)^(k-1) * (1/2) = (1/2)^k

    (a source with geometric distribution)

  • Slide 18/41

    Source Entropy Calculation

    If we consider all possible events, the sum of their probabilities is one.
    So we can define a discrete random variable X with

    P(x = k) = (1/2)^k,  k = 1, 2, ...

    Check:  sum_{k=1}^inf (1/2)^k = 1

    Entropy:  H(X) = - sum_{k=1}^inf p_k log2(p_k) = sum_{k=1}^inf k (1/2)^k = 2 bps

    Problem 1 in HW3 is slightly more complex than this example.
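
    The two sums above can also be verified numerically; the sketch below (not
    from the slides) truncates the infinite series at k = 60, which is already
    far below double-precision resolution:

    import math

    total_prob = sum(0.5 ** k for k in range(1, 61))                        # sum_k (1/2)^k
    entropy = sum(-(0.5 ** k) * math.log2(0.5 ** k) for k in range(1, 61))  # sum_k k*(1/2)^k

    print(round(total_prob, 12))   # 1.0
    print(round(entropy, 12))      # 2.0 bps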

  • Slide 19/41

    Properties of Source Entropy

    Nonnegative and concave

    Achieves the maximum when the source observes a uniform distribution
    (i.e., P(x=k) = 1/N, k = 1, ..., N)

    Goes to zero (the minimum) as the source becomes more and more skewed
    (i.e., P(x=k) -> 1, P(x != k) -> 0)

  • Slide 20/41

    History of Entropy

    Origin: Greek root for "transformation content"

    First created by Rudolf Clausius to study thermodynamical systems in 1862

    Developed by Ludwig Eduard Boltzmann in the 1870s-1880s (the first serious
    attempt to understand nature in a statistical language)

    Borrowed by Shannon in his landmark work "A Mathematical Theory of
    Communication" in 1948

  • Slide 21/41

    A Little Bit of Mathematics*

    Entropy S is proportional to log P (P is the relative probability of a state).

    Consider an ideal gas of N identical particles, of which N_i are in the
    i-th microscopic condition (range) of position and momentum. Using
    Stirling's formula, log N! ~ N log N - N, and noting that p_i = N_i/N,
    you will get S ~ -sum_i p_i log p_i.

  • Slide 22/41

    Entropy-related Quotes

    "My greatest concern was what to call it. I thought of calling it
    'information', but the word was overly used, so I decided to call it
    'uncertainty'. When I discussed it with John von Neumann, he had a better
    idea. Von Neumann told me, 'You should call it entropy, for two reasons.
    In the first place your uncertainty function has been used in statistical
    mechanics under that name, so it already has a name. In the second place,
    and more important, nobody knows what entropy really is, so in a debate
    you will always have the advantage.'"

    --Conversation between Claude Shannon and John von Neumann regarding what
    name to give to the measure of uncertainty, or attenuation, in phone-line
    signals (1949)

  • Slide 23/41

    Other Uses of Entropy

    In biology:
    "The order produced within cells as they grow and divide is more than
    compensated for by the disorder they create in their surroundings in the
    course of growth and division." -- A. Lehninger
    Ecological entropy is a measure of biodiversity in the study of biological
    ecology.

    In cosmology:
    "Black holes have the maximum possible entropy of any object of equal
    size." -- Stephen Hawking

  • Slide 24/41

    What is the use of H(X)?

    Shannon's first theorem (noiseless coding theorem)

    For a memoryless discrete source X, its entropy H(X) defines the minimum
    average code length required to noiselessly code the source.

    Notes:

    1. Memoryless means that the events are independently generated (e.g., the
       outcomes of flipping a coin N times are independent events).

    2. Source redundancy can then be understood as the difference between the
       raw data rate and the source entropy.

  • Slide 25/41

    Code Redundancy*

    Code redundancy:  r = l_bar - H(X) >= 0

    Practical performance (average code length):
    l_bar = sum_{i=1}^N p_i l_i
    where l_i is the length of the codeword assigned to the i-th symbol.

    Theoretical bound:
    H(X) = sum_{i=1}^N p_i log2(1/p_i)

    Note: if we represent each symbol by q bits (fixed-length codes), then the
    redundancy is simply q - H(X) bps.
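
    A small sketch (not from the slides) that computes l_bar and the
    redundancy r for the running 4-way random walk example, for both a fixed
    2-bit code and the VLC lengths used later:

    import math

    def average_code_length(probs, lengths):
        # l_bar = sum_i p_i * l_i, in bits per symbol
        return sum(p * l for p, l in zip(probs, lengths))

    def code_redundancy(probs, lengths):
        # r = l_bar - H(X) >= 0 for any uniquely decodable code
        entropy = -sum(p * math.log2(p) for p in probs if p > 0)
        return average_code_length(probs, lengths) - entropy

    probs = [0.5, 0.25, 0.125, 0.125]            # S, N, E, W
    print(code_redundancy(probs, [2, 2, 2, 2]))  # fixed 2-bit code: r = 2 - 1.75 = 0.25 bps
    print(code_redundancy(probs, [1, 2, 3, 3]))  # VLC {0, 10, 110, 111}: r = 0 bps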

  • Slide 26/41

    How to achieve the source entropy?

    discrete source X --> entropy coding (driven by P(X)) --> binary bit stream

    Note: the above entropy coding problem is based on the simplifying
    assumptions that the discrete source X is memoryless and that P(X) is
    completely known. Those assumptions often do not hold for real-world data
    such as images, and we will recheck them later.

  • Slide 27/41

    Data Compression Basics

    Discrete source

    Information = uncertainty

    Quantification of uncertainty

    Source entropy

    Variable length codes

    Motivation

    Prefix condition

    Huffman coding algorithm

  • Slide 28/41

    Recall:

    Variable Length Codes (VLC)

    Assign a long codeword to an event with small probability

    Assign a short codeword to an event with large probability

    Self-information:  I(p) = -log2(p)

    It follows from the above formula that a small-probability event contains
    much information and is therefore worth many bits to represent. Conversely,
    if some event occurs frequently, it is probably a good idea to use as few
    bits as possible to represent it. Such observations lead to the idea of
    varying the code lengths based on the events' probabilities:

    l(x) = -log2(p(x))

  • Slide 29/41

    4-way Random Walk Example

    symbol k | p_k   | fixed-length codeword | variable-length codeword
    ---------+-------+-----------------------+-------------------------
    S        | 0.5   | 00                    | 0
    N        | 0.25  | 01                    | 10
    E        | 0.125 | 10                    | 110
    W        | 0.125 | 11                    | 111

    symbol stream:   S S N W S E N N W S S S N E S S

    fixed length:    00 00 01 11 00 10 01 01 11 00 00 00 01 10 00 00   (32 bits)
    variable length: 0 0 10 111 0 110 10 10 111 0 0 0 10 110 0 0       (28 bits)

    4 bits of savings achieved by VLC (redundancy eliminated)
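
    The bit counts above can be reproduced with a few lines of Python (a
    sketch, using the two code tables from this slide):

    fixed_code = {'S': '00', 'N': '01', 'E': '10', 'W': '11'}
    vlc        = {'S': '0',  'N': '10', 'E': '110', 'W': '111'}

    stream = "S S N W S E N N W S S S N E S S".split()

    fixed_bits = ''.join(fixed_code[s] for s in stream)
    vlc_bits   = ''.join(vlc[s] for s in stream)

    print(len(fixed_bits), len(vlc_bits))   # 32 28 -> 4 bits saved by the VLC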

  • Slide 30/41

    Toy Example (Cont'd)

    average code length:
    l_bar = (total number of bits) / (total number of symbols)   (bps)

    fixed-length:    l_bar = 2 bps > H(X)
    variable-length: l_bar = 0.5*1 + 0.25*2 + 0.125*3 + 0.125*3
                           = 1.75 bps = H(X)

    source entropy:  H(X) = - sum_{k=1}^4 p_k log2(p_k) = 1.75 bps

  • Slide 31/41

    Problems with VLC

    When codewords have fixed lengths, the boundary between codewords is always
    identifiable. For codewords with variable lengths, the boundary can become
    ambiguous.

    Example (a non-prefix code):  S -> 0,  N -> 1,  E -> 10,  W -> 11

    S S N W S E  is encoded as  0 0 1 11 0 10

    But the same bit stream can also be parsed as  0 0 11 1 0 10,
    which decodes to  S S W N S E.

  • Slide 32/41

    Uniquely Decodable Codes

    To avoid ambiguity in decoding, we need to enforce certain conditions on a
    VLC to make it uniquely decodable.

    Since ambiguity arises when some codeword becomes the prefix of another,
    it is natural to consider the prefix condition.

    Example: p  pr  pre  pref  prefi  prefix
    (each string in this list is a prefix of the ones that follow it)

  • Slide 33/41

    Prefix condition

    No codeword is allowed to be the prefix of any other codeword.

    We will graphically illustrate this condition with the aid of the binary
    codeword tree.

  • Slide 34/41

    Binary Codeword Tree

                       root
                     /      \
    Level 1:        1        0           # of codewords: 2
                   / \      / \
    Level 2:     11   10  01   00        # of codewords: 2^2
      ...
    Level k:                             # of codewords: 2^k

  • Slide 35/41

    Prefix Condition Examples

    symbol x | codeword 1 | codeword 2
    ---------+------------+-----------
    S        | 0          | 0
    N        | 1          | 10
    E        | 10         | 110
    W        | 11         | 111

    Placing the codewords on the binary codeword tree shows that codeword set 1
    {0, 1, 10, 11} violates the prefix condition (1 is a prefix of 10 and 11),
    while codeword set 2 {0, 10, 110, 111} satisfies it.

  • Slide 36/41

    How to satisfy the prefix condition?

    Basic rule: if a node is used as a codeword, then none of its descendants
    can be used as codewords.

    Example: on the binary codeword tree, choosing 0 and 10 as codewords rules
    out all their descendants, leaving 110 and 111 to complete the code
    {0, 10, 110, 111}.
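
    A tiny helper (not from the slides) that tests the prefix condition
    directly on a set of codewords, applied to the two example codes:

    def satisfies_prefix_condition(codewords):
        # True iff no codeword is a prefix of any other codeword
        return not any(a != b and b.startswith(a) for a in codewords for b in codewords)

    print(satisfies_prefix_condition(['0', '1', '10', '11']))     # False: 1 prefixes 10 and 11
    print(satisfies_prefix_condition(['0', '10', '110', '111']))  # True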

  • Slide 37/41

    Property of Prefix Codes

    Kraft's inequality:  sum_{i=1}^N 2^(-l_i) <= 1
    where l_i is the length of the i-th codeword.  (proof skipped)

    Example:

    symbol x | VLC-1 | VLC-2
    ---------+-------+------
    S        | 0     | 0
    N        | 1     | 10
    E        | 10    | 110
    W        | 11    | 111

    VLC-1: sum_i 2^(-l_i) = 2^(-1) + 2^(-1) + 2^(-2) + 2^(-2) = 3/2 > 1
    VLC-2: sum_i 2^(-l_i) = 2^(-1) + 2^(-2) + 2^(-3) + 2^(-3) = 1 <= 1
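
    The two Kraft sums above can be checked in one line each (a sketch, using
    only the codeword lengths):

    def kraft_sum(lengths):
        # sum_i 2^(-l_i); <= 1 is necessary for a prefix (uniquely decodable) code
        return sum(2 ** -l for l in lengths)

    print(kraft_sum([1, 1, 2, 2]))   # VLC-1 {0, 1, 10, 11}: 1.5 > 1, prefix condition impossible
    print(kraft_sum([1, 2, 3, 3]))   # VLC-2 {0, 10, 110, 111}: 1.0 <= 1, a prefix code exists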

  • Slide 38/41

    Two Goals of VLC Design

    1. Achieve the optimal code length (i.e., minimal redundancy):
       for an event x with probability p(x), the optimal code length is
       ceil(-log2 p(x)), where ceil(y) denotes the smallest integer larger than
       y (e.g., ceil(3.4) = 4).

    2. Satisfy the prefix condition.

    Code redundancy:  r = l_bar - H(X) >= 0

    Unless the probabilities of the events are all powers of 2, we often have r > 0.
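
    For the 4-way random walk probabilities, the target lengths ceil(-log2 p(x))
    come out to exactly the VLC lengths used earlier, which is why r = 0 in
    that example (a small sketch, not from the slides):

    import math

    probs = [0.5, 0.25, 0.125, 0.125]                  # S, N, E, W
    lengths = [math.ceil(-math.log2(p)) for p in probs]
    print(lengths)                                     # [1, 2, 3, 3]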

  • Slide 39/41

    Solutions:

    Huffman Coding (Huffman, 1952)
      we will cover it later while studying JPEG

    Arithmetic Coding (1980s)
      not covered in EE465, but in EE565 (F2008)
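
    Although the slides defer Huffman coding to the JPEG unit, a minimal sketch
    of the classical greedy construction (plain Python with heapq; the 0/1
    labeling of branches is an arbitrary choice, so the codewords may differ
    while the lengths stay optimal) looks like this:

    import heapq
    from itertools import count

    def huffman_code(probs):
        # Build a prefix code {symbol: bitstring} by repeatedly merging the two
        # least probable subtrees until a single tree remains (Huffman, 1952).
        tiebreak = count()   # unique counter so equal probabilities never compare dicts
        heap = [(p, next(tiebreak), {sym: ''}) for sym, p in probs.items()]
        heapq.heapify(heap)
        while len(heap) > 1:
            p0, _, code0 = heapq.heappop(heap)   # least probable subtree -> prepend '0'
            p1, _, code1 = heapq.heappop(heap)   # next least probable    -> prepend '1'
            merged = {s: '0' + c for s, c in code0.items()}
            merged.update({s: '1' + c for s, c in code1.items()})
            heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
        return heap[0][2]

    print(huffman_code({'S': 0.5, 'N': 0.25, 'E': 0.125, 'W': 0.125}))
    # e.g. {'S': '0', 'N': '10', 'E': '110', 'W': '111'} -- lengths 1, 2, 3, 3 are optimal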

  • Slide 40/41

    Golomb Codes for Geometric Distribution

    Optimal VLC for a geometric source: P(X = k) = (1/2)^k, k = 1, 2, ...

    k | codeword
    --+---------
    1 | 0
    2 | 10
    3 | 110
    4 | 1110
    5 | 11110
    6 | 111110
    7 | 1111110
    8 | 11111110
    ...

    (Each codeword corresponds to a path down one branch of the binary codeword tree.)
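
    A quick check (a sketch, assuming the unary-style mapping in the table:
    symbol k maps to k-1 ones followed by a zero) that this code is optimal for
    the geometric source, i.e. its average length equals the 2-bps entropy
    computed earlier:

    # Codeword for symbol k: (k-1) ones followed by a single zero, so its length is k.
    def geometric_codeword(k):
        return '1' * (k - 1) + '0'

    # Average code length under P(X=k) = (1/2)^k, truncated at k = 60 for the check.
    avg_len = sum((0.5 ** k) * len(geometric_codeword(k)) for k in range(1, 61))
    print(round(avg_len, 6))   # 2.0 bps, matching the source entropy H(X) = 2 bps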

  • Slide 41/41

    Summary of Data Compression Basics

    Shannon's source entropy formula (theory):
    entropy (uncertainty) is quantified by weighted self-information

    H(X) = - sum_{i=1}^N p_i log2(p_i)   bps

    VLC thumb rule (practice):
    long codeword  <-> small-probability event
    short codeword <-> large-probability event

    l(x) = -log2(p(x))