Shujun LI (李树钧): INF-10845-20091 Multimedia Coding
Lecture 5: Lossless Coding (II)
May 20, 2009


Outline

Review
Arithmetic Coding
Dictionary Coding
Run-Length Coding
Lossless Image Coding
Data Compression: Hall of Fame
References

Review

Image and video encoding: A big picture

Input Image/Video → Pre-Processing → Lossy Coding → Lossless Coding → Post-Processing (Post-filtering) → Encoded Image/Video

• Pre-Processing: A/D conversion, color space conversion, pre-filtering, partitioning
• Lossy Coding: predictive coding (differential coding, motion estimation and compensation), quantization, transform coding, model-based coding, …
• Lossless Coding: entropy coding, dictionary-based coding, run-length coding, context-based coding, …

The ingredients of entropy coding

• A random source (X, P)
• A statistical model (X, P′) as an estimation of the random source
• An algorithm to optimize the coding performance (i.e., to minimize the average codeword length)

At least one designer …


FLC, VLC and V2FLC

FLC = Fixed-length coding/code(s)/codeword(s)
• Each symbol xi emitted from a random source (X, P) is encoded as an n-bit codeword, where |X| ≤ 2^n.

VLC = Variable-length coding/code(s)/codeword(s)
• Each symbol xi emitted from a random source (X, P) is encoded as an ni-bit codeword.
• FLC can be considered as a special case of VLC, where n1 = … = n|X|.

V2FLC = Variable-to-fixed length coding/code(s)/codeword(s)
• A symbol or a string of symbols is encoded as an n-bit codeword.
• V2FLC can also be considered as a special case of VLC.

Static coding vs. Dynamic/Adaptive coding

Static coding = The statistical model P′ is static, i.e., it does not change over time.
Dynamic/Adaptive coding = The statistical model P′ is dynamically updated, i.e., it adapts itself to the context (changes over time).
• Dynamic/Adaptive coding ⊂ Context-based coding

Hybrid coding = Static + Dynamic coding
• A codebook is maintained at the encoder side, and the encoder dynamically chooses a code for a number of symbols and informs the decoder about the choice.

A partial list of entropy codes

• Shannon coding
• Shannon-Fano coding
• Huffman coding
• Arithmetic coding (Range coding)
  • Shannon-Fano-Elias coding
• Universal coding
  • Exp-Golomb coding (H.264/MPEG-4 AVC, Dirac)
  • Elias coding family, Levenshtein coding, …
• Non-universal coding
  • Truncated binary coding, unary coding, …
  • Golomb coding ⊃ Rice coding
• Tunstall coding ⊂ V2FLC
• …

David Salomon, Variable-length Codes for Data Compression, Springer, 2007


Shannon-Fano Code: An example

X={A,B,C,D,E}, P={0.35,0.2,0.19,0.13,0.13}, Y={0,1}

Recursive splits, each keeping the two groups' probabilities as balanced as possible:
• {A,B} (0.35+0.2=0.55) vs. {C,D,E} (0.19+0.13+0.13=0.45)
• {A} (0.35) vs. {B} (0.2); {C} (0.19) vs. {D,E} (0.13+0.13=0.26)
• {D} (0.13) vs. {E} (0.13)

A Possible Code

A 00

B 01

C 10

D 110

E 111
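As a quick check, the average length of this code stays within one bit of the source entropy (a small Python sketch; the variable names are mine, not from the slides):

```python
import math

P = {"A": 0.35, "B": 0.2, "C": 0.19, "D": 0.13, "E": 0.13}
code = {"A": "00", "B": "01", "C": "10", "D": "110", "E": "111"}

# Average codeword length and source entropy, both in bits/symbol.
avg_len = sum(P[s] * len(code[s]) for s in P)
entropy = -sum(p * math.log2(p) for p in P.values())

print(round(avg_len, 2))   # 2.26
print(round(entropy, 2))
```

The code is also prefix-free: no codeword is a prefix of another, so it can be decoded symbol by symbol.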


Universal coding (code)

A code is called universal if L ≤ C1(H + C2) for all possible values of H, where C1, C2 ≥ 1.
• You may see a different definition elsewhere, but the basic idea remains the same – a universal code works like an optimal code, up to a bound defined by the constant C1.

A universal code is called asymptotically optimal if C1 → 1 when H → ∞.

Coding positive/non-negative integers

Naive binary coding
• |X| = 2^k: each integer is encoded by its k-bit binary representation.

Truncated binary coding
• |X| = 2^k + b: X = {0, 1, …, 2^k+b−1}. The first 2^k−b integers get k-bit codewords, the remaining 2b integers get (k+1)-bit codewords:
  0 ⇒ 00…0 (k bits), …, 2^k−b−1 ⇒ b0…b(k−1) (k bits);
  2^k−b ⇒ b0…bk (k+1 bits), …, 2^k+b−1 ⇒ 11…1 (k+1 bits).

Unary coding ("Stone-age" binary coding)
• |X| = ∞: X = Z+ = {1, 2, …}; f(x) = 0…0 (x−1 zeros) followed by a 1, or equivalently 1…1 (x−1 ones) followed by a 0.
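These two codes can be sketched in a few lines (a minimal illustration; the function names are mine):

```python
def unary(x):
    # Unary code of a positive integer x: (x-1) zeros followed by a single 1.
    return "0" * (x - 1) + "1"

def truncated_binary(x, n):
    # Truncated binary code for x in {0, ..., n-1}, where n = 2^k + b:
    # the first u = 2^k - b values get k bits, the rest get k+1 bits.
    k = n.bit_length() - 1            # largest k with 2^k <= n
    u = (1 << k) - (n - (1 << k))     # u = 2^k - b
    if x < u:
        return format(x, f"0{k}b")
    return format(x + u, f"0{k + 1}b")

print(unary(4))                                     # 0001
print([truncated_binary(x, 5) for x in range(5)])   # ['00', '01', '10', '110', '111']
```

For n = 5 (k = 2, b = 1) the output reproduces the five-symbol codeword table used in the Shannon-Fano example above.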


Golomb coding and Rice coding

Golomb coding = Unary coding + Truncated binary coding
• An integer x is divided into two parts (quotient and remainder) according to a parameter M: q = ⌊x/M⌋, r = mod(x, M) = x − qM.
• Golomb code = unary code of q + truncated binary code of r.
• When M = 1, Golomb coding = unary coding.
• When M = 2^k, Golomb coding = Rice coding.
• Golomb code is the optimal code for the geometric distribution: Prob(x=i) = (1−p)^(i−1)·p, where 0 < p < 1.
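The split above can be sketched as follows (assuming the common convention that the unary part of q is written as q zeros followed by a 1):

```python
def truncated_binary(x, n):
    # Truncated binary code for x in {0, ..., n-1}, n = 2^k + b.
    k = n.bit_length() - 1
    u = (1 << k) - (n - (1 << k))
    return format(x, f"0{k}b") if x < u else format(x + u, f"0{k + 1}b")

def golomb(x, M):
    # Golomb code of x >= 0: unary quotient, then truncated binary remainder.
    q, r = divmod(x, M)
    return "0" * q + "1" + truncated_binary(r, M)

print(golomb(9, 4))   # Rice code with M = 2^2: '001' + '01' -> '00101'
```

With M = 2^k the truncated binary part degenerates to a plain k-bit remainder, which is exactly the Rice code.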


Exp-Golomb coding (Universal)

Exp-Golomb coding ≠ Golomb coding
Exp-Golomb coding of order k=0 is used in some video coding standards such as H.264.
The encoding process
• |X| = ∞: X = {0, 1, 2, …}
• Calculate q = ⌊x/2^k⌋ + 1 and nq = ⌊log2 q⌋.
• Exp-Golomb code = unary code of nq + the nq LSBs of q + the k-bit representation of r = mod(x, 2^k).
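The recipe above is equivalent to the following sketch: since q written in binary is a 1 followed by its nq LSBs, "unary code of nq + nq LSBs of q" is just bin(q) prefixed with nq zeros.

```python
def exp_golomb(x, k=0):
    # Order-k exp-Golomb code of x >= 0. Here q = x >> k is the slide's q - 1;
    # emit bin(q+1) prefixed by (bit-length - 1) zeros, then the k LSBs of x.
    prefix = bin((x >> k) + 1)[2:]
    suffix = format(x & ((1 << k) - 1), f"0{k}b") if k else ""
    return "0" * (len(prefix) - 1) + prefix + suffix

print([exp_golomb(x) for x in range(5)])   # ['1', '010', '011', '00100', '00101']
```

The order-0 codewords printed above match the ue(v) codes used in H.264 bitstreams.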


Huffman code: An example

X={1,2,3,4,5}, P=[0.4,0.2,0.2,0.1,0.1].

Merging the two smallest probabilities step by step:
• p4 = 0.1, p5 = 0.1 ⇒ p4+5 = 0.2
• p3 = 0.2, p4+5 = 0.2 ⇒ p3+4+5 = 0.4
• p1 = 0.4, p2 = 0.2 ⇒ p1+2 = 0.6
• p1+2 = 0.6, p3+4+5 = 0.4 ⇒ p1+2+3+4+5 = 1

A Possible Code

1 00

2 01

3 10

4 110

5 111
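The merge procedure is easy to reproduce with a priority queue (a sketch, not the slides' code; ties are broken by insertion order, which here reproduces the codeword lengths above):

```python
import heapq
from itertools import count

def huffman_lengths(probs):
    # Build a Huffman tree bottom-up and return the codeword length of
    # each symbol; every merge adds one bit to all leaves below it.
    tick = count()                     # tie-breaker so tuples always compare
    heap = [(p, next(tick), [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for i in s1 + s2:
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, next(tick), s1 + s2))
    return lengths

print(huffman_lengths([0.4, 0.2, 0.2, 0.1, 0.1]))   # [2, 2, 2, 3, 3]
```

The resulting lengths satisfy Kraft's inequality with equality, as expected for a complete Huffman tree.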


Huffman code: An optimal code

Relation between Huffman code and Shannon code: H ≤ L(Huffman) ≤ L(Shannon-Fano) ≤ L(Shannon) < H + 1
A stronger result (Gallager, 1978)
• When pmax ≥ 0.5, L(Huffman) ≤ H + pmax < H + 1
• When pmax < 0.5, L(Huffman) ≤ H + pmax + log2(2(log2 e)/e) ≈ H + pmax + 0.086 < H + 0.586

Huffman's rules of optimal codes imply that Huffman code is optimal.
• When each pi is a negative power of 2, Huffman code reaches the entropy.

Huffman code: Small X problem

Problem
• When |X| is small, the coding gain is limited.
• As a special case, when |X| = 2, Huffman coding cannot compress the data at all – no matter what the probabilities are, each symbol has to be encoded as a single bit.

Solutions
• Solution 1: Work on X^n rather than X.
• Solution 2: Dual tree coding = Huffman coding + Tunstall coding

Huffman code: Variance problem

Problem
• There are multiple choices of the two smallest probabilities if more than two nodes have the same probability at some step of the coding process.
• Huffman codes with a larger variance may cause trouble for data transmission over a CBR (constant bit rate) channel – a larger buffer is needed.

Solution
• Merge shorter subtrees first. (A single node's height is 0.)

Modified Huffman code

Problem
• If |X| is too large, the construction of the Huffman tree takes too long and the memory used for the tree becomes too demanding.

Solution
• Divide X into two sets X1 = {si | p(si) > 2^−v} and X2 = {si | p(si) ≤ 2^−v}.
• Perform Huffman coding for the new set X3 = X1 ∪ {X2}.
• Append f(X2) as the prefix of the naive binary representation of each symbol in X2.

Huffman’s rules of making optimal codes

Source statistics: P = P0 = [p1, …, pm], where p1 ≥ … ≥ pm−1 ≥ pm.
Rule 1: L1 ≤ … ≤ Lm−1 = Lm.
Rule 2: If L1 ≤ … ≤ Lm−2 < Lm−1 = Lm, the codewords f(xm−1) and f(xm) differ only in the last bit, i.e., f(xm−1) = b0 and f(xm) = b1, where b is a sequence of Lm − 1 bits.
Rule 3: Each possible bit sequence of length Lm − 1 must be either a codeword or the prefix of some codeword.

Answers: Read Section 5.2.1 (pp. 122-123) of the following book – Yun Q. Shi and Huifang Sun, Image and Video Compression for Multimedia Engineering, 2nd Edition, CRC Press, 2008


Justify Huffman’s rules

Rule 1
• If Li > Li+1, we can swap them to get a smaller average codeword length (when pi = pi+1 the swap makes no difference, though).
• If Lm−1 < Lm, then L1, …, Lm−1 < Lm and the last bit of the longest codeword is redundant.

Rule 2
• If the two longest codewords do not have the same parent node, then the last bits of both codewords are redundant.

Rule 3
• If there is an unused bit sequence of length Lm − 1, we can use it for f(xm).

Arithmetic Coding


Why do we need a new coding algorithm?

Problems with Huffman coding
• Each symbol in X is represented by at least one bit.
• Coding for X^n: a Huffman tree with |X|^n nodes needs to be constructed, so the value of n cannot be too large.
• Encoding with a Huffman tree is quite easy, but decoding can be difficult, especially when X is large.
• Dynamic Huffman coding is slow due to the need to update the Huffman tree from time to time.

Solution: Arithmetic coding
• Encode a number of symbols (possibly a message of a very large size n) incrementally (progressively) as a binary fraction in a real range [low, high) ⊆ [0, 1).

The name of the game: The history

Shannon-Fano-Elias coding (sometimes simply Elias coding)
• Shannon actually mentioned something about it in 1948. ☺
• Elias invented the recursive implementation, but never published it.
• Abramson introduced Elias' idea in his book "Information Theory and Coding" (1963) as a note on pages 61-62. ☺

Practical arithmetic coding
• Rissanen and Pasco independently proposed the finite-precision implementation in 1976. The name "arithmetic coding" was coined by Rissanen.
• The idea was patented by IBM (Langdon and Rissanen as the inventors) in the 1970s, and afterwards more related patents were filed. …
• Witten, Neal and Cleary published source code in C in 1987, which made arithmetic coding more popular. ☺

Range coding
• Proposed by Martin in 1979; it is just a different (patent-free ☺☺) way of looking at arithmetic coding and implementing it.

Associating input symbol(s) with intervals

Given a source (X, P): P = [p(x1), …, p(xi), …, p(xm)]

For any symbol xi ∈ X
• The associated sub-interval is [pc(xi), pc(xi+1)), where pc(xi+1) = pc(xi) + p(xi), pc(x1) = 0 and pc(xm+1) = pc(xm) + p(xm) = 1.

For any n-symbol (n > 1) word x = (xi1, …, xin) ∈ X^n
• The associated sub-interval is [pc(x), pc(x) + p(x)), where pc(x) is the starting point of the interval corresponding to x, and p(x) = p(xi1)…p(xin) for a memoryless random source.

Arithmetic coding: An example

X = {A,B,C,D}, P = [0.6, 0.2, 0.1, 0.1].

Input: x = ACD
Output: y ∈ [0.534, 0.54)
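The interval narrowing for this example can be reproduced in a few lines (a sketch with hypothetical names; floating point is used here only for illustration – real coders use the finite-precision integer arithmetic discussed later):

```python
def narrow(msg, symbols, P):
    # Narrow [low, high) once per input symbol; the cumulative
    # probabilities follow the order of `symbols`.
    cum, c = {}, 0.0
    for s, p in zip(symbols, P):
        cum[s] = (c, c + p)
        c += p
    low, high = 0.0, 1.0
    for s in msg:
        lo, hi = cum[s]
        low, high = low + (high - low) * lo, low + (high - low) * hi
    return low, high

low, high = narrow("ACD", "ABCD", [0.6, 0.2, 0.1, 0.1])
print(round(low, 6), round(high, 6))   # 0.534 0.54
```

Any binary fraction inside the final interval identifies the message "ACD".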


Arithmetic coding: Which fraction?

Which fraction in [pc(x), pc(x)+p(x)) should be chosen?
• One proper choice: take l = ⌈log2(1/p(x))⌉ + 1, so that 1/2^l ≤ p(x)/2, and choose f(x) to be the l-bit binary fraction i/2^l such that (i−1)/2^l ≤ pc(x) < i/2^l < pc(x) + p(x) ⇒ Hb(P) ≤ L < Hb(P) + 2/n.

Assuming the above choice is used to design a code f: X^n → Y*, it can be proved that f is a PF code.
• n can be as large as the size of the input message x.

Arithmetic coding: Termination problem

Termination of the original message x
• Transmit the size of x to the decoder separately.
• Add an "end of message" symbol to X.
• Make use of semantic information.

Termination of the encoded bitstream f(x)
• Transmit the number of bits to the decoder separately.
• Use the choice of fraction introduced on the last slide (with a known value of m).
• Add an "end of message" symbol to X.
• Make use of semantic information.

Arithmetic coding: Performance

Arithmetic coding: Hb(P) ≤ L < Hb(P) + 2/n
Huffman coding: Hb(P) ≤ L < Hb(P) + 1/n
Arithmetic coding < Huffman coding?
• Theoretically yes – if all the entropy coding schemes work on the same source (X^n, P^n).
• Practically no – the complexity of almost all other entropy codes (including Huffman code and Shannon code) soon reaches its computational limit as n → ∞. But arithmetic coding works in an incremental way, so n can be as large as the size of the to-be-encoded message itself.
⇒ Arithmetic coding > Huffman coding!

Arithmetic coding: Noticeable features

Non-block coding (Incremental coding)
• The output of the encoder is not the concatenation of codewords of consecutive symbols.

"Range" coding
• The encoded message is actually a range in [0, 1].

Separation between coding and the source's statistics
• The probability is used to divide a range within the coding process (instead of before it).
• ⇒ Arithmetic coding is inherently dynamic/adaptive/context-based coding.

Arithmetic coding: Practical issues

Finite-precision problem
• Approximating the probabilities and the arithmetic operations in finite precision ⇒ loss of coding efficiency, but negligible if the precision is high enough.
• Common bits of the two end points can be removed and sent out immediately.
• A renormalization process is needed to re-calculate the new "values" of the two end points.

Carry-propagation problem
• A carry (caused by the probability addition) may propagate over q bits (such as 0.011111110… ⇒ 0.100000000…), so an extra register has to be used to track this problem.

Binary arithmetic coding

X = Y = {0,1}, P = [p0, 1−p0]
n-bit finite precision is used: initially, [0, 1) ⇒ [0, 2^n).
Update the range as usual: low′ = low or low + (high−low)·p0; high′ = low + (high−low)·p0 or high.
Do nothing if high − low ≥ 2^(n−1).
Do renormalization when high − low < 2^(n−1):
• When low, high < 2^(n−1) ⇒ output a 0-bit and rescale [0, 2^(n−1)) ⇒ [0, 2^n).
• When low, high ≥ 2^(n−1) ⇒ output a 1-bit, subtract 2^(n−1) from low and high, and rescale [2^(n−1), 2^n) ⇒ [0, 2^n).
• When low < 2^(n−1) (low = 01…) but high ≥ 2^(n−1) (high = 10…) ⇒ store an unknown bit (the complement of the next known bit) in a buffer, subtract 2^(n−2) from low and high, and rescale [2^(n−2), 2^(n−1)+2^(n−2)) ⇒ [0, 2^n).
After all symbols are encoded, terminate the process by sending out a sufficient number of bits to represent the value of low.


Multiplication-free implementation

For binary arithmetic coding, we can further approximate high − low ≈ α = ¾ (since high − low ∈ [½, 1)).
Then the updates of low and high can be approximated without doing the multiplication.
The optimal value of α actually depends on the probability p0, and in most cases it is smaller than ¾ (a typical value is ⅔ or around 0.7).
When p0 > 1/(2α) = ⅔, it is possible that α·p0 > ½, which may exceed high − low, so that high′ < low′ – a big mistake.
• Solution: dynamically exchange p0 and p1 so that p0 < ½ always holds ⇒ 0 = LPS (less/least probable symbol) and 1 = MPS (more/most probable symbol).

Q-, QM- and MQ-coders

Q-coder
• The first multiplication-free binary arithmetic coder.
• It is less efficient than ideal arithmetic coders, but the loss of compression efficiency is less than 6%.
• Why was it named after "Q"? – With high − low = A and pLPS = Q (so high = low + A): low′ = low or low + (A−Q), A′ = A−Q or Q, using the approximations A(1−Q) ≈ A−Q and AQ ≈ Q.
• The renormalization process ensures A ∈ [0.75, 1.5) ⇒ A ≈ 1 (α = ⅔).

QM-coder (JPEG/JBIG)
• An enhanced edition of the Q-coder.

MQ-coder (JPEG2000/JBIG2)
• A context-based variant of the QM-coder.

Dictionary Coding


The basic idea

Use a static or dynamic dictionary and encode a symbol or a sequence of symbols as an index into the dictionary.
The dictionary can be implicit or explicit.
The dictionary can be considered as a (maybe very rough) approximation of the probability distribution of the source.

Static dictionary coding: Digram coding

A static dictionary of size 2^n contains the |X| symbols in X and also the 2^n − |X| most frequently used symbols in X^2 (i.e., pairs of symbols in X).
An example
• X = {A,B,C,D,E}, n = 3
• ABDEBE ⇒ 101011100111

Code/Index Entry

A 000

B 001

C 010

D 011

E 100

AB 101

AD 110

BE 111
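Using the dictionary above, greedy longest-match encoding can be sketched as:

```python
def digram_encode(msg, table):
    # Try a two-symbol dictionary entry first, fall back to the
    # single-symbol entry (table as on the slide).
    out, i = "", 0
    while i < len(msg):
        if msg[i:i + 2] in table:
            out += table[msg[i:i + 2]]
            i += 2
        else:
            out += table[msg[i]]
            i += 1
    return out

table = {"A": "000", "B": "001", "C": "010", "D": "011", "E": "100",
         "AB": "101", "AD": "110", "BE": "111"}
print(digram_encode("ABDEBE", table))   # 101011100111
```

The 6-symbol input is encoded in 12 bits instead of the 18 bits a plain 3-bit FLC would need.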


Adaptive dictionary coding

LZ (Lempel-Ziv) family
• LZ77 (LZ1) and LZ78 (LZ2)
• DEFLATE: LZ77 + Huffman coding
• LZMA (Lempel-Ziv-Markov chain algorithm): DEFLATE + arithmetic coding
• LZW (Lempel-Ziv-Welch): improved LZ78
• LZSS (Lempel-Ziv-Storer-Szymanski): improved LZ77
• LZO (Lempel-Ziv-Oberhumer): fast LZ
• LZRW (Lempel-Ziv-Ross Williams): improved LZ77
• LZJB (Lempel-Ziv-Jeff Bonwick): improved LZRW1
• …

LZ77 (Sliding window compression)

A sliding window = a search buffer + a look-ahead buffer.
Only local redundancies are exploited.
The dictionary is implicit (the search buffer itself).
Encoding: each step outputs a triple (o, l, c), where o is the offset back into the search buffer, l is the match length and c is the next symbol, e.g. "BADA…" ⇒ (o, l, "A"), … (o = l = 0 when there is no match).

(Figure: the sliding window over the input sequence, split into the search buffer and the look-ahead buffer, with the offset o and the match length l marked.)
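A toy LZ77 encoder along these lines (a simplified sketch: linear search, no binary packing of the triples; the window and look-ahead sizes are illustrative):

```python
def lz77_encode(data, window=6, lookahead=4):
    # Emit (offset, length, next_symbol) triples; offset and length are 0
    # when the search buffer contains no match.
    i, out = 0, []
    while i < len(data):
        best_o = best_l = 0
        start = max(0, i - window)
        for o in range(1, i - start + 1):          # candidate offsets
            l = 0
            while (l < lookahead and i + l < len(data) - 1
                   and data[i + l - o] == data[i + l]):
                l += 1                              # match may run into the look-ahead buffer
            if l > best_l:
                best_o, best_l = o, l
        out.append((best_o, best_l, data[i + best_l]))
        i += best_l + 1
    return out

print(lz77_encode("ABABAB"))   # [(0, 0, 'A'), (0, 0, 'B'), (2, 3, 'B')]
```

Note that the third triple copies 3 symbols starting only 2 positions back, i.e., the match is allowed to overlap the look-ahead buffer.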


LZ78

Problems with LZ77
• Nothing can be captured outside the sliding window.

Solution: use an explicit dictionary
• At the beginning, the dictionary is empty.
• A symbol or a sequence of symbols that cannot be found in the dictionary is added to the dictionary as a new entry.
• The output is (i, c), where i is the dictionary index and c is the next symbol following the dictionary entry.

LZW

Main advantage over LZ78
• No need to transmit the next symbol c; only the index i is sent.

Encoding procedure
• Step 1: Initialize the dictionary to include all single symbols in X.
• Step 2: Search the input sequence in the dictionary until xi+1…xj can be found but xi+1…xj xj+1 cannot; then output the index of xi+1…xj and add xi+1…xj xj+1 as a new entry to the dictionary.
• Step 3: Repeat Step 2 until all symbols are encoded.
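The three steps above can be sketched compactly (dictionary indices only; the names are mine):

```python
def lzw_encode(msg, alphabet):
    # LZW: the dictionary starts with all single symbols; after each
    # output we add (matched string + next symbol) as a new entry.
    dictionary = {s: i for i, s in enumerate(sorted(alphabet))}
    out, w = [], ""
    for c in msg:
        if w + c in dictionary:
            w += c                      # keep extending the current match
        else:
            out.append(dictionary[w])
            dictionary[w + c] = len(dictionary)
            w = c
    if w:
        out.append(dictionary[w])       # flush the final match
    return out

print(lzw_encode("ABABABA", "AB"))   # [0, 1, 2, 4]
```

Here index 2 is the entry "AB" learned at the first mismatch, and index 4 is "ABA", so the dictionary adapts to the input without ever being transmitted.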


Run-Length Coding


1-D run-length coding

Statistical model: a discrete Markov source S = {S1, S2} with 4 transition probabilities P(Si|Sj), i, j ∈ {1, 2}.

Runs: repetitions of the symbol S1 or S2.

How to code: (Si, Run1, Run2, …)
• All the run-lengths form a new random source, which can be further coded with a (modified) Huffman coder or a universal code.

(Figure: two-state transition diagram with probabilities P(S1|S1), P(S2|S2), P(S2|S1) and P(S1|S2).)
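The run extraction can be sketched with `itertools.groupby` (a minimal illustration; the function name is mine):

```python
from itertools import groupby

def run_lengths(bits):
    # 1-D run-length coding for a binary source: record the first symbol,
    # then the lengths of the alternating runs.
    first = bits[0]
    runs = [len(list(g)) for _, g in groupby(bits)]
    return first, runs

print(run_lengths("0001101111"))   # ('0', [3, 2, 1, 4])
```

The run-length list is the new random source that would then be fed to a (modified) Huffman coder or a universal code.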


Lossless Image Coding


How entropy coding is used

Huffman/Arithmetic coding + …
Dictionary coding + …
Universal codes + Predictive coding
• Spatial differences tend to have a monotonic probability distribution.

Universal codes + Run-length coding
• The run-length pairs tend to have a monotonic probability distribution.

Lossless image coding standards

JBIG (Joint Bi-level Image Experts Group): QM-coder
Lossless JPEG family: Huffman coding/Arithmetic coding
JBIG2: MQ-coder
GIF (Graphics Interchange Format): LZW
PNG (Portable Network Graphics): DEFLATE = LZ77 + Huffman coding
TIFF (Tagged Image File Format): RLE, LZW
Compressed BMP and PCX: RLE
…

Lossy coding followed by lossless coding

(The same encoder block diagram as on the earlier "big picture" slide: Input Image/Video → Pre-Processing → Lossy Coding → Lossless Coding → Post-Processing → Encoded Image/Video.)

Data Compression: Hall of Fame


Hall of Fame

Abraham Lempel (1936-)
• IEEE Richard W. Hamming Medal (2007)
• Fellow of the IEEE (1982)

Jacob Ziv (1931-)
• IEEE Richard W. Hamming Medal (1995)
• Israel Prize (1993)
• Member (1981) and former President (1995-2004) of the Israel National Academy of Sciences and Humanities
• Fellow of the IEEE (1973)

References


References for further reading

• Norman L. Biggs, Codes: An Introduction to Information Communication and Cryptography, Sections 4.6-4.8, Springer, 2008
• Khalid Sayood, Introduction to Data Compression, Chapters 4 "Arithmetic Coding" and 5 "Dictionary Techniques", 3rd Edition, Morgan Kaufmann, 2005
• Yun Q. Shi and Huifang Sun, Image and Video Compression for Multimedia Engineering: Fundamentals, Algorithms, and Standards, Chapter 6 "Run-Length and Dictionary Coding: Information Theory Results (III)", 2nd Edition, CRC Press, 2008
• David S. Taubman and Michael W. Marcellin, JPEG2000: Image Compression Fundamentals, Standards and Practice, Section 2.3 "Arithmetic Coding", Kluwer Academic Publishers, 2002
• Tinku Acharya and Ping-Sing Tsai, JPEG2000 Standard for Image Compression: Concepts, Algorithms and VLSI Architectures, Sections 2.3-2.5, John Wiley & Sons, Inc., 2004
• Mohammed Ghanbari, Standard Codecs: Image Compression to Advanced Video Coding, Section 3.4.2 "Arithmetic coding", IEE, 2003