
Answers to Exercises

A bird does not sing because he has an answer, he sings because he has a song.
—Chinese Proverb

Intro.1: abstemious, abstentious, adventitious, annelidous, arsenious, arterious, facetious, sacrilegious.

Intro.2: When a software house has a popular product they tend to come up with new versions. A user can update an old version to a new one, and the update usually comes as a compressed file on a floppy disk. Over time the updates get bigger and, at a certain point, an update may not fit on a single floppy. This is why good compression is important in the case of software updates. The time it takes to compress and decompress the update is unimportant since these operations are typically done just once. Recently, software makers have taken to providing updates over the Internet, but even in such cases it is important to have small files because of the download times involved.

1.1: (1) ask a question, (2) absolutely necessary, (3) advance warning, (4) boiling hot, (5) climb up, (6) close scrutiny, (7) exactly the same, (8) free gift, (9) hot water heater, (10) my personal opinion, (11) newborn baby, (12) postponed until later, (13) unexpected surprise, (14) unsolved mysteries.

1.2: A reasonable way to use them is to code the five most-common strings in the text. Because irreversible text compression is a special-purpose method, the user may know what strings are common in any particular text to be compressed. The user may specify five such strings to the encoder, and they should also be written at the start of the output stream, for the decoder's use.

1.3: 6,8,0,1,3,1,4,1,3,1,4,1,3,1,4,1,3,1,2,2,2,2,6,1,1. The first two are the bitmap resolution (6×8). If each number occupies a byte on the output stream, then its size is 25 bytes, compared to a bitmap size of only 6×8 bits = 6 bytes. The method does not work for small images.
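For concreteness, here is a minimal Matlab sketch of this form of RLE (the function name rle_bitmap and the conventions, a row-major scan with the first run counting zeros and the resolution emitted first, are assumptions for illustration):

function out = rle_bitmap(B)
% RLE sketch for a binary bitmap: emit the resolution, then the
% lengths of alternating runs (first run counts zeros), row-major.
[r, c] = size(B);
v = reshape(B.', 1, []);   % row-major scan of the pixels
out = [r c]; cur = 0; run = 0;
for x = v
  if x == cur
    run = run + 1;
  else
    out(end+1) = run;      % close the current run
    cur = x; run = 1;
  end
end
out(end+1) = run;          % close the last run

With the 6×8 bitmap of the exercise this should emit 6, 8, and then the 23 run lengths listed above.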


1.4: RLE of images is based on the idea that adjacent pixels tend to be identical. The last pixel of a row, however, has no reason to be identical to the first pixel of the next row.

1.5: Each of the first four rows yields the eight runs 1,1,1,2,1,1,1,eol. Rows 6 and 8 yield the four runs 0,7,1,eol each. Rows 5 and 7 yield the two runs 8,eol each. The total number of runs (including the eol's) is thus 44.

When compressing by columns, columns 1, 3, and 6 yield the five runs 5,1,1,1,eol each. Columns 2, 4, 5, and 7 yield the six runs 0,5,1,1,1,eol each. Column 8 gives 4,4,eol, so the total number of runs is 42. This image is thus "balanced" with respect to rows and columns.

1.6: The result is five groups as follows:

W1 to W2: 00000, 11111,
W3 to W10: 00001, 00011, 00111, 01111, 11110, 11100, 11000, 10000,
W11 to W22: 00010, 00100, 01000, 00110, 01100, 01110, 11101, 11011, 10111, 11001, 10011, 10001,
W23 to W30: 01011, 10110, 01101, 11010, 10100, 01001, 10010, 00101,
W31 to W32: 01010, 10101.

1.7: The seven codes are

0000, 1111, 0001, 1110, 0000, 0011, 1111,

forming a string with six runs. Applying the rule of complementing yields the sequence

0000, 1111, 1110, 1110, 0000, 0011, 0000,

with seven runs. The rule of complementing does not always reduce the number of runs.

1.8: As “11 22 90 00 00 33 44”. The 00 following the 90 indicates no run, and the following 00 is interpreted as a regular character.
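A sketch of the corresponding decoder may make the convention clearer (a minimal Matlab version, assuming the BinHex-style scheme where byte 144 = 0x90 is the run marker, "x 90 n" means n total copies of x, and "90 00" escapes a literal 0x90):

function out = rle90_decode(in)
% Decode BinHex-style RLE where 0x90 is the run marker.
out = []; i = 1;
while i <= numel(in)
  if in(i) == 144 && i < numel(in)           % 0x90 marker
    n = in(i+1);
    if n == 0
      out(end+1) = 144;                      % "90 00" is a literal 0x90
    else
      out = [out repmat(out(end), 1, n-1)];  % grow last byte to n copies
    end
    i = i + 2;
  else
    out(end+1) = in(i); i = i + 1;
  end
end

% rle90_decode(hex2dec(['11';'22';'90';'00';'00';'33';'44'])')
% recovers 11 22 90 00 33 44, as in the answer above.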

1.9: The six characters "123ABC" have ASCII codes 31, 32, 33, 41, 42, and 43. Translating these hexadecimal numbers to binary produces "00110001 00110010 00110011 01000001 01000010 01000011". The next step is to divide this string of 48 bits into 6-bit blocks. They are 001100=12, 010011=19, 001000=8, 110011=51, 010000=16, 010100=20, 001001=9, and 000011=3. The character at position 12 in the BinHex table is "-" (position numbering starts at zero). The one at position 19 is "6". The final result is the string "-6)c38*$".

1.10: Exercise 2.1 shows that the binary code of the integer i is 1 + ⌊log2 i⌋ bits long. We add ⌊log2 i⌋ zeros, bringing the total size to 1 + 2⌊log2 i⌋ bits.
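In Matlab this length is a one-liner (a small illustration, not part of the original answer):

codelen = @(i) 1 + 2*floor(log2(i));
codelen(1)   % = 1 (the code is just "1")
codelen(9)   % = 7 (three zeros followed by the 4-bit code 1001)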


1.11: Table Ans.1 summarizes the results. In (a), the first string is encoded with k = 1. In (b) it is encoded with k = 2. Columns (c) and (d) are the encodings of the second string with k = 1 and k = 2, respectively. The averages of the four columns are 3.4375, 3.25, 3.56, and 3.6875; very similar! The move-ahead-k method used with small values of k does not favor strings satisfying the concentration property.

(Each column lists the symbol, the list before encoding, and the code emitted; the last line is the final list.)

(a)                  (b)                  (c)                  (d)
a abcdmnop 0         a abcdmnop 0         a abcdmnop 0         a abcdmnop 0
b abcdmnop 1         b abcdmnop 1         b abcdmnop 1         b abcdmnop 1
c bacdmnop 2         c bacdmnop 2         c bacdmnop 2         c bacdmnop 2
d bcadmnop 3         d cbadmnop 3         d bcadmnop 3         d cbadmnop 3
d bcdamnop 2         d cdbamnop 1         m bcdamnop 4         m cdbamnop 4
c bdcamnop 2         c dcbamnop 1         n bcdmanop 5         n cdmbanop 5
b bcdamnop 0         b cdbamnop 2         o bcdmnaop 6         o cdmnbaop 6
a bcdamnop 3         a bcdamnop 3         p bcdmnoap 7         p cdmnobap 7
m bcadmnop 4         m bacdmnop 4         a bcdmnopa 7         a cdmnopba 7
n bcamdnop 5         n bamcdnop 5         b bcdmnoap 0         b cdmnoapb 7
o bcamndop 6         o bamncdop 6         c bcdmnoap 1         c cdmnobap 0
p bcamnodp 7         p bamnocdp 7         d cbdmnoap 2         d cdmnobap 1
p bcamnopd 6         p bamnopcd 5         m cdbmnoap 3         m dcmnobap 2
o bcamnpod 6         o bampnocd 5         n cdmbnoap 4         n mdcnobap 3
n bcamnopd 4         n bamopncd 5         o cdmnboap 5         o mndcobap 4
m bcanmopd 4         m bamnopcd 2         p cdmnobap 7         p mnodcbap 7
  bcamnopd             mbanopcd             cdmnobpa             mnodcpba

Table Ans.1: Encoding With Move-Ahead-k.

1.12: Table Ans.2 summarizes the decoding steps. Notice how similar it is to Table 1.16, indicating that move-to-front is a symmetric data compression method.

Code input   A (before adding)               A (after adding)                        Word
0the         ()                              (the)                                   the
1boy         (the)                           (the, boy)                              boy
2on          (boy, the)                      (boy, the, on)                          on
3my          (on, boy, the)                  (on, boy, the, my)                      my
4right       (my, on, boy, the)              (my, on, boy, the, right)               right
5is          (right, my, on, boy, the)       (right, my, on, boy, the, is)           is
5            (is, right, my, on, boy, the)   (is, right, my, on, boy, the)           the
2            (the, is, right, my, on, boy)   (the, is, right, my, on, boy)           right
5            (right, the, is, my, on, boy)   (right, the, is, my, on, boy)           boy
             (boy, right, the, is, my, on)

Table Ans.2: Decoding Multiple-Letter Words.
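The decoding loop itself is tiny; here is a minimal Matlab sketch (the function name mtf_decode is an assumption; A is the current word list and indices start at 0, as in the table):

function out = mtf_decode(codes, A)
% Move-to-front decoding: each code indexes the current list A;
% the decoded word is then moved to the front of A.
out = cell(1, numel(codes));
for k = 1:numel(codes)
  i = codes(k) + 1;                  % the book's indices start at 0
  out{k} = A{i};
  A = [A(i), A([1:i-1, i+1:end])];   % move the decoded word to the front
end

% The last three steps of Table Ans.2:
% mtf_decode([5 2 5], {'is','right','my','on','boy','the'})
% returns {'the','right','boy'}.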


2.1: It is 1 + ⌊log2 i⌋, as can be seen by simple experimenting.

2.2: The integer 2 is the smallest integer that can serve as the basis for a number system.

2.3: Replacing 10 by 3 we get x = k log2 3 ≈ 1.58k. A trit is therefore worth about 1.58 bits.

2.4: We assume an alphabet with two symbols a1 and a2, with probabilities P1 and P2, respectively. Since P1 + P2 = 1, the entropy of the alphabet is −P1 log2 P1 − (1 − P1) log2(1 − P1). Table Ans.3 shows the entropies for certain values of the probabilities. When P1 = P2, at least 1 bit is required to encode each symbol, reflecting the fact that the entropy is at its maximum, the redundancy is zero, and the data cannot be compressed. However, when the probabilities are very different, the minimum number of bits required per symbol drops significantly. We may not be able to develop a compression method using 0.08 bits per symbol but we know that when P1 = 99%, this is the theoretical minimum.

P1   P2   Entropy
99    1   0.08
90   10   0.47
80   20   0.72
70   30   0.88
60   40   0.97
50   50   1.00

Table Ans.3: Probabilities and Entropies of Two Symbols.
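The table is easy to reproduce (a two-line Matlab check, not part of the original answer):

P1 = [0.99 0.90 0.80 0.70 0.60 0.50];
H  = -P1.*log2(P1) - (1-P1).*log2(1-P1);  % 0.08 0.47 0.72 0.88 0.97 1.00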

An essential tool of this theory [information] is a quantity for measuring the amount of information conveyed by a message. Suppose a message is encoded into some long number. To quantify the information content of this message, Shannon proposed to count the number of its digits. According to this criterion, 3.14159, for example, conveys twice as much information as 3.14, and six times as much as 3. Struck by the similarity between this recipe and the famous equation on Boltzmann's tomb (entropy is the number of digits of probability), Shannon called his formula the "information entropy."

Hans Christian von Baeyer, Maxwell’s Demon (1998)

2.5: It is easy to see that the unary code satisfies the prefix property, so it definitely can be used as a variable-size code. Since its length L satisfies L = n, we get 2^−L = 2^−n, so it makes sense to use it in cases where the input data consists of integers n with probabilities P(n) ≈ 2^−n. If the data lends itself to the use of the unary code, the entire Huffman algorithm can be skipped, and the codes of all the symbols can easily and quickly be constructed before compression or decompression starts.


2.6: The triplet (n, 1, n) defines the standard n-bit binary codes, as can be verified by direct construction. The number of such codes is easily seen to be

(2^(n+1) − 2^n)/(2^1 − 1) = 2^n.

The triplet (0, 0, ∞) defines the codes 0, 10, 110, 1110, . . . , which are the unary codes but assigned to the integers 0, 1, 2, . . . instead of 1, 2, 3, . . . .

2.7: The triplet (1, 1, 30) produces (2^30 − 2^1)/(2^1 − 1) ≈ a billion codes.

2.8: This is straightforward. Table Ans.4 shows the code. There are only three different codewords since "start" and "stop" are so close, but there are many codes since "start" is large.

n   a = 10 + n·2   nth codeword          Number of codewords   Range of integers
0   10             0 x...x (10 x's)      2^10 = 1K             0–1023
1   12             10 x...x (12 x's)     2^12 = 4K             1024–5119
2   14             11 x...x (14 x's)     2^14 = 16K            5120–21503
                                         Total 21504

Table Ans.4: The General Unary Code (10,2,14).

2.9: Each part of C4 is the standard binary code of some integer, so it starts with a 1. A part that starts with a 0 therefore signals to the decoder that this is the last bit of the code.

2.10: We use the property that the Fibonacci representation of an integer does not have any adjacent 1's. If R is a positive integer, we construct its Fibonacci representation and append a 1-bit to the result. The Fibonacci representation of the integer 5 is 0001, so the Fibonacci-prefix code of 5 is 00011. Similarly, the Fibonacci representation of 33 is 1010101, so its Fibonacci-prefix code is 10101011. It is obvious that each of these codes ends with two adjacent 1's, so they can be decoded uniquely. However, the property of not having adjacent 1's restricts the number of binary patterns available for such codes, so they are longer than the other codes shown here.
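A small Matlab sketch of this construction (the function name fib_code is an assumption; digits are emitted least-significant first, as in the codes above):

function code = fib_code(R)
% Fibonacci-prefix code: Zeckendorf digits of R over 1,2,3,5,8,...,
% least-significant first, with a 1 appended so the code ends in 11.
F = [1 2];
while F(end) < R, F(end+1) = F(end) + F(end-1); end
d = zeros(1, find(F <= R, 1, 'last'));
for i = numel(d):-1:1                % greedy Zeckendorf representation
  if F(i) <= R, d(i) = 1; R = R - F(i); end
end
code = [d 1];   % fib_code(5) -> 0 0 0 1 1, fib_code(33) -> 1 0 1 0 1 0 1 1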

2.11: Subsequent splits can be done in different ways, but Table Ans.5 shows one way of assigning Shannon-Fano codes to the 7 symbols. The average size in this case is 0.25×2 + 0.20×3 + 0.15×3 + 0.15×2 + 0.10×3 + 0.10×4 + 0.05×4 = 2.75 bits/symbol.

2.12: The entropy is −2(0.25 × log2 0.25) − 4(0.125 × log2 0.125) = 2.5.


     Prob.   Steps      Final
1.   0.25    1 1        :11
2.   0.20    1 0 1      :101
3.   0.15    1 0 0      :100
4.   0.15    0 1        :01
5.   0.10    0 0 1      :001
6.   0.10    0 0 0 1    :0001
7.   0.05    0 0 0 0    :0000

Table Ans.5: Shannon-Fano Example.

2.13: Figure Ans.6a,b,c shows the three trees. The code sizes for the trees are

(5 + 5 + 5 + 5·2 + 3·3 + 3·5 + 3·5 + 12)/30 = 76/30,
(5 + 5 + 4 + 4·2 + 4·3 + 3·5 + 3·5 + 12)/30 = 76/30,
(6 + 6 + 5 + 4·2 + 3·3 + 3·5 + 3·5 + 12)/30 = 76/30.

Figure Ans.6: Three Huffman Trees for Eight Symbols. [Tree diagrams not reproduced.]

2.14: After adding symbols A, B, C, D, E, F, and G to the tree, we were left with the three symbols ABEF (with probability 10/30), CDG (with probability 8/30), and H (with probability 12/30). The two symbols with lowest probabilities were ABEF and CDG, so they had to be merged. Instead, symbols CDG and H were merged, creating a non-Huffman tree.

2.15: The second row of Table Ans.8 (due to Guy Blelloch) shows a symbol whose Huffman code is three bits long, but for which ⌈− log2 0.3⌉ = ⌈1.737⌉ = 2.


Figure Ans.7: Three Huffman Trees for Eight Symbols. [Tree diagrams not reproduced.]

Pi     Code   − log2 Pi   ⌈− log2 Pi⌉
.01    000    6.644       7
*.30   001    1.737       2
.34    01     1.556       2
.35    1      1.515       2

Table Ans.8: A Huffman Code Example.

2.16: The explanation is simple. Imagine a large alphabet where all the symbols have (about) the same probability. Since the alphabet is large, that probability will be small, resulting in long codes. Imagine the other extreme case, where certain symbols have high probabilities (and, therefore, short codes). Since the probabilities have to add up to 1, the rest of the symbols will have low probabilities (and, therefore, long codes). We therefore see that the size of a code depends on the probability, but is indirectly affected by the size of the alphabet.

2.17: Figure Ans.9 shows Huffman codes for 5, 6, 7, and 8 symbols with equal probabilities. In the case where n is a power of 2, the codes are simply the fixed-sized ones. In other cases the codes are very close to fixed-size. This shows that symbols with equal probabilities do not benefit from variable-size codes. (This is another way of saying that random text cannot be compressed.) Table Ans.10 shows the codes, their average sizes and variances.

2.18: It increases exponentially from 2^s to 2^(s+n) = 2^s × 2^n.

2.19: The binary value of 127 is 01111111 and that of 128 is 10000000. Half the pixels in each bitplane will therefore be 0 and the other half, 1. In the worst case, each bitplane will be a checkerboard, i.e., will have many runs of size one. In such a case, each run requires a 1-bit code, leading to one codebit per pixel per bitplane, or eight codebits per pixel for the entire image, resulting in no compression at all. In comparison, a Huffman code for such an image requires just two codes (since there are just two pixel values) and they can be one bit each. This leads to one codebit per pixel, or a compression factor of eight.


Figure Ans.9: Huffman Codes for Equal Probabilities. [Tree diagrams not reproduced.]

n   p      a1   a2   a3   a4   a5   a6   a7   a8    Avg. size   Var.
5   0.200  111  110  101  100  0                    2.6         0.64
6   0.167  111  110  101  100  01   00              2.672       0.2227
7   0.143  111  110  101  100  011  010  00         2.86        0.1226
8   0.125  111  110  101  100  011  010  001  000   3           0

Table Ans.10: Huffman Codes for 5–8 Symbols.


2.20: The two trees are shown in Figure 2.26c,d. The average code size for the binary Huffman tree is

1×0.49 + 2×0.25 + 5×0.02 + 5×0.03 + 5×0.04 + 5×0.04 + 3×0.12 = 2 bits/symbol,

and that of the ternary tree is

1×0.26 + 3×0.02 + 3×0.03 + 3×0.04 + 2×0.04 + 2×0.12 + 1×0.49 = 1.34 trits/symbol.

2.21: Figure Ans.11 shows how the loop continues until the heap shrinks to just one node that is the single pointer 2. This indicates that the total frequency (which happens to be 100 in our example) is stored in A[2]. All other frequencies have been replaced by pointers. Figure Ans.12a shows the heaps generated during the loop.

2.22: The final result of the loop is

 1    2   3  4  5  6  7  8  9  10  11  12  13  14
[2 ] 100  2  2  3  4  5  3  4  6   5   7   6   7

from which it is easy to figure out the code lengths of all seven symbols. To find the length of the code of symbol 14, e.g., we follow the pointers 7, 5, 3, 2 from A[14] to the root. Four steps are necessary, so the code length is 4.

2.23: The code lengths for the seven symbols are 2, 2, 3, 3, 4, 3, and 4 bits. This can also be verified from the Huffman code-tree of Figure Ans.12b. A set of codes derived from this tree is shown in the following table:

Count:  25  20  13   17   9     11   5
Code:   01  11  101  000  0011  100  0010
Length: 2   2   3    3    4     3    4

2.24: A symbol with high frequency of occurrence should be assigned a shorter code. Therefore, it has to appear high in the tree. The requirement that at each level the frequencies be sorted from left to right is artificial. In principle, it is not necessary, but it simplifies the process of updating the tree.

2.25: Figure Ans.13 shows the initial tree and how it is updated in the 11 steps (a) through (k). Notice how the esc symbol gets assigned different codes all the time, and how the different symbols move about in the tree and change their codes. Code 10, e.g., is the code of symbol "i" in steps (f) and (i), but is the code of "s" in steps (e) and (j). The code of a blank space is 011 in step (h), but 00 in step (k).

The final output is: "s0i00r100␣1010000d011101000". A total of 5×8 + 22 = 62 bits. The compression ratio is thus 62/88 ≈ 0.7.


The heap (between brackets) and the rest of array A (positions 1–14) after each step:

[7 11 6 8 9 ]  24 14 25 20 6 17 7 6 7
[11 9 8 6 ]    24 14 25 20 6 17 7 6 7
[11 9 8 6 ]    17+14 24 14 25 20 6 17 7 6 7
[5 9 8 6 ]     31 24 5 25 20 6 5 7 6 7
[9 6 8 5 ]     31 24 5 25 20 6 5 7 6 7
[6 8 5 ]       31 24 5 25 20 6 5 7 6 7
[6 8 5 ]       20+24 31 24 5 25 20 6 5 7 6 7
[4 8 5 ]       44 31 4 5 25 4 6 5 7 6 7
[8 5 4 ]       44 31 4 5 25 4 6 5 7 6 7
[5 4 ]         44 31 4 5 25 4 6 5 7 6 7
[5 4 ]         25+31 44 31 4 5 25 4 6 5 7 6 7
[3 4 ]         56 44 3 4 5 3 4 6 5 7 6 7
[4 3 ]         56 44 3 4 5 3 4 6 5 7 6 7
[3 ]           56 44 3 4 5 3 4 6 5 7 6 7
[3 ]           56+44 56 44 3 4 5 3 4 6 5 7 6 7
[2 ]           100 2 2 3 4 5 3 4 6 5 7 6 7

Figure Ans.11: Sifting the Heap.


Figure Ans.12: (a) Heaps. (b) Huffman Code-Tree. [Diagrams not reproduced.]


Initial tree (a single esc node).

(a). Input: s. Output: ‘s’.
esc s1

(b). Input: i. Output: 0‘i’.
esc i1 1 s1

(c). Input: r. Output: 00‘r’.
esc r1 1 i1 2 s1 → esc r1 1 i1 s1 2

(d). Input: ␣. Output: 100‘␣’.
esc ␣1 1 r1 2 i1 s1 3 → esc ␣1 1 r1 s1 i1 2 2

Figure Ans.13: Exercise 2.25. Part I. [Node lists shown; tree diagrams not reproduced.]


(e). Input: s. Output: 10.
esc ␣1 1 r1 s2 i1 2 3 → esc ␣1 1 r1 i1 s2 2 3

(f). Input: i. Output: 10.
esc ␣1 1 r1 i2 s2 2 4

(g). Input: d. Output: 000‘d’.
esc d1 1 ␣1 2 r1 i2 s2 3 4 → esc d1 1 ␣1 r1 2 i2 s2 3 4

Figure Ans.13: Exercise 2.25. Part II. [Node lists shown; tree diagrams not reproduced.]


(h). Input: ␣. Output: 011.
esc d1 1 ␣2 r1 3 i2 s2 4 4 → esc d1 1 r1 ␣2 2 i2 s2 4 4

(i). Input: i. Output: 10.
esc d1 1 r1 ␣2 2 i3 s2 4 5 → esc d1 1 r1 ␣2 2 s2 i3 4 5

Figure Ans.13: Exercise 2.25. Part III. [Node lists shown; tree diagrams not reproduced.]


(j). Input: s. Output: 10.
esc d1 1 r1 ␣2 2 s3 i3 4 6

(k). Input: ␣. Output: 00.
esc d1 1 r1 ␣3 2 s3 i3 5 6 → esc d1 1 r1 2 ␣3 s3 i3 5 6

Figure Ans.13: Exercise 2.25. Part IV. [Node lists shown; tree diagrams not reproduced.]


2.26: A simple calculation shows that the average size of a token in Table 2.35 is about nine bits. In stage 2, each 8-bit byte will be replaced, on average, by a 9-bit token, resulting in an expansion factor of 9/8 = 1.125 or 12.5%.

2.27: The decompressor will interpret the input data as 111110 0110 11000 0. . . , which is the string XRP. . . .

2.28: Because a typical fax machine scans lines that are about 8.2 inches wide (≈ 208 mm), so a blank scan line produces 1,664 consecutive white pels.

2.29: These codes are needed for cases such as example 4, where the run length is 64, 128, or any length for which a make-up code has been assigned.

2.30: There may be fax machines (now or in the future) built for wider paper, so the Group 3 code was designed to accommodate them.

2.31: Each scan line starts with a white pel, so when the decoder inputs the next code it knows whether it is for a run of white or black pels. This is why the codes of Table 2.41 have to satisfy the prefix property in each column but not between the columns.

2.32: Imagine a scan line where all runs have length one (strictly alternating pels). It's easy to see that this case results in expansion. The code of a run length of one white pel is 000111, and that of one black pel is 010. Two consecutive pels of different colors are thus coded into 9 bits. Since the uncoded data requires just two bits (01 or 10), the compression ratio is 9/2 = 4.5 (the compressed stream is 4.5 times longer than the uncompressed one; a large expansion).

2.33: Figure Ans.14 shows the modes and the actual code generated from the two lines.

vertical mode (−1)                  → 010
vertical mode (0)                   → 1
horizontal mode (3 white, 4 black)  → 001|1000|011
pass                                → 0001
vertical mode (+2)                  → 000011
vertical mode (−2)                  → 000010
horizontal mode (4 white, 7 black)  → 001|1011|00011

Figure Ans.14: Two-Dimensional Coding Example.

2.34: Table Ans.15 shows the steps of encoding the string a2a2a2a2. Because of the high probability of a2 the low and high variables start at very different values and approach each other slowly.

2.35: It can be written either as 0.1000. . . or 0.0111. . . .


a2   0.0 + (1.0 − 0.0) × 0.023162 = 0.023162
     0.0 + (1.0 − 0.0) × 0.998162 = 0.998162
a2   0.023162 + 0.975 × 0.023162 = 0.04574495
     0.023162 + 0.975 × 0.998162 = 0.99636995
a2   0.04574495 + 0.950625 × 0.023162 = 0.06776322625
     0.04574495 + 0.950625 × 0.998162 = 0.99462270125
a2   0.06776322625 + 0.926859375 × 0.023162 = 0.08923124309375
     0.06776322625 + 0.926859375 × 0.998162 = 0.99291913371875

Table Ans.15: Encoding the String a2a2a2a2.

2.36: In practice, the eof symbol has to be included in the original table of frequencies and probabilities. This symbol is the last to be encoded, and the decoder stops when it detects an eof.

2.37: The encoding steps are simple (see first example on page 114). We start with the interval [0, 1). The first symbol a2 reduces the interval to [0.4, 0.9). The second one, to [0.6, 0.85), the third one to [0.7, 0.825) and the eof symbol, to [0.8125, 0.8250). The approximate binary values of the last interval are 0.1101000000 and 0.1101001100, so we select the 7-bit number 1101000 as our code.

The probability of a2a2a2eof is (0.5)^3 × 0.1 = 0.0125, but since − log2 0.0125 ≈ 6.322 it follows that the practical minimum code size is 7 bits.

2.38: Perhaps the simplest way to do this is to compute a set of Huffman codes for the symbols, using their probabilities. This converts each symbol to a binary string, so the input stream can be encoded by the QM-coder. After the compressed stream is decoded by the QM-decoder, an extra step is needed, to convert the resulting binary strings back to the original symbols.

2.39: The results are shown in Tables Ans.16 and Ans.17. When all symbols are LPS, the output C always points at the bottom A(1 − Qe) of the upper (LPS) subinterval. When the symbols are MPS, the output always points at the bottom of the lower (MPS) subinterval, i.e., 0.

2.40: If the current input bit is an LPS, A is shrunk to Qe, which is always 0.5 or less, so A always has to be renormalized in such a case.

2.41: The results are shown in Tables Ans.18 and Ans.19 (compare with the answer to exercise 2.39).

2.42: The four decoding steps are as follows:
Step 1: C = 0.981, A = 1, the dividing line is A(1 − Qe) = 1(1 − 0.1) = 0.9, so the MPS and LPS subintervals are [0, 0.9) and [0.9, 1). Since C points to the upper subinterval, an LPS is decoded. The new C is 0.981 − 1(1 − 0.1) = 0.081 and the new A is 1 × 0.1 = 0.1.
Step 2: C = 0.081, A = 0.1, the dividing line is A(1 − Qe) = 0.1(1 − 0.1) = 0.09, so the MPS and LPS subintervals are [0, 0.09) and [0.09, 0.1), and an MPS is decoded. C is unchanged and the new A is 0.1(1 − 0.1) = 0.09.


Symbol      C                              A
Initially   0                              1
s1 (LPS)    0 + 1(1 − 0.5) = 0.5           1 × 0.5 = 0.5
s2 (LPS)    0.5 + 0.5(1 − 0.5) = 0.75      0.5 × 0.5 = 0.25
s3 (LPS)    0.75 + 0.25(1 − 0.5) = 0.875   0.25 × 0.5 = 0.125
s4 (LPS)    0.875 + 0.125(1 − 0.5) = 0.9375   0.125 × 0.5 = 0.0625

Table Ans.16: Encoding Four Symbols With Qe = 0.5.

Symbol      C   A
Initially   0   1
s1 (MPS)    0   1 × (1 − 0.1) = 0.9
s2 (MPS)    0   0.9 × (1 − 0.1) = 0.81
s3 (MPS)    0   0.81 × (1 − 0.1) = 0.729
s4 (MPS)    0   0.729 × (1 − 0.1) = 0.6561

Table Ans.17: Encoding Four Symbols With Qe = 0.1.

Symbol      C                   A     Renor. A   Renor. C
Initially   0                   1
s1 (LPS)    0 + 1 − 0.5 = 0.5   0.5   1          1
s2 (LPS)    1 + 1 − 0.5 = 1.5   0.5   1          3
s3 (LPS)    3 + 1 − 0.5 = 3.5   0.5   1          7
s4 (LPS)    7 + 1 − 0.5 = 7.5   0.5   1          15

Table Ans.18: Renormalization Added to Table Ans.16.

Symbol      C   A                 Renor. A   Renor. C
Initially   0   1
s1 (MPS)    0   1 − 0.1 = 0.9
s2 (MPS)    0   0.9 − 0.1 = 0.8
s3 (MPS)    0   0.8 − 0.1 = 0.7   1.4        0
s4 (MPS)    0   1.4 − 0.1 = 1.3

Table Ans.19: Renormalization Added to Table Ans.17.


Step 3: C = 0.081, A = 0.09, the dividing line is A(1 − Qe) = 0.09(1 − 0.1) = 0.081, so the MPS and LPS subintervals are [0, 0.081) and [0.081, 0.09), and an LPS is decoded. The new C is 0.081 − 0.09(1 − 0.1) = 0 and the new A is 0.09 × 0.1 = 0.009.
Step 4: C = 0, A = 0.009, the dividing line is A(1 − Qe) = 0.009(1 − 0.1) = 0.0081, so the MPS and LPS subintervals are [0, 0.0081) and [0.0081, 0.009), and an MPS is decoded. C is unchanged and the new A is 0.009(1 − 0.1) = 0.0081.

2.43: In practice, an encoder may encode texts other than English, such as a foreign language or the source code of a computer program. Acronyms, such as QED, and abbreviations, such as qwerty, are also good examples. Even in English there are some examples of a q not followed by a u, such as in this sentence. (The author has noticed that science-fiction writers tend to use non-English sounding words, such as Qaal, to name characters in their works.)

2.44: The number of order-2 and order-3 contexts for an alphabet of size 2^8 = 256 is 256^2 = 65,536 and 256^3 = 16,777,216, respectively. The former is manageable, whereas the latter is perhaps too big for a practical implementation, unless a sophisticated data structure is used or unless the encoder gets rid of older data from time to time.

For a small alphabet, larger values of N can be used. For a 16-symbol alphabet there are 16^4 = 65,536 order-4 contexts and 16^6 = 16,777,216 order-6 contexts.

2.45: A practical example of a 16-symbol alphabet is a color or grayscale image with 4-bit pixels. Each symbol is a pixel, and there are 16 different symbols.

2.46: An object file generated by a compiler or an assembler normally has several distinct parts including the machine instructions, symbol table, relocation bits, and constants. Such parts may have different bit distributions.

2.47: The alphabet has to be extended, in such a case, to include one more symbol. If the original alphabet consisted of all the possible 256 8-bit bytes, it should be extended to 9-bit symbols, and should include 257 values.

2.48: Table Ans.20 shows the groups generated in both cases and makes it clear why these particular probabilities were assigned.

2.49: The d is added to the order-0 contexts with frequency 1. The escape frequency should be incremented from 5 to 6, bringing the total frequencies from 19 up to 21. The probability assigned to the new d is therefore 1/21, and that assigned to the escape is 6/21. All other probabilities are reduced from x/19 to x/21.

2.50: The new d would require switching from order 2 to order 0, sending two escapes that take 1 and 1.32 bits. The d is now found in order-0 with probability 1/21, so it is encoded in 4.39 bits. The total number of bits required to encode the second d is therefore 1 + 1.32 + 4.39 = 6.71, still greater than 5.


Context    f    p
abc→x      10   10/11
Esc        1    1/11

Context    f    p
abc→a1     1    1/20
abc→a2     1    1/20
abc→a3     1    1/20
abc→a4     1    1/20
abc→a5     1    1/20
abc→a6     1    1/20
abc→a7     1    1/20
abc→a8     1    1/20
abc→a9     1    1/20
abc→a10    1    1/20
Esc        10   10/20
Total      20

Table Ans.20: Stable vs. Variable Data.

2.51: The first three cases don't change. They still code a symbol with 1, 1.32, and 6.57 bits, which is less than the 8 bits required for a 256-symbol alphabet without compression. Case 4 is different since the d is now encoded with a probability of 1/256, producing 8 instead of 4.8 bits. The total number of bits required to encode the d in case 4 is now 1 + 1.32 + 1.93 + 8 = 12.25.

2.52: The final trie is shown in Figure Ans.21.

Figure Ans.21: Final Trie of assanissimassa. [Trie diagram not reproduced.]

2.53: This probability is, of course

1 − Pe(b_{t+1} = 1 | b_1^t) = 1 − (b + 1/2)/(a + b + 1) = (a + 1/2)/(a + b + 1).


2.54: For the first string the single bit has a suffix of 00, so the probability of leaf 00 is Pe(1, 0) = 1/2. This is equal to the probability of string 0 without any suffix. For the second string each of the two zero bits has suffix 00, so the probability of leaf 00 is Pe(2, 0) = 3/8 = 0.375. This is greater than the probability 0.25 of string 00 without any suffix. Similarly, the probabilities of the remaining three strings are Pe(3, 0) = 5/16 = 0.3125, Pe(4, 0) = 35/128 ≈ 0.273, and Pe(5, 0) = 63/256 ≈ 0.246. As the strings get longer, their probabilities get smaller but they are greater than the probabilities without the suffix. Having a suffix of 00 thus increases the probability of having strings of zeros following it.

2.55: The four trees are shown in Figure Ans.22a–d. The weighted probability that the next bit will be a zero given that three zeros have just been generated is 0.5. The weighted probability to have two consecutive zeros given the suffix 000 is 0.375, higher than the 0.25 without the suffix.

Figure Ans.22: Context Trees For 000|0, 000|00, 000|1, and 000|11. [Tree diagrams not reproduced.]


3.1: The size of the output stream is N[48 − 28P] = N[48 − 25.2] = 22.8N. The size of the input stream is, as before, 40N. The compression factor is therefore 40/22.8 ≈ 1.75.

3.2: The list has up to 256 entries, each consisting of a byte and a count. A byte occupies eight bits, but the counts depend on the size and content of the file being compressed. If the file has high redundancy, a few bytes may have large counts, close to the length of the file, while other bytes may have very low counts. On the other hand, if the file is close to random, then each distinct byte has approximately the same count.

Thus, the first step in organizing the list is to reserve enough space for each "count" field to accommodate the maximum possible count. We denote the length of the file by L and find the positive integer k that satisfies 2^(k−1) < L ≤ 2^k. Thus, L is a k-bit number. If k is not already a multiple of 8, we increase it to the next multiple of 8. We now denote k = 8m, and allocate m bytes to each "count" field.

Once the file has been input and processed and the list has been sorted, we examine the largest count. It may be large and may occupy all m bytes, or it may be smaller. Assuming that the largest count occupies n bytes (where n ≤ m), we can store each of the other counts in n bytes.

When the list is written on the compressed file as the dictionary, its length s is first written in one byte. s is the number of distinct bytes in the original file. This is followed by n, followed by s groups, each with one of the distinct data bytes followed by an n-byte count. Notice that the value n should be fairly small and should fit in a single byte. If n does not fit in a single byte, then it is greater than 255, implying that the largest count does not fit in 255 bytes, implying in turn a file whose length L is greater than 2^255 ≈ 10^76 bytes.

An alternative is to start with s, followed by n, followed by the s distinct data bytes, followed by the n×s bytes of counts. The last part could also be in compressed form, because only a few largest counts will occupy all n bytes. Most counts may be small and occupy just one or two bytes, which implies that many of the n×s count bytes will be zero, resulting in high redundancy and therefore good compression.
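The count-field sizing reduces to a few lines of Matlab (an illustration under the answer's definitions):

L = 1e6;             % example: a 1 MB file
k = ceil(log2(L));   % L is a k-bit number (2^(k-1) < L <= 2^k)
k = 8*ceil(k/8);     % raise k to the next multiple of 8
m = k/8;             % m = 3 count bytes for this L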

3.3: The straightforward answer is: The decoder doesn't know, but it does not need to know. The decoder simply reads tokens and uses each offset to locate a string of text without having to know whether the string was a first or a last match.

3.4: The next step matches the space and encodes the string ␣e.

sir␣sid|␣eastman␣easily␣ ⇒ (4,1,e)
sir␣sid␣e|astman␣easily␣te ⇒ (0,0,a)

and the next one matches nothing and encodes the a.

3.5: The first two characters CA at positions 17–18 are a repeat of the CA at positions 9–10, so they will be encoded as a string of length 2 at offset 18 − 10 = 8. The next two characters AC at positions 19–20 are a repeat of the string at positions 8–9, so they will be encoded as a string of length 2 at offset 20 − 9 = 11.

3.6: The decoder interprets the first 1 of the end marker as the start of a token. The second 1 is interpreted as the prefix of a 7-bit offset. The next 7 bits are 0, and they identify the end marker as such, since a "normal" offset cannot be zero.


Dictionary   Token       Dictionary   Token
15 ␣t        (4, t)      21 ␣si       (19, i)
16 e         (0, e)      22 c         (0, c)
17 as        (8, s)      23 k         (0, k)
18 es        (16, s)     24 ␣se       (19, e)
19 ␣s        (4, s)      25 al        (8, l)
20 ea        (4, a)      26 s(eof)    (1, (eof))

Table Ans.23: Next 12 Encoding Steps in the LZ78 Example.

3.7: This is straightforward. The remaining steps are shown in Table Ans.23.

3.8: Table Ans.24 shows the last three steps.

p_src   3 chars   Hash index   P        Output   Binary output
11      h␣t       7            any→11   h        01101000
12      ␣th       5            5→12     4,7      0000|0011|00000111
16      ws                              ws       01110111|01110011

Table Ans.24: Last Steps of Encoding that thatch thaws.

The final compressed stream consists of 1 control word followed by 11 items (9 literals and 2 copy items):
0000010010000000|01110100|01101000|01100001|01110100|00100000|0000|0011|00000101|01100011|01101000|0000|0011|00000111|01110111|01110011.

3.9: An example is a compression utility for a personal computer that maintains all the files (or groups of files) on the hard disk in compressed form, to save space. Such a utility should be transparent to the user; it should automatically decompress a file every time it is opened and automatically compress it when it is being closed. In order to be transparent, such a utility should be fast, with compression ratio being only a secondary feature.

3.10: Table Ans.25 summarizes the steps. The output emitted by the encoder is

97 (a), 108 (l), 102 (f), 32 (␣), 101 (e), 97 (a), 116 (t), 115 (s), 32 (␣), 256 (al), 102 (f), 265 (alf), 97 (a),

and the following new entries are added to the dictionary

(256: al), (257: lf), (258: f␣), (259: ␣e), (260: ea), (261: at), (262: ts), (263: s␣), (264: ␣a), (265: alf), (266: fa), (267: alfa).
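A minimal LZW encoder in Matlab reproduces this output (a sketch; the function name lzw_encode is an assumption, and the dictionary is initialized to the 256 single-byte strings as described in the text):

function out = lzw_encode(s)
% LZW encoding sketch: grow string I while I+x is in the dictionary,
% otherwise emit the code of I and add I+x as a new entry.
dict = containers.Map('KeyType','char','ValueType','double');
for c = 0:255, dict(char(c)) = c; end
next = 256; I = ''; out = [];
for x = s
  Ix = [I x];
  if isKey(dict, Ix)
    I = Ix;                  % keep extending the current string
  else
    out(end+1) = dict(I);    % emit the code of the longest match
    dict(Ix) = next; next = next + 1;
    I = x;
  end
end
out(end+1) = dict(I);        % flush the last string

% lzw_encode('alf eats alfalfa') returns
% 97 108 102 32 101 97 116 115 32 256 102 265 97, as listed above.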

3.11: The encoder inputs the first a into I, searches and finds a in the dictionary. It inputs the next a but finds that Ix, which is now aa, is not in the dictionary. The encoder thus adds string aa to the dictionary as entry 256 and outputs the token 97 (a). Variable I is initialized to the second a. The third a is input, so Ix is the string aa, which is now in the dictionary. I becomes this string, and the fourth a is input. Ix is now aaa


I       in dict?   new entry   output
a       Y
al      N          256-al      97 (a)
l       Y
lf      N          257-lf      108 (l)
f       Y
f␣      N          258-f␣      102 (f)
␣       Y
␣e      N          259-␣e      32 (␣)
e       Y
ea      N          260-ea      101 (e)
a       Y
at      N          261-at      97 (a)
t       Y
ts      N          262-ts      116 (t)
s       Y
s␣      N          263-s␣      115 (s)
␣       Y
␣a      N          264-␣a      32 (␣)
a       Y
al      Y
alf     N          265-alf     256 (al)
f       Y
fa      N          266-fa      102 (f)
a       Y
al      Y
alf     Y
alfa    N          267-alfa    265 (alf)
a       Y
a,eof   N                      97 (a)

Table Ans.25: LZW Encoding of "alf eats alfalfa".

which is not in the dictionary. The encoder thus adds string aaa to the dictionary as entry 257 and outputs 256 (aa). I is initialized to the fourth a. Continuing this process is straightforward.

The result is that strings aa, aaa, aaaa,. . . are added to the dictionary as entries 256, 257, 258,. . . , and the output is

97 (a), 256 (aa), 257 (aaa), 258 (aaaa),. . .

The output consists of pointers pointing to longer and longer strings of as. The first k pointers thus point at strings whose total length is 1 + 2 + · · · + k = (k + k²)/2.

Assuming an input stream that consists of one million as, we can find the size of the compressed output stream by solving the quadratic equation (k + k²)/2 = 1000000 for the unknown k. The solution is k ≈ 1414. The original, 8-million bit input is thus compressed into 1414 pointers, each at least 9-bit (and in practice, probably 16-bit) long. The compression factor is thus either 8M/(1414×9) ≈ 628.6 or 8M/(1414×16) ≈ 353.6.

This is an impressive result but such input streams are rare (notice that this particular input can best be compressed by generating an output stream containing just "1000000 a", and without using LZW).

3.12: We simply follow the decoding steps described in the text. The results are:
1. Input 97. This is in the dictionary so set I=a and output a. String ax needs to be saved in the dictionary but x is still unknown.
2. Input 108. This is in the dictionary so set J=l and output l. Save al in entry 256. Set I=l.
3. Input 102. This is in the dictionary so set J=f and output f. Save lf in entry 257. Set I=f.


4. Input 32. This is in the dictionary so set J=␣ and output ␣. Save f␣ in entry 258. Set I=␣.
5. Input 101. This is in the dictionary so set J=e and output e. Save ␣e in entry 259. Set I=e.
6. Input 97. This is in the dictionary so set J=a and output a. Save ea in entry 260. Set I=a.
7. Input 116. This is in the dictionary so set J=t and output t. Save at in entry 261. Set I=t.
8. Input 115. This is in the dictionary so set J=s and output s. Save ts in entry 262. Set I=s.
9. Input 32. This is in the dictionary so set J=␣ and output ␣. Save s␣ in entry 263. Set I=␣.
10. Input 256. This is in the dictionary so set J=al and output al. Save ␣a in entry 264. Set I=al.
11. Input 102. This is in the dictionary so set J=f and output f. Save alf in entry 265. Set I=f.
12. Input 265. This has just been saved in the dictionary so set J=alf and output alf. Save fa in dictionary entry 266. Set I=alf.
13. Input 97. This is in the dictionary so set J=a and output a. Save alfa in entry 267 (even though it will never be used). Set I=a.
14. Read eof. Stop.

3.13: We assume that the dictionary is initialized to just the two entries (1: a) and (2: b). The encoder outputs

1 (a), 2 (b), 3 (ab), 5 (aba), 4 (ba), 7 (bab), 6 (abab), 9 (ababa), 8 (baba),. . .

and adds the new entries (3: ab), (4: ba), (5: aba), (6: abab), (7: bab), (8: baba), (9: ababa), (10: ababab), (11: babab),. . . to the dictionary. This regular behavior can be analyzed and the kth output pointer and dictionary entry predicted, but the effort is probably not worth it.

3.14: The answer to exercise 3.11 shows the relation between the size of the compressed file and the size of the largest dictionary string for the "worst case" situation (input that creates the longest strings). For a 1 Mbyte input stream, there will be 1,414 strings in the dictionary, the largest of which is 1,414 symbols long.

3.15: This is straightforward (Table Ans.26) but not very efficient since only one two-symbol dictionary phrase is used.

3.16: Table Ans.27 shows all the steps. In spite of the short input, the result is quite good (13 codes to compress 18 symbols) because the input contains concentrations of as and bs.

3.17: 1. The encoder starts by shifting the first two symbols xy to the search buffer, outputting them as literals and initializing all locations of the index table to the null pointer.


Step   Input   Output   S    Add to dict   S'

swiss␣miss

1      s       115      s    —             s
2      w       119      w    256-sw        w
3      i       105      i    257-wi        i
4      s       115      s    258-is        s
5      s       115      s    259-ss        s
6      ␣       32       ␣    260-s␣        ␣
7      m       109      m    261-␣m        m
8      is      258      is   262-mis       is
9      s       115      s    263-iss       s

Table Ans.26: LZMW Compression of "swiss miss".

Step   Input   Output   S     Add to dict   S'

yabbadabbadabbadoo

1      y       121      y     —             y
2      a       97       a     256-ya        a
3      b       98       b     257-ab        b
4      b       98       b     258-bb        b
5      a       97       a     259-ba        a
6      d       100      d     260-ad        d
7      ab      257      ab    261-dab       ab
8      ba      259      ba    262-abba      ba
9      dab     261      dab   263-badab     dab
10     ba      259      ba    264-dabba     ba
11     d       100      d     265-bad       d
12     o       111      o     266-do        o
13     o       111      o     267-oo        o

Table Ans.27: LZMW Compression of "yabbadabbadabbadoo".

2. The current symbol is a (the first a) and the context is xy. It is hashed to, say, 5, but location 5 of the index table contains a null pointer, so P is null. Location 5 is set to point to the first a, which is then output as a literal. The data in the encoder's buffer is shifted to the left.
3. The current symbol is the second a and the context is ya. It is hashed to, say, 1, but location 1 of the index table contains a null pointer, so P is null. Location 1 is set to point to the second a, which is then output as a literal. The data in the encoder's buffer is shifted to the left.
4. The current symbol is the third a and the context is aa. It is hashed to, say, 2, but location 2 of the index table contains a null pointer, so P is null. Location 2 is set to point to the third a, which is then output as a literal. The data in the encoder's buffer is shifted to the left.


5. The current symbol is the fourth a and the context is aa. We know from step 4 that it is hashed to 2, and location 2 of the index table points to the third a. Location 2 is set to point to the fourth a, and the encoder tries to match the string starting with the third a to the string starting with the fourth a. Assuming that the look-ahead buffer is full of as, the match length L will be the size of that buffer. The encoded value of L will be written to the compressed stream, and the data in the buffer shifted L positions to the left.
6. If the original input stream is long, more a's will be shifted into the look-ahead buffer, and this step will also result in a match of length L. If only n as remain in the input stream, they will be matched, and the encoded value of n output.

The compressed stream will consist of the three literals x, y, and a, followed by (perhaps several values of) L, and possibly ending with a smaller value.

3.18: T percent of the compressed stream is made up of literals, some appearing consecutively (and thus getting the flag "1" for two literals, half a bit per literal) and others with a match length following them (and thus getting the flag "01", one bit for the literal). We assume that two thirds of the literals appear consecutively and one third are followed by match lengths. The total number of flag bits created for literals is thus

(2/3)T × 0.5 + (1/3)T × 1.

A similar argument for the match lengths yields

(2/3)(1 − T) × 2 + (1/3)(1 − T) × 1

for the total number of the flag bits. We now write the equation

(2/3)T × 0.5 + (1/3)T × 1 + (2/3)(1 − T) × 2 + (1/3)(1 − T) × 1 = 1,

which simplifies to (2/3)T + (5/3)(1 − T) = 1 and is solved to yield T = 2/3. This means that if two thirds of the items in the compressed stream are literals, there would be 1 flag bit per item on the average. More literals would result in fewer flag bits.

3.19: The first three 1's indicate six literals. The following 01 indicates a literal (b) followed by a match length (of 3). The 10 is the code of match length 3, and the last 1 indicates two more literals (x and y).

4.1: An image with no redundancy is not always random. The definition of redundancy (Section 2.1) tells us that an image where each color appears with the same frequency has no redundancy (statistically), yet it is not necessarily random and may even be interesting and/or useful.


Figure Ans.28: Covariance Matrices of Correlated and Decorrelated Values. [Panels a, b, cov(a), and cov(b); images not reproduced.]

a=rand(32); b=inv(a);
figure(1), imagesc(a), colormap(gray); axis square
figure(2), imagesc(b), colormap(gray); axis square
figure(3), imagesc(cov(a)), colormap(gray); axis square
figure(4), imagesc(cov(b)), colormap(gray); axis square

Code for Figure Ans.28.

4.2: Figure Ans.28 shows two 32×32 matrices. The first one, a, with random (and therefore decorrelated) values and the second one, b, is its inverse (and therefore with correlated values). Their covariance matrices are also shown and it is obvious that matrix cov(a) is close to diagonal, whereas matrix cov(b) is far from diagonal. The Matlab code for this figure is also listed.


4.3: The results are shown in Table Ans.29 together with the Matlab code used to calculate it.

43210  Gray    43210  Gray    43210  Gray    43210  Gray
00000  00000   01000  01100   10000  11000   11000  10100
00001  00001   01001  01101   10001  11001   11001  10101
00010  00011   01010  01111   10010  11011   11010  10111
00011  00010   01011  01110   10011  11010   11011  10110
00100  00110   01100  01010   10100  11110   11100  10010
00101  00111   01101  01011   10101  11111   11101  10011
00110  00101   01110  01001   10110  11101   11110  10001
00111  00100   01111  01000   10111  11100   11111  10000

Table Ans.29: First 32 Binary and Gray Codes.

a=linspace(0,31,32); b=bitshift(a,-1);
b=bitxor(a,b); dec2bin(b)

Code for Table Ans.29.

4.4: One feature is the regular way in which each of the five code bits alternates periodically between 0 and 1. It is easy to write a program that will set all five bits to 0, will flip the rightmost bit after two codes have been calculated, and will flip any of the other four code bits in the middle of the period of its immediate neighbor on the right. Another feature is the fact that the second half of the table is a mirror image of the first half, but with the most significant bit set to one. After the first half of the table has been computed, using any method, this symmetry can be used to quickly calculate the second half.
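The mirror construction can be written directly (a minimal Matlab sketch; the function name rgc is an assumption):

function G = rgc(n)
% n-bit RGC by the mirror construction: take the (n-1)-bit table,
% append its mirror image, and prefix 0's and 1's respectively.
G = ['0'; '1'];
for k = 2:n
  z = repmat('0', size(G,1), 1);
  o = repmat('1', size(G,1), 1);
  G = [z G; o flipud(G)];   % first half, then mirrored half with MSB 1
end

% rgc(5) lists the Gray column of Table Ans.29 in order.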

4.5: Figure Ans.30 is an angular code wheel representation of the 4-bit and 6-bit RGC codes (part a) and the 4-bit and 6-bit binary codes (part b). The individual bitplanes are shown as rings, with the most significant bits as the innermost ring. It is easy to see that the maximum angular frequency of the RGC is half that of the binary code and that the first and last codes also differ by just one bit.

4.6: If pixel values are in the range [0, 255], a difference (Pi − Qi) can be at most 255. The worst case is where all the differences are 255. It is easy to see that such a case yields an RMSE of 255.

4.7: The code of Figure 4.15 yields the coordinates of the rotated points

(7.071, 0), (9.19, 0.7071), (17.9, 0.78), (33.9, 1.41), (43.13, −2.12)

(notice how all the y coordinates are small numbers) and shows that the cross-correlation drops from 1729.72 before the rotation to −23.0846 after it. A significant reduction!


Figure Ans.30: Angular Code Wheels of RGC and Binary Codes. [Wheel diagrams (a) and (b) not reproduced.]

4.8: Figure Ans.31 shows the 64 basis images and the Matlab code to calculate and display them. Each basis image is an 8×8 matrix.

4.9: A4 is the 4×4 matrix

A4 = [ h0(0/4) h0(1/4) h0(2/4) h0(3/4);
       h1(0/4) h1(1/4) h1(2/4) h1(3/4);
       h2(0/4) h2(1/4) h2(2/4) h2(3/4);
       h3(0/4) h3(1/4) h3(2/4) h3(3/4) ]

   = (1/√4) [  1   1   1   1;
               1   1  −1  −1;
              √2 −√2   0   0;
               0   0  √2 −√2 ].

Similarly, A8 is the matrix

A8 = (1/√8) [  1   1   1   1   1   1   1   1;
               1   1   1   1  −1  −1  −1  −1;
              √2  √2 −√2 −√2   0   0   0   0;
               0   0   0   0  √2  √2 −√2 −√2;
               2  −2   0   0   0   0   0   0;
               0   0   2  −2   0   0   0   0;
               0   0   0   0   2  −2   0   0;
               0   0   0   0   0   0   2  −2 ].


M=3; N=2^M; H=[1 1; 1 -1]/sqrt(2);
for m=1:(M-1) % recursion
  H=[H H; H -H]/sqrt(2);
end
A=H';
map=[1 5 7 3 4 8 6 2]; % 1:N
for n=1:N, B(:,n)=A(:,map(n)); end;
A=B;
sc=1/(max(abs(A(:))).^2); % scale factor
for row=1:N
  for col=1:N
    BI=A(:,row)*A(:,col).'; % tensor product
    subplot(N,N,(row-1)*N+col)
    oe=round(BI*sc); % results in -1, +1
    imagesc(oe), colormap([1 1 1; .5 .5 .5; 0 0 0])
    drawnow
  end
end

Figure Ans.31: The 8×8 WHT Basis Images and Matlab Code.


4.10: The average of vector w(i) is zero, so Equation (4.12) yields

(W·W^T)_jj = Σ_{i=1..k} w_j^(i) w_j^(i) = Σ_{i=1..k} (w_j^(i) − 0)² = Σ_{i=1..k} (c_i^(j) − 0)² = k Variance(c^(j)).

4.11: The Mathematica code of Figure 4.21 produces the eight coefficients 140, −71, 0, −7, 0, −2, 0, and 0. We now quantize this set coarsely by clearing the last two nonzero weights −7 and −2. When the IDCT is applied to the sequence 140, −71, 0, 0, 0, 0, 0, 0, it produces 15, 20, 30, 43, 56, 69, 79, and 84. These are not identical to the original values, but the maximum difference is only 4; an excellent result considering that only two of the eight DCT coefficients are nonzero.

4.12: The eight values in the top row are very similar (the differences between them are either 2 or 3). Each of the other rows is obtained as a right-circular shift of the preceding row.

4.13: It is obvious that such a block can be represented as a linear combination of the patterns in the leftmost column of Figure 4.39. The actual calculation yields the eight weights 4, 0.72, 0, 0.85, 0, 1.27, 0, and 3.62 for the patterns of this column. The other 56 weights are zero or very close to zero.

4.14: The arguments of the cosine functions used by the DCT are of the form (2x + 1)iπ/16, where i and x are integers in the range [0, 7]. Such an argument can be written in the form nπ/16, where n is an integer in the range [0, 15×7]. Since the cosine function is periodic, it satisfies cos(32π/16) = cos(0π/16), cos(33π/16) = cos(π/16), and so on. As a result, only the 32 values cos(nπ/16) for n = 0, 1, 2, . . . , 31 are needed. The author is indebted to V. Saravanan for pointing out this feature of the DCT.
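This is easy to verify numerically (a small Matlab check, not part of the original answer):

% Every cosine needed by the 8x8 DCT equals one of 32 stored values.
[x, i] = meshgrid(0:7, 0:7);
n = (2*x + 1).*i;  n = n(:);        % all arguments n*pi/16, n up to 105
v = cos((0:31)*pi/16);              % the 32 stored cosines
a = cos(n*pi/16);                   % the cosines actually needed
b = reshape(v(mod(n,32)+1), [], 1); % stored values at n mod 32
err = max(abs(a - b));              % err is at roundoff level (~1e-15)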

4.15: Figure 4.52 shows the results (that resemble Figure 4.39) and the Matlab code. Notice that the same code can also be used to calculate and display the DCT basis images.

4.16: First figure out the zigzag path manually, then record it in an array zz of structures, where each structure contains a pair of coordinates for the path as shown, e.g., in Figure Ans.32.

(0,0) (0,1) (1,0) (2,0) (1,1) (0,2) (0,3) (1,2)
(2,1) (3,0) (4,0) (3,1) (2,2) (1,3) (0,4) (0,5)
(1,4) (2,3) (3,2) (4,1) (5,0) (6,0) (5,1) (4,2)
(3,3) (2,4) (1,5) (0,6) (0,7) (1,6) (2,5) (3,4)
(4,3) (5,2) (6,1) (7,0) (7,1) (6,2) (5,3) (4,4)
(3,5) (2,6) (1,7) (2,7) (3,6) (4,5) (5,4) (6,3)
(7,2) (7,3) (6,4) (5,5) (4,6) (3,7) (4,7) (5,6)
(6,5) (7,4) (7,5) (6,6) (5,7) (6,7) (7,6) (7,7)

Figure Ans.32: Coordinates for the Zigzag Path.


If the two components of a structure are zz.r and zz.c, then the zigzag traversal can be done by a loop of the form:

for (i=0; i<64; i++){
  row:=zz[i].r; col:=zz[i].c
  ...data_unit[row][col]...
}

4.17: The third DC difference, 5, is located in row 3 column 5, so it is encoded as 1110|101.

4.18: Thirteen consecutive zeros precede this coefficient, so Z = 13. The coefficient itself is found in Table 4.63 in row 1, column 0, so R = 1 and C = 0. Assuming that the Huffman code in position (R,Z) = (1, 13) of Table 4.66 is 1110101, the final code emitted for 1 is 1110101|0.

4.19: This is shown by multiplying the largest n-bit number, 11...1 (n ones), by 4, which is easily done by shifting it 2 positions to the left. The result is the (n + 2)-bit number 11...1 (n ones) followed by 00.

4.20: They make for a more natural progressive growth of the image. They make it easier for a person watching the image develop on the screen to decide if and when to stop the decoding process and accept or discard the image.

4.21: The only specification that depends on the particular bits assigned to the two colors is Equation (4.25). All the other parts of JBIG are independent of the bit assignment.

4.22: For the 16-bit template of Figure 4.97a the relative coordinates are

A1 = (3,−1), A2 = (−3,−1), A3 = (2,−2), A4 = (−2,−2).

For the 13-bit template of Figure 4.97b the relative coordinates of A1 are (3,−1). For the 10-bit templates of Figure 4.97c,d the relative coordinates of A1 are (2,−1).

4.23: Transposing S and T produces better compression in cases where the text runs vertically.

4.24: Going back to step 1 we have the same points participate in the partition for each codebook entry (this happens because our points are concentrated in four distinct regions, but in general a partition P_i^(k) may consist of different image blocks in each iteration k). The distortions calculated in step 2 are summarized in Table Ans.34. The average distortion D_i^(1) is

D^(1) = (277 + 277 + 277 + 277 + 50 + 50 + 200 + 117 + 37 + 117 + 162 + 117)/12 = 163.17,

much smaller than the original 603.33. If step 3 indicates no convergence, more iterations should follow (Exercise 4.25), reducing the average distortion and improving the values of the four codebook entries.


Figure Ans.33: Twelve Points and Four Codebook Entries C_i^(1). [Scatter plot not reproduced.]

I:   (46 − 32)² + (41 − 32)² = 277,    (46 − 60)² + (41 − 32)² = 277,
     (46 − 32)² + (41 − 50)² = 277,    (46 − 60)² + (41 − 50)² = 277,
II:  (65 − 60)² + (145 − 150)² = 50,   (65 − 70)² + (145 − 140)² = 50,
III: (210 − 200)² + (200 − 210)² = 200,
IV:  (206 − 200)² + (41 − 32)² = 117,  (206 − 200)² + (41 − 40)² = 37,
     (206 − 200)² + (41 − 50)² = 117,  (206 − 215)² + (41 − 50)² = 162,
     (206 − 215)² + (41 − 35)² = 117.

Table Ans.34: Twelve Distortions for k = 1.

Answers to Exercises 987

4.25: Each new codebook entry C_i^(k) is calculated, in step 4 of iteration k, as the average of the image blocks comprising partition P_i^(k−1). In our example the image blocks (points) are concentrated in four separate regions, so the partitions calculated for iteration k = 1 are the same as those for k = 0. Another iteration, for k = 2, will therefore compute the same partitions in its step 1 yielding, in step 3, an average distortion D^(2) that equals D^(1). Step 3 will therefore indicate convergence.

4.26: It is 4^−8 ≈ 0.000015 of the area of the entire space.

4.27: Monitor the compression ratio, and delete the dictionary and start afresh each time compression performance drops below a certain threshold.

4.28: Step 4: Point (2, 0) is popped out of the GPP. The pixel value at this position is 7. The best match for this point is with the dictionary entry containing 7. The encoder outputs the pointer 7. The match does not have any concave corners, so we push the point on the right of the matched block, (2, 1), and the point below it, (3, 0), into the GPP. The GPP now contains points (2, 1), (3, 0), (0, 2), and (1, 1). The dictionary is updated by appending to it (at location 18) the block 4 7.

Step 5: Point (1, 1) is popped out of the GPP. The pixel value at this position is 5. The best match for this point is with the dictionary entry containing 5. The encoder outputs the pointer 5. The match does not have any concave corners, so we push the point to the right of the matched block, (1, 2), and the point below it, (2, 1), into the GPP. The GPP contains points (1, 2), (2, 1), (3, 0), and (0, 2). The dictionary is updated by appending to it (at locations 19, 20) the two blocks 2 5 and 4 5.

4.29: It may simply be too long. When compressing text, each symbol is normally 1-byte long (two bytes in Unicode). However, images with 24-bit pixels are very common,and a 16-pixel block in such an image is 48-bytes long.

4.30: If the encoder uses a (2, 1, k) general unary code, then the value of k should alsobe included in the header.

4.31: The mean and standard deviation are p̄ = 115 and σ = 77.93, respectively. The counts become n+ = n− = 8, and Equations (4.33) are solved to yield p+ = 193 and p− = 37. The original block is compressed to the 16 bits

1 0 1 1
1 0 0 1
1 1 0 1
0 0 0 0

and the two 8-bit values 37 and 193.
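As a sketch of the computation (assuming that Equations (4.33) are the standard block truncation coding relations p± = p̄ ± σ·sqrt(n∓/n±)), the encoding of any 4×4 block can be written in Matlab as follows; for the block of the exercise it should reproduce the values 115, 77.93, 37, and 193 quoted above:

% BTC of a 4x4 block b (a sketch; save as btc.m). Equations (4.33)
% are assumed to be the standard BTC relations.
function [bitmap, pminus, pplus] = btc(b)
  x = b(:);
  pbar  = mean(x);             % block mean (115 in the exercise)
  sigma = std(x, 1);           % population standard deviation (77.93)
  bitmap = b >= pbar;          % the 16-bit bitmap
  nplus  = sum(bitmap(:));     % pixels at or above the mean (8)
  nminus = numel(x) - nplus;   % pixels below the mean (8)
  pplus  = round(pbar + sigma*sqrt(nminus/nplus));   % 193
  pminus = round(pbar - sigma*sqrt(nplus/nminus));   % 37
end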

4.32: Table Ans.35 summarizes the results. Notice how a 1-pixel with a context of 00 is assigned high probability after being seen 3 times.


#  Pixel  Context  Counts  Probability  New counts
5    0     10=2     1,1       1/2          2,1
6    1     00=0     1,3       3/4          1,4
7    0     11=3     1,1       1/2          2,1
8    1     10=2     2,1       1/3          2,2

Table Ans.35: Counts and Probabilities for Next Four Pixels.
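The update rule behind Table Ans.35 is simple to express in code. In the following Matlab sketch, the initial counts of contexts 10, 00, and 11 are taken from the table; the counts of context 01, which the table never uses, are an assumption:

% Counts and probabilities for pixels 5-8 (a sketch).
% counts(c+1,:) = [zeros seen, ones seen] in 2-bit context c.
counts   = [1 3; 1 1; 1 1; 1 1];  % contexts 00, 01, 10, 11
pixels   = [0 1 0 1];             % pixels 5-8
contexts = [2 0 3 2];             % their contexts 10, 00, 11, 10
for i = 1:4
  c = contexts(i) + 1;  v = pixels(i);
  prob = counts(c, v+1) / sum(counts(c,:));
  fprintf('pixel %d: probability %.4f\n', i+4, prob);  % 1/2, 3/4, 1/2, 1/3
  counts(c, v+1) = counts(c, v+1) + 1;  % update after coding the pixel
end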

4.33: Such a thing is possible for the encoder but not for the decoder. A compression method using “future” pixels in the context is useless because its output would be impossible to decompress.

4.34: The model used by FELICS to predict the current pixel is a second-order Markov model. In such a model the value of the current data item depends on just two of its past neighbors, not necessarily the two immediate ones.

4.35: The two previously seen neighbors of P=8 are A=1 and B=11. P is thus in the central region, where all codes start with a zero, and L=1, H=11. The computations are straightforward:

k = ⌊log2(11 − 1 + 1)⌋ = 3, a = 2^(3+1) − 11 = 5, b = 2(11 − 2³) = 6.

Table Ans.36 lists the five 3-bit codes and six 4-bit codes for the central region. The code for 8 is thus 0|111.

The two previously seen neighbors of P=7 are A=2 and B=5. P is thus in the right outer region, where all codes start with 11, and L=2, H=7. We are looking for the code of 7 − 5 = 2. Choosing m = 1 yields, from Table 4.120, the code 11|01.

The two previously seen neighbors of P=0 are A=3 and B=5. P is thus in the left outer region, where all codes start with 10, and L=3, H=5. We are looking for the code of 3 − 0 = 3. Choosing m = 1 yields, from Table 4.120, the code 10|100.

Pixel P  Region code  Pixel code
   1          0          0000
   2          0          0010
   3          0          0100
   4          0          011
   5          0          100
   6          0          101
   7          0          110
   8          0          111
   9          0          0001
  10          0          0011
  11          0          0101

Table Ans.36: The Codes for a Central Region.
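The quantities k, a, and b follow directly from L and H, as the following Matlab sketch shows; note that a + b always equals the size of the range:

% Short and long adjusted binary codes for a central region (a sketch).
L = 1; H = 11;
r = H - L + 1;         % size of the range (11)
k = floor(log2(r));    % the short codes have k bits
a = 2^(k+1) - r;       % number of k-bit codes (5)
b = 2*(r - 2^k);       % number of (k+1)-bit codes (6)
fprintf('k=%d, a=%d, b=%d, a+b=%d\n', k, a, b, a+b);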


4.36: Because the decoder has to resolve ties in the same way as the encoder.

4.37: The weights have to add up to 1 because this results in a weighted sum whose value is in the same range as the values of the pixels. If pixel values are, for example, in the range [0, 15] and the weights add up to 2, a prediction may result in values of up to 30.

4.38: Each of the three weights 0.0039, −0.0351, and 0.3164 is used twice. The sum of the weights is therefore 0.5704, and the result of dividing each weight by this sum is 0.0068, −0.0615, and 0.5547. It is easy to verify that the sum of the renormalized weights 2(0.0068 − 0.0615 + 0.5547) equals 1.

4.39: An archive of static images is an example where this approach is practical. NASA has a large archive of images taken by various satellites. They should be kept highly compressed, but they never change, so each image has to be compressed only once. A slow encoder is therefore acceptable but a fast decoder is certainly handy. Another example is an art collection. Many museums have already scanned and digitized their collections of paintings, and those are also static.

4.40: Such a polynomial depends on three coefficients b, c, and d that can be considered three-dimensional points, and any three points are on the same plane.

4.41: This is straightforward:

P(2/3) = (0,−9)(2/3)³ + (−4.5, 13.5)(2/3)² + (4.5,−3.5)(2/3)
       = (0,−8/3) + (−2, 6) + (3,−7/3)
       = (1, 1) = P3.

4.42: We use the relations sin 30° = cos 60° = 0.5 and the approximation cos 30° = sin 60° ≈ 0.866. The four points are P1 = (1, 0), P2 = (cos 30°, sin 30°) = (0.866, 0.5), P3 = (0.5, 0.866), and P4 = (0, 1). The relation A = N·P becomes

(a)           (−4.5  13.5 −13.5  4.5)   ( (1, 0)       )
(b) = A = N·P = ( 9.0 −22.5  18.0 −4.5) · ( (0.866, 0.5) )
(c)           (−5.5   9.0  −4.5  1.0)   ( (0.5, 0.866) )
(d)           ( 1.0   0     0    0  )   ( (0, 1)       )

and the solutions are

a = −4.5(1, 0) + 13.5(0.866, 0.5) − 13.5(0.5, 0.866) + 4.5(0, 1) = (0.441,−0.441),
b = 9(1, 0) − 22.5(0.866, 0.5) + 18(0.5, 0.866) − 4.5(0, 1) = (−1.485,−0.162),
c = −5.5(1, 0) + 9(0.866, 0.5) − 4.5(0.5, 0.866) + 1(0, 1) = (0.044, 1.603),
d = 1(1, 0) − 0(0.866, 0.5) + 0(0.5, 0.866) − 0(0, 1) = (1, 0).

Thus, the PC is P(t) = (0.441,−0.441)t³ + (−1.485,−0.162)t² + (0.044, 1.603)t + (1, 0). The midpoint is P(0.5) = (0.7058, 0.7058), only 0.2% away from the midpoint of the arc, which is at (cos 45°, sin 45°) ≈ (0.7071, 0.7071).
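The computation is easy to verify numerically, as in the following Matlab sketch:

% Verify the cubic through the four circle points (a sketch).
N = [-4.5 13.5 -13.5 4.5; 9 -22.5 18 -4.5; -5.5 9 -4.5 1; 1 0 0 0];
P = [1 0; .866 .5; .5 .866; 0 1];        % the four points, one per row
A = N*P;                                 % the rows of A are a, b, c, d
mid = [0.5^3 0.5^2 0.5 1]*A              % prints (0.7058, 0.7058)
err = norm(mid - [cos(pi/4) sin(pi/4)])  % about 0.002 of the unit radius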


4.43: The new equations are easy enough to set up. Using Mathematica, they are also easy to solve. The following code

Solve[{d==p1,
 a al^3+b al^2+c al+d==p2,
 a be^3+b be^2+c be+d==p3,
 a+b+c+d==p4},{a,b,c,d}];
ExpandAll[Simplify[%]]

(where al and be stand for α and β, respectively) produces the (messy) solutions

a = −P1/(αβ) + P2/(−α² + α³ + αβ − α²β) + P3/(αβ − β² − αβ² + β³)
    + P4/(1 − α − β + αβ),

b = P1(−α + α³ + β − α³β − β³ + αβ³)/γ + P2(−β + β³)/γ
    + P3(α − α³)/γ + P4(α³β − αβ³)/γ,

c = −P1(1 + 1/α + 1/β) + βP2/(−α² + α³ + αβ − α²β)
    + αP3/(αβ − β² − αβ² + β³) + αβP4/(1 − α − β + αβ),

d = P1,

where γ = (−1 + α)α(−1 + β)β(−α + β).

From here, the basis matrix immediately follows:

( −1/(αβ)                 1/(−α²+α³+αβ−α²β)  1/(αβ−β²−αβ²+β³)  1/(1−α−β+αβ)  )
( (−α+α³+β−α³β−β³+αβ³)/γ  (−β+β³)/γ          (α−α³)/γ          (α³β−αβ³)/γ   )
( −(1 + 1/α + 1/β)        β/(−α²+α³+αβ−α²β)  α/(αβ−β²−αβ²+β³)  αβ/(1−α−β+αβ) )
(  1                      0                  0                 0             )

A direct check, again using Mathematica, for α = 1/3 and β = 2/3, reduces this matrix to matrix N of Equation (4.44).

4.44: The missing points will have to be estimated by interpolation or extrapolation from the known points before our method can be applied. Obviously, the fewer points are known, the worse the final interpolation. Note that 16 points are necessary, because a bicubic polynomial has 16 coefficients.

4.45: Figure Ans.37a shows a diamond-shaped grid of 16 equally-spaced points. The eight points with negative weights are shown in black. Figure Ans.37b shows a cut (labeled xx) through four points in this surface. The cut is a curve that passes through four data points. It is easy to see that when the two exterior (black) points are raised, the center of the curve (and, as a result, the center of the surface) gets lowered. It is now clear that points with negative weights push the center of the surface in a direction opposite that of the points.

Figure Ans.37c is a more detailed example that also shows why the four corner points should have positive weights. It shows a simple symmetric surface patch that interpolates the 16 points

P00 = (0, 0, 0), P10 = (1, 0, 1), P20 = (2, 0, 1), P30 = (3, 0, 0),
P01 = (0, 1, 1), P11 = (1, 1, 2), P21 = (2, 1, 2), P31 = (3, 1, 1),
P02 = (0, 2, 1), P12 = (1, 2, 2), P22 = (2, 2, 2), P32 = (3, 2, 1),
P03 = (0, 3, 0), P13 = (1, 3, 1), P23 = (2, 3, 1), P33 = (3, 3, 0).

We first raise the eight boundary points from z = 1 to z = 1.5. Figure Ans.37d shows how the center point P(.5, .5) gets lowered from (1.5, 1.5, 2.25) to (1.5, 1.5, 2.10938). We next return those points to their original positions and instead raise the four corner points from z = 0 to z = 1. Figure Ans.37e shows how this raises the center point from (1.5, 1.5, 2.25) to (1.5, 1.5, 2.26563).

4.46: The decoder knows this pixel since it knows the value of average μ[i − 1, j] = 0.5(I[2i − 2, 2j] + I[2i − 1, 2j + 1]) and since it has already decoded pixel I[2i − 2, 2j].

4.47: The decoder knows how to do this because when the decoder inputs the 5, it knows that the difference between p (the pixel being decoded) and the reference pixel starts at position 6 (counting from the left). Since bit 6 of the reference pixel is 0, that of p must be 1.

4.48: Yes, but compression would suffer. One way to apply this method is to separate each byte into two 4-bit pixels and encode each pixel separately. This approach is bad since the prefix and suffix of a 4-bit pixel may often consist of more than four bits. Another approach is to ignore the fact that a byte contains two pixels, and use the method as originally described. This may still compress the image, but is not very efficient, as the following example illustrates.

Example: The two bytes 1100|1101 and 1110|1111 represent four pixels, each differing from its immediate neighbor by its least-significant bit. The four pixels therefore have similar colors (or grayscales). Comparing consecutive pixels results in prefixes of 3 or 2, but comparing the two bytes produces the prefix 2.

4.49: The weights add up to 1 because this produces a value X in the same range as A, B, and C. If the weights were, for instance, 1, 100, and 1, X would have much bigger values than any of the three pixels.

4.50: The four vectors are

a = (90, 95, 100, 80, 90, 85),

b(1) = (100, 90, 95, 102, 80, 90),

b(2) = (101, 128, 108, 100, 90, 95),

b(3) = (128, 108, 110, 90, 95, 100),


Figure Ans.37: An Interpolating Bicubic Surface Patch. (Parts (a) and (b) show the diamond-shaped grid and the cut xx; parts (c)–(e) show the surface patch before and after the boundary and corner points are raised.)

Clear[Nh,p,pnts,U,W];
p00={0,0,0}; p10={1,0,1}; p20={2,0,1}; p30={3,0,0};
p01={0,1,1}; p11={1,1,2}; p21={2,1,2}; p31={3,1,1};
p02={0,2,1}; p12={1,2,2}; p22={2,2,2}; p32={3,2,1};
p03={0,3,0}; p13={1,3,1}; p23={2,3,1}; p33={3,3,0};
Nh={{-4.5,13.5,-13.5,4.5},{9,-22.5,18,-4.5},
 {-5.5,9,-4.5,1},{1,0,0,0}};
pnts={{p33,p32,p31,p30},{p23,p22,p21,p20},
 {p13,p12,p11,p10},{p03,p02,p01,p00}};
U[u_]:={u^3,u^2,u,1}; W[w_]:={w^3,w^2,w,1};
(* prt[i] extracts component i from the 3rd dimen of P *)
prt[i_]:=pnts[[Range[1,4],Range[1,4],i]];
p[u_,w_]:={U[u].Nh.prt[1].Transpose[Nh].W[w],
 U[u].Nh.prt[2].Transpose[Nh].W[w],
 U[u].Nh.prt[3].Transpose[Nh].W[w]};
g1=ParametricPlot3D[p[u,w], {u,0,1},{w,0,1},
 Compiled->False, DisplayFunction->Identity];
g2=Graphics3D[{AbsolutePointSize[2],
 Table[Point[pnts[[i,j]]],{i,1,4},{j,1,4}]}];
Show[g1,g2, ViewPoint->{-2.576, -1.365, 1.718}]

Code for Figure Ans.37.


and the code of Figure Ans.38 produces the solutions w1 = 0.1051, w2 = 0.3974, and w3 = 0.3690. Their total is 0.8715, compared with the original solutions, which added up to 0.9061. The point is that the numbers involved in the equations (the elements of the four vectors) are not independent (for example, pixel 80 appears in a and in b(1)) except for the last element (85 or 91) of a and the first element 101 of b(2), which are independent. Changing these two elements affects the solutions, which is why the solutions do not always add up to unity. However, compressing nine pixels produces solutions whose total is closer to one than in the case of six pixels. Compressing an entire image, with many thousands of pixels, produces solutions whose sum is very close to 1.

a={90.,95,100,80,90,85};
b1={100,90,95,100,80,90};
b2={100,128,108,100,90,95};
b3={128,108,110,90,95,100};
Solve[{b1.(a-w1 b1-w2 b2-w3 b3)==0,
 b2.(a-w1 b1-w2 b2-w3 b3)==0,
 b3.(a-w1 b1-w2 b2-w3 b3)==0},{w1,w2,w3}]

Figure Ans.38: Solving for Three Weights.

4.51: Figure Ans.39a,b,c shows the results, with all Hi values shown in small type. Most Hi values are zero because the pixels of the original image are so highly correlated. The Hi values along the edges are very different because of the simple edge rule used. The result is that the Hi values are highly decorrelated and have low entropy. Thus, they are candidates for entropy coding.

Figure Ans.39: (a) Bands L2 and H2. (b) Bands L3 and H3. (c) Bands L4 and H4.

4.52: There are 16 values. The value 0 appears nine times, and each of the other seven values appears once. The entropy is therefore

−Σ pi log2 pi = −(9/16) log2(9/16) − 7·(1/16) log2(1/16) ≈ 2.2169.

Not very small, because seven of the 16 values have the same probability. In practice, values of an Hi difference band tend to be small, are both positive and negative, and are concentrated around zero, so their entropy is small.
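A two-line Matlab check of this entropy:

% Entropy of nine 0's and seven other equally-probable values (a sketch).
p = [9/16, repmat(1/16, 1, 7)];
H = -sum(p .* log2(p))    % prints 2.2169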

4.53: Because the decoder needs to know how the encoder estimated X for each Hi difference value. If the encoder uses one of three methods for prediction, it has to precede each difference value in the compressed stream with a code that tells the decoder which method was used. Such a code can have variable size (for example, 0, 10, 11), but even adding just one or two bits to each prediction reduces compression performance significantly, because each Hi value needs to be predicted, and the number of these values is close to the size of the image.

4.54: The binary tree is shown in Figure Ans.40. From this tree, it is easy to see that the progressive image file is 3 6|5 7|7 7 10 5.

Figure Ans.40: A Binary Tree for an 8-Pixel Image. (The eight leaves are the pixel values 3, 4, 5, 6, 6, 4, 5, 8.)

4.55: They are shown in Figure Ans.41.

Figure Ans.41: The 15 6-Tuples With Two White Pixels.

4.56: No. An image with little or no correlation between the pixels will not compress with quadrisection, even though the size of the last matrix is always small. Even without knowing the details of quadrisection we can confidently state that such an image will produce a sequence of matrices Mj with few or no identical rows. In the extreme case, where the rows of any Mj are all distinct, each Mj will have four times the number of rows of its predecessor. This will create indicator vectors Ij that get longer and longer, thereby increasing the size of the compressed stream and reducing the overall compression performance.


4.57: Matrix M5 is just the concatenation of the 12 distinct rows of M4:

M5^T = (0000|0001|1111|0011|1010|1101|1000|0111|1110|0101|1011|0010).

4.58: M4 has four columns, so it can have at most 16 distinct rows, implying that M5 can have at most 4×16 = 64 elements.

4.59: The decoder has to read the entire compressed stream, save it in memory, and start the decoding with L5. Grouping the eight elements of L5 yields the four distinct elements 01, 11, 00, and 10 of L4, so I4 can now be used to reconstruct L4. The four zeros of I4 correspond to the four distinct elements of L4, and the remaining 10 elements of L4 can be constructed from them. Once L4 has been constructed, its 14 elements are grouped to form the seven distinct elements of L3. These elements are 0111, 0010, 1100, 0110, 1111, 0101, and 1010, and they correspond to the seven zeros of I3. Once L3 has been constructed, its eight elements are grouped to form the four distinct elements of L2. Those four elements are the entire L2 since I2 is all zero. Reconstructing L1 and L0 is now trivial.

4.60: The two halves of L0 are distinct, so L1 consists of the two elements

L1 = (0101010101010101, 1010101010101010),

and the first indicator vector is I1 = (0, 0). The two elements of L1 are distinct, so L2 has the four elements

L2 = (01010101, 01010101, 10101010, 10101010),

and the second indicator vector is I2 = (0, 1, 0, 2). Two elements of L2 are distinct, so L3 has the four elements L3 = (0101, 0101, 1010, 1010), and the third indicator vector is I3 = (0, 1, 0, 2). Again two elements of L3 are distinct, so L4 has the four elements L4 = (01, 01, 10, 10), and the fourth indicator vector is I4 = (0, 1, 0, 2). Only two elements of L4 are distinct, so L5 has the four elements L5 = (0, 1, 1, 0).

The output thus consists of k = 5, the value 2 (indicating that I2 is the first nonzero vector), I2, I3, and I4 (encoded), followed by L5 = (0, 1, 1, 0).

4.61: Using a Hilbert curve produces the 21 runs 5, 1, 2, 1, 2, 7, 3, 1, 2, 1, 5, 1, 2, 2, 11, 7, 2, 1, 1, 1, 6. RLE produces the 27 runs 0, 1, 7, eol, 2, 1, 5, eol, 5, 1, 2, eol, 0, 3, 2, 3, eol, 0, 3, 2, 3, eol, 0, 3, 2, 3, eol, 4, 1, 3, eol, 3, 1, 4, eol.

4.62: A straight line segment from a to b is an example of a one-dimensional curve that passes through every point in the interval [a, b].

4.63: The key is to realize that P0 is a single point, and P1 is constructed by connecting nine copies of P0 with straight segments. Similarly, P2 consists of nine copies of P1, in different orientations, connected by segments (the dashed segments in Figure Ans.42).


Figure Ans.42: The First Three Iterations of the Peano Curve. (Parts (a)–(c) show P0, P1, and P2.)

4.64: Written in binary, the coordinates are (1101, 0110). We iterate four times, each time taking 1 bit from the x coordinate and 1 bit from the y coordinate to form an (x, y) pair. The pairs are 10, 11, 01, 10. The first one yields [from Table 4.170(1)] 01. The second pair yields [also from Table 4.170(1)] 10. The third pair [from Table 4.170(1)] 11, and the last pair [from Table 4.170(4)] 01. Thus, the result is 01|10|11|01 = 109.

4.65: Table Ans.43 shows that this traversal is based on the sequence 2114.

1: 2↑ 1→ 1↓ 4
2: 1→ 2↑ 2← 3
3: 4↓ 3← 3↑ 2
4: 3← 4↓ 4→ 1

Table Ans.43: The Four Orientations of H2.

4.66: This is straightforward:

(00, 01, 11, 10) → (000, 001, 011, 010)(100, 101, 111, 110)
                 → (000, 001, 011, 010)(110, 111, 101, 100)
                 → (000, 001, 011, 010, 110, 111, 101, 100).

4.67: The gray area of Figure 4.171c is identified by the string 2011.

4.68: This particular numbering makes it easy to convert between the number of a subsquare and its image coordinates. (We assume that the origin is located at the bottom-left corner of the image and that image coordinates vary from 0 to 1.) As an example, translating the digits of the number 1032 to binary results in (01)(00)(11)(10). The first bits of these groups constitute the x coordinate of the subsquare, and the second bits constitute the y coordinate. Thus, the image coordinates of subsquare 1032 are x = .0011₂ = 3/16 and y = .1010₂ = 5/8, as can be directly verified from Figure 4.171c.
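A short Matlab sketch of this conversion:

% Quadtree subsquare number -> image coordinates (a sketch).
digits = [1 0 3 2];           % the subsquare number 1032 (base-4 digits)
xbits = floor(digits/2);      % first bit of each 2-bit digit
ybits = mod(digits, 2);       % second bit of each digit
n = numel(digits);
x = sum(xbits .* 2.^-(1:n))   % 0.0011 binary = 3/16
y = sum(ybits .* 2.^-(1:n))   % 0.1010 binary = 5/8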


Figure Ans.44: A Two-State Graph.

4.69: This is shown in Figure Ans.44.

4.70: This image is described by the function

f(x, y) = x + y,  if x + y ≤ 1,
f(x, y) = 0,      if x + y > 1.

4.71: The graph has five states, so each transition matrix is of size 5×5. Direct computation from the graph yields

W0 =
  0   1     0  0  0
  0   0.5   0  0  0
  0   0     0  0  1
  0  −0.5   0  0  1.5
  0  −0.25  0  0  1

W3 =
  0   0  0  0  0
  0   0  0  0  1
  0   0  0  0  0
  0   0  0  0  0
  0  −1  0  0  1.5

W1 = W2 =
  0   0      1    0    0
  0   0.25   0    0    0.5
  0   0      0    1.5  0
  0   0     −0.5  1.5  0
  0  −0.375  0    0    1.25

The final distribution is the five-component vector

F = (0.25, 0.5, 0.375, 0.4125, 0.75)^T.

4.72: One way to specify the center is to construct the string 033...3. This yields (writing each 2×2 matrix with its rows separated by a semicolon)

ψi(03...3) = (W0·W3···W3·F)i
 = ( [0.5 0; 0 1][0.5 0.5; 0 1]···[0.5 0.5; 0 1](0.5, 1)^T )i
 = ( [0.5 0; 0 1][0 1; 0 1](0.5, 1)^T )i
 = ( (0.5, 1)^T )i.


dim=256;
for i=1:dim
 for j=1:dim
  m(i,j)=(i+j-2)/(2*dim-2);
 end
end
m

Figure Ans.45: Matlab Code for a Matrix m(i,j) = (i + j)/2.

4.73: Figure Ans.45 shows Matlab code to compute a matrix such as those of Figure 4.175.

4.74: A direct examination of the graph yields the ψi values

ψi(0)  = (W0·F)i    = (0.5, 0.25, 0.75, 0.875, 0.625)^T_i,
ψi(01) = (W0·W1·F)i = (0.5, 0.25, 0.75, 0.875, 0.625)^T_i,
ψi(1)  = (W1·F)i    = (0.375, 0.5, 0.61875, 0.43125, 0.75)^T_i,
ψi(00) = (W0·W0·F)i = (0.25, 0.125, 0.625, 0.8125, 0.5625)^T_i,
ψi(03) = (W0·W3·F)i = (0.75, 0.375, 0.625, 0.5625, 0.4375)^T_i,
ψi(3)  = (W3·F)i    = (0, 0.75, 0, 0, 0.625)^T_i,

and the f values

f(0) = I·ψ(0) = 0.5,    f(01) = I·ψ(01) = 0.5,    f(1) = I·ψ(1) = 0.375,
f(00) = I·ψ(00) = 0.25,    f(03) = I·ψ(03) = 0.75,    f(3) = I·ψ(3) = 0.
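These products take only a few lines to check in Matlab, using W0, W1, W3, and F from the answer to Exercise 4.71:

% Verify the psi values of Exercise 4.74 (a sketch).
W0 = [0 1 0 0 0; 0 .5 0 0 0; 0 0 0 0 1; 0 -.5 0 0 1.5; 0 -.25 0 0 1];
W1 = [0 0 1 0 0; 0 .25 0 0 .5; 0 0 0 1.5 0; 0 0 -.5 1.5 0; 0 -.375 0 0 1.25];
W3 = [0 0 0 0 0; 0 0 0 0 1; 0 0 0 0 0; 0 0 0 0 0; 0 -1 0 0 1.5];
F  = [0.25; 0.5; 0.375; 0.4125; 0.75];
psi0  = (W0*F)'       % (0.5, 0.25, 0.75, 0.875, 0.625)
psi01 = (W0*W1*F)'    % the same five values as psi0
psi1  = (W1*F)'       % (0.375, 0.5, 0.61875, 0.43125, 0.75)
psi3  = (W3*F)'       % (0, 0.75, 0, 0, 0.625)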

4.75: Figure Ans.46a,b shows the six states and all 21 edges. We use the notation i(q, t)j for the edge with quadrant number q and transformation t from state i to state j. This GFA is more complex than previous ones since the original image is less self-similar.

4.76: The transformation can be written (x, y) → (x,−x + y), so (1, 0) → (1,−1), (3, 0) → (3,−3), (1, 1) → (1, 0), and (3, 1) → (3,−2). Thus, the original rectangle is transformed into a parallelogram.

4.77: The explanation is that the two sets of transformations produce the same Sierpinski triangle but at different sizes and orientations.

4.78: All three transformations shrink an image to half its original size. In addition, w2 and w3 place two copies of the shrunken image at relative displacements of (0, 1/2) and (1/2, 0), as shown in Figure Ans.47. The result is the familiar Sierpinski gasket but in a different orientation.


Figure Ans.46: A GFA for Exercise 4.75. (State 0 is the start state; in the notation i(q, t)j, the 21 edges are

0(0,0)1 0(3,0)1 0(0,1)2 0(1,0)2 0(2,2)2 0(3,3)2 1(0,7)3
1(1,6)3 1(2,4)3 1(3,0)3 2(0,0)4 2(2,0)4 2(2,1)4 2(3,1)4
3(0,0)5 4(0,0)5 4(0,6)5 4(2,0)5 4(2,6)5 5(0,0)5 5(0,8)5.)


Figure Ans.47: Another Sierpinski Gasket. (The original image, its three transformed copies w1, w2, and w3, and the result after one iteration.)

4.79: There are 32×32 = 1,024 ranges and (256 − 15)×(256 − 15) = 58,081 domains. Thus, the total number of steps is 1,024×58,081×8 = 475,799,552, still a large number. PIFS is therefore computationally intensive.

4.80: Suppose that the image has G levels of gray. A good measure of data loss is the difference between the value of an average decompressed pixel and its correct value, expressed in number of gray levels. For large values of G (hundreds of gray levels) an average difference of log2 G gray levels (or fewer) is considered satisfactory.

5.1: A written page is such an example. A person can place marks on a page and read them later as text, mathematical expressions, and drawings. This is a two-dimensional representation of the information on the page. The page can later be scanned by, e.g., a fax machine, and its contents transmitted as a one-dimensional stream of bits that constitute a different representation of the same information.

5.2: Figure Ans.48 shows f(t) and three shifted copies of the wavelet, for a = 1 and b = 2, 4, and 6. The inner product W(a, b) is plotted below each copy of the wavelet. It is easy to see how the inner products are affected by the increasing frequency.

The table of Figure Ans.49 lists 15 values of W(a, b), for a = 1, 2, and 3 and for b = 2 through 6. The density plot of the figure, where the bright parts correspond to large values, shows those values graphically. For each value of a, the CWT yields values that drop with b, reflecting the fact that the frequency of f(t) increases with t. The five values of W(1, b) are small and very similar, while the five values of W(3, b) are larger


Figure Ans.48: An Inner Product for a = 1 and b = 2, 4, 6.

a   b=2        3          4             5             6
1   0.032512   0.000299   1.10923·10⁻⁶  2.73032·10⁻⁹  8.33866·10⁻¹¹
2   0.510418   0.212575   0.0481292     0.00626348    0.00048097
3   0.743313   0.629473   0.380634      0.173591      0.064264

Figure Ans.49: Fifteen Values and a Density Plot of W(a, b).


and differ more. This shows how scaling the wavelet up makes the CWT more sensitive to frequency changes in f(t).

5.3: Figure 5.11c shows these wavelets.

5.4: Figure Ans.50a shows a simple, 8×8 image with one diagonal line above the main diagonal. Figure Ans.50b,c shows the first two steps in its pyramid decomposition. It is obvious that the transform coefficients in the bottom-right subband (HH) indicate a diagonal artifact located above the main diagonal. It is also easy to see that subband LL is a low-resolution version of the original image.

(a)
12 16 12 12 12 12 12 12
12 12 16 12 12 12 12 12
12 12 12 16 12 12 12 12
12 12 12 12 16 12 12 12
12 12 12 12 12 16 12 12
12 12 12 12 12 12 16 12
12 12 12 12 12 12 12 16
12 12 12 12 12 12 12 12

(b)
14 12 12 12  4 0 0 0
12 14 12 12  0 4 0 0
12 14 12 12  0 4 0 0
12 12 14 12  0 0 4 0
12 12 14 12  0 0 4 0
12 12 12 14  0 0 0 4
12 12 12 14  0 0 0 4
12 12 12 12  0 0 0 0

(c)
13 13 12 12  2 2 0 0
12 13 13 12  0 2 2 0
12 12 13 13  0 0 2 2
12 12 12 13  0 0 0 2
 2  2  0  0  4 4 0 0
 0  2  2  0  0 4 4 0
 0  0  2  2  0 0 4 4
 0  0  0  2  0 0 0 4

Figure Ans.50: The Subband Decomposition of a Diagonal Line.

5.5: The average can easily be calculated. It turns out to be 131.375, which is exactly 1/8 of 1051. The reason the top-left transform coefficient is eight times the average is that the Matlab code that did the calculations uses √2 instead of 2 (see function individ(n) in Figure 5.22).

5.6: Figure Ans.51a–c shows the results of reconstruction from 3277, 1639, and 820 coefficients, respectively. Despite the heavy loss of wavelet coefficients, only a very small loss of image quality is noticeable. The number of wavelet coefficients is, of course, the same as the image resolution 128×128 = 16,384. Using 820 out of 16,384 coefficients corresponds to discarding 95% of the smallest of the transform coefficients (notice, however, that some of the coefficients were originally zero, so the actual loss may amount to less than 95%).

5.7: The Matlab code of Figure Ans.52 calculates W as the product of the three matrices A1, A2, and A3 and computes the 8×8 matrix of transform coefficients. Notice that the top-left value 131.375 is the average of all the 64 image pixels.

5.8: The vector x = (..., 1,−1, 1,−1, 1, ...) of alternating values is transformed by the lowpass filter H0 to a vector of all zeros.


Figure Ans.51: Three Lossy Reconstructions of the 128×128 Lena Image. (Parts (a), (b), and (c) correspond to nz = 3277, 1639, and 820 nonzero coefficients, respectively.)


clear
a1=[1/2 1/2 0 0 0 0 0 0; 0 0 1/2 1/2 0 0 0 0;
 0 0 0 0 1/2 1/2 0 0; 0 0 0 0 0 0 1/2 1/2;
 1/2 -1/2 0 0 0 0 0 0; 0 0 1/2 -1/2 0 0 0 0;
 0 0 0 0 1/2 -1/2 0 0; 0 0 0 0 0 0 1/2 -1/2];
% a1*[255; 224; 192; 159; 127; 95; 63; 32];
a2=[1/2 1/2 0 0 0 0 0 0; 0 0 1/2 1/2 0 0 0 0;
 1/2 -1/2 0 0 0 0 0 0; 0 0 1/2 -1/2 0 0 0 0;
 0 0 0 0 1 0 0 0; 0 0 0 0 0 1 0 0;
 0 0 0 0 0 0 1 0; 0 0 0 0 0 0 0 1];
a3=[1/2 1/2 0 0 0 0 0 0; 1/2 -1/2 0 0 0 0 0 0;
 0 0 1 0 0 0 0 0; 0 0 0 1 0 0 0 0;
 0 0 0 0 1 0 0 0; 0 0 0 0 0 1 0 0;
 0 0 0 0 0 0 1 0; 0 0 0 0 0 0 0 1];
w=a3*a2*a1;
dim=8; fid=fopen('8x8','r');
img=fread(fid,[dim,dim])'; fclose(fid);
w*img*w' % Result of the transform

131.375  4.250 −7.875 −0.125 −0.25 −15.5  0 −0.25
  0      0      0      0      0     0     0  0
  0      0      0      0      0     0     0  0
  0      0      0      0      0     0     0  0
 12.000 59.875 39.875 31.875 15.75 32.0  16 15.75
 12.000 59.875 39.875 31.875 15.75 32.0  16 15.75
 12.000 59.875 39.875 31.875 15.75 32.0  16 15.75
 12.000 59.875 39.875 31.875 15.75 32.0  16 15.75

Figure Ans.52: Code and Results for the Calculation of Matrix W and Transform W·I·W^T.

5.9: For these filters, rules 1 and 2 imply

h0(0)² + h0(1)² + h0(2)² + h0(3)² + h0(4)² + h0(5)² + h0(6)² + h0(7)² = 1,
h0(0)h0(2) + h0(1)h0(3) + h0(2)h0(4) + h0(3)h0(5) + h0(4)h0(6) + h0(5)h0(7) = 0,
h0(0)h0(4) + h0(1)h0(5) + h0(2)h0(6) + h0(3)h0(7) = 0,
h0(0)h0(6) + h0(1)h0(7) = 0,

and rules 3–5 yield

f0 = (h0(7), h0(6), h0(5), h0(4), h0(3), h0(2), h0(1), h0(0)),
h1 = (−h0(7), h0(6), −h0(5), h0(4), −h0(3), h0(2), −h0(1), h0(0)),
f1 = (h0(0), −h0(1), h0(2), −h0(3), h0(4), −h0(5), h0(6), −h0(7)).

The eight coefficients are listed in Table 5.35 (this is the Daubechies D8 filter).

5.10: Figure Ans.53 lists the Matlab code of the inverse wavelet transform function iwt1(wc,coarse,filter) and a test.


function dat=iwt1(wc,coarse,filter)
% Inverse Discrete Wavelet Transform
dat=wc(1:2^coarse);
n=length(wc); j=log2(n);
for i=coarse:j-1
 dat=ILoPass(dat,filter)+ ...
  IHiPass(wc((2^(i)+1):(2^(i+1))),filter);
end

function f=ILoPass(dt,filter)
f=iconv(filter,AltrntZro(dt));

function f=IHiPass(dt,filter)
f=aconv(mirror(filter),rshift(AltrntZro(dt)));

function sgn=mirror(filt)
% return filter coefficients with alternating signs
sgn=-((-1).^(1:length(filt))).*filt;

function f=AltrntZro(dt)
% returns a vector of length 2*n with zeros
% placed between consecutive values
n=length(dt)*2; f=zeros(1,n);
f(1:2:(n-1))=dt;

Figure Ans.53: Code for the 1D Inverse Discrete Wavelet Transform.

A simple test of iwt1 is

n=16; t=(1:n)./n;
dat=sin(2*pi*t)
filt=[0.4830 0.8365 0.2241 -0.1294];
wc=fwt1(dat,1,filt)
rec=iwt1(wc,1,filt)

5.11: Figure Ans.54 shows the result of blurring the “lena” image. Parts (a) and (b) show the logarithmic multiresolution tree and the subband structure, respectively. Part (c) shows the results of the quantization. The transform coefficients of subbands 5–7 have been divided by two, and all the coefficients of subbands 8–13 have been cleared. We can say that the blurred image of part (d) has been reconstructed from the coefficients of subbands 1–4 (1/64th of the total number of transform coefficients) and half of the coefficients of subbands 5–7 (half of 3/64, or 3/128). On average, the image has been reconstructed from 5/128 ≈ 0.039 or 3.9% of the transform coefficients. Notice that the Daubechies D8 filter was used in the calculations. Readers are encouraged to use this code and experiment with the performance of other filters.

5.12: They are written in the form a-=b/2; b+=a;.
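A minimal Matlab check that these two steps exactly undo the forward pair (assumed here to be b-=a; a+=b/2), even with integer (floor) division:

% Integer lifting and its inverse (a sketch; the forward pair
% b-=a; a+=b/2 is an assumption).
a = 112; b = 97;
b = b - a;              % forward: detail
a = a + floor(b/2);     % forward: average
a = a - floor(b/2);     % inverse, as in the answer: a-=b/2;
b = b + a;              %                            b+=a;
[a b]                   % prints 112 97, the original pair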


Figure Ans.54: Blurring as a Result of Coarse Quantization. (Parts (a) and (b) show the multiresolution tree of H0/H1 filters with ↓2 decimators and the 13-subband structure; parts (c) and (d) show the quantized coefficients and the blurred reconstruction.)

clear, colormap(gray);
filename='lena128'; dim=128;
fid=fopen(filename,'r');
img=fread(fid,[dim,dim])';
filt=[0.23037,0.71484,0.63088,-0.02798, ...
 -0.18703,0.03084,0.03288,-0.01059];
fwim=fwt2(img,3,filt);
figure(1), imagesc(fwim), axis square
fwim(1:16,17:32)=fwim(1:16,17:32)/2;
fwim(1:16,33:128)=0;
fwim(17:32,1:32)=fwim(17:32,1:32)/2;
fwim(17:32,33:128)=0;
fwim(33:128,:)=0;
figure(2), colormap(gray), imagesc(fwim)
rec=iwt2(fwim,3,filt);
figure(3), colormap(gray), imagesc(rec)

Code for Figure Ans.54.


5.13: We sum Equation (5.13) over all the values of l to get

Σ_{l=0}^{2^(j−1)−1} s_{j−1,l} = Σ_{l=0}^{2^(j−1)−1} (s_{j,2l} + d_{j−1,l}/2)
 = (1/2) Σ_{l=0}^{2^(j−1)−1} (s_{j,2l} + s_{j,2l+1}) = (1/2) Σ_{l=0}^{2^j−1} s_{j,l}.   (Ans.1)

Therefore, the average of set s_{j−1} equals

(1/2^(j−1)) Σ_{l=0}^{2^(j−1)−1} s_{j−1,l} = (1/2^(j−1))·(1/2) Σ_{l=0}^{2^j−1} s_{j,l} = (1/2^j) Σ_{l=0}^{2^j−1} s_{j,l},

the average of set s_j.
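A quick numeric confirmation in Matlab (the eight sample values are arbitrary):

% The coarse set s_{j-1} has the same average as s_j (a sketch).
s = [112 97 85 99 114 120 77 80];    % s_j
d = s(2:2:end) - s(1:2:end);         % details d_{j-1,l}
c = s(1:2:end) + d/2;                % s_{j-1,l} = s_{j,2l} + d_{j-1,l}/2
[mean(c) mean(s)]                    % both print 98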

5.14: The code of Figure Ans.55 produces the expression

0.0117P1 − 0.0977P2 + 0.5859P3 + 0.5859P4 − 0.0977P5 + 0.0117P6.

Clear[p,a,b,c,d,e,f];
p[t_]:=a t^5+b t^4+c t^3+d t^2+e t+f;
Solve[{p[0]==p1, p[1/5.]==p2, p[2/5.]==p3,
 p[3/5.]==p4, p[4/5.]==p5, p[1]==p6}, {a,b,c,d,e,f}];
sol=ExpandAll[Simplify[%]];
Simplify[p[0.5] /.sol]

Figure Ans.55: Code for a Degree-5 Interpolating Polynomial.

5.15: The Matlab code of Figure Ans.56 does that and produces the transformed integer vector y = (111, −1, 84, 0, 120, 25, 84, 3). The inverse transform generates vector z that is identical to the original data x. Notice how the detail coefficients are much smaller than the weighted averages. Notice also that Matlab arrays are indexed from 1, whereas the discussion in the text assumes arrays indexed from 0. This causes the difference in index values in Figure Ans.56.

5.16: For the case MC = 3, the first six images g0 through g5 will have dimensions

(3·2⁵ + 1)×(4·2⁵ + 1) = 97×129, 49×65, 25×33, 13×17, and 7×9.

5.17: In the sorting pass of the third iteration the encoder transmits the number l = 3 (the number of coefficients ci,j in our example that satisfy 2¹² ≤ |ci,j| < 2¹³), followed by the three pairs of coordinates (3, 3), (4, 2), and (4, 1) and by the signs of the three coefficients. In the refinement step it transmits the six bits cdefgh. These are the 13th most significant bits of the coefficients transmitted in all the previous iterations.


clear; N=8; k=N/2;
x=[112,97,85,99,114,120,77,80];
% Forward IWT into y
for i=0:k-2, y(2*i+2)=x(2*i+2)-floor((x(2*i+1)+x(2*i+3))/2); end;
y(N)=x(N)-x(N-1);
y(1)=x(1)+floor(y(2)/2);
for i=1:k-1, y(2*i+1)=x(2*i+1)+floor((y(2*i)+y(2*i+2))/4); end;
% Inverse IWT into z
z(1)=y(1)-floor(y(2)/2);
for i=1:k-1, z(2*i+1)=y(2*i+1)-floor((y(2*i)+y(2*i+2))/4); end;
% uses z (not x), so the decoder needs no original data
for i=0:k-2, z(2*i+2)=y(2*i+2)+floor((z(2*i+1)+z(2*i+3))/2); end;
z(N)=y(N)+z(N-1);

Figure Ans.56: Matlab Code for Forward and Inverse IWT.

The information received so far enables the decoder to further improve the 16 approximate coefficients. The first nine become

c2,3 = s1ac0...0,   c3,4 = s1bd0...0,   c3,2 = s01e00...0,
c4,4 = s01f00...0,  c1,2 = s01g00...0,  c3,1 = s01h00...0,
c3,3 = s0010...0,   c4,2 = s0010...0,   c4,1 = s0010...0,

and the remaining seven are not changed.

5.18: The simple equation 10×2²⁰×8 = (500x)×(500x)×8 is solved to yield x² = 40 square inches. If the card is square, it is approximately 6.32 inches on a side. Such a card has 10 rolled impressions (about 1.5×1.5 each), two plain impressions of the thumbs (about 0.875×1.875 each), and simultaneous impressions of both hands (about 3.125×1.875 each). All the dimensions are in inches.

5.19: The bit of 10 is encoded, as usual, in pass 2. The bit of 1 is encoded in pass 1 since this coefficient is still insignificant but has significant neighbors. This bit is 1, so coefficient 1 becomes significant (a fact that is not used later). Also, this bit is the first 1 of this coefficient, so the sign bit of the coefficient is encoded following this bit. The bits of coefficients 3 and −7 are encoded in pass 2 since these coefficients are significant.

6.1: It is easy to calculate that 525× 4/3 = 700 pixels.


6.2: The vertical height of the picture on the author’s 27 in. television set is 16 in., which translates to a viewing distance of 7.12×16 = 114 in. or about 9.5 feet. It is easy to see that individual scan lines are visible at any distance shorter than about 6 feet.

6.3: Three common examples are: (1) a surveillance camera, (2) an old, silent movie being restored and converted from film to video, and (3) a video presentation taken underwater.

6.4: The golden ratio φ ≈ 1.618 has traditionally been considered the aspect ratio that is most pleasing to the eye. This suggests that 1.77 is the better aspect ratio.

6.5: Imagine a camera panning from left to right. New objects will enter the field of view from the right all the time. A block on the right side of the frame may therefore contain objects that did not exist in the previous frame.

6.6: Since (4, 4) is at the center of the “+”, the value of s is halved, to 2. The next step searches the four blocks labeled 4, centered on (4, 4). Assuming that the best match is at (6, 4), the two blocks labeled 5 are searched. Assuming that (6, 4) is the best match, s is halved to 1, and the eight blocks labeled 6 are searched. The diagram shows that the best match is finally found at location (7, 4).

6.7: This figure consists of 18×18 macroblocks, and each macroblock constitutes six 8×8 blocks of samples. The total number of samples is therefore 18×18×6×64 = 124,416.

6.8: The size category of zero is 0, so code 100 is emitted, followed by zero bits. The size category of 4 is 3, so code 110 is first emitted, followed by the three least-significant bits of 4, which are 100.

6.9: The zigzag sequence is

118, 2, 0, −2, 0, ..., 0 (thirteen consecutive zeros), −1, 0, ....

The run-level pairs are (0, 2), (1, −2), and (13, −1), so the final codes are (notice the sign bits following the run-level codes)

0100 0|000110 1|00100000 1|10

(without the vertical bars).

6.10: There are no nonzero coefficients, no run-level codes, just the 2-bit EOB code. However, in nonintra coding, such a block is encoded in a special way.

7.1: An average book may have 60 characters per line, 45 lines per page, and 400 pages. This comes to 60×45×400 = 1,080,000 characters, requiring one byte of storage each.


7.2: The period of a wave is its speed divided by its frequency. For sound we get

34,380 (cm/s) / 22,000 Hz = 1.562 cm  and  34,380/20 = 1,719 cm.

7.3: The (base-10) logarithm of x is a number y such that 10^y = x. The number 2 is the logarithm of 100 since 10² = 100. Similarly, 0.3 is the logarithm of 2 since 10^0.3 = 2. Also, the base-b logarithm of x is a number y such that b^y = x (for any real b > 1).

7.4: Each doubling of the sound intensity increases the dB level by 3. Therefore, the difference of 9 dB (3 + 3 + 3) between A and B corresponds to three doublings of the sound intensity. Thus, source B is 2·2·2 = 8 times louder than source A.

7.5: Each 0 would result in silence and each sample of 1, in the same tone. The result would be a nonuniform buzz. Such sounds were common on early personal computers.

7.6: Such an experiment should be repeated with several persons, preferably of different ages. The person should be placed in a sound insulated chamber, and a pure tone of frequency f should be played. The amplitude of the tone should be gradually increased from zero until the person can just barely hear it. If this happens at a decibel value d, point (d, f) should be plotted. This should be repeated for many frequencies until a graph similar to Figure 7.5a is obtained.

7.7: We first select identical items. If all s(t − i) equal s, Equation (7.7) yields the same s. Next, we select values on a straight line. Given the four values a, a + 2, a + 4, and a + 6, Equation (7.7) yields a + 8, the next linear value. Finally, we select values roughly equally-spaced on a circle. The y coordinates of points on the first quadrant of a circle can be computed by y = √(r² − x²). We select the four points with x coordinates 0, 0.08r, 0.16r, and 0.24r, compute their y coordinates for r = 10, and substitute them in Equation (7.7). The result is 9.96926, compared to the actual y coordinate for x = 0.32r, which is √(r² − (0.32r)²) = 9.47418, a difference of about 5%. The code that did the computations is shown in Figure Ans.57.

(* Points on a circle. Used in exercise to check
 4th-order prediction in FLAC *)
r = 10;
ci[x_] := Sqrt[100 - x^2];
ci[0.32r]
4ci[0] - 6ci[0.08r] + 4ci[0.16r] - ci[0.24r]

Figure Ans.57: Code for Checking 4th-Order Prediction.


7.8: Imagine that the sound being compressed contains one second of a pure tone (just one frequency). This second will be digitized to 44,100 consecutive samples per channel. The samples indicate amplitudes, so they don’t have to be the same. However, after filtering, only one subband (although in practice perhaps two subbands) will have nonzero signals. All the other subbands correspond to different frequencies, so they will have signals that are either zero or very close to zero.

7.9: Assuming that a noise level P1 translates to x decibels,

20 log(P1/P2) = x dB SPL,

results in the relation

20 log(∛2·P1/P2) = 20[log10 ∛2 + log(P1/P2)] = 20(0.1 + x/20) = x + 2.

Thus, increasing the sound level by a factor of ∛2 increases the decibel level by 2 dB SPL.

7.10: For a sampling rate of 44,100 samples/sec, the calculations are similar. The decoder has to decode 44,100/384 ≈ 114.84 frames per second. Thus, each frame has to be decoded in approximately 8.7 ms. In order to output 114.84 frames in 64,000 bits, each frame must have Bf = 557 bits available to encode it. Thus, the number of slots per frame is 557/32 ≈ 17.41. The last (18th) slot is not full and has to be padded.

7.11: Table 7.58 shows that the scale factor is 111 and the select information is 2. The third rule in Table 7.59 shows that a scfsi of 2 means that only one scale factor was coded, occupying just six bits in the compressed output. The decoder assigns these six bits as the values of all three scale factors.

7.12: Typical layer II parameters are (1) a sampling rate of 48,000 samples/sec, (2) a bitrate of 64,000 bits/sec, and (3) 1,152 quantized signals per frame. The decoder has to decode 48,000/1152 = 41.66 frames per second. Thus, each frame has to be decoded in 24 ms. In order to output 41.66 frames in 64,000 bits, each frame must have Bf = 1,536 bits available to encode it.
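The arithmetic of this and of answer 7.10 is a one-screen Matlab computation (a sketch):

% Frames per second and bits per frame for MPEG audio (a sketch).
rate = 48000; bitrate = 64000; signals = 1152;   % layer II
fps = rate/signals      % 41.66 frames/s
ms  = 1000/fps          % 24 ms to decode each frame
Bf  = bitrate/fps       % 1,536 bits per frame
rate = 44100; signals = 384;                     % layer I (answer 7.10)
fps = rate/signals      % 114.84 frames/s
Bf  = bitrate/fps       % about 557 bits; 557/32 = 17.41 slots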

7.13: A program to play .mp3 files is an MPEG layer III decoder, not an encoder. Decoding is much simpler since it does not use a psychoacoustic model, nor does it have to anticipate preechoes and maintain the bit reservoir.

8.1: Because the original string S can be reconstructed from L but not from F.

8.2: A direct application of Equation (8.1) eight more times produces:

S[10−1−2] = L[T²[I]] = L[T[T¹[I]]] = L[T[7]] = L[6] = i;
S[10−1−3] = L[T³[I]] = L[T[T²[I]]] = L[T[6]] = L[2] = m;
S[10−1−4] = L[T⁴[I]] = L[T[T³[I]]] = L[T[2]] = L[3] = ␣;
S[10−1−5] = L[T⁵[I]] = L[T[T⁴[I]]] = L[T[3]] = L[0] = s;
S[10−1−6] = L[T⁶[I]] = L[T[T⁵[I]]] = L[T[0]] = L[4] = s;
S[10−1−7] = L[T⁷[I]] = L[T[T⁶[I]]] = L[T[4]] = L[5] = i;
S[10−1−8] = L[T⁸[I]] = L[T[T⁷[I]]] = L[T[5]] = L[1] = w;
S[10−1−9] = L[T⁹[I]] = L[T[T⁸[I]]] = L[T[1]] = L[9] = s;

The original string swiss␣miss is indeed reproduced in S from right to left.

8.3: Figure Ans.58 shows the rotations of S and the sorted matrix. The last column, L, of Ans.58b happens to be identical to S, so S=L=sssssssssh. Since A=(s,h), a move-to-front compression of L yields C = (1, 0, 0, 0, 0, 0, 0, 0, 0, 1). Since C contains just the two values 0 and 1, they can serve as their own Huffman codes, so the final result is 1000000001, 1 bit per character!

sssssssssh    hsssssssss
sssssssshs    shssssssss
ssssssshss    sshsssssss
sssssshsss    ssshssssss
ssssshssss    sssshsssss
sssshsssss    ssssshssss
ssshssssss    sssssshsss
sshsssssss    ssssssshss
shssssssss    sssssssshs
hsssssssss    sssssssssh

(a)           (b)

Figure Ans.58: Permutations of “sssssssssh”.
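A short Matlab sketch of the move-to-front step (the list is assumed to start as the sorted alphabet hs):

% Move-to-front coding of L = 'sssssssssh' (a sketch).
L = 'sssssssssh';
list = 'hs';                      % sorted alphabet
C = zeros(1, numel(L));
for i = 1:numel(L)
  j = find(list == L(i));         % 1-based position of the symbol
  C(i) = j - 1;                   % emit a 0-based index
  list = [list(j), list([1:j-1, j+1:end])];   % move symbol to front
end
C                                 % prints 1 0 0 0 0 0 0 0 0 1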

8.4: The encoder starts at T[0], which contains 5. The first element of L is thus the last symbol of permutation 5. This permutation starts at position 5 of S, so its last element is in position 4. The encoder thus has to go through symbols S[T[i−1]] for i = 0, ..., n − 1, where the notation i − 1 should be interpreted cyclically (i.e., 0 − 1 should be n − 1). As each symbol S[T[i−1]] is found, it is compressed using move-to-front. The value of I is the position where T contains 0. In our example, T[8]=0, so I=8.

8.5: The first element of a triplet is the distance between two dictionary entries, the one best matching the content and the one best matching the context. In this case there is no content match, no distance, so any number could serve as the first element, 0 being the best (smallest) choice.

8.6: Because the three lines are sorted in ascending order. The bottom two lines of Table 8.13c are not in sorted order. This is why the zz...z part of string S must be preceded and followed by complementary bits.

8.7: The encoder places S between two entries of the sorted associative list and writes the (encoded) index of the entry above or below S on the compressed stream. The fewer the number of entries, the smaller this index, and the better the compression.


8.8: Context 5 is compared to the three remaining contexts 6, 7, and 8, and it is most similar to context 6 (they share a suffix of “b”). Context 6 is compared to 7 and 8 and, since they don’t share any suffix, context 7, the shorter of the two, is selected. The remaining context 8 is, of course, the last one in the ranking. The final context ranking is

1 → 3 → 4 → 0 → 5 → 6 → 7 → 8.

8.9: Equation (8.3) shows that the third “a” is assigned rank 1 and the “b” and “a” following it are assigned ranks 2 and 3, respectively.

8.10: Table Ans.59 shows the sorted contexts. Equation (Ans.2) shows the context ranking at each step:

0_u,
0_u → 2_b,
1_l → 3_b → 0_u,
0_u → 2_l → 3_a → 4_b,
2_l → 4_a → 1_d → 5_b → 0_u,
3_i → 5_a → 2_l → 6_b → 5_d → 0_u.     (Ans.2)

The final output is “u 2 b 3 l 4 a 5 d 6 i 6.” Notice that each of the distinct input symbols appears once in this output in raw format.

(a)        (b)         (c)
0 λ  u     0 λ  u      0 λ   u
1 u  x     1 ub x      1 ub  l
           2 u  b      2 ubl x
                       3 u   b

(d)          (e)           (f)
0 λ    u     0 λ     u     0 λ      u
1 ubla x     1 ubla  d     1 ubla   d
2 ub   l     2 ub    l     2 ub     l
3 ubl  a     3 ublad x     3 ublad  i
4 u    b     4 ubl   a     4 ubladi x
             5 u     b     5 ubl    a
                           6 u      b

Table Ans.59: Constructing the Sorted Lists for ubladiu.

8.11: All n1 bits of string L1 need be written on the output stream. This already shows that there is going to be no compression. String L2 consists of n1/k 1’s, so all of it has to be written on the output stream. String L3 similarly consists of n1/k² 1’s, and so on. Thus, the size of the output stream is

n1 + n1/k + n1/k² + n1/k³ + ··· + n1/k^m = n1·(k^(m+1) − 1)/(k^m (k − 1)),

for some value of m. The limit of this expression, when m → ∞, is n1·k/(k − 1). For k = 2 this equals 2n1. For larger values of k this limit is always between n1 and 2n1.

For the curious reader, here is how the sum above is computed. Given the series

S = Σ_{i=0}^{m} 1/k^i = 1 + 1/k + 1/k² + ··· + 1/k^(m−1) + 1/k^m,

we multiply both sides by 1/k,

S/k = 1/k + 1/k² + ··· + 1/k^m + 1/k^(m+1) = S + 1/k^(m+1) − 1,

and subtract:

S·(k − 1)/k = (k^(m+1) − 1)/k^(m+1)  →  S = (k^(m+1) − 1)/(k^m (k − 1)).

8.12: The input stream consists of:
1. A run of three zero groups, coded as 10|1 since 3 is in second position in class 2.
2. The nonzero group 0100, coded as 111100.
3. Another run of three zero groups, again coded as 10|1.
4. The nonzero group 1000, coded as 01100.
5. A run of four zero groups, coded as 010|00 since 4 is in first position in class 3.
6. 0010, coded as 111110.
7. A run of two zero groups, coded as 10|0.

The output is thus the 31-bit string 1011111001010110001000111110100.

8.13: The input stream consists of:
1. A run of three zero groups, coded as R2R1 or 101|11.
2. The nonzero group 0100, coded as 00100.
3. Another run of three zero groups, again coded as 101|11.
4. The nonzero group 1000, coded as 01000.
5. A run of four zero groups, coded as R4 = 1001.
6. 0010, coded as 00010.
7. A run of two zero groups, coded as R2 = 101.

The output is thus the 32-bit string 10111001001011101000100100010101.

8.14: The input stream consists of:
1. A run of three zero groups, coded as F3 or 1001.
2. The nonzero group 0100, coded as 00100.
3. Another run of three zero groups, again coded as 1001.
4. The nonzero group 1000, coded as 01000.
5. A run of four zero groups, coded as F3F1 = 1001|11.
6. 0010, coded as 00010.
7. A run of two zero groups, coded as F2 = 101.

The output is thus the 32-bit string 10010010010010100010011100010101.


8.15: Yes, if they are located in different quadrants or subquadrants. Pixels 123 and 301, for example, are adjacent in Figure 4.157 but have different prefixes.

8.16: No, because all prefixes have the same probability of occurrence. In our example the prefixes are four bits long and all 16 possible prefixes have the same probability because a pixel may be located anywhere in the image. A Huffman code constructed for 16 equally-probable symbols has an average size of four bits per symbol, so nothing would be gained. The same is true for suffixes.

8.17: This is possible, but it places severe limitations on the size of the string. In order to rearrange a one-dimensional string into a four-dimensional cube, the string size should be 2^(4n). If the string size happens to be 2^(4n) + 1, it has to be extended to 2^(4(n+1)), which increases its size by a factor of 16. It is possible to rearrange the string into a rectangular box, not just a cube, but then its size will have to be of the form 2^(n1)·2^(n2)·2^(n3)·2^(n4), where the four ni’s are integers.

8.18: The LZW algorithm, which starts with the entire alphabet stored at the beginning of its dictionary, is an example of such a method. However, an adaptive version of LZW can be designed to compress words instead of individual characters.

8.19: A better choice for the coordinates may be relative values (or offsets). Each (x, y) pair may specify the position of a character relative to its predecessor. This results in smaller numbers for the coordinates, and smaller numbers are easier to compress.

8.20: There may be such letters in other, “exotic” alphabets, but a more common example is a rectangular box enclosing text. The four rules that constitute such a box should be considered a mark, but the text characters inside the box should be identified as separate marks.

8.21: This guarantees that the two probabilities will add up to 1.

8.22: Figure Ans.60 shows how state A feeds into the new state D′ which, in turn, feeds into states E and F. Notice how states B and C haven’t changed. Since the new state D′ is identical to D, it is possible to feed A into either D or D′ (cloning can be done in two different but identical ways). The original counts of state D should now be divided between D and D′ in proportion to the counts of the transitions A → D and B, C → D.

8.23: Figure Ans.61 shows the new state 6 after the operation 1, 1 → 6. Its 1-output is identical to that of state 1, and its 0-output is a copy of the 0-output of state 3.

8.24: A precise answer requires many experiments with various data files. A little thinking, though, shows that the larger k, the better the initial model that is created when the old one is discarded. Larger values of k thus minimize the loss of compression. However, very large values may produce an initial model that is already large and cannot grow much. The best value for k is therefore one that produces an initial model large enough to provide information about recent correlations in the data, but small enough so it has room to grow before it too has to be discarded.


Figure Ans.60: New State D′ Cloned.

Figure Ans.61: State 6 Added.

8.25: The number of marked points can be written 8(1 + 2 + 3 + 5 + 8 + 13) = 256, and the numbers in parentheses are the Fibonacci numbers.

8.26: The conditional probability P(Di|Di) is very small. A segment pointing in direction Di can be preceded by another segment pointing in the same direction only if the original curve is straight or very close to straight for more than 26 coordinate units (half the width of grid S13).

8.27: We observe that a point has two coordinates. If each coordinate occupies eight bits, then the use of Fibonacci numbers reduces the 16-bit coordinates to an 8-bit number, a compression ratio of 0.5. The use of Huffman codes can typically reduce this 8-bit number to (on average) a 4-bit code, and the use of the Markov model can perhaps cut this by another bit. The result is an estimated compression ratio of 3/16 = 0.1875. If each coordinate is a 16-bit number, then this ratio improves to 3/32 = 0.09375.

8.28: The resulting, shorter grammar is shown in Figure Ans.62. It is one rule and one symbol shorter.

Input:                   Grammar:
S → abcdbcabcdbc         S → CC
                         A → bc
                         C → aAdA

Figure Ans.62: Improving the Grammar of Figure 8.41.


8.29: Because generating rule C has made rule B underused (i.e., used just once).

8.30: Rule S consists of two copies of rule A. The first time rule A is encountered, its contents aBdB are sent. This involves sending rule B twice. The first time rule B is sent, its contents bc are sent (and the decoder does not know that the string bc it is receiving is the contents of a rule). The second time rule B is sent, the pair (1, 2) is sent (offset 1, count 2). The decoder identifies the pair and uses it to set up the rule 1 → bc. Sending the first copy of rule A therefore amounts to sending abcd(1, 2). The second copy of rule A is sent as the pair (0, 4) since A starts at offset 0 in S and its length is 4. The decoder identifies this pair and uses it to set up the rule 2 → a1d1. The final result is therefore abcd(1, 2)(0, 4).

8.31: In each of these cases, the encoder removes one edge from the boundary and inserts two new edges. There is a net gain of one edge.

8.32: They create triangles (18, 2, 3) and (18, 3, 4), and reduce the boundary to the sequence of vertices

(4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18).

A problem can be found for almost every solution.
—Unknown

Bibliography

All URLs have been checked and updated as of late July 2006. Any broken links reported to the author will be added to the errata list in the book’s Web site.

The main event in the life of the data compression community is the annual data compression conference (DCC, see Joining the Data Compression Community) whose proceedings are published by the IEEE. The editors have traditionally been James Andrew Storer and Martin Cohn. Instead of listing many references that differ only by year, we start this bibliography with a generic reference to the DCC, where “XX” is the last two digits of the conference year.

Storer, James A., and Martin Cohn (eds.) (annual) DCC ’XX: Data Compression Conference, Los Alamitos, CA, IEEE Computer Society Press.

3R (2006) is http://f-cpu.seul.org/whygee/ddj-3r/ddj-3R.html.

7z (2006) is http://www.7-zip.org/sdk.html.

Abramson, N. (1963) Information Theory and Coding, New York, McGraw-Hill.

Abousleman, Glen P. (2006) “Coding of Hyperspectral Imagery with Trellis-Coded Quantization,” in G. Motta, F. Rizzo, and J. A. Storer, editors, Hyperspectral Data Compression, New York, Springer Verlag.

acronyms (2006) is
http://acronyms.thefreedictionary.com/Refund-Anticipated+Return.

adaptiv9 (2006) is http://www.weizmann.ac.il/matlab/toolbox/filterdesign/ file adaptiv9.html.

Adelson, E. H., E. Simoncelli, and R. Hingorani (1987) “Orthogonal Pyramid Transforms for Image Coding,” Proceedings SPIE, vol. 845, Cambridge, MA, pp. 50–58, October.

adobepdf (2006) is http://www.adobe.com/products/acrobat/adobepdf.html.

afb (2006) is http://www.afb.org/prodProfile.asp?ProdID=42.


Ahmed, N., T. Natarajan, and R. K. Rao (1974) “Discrete Cosine Transform,” IEEE Transactions on Computers, C-23:90–93.

Akansu, Ali, and R. Haddad (1992) Multiresolution Signal Decomposition, San Diego, CA, Academic Press.

Anderson, K. L., et al. (1987) “Binary-Image-Manipulation Algorithm in the Image View Facility,” IBM Journal of Research and Development, 31(1):16–31, January.

Anedda, C. and L. Felician (1988) “P-Compressed Quadtrees for Image Storing,” The Computer Journal, 31(4):353–357.

ATSC (2006) is http://www.atsc.org/standards/a_52b.pdf.

ATT (1996) is http://www.djvu.att.com/.

Baker, Brenda, Udi Manber, and Robert Muth (1999) “Compressing Differences of Executable Code,” in ACM SIGPLAN Workshop on Compiler Support for System Software (WCSSS ’99).

Banister, Brian, and Thomas R. Fischer (1999) “Quadtree Classification and TCQ Image Coding,” in Storer, James A., and Martin Cohn (eds.) (1999) DCC ’99: Data Compression Conference, Los Alamitos, CA, IEEE Computer Society Press, pp. 149–157.

Barnsley, M. F., and Sloan, A. D. (1988) “A Better Way to Compress Images,” Byte Magazine, pp. 215–222, January.

Barnsley, M. F. (1988) Fractals Everywhere, New York, Academic Press.

Bass, Thomas A. (1992) Eudaemonic Pie, New York, Penguin Books.

Bentley, J. L. et al. (1986) “A Locally Adaptive Data Compression Algorithm,” Communications of the ACM, 29(4):320–330, April.

Blackstock, Steve (1987) “LZW and GIF Explained,” available from
http://www.ece.uiuc.edu/~ece291/class-resources/gpe/gif.txt.html.

Bloom, Charles R. (1996) “LZP: A New Data Compression Algorithm,” in Proceedings of Data Compression Conference, J. Storer, editor, Los Alamitos, CA, IEEE Computer Society Press, p. 425.

Bloom, Charles R. (1998) “Solving the Problems of Context Modeling,” available for ftp from http://www.cbloom.com/papers/ppmz.zip.

BOCU (2001) is http://oss.software.ibm.com/icu/docs/papers/binary_ordered_compression_for_unicode.html.

BOCU-1 (2002) is http://www.unicode.org/notes/tn6/.

Born, Gunter (1995) The File Formats Handbook, London, New York, International Thomson Computer Press.

Bosi, Marina, and Richard E. Goldberg (2003) Introduction to Digital Audio Coding and Standards, Boston, MA, Kluwer Academic.


Boussakta, Said, and Hamoud O. Alshibami (2004) “Fast Algorithm for the 3-D DCT-II,” IEEE Transactions on Signal Processing, 52(4).

Bradley, Jonathan N., Christopher M. Brislawn, and Tom Hopper (1993) “The FBI Wavelet/Scalar Quantization Standard for Grayscale Fingerprint Image Compression,” Proceedings of Visual Information Processing II, Orlando, FL, SPIE vol. 1961, pp. 293–304, April.

Brandenburg, Karlheinz, and Gerhard Stoll (1994) “ISO-MPEG-1 Audio: A Generic Standard for Coding of High-Quality Digital Audio,” Journal of the Audio Engineering Society, 42(10):780–792, October.

Brandenburg, Karlheinz (1999) “MP3 and AAC Explained,” The AES 17th International Conference, Florence, Italy, Sept. 2–5. Available at http://www.cselt.it/mpeg/tutorials.htm.

Brislawn, Christopher, Jonathan Bradley, R. Onyshczak, and Tom Hopper (1996) “The FBI Compression Standard for Digitized Fingerprint Images,” in Proceedings SPIE, vol. 2847, Denver, CO, pp. 344–355, August.

BSDiff (2005) is http://www.daemonology.net/bsdiff/bsdiff-4.3.tar.gz.

Burrows, Michael, et al. (1992) On-line Data Compression in a Log-Structured File System, Digital, Systems Research Center, Palo Alto, CA.

Burrows, Michael, and D. J. Wheeler (1994) A Block-Sorting Lossless Data Compression Algorithm, Digital Systems Research Center Report 124, Palo Alto, CA, May 10.

Burt, Peter J., and Edward H. Adelson (1983) “The Laplacian Pyramid as a Compact Image Code,” IEEE Transactions on Communications, COM-31(4):532–540, April.

Buyanovsky, George (1994) “Associative Coding” (in Russian), Monitor, Moscow, #8, 10–19, August. (Hard copies of the Russian source and English translation are available from the author of this book. Send requests to the author’s email address found in the Preface.)

Buyanovsky, George (2002) Private communications ([email protected]).

Cachin, Christian (1998) “An Information-Theoretic Model for Steganography,” in Proceedings of the Second International Workshop on Information Hiding, D. Aucsmith, ed., vol. 1525 of Lecture Notes in Computer Science, Berlin, Springer-Verlag, pp. 306–318.

Calgary (2006) is ftp://ftp.cpsc.ucalgary.ca/pub/projects/ file text.compression.corpus.

Campos, Arturo San Emeterio (2006) Range coder, in http://www.arturocampos.com/ac_range.html.

Canterbury (2006) is http://corpus.canterbury.ac.nz.

Capon, J. (1959) “A Probabilistic Model for Run-length Coding of Pictures,” IEEE Transactions on Information Theory, 5(4):157–163, December.

Carpentieri, B., M. J. Weinberger, and G. Seroussi (2000) “Lossless Compression of Continuous-Tone Images,” Proceedings of the IEEE, 88(11):1797–1809, November.


Chaitin, Gregory J. (1977) “Algorithmic Information Theory,” IBM Journal of Research and Development, 21:350–359, July.

Chaitin, Gregory J. (1997) The Limits of Mathematics, Singapore, Springer-Verlag.

Chomsky, N. (1956) “Three Models for the Description of Language,” IRE Transactions on Information Theory, 2(3):113–124.

Cleary, John G., and I. H. Witten (1984) “Data Compression Using Adaptive Coding and Partial String Matching,” IEEE Transactions on Communications, COM-32(4):396–402, April.

Cleary, John G., W. J. Teahan, and Ian H. Witten (1995) “Unbounded Length Contexts for PPM,” Data Compression Conference, 1995, pp. 52–61.

Cleary, John G., and W. J. Teahan (1997) “Unbounded Length Contexts for PPM,” The Computer Journal, 40(2/3):67–75.

Cole, A. J. (1985) “A Note on Peano Polygons and Gray Codes,” International Journal of Computer Mathematics, 18:3–13.

Cole, A. J. (1986) “Direct Transformations Between Sets of Integers and Hilbert Polygons,” International Journal of Computer Mathematics, 20:115–122.

Constantinescu, C., and J. A. Storer (1994a) “Online Adaptive Vector Quantization with Variable Size Codebook Entries,” Information Processing and Management, 30(6):745–758.

Constantinescu, C., and J. A. Storer (1994b) “Improved Techniques for Single-Pass Adaptive Vector Quantization,” Proceedings of the IEEE, 82(6):933–939, June.

Constantinescu, C., and R. Arps (1997) “Fast Residue Coding for Lossless Textual Image Compression,” in Proceedings of the 1997 Data Compression Conference, J. Storer, ed., Los Alamitos, CA, IEEE Computer Society Press, pp. 397–406.

Cormack, G. V., and R. N. S. Horspool (1987) “Data Compression Using Dynamic Markov Modelling,” The Computer Journal, 30(6):541–550.

Cormen, Thomas H., Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein (2001) Introduction to Algorithms, 2nd edition, MIT Press and McGraw-Hill.

corr.pdf (2002) is http://www.davidsalomon.name/DC2advertis/Corr.pdf.

CRC (1998) Soulodre, G. A., T. Grusec, M. Lavoie, and L. Thibault, “Subjective Evaluation of State-of-the-Art 2-Channel Audio Codecs,” Journal of the Audio Engineering Society, 46(3):164–176, March.

CREW (2000) is http://www.crc.ricoh.com/CREW/.

Crocker, Lee Daniel (1995) “PNG: The Portable Network Graphic Format,” Dr. Dobb’s Journal of Software Tools, 20(7):36–44.

Culik, Karel II, and J. Kari (1993) “Image Compression Using Weighted Finite Automata,” Computer and Graphics, 17(3):305–313.


Culik, Karel II, and J. Kari (1994a) “Image-Data Compression Using Edge-Optimizing Algorithm for WFA Inference,” Journal of Information Processing and Management, 30(6):829–838.

Culik, Karel II, and Jarkko Kari (1994b) “Inference Algorithm for Weighted Finite Automata and Image Compression,” in Fractal Image Encoding and Compression, Y. Fisher, editor, New York, NY, Springer-Verlag.

Culik, Karel II, and Jarkko Kari (1995) “Finite State Methods for Compression and Manipulation of Images,” in DCC ’96, Data Compression Conference, J. Storer, editor, Los Alamitos, CA, IEEE Computer Society Press, pp. 142–151.

Culik, Karel II, and V. Valenta (1996) “Finite Automata Based Compression of Bi-Level Images,” in Storer, James A. (ed.), DCC ’96, Data Compression Conference, Los Alamitos, CA, IEEE Computer Society Press, pp. 280–289.

Culik, Karel II, and V. Valenta (1997a) “Finite Automata Based Compression of Bi-Level and Simple Color Images,” Computer and Graphics, 21:61–68.

Culik, Karel II, and V. Valenta (1997b) “Compression of Silhouette-like Images Based on WFA,” Journal of Universal Computer Science, 3:1100–1113.

Dasarathy, Belur V. (ed.) (1995) Image Data Compression: Block Truncation Coding (BTC) Techniques, Los Alamitos, CA, IEEE Computer Society Press.

Daubechies, Ingrid (1988) “Orthonormal Bases of Compactly Supported Wavelets,” Communications on Pure and Applied Mathematics, 41:909–996.

Deflate (2003) is http://www.gzip.org/zlib/.

della Porta, Giambattista (1558) Magia Naturalis, Naples, first edition, four volumes 1558, second edition, 20 volumes 1589. Translated by Thomas Young and Samuel Speed, Natural Magick by John Baptista Porta, a Neopolitane, London 1658.

Demko, S., L. Hodges, and B. Naylor (1985) “Construction of Fractal Objects with Iterated Function Systems,” Computer Graphics, 19(3):271–278, July.

DeVore, R., et al. (1992) “Image Compression Through Wavelet Transform Coding,” IEEE Transactions on Information Theory, 38(2):719–746, March.

Dewitte, J., and J. Ronson (1983) “Original Block Coding Scheme for Low Bit Rate Image Transmission,” in Signal Processing II: Theories and Applications—Proceedings of EUSIPCO 83, H. W. Schussler, ed., Amsterdam, Elsevier Science Publishers B. V. (North-Holland), pp. 143–146.

Dolby (2006) is http://www.dolby.com/.

donationcoder (2006) is http://www.donationcoder.com/Reviews/Archive/ArchiveTools/index.html.

Durbin, J. (1960) “The Fitting of Time-Series Models,” JSTOR: Revue de l’Institut International de Statistique, 28:233–344.

DVB (2006) is http://www.dvb.org/.


Ekstrand, Nicklas (1996) “Lossless Compression of Gray Images via Context Tree Weighting,” in Storer, James A. (ed.), DCC ’96: Data Compression Conference, Los Alamitos, CA, IEEE Computer Society Press, pp. 132–139, April.

Elias, P. (1975) “Universal Codeword Sets and Representations of the Integers,” IEEE Transactions on Information Theory, IT-21(2):194–203, March.

Faller, N. (1973) “An Adaptive System for Data Compression,” in Record of the 7th Asilomar Conference on Circuits, Systems, and Computers, pp. 593–597.

Fang, I. (1966) “It Isn’t ETAOIN SHRDLU; It’s ETAONI RSHDLC,” Journalism Quarterly, 43:761–762.

Feder, Jens (1988) Fractals, New York, Plenum Press.

Federal Bureau of Investigation (1993) WSQ Grayscale Fingerprint Image Compression Specification, ver. 2.0, Document #IAFIS-IC-0110v2, Criminal Justice Information Services, February.

Feig, E., and E. Linzer (1990) “Discrete Cosine Transform Algorithms for Image Data Compression,” in Proceedings Electronic Imaging ’90 East, pp. 84–87, Boston, MA.

Feldspar (2003) is http://www.zlib.org/feldspar.html.

Fenwick, P. (1996) Symbol Ranking Text Compression, Tech. Rep. 132, Dept. of Computer Science, University of Auckland, New Zealand, June.

Fenwick, Peter (1996a) “Punctured Elias Codes for variable-length coding of the integers,” Technical Report 137, Department of Computer Science, The University of Auckland, December. This is also available online.

Fiala, E. R., and D. H. Greene (1989) “Data Compression with Finite Windows,” Communications of the ACM, 32(4):490–505.

Fibonacci (1999) is file Fibonacci.html in http://www-groups.dcs.st-and.ac.uk/~history/References/.

FIPS197 (2001) Advanced Encryption Standard, FIPS Publication 197, November 26, 2001. Available from http://csrc.nist.gov/publications/fips/fips197/fips-197.pdf.

firstpr (2006) is http://www.firstpr.com.au/audiocomp/lossless/#rice.

Fisher, Yuval (ed.) (1995) Fractal Image Compression: Theory and Application, New York, Springer-Verlag.

flac.devices (2006) is http://flac.sourceforge.net/links.html#hardware.

flacID (2006) is http://flac.sourceforge.net/id.html.

Fox, E. A., et al. (1991) “Order Preserving Minimal Perfect Hash Functions and Information Retrieval,” ACM Transactions on Information Systems, 9(2):281–308.

Fraenkel, A. S., and S. T. Klein (1985) “Novel Compression of Sparse Bit-Strings—Preliminary Report,” in A. Apostolico and Z. Galil, eds., Combinatorial Algorithms on Words, Vol. 12, NATO ASI Series F:169–183, New York, Springer-Verlag.


Frank, Amalie J., J. D. Daniels, and Diane R. Unangst (1980) “Progressive Image Transmission Using a Growth-Geometry Coding,” Proceedings of the IEEE, 68(7):897–909, July.

Freeman, H. (1961) “On The Encoding of Arbitrary Geometric Configurations,” IRE Transactions on Electronic Computers, EC-10(2):260–268, June.

G131 (2006) ITU-T Recommendation G.131, Talker echo and its control.

G.711 (1972) is http://en.wikipedia.org/wiki/G.711.

Gallager, Robert G., and David C. van Voorhis (1975) “Optimal Source Codes for Geometrically Distributed Integer Alphabets,” IEEE Transactions on Information Theory, IT-21(3):228–230, March.

Gallager, Robert G. (1978) “Variations On a Theme By Huffman,” IEEE Transactions on Information Theory, IT-24(6):668–674, November.

Gardner, Martin (1972) “Mathematical Games,” Scientific American, 227(2):106, August.

Gersho, Allen, and Robert M. Gray (1992) Vector Quantization and Signal Compression, Boston, MA, Kluwer Academic Publishers.

Gharavi, H. (1987) “Conditional Run-Length and Variable-Length Coding of Digital Pictures,” IEEE Transactions on Communications, COM-35(6):671–677, June.

Gilbert, E. N., and E. F. Moore (1959) “Variable Length Binary Encodings,” Bell System Technical Journal, Monograph 3515, 38:933–967, July.

Gilbert, Jeffrey M., and Robert W. Brodersen (1998) “A Lossless 2-D Image Compression Technique for Synthetic Discrete-Tone Images,” in Proceedings of the 1998 Data Compression Conference, J. Storer, ed., Los Alamitos, CA, IEEE Computer Society Press, pp. 359–368, March. This is also available from URL http://bwrc.eecs.berkeley.edu/Publications/1999/A_lossless_2-D_image_compression_technique/JMG_DCC98.pdf.

Givens, Wallace (1958) “Computation of Plane Unitary Rotations Transforming a General Matrix to Triangular Form,” Journal of the Society for Industrial and Applied Mathematics, 6(1):26–50, March.

Golomb, Solomon W. (1966) “Run-Length Encodings,” IEEE Transactions on Information Theory, IT-12(3):399–401.

Gonzalez, Rafael C., and Richard E. Woods (1992) Digital Image Processing, Reading, MA, Addison-Wesley.

Gottlieb, D., et al. (1975) A Classification of Compression Methods and their Usefulness for a Large Data Processing Center, Proceedings of National Computer Conference, 44:453–458.

Gray, Frank (1953) “Pulse Code Communication,” United States Patent 2,632,058, March 17.


H.264Draft (2006) is ftp://standards.polycom.com/JVT_Site/draft_standard/ file JVT-G050r1.zip. This is the H.264 draft standard.

H.264PaperIR (2006) is http://www.vcodex.com/h264_transform.pdf (a paper by Iain Richardson).

H.264PaperRM (2006) is http://research.microsoft.com/~malvar/papers/, file MalvarCSVTJuly03.pdf (a paper by Rico Malvar).

H.264Standards (2006) is ftp://standards.polycom.com/JVT_Site/ (the H.264 standards repository).

H.264Transform (2006) is ftp://standards.polycom.com/JVT_Site/2002_01_Geneva/, files JVT-B038r2.doc and JVT-B039r2.doc (the H.264 integer transform).

h2g2 (2006) is http://www.bbc.co.uk/dna/h2g2/A406973.

Haffner, Patrick, et al. (1998) “High-Quality Document Image Compression with DjVu,” Journal of Electronic Imaging, 7(3):410–425, SPIE. This is also available from http://citeseer.nj.nec.com/bottou98high.html.

Hafner, Ullrich (1995) “Asymmetric Coding in (m)-WFA Image Compression,” Report 132, Department of Computer Science, University of Würzburg, December.

Hans, Mat, and R. W. Schafer (2001) “Lossless Compression of Digital Audio,” IEEE Signal Processing Magazine, 18(4):21–32, July.

Havas, G., et al. (1993) Graphs, Hypergraphs and Hashing, in Proceedings of the International Workshop on Graph-Theoretic Concepts in Computer Science (WG’93), Berlin, Springer-Verlag.

Heath, F. G. (1972) “Origins of the Binary Code,” Scientific American, 227(2):76, August.

Hilbert, D. (1891) “Ueber stetige Abbildung einer Linie auf ein Flächenstück,” Math. Annalen, 38:459–460.

Hirschberg, D., and D. Lelewer (1990) “Efficient Decoding of Prefix Codes,” Communications of the ACM, 33(4):449–459.

Horspool, N. R. (1991) “Improving LZW,” in Proceedings of the 1991 Data Compression Conference, J. Storer, ed., Los Alamitos, CA, IEEE Computer Society Press, pp. 332–341.

Horspool, N. R., and G. V. Cormack (1992) “Constructing Word-Based Text Compression Algorithms,” in Proceedings of the 1992 Data Compression Conference, J. Storer, ed., Los Alamitos, CA, IEEE Computer Society Press, pp. 62–71, April.

Horstmann (2006) is http://www.horstmann.com/bigj/help/windows/tutorial.html.

Howard, Paul G., and J. S. Vitter (1992a) “New Methods for Lossless Image Compression Using Arithmetic Coding,” Information Processing and Management, 28(6):765–779.

Howard, Paul G., and J. S. Vitter (1992b) “Error Modeling for Hierarchical Lossless Image Compression,” in Proceedings of the 1992 Data Compression Conference, J. Storer, ed., Los Alamitos, CA, IEEE Computer Society Press, pp. 269–278.


Howard, Paul G., and J. S. Vitter (1992c) “Practical Implementations of Arithmetic Coding,” in Image and Text Compression, J. A. Storer, ed., Norwell, MA, Kluwer Academic Publishers, pp. 85–112. Also available from URL http://www.cs.duke.edu/~jsv/Papers/catalog/node66.html.

Howard, Paul G., and J. S. Vitter (1993) “Fast and Efficient Lossless Image Compression,” in Proceedings of the 1993 Data Compression Conference, J. Storer, ed., Los Alamitos, CA, IEEE Computer Society Press, pp. 351–360.

Howard, Paul G., and J. S. Vitter (1994a) “Fast Progressive Lossless Image Compression,” Proceedings of the Image and Video Compression Conference, IS&T/SPIE 1994 Symposium on Electronic Imaging: Science & Technology, 2186, San Jose, CA, pp. 98–109, February.

Howard, Paul G., and J. S. Vitter (1994b) “Design and Analysis of Fast Text Compression Based on Quasi-Arithmetic Coding,” Journal of Information Processing and Management, 30(6):777–790. Also available from URL http://www.cs.duke.edu/~jsv/Papers/catalog/node70.html.

Huffman, David (1952) “A Method for the Construction of Minimum Redundancy Codes,” Proceedings of the IRE, 40(9):1098–1101.

Hunt, James W., and M. Douglas McIlroy (1976) “An Algorithm for Differential File Comparison,” Computing Science Technical Report No. 41, Murray Hill, NJ, Bell Labs, June.

Hunter, R., and A. H. Robinson (1980) “International Digital Facsimile Coding Standards,” Proceedings of the IEEE, 68(7):854–867, July.

hydrogenaudio (2006) is www.hydrogenaudio.org/forums/.

IA-32 (2006) is http://en.wikipedia.org/wiki/IA-32.

IBM (1988) IBM Journal of Research and Development, #6 (the entire issue).

IEEE754 (1985) ANSI/IEEE Standard 754-1985, “IEEE Standard for Binary Floating-Point Arithmetic.”

IMA (2006) is www.ima.org/.

ISO (1984) “Information Processing Systems-Data Communication High-Level Data Link Control Procedure-Frame Structure,” IS 3309, 3rd ed., October.

ISO (2003) is http://www.iso.ch/.

ISO/IEC (1993) International Standard IS 11172-3 “Information Technology, Coding of Moving Pictures and Associated Audio for Digital Storage Media at up to about 1.5 Mbits/s—Part 3: Audio.”

ISO/IEC (2000) International Standard IS 15444-1 “Information Technology—JPEG 2000 Image Coding System.” This is the FCD (final committee draft) version 1.0, 16 March 2000.


ISO/IEC (2003) International Standard ISO/IEC 13818-7, “Information technology, Generic coding of moving pictures and associated audio information, Part 7: Advanced Audio Coding (AAC),” 2nd ed., 2003-08-01.

ITU-R/BS1116 (1997) ITU-R, document BS 1116 “Methods for the Subjective Assessment of Small Impairments in Audio Systems Including Multichannel Sound Systems,” Rev. 1, Geneva.

ITU-T (1989) CCITT Recommendation G.711: “Pulse Code Modulation (PCM) of Voice Frequencies.”

ITU-T (1990) Recommendation G.726 (12/90), 40, 32, 24, 16 kbit/s Adaptive Differential Pulse Code Modulation (ADPCM).

ITU-T (1994) ITU-T Recommendation V.42, Revision 1 “Error-correcting Procedures for DCEs Using Asynchronous-to-Synchronous Conversion.”

ITU-T264 (2002) ITU-T Recommendation H.264, ISO/IEC 11496-10, “Advanced Video Coding,” Final Committee Draft, Document JVT-E022, September.

ITU/TG10 (1991) ITU-R document TG-10-2/3-E “Basic Audio Quality Requirements for Digital Audio Bit-Rate Reduction Systems for Broadcast Emission and Primary Distribution,” 28 October.

Jayant, N. (ed.) (1997) Signal Compression: Coding of Speech, Audio, Text, Image and Video, Singapore, World Scientific Publications.

JBIG (2003) is http://www.jpeg.org/jbighomepage.html.

JBIG2 (2003) is http://www.jpeg.org/jbigpt2.html.

JBIG2 (2006) is http://jbig2.com/.

Jordan, B. W., and R. C. Barrett (1974) “A Cell Organized Raster Display for Line Drawings,” Communications of the ACM, 17(2):70–77.

Joshi, R. L., V. J. Crump, and T. R. Fischer (1993) “Image Subband Coding Using Arithmetic and Trellis Coded Quantization,” IEEE Transactions on Circuits and Systems Video Technology, 5(6):515–523, December.

JPEG 2000 Organization (2000) is http://www.jpeg.org/JPEG2000.htm.

Kendall, Maurice G. (1961) A Course in the Geometry of n-Dimensions, New York, Hafner.

Kieffer, J., G. Nelson, and E-H. Yang (1996a) “Tutorial on the quadrisection method and related methods for lossless data compression.” Available at URL http://www.ece.umn.edu/users/kieffer/index.html.

Kieffer, J., E-H. Yang, G. Nelson, and P. Cosman (1996b) “Lossless compression via bisection trees,” at http://www.ece.umn.edu/users/kieffer/index.html.

Kleijn, W. B., and K. K. Paliwal (1995) Speech Coding and Synthesis, Elsevier, Amsterdam.


Knowlton, Kenneth (1980) “Progressive Transmission of Grey-Scale and Binary Pictures by Simple, Efficient, and Lossless Encoding Schemes,” Proceedings of the IEEE, 68(7):885–896, July.

Knuth, Donald E. (1973) The Art of Computer Programming, Vol. 1, 2nd Ed., Reading, MA, Addison-Wesley.

Knuth, Donald E. (1985) “Dynamic Huffman Coding,” Journal of Algorithms, 6:163–180.

Korn, D., et al. (2002) “The VCDIFF Generic Differencing and Compression Data Format,” RFC 3284, June 2002, available on the Internet as text file “rfc3284.txt”.

Krichevsky, R. E., and V. K. Trofimov (1981) “The Performance of Universal Coding,” IEEE Transactions on Information Theory, IT-27:199–207, March.

Lambert, Sean M. (1999) “Implementing Associative Coder of Buyanovsky (ACB) Data Compression,” M.S. thesis, Bozeman, MT, Montana State University (available from Sean Lambert at [email protected]).

Langdon, Glen G., and J. Rissanen (1981) “Compression of Black White Images with Arithmetic Coding,” IEEE Transactions on Communications, COM-29(6):858–867, June.

Langdon, Glen G. (1983) “A Note on the Ziv-Lempel Model for Compressing Individual Sequences,” IEEE Transactions on Information Theory, IT-29(2):284–287, March.

Langdon, Glenn G. (1983a) “An Adaptive Run Length Coding Algorithm,” IBM Technical Disclosure Bulletin, 26(7B):3783–3785, December.

Langdon, Glen G. (1984) On Parsing vs. Mixed-Order Model Structures for Data Compression, IBM research report RJ-4163 (46091), January 18, 1984, San Jose, CA.

Levinson, N. (1947) “The Wiener RMS Error Criterion in Filter Design and Prediction,” Journal of Mathematical Physics, 25:261–278.

Lewalle, Jacques (1995) “Tutorial on Continuous Wavelet Analysis of Experimental Data,” available at http://www.ecs.syr.edu/faculty/lewalle/tutor/tutor.html.

Li, Xiuqi, and Borko Furht (2003) “An Approach to Image Compression Using Three-Dimensional DCT,” Proceedings of the Sixth International Conference on Visual Information System 2003 (VIS2003), September 24–26.

Liebchen, Tilman, et al. (2005) “The MPEG-4 Audio Lossless Coding (ALS) Standard - Technology and Applications,” AES 119th Convention, New York, October 7–10, 2005. Available at URL http://www.nue.tu-berlin.de/forschung/projekte/lossless/mp4als.html.

Liefke, Hartmut, and Dan Suciu (1999) “XMill: an Efficient Compressor for XML Data,” Proceedings of the ACM SIGMOD Symposium on the Management of Data, 2000, pp. 153–164. Available at http://citeseer.nj.nec.com/liefke99xmill.html.

Linde, Y., A. Buzo, and R. M. Gray (1980) “An Algorithm for Vector Quantization Design,” IEEE Transactions on Communications, COM-28:84–95, January.


Liou, Ming (1991) “Overview of the p×64 kbits/s Video Coding Standard,” Communications of the ACM, 34(4):59–63, April.

Litow, Bruce, and Olivier de Val (1995) “The Weighted Finite Automaton Inference Problem,” Technical Report 95-1, James Cook University, Queensland.

Loeffler, C., A. Ligtenberg, and G. Moschytz (1989) “Practical Fast 1-D DCT Algorithms with 11 Multiplications,” in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’89), pp. 988–991.

Mallat, Stephane (1989) “A Theory for Multiresolution Signal Decomposition: The Wavelet Representation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 11(7):674–693, July.

Manber, U., and E. W. Myers (1993) “Suffix Arrays: A New Method for On-Line String Searches,” SIAM Journal on Computing, 22(5):935–948, October.

Mandelbrot, B. (1982) The Fractal Geometry of Nature, San Francisco, CA, W. H. Freeman.

Manning (1998) is file compression/adv08.html at http://www.newmediarepublic.com/dvideo/.

Marking, Michael P. (1990) “Decoding Group 3 Images,” The C Users Journal, pp. 45–54, June.

Matlab (1999) is http://www.mathworks.com/.

McConnell, Kenneth R. (1992) FAX: Digital Facsimile Technology and Applications, Norwood, MA, Artech House.

McCreight, E. M. (1976) “A Space Economical Suffix Tree Construction Algorithm,” Journal of the ACM, 32(2):262–272, April.

Meridian (2003) is http://www.meridian-audio.com/.

Meyer, F. G., A. Averbuch, and J. O. Stromberg (1998) “Fast Adaptive Wavelet Packet Image Compression,” IEEE Transactions on Image Processing, 9(5), May 2000.

Miano, John (1999) Compressed Image File Formats, New York, ACM Press and Addison-Wesley.

Miller, V. S., and M. N. Wegman (1985) “Variations On a Theme by Ziv and Lempel,” in A. Apostolico and Z. Galil, eds., NATO ASI series Vol. F12, Combinatorial Algorithms on Words, Berlin, Springer, pp. 131–140.

Mitchell, Joan L., W. B. Pennebaker, C. E. Fogg, and D. J. LeGall, eds. (1997) MPEG Video Compression Standard, New York, Chapman and Hall and International Thomson Publishing.

MNG (2003) is http://www.libpng.org/pub/mng/spec/.

Moffat, Alistair (1990) “Implementing the PPM Data Compression Scheme,” IEEE Transactions on Communications, COM-38(11):1917–1921, November.


Moffat, Alistair (1991) “Two-Level Context Based Compression of Binary Images,” in Proceedings of the 1991 Data Compression Conference, J. Storer, ed., Los Alamitos, CA, IEEE Computer Society Press, pp. 382–391.

Moffat, Alistair, Radford Neal, and Ian H. Witten (1998) “Arithmetic Coding Revisited,” ACM Transactions on Information Systems, 16(3):256–294, July.

monkeyaudio (2006) is http://www.monkeysaudio.com/index.html.

Motta, G., F. Rizzo, and J. A. Storer, eds. (2006) Hyperspectral Data Compression, New York, Springer Verlag.

Motte, Warren F. (1998) Oulipo, A Primer of Potential Literature, Normal, IL, Dalkey Archive Press.

MPEG (1998) is http://www.mpeg.org/.

mpeg-4.als (2006) is http://www.nue.tu-berlin.de/forschung/projekte/lossless/mp4als.html.

MPThree (2006) is http://inventors.about.com/od/mstartinventions/a/, file MPThree.htm.

Mulcahy, Colm (1996) “Plotting and Scheming with Wavelets,” Mathematics Magazine, 69(5):323–343, December. See also http://www.spelman.edu/~colm/csam.ps.

Mulcahy, Colm (1997) “Image Compression Using the Haar Wavelet Transform,” Spelman College Science and Mathematics Journal, 1(1):22–31, April. Also available at URL http://www.spelman.edu/~colm/wav.ps. (It has been claimed that any smart 15-year-old could follow this introduction to wavelets.)

Murray, James D., and William vanRyper (1994) Encyclopedia of Graphics File Formats, Sebastopol, CA, O’Reilly and Assoc.

Myers, Eugene W. (1986) “An O(ND) Difference Algorithm and its Variations,” Algorithmica, 1(2):251–266.

Netravali, A., and J. O. Limb (1980) “Picture Coding: A Preview,” Proceedings of the IEEE, 68:366–406.

Nevill-Manning, C. G. (1996) “Inferring Sequential Structure,” Ph.D. thesis, Department of Computer Science, University of Waikato, New Zealand.

Nevill-Manning, C. G., and Ian H. Witten (1997) “Compression and Explanation Using Hierarchical Grammars,” The Computer Journal, 40(2/3):104–116.

NHK (2006) is http://www.nhk.or.jp/english/.

Nix, R. (1981) “Experience With a Space Efficient Way to Store a Dictionary,” Communications of the ACM, 24(5):297–298.

ntfs (2006) is http://www.ntfs.com/.

Nyquist, Harry (1928) “Certain Topics in Telegraph Transmission Theory,” AIEE Transactions, 47:617–644.


Ogg squish (2006) is http://www.xiph.org/ogg/flac.html.

Okumura, Haruhiko (1998) is http://oku.edu.mie-u.ac.jp/~okumura/ directory compression/history.html.

Osterberg, G. (1935) “Topography of the Layer of Rods and Cones in the Human Retina,” Acta Ophthalmologica, (suppl. 6):1–103.

Paeth, Alan W. (1991) “Image File Compression Made Easy,” in Graphics Gems II, James Arvo, editor, San Diego, CA, Academic Press.

Parsons, Thomas W. (1987) Voice and Speech Processing, New York, McGraw-Hill.

Pan, Davis Yen (1995) “A Tutorial on MPEG/Audio Compression,” IEEE Multimedia, 2:60–74, Summer.

Pasco, R. (1976) “Source Coding Algorithms for Fast Data Compression,” Ph.D. dissertation, Dept. of Electrical Engineering, Stanford University, Stanford, CA.

patents (2006) is www.ross.net/compression/patents.html.

PDF (2001) Adobe Portable Document Format Version 1.4, 3rd ed., Reading, MA, Addison-Wesley, December.

Peano, G. (1890) “Sur Une Courbe Qui Remplit Toute Une Aire Plaine,” Math. Annalen, 36:157–160.

Peitgen, H.-O., et al. (eds.) (1982) The Beauty of Fractals, Berlin, Springer-Verlag.

Peitgen, H.-O., and Dietmar Saupe (1985) The Science of Fractal Images, Berlin, Springer-Verlag.

Pennebaker, William B., and Joan L. Mitchell (1988a) “Probability Estimation for the Q-coder,” IBM Journal of Research and Development, 32(6):717–726.

Pennebaker, William B., Joan L. Mitchell, et al. (1988b) “An Overview of the Basic Principles of the Q-coder Adaptive Binary Arithmetic Coder,” IBM Journal of Research and Development, 32(6):737–752.

Pennebaker, William B., and Joan L. Mitchell (1992) JPEG Still Image Data Compression Standard, New York, Van Nostrand Reinhold.

Percival, Colin (2003a) “An Automated Binary Security Update System For FreeBSD,” Proceedings of BSDCon ’03, pp. 29–34.

Percival, Colin (2003b) “Naive Differences of Executable Code,” Computing Lab, Oxford University. Available from http://www.daemonology.net/bsdiff/bsdiff.pdf.

Percival, Colin (2006) “Matching with Mismatches and Assorted Applications,” Ph.D. thesis (pending paperwork). Available at URL http://www.daemonology.net/papers/thesis.pdf.

Pereira, Fernando, and Touradj Ebrahimi (2002) The MPEG-4 Book, Upper Saddle River, NJ, Prentice-Hall.


Phillips, Dwayne (1992) “LZW Data Compression,” The Computer Application Journal, Circuit Cellar Inc., 27:36–48, June/July.

PKWare (2003) is http://www.pkware.com.

PNG (2003) is http://www.libpng.org/pub/png/.

Pohlmann, Ken (1985) Principles of Digital Audio, Indianapolis, IN, Howard Sams & Co.

polyvalens (2006) is http://perso.orange.fr/polyvalens/clemens/wavelets/, file wavelets.html.

Press, W. H., B. P. Flannery, et al. (1988) Numerical Recipes in C: The Art of Scientific Computing, Cambridge, UK, Cambridge University Press. (Also available at URL http://www.nr.com/.)

Prusinkiewicz, P., and A. Lindenmayer (1990) The Algorithmic Beauty of Plants, New York, Springer-Verlag.

Prusinkiewicz, P., A. Lindenmayer, and F. D. Fracchia (1991) “Synthesis of Space-Filling Curves on the Square Grid,” in Fractals in the Fundamental and Applied Sciences, Peitgen, H.-O., et al. (eds.), Amsterdam, Elsevier Science Publishers, pp. 341–366.

quicktimeAAC (2006) is http://www.apple.com/quicktime/technologies/aac/.

Rabbani, Majid, and Paul W. Jones (1991) Digital Image Compression Techniques, Bellingham, WA, SPIE Optical Engineering Press.

Rabiner, Lawrence R., and Ronald W. Schafer (1978) Digital Processing of Speech Signals, Englewood Cliffs, NJ, Prentice-Hall Series in Signal Processing.

Ramabadran, Tenkasi V., and Sunil S. Gaitonde (1988) “A Tutorial on CRC Computations,” IEEE Micro, pp. 62–75, August.

Ramstad, T. A., et al. (1995) Subband Compression of Images: Principles and Examples, Amsterdam, Elsevier Science Publishers.

Rao, K. R., and J. J. Hwang (1996) Techniques and Standards for Image, Video, and Audio Coding, Upper Saddle River, NJ, Prentice Hall.

Rao, K. R., and P. Yip (1990) Discrete Cosine Transform—Algorithms, Advantages, Applications, London, Academic Press.

Rao, Raghuveer M., and Ajit S. Bopardikar (1998) Wavelet Transforms: Introduction to Theory and Applications, Reading, MA, Addison-Wesley.

rarlab (2006) is http://www.rarlab.com/.

Reghbati, H. K. (1981) “An Overview of Data Compression Techniques,” IEEE Computer, 14(4):71–76.

Reznik, Yuriy (2004) “Coding Of Prediction Residual In MPEG-4 Standard For Lossless Audio Coding (MPEG-4 ALS),” Available at URL http://viola.usc.edu/paper/ICASSP2004/HTML/SESSIDX.HTM.


RFC1945 (1996) Hypertext Transfer Protocol—HTTP/1.0, available at URL http://www.faqs.org/rfcs/rfc1945.html.

RFC1950 (1996) ZLIB Compressed Data Format Specification version 3.3, is http://www.ietf.org/rfc/rfc1950.

RFC1951 (1996) DEFLATE Compressed Data Format Specification version 1.3, is http://www.ietf.org/rfc/rfc1951.

RFC1952 (1996) GZIP File Format Specification Version 4.3. Available in PDF format at URL http://www.gzip.org/zlib/rfc-gzip.html.

RFC1962 (1996) The PPP Compression Control Protocol (CCP), available from many sources.

RFC1979 (1996) PPP Deflate Protocol, is http://www.faqs.org/rfcs/rfc1979.html.

RFC2616 (1999) Hypertext Transfer Protocol – HTTP/1.1. Available in PDF format at URL http://www.faqs.org/rfcs/rfc2616.html.

Rice, Robert F. (1979) “Some Practical Universal Noiseless Coding Techniques,” Jet Propulsion Laboratory, JPL Publication 79-22, Pasadena, CA, March.

Rice, Robert F. (1991) “Some Practical Universal Noiseless Coding Techniques—Part III. Module PSI14.K,” Jet Propulsion Laboratory, JPL Publication 91-3, Pasadena, CA, November.

Richardson, Iain G. (2003) H.264 and MPEG-4 Video Compression: Video Coding for Next-generation Multimedia, Chichester, West Sussex, UK, John Wiley and Sons.

Rissanen, J. J. (1976) “Generalized Kraft Inequality and Arithmetic Coding,” IBM Journal of Research and Development, 20:198–203, May.

Robinson, John A. (1997) “Efficient General-Purpose Image Compression with Binary Tree Predictive Coding,” IEEE Transactions on Image Processing, 6(4):601–607, April.

Robinson, P., and D. Singer (1981) “Another Spelling Correction Program,” Communications of the ACM, 24(5):296–297.

Robinson, Tony (1994) “Simple Lossless and Near-Lossless Waveform Compression,” Technical Report CUED/F-INFENG/TR.156, Cambridge University, December. Available at URL http://citeseer.nj.nec.com/robinson94shorten.html.

Rodriguez, Karen (1995) “Graphics File Format Patent Unisys Seeks Royalties from GIF Developers,” InfoWorld, January 9, 17(2):3.

Roetling, P. G. (1976) “Halftone Method with Edge Enhancement and Moire Suppression,” Journal of the Optical Society of America, 66:985–989.

Roetling, P. G. (1977) “Binary Approximation of Continuous Tone Images,” Photographic Science and Engineering, 21:60–65.

Roger, R. E., and M. C. Cavenor (1996) “Lossless Compression of AVIRIS Images,” IEEE Transactions on Image Processing, 5(5):713–719, May.


Ronson, J., and J. Dewitte (1982) “Adaptive Block Truncation Coding Scheme Using an Edge Following Algorithm,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Piscataway, NJ, IEEE Press, pp. 1235–1238.

Rossignac, J. (1998) “Edgebreaker: Connectivity Compression for Triangle Meshes,” GVU Technical Report GIT-GVU-98-35, Atlanta, GA, Georgia Institute of Technology.

Rubin, F. (1979) “Arithmetic Stream Coding Using Fixed Precision Registers,” IEEE Transactions on Information Theory, 25(6):672–675, November.

Sacco, William, et al. (1988) Information Theory, Saving Bits, Providence, RI, Janson Publications.

Sagan, Hans (1994) Space-Filling Curves, New York, Springer-Verlag.

Said, A., and W. A. Pearlman (1996) “A New Fast and Efficient Image Codec Based on Set Partitioning in Hierarchical Trees,” IEEE Transactions on Circuits and Systems for Video Technology, 6(6):243–250, June.

Salomon, David (1999) Computer Graphics and Geometric Modeling, New York, Springer.

Salomon, David (2000) “Prefix Compression of Sparse Binary Strings,” ACM Crossroads Magazine, 6(3), February.

Salomon, David (2006) Curves and Surfaces for Computer Graphics, New York, Springer.

Samet, Hanan (1990a) Applications of Spatial Data Structures: Computer Graphics, Image Processing, and GIS, Reading, MA, Addison-Wesley.

Samet, Hanan (1990b) The Design and Analysis of Spatial Data Structures, Reading, MA, Addison-Wesley.

Sampath, Ashwin, and Ahmad C. Ansari (1993) “Combined Peano Scan and VQ Approach to Image Compression,” Image and Video Processing, Bellingham, WA, SPIE vol. 1903, pp. 175–186.

Saponara, Sergio, Luca Fanucci, and Pierangelo Terreni (2003) “Low-Power VLSI Architectures for 3D Discrete Cosine Transform (DCT),” in Midwest Symposium on Circuits and Systems (MWSCAS).

Sayood, Khalid, and K. Anderson (1992) “A Differential Lossless Image Compression Scheme,” IEEE Transactions on Signal Processing, 40(1):236–241, January.

Sayood, Khalid (2005) Introduction to Data Compression, 3rd Ed., San Francisco, CA, Morgan Kaufmann.

Schindler, Michael (1998) “A Fast Renormalisation for Arithmetic Coding,” a poster in the Data Compression Conference, 1998, available at URL http://www.compressconsult.com/rangecoder/.

SHA256 (2002) Secure Hash Standard, FIPS Publication 180-2, August 2002. Available at csrc.nist.gov/publications/fips/fips180-2/fips180-2.pdf.

Shannon, Claude (1951) “Prediction and Entropy of Printed English,” Bell System Technical Journal, 30(1):50–64, January.


Shapiro, J. (1993) “Embedded Image Coding Using Zerotrees of Wavelet Coefficients,” IEEE Transactions on Signal Processing, 41(12):3445–3462, October.

Shenoi, Kishan (1995) Digital Signal Processing in Telecommunications, Upper Saddle River, NJ, Prentice Hall.

Shlien, Seymour (1994) “Guide to MPEG-1 Audio Standard,” IEEE Transactions on Broadcasting, 40(4):206–218, December.

Sieminski, A. (1988) “Fast Decoding of the Huffman Codes,” Information Processing Letters, 26(5):237–241.

Sierpinski, W. (1912) “Sur Une Nouvelle Courbe Qui Remplit Toute Une Aire Plaine,” Bull. Acad. Sci. Cracovie, Serie A:462–478.

sighted (2006) is http://www.sighted.com/.

Simoncelli, Eero P., and Edward H. Adelson (1990) “Subband Transforms,” in Subband Coding, John Woods, ed., Boston, MA, Kluwer Academic Press, pp. 143–192.

Smith, Alvy Ray (1984) “Plants, Fractals and Formal Languages,” Computer Graphics, 18(3):1–10.

softexperience (2006) is http://peccatte.karefil.com/software/Rarissimo/RarissimoEN.htm.

Softsound (2003) was http://www.softsound.com/Shorten.html but try also URL http://mi.eng.cam.ac.uk/reports/ajr/TR156/tr156.html.

sourceforge.flac (2006) is http://sourceforge.net/projects/flac.

Starck, J. L., F. Murtagh, and A. Bijaoui (1998) Image Processing and Data Analysis: The Multiscale Approach, Cambridge, UK, Cambridge University Press.

Stollnitz, E. J., T. D. DeRose, and D. H. Salesin (1996) Wavelets for Computer Graphics, San Francisco, CA, Morgan Kaufmann.

Storer, James A., and T. G. Szymanski (1982) “Data Compression via Textual Substitution,” Journal of the ACM, 29:928–951.

Storer, James A. (1988) Data Compression: Methods and Theory, Rockville, MD, Computer Science Press.

Storer, James A., and Martin Cohn (eds.) (annual) DCC ’XX: Data Compression Conference, Los Alamitos, CA, IEEE Computer Society Press.

Storer, James A., and Harald Helfgott (1997) “Lossless Image Compression by Block Matching,” The Computer Journal, 40(2/3):137–145.

Strang, Gilbert, and Truong Nguyen (1996) Wavelets and Filter Banks, Wellesley, MA, Wellesley-Cambridge Press.

Strang, Gilbert (1999) “The Discrete Cosine Transform,” SIAM Review, 41(1):135–147.

Strømme, Øyvind, and Douglas R. McGregor (1997) “Comparison of Fidelity of Reproduction of Images After Lossy Compression Using Standard and Nonstandard Wavelet Decompositions,” in Proceedings of The First European Conference on Signal Analysis and Prediction (ECSAP 97), Prague, June.

Strømme, Øyvind (1999) On The Applicability of Wavelet Transforms to Image and Video Compression, Ph.D. thesis, University of Strathclyde, February.

Stuart, J. R., et al. (1999) “MLP Lossless Compression,” AES 9th Regional Convention, Tokyo. Available at http://www.meridian-audio.com/w_paper/mlp_jap_new.PDF.

suzannevega (2006) is http://www.suzannevega.com/about/funfactsMusic.htm.

Swan, Tom (1993) Inside Windows File Formats, Indianapolis, IN, Sams Publications.

Sweldens, Wim, and Peter Schroder (1996) Building Your Own Wavelets At Home, SIGGRAPH 96 Course Notes. Available on the WWW.

Symes, Peter D. (2003) MPEG-4 Demystified, New York, NY, McGraw-Hill Professional.

Taubman, David (1999) “High Performance Scalable Image Compression with EBCOT,” IEEE Transactions on Image Processing, 9(7):1158–1170.

Taubman, David S., and Michael W. Marcellin (2002) JPEG 2000, Image Compression Fundamentals, Standards and Practice, Norwell, MA, Kluwer Academic.

Thomborson, Clark (1992) “The V.42bis Standard for Data-Compressing Modems,” IEEE Micro, pp. 41–53, October.

Thomas Dolby (2006) is http://version.thomasdolby.com/index_frameset.html.

Trendafilov, Dimitre, Nasir Memon, and Torsten Suel (2002) “Zdelta: An Efficient Delta Compression Tool,” Technical Report TR-CIS-2002-02, New York, NY, Polytechnic University.

Tunstall, B. P. (1967) “Synthesis of Noiseless Compression Codes,” Ph.D. dissertation, Georgia Institute of Technology, Atlanta, GA.

Udupa, Raghavendra U., Vinayaka D. Pandit, and Ashok Rao (1999) Private Communication.

Unicode (2003) is http://unicode.org/.

Unisys (2003) is http://www.unisys.com.

unrarsrc (2006) is http://www.rarlab.com/rar/unrarsrc-3.5.4.tar.gz.

UPX (2003) is http://upx.sourceforge.net/.

UTF16 (2006) is http://en.wikipedia.org/wiki/UTF-16.

Vetterli, M., and J. Kovacevic (1995) Wavelets and Subband Coding, Englewood Cliffs, NJ, Prentice-Hall.

Vitter, Jeffrey S. (1987) “Design and Analysis of Dynamic Huffman Codes,” Journal of the ACM, 34(4):825–845, October.

Volf, Paul A. J. (1997) “A Context-Tree Weighting Algorithm for Text Generating Sources,” in Storer, James A. (ed.), DCC ’97: Data Compression Conference, Los Alamitos, CA, IEEE Computer Society Press, pp. 132–139 (poster).


Vorobev, Nikolai N. (1983) in Ian N. Sneddon (ed.), and Halina Moss (translator), Fibonacci Numbers, New Classics Library.

Wallace, Gregory K. (1991) “The JPEG Still Image Compression Standard,” Communications of the ACM, 34(4):30–44, April.

Watson, Andrew (1994) “Image Compression Using the Discrete Cosine Transform,” Mathematica Journal, 4(1):81–88.

WavPack (2006) is http://www.wavpack.com/.

Weinberger, M. J., G. Seroussi, and G. Sapiro (1996) “LOCO-I: A Low Complexity, Context-Based, Lossless Image Compression Algorithm,” in Proceedings of Data Compression Conference, J. Storer, editor, Los Alamitos, CA, IEEE Computer Society Press, pp. 140–149.

Weinberger, M. J., G. Seroussi, and G. Sapiro (2000) “The LOCO-I Lossless Image Compression Algorithm: Principles and Standardization Into JPEG-LS,” IEEE Transactions on Image Processing, 9(8):1309–1324, August.

Welch, T. A. (1984) “A Technique for High-Performance Data Compression,” IEEE Computer, 17(6):8–19, June.

Wikipedia (2003) is file Nyquist-Shannon_sampling_theorem in http://www.wikipedia.org/wiki/.

wiki.audio (2006) is http://en.wikipedia.org/wiki/Audio_data_compression.

Willems, F. M. J. (1989) “Universal Data Compression and Repetition Times,” IEEE Transactions on Information Theory, IT-35(1):54–58, January.

Willems, F. M. J., Y. M. Shtarkov, and Tj. J. Tjalkens (1995) “The Context-Tree Weighting Method: Basic Properties,” IEEE Transactions on Information Theory, IT-41:653–664, May.

Williams, Ross N. (1991a) Adaptive Data Compression, Boston, MA, Kluwer Academic Publishers.

Williams, Ross N. (1991b) “An Extremely Fast Ziv-Lempel Data Compression Algorithm,” in Proceedings of the 1991 Data Compression Conference, J. Storer, ed., Los Alamitos, CA, IEEE Computer Society Press, pp. 362–371.

Williams, Ross N. (1993) “A Painless Guide to CRC Error Detection Algorithms,” available from http://ross.net/crc/download/crc_v3.txt.

WinAce (2003) is http://www.winace.com/.

windots (2006) is http://www.uiciechi.it/vecchio/cnt/schede/windots-eng.html.

Wirth, N. (1976) Algorithms + Data Structures = Programs, 2nd ed., Englewood Cliffs, NJ, Prentice-Hall.

Witten, Ian H., Radford M. Neal, and John G. Cleary (1987) “Arithmetic Coding for Data Compression,” Communications of the ACM, 30(6):520–540.


Witten, Ian H., and Timothy C. Bell (1991) “The Zero-Frequency Problem: Estimating the Probabilities of Novel Events in Adaptive Text Compression,” IEEE Transactions on Information Theory, IT-37(4):1085–1094.

Witten, Ian H., T. C. Bell, M. E. Harrison, M. L. James, and A. Moffat (1992) “Textual Image Compression,” in Proceedings of the 1992 Data Compression Conference, J. Storer, ed., Los Alamitos, CA, IEEE Computer Society Press, pp. 42–51.

Witten, Ian H., T. C. Bell, H. Emberson, S. Inglis, and A. Moffat (1994) “Textual image compression: two-stage lossy/lossless encoding of textual images,” Proceedings of the IEEE, 82(6):878–888, June.

Wolf, Misha, et al. (2000) “A Standard Compression Scheme for Unicode,” Unicode Technical Report #6, available at http://unicode.org/unicode/reports/tr6/index.html.

Wolff, Gerry (1999) is http://www.cognitionresearch.org.uk/sp.htm.

Wong, Kwo-Jyr, and C. C. Jay Kuo (1993) “A Full Wavelet Transform (FWT) Approach to Image Compression,” Image and Video Processing, Bellingham, WA, SPIE vol. 1903:153–164.

Wong, P. W., and J. Koplowitz (1992) “Chain Codes and Their Linear Reconstruction Filters,” IEEE Transactions on Information Theory, IT-38(2):268–280, May.

Wright, E. V. (1939) Gadsby, Los Angeles, Wetzel. Reprinted by University Microfilms, Ann Arbor, MI, 1991.

Wu, Xiaolin (1995) “Context Selection and Quantization for Lossless Image Coding,” in James A. Storer and Martin Cohn (eds.), DCC ’95, Data Compression Conference, Los Alamitos, CA, IEEE Computer Society Press, p. 453.

Wu, Xiaolin (1996) “An Algorithmic Study on Lossless Image Compression,” in James A. Storer, ed., DCC ’96, Data Compression Conference, Los Alamitos, CA, IEEE Computer Society Press.

XMill (2003) is http://www.research.att.com/sw/tools/xmill/.

XML (2003) is http://www.xml.com/.

Yokoo, Hidetoshi (1991) “An Improvement of Dynamic Huffman Coding with a Simple Repetition Finder,” IEEE Transactions on Communications, 39(1):8–10, January.

Yokoo, Hidetoshi (1996) “An Adaptive Data Compression Method Based on Context Sorting,” in Proceedings of the 1996 Data Compression Conference, J. Storer, ed., Los Alamitos, CA, IEEE Computer Society Press, pp. 160–169.

Yokoo, Hidetoshi (1997) “Data Compression Using Sort-Based Context Similarity Measure,” The Computer Journal, 40(2/3):94–102.

Yokoo, Hidetoshi (1999a) “A Dynamic Data Structure for Reverse Lexicographically Sorted Prefixes,” in Combinatorial Pattern Matching, Lecture Notes in Computer Science 1645, M. Crochemore and M. Paterson, eds., Berlin, Springer Verlag, pp. 150–162.

Yokoo, Hidetoshi (1999b) Private Communication.


Yoo, Youngjun, Younggap Kwon, and Antonio Ortega (1998) “Embedded Image-Domain Adaptive Compression of Simple Images,” in Proceedings of the 32nd Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, CA, Nov. 1998.

Young, D. M. (1985) “MacWrite File Format,” Wheels for the Mind, 1:34, Fall.

Yu, Tong Lai (1996) “Dynamic Markov Compression,” Dr. Dobb’s Journal, pp. 30–31, January.

Zalta, Edward N. (1988) “Are Algorithms Patentable?” Notices of the American Mathematical Society, 35(6):796–799.

Zandi, A., J. Allen, E. Schwartz, and M. Boliek (1995) “CREW: Compression with Reversible Embedded Wavelets,” in James A. Storer and Martin Cohn (eds.) DCC ’95: Data Compression Conference, Los Alamitos, CA, IEEE Computer Society Press, pp. 212–221, March.

Zhang, Manyun (1990) The JPEG and Image Data Compression Algorithms (dissertation).

Ziv, Jacob, and A. Lempel (1977) “A Universal Algorithm for Sequential Data Compression,” IEEE Transactions on Information Theory, IT-23(3):337–343.

Ziv, Jacob, and A. Lempel (1978) “Compression of Individual Sequences via Variable-Rate Coding,” IEEE Transactions on Information Theory, IT-24(5):530–536.

zlib (2003) is http://www.zlib.org/zlib_tech.html.

Zurek, Wojciech (1989) “Thermodynamic Cost of Computation, Algorithmic Complexity, and the Information Metric,” Nature, 341(6238):119–124, September 14.

Yet the guide is fragmentary, incomplete, and in no sense a bibliography. Its emphases vary according to my own indifferences and ignorance as well as according to my own sympathies and knowledge.

—J. Frank Dobie, Guide to Life and Literature of the Southwest (1943)

Glossary

7-Zip. A file archiver with high compression ratio. The brainchild of Igor Pavlov, this free software for Windows is based on the LZMA algorithm. Both LZMA and 7z were designed to provide high compression, fast decompression, and low memory requirements for decompression. (See also LZMA.)

AAC. A complex and efficient audio compression method. AAC is an extension of and the successor to mp3. Like mp3, AAC is a time/frequency (T/F) codec that employs a psychoacoustic model to determine how the normal threshold of the ear varies in the presence of masking sounds. Once the perturbed threshold is known, the original audio samples are converted to frequency coefficients, which are quantized (thereby providing lossy compression) and then Huffman encoded (providing additional, lossless, compression).

AC-3. A perceptual audio codec designed by Dolby Laboratories to support several audio channels.

ACB. A very efficient text compression method by G. Buyanovsky (Section 8.3). It uses a dictionary with unbounded contexts and contents to select the context that best matches the search buffer and the content that best matches the look-ahead buffer.

Adaptive Compression. A compression method that modifies its operations and/or its parameters according to new data read from the input stream. Examples are the adaptive Huffman method of Section 2.9 and the dictionary-based methods of Chapter 3. (See also Semiadaptive Compression, Locally Adaptive Compression.)

Affine Transformations. Two-dimensional or three-dimensional geometric transformations, such as scaling, reflection, rotation, and translation, that preserve parallel lines (Section 4.35.1).

Alphabet. The set of all possible symbols in the input stream. In text compression, the alphabet is normally the set of 128 ASCII codes. In image compression it is the set of values a pixel can take (2, 16, 256, or anything else). (See also Symbol.)


ALS. MPEG-4 Audio Lossless Coding (ALS) is the latest addition to the family of MPEG-4 audio codecs. ALS can handle integer and floating-point audio samples and is based on a combination of linear prediction (both short-term and long-term), multichannel coding, and efficient encoding of audio residues by means of Rice codes and block codes.

ARC. A compression/archival/cataloging program written by Robert A. Freed in the mid 1980s (Section 3.22). It offers good compression and the ability to combine several files into an archive. (See also Archive, ARJ.)

Archive. A set of one or more files combined into one file (Section 3.22). The individual members of an archive may be compressed. An archive provides a convenient way of transferring or storing groups of related files. (See also ARC, ARJ.)

Arithmetic Coding. A statistical compression method (Section 2.14) that assigns one (normally long) code to the entire input stream, instead of assigning codes to the individual symbols. The method reads the input stream symbol by symbol and appends more bits to the code each time a symbol is input and processed. Arithmetic coding is slow, but it compresses at or close to the entropy, even when the symbol probabilities are skewed. (See also Model of Compression, Statistical Methods, QM Coder.)
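The minimal sketch below (illustrative only; the two-symbol static model is an assumption, and practical coders use integer arithmetic with renormalization rather than floats) shows the essential idea of the interval shrinking as symbols are processed:

    # Narrow the interval [low, high) for the string "aab".
    probs = {'a': 0.8, 'b': 0.2}    # assumed fixed probabilities
    cum   = {'a': 0.0, 'b': 0.8}    # cumulative probabilities
    low, high = 0.0, 1.0
    for s in "aab":
        width = high - low
        high = low + width * (cum[s] + probs[s])   # new upper end
        low  = low + width * cum[s]                # new lower end
    print(low, high)   # any number in [low, high) encodes "aab"

Notice that the high-probability symbol a shrinks the interval only slightly, so it contributes only a fraction of a bit to the final code.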

ARJ. A free compression/archiving utility for MS/DOS (Section 3.22), written by Robert K. Jung to compete with ARC and the various PK utilities. (See also Archive, ARC.)

ASCII Code. The standard character code on all modern computers (although Unicode is becoming a competitor). ASCII stands for American Standard Code for Information Interchange. It is a (1+7)-bit code, with one parity bit and seven data bits per symbol. As a result, 128 symbols can be coded. They include the uppercase and lowercase letters, the ten digits, some punctuation marks, and control characters. (See also Unicode.)

Bark. Unit of critical band rate. Named after Heinrich Georg Barkhausen and used in audio applications. The Bark scale is a nonlinear mapping of the frequency scale over the audio range, a mapping that matches the frequency selectivity of the human ear.

Bayesian Statistics. (See Conditional Probability.)

Bi-level Image. An image whose pixels have two different colors. The colors are normally referred to as black and white, “foreground” and “background,” or 1 and 0. (See also Bitplane.)

BinHex. A file format for reliable file transfers, designed by Yves Lempereur for use on the Macintosh computer (Section 1.4.3).

Bintrees. A method, somewhat similar to quadtrees, for partitioning an image into nonoverlapping parts. The image is (horizontally) divided into two halves, each half is divided (vertically) into smaller halves, and the process continues recursively, alternating between horizontal and vertical splits. The result is a binary tree where any uniform part of the image becomes a leaf. (See also Prefix Compression, Quadtrees.)
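A minimal recursive sketch (assumptions: the image is a two-dimensional list of pixel values, a part becomes a leaf when all its pixels are equal, and a degenerate 1-pixel-wide strip forces a split in the other direction):

    def bintree(img, top, left, h, w, horiz=True):
        part = [img[top+i][left+j] for i in range(h) for j in range(w)]
        if len(set(part)) == 1:             # uniform part -> leaf
            return ('leaf', part[0])
        if (horiz and h > 1) or w == 1:     # split into top/bottom halves
            h2 = h // 2
            return ('node', bintree(img, top, left, h2, w, False),
                            bintree(img, top + h2, left, h - h2, w, False))
        w2 = w // 2                         # split into left/right halves
        return ('node', bintree(img, top, left, h, w2, True),
                        bintree(img, top, left + w2, h, w - w2, True))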

Bitplane. Each pixel in a digital image is represented by several bits. The set of all the kth bits of all the pixels in the image is the kth bitplane of the image. A bi-level image, for example, consists of one bitplane. (See also Bi-level Image.)


Bitrate. In general, the term “bitrate” refers to both bpb and bpc. However, in audio compression, this term is used to indicate the rate at which the compressed stream is read by the decoder. This rate depends on where the stream comes from (such as disk, communications channel, memory). If the bitrate of an MPEG audio file is, e.g., 128 Kbps, then the encoder will convert each second of audio into 128 K bits of compressed data, and the decoder will convert each group of 128 K bits of compressed data into one second of sound. Lower bitrates mean smaller file sizes. However, as the bitrate decreases, the encoder must compress more audio data into fewer bits, eventually resulting in a noticeable loss of audio quality. For CD-quality audio, experience indicates that the best bitrates are in the range of 112 Kbps to 160 Kbps. (See also Bits/Char.)

Bits/Char. Bits per character (bpc). A measure of the performance in text compression.Also a measure of entropy. (See also Bitrate, Entropy.)

Bits/Symbol. Bits per symbol. A general measure of compression performance.

Block Coding. A general term for image compression methods that work by breaking the image into small blocks of pixels, and encoding each block separately. JPEG (Section 4.8) is a good example, because it processes blocks of 8×8 pixels.

Block Decomposition. A method for lossless compression of discrete-tone images. The method works by searching for, and locating, identical blocks of pixels. A copy B of a block A is compressed by preparing the height, width, and location (image coordinates) of A, and compressing those four numbers by means of Huffman codes. (See also Discrete-Tone Image.)

Block Matching. A lossless image compression method based on the LZ77 sliding window method originally developed for text compression. (See also LZ Methods.)

Block Truncation Coding. BTC is a lossy image compression method that quantizes pixels in an image while preserving the first two or three statistical moments. (See also Vector Quantization.)
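A minimal sketch of encoding one block with the standard two-moment reconstruction levels (the function and variable names are illustrative assumptions):

    import math, statistics

    def btc_block(block):                 # block: flat list of pixel values
        m, sd = statistics.mean(block), statistics.pstdev(block)
        bits = [1 if p >= m else 0 for p in block]
        q, n = sum(bits), len(block)      # q pixels are >= the mean
        if q in (0, n):
            return bits, m, m             # uniform block
        a = m - sd * math.sqrt(q / (n - q))    # level for the 0 bits
        b = m + sd * math.sqrt((n - q) / q)    # level for the 1 bits
        return bits, a, b                 # bitmap plus two levels

The decoder replaces each 0 bit with level a and each 1 bit with level b, a choice that preserves the block’s mean and variance.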

BMP. BMP (Section 1.4.4) is a palette-based graphics file format for images with 1, 2, 4, 8, 16, 24, or 32 bitplanes. It uses a simple form of RLE to compress images with 4 or 8 bitplanes.

BOCU-1. A simple algorithm for Unicode compression (Section 8.12.1).

BSDiff. A file differencing algorithm created by Colin Percival. The algorithm addresses the problem of differential file compression of executable code while maintaining a platform-independent approach. BSDiff combines matching with mismatches and entropy coding of the differences with bzip2. BSDiff’s decoder is called bspatch. (See also Exediff, File differencing, UNIX diff, VCDIFF, and Zdelta.)

Burrows-Wheeler Method. This method (Section 8.1) prepares a string of data for later compression. The compression itself is done with the move-to-front method (Section 1.5), perhaps in combination with RLE. The BW method converts a string S to another string L that satisfies two conditions:

1. Any region of L will tend to have a concentration of just a few symbols.


2. It is possible to reconstruct the original string S from L (a little more data may be needed for the reconstruction, in addition to L, but not much).
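A minimal sketch of the forward transform (a naive rotation sort, illustrative only; real implementations use suffix arrays, and the ‘$’ end marker is an assumption):

    def bwt(s):
        s += '$'                              # unique end-of-string marker
        rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
        return ''.join(r[-1] for r in rotations)   # last column is L

    print(bwt('banana'))   # prints annb$aa: like symbols are grouped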

CALIC. A context-based, lossless image compression method (Section 4.24) whose two main features are (1) the use of three passes in order to achieve symmetric contexts and (2) context quantization, to significantly reduce the number of possible contexts without degrading compression.

CCITT. The International Telegraph and Telephone Consultative Committee (Comité Consultatif International Télégraphique et Téléphonique), the old name of the ITU, the International Telecommunications Union. The ITU is a United Nations organization responsible for developing and recommending standards for data communications (not just compression). (See also ITU.)

Cell Encoding. An image compression method where the entire bitmap is divided intocells of, say, 8×8 pixels each and is scanned cell by cell. The first cell is stored in entry0 of a table and is encoded (i.e., written on the compressed file) as the pointer 0. Eachsubsequent cell is searched in the table. If found, its index in the table becomes its codeand it is written on the compressed file. Otherwise, it is added to the table. In the caseof an image made of just straight segments, it can be shown that the table size is just108 entries.

CIE. CIE is an abbreviation for Commission Internationale de l’Eclairage (InternationalCommittee on Illumination). This is the main international organization devoted to lightand color. It is responsible for developing standards and definitions in this area. (SeeLuminance.)

Circular Queue. A basic data structure (Section 3.3.1) that moves data along an arrayin circular fashion, updating two pointers to point to the start and end of the data inthe array.
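
A minimal Python sketch (class and method names are illustrative):

    class CircularQueue:
        """Fixed-size circular queue; head and tail wrap around the array."""
        def __init__(self, capacity):
            self.buf = [None] * capacity
            self.head = self.tail = self.count = 0
        def enqueue(self, item):
            if self.count == len(self.buf):
                raise OverflowError("queue full")
            self.buf[self.tail] = item
            self.tail = (self.tail + 1) % len(self.buf)  # wrap around
            self.count += 1
        def dequeue(self):
            if self.count == 0:
                raise IndexError("queue empty")
            item = self.buf[self.head]
            self.head = (self.head + 1) % len(self.buf)  # wrap around
            self.count -= 1
            return item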

Codec. A term used to refer to both encoder and decoder.

Codes. A code is a symbol that stands for another symbol. In computer and telecommunications applications, codes are virtually always binary numbers. The ASCII code is the de facto standard, although the new Unicode is used on several new computers and the older EBCDIC is still used on some old IBM computers. (See also ASCII, Unicode.)

Composite and Difference Values. A progressive image method that separates the image into layers using the method of bintrees. Early layers consist of a few large, low-resolution blocks, followed by later layers with smaller, higher-resolution blocks. The main principle is to transform a pair of pixels into two values, a composite and a differentiator. (See also Bintrees, Progressive Image Compression.)

Compress. In the large UNIX world, compress is commonly used to compress data. This utility uses LZW with a growing dictionary. It starts with a small dictionary of just 512 entries and doubles its size each time it fills up, until it reaches 64K bytes (Section 3.18).

Compression Factor. The inverse of compression ratio. It is defined as

$$\text{compression factor} = \frac{\text{size of the input stream}}{\text{size of the output stream}}.$$

Values greater than 1 indicate compression, and values less than 1 imply expansion. (See also Compression Ratio.)

Compression Gain. This measure is defined as

$$100 \log_e \frac{\text{reference size}}{\text{compressed size}},$$

where the reference size is either the size of the input stream or the size of the compressed stream produced by some standard lossless compression method.

Compression Ratio. One of several measures that are commonly used to express the efficiency of a compression method. It is the ratio

$$\text{compression ratio} = \frac{\text{size of the output stream}}{\text{size of the input stream}}.$$

A value of 0.6 indicates that the data occupies 60% of its original size after compression. Values greater than 1 mean an output stream bigger than the input stream (negative compression).

Sometimes the quantity 100 × (1 − compression ratio) is used to express the quality of compression. A value of 60 means that the output stream occupies 40% of its original size (or that the compression has resulted in a savings of 60%). (See also Compression Factor.)
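
For example, with arbitrary illustrative sizes:

    input_size = 1_000_000      # bytes before compression (hypothetical)
    output_size = 400_000       # bytes after compression (hypothetical)
    ratio = output_size / input_size    # 0.4
    factor = input_size / output_size   # 2.5
    savings = 100 * (1 - ratio)         # 60% saved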

Conditional Image RLE. A compression method for grayscale images with n shades of gray. The method starts by assigning an n-bit code to each pixel depending on its near neighbors. It then concatenates the n-bit codes into a long string, and calculates run lengths. The run lengths are encoded by prefix codes. (See also RLE, Relative Encoding.)

Conditional Probability. We tend to think of probability as something that is built into an experiment. A true die, for example, has probability of 1/6 of falling on any side, and we tend to consider this an intrinsic feature of the die. Conditional probability is a different way of looking at probability. It says that knowledge affects probability. The main task of this field is to calculate the probability of an event A given that another event, B, is known to have occurred. This is the conditional probability of A (more precisely, the probability of A conditioned on B), and it is denoted by P(A|B). The field of conditional probability is sometimes called Bayesian statistics, since it was first developed by the Reverend Thomas Bayes, who came up with the basic formula of conditional probability.
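
In symbols, the basic formula (the standard definition) is

$$P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{P(B|A)\,P(A)}{P(B)}.$$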

Context. The N symbols preceding the next symbol. A context-based model uses context to assign probabilities to symbols.

Context-Free Grammars. A formal language uses a small number of symbols (called terminal symbols) from which valid sequences can be constructed. Any valid sequence is finite, the number of valid sequences is normally unlimited, and the sequences are constructed according to certain rules (sometimes called production rules). The rules can be used to construct valid sequences and also to determine whether a given sequence is valid. A production rule consists of a nonterminal symbol on the left and a string of terminal and nonterminal symbols on the right. The nonterminal symbol on the left becomes the name of the string on the right. The set of production rules constitutes the grammar of the formal language. If the production rules do not depend on the context of a symbol, the grammar is context-free. There are also context-sensitive grammars. The sequitur method of Section 8.10 is based on context-free grammars.

Context-Tree Weighting. A method for the compression of bitstrings. It can be applied to text and images, but they have to be carefully converted to bitstrings. The method constructs a context tree where bits input in the immediate past (context) are used to estimate the probability of the current bit. The current bit and its estimated probability are then sent to an arithmetic encoder, and the tree is updated to include the current bit in the context. (See also KT Probability Estimator.)

Continuous-Tone Image. A digital image with a large number of colors, such that adjacent image areas with colors that differ by just one unit appear to the eye as having continuously varying colors. An example is an image with 256 grayscale values. When adjacent pixels in such an image have consecutive gray levels, they appear to the eye as a continuous variation of the gray level. (See also Bi-level Image, Discrete-Tone Image, Grayscale Image.)

Continuous Wavelet Transform. An important modern method for analyzing the time and frequency contents of a function f(t) by means of a wavelet. The wavelet is itself a function (which has to satisfy certain conditions), and the transform is done by multiplying the wavelet and f(t) and computing the integral of the product. The wavelet is then translated, and the process is repeated. When done, the wavelet is scaled, and the entire process is carried out again in order to analyze f(t) at a different scale. (See also Discrete Wavelet Transform, Lifting Scheme, Multiresolution Decomposition, Taps.)

Convolution. A way to describe the output of a linear, shift-invariant system by means of its input.

Correlation. A statistical measure of the linear relation between two paired variables. The values of R range from −1 (perfect negative relation), to 0 (no relation), to +1 (perfect positive relation).
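
For paired samples $(x_i, y_i)$ with means $\bar{x}$ and $\bar{y}$, the Pearson correlation coefficient (the usual choice for R) is computed as

$$R = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2}\,\sqrt{\sum_i (y_i - \bar{y})^2}}.$$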

CRC. CRC stands for Cyclical Redundancy Check (or Cyclical Redundancy Code). It is a rule that shows how to obtain vertical check bits from all the bits of a data stream (Section 3.28). The idea is to generate a code that depends on all the bits of the data stream, and use it to detect errors (bad bits) when the data is transmitted (or when it is stored and retrieved).
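
A small Python illustration using the CRC-32 variant implemented in the standard zlib module (a single changed byte yields a different check value):

    import zlib

    data = b"compress this stream"
    print(hex(zlib.crc32(data)))                     # CRC-32 of the data
    print(hex(zlib.crc32(b"compress this streaN")))  # one bad byte, new CRC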

CRT. A CRT (cathode ray tube) is a glass tube with a familiar shape. In the back it has an electron gun (the cathode) that emits a stream of electrons. Its front surface is positively charged, so it attracts the electrons (which have a negative electric charge). The front is coated with a phosphor compound that converts the kinetic energy of the electrons hitting it to light. The flash of light lasts only a fraction of a second, so in order to get a constant display, the picture has to be refreshed several times a second.

Data Compression Conference. A meeting of researchers and developers in the area of data compression. The DCC takes place every year in Snowbird, Utah, USA. It lasts three days, and the next few meetings are scheduled for late March.

Data Structure. A set of data items used by a program and stored in memory such that certain operations (for example, finding, adding, modifying, and deleting items) can be performed on the data items fast and easily. The most common data structures are the array, stack, queue, linked list, tree, graph, and hash table. (See also Circular Queue.)

Decibel. A logarithmic measure that can be used to measure any quantity that takes values over a very wide range. A common example is sound intensity. The intensity (amplitude) of sound can vary over a range of 11–12 orders of magnitude. Instead of using a linear measure, where numbers as small as 1 and as large as $10^{11}$ would be needed, a logarithmic scale is used, where the range of values is [0, 11].
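
As a concrete formula (the standard definition, stated here for reference), the level of a power quantity P relative to a reference $P_0$ is

$$10 \log_{10}\frac{P}{P_0}\ \text{decibels},$$

so a quantity spanning 11 orders of magnitude spans 110 dB (11 bels, the unscaled logarithmic values mentioned above).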

Decoder. A decompression program (or algorithm).

Deflate. A popular lossless compression algorithm (Section 3.23) used by Zip and gzip. Deflate employs a variant of LZ77 combined with static Huffman coding. It uses a 32-Kb-long sliding dictionary and a look-ahead buffer of 258 bytes. When a string is not found in the dictionary, its first symbol is emitted as a literal byte. (See also Gzip, Zip.)
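
Python’s standard zlib module implements Deflate (with a thin zlib wrapper around the raw stream), so the algorithm is easy to try:

    import zlib

    original = b"abcabcabcabc" * 100
    packed = zlib.compress(original, 9)      # Deflate, maximum effort
    assert zlib.decompress(packed) == original
    print(len(original), "->", len(packed))  # repetitive input shrinks a lot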

Dictionary-Based Compression. Compression methods (Chapter 3) that save pieces of the data in a “dictionary” data structure. If a string of new data is identical to a piece that is already saved in the dictionary, a pointer to that piece is output to the compressed stream. (See also LZ Methods.)

Differential Image Compression. A lossless image compression method where each pixel p is compared to a reference pixel, which is one of its immediate neighbors, and is then encoded in two parts: a prefix, which is the number of most significant bits of p that are identical to those of the reference pixel, and a suffix, which is (almost all) the remaining least significant bits of p. (See also DPCM.)

Digital Video. A form of video in which the original image is generated, in the camera, in the form of pixels. (See also High-Definition Television.)

Digram. A pair of consecutive symbols.

Discrete Cosine Transform. A variant of the discrete Fourier transform (DFT) that produces just real numbers. The DCT (Sections 4.6, 4.8.2, and 8.15.2) transforms a set of numbers by combining n numbers to become an n-dimensional point and rotating it in n dimensions such that the first coordinate becomes dominant. The DCT and its inverse, the IDCT, are used in JPEG (Section 4.8) to compress an image with acceptable loss, by isolating the high-frequency components of an image, so that they can later be quantized. (See also Fourier Transform, Transform.)

Discrete-Tone Image. A discrete-tone image may be bi-level, grayscale, or color. Such images are (with some exceptions) artificial, having been obtained by scanning a document, or capturing a computer screen. The pixel colors of such an image do not vary continuously or smoothly, but have a small set of values, such that adjacent pixels may differ much in intensity or color. Figure 4.57 is an example of such an image. (See also Block Decomposition, Continuous-Tone Image.)

Discrete Wavelet Transform. The discrete version of the continuous wavelet transform. A wavelet is represented by means of several filter coefficients, and the transform is carried out by matrix multiplication (or a simpler version thereof) instead of by calculating an integral. (See also Continuous Wavelet Transform, Multiresolution Decomposition.)

DjVu. Certain images combine the properties of all three image types (bi-level, discrete-tone, and continuous-tone). An important example of such an image is a scanned document containing text, line drawings, and regions with continuous-tone pictures, such as paintings or photographs. DjVu (pronounced “deja vu”) is designed for high compression and fast decompression of such documents.

It starts by decomposing the document into three components: mask, foreground, and background. The background component contains the pixels that constitute the pictures and the paper background. The mask contains the text and the lines in bi-level form (i.e., one bit per pixel). The foreground contains the color of the mask pixels. The background is a continuous-tone image and can be compressed at the low resolution of 100 dpi. The foreground normally contains large uniform areas and is also compressed as a continuous-tone image at the same low resolution. The mask is left at 300 dpi but can be efficiently compressed, since it is bi-level. The background and foreground are compressed with a wavelet-based method called IW44, while the mask is compressed with JB2, a version of JBIG2 (Section 4.12) developed at AT&T.

DPCM. DPCM compression is a member of the family of differential encoding compression methods, which itself is a generalization of the simple concept of relative encoding (Section 1.3.1). It is based on the fact that neighboring pixels in an image (and also adjacent samples in digitized sound) are correlated. (See also Differential Image Compression, Relative Encoding.)

Embedded Coding. This feature is defined as follows: Imagine that an image encoder is applied twice to the same image, with different amounts of loss. It produces two files, a large one of size M and a small one of size m. If the encoder uses embedded coding, the smaller file is identical to the first m bits of the larger file.

The following example aptly illustrates the meaning of this definition. Suppose that three users wait for you to send them a certain compressed image, but they need different image qualities. The first one needs the quality contained in a 10 Kb file. The image qualities required by the second and third users are contained in files of sizes 20 Kb and 50 Kb, respectively. Most lossy image compression methods would have to compress the same image three times, at different qualities, to generate three files with the right sizes. An embedded encoder, on the other hand, produces one file, and then three chunks (of lengths 10 Kb, 20 Kb, and 50 Kb, all starting at the beginning of the file) can be sent to the three users, satisfying their needs. (See also SPIHT, EZW.)

Encoder. A compression program (or algorithm).

Entropy. The entropy of a single symbol $a_i$ is defined (in Section 2.1) as $-P_i \log_2 P_i$, where $P_i$ is the probability of occurrence of $a_i$ in the data. The entropy of $a_i$ is the smallest number of bits needed, on average, to represent symbol $a_i$. Claude Shannon, the creator of information theory, coined the term entropy in 1948, because this term is used in thermodynamics to indicate the amount of disorder in a physical system. (See also Entropy Encoding, Information Theory.)
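
Summing this quantity over all symbols gives the average number of bits per symbol of the whole distribution; a minimal Python sketch:

    from math import log2

    def entropy(probabilities):
        """Average number of bits per symbol for a given distribution."""
        return -sum(p * log2(p) for p in probabilities if p > 0)

    print(entropy([0.5, 0.25, 0.25]))  # 1.5 bits/symbol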

Entropy Encoding. A lossless compression method where data can be compressed such that the average number of bits/symbol approaches the entropy of the input symbols. (See also Entropy.)

Error-Correcting Codes. The opposite of data compression, these codes detect and correct errors in digital data by increasing the redundancy of the data. They use check bits or parity bits, and are sometimes designed with the help of generating polynomials.

EXE Compressor. A compression program for compressing EXE files on the PC. Such a compressed file can be decompressed and executed with one command. The original EXE compressor is LZEXE, by Fabrice Bellard (Section 3.27).

Exediff. A differential file compression algorithm created by Brenda Baker, Udi Manber, and Robert Muth for the differential compression of executable code. Exediff is an iterative algorithm that uses a lossy transform to reduce the effect of the secondary changes in executable code. Two operations called pre-matching and value recovery are iterated until the size of the patch converges to a minimum. Exediff’s decoder is called exepatch. (See also BSDiff, File Differencing, UNIX diff, VCDIFF, and Zdelta.)

EZW. A progressive, embedded image coding method based on the zerotree data structure. It has largely been superseded by the more efficient SPIHT method. (See also SPIHT, Progressive Image Compression, Embedded Coding.)

Facsimile Compression. Transferring a typical page between two fax machines can take up to 10–11 minutes without compression. This is why the ITU has developed several standards for compression of facsimile data. The current standards (Section 2.13) are T4 and T6, also called Group 3 and Group 4, respectively. (See also ITU.)

FELICS. A Fast, Efficient, Lossless Image Compression method designed for grayscale images that competes with the lossless mode of JPEG. The principle is to code each pixel with a variable-size code based on the values of two of its previously seen neighbor pixels. Both the unary code and the Golomb code are used. There is also a progressive version of FELICS (Section 4.20). (See also Progressive FELICS.)

FHM Curve Compression. A method for compressing curves. The acronym FHM stands for Fibonacci, Huffman, and Markov. (See also Fibonacci Numbers.)

Fibonacci Numbers. A sequence of numbers defined by

$$F_1 = 1, \quad F_2 = 1, \quad F_i = F_{i-1} + F_{i-2}, \quad i = 3, 4, \ldots.$$

The first few numbers in the sequence are 1, 1, 2, 3, 5, 8, 13, and 21. These numbers have many applications in mathematics and in various sciences. They are also found in nature, and are related to the golden ratio. (See also FHM Curve Compression.)
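
The recurrence translates directly into code (a minimal Python sketch):

    def fibonacci(n):
        """Return the first n Fibonacci numbers, F_1 through F_n."""
        fibs = [1, 1]
        while len(fibs) < n:
            fibs.append(fibs[-1] + fibs[-2])  # F_i = F_(i-1) + F_(i-2)
        return fibs[:n]

    print(fibonacci(8))  # [1, 1, 2, 3, 5, 8, 13, 21]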

File Differencing. A compression method that locates and compresses the differences between two slightly different data sets. The decoder, which has access to one of the two data sets, can use the differences to reconstruct the other. Applications of this compression technique include software distribution and updates (or patching), revision control systems, compression of backup files, and archival of multiple versions of data. (See also BSDiff, Exediff, UNIX diff, VCDIFF, and Zdelta.)

FLAC. An acronym for Free Lossless Audio Codec, FLAC is an audio compression method, somewhat resembling Shorten, that is based on prediction of audio samples and encoding of the prediction residues with Rice codes. (See also Rice Codes.)

Fourier Transform. A mathematical transformation that produces the frequency components of a function (Section 5.1). The Fourier transform shows how a periodic function can be written as the sum of sines and cosines, thereby showing explicitly the frequencies “hidden” in the original representation of the function. (See also Discrete Cosine Transform, Transform.)

Gaussian Distribution. (See Normal Distribution.)

GFA. A compression method originally developed for bi-level images that can also be used for color images. GFA uses the fact that most images of interest have a certain amount of self-similarity (i.e., parts of the image are similar, up to size, orientation, or brightness, to the entire image or to other parts). GFA partitions the image into subsquares using a quadtree, and expresses relations between parts of the image in a graph. The graph is similar to graphs used to describe finite-state automata. The method is lossy, because parts of a real image may be very (although not completely) similar to other parts. (See also Quadtrees, Resolution Independent Compression, WFA.)

GIF. An acronym that stands for Graphics Interchange Format. This format (Section 3.19) was developed by Compuserve Information Services in 1987 as an efficient, compressed graphics file format that allows for images to be sent between computers. The original version of GIF is known as GIF 87a. The current standard is GIF 89a. (See also Patents.)

Golomb Code. The Golomb codes consist of an infinite set of parametrized prefix codes. They are the best ones for the compression of data items that are distributed geometrically. (See also Unary Code.)
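
A minimal Python sketch of one common formulation (a unary quotient followed by a truncated-binary remainder; the function name and the m >= 2 restriction are choices of this sketch):

    def golomb_encode(n, m):
        """Golomb code of a nonnegative integer n with parameter m >= 2:
        the quotient n // m in unary, then n % m in truncated binary."""
        q, r = divmod(n, m)
        b = (m - 1).bit_length()        # b = ceil(log2 m)
        cutoff = (1 << b) - m           # remainders below this need b-1 bits
        unary = "1" * q + "0"
        if r < cutoff:
            return unary + format(r, "0%db" % (b - 1))
        return unary + format(r + cutoff, "0%db" % b)

    # Small values get short codes, as suits a geometric distribution.
    print([golomb_encode(n, 4) for n in range(6)])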

Gray Codes. These are binary codes for the integers, where the codes of consecutive integers differ by one bit only. Such codes are used when a grayscale image is separated into bitplanes, each a bi-level image. (See also Grayscale Image.)
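
The reflected Gray code of an integer can be computed with a single shift and XOR (a standard construction, shown here in Python):

    def gray(n):
        """Reflected Gray code; consecutive integers differ in one bit."""
        return n ^ (n >> 1)

    print([format(gray(i), "03b") for i in range(8)])
    # ['000', '001', '011', '010', '110', '111', '101', '100']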

Grayscale Image. A continuous-tone image with shades of a single color. (See also Continuous-Tone Image.)

Growth Geometry Coding. A method for progressive lossless compression of bi-level images. The method selects some seed pixels and applies geometric rules to grow each seed pixel into a pattern of pixels. (See also Progressive Image Compression.)

Gzip. Popular software that implements the Deflate algorithm (Section 3.23) that uses a variation of LZ77 combined with static Huffman coding. It uses a 32-Kb-long sliding dictionary, and a look-ahead buffer of 258 bytes. When a string is not found in the dictionary, it is emitted as a sequence of literal bytes. (See also Zip.)

H.261. In late 1984, the CCITT (currently the ITU-T) organized an expert group to develop a standard for visual telephony for ISDN services. The idea was to send images and sound between special terminals, so that users could talk and see each other. This type of application requires sending large amounts of data, so compression became an important consideration. The group eventually came up with a number of standards, known as the H series (for video) and the G series (for audio) recommendations, all operating at speeds of $p \times 64$ Kbit/sec for $p = 1, 2, \ldots, 30$. These standards are known today under the umbrella name of $p \times 64$.

H.264. A sophisticated method for the compression of video. This method is a successor of H.261, H.262, and H.263. It was approved in 2003 and employs the main building blocks of its predecessors, but with many additions and improvements.

Halftoning. A method for the display of gray scales in a bi-level image. By placing groups of black and white pixels in carefully designed patterns, it is possible to create the effect of a gray area. The trade-off of halftoning is loss of resolution. (See also Bi-level Image, Dithering.)

Hamming Codes. A type of error-correcting code for 1-bit errors, where it is easy to generate the required parity bits.

Hierarchical Progressive Image Compression. An image compression method (or an optional part of such a method) where the encoder writes the compressed image in layers of increasing resolution. The decoder decompresses the lowest-resolution layer first, displays this crude image, and continues with higher-resolution layers. Each layer in the compressed stream uses data from the preceding layer. (See also Progressive Image Compression.)

High-Definition Television. A general name for several standards that are currently replacing traditional television. HDTV uses digital video, high-resolution images, and aspect ratios different from the traditional 4:3. (See also Digital Video.)

Huffman Coding. A popular method for data compression (Section 2.8). It assigns a set of “best” variable-size codes to a set of symbols based on their probabilities. It serves as the basis for several popular programs used on personal computers. Some of them use just the Huffman method, while others use it as one step in a multistep compression process. The Huffman method is somewhat similar to the Shannon-Fano method. It generally produces better codes, and like the Shannon-Fano method, it produces the best codes when the probabilities of the symbols are negative powers of 2. The main difference between the two methods is that Shannon-Fano constructs its codes top to bottom (from the leftmost to the rightmost bits), while Huffman constructs a code tree from the bottom up (builds the codes from right to left). (See also Shannon-Fano Coding, Statistical Methods.)
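
A compact sketch of the bottom-up construction in Python (repeatedly merge the two least-probable nodes; the data structures are choices of this sketch):

    import heapq

    def huffman_codes(freqs):
        """Build Huffman codes from {symbol: probability or count}."""
        heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freqs.items())]
        heapq.heapify(heap)
        tick = len(heap)                      # tie-breaker for equal weights
        while len(heap) > 1:
            f1, _, c1 = heapq.heappop(heap)   # two least-probable nodes
            f2, _, c2 = heapq.heappop(heap)
            merged = {s: "0" + c for s, c in c1.items()}
            merged.update({s: "1" + c for s, c in c2.items()})
            heapq.heappush(heap, (f1 + f2, tick, merged))
            tick += 1
        return heap[0][2]

    # Probabilities that are negative powers of 2 give code lengths 1, 2, 3, 3.
    print(huffman_codes({"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}))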

Hyperspectral data. A set of data items (called pixels) arranged in rows and columns where each pixel is a vector. An example is an image where each pixel consists of the radiation reflected from the ground in many frequencies. We can think of such data as several image planes (called bands) stacked vertically. Hyperspectral data is normally large and is an ideal candidate for compression. Any compression method for this type of data should take advantage of the correlation between bands as well as correlations between pixels in the same band.

Information Theory. A mathematical theory that quantifies information. It shows how to measure information, so that one can answer the question, How much information is included in a given piece of data? with a precise number. Information theory is the creation, in 1948, of Claude Shannon of Bell Labs. (See also Entropy.)

Interpolating Polynomials. Given two numbers a and b we know that m = 0.5a + 0.5b is their average, since it is located midway between a and b. We say that the average is an interpolation of the two numbers. Similarly, the weighted sum 0.1a + 0.9b represents an interpolated value located 10% away from b and 90% away from a. Extending this concept to points (in two or three dimensions) is done by means of interpolating polynomials. Given a set of points, we start by fitting a parametric polynomial P(t) or P(u, w) through them. Once the polynomial is known, it can be used to calculate interpolated points by computing P(0.5), P(0.1), or other values.

ISO. The International Standards Organization. This is one of the organizations responsible for developing standards. Among other things it is responsible (together with the ITU) for the JPEG and MPEG compression standards. (See also ITU, CCITT, MPEG.)

Iterated Function Systems (IFS). An image compressed by IFS is uniquely defined by a few affine transformations (Section 4.35.1). The only rule is that the scale factors of these transformations must be less than 1 (shrinking). The image is saved in the output stream by writing the sets of six numbers that define each transformation. (See also Affine Transformations, Resolution Independent Compression.)

ITU. The International Telecommunications Union, the new name of the CCITT, is a United Nations organization responsible for developing and recommending standards for data communications (not just compression). (See also CCITT.)

JBIG. A special-purpose compression method (Section 4.11) developed specifically for progressive compression of bi-level images. The name JBIG stands for Joint Bi-Level Image Processing Group. This is a group of experts from several international organizations, formed in 1988 to recommend such a standard. JBIG uses multiple arithmetic coding and a resolution-reduction technique to achieve its goals. (See also Bi-level Image, JBIG2.)

JBIG2. A recent international standard for the compression of bi-level images. It is intended to replace the original JBIG. Its main features are

1. Large increases in compression performance (typically 3–5 times better than Group 4/MMR, and 2–4 times better than JBIG).

2. Special compression methods for text, halftones, and other bi-level image parts.

3. Lossy and lossless compression modes.

4. Two modes of progressive compression. Mode 1 is quality-progressive compression, where the decoded image progresses from low to high quality. Mode 2 is content-progressive coding, where important image parts (such as text) are decoded first, followed by less important parts (such as halftone patterns).

5. Multipage document compression.

6. Flexible format, designed for easy embedding in other image file formats, such as TIFF.

7. Fast decompression. In some coding modes, images can be decompressed at over 250 million pixels/second in software.

(See also Bi-level Image, JBIG.)

JFIF. The full name of this method (Section 4.8.7) is JPEG File Interchange Format. It is a graphics file format that makes it possible to exchange JPEG-compressed images between different computers. The main features of JFIF are the use of the YCbCr triple-component color space for color images (only one component for grayscale images) and the use of a marker to specify features missing from JPEG, such as image resolution, aspect ratio, and features that are application-specific.

JPEG. A sophisticated lossy compression method (Section 4.8) for color or grayscale still images (not movies). It works best on continuous-tone images, where adjacent pixels have similar colors. One advantage of JPEG is the use of many parameters, allowing the user to adjust the amount of data loss (and thereby also the compression ratio) over a very wide range. There are two main modes: lossy (also called baseline) and lossless (which typically yields a 2:1 compression ratio). Most implementations support just the lossy mode. This mode includes progressive and hierarchical coding.

The main idea behind JPEG is that an image exists for people to look at, so when the image is compressed, it is acceptable to lose image features to which the human eye is not sensitive.

The name JPEG is an acronym that stands for Joint Photographic Experts Group. This was a joint effort by the CCITT and the ISO that started in June 1987. The JPEG standard has proved successful and has become widely used for image presentation, especially in Web pages. (See also JPEG-LS, MPEG.)

JPEG-LS. The lossless mode of JPEG is inefficient and often is not even implemented. As a result, the ISO decided to develop a new standard for the lossless (or near-lossless) compression of continuous-tone images. The result became popularly known as JPEG-LS. This method is not simply an extension or a modification of JPEG. It is a new method, designed to be simple and fast. It does not employ the DCT, does not use arithmetic coding, and applies quantization in a limited way, and only in its near-lossless option. JPEG-LS examines several of the previously-seen neighbors of the current pixel, uses them as the context of the pixel, employs the context to predict the pixel and to select a probability distribution out of several such distributions, and uses that distribution to encode the prediction error with a special Golomb code. There is also a run mode, where the length of a run of identical pixels is encoded. (See also Golomb Code, JPEG.)

As for my mother, perhaps the Ambassador had not the type of mind towards which she felt herself most attracted. I should add that his conversation furnished so exhaustive a glossary of the superannuated forms of speech peculiar to a certain profession, class and period.

—Marcel Proust, Within a Budding Grove (1913–1927)

Kraft-MacMillan Inequality. A relation (Section 2.6) that says something about unambiguous variable-size codes. Its first part states: Given an unambiguous variable-size code, with n codes of sizes $L_i$, then

$$\sum_{i=1}^{n} 2^{-L_i} \leq 1.$$

[This is Equation (2.5).] The second part states the opposite, namely, given a set of n positive integers $(L_1, L_2, \ldots, L_n)$ that satisfy Equation (2.5), there exists an unambiguous variable-size code such that the $L_i$ are the sizes of its individual codes. Together, both parts state that a code is unambiguous if and only if it satisfies relation (2.5).

KT Probability Estimator. A method to estimate the probability of a bitstring containing a zeros and b ones. It is due to Krichevsky and Trofimov. (See also Context-Tree Weighting.)

Laplace Distribution. A probability distribution similar to the normal (Gaussian) distribution, but narrower and sharply peaked. The general Laplace distribution with variance V and mean m is given by

$$L(V, x) = \frac{1}{\sqrt{2V}} \exp\left(-\sqrt{\frac{2}{V}}\,|x - m|\right).$$

Experience seems to suggest that the values of the residues computed by many image compression algorithms are Laplace distributed, which is why this distribution is employed by those compression methods, most notably MLP. (See also Normal Distribution.)

Laplacian Pyramid. A progressive image compression technique where the original image is transformed to a set of difference images that can later be decompressed and displayed as a small, blurred image that becomes increasingly sharper. (See also Progressive Image Compression.)

LHArc. This method (Section 3.22) is by Haruyasu Yoshizaki. Its predecessor is LHA, designed jointly by Haruyasu Yoshizaki and Haruhiko Okumura. These methods are based on adaptive Huffman coding with features drawn from LZSS.

Lifting Scheme. A method for computing the discrete wavelet transform in place, so no extra memory is required. (See also Discrete Wavelet Transform.)

Locally Adaptive Compression. A compression method that adapts itself to local conditions in the input stream, and varies this adaptation as it moves from area to area in the input. An example is the move-to-front method of Section 1.5. (See also Adaptive Compression, Semiadaptive Compression.)

Lossless Compression. A compression method where the output of the decoder is identical to the original data compressed by the encoder. (See also Lossy Compression.)

Lossy Compression. A compression method where the output of the decoder is different from the original data compressed by the encoder, but is nevertheless acceptable to a user. Such methods are common in image and audio compression, but not in text compression, where the loss of even one character may result in wrong, ambiguous, or incomprehensible text. (See also Lossless Compression, Subsampling.)

LPVQ. An acronym for Locally Optimal Partitioned Vector Quantization. LPVQ is a quantization algorithm proposed by Giovanni Motta, Francesco Rizzo, and James Storer [Motta et al. 06] for the lossless and near-lossless compression of hyperspectral data. Spectral signatures are first partitioned into subvectors of unequal length and independently quantized. Then, indices are entropy coded by exploiting both spectral and spatial correlation. The residual error is also entropy coded, with the probabilities conditioned by the quantization indices. The locally optimal partitioning of the spectral signatures is decided at design time, during the training of the quantizer.

Luminance. This quantity is defined by the CIE (Section 4.8.1) as radiant power weighted by a spectral sensitivity function that is characteristic of vision. (See also CIE.)

LZ Methods. All dictionary-based compression methods are based on the work of J. Ziv and A. Lempel, published in 1977 and 1978. Today, these are called LZ77 and LZ78 methods, respectively. Their ideas have been a source of inspiration to many researchers, who generalized, improved, and combined them with RLE and statistical methods to form many commonly used adaptive compression methods, for text, images, and audio. (See also Block Matching, Dictionary-Based Compression, Sliding Window Compression.)

LZAP. The LZAP method (Section 3.14) is an LZW variant based on the following idea: Instead of just concatenating the last two phrases and placing the result in the dictionary, place all prefixes of the concatenation in the dictionary. The suffix AP stands for All Prefixes.

LZARI. An improvement on LZSS, developed in 1988 by Haruhiko Okumura. (See also LZSS.)

LZFG. This is the name of several related methods (Section 3.9) that are hybrids of LZ77 and LZ78. They were developed by Edward Fiala and Daniel Greene. All these methods are based on the following scheme. The encoder produces a compressed file with tokens and literals (raw ASCII codes) intermixed. There are two types of tokens, a literal and a copy. A literal token indicates that a string of literals follows, a copy token points to a string previously seen in the data. (See also LZ Methods, Patents.)

LZMA. LZMA (Lempel-Ziv-Markov chain-Algorithm) is one of the many LZ77 variants. Developed by Igor Pavlov, this algorithm, which is used in his popular 7z software, is based on a large search buffer, a hash function that generates indexes, somewhat similar to LZRW4, and two search methods. The fast method uses a hash-array of lists of indexes and the normal method uses a hash-array of binary decision trees. (See also 7-Zip.)

LZMW. A variant of LZW, the LZMW method (Section 3.13) works as follows: Instead of adding I plus one character of the next phrase to the dictionary, add I plus the entire next phrase to the dictionary. (See also LZW.)

LZP. An LZ77 variant developed by C. Bloom (Section 3.16). It is based on the principle of context prediction that says “if a certain string abcde has appeared in the input stream in the past and was followed by fg..., then when abcde appears again in the input stream, there is a good chance that it will be followed by the same fg....” (See also Context.)

LZSS. This version of LZ77 (Section 3.4) was developed by Storer and Szymanski in 1982 [Storer 82]. It improves on the basic LZ77 in three ways: (1) it holds the look-ahead buffer in a circular queue, (2) it implements the search buffer (the dictionary) in a binary search tree, and (3) it creates tokens with two fields instead of three. (See also LZ Methods, LZARI.)

LZW. This is a popular variant (Section 3.12) of LZ78, developed by Terry Welch in 1984. Its main feature is eliminating the second field of a token. An LZW token consists of just a pointer to the dictionary. As a result, such a token always encodes a string of more than one symbol. (See also Patents.)

LZX. LZX is an LZ77 variant for the compression of cabinet files (Section 3.7).

LZY. LZY (Section 3.15) is an LZW variant that adds one dictionary string per input character and increments strings by one character at a time.

MLP. A progressive compression method for grayscale images. An image is compressed in levels. A pixel is predicted by a symmetric pattern of its neighbors from preceding levels, and the prediction error is arithmetically encoded. The Laplace distribution is used to estimate the probability of the error. (See also Laplace Distribution, Progressive FELICS.)

MLP Audio. The new lossless compression standard approved for DVD-A (audio) is called MLP. It is the topic of Section 7.7.

MNP5, MNP7. These have been developed by Microcom, Inc., a maker of modems, for use in its modems. MNP5 (Section 2.10) is a two-stage process that starts with run-length encoding, followed by adaptive frequency encoding. MNP7 (Section 2.11) combines run-length encoding with a two-dimensional variant of adaptive Huffman.

Model of Compression. A model is a method to “predict” (to assign probabilities to) the data to be compressed. This concept is important in statistical data compression. When a statistical method is used, a model for the data has to be constructed before compression can begin. A simple model can be built by reading the entire input stream, counting the number of times each symbol appears (its frequency of occurrence), and computing the probability of occurrence of each symbol. The data stream is then input again, symbol by symbol, and is compressed using the information in the probability model. (See also Statistical Methods, Statistical Model.)

One feature of arithmetic coding is that it is easy to separate the statistical model (the table with frequencies and probabilities) from the encoding and decoding operations. It is easy to encode, for example, the first half of a data stream using one model, and the second half using another model.

Monkey’s audio. Monkey’s audio is a fast, efficient, free, lossless audio compression algorithm and implementation that offers error detection, tagging, and external support.

Move-to-Front Coding. The basic idea behind this method (Section 1.5) is to maintain the alphabet A of symbols as a list where frequently occurring symbols are located near the front. A symbol s is encoded as the number of symbols that precede it in this list. After symbol s is encoded, it is moved to the front of list A.
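
A minimal Python sketch (the function name is illustrative):

    def mtf_encode(data, alphabet):
        """Emit each symbol's position in the list, then move it to the front."""
        table = list(alphabet)
        out = []
        for s in data:
            i = table.index(s)             # number of symbols preceding s
            out.append(i)
            table.insert(0, table.pop(i))  # move s to the front
        return out

    print(mtf_encode("aaabbbd", "abcd"))  # [0, 0, 0, 1, 0, 0, 3]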

MPEG. This acronym stands for Moving Pictures Experts Group. The MPEG standard consists of several methods for the compression of video, including the compression of digital images and digital sound, as well as synchronization of the two. There currently are several MPEG standards. MPEG-1 is intended for intermediate data rates, on the order of 1.5 Mbit/sec. MPEG-2 is intended for high data rates of at least 10 Mbit/sec. MPEG-3 was intended for HDTV compression but was found to be redundant and was merged with MPEG-2. MPEG-4 is intended for very low data rates of less than 64 Kbit/sec. The ITU-T has been involved in the design of both MPEG-2 and MPEG-4. A working group of the ISO is still at work on MPEG. (See also ISO, JPEG.)

Multiresolution Decomposition. This method groups all the discrete wavelet transform coefficients for a given scale, displays their superposition, and repeats for all scales. (See also Continuous Wavelet Transform, Discrete Wavelet Transform.)

Multiresolution Image. A compressed image that may be decompressed at any resolution. (See also Resolution Independent Compression, Iterated Function Systems, WFA.)

Normal Distribution. A probability distribution with the well-known bell shape. It is found in many places in both theoretical models and real-life situations. The normal distribution with mean m and standard deviation s is defined by

$$f(x) = \frac{1}{s\sqrt{2\pi}} \exp\left\{-\frac{1}{2}\left(\frac{x - m}{s}\right)^2\right\}.$$

Patents. A mathematical algorithm can be patented if it is intimately associated with software or firmware implementing it. Several compression methods, most notably LZW, have been patented (Section 3.30), creating difficulties for software developers who work with GIF, UNIX compress, or any other system that uses LZW. (See also GIF, LZW, Compress.)

Pel. The smallest unit of a facsimile image; a dot. (See also Pixel.)

Phrase. A piece of data placed in a dictionary to be used in compressing future data. The concept of phrase is central in dictionary-based data compression methods since the success of such a method depends a lot on how it selects phrases to be saved in its dictionary. (See also Dictionary-Based Compression, LZ Methods.)

Pixel. The smallest unit of a digital image; a dot. (See also Pel.)

PKZip. A compression program for MS/DOS (Section 3.22) written by Phil Katz, who founded the PKWare company, which also markets the PKunzip, PKlite, and PKArc software (http://www.pkware.com).

PNG. An image file format (Section 3.25) that includes lossless compression with Deflate and pixel prediction. PNG is free and supports several image types and numbers of bitplanes, as well as sophisticated transparency.

Portable Document Format (PDF). A standard developed by Adobe in 1991–1992 that allows arbitrary documents to be created, edited, transferred between different computer platforms, and printed. PDF compresses the data in the document (text and images) by means of LZW, Flate (a variant of Deflate), run-length encoding, JPEG, JBIG2, and JPEG 2000.

PPM. A compression method that assigns probabilities to symbols based on the context (long or short) in which they appear. (See also Prediction, PPPM.)

PPPM. A lossless compression method for grayscale (and color) images that assigns probabilities to symbols based on the Laplace distribution, like MLP. Different contexts of a pixel are examined, and their statistics used to select the mean and variance for a particular Laplace distribution. (See also Laplace Distribution, Prediction, PPM, MLP.)

Prediction. Assigning probabilities to symbols. (See also PPM.)

Prefix Compression. A variant of quadtrees, designed for bi-level images with text or diagrams, where the number of black pixels is relatively small. Each pixel in a $2^n \times 2^n$ image is assigned an n-digit, or 2n-bit, number based on the concept of quadtrees. Numbers of adjacent pixels tend to have the same prefix (most-significant bits), so the common prefix and different suffixes of a group of pixels are compressed separately. (See also Quadtrees.)

Prefix Property. One of the principles of variable-size codes. It states: Once a certain bit pattern has been assigned as the code of a symbol, no other codes should start with that pattern (the pattern cannot be the prefix of any other code). Once the string 1, for example, is assigned as the code of $a_1$, no other codes should start with 1 (i.e., they all have to start with 0). Once 01, for example, is assigned as the code of $a_2$, no other codes can start with 01 (they all should start with 00). (See also Variable-Size Codes, Statistical Methods.)

Progressive FELICS. A progressive version of FELICS where pixels are encoded in levels. Each level doubles the number of pixels encoded. To decide what pixels are included in a certain level, the preceding level can conceptually be rotated 45° and scaled by $\sqrt{2}$ in both dimensions. (See also FELICS, MLP, Progressive Image Compression.)

Progressive Image Compression. An image compression method where the compressed stream consists of “layers,” where each layer contains more detail of the image. The decoder can very quickly display the entire image in a low-quality format, and then improve the display quality as more and more layers are being read and decompressed. A user watching the decompressed image develop on the screen can normally recognize most of the image features after only 5–10% of it has been decompressed. Improving image quality over time can be done by (1) sharpening it, (2) adding colors, or (3) increasing its resolution. (See also Progressive FELICS, Hierarchical Progressive Image Compression, MLP, JBIG.)

Psychoacoustic Model. A mathematical model of the sound masking properties of the human auditory (ear and brain) system.

QIC-122 Compression. An LZ77 variant developed by the QIC organization for text compression on 1/4-inch data cartridge tape drives.

QM Coder. This is the arithmetic coder of JPEG and JBIG. It is designed for simplicity and speed, so it is limited to input symbols that are single bits and it employs an approximation instead of exact multiplication. It also uses fixed-precision integer arithmetic, so it has to resort to renormalization of the probability interval from time to time, in order for the approximation to remain close to the true multiplication. (See also Arithmetic Coding.)

Quadrisection. This is a relative of the quadtree method. It assumes that the original image is a $2^k \times 2^k$ square matrix $M_0$, and it constructs matrices $M_1, M_2, \ldots, M_{k+1}$ with fewer and fewer columns. These matrices naturally have more and more rows, and quadrisection achieves compression by searching for and removing duplicate rows. Two closely related variants of quadrisection are bisection and octasection. (See also Quadtrees.)

Quadtrees. This is a data compression method for bitmap images. A quadtree (Section 4.30) is a tree where each leaf corresponds to a uniform part of the image (a quadrant, subquadrant, or a single pixel) and each interior node has exactly four children. (See also Bintrees, Prefix Compression, Quadrisection.)

Quaternary. A base-4 digit. It can be 0, 1, 2, or 3.

RAR. An LZ77 variant designed and developed by Eugene Roshal. RAR is extremely popular with Windows users and is available for a variety of platforms. In addition to excellent compression and good encoding speed, RAR offers options such as error-correcting codes and encryption. (See also Rarissimo.)

Rarissimo. A file utility that is always used in conjunction with RAR. It is designed to periodically check certain source folders, automatically compress and decompress files found there, and then move those files to designated target folders. (See also RAR.)

Recursive range reduction (3R). Recursive range reduction (3R) is a simple coding algorithm that offers decent compression, is easy to program, and whose performance is independent of the amount of data to be compressed.

Relative Encoding. A variant of RLE, sometimes called differencing (Section 1.3.1). It is used in cases where the data to be compressed consists of a string of numbers that don’t differ by much, or in cases where it consists of strings that are similar to each other. The principle of relative encoding is to send the first data item $a_1$ followed by the differences $a_{i+1} - a_i$. (See also DPCM, RLE.)
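
A minimal Python sketch of both directions:

    def relative_encode(items):
        """Send the first item, then successive differences."""
        return [items[0]] + [b - a for a, b in zip(items, items[1:])]

    def relative_decode(codes):
        out = [codes[0]]
        for d in codes[1:]:
            out.append(out[-1] + d)   # undo the differencing
        return out

    print(relative_encode([100, 102, 101, 105]))  # [100, 2, -1, 4]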

Reliability. Variable-size codes and other codes are vulnerable to errors. In cases where reliable storage and transmission of codes are important, the codes can be made more reliable by adding check bits, parity bits, or CRC (Section 2.12). Notice that reliability is, in a sense, the opposite of data compression, because it is achieved by increasing redundancy. (See also CRC.)

Resolution Independent Compression. An image compression method that does not depend on the resolution of the specific image being compressed. The image can be decompressed at any resolution. (See also Multiresolution Images, Iterated Function Systems, WFA.)

Rice Codes. A special case of the Golomb code. (See also Golomb Codes.)

RLE. A general name for methods that compress data by replacing a run of identical symbols with one code, or token, containing the symbol and the length of the run. RLE sometimes serves as one step in a multistep statistical or dictionary-based method. (See also Relative Encoding, Conditional Image RLE.)
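
A minimal Python sketch using the standard library:

    from itertools import groupby

    def rle_encode(data):
        """Replace each run of identical symbols with a (symbol, length) token."""
        return [(s, len(list(run))) for s, run in groupby(data)]

    print(rle_encode("aaabbbbcd"))  # [('a', 3), ('b', 4), ('c', 1), ('d', 1)]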

Scalar Quantization. The dictionary definition of the term “quantization” is “to restrict a variable quantity to discrete values rather than to a continuous set of values.” If the data to be compressed is in the form of large numbers, quantization is used to convert them to small numbers. This results in (lossy) compression. If the data to be compressed is analog (e.g., a voltage that changes with time), quantization is used to digitize it into small numbers. This aspect of quantization is used by several audio compression methods. (See also Vector Quantization.)
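
A minimal sketch of a uniform scalar quantizer in Python (the step size is an arbitrary choice; the rounding loss is what makes the method lossy):

    def quantize(x, step):
        """Map x to the index of the nearest multiple of step."""
        return round(x / step)        # a small integer, cheap to encode

    def dequantize(q, step):
        return q * step               # approximate reconstruction

    q = quantize(127.3, 16)
    print(q, dequantize(q, 16))       # 8 128 (the value 127.3 is lost)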

SCSU. A compression algorithm designed specifically for compressing text files in Unicode (Section 8.12).

SemiAdaptive Compression. A compression method that uses a two-pass algorithm, where the first pass reads the input stream to collect statistics on the data to be compressed, and the second pass performs the actual compression. The statistics (model) are included in the compressed stream. (See also Adaptive Compression, Locally Adaptive Compression.)

Semistructured Text. Such text is defined as data that is human readable and also suitable for machine processing. A common example is HTML. The sequitur method of Section 8.10 performs especially well on such text.

Shannon-Fano Coding. An early algorithm for finding a minimum-length variable-size code given the probabilities of all the symbols in the data (Section 2.7). This method was later superseded by the Huffman method. (See also Statistical Methods, Huffman Coding.)

Shorten. A simple compression algorithm for waveform data in general and for speech in particular (Section 7.9). Shorten employs linear prediction to compute residues (of audio samples) which it encodes by means of Rice codes. (See also Rice Codes.)

Simple Image. A simple image is one that uses a small fraction of the possible grayscale values or colors available to it. A common example is a bi-level image where each pixel is represented by eight bits. Such an image uses just two colors out of a palette of 256 possible colors. Another example is a grayscale image scanned from a bi-level image. Most pixels will be black or white, but some pixels may have other shades of gray. A cartoon is also an example of a simple image (especially a cheap cartoon, where just a few colors are used). A typical cartoon consists of uniform areas, so it may use a small number of colors out of a potentially large palette. The EIDAC method of Section 4.13 is especially designed for simple images.

Sliding Window Compression. The LZ77 method (Section 3.3) uses part of the previously seen input stream as the dictionary. The encoder maintains a window to the input stream, and shifts the input in that window from right to left as strings of symbols are being encoded. The method is therefore based on a sliding window. (See also LZ Methods.)

Space-Filling Curves. A space-filling curve (Section 4.32) is a function P(t) that goes through every point in a given two-dimensional region, normally the unit square, as t varies from 0 to 1. Such curves are defined recursively and are used in image compression.

Sparse Strings. Regardless of what the input data represents (text, binary, images, or anything else), we can think of the input stream as a string of bits. If most of the bits are zeros, the string is sparse. Sparse strings can be compressed very efficiently by specially designed methods (Section 8.5).

SPIHT. A progressive image encoding method that efficiently encodes the image after it has been transformed by any wavelet filter. SPIHT is embedded, progressive, and has a natural lossy option. It is also simple to implement, fast, and produces excellent results for all types of images. (See also EZW, Progressive Image Compression, Embedded Coding, Discrete Wavelet Transform.)

Statistical Methods. These methods (Chapter 2) work by assigning variable-size codes to symbols in the data, with the shorter codes assigned to symbols or groups of symbols that appear more often in the data (have a higher probability of occurrence). (See also Variable-Size Codes, Prefix Property, Shannon-Fano Coding, Huffman Coding, and Arithmetic Coding.)

Statistical Model. (See Model of Compression.)

String Compression. In general, compression methods based on strings of symbols can be more efficient than methods that compress individual symbols (Section 3.1).

Subsampling. Subsampling is, possibly, the simplest way to compress an image. One approach to subsampling is simply to ignore some of the pixels. The encoder may, for example, ignore every other row and every other column of the image, and write the remaining pixels (which constitute 25% of the image) on the compressed stream. The decoder inputs the compressed data and uses each pixel to generate four identical pixels of the reconstructed image. This, of course, involves the loss of much image detail and is rarely acceptable. (See also Lossy Compression.)

Symbol. The smallest unit of the data to be compressed. A symbol is often a byte but may also be a bit, a trit {0, 1, 2}, or anything else. (See also Alphabet.)

Symbol Ranking. A context-based method (Section 8.2) where the context C of the current symbol S (the N symbols preceding S) is used to prepare a list of symbols that are likely to follow C. The list is arranged from most likely to least likely. The position of S in this list (position numbering starts from 0) is then written by the encoder, after being suitably encoded, on the output stream.

Taps. Wavelet filter coefficients. (See also Continuous Wavelet Transform, Discrete Wavelet Transform.)

TAR. The standard UNIX archiver. The name TAR stands for Tape ARchive. It groups a number of files into one file without compression. After being compressed by the UNIX compress program, a TAR file gets the extension tar.Z.

Textual Image Compression. A compression method for hard copy documents containing printed or typed (but not handwritten) text. The text can be in many fonts and may consist of musical notes, hieroglyphs, or any symbols. Pattern recognition techniques are used to recognize text characters that are identical or at least similar. One copy of each group of identical characters is kept in a library. Any leftover material is considered residue. The method uses different compression techniques for the symbols and the residue. It includes a lossy option where the residue is ignored.

Time/frequency (T/F) codec. An audio codec that employs a psychoacoustic model to determine how the normal threshold of the ear varies (in both time and frequency) in the presence of masking sounds.

Token. A unit of data written on the compressed stream by some compression algorithms. A token consists of several fields that may have either fixed or variable sizes.

Transform. An image can be compressed by transforming its pixels (which are correlated) to a representation where they are decorrelated. Compression is achieved if the new values are smaller, on average, than the original ones. Lossy compression can be achieved by quantizing the transformed values. The decoder inputs the transformed values from the compressed stream and reconstructs the (precise or approximate) original data by applying the opposite transform. (See also Discrete Cosine Transform, Fourier Transform, Continuous Wavelet Transform, Discrete Wavelet Transform.)

Triangle Mesh. Polygonal surfaces are very popular in computer graphics. Such a surface consists of flat polygons, mostly triangles, so there is a need for special methods to compress a triangle mesh. One such method is edgebreaker (Section 8.11).

Trit. A ternary (base 3) digit. It can be 0, 1, or 2.

Tunstall codes. Tunstall codes are a variation on variable-size codes. They are fixed-size codes, each encoding a variable-size string of data symbols.

Unary Code. A way to generate variable-size codes of the integers in one step. The unary code of the positive integer n is defined (Section 2.3.1) as n − 1 1’s followed by a single 0 (Table 2.3). There is also a general unary code. (See also Golomb Code.)
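
A one-line Python sketch of this definition:

    def unary(n):
        """Unary code of n (n >= 1): n - 1 ones followed by a single zero."""
        return "1" * (n - 1) + "0"

    print([unary(n) for n in range(1, 5)])  # ['0', '10', '110', '1110']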

Unicode. A new international standard code, the Unicode, has been proposed, and is being developed by the international Unicode organization (www.unicode.org). Unicode uses 16-bit codes for its characters, so it provides for 2^16 = 64K = 65,536 codes. (Notice that doubling the size of a code much more than doubles the number of possible codes. In fact, it squares the number of codes.) Unicode includes all the ASCII codes in addition to codes for characters in foreign languages (including complete sets of Korean, Japanese, and Chinese characters) and many mathematical and other symbols. Currently, about 39,000 out of the 65,536 possible codes have been assigned, so there is room for adding more symbols in the future.

The Microsoft Windows NT operating system has adopted Unicode, as have also AT&T Plan 9 and Lucent Inferno. (See also ASCII, Codes.)

UNIX diff. A file differencing algorithm that uses APPEND, DELETE and CHANGE to encode the differences between two text files. diff generates an output that is human-readable or, optionally, it can generate batch commands for a text editor like ed. (See also BSdiff, Exediff, File differencing, VCDIFF, and Zdelta.)

V.42bis Protocol. This is a standard, published by the ITU-T (page 104) for use in fast modems. It is based on the older V.32bis protocol and is supposed to be used for fast transmission rates, up to 57.6K baud. The standard contains specifications for data compression and error correction, but only the former is discussed, in Section 3.21.

V.42bis specifies two modes: a transparent mode, where no compression is used, and a compressed mode using an LZW variant. The former is used for data streams that don’t compress well, and may even cause expansion. A good example is an already compressed file. Such a file looks like random data, it does not have any repetitive patterns, and trying to compress it with LZW will fill up the dictionary with short, two-symbol, phrases.

Variable-Size Codes. These are used by statistical methods. Such codes should satisfy the prefix property (Section 2.2) and should be assigned to symbols based on their probabilities. (See also Prefix Property, Statistical Methods.)

VCDIFF. A method for compressing the differences between two files. (See also BSdiff, Exediff, File differencing, UNIX diff, and Zdelta.)

Vector Quantization. This is a generalization of the scalar quantization method. It is used for both image and audio compression. In practice, vector quantization is commonly used to compress data that has been digitized from an analog source, such as sampled sound and scanned images (drawings or photographs). Such data is called digitally sampled analog data (DSAD). (See also Scalar Quantization.)
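
A minimal sketch of the quantization step (illustrative names; squared Euclidean distance serves as the distortion measure):

def quantize(vector, codebook):
    # replace a vector by the index of its nearest codebook entry
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: dist(vector, codebook[i]))

codebook = [(0, 0), (10, 10), (20, 0)]   # a tiny illustrative codebook
blocks = [(1, 2), (9, 11), (19, 1)]
print([quantize(b, codebook) for b in blocks])   # [0, 1, 2]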

Video Compression. Video compression is based on two principles. The first is the spatial redundancy that exists in each video frame. The second is the fact that very often, a video frame is very similar to its immediate neighbors. This is called temporal redundancy. A typical technique for video compression should therefore start by encoding the first frame using an image compression method. It should then encode each successive frame by identifying the differences between the frame and its predecessor, and encoding these differences.
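
A bare-bones sketch of the second principle (frames as flat pixel lists; real codecs use motion compensation rather than plain differencing):

def encode_frames(frames):
    # intra-code the first frame (here it is simply kept as is), then
    # send each later frame as differences from its predecessor
    first = frames[0]
    diffs = [[b - a for a, b in zip(prev, cur)]
             for prev, cur in zip(frames, frames[1:])]
    return first, diffs

def decode_frames(first, diffs):
    frames = [first]
    for d in diffs:
        frames.append([a + x for a, x in zip(frames[-1], d)])
    return frames

frames = [[10, 10, 10], [10, 11, 10], [10, 12, 11]]
first, diffs = encode_frames(frames)
assert decode_frames(first, diffs) == frames
print(diffs)   # mostly zeros and small values, which compress well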

Voronoi Diagrams. Imagine a petri dish ready for growing bacteria. Four bacteria of different types are simultaneously placed in it at different points and immediately start multiplying. We assume that their colonies grow at the same rate. Initially, each colony consists of a growing circle around one of the starting points. After a while some of them meet and stop growing in the meeting area due to lack of food. The final result is that the entire dish gets divided into four areas, one around each of the four starting points, such that all the points within area i are closer to starting point i than to any other start point. Such areas are called Voronoi regions or Dirichlet Tessellations.
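
The defining property is easy to state in code (a sketch; names are illustrative):

def voronoi_region(point, seeds):
    # index i of the nearest seed: point lies in Voronoi region i
    def d2(a, b):
        return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    return min(range(len(seeds)), key=lambda i: d2(point, seeds[i]))

seeds = [(1, 1), (8, 1), (1, 8), (8, 8)]   # the four starting points
print(voronoi_region((2, 3), seeds))        # 0, i.e., closest to (1, 1)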

WavPack. WavPack is an open, multiplatform audio compression algorithm and software that supports three compression modes, lossless, high-quality lossy, and a unique hybrid mode. WavPack handles integer audio samples up to 32 bits wide and also 32-bit IEEE floating-point audio samples. It employs an original entropy encoder that assigns variable-size Golomb codes to the residuals and also has a recursive Golomb coding mode for cases where the distribution of the residuals is not geometric.

WFA. This method uses the fact that most images of interest have a certain amount of self-similarity (i.e., parts of the image are similar, up to size or brightness, to the entire image or to other parts). It partitions the image into subsquares using a quadtree, and uses a recursive inference algorithm to express relations between parts of the image in a graph. The graph is similar to graphs used to describe finite-state automata. The method is lossy, since parts of a real image may be very similar to other parts. WFA is a very efficient method for compression of grayscale and color images. (See also GFA, Quadtrees, Resolution-Independent Compression.)

WSQ. An efficient lossy compression method specifically developed for compressing fingerprint images. The method involves a wavelet transform of the image, followed by scalar quantization of the wavelet coefficients, and by RLE and Huffman coding of the results. (See also Discrete Wavelet Transform.)

XMill. Section 3.26 is a short description of XMill, a special-purpose compressor for XML files.

Zdelta. A file differencing algorithm developed by Dimitre Trendafilov, Nasir Memon and Torsten Suel. Zdelta adapts the compression library zlib to the problem of differential file compression. zdelta represents the target file by combining copies from both the reference and the already compressed target file. A Huffman encoder is used to further compress this representation. (See also BSdiff, Exediff, File differencing, UNIX diff, and VCDIFF.)

Zero-Probability Problem. When samples of data are read and analyzed in order to generate a statistical model of the data, certain contexts may not appear, leaving entries with zero counts and thus zero probability in the frequency table. Any compression method requires that such entries be somehow assigned nonzero probabilities.
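
One common fix is Laplace (add-one) smoothing, sketched here in Python (an illustration, not the only remedy):

def smoothed_probabilities(counts, alphabet):
    # every symbol, seen or not, receives a nonzero probability
    total = sum(counts.get(s, 0) for s in alphabet) + len(alphabet)
    return {s: (counts.get(s, 0) + 1) / total for s in alphabet}

counts = {"a": 5, "b": 3}                 # "c" never appeared
print(smoothed_probabilities(counts, "abc"))
# {'a': 6/11, 'b': 4/11, 'c': 1/11} -- all nonzero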


Zip. Popular software that implements the Deflate algorithm (Section 3.23) that uses a variant of LZ77 combined with static Huffman coding. It uses a 32-Kb-long sliding dictionary and a look-ahead buffer of 258 bytes. When a string is not found in the dictionary, its first symbol is emitted as a literal byte. (See also Deflate, Gzip.)

The expression of a man’s face is commonly a help to his thoughts, or glossary on his speech.

—Charles Dickens, Life and Adventures of Nicholas Nickleby (1839)

Joining the Data Compression Community

Those interested in a personal touch can join the “DC community” and communicate with researchers and developers in this growing area in person by attending the Data Compression Conference (DCC). It has taken place, mostly in late March, every year since 1991, in Snowbird, Utah, USA, and it lasts three days. Detailed information about the conference, including the organizers and the geographical location, can be found at http://www.cs.brandeis.edu/~dcc/.

In addition to invited presentations and technical sessions, there is a poster session and “Midday Talks” on issues of current interest.

The poster session is the central event of the DCC. Each presenter places a description of recent work (including text, diagrams, photographs, and charts) on a 4-foot-wide by 3-foot-high poster. They then discuss the work with anyone interested, in a relaxed atmosphere, with refreshments served. The Capocelli prize is awarded annually for the best student-authored DCC paper. This is in memory of Renato M. Capocelli.

The program committee reads like a who’s who of data compression, but the two central figures are James Andrew Storer and Martin Cohn, both of Brandeis University, who chair the conference and the conference program, respectively. (In 2006, the program committee chair is Michael W. Marcellin.)

The conference proceedings have traditionally been edited by Storer and Cohn. They are published by the IEEE Computer Society (http://www.computer.org/) and are distributed prior to the conference, an attractive feature. A complete bibliography (in BibTeX format) of papers published in past DCCs can be found at http://liinwww.ira.uka.de/bibliography/Misc/dcc.html.

We were born to unite with our fellow men, and to join in community with the human race.

—Cicero

Index

The index caters to those who have already read the book and want to locate a familiar item, as well as to those new to the book who are looking for a particular topic. I have included any terms that may occur to a reader interested in any of the topics discussed in the book (even topics that are just mentioned in passing). As a result, even a quick glance over the index gives the reader an idea of the terms and topics included in the book. Notice that the index items “data compression” and “image compression” have only general subitems such as “logical,” “lossless,” and “bi-level.” No specific compression methods are listed as subitems.

I have attempted to make the index items as complete as possible, including middle names and dates. Any errors and omissions brought to my attention are welcome. They will be added to the errata list and will be included in any future editions.

2-pass compression, 8, 89, 220, 350, 426, 876, 885, 1060

3R, see recursive range reduction

7 (as a lucky number), 823

7z, viii, ix, 241–246, 1056

7-Zip, viii, 241–246, 1041, 1056

8 1/2 (movie), 146

90◦ rotation, 516

A-law companding, 737–742, 752

AAC, see advanced audio coding

AAC-LD, see advanced audio coding

Abish, Walter (1931–), 141

Abousleman, Glen P., 949

AbS (speech compression), 756

AC coefficient (of a transform), 288, 298, 301

AC-3 compression, see Dolby AC-3

ACB, xvii, 139, 851, 862–868, 1041

ad hoc text compression, 19–22

Adair, Gilbert (1944–), 142

Adams, Douglas (1952–2001), 229, 325

adaptive arithmetic coding, 125–127, 271, 628, 908

adaptive compression, 8, 13

adaptive differential pulse code modulation (ADPCM), 448, 742–744

adaptive frequency encoding, 95, 1056

adaptive Golomb coding (Goladap), 69–70

adaptive Golomb image compression, 436–437

adaptive Huffman coding, xii, 8, 38, 89–95, 100, 223, 229, 459, 1041, 1054, 1056

and video compression, 669

word-based, 886–887

adaptive wavelet packet (wavelet image decomposition), 596

ADC (analog-to-digital converter), 724


Addison, Joseph (adviser, 1672–1719), 1

Adler, Mark (1959–), 230

Adobe acrobat, see portable document format

Adobe Acrobat Capture (software), 890

Adobe Inc. (and LZW patent), 257

ADPCM, see adaptive differential pulse code modulation

ADPCM audio compression, 742–744, 752

IMA, 448, 743–744

advanced audio coding (AAC), ix, 821–847, 1041

and Apple Computer, ix, 827

low delay, 843–844

advanced encryption standard

128-bit keys, 226

256-bit keys, 242

advanced television systems committee, 662

advanced video coding (AVC), see H.264

AES, see audio engineering society

affine transformations, 514–517

attractor, 518, 519

age (of color), 454

Aldus Corp. (and LZW patent), 257

algorithmic encoder, 10

algorithmic information content, 52

Alice (in wonderland), xx

alphabet (definition of), 1041

alphabetic redundancy, 2

Alphabetical Africa (book), 141

ALS, see audio lossless coding

Amer, Paul D., xx

America Online (and LZW patent), 257

Amis, Martin (1949–), 103

analog data, 41, 1060

analog video, 653–660

analog-to-digital converter, see ADC

Anderson, Karen, 442

anomalies (in an image), 542

ANSI, 102

apple audio codec (misnomer), see advanced audio coding (AAC)

ARC, 229, 1042

archive (archival software), 229, 1042

arithmetic coding, 47, 112–125, 369, 1042, 1052, 1057, 1061

adaptive, 125–127, 271, 628, 908

and move-to-front, 38

and sound, 732

and video compression, 669

context-based (CABAC), 709

in JPEG, 129–137, 338, 350, 1059

in JPEG 2000, 641, 642

MQ coder, 641, 642, 647

principles of, 114

QM coder, 129–137, 161, 338, 350, 1059

ARJ, 229, 1042

array (data structure), 1047

ASCII, 35, 179, 254, 1042, 1044, 1063

ASCII85 (binary to text encoding), 929

Ashland, Matt, ix, 783

Asimov, Isaac (1920–1992), xx, 443

aspect ratio, 655–658, 661–664

definition of, 643, 656

of HDTV, 643, 661–664

of television, 643, 656

associative coding Buyanovsky, see ACB

asymmetric compression, 9, 172, 177, 391, 451, 704, 875, 916

FLAC, 764

in audio, 719

ATC (speech compression), 753

atypical image, 404

audio compression, xvi, 8, 719–850

μ-law, 737–742

A-law, 737–742

ADPCM, 448, 742–744

and dictionaries, 719

asymmetric, 719

companding, 732–734

Dolby AC-3, 847–850, 1041

DPCM, 446

FLAC, viii, x, 762–772, 787, 1050

frequency masking, 730–731, 798

lossy vs. lossless debate, 784–785

LZ, 173, 1055

MLP, xiii, 11, 744–750, 1056

monkey’s audio, ix, 783, 1057

MPEG-1, 10, 795–820, 822

MPEG-2, ix, 821–847

silence, 732

temporal masking, 730–731, 798

WavPack, viii, 772–782, 1064

audio engineering society (AES), 796

audio lossless coding (ALS), ix, 784–795, 842, 1042

audio, digital, 724–727


Austen, Jane (1775–1817), 100

author’s email address, x, xiii, xviii, 15

automaton, see finite-state machines

background pixel (white), 369, 375, 1042

Baeyer, Hans Christian von (1938–), 22, 51, 956

Baker, Brenda, 937

balanced binary tree, 86, 87, 125

bandpass function, 536

Bark (unit of critical band rate), 731, 1042

Barkhausen, Heinrich Georg (1881–1956), 731, 1042

and critical bands, 731

Barnsley, Michael Fielding (1946–), 513

barycentric functions, 432, 605

barycentric weights, 447

basis matrix, 432

Baudot code, 20, 21, 281

Baudot, Jean Maurice Emile (1845–1903), 20, 281

Bayes, Thomas (1702–1761), 1045

Bayesian statistics, 137, 1045

Beebe, Nelson F. H., xx

bel (old logarithmic unit), 721

bell curve, see Gaussian distribution

Bell, Quentin (1910–1996), x

Bellard, Fabrice, 253, 1049

Berbinzana, Manuel Lorenzo de Lizarazu y, 142

Bernstein, Dan, 213

BGMC, see block Gilbert-Moore codes

bi-level image, 77, 264, 270, 281, 366, 369, 547, 1042, 1050–1052

bi-level image compression (extended to grayscale), 273, 369, 449

biased Elias Gamma code, 761

Bible, index of (concordance), 874

bicubic interpolation, 434

bicubic polynomial, 434

bicubic surface, 434

algebraic representation of, 434

geometric representation of, 434

big endian (byte order), 768

binary search, 49, 126, 127

tree, 179, 180, 182, 413, 893, 1056

binary search tree, 244–245

binary tree, 86

balanced, 86, 125

complete, 86, 125

binary tree predictive coding (BTPC), xvii, xx, 454–459

BinHex, 1042

BinHex4, 34–36, 954

bintrees, 457, 464–471, 509, 1042, 1044

progressive, 464–471

biorthogonal filters, 569

bisection, xvii, 483–485, 1059

bit budget (definition of), 11

bitmap, 28

bitplane, 1042

and Gray code, 275

and pixel correlation, 279

and redundancy, 274

definition of, 264

separation of, 273, 369, 449

bitrate (definition of), 11, 1043

bits/char (bpc), 10, 1043

bits/symbol, 1043

bitstream (definition of), 8

Blelloch, Guy, xx, 76, 958

blending functions, 432

block coding, 1043

block decomposition, xvii, 273, 450–454, 630, 1043

block differencing (in video compression), 666

block Gilbert-Moore codes (BGMC), ix, 784, 794

block matching (image compression), xvi, 56, 403–406, 1043

block mode, 10, 853

block sorting, 139, 853

block truncation coding, xvi, 406–411, 1043

blocking artifacts in JPEG, 338, 639

Bloom, Charles R., 1056

LZP, 214–221

PPMZ, 157

BMP file compression, xii, 36–37, 1043

BNF (Backus Naur Form), 186, 906

BOCU-1 (Unicode compression), xiii, 852, 927, 1043

Bohr, Niels David (1885–1962), 424

Boltzman, Ludwig Eduard (1844–1906), 956

Boutell, Thomas (PNG developer), 246

bpb (bit per bit), 10

bpc (bits per character), 10, 1043


bpp (bits per pixel), 11

Braille, 17–18

Braille, Louis (1809–1852), 17

Brandenburg, Karlheinz (mp3 developer, 1954–), 846

break even point in LZ, 182

Brislawn, Christopher M., xx, 129, 634

Broukhis, Leonid A., 862

browsers (Web), see Web browsers

Bryant, David (WavPack), viii, ix, 772

BSDiff, 939–941, 1043

bspatch, 939–941, 1043

BTC, see block truncation coding

BTPC (binary tree predictive coding), 454–459

Burmann, Gottlob (1737–1805), 142

Burroughs, Edgar Rice (1875–1950), 61

Burrows-Wheeler method, xvii, 10, 139, 241, 851, 853–857, 868, 1043

and ACB, 863

Buyanovsky, George (Georgii) Mechislavovich, 862, 868, 1041

BWT, see Burrows-Wheeler method

CABAC (context-based arithmetic coding), 709

cabinet (Microsoft media format), 187

Calgary Corpus, 11, 157, 159, 169, 333, 908

CALIC, 439–442, 1044

and EIDAC, 389

canonical Huffman codes, 84–88, 235

Canterbury Corpus, 12, 333

Capocelli prize, 1067

Capocelli, Renato Maria (1940–1992), 1067

Capon model for bi-level images, 436

Carroll, Lewis (1832–1898), xx, 198, 345

Cartesian product, 434

cartoon-like image, 264, 513

cascaded compression, 9

case flattening, 19

Castaneda, Carlos (Cesar Arana, 1925–1998), 185

causal wavelet filters, 571

CAVLC (context-adaptive variable-size code), 350, 709, 710, 716

CCITT, 104, 255, 337, 1044, 1052, 1053

CDC display code, 19

cell encoding (image compression), xvii, 529–530, 1044

CELP (speech compression), 703, 756

Chaitin, Gregory J. (1947–), 53

Chekhov, Anton (1860–1904), 322

chirp (test signal), 547

Chomsky, Noam (1928–), 906

Christie Mallowan, Dame Agatha Mary Clarissa (Miller, 1890–1976), 95

chromaticity diagram, 341

chrominance, 559

Cicero, Marcus Tullius (106–43 b.c.), 1067

CIE, 341, 1044

color diagram, 341

circular queue, 178–179, 219, 253, 1044

Clancy, Thomas Leo (1947–), 638

Clarke, Arthur Charles (1917–), 843

Clausius, Rudolph (1822–1888), 51

Cleary, John G., 139

cloning (in DMC), 898

Coalson, Josh, viii, x, 762, 772

code overflow (in adaptive Huffman), 93

codec, 7, 1044

codes

ASCII, 1044

Baudot, 20, 281

biased Elias Gamma, 761

CDC, 19

definition of, 1044

EBCDIC, 1044

Elias Gamma, 761

error-correcting, 1049

Golomb, xii, 59, 63–70, 416, 418, 710, 760, 874

phased-in binary, 90

pod, vii, 761

prefix, 34, 55–60, 184, 223, 270, 271, 409, 891

and Fibonacci numbers, 60, 957

and video compression, 669

Rice, 44, 59, 66, 161, 418, 760, 763, 767, 777, 947, 1042, 1050, 1060, 1061

start-step-stop, 56

subexponential, 59, 418–421, 760

Tunstall, 61–62, 1063

unary, 55–60, 193, 220

Unicode, 1044

variable-size, 54–60, 78, 89, 94, 96, 100, 101, 104, 112, 171, 173, 223, 270, 271, 409, 959, 1060


unambiguous, 71, 1054

coefficients of filters, 574–576

Cohn, Martin, xiii, 1019, 1067

and LZ78 patent, 258

collating sequence, 179

color

age of, 454

cool, 342

warm, 342

color images (and grayscale compression), 30, 279, 422, 439, 443, 449, 525, 526, 549, 559

color space, 341

Commission Internationale de l’Eclairage, see CIE

compact disc (and audio compression), 821

compact support (definition of), 585

compact support (of a function), 550

compaction, 18

companding, 7, 733

ALS, 788, 789

audio compression, 732–734

complete binary tree, 86, 125

complex methods (diminishing returns), 5, 194, 225, 366

complexity (Kolmogorov-Chaitin), 52

composite values for progressive images, 464–468, 1044

compression factor, 11, 560, 1045

compression gain, 11, 1045

compression performance measures, 10–11

compression ratio, 10, 1045

in UNIX, 224

known in advance, 19, 41, 266, 283, 284, 391, 408, 628, 645, 665, 734, 737, 744, 796

compressor (definition of), 7

Compuserve Information Services, 225, 257, 1050

computer arithmetic, 338

concordance, 874

conditional exchange (in QM-coder), 134–137

conditional image RLE, xvi, 32–34, 1045

conditional probability, 1045

cones (in the retina), 342

context

definition of, 1045

from other bitplanes, 389

inter, xvi, 389

intra, xvi, 389

symmetric, 414, 423, 439, 441, 647

two-part, 389

context modeling, 140

adaptive, 141

order-N, 142

static, 140

context-adaptive variable-size codes, see CAVLC

context-based arithmetic coding, see CABAC

context-based image compression, 412–414, 439–442

context-free grammars, xvii, 852, 906, 1046

context-tree weighting, xvi, 161–169, 182, 1046

for images, xvi, 273, 449, 945

contextual redundancy, 2

continuous wavelet transform (CWT), xv, 343, 543–549, 1046

continuous-tone image, 264, 333, 454, 547, 596, 1046, 1050, 1053

convolution, 567, 1046

and filter banks, 567

cool colors, 342

Cormack, Gordon V., 895

correlation, 1046

correlations, 270

between phrases, 259

between pixels, 77, 284, 413

between planes in hyperspectral data, 944

between quantities, 269

between states, 898–900

between words, 887

Costello, Adam M., 248

covariance, 270, 296

and decorrelation, 297

CRC, 254–256, 1046

in CELP, 757

in MPEG audio, 803, 809, 818

in PNG, 247

crew (image compression), 626

in JPEG 2000, 642

Crichton, Michael (1942–), 258

Crick, Francis Harry Compton (1916–2004), 149

cross correlation of points, 285


CRT, xvi, 654–658, 1046

color, 657

CS-CELP (speech compression), 757

CTW, see context-tree weighting

Culik, Karel II, xvii, xx, 497, 498, 504

cumulative frequencies, 120, 121, 125

curves

Hilbert, 485, 487–490

Peano, 491, 496–497

Sierpinski, 485, 490

space-filling, xix, 489–497

CWT, see continuous wavelet transform

cycles, 11

cyclical redundancy check, see CRC

DAC (digital-to-analog converter), 724

data compression

a special case of file differencing, 10

adaptive, 8

anagram of, 214

and irrelevancy, 265

and redundancy, 2, 265, 897

as the opposite of reliability, 17

asymmetric, 9, 172, 177, 391, 451, 704, 719, 875, 916

audio, xvi, 719–850

block mode, 10, 853

block sorting, 853

compaction, 18

conference, xix, 1019, 1067

definition of, 2

dictionary-based methods, 7, 139, 171–261, 267

diminishing returns, 5, 194, 225, 366, 622

general law, 5, 259

geometric, 852

history in Japan, 14, 229

hyperspectral, 941–952

images, 263–530

intuitive methods, xvi, 17–22, 283–284

irreversible, 18

joining the community, 1019, 1067

logical, 10, 171

lossless, 8, 19, 1055

lossy, 8, 1055

model, 12, 364, 1057

nonadaptive, 8

optimal, 10, 184

packing, 19

patents, xvi, 241, 256–258, 1058

performance measures, 10–11

physical, 10

pointers, 14

progressive, 369

reasons for, 2

semiadaptive, 8, 89, 1060

small numbers, 38, 346, 350, 365, 444, 449, 454, 455, 856, 1015

statistical methods, 7, 47–169, 172, 266–267

streaming mode, 10, 853

symmetric, 9, 172, 185, 339, 634

two-pass, 8, 89, 114, 220, 350, 426, 876, 885, 1060

universal, 10, 184

vector quantization, xvi, 283–284, 361

adaptive, xvi

video, xvi, 664–718

who is who, 1067

data structures, 13, 93, 94, 178, 179, 190, 203, 204, 1044, 1047

arrays, 1047

graphs, 1047

hashing, 1047

lists, 1047

queues, 178–179, 1047

stacks, 1047

trees, 1047

DC coefficient (of a transform), 288, 298, 301, 302, 330, 339, 345–347, 350

DCC, see data compression, conference

DCT, see discrete cosine transform

decibel (dB), 281, 721, 835, 1047

decimation (in filter banks), 567, 800

decoder, 1047

definition of, 7

deterministic, 10

decompressor (definition of), 7

decorrelated pixels, 269, 272, 284, 292, 297, 331

decorrelated values (and covariance), 270

definition of data compression, 2

Deflate, xii, 178, 230–241, 929, 1047, 1051, 1065

and RAR, viii, 227

deterministic decoder, 10

Deutsch, Peter, 236


DFT, see discrete Fourier transform

Dickens, Charles (1812–1870), 413, 1065

dictionary (in adaptive VQ), 398

dictionary-based methods, 7, 139, 171–261, 267, 1047

and sequitur, 910

and sound, 732

compared to audio compression, 719

dictionaryless, 221–224

unification with statistical methods, 259–261

DIET, 253

diff, 931, 1063

difference values for progressive images, 464–468, 1044

differencing, 27, 208, 350, 449, 1060

file, ix, xii, 852, 930–941, 1050

in BOCU, 927

in video compression, 665

differentiable functions, 588

differential encoding, 27, 444, 1048

differential image compression, 442–443, 1047

differential pulse code modulation, xvii, 27, 444–448, 1048

adaptive, 448

and sound, 444, 1048

digital audio, 724–727

digital camera, 342

digital image, see image

digital silence, 766

digital television (DTV), 664

digital video, 660–661, 1047

digital-to-analog converter, see DAC

digitally sampled analog data, 390, 1063

digram, 3, 26, 140, 141, 907, 1047

and redundancy, 2

encoding, 26, 907

frequencies, 101

discrete cosine transform, xvi, 292, 298–330, 338, 343–344, 557, 709, 714, 1047

and speech compression, 753

in H.261, 704

in MPEG, 679–687

modified, 800

mp3, 815

three dimensional, 298, 947–949

discrete Fourier transform, 343

in MPEG audio, 797

discrete sine transform, xvi, 330–333

discrete wavelet transform (DWT), xv, 343, 576–589, 1048

discrete-tone image, 264, 333, 334, 450, 454, 513, 547, 1043, 1048

Disraeli (Beaconsfield), Benjamin (1804–1881), 26, 1092

distortion measures

in vector quantization, 391–392

in video compression, 668–669

distributions

energy of, 288

flat, 732

Gaussian, 1050

geometric, 63, 357, 777

Laplace, 271, 422, 424–429, 438, 449, 726–727, 749, 760, 761, 767, 1054, 1056, 1058

normal, 468, 1057

Poisson, 149

skewed, 69

dithering, 375, 471

DjVu document compression, xvii, 630–633, 1048

DMC, see dynamic Markov coding

Dobie, J. Frank (1888–1964), 1040

Dolby AC-3, ix, 847–850, 1041

Dolby Digital, see Dolby AC-3

Dolby, Ray (1933–), 847

Dolby, Thomas (Thomas Morgan Robertson, 1958–), 850

downsampling, 338, 373

Doyle, Arthur Conan (1859–1930), 167

Doyle, Mark, xx

DPCM, see differential pulse code modulation

drawing

and JBIG, 369

and sparse strings, 874

DSAD, see digitally sampled analog data

DST, see discrete sine transform

Dumpty, Humpty, 888

Durbin, J., 772

DWT, see discrete wavelet transform

dynamic compression, see adaptive compression

dynamic dictionary, 171

GIF, 225


dynamic Markov coding, xiii, xvii, 852, 895–900

Dyson, George Bernard (1953–), 47, 895

ear (human), 727–732

Eastman, Willard L. (and LZ78 patent), 258

EBCDIC, 179, 1044

EBCOT (in JPEG 2000), 641, 642

Eddings, David (1931–), 40

edgebreaker, xvii, 852, 911–922, 1062

Edison, Thomas Alva (1847–1931), 653, 656

EIDAC, simple image compression, xvi, 389–390, 1061

eigenvalues of a matrix, 297

Einstein, Albert (1879–1955)

and E = mc^2, 22

Elias code (WavPack), 778

Elias Gamma code, 761

Elias, Peter (1923–2001), 55, 114

Eliot, Thomas Stearns (1888–1965), 952

Elton, John, 518

email address of author, x, xiii, xviii, 15

embedded coding in image compression, 614–615, 1048

embedded coding using zerotrees (EZW), xv, 626–630, 1049

encoder, 1049

algorithmic, 10

definition of, 7

energy

concentration of, 288, 292, 295, 297, 331

of a distribution, 288

of a function, 543, 544

English text, 2, 13, 172

frequencies of letters, 3

entropy, 54, 71, 73, 76, 112, 124, 370, 957

definition of, 51, 1049

of an image, 272, 455

physical, 53

entropy encoder, 573

definition of, 52

dictionary-based, 171

error metrics in image compression, 279–283

error-correcting codes, 1049

in RAR, 226

error-detecting codes, 254

ESARTUNILOC (letter probabilities), 3

escape code

in adaptive Huffman, 89

in Macwrite, 21

in pattern substitution, 27

in PPM, 145

in RLE, 23

in textual image, 892

in word-based compression, 885

ETAOINSHRDLU (letter probabilities), 3

Euler’s equation, 916

even functions, 330

exclusion

in ACB, 867

in PPM, 148–149

in symbol ranking, 859

EXE compression, 253

EXE compressors, 253–254, 936–939, 1049

exediff, 936–939, 1049

exepatch, 936–939, 1049

eye

and brightness sensitivity, 270

and perception of edges, 266

and spatial integration, 657, 677

frequency response of, 682

resolution of, 342

EZW, see embedded coding using zerotrees

FABD, see block decomposition

Fabre, Jean Henri (1823–1915), 333

facsimile compression, 104–112, 266, 270, 382, 890, 1049

1D, 104

2D, 108

group 3, 104

factor of compression, 11, 560, 1045

Fano, Robert Mario (1917–), 72

Fantastic Voyage (novel), xx

fast PPM, xii, 159–161

FBI fingerprint compression standard, xv, 633–639, 1064

FELICS, 415–417, 1049

progressive, 417–422

Feynman, Richard Phillips (1918–1988), 725

FHM compression, xvii, 852, 903–905, 1049

Fiala, Edward R., 5, 192, 258, 1055

Fibonacci numbers, 1049

and FHM compression, xvii, 903–905, 1016

and height of Huffman trees, 84

and number bases, 60


and prefix codes, 60, 957

and sparse strings, 879–880

file differencing, ix, xii, 10, 852, 930–941, 1043, 1050, 1063, 1064

BSDiff, 939–941

bspatch, 939–941

exediff, 936–939

exepatch, 936–939

VCDIFF, 932–934, 1063

zdelta, 934–936

filter banks, 566–576

biorthogonal, 569

decimation, 567, 800

deriving filter coefficients, 574–576

orthogonal, 568

filters

causal, 571

deriving coefficients, 574–576

finite impulse response, 567, 576

taps, 571

fingerprint compression, xv, 589, 633–639, 1064

finite automata, xiii, xvii, xix

finite automata methods, 497–513

finite impulse response filters (FIR), 567, 576

finite-state machines, xiii, xvii, xix, 78, 497, 852

and compression, 436, 895–896

Fisher, Yuval, 513

fixed-size codes, 5

FLAC, see free lossless audio compression

Flate (a variant of Deflate), 929

Forbes, Jonathan (LZX), 187

Ford, Glenn (Gwyllyn Samuel Newton, 1916–), 71

Ford, Paul, 887

foreground pixel (black), 369, 375, 1042

Fourier series, 536

Fourier transform, xix, 284, 300, 343, 532–541, 557, 570, 571, 1047, 1050, 1062

and image compression, 540–541

and the uncertainty principle, 538–540

frequency domain, 534–538

Fourier, Jean Baptiste Joseph (1768–1830), 531, 535, 1050

fractal image compression, 513–528, 1052

fractals (as nondifferentiable functions), 588

Frank, Amalie J., 366

free lossless audio compression (FLAC), viii, x, 727, 762–772, 787, 1050

Freed, Robert A., 229, 1042

French (letter frequencies), 3

frequencies

cumulative, 120, 121, 125

of digrams, 101

of pixels, 289, 370

of symbols, 13, 54, 55, 78, 89–91, 94, 96, 97, 114, 115, 369, 1057

in LZ77, 178

frequency domain, 534–538, 730

frequency masking, 730–731, 798

frequency of eof, 117

fricative sounds, 751

front compression, 20

full (wavelet image decomposition), 595

functions

bandpass, 536

barycentric, 432, 605

blending, 432

compact support, 550, 585

differentiable, 588

energy of, 543, 544

even, 330

frequency domain, 534–538

nondifferentiable, 588

nowhere differentiable, 588

odd, 330

parity, 330

plotting of (a wavelet application), 578–580

square integrable, 543

support of, 550

fundamental frequency (of a function), 534

Gadsby (book), 141

Gailly, Jean-Loup, 230

gain of compression, 11, 1045

gasket (Sierpinski), 519

Gaussian distribution, 1050, 1054, 1057

generalized finite automata, 510–513, 1050

generating polynomials, 1049

CRC, 256, 803

CRC-32, 255

geometric compression, xvii, 852

geometric distribution, 357, 777

in probability, 63


GFA, see generalized finite automata

GIF, 225–226, 1050

and DjVu, 631

and image compression, 267

and LZW patent, 256–258, 1058

and web browsers, 225

compared to FABD, 450

Gilbert, Jeffrey M., xx, 450, 454

Givens rotations, 316–325

Givens, J. Wallace (1910–1993), 325

Goladap (adaptive Golomb coding), 69–70

golden ratio, 60, 1009, 1050

Golomb code, xii, 59, 63–70, 416, 418, 760, 874, 1049, 1050, 1060

adaptive, 69–70

and JPEG-LS, 354, 355, 357–359, 1053

Goladap, 69–70

H.264, 710, 716

WavPack, 777

Golomb, Solomon Wolf (1932–), 70

Gouraud shading (for polygonal surfaces), 911

Graham, Heather, 732

grammars, context-free, xvii, 852, 906, 1046

graphical image, 264

graphical user interface, see GUI

graphs (data structure), 1047

Gray code, see reflected Gray code

Gray, Frank, 281

grayscale image, 264, 271, 281, 1050, 1053, 1056, 1058

grayscale image compression (extended to color images), 30, 279, 422, 439, 443, 449, 525, 526, 549, 559

Greene, Daniel H., 5, 192, 258, 1055

group 3 fax compression, 104, 1049

PDF, 929

group 4 fax compression, 104, 1049

growth geometry coding, 366–368, 1050

GUI, 265

Guidon, Yann, viii, ix, 43, 46

Gulliver’s Travels (book), xx

Gzip, 224, 230, 257, 1047, 1051

H.261 video compression, xvi, 703–704, 1051

DCT in, 704

H.264 video compression, viii, 350, 706–718, 1051

Haar transform, xv, xvi, 292, 294–295, 326, 549–566

Hafner, Ullrich, 509

Hagen, Hans, xx

halftoning, 372, 374, 375, 1051

and fax compression, 106

Hamming codes, 1051

Hardy, Thomas (1840–1928), 127

harmonics (in audio), 790–792

Haro, Fernando Jacinto de Zurita y, 142

hashing, xiii, xvii, 204, 206, 214, 405, 413, 453, 893, 1047

HDTV, 643, 847

and MPEG-3, 676, 822, 1057

aspect ratio of, 643, 661–664

resolution of, 661–664

standards used in, 661–664, 1051

heap (data structure), 86–87

hearing

properties of, 727–732

range of, 541

Heisenberg uncertainty principle, 539

Heisenberg, Werner (1901–1976), 539

Herd, Bernd (LZH), 176

Herrera, Alonso Alcala y (1599–1682), 142

hertz (Hz), 532, 720

hierarchical coding (in progressive compression), 361, 610–613

hierarchical image compression, 339, 350, 1051

high-definition television, see HDTV

Hilbert curve, 464, 485–490

and vector quantization, 487–489

traversing of, 491–496

Hilbert, David (1862–1943), 490

history of data compression in Japan, 14, 229

homeomorphism, 912

homogeneous coordinates, 516

Horspool, R. Nigel, 895

Hotelling transform, see Karhunen-Loeve transform

Householder, Alston Scott (1904–1993), 325

HTML (as semistructured text), 910, 1060

Huffman coding, xvii, 47, 55, 74–79, 104–106, 108, 112, 140, 173, 253, 266, 855, 969, 1047, 1051, 1061, 1065


adaptive, 8, 38, 89–95, 223, 229, 459, 1041, 1054

and video compression, 669

word-based, 886–887

and Deflate, 230

and FABD, 454

and move-to-front, 38

and MPEG, 682

and reliability, 101

and sound, 732

and sparse strings, 876, 878

and wavelets, 551, 559

canonical, 84–88, 235

code size, 79–81

for images, 78

in image compression, 443

in JPEG, 338, 345, 639

in LZP, 220

in WSQ, 634

not unique, 74

number of codes, 81–82

semiadaptive, 89

ternary, 82

2-symbol alphabet, 77

unique tree, 235

variance, 75

Huffman, David A. (1925–1999), 74

human hearing, 541, 727–732

human visual system, 279, 409, 410

human voice (range of), 727

Humpty Dumpty, see Dumpty, Humpty

hybrid speech codecs, 752, 756–757

hyperspectral data, 941, 1052

hyperspectral data compression, 852, 941–952

hyperthreading, 242

ICE, 229

IDCT, see inverse discrete cosine transform

IEC, 102, 676

IFS compression, 513–528, 1052

PIFS, 523

IGS, see improved grayscale quantization

IIID, see memoryless source

IMA, see interactive multimedia association

IMA ADPCM compression, 448, 743–744

image, 263

atypical, 404

bi-level, 77, 264, 281, 366, 369, 547, 1042, 1050–1052

bitplane, 1042

cartoon-like, 264, 513

continuous-tone, 264, 333, 454, 547, 596, 1046, 1053

definition of, 263

discrete-tone, 264, 333, 334, 450, 454, 513, 547, 1043, 1048

graphical, 264

grayscale, 264, 271, 281, 1050, 1053, 1056, 1058

interlaced, 30

reconstruction, 540

resolution of, 263

simple, 389–390, 1061

synthetic, 264

types of, 263–264

image compression, 7, 8, 23, 32, 263–530

bi-level, 369

bi-level (extended to grayscale), 273, 369, 449

dictionary-based methods, 267

differential, 1047

error metrics, 279–283

IFS, xvii

intuitive methods, xvi, 283–284

lossy, 265

principle of, 28, 268, 270–274, 403, 412, 461, 485

and RGC, 273

progressive, 273, 360–368, 1059

growth geometry coding, 366–368

median, 365

reasons for, 265

RLE, 266

self-similarity, 273, 1050, 1064

statistical methods, 266–267

subsampling, xvi, 283

vector quantization, xvi

adaptive, xvi

image frequencies, 289

image transforms, xvi, 284–333, 487, 554–559, 1062

image wavelet decompositions, 589–596

images (standard), 333–336

improper rotations, 316

improved grayscale quantization (IGS), 266

inequality (Kraft-MacMillan), 71–72, 1054


and Huffman codes, 76

information theory, 47–53, 174, 1052

integer wavelet transform (IWT), 608–610

Interactive Multimedia Association (IMA), 743

interlacing scan lines, 30, 663

International Committee on Illumination, see CIE

interpolating polynomials, xiii, xvii, 423, 429–435, 604–608, 612, 1052

degree-5, 605

Lagrange, 759

interval inversion (in QM-coder), 133–134

intuitive compression, 17

intuitive methods for image compression, xvi, 283–284

inverse discrete cosine transform, 298–330, 343–344, 984

in MPEG, 681–696

mismatch, 681, 696

modified, 800

inverse discrete sine transform, 330–333

inverse modified DCT, 800

inverse Walsh-Hadamard transform, 293–294

inverted file, 874

irrelevancy (and lossy compression), 265

irreversible text compression, 18

Ismael, G. Mohamed, xii, 174

ISO, 102, 337, 676, 1052, 1053, 1057

JBIG2, 378

recommendation CD 14495, 354, 1053

standard 15444, JPEG2000, 642

iterated function systems, 513–528, 1052

ITU, 102, 1044, 1049, 1052

ITU-R, 341

recommendation BT.601, 341, 660

ITU-T, 104, 369

and fax training documents, 104, 333, 404

and MPEG, 676, 1057

JBIG2, 378

recommendation H.261, 703–704, 1051

recommendation H.264, 706, 1051

recommendation T.4, 104, 108

recommendation T.6, 104, 108, 1049

recommendation T.82, 369

V.42bis, 228, 1063

IWT, see integer wavelet transform

JBIG, xvi, 369–378, 630, 1052, 1059

and EIDAC, 389

and FABD, 450

probability model, 370

JBIG2, xvi, 129, 378–388, 929, 1052

and DjVu, 631, 1048

JFIF, 351–354, 1053

joining the DC community, 1019, 1067

Joyce, James Aloysius Augustine (1882–1941), 15, 19

JPEG, xvi, 32, 129, 137, 266, 288, 292, 337–351, 354, 389, 449, 630, 682, 683, 948, 1043, 1052, 1053, 1059

and DjVu, 631

and progressive image compression, 361

and sparse strings, 874

and WSQ, 639

blocking artifacts, 338, 639

compared to H.264, 718

lossless mode, 350–351

similar to MPEG, 678

JPEG 2000, xiii, xv–xvii, 129, 341, 532, 639–652, 929

JPEG-LS, xvi, 351, 354–360, 642, 945, 1053

Jung, Robert K., 229, 1042

Karhunen-Loeve transform, xvi, 292, 295–297, 329, 557, see also Hotelling transform

Kari, Jarkko, 497

Katz, Philip W. (1962–2000), 229, 230, 238, 241, 1058

King, Stephen Edwin (writer, 1947–), 928

Klaasen, Donna, 865

KLT, see Karhunen-Loeve transform

Knowlton, Kenneth (1931–), 464

Knuth, Donald Ervin (1938–), vii, (Colophon)

Kolmogorov-Chaitin complexity, 52

Kraft-MacMillan inequality, 71–72, 1054

and Huffman codes, 76

Kronecker delta function, 326

Kronecker, Leopold (1823–1891), 326

KT boundary (and dinosaur extinction), 163

KT probability estimate, 163, 1054

L Systems, 906

Hilbert curve, 486


La Disparition (novel), 142

Lagrange interpolation, 759, 770

Lanchester, John (1962–), xiv

Lang, Andrew (1844–1912), 86

Langdon, Glen G., 259, 412

Laplace distribution, 271, 422, 424–429, 438, 449, 760, 761, 1054, 1056, 1058

in audio MLP, 749

in FLAC, 767

in image MLP, 422

of differences, 726–727

Laplacian pyramid, xv, 532, 590, 610–613, 1054

Lau, Daniel Leo, 496

lazy wavelet transform, 599

LBG algorithm for vector quantization, 392–398, 487, 952

Lempel, Abraham (1936–), 43, 173, 1055

LZ78 patent, 258

Lempereur, Yves, 34, 1042

Lena (image), 285, 333–334, 380, 563, 1003

blurred, 1005

Les Revenentes (novelette), 142

Levinson, Norman (1912–1975), 772

Levinson-Durbin algorithm, 767, 772, 787, (Colophon)

lexicographic order, 179, 854

LHA, 229, 1054

LHArc, 229, 1054

Liebchen, Tilman, 785

LIFO, 208

lifting scheme, 596–608, 1054

light, 539

visible, 342

Ligtenberg, Adriaan (1955–), 321

line (as a space-filling curve), 995

line (wavelet image decomposition), 590

linear prediction

4th order, 770

ADPCM, 742

ALS, 785–787

FLAC, 766

MLP audio, 749

shorten, 757

linear predictive coding (LPC), ix, 766, 771–772, 784, 1042

hyperspectral data, 945–947

linear systems, 1046

lipogram, 141

list (data structure), 1047

little endian (byte order), 242

in Wave format, 734

LLM DCT algorithm, 321–322

lockstep, 89, 130, 203

LOCO-I (predecessor of JPEG-LS), 354

Loeffler, Christoph, 321

logarithm

as the information function, 49

used in metrics, 11, 281

logarithmic tree (in wavelet decomposition), 569

logical compression, 10, 171

lossless compression, 8, 19, 1055

lossy compression, 8, 1055

LPC, see linear predictive coding

LPC (speech compression), 753–756

luminance component of color, 270, 272, 337, 338, 341–344, 347, 559, 677

use in PSNR, 281

LZ1, see LZ77

LZ2, see LZ78

LZ77, xvi, 173, 176–179, 195, 199, 403, 862, 864, 865, 910, 1043, 1047, 1051, 1055, 1056, 1061, 1065

and Deflate, 230

and LZRW4, 198

and repetition times, 184

deficiencies, 182

LZ78, 173, 182, 189–192, 199, 865, 1055, 1056

patented, 258

LZAP, 212, 1055

LZARI, viii, 181–182, 1055

LZC, 196, 224

LZEXE, 253, 1049

LZFG, 56, 192–194, 196, 1055

patented, 194, 258

LZH, 176

LZMA, viii, ix, 129, 241–246, 1056

LZMW, 209–210, 1056

LZP, 214–221, 261, 405, 859, 1056

LZRW1, 195–198

LZRW4, 198, 244

LZSS, viii, 179–182, 229, 1054–1056

used in RAR, viii, 227

LZW, 199–209, 794, 1015, 1055, 1056, 1063

decoding, 200–203


patented, 199, 256–258, 1058

UNIX, 224

word-based, 887

LZX, xii, 187–188, 1056

LZY, 213–214, 1056

m4a, see advanced audio coding

m4v, see advanced audio coding

Mackay, Alan Lindsay (1926–), 9

MacWrite (data compression in), 21

Mahler’s third symphony, 821

Mallat, Stephane (and multiresolution decomposition), 589

Malvar, Henrique (Rico), 706

Manber, Udi, 937

mandril (image), 333, 596

Manfred, Eigen (1927–), 114

Marcellin, Michael W. (1959–), 1067

Markov model, xvii, 32, 100, 139, 273, 370, 905, 988, 1016

Marx, Julius Henry (Groucho, 1890–1977), 718

masked Lempel-Ziv tool (a variant of LZW), 794, 795

Matisse, Henri (1869–1954), 930

Matlab software, properties of, 285, 563

matrices

eigenvalues, 297

QR decomposition, 316, 324–325

sequency of, 293

Matsumoto, Teddy, 253

Maugham, William Somerset (1874–1965), 527

MDCT, see discrete cosine transform, modified

mean absolute difference, 487, 668

mean absolute error, 668

mean square difference, 669

mean square error (MSE), 11, 279, 615

measures of compression efficiency, 10–11

measures of distortion, 391–392

median, definition of, 365

memoryless source, 42, 162, 164, 166, 167

definition of, 2

meridian lossless packing, see MLP (audio)

mesh compression, edgebreaker, xvii, 852, 911–922

metric, definition of, 524

Mexican hat wavelet, 543, 545, 547, 548

Meyer, Carl D., 325

Meyer, Yves (and multiresolution decomposition), 589

Microcom, Inc., 26, 95, 1056

midriser quantization, 740

midtread quantization, 739

in MPEG audio, 806

Millan, Emilio, xx

mirroring, see lockstep

MLP, 271, 414, 422–435, 438, 439, 727, 1054, 1056, 1058

MLP (audio), xiii, 11, 422, 744–750, 1056

MMR coding, 108, 378, 381, 382, 384

in JBIG2, 379

MNG, see multiple-image network format

MNP class 5, 26, 95–100, 1056

MNP class 7, 100–101, 1056

model

adaptive, 141, 364

context-based, 140

finite-state machine, 895

in MLP, 429

Markov, xvii, 32, 100, 139, 273, 370, 905, 988, 1016

of DMC, 896, 899

of JBIG, 370

of MLP, 424

of PPM, 127

of probability, 114, 139

order-N, 142

probability, 125, 165, 369, 370

static, 140

zero-probability problem, 140

modem, 6, 26, 50, 90, 95, 104, 228, 229, 1056, 1063

Moffat, Alistair, 412

Molnar, Laszlo, 254

Monk, Ian, 142

monkey’s audio, ix, 783, 1057

monochromatic image, see bi-level image

Montesquieu (Charles de Secondat, 1689–1755), 927

Morlet wavelet, 543, 548

Morse code, 17, 47

Morse, Samuel Finley Breese (1791–1872), 1, 47

Moschytz, George S. (LLM method), 321

Motil, John Michael (1938–), 81


motion compensation (in video compression), 666–676

motion vectors (in video compression), 667, 709

Motta, Giovanni (1965–), ix, xiii, 852, 930, 941, 951, 1055

move-to-front method, 8, 37–40, 853, 855, 1043, 1055, 1057

and wavelets, 551, 559

inverse of, 857

Mozart, Joannes Chrysostomus Wolfgangus Theophilus (1756–1791), x

mp3, 795–820

and Tom’s Diner, 846

compared to AAC, 826–827

mother of, see Vega, Susan

.mp3 audio files, xvi, 796, 820, 1011

and shorten, 757

mp4, see advanced audio coding

MPEG, xvi, 341, 656, 670, 1052, 1057

D picture, 688

DCT in, 679–687

IDCT in, 681–696

quantization in, 680–687

similar to JPEG, 678

MPEG-1 audio, xvi, 10, 292, 795–820, 822

MPEG-1 video, 676–698

MPEG-2 audio, ix, 821–847

MPEG-3, 822

MPEG-4, 698–703, 822

AAC, 841–844

audio codecs, 842

audio lossless coding (ALS), ix, 784–795, 842, 1042

extensions to AAC, 841–844

MPEG-7, 822

MQ coder, 129, 379, 641, 642, 647

MSE, see mean square error

MSZIP (deflate), 187

μ-law companding, 737–742, 752

multiple-image network format (MNG), 250

multiresolution decomposition, 589, 1057

multiresolution image, 500, 1057

multiresolution tree (in wavelet decomposition), 569

multiring chain coding, 903

Murray, James, 46

musical notation, 541

musical notations, 379

Musset, Alfred de (1810–1857), xxv

Muth, Robert, 937

N-trees, 471–476

nanometer (definition of), 341

negate and exchange rule, 516

Nelson, Mark, 14

Newton, Isaac (1643–1727), 124

nonadaptive compression, 8

nondifferentiable functions, 588

nonstandard (wavelet image decomposition), 592

normal distribution, 468, 1050, 1057

NSCT (never the same color twice), 662

NTSC (television standard), 655, 656, 661, 689, 847

numerical history (mists of), 435

Nyquist rate, 541, 585, 724

Nyquist, Harry (1889–1976), 541

Nyquist-Shannon sampling theorem, 724

Oberhumer, Markus Franz Xaver Johannes, 254

OCR (optical character recognition), 889

octasection, xvii, 485, 1059

octave, 570

in wavelet analysis, 570, 592

octree (in prefix compression), 883

octrees, 471

odd functions, 330

Ogg Squish, 762

Ohm’s law, 722

Okumura, Haruhiko (LZARI), viii, xx, 181, 229, 1054, 1055

optimal compression method, 10, 184

orthogonal

projection, 447

transform, 284, 554–559

orthogonal filters, 568

orthonormal matrix, 285, 288, 307, 309, 312, 566, 578

packing, 19

PAL (television standard), 655, 656, 659, 689, 847

Pandit, Vinayaka D., 508

parametric cubic polynomial (PC), 431

parcor (in MPEG-4 ALS), 787


parity, 254

of functions, 330

vertical, 255

parrots (image), 275

parsimony (principle of), 22

partial correlation coefficients, see parcor

Pascal, Blaise (1623–1662), 2, 63, 169

patents of algorithms, xvi, 241, 256–258, 1058

pattern substitution, 27

Pavlov, Igor (7z and LZMA creator), viii, ix, 241, 246, 1041, 1056

PDF, see portable document format

PDF (Adobe’s portable document format)

and DjVu, 631

peak signal to noise ratio (PSNR), 11, 279–283

Peano curve, 491

traversing, 496–497

used by mistake, 487

pel, see also pixels

aspect ratio, 656, 689

difference classification (PDC), 669

fax compression, 104, 263

peppers (image), 333

perceptive compression, 9

Percival, Colin (BSDiff creator), 939–941, 1043

Perec, Georges (1936–1982), 142

permutation, 853

petri dish, 1064

phased-in binary codes, 59, 90, 224

Phong shading (for polygonal surfaces), 911

phrase, 1058

in LZW, 199

physical compression, 10

physical entropy, 53

Picasso, Pablo Ruiz (1881–1973), 315

PIFS (IFS compression), 523

pixels, 28, 369, see also pel

background, 369, 375, 1042

correlated, 284

decorrelated, 269, 272, 284, 292, 297, 331

definition of, 263, 1058

foreground, 369, 375, 1042

highly correlated, 268

PKArc, 229, 1058

PKlite, 229, 253, 1058

PKunzip, 229, 1058

PKWare, 229, 1058

PKzip, 229, 1058

Planck constant, 539

plosive sounds, 751

plotting (of functions), 578–580

PNG, see portable network graphics

pod code, vii, 761

Poe, Edgar Allan (1809–1849), 8

points (cross correlation of), 285

Poisson distribution, 149

polygonal surfaces compression, edgebreaker, xvii, 852, 911–922

polynomial

bicubic, 434

definition of, 256, 430

parametric cubic, 431

parametric representation, 430

polynomials (interpolating), xiii, xvii, 423, 429–435, 604–608, 612, 1052

degree-5, 605

Porta, Giambattista della (1535–1615), 1

portable document format (PDF), ix, 852, 928–930, 1058

portable network graphics (PNG), xii, 246–251, 257, 1058

PostScript (and LZW patent), 257

Poutanen, Tomi (LZX), 187

Powell, Anthony Dymoke (1905–2000), 15, 551

PPM, xii, 43, 139–159, 438, 858, 867, 892, 900, 910, 1058

exclusion, 148–149

trie, 150

vine pointers, 152

word-based, 887–889

PPM (fast), xii, 159–161

PPM*, xii, 155–157

PPMA, 149

PPMB, 149, 439

PPMC, 146, 149

PPMD (in RAR), 227

PPMdH (by Dmitry Shkarin), 241

PPMP, xii, 149

PPMX, xii, 149

PPMZ, xii, 157–159

PPPM, 438–439, 1058

prediction, 1058

nth order, 446, 758, 759, 785


AAC, 840–841

ADPCM, 742–744

BTPC, 456, 458, 459

CALIC, 440

CELP, 757

definition of, 140

deterministic, 369, 377

FELICS, 988

image compression, 271

JPEG-LS, 339, 350, 354, 356

long-term, 792

LZP, 214

LZRW4, 198

MLP, 422–424

MLP audio, 748

monkey’s audio, 783

Paeth, 250

PNG, 246, 248

PPM, 145

PPM*, 155

PPMZ, 157

PPPM, 438

probability, 139

progressive, 790

video, 665, 670

preechoes

in AAC, 839

in MPEG audio, 815–820

prefix codes, 34, 55–60, 184, 223, 270, 271, 409

and video compression, 669

prefix compression

images, xvii, 477–478, 1058

sparse strings, xvii, 880–884

prefix property, 101, 416, 1058, 1063

definition of, 55

probability

concepts, xiii, xvii

conditional, 1045

geometric distribution, 63

model, 12, 114, 125, 139, 165, 364, 369, 370, 1057

adaptive, 141

Prodigy (and LZW patent), 257

progressive compression, 369, 372–378

progressive FELICS, 417–422, 1059

progressive image compression, 273, 360–368, 1059

growth geometry coding, 366–368

lossy option, 361, 422

median, 365

MLP, 271

SNR, 361, 651

progressive prediction, 790

properties of speech, 750–752

Proust, Marcel Valentin Louis George Eugene (1871–1922), 1054, (Colophon)

Prowse, David (Darth Vader, 1935–), 90

PSNR, see peak signal to noise ratio

psychoacoustic model (in MPEG audio), 795, 797–798, 803, 805, 809, 813–815, 820, 1011, 1059

psychoacoustics, 9, 727–732

pulse code modulation (PCM), 726

punting (definition of), 450

pyramid (Laplacian), xv, 532, 590, 610–613

pyramid (wavelet image decomposition), 554, 592

pyramid coding (in progressive compression), 361, 454, 457, 459, 590, 610–613

QIC, 103

QIC-122, 184–186, 1059

QM coder, xvi, 129–137, 161, 338, 350, 1059

QMF, see quadrature mirror filters

QR matrix decomposition, 316, 324–325

QTCQ, see quadtree classified trellis coded quantized

quadrant numbering (in a quadtree), 461, 498

quadrature mirror filters, 578

quadrisection, xvii, 478–485, 1059

quadtree classified trellis coded quantized wavelet image compression (QTCQ), 624–625

quadtrees, xiii, xvii, 270, 397, 461–478, 485, 498, 1042, 1050, 1058, 1059, 1064

and Hilbert curves, 486

and IFS, 525

and quadrisection, 478, 480

prefix compression, xvii, 477–478, 880, 1058

quadrant numbering, 461, 498

spatial orientation trees, 625

quantization


block truncation coding, xvi, 406–411, 1043

definition of, 41

image transform, 272, 284, 1062

in H.261, 704

in JPEG, 344–345

in MPEG, 680–687

midriser, 740

midtread, 739, 806

scalar, xvi, 41–43, 266, 283, 634, 1060

in SPIHT, 617

vector, xvi, 283–284, 361, 390–397, 513, 1063

adaptive, xvi, 398–402

quantization noise, 445

Quantum (dictionary-based encoder), 187

quaternary (base-4 numbering), 462, 1059

Quayle, James Danforth (Dan, 1947–), 664

queue (data structure), 178–179, 1044, 1047

quincunx (wavelet image decomposition), 592

random data, 5, 6, 77, 228, 959, 1063

range encoding, 127–129, 783

in LZMA, 243

Rao, Ashok, 508

RAR, viii, 226–228, 1059

Rarissimo, 228, 1060

raster (origin of word), 655

raster order scan, 33, 270, 272, 284, 360, 403, 417, 423, 438, 440, 450, 451, 672, 679, 691

rate-distortion theory, 174

ratio of compression, 10, 1045

reasons for data compression, 2

recursive range reduction (3R), viii, x, 43–46, 1060

redundancy, 54, 73, 102

alphabetic, 2

and data compression, 2, 226, 265, 897

and reliability, 102

contextual, 2

definition of, 51, 979

direction of highest, 592

spatial, 268, 664, 1064

temporal, 664, 1064

Reed-Solomon error-correcting code, 226

reflected Gray code, xvi, 271, 273–279, 369, 449, 496, 1050

Hilbert curve, 486

reflection, 515

reflection coefficients, see parcor

refresh rate (of movies and TV), 654–656, 663, 677, 690

relative encoding, 27, 208, 444, 666, 1048, 1060

in JPEG, 339

reliability, 101–102

and Huffman codes, 101

as the opposite of compression, 17

in RAR, 226

renormalization (in the QM-coder), 131–137, 969

repetition finder, 221–224

repetition times, 182–184

resolution

of HDTV, 661–664

of images (defined), 263

of television, 655–658

of the eye, 342

Reynolds, Paul, 173

RGB

color space, 341

reasons for using, 342

RGC, see reflected Gray code

Ribera, Francisco Navarrete y (1600–?), 142

Rice codes, 44, 59, 418, 1042, 1050, 1060, 1061

definition of, 66

fast PPM, 161

FLAC, 763, 767

in hyperspectral data, 947

not used in WavPack, 777

Shorten, 760

subexponential code, 418

Rice, Robert F. (Rice codes developer), 66, 760

Richardson, Iain, 706

Richter, Jason James (1980–), 850

Riding the Bullet (novel), 928

RIFF (Resource Interchange File Format), 734

Rijndael, see advanced encryption standard

Rizzo, Francesco, 951, 1055

RLE, 23–46, 105, 270, 271, 1060, see also run-length encoding

and BW method, 853, 855, 1043


and sound, 732

and wavelets, 551, 559

BinHex4, 34–36

BMP image files, xii, 36–37, 1043

image compression, 28–32

in JPEG, 338

QIC-122, 184–186, 1059

RMSE, see root mean square error

Robinson, John Allen, xx

rods (in the retina), 342

Roman numerals, 532

root mean square error (RMSE), 281

Roshal, Alexander Lazarevitch, 226

Roshal, Eugene Lazarevitch (1972–), viii, x, 226, 227, 1059

rotation, 515

90◦, 516

matrix of, 312, 317

rotations

Givens, 316–325

improper, 316

in three dimensions, 329–330

roulette game (and geometric distribution), 63

run-length encoding, 7, 23–46, 77, 266, 1056

and EOB, 345

and Golomb codes, 63–70, see also RLE

BTC, 409

FLAC, 766

in images, 271

MNP5, 95

Ryan, Abram Joseph (1839–1886), 552, 559, 566

Sagan, Carl Edward (1934–1996), 41, 658

sampling of sound, 724–727

Samuelson, Paul Anthony (1915–), 146

Saravanan, Vijayakumaran, xx, 320, 984

Sayood, Khalid, 442

SBC (speech compression), 753

scalar quantization, xvi, 41–43, 266, 283, 1060

in SPIHT, 617

in WSQ, 634

scaling, 515

Scott, Charles Prestwich (1846–1932), 676

SCSU (Unicode compression), xiii, 852, 922–927, 1060

SECAM (television standard), 655, 660, 689

self-similarity in images, 273, 497, 504, 1050, 1064

semiadaptive compression, 8, 89, 1060

semiadaptive Huffman coding, 89

semistructured text, xvii, 852, 910–911, 1060

sequency (definition of), 293

sequitur, xvii, 26, 852, 906–911, 1046, 1060

and dictionary-based methods, 910

set partitioning in hierarchical trees (SPIHT), xv, 532, 614–625, 1049, 1061

and CREW, 626

seven deadly sins, 823

SHA-256 hash algorithm, 242

shadow mask, 656, 657

Shanahan, Murray, 161

Shannon, Claude Elwood (1916–2001), 51, 72, 858, 1049, 1052

Shannon-Fano method, 47, 55, 72–74, 1051, 1061

shearing, 515

shift invariance, 1046

Shkarin, Dmitry, 227, 241

shorten (speech compression), xiii, 66, 757–763, 767, 945, 1050, 1061

sibling property, 91

Sierpinski

curve, 485, 490–491

gasket, 518–522, 998, 1000

triangle, 518, 519, 521, 998

Sierpinski, Wacław (1882–1969), 485, 490, 518, 521

sign-magnitude (representation of integers), 648

signal to noise ratio (SNR), 282

signal to quantization noise ratio (SQNR), 282

silence compression, 732

simple images, EIDAC, xvi, 389–390, 1061

sins (seven deadly), 823

skewed probabilities, 116

sliding window compression, xvi, 176–182, 403, 910, 1043, 1061

repetition times, 182–184

small numbers (easy to compress), 38, 346, 350, 365, 444, 449, 454, 455, 551, 856, 1015

SNR, see signal to noise ratio


SNR progressive image compression, 361, 640, 651

Soderberg, Lena (of image fame, 1951–), 334

solid archive, see RAR

sort-based context similarity, xvii, 851, 868–873

sound

fricative, 751

plosive, 751

properties of, 720–723

sampling, 724–727

unvoiced, 751

voiced, 750

source coding (formal name of data compression), 2

source speech codecs, 752–756

SourceForge.net, 762

SP theory (simplicity and power), 7

space-filling curves, xix, 270, 485–497, 1061

Hilbert, 487–490

Peano, 491

Sierpinski, 490–491

sparse strings, 19, 68, 851, 874–884, 1061

prefix compression, xvii, 880–884

sparseness ratio, 11, 560

spatial orientation trees, 619–620, 625, 626

spatial redundancy, 268, 664, 1064

in hyperspectral data, 943

spectral dimension (in hyperspectral data), 943

spectral selection (in JPEG), 339

speech (properties of), 750–752

speech compression, 585, 720, 750–762

μ-law, 752

A-law, 752

AbS, 756

ADPCM, 752

ATC, 753

CELP, 756

CS-CELP, 757

DPCM, 752

hybrid codecs, 752, 756–757

LPC, 753–756

SBC, 753

shorten, xiii, 66, 757–763, 1050, 1061

source codecs, 752–756

vocoders, 752, 753

waveform codecs, 752–753

Sperry Corporation

LZ78 patent, 258

LZW patent, 256

SPIHT, see set partitioning in hierarchical trees

SQNR, see signal to quantization noise ratio

square integrable functions, 543

Squish, 762

stack (data structure), 208, 1047

Stafford, David (quantum dictionary compression), 187

standard (wavelet image decomposition), 554, 590, 593

standard test images, 333–336

standards (organizations for), 102–103

standards of television, 559, 655–660, 847

start-step-stop codes, 56

static dictionary, 171, 172, 191, 224

statistical distributions, see distributions

statistical methods, 7, 47–169, 172, 266–267, 1061

context modeling, 140

unification with dictionary methods, 259–261

statistical model, 115, 139, 171, 369, 370, 1057

steganography (data hiding), 184

stone-age binary (unary code), 55

Storer, James Andrew, 179, 951, 1019, 1055, 1056, 1067

streaming mode, 10, 853

string compression, 173–174, 1061

subband (minimum size of), 589

subband transform, 284, 554–559, 566

subexponential code, 59, 418–421, 760

subsampling, xvi, 283, 1062

  in video compression, 665

successive approximation (in JPEG), 339

support (of a function), 550

surprise (as an information measure), 48, 53

Swift, Jonathan (1667–1745), xx

SWT, see symmetric discrete wavelet transform

symbol ranking, xvii, 139, 851, 858–861, 867, 868, 1062

symmetric (wavelet image decomposition), 589, 635

symmetric compression, 9, 172, 185, 339, 634

symmetric context (of a pixel), 414, 423, 439, 441, 647

symmetric discrete wavelet transform, 634
synthetic image, 264
Szymanski, Thomas G., 179, 1056

T/F codec, see time/frequency (T/F) codec
taps (wavelet filter coefficients), 571, 584, 585, 589, 626, 635, 1062
TAR (Unix tape archive), 1062
television

  aspect ratio of, 655–658
  resolution of, 655–658
  scan line interlacing, 663
  standards used in, 559, 655–660

temporal masking, 730–731, 798
temporal redundancy, 664, 1064
tera (= 2^40), 633
text
  case flattening, 19
  English, 2, 3, 13, 172
  files, 8
  natural language, 101
  random, 5, 77, 959
  semistructured, xvii, 852, 910–911, 1060

text compression, 7, 13, 23
  LZ, 173, 1055
  QIC-122, 184–186, 1059
  RLE, 23–27
  symbol ranking, 858–861, 868

textual image compression, 851, 888–895, 1062

textual substitution, 398
Thomson, William (Lord Kelvin 1824–1907), 541
TIFF

  and JBIG2, 1053
  and LZW patent, 257

time/frequency (T/F) codec, 824, 1041, 1062

title of this book, 15
Tjalkens, Tjalling J., xx
Toeplitz, Otto (1881–1940), 772
token (definition of), 1062
tokens
  dictionary methods, 171
  in block matching, 403, 406
  in LZ77, 176, 177, 230
  in LZ78, 189, 190

  in LZFG, 192, 193
  in LZSS, 179, 181
  in LZW, 199
  in MNP5, 96, 97
  in prefix compression, 477, 478
  in QIC-122, 184

training (in data compression), 32, 104, 105, 141, 333, 391–393, 398, 404, 440, 442, 487, 682, 868, 905

transforms, 7
  AC coefficient, 288, 298, 301
  DC coefficient, 288, 298, 301, 302, 330, 339, 345–347, 350
  definition of, 532
  discrete cosine, xvi, 292, 298–330, 338, 343–344, 557, 709, 714, 815, 1047
    3D, 298, 947–949
  discrete Fourier, 343
  discrete sine, xvi, 330–333
  Fourier, xix, 284, 532–541, 570, 571
  Haar, xv, xvi, 292, 294–295, 326, 549–566
  Hotelling, see Karhunen-Loeve transform
  images, xvi, 284–333, 487, 554–559, 1062
  inverse discrete cosine, 298–330, 343–344, 984
  inverse discrete sine, 330–333
  inverse Walsh-Hadamard, 293–294
  Karhunen-Loeve, xvi, 292, 295–297, 329, 557
  orthogonal, 284, 554–559
  subband, 284, 554–559, 566
  Walsh-Hadamard, xvi, 292–294, 557, 714, 715, 982

translation, 516

tree
  adaptive Huffman, 89–91
  binary search, 179, 180, 182, 1056
    balanced, 179, 181
    skewed, 179, 181
  data structure, 1047
  Huffman, 74, 75, 79, 89–91, 1051
    height of, 82–84
    unique, 235
  Huffman (decoding), 93
  Huffman (overflow), 93
  Huffman (rebuilt), 93
  logarithmic, 569
  LZ78, 190
    overflow, 191
  LZW, 203, 204, 206, 208
  multiway, 203
  spatial orientation, 619–620, 625, 626
  traversal, 74

tree-structured vector quantization, 396–397
trends (in an image), 542
triangle (Sierpinski), 519, 521, 998
triangle mesh compression, edgebreaker, xvii, 852, 911–922, 1062
trie

  definition of, 150, 191
  LZW, 203
  Patricia, 245

trigram, 140, 141
  and redundancy, 2

trit (ternary digit), 50, 60, 82, 497, 792, 961, 1062, 1063

Truta, Cosmin, ix, xi, xiii, 70, 251
TSVQ, see tree-structured vector quantization
Tunstall code, 61–62, 1063
Twain, Mark (1835–1910), 26
two-pass compression, 8, 89, 114, 220, 350, 426, 876, 885, 1060

Udupa, Raghavendra, xvii, xx, 508
Ulysses (novel), 19
unary code, 55–60, 220, 416, 421, 478, 956, 1049, 1063
  general, 56, 193, 404, 987, 1063, see also stone-age binary
uncertainty principle, 538–540, 548

  and MDCT, 815
Unicode, 179, 987, 1044, 1063
Unicode compression, xiii, 852, 922–927, 1060
unification of statistical and dictionary methods, 259–261
uniform (wavelet image decomposition), 594
Unisys (and LZW patent), 256, 257
universal compression method, 10, 184
univocalic, 142
UNIX
  compact, 89
  compress, 191, 196, 224–225, 257, 1044, 1058
  Gzip, 224, 257

unvoiced sounds, 751

UPX (exe compressor), 254

V.32, 90

V.32bis, 228, 1063

V.42bis, 6, 228–229, 1063

Vail, Alfred (1807–1859), 47

Valens, Clemens, 652

Valenta, Vladimir, 497

Vanryper, William, 46

variable-size codes, 2, 5, 47, 54–60, 78, 89, 94, 96, 100, 171, 173, 959, 1058, 1061, 1063

  and reliability, 101, 1060
  and sparse strings, 875–880
  definition of, 54
  designing of, 55
  in fax compression, 104
  unambiguous, 71, 1054

variance, 270

  and MLP, 426–429
  as energy, 288
  definition of, 427
  of differences, 444
  of Huffman codes, 75

VCDIFF (file differencing), 932–934, 1063

vector quantization, 283–284, 361, 390–397, 513, 1063

  AAC, 842
  adaptive, xvi, 398–402
  hyperspectral data, 950–952
  LBG algorithm, 392–398, 487, 952
  locally optimal partitioned vector quantization, 950–952
  quantization noise, 445
  tree-structured, 396–397

vector spaces, 326–329, 447

Vega, Suzanne (mother of the mp3), 846

Veldmeijer, Fred, xv

video
  analog, 653–660
  digital, 660–661, 1047
  high definition, 661–664

video compression, xvi, 664–718, 1064

  block differencing, 666
  differencing, 665
  distortion measures, 668–669
  H.261, xvi, 703–704, 1051
  H.264, viii, 350, 706–718, 1051
  inter frame, 665
  intra frame, 665
  motion compensation, 666–676
  motion vectors, 667, 709
  MPEG-1, 670, 676–698, 1057
  MPEG-1 audio, xvi, 795–820
  subsampling, 665

vine pointers, 152
vision (human), 342, 942
vmail (email with video), 660
vocoders speech codecs, 752, 753
voiced sounds, 750
Voronoi diagrams, 396, 1064

Wagner’s Ring Cycle, 821
Walsh-Hadamard transform, xvi, 290, 292–294, 557, 714, 715, 982
warm colors, 342
Warnock, John (1940–), 928
WAVE audio format, viii, 734–736
wave particle duality, 539
waveform speech codecs, 752–753
wavelet image decomposition
  adaptive wavelet packet, 596
  full, 595
  Laplacian pyramid, 590
  line, 590
  nonstandard, 592
  pyramid, 554, 592
  quincunx, 592
  standard, 554, 590, 593
  symmetric, 589, 635
  uniform, 594
  wavelet packet transform, 594

wavelet packet transform, 594
wavelet scalar quantization (WSQ), 1064
wavelets, 272, 273, 292, 541–652

  Beylkin, 585
  Coifman, 585
  continuous transform, 343, 543–549, 1046
  Daubechies, 580, 585–588
    D4, 568, 575, 577
    D8, 1004, 1005
  discrete transform, xv, 343, 576–589, 1048
  filter banks, 566–576
    biorthogonal, 569
    decimation, 567
    deriving filter coefficients, 574–576
    orthogonal, 568

  fingerprint compression, xv, 589, 633–639, 1064

  Haar, 549–566, 580
  image decompositions, 589–596
  integer transform, 608–610
  lazy transform, 599
  lifting scheme, 596–608, 1054
  Mexican hat, 543, 545, 547, 548
  Morlet, 543, 548
  multiresolution decomposition, 589, 1057
  origin of name, 543
  quadrature mirror filters, 578
  symmetric, 585
  used for plotting functions, 578–580
  Vaidyanathan, 585

wavelet scalar quantization (WSQ), xv, 532, 633–639

WavPack audio compression, viii, 772–782, 1064

web browsers
  and FABD, 451
  and GIF, 225, 257
  and PDF, 928
  and PNG, 247
  and XML, 251
  DjVu, 630

Web site of this book, x, xiii, xvii–xviii

Weierstrass, Karl Theodor Wilhelm (1815–1897), 588

weighted finite automata, xx, 464, 497–510, 1064

Weisstein, Eric W., 486

Welch, Terry A., 199, 256, 1056

WFA, see weighted finite automata

Wheeler, Wayne, v, vii

Whitman, Walt (1819–1892), 253

Whittle, Robin, 761

WHT, see Walsh-Hadamard transform

Wilde, Erik, 354

Willems, Frans M. J., xx, 182

Williams, Ross N., 195, 198, 258

Wilson, Sloan (1920–2003), 142

WinRAR, viii, 226–228

Wirth, Niklaus (1934–), 490

Wister, Owen (1860–1938), 212

Witten, Ian A., 139

Wolf, Stephan, viii, ix, 475

woodcuts, 437
  unusual pixel distribution, 437
word-based compression, 885–887
Wright, Ernest Vincent (1872–1939), 141
WSQ, see wavelet scalar quantization
www (web), 257, 337, 750, 1053

Xerox Techbridge (OCR software), 890
XML compression, XMill, xii, 251–253, 1064
xylography (carving a woodcut), 437

YCbCr color space, 270, 320, 341, 343, 352, 643, 660

YIQ color model, 510, 559
Yokoo, Hidetoshi, xx, 221, 224, 873
Yoshizaki, Haruyasu, 229, 1054
YPbPr color space, 320

zdelta, 934–936, 1064

Zelazny, Roger (1937–1995), 237

zero-probability problem, 140, 144, 364, 413, 437, 682, 897, 1065

zero-redundancy estimate (in CTW), 169

zigzag sequence, 270

  in H.261, 704
  in H.264, 715
  in JPEG, 344, 984
  in MPEG, 684, 698, 1009
  in RLE, 32
  three-dimensional, 948–949

Zip (compression software), 230, 1047, 1065

Ziv, Jacob (1931–), 43, 173, 1055

  LZ78 patent, 258

Zurek, Wojciech Hubert (1951–), 53

There is no index of character so sure as the voice.

—Benjamin Disraeli

Colophon

The first edition of this book was written in a burst of activity during the short period June 1996 through February 1997. The second edition was the result of intensive work during the second half of 1998 and the first half of 1999. The third edition includes material written, without haste, mostly in late 2002 and early 2003. The fourth edition was written, at a leisurely pace, during the first half of 2006. The book was designed by the author and was typeset by him with the TeX typesetting system developed by D. Knuth. The text and tables were done with Textures, a commercial TeX implementation for the Macintosh. The diagrams were done with Adobe Illustrator, also on the Macintosh. Diagrams that required calculations were done either with Mathematica or Matlab, but even those were “polished” by Adobe Illustrator. The following points illustrate the amount of work that went into the book:

The book (including the auxiliary material located in the book’s Web site) contains about 523,000 words, consisting of about 3,081,000 characters (big, even by the standards of Marcel Proust). However, the size of the auxiliary material collected in the author’s computer and on his shelves while working on the book is about 10 times bigger than the entire book. This material includes articles and source codes available on the Internet, as well as many pages of information collected from various sources.

The text is typeset mainly in font cmr10, but about 30 other fonts were used.

The raw index file has about 4500 items.

There are about 1150 cross references in the book.

You can’t just start a new project in Visual Studio/Delphi/whatever,
then add in an ADPCM encoder, the best psychoacoustic model,
some DSP stuff, Levinson Durbin, subband decomposition, MDCT, a
Blum-Blum-Shub random number generator, a wavelet-based
brownian movement simulator and a Feistel network cipher
using a cryptographic hash of the Matrix series, and expect
it to blow everything out of the water, now can you?

—Anonymous, found in [hydrogenaudio 06]