
010.141 Engineering Mathematics II
Lecture 16

Compression

Bob McKay

School of Computer Science and Engineering
College of Engineering
Seoul National University

Outline
• Lossless Compression
  – Huffman & Shannon-Fano
  – Arithmetic Compression
  – The LZ Family of Algorithms
• Lossy Compression
  – Fourier Compression
  – Wavelet Compression
  – Fractal Compression

Lossless Compression
• Lossless encoding methods guarantee to reproduce exactly the same data as was input to them

Run Length Encoding

  Original Data String      Encoded Data String
  $******55.72              $¤*<6>55.72
  ---------                 ¤-<9>
  Guns          Butter      Guns¤ <10>Butter
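A minimal sketch of the scheme in Python, assuming (as in the table above) a marker character ¤ followed by the repeated character and a count; the threshold of four repeats before encoding pays off is an illustrative assumption, not part of the original slides:

```python
MARKER = "¤"  # compression-indicator character, as in the table above

def rle_encode(s: str, threshold: int = 4) -> str:
    """Replace any run of `threshold` or more repeats with MARKER, char, <count>."""
    out, i = [], 0
    while i < len(s):
        j = i
        while j < len(s) and s[j] == s[i]:
            j += 1                      # scan to the end of the current run
        run = j - i
        out.append(f"{MARKER}{s[i]}<{run}>" if run >= threshold else s[i] * run)
        i = j
    return "".join(out)

print(rle_encode("$******55.72"))   # -> $¤*<6>55.72
print(rle_encode("-" * 9))          # -> ¤-<9>
```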

Relative Encoding
• Useful when there are sequences of runs of data that vary only slightly from one run to the next:
  – e.g. the lines of a fax
  – The position of each change is denoted relative to the start of the line
  – Position indicator can be followed by a numeric count indicating the number of successive changes
  – For further compression, the position of the next change can be denoted relative to the previous one (as sketched below)
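A small sketch of the idea, under illustrative assumptions not in the slides: a line is a list of 0/1 pixels, and we transmit only the change positions, each relative to the previous change:

```python
def relative_encode(line):
    """Positions where the pixel value changes, each relative to the previous change."""
    changes, prev = [], 0
    for i in range(1, len(line)):
        if line[i] != line[i - 1]:
            changes.append(i - prev)  # offset from the previous change
            prev = i
    return changes

# a 0-run, a 1-run, then a 0-run: changes at absolute positions 4 and 7
print(relative_encode([0, 0, 0, 0, 1, 1, 1, 0, 0]))  # -> [4, 3]
```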

Statistical Compression
• For the examples below, we will use a simple alphabet with the following frequencies of occurrence (after Held)

Character Probability

X1 0.10

X2 0.05

X3 0.20

X4 0.15

X5 0.15

X6 0.25

X7 0.10

Huffman Encoding
• Arrange the character set in order of decreasing probability
• While there is more than one probability class:
  – Merge the two lowest-probability classes and add their probabilities to obtain a composite probability
  – The merges form a binary tree: at each branch, allocate a '0' to one branch and a '1' to the other
• The code for each character is found by traversing the tree from the root node to that character (a runnable sketch follows)
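A compact sketch of the procedure using Python's heapq, applied to the Held frequencies; the exact tree, and hence the codewords, depends on how ties and orderings are resolved, so the output need not match the table on the next slide, though it is an equally valid prefix code:

```python
import heapq
from itertools import count

def huffman(probs):
    """probs: {symbol: probability}. Returns {symbol: codeword}."""
    tick = count()  # tie-breaker so the heap never compares the code dicts
    heap = [(p, next(tick), {s: ""}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:                      # more than one probability class
        p0, _, c0 = heapq.heappop(heap)       # merge the two lowest classes
        p1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c0.items()}
        merged.update({s: "1" + w for s, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, next(tick), merged))
    return heap[0][2]

probs = {"X1": 0.10, "X2": 0.05, "X3": 0.20, "X4": 0.15,
         "X5": 0.15, "X6": 0.25, "X7": 0.10}
print(huffman(probs))
```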

Huffman Encoding

  Character   Probability   Code
  X6          0.25          00
  X3          0.20          010
  X4          0.15          011
  X5          0.15          100
  X1          0.10          101
  X7          0.10          110
  X2          0.05          111

[Huffman tree: X3+X4 → 0.35, X7+X2 → 0.15, X5+X1 → 0.25, then X6+0.35 → 0.60 and 0.25+0.15 → 0.40, finally 0.60+0.40 → 1.0; each branch pair is labelled '0'/'1', and reading the labels from the root gives the codes above]

Shannon-Fano Algorithm
• Arrange the character set in order of decreasing probability
• While a probability class contains more than one symbol:
  – Divide the probability class in two
    • so that the probabilities in the two halves are as nearly as possible equal
  – Assign a '1' to the first probability class, and a '0' to the second (see the sketch below)
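A recursive sketch of the split (an illustrative implementation, not from the slides); it reproduces the codes in the table on the next slide:

```python
def shannon_fano(items):
    """items: [(symbol, probability)] in decreasing order. Returns {symbol: code}."""
    if len(items) == 1:
        return {items[0][0]: ""}
    total = sum(p for _, p in items)
    running, best_k, best_diff = 0.0, 1, float("inf")
    for k in range(1, len(items)):            # find the most even split point
        running += items[k - 1][1]
        diff = abs(2 * running - total)       # |first half - second half|
        if diff < best_diff:
            best_diff, best_k = diff, k
    codes = {s: "1" + w for s, w in shannon_fano(items[:best_k]).items()}
    codes.update({s: "0" + w for s, w in shannon_fano(items[best_k:]).items()})
    return codes

items = [("X6", 0.25), ("X3", 0.20), ("X4", 0.15), ("X5", 0.15),
         ("X1", 0.10), ("X7", 0.10), ("X2", 0.05)]
print(shannon_fano(items))  # X6 -> 11, X3 -> 10, X4 -> 011, ..., X2 -> 0000
```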

Shannon-Fano Encoding

  Character   Probability   Code
  X6          0.25          11
  X3          0.20          10
  X4          0.15          011
  X5          0.15          010
  X1          0.10          001
  X7          0.10          0001
  X2          0.05          0000

[Each split contributes one bit: the first split separates {X6, X3} (total 0.45, prefix '1') from the rest (total 0.55, prefix '0'), and so on down to single symbols]

Arithmetic Coding
• Arithmetic coding assumes there is a model for statistically predicting the next character of the string to be encoded
  – An order-0 model predicts the next symbol based on its probability, independent of previous characters
    • For example, an order-0 model of English predicts the highest probability for ‘e’
  – An order-1 model predicts the next symbol based on the preceding character
    • For example, if the preceding character is ‘q’, then ‘u’ is a likely next character
  – And so on for higher-order models
    • e.g. ‘ert’ → ‘erty’, etc.

Arithmetic Coding
• Arithmetic coding assumes the coder and decoder share the probability table
  – The main data structure of arithmetic coding is an interval, representing the string constructed so far
    • Its initial value is [0,1]
  – At each stage, the current interval [min,max] is subdivided into sub-intervals corresponding to the probability model for the next character
  – The interval chosen will be the one representing the actual next character
  – The more probable the character, the larger the interval
  – The coder output is a number in the final interval

Arithmetic Coding

Character Probability

X1 0.10

X2 0.05

X3 0.20

X4 0.15

X5 0.15

X6 0.25

X7 0.10

Arithmetic Coding
• Suppose we want to encode the string X1X3X7
  – After X1, our interval is [0,0.1]
  – After X3, it is [0.015,0.035]
  – After X7, it is [0.033,0.035]
• The natural output to choose is the shortest binary fraction in [0.033,0.035]
• Obviously, the algorithm as stated requires infinite precision
• Slight variants re-normalise at each stage to remain within computer precision (see the sketch below)
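A sketch of the interval narrowing, reproducing the figures above; the cumulative ranges are derived from the Held table by taking the symbols in the order X1…X7 (an assumed but natural ordering):

```python
probs = [("X1", 0.10), ("X2", 0.05), ("X3", 0.20), ("X4", 0.15),
         ("X5", 0.15), ("X6", 0.25), ("X7", 0.10)]

ranges, c = {}, 0.0                 # X1 -> [0.00,0.10), X3 -> [0.15,0.35), ...
for sym, p in probs:
    ranges[sym] = (c, c + p)
    c += p

def encode_interval(message):
    lo, hi = 0.0, 1.0               # initial interval [0,1]
    for sym in message:
        s_lo, s_hi = ranges[sym]
        width = hi - lo             # subdivide the current interval
        lo, hi = lo + width * s_lo, lo + width * s_hi
    return lo, hi

print(encode_interval(["X1", "X3", "X7"]))   # ~ (0.033, 0.035)
```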

Substitutional Compression
• The basic idea behind a substitutional compressor is to replace an occurrence of a particular phrase with a reference to a previous occurrence
• There are two main classes of schemes
  – Named after Jacob Ziv and Abraham Lempel, who first proposed them in 1977 and 1978

LZW
• LZW is an LZ78-based scheme designed by Terry Welch in 1984
  – LZ78 schemes work by putting phrases into a dictionary
    • when a repeat occurrence of a particular phrase is found, the dictionary index is output instead of the phrase
  – LZW starts with a 4K dictionary
    • entries 0-255 refer to individual bytes
    • entries 256-4095 refer to substrings
  – Each time a new code is generated it means a new string has been parsed
    • New strings are generated by adding the current character K to the end of an existing string w (until the dictionary is full)

LZW Algorithm

  set w = NIL
  loop
      read a character K
      if wK exists in the dictionary
          w = wK
      else
          output the code for w
          add wK to the string table
          w = K
  endloop
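The same loop as a runnable Python sketch, initialised (as on the previous slide) with single-byte entries 0-255; purely to keep the sketch short, the dictionary here is allowed to grow past the slide's 4K limit:

```python
def lzw_encode(data: bytes):
    """Return the list of dictionary indices encoding `data`."""
    dictionary = {bytes([i]): i for i in range(256)}   # entries 0-255: single bytes
    w, out = b"", []
    for b in data:
        wk = w + bytes([b])
        if wk in dictionary:
            w = wk                                     # keep extending the phrase
        else:
            out.append(dictionary[w])                  # output the code for w
            dictionary[wk] = len(dictionary)           # add wK to the string table
            w = bytes([b])                             # w = K
    if w:
        out.append(dictionary[w])                      # flush the final phrase
    return out

print(lzw_encode(b"TOBEORNOTTOBEORTOBEORNOT"))
```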

LZW
• The most remarkable feature of this type of compression is that the decoder acquires the entire dictionary without the dictionary ever being transmitted explicitly
  – At the end of the run, the decoder will have a dictionary identical to the one the encoder has, built up entirely as part of the decoding process
• Codings in this family are behind such representations as .gif
  – They were previously under patent, but many of the relevant patents are now expiring

Lossy Compression
• Lossy compression algorithms do not guarantee to reproduce the original input
  – They achieve much higher compression by reproducing something ‘near enough’, keeping the losses below what is objectionably detectable
    • Usually, this means detectable by a human sense - sight (jpeg), hearing (mp3), motion understanding (mp4)
  – This requires a model of what is acceptable
    • The model may only be accurate in some circumstances
      – Which is why compressing a text or line drawing with jpeg is a bad idea

Fourier Compression (jpeg)
• The Fourier transform of a dataset is a frequency representation of that dataset
  – You have probably already seen graphs of Fourier transforms
    • the frequency diagram of a sound sample is a graph of the Fourier transform of the data that you would otherwise see graphed as a time/amplitude diagram

Fourier Compression
• From our point of view, the important features of the Fourier transform are:
  – it is invertible
    • the original dataset can be rebuilt from the Fourier transform
  – graphic images of the world usually contain spatially repetitive information patterns
  – human senses are (usually) poor at detecting low-amplitude visual frequencies
• The Fourier transform usually has information concentrated at particular frequencies, depleted at others
• The depleted frequencies can be transmitted at low precision without serious loss of overall information

Discrete Cosine Transform
• A discretised version of the Fourier transform
  – Suited to representing spatially quantised (ie raster) images in a frequency quantised (ie tabular) format
  – Mathematically, the DCT of a function f ranging over a discrete variable x (omitting various important constants) is given by
    • F(n) = Σx f(x) cos(nπx)
  – Of course, we’re usually interested in two-dimensional images, and hence need the two-dimensional DCT, given (omitting even more important constants) by
    • F(m,n) = Σx Σy f(x,y) cos(mπx) cos(nπy)
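A numpy sketch of the 2-D DCT on an 8×8 block, restoring the normalising constants the slide omits (the orthonormal DCT-II); discarding the small coefficients is a crude stand-in for jpeg's perceptual quantisation, and the 0.05 threshold is an arbitrary illustrative choice:

```python
import numpy as np

def dct_matrix(N):
    """Orthonormal DCT-II basis: C[n, x] = a_n cos(pi n (2x + 1) / 2N)."""
    n = np.arange(N)[:, None]
    x = np.arange(N)[None, :]
    C = np.sqrt(2.0 / N) * np.cos(np.pi * n * (2 * x + 1) / (2 * N))
    C[0] /= np.sqrt(2.0)            # first row scaled so that C @ C.T = I
    return C

N = 8
C = dct_matrix(N)
f = np.random.rand(N, N)            # an 8x8 image block
F = C @ f @ C.T                     # 2-D DCT: transform rows and columns
F[np.abs(F) < 0.05] = 0             # 'compress' by dropping small coefficients
f_approx = C.T @ F @ C              # inverse transform (C is orthogonal)
print(np.max(np.abs(f - f_approx)))  # small reconstruction error
```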

Fourier Compression Revisited
• Fourier-related transforms are based on sine (or cosine) functions of various frequencies
  – The transform is a record of how to add together the periodic functions to obtain the original function
• Really, all we need is a basis set of functions
  – A set of functions that can generate all others

The Haar Transform
• Instead of periodic functions, we could instead add together discrete functions such as:

  [Figure: the Haar basis - square pulses of varying width and position, from narrow two-sample steps up to a step spanning half the line]

• This would give us the Haar transform (sketched below)
  – It can also be used to compress image data, though not as efficiently as the DCT
    • images compressed at the same rate as the DCT tend to look ‘blocky’, so a lower compression ratio must be used to give the same impression
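A one-level sketch of the discrete Haar transform on a 1-D signal (an illustrative implementation, not from the slides): pairwise averages capture the coarse shape, pairwise differences the detail, and small differences can be discarded for compression:

```python
import numpy as np

def haar_step(signal):
    """One Haar level: pairwise averages (coarse) and differences (detail)."""
    s = np.asarray(signal, dtype=float).reshape(-1, 2)
    return (s[:, 0] + s[:, 1]) / 2, (s[:, 0] - s[:, 1]) / 2

def haar_inverse(averages, details):
    out = np.empty(2 * len(averages))
    out[0::2] = averages + details
    out[1::2] = averages - details
    return out

avg, det = haar_step([9, 7, 3, 5])
print(avg, det)                  # [8. 4.] [ 1. -1.]
print(haar_inverse(avg, det))    # [9. 7. 3. 5.]
```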

Wavelet Compression
• Wavelet compression uses a basis set intermediate between the Fourier and Haar transforms
  – The functions are ‘smoothed’ versions of the Haar functions
    • They have a sinusoidal rather than square shape
    • They don’t die out abruptly at the edges
      – They decay into lower amplitude
• Wavelet compression can give very high ratios
  – attributed to similarities between wavelet functions and the edge detection present in the human retina
    • wavelet functions encode just the detail that we see best

Vector Quantisation
• Relies on building a codebook of similar image portions (illustrated below)
  – Only one copy of the similar portions is transmitted
• Just as LZ compression relies on building a dictionary of strings seen so far
  – just transmitting references to the dictionary
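A minimal illustration of the codebook idea (the block size, codebook, and data here are assumed purely for illustration): each image block is replaced by the index of its nearest codebook vector, so only the indices plus one copy of the codebook need be transmitted:

```python
import numpy as np

def vq_encode(blocks, codebook):
    """Index of the nearest codebook entry for each block."""
    # pairwise distances: blocks (B, d) vs codebook (K, d) -> (B, K)
    d = np.linalg.norm(blocks[:, None, :] - codebook[None, :, :], axis=2)
    return np.argmin(d, axis=1)

codebook = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]])   # K = 3 code vectors
blocks = np.array([[0.1, 0.2], [0.9, 0.8], [0.1, 0.9]])
indices = vq_encode(blocks, codebook)
print(indices)              # [0 1 2]
print(codebook[indices])    # decoded approximation of the original blocks
```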

Fractal Compression
• Relies on self-similarity of (parts of) the image to reduce transmission
  – It has a similar relation to vector quantisation methods as LZW has to LZ
  – LZW can be thought of as LZ in which the dictionary is derived from the part of the text seen so far
  – fractal compression can be viewed as deriving its dictionary from the portion of the image seen so far

Compression Times
• For transform encodings such as DCT or wavelet
  – compression and decompression times are roughly comparable
• For fractal compression
  – Compression takes orders of magnitude longer than decompression
    • Difficult to find the right codebook
• Fractal compression is well suited where pre-canned images will be accessed many times over

Summary
• Lossless Compression
  – Huffman & Shannon-Fano
  – Arithmetic Compression
  – The LZ Family of Algorithms
• Lossy Compression
  – Fourier compression
  – Wavelet Compression
  – Fractal Compression

Thank you