ee591u wavelets and filter banks copyright xin li 20081 roadmap to lossy image compression jpeg...
Post on 31-Dec-2015
224 Views
Preview:
TRANSCRIPT
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
1
Roadmap to Lossy Image Compression JPEG standard: DCT-based image
coding First-generation wavelet coding
FBI WSQ standard Second-generation schemes
Embedded Zerotree Wavelet (EZW) A unified where-and-what perspective A classification-based interpretation
Beyond wavelet coding
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
2
A Tour of JPEG Coding Algorithm
Flow-chart diagram of DCT-based coding algorithm specified by Joint Photographic Expert Group (JPEG)
T Q C
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
3
Transform Coding of Images Why not transform the whole image
together? Require a large memory to store transform
matrix It is not a good idea for compression due to
spatially varying statistics within an image Idea of partitioning an image into blocks
Each block is viewed as a smaller-image and processed independently
It is not a magic, but a compromise
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
4
8-by-8 DCT Basis Images
Tiii
Tjiij
i jijij
aabbb
x
],...,[,
,
81
8
1
8
1
B
BY
8881
1811
88
......
............
............
......
aa
aa
A
81,82,16
)1)(12(cos
2
1
81,1,8
1
lkkl
lkakl
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
5
Block Processing under MATLAB
Type “help blkproc” to learn the usage of this function B = BLKPROC(A,[M N],FUN) processes
the image A by applying the function FUN to each distinct M-by-N block of A, padding A with zeros if necessary.
ExampleI = imread('cameraman.tif'); fun = @dct2; J = blkproc(I,[8 8],fun);
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
6
Block-based DCT Example
JI
note that white lines are artificially added to the border of each 8-by-8 block to denote that each block is processed independently
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
7
Boundary Padding
padded regions
Example
1213141516171819
1213141516171819
1213141516171819
1213141516171819
When the width/height of an image is not the multiple of 8, the boundary isartificially padded with repeated columns/rows to make them multiple of 8
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
8
Work with a Toy Example
169130
173129
170181
170183
179181
182180
179180
179179169132
171130
169183
164182
179180
176179
180179
178178167131
167131
165179
170179
177179
182171
177177
168179169130
165132
166187
163194
176116
15394
153183
160183
Any 8-by-8 block in an image is processed in a similar fashion
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
9
Encoding Stage I: Transform
• Step 1: DC level shifting
169130
173129
170181
170183
179181
182180
179180
179179169132
171130
169183
164182
179180
176179
180179
178178167131
167131
165179
170179
177179
182171
177177
168179169130
165132
166187
163194
176116
15394
153183
160183
412
451
4253
4255
5153
5452
5152
5151414
432
4155
3654
5152
4851
5251
5050393
393
3751
4251
4951
5443
4949
4051412
374
3859
3566
4812
2534
2555
3655
128 (DC level)
_
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
10
• Step 2: 8-by-8 DCT
412
451
4253
4255
5153
5452
5152
5151414
432
4155
3654
5152
4851
5251
5050393
393
3751
4251
4951
5443
4949
4051412
374
3859
3566
4812
2534
2555
3655
13
42
12
09
40
21
13
4430
55
47
73
30
46
32
16113
916
109
621
179
3310
810
17201024
2727
132
6078
4413
1827
2738
56313
Encoding Step 1: Transform (Con’t)
88DCT
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
11
Encoding Stage II: Quantization
99103
101120
100112
121103
9895
8778
9272
644992113
77103
10481
10968
6455
5637
3524
22186280
5669
8751
5740
2922
2416
1714
13145560
6151
5826
4024
1914
1610
1212
1116
Q-table
8,1
ˆ:
:
1
ji
Qsxf
Q
xsf
ijijij
ij
ijij
: specifies quantization stepsize (see slide #28)
Notes: Q-table can be specified by customerQ-table is scaled up/down by a chosen quality factor Quantization stepsize Qij is dependent on the coordinates (i,j) within the 8-by-8 block Quantization stepsize Qij increases from top-left to bottom-right
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
12
Encoding Stage II: Quantization (Con’t)
13
42
12
09
40
21
13
4430
55
47
73
30
46
32
16113
916
109
621
179
3310
810
17201024
2727
132
6078
4413
1827
2738
56313
00
00
00
00
00
00
00
0000
00
00
00
00
00
00
0000
00
00
01
10
11
01
1100
01
01
23
21
13
23
520
Example
f
xijsij
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
13
Encoding Stage III: Entropy Coding
Zigzag Scan
00
00
00
00
00
00
00
0000
00
00
00
00
00
00
0000
00
00
01
10
11
01
1100
01
01
23
21
13
23
520
(20,5,-3,-1,-2,-3,1,1,-1,-1,0,0,1,2,3,-2,1,1,0,0,0,0,0,0,1,1,0,1,EOB)
zigzag scan
End Of the Block:All following coefficients are zero
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
14
Run-length Coding
(20,5,-3,-1,-2,-3,1,1,-1,-1,0,0,1,2,3,-2,1,1,0,0,0,0,0,0,1,1,0,1,EOB)
DCcoefficient
ACcoefficient
- DC coefficient : DPCM coding- AC coefficient : run-length coding (run, level)
(5,-3,-1,-2,-3,1,1,-1,-1,0,0,1,2,3,-2,1,1,0,0,0,0,0,0,1,1,0,1,EOB)
(0,5),(0,-3),(0,-1),(0,-2),(0,-3),(0,1),(0,1),(0,-1),(0,-1),(2,0),(0,1),(0,2),(0,3),(0,-2),(0,1),(0,1),(6,0),(0,1),(0,1),(1,0),(0,1),EOB
Huffman codingencoded bit stream
encoded bit stream
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
15
JPEG Decoding Stage I: Entropy Decoding
(20,5,-3,-1,-2,-3,1,1,-1,-1,0,0,1,2,3,-2,1,1,0,0,0,0,0,0,1,1,0,1,EOB)
Huffman decodingencoded bit stream
AC coefficients
DC coefficient
DPCM decoding
(0,5),(0,-3),(0,-1),(0,-2),(0,-3),(0,1),(0,1),(0,-1),(0,-1),(2,0),(0,1),(0,2),(0,3),(0,-2),(0,1),(0,1),(6,0),(0,1),(0,1),(1,0),(0,1),EOB
encoded bit stream
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
16
JPEG Decoding Stage II: Inverse Quantization
(20,5,-3,-1,-2,-3,1,1,-1,-1,0,0,1,2,3,-2,1,1,0,0,0,0,0,0,1,1,0,1,EOB)
00
00
00
00
00
00
00
0000
00
00
00
00
00
00
0000
00
00
01
10
11
01
1100
01
01
23
21
13
23
520
zigzag
00
00
00
00
00
00
00
0000
00
00
00
00
00
00
0000
00
00
040
290
2416
014
131400
051
026
8072
3814
1630
2436
55320
f-1
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
17
JPEG Decoding Stage III: Inverse Transform
00
00
00
00
00
00
00
0000
00
00
00
00
00
00
0000
00
00
040
290
2416
014
131400
051
026
8072
3814
1630
2436
55320
169130
185127
170181
162192
179181
181184
179180
178183173128
161129
169191
175186
174181
168186
178181
182172170131
177128
170185
166187
171187
168172
180169
169174175124
170120
168193
171197
158143
148119
153186
140195
412
571
4253
3464
5153
5356
5152
5055450
331
4163
4758
4653
4058
5053
5444423
490
4257
3859
4359
4044
5241
4146474
438
4065
4369
3015
209
2558
1267
88IDCT
128 (DC level)
+
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
18
Quantization Noise
169130
173129
170181
170183
179181
182180
179180
179179169132
171130
169183
164182
179180
176179
180179
178178167131
167131
165179
170179
177179
182171
177177
168179169130
165132
166187
163194
176116
15394
153183
160183
169130
185127
170181
162192
179181
181184
179180
178183173128
161129
169191
175186
174181
168186
178181
182172170131
177128
170185
166187
171187
168172
180169
169174175124
170120
168193
171197
158143
148119
153186
140195
X X^
MSE=||X-X||2^Distortion calculation:
Rate calculation: Rate=length of encoded bit stream/number of pixels (bps)
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
19
JPEG Examples
1000
90 (58k bytes)50 (21k bytes)10 (8k bytes)
best quality,
lowest compression
worst quality,
highest compression
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
20
Roadmap to Lossy Image Compression Lifting scheme: unifying prediction and
transform First-generation schemes
FBI WSQ standard Second-generation schemes
Probabilistic modeling of wavelet coefficients
Embedded Zerotree Wavelet (EZW) SPIHT coder A unified where-and-what perspective
JPEG2000
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
21
Early Attempts
Each band is modeled by a Guassian random variable with zero mean and unknown variance (e.g., WSQ)
Only modest gain over JPEG (DCT-based) is achieved
Question: is this an accurate model?and how can we test it?
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
22
FBI Wavelet Scalar Quantization (WSQ)
),0(~ 2kk Nx k: band index
kk k
Dm
D 1
mk= image size
subband size
Each band is approximately modeled by a Gaussian r.v.
Given R, minimize
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
23
Rate Allocation Problem*
Solution: Lagrangian Multiplier technique (turn a constrained optimizationInto an unconstrained optimization problem)
LL
LH HH
HL Given a quota of bits R, how should weallocate them to each band to minimizethe overall MSE distortion?
*.,.,1
min RRtsDm
D kk k
RD min
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
24
Proof by Contradiction (I)
Suppose each coefficient X in a high band does observeGaussian distribution, i.e., X~N(0,σ2), then flip the sign ofX (i.e., replace X with –X) should not matter and generatesanother element in Ω (i.e., a different but meaningful image)
Assumption: our modeling target Ω is the collection of natural images
Let’s test it!
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
25
Proof by Contradiction (II)
DWT
sign flip
IWT
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
26
What is wrong with that? Think of two coefficients: one in
smooth region and the other around edge, do they observe the same probabilistic distribution?
Think of all coefficients around the same edge, do they observe the same probabilistic distribution?
Ignorance of topology and geometry
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
27
The Importance of Modeling Singularity Location Uncertainty
Singularities carry critical visual information: edges, lines, corners …
The location of singularities is important Recall locality of wavelets in spatial-
frequency domain Singularities in spatial domain →
significant coefficients in wavelet domain
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
28
Where-and-What Coding
Communication context
Where The location of significant coefficients
What The sign and magnitude of significant
coefficients
Alice Bob
communicationchannelpicture
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
29
Roadmap to Lossy Image Compression Lifting scheme: unifying prediction and
transform First-generation schemes
FBI WSQ standard Second-generation schemes
Embedded Zerotree Wavelet (EZW) A unified where-and-what perspective A classification-based interpretation
Scalable and ROI coding in JPEG2000
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
30
1993-2003 Embedded Zerotree Wavelet (EZW)’1993 Set Partition In Hierarchical Tree
(SPIHT)’1995 Space-Frequency Quantization (SFQ)’
1996 Estimation Quantization (EQ)’1997 Embedded Block Coding with Optimal
Truncation (EBCOT)’2000 Least-Square Estimation Quantization
(LSEQ)’2003
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
31
A Simpler Two-Stage Coding Position coding stage (where)
Generate a binary map indicating the location of significant coefficients (|X|>T)
Use context-based adaptive binary arithmetic coding (e.g., JBIG) to code the binary map
Intensity coding stage (what) Code the sign and magnitude of
significant coefficients
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
32
Classification-based Modeling
),0(~ 200 NX
Insignificant class
),0(~ 211 NX
Significant class
Mixture
20
21
2201 )1(),,0(~)1( aaNXaaXX
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
33
Classification Gain
RRD 22 2)(
Without classification
With classification
RaaRD 221
)1(20 2)('
Classification gain
0)1(
log10)('
)(log10
21
)1(20
20
21
1010
dBaa
dBRD
RDG
aa
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
34
Example
100,1 21
20
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
35
Advanced Wavelet Coding
SPIHT: a simpler yet more efficient implementation of EZW coder
SFQ: Rate-Distortion optimized zerotree coder
EQ: Rate-Distortion optimization via backward adaptive classification
EBCOT (adopted by JPEG2000): a versatile embedded coder
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
36
Beyond SPIHT
JPEG-decoded at rate of 0.32bpp(PSNR=32.07dB)
SFG-enhanced at rate of 0.32bpp(PSNR=33.22dB)
SPIHT-decoded at rate of 0.20bpp(PSNR=26.18dB)
SFG-enhanced at rate of 0.20bpp(PSNR=27.33dB)
Maximum-Likelihood (ML) Decoding
Maximum a Posterior (MAP) Decoding
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
37
Open Problems Related to Image Coding
Coding of specific class of images (e.g., Satellite, microarray, fingerprint)
Coding of color-filter-array (CFA) images
Error resilient coding of images Perceptual image coding Image coding for pattern recognition
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
38
Coding of Specific Class of Images
How to designspecific codingalgorithms foreach class?
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
39
CFA Image Coding
Bayer Pattern
CFA Interpolation(demosaicing)
Color imagecompression
CFA Interpolation(demosaicing)
CFA datacompression
Approach I
Approach II
Which one is better and why?
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
40
Error Resilient Image Coding
sourceencoder
channel
sourcedecoder
source destination
super-channel
channelencoder
channeldecoder
How can we optimize the end-to-end performance in the presenceof channel errors?
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
41
Perceptual Image Coding
Characterizing image distortion is difficult!
How do we objectively define mage qualitywhich has to be subjectto individual opinions?
EE591U Wavelets and Filter Banks Copyright Xin Li 2008
42
Image Coding for PR
imagesensor
Communicationchannel
Patternrecognition
How does coding distortion affect the recognition performance?
We need to develop a new image representation whichCan simultaneously support low-level (e.g., compression,denoising) and high-level (e.g., recognition and retrieval) vision tasks
top related