EE591b Advanced Image Processing Copyright Xin Li 2003
Roadmap to Lossy Image Compression
Lifting scheme: unifying prediction and transform
First-generation schemes
  FBI WSQ standard
Second-generation schemes
  Embedded Zerotree Wavelet (EZW)
  A unified where-and-what perspective
  A classification-based interpretation
Scalable and ROI coding in JPEG2000
Lifting Scheme
(Wim Sweldens’1995)
[Figure: lifting scheme diagram; one label reads "scale parameter"]
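Step 1: Split
[Figure: the sequence sj(n), indexed n = …, -3, -2, -1, 0, 1, 2, 3, 4, …, is split into its odd-indexed samples oddj(n) and its even-indexed samples evenj(n)]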
Step 2: Prediction
$d_{j-1} = \mathrm{odd}_{j-1} - P(\mathrm{even}_{j-1})$
High-band (difference of sj)
[Figure: odd and even samples at positions n-1, n, n+1]
Step 3: Updating
$s_{j-1} = \mathrm{even}_{j-1} + U(d_{j-1})$
Low-band (approximation of sj)
[Figure: dj-1 and even samples at positions n-1, n, n+1]
Algorithmic Advantages
In-place operation: good for memory savings
Computational efficiency: fewer floating-point operations than subband filtering implementations
Parallelism: inherent SIMD parallelism at all scales
odd-length filter
Structural Advantages
Inverse transform: running the split-prediction-updating steps backward (i.e., updating, prediction and merge) directly gives the implementation of the inverse transform
Generality: easily generalized to unconventional geometric settings such as curves, surfaces and volumes
Inverse Transform
Reconstruct sj from (sj-1,dj-1)
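Forward vs. Inverse
Forward transform: obtain (sj-1,dj-1) from sj
$(\mathrm{even}_{j-1}, \mathrm{odd}_{j-1}) = \mathrm{split}(s_j)$
$d_{j-1} = \mathrm{odd}_{j-1} - P(\mathrm{even}_{j-1})$
$s_{j-1} = \mathrm{even}_{j-1} + U(d_{j-1})$
Inverse transform: obtain sj from (sj-1,dj-1)
$\mathrm{even}_{j-1} = s_{j-1} - U(d_{j-1})$
$\mathrm{odd}_{j-1} = d_{j-1} + P(\mathrm{even}_{j-1})$
$s_j = \mathrm{merge}(\mathrm{even}_{j-1}, \mathrm{odd}_{j-1})$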
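The forward and inverse structures can be sketched in a few lines of Python. This is only an illustration, not the course's implementation: the function names are hypothetical, the input is assumed to have even length, and the operators P(even) = even and U(d) = d/2 are just one simple choice used to exercise the structure.

```python
def split(s):
    """Split s_j into its even- and odd-indexed samples."""
    return s[0::2], s[1::2]

def merge(even, odd):
    """Interleave even and odd samples back into one sequence."""
    out = [0] * (len(even) + len(odd))
    out[0::2] = even
    out[1::2] = odd
    return out

def forward(s, P, U):
    """One lifting level: split, predict, update."""
    even, odd = split(s)
    d = [o - p for o, p in zip(odd, P(even))]      # high band
    low = [e + u for e, u in zip(even, U(d))]      # low band
    return low, d

def inverse(low, d, P, U):
    """Undo the steps in reverse order: update, predict, merge."""
    even = [l - u for l, u in zip(low, U(d))]
    odd = [x + p for x, p in zip(d, P(even))]
    return merge(even, odd)

# Assumed operators for illustration only.
P = lambda even: list(even)
U = lambda d: [x / 2 for x in d]

signal = [4, 6, 10, 2, 7, 7, 1, 5]
low, high = forward(signal, P, U)
restored = inverse(low, high, P, U)
```

Because the inverse subtracts exactly what the forward pass added, reconstruction does not depend on any particular property of P or U.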
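Example (I)
S-transform (a variant of Haar transform)
$(\mathrm{even}_{j-1}, \mathrm{odd}_{j-1}) = \mathrm{split}(s_j)$
$d_{j-1}(n) = \mathrm{odd}_{j-1}(n) - \mathrm{even}_{j-1}(n)$
$s_{j-1}(n) = \mathrm{even}_{j-1}(n) + \frac{1}{2} d_{j-1}(n) = \frac{1}{2}[\mathrm{odd}_{j-1}(n) + \mathrm{even}_{j-1}(n)]$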
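Example (II)
5/3 transform (also called (2,2) interpolating transform)
$(\mathrm{even}_{j-1}, \mathrm{odd}_{j-1}) = \mathrm{split}(s_j)$
$d_{j-1}(n) = \mathrm{odd}_{j-1}(n) - \frac{\mathrm{even}_{j-1}(n) + \mathrm{even}_{j-1}(n+1)}{2}$
$s_{j-1}(n) = \mathrm{even}_{j-1}(n) + \frac{d_{j-1}(n-1) + d_{j-1}(n)}{4}$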
Generalization (I)
Forward Transform
Inverse Transform
Factoring Wavelet Transform into Lifting Steps
Example: Daubechies’ 9-7 filter
Lifting sequence: splitting → P → U → P → U → scaling
Generalization (II)
Conventional subband-filtering-based WT is not suitable for lossless coding (real-valued coefficients cannot be preserved exactly with finite precision)
The lifting scheme elegantly solves this problem: the lifting structure guarantees invertibility whatever P and U compute, so their real-valued outputs can simply be rounded off
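Example
Integer-to-integer (reversible) 5/3 transform (adopted by JPEG2000 for lossless image compression)
$(\mathrm{even}_{j-1}, \mathrm{odd}_{j-1}) = \mathrm{split}(s_j)$
$d_{j-1}(n) = \mathrm{odd}_{j-1}(n) - \left\lfloor \frac{\mathrm{even}_{j-1}(n) + \mathrm{even}_{j-1}(n+1)}{2} + \frac{1}{2} \right\rfloor$
$s_{j-1}(n) = \mathrm{even}_{j-1}(n) + \left\lfloor \frac{d_{j-1}(n-1) + d_{j-1}(n)}{4} + \frac{1}{2} \right\rfloor$
Note: outputs (sj-1,dj-1) are both integers, just like the input sj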
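A minimal Python sketch of one level of a rounded 5/3 lifting step, to show that integer inputs give integer outputs and are recovered exactly. Several things here are assumptions rather than part of the slides: the function names, the even-length input, the boundary rule (the nearest available sample is repeated at both ends), and the rounding convention floor(x + 1/2), written with integer floor division.

```python
def forward_53(s):
    """One level of rounded 5/3 lifting on an even-length list of ints."""
    even, odd = s[0::2], s[1::2]
    n = len(odd)
    d = []
    for i in range(n):
        right = even[i + 1] if i + 1 < n else even[i]   # assumed boundary rule
        # floor((a + b)/2 + 1/2) equals (a + b + 1) // 2 for integers
        d.append(odd[i] - (even[i] + right + 1) // 2)
    low = []
    for i in range(n):
        left = d[i - 1] if i > 0 else d[i]              # assumed boundary rule
        # floor((a + b)/4 + 1/2) equals (a + b + 2) // 4 for integers
        low.append(even[i] + (left + d[i] + 2) // 4)
    return low, d

def inverse_53(low, d):
    """Undo the update, then the prediction, then merge."""
    n = len(d)
    even = []
    for i in range(n):
        left = d[i - 1] if i > 0 else d[i]
        even.append(low[i] - (left + d[i] + 2) // 4)
    s = []
    for i in range(n):
        right = even[i + 1] if i + 1 < n else even[i]
        s += [even[i], d[i] + (even[i] + right + 1) // 2]
    return s

signal = [10, 12, 14, 200, 3, 0, 7, 7]
low, high = forward_53(signal)
restored = inverse_53(low, high)
```

The rounding is applied to the value that is added or subtracted, not to the result, which is why the inverse can recompute exactly the same rounded quantity.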
Roadmap to Lossy Image Compression
Lifting scheme: unifying prediction and transform
First-generation schemes
  FBI WSQ standard
Second-generation schemes
  Probabilistic modeling of wavelet coefficients
  Embedded Zerotree Wavelet (EZW)
  SPIHT coder
  A unified where-and-what perspective
JPEG2000
Early Attempts
Each band is modeled by a Gaussian random variable with zero mean and unknown variance (e.g., WSQ)
Only modest gain over JPEG (DCT-based) is achieved
Question: is this an accurate model, and how can we test it?
FBI Wavelet Scalar Quantization (WSQ)
Each band is approximately modeled by a Gaussian r.v.
$x_k \sim N(0, \sigma_k^2)$, k: band index
Given R, minimize
$D = \sum_k \frac{1}{m_k} D_k$, where $m_k = \frac{\text{image size}}{\text{subband size}}$
Rate Allocation Problem*
Given a quota of bits R, how should we allocate them to each band to minimize the overall MSE distortion?
[Figure: subband decomposition into LL, HL, LH and HH bands]
Solution: Lagrangian multiplier technique (we will study it in detail on the blackboard)
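As a rough sketch of where the Lagrangian approach leads, assume each band follows the high-rate Gaussian model D_k(R_k) = σ_k² 2^(-2 R_k), that the weights w_k = 1/m_k sum to one, and that every resulting rate is non-negative. Setting the derivative of the Lagrangian to zero then makes all band distortions equal, which gives R_k = R + ½ log2(σ_k² / G), where G is the weighted geometric mean of the variances. The function name and the variances below are hypothetical.

```python
import math

def allocate_rates(variances, weights, R):
    """Per-band rates minimizing sum_k w_k * var_k * 2**(-2*R_k)
    subject to sum_k w_k * R_k = R (negative rates are not handled)."""
    # log2 of the weighted geometric mean of the band variances
    log_gm = sum(w * math.log2(v) for v, w in zip(variances, weights))
    return [R + 0.5 * (math.log2(v) - log_gm) for v in variances]

# Hypothetical two-level decomposition: four bands of 1/16 the image size
# and three bands of 1/4 the image size.
weights = [1 / 16] * 4 + [1 / 4] * 3
variances = [1024, 64, 64, 16, 16, 16, 4]
rates = allocate_rates(variances, weights, R=2.0)

average_rate = sum(w * r for w, r in zip(weights, rates))
band_distortions = [v * 2 ** (-2 * r) for v, r in zip(variances, rates)]
```

Bands with larger variance receive more bits, and each band ends up with the same distortion.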
Proof by Contradiction (I)
Assumption: our modeling target Ω is the collection of natural images
Suppose each coefficient X in a high band does follow a Gaussian distribution, i.e., X~N(0,σ2). Then flipping the sign of X (i.e., replacing X with –X) should not matter: it should generate another element of Ω (i.e., a different but meaningful image)
Let’s test it!
Proof by Contradiction (II)
[Figure: image → DWT → sign flip → IWT → resulting image]
What is wrong with that?
Think of two coefficients, one in a smooth region and the other around an edge: do they follow the same probability distribution?
Think of all coefficients around the same edge: do they follow the same probability distribution?
Ignorance of topology and geometry
The Importance of Modeling Singularity Location Uncertainty
Singularities carry critical visual information: edges, lines, corners …
The location of singularities is important
  Recall locality of wavelets in spatial-frequency domain
  Singularities in spatial domain → significant coefficients in wavelet domain
Where-and-What Coding
Communication context
[Figure: Alice sends a picture to Bob over a communication channel]
Where
  The location of significant coefficients
What
  The sign and magnitude of significant coefficients
1993-2003
Embedded Zerotree Wavelet (EZW)’1993
Set Partitioning in Hierarchical Trees (SPIHT)’1995
Space-Frequency Quantization (SFQ)’1996
Estimation Quantization (EQ)’1997
Embedded Block Coding with Optimized Truncation (EBCOT)’2000
Least-Square Estimation Quantization (LSEQ)’2003
Embedded Zerotree Wavelet (EZW) Coding
[Flowchart of the core engine]
1. Set T=T0
2. Dominant pass: code the position information (where are the significant coefficients?). Significance testing: |X|>T
3. Subordinate pass: code the intensity information (what are the significant coefficients?)
4. Reached the specified bit rate? Yes: stop. No: set T=T/2 and go back to step 2
Zerotree Data Structure
Ancestor-and-Descendant
[Figure panels: Parent-and-Children, Ancestor-and-Descendant]
Zerotree Terminology
Zerotree root (ZRT): it and all of its descendants are insignificant
Isolated zero (IZ): it is insignificant but at least one of its descendants is significant
Positive significant (POS): it is significant and has a positive sign
Negative significant (NEG): it is significant and has a negative sign
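The four symbols can be illustrated with a small Python sketch. The tree representation (a dict with "value" and "children" keys), the helper names and the coefficient values are all hypothetical; significance is tested as |X| > T, following the slides.

```python
def node(value, children=()):
    """Hypothetical tree node: a coefficient and its child nodes."""
    return {"value": value, "children": list(children)}

def is_significant(x, T):
    return abs(x) > T

def has_significant_descendant(n, T):
    """True if any coefficient below n in the tree is significant."""
    return any(
        is_significant(c["value"], T) or has_significant_descendant(c, T)
        for c in n["children"]
    )

def classify(n, T):
    """Return the dominant-pass symbol for node n at threshold T."""
    x = n["value"]
    if is_significant(x, T):
        return "POS" if x > 0 else "NEG"
    return "IZ" if has_significant_descendant(n, T) else "ZRT"

# Illustrative values only.
quiet = node(2, [node(1), node(-1)])
loud = node(-40, [node(3), node(5)])
root = node(10, [loud, quiet])

symbols = [classify(n, 32) for n in (root, loud, quiet)]
```

A ZRT symbol lets the decoder infer that a whole subtree is insignificant without any further symbols for its descendants, which is where the coding saving comes from.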
Dominant Pass: Significance Testing (Where-coding)
Subordinate Pass: Magnitude Refinement (What-coding)
For significant coefficients (POS/NEG), refine the magnitude by sending one bit indicating whether it is larger than 1.5T, i.e., resolving whether it lies within [T,1.5T) or within [1.5T,2T)
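A small sketch of this one-bit refinement in Python. The function names are hypothetical, the magnitude is assumed to lie in [T, 2T), and decoding to the midpoint of the selected interval is an assumption about the decoder rather than something stated above.

```python
def refinement_bit(magnitude, T):
    """1 if the magnitude lies in [1.5T, 2T), 0 if it lies in [T, 1.5T)."""
    return 1 if magnitude >= 1.5 * T else 0

def reconstruct(bit, T):
    """Decode to the midpoint of the interval selected by the bit."""
    return 1.75 * T if bit else 1.25 * T

T = 32
bit_a = refinement_bit(63, T)    # 63 lies in [48, 64)
bit_b = refinement_bit(34, T)    # 34 lies in [32, 48)
value_a = reconstruct(bit_a, T)
value_b = reconstruct(bit_b, T)
```

With T=32 the two possible reconstruction values are 40 and 56, so each refinement bit halves the uncertainty interval.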
Toy Example
Dominant Pass
Note: T=32
LH1 contains POS
Subordinate Pass
[Figure: with T=32, the interval from 32 to 64 is divided at 48; the two halves are labeled 40 and 56]
Where-and-What Interpretation
The zerotree data structure effectively resolves the location uncertainty (where) of insignificant coefficients
The dominant and subordinate passes defined in EZW can be viewed as “where” and “what” coding, respectively
The dyadic choice of T values (i.e., T=128, 64, 32, 16, …) yields an embedded bitstream
A Simpler Two-Stage Coding
Position coding stage (where)
  Generate a binary map indicating the location of significant coefficients (|X|>T)
  Use context-based adaptive binary arithmetic coding (e.g., JBIG) to code the binary map
Intensity coding stage (what)
  Code the sign and magnitude of significant coefficients
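The two stages can be sketched as a split of the coefficients into a binary map plus sign and magnitude lists. This Python sketch only separates and recombines the data; the entropy coding of each part (e.g., the arithmetic coder) is left out, and the function names and coefficient values are hypothetical.

```python
def two_stage_split(coeffs, T):
    """Separate 'where' (binary map) from 'what' (signs and magnitudes)."""
    sig_map = [1 if abs(x) > T else 0 for x in coeffs]
    signs = [0 if x > 0 else 1 for x, m in zip(coeffs, sig_map) if m]
    mags = [abs(x) for x, m in zip(coeffs, sig_map) if m]
    return sig_map, signs, mags

def two_stage_merge(sig_map, signs, mags):
    """Rebuild coefficients; insignificant positions are decoded as 0."""
    what = iter(zip(signs, mags))
    out = []
    for m in sig_map:
        if m:
            sign, mag = next(what)
            out.append(-mag if sign else mag)
        else:
            out.append(0)
    return out

coeffs = [63, -34, 10, -5, 49, 7, -31, 23]   # illustrative values
sig_map, signs, mags = two_stage_split(coeffs, T=32)
decoded = two_stage_merge(sig_map, signs, mags)
```

The map costs one binary decision per coefficient, while sign and magnitude are sent only for the few positions the map marks as significant.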
A Different Interpretation
Two-class modeling of high-band coefficients
  Significant class: |X|>T
  Insignificant class: |X|≤T
Why does classification help?
  Nonstationarity of image source
  A probabilistic modeling perspective
Classification-based Modeling
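Insignificant class: $X_0 \sim N(0, \sigma_0^2)$
Significant class: $X_1 \sim N(0, \sigma_1^2)$
Mixture: X comes from the significant class with probability a and from the insignificant class with probability 1-a; without classification it is modeled as $X \sim N(0, \sigma^2)$ with $\sigma^2 = a\sigma_1^2 + (1-a)\sigma_0^2$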
Classification Gain
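Without classification
$D(R) = \sigma^2 2^{-2R}$
With classification
$D'(R) = \sigma_1^{2a} \sigma_0^{2(1-a)} 2^{-2R}$
Classification gain
$G = 10\log_{10}\frac{D(R)}{D'(R)}\,\mathrm{dB} = 10\log_{10}\frac{a\sigma_1^2 + (1-a)\sigma_0^2}{\sigma_1^{2a}\sigma_0^{2(1-a)}}\,\mathrm{dB} \ge 0$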
Example
$\sigma_0^2 = 1,\ \sigma_1^2 = 100$
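To get a feel for the numbers, the gain can be evaluated for these variances at a few values of the mixing weight a. This is only a sketch: the function name is hypothetical, and the formula is the ratio of the arithmetic to the geometric weighted mean of the two class variances, expressed in dB.

```python
import math

def classification_gain_db(a, var0, var1):
    """10*log10 of (weighted arithmetic mean / weighted geometric mean)."""
    arithmetic = a * var1 + (1 - a) * var0
    geometric = var1 ** a * var0 ** (1 - a)
    return 10 * math.log10(arithmetic / geometric)

var0, var1 = 1.0, 100.0
gains = {a: classification_gain_db(a, var0, var1)
         for a in (0.0, 0.1, 0.25, 0.5, 0.75, 1.0)}
```

The gain vanishes at a=0 and a=1, where only one class is present, and is positive in between because the arithmetic mean is never smaller than the geometric mean.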
Advanced Wavelet Coding
SPIHT: a simpler yet more efficient implementation of EZW coder
SFQ: Rate-Distortion optimized zerotree coder
EQ: Rate-Distortion optimization via backward adaptive classification
EBCOT (adopted by JPEG2000): a versatile embedded coder
Another New Perspective
“What” and “Where” in human brain
  Ventral stream for object vision (what)
  Dorsal stream for spatial vision (where)
If the human visual system (HVS) understands the world in this where-and-what fashion, and if we believe in the superiority of human intelligence, shouldn’t we represent images in a similar manner?
  Understanding HVS is as important as understanding image data
From JPEG to JPEG2000
What is wrong with JPEG?
• Poor low bit-rate performance
• Separate lossy and lossless compression
• Awkward progressive transmission
• Does not support Region-Of-Interest (ROI) coding
• Does not support random access and processing
• Poor error resilience and security
EBCOT System Overview
Embedded Block Coding with Optimized Truncation (EBCOT)
[Figure: block diagram. Encoder: source image data → WT → Q → C → channel. Decoder: channel → C-1 → Q-1 → IWT → reconstructed image data]
What is new with EBCOT?
Block tiling
  How is it different from block DCT?
  What do we buy from it?
    To support rate and resolution scalability
    To support ROI and random access
    To enhance error resilience capability
R-D optimized truncation
  Implement R-D optimized embedded coding
Scalable vs. Multicast
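What is scalable coding?
Multicast: Lena.pgm is encoded into a separate file for each bit rate (Lena_0.125bpp.cod, Lena_0.25bpp.cod, Lena_0.5bpp.cod, Lena_1.00bpp.cod)
Scalable coding: Lena.pgm is encoded into a single file lena.cod that can be decoded at 0.25bpp, 0.5bpp or 1bpp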
Spatial scalability
1 0 1 1 1 …0 1 0 1 0 0 0 …1 1 0 1 0 0
SNR (Rate) scalability
[Figure: embedded bitstream 1 0 1 1 1 … 0 1 0 1 0 0 0 … 1 1 0 1 0 0; decoding successively longer prefixes gives PSNR=30dB, PSNR=35dB and PSNR=40dB]
Bit-Plane Coding
Successive refinement of coefficient magnitude
[Figure: 8-bit binary representations of coefficients indexed 0, 1, 2, …, 10, …, arranged from MSB to LSB; the 1st, 2nd, 3rd, … passes each scan one bit plane, starting from the MSB; labels: dominant, subordinate]
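A short Python sketch of bit-plane decomposition and successive refinement. The function names and the four 8-bit magnitudes are hypothetical, signs are ignored, and bits that have not been received yet are simply decoded as 0.

```python
def bit_planes(magnitudes, num_bits):
    """Return bit planes from MSB (index 0) down to LSB."""
    return [[(m >> b) & 1 for m in magnitudes]
            for b in range(num_bits - 1, -1, -1)]

def decode_planes(planes, num_bits):
    """Rebuild magnitudes from the planes received so far (MSB first)."""
    mags = [0] * len(planes[0])
    for k, plane in enumerate(planes):
        shift = num_bits - 1 - k
        mags = [m | (bit << shift) for m, bit in zip(mags, plane)]
    return mags

magnitudes = [6, 70, 196, 99]            # illustrative 8-bit magnitudes
planes = bit_planes(magnitudes, 8)

after_two = decode_planes(planes[:2], 8)   # only the two top planes
after_all = decode_planes(planes, 8)       # every plane received
```

Each additional plane halves the uncertainty about every magnitude, which is the refinement behaviour an embedded coder exploits when the bitstream is truncated.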
Rate-Distortion Optimization in Scalable Image Coding
An old problem
Given a bit budget, how should the bits be allocated so that the total distortion is minimized?
A new challenge (due to embedded coding constraint)
We need to make sure R-D is optimized not only for a and d but also for b and c
[Figure: distortion D versus rate R between R1 and R2, with points a, b, c, d and alternatives b’, c’]
Fractional Bit-plane Coding
[Figure: bit-plane array of coefficients indexed 0, 1, 2, …, 10, …, arranged from MSB to LSB, with the coding of each bit plane divided into sub-pass 1, sub-pass 2 and sub-pass 3]
Example
Comparison between JPEG and JPEG2000 (I)
JPEG (0.25bpp) JPEG2000 (0.25bpp)
Comparison between JPEG and JPEG2000 (II)
JPEG (0.5bpp) JPEG2000 (0.5bpp)
JPEG2000 vs. WSQ
Decoded fingerprint image by WSQ at compression ratio of 27
JPEG2000 vs. WSQ
Decoded fingerprint image by JPEG2000 at compression ratio of 27
Region-Of-Interest (ROI) Coding
ROI
Block Tiling
DC level shifting → Tiling → DWT on each tile
Tile, Subband, Precinct and Block
[Figure labels: precinct, code-block]
A tile is partitioned into subbands, precincts and code-blocks
Bit-plane Lifting Strategy
Scale up the coefficients in the region of interest
[Figure: bit planes from MSB to LSB of the background (BG) and ROI coefficients, before and after the ROI bit planes are lifted]
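One way to realize this scaling is sketched below in Python, in the spirit of the Maxshift method of JPEG2000: ROI coefficients are multiplied by 2^s, with s chosen so that every scaled nonzero ROI coefficient exceeds every background coefficient in magnitude. The function names, the integer coefficients and the mask are hypothetical, and whether the slide depicts exactly this variant is an assumption.

```python
def roi_scale(coeffs, roi_mask):
    """Shift ROI coefficients up by s bit planes; return scaled list and s."""
    bg_max = max((abs(c) for c, r in zip(coeffs, roi_mask) if not r), default=0)
    s = bg_max.bit_length()              # smallest s with 2**s > bg_max
    scaled = [c * (1 << s) if r else c for c, r in zip(coeffs, roi_mask)]
    return scaled, s

def roi_unscale(scaled, s):
    """Detect ROI coefficients by magnitude alone and shift them back."""
    out = []
    for c in scaled:
        if abs(c) >= (1 << s):           # only scaled ROI values get this large
            out.append((abs(c) >> s) * (1 if c > 0 else -1))
        else:
            out.append(c)
    return out

coeffs = [5, -13, 3, -2, 7, 0]           # illustrative integer coefficients
roi_mask = [False, False, True, True, False, True]
scaled, s = roi_scale(coeffs, roi_mask)
restored = roi_unscale(scaled, s)
```

Because the ROI bits now occupy the highest bit planes, a bit-plane coder sends them first, and the decoder needs no explicit ROI shape to undo the shift.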
Image Example
ROI
Open Problems Related to Image Coding
Coding of specific classes of images (e.g., satellite, microarray, fingerprint)
Coding of color-filter-array (CFA) images
Error resilient coding of images
Perceptual image coding
Image coding for pattern recognition
Coding of Specific Class of Images
How to design specific coding algorithms for each class?
CFA Image Coding
Approach I: Bayer pattern → CFA interpolation (demosaicing) → color image compression
Approach II: Bayer pattern → CFA data compression → CFA interpolation (demosaicing)
Which one is better and why?
Error Resilient Image Coding
[Figure: source → source encoder → channel encoder → channel → channel decoder → source decoder → destination, with part of the chain labeled super-channel]
How can we optimize the end-to-end performance in the presence of channel errors?
Perceptual Image Coding
Characterizing image distortion is difficult!
How do we objectively define image quality, which is subject to individual opinions?
Image Coding for PR
[Figure: image sensor → communication channel → pattern recognition]
How does coding distortion affect the recognition performance?
We need to develop a new image representation which can simultaneously support low-level (e.g., compression, denoising) and high-level (e.g., recognition and retrieval) vision tasks