Matching Pursuits
Vidhya N.S. Murthy
Roadmap
● Low bitrate video coding
● Some history of Matching Pursuits
● What is Matching Pursuits?
● Applying this technique to a video encoder
● Results
Motivation for Low Bitrate Video
● Demand for video telephony, video conferencing, etc. over PSTN networks.
● Limited bandwidth in wireless networks.
● The codec must function at bitrates in the range of 10-24 kbps.
● Error resilience over noise-prone channels: the source encoder has to perform well to reduce the error protection overhead.
Evolution of International Standards
● All these standards are based on block-matching techniques and the DCT framework.
The effect of transform and quantization
[Figure: Motion Residue → Transform → Quant → IQuant → ITransform → Reconstructed Data, illustrated on example blocks of sample values]
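The chain in this slide (Transform, Quant, IQuant, ITransform) can be sketched in a few lines. This is a minimal illustration, not the encoder used in the talk: it assumes an orthonormal DCT-II, a single uniform quantizer step, and a random 4x4 residue block, none of which are specified on the slide. The names dct_matrix and transform_quantize_roundtrip are hypothetical.

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II matrix C, so that C @ x is the 1D DCT of x."""
    k = np.arange(n)[:, None]   # frequency index (rows)
    i = np.arange(n)[None, :]   # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)  # DC row has its own scale factor
    return c

def transform_quantize_roundtrip(block: np.ndarray, step: float):
    """Run a square block through Transform, Quant, IQuant, ITransform."""
    c = dct_matrix(block.shape[0])
    coeffs = c @ block @ c.T          # Transform (separable 2D DCT)
    levels = np.round(coeffs / step)  # Quant (assumed uniform step)
    dequant = levels * step           # IQuant
    recon = c.T @ dequant @ c         # ITransform
    return levels, recon

# Assumed stand-in for a motion residue block; values are arbitrary.
rng = np.random.default_rng(0)
residue = rng.integers(-8, 9, size=(4, 4)).astype(float)
levels, recon = transform_quantize_roundtrip(residue, step=4.0)
print("nonzero quantized levels:", np.count_nonzero(levels))
print("max abs reconstruction error:", np.max(np.abs(residue - recon)))
```

Because the transform is orthonormal, the reconstruction error energy equals the quantization error energy in the coefficient domain, which is why coarse quantization at low bitrates damages the reconstructed residue directly.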
Typical encoder and where we plan to modify it
[Block diagram. Blocks: Reference frames, Motion Estimation, Frame Predictor, DCT, Quantization, VLC, Inverse Quantization, IDCT]
Some History about Matching Pursuits
● Introduced by Mallat and Zhang in 1993, building on the projection pursuit work of Friedman and Tukey (1974).
● Used for compressing video in 1994 by Neff and Zakhor.
● Neff and Zakhor carried out comprehensive work at Berkeley, which was part of the proposals to the MPEG-4 standards committee.
● Currently work is being done to find
What is Matching Pursuits?
● Matching Pursuits is a greedy algorithm that matches signal structures to a large, diverse dictionary of functions.
● It expands a signal using an overcomplete dictionary of functions.
● A larger number of basis functions gives more options for approximating structures in pictures well.
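The greedy step described in these bullets can be sketched as follows for a 1D signal. This is a minimal illustration under stated assumptions: the dictionary rows are unit-norm, the function name matching_pursuit is hypothetical, and the small six-function dictionary is made up for the example.

```python
import numpy as np

def matching_pursuit(signal, dictionary, n_atoms):
    """Greedy matching pursuit.

    dictionary: array of shape (n_functions, signal_length), unit-norm rows.
    Returns a list of (index, coefficient) pairs and the final residual.
    """
    residual = np.asarray(signal, dtype=float).copy()
    atoms = []
    for _ in range(n_atoms):
        products = dictionary @ residual         # inner product with every function
        best = int(np.argmax(np.abs(products)))  # greedy choice: largest magnitude
        coeff = products[best]
        residual -= coeff * dictionary[best]     # remove the matched structure
        atoms.append((best, coeff))
    return atoms, residual

# Assumed overcomplete dictionary: 4 unit spikes plus 2 extra unit-norm patterns.
extra = np.array([[1.0, 1.0, 1.0, 1.0], [1.0, -1.0, 1.0, -1.0]]) / 2.0
dictionary = np.vstack([np.eye(4), extra])

signal = np.array([3.0, 3.0, 3.0, 3.0])
atoms, residual = matching_pursuit(signal, dictionary, n_atoms=1)
print(atoms, np.linalg.norm(residual))
```

Here the flat signal is captured by a single atom from the extra patterns, whereas the four spikes alone would need four coefficients.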
Geometric Analogy
Consider a three-dimensional vector in the space R3.
If the vector is (3,2,3), we have resolved it along the x, y and z axes as 3, 2 and 3 respectively.
The unit vectors along x, y and z form a complete basis for R3 and span all possible vectors in three-dimensional space.
If we now add the vector (3,2,3) to the basis vector set of R3, we have a redundant (overcomplete) set. Vectors such as scaled versions of (3,2,3), and its linear combinations with the other vectors, can then have sparser representations in terms of these 4 vectors.
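A small numeric check of the analogy, as a sketch: the test vector (6,4,6) is an assumed example, chosen as twice (3,2,3), and the added vector is normalized so that inner products are comparable.

```python
import numpy as np

standard = np.eye(3)                             # unit vectors along x, y, z
extra = np.array([3.0, 2.0, 3.0])
extra_unit = extra / np.linalg.norm(extra)       # normalized fourth vector
redundant = np.vstack([standard, extra_unit])    # 4 vectors in R3

v = 2.0 * extra                                  # assumed test vector (6, 4, 6)

# Standard basis: all three coefficients are needed.
coeffs_standard = standard @ v

# Redundant set: pick the best-matching vector and subtract its contribution.
products = redundant @ v
best = int(np.argmax(np.abs(products)))
residual = v - products[best] * redundant[best]

print("nonzero coefficients, standard basis:", np.count_nonzero(coeffs_standard))
print("best match index in redundant set:", best)
print("residual norm after one vector:", np.linalg.norm(residual))
```

With the redundant set, one coefficient reproduces the vector, compared with three in the standard basis.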
Fourier Bases
Sum of the first 4 odd harmonics:
● Fundamental
● 3rd Harmonic
● 5th Harmonic
● 7th Harmonic
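As a sketch of this sum, the snippet below adds the fundamental and the 3rd, 5th and 7th harmonics. It assumes the classic square-wave series with amplitudes 1/k, which the slide does not state explicitly; odd_harmonic_sum is a hypothetical name.

```python
import numpy as np

def odd_harmonic_sum(t, harmonics=(1, 3, 5, 7)):
    """Partial Fourier series of a unit square wave: (4/pi) * sum sin(k t) / k."""
    t = np.asarray(t, dtype=float)
    return (4.0 / np.pi) * sum(np.sin(k * t) / k for k in harmonics)

t = np.linspace(0.0, 2.0 * np.pi, 1000, endpoint=False)
square = np.sign(np.sin(t))          # target waveform (assumed)
approx = odd_harmonic_sum(t)

print("mean squared error with 4 harmonics:", np.mean((approx - square) ** 2))
```

Each added harmonic lowers the approximation error, but many terms are needed for sharp edges. That is the limitation an overcomplete dictionary tries to avoid.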
Diagrammatically
Signal h(t) and dictionary g_k(t) → Decompose →
$$\hat{h}(t) = \sum_{n=1}^{M} p_n \, g_n(t)$$
● No restriction on the choice of dictionary
● Signal can be multidimensional
● Notice the similarity to Fourier expansion
The Gabor dictionary
● Modulated Gaussian window
● 2D case
2D Gabor basis visualization
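A modulated Gaussian window and its 2D extension can be sketched as below. The parameterization (scale, frequency, phase), the atom size of 16, and the separable outer-product construction are assumptions for illustration. The slides do not give the exact dictionary parameters.

```python
import numpy as np

def gabor_1d(size, scale, freq, phase):
    """Unit-norm sampled Gabor function: Gaussian window times a cosine."""
    n = np.arange(size) - (size - 1) / 2.0       # centered sample positions
    window = np.exp(-np.pi * (n / scale) ** 2)   # Gaussian window
    g = window * np.cos(2.0 * np.pi * freq * n / size + phase)
    return g / np.linalg.norm(g)

def gabor_2d(params_x, params_y, size):
    """Separable 2D atom: outer product of a vertical and a horizontal 1D Gabor."""
    return np.outer(gabor_1d(size, *params_y), gabor_1d(size, *params_x))

# Illustrative (scale, freq, phase) triples; not the values used in the talk.
atom = gabor_2d((4.0, 1.0, 0.0), (8.0, 0.0, 0.0), size=16)
print(atom.shape, np.linalg.norm(atom))
```

Pairing every horizontal 1D function with every vertical one is a common way to build a 2D dictionary, and it keeps inner products cheap because they factor into two 1D passes.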
Algorithm Stages
● Dictionary design
● Atom Decomposition, or Atom Search, or simply Find Atoms
2D Dictionaries
64 basis images of the 8x8 DCT
400 basis images of Gabor Dictionary
All basis images have a fixed size of 8x8
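The 64 DCT basis images can be generated as outer products of the 8 one-dimensional DCT basis vectors. The sketch below assumes the orthonormal DCT-II; dct_basis_images is a hypothetical helper name.

```python
import numpy as np

def dct_basis_images(n=8):
    """Return an (n*n, n, n) array holding the 2D DCT-II basis images."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    # Basis image (u, v) is the outer product of 1D basis rows u and v.
    return np.einsum("ui,vj->uvij", c, c).reshape(n * n, n, n)

basis = dct_basis_images(8)
flat = basis.reshape(64, 64)          # one basis image per row
print(basis.shape)
print("orthonormal:", np.allclose(flat @ flat.T, np.eye(64)))
```

The 64 images form a complete, non-redundant basis for 8x8 blocks: exactly as many functions as there are pixels. A 400-function dictionary, by contrast, is overcomplete.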
Finding Atoms
Atom Structure
Find Energy Stage
Flowchart explaining the position coding system
General Block Diagram of DCT-Based Encoder
[Block diagram. Blocks: I/P video, Motion Estimation, Frame Predictor, Reference frames, DCT, Quantization, VLC, Inverse Quantization, IDCT, Bitstream]
The new Encoder block diagram
More visible features tend to be coded first
[Figures: motion residue for Foreman and for Hall, each shown with the first 5, first 32 and first 64 atoms]
Reconstructed Images
[Figures: reconstructions from the first 5, first 32 and first 64 atoms]
MPEG2 at Low Bitrates and Matching Pursuits
Foreman: reconstructed image for 64 coded atoms vs. reconstructed image with MPEG2 at 20 kbps
Hall Monitor: reconstructed image for 64 coded atoms vs. reconstructed image with MPEG2 at 20 kbps
Software
Software can be downloaded from
http://cnx.org/content/expanded_browse_authors?letter=M&author=vmurthy.
Conclusions
● This coding paradigm is very effective at low bitrates.
● It is computationally very complex, so future enhancements will focus on reducing the number of searches and on finding better dictionaries, which in turn will also help reduce the number of searches.
References
[1] S. Mallat and Z. Zhang, "Matching pursuits with time-frequency dictionaries," IEEE Transactions on Signal Processing, vol. 41, no. 12, pp. 3397-3415, Dec. 1993.
[2] J. H. Friedman and W. Stuetzle, "Projection pursuit regression," J. Amer. Stat. Assoc., vol. 76, no. 376, pp. 817-823, Dec. 1981.
[3] F. Bergeaud and S. Mallat, "Matching pursuit of images," IEEE International Conference on Image Processing (ICIP 1995), pp. 53-56, Sept. 1995.
[4] M. Vetterli and T. Kalker, "Matching pursuit for compression and application to motion compensated video coding," IEEE International Conference on Image Processing (ICIP 1994), pp. 724-729, Nov. 1994.
[5] R. Neff and A. Zakhor, "Very low bit-rate video coding based on matching pursuits," IEEE Transactions on Circuits and Systems for Video Technology, vol. 7, no. 1, pp. 158-171, Feb. 1997.
[6] J. Pearl, H. C. Andrews, and W. K. Pratt, "Performance measures for transform data coding," IEEE Trans. Commun., vol. COM-20, pp. 411-415, June 1972.
[7] P. Yip and K. R. Rao, "Energy packing efficiency for the generalized discrete transforms," IEEE Trans. Commun., vol. COM-26, pp. 1257-1261, Aug. 1978.
[8] K. Imammura et al., "A fast matching pursuits algorithm based on sub-band decomposition of video signals," IEEE ICME 2006, pp. 729-732, July 2006.
[9] K. Cheung and Y. Chan, "An efficient algorithm for realizing matching pursuits and its applications in MPEG4 coding system," IEEE International Conference on Image Processing (ICIP 2000), vol. 2, pp. 863-866, Sept. 2000.
[10] A. Shoa and S. Shirani, "Tree structure search for matching pursuit," IEEE International Conference on Image Processing (ICIP 2005), vol. 3, pp. 908-911, Sept. 2005.
[11] R. Neff et al., "Decoder complexity and performance comparison of matching pursuit and DCT based MPEG-4 video codecs," International Conference on Image Processing (ICIP 98), vol. 1, pp. 783-787, Oct. 1998.
[12] R. Neff, A. Zakhor, and M. Vetterli, "Very low bit rate video coding using matching pursuits," in Proc. SPIE VCIP, vol. 2308, no. 1, pp. 47-60, Sept. 1994.
[13] R. Neff and A. Zakhor, "Matching pursuit video coding at very low bit rates," in IEEE Data Compression Conf., Snowbird, UT, pp. 411-420, Mar. 1995.
Thank You