STUDY AND IMPLEMENTATION OF STUDY AND IMPLEMENTATION OF VIDEO COMPRESSION STANDARDS VIDEO COMPRESSION STANDARDS
(H.264/AVC, DIRAC)(H.264/AVC, DIRAC)
EE 5359-Multimedia ProcessingSpring 2012Dr. K.R Rao
By:Sumedha Phatak(1000731131)
OBJECTIVEOBJECTIVE
• A study, implementation and comparison of the baseline profiles of H.264/AVC [6] and Dirac [21]
• For factors like video quality, bit rates, compression ratio, complexity and performance analysis
• A comparison of these two standards • Based on quality parameters like SSIM [13], MSE [13]
and PSNR [13] at various bit rates will be done
SSIMSSIM• Method for measuring the similarity between two
frames • Measuring of image quality is done using an initial
uncompressed or distortion-free frame as reference • Designed to improve on methods like peak signal-to-
noise ratio (PSNR) and mean squared error (MSE), which have proved to be inconsistent with human eye perception.
EQUATION FOR SSIMEQUATION FOR SSIM
where x and y correspond to two different signals that need to be compared for similarity, i.e. two different blocks in two separate images;
MEAN SQUARED ERROR (MSE) [13]: MEAN SQUARED ERROR (MSE) [13]:
• MSE represents the cumulative squared error between the compressed and the original image
• lower the value of MSE, lower is the error • MSE is computed by averaging the squared intensity
differences of the distorted and reference image/frame pixels
• Two distorted images with the same MSE may have very different types of errors, some of which are much more visible than others.
PEAK SIGNAL TO NOISE RATIO(PSNR) [13] PEAK SIGNAL TO NOISE RATIO(PSNR) [13]
• This ratio is often used as a quality measurement between the original and a compressed image
• Higher the PSNR, the better the quality of the compressed, or reconstructed image
• R is the maximum fluctuation in the input image data type
INTRODUCTIONINTRODUCTION
• Data compression means bit-rate reduction • Compression can be either lossless or lossy [9]• Former reduces bits by eliminating statistical redundancy and
no information is lost in this type of compression • Whereas, the latter by identifying and removing marginally
important information [9]• Majority of video compression algorithms use lossy
compression [1]• Video compression uses modern coding techniques to reduce
redundancy in video data and combines spatial image compression and temporal motion compensation. [3]
NEED FOR VIDEO COMPRESSION?NEED FOR VIDEO COMPRESSION?
• Mainly needed because bandwidth is still a very valuable commodity
• Consider a TV picture resolution of 720×480 and a frame rate of 30 fps
• If represented by 3 bytes per pixel• 1 sec of video=31.1 MB and 1 hr of video=112GB • BW required to deliver wirelessly will be 124.4 MHz
H.264 [4]H.264 [4]
• H.264/MPEG-4 Part 10 or AVC (Advanced Video Coding) is a standard for video compression [3]
• Currently one of the most commonly used formats for the recording, compression, and distribution of high definition video [4]
• Good video quality at substantially lower bit rates than previous standards without significantly increasing the complexity of design.[3]
BASELINE PROFILE [3]BASELINE PROFILE [3]
• Consists of some error resilience tools such as flexible macro block order, arbitrary slice order and redundant slices
• Designed for low delay applications, as well as for applications that run on platforms with low processing power and in high packet loss environment
• Offers the least coding efficiency• Caters to applications such as video conferencing and mobile
video [6].
EXTENDED PROFILE [3]EXTENDED PROFILE [3]
• Superset of the baseline profile• Besides tools of the baseline profile it includes B-, SP- and SI-
slices, data partitioning, and interlace coding tools [21]• SP and SI are specifically coded P and I slices respectively
which are used for efficient switching between different bitrates in some streaming applications
• Does not include context adaptive binary arithmetic coding [7]• More complex but also provides better coding efficiency• Intended for applications like streaming video over internet
MAIN PROFILE [6]MAIN PROFILE [6]
• Other than the common features main profile includes tools such as CABAC- context adaptive binary arithmetic coding for entropy coding, B-slices
• Does not include any error resilience tools • Used in Broadcast television and high resolution
video storage and playback• Contains interlaced coding tools like extended profile
[3]
HIGH PROFILE [6]HIGH PROFILE [6]
• Superset of the main profile [7]• Also includes additional tools such as adaptive
transform block size, quantization scaling matrices
• Used for applications such as content-contribution, content-distribution, and studio editing and post-processing [17]
ADVANTAGES OF H.264ADVANTAGES OF H.264
• High video quality at low and high bit rates • H.264 is error resilient • Can deal with packet losses in packet networks
and also bit errors in error-prone wireless networks
• Wide areas of application• Mobile TV, HDTV over IP
H.264 IMPLEMENTATIONH.264 IMPLEMENTATION
• JM 17.2 software [22] • Build the solution and run it using Microsoft Visual Studio• Generate lencod and ldecod and executable files• lencod/ldecod file should then be run from the command
prompt • Since, baseline profile is being implemented, the
encoder_baseline.cfg file should be used.• Necessary parameters should be changed along with the
required input file and destination to get the output at desired parameters.
ORIGINAL FILE : news_qcif.yuv [23]ORIGINAL FILE : news_qcif.yuv [23]
Figure 7: Original video file news_qcif.yuv
RESULTSRESULTS
• QCIF sequence: news_qcif.yuv• Height: 176, Width: 144• Total no. of frames: 300• Frames used: 100• Original File size: 3713KB• Frame Rate = 25 fps
BIT RATE, MSE, PSNR, SSIM AT BIT RATE, MSE, PSNR, SSIM AT VARIOUS QPVARIOUS QP
Table 2: Various metrics for news_qcif.yuv at different QP values
QP Bitrate (kbps) MSE (Y-component) PSNR (Y-component) in dB SSIM (Y-component)
0 227.80 0.0129 65.567 0.999
10 89.42 0.49 52.24 0.997
25 14.25 8.01 40.54 0.987
40 2.22 103.55 29.11 0.843
50 0.723 433.89 20.66 0.610
ORIGINAL FILE : foreman_qcif.yuv ORIGINAL FILE : foreman_qcif.yuv [23][23]
Figure 12: Original video file foreman_qcif.yuv
RESULTSRESULTS
• QCIF sequence: foreman_qcif.yuv• Height: 176, Width: 144• Total no. of frames: 300• Frames used: 100• Original File size: 14850 KB• Frame Rate = 25 fps
BIT RATE, MSE, PSNR, SSIM AT BIT RATE, MSE, PSNR, SSIM AT VARIOUS QPVARIOUS QP
Table 3: foreman_qcif.yuv-Various metrics at different QP values
QP Bitrate (kbps) MSE (Y-component) PSNR (Y-component) in dB SSIM (Y-component)
0 388.48 0.0079 69.182 0.999
10 172.25 0.4573 51.563 0.9975
25 23.794 7.999 39.134 0.9708
40 3.896 90.264 28.609 0.8457
50 1.1475 430.874 21.821 0.6112
ORIGINAL FILE : container_qcif.yuv ORIGINAL FILE : container_qcif.yuv [23][23]
Figure 17: Original video file container_qcif.yuv
RESULTSRESULTS
• QCIF sequence: container_qcif.yuv• Height: 176, Width: 144• Total no. of frames: 300• Frames used: 100• Original File size: 14850KB• Frame Rate = 25 fps
BIT RATE, MSE, PSNR, SSIM AT BIT RATE, MSE, PSNR, SSIM AT VARIOUS QPVARIOUS QP
Table 4: Various metrics at different QP values for container_qcif.yuv
QP Bitrate (kbps) MSE (Y-component) PSNR (Y-component) in dB SSIM (Y-component)
0 3124.45 0.00805 68.126 0.999
10 1093.6 0.511 51.295 0.9964
25 196.73 16.111 37.776 0.9698
40 42.88 93.767 28.4307 0.852
50 4.22 363.973 22.783 0.6889
PLOTS FOR container_qcif.yuv:PLOTS FOR container_qcif.yuv:
Figure 19: container_qcif.yuv-MSE vs. bit rate
DIRACDIRAC [21] [21]
• Open and free video compression format developed by BBC research [6]
• Intended to provide high quality video compression for applications like Ultra HDTV
• Mainly competes with existing standards like H.264 [5] and VC1 [12]
• Hybrid video codec because it involves both transform and motion compensation
WAVELET TRANSFORMWAVELET TRANSFORM
• Uses wavelet transform on the entire picture at once providing flexibility to operate at several resolution ranges
• When the transform is applied, the wavelet filters split the signal into 4 frequency sub-bands namely LL (Low-Low), LH (Low-High), HL (High-Low) and HH (High-High). For our sequence the filter is applied both horizontally and vertically
• Since, LL sub-band consists of most significant information, for further stages the LL is decomposed and the rest can be discarded. This decomposition is carried out up to 4 stages.
• Daubechies wavelet filters are used to transform and divide the data in sub-bands which then are quantized with the corresponding RDO (rate distortion optimization) parameters and then variable length encoded
• At the decoder these stages are reversed.
ORIGINAL FILE : news_qcif.yuv [23]ORIGINAL FILE : news_qcif.yuv [23]
Figure 24: Original video file news_qcif.yuv
RESULTSRESULTS
• QCIF sequence: news_qcif.yuv• Height: 176, Width: 144• Total no. of frames: 300 • Frames used: 100 • Original File size: 3713 KB• Frame Rate = 25 fps
BIT RATE, MSE, PSNR, SSIM AT BIT RATE, MSE, PSNR, SSIM AT VARIOUS QFVARIOUS QF
Table 5: Various metrics at different QF values for news_qcif.yuv
QP Bitrate (kbps) MSE (Y-component) PSNR (Y-component) in dB SSIM (Y-component)
0 4.58 326.68 23.0236 0.6933
3 8.152 104.713 27.965 0.8544
5 14.989 38.027 32.364 0.925
8 42.426 4.184 41.949 0.9849
10 95.388 1.401 46.699 0.993
ORIGINAL FILE : foreman_qcif.yuv [23]ORIGINAL FILE : foreman_qcif.yuv [23]
Figure 29: Original video file foreman_qcif.yuv
RESULTSRESULTS
• QCIF sequence: foreman_qcif.yuv• Height: 352, Width: 288• Total no. of frames: 300• Frames used: 100• Original File size: 14850 KB• Frame Rate = 25 fps
BIT RATE, MSE, PSNR, SSIM AT BIT RATE, MSE, PSNR, SSIM AT VARIOUS QFVARIOUS QF
Table 6: Various metrics at different QF values for foreman_qcif.yuv
QP Bitrate (kbps)MSE (Y-
component)PSNR (Y-component) in dB SSIM (Y-component)
0 6.1125 299.55 23.4 0.6837
3 10.984 110.48 27.73 0.8316
5 20.673 37.207 32.458 0.9179
8 65.418 5.183 41.019 0.9795
10 145.766 1.5507 46.259 0.993
ORIGINAL FILE : container_qcif.yuv [23]ORIGINAL FILE : container_qcif.yuv [23]
Figure 33: Original video file container_qcif.yuv
RESULTSRESULTS
• QCIF sequence: container_qcif.yuv• Height: 352, Width: 288• Total no. of frames: 300• Frames used: 100• Original File size: 14850 KB• Frame Rate = 25 fps
BIT RATE, MSE, PSNR, SSIM AT BIT RATE, MSE, PSNR, SSIM AT VARIOUS QFVARIOUS QF
Table 7: Various metrics at different QF values for container_qcif.yuv
QP Bitrate (kbps)MSE (Y-
component)PSNR (Y-component) in
dBSSIM (Y-component)
0 18.817 203.688 24.0612 0.8217
3 27.355 46.6012 30.3868 0.9408
5 82.955 4.0694 40.0146 0.9754
8 187.291 1.882 44.263 0.9827
10 453.601 0.999 47.1666 0.9853
PLOTS FOR container_qcif.yuv:PLOTS FOR container_qcif.yuv:
Figure 35: container_qcif.yuv-MSE vs. bit rate
CONCLUSIONSCONCLUSIONS
From the above results, we see that H.264 is better compared to Dirac in terms of performance, quality. Also the PSNR and SSIM increase with increase in the bit rates whereas MSE decreases with increase in the bitrate. The variation in the bitrate can be achieved by changing the QP or QF for H.264 and dirac respectively.
ABBREVIATIONS AND ACRONYMS
• AVC: Advanced Video Coding • AVS: Audio Video Standard• BBC: British Broadcasting Corporation • CIF: Common Intermediate Format• CABAC: Context Adaptive Binary Arithmetic Coding • CODEC: Coder and Decoder• DCT: Discrete Cosine Transform• HDTV: High-Definition Television • IEC: International Electrotechnical Commission • ISO: International Organization for Standardization • ITU-T: International Telecommunication Union - Telecommunication Standardization sector • JPEG: Joint Photographic Experts Group• MPEG: Moving Picture Experts Group• MSE: Mean Square Error • PSNR: Peak Signal to Noise ratio • QCIF: Quarter Common Intermediate Format • SMPTE: Society of Motion Picture and Television Engineers • SSIM: Structural Similarity Metric• VQMT: Video Quality Measurement Tool
REFERENCESREFERENCES[1]Video compression standards history: http://en.wikipedia.org/wiki/Video_compression#Video[2] Video conferencing standards and technology.http://blog.radvision.com/videooverenterprise/2008/06/03/the-babel-fish-proves-video-conferencing-does-exist/[3] K. R. Rao and D. N. Kim, “Current Video Coding Standards: H.264/AVC, Dirac, AVS China and VC-1,” IEEE 42nd Southeastern symposium on system theory (SSST), March 7-9 2010, pp. 1-8, March 2010.[4] S. Kwon, A. Tamhankar and K.R. Rao, “Overview of H.264 / MPEG-4 Part 10”, J. Visual Communication and Image Representation, vol. 17, pp.186-216, April 2006.[5]T. Borer and T. Davies, “Dirac video compression using open technology,” BBC EBU Technical Review, July 2005.[6] A. Ravi, and K.R. Rao, “Performance analysis and comparison of the dirac video codec with H.264/MPEG-4 Part 10 AVC”, International Journal of Wavelets, Multiresolution and Information Processing, vol.4, pp. 635-654, January 2010.[7] T. Wiegand, and G. Sullivan, “Overview of H.264/AVC video coding standards,” IEEE Transactions on circuits and systems for video technology, vol. 13, no. 7,pp. 560-576, July 2003.[8]DiracSpecification,Version2.2.3,Available:http://diracvideo.org/download/specification/dirac-spec-latest.pdf[9] General information on Data/ Video compression http://en.wikipedia.org/wiki/Data_compression[10] The Dirac web page: http://www.bbc.co.uk/rd/projects/dirac/technology.shtml[11] S.-T. Hsiang, “A new sub band/wavelet framework for AVC/H.264 intra frame coding and performance comparison with motion-JPEG 2000", SPIE/VCIP, vol.6822, pp. 68220P-1 through 12, Jan. 2008.[12] VC-1 Compressed video bit stream format and decoding process (SMPTE 421M-2006), SMPTE standard, pp. 2-9, 2006.[13] Z. Wang, et al, “Image quality assessment: From error visibility to structural similarity”, IEEE Transactions on Image Processing, vol.13, no.4, pp. 600-612, April 2004.[14] MSU Video quality measurement tool: http://compression.ru/video/quality_measure/video_measurement_tool_en.html#nav[15] G. J. Sullivan and J. Ohm, Recent developments in standardization of high efficiency video coding (HEVC), Proc. SPIE 7798, 77980V (2010)
REFERENCESREFERENCES• [16] Dirac developer support documentation:
http://dirac.sourceforge.net/documentation/algorithm/algorithm/wlt_transform.xht• [17] I. Richardson, “The H.264 advanced video compression standard”, Wiley, 2nd edition, 2010.• [18] C. Christopoulos, A. Skodras, T.Ebrahimi, “The JPEG2000 still image coding system: An Overview”, IEEE Trans. on
Consumer Electronics, vol.46, pp.1103-1127, Nov. 2000• [19] B. Zeng and J. Fu, “Directional discrete cosine transforms - A new framework for image coding”, IEEE Trans. on Circuits
and Systems for Video Technology, vol. 18, no. 3, pp. 305-313, Mar. 2008.• [20] K. R. Rao and P. Yip, Discrete Cosine Transform: Algorithms, Advantages, Applications (Academic Press, Boston, 1990).• [21] A. Ravi, “Performance analysis and comparison of the dirac video codec with H.264/ MPEG 4 Part 10 AVC”, M.S thesis,
EE dept., UT Arlington, Aug 2009• [22] JM software source code: http://iphome.hhi.de/suehring/tml/ • [23]Sample videos in yuv format: http://trace.eas.asu.edu/yuv/• [24] L Fan et al, “Overview of AVS video standard”, IEEE International conference on multimedia and expo, vol. 1, pp. 423 -
426, June 2004.• [25] J.Ostermann et al, “Video coding with H.264/AVC: Tools, performance, and complexity”, IEEE Circuits and Systems
Magazine, vol. 4, Issue: 1, pp. 7 – 28, Aug. 2004.