IEEE Transactions on Circuits and Systems for Video Technology, 2008
A Fast MB Mode Decision Algorithm for MPEG-2 to H.264 P-Frame Transcoding
Pedro Cuenca, Member, IEEE, Luis Orozco-Barbosa, Member, IEEE, Gerardo Fernández-Escribano, Antonio Garrido, Hari Kalva
Outline
Introduction
Fast MB Mode Decision Using Machine Learning
Performance Evaluation
Conclusion
Introduction 1/3
Motivation: make transcoding from MPEG-2 to H.264 seamless.
Hypothesis: the MB mode decision in H.264 is correlated with the distribution of the motion-compensated residual in MPEG-2 video.
Introduction 2/3
Fig. 1. Relationship between MPEG-2 MB residual and H.264 MB coding mode.
The H.264 MB mode computation problem is posed as a data classification problem.
The MPEG-2 MB coding mode and residual have to be classified into one of the several H.264 coding modes.
Introduction 3/3
Method: use machine learning tools to exploit the correlation and construct decision trees to classify the MPEG-2 MBs into one of the coding modes in H.264.
Fast MB Mode Decision Using Machine Learning 1/14
Fig. 2. Process for building decision trees for MPEG-2 to H.264 transcoding.
Fast MB Mode Decision Using Machine Learning 2/14
WEKA data mining tool: machine learning software written in Java that supports several standard data mining tasks.
The J48 algorithm, implemented in the WEKA data mining tool, was used to create the WEKA decision trees.
The J48 algorithm is an implementation of the C4.5 algorithm, which is widely used as a reference for building decision trees.
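As a rough illustration of the kind of split criterion C4.5 relies on, the sketch below computes the gain ratio of a binary threshold split on one numeric attribute. The function names, attribute values, and class labels are hypothetical and chosen only for illustration; this is not WEKA's J48 code.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def gain_ratio(values, labels, threshold):
    """Gain ratio of splitting a numeric attribute at `threshold`."""
    left = [c for v, c in zip(values, labels) if v <= threshold]
    right = [c for v, c in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing carries no information
    n = len(labels)
    w_left, w_right = len(left) / n, len(right) / n
    info_gain = entropy(labels) - (w_left * entropy(left) + w_right * entropy(right))
    split_info = -(w_left * math.log2(w_left) + w_right * math.log2(w_right))
    return info_gain / split_info

# Hypothetical residual variances and H.264 mode labels, for illustration only.
variances = [1.0, 2.0, 3.0, 40.0, 50.0, 60.0]
modes = ["16x16", "16x16", "16x16", "8x8", "8x8", "8x8"]
best = gain_ratio(variances, modes, threshold=3.0)
```

A tree builder would evaluate candidate thresholds on every attribute and keep the split with the highest gain ratio.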
Fast MB Mode Decision Using Machine Learning 3/14
Attribute-Relation File Format (ARFF): the file format used by the WEKA data mining program; an ARFF file describes the existing relationship between a set of attributes.
An ARFF file has two sections:
(1) Header: contains the name of the relation, the attributes and their types.
(2) Data section: contains the data.
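The two ARFF sections can be sketched as follows. The relation name, attribute names, and data rows are hypothetical placeholders, not the attributes actually used in the paper.

```python
def write_arff(relation, attributes, rows):
    """Return ARFF text: a header (relation, attributes) and a data section."""
    lines = [f"@relation {relation}", ""]
    for name, kind in attributes:
        lines.append(f"@attribute {name} {kind}")
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join(str(v) for v in row))
    return "\n".join(lines) + "\n"

# Hypothetical attributes: one mean, one variance, and a nominal class.
attributes = [
    ("mean0", "numeric"),
    ("var0", "numeric"),
    ("class", "{skip,intra,inter8x8,inter16x16}"),
]
rows = [(0.5, 1.25, "skip"), (7.0, 42.0, "inter8x8")]
arff_text = write_arff("mpeg2_to_h264", attributes, rows)
```

The last attribute plays the role of the class to be predicted, which is how a classifier such as J48 is normally pointed at its target.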
Fast MB Mode Decision Using Machine Learning 4/14
Training sets:
The MPEG-2 sequences are encoded at high quality, and no B-frames are used.
The H.264 encoder is used with a QP of 25 and R-D optimization enabled.
Goal: develop a single, generalized decision tree to be used for the MPEG-2 to H.264 transcoding process.
It was found that the Flower sequence worked well as a training set for a large number of videos.
Fast MB Mode Decision Using Machine Learning 5/14
The decision tree for the proposed transcoder is a hierarchical decision tree consisting of three different WEKA trees.
Fig. 3. Decision tree.
Fast MB Mode Decision Using Machine Learning 6/14
A. Creating the Training Files
Mean and variance of each of the 4x4 residual subblocks.
MB mode in MPEG-2.
Coded block pattern (CBPC) used in MPEG-2.
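A minimal sketch of the mean and variance computation for the sixteen 4x4 subblocks of one 16x16 residual MB is given below. The raster ordering of the subblocks, the use of the population variance, and the function name are assumptions made for illustration.

```python
def subblock_stats(residual):
    """Return (mean, variance) for each 4x4 subblock of a 16x16 residual MB.

    `residual` is a 16x16 list of lists; subblocks are visited in raster order.
    The variance is the population variance (divide by 16), an assumption.
    """
    stats = []
    for by in range(0, 16, 4):
        for bx in range(0, 16, 4):
            block = [residual[y][x]
                     for y in range(by, by + 4)
                     for x in range(bx, bx + 4)]
            mean = sum(block) / 16
            variance = sum((v - mean) ** 2 for v in block) / 16
            stats.append((mean, variance))
    return stats

# Hypothetical residual: flat except for one sample in the top-left subblock.
residual = [[0] * 16 for _ in range(16)]
residual[0][0] = 16
stats = subblock_stats(residual)
```

Each MB then contributes 16 means and 16 variances, which would sit alongside the MPEG-2 MB mode and the coded block pattern in one training instance.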
Fast MB Mode Decision Using Machine Learning 7/14
B. Decision Tree
The decision tree works as follows.
Node 1
Input: MPEG-2 MB information.
Output: first-level decision that classifies the MB as Skip, Intra, Inter-8x8 or Inter-16x16.
Rule (MPEG-2 MB mode -> H.264 MB mode):
MC not coded -> Inter-16x16
Intra -> Intra or Inter-8x8
Skip -> Skip
Fast MB Mode Decision Using Machine Learning 8/14
B. Decision Tree
The decision tree works as follows.
Node 2
Input: 16x16 MBs classified by Node 1.
Output: 16x16 submode decision used for coding the MB into 16x16, 16x8 or 8x16.
Rule: this tree examines whether there are continuous 16x8 or 8x16 subblocks that might result in a better prediction.
Fast MB Mode Decision Using Machine Learning 9/14
B. Decision Tree
The decision tree works as follows.
Node 3
Input: the MBs classified by Node 1 as 8x8.
Output: 8x8 submode decision used for coding the MB into 8x8, 8x4, 4x8 or 4x4.
Rule:
(1) Evaluates only the H.264 8x8 modes using the third WEKA tree and selects the best option.
(2) This node differs from the others in that it uses only four means and four variances to make the decision.
Fast MB Mode Decision Using Machine Learning 10/14
B. Decision Tree
The decision tree works as follows.
Node 4
Input:
(1) Skip-mode MBs in the MPEG-2 bit stream classified by Node 1.
(2) The 16x16 MBs classified by Node 2.
Output: selects Skip or Inter-16x16.
Rule: evaluates only the H.264 16x16 mode (without the submodes 16x8 or 8x16); the node then selects the best option.
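The routing between Nodes 1 to 4 can be sketched as below. Only the control flow is taken from the slides; the four node functions are caller-supplied stand-ins for the trained trees and evaluation steps, and the mode strings and the `classify_mb` name are hypothetical.

```python
def classify_mb(mb, node1, node2, node3, node4):
    """Route one MPEG-2 MB through the four nodes described above.

    Each node is a caller-supplied function standing in for a trained tree
    or an evaluation step; none of them is implemented here.
    """
    first = node1(mb)          # "skip", "intra", "inter8x8" or "inter16x16"
    if first == "intra":
        return "intra"
    if first == "inter8x8":
        return node3(mb)       # "8x8", "8x4", "4x8" or "4x4"
    if first == "inter16x16":
        sub = node2(mb)        # "16x16", "16x8" or "8x16"
        if sub != "16x16":
            return sub
    # Skip MBs from Node 1 and 16x16 MBs from Node 2 both reach Node 4.
    return node4(mb)           # "skip" or "inter16x16"

# Stub nodes with made-up outputs, for illustration only.
result = classify_mb(
    {"mode": "mc_not_coded"},
    node1=lambda mb: "inter16x16",
    node2=lambda mb: "16x16",
    node3=lambda mb: "8x8",
    node4=lambda mb: "inter16x16",
)
```

Passing the nodes in as functions keeps the routing separate from whatever classifier or encoder evaluation backs each node.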
Fast MB Mode Decision Using Machine Learning 11/14
The MB mode decisions and the thresholds used in the decision tree depend on the QP used in the H.264 encoding stage.
The mean and variance thresholds therefore have to be different at each QP.
Fast MB Mode Decision Using Machine Learning 12/14
Solution (1):
Method: develop the decision trees for each QP and use the appropriate decision tree depending on the QP selected.
Drawback: this is complex, since it implies switching between 52 different decision trees (one per H.264 QP value), resulting in 156 WEKA trees (52 x 3) for a transcoder.
Fast MB Mode Decision Using Machine Learning 13/14
Solution (2):
Method: develop a single decision tree for a QP of 25 and adjust the mean and variance thresholds used by the trees according to the QP. For QP values higher than 25, the thresholds are decreased, and for QP values lower than 25, the thresholds are proportionally increased. The thresholds are adjusted by 2.5% for a change in QP of 1.
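The threshold adjustment can be sketched as below, assuming the 2.5% step is applied linearly to the threshold trained at QP 25. The slides do not say whether the adjustment is linear or compounded, so the linear form and the function name are assumptions.

```python
BASE_QP = 25   # QP at which the single tree was trained (from the text)
STEP = 0.025   # 2.5% adjustment per unit change in QP (from the text)

def adjust_threshold(threshold, qp):
    """Scale a threshold trained at QP 25 for use at another QP.

    Assumes a linear adjustment: lower thresholds for qp > 25,
    higher thresholds for qp < 25.
    """
    return threshold * (1.0 - STEP * (qp - BASE_QP))

# Thresholds for the QP range used in the evaluation (5 to 45 in steps of 5).
scaled = {qp: adjust_threshold(100.0, qp) for qp in range(5, 50, 5)}
```

Under this assumption a threshold of 100 stays at 100 for QP 25, drops to about 50 at QP 45, and rises to about 150 at QP 5.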
Fast MB Mode Decision Using Machine Learning 14/14
Fig. 2. Process for building decision trees for MPEG-2 to H.264 transcoding.
Fig. 4. Proposed transcoder.
Performance Evaluation 1/8
Input:
(1) A high-quality MPEG-2 video.
(2) QP ranging from 5 up to 45 in steps of 5.
(3) The GOP size is 12 frames, where the first frame is an I-frame and the rest are P-frames.
(4) The rate control and CABAC algorithms were disabled for all the simulations.
(5) The number of reference frames for P-frames was set to 1.
(6) The motion search range was set to 16 pels with an MV resolution of 1/4 pel.
Performance Evaluation 2/8
Fig. 6. MB mode decisions generated by the proposed algorithm for the first P-frame in the Ayersroc, Paris, and Foreman sequences.
Panels: full estimation of H.264; proposed algorithm.
Performance Evaluation 3/8
Test sequences: Martin, Ayersroc, Paris, Tempete, News, Foreman
RD results: R-D cost without the FME option, or R-D cost with the FME option
Formats: CCIR, CIF, QCIF
Performance Evaluation 4/8
Performance Evaluation 5/8
RD results: SAE cost without the FME option, or SAE cost with the FME option
Performance Evaluation 6/8
Performance Evaluation 7/8
Reference transcoder
Proposed transcoder WIN
Performance Evaluation 8/8
Conclusion
The proposed algorithm uses machine learning techniques to develop decision trees that decide the H.264 coding mode in MPEG-2 to H.264 transcoding, considerably reducing the computational complexity.
It can be applied to develop other transcoders as well.