1 adaptive slice-level parallelism for h.264/avc encoding using pre macroblock mode selection...
Post on 19-Dec-2015
Adaptive slice-level parallelism for H.264/AVC encoding using pre macroblock mode selection
Bongsoo Jung, Byeungwoo Jeon
Journal of Visual Communication and Image Representation 2008
Outline
Introduction
Complexity Analysis
Method
  Pre Macroblock Mode Selection
  Adaptive Slice-level Parallelism
Experimental Results
Conclusions
Introduction
H.264/AVC achieves high coding efficiency
  Variable block sizes, multiple reference frames, quarter-pel motion vector accuracy, etc.
High computational complexity
  Complexity reduction algorithms
  Parallel processing
Introduction
GOP level
  Simple, but high latency
Frame level
  Keeps coding efficiency, but the dependence among frames limits thread scalability
Slice level
  Slices are encoded independently, but with lower coding efficiency
Macroblock level
  High dependency
Introduction
MBs in a slice may not have similar computational complexity
  Unnecessary extra waiting time in some threads
[Figure: encoding time of slices 0-7 on processing units PU0-PU7]
Main Purpose
Objective
  Use a parallel algorithm to speed up the H.264/AVC encoder
  Maximize parallel efficiency by distributing the workload equally
Method
  Preprocessing: fast MB mode selection
  Adaptive slice-level parallelism
Complexity Analysis
Inter prediction modes of MBs in H.264
Intra prediction modes: 4*4, 16*16
Complexity Analysis
The run-time complexity of the H.264/AVC encoder
  Pentium IV 2.4GHz
  Foreman_CIF with IPPP structure
Pre Macroblock Mode Selection: Overview
Why?
  High computational complexity of ME with variable block sizes
Remove unnecessary ME block sizes and RD calculations of intra prediction modes
This removal leads to
  Complexity reduction
  Workload balancing among slices
Pre Macroblock Mode Selection: Inter MB mode selection
MC block sizes in a video sequence
  Foreground region: 8*8 or smaller
  Non-moving region: 16*16
High temporal correlation
  Check the consistency history of block size 16*16 and zero MV
Two measurements
  Zero motion consistency (ZMC)
  Large block consistency (LBC)
Pre Macroblock Mode Selection: Inter MB mode selection
Zero Motion Consistency (ZMC)
  Indicates for how many consecutive frames a given block has had a zero MV
  When a block is encoded in intra mode, ZMC is set to 0
t: frame index, ZMC_0 = 0
(n,m;i,j) indicates the 4*4 block at (n,m) within MB (i,j)
A high ZMC value means a high probability of belonging to a background region
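The counter described on this slide can be sketched as a per-frame update for one 4*4 block. This is a minimal sketch, not the authors' code: the function name `update_zmc`, its arguments, and the sample history are hypothetical. Only the rules themselves (count consecutive zero MVs, reset on intra coding, ZMC_0 = 0) come from the slide.

```python
def update_zmc(prev_zmc, mv, is_intra):
    """Return ZMC_t for one 4*4 block, given ZMC_{t-1} of the same block.

    prev_zmc : ZMC value in the previous frame (ZMC_0 = 0)
    mv       : (dx, dy) motion vector selected for the block in frame t
    is_intra : True if the block was encoded in intra mode
    """
    if is_intra or mv != (0, 0):
        return 0            # the run of zero MVs is broken
    return prev_zmc + 1     # one more consecutive frame with a zero MV


# Hypothetical history of one block over five frames: (MV, intra flag).
history = [((0, 0), False), ((0, 0), False), ((1, 0), False),
           ((0, 0), False), ((0, 0), True)]
zmc = 0
trace = []
for mv, intra in history:
    zmc = update_zmc(zmc, mv, intra)
    trace.append(zmc)
print(trace)  # [1, 2, 0, 1, 0]
```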
Pre Macroblock Mode Selection: Inter MB mode selection
Zero Motion Consistency Score (ZMCS)
  Indicates how likely an MB is to be a stationary region
T_MOTION: a threshold value
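One way to turn the per-block ZMC counters into an MB-level score is sketched below. The score formula itself is not recoverable from the transcript, so the aggregation rule here is an assumption: the MB is scored High only when all sixteen of its 4*4 blocks have reached the threshold. The function name `zmc_score` is hypothetical; the labels High and Low and the value T_MOTION = 4 appear elsewhere in the slides.

```python
T_MOTION = 4  # threshold value quoted later in the slides


def zmc_score(zmc_blocks, t_motion=T_MOTION):
    """Classify an MB as "High" or "Low" from the ZMC of its sixteen 4*4 blocks.

    ASSUMPTION: the MB counts as stationary only if every 4*4 block has had
    a zero MV for at least t_motion consecutive frames.
    """
    return "High" if min(zmc_blocks) >= t_motion else "Low"


static_mb = [6] * 16          # every block has been still for 6 frames
moving_mb = [6] * 15 + [0]    # one block moved in the latest frame
print(zmc_score(static_mb), zmc_score(moving_mb))  # High Low
```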
Pre Macroblock Mode Selection: Inter MB mode selection
Large Block Consistency (LBC)
  Indicates the number of consecutive frames having a 16*16 MC block size at the (i,j)th MB
  When a block is encoded in intra mode, LBC is set to 0
bestMode_t(i,j): the best MB mode of the (i,j)th MB in the t-th frame
LBC_0 = 0
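The LBC counter follows the same pattern as a run-length counter over the best mode of each frame. The sketch below is illustrative only: the mode strings and the function name `update_lbc` are hypothetical, and treating SKIP as a 16*16 mode is an assumption, since the transcript only says "16*16 MC block size".

```python
# ASSUMPTION: modes counted as a 16*16 MC block size (SKIP included).
LARGE_BLOCK_MODES = {"SKIP", "P16x16"}


def update_lbc(prev_lbc, best_mode):
    """Return LBC_t for one MB, given LBC_{t-1} and bestMode_t (LBC_0 = 0)."""
    if best_mode in LARGE_BLOCK_MODES:
        return prev_lbc + 1   # one more consecutive frame coded as 16*16
    return 0                  # smaller partition or intra mode resets the run


# Hypothetical best modes of one MB over five frames.
modes = ["P16x16", "SKIP", "P8x8", "P16x16", "I4x4"]
lbc = 0
trace = []
for mode in modes:
    lbc = update_lbc(lbc, mode)
    trace.append(lbc)
print(trace)  # [1, 2, 0, 1, 0]
```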
Pre Macroblock Mode Selection: Inter MB mode selection
Large Block Consistency Score (LBCS)
  Indicates how likely an MB is to be partitioned as 16*16
T_MODE1, T_MODE2: threshold values used to assess the LBC
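A two-threshold classification of the LBC can be sketched as follows. The exact comparison rule is not recoverable from the transcript, so the mapping below (High at or above T_MODE2, Medium between the thresholds, Low otherwise) is an assumption, and `lbc_score` is a hypothetical name. The three labels and the settings T_MODE1 = 1, T_MODE2 = 4 are taken from the slides.

```python
def lbc_score(lbc, t_mode1=1, t_mode2=4):
    """Map an LBC counter to "High", "Medium" or "Low".

    ASSUMPTION: thresholds are inclusive lower bounds of each class.
    """
    if lbc >= t_mode2:
        return "High"
    if lbc >= t_mode1:
        return "Medium"
    return "Low"


print([lbc_score(v) for v in (0, 1, 3, 4, 9)])
# ['Low', 'Medium', 'Medium', 'High', 'High']
```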
Pre Macroblock Mode Selection: Inter MB mode selection
An illustration of LBCS
Pre Macroblock Mode Selection: Inter MB mode selection
Conditional probability of MB modes given ZMCS = High
  The other block sizes are very unlikely to appear (probability less than about 0.04)
  Early detection of SKIP and P16*16 modes
T_MOTION = 4
Pre Macroblock Mode Selection: Inter MB mode selection
Joint conditional probability of MB modes given LBCS, with ZMCS = Low
A: LBCS = High, B: LBCS = Medium, C: LBCS = Low
T_MODE1 = 1, T_MODE2 = 4
Pre Macroblock Mode Selection: Pre selective intra mode selection
High computational load of computing RD costs of intra mode
Comparing temporal correlation with spatial correlation of the current MB prior to frame coding
Pre Macroblock Mode Selection: Selective intra mode selection
Mean Absolute Temporal Difference (MATD)
Mean Absolute Spatial Difference (MASD)
c_{x,y}: pixel value at location (x,y) of the MB in the current frame
r_{x,y}: pixel value at location (x,y) of the MB in the previous frame
X, Y: horizontal and vertical dimensions of an MB
MASD_H: the MASD between horizontally neighboring pixels
MASD_V: the MASD between vertically neighboring pixels
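The two measures can be sketched directly from the symbol definitions above. The formulas themselves are missing from the transcript, so two details are assumptions: each mean is normalised by the number of pixel pairs it sums over, and the overall MASD is taken as the average of MASD_H and MASD_V. Function names and the toy block are hypothetical.

```python
def matd(cur, ref):
    """Mean absolute difference between co-located pixels c_{x,y} and r_{x,y}."""
    height, width = len(cur), len(cur[0])
    total = sum(abs(cur[y][x] - ref[y][x])
                for y in range(height) for x in range(width))
    return total / (height * width)


def masd_h(cur):
    """Mean absolute difference between horizontally neighboring pixels."""
    height, width = len(cur), len(cur[0])
    total = sum(abs(cur[y][x] - cur[y][x + 1])
                for y in range(height) for x in range(width - 1))
    return total / (height * (width - 1))


def masd_v(cur):
    """Mean absolute difference between vertically neighboring pixels."""
    height, width = len(cur), len(cur[0])
    total = sum(abs(cur[y][x] - cur[y + 1][x])
                for y in range(height - 1) for x in range(width))
    return total / ((height - 1) * width)


def masd(cur):
    # ASSUMPTION: combine the two directions by a plain average.
    return (masd_h(cur) + masd_v(cur)) / 2


# Toy 16*16 MB: a horizontal ramp; the previous frame is the same ramp plus 1.
cur = [[10 * x for x in range(16)] for _ in range(16)]
ref = [[v + 1 for v in row] for row in cur]
print(matd(cur, ref), masd_h(cur), masd_v(cur), masd(cur))  # 1.0 10.0 0.0 5.0
```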
Pre Macroblock Mode Selection: Selective intra mode selection
Compare MATD and MASD to determine whether the RD costs of intra modes should be calculated for the current MB
  A larger w makes it easier to skip the intra mode search
  A smaller QP will incur more intra modes than a larger QP
w: weighting factor, currently set to 0.6
The intra mode search is skipped when the MB is more temporally correlated than spatially correlated
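The comparison can be sketched as a single predicate. The exact inequality is not recoverable from the transcript; the form `MATD < w * MASD` is an assumption chosen because it matches the statement that a larger w makes skipping easier. Any QP dependence is not modelled here, and `skip_intra_check` is a hypothetical name.

```python
def skip_intra_check(matd, masd, w=0.6):
    """Return True when the intra RD cost calculation can be skipped.

    ASSUMED test: the MB is treated as more temporally than spatially
    correlated when MATD < w * MASD (w = 0.6 on the slide).
    """
    return matd < w * masd


print(skip_intra_check(1.0, 5.0))         # True:  1.0 < 3.0
print(skip_intra_check(4.0, 5.0))         # False: 4.0 >= 3.0
print(skip_intra_check(4.0, 5.0, w=0.9))  # True:  a larger w skips more often
```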
Pre Macroblock Mode Selection: MB mode classification
Decision table of candidate MB mode
A block diagram of MB selection
Adaptive Slice-level Parallelism: Overview
Characteristics
  Easy to implement
  Lower overhead of inter-communication among processor units
  Good scalability
  Increased bitrate
Slice boundary is defined on the basis of a fixed number of MBs or a fixed number of bits
Hard to decide a slice boundary prior to encoding
Adaptive Slice-level Parallelism: Fixed MB assignment
The number of consecutive MBs in each slice
  L: the number of processor units on a multi-core system
  M: the total number of MBs in a frame
  i: slice index
Example: number of processing units L = 8, sequence resolution is CIF (352*288), M = 22*18 = 396
  We can assign about 49 MBs to each slice
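The CIF example can be worked through in a few lines. Since 396 is not divisible by 8, some rule for the four leftover MBs is needed; giving one extra MB to each of the first slices is an assumption of this sketch, not something stated in the transcript. The name `fixed_slice_sizes` is hypothetical.

```python
def fixed_slice_sizes(num_mbs, num_units):
    """Split M macroblocks into L runs of (almost) equal length."""
    base, extra = divmod(num_mbs, num_units)
    # ASSUMPTION: the leftover MBs go to the first slices, one each.
    return [base + (1 if i < extra else 0) for i in range(num_units)]


M = 22 * 18   # CIF: 352/16 = 22 MBs per row, 288/16 = 18 rows
L = 8
sizes = fixed_slice_sizes(M, L)
print(sizes)  # [50, 50, 50, 50, 49, 49, 49, 49]
```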
Adaptive Slice-level Parallelism: Fixed MB assignment
The scheduling of slice-level parallelism in eight processor units
[Figure: two panels, "Ideal case" and "Practical case", each plotting slices 0-7 on PU0-PU7 against encoding time, with a "Bottleneck" label]
Adaptive Slice-level Parallelism: Fixed MB assignment
The imbalance of computational load distribution
[Figure panels: Exhaustive Search Method; Fast ME / Fast Mode Search]
Adaptive Slice-level Parallelism: Fixed MB assignment
Computational load for encoding one frame in slice-level parallelism
Computational load of the t-th frame by a single processor system
C^t_slice(i): the computational load of the i-th slice in the t-th frame
L: number of slices in a frame
Adaptive Slice-level Parallelism: Fixed MB assignment
The speedup of a multiprocessor system over a single processor system
To achieve the maximum speedup
  Computational loads of the slices should be as similar as possible
  Adaptive slice partition method
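Why similar slice loads maximise the speedup can be seen with a small calculation. The formulas are missing from the transcript, so this sketch assumes the natural reading of the captions: a single processor needs the sum of all slice loads, while the parallel encoder needs as long as its slowest slice. The load values are made up for illustration, and `speedup` is a hypothetical name.

```python
def speedup(slice_loads):
    """Ideal speedup of slice-level parallelism, ignoring any overhead.

    ASSUMPTION: single-processor load = sum of slice loads,
                parallel load         = largest slice load.
    """
    return sum(slice_loads) / max(slice_loads)


balanced = [10.0] * 8              # eight equally loaded slices
imbalanced = [10.0] * 7 + [30.0]   # one slice is three times heavier
print(speedup(balanced))               # 8.0
print(round(speedup(imbalanced), 2))   # 3.33
```

With eight processing units the speedup reaches 8 only when every slice carries the same load; a single heavy slice caps it at the total load divided by that slice's load.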
Adaptive Slice-level Parallelism: Complexity estimation model
A simple estimation method by utilizing the result of fast MB mode selection
Define the group value g corresponding to the candidate MB modes
Adaptive Slice-level Parallelism: Complexity estimation model
Complexity model
C_{k,CHK_Intra}(g): complexity cost of the k-th MB
g: group index
e_inter: estimated complexity cost of inter mode in g = 1
e_intra: complexity cost according to the intra mode check in g = 1
α1, α2, α3, β1, β2, β3: weighting values of complexity cost
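Adaptive Slice-level Parallelism: Complexity estimation model
Relative computational load

CHK_intra = 0 (assume e_inter = 1, e_intra = 0; α1 = 2.42, α2 = 3.12, α3 = 5.28)

$$
C_{k,\,CHK_{Intra}=0}(g) =
\begin{cases}
e_{Inter} + e_{Intra} = 1, & g = 1 \\
\alpha_1 e_{Inter} + \beta_1 e_{Intra} = 2.42, & g = 2 \\
\alpha_2 e_{Inter} + \beta_2 e_{Intra} = 3.12, & g = 3 \\
\alpha_3 e_{Inter} + \beta_3 e_{Intra} = 5.28, & g = 4
\end{cases}
$$

CHK_intra = 1 (assume e_inter = 1, e_intra = 3.97; β1 = 0.82, β2 = 0.83, β3 = 0.84)

$$
C_{k,\,CHK_{Intra}=1}(g) =
\begin{cases}
e_{Inter} + e_{Intra} = 4.97, & g = 1 \\
\alpha_1 e_{Inter} + \beta_1 e_{Intra} = 6.48, & g = 2 \\
\alpha_2 e_{Inter} + \beta_2 e_{Intra} = 7.23, & g = 3 \\
\alpha_3 e_{Inter} + \beta_3 e_{Intra} = 9.48, & g = 4
\end{cases}
$$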
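The piecewise model can be sketched as a small lookup-and-multiply function. This is an illustration under stated assumptions, not the authors' implementation: the names `ALPHA`, `BETA` and `mb_cost` are hypothetical, the weights for g = 1 are taken as 1 (the first row has no coefficient), and the intra term is dropped entirely when CHK_intra = 0, which matches the slide's assumption e_intra = 0 for that case. Only the CHK_intra = 0 column and the g = 1 entries are checked here.

```python
# Weights from the slide; the g = 1 row has no coefficient, so 1.0 is assumed.
ALPHA = {1: 1.0, 2: 2.42, 3: 3.12, 4: 5.28}
BETA = {1: 1.0, 2: 0.82, 3: 0.83, 4: 0.84}


def mb_cost(g, chk_intra, e_inter=1.0, e_intra=3.97):
    """Estimated complexity cost of one MB in group g (1..4).

    chk_intra : 1 if intra RD costs are evaluated for this MB, else 0
    ASSUMPTION: with chk_intra = 0 the intra term contributes nothing.
    """
    intra_term = BETA[g] * e_intra if chk_intra else 0.0
    return ALPHA[g] * e_inter + intra_term


print([mb_cost(g, 0) for g in (1, 2, 3, 4)])  # [1.0, 2.42, 3.12, 5.28]
print(round(mb_cost(1, 1), 2))                # 4.97
```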
Adaptive Slice-level Parallelism: Adaptive MB assignment
The total computational load at the t-th frame

$$
\tilde{C}^{t} = \sum_{k=0}^{M-1} C_{k,\,CHK_{Intra}}(g)
$$

Ideal computational load of each slice for the uniform workload distribution

$$
\tilde{C}^{t}_{slice} = \frac{\tilde{C}^{t}}{L}
$$
Adaptive Slice-level Parallelism: Adaptive MB assignment
MB assignment of each slice
Balances the workload much better than a fixed MB assignment in each slice
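The assignment formula is not recoverable from the transcript, so the following is a sketch of one plausible rule rather than the authors' method: walk the MBs in raster order, accumulate their estimated costs, and close a slice once the running total reaches that slice's share of the ideal per-slice load. The function names and the toy cost list are hypothetical; the costs 1 and 5.28 are the relative loads of groups g = 1 and g = 4 with no intra check.

```python
def adaptive_slice_sizes(costs, num_slices):
    """Split MBs (raster order) into num_slices runs of similar total cost."""
    if not 0 < num_slices <= len(costs):
        raise ValueError("need between 1 and len(costs) slices")
    target = sum(costs) / num_slices           # ideal load per slice
    sizes, start, cumulative = [], 0, 0.0
    for k, cost in enumerate(costs):
        cumulative += cost
        slices_left = num_slices - len(sizes)  # counts the slice being filled
        if slices_left == 1:
            break                              # the last slice takes the rest
        mbs_left = len(costs) - (k + 1)
        reached = cumulative >= target * (len(sizes) + 1)
        must_close = mbs_left == slices_left - 1  # keep one MB per later slice
        if reached or must_close:
            sizes.append(k + 1 - start)
            start = k + 1
    sizes.append(len(costs) - start)
    return sizes


def slice_loads(costs, sizes):
    """Total estimated cost of each slice for a given list of slice sizes."""
    loads, start = [], 0
    for size in sizes:
        loads.append(sum(costs[start:start + size]))
        start += size
    return loads


# Toy frame: four cheap MBs (g = 1) followed by four expensive MBs (g = 4).
costs = [1.0] * 4 + [5.28] * 4
adaptive = adaptive_slice_sizes(costs, 2)
print(adaptive)                                                   # [6, 2]
print([round(x, 2) for x in slice_loads(costs, adaptive)])        # [14.56, 10.56]
print([round(x, 2) for x in slice_loads(costs, [4, 4])])          # [4.0, 21.12]
```

In this toy case the heaviest slice drops from 21.12 to 14.56 cost units, so the slowest processing unit finishes sooner than with the fixed 4/4 split.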
Adaptive Slice-level Parallelism: Adaptive MB assignment
Entire block diagram
Experimental Results: Overview
Performance comparison between proposed MB mode decision and the conventional method
Comparing adaptive slice-level parallelism with fixed slice-level parallelism
Experimental Results: MB mode selection
Average encoding time saving AST [%]
BDPSNR and BDBR are used to measure the performance against FULL_1Slice
FULL_1Slice: exhaustive method
FMD_1Slice: fast MB mode search method
Experimental Results: Rate distortion curves
Experimental Results
R-D performance compared to one slice per frame (FMD_1Slice)
Experimental Results: Rate distortion curves
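Experimental Results: Slice-level parallelism
Comparing adaptive and fixed slice-level parallelism
Speedup

$$
\mathrm{Speedup}_{FMD\_Fixed} = \frac{\mathrm{EncTime}(FMD\_1Slice)}{\mathrm{MAX}_i\left(\mathrm{EncTime}_{FMD\_Fixed}(slice_i)\right) + \mathrm{OverheadTime}}
$$

$$
\mathrm{Speedup}_{FMD\_Adaptive} = \frac{\mathrm{EncTime}(FMD\_1Slice)}{\mathrm{MAX}_i\left(\mathrm{EncTime}_{FMD\_Adaptive}(slice_i)\right) + \mathrm{OverheadTime}}
$$

EncTime(FMD_1Slice): encoding time of one slice per frame by a single processor system
MAX_i(EncTime_FMD_Fixed(slice_i)): the longest encoding time of a slice using fixed mode
MAX_i(EncTime_FMD_Adaptive(slice_i)): the longest encoding time of a slice using adaptive mode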
Experimental Results: Speedup
Conclusions
Proposed a fast MB mode selection using the consistency history of block size and zero MV
Proposed an intra mode selection that compares temporal and spatial correlation
Using these two schemes, the authors proposed a new adaptive slice-level parallelism to speed up the H.264/AVC encoder