chuong 1&2.pdf
TRANSCRIPT
Pham Van Tien, Dr. rer. nat. , Embedded Networking Research Group Email: [email protected] of Elec. and Telecom, Hanoi University of Science and Technology C9-411 Dai Co Viet str. 1, Hanoi
Video Coding
Tien Pham Van, Dr. rer. nat.
Hanoi University of Technology
Agenda
• Video coding process
• Video coding standards
• Future development
Introduction (1/2)
• Why is video compression important?
• One movie video without compression
– 720 x 480 pixels per frame
– 30 frames per second
– Total 90 minutes
– Full color (24 bits, i.e. 3 bytes, per pixel)
– Total uncompressed data = 167.96 GB!
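The figure above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes "full color" means 3 bytes per pixel and that 1 GB is 10^9 bytes; the function name is illustrative only.

```python
# Raw (uncompressed) size of the movie described on this slide.
WIDTH, HEIGHT = 720, 480      # pixels per frame
BYTES_PER_PIXEL = 3           # assumption: "full color" = 24 bits per pixel
FPS = 30                      # frames per second
DURATION_S = 90 * 60          # 90 minutes in seconds


def raw_video_bytes(width, height, bytes_per_pixel, fps, seconds):
    """Bytes needed to store every pixel of every frame."""
    return width * height * bytes_per_pixel * fps * seconds


total = raw_video_bytes(WIDTH, HEIGHT, BYTES_PER_PIXEL, FPS, DURATION_S)
print(f"{total / 1e9:.2f} GB")  # 167.96 GB
```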
Introduction (2/2)
• What is the difference between video
compression and image compression?
– Temporal Redundancy
• Coding methods to remove redundancy
– Intraframe Coding
• Remove spatial redundancy
– Interframe Coding
• Remove temporal redundancy
Desired Features
• Better compression
• Improved quality
• Interactivity and Manipulation of Content
• Error Resilience
• Processing of content in the compressed domain
• Identification and selective coding/decoding of the object of interest
• Facilitate Search / Indexing (MPEG-7)
Time table
[Figure: timeline of coding standards along a year axis, 1986 to 2010. Standards shown: JPEG, H.261, MPEG1, MPEG2/H.262, H.263, MPEG4, H.26L/H.264, VC-1/VC-2, H.265]
Evolution of Video Compression Standards
• ITU-T
– H.261: video telephony
– H.263: video conferencing
• MPEG
– MPEG-1: Video-CD
– MPEG-4 Visual: object-based coding
• Joint ITU-T and MPEG
– H.262/MPEG-2: digital TV/DVD
– H.264/MPEG-4 AVC
Where used?
– MPEG-1
• Video-CD
• Usually .mpg or .mpeg files are MPEG-1
• DAB Digital Radio is MP2 (MPEG-1 Layer 2)
• MP3 files (MPEG-1 Layer 3)
– MPEG-2
• .vob, .m2v, rarely .mpg files
• Anything to do with DVD: camcorders, DVD players, DVD recorders
• Digital TV (DVB)
– MPEG-4
• High-quality AVI files
• Video phones
• DivX
• Some advanced audio players support MPEG-4 Advanced Audio Coding (AAC)
Where used?
– H.263/+/++
• NetMeeting and similar video chat
• Network streaming applications, video phones, …
– H.264
• Video conferencing over different networks
• Multimedia streaming: live and on-demand
• Multimedia Messaging Services (MMS)
• Blu-ray, Digital Video Broadcasting, iPod Video, HD DVD
– VC-1, VC-2
• Video on the Internet
• HDTV broadcast, UHDTV
R-D Performance of MPEG Codecs
[Figure: rate-distortion curves for MPEG-1, MPEG-2, MPEG-4 and H.264. Horizontal axis: bit rate (kbps), 350 to 1050. Vertical axis: PSNR (Y), 32 to 50]
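The vertical axis of such a rate-distortion plot is the peak signal-to-noise ratio of the luminance (Y) component. A minimal sketch of how PSNR is commonly computed for 8-bit samples follows; the function name and the tiny test frames are illustrative assumptions, not data from the plot.

```python
import math


def psnr(reference, decoded, peak=255):
    """PSNR in dB between two equally sized frames given as lists of rows."""
    squared_errors = [(r - d) ** 2
                      for ref_row, dec_row in zip(reference, decoded)
                      for r, d in zip(ref_row, dec_row)]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak ** 2 / mse)


# Hypothetical 2 x 2 luminance frames; every sample is off by exactly 1.
ref = [[100, 100], [100, 100]]
dec = [[101, 99], [99, 101]]
print(round(psnr(ref, dec), 2))  # 48.13
```

A higher PSNR at the same bit rate means the codec's curve lies above the others in the plot.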
Questions
• What are video/audio codecs? Name some popular codecs that your media players support. What are the disadvantages of using specific codecs?
• What is a container format? Name some examples.
• Codecs and Formats
Compression...
[Figure: movie picture 1 and movie picture 2]
Motion estimation
[Figure: “Horse ride” sequence. Panels: pixel-wise difference without motion compensation; residue after motion compensation]
Motion Prediction
• Motion vector: a two-dimensional pointer that tells the decoder how far left/right and up/down the prediction macroblock lies from the position of the current macroblock
• Motion estimation: the process, performed by the coder, of finding the motion vector that points to the best prediction macroblock in a reference frame or field
• Motion compensation: the prediction obtained by applying the motion vectors to the reference frame
Motion Estimation
• Helps in understanding the content of an image sequence
– For surveillance
• Helps reduce the temporal redundancy of video
– For compression
• Helps stabilize video by detecting and removing small, noisy global motions
– For building the stabilizer in a camcorder
Motion Compensation
• It aims to reduce the data transmitted by detecting the motion of objects
– Use the previous frame as the reference
– In steps:
• Split the current frame into blocks. For each one:
• Find the best-matching block in the reference frame
• Code and transmit the motion vector to the best-matching block, together with the prediction residual
– The next frame can be used as a reference too
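The steps above can be sketched as an exhaustive block-matching search. This is an illustration under stated assumptions, not the method of any particular standard: the matching cost is the sum of absolute differences (SAD), the search window is ±2 pixels, the block is 4×4 to keep the toy frames small, and all function names are hypothetical. The vector (dx, dy) points from the current block position to the matching block in the reference frame.

```python
def sad(cur, ref, cx, cy, rx, ry, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (cx, cy) and the n x n block of `ref` at (rx, ry)."""
    return sum(abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
               for j in range(n) for i in range(n))


def full_search(cur, ref, cx, cy, n=4, search=2):
    """Try every displacement within +/- `search` pixels and return the
    (dx, dy, cost) with the lowest SAD."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            if not (0 <= rx <= width - n and 0 <= ry <= height - n):
                continue  # candidate block would leave the reference frame
            cost = sad(cur, ref, cx, cy, rx, ry, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


def residual(cur, ref, cx, cy, dx, dy, n=4):
    """Current block minus its motion-compensated prediction."""
    return [[cur[cy + j][cx + i] - ref[cy + dy + j][cx + dx + i]
             for i in range(n)] for j in range(n)]


# Toy 8 x 8 frames: the content moves one pixel to the right.
ref = [[10 * y + x for x in range(8)] for y in range(8)]
cur = [[ref[y][max(x - 1, 0)] for x in range(8)] for y in range(8)]

dx, dy, cost = full_search(cur, ref, cx=2, cy=2)
print((dx, dy, cost))                       # (-1, 0, 0)
print(sad(cur, ref, 2, 2, 2, 2, 4))         # 16, the cost without motion compensation
print(residual(cur, ref, 2, 2, dx, dy)[0])  # [0, 0, 0, 0]
```

In this toy case the motion-compensated residual is all zeros, while the plain frame difference is not, which is why only the vector and the residual need to be transmitted.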
Picture type
• Slice
– One or more contiguous macroblocks. The order of the macroblocks within a slice is from left to right and top to bottom.
• Macroblock
– A 16-pixel by 16-line section of luminance components and the corresponding 8-pixel by 8-line section of each of the two chrominance components.
• Block
– An 8-pixel by 8-line set of values of a luminance or a chrominance component.
CODEC Design
Coding functions
• Achieve high compression performance while keeping good picture quality
• Theory: each kind of redundancy is removed by a matching tool
– Spatial redundancy – DCT,DFT,subband,wavelet
– Temporal redundancy – MC/ME
– Statistical redundancy – VLC, Entropy coding
– Perceptual redundancy – VQ
Tradeoffs in lossy compression
DCT
• Uses the same technique as JPEG
– DCT based coding scheme
• DCT transform (2D)
• 3D DCT transform ?
Discrete cosine transform
• Uses the same technique as JPEG
– Discrete cosine transform
DCT Transformation
Steps
Image → 8 x 8 DCT → Quantization → Entropy Coding
• 8 x 8 DCT: spatial-to-DCT domain transformation
• Quantization: discard unimportant DCT domain samples
• Entropy coding: lossless coding of DCT domain samples
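The first step of this pipeline, the 8 x 8 DCT, can be sketched directly from the textbook definition of the orthonormal 2-D DCT-II. The code below is a slow illustrative form (real codecs use fast factorizations), and the function name and the flat test block are assumptions made for the example.

```python
import math

N = 8  # block size


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block, computed term by term."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):          # vertical frequency index
        for v in range(N):      # horizontal frequency index
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = scale(u) * scale(v) * total
    return coeffs


# A flat block has no spatial detail, so only the DC coefficient is non-zero.
flat = [[128] * N for _ in range(N)]
coeffs = dct_2d(flat)
print(round(coeffs[0][0], 6))  # 1024.0
```

Because natural image blocks are usually smooth, most of their energy ends up in a few low-frequency coefficients, which is what makes the following quantization step effective.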
Quantization
• Quantization
– The eye is insensitive to high-frequency components
– A larger quantizer step means greater loss
– Lower-frequency components get a smaller quantizer step, higher-frequency components a larger one
– The quantization tables in the encoder and the decoder are the same
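A small sketch may make the effect of a frequency-dependent quantizer concrete. The 4 x 4 table and the coefficient values below are hypothetical and chosen only for illustration (real tables are 8 x 8 and defined by the standard or the sequence header); the step simply grows with the frequency index.

```python
def quantize(coeffs, table):
    """Encoder side: divide each coefficient by its step and round."""
    return [[round(c / q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, table)]


def dequantize(levels, table):
    """Decoder side: multiply back using the same table as the encoder."""
    return [[level * q for level, q in zip(l_row, q_row)]
            for l_row, q_row in zip(levels, table)]


# Hypothetical table: small steps at low frequency, large steps at high frequency.
table = [[8 + 4 * (u + v) for v in range(4)] for u in range(4)]

# Hypothetical DCT coefficients of a smooth block.
coeffs = [[240.0, 31.0, 10.0, 3.0],
          [28.0, 12.0, 5.0, 2.0],
          [9.0, 6.0, 3.0, 1.0],
          [4.0, 2.0, 1.0, 0.5]]

levels = quantize(coeffs, table)
print(levels)      # [[30, 3, 1, 0], [2, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]]
print(dequantize(levels, table)[0])  # [240, 36, 16, 0]
```

Most high-frequency levels become zero, which the entropy coder can represent very compactly; the reconstruction differs from the input, which is the loss introduced by this step.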
Picture type
• Video bit stream
Picture type
• Intra picture
– Coded using only information present in the
picture itself
– I-pictures provide potential random access points into the compressed video data.
Picture type
• Predicted picture
– Coded with respect to the nearest previous I- or P-picture
– P-pictures use motion compensation
– Unlike I-pictures, P-pictures can propagate coding errors
Picture type
• Bidirectional picture
– Coded using both a past and a future picture as references
– B-pictures provide the most compression and, since in MPEG-1/2 they are never used as references, do not propagate errors
Picture type
• Typical display order of picture types
• Video stream composition
– The MPEG encoder reorders pictures in the video stream to
present the pictures to the decoder in the most efficient sequence
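The reordering can be illustrated with a short sketch. It assumes the common arrangement in which each B-picture depends on the reference pictures (I or P) immediately before and after it in display order, so the later reference must be sent first; the picture labels and the function name are made up for the example.

```python
def coding_order(display_order):
    """Reorder pictures so that every B-picture follows both of its references."""
    stream, waiting_b = [], []
    for picture in display_order:
        if picture.startswith("B"):
            waiting_b.append(picture)   # needs the next I/P picture first
        else:
            stream.append(picture)      # send the reference picture
            stream.extend(waiting_b)    # then the B-pictures displayed before it
            waiting_b = []
    return stream + waiting_b           # trailing B-pictures, if any, stay last


display = ["I0", "B1", "B2", "P3", "B4", "B5", "P6"]
print(coding_order(display))
# ['I0', 'P3', 'B1', 'B2', 'P6', 'B4', 'B5']
```

The decoder restores display order after decoding, so the viewer never sees the transmission order.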
Hybrid MC-DCT Video Encoder
• Intra-frame: encoded without prediction
• Inter-frame: predictively encoded; the reconstructed (quantized) frames serve as the reference for computing the residue
MPEG-1 = JPEG + Motion Prediction + Rate Control
• Early motivation: to encode motion video at 1.5Mbits/s for
transport over T1 data circuits and for replay from CD-ROM
• Defines the decoder but not the encoder
• Frames (pictures)
– Intra-coded using JPEG
– Inter-coded using (interpolated) ME & MC, with JPEG for the residuals
• MacroBlocks (MBs)
– 16×16 pixels block
• Rate control
– buffer at each end
– Test Model 5 (TM5)
Notes:
– Intracoding of MBs in MPEG is the same as described for JPEG, except that 1) unless otherwise specified in the sequence header, MPEG defines two quantization tables: one is used for intracoding, the other to code any residuals from prediction by motion estimation; 2) the quantization scale factor, or MQuant, is different.
– MPEG does not define the encoder. A valid encoder produces a syntactically correct bit stream that yields the desired output when fed to a compliant decoder. An MPEG-1 compliant decoder, however, is required to decode all valid MPEG-1 bit streams.
MPEG-2 = MPEG-1 +
• Improvements
– Color space: could support 4:2:2 and 4:4:4 coding
– Quantization: could have 9- or 10- bit precision for DC
coefficients
– Concealment motion vectors: used when an intra-MB is
lost
– Pan and Scan: supports display of different aspect
ratios, e.g., 16:9
• Profiles and levels
– Profiles: define the tools or syntactical elements
– Levels: define the permissible ranges of parameters
MPEG-2 = MPEG-1 +
• Interlace tools
• Scalable coding profiles
• System layer: defines two bit stream constructs
– Program stream (PS): modeled on MPEG-1 (backward compatibility)
– Transport stream (TS): more robust, does not need a common time base, designed for use in error-prone environments
MPEG-4 = MPEG-2 + Objects + Other Enhancements
• Object-oriented
– Video (texture+shape), image, audio, speech, text, etc.
– Encoded using different techniques
– Transmitted independently
– Composited at the decoder using BInary Format for Scenes
(BIFS)
• Improvements in MPEG-4 version2
– Global motion compensation (GMC)
– Quarter pixel motion compensation
– Shape-adaptive DCT
• Why is MPEG-4 not as successful as MPEG-2?
– Not substantially better than MPEG-2
– Suffers from its sheer size and flexibility
– Licensing issues
MPEG-4 – Error Resilience Tools
• Video packet resynchronization
– Previous coding standards: resynchronization markers are fixed at the beginning of each row of MBs
– MPEG-4: resynchronization markers are inserted approximately every K bits
• Data partitioning
– Partitions the data in a video packet into a motion part and
a texture part separated by a motion boundary marker
(MBM)
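[Figure: layout of a video packet: Resync. marker | MB No. | QP | HEC | Repeated header info. | Motion data | MBM | DCT data. With data partitioning, an I-VOP packet holds VP header, DC DCT data, AC DCT data; a P-VOP packet holds VP header, motion data, texture data. Labels: use, discard, use]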
MPEG-4 – Error Resilience Tools
• Reversible variable length codes (RVLC)
– After an error, the decoder finds the next resynchronization marker and decodes backwards from it
• Header extension code (HEC)
– The header information is repeated after the 1-bit
HEC
• Unequal error protection (UEP) technique
New Features of H.264
• Multi-mode, multi-reference MC
• Motion vector can point out of image border
• 1/4-, 1/8-pixel motion vector precision
• B-frame prediction weighting
• 4×4 integer transform
• Multi-mode intra-prediction
• In-loop de-blocking filter
• UVLC (Universal Variable Length Coding)
• NAL (Network Abstraction Layer)
• SP-slices
Profiles and Levels
• Profiles: Baseline, Main, and X
– Baseline: Progressive, Videoconferencing &
Wireless
– Main: esp. Broadcast
– X: Mobile network
• Baseline profile is the minimum implementation
– Without CABAC, 1/8 MC, B-frame, SP-slices
• 11 levels
– Resolution, capability, bit rate, buffer, number of reference frames
– Built to match popular international production and
emission formats
– From QCIF to D-Cinema
Basic Macroblock Coding Structure
[Figure: block diagram of the hybrid coder. Input video signal → split into macroblocks (16x16 pixels) → transform/scaling/quantization → entropy coding. The embedded decoder loop contains scaling & inverse transform, de-blocking filter, intra-frame prediction, motion compensation and motion estimation, with an intra/inter switch under coder control. Outputs: control data, quantized transform coefficients, motion data, output video signal]
Variable block size
• A fixed block size may not suit all moving objects
– Improves the flexibility of block matching
– Reduces the matching error
• 7 types of blocks for selection
– 16×16, 16×8, 8×16, 8×8, 8×4, 4×8, 4×4
Multiple Reference Frames
• The neighboring frames are not always the most similar ones
• A B-frame can be used as a reference frame
– A B-frame is close to the target frame in many situations
Spatial Prediction for Intra-Coded MBs
• luma
- 4x4: 9 modes
- 16x16: 4 modes
• chroma
- 8x8: 4 modes
- The same prediction mode is always applied to
both chroma blocks
[Figure: 4x4 prediction from the neighboring samples A-H (above), I-L (left) and M (above-left), including a mean (DC) mode; 16x16 prediction from the row H above and the column V to the left, including Mean(H, V)]
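Three of the nine 4x4 luma modes are simple enough to sketch: vertical, horizontal and DC. The sample names follow the figure (A-D above the block, I-L to its left). The mode choice by lowest sum of absolute differences is an assumption made for illustration; encoders are free to pick modes however they like, and the function names are hypothetical.

```python
def predict_4x4(mode, above, left):
    """Build a 4x4 prediction from above = [A, B, C, D] and left = [I, J, K, L]."""
    if mode == "vertical":      # each column repeats the sample above it
        return [list(above) for _ in range(4)]
    if mode == "horizontal":    # each row repeats the sample to its left
        return [[left[row]] * 4 for row in range(4)]
    if mode == "dc":            # rounded mean of the eight neighbors
        dc = (sum(above) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError(f"unsupported mode: {mode}")


def best_mode(block, above, left, modes=("vertical", "horizontal", "dc")):
    """Pick the mode whose prediction has the smallest SAD against the block."""
    def cost(mode):
        prediction = predict_4x4(mode, above, left)
        return sum(abs(block[r][c] - prediction[r][c])
                   for r in range(4) for c in range(4))
    return min(modes, key=cost)


above = [10, 20, 30, 40]
left = [50, 50, 50, 50]
block = [[10, 20, 30, 40] for _ in range(4)]   # vertical stripes

print(predict_4x4("dc", above, left)[0])  # [38, 38, 38, 38]
print(best_mode(block, above, left))      # vertical
```

Only the chosen mode and the (small) prediction residual then need to be coded for the block.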
Deblocking filter
• Picture is filtered using an adaptive deblocking filter.
• The filter removes visible block structures on the
edges of the 4 X 4 blocks caused by block-based
transform coding and motion estimation
Deblocking Filters
• A boundary-strength (BS) parameter is assigned to every 4×4 block
– BS = 0: no filtering
– BS = 1-3: slight filtering
– BS = 4: strong filtering
• Filters only when
– |P0-Q0| < α
– |P1-P0| < β
– |Q1-Q0| < β
• Thresholds α and β depend on the average quantization parameter (QP)
• The deblocking filtering accounts for 1/3 of the computational complexity of a decoder.
• Block modes and conditions → BS
– One of the blocks is intra-coded and the edge is a MB edge: 4
– One of the blocks is intra-coded: 3
– One of the blocks has coded residuals: 2
– Difference of block motion ≥ one luma sample distance: 1
– Motion compensation from different reference frames: 1
– Else: 0
• Samples across the edge: P3 P2 P1 P0 | Q0 Q1 Q2 Q3
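The decision logic on this slide can be written down compactly. The sketch below only covers the BS assignment and the filter on/off test; the actual filter taps are omitted. The α and β values in the example are hypothetical (in the standard they come from QP-indexed tables), and the function and argument names are illustrative.

```python
def boundary_strength(intra, mb_edge, coded_residual,
                      motion_diff_ge_one_sample, different_refs):
    """Map the block modes and conditions listed above to a BS value."""
    if intra and mb_edge:
        return 4
    if intra:
        return 3
    if coded_residual:
        return 2
    if motion_diff_ge_one_sample or different_refs:
        return 1
    return 0


def should_filter(p1, p0, q0, q1, bs, alpha, beta):
    """Filter only weak steps across the edge; strong steps are kept as
    real image edges."""
    return (bs > 0
            and abs(p0 - q0) < alpha
            and abs(p1 - p0) < beta
            and abs(q1 - q0) < beta)


ALPHA, BETA = 10, 4   # hypothetical thresholds for one QP

bs = boundary_strength(intra=False, mb_edge=True, coded_residual=True,
                       motion_diff_ge_one_sample=False, different_refs=False)
print(bs)                                                  # 2
print(should_filter(100, 101, 104, 105, bs, ALPHA, BETA))  # True: small blocking step
print(should_filter(100, 101, 160, 161, bs, ALPHA, BETA))  # False: genuine edge
```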
SP and SI-Frame Design
• SP and SI-frames
– allow identical reconstruction when coded using different
references
– Subtract the reference in the coder and add it back in the
decoder
• Bitstream switching
– In previous coding standards, perfect (mismatch-free) switching only happens at Intra-frames.
• Other applications
– Bitstream splicing
– Error recovery/resilience
– Video redundancy coding
[Figure: switching between two streams at time n. Stream 1: P1,n-2, P1,n-1, P1,n, P1,n+1, P1,n+2. Stream 2: P2,n-2, P2,n-1, SP2,n, P2,n+1, P2,n+2. Switching picture: SP12,n]
Transformation
• H.264 employs a 4×4 integer transform
• The transform is an approximation of the DCT
– It has a coding gain similar to that of the DCT.
– Since the integer transform has an exact inverse operation, there is no mismatch between the encoder and the decoder, which was a problem in earlier DCT-based codecs
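The exact-inverse property can be demonstrated with the 4×4 core matrix of the H.264 forward transform. The sketch below is a simplification: it inverts the transform with exact rational scaling by the row norms, whereas the standard folds the scaling into quantization and uses a closely related integer inverse. The helper names and the sample block are illustrative assumptions.

```python
from fractions import Fraction

# Core matrix of the 4x4 forward integer transform.
CF = [[1, 1, 1, 1],
      [2, 1, -1, -2],
      [1, -1, -1, 1],
      [1, -2, 2, -1]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def forward(x):
    """Y = CF * X * CF^T, using integer additions and multiplications only."""
    return matmul(matmul(CF, x), transpose(CF))


def inverse(y):
    """Exact inverse: the rows of CF are orthogonal with squared norms
    4, 10, 4, 10, so X = CF^T * (Y scaled by the norms) * CF."""
    norms = [sum(v * v for v in row) for row in CF]
    scaled = [[Fraction(y[i][j], norms[i] * norms[j]) for j in range(4)]
              for i in range(4)]
    return matmul(matmul(transpose(CF), scaled), CF)


x = [[5, 11, 8, 10],
     [9, 8, 4, 12],
     [1, 10, 11, 4],
     [19, 6, 15, 7]]

y = forward(x)
print(y[0][0])          # 140, the sum of all samples
print(inverse(y) == x)  # True: no rounding, hence no encoder/decoder drift
```

A floating-point DCT, by contrast, can round differently on different implementations, which is the mismatch the slide refers to.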
Network friendliness
• H.264 structure
– Video coding layer (VCL)
– Network abstraction layer (NAL)
Scope of H.264 standard
H.264 Over IP
• Network Abstraction Layer Unit (NALU)
– A byte stream of variable length
– 1-byte header
• NALU type (T)
• NALU importance (R)
• Error indication (F)
• RTP packetization
– Simple packetization
• One NALU in one RTP packet
• The NALU header doubles as the RTP payload header
– NALU fragmentation
– NALU aggregation
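The 1-byte NALU header can be unpacked with a few bit operations. The sketch assumes the H.264 bit layout of 1 bit F, then 2 bits R, then 5 bits T, from most to least significant bit; the function name and dictionary keys are illustrative.

```python
def parse_nalu_header(first_byte):
    """Split the first byte of a NALU into its three header fields."""
    return {
        "F": (first_byte >> 7) & 0x01,  # error indication, 1 bit
        "R": (first_byte >> 5) & 0x03,  # importance of the NALU, 2 bits
        "T": first_byte & 0x1F,         # NALU type, 5 bits
    }


print(parse_nalu_header(0x67))  # {'F': 0, 'R': 3, 'T': 7}
print(parse_nalu_header(0x41))  # {'F': 0, 'R': 2, 'T': 1}
```

A network element can use R to decide which packets to drop first under congestion without parsing the video payload.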
OSI/RM layers with protocols and specifications for H.264
• Application layer: RTP (Real-Time Transport Protocol)
– Header size: IP/UDP/RTP = 20 + 8 + 12 = 40 bytes
– Media-unaware RTP payload specifications to reduce the loss rates observed by the decoder: packet duplication, packet-based FEC, audio redundancy coding
– Control protocols: H.245, SIP (Session Initiation Protocol), SDP (Session Description Protocol), RTSP (Real-Time Streaming Protocol)
• Presentation layer
• Session layer
• Transport layer: UDP (User Datagram Protocol)
• Network layer: IP (best-effort service)
[Figure: NALU header with fields F, R and T]
Note: the IP header is 20 bytes in size and protected by a checksum. The payload is not protected.
Comparison
H265 outlook
• About half the bit rate of H264 at comparable quality
• Tree-structured prediction and residual difference
block segmentation
• Extended prediction block sizes (up to 64x64)
• Tile and slice picture segmentations for loss
resilience and parallelism
• Wavefront processing structure for decoder
parallelism
• Mode-dependent sine/cosine transform type
switching
• Adaptive motion vector predictor selection
• Temporal motion vector prediction
3D video coding
• Left and right eye view
• Depth sensation
• Resolving 2D viewing ambiguity
• Additional features:
• Free view points
• Depth-controlled
object insertion
Multiview Frame Structure
[Figure: grid of frames 1 to 7 along the time axis, one row per view]
Predictions based on H.264/AVC JM95
Homework 1
• Download the open-source tool x264 from the VideoLAN website
• Capture a video sequence via webcam or from the Internet
• Experiment with FFmpeg to encode and transcode the video sequence with different standards (MPEG-2, MPEG-4, H.263, H.264, etc.) and parameters
• Play back the encoded video and comment on the results
• Package the encoded video sequence in the MP4 format
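As a possible starting point for the transcoding step, the sketch below builds FFmpeg command lines from Python. It assumes an `ffmpeg` binary on the PATH that was built with the listed encoders (`libx264` in particular is optional in FFmpeg builds); the file names, the default bit rate and the helper names are hypothetical. Note that FFmpeg's `h263` encoder only accepts a few picture sizes such as 352x288, so the input may need rescaling, and not every codec fits every container.

```python
import subprocess

# FFmpeg encoder names for the standards named in the homework.
ENCODERS = {
    "mpeg2": "mpeg2video",
    "mpeg4": "mpeg4",
    "h263": "h263",
    "h264": "libx264",
}


def ffmpeg_command(src, dst, standard, bitrate="500k"):
    """Build the argument list for one encode at a target video bit rate."""
    return ["ffmpeg", "-y", "-i", src,
            "-c:v", ENCODERS[standard], "-b:v", bitrate, dst]


def transcode(src, dst, standard, bitrate="500k"):
    """Run FFmpeg; raises CalledProcessError if the encode fails."""
    subprocess.run(ffmpeg_command(src, dst, standard, bitrate), check=True)


# Hypothetical file names; print the command instead of running it.
print(" ".join(ffmpeg_command("capture.avi", "out_h264.mp4", "h264")))
# ffmpeg -y -i capture.avi -c:v libx264 -b:v 500k out_h264.mp4
```

Encoding the same clip at the same bit rate with each entry of `ENCODERS` gives a direct basis for the playback comparison.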
Homework 2
• Draw decoding diagrams for MPEG1, MPEG2,
MPEG4, H264 and 3D
Future development
• Future coding/presentation standards:
– H265, VC-1, VC-2
– MPEG-21, MHEG
• Computer vision
– Game
– Graphics
• Multimedia retrieval
– Segmentation
– Search (Google)
• Multi-camera system
– 3D cinema
– Realistic broadcasting