“Evalvid-RA”: Simulation of rate adaptive video
TTM4142 Networked Multimedia Systems
Arne Lie, SINTEF ICT
November 6, 2008
6 Nov. 2008TTM4142:: Arne Lie, Evalvid-RA2
Overview
• Why Evalvid-RA
• How to compress video
• How to simulate video transmission
• How to simulate rate adaptive video
• Evalvid-RA architecture
• How to use Evalvid-RA
Objectives
• Congestion control for media: highest possible perceived quality!
– Avoid persistent long queues
  • low latency (media sender, network queues, media receiver)
  • low drop probability
– Bandwidth:
  • Fair bandwidth share
  • Avoid unnecessarily large rate reductions
  • Grab available excess bandwidth
• Network simulation of media: requirements
– Run traffic with the right characteristics
  • Use source models, or
  • Use trace driven simulation (i.e. genuine video traffic)
– Perceived quality: need real media!
  • Evalvid tool-set
  • But we need “online” rate adaptive trace simulations
Example with congestion
[Figure: link utilization vs. time (2s to 10s) on a link of capacity 15Mbps, with the 100% level marked; MPEG-4 “Foreman” at ~700kbps and three flows of ~6Mbps each]
The throughput using best effort Internet
[Figure: link utilization vs. time (2s to 10s) on a link of capacity 15Mbps, with the 100% level marked; MPEG-4 “Foreman” still at ~700kbps; flows labelled ~6Mbps and ~4.8Mbps]
Only 15/18.7 = 80.2% of the packets can survive once congestion sets in: about 20% packet loss for all flows!
Or reduce the sending rates by about 20%.
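A minimal sketch of the arithmetic above, assuming (as the slide does) that loss is spread evenly over all flows. The function name and the list of offered rates are illustrative, taken from the figure labels.

```python
def surviving_fraction(capacity_mbps, offered_mbps):
    """Fraction of the offered traffic that fits on the link (1.0 if uncongested)."""
    total = sum(offered_mbps)
    return min(1.0, capacity_mbps / total)


# Three flows of ~6 Mbps plus the ~0.7 Mbps "Foreman" stream on a 15 Mbps link.
flows = [6.0, 6.0, 6.0, 0.7]
frac = surviving_fraction(15.0, flows)

print(f"offered load      : {sum(flows):.1f} Mbps")
print(f"surviving fraction: {frac:.3f}")
# Throughput per flow if every flow loses the same share of its packets.
print([round(rate * frac, 2) for rate in flows])
```

The alternative on the slide, rate adaptation, corresponds to each sender scaling its own rate by the same fraction instead of letting the queue drop packets.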
Comparison of TailDrop, P-AQM and P-AQM with “ECF CC”
[Figure: PSNR [dB] vs. frame number (0 to 300) for TailDrop, P-AQM, and P-AQM with ECF CC]
[Figure: throughput [Mbit/s] vs. time [s] (0 to 20), three panels: Bandwidth share tailDrop example; Bandwidth share P-AQM; Bandwidth share P-AQM w/ECF CC]
Main challenges
• Encoding/decoding of video is CPU demanding
• We want to be able to simulate multiple video traffic flows in mixed traffic scenarios on a single computer!
– How to keep complexity low?
• We want to be able to play the resulting video so that perceptual quality can be determined
– How to avoid “online” encoding/decoding?
2005: What was available
• Trace driven simulation needs trace files from real sources, e.g.
– http://www-tkn.ee.tu-berlin.de/research/trace/ltvt.html
– Only the frame SIZES and timing are used, not the content
• or synthetic traffic that models real traffic very closely
– e.g. GenSyn http://www.item.ntnu.no/~poulh/GenSyn/gensyn.html
• Evalvid tools from http://www.tkn.tu-berlin.de/research/evalvid/
– real traces, and media is re-assembled after network simulation for visual inspection and PSNR calculation (Jirka Klaue)
• Evalvid interface to ns-2 (Ke Chih-Heng)
– http://hpds.ee.ncku.edu.tw/~smallko/ns2/Evalvid_in_NS2.htm
• but rate adaptive media will change depending on network state…
Video coding
• Intra-frames (I-frames, key-frames)
– self-contained still images
– used at scene changes
• Predicted-frames (P-frames)
– use motion estimation
• Bidirectional frames (B-frames)
– use motion estimation both forward and backward in time
– must be relative to an anchor picture (I- or P-frame)
[Figure: a GOP (group of pictures)]
Hybrid encoding: transform (spatial) + prediction (time)
Quantization of 8x8 pixel block
[Figure: an 8x8 pixel block, the quantization steps (SQ), and the block after quantization is performed]
Motion vector, prediction error
• Source: Eckehard Steinbach: Internet Media Streaming
Scalable media
• change either (video / audio)
– frame rate / sample rate (temporal)
– frame size / sample size (spatial)
– compression quantization Q (quality)
  • = quantiser_scale in MPEG-4
• or a combination
• Most players/decoders don’t respond (correctly) to changes in frame size and frame rate, so changing the Q-value (= quantiser_scale) is easiest
– the Q-value normally changes each frame, or even each macro block (video)
– but how to avoid doing this “live” in the network simulation?
Differences between live adaptation and pre-stored media with adaptation possibilities (scalable video coding)
Rate controllers vary Q
• Adjust the output rate according to a bit rate budget, with constraints on the time average and on the variability
– leaky bucket
• CBR: constant bit rate
– each GOP has the same number of bits (or bit/s)
– Q changes from macro block to macro block
– Cost: algorithmic delay, variable quality
• VBR: variable bit rate
– allows for more variability
– Q changes less: more stable quality
• Quality based (“VBR open loop”, constant Q)
– rate totally dependent on content
VBR open loop @ Q=2
[Figure: I & P-frame sizes (bytes, up to 5 x 10^4) vs. frame number (0 to 2000), with I-frames marked]
[Figure: GOP rate (bit/s, up to 7 x 10^6) vs. time (0 to 70 s)]
Objective quality at Q=2: PSNR
• Constant Q gives ~ constant quality
[Figure: PSNR (dB, 0 to 50) vs. frame number (up to 2000)]
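For reference, a minimal sketch of how PSNR is commonly computed for 8-bit samples, PSNR = 10·log10(255²/MSE). Which colour component psnr.exe evaluates is not stated on the slides, so the function below simply compares two flat lists of sample values; the name `psnr` and the sample data are illustrative.

```python
import math


def psnr(original, decoded, peak=255):
    """PSNR in dB between two equally long sequences of sample values."""
    if len(original) != len(decoded) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all samples of the frame.
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak ** 2 / mse)


# Toy "frames" of four samples each (made-up values).
reference = [16, 128, 200, 235]
received = [17, 127, 202, 235]
print(f"{psnr(reference, received):.2f} dB")
```

Computing this once per frame gives a curve like the one on this slide.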
Rate dependence on Q
[Figure: GOP rate (bit/s, up to 7 x 10^6) vs. time (0 to 70 s) for Q=2, Q=3 and Q=4]
[Figure: average bit rate (kbit/s, 0 to 2000) vs. Q-value (2 to 30) for Aha_you_are.mov encoded with ffmpeg, together with the approximation f(q)]
Rate controller objectives
• Limit the rate fluctuations and meet an average rate constraint, by varying the quantization value Q
– at each macro block,
– at each frame,
– or at each GOP
• If congestion control is applied
– the rate controller must have an adaptable average rate constraint!
– Problem: the rate controller must run at simulation time!
No rate controller (VBR open loop)
• Bit rate too variable to control
• Has Long Range Dependence (LRD)
[Figure: panels labelled News, Football, Akiyo, Stefan, Paris]
Rate controller (VBR constrained)
[Figure: Concat MPEG, r=600kbit/s; three panels: bytes/frame vs. frame No. (0 to 2000), bytes/GOP vs. GOP No. (0 to 160), q-scale vs. frame No. (0 to 2000)]
Adaptive rate controller (VBR constrained)
• Red line: no adaptive rate control
• Blue line: adaptive rate control reduces the bit rate at ~40 seconds
[Figure: Inconvenient truth example; three panels vs. time (0 to 400 s): bytes/frame, bytes/GOP, q-scale]
Quality of received video
• PSNR of video flows examined (with delay constraints)
– P-AQM with highest score, and with Statistical Multiplexing Gain
– TFRC gains from running over networks with AQM
[Figure: results under a 150ms delay constraint; TFRC 1: RED w/ ECN, TFRC 2: RED w/ dropping, TFRC 3: FIFO]
32 videos @ 150 ms e2e delay constraint, 32 FTP, Web traffic
[Figure: 32 video flows; panels labelled Original quality (600 kbit/s), TFRC supported adaptation, and P-AQM supported adaptation, each adaptation panel marked 600 kbit/s, 400 kbit/s and “Adapt to 400 kbit/s”]
How to avoid having “online” encoder to follow the adaptive feedback
• CBR changes Q at macro block granularity
– too detailed for frame size trace files!
• VBR changes Q at frame or GOP granularity
– Yes!
• “SVBR” (shaped VBR) by Hamdi/Roberts/Rolin ’97
– change Q at GOP scale to constrain video to an LB(r,b) constraint
  • r: average video rate (= leaky bucket rate)
  • b: bucket size (to allow variability)
– very simple, no extra delay
– my modification: variable r
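To make the LB(r,b) constraint concrete, here is a minimal sketch of a generic leaky bucket tracked at GOP scale. It is not Hamdi's SVBR algorithm itself, only the textbook bucket recursion; the function name, the 0.4 s GOP duration (12 frames at 30 frame/s) and the GOP sizes are assumptions for illustration.

```python
def bucket_levels(gop_bits, r, b, gop_duration):
    """Track leaky bucket fullness (bits) after each GOP.

    gop_bits: bits produced per GOP; r: drain rate in bit/s;
    b: bucket size in bits; gop_duration: seconds per GOP.
    Returns (levels, conforming).
    """
    drain = r * gop_duration          # bits removed from the bucket per GOP
    fullness = 0.0
    levels = []
    conforming = True
    for bits in gop_bits:
        # The bucket fills with the GOP's bits and drains at rate r.
        fullness = max(0.0, fullness + bits - drain)
        if fullness > b:
            conforming = False        # LB(r, b) violated at this GOP
        levels.append(fullness)
    return levels, conforming


# Illustrative numbers: r = 600 kbit/s, b = 100 kbit, made-up GOP sizes.
levels, conforming = bucket_levels(
    [240_000, 300_000, 200_000, 240_000],
    r=600_000, b=100_000, gop_duration=0.4)
print(levels, conforming)
```

With a variable r, as in the modification mentioned above, `r` would simply be updated between GOPs whenever feedback arrives.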
Hamdi’s SVBR leaky bucket controller
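[Figure: block diagram with encoder, packetizer, a leaky bucket labelled r, b and x, a block “calc. Q next GOP” with output Q, and the path “to network”]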
How to map r to Q
• rate x Q product is almost independent of Q
– dependent only on content complexity
• at the start of any new GOP, assuming complexity changes smoothly from GOP to GOP
– r (bits/s) → R(k+1) (bits/GOP) using a formula (PhD Thesis)
– Q(k+1) = R(k)*Q(k)/R(k+1) (if video complexity does not change)
– for stored media, next GOP complexity is known a priori
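The update rule Q(k+1) = R(k)*Q(k)/R(k+1) can be sketched as below. The conversion from r to the GOP budget R(k+1) is given in the thesis and not on the slide, so the budget is passed in directly; rounding and clamping to the Q range 2 to 31 are assumptions of this sketch, as is the function name.

```python
def next_q(q_k, bits_k, target_bits_next, q_min=2, q_max=31):
    """Pick Q for the next GOP, assuming rate x Q stays constant.

    q_k: Q used for GOP k; bits_k: bits produced by GOP k (R(k));
    target_bits_next: bit budget for GOP k+1 (R(k+1)).
    """
    if target_bits_next <= 0:
        raise ValueError("target_bits_next must be positive")
    q = bits_k * q_k / target_bits_next
    # Q must be an integer within the range the encoder supports.
    return max(q_min, min(q_max, round(q)))


# GOP k used Q=4 and produced 300 kbit; the budget for GOP k+1 is 240 kbit,
# so Q has to increase (coarser quantization, fewer bits).
print(next_q(4, 300_000, 240_000))
```

A larger budget lowers Q (better quality) until the lower limit is reached; a smaller budget raises it until the upper limit is reached.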
Pseudo code from rate adapt algorithm
Evalvid-RA solutions
• Multiple trace files
– one per Q-value
– Q=[2,3,4,…,31] (ffmpeg)
• make SVBR calculate Q(k+1): select GOP(k+1) trace
• This requires fixed GOP sizes!
• LB(r,b) parameters change at feedback event
– but the new Q-value is not used before start of next GOP
• Received video file must be assembled
– using trace of actual Q(i)-values used, and
– multiple *.m4v files
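A minimal sketch of the two ideas on this slide: choosing which pre-encoded GOP to send, and recording which file each GOP must later be taken from. The selection rule (lowest Q whose GOP fits the budget) is one plausible choice and not necessarily the rule Evalvid-RA uses; the table values, function names and the `video_Q<q>.m4v` naming pattern are assumptions for illustration.

```python
def pick_q(gop_sizes, k_next, target_bits):
    """Return the lowest Q (best quality) whose GOP k_next fits the budget.

    gop_sizes maps Q -> list of GOP sizes in bits (same GOP count for every Q,
    which is why fixed GOP sizes are required).
    """
    for q in sorted(gop_sizes):
        if gop_sizes[q][k_next] <= target_bits:
            return q
    return max(gop_sizes)  # nothing fits: fall back to the coarsest variant


def assembly_plan(q_used):
    """List (GOP index, source file) pairs for rebuilding the received video."""
    return [(k, f"video_Q{q}.m4v") for k, q in enumerate(q_used)]


# Toy table with three Q variants and two GOPs (made-up sizes in bits).
table = {2: [900, 800], 3: [600, 500], 4: [450, 380]}
q_trace = [pick_q(table, 0, 1000), pick_q(table, 1, 500)]
print(q_trace)
print(assembly_plan(q_trace))
```

Because the stored traces already contain the size of GOP(k+1) at every Q, the choice needs no encoder run during the simulation.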
Long Range Dependence (LRD)
• Garrett & Willinger ’94: VBR video traffic is self-similar
– the autocorrelation function ρ(k) decays slowly with increasing lag
– makes buffer dimensioning & high link utilization very difficult
– The cause of LRD: scene complexity changes!
• many papers on video characterization (GOP scale, frame scale)
– very little related to what kind of rate controller is in use!
• Hamdi showed in his thesis that
– a stream satisfying an LB(r,b) constraint, where r equals the traffic average rate, is not self-similar
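A minimal sketch of how the sample autocorrelation of a GOP-size series can be estimated, using the common biased estimator normalised by the lag-0 term. The function name and the toy series are illustrative; a real analysis would use the GOP sizes from a trace file.

```python
def autocorrelation(x, max_lag):
    """Sample autocorrelation of x for lags 0..max_lag (biased estimator)."""
    n = len(x)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the series length")
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    if var == 0:
        raise ValueError("series is constant; autocorrelation undefined")
    return [
        sum((x[i] - mean) * (x[i + k] - mean) for i in range(n - k)) / var
        for k in range(max_lag + 1)
    ]


# Toy series: a slowly changing level, standing in for scene complexity.
series = [10, 11, 12, 12, 11, 10, 4, 3, 3, 4, 5, 4]
rho = autocorrelation(series, 4)
print([round(r, 2) for r in rho])
```

For a long range dependent source the values stay well above zero over many lags, whereas a shaped stream is expected to decay quickly.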
Rate controllers limit the rate variance…
[Figure: GOP rate (bit/s, up to 7 x 10^6) vs. time (0 to 70 s) for Q=2, Q=3 and Q=4]
[Figure: GOP rate (bit/s, up to 2.5 x 10^6) vs. time (0 to 70 s) with no a priori information and with a priori information]
ρ(k) of “concatenated” video (GOP)
• Positive correlation at lag k produces long bursts of time duration k
[Figure: Autocorrelation GOP scale vs. lags (-20 to 160) for A priori, Non-a priori, and Q=8 open loop]
[Figure: Autocovariance GOP scale (x 10^10) vs. lags (-20 to 160) for A priori, Non-a priori, and Q=8 open loop]
Evalvid-RA: overview
The tool-set overview: Pre-process
• ffmpeg -s cif -r 30 -i video.yuv -vcodec mpeg4 -4mv -g 12 -flags sgop -sc_threshold 20000 -qscale 8 -s cif -r 30 -y video_Q12.m4v
• mp4.exe -send <IP address> <port #> <MTU> <fps> video_Q12.m4v > st_video_Q12.txt
[Figure: pre-process once (shell script): original video source, e.g. video_orig.yuv (*.yuv, *.mov, *.mp4, ...) → ffmpeg.exe → video_Q*.m4v for Q=[2..31] (30 MPEG-4 compressed video rate variants) → Evalvid mp4.exe → st_*.txt for Q = 2, 3, …, 31 (30 different possible frame traces: [No. frame_size type])]
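The pre-process runs the encoder once per Q-value. As a sketch, the 30 command lines could be generated as below. The ffmpeg options are copied from this slide and date from 2008; current ffmpeg builds may not accept all of them. Using the same Q in `-qscale` and in the output file name is an assumption of this sketch, and nothing is executed here.

```python
def ffmpeg_command(q, source="video.yuv", prefix="video_Q"):
    """Build the encoder command line for one Q-value (options as on the slide)."""
    return [
        "ffmpeg", "-s", "cif", "-r", "30", "-i", source,
        "-vcodec", "mpeg4", "-4mv", "-g", "12", "-flags", "sgop",
        "-sc_threshold", "20000", "-qscale", str(q),
        "-s", "cif", "-r", "30", "-y", f"{prefix}{q}.m4v",
    ]


# One command per Q in [2..31], i.e. 30 rate variants of the same video.
commands = [ffmpeg_command(q) for q in range(2, 32)]

# Print the first two command lines for inspection only.
for cmd in commands[:2]:
    print(" ".join(cmd))
```

Each resulting *.m4v file would then be passed through mp4.exe to obtain the matching st_*.txt frame trace.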
The tool-set overview (cont.): Network Simulation
• ns-2: evalvid_rateadapt.tcl
– modified ns-2 interface to “Evalvid-RA” & adaptive SVBR responding to P-AQM feedback
– Tcl init-function makes
  • video2.dat (frames all Q)
    – read into memory
    – used by all nodes sending the same media (different timing)
  • gop_size.dat (GOP size all Q)
    – used by et_ra.exe
– sd_be_* stores e.g. actual Q used
[Figure: the Tcl pre-process runs through all st_*.txt input files and generates the media matrix at frame and GOP scale (video2.dat and gop_size.dat, held in RAM); the ns-2 simulation writes sd_be_5 and sd_be_7 (actual frame traces used at packet level: [Time packet_size type Q]) and rd_be_6 and rd_be_8 (actual packets received: [Time packet_size type or Loss])]
The multi-rate trace file (multi Q)
• video2.dat:
<time s> <bytes Q=2> <type> <MTU> <Q=3> <Q=4>… <Q=31>
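A minimal sketch of reading one line in the layout shown above, assuming whitespace-separated fields in exactly that order (time, size at Q=2, frame type, MTU, then sizes for Q=3 upwards). The real file may differ in detail; the function name and the sample line are made up for illustration.

```python
def parse_video2_line(line, q_min=2, q_max=31):
    """Split one multi-rate trace line into (time_s, frame_type, mtu, sizes).

    sizes maps each Q-value to the frame size in bytes at that Q.
    """
    fields = line.split()
    expected = 4 + (q_max - q_min)   # time, size(q_min), type, MTU, other sizes
    if len(fields) != expected:
        raise ValueError(f"expected {expected} fields, got {len(fields)}")
    time_s = float(fields[0])
    frame_type = fields[2]
    mtu = int(fields[3])
    sizes = {q_min: int(fields[1])}
    for offset, q in enumerate(range(q_min + 1, q_max + 1)):
        sizes[q] = int(fields[4 + offset])
    return time_s, frame_type, mtu, sizes


# Shortened example with Q = 2..4 only (made-up numbers).
time_s, frame_type, mtu, sizes = parse_video2_line(
    "0.033 2900 I 1000 2100 1700", q_max=4)
print(time_s, frame_type, mtu, sizes)
```

Holding all sizes of a frame on one line is what lets the sender switch Q at a GOP boundary with a simple table lookup.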
The tool-set overview (cont.): Post-process
• et_ra.exe
– modified Evalvid original et.exe
– Reads packet Tx and Rx trace files
– Finds used Q
– Reads video2.dat for frame sizes and types
– Reads gop_size.dat to assist assembling the resulting MPEG-4 file
• ffmpeg to decode to YUV
• fixyuv_ra.exe: takes e2e delivery time constraints into account in the resulting video file (*.yuv)
• psnr.exe: compares decoded YUV to original
Post-processes:
et_ra sd_be_5 rd_be_6 video2.dat video_Q 2 31 gop_size.dat video_received.m4v
ffmpeg -i video_received.m4v -vcodec rawvideo video_received.yuv
fixyuv sd_be_5 rd_be_6 new_st.txt video_received.yuv fixed_packetloss.yuv
psnr 352 288 420 video_orig.yuv fixed_packetloss.yuv
et_ra.exe (Evaluate Trace, rate Adaptive)
• original et.exe:
• et_ra.exe:
[Figure: for the original et.exe, a single sequence GOP1 to GOP5, with GOP1 expanded into frames 1 to 5 and frame 1 into packets 1 to 3; for et_ra.exe, several parallel sequences GOP1 to GOP5]
Limitations of this implementation
• GOP time scale rate adaptation
– Hamdi confirms that SVBR could be modified to frame scale
• Fixed GOP size
– live encoders could start a new GOP (i.e. next frame being I-frame) at a feedback event!
– relaxation will make distortions?
• error concealment (packet loss)
– FRAME mode vs. PACKET mode considerations
– ffmpeg drops first frame after frame marked with “loss”
• No audio yet
– limitation in mp4.exe tool
Research usage of Evalvid-RA
• Simulate many flows, coming from many sources, all of them rate adaptive
• Have different media sources, not only one
• Wireless rate adaptive multimedia
• Different congestion control algorithms
– Self-limited sources and their actual bandwidth
– friendliness (towards TCP, UDP, DCCP, etc.)
• Different queuing systems (e.g. FIFO, AQMs, QoS/DiffServ)
• Investigate the removal of LRD, or not?
• trade latency for loss (short queues)
• how to inject new flows
• new initiatives for rate adaptation incentives
• vary the source rates
• …
Evalvid-RA lab
• Software
– Windows/Cygwin (Linux on Win32)
– ns-2 w/ Evalvid-RA
– ffmpeg & video inspection programs
• Script files
• The pre-process is already performed
• To do:
– Modify TCL script / run ns-2 simulation with selected parameters
– Run post-process and inspect video quality / statistics