
“Evalvid-RA”: Simulation of rate adaptive video

TTM4142 Networked Multimedia Systems

Arne Lie, SINTEF ICT

November 6, 2008


Overview

• Why Evalvid-RA
• How to compress video
• How to simulate video transmission
• How to simulate rate adaptive video
• Evalvid-RA architecture
• How to use Evalvid-RA


Objectives

• Congestion control for media: highest possible perceived quality!
  – Avoid persistent long queues
    • low latency (media sender, network queues, media receiver)
    • low drop probability
  – Bandwidth:
    • Fair bandwidth
    • Avoid unnecessarily large rate reduction
    • Grab available excess bandwidth
• Network simulation of media: requirements
  – Run traffic with the right characteristics
    • Use source models, or
    • Use trace driven simulation (i.e. genuine video traffic)
  – Perceived quality: need real media!
    • Evalvid tool-set
    • But we need “online” rate adaptive trace simulations


Example with congestion

[Figure: link utilization (up to 100%) vs. time (2 s to 10 s) on a link with capacity 15 Mbps; one MPEG-4 “Foreman” flow at ~700 kbps and three flows at ~6 Mbps each]


The throughput using best effort Internet

[Figure: link utilization (up to 100%) vs. time (2 s to 10 s) on a link with capacity 15 Mbps; MPEG-4 “Foreman” still at ~700 kbps, other flows labelled ~6 Mbps and ~4.8 Mbps]

Only 15/18.7 = 80.2% of the packets can survive once congestion sets in: about 20% packet loss for all flows!

Or adapt the rate by 20%
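
The 15/18.7 figure can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the function name is made up, and it treats all four flows alike (offered load 0.7 + 3 x 6 = 18.7 Mbps), which is a simplification of what the slide shows.

```python
def survival_fraction(capacity_mbps, flow_rates_mbps):
    """Fraction of offered traffic that fits through the link.

    Assumes losses (or rate reductions) are spread evenly over all
    flows, which is a simplification.
    """
    offered = sum(flow_rates_mbps)
    if offered <= capacity_mbps:
        return 1.0  # no congestion, nothing has to be dropped
    return capacity_mbps / offered


# One "Foreman" flow (~0.7 Mbps) plus three ~6 Mbps flows on a 15 Mbps link.
flows = [0.7, 6.0, 6.0, 6.0]
frac = survival_fraction(15.0, flows)

# Alternative to losing packets: every sender scales its rate by the same factor.
adapted = [rate * frac for rate in flows]

print(f"offered load: {sum(flows):.1f} Mbps")
print(f"surviving fraction: {frac:.3f}")
print("adapted rates (Mbps):", [round(r, 2) for r in adapted])
```

With rate adaptation the same capacity is used, but the reduction is taken in the encoder instead of as random packet loss.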


Comparison of TailDrop, P-AQM and P-AQM with “ECF CC”

[Figure: four panels. PSNR [dB] vs. frame number (0 to 300) for TailDrop, P-AQM, and P-AQM with ECF CC; throughput [Mbit/s] vs. time [s] (0 to 20 s) for “Bandwidth share tailDrop example”, “Bandwidth share P-AQM”, and “Bandwidth share P-AQM w/ECF CC”]


Main challenges

• Encoding/decoding of video is CPU demanding
• We want to be able to simulate multiple video traffic flows in mixed traffic scenarios on a single computer!
  – How to keep complexity low?
• We want to be able to play the resulting video so that perceptual quality can be determined
  – How to avoid “online” encoding/decoding?


2005: What was available

• Trace driven simulation needs trace files from real sources, e.g.
  – http://www-tkn.ee.tu-berlin.de/research/trace/ltvt.html
  – Only the frame sizes and timing are used, not the content
• or synthetic traffic that models real traffic very closely
  – e.g. GenSyn http://www.item.ntnu.no/~poulh/GenSyn/gensyn.html
• Evalvid tools from http://www.tkn.tu-berlin.de/research/evalvid/
  – real traces, and media is re-assembled after network simulation for visual inspection and PSNR calculation (Jirka Klaue)
• Evalvid interface to ns-2 (Ke Chih-Heng)
  – http://hpds.ee.ncku.edu.tw/~smallko/ns2/Evalvid_in_NS2.htm
• but rate adaptive media will change depending on network state…


Video coding

• Intra-frames (key-frames)
  – self-contained still images
  – used at scene changes
• Predicted frames (P-frames)
  – use motion estimation
• Bidirectional frames (B-frames)
  – use motion estimation both forward and backward in time
  – must be relative to an anchor picture (I- or P-frame)

[Figure: a GOP (group of pictures)]


Hybrid encoding: transform (spatial) + prediction (time)


Quantization of 8x8 pixel block

quantization steps (SQ)

after quantization performed


Motion vector, prediction error

• Source: Eckehard Steinbach: Internet Media Streaming


Scalable media

• change either (video / audio)
  – frame rate / sample rate (temporal)
  – frame size / sample size (spatial)
  – compression quantization Q (quality)
    • = quantiser_scale in MPEG-4
  – or a combination
• Most players/decoders don’t respond (correctly) to changes in frame size and frame rate
• Changing the Q-value (= quantiser_scale) is easiest
  – the Q-value normally changes each frame, or even each macro block (video)
  – but how to avoid doing this “live” in the network simulation?

Differences between live adaptation and pre-stored media with adaptation possibilities (scalable video coding)


Rate controllers vary Q

• Adjust the output rate according to a bit rate budget, subject to time-average and variability constraints
  – leaky bucket
• CBR: constant bit rate
  – each GOP has the same number of bits (or bit/s)
  – Q changes from macro block to macro block
  – Cost: algorithmic delay, variable quality
• VBR: variable bit rate
  – allows for more variability
  – Q changes less: more stable quality
• Quality based (“VBR open loop”, constant Q)
  – rate totally dependent on content


VBR open loop @ Q=2

[Figure: two panels. “I & P-frame sizes”: frame size (bytes, scale x 10^4) vs. frame number (0 to 2000), with I-frames marked; GOP rate (bit/s, scale x 10^6) vs. time (0 to 70 s)]


Objective quality at Q=2: PSNR

• Constant Q gives ~ constant quality

[Figure: PSNR (dB, axis 0 to 50) vs. frame number (0 to 2000)]


Rate dependence on Q

[Figure: two panels. GOP rate (bit/s, scale x 10^6) vs. time (0 to 70 s) for Q=2, Q=3 and Q=4; average bit rate (kbit/s, 0 to 2000) vs. Q-value (2 to 30) for Aha_you_are.mov encoded with ffmpeg, together with an approximation f(q)]


Rate controller objectives

• Limit the rate fluctuations and enforce an average rate constraint by varying the quantization value Q
  – at each macro block,
  – at each frame,
  – or at each GOP
• If congestion control is applied
  – the rate controller must have an adaptable average rate constraint!
  – Problem: the rate controller must run at simulation time!


No rate controller (VBR open loop)

• Bit rate too variable to control
• Has Long Range Dependence (LRD)

[Figure: panels labelled News, Football, Akiyo, Stefan, Paris]


Rate controller (VBR constrained)

[Figure: “Concat MPEG, r=600kbit/s”, three panels. bytes/frame (scale x 10^4) vs. frame No. (0 to 2000); bytes/GOP (scale x 10^4) vs. GOP No. (0 to 160); q-scale (axis 20 to 40) vs. frame No. (0 to 2000)]


Adaptive rate controller (VBR constrained)

• Red line: no adaptive rate control
• Blue line: adaptive rate control reduces the bit rate at ~40 seconds

[Figure: “Inconvenient truth example”, three panels over a 0 to 400 axis labelled time (s). bytes/frame (scale x 10^4); bytes/GOP (scale x 10^4); q-scale (axis 10 to 20)]


Quality of received video

• PSNR of video flows examined (with delay constraints)
  – P-AQM with highest score, and with Statistical Multiplexing Gain
  – TFRC gains from running over networks with AQM

[Figure: PSNR results under a 150 ms delay constraint. TFRC 1: RED w/ ECN; TFRC 2: RED w/ dropping; TFRC 3: FIFO]


32 videos @ 150 ms e2e delay constraint, 32 FTP, Web traffic

[Figure: video frames for 32 video flows. Original quality (600 kbit/s) compared with TFRC supported adaptation and P-AQM supported adaptation, each shown at 400 kbit/s and 600 kbit/s, adapting to 400 kbit/s]


How to avoid having “online” encoder to follow the adaptive feedback

• CBR changes Q at macro block granularity
  – too detailed for frame size trace files!
• VBR changes Q at frame or GOP granularity
  – Yes!
• “SVBR” (shaped VBR) by Hamdi/Roberts/Rolin ’97
  – changes Q at GOP scale to constrain video to an LB(r,b) constraint
    • r: average video rate (= leaky bucket rate)
    • b: bucket size (to allow variability)
  – very simple, no extra delay
  – my modification: variable r
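
To make the LB(r,b) constraint concrete, here is a minimal sketch of a textbook leaky bucket check at GOP scale. It is not Hamdi's SVBR algorithm itself: the function name and the choice to express r as bits drained per GOP interval are assumptions made for the illustration.

```python
def leaky_bucket_fullness(gop_bits, r_bits_per_gop, b_bits):
    """Track bucket fullness GOP by GOP and report LB(r, b) conformance.

    gop_bits       -- size of each GOP in bits
    r_bits_per_gop -- bits drained per GOP interval (rate r times GOP duration)
    b_bits         -- bucket size b; larger b tolerates more variability
    """
    fullness = 0.0
    history = []
    conforms = True
    for bits in gop_bits:
        # The bucket fills with the GOP just sent and drains at rate r,
        # but it can never go below empty.
        fullness = max(0.0, fullness + bits - r_bits_per_gop)
        if fullness > b_bits:
            conforms = False  # this GOP pushed the stream outside LB(r, b)
        history.append(fullness)
    return history, conforms


history, ok = leaky_bucket_fullness([100, 150, 50, 100], r_bits_per_gop=100, b_bits=60)
print(history, ok)
```

A rate controller in the spirit of SVBR would use the fullness to choose Q for the next GOP so that the bucket never overflows, rather than only checking after the fact.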


Hamdi’s SVBR leaky bucket controller

[Diagram: encoder and packetizer sending to the network; a leaky bucket labelled r, b and x; a “calc. Q next GOP” block that supplies Q]


How to map r to Q

• rate x Q product is almost independent of Q
  – dependent only on content complexity
• at the start of any new GOP, assuming complexity changes smoothly from GOP to GOP
  – r (bits/s) → R(k+1) (bits/GOP) using a formula (PhD Thesis)
  – Q(k+1) = R(k)*Q(k)/R(k+1) (if video complexity does not change)
  – for stored media, next GOP complexity is known a priori
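
The update rule Q(k+1) = R(k)*Q(k)/R(k+1) can be sketched as below. Two parts are assumptions and not taken from the slides: the conversion from r to a per-GOP budget is a plain "rate times GOP duration" stand-in for the thesis formula, and the result is rounded and clamped to the Q range 2..31 used elsewhere in this deck.

```python
Q_MIN, Q_MAX = 2, 31  # Q range of the ffmpeg traces used in this deck


def gop_budget_bits(r_bits_per_s, gop_frames, fps):
    """Stand-in for the thesis formula: bits available for one GOP at rate r."""
    return r_bits_per_s * gop_frames / fps


def next_q(q_k, r_k_bits, r_next_bits):
    """Q(k+1) = R(k) * Q(k) / R(k+1), assuming unchanged content complexity.

    q_k         -- Q used for GOP k
    r_k_bits    -- bits actually produced by GOP k
    r_next_bits -- bit budget for GOP k+1
    """
    q = q_k * r_k_bits / r_next_bits
    return max(Q_MIN, min(Q_MAX, round(q)))


# GOP k was encoded at Q=4 and produced 300000 bits; the new target is
# 600 kbit/s with 12-frame GOPs at 30 frames/s.
budget = gop_budget_bits(600_000, gop_frames=12, fps=30)
print(budget, next_q(4, 300_000, budget))
```

The rule relies on the rate x Q product staying roughly constant, so it is only as good as the assumption that complexity changes smoothly between consecutive GOPs.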


Pseudo code from rate adapt algorithm


Evalvid-RA solutions

• Multiple trace files
  – one per Q-value
  – Q=[2,3,4,…,31] (ffmpeg)
• make SVBR calculate Q(k+1): select GOP(k+1) trace
• This requires fixed GOP sizes!
• LB(r,b) parameters change at feedback event
  – but the new Q-value is not used before the start of the next GOP
• Received video file must be assembled
  – using the trace of actual Q(i)-values used, and
  – multiple *.m4v files
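
Because every GOP exists in one pre-encoded variant per Q-value, choosing the next GOP reduces to a table lookup. The sketch below shows one possible selection rule for stored media, where the size of the next GOP at each Q is known in advance. The rule (lowest Q that fits the budget), the function names and the data layout are assumptions for illustration, not the documented Evalvid-RA behaviour.

```python
def pick_q_for_gop(gop_bits_by_q, budget_bits):
    """Return the lowest Q (best quality) whose GOP fits the bit budget.

    gop_bits_by_q -- dict mapping Q to the size in bits of this GOP at that Q
    Falls back to the highest available Q if no variant fits.
    """
    for q in sorted(gop_bits_by_q):
        if gop_bits_by_q[q] <= budget_bits:
            return q
    return max(gop_bits_by_q)


def assemble_schedule(gop_table, budgets):
    """One Q per GOP; with fixed GOP sizes the variants can be spliced freely."""
    return [pick_q_for_gop(row, budget) for row, budget in zip(gop_table, budgets)]


# Two GOPs, three Q variants each (sizes in bits, hypothetical numbers).
gop_table = [{2: 900, 3: 600, 4: 450}, {2: 1200, 3: 800, 4: 600}]
schedule = assemble_schedule(gop_table, budgets=[650, 500])
print(schedule)
```

The resulting list of Q-values per GOP plays the role of the trace of actual Q(i)-values that is needed to assemble the received video from the multiple *.m4v files.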


Long Range Dependence (LRD)

• Garrett & Willinger ’94: VBR video traffic is self-similar
  – the autocorrelation function ρ(k) decays slowly at increasing lag
  – makes buffer dimensioning & high link utilization very difficult
  – the cause of LRD: scene complexity changes!
• many papers on video characterization (GOP scale, frame scale)
  – very little related to what kind of rate controller is in use!
• Hamdi showed in his thesis that
  – a stream satisfying an LB(r,b) constraint, where r equals the traffic average rate, is not self-similar
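
The autocorrelation at lag k can be estimated directly from a GOP size trace. The following is a minimal sketch of the usual sample estimator (normalised by the lag-0 term); the function name is made up, and the input is any list of per-GOP sizes.

```python
def autocorrelation(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag of a numeric series."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    variance = sum(c * c for c in centered)
    if variance == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    acf = []
    for k in range(max_lag + 1):
        # Sum of products of values that are k positions apart.
        cov = sum(centered[i] * centered[i + k] for i in range(n - k))
        acf.append(cov / variance)
    return acf


acf = autocorrelation([1, 2, 3, 4, 5], max_lag=2)
print(acf)
```

For a long range dependent trace the values stay clearly positive over many lags, whereas a rate-controlled trace is expected to fall towards zero much faster.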


Rate controllers limit the rate variance…

[Figure: two panels of GOP rate (bit/s, scale x 10^6) vs. time (0 to 70 s). First panel: Q=2, Q=3 and Q=4; second panel: “No apriori information” vs. “Apriori information”]


ρ(k) of “concatenated” video (GOP)

• Positive correlation at lag k implies long bursts of time duration k

[Figure: two panels vs. lags (-20 to 160). “Autocorrelation GOP scale” (axis -0.4 to 1) and “Autocovariance GOP scale” (scale x 10^10), each with curves for Apriori, Non-apriori, and Q=8 open loop]


Evalvid-RA: overview


The tool-set overview: Pre-process

• ffmpeg -s cif -r 30 -i video.yuv -vcodec mpeg4 -4mv -g 12 -flags sgop -sc_threshold 20000 -qscale 8 -s cif -r 30 -y video_Q12.m4v

• mp4.exe -send <IP address> <port #> <MTU> <fps> video_Q12.m4v > st_video_Q12.txt

[Diagram: pre-process once (shell script). The original video source (*.yuv, *.mov, *.mp4, ...; e.g. video_orig.yuv) goes through ffmpeg.exe, giving 30 MPEG-4 compressed video rate variants video_Q*.m4v for Q=[2..31]; Evalvid mp4.exe then gives 30 different possible frame traces st_*.txt (2, 3, …, 31), with rows of the form [No. frame_size type]]
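
The pre-process runs the two commands once per Q-value. The sketch below only builds the 30 command lines so they can be inspected or written into a shell script. The ffmpeg options are copied from the slide and may not be accepted by current ffmpeg builds; using the same value for -qscale and the file name suffix is an assumption, and the IP address, port, MTU and fps values in the usage lines are placeholders.

```python
def ffmpeg_command(source_yuv, q, stem="video"):
    """Argument list for encoding one Q variant (options as on the slide)."""
    return [
        "ffmpeg", "-s", "cif", "-r", "30", "-i", source_yuv,
        "-vcodec", "mpeg4", "-4mv", "-g", "12", "-flags", "sgop",
        "-sc_threshold", "20000", "-qscale", str(q),
        "-s", "cif", "-r", "30", "-y", f"{stem}_Q{q}.m4v",
    ]


def mp4_trace_command(q, ip, port, mtu, fps, stem="video"):
    """Shell line that turns one encoded variant into a frame trace file."""
    return f"mp4.exe -send {ip} {port} {mtu} {fps} {stem}_Q{q}.m4v > st_{stem}_Q{q}.txt"


# One encode command and one trace command per Q in 2..31.
encode_cmds = [ffmpeg_command("video.yuv", q) for q in range(2, 32)]
trace_cmds = [mp4_trace_command(q, "127.0.0.1", 5555, 1000, 30) for q in range(2, 32)]

print(" ".join(encode_cmds[0]))
print(trace_cmds[0])
```

Nothing is executed here; the lists can be passed to a process runner or printed into a script once the options have been checked against the installed tools.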


The tool-set overview (cont.): Network Simulation

• ns-2: evalvid_rateadapt.tcl
  – modified ns-2 interface to “Evalvid-RA” & adaptive SVBR responding to P-AQM feedback
  – Tcl init-function makes
    • video2.dat (frames all Q)
      – read into memory
      – used by all nodes sending the same media (different timing)
    • gop_size.dat (GOP size all Q)
      – used by et_ra.exe
  – sd_be_* stores e.g. actual Q used

[Diagram: Tcl pre-process runs through all st_*.txt input files and generates the media matrix at frame and GOP scale (video2.dat and gop_size.dat, held in RAM). The ns-2 simulation writes sender traces sd_be_5, sd_be_7 (actual frame traces used at packet level: [Time packet_size type Q]) and receiver traces rd_be_6, rd_be_8 (actual packets received: [Time packet_size type or Loss])]


The multi-rate trace file (multi Q)

• video2.dat:

<time s> <bytes Q=2> <type> <MTU> <Q=3> <Q=4>… <Q=31>
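
A row of video2.dat can be read into a small record as sketched below. The sketch assumes whitespace-separated fields in exactly the order shown above (time, size at the lowest Q, type, MTU, then one size per remaining Q); the class and function names are made up, and the frame type token "I" in the example is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FrameRow:
    time_s: float
    frame_type: str
    mtu: int
    bytes_by_q: dict  # maps Q-value to frame size in bytes


def parse_video2_line(line, q_min=2, q_max=31):
    """Parse one row: <time> <bytes Q=q_min> <type> <MTU> <bytes Q=q_min+1> ..."""
    fields = line.split()
    expected = 4 + (q_max - q_min)
    if len(fields) != expected:
        raise ValueError(f"expected {expected} fields, got {len(fields)}")
    # The size for the lowest Q sits in column 2, the rest follow the MTU.
    sizes = [int(fields[1])] + [int(f) for f in fields[4:]]
    return FrameRow(
        time_s=float(fields[0]),
        frame_type=fields[2],
        mtu=int(fields[3]),
        bytes_by_q=dict(zip(range(q_min, q_max + 1), sizes)),
    )


# Shortened example with only Q=2..4 to keep the line readable.
row = parse_video2_line("0.033 5000 I 1000 3500 2800", q_min=2, q_max=4)
print(row.bytes_by_q)
```

Keeping all Q variants of a frame in one row is what lets the simulator switch Q at a GOP boundary without opening another trace file.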


The tool-set overview (cont.): Post-process

• et_ra.exe
  – modified Evalvid original et.exe
  – Reads packet Tx and Rx trace files
  – Finds used Q
  – Reads video2.dat for frame sizes and types
  – Reads gop_size.dat to assist assembling the resulting MPEG-4 file
• ffmpeg to decode to YUV
• fixyuv_ra.exe: takes e2e delivery time constraints into account and writes the resulting video file (*.yuv)
• psnr.exe: compares decoded YUV to original

Post-process commands:

et_ra sd_be_5 rd_be_6 video2.dat video_Q 2 31 gop_size.dat video_received.m4v

ffmpeg -i video_received.m4v -vcodec rawvideo video_received.yuv

fixyuv sd_be_5 rd_be_6 new_st.txt video_received.yuv fixed_packetloss.yuv

psnr 352 288 420 video_orig.yuv fixed_packetloss.yuv
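
The last step compares the decoded frames with the original. As a reminder of what the metric measures, here is a minimal PSNR sketch for two equally long sequences of 8-bit samples. It is not the psnr.exe implementation: the function name is made up, and which planes of the YUV file psnr.exe actually compares is not stated here.

```python
import math


def psnr(original, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two sample sequences."""
    if len(original) != len(decoded) or len(original) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all samples.
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak ** 2 / mse)


print(psnr([10, 20, 30, 40], [11, 19, 30, 42]))
```

Applied frame by frame, this gives the kind of PSNR-versus-frame-number curves shown earlier in the deck.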


et_ra.exe (Evaluate Trace, rate Adaptive)

• original et.exe:
• et_ra.exe:

[Diagram: for the original et.exe, a single sequence GOP1 to GOP5, with GOP1 expanded into frame 1 to frame 5 and frame 1 expanded into packet 1 to packet 3; for et_ra.exe, several parallel sequences GOP1 to GOP5]


Limitations of this implementation

• GOP time scale rate adaptation
  – Hamdi confirms that SVBR could be modified to frame scale
• Fixed GOP size
  – live encoders could start a new GOP (i.e. next frame being an I-frame) at a feedback event!
  – will relaxation cause distortions?
• error concealment (packet loss)
  – FRAME mode vs. PACKET mode considerations
  – ffmpeg drops first frame after frame marked with “loss”
• No audio yet
  – limitation in mp4.exe tool


Research usage of Evalvid-RA

• Simulate many flows, coming from many sources, all of them rate adaptive
• Have different media sources, not only one
• Wireless rate adaptive multimedia
• Different congestion control algorithms
  – Self-limited sources and their actual bandwidth
  – friendliness (towards TCP, UDP, DCCP, etc.)
• Different queuing systems (e.g. FIFO, AQMs, QoS/DiffServ)
• Investigate the removal of LRD, or not?
• trade latency for loss (short queues)
• how to inject new flows
• new initiatives for rate adaptation incentives
• vary the source rates
• …


Evalvid-RA lab

• Software
  – Windows/Cygwin (Linux on Win32)
  – ns-2 w/ Evalvid-RA
  – ffmpeg & video inspection programs
• Script files
• The pre-process is already performed
• To do:
  – Modify TCL script / run ns-2 simulation with selected parameters
  – Run post-process and inspect video quality / statistics
