Evaluation of packet video quality in real time using Random Neural Networks
Samir Mohamed, Gerardo Rubino
IRISA, Rennes, FRANCE
Workshop RNN, ICANN’02, Madrid, 27/8/02
The problem
• How to automatically quantify the quality of the stream, as perceived by the receiver?
[Diagram: a Source sends a stream (voice, music, video, multimedia, …) through an IP network to a Receiver.]
OUTLINE
1. Subjective tests
2. Objective tests
3. Our approach in 5 steps
4. The obtained performances
5. ANN vs RNN
6. Application: analyzing parameters impact
7. A view of our demo tool
8. Related work in audio
9. Ongoing research
1: The “voie royale”
• To quantify the quality at the receiver side …
• just put a human there,
• give her/him a scale (say, from 1 to 10),
• and ask her/him to evaluate the quality of a (short) part of the stream (say, a few seconds).
(1:) The “voie royale” (cont.)
• Better:
– put n humans at the receiver side,
– ask them to evaluate a (short) given sequence,
– take the average as the quantified quality.
• Still better:
– give the set of humans the sequence to evaluate plus several other sequences, in order to allow each member to adjust her/his scale.
(1:) Subjective tests
• The previous procedure is in fact standardized: see, for instance, the standard ITU-R BT.500-10 (March 2000).
• Basically, the procedure is as previously described + a statistical filter of the results.
• Goal of the statistical filter: to eliminate bad observers.
• Bad observer: “one disagreeing with the majority”.
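As a rough illustration of the averaging and filtering idea, the sketch below computes one mean score per sequence after discarding observers who are far from the panel average. The screening rule (mean absolute deviation above a threshold) and the threshold value are assumptions made for illustration only; this is not the statistical procedure of ITU-R BT.500, and the function name and scores are hypothetical.

```python
from statistics import mean

def screened_mean_scores(scores, max_deviation=2.5):
    """scores[o][s] is the score given by observer o to sequence s.

    Assumed rule: an observer is dropped when the mean absolute
    difference between his/her scores and the panel averages
    exceeds max_deviation. Returns (mean score per sequence,
    indices of the observers that were kept).
    """
    n_seq = len(scores[0])
    panel = [mean(obs[s] for obs in scores) for s in range(n_seq)]
    kept = [
        o for o, obs in enumerate(scores)
        if mean(abs(obs[s] - panel[s]) for s in range(n_seq)) <= max_deviation
    ]
    if not kept:                      # nobody agrees: fall back to everyone
        kept = list(range(len(scores)))
    mos = [mean(scores[o][s] for o in kept) for s in range(n_seq)]
    return mos, kept

# Hypothetical scores: 4 observers, 3 sequences, scale 1..10.
scores = [
    [8, 5, 2],
    [9, 6, 3],
    [7, 5, 2],
    [2, 9, 8],   # this observer disagrees with the majority
]
mos, kept = screened_mean_scores(scores)
```

With these made-up numbers the fourth observer is filtered out and the averages are taken over the remaining three.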
(1:) Drawbacks
• The previous procedure gives “the truth” but doesn’t do the job automatically (and so, a fortiori, not in real time either).
• Moreover, performing subjective tests is costly
– in logistics (you need about 20 people with appropriate characteristics)
– in resources (you need an appropriate place or environment to put your set of human subjects)
– and in time to perform the tests.
(1:) Drawbacks (cont.)
• Suppose you decide to analyze the way some factor (the loss rate in the network, for instance) affects quality:
– you must then evaluate the function q = f(lr), where q is the quality and lr is the loss rate, at many points and for many sequences: VERY EXPENSIVE.
• Suppose you now want to study q = g(lr, br, d), where br is the source bit rate and d the mean end-to-end delay: TOO EXPENSIVE TO BE DONE.
2: Another possibility
• The other way is obviously to look for an explicit formula (or algorithm) giving q as a function of the considered factors (lr, d, etc.) in, say, O(1) time.
• This appears to be a formidable task, because of two factors:
– no formal definition of what we want to quantify (quality) is available
– the “intuitive concept” of quality depends a priori on too many factors and in a complex way
(2:) Objective tests
• There is an area called “objective tests” where different procedures are proposed to evaluate the degradation of a sequence.
• They consist of specific metrics that compare the original and the degraded streams.
• Even if this is not the goal here, the results of these objective tests are not satisfactory so far.
• How do we know? By comparing the results obtained using these techniques with “the truth”, that is, with those coming from subjective tests.
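One common way to make such a comparison concrete is a correlation coefficient between the metric's outputs and the subjective scores of the same sequences. The sketch below computes the Pearson coefficient; the choice of this particular statistic and the example numbers are assumptions for illustration, not something stated in the slides.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)

# Hypothetical values: subjective scores and the output of some
# objective metric for the same five sequences.
subjective = [4.2, 3.1, 2.5, 3.8, 1.9]
objective = [0.81, 0.64, 0.70, 0.75, 0.40]
r = pearson(objective, subjective)
```

A coefficient close to 1 would indicate that the metric tracks the human scores; a low value is the kind of evidence the slides refer to when calling the results unsatisfactory.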
(2:) An example
• The MNB metric:
– developed at the US Dept. of Commerce, 1997
– for voice (VoIP applications)
– based on a cognition model
– has been shown to behave correctly in some cases
• On the next slide, some results of a test against subjective evaluations (from T. A. Hall, Objective speech quality measures for Internet telephony, Proc. of SPIE 2001).
(2:) Performance of MNB 2
(2:) The only exception
• There is one attempt at measuring quality without referring to the original sequence: the ITU “E-model” (www.itu.int).
• It has been proposed for the specific case of VoIP applications.
• However, the comparison with subjective evaluation shows that the performance of this metric can be very poor (on the next slide, some results from T. A. Hall, Objective speech quality measures for Internet telephony, Proc. of SPIE 2001).
(2:) Performance of the E-model
(2:) An example in video: the ITS metric
Source: S. Voran and S. Wolf, The development and evaluation of an objective video quality assessment system that emulates human viewing panels, in IBC 1992.
3a: Our approach: first step
• Our goal is to take, in some sense, the best of subjective and objective approaches.
• As a first step, we select a set of factors or parameters that we think are important to the final perceived quality.
• Even if this is an a priori task, it is not a difficult one.
• For each parameter, we select a few representative or important values. Our goal here is to discretize the problem.
(3a:) Our approach: first step (cont.)
• In our case, we selected 5 parameters:
– BR: Bit Rate. The normalized rate of the encoder. In our environments, we considered 4 possible values (256, 512, 768 and 1024 KBps), normalized with respect to the maximal value.
– FR: Frame Rate, in fps (frames per second). The rate at which the original stream is encoded. Four selected values: 6, 10, 15 and 30 fps.
(3a:) Our approach: first step (cont.)
– LR: Loss Rate, in % (loss probability). Selected values: 0, 1, 2, 4 and 8 %.
– CLP: number of Consecutively Lost Packets. We consider packets dropped in bursts of size 1, 2, 3, 4 or 5.
– RA: ratio of intra macro-blocks to inter macro-blocks. It (indirectly) measures the redundancy in the sequence. We used five values between 0.05 and 0.45 for this parameter.
(3a:) Our approach: first step (cont.)
• Observe that there can be interactions (in general, complex) between the parameters
– for instance, here, RA is used by the encoder to protect the stream against losses (measured by LR)
• It must also be underlined that our approach does not depend on the specific set of chosen parameters.
• Observe that BR, FR and RA are source parameters, and that LR and CLP are network parameters.
3b: Our approach: second step
• We have a set of parameters
$\mathcal{P} = \{1, 2, \dots, P\}$
• For parameter $i$ we have a set of possible values:
$\{v_{i1}, v_{i2}, \dots\}$
• Any vector of the form
$(v_{1i}, v_{2j}, \dots, v_{Pk})$
is then called a configuration.
(3b:) Our approach: second step (cont.)
• In our experiments, we had P = 5 parameters.
• The numbers of selected values for them were respectively 4, 4, 4, 5 and 5, leading to 4 × 4 × 4 × 5 × 5 = 1600 configurations.
• Call C the set of all possible configurations.
• The second step consists of selecting a reduced part of C, trying to have a “good coverage” of this set.
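The set C is simply the Cartesian product of the per-parameter value sets. The following sketch enumerates it using the value counts quoted on the slide; each configuration is represented as a tuple of value indices (an assumed encoding, since the slides do not say how configurations were stored).

```python
from itertools import product
from math import prod

# Number of selected values per parameter, as quoted on the slide.
values_per_parameter = [4, 4, 4, 5, 5]

# Size of C: the product of the counts.
n_configurations = prod(values_per_parameter)

# Enumerate C; a configuration is a tuple of value indices,
# one index per parameter (assumed representation).
all_configurations = list(
    product(*(range(n) for n in values_per_parameter))
)
```

The list has one entry per configuration, so its length matches the product of the counts.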
(3b:) Our approach: second step (cont.)
• In our experiments, we selected about 100 configurations.
• The method we followed was not to use something like a low discrepancy sequence of points in a hypercube, but
– to take into account the characteristics of the parameters
– and the extreme values.
• Call SC the set of selected configurations.
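The slides do not give the exact selection rule, so the sketch below is only one plausible stand-in: it keeps every "corner" configuration (each parameter at its smallest or largest value) and completes the set with a uniform random sample. The function name, the target size of 100 and the random completion are all assumptions; the authors' actual choice also used knowledge about the parameters, which is not modelled here.

```python
import random
from itertools import product

def select_configurations(counts, target=100, seed=0):
    """Pick `target` configurations (tuples of value indices).

    Assumed rule: all combinations of extreme values first,
    then a random sample of the remaining configurations.
    """
    all_configs = list(product(*(range(n) for n in counts)))
    corners = set(product(*((0, n - 1) for n in counts)))
    others = [c for c in all_configs if c not in corners]
    rng = random.Random(seed)
    n_extra = max(0, target - len(corners))
    return sorted(corners) + rng.sample(others, n_extra)

# Value counts per parameter, as quoted in the slides.
selected = select_configurations([4, 4, 4, 5, 5])
```

With five parameters there are 2^5 = 32 corner configurations, so 68 further configurations are drawn at random to reach 100.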
3c: Our approach: third step
• For the third step, we must be able to reproduce an environment (source + network) where the selected parameters can be put in any configuration we want.
• In our case, we achieved this using simulators and appropriately controlled networks (lab conditions).
• The third step consists of reproducing each configuration from the set SC and sending a fixed original sequence from source to receiver.
(3c:) Our approach: third step (cont.)
[Diagram: configurations 1, 2, 3, … of SC, each paired with the sequence 1, 2, 3, … it produces at the receiver.]
SC = {1, 2, …}
3d: Our approach: fourth step
• The result is a set of versions of the original sequence, each having encountered different conditions at the source and in the network.
• We now have a set of sequences (in our tests, about 100 sequences), and a configuration associated with each one.
• The fourth step consists of performing a standard subjective test on each sequence, to build a value assumed to be, by definition, the quality of the sequence.
(3d:) Our approach: fourth step (cont.)
• In symbols, we have S sequences {1, …, S} and, associated with sequence $i$, the configuration $(x_{i1}, x_{i2}, \dots, x_{iP})$.
• In other words, $x_{ik}$ is the value of the kth parameter which led to the sequence $i$ at the receiver.
• Together with this, we have the quality value of each version, coming from the subjective test; $q_i$ is the quality value of sequence $i$.
(3d:) Our approach: fourth step (cont.)
Sequence    1      2      3      …
LR          2%     0%     1%     …
FR          15     6      30     …
…
Quality     3.5    2.5    3.2    …
3e: Our approach: fifth (last) step
• The S sequences are now divided into two sets, randomly. For instance, our 100 sequences were divided into one set of about 80, and one set of about 20.
• To simplify, assume we renumber things such that the first set is {1, …, K}.
• The idea is then to train a Neural Network (NN) to learn “the” function with P inputs and 1 output, which associates with the input $(x_{k1}, x_{k2}, \dots, x_{kP})$ the output $q_k$ (the subjective quality of sequence $k$), for $k = 1, \dots, K$.
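The split-and-train step can be sketched as follows. This is a generic one-hidden-layer feed-forward network trained by plain stochastic gradient descent, written only to make the procedure concrete; it is not the Random Neural Network used in the talk, and it is not the authors' code. The data are PLACEHOLDER numbers generated from a made-up formula (they are not subjective scores), and all names (`split_dataset`, `TinyMLP`, `mse`) are hypothetical.

```python
import math
import random

def split_dataset(samples, train_fraction=0.8, seed=0):
    """Shuffle the samples and split them into (training, validation)."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    k = round(train_fraction * len(shuffled))
    return shuffled[:k], shuffled[k:]

class TinyMLP:
    """Sigmoid hidden layer, linear output (generic ANN, not an RNN)."""

    def __init__(self, n_inputs, n_hidden=4, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def _hidden(self, x):
        return [1.0 / (1.0 + math.exp(-(sum(w * xi for w, xi in zip(row, x)) + b)))
                for row, b in zip(self.w1, self.b1)]

    def predict(self, x):
        h = self._hidden(x)
        return sum(w * hj for w, hj in zip(self.w2, h)) + self.b2

    def train(self, data, epochs=300, rate=0.1):
        """Stochastic gradient descent on the squared error."""
        for _ in range(epochs):
            for x, q in data:
                h = self._hidden(x)
                err = sum(w * hj for w, hj in zip(self.w2, h)) + self.b2 - q
                for j, hj in enumerate(h):
                    # Gradient at hidden unit j, using w2[j] before its update.
                    g = err * self.w2[j] * hj * (1.0 - hj)
                    self.w2[j] -= rate * err * hj
                    self.b1[j] -= rate * g
                    for i, xi in enumerate(x):
                        self.w1[j][i] -= rate * g * xi
                self.b2 -= rate * err

def mse(model, data):
    """Mean squared error of the model over (input, target) pairs."""
    return sum((model.predict(x) - q) ** 2 for x, q in data) / len(data)

# PLACEHOLDER data: 100 random 5-parameter inputs in [0, 1] with a
# made-up target; real use would take configurations and their MOS.
rng = random.Random(1)
samples = []
for _ in range(100):
    x = [rng.random() for _ in range(5)]
    samples.append((x, 0.5 + 0.4 * x[0] - 0.4 * x[2]))

train_set, valid_set = split_dataset(samples)   # about 80 / about 20
model = TinyMLP(n_inputs=5)
mse_before = mse(model, train_set)
model.train(train_set)
mse_after = mse(model, train_set)
valid_error = mse(model, valid_set)             # error on unseen samples
```

The validation error is computed on samples that were never used for training, which is the role of the second set described in the slides.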
(3e:) Our approach: fifth (last) step (cont.)
• The second set of sequences is then used to validate the obtained NN (standard approach).
• The hope is the following: if we give to the function an input (x1, x2, …, xP) which is not in the data base (that is, not in SC), the output y should be close to the subjective evaluation of any sequence degraded through a system (source + network) where the configuration was (x1, x2, …, xP).
(3e:) Our approach: key implicit assumption
• Our approach implies the following implicit assumption:
For any sequence (or for any sequence belonging to a given family or class), the subjective quality depends only on the values of the set of chosen characterizing parameters.
(3e:) Our approach: key implicit assumption (cont.)
• This means in particular that given, say, three video sequences perhaps with very different contents, if the system configuration is the same (source bit rate, loss rate, …) then the perceived quality will be roughly the same.
(3e:) Our approach at work
[Diagram: a Source sends a stream (voice, music, video, multimedia, …) through an IP network to a Receiver. At the receiver, the NN obtains BR, FR and RA by asking the source, and LR and CLP by measuring them.]
(3e:) Our approach at work (cont.)
[Diagram: at the Receiver, a source parameters module provides BR, FR and RA, and a network parameters module provides LR and CLP; the NN takes these inputs and outputs a measure of quality.]
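The network parameters module can be illustrated with a small sketch that derives LR and CLP from the sequence numbers of the packets that actually arrived. How the authors measured these quantities is not described in the slides; here CLP is taken as the mean size of the loss bursts, which is an assumption, and the function names are hypothetical.

```python
def loss_statistics(received_seq, first, last):
    """Return (loss rate in %, mean loss-burst size) for packets first..last.

    received_seq: sequence numbers of the packets that arrived.
    """
    received = set(received_seq)
    bursts = []          # sizes of the runs of consecutive lost packets
    run = 0
    for seq in range(first, last + 1):
        if seq in received:
            if run:
                bursts.append(run)
            run = 0
        else:
            run += 1
    if run:              # the observation window ended inside a burst
        bursts.append(run)
    lost = sum(bursts)
    loss_rate = 100.0 * lost / (last - first + 1)
    mean_burst = lost / len(bursts) if bursts else 0.0
    return loss_rate, mean_burst

def nn_inputs(br, fr, ra, received_seq, first, last, max_br=1024):
    """Assemble the five inputs (BR, FR, RA, LR, CLP) for the trained NN.

    BR is normalized by its maximal value, as stated in the slides.
    """
    lr, clp = loss_statistics(received_seq, first, last)
    return (br / max_br, fr, ra, lr, clp)

# Packets 1..10 were sent; 4, 5 and 8 never arrived.
lr, clp = loss_statistics([1, 2, 3, 6, 7, 9, 10], first=1, last=10)
inputs = nn_inputs(512, 15, 0.05, [1, 2, 3, 6, 7, 9, 10], 1, 10)
```

In this example three of ten packets are lost in two bursts (sizes 2 and 1), so the loss rate is 30 % and the mean burst size is 1.5.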
4: Our results in a nutshell
• Once trained, the NN is supposed to behave like an average human observer faced with a received sequence.
• Observe that the performance of the method depends on the selected parameters, but that there is no restriction on such a choice.
• Moreover, if a posteriori some new parameter appears to be important, it can be easily added following the same procedure.
(4:) Our results in a nutshell (cont.)
• The trained NNs (classical and RNN) correlated remarkably well with human evaluations.
• They (obviously) run in negligible time.
• They allow us to study the behavior of quality as a function of several parameters.
• We did the same work for audio, with the same results.
• We also developed an example application using our tool for control purposes (in audio).
(4:) Performances (RNN) during the training phase
(4:) Another view
[Plot: actual vs. predicted MOS scores (axis from 1.0 to 9.0) against sample number (1 to 77).]
(4:) Performances (RNN) during the validation phase
(4:) Another view
[Plot: actual vs. predicted MOS scores (axis from 1.0 to 9.0) against sample number (1 to 14).]
5: ANN vs RNN
• We used the MATLAB toolbox implementing several standard neural network techniques (Artificial Neural Networks, ANN) and specific software for RNN.
• We compared their performances, mainly in terms of their respective learning abilities.
• We will present here some of the obtained results, which show that RNN behave better for our applications.
(5:) An interesting example: left: RNN, right: ANN
(5:) RNN vs. ANN: # of hidden neurons
(5:) RNN vs. ANN: # of hidden neurons
6: Analyzing parameters impact
(6:) Analyzing parameters impact (cont.)
(6:) MPQM (well-known metric) when applying losses
(6:) ITS (well-known metric) when bit rate is varied
7: Our demo tool
8: Our work on control schemes for audio
[Diagram: control scheme between a Sender (codec, packetization, …) and a Receiver (codec, playback, …) exchanging payload over the network with RTP/RTCP. Labelled elements: trained neural network, MOS, statistics, control, hybrid controller, TFRC (suggested rate), network state.]
(8:) Some simulation results
[Plots: two panels, each marking the bandwidth (BW) needed by PCM and the BW needed by GSM.]
(8:) Illustrating our control application
[Plot: MOS (axis from 0 to 5) against time (sec.), comparing MOS without CM and MOS with CM.]
9: Ongoing work
• Refining our initial models for quantifying the quality of audio and video transmission:
– better loss models
– exploration of new parameters (for instance, FEC)
– trying to characterize stream types
(9:) Ongoing work (cont.)
• Coupling our approach with traffic prediction (also using neural techniques)
• Exploration of other applications (diffserv
• Work on learning algorithms for RNN (numerical analysis)