manual - squad voice test result description

WHEN QUALITY MATTERS

SQuad Voice Measurement Description Manual

November 2010

SwissQual® License AG Allmendweg 8 CH-4528 Zuchwil Switzerland

t +41 32 686 65 65 f +41 32 686 65 66 e [email protected] www.swissqual.com

Part Number: 16-100-200047-3 Rev 2.20

SwissQual has made every effort to ensure that any instructions contained in the document are adequate and free of errors and omissions. SwissQual will, if necessary, explain issues which may not be covered by the documents. SwissQual's liability for any errors in the documents is limited to the correction of errors and the aforementioned advisory services.

Copyright 2000 - 2010 SwissQual AG. All rights reserved.

No part of this publication may be copied, distributed, transmitted, transcribed, stored in a retrieval system, or translated into any human or computer language without the prior written permission of SwissQual AG.

Confidential materials.

All information in this document is regarded as commercially valuable, protected and privileged intellectual property, and is provided under the terms of existing Non-Disclosure Agreements or as commercial-in-confidence material.

When you refer to a SwissQual technology or product, you must acknowledge the respective text or logo trademark somewhere in your text.

SwissQual®, Seven.Five®, SQuad®, QualiPoc®, NetQual®, VQuad®, Diversity® as well as the following logos are registered trademarks of SwissQual AG.

Diversity Explorer™, Diversity Ranger™, Diversity Unattended™, NiNA+™, NiNA™, NQAgent™, NQComm™, NQDI™, NQTM™, NQView™, NQWeb™, QPControl™, QPView™, QualiPoc Freerider™, QualiPoc iQ™, QualiPoc Mobile™, QualiPoc Static™, QualiWatch-M™, QualiWatch-S™, SystemInspector™, TestManager™, VMon™, VQuad-HD™ are trademarks of SwissQual AG.

SwissQual acknowledges the following trademarks for company names and products:

Adobe®, Adobe Acrobat®, and Adobe Postscript® are trademarks of Adobe Systems Incorporated.

Apple is a trademark of Apple Computer, Inc.

DIMENSION®, LATITUDE®, and OPTIPLEX® are registered trademarks of Dell Inc.

ELEKTROBIT® is a registered trademark of Elektrobit Group Plc.

Google® is a registered trademark of Google Inc.

Intel®, Intel Itanium®, Intel Pentium®, and Intel Xeon™ are trademarks or registered trademarks of Intel Corporation.

INTERNET EXPLORER®, SMARTPHONE®, TABLET® are registered trademarks of Microsoft Corporation.

Java™ is a U.S. trademark of Sun Microsystems, Inc.

Linux® is a registered trademark of Linus Torvalds.

Microsoft®, Microsoft Windows®, Microsoft Windows NT®, and Windows Vista® are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries U.S.

NOKIA® is a registered trademark of Nokia Corporation.

Oracle® is a registered US trademark of Oracle Corporation, Redwood City, California.

SAMSUNG® is a registered trademark of Samsung Corporation.

SIERRA WIRELESS® is a registered trademark of Sierra Wireless, Inc.

TRIMBLE® is a registered trademark of Trimble Navigation Limited.

U-BLOX® is a registered trademark of u-blox Holding AG.

UNIX® is a registered trademark of The Open Group.



Contents

1 Introduction .......................................................................................................................................... 1

2 SQuad Listening Quality ..................................................................................................................... 2

Introduction ............................................................................................................................................ 2

Speech Quality Definition ...................................................................................................................... 2

SQuad Method ....................................................................................................................................... 2

MOS Rating ........................................................................................................................................... 3

Speech and Noise Level – Received Signal ......................................................................................... 4

Channel Gain ......................................................................................................................................... 4

Clipping .................................................................................................................................................. 5

DC-Offset ............................................................................................................................................... 5

Frequency-Shift ..................................................................................................................................... 5

Delay Spread (Voice Jitter) .................................................................................................................... 6

Speech Threshold .................................................................................................................................. 6

Degradations ......................................................................................................................................... 7

AGC Problems ................................................................................................................................. 7

Speech Enhancer / Noise Suppressors ........................................................................................... 8

Impulsive Noise ................................................................................................................................ 8

Background Noise ............................................................................................................................ 8

Interruptions...................................................................................................................................... 9

VAD resp. Silence Suppression Problems ..................................................................................... 10

Variable Delay (Voice Jitter) ........................................................................................................... 10

Delays Deviation ............................................................................................................................ 10

Frequency Shifts ............................................................................................................................ 11

Quality Code ................................................................................................................................... 12

Option: P.862 'PESQ' .......................................................................................................................... 13

3 SQuad Noise Suppression ............................................................................................................... 14

Introduction .......................................................................................................................................... 14

Listening Quality .................................................................................................................................. 14

NS-Speech Power Classes ................................................................................................................. 15

SNRI, Signal-to-Noise Ratio Improvement .......................................................................................... 17

NPLR, Noise Power Level Reduction .................................................................................................. 17

SPLR, Signal Power Level Reduction ................................................................................................. 17

Overall NS Quality ............................................................................................................................... 19

Quality Index ........................................................................................................................................ 20

Convergence Time .............................................................................................................................. 21

Noise Reduction/Suppression Test ..................................................................................................... 23




Examples ............................................................................................................................................. 24

Evaluation of the transmitted signal ..................................................................................................... 24

Evaluation of the transmitted signal ..................................................................................................... 25

4 DTMF Tests ........................................................................................................................................ 26

Introduction .......................................................................................................................................... 26

DTMF-Test Overview........................................................................................................................... 26

Criterions ............................................................................................................................................. 27

Results ................................................................................................................................................. 27

5 SQuad Advanced Echo Check (Passive Test) ................................................................................ 30

Introduction .......................................................................................................................................... 30

Echo Measurement.............................................................................................................................. 30

Measurement Results .......................................................................................................................... 30

6 SQuad Advanced Echo Check (Active Test) .................................................................................. 34

Introduction .......................................................................................................................................... 34

Echo Measurement.............................................................................................................................. 34

Measurement Results – Echo Evaluation............................................................................................ 34

Measurement Results – Listening Quality ........................................................................................... 36

7 Round Trip .......................................................................................................................................... 37

Introduction .......................................................................................................................................... 37

The Round Trip Method ....................................................................................................................... 37

Results ................................................................................................................................................. 37

References .......................................................................................................................................... 37

A Appendix ............................................................................................................................................ 38

Abbreviations ....................................................................................................................................... 38

Figures

Figure 2-1 Block Diagram of the SQuad ............................................................................................................ 3

Figure 2-2 Main outcomes of SQuad-LQ ........................................................................................................... 3

Figure 2-3 Typical MOS-LQ values for Different Codecs .................................................................................. 4

Figure 2-4 Frequency Shift ................................................................................................................................ 6

Figure 2-5 Histogram of Noised Speech Sample .............................................................................................. 7

Figure 2-6 Level Chart with AGC ....................................................................................................................... 8

Figure 2-7 Similarity Chart with Impulsive Noise ............................................................................................... 8

Figure 2-8 Background Noise ............................................................................................................................ 9

Figure 2-9 Level Chart with Handover ............................................................................................................... 9

Figure 2-10 Time Clipping ............................................................................................................................... 10

Figure 2-11 Variable Delay, Voice Jitter .......................................................................................................... 10

Figure 2-12. Example for Variable Delay, which shows that Block B is delayed for –244 samples to the left (arrives earlier when compared with the same reference block). Block B arrives later by 244 samples. ....... 11

Figure 2-13 Frequency Shift ............................................................................................................................ 12

Figure 2-14 P.862 result representation .......................................................................................................... 13

Figure 2-15 P.862 vs. P.862.1 scale transformation ....................................................................................... 13

Figure 3-1 The Principle of MOS Calculation in Squad-NS. ............................................................................ 15

Figure 3-2 The Five Energy Windows for 16 bit Digital System (90.3 dB dynamics) ...................................... 16

Figure 3-3 Speech Power Class Chart ............................................................................................................ 16

Figure 3-4 SPLR Calculation out of Five Values Calculated in Five Different Energy Windows. ................... 18

Figure 3-5 Speech Power Class Chart ............................................................................................................ 19

Figure 3-6 Overall NS Quality .......................................................................................................................... 19

Figure 3-7 Calculation of Quality Index ........................................................................................................... 20

Figure 3-8 Some Experimental Results for Different Configuration of NR in the Network ................................. 21

Figure 3-9 MOS vs. MOSobj & Quality Index .................................................................................................. 21

Figure 3-10 Example of Convergence Time Evaluation .................................................................................. 21

Figure 3-11 Filtered Difference Envelope is Compared with the Threshold Value ......................................... 22

Figure 3-12 Five Point Analysis of the Difference Envelope during Decision on Noise Reduction State ....... 22

Figure 3-13 Additional Condition before a Final Decision is Calculated. ........................................................ 23

Figure 3-14 Typical Reference Signal with White Noise Added ...................................................................... 23

Figure 3-15 Noise Suppression Applied on Signal .......................................................................................... 24

Figure 3-16 Noise Reduction Applied on Signal .............................................................................................. 24

Figure 3-17 NS Signal Envelope ..................................................................................................................... 24

Figure 3-18 Results presentation within NQDI ................................................................................................ 25

Figure 4-1 Allocation of Frequencies to the Various Digits and Symbols of a Push-button Set ..................... 26

Figure 4-2 Block Diagram of DTMF-Test ......................................................................................................... 27

Figure 5-1 Results of SQuad-AEC Test, shown in NQDI presentation ........................................................... 31

Figure 5-2 EOR (Echo Objection Rate) derived from G.131 ........................................................................... 32

Figure 5-3 Echo Loss during scanning versus echo delay .............................................................................. 33

Figure 6-1 Results presentation SQuad-AEC active ....................................................................................... 35

Figure 6-2 Echo Loss as profile versus echo delay ......................................................................................... 35

Figure 6-3 Result presentation in an echo free/non echo detectable connection ........................................... 36

Figure 7-1 Detail of the NQDI Representation ................................................................................................ 37

Tables

Table 2-1 Example for Variable Delay where five blocks are delayed at different offsets regarding Reference Speech Sample ............................................................................................................................................... 11

Table 3-1 Energy Windows used in Calculation of SPLR and PLR ................................................................ 16

Table 4-1 DTMF Result Code .......................................................................................................................... 27

Table 7-1 One Way Delay Quality Classes ..................................................................................................... 37




1 Introduction

This document describes the parameters that are measured with the SwissQual QoS Measurement System. It also briefly describes the algorithms used and gives some background information on the causes of different kinds of quality degradation. The screenshots are taken from NQDI, SwissQual’s post-processing system.




2 SQuad Listening Quality

Introduction

For network operators and equipment manufacturers, it is important to know where and why speech quality is degraded. Since speech quality is a major factor in customer satisfaction, encoding techniques must be selected and optimized for speech quality. Large-scale auditory tests are commonly employed to assess the quality of speech encoding techniques. However, results obtained in this way are practically impossible to reproduce, and they depend on the motivation of the individual test candidates. It is therefore a big advantage to have an instrumental method that physically measures speech quality parameters and produces results that correlate as closely as possible with subjectively acquired results.

The perfect transmission of speech via a telecommunications channel with a bandwidth of 0.3 - 3.4 kHz results in a sentence intelligibility of approx. 98%. The speech coders introduced for handsets used in digital mobile radio networks impair intelligibility further. Compared with “bit rate,” “echo” or “loudness,” speech quality is a vague term.

Speech Quality Definition

Speech Quality is defined as a measure of a listener's satisfaction based on his experience and expectation regarding voice communication. It is generally expressed as a Mean Opinion Score (MOS). This measurement denotes the average of many individual opinions on speech quality, which are obtained from a representative number of listeners. Speech quality is a complex psycho-acoustic phenomenon within the process of human perception. As such, it is necessarily subjective. Most objective algorithms are based on a comparison between a reference sample and a coded version of the reference.

SQuad Method

SQuad consists of three main parts. First, a pre-processing unit adjusts the reference and coded samples. Then, an auditory model is used to reduce both samples to their perceptually relevant features. Finally, an assessment unit evaluates the perceptual difference between the reference and coded samples and outputs the result as a MOS value.

A speech sample is transmitted over a line with a generally unknown combination of speech coders. This speech sample is available in digital form, with a sampling frequency of 8 kHz and 16-bit quantization. As an initial step, the source speech signal is read into the vector x(i) and the coded speech signal into the vector y(i). These speech signals are synchronized with respect to both time and level. The DC offset is removed from every sample. In addition, the signals are normalized to a common RMS (Root Mean Square) level, so that a constant amplification factor is not taken into account.
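The level-related part of this pre-processing can be illustrated with a minimal sketch. It is not the SQuad implementation: the function names and the target RMS value are assumptions chosen for illustration, and time synchronization is left out.

```python
import math


def remove_dc(samples):
    """Subtract the mean value so that the signal is centred on zero."""
    mean = sum(samples) / len(samples)
    return [s - mean for s in samples]


def rms(samples):
    """Root Mean Square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def normalize_rms(samples, target_rms):
    """Scale the signal so that its RMS level equals target_rms."""
    current = rms(samples)
    if current == 0.0:
        return list(samples)  # silent signal: nothing to scale
    gain = target_rms / current
    return [s * gain for s in samples]


def preprocess(reference, coded, target_rms=1000.0):
    """Remove the DC offset and bring both signals to a common RMS level.

    target_rms is an arbitrary illustrative value, not a SQuad parameter.
    """
    x = normalize_rms(remove_dc(reference), target_rms)
    y = normalize_rms(remove_dc(coded), target_rms)
    return x, y
```

After this step a constant gain difference between x(i) and y(i) no longer influences the comparison, which is the purpose stated above.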

The signals are split into processing units of 32 ms duration, also called Frames. The unit overlap is 50%. During the first processing step, the frame is multiplied by a Hamming window. The source signal x(i) in the time domain is then transformed to the frequency domain using a discrete Fourier transform, followed by computation of the squared magnitude FFT spectrum. Both signals are filtered using a filter equivalent to the receiving curve of the corresponding telephone handset. A rough approximation of the time masking is already achieved through the frame overlapping during the signal pre-processing.

The comparison method of SQuad is based on the following principle: signal parts with high energy are more important for the perceived speech quality. A similarity coefficient for the reference and impaired signal is computed for 4 different energy thresholds. Only the parts of the signal exceeding the respective threshold are considered. This can be viewed as a multi-resolution analysis with respect to signal energy. The “overall similarity” is then computed using the coefficients from all thresholds.

A polynomial is used to transform the comparison result to the ITU MOS scale. The length of the speech sample varies between 4 and 30 seconds.
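The framing and transform steps of this section can be sketched as follows. The sketch only assumes the figures given above (8 kHz sampling, 32 ms frames, 50% overlap, Hamming window, squared magnitude spectrum). All names are illustrative, and a direct DFT is used for readability where a real implementation would use an FFT. Handset filtering and the similarity computation are not covered.

```python
import cmath
import math

SAMPLE_RATE_HZ = 8000
FRAME_MS = 32
FRAME_LEN = SAMPLE_RATE_HZ * FRAME_MS // 1000  # 256 samples per frame
HOP_LEN = FRAME_LEN // 2                       # 50% overlap


def hamming(n):
    """Symmetric Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def split_frames(signal, frame_len=FRAME_LEN, hop_len=HOP_LEN):
    """Split the signal into overlapping frames; an incomplete tail is dropped."""
    return [signal[start:start + frame_len]
            for start in range(0, len(signal) - frame_len + 1, hop_len)]


def power_spectrum(frame):
    """Squared magnitude spectrum of one Hamming-windowed frame (bins 0..N/2)."""
    n = len(frame)
    windowed = [s * w for s, w in zip(frame, hamming(n))]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(windowed[i] * cmath.exp(-2j * math.pi * k * i / n)
                  for i in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum
```

With these settings one frequency bin covers 8000 / 256 = 31.25 Hz, so a 1 kHz tone falls into bin 32.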






Figure 2-1 Block Diagram of the SQuad

MOS Rating

Speech Quality is defined as a measure of a listener's satisfaction and is generally expressed as a Mean Opinion Score (MOS). SQuad delivers the MOS rating as a single number ranging from 1 to 4.5, in accordance with the Listening Scale defined in ITU’s P.800 recommendation. This range is not exactly the same as that of MOS, which is defined from 1 to 5. The restriction is acceptable because values above 4.5 almost never appeared in the subjective tests used for the validation of SQuad-LQ.

As described in ITU’s P.800 recommendation Annex B.4.5, various five-point category-judgment scales may be used for different purposes. The Listening Only quality scale is the most frequently used for ITU-T applications:

Quality of the speech Score

Excellent 5

Good 4

Fair 3

Poor 2

Bad 1

The following picture gives an overview of the results shown in the main section of NQDI:

Figure 2-2 Main outcomes of SQuad-LQ

[Block diagram labels: Reference signal and Degraded signal; Time & Level alignment; Frequency equalization; IRS-filtering & BG Noise detection; Psychoacoustics modelling; Listening only Quality estimation (LQQ); Network Information and Other measured data (echo, call setup quality, round-trip delay, jitter); Overall Audio Quality]

Codec            Typical MOS Value   Typical SQuadLQ
G.711            4.3                 4.4
G.729            3.8                 3.7
G.723.1 (6.3)    3.5                 3.5
GSM-EFR          4.0                 3.9
GSM-HR           3.4                 3.3
AMR 12.2         4.0                 3.9
AMR 7.4          3.8                 3.7
AMR 4.75         3.4                 3.4

Figure 2-3 Typical MOS-LQ values for Different Codecs

Speech and Noise Level – Received Signal

The SQuad-LQ algorithm also calculates the Active Speech Level (acc. to ITU-T P.56) of the received signal. This value describes the r.m.s. level of the active speech parts only; speech pauses do not influence it. The result is given in dBov. The level should be in the range from -20 to -38 dBov. Relative to a sending level of -26 dBov, this corresponds to a gain of up to +6 dB or an attenuation of up to 12 dB.

The Noise Level describes the noise floor of the received signal in speech pauses, also in dBov. In normal noise-free connections, a Noise Level below -55 dBov can be obtained. Please note that this is an unweighted level; the common A-weighting is not applied.

Both results are used to calculate a basic signal-to-noise ratio, which describes the distance between the speech level and the noise floor.
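The arithmetic behind these figures is simple and can be written down as a short sketch. The function names are illustrative only. The basic signal-to-noise ratio is taken here as the plain difference of the two levels in dB, which is an assumption that follows from the description of the "distance" between speech level and noise floor.

```python
NOMINAL_SEND_LEVEL_DBOV = -26.0  # sending level stated in the text


def gain_vs_nominal_db(active_speech_level_dbov):
    """Gain (positive) or attenuation (negative) relative to the sending level."""
    return active_speech_level_dbov - NOMINAL_SEND_LEVEL_DBOV


def basic_snr_db(active_speech_level_dbov, noise_level_dbov):
    """Distance in dB between the active speech level and the noise floor."""
    return active_speech_level_dbov - noise_level_dbov
```

For example, a received Active Speech Level of -20 dBov corresponds to a gain of +6 dB, and -38 dBov corresponds to an attenuation of 12 dB.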

Channel Gain

This is a value in dBr that shows the power level of the received signal relative to the reference (input) signal. Because SQuad-LQ is applied to the electrical interfaces of the connection, the terminal-dependent Send Loudness Rating (SLR) and Receive Loudness Rating (RLR) are modelled in SQuad-LQ itself. In principle, SQuad-LQ is connected to the so-called 0 dBr point of the network's input. At this 0 dBr point, a nominal level of -26 dBov (corresponds to -20 dBm at a four-wire 600 Ohms interface) is inserted.

The Channel Gain reflects only gains or attenuation caused by the network (exception: attenuating PSTN subscriber loops). It is close to the so-called JLR (Junction Loudness Rating) but does not apply any spectral weighting.

In a transparent ISDN connection the Channel Gain should be around 0 dB. In principle, this value should also be around 0 dB in a Mobile-to-ISDN or Mobile-to-Mobile connection. It may differ because of individual signal amplification by cellular network providers. They mainly amplify the signals, so a gain in the positive range can be observed. If an overall gain of 6 dB is exceeded, amplitude clipping may occur. As in a real call, this impairs quality and results in a lower SQuad-LQ score.

Conversely, an attenuating PSTN subscriber loop may lead to negative Channel Gains because it is part of the evaluated transmission chain. Like a PSTN phone, which is more sensitive, SQuad-LQ internally amplifies such attenuated signals to a nominal level of -26 dBov (corresponds to 79 dB sound pressure level at the subscriber's ear).

To inform the user of SQuad-LQ, Channel Gains outside the expected range are highlighted in NQDI. The expected range is +6 to -9 dB, with an extended range down to -15 dB.
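The ranges can be expressed as a small classification sketch. The category names are hypothetical and do not reflect how NQDI actually labels or highlights values; only the numeric limits are taken from the text.

```python
def classify_channel_gain(gain_db):
    """Classify a Channel Gain value against the ranges given in the text.

    Returns one of the hypothetical labels "expected", "extended" or "outside".
    """
    if -9.0 <= gain_db <= 6.0:
        return "expected"   # +6 to -9 dB
    if -15.0 <= gain_db < -9.0:
        return "extended"   # extended range down to -15 dB
    return "outside"        # above +6 dB or below -15 dB
```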

The Channel Gain is available as a single overall value in dBr (total Gain) and also as a series of values in the time domain (every 16 ms), like an attenuation profile. Based on the values of this attenuation profile, a chart can be created that provides information on:

AGC (Adaptive Gain Control) Elements that are not working correctly

Level Jumps (for example after a handover)

Level Interruptions (for example interruptions in the audio path or during handovers)
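A gain profile over time of this kind could be computed along the following lines. This is a sketch under assumptions, not the SQuad method: it assumes time-aligned signals at 8 kHz, non-overlapping 16 ms blocks (128 samples) and a plain power ratio per block. Blocks in which either signal is silent are marked with None.

```python
import math

BLOCK_LEN = 128  # 16 ms at 8 kHz


def block_power(block):
    """Mean power of one block of samples."""
    return sum(s * s for s in block) / len(block)


def gain_profile_db(reference, received, block_len=BLOCK_LEN):
    """Gain of the received signal relative to the reference, one value per block."""
    profile = []
    n_blocks = min(len(reference), len(received)) // block_len
    for b in range(n_blocks):
        lo, hi = b * block_len, (b + 1) * block_len
        p_ref = block_power(reference[lo:hi])
        p_rx = block_power(received[lo:hi])
        if p_ref == 0.0 or p_rx == 0.0:
            profile.append(None)  # ratio undefined for a silent block
        else:
            profile.append(10.0 * math.log10(p_rx / p_ref))
    return profile
```

A sudden step between neighbouring profile values would then show up as a level jump, and a run of strongly negative values or None entries as an interruption.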

Clipping

Temporal Speech Clipping (also called front-end clipping) is the loss of speech frames. It may occur when voice activity detection is used, when Digital Circuit Multiplication Equipment (DCME) is used or during uncontrolled slips. Time clipping is presented as clipped frames in a function of time.

Clipping is an annoying phenomenon that cuts off a bit of speech in the instant it takes for the transmitter to detect presence of speech. It is almost impossible to eliminate clipping in a traditional circuit-switched voice conversation. Using circuit switching, the transmitter is not turned on until sound is detected, and by then, a piece of the speech has been clipped off. SQuad detects this clipping and generates the results as a distribution of time. The resolution of the clipping measurement is 8 milliseconds. First, the mean energy per 8 milliseconds is calculated. The energy values are then saved for each frame (both reference and coded). After the whole speech sample has been processed, the post processing of time clipping data is done. There are some simple rules during this post-processing:

Time clipping can only occur during pause-to-speech transitions.

A minimum pause length must be reached. In our case, it is 64 milliseconds.

The difference Energy (ref) – Energy (cod) must be at least 10 dB.

Clipped frames are consecutive frames.

The clipping measurement values are reported as an average percentage per sample (number of clipped speech frames relative to the number of active speech frames) and as a time domain distribution. Time Clipping in SQuad-LQ is calculated every 8 ms, but only the average value of two succeeding frames is reported in the output file.
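The rules above can be sketched in code as follows. The sketch assumes 8 kHz sampling (64 samples per 8 ms frame) and uses a speech/pause threshold on the reference energy, which is an assumption because the text does not say how pauses are detected. All names are illustrative.

```python
import math

FRAME_LEN = 64         # 8 ms at 8 kHz
MIN_PAUSE_FRAMES = 8   # 64 ms minimum pause length
MIN_DIFF_DB = 10.0     # required Energy(ref) - Energy(cod)


def frame_energies_db(signal, frame_len=FRAME_LEN):
    """Mean energy per 8 ms frame in dB (a tiny offset avoids log of zero)."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        power = sum(s * s for s in frame) / frame_len
        energies.append(10.0 * math.log10(power + 1e-12))
    return energies


def find_clipped_frames(ref_db, cod_db, speech_threshold_db):
    """Indices of frames clipped at pause-to-speech transitions."""
    clipped = []
    pause_run = 0
    n = min(len(ref_db), len(cod_db))
    i = 0
    while i < n:
        if ref_db[i] < speech_threshold_db:
            pause_run += 1          # still inside a pause of the reference
            i += 1
            continue
        if pause_run >= MIN_PAUSE_FRAMES:
            # Speech onset after a long enough pause: collect consecutive
            # frames in which the coded signal is at least 10 dB weaker.
            while (i < n and ref_db[i] >= speech_threshold_db
                   and ref_db[i] - cod_db[i] >= MIN_DIFF_DB):
                clipped.append(i)
                i += 1
        pause_run = 0
        while i < n and ref_db[i] >= speech_threshold_db:
            i += 1                  # skip the rest of this speech burst
    return clipped


def clipping_percent(ref_db, cod_db, speech_threshold_db):
    """Clipped frames as a percentage of the active speech frames."""
    active = sum(1 for e in ref_db if e >= speech_threshold_db)
    if active == 0:
        return 0.0
    clipped = find_clipped_frames(ref_db, cod_db, speech_threshold_db)
    return 100.0 * len(clipped) / active
```

Missing energy in the middle of a speech burst, or after a pause shorter than 64 ms, is deliberately not counted here; according to the rules it is not time clipping.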

DC-Offset

This number shows the DC-Offset of the coded signal in percentage. This is an important piece of information if the measured speech quality is lower than expected. Various interface problems (impedance, coding technique, HW) can produce DC-offset discrepancies.

DC Offset is calculated as

DC_Offset = 100 * average_audio_voltage / Max_audio_voltage

Max_audio_voltage for 16 bit digital resolution is equal to 2^15 (32768).

For example: average_audio_voltage = 300 results in DC_Offset = 100 * 300 / 32768 ≈ 0.92%
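The same calculation as a short sketch, assuming the signal is available as a list of 16 bit sample values (the function name is illustrative):

```python
MAX_AUDIO_VALUE_16BIT = 2 ** 15  # 32768


def dc_offset_percent(samples):
    """DC offset as a percentage of the 16 bit full-scale value."""
    average = sum(samples) / len(samples)
    return 100.0 * average / MAX_AUDIO_VALUE_16BIT
```

For a signal whose average value is 300, the function returns 100 * 300 / 32768, which is the value from the example above.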

Frequency-Shift

A low bit rate encoder can move the formants (spectral peaks) of the speech. This degradation can be described as a frequency shift of one or more components of the source signal. The shift is measured as the percentage of moved frequency components in the active speech phases. The result is the number of positively and negatively shifted frames in %, evaluated on a compressed frequency scale (Bark). Figure 2-4 shows a typical situation for one processing buffer of the voice signal (32 ms).






Figure 2-4 Frequency Shift

For the detection of the frequency shift, the peaks above the loudness threshold in both the reference and the degraded signal are analyzed. The threshold for compressed loudness is set to 10. The position of each peak in the reference is compared with the position of the peak in the coded signal (within +/- 1 Bark). A frequency shift is found if the two peaks are not at the same location. The amplitudes of the coded and reference loudness need not be equal, but both must be above the threshold value. This is permissible because the level and frequency alignment is done beforehand in a separate module.
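The peak comparison can be illustrated with the following sketch. It assumes that the compressed loudness of one frame is given as a list with one value per Bark band, so that a search range of +/- 1 Bark equals +/- 1 list index. This resolution, the simple local-maximum peak definition and the function names are assumptions made for illustration.

```python
LOUDNESS_THRESHOLD = 10.0  # threshold for compressed loudness from the text


def find_peaks(loudness, threshold=LOUDNESS_THRESHOLD):
    """Indices of local maxima that lie above the loudness threshold."""
    peaks = []
    for i in range(1, len(loudness) - 1):
        if (loudness[i] > threshold
                and loudness[i] > loudness[i - 1]
                and loudness[i] >= loudness[i + 1]):
            peaks.append(i)
    return peaks


def peak_shifts(ref_loudness, cod_loudness):
    """Shift in Bark bands (-1, 0 or +1) for each reference peak that has a
    matching coded peak within +/- 1 band; unmatched peaks are skipped."""
    cod_peaks = set(find_peaks(cod_loudness))
    shifts = []
    for p in find_peaks(ref_loudness):
        for offset in (0, -1, 1):
            if p + offset in cod_peaks:
                shifts.append(offset)
                break
    return shifts
```

A frame that contains a non-zero entry would then count as a shifted frame, positive or negative depending on the sign.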

Typical Network Elements that are responsible for frequency shift are:

Very low bit rate vocoders

Speech enhancer (Noise suppressors)

Non-linear filter elements

Delay Spread (Voice Jitter)

The first stage in SQuad-LQ is the time alignment. This stage is able to deal with variable delays, which can occur in packet networks and are normally indicated by large jitter/delay or packet loss values. It collects information about shifted frames by comparison with the reference speech sample. The result of this alignment is a delay distribution of the coded signal. A histogram is presented that shows the number of speech frames as a function of arrival time (delay) in milliseconds. The results are generated for each 32 ms frame.
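Assuming the time alignment delivers one delay value in samples per 32 ms frame, the delay distribution could be built as in the following sketch. The input format and the function name are assumptions; the alignment itself is not shown.

```python
from collections import Counter


def delay_histogram_ms(frame_delays_samples, sample_rate_hz=8000):
    """Count frames per delay value, with delays converted to milliseconds.

    Returns a list of (delay_ms, frame_count) pairs sorted by delay.
    """
    counts = Counter(round(1000.0 * d / sample_rate_hz, 3)
                     for d in frame_delays_samples)
    return sorted(counts.items())
```

At 8 kHz, a frame that is shifted by -244 samples appears in the histogram at -30.5 ms.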

Speech Threshold

This is a value in dBov that shows the level of the speech in a coded signal. The measurement is based on building r.m.s. histograms for both the coded and the reference signal. dBov means decibels relative to the digital overload point. The range for this value is –90 to 0 dBov. For signals containing background noise, this value is between –55 and –40 dBov.

A histogram counts how often values occur within each of a set of data bins. The result is the number of occurrences of a value in a data set. A histogram table presents the energy-grade boundaries and the number of scores between the lowest bound and the current bound.





[Figure: Energy histogram for noisy signal. X axis: RMS of the coded signal (dB), from -53.0 to -15.0. Y axis: count, from 0 to 25. Marked positions: noise position, bound position, speech level.]

Figure 2-5 Histogram of Noised Speech Sample

Figure 2-5 shows an example of a histogram for a noisy speech signal of 10 seconds duration. Internally, SQuad-LQ builds the histogram with 50 bins between the minimum and maximum r.m.s. values. There are two maxima, one for speech pauses and one for speech-active intervals. In our example, the first maximum is found at –45.1 dB, which is the level of the silent intervals. The second peak is at about –26 dB, which is the speech-active level. The 'Speech threshold' measured in SQuad-LQ is defined as the boundary between these two peaks (Bound position).

Degradations

The list below presents some possible reasons for degradation of the Listening Quality value when a clean reference sample is used:

AGC (Adaptive Gain Control) Elements

Speech Enhancer / Noise Suppressors

Impulsive Noise

Background Noise

Interruptions

VAD (Voice Activity Detectors)

Variable Delay or Jitter in Packet Networks

AGC Problems

Indications: LQ less than expected, Level Chart indicates an abnormal level trend.

Example of an AGC of a mobile handset that attenuates too strongly toward the end of a sample:






Figure 2-6 Level Chart with AGC

Speech Enhancer / Noise Suppressors

Indications: LQ less than expected. Similarity Chart shows a bigger degradation over the complete speech signal. This must be checked with the SQuad Noise Suppression Test.

Impulsive Noise

Indications: LQ less than expected. Similarity Chart shows a lot of quite big degradation peaks.

Figure 2-7 Similarity Chart with Impulsive Noise

Background Noise

Indications: LQ less than expected. Signal Envelope Chart shows some additional energy during the speech pause.






Example:

Figure 2-8 Background Noise

Interruptions

Indications: LQ less than expected. Similarity Chart shows blue bars and Signal Envelope indicates a 'peak' drop.

Example of an Interruption due to a Handover (interruption is indicated in blue):

Figure 2-9 Level Chart with Handover

Interruption measurement is based on processing frames of 32 ms duration. Each frame is divided into 16 sub-frames (2 ms) in order to achieve better resolution. For each sub-frame, the signal level is calculated for both the reference and the degraded signal. The interruption flag for a sub-frame is set to TRUE if the level in the reference signal is higher than –35 dBov (r.m.s. = 400) and the level in the degraded signal is lower than –61 dBov (r.m.s. = 20).

The result is the ratio of the number of sub-frames with signal interruption to the total number of sub-frames (16):

$$\mathrm{Interruption\_result} = \frac{\mathrm{Nr\_Of\_Int\_Frames}}{16}$$

The interruption result lies in the range 0 to 1 with a step of 1/16. If only one sub-frame in the signal is lost (interrupted), then Interruption = 1/16 = 0.0625. If the signal is lost in all sub-frames, then Interruption = 1.
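The sub-frame rule can be sketched in Python as follows. This is an illustration under stated assumptions, not the SQuad implementation: the function names are hypothetical, the frame is assumed to be sampled at 8 kHz (256 samples per 32 ms frame), and the r.m.s. limits 400 and 20 are taken from the description above.

```python
import math

REF_RMS_MIN = 400.0   # reference level above about -35 dBov
DEG_RMS_MAX = 20.0    # degraded level below about -61 dBov
SUB_FRAMES = 16       # 16 sub-frames of 2 ms per 32 ms frame

def rms(samples):
    """Root mean square of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def interruption_result(ref_frame, deg_frame):
    """Fraction of interrupted sub-frames in one 32 ms frame (0 to 1)."""
    n = len(ref_frame) // SUB_FRAMES          # samples per sub-frame
    interrupted = 0
    for k in range(SUB_FRAMES):
        ref_sub = ref_frame[k * n:(k + 1) * n]
        deg_sub = deg_frame[k * n:(k + 1) * n]
        if rms(ref_sub) > REF_RMS_MIN and rms(deg_sub) < DEG_RMS_MAX:
            interrupted += 1
    return interrupted / SUB_FRAMES

# 32 ms at 8 kHz = 256 samples; the first 2 ms of the degraded frame are lost.
ref = [1000] * 256
deg = [0] * 16 + [1000] * 240
print(interruption_result(ref, deg))   # 0.0625
```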






VAD or Silence Suppression Problems

Indications: LQ less than expected. Average clipping values are high; Clipping Chart shows some significant clipping.

Example: Clipping at the beginning of the sentence. At the top is shown the signal envelope chart, at the bottom the clipping chart.

Figure 2-10 Time Clipping

Variable Delay (Voice Jitter)

Indications: LQ less than expected. Variable Delay Chart shows some delay values. This typically happens if a packet network (or backbone) and jitter buffers were used.

Figure 2-11 Variable Delay, Voice Jitter

Delays Deviation

“DelaysDeviation” is located in the section “!SQuad_LQ_AVG” of the SQuad result file and is defined as the absolute value of the standard deviation of the block delays (D) divided by the average of the block delays [in samples]. At a sampling frequency of 8000 Hz, the duration of one sample is 125 µs. “DelaysDeviation” shows the smoothness of an array of delays. A small “DelaysDeviation” value means a uniform delay distribution, whereas a large value indicates a large delay jitter, as in IP networks. For a single delay, this value is equal to zero.






$$\mathrm{DelaysDeviation} = \mathrm{fabs}\left(\frac{\mathrm{stdev}(D)}{\mathrm{average}(D)}\right)$$

Example: Coded file has fixed offset of 1024 samples to the reference file. Six blocks with different variable delays are found with Squad-LQ:

Table 2-1 Example for Variable Delay where five blocks are delayed at different offsets regarding Reference Speech Sample

Block   Delays (D) in samples   D"
All     1024                    0
A       780                     -244
B       1024                    244
C       1536                    512
D       1024                    -512
E       1304                    280

$$\mathrm{Stdev}(D) = \frac{\sqrt{N \sum D^2 - \left(\sum D\right)^2}}{N} = 241.53, \qquad \mathrm{average}(D) = \frac{1}{N} \sum D = 1115.3$$

DelaysDeviation=241.53/1115.3=0.217
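The numbers of this example can be checked with the Python standard library. The sketch below assumes that the standard deviation is the population standard deviation (division by N), which is what reproduces the value 241.53; the function name `delays_deviation` is hypothetical.

```python
import statistics

def delays_deviation(delays):
    """abs(population standard deviation / mean) of the block delays."""
    return abs(statistics.pstdev(delays) / statistics.mean(delays))

# Delays (in samples) from Table 2-1: All, A, B, C, D, E
delays = [1024, 780, 1024, 1536, 1024, 1304]
print(round(statistics.pstdev(delays), 2))   # 241.53
print(round(statistics.mean(delays), 1))     # 1115.3
print(round(delays_deviation(delays), 3))    # 0.217
```

For a single delay value the standard deviation is zero, so the function returns zero as described above.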

Delay Spread is another important parameter. It describes the maximum delay amplitude calculated over all single group delays. Based on the example above, we can calculate new delay values (D"), which are the D values scaled by subtracting a fixed delay from each of them.

$$D''_i = D_i - \mathrm{fix\_delay}$$

For example, fix_delay = 1024 samples.

Figure 2-12. Example for Variable Delay, which shows that Block A is shifted by –244 samples to the left (arrives earlier when compared with the same reference block). Block B arrives later by 244 samples.

Delay Spread is calculated as a distance between the minimum and the maximum block delay. In our example, minimum value is –512 samples and maximum is +512 samples. So the distance between max and min equals 1024 samples. This is then converted to time in ms.

$$\mathrm{DelaySpread} = \mathrm{Delay\_smp} \cdot \mathrm{smp\_duration}, \qquad \mathrm{smp\_duration} = \frac{1}{F_s}, \qquad F_s = 8000\ \mathrm{Hz}$$

For our example, this gives DelaySpread = 1024 / 8000 s = 128 ms.
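The conversion from samples to milliseconds can be sketched as follows. The function name `delay_spread_ms` is hypothetical, and the input values are the D" column of Table 2-1 exactly as printed.

```python
FS_HZ = 8000  # sampling frequency used in the example

def delay_spread_ms(scaled_delays, fs_hz=FS_HZ):
    """Distance between minimum and maximum delay, converted to ms."""
    spread_samples = max(scaled_delays) - min(scaled_delays)
    return 1000.0 * spread_samples / fs_hz

# D" column of Table 2-1 as printed (values in samples)
d_scaled = [0, -244, 244, 512, -512, 280]
print(delay_spread_ms(d_scaled))   # 128.0
```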

Frequency Shifts

The distribution of the frequency shifts is shown in the histogram below, with the number of frames in which a shift at a certain frequency occurred. The diagram covers the whole range of frequencies in steps of 31.25 Hz.

Figure 2-13 Frequency Shift

Quality Code

The thresholds for each degradation descriptor are as follows:

“MOS-drops”

Quality distribution is unsteady such as during handovers or interruptions.

“Received signal level out of recommended range”

The level difference to the reference level exceeds +9dB or falls below -12dB.

“Signal interruptions”

Temporal clipping for more than 8 ms.

“High DC-Offset”

Malfunction of terminal or interface card. DC-Offset > 0.2%.

“Variable delay”

Indicates possible packet-switched transmission.

“Variable delay during speech”

Same as “Variable delay” but occurring during speech active intervals.

“Background noise”

High level of circuit noise. Higher than –50 dBov.

“Impulse noise”

Relay/switching problems detected. More than 1 pulse / second.

“Low bitrate coding / coding artefacts”

A low bit rate coding scheme has been used (e.g. less than 8 kbit/s) or residual errors from decoding are introduced (e.g. by frame loss concealment).

“Not Specified”

signalizes that the speech quality is degraded but no outstanding reason for that degradation could be classified.

“OK”

shows that the speech quality is nearly non-degraded
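The descriptors with a numeric threshold can be illustrated with a short Python sketch. It is not the SQuad classification logic: the function name, the parameter names and the idea of passing the measured values directly are assumptions, and only the thresholds quoted in the list above are used.

```python
def degradation_descriptors(level_diff_db, clipping_ms, dc_offset_pct,
                            noise_dbov, pulses_per_s):
    """Return the descriptors whose numeric threshold is exceeded."""
    found = []
    if level_diff_db > 9 or level_diff_db < -12:
        found.append("Received signal level out of recommended range")
    if clipping_ms > 8:
        found.append("Signal interruptions")
    if dc_offset_pct > 0.2:
        found.append("High DC-Offset")
    if noise_dbov > -50:
        found.append("Background noise")
    if pulses_per_s > 1:
        found.append("Impulse noise")
    return found

print(degradation_descriptors(-15.0, 0.0, 0.92, -60.0, 0.0))
# ['Received signal level out of recommended range', 'High DC-Offset']
```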

Furthermore, special problems in the audio-path will be reported:

“Silence/Audio Level Too Low”

There is no signal activity in the audio path or the signal level is below -45dBov. SQuad-LQ will not be calculated, since it would lead to misleading results.

“Corrupted Signal/Wrong Reference”

Here the received audio signal is heavily corrupted (e.g. only partly transmitted or the audio stream was lost completely). Such behaviour can be observed, for example, during call drops. Normally, SQuad will score those signals close to 1.0. For statistical reasons, NQDI allows the exclusion of such results from the reporting.

This indicator will also signalize if a wrong reference signal was used for SQuad.

Option: P.862 'PESQ'

Optionally, the SQuad-LQ framework can also include ITU-T P.862 'PESQ', an additional model for objective speech quality prediction. As a psycho-acoustically driven comparison method, P.862 works on a principle very close to that of SQuad-LQ, so the received signals can also be evaluated by P.862.

The ITU-T Recommendation P.862 was finalized in 1999 and approved in February 2000. It was trained on a large number of databases, mainly from codec standardization activities in ITU-T.

If the P.862 option is used in SQuad-LQ, the SQuad-framework will report two additional quality results:

P.862 score: raw outcome of the ITU-T algorithm

Listening Quality (P.862.1): transformed result according to P.862.1 on a MOS scale from 1 to 5

Figure 2-14 P.862 result representation

Both results are based on the same algorithm. The transformation according to P.862.1 is only a scale mapping.

Figure 2-15 P.862 vs. P.862.1 scale transformation
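The scale mapping can be sketched in Python. The coefficients below are those of the mapping function published in ITU-T Recommendation P.862.1; they are not stated in this manual, and the function name `p862_1_mos_lqo` is hypothetical.

```python
import math

def p862_1_mos_lqo(p862_raw):
    """Map a raw P.862 score to the P.862.1 listening quality scale."""
    return 0.999 + 4.0 / (1.0 + math.exp(-1.4945 * p862_raw + 4.6607))

# The raw P.862 range of -0.5 to 4.5 maps to about 1.02 to 4.55.
for raw in (-0.5, 2.0, 3.0, 4.5):
    print(raw, round(p862_1_mos_lqo(raw), 2))
```

Because the function is strictly increasing, the ranking of two measurements is the same on both scales; only the numeric values differ.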

Note: The P.862 results tend to be slightly lower than those of SQuad-LQ, especially in the range from 3.0 to 4.0. This is mainly caused by the high sensitivity of P.862 to clipping and time-variant filtering.

It must also be taken into account that P.862 does not rate linear distortions such as frequency responses. Those linear distortions are compensated completely by P.862 itself before the quality prediction starts.

The P.862 option has to be enabled by the test-type 'Speech-P.862' and requires a special software key.

[Plot for Figure 2-15: P.862 raw score on the x axis (-0.5 to 5.0), P.862.1 on the y axis (1.0 to 5.0); scale limit = 4.5.]




3 SQuad Noise Suppression

Introduction

Noise suppression is a feature designed to enhance speech quality in environments where there is significant (acoustic) background noise. The noise suppression function is a pre-processing module that improves the signal-to-noise ratio of a speech signal prior to voice coding.

For noise suppressors, there are certain requirements that need to be fulfilled:

The noise suppression function must not have a statistically significant distorting effect on clean speech in comparison with the performance of the speech codec without noise suppression applied.

The noise suppression function must not introduce any degradation of speech or any undesirable effects in the residual noise when there is (acoustic) background noise in the speech signal.

DTMF and other signalling tones transmission performance during the application of noise suppression shall be no worse than when noise suppression is turned off.

The above requirements are all checked with the SQuad Noise Suppression test.

The algorithm measures the Noise Power Level Reduction (NPLR) and the Signal-to-Noise Ratio Improvement (SNRI), similar to the definitions in the ETSI STC SMG11 (GSM 06.77) document. A comparison of the SNRI and NPLR measures is used to obtain an indication of possible speech distortion produced by the tested NS method.

For the Noise Suppression test, two reference signals are used:

Clean speech reference

Clean speech with background noise

The sample with background noise is sent as a test sample.

Listening Quality

Speech quality is measured according to ITU-T P.800, where the coded file and the clean reference are the inputs for the SQuad-LQ algorithm. The algorithm is described in Section 2.

The Listening Quality evaluation runs twice: the LQ of the noisy input signal is estimated, as well as the LQ of the de-noised output signal. The change in speech quality is derived from both results.






Figure 3-1 The Principle of MOS Calculation in Squad-NS.

First, the internal reference (MOS_ref) is calculated. The degraded signal is assessed by comparing it with the clean reference. The result is presented on the CCR scale by subtracting MOS_ref from the measured MOS.

Speech quality measurement in a noisy environment is done by sending the noisy reference through the network under test. The noisy reference is made by adding a noise signal to the clean reference. Comparing the clean reference with the coded signal would not produce stable results, since the SNR of the noisy reference affects the results. To make this measurement independent of the reference properties, MOS_ref is calculated first. MOS_ref defines the reference speech quality, that is, the quality that would be measured in the degraded signal if there were no degradations or improvements in the network. This value is mostly lower than 4.5 (excellent quality) because of the influence of noise. The range of MOS generated by Squad-NS is –3.5 to +3.5, which is slightly different from the ITU definition.

Comparison Category Rating (CCR)

The range of the Comparison Category Rating (CCR) scale as defined in ITU-T P.800:

3: Much Better

2: Better

1: Slightly Better

0: About the Same

-1: Slightly Worse

-2: Worse

-3: Much Worse

The CCR methods are particularly useful for assessing the performance of telecommunications systems when the input has been corrupted by background noise. An advantage of the CCR method over the other scales is the possibility to assess speech processing that either degrades or improves the quality of the speech.

NS-Speech Power Classes

SNRI and SPLR are calculated once as overall average values and once per speech power class. There are 6 different power classes:






Definition: ETSI

Performance objective: Speech level = -26 dBov, determined according to ITU-T P.56

Table 3-1 Energy Windows used in Calculation of SPLR and PLR

Range   Description                Level                                               Class
3       high power frames          > speech level – 1 dB                               h
2       medium power frames        > speech level – 10 dB                              m
1       low power frames           > speech level – 16 dB                              l
0       noise only frames          < speech level – 19 dB and > speech level – 34 dB   noise
-1      pause frames               < speech level – 34 dB                              p
-2      not used for calculation   < speech level – 16 dB and > speech level – 19 dB   unused
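The energy windows of Table 3-1 can be expressed as a simple classification function. This is a sketch under assumptions: the function name `power_class` is hypothetical, frame levels are assumed to be given in dBov, and the treatment of values that lie exactly on a window boundary is not specified in the table and is chosen arbitrarily here.

```python
def power_class(frame_level_db, speech_level_db=-26.0):
    """Classify one frame by its level relative to the speech level."""
    rel = frame_level_db - speech_level_db
    if rel > -1:
        return "h"        # range 3, high power frames
    if rel > -10:
        return "m"        # range 2, medium power frames
    if rel > -16:
        return "l"        # range 1, low power frames
    if rel > -19:
        return "unused"   # range -2, not used for calculation
    if rel > -34:
        return "noise"    # range 0, noise only frames
    return "p"            # range -1, pause frames

print([power_class(x) for x in (-26.0, -30.0, -40.0, -44.0, -50.0, -70.0)])
# ['h', 'm', 'l', 'unused', 'noise', 'p']
```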

Figure 3-2 The Five Energy Windows for 16 bit Digital System (90.3 dB dynamics)

Example of the Signal Level Group Chart:

Figure 3-3 Speech Power Class Chart






SNRI, Signal-to-Noise Ratio Improvement

Definition: ETSI

Performance objective: 6 dB or higher

Formula:

SNRIx = per speech power class,

SNRI = 1 / (Nh + Nm + Nl) * (Nh * SNRIh + Nm * SNRIm + SNRIl * Nl)

SNRIx = 10 * ( log(SNRcod_x) – log(SNRref_x) )

SNRy_x = cod / nse

Range   Description                                       Quality
0       SNR from coded and reference signal are equal     no improvement
< 0     SNRref > SNRcod, lower SNR in coded signal        worse
> 0     SNRref < SNRcod, higher SNR in coded signal       better

NPLR, Noise Power Level Reduction

Definition: ETSI

Performance objective: -7 dB or lower

Formula:

NPLR = 10 * ( log(PLcod_nse) – log(PLref_nse) )

Range   Description                                                 Quality
0       Noise levels from coded and reference signal are equal      no noise reduction
< 0     PLref > PLcod, lower noise level in coded signal            good
> 0     PLref < PLcod, higher noise level in coded signal           bad
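The SNRI and NPLR formulas given above can be written as short Python functions. This sketch assumes that log denotes the base-10 logarithm and that SNR and PL are linear power quantities; all function names and the example numbers are hypothetical.

```python
import math

def snri_class(snr_cod, snr_ref):
    """SNRI for one speech power class, in dB."""
    return 10.0 * (math.log10(snr_cod) - math.log10(snr_ref))

def snri_overall(snri, counts):
    """Frame-count weighted average over the classes h, m and l."""
    total = sum(counts[c] for c in "hml")
    return sum(counts[c] * snri[c] for c in "hml") / total

def nplr(pl_cod_nse, pl_ref_nse):
    """Noise power level reduction in dB (negative means less noise)."""
    return 10.0 * (math.log10(pl_cod_nse) - math.log10(pl_ref_nse))

snri = {"h": snri_class(400.0, 100.0), "m": 5.0, "l": 4.0}
counts = {"h": 50, "m": 30, "l": 20}     # Nh, Nm, Nl
print(round(snri_overall(snri, counts), 2))
print(round(nplr(1.0, 10.0), 1))         # -10.0
```

A noise power that drops to one tenth of the reference value gives NPLR = -10 dB, which falls in the "good" row of the table above.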

SPLR, Signal Power Level Reduction

Both SNRI and NPLR are defined in ETSI document TS 101 512, V8.0.0. Signal Power Level Reduction (SPLR) is SwissQual's extension of the NPLR measurement, where NPLR is a subset of SPLR. SPLR is the difference between coded and reference energy, calculated separately for each energy window.

Note: Noise reduction should reduce only noise parts in a signal.

The definition of the windows is given in Table 3-1. The aim of this measurement is to detect the influence of noise reduction circuits on the speech parts of the signal.

Five SPLR values are calculated: SPLR_h, SPLR_m, SPLR_l, SPLR_n and SPLR_p. SPLR_n is equal to the NPLR value. Good noise reduction would generate SPLR_h close to zero and SPLR_p below –10 dB. The downward trend curve through these five values shows the quality and ability of the noise reduction circuit to reduce only noisy frames and to keep the speech-active frames unchanged. In other words, the first coefficient (a) of the trend curve y=ax+b must be negative (see example in Figure 16). The SPLR measure in the Squad-NS algorithm is equal to this coefficient (a) of the trend curve.
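The coefficient (a) of the trend curve can be illustrated with an ordinary least-squares fit. The manual does not state how the trend curve is fitted or which x positions are used, so both are assumptions here: the five class values are placed at x = 0 to 4 in the order h, m, l, n, p, and the function name `splr_trend` is hypothetical.

```python
def splr_trend(splr_values):
    """Least-squares slope a of y = a*x + b through the class values.

    splr_values are ordered h, m, l, n, p and placed at x = 0..4
    (the x positions are an assumption).
    """
    n = len(splr_values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(splr_values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, splr_values))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

good = [-0.5, -1.0, -3.0, -8.0, -12.0]   # speech kept, noise reduced
print(splr_trend(good) < 0)              # True
```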

Figure 3-4 SPLR Calculation out of Five Values Calculated in Five Different Energy Windows.

The bottom picture shows good noise reduction, whereas on the right is shown poor noise reduction.

SPLR is then mapped to a new range 1 – 4.5 (like the MOS scale). This mapping from SPLR to mSPLR is shown in Figure 16. mSPLR > 2.5 should be achieved for good noise reduction.






Figure 3-5 Speech Power Class Chart

Overall NS Quality

Figure 3-6 Overall NS Quality






Quality Index

Figure 3-7 Calculation of Quality Index

The Quality Index is calculated from four input parameters: NPLR, SNRI, SPLR and MOS_acr.

The Quality Index was first introduced in SW Release 2.2. Four values, SMOS, SNRI, NPLR and SPLR, are combined into one objective number. SMOS is measured with SQuad-LQ, where the clean reference is compared with the coded signal. The range of the Quality Index is 1 to 4.5 (as for MOS). A rating of 1 stands for bad quality and 4.5 for excellent quality. The following equation shows the calculation of the Quality Index based on the four input parameters, previously scaled into the range 1 to 4.5.

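$$Q_{idx} = \alpha_{NPLR} \cdot mNPLR + \alpha_{SNRI} \cdot mSNRI + \alpha_{SPLR} \cdot mSPLR + \alpha_{MOS} \cdot MOS, \qquad \sum_i \alpha_i = 1$$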

Note: The Quality Index describes the performance of the noise reduction system in combination with the network and not the Listening Quality of the de-noised signal.

The following table shows some measurement examples for different network conditions including noise reduction effects:






Figure 3-8 Some Experimental Results for Different Configurations of NR in the Network

Figure 3-9 MOS vs. MOSobj & Quality Index

The Quality Index correlates much better with MOS_CCR than MOSobj based only on speech quality evaluation.

Convergence Time

To measure the Convergence Time in a noisy signal, the algorithm examines the first two seconds of the given signal. For the calculations it uses the filtered difference between the coded and the reference signal (red color, see Figure 3-10).

Figure 3-10 Example of Convergence Time Evaluation

First it checks whether the signal belongs to the noise or pause group, and then it compares the data with the set threshold. The threshold is calculated as NPLR plus 25 percent (default, use PERCENT to change) of the difference between the maximum value of the filtered signal in these first 2 seconds and the noise level afterwards (NPLR). If the filtered data is lower than the threshold, the first condition for convergence is fulfilled (see Figure 3-11).
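The threshold rule can be written down directly. The sketch below is an illustration only: the function names are hypothetical, the filtered difference is assumed to be a list of values in dB, and the preceding check for the noise or pause group is omitted.

```python
def convergence_threshold(max_filtered_db, nplr_db, percent=25.0):
    """Threshold = NPLR + percent of (maximum - NPLR), all in dB."""
    return nplr_db + (percent / 100.0) * (max_filtered_db - nplr_db)

def first_below(filtered_db, threshold_db):
    """Index of the first value below the threshold, or None."""
    for i, value in enumerate(filtered_db):
        if value < threshold_db:
            return i
    return None

thr = convergence_threshold(max_filtered_db=0.0, nplr_db=-20.0)
print(thr)                                                   # -15.0
print(first_below([0.0, -5.0, -12.0, -16.0, -19.0], thr))    # 3
```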





[Plot for Figure 3-11: filtered difference; y axis from -25 to 5, x axis from 1 to 191.]

Figure 3-11 Filtered Difference Envelope is Compared with the Threshold Value

The second condition is that the signal has a falling tendency. To verify this, we check 5 (default, use CT_NR_POINTS to change) equally spaced points over the tested convergence time. For a falling signal, the difference in values between every two consecutive points has to be less than zero. In Fig. 21, we see that the difference between the signal values at the third and fourth points is greater than 0, which signifies a rising tendency of the signal. Here we perform an additional check to clarify what is actually going on.
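The five-point check can be sketched as follows. The function name `is_falling` and the way the equally spaced points are chosen (from index 0 up to the tested convergence index) are assumptions made for illustration.

```python
def is_falling(filtered_db, end_index, nr_points=5):
    """True if nr_points equally spaced samples up to end_index decrease."""
    step = end_index / (nr_points - 1)
    points = [filtered_db[round(k * step)] for k in range(nr_points)]
    diffs = [b - a for a, b in zip(points, points[1:])]
    return all(d < 0 for d in diffs)

print(is_falling([0, -2, -4, -6, -8, -10, -12, -14, -16], 8))   # True
print(is_falling([0, -2, -4, -6, -3, -10, -12, -14, -16], 8))   # False
```

In the second call the third and fourth points are -3 and -12 after a second point of -4, so the difference between the second and third points is positive and the falling condition fails.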

[Plot for Figure 3-12: difference envelope; y axis from -40 to 30, x axis from 1 to 190.]

Figure 3-12 Five Point Analysis of the Difference Envelope during Decision on Noise Reduction State

This test is based on the average level of the signal before and after the first convergence criterion is met. If the average level of the signal after falling below the threshold is less than that threshold, and the average level of the signal before that point is higher than the same threshold, we say that the signal has converged. If not, the algorithm continues searching for convergence until the end of the 2-second buffer.
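This additional check compares two mean values with the threshold. A minimal sketch, assuming the filtered difference is a list of dB values and `crossing_index` (a hypothetical name) is the index at which the signal first falls below the threshold:

```python
def has_converged(filtered_db, crossing_index, threshold_db):
    """Compare the mean level before and after the crossing point."""
    before = filtered_db[:crossing_index]
    after = filtered_db[crossing_index:]
    if not before or not after:
        return False
    mean_before = sum(before) / len(before)
    mean_after = sum(after) / len(after)
    return mean_before > threshold_db and mean_after < threshold_db

signal = [0.0, -5.0, -12.0, -16.0, -19.0, -20.0]
print(has_converged(signal, 3, -15.0))   # True
```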

[Plot for Figure 3-13: difference envelope with the threshold and the convergence time marked; y axis from -40 to 30, x axis from 1 to 191.]

Figure 3-13 Additional Condition before a Final Decision is Calculated.

The levels before and after place of discontinuity are calculated.

Noise Reduction/Suppression Test

This measurement gives general information about the type of noise treatment applied in the communication channel. Based on the differences between the signal power level reductions in the speech power classes, the algorithm decides whether noise suppression or noise reduction was applied. Experimental results have shown that it is necessary to define one more situation: if the signal power level reduction in the pause class is higher than the SPLR in the high class plus an offset of 3 dB, we say that the noise in the communication channel was not treated in either way. Reference suitability is a fourth possible result of this measurement.

Figure 3-14 Typical Reference Signal with White Noise Added

Noise suppression and noise reduction are both used to enhance speech quality in environments where there is significant (audible) background noise (see Fig. 24). Noise suppression reduces the noise in the pause and noise power classes and has very little or no influence on the higher power classes (speech-active intervals). In contrast, noise reduction reduces the noise equally in all power classes. Therefore, we have based our algorithm on measuring the difference between the signal power level reductions in the high and medium speech power classes. If the difference between the two levels is less than a calculated threshold, we say that noise suppression was applied.


Figure 3-15 Noise Suppression Applied on Signal

Figure 3-16 Noise Reduction Applied on Signal

The level of noise in the reference signal plays an important role. If the signal-to-noise ratio of the reference signal is higher than 30 dB, we say that the reference signal is not suitable for conducting the measurement because the noise level is too low. The same reference SNR is used for calculating the threshold offset between the SPLRs of the high and medium power classes.

Examples

The Signal Envelope shows that the noise is really reduced and the speech part is more or less the same as for the reference signal.

Figure 3-17 NS Signal Envelope

Evaluation of the transmitted signal

In addition to the ratings of the noise reduction systems described above, the results of the listening quality evaluation of the transmitted and de-noised signal are given in a separate section.

Here the SQuad-LQ is applied on the transmitted signal.






Figure 3-18 Results presentation within NQDI

Besides the pure LQ, the speech and noise levels are also shown. Furthermore, the clipping value can be used to evaluate non-linear processing by the NS device.

References

ETSI TS 101 512, V8.0.0, (GSM 06.77 version 8.0.0 Release 1999), Digital cellular telecommunication system (Phase 2+), Minimum Performance Requirements for Noise Suppresser, Application to the AMR Speech Encoder

ETSI TS 101 745, V8.0.0, (GSM 02.76 version 8.0.0 Release 1999), Noise Suppression for the AMR Codec, Service Description, Stage 1

ETSI TS 101 831, V8.0.0, (GSM 06.78 version 8.0.0 Release 1999), Digital cellular telecommunication system (Phase 2+), Results of the AMR Noise Suppression Selection Phase, Application to the AMR Speech Encoder




4 DTMF Tests

Introduction

In telecommunications today, the most used signalling system is DTMF signalling. DTMF stands for Dual Tone Multi-Frequency. As the name suggests, the DTMF signal consists of two superimposed sinusoidal waveforms with frequencies chosen from a set of eight standardized frequencies.

When a DTMF signal is sent over a network it can be degraded, especially when it is encoded. For an operator of a network, it is of interest to know if the receiver of the DTMF signals can convert the DTMF signal back into a digit or a symbol. The objective is to measure the percentage of detected and undetected DTMF digits.

In the first part of SwissQual's algorithm for the DTMF test, the algorithm scans through a given signal and detects the locations of DTMF signals. Once a DTMF signal is found, the algorithm calculates its characteristics and decides if the signal is valid. If the tone is invalid, the DTMF-Test reports which condition was not met. The algorithm collects all characteristics and saves them in a file.

The DTMF signal used for the tests consists of two frequencies. According to CCITT Recommendations Q.23 [5] and Q.24, there are two frequency groups, each with four frequencies: the low group (697, 770, 852 and 941 Hz) and the high group (1209, 1336, 1477 and 1633 Hz).

The figure below shows how the frequencies are allocated to the various digits and symbols of a push-button set. Every digit and symbol consists of a frequency from the low and the high group.

Figure 4-1 Allocation of Frequencies to the Various Digits and Symbols of a Push-button Set

DTMF-Test Overview

One or more DTMF signals are sent over a network. The coded signal is used for the DTMF-Test. This signal is available in digital form; the data format is PCM (without compression). The sampling frequency is 8 kHz or 16 kHz. The digital quantization of the signal can be 8 bit (unsigned or signed) or 16 bit (big or little endian). Internally, the algorithm works with 16 bit resolution. The figure below illustrates the basic algorithm of the DTMF-Test. The DTMF-Test saves its result in a comma-delimited text file.






Figure 4-2 Block Diagram of DTMF-Test

Criteria

The objective of the SwissQual model for DTMF testing is to measure the percentage of undetected DTMF digits processed through the network. The DTMF signals are generated at the frequencies specified in the ITU-T Rec. Q.23.

The algorithm follows the ETSI guidelines defined in "TS 101 235-1" "Technical Specification of Dual Tone Multi-Frequency (DTMF)".

The received DTMF signal shall be detected as valid when:

Only two of the signalling frequencies are present, one from the high group and one from the low group, fulfilling the conditions as described above

Each of these signalling frequencies is within +/-(1.5 % + 2 Hz) of the nominal value

The level of each of these two signalling frequencies is within the range -27 dBV to -5 dBV

The difference in level of these two signalling frequencies is not more than 6 dB.
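The validity conditions listed above can be sketched in Python. This is not SwissQual's detector: it assumes that the two tone frequencies and their levels have already been measured, the function names are hypothetical, and the frequency groups and keypad layout are the standard ITU-T Q.23 values.

```python
LOW_GROUP_HZ = (697, 770, 852, 941)
HIGH_GROUP_HZ = (1209, 1336, 1477, 1633)
# Rows follow the low group, columns follow the high group.
KEYPAD = ("123A", "456B", "789C", "*0#D")

def nearest(freq_hz, group):
    """Nominal frequency within +/-(1.5 % + 2 Hz), or None."""
    nominal = min(group, key=lambda f: abs(f - freq_hz))
    tolerance = 0.015 * nominal + 2.0
    return nominal if abs(freq_hz - nominal) <= tolerance else None

def decode_dtmf(low_hz, high_hz, low_dbv, high_dbv):
    """Return the digit if the tone meets the criteria above, else None."""
    low = nearest(low_hz, LOW_GROUP_HZ)
    high = nearest(high_hz, HIGH_GROUP_HZ)
    if low is None or high is None:
        return None
    if not all(-27.0 <= level <= -5.0 for level in (low_dbv, high_dbv)):
        return None
    if abs(high_dbv - low_dbv) > 6.0:
        return None
    return KEYPAD[LOW_GROUP_HZ.index(low)][HIGH_GROUP_HZ.index(high)]

print(decode_dtmf(770.0, 1340.0, -10.0, -12.0))   # 5
print(decode_dtmf(770.0, 1400.0, -10.0, -12.0))   # None
```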

Results

Table 4-1 DTMF Result Code

Code               Description
Tone Length        Length of a DTMF Tone
Pause Length       Pause Length between two DTMF tones
Measured Level     Average Level of a DTMF Tone
Level Deviation    Level Deviation of the two frequencies of a DTMF Tone
Freq. Low          Low Frequency value in Hertz
Freq. High         High Frequency value in Hertz
DevFreqLow [Hz]    Deviation of the low frequency from the standard in Hertz
DevFreqHigh [Hz]   Deviation of the high frequency from the standard in Hertz
DevFreqLow [%]     Deviation of the low frequency from the standard in percent
DevFreqHigh [%]    Deviation of the high frequency from the standard in percent
Twist [dB]         Level difference between the high and the low frequency

Signal Valid Signal valid code:

Cause: Valid (0)

If the received tone matches all conditions, then the signal is valid and the field 'SignalValid' is set to '0'

Cause: TooShort (-1)

If the received tone is too short (<40ms) then the signal is invalid and the field 'SignalValid' is set to the code '-1'

Cause: NoDigit (-2)

If the received tone does not contain the two frequencies as specified, the signal is invalid and the field 'SignalValid' is set to '-2'

Cause: LowLevel (-3)

If the Noise Level is more than 10% of the Signal Level of a tone, then the signal is invalid and the field 'SignalValid' is set to '-3'

Cause: FreqDeviation (-4)

If one of the two frequencies is out of the specified range (+/- 1.5%), then the signal is invalid and the field 'SignalValid' is set to '-4'

Cause: LevelDiff (-5)

If the level difference of the two frequencies for one tone is more than 10dB, then the signal is invalid and the field 'SignalValid' is set to '-5'

Cause: Unknown (-6)

If there is no tone at all, then the signal is invalid as well and the field 'SignalValid' is set to '-6'

Cause: TooLong (-7)

If the received tone is too long (>90ms), then the signal is invalid and the field 'SignalValid' is set to the code '-7'






Signal Match Signal match code:

Cause: NotRegular (A)

If a tone in the coded file is too short, or wrong in any other aspect, it is not matched with the reference tone and the field 'SignalMatch' is set to the code 'A'

Cause: AdditionalNotRegular (B)

If there are one or more irregular tones with no reference, the field 'SignalMatch' is set to the code 'B'

Cause: MissingTone (C)

If a reference tone has no match but there are two or more irregular tones, the field 'SignalMatch' is set to the code 'C'

Cause: MultipleMissingTones (D)

If there are two or more reference tones with no match but three or more irregular tones, the field 'SignalMatch' is set to the code 'D'

Cause: MultipleMissingTones (E)

If there are more reference tones with no match than irregular tones, the field 'SignalMatch' is set to the code 'E'

Cause: MultipleTone (F)

If a reference tone has two or more matching tones in the coded file, the field 'SignalMatch' is set to the code 'F'

Cause: AdditionalTone (G)

If there is no reference for a tone in the coded file, the field 'SignalMatch' is set to the code 'G'

Cause: MissingTone (H)

If for a reference tone there is no tone in the coded file, the field 'SignalMatch' is set to the code 'H'

Cause: Disparity (I)

If the number and order for a string of tones in reference and coded file cannot be matched, the field 'SignalMatch' is set to the code 'I'




5 SQuad Advanced Echo Check (Passive Test)

Introduction

The Advanced Echo Check (also called Acoustic Echo Check) can be applied to a SwissQual measurement probe at the far-end side as well as to any number to which an automatic hook-up device is connected. SQuad-AEC does not require any artificial test signals; it is optimized to detect echoes using human speech as the measuring signal. It therefore works for all technologies that serve voice communications, which makes the algorithm ready for in-service live monitoring.

The SQuad-AEC measurement detects echoes in an active connection by sending a speech signal to the far-end side and observing the receiving direction for any reflections. If no signal is inserted at the far-end side, the procedure measures during that single-talk situation only. Because commonly used non-linear processors such as VADs suppress low-power send signals, echoes will not occur in such connections either. Therefore, the SQuad-AEC algorithm is also designed to detect echoes during double-talk situations. Such a double-talk situation may be simulated by an actively playing answering station, or by using a real phone at the far-end side and talking during the measurement. This double talk at the far-end side switches the sending path through, so that the echo can also be transmitted.

The SQuad-AEC test is especially designed to detect electrical as well as acoustical echoes. It is able to detect 'dry' and 'hollow' acoustical echoes as well as hybrid echoes, and it is more robust against double talk. If the far-end station is connected via 4-wire, the echoes introduced by the network will be found. By using real (echo-producing) terminals, the echoes caused by the network AND the terminal can be calculated.

Echo Measurement

The Advanced Echo Check Passive Test (AEC passive) does not simulate anything on the B-Side. The A-Side starts a call and, after the B-Side has answered the call, the recording of the down-link (B->A) audio stream is started. When the recording of the stream has finished, the search for an echo signal is started by comparing the recorded signal with the reference signal.

On the B-Side, we can use any (self-answering) voice terminal or a SwissQual Diversity measurement probe. The AEC algorithm is able to detect echo in the presence of background noise and double talk.

The algorithm runs in two steps:

Observing a wide range of echo delay for possible echoes (scan procedure)

Analysing accepted echo regions in detail for calculating the echo loss and the other results
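The two steps can be illustrated with a minimal sketch. It is not the SQuad-AEC algorithm: it swaps in a plain normalized cross-correlation for the scan and an unweighted level difference for the detailed analysis, and all function names and the synthetic signal are assumptions made for illustration only.

```python
import math
import random

def scan_echo_delay(sent, received, max_lag):
    """Step 1 (sketch): scan lags 0..max_lag (samples) and return the lag
    with the highest normalized cross-correlation, plus that correlation."""
    best_lag, best_corr = 0, 0.0
    energy_sent = sum(s * s for s in sent)
    for lag in range(max_lag + 1):
        segment = received[lag:lag + len(sent)]
        if len(segment) < len(sent):
            break  # not enough received samples left for this lag
        energy_seg = sum(r * r for r in segment)
        if energy_sent == 0.0 or energy_seg == 0.0:
            continue  # silent segment, nothing to correlate
        corr = sum(s * r for s, r in zip(sent, segment))
        corr /= math.sqrt(energy_sent * energy_seg)
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    return best_lag, best_corr

def level_difference_db(sent, received, lag):
    """Step 2 (sketch): unweighted level of the aligned received segment
    relative to the sent signal. The G.122 weighting is NOT applied here."""
    segment = received[lag:lag + len(sent)]
    energy_sent = sum(s * s for s in sent)
    energy_seg = sum(r * r for r in segment)
    return 10.0 * math.log10(energy_sent / energy_seg)

def lag_to_ms(lag, sample_rate_hz):
    """Convert a lag in samples to milliseconds."""
    return 1000.0 * lag / sample_rate_hz

# Synthetic check: noise as the sent signal, and a copy delayed by
# 120 samples and attenuated by a factor of 0.1 (20 dB) as the echo.
rng = random.Random(0)
sent = [rng.uniform(-1.0, 1.0) for _ in range(400)]
received = [0.0] * 120 + [0.1 * s for s in sent] + [0.0] * 100
lag, corr = scan_echo_delay(sent, received, max_lag=200)
loss_db = level_difference_db(sent, received, lag)
```

On this synthetic signal the scan recovers the 120-sample delay (15 ms at an assumed 8 kHz sample rate) and the level difference comes out at about 20 dB. Real speech, noise and double talk make the problem considerably harder than this toy case.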

Measurement Results

The AEC algorithm generates the following results:

Signal type

Echo Delay in milliseconds

Echo Loss during Single Talk acc. ITU-T G.122

Echo Loss for the complete signal (incl. Double Talk)

Echo Objection Rate acc. ITU-T G.131 in %

Distance to 1% Echo Objection Rate acc. ITU-T G.131

ECHO status

GSM3.50 test

Double Talk Ratio






Level of Received Signal in dBov

Signal type can be “Echo”, “SideTone”, “Double Talk”, “Silence” or combinations of them. Received signal parts that are correlated with the send signal and arrive with less than 20 ms delay are rated as sidetone. Double talk is signalled if the received signal exceeds the defined double talk threshold for more than 40% of the signal duration.

A signal type “NoEcho” indicates that the connection is echo-free or that the echo will not be perceptible. Additionally, a signal type “NoEchoFound” signals that the echo detection was heavily disturbed by unexpected distortions in the signal. The connection might be echo-free but could also contain a masked echo.

An additional parameter shows the Double Talk Ratio in %. The Double Talk Ratio indicates to what extent the found echo is superimposed by double talk. Only signal parts where the near-end sending signal exceeds -36 dBov (approx. -30 dBm) and the far-end signal is at least -48 dBov (approx. -42 dBm) are classified as double talk. In case of strong echoes this double talk threshold is increased to minimize the classification of strong echoes as double talk (see footnote 1).
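The double talk classification described above can be sketched as follows. The frame-wise level lists, the function name and the choice to relate the count to all frames are assumptions; the adaptive raising of the threshold for strong echoes is not modelled.

```python
NEAR_END_THRESHOLD_DBOV = -36.0  # near-end sending signal must exceed this
FAR_END_THRESHOLD_DBOV = -48.0   # far-end signal must be at least this

def double_talk_ratio(near_levels_dbov, far_levels_dbov):
    """Return the percentage of frames classified as double talk.

    Both arguments are hypothetical per-frame levels in dBov, one value
    per frame, aligned in time.
    """
    pairs = list(zip(near_levels_dbov, far_levels_dbov))
    if not pairs:
        return 0.0
    flagged = sum(
        1
        for near, far in pairs
        if near > NEAR_END_THRESHOLD_DBOV and far >= FAR_END_THRESHOLD_DBOV
    )
    return 100.0 * flagged / len(pairs)

# Four frames: the first and the last meet both level conditions.
ratio = double_talk_ratio([-30.0, -30.0, -40.0, -20.0],
                          [-40.0, -50.0, -40.0, -48.0])
```

With the fixed thresholds, two of the four example frames qualify, so the ratio is 50%.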

Echo Delay in milliseconds is the time offset between the reference signal and the returned echo signal. Values range from 0 to 1000 ms. If no echo was detected, the Echo Delay is 0.0 ms.

Echo Loss is the weighted echo signal level measured relative to the reference signal level. The calculation is in accordance with ITU-T G.122. If no echo was detected, the Echo Loss is reported as 99 dB, which in fact means no echo. Under the assumption of SLR + RLR = 10 dB, the TELR can be estimated from the Echo Loss result as TELR = Echo Loss + 10 dB. SLR stands for Send Loudness Rating and RLR for Receive Loudness Rating.
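As a small illustration of this estimate, the following sketch adds the assumed loudness ratings to the Echo Loss. The function name and the choice to return None for the 99 dB "no echo" value are assumptions for illustration.

```python
NO_ECHO_LOSS_DB = 99.0           # value reported when no echo was detected
ASSUMED_SLR_PLUS_RLR_DB = 10.0   # assumption SLR + RLR = 10 dB from the text

def estimate_telr_db(echo_loss_db, slr_plus_rlr_db=ASSUMED_SLR_PLUS_RLR_DB):
    """Estimate TELR = Echo Loss + (SLR + RLR); None if no echo was found."""
    if echo_loss_db >= NO_ECHO_LOSS_DB:
        return None
    return echo_loss_db + slr_plus_rlr_db

telr = estimate_telr_db(21.0)  # an Echo Loss of 21 dB gives a TELR of 31 dB
```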

Figure 5-1 Results of SQuad-AEC Test, shown in NQDI presentation

Echo status shows the grade of annoyance of the echo signal. There are three possible values: GOOD stands for good echo performance, FAIR for acceptable echo performance and POOR for annoying echoes. The Echo Status is derived using the Echo Objection curves given in ITU-T G.131. For this purpose, the TELR is estimated by adding 10 dB to the Echo Loss, and the point defined by this TELR and half of the Echo Delay (= one-way transmission time of the echo) is located in the G.131 chart.

1 Please note that in pure single talk situations a powerful echo region might be classified as Double Talk, and a Double Talk Ratio of a few percent will be shown if the adaptive Double Talk Threshold is exceeded. This might be observed especially for time-varying echo paths. Furthermore, a large amount of noise in the receiving part may also be classified as double talk.






Figure 5-2 EOR (Echo Objection Rate) derived from G.131

The GSM 03.50 test is defined in ETSI GSM 03.50 (Section 3.4) and derives the required Terminal Coupling Loss (TCL) from the G.131 TELR (Talker Echo Loudness Rating) chart. Under the assumption of a loss-free 4-wire connection from the measuring point to the terminal, the ERL (Echo Return Loss) can be interpreted directly as the TCL, because the terminal itself is the only existing source of echoes. In this case SQuad-AEC measures the TCL value directly, and we can set:

TCL = TELR - (SLR + RLR) dB, where typically SLR + RLR = 10 dB

TCL = TELR - 10dB

Ideally, the TCL should be 40 dB to 46 dB. The value of 46 dB is derived from the 1% EOR curve of G.131 at maximum delay (about 400 ms). If a TCL higher than 46 dB is reached, the 1% EOR curve will never be crossed, even for high delays (provided that no echo sources other than the terminal exist).

If the measured TELR = TCL + 10 dB is higher than the TELR given by the 1% EOR curve, the GSM3.50 test shows a “passed” value. A lower TCL is considered to be in the “failed” range.
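The pass/fail decision can be sketched as below. The function names are hypothetical, and the TELR required by the 1% EOR curve at the measured delay is taken as an input that has to be read from the ITU-T G.131 chart; the curve itself is not reproduced here.

```python
def tcl_from_telr_db(telr_db, slr_plus_rlr_db=10.0):
    """TCL = TELR - (SLR + RLR), with SLR + RLR typically 10 dB."""
    return telr_db - slr_plus_rlr_db

def gsm_0350_passed(tcl_db, required_telr_1pct_db, slr_plus_rlr_db=10.0):
    """Return True if TELR = TCL + (SLR + RLR) lies above the TELR that the
    1% EOR curve requires at the measured delay (caller-supplied value)."""
    measured_telr_db = tcl_db + slr_plus_rlr_db
    return measured_telr_db > required_telr_1pct_db

# At maximum delay the text gives a TCL limit of 46 dB, i.e. a TELR of 56 dB.
passed_high_tcl = gsm_0350_passed(48.0, required_telr_1pct_db=56.0)
passed_low_tcl = gsm_0350_passed(40.0, required_telr_1pct_db=56.0)
```

In this example a TCL of 48 dB passes and a TCL of 40 dB fails against the 56 dB requirement at maximum delay. For shorter delays the required TELR is lower, so the same TCL may pass.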

EOR (Echo Objection Rate) in % is an estimate of the percentage of listeners who perceive a talker echo when listening to a given telephone setup. ITU-T G.131 shows two different curves, one for 1% EOR and one for 10% EOR. It is assumed that a set of equally shaped curves describes each EOR between 0…100%. Based on the described crossing point of the estimated TELR and half of the Echo Delay, a (theoretical) corresponding curve can be derived and the assumed EOR can be read off.

If this EOR is less than 1%, the Echo Status is GOOD; if it is above 10%, the Echo Status is POOR. Between both values the echo is rated as FAIR.

The Distance to 1% EOR is also calculated directly from the chart given in ITU-T G.131. This value gives the distance to the 1% EOR curve for the calculated echo delay. All negative values are in the 'green region'; values above 10 dB are in the 'red region'.
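The mapping from EOR to Echo Status, and the sign convention of the distance value, can be sketched as follows. The function names are hypothetical. The distance is assumed to be the TELR required by the 1% curve minus the estimated TELR, which is consistent with negative values lying in the 'green region'; the required TELR has to be read from the G.131 chart by the caller.

```python
def echo_status(eor_percent):
    """Map an Echo Objection Rate in % to GOOD / FAIR / POOR."""
    if eor_percent < 1.0:
        return "GOOD"
    if eor_percent > 10.0:
        return "POOR"
    return "FAIR"

def distance_to_1pct_db(required_telr_1pct_db, estimated_telr_db):
    """Missing echo loss in dB relative to the 1% EOR curve.

    Positive: the echo loss is too low by this amount.
    Negative: the estimated TELR already lies above the 1% curve.
    """
    return required_telr_1pct_db - estimated_telr_db

status = echo_status(54.0)
# Hypothetical chart reading of 43 dB against an estimated TELR of 31 dB.
distance = distance_to_1pct_db(43.0, 31.0)
```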

Echo Loss profile: The Echo Loss is shown graphically versus the delay time. This figure provides detailed information and visualizes the echo region found. The Echo Loss profile is one result of the scan process and represents the situation during single talk only. The profile is used for pin-pointing echoes only. The detailed echo analysis itself is done in a separate step, and therefore the echo loss cannot be derived directly from this curve.






Figure 5-3 Echo Loss during scanning versus echo delay

The Level of Received Signal in dBov indicates only the overall r.m.s. level of the received signal. It covers echoes, double talk sequences and noise.




6 SQuad Advanced Echo Check (Active Test)

Introduction

The active variant of the Acoustic Echo Check can be applied only to a SwissQual measurement probe at the far end, because active actions have to be performed there: the probe generates an echo at the far-end side.

Echo Measurement

The SQuad AEC active measurement uses the same echo detection approach as the passive measurement described above. In contrast to the passive measurement, where the far-end side is silent, the far-end side actively creates an echo in the active mode. The SQuad AEC active measurement includes an inband synchronization between both sides. In a first transmission the incoming signal is recorded at the far-end side. Based on that signal, an echo is generated by applying a selectable echo path response to it. If required, the generated echo can be interlaced with double talk. While the signal is received a second time, the pre-processed echo is played back to the sending side for evaluation.

This measurement is especially designed to detect and rate echo cancellers or suppressors in the network. The generated echo challenges these echo cancellers, and any integrated level-switching devices are triggered by an inserted double talk signal.

The echo detection is more reliable if the remaining echo has linear components. Especially during double talk, only linearly dependent echoes can be detected. In connections that include low bit-rate codecs and/or non-linear processors, residual or low-level echoes might not be detectable by the measurement. For more reliable results, choose higher echo levels to increase their differentiation from double talk and other non-linear components.

Measurement Results – Echo Evaluation

The SQuad AEC active measurement generates the same results as described in Chapter 5.

Signal type

Echo Delay in milliseconds

Echo Loss during Single Talk acc. ITU-T G.122

Echo Loss for the complete signal (incl. Double Talk)

Echo Objection Rate acc. ITU-T G.131 in %

Distance to 1% Echo Objection Rate acc. ITU-T G.131

ECHO status






GSM3.50 test

Double Talk Ratio

Level of Received Signal in dBov

Furthermore, SwissQual's database interface NQDI displays the settings used at the far end together with the measurement results:

Figure 6-1 Results presentation SQuad-AEC active

The results shown in Figure 6-1 are also used here to discuss the results. The measurement was done on a Mobile to PSTN connection. The PSTN side was the 'echo generating' loop. At this side the incoming signal was convolved with the echo path response 'M1' from ITU-T G.168 ('G168_M1'). Afterwards it was attenuated by 20 dB and interlaced with a double talk signal containing 50% active speech (dt_50_08kHz.wav). No additional delay was chosen at the PSTN side.
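The far-end processing chain described here (echo path response, attenuation, double talk) can be sketched in a few lines. This is an illustration only: the two-tap echo path is a made-up placeholder rather than the G.168 M1 response, the function names are hypothetical, and mixing the double talk by sample-wise addition is an assumption about what the manual means by interlacing.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

def attenuate(signal, loss_db):
    """Scale a signal down by loss_db (amplitude gain 10^(-loss/20))."""
    gain = 10.0 ** (-loss_db / 20.0)
    return [gain * x for x in signal]

def generate_echo(incoming, echo_path, loss_db, double_talk=None):
    """Apply the echo path, attenuate, then add optional double talk."""
    echo = attenuate(convolve(incoming, echo_path), loss_db)
    if double_talk is None:
        return echo
    length = max(len(echo), len(double_talk))
    echo = echo + [0.0] * (length - len(echo))
    talk = list(double_talk) + [0.0] * (length - len(double_talk))
    return [e + t for e, t in zip(echo, talk)]

# Placeholder echo path: one-sample delay with a gain of 0.5.
echo_only = generate_echo([1.0, 0.0, 0.0], [0.0, 0.5], loss_db=20.0)
with_talk = generate_echo([1.0, 0.0, 0.0], [0.0, 0.5], loss_db=20.0,
                          double_talk=[0.2])
```

A unit impulse through this placeholder path yields a single echo sample of about 0.05 at index 1 (0.5 attenuated by 20 dB); with double talk added, the first sample additionally carries the 0.2 double talk value.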

The results show that this echo was detected at the mobile side. The echo path delay of 224 ms is typical for a Mobile to PSTN connection. The echo loss over the complete signal is 21 dB, which reflects the range of the defined echo at the PSTN side quite well. The echo loss during single talk is a bit lower, which indicates that there is an active component reducing the echo, at least slightly, in speech pauses (see footnote 2).

Using these results, the corresponding Echo Objection Rate (here: 54%) and the distance to the G.131 1% curve (12 dB) are calculated as well. This means that the echo loss would have to increase by 12 dB to reach the 1% curve and therefore the echo status 'good'.

Consequently, based on these results the echo status is rated as 'poor'.

Additionally, the Double Talk Ratio is 47%, which is caused by the defined signal at the far-end side.

Figure 6-2 Echo Loss as profile versus echo delay

The echo loss profile is also displayed for SQuad-AEC in the active mode.

If no echo is found or if it was not detectable, only the status messages and the level of the received signal will be displayed:

2 Please note that the channel gain also influences the measured echo loss. Basically, the channel attenuation in both directions has to be added to the echo loss defined at the far-end side: the measuring signal is attenuated by the transmission from A to B, attenuated again there by the defined loss value, and attenuated once more by the transmission from B to A. The echo loss reflects the level of the received echo compared to the original measuring signal.






Figure 6-3 Result presentation in an echo free/non echo detectable connection

Measurement Results – Listening Quality

Within the SQuad AEC active measurement an additional evaluation of the Listening Quality is integrated. Here SQuad-LQ is applied to the signal received by the echo-generating far-end side.

A simple Listening Quality measurement can thus be done in parallel. The interesting question here is how the Listening Quality is affected by double talk/echo in the other direction. By comparing both Listening Quality values, the double talk capability can be evaluated. If a network is fully duplex, both Listening Quality values should be the same even if a double talk signal is chosen.

The left value gives the Listening Quality for the first transmission, where no echo or double talk is played back. The right value gives the Listening Quality while the echo / double talk is sent at the same time. In addition, the channel gain and the clipping of the received signal are also given.

Please note that a strong side-tone at the B-side may affect the SQuad-LQ measurement, because it interferes with the received and evaluated signal.




7 Round Trip

Introduction

The Round Trip Time is the time a signal needs to travel from the near-end side to the far-end side and back. The Round Trip Time is usually close to the delay of the latest possible echo. The time speech needs to travel from one talker to the other (One Way Signal Delay) is an important indicator of the conversational quality of a call. A travel time that is too long leads to the annoying effect that the talkers interrupt each other unintentionally.

The Round Trip Method

The RTT inband measurement measures the Round Trip Time of a connection by using short voice-like sequences. This guarantees transmission over the complete link and avoids the suppression that may occur with artificial signals such as sweeps or impulses. The RTT inband measurement is a point-to-point measurement, i.e. an A-Side user calls a B-Side user. After a successful call establishment, the A-Side sends the RTT synchronisation signal (RTTvoiceA) to the B-Side three times in succession, separated by silence gaps of 5.4 s.

After receiving this sequence the B-Side sends back the RTTvoiceB sequence. Compared to a pure reflection at the B-Side, the use of different sequences at the A- and B-Side avoids suppression of the returned signal by an echo-compensation system in the network. At least two of the three samples have to be detected again at the A-Side.

Results

The measurable Round Trip Time is limited to a range from 4 ms to 3000 ms; the maximum delay jitter between the three repetitions within one measurement has to be below 500 ms. The results of the measurement are presented in the following table. In addition, the lowest one way and round trip times of the measurements, in milliseconds, are shown as the final result.
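The validity rules stated above can be sketched as a small check. The function name and result layout are hypothetical. Two points are assumptions not stated in the manual: out-of-range values are treated like undetected repetitions, and the one way time is taken as half of the round trip time.

```python
MIN_RTT_MS = 4.0        # lower limit of the measurable Round Trip Time
MAX_RTT_MS = 3000.0     # upper limit of the measurable Round Trip Time
MAX_JITTER_MS = 500.0   # jitter between repetitions must stay below this
MIN_DETECTED = 2        # at least two of three repetitions must be detected

def evaluate_round_trip(rtt_samples_ms):
    """Evaluate up to three RTT repetitions (None = not detected).

    Returns a dict with the lowest round trip time and the derived one way
    time, or None if the measurement is not valid.
    """
    detected = [
        r for r in rtt_samples_ms
        if r is not None and MIN_RTT_MS <= r <= MAX_RTT_MS
    ]
    if len(detected) < MIN_DETECTED:
        return None
    if max(detected) - min(detected) >= MAX_JITTER_MS:
        return None
    lowest = min(detected)
    # Assumption: one way time = round trip time / 2.
    return {"round_trip_ms": lowest, "one_way_ms": lowest / 2.0}

result = evaluate_round_trip([420.0, None, 400.0])
```

In the example two of three repetitions were detected with 20 ms jitter, so the measurement is valid and the lowest value, 400 ms, is reported.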

Figure 7-1 Detail of the NQDI Representation

The quality classes according to ETSI:

Table 7-1 One Way Delay Quality Classes

Class            4 (BEST)    3 (HIGH)    2 (MEDIUM)    1 (BEST EFFORT)

One Way Delay    < 100 ms    < 100 ms    < 150 ms      < 400 ms
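Applying the limits of Table 7-1 to a measured value can be sketched as below. The function name is hypothetical. Since classes 4 and 3 share the same delay limit, delay alone cannot separate them; the sketch assumes that the best class whose limit is met is returned, and that values of 400 ms or more fall outside all classes.

```python
def one_way_delay_class(one_way_ms):
    """Return the best quality class (4..1) whose delay limit is met,
    or None if the delay is 400 ms or more."""
    if one_way_ms < 100.0:
        return 4  # the class 3 limit (< 100 ms) is met as well
    if one_way_ms < 150.0:
        return 2
    if one_way_ms < 400.0:
        return 1
    return None

delay_class = one_way_delay_class(120.0)  # meets the class 2 (MEDIUM) limit
```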

References

ETSI TS 101 329-2 V1.1.1 (2000-07), Part 2: Definition of Quality of Service (QoS) Classes




A Appendix

Abbreviations

Abbreviation Description

ACR Absolute Category Rating

CELP Code Excited Linear Prediction

DCR Degradation Category Rating

DMOS Degradation Mean Opinion Score

MOS Mean Opinion Score

dBov dB relative to the overload point of a digital system

ADPCM Adaptive Differential Pulse Code Modulation

BFI Bad Frame Indication

CCITT Comité Consultatif International Télégraphique et Téléphonique (The International Telegraph and Telephone Consultative Committee)

CDMA Code-Division Multiple Access

CRC Cyclic Redundancy Check (3 bit)

DAC Digital to Analogue Converter

DMR Digital Mobile Radio

DTMF Dual Tone Multi-Frequency (signalling)

DTX Discontinuous Transmission (mechanism)

EPROM Erasable Programmable Read Only Memory

ETR ETSI Technical Report

ETS European Telecommunication Standard

ETSI European Telecommunications Standards Institute

FER Frame Erasure Ratio

FR Full Rate

GMSK Gaussian Minimum Shift Keying (modulation)

GSM Global System for Mobile communications

GSM MS GSM Mobile Station

HANDO Handover

HDLC High level Data Link Control

HR Half Rate

IEC International Electro-technical Commission

ISDN Integrated Services Digital Network

ISO International Organization for Standardization

ITU International Telecommunication Union

LAN Local Area Network




MSC Mobile-services Switching Center, Mobile Switching Center

OSI Open System Interconnection

PABX Private Automatic Branch eXchange

PDN Public Data Networks

PSPDN Packet Switched Public Data Network

PSTN Public Switched Telephone Network

QOS Quality Of Service

RXLEV Received signal level

RXQUAL Received Signal Quality

S/W Software

SIM Subscriber Identity Module

SS7 Signaling System No. 7

TDMA Time Division Multiple Access

TE Terminal Equipment

VAD Voice Activity Detection
